Start of Newspaper Digitization Project

Today I had a meeting with my supervisors and co-workers to discuss the major upcoming newspaper digitization project. The NCDHC is planning to digitize 100 reels of microfilm in the coming months, all from newspapers in North Carolina. My boss selected the titles for digitization based on nominations from around the state and a set of criteria such as geographic location, years available, and content. The purpose of the meeting was to establish a workflow for the project, because so many steps and people are involved.

The reels are sent to a vendor for digitization, who sends back four files for each frame of microfilm: a JPEG, a PDF, a TIFF, and a TXT file. Once the files are received, I have to check the TIFF files to ensure none are corrupted. Once that is done, all files are immediately copied into the digital archive, and from then on we work only with the PDFs, which is the file type to be uploaded to the web. I then collect metadata from the files, based on a set of standards from the NEH National Digital Newspaper Program. I record the date, volume, issue number, and page numbers for every issue in an Excel spreadsheet. Other data, such as the title, location, and format, stays the same for every issue. Once the metadata is collected, the files are combined into issues and are ready to be uploaded through CONTENTdm with their corresponding metadata. Once everything is uploaded, we verify that all issues are present on the web and that there are no corrupt files or missing data.

This process is not a difficult one, but when a reel contains a few thousand frames, it can take days to collect the metadata. I’m becoming more and more excited about the project as we move along, because so many newspapers are going to be online and freely accessible to the public. It’s going to be a highly beneficial project.
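
To give a rough idea of what the first check on a vendor delivery could look like, here is a minimal Python sketch. It is not the tool we actually use. It assumes a hypothetical layout in which every frame of a reel sits in one folder under a shared base name with the extensions `.jpg`, `.pdf`, `.tif`, and `.txt`, and the names `check_reel` and `has_tiff_header` are made up for illustration. It only confirms that each frame has all four files and that each TIFF begins with the standard TIFF signature bytes, so it would catch empty or badly truncated files but not damage further into the image.

```python
from pathlib import Path

# Assumed extensions for the four files delivered per frame (hypothetical layout).
EXPECTED_SUFFIXES = {".jpg", ".pdf", ".tif", ".txt"}

# Classic TIFF files start with one of these 4-byte signatures
# (little-endian "II" or big-endian "MM", followed by the number 42).
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")


def has_tiff_header(path):
    """Return True if the file begins with a classic TIFF signature."""
    with open(path, "rb") as handle:
        return handle.read(4) in TIFF_MAGIC


def check_reel(reel_dir):
    """Return {frame name: [problems]} for frames that need a closer look."""
    # Group the files in the folder by frame name (the file name minus extension).
    frames = {}
    for path in Path(reel_dir).iterdir():
        if path.is_file():
            frames.setdefault(path.stem, {})[path.suffix.lower()] = path

    problems = {}
    for stem, files in sorted(frames.items()):
        # Report any of the four expected file types that is absent.
        issues = [f"missing {s}" for s in sorted(EXPECTED_SUFFIXES - set(files))]
        # Report a TIFF whose first bytes are not a TIFF signature.
        if ".tif" in files and not has_tiff_header(files[".tif"]):
            issues.append("bad TIFF header")
        if issues:
            problems[stem] = issues
    return problems
```

Calling `check_reel` on a reel folder returns an empty dictionary when nothing looks wrong. Anything it flags would still need to be opened and looked at by hand.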
