From the archives to digitized: Historic Vermont newspapers get new lease on life online
University of Vermont Cataloging and Metadata Specialist Michael Breiner performs a frame-by-frame inspection of a copy of the Vermont Phoenix, one of 12 Vermont newspapers being digitized as part of the Vermont Digital Newspaper Project.

The Vermont Digital Newspaper Project last week added its first batch of digitized newspaper pages to a national database dedicated to providing searchable digital copies of historic newspapers from all over the nation.

Tom McMurdo, the project librarian for the state effort, said 25 states and Washington, D.C., are currently involved in the National Digital Newspaper Program.

Vermont recently added the database's oldest available pages, some from 1836, the earliest year within Vermont's range of funding. Two Windham County weeklies that were the ancestors of today's Brattleboro Reformer, the Windham County Democrat and the Vermont Phoenix, are among the papers to be digitized.

The $391,552 grant for the Vermont project comes from the National Endowment for the Humanities and stipulates that the project must digitize 100,000 pages of Vermont newspapers published between 1836 and 1922.

The earliest page available from anywhere in the nation, from Jan. 5, 1836, is from the Rutland Herald. It features a follow-up story on New York's Great Fire of December 1835.

Federal officials chose 1836 as a starting point in an effort to extend the realm of public information to before the Civil War, around which much work has already been done. 1922 is the last year of public domain, said McMurdo, so the project would have to get special permission from publishers in order to digitize any content published after that year.

The work outlined in the grant is the beginning of a much larger effort, said McMurdo.

“100,000 pages is just a drop in the bucket,” he said. “There are millions of pages of newspapers in Vermont that have not been digitized.”

McMurdo moved to Vermont from California, where he had been working with the California Digital Newspaper Collection since the inception of the national project in 2005. He was hired by UVM as a full-time librarian on the Vermont project. McMurdo said that once the initial 100,000 pages are completed, the project will seek additional funding to extend its efforts.

The process of digitizing the newspapers is much more complicated than simply scanning an image, McMurdo said. Because the digitization effort also involves a searchable database of pages, it requires an additional step called optical character recognition, or OCR for short.

Optical character recognition software analyzes scanned pages and assigns digital text values to the printed characters, McMurdo said. Once these values are assigned, users can search the database of pages for specific terms. A search returns the set of pages that contain the terms searched.
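
As a rough illustration only, and not a description of how the Chronicling America site is actually built, the kind of search McMurdo describes can be sketched in a few lines of Python. The sketch assumes the OCR text for each page is already available as a plain string. The page IDs, the sample text and the function names build_index and search are all made up for the example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page IDs whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page IDs that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    # A page matches only if it appears in the set for every search term.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


# Invented OCR output for two hypothetical pages.
pages = {
    "page-1": "Great fire destroys hundreds of buildings",
    "page-2": "Town meeting held in Brattleboro",
}
index = build_index(pages)
print(search(index, "great fire"))
```

In this sketch a page is returned only when every search term appears somewhere in its OCR text, so a word the software misreads simply cannot be found on that page.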

The process, McMurdo said, is not 100 percent accurate. If the digital scans are made from degraded microfilm that was itself made from degraded newspapers, OCR accuracy can be as low as 25 percent. With high-quality microfilm images, however, accuracy can reach 98 percent, McMurdo said.
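
The article does not say how the project measures accuracy. One simple way to get a feel for such figures, offered here purely as an assumption, is to compare OCR output against a hand-typed transcription of the same text. The sketch below uses the SequenceMatcher class from Python's standard difflib module. The function name ocr_similarity and the sample strings are invented for the example.

```python
from difflib import SequenceMatcher


def ocr_similarity(reference, ocr_text):
    """Return a 0-to-1 similarity score between a trusted transcription and OCR output.

    This is difflib's ratio (twice the number of matching characters divided
    by the combined length of both strings), not an official accuracy metric.
    """
    return SequenceMatcher(None, reference, ocr_text).ratio()


# Invented strings: a correct transcription and an OCR result with one misread letter.
reference = "Great Fire"
ocr_text = "Grcat Fire"
print(f"{ocr_similarity(reference, ocr_text):.0%}")
```

A score near 1.0 means the OCR text closely matches the transcription, while badly degraded film would push the score far lower.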

The National Digital Newspaper Program publishes the searchable database at Chronicling America, http://chroniclingamerica.loc.gov, a website where digitized pages from all over the country are available.

Vermont's project is focusing primarily on 12 titles from all over the state within the prescribed time period. A 12-member advisory committee of journalists, librarians, and historians from all over Vermont decided on the selection of publications, focusing on capturing quality historical content while at the same time maintaining a good geographic spread, McMurdo said.

The Vermont Digital Newspaper Project is contracting out the labor-intensive microfilm scanning process to iArchives, a Utah-based company, McMurdo said. Microfilm from the state archives is copied and the copies are sent to the company, where they are scanned and then sent back in digital form for processing. The state archive originals, he said, never leave the state.
