The Daily Princetonian is digitized and keyword searchable

The Princeton University Archives, working in conjunction with the Princeton University Library Digital Initiatives, has nearly completed a monumental project that will change the way researchers investigate University history. The student newspaper, The Daily Princetonian, has been digitized from its inception in 1876 through 2002. The site has been available in beta for almost two years, but all issues will be loaded as of June 30, 2012. At the suggestion of The Daily Princetonian alumni board, whose members have been among the prime backers of this project, the site is named in honor of the newspaper’s long-serving production manager, Larry Dupraz. Researchers are able to perform sophisticated keyword searches that unlock the vast richness of a daily newspaper that documents so much of the University’s history. (For the years 2002 to the present, users may search online via the Daily Prince site.)

“I wrote my final paper for my Freshman Writing Seminar about how the presence of veterans on Princeton’s campus following World War II affected Princeton’s academic environment and social atmosphere,” said Jennifer Klingman ’13. “My research heavily relied on The Daily Princetonian archives, and I had to spend a lot of time and energy searching for relevant articles in Firestone’s microform versions of the newspaper. It was difficult to comb through the articles, and as a result my research was limited in scope. This spring, I wrote my history department junior paper on academic and social changes taking place at Princeton during the late 1940s and 1950s. The online Daily Princetonian archives proved to be invaluable. I was able to access the archives anywhere and at any time, and use the archives’ search function to find a number of extremely useful articles. My independent work has definitely benefited from the existence of the online archives.”

Freelance journalist W. Barksdale Maynard ’88 states, “I am able to write about the social history of Princeton in an entirely new way and have restructured my research to take full advantage of this exciting new resource. For my Princeton Alumni Weekly article on the early history of automobiles at Princeton, the Dupraz Digital Archives allowed me to identify every reference to cars as early as 1901, to pinpoint who owned them and what kinds. I would never have attempted this article without the Dupraz Digital Archives.”

Maynard’s PAW colleague, Gregg Lange ’70, regularly uses the site for his column, “Rally Round the Cannon,” which examines and appraises University history. “You can piece together the story of Princeton football or Woodrow Wilson in a dozen ways. But the unique accessibility of a daily publication allows more subtle topics to arise and recede, and for cross-generational tales to emerge. Be it Ella Fitzgerald singing at a Princeton dance at age 19, then receiving an honorary degree 54 years later; or student revolts against the clubs’ Bicker selection system in 1917 and 1940 presaging its loss of monopoly in 1968, the combination of detail and long view is indispensable in understanding the ethos of the institution over time, and essentially inaccessible without the DuPraz technology and precision. And existentially, if I never see another microfiche in my life I will die a happy man.”

Maynard added, “My regular column in PAW, ‘From Princeton’s Vault,’ has benefited enormously. Recently I was able to identify the earliest references to Princetonians as ‘tigers,’ which had been guesswork previously. It turns out we were wrong by a decade.”

This has been an international project. The newspapers are sent from Princeton to Brechin Imaging in Canada, where TIFF images are generated using high-end German cameras. The files are then sent on a hard drive to Cambodia, where Digital Divide Data (DDD) analyzes the structure of each page and uses an optical character recognition (OCR) program to derive machine-readable text, which allows for keyword searching. The hard drive is then shipped to Austin, Texas, where the US office of the New Zealand company DL Consulting loads the data into a content-management system called Veridian, which supports searching and browsing, online reading, article extraction and printing, and other features.

Within the Library, many hands have worked for this project’s success. At Mudd Library, project archivists Dan Brennan and then Adriane Hanson have overseen the day-to-day work of the project, managing the shipment of the newspapers to Brechin and supervising students during the quality control phase. University Archivist Dan Linke raised the funds from various University and alumni sources and coordinated the project.

Within the greater Library system, Cliff Wulfman, the Library’s Digital Initiatives Coordinator, took the lead in writing the Request for Proposals and then selecting and coordinating the work with DDD, as well as providing technical assistance, support and vision. The Library System Office’s Antonio Barrera designed the front-end web page, with Phil Menos providing server support, and Deputy University Librarian and Systems Librarian Marvin Bielawski allocated the funds to acquire the Veridian software.

The project employs the METS/ALTO markup standard, the same used by the Library of Congress’s Newspaper Digitization Project, which means that as software changes and improves, we will be able to sustain this resource for many years to come.
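For readers curious about why this markup matters for keyword searching, here is a rough sketch in Python. ALTO files record each word recognized by OCR as a `String` element whose `CONTENT` attribute holds the text and whose `HPOS` and `VPOS` attributes give its position on the page. The XML fragment, the `build_index` function, and the coordinates below are invented for illustration; they are not taken from the archive's files, and this is not how Veridian itself is implemented.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, hand-written ALTO fragment (not from the actual archive).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Tigers" HPOS="100" VPOS="200" WIDTH="80" HEIGHT="20"/>
            <SP/>
            <String CONTENT="win," HPOS="190" VPOS="200" WIDTH="45" HEIGHT="20"/>
          </TextLine>
          <TextLine>
            <String CONTENT="the" HPOS="100" VPOS="230" WIDTH="25" HEIGHT="20"/>
            <SP/>
            <String CONTENT="tigers" HPOS="130" VPOS="230" WIDTH="70" HEIGHT="20"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so any ALTO version's tags match."""
    return tag.rsplit("}", 1)[-1]


def build_index(alto_xml):
    """Map each lowercased word to the (HPOS, VPOS) positions where it occurs."""
    root = ET.fromstring(alto_xml)
    index = defaultdict(list)
    for element in root.iter():
        if local_name(element.tag) != "String":
            continue
        # Normalize the OCR word: strip surrounding punctuation, ignore case.
        word = element.get("CONTENT", "").strip(".,;:!?\"'()").lower()
        if word:
            position = (float(element.get("HPOS", 0)), float(element.get("VPOS", 0)))
            index[word].append(position)
    return dict(index)


index = build_index(SAMPLE_ALTO)
print(index.get("tigers", []))
```

Because the word positions travel with the text in an open, documented format, a different search system could rebuild an index like this from the same files, which is the kind of sustainability the standard is meant to provide.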
