Our NHPRC-Funded Digitization Project at Six Months

Late last year, the Mudd Manuscript Library received an award from the National Historical Publications and Records Commission (NHPRC) to digitize our most-used Public Policy collections, serve them online, and create a report for the larger archival community about cost-efficient digitization practices. Excerpts from our six-month progress report are below.

Work so far

  1. Project planning

From the time we were awarded the grant to the present, we have produced an overall project plan and timeline, a vendor RFQ and plan of work, in-house quality control procedures for vendor-supplied images, a workplan for in-house scanning, and hardware-specific instructions for in-house scanning. All activities are either on schedule or ahead of schedule. Vendor-supplied digitization is currently eight months ahead of schedule.

  2. Finding a vendor

After distributing an RFQ and collecting bids, we decided on The Crowley Company as our vendor, based on both price and our confidence that they would be able to manage the materials and the work carefully and efficiently.

  3. Managing vendor-supplied digitization

Before materials can go out to the vendor, we first create a manifest of everything we want to send by transforming the EAD-encoded finding aid into an easily read Excel worksheet. Since we want each folder of material to have a cover sheet that explains the collection name, box number, folder number, URL, and copyright policy, we use the collection manifests to make target sheets with this information. A total of 6,943 target sheets were created, printed, and inserted at the front of folders by student workers before materials were sent out to the vendor.
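For readers who want to try something similar, the manifest step can be sketched roughly as follows. This is a minimal illustration rather than our actual transformation: it assumes a simplified, un-namespaced EAD fragment in which each folder is a `<c>` element whose `<did>` holds `<container type="box">`, `<container type="folder">`, and `<unittitle>`, and it writes CSV (which Excel opens) instead of a native worksheet. The function names and the sample data are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified EAD fragment (real finding aids are namespaced and far larger).
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Example Papers</unittitle></did>
    <dsc>
      <c level="file" id="c001">
        <did>
          <container type="box">1</container>
          <container type="folder">1</container>
          <unittitle>Correspondence, 1961</unittitle>
        </did>
      </c>
      <c level="file" id="c002">
        <did>
          <container type="box">1</container>
          <container type="folder">2</container>
          <unittitle>Memoranda, 1962</unittitle>
        </did>
      </c>
    </dsc>
  </archdesc>
</ead>"""

FIELDS = ["collection", "component_id", "box", "folder", "title"]


def build_manifest(ead_xml):
    """Return one dict per <c> component found in the EAD document."""
    root = ET.fromstring(ead_xml)
    collection = root.findtext("./archdesc/did/unittitle", default="")
    rows = []
    for comp in root.iter("c"):
        did = comp.find("did")
        if did is None:
            continue
        # Map each container's type attribute ("box", "folder") to its text.
        containers = {c.get("type"): (c.text or "").strip() for c in did.findall("container")}
        rows.append({
            "collection": collection,
            "component_id": comp.get("id", ""),
            "box": containers.get("box", ""),
            "folder": containers.get("folder", ""),
            "title": (did.findtext("unittitle") or "").strip(),
        })
    return rows


def manifest_to_csv(rows):
    """Serialize manifest rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


manifest = build_manifest(SAMPLE_EAD)
csv_text = manifest_to_csv(manifest)
print(csv_text)
```

Each row carries what a target sheet needs apart from the URL and copyright statement, which would have to come from elsewhere in the finding aid or from local policy.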

Once materials have been imaged by the vendor, students sample ten percent of the collection to check for completeness and readability. So far, everything has passed quality control with flying colors.
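A ten percent sample like the one described above could be drawn along these lines. This is only a sketch: it assumes the sample is taken over individual image files, and the file names, the `qc_sample` function, and the seed are all hypothetical.

```python
import random


def qc_sample(items, fraction=0.10, seed=None):
    """Pick a random subset of items (at least one) for manual review."""
    if not items:
        return []
    size = max(1, round(len(items) * fraction))
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(items, size))


# Hypothetical file names standing in for one delivery of vendor images.
images = [f"image_{n:05d}.tif" for n in range(1, 201)]
to_check = qc_sample(images, seed=2013)
print(len(to_check), "of", len(images), "images selected for review")
```

Recording the seed alongside the delivery would let someone else reproduce exactly which images were reviewed.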

Each month, Crowley sends us a report of how many images have been created that month, how many images have been created cumulatively, and average scanning rate per hour. This information is below:

Month         Boxes Scanned    Pages Scanned
2013 March               15           17,119
2013 April               32           45,761
2013 May                 50           49,499
2013 June                65           97,896
Totals                  162          210,275

  4. In-house imaging

Imaging of the John Foster Dulles papers started in June. So far, we have completed a pilot of scanning with the sheet-feed of the photocopier, and pilots of microfilm scanning and scanning with a Zeutschel face-up scanner are underway.

Project goals and deliverables

  1. Twelve series or subseries from six collections digitized

To date, five series or subseries have been completely digitized, and three others are in the process of being digitized.

  2. Approximately 416,000 images created and posted online

As of July 1, 2013, 210,275 images have been scanned by the vendor. Of this total, 39,834 images have been posted online. Our vendor is several months ahead of schedule for this project, and in-house scanning is on track. Since beginning in-house scanning in June, 1,838 pages have been scanned by student workers. In the next months, we will calculate the per-page costs for scanning on a Zeutschel face-up scanner and with a microfilm scanner. From there, we plan to image fifty feet of materials with the sheet feeder of the photocopier, 10.3 feet with the Zeutschel face-up scanner, and 33.4 feet with the microfilm scanner.

  3. Six EAD finding aids updated to include links for 17,508 components (folders)

Two finding aids (Council on Foreign Relations Records and Adlai Stevenson Papers) have been updated to include links to digitized content. Another (George F. Kennan Papers) is ready to be updated. This process is managed semi-automatically with a series of shell scripts. After quality control, hard drives of images are sent to Princeton’s digital studios. Staff there verify the digital assets and copy them to permanent storage. After this, PDF and JPEG2000 files are derived from the master TIFFs, and the relationship between these objects is described in an automatically generated METS file. Finally, a digital archival object (<dao>) tag is added to the EAD-encoded finding aid for each component.
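The last step, adding a <dao> to each digitized component, might look something like the sketch below. Our own process uses shell scripts, so this Python version is only an illustration of the idea. It assumes un-namespaced EAD, places the <dao> inside the component's <did>, uses a plain `href` attribute, and takes a hypothetical mapping from component ids to example.org URLs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified EAD fragment.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <dsc>
      <c level="file" id="c001"><did><unittitle>Correspondence, 1961</unittitle></did></c>
      <c level="file" id="c002"><did><unittitle>Memoranda, 1962</unittitle></did></c>
    </dsc>
  </archdesc>
</ead>"""

# Hypothetical mapping from component id to the METS file describing its images.
DIGITIZED = {"c001": "https://example.org/mets/c001.xml"}


def add_dao_links(ead_xml, links):
    """Append a <dao> to the <did> of every component that has a link.

    Returns the updated XML text and the number of links added.
    """
    root = ET.fromstring(ead_xml)
    added = 0
    for comp in root.iter("c"):
        href = links.get(comp.get("id"))
        did = comp.find("did")
        if href is None or did is None:
            continue
        # Skip components that were already linked on an earlier run.
        if any(d.get("href") == href for d in did.findall("dao")):
            continue
        ET.SubElement(did, "dao", {"href": href})
        added += 1
    return ET.tostring(root, encoding="unicode"), added


updated_xml, count = add_dao_links(SAMPLE_EAD, DIGITIZED)
print(count, "component(s) linked")
```

Because already-linked components are skipped, the function can be re-run safely as new batches of images arrive.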

  4. Digital imaging cost of less than 80 cents per page achieved

The plan of work with our vendor calls for scanning costs well below 80 cents per page. The first (and likely least expensive) of our three in-house scanning pilots puts the cost of scanning with the sheet feeder of a copier at about two cents per page. We will have numbers for microfilm scanning and scanning with a face-up scanner at the time of our next report.

  5. Metrics for digital imaging of 20th century archival collections for

    1. In-house microfilm conversion

    2. Sheet feeding through a networked photocopier

    3. Vendor supplied images

The information that we have collected thus far is below. Our vendor metrics are based on the quote and plan of work with The Crowley Company. Sheet feed metrics are collected by having a student worker fill out a minimal, time-stamped form at the beginning and end of each scan, and then analyzing that information. These numbers are preliminary. Sheet-fed scans have not yet been checked for quality control — re-scans may increase the total time per page and dollars per page for this method.

                         Vendor      Sheet Feed    Microfilm    Zeutschel
Total pages:             270,600*    1,838         -            -
Total feet:              530.95      1.68          -            -
Total time:              -           2:25:14       -            -
Total time (decimal):    -           2.42          -            -
Time per page:           -           0:00:04       -            -
Pages per hour:          270.75      759.33        -            -
Hours per foot:          -           1:26:26       -            -
Feet per hour:           -           0.69          -            -
Cost per page:           TBD         $0.02         -            -

*This number is an estimate, based on an assumed 1200 pages per box. Our reports from Crowley show anywhere from 1050-1750 pages in a box.

Note: in addition to these three methods, we plan to add a fourth – scanning with a face-up scanner (in our case, a Zeutschel scanner table).
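The derived sheet-feed figures in the table follow directly from the page count, the linear feet, and the elapsed time. A small sketch of that arithmetic is below; the function names are our own, and the same calculation could be reused for the microfilm and face-up pilots once their totals are in.

```python
def parse_hms(text):
    """Convert an H:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def scanning_metrics(pages, feet, elapsed):
    """Derive throughput figures from total pages, linear feet, and elapsed time."""
    seconds = parse_hms(elapsed)
    hours = seconds / 3600
    return {
        "hours": round(hours, 2),
        "seconds_per_page": round(seconds / pages, 2),
        "pages_per_hour": round(pages / hours, 2),
        "hours_per_foot": round(hours / feet, 2),
        "feet_per_hour": round(feet / hours, 2),
    }


# Sheet-feed pilot totals from the table above.
sheet_feed = scanning_metrics(pages=1838, feet=1.68, elapsed="2:25:14")
for name, value in sheet_feed.items():
    print(f"{name}: {value}")
```

Run on the pilot totals, this gives roughly 4.7 seconds per page, which the table shows in whole seconds as 0:00:04.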

  6. Policies and documentation for large-scale digitization initiative created and shared with archival community

As we go forward with our project, we have been blogging not just about the content of our digitized collections, but also about our methods and rationales. A blog post written in February explains how this project fits into our other digitization activities and our approach to access. In early June, we wrote about the reasons why this kind of project is so important, and how our materials will now reach researchers worldwide (and of all ages) who might otherwise never come to our reading room in Princeton, New Jersey.

A more formal report on our methods and results will be made available once more data has been gathered.

Records of Adlai Stevenson, Ambassador to the United Nations, Now Available to View Online

In October 1962, at the height of the Cuban missile crisis, Adlai Stevenson spoke the most famous line of his career. The former Illinois governor and two-time presidential candidate was the United States’ ambassador to the United Nations.

In May 1962, after a series of provocative political moves and a failed US attempt to overthrow the Cuban regime, Nikita Khrushchev proposed placing Soviet nuclear missiles in Cuba to deter any future invasion attempt. By October 14, American spy planes had captured images showing sites for medium-range and intermediate-range ballistic nuclear missiles under construction in Cuba.

Tensions mounted quickly. Concurrent with other negotiations, the United States requested an emergency meeting of the United Nations Security Council on October 25. There, Adlai Stevenson confronted Soviet Ambassador Valerian Zorin, challenging him to admit the existence of the missiles. Ambassador Zorin refused to answer.

“Do you, Ambassador Zorin, deny that the U.S.S.R. has placed and is placing medium- and intermediate-range missiles and sites in Cuba? Don’t wait for the translation! Yes or no?”

“I am not in an American courtroom, sir,” Zorin responded, “and therefore I do not wish to answer a question put to me in the manner in which a prosecutor does–”

“You are in the courtroom of world opinion right now,” Stevenson interrupted, “and you can answer yes or no. You have denied that they exist, and I want to know whether I have understood you correctly.”

“You will have your answer in due course,” Zorin replied. “I am prepared to wait for my answer until hell freezes over, if that’s your decision,” countered Stevenson. “And I am also prepared to present the evidence in this room.”

The Mudd Manuscript Library holds the papers of Adlai Stevenson, and as part of our NHPRC-funded project, we have digitized records relating to his tenure as United States Ambassador to the United Nations. Here, especially in his section on Cuba, we get more of the story behind the story — notes, memoranda and letters of congratulations after this memorable speech, and records from 1963-1965, after the crisis and when the cold war was icier than ever.

Patrons can view thumbnails of a file to get a sense of what’s available

Browsing Adlai Stevenson correspondence

Scroll through to see all 164 images.

Simply click on any of the thumbnail images to see a larger view.

The entire file is also available for download in PDF form.

Clicking on this button will download a pdf of the entire file.

We hope that researchers everywhere will be able to make use of these newly available materials. As always, please contact the Mudd Library with questions about any of our collections.

Archives for Everyone

In each of the last two springs, several staff of the Mudd Manuscript Library and other members of the Department of Rare Books and Special Collections have judged at the regional qualifier of the National History Day competition held on Princeton’s campus. This is a contest for middle and high school students who, based on rigorous guidelines, synthesize and analyze information about a historic event. They then create a paper, website, documentary, exhibit or performance explaining what they have learned.

Judging National History Day is a powerful touchstone about the value of archives in the production of history. Each year, I see students adroitly avoid some of the more common traps of historical production — their projects are clear, level-headed, open-minded, and support their claims with evidence. Students who submit the best projects don’t just have a clear argument and lengthy bibliography — they let the primary sources surprise them and challenge their previous conceptions of the past. Yes, they may start with textbooks and biographies, but stronger projects evaluate primary sources. And the very best projects tend to not just look at key documents that have been artificially assembled on a website (although this is valuable too) — they look at records in context and try to make arguments about subtext and authenticity.

The best place to find records in context is usually an archives. But of course, access to archives isn’t easy for students. Working parents may not be able to take their children to the New Jersey Historical Society or National Archives or Mudd Library, as much as they might like to provide that experience. Most archives are only open during the hours when parents are working, and visiting these institutions can be intimidating. From a young student’s perspective, it’s often hard to tell what the holdings are and whether the trip will be worth it.

We hope that our NHPRC-funded project can serve as a model for lowering this barrier to access. We believe that if we scan our records and make them available in the same context in which one would see them in the reading room, anyone with an internet connection can have a meaningful scholarly experience without the cost and inconvenience of traveling to Princeton, New Jersey.

We hope that children will benefit as much as anyone from this project. As Cathy Gorn, the Executive Director of National History Day, noted in her letter of support for the grant:

Having primary source materials on the Cold War available via the Internet would allow many NHD students around the country to conduct research for their projects that they ordinarily would not be able to, and the Mudd collections to be digitized are broad enough to support a variety of NHD Projects.

Of course, students don’t just wish to access historical records for National History Day — they want access for the same reasons that any other researcher does. A teenager may want to know more about when and how his family came to America. He might want to know more about the history of his town, and how certain sites came to be created. Or he may be interested in the history of ideas, policies and customs that affect his life. The collections that we plan to digitize — the John Foster Dulles papers, the Allen Dulles papers, the James Forrestal papers, the Council on Foreign Relations records, the George Kennan papers and the Adlai Stevenson papers — document how cold war activities were conducted and understood. They also present an opportunity for students to understand through diaries and correspondence the false starts, misunderstandings, and possible alternatives that constitute all historical events.

The historian John Lewis Gaddis makes the argument for access more persuasively than I could. In his letter of support for our grant, he described the cost and inconvenience that on-site research imposes on professional researchers, as well as the wear it causes to the records themselves.

But the most fundamental shortcoming of this old system was the disservice it did to students of history who never got to see an archive in the first place. Maybe they lived abroad. Maybe they attended American universities or colleges that could not provide research support. Maybe they were high school or even elementary students who might have gotten hooked on history for life had they had the chance to work with original materials – but they didn’t have that chance.

Now, however, almost all of them have access to a new means of access, which is of course the internet – even if they’re stuck in a place like Cotulla, Texas, where I grew up. I mention this little town because it’s where the young Lyndon B. Johnson spent a year teaching, in 1928-29, in the then segregated Mexican-American school. What he tried to do for those kids is still remembered: it gets its own chapter in the first volume of Robert Caro’s massive biography. But just think what LBJ could have done as a teacher had he had the resources that are available now. That’s why this project is important.

It has the potential, quite literally, to globalize the possibility of doing archival research. That’s no guarantee that this will produce a greater number of great books than in the past. What it will ensure, however, is a quantum leap in the opportunities students and their teachers will have to bring the excitement of working with original documents into all classrooms. That’s easily as important, I think, as writing the kind of books that might get you tenure at a place like Yale.

Why — and How — We Digitize

It’s February, and we’re now in the second month of our NHPRC-funded digitization project. In twenty-three more months, we’ll have completed scanning and uploading 400,000 pages of our most-viewed material to our finding aids, and anyone with an internet connection will be able to view it.

This is just the most recent effort to introduce digitization as a normal part of our practice at Mudd. As I said in my previous post, we know that it’s well and good that we have collections that document the history of US diplomacy, economics, journalism and civil rights in the twentieth and twenty-first centuries. But for the majority of potential users, who may never be able to come to Princeton, NJ, this is irrelevant. However interested they may be, they may never be able to afford to visit us. And there’s a whole other subset of potential users — let’s call them working people — who can’t come between the hours of 9:00 and 4:45, Monday through Friday. Are we really providing fair and equitable access under these conditions? Since we have the resources to digitize, it’s imperative that we develop the infrastructure and political will to do so.

We know that it’s time to get serious — and smart — about scanning.

The ball has been rolling in this direction for some time. We have three “streams” of making digital content available, and with our new finding aids site, we have an intuitive way of linking descriptions of our materials to the materials themselves.

Images of the collection in the context of the finding aid

Our first is patron-driven digitization.

This is our Zeutschel scanner. It does amazing work, is easy on our materials, and usually requires very little quality control.

Archives have been providing photoduplication services since the advent of the photocopier. At Mudd, we have dedicated staff who have been doing this work for decades. Recently, we’ve just slightly tweaked our processes to create scans instead of paper copies and to (in many cases) re-use the scans that we make so that they’re available to all patrons, not just the one requesting the scan.

A patron (maybe you!) finds something in our finding aids that he thinks he may be interested in, and asks for a copy.

If he’s in our reading room, he flags the pages of material he wants. If he’s remote, he identifies the folders or volumes to be scanned. The archivist tells him how much the scan will cost, and he pre-pays.

Now, the scanning. This either happens on our photocopier (the technician can press “scan” instead of “photocopy” to create a digital file instead of a paper one) or on our Zeutschel scanner. And while we feel happy and lucky to have the Zeutschel, we don’t strictly need it to fulfill our mission to digitize.

The scan is named in a way that associates it with the description of the material in the finding aid, and is then linked up and served online. We currently send the patron an email of this scan, but in the future we may just send them a link to the uploaded content.

Our second stream is targeted digitization based on users’ viewing patterns

Our friendly student receptionist, Ashley, scans materials at the front desk when she isn't welcoming patrons.

We try to keep lots of good information about what our users find interesting. We use Google Analytics to learn what users are browsing online, and we keep statistics about which physical materials patrons see in the reading room.

From these sources, we create a list of most-viewed materials, and set up a system for our students to scan them in their downtime when they’re working at the front desk.

We do this because we want to make sure that we’re putting the effort into digitizing resources that patrons actually want to see — there are more than 35,000 linear feet of materials at the Mudd Library. We probably won’t ever be able to digitize absolutely everything, and it wouldn’t make sense to start from “A” and go to “Z”. So, we pay attention to trends and try to anticipate what researchers might find useful.
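One way to turn those two sources into a scanning queue is sketched below. This is an illustration, not a description of our actual spreadsheet or process: the component ids and counts are invented, and the weight given to a reading-room request relative to an online page view is an arbitrary assumption that would need tuning.

```python
from collections import Counter

# Hypothetical usage counts keyed by finding-aid component id.
online_views = {"collA_c0012": 140, "collA_c0044": 35, "collB_c0003": 90}
reading_room_requests = {"collA_c0044": 12, "collB_c0003": 2, "collC_c0101": 7}


def rank_for_scanning(web, reading_room, reading_room_weight=10, already_scanned=()):
    """Combine both usage signals and return (component, score) pairs, most used first."""
    scores = Counter(web)
    for component, requests in reading_room.items():
        # Assumption: an in-person request signals more demand than one page view.
        scores[component] += requests * reading_room_weight
    skip = set(already_scanned)
    return [(c, s) for c, s in scores.most_common() if c not in skip]


queue = rank_for_scanning(
    online_views,
    reading_room_requests,
    already_scanned={"collA_c0012"},
)
for component, score in queue:
    print(component, score)
```

Filtering out material that is already online keeps students from re-scanning folders that patron requests have already covered.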

Our final stream — and the one for which we currently have to rely on external support — is large-scale vendor-supplied digitization.

Our current cold war project is a great example of this. We’ve put together a project plan, chosen materials, called for quotes and chosen a vendor. We recently shipped our first collection to be digitized, and I’ll be posting information to the blog as we move forward.

Another good example of an externally supported digitization activity is the scanning of microfilm from our American Civil Liberties Union Records. Our earliest records were microfilmed decades ago, and recently Professor Sam Walker supported the digitization of some of this microfilm so that it could be made available online.

No single stream — externally-supported projects, left-to-right scanning, or patron-driven digitization — would be enough to support our goal of maximizing the content available online. We hope that the three, each pursued aggressively, will help us realize our mission of providing equitable access to our materials. And we think that focusing on this cold war project will help us reflect on and improve all of our digitization activities.

New Public Policy Accessions: May – June 2011

There’s a scene in a documentary about the French philosopher Jacques Derrida where Derrida visits UC Irvine (where he had donated his personal papers). The philosopher, going through the rows of newly-processed collections, comments that the gray archival boxes on the shelves look like little gravestones.

For someone whose best-known axiom was that "there is nothing outside the text," and who was very concerned about who has "authority" over the archive, perhaps it was somewhat distressing for Derrida to see his texts buried away in folders, boxes, shelves and behind locked doors.

It’s easy to understand this concern. In some ways, archival records are by their nature "dead" — they have been given to the archives because they’re no longer used in the course of daily business. And it’s true that most institutions keep these materials tucked away in closed stacks.

On the other hand, from my point of view as someone who processes new accessions as they come to Mudd, collections are constantly growing, re-interpreted by new context and new evidence, and given new life through the research and reference process. We care for collections so that they may find new life — all of our core activities, as an institution, are to serve researcher needs in their synthesis and analysis of the past.

In May and June of this year, most of our accessions were additions to collections we already hold. In some cases, a donor found or created additional material that rounds out our collections. In most cases, new additions to an archival collection are an opportunity to re-examine the existing collection from a new point of view.

We hope that this will be the case with our newest additions. Here is a list of what we received in May and June:

[ML.2011.015] Photocopy of Douglas Linder Article
[ML.2011.016] Photographs and correspondence to William H. Kellenberger from John Foster Dulles
[ML.2011.017] Women’s World Banking Records
[ML.2011.019] Chalmers Benedict Wood Papers
[ML.2011.021] George S. McGovern Photographs and Letters
[ML.2011.022] Marten van Heuven Writings and Correspondence
[ML.2011.023] Woodrow Wilson Letter
[ML.2011.025] Kennett Love Papers

Folk Art in the Archives

[Left] William Bowen by Stanislaus Korneski. Paint and etching on wood, AR1995.78. [Right] Photo of William Bowen by the Princeton Alumni Weekly.

I would guess that every archives has material like this: objects created out of affection or respect in a non-official capacity. These two paintings on etched wood, recently re-discovered here at Mudd, were created by Stanislaus Korneski, a member of the drafting section of the grounds and buildings department. They were given as a gift to Ed Edenfield, another University employee, who then sent them to the archives in 1995. The resemblances are striking, I think.

[Left] Robert Goheen by Stanislaus Korneski. Paint and etching on wood, AR1995.78. [Right] Photo of Robert Goheen by the Princeton Alumni Weekly.

New Public Policy Accessions: April 2011

As organizations grow and change through time, so do their archives. The Mudd Manuscript Library collects the records of the American Civil Liberties Union [ML.2011.011], the Association on American Indian Affairs [ML.2011.005], and Americans United for Separation of Church and State [ML.2011.003], among many other organizations. In the last few months, we’ve had the pleasure of receiving a new cache of materials from each of these organizations and adding them to existing collections. Although some materials from both the ACLU and AAIA may be restricted for some time to comply with legal and privacy concerns, the remainder will be valuable to researchers hoping to learn about the recent history of these important organizations.

In addition to these organization records, we have also received an accrual of papers from James A. Baker III, former Chief of Staff to President Reagan and Secretary of State to President George H.W. Bush [ML.2011.002]. These records, mostly from his post-Washington career, include research files created during the writing of his memoir, correspondence, events files, and a small number of financial files. They also include his “desk drawer” files, letters and notes from important figures in his career, including Presidents George H.W. and George W. Bush, Nancy Reagan, Henry Kissinger, and Karl Rove. Please consult the finding aid for this collection for access restrictions.

“How High Can an Income Tax Fix Go?” The LBJ tax scandal that you’ve probably never heard of.

The Mudd Manuscript Library recently acquired an extremely interesting collection from a little-noted event in political history.

Werner’s 1944 memo explaining the discovery of fraudulent bonuses to Brown & Root executives. The actual recipient of these funds was determined to be the Lyndon B. Johnson 1941 U.S. Senate campaign.

Between 1942 and 1944, Elmer Charles Werner led an Internal Revenue Service investigation of Brown & Root’s* covert financial support of then U.S. Representative Lyndon B. Johnson’s failed 1941 U.S. Senate campaign. According to Werner’s records, this investigation was impeded and eventually terminated by a complicated series of requests from Johnson to Roosevelt’s White House to senior IRS officials.

This collection includes Werner’s diaries from 1942-1945 (the period during which Johnson was investigated); Werner’s notes and newspaper clippings regarding the case; a chronology of the facts of the case prepared by Werner; and Werner’s manuscript narrative regarding his experiences which he entitled “How High Can an Income Tax Fix Go?”

Many years before their transmittal to Mudd, these records were central sources for a chapter in Robert A. Caro’s book The Years of Lyndon Johnson: The Path to Power (1981). There, Caro explains how Johnson’s connections to the Roosevelt White House prevented the IRS investigation from exploring the full scope of Brown & Root’s secret contributions to the Johnson campaign.

New Public Policy Accessions: July 2010 – March 2011

One of Mudd’s newest accessions, the Kristen Timothy Papers, finds itself in good company with other Mudd collections documenting individuals who have had profound influence in the United Nations, including the papers of Margaret Snyder, Regional Advisor of the United Nations Economic Commission for Africa; Henry R. Labouisse, Director of UNRWA and Executive Director of UNICEF; David A. Morse, Director-General of the ILO; and many other luminaries.

Timothy organized the United Nations’ Fourth World Conference on Women in Beijing in 1995. The conference addressed enduring inequalities for women and girls across the world. Timothy was instrumental in outlining the Beijing Declaration and Platform for Action, which were adopted by consensus on 15 September 1995.

Timothy’s records include audio-visual materials (much of which is available online), records regarding the creation of the platform for action, materials created in preparation for and during the conference, and a series of Timothy’s research records on the history of the global women’s movement.
