Final Report - Increasing Library Contributions to the HathiTrust

 

Final Report: Enhancing Library Contributions to the HathiTrust (Library Innovation Fund ACTY60)  

Kyle Rimkus

August 27, 2013

 

Table of Contents

Overview

Summary of Work Accomplished

Technical Work

Policy Work

Policy for Use of HathiTrust for Preservation and Access to Non-Unique Book Content

Financial Summary

Next Steps

Appendix I: Project Proposal (Enhancing Library Contributions to the HathiTrust: An Innovation Funding Proposal)

PROBLEM STATEMENT

BACKGROUND

PROPOSED USE OF INNOVATION FUNDING

BENEFITS

ALIGNMENT WITH LIBRARY PRIORITIES

BUDGET

TIMELINE

Overview

In October 2012, Kyle Rimkus, MJ Han, Betsy Kruger, and Tom Habing submitted an Innovation Fund request to develop internal tools for improving HathiTrust ingest workflows.  The authors recognized the need to stimulate the contribution of locally digitized book content to HathiTrust, which had stalled because HathiTrust ingest tools consistently rejected UIUC JPEG2000 book image files.  Innovation funds were used to analyze the technical problems behind these rejections and to develop tools that reduce barriers to ingest.

 

 

 

Summary of Work Accomplished

Technical Work

Technical work began later than scheduled because of difficulty hiring a graduate hourly programmer.  Instead of beginning in November 2012, the project began in earnest in March 2013, when Haruit Kumar joined as a graduate hourly programmer.

 

Over the past six months, Haruit has made the progress listed below, working under the direction of Kirk Hess and Kyle Rimkus with input from the UIUC HathiTrust Users Group and HathiTrust staff at the University of Michigan (Jeremy York, Aaron Elkiss).  All locally developed scripts are available in a restricted Bitbucket repository at https://bitbucket.org/hkumar3/htfeeduiuc.

 

 

Policy Work

Technical progress was complemented by significant progress in HathiTrust policy development.  The Library formed a HathiTrust Users Group (https://wiki.cites.illinois.edu/wiki/display/libemployees/HathiTrust+at+UIUC) under the umbrella of the Digital Repository Advisory Group to advise on local HathiTrust needs.  Most importantly, the Library approved a clear policy on the use of HathiTrust as a preservation repository for specific types of materials (reproduced below; also available at https://wiki.cites.illinois.edu/wiki/display/LibraryDigitalPreservation/Policy+for+Use+of+HathiTrust+for+Preservation+and+Access+to+Non-Unique+Book+Content).

 

Policy for Use of HathiTrust for Preservation and Access to Non-Unique Book Content

The HathiTrust Digital Library will serve as the preservation repository for most non-unique book-like content that is digitized by the University Library whether in house or outsourced and that has an OCLC number (a requirement of HathiTrust). In addition, the Library will no longer store or provide access to the access copies of this material, but will instead link to the copies held by the HathiTrust and/or the Internet Archive (or other access mechanisms that become available). We will provide links to these copies via URLs in the catalog records, as well as through a splash page containing the metadata and the links. The Preservation Unit will periodically ensure that the HathiTrust is continuing to perform the preservation and access activities to which it has committed.

Exceptions to this policy include:

Note that these exceptions may change as the HathiTrust updates their policies and procedures. For example, if the HathiTrust allows the deposit of uncropped archival masters, we may reconsider depositing those from RBML into the HathiTrust.

 

Financial Summary

Financial codes

 

Project funds were spent exclusively to fund the time of Haruit Kumar, a graduate programmer, who created the scripts and tools described above.

 

Original budget:

dollars/hour    hours/week    total weeks    TOTAL
$19.47          20            22             $8566.80

 

Actual expenditures:

Budget      Funds spent    Balance
$8566.80    $7534.89       $1031.91

 

Next Steps

Refinements to the ingest process are ongoing.  Future steps include:

Fortunately, the Library has identified Provost/IT Fee funds to be spent in FY2014 to support continued efforts in streamlining HathiTrust ingest.  Haruit Kumar is currently employed at 20 hours a week to meet these needs.  

 

Appendix I: Project Proposal (Enhancing Library Contributions to the HathiTrust: An Innovation Funding Proposal) 

 

Kyle Rimkus, MJ Han, Betsy Kruger, Tom Habing

September 14, 2012

PROBLEM STATEMENT

The University of Illinois Library's book digitization efforts lack effective tools for contributing locally digitized content to the HathiTrust.  Hundreds of books digitized under the supervision of Digital Content Creation intended for contribution to the HathiTrust are sitting on local servers with no clear workflow for moving them into the HathiTrust.  Likewise, the Brittle Books program in Preservation has been unable to contribute content to the HathiTrust, despite strong interest in restructuring current workflows to rely on it as a key pillar of local content preservation and access strategies.     

 

BACKGROUND

On July 2, 2012, a subgroup of the Digital Library Access, Repository, and Scholarly Communications Services Advisory Group consisting of Tim Cole, Bill Ingram, Betsy Kruger, Michael Norman, Kyle Rimkus, and Sarah Shreeves met to recommend policies for how best to utilize the HathiTrust’s access and digital preservation services within the context of the library’s broader digital content management strategies.  Action items from this meeting included the following:

 

 

On September 13, 2012, Kyle Rimkus convened a meeting of HathiTrust users to discuss, among other things, progress on the items above.  This group included Betsy Kruger, Michael Norman, Rimkus, MJ Han, Annette Morris, Gary Maixner, William Weathers, Mike Tang, and Kirk Hess.  The group concluded that insufficient progress had been made on these tasks.  In addition, Digital Content Creation and Brittle Books representatives confirmed that much of their work intended for the HathiTrust (hundreds of volumes of locally digitized content, in fact) is sitting on local servers with no clear workflow for moving it into the HathiTrust, and in many cases has been for at least a year.

 

This is due to three factors:

 

PROPOSED USE OF INNOVATION FUNDING

We are proposing an Innovation grant to stimulate University of Illinois contributions of content to the HathiTrust.  We will hire a graduate student, preferably in Computer Science, to work twenty hours a week over the Fall and Spring semesters.  This student will, under the supervision of Kyle Rimkus, MJ Han, Tom Habing, and Betsy Kruger, write scripts and develop a web-based management tool to facilitate contributing locally created materials into the HathiTrust.  This includes: 

 

 

This student will report to Tom Habing in the Library’s Software Development Group and will follow the direction of key stakeholders MJ Han and Annette Morris under the guidance of project leaders Betsy Kruger and Kyle Rimkus.

 

BENEFITS

An improved HathiTrust workflow would have several important benefits.  Namely, we would: 

ALIGNMENT WITH LIBRARY PRIORITIES

The Library's strategic plan explicitly mentions participation in the HathiTrust as a priority:

 

"Promote collaborative efforts toward accomplishing local, regional, and national goals for digital preservation programs through participation in initiatives such as the DuraSpace Foundation, ArchivesSpace, and HathiTrust."

 

This project will allow the Library to reap some return on our already considerable investment in the HathiTrust by allowing us to rely on its services as an essential component of our digital preservation, access, and file management practices for digitized books.

 

BUDGET

We are proposing to hire a programmer, preferably a master's student in Computer Science, at the Library's graduate hourly rate of $19.47/hour for 20 hours a week, from the remainder of the Fall semester through the end of the Spring semester (22 weeks).  This comes to 440 hours, or $8,566.80.

 

dollars/hour    hours/week    total weeks    TOTAL
$19.47          20            22             $8566.80
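
The budget arithmetic above can be checked with a short sketch. This is only an illustration, not part of the project's tooling: the function and variable names are hypothetical, and the only inputs are the rate, hours per week, and number of weeks stated in the proposal. Decimal is used so that the dollar amount is computed exactly rather than with binary floating point.

```python
from decimal import Decimal

# Figures taken from the proposed budget; all names here are hypothetical.
HOURLY_RATE = Decimal("19.47")  # dollars per hour
HOURS_PER_WEEK = 20
TOTAL_WEEKS = 22


def budget_total(rate, hours_per_week, weeks):
    """Return (total_hours, total_cost) for an hourly appointment."""
    total_hours = hours_per_week * weeks
    return total_hours, rate * total_hours


total_hours, total_cost = budget_total(HOURLY_RATE, HOURS_PER_WEEK, TOTAL_WEEKS)
print(f"{total_hours} hours, ${total_cost}")
```

Running this reproduces the 440 hours and $8,566.80 quoted in the proposal.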

  

TIMELINE

This project will begin in October 2012 and will end with the Spring semester in May 2013.