Library Committee Handbook


Building a Repository for Print Holdings

Submitted by William Weathers

June 2014

Project: Building a Repository for Print Holdings

This proposal is for building a repository of print holdings from CIC (http://www.cic.net/home) libraries[1]. The repository will serve as an aggregated database of print holdings metadata, accessible through a web interface, to support different facets of library management.

 

Project Objectives

A repository of metadata for print holdings at libraries in a consortium or geographic region would help in designing and planning many areas of library services. For example, it could make planning for library-initiated digitization projects more efficient. If, as planned, the repository's web interface implements a federated search using OCLC[2] and HathiTrust Digital Library API services[3], staff will be able to decide whether an item should be digitized by first checking for freely available digital copies in aggregated digital libraries.
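As a rough illustration of the digital-copy check described above, the sketch below builds a HathiTrust Bibliographic API lookup for an OCLC number and scans a response for full-view items. The URL pattern and the `items`, `itemURL`, and `usRightsString` fields follow the public Bibliographic API documentation as we understand it and should be verified before use. The function names and the sample response are hypothetical, and no network request is made here.

```python
# Assumed endpoint pattern for brief records keyed by OCLC number.
HATHI_BRIEF_URL = "https://catalog.hathitrust.org/api/volumes/brief/oclc/{oclc}.json"


def hathitrust_url(oclc_number: str) -> str:
    """Build the brief-record lookup URL for one OCLC number."""
    return HATHI_BRIEF_URL.format(oclc=oclc_number.strip())


def full_view_items(response: dict) -> list:
    """Return the URLs of items whose US rights string indicates full view."""
    return [
        item["itemURL"]
        for item in response.get("items", [])
        if item.get("usRightsString", "").lower() == "full view"
    ]


# Hypothetical, trimmed-down response used only to exercise the parser.
sample_response = {
    "records": {},
    "items": [
        {"htid": "example.001", "itemURL": "https://example.org/item/001",
         "usRightsString": "Full view"},
        {"htid": "example.002", "itemURL": "https://example.org/item/002",
         "usRightsString": "Limited (search-only)"},
    ],
}

already_digitized = full_view_items(sample_response)
```

An empty result would suggest that no freely available copy was found, which is one input to the digitization decision rather than a final answer.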

Additionally, this repository could be a valuable tool for collaborative collection development between libraries by supporting collection analysis, such as how many institutions hold a given printed book or serial and how many copies are available. At present, there is no good resource that librarians can consult to assess journal literature coverage at their peer institutions.
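The kind of collection analysis mentioned above could look roughly like the following sketch, which counts holding institutions and total copies per title. The tuple layout, the OCLC numbers, and the institution names are illustrative assumptions, not actual repository data.

```python
from collections import defaultdict

# (OCLC number, institution, copy number): illustrative values only.
holdings = [
    ("424023", "UIUC", 1),
    ("424023", "UIUC", 2),
    ("424023", "Institution B", 1),
    ("999999", "UIUC", 1),
]


def overlap_summary(rows):
    """Map each title to (number of holding institutions, total copies)."""
    institutions = defaultdict(set)
    copies = defaultdict(int)
    for oclc, institution, _copy_number in rows:
        institutions[oclc].add(institution)
        copies[oclc] += 1  # each row represents one physical copy
    return {oclc: (len(institutions[oclc]), copies[oclc]) for oclc in institutions}


summary = overlap_summary(holdings)
```

Here `summary["424023"]` reports two institutions and three copies, the sort of figure a selector might consult before a retention or purchase decision.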

There are three things that distinguish this repository from other metadata repositories:
1.       The repository will have more granular levels of data than other repositories
Item-level holdings data will be ingested and queried for this repository. This data includes enumeration and chronology information as well as the copy numbers of individual items, which helps identify how many copies of a monograph each institution holds and provides volume-level information for serials and monographic sets. Depending on the source data, it may be enhanced with core bibliographic information.
 
2.       The repository will provide web services
The final objective for this work is not only to build a database but also to develop a web interface through which users can query the repository and obtain the information they need. The initial user group is expected to be members of the UIUC Library; as the data curated by the repository grows, the user group will expand to include staff at institutions that contribute data.
 
3.       The repository will exploit other repositories’ API services

As mentioned above, we will include OCLC and HathiTrust Digital Library as search targets by using their API services. This will broaden the scope and the possible functionality of the web service.
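The item-level granularity described in point 1 above can be pictured as a simple record type. This is a minimal sketch: the class name, field names, and sample values are assumptions for illustration, since the actual schema will depend on the source data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemHolding:
    """One physical item held by one institution (hypothetical layout)."""
    institution: str
    oclc_number: str
    enumeration: Optional[str]  # e.g. "v.12"; None for single-volume monographs
    chronology: Optional[str]   # e.g. "1998"
    copy_number: int = 1


def volumes_held(items, institution, oclc_number):
    """Distinct enumeration values one institution holds for one title."""
    return sorted({
        item.enumeration
        for item in items
        if item.institution == institution
        and item.oclc_number == oclc_number
        and item.enumeration
    })


items = [
    ItemHolding("UIUC", "1234", "v.1", "1998", 1),
    ItemHolding("UIUC", "1234", "v.1", "1998", 2),
    ItemHolding("UIUC", "1234", "v.2", "1999", 1),
    ItemHolding("Institution B", "1234", "v.2", "1999", 1),
]

uiuc_volumes = volumes_held(items, "UIUC", "1234")
```

Keeping enumeration and copy number as separate fields is what allows volume-level and copy-level questions to be answered independently.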

 

Resources Needed

We request 400 hours of undergraduate student programming for this project, based on two students each working 10 hours a week for twenty weeks (August to December 2014). At a rate of $12.25 per hour, the total request is $4,900.

At our request, Library IT has allocated a Linux, Apache, MySQL, and PHP based server for the initial prototype development.

 

Sustainability

Part of the project will be establishing ongoing workflows and automating, as far as is feasible, the process of ingesting metadata and updating the database.
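One way to make such a workflow safe to automate is to write the loader so that re-running the same export does not create duplicate rows. The sketch below illustrates this idea using Python's built-in sqlite3 module as a stand-in for the MySQL server. The table layout, column names, and the `item_id` key are hypothetical.

```python
import sqlite3


def open_repository(path=":memory:"):
    """Open the database and create the holdings table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS item_holding (
               institution TEXT NOT NULL,
               item_id     TEXT NOT NULL,
               oclc_number TEXT,
               enumeration TEXT,
               chronology  TEXT,
               copy_number INTEGER,
               PRIMARY KEY (institution, item_id)
           )"""
    )
    return conn


def load_batch(conn, rows):
    """Insert or overwrite rows by key; return the table's row count."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO item_holding VALUES (?, ?, ?, ?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM item_holding").fetchone()[0]


conn = open_repository()
batch = [
    ("UIUC", "item-001", "424023", "v.1", "1998", 1),
    ("UIUC", "item-002", "424023", "v.2", "1999", 1),
]
count_after_first_load = load_batch(conn, batch)
count_after_reload = load_batch(conn, batch)  # same export loaded again
```

Because rows are keyed on institution and item identifier, a scheduled job can reload a full export from a contributing library without first clearing the table.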

 

Timeline

August 2014

  1. Hire programmers
  2. Begin analyzing UIUC print holdings data
  3. Assess needs and sources for data enrichment
  4. Enrich data based on results from item 3
  5. Identify institutions that will contribute data (we will contact HathiTrust and CIC libraries about contributing)

September 2014

  1. Design database
  2. Develop workflow for automated data loading
  3. Begin loading data and testing workflow

October 2014

  1. Clean up data
  2. Identify and examine APIs to be queried
  3. Begin design of web interface including search and results pages

November 2014

  1. Further clean-up of data
  2. Fully implement web interface

December 2014

  1. Perform testing and assessment of system
  2. Compile reports and documentation of project

 



[1] The project will start with UIUC’s print monographs and serials information and later ingest data from other institutions. 

[2] WorldCat Metadata API: http://www.oclc.org/developer/develop/web-services/worldcat-metadata-api.en.html

[3] HathiTrust Bibliographic API: http://www.hathitrust.org/bib_api