Final Project Report for Making Metadata Maker – Staff Website

March 23, 2015

Submitted by:
- Myung-Ja (MJ) Han
- Nicole Ream-Sotomayor
- Patricia Lampron
- Janet Weber
- Deren Kudeki

Project details:
- Duration: October 2014 – March 20, 2015
- Web Application: http://iisdev1.library.illinois.edu/marcmaker/
- Source Files: https://github.com/dkudeki/metadata-maker
- Budget: $7,600 (Academic hourly programmer)

1. Objective

As more of the resources that libraries purchase arrive with cataloging records prepared by vendors or publishers, cataloging and metadata operations in academic libraries have shifted their focus to original cataloging for backlogs. These backlogs include unique and hidden collections, as well as a growing number of foreign language materials that have not been available to users because they lack bibliographic metadata. According to Jones,[1] American Research Libraries cited “backlogs as one of their institution’s major concerns” (88), and “for printed volumes, about 15 percent of collections on average remained unprocessed or uncataloged” (90), which has left these collections inaccessible.

To make these hidden and valuable resources searchable and discoverable in a timely manner, libraries must either hire professional catalogers who have subject knowledge and cataloging experience, or train staff to create metadata in Machine-Readable Cataloging (MARC) format and to use cataloging software, including an integrated library system (ILS) such as Ex Libris’ Voyager and a shared cataloging system such as OCLC Connexion. However, training staff to use MARC is an intensive and time-consuming process, as MARC has more than 1,900 fields in numeric form. To use MARC efficiently, one must either know or look up what each numeric tag and its subfields and indicators mean, and how their use changes depending on the information in other tags. In addition, to create metadata in MARC format, staff must have permission to access their institution’s ILS and OCLC, which also requires training and close monitoring. Metadata Maker, a metadata web application, was conceived to meet this need: a metadata production tool that anyone can use, regardless of cataloging instruction or knowledge, to create good enough metadata records.

2. Project Outcomes

2.1 Application Design

The goal of this project was to create a web application that makes it easy to create good enough metadata. The project team identified the information critical for information organization and access, as well as the information needed to ensure quality metadata records. The team then divided this information into two categories, required and optional, as shown in the table below. In addition, several types of information, e.g., bibliographic level, cataloging source, and content/media/carrier types, are added by default to every record to make it complete. These elements were built into the Metadata Maker web application, which was written with three fast and well-supported web technologies: JavaScript, CSS, and HTML.

Element Name	Required/Optional	Element Name	Required/Optional
Title	Required	Date of publication	Optional
Subtitle	Optional	Copyright date	Optional
ISBN	Optional	Number of pages (volumes)	Optional
Edition statement	Optional	Dimensions	Required
Language	Required	Literature	Optional
Author	Required if available	Illustration	Optional
Name of publisher	Optional	Keywords	Required
Place of publication	Optional	Note to the Cataloger	Optional
Country of publication	Optional
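
To illustrate how the required/optional split above could be enforced before a record is generated, the following is a minimal sketch in Python (used here only for illustration; the application itself is written in JavaScript). The field names and the function name are hypothetical and are not taken from the application's source; only the list of required elements comes from the table.

```python
# Required elements as listed in the table above.
# The lowercase field names are hypothetical, chosen for this sketch.
REQUIRED = {"title", "language", "dimensions", "keywords"}


def missing_required(record):
    """Return the sorted names of required elements that are absent or blank."""
    missing = []
    for name in REQUIRED:
        value = record.get(name)
        if isinstance(value, str):
            value = value.strip()
        if not value:  # None, empty string, or empty list
            missing.append(name)
    return sorted(missing)


# A form submission with one required element left out.
record = {
    "title": "An Example Title",
    "language": "eng",
    "keywords": ["Libraries"],
    "publisher": "",  # optional, so leaving it blank is fine
}
print(missing_required(record))
```

Author, which is "Required if available", is deliberately left out of the check, since a blank value is legitimate when the item names no author.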

2.2 Metadata Formats

Because libraries work with many different metadata standards, the application currently creates and delivers metadata in four formats: MARC21, MARCXML, MODS, and HTML encoded with Schema.org semantics. MARC21 (as .mrc) was the first format chosen for output, since the majority of the metadata created by the application will be ingested into OCLC and Voyager for OPAC discovery. MARCXML was added because an increasing number of metadata services require MARC records in XML format, and producing MARCXML directly avoids a separate transformation step. The Library’s digital preservation system requires MODS as its bibliographic metadata standard, so MODS was also added as an option. Lastly, HTML with embedded Schema.org semantics was added as part of the library’s ongoing linked open data development work. The library has already experimented with linked data by transforming its entire set of 5.5 million bibliographic and holdings records to linked data using Schema.org semantics (http://catalogdata.library.illinois.edu/). For that experiment, most MARC data fields and subfields were mapped to Schema.org semantics and transformed accordingly. Mapping the elements provided in the application to Schema.org semantics was relatively easy because the application uses a simple set of elements, each with a clearly defined meaning.
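
As a rough illustration of what it means to deliver one element in more than one format, the sketch below serializes a title as a MARCXML 245 field and as HTML with Schema.org microdata. It is written in Python for brevity and is not the application's own code; the function names, the indicator values, and the choice of microdata markup are assumptions made for this example.

```python
import html
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # MARCXML namespace


def title_to_marcxml(title, subtitle=None):
    """Serialize a title (and optional subtitle) as a MARCXML 245 field."""
    ET.register_namespace("marc", MARC_NS)
    record = ET.Element(f"{{{MARC_NS}}}record")
    # Indicator values are illustrative, not what the application emits.
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield",
        {"tag": "245", "ind1": "0", "ind2": "0"},
    )
    sub_a = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "a"})
    sub_a.text = title
    if subtitle:
        sub_a.text = title + " :"  # ISBD punctuation before a subtitle
        sub_b = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "b"})
        sub_b.text = subtitle
    return ET.tostring(record, encoding="unicode")


def title_to_schema_html(title):
    """Serialize the same title as HTML with Schema.org microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/Book">'
        f'<span itemprop="name">{html.escape(title)}</span>'
        "</div>"
    )


print(title_to_marcxml("An Example Title", "a subtitle"))
print(title_to_schema_html("An Example Title"))
```

Because every element has a single, clearly defined meaning, each output format needs only one small mapping function per element.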

2.3 Supporting Foreign Language Cataloging

Metadata Maker supports UTF-8 characters, allowing vernacular cataloging for non-Roman foreign language materials in any script, including those supported by OCLC.[2] The application also supports transliterated fields and diacritics, which improves the overall productivity of foreign language cataloging.
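
One practical consequence of UTF-8 support is that field lengths in a binary MARC21 (.mrc) record are counted in bytes, not characters, so non-Roman text takes more space than its character count suggests. The Python sketch below illustrates the difference for a single variable data field. The function name and sample values are hypothetical, and the sketch does not describe how the application itself computes lengths.

```python
FIELD_TERMINATOR = "\x1e"    # ends every variable field in a MARC21 record
SUBFIELD_DELIMITER = "\x1f"  # precedes every subfield code


def field_length_in_bytes(indicators, subfields):
    """Length of one variable data field, in UTF-8 bytes, terminator included."""
    data = indicators
    for code, value in subfields:
        data += SUBFIELD_DELIMITER + code + value
    data += FIELD_TERMINATOR
    return len(data.encode("utf-8"))


# Korean for "library", written with escapes: three precomposed Hangul syllables.
title = "\ub3c4\uc11c\uad00"

print(len(title))                                   # 3 characters
print(len(title.encode("utf-8")))                   # 9 bytes
print(field_length_in_bytes("00", [("a", title)]))  # 14 bytes
```

The 14 bytes break down as 2 for the indicators, 2 for the subfield delimiter and code, 9 for the title, and 1 for the field terminator.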

2.4 Working with FAST (Faceted Application of Subject Terminology)

Authority control is an important aspect of cataloging and metadata creation that has a direct impact on discovery services. Since keyword is one of the required fields in the application, we decided to add a feature that lets users select subject terms from FAST. When a user starts typing a possible term, the application dynamically shows candidate terms that start with the same letters. When the user selects a term, the application adds the term and its FAST ID to the MARC format records (both .mrc and XML), and a FAST URI to the MODS and Schema.org records as controlled vocabularies. Because FAST is still under development, the application also allows users to add non-FAST terms as keywords. We hope that this new feature will improve the subject-based faceted service, the metadata quality, and the FAST vocabularies.
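
The selection behavior described above can be sketched as follows, in Python for illustration only. The sample terms and their IDs are placeholders rather than real FAST data, the lookup is a local list rather than a call to the FAST service, and the URI pattern is an assumption; none of the names come from the application's source.

```python
# Placeholder terms and IDs for this sketch; the application queries FAST instead.
SAMPLE_TERMS = [
    ("Libraries", "0000001"),
    ("Library science", "0000002"),
    ("Linguistics", "0000003"),
]

FAST_URI_PREFIX = "http://id.worldcat.org/fast/"  # assumed URI pattern


def suggest(prefix, terms=SAMPLE_TERMS):
    """Return (label, id) pairs whose label starts with the typed text."""
    typed = prefix.strip().casefold()
    if not typed:
        return []
    return [(label, fid) for label, fid in terms
            if label.casefold().startswith(typed)]


def keyword_entry(label, fast_id=None):
    """Build a keyword entry; non-FAST terms carry no ID and no URI."""
    uri = FAST_URI_PREFIX + fast_id if fast_id is not None else None
    return {"term": label, "fast_id": fast_id, "uri": uri}


print(suggest("lib"))
print(keyword_entry(*suggest("ling")[0]))
print(keyword_entry("Local history notes"))  # free-text keyword, not from FAST
```

Keeping the ID and URI optional in the entry reflects the decision to accept non-FAST keywords alongside controlled terms.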

In addition to FAST, several authorities offer a web API for search and retrieval of their controlled vocabularies, including the Virtual International Authority File (VIAF, viaf.org) and Library of Congress Subject Headings (id.loc.gov). However, we decided to defer testing of these other services to the next phase of Metadata Maker, if that is required, because they need additional investigation and a more extensive training plan. We also initially experimented with the VIAF API for names, but decided not to implement it at this time, because the ambiguous name terms displayed through the VIAF API may confuse users.

2.5 Sharing Source Files

Currently the prototype web application is hosted on one of the library’s web servers (http://iisdev1.library.illinois.edu/marcmaker/), where anyone can use it. All source files are available on GitHub (https://github.com/dkudeki/metadata-maker) under the MIT license (http://choosealicense.com/licenses/mit/), a standard license for open source software, including jQuery, which the application utilizes.

3. User Testing

The Foreign Language Cataloging Specialist conducted initial user testing with six staff members in CAM. Testers included two student workers whose daily work is the physical processing of materials, two Graduate Assistants with experience in copy cataloging, and two hourly staff who have experience in copy cataloging and maintenance work. Testers were given a variety of monographs in Western European languages, including English, that needed original cataloging. With little instruction, they were asked to fill out the form for each item and create metadata in all formats. Over 240 records were created as part of this initial test; five of the users created between 5 and 8 records per hour, while one user created just over 10 records per hour. Although all records were reviewed and enhanced by the Foreign Language Cataloging Specialist, the test showed that Metadata Maker can greatly improve cataloging and metadata production.

4. Conclusion

The Metadata Maker application was developed for libraries to create good enough quality metadata that would increase the discovery and accessibility of hidden collections or backlogs that lack descriptive metadata. Initial testing has revealed that the application is easy to use and creates metadata that meets minimum level record requirements. When combined with a simple, well-designed training program and given materials that have the appropriate information, users could potentially create full-level metadata. The application works well as a metadata production tool that supports diacritics and Unicode encoding of non-Roman scripts and creates good enough quality metadata records. However, it has room to improve in areas that will enhance metadata quality and make it easier to work with other cataloging/metadata systems and databases. The application was created as an open source project, so any institution can use and modify it as needed. We hope that Metadata Maker will grow into a tool that is used and improved upon by the community, and that it will benefit the community as well as our end users.

[1] Jones, Barbara M. “Hidden Collections, Scholarly Barriers: Creating Access to Unprocessed Special Collections Materials in North America’s Research Libraries.” RBM 5, no. 2 (Fall 2004): 88-105.

[2] http://www.oclc.org/content/dam/support/connexion/documentation/client/international/internationalcataloging.pdf