Time and Location of Meeting
June 25, 2013, 1:30 pm - 3:30 pm, Room 428 Library
1. SFX hosting (Lynn Wiley, Wendy Shelburne)
2. Primo update (Jenny Emanuel, Michael Norman)
3. Catalog questions (Tom and Beth)
4. MarcIT (Michael Norman)
Tom Teper, Lynn Wiley, Jim Dohle, Sue Searing, Michael Norman, Beth Woodard, Robert Slater, Lisa Hinchliffe, Jenny Johnson
- SFX hosting (Lynn Wiley, Wendy Shelburne)
Lynn and Wendy in conjunction with IT have recommended that we host the server side of operations with ExLibris. This will relieve IT of version updates and downtime issues and will allow Eresources staff to focus on maintaining the administration side of the work.
IT is in agreement; we have a source of funding (see below re MARCit) and need to resolve a few implementation issues: a review of any URL linking during the change, and the implications for our customizations and our statistics files.
We need 60 days to look at these issues, which means we are looking at winter break for the changeover. Lynn and Wendy will prepare a brief report and get back to the group.
- Primo update (Jenny Emanuel, Michael Norman)
Michael sent out an update on Phase II plans:
Here is a list of items to work on over this Phase 2 period of time:
1. Set up separate institution for Undergrad–we’ve let ExLibris know we want to do this; UGL will let us know what they want included
2. Set up an institution and not turn on as many things that search the full text, to compare how this searches to what we have now
3. Add Illinois Harvest records for 1) Illinois Digital Archive and 2) CARLI collection of images ;
4. Get Archon and Illinois Digital Newspapers represented in Primo
5. More search assistant features in the custom tile
6. Additional User Testing
7. Guides on using Primo, which resources to choose
8. Update to “Which Catalog is Right for Me” section to include Primo Catalog Search
Updates on the Primo link for users were discussed:
PIT is adding a second tab for Primo for Fall (not the default search) and then adding a Primo catalog search in the online catalog. Stats on catalog links indicate that VuFind gets heavy use; Primo will be placed at the top there. The group will look at what to call VuFind (currently "Library Catalog").
The classic catalog still receives 40% of search activity. Work is underway on Active Directory LDAP authentication for Primo so users can log in to access their requests accounts.
One other item to check is the unique functionality of each catalog; detailing those differences now will let us offer advice and direction to users.
Lisa raised the question of an OPAC analysis, i.e., which version does what. Primo is close to Classic for the local catalog. Beth noted a need to review music and CJK access as well. Teper suggested we may need an evaluation tool for divisions or units, one that identifies known issues and is available for the Fall to help direct users and provide general information. WAG should also be involved: for example, who covers what is listed on our Gateway for that catalog list?
CAPT may charge a group to do this. The data will assist with Primo work as well as WAG and, of course, general information.
Michael said he could head up this work and then refer back to public services, but asked to whom in particular. CAPT discussed possible candidates and then suggested referring a draft on for more input as needed. Michael will work on this and update CAPT. He also noted that he would meet with Ex Libris at ALA and would report back on Primo news.
- Catalog questions (Tom and Beth)
Tom raised issues about the OPACs and other tools we use: how we name these interfaces and linking mechanisms, how those names will change in the near future, and how we can use them with any consistency. It is already a problem, for example with LibGuides. The work described above will help, but this goes beyond that.
Lisa talked about this at a User Ed meeting; it is hard even for LibGuides alone. CAPT discussed a style guide that could be available for things like the catalog and e-resources. Consistent labels for Discovery, and in general a more regular nomenclature, are needed.
CAPT noted that this fits with WAG’s charge.
Lisa offered to work with User Ed on this, look at data regarding discovery and text alignment, and then turn it over to WAG for follow-up.
- MARCit (Michael Norman)
See document distributed and copied here:
CAPT agreed with the recommendations with the caveat that migration issues be reviewed as well as SFX and titles changes. CAPT also agreed that the savings would help cover SFX hosting costs.
Evaluation of MarcIt Service
June 25, 2013
To evaluate Ex Libris’s MarcIt service to determine if implementation would offer better access and provide effective means for maintenance of UIUC Library’s e-serial titles through the online catalog and the Primo web-scale discovery system. The group will consult with individuals in the Library who could be impacted by implementation of MarcIt, including Acquisitions, E-Resources, Content Access Management, and Public Services. The team will submit a recommendation to CAPT whether to implement the MarcIt service or present alternative ways of providing and maintaining bibliographic access to e-serial titles.
The Ex Libris MarcIt service provides MARC records for e-journals. These records can be loaded into an online catalog to provide additional access points to e-journal titles. The MarcIt central database is populated with CONSER records obtained from the Library of Congress Cataloging Distribution Service. MarcIt matches activated SFX titles with corresponding CONSER records. Brief records are returned when MarcIt is unable to match a CONSER record. These are typically very short records, containing only the object metadata and holdings information originally exported by SFX.
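The match-or-brief-record behavior described above can be pictured as a simple lookup; the sketch below keys on ISSN, which is an illustrative assumption, since MarcIt's actual matching criteria are not specified here, and the sample records are invented:

```python
# Sketch of the MarcIt-style matching described above: each activated
# SFX title is checked against a CONSER record set; when no match is
# found, a brief record is built from the SFX metadata alone.
# Matching on ISSN and the sample data are illustrative assumptions.
conser = {
    "0028-0836": {"issn": "0028-0836", "title": "Nature",
                  "publisher": "Nature Publishing Group", "full": True},
}

def marcit_match(sfx_title):
    """Return the matching CONSER record, or a brief record if none matches."""
    record = conser.get(sfx_title.get("issn"))
    if record:
        return record
    # Brief record: only the object metadata and holdings exported by SFX.
    return {"issn": sfx_title.get("issn"),
            "title": sfx_title.get("title"),
            "holdings": sfx_title.get("holdings"),
            "full": False}

hit = marcit_match({"issn": "0028-0836", "title": "Nature"})
miss = marcit_match({"issn": "9999-9999", "title": "Obscure Journal",
                     "holdings": "v.1 (2001)-"})
print(hit["full"], miss["full"])  # True False
```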
The team re-examined the MarcIt service in detail to determine whether the system offered advantages over existing options for providing access to e-journal content. The Library has held a license to the MarcIt service for over three years but has never successfully employed the system to load e-journal records into the Voyager online catalog. The Content Access Management (CAM) department never incorporated the MarcIt service into its day-to-day processes and workflows, for several reasons: the difficulty of loading large sets of records into Voyager; the need to work with CARLI to set up automated processes for loading and maintaining records as serial titles are added or dropped; the creation of multiple systems for maintaining access to serial content; the day-to-day work of maintaining and troubleshooting the URL links in the online catalog; and the need to devote individuals to monitoring the weekly record loads. Implementation remained a possibility if a sufficient workforce were established to use these MARC records effectively and maintain them through the ongoing changes to e-journal titles.
The group also wanted to evaluate the quality of the MarcIt records to determine the level of enhanced representation of serial titles the service provides. After interviewing several individuals in Acquisitions and CAM, the team questioned whether the record set was full enough to add meaningful information beyond the MARC records we already collect. Also, with the implementation of the Primo web-scale discovery service, loading records from SFX directly into Primo through automated processes has become another option for representing e-journal titles in the Library's search and discovery tools. The MarcIt Evaluation Team examines these three areas in this report and recommendation to the Content Access Policy and Technology Committee.
Working with large record sets in Voyager is not an easy process. Loading an initial MARC record set into Voyager, particularly one close to 100,000 records in size, requires coordinating the activity with CARLI. At a rate of 2,000 to 2,500 records per weeknight, a set of 100,000 records takes approximately 8-10 weeks to load in its entirety. After the initial load, a scheduled harvesting of records to apply changes or deletions needs to be set up with CARLI as well. This scheduled harvesting could occur daily, weekly, monthly, or at whatever interval is required to keep the data elements current. For records deleted entirely, a separate process needs to be established to ensure that maintenance occurs in a timely manner, eliminating records and links to serial content no longer available to the Library. The Team learned that working within a system where the Library cannot control the loading or removal of records is difficult and has often been problematic in the past, as coordination must occur continuously with CARLI, which performs nightly processing for over seventy libraries. It is not an easy process to set up given the constant changes to a record set as large as the one the MarcIt service would cover for the UIUC Library.
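The 8-10 week estimate follows directly from the nightly load rate; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the 8-10 week Voyager load estimate:
# 100,000 records at 2,000-2,500 records per weeknight, 5 nights/week.
def load_weeks(total_records, records_per_night, nights_per_week=5):
    """Weeks needed to load `total_records` at the given nightly rate."""
    nights = total_records / records_per_night
    return nights / nights_per_week

for rate in (2_000, 2_500):
    print(f"{rate}/night -> {load_weeks(100_000, rate):.0f} weeks")
# 2,000/night -> 10 weeks; 2,500/night -> 8 weeks
```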
To handle the maintenance of deleted titles efficiently, individuals would need to be assigned this work daily to keep up with removing these records from public view. This work has been estimated at dozens to hundreds of titles per month. In conversations with Acquisitions and CAM, it was mentioned several times that sustaining this level of work has always been difficult, with repeated references to the difficulty of keeping the Online Research Resources (ORR) up to date with changes to serial titles. Also, as several people acknowledged, pulling in records from the MarcIt service adds one more location in which to troubleshoot and maintain URL links to full-text content. It would add another layer to work through in determining exactly what is going wrong when a user has trouble reaching the full-text content after clicking a link.
One important decision point was raised during an interview: within the Library's VuFind catalog instance, the SFX/Discover service is already in place, with the Discover button appearing in the results for any journal title that has electronic access. This up-to-date access is maintained through the daily activations and de-activations made within the SFX Knowledgebase. No additional activity is required to maintain records (or the URLs in individual MARC records) within the VuFind online catalog: VuFind makes a real-time call to the SFX Knowledgebase to determine current access to a journal title. This setup requires maintaining journal title data in only one system, rather than the multiple places where individual records would need to be loaded on an ongoing basis.
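The real-time call a catalog makes to SFX takes the form of an OpenURL request keyed on the journal's identifiers. A hedged sketch of building such a request follows; the base URL is a placeholder, not the Library's actual SFX endpoint, and `openurl_for_issn` is a name invented here for illustration:

```python
from urllib.parse import urlencode

# Sketch: build an OpenURL (Z39.88-2004) query for a journal ISSN, the
# kind of request a catalog sends to an SFX link resolver in real time.
# SFX_BASE is a placeholder; the real resolver URL is site-specific.
SFX_BASE = "https://sfx.example.edu/sfx_local"

def openurl_for_issn(issn, sid="vufind"):
    """Return an OpenURL query asking the resolver about one ISSN."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.issn": issn,
        "rfr_id": f"info:sid/{sid}",
    }
    return f"{SFX_BASE}?{urlencode(params)}"

print(openurl_for_issn("0028-0836"))
```

The resolver's response (whether full text is active for that title) is what drives the Discover button; that response handling is omitted here.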
After examining the situation from the standpoint of loading the MarcIt records, at current staffing levels it would be difficult to sustain the maintenance of these 100,000 e-journal records within the Voyager system. As VuFind shows, the most advantageous approach would be to utilize the work already occurring within SFX to help maintain access to the Library's e-journal title list.
MarcIt Service Records Quality
From the Ex Libris SFX system, we requested that the MarcIt service be run against all UIUC Library serial titles activated in the SFX Knowledgebase, so that we could perform a quality analysis of the records available from the service. The resulting file contained 99,891 MARC records, which we used as our sample to compare with records already available in our Voyager catalog and in the Primo web-scale discovery system. Since record quality can be measured in many different ways, we decided to analyze the main data fields that are important for access services and resource management of the Library's serials collections.
MARC has about 2,000 fields (including data fields, subfields, and fixed fields) that describe resources, but not all of them appear in catalog records. According to Moen and Benardino's research (2003) examining the usage of MARC elements across approximately 400,000 sample records, only 4% of the fields appeared in 80% of catalog records. The fields also vary in importance: the title field, for example, matters more than the location of the publisher. For our needs, we selected data fields based on the Library of Congress's Minimal Level Record Examples for Serials and the data fields used for the discovery services, listed below.
The analysis shows that most of the records meet the minimum requirement identified by the Library of Congress. Each record has more than one content note (data field 866), which lets users know the full-text coverage of a specific journal, showing which volumes and issues are available electronically. More than 80% of the records have current publication information (publisher and location of publisher). Also, 80% of the records carry a language designation and include the title in both English and its respective original language. However, some data fields that are important for serials cataloging, notably current publication frequency (44.2%) and dates of publication and/or sequential designation (47.5%), appear in less than half of the records. Physical description (20.5%), preceding serial title entry (17.5%), and succeeding serial title entry (8.6%) are present in less than one fifth of the records.
Identifier numbers have become increasingly important in resource management, particularly identifiers used widely across the library and information science community, such as the International Standard Serial Number (ISSN) and the OCLC number. Among the sample records, 48.4% have an OCLC number and 62.7% have an ISSN. Given the importance of OCLC numbers and ISSNs in database management as well as collection management, this is a rather disappointing result.
In conclusion, while the MarcIt service provides catalog records that meet the minimum level requirement set by the Library of Congress, our analysis shows that the records are not of the quality needed to ensure adequate serial access services, data management, and resource management for the Library.
Table: Data fields analysis
| MARC field | Number of records | % |
|---|---|---|
| 035 (OCLC number) | 43,527 | 48.4 |
| 310 (Current Publication Frequency) | 39,751 | 44.2 |
| 362 (Dates of Publication and/or Sequential Designation) | 42,745 | 47.5 |
| 260 (Publication, Distribution, etc.) | 75,686 | 84.1 |
| 300 (Physical Description) | 18,483 | 20.5 |
| 022 (International Standard Serial Number) | 56,421 | 62.7 |
| 210 (Abbreviated Title) | 83,288 | 92.6 |
| 780 (Preceding Entry) | 15,710 | 17.5 |
| 785 (Succeeding Entry) | 7,776 | 8.6 |
| 866 (Content note) | 126,951 | N/A |
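The field-coverage percentages in the table come from tallying which records contain each field of interest. A minimal sketch of that tally, using a toy record structure rather than real MARC parsing (the sample data and the dict-based record model are illustrative; an actual analysis would parse the MARC file with a library such as pymarc):

```python
# Sketch: count how many records contain each MARC field of interest,
# as in the table above. Records are modeled as dicts mapping field
# tags to lists of field values; the sample data is invented.
FIELDS = {"022": "ISSN", "035": "OCLC number", "310": "Frequency"}

records = [
    {"022": ["0028-0836"], "035": ["(OCoLC)1586310"]},
    {"022": ["1476-4687"]},
    {"035": ["(OCoLC)243417"], "310": ["Weekly"]},
]

def field_coverage(records, tags):
    """Percentage of records containing at least one instance of each tag."""
    total = len(records)
    return {
        tag: 100.0 * sum(1 for rec in records if rec.get(tag)) / total
        for tag in tags
    }

for tag, pct in field_coverage(records, FIELDS).items():
    print(f"{tag} ({FIELDS[tag]}): {pct:.1f}%")
```

For a repeatable field like 866, one would count total occurrences rather than records, which is why that row reports a raw count with no percentage.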
The Library has had access to the SFX service since 2005. The SFX Knowledgebase allows the Library to manage its large collection of e-serial titles, almost 100,000 at last count. Titles and collections of titles are activated or de-activated within the knowledgebase through defined processes, with maintenance reflected to users of the service in real time or overnight. SFX can also export a file of records for the titles activated within the knowledgebase, on a schedule determined by the Library, weekly if desired. The metadata is not as full as a MARC record but includes title, publisher, ISSN, other identifier numbers, publication dates, and other information available in the SFX knowledgebase.
In the Primo web-scale discovery system, this export from SFX can be harvested into the Primo index and set to load automatically at a scheduled time each week. The SFX collection can be represented in multiple ways: as a separate search scope (Online Journals & Databases), as part of the Everything search scope (Online Catalog, Primo Central Index, UIUC Created Content, and the SFX A to Z list), pulled into the Online Catalog scope, and in other configurations. When searching by journal title, Primo will either de-duplicate the various formats of a title (print, electronic, microform, or CD) and show the information, including holdings for the print format, in one result display, or FRBRize the results so the user can expand out to see each individual format. Access to full text is represented through the SFX OpenURL service, which shows the "View Online" tab only if the Library has activated an instance in the SFX Knowledgebase. As in VuFind, maintenance of this linking to full-text content occurs within the SFX Knowledgebase: when a title is activated or de-activated there, the change is reflected in Primo through the automated loading of the record set from SFX. No activity is required from individuals in the Library to update this data in Primo. The harvesting of the SFX pipe maintains the accuracy and currency of the records and URLs through an automated procedure that is straightforward to maintain.
The interoperability of the two systems, SFX being critical to the successful operation of the Primo system, was one of the selling points for the Library when it chose Primo as the web-scale discovery service to test and utilize. The data loads of journal titles from SFX to Primo could also allow the Library to keep this information as current as possible without the need to devote additional individuals to maintain this important information for Library users.
The MarcIt Evaluation Team believes there are other available sources and services, including the harvesting capacity of the Primo system, that provide better bibliographic data and automated processes allowing real-time maintenance of these heavily used electronic resources. Maintaining a separate load into Voyager, including devoting additional personnel to monitor and alter existing records and URLs, would be difficult to initiate at current staffing levels, and it would add layers of troubleshooting and maintenance that would not be easy to sustain. The quality of the MarcIt records is not optimal, with only some of them produced through the Library of Congress's CONSER program; as the table above shows, most of the records are short in nature, with only basic information available to use as access points. Finally, Primo can harvest this data from the SFX Knowledgebase directly through automated means, negating the need to assign additional individuals to maintain this information within the system. The work occurring in SFX can be utilized in Primo to keep the bibliographic data as current as possible and connected to an up-to-date URL path to full-text access. The Team recommends that we not implement the MarcIt service and instead utilize other sources already available to the Library. The Team also recommends that the Library cancel its subscription to the MarcIt service.