Digital Content Creation

 


 


415 Library, MC-522
1408 W. Gregory
Urbana, IL 61801

(217) 244-2062

Email: digicc [at] library.illinois.edu


1.0 Best Practices for File Formats


Introduction

The choice of digital file format plays a large role in whether the digital content within can be preserved and maintained over time. For example, highly proprietary or rarely adopted file formats carry a higher risk of obsolescence and will likely prove difficult to maintain. The University Library acknowledges the inherent challenges of preserving any digital content and recommends that all file format decisions be discussed with local preservation and digital content experts.

Because file formats change constantly, format recommendations must be reviewed frequently. Current format recommendations should be well documented and should reference the Categories of File Formats below.

Until the Library is able to finalize its own file format categorizations, projects should use the guidelines below (based on the IDEALS Preservation Support Policy: https://services.ideals.uiuc.edu/wiki/bin/view/IDEALS/PreservationSupportPolicy) when choosing digital file formats.

Table of Contents

1.1 Background

1.2 Categories of File Formats

 

 

 

 

______________________________________________

1.1 Background

Our ability to preserve digital objects is dependent, among other things, on whether the file format used:

All digital objects maintained by the Library will receive a basic level of preservation. Basic preservation means that the Library will preserve the viability of the original object through:

Basic preservation does not ensure that a digital object can be opened by a computer program or understood by a human in the future. For example, suppose that in 2006 a faculty member deposits a conference presentation in the Microsoft PowerPoint format (.ppt), a proprietary format. In 2030, a graduate student would like to view that presentation, but Microsoft PowerPoint, the software program used to open and read .ppt files, was discontinued in 2020. Old versions of the program are difficult to find and, because the .ppt file format was never publicly documented, no other software programs exist to open the file. Even though the original digital object (the conference presentation in .ppt) is still technically viable, it is no longer renderable (able to be opened by a computer program) and thus not understandable by the graduate student in 2030.

Therefore, for digital objects that meet certain criteria (see below), the Library will strive to preserve not only the viability of the object but also the renderability and understandability of its content, as well as the original file itself. For some objects in proprietary formats, this means that in addition to the original digital object, the Library will also save a copy transformed into a file format that is more preservable than the original. For example, the conference presentation in .ppt might also be saved as a PDF/A object. PDF/A is an open, publicly documented standard and a more preservable format than .ppt. What may be lost is the full functionality of the original digital object: the graduate student in our example may not be able to view the conference presentation as a slide show, as the Microsoft PowerPoint software program allows. However, the content of the conference presentation will be preserved.

The Library also recognizes that in some cases an access copy of a digital object is necessary because of the proprietary nature or cost of the software used to render it. In some cases, the access copy and the preservable copy may be one and the same: a PDF/A version, for example.

 

1.2 Categories of File Formats

The Library groups digital objects into three categories of preservation support, defined below. This policy is subject to change as new and emerging technologies impact our ability to preserve deposited content.

Category 1 - Highest Confidence

Description:

Criteria:

Examples:

Category 2 - Moderate Confidence

Description:

Criteria:

OR

NOTE: Files with embedded content (for example, a PowerPoint file (.ppt) with an AVI video file (.avi) inserted into it) are more preservable if the files are maintained separately. If the content remains embedded, it will likely not survive intact when the file is transformed to a more preservable format.
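As a rough illustration of keeping embedded files separate, the sketch below pulls media out of a presentation package so it can be stored alongside the original. It assumes the newer zip-based .pptx format, where media parts sit under `ppt/media/`. It does not handle the legacy binary .ppt format named in the note above. The function name `read_embedded_media` and the in-memory demo package are hypothetical, used only for illustration.

```python
import io
import zipfile

# Assumption: a .pptx package, which stores embedded media under this prefix.
MEDIA_PREFIX = "ppt/media/"


def read_embedded_media(package):
    """Return {file name: bytes} for each media part in a .pptx package.

    `package` may be a file path or a binary file-like object.
    """
    media = {}
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            # Keep only media parts; skip directory entries.
            if name.startswith(MEDIA_PREFIX) and not name.endswith("/"):
                media[name[len(MEDIA_PREFIX):]] = archive.read(name)
    return media


# Build a tiny stand-in package in memory so the sketch runs without a real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("ppt/slides/slide1.xml", "<slide/>")
    demo.writestr("ppt/media/clip.avi", b"fake video bytes")
buffer.seek(0)

embedded = read_embedded_media(buffer)
for file_name, data in embedded.items():
    print(file_name, len(data))
```

Each returned item could then be written out as its own file and deposited next to the presentation, so the video survives even if a later format transformation drops it.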

NOTE: Files with dynamic content (for example, an Excel spreadsheet (.xls) with dynamic functions - even simple ones!) are more preservable if the dynamic content is either documented (for example, a note in the spreadsheet explaining the functions that are included) or the document is saved as a static document (for example, a cell that is the sum of a column is saved as the resulting sum, not as the function that adds the cells).
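The two options in the note above, documenting a function and saving its static result, can be sketched with a toy model. The example below is not an Excel API: it assumes a worksheet represented as a plain Python dictionary, and it understands only single-column `=SUM(...)` ranges over numeric cells. The function name `freeze_formulas` is hypothetical; a real workbook would need a spreadsheet library.

```python
import re

# Toy assumption: the only formula form handled is =SUM(<col><row>:<col><row>).
SUM_PATTERN = re.compile(r"^=SUM\(([A-Z]+)(\d+):([A-Z]+)(\d+)\)$")


def freeze_formulas(sheet):
    """Return (static_sheet, notes) for a {cell: value} mapping.

    Cells holding a supported formula are replaced by the computed value,
    and the formula text is recorded in `notes` so it stays documented.
    """
    static_sheet = {}
    notes = {}
    for cell, value in sheet.items():
        match = SUM_PATTERN.match(value) if isinstance(value, str) else None
        if match is None:
            # Plain values are copied through unchanged.
            static_sheet[cell] = value
            continue
        col_start, row_start, col_end, row_end = match.groups()
        if col_start != col_end:
            raise ValueError(f"only single-column ranges are handled: {value}")
        total = sum(
            sheet[f"{col_start}{row}"]
            for row in range(int(row_start), int(row_end) + 1)
        )
        static_sheet[cell] = total  # the static result
        notes[cell] = value         # the documented function
    return static_sheet, notes


sheet = {"A1": 10, "A2": 20, "A3": 12, "A4": "=SUM(A1:A3)"}
static_sheet, notes = freeze_formulas(sheet)
print(static_sheet["A4"])  # 42
print(notes)
```

Keeping both outputs preserves the number a reader needs and a record of how it was derived, which is the intent of the note.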

Examples:

Category 3 - Low Confidence

Description:

Criteria:

Examples:

