Sources for digital humanities work generally come as digital data or need to be transformed into it, although you can work with that data in traditional analog ways as well as with intrinsically digital approaches. If you need help finding and acquiring humanities data sets for research purposes, such as text corpora, image collections, or multimedia, start here. We also point to lists of other data available online.
To find textual data, see the library guide Finding Text Data Sets.
To get started with a text mining project, or find example projects, see the library guide Text Mining Tools and Methods.
If you have a book or other printed material that needs to be digitized, you can use OCR (Optical Character Recognition) software to make it machine readable. Check out the library guide on OCR and Searchable PDFs.
Image and Multimedia Sources
- HathiTrust Digital Library
- The HathiTrust Digital Library is a collection of books, digitized primary sources, images, and more. It focuses on long-term preservation and provides both public domain and in-copyright content from Google, the Internet Archive, and Microsoft. Researchers who need images will have to extract them from the electronic books. A related organization, the HathiTrust Research Center, provides research support and tools for a variety of analysis methods.
- Digital Public Library of America
- DPLA provides open access to digitized materials from libraries, archives, and museums around the United States. These materials include images, videos, and audio. It seeks to be a resource for students, teachers, scholars, and the public.
- Internet Archive
- The Internet Archive is a non-profit digital library that offers free universal access to books, movies, music, and more.
- World Digital Library
- The World Digital Library, sponsored in part by the Library of Congress, archives digitized historical materials, both texts and images, from across the globe.
Other Types of Data
- Numeric Data: See the library guide Finding Numeric Data.
- Spatial and Mapping Data: See the library guide GIS Data.
- Other Data: You can use a web-based API to download large amounts of data automatically. See this list of publicly available web APIs, which is sorted by topic area. Some APIs provide data, and others provide tools for working with data.
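To give a sense of what downloading data through a web API can look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not the interface of any particular service: the `page` and `per_page` query parameters, the JSON-list response, and the function names `fetch_json` and `download_all` are all hypothetical. Check the documentation of the API you choose for its real parameters, rate limits, and terms of use.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Fetch one URL and decode its body as JSON (makes a real network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def download_all(base_url, per_page=100, max_pages=50, fetch=fetch_json):
    """Collect records page by page until the API returns an empty page.

    Assumption: a hypothetical API that accepts `page` and `per_page`
    query parameters and returns a JSON list of records for each page.
    """
    records = []
    for page in range(1, max_pages + 1):
        query = urllib.parse.urlencode({"page": page, "per_page": per_page})
        batch = fetch(f"{base_url}?{query}")
        if not batch:
            # An empty page is treated as the end of the data.
            break
        records.extend(batch)
    return records
```

With a real service, you would call something like `download_all("https://api.example.org/items")`, where the URL is a placeholder for the endpoint given in that service's documentation. The `max_pages` cap keeps a mistaken request from running indefinitely, and the `fetch` argument lets you try the loop against sample data before contacting a live server.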
Need other data not listed here? Let the Scholarly Commons help out! Read more about our Data Discovery services.