Generate Files

Generate MARC.XML Files

For input, this tool takes the path to a directory of folders. Each folder holds one digitized volume and is named for that volume’s bibid or mmsid. The program retrieves the MARC.XML file for each identifier and writes it into the corresponding folder. It uses the GetMARC service to retrieve these MARC.XML files from the Library.
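The folder-by-folder flow described above can be sketched as follows. This is a minimal illustration, not the tool's own code: `fetch_marc` is a hypothetical stand-in for a GetMARC client (its real interface is not documented here), and the output file name `MARC.XML` is an assumption.

```python
from pathlib import Path
from typing import Callable, List


def write_marc_records(root: Path, fetch_marc: Callable[[str], str]) -> List[Path]:
    """Write one MARC.XML file into each identifier-named folder under root.

    fetch_marc is a hypothetical stand-in for a GetMARC client: it takes
    an identifier (bibid or mmsid) and returns the record as an XML string.
    """
    written = []
    for package in sorted(p for p in root.iterdir() if p.is_dir()):
        identifier = package.name  # the folder name doubles as the identifier
        target = package / "MARC.XML"  # assumed output file name
        target.write_text(fetch_marc(identifier), encoding="utf-8")
        written.append(target)
    return written
```

Passing the retrieval step in as a function keeps the sketch independent of how the service is actually called, and loose files in the root directory are skipped because only folders are treated as volumes.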

Generate OCR Files

Uses Google Tesseract to create OCR text files for images.

Settings:

Path: Path containing tiff or jp2 files.

Image File Type: The type of image file to use.
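As a rough sketch of how the two settings could drive Tesseract, the example below finds the images of the chosen type and builds a `tesseract <image> <output base> -l <language>` command for each. The function names, the extension table, and the choice to write the text file next to the image are assumptions made for illustration. It also assumes the `tesseract` executable is on the PATH and that the local build can read the chosen image type.

```python
import subprocess
from pathlib import Path
from typing import List

# Extensions tried for each image type; an assumption of this sketch.
EXTENSIONS = {"tiff": (".tif", ".tiff"), "jp2": (".jp2",)}


def find_images(path: Path, image_type: str) -> List[Path]:
    """Return the images of the chosen type directly inside path."""
    suffixes = EXTENSIONS[image_type]
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


def tesseract_command(image: Path, lang: str = "eng") -> List[str]:
    """Build a `tesseract <image> <output base> -l <lang>` command."""
    # Tesseract appends ".txt" to the output base on its own.
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def ocr_folder(path: Path, image_type: str, lang: str = "eng") -> None:
    """Run Tesseract on every matching image in path."""
    for image in find_images(path, image_type):
        subprocess.run(tesseract_command(image, lang), check=True)
```

Keeping the command builder separate from the code that runs it makes it possible to inspect the commands before any OCR is started.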

Adding Additional Languages:

To change the available languages, place Tesseract traineddata files that match the current version into the data directory.

Note:

It’s important to use the correct version of the traineddata files. Using an incorrect version won’t crash the program, but it may produce unexpected results.

For more information about these files, go to https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
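Tesseract names each language model `<code>.traineddata` (for example `eng.traineddata`), so the languages present in a data directory can be listed from the file names alone. The helper below is a small sketch under that convention. Its name is hypothetical, and it does not check that the files match the installed Tesseract version.

```python
from pathlib import Path
from typing import List


def available_languages(data_dir: Path) -> List[str]:
    """List language codes for the traineddata files found in data_dir."""
    # Each model is named <code>.traineddata, so the stem is the code.
    return sorted(p.stem for p in data_dir.glob("*.traineddata"))
```

A call such as `available_languages(Path("data"))` would return the codes for whatever traineddata files have been placed in that directory.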

Make Checksum Batch [Single]

A checksum is a signature of a file. If any of the file’s data changes, the checksum changes too. The checksum.md5 file records each file in a single item along with its checksum value.
The tool creates one checksum.md5 that covers every file inside a given folder.
Input: Path to a root folder.
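A minimal sketch of this step using Python's `hashlib` is shown here. The line layout (`<md5> *<relative path>`) and the function names are assumptions, since the exact format the tool writes is not specified above.

```python
import hashlib
from pathlib import Path

REPORT_NAME = "checksum.md5"


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_checksum_file(root: Path) -> Path:
    """Write one checksum.md5 covering every file below root."""
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name != REPORT_NAME
    )
    # The "<md5> *<relative path>" layout is an assumption of this sketch.
    lines = [f"{md5_of(p)} *{p.relative_to(root).as_posix()}" for p in files]
    report = root / REPORT_NAME
    report.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return report
```

Any existing checksum.md5 is left out of the listing so that rerunning the sketch does not record a checksum for the report itself.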

Make Checksum Batch [Multiple]

A checksum is a signature of a file. If any of the file’s data changes, the checksum changes too. The checksum.md5 file contains a record of the files for a given package.
The tool creates a checksum.md5 for every subdirectory found inside a given path.
Input: Path to a root directory that contains the subdirectories to generate checksum.md5 files for.
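The per-package variant differs mainly in where the reports are written: one inside each immediate subdirectory rather than one at the root. The sketch below assumes that layout and an `<md5> *<relative path>` line format. The function name is hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List


def write_package_checksums(root: Path) -> List[Path]:
    """Write a checksum.md5 inside each immediate subdirectory of root."""
    reports = []
    for package in sorted(p for p in root.iterdir() if p.is_dir()):
        lines = []
        for item in sorted(package.rglob("*")):
            if item.is_file() and item.name != "checksum.md5":
                # Reading the whole file is fine for a sketch; hash in
                # chunks when the images are large.
                digest = hashlib.md5(item.read_bytes()).hexdigest()
                relative = item.relative_to(package).as_posix()
                lines.append(f"{digest} *{relative}")
        report = package / "checksum.md5"
        report.write_text("\n".join(lines) + "\n", encoding="utf-8")
        reports.append(report)
    return reports
```

Paths are recorded relative to each package, so a package folder can be moved without invalidating its report.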

Make JP2

Makes JPEG 2000 files from TIFF files. The tool uses Kakadu to convert the tiff files in the access folder of each directory into JP2 files.

For example, for the following directory layout, Input would be “c:\package_dirs”:

c:\package_dirs
└── 99423682912205899/
    └── access/
        ├── 99423682912205899-00000001.tif
        ├── 99423682912205899-00000002.tif

and so on.
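Given a layout like the one above, the conversion can be sketched as one Kakadu `kdu_compress -i <tif> -o <jp2>` call per TIFF in each access folder. The function name is hypothetical, and writing each JP2 next to its source TIFF is an assumption. Any further Kakadu options the tool may pass are not documented here and are left out.

```python
from pathlib import Path
from typing import List


def jp2_commands(root: Path) -> List[List[str]]:
    """Build one kdu_compress command per TIFF in <root>/<package>/access."""
    commands = []
    for package in sorted(p for p in root.iterdir() if p.is_dir()):
        access = package / "access"
        if not access.is_dir():
            continue  # skip packages that have no access folder
        for tif in sorted(access.glob("*.tif")):
            # Writing the .jp2 next to its source TIFF is an assumption.
            jp2 = tif.with_suffix(".jp2")
            commands.append(["kdu_compress", "-i", str(tif), "-o", str(jp2)])
    return commands
```

Each returned command can be handed to `subprocess.run(command, check=True)` on a machine where `kdu_compress` is installed and on the PATH.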