Notice: Due to ongoing construction, 4 East is currently closed to the public.  To obtain items located on 4 East, please place an online request for the item to be paged for you using the ‘Place Request’ button in the catalog. Please visit our Circulation FAQ page for assistance in using our catalog.

Feeding America: The Historic American Cookbook Dataset

[Image: view of encoded cookbook text data, showing metadata tags in the file]

Download Data

The Feeding America: The Historic American Cookbook Dataset

Description

The Feeding America: The Historic American Cookbook dataset contains transcribed and encoded text from 76 influential American cookbooks held by the MSU Libraries Stephen O. Murray and Keelung Hong Special Collections. Features encoded within the text include, but are not limited to, recipes, types of recipes, cooking implements, and ingredients. The 76 texts were chosen from among the more than 7,000 cookbooks that MSU Libraries holds, as representative of periods and themes in American cookbook history from the late 18th to the early 20th century.

Preferred Citation

Feeding America: The Historic American Cookbook Dataset. East Lansing: Michigan State University Libraries Stephen O. Murray and Keelung Hong Special Collections. https://lib.msu.edu/feedingamericadata/


Background

The Feeding America: The Historic American Cookbook project, from which this dataset was derived, was made possible with funds from a 2001 IMLS National Leadership Grant. The project began September 1, 2001 and was completed August 31, 2003.


Data Summary

Format

The "Feeding America: The Historic American Cookbook" dataset contains 76 plain text files of transcribed cookbook text, 76 XML files of encoded cookbook text, 1 XML file that includes metadata records for each cookbook in the dataset, and 1 DTD file that describes the schema that was used to encode the cookbooks.
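The file counts above lend themselves to a quick sanity check after download. The following is a minimal sketch, not part of the dataset itself. It assumes that all archives have been extracted under a single directory and that the plain text, XML, and DTD files use the extensions .txt, .xml, and .dtd; the function names and the directory layout are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Expected counts taken from the Format description.
# Assumption: the files use the extensions .txt, .xml and .dtd.
# 77 XML files = 76 encoded cookbooks + 1 metadata file.
EXPECTED = {".txt": 76, ".xml": 77, ".dtd": 1}


def inventory(root):
    """Count the files under root, grouped by lowercase extension."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )


def check_inventory(root, expected=EXPECTED):
    """Return {extension: (found, expected)} for each extension whose count differs."""
    found = inventory(root)
    return {
        ext: (found.get(ext, 0), n)
        for ext, n in expected.items()
        if found.get(ext, 0) != n
    }
```

An empty result from check_inventory("path/to/extracted/data") would suggest the download is complete; any entry in the result shows which file type is short or over.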

File Naming Conventions

  • content_type - e.g. cookbook_text.zip
  • bookname - e.g. amem.xml
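Given these conventions, the book names in an archive can be listed without extracting it. This is a rough sketch using Python's standard zipfile module. It assumes each archive member is named after its book (as in amem.xml); the helper name books_in_archive is hypothetical, and the internal folder structure of the archives is not documented here.

```python
import zipfile
from pathlib import PurePosixPath


def books_in_archive(archive):
    """Map each book name (member file name without extension) to its path in the zip.

    Assumption: every file in the archive is named after its book.
    """
    with zipfile.ZipFile(archive) as zf:
        return {
            PurePosixPath(info.filename).stem: info.filename
            for info in zf.infolist()
            if not info.is_dir()
        }
```

For example, books_in_archive("cookbook_text.zip") would return a dictionary keyed by book name, which makes it easy to match a plain text file to its encoded counterpart if both share the same book name.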

Size

  • metadata - 293 KB
  • plain text - 16 MB compressed, 64 MB uncompressed
  • encoded text - 17.6 MB compressed, 78.9 MB uncompressed
  • dtd - 21 KB

Data Quality

The text has high fidelity to the original cookbooks because it was transcribed rather than produced by optical character recognition (OCR). The text data is further enhanced by the encoding of features such as recipes, recipe types, ingredients, measurements, and cooking implements.
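One way to see which features are encoded in a given cookbook is to count its element names. The sketch below uses Python's standard xml.etree.ElementTree and makes no assumption about the names defined in the dataset's DTD. The sample document and its element names (cookbook, recipe, ingredient) are hypothetical and serve only to make the example runnable.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_text):
    """Count how often each element name occurs in an XML document string."""
    root = ET.fromstring(xml_text)
    return Counter(el.tag for el in root.iter())


# Illustrative sample only: these element names are hypothetical,
# not taken from the dataset's DTD.
SAMPLE = """<cookbook>
  <recipe><ingredient>flour</ingredient><ingredient>butter</ingredient></recipe>
  <recipe><ingredient>sugar</ingredient></recipe>
</cookbook>"""

if __name__ == "__main__":
    for tag, n in count_tags(SAMPLE).most_common():
        print(tag, n)
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring. If the encoded files rely on entities declared in the DTD, the standard-library parser may reject them, and a DTD-aware parser may be needed instead.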


Acknowledgements

Data description prepared by Thomas Padilla, Devin Higgins, and Lucas Mak.

Credit is also due to Ruth Ann Jones for writing the DTD that defines the schema used to encode the cookbooks, and to Amy Vance for leading the charge on the encoding process. Data is derived from Feeding America: The Historic American Cookbook Collection.