The 'brary mystique re: MARC, etc.
April 1, 2008 6:57 AM

Dear AskMeFi Librarians and Librarian-Wannabes, can you please explain-slash-distinguish between MARCXML, MODS, METS and EAD in terms of why one is better (or even different) than the other?

I've never worked in a library and I'm only familiar enough with MARC records to know what they stand for and that they presumably need something like MARCXML, MODS, METS or EAD to make them more system-shareable and human-readable.

The internet is in no short supply of definitions and discussions about these standards, but I need someone to explain it to me (or point me to an explanation) in layman's terms, without presuming I have an MLS.

You folks are as bad as us webbies with the acronyms, sheesh!

posted by 10ch to Computers & Internet (6 answers total)
At risk of embodying the cliche (and I'm not even a librarian anymore!) can you be a little more specific about what you're trying to do? Constructing "MARC crosswalks" between systems is a pretty daunting task, but knowing what systems/entities you're trying to translate between would probably be helpful.

A few random thoughts:

In short: MARC has numbered fields and lettered subfields that correspond to different pieces of information about a book (or other information object). It's comparatively easy to make them human-readable, to the degree that the title, author, dimensions etc. are written in human language, but making them easily human-usable or integrate-able with another system (say, MEDLINE...or Google) is more challenging.
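
To make the "numbered fields and lettered subfields" idea concrete, here is a minimal Python sketch. The record is made up for illustration, and `LABELS` and `describe` are hypothetical names, not part of any MARC library. The tag meanings (100 $a for the main author, 245 $a for the title, 300 $c for dimensions) follow common MARC 21 usage.

```python
# A hypothetical, hand-built record: each field is (tag, {subfield_code: value}).
record = [
    ("100", {"a": "Doe, Jane"}),
    ("245", {"a": "Letters from the coast"}),
    ("300", {"a": "212 p.", "c": "24 cm"}),
]

# Hypothetical lookup from (tag, subfield code) to a human-readable label.
LABELS = {
    ("100", "a"): "Author",
    ("245", "a"): "Title",
    ("300", "a"): "Extent",
    ("300", "c"): "Dimensions",
}

def describe(fields):
    """Return one 'Label: value' line per subfield that has a known label."""
    lines = []
    for tag, subfields in fields:
        for code, value in subfields.items():
            label = LABELS.get((tag, code))
            if label is not None:
                lines.append(f"{label}: {value}")
    return lines

for line in describe(record):
    print(line)
```

Printing labels like this is the easy part. Getting another system to agree on what each tag and subfield means is the harder crosswalk problem.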

EAD is actually an archival description markup; probably not particularly useful for MARC records. I believe MODS is ___-to-MARC only but I might be wrong.

This code4lib article might be helpful; not sure if it's too technical.
posted by chesty_a_arthur at 7:16 AM on April 1, 2008

I've not worked with all the formats, but this is my understanding from a brief perusal of the docs:

MARC is a standard format for bibliographic metadata; its native form is a compact, machine-oriented encoding (often loosely called "binary"). MARCXML is an XML implementation of that format.
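
As a rough illustration of MARCXML, the sketch below parses a tiny hand-written record with Python's standard library. The record content is invented and the leader is a placeholder, and `subfield` is a hypothetical helper. The namespace and the `datafield`/`subfield` element names are the ones MARCXML uses.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal MARCXML record (placeholder leader, two data fields).
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Letters from the coast</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def subfield(root, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = root.find(path, NS)
    return None if node is None else node.text

root = ET.fromstring(MARCXML)
print(subfield(root, "245", "a"))  # title
print(subfield(root, "100", "a"))  # main author
```

Note that the XML still carries the same numeric tags and letter codes, so it is easier for software to exchange but not much friendlier to read.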

EAD is another metadata format, in XML, but it is geared more for special collections or other sorts of archives which don't have the same properties as the books, etc. for which MARC was designed (e.g., the items are often one of a kind). EAD is also more geared for searching and exchange than pure MARC records (though MARCXML mitigates this somewhat).

MODS is another XML format, which, to quote Wikipedia: "was designed as a compromise between the complexity of the MARC format used by libraries and the extreme simplicity of Dublin Core metadata." If you know what DC data looks like, you can see why you might want a friendlier format.

METS is yet another XML format, but geared towards digital collections. It records metadata about digital objects, which may be made up of many files, other chunks of metadata, etc. So the METS record is designed to coordinate all these pieces to construct the library "holding."
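
Here is a small sketch of that coordinating role, assuming a simplified METS document with one descriptive section, two image files, and a structural map. The sample letter and file names are invented, and a real METS file would carry more detail. The element names (`dmdSec`, `fileSec`, `structMap`, `fptr`) are METS's own.

```python
import xml.etree.ElementTree as ET

# A made-up, simplified METS document for a two-page digitized letter.
METS = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="MODS">
      <xmlData>
        <mods xmlns="http://www.loc.gov/mods/v3">
          <titleInfo><title>Letter, 12 March 1862</title></titleInfo>
        </mods>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <fileSec>
    <fileGrp USE="master">
      <file ID="F1"><FLocat LOCTYPE="URL" xlink:href="page1.tif"/></file>
      <file ID="F2"><FLocat LOCTYPE="URL" xlink:href="page2.tif"/></file>
    </fileGrp>
  </fileSec>
  <structMap>
    <div TYPE="letter" DMDID="DMD1">
      <div TYPE="page" ORDER="1"><fptr FILEID="F1"/></div>
      <div TYPE="page" ORDER="2"><fptr FILEID="F2"/></div>
    </div>
  </structMap>
</mets>"""

NS = {"m": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

root = ET.fromstring(METS)

# Map each file ID in the file section to its location.
locations = {
    f.get("ID"): f.find("m:FLocat", NS).get(XLINK_HREF)
    for f in root.iterfind(".//m:file", NS)
}

# Walk the structural map and resolve each page to its file.
pages = [
    locations[div.find("m:fptr", NS).get("FILEID")]
    for div in root.iterfind(".//m:div[@TYPE='page']", NS)
]
print(pages)
```

The descriptive metadata here happens to be MODS wrapped inside the METS record, which is one common way the two formats end up being used together.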
posted by beerbajay at 7:46 AM on April 1, 2008

It depends on what you're trying to accomplish with the metadata. What kinds of documents are you working with? What kind of institution are you working for? What kind of information are you trying to store? Just descriptive information? Administrative information and/or technical information? Some schemes are better suited to support only one type of metadata (e.g., Dublin Core is best for descriptive metadata).

EAD stands for Encoded Archival Description. It is an XML markup specifically created to describe hierarchical finding aids and is used in archives. I'm going to guess that it's not what you're looking for, since it sounds like you're in a library. I've used EAD before; it's not really for standard library collections, although if you're trying to encode finding aids from your special collections, EAD is probably your best choice.
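
The hierarchy is the key difference, so here is a sketch that walks a stripped-down finding aid. The fragment is invented and far from a complete or valid EAD file (no header, no namespace), and `outline` is a hypothetical helper. The `archdesc`, `did`, `unittitle`, `dsc`, and numbered `c01`/`c02` component elements are taken from EAD.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily simplified EAD fragment: collection > series > file.
EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Jane Doe papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters to family</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""

def outline(node, depth=0):
    """Yield (depth, level, title) for every described component, top down."""
    title = node.findtext("did/unittitle")
    if title is not None:
        yield depth, node.get("level"), title
        depth += 1
    for child in node:
        if child.tag != "did":
            yield from outline(child, depth)

archdesc = ET.fromstring(EAD).find("archdesc")
for depth, level, title in outline(archdesc):
    print("  " * depth + f"{title} ({level})")
```

A MARC record is flat by comparison: it describes one thing, while a finding aid describes a collection and everything nested inside it.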

MODS (Metadata Object Description Schema) is basically an XML rendering of MARC’s content. It differs from MARCXML in the tagging: MODS uses word-based element names where MARCXML keeps MARC's numeric field tags. It's also a bit easier to work with than MARCXML. It supports only a subset of MARC 21 data, but it might be right for your purposes. I'm going to guess that this is the metadata scheme you're going to want to choose based on what you've said. See the MODS Uses and Features page; it's well-written and straightforward.
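
To show the tagging difference side by side, this sketch pulls the same made-up title out of a MARCXML fragment and a MODS fragment. Both fragments are trimmed to a single field for illustration; the element names and namespaces are the standard ones for each format.

```python
import xml.etree.ElementTree as ET

# The same invented title, once with MARC's numeric tags...
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Letters from the coast</subfield>
  </datafield>
</record>"""

# ...and once with MODS's word-based element names.
MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <title>Letters from the coast</title>
  </titleInfo>
</mods>"""

NS = {
    "marc": "http://www.loc.gov/MARC21/slim",
    "mods": "http://www.loc.gov/mods/v3",
}

marc_title = ET.fromstring(MARCXML).findtext(
    "marc:datafield[@tag='245']/marc:subfield[@code='a']", namespaces=NS
)
mods_title = ET.fromstring(MODS).findtext("mods:titleInfo/mods:title", namespaces=NS)

print(marc_title == mods_title)
```

To query the MARCXML you have to know that 245 $a means "title"; the MODS path says so in plain words.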

METS, the Metadata Encoding and Transmission Standard, is for digital objects/collections. It can store administrative, technical, and descriptive metadata. I don't know that much about METS, but I imagine it's not what you're looking for.

Also have a look at the MetaMap. It might be of some use; I haven't looked at it in a while and I don't have an SVG-capable browser on hand. Apologies if it's too technical.
posted by k8lin at 9:52 AM on April 1, 2008

Response by poster: Thank you all for the information.

This question was asked in the spirit of researching a position I may interview for. So, I do not know specifics of the data, but only that it'd probably be beneficial for me to be able to have an intelligent conversation about these formats, and why to choose one, and when, even.

I do know this much... The documents are historical manuscripts that are part of special collection(s) and they are in the process of being digitized so that they may be shared online. A lot of it is personal correspondence and such from a historical figure. The collection will need to be searched and presented in a way that is user-friendly (that's where I come in).

So, from your answers I have drawn the following conclusions: 1) MODS and MARCXML are both XML renderings of MARC content. You would likely pick MODS because it's easier to work with. 2) EAD is mentioned because we're dealing with a special collection and some of the data will have fields that MODS/MARCXML doesn't account for. And 3) METS is mentioned because the ultimate result of this project will be a digital collection, and there may be yet more fields that EAD and MODS/MARCXML don't account for.

Am I even close? Still confused and uncertain, obviously, but the puzzle at least has edges. Thanks again.
posted by 10ch at 10:34 AM on April 1, 2008

Your summary is pretty much right on there, 10ch. I'll add that the scope of what's being described in these formats can differ, especially when it comes to material from an archival collection.

Typically, material in archives and special collections may have a MARC record only at the collection-level, as opposed to every item-within-the-collection level. EAD can vary in scope from only collection-level to including some description about individual items in the collection. METS, however, will typically describe a single (or group of closely related) digital object(s), which as you correctly guess, may have yet more specific metadata.
posted by alb at 1:10 PM on April 1, 2008

Yeah it sounds like they'd probably have EAD records for each object they have now, but would probably be constructing METS records for all their digitized content. As for choosing MODS/MARCXML, it depends on what you're going for. For example, if you're just representing data, in XML, that you already have in MARC format, use MARCXML--don't bother converting it to MODS.
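
Part of why MARCXML is the low-effort choice there: the mapping is mechanical, one XML element per field and subfield. A rough sketch, assuming the MARC data has already been read into simple tuples (the `fields` layout and `to_marcxml` are hypothetical, and the leader and control fields are left out for brevity):

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

def to_marcxml(fields):
    """Build a MARCXML <record> from (tag, ind1, ind2, [(code, value), ...]) tuples."""
    record = ET.Element(f"{{{MARC_NS}}}record")
    for tag, ind1, ind2, subfields in fields:
        datafield = ET.SubElement(
            record,
            f"{{{MARC_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        for code, value in subfields:
            sub = ET.SubElement(datafield, f"{{{MARC_NS}}}subfield", {"code": code})
            sub.text = value
    return record

# Made-up sample data in the hypothetical tuple layout.
fields = [
    ("100", "1", " ", [("a", "Doe, Jane")]),
    ("245", "1", "0", [("a", "Letters from the coast")]),
]

# Serialize with MARCXML as the default namespace.
ET.register_namespace("", MARC_NS)
print(ET.tostring(to_marcxml(fields), encoding="unicode"))
```

Going to MODS instead means deciding, field by field, which MODS element each tag maps to, and some detail may not survive the trip.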
posted by beerbajay at 3:40 AM on April 3, 2008
