What additional information do people include in the album MP3 tag?
October 4, 2006 6:31 AM

I need to understand the most popular ways that people "corrupt" the album tag within their MP3's with additional information (such as disc number).

I'm working on something which will take the name of an album from the metadata in an MP3 and pass it to a server to do some funky stuff involving searching of catalogues of albums available to purchase.

The problem that I've found is that some people (myself included) "corrupt" the album tag with additional information that would cause a straight album search to fail.

For example, a common one appears to be putting the disc number somewhere in the album name. I have an album called "Cream Classics (disc 1)" but my friend might have it labelled as "Cream Classics [D1]".

As such, a straight search on these examples will fail because that is not the correct album name. I therefore need to include some logic that strips out the additional information prior to searching.

Unfortunately I don't know the common ways in which people use the album field to hold additional information. Will just handling the two presented above solve 80% of the issues or are there other things that I need to consider?

So in short:

1. If you put additional information into the album title field, what do you put and how do you format it?
2. Is there any way I can use the web or a website to look through how people mangle the title field to put additional information?
posted by mr_silver to Computers & Internet (25 answers total) 1 user marked this as a favorite
Oops, I also forgot to mention that I will also need to do the same with the "artist" tag. So if you corrupt that with additional information, I'd love to hear about it.
posted by mr_silver at 6:34 AM on October 4, 2006

Is there any way I can use the web or a website to look through how people mangle the title field to put additional information?

Fire up an app that connects to cddb (or freedb, or...), and which will prompt you to choose whenever it finds more than one match for a physical disc—and see what shows up for a variety of discs likely to sport your useful corruptions (multidisc sets, compilations, soundtracks, misc. various artists situations).

Or cut out the middle man and examine the cddb or freedb directly, if such a thing is possible.
posted by cortex at 6:57 AM on October 4, 2006

If you want, I can send you an .xml file of my iTunes library, which since it's cobbled from many sources, will have many of the most common errors.
posted by klangklangston at 7:07 AM on October 4, 2006

I'm a real purist when it comes to tagging, and worse yet I have a lot of sins to atone for during my Napster days. Thus, this is a topic near and dear to my heart.

1. For albums, the addition of [OST] or "Original Soundtrack" for soundtracks rather than using the more traditional album title. An example of this would be something like "Final Fantasy VI [OST] [Disc 1]"

The spec gives one the ability to define the Genre as "Soundtrack," which seems much more appropriate.

2. VA albums tend to have all sorts of weirdness associated with them. Some people will put Artist - Tracktitle in the Track field and 'Various' in the field for the artist (this breaks music databases).

3. Artist fields can get messed up when someone tries being too comprehensive with featured artists. I've seen tracks with tags like "Deftones (feat. DMX with Los Lonely Boys and Death Cab for Cutie)"[*]. At some point there's just too much information, not to mention that it also breaks catalog/database software.

4. Remixes. Dear God, they're the bane of my tagging existence. It takes all the fun of #3 and adds the joy of illogical naming and a complete inability to locate information to ensure the tags are correct. "Green Day - American Idiot (Remixed by DJ Keoki) Dark Mix, Jungle Beats (feat. Marlene Dietrich and Dashboard Confessional) Special Remix by DJ Sammael with Funky Beats"[*]

OK, a bit extreme, but given how many 'DJs' will put a back beat on a track and label it their own, there are a ton of tracks like that out there. Generally, unless I rip a track myself, I can only believe that a remix song has incorrect tags.

For more information, you may want to go and visit HydrogenAudio. They're really particular about music and debates about 'correct' tagging break out all the time.

[*]: Though it'd be really interesting to see what kinds of results these collaborations might bring, these are obviously not real songs.
posted by owenkun at 7:08 AM on October 4, 2006

Will just handling the two presented above solve 80% of the issues or are there other things that I need to consider?

As a more general solution, you could strip out everything in brackets or parentheses.
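A minimal sketch of that blanket approach in Python, assuming the tag has already been read into a string. The function name strip_bracketed is made up for illustration, and nested brackets are not handled.

```python
import re

# One (...) or [...] group with no nesting, plus any whitespace before it.
_BRACKETED = re.compile(r"\s*(\([^()]*\)|\[[^\[\]]*\])")

def strip_bracketed(album):
    """Remove every parenthesised or square-bracketed group from a tag."""
    cleaned = _BRACKETED.sub("", album)
    # Collapse doubled spaces left behind and trim the ends.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_bracketed("Cream Classics (disc 1)"))
print(strip_bracketed("Final Fantasy VI [OST] [Disc 1]"))
print(strip_bracketed("(What's the Story) Morning Glory?"))
```

The last call shows the cost of stripping everything: a title that legitimately contains parentheses loses part of its name, so this is probably best used as a fallback after a search on the untouched tag has failed.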
posted by smackfu at 7:17 AM on October 4, 2006

I wanted to be able to browse my iPod chronologically, so I wrote a quick AppleScript that stuck the year in front of the album name in the Album field for each of my songs. So instead of "Superunknown," my Album field says "1994-Superunknown."

(Wow, that's old!)
posted by cebailey at 7:21 AM on October 4, 2006

Supposedly people can use the Interweb tubes to download music. Some parts of the release 'scene' will put their name in the artist or album fields, so that's something to mention as well. A lot of them use 1337speak, so that's something to check for.

Of course, I know nothing about illegally downloading music and disavow any knowledge of 'scene' releases and the like.
posted by owenkun at 7:22 AM on October 4, 2006

A few of the albums I've "acquired" have the year in the Album Title field in this format:
Elephant (2000),

or even more annoying:
(1994) Honky Donkey*

*Does not reflect a real album title, but rather an album I would buy if I saw it in a store.
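A rough Python sketch for pulling the year out of the formats mentioned so far: "Elephant (2000)", "(1994) Honky Donkey" and the "1994-Superunknown" prefix style. The function name split_year and the restriction to years starting with 19 or 20 are assumptions made for illustration.

```python
import re

_YEAR = r"(?:19|20)\d{2}"  # assumption: only four-digit years 1900-2099

_PATTERNS = [
    re.compile(rf"^\((?P<year>{_YEAR})\)\s*(?P<title>.+)$"),   # (1994) Honky Donkey
    re.compile(rf"^(?P<title>.+?)\s*\((?P<year>{_YEAR})\)$"),  # Elephant (2000)
    re.compile(rf"^(?P<year>{_YEAR})\s*-\s*(?P<title>.+)$"),   # 1994-Superunknown
]

def split_year(album):
    """Return (title, year); year is None when no known pattern matches."""
    album = album.strip()
    for pattern in _PATTERNS:
        m = pattern.match(album)
        if m:
            return m.group("title").strip(), int(m.group("year"))
    return album, None

print(split_year("Elephant (2000)"))
print(split_year("(1994) Honky Donkey"))
print(split_year("1994-Superunknown"))
```

An album whose real title begins with a year followed by a hyphen would be split wrongly by the third pattern, so it may be worth searching on the original string first and only falling back to the stripped title.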
posted by muddgirl at 7:37 AM on October 4, 2006

As a more general solution, you could strip out everything in brackets or parentheses.

But there are plenty of albums with brackets in the title legitimately.

I reckon the artist field will be more of a problem than the album field - as owenkun says, there'll be lots of 'featuring X' and 'with Y' type stuff in there.

Something else to watch out for in the album field is references to the original format [From Vinyl], [From Mono Vinyl], [From Cassette] , &c.
posted by jack_mo at 8:12 AM on October 4, 2006

I have a handful of live performances and shows in which the date and/or venue of the show is listed in the "album" tag.
posted by Robot Johnny at 8:28 AM on October 4, 2006

Thanks for all your comments so far, please keep them coming. All very useful!

Before I wrote the question, I was planning the following:

1. Anything in square brackets should be removed (eg. "[VA]" or "[D1]" would be covered).
2. Items in brackets should also be removed if they start with "disc" or "disk" (so "Buddha Bar (disc 1: Dinner)" would also be caught correctly).
3. Items in brackets containing just numbers (eg. "(2004)") would be removed.
4. Items in brackets starting with "feat" or "ft" should be removed (they may match like that, but they will also match without it).
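As a rough illustration, the four rules could be written as regular expressions in Python. This is only a sketch: clean_tag is a made-up name, the rules are applied literally as worded above, and nested brackets are not handled.

```python
import re

# One pattern per rule, in the order listed above.
_RULES = [
    # 1. Anything in square brackets, e.g. [VA] or [D1].
    re.compile(r"\[[^\[\]]*\]"),
    # 2. Round brackets whose content starts with "disc" or "disk".
    #    Taken literally, this also catches something like "(Disco Mix)".
    re.compile(r"\(\s*dis[ck][^()]*\)", re.IGNORECASE),
    # 3. Round brackets containing only digits, e.g. (2004).
    re.compile(r"\(\s*\d+\s*\)"),
    # 4. Round brackets whose content starts with "feat" or "ft".
    re.compile(r"\(\s*(?:feat|ft)[^()]*\)", re.IGNORECASE),
]

def clean_tag(tag):
    """Apply each rule in turn, then tidy up the leftover whitespace."""
    for rule in _RULES:
        tag = rule.sub(" ", tag)
    return re.sub(r"\s+", " ", tag).strip()

print(clean_tag("Buddha Bar (disc 1: Dinner)"))
print(clean_tag("Cream Classics [D1]"))
print(clean_tag("Album (Deluxe Edition)"))
```

Round brackets that match none of the rules, such as "(Deluxe Edition)", are left alone, which is the main difference from removing every bracketed group.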

Various Artists and "x vs. y" remixes are going to be hard ones to consider. I accept that no solution I come up with will be perfect but I'd like to try and match as many songs as possible if I can.

I will definitely look into the freeDB data as this is downloadable.
posted by mr_silver at 8:38 AM on October 4, 2006

I think you'll have a hard time coming up with a set of rules that will handle even half of the wacky things people do. For example, all of my soundtracks have the artist as "Soundtrack". I think I do the same thing with compilations - everything is listed under various. And the title tags for those tracks all contain the actual artist and title of the song (like owenkun mentioned).

(In the case of the soundtracks, it's kind of overkill, since I think I've got the genre set as "Soundtrack" as well - but I don't really use the genre tag for anything.)

The other thing I do that's slightly abnormal is track numbering - if it's a multi-disc album, I treat it as one big album. So if there are 13 tracks on the first disc, the first track on the second disc is numbered as 14 under the same album name.
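If a catalogue lists tracks per disc, a continuous number like this can be mapped back to a disc and a track on that disc. A small Python sketch, assuming the number of tracks on each disc is known from somewhere else (the function name and the list argument are hypothetical):

```python
def to_disc_and_track(running_number, tracks_per_disc):
    """Map a continuous track number to (disc, track_on_disc), both 1-based."""
    if running_number < 1:
        raise ValueError("track numbers start at 1")
    remaining = running_number
    for disc, count in enumerate(tracks_per_disc, start=1):
        if remaining <= count:
            return disc, remaining
        remaining -= count
    raise ValueError("track number exceeds the total number of tracks")

# 13 tracks on disc 1, 12 on disc 2: running track 14 is disc 2, track 1.
print(to_disc_and_track(14, [13, 12]))
```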
posted by flipper at 8:50 AM on October 4, 2006

I change the artist name on cover songs to artist singing (original artist cover).

Sometimes when I sample mp3s from music blogs they come with the name of the blog as the album name.

Also if I get a mix cd from a friend, the album name is often the name that the friend gave the mix and not the original title. Don't know how to work around these problems though.
posted by rmless at 8:50 AM on October 4, 2006

I use (and have seen elsewhere) this convention in Artist:

Justin Timberlake [f/ Timbaland]

So you may want to add that to your "feat"/"ft" rule.
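A sketch of a featured-artist rule in Python that accepts round or square brackets and the markers mentioned in this thread ("feat", "feat.", "ft", "f/", "f.", "featuring"). The name split_featured is made up, and the guest list is returned as one unsplit string because band names can themselves contain "and" or "with".

```python
import re

_FEAT = re.compile(
    r"""
    \s*[(\[]\s*                              # opening ( or [
    (?:(?:featuring|feat|ft)\b\.?|f/|f\.)    # marker, with optional full stop
    \s*
    (?P<guests>[^)\]]*)                      # everything up to the closing bracket
    [)\]]                                    # closing ) or ]
    """,
    re.IGNORECASE | re.VERBOSE,
)

def split_featured(artist):
    """Return (main_artist, guests); guests is None if no marker is found."""
    m = _FEAT.search(artist)
    if not m:
        return artist.strip(), None
    main = (artist[:m.start()] + artist[m.end():]).strip()
    return main, m.group("guests").strip()

print(split_featured("Justin Timberlake [f/ Timbaland]"))
print(split_featured("Pearl Jam"))
```

Searching on the main artist alone and keeping the guests as a secondary hint seems like a reasonable use of the two return values.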

And, yes, I am bringing sexy back.
posted by Rock Steady at 9:56 AM on October 4, 2006

You might try and take a look at the matching routines in XBMC. They do a fairly good job (75%) of matching scene releases up to IMDB which have all kinds of extra bullshit in them.

They also match mp3 tags to CDDB, I believe, so you may just be able to adapt their routines to your purposes.

Not sure what licensing issues would affect you. You can find the source here.
posted by fishfucker at 10:01 AM on October 4, 2006 [1 favorite]

I would suggest a simpler heuristic: just start dropping words from the end until you start getting matches. The more words you have to drop, the less "certain" you are of the match.

You may want to start out by dropping "stop words" such as "the", "a", "an", and "and" wherever they occur before doing this.
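A sketch of this heuristic in Python. It assumes the catalogue is available as a plain list of titles and that a "match" means an exact match after lower-casing and stop-word removal; a real server-side search would replace the dictionary lookup. The names normalise and match_album are made up for illustration.

```python
import re

STOP_WORDS = {"the", "a", "an", "and"}

def normalise(text):
    """Lower-case, split into words, and drop stop words wherever they occur."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def match_album(tag, catalogue):
    """Return (matched_title, words_dropped), or (None, None) if nothing matches."""
    index = {}
    for title in catalogue:
        index.setdefault(tuple(normalise(title)), title)
    words = normalise(tag)
    # Try the full word list first, then drop one word from the end each time.
    for keep in range(len(words), 0, -1):
        hit = index.get(tuple(words[:keep]))
        if hit is not None:
            return hit, len(words) - keep
    return None, None

catalogue = ["Cream Classics", "The Best of Cream"]
print(match_album("Cream Classics (disc 1)", catalogue))
print(match_album("The Best of Cream [D1]", catalogue))
```

The second value in the result is the number of words that had to be dropped, so a larger number means a less certain match.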
posted by kindall at 10:33 AM on October 4, 2006

This is how I corrupt them.

Various albums get (Various Artists) after the album name

Soundtracks get [Soundtrack]

Singles get (Single)

I don't think I'm that weird....
posted by gergtreble at 3:06 PM on October 4, 2006

A lot of my albums have the record label and/or the catalog number after the album.
posted by scodger at 3:40 PM on October 4, 2006

Usually the additional stuff is at the end, so you'll have to look from the front until you get a match or a couple matches, then pick among them.

Any chance musicbrainz would let you use their fingerprinting service?
posted by Mr. Gunn at 8:07 PM on October 4, 2006

Are you doing anything with musicals? I honestly can't think of a more confusing, more mislabeled genre.

For example, take Anything Goes, which was composed by Cole Porter. But every song is sung by a different person -- do I give every song a different artist (what about duets?), or use the composer as the artist (blatant misuse of the field), or do something else altogether?

What about revivals? The same cast recording might have the album titles Anything Goes 1987 BRC, Anything Goes Broadway Revival Cast, Anything Goes Patti LuPone Revival, Anything Goes Howard McGillin Revival, Anything Goes Patti LuPone & Howard McGillin Revival -- with or without a colon after each title; with or without the year; with the year before or after the cast; with the production date, recording date, album release date, or CD reissue date as the year.

And don't forget there's the 1962 OOBC, 1956 soundtrack, 1988 studio cast, 1989 LRC, 2001 LRC...how many more am I forgetting?
posted by booksandlibretti at 9:08 PM on October 4, 2006

use the composer as the artist?

Define a "Composer" field and populate that.
posted by meehawl at 7:12 AM on October 5, 2006

This is gonna be a little long lol.

I am a nut when it comes to organizing my music collection.

Here are some guidelines I work by:

1.) The majority of my mp3 files are named in one of the following two ways:

<artist> - <song title>
e.g. Barenaked Ladies - One Week

<# of track> - <song title>
e.g. 04 - Stairway To Heaven

I use the first method when it's just a loose song in the "general" directory. Any artist I have more than 3-5 songs from gets their own subfolder with the band/group name. Then album names go into those folders. If it's a full album I usually use the second method above, to cut down on filename length.

2.) I do the majority of my tag editing in iTunes. It's easy, simple to move onto the next song, and I can add lyrics, album art, and extra comments.

3.) For any multi-disc collection, such as a Ministry Of Sound Clubber's Guide, which can be anywhere from 2 to 5 discs, I label the album name as follows: Ministry Of Sound Annual 2005 (Disc 1), and tag the files on Disc 1 as such in iTunes. That way, with a quick search I can get to the correct album, whereas if the disc was not stated in the title the whole album would be just one big mash.

4.) Unless there are comments I want, I will remove anything from the comment field. Comments are usually reserved for live material, such as the venue it was recorded at. I don't need to have the title tell me it was LIVE AT RED ROCKS 9-18-1993 featuring Tom Petty and @(#*@*#(@*(#" etc. Any little details about the song I want to remember, such as whether it was unreleased, rare, etc., also go there.

5.) All songs that are live or acoustic and not easily identifiable as such from the title will have the (acoustic), (live) suffix. Details again in the comment section.

6.) I didn't use to, but I'm starting to tag years for albums releases, although single songs will take forever to do.

7.) I'm a stickler for proper spelling and very rarely use abbreviations. Sometimes I'll use feat., sometimes if there's a few artists it'll be f. and then the artists separated by commas.

8.) Maybe it's OCD, but I like Song Titles to be all first letter Capitalized. Take for instance "over the hills and far away." The file and tag will be edited to "Over The Hills And Far Away." Some people will leave ands, the, of etc lowercase but that has always irked me.

9.) Unless the album name is ridiculously long, or it's a remix that makes the filename ridiculously long, I'll leave the filenames as is. A lot of people use ripping software that puts either hyphens or underscores to denote spaces. I really dislike that, which is why I do so much editing before adding to my library, but if it's something I might not use that often, meh, I'll leave it be and make sure the tags are right.

10.) Sometimes, for a few of my complete discographies, for chronology's sake I'll add a 01 - , 02 - , 03 - prefix to the album folder names inside the band folder. That way I know the order they were released in, in case I'd like to start at the beginning of an artist and work forward. For instance I've done this with U2, Sting, Michael Jackson, Pearl Jam. It helps me to find things quicker as well.

Well that's all I'm going to write, it's way too much, but you can get a good idea of my naming conventions and how they might differ from other people's as well as what I find that I don't like and have to change.

posted by PetiePal at 9:49 AM on October 5, 2006

Define a "Composer" field and populate that.

There is a composer field, and I use it. But he's talking about how you use the artist and album fields. Also, I sort by the artist field to browse, and filling out the composer field doesn't affect that -- all the problems I mentioned above still exist.
posted by booksandlibretti at 11:42 AM on October 5, 2006

All the problems I mentioned above still exist

I think doing everything with just two fields, Album and Artist, is always going to lead to an awful mess. For it to work, you are going to have to adhere rigidly to a specific delimiting pattern. For that much work, you may as well break it out into field columns.

I use the Artist tag for the Main Artist, and use "People1", "People2", etc etc. I also define some custom tags for label name and for catalog number. Given that I follow a lot of German electronica, random name changes, fluid groupings, and anonymity are par for the course. So I also have some tags for individual artist names within groups that sometimes get used. I also embed the biographical text info from discogs or similar in the "Note" tag. That helps provide context.

I store a few different custom view sort orders so I can get a quick run down of a title or composer/performer with a few clicks. I can also combine the fields in any order for a Smart List, or use database expressions for some funky stuff with, for example, dates and bpms. It's pretty easy to dump any combined information back into a temporary tag buffer, then write it back to the file if outputting to a simple schema or to use for mass file renaming.

I use Media Center.
posted by meehawl at 1:56 PM on October 5, 2006

every song is sung by a different person -- do I give every song a different artist (what about duets?), or use the composer as the artist (blatant misuse of the field), or do something else altogether?

Just by the by, iTunes 7 supports an Album Artist field which is a significant aid in dealing with compilations, soundtracks, musicals, etc. Each track can have its own artist, as appropriate, but the album's artist can be something like "Various" or "Soundtrack" or "Cole Porter" or whatever you feel the album's artist is. Very handy.
posted by kindall at 7:14 PM on October 5, 2006
