How does software to detect musical key work?
September 4, 2006 6:26 AM

How does software which tries to detect the key of a piece of music work (and how reliable is it)?

As a some-time musician key detection always seems like a pretty complex job (great explanation of the full complexities from chrismears in this thread) so I was wondering how applications such as Mixshare and MixedinKey go about the job. Both seem to be aimed at people who want to do harmonic mixing so I guess they are primarily interested in dealing with dance music.
posted by rongorongo to Computers & Internet (4 answers total) 2 users marked this as a favorite
 
I am not familiar with the specific software, but I can venture a guess: independent of the substantive musical content of a passage of recorded audio, the key can be guessed at based on the dominant frequencies present in a spectral breakdown of the audio. The most notable sustained frequencies likely correspond to the fundamentals of the notes of the current "chord" in the piece.

So, do you have a lot of energy at [220 Hz = A, roughly 277 Hz = C#, roughly 330 Hz = E] and/or at doublings of those frequencies (octaves up the scale), with not as much energy devoted to other specific tones? For that region in the song, you can guess that the chord is A major.
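
As a rough illustration of that mapping from frequency to note, here is a minimal sketch. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz, and the function names are made up for this example; nothing here is taken from Mixshare or MixedinKey.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Return the nearest pitch class (0 = C ... 11 = B) for a frequency.

    Assumes equal temperament and the given reference tuning for A4.
    """
    # Distance from A4 in semitones, then shift to MIDI numbering (A4 = 69).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    # Folding by 12 discards the octave, so 220 Hz and 440 Hz are both "A".
    return midi % 12

def note_name(freq_hz):
    """Name of the nearest note, ignoring octave."""
    return NOTE_NAMES[pitch_class(freq_hz)]

if __name__ == "__main__":
    for f in (220, 277, 330, 440, 660):
        print(f, "Hz ->", note_name(f))
```

Frequencies near 220, 277, and 330 Hz come out as A, C#, and E, and their octave doublings land on the same three names, which is why the octave can be ignored when guessing the chord.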

Combine that sort of moment-by-moment analysis of the aural content with a large-scale analysis of the chordal movement in the song that measures against some known music-theoretical heuristics, and you can not only guess what chords happen at a given time but what key(s) the song is in.
posted by cortex at 7:20 AM on September 4, 2006


cortex's methods might work most of the time, but they are not the ways a thorough musicologist would follow.

I would guess that a piece could (theoretically) be in a key without that root note ever appearing. The key is established by a harmonic cadence, which is a sequence of chords that establishes the key or region of the music, by methodically eliminating alternative keys or regions.

Notes that are present and could belong to another key are put in their place and disambiguated by the sequence of harmonies that establishes the relationships.
posted by StickyCarpet at 9:12 AM on September 4, 2006


Best answer: Although I don't know these specific bits of software, I'd wager that cortex is essentially right. There's something called the Krumhansl-Schmuckler technique where you do a statistical match between the overall length of time that each pitch occurs in the sample, and a 'profile' for each major and minor key. The profiles basically say, "for a major key, the root note is the most common, then the fifth, then the major third, etc.", except that an actual numerical weighting is attached to each of the twelve possible notes. (I think the profiles were developed through a combination of basic musical theory and experimentation).



This figure (taken from this paper) shows the K-S profiles (in blue) for the major and minor keys. The x-axis is the pitch relative to the root note, and the y-axis is the relative rating.

For both scales, the root note (0) is the most common. In the major scale, though, the major third (4) is more common than the minor third (3), while in the minor scale it's the other way around, as you'd expect.

(The red lines show an improved version of the original profiles by David Temperley).

So you can generate one of these for each of the 24 keys (twelve major and twelve minor: A major, Bb major, B major, etc.), and then you can match that up to the pitch profile of your sample, and find the closest match.
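
The matching step can be sketched roughly as follows. This is an illustration under some assumptions, not what any particular program does: the weights are the commonly cited Krumhansl-Kessler profile values, the "statistical match" is taken to be a Pearson correlation, and `estimate_key` and its input format (twelve totals indexed by pitch class, 0 = C) are hypothetical names chosen for this example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler profiles, indexed by semitones above the root note.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def estimate_key(durations):
    """Return the best-correlated key for 12 pitch-class totals (index 0 = C)."""
    best_score, best_key = None, None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so its root weight lines up with this tonic.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(durations, rotated)
            if best_score is None or score > best_score:
                best_score, best_key = score, NOTE_NAMES[tonic] + " " + mode
    return best_key
```

All 24 candidates are scored and the highest correlation wins. Closely related keys (a major key and its relative minor, say) tend to score close together, which is one reason the result is a guess rather than a full harmonic analysis.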

It's not exactly a full harmonic analysis, which is why these programs don't boast a 100% success rate, but I'd imagine it works pretty well for dance music.

I'm a bit hazy on exactly how you get from an audio sample to a pitch profile, but it will basically involve doing an FFT (Fast Fourier Transform) to translate the audio wave into a frequency/strength graph. I think you also have to do something along the lines of taking many small samples (< 1 second) to counter the fact that each note you play generates many harmonic frequencies as well as the base frequency that you're interested in, but the FFT is the heart of it.
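
One way the audio-to-pitch-profile step might look, as a sketch only. Everything here is an assumption made for illustration: a small pure-Python radix-2 FFT (real software would use an optimized library), a Hann window on a single short frame, a 100 to 2000 Hz band, equal temperament with A4 = 440 Hz, and the hypothetical function name `pitch_profile`. It makes no attempt to separate a note's harmonics from its base frequency.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def pitch_profile(samples, sample_rate, min_hz=100.0, max_hz=2000.0):
    """Sum spectral power into 12 pitch classes (index 0 = C) for one frame."""
    n = len(samples)
    # A Hann window reduces leakage from tones that fall between FFT bins.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    spectrum = fft(windowed)
    profile = [0.0] * 12
    for k in range(1, n // 2):  # positive frequencies below Nyquist
        freq = k * sample_rate / n
        if min_hz <= freq <= max_hz:
            midi = round(69 + 12 * math.log2(freq / 440.0))
            profile[midi % 12] += abs(spectrum[k]) ** 2
    return profile

def synth_chord(freqs, sample_rate=8192, n=4096):
    """Test signal: equal-amplitude sine waves, n samples long."""
    return [sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs)
            for t in range(n)]
```

Feeding in half a second of a synthetic A major triad (`synth_chord([220.0, 277.18, 329.63])` at 8192 samples per second) puts nearly all of the energy into the A, C#, and E slots. For a whole track you would add up the profiles of many such frames.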
posted by chrismear at 11:01 AM on September 4, 2006 [1 favorite]


"It's not exactly a full harmonic analysis, which is why these programs don't boast a 100% success rate, but I'd imagine it works pretty well for dance music."

I've never heard of this kind of software before; sounds pretty neat. I just want to agree that this would probably work well for dance music etc., but I'd be extremely surprised if it had much success with more harmonically complex compositions.
posted by ludwig_van at 1:51 PM on September 4, 2006

