I didn't think I had that many John Cage covers ...
November 29, 2008 7:48 AM

Several mp3s in my large collection are silent, or almost entirely so. Please help me find ways of detecting them automatically.

I ripped my entire collection of 1200+ CDs using cdparanoia and put the discs in storage. The files are now served by a Linux box running Firefly Media Server. For unknown reasons, some of the tracks have ended up silent, or contain only very short bursts of noise separated by silence. I've uploaded a typical example; it's supposed to be Robyn Hitchcock and The Egyptians – Oceanside.

What I'd really like is a script or command-line program that will scan an mp3 and indicate if it's mostly or entirely silent. It would be preferable if the process were fairly quick, as there are about 20000 files to scan.
posted by scruss to Computers & Internet (5 answers total)

Best answer: Hmm... I wonder if an mp3 that is mostly silence will have a smaller filesize than a typical mp3 of similar length? If so, you might be able to write some sort of script that compares the length and filesize of each mp3 and flags any that seem to have unusually small filesizes.
posted by oulipian at 8:54 AM on November 29, 2008
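
A minimal sketch of the size-versus-length comparison suggested above, in Python. It assumes the playing time in seconds is already known from some other source (a tag reader, a CD track listing). The function names and the default threshold are illustrative assumptions, and the threshold would need tuning against known-good files.

```python
def average_kbps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in kbit/s implied by a file's size and playing time."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def looks_too_small(size_bytes: int, duration_seconds: float,
                    threshold_kbps: float = 32.0) -> bool:
    """Flag a file whose implied average bitrate falls below the threshold."""
    return average_kbps(size_bytes, duration_seconds) < threshold_kbps
```

For example, a four-minute track of 480,000 bytes works out to 16 kbit/s, far below what a normally encoded song would use.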

Response by poster: Yes, they are smaller. I was going to add that most of the files are VBR mp3s, and so silence results in a smaller file. A couple of other things:
  • the files still have their original time of creation as time stamps, so I could search around the time of known-bad files
  • silent files zip down to almost nothing, while real mp3s only compress by a small amount.

posted by scruss at 9:16 AM on November 29, 2008
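
The zip observation above could be turned into a check along these lines. This is only a sketch: it uses Python's zlib module (the same DEFLATE compression that zip normally uses), reads just the start of each file to keep a big scan quick, and leaves the cut-off ratio to be chosen by comparing known-good and known-bad files. The function names and the sample size are assumptions.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    if not data:
        return 1.0
    return len(zlib.compress(data)) / len(data)


def file_compression_ratio(path: str, sample_bytes: int = 1_000_000) -> float:
    """Ratio for the first sample_bytes of a file, to keep a large scan quick."""
    with open(path, "rb") as handle:
        return compression_ratio(handle.read(sample_bytes))
```

A real mp3 should give a ratio close to 1, while a silent one should come out much lower.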

A quick analysis of the file showed that the most frequent byte is 'U' (hex 55), with a count of roughly 550K, followed by NUL (hex 00) at roughly 150K.

If your other files show the same pattern, it would be easy to write a script to flag them.
posted by gadha at 9:38 AM on November 29, 2008
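
A rough sketch of such a script in Python, counting byte frequencies with the standard library. The function names, the choice of 0x55 and 0x00 as suspect bytes (taken from the one sample file analysed above) and the 50% share threshold are all assumptions that would need checking against more files.

```python
from collections import Counter


def top_bytes(data: bytes, n: int = 2) -> list[tuple[int, float]]:
    """Return the n most frequent byte values with their share of the data."""
    if not data:
        return []
    counts = Counter(data)
    return [(value, count / len(data)) for value, count in counts.most_common(n)]


def dominated_by(data: bytes, suspects=(0x55, 0x00), min_share: float = 0.5) -> bool:
    """True when the suspect byte values together make up at least min_share."""
    if not data:
        return False
    counts = Counter(data)
    return sum(counts[b] for b in suspects) / len(data) >= min_share
```

Reading each file with `open(path, "rb").read()` and passing the result to `dominated_by` would flag files that look like the sample.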

Response by poster: While gadha's answer might be a better general solution, I marked oulipian's as best because it worked best for me. lame in VBR mode encodes silence at less than the minimum allowed MP3 rate of 32 kbit/s, so all I really needed to do was see if the file's average bitrate was extremely low.
I did it with the following Perl one-liner:

perl -le 'use MP3::Info; foreach (@ARGV) {my $info = get_mp3info($_); print $_ if ($info->{BITRATE} <>files

Took a few minutes to scan my whole library.
posted by scruss at 3:13 PM on November 30, 2008

Response by poster: (Ah, something got screwed up in my preview. Okay, you can use something like MP3::Info.)
posted by scruss at 3:15 PM on November 30, 2008
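
For anyone wanting the same low-bitrate check without the mangled one-liner, here is a hedged sketch in Python. It is not the poster's script: it swaps in the third-party mutagen library for Perl's MP3::Info, and the function names and the 32 kbit/s default are assumptions based on the description above.

```python
from pathlib import Path


def below_minimum(bitrate_bps: int, minimum_kbps: int = 32) -> bool:
    """True when a reported average bitrate is under minimum_kbps."""
    return bitrate_bps < minimum_kbps * 1000


def find_suspects(root: str, minimum_kbps: int = 32):
    """Yield paths of .mp3 files under root whose average bitrate is too low."""
    # Imported here so the helper above works without mutagen installed.
    from mutagen import MutagenError
    from mutagen.mp3 import MP3

    for path in sorted(Path(root).rglob("*.mp3")):
        try:
            bitrate = MP3(str(path)).info.bitrate  # bits per second
        except MutagenError:
            continue  # unreadable file; skipped here, could be reported instead
        if below_minimum(bitrate, minimum_kbps):
            yield path
```

Looping over `find_suspects("/path/to/music")` (a placeholder path) and printing each result gives a list of files to re-rip.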
