I didn't think I had that many John Cage covers ...
November 29, 2008 7:48 AM
Several mp3s in my large collection are silent, or almost entirely so. Please help me find ways of detecting them automatically.
I ripped all of my 1200+ CD collection using cdparanoia, and put them in storage. The files are now being served by a linux box running Firefly Media Server. For unknown reasons, some of the tracks have ended up being silent, or containing very short bursts of noise separated by silence. I've uploaded a typical example; it's supposed to be Robyn Hitchcock and The Egyptians – Oceanside.
What I'd really like is a script or command-line program that will scan an mp3 and indicate if it's mostly or entirely silent. It would be preferable if the process were fairly quick, as there are about 20000 files to scan.
Response by poster: Yes, they are smaller. I was going to add that most of the files are VBR mp3s, and so silence results in a smaller file. A couple of other things:
- the files still have their original time of creation as time stamps, so I could search around the time of known-bad files
- silent files zip down to almost nothing, while real mp3s only compress by a small amount.
posted by scruss at 9:16 AM on November 29, 2008
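The compressibility observation above (silent files zip down to almost nothing, real mp3s barely shrink) can be turned into a rough check. This is only a sketch: the function names and the 0.5 threshold are assumptions for illustration, not values measured on this collection.

```python
import zlib
from pathlib import Path


def compression_ratio(path):
    """Return compressed size divided by original size for the file at `path`."""
    data = Path(path).read_bytes()
    if not data:
        return 0.0  # treat an empty file as fully compressible
    return len(zlib.compress(data, 6)) / len(data)


def looks_silent(path, threshold=0.5):
    """Flag a file whose bytes compress far better than real audio would.

    The 0.5 threshold is an illustrative assumption; tune it against a few
    known-good and known-bad files before trusting it.
    """
    return compression_ratio(path) < threshold
```

To scan a library, something like `[p for p in Path(root).rglob("*.mp3") if looks_silent(p)]` would list the suspects, where `root` is wherever the files live.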
A quick analysis of the file showed that the most common byte is 'U' (hex 55), with a count of approx 550K, followed by NUL (hex 00), with a count of approx 150K.
If your other files follow this characteristic it would be easy to write a script to flag them.
posted by gadha at 9:38 AM on November 29, 2008
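A minimal sketch of the byte-count idea above, assuming the other bad files are also dominated by 'U' (hex 55) and NUL (hex 00). The function name is hypothetical, and the share at which a file should be flagged is left to the reader, since the thread only gives counts for one example file.

```python
from collections import Counter


def filler_share(path, filler=(0x55, 0x00), chunk_size=1 << 20):
    """Return the fraction of bytes in the file that are one of `filler`."""
    counts = Counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            counts.update(chunk)  # iterating over bytes yields ints 0..255
            total += len(chunk)
    if total == 0:
        return 0.0  # an empty file has no filler bytes to count
    return sum(counts[b] for b in filler) / total
```

For ordinary compressed audio the share of any two byte values should be small, so a file where `filler_share` comes back large is worth a listen.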
Response by poster: While gadha's answer might be a better general solution, I marked oulipian's as best as it seemed to work best for me. lame in VBR mode encodes silence at less than the minimum allowed 32kbit/s MP3 rate, so all I really needed to do was see if the file's average bitrate was extremely low.
I did it with the following Perl one-liner:
perl -le 'use MP3::Info; foreach (@ARGV) {my $info = get_mp3info($_); print $_ if ($info->{BITRATE} <>files>
Took a few minutes to scan my whole library.
posted by scruss at 3:13 PM on November 30, 2008
Response by poster: (ah, something screwed up in my preview. Okay, you can use something like MP3::Info.)
posted by scruss at 3:15 PM on November 30, 2008
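The average-bitrate check described above comes down to simple arithmetic once a file's size and playing time are known. The sketch below assumes the duration has already been read by some tag library. The cutoff in the original one-liner was lost, so the default of 32 here is an assumption taken from the 32 kbit/s minimum mentioned in the post, and both function names are hypothetical.

```python
def average_kbps(size_bytes, duration_seconds):
    """Average bitrate in kbit/s (1 kbit = 1000 bits) from size and duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def suspiciously_quiet(size_bytes, duration_seconds, cutoff_kbps=32):
    """Flag a file whose average bitrate falls below the assumed cutoff."""
    return average_kbps(size_bytes, duration_seconds) < cutoff_kbps
```

For example, a four-minute track that occupies only 100000 bytes averages a little over 3 kbit/s and would be flagged, while a two-minute track of 480000 bytes sits exactly at 32 kbit/s and would not.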
posted by oulipian at 8:54 AM on November 29, 2008