Compressing Dolphin vocalizations and Boat noise?
June 25, 2007 12:37 AM
Hi there, I am doing an honours project on the effects of boat noise on dolphin communication and my supervisor had this idea of using compression to analyse this.
He basically suggested that we record dolphin vocalizations in the presence and absence of boat noise, then compress the files and determine which compresses more (dolphin sounds with the boat noise, or dolphin sounds without the boat noise). His thinking was that less complex sounds should compress more, so he thinks that boat noise will compress more than dolphin vocalizations. Thus, if dolphin vocalizations happen to compress more in the presence of boat noise than they do without it, this would sort of imply they are "losing information", so to speak. Does this make sense?
I asked on another forum, and most people seem to think the opposite - that boat noise will compress less than dolphin sounds because it is more random. This makes sense to me, because I guess boat noise has a lot of different frequencies, making it hard to compress. I mentioned this to my supervisor, and he suggested maybe cutting out the random high frequencies in the boat noise to make it more level with the dolphin sounds, and then compressing from there and comparing them.
I was wondering what all your opinions are on this in general? Do you think the idea would work? If not, why? And do you have suggestions for something else I could do? I have no experience with audio-related things, so this is completely new to me. I am really interested in hearing what you have to say and ANY advice would be greatly appreciated. Thanks!!
P.S.: I will be using a lossless compression program like FLAC to compress the raw files. I have been out in the field and so far just have a recording of dolphin sounds in the presence of boat noise, so I still need to get a recording of dolphin sounds *without* the boat noise to test it out.
You have to filter out the boat noise completely to get useful data, but I'm sure similar problems have been solved in passive sonar applications.
posted by BrotherCaine at 1:00 AM on June 25, 2007
I'm no audio expert either, but I will tell you this: I don't think it will work. My reasoning may be flawed -- as I've said, I'm no expert -- but here's my logic:
"Compression" isn't going to work on some reliable, proportionate level necessarily. Boat noise may *generally* be more random than dolphin communication (or vice versa), but the ease of a file's compression is the result of many random factors. I'll give you an example of why I think this:
I have the same song on two different CD compilations. The two CDs sound as if they use the same mastering of the song (they're both on the same label too), and the uncompressed file on them is about 15.3 MB. I imported both into iTunes using the Apple Lossless compression utility. One of the compressed files ended up being 11.3 MB, and the other 7.4 MB. That's a huge difference, based on who knows what. I imported them both again, with identical results.
Why would two seemingly identical versions of the same song end up with drastically different compressed file sizes? It just happens that way sometimes. The mastering may have been slightly different (although one can't hear it, and the running time is identical on both "versions.") It's probably just that the information on one - for whatever reason - allowed for heavier compression of the file.
You'd have to compress zillions of files to establish an average compression ratio for recordings with and without boat noise, but if my scenario (and I've got many more like it) is of any value, that 50% plus difference would be tough to explain.
I would think that other factors would also distort any statistical meaning beyond repair. You'd have to control for the distance of your recordings from the dolphins AND boats, the depth under water at which you're recording, wave size (that's got to affect sound), miscellaneous noises, the relative size, speed and sound of each boat making noise, other atmospheric conditions (humidity, for instance) and a zillion other things. I'm not a scientist, but this would seem like an insurmountable number of very variable variables to overcome to the extent that you would have unmuddled and valid evidence to support your thesis. Not to get you down - you must know scads more about all this than me!
How would I approach this idea with my scant knowledge? I'd be looking at things such as the "length" of communications between dolphins, the relative pitch of this communication with and without boat noise, that sort of thing.
One factor I'd be worried about was whether there was a difference in dolphin speech simply because the dolphins were "communicating" something related to the boats being there - "danger!" or something like that. In other words, the dolphins wouldn't be "losing information," they'd just be talking about something they don't talk about when boats are around. How could you determine the difference?
A lot of the above is probably silly, but maybe it'll help you somehow. Forgive me if I've wasted your time.
posted by Dee Xtrovert at 1:07 AM on June 25, 2007
You've completely misinterpreted your supervisor's idea. The idea is not to compress boat noise. That's not part of the experiment. The idea is to compress dolphin vocalization A and dolphin vocalization B. Vocalization A was produced in the presence of boat noise; vocalization B was produced in its absence. (You'd probably have to have a set of A's and a set of B's and compare the average amount of compression they took.) You're looking for a qualitative change in the amount of boat noise.
The difficult part of this project is to record dolphin vocalization A without recording any boat noise along with it. This is a technical problem, not an experimental-design problem. An underwater mic with a small receptive field might be able to do it. Alternatively, you might record a second track, the boat noise away from the dolphin, and do some subtraction of one track from the other to isolate the dolphin vocalization. That would require good syncing and a good audio manipulation program.
If your main interest is dolphin behavior, not acoustics or recording engineering, I'm not sure that this is going to be a very fun project.
posted by ikkyu2 at 1:17 AM on June 25, 2007
That would require good syncing and a good audio manipulation program.
It can't be done by direct processing of a digitization of the analog stream. If you have the same sound recorded with two mics, you can't subtract one from the other to cancel it out -- the phases will almost certainly be different and you could actually end up amplifying it instead of attenuating it.
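The phase problem can be illustrated with a toy example. This is only a sketch under idealised assumptions: both "mics" hear a single pure sine tone, and the only difference between them is a fixed phase offset. The function name, the 1 kHz tone and the 48 kHz sample rate are made up for the illustration and have nothing to do with real hydrophone data.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def residual_after_subtraction(phase_offset, freq_hz=1000.0, rate_hz=48000, n=4800):
    """Subtract a phase-shifted copy of a sine from the original.

    Returns (rms of the original, rms of what is left after subtraction).
    """
    mic_a = [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]
    mic_b = [math.sin(2 * math.pi * freq_hz * i / rate_hz + phase_offset) for i in range(n)]
    diff = [a - b for a, b in zip(mic_a, mic_b)]
    return rms(mic_a), rms(diff)

# Hypothetical phase offsets between the two microphones.
for label, offset in [("aligned", 0.0), ("quarter cycle", math.pi / 2), ("half cycle", math.pi)]:
    original, residual = residual_after_subtraction(offset)
    print(f"{label:>13}: residual is {residual / original:.2f}x the original level")
```

Only the perfectly aligned case cancels. With a half-cycle offset the "subtracted" signal comes out at about twice the original level, which is the amplification described above. Real boat noise is broadband, so every frequency would arrive with its own offset.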
posted by Steven C. Den Beste at 1:55 AM on June 25, 2007
The obvious way to quantitatively analyse this would, to my mind, be spectral analysis. I assume you're interested in the effects of boat noise masking dolphin communication, and the closer something is to white noise, the better its masking properties. Spectral analysis of the waveforms would be a useful way of exploring this. If the periodogram is roughly flat, then you're looking at white noise, and testing for deviations from that may be useful. There are also many other time-series statistical techniques, such as seasonal decomposition, that are useful for isolating the "random" (i.e. white noise) component of a signal.
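One common way to put a number on "how close to white noise" a spectrum is, is spectral flatness: the geometric mean of the periodogram divided by its arithmetic mean. What follows is a minimal sketch, not a recommended analysis pipeline. The signals are synthetic, the function names are invented here, and the DFT is a naive O(n^2) loop so that nothing beyond the standard library is needed.

```python
import cmath
import math
import random

def periodogram(samples):
    """Power at each positive-frequency DFT bin (naive DFT, DC bin skipped)."""
    n = len(samples)
    powers = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        powers.append(abs(acc) ** 2 / n)
    return powers

def spectral_flatness(samples, eps=1e-12):
    """Geometric mean / arithmetic mean of the periodogram.

    Close to 0 for a pure tone, much higher for white noise.
    eps keeps log() finite for bins that are numerically empty.
    """
    p = [v + eps for v in periodogram(samples)]
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 256
tone = [math.sin(2 * math.pi * 16 * i / n) for i in range(n)]  # stand-in for a whistle
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]                # stand-in for broadband noise

print("tone flatness :", spectral_flatness(tone))
print("noise flatness:", spectral_flatness(noise))
```

On real recordings you would compute this over short windows and watch how it changes when a boat passes, rather than over one whole file.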
posted by Jimbob at 2:14 AM on June 25, 2007
(Your question doesn't really make it clear what you're trying to test... you are interested in the masking ability and noisiness of boat motors, right? In any case, relying on the output of a piece of software like FLAC, which is basically a black box unless you take apart the source code, probably isn't the best basis for hypothesis testing.)
posted by Jimbob at 2:18 AM on June 25, 2007
I'm with Jimbob... Fourier analysis or other spectral decomposition techniques. The boat noise will stick out like a sore thumb.
If Professor is postulating that the dolphins can't hear as well in the presence of noise, he gets a prize for stating the obvious. Neither can I!
If he's looking for a computationally cheap way of making his point, spectral analysis costs the same as compression and can be done quasi-real time. It suffers from being very relevant.
posted by FauxScot at 3:18 AM on June 25, 2007
I don't know much about acoustic recording of dolphins, but I have worked with people who do, and I would be shocked if this experimental work has not already been done. Cetacean acoustics is a very well studied field, and since boat interference is a common problem, it should be well studied too. Check out this NOAA site.
If you want some email addresses of people in the field drop me an email (in profile).
posted by afu at 3:41 AM on June 25, 2007
Without understanding the mechanics of the compression, this is useless. (I suspect it's useless anyway, but....)
In particular, lossy audio compression algorithms work (in part) by throwing away sounds that humans are less able or unable to hear, thus the name "psychoacoustic model". Since dolphins' ranges of hearing are no doubt fundamentally different than humans', I suspect what will happen is that lossy compression will throw out high-frequency dolphin (and motor) sounds, and you'll discover there's little dolphin sound in the 200-400 Hertz range of human speech that the lossy compression strives to keep.
Note also that off-the-shelf microphones generally can't record much higher than 20 kHz; dolphin sonar goes up to 200 kHz, according to various reports. (Do dolphins "communicate" with sonar, or just navigate with it? I have no idea. One might explore this by seeing if, like certain bats, dolphins "close" their ears to prevent damage when emitting sonar, then reopen the ear to hear the returned echo.)
I think what you want to do is find frequencies where motor noise and dolphin sounds overlap, or where they don't but the motor noise, by damaging or masking the dolphins' receptors, nevertheless prevents hearing at the frequencies dolphins do use.
Your supervisor, is he a marine biologist?
posted by orthogonality at 3:41 AM on June 25, 2007
Do dolphins "communicate" with sonar, or just navigate with it?
Dolphins make two distinct types of sounds. "Clicks" are high-frequency bursts of sound that are used for echolocation. "Whistles" are lower-frequency, highly varied sounds used for communication within the species.
posted by afu at 3:52 AM on June 25, 2007
Whoa. Did NOT expect to see a question of this type show up on Ask Metafilter.
So here's the deal: Is it possible to transcribe dolphin vocalizations? There is in fact work that has looked at whale utterances, and indeed one can generate a context-free grammar from them.
What you're going to want to do is:
1) Record a dolphin normally
2) Record a dolphin w/ boat noise
3) Transcribe both datasets, manually
4) Run through an automated CFG generator, a la Sequitur, and see if you get significantly different structure with and without the noise.
Ping me and I can send you my Sequitur mods (XML output, some visualization -- I've been using this for very different things).
BTW, the model of compression your supervisor had won't work; the existence of the boat noise will wildly override any variable bitrate encoding effects from the dolphin noise. That's why you want to do the transcription, actually -- to focus on what's coming from the dolphin in differential environments, not what's up in the environment.
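To give a feel for step 4, here is a rough sketch of grammar inference on a transcribed symbol sequence. It is not Sequitur. It is a much simpler stand-in that repeatedly replaces the most frequent adjacent pair of symbols with a new rule (closer in spirit to Re-Pair), and the symbol labels "A", "B", "C" and both example sequences are hypothetical transcriptions invented for illustration.

```python
from collections import Counter

def infer_rules(symbols):
    """Greedy digram replacement: returns (top-level sequence, rules).

    Rule names are tuples like ("rule", 0) so they can never collide
    with the string labels used for transcribed sounds.
    """
    seq = list(symbols)
    rules = {}
    while len(seq) > 1:
        pairs = Counter(zip(seq, seq[1:]))
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no adjacent pair repeats, nothing left to factor out
        name = ("rule", len(rules))
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(seq, rules):
    """Undo the replacement, recovering the original symbol sequence."""
    out = []
    for s in seq:
        if s in rules:
            out.extend(expand(rules[s], rules))
        else:
            out.append(s)
    return out

# Hypothetical transcriptions: one strongly patterned, one less so.
quiet = "A B A B C A B A B C".split()
noisy = "A C B A A B C B A C".split()

for label, transcript in [("quiet", quiet), ("noisy", noisy)]:
    top, rules = infer_rules(transcript)
    print(f"{label}: {len(transcript)} symbols -> top level {len(top)}, {len(rules)} rules")
```

The numbers to compare between conditions would be things like how many rules are found and how short the top-level sequence gets, averaged over many transcripts rather than a single pair.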
posted by effugas at 4:08 AM on June 25, 2007
Does your supervisor really know anything about information theory? I would not suggest just compressing stuff at random and looking for correlations. Perhaps check out a book on information theory and communication systems or go talk to someone in the electrical engineering department to get some theoretical background on why non-lossy compression works the way it does.
posted by GuyZero at 6:06 AM on June 25, 2007
Guy,
Information theory works a little differently for something like audio, where we inherently throw away most of the signal after doing frequency analysis, versus streams of bits. We tend not to look for symbols, and repeat them, in audio. This is because symbol matching in audio is *inherently* a fuzzy matching operation, and fuzzy matches are annoyingly difficult. Discrete data is convenient in that it's very easy to see when the same sequence occurs again, therefore an entire compression regime can form around not re-transmitting the same thing again.
So, again: We miss most of audio anyway, so the game in audio compression is to miss a bit more without us noticing. We don't miss anything from a file on our hard drive, but we can easily tell when the same exact message was repeated. So the game in data compression is to do repeats. Racergirl's supervisor is trying to use a trick from data compression on audio compression, and that ain't going to fly.
But if she converts the signal (dolphin noise) from audio to semantically normalized bit sequences...yeah, that has a chance of exposing coolness.
posted by effugas at 2:56 PM on June 25, 2007
This thread is closed to new comments.
You'd need multiple simultaneous directional microphones, and some very significant signal processing to separate them. (The signal processing could be done after the fact in non-real time, but it still wouldn't be simple.)
The people in the other forum were correct: "simple" compresses a lot. "Random" doesn't compress at all. However, boat noise isn't random. But it isn't communication, either. So there's no telling whether dolphin sounds alone would compress more or less than boat noise alone. The only thing that's certain is that dolphin sounds plus boat sounds would compress less than either alone.
And that doesn't prove that the dolphins are losing information, either. If you talk to a person in a perfectly quiet room, and then later talk to them with a radio at low volume in the background, then you talk at the same speed in both cases. But recorded sound from the latter case would compress less than the former case.
It doesn't sound to me as if your supervisor understands Claude Shannon's work at all. Before you undertake this I think you'd be well advised to try to understand the mathematical basis of compression, and what it means for a bit stream to be "high entropy" or "low entropy".
I think the problem here is that your supervisor thinks that "information" is equivalent to "meaning" -- and that's not the case. A random bitstream is very high entropy, but it doesn't have any meaning whatever.
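The "simple compresses, random doesn't" point is easy to try on synthetic signals. The sketch below uses zlib from the Python standard library purely as a convenient stand-in for a lossless compressor. FLAC works differently (it is built around prediction of the waveform), so the actual ratios would differ. The 400 Hz tone, the noise level and the 8 kHz sample rate are arbitrary choices for the illustration.

```python
import math
import random
import struct
import zlib

RATE_HZ = 8000  # arbitrary sample rate for the synthetic signals

def to_pcm16(samples):
    """Pack floats in [-1, 1] as little-endian 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(round(s * 32767)) for s in clipped))

def compression_ratio(samples):
    """Compressed size divided by raw size (smaller means more compressible)."""
    raw = to_pcm16(samples)
    return len(zlib.compress(raw, 9)) / len(raw)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = RATE_HZ * 2         # two seconds of samples
tone = [0.5 * math.sin(2 * math.pi * 400 * i / RATE_HZ) for i in range(n)]  # "simple"
noise = [rng.gauss(0.0, 0.1) for _ in range(n)]                             # "random"
mixed = [t + x for t, x in zip(tone, noise)]                                # tone plus noise

for label, signal in [("tone", tone), ("noise", noise), ("tone+noise", mixed)]:
    print(f"{label:>10}: compressed to {compression_ratio(signal):.1%} of raw size")
```

The tone shrinks to a small fraction of its raw size while the noise and the mix barely shrink at all, and none of the three signals means anything. The ratio measures regularity in the bytes, not meaning.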
posted by Steven C. Den Beste at 12:56 AM on June 25, 2007