
Many iPhones, One recording?
November 20, 2009 11:14 AM   Subscribe

Audio/mobile phone idea/question: I was at my kid's band concert recently, trying to record some of it for Grandma. I only thought of it after we got there, so I was using my iPhone. Not a bad recorder, but it's handheld. It picks up me moving in my seat, people fidgeting, etc. Why isn't there an app/service where I could record the concert on my less-than-professional equipment, and a bunch of OTHER Dads and Moms record the same thing from other vantage points (with other noise!), then put 'em all together, subtract the noise, and end up with a beautiful, clear, stereo recording? With the iPhone's Bluetooth/WiFi/GPS, the iPhones should be able to figure out where they are relative to each other. Is this doable? Am I nuts? Does it already exist?
posted by davereed to Computers & Internet (14 answers total) 5 users marked this as a favorite
There's a CompSci PhD thesis in there somewhere, I'm sure.
posted by Oktober at 11:22 AM on November 20, 2009 [6 favorites]

Well, generally, you can't subtract 'noise' (like squirming in your seat, etc.) from a single-track recording. You can fuck with it in other ways to try and edit some of that out, but you can't just subtract it, to be sure.

I really doubt that simply combining iPhone recordings from various spots in the auditorium would result in something much better than yours. For one, you'd get a lot of sound pollution just from overdubbing all the recordings. For two, the iPhone's recording capability is probably not so good as to capture all of the nuances of the slightly louder sounds of the trombones on stage left v. the clarinets on stage right.

Generally, if you're trying to record something from a bit of a distance (such as a band concert from your seat), you would use a shotgun mic, which is spendy, and then you'd have to get a different recording device to run it through.

I'm not at all an expert in this sort of thing, so I will be very interested to see if someone has a better idea.
posted by Lutoslawski at 11:31 AM on November 20, 2009

This is the best idea I've heard in weeks. It's doable if you had the right people (audio engineer, computer engineer, and a mathematician), and I don't know of anyone who has done anything like this yet.

You'd need to have a hosted server component that collects everyone's recordings and sends the result back to everyone who has downloaded your app and opted into the recording session.
posted by judge.mentok.the.mindtaker at 11:31 AM on November 20, 2009

I've wondered about this for years -- it seems like there must be some way to average a group of audio recordings, keeping the common signal and discarding the discrete noise. As far as I know this tech does not exist today but likely will within the next, say, 10 or 15 years.
posted by The Winsome Parker Lewis at 11:36 AM on November 20, 2009

I'm not sure GPS, even with AGPS or DGPS, is accurate enough to allow for perfection in this kind of modeling.

There are many variables that would need to be taken into account, including but not limited to:

- Venue geometry
- Venue wall and ceiling composition
- Number and location of people acting as baffles in the audience
- Humidity level in the venue
- Air temperature
- Elevation

Doing this would be akin to mixing the dry components of a cake together, and then mathematically un-mixing them.

That said, as an audio guy, I love the idea.

I could certainly envision ways to remove some localized noises, using inverted summing, or quick mixing to different sources further away from the noise.

Possibly easier and cheaper would be to just remember to bring a good recorder and directional mic. Or a good mic to plug into your iPhone.
posted by tomierna at 11:42 AM on November 20, 2009 [1 favorite]
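tomierna's inverted-summing idea, in its simplest form: if a second track captures (mostly) just the offending noise, adding a polarity-inverted copy of it cancels that noise from the first track. A toy sketch, under the unrealistic assumption that the noise shows up identically in both tracks (in practice it never quite does):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 8000)
music = np.sin(2 * np.pi * 330 * t)          # stand-in for the concert
fidgeting = 0.8 * rng.standard_normal(len(t))  # stand-in for seat noise

near_stage = music + fidgeting   # your phone: music plus the noise
noise_only = fidgeting           # a phone right next to the fidgeter

# Invert the noise reference and sum: the shared noise cancels out.
cleaned = near_stage + (-noise_only)
print(np.allclose(cleaned, music))  # True
```

The whole difficulty in the real world is that the noise in the two tracks differs in level, delay, and coloration, so a plain inverted sum leaves a residue.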

it seems like there must be some way to average a group of audio recordings, keeping the common signal and discarding the discrete noise.

That is a very interesting idea - but, like you say, I don't think the tech exists and I'm not sure how it would even work. Imagine you have 15 different iPhone recordings of a concert. If you were to break them down spectrally you'd see the waveforms common to all the recordings - which would be, ideally, the music of the concert. But how, since they are, after all, still single-track recordings, would you be able to discard the discrete noise? It'd be akin to being able to, say, go into a single-track recording of a symphony and just snip out the flute part. I can't think of any way that would be possible. But like you said, 10 or 15 years. And that would be very cool - and save me a lot of headaches.
posted by Lutoslawski at 11:45 AM on November 20, 2009

As a halfway point, and given the noise-reduction problems, you might want something more easily achievable like this:

1. Multiple point recording sources get uploaded to a common repository;
2. Software finds common audio signatures to sync up the clips;
3. Software compares each audio track, gives you a sample of each with some basic filter noise reduction options, and lets you choose the "best" one you'd like applied;
4. Software allows you to look at all point recording videos side-by-side, and pick the best shot for any given moment;
5. Software allows you to render and download the video, and make it available on the site to other people who uploaded point recording sources.
posted by davejay at 11:52 AM on November 20, 2009
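Step 2 in davejay's list (finding common audio signatures to sync the clips) is usually done with cross-correlation: the lag that maximizes the correlation between two tracks is the time offset between them. A minimal sketch in Python (`find_offset` is a hypothetical helper, not an existing API), assuming mono tracks at the same sample rate:

```python
import numpy as np

def find_offset(ref, clip):
    """Estimate how many samples `clip` lags behind `ref`
    by locating the peak of their cross-correlation."""
    corr = np.correlate(clip, ref, mode="full")
    # Shift the peak index so that 0 means "already aligned".
    return int(np.argmax(corr)) - (len(ref) - 1)

# Toy example: the same "audio", one copy delayed by 500 samples.
rng = np.random.default_rng(0)
audio = rng.standard_normal(8000)
delayed = np.concatenate([np.zeros(500), audio])

print(find_offset(audio, delayed))  # 500
```

Real recordings would need resampling to a common rate and drift correction, since no two phone clocks run at exactly the same speed.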

it's doable, it's just probably more trouble than it's worth. i mean, how much would you be willing to pay for the resulting recording?
posted by rhizome at 12:07 PM on November 20, 2009

That is like saying you want to take many crappy cell phone camera pictures and do some computer stuff to turn them into one high-res photo. You'd be better off getting a cleaner sample. Buy a Zoom H2 for $150; it has a much better microphone than your phone.
posted by andrewzipp at 12:27 PM on November 20, 2009

It'd be akin to being able to, say, go into a single-track recording of a symphony and just snip out the flute part.

I know nothing about the tech behind this, but it seems Melodyne has just that capability.

Combine that with some sort of averaging algorithm, and you just might be in business.

Seems like that would be VERY processor intensive though. If it were an iPhone app, for instance, it would probably need multiple servers and be pretty expensive.
posted by Truthiness at 12:32 PM on November 20, 2009

I don't know of any software that does this, but it sounds doable. Averaging wouldn't work so well, but you could totally get more sophisticated than that and use some sort of Bayesian approach to figure out what the input sound was most likely to have been. (I'm more familiar with denoising techniques in image analysis, not audio.)

The question of separating out one source from among many (i.e. the flute in a symphony) is a well-studied problem called the cocktail party problem. Our brains solve it just fine, and it's interesting reading about the algorithms to do it on a computer.
posted by wyzewoman at 1:18 PM on November 20, 2009
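For a feel of what source separation involves: with as many microphones as sources, each mic hearing a different linear mix, the sources can be recovered by inverting the mixing matrix. Real cocktail-party algorithms (e.g. independent component analysis) have to estimate that matrix blindly from the recordings; this toy sketch cheats by assuming it is known:

```python
import numpy as np

t = np.linspace(0, 1, 4000)
flute = np.sin(2 * np.pi * 440 * t)              # stand-in for one source
strings = np.sign(np.sin(2 * np.pi * 110 * t))   # stand-in for another

# Two microphones, each hearing a different mix of the two sources.
A = np.array([[0.8, 0.3],
              [0.2, 0.9]])   # mixing matrix (known only in this toy)
mics = A @ np.vstack([flute, strings])

# With A known, "unmixing" is just matrix inversion.
recovered = np.linalg.inv(A) @ mics
print(np.allclose(recovered[0], flute))  # True
```

The hard part in a real room is that the mixing isn't a fixed matrix at all: each source reaches each mic with its own delays and reverberation, which is exactly the phase problem raised later in this thread.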

Whilst I am not sure that this is an efficient way of making a high-quality recording, I would be fascinated to browse through a recording of an event made by several devices in the same room. The real attraction for me would be not so much trying to reproduce a high-quality sound from a single source (the stage) as being able to listen to what was going on in different places. You are assuming that the noise is the part to get rid of. For me, that would be the part I'd like to explore.

What I am saying is this: develop and test the application in a larger outdoor area where crowds of a suitable size might gather: a sports event, a rock festival, a market, a large party, sailors in a boat race. Give anonymity to the uploaders if you like, but set the eavesdroppers free. Besides, a bigger outdoor space would make your tracking problem easier.
posted by rongorongo at 3:47 PM on November 20, 2009

I think many people are making this too complicated by trying to model acoustics and positions. If we make the assumption of many single track recorders collecting the same signal but different noise, simple averaging will improve the signal-to-noise ratio. This would reduce (but not eliminate) the noise.

The primary complicating factor is that the same-signal approximation isn't perfectly valid, but the likely effect is to reduce fidelity while enriching the signal. The synchronizing signal for the server performing the averaging should be audio from on stage, to reduce signal broadening from the different delays caused by different distances (this is a significant effect that requires correction when doing live sound amplification in even a modestly sized venue). This would require a server with decent computational power and a good algorithm to determine appropriate weightings of the different tracks. I think it is feasible, but I worry the target audience wouldn't support the development costs, especially when mastering a proper recording to distribute to interested parties is such an eminently viable alternative.
posted by JMOZ at 8:37 PM on November 20, 2009
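JMOZ's averaging argument is easy to check numerically: with K perfectly aligned tracks whose noise is independent, averaging leaves the signal untouched while the noise power drops by a factor of K — about 12 dB of SNR improvement for 16 phones. A toy sketch under that idealized assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 8000)
signal = np.sin(2 * np.pi * 220 * t)  # the "concert", identical in every track

def snr_db(clean, noisy):
    """Signal-to-noise ratio in decibels."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean**2) / np.mean(noise**2))

# 16 phones, each adding its own independent noise (fidgeting, handling...).
tracks = [signal + 0.5 * rng.standard_normal(len(t)) for _ in range(16)]
averaged = np.mean(tracks, axis=0)

print(round(snr_db(signal, tracks[0])))  # one phone:   ~3 dB
print(round(snr_db(signal, averaged)))   # 16 averaged: ~15 dB
```

The catch, as the surrounding comments note, is the "perfectly aligned" assumption: misaligned or phase-shifted tracks smear the signal itself instead of just the noise.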

I'm very surprised that this thread went by entirely without mention of the concept of phase. As far as my understanding goes, that will be the main obstacle, and I'm not sure it's at all surmountable.

Phase cancellation is one of the most important reasons sound engineers prefer to use as few microphones as possible. See, one cannot simply correct for delay, i.e. sounds arriving at different microphones at different times: the reflections (echoes) will also arrive at different times.

Although it is now possible to accurately model the acoustics of a given space and artificially add a realistic simulation of the space's acoustic properties to an audio signal, here you'd essentially need to do the opposite, while also having the algorithm be aware of every recorder's position, and I haven't heard of anything currently existing that makes this possible.

I doubt it ever will be, although I'm not in a position to rule anything out. But given the practical problems of phase audio engineers have had to learn to work around for decades now, I believe it will never be practical.
posted by goodnewsfortheinsane at 3:15 PM on December 30, 2009
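The phase problem goodnewsfortheinsane describes is easy to demonstrate: delay a pure tone by half its period and sum it with the original, and the two copies cancel instead of reinforcing. A minimal sketch (the exact half-period delay is contrived for illustration; real path-length differences cancel some frequencies and boost others):

```python
import numpy as np

sr = 48000                              # sample rate, Hz
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)     # 1 kHz tone, exactly 1000 cycles

# A second "microphone" hears the same tone delayed by half a period
# (24 samples = 0.5 ms at 48 kHz), e.g. from a slightly longer path.
delay = sr // 2000
delayed = np.roll(tone, delay)          # seamless here: whole cycles fit

summed = tone + delayed
print(np.max(np.abs(summed)) < 1e-6)    # True: near-total cancellation
```

With a broadband signal the same mechanism produces comb filtering: deep notches at every frequency whose half-period matches the delay.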
