Building a sculpture of a sound
January 18, 2010 2:48 PM
How can I turn a waveform graph of an audio clip into a string of numbers describing the graph?
posted by metaBugs to computers & internet (17 answers total)
I want to build a 3D representation of a five(ish) second audio clip. My plan for this is basically:
1) Record the clip
2) Put audio into Audacity and do a screen capture to get a picture of the waveform
3) Chop the graph into about 150 vertical slices (i.e. vertical lines spaced along the Time axis)
4) Determine the height of the waveform in each slice (mean, maximum, mode - whichever I think looks nicer)
5) Use the measured height of each slice to build a physical slice (basically, cut a disk to that radius) then assemble them in order to give a physical model of the waveform
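For what it's worth, steps 3-5 could be sketched in Python roughly like this, assuming the clip is already available as a plain list of sample values. The names `slice_heights`, `to_radii` and `max_radius_mm` are made up for illustration, and the 150-slice default is just the figure from the plan above.

```python
from statistics import mean


def slice_heights(samples, n_slices=150, summary=max):
    """Split the samples into n_slices consecutive chunks and reduce each
    chunk to one number (max by default; pass `mean` for the average)."""
    # Audio samples swing above and below zero, so work with magnitudes.
    magnitudes = [abs(s) for s in samples]
    total = len(magnitudes)
    if total < n_slices:
        raise ValueError("need at least one sample per slice")
    heights = []
    for i in range(n_slices):
        # Integer arithmetic keeps the chunks contiguous and non-empty.
        start = i * total // n_slices
        end = (i + 1) * total // n_slices
        heights.append(summary(magnitudes[start:end]))
    return heights


def to_radii(heights, max_radius_mm=50.0):
    """Scale the heights so the tallest slice gets a disk of max_radius_mm.
    (The 50 mm default is an arbitrary assumption, not from the post.)"""
    peak = max(heights)
    if peak == 0:
        return [0.0 for _ in heights]
    return [h / peak * max_radius_mm for h in heights]
```

Calling `slice_heights(samples, 150, mean)` instead of the default swaps the per-slice maximum for the mean, which makes it easy to compare which version looks nicer before cutting anything.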
My sticking point is with steps 3-4. Obviously I could do this manually with a printout or with the rulers in a graphics program, but I didn't buy a computer so I could do precisely defined, repetitive analysis tasks by hand!
So, can you suggest a way to automate this process? Basically, I want something that looks at an image, determines where the top of the graph is along the x-axis (move one pixel along the x-axis, count how many pixels upwards the green region goes, repeat) and sends the results to a file.
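The pixel-counting idea might look something like the sketch below. It is only a sketch under some assumptions: the image is held as a grid `pixels[y][x]` of (r, g, b) tuples with row 0 at the top, the waveform's zero line sits on the middle row unless told otherwise, and "green" means the hypothetical thresholds in `is_waveform`, which would need tuning against the real screen capture. `load_pixels` relies on the third-party Pillow library being installed; all function names here are invented for illustration.

```python
def is_waveform(pixel, min_green=128, max_other=100):
    """Guess whether a pixel belongs to the green waveform.
    The thresholds are assumptions; adjust them for the actual capture."""
    r, g, b = pixel[:3]
    return g >= min_green and r <= max_other and b <= max_other


def column_heights(pixels, baseline_row=None):
    """For each x, count the waveform-coloured pixels at or above the
    baseline row. pixels[y][x] is an (r, g, b) tuple, row 0 at the top."""
    n_rows = len(pixels)
    n_cols = len(pixels[0])
    if baseline_row is None:
        baseline_row = n_rows // 2  # assume the zero line is mid-image
    heights = []
    for x in range(n_cols):
        count = sum(
            1 for y in range(baseline_row + 1) if is_waveform(pixels[y][x])
        )
        heights.append(count)
    return heights


def load_pixels(path):
    """Read an image file into the grid format used above.
    Needs Pillow (third-party), imported here so the rest runs without it."""
    from PIL import Image
    img = Image.open(path).convert("RGB")
    width, height = img.size
    return [[img.getpixel((x, y)) for x in range(width)] for y in range(height)]


def save_heights(heights, path):
    """Write one height per line to a text file."""
    with open(path, "w") as out:
        for h in heights:
            out.write(f"{h}\n")
```

With a cropped screenshot saved as, say, `waveform.png` (a placeholder name), `save_heights(column_heights(load_pixels("waveform.png")), "heights.txt")` would give one number per pixel column, which could then be thinned down to the 150 or so slices needed.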
I realise that I'm turning numbers in a .wav file into a picture and then back into numbers... if there's a way to get the same result by analysing the audio clip directly instead of the picture, that'd be grand. Can I script an audio program to report the amplitude at time x?
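Skipping the picture looks feasible with nothing but Python's standard library, at least as a minimal sketch that assumes the clip has been exported as a 16-bit mono WAV file (anything else is rejected rather than handled). `read_mono_16bit` and `amplitude_at` are illustrative names, not part of any audio program.

```python
import struct
import wave


def read_mono_16bit(path):
    """Return (framerate, samples) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        framerate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # 16-bit WAV data is little-endian signed integers, two bytes per sample.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return framerate, list(samples)


def amplitude_at(samples, framerate, seconds):
    """Report the magnitude of the sample nearest below the given time."""
    index = int(seconds * framerate)
    if not 0 <= index < len(samples):
        raise ValueError("time is outside the clip")
    return abs(samples[index])
```

At 44,100 samples per second a five-second clip comes out at about 220,000 numbers, so a single sample says very little on its own; summarising a run of neighbouring samples per slice is probably closer to what the waveform picture shows.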
I can run software on Linux (openSUSE) or Windows XP. I don't have any programming skills to speak of but I'm interested in learning, so pointers to useful commands (checking pixel colours in a picture, amplitude in a sound file, etc.) in fairly easy languages (Python?) would be appreciated too.
This is for a sculpture so, while my geekiness wants perfect representation of the information, I'm already throwing away a huge amount to cut five seconds down to 100-200 slices. So I'm happy to make some concessions to form over function; the data I get out can be a bit dirty and "close enough", but of course more accurate would be more satisfying.