Learning real-time sound event synchronization C++ programming?
February 7, 2009 1:37 PM

What are the libraries/techniques that are available for event synchronization in C++? More specifically, how would I go about learning the best strategies for synchronizing events (say, a graphical event with a sound, or one sound with another)? I'm looking to learn about the techniques that are used by complex games as well as audio software, like Ableton Live or Logic.

I've programmed in a number of languages and frameworks but most of the time I haven't had to think about this aspect of making (rudimentary) sound software. I guess I don't care too much about a response being C++ specific if it helps me understand the general concepts well, but I figure that's a good place to start, since that's the language I want to use.

I see that there are some libraries out there that may be appropriate, but I'm a relative n00b at C++ and real-time programming, and I'm not sure how to proceed. Do game/audio programmers use complex systems for syncing events, or is it "not that hard?" Do they roll their own usually, or are there standard libs out there that I just don't know about? Are there good open source projects out there that implement this sort of thing that I could look at? Am I even asking the right questions?

Many thanks!
posted by dubitable to computers & internet (10 comments total)
game and audio programming is soft realtime, not hard realtime, and they usually have some amount of buffering handy. you can generally deal with this via a standard event loop.
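As a rough illustration of the event-loop approach, here is a minimal C++ sketch. The names (TimedEvent, EventLoop, tick) are made up for this example and do not come from any particular library, and it assumes the caller passes in the current time in milliseconds once per loop iteration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not any particular engine's API.
struct TimedEvent {
    std::int64_t due_ms;           // when the event should fire
    std::function<void()> action;  // what to do (start an animation, queue a sound, ...)
};

// Orders the queue so that the event with the smallest due_ms is on top.
struct LaterFirst {
    bool operator()(const TimedEvent& a, const TimedEvent& b) const {
        return a.due_ms > b.due_ms;
    }
};

class EventLoop {
public:
    void schedule(std::int64_t due_ms, std::function<void()> action) {
        assert(action);  // an empty callback would throw when invoked
        pending_.push(TimedEvent{due_ms, std::move(action)});
    }

    // Call once per loop iteration with the current time.
    // Runs every event that is due and returns how many ran.
    int tick(std::int64_t now_ms) {
        int fired = 0;
        while (!pending_.empty() && pending_.top().due_ms <= now_ms) {
            auto action = pending_.top().action;  // copy out before popping
            pending_.pop();
            action();
            ++fired;
        }
        return fired;
    }

private:
    std::priority_queue<TimedEvent, std::vector<TimedEvent>, LaterFirst> pending_;
};
```

An event fires on the first tick at or after its due time, so it can be late by up to one loop period. That slack is what makes this "soft" realtime.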

precise synchronization is tougher. generally games do not do precise synchronization; games do mixing in an environment where a hundredth of a second or more is close enough.

what problem are you actually trying to solve?
posted by rr at 1:50 PM on February 7


Games don't have very strict synchronization between video and audio. Typically it is as simple as saying "Start the explosion animation" and "Queue the boom sound" - the human brain tends to merge visual and audio stimuli that happen at more-or-less the same time into a single event.

The libraries you link to are both network libraries? Are you really interested in voice chat/network streaming?
posted by AndrewStephens at 2:19 PM on February 7


Hi folks, thanks for the informative answers so far! Here are some responses to your questions:

@rr: what problem are you actually trying to solve?

I'm interested in figuring out how to do relatively synchronized (i.e. registering as synchronized to the "human perceptual system") triggering of sound and graphical events. I don't want to get too deep into it, but I'd like to be able to have graphical events--such as what you see and control in a video game, generally speaking--synchronized more precisely with sounds, including layered sounds. Think of a game like Rez for a general notion of the sort of thing I'd like to approach, but perhaps with even more user control over parameters, closer to software like Ableton Live (although probably still nearer the game side in terms of complexity). Is this helpful, or still too vague?

@AndrewStephens: The libraries you link to are both network libraries? Are you really interested in voice chat/network streaming?

Nope. Those were the libraries that came up when I did searches on "real-time event synchronization C++" and similar search strings. I have no idea if similar concepts are required for doing the kinds of event synchronization (in this context these terms may have different meanings than the meanings I'm ascribing to them) that I'm looking for, so that's why I linked to those...sounds like I may be off-base then?
posted by dubitable at 2:34 PM on February 7


To clarify a bit on what I want to do: I'm not interested in the vague level of sound event queuing that many games have; I want something more sophisticated, since what I'm trying to build has sound as a central component. Perhaps an alternative way to think about what I'd like to do is building a "baby sequencer" with more sophisticated, experimental graphical control over the audio elements...sorry I'm being vague. Part of that is because I don't want to say too much, and part because...it's still in the design/prototyping phase.

Yet another way to put it: a game like Guitar Hero or Rock Band must have a pretty sophisticated event synchronization system, at least compared to other games, right (or not)? So how different is that from an audio sequencer application in terms of technology used, precision of event synchronization, etc.? Just getting a sense of this would be helpful to me. And, rr: can you explain "soft realtime" vs. "hard realtime" to me, or give me pointers to what I can read about these?
posted by dubitable at 2:44 PM on February 7


The tactics of game audio are entirely different than the tactics used by professional audio stuff like Ableton Live. Games can be sloppy within reasonable limits. In fact, syncing with video that plays at 24/30/48/60 frames per second naturally gives you plenty of milliseconds of fudge time. However if you're 1/60th of a second off in a professional audio suite, something is seriously broken.
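To put rough numbers on that fudge time, here is a small C++ sketch of the arithmetic. The function names are made up for illustration, and the 44.1 kHz sample rate used in the note below is an assumption (a common audio rate), not something stated in this thread.

```cpp
#include <cassert>

// Length of one video frame in milliseconds.
inline double frame_period_ms(double frames_per_second) {
    assert(frames_per_second > 0.0);
    return 1000.0 / frames_per_second;
}

// Number of audio samples that elapse during one video frame.
inline double samples_per_frame(double sample_rate_hz, double frames_per_second) {
    assert(sample_rate_hz > 0.0 && frames_per_second > 0.0);
    return sample_rate_hz / frames_per_second;
}
```

At 60 frames per second a frame lasts about 16.7 ms, and at an assumed 44.1 kHz that is 735 samples. A game that only starts sounds on frame boundaries is therefore working at a much coarser grain than software that places events on individual samples.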

A game can prioritize sounds, delaying or skipping low priority sounds if there are others that are more important. However something like Live, Logic, or Reason needs to devote all resources to doing everything that's asked of it. They'll prioritize playing audio over just about everything, including interface updates, with listening to an input stream (midi, etc) a close second and everything else a distant third. Games generally put audio at the bottom of the priority queue because it is honestly the least time critical, with everything from player input, graphics, and AI to game logic, physics, and the network layer all getting much higher priority. And even within the audio stack there are priorities. Nearby sounds and time critical sounds (eg: gun shots, cracks of the bat) are high priority within the stack. Music, atmospheric sounds, and stuff that's far away are much lower priority.
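A minimal C++ sketch of that kind of prioritizing inside a game's audio stack, assuming a fixed number of playback voices. SoundRequest, pick_voices, and the "higher number means more important" convention are all hypothetical and not taken from any real game audio library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct SoundRequest {
    std::string name;
    int priority;  // assumed convention: higher value = more important
};

// Keep at most max_voices requests, preferring higher priority.
// Requests that do not fit are dropped for this update.
std::vector<SoundRequest> pick_voices(std::vector<SoundRequest> requests,
                                      std::size_t max_voices) {
    // stable_sort keeps the original order among equal priorities.
    std::stable_sort(requests.begin(), requests.end(),
                     [](const SoundRequest& a, const SoundRequest& b) {
                         return a.priority > b.priority;
                     });
    if (requests.size() > max_voices) {
        requests.resize(max_voices);  // discard the lowest-priority tail
    }
    assert(requests.size() <= max_voices);
    return requests;
}
```

A professional audio application cannot take this shortcut: dropping a track because the machine is busy is exactly the failure it exists to avoid.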

If I was making a professional audio application I would never even think of using a game audio library. And vice versa.

Check out RAD Game Tools for a very common set of audio libraries. For something like Live, I'm around 99% sure they wrote the audio handling from the ground up. That's what those tools are for.

On preview I'm not sure this answers your question, but I'm still not sure what your question is. Is your question about audio at all, or is it about how you do timed events vs just doing everything as fast as possible (ie: game development vs an office or web application)?
posted by Ookseer at 2:50 PM on February 7


Wikipedia has a section on hard vs. soft realtime.
posted by mmascolino at 3:12 PM on February 7


I am still a little confused as to exactly what you want. Searching for "real-time event synchronization C++" is going to get you a whole bunch of links to threading libraries, as all those terms have precise technical meanings that have nothing to do with audio or graphics.

I don't know anything about how GH or Rock Band work, but here is how I would do it. The music would be stored as a series of separate audio tracks, one for each instrument the players can control and a backing track. Consoles are more than powerful enough to mix multiple streams together.

Alongside the audio I would encode the actions the player needs to take - these would be pre-prepared by my team of musicians. The game just needs to play the audio streams together and show the actions as it comes across them in the audio streams. I guess you could think of Rock Band as a very simple reverse sequencer. There are probably all sorts of little details that need to be sorted out, but you are over-thinking the problem.
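One way the "show the actions as the audio reaches them" idea could look in C++. This is only a sketch under assumptions: Cue, CueTrack, and the lane field are invented for illustration, and positions are measured in audio samples so that the audio stream itself acts as the clock. It says nothing about how Guitar Hero or Rock Band are actually built.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Cue {
    std::int64_t sample;  // position in the audio stream where the action falls
    int lane;             // which button or pad to show (hypothetical field)
};

class CueTrack {
public:
    explicit CueTrack(std::vector<Cue> cues) : cues_(std::move(cues)) {
        // The chart is assumed to be authored in playback order.
        for (std::size_t i = 1; i < cues_.size(); ++i) {
            assert(cues_[i - 1].sample <= cues_[i].sample);
        }
    }

    // Return the cues that playback has reached since the previous call.
    std::vector<Cue> advance(std::int64_t samples_played) {
        std::vector<Cue> due;
        while (next_ < cues_.size() && cues_[next_].sample <= samples_played) {
            due.push_back(cues_[next_]);
            ++next_;
        }
        return due;
    }

private:
    std::vector<Cue> cues_;
    std::size_t next_ = 0;
};
```

The game loop would call advance() each frame with the number of samples the audio device has played so far, so the visuals follow the music rather than a separate timer that could drift.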

As for real audio programs, they spend a lot of effort maintaining accurate timing - something that modern multitasking OSes have been traditionally poor at. You may want to look up "audio latency and timing". You will probably find more than you ever wanted to know.
posted by AndrewStephens at 3:21 PM on February 7


a game like Guitar Hero or Rock Band must have a pretty sophisticated event synchronization system, at least compared to other games, right (or not)?

I would say "not" to this. Guitar Hero and Rock Band are not doing much more than anything else does on the _sound_ front. It's not even clear to me that they do anything sophisticated on the controller-to-video-timing synchronization front.
posted by rr at 3:48 PM on February 7


This reply is kind of all over the place, and sort of a brain dump, but then again you asked a very broad question.

The best approach for audio programming seems to be callback based. In the Linux audio world using the RTC timer kernel module to generate a sort of "master clock" and calling each application's audio processing callback from the central jack audio server is pretty much the state of the art.

it is sort of like this (rough outline here, I have written apps that use jack but I am simplifying for clarity):
I like the jack approach because it means that I don't have to make modules for saving to a soundfile or getting input from a microphone or displaying a sonogram or vu within my softsynth - the user can use jack to plug my synth into apps that do each job exclusively, and thus most likely much better than I would have the patience to implement.

People have talked about (and even maybe prototyped?) a video extension to jack, where some apps would fill or read video data buffers, but the varying formats, frame rates, etc. of video make all of this a little less straightforward.

Basically you can keep many audio apps in sample accurate synchronization by having one app that is connected to the sound card, that calls the audio processing callbacks of all the other apps (this is actually strikingly similar to the way windows programs use vsts).
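The "one app calls everyone's processing callback" model can be sketched in C++ roughly as follows. To be clear, MiniServer and ProcessCallback are hypothetical stand-ins written for this example and are NOT the real JACK API; a real server would also avoid allocating memory on the audio path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an audio server; not the real JACK API.
// Each client fills `out` with exactly `nframes` samples when called.
using ProcessCallback = std::function<void(float* out, std::size_t nframes)>;

class MiniServer {
public:
    void add_client(ProcessCallback cb) {
        assert(cb);
        clients_.push_back(std::move(cb));
    }

    // One period: every client renders the same nframes, and the results
    // are summed. Because all clients are driven by the same call for the
    // same block, sample i of one client lines up with sample i of the others.
    std::vector<float> run_period(std::size_t nframes) {
        std::vector<float> mix(nframes, 0.0f);
        std::vector<float> scratch(nframes);  // allocated here only for brevity
        for (auto& client : clients_) {
            std::fill(scratch.begin(), scratch.end(), 0.0f);
            client(scratch.data(), nframes);
            for (std::size_t i = 0; i < nframes; ++i) {
                mix[i] += scratch[i];
            }
        }
        return mix;
    }

private:
    std::vector<ProcessCallback> clients_;
};
```

The clients never look at a wall clock. Their notion of time is simply how many frames they have been asked to produce, which is what keeps them in step with each other.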

Video players tend to base their central clock on the timing of audio playback, so that the audio thread tells the video thread when it is time to show a new frame. You can fudge video timing by dropping or repeating frames and it is hard to notice except in extreme cases, but if you fudge audio you will drop parts of sounds or stutter them and probably get clicking artifacts on top of that - we can assimilate a dropped frame the way we would a blink, but there is no such thing as blinking your ears, so it is jarring and unnatural.
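A small C++ sketch of that audio-as-master-clock idea. The function and enum names are made up for illustration, and it assumes the player can ask the audio device how many samples have been played so far.

```cpp
#include <cassert>
#include <cstdint>

// Which video frame should be on screen, given how much audio has played.
inline std::int64_t frame_for_audio_position(std::int64_t samples_played,
                                             int sample_rate_hz,
                                             int video_fps) {
    assert(samples_played >= 0 && sample_rate_hz > 0 && video_fps > 0);
    return samples_played * video_fps / sample_rate_hz;  // integer division rounds down
}

enum class VideoAction { Repeat, ShowNext, Drop };

// Compare the frame currently shown with the frame the audio clock wants.
inline VideoAction decide(std::int64_t frame_on_screen, std::int64_t frame_wanted) {
    if (frame_wanted <= frame_on_screen) {
        return VideoAction::Repeat;    // video is on time or ahead: hold the frame
    }
    if (frame_wanted == frame_on_screen + 1) {
        return VideoAction::ShowNext;  // normal case: advance by one frame
    }
    return VideoAction::Drop;          // video is behind: skip ahead to frame_wanted
}
```

All the fudging happens on the video side. The audio is never stretched or skipped to match the picture.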

There are tradeoffs with buffer sizes - the larger you set the buffers, the less likely you skip or stutter (very jarring events, most musicians would consider them to be serious enough to ruin a performance), while the shorter you set them, the more of a real time interaction the musician can have (with a small enough buffer size you can get low enough total latency so that you can play a fast drum beat on a midi keyboard with sound from a software drum machine or use a laptop as a guitar multieffects box and stay on the beat etc, but latency quickly fucks you up, this is possible with Linux, but afaik windows and macosX lack the soft real time performance which make it possible to get low enough latency without severe audio artifacts).
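The buffer size tradeoff comes down to simple arithmetic, sketched here in C++. The function name and the idea of counting several buffers in the chain are assumptions for illustration; real systems add driver and hardware delay on top of this figure.

```cpp
#include <cassert>

// Delay contributed by buffering alone, in milliseconds.
// buffer_count is how many buffers of that size sit between the
// application and the output (assumed; it varies by system).
inline double buffer_latency_ms(int frames_per_buffer, int buffer_count,
                                int sample_rate_hz) {
    assert(frames_per_buffer > 0 && buffer_count > 0 && sample_rate_hz > 0);
    return 1000.0 * frames_per_buffer * buffer_count / sample_rate_hz;
}
```

At 48 kHz a single 64-frame buffer adds about 1.3 ms, while a single 1024-frame buffer adds about 21.3 ms. The small buffer feels immediate to a player, but it leaves the application only about 1.3 ms to compute each block before the output runs dry.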
posted by idiopath at 4:47 PM on February 7 [1 favorite]


@Ookseer: On preview I'm not sure this answers your question, but I'm still not sure what your question is. Is your question about audio at all, or is it about how you do timed events vs just doing everything as fast as possible (ie: game development vs an office or web application)?

Your answer was great, considering how oblique I'm being. As I mentioned initially, I'm not even sure if I'm asking the right questions. You illuminated for me what it is that I need to think about: what needs to take priority in the application I'm going to build, and relative to what? Can game dev sound libraries do all the work, or not? To answer your question, I'm interested in not just audio, and not just "as fast as possible" but (relatively) precise synchronization.

@mmascolino: gracias. Google is my friend, I know, I know...

@AndrewStephens: Searching for "real-time event synchronization C++" is going to get you a whole bunch of links to threading libraries, as all those terms have precise technical meanings that have nothing to do with audio or graphics.

Yeah. I guess I had the idea that "events are events are events," if you know what I mean, so the sort of techniques you'd use to sync up, say, OS signal events would be the same as sound events. Clearly that's not the case!

As for real audio programs, they spend a lot of effort maintaining accurate timing - something that modern multitasking OSes have been traditionally poor at. You may want to look up "audio latency and timing". You will probably find more than you ever wanted to know.

Yes indeed...

@rr: Thanks--that's helpful. One of the things I'm trying to wrap my head around, fundamentally, is the difference between a sophisticated audio app and a sophisticated video game, in terms of the approach to events timing/synchronization.

@idiopath: thanks for your answer...I don't know that I'm far enough along to know what to take from it and what not to, but I'm sure it will be helpful. For the record I used to use Linux as my desktop, and remember setting up a system with Jackd and Ardour and PD and some other stuff...I thought it worked quite nicely (although setup was a bear).

Thank you everyone. I apologize for the "premature" or vague questions, but at the same time, I know much better now what to investigate to move forward. I really appreciate your help.
posted by dubitable at 1:28 PM on February 8

