Should I use LLVM, or yacc/lex, or what?
July 30, 2009 1:50 PM Subscribe
Should I use LLVM as a backend for my multimedia programming language, or is there a better alternative?
I am working on a personal multimedia programming project where I want to compile visual directed graphs into audio/video synthesis chains. Basically, this will be Max/MSP or PD with some "syntactic" differences and better integration of video/image rendering. My approach right now is to parse an SVG file containing the graph and compile that into a runnable application. At this point my program is written in C and uses the expat library to load the SVG and parse it into a directed graph ready for compilation. Ideally I could eventually have a self-hosting compiler where all the source code would be expressed as visually laid-out graphs, but I understand that it could realistically take years of work to get to that stage.
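To give a sense of the parsing stage, the expat setup looks roughly like this (a stripped-down sketch; the handling of "g" elements is hypothetical, the real code keys off whatever structure and metadata the SVG carries):

    /* Stripped-down sketch of the expat stage; element handling is
       hypothetical, real code keys off the SVG's actual structure. */
    #include <stdio.h>
    #include <string.h>
    #include <expat.h>

    static void start_element(void *userdata, const XML_Char *name,
                              const XML_Char **atts)
    {
        (void)userdata;
        if (strcmp(name, "g") == 0)       /* pretend groups are graph nodes */
            for (int i = 0; atts[i]; i += 2)
                printf("node attribute %s=%s\n", atts[i], atts[i + 1]);
    }

    int main(void)
    {
        FILE *f = fopen("patch.svg", "rb");
        if (!f) return 1;

        XML_Parser p = XML_ParserCreate(NULL);
        XML_SetElementHandler(p, start_element, NULL);

        char buf[4096];
        for (;;) {
            size_t n = fread(buf, 1, sizeof buf, f);
            if (XML_Parse(p, buf, (int)n, n == 0) == XML_STATUS_ERROR)
                return 1;                 /* malformed SVG */
            if (n == 0)
                break;
        }
        XML_ParserFree(p);
        fclose(f);
        return 0;
    }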
I am looking for suggestions for the compilation stage. I could use Csound or the SuperCollider scsynth backend, or even produce a valid PD file, but I think that for the kind of audio/video integration I want, and the kind of performance I would need from this sort of app, compiling to machine code, or at least to a low-level bytecode that can access system libraries, would be best.
Which brings me to the question: what is my best approach here? I am under the impression that the JVM and Parrot carry more overhead than I want, and I am skeptical of the .NET/Mono CLR route because it is a Microsoft technology and I want as little to do with them as possible. Making a full-fledged GCC frontend seems like a large amount of work, which leaves me leaning toward LLVM.
Is this a wise choice? How usable is LLVM, and is there something else with better performance or a more straightforward usage scenario for multimedia?
The overhead of the JVM is not zero, but it's actually fairly low. The modern JVM uses JIT compilation throughout the system. Benchmarks of modern Java show that it's roughly as fast as C++ for compute-intensive tasks. The benchmarks I've seen put it ahead of .NET/Mono.
The one problem with using the JVM is how iffy the multimedia extensions are. The JMF is littered with bugs, and it just doesn't do some things you'd expect it to. So you'd probably wind up writing your own extensions into some C multimedia library.
posted by Netzapper at 2:09 PM on July 30, 2009
What about compiling into C and then using a C compiler?
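For illustration, a one-node gain patch might come out of such a compiler looking roughly like this (names and structure invented for the example):

    /* Hypothetical output of a graph-to-C compiler for a one-gain patch:
       each audio-rate edge becomes a flat float buffer, each node a loop. */
    #define BUF 64

    static float in_buf[BUF], out_buf[BUF];

    static void tick_gain(const float *in, float *out, float amount)
    {
        for (int i = 0; i < BUF; i++)
            out[i] = in[i] * amount;    /* whole node boils down to one loop */
    }

    void patch_tick(void)               /* called once per audio block */
    {
        tick_gain(in_buf, out_buf, 0.5f);
        /* ...remaining nodes, in topologically sorted order... */
    }

The C compiler then gets to inline and vectorize those loops for you.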
posted by qxntpqbbbqxl at 2:28 PM on July 30, 2009
I've used LLVM for compiling GLSL-like languages. The biggest issue I've run into is that LLVM for 64-bit Windows is not quite there yet. Expect major hassle if Vista 64 is important to you right now.
posted by ryanrs at 2:32 PM on July 30, 2009
Response by poster: Netzapper: do the multimedia extensions support rendering/synthesis as opposed to just mixing/compositing? I expect to have to write some multimedia libraries, or at least wrappers, regardless of backend.
qxntpqbbbqxl: generating code for a human-readable language is often less than ideal, though I am willing to consider it. LLVM has the advantage over C that it is designed to be generated automatically, and its intermediate bytecode can be optimized on the host machine; also, a single build could theoretically ship for any OS/architecture if I wanted to be portable and used portable libraries.
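To make that concrete, building IR through LLVM's C bindings goes roughly like this (a minimal sketch against llvm-c, not code from my project):

    /* Minimal sketch of emitting IR in memory via llvm-c: a "mix2"
       function that adds two floats. No generated text to parse back in. */
    #include <llvm-c/Core.h>

    int main(void)
    {
        LLVMModuleRef mod = LLVMModuleCreateWithName("patch");
        LLVMTypeRef f32 = LLVMFloatType();
        LLVMTypeRef params[] = { f32, f32 };
        LLVMValueRef fn = LLVMAddFunction(mod, "mix2",
                              LLVMFunctionType(f32, params, 2, 0));

        LLVMBuilderRef b = LLVMCreateBuilder();
        LLVMPositionBuilderAtEnd(b, LLVMAppendBasicBlock(fn, "entry"));
        LLVMValueRef sum = LLVMBuildFAdd(b, LLVMGetParam(fn, 0),
                                         LLVMGetParam(fn, 1), "sum");
        LLVMBuildRet(b, sum);

        LLVMDumpModule(mod);        /* prints the generated IR */
        LLVMDisposeBuilder(b);
        LLVMDisposeModule(mod);
        return 0;
    }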
ryanrs: my main must-support platform is 32-bit Linux to start with, though the more portable this could be, the better.
posted by idiopath at 2:45 PM on July 30, 2009
Netzapper: do the multimedia extensions support rendering/synthesis as opposed to just mixing/compositing? I expect to have to write some multimedia libraries, or at least wrappers, regardless of backend.
No. They barely do mixing and compositing, for that matter. And nothing in video... just playback and sequencing.
In that case, you're going to need to build a synthesis library from something. And the JMF should be ignored.
Here's what I'd do: OpenGL and OpenAL from a JVM. You're going to need to develop that synthesis library, and that's where much of your work is going to be regardless. But OpenGL and OpenAL are excellent methods of doing playback, and the Java bindings to them are mature and robust. They also inherently do 3D, and hugely speed up many, many operations through hardware acceleration--well, GL does; nobody does hardware accelerated AL anymore.
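To give a taste, here is a minimal playback sketch against the C OpenAL headers; the Java bindings wrap these same calls, and the 440 Hz test tone is just for illustration:

    /* Minimal OpenAL playback sketch: one second of a synthesized
       440 Hz tone, written into a plain buffer and handed to AL. */
    #include <AL/al.h>
    #include <AL/alc.h>
    #include <math.h>
    #include <stdint.h>
    #include <unistd.h>

    #define RATE 44100
    #define PI   3.14159265358979323846

    int main(void)
    {
        ALCdevice  *dev = alcOpenDevice(NULL);      /* default device */
        ALCcontext *ctx = alcCreateContext(dev, NULL);
        alcMakeContextCurrent(ctx);

        static int16_t pcm[RATE];
        for (int i = 0; i < RATE; i++)
            pcm[i] = (int16_t)(32767.0 * sin(2.0 * PI * 440.0 * i / RATE));

        ALuint buf, src;
        alGenBuffers(1, &buf);
        alBufferData(buf, AL_FORMAT_MONO16, pcm, sizeof pcm, RATE);
        alGenSources(1, &src);
        alSourcei(src, AL_BUFFER, (ALint)buf);
        alSourcePlay(src);
        sleep(1);                     /* let the tone finish */

        /* a real synth would stream via alSourceQueueBuffers instead */
        alcMakeContextCurrent(NULL);
        alcDestroyContext(ctx);
        alcCloseDevice(dev);
        return 0;
    }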
As a bonus of using the JVM, you can write huge swaths of your libraries in Jython. Jython also gets JIT-compiled, and in my testing it generally runs about as fast as the Java implementation--especially if you precompile to a .class file. And it can transparently call into anything with a Java binding.
The other thing you might consider is doing the compilation at runtime using a bytecode manipulation library. You could write the whole thing in Jython: take the SVG source, compile it to Java bytecode using ASM, load it through the classloader, and run it.
If you can't tell, I'm a big fan of the JVM. I'm not really a diehard fan of the Java language, but the JVM is a great target, IMO.
posted by Netzapper at 3:50 PM on July 30, 2009 [1 favorite]
qxntpqbbbqxl: generating code for a human-readable language is often less than ideal, though I am willing to consider it. LLVM has the advantage over C that it is designed to be generated automatically, and its intermediate bytecode can be optimized on the host machine; also, a single build could theoretically ship for any OS/architecture if I wanted to be portable and used portable libraries.
These things are true. But C offers the following advantages: it's higher level than LLVM, most of the A/V libraries you're going to want to deal with have C APIs, and you get to benefit from the C compiler's optimizations. Generated C code would still be mostly portable, dependent on the portability of the aforementioned libraries.
posted by qxntpqbbbqxl at 4:55 PM on July 30, 2009
Response by poster: qxntpqbbbqxl: actually, regarding optimization, LLVM can do whole-program optimization, which is (AFAIK) impossible in C without violating the language standard. The LLVM re-implementation of GCC (llvm-gcc) claims to generate faster code than GCC itself. Also, there is no language or bytecode out there that cannot use C calling conventions (hell, even when I code in assembly I can set up a stack frame and call C by hand), so library availability is a non-issue.
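For example, pointing generated code at an existing C library is nothing more than an extern declaration plus a call. A minimal sketch against the llvm-c headers (newer LLVM releases spell the builder call LLVMBuildCall2):

    /* Sketch: calling into libm's sinf from generated code -- an extern
       declaration with the C ABI, then an ordinary call instruction. */
    #include <llvm-c/Core.h>

    LLVMValueRef emit_sinf_call(LLVMModuleRef mod, LLVMBuilderRef b,
                                LLVMValueRef arg)
    {
        LLVMTypeRef f32 = LLVMFloatType();
        LLVMTypeRef ty  = LLVMFunctionType(f32, &f32, 1, 0);
        LLVMValueRef fn = LLVMGetNamedFunction(mod, "sinf");
        if (!fn)                               /* declare once */
            fn = LLVMAddFunction(mod, "sinf", ty);
        return LLVMBuildCall(b, fn, &arg, 1, "s");
    }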
C does have the advantage for me that I already know the language.
posted by idiopath at 5:44 PM on July 30, 2009
LLVM is fine, but I think you are overcomplicating things. Codegen is a performance optimization. For your "first draft" you can just walk the AST and perform the operations as you encounter them. This is likely to be "fast enough". You can replace this with codegen later, if necessary, but you will not necessarily see huge performance increases.
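In miniature, that first draft is just a tagged node and a recursive eval, something like this (node shapes are hypothetical, just to fix ideas):

    /* The "first draft" in miniature: a tagged AST node plus eval. */
    typedef enum { NODE_CONST, NODE_ADD, NODE_MUL } node_kind;

    typedef struct node {
        node_kind kind;
        float value;                /* NODE_CONST only */
        struct node *lhs, *rhs;     /* NODE_ADD / NODE_MUL only */
    } node;

    float eval(const node *n)
    {
        switch (n->kind) {
        case NODE_CONST: return n->value;
        case NODE_ADD:   return eval(n->lhs) + eval(n->rhs);
        case NODE_MUL:   return eval(n->lhs) * eval(n->rhs);
        }
        return 0.0f;
    }

    int main(void)
    {
        node two   = { NODE_CONST, 2.0f, 0, 0 };
        node three = { NODE_CONST, 3.0f, 0, 0 };
        node sum   = { NODE_ADD, 0.0f, &two, &three };
        return (int)eval(&sum);     /* exits with status 5 */
    }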
posted by jrockway at 3:31 AM on July 31, 2009
Response by poster: jrockway: this was my initial plan; my concern is the cumulative overhead. One of my motives in this project is to have a better-performing multimedia processing tool. A program like PD, Csound, or SuperCollider will actually do what amounts to a partial compilation: simplify the AST, eliminate indirection where possible, and cut out unused code paths. Once you are doing that, why not go whole hog and generate code? And why not use a tool that automates that sort of thing rather than reinvent the optimized-compilation wheel?
My biggest concern, I guess, is that given that I will be attempting soft real-time video rendering and audio synthesis (hopefully including real-time impulse convolution and physical modeling), and don't own a Cray, I have severe doubts that anything at all will be "fast enough".
Also, the idea with this kind of multimedia programming environment is to be able to sketch out fairly low-level stuff, to experiment with novel DSP approaches, etc. Adding an extra data structure, a type tag, and at least one level of function-pointer indirection for every add, subtract, conditional, etc., and, on top of that, using a linked list in the heap to emulate what the call stack is designed to do, adds up to a gigantic performance hit. Consider that the operation will (in a typical video situation) be repeated 20 times a second across two 720x480 arrays, writing data to a third, and that it will be one of hundreds applied in that processing cycle. Cache misses are a huge performance hit for multimedia, and the kind of type tagging and function-pointer chasing that walking the AST in real time entails is going to be very expensive on a desktop machine.
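Schematically, the difference looks like this (types and names are illustrative, not actual code from my project):

    /* Per-element dispatch through a boxed graph versus the
       straight-line loop a compiler would emit for the same node. */
    typedef struct obj { int tag; float f; } obj;
    typedef obj (*binop)(obj, obj);

    /* interpreted path: a tag plus an indirect call for every element */
    void run_interpreted(binop op, const obj *a, const obj *b, obj *out, int n)
    {
        for (int i = 0; i < n; i++)
            out[i] = op(a[i], b[i]);
    }

    /* compiled path: the same work over flat, cache-friendly buffers */
    void run_compiled(const float *a, const float *b, float *out, int n)
    {
        for (int i = 0; i < n; i++)
            out[i] = a[i] + b[i];
    }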
It could be that coding video rendering operations at this low a level natively, rather than using C plugins, is quixotic; but if I am going to try, I need to plan for some very serious optimization from the beginning.
People talk about premature optimization, but in the case of a Turing-complete multimedia rendering program, one has a pretty good idea of how often a given piece of code will run, implicit in the program's design. It is either an asynchronous or nonrepeating message (no need to optimize; it will be used to control parameters in other elements or turn some piece of data flow on or off), running at audio rate (it will run sample_rate / buffer_size times a second and loop over buffer_size elements on each iteration), or running at video rate (it will run frame_rate times per second, most often iterating over height*width*input_count data elements on each iteration). Of course there are other potential data types (FFT frames, network packets, filesystem buffers, etc.), but their usage tells you exactly how often that code path will be visited per second; they are essentially profiled already.
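To put rough numbers on it: at 44.1 kHz with 64-sample buffers, an audio-rate node runs about 689 times a second, while a video-rate node at 20 fps reading two 720x480 arrays and writing a third touches roughly 20 million array elements per second, and that is before multiplying by the hundreds of nodes in a patch.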
posted by idiopath at 6:49 AM on July 31, 2009
Response by poster: Thanks, all, for your suggestions. I ended up going with LLVM and migrating my parsing/compiling code to OCaml, since the compile-time code was not as performance-sensitive and I find OCaml an easier language to program in than C.
posted by idiopath at 12:42 PM on August 30, 2009
This thread is closed to new comments.