How can I edit a compiled .cgi program/script?
May 27, 2010 11:47 AM

How can I edit this compiled .cgi file without breaking its functionality?

I have a compiled .cgi script that is fully functional on my server. I would like to update and change this .cgi with a text editor to include my organization's branding, text, etc. However, I do not have the source code to this .cgi. All my attempts to edit this .cgi with text editors, HTML editors, and command line tools (pico, etc.) fail because the file is a compiled binary program and the editors don't understand how to read this data. Essentially, it's

GIBBERISH BINARY CODE
XML CODE
XML CODE
XML CODE
GIBBERISH BINARY CODE

Editing the XML is easy, but upon saving the file, it munges the file because the binary code isn't interpretable.

The developer of this .cgi is no longer in business; his site doesn't resolve and seems to have disappeared.

How can I maintain the functionality of this script but still edit it? Mac tools preferred, but I'm not opposed to building a Windows environment and installing tools if it allows me to edit this .cgi
posted by mrbarrett.com to Computers & Internet (13 answers total)
 
If it's truly a binary and not a script, then the short answer is you can't edit it. Long answer is that you can decompile it, but you'll need someone who speaks the assembler of your particular architecture..and..well.. not worth it. Maybe not legal to do so either.
posted by Geckwoistmeinauto at 11:54 AM on May 27, 2010


What I'm about to say is an educated WAG.

If it's a compiled binary, you have two choices for editing it: One is to use a hex editor that will allow you to directly edit the binary as a binary. You won't be able to meaningfully alter functionality, but replacing individual bytes in what are obviously strings shouldn't be a huge problem. Here's Wikipedia's comparison of hex editors; many are text editors that also handle hex.

That's not a great solution, though--in particular, you'll likely mess things up if you try to change the length of strings. You're pretty much stuck with changing individual characters without disturbing their position.
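Before opening a hex editor, it can help to list where the readable strings sit in the file. The following is only a rough sketch in Python, similar in spirit to the Unix strings tool; the function names and the minimum run length of 6 are illustrative assumptions, not anything specific to this .cgi.

```python
import re


def find_strings(data: bytes, min_len: int = 6):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    # Printable ASCII plus tab, carriage return and newline, so XML blocks stay in one run.
    pattern = re.compile(rb"[\x20-\x7e\t\r\n]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def dump_strings(path: str, min_len: int = 6) -> None:
    """Print each readable run in the file at `path` with its byte offset in hex."""
    with open(path, "rb") as handle:
        data = handle.read()
    for offset, text in find_strings(data, min_len):
        print(f"{offset:#010x}  {text!r}")
```

Calling dump_strings on a copy of the file (the path is whatever your .cgi is called) prints offsets you can jump to in the hex editor. It only reads the file and never modifies it.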

The more complex but more sustainable route is to use a decompiler to generate source code that can then be compiled back into a working binary. This offers greater flexibility, and allows you to edit the CGI more comprehensively, including functionality. This requires some knowledge of the original programming language and the compiler that was originally used, so your best bet might be to find a new consultant who knows how to do this and can take over maintenance. This will be more expensive, but you're not just paying to edit the strings, you're paying to get back the ability to fully control the CGI.
posted by fatbird at 11:55 AM on May 27, 2010


You probably need to use a Hex editor so you can edit the file without disturbing the binary bits. One thing to keep in mind while editing the file is that the number of bytes that the XML code comprises cannot change. This is because it's likely that the code expects the XML to be so many bytes long and that the following binary code is at a certain offset inside the file. So, if your new XML is shorter, pad it out with spaces. If it's longer, you're out of luck unless you can find spaces to take out elsewhere, or you can shorten names, etc.
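The padding idea can also be scripted instead of done by hand. This is a minimal sketch in Python, assuming the text to change appears exactly once in the file; the function name and the sample strings in the usage note are made up for illustration.

```python
def patch_same_length(data: bytes, old: bytes, new: bytes, pad: bytes = b" ") -> bytes:
    """Replace `old` with `new` inside `data` without changing the total length.

    A shorter replacement is padded on the right with `pad` (one byte).
    A longer replacement is refused, since it would shift everything after it.
    """
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original text")
    if data.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the original text")
    padded = new.ljust(len(old), pad)
    patched = data.replace(old, padded)
    # Every byte after the edit must stay at its original offset.
    assert len(patched) == len(data)
    return patched
```

For example, read the whole file with open(path, "rb"), call patch_same_length(data, b"Old Vendor Name", b"My Org"), and write the result to a new file so the working original stays untouched.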
posted by zsazsa at 11:56 AM on May 27, 2010


You may be able to make changes to it if you make sure to not change the length of the xml section. Adding a bunch of spaces at the end to pad it out to the right length should be fine. The type of program that you want would be a "hex editor", which will help you avoid changing or moving the binary parts.
posted by dodecapus at 11:59 AM on May 27, 2010


Alternatively, you can write a PHP (or other scripting language) front-end to your CGI if it isn't economical to rewrite the whole thing. Your PHP app can forward requests to the CGI and replace the branding in the response before forwarding that back to the user.
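Here is a rough sketch of that front-end idea, written in Python rather than PHP. The URL and the replacement strings are hypothetical placeholders, and the sketch only forwards GET query strings; POST data, cookies and other headers would need extra handling.

```python
import urllib.request

# Hypothetical location of the untouched original CGI.
CGI_URL = "http://localhost/cgi-bin/original.cgi"
# Hypothetical branding text to swap in the response.
REPLACEMENTS = {"Old Vendor Name": "My Organization"}


def rebrand(body: str, replacements: dict) -> str:
    """Apply each old -> new text substitution to the response body."""
    for old, new in replacements.items():
        body = body.replace(old, new)
    return body


def fetch_and_rebrand(query_string: str, cgi_url: str = CGI_URL,
                      replacements: dict = REPLACEMENTS,
                      timeout: float = 10.0) -> str:
    """Request the original CGI over HTTP and return its output with branding replaced."""
    url = cgi_url + ("?" + query_string if query_string else "")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        body = response.read().decode(charset, errors="replace")
    return rebrand(body, replacements)
```

The front-end script would pass along the query string it received and send the returned text back to the browser. The binary itself is never modified.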
posted by rocketpup at 12:03 PM on May 27, 2010


Editing it sounds like a path that has not paid off. If the other editing suggestions don't pan out, what I would do is treat it as a black box with an API. I would create a wrapper cgi that:

1) Takes the same parameters your script does
     and passes them through
2) Captures the output of your script and adds the presentation
     changes you want to make on the fly, via string
     substitution probably, then outputs it.

You could do this in any number of languages, via exec in PHP, Perl or whatever language you know best that can make system calls.
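A minimal sketch of such a wrapper in Python, assuming the original binary follows the usual CGI convention of reading its parameters from environment variables (such as QUERY_STRING) and standard input. The path and the replacement strings are hypothetical.

```python
import os
import subprocess
import sys

CGI_PATH = "/path/to/original.cgi"  # hypothetical location of the compiled CGI
REPLACEMENTS = {b"Old Vendor Name": b"My Organization"}  # hypothetical branding text


def split_cgi_output(raw: bytes):
    """Split CGI output into (headers, body) at the first blank line."""
    hits = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    hits = [(index, sep) for index, sep in hits if index != -1]
    if not hits:
        return b"", raw
    index, sep = min(hits)
    return raw[:index], raw[index + len(sep):]


def rewrite_response(raw: bytes, replacements: dict) -> bytes:
    """Substitute strings in the body and drop any stale Content-Length header."""
    headers, body = split_cgi_output(raw)
    for old, new in replacements.items():
        body = body.replace(old, new)
    kept = [line for line in headers.splitlines()
            if not line.lower().startswith(b"content-length:")]
    return b"\r\n".join(kept) + b"\r\n\r\n" + body


def run_wrapped(cgi_path: str = CGI_PATH, replacements: dict = REPLACEMENTS) -> bytes:
    """Run the original CGI with the same environment and input, then rewrite its output."""
    posted = sys.stdin.buffer.read() if os.environ.get("CONTENT_LENGTH") else b""
    result = subprocess.run([cgi_path], input=posted, stdout=subprocess.PIPE,
                            env=os.environ.copy(), check=True)
    return rewrite_response(result.stdout, replacements)
```

The wrapper script itself would finish with sys.stdout.buffer.write(run_wrapped()). Plain string substitution is the simplest option; if the output is well-formed XML or HTML, a parser would be more robust.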
posted by artlung at 12:05 PM on May 27, 2010


fatbird and dodecapus are right. When strings are serialized in binaries, the length is usually written first so that the deserialization function knows how many bytes to read. If you change the string's length, it is almost guaranteed to cause problems.

If your new string is shorter than the original, make judicious use of the insert key and pad with spaces if necessary. If the string you're putting in is longer than the original, that makes things a good deal more complicated. You would have to find the place where the length is written and replace that. It will not be plaintext, so you would need a hex editor to do so.
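To illustrate why the length matters, here is a toy example in Python. It assumes a made-up layout where each string is preceded by a 4-byte little-endian length; the real layout inside this .cgi is unknown and may well differ.

```python
import struct


def pack_string(text: bytes) -> bytes:
    """Serialize a string as a 4-byte little-endian length followed by the bytes."""
    return struct.pack("<I", len(text)) + text


def read_string(data: bytes, offset: int):
    """Read one length-prefixed string; return (text, offset of the next record)."""
    (length,) = struct.unpack_from("<I", data, offset)
    start = offset + 4
    return data[start:start + length], start + length


# Two strings stored back to back, as a serializer might write them.
blob = pack_string(b"Old Name") + pack_string(b"next")

# Same-length edit: the stored length is still correct, so the second record is found.
safe = blob.replace(b"Old Name", b"New".ljust(8))

# Longer edit without fixing the length: the reader stops after 8 bytes
# and then misreads the leftover text as the next length field.
broken = blob.replace(b"Old Name", b"A Much Longer Name")
```

Reading safe with read_string still yields b"next" as the second record, while reading broken does not. In a real executable, a longer string would also shift the code and data that follow it, so fixing the length field alone may not be enough.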
posted by jakejake at 12:25 PM on May 27, 2010


Mine and artlung's suggestions are functionally identical, though the methods used to execute the CGI differ:

Mine: would make use of an HTTP request from the script to execute the CGI in the same way as if the request was coming directly from a browser.

Artlung's: would execute the CGI directly as a system call.

Artlung's would be more efficient, while mine would probably require fewer privileges for the wrapper script.
posted by rocketpup at 12:26 PM on May 27, 2010


Best answer: You probably need to use a Hex editor so you can edit the file without disturbing the binary bits.

Also if this is too clunky, you can open up the cgi file in a hex editor, copy the bytes from the XML part, and paste them into a new empty file. Then edit that file with a standard text editor, making sure that the total number of characters is the same when you're done. Once you've made your changes, open that new file in the hex editor again, then copy and paste those bytes back into the original cgi file.
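The same copy-out, edit, paste-back routine can be sketched in Python, assuming you have already noted the start and end byte offsets of the XML part (for example from the hex editor). The function names are illustrative.

```python
def extract_region(data: bytes, start: int, end: int) -> bytes:
    """Return the bytes from offset `start` up to, but not including, `end`."""
    return data[start:end]


def splice_region(data: bytes, start: int, end: int, edited: bytes) -> bytes:
    """Put an edited region back, refusing anything that changes its length."""
    expected = end - start
    if len(edited) != expected:
        raise ValueError(
            f"edited region is {len(edited)} bytes, expected {expected}"
        )
    return data[:start] + edited + data[end:]
```

Write the extracted region to its own file, edit it in a normal text editor, read it back in binary mode, and pass it to splice_region. The length check also catches the trailing newline that many text editors add when saving.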

You probably won't be able to figure out a way to actually make the xml section longer, although it would probably be technically possible if you knew enough about the format of that file.
posted by burnmp3s at 12:26 PM on May 27, 2010


Do you happen to know the source language? These days, many binaries are actually programs for some kind of virtual machine (Java, .Net, etc.) and those are actually pretty straightforward to decompile.

If this turns out to be compiled C code, you probably won't be able to decompile it. (There are tools but they usually won't get you the whole source code. The best you can expect is 90% of the original code and a lot of little things that need to be cleaned up by hand.)

My suggestion, in that case, is to use a disassembler to convert it to assembly language, edit the strings that way (they'll be string constants) and then reassemble it when you're done. This way, you don't have to worry about corrupting the executable, at least.
posted by suetanvil at 2:27 PM on May 27, 2010


Response by poster: I think I'm going to try burnmp3s's idea. I have no idea what the source language for this .cgi was. And since I really just need to edit the text in the included XML and can make new graphics and give them the same file names, this should work (provided I keep the same number of characters, as people here have said). I'll give it a shot.
posted by mrbarrett.com at 3:08 PM on May 27, 2010


I would use commandline in Terminal:

# emacs
M-x find-file-literal (M-x means metakey and x, or failing that, tap Esc, then 'x')
type in the filename and push return. Use TAB to autocomplete
M-x hexl-mode (Changes into hex mode)
M-x overwrite-mode (You can't insert/add/remove bytes, as that would shift the binary. You can only overwrite characters with different ones. If your changes require you to insert longer lines than are already in the file you can't do it).
Then use ^s text to search for 'text', and keep pushing ^s until you find where you want to edit. (cursor keys, or ^g will get you out of search).
Change things as you want.
^x^s to save the file, answer 'y' to the question.

If you need to insert binary, like a 00, use M-^x (or esc-^X) then hex number.
posted by lundman at 8:12 PM on May 27, 2010


Unfortunately .cgi is a pretty meaningless file extension. If you're on a Unix machine, you can use the file command (e.g. file myscript.cgi) to get a guess as to whether it's a native binary or, say, Java bytecode. I did once have to change a single integer constant in a Mac OS X binary, and it was a full day ordeal for me and a coworker (for two architectures, but still). I really wouldn't attempt changing anything more than text strings without some serious programming skills. But it can be a fun challenge, which is why anti-piracy schemes get cracked so quickly.
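If the file command isn't available, a rough version of the same guess can be made by checking the first few bytes. This Python sketch covers only a handful of well-known signatures and is far less thorough than file; the function names are illustrative.

```python
# (leading bytes, description) pairs for a few common formats.
MAGIC = [
    (b"\x7fELF", "ELF executable (Linux and most Unix systems)"),
    (b"MZ", "DOS/Windows executable"),
    (b"\xfe\xed\xfa\xce", "Mach-O 32-bit executable"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit executable (byte-swapped)"),
    (b"\xfe\xed\xfa\xcf", "Mach-O 64-bit executable"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit executable (byte-swapped)"),
    # Java class files and Mach-O universal binaries share this signature.
    (b"\xca\xfe\xba\xbe", "Java class file or Mach-O universal binary"),
    (b"#!", "script with an interpreter line"),
]


def guess_type(head: bytes) -> str:
    """Guess a file type from its leading bytes."""
    for magic, description in MAGIC:
        if head.startswith(magic):
            return description
    return "unknown"


def guess_file(path: str) -> str:
    """Read the first bytes of the file at `path` and guess its type."""
    with open(path, "rb") as handle:
        return guess_type(handle.read(8))
```

If the result is "script with an interpreter line", the file is probably not compiled at all and may be editable as text after all.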

I once heard a story from an old school Mac developer that the original Macintalk text-to-speech system was written by a 3rd party, and Apple never got the source code. It proved popular, and when they went to make an improved version the original company had gone under and the source had been lost. So the Apple programmers had to write version 2.0 by hacking the original binary in assembler.
posted by serathen at 8:20 PM on May 27, 2010

