Some rather clueless questions about using undocumented APIs in Windows
January 16, 2008 9:56 AM

The details behind this are kind of a long story. But the deal is, I have a DLL that contains an API. It's totally undocumented. Using APIMonitor I can find the API calls that look like what I want, and I can call these from my programs provided I load the DLL, etc.

But the problem is, I don't know the prototypes for these functions. APIMonitor gives some clues because it shows the stack, but as far as I can tell it's not really "enough" to figure out the function signatures. If I use the wrong signatures, I obviously corrupt the stack and crash my program pretty much straight away.

I have a program (not mine) that uses the DLL, which is what I've been using to run APIMonitor on. I can also use some tools I have to "hook" into this API, so that my function gets called instead of the API function. I was hoping I could use this to look at the details of the calling arguments but it's not really that much help because I still need to know the number and type of arguments that are getting passed to my function.

Are there any tools or methods I can use to figure out how to call these functions?

(This is all using Visual C/C++ if that matters)
posted by RustyBrooks to Computers & Internet (17 answers total) 3 users marked this as a favorite
It would probably be easier to look for an open source equivalent to what the mystery DLL is supposed to do. If you must do this, what you need is called a "reverse compiler" (more commonly known as a decompiler or disassembler).
posted by TeatimeGrommit at 10:29 AM on January 16, 2008

Er, if you must use the dll, and not go open source, then you need a reverse compiler.
posted by TeatimeGrommit at 10:30 AM on January 16, 2008

I found and tried the demo version of "IDA", which seems like it can do what I'd like. It recommended a prototype for the function, and it works, to an extent. That is, I can hook into the API call and print out the arguments, and it looks kind of right to me (the arguments are recognizable). However, the program I'm running still crashes immediately after I make the corresponding call. So that's kind of depressing.
posted by RustyBrooks at 10:38 AM on January 16, 2008

Depending on what your API does, can you attack the problem at another point? If the API communicates over the network, maybe you do that communication yourself; if it accesses a file, maybe you can open the file yourself. In fact, this week I worked out the network protocol implemented by an API I was using, and it turns out the protocol is vastly simpler than the API makes it seem.
posted by pocams at 10:45 AM on January 16, 2008

My current approach is to actually "attack" the problem downstream, where it outputs data to the screen with ExtTextOutW. However, I find it impossible to determine which window this output was intended for (there are multiple windows) and therefore it's very hard to separate the data streams properly.

Moving upstream might help, i.e. looking at the network data. It's all encrypted, and I expect that much of the data is binary also, so I am looking to avoid that if possible.
posted by RustyBrooks at 10:48 AM on January 16, 2008

Ouch, nasty - dealing with encrypted network data would be pretty miserable. What about picking the data out of the already-rendered windows using AutoIt? It's not elegant, but it could do the trick, and it's easy to embed its functionality as a DLL.

I don't know much about Win32 programming, but it seems there are a few different calling conventions (cdecl, stdcall); are you sure you're using the right one? If I were in your position, I'd probably be fooling around with Python and ctypes, which lets you load and call arbitrary DLLs; it has some nice features, like detecting a corrupt stack and telling you how much it's off by, and catching Win32 exceptions to try to keep you from crashing. At least it might help you experiment a little more quickly and easily.
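
For anyone who hasn't touched ctypes before, a minimal sketch of the idea might look like the following. The mystery DLL isn't available here, so this uses the C runtime's `strlen` purely as a stand-in; the library choice and the `call_guess` helper are illustrative assumptions, not part of the API in question.

```python
import ctypes
import os

# Stand-in library: the C runtime. In the real case you would pass the path
# of the undocumented DLL instead (not shown here, since it is unknown).
if os.name == "nt":
    lib = ctypes.CDLL("msvcrt")  # CDLL assumes cdecl; ctypes.WinDLL assumes stdcall
else:
    lib = ctypes.CDLL(None)      # symbols already loaded into this process

# Declare the guessed prototype: size_t strlen(const char *)
strlen = lib.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t


def call_guess(func, *args):
    """Hypothetical helper: call func and report the result or the error raised."""
    try:
        return ("ok", func(*args))
    except (ValueError, OSError, ctypes.ArgumentError) as exc:
        # Per the comment above, ctypes can report a mismatched stack
        # as an exception rather than crashing outright.
        return ("error", str(exc))


result = call_guess(strlen, b"hello")
print(result)
```

Swapping `argtypes`, `restype`, and CDLL versus WinDLL is then a one-line change per guess, which is what makes this quicker to experiment with than recompiling a C program each time.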
posted by pocams at 10:57 AM on January 16, 2008

I actually started with AutoIt. The widget that contains the data sort of isn't a standard widget, so you can't use plain AutoIt commands to get it out. That's what started me down this path.

It's definitely possible I'm using the wrong calling convention. I've mostly been trying cdecl and WINAPI.

The return value from the function I'm calling seems OK even. But right after it goes to crap. The debugger says it's somewhere in msvcr80.dll which is totally useless info as far as I can tell.

I wouldn't even know where to begin with Python.
posted by RustyBrooks at 11:10 AM on January 16, 2008

Well, cdecl, stdcall and WINAPI all seem to do about the same thing (i.e. give me the proper arguments and return value). MS fastcall doesn't work at all. Anyway, all three that "work" still crash immediately after, so they must be screwing the stack up somehow.
posted by RustyBrooks at 11:24 AM on January 16, 2008

IDA reports it as __cdecl if that helps.
posted by RustyBrooks at 11:26 AM on January 16, 2008

Well, I'm afraid I'm out of ideas - hopefully someone that knows more can help you out. I'll try to keep an eye on this thread in case anything more comes to light, but you might also want to ask this question someplace more specialized, since you've really gone quite a ways with it already.
posted by pocams at 12:08 PM on January 16, 2008

Any idea where to ask? I am totally not a Windows programmer (by choice, at least).

I found a few threads online, most of them point to something like IDA, which was helpful but ultimately is not getting the job done for me.
posted by RustyBrooks at 12:11 PM on January 16, 2008

Sorry, I'm even less of a Windows programmer and I'm not sure which forums are more reputable than others. Here's a thought - maybe you could use IDA to examine the stack immediately before and immediately after the function call, making sure that it's getting cleaned up properly? If the stack is coming back wrecked, you'll be able to see exactly where it's going wrong and maybe why. If it turns out that the stack is okay, maybe there's some other dependency or tricky little API thing going on.
posted by pocams at 12:30 PM on January 16, 2008

This DLL Export Viewer (part of PE Explorer, which has a 30-day free trial) gives parameter lists for exported functions.
posted by null terminated at 2:02 PM on January 16, 2008

Could you find the code surrounding the function call in the other program and disassemble it? Most likely the code immediately before the call will be setting up the parameters.
posted by equalpants at 2:07 PM on January 16, 2008

DLL Export Viewer has this to say about parameter lists (which is true):

If you don't have the source code and API documentation, the machine code is all there is. PE Explorer provides a Disassembler. There is only one way to figure out the parameters: run the disassembler and read the disassembly output. This task of reverse engineering the interface cannot be automated, sorry.

PE Explorer comes bundled with descriptions for 39 various libraries, including the core Windows operating system libraries (eg. KERNEL32, GDI32, USER32, SHELL32, WSOCK32), key graphics libraries (DDRAW, OPENGL32) and more. But PE Explorer is unable to provide description sets for all libraries or functions ever written by humankind.

So unless your DLL is one of the 39 (doubtful) then you're SOL. C export tables simply don't include the parameter information. There's no need, since functions can't be overloaded, and it's assumed you'll have the header file. C++, however, is a different story: mangled C++ export names encode the parameter types.

Disassembling is probably the best way... but then you need to know assembly. Here's a handy starter guide. I'd start by looking at %EBP and how it's manipulated. Listing out the offsets from %EBP should at least give you an idea of the size of the parameters.
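
As a rough illustration of the offset-listing idea, here is a small sketch that scans a disassembly listing for `[ebp+N]` operands. It assumes 32-bit x86 code with a standard `push ebp` / `mov ebp, esp` prologue, where `[ebp+4]` is the return address and arguments start at `[ebp+8]`. The sample listing and the function names are made up for the example, and the result is only a lower bound, since a function need not touch every argument and optimized code may not use %EBP as a frame pointer at all.

```python
import re

# Matches operands such as [ebp+8], [ebp+0Ch] or [ebp+10h] in a listing.
EBP_REF = re.compile(r"\[ebp\+([0-9A-Fa-f]+)(h?)\]", re.IGNORECASE)

# Hypothetical disassembly, invented for this example.
SAMPLE = """
push ebp
mov  ebp, esp
mov  eax, [ebp+8]
mov  ecx, [ebp+0Ch]
add  eax, ecx
mov  edx, [ebp+10h]
mov  [ebp-4], eax
pop  ebp
ret
"""


def stack_arg_offsets(disassembly):
    """Return the sorted positive EBP offsets that look like argument reads."""
    offsets = set()
    for digits, hex_suffix in EBP_REF.findall(disassembly):
        is_hex = bool(hex_suffix) or not digits.isdigit()
        value = int(digits, 16 if is_hex else 10)
        if value >= 8:  # [ebp+0] is the saved EBP, [ebp+4] the return address
            offsets.add(value)
    return sorted(offsets)


def min_arg_bytes(offsets):
    """Lower bound on argument bytes, assuming 4-byte stack slots."""
    return 0 if not offsets else max(offsets) - 8 + 4


offsets = stack_arg_offsets(SAMPLE)
print(offsets, min_arg_bytes(offsets))
```

For the sample above the highest offset is 16, which suggests at least three 4-byte slots (12 bytes) of arguments.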

As a side note, something I forgot (and which the site mentions): with __stdcall, MSVC encodes the size of the parameters in the exported symbol name. So if your symbol is something like "_foo@8" then that means to call "foo" as __stdcall with 8 bytes of parameters. If your symbol doesn't look like this, then you're probably using __cdecl (although a DLL can also export undecorated names, in which case the name tells you nothing).
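
A quick way to check a list of export names against that pattern is sketched below. The function name and the labels it returns are invented for illustration. It recognizes the 32-bit MSVC decorations `_name@N` (stdcall) and `@name@N` (fastcall), flags names starting with `?` as C++ mangled, and treats anything else as undecorated, which could still be cdecl or a stdcall function exported under a plain name.

```python
import re

STDCALL = re.compile(r"^_(\w+)@(\d+)$")    # e.g. _foo@8
FASTCALL = re.compile(r"^@(\w+)@(\d+)$")   # e.g. @foo@8


def classify_export(symbol):
    """Return (name, convention guess, argument bytes or None) for an export name."""
    match = STDCALL.match(symbol)
    if match:
        return (match.group(1), "stdcall", int(match.group(2)))
    match = FASTCALL.match(symbol)
    if match:
        return (match.group(1), "fastcall", int(match.group(2)))
    if symbol.startswith("?"):
        # C++ mangled name; a demangler is needed to read the parameter types.
        return (symbol, "c++ mangled", None)
    # No size suffix: cdecl, or exported under an undecorated name.
    return (symbol, "undecorated", None)


for name in ["_foo@8", "@bar@12", "baz"]:
    print(classify_export(name))
```

With 4-byte stack slots, the 8 in `_foo@8` corresponds to two int- or pointer-sized arguments, or one 8-byte argument.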

What is the return value expected to be? If it's small enough, then it will be in %EAX or %EDX:%EAX (wiki). If it's floating point, then it's in the x87 register ST(0). If it's a class or structure, you'll have to look up how MSVC expects to pass the memory in.

So why is it later crashing in msvcr80.dll? My guess is that your stack is getting smashed. If you're passing too few parameters, the function could be overwriting your own local storage, or your stored instruction pointer. Since you're using cdecl you might not know until much later that your stack is screwed up. Break before the function, examine the stack, call, break after, examine the stack. If something changed that you didn't expect, then there's your problem. Or, you could be missing a step in the library's initialization. Perhaps you need to call a function to set up the DLL (think of something like WSAStartup).

One other thing to try: google for the symbol name. Perhaps someone else has tried to figure this function out before.
posted by sbutler at 2:38 PM on January 16, 2008

equalpants also has a very good idea.
posted by sbutler at 2:39 PM on January 16, 2008

Thanks for all the info above. I'll try to address a few points.

* I don't know assembly. It might be time to learn enough to debug this. I don't know if I would be able to tell by looking if I was messing up the stack.

* I have googled for these function names, no dice. Seems to be uncharted waters.

* I'll see if I can find the code that calls it, maybe I'll get lucky.

* So far today I've tried hooking into some different entry points with pretty similar problems. I'm using HookAPI from here. I can insert my own hooks for Windows API functions just fine. If I try to hook into anything else I run into trouble, though.
posted by RustyBrooks at 2:43 PM on January 16, 2008
