How to use a dll without knowing parameters? - windows

I have a dll that I need to make use of. I also have a program that makes calls to this dll to use it. I need to be able to use this dll in another program, however previous programmer did not leave any documentation or source code. Is there a way I can monitor what calls are made to this dll and what is passed?

You can't, in general. This is from the Dependency Walker FAQ:
Q: How do I view the parameter and
return types of a function?
A: For most functions, this
information is simply not present in
the module. The Windows' module file
format only provides a single text
string to identify each function.
There is no structured way to list the
number of parameters, the parameter
types, or the return type. However,
some languages do something called
function "decoration" or "mangling",
which is the process of encoding
information into the text string. For
example, a function like int Foo(int,
int) encoded with simple decoration
might be exported as _Foo#8. The 8
refers to the number of bytes used by
the parameters. If C++ decoration is
used, the function would be exported
as ?Foo##YGHHH#Z, which can be
directly decoded back to the
function's original prototype: int
Foo(int, int). Dependency Walker
supports C++ undecoration by using the
Undecorate C++ Functions Command.
Edit: One thing you could do, I think, is to get a disassembler and disassemble the DLL and/or the calling code, and work out from that the number and types of the arguments, and the return types. You wouldn't be able to find out the names of the arguments though.

You can hook the functions in the DLL you wish to monitor (if you know how many arguments they take)

You can use dumpbin (Which is part of the Visual Studio Professional or VC++ Express, or download the platform kit, or even use OpenWatcom C++) on the DLL to look for the 'exports' section, as an example:
dumpbin /all SimpleLib.dll | more
Output would be:
Section contains the following exports for SimpleLib.dll
00000000 characteristics
4A15B11F time date stamp Thu May 21 20:53:03 2009
0.00 version
1 ordinal base
2 number of functions
2 number of names
ordinal hint RVA name
1 0 00001010 fnSimpleLib
2 1 00001030 fnSimpleLib2
Look at the ordinals, there are the two functions exported...the only thing is to work out what parameters are used...
You can also use the PE Explorer to find this out for you. Working out the parameters is a bit tricky, you would need to disassemble the binary, and look for the function call at the offset in the file, then work out the parameters by looking at the 'SP', 'BP' registers.
Hope this helps,
Best regards,
Tom.

Related

Win32 Assembly - Extern function naming (The meaning of '#')

As I see, extern WinAPI functions in assembly code have names like _ExitProcess#4.
What is the meaning of the #4 part, and how to determine what number to use after # ?
I know that this has something to do with DLL we are linking against, but in many cases it's not known what number to use after the #, and this leads to many nasty undefined reference errors.
As Andreas H answer said the number after the # is the number of bytes the function removes from stack before the function returns. This means it should be easy to determine that number, as it's the also number of bytes you need push on the stack to correctly call the function. It should be the number of PUSH instructions before the call multiplied by 4. In most cases this will also be the number of arguments passed to the function multiplied by 4.
If you want to double check that you've gotten the right number and you have Microsoft Visual Studio installed you can find the decorated symbol name from the Developer Command Prompt like this:
C:\> dumpbin /headers kernel32.lib | find "ExitProcess"
Symbol name : _ExitProcess#4
Name : ExitProcess
If you're using the MinGW compiler tools to link your assembly code, you can do this instead:
C:\> nm C:\MinGW\lib\libkernel32.a | find "ExitProcess"
00000000 I __imp__ExitProcess#4
00000000 T _ExitProcess#4
You'll need to replace C:\MinGW with the directory you installed MinGW.
Since not all Windows APIs reside in the kernel32 import library you'll need to replace kernel32 with the name of the import library given in the Windows SDK documentation for the API function you want to link to. For example, with MessageBoxA you'd need to use user32.lib with Visual Studio and libuser32.a with MinGW instead.
Note there are few rare Windows APIs that don't use the stdcall calling convention. These are functions like wsprintf that take a variable number of arguments, which the stdcall calling convention doesn't support. These functions just have an underscore _ before their names, and no # or number after. They also require that the caller remove the arguments from the stack.
The # symbol is, as the leading underscore, part of the function name when the stdcall calling convention is specified for the function.
The number specifies the number of bytes the function removes from the stack.
The compiler generates this number.
The suffix is added so that the function is not accidentally called with the wrong calling convention or the prototype in the source code specifies the wrong number or size of arguments. So the intention is to provide a means to avoid program crashes.
See also https://msdn.microsoft.com/de-de/library/zxk0tw93.aspx
If you want to get the number to use, make sure you have _NT_SYMBOL_PATH defined to the correct value.
Like:
srv*https://msdl.microsoft.com/download/symbols
or
srv*c:\MyServerSymbols*https://msdl.microsoft.com/download/symbols
For example (in cmd.exe, windows command line):
set _NT_SYMBOL_PATH=srv*https://msdl.microsoft.com/download/symbols
Then use:
dumpbin /exports /symbols kernel32.lib | findstr _ExitProcess#
You'll have to be in the directory where kernel32 is and you'll have to have grep.
Probably is a way to use built in find command. You could also redirect it to a file and then view it in your editor.

Compiling Fortran external symbols

When compiling fortran code into object files: how does the compiler determine the symbol names?
when I use the intrinsic function "getarg" the compiler converts it into a symbol called "_getarg#12"
I looked in the external libraries and found that the symbol name inside is called "_getarg#16" what is the significance of the "#[number]" at the end of "getarg" ?
_name#length is highly Windows-specific name mangling applied to the name of routines that obey the stdcall (or __stdcall by the name of the keyword used in C) calling convention, a variant of the Pascal calling convention. This is the calling convention used by all Win32 API functions and if you look at the export tables of DLLs like KERNEL32.DLL and USER32.DLL you'd see that all symbols are named like this.
The _...#length decoration gives the number of bytes occupied by the routine arguments. This is necessary since in the stdcall calling conventions it is the callee who cleans up the arguments from the stack and not the caller as is the case with the C calling convention. When the compiler generates a call to func with two 4-byte arguments, it puts a reference to _func#8 in the object code. If the real func happens to have different number or size of arguments, its decorated name would be something different, e.g. _func#12 and hence a link error would occur. This is very useful with dynamic libraries (DLLs). Imagine that a DLL was replaced with another version where func takes one additional argument. If it wasn't for the name mangling (the technical term for prepending _ and adding #length to the symbol name), the program would still call into func with the wrong arguments and then func would increment the stack pointer with more bytes than was the size of the passed argument list, thus breaking the caller. With name mangling in place the loader would not launch the executable at all since it would not be able to resolve the reference to _func#8.
In your case it looks like the external library is not really intended to be used with this compiler or you are missing some pragma or compiler option. The getarg intrinsic takes two arguments - one integer and one assumed-sized character array (string). Some compilers pass the character array size as an additional argument. With 32-bit code this would result in 2 pointers and 1 integer being passed, totalling in 12 bytes of arguments, hence the _getarg#12. The _getarg#16 could be, for example, 64-bit routine with strings being passed by some kind of descriptor.
As IanH reminded me in his comment, another reason for this naming discrepancy could be that you are calling getarg with fewer arguments than expected. Fortran has this peculiar feature of "prototypeless" routine calls - Fortran compilers can generate calls to routines without actually knowing their signature, unlike in C/C++ where an explicit signature has to be supplied in the form of a function prototype. This is possible since in Fortran all arguments are passed by reference and pointers are always the same size, no matter the actual type they point to. In this particular case the stdcall name mangling plays the role of a very crude argument checking mechanism. If it wasn't for the mangling (e.g. on Linux with GNU Fortran where such decorations are not employed or if the default calling convention was cdecl) one could call a routine with different number of arguments than expected and the linker would happily link the object code into an executable that would then most likely crash at run time.
This is totally implementation dependent. You did not say, which compiler do you use. The (nonstandard) intrinsic can exist in more versions for different integer or character kinds. There can also be more versions of the runtime libraries for more computer architectures (e.g. 32 bit and 64 bit).

How to programmatically inject parameters/instructions into a pre-built portable executable

I have two executables, both manually created by me, I shall call them 1.exe and 2.exe respectively. First of all, both the executables are compiled by MSVS 2010, using the Microsoft compiler. I want to type a message into 1.exe, and I want 1.exe to inject that message into 2.exe (possibly as some sort of parameter), so when I run 2.exe after 1.exe has injected the message, 2.exe will display that message.
NOTE - this is not for illicit use, both these executables were created by me.
The big thing for me is:
Where to place the message/instructions in 2.exe so they can be easily accessed by 2.exe
How will 2.exe actually FIND use these parameters (message).
I fully understand that I can't simply use C++ code as injection, it must be naked assembly which can be generated/translated by the compiler at runtime (correct me if I am wrong)
Some solutions I have been thinking of:
Create a standard function in 2.exe requiring parameters (eg displaying the messagebox), and simply inject these parameters (the message) into the function?
Make some sort of structure in 2.exe to hold the values that 1.exe will inject, if so how? Will I need to hardcode the offset at which to put these parameters into?
Note- I don't expect a spoonfeed, I want to understand this aspect of programming proficiently, I have read up the PE file format and have solid understanding of assembly (MASM assembler syntax), and am keen to learn alot more. Thank you for your time.
Very few programmers ever need to do this sort of thing. You could go your entire career without it. I last did it in about 1983.
If I remember correctly, I had 2.exe include an assembler module with something like this (I've forgotten the syntax):
.GLOBAL TARGET
TARGET DB 200h ; Reserve 512 bytes
1.exe would then open 2.exe, search the symbol table for the global symbol "TARGET", figure out where that was within the file, write the 512 bytes it wanted to, and save the file. This was for a licensing scheme.
The comment from https://stackoverflow.com/users/422797/igor-skochinsky reminded me that I did not use the symbol table on that occasion. That was a different OS. In this case, I did scan for a string.
From your description it sounds like passing a value on the command line is all you need.
The Win32 GetCommandLine() function will give you ther passed value that you can pass to MessageBox().
If it needs to be another running instance then another form of IPC like windows messages (WM_COPYDATA) will work.

Should use "__imp__ApiName#N" or "_ApiName#N"?

I have dumped a Windows SDK .lib file (kernel32.lib) with DUMPBIN, the output show me that there are two "versions" for every API name, for example:
_imp_ExitProcess#4
and
_ExitProcess#4
So, why there are two of the same with different name mangling? .
Let say i want to call ExitProcess from NASM, wich of them should i use when declare EXTERN?, mi practice shows me that i can call any of them but i don't know which one is the "correct" or the "prefered" to stick with it.
I think the _imp_ version is meant to be used with __declspec(dllimport) on Visual C++ compilers to prevent potential conflicts with code in the same module.
You're not supposed to use that fact explicitly in your code -- just use the original function, i.e. _ExitProcess#4.

Using GetProcAddress when the name might be decorated

What is the correct way to use GetProcAddress() on a 32 bit DLL? On win32, there are three calling conventions, cdecl, stdcall and fastcall. If the function in the DLL is foo they will decorate the name in the following ways _foo, _foo#N and #foo#N.
But if the author of the dll has used a .def file, then the exported name will be changed to just "foo" without any decoration.
This spells trouble for me because if I want to load foo from a dll that is using stdcall, should I use the decorated name:
void *h = LoadLibraryEx(L"foo.dll", NULL, 0);
GetProcAddres((HMODULE)h, L"_foo#16");
Or the undecorated one:
void *h = LoadLibraryEx(L"foo.dll", NULL, 0);
GetProcAddres((HMODULE)h, L"foo");
? Should I guess? I've looked at lots of 32 bit DLL files (stdcall and cdecl) and they all seem to export the undecorated name. But can you assume that is always the case?
There's really no shortcut or definitive rule here. You have to know the name of the function. The normal scenario is that you know at compile time the name of the function. In which case it does not matter whether the exported name is mangled, decorated, or indeed completely unrelated to the semantic name. Functions can be exported without names, by ordinal. Again, you need to know how the function was exported.
If you are presented with a header file for a library, and want to link to it with explicit linking (LoadLibrary/GetProcAddress) then you will need to find out the names of the function. Use a tool like dumpbin or Dependency Walker to do that.
Now, the other scenario which might lead to you asking the question is that you don't know the name at compile time. For instance, the name is provided by the user of your program in one way or another. Again, it is quite reasonable to require the user to know the exported name of the function.
Finally, you can parse the PE meta data for the executable file to enumerate its exported function. This will give you a list of exported function names, and exported function ordinals. This is what tools like dumpbin and Dependency Walker do.
If __declspec(dllexport) is used during compilation and __declspec(dllimport) in header file, as well as extern "c", then you do not need to decorate those functions. The __declspec helps in using the undecorated names, but function overloads, namespaces, and classes still require same way to distinguish them.
Usually, object oriented functions are exported using function ordinals instead of their decorated names. Cast the ordinal as (char*)(unsigned short)ordinal, thus, GetProcAddress(module, (char*)(unsigned short)ordinal);
Edit: while most of Windows use UTF-16, GetProcAddress uses UTF-8, so it cannot use a wide character string.
GetProcAddress(module, L"foo") is identical to GetProcAddress(module, "f");

Resources