I am just starting to learn Assembly (x86 NASM) and I am currently going over function calls. Wherever I looked on the internet I saw everyone calling functions like this:
call power
Where power is the label where the function starts. But what I am trying to see is how to print something in Assembly, and interestingly enough, calling a function like in the above case doesn't seem to work. We'll use the printf function from C. Say I already used extern printf and import printf msvcrt.dll in my program (so I can actually use printf), also say I already defined a symbol in my data segment msg db "Hello World", 0 and now I am trying to print this message. If I do this:
push dword msg
call printf
Nothing happens, it doesn't work. I have no idea why. However, if I do this:
push dword msg
call [printf]
The message is printed just as expected.
This doesn't make much sense to me after all the articles that I read used just the label, without brackets. It also made a lot of sense to me when using just the label as we're using the call instruction to perform a jump to that label, so we needed the address of the label. But here it doesn't make sense at all to me why we're using the brackets and what exactly happens. I mean, what is [printf] and what would [power] be, for the example I presented at the start of my question. However, despite my confusion, this is what works and the method I initially used doesn't work.
Can you please tell me exactly what is going on? (PS: I am using Olly Debugger if that makes any difference)
It depends on what is "printf" in your assembly. If it is a function pointer (aka, the address of some function is stored at the address named "printf"), then you need brackets []. If "printf" is a function, that is, if the machine code is stored at the address that your assembler calls "printf", then you must not put brackets (or else you will probably end up with a segmentation fault, as the first 32 of 64 bits of machine code of "printf" probably don't accidentally contain an address of an executable code).
Related
I want to print the addresses of all the local and global variables which are being used in a function, at different points of execution of a program and store them in a file.
I am trying to use gdb for this same.
The "info local" command prints the values of all local variables. I need something to print the addresses in a similar way. Is there any built in command for it?
Edit 1
I am working on a gcc plugin which generates a points-to graph at compile time.
I want to verify if the graph generated is correct, i.e. if the pointers do actually point to the variables, which the plugin tells they should be pointing to.
We want to validate this points-to information on large programs with over thousands of lines of code. We will be validating this information using a program and not manually. There are several local and global variables in each function, therefore adding printf statements after every line of code is not possible.
There is no built-in command to do this. There is an open feature request in gdb bugzilla to have a way to show the meaning of all the known slots in the current stack frame, but nobody has ever implemented this.
This can be done with a bit of gdb scripting. The simplest way is to use Python to iterate over the Blocks of the selected Frame. Then in each such Block, you can iterate over all the variables, and invoke info addr on the variable.
Note that printing the address with print &var will not always work. A variable does not always have an address -- but, if the variable exists, it will have a location, which is what info addr will show.
One simple way these ideas can differ is if the compiler decides to put the variable into a register. There are more complicated cases as well, though, for example the compiler can put the variable into different spots at different points in the function; or can split a local struct into its constituent parts and move them around.
By default info addr tries to print something vaguely human-readable. You can also ask it to just dump the DWARF location expressions if you need that level of detail.
programmatically ( in C/C++ ) you use the & operator to get the address of a variable (assuming it's not a pointer):
int a; //variable declaration
print("%d", a); //print the value of the variable (as an integer)
print("0x%x", &a); //print the address of the variable (as hex)
The same goes for (gdb), just use &
plus the question has already been answered here (and not only)
How can I use the call op code to generate a direct call, without dereferencing a pointer?
Since the call instruction can mean any of a number of op codes such as xE8, xFF and x9A, how do I tell Visual Studio which one I want to generate when I use the call instruction?
The assembler will figure out which opcode to generate based on what expression you use.
If you write call foo it will be a near relative call E8. If you write call [foo], call [eax] or call eax it will be near indirect call FF (opcode extension 2). If you write call foo:bar (or if the assembler knows you are calling a far procedure) it will be a direct far call 9A. If you write call far [foo] it will be an indirect far call which is also FF (but opcode extension 3).
PS: not sure about far call syntax for VS, see the documentation for details. You don't normally use far calls, though.
Windows API has ChildWindowFromPoint() and ChildWindowFromPointEx() functions and they differ in that the latter has uFlags parameter specifying which windows to skip.
It looks like if I pass CWP_ALL into ChildWindowFromPointEx() I'll get exactly the same effect as I would have with ChildWindowFromPoint().
Is the only difference in uFlags parameter? Can I just use ChildWindowFromPointEx() everywhere and pass CWP_ALL when I need ChildWindowFromPoint() behavior?
If it helps at all, I hacked up a quick test application that calls both functions and stepped into the disassembled USER32.DLL to see where the calls go.
For ChildWindowFromPoint, after some preamble, I reached this point:
The main processing was delegated to the call at 75612495.
Then, for ChildWindowFromPointEx, I step into the assembly and get this:
As that entry point is the target of the call from the first function, it seems pretty clear to me that ChildWindowFromPoint calls ChildWindowFromPointEx, presumably with uFlags set to CWP_ALL (my assembler knowledge is limited but I'm looking hard at that push 0 before the call - CWP_ALL is defined as zero).
If you intent to always use ChildWindowFromPointEx with CWP_ALL, you could just use ChildWindowFromPoint().
If you intent to always use ChildWindowFromPoint, you could just use ChildWindowFromPointEx with CWP_ALL.
ChildWindowFromPoint is equivalent to ChildWindowFromPointEx with CWP_ALL.
Advice: use ChildWindowFromPointEx (you may one day have usage for other flags value)
I have two executables, both manually created by me, I shall call them 1.exe and 2.exe respectively. First of all, both the executables are compiled by MSVS 2010, using the Microsoft compiler. I want to type a message into 1.exe, and I want 1.exe to inject that message into 2.exe (possibly as some sort of parameter), so when I run 2.exe after 1.exe has injected the message, 2.exe will display that message.
NOTE - this is not for illicit use, both these executables were created by me.
The big thing for me is:
Where to place the message/instructions in 2.exe so they can be easily accessed by 2.exe
How will 2.exe actually FIND use these parameters (message).
I fully understand that I can't simply use C++ code as injection, it must be naked assembly which can be generated/translated by the compiler at runtime (correct me if I am wrong)
Some solutions I have been thinking of:
Create a standard function in 2.exe requiring parameters (eg displaying the messagebox), and simply inject these parameters (the message) into the function?
Make some sort of structure in 2.exe to hold the values that 1.exe will inject, if so how? Will I need to hardcode the offset at which to put these parameters into?
Note- I don't expect a spoonfeed, I want to understand this aspect of programming proficiently, I have read up the PE file format and have solid understanding of assembly (MASM assembler syntax), and am keen to learn alot more. Thank you for your time.
Very few programmers ever need to do this sort of thing. You could go your entire career without it. I last did it in about 1983.
If I remember correctly, I had 2.exe include an assembler module with something like this (I've forgotten the syntax):
.GLOBAL TARGET
TARGET DB 200h ; Reserve 512 bytes
1.exe would then open 2.exe, search the symbol table for the global symbol "TARGET", figure out where that was within the file, write the 512 bytes it wanted to, and save the file. This was for a licensing scheme.
The comment from https://stackoverflow.com/users/422797/igor-skochinsky reminded me that I did not use the symbol table on that occasion. That was a different OS. In this case, I did scan for a string.
From your description it sounds like passing a value on the command line is all you need.
The Win32 GetCommandLine() function will give you ther passed value that you can pass to MessageBox().
If it needs to be another running instance then another form of IPC like windows messages (WM_COPYDATA) will work.
We have a native C++ application running via COM+ on a windows 2003 server. I've recently noticed from the event viewer that its throwing exceptions, specifically the C0000005 exception, which, according to http://blogs.msdn.com/calvin_hsia/archive/2004/06/30/170344.aspx means that the process is trying to write to memory not within its address space, aka access violation.
The entry in event viewer provides a call stack:
LibFmwk!UTIL_GetDateFromLogByDayDirectory(char const *,class utilCDate &) + 0xa26c
LibFmwk!UTIL_GetDateFromLogByDayDirectory(char const *,class utilCDate &) + 0x8af4
LibFmwk!UTIL_GetDateFromLogByDayDirectory(char const *,class utilCDate &) + 0x13a1
LibFmwk!utilCLogController::GetFLFInfoLevel(void)const + 0x1070
LibFmwk!utilCLogController::GetFLFInfoLevel(void)const + 0x186
Now, I understand that its giving me method names to go look at but I get a feeling that the address at the end of each line (e.g. + 0xa26c) is trying to point me to a specific line or instruction within that method.
So my questions are:
Does anyone know how I might use this address or any other information in a call stack to determine which line within the code its falling over on?
Are there any resources out there that I could read to better understand call stacks,
Are there any freeware/opensource tools that could help in analysing a call stack, perhaps by attaching to a debug symbol file and/or binaries?
Edit:
As requested, here is the method that appears to be causing the problem:
BOOL UTIL_GetDateFromLogByDayDirectory(LPCSTR pszDir, utilCDate& oDate)
{
BOOL bRet = FALSE;
if ((pszDir[0] == '%') &&
::isdigit(pszDir[1]) && ::isdigit(pszDir[2]) &&
::isdigit(pszDir[3]) && ::isdigit(pszDir[4]) &&
::isdigit(pszDir[5]) && ::isdigit(pszDir[6]) &&
::isdigit(pszDir[7]) && ::isdigit(pszDir[8]) &&
!pszDir[9])
{
char acCopy[9];
::memcpy(acCopy, pszDir + 1, 8);
acCopy[8] = '\0';
int iDay = ::atoi(&acCopy[6]);
acCopy[6] = '\0';
int iMonth = ::atoi(&acCopy[4]);
acCopy[4] = '\0';
int iYear = ::atoi(&acCopy[0]);
oDate.Set(iDay, iMonth, iYear);
bRet = TRUE;
}
return (bRet);
}
This is code written over 10 years ago by a member of our company who has long since gone, so I don't presume to know exactly what this is doing but I do know its involved in the process of renaming a log directory from 'Today' to the specific date, e.g. %20090329. The array indexing, memcpy and address of operators do make it look rather suspicious.
Another problem we seem to have is that this only happens on the production system, we've never been able to reproduce it on our test systems or development systems here, which would allow us to attach a debugger.
Much appreciated!
Andy
Others have said this in-between the lines, but not explicitly. look at:
LibFmwk!UTIL_GetDateFromLogByDayDirectory(char const *,class utilCDate &) + 0xa26c
The 0xa26c offset is huge, way past the end of the function. the debugger obviously doesn't have the proper symbols for LibFmwk so instead it's relying on the DLL exports and showing the offset relative to the closest one it can find.
So, yeah, get proper symbols and then it should be a breeze. UTIL_GetDateFromLogByDayDirectory is not at fault here.
if you really need to map those addresses to your functions - you'll need to work with .MAP file and see where those addresses really point to.
But being in your situation I would rather investigate this problem under debugger (e.g. MSVS debugger or windbg); as alternative (if crash happens at customer's site) you can generate crash dump and study it locally - it can be done via Windows MiniDumpWriteDump API or SysInternals ProcDump utility (http://download.sysinternals.com/Files/procdump.zip).
Make sure that all required symbol files are generated and available (also setup microsoft symbol server path so that windows DLLs' entry points get resolved also).
IMHO this is just the web site you need: http://www.dumpanalysis.org - which is the best resource to cover all your questions.
Consider also taking a look at this PDF - http://windbg.info/download/doc/pdf/WinDbg_A_to_Z_color.pdf
Point 2 and 3 are easily answered:
3rd Point. Any debugger. That's what they are made for. Set your debugger to break on this special exception. You should be able to click yourself through the callstack and find the different calls on the stack (at least delphi can do this, so visual studio should be able as well). Compile without optimisations if possible. OllyDBG might work as well - perhaps in combination with its trace functionality.
2nd Point. Any information about x86 Assembler, Reverseengineering ... Try: OpenRCE, NASM Documentation, ASM Community.
1st Point. The callstack tells you the functions. I don't know if it is written in order or in opposite order - so it might be that the first line is the last called function or the first called function. Follow the calls with the help of the debugger. Sometimes you can change between asm and code (depending on the debugger, map files ...). If you don't have the source - learn assembler, read about reverse engineering. Read the documentation of the functions you call in third party components. Perhaps you do not satisfy a precondition.
If you can tell a bit more about the programm (which parts of the source code do you have, is a library call involved?, ...)
Now some code-reading:
The function accepts a pointer to a zero terminated string and a reference to a date object. The pointer is assumed to be valid!
The function checks wether the string is in a specific format (% followed by 8 digits followed by a \0). If this is not the case, it returns false. This check (the big if) accesses the pointer without any validity checks. The length is not checked and if the pointer is pointing somewhere in the wild, this space is accessed. I don't know if a shorter string will cause problems. It shouldn't because of the way && is evaluated.
Then some memory is allocated on the stack. The number-part of the string is copied into it (which is ok) and the buffer gets its \0 termination. The atois extract the numbers. This will work because of the different start-locations used and the \0-termination after each part. Somehow tricky but nice. Some comments would have made everything clear.
These numbers are then inserted into the object. It should be valid since it is passed into the function by reference. I don't know if you can pass a reference to a deleted object but if this is the case, this might be your problem as well.
Anyway - except the missing check for the string-pointer, this function is sound and not the cause of your problem. It's only the place that throws the exception. Search for arguments that are passed into this function. Are they always valid? Do some logging.
I hope I didn't do any major mistakes as I am a Delphi programmer. If I did - feel free to comment.