COFF symbol table vs import/export/debug section - windows

As far as I understand, the COFF symbol table in Microsoft's Portable Executable format is used to store export, import and debug symbols. But since we already have the .edata, .idata and .debug sections for that purpose, why do we need another such structure?

See here: http://msdn.microsoft.com/en-us/library/ms809762.aspx
"[the COFF symbol table] is only used in OBJ files and PE files with COFF debug information."
"The .rdata section is used for at least two things. [...] (In TLINK32 EXEs, the debug directory is in a section named .debug.) [...] Three main types of debug information appear: CodeView®, COFF, and FPO."
"Why would anyone need COFF debug information when the much more complete CodeView information is available? If you intend to use the Windows NT system debugger (NTSD) or the Windows NT kernel debugger (KD), COFF is the only game in town."
In other words, the COFF symbol table is used only for debugging, only for the more primitive debuggers, and is typically placed inside the .debug (or .rdata) section.
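To check whether a given file carries a COFF symbol table at all, one can read the PointerToSymbolTable and NumberOfSymbols fields of the COFF file header. The following is a minimal sketch, assuming a well-formed PE image held in memory; the function name coff_symbol_table_info is made up for illustration. Both fields being zero means no COFF symbol table is present.

```python
import struct

def coff_symbol_table_info(image: bytes):
    """Return (PointerToSymbolTable, NumberOfSymbols) from a PE image.

    Hypothetical helper for illustration; it does no bounds checking
    beyond what struct itself enforces.
    """
    # e_lfanew (file offset of the PE signature) lives at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # COFF file header, 20 bytes: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    fields = struct.unpack_from("<HHIIIHH", image, pe_offset + 4)
    return fields[3], fields[4]
```

Typical use would be coff_symbol_table_info(open(path, "rb").read()), where path is whatever EXE or DLL you want to inspect.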

Related

How to generate ELF file format for JIT code for GDB?

Background:
I am writing a JIT compiler that generates x86-64 code. At the end of the JIT process I have a .text section, a .data section and a .eh_frame section (.eh_frame is used for stack unwinding). I am able to execute this JIT code successfully. The issue is GDB: I want to be able to debug this JIT code using GDB (specifically, GDB's 'backtrace' command should work).
Problem:
I need to tell GDB about this loaded JIT code (in particular I need to tell GDB about .eh_frame so it can use that frame for stack-unwinding). I see that GDB has a JIT interface: https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html
Possible Solutions:
There are two options here:
Hand over an ELF file to GDB
Write a custom JIT reader plugin to handle debugging of a custom object file.
Right now I have a custom object file (just a bunch of three independent sections loaded into memory). I don't want to write my own JIT reader plugin.
Blocking Issue:
Does anyone know of existing code that will help me package these three independent sections into a simple ELF file (which I can then register with GDB by calling __jit_debug_register_code())? I am guessing all I need to do is write a header (conforming to the ELF specification) which has the names of and pointers to the sections. Is there existing open-source code for this or, if not, can someone point me towards how to do this packaging myself?
I need a bare-minimum ELF file that keeps GDB happy (I don't need to load the ELF file, as the .text and .data sections are already loaded).
libelf could be of help for constructing an ELF object. There are open-source implementations available at:
elftoolchain (BSD licensed)
elfutils (GPL).
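For a sense of how little header data is involved, here is a sketch that lays out an ELF64 header, the section payloads, a .shstrtab and a section header table by hand. It is not taken from libelf and has not been tested against GDB. The function name build_min_elf, the choice of ET_DYN, and the use of SHT_NOBITS for sections that already live in memory are all assumptions; only the struct layouts follow the ELF specification.

```python
import struct

SHT_NULL, SHT_PROGBITS, SHT_STRTAB, SHT_NOBITS = 0, 1, 3, 8
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4
ET_DYN, EM_X86_64 = 3, 62

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, 64 bytes

def build_min_elf(sections, e_type=ET_DYN):
    """sections: list of (name, sh_type, sh_flags, sh_addr, data, size).

    Hypothetical helper. e_type=ET_DYN is an assumption, not something
    the GDB documentation prescribes.
    """
    names = b"\0"                        # string table starts with an empty name
    body = b""                           # section payloads, placed after the header
    shdrs = [SHDR.pack(0, SHT_NULL, 0, 0, 0, 0, 0, 0, 0, 0)]
    for name, sh_type, flags, addr, data, size in sections:
        name_off = len(names)
        names += name.encode() + b"\0"
        offset = EHDR.size + len(body)
        if sh_type != SHT_NOBITS:        # NOBITS sections occupy no file space
            body += data
        shdrs.append(SHDR.pack(name_off, sh_type, flags, addr,
                               offset, size, 0, 0, 1, 0))
    shstr_name = len(names)
    names += b".shstrtab\0"
    shdrs.append(SHDR.pack(shstr_name, SHT_STRTAB, 0, 0,
                           EHDR.size + len(body), len(names), 0, 0, 1, 0))
    body += names
    body += b"\0" * (-len(body) % 8)     # keep the header table 8-byte aligned
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\0" * 8  # 64-bit, LSB, version 1
    ehdr = EHDR.pack(ident, e_type, EM_X86_64, 1, 0, 0,
                     EHDR.size + len(body), 0, EHDR.size, 0, 0,
                     SHDR.size, len(shdrs), len(shdrs) - 1)
    return ehdr + body + b"".join(shdrs)
```

The resulting bytes would be what symfile_addr and symfile_size of a jit_code_entry point at before calling __jit_debug_register_code(). Whether GDB needs additional sections (a symbol table, for instance) to produce a useful backtrace is something to verify experimentally.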

Undefined reference to `WinMain' when compiling Nasm program on windows (MinGW)

I would like to compile the Hello World NASM example on windows.
I've pasted the code above into a main.asm file, and compiled it into an obj file with this command:
nasm -fwin32 .\main.asm -o main.obj
After that I wanted to compile this obj file to an exe, like this:
g++ .\main.obj -o main.exe -m32
But I get this error:
C:/Program Files (x86)/mingw-w64/i686-8.1.0-posix-dwarf-rt_v6-rev0/mingw32/bin/../lib/gcc/i686-w64-mingw32/8.1.0/../../../../i686-w64-mingw32/lib/../lib/libmingw32.a(lib32_libmingw32_a-crt0_c.o):crt0_c.c:(.text.startup+0x39): undefined reference to `WinMain@16'
What am I missing? How can I fix this error?
That Hello World program is trying to create the PE import table manually. In order for that to work, you need to instruct the linker carefully (the PE sections are not tied to PE directories, idata is just a name).
Further assumptions are made in that source (e.g. the base address of the image and the need for the CRT).
Honestly, it's just nonsense. Use the linker properly, as Jester has shown.
Being really honest, that whole Wikipedia section is just informational at best.
Long story short: never use Wikipedia as a programming tutorial.
EDIT: The x86-64 Linux example on the Wikipedia page has been updated by Peter Cordes; the others may still be misleading.
A bit of brief theory
You can create a 32-bit Windows console program mainly in two ways:
1. Use the C run time (CRT)
This lets you use the common C functions (above all printf).
There are two ways to use the CRT:
1.1 Statically
The object files resulting from the compilation of the CRT source code are linked with the object file resulting from compiling/assembling your source code.
The CRT code is embedded entirely in your application.
In this scenario your main function (main/WinMain/DllMain and their Unicode variants) is called by the CRT, which runs first thanks to a properly set PE entry-point.
To use this method you need the CRT object files; these ship with Visual Studio or MinGW (to name two).
The order of execution is: the Windows loader calls your PE entry-point, which is set to something like _mainCRTStartup; this initializes the CRT, and the CRT then calls your main function.
1.2 Dynamically
The main CRT DLL is msvcrt.dll for the version shipped with Windows, or msvcrXX0.dll for the version shipped with Visual Studio (where XX depends on the VS version).
The CRT DLL has its initialization and tear-down code in the DLL entry point, so just by listing it in the PE import table the CRT is automagically managed.
The order of execution is: the Windows loader loads your PE dependencies, including the CRT DLL (which gets initialised as described above), and then calls your PE entry-point.
2. Use only the Windows API
The Windows API is the set of functions exposed by the OS; these are what the CRT implementation ends up calling.
You can use the Windows API together with the CRT (the common scenario is a graphical application with the CRT statically linked and WinMain as the entry-point, where Windows API calls are intermixed with C utility functions) or the Windows API alone.
When using it alone you get a smaller, faster and easier-to-build executable.
To use 1.1 you need the CRT object files, and these are usually shipped with a compiler (they were once shipped with the Windows SDK, but now that VS is free Microsoft has moved them into the VS package - fair, but VS is orders of magnitude heavier than the SDK).
1.2 and 2 don't need these object files.
Note however that compatibility between compilers, assemblers and linkers may be a nasty beast, especially the .lib machinery for linking external APIs (basically, .lib files are a way to make the linker find the functions that will be resolved by the loader at runtime, i.e. those defined in an external DLL).
Hello, world!
Method 2
First, to write Hello, World! using method 2, see this other answer of mine.
It was written when a linker was available in the Windows SDK; today I use GoLink.
It is a minimalist, very easy to use, linker.
One key point of it is that it doesn't need the .lib files, instead you can pass it the path of the DLLs where the external functions reside.
The NASM command is the same, to link use:
golink /console /entry main c:\windows\system32\kernel32.dll hello.obj -fo hello.exe
Untested - optionally add /largeaddressaware if your code can handle that
That example is for 64-bit programming, it's more involved than a 32-bit one but may be useful anyway.
Method 1.2
This is what the Wikipedia article is trying to use.
Before analyzing that specific code, let me show how I'd write it:
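BITS 32

GLOBAL _main
EXTERN printf
EXTERN exit

SECTION .text

_main:
    push strHelloWorld      ; single argument for printf
    call printf
    add  esp, 04h           ; cdecl: the caller removes the argument
    push 0                  ; exit code
    call exit               ; does not return

SECTION .data

strHelloWorld db "Hello, world!", 13, 10, 0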
This is pretty straightforward compared to the Wiki's one.
To make an executable:
nasm -fwin32 helloworld.asm -o helloworld.obj
golink /console /entry _main c:\windows\system32\msvcrt.dll helloworld.obj -fo helloworld.exe
The Wikipedia code creates an .idata section that stores the PE Import Address Table.
This is a silly move: generating that table from the dynamic dependencies of the object files is the linker's job.
To make that program link we need to:
Tell the linker that the base address is 0x400000. This can be done with any linker (for golink use /base 0x400000).
Tell the linker that the entry-point is where the .text section starts. I don't know if link.exe can take .text as a valid symbol name or if it allows specifying an entry-point relative to .text, but that seems very unlikely. Golink won't allow that. In short, a label is probably missing.
Tell the linker to make the Import directory point to the .idata section. I'm not aware of any linker that would allow that (though one may exist).
In short, forget about it.
Method 1.1
This is what the link Jester pointed out is using.
The assembly code is the same as for 1.2 but you use MinGW for linking.

Source files missing from ELF symbol table - how to include them?

I am working on a project that was handed off to me, and some of the building and linking concepts are new to me. I have a makefile, several assembly and C source files, an ELF file and a binary file. When I load the ELF file onto my target, I am only able to step through the C files, not the assembly files.
When I do a readelf on the ELF file, I see that the assembly (.S) files are missing from the symbol table. Likewise, my debugger (RealView Debugger 4.1) doesn't list those .S files in the "sources from image" tree. I can see that some of the symbols from those files are included (i.e. label names) in my readelf output, but not the file type symbols themselves. I've been going over the makefile to try to spot what may be failing to include them, but I'm not sure what I'm looking for. Can anyone please point me in the right direction? Thanks!
You mentioned using the RealView debugger, so I'm making an educated guess that you have RVDS. If so, have you tried fromelf, the readelf equivalent that ships with RVDS? I have no way to confirm this now, but I recall there were subtle differences between the assembly code generated by the ARM compiler and by gcc.
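The "file type symbols" the question refers to are symbol table entries whose type is STT_FILE (readelf -s shows them with type FILE). As a rough sketch of what to look for, the snippet below scans raw symbol table bytes for such entries. It assumes a 32-bit little-endian ELF (as is common for ARM targets) and that the contents of .symtab and its string table have already been extracted by some other means; file_symbols is a made-up name.

```python
import struct

STT_FILE = 4                        # symbol type that names a source file
SYM32 = struct.Struct("<IIIBBH")    # Elf32_Sym: st_name, st_value, st_size,
                                    #            st_info, st_other, st_shndx

def file_symbols(symtab: bytes, strtab: bytes):
    """Return the names of all STT_FILE symbols (hypothetical helper)."""
    names = []
    for off in range(0, len(symtab) - SYM32.size + 1, SYM32.size):
        st_name, _, _, st_info, _, _ = SYM32.unpack_from(symtab, off)
        if st_info & 0xF == STT_FILE:            # low nibble holds the type
            end = strtab.index(b"\0", st_name)   # names are NUL-terminated
            names.append(strtab[st_name:end].decode())
    return names
```

If the .c files show up in this list and the .S files do not, the assemble step is the place to look in the makefile.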

Extract debugging information from a msys/mingw gcc built dll using rebase.exe?

I'm trying to analyze a mini crash dump and need symbol files in order to get more details about the crash. I'm currently just seeing:
"034eff74 0086eee9 00000000 0089d58d 034eff94 app_integrator!ZNK14ACE_Data_Block4baseEv+0x6"
Is it possible to extract debugging information from a msys/mingw gcc built dll into a windbg readable format? If not, is there any other way of getting more detailed information, like loading a MAP file in some way?
The dll and all the .o files it contains are built with the -g flag.
Windbg can't cope with the debugging information that will be generated by -g on a mingw installation. However, it can allegedly cope with COFF symbols.
If the source files for your DLL are small enough, you can probably get COFF debug information to build (-gcoff rather than -g).
So, Windbg can (allegedly) handle COFF symbols and GCC can generate them. It should be easy from there, right? I was trying to do exactly this with a Win32 executable generated by Visual Studio 2008 that was loading a gcc-compiled DLL. Unfortunately for me, compiling with -gcoff didn't work: Mingw's gcc won't generate COFF symbols for projects with more than 64k lines of code, and the DLL I was using was distinctly larger than 64K lines of code. Sadly, I have to admit I gave up and fell back on the trusty OutputDebugString; otherwise I'd be able to give more complete instructions. I didn't fancy investigating how to make gcc emit COFF symbols for larger source files, or the alternative of writing a debugging extension to parse DWARF or STABS data into windbg's internal symbol tables.
I fixed the issue, by the way!
Further suggestions can be found in this forum post at windbg.info.

The symbol package downloaded from MS site

I just downloaded the symbol package for WIN7 RTM, but my windbg still can't find the symbol information for RegQueryValueEx().
The windbg output says that the symbols for some of the OS DLLs are not provided in the pdb files, but how can I tell which ones are provided and which ones are not?
Specifically the symbol I am searching for is RegQueryValueEx();
Thanks.
Bin
You can watch your loaded modules and corresponding symbols using the lm command. However, since WinDbg doesn't load symbols until they are needed, you can do a .reload /f to force load of all symbols.
If the output from lm says (pdb symbols) for a given module, you have the correct public symbols for that module.

Resources