How to generate an ELF file for JIT code for GDB?

Background:
I am generating JIT code (which produces x86-64 machine code). At the end of the JIT process I have a .text section, a .data section and an .eh_frame section (.eh_frame is used for stack unwinding). I am able to execute this JIT code successfully. But the issue is GDB: I want to be able to debug this JIT code using GDB (specifically, GDB's 'backtrace' command should work).
Problem:
I need to tell GDB about this loaded JIT code (in particular, I need to tell GDB about .eh_frame so it can use that information for stack unwinding). I see that GDB has a JIT interface: https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html
Possible Solutions:
There are two options here:
Hand over an ELF file to GDB
Write a custom JIT reader plugin to handle debugging of a custom object file.
Right now I have a custom object file (just a bunch of three independent sections loaded into memory). I don't want to write my own JIT reader plugin.
Blocking Issue:
Does anyone know of existing code that will help me package these three independent sections into a simple ELF file (which I can then register with GDB by calling __jit_debug_register_code())? I am guessing all I need to do is write some headers (conforming to the ELF specification) containing the names of and pointers to the sections. Is there existing open-source code for this, or if not, can someone point me towards how to do this packaging myself?
I need a bare-minimum ELF file so that GDB is happy (I don't need to load the ELF file, as the .text and .data sections are already loaded).

libelf can help you construct an ELF object. There are open-source implementations available:
elftoolchain (BSD licensed)
elfutils (GPL).
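For illustration, here is a rough, untested sketch (in C, using libelf) of how the three already-loaded sections could be wrapped into a minimal ELF object. The buffer names, sizes and the output path are placeholders for your own pointers, and most error checking is omitted:

/* Untested sketch: wrap already-generated JIT sections into a minimal ELF
 * object with libelf so it can be handed to GDB's JIT interface. */
#include <elf.h>
#include <err.h>
#include <fcntl.h>
#include <libelf.h>
#include <stddef.h>
#include <unistd.h>

/* Section header string table; the sh_name values below are offsets into it. */
static char shstrtab[] = "\0.text\0.data\0.eh_frame\0.shstrtab";

static Elf_Scn *add_section(Elf *e, void *buf, size_t size, size_t name_off,
                            Elf64_Word type, Elf64_Xword flags, Elf64_Addr addr)
{
    Elf_Scn *scn = elf_newscn(e);
    Elf_Data *data = elf_newdata(scn);
    Elf64_Shdr *shdr = elf64_getshdr(scn);

    data->d_buf = buf;
    data->d_size = size;
    data->d_type = ELF_T_BYTE;
    data->d_version = EV_CURRENT;
    data->d_align = 16;

    shdr->sh_name = name_off;    /* offset into .shstrtab */
    shdr->sh_type = type;
    shdr->sh_flags = flags;
    shdr->sh_addr = addr;        /* where the section already lives in memory */
    return scn;
}

void write_jit_elf(const char *path,
                   void *text, size_t text_size,
                   void *rwdata, size_t data_size,
                   void *eh_frame, size_t eh_size)
{
    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(1, "libelf initialization failed: %s", elf_errmsg(-1));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    Elf *e = elf_begin(fd, ELF_C_WRITE, NULL);

    Elf64_Ehdr *ehdr = elf64_newehdr(e);
    ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
    ehdr->e_machine = EM_X86_64;
    ehdr->e_type = ET_REL;       /* a bare relocatable object, nothing to load */
    ehdr->e_version = EV_CURRENT;

    /* String-table offsets: 1 = .text, 7 = .data, 13 = .eh_frame, 23 = .shstrtab */
    add_section(e, text, text_size, 1, SHT_PROGBITS,
                SHF_ALLOC | SHF_EXECINSTR, (Elf64_Addr)text);
    add_section(e, rwdata, data_size, 7, SHT_PROGBITS,
                SHF_ALLOC | SHF_WRITE, (Elf64_Addr)rwdata);
    add_section(e, eh_frame, eh_size, 13, SHT_PROGBITS,
                SHF_ALLOC, (Elf64_Addr)eh_frame);

    Elf_Scn *strscn = add_section(e, shstrtab, sizeof(shstrtab), 23,
                                  SHT_STRTAB, 0, 0);
    ehdr->e_shstrndx = elf_ndxscn(strscn);

    if (elf_update(e, ELF_C_WRITE) < 0)
        errx(1, "elf_update failed: %s", elf_errmsg(-1));
    elf_end(e);
    close(fd);
}

GDB's JIT interface then wants the image in the inferior's memory: point a struct jit_code_entry's symfile_addr/symfile_size at a copy of those bytes, link it into __jit_debug_descriptor and call __jit_debug_register_code(). Whether setting sh_addr on the section headers is enough for backtrace to work, or whether GDB needs more (e.g. a symbol table), may take some experimentation; treat the above as a starting point only.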

Related

Is it possible to use any program as a library?

I'm trying to create some debug scripts for compiled programs. For this, I'm trying to create something where I prepare my variables in some code I generate and then jump into another program.
Is there a way to do that? For example, by having some C code and then jumping to a label or place in the executable. For now I'm focusing on ELF programs, but if something exists on Windows I'm also interested!
Thanks!
I've tried converting the ELF file back into a .s for GCC and recompiling; however, this doesn't seem to work well for all ELF files (e.g. non-PIE binaries). I've also looked for tools that would create a .s, but they are either buggy, incomplete or both.

Undefined reference to `WinMain' when compiling a NASM program on Windows (MinGW)

I would like to compile the Hello World NASM example on Windows.
I've pasted the code above into a main.asm file, and compiled it into an obj file with this command:
nasm -fwin32 .\main.asm -o main.obj
After that I wanted to compile this obj file to an exe, like this:
g++ .\main.obj -o main.exe -m32
But I get this error:
C:/Program Files (x86)/mingw-w64/i686-8.1.0-posix-dwarf-rt_v6-rev0/mingw32/bin/../lib/gcc/i686-w64-mingw32/8.1.0/../../../../i686-w64-mingw32/lib/../lib/libmingw32.a(lib32_libmingw32_a-crt0_c.o):crt0_c.c:(.text.startup+0x39): undefined reference to `WinMain#16'
What am I missing? How can I fix this error?
That Hello World program is trying to create the PE import table manually. In order for that to work, you need to instruct the linker carefully (the PE sections are not tied to PE directories; .idata is just a name).
Further assumptions are made in that source (e.g. the base address of the image and the need for the CRT).
Honestly, it's just nonsense. Use the linker properly, as Jester showed.
Being really honest, that whole Wikipedia section is just informational at best.
Long story short: never use Wikipedia as a programming tutorial.
EDIT: The x86-64 Linux example on the Wikipedia page has been updated by Peter Cordes; the others may still be misleading.
A bit of brief theory
You can create a 32-bit Windows console program mainly in two ways:
Use the C run time (CRT)
This lets you use the common C functions (above all printf).
There are two ways to use the CRT:
Statically
The object files resulting from the compilation of the CRT source code are linked with the object file resulting from the compilation/assembling of your source code.
The CRT code is embedded entirely in your application.
In this scenario your main function (main/WinMain/DllMain and Unicode variants) is called by the CRT, which runs first thanks to a properly set PE entry-point.
In order to use this method you need the CRT object files; these can be found with Visual Studio or MinGW (to name two).
The order of execution is: the Windows loader calls your PE entry-point, which is set to something like _mainCRTStartup; that initializes the CRT, and the CRT then calls your main function.
Dynamically
The main CRT DLL is msvcrt.dll for the version shipped with the Windows installation, or msvcrXX0.dll for the version shipped with a Visual Studio installation (where XX depends on the VS version).
The CRT DLL has its initialization and tear-down code in the DLL entry-point, so by just putting it in the PE import table the CRT is automagically managed.
The order of execution is: the Windows loader loads your PE dependencies, including the CRT DLL (which gets initialised as described above), and then calls your PE entry-point.
Use only the Windows API
The Windows API is the set of functions exposed by the OS; it is what the CRT implementation ends up calling.
You can use the Windows API together with the CRT (the common scenario is a graphical application that has the CRT statically linked and uses WinMain as the entry-point, with Windows API calls intermixed with C utility functions), or the Windows API alone.
When using it alone you get a smaller, faster and easier-to-build executable.
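Just to make "use only the Windows API" concrete, here is an illustrative C sketch of such a program (this is not the code from the answer linked below; the build flags in the comment are an assumption and depend on the toolchain):

/* Windows-API-only console program: no CRT at all.
 * With MinGW the build is roughly something like
 *   gcc -m32 -nostdlib hello.c -lkernel32 -o hello.exe -Wl,-e,_entry@0
 * (flags and entry-symbol decoration are assumptions; adjust per toolchain). */
#include <windows.h>

void __stdcall entry(void)            /* used directly as the PE entry-point */
{
    static const char msg[] = "Hello, world!\r\n";
    DWORD written;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    WriteFile(out, msg, sizeof(msg) - 1, &written, NULL);
    ExitProcess(0);                   /* no CRT, so end the process explicitly */
}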
To use 1.1 you need the CRT object files, and these are usually shipped with a compiler (they were once shipped with the Windows SDK, but now that VS is free Microsoft has moved them into the VS package - fair, but VS is orders of magnitude heavier than the SDK).
1.2 and 2 don't need these object files.
Note however that compiler/assembler/linker compatibility can be a nasty beast, especially the .lib machinery for linking external APIs (basically .lib files are a way to make the linker find the functions that will be resolved by the loader at runtime, i.e. those defined in an external DLL).
Hello, world!
Method 2
First, to write Hello, World! using method 2, see this other answer of mine.
It was written when a linker was available in the Windows SDK; today I use GoLink.
It is a minimalist, very easy to use linker.
One key point is that it doesn't need .lib files; instead you can pass it the paths of the DLLs where the external functions reside.
The NASM command is the same; to link, use:
golink /console /entry main c:\windows\system32\kernel32.dll hello.obj -fo hello.exe
Untested - optionally add /largeaddressaware if your code can handle that.
That example is for 64-bit programming; it's more involved than a 32-bit one but may be useful anyway.
Method 1.2
This is what the Wikipedia article is trying to use.
Before analyzing that specific code, let me show how I'd write it:
BITS 32
GLOBAL _main
EXTERN printf
EXTERN exit
SECTION .text
_main:
    push strHelloWorld      ; cdecl: argument pushed on the stack
    call printf
    add esp, 04h            ; cdecl: the caller cleans up the stack
    push 0
    call exit               ; exit(0) - never returns
SECTION .data
strHelloWorld db "Hello, world!", 13, 10, 0
This is pretty straightforward compared to the Wiki's one.
To make an executable:
nasm -fwin32 helloworld.asm -o helloworld.obj
golink /console /entry _main c:\windows\system32\msvcrt.dll helloworld.obj -fo helloworld.exe
Wikipedia's code creates an .idata section that stores the PE Import Address Table.
This is a silly move; the linker is supposed to generate that table based on the dynamic dependencies of the object files.
To make that program link we need to:
Tell the linker that the base address is 0x400000. This can be done with any linker (for golink use /base 0x400000).
Tell the linker that the entry-point is where the .text section starts. I don't know if link.exe can take .text as a valid symbol name or if it allows specifying an entry-point relative to .text, but that seems very unlikely. GoLink won't allow it. In short, a label is probably missing.
Tell the linker to make the import directory point to the .idata section. I'm not aware of any linker that allows that (though one may exist).
In short, forget about it.
Method 1.1
This is what the link Jester pointed to uses.
The assembly code is the same as for 1.2 but you use MinGW for linking.

Generate library from ELF file

I'm trying to generate a static library from a compiled ELF file.
Previously, I've been able to generate the library by compiling my source code to object files, then passing those objects to avr-ar to successfully create my library. In order to reduce the project's code size, I've switched over to using link-time optimisation to save ~1.5 kB of space - however, in order to do so I end up passing all my source and header files to avr-gcc in one invocation and it spits out a .elf file.
I can't seem to get the -flto option working with the linker (I'm using a custom linker script) and compiler driver, otherwise I'd have the object files I need.
Is it possible to take this generated .elf and push it through ar to generate a library?
Problem Context:
This is related to this problem. I've written the shared libraries and bootloader section, and am using this linker script to set out my flash space. Here's the Makefile that drives all this - it's very hacked together.
Ideally, what I'd like to happen is to be able to compile my src/ directory to separate object files in obj/, all with link-time optimisation enabled to cut down on code space as much as possible but still leaving unused functions in the output (the shared library that is stored in flash is not fully utilised by the bootloader application, but may be linked against by the loaded applications). I'd then like to be able to link those objects together to create a .elf and libbootloader.a. The elf is then used to generate a binary to flash to my AVR, and the bootloader library is referenced when building user applications that refer to the library already stored in flash space. (Perhaps I want to just link against a list of symbols referencing the shared library section?)
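For reference, the usual GCC workflow for keeping per-file object files while still using LTO is to compile each source separately with -flto (optionally adding -ffat-lto-objects so the objects also contain regular, non-LTO code) and then archive them with the gcc-ar wrapper so that ar picks up the LTO plugin. The file names below are only illustrative:
avr-gcc -Os -flto -ffat-lto-objects -c src/foo.c -o obj/foo.o
avr-gcc-ar rcs libbootloader.a obj/foo.o obj/bar.o
avr-gcc -Os -flto obj/foo.o obj/bar.o -o bootloader.elf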

How to detect code compiled with LTO?

Is there any way to detect whether code was compiled with -flto?
An example would be a classic library or executable under Linux, compiled with GCC (4.9.1), without debug info.
Considering that LTO information is stored in several ELF sections inside object files (see LTO file sections), you could try and see what readelf returns (as used for instance in this answer).
Look for .gnu.lto_.xxx entries.
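For example (the object name here is illustrative), listing the section headers makes this easy to check:
readelf -SW foo.o | grep '\.gnu\.lto'
An object built with -flto will show .gnu.lto_* sections; one built without LTO will not.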

Codewarrior debugger not showing C source after compiling some code and data into a new ELF section

In our project, we are building an ELF file and a partially linked file (PLF) which is converted to a proprietary format and loaded into memory after the ELF is loaded. We use Codewarrior to run and debug, which has been working just fine (the C++ source code is always available to step through when debugging).
I've recently made a change where some code and data are compiled into a different section in the PLF file (.init, which was previously empty). Now, when debugging, a majority of the files are available only in assembler. When I re-build, no longer using .init, we can step through C++ source code again.
Does anyone know why this would be the case?
why this would be the case
One reason could be that CodeWarrior does not expect to find code in the .init section.
You are unlikely to get a good answer here. Try the CodeWarrior support forums.
I got this working by switching the order of the sections in the linker command file (.lcf) so that the .init section comes second, after .text. I guess, as Employed Russian suggests, CodeWarrior is surprised by having code in .init and craps out. Changing the order of the sections seems to have no ill effects, and debugging now works as expected again.
