GDB debug information on scalar storage order

We have recently started using the GCC scalar_storage_order attribute for C structures which are shared between processors with different endianness.
The "problem" we are trying to solve is that it appears the debugger interprets the structure fields in the processor native scalar order (endianness).
Is there a way to include the endianness information for scalars in a structure in the debug information using GCC?
Does GDB support different endianness for specific structure definitions?
Please indicate if the question is not clear, and thanks for any information.
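For context, here is a minimal sketch of the attribute in question (available in GCC 6 and later); the struct and field names are illustrative, not taken from the original code:

#include <stdint.h>

/* Fields of wire_msg are stored big-endian regardless of the target's
   native byte order; GCC inserts the byte swaps on each access. */
struct __attribute__((scalar_storage_order("big-endian"))) wire_msg {
    uint16_t id;
    uint32_t payload;
};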

You are probably hitting this GCC issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82509, which was fixed in the as-yet-unreleased GCC 8.0.
As far as I understand, you can try applying that patch to your current GCC, and also applying the attached GDB patch, to solve the problem you are seeing: https://sourceware.org/ml/gdb-patches/2017-10/msg00266.html

Related

How to extract Linux kernel data objects statically?

I am trying to figure out the easiest way to extract kernel data objects using static analysis tools. I found CIL as one option, but it looks like it is tied closely to GCC and may not be feasible when we need to run it with a cross compiler. I wonder whether any other C parser would help me with such a task.
Could someone please recommend a tool or utility for static analysis of the kernel source code?
Have you tried exploring the parser in Clang? I have the same need; some people referred me to Clang, but I haven't had time to get into it yet.
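For what it's worth, exploring Clang's parser could start with the libclang C API. Here is a minimal, hedged sketch that dumps the name and type of every variable declaration it visits; the invocation is an assumption, and for kernel sources you would have to pass the same include paths and defines the kernel build uses:

#include <clang-c/Index.h>
#include <stdio.h>

/* Print the name and type of each variable declaration encountered. */
static enum CXChildVisitResult visit(CXCursor c, CXCursor parent, CXClientData d)
{
    (void)parent; (void)d;
    if (clang_getCursorKind(c) == CXCursor_VarDecl) {
        CXString name = clang_getCursorSpelling(c);
        CXString type = clang_getTypeSpelling(clang_getCursorType(c));
        printf("%s : %s\n", clang_getCString(name), clang_getCString(type));
        clang_disposeString(name);
        clang_disposeString(type);
    }
    return CXChildVisit_Recurse;
}

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    CXIndex idx = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(
        idx, argv[1], NULL, 0, NULL, 0, CXTranslationUnit_None);
    if (tu == NULL) return 1;
    clang_visitChildren(clang_getTranslationUnitCursor(tu), visit, NULL);
    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(idx);
    return 0;
}

This links against libclang (-lclang) and is only a starting point, not a complete extraction tool.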

Can I mix arm-eabi with arm-elf?

I have a product whose bootloader and application are compiled using a compiler (gnuarm GCC 4.1.1) that generates "arm-elf" output.
The bootloader and application are segregated in different FLASH memory areas in the linker script.
The application has a feature that enables it to call the bootloader (as a simple C function with 2 parameters).
I need to be able to upgrade existing products around the world, and I can do this safely as long as I always use the same compiler.
Now I'd like to be able to compile this product application using a new GCC version that outputs arm-eabi.
Everything will be fine for new products, where both application and bootloader are compiled using the same toolchain, but what happens with existing products?
If I flash a new application, compiled with GCC 4.6.x and arm-none-eabi, will my application still be able to call the bootloader function from the old arm-elf bootloader?
Furthermore, not directly related to the above question, can I mix object files compiled with arm-elf into a binary compiled with arm-eabi?
EDIT:
I think it is good to make clear that I am building for a bare-metal ARM7, if that makes any difference...
No. An ABI is the magic that makes binaries compatible. The Application Binary Interface determines various conventions on how to communicate with other libraries/applications. For example, an ABI will define calling convention, which makes implicit assumptions about things like which registers are used for passing arguments to C functions, and how to deal with excess arguments.
I don't know the exact differences between EABI and ABI, but you can find some of them by reading up on EABI. Debian's page mentions the syscall convention is different, along with some alignment changes.
Given the above, of course, you cannot mix arm-elf and arm-eabi objects.
The above answer is given on the assumption that you call into the bootloader code from your main application. Given that the interface may be very simple (just a function call with two parameters), it's possible that it might work. It'd be an interesting experiment to try. However, it is not guaranteed to work.
Please keep in mind you do not have to use EABI. You can generate an arm-elf toolchain with GCC 4.6 just as well as with older versions. Since you're using a binary toolchain on Windows, you may have more of a challenge. I'd suggest investigating crosstool-ng, which works quite well on Linux, and may work okay on Cygwin, to build the appropriate toolchain.
There is always the option of making the call to the bootloader in inline assembly (see the sketch after the list below), in which case you can adhere to any calling standard you need :).
However, besides the portability issue it introduces, this approach will also make two assumptions about your bootloader and application:
you are able to detect in your app that a particular device has a bootloader built with your non-EABI toolchain, as you can only call the older type bootloader using the assembly code.
the two parameters you mentioned are used as primitive data by your bootloader. Should the bootloader use them, for example, as pointers to structs, then you could face issues with incorrect alignment, padding and so forth.
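For what it's worth, here is a sketch of such a call for a bare-metal ARM7 (ARMv4T has no BLX <register>, hence the mov lr, pc / bx pair); the entry address, parameter types, and clobber list are assumptions about your particular bootloader:

#include <stdint.h>

#define BOOTLOADER_ENTRY 0x00000000u  /* hypothetical flash address of the entry point */

/* Place the two parameters in r0/r1 explicitly rather than relying on
   the compiler's idea of the calling convention, then branch to the
   fixed entry address. */
static void call_bootloader(uint32_t a, uint32_t b)
{
    register uint32_t r0 __asm__("r0") = a;  /* first parameter */
    register uint32_t r1 __asm__("r1") = b;  /* second parameter */

    __asm__ volatile (
        "mov lr, pc\n\t"   /* return address = instruction after bx */
        "bx  %2\n\t"       /* jump to the bootloader entry point */
        : "+r"(r0), "+r"(r1)
        : "r"(BOOTLOADER_ENTRY)
        : "r2", "r3", "r12", "lr", "cc", "memory");
}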
I think this will be OK. I did a migration something like this myself; from what I remember, the only problem I ran into was to do with handling division.
This is the best info I can find about the differences; it suggests that if you don't have struct alignment issues, you may be OK.

GCC: disguising between GCC versions

This question emerged from this question.
The problem is that there is an NVidia driver for Linux, compiled with GCC 4.5, while the kernel is compiled with GCC 4.6. The combination doesn't work because of the version number difference between the GCCs (the installer says the driver won't work - for details please visit the link above).
Could one disguise a binary compiled with GCC 4.5 as a binary compiled with GCC 4.6? If it is possible, under what circumstances would it work well?
Your problem is called ABI: Application Binary Interface. This is a set of rules covering (among other things) how functions in a piece of code receive their arguments (ordering, padding of types on the stack), the naming of functions so the linker can resolve symbols, and the padding/alignment of fields in structures.
GCC tries to keep the ABI stable between compiler versions but that's not always possible.
For example, GCC 4.4 fixed a bug in packed bit-fields, which means that old and new code can no longer read structures using this feature consistently. If you mixed code compiled before and after 4.4, data corruption would occur without any crashes.
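For illustration only (the exact layout change involved in the 4.4 fix may differ), this is the kind of declaration affected:

/* A packed bit-field: builds before and after the GCC 4.4 fix can
   disagree about where each field sits in memory, so data written by
   one build is silently misread by the other. */
struct flags {
    unsigned int a : 4;
    unsigned int b : 28;
} __attribute__((packed));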
There is no indication in the 4.6 release notes that the ABI was changed but that's something which the Linux kernel can't know - it just reads the compiler version used to compile the code and if the first two numbers change, it assumes that running the code isn't safe.
There are two solutions:
You can compile the NVidia driver with the same compiler as the kernel. This is strongly recommended.
You can patch the version string in the binary. This will trick the kernel into loading the module but at the risk of causing data corruption to internal data structures.

Suggestions on how to write a debug format conversion tool

I'm looking to write a tool that converts debug symbols of one format to another format that's compatible for use under GDB. This seems like a tedious and potentially complex project, so I'm not exactly sure how to tackle it.
Initially I'm aiming to convert the Turbo Debugger Symbol table (TDS) emitted by Borland compilers into something like STABS or DWARF format (DWARF seems to be preferred, from my research). But ideally I want to design my tool to be easy to extend, so it could convert other formats too later on, e.g. CodeView 4 or maybe even PDB.
My primary motivations for creating this are:
Interoperability. If I can convert a foreign debug format into a form GDB can work with, then source-level debugging would be possible on binaries produced by compilers other than GCC. This means any frontend debugging interface that uses GDB as a backend will work as well.
No other tools exist. I did some Googling for similar tools, and the closest I've found is tds2dbg. But it doesn't quite do what I'm looking for.
What I have to work with at the moment:
I already have a debug hook API that can understand the TDS debug format. I can use that to help me get at the needed information from the source format I'm converting from.
For the scope of this project, I'm mainly interested in getting this to work under the win32 environment. Other platforms and tools I'm not really concerned about.
The target DWARF debug format I'm converting to. This one I'm really not familiar with at all. I have used GCC-based compilers like MinGW before and debugged with GDB using the DWARF format, but I don't have any idea how this format is implemented on Windows.
The last point is the one I'm concerned about. I'm reading through the DWARF spec documentation, but I find I'm having trouble really comprehending how it works. There's so much detail in there, but at the same time it has no details about how DWARF is implemented in object files and image files on a platform that doesn't use ELF natively -- namely, the PE-COFF format that Windows uses. The documentation is also a very dry read; long sentences make it hard to understand, and diagrams and illustrations are sparse. I came across an API called libdwarf that should take most of the parsing work out of interpreting DWARF. The problem is I'm still trying to get it to build, and I don't know yet how it will work out.
I haven't written any code yet, since I don't fully understand what it is I need to build. I have a feeling the biggest hurdle will be figuring out how to work with DWARF, due to its complexity. Googling for information on how DWARF works under Windows hasn't turned up anything helpful either. For example, there's no information about the 'glue' code that's needed to contain DWARF within a PE executable image file. How exactly are the DWARF sections laid out? Is there any header information for each section? GDB clearly doesn't just take a 'raw' DWARF debug file and use it as is. So what kind of format does GDB expect the debug file to be in for it to be able to work with it?
My question is, how can I start on such a project? More importantly, where can I turn to for help when I inevitably get stuck on a problem?
Affinic Assembler for Windows
Affinic Assembler is an x86/x86-64 assembler for Windows that takes GAS-syntax assembly source with DWARF debug information and generates corresponding CodeView format sections in the object file, in order to make the linked program debuggable in Visual Studio. This program is good for Cygwin and MinGW users who want to port Linux code to Windows.
http://www.affinic.com/?page_id=48
You are asking several questions here :-)
I think you are heading in the right direction, using libdwarf.
BUT, have you taken a look at objcopy to see whether that tool can do some of the work for you? It probably doesn't support Borland, PDB or CodeView 4, but it might be worth looking into. (Another approach may be to extend objcopy to support the formats you are trying to convert between.)
I have sometimes used the dwarf-discuss mailing list when I have become stuck.
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
As for the questions on DWARF, split them into separate questions and I will do my best to answer them. :-)

Lua compiled scripts on Mac OS X - Intel vs PPC

Been using Lua 5.0 in a Mac OS X universal binary app for some years. Lua scripts are compiled using luac and the compiled scripts are bundled with the app. They have worked properly in Tiger and Leopard, Intel or PPC.
To avoid library problems at the time, I simply added the Lua src tree to my Xcode project and compiled as is, with no problems.
It was time to update to a more modern version of Lua so I replaced my source tree with that of 5.1.4. I rebuilt luac using make macosx (machine is running Leopard on Intel).
Uncompiled scripts work properly in Tiger and Leopard, Intel and PPC, as always.
However, now compiled scripts fail to load on PPC machines.
So I rebuilt luac with the 'ansi' flag, and recompiled my scripts. Same error. Similarly, a build flag of 'generic' produced no joy.
Can anyone please advise on what I can do next?
Lua's compiled scripts are pretty much the raw bytecode dumped out after a short header. The header documents some of the properties of the platform used to compile the bytecode, but the loader only verifies that the current platform has the same properties.
Unfortunately, this creates problems when loading bytecode compiled on another platform, even if compiled by the very same version of Lua. Of course, scripts compiled by different versions of Lua cannot be expected to work, and since the version number of Lua is included in the bytecode header, the attempt to load them is caught by the core.
The simple answer is to just not compile scripts. If Lua compiles the script itself, you only have to worry about possible version mismatches between Lua cores in your various builds of your application, and that isn't hard to deal with.
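As a sketch of what that looks like against the Lua 5.1 C API: loading the plain-text source means the chunk is compiled by the running core itself, so byte order never enters into it (the helper function here is mine, not part of Lua):

#include <stdio.h>
#include <lua.h>
#include <lauxlib.h>

/* Load and run a plain-text script; luaL_loadfile compiles it with the
   running core, so the result always matches the host platform. */
int run_script(lua_State *L, const char *path)
{
    if (luaL_loadfile(L, path) != 0 || lua_pcall(L, 0, 0, 0) != 0) {
        fprintf(stderr, "lua: %s\n", lua_tostring(L, -1));
        lua_pop(L, 1);  /* remove the error message */
        return -1;
    }
    return 0;
}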
Actually supporting full cross-compatibility for compiled bytecode is not easy. In that email, Mike Pall identified the following issues:
Endianness: swap on output as needed.
sizeof(size_t), affects huge string constants: check for overflow when downgrading.
sizeof(int), affects MAXARG_Bx and MAXARG_sBx: check for overflow when downgrading.
typeof(lua_Number): easy in C, but only when the host and the target follow the same FP standard; precision loss when upgrading (rare case); warn about non-integer numbers when downgrading to int32.
From all the discussions that I've seen about this issue on the mailing list, I see two likely viable approaches, assuming that you are unwilling to consider just shipping the uncompiled Lua scripts.
The first would be to fix the byte order as the compiled scripts are loaded. That turns out to be easier to do than you'd expect, as it can be done by replacing the low-level function that reads the script file without recompiling the core itself. In fact, it can even be done in pure Lua, by supplying your own chunk reader function to lua_load(). This should work as long as the only compatibility issue over your platforms is byte order.
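A minimal sketch of that idea against the Lua 5.1 C API; the byte-swapping pass itself is left as a hypothetical stub (fixup_byte_order), since it depends on the dump format details discussed below:

#include <stddef.h>
#include <lua.h>

struct chunk { char *data; size_t size; int done; };

/* lua_Reader callback: hand the whole (already fixed-up) buffer to
   lua_load in one piece, then signal end-of-chunk. */
static const char *reader(lua_State *L, void *ud, size_t *size)
{
    struct chunk *c = (struct chunk *)ud;
    (void)L;
    if (c->done) { *size = 0; return NULL; }
    c->done = 1;
    *size = c->size;
    return c->data;
}

int load_fixed_chunk(lua_State *L, struct chunk *c, const char *name)
{
    /* fixup_byte_order(c->data, c->size);  hypothetical swap pass */
    return lua_load(L, reader, c, name);
}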
The second is to patch the core itself to use a common representation for compiled scripts on all platforms. This has been described as possible by Luiz Henrique de Figueiredo:
....
I'm convinced that the best route to byte order or cross-compiling is third-party dump/undump pairs. The files ldump.c and lundump.c are completely replaceable; they export a single, well-defined entry point. The format of precompiled chunks is not sacred at all; you can use any format, as long as ldump.c and lundump.c agree about it. (For instance, Rici Lake is considering writing a text format for precompiled chunks.)
....
Personally, I'd recommend giving serious consideration to not pre-compiling the scripts and thus avoid the platform portability issues entirely.
Edit: I've updated my description of the bytecode header thanks to lhf's comment. I hadn't read this part of the Lua source yet, and I probably should have checked it before being quite so assertive about what information is or is not present in the header.
Here is the fragment from lundump.c that forms a copy of the header matching the running platform for comparison to the bytecode being loaded. It is simply compared with memcmp() for an exact match to the header from the file, so any mismatch will cause the stock loader (luaU_undump()) to reject the file.
/*
** make header
*/
void luaU_header (char* h)
{
 int x=1;
 memcpy(h,LUA_SIGNATURE,sizeof(LUA_SIGNATURE)-1);
 h+=sizeof(LUA_SIGNATURE)-1;
 *h++=(char)LUAC_VERSION;
 *h++=(char)LUAC_FORMAT;
 *h++=(char)*(char*)&x;             /* endianness */
 *h++=(char)sizeof(int);
 *h++=(char)sizeof(size_t);
 *h++=(char)sizeof(Instruction);
 *h++=(char)sizeof(lua_Number);
 *h++=(char)(((lua_Number)0.5)==0); /* is lua_Number integral? */
}
As can be seen, the header is 12 bytes long and contains a signature (4 bytes, "<esc>Lua"), version and format codes, a flag byte for endianness, sizes of the types int, size_t, Instruction, and lua_Number, and a flag indicating whether lua_Number is an integral type.
This allows most platform distinctions to be caught, but doesn't attempt to catch every way in which platforms can differ.
I still stand by the recommendations made above: first, ship compilable sources; or second, customize ldump.c and lundump.c to store and load a common format, with the additional note that any custom format should redefine the LUAC_FORMAT byte of the header so as to not be confused with the stock bytecode format.
You may want to use a patched bytecode loader that supports different endianness.
See this.
I would have commented on RBerteig's post, but I apparently don't have enough reputation yet to be able to do so. In working on bringing LuaRPC up to speed with Lua 5.1.x AND making it work with embedded targets, I've been modifying the ldump.c and lundump.c sources to make them both a bit more flexible. The embedded Lua project (eLua) already had some of the patches you can find on the Lua list, but I've added a bit more to make lundump a little more friendly to scripts compiled on different architectures. There's also cross-compilation support provided so that you can build for targets differing from the host system (see luac.c in the same directory as the links below).
If you're interested in checking out the modifications, you can find them in the eLua source repository:
http://svn.berlios.de/wsvn/elua/trunk/src/lua/lundump.c
http://svn.berlios.de/wsvn/elua/trunk/src/lua/lundump.h
http://svn.berlios.de/wsvn/elua/trunk/src/lua/ldump.c
Standard Disclaimer:
I make no claim that the modifications are perfect or work in every situation. If you use them and find anything broken, I'd be glad to hear about it so that it can be fixed.
Lua bytecode is not portable. You should ship source scripts with your application.
If download size is a concern, they are generally shorter than the bytecode form.
If intellectual property is a concern, you can use a code obfuscator, and keep in mind that disassembling Lua bytecode is anything but difficult.
If loading time is a concern, you can precompile the sources locally in your installation script.
I conjecture that you compiled the scripts on an Intel box.
Compiled scripts are wildly unportable. If you really want to precompile scripts, you'll need to include two versions of each compiled script: one for Intel and one for PPC. Your app will have to interrogate which processor it's running on and use the correct compiled script.
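A sketch of that runtime check, reusing the same byte-order probe that lundump.c's header code uses; the chunk file names are hypothetical:

/* Pick the precompiled chunk matching the host's byte order. */
static const char *chunk_for_host(void)
{
    int x = 1;
    return *(char *)&x ? "script.le.luac"   /* little-endian: Intel */
                       : "script.be.luac";  /* big-endian: PPC */
}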
I don't have enough reputation to comment, so I have to provide this as an answer instead even though it's not an appropriate answer to the question asked. Sorry.
There is a Lua obfuscator available here:
http://www.capprime.com/CapprimeLuaObfuscator/CapprimeLuaObfuscator.aspx
Full disclosure: I am the author of the obfuscator and I am aware it is not perfect. Feedback is welcome and encouraged (there is a feedback page available from the above page).
