What is the difference between "gcc -s" and a "strip" command?

I wonder what is the difference between these two:
gcc -s: Remove all symbol table and relocation information from the executable.
strip: Discard symbols from object files.
Do they have the same meaning?
Which one do you use to:
reduce the size of executable?
speed up its running?

gcc being a compiler/linker, its -s option is something done while linking. It's also not configurable - it has a set of information which it removes, no more no less.
strip is something which can be run on an object file which is already compiled. It also has a variety of command-line options which you can use to configure which information will be removed. For example, -g strips only the debug information which gcc -g adds.
Note that strip is not a bash command, though you may be running it from a bash shell. It is a command totally separate from bash, part of the GNU binary utilities suite.
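To see the difference on your own machine, here is a minimal sketch that builds a trivial program three ways and compares the file sizes. It assumes gcc and GNU strip are on your PATH; the file names (demo.c, plain, linked_s, ...) and the helper measure_sizes are made up for illustration, and the function simply returns None if the tools are missing or fail.

```python
import os
import shutil
import subprocess
import tempfile

C_SOURCE = "int main(void) { return 0; }\n"  # hypothetical test program


def measure_sizes():
    """Return {variant: size in bytes}, or None if gcc/strip are unavailable."""
    if not (shutil.which("gcc") and shutil.which("strip")):
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "demo.c")
        with open(src, "w") as f:
            f.write(C_SOURCE)
        plain = os.path.join(tmp, "plain")
        linked_s = os.path.join(tmp, "linked_s")
        debug_only = os.path.join(tmp, "debug_only")
        stripped = os.path.join(tmp, "stripped")
        commands = [
            ["gcc", "-g", "-o", plain, src],           # symbols + debug info
            ["gcc", "-g", "-s", "-o", linked_s, src],  # stripped at link time
            ["strip", "-g", "-o", debug_only, plain],  # drop debug info only
            ["strip", "-o", stripped, plain],          # drop all symbols
        ]
        try:
            for cmd in commands:
                subprocess.run(cmd, check=True, capture_output=True)
        except (subprocess.CalledProcessError, OSError):
            return None
        return {
            "unstripped": os.path.getsize(plain),
            "gcc -s": os.path.getsize(linked_s),
            "strip -g": os.path.getsize(debug_only),
            "strip": os.path.getsize(stripped),
        }


sizes = measure_sizes()
if sizes is None:
    print("gcc or strip not available, nothing measured")
else:
    for variant, size in sizes.items():
        print(f"{variant:>10}: {size} bytes")
```

The exact numbers depend on your toolchain and platform, so treat the output as a rough comparison rather than a benchmark.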

The accepted answer is very good, but just to complement it on your further questions (and as a reference for anyone who ends up here):
What's the equivalent to gcc -s in terms of strip with some of its options?
They both do the same thing: they remove the symbol table completely. However, as @JimLewis pointed out, strip allows finer control. For example, in a relocatable object, strip --strip-unneeded won't remove its global symbols, whereas strip or strip --strip-all would remove the complete symbol table.
Which one do you use to reduce the size of executable and speed up its running
The symbol table is a non-allocated section of the binary, which means it is never loaded into RAM. It stores information that can be useful for debugging purposes, for instance to print a stack trace when a crash happens. Removing the symbol table can make sense when storage capacity is seriously constrained (in that regard, gcc -Os -s or make CXXFLAGS="-Os -s" ... is useful, as it produces a smaller, slower binary that is also stripped to reduce its size further). For the reasons given above, I don't think removing the symbol table would result in a speed gain.
Lastly, I recommend this link about stripping shared objects: http://www.technovelty.org/linux/stripping-shared-libraries.html

"gcc -s" removes the relocation information along with the symbol table, which is not done by "strip". Note that removing relocation information would have some effect on address space layout randomization. See this link.

They do similar things, but strip allows finer-grained control over what gets removed from the file.

Related

GCC Linker : how to generate a report of per file contribution on output sections

Recently I ran into a problem when trying to link my program. The linker reports that .text can't fit in the specified memory region. Obviously the source code has grown too large to be linked into the limited memory region.
What I want to do now is analyze which files contribute most to the ".text" section, so that follow-up code optimization can be performed. I tried many approaches but none of them worked.
nm -S output.elf gives the size of each symbol, but doesn't group the symbols by source file.
Running nm -S on every object file doesn't work because -fdata-sections, -ffunction-sections and -Wl are specified, so not all of an object file's content is linked into the final output.
readelf -s output.elf gives information file by file, but it simply lists the symbols under each file and their sizes. A script can be written to sum the sizes under each file, but the total seems wrong: some symbols may point to the same memory location, so the same memory region may be counted many times.
When gcc links, it must know exactly what is extracted from each object file and placed in the output section, but it doesn't seem to provide a switch to generate a detailed report (or am I missing something?).
Is there any tool which can do this job?
Perhaps --gc-sections together with --print-gc-sections and/or --print-map-discarded?
If everything is in separate sections, then you have all your sections as input, plus the list of discarded sections. A simple script can then produce the list of used sections, their sizes and their file mapping.
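As a rough sketch of such a script, the following sums the .text input sections per object file from a GNU ld map (as written by -Wl,-Map=...). The embedded SAMPLE_MAP, its addresses and the file names are made up for illustration; real maps contain more line shapes (for example *fill* padding), which this sketch simply ignores. It skips the "Discarded input sections" part and counts each input section once, which avoids the double counting you get when summing symbols.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in the layout GNU ld uses for its map files.
SAMPLE_MAP = "\n".join([
    "Discarded input sections",
    "",
    " .text.unused   0x0000000000000000       0x30 build/uart.o",
    "",
    "Linker script and memory map",
    "",
    ".text           0x0000000008000000      0x1a4",
    " .text.main     0x0000000008000000       0x40 build/main.o",
    "                0x0000000008000000                main",
    " .text.uart_init",
    "                0x0000000008000040       0x64 build/uart.o",
    "                0x0000000008000040                uart_init",
    " .text.uart_send",
    "                0x00000000080000a4       0x80 build/uart.o",
    " .text.crc32    0x0000000008000124       0x80 libutil.a(crc.o)",
])

HEX = r"0x[0-9a-fA-F]+"
# " .text.foo  <address>  <size>  <file>" on a single line
ONE_LINE = re.compile(rf"^ (\.text\S*)\s+({HEX})\s+({HEX})\s+(\S.*)$")
# Long section names are printed alone, with the numbers on the next line.
NAME_ONLY = re.compile(r"^ (\.text\S*)\s*$")
CONTINUATION = re.compile(rf"^\s+({HEX})\s+({HEX})\s+(\S.*)$")


def text_bytes_per_file(map_text):
    """Sum the sizes of .text* input sections per object file."""
    totals = defaultdict(int)
    in_memory_map = False  # ignore everything before the memory map
    pending = False        # True right after a name-only line
    for line in map_text.splitlines():
        if line.startswith("Linker script and memory map"):
            in_memory_map = True
            continue
        if not in_memory_map:
            continue
        if pending:
            pending = False
            m = CONTINUATION.match(line)
            if m:
                totals[m.group(3)] += int(m.group(2), 16)
                continue
        m = ONE_LINE.match(line)
        if m:
            totals[m.group(4)] += int(m.group(3), 16)
        elif NAME_ONLY.match(line):
            pending = True
    return dict(totals)


per_file = text_bytes_per_file(SAMPLE_MAP)
for name, size in sorted(per_file.items(), key=lambda kv: -kv[1]):
    print(f"{size:8d}  {name}")
```

On a real project you would read the map file from disk instead of using SAMPLE_MAP, and probably extend the patterns to other sections such as .rodata.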

What do I do with a SIGFPE address in gdb?

While running an executable in gdb, I encountered the following error:
Program received signal SIGFPE, Arithmetic exception.
0x08158307 in radtra_ ()
How do I find out which line number and file 0x08158307 corresponds to, without recompiling or otherwise modifying the source? If it helps, the source language was Fortran.
How do I find out which line number and file 0x08158307 corresponds to, without recompiling or otherwise modifying the source?
That isn't easy. You could use GDB's disassemble command, look for accesses to global variables and CALL instructions, and make a guess as to where inside radtra_ you are. This gets harder the larger the routine is, the more optimizations the compiler has applied to it, and the fewer calls and global variable accesses it performs.
If you can't guess, your only options are:
Rebuild the application adding the -g flag, but leaving all other compile options unmodified, then use addr2line to translate the address to a line number. (This is how you should build the application from the start.)
If you can't rebuild the entire application, rebuild just the source containing radtra_ (again with same flags, but add -g). You should be able to match the output from objdump -d radtra.o with the output from disassemble. Once you have a match, read output from readelf -wl radtra.o or objdump -g radtra.o to associate code offsets within radtra_ with source lines that code was generated from.
Hire an expert to guess for you. This wouldn't be cheap, as people skilled in this kind of reverse engineering are usually gainfully employed and value their time.

What are gcc linker map files used for?

What are the ".map" files generated by the gcc/g++ linker option "-Map" used for?
And how do I read them?
I recommend generating a map file and keeping a copy for any software you put into production.
It can be useful for deciphering crash reports. Depending on the system, you likely can get a stack dump from the crash. The stack dump will include memory addresses and one of the registers will include the Instruction Pointer. That tells you the memory address code was executing at. On some systems, code addresses can be moved around (when loading dynamic libraries, hence, dynamic), but the lower order bytes should remain the same.
The map file is a MAP from memory location -> code location. It gives you the name of the function at a given memory address. Due to optimizations, it may not be extremely accurate, but it gives you a place to start in terms of looking for bugs that cause the crash.
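To make that lookup concrete, here is a small sketch of going from a crash address to a function name. The SYMBOLS table and its addresses are invented for illustration (in practice you would extract them from your own map file), and load_bias is an assumed parameter for the case where the code was loaded at a different base than it was linked for.

```python
import bisect

# Hypothetical (start address, function name) pairs taken from a map file.
SYMBOLS = sorted([
    (0x08000000, "main"),
    (0x08000040, "uart_init"),
    (0x080000A4, "uart_send"),
    (0x08000124, "crc32"),
])


def function_at(address, symbols=SYMBOLS, load_bias=0):
    """Return (function name, offset into it) for a code address.

    The function is the one with the greatest start address that is not
    above the (bias-corrected) address. There is no end-of-function check,
    so an address past the last symbol is attributed to the last symbol.
    """
    link_address = address - load_bias
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, link_address) - 1
    if index < 0:
        return None  # below the first known symbol
    start, name = symbols[index]
    return name, link_address - start


print(function_at(0x080000B0))
print(function_at(0x7F0008000044, load_bias=0x7F0000000000))
```

The offset into the function is only a starting point: with optimizations enabled, it may not line up neatly with a single source line.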
Now, in 30 years of writing commercial software, this is the only thing I've used the map files for. Twice successfully.
What are the ".map" files generated by gcc/g++ linker option "-Map" used for?
There is no such thing as 'gcc linker' -- GCC and linker are independent and separate projects.
Usually the map is used for understanding decisions that ld made while linking the binary. From man ld:
-M
--print-map
Print a link map to the standard output.
A link map provides information about the link, including the following:
· Where object files are mapped into memory.
· How common symbols are allocated.
· All archive members included in the link, with a mention of the symbol which caused the archive member to be brought in.
· The values assigned to symbols.
...
If you don't understand what that means, you likely don't (yet) have the questions that this output answers, and hence have no need to read it.
The compiler gcc is one program that generates object code files, the linker ld is a second program to combine the object code files into an executable. The two can be combined into a single command line.
If you are generating a program to run on an ARM processor, you need to use arm-none-eabi-gcc and arm-none-eabi-ld so that the code will be correct for the ARM architecture. Plain gcc and ld generate code for your host computer.

How to rename debugging information?

EDIT: I am entirely rephrasing my original question, as it was far from clear (its lack of clarity can be seen at the bottom!).
I am developing a RTOS where both the kernel and the applications must be mapped to very specific locations in memory. For example:
0x00000000:0x0000ffff: application #1
0x00010000:0x0001ffff: application #2
...
0xffff0000:0xffffffff: kernel
The applications (and the kernel) are developed (and compiled) separately. To merge everything into a single executable, the following process is used:
1. (Separately) Compile the kernel and the applications (stripped of any symbols).
2. (Through a script) Generate a linker script to relocate the kernel and the applications to the desired locations. To prevent any conflicts between section names, the generated linker script "renames" all sections of all applications (e.g. .app1.text, .app1.data, .app1.bss, ...).
3. Link using the previously generated linker script (i.e. merge all).
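For readers who want a picture of the script-generation step, here is a minimal sketch of such a generator. The object file names (app1.o, app2.o, kernel.o), the helper make_linker_script and the exact section list are assumptions for illustration, not the author's actual script; a real one would also need to handle COMMON symbols, alignment, and any extra sections.

```python
def make_linker_script(images):
    """Build a GNU ld SECTIONS block from (prefix, object_file, base) tuples.

    Each image gets its own output sections named .<prefix>.text,
    .<prefix>.data and .<prefix>.bss, placed starting at its base address.
    """
    lines = ["SECTIONS", "{"]
    for prefix, obj, base in images:
        # Move the location counter to the start of this image's region.
        lines.append(f"  . = 0x{base:08x};")
        for sec in (".text", ".data", ".bss"):
            # Collect this object's input sections under a prefixed name.
            lines.append(f"  .{prefix}{sec} : {{ {obj}({sec} {sec}.*) }}")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical images laid out as in the memory map above.
script = make_linker_script([
    ("app1", "app1.o", 0x00000000),
    ("app2", "app2.o", 0x00010000),
    ("kernel", "kernel.o", 0xFFFF0000),
])
print(script)
```

The generated text would then be written to a file and passed to the linker with -T.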
Question 1) Is it possible to replace steps #2 and #3 with something like the following process?
1. Relocate the object files of the kernel and the applications to the desired position.
2. Rename all symbols on the applications' object files (to prevent name clashes).
3. Merge all.
I'm trying to replace the generation of the linker script with some already available tools.
Step #1 should be possible through the creation of a position independent executable (I still have to investigate this).
Step #2 is possible through GNU objcopy.
For Step #3 I have no possible solution yet. If GNU ld is used, it uses some default linker script and the previous relocation is lost. If GNU gdb accepted archives generated from GNU ar the problem would be solved (I guess!).
Question 2) If the above process is possible, can it be applied to debugging information as well?
Step #1 should remain intact.
For step #2 I am not sure if debugging information gets renamed or not.
The problem with step #3 remains.
The original question follows:
I have a custom kernel and one or more applications and, I want to use
GDB to debug the entire system. In order to avoid any name clashes
during linkage I use objcopy to rename all the sections and symbols
names (applications' start addresses are hard-coded in the kernel).
However, debugging information is [I guess] hard-coded inside those
.debug.* sections and do not get renamed.
Is there a way to rename the debugging information? And, after that,
merge that information with another set of already existent debugging
information?
I have searched GCC's manual to see if I can find an option to prefix
(like a global namespace) all symbols during compilation, but I
haven't found any.
My guess is that there is a debugging format which exposes its
information on the objects symbol table (which can be renamed).
Answer to Question 1)
Step #1 should be possible through the creation of a position
independent executable (I still have to investigate this).
No, it is not possible. A position independent executable is useful when the load address of the executable is known only at load time. In my case, I want to hardwire the load address.
For Step #3 I have no possible solution yet. If GNU ld is used, it
uses some default linker script and the previous relocation is lost.
If GNU gdb accepted archives generated from GNU ar the problem would
be solved (I guess!).
There seems to be no workaround. A linker script is thus mandatory.
Answer to Question 2)
For step #2 I am not sure if debugging information gets renamed or
not.
In fact, debugging information does not get renamed. You can use objdump -s and check that debugging information is hardwired inside those .debug.* sections.
Workaround)
Even without debugging information, you can use the object file's symbol table to set breakpoints. However, instead of b main you must use b * main, because the symbols in the symbol table are interpreted as addresses. This is not much, but it certainly helps.

Object files without ELF header or ways to reduce object file size with GCC?

I'm using GCC to compile some C code. Is there a way to strip e.g. the ELF header from the object file and make the linker add the header? Or are there other ways to reduce the resulting object file size besides the obvious -Os and -s flags? (-ffast-math, -fomit-frame-pointer and -fshort-doubles do help to reduce the code size, but hexdumping the object file reveals huge amounts of zeroes which are "seemingly" useless.)
Tools like strip/sstrip aren't really of much use, as the object file has to preserve the symbols (it will be linked later on). (--strip-unneeded and -R .comment -R .gnu.version do their magic, though.)
What I'm doing is something which requires me to bundle (compressed) object file to the user and have a script embedded to link it at users-end. Every byte counts!
The ELF header can't be removed and restored later on, as valuable information is stored there and then lost forever (like file offsets for some tables, architecture, etc. IIRC). You've already listed almost everything that you can do to reduce the size, except maybe bzip'ing.
If you run the object files through a compression algorithm, those "huge amounts of zeroes" should shrink by a large factor, since they have low information content. You might want to investigate a better compression algorithm; perhaps it's possible to gain more there than by going to a modified/non-standard "hacked" object file format, if that is even possible.
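A quick way to convince yourself of this is to compress a block of zeroes and a block of random bytes of the same size. This is only an illustrative sketch using Python's zlib; the 4096-byte block size is an arbitrary choice, and the random block stands in for dense, hard-to-compress content.

```python
import os
import zlib

BLOCK = 4096  # arbitrary block size for the comparison

zero_padding = bytes(BLOCK)     # what the "useless" zeroes look like
dense_data = os.urandom(BLOCK)  # stand-in for incompressible content

packed_zeroes = zlib.compress(zero_padding, 9)
packed_dense = zlib.compress(dense_data, 9)

print(f"zeroes: {BLOCK} -> {len(packed_zeroes)} bytes")
print(f"random: {BLOCK} -> {len(packed_dense)} bytes")
```

The zero block collapses to a few dozen bytes, while the random block does not shrink at all, so the padding costs almost nothing once the file is compressed.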
You can try playing with -fdata-sections, -ffunction-sections and -Wl,--gc-sections, but this is not safe, so be sure to understand how they work before using them.
