Using compile flag -ffunction-sections with debug symbols - gcc

I am compiling a C file using the gcc flag -ffunction-sections, to move every function into its own section. The assembler is throwing the error:
job_queue.s:2395: Error: operation combines symbols in different segments
The compiler's assembly output at line 2395 is given here:
.section .debug_ranges,info
.Ldebug_ranges0:
.4byte .LBB7-.Ltext0
The symbol .LBB7 is in the function add_event_handler, and thus in the section named ".text.add_event_handler".
The symbol .Ltext0 is in the (otherwise empty) section named ".text".
GCC --version gives:
pic30-elf-gcc.exe (GCC) 4.0.3 (dsPIC30, Microchip v3_30) (B) Build date: Jun 29 2011
If I use the compiler flag -g0 (to turn off debug info) everything compiles and runs perfectly.
My question:
Is this GCC output clearly wrong? It seems to me that GCC should have calculated the symbol .LBB7's offset from the beginning of the .text.add_event_handler section instead of the .text section.
I suspect I am misunderstanding something, because I cannot find anyone else having the same difficulty on Google.

The GCC output is definitely wrong. Perhaps it's fixed in newer GCC versions. If you can't upgrade your compiler, try compiling with -gdwarf-2 or, failing that, with -gdwarf-2 -gstrict-dwarf (though for -gstrict-dwarf you'll have to upgrade the compiler anyway).
These options instruct GCC to emit (strict) DWARF 2, which lacks support for non-contiguous address ranges; that feature was introduced in DWARF 3.
Of course, this may degrade the debugging information quality somewhat, YMMV.
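Assuming a compile line like the one that produced job_queue.s (the exact invocation isn't shown in the question), the workaround would look something like:
$ pic30-elf-gcc -ffunction-sections -gdwarf-2 -c job_queue.c -o job_queue.o
If the error disappears with -gdwarf-2 but returns with plain -g, that confirms it's the DWARF 3 style range lists (.debug_ranges entries such as the .4byte .LBB7-.Ltext0 above) that this assembler cannot digest.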

Related

Link a "toy" OS using llvm/clang

Is it possible (within reason) to build a "toy" OS on a Mac using llvm/clang (and the other "normal" build tools)? By "toy" OS, I mean the simple "Hello, World" examples found on OSDev (http://wiki.osdev.org/Bare_Bones) and x86 Bare Metal (https://github.com/cirosantilli/x86-bare-metal-examples).
My main problem is I can't figure out how to specify precisely where the linker should place the code (i.e., that the starting point should be 0x7c00, that bytes 510 and 511 need to be 0xaa55, etc.).
I would say yes it is possible within reason, at least if you consider waiting for a build of lld (and its dependency llvm) reasonable. Instructions to build lld can be found on their website or as part of this answer.
Compiling and linking for a different target than the host is relatively easy with clang. You just have to set a target, for example -target i386-none-elf for an ELF binary. Cross-compilation using clang is explained in more detail here.
As for macOS, as Michael Petch noted, you have to use a linker other than the ld installed by default. You could in theory install binutils to get an ELF ld, but then you have to compile it yourself to set the target. My recommendation is to use lld, which can target many architectures without the need to recompile.
With clang and lld in place we can compile sources with
clang -c -o file.o file.c -target i386-none-elf # freestanding flags omitted
and then link them with
clang -o kernel.bin file.o -target i386-linux-elf -nostdlib -Wl,linkerscript.ld -fuse-ld=lld
Note that for linking I am using i386-linux-elf because there is a bug in clang where it just forwards its input to gcc. But when using -nostdlib it is essentially the same.
If you want to see a complete example ready to build, you can take a look at https://github.com/Henje/x86-Toy-OS.
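As for placing the code at 0x7c00 and getting the 0xaa55 signature into bytes 510 and 511, that is the linker script's job. A minimal linkerscript.ld sketch (my own, untested; standard GNU ld script syntax, which lld also understands) could look like:

ENTRY(_start)
SECTIONS
{
    /* the BIOS loads the boot sector at physical address 0x7C00 */
    . = 0x7C00;
    .text   : { *(.text*) }
    .rodata : { *(.rodata*) }
    .data   : { *(.data*) }

    /* advance to byte 510 of the sector and emit the boot signature */
    . = 0x7C00 + 510;
    .signature : { SHORT(0xAA55); }
}

SHORT(0xAA55) is written little-endian, so the sector ends with the bytes 0x55 0xAA as the BIOS expects. You'd still need something like objcopy -O binary to turn the ELF output into a raw boot sector.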

How to compile dhrystone benchmark for RV32I

When I try to compile the dhrystone benchmark in riscv-sodor for -march=RV32I, an unrecognized opcode 'amoadd' error aborts the compilation.
riscv-gcc -m32 -Wa,-march=RV32I -std=gnu99 -O2 -nostdlib -nostartfiles -DPREALLOCATE=1 -DHOST_DEBUG=0 \
-c -I../env -I./common -I./dhrystone ./dhrystone/dhrystone_main.c -o dhrystone_main.o
In file included from ./dhrystone/dhrystone_main.c:12:0:
./common/util.h:11:0: warning: "rdcycle" redefined [enabled by default]
./dhrystone/dhrystone.h:385:0: note: this is the location of the previous definition
/tmp/ccDLKK6P.s: Assembler messages:
/tmp/ccDLKK6P.s:16: Error: unrecognized opcode `amoadd'
make: *** [dhrystone_main.o] Error 1
Is there any way to compile dhrystone for bare-metal RV32I hardware?
Remove the -Wa,-march=RV32I flag.
-Wa,-march=RV32I is telling the assembler to only accept valid RV32I instructions. It has found an amoadd instruction, so it properly errored out. The assembler is in no position to change out invalid instructions for valid instructions, that's the compiler's job.
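With that flag removed, the compile line from the question becomes:
riscv-gcc -m32 -std=gnu99 -O2 -nostdlib -nostartfiles -DPREALLOCATE=1 -DHOST_DEBUG=0 \
-c -I../env -I./common -I./dhrystone ./dhrystone/dhrystone_main.c -o dhrystone_main.o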
Unfortunately, as of December 2014, the riscv-toolchain's gcc port does not yet support -march=RV32I.
Instead, it is on the programmer to avoid writing code that will emit multiplies, AMOs, floating point, etc. In fact, the provided dhrystone code includes floating point instructions (I believe only in the initialization code) which would cause -Wa,-march=RV32I to fail as well. To handle this, the dhrystone code checks whether the processor supports floating point instructions and branches around them as required.
On a more academic note, I believe the amoadd comes in from common/util.h's "barrier" implementation. However, since dhrystone does not call barrier, I'm unsure why the assembler is being passed an amoadd instruction. I certainly don't see it in my own disassemblies of dhrystone, and I was unable to reproduce your error on my own RISCV compilers. I am using a riscv-gcc from early August 2014 (https://github.com/ucb-bar/riscv-tools) and a newer gcc 4.9 found at https://github.com/ucb-bar/riscv-tools/tree/new-abi, which is in testing but will soon be moved to master (~Jan 2015).
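If the amoadd really is coming from util.h's barrier, one workaround for a single-hart RV32I target is to swap the atomic for a pure compiler barrier. A hypothetical sketch (barrier() and its original body are assumptions, since the exact util.h isn't shown):

/* Original flavour (requires the A extension, hence 'amoadd'):
 *   asm volatile ("amoadd.w zero, %1, (%0)" : : "r"(&cnt), "r"(1) : "memory");
 *
 * RV32I-safe replacement for a single-hart core: emits no instruction
 * at all; the empty asm only prevents the compiler from reordering
 * memory accesses across it. */
static inline void barrier(void)
{
    asm volatile ("" ::: "memory");
}

This only holds on a single hart; with multiple harts you genuinely need the A extension (or some other synchronization primitive), which RV32I-only hardware does not provide.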

What flags or environment variables can I pass to Clang to get maximum debugging on both BSD and Linux?

I'm interested in answers, approaches, and ideas out of the box. At a high level, the man page is pretty sparse: it mainly lists -g, with one level, and suggests that -O0 is also either very helpful or essential.
But I'm wondering what other clang flags can be given to give maximum debugging. Is there an equivalent to gcc's -ggdb3 which includes some of the source or annotations directly in the object output? Or could there be? Is it possible and helpful to recompile the OS and its original libraries to have debugging (and if so, if I'm using Debian, can I have it write the debugging into the main .deb package instead of putting a separate debugged .deb package which stores debugging data in /usr/lib/debug?)? Will a static build of a binary affect the ability to see a good stacktrace? And is there anything that needs to be done to ensure that addr2line works well? Is it needed to compile all libraries (even glibc) with clang to get the maximum debugging benefit? I note that there is a project to recompile Debian with clang, and otherwise am open to a distribution that does so or otherwise places emphasis on debugging.
On Linux there are also options like an LD_PRELOAD set to /lib/libSegFault.so, or a set of LD_LIBRARY_PATH reassignments to /usr/lib/debug instead of the usual /usr/lib location (including redirecting libc itself to the debugged version). Is there a central place or external sources for answers to this question of how to enhance debuggability of a binary? The bigger mystery is clang, since I see in the long gcc man page that there are various options which can increase debugging (or reduce optimisations), but the documentation for clang only shows a smaller set. It's possible that clang will accept more options than documented, including gcc flags (which may either translate to a no-op or to more debugging - hard to tell without a canonical source of information).
Also, from a package build perspective, since an external package may not respect CFLAGS, I've redirected /usr/bin/strip to be a no-op command that always succeeds, but other ideas on ensuring compliance are welcome (I believe that pkgsrc does a good job of wrapping gcc and the linker in shell scripts - useful for inserting mandatory flags). Also there may be various ld options that can be passed to increase debugging of the output binary. Also, it's quite possible that BSD (including FreeBSD 10, based upon clang) may have a different linking architecture which could make it easier to request and find debugged symbols in the generated libraries and executables.
To take debugging more broadly defined, I've set LD_WARN=yes, LD_DEBUG=unused, SEGFAULT_SIGNALS="all", LD_PRELOAD=.../libSegFault.so (as mentioned above), and LD_BIND_NOW=yes. Also, I believe I can have gcc prefer libraries in /usr/lib/debug - ahead of the standard search paths - using strategic -B options. Also, using --whole-archive for a static build might ensure that more objects are included in the linked output. There's also ulimit -c unlimited, and on Linux a nice way to differentiate core files like:
sysctl -w kernel.core_pattern="core.%t.SIG-%s.PID-%p.ID-%g-%u.%h.%E"
For gcc I've used and seen flags like: -O0 -fno-omit-frame-pointer -fverbose-asm -ggdb3 -mno-omit-leaf-frame-pointer -mtune=generic -fvar-tracking -D_GLIBCXX_DEBUG=1 -frecord-gcc-switches -femit-class-debug-always -fmath-errno -fno-eliminate-unused-debug-symbols -fno-eliminate-unused-debug-types -fno-merge-debug-strings -mieee-fp -static-libgcc -fexceptions -fbounds-check -rdynamic -UNDEBUG -DDEBUG=1 (-ffreestanding -static-libgcc -pass-exit-codes) and -fno-stack-check (since I believe I've read that stack checking can interfere with debugging).
Other flags are there for other reasons, but the emphasis is on maximum debugging. With all or most of the above, it's unclear which ones clang would support or use, or whether there are other options.
Clang does not support the -ggdb3 flag, only -g, as you have noticed. If you try to use it, you'll get the message:
clang: warning: argument unused during compilation: '-ggdb3'
so you can run your entire command line through Clang and it will tell you which of those GCC flags it supports and which it does not: some will print warnings, others may error out, but Clang will not silently ignore them. Here are the ones that Clang rejected when I tried your long command: -static-libgcc and -pass-exit-codes.
As pointed out in another SO answer, clang -cc1 --help can be used to list supported compilation flags, where we see the following which may be of interest to you:
-disable-llvm-optzns: Don't run LLVM optimization passes
-fno-elide-constructors: Disable C++ copy constructor elision
-mdisable-fp-elim: Disable frame pointer elimination optimization
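Note that -cc1 options go to the Clang frontend, not the normal driver; a common way to pass one through from a regular compile is -Xclang, e.g.:
$ clang -g -O0 -Xclang -disable-llvm-optzns -c test.c
(-disable-llvm-optzns is the "don't run LLVM optimization passes" switch listed above; test.c stands in for whatever you're building.)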

From which version of gcc does -mcmodel=large work?

I have code which produces executables larger than 2GB (it's generated code).
On x64 with gcc 4.3.2 I get errors like:
crtstuff.c:(.text+0x20): relocation truncated to fit:
R_X86_64_32S against `.dtors'
So I understand I need the -mcmodel=large option. However, that option doesn't appear to do anything, and it doesn't solve the problem on my system.
I am sure I read somewhere, that it was only supported from a particular version of gcc, and the option was ignored on versions before that. I would tell my operations team to install that version of gcc if only I knew what it was. But I just can't find any evidence right now to tell me if that hypothesis is true, and if so in which version the feature was introduced.
For example
(1) Here it is stated that the option doesn't do anything. The book in question claims to cover "GCC 4.x". The book came out 2006.
(2) Here a compiler bug is being reported against the option, therefore I conclude in that version it must do at least something. That seems to be gcc 4.6.1.
So although I can no longer find evidence of exactly in which version the feature was implemented, at least there is evidence that this has changed over time.
I have tried looking through the changelogs for all the various GCC 4.x versions to no avail (and normally they are pretty good so the lack of information there almost implies that I am wrong and nothing has changed between versions.)
Edit: This seems to imply that perhaps it did work, but that I need to "recompile crtstuff.c" - and I don't really know where to find that file or how to do that.
I believe 4.4 is the version that added support for this feature. I demonstrate below that 4.1 doesn't work while 4.4 does, on something that needs a large data block (rather than code). I'm not sure about 4.2 and 4.3, but both your example and my memory suggest 4.3 didn't have working support for this. My example should let you validate whether a particular installation works or not though, on an otherwise easy to compile bit of code.
As background, I maintain a program that's a fork of the stream benchmark, modified specially to use 64 bit structures for testing larger systems. I was plagued with these "relocation truncated to fit" errors until I started using "-mcmodel=large", and my fork won't compile/run unless that really does work. The oldest version of gcc I've definitely found my program compatible with is the 4.4.5 that ships with Debian Squeeze.
Here's a complete test case showing my fork of stream compiling and using >4GB of RAM with the large model, after failing to do so without the option:
$ gcc --version
gcc (Debian 4.4.5-8) 4.4.5
...
$ git clone https://github.com/gregs1104/stream-scaling.git
$ cd stream-scaling
$ gcc -O3 -DN=200000000 -fopenmp stream.c -o stream
/tmp/cca8rR1I.o: In function `checkSTREAMresults':
stream.c:(.text+0x34): relocation truncated to fit: R_X86_64_32S against `.bss'
...
stream.c:(.text+0x6ab): additional relocation overflows omitted from the output
collect2: ld returned 1 exit status
$ gcc -O3 -DN=200000000 -fopenmp stream.c -o stream -mcmodel=large
$ ./stream
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 200000000, Offset = 0
Total memory required = 4577.6 MB.
...
And here's what happens on a version of gcc that doesn't have the large model, running on RedHat 5 derived software (CentOS 5.8):
$ gcc --version
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52)
...
$ gcc -O3 -DN=200000000 -fopenmp stream.c -o stream -mcmodel=large
stream.c:1: sorry, unimplemented: code model ‘large’ not supported yet
So on older versions of gcc, it should throw that error out, not just ignore the option.
crtstuff is a library coming with gcc. The bug report you linked to on the gcc mailing list was from someone trying to build their own gcc for a RedHat 5 system, which as you can see in this last example ships with gcc 4.1. They rebuilt part of gcc with the large model, but it was still linking against the original, 4.1 built crtstuff library. You shouldn't run into that problem if you're using a properly packaged gcc, which is why it wasn't considered a real bug by the gcc developers. I think you just need gcc 4.4 or later.
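If you'd like an even smaller self-check than stream-scaling, a stand-alone sketch (my own, not taken from the benchmark) that triggers the same overflow is a single over-sized static array:

/* big.c - a ~4.8 GB static array; under the default small code model
 * the link fails with a "relocation truncated to fit" error, while
 * -mcmodel=large links and runs (gcc 4.4 or later, per the above). */
static double big[600000000];

int main(int argc, char **argv)
{
    (void)argv;
    big[argc] = 1.0;        /* runtime index so gcc can't discard the array */
    return (int)big[argc];
}

$ gcc big.c -o big                    # relocation truncated to fit
$ gcc -mcmodel=large big.c -o big     # links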

How to use gcc and -msoft-float on an i386/x86-64? [duplicate]

Is it (easily) possible to use software floating point on i386 Linux without incurring the expense of trapping into the kernel on each call? I've tried -msoft-float, but it seems the normal (Ubuntu) C libraries don't have an FP library included:
$ gcc -m32 -msoft-float -lm -o test test.c
/tmp/cc8RXn8F.o: In function `main':
test.c:(.text+0x39): undefined reference to `__muldf3'
collect2: ld returned 1 exit status
It is surprising that gcc doesn't support this natively, as the code is clearly available in the source tree in a directory called soft-fp. It's possible to compile that library manually:
$ svn co svn://gcc.gnu.org/svn/gcc/trunk/libgcc/ libgcc
$ cd libgcc/soft-fp/
$ gcc -c -O2 -msoft-float -m32 -I../config/arm/ -I.. *.c
$ ar -crv libsoft-fp.a *.o
A few C files don't compile due to errors, but the majority do. After copying libsoft-fp.a into the directory with our source files, they now compile fine with -msoft-float:
$ gcc -g -m32 -msoft-float test.c -lsoft-fp -L.
A quick inspection using
$ objdump -D --disassembler-options=intel a.out | less
shows that, as expected, no x87 floating point instructions are used, and the code runs considerably slower as well: by a factor of 8 in my example, which uses lots of division.
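For reference, a minimal test.c in the spirit of the question (my own sketch, since the original isn't shown) that forces the __muldf3 call:

/* test.c - under -msoft-float the double multiply can't use x87/SSE,
 * so gcc emits a call to the libgcc routine __muldf3 instead */
#include <stdio.h>

int main(void)
{
    volatile double a = 3.0, b = 7.0;   /* volatile defeats constant folding */
    double c = a * b;                   /* -> call to __muldf3 */
    printf("%f\n", c);
    return 0;
}

Without -lsoft-fp -L. on the link line this reproduces the undefined reference from the question; with it, the objdump inspection above should show no x87 instructions.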
Note: I would've preferred to compile the soft-float library with
$ gcc -c -O2 -msoft-float -m32 -I../config/i386/ -I.. *.c
but that results in loads of error messages like
adddf3.c: In function '__adddf3':
adddf3.c:46: error: unknown register name 'st(1)' in 'asm'
It seems the i386 version is not well maintained: st(1) refers to one of the x87 registers, which are obviously not available when using -msoft-float.
Strangely or luckily the arm version compiles fine on an i386 and seems to work just fine.
Unless you want to bootstrap your entire toolchain by hand, you could start with the uClibc toolchain (the i386 version, I imagine) -- soft float is (AFAIK) not directly supported for "native" compilation on Debian and derivatives, but it can be used via the "embedded" approach of the uClibc toolchain.
GCC does not support this without some extra libraries. From the i386 GCC documentation:
-msoft-float
    Generate output containing library calls for floating point. Warning: the requisite libraries are not part of GCC. Normally the facilities of the machine's usual C compiler are used, but this can't be done directly in cross-compilation. You must make your own arrangements to provide suitable library functions for cross-compilation.
    On machines where a function returns floating point results in the 80387 register stack, some floating point opcodes may be emitted even if -msoft-float is used.
Also, you cannot set -mfpmath to none; it has to be sse, 387, or both.
However, according to this gnu wiki page, there is fp-soft and ieee. There is also SoftFloat.
(For ARM there is -mfloat-abi=softfp, but it does not seem like something similar is available for 386 SX).
It does not seem like tcc supports software floating point numbers either.
Good luck finding a library that works for you.
G'day,
Unless you're targeting a platform that doesn't have built-in FP support, I can't think of a reason why you'd want to emulate FP support.
Doesn't your 386 platform have external FPU support? Pity it's not a 486 with the FPU built in!
In my experience, any soft emulation is bound to be much slower than its hardware equivalent.
That's why I ended up writing a package in Ada to target the onboard 68k FPU instead of using the soft emulation provided by the compiler manufacturer at the time. They ended up bundling it in their compiler, as a matter of fact.
Edit: Just saw your comment below. Hmmm, if you don't need a full suite of FP support, is it possible to roll your own for the few math functions you do need? That's how the Ada package I mentioned got started.
HTH
cheers,
