Is there a way to make Clang output LLVM IR alongside executable? - compilation

I know that you can make clang output LLVM IR using -emit-llvm option, however this makes it the sole output.
I was wondering if there is some combination of compiler options that would make clang function exactly as before, but also produce .ll files as a byproduct?
The problem I'm facing right now is a project with a very complicated cmake-based build, for which I can only change clang compile options. I want to generate llvm IR files for it, but unfortunately, if I just pass -emit-llvm, CMake fails, since it's compiler tests/sanity checks don't pass (since .ll file is generated instead of the valid executable).
Is there some way to make clang generate both exe/object and .ll files? Or somehow workaround this issue in other ways?

There are at least two ways to achieve that:
-flto: instead of each object file you will get an LLVM Bitcode file (except of files that compiled from assembly, they will still be object files).
-fembed-bitcode: clang will add another section into the final executable, which contains all the LLVM bitcode files (again, except of the assembly files, they will still be object files). Then, you can use LibEBC to extract all those files.
No matter which approach you take, you would have to use llvm-dis tool to convert LLVM Bitcode files into LLVM IR files.
I hope it helps.

Related

Compile single static library for Cortex M3, M4, M23 and M33

I'm currently working on a rather generic communication stack. It gets bytes in on one end, parses the packet and calls a callback.
I want to have this stack in a static library (i.e. libcommstack.a).
The library is aimed towards embedded ARM Cortex-M devices. At the moment we have specified that at least a Cortex-M3 should be used (but it should also work for an M4 or M33).
Right now I'm integrating it into another application to verify that linking it is possible. In the future the idea is that we will ship this .a file to customers so they can build their application around it, without having direct access to our sources (to encapsulate our IP).
We are using GCC ARM v7.2.1 to compile both the library and the application that is linked to it.
The application I'm trying to integrate it with is compiled for a Cortex M33 with -mfloat-abi=hard -mfpu-fpv6-sp-d16.
The code for the library does not use any floating points and is compiled using -march=archv7-m (both have the -mthumb flag).
Linking seemed to all go well, until I actually called a function from the lib. At that point the linker starts to complain:
application.elf uses VFP register arguments, libcommstack.a(somefile.c.obj) does not
failed to merge target specific data of file libcommstack.a(somefile.c.obj)
Since I'm not using floating points in the library and I don't know (upfront) if the target application does or does not have an FPU (or even uses floats), I'm not sure how to approach this.
I figured there would be two approaches:
Compile a single version of the lib, using an instruction set that all of the microcontrollers understand. I was hoping that this would be the case with ARMv7 (although I'm not yet 100% confident that the M23/M33 also support this).
Compile a lot of different libs for the different flavors based on the different architectures, FPU, etc.
As you can imagine, I would prefer to keep it simple and go for option 1, but I'm not sure how to "convince" the linker to link these two (or perhaps how to convince the compiler NOT to care about floating points for the lib).
Does anyone know if option 1 is feasible and how it can be achieved?
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
Does anyone know if option 1 is feasible
Well, feasible, probably.
how it can be achieved?
Get all the processors you want to support and determine the instructions sets available on all these processors. Then compile for that instruction set.
But, please don't, that is a workaround.
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
Gcc has something like "multilib profiles". See arm-none-eabi-gcc --print-multi-lib output. If you have newlib installed, you can go to /usr/arm-none-eabi/lib/thumb/ and see the directories there - newlib is compiled for each profile and installs separate library for it and different library is picked up depending on configuration. Compile for each of those profiles, and package your library by putting libraries in proper /usr/arm-none-eabi/lib/proper/directory/here and compiler will pick them up by itself (see gcc -v output for library search paths). For an example search newlib sources where it happens, can't find it. (Here's my example). With cmake as a backend as a example you could compile and install as follows:
arm-none-eabi-gcc --print-multi-lib |
while IFS=';' read -r dir opts; do
cmake -B builddir CMAKE_C_FLAGS="$opts" CMAKE_INSTALL_LIBDIR="$dir"
cmake --build builddir
cmake --install builddir --prefix "/usr/arm-none-eabi/"
done

Can I have linker tell me which object files are used from a statically linked library

When building a binary, I'm statically linking against a library that we build from scratch. I'd like to know which particular object files are being linked against from the static library. Is there a way to get the linker to tell me this?
I am using GNU ld, the one associated with the gcc 4.8 compiler.
Update
Using Can GNU ld be instructed to print which .o files are needed during a link?, I got what I needed via nm on the built binary. But it would still be interesting to know whether this can be derived via linker output.

How to include ncurses while using Emscripten emcc and make on Mac

I'm trying to build a project (namely, Angband's source - http://rephial.org/downloads/3.3/angband-v3.3.2.tar.gz) with Emscripten's emcc in order to port it to Javascript and ultimately build an online version.
I've managed to get the process started with
emconfigure ./configure
make
which begins to successfully start generating LLVM bitcode .o files, but then it hangs up on main-gcu.c with 'main-gcu.c:43:11: fatal error: 'ncurses.h' file not found'
I believe main-gcu.c is the only file that references ncurses, but I just can't figure out how to include the library while compiling. Is there a way to specify including ncurses with 'make', or should I compile the main-gcu.c file individually, with 'emcc main-gcu.c -c -lncurses'? I tried doing that but that led to another error with emcc being unable to find other actually included header files two levels down (it couldn't find headers that were included by a header included by main-gcu.c - anyway to fix that?).
I'm also not certain if I have/need to install the ncurses library on Mac OSX. All I can really find are references to libncurses5-dev for Linux.
Thanks!
I think you misunderstand the compilation via Emscripten. I will try to point out a few problems you are facing.
The general rule is that all tools of Emscripten ONLY can turn LLVM formats (e.g. BITCODE) into JavaScript. emconfigure, emmake, ... modify the build environment so that your sourcecode is compiled to one of the LLVM formats (there are exceptions to the rule but nevermind). So anything you want to link against your final result has to be in a LLVM format, as well (which by default ncurses is not).
Since the output is JavaScript, there is no chance to execute any program code in different threads. While a lot of C/C++ code does use a thread for the UI and others for processing, such a model does NOT work for Emscripten. So in order to get the software compiling/running you will have to rewrite the parts that use threading. See emscripten_set_main_loop for pointers.
Even if you have the libraries compiled you then have to statically link them to Emscripten. At this point it is less of a technical problem but more of a license issue since if your library is licensed under e.g. LGPL due to static linking the GPL terms are effective.
I hope all clarity finally vanished ;)

Why are the debug symbols lost in the LLVM compile/link process?

I wrote an LLVM transformation that basically replaces mallocs by kind of guarded mallocs and some other stuff.
I'm using clang (or llvm-gcc) for compiling a c file to get a bitcode file (using the -emit-llvm option) which contains debug information. These also contain method names, line numbers and so on.
Afterwards I'm using opt to instrument this bitcode file. The result is an instrumented bitcode file, still containing all relevant debug infos.
In a third and last step, since we need some runtime libs, we link the bitcode against some other bitcode files using llvm-gcc to get a final binary.
This binary I cannot debug since it doesn't contain any debug information although all linked bitcode files did contain them. The only thing gdb can tell me is in which function we are but no line numbers and so on...
I would be grateful for any hints.
As I understood you are running optimizations (opt tool optimizes the code and debug information also). So may be the missing part which you want to see when debugging is a result of optimized debug information ?
P.S.
I would add this in the comment, but unfortunately I don't have 50 reputations which are needed to add a comment.

Migrating from Winarm to Yagarto

This question must apply to so few people...
I am busy mrigrating my ARM C project from Winarm GCC 4.1.2 to Yagarto GCC 4.3.3.
I did not expect any differences and both compile my project happily using the same makefile and .ld files.
However while the Winarm version runs the Yagarto version doesn't. The processor is an Atmel AT91SAM7S.
Any ideas on where to look would be most welcome. i am thinking that my assumption that a makefile is a makefile is incorrect or that the .ld file for Winarm is not applicable to Yagarto.
Since they are both GCC toolchains and presumably use the same linker they must surely be compatable.
TIA
Ends.
I agree that the gcc's and the other binaries (ld) should be the same or close enough for you not to notice the differences. but the startup code whether it is your or theirs, and the C library can make a big difference. Enough to make the difference between success and failure when trying to use the same source and linker script. Now if this is 100% your code, no libraries or any other files being used from WinARM or Yagarto then this doesnt make much sense. 3.x.x to 4.x.x yes I had to re-spin my linker scripts, but 4.1.x to 4.3.x I dont remember having problems there.
It could also be a subtle difference in compiler behavior: code generation does change from gcc release to gcc release, and if your code contains pieces which are implementation-dependent for their semantics, it might well bite you in this way. Memory layouts of data might change, for example, and code that accidentally relied on it would break.
Seen that happen a lot of times.
Try it with different optimization options in the compile and see if that makes a difference.
Both WinARM and YAGARTO are based on gcc and should treat ld files equally. Also both are using gnu make utility - make files will be processed the same way. You can compare the two toolchains here and here.
If you are running your project with an OCD, then there is a difference between the implementation of the OpenOCD debugger. Also the commands sent to the debugger to configure it could be different.
If you are producing an hex file, then this could be different as the two toolchains are not using the same version of newlib library.
In order to be on the safe side, make sure that in both cases the correct binutils are first in the path.
If I were you I'd check the compilation/linker flags - specifically the defaults. It is very common for different toolchains to have different default ABIs or FP conventions. It might even be compiling using an instruction set extension that isn't supported by your CPU.

Resources