How to add a tool to a GCC toolchain? - gcc

I am currently working on the toolchain for a processor that has been developed at my university. The processor is closely based on OpenRISC (orpsocv2 has been used as a baseline). Building programs for that platform requires that some custom instructions are added to the binary. I already implemented tools that modify assembly code accordingly (utilizing regular expressions). However, I am looking for a way to integrate it with the GNU toolchain of OpenRISC.
A regular toolchain consists of the following tools:
preprocessor -> compiler -> assembler -> linker
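For clarity, each of those stages can be driven separately through the gcc driver, which makes the insertion point concrete:

gcc -E main.c -o main.i   # preprocess only
gcc -S main.i -o main.s   # compile to assembly (a custom pass would run on main.s here)
gcc -c main.s -o main.o   # assemble
gcc main.o -o main        # link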
I need my adaptations to be integrated somewhere after compilation (because I require information about the basic blocks that will be present in the binary) and before linking (because afterwards things get messy when you try to change addresses).
Now my question: Is there an easy way to add another tool between the compiler and the assembler of the GNU toolchain?
I don't want to do that manually in the Makefile, because I would like to have the tools as compatible as possible to existing software projects.
So far, I haven't been able to find anything related in the GCC documentation or the web.
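One direction that might work (a minimal sketch, not something I have verified): gcc's documented -B option prepends a directory to the search path the driver uses to find its subprograms (cpp, cc1, as, ld), so a wrapper named as could run the rewriting tool before calling the real assembler. Here rewrite-tool is a placeholder for my basic-block pass and as-real for the renamed OpenRISC assembler:

#!/bin/sh
# Wrapper "as", placed in a directory passed to gcc via -B<dir>.
# Run the custom pass on every assembly input, then delegate to
# the real assembler.
for arg in "$@"; do
    case "$arg" in
        *.s) rewrite-tool "$arg" ;;   # rewrite the assembly in place
    esac
done
exec as-real "$@"

It would then be invoked as, e.g., or32-elf-gcc -B/path/to/wrapper-dir -c main.c, keeping existing Makefiles untouched.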

Related

Compile single static library for Cortex M3, M4, M23 and M33

I'm currently working on a rather generic communication stack. It gets bytes in on one end, parses the packet and calls a callback.
I want to have this stack in a static library (i.e. libcommstack.a).
The library is aimed towards embedded ARM Cortex-M devices. At the moment we have specified that at least a Cortex-M3 should be used (but it should also work for an M4 or M33).
Right now I'm integrating it into another application to verify that linking it is possible. In the future the idea is that we will ship this .a file to customers so they can build their application around it, without having direct access to our sources (to encapsulate our IP).
We are using GCC ARM v7.2.1 to compile both the library and the application that is linked to it.
The application I'm trying to integrate it with is compiled for a Cortex-M33 with -mfloat-abi=hard -mfpu=fpv5-sp-d16.
The code for the library does not use any floating point and is compiled using -march=armv7-m (both builds use the -mthumb flag).
Linking seemed to go well, until I actually called a function from the lib. At that point the linker started to complain:
application.elf uses VFP register arguments, libcommstack.a(somefile.c.obj) does not
failed to merge target specific data of file libcommstack.a(somefile.c.obj)
Since I'm not using floating points in the library and I don't know (upfront) if the target application does or does not have an FPU (or even uses floats), I'm not sure how to approach this.
I figured there would be two approaches:
Compile a single version of the lib, using an instruction set that all of the microcontrollers understand. I was hoping that this would be the case with ARMv7 (although I'm not yet 100% confident that the M23/M33 also support this).
Compile a lot of different libs for the different flavors based on the different architectures, FPU, etc.
As you can imagine, I would prefer to keep it simple and go for option 1, but I'm not sure how to "convince" the linker to link these two (or perhaps how to convince the compiler NOT to care about floating points for the lib).
Does anyone know if option 1 is feasible and how it can be achieved?
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
Does anyone know if option 1 is feasible
Well, feasible, probably.
how it can be achieved?
Get all the processors you want to support and determine the instructions sets available on all these processors. Then compile for that instruction set.
But, please don't, that is a workaround.
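For completeness, a sketch of what that lowest-common-denominator build would look like: ARMv6-M Thumb code is the subset that the M3/M4 (ARMv7-M) and the M23/M33 (ARMv8-M) all execute, though the float ABI must still agree with the application, as the linker error above shows:

arm-none-eabi-gcc -mthumb -march=armv6-m -c somefile.c -o somefile.o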
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
GCC has something like "multilib profiles"; see the output of arm-none-eabi-gcc --print-multi-lib. If you have newlib installed, you can look at the directories under /usr/arm-none-eabi/lib/thumb/: newlib is compiled once per profile, a separate library is installed for each, and the right one is picked up depending on the compile configuration. Compile your library for each of those profiles and package it by putting the resulting libraries in the proper /usr/arm-none-eabi/lib/proper/directory/here, and the compiler will pick them up by itself (see gcc -v output for the library search paths). Newlib's own build does the same thing, though I can't find the exact place in its sources to point at; the snippet below is my example. With CMake as the build system, for example, you could compile and install as follows:
arm-none-eabi-gcc --print-multi-lib |
while IFS=';' read -r dir opts; do
    flags="${opts//@/ -}"   # turn "@mthumb@march=armv7-m" into " -mthumb -march=armv7-m"
    # assumes the cross compiler itself is selected via a toolchain file
    cmake -B "build/$dir" -DCMAKE_C_FLAGS="$flags" -DCMAKE_INSTALL_LIBDIR="lib/$dir"
    cmake --build "build/$dir"
    cmake --install "build/$dir" --prefix "/usr/arm-none-eabi/"
done
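For reference, the lines being parsed look roughly like this (the exact set of profiles depends on how the toolchain was configured):

.;
thumb/nofp;@mthumb
thumb/v7-m/nofp;@mthumb@march=armv7-m
thumb/v8-m.main/nofp;@mthumb@march=armv8-m.main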

can single gcc generate executable for multiple targets like x86,arm,ppc?

We want to use a single gcc for multiple targets. Is it possible to build gcc from source in a way that supports multiple targets?
The answer is no, you cannot do this with gcc: each gcc build targets a single architecture, so you need a separate cross compiler for each target.
But if you really need a single compiler for multiple targets, you can use the clang compiler. Here is the link:
https://clang.llvm.org/docs/CrossCompilation.html
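For instance, a single clang binary can emit code for several architectures just by switching the target triple (triple spellings as described on that page):

clang --target=x86_64-linux-gnu       -c hello.c -o hello-x86_64.o
clang --target=armv7a-linux-gnueabihf -c hello.c -o hello-arm.o
clang --target=powerpc64-linux-gnu    -c hello.c -o hello-ppc64.o

You still need target-specific headers, libraries and a linker to turn those objects into runnable executables.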
Adding to Gabriel's answer: all the architectures you mentioned above are different CPUs.
It's not possible to generate binaries for all of them with a single gcc build.
You need a different toolchain for each target, producing code compatible with that machine.
x86, PPC and ARM are different machines; you can't run code on them that was built with the host toolchain.
The reference below uses machine-specific toolchains rather than the host gcc. This is cumbersome and not a straightforward approach.
Out of curiosity, you can have a look at bitbake, which builds for multiple machines in parallel.
I'll also add that to have a useful toolchain you need to build other components besides GCC: an assembler and linker (from binutils, for example) and a C library (e.g. glibc, musl, newlib). Each such component needs to be configured for a specific target.
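As a rough sketch, building one such toolchain by hand follows the classic binutils-then-gcc sequence (versions, paths and the target triple here are placeholders, and a real build usually needs more options for the sysroot and C library):

../binutils-2.40/configure --target=arm-none-eabi --prefix="$PREFIX"
make && make install
../gcc-13.2.0/configure --target=arm-none-eabi --prefix="$PREFIX" \
    --enable-languages=c --with-newlib
make && make install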

Why do we need cmake?

I don't understand why we need cmake to build libraries. I am sorry if my question is stupid, but I need to use some libraries on Windows, and whatever library I choose, I need to build it and/or compile it with cmake. What is it for? Why can't I just #include "path" the things that I need into my project, so that they are compiled/built at the same time as my project?
And also, sometimes I needed to install Ruby, Perl, and Python (each in some specific version) so cmake could build the libraries... Why do I need those programs, and will I need them only to build the library, or later in my project too? (Concretely: can I uninstall those programs after building the libraries?)
Building things in C++ on different platforms is currently a mess.
There are several different build systems out there and no standard way to do this. Just providing a Visual Studio solution won't help compilation on Linux or Mac.
If you add a makefile for Linux or Mac, you have to repeat the configuration between the solution and the makefiles, which can result in a lot of maintenance overhead. Also, makefiles are not really a good build tool compared to the newer ones out there.
That the libraries you need all use CMake is mostly a coincidence, although CMake is a popular choice at the moment.
There are several solutions out there to unify builds. CMake is a build tool of a particular kind: it can create makefiles and build through them, but you can also tell cmake to create a Visual Studio solution if you like.
The same goes for the external programs: they are the choice of the maintainer of the library you use, and there are no standards for things like code generation.
CMake may not be "the" solution (although the upcoming Visual Studio 2015 is integrating CMake support), but the trend among cross-platform build systems is moving more and more in this direction.
As to your question why you cannot just include the header:
Few libraries are header-only; the rest need to be compiled. Either you get precompiled libs/DLLs and just include the header and add the linker path. This is easier on Linux, because -dev packages install a prebuilt library and its headers via the package manager; Windows has no such thing natively.
Or you have to build the library yourself with whatever build tool it uses.
The short answer is that you don't, but it would probably be difficult to build the project without it.
CMake does not build code, but is instead a build-file generator. It was developed by Kitware (during the ITK project around 2000) to make building code across multiple platforms "simpler". It's not an easy language to use (which Kitware openly admits), but it unifies several things that Windows, Mac, and Linux do differently when building code.
On Linux, autoconf is typically used to make build files, which are then compiled by gcc/g++ (and/or clang)
On Windows, you would typically use the Visual Studio IDE and create what they call a "Solution" that is then compiled by msvc (the Microsoft Visual C++ compiler)
On Mac, I admit I am not familiar with the compiler used, but I believe it is something to do with Xcode
CMake lets you write a single script you can use to build on multiple machines and specify different options for each.
Like C++, CMake has been divided between traditional/old-style CMake (version < 3.x) and modern CMake (version >= 3.0). Use modern CMake. The following are excellent tutorials:
Effective CMake, by Daniel Pfeifer, C++Now 2017*
Modern CMake Patterns, by Mathieu Ropert, CppCon 2017
Better CMake
CMake Tutorial
*Awarded the most useful talk at the C++Now 2017 Conference
Watch these in the order listed. You will learn what Modern CMake looks like (and old-style CMake) and gain understanding of how
CMake helps you specify build order and dependencies, and
Modern CMake helps prevent creating cyclic dependencies and common bugs while scaling to larger projects.
Additionally, the last video introduces package managers for C++ (useful when using external libraries, like Boost, where you would use the CMake find_package() command), of which the two most common are:
vcpkg, and
Conan
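As a hypothetical quick start with vcpkg (the package name is illustrative), you install a library once and then point CMake at vcpkg's toolchain file so that find_package() can locate it:

vcpkg install boost-filesystem
cmake -B build -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"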
In general,
Think of targets as objects
a. There are two kinds, executables and libraries, which are "constructed" with
add_executable(myexe ...) # Creates an executable target "myexe"
add_library(mylib ...) # Creates a library target "mylib"
Each target has properties, which are variables for the target. However, they are specified with underscores, not dots, and (often) use capital letters
myexe_FOO_PROPERTY # Foo property for myexe target
Functions in CMake can also set some properties on target "objects" (under the hood) when run
target_compile_definitions()/features()/options()
target_sources()
target_include_directories()
target_link_libraries()
CMake is a command language, similar to shell scripting, but there's no nesting or piping of commands. Instead
a. Each command (function) is on its own line and does one thing
b. The argument(s) to all commands (functions) are strings
c. Each target_* command takes the name of the target it applies to as its first argument
add_executable(myexe ...) # Create exe target
target_compile_definitions(myexe PRIVATE ...) # Applies to "myexe"
target_include_directories(myexe PUBLIC ...) # Applies to "myexe"
# ...etc.
add_library(mylib ...) # Create lib target
target_sources(mylib PRIVATE ...) # Applies to "mylib"
# ...etc.
d. Commands are executed in order, top to bottom (NOTE: if a target depends on another target, the dependency must be created first)
The scope of execution is the currently active CMakeLists.txt file. Additional files can be run (added to the scope) using the add_subdirectory() command
a. This operates much like the shell exec command; the current CMake environment (targets and properties, except PRIVATE properties) are "copied" over into a new scope ("shell"), where additional work is done.
b. However, the "environment" is not the shell environment (CMake target properties are not passed to the shell as environment variables like $PATH). Instead, the CMake language maintains all targets and properties in the top-level global scope CACHE
PRIVATE properties are used only by the current module. INTERFACE properties are passed only to modules that link against it. PUBLIC is both: the property applies to the current module and to modules that link against it.
target_link_libraries is for direct module dependencies, but it also resolves all transitive dependencies. This means when you link to a library, you get all the PUBLIC properties of the modules it depends on as well.
a. If you want to link to a library that has a direct path, you can use target_link_libraries, and
b. if you want to link to a target within the project and take its interface, you also use target_link_libraries
You run CMake on CMakeLists.txt files to generate the build files you want for your system (Ninja, Visual Studio solutions, Linux makefiles, etc.) and then run those to compile and link the code.
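In shell terms that workflow is two separate invocations (generator names as listed by cmake --help):

cmake -S . -B build -G "Unix Makefiles"   # generate: read CMakeLists.txt, write build files
cmake --build build                       # build: drive the generated build system

Swapping -G "Unix Makefiles" for -G Ninja or a Visual Studio generator changes the generated build files, not the CMakeLists.txt.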

c++ libs from ubuntu 16.04 repo - compiler options

Ubuntu 16.04 comes with GCC 5.4, which does support C++11, and it is the default compiler. C++11 is just not enabled by default in that particular version of GCC.
My intent is to use some of the binary libraries (not header-only) from its repository (e.g. Boost). In my projects I will enable C++11.
How were the C++ libraries from the repository compiled? Is it possible to use them with C++11 enabled? I know that C++ libraries can be called from different languages (Java, Python, C#, etc.) by hiding all the C++ stuff behind a plain C interface. With Boost that is not the case. If a certain function returns a string or a vector or anything from the STL, then it is a problem. AFAIK the binary representation of STL objects depends on compiler flags (e.g. -std=c++11).
Thank you.
Which exact libraries are you talking about?
If you are talking about the standard library, libstdc++ is part of gcc. It is always okay to link against it no matter which standard you compile at. gcc also made a decision to include ABI tags, so that it can stay ABI compatible between code compiled as C++11 and pre-C++11. See for instance TC's really nice answer to a question I asked here:
Is this simple C++ program using <locale> correct?
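To illustrate the ABI-tag point with a sketch (assuming the _GLIBCXX_USE_CXX11_ABI=1 default that Ubuntu 16.04's GCC 5.4 ships with): the std::string layout is selected by that macro, not by -std, so objects built at different standard levels still agree and can link against the same distro-built libraries:

g++ -std=c++98 -c old.cpp        # still the new ABI: _GLIBCXX_USE_CXX11_ABI defaults to 1
g++ -std=c++11 -c new.cpp
g++ old.o new.o -lboost_system   # both objects agree on the std::string layout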
If by
How were the C++ libraries from the repository compiled?
you mean, how were all of the C++ libraries in the Ubuntu repositories compiled, the answer is: it may be different for each one.
For instance if you want to use libfreetype6-dev or libsdl2-dev, these are C libraries, they will be okay to link to no matter what standard you target.
If you want to use libsilly-dev from CEGUI, that is a C++ library, and it is usually best to use the exact same compiler for your project and the C++ lib that you are linking to. If it appears in the Ubuntu repository, you can assume it was built with the default g++ version that Ubuntu is shipping. If you need to use a different compiler, it's probably best to build the C++ lib yourself -- in general C++ is not ABI stable across different compilers, or even different versions of the same compiler.
If you want to use compiled boost libraries, it's probably best to use the libs they give you and use the compiler they give you. If you only use header-only boost, then the compiler doesn't matter since you don't actually have to link with something they built. So you then have more flexibility with respect to compilers.
Often, if you need to use C++ libraries, it's best to integrate their build system into yours so that it can be easily rebuilt from source and you only have to configure the compiler once. (At least in my experience.) This can save a lot of time when you decide to upgrade compilers later. If you use cmake then it's often feasible, but sometimes this can be hard, especially if you have a lot of C++ dependencies. If you don't use cmake, well, many libraries use cmake and it won't be that easy to integrate them this way. cmake is still kind of a pain anyways, so this might not be such a loss.
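If you do end up building Boost's compiled libraries yourself with a specific compiler, the usual sequence looks roughly like this (toolset and flags are illustrative):

./bootstrap.sh --with-toolset=gcc
./b2 toolset=gcc cxxflags="-std=c++11" --with-filesystem --with-system stage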

How do I translate CIL to LLVM IR?

I want to compile C# to LLVM IR, so I think translating compiled CIL to LLVM IR is one way I can try.
There are some tools I could use, such as vmkit and mono-llvm.
Is anybody using these tools? Or how else can I translate CIL to LLVM IR?
The answer depends on your goals. Why do you want to translate C# to LLVM?
VMKit was designed as a framework for building virtual machine implementations. I believe it had some support for the CLR at one point, but that support has since stagnated in favor of its JVM implementation. Its purpose is to make building a VM from scratch easier.
Mono-llvm is a project that replaces the Mono JIT backend with an LLVM backend. Its goal is to improve the performance of JIT-compiled code on Mono.
If your goal is to use Mono, with better performance, mono-llvm is a good choice.
If you want to build an entire VM from scratch, then VMKit might work.
If you are just looking to implement an ahead-of-time compiler that produces executables with no CLR dependencies, you can just download the LLVM core libraries from:
http://llvm.org/
Basically, your compiler would translate the CIL into a textual representation of LLVM IR and then use the LLVM APIs to compile it to native machine code.
I don't know if LLVM will generate object files for you. You may have to generate them yourself, but that's pretty easy. It's basically just stuffing the machine code into a data structure, building up string, section, and symbol tables, and then serializing everything to disk.
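For what it's worth, llc can emit object files directly (the same llc -filetype=obj invocation shown in the answer below):

llc -filetype=obj output.ll -o output.o   # LLVM IR -> native object file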
To get LLVM IR code from CIL you can use the tool il2bc (also known as C# Native), which you can download from http://csnative.codeplex.com/.
You just need to perform some simple steps.
Il2Bc.exe <Your DLL>.dll
If you want to generate an executable from it, you need to compile the generated .ll file (LLVM IR Code).
For example, suppose you have your "Hello World" app.
Compile it (this will generate a helloworld.ll file):
Il2Bc.exe helloworld.cs /corelib:CoreLib.dll
Generate the LLVM IR file for the core library (this will generate a corelib.ll file):
Il2Bc.exe CoreLib.dll
Finally, generate the EXE file itself:
llc -filetype=obj -mtriple=i686-w64-mingw32 CoreLib.ll
llc -filetype=obj -mtriple=i686-w64-mingw32 helloworld.ll
g++ -o helloworld.exe helloworld.obj CoreLib.obj -lstdc++ -lgc-lib -march=i686 -L .
I understand the question to be that you want to use LLVM to compile C# ahead of time, in the same way that GCC can compile Java using gcj?
LLVM used to have an option to output CIL directly from whatever front end you used (so in theory you could do C/C++ to CIL). The following command:
llc -march=msil
would output CIL from (in theory) any supported LLVM front end.
Going from C# or CIL to LLVM IR hasn't been done yet (or at least finished). You'd need a C# front-end.
VMKit had some kind of C# front-end scaffolding. Support was never feature-complete and interest has since faded; they've moved to just supporting Java. You might try their source repository and see whether any remnants of their early C# work can be reworked into a full C# front end.
Also note that you can write your own C# to LLVM IR compiler in C# (using Mono or whatever) and use P/Invoke to call into the LLVM libraries and create LLVM IR. There is some good information out there, such as Writing Your Own Toy Compiler Using Flex, Bison and LLVM.
This area is also getting interesting now that the compiler as a service (Roslyn) project has had its first couple of CTP releases, and Mono has its Mono.CSharp project. Though I think Roslyn is a bit more feature-rich.
