How is C++ compiled - compilation

I am working on some (very) low level programming but not everything is completely clear to me. I start by creating a .cpp (or .c) file which is run through gcc to create an elf or object file but what is an object file? I get object files when I use the "as" compiler but how are these used and what is the purpose of having an object file when we could have a straight binary?

There is a very clear explanation of this question on this site. I have pasted it below as well, but I strongly suggest you take a look at the diagram on the website; it will give you a much better high-level understanding of what is going on.
Compiling a source code file in C++ is a four-step process. For example, if you have a C++ source code file named prog1.cpp and you execute the compile command
g++ -Wall -ansi -o prog1 prog1.cpp
the compilation process looks like this:
The C++ preprocessor copies the contents of the included header files into the source code file, generates macro code, and replaces symbolic constants defined using #define with their values.
The expanded source code file produced by the C++ preprocessor is compiled into the assembly language for the platform.
The assembler code generated by the compiler is assembled into the object code for the platform.
The object code file generated by the assembler is linked together with the object code files for any library functions used to produce an executable file.
By using appropriate compiler options, we can stop this process at any stage.
To stop the process after the preprocessor step, you can use the -E option:
g++ -E prog1.cpp
The expanded source code file will be printed on standard output (the screen by default); you can redirect the output to a file if you wish. Note that the expanded source code file is often incredibly large - a 20 line source code file can easily produce an expanded file of 20,000 lines or more, depending on which header files were included.
To stop the process after the compile step, you can use the -S option:
g++ -Wall -ansi -S prog1.cpp
By default, the assembler code for a source file named filename.cpp will be placed in a file named filename.s.
To stop the process after the assembly step, you can use the -c option:
g++ -Wall -ansi -c prog1.cpp
By default, the object code for a source file named filename.cpp will be placed in a file named filename.o
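To make the preprocessor step concrete, here is a minimal sketch of what a file such as prog1.cpp might contain. The names PI, SQUARE, circle_area and square_of_sum are made up for illustration and do not come from the original answer. Running g++ -E on it would show the #define lines gone and their uses replaced by the expanded text, as noted in the comments.

```cpp
#include <cassert>

// Hypothetical contents for prog1.cpp (illustrative names only).
#define PI 3.14159                 // symbolic constant
#define SQUARE(x) ((x) * (x))      // function-like macro

// After preprocessing, the return statement reads:
//     return 3.14159 * ((radius) * (radius));
double circle_area(double radius) {
    assert(radius >= 0.0);         // assert is itself a macro from <cassert>
    return PI * SQUARE(radius);
}

// The macro argument is pasted in textually, so this becomes
//     return ((a + b) * (a + b));
int square_of_sum(int a, int b) {
    return SQUARE(a + b);
}
```

Compiling this with -c produces an object file holding the machine code for the two functions. Since there is no main(), it cannot become an executable on its own; the link step has to combine it with another object file that provides one.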

Related

Does "-Wl,-soname" work on MinGW or is there an equivalent?

I'm experimenting a bit with building DLLs on windows using MINGW.
A very good summary (in my opinion) can be found at:
https://www.transmissionzero.co.uk/computing/building-dlls-with-mingw/
There is even a basic project which can be used for the purpose of this discussion:
https://github.com/TransmissionZero/MinGW-DLL-Example/releases/tag/rel%2Fv1.1
Note there is a cosmetic mistake in this project which will make it fail out of the box: the Makefile does not create an "obj" directory - Either adjust the Makefile or create it manually.
So here is the real question.
How to change the Windows DLL name so it differs from the actual DLL file name ??
Essentially I'm trying to achieve on Windows, the effect which is very well described here on Linux:
https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
Initially I tried changing "InternalName" and "OriginalFilename" in the resource file used to create the DLL but that does not work.
In a second step, I tried adding "-Wl,-soname,SoName.dll" on the command that performs the final link, to change the Windows DLL name.
However, that does not seem to have the expected effect (I'm using MingW 7.3.0, x86_64-posix-seh-rev0).
Two things make me say that:
1/ The test executable still works (I would expect it to fail, because it tries to locate SoName.dll but can't find it).
2/ "pexports.exe AddLib.dll" produces the output below, where the library name hasn't changed:
LIBRARY "AddLib.dll"
EXPORTS
Add
bar DATA
foo DATA
Am I doing anything wrong ? Are my expectations wrong perhaps ?
Thanks for your help !
David

First of all, I would like to say it's important to use either a .def file for specifying the exported symbols or use __declspec(dllexport) / __declspec(dllimport), but never mix these two methods. There is also another method using the -Wl,--export-all-symbols linker flag, but I think that's ugly and should only be used when quick and dirty is what you want.
It is possible to tell MinGW to use a DLL filename that does not match the library name. In the link step use -o to specify the DLL and use -Wl,--out-implib, to specify the library file.
Let me illustrate by showing how to build chebyshev as both a static and a shared library. Its sources consist of only 2 files: chebyshev.h and chebyshev.c.
1. Compile
gcc -c -o chebyshev.o chebyshev.c -I. -O3
2. Create static library
ar cr libchebyshev.a chebyshev.o
3. Create a .def file (as it wasn't supplied and __declspec(dllexport) / __declspec(dllimport) wasn't used either). Note that this file doesn't contain a LIBRARY line, which allows the linker to specify the DLL filename later.
There are several ways to do this if the .def file wasn't supplied by the project:
3.1. Get the symbols from the .h file(s). This may be hard as sometimes you need to distinguish for example between type definitions (like typedef, enum, struct) and actual functions and variables that need to be exported;
echo "EXPORTS" > chebyshev.def
sed -n -e "s/^.* \**\(chebyshev_.*\) *(.*$/\1/p" chebyshev.h >> chebyshev.def
3.2. Use nm to list symbols in the library file and filter out the type of symbols you need.
echo "EXPORTS" > chebyshev.def
nm -f posix --defined-only -p libchebyshev.a | sed -n -e "s/^_*\([^ ]*\) T .*$/\1/p" >> chebyshev.def
4. Link the static library into the shared library.
gcc -shared -s -mwindows -def chebyshev.def -o chebyshev-0.dll -Wl,--out-implib,libchebyshev.dll.a libchebyshev.a
If you have a project that uses __declspec(dllexport) / __declspec(dllimport) things are a lot easier. And you can even have the link step generate a .def file using the -Wl,--output-def, linker flag like this:
gcc -shared -s -mwindows -o myproject.dll -Wl,--out-implib,myproject.dll.a -Wl,--output-def,myproject.def myproject.o
This answer is based on my experiences with C. For C++ you really should use __declspec(dllexport) / __declspec(dllimport).
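As a rough sketch of the __declspec approach for C++, the usual pattern is a macro that expands to dllexport while the DLL itself is being built and to dllimport for code that uses it. ADDLIB_API and ADDLIB_BUILD are hypothetical names chosen here; Add is borrowed from the example project in the question. In a real project the macro block and the declaration would live in a header, and the definition in a .cpp file.

```cpp
// Normally passed as -DADDLIB_BUILD when compiling the DLL's own sources;
// defined inline here only so the sketch compiles as a single file.
#define ADDLIB_BUILD

#if defined(_WIN32)
#  ifdef ADDLIB_BUILD
#    define ADDLIB_API __declspec(dllexport)   // building the DLL
#  else
#    define ADDLIB_API __declspec(dllimport)   // using the DLL
#  endif
#else
#  define ADDLIB_API                           // no-op on other platforms
#endif

// extern "C" keeps the exported name free of C++ name mangling,
// so C callers and .def files can refer to it simply as Add.
extern "C" ADDLIB_API int Add(int a, int b);

extern "C" ADDLIB_API int Add(int a, int b) {
    return a + b;
}
```

With this in place no .def file has to be written by hand; the linker exports exactly the symbols marked with the macro.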

I believe I have found one mechanism to achieve on Windows, the effect described for Linux in https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
This involves dlltool.
In the example Makefile there was originally this line:
gcc -o AddLib.dll obj/add.o obj/resource.o -shared -s -Wl,--subsystem,windows,--out-implib,libaddlib.a
I simply replaced it with the 2 lines below instead:
dlltool -e obj/exports.o --dllname soname.dll -l libAddLib.a obj/resource.o obj/add.o
gcc -o AddLib.dll obj/resource.o obj/add.o obj/exports.o -shared -s -Wl,--subsystem,windows
Really, the key seems to be the creation with dlltool of an exports file in conjunction with --dllname. This exports file is linked with the object files that make up the body of the DLL and it handles the interface between the DLL and the outside world. Note that dlltool also creates the "import library" at the same time.
Now I get the expected effect, and I can see that the "Internal DLL name" (not sure what the correct terminology is) has changed:
First evidence:
>> dlltool.exe -I libAddLib.a
soname.dll
Second evidence:
>> pexports.exe AddLib.dll
LIBRARY "soname.dll"
EXPORTS
Add
bar DATA
foo DATA
Third evidence:
>> AddTest.exe
Error: the code execution cannot proceed because soname.dll was not found.
Although the desired effect is achieved, this still seems to be some sort of workaround. My understanding (but I could well be wrong) is that the gcc option "-Wl,-soname" should achieve exactly the same thing. At least it does on Linux, but is this broken on Windows perhaps ??

Configure clang-check for c++ standard libraries

I am trying to run Ale as my linter, which in turn uses clang-check to lint my code.
$ clang-check FeatureManager.h
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "FeatureManager.h"
No compilation database found in /home/babbleshack/ or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
/home/babbleshack/FeatureManager.h:6:10: fatal error: 'unordered_map' file not found
#include <unordered_map>
^~~~~~~~~~~~~~~
1 error generated.
Error while processing /home/babbleshack/FeatureManager.h.
Whereas compiling with clang++ returns only a warning.
$ clang++ -std=c++11 -Wall FeatureManager.cxx FeatureManager.h
clang-5.0: warning: treating 'c-header' input as 'c++-header' when in C++ mode, this behavior is deprecated [-Wdeprecated]
There are no flags to clang-check allowing me to set compilation flags.

Took a while to figure this out, but you can do
clang-check file.cxx -- -Wall -std=c++11 -x c++
or if you are using clang-tidy
clang-tidy file.cxx -- -Wall -std=c++11 -x c++
To get both working with ALE, I added the following to my vimrc
let g:ale_cpp_clangtidy_options = '-Wall -std=c++11 -x c++'
let g:ale_cpp_clangcheck_options = '-- -Wall -std=c++11 -x c++'
If you want ALE to work for C as well, you will have to do the same for g:ale_c_clangtidy_options and g:ale_c_clangcheck_options.

I was getting stumped by a similar error message for far too long:
/my/project/src/util.h:4:10: error: 'string' file not found [clang-diagnostic-error]
#include <string>
^
I saw other questions suggesting that I was missing some critical package, but everything already seemed to be installed (and my code built just fine, it was only clang-tidy that was getting upset).
Passing -v showed that my .h file was being handled differently:
$ clang-tidy ... src/*.{h,cc} -- ... -v
...
clang-tool ... -main-file-name util.cc ... -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 ... -x c++ ... /tmp/copy/src/util_test.cc
...
clang-tool ... -main-file-name util.h ... -x c-header /my/project/src/util.h
...
As Kris notes the key distinction is the -x c-header flag, which is because clang assumes a .h file contains C, not C++, and this in turn means that the system C++ includes weren't being used to process util.h.
But the -main-file-name flag also stood out to me as odd; why would a header file ever be the main file? While digging around I also came across this short but insightful answer that header files shouldn't be directly compiled in the first place! Using src/*.cc instead of src/*.{h,cc} avoids the problem entirely by never asking Clang to try to process a .h on its own in the first place!
This does introduce one more wrinkle, though. Errors in these header files won't be reported by default, since they're not the files you asked clang-tidy to look at. This is where the "Use -header-filter=.* to display errors from all non-system headers." message clang-tidy prints comes in. If I pass -header-filter=src/.* (to only include my src headers and not any other header files I'm including with -I) I see the expected errors in my header files. Phew!
I'm not sure whether to prefer -x c++ or -header-filter=.* generally. A downside of -header-filter is you have to tune the filter regex, rather than just passing in the files you want to check. But on the other hand processing header files in isolation is essentially wasteful work (that I expect would add up quickly in a larger project).
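One way to make this failure mode obvious is to have the header itself complain when it is parsed as C. The sketch below is a guess at what a header like FeatureManager.h could look like; the class interface (enable, is_enabled) is invented for illustration and is not taken from the question.

```cpp
#ifndef FEATURE_MANAGER_H
#define FEATURE_MANAGER_H

// When a tool treats this file as a C header (-x c-header), __cplusplus is
// not defined, so parsing stops here with a readable message instead of
// "'unordered_map' file not found".
#ifndef __cplusplus
#error "FeatureManager.h is a C++ header; pass -x c++ after the -- separator"
#endif

#include <string>
#include <unordered_map>

class FeatureManager {
public:
    // Turn a named feature on.
    void enable(const std::string& name) { flags_[name] = true; }

    // Unknown features count as disabled.
    bool is_enabled(const std::string& name) const {
        auto it = flags_.find(name);
        return it != flags_.end() && it->second;
    }

private:
    std::unordered_map<std::string, bool> flags_;
};

#endif // FEATURE_MANAGER_H
```

The guard does not change how the tools should be invoked; it only turns a confusing missing-include error into a message that names the actual cause.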

Don't understand gcc that well, but I can't find why it's not working

I'm trying to compile a simple "hello world"
file_name
#include <stdio.h>
int main (void) {
printf ("Hello World\n");
return 0;
}
then I try: gcc file_name and I get "File not recognized. File format not recognized"
I however am 100% sure I did the exact same thing a few weeks back (just to see if it works, as now) and it worked, so I just don't get it.
gcc -ver // returns 4.6.1 if this helps
Also how is gcc -o supposed to work ? The manual (man gcc) is just gibberish at times (for me)

Let's say your program is saved as helloworld.c. Typing gcc -o myprog helloworld.c would compile helloworld.c into myprog. That way, when you want to run the program, all you type in the command line is ./myprog

gcc tries to guess the language used (e.g. C or C++) based on the extension of the file, so you need to ensure you have the proper file extension (usually .cpp for C++ and .c for C source files). Alternatively, the -x option (for example gcc -x c file_name) states the language explicitly, regardless of the extension.
As for the "-o" command line parameter: the name specified after that option is the name of the output file. With -c, that is the object file created from the compiled source file. Without -c, gcc also links, so the output is the executable (named a.out if -o is omitted).
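The extension-based guess can be pictured roughly as below. This is a simplified sketch, not gcc's actual implementation: guess_language is a hypothetical helper that covers only a handful of the suffixes listed in the gcc manual and returns the matching -x language name. Anything it does not recognise is handed to the linker as if it were an object file, which is where a "file format not recognized" message for a file called file_name comes from.

```cpp
#include <string>

// Hypothetical helper: map a file name to the language gcc would assume.
std::string guess_language(const std::string& path) {
    // Look at the last path component only.
    const std::string::size_type slash = path.find_last_of('/');
    const std::string name =
        (slash == std::string::npos) ? path : path.substr(slash + 1);

    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos) {
        return "linker input";               // no suffix at all
    }
    const std::string ext = name.substr(dot + 1);

    if (ext == "c") return "c";
    if (ext == "cc" || ext == "cpp" || ext == "cxx" || ext == "C") return "c++";
    if (ext == "s") return "assembler";
    if (ext == "S") return "assembler-with-cpp";
    return "linker input";                   // .o, .a, or anything unknown
}
```

In this picture, gcc file_name is treated as linker input, while renaming the file to file_name.c or passing -x c selects the C compiler.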

Using gfortran with libraries

I have a legacy code written using fortran 77. I'm trying to build it with gfortran. But I seem to be failing at the stage where I include the libraries in the build. The dozens of *.f source files compile fine, but when they are being linked, I get a bunch of "undefined reference" errors all relating to subroutines and functions that are defined in my libraries. I already ran the makefile for the libraries first, so the variables I need should all be exported. I'm playing with the "-L" option, but can't get it to work as desired.
First, here's my syntax of the linking line in my makefile:
$(PROGRAM): $(SRCS) $(LIBS)
$(FC) $(FLFLAGS) -o $@ $+ -L$(DIRLIB)
PROGRAM is the program name, SRCS are all the compiled source files, LIBS is set to two different files - an archive file (file.a) and a file.o file.
FC is gfortran, I don't have any specific linking flags for FLFLAGS as of now, and DIRLIB is the main directory of the libraries.
The thing is that my *.o files that resulted from building my libraries don't reside in just the main directory, DIRLIB. DIRLIB contains several directories, all with their own *.o files that are needed by my code.
I tried adding each individual directory after the -L option (e.g. DIRLIB/DIR1/*.o DIRLIB/DIR2/*.o DIRLIB/DIR3/*.o), but I eventually start getting errors that some subroutines are multiply defined.
All this business of user-defined libraries and archive files just confuses me and I'm pretty new to making makefiles in the first place, so I'm just taking a shot in the dark here that somebody might be able to help me shed some light on this.

Libraries need to come after the .o files that reference them in the linking command.
I'm guessing the object file in LIBS comes after the library, but needs some of the procedures from it. Can you show the command that is actually run (with all variables expanded), to confirm this?
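The reason order matters can be sketched with a toy model of a traditional single-pass linker such as GNU ld in its default mode: inputs are scanned left to right, and a member of an archive is pulled in only if it defines a symbol that is undefined at that moment. The types and the function unresolved_after_link below are invented for this illustration, and the model ignores many real-world details (weak symbols, shared libraries, --start-group).

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One object file: the symbols it defines and the ones it still needs.
struct ObjectFile {
    std::set<std::string> defines;
    std::set<std::string> needs;
};

// One item on the link line: a plain .o (one member, always linked)
// or a .a archive (members linked only on demand).
struct LinkInput {
    bool is_archive;
    std::vector<ObjectFile> members;
};

// Returns the symbols left undefined after one left-to-right pass.
std::set<std::string> unresolved_after_link(const std::vector<LinkInput>& inputs) {
    std::set<std::string> defined;
    std::set<std::string> undefined;

    auto link_object = [&](const ObjectFile& obj) {
        for (const auto& sym : obj.defines) {
            defined.insert(sym);
            undefined.erase(sym);
        }
        for (const auto& sym : obj.needs) {
            if (defined.count(sym) == 0) undefined.insert(sym);
        }
    };

    for (const auto& input : inputs) {
        if (!input.is_archive) {
            for (const auto& obj : input.members) link_object(obj);
            continue;
        }
        // Archive: keep pulling members that satisfy a pending reference.
        std::vector<bool> pulled(input.members.size(), false);
        bool progress = true;
        while (progress) {
            progress = false;
            for (std::size_t i = 0; i < input.members.size(); ++i) {
                if (pulled[i]) continue;
                bool wanted = false;
                for (const auto& sym : input.members[i].defines) {
                    if (undefined.count(sym) != 0) { wanted = true; break; }
                }
                if (wanted) {
                    link_object(input.members[i]);
                    pulled[i] = true;
                    progress = true;
                }
            }
        }
    }
    return undefined;
}
```

If an object file needs a routine that lives in an archive, listing the object first and the archive second leaves nothing unresolved. Listing the archive first leaves the routine undefined, because nothing had asked for it yet when the archive was scanned.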

I tried to build this code again using the library. It worked this time. I'm pretty sure I'm doing the same thing in my makefile as I did before, so it must be related to the library I had. Maybe somebody altered it along the way and inadvertently broke it. But I got a fresh clean copy of the library. My steps are to:
1) run the makefile for the library source files; it creates a library.a archive file
2) run my code makefile:
it has a line to specify the location of this archive file and assign it to "DIRLIB"
DIRLIB := ../library
then the linking command of the makefile becomes
$(FC) $(FLFLAGS) -o $@ $+ -L$(DIRLIB) -lskit
FC is my compiler, FLFLAGS are my linking flags, -L is the option specifying the directory to search for libraries, and -lskit is the crucial option: it tells the linker to look in those directories for the archive libskit.a and link against it. Without the -lskit option, I get many undefined reference errors. It may have been that last time I was not including this -lskit option at the end.

How to force gcc to link like g++?

In this episode of "let's be stupid", we have the following problem: a C++ library has been wrapped with a layer of code that exports its functionality in a way that allows it to be called from C. This results in a separate library that must be linked (along with the original C++ library and some object files specific to the program) into a C program to produce the desired result.
The tricky part is that this is being done in the context of a rigid build system that was built in-house and consists of literally dozens of include makefiles. This system has a separate step for the linking of libraries and object files into the final executable but it insists on using gcc for this step instead of g++ because the program source files all have a .c extension, so the result is a profusion of undefined symbols. If the command line is manually pasted at a prompt and g++ is substituted for gcc, then everything works fine.
There is a well-known (to this build system) make variable that allows flags to be passed to the linking step, and it would be nice if there were some incantation that could be added to this variable that would force gcc to act like g++ (since both are just driver programs).
I have spent quality time with the gcc documentation searching for something that would do this but haven't found anything that looks right, does anybody have suggestions?

Given such a terrible build system, write a wrapper around gcc that execs gcc or g++ depending on the arguments. Replace /usr/bin/gcc with this script, or modify your PATH to use this script in preference to the real binary.
#!/bin/sh
# "wibble wobble" is a placeholder for whatever argument identifies a plain C link.
if [ "$1" = "wibble wobble" ]
then
exec /usr/bin/gcc-4.5 "$@"
else
exec /usr/bin/g++-4.5 "$@"
fi

The problem is that C linkage produces object files with C name mangling, and that C++ linkage produces object files with C++ name mangling.
Your best bet is to use
extern "C"
before declarations in your C++ builds, and no prefix on your C builds.
You can detect C++ using
#ifdef __cplusplus
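A minimal sketch of such a wrapper layer is shown below, assuming a made-up C++ class. The names counter, counter_create, counter_add, counter_value, counter_destroy and wrapped::Counter are all hypothetical; a real project would split the first part into a header shared by C and C++ and the rest into a .cpp file.

```cpp
#include <cassert>
#include <new>

// --- What would go in a header shared by C and C++ callers ---
#ifdef __cplusplus
extern "C" {
#endif

typedef struct counter counter;          /* opaque handle for C code */
counter* counter_create(void);
void     counter_add(counter* c, int amount);
int      counter_value(const counter* c);
void     counter_destroy(counter* c);

#ifdef __cplusplus
}  // extern "C"
#endif

// --- What would go in the C++ wrapper source file ---
namespace wrapped {
// Stand-in for the real C++ library class being wrapped.
class Counter {
public:
    void add(int amount) { total_ += amount; }
    int value() const { return total_; }
private:
    int total_ = 0;
};
}  // namespace wrapped

struct counter {
    wrapped::Counter impl;
};

// The declarations above already gave these names C linkage, so the
// definitions are emitted with unmangled symbol names.
counter* counter_create(void) {
    return new (std::nothrow) counter;   // nullptr on failure, no exception
}

void counter_add(counter* c, int amount) {
    assert(c != nullptr);
    c->impl.add(amount);
}

int counter_value(const counter* c) {
    assert(c != nullptr);
    return c->impl.value();
}

void counter_destroy(counter* c) {
    delete c;
}
```

Because the wrapper still relies on C++ runtime facilities such as operator new, a final link driven by gcc rather than g++ needs the C++ runtime library added explicitly.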

Many thanks to bmargulies for his comment on the original question. By comparing the output of running the link line with both gcc and g++ using the -v option and doing a bit of experimenting, I was able to determine that "-lstdc++" was the magic ingredient to add to my linking flags (in the appropriate order relative to other libraries) in order to avoid the problem of undefined symbols.
For those of you who wish to play "let's be stupid" at home, I should note that I have avoided any use of static initialization in the C++ code (as is generally wise), so I wasn't forced to compile the translation unit containing the main() function with g++ as indicated in item 32.1 of FAQ-Lite (http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html).
