Linking C++ against Boost in an R package

I'm making an R package on data flows in networks. For speed, some of the code is written in C++, and with my own implementation of graph algorithms. I'd like to re-write my code to use the Boost Graph Library.
What would I need to put in the Makevars file to set the compiler/linker option to find boost? Sorry, I'm not very good with Make.
I'm working in a Linux environment.
Yes, I looked at RBGL, but did not find a makefile in that package.
If it matters, I'm using Rcpp to interface the R and C++ code.

It's not that hard. Every Rcpp package has a default src/Makevars, which contains:
## Use the R_HOME indirection to support installations of multiple R version
PKG_LIBS = `$(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()"`
where the linker flags for Rcpp itself are found dynamically by asking R. Just extend this line by appending
-lboost_graph
(and/or maybe the parallel or mt variants). If the libraries are installed in an unusual place, also add -L... flags. Ditto for PKG_CPPFLAGS, which takes the -I... flags for header locations.
That's about it. See the Writing R Extensions manual for more details on building R packages.


How to get the structure field data type in the GCC compiler source code and modify it?

If I have such a structure:
struct test{
float c,
f,
ops;
};
How can I modify the GCC compiler source code to make it as follows:
struct test{
double c,
f,
ops;
};
I have this requirement: I need to modify the GCC source code so that when it compiles a structure field of a certain mode, it changes that field's type to a specified type.
Thank you!
Your goal is very ambitious!
A possible approach could be to develop your own GCC plugin doing that job.
My recommendations:
Budget several months of your time for that work (and perhaps several years): at least 6 months full time to get a "proof of concept" thing which would fail on most code bases (in C). For C++, add another full year.
Read carefully the C11 standard n1570 (if you target C), and the C++11 standard n3337 (if you target C++). That effort alone may take you a full month.
Make your plugin open source software, and put its code quickly (e.g. under the LGPL license) on some repository such as GitHub or GitLab.
Target the latest available version of GCC. In September 2020, that means GCC 10. Plugin and GCC APIs change incompatibly from one version to the next. If you need to stick to GCC 8 specifically, be prepared to pay a substantial amount of money to companies like AdaCore.
Read carefully the documentation on GCC internals. You need to understand the GENERIC representation.
Study very carefully the gcc/tree.def, gcc/treestruct.def and gcc/gimple.def files of the GCC source code. You basically need to understand every line in them.
Study very carefully the gcc/passes.def file of the GCC source code. Again, you need to understand every line in that file.
Learn to compile GCC from its source code. You certainly want to build it with g++ -Wall -Wextra -g -O1.
Read this draft report.
Ask for help, in written English, on the gcc@gcc.gnu.org mailing list, but have some working plugin first.
Consider making a PhD out of this work. It is worth one. Alexandre Lissy in France got a PhD on a very similar topic.
If your code base is in C, consider using Frama-C (or Clang) and design your tool as a C to C transpiler.
Perhaps clever preprocessor tricks like #define float double followed by #undef float might be enough.

Where is __builtin_va_start defined?

I'm trying to locate where __builtin_va_start is defined in GCC's source code, and see how it is implemented. (I was looking for where va_start is defined and then found that this macro is defined as __builtin_va_start.) I used cscope -r in GCC 9.1's source code directory to search the definition but haven't found it. Can anyone point where this function is defined?
That __builtin_va_start is not defined anywhere. It is a GCC compiler builtin (a bit like sizeof is a compile-time operator). It is an implementation detail related to the <stdarg.h> standard header (provided by the compiler, not the C standard library implementation libc). What really matters are the calling conventions and ABI followed by the generated assembler.
GCC has special code to deal with compiler builtins. That code does not define the builtin; it implements its ad hoc behavior inside the compiler. __builtin_va_start is expanded into some GCC-specific internal representation of your compiled C/C++ code (some GIMPLE, perhaps).
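To make the user-facing side concrete, here is a minimal sketch of a variadic function. The name sum_ints is hypothetical. Per the question, GCC's <stdarg.h> defines va_start in terms of __builtin_va_start, so there is no library function behind it to step into; the compiler expands it in place according to the target ABI.

```cpp
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments passed variadically.
int sum_ints(int count, ...) {
    va_list args;
    // With GCC this macro is defined in terms of __builtin_va_start,
    // which the compiler itself expands; no function call is emitted.
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // fetch the next argument as an int
    }
    va_end(args);
    return total;
}
```

For instance, sum_ints(3, 1, 2, 3) returns 6. Compiling such a file with g++ -S and reading the generated assembly is one way to see what the builtin turned into on your platform.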
From a comment of yours, I infer that you are interested in implementation details. But that should be stated in your question.
If you study the GCC 9.1 source code, look inside gcc-9.1.0/gcc/builtins.c (the expand_builtin_va_start function there), and for other builtins inside gcc-9.1.0/gcc/c-family/c-cppbuiltin.c, gcc-9.1.0/gcc/cppbuiltin.c, and gcc-9.1.0/gcc/jit/jit-builtins.c.
You could write your own GCC plugin (as of Q2 2019, for GCC 9; the C++ code of your plugin might have to change for the future GCC 10) to add your own GCC builtins. BTW, you might even override the behavior of the existing __builtin_va_start with your own specific code, and/or you might have, at least for research purposes, your own stdarg.h header with #define va_start(v,l) __my_builtin_va_start(v,l) and have your GCC plugin understand your plugin-specific __my_builtin_va_start builtin. Be aware, however, of the GCC runtime library exception and read its rationale: I am not a lawyer, but I tend to believe that you should (and that legal document requires you to) publish your GCC plugin under some open source license.
You first need to read a textbook on compilers, such as the Dragon book, to understand that an optimizing compiler is mostly transforming internal representations of your compiled code.
You further need to spend months studying the many internal representations of GCC. Remember, GCC is a very complex program (of about ten million lines of code). Don't expect to understand it with only a few days of work. Look inside the GCC resource center website.
My dead GCC MELT project had references and slides explaining more of GCC (the design philosophy and architecture of GCC change slowly, so the concepts are still relevant even if individual details have changed). It took me almost ten years full time to partly understand some of the middle-end layers of GCC. I cannot transmit that knowledge in a StackOverflow answer.
My draft Bismon report (work in progress, funded by H2020, so a lot of bureaucracy) has a dozen pages (in its sections §1.3 and §1.4) introducing the internal representations of GCC.

GCC technical details

I don't know if this is the right place for things like this, but I am curious about a few aspects of the GCC front-end/back-end architecture:
I know I can compile .o files from C code and link them to C++ code, and I think I can do it the other way round, too. Does this work because the two languages are similar, or because the GCC back-end is really language-independent? Would this work with Ada code too? (I don't even know if that makes sense, since I don't know Ada or if it even has "functions", but the question is understood. If it makes no sense, think "Pascal" or even "my own custom language front-end".)
Where would garbage collection be implemented? Take a Java front-end, for example. The way I understand it, if compiling to a JVM back-end, the "platform" will take care of the GC, so the front-end need not do anything about it. But if compiling to native code, would the front-end send garbage-collecting GENERIC code to the back-end, or does it turn on some flag telling the back-end to produce garbage-collecting code? The first makes more sense to me, but that would mean the front-end produces different output based on the target, which seems to miss the point of GCC's front-end/back-end architecture.
Where would language-specific libraries go? For instance, the standard Java classes or standard C headers. If they are linked in at the end, then could a C program theoretically call functions from the Java library or something like that, since it is just another linked library?
Yes, the backend is at least reasonably language independent. Yes, it works with Ada.
GCJ generates native code which uses a runtime library. The garbage collector is part of the runtime library.
GCJ implements the CNI, which allows you to write code in C++ that can be used as native methods by Java code -- but being able to do this is a consequence of them having designed it in, not just an accidental byproduct of using the same back-end.
It is possible because the calling conventions are compatible, but name mangling differs (there is no mangling in C). To call a C function from C++, you should declare it with extern "C". To call a C++ function from C, you should declare it under its mangled name (and maybe with additional or different argument types). Calling Fortran code is possible in some cases too, but the argument-passing convention is different (pass by reference in Fortran).
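The linkage point can be sketched as follows. The names c_style_add, demo::add and call_both are hypothetical, and the C function is defined in the same file only to keep the example self-contained; normally it would sit in a separate .c file compiled by gcc and be linked in as a .o.

```cpp
// Declared with C linkage: the symbol is emitted (and looked up) as plain
// `c_style_add`, exactly as a C compiler would name it.
extern "C" int c_style_add(int a, int b);

// Ordinary C++ function: its symbol name is mangled to encode the namespace
// and parameter types (_ZN4demo3addEii under the Itanium C++ ABI that GCC
// uses on Linux).
namespace demo {
int add(int a, int b) { return a + b; }
}

// Stand-in for the C object file; in a real project this body would be
// compiled separately as C.
extern "C" int c_style_add(int a, int b) { return a + b; }

// C++ code can call both; only the symbol names seen by the linker differ.
int call_both(int a, int b) {
    return c_style_add(a, b) + demo::add(a, b);
}
```

Running nm on the resulting object file should show the unmangled c_style_add next to the mangled name for demo::add, which is the difference the linker cares about.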
There were actually converters from C++ to C (cfront) and from Fortran to C (f2c), and some solutions from them are still used.
Garbage collection is implemented in a run-time library, e.g. Boehm. The backend should generate objects compatible with the selected GC library.
The compiler driver (g++, gfortran, ...) will add language-specific libraries to the linking step.

Static library "interface"

Is there any way to tell the compiler (gcc/mingw32) when building an object file (lib*.o) to only expose certain functions from the .c file?
The reason I want to do this is that I am statically linking to a 100,000+ line library (SQLite), but am only using a select few of the functions it offers. I am hoping that if I can tell the compiler to only expose those functions, it will optimize out all the code of the functions that are never needed by the few I selected, thus drastically decreasing the size of the library.
I found several possible solutions:
This is what I asked about. It is the gcc equivalent of Windows' __declspec(dllexport):
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Code-Gen-Options.html (-fvisibility)
http://gcc.gnu.org/wiki/Visibility
I also discovered link-time code-generation. This allows the linker to see what parts of the code are actually used and get rid of the rest. Using this together with strip and -fwhole-program has given me drastically better results.
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html (see -flto and -fwhole-program)
Note: This flag only makes sense if you are not compiling the whole program in one call to gcc, which is what I was doing (making a sqlite.o file and then statically linking it in).
The third option which I found but have not yet looked into is mentioned here:
How to remove unused C/C++ symbols with GCC and ld?
That's probably the linker's job, not the compiler's. When linking it into a program (.exe), the linker will take care of pulling in only the relevant symbols; when linking a DLL, the __declspec(dllexport) mechanism is probably what you are looking for, or some flags of ld can help you (man ld).
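As a rough illustration of the visibility approach mentioned in the question, here is a minimal sketch. The macro and function names (API_EXPORT, API_LOCAL, internal_helper, public_entry) are hypothetical. As far as I know, the visibility attribute mainly matters for shared libraries on ELF targets, while MinGW/PE targets use __declspec(dllexport), so the macros are left empty on Windows here.

```cpp
// Hypothetical export macros; the names are not from any real library.
#if defined(__GNUC__) && !defined(_WIN32)
#  define API_EXPORT __attribute__((visibility("default")))
#  define API_LOCAL  __attribute__((visibility("hidden")))
#else
#  define API_EXPORT
#  define API_LOCAL
#endif

// Hidden: usable from other object files of the same library,
// but not exported from the final shared object.
API_LOCAL int internal_helper(int x) { return x * 2; }

// Default visibility: part of the library's public interface.
API_EXPORT int public_entry(int x) { return internal_helper(x) + 1; }
```

Compiling with -fvisibility=hidden makes hidden the default, so only the functions marked API_EXPORT stay exported. For the static-linking case in the question, the size reduction more likely comes from -ffunction-sections -fdata-sections together with -Wl,--gc-sections, or from -flto, since those let the linker drop code that is never referenced.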

question on C preprocessor definitions

Hi, I am currently installing a software package called SuperLU, and its README file has the following instruction for modifying a makefile depending on the system set-up.
C preprocessor definition CDEFS.
In the header file SRC/Cnames.h, we use macros to determine how
C routines should be named so that they are callable by Fortran.
(Some vendor-supplied BLAS libraries do not have C interfaces. So the
re-naming is needed in order for the SuperLU BLAS calls (in C) to
interface with the Fortran-style BLAS.)
The possible options for CDEFS are:
o -DAdd_: Fortran expects a C routine to have an underscore
postfixed to the name;
o -DNoChange: Fortran expects a C routine name to be identical to
that compiled by C;
o -DUpCase: Fortran expects a C routine name to be all uppercase.
A Makefile is provided in each subdirectory. The installation can be done
completely automatically by simply typing "make" at the top level.
I am not really sure what this instruction means. Which of these three options should I choose?
Just try building the software by running make at the top level. If there are linking problems because of missing BLAS functions, start experimenting with the underscore. So start with NoChange, then try Add_.
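To see what the three CDEFS options amount to, here is a minimal sketch of the kind of macro switch such a header could contain. This is not the actual content of SRC/Cnames.h; F77_NAME, STRINGIFY and blas_symbol_name are hypothetical names, and the #define Add_ line stands in for passing -DAdd_ on the compiler command line.

```cpp
#include <string>

// Stand-in for passing -DAdd_ in CDEFS; remove it to try the other branches.
#define Add_ 1

#if defined(Add_)
#  define F77_NAME(lower, upper) lower##_   // dgemm -> dgemm_
#elif defined(UpCase)
#  define F77_NAME(lower, upper) upper      // dgemm -> DGEMM
#else  // NoChange
#  define F77_NAME(lower, upper) lower      // dgemm -> dgemm
#endif

// Two-level stringification so that F77_NAME is expanded before quoting.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Returns the symbol name the C code would use for the BLAS routine dgemm.
std::string blas_symbol_name() {
    return STRINGIFY(F77_NAME(dgemm, DGEMM));
}
```

With Add_ defined, blas_symbol_name() returns "dgemm_". Which option is right depends on how the Fortran BLAS was compiled: gfortran, for example, appends an underscore to lowercase names by default, and running nm on the BLAS library shows the actual symbol names.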
