How to get the structure field data type in the GCC compiler source code and modify it? - gcc8

If I have such a structure:
struct test{
    float c,
          f,
          ops;
};
How can I modify the GCC compiler source code to make it as follows:
struct test{
    double c,
           f,
           ops;
};
I have a requirement to modify the GCC source code so that, when it compiles a structure of a certain mode, it changes its type to a specified type.
Thank you!

Your goal is very ambitious!
A possible approach could be to develop your own GCC plugin doing that job.
My recommendations:
budget several months of your time for that work (and perhaps several years) - at least 6 months full time to get a "proof of concept" thing which would fail on most code bases (in C). For C++, add another full year.
read carefully the C11 standard n1570 (if you target C), and the C++11 standard n3337 (if you target C++). That effort alone may take you a full month.
make your plugin open source software, and put its code quickly (e.g. under LGPL license) on some repository such as github or gitlab.
target the latest available version of GCC. In September 2020, that means GCC 10. Plugin and GCC APIs change incompatibly from one version to the next. If you need to stick to GCC 8 specifically, be prepared to pay a large amount of money to companies like AdaCore.
read carefully the documentation on GCC internals. You need to understand the GENERIC representation.
study very carefully the gcc/tree.def and gcc/treestruct.def and gcc/gimple.def files of GCC source code. You need to basically understand every line in them.
study very carefully the gcc/passes.def file of GCC source code. Again, you need to understand every line in that file.
learn to compile GCC from its source code. You certainly want to build it with g++ -Wall -Wextra -g -O1
read this draft report.
ask for help in written English on the gcc@gcc.gnu.org mailing list, but have some working plugin before.
consider making a PhD out of this work. It is worth one. Alexandre Lissy in France got a PhD on a very similar topic.
If your code base is in C, consider using Frama-C (or Clang) and design your tool as a C to C transpiler.
Perhaps clever preprocessor tricks like #define float double followed by #undef float might be enough.
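The preprocessor trick from the last recommendation could be sketched roughly as below. This is only a minimal illustration, assuming the structure definition can be wrapped between the #define and the #undef; the helper names test_field_size and plain_float_size are hypothetical and exist only to make the effect observable. Standard headers are included before the macro is defined, since redefining a keyword while including them is not permitted by ISO C.

```c
#include <stddef.h>

/* Temporarily make the token `float` expand to `double`.
   Standard headers are included above, never inside this window. */
#define float double
struct test {
    float c,
          f,
          ops;   /* all three members are now declared as double */
};
#undef float

/* Hypothetical helper: size of one member of struct test. */
size_t test_field_size(void)
{
    return sizeof(((struct test *)0)->c);
}

/* Hypothetical helper: outside the window, float is the real float again. */
size_t plain_float_size(void)
{
    return sizeof(float);
}
```

This only rewrites the spelling seen by the compiler in that region; code elsewhere that assumes the members are float (format strings, casts, binary file layouts) is not adjusted.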

Related

Where is __builtin_va_start defined?

I'm trying to locate where __builtin_va_start is defined in GCC's source code, and see how it is implemented. (I was looking for where va_start is defined and then found that this macro is defined as __builtin_va_start.) I used cscope -r in GCC 9.1's source code directory to search the definition but haven't found it. Can anyone point where this function is defined?
That __builtin_va_start is not defined anywhere. It is a GCC compiler builtin (a bit like sizeof is a compile-time operator). It is an implementation detail related to the <stdarg.h> standard header (provided by the compiler, not the C standard library implementation libc). What really matters are the calling conventions and ABI followed by the generated assembler.
GCC has special code to deal with compiler builtins. That code does not define the builtin; it implements its ad-hoc behavior inside the compiler. And __builtin_va_start is expanded into some compiler-specific internal representation of your compiled C/C++ code, specific to GCC (some GIMPLE perhaps).
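From the user side, the relationship between the <stdarg.h> macros and the builtins can be sketched as follows. This is a small illustration, not GCC's implementation: the function names sum_std and sum_builtin are made up, and the second function relies on GCC-specific (also accepted by Clang) builtins, so it is not portable C.

```c
#include <stdarg.h>

/* Sums `count` int arguments using the standard <stdarg.h> macros. */
int sum_std(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Same logic written directly against the compiler builtins that
   GCC's <stdarg.h> macros expand to. Compiler-specific, not ISO C. */
int sum_builtin(int count, ...)
{
    __builtin_va_list ap;
    int total = 0;

    __builtin_va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += __builtin_va_arg(ap, int);
    __builtin_va_end(ap);
    return total;
}
```

Neither version has a library function behind it; in both cases the compiler itself generates the argument-fetching code according to the target ABI.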
From a comment of yours, I would infer that you are interested in implementation details. But that should be in your question.
If you study GCC 9.1 source code, look inside gcc-9.1.0/gcc/builtins.c (the expand_builtin_va_start function there), and for other builtins inside gcc-9.1.0/gcc/c-family/c-cppbuiltin.c, gcc-9.1.0/gcc/cppbuiltin.c, gcc-9.1.0/gcc/jit/jit-builtins.c.
You could write your own GCC plugin (in 2Q2019, for GCC 9, and the C++ code of your plugin might have to change for the future GCC 10) to add your own GCC builtins. BTW, you might even overload the behavior of the existing __builtin_va_start by your own specific code, and/or you might have -at least for research purposes- your own stdarg.h header with #define va_start(v,l) __my_builtin_va_start(v,l) and have your GCC plugin understand your __my_builtin_va_start plugin-specific builtin. Be however aware of the GCC runtime library exception and read its rationale: I am not a lawyer, but I tend to believe that you should (and that legal document requires you to) publish your GCC plugin with some open source license.
You first need to read a textbook on compilers, such as the Dragon book, to understand that an optimizing compiler is mostly transforming internal representations of your compiled code.
You further need to spend months studying the many internal representations of GCC. Remember, GCC is a very complex program (of about ten million lines of code). Don't expect to understand it with only a few days of work. Look inside the GCC resource center website.
My dead GCC MELT project had references and slides explaining more of GCC (the design philosophy and architecture of GCC changes slowly; so the concepts are still relevant, even if individual details changed). It took me almost ten years full time to partly understand some of the middle-end layers of GCC. I cannot transmit that knowledge in a StackOverflow answer.
My draft Bismon report (work in progress, funded by H2020, so a lot of bureaucracy) has a dozen pages (in its sections §1.3 and §1.4) introducing the internal representations of GCC.

Make: Uses of static pattern rules in make

I would like to know the uses of static pattern rules as opposed to normal rules in make. I am new to make and have gone through some tutorials. I want to know when we should use static pattern rules. Could you please explain in brief?
Thanks in Advance.
Your question is mostly a matter of opinion. Notice that there are several build automation tools (not only GNU make), e.g. also ninja, scons, omake, etc...
When you code in C (or in C++....) some project, you could have some C (or C++) files which are generated from something else (e.g. by lemon or by your own utility...). For such cases (pedantically you could call them metaprogramming), pattern rules could be useful (in particular if you have several such cases in a project). In other cases you generate other files (than object files) from C source (e.g. generating documentation with doxygen), and then pattern rules are also very useful.
An example of a large C++ project with many C++ code generators is the GCC compiler. And back when (in 2009) GCC was coded in C, it already had a dozen specialized code generator programs emitting some C code. For these cases, pattern rules could be convenient.
Of course, pattern rules are a luxury. You could in principle generate your Makefile and have it contain a simple rule for each individual file. (in GCC, the Makefile-s are generated by autoconf and automake based things...)
If you observe and study the source code of most large free software projects, you'll find out that most of them do have generators for C (or C++) files. So generating C code is a usual practice (the original Unix from late 1970s did that already). Today, some software projects have most or even all (e.g. CAIA) of their C code generated.

What is the name for the structure of the gcc assembly output

I'm trying to learn assembly. At first I was using NASM as the assembler, but then I understood that I could use .s files with gcc. This interested me greatly: my goal is to be able to write a compiler for a custom language, and this would allow me to link and compile with C code. So, filled with excitement, I started compiling C to assembly (.s files) with gcc and examining it. As I was doing this, it seemed to be structured in a different way than NASM assembly, with only a main label, for example, and no _start, and other unfamiliar structure; I'm not talking about Intel vs. AT&T syntax. So my question follows:
Is there a different structure between normal assembly and the .s files from gcc, or is it just me not having good enough knowledge of assembly? If it is a different structure, does it have a name?
I have been trying to google my way to this for hours, but when I search for gcc assembly and other things I can think of, I only get C inline assembly...
Please help, I'm going crazy from not figuring this out.
gcc emits definitions for all the functions present in the translation unit. (unless they're static inline or static and unused or it chooses to inline them everywhere...).
The CRT start files (linked by default by gcc, not re-built from source every time you compile) provide the definition for _start and the other functions you'll see if you disassemble the binary. They're only linked in at the link stage, not as part of compiling a .c to a .s, so you don't see them in gcc -S output.
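One way to see this for yourself is to compile a small translation unit with gcc -S and read the result. The file below is a hypothetical example (the names demo.c, square and unused_helper are made up for illustration); the exact assembly depends on the target, the GCC version and the optimization level, so no output is reproduced here.

```c
/* demo.c: hypothetical file name used only for this illustration. */

/* static and never called in this file: GCC is free to drop it,
   and typically does when optimizing. */
static int unused_helper(int x)
{
    return x + 1;
}

/* External linkage: a definition has to be emitted, so a label for
   this function should appear in the .s file. */
int square(int x)
{
    return x * x;
}
```

Running gcc -O2 -S demo.c writes demo.s. One would expect to find a label for square there, but nothing named _start, because the startup code only enters the picture when the final executable is linked.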
Related: How to remove "noise" from GCC/clang assembly output? for tips on making compiler asm output human-readable.

How to bolt on ANTLR 4 front to GCC Generic/GIMBLE?

I'm writing a DSL front end using ANTLR v4 that I'd like to bolt on to GCC framework. The goal is to have a C language AST to leverage the rest of the GCC framework.
I haven't found any info or preexisting work to use an example of how to proceed. What I'm looking for is how to move the ANTLR 4 AST to GCC Generic/GIMBLE.
ANTLR 4 does not support a C language target, so I'll have to kludge up the C++ target to the GCC C language framework.
Help is appreciated
Gluing a C++ implementation of ANTLR into GCC so that GCC will call it is likely to be the easy step.
[Don't expect it to be easy; GCC wants to be GCC, not your pet. You might get some help from GCC MELT, a package for interfacing to GCC machinery.]
The AST produced for an arbitrary (e.g., your custom DSL) language doesn't "just move (easily)" to a C AST or to the GCC Gimple (not GIMBLE) framework.
You will have to build, in essence, your DSL-AST to C-AST translator, or your DSL-AST to Gimple translator. There is no a priori reason to believe that building such a translator is easy; for example, you didn't tell us your DSL was "just like C except ...". So, you're going to have to build a translator. In the absence of evidence this is easy, you'll have to translate your DSL concepts to C concepts. The better ("non C-like") your DSL is, the harder this is going to be.
This SO link discusses the issues behind translation in more detail: What kinds of patterns could I enforce on the code to make it easier to translate to another programming language?
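As a rough illustration of what a "DSL-AST to C" translator means at its very smallest, here is a sketch that walks a toy expression tree and emits C source text. Everything in it is hypothetical (the node layout, the names emit_c and append); it produces C text rather than GCC GENERIC or GIMPLE trees, and a real DSL would need far more than expressions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical DSL expression node: a number, a variable, or a binary op. */
enum node_kind { NODE_NUM, NODE_VAR, NODE_BINOP };

struct node {
    enum node_kind kind;
    int value;                    /* used by NODE_NUM */
    const char *name;             /* used by NODE_VAR */
    char op;                      /* used by NODE_BINOP: '+', '-', '*', '/' */
    const struct node *lhs, *rhs; /* used by NODE_BINOP */
};

/* Appends `text` to the NUL-terminated string in `buf` (capacity `cap`).
   Returns 0 on success, -1 if it does not fit. */
static int append(char *buf, size_t cap, const char *text)
{
    size_t used = strlen(buf);
    size_t len = strlen(text);

    if (used + len + 1 > cap)
        return -1;
    memcpy(buf + used, text, len + 1);
    return 0;
}

/* Recursively appends the C spelling of `n` to `buf`.
   Binary operations are fully parenthesized to avoid precedence issues. */
int emit_c(const struct node *n, char *buf, size_t cap)
{
    char tmp[32];

    switch (n->kind) {
    case NODE_NUM:
        snprintf(tmp, sizeof tmp, "%d", n->value);
        return append(buf, cap, tmp);
    case NODE_VAR:
        return append(buf, cap, n->name);
    case NODE_BINOP:
        tmp[0] = ' '; tmp[1] = n->op; tmp[2] = ' '; tmp[3] = '\0';
        if (append(buf, cap, "(") != 0) return -1;
        if (emit_c(n->lhs, buf, cap) != 0) return -1;
        if (append(buf, cap, tmp) != 0) return -1;
        if (emit_c(n->rhs, buf, cap) != 0) return -1;
        return append(buf, cap, ")");
    }
    return -1; /* unknown node kind */
}
```

The hard part the answer refers to is not this tree walk but deciding what each DSL concept should become in C; the further the DSL is from C, the less the mapping looks like the one-to-one cases above.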

Intel Fortran to GNU Fortran Conversion [closed]

I am working on a custom CFD Solver written in Fortran 90 and MPI.
The code contains 15+ modules and was initially designed to work with the Intel Fortran compiler. Now, since I do not have access to the Intel compiler, I need to make it work using the GNU Fortran compiler.
I made changes in the Makefile, which initially had flags suitable for ifort.
I am using it on Ubuntu with GNU Fortran and Open MPI.
I am sorry I am unable to put in anything from the code structure or terminal output due to IP restrictions of my university. Nevertheless, I will try to best describe the issues.
So now when I compile the code I am having some strange issues.
The GNU Fortran is not able to read lines that are too long and I get errors during compilation. As a result I have to break it into multiple lines using the '&' symbol
A module D.f90 contains all the global variables declared. However, during compilation I now get an error in module B.F90.
The error I get is 'Unclassified Statement Error'. I was able to fix it in some subroutines and functions by locally declaring the variables again.
I am not the most experienced person in Fortran, but I thought that a change of compiler should not be a reason for newfound syntax errors.
The errors described above could be remedied so far, but considering the size of the code that approach is impractical.
I was hoping someone could share views on this matter and provide guidance on how to tackle it.
You should start reading three pieces of documentation:
The Fortran 90 standard (alternatively, other versions), which tells you what is legal, standard Fortran and what is not. Whenever you find some error, look at your code and check if what you are doing is legal, standard Fortran. Likely, the code in question will either be completely nonstandard (e.g. REAL*8, although that extension is fairly well understood) or rely on unspecified behaviour that Intel Fortran and GFortran are interpreting in different ways.
The GFortran manual for your version, which tells you how GFortran decides such unspecified cases, what intrinsic functions are available, how to change some options/flags, etc. This would tell you that your problem with the line lengths would be solved by adding -ffree-line-length-none.
The Intel Fortran manual for your version, which in cases of non-standard or unspecified behaviour, will allow you to know what the code you are reading was written to do, i.e. the behaviour that you would expect. In particular, it will allow you to decipher what the compiler flags that are currently being used mean. They may or may not need translation to GFortran, e.g. /Qsave will need to become -fno-automatic.
A concrete example of interpretative differences within the range allowed by the standard: until Fortran 2003, the units for the "record length" in random access record files were left unspecified. Intel Fortran used "one machine word" (4 bytes in x86) while GFortran used 1 byte. Both were compliant with the letter of the standard, but incompatible.
Furthermore, even when coding "to standard", you may hit a wall if the compiler does not implement part of the Fnn standard, or it is buggy. Case in point: Intel Fortran 12.0 (old, but it's what I work with) does not implement the ALLOCATE(y, SOURCE=x) construct for polymorphic x (the "clone allocation"). On the other hand, GFortran has not completely implemented FINAL type-bound procedures (destructors).
In both cases, you will need to find workarounds. For example, for the first issue you can use a special form of the INQUIRE statement (kudos to @haraldkl). In other cases, the workaround might even involve using some kind of feature detection (see autoconf, CMake, etc.) and storing the results as PARAMETER variables in a config.f90 file that is included by your code. Your code would then take decisions based on it, as in:
! config.f90.in (things in @x@ would get substituted by automake, for example)
INTEGER, PARAMETER :: RECORD_LEN_BYTES = @RECORD_LEN_BYTES@
! Some other file which opens a file
INCLUDE "config.f90"
!...
OPEN(u, FILE='DE430.BIN', ACCESS='direct', FORM='unformatted', RECL=56 / RECORD_LEN_BYTES)
People have been having complaints about following the standard since at least the 60s. But those cDEC$ features were put in for good reasons...
It is valuable to cross compile though and you usually have things caught in one compiler or the other.
For your question #1: "The GNU Fortran is not able to read lines that are too long and I get errors during compilation. As a result I have to break it into multiple lines using the '&' symbol"
In the days of old there was:
options/extended_source
SUBROUTINE...
In ifort it is -132, but I have not found a gfortran equivalent to -132. It may be one of -ffixed-line-length-n, -ffixed-line-length-none, -ffree-line-length-n or -ffree-line-length-none, per the link: http://www.math.uni-leipzig.de/~hellmund/Vorlesung/gfortran.html#SEC8
Also, the ifort default for .f90 and .f95 is the compiler switch '-free'; '-fixed' is the default for <.f90... However, one can use -fixed with .f90 and use column 6 and 'D' in column #1, which is handy with '-D_lines' or '-DD'.
Per the link: https://software.intel.com/sites/default/files/m/f/8/5/8/0/6366-ifort.txt
For your question #2: "A module D.f90 contains all the Global variables declared. However, now I during compilation i get error is in module B.F90. The error I get is 'Unclassified Statement Error', I was able to fix it in some subroutines and functions by locally declaring the variables again."
You probably need to put in the offending line, if you can get an IP waiver.
Making variables local if they are expected to be shared in a /common/ or shared in a module will not work.
If they were in /common/ or PUBLIC then they are shared.
If they are local then they are PRIVATE.
It would be easy to get that error if a PRIVATE statement was in the wrong place, or a USE statement was omitted.

Resources