Trying deterministic gcc compilation, symbol table problems

Trying deterministic gcc compilation, symbol table problems - gcc

I work for embedded systems and I am trying to make a build that yields exactly the same executable each time. Using -frandom-seed certainly helped to stabilize names that were otherwise variable, but still I have a couple of symbols that I have problems with. For example:
0x00003bfc _ZN13WorkingMemory17ReadTransactionalERN3HSL4FileERN58_GLOBAL__N_......_.._working_memory.cc_AE42A16A_FF4623503AllE
The ".._.." etc. part was evidently worked out of what I passed as -frandom-seed, id est, the source filename. Of the couple of hex number that follows sometimes, the second one sometimes is different, and I guess it is probably linked to the compilation date, but I am not sure.
I am working on ARM, using gcc 3.4.0, using FLAT executables. I tried to remove symbols using strip on the ELF file, but that prevents FLAT conversion.
Any ideas?

Related

How can I strip compiler information from PE?

is there a simple way to strip compiler information from PE file?

Use the program "strip" which comes with fpc (in fpc/bin).

The lazarus one needs to be in the units though (lclbase?), maybe the FPC one too (compiler/version.pas would be my guess). But potentially grepping is difficult because the strings might be made with {$i %%} include meta data constructs.
To work around this, and at least get the unit, one could also try to compile everything to assembler (-a -s), and then grep the generated assembler. The assembler will contain the final form
Strings can also get added by the linker, on Windows, FPC typically uses its internal (high speed) linkers.You can try to use the external (GNU LD) linker (-Xe) to see if that behaves differently.

Intel Fortran to GNU Fortran Conversion [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
Improve this question
I am working on a custom CFD Solver written in Fortran 90 and MPI.
The code contain 15+ Modules and was initially designed to work with the Intel Fortran compiler. Now since i do not have access to the Intel compiler I need to make it work using the GNU Fortran Compiler.
I made changes in the Makefile that initially had flags suitable for the ifort.
I am using it on Ubuntu with GNU Fortran and Openmpi
I am sorry I am unable to put in anything from the code structure or terminal output due to IP restrictions of my university. Nevertheless,I will try to best describe the issues
So now when I compile the code I am having some strange issues.
The GNU Fortran is not able to read lines that are too long and I get errors during compilation. As a result I have to break it into multiple lines using the '&' symbol
A module D.f90 contains all the Global variables declared. However, now I during compilation i get error is in module B.F90.
The error I get is 'Unclassified Statement Error', I was able to fix it in some subroutines and functions by locally declaring the variables again.
I am not the most experienced person in Fortran, but I thought that the change in compiler should not be a reason for new found syntax errors.
The errors described above so far could be remedied but considering the expanse of the code it is impractical.
I was hoping if anyone could share views on this matter and provide guidance on how to tackle it.

You should start reading three pieces of documentation:
The Fortran 90 standard (alternatively, other versions), which tells you what is legal, standard Fortran and what is not. Whenever you find some error, look at your code and check if what you are doing is legal, standard Fortran. Likely, the code in question will either be completely nonstandard (e.g. REAL*8, although that extension is fairly well understood) or rely on unspecified behaviour that Intel Fortran and GFortran are interpreting in different ways.
The GFortran manual for your version, which tells you how GFortran decides such unspecified cases, what intrinsic functions are available, how to change some options/flags, etc. This would tell you that your problem with the line lengths would be solved by adding -ffree-line-length-none.
The Intel Fortran manual for your version, which in cases of non-standard or unspecified behaviour, will allow you to know what the code you are reading was written to do, e.g. the behaviour that you would expect. In particular, it will allow you to decipher what the compiler flags that are currently being used mean. They may or may not need translation to GFortran, e.g. /Qsave will need to become -f-no-automatic.
A concrete example of interpretative differences within the range allowed be the standard: until Fortran 2003, the units for the "record length" in random access record files were left unspecified. Intel Fortran used "one machine word" (4 bytes in x86) while GFortran used 1 byte. Both were compliant with the standard letter, but incompatible.
Furthermore, even when coding "to standard", you may hit a wall if the compiler does not implement part of the Fnn standard, or it is buggy. Case in point: Intel Fortran 12.0 (old, but it's what I work with) does not the implement the ALLOCATE(y, SOURCE=x) construct for polymorphic x (the "clone allocation"). On the other hand, GFortran has not completely implemented FINAL type-bound procedures (destructors).
In both cases, you will need to find workarounds. For example, for the first issue you can use a special form of the INQUIRE statement (kudos to #haraldkl). In other cases, the workaround might even involve using some kind of feature detection (see autoconf, CMake, etc.) and storing the results as PARAMETER variables in a config.f90 file that is included by your code. Your code would then take decisions based on it, as in:
! config.f90.in (things in #x# would get subtituted by automake, for example)
INTEGER, PARAMETER :: RECORD_LEN_BYTES = #RECORD_LEN_BYTES#
! Some other file which opens a file
INCLUDE "config.f90"
!...
OPEN(u, FILE='DE430.BIN', ACCESS='direct', FORM='unformatted', RECL=56 / RECORD_LEN_BYTES)

People have been having complaints about following the standard since at least the 60s. But those cDEC$ features were put in a for good reasons...
It is valuable to cross compile though and you usually have things caught in one compiler or the other.
For you question #1 "The GNU Fortran is not able to read lines that are too long and I get errors during compilation. As a result I have to break it into multiple lines using the '&' symbol"
In the days of old there was:
options/extended_source
SUBROUTINE...
In fort it is -132, but I have not found a gfortran equivalent to -132 . It may be -ffixed-line-length-n -ffixed-line-length-none -ffree-line-length-n -ffree-line-length-none per the link: http://www.math.uni-leipzig.de/~hellmund/Vorlesung/gfortran.html#SEC8
Also the ifort standard for .f90 and .f95 is the the compiler switch '-free' '-fixed' is the standard <.f90... However one can use -fixed with .f90 and use column 6 and 'D' in column #1... Which is handy with '-D_lines' or '-DD'.
Per the link: https://software.intel.com/sites/default/files/m/f/8/5/8/0/6366-ifort.txt
For you question #2: "A module D.f90 contains all the Global variables declared. However, now I during compilation i get error is in module B.F90. The error I get is 'Unclassified Statement Error', I was able to fix it in some subroutines and functions by locally declaring the variables again."
You probably need to put in the offending line, if you can get an IP waiver.
Making variables local if they are expected to be shared in a /common/ or shared in a module will not work.
If there were in /common/ or PUBLIC then they are shared.
If they are local then they are PRIVATE.
it would be easy to get that error if a PRIVATE statement was in the wrong place, or a USE statement was omitted.

how can I verify that dead code was stripped from the binary?

My c/obj-c code (an iOS app built with clang) has some functions excluded by #ifdefs. I want to make sure that code that gets called from those functions, but not from others (dead code) gets stripped out (eliminated) at link time.
I tried:
Adding a local literal char[] in a function that should be eliminated; the string is still visible when running strings on the executable.
Adding a function that should be eliminated; the function name is still visible when running strings.
Before you ask, I'm building for release, and all strip settings (including dead-code stripping, obviously) are enabled.
The question is not really xcode/apple/iOS specific; I assume the answer should be pretty much the same on any POSIX development platform.

(EDIT)
In binutils, ld has the --gc-sections option which does what you want for sections on object level. You have several options:
use gcc's flags -ffunction-sections and -fdata-sections to isolate each symbol into its own section, then use --gc-sections;
put all candidates for removal into a separate file and the linker will be able to strip the whole section;
disassemble the resulting binary, remove dead code, assemble again;
use strip with appropriate -N options to discard the offending symbols from the
symbol table - this will leave the code and data there, but it won't show up in the symbol table.

Syntax Checking with unsupported languages

I have some files that have a particular syntax that is similar to ada (not identical though), however I would like to verify the syntax before going and running them. There isn't a compiler for these files, so I can't check them before using them. I tried to use the following:
gcc -c -gnats <file>
However this says compilation unit expected. I've tried a few variations on this, but to no avail.
I just want to make sure the file is syntactically correct before using it, but I'm not sure how to do it, and I really don't want to write an entire syntax checker just for this.
Is there some way to include an additional unsupported language to gcc without going through a recompile? Also is this simply a file that details to gcc what the syntax constructs are, or what would be entailed? I don't need a full compile, only a syntax check.
Alternately, are there any syntax checkers I can use that I can update an ada syntax check with the small number of changes required for this language?
I've listed Ada as a tag, since the syntax is nearly identical, and finding something that will do ada syntax checking without compiling will be a 90% solution for me.

You could try running the files through gnatchop first. The GCC Ada compiler is rather unique in that it expects filenames to match up with the main unit names inside the file. That may be what your error message is trying to say.
gnatchop will go through any files you give it and write out Ada source files with the appropriate names to make gcc happy (even splitting files into multiple files if needed).
Another option you might be interested in is OpenToken. It is a parser construction toolkit, written in Ada, that allows you to build your own parsers fairly easily. It comes with a syntax recognizer for Ada, so you may just be able to tweak that a bit for your needs.

Is there a way to strip all functions from an object file that I am not using?

I am trying to save space in my executable and I noticed that several functions are being added into my object files, even though I never call them (the code is from a library).
Is there a way to tell gcc to remove these functions automatically or do I need to remove them manually?

If you are compiling into object files (not executables), then a compiler will never remove any non-static functions, since it's always possible you will link the object file against another object file that will call that function. So your first step should be declaring as many functions as possible static.
Secondly, the only way for a compiler to remove any unused functions would be to statically link your executable. In that case, there is at least the possibility that a program might come along and figure out what functions are used and which ones are not used.
The catch is, I don't believe that gcc actually does this type of cross-module optimization. Your best bet is the -Os flag to optimize for code size, but even then, if you have an object file abc.o which has some unused non-static functions and you link statically against some executable def.exe, I don't believe that gcc will go and strip out the code for the unused functions.
If you truly desperately need this to be done, I think you might have to actually #include the files together so that after the preprocessor pass, it results in a single .c file being compiled. With gcc compiling a single monstrous jumbo source file, you stand the best chance of unused functions being eliminated.

Have you looked into calling gcc with -Os (optimize for size.) I'm not sure if it strips unreached code, but it would be simple enough to test. You could also, after getting your executable back, 'strip' it. I'm sure there's a gcc command-line arg to do the same thing - is it --dead_strip?

In addition to -Os to optimize for size, this link may be of help.

Since I asked this question, GCC 4.5 was released which includes an option to combine all files so it looks like it is just 1 gigantic source file. Using that option, it is possible to easily strip out the unused functions.
More details here

IIRC the linker by default does what you want ins some specific cases. The short of it is that library files contain a bunch of object files and only referenced files are linked in. If you can figure out how to get GCC to emit each function into it's own object file and then build this into a library you should get what you are looking.
I only know of one compiler that can actually do this: here (look at the -lib flag)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio