Need validation on a claim from Go Lang - go

I have been lately looking into GoLang -- coming from C++ background-- I am reading a paper which allegedly explains the reasoning behind making Golang, here is its link: https://talks.golang.org/2012/splash.article
One of the claims being is, handling Dependencies (Package) in C and C++ is pain and takes on a #ifndef guard instance to state
The intent is that the C preprocessor reads in the file but disregards
the contents on the second and subsequent readings of the file...
I referred a GCC page for the same, https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html.
so that if the header file appears in a subsequent #include directive
and FOO is defined, then it is ignored and it doesn’t preprocess or
even re-open the file a second time
Go: "Reads in and disregard"
vs
GCC: it doesn’t preprocess or even re-open the file a second time.
Doesn't contradict?
your thoughts are appreciated. Thanks for Reading my question.

The first passage is talking about a generic compiler, which, conceptually speaking, should read the contents of the file and disregard the contents (because they are #ifdefd out). That is, roughly, what the C standard specifies a compiler should do.
But practically everything in the C standard is under the "as if" rule - a compiler does not actually have to be implemented in the way suggested in the standard, so long as the end result it produces is exactly the same in every case. As such, GCC's particular implementation adds an optimization where, in cases where it can tell with certainty that the contents of the file would be disregarded, it doesn't actually read it. This is perfectly fine because it still behaves as if it has read the file but disregarded it.
Note that other compilers do not necessarily do the same.

Related

debugging yacc YYDEBUG where is y.debug

I trying to debug a the yacc generated component for awk (awk.g.c) but when I define YYDEBUG it includes y.debug which I don't seem to have.
Where does y.debug come from?
Without it there are several references that are undefined.
I'm compiling the old 32v or V7 version of awk so I'm not sure if this is something that still exists.
Some versions of yacc (in particular, the AT&T version, still available as part of Plan 9) generated an additional file with the suffix .debug containing debugging information, notably the table which translated symbol numbers back into names. Modern yacc-alikes just insert this information into the generated C file, on the grounds that the memory consumption is basically trivial these days.
The name table might not be generated if you don't request it, but the way you ask for it depends on the yacc version:
Most bison versions only generate the table if the trace option is enabled. (Posix mandates -t for this, but bison provides a host of alternatives and not all historical yaccs complied.)
As indicated above, some really old yaccs put the name table into y.debug. The AT&T implementation, as I mentioned above, always did this, but guarded the #include line with a preprocessor conditional on YY_DEBUG
However, the yacc implementation you pointed to in a comment, which uses the conditionally-included y.debug mechanism, only generates the y.debug file if you invoke it with the -D flag. So that's what you need to do.
Background notes
I unearthed the information in point 3 from the V10 source linked in a comment. The download link is at the top of this page; that wasn't immediately obvious from the link in the comment. (That's the complete source tarball, which is about 70MB. The individual files linked to by the link in the comment have been HTMLised, which makes them a pain to work with.) I could have saved myself some time by reading the release notes (called yaccnews rather than CHANGES). The last note in that file describes the implementation, and I include the paragraph here since it has all the details on how debugging works in this particular yacc version.
8/11/81
Debugging changed. If the parser starts with %{#define YYDEBUG %} and yacc is invoked as yacc -D (for Debugging), then the parser uses an external variable named yydebug to control debugging output. If yydebug == 1, the parser prints out the text of the reduction when it performs one. If yydebug == 2, the parser also prints out the name of the token returned by each call to yylex, and if yydebug == 3, the parser also prints out the active items each time it changes state (this is uninteresting).
For what it's worth, it should be possible to generate a working, compilable parser using a modern yacc (such as bison or byacc). In the long run, that will probably be easier. (If you use bison and you require legacy yacc compatibility, you can use the -y flag. That flag is not supported by byacc, which claims to be legacy compatible regardless.)

How many times does a Common Lisp compiler recompile?

While not all Common Lisp implementations do compilation to machine code, some of them do, including SBCL and CCL.
In C/C++, if the source files don't change, the binary output of a C/C++ compiler will also not change, assuming the underlying system remains the same.
In a Common Lisp compiler, the compilation is not under the user's direct control, unlike C/C++. My question is that if the Lisp source files haven't changed, under what circumstances will a CL compiler compile the code more than once, and why? If possible, a simple illustrative example would be helpful.
I think that the question is based on some misconceptions. The compiler doesn't compile files, and it's not something that the user has no control over. The compiler is quite readily available through the compile function. The compiler operates on code, not on files. E.g., you can type at the REPL
CL-USER> (compile nil (list 'lambda (list 'x) (list '+ 'x 'x)))
#<FUNCTION (LAMBDA (X)) {100460E24B}>
NIL
NIL
There's no file involved at all. However, there is also a compile-file function, but notice that its description is:
compile-file transforms the contents of the file specified by
input-file into implementation-dependent binary data which are placed
in the file specified by output-file.
The contents of the file are compiled. Then that compiled file can be loaded. (You can also load uncompiled source files, too.) I think your question might boil down to asking under what circumstances would compile-file generate a file with different contents. I think that's really implementation dependent, and it's not really predictable. I don't know that your characterization of compilers for other languages necessarily holds either:
In C/C++, if the source files don't change, the binary output of a
C/C++ compiler will also not change, assuming the underlying system
remains the same.
What if the compiler happens to include a timestamp into the output in some data segment? Then you'd get different binary output every time. It's true that some common scripted compilation/build systems (e.g., make and similar) will check whether previous output can be reused based on whether the input files have changed in the meantime. That doesn't really say what the compiler does, though.
The rules are pretty much the same, but in Common Lisp, it's not a practice to separate declarations from implementation, so usually you must recompile every dependency to be sure. This is a shared practical consequence of dynamic environments.
Imagining there was such separation in place, the following are blantant examples (clearly not exhaustive) of changes that require recompiling specific dependent files, as the output may be different:
A changed package definition
A changed macro character or a change in its code
A changed macro
Adding or removing a inline or notinline declaration
A change in a global type or function type declaration
A changed function used in #., defvar, defparameter, defconstant, load-time-value, eql specializer, make-load-form generated code, defmacro et al (e.g. setf expanders)...
A change in the Lisp compiler, or in the base image
I mean, you can see it's not trivial to determine which files need to be recompiled. Sometimes, the answer is "all subsequent files", e.g. changing the " (double-quotes) macro-character, which might affect every literal string, or the compiler evolved in a non-backwards compatible way. In essence, we end where we started: you can only be sure with a full recompile and not reusing fasls across compilations. And sometimes it's faster than determining the minimum set of files that need to be recompiled.
In practice, you end up compiling single definitions a lot in development (e.g. with Slime) and not recompiling files when there's a fasl as old or younger than the source file. Many times, you reuse files from e.g. Quicklisp. But for testing and deployment, I advise clearing all fasls and recompiling everything.
There have been efforts to automate minimum dependency compilation with SBCL, but I think it's too slow when you change the interim projects more often that not (it involves a lot of forking, so in Windows it's either infeasible or very slow). However, it may be a time saver for base libraries that rarely change, if at all.
Another approach is to make custom base images with base libraries built-in, i.e. those you always load. It'll save both compilation and load times.

Find write statement in Fortran

I'm using Fortran for my research and sometimes, for debugging purposes, someone will insert in the code something like this:
write(*,*) 'Variable x:', varx
The problem is that sometimes it happens that we forget to remove that statement from the code and it becomes difficult to find where it is being printed. I usually can get a good idea where it is by the name 'Variable x' but it sometimes happens that that information might no be present and I just see random numbers showing up.
One can imagine that doing a grep for write(*,*) is basically useless so I was wondering if there is an efficient way of finding my culprit, like forcing every call of write(*,*) to print a file and line number, or tracking stdout.
Thank you.
Intel's Fortran preprocessor defines a number of macros, such as __file__ and __line__ which will be replaced by, respectively, the file name (as a string) and line number (as an integer) when the pre-processor runs. For more details consult the documentation.
GFortran offers similar facilities, consult the documentation.
Perhaps your compiler offers similar capabilities.
As has been previously implied, there's no Fortran--although there may be a compiler approach---way to change the behaviour of the write statement as you want. However, as your problem is more to do with handling (unintentionally produced) bad code there are options.
If you can't easily find an unwanted write(*,*) in your code that suggests that you have many legitimate such statements. One solution is to reduce the count:
use an explicit format, rather than list-directed output (* as the format);
instead of * as the output unit, use output_unit from the intrinsic module iso_fortran_env.
[Having an explicit format for "proper" output is a good idea, anyway.]
If that fails, use your version control system to compare an old "good" version against the new "bad" version. Perhaps even have your version control system flag/block commits with new write(*,*)s.
And if all that still doesn't help, then the pre-processor macros previously mentioned could be a final resort.

Boost pretty print for GCC error messages

I'm using GCC 4.7.2. My code is rather heavy on template, STL and boost usage. When I compile and there is an error in some class or function that is derived from or uses some boost/STL functionality, I get error messages showing spectacularly hideous return types and/or function arguments for my classes/function.
My question:
Is there a prettyprint type of thing for GCC warnings/errors containing boost/STL types, so that the return types shown in error messages correspond to what I've typed in the code, or at least, become more intelligible?
I have briefly skimmed through this question, however, that is about GDB rather than GCC...
I've also come across this pretty printer in Haskell, but that just seems to add structure, not take away (mostly) unneeded detail...
Any other suggestions?
I asked a similar question, where someone suggested I try gccfilter. It's a Perl script that re-formats the output of g++ and colorizes it, shortens it, hides full pathnames, and lots more.
Actually, that suggestion answers this question really well too: it's capable of hiding unneeded detail and pretty-printing both STL and boost types. So: I'll leave this here as an answer too.
The only drawback I could see is that g++ needs to be called from within the script (i.e., piping to it is not possible at the time). I suspect that's easily fixed, and in any case, it's a relatively minor issue.
You could try STLfilt as mentioned in 'C++ Template Metaprogramming' by David Abrahms & Alesky Gurtovoy.
The book contains a chapter on template message diagnostics. It suggests using the STLFilt /showback:N to eliminate compiler backtrace material in order to get simplified output.

How can I force the order of functions in a binary with the gcc toolchain?

I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing i found was -fno-toplevel-reorder which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function) or even forcing functions having a particular order (and if you could enforce the order that would still not mean that the addresses stay the same when the source is changed!).
The biggest problem that I see is that even if it may be possible to fix a function to some address, it will be sheer impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might be almost easiest to provide trampolines at the addresses that the other program expects, and having the real functions (whereever they may be) pointed to by these. That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its proper section using __attribute__ ((section ("some name"))). Unluckily, .text always appears as the first section, so if anything in .text changes so the size is bumped over the 512 byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading wrong), this means you should be able to pack all the functions that you want located in the front into one section .text$A, and everything else into .text$B, and it should do just that.
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes from the order the functions are in the file and the order the files are on the command line when you link.
Embed something in the code that your external code can find, a const structure with some ascii code and the address to functions perhaps, then no matter where the compiler puts the functions you can find them.
that or use the normal .dll or .so mechanisms, and not have to mess with it.
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.

Resources