Looking at the disassembly of a C++ program, I see functions like _Z41__static_initialization_and_destruction_0ii.constprop.221. What does the constprop mean in this case? It looks similar to the isra suffix (and is sometimes combined with it, e.g. .isra.124.constprop.226), but it means something else.
From source code comments I've read, these suffixes indicate functions which have been cloned during optimization.
EDIT: This may be the answer, maybe not.
Simple constant propagation
This file implements constant propagation and merging. It looks for instructions involving only constant operands and replaces them with a constant value instead of an instruction. For example:
add i32 1, 2
becomes
i32 3
NOTE: this pass has a habit of making definitions be dead. It is a good idea to run a DIE (Dead Instruction Elimination) pass sometime after running this pass.
SOURCE
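For GCC specifically, the .constprop.N suffix is attached to clones created by interprocedural constant propagation (IPA-CP): when every caller passes the same constant for a parameter, the compiler may emit a specialized copy of the function with that constant baked in. A minimal C++ sketch of the kind of code that can produce such a clone at -O2 (function names are purely illustrative):

__attribute__((noinline))           // keeps the function from simply being inlined away
static int scale(int x, int factor)
{
    return x * factor;
}

int triple_sum(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += scale(v[i], 3);      // factor is always the literal 3
    return sum;
}

int triple(int x)
{
    return scale(x, 3);             // same constant at this call site too
}

// With every call site passing factor == 3, GCC may emit a specialized
// clone named something like "scale.constprop.0" with the constant folded in.

The .isra.N suffix arises the same way from a different pass, IPA-SRA (interprocedural scalar replacement of aggregates), which is why the two can appear combined on a single symbol.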
Whenever I see a Julia macro in use like @assert or @time I'm always wondering about the need to distinguish a macro syntactically with the @ prefix. What should I be thinking of when using @ for a macro? For me it adds noise and distraction to an otherwise very nice language (syntactically speaking).
I mean, for me '@' has a meaning of reference, i.e. a location like a domain or address. In the location sense, @ does not have a meaning for macros other than marking a different compilation step.
The @ should be seen as a warning sign which indicates that the normal rules of the language might not apply. E.g., a function call
f(x)
will never modify the value of the variable x in the calling context, but a macro invocation
@mymacro x
(or @mymacro f(x) for that matter) very well might.
Another reason is that macros in Julia are not based on textual substitution as in C, but substitution in the abstract syntax tree (which is much more powerful and avoids the unexpected consequences that textual substitution macros are notorious for).
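To illustrate the kind of surprise that AST-based macros avoid, here is the classic C-preprocessor pitfall, written as C++ since Julia has no textual-substitution equivalent (SQUARE is a made-up macro name):

#include <iostream>

// Textual substitution pastes the macro body in verbatim, so operator
// precedence silently changes the meaning of the expanded expression.
#define SQUARE(x) x * x

int main()
{
    std::cout << SQUARE(1 + 2) << '\n';  // expands to 1 + 2 * 1 + 2, printing 5, not 9
    return 0;
}

A Julia macro, by contrast, receives the already-parsed expression :(1 + 2) as its argument, so this whole class of precedence surprises cannot happen.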
Macros have special syntax in Julia, and since they are expanded after parse time, the parser also needs an unambiguous way to recognise them
(without knowing which macros have been defined in the current scope).
ASCII characters are a precious resource in the design of most programming languages, Julia very much included. I would guess that the choice of @ mostly comes down to the fact that it was not needed for something more important, and that it stands out pretty well.
Symbols always need to be interpreted within the context they are used. Having multiple meanings for symbols, across contexts, is not new and will probably never go away. For example, no one should expect #include in a C program to go viral on Twitter.
Julia's Documentation entry Hold up: why macros? explains pretty well some of the things you might keep in mind while writing and/or using macros.
Here are a few snippets:
Macros are necessary because they execute when code is parsed,
therefore, macros allow the programmer to generate and include
fragments of customized code before the full program is run.
...
It is important to emphasize that macros receive their arguments as
expressions, literals, or symbols.
So, if a macro is called with an expression, it gets the whole expression, not just the result.
...
In place of the written syntax, the macro call is expanded at parse
time to its returned result.
It actually fits quite nicely with the semantics of the @ symbol on its own.
If we look up the Wikipedia entry for 'At symbol' we find that it is often used as a replacement for the preposition 'at' (yes it even reads 'at'). And the preposition 'at' is used to express a spatial or temporal relation.
Because of that, we can use the @ symbol as an abbreviation for the preposition 'at' to refer to a spatial relation, i.e. a location like @tony's bar, @france, etc., to some memory location @0x50FA2C (e.g. for pointers/addresses), or to the receiver of a message (@user0851, which Twitter and other forums use), but also for a temporal relation, i.e. @05:00 am, @midnight, @compile_time or @parse_time.
Macros are processed at parse time (here you have it), which is totally distinct from the other code that is evaluated at run time (yes, there are many different phases in between, but that's not the point here).
To explicitly direct the programmer's attention to the fact that the following code fragment is processed at parse time, as opposed to run time, we use @.
For me this explanation fits nicely in the language.
thanks@all ;)
I'm using Fortran for my research and sometimes, for debugging purposes, someone will insert in the code something like this:
write(*,*) 'Variable x:', varx
The problem is that sometimes we forget to remove that statement from the code and it becomes difficult to find where it is being printed. I can usually get a good idea of where it is from the label 'Variable x', but sometimes that information is not present and I just see random numbers showing up.
One can imagine that doing a grep for write(*,*) is basically useless, so I was wondering if there is an efficient way of finding my culprit, like forcing every write(*,*) to also print the file name and line number, or tracking stdout.
Thank you.
Intel's Fortran preprocessor defines a number of macros, such as __FILE__ and __LINE__, which will be replaced by, respectively, the file name (as a string) and the line number (as an integer) when the preprocessor runs. For more details consult the documentation.
GFortran offers similar facilities; consult the documentation.
Perhaps your compiler offers similar capabilities.
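For comparison, here is the same mechanism in C/C++ form, since the Fortran preprocessors mentioned above imitate the C preprocessor. This is only a sketch, and the macro name DEBUG_PRINT is made up; the point is that every debug message is stamped with its origin, so a stray statement is easy to trace later:

#include <cstdio>

// Every DEBUG_PRINT call records the file and line it came from.
#define DEBUG_PRINT(fmt, ...) \
    std::fprintf(stderr, "%s:%d: " fmt "\n", __FILE__, __LINE__, __VA_ARGS__)

int main()
{
    int varx = 42;
    DEBUG_PRINT("Variable x: %d", varx);  // prints e.g. "debug.cpp:11: Variable x: 42"
    return 0;
}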
As has been previously implied, there's no standard Fortran way (although there may be a compiler-specific approach) to change the behaviour of the write statement as you want. However, as your problem is more to do with handling (unintentionally produced) bad code, there are options.
If you can't easily find an unwanted write(*,*) in your code that suggests that you have many legitimate such statements. One solution is to reduce the count:
use an explicit format, rather than list-directed output (* as the format);
instead of * as the output unit, use output_unit from the intrinsic module iso_fortran_env.
[Having an explicit format for "proper" output is a good idea, anyway.]
If that fails, use your version control system to compare an old "good" version against the new "bad" version. Perhaps even have your version control system flag/block commits with new write(*,*)s.
And if all that still doesn't help, then the pre-processor macros previously mentioned could be a final resort.
Basically, when declaring Windows API functions in my VB6 code, they come with many constants that need to be declared alongside them. In practice, most of these constants are never used; you only end up using one or so of them when making your API calls. So I am using Conditional Compilation Arguments to exclude these (and other things), using something like this:
IncludeUnused = 0 : Testing = 1
(This is how I set two conditional compilation arguments; they are of Boolean type by default.)
So, many unused things are excluded like this:
#If IncludeUnused Then
' Some constant declarations and API declarations go here, sometimes functions
' and function calls go here as well, so it's not just declarations and constants
#End If
I also use a similar wrapper with the Testing Boolean, declared in the Conditional Compilation Argument input field on the "Make" tab of the VB6 Project Properties window. The Testing Boolean is used to display message boxes and the like when I am in testing mode, and of course these message boxes are not displayed if I have Testing set to 0 (and it is obviously 1 when I am testing).
The problem is, I tried setting IncludeUnused and Testing to 0 and 1 and vice versa, a total of four (4) combinations, and no matter what combination I set these values to, the output EXE file size for my VB6 EXE does not change! It is always 49,152 bytes when compiled to Native Code, whether using Fast Code or Small Code.
Additionally, if I compile to p-code under the four (4) combinations of Testing and IncludeUnused, I always end up with a file size of 32,768, no matter what.
This is driving me crazy, since it is leading me to believe that no change is actually occurring, even though it is. Why is it that when segments of code are excluded from compilation, the file size is still the same? What am I missing or doing wrong, or what have I miscalculated?
I have considered the possibility that VB6 automatically does not compile unused code into the final output EXE, but I have read from a few sources that this is not true: if it's included, it is compiled (correct me if I am wrong). And if that is right, then there is no need to use the IncludeUnused Boolean to remove unused code...?
If anyone can shed some light on these thoughts, I'd greatly appreciate it.
It could well be that the size difference is very small and that the exe size is padded to the next 512 or 1024 byte alignment. Try compressing the exe's with zip and see if the zip-file sizes differ.
You misunderstand what a compiler does. The output of the VB6 compiler is code. Constants are merely placeholders for values; they are not code. The compiler adds them to its symbol table, and when it later encounters a statement in your code that uses the constant, it replaces the constant by its value. That statement produces the exact same code whether you use a constant or hard-code the value in the statement.
So this automatically implies that if you never actually use the constant anywhere, then there is no difference at all in the generated code. All that you accomplished by using #If is to keep the compiler's symbol table smaller. That makes very little sense to do; the compilation-speed gain is not measurable. Symbol tables are implemented as hash tables; they have O(1) amortized complexity.
You use constants only to make your code more readable. And to make it easy to change a constant value if the need ever arises. By using #If, you actually made your code less readable.
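The same point holds in any compiled language. A purely illustrative C++ sketch (the constant names are made up; the VB6 compiler behaves analogously):

// Both functions typically compile to identical machine code: the named
// constant is only an entry in the compiler's symbol table, not code.
constexpr long MAX_PATH_LEN = 260;

long with_constant(long n) { return n * MAX_PATH_LEN; }
long with_literal(long n)  { return n * 260; }

// An unused constant contributes nothing at all to the executable.
constexpr long UNUSED_FLAG = 0x400;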
You can't test runtime data in conditional compilation directives.
These directives use expressions made up of literal values, operators, and CC constants. One way to set constant values is:
#Const IncludeUnused = 0
#Const Testing = 1
You can also define them via Project Properties for IDE testing. Go to the Make tab in that dialog and click the Help button for details.
Perhaps this is where you are setting the values? If so, consider this just additional info for later readers rather than an answer.
See #If...Then...#Else Directive
VB6 executable sizes are padded to 4KB blocks, so if the code difference is small it will make no difference to the executable.
I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing I found was -fno-toplevel-reorder, which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function) or even forcing functions having a particular order (and if you could enforce the order that would still not mean that the addresses stay the same when the source is changed!).
The biggest problem that I see is that even if it may be possible to fix a function to some address, it will be sheer impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might almost be easiest to provide trampolines at the addresses that the other program expects, and have the real functions (wherever they may be) pointed to by these. That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its own section using __attribute__ ((section ("some name"))) (see the sketch after this list). Unluckily, .text always appears as the first section, so if anything in .text changes such that its size is bumped over the 512-byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
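A minimal C++ sketch of the section-attribute idea from the first point above. The section name .fixed_api and the function names are made up, and you would still need a linker script (or at least a check of the map file) to pin that section where you want it:

// Pin the functions that must not move into their own section, so a
// custom linker script can place that section ahead of the rest of .text.
extern "C" {

__attribute__((section(".fixed_api"), noinline, used))
int api_get_version(void)
{
    return 0x0102;
}

__attribute__((section(".fixed_api"), noinline, used))
int api_compute(int a, int b)
{
    return a + b;
}

}  // extern "C"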
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading wrong), this means you should be able to pack all the functions that you want located in the front into one section .text$A, and everything else into .text$B, and it should do just that.
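Assuming a COFF target (e.g. a MinGW-flavoured GCC), a sketch of what that could look like; whether your particular linker actually honours the $-ordering described in the quote is something to verify, and the function names are made up:

// The quoted documentation says contributions to .text are sorted by the
// part after the '$', so functions placed in .text$A should end up ahead
// of those placed in .text$B.
__attribute__((section(".text$A"), noinline))
int must_stay_first(void)
{
    return 1;
}

__attribute__((section(".text$B"), noinline))
int may_move_around(void)
{
    return 2;
}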
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes from the order the functions are in the file and the order the files are on the command line when you link.
Embed something in the code that your external code can find: a const structure with some ASCII marker and the addresses of the functions, perhaps. Then, no matter where the compiler puts the functions, you can find them (see the sketch below).
That, or use the normal .dll or .so mechanisms, and not have to mess with it at all.
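A minimal C++ sketch of the marker-table idea (the marker string, names and layout are all made up): the external code scans the binary for the magic marker and reads the function addresses stored right after it, so it no longer cares where the compiler placed the functions themselves.

// Hypothetical export table: a recognisable ASCII marker followed by the
// addresses of the interesting functions.
using Fn = int (*)(int);

int do_work(int x)  { return x + 1; }   // functions the external code needs
int do_other(int x) { return x * 2; }

struct ExportTable {
    char magic[8];   // marker to scan for in the binary
    Fn   funcs[2];   // the actual function addresses
};

// 'used' keeps the table alive even though nothing in this binary reads it;
// the dedicated section makes it easy to locate.
__attribute__((used, section(".exports")))
extern const ExportTable kExports = {
    "MYFUNCS",
    { &do_work, &do_other },
};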
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.
This is my attempt to start a collection of GCC special features which one does not usually encounter. This comes after @jlebedev mentioned the "Effective C++" option for g++ in another question:
-Weffc++
This option warns about C++ code which breaks some of the programming guidelines given in the books "Effective C++" and "More Effective C++" by Scott Meyers. For example, a warning will be given if a class which uses dynamically allocated memory does not define a copy constructor and an assignment operator. Note that the standard library header files do not follow these guidelines, so you may wish to use this option as an occasional test for possible problems in your own code rather than compiling with it all the time.
What other cool features are there?
From time to time I go through the current GCC/G++ command line parameter documentation and update my compiler script to be even more paranoid about any kind of coding error. Here it is if you are interested.
Unfortunately I didn't document them so I forgot most, but -pedantic, -Wall, -Wextra, -Weffc++, -Wshadow, -Wnon-virtual-dtor, -Wold-style-cast, -Woverloaded-virtual, and a few others are always useful, warning me of potentially dangerous situations. I like this aspect of customizability, it forces me to write clean, correct code. It served me well.
However they are not without headaches, especially -Weffc++. Just a few examples:
It requires me to provide a custom copy constructor and assignment operator if there are pointer members in my class, which are useless since I use garbage collection. So I need to declare empty private versions of them (see the sketch after this list).
My NonInstantiable class (which prevents instantiation of any subclass) had to implement a dummy private friend class so G++ didn't whine about "only private constructors and no friends"
My Final<T> class (which prevents subclassing of T if T derived from it virtually) had to wrap T in a private wrapper class to declare it as friend, since the standard flat out forbids befriending a template parameter.
G++ recognizes functions that never return a value and instead throw an exception, and whines about them not being declared with the noreturn attribute. Hiding the throw behind always-true conditions didn't work; G++ was too clever and recognized them. It took me a while to come up with declaring a variable volatile and comparing it against its own value in order to throw that exception unmolested.
Floating point comparison warnings. Oh god. I have to work around them by writing x <= y and x >= y instead of x == y where it is acceptable.
Shadowing virtuals. Okay, this is clearly useful to prevent stupid shadowing/overloading problems in subclasses but still annoying.
No previous declaration for functions. This kinda lost its importance as soon as I started copy-pasting the function declaration right above the definition.
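For the first point in the list above, the workaround looks roughly like this (a sketch with a made-up class; in C++11 and later you would write = delete instead):

// A class with a pointer member: -Weffc++ wants a copy constructor and an
// assignment operator, even though the garbage collector makes them
// pointless here, so we declare private, unimplemented ones to silence it.
class Node {
public:
    explicit Node(int value) : value_(value), next_(0) {}

private:
    int   value_;
    Node* next_;   // pointer member that triggers the -Weffc++ warning

    // Declared but never defined: copying is simply not allowed.
    Node(const Node&);
    Node& operator=(const Node&);
};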
It might sound a bit masochistic, but as a whole, these are very cool features that have increased my understanding of C++ and of programming in general.
What other cool features G++ has? Well, it's free, open, it's one of the most widely used and modern compilers, consistently outperforms its competitors, can eat almost anything people throw at it, available on virtually every platform, customizable to hell, continuously improved, has a wide community - what's not to like?
A function that returns a value (for example an int) will return an indeterminate value if a code path is followed that ends the function without a return statement; formally, the behaviour is undefined. Not paying attention to this can result in exceptions and out-of-range memory writes or reads.
For example, if a function is used to obtain the index into an array and the faulty code path is taken (the one that doesn't end with a return statement), then a garbage value will be returned which might be too big as an index into the array, resulting in all sorts of headaches as you wrongly mess up the stack or heap.
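A minimal sketch of that situation (the function is made up); GCC flags it with -Wreturn-type, which -Wall enables:

// Compile with: g++ -Wall -c lookup.cpp
// GCC warns along the lines of:
//   "control reaches end of non-void function [-Wreturn-type]"
int find_index(const int* values, int count, int wanted)
{
    for (int i = 0; i < count; ++i) {
        if (values[i] == wanted)
            return i;
    }
    // Missing return here: if 'wanted' is not found, the caller receives an
    // indeterminate value, which may then be used as an array index.
}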