Is it really possible to find the endianness of a machine at runtime, without using environment variables or compiler flags? - endianness

Is it really possible to find the endianness of a machine at runtime ? Is it not hidden from the developers by the compiler ?
I see all sorts of answers in this forum and in the entire internet. some common solutions are like this:-
Getting the first byte of an integer variable with value 1( variable | 0xff).
Type-casting a character array to multi-byte data-type and checking the value with char[0].
Using a union etc.
Now, the question is, does the compiler not hide the endianness from the developer? In other words, in solution#2 above. Will the compiler not always fetch the 0th index of character irrespective of the underlying architecture(endianess) ?

Related

Why does the Tool Help Library offer 2 versions of same functions/structures?

I've noticed that the Tool Help Library offers some functions and structures with 2 versions: normal and ending with W. For example: Process32First and Process32FirstW. Since their documentation is identical, I wonder what are the differences between those two?
The W and A versions stand for "wide" and "ANSI". In the past they made different functions, structures and types for both ANSI and unicode strings. For the purpose of this answer, unicode is widechar which is 2 bytes per character and ANSI is 1 byte per character (but it's actually more complicated than that). By supplying both types, the developer can use whichever he wants but the standard today is to use unicode.
If you look at the ToolHelp32 header file it does include both A and W versions of the structures and functions. If you're not finding them, you're not looking hard enough, do an explicit search for the identifiers and you will find them. If you're just doing "view definition" you will find the #ifdef macros. If you still can't find them, change your character set in your Visual Studio project and check again.
Due to wide char arrays being twice the size, structure alignment will be incorrect if you do not use the correct types. Let the macros resolve them for you, by setting the correct character set and using PROCESSENTRY32 instead of indicating A or W, this is the preferred method. Some APIs you are better off using the ANSI version to be honest but that is something you will learn with experience and have to make your own decision.
Here is an excellent article on the topic of character sets / encoding

How are the allowed values of _POSIX_THREADS in Open Group Base Spec 6 to be interpreted?

This version of the POSIX spec states that the allowed values for the symbol _POSIX_THREADS are -1, 0, or 200112L, but does not state what each value represents.
Comments in boost suggest that values greater than zero indicate posix thread support, but the nearby preprocessor check appears to interpret zero as meaning 'threads enabled' as well.
How are the three permitted values to be interpreted? In particular, does -1 mean 'no threads'? Does zero mean threads or no threads? I'm guessing that 200112L means threads, but I'd be interested in more information on that as well.
Basically I just want to verify that the boost preprocessor check is the correct way to test for the presence of posix threads, despite what looks to be a slightly misleading comment.
To quote the POSIX spec page you reference:
If a symbolic constant is defined with the value -1, the option is not supported. Headers, data types, and function interfaces required only for the option need not be supplied. An application that attempts to use anything associated only with the option is considered to be requiring an extension.
If a symbolic constant is defined with a value greater than zero, the option shall always be supported when the application is executed. All headers, data types, and functions shall be present and shall operate as specified.
If a symbolic constant is defined with the value zero, all headers, data types, and functions shall be present. The application can check at runtime to see whether the option is supported by calling fpathconf(), pathconf(), or sysconf() with the indicated name parameter.

What does "_" mean in gcc?

In some gcc code I came across the following construct.
fatal (_("%s: cannot find section %s"), file_name, section_name);
I have never seen "_" in this context.
It is some sort of construct to create an entity from the character array, very probably a compiler extension.
Can someone tell me what it is?
It is usually a macro associated with the GNU gettext project, used for internationalization. The idea is the passed string is a key in a lookup table. There is one such table for each supported language, with the current one decided by handful of environmental factors.
The value found in the table should be a translation of the key, into the target language.
Since looking up such translated strings is a common activity in i18n code, _ is introduced as a convenient, short name for the lookup function.

When I use Conditional Compilation Arguments to Exclude Code, why doesn't VB6 EXE file size change?

Basically, when declaring Windows API functions in my VB6 code, there comes with these many constants that need to be declared or used with this function, in fact, usually most of these constants are not used and you only end up using one of them or so when making your API calls, so I am using Conditional Compilation Arguments to exclude these (and other things) using something like this:
IncludeUnused = 0 : Testing = 1
(this is how I set two conditional compilation arguments (they are of Boolean type by default).
So, many unused things are excluded like this:
#If IncludeUnused Then
' Some constant declarations and API declarations go here, sometimes functions
' and function calls go here as well, so it's not just declarations and constants
#End If
I also use a similar wrapper using the Testing Boolean declared in the Conditional Compilation Argument input field in the VB6 Properties windows "Make" tab. The Testing Boolean is used to display message boxes and things like that when I am in testing mode, and of course, these message boxed are removed (not displayed) if I have Testing set to 0 (and it is obviously 1 when I am Testing).
The problem is, I tried setting IncludeUnused and Testing to 0 and 1 and visa versa, a total of four (4) combinations, and no matter what combination I set these values to, the output EXE file size for my VB6 EXE does not change! It is always 49,152 when compiled to Native Code using Fast Code, and when using Small Code.
Additionally, if I compile to p-code under the four (4) combinations of Testing and IncludeUnused, i always end up with the file size 32,768 no matter what.
This is driving me crazy, since it is leading me to believe that no change is actually occuring, even though it is. Why is it that when segments of code are excluded from compilation, the file size is still the same? What am I missing or doing wrong, or what have I miscalculated?
I have considered the option that perhaps VB6 automatically does not compile code which is not used into the final output EXE, but I have read from a few sources that this is not true, in that, if it's included, it is compiled (correct me if I am wrong), and if this is right, then there is no need to use the IncludeUnused Boolean to remove unused code...?
If anyone can shed some light on these thoughts, I'd greatly appreciate it.
It could well be that the size difference is very small and that the exe size is padded to the next 512 or 1024 byte alignment. Try compressing the exe's with zip and see if the zip-file sizes differ.
You misunderstand what a compiler does. The output of the VB6 compiler is code. Constants are merely place holders for values, they are not code. The compiler adds them to its symbol table. And when it later encounters a statement in your code that uses the constant then it replaces the constant by its value. That statement produces the exact same code whether you use a constant or hard-code the value in the statement.
So this automatically implies that if you never actually use the constant anywhere then there is no difference at all in the generated code. All that you accomplished by using the #If is to keep the compiler's symbol table smaller. Which is something that makes very little sense to do, the actual gain from compilation speed you get is not measurable. Symbol tables are implemented as hash tables, they have O(1) amortized complexity.
You use constants only to make your code more readable. And to make it easy to change a constant value if the need ever arises. By using #If, you actually made your code less readable.
You can't test runtime data in conditional compilation directives.
These directives use expressions made up of literal values, operators, and CC constants. One way to set constant values is:
#Const IncludeUnused = 0
#Const Testing = 1
You can also define them via Project Properties for IDE testing. Go to the Make tab in that dialog and click the Help button for details.
Perhaps this is where you are setting the values? If so, consider this just additional info for later readers rather than an answer.
See #If...Then...#Else Directive
VB6 executable sizes are padded to 4KB blocks, so if the code difference is small it will make no difference to the executable.

How can I force the order of functions in a binary with the gcc toolchain?

I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing i found was -fno-toplevel-reorder which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function) or even forcing functions having a particular order (and if you could enforce the order that would still not mean that the addresses stay the same when the source is changed!).
The biggest problem that I see is that even if it may be possible to fix a function to some address, it will be sheer impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might be almost easiest to provide trampolines at the addresses that the other program expects, and having the real functions (whereever they may be) pointed to by these. That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its proper section using __attribute__ ((section ("some name"))). Unluckily, .text always appears as the first section, so if anything in .text changes so the size is bumped over the 512 byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading wrong), this means you should be able to pack all the functions that you want located in the front into one section .text$A, and everything else into .text$B, and it should do just that.
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes from the order the functions are in the file and the order the files are on the command line when you link.
Embed something in the code that your external code can find, a const structure with some ascii code and the address to functions perhaps, then no matter where the compiler puts the functions you can find them.
that or use the normal .dll or .so mechanisms, and not have to mess with it.
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.

Resources