When I use Conditional Compilation Arguments to Exclude Code, why doesn't VB6 EXE file size change? - performance

When declaring Windows API functions in my VB6 code, I also have to declare the many constants that go with them. Usually most of these constants are never used, and I end up using only one or so in my API calls. So I am using Conditional Compilation Arguments to exclude these (and other things), set like this:
IncludeUnused = 0 : Testing = 1
(This is how I set two conditional compilation arguments; they are of Boolean type by default.)
So, many unused things are excluded like this:
#If IncludeUnused Then
' Some constant declarations and API declarations go here, sometimes functions
' and function calls go here as well, so it's not just declarations and constants
#End If
I also use a similar wrapper with the Testing Boolean, declared in the Conditional Compilation Arguments field on the "Make" tab of the VB6 Project Properties window. The Testing Boolean is used to display message boxes and the like when I am in testing mode; these message boxes are removed (not displayed) when Testing is set to 0 (and it is obviously 1 when I am testing).
The problem is, I tried setting IncludeUnused and Testing to 0 and 1 and vice versa, a total of four (4) combinations, and no matter which combination I use, the output EXE file size does not change! It is always 49,152 bytes when compiled to Native Code, with either Fast Code or Small Code.
Additionally, if I compile to p-code under the four (4) combinations of Testing and IncludeUnused, I always end up with a file size of 32,768 bytes, no matter what.
This is driving me crazy, since it leads me to believe that no change is actually occurring, even though it is. Why is the file size still the same when segments of code are excluded from compilation? What am I missing or doing wrong, or what have I miscalculated?
I have considered the possibility that VB6 automatically leaves unused code out of the final EXE, but I have read from a few sources that this is not true: if code is included, it is compiled (correct me if I am wrong). If VB6 did drop unused code by itself, there would be no need for the IncludeUnused Boolean to remove it...?
If anyone can shed some light on these thoughts, I'd greatly appreciate it.

It could well be that the size difference is very small and that the exe size is padded to the next 512 or 1024 byte alignment. Try compressing the exe's with zip and see if the zip-file sizes differ.
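The compression check can also be scripted. The following is only a sketch in Python: the two byte strings are made-up stand-ins for two builds (not real VB6 output), and `pad_to` and `compressed_size` are hypothetical helper names. It shows that two files padded to the same size can still compress to different sizes.

```python
import zlib


def pad_to(data: bytes, align: int = 4096) -> bytes:
    """Append zero bytes until len(data) is a multiple of align."""
    return data + b"\x00" * (-len(data) % align)


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib (deflate) compression."""
    return len(zlib.compress(data, 9))


# Hypothetical stand-ins for two builds: build_b carries 200 extra bytes.
build_a = pad_to(b"code" * 100)
build_b = pad_to(b"code" * 100 + bytes(range(200)))

if __name__ == "__main__":
    # Padded sizes are identical, compressed sizes are not.
    print(len(build_a), len(build_b))
    print(compressed_size(build_a), compressed_size(build_b))
```

For real executables, the byte strings would be replaced by the contents of the two EXE files, for example read with `pathlib.Path(...).read_bytes()`.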

You misunderstand what a compiler does. The output of the VB6 compiler is code. Constants are merely placeholders for values; they are not code. The compiler adds them to its symbol table, and when it later encounters a statement in your code that uses the constant, it replaces the constant by its value. That statement produces the exact same code whether you use a constant or hard-code the value in the statement.
So this automatically implies that if you never actually use the constant anywhere, there is no difference at all in the generated code. All that you accomplished by using the #If is to keep the compiler's symbol table smaller. That makes very little sense to do, because the gain in compilation speed is not measurable. Symbol tables are implemented as hash tables, which have O(1) amortized complexity.
You use constants only to make your code more readable. And to make it easy to change a constant value if the need ever arises. By using #If, you actually made your code less readable.

You can't test runtime data in conditional compilation directives.
These directives use expressions made up of literal values, operators, and CC constants. One way to set constant values is:
#Const IncludeUnused = 0
#Const Testing = 1
You can also define them via Project Properties for IDE testing. Go to the Make tab in that dialog and click the Help button for details.
Perhaps this is where you are setting the values? If so, consider this just additional info for later readers rather than an answer.
See #If...Then...#Else Directive

VB6 executable sizes are padded to 4KB blocks, so if the code difference is small it will make no difference to the executable.
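A quick arithmetic check supports this. Both sizes reported in the question are exact multiples of 4,096 (49,152 = 12 x 4,096 and 32,768 = 8 x 4,096). The sketch below, in Python, rounds a raw size up to the next block; `padded_size` is a hypothetical helper and the raw sizes 45,100 and 47,900 are invented for illustration.

```python
def padded_size(raw_size: int, align: int = 4096) -> int:
    """Round raw_size up to the next multiple of align."""
    return -(-raw_size // align) * align


if __name__ == "__main__":
    # Two invented raw sizes that differ by 2,800 bytes land in the same block.
    for raw in (45_100, 47_900):
        print(raw, "->", padded_size(raw))  # both print 49152
    # The sizes from the question, expressed in 4 KB blocks.
    print(49_152 // 4096, 32_768 // 4096)  # prints 12 8
```

Under this model, any change that keeps the raw size between 45,057 and 49,152 bytes produces the same 49,152-byte file.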

Related

Visual Studio Obfuscation

I am trying to test different obfuscators. Before obfuscating, I ran the Reko decompiler on the exe. It seems that the exe is already obfuscated (please look at the screenshot). Can someone please explain why all the methods and variables look as if the exe were already obfuscated?
Symbol names are not compiled into executable machine code.
They can be preserved, but in that case they are saved in a separate .pdb file. If you don't generate it during the build, or don't make it available to the debugger/decompiler, it cannot figure out variable and function names (except for the imported/exported ones).
High level constructs, like for or while are implemented with jumps and conditional jumps, so it is not possible to figure out if a loop was implemented via for or goto or if a conditional was if statement or ternary operator.
Optimization transforms code heavily: it throws away unnecessary parts, performs some operations at compile time, and so on.

Find write statement in Fortran

I'm using Fortran for my research and sometimes, for debugging purposes, someone will insert in the code something like this:
write(*,*) 'Variable x:', varx
The problem is that sometimes we forget to remove that statement from the code, and it becomes difficult to find where the output is coming from. I can usually get a good idea of where it is from the label 'Variable x', but sometimes that information is not present and I just see random numbers showing up.
One can imagine that doing a grep for write(*,*) is basically useless so I was wondering if there is an efficient way of finding my culprit, like forcing every call of write(*,*) to print a file and line number, or tracking stdout.
Thank you.
Intel's Fortran preprocessor defines a number of macros, such as __FILE__ and __LINE__, which will be replaced by, respectively, the file name (as a string) and line number (as an integer) when the preprocessor runs. For more details consult the documentation.
GFortran offers similar facilities, consult the documentation.
Perhaps your compiler offers similar capabilities.
As has been previously implied, there's no Fortran way (although there may be a compiler-specific approach) to change the behaviour of the write statement as you want. However, as your problem has more to do with handling (unintentionally produced) bad code, there are options.
If you can't easily find an unwanted write(*,*) in your code that suggests that you have many legitimate such statements. One solution is to reduce the count:
use an explicit format, rather than list-directed output (* as the format);
instead of * as the output unit, use output_unit from the intrinsic module iso_fortran_env.
[Having an explicit format for "proper" output is a good idea, anyway.]
If that fails, use your version control system to compare an old "good" version against the new "bad" version. Perhaps even have your version control system flag/block commits with new write(*,*)s.
And if all that still doesn't help, then the pre-processor macros previously mentioned could be a final resort.
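The version-control suggestion can be sketched as a small check over a unified diff. The Python below is an illustration only: it assumes the diff comes from a tool that emits standard unified diffs (such as `git diff`, which the answers above do not name), and `added_writes` and `SAMPLE_DIFF` are hypothetical. It reports each newly added list-directed write with its file and line number.

```python
import re

# Matches list-directed writes to the default unit, e.g. write(*,*) or WRITE ( * , * )
WRITE_STAR = re.compile(r"write\s*\(\s*\*\s*,\s*\*\s*\)", re.IGNORECASE)
# Captures the starting line number of the new file in a hunk header
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_writes(diff_text):
    """Return (file, line_number, text) for every added write(*,*) line."""
    hits = []
    current_file = None
    new_line = 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # New-file header; drop the "b/" prefix some tools add.
            current_file = line[4:].strip()
            if current_file.startswith("b/"):
                current_file = current_file[2:]
        elif line.startswith("--- ") or line.startswith("\\"):
            continue  # old-file header or "\ No newline at end of file"
        elif line.startswith("@@"):
            match = HUNK.match(line)
            if match:
                new_line = int(match.group(1))
        elif line.startswith("+"):
            if WRITE_STAR.search(line[1:]):
                hits.append((current_file, new_line, line[1:].strip()))
            new_line += 1
        elif line.startswith("-"):
            continue  # removed lines do not exist in the new file
        else:
            new_line += 1  # context line
    return hits


# Made-up diff used only to exercise the function.
SAMPLE_DIFF = "\n".join([
    "diff --git a/solver.f90 b/solver.f90",
    "--- a/solver.f90",
    "+++ b/solver.f90",
    "@@ -10,3 +10,5 @@",
    "   x = x + dx",
    "+  write(*,*) varx",
    "   call step(x)",
    "+  WRITE ( * , * ) 'done'",
    "   end do",
])

if __name__ == "__main__":
    for path, number, text in added_writes(SAMPLE_DIFF):
        print(f"{path}:{number}: {text}")
```

This is a simplification: an added or removed source line whose own text begins with "++ " or "-- " would be mistaken for a file header. A check like this could run in a commit hook to flag or block new write(*,*) statements.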

VB6 random double overflow errors

Does anyone know a cause for random overflow errors in VB6?
I have to customize a legacy application written in VB6 and lately overflow errors have started to occur all over the place. Sometimes in functions which have not been touched in years!
The error always happens when trying to assign something to a variable of type Double.
The reason for those errors is probably not the code that throws the error but something else, and I don't know what to look for. The most confusing example of a function failing with an overflow error was the following code:
Dim test As Double
test = 0#
How can that possibly throw an overflow error?
I tried enabling some compiler optimizations, like not checking for floating point calculation errors, and some more. This has "solved" some of the problems, but others remain.
VB6 runs things in such a way that if something external sets a floating-point error flag, the error is not reported until the next floating-point operation is performed within your own code.
In most cases this is caused by some DLL that performs floating-point operations. If you have any control over these external DLLs, my suggestion is to put this line at the end of the functions called by your application:
_clearfp();
This function is documented here: http://msdn.microsoft.com/en-us/library/49bs2z07.aspx
If you do not have much control, you can get around this by writing your own DLL with a function that calls _clearfp(), and calling that from VB6. Or a simple hack using only VB6 is:
Public Sub ClearFP()
    ' Swallow any pending floating-point error so it does not surface later
    On Error Resume Next
    Dim d As Double
    d = 0#
End Sub
You can call this after any DLL calls that you believe are the culprits.
A trick for isolating which function caused the error is simply to look at the calls made before the error appears. Alternatively, a more complicated solution is to compile your application and run it through a debugger that can break on floating-point exceptions.
In VB6 the hash (#) symbol can mean many things:
Used in file names
Used with dates, usually when applied to DBs
To treat Numbers as Doubles
To compile constants or sections of code if a condition is true
I'm sure there are more.
It may depend on the compiler.
My suggestion would be to try:
Dim test As Double
test = CDbl(0)
to see if that resolves the issue.

How can I force the order of functions in a binary with the gcc toolchain?

I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing I found was -fno-toplevel-reorder, which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function) or even forcing functions having a particular order (and if you could enforce the order that would still not mean that the addresses stay the same when the source is changed!).
The biggest problem that I see is that even if it may be possible to fix a function to some address, it will be sheer impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might be easiest to provide trampolines at the addresses that the other program expects, and have these point to the real functions (wherever they may be). That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its proper section using __attribute__ ((section ("some name"))). Unluckily, .text always appears as the first section, so if anything in .text changes so the size is bumped over the 512 byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
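The effect of a large -falign-functions value can be illustrated with a toy model. The Python sketch below is not how the linker actually lays out code; it only places functions one after another, rounding each start address up to the alignment. The function sizes are invented and `layout` is a hypothetical helper.

```python
def layout(sizes, align, base=0):
    """Return start addresses when each function is aligned to `align` bytes."""
    addresses = []
    addr = base
    for size in sizes:
        addr = -(-addr // align) * align  # round up to the next boundary
        addresses.append(addr)
        addr += size
    return addresses


if __name__ == "__main__":
    before = [300, 200, 500]   # invented function sizes in bytes
    after = [400, 200, 500]    # the first function grew by 100 bytes
    print(layout(before, 16), layout(after, 16))      # later addresses shift
    print(layout(before, 1024), layout(after, 1024))  # later addresses stay put
```

In this model, 16-byte alignment moves every later function when the first one grows, while 1,024-byte alignment absorbs the growth until a function crosses its 1,024-byte slot. It says nothing about reordering, which the compiler and linker remain free to do.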
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading wrong), this means you should be able to pack all the functions that you want located in the front into one section .text$A, and everything else into .text$B, and it should do just that.
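To make the quoted rule concrete, here is a small Python model of it. It simulates only what the quotation states (drop "$" and everything after it to get the image section, and order contributions lexically by object-section name); it is not a linker, and the symbol names and `image_layout` are hypothetical.

```python
def image_layout(contributions):
    """Group (object_section, symbol) pairs into image sections.

    Contributions are ordered lexically by object-section name; the part of
    the name from "$" onward only affects ordering, not the image section.
    """
    result = {}
    # sorted() is stable, so equal section names keep their input order.
    for section, symbol in sorted(contributions, key=lambda item: item[0]):
        image_section = section.split("$", 1)[0]
        result.setdefault(image_section, []).append(symbol)
    return result


if __name__ == "__main__":
    objects = [
        (".text$B", "helper"),
        (".text$A", "api_one"),
        (".data$Z", "table"),
        (".text$A", "api_two"),
        (".text$B", "main"),
    ]
    for name, symbols in image_layout(objects).items():
        print(name, symbols)
```

With these inputs, everything tagged .text$A ends up ahead of everything tagged .text$B inside the single .text image section, which is the behaviour the answer hopes to exploit.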
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes from the order the functions are in the file and the order the files are on the command line when you link.
Embed something in the code that your external code can find, a const structure with some ascii code and the address to functions perhaps, then no matter where the compiler puts the functions you can find them.
That, or use the normal .dll or .so mechanisms and not have to mess with it.
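The embedded-table idea can be sketched from the side of the external tool. In the Python below, the marker string, the table layout (marker, 32-bit count, then 64-bit little-endian addresses) and the addresses themselves are all assumptions made up for illustration; a real binary would need whatever layout its own const structure uses.

```python
import struct

MARKER = b"FNTABLE1"  # hypothetical ASCII tag embedded in the binary


def build_table(addresses):
    """Serialize the table as it might appear inside the binary."""
    count = len(addresses)
    return MARKER + struct.pack("<I", count) + struct.pack(f"<{count}Q", *addresses)


def find_table(image: bytes):
    """Locate the marker in the image and return the addresses that follow it."""
    pos = image.find(MARKER)
    if pos < 0:
        raise ValueError("marker not found")
    offset = pos + len(MARKER)
    (count,) = struct.unpack_from("<I", image, offset)
    return list(struct.unpack_from(f"<{count}Q", image, offset + 4))


if __name__ == "__main__":
    # Fake image: filler bytes, then the table, then more filler.
    image = b"\x90" * 100 + build_table([0x401000, 0x401230]) + b"\x00" * 50
    print([hex(address) for address in find_table(image)])
```

Because the tool searches for the marker instead of relying on fixed offsets, the functions can move freely between builds.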
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.

GCC hidden/little-known features

This is my attempt to start a collection of GCC special features that one does not usually encounter. It comes after @jlebedev mentioned, in another question, the "Effective C++" option for g++:
-Weffc++
This option warns about C++ code which breaks some of the programming guidelines given in the books "Effective C++" and "More Effective C++" by Scott Meyers. For example, a warning will be given if a class which uses dynamically allocated memory does not define a copy constructor and an assignment operator. Note that the standard library header files do not follow these guidelines, so you may wish to use this option as an occasional test for possible problems in your own code rather than compiling with it all the time.
What other cool features are there?
From time to time I go through the current GCC/G++ command line parameter documentation and update my compiler script to be even more paranoid about any kind of coding error. Here it is if you are interested.
Unfortunately I didn't document them so I forgot most, but -pedantic, -Wall, -Wextra, -Weffc++, -Wshadow, -Wnon-virtual-dtor, -Wold-style-cast, -Woverloaded-virtual, and a few others are always useful, warning me of potentially dangerous situations. I like this aspect of customizability, it forces me to write clean, correct code. It served me well.
However they are not without headaches, especially -Weffc++. Just a few examples:
It requires me to provide a custom copy constructor and assignment operator if there are pointer members in my class, which are useless since I use garbage collection. So I need to declare empty private versions of them.
My NonInstantiable class (which prevents instantiation of any subclass) had to implement a dummy private friend class so G++ didn't whine about "only private constructors and no friends"
My Final<T> class (which prevents subclassing of T if T derived from it virtually) had to wrap T in a private wrapper class to declare it as friend, since the standard flat out forbids befriending a template parameter.
G++ recognizes functions that never return a return value, and throw an exception instead, and whines about them not being declared with the noreturn attribute. Hiding behind always true instructions didn't work, G++ was too clever and recognized them. Took me a while to come up with declaring a variable volatile and comparing it against its value to be able to throw that exception unmolested.
Floating point comparison warnings. Oh god. I have to work around them by writing x <= y and x >= y instead of x == y where it is acceptable.
Shadowing virtuals. Okay, this is clearly useful to prevent stupid shadowing/overloading problems in subclasses but still annoying.
No previous declaration for functions. Kinda lost its importance as soon as I started copypasting the function declaration right above it.
It might sound a bit masochist, but as a whole, these are very cool features that increased my understanding of C++ and general programming.
What other cool features does G++ have? Well, it's free and open, it's one of the most widely used and modern compilers, it consistently outperforms its competitors, it can eat almost anything people throw at it, it's available on virtually every platform, it's customizable to hell, it's continuously improved, and it has a wide community. What's not to like?
A function that returns a value (for example an int) will return a random value if a code path is followed that ends the function without a 'return value' statement. Not paying attention to this can result in exceptions and out of range memory writes or reads.
For example if a function is used to obtain the index into an array, and the faulty code path is used (the one that doesn't end with a return 'value' statement) then a random value will be returned which might be too big as an index into the array, resulting in all sorts of headaches as you wrongly mess up the stack or heap.

Resources