VC++ compilation time and performance

I'm working on a multiplatform project (MacOS, Linux and Windows) and I've been having some performance issues when trying to compile a big source file in VS C++ 2010.
Here's a little background. There's one .cpp file inside the project that is 800 KB in size. The file is that big because I'm compiling an array that contains image information. So, it's a huge unsigned char array that can't be split.
Now, I've been working on MacOS during the last couple of months, so I didn't notice this problem until some days ago. On both MacOS and Linux, gcc compiles the file in a second or so, but when I use VC++ it takes about an hour.
At first I thought it was caused by the computer itself, since it's not a fast one. But then I tried Cygwin and GCC 4 on the same machine and the compilation time was almost as fast as on MacOS. So I have to assume the problem is caused by something within VC++ 2010.
I haven't tweaked VC++ in any way. The project files are generated by CMake, so I believe there should be some room for optimization here. Any help will be appreciated.
Thanks.
Hernan

Any chance you can place that large array into a separate resource file and read it in that way? That's how I would go about fixing this problem if that array is indeed the cause. Failing that, I'd place the array in its own file so that it doesn't get recompiled often.
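For example, here is a minimal sketch of loading the bytes at run time instead of compiling them in (the file name image.bin and the function name are just placeholders for wherever your build puts the data):

#include <fstream>
#include <iterator>
#include <vector>

// Read the whole binary file into a byte vector at run time,
// so the compiler never has to parse a giant array initializer.
std::vector<unsigned char> loadImageData(const char *path)
{
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(
        std::istreambuf_iterator<char>(in),
        std::istreambuf_iterator<char>());
}

// Usage: std::vector<unsigned char> data = loadImageData("image.bin");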

Looks like there is some O(n^k) part of VC++ with k>1 when parsing array initializers...
That would qualify as an algorithmic bug you cannot do much about, but something that may work is:
unsigned char bdata[][100] = {
    { 0x01, 0x02, ... , 0x63 },
    { 0x64, 0x65, ... , 0xC7 },
    { 0xC8, 0xC9, ... , 0x2B },
    ...
};
unsigned char *data = &(bdata[0][0]);
that is, breaking the data into 100-byte rows... Maybe this will be parsed/compiled a lot faster by VC (just a suspicion I have, given the symptoms), and it shouldn't change your build process by much.
I don't use VC++ 2010 so I cannot check.
Just be aware that sizeof(data) in this case will be just the size of a pointer, while sizeof(bdata) will be the size of the image, rounded up to a multiple of the row size.
If this version compiles at the same speed, then unfortunately the parsing really is O(n^k) in the number of bytes and you're basically doomed if you want that data compiled as an array.
Another option could be using a huge string literal... the compiler may handle that better (maybe they coded a special path for string literals, because "big" string literals are not so uncommon), but your code generator will have to handle the escaping of special chars.
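A rough sketch of that idea (the byte values here are placeholders; a generator would emit one escape per byte, and octal escapes are a safe choice because they stop at three digits, so adjacent bytes cannot merge):

// Image bytes emitted as octal escapes inside one big literal.
const char imageLiteral[] =
    "\001\002\003\004"
    "\144\145\146\147";  // ... continued for the whole image

// Note: sizeof(imageLiteral) is the byte count plus one for the
// terminating '\0' the compiler appends.
const unsigned char *data =
    reinterpret_cast<const unsigned char *>(imageLiteral);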

Related

Resources listing known compiler bugs in VC++ 6.0

Is there a resource listing CString fixes between VC6.0 and Visual Studio 2010? We have encountered what appears to be a compiler bug in VC6.0 SP6 with code that works in 2010.
I'm working to distill it into a small test case, but essentially, in cases where ~300 strings are referenced, two nearly identical strings resolve such that one is lost at the assembly level. It seems like a possible hash table collision internal to VC6.0.
I need to prove this for a VC6.0 workaround solution. (Our legacy code is VC6.0.) I'll try to post a code snippet once I can / (if I can) distill it to something I can post.
Visual C++ uses COMDAT naming to support the /GF string pooling flag (which is implied by /ZI). However, under VC++ 6.0 the symbol name length was truncated to 256 characters.
I suspect your strings have identical prefixes up to the 256th character.
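A quick way to test that hypothesis might be something like this (entirely hypothetical strings; the point is only that they share an identical 256-character prefix and differ after it):

#include <stdio.h>

/* Build a 256-character common prefix out of smaller pieces. */
#define A16  "AAAAAAAAAAAAAAAA"      /* 16 chars */
#define A64  A16 A16 A16 A16        /* 64 chars */
#define A256 A64 A64 A64 A64        /* 256 chars */

static const char *s1 = A256 "first";
static const char *s2 = A256 "second";

int main(void)
{
    /* If the truncated-COMDAT theory is right, VC6 with /GF may
       pool these into one string; both lines would then print the
       same suffix. */
    printf("%s\n%s\n", s1 + 256, s2 + 256);
    return 0;
}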

Trying deterministic gcc compilation, symbol table problems

I work on embedded systems and I am trying to make a build that yields exactly the same executable each time. Using -frandom-seed certainly helped to stabilize names that were otherwise variable, but I still have a couple of symbols that I have problems with. For example:
0x00003bfc _ZN13WorkingMemory17ReadTransactionalERN3HSL4FileERN58_GLOBAL__N_......_.._working_memory.cc_AE42A16A_FF4623503AllE
The ".._.." etc. part was evidently worked out of what I passed as -frandom-seed, id est, the source filename. Of the couple of hex number that follows sometimes, the second one sometimes is different, and I guess it is probably linked to the compilation date, but I am not sure.
I am working on ARM, using gcc 3.4.0, using FLAT executables. I tried to remove symbols using strip on the ELF file, but that prevents FLAT conversion.
Any ideas?

Converting a working code from double-precision to quadruple-precision: How to read quadruple-precision numbers in FORTRAN from an input file

I have a big, old, FORTRAN 77 code that has worked for many, many years with no problems.
Double-precision is not enough anymore, so to convert to quadruple-precision I have:
Replaced all occurrences of REAL*8 with REAL*16
Replaced all specific functions like DCOS() with generic functions like COS()
Replaced all literal constants like 0.d0 with 0.q0, and 1D+01 with 1Q+01
The program compiles with no errors or warnings with the gcc-4.6 compiler on
operating system: openSUSE 11.3 x86_64 (a 64-bit operating system)
hardware: Intel Xeon E5-2650 (Sandy Bridge)
My LD_LIBRARY_PATH variable is set to the 64-bit library folder:
/gcc-4.6/lib64
The program reads an input file that has numbers in it.
Those numbers used to be of the form 1.234D+02 (for the double-precision version of the code, which works).
I have changed them so that number is now 1.234Q+02; however, I get the runtime error:
Bad real number in item 1 of list input
indicating that the subroutine that reads the data from the input file (called read.f) does not find the first number in the input file to be compatible with what it expected.
Strangely, the quadruple-precision version of the code does not complain when the input file contains numbers like 1.234D+02 or 123.4 (which, based on the output, seem to be converted automatically to the form 1.234D+02 rather than 1.234Q+02); it just does not like the Q+02. So it seems that gcc-4.6 does not allow quadruple-precision numbers in scientific notation to be read in from input files!
Has anyone ever been able to read a quadruple-precision number in scientific notation (i.e., like 1.234Q+02) from an input file in FORTRAN with a gcc compiler, and if so, how did you get it to work? (Or did you need a different compiler/operating system/hardware to get it to work?)
Almost all of this is already in the comments by @IanH and @Vladimi.
I suggest mixing a little Fortran 90 into your FORTRAN 77 code.
Write all of your numbers with "E". Change your other program to write the data this way. Don't bother with "D" and don't try to use the infrequently supported "Q". (Using "Q" in constants in source code is a gfortran extension -- see section 6.1.8 in the manual.)
Since you want the same source code to support two precisions, at the top of the program, have:
use ISO_FORTRAN_ENV
integer, parameter :: WP = real128
or
use ISO_FORTRAN_ENV
integer, parameter :: WP = real64
as the variation that changes whether your code is using double or quadruple precision. This uses the ISO Fortran environment to select the kinds by their number of bits. (The use statement needs to go between the program statement and implicit none; the parameter declaration goes after implicit none.)
Then declare your real variables via:
real (WP) :: MyVar
In source code, write real constants as 1.23456789012345E+12_WP. The _WP suffix is the Fortran 90 way of specifying the kind of a constant. This way you can go back and forth between double and quadruple precision by changing only the single line defining WP.
WP == Working Precision.
Just use "E" in input files. Fortran will read according to the type of the variable.
Why not write a tiny test program to try it out?
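A minimal sketch of such a test, assuming a gfortran version whose ISO_FORTRAN_ENV provides real128 (the program name and sample input are arbitrary):

program test_quad_read
    use ISO_FORTRAN_ENV
    implicit none
    integer, parameter :: WP = real128
    real(WP) :: x
    ! Type an "E"-format value such as 1.234E+02 on standard input;
    ! list-directed input reads it at the precision of x.
    read (*, *) x
    print *, x
end program test_quad_read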

When I use Conditional Compilation Arguments to Exclude Code, why doesn't VB6 EXE file size change?

Basically, when declaring Windows API functions in my VB6 code, there come many constants that need to be declared for use with those functions. In fact, usually most of these constants are not used and you only end up using one of them or so when making your API calls, so I am using conditional compilation arguments to exclude these (and other things), using something like this:
IncludeUnused = 0 : Testing = 1
(This is how I set two conditional compilation arguments; they are of Boolean type by default.)
So, many unused things are excluded like this:
#If IncludeUnused Then
' Some constant declarations and API declarations go here, sometimes functions
' and function calls go here as well, so it's not just declarations and constants
#End If
I also use a similar wrapper with the Testing Boolean, declared in the conditional compilation arguments input field on the "Make" tab of the VB6 project properties window. The Testing Boolean is used to display message boxes and things like that when I am in testing mode, and of course, these message boxes are removed (not displayed) if I have Testing set to 0 (and it is obviously 1 when I am testing).
The problem is, I tried setting IncludeUnused and Testing to 0 and 1 and vice versa, a total of four (4) combinations, and no matter what combination I set these values to, the output EXE file size for my VB6 EXE does not change! It is always 49,152 bytes when compiled to native code, whether optimizing for fast code or for small code.
Additionally, if I compile to p-code under the four (4) combinations of Testing and IncludeUnused, I always end up with a file size of 32,768 bytes no matter what.
This is driving me crazy, since it is leading me to believe that no change is actually occurring, even though it is. Why is it that when segments of code are excluded from compilation, the file size is still the same? What am I missing or doing wrong, or what have I miscalculated?
I have considered the possibility that VB6 automatically leaves code that is never used out of the final output EXE, but I have read from a few sources that this is not true -- that if it's included, it is compiled (correct me if I am wrong) -- and if that is right, then there would be no need to use the IncludeUnused Boolean to remove unused code...?
If anyone can shed some light on these thoughts, I'd greatly appreciate it.
It could well be that the size difference is very small and that the EXE size is padded to the next 512- or 1024-byte alignment. Try compressing the EXEs with zip and see if the zip file sizes differ.
You misunderstand what a compiler does. The output of the VB6 compiler is code. Constants are merely placeholders for values; they are not code. The compiler adds them to its symbol table, and when it later encounters a statement in your code that uses the constant, it replaces the constant with its value. That statement produces the exact same code whether you use a constant or hard-code the value in the statement.
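For example, a tiny sketch with a hypothetical constant (both print statements compile to identical code):

Const MAX_ITEMS As Long = 100

Sub Demo()
    ' The compiler substitutes 100 for MAX_ITEMS before generating
    ' code, so the two statements below produce the same machine code.
    Debug.Print MAX_ITEMS
    Debug.Print 100
End Sub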
So this automatically implies that if you never actually use the constant anywhere, then there is no difference at all in the generated code. All that you accomplished by using the #If is to keep the compiler's symbol table smaller. That makes very little sense to do; the gain in compilation speed is not measurable. Symbol tables are implemented as hash tables; they have O(1) amortized complexity.
You use constants only to make your code more readable, and to make it easy to change a constant value if the need ever arises. By using #If, you actually made your code less readable.
You can't test runtime data in conditional compilation directives.
These directives use expressions made up of literal values, operators, and CC constants. One way to set constant values is:
#Const IncludeUnused = 0
#Const Testing = 1
You can also define them via Project Properties for IDE testing. Go to the Make tab in that dialog and click the Help button for details.
Perhaps this is where you are setting the values? If so, consider this just additional info for later readers rather than an answer.
See #If...Then...#Else Directive
VB6 executable sizes are padded to 4 KB blocks, so if the code difference is small, it will make no difference to the executable size.

Where Is gcvt or gcvtf Defined in gcc Source Code?

I'm working on some old source code for an embedded system on an m68k target, and I'm sometimes seeing massive memory allocation requests when calling gcvtf to format a floating-point number for display. I can probably work around this by writing my own substitute routine, but the nature of the error has me very curious, because it only occurs when the heap starts at or above a certain address, and it goes away if I hack the .ld linker script or remove any set of global variables (which are placed before the heap in my memory map) that adds up to enough bytes that the heap starts below the mysterious critical address.
So, I thought I'd look in the gcc source code for the compiler version I'm using (m68k-elf-gcc 3.3.2). I downloaded what appears to be the source for this version at http://gcc.petsads.us/releases/gcc-3.3.2/, but I can't find the definition for gcvt or gcvtf anywhere in there. When I search for it, grep only finds some documentation and .h references, but not the definition:
$ find | xargs grep gcvt
./gcc/doc/gcc.info: C library functions `ecvt', `fcvt' and `gcvt'. Given valid
./gcc/doc/trouble.texi:library functions @code{ecvt}, @code{fcvt} and @code{gcvt}. Given valid
./gcc/sys-protos.h:extern char * gcvt(double, int, char *);
So, where is this function actually defined in the source code? Or did I download the entirely wrong thing?
I don't want to change this project to use the most recent gcc, due to project stability and testing considerations, and like I said, I can work around this by writing my own formatting routine, but this behavior is very confusing to me, and it will grind my brain if I don't find out why it's acting so weird.
Wallyk is correct that this is defined in the C library rather than the compiler. However, the GNU C library is (nearly always) only used with Linux compilers and distributions. Your compiler, being a "bare-metal" compiler, almost certainly uses the Newlib C library instead.
The main website for Newlib is here: http://sourceware.org/newlib/, and this particular function is defined in the newlib/libc/stdlib/efgcvt.c file. The sources have been quite stable for a long time, so (unless this is a result of a bug) chances are pretty good that the current sources are not too different from what your compiler is using.
As with the GNU C source, I don't see anything in there that would obviously cause this weirdness that you're seeing, but it's all eventually a bunch of wrappers around the basic sprintf routines.
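If you do end up writing your own substitute, a minimal sketch in the same spirit as the Newlib/glibc versions might look like this (the function name my_gcvt and the explicit buffer-size parameter are additions here; the real gcvt takes no buffer size):

#include <stdio.h>

/* Format 'value' with 'ndigit' significant digits into 'buf',
   roughly what gcvt does, but bounded by an explicit buffer size
   so the formatted text cannot overrun the caller's buffer. */
char *my_gcvt(double value, int ndigit, char *buf, size_t bufsize)
{
    snprintf(buf, bufsize, "%.*g", ndigit, value);
    return buf;
}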
It is in the GNU C library as glibc/misc/efgcvt.c. To save you some trouble, the code for the function is:
char *
__APPEND (FUNC_PREFIX, gcvt) (value, ndigit, buf)
     FLOAT_TYPE value;
     int ndigit;
     char *buf;
{
  sprintf (buf, "%.*" FLOAT_FMT_FLAG "g", MIN (ndigit, NDIGIT_MAX), value);
  return buf;
}
The directions for obtaining glibc are here.
