Eclipse CDT exceeding 32k CreateProcess limit during build on Windows

I've run into a problem during the C/C++ build using Eclipse/CDT for a project containing many source files: during the linking phase, a very long command line is created and then passed to the CreateProcess Windows API function. The command has the format {compiler} {flags} path/to/file1.o path/to/file2.o path/to/file3.o [...], essentially listing every object file created during compilation, and in my case it is over 34K characters long.
Now, historically this has already been looked at, but my feeling is that the limit was merely pushed higher rather than removed. In particular, when looking here it is mentioned there was a problem with an 8192-character limit, which seems to have been the case when the command was passed directly to the command line. The change that attempted to fix it was to use the CreateProcess WINAPI function instead, which has a higher limit of 32,767 characters (the exact limits depend on the version of Windows used). For the project I'm working with, neither limit is high enough, as I've already exceeded 34K characters for that particular command.
I've mentioned GNU ARM Eclipse as this is what I'm working with; however, this is a general issue with CDT itself and it seems to also be a problem in other IDEs - for example, I've read the same kind of reports from NetBeans users. Because I work with embedded systems and there are tools already integrated with Eclipse, moving to another IDE isn't an option.
As for the actual solution: the ideal situation to me would be to have the list of object files passed through a temporary file, as the character limit is completely bypassed this way. So far I'm using CDT's makefile generator, as it is integrated with the IDE itself (flags, build exclusions, etc.); however, it isn't configurable in any way that would solve my issue, or at least none that I'm aware of.
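For what it's worth, GCC-based toolchains (including the GNU ARM one) already support exactly this mechanism natively: any @file argument on the command line is replaced by the arguments read from that file, so the link step can be split in two. A minimal sketch of what I mean, with made-up file names:

    # Write the full object list to a response file, then let the compiler
    # driver expand it; the command line itself stays short no matter how
    # many object files there are.
    printf '%s\n' build/*.o > objs.rsp
    arm-none-eabi-gcc -Tstm32.ld @objs.rsp -o firmware.elf

The difficulty is persuading CDT's makefile generator to emit this two-step form instead of one flat command line.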
What are my options here? If necessary I might look into the makefile generator itself and modify it to write the object list to a file, though I'd rather avoid that unless it's the only remaining option.

Related

ALINK error 1065 even when Windows Long Paths enabled

I am trying to get a C# Visual Studio 2019/MSBuild job to build on a Jenkins build server. I know that my file paths are too long, so I have enabled Long File Paths in the Group Policy Editor (and verified that it has persisted in the registry editor after a server restart).
However, now I am getting the following error "ALINK: fatal error AL1065: File name ... is too long or invalid".
A quick Google search led me to this page for Alchemy Software. However, I have no idea what Alchemy Software is, why it is being used in the build process, or why it is failing. (Although for the last point, I'm guessing that the Alchemy Software .dll is not long-path aware, which I believe is necessary for an application to take advantage of Long Paths in Windows. But since I can't locate any .dll or .exe associated with this software, I can't be certain.)
Does anyone know why my build is still failing with this error, what Alchemy Software is, and how to get it to take advantage of Long Paths in Windows?
P.S. And please, no comments about how I should restructure my file paths to be shorter. I have tried doing that but it's impractical for this application. And anyway, it keeps popping up and is becoming a whack-a-mole situation, so I'd really rather fix the root cause rather than constantly putting band-aids everywhere.
I am going to set this as the answer since, after Hans Passant's very helpful comments and subsequent research, I think it's pretty definitive that this can mostly only be worked around, not resolved. (A possible exception will be discussed at the end of this answer.)
As stated in those comments, this error originates from a linker module called al.exe that is utilized by MSBuild for certain project configurations (more on that later). This linker can be found in a couple places, but for me it was being called from C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools.
MSBuild can handle long file paths as long as Windows is configured to do so (by enabling long paths in the Group Policy Editor or by modifying the registry directly). However this al.exe module cannot. And as far as I can tell, there's no way to force it to do so. So if your build tool chain requires this al.exe module to be used, you're kind of SOL.
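For reference, the registry half of that switch is a single value; shown here via reg.exe (needs an elevated prompt, and Windows 10 1607 or later):

    rem Enable Win32 long-path support system-wide.
    reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v LongPathsEnabled /t REG_DWORD /d 1 /f

Even with this set, each individual tool still has to opt in (via its application manifest), which is exactly what al.exe never does.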
For my particular situation, for a job in Jenkins that was failing because my paths were too long, I worked around this by changing the workspace of my job. So now instead of the default of something like D:\Jenkins\workspaces\[product]\Releases\[product_version], I changed it to D:\j\dd0_1c, which is an encoding that makes sense to me. This shortening of the folder path avoids any subsequent file paths from exceeding the limit of 260 characters. It's not a satisfying solution, but it works for my particular situation.
I did mention that there was a possible exception to all of this: if you can get MSBuild to avoid using al.exe altogether, then you can avoid this error.
I don't know all of the scenarios or workflows in which MSBuild utilizes this module, but I do know that it does get utilized when your application has localized resources, and somehow, some way, MSBuild uses al.exe when it is generating those resources. This was exactly my scenario, and I found this page and this page describing how you can reconfigure your localization projects such that MSBuild does not utilize al.exe. I did try the steps described in these pages and was able to verify that I no longer got this ALINK error from al.exe. However I never got my project to fully build since this reconfiguration caused other build errors to crop up. So in the end, for the sake of expediency (and because it was cleaner than performing a major refactor of my code), I went with the Jenkins workspace workaround.
However, it is interesting to note that you can get MSBuild to avoid using al.exe as long as your project is conducive to the solution given in those two links. So hopefully, if someone runs into this same issue, they might have more success than I did in using this method.

Recommended tool to automate complicated build procedure

I am developing an OS for embedded devices that runs bytecode. Basically, a micro JVM.
In the process of doing so, I am able to compile and run Java applications to bytecode(ish) and flash that on, for instance, an Atmega1284P.
Now I've added support for C applications: I compile and process it using several tools and with some manual editing I eventually get bytecode that runs on my OS.
The process is very cumbersome and heavy and I would like to automate it.
Currently, I am using makefiles for automatic compilation and flashing of the Java applications & OS to devices.
Roughly, the steps for a C application are as follows, each currently performed manually:
(1) Use Docker to run a Linux container with lljvm that compiles a .c file to a .class file (see also https://github.com/davidar/lljvm/tree/master)
(2) convert this .class file to a Jasmin file (https://github.com/davidar/jasmin) using the ClassFileAnalyzer tool (http://classfileanalyzer.javaseiten.de/)
(3) manually edit this Jasmin file in a text editor by replacing/adjusting some strings
(4) convert the modified Jasmin file back to a .class file using Jasmin
(5) put this .class file in a folder where the rest of my makefiles (the ones that already make and deploy the OS and class files from Java apps) can take over.
Current options seem to be to just keep using makefiles, but this is a bit unwieldy (I already have 5 different makefiles and this would further extend that chain). I've also read a bit about SCons. In essence, I'm wondering which tools or approaches are recommended for complicated builds like this.
Hopefully this may help a bit, but the question as such could probably be the subject of a heated discussion without many helpful results.
As pointed out in the comments by others, you really need to automate the steps from your .c file to the point where you can integrate it with the rest of your system.
There is generally nothing wrong with make, and you would not gain much by switching to SCons. You would get more ways to express what you want to do; among other things, if you wanted to write that automation directly inside the build system and its rules, you could use Python and not only shell (though if that were a concern, you could just as well call that Python code from make). But the essence of target, prerequisite, recipe is still there, along with the need to write the necessary automation for those .c-to-integration steps.
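To make that concrete, here is a rough sketch of your five steps as make rules. Every name here (the Docker image, the jar invocations, the sed edit) is a placeholder to be swapped for the real tool commands, and recipe lines must be indented with tabs:

    # Hypothetical sketch of the .c -> patched .class pipeline as make rules.
    APP      := myapp
    ANALYZER := java -jar ClassFileAnalyzer.jar
    JASMIN   := java -jar jasmin.jar

    # (1) compile C to JVM bytecode inside the lljvm container
    $(APP).class: $(APP).c
        docker run --rm -v $(CURDIR):/src lljvm-image lljvm-cc /src/$(APP).c

    # (2) disassemble to Jasmin assembly
    $(APP).j: $(APP).class
        $(ANALYZER) $< > $@

    # (3) the former manual edit, scripted (assumes the edits are mechanical)
    $(APP).patched.j: $(APP).j
        sed -e 's/OLD_STRING/NEW_STRING/g' $< > $@

    # (4) reassemble the patched Jasmin file
    $(APP).patched.class: $(APP).patched.j
        $(JASMIN) -d . $<

    # (5) drop the result where the existing OS makefiles pick it up
    deploy: $(APP).patched.class
        cp $< ../os-build/classes/

Once the edits in step (3) can be expressed as a script, the whole chain collapses into a single make deploy.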
If you really wanted to look into alternative options, Bazel might be of interest to you. The downside is that the initial effort of writing the necessary rules to fit your needs could be costly, and depending on the size of your project, it might just be too much. On the other hand, once that is done, it would be very easy to use (applying those rules to a growing code base), and you could also ditch the container and rely on Bazel's more lightweight sandboxing and external rules to get the tools and bits you need for your build, all with a single system for build description.

How do you verify that 2 copies of a VB 6 executable came from the same code base?

I have a program under version control that has gone through multiple releases. A situation came up today where someone had somehow managed to point to an old copy of the program and thus was encountering bugs that have since been fixed. I'd like to go back and just delete all the old copies of the program (keeping them around is a company policy that dates from before version control was common and should no longer be necessary), but I need a way of verifying that I can generate the exact same executable, something better than saying "the old one came out of this commit, so this one should be the same."
My initial thought was to simply MD5 hash the executable, store the hash file in source control, and be done with it but I've come up against a problem which I can't even parse.
It seems that every time the executable is generated (method: Open Project. File > Make X.exe) it hashes differently. I've noticed that Visual Basic messes with files every time the project is opened in seemingly random ways but I didn't think that would make it into the executable, nor do I have any evidence that that is indeed what's happening. To try to guard against that I tried generating the executable multiple times within the same IDE session and checking the hashes but they continued to be different every time.
So that's:
Generate Executable
Generate MD5 Checksum: md5sum X.exe > X.md5
Verify MD5 for current executable: md5sum -c X.md5
Generate New Executable
Verify MD5 for new executable: md5sum -c X.md5
Fail verification because computed checksum doesn't match.
I'm not understanding something about either MD5 or the way VB 6 is generating the executable but I'm also not married to the idea of using MD5. If there is a better way to verify that two executables are indeed the same then I'm all ears.
Thanks in advance for your help!
That's going to be nearly impossible. Read on for why.
The compiler will win this game, every time...
Compiling the same project twice in a row, even without making any changes to the source code or project settings, will always produce different executable files.
One of the reasons for this is that the PE (Portable Executable) format that Windows uses for EXE files includes a timestamp indicating the date and time the EXE was built, which is updated by the VB6 compiler whenever you build the project. Besides the "main" timestamp for the EXE as a whole, each resource directory in the EXE (where icons, bitmaps, strings, etc. are stored in the EXE) also has a timestamp, which the compiler also updates when it builds a new EXE. In addition to this, EXE files also have a checksum field that the compiler recalculates based on the EXE's raw binary content. Since the timestamps are updated to the current date/time, the checksum for the EXE will also change each time a project is recompiled.
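You can observe this directly with the dumpbin tool that ships with Visual Studio (run from a Developer Command Prompt; the EXE name here is just an example):

    rem Print the PE header timestamp; build the project twice and compare.
    dumpbin /headers X.exe | findstr /c:"time date stamp"

Two back-to-back builds of identical source will already differ on this line, before any other differences are even counted.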
But, but...I found this really cool EXE editing tool that can undo this compiler trickery!
There are EXE editing tools, such as PE Explorer, that claim to be able to adjust all the timestamps in an EXE file to a fixed time. At first glance you might think you could just set the timestamps in two copies of the EXE to the same date, and end up with equivalent files (assuming they were built from the same source code), but things are more complicated than that: the compiler is free to write out the resources (strings, icons, file version information, etc.) in a different order each time you compile the code, and you can't really prevent this from happening. Resources are stored as independent "chunks" of data that can be rearranged in the resulting EXE without affecting the run-time behavior of the program.
If that wasn't enough, the compiler might be building up the EXE file in an area of uninitialized memory, so certain parts of the EXE might contain bits and pieces of whatever was in memory at the time the compiler was running, creating even more differences.
As for MD5...
You are not misunderstanding MD5 hashing: MD5 will always produce the same hash given the same input. The problem here is that the input in this case (the EXE files) keeps changing.
Conclusion: Source control is your friend
As for solving your current dilemma, I'll leave you with this: associating a particular EXE with a specific version of the source code is more a matter of policy, which has to be enforced somehow, than anything else. Trying to figure out which EXE came from which version without any context is just not going to be reliable. You need to track this with the help of other tools, for example by ensuring that each build produces a different version number for your EXEs, and that the version can be easily paired with a specific revision/branch/tag/whatever in your version control system. To that end, a "free-for-all" situation where some developers use source control and others use "that copy of the source code from 1997 that I'm keeping in my network folder because it's my code and source control is for sissies anyway" won't make this any easier. I would get everyone drinking the source control Kool-Aid and adhering to a standard policy for creating builds right away.
Whenever we build projects, our build server (we use Hudson) ensures that the compiled EXE version is updated to include the current build number (we use the Version Number Plugin and a custom build script to do this), and when we release a build, we create a tag in Subversion using the version number as the tag name. The build server archives release builds, so we can always get the specific EXE (and setup program) that was given to a customer. For internal testing, we can choose to pull an archived EXE from the build server, or just tell the build server to rebuild the EXE from the tag we created in Subversion.
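The tagging step itself is a one-liner; the repository layout and version number below are just illustrative:

    svn copy "^/trunk" "^/tags/1.4.0.312" -m "Tag release build 1.4.0.312"

With the build number embedded in both the EXE's version resource and the tag name, pairing a binary with its source becomes a lookup rather than a forensic exercise.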
We also never, ever, ever release any binaries to QA or to customers from any machine other than the build server. This prevents "works on my machine" bugs, and ensures that we are always compiling from a "known" copy of the source code (it only pulls and builds code that is in our Subversion repository), and that we can always associate a given binary with the exact version of the code that it was created from.
I know it has been a while, but since there is a VB decompiler app, you might consider bulk-decompiling VB6 apps and then feeding the decompilation results to AI/statistical anomaly detection across the various code bases. Given that the problem you face doesn't have an exact solution, it is unlikely the results will be 100% accurate, but as you feed in more data, the detection should become more and more accurate.

How to put version information in a multi-platform program (*nix and Win32)?

I want to know what the standard way of doing this is.
Currently I'm thinking of adding a series of defines in a header file and including that file in the main resource file on Win32, to update the Win32 version resource; on *nix, adding some global functions that return this information.
And on Windows, making the MSI install file reflect the same version as well.
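For illustration, that plan works because Win32 resource scripts are run through a C-style preprocessor, so a single generated header can feed both sides. A minimal sketch with hypothetical names (recipe lines must be tab-indented):

    # Hypothetical: one generated header shared by the C code and the
    # Win32 resource script; .rc files can #include it for VERSIONINFO.
    VERSION := 1.2.3

    version.h: Makefile
        echo '#define APP_VERSION_STR "$(VERSION)"' >  $@
        echo '#define APP_VERSION_NUM 1,2,3'        >> $@

    main.o: version.h

    # Windows side (e.g. MinGW): compile the resource with the same header.
    app.res: app.rc version.h
        windres app.rc -O coff -o app.res

On *nix the same header backs the global accessor functions, and the MSI version can be set from the same VERSION variable at packaging time.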
That sounds like a reasonable way to do it. I don't think there IS a standard way of doing this; there aren't any real standards for version reporting that are cross-platform.
Since we wanted to avoid the overhead of changing a "version.cpp" or equivalent every time we hit build -- and thereby having to do at least one extra compile and link -- we modify the binary after the build.
If you're outputting to e.g. ELF or PE format executables, you can use some basic knowledge of ELF or PE and a linker map to figure out what to replace; otherwise you can scan through the binary looking for a set pattern (we use something like static const char VERSION[] = "[VERSIONBLOCK xxxxxxxxxxxxx]";) and replace a portion (e.g. the xxxx part above) with relevant info (a sketch of the patch step follows the list):
build date and time
build machine
username
output of e.g. svnversion
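The patch itself can be a small post-link script. Everything below (file names, the 13-character placeholder width) is illustrative, and the replacement must be padded to exactly the placeholder's length so no offsets in the binary shift:

    # stamp.sh -- patch the VERSIONBLOCK placeholder in a freshly linked binary.
    exe="$1"
    info="$(svnversion) $(date -u +%Y%m%d) $(whoami)"
    # Pad/truncate to the exact placeholder width (13 chars here).
    info=$(printf '%-13.13s' "$info")
    LC_ALL=C sed -i "s/VERSIONBLOCK xxxxxxxxxxxxx/VERSIONBLOCK $info/" "$exe"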
Note that this won't work very well if your binaries are signed or compressed before this step, but usually you can engineer your build process so the signing/compressing happens after this step.
I'm sure a variant of this could be extended to hit the Win32 PE version metadata as well as any embedded version string.

Comparing VB6.exes

We're going through a massive migration project at the minute and trying to validate that the code deployed to the live estate matches the code we have in source control.
Obviously the .NET code is easy to compare because we can disassemble it. I don't believe this is possible with VB6 EXEs because of the manner of compilation.
Does anyone have any ideas on how I could validate that the source code and the compiled executable match the file I have in Live?
Thanks
Visual Basic had (has) two ways of compiling: one to the interpreter (called P-code) that would result in smaller binaries, and a second one that generates "regular" Windows .exe files (called native), which was introduced because it was supposed to be faster than P-code, although the compiled file size increases with this option.
If your compilation was using P-code, it is in theory possible to restore the sources.
Either way it is pretty difficult to do, but there are tools that claim they can partially do this; one that I know of (never tried it, but there is a trial version) is VB-decompiler:
http://www.vb-decompiler.org/
Unfortunately, that's almost impossible. Bear in mind that VB6 code compiled on different machines will have different EXE sizes and deployment requirements.
This is why the old VB'ers had a dedicated machine to compile their code.
This won't help you with already-deployed items, but if you upped the revision number on every compile (there is a project setting to do this for you automatically), then you could easily compare version numbers.
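Reading the version back off a deployed EXE is then trivial; the path below is an example (wmic wants doubled backslashes):

    rem Query the file version of a deployed binary.
    wmic datafile where name="C:\\deployed\\app.exe" get Version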
My old company bought a copy of that VB-Decompiler, and as noted before, VB5/6 can additionally generate P-code; the tool did produce some code from it, and failing that, assembly code which could at least be "read".
If you have all the code you compiled, you could compare the CRCs of that code to what is deployed in the field. But if you don't have the original compiled code, it depends on how you compiled it: if you used P-code rather than native code, you may be able to disassemble it, but the disassembly will look nothing like your source code. I doubt you would have shipped the PDBs with the EXEs, but if you did, you could certainly use those to compare with the source code in your repository.
Have a trusted computer that can check out the various libraries and EXEs you make and compile them automatically. Keep those in a read-only but accessible location. Then do a binary comparison between the deployed site and your comparison site.
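That comparison can be scripted with checksum manifests; the paths below are illustrative, and on Windows the same commands run under e.g. Git Bash:

    # Build a manifest from the trusted, archived build output...
    (cd /srv/trusted-builds && find . -type f -exec md5sum {} + | sort -k2) > reference.md5
    # ...then verify the deployed tree against it.
    (cd /srv/deployed && md5sum -c "$OLDPWD/reference.md5")

Note that this only works because both sides come from the same archived compile; rebuilding from source would defeat it, for the reasons discussed above.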
However, I am not sure of the logic of disassembling the compiled units. My company, and most other places I know of, use a combination of a build computer and unit testing. In our company the EXE we make is a very thin shell over a bunch of libraries. For example, a button click will be passed to a UI ActiveX DLL that does the actual processing. What we do after a build is run a special EXE that performs our list of unit tests. If they all pass, we know our libraries, where 90% of our code is, are good. As for the actual EXE, we have a manual procedure that takes about two hours, and then we are good. It is rare for any errors to happen in the EXE.
