GNU Make: how to check whether the content of a file was actually updated by a recipe, or the old content was merely regenerated?

A tool invoked by GNU make always rewrites its output file, even if the content did not change. The file gets a new timestamp either way, and further rules are triggered unnecessarily.
My idea is to generate a temporary file and only copy it over the real target when its content has changed. We can assume that the file is text-based and relatively small (<10k).
It would be great if the solution could work with a basic Cygwin/MSYS2 installation (+make) under Windows, without having to install further tools.
Any ideas?
Note: cmp and diff are not installed by default.
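One hedged sketch of that temp-file idea (generate_tool and the file names are hypothetical placeholders; md5sum is part of coreutils, which base Cygwin/MSYS2 installs typically include, unlike cmp and diff):

target.txt: input.txt
	generate_tool input.txt > target.txt.tmp
	@if [ -f target.txt ] && \
	   [ "$$(md5sum < target.txt.tmp)" = "$$(md5sum < target.txt)" ]; then \
		rm target.txt.tmp; \
	else \
		mv -f target.txt.tmp target.txt; \
	fi

When the regenerated content is identical, the temporary file is discarded and target.txt keeps its old timestamp, so rules that depend on it are not re-triggered.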


Trick gmake build / Copying files to preserve time stamps and avoid complete rebuild

Short background info:
For some rather convoluted reasons I am trying to trick my project's build system. We are working with Code Composer Studio on Windows from Texas Instruments and the gmake implementation that was included. We use the standard Debug build option pretty much unmodified, as far as I am aware (my understanding of make and of our setup is unfortunately limited). When developing in Code Composer the build works as one would expect: only the files that have been changed, and their dependents, are recompiled. Sometimes, however, I need to regenerate all code from a code generator that we use, which triggers a complete rebuild.
To avoid this complete rebuild I wrote a python script that moves all the original files out of the project, waits for the code generator, then compares the files and replaces the newly generated but identical files with the old ones. Thus the creation timestamp and the last-modified timestamp should be preserved, and examining these properties suggests this works. However, a complete rebuild of the project is always triggered.
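For reference, a rough shell equivalent of what that script does (the actual script is Python; the paths and the generator command here are hypothetical, and cmp is assumed to be available):

mkdir -p ../backup
cp -p src/*.cpp ../backup/        # -p preserves the modification times
run_code_generator                # hypothetical: regenerates everything in src/
for f in src/*.cpp; do
    old="../backup/${f##*/}"
    if cmp -s "$f" "$old"; then
        cp -p "$old" "$f"         # put back the old, identically-dated copy
    fi
done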
Core of the Problem:
Having already built the project in Code Composer, building again concludes that nothing has changed and that there is nothing to do, as one would expect.
I move a file out of the folder and back in again, preserving both the creation time stamp and the last-modified time stamp as far as I can see (in Windows Explorer -> Properties).
When now rebuilding the project it will rebuild the said file and its dependents, despite it being identical to before and having the same timestamps.
What is gmake looking at? Does it detect that the folder has been changed? Is it looking at some other hidden timestamp? Is it a Windows problem? Shouldn't it be possible to do this, inserting a file with an old timestamp?
What am I missing? Anyone have a clue?
Having examined this, I concluded that the problem does not lie with gmake or the makefile but with Code Composer Studio's/Eclipse's makefile generation.
We use the built-in makefile generation in Code Composer Studio (which is an Eclipse derivative). It seems to keep track of whether files have been removed from or moved out of the project's workspace. If they have, it removes the dependencies of those files during the makefile generation, triggering a rebuild.
Examples:
Case 1, works as I expect:
Rebuilding an unchanged project in Code Composer Studio. Nothing is recompiled since nothing is changed.
Case 2, works as I expect:
Changing something in a single file and then rebuilding results in recompilation of only that single file, as one expects.
Case 3, does not work as I expect:
Having an unchanged project, moving one source code file (xyz.cpp) out of the project, not changing it in any way, and then moving it back results in a recompilation of xyz.obj.
Code Composer/Eclipse seems to keep track of the file structure. Despite the file being put back into the project exactly the way it was, Code Composer/Eclipse deletes xyz.obj during the makefile generation, triggering a rebuild.
On the other hand: closing Code Composer, moving the file back and forth, reopening Code Composer and regenerating the makefile/rebuilding works as one expects. Nothing is rebuilt and no .obj files are removed.
Since this didn't have anything to do with gmake and the makefile, I guess this question was invalid/irrelevant to begin with. I will collect my thoughts and formulate a new question strictly as a Code Composer/Eclipse matter.
If anyone has found themselves in the same situation or have relevant information on their hands please share however!

What are the differences between make clean, make clobber, make distclean, make mrproper and make realclean?

I don't always write makefiles, but when I do I like to try and write them well. Trying to make the interface consistent with what other developers might expect is always a struggle. What I am looking for is a summary of all the common cleaning (GNU) make targets.
What are the commonly found cleaning make targets?
When is each target normally used?
How does each target compare to others?
I have worked with a system using make clean, make clobber and make mrproper before. Those got steadily more extreme: make clean only tidied up temporaries, make clobber got rid of most configuration, and make mrproper went almost back to a just-checked-out state. Is that the normal order of things? Should make mrproper always remove the generated binaries and shared libraries for deployment?
Reading around suggests that make distclean tidies things up to the point of being ready to make a distribution package. I would imagine that leaves behind some automatically generated version tagging files, file manifests and shared libraries but possibly strips out temporary files that you would not want archived?
make realclean was completely new to me when I spied it on the GNU Make Goals manual page. As it is listed with distclean and clobber I would guess it has similar effects. Have I never come across it because it is a historical artifact, or is it just specific to a set of projects I have worked on?
Sorry, that is a bit rambling. I have found various questions and answers that compare one target to another, but none that seemed to give a good overview.
Trying to form my own answer based on some research. I think the approximate order of severity is: mostlyclean, clean, mrproper, distclean, maintainer-clean, and finally clobber (which is distclean combined with uninstall). A sketch at the end of this answer shows how these targets conventionally chain together.
make clean
make clean is the most basic level. It cleans up most generated files but not anything that records configuration. GNU Make Manual's Goals page states:
Delete all files that are normally created by running make.
Further, the GNU Make Manual's Standard Targets page states:
Delete all files in the current directory that are normally created by building the program. Also delete files in other directories if they are created by this makefile. However, don’t delete the files that record the configuration. Also preserve files that could be made by building, but normally aren’t because the distribution comes with them. There is no need to delete parent directories that were created with ‘mkdir -p’, since they could have existed anyway.
Delete .dvi files here if they are not part of the distribution.
make mostlyclean
make mostlyclean is the only gentler form of clean I have found; it behaves like clean but leaves behind files that would take a long time to compile and do not often need to be regenerated.
The GNU Make Manual's Standard Targets page states:
Like ‘clean’, but may refrain from deleting a few files that people normally don’t want to recompile. For example, the ‘mostlyclean’ target for GCC does not delete libgcc.a, because recompiling it is rarely necessary and takes a lot of time.
make distclean
make distclean is the first step up from the basic make clean on many GNU Make systems. It seems to be synonymous with, or at least very similar to, make realclean and make clobber in many, but not all, cases. It deletes everything that make clean does and also removes the configuration.
In the Linux kernel this is one step beyond make mrproper; see the section below for details.
I am not sure if the name implies that things are being made clean enough for distribution (the forming of a tar archive) or that the process is returning them to the state equal to what was distributed (just as things were immediately after unpacking a tar archive).
GNU Make Manual's Goals page states:
Any of these targets might be defined to delete more files than ‘clean’ does. For example, this would delete configuration files or links that you would normally create as preparation for compilation, even if the makefile itself cannot create these files.
Further, the GNU Make Manual's Standard Targets page states:
Delete all files in the current directory (or created by this makefile) that are created by configuring or building the program. If you have unpacked the source and built the program without creating any other files, ‘make distclean’ should leave only the files that were in the distribution. However, there is no need to delete parent directories that were created with ‘mkdir -p’, since they could have existed anyway.
make uninstall
make uninstall will uninstall software installed via make install or one of the install-* variants. This is similar to part of the behaviour of make clobber on some systems, but make uninstall should not touch the build area as make clobber will.
The GNU Make Manual's Standard Targets page states:
Delete all the installed files—the copies that the ‘install’ and ‘install-*’ targets create.
This rule should not modify the directories where compilation is done, only the directories where files are installed.
make maintainer-clean
make maintainer-clean goes one step beyond the more common make distclean. It deletes almost everything that can be reconstructed, including generated files that normally ship pre-built with the distribution, which makes it the most severe of the standard GNU cleaning targets.
The GNU Make Manual's Standard Targets page states:
Delete almost everything that can be reconstructed with this Makefile. This typically includes everything deleted by distclean, plus more: C source files produced by Bison, tags tables, Info files, and so on.
It is also highlighted that this is not a commonly used target as it is for a specific set of users:
The ‘maintainer-clean’ target is intended to be used by a maintainer of the package, not by ordinary users.
make mrproper
make mrproper seems to be the Linux kernel's version of make distclean or make clobber, though it stops short of removing editor backup and patch files. It does everything that the make clean target does and also strips out the configuration.
I believe the name comes from a cleaning product known in the USA as Mr. Clean and in the UK as Flash (which is why I had not heard of it under that name). Linus Torvalds, being Finnish, was presumably familiar with the European Mr. Proper brand name.
The Linux Kernel Makefile states:
# Cleaning is done on three levels.
# make clean      Delete most generated files
#                 Leave enough to build external modules
# make mrproper   Delete the current configuration, and all generated files
# make distclean  Remove editor backup files, patch leftover files and the like
make clobber
make clobber gets a mention in the Wikipedia article on clobbering, which states that it is more severe than make clean and may even uninstall the software. It is possibly a combination of make uninstall and make distclean.
There is no single source for make clean levels. As things have evolved over time, terminology and behaviour are inconsistent. The above is the best I have managed to piece together so far.
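To make the ordering concrete, here is a minimal sketch of how these targets conventionally chain together (the file names are hypothetical, and real projects vary):

.PHONY: mostlyclean clean distclean maintainer-clean
mostlyclean:                  # gentlest: keeps slow-to-rebuild artifacts
	rm -f *.o myprog
clean: mostlyclean            # everything a build normally creates
	rm -f libbig.a
distclean: clean              # additionally forgets the configuration
	rm -f config.status config.log
maintainer-clean: distclean   # also generated sources; maintainers only
	rm -f parser.c manual.info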

I stopped the make command mid-way, will previous build be affected?

I built a very big project, which had a number of sub-projects, using the make command. It took me 3 hours. Then by mistake (without cleaning the previous build) I re-executed the make command, let it run for a few minutes, and then stopped it.
Have I ruined my previous build? How does make actually work behind the scenes? Is building the object files done in an atomic and safe manner?
Note: I cannot really run any of my binary files to see if they are broken since that's another lengthy process. I just want to know if I am fine or I have to re-run the make and let it finish.
If you want to publish this binary as a production version of your commercial product, then I would not rely on it; always be 100% sure that you are using a successfully built version of a fully saved and committed code base.
On the other hand, if you need this for debugging purposes, then you could use it! Why? Because make overwrites the output binary only once it finishes compiling all the object files, and only if it detects changes that require a relink of the binary:
After recompiling whichever object files need it, make decides whether to relink edit. This must be done if the file edit does not exist, or if any of the object files are newer than it. If an object file was just recompiled, it is now newer than edit, so edit is relinked.
From GNU make: How Make Works
So if you haven't changed your code base, the linker will not relink the binary, leaving it as it was created by the successful build. As for the interrupted compilation itself: GNU make normally deletes the target file of a recipe that is killed part-way through (see "Interrupting or Killing make" in the manual), precisely so that a truncated object file is not mistaken for an up-to-date one.
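To make the quoted behaviour concrete, here is a minimal link rule in the spirit of the manual's edit example (file names follow that example, trimmed down):

edit: main.o kbd.o command.o
	cc -o edit main.o kbd.o command.o

If make is interrupted after recompiling only some of the object files, the next run recompiles the remaining ones and then relinks; edit itself is only replaced at that final link step.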

make: Adding function for "no rule" targets

I have a Makefile which uses a lot of static files, for example:
/mystaticpath/somepath/official_gpl_package-0.1.2.tar.gz
I use this Makefile on several computers and I don't need all packages on all computers. (That's the whole point of make)
Anyway, what I would like to do is put everything in /mystaticpath on a centralized server and download the packages on demand.
In other words, whenever make encounters a missing source file (a "no rule" error), it should run a script and then try again afterwards. The script would need the name of the missing file as a parameter and would download the file from the centralized server; from make's point of view, the script is a universal creator of everything that might be needed in /mystaticpath.
Does anybody know whether that is possible with make?
Your design makes the little hairs on the back of my neck stand up, but give this a try:
/mystaticpath/%:
	retrieve_script $@    # $@ expands to the full path of the missing target
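A slightly fuller version of the same idea, in case the packages live in subdirectories (retrieve_script, the server URL and the directory layout are all hypothetical):

/mystaticpath/%:
	mkdir -p $(dir $@)
	retrieve_script $@    # e.g. wget -O $@ http://yourserver/$*

Make will run this pattern rule for any file under /mystaticpath that it cannot otherwise find, passing the full path as $@; $* holds the part matched by %.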

How do you verify that 2 copies of a VB 6 executable came from the same code base?

I have a program under version control that has gone through multiple releases. A situation came up today where someone had somehow managed to point to an old copy of the program and thus was encountering bugs that have since been fixed. I'd like to go back and just delete all the old copies of the program (keeping them around is a company policy that dates from before version control was common and should no longer be necessary) but I need a way of verifying that I can generate the exact same executable that is better than saying "The old one came out of this commit so this one should be the same."
My initial thought was to simply MD5 hash the executable, store the hash file in source control, and be done with it but I've come up against a problem which I can't even parse.
It seems that every time the executable is generated (method: Open Project. File > Make X.exe) it hashes differently. I've noticed that Visual Basic messes with files every time the project is opened in seemingly random ways but I didn't think that would make it into the executable, nor do I have any evidence that that is indeed what's happening. To try to guard against that I tried generating the executable multiple times within the same IDE session and checking the hashes but they continued to be different every time.
So that's:
Generate Executable
Generate MD5 Checksum: md5sum X.exe > X.md5
Verify MD5 for current executable: md5sum -c X.md5
Generate New Executable
Verify MD5 for new executable: md5sum -c X.md5
Fail verification because computed checksum doesn't match.
I'm not understanding something about either MD5 or the way VB 6 is generating the executable but I'm also not married to the idea of using MD5. If there is a better way to verify that two executables are indeed the same then I'm all ears.
Thanks in advance for your help!
That's going to be nearly impossible. Read on for why.
The compiler will win this game, every time...
Compiling the same project twice in a row, even without making any changes to the source code or project settings, will always produce different executable files.
One of the reasons for this is that the PE (Portable Executable) format that Windows uses for EXE files includes a timestamp indicating the date and time the EXE was built, which is updated by the VB6 compiler whenever you build the project. Besides the "main" timestamp for the EXE as a whole, each resource directory in the EXE (where icons, bitmaps, strings, etc. are stored in the EXE) also has a timestamp, which the compiler also updates when it builds a new EXE. In addition to this, EXE files also have a checksum field that the compiler recalculates based on the EXE's raw binary content. Since the timestamps are updated to the current date/time, the checksum for the EXE will also change each time a project is recompiled.
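As a rough illustration of the first of those differences, this shell sketch (assuming GNU od, dd and md5sum are available) zeroes out the main COFF TimeDateStamp before hashing. Note that the per-resource timestamps, the checksum field and any reordered resources still differ, so this alone will not make two VB6 builds hash the same:

exe="$1"
# e_lfanew, the offset of the "PE\0\0" signature, lives at byte 60 of the DOS header
pe_off=$(od -An -t u4 -j 60 -N 4 "$exe" | tr -d ' ')
cp "$exe" "$exe.masked"
# The TimeDateStamp field sits 8 bytes past the PE signature; overwrite it with zeros
dd if=/dev/zero of="$exe.masked" bs=1 seek=$((pe_off + 8)) count=4 conv=notrunc 2>/dev/null
md5sum "$exe.masked"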
But, but...I found this really cool EXE editing tool that can undo this compiler trickery!
There are EXE editing tools, such as PE Explorer, that claim to be able to adjust all the timestamps in an EXE file to a fixed time. At first glance you might think you could just set the timestamps in two copies of the EXE to the same date, and end up with equivalent files (assuming they were built from the same source code), but things are more complicated than that: the compiler is free to write out the resources (strings, icons, file version information, etc.) in a different order each time you compile the code, and you can't really prevent this from happening. Resources are stored as independent "chunks" of data that can be rearranged in the resulting EXE without affecting the run-time behavior of the program.
If that wasn't enough, the compiler might be building up the EXE file in an area of uninitialized memory, so certain parts of the EXE might contain bits and pieces of whatever was in memory at the time the compiler was running, creating even more differences.
As for MD5...
You are not misunderstanding MD5 hashing: MD5 will always produce the same hash given the same input. The problem here is that the input in this case (the EXE files) keeps changing.
Conclusion: Source control is your friend
As for solving your current dilemma, I'll leave you with this: associating a particular EXE with a specific version of the source code is more a matter of policy, which has to be enforced somehow, than anything else. Trying to figure out which EXE came from which version without any context is just not going to be reliable. You need to track this with the help of other tools, for example by ensuring that each build produces a different version number for your EXEs, and that that version can be easily paired with a specific revision/branch/tag/whatever in your version control system. To that end, a "free-for-all" situation where some developers use source control and others use "that copy of the source code from 1997 that I'm keeping in my network folder because it's my code and source control is for sissies anyway" won't help make this any easier. I would get everyone drinking the source control Kool-Aid and adhering to a standard policy for creating builds right away.
Whenever we build projects, our build server (we use Hudson) ensures that the compiled EXE version is updated to include the current build number (we use the Version Number Plugin and a custom build script to do this), and when we release a build, we create a tag in Subversion using the version number as the tag name. The build server archives release builds, so we can always get the specific EXE (and setup program) that was given to a customer. For internal testing, we can choose to pull an archived EXE from the build server, or just tell the build server to rebuild the EXE from the tag we created in Subversion.
We also never, ever, ever release any binaries to QA or to customers from any machine other than the build server. This prevents "works on my machine" bugs, and ensures that we are always compiling from a "known" copy of the source code (it only pulls and builds code that is in our Subversion repository), and that we can always associate a given binary with the exact version of the code that it was created from.
I know it has been a while, but since there is a VB decompiler app, you may consider bulk-decompiling VB6 apps and then feeding the decompilation results to an AI/statistical anomaly detection over the various code bases. Given that the problem you face doesn't have an exact solution, the results are unlikely to be 100% accurate, but as you feed in more data the detection should become more and more accurate.
