I am interested in relinking a Bazel project after modifying (and compiling) Go's standard library.
Building the project with no cache takes a long time, and I need to make this kind of modification and rebuild multiple times in succession, so each iteration needs to happen as quickly as possible.
My bazel project builds many packages but ends up with one final binary that is relevant for these purposes.
I tried simply modifying the file in /usr/local/go/src and deleting only the final binary from Bazel's output path. This didn't make Bazel rerun any actions (even though they depend on a .go file that was supposedly modified).
How can I then make the minimal set of actions run again so that my modification applies to the final binary?
More specifically, how can I recompile a specific (standard library) package and then only relink the binary, instead of also recompiling the binary and all of its dependencies?
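In other words, the sequence I tried was roughly the following (the edited file, output path, and target name are placeholders):

vi /usr/local/go/src/net/http/server.go   # modify a standard library source file
rm bazel-bin/path/to/final_binary         # delete only the final binary from Bazel's output path
bazel build //path/to:final_binary        # rebuild: no actions are rerun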
Related
I'm trying to understand how the source code for third-party dependencies is or is not compiled into my Go binary. I'm building in a Docker container, so I can see precisely what's fetched for my build without interference from other builds.
After my go build completes I see source code files for several dependencies under go/pkg/mod/$module@$version directories. The Module cache documentation tells me that these directories contain "extracted contents of a module .zip file. This serves as a module root directory for a downloaded module." My best guess is that the presence of extracted source code for these dependencies indicates that "yes, these dependencies are definitely compiled into your binary."
I also see many more dependencies pulled into go/pkg/mod/cache/download/$module directories. The Module cache documentation tells me that this directory contains "files downloaded from module proxies and files derived from version control systems," which I don't fully understand. As far as I can see, these files do not include any extracted source code, though there are several .zip files that I assume contain the source. For the most part these files seem to be .mod files that just contain text representing some sort of dependency graph.
My question is: if a third-party dependency has module files under go/pkg/mod/cache/download but no source code under go/pkg/mod/$module@$version, does that mean that dependency's code was NOT compiled into my Go binary?
I don't understand why the Go build pulls in all these module files but only has extracted source code for some of the third-party modules. Perhaps Go preemptively parses and pulls module information for the full transitive set of modules referenced from the modules my first-party code imports, but perhaps many of those modules don't end up being needed for my binary's compile + build process and therefore don't get extracted. If that's not true and the answer to my question is no, then I don't understand how or why my binary can link in those dependencies without go build fetching their source code.
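To make the two locations concrete, the layout I'm describing looks roughly like this (the module path and version are placeholders):

go/pkg/mod/github.com/some/dep@v1.2.3/             (extracted module source; the module root directory)
go/pkg/mod/cache/download/github.com/some/dep/@v/  (the downloaded .info, .mod, and .zip files)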
As mentioned in "Compile and install packages and dependencies"
Compiled packages are cached automatically.
The "GOPATH and Modules" section includes:
When using modules, GOPATH is no longer used for resolving imports.
However, it is still used to store downloaded source code (in GOPATH/pkg/mod) and compiled commands (in GOPATH/bin).
So if you see sources in pkg/mod which are not in pkg/mod/cache, try a go mod tidy
add missing and remove unused modules
From there, you should have the same modules between sources (pkg/mod) and compiled modules (pkg/mod/cache)
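For example, run from the module root:

go mod tidy   # add missing and remove unused modules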
Based on the OP's comment
I need to know exactly what's included in the binary for compliance reasons.
I would recommend a completely different approach: dumping the list of symbols contained in the binary and then correlating it with whatever information is needed.
The command
go tool nm -type /path/to/the/executable/image/file
would dump the symbols (the names of the functions) whose code was taken from the standard library packages, third-party and/or vendored packages, and internal packages compiled and linked into the binary, and print to its standard output stream a sequence of lines of the form
address type name
which you can then process programmatically.
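For example, to check whether any code from a particular third-party module made it into the binary (the binary and module paths are placeholders; a matching line whose type letter is T is a function in the code segment):

go tool nm ./mybinary | grep 'github.com/some/dep'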
Another approach you might employ is to use go list with various flags to query the program's source code for the packages and/or modules that will be used when building: whatever that command outputs as the full dependency graph of the source code is what go build will use when building, provided the source code is not changed between these calls.
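For instance, run from the module root (the package pattern is illustrative):

go list -m all        # every module in the final build list
go list -deps ./...   # every package that will be compiled into the build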
Yet another possibility is to build the program using go build -x, saving the debug trace it produces on its standard error stream and parsing it for the exact module names the command reports as used during the build.
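A rough sketch of that, assuming a GNU-style grep and placeholder file names:

go build -x -o mybinary . 2>build-trace.txt
grep -o 'pkg/mod/[^ "]*' build-trace.txt | sort -u   # module-cache paths the build actually touched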
I just built the Linux kernel for CentOS using the instructions that can be found here: https://wiki.centos.org/HowTos/Custom_Kernel
Now, I made my changes and I would like to rebuild the kernel and test it with my changes. How do I do that but:
1. Without having to recompile everything. The build process should reuse whatever object files from the first build don't need to change.
2. Without having to build the other packages that are built along with the kernel (e.g., debuginfo, tools, debug-devel, etc.).
Thanks.
You cannot. The paradigm of rpmbuild is to always start from a clean slate to ensure reproducibility and predictability. The subpackages would also be invalidated, because they depend on the exact output of your kernel build (e.g. locations within the binary images where certain symbols are defined), which may have changed when you rebuilt it.
I built a very big project, which had a number of sub-projects, using the make command. It took me 3 hours. Then, by mistake (without cleaning the previous build), I re-executed the make command for a few minutes and then stopped it.
Have I ruined my previous build? How does make actually work behind the scenes? Are the object files built in an atomic and safe manner?
Note: I cannot really run any of my binary files to see if they are broken since that's another lengthy process. I just want to know if I am fine or I have to re-run the make and let it finish.
If you want to publish this binary as a production version of your commercial product, then I would not rely on it; always be 100% sure that you are using a successfully built version of a fully saved and committed code base.
On the other hand, if you need this for debugging purposes, then you could use it. Why? Because make overwrites the output binary only once it finishes compiling all the object files, and only if it detects changes that require a relink of the binary:
After recompiling whichever object files need it, make decides whether to relink edit. This must be done if the file edit does not exist, or if any of the object files are newer than it. If an object file was just recompiled, it is now newer than edit, so edit is relinked.
From GNU make: How Make Works
So if you haven't changed your code base, the linker will not relink the binary, leaving it as it was created by the successful build.
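Schematically, the rules that quote describes look like this (the file names come from the GNU make manual's example; recipe lines must begin with a tab):

# the binary is relinked only if it is missing or older than one of its object files
edit: main.o kbd.o
	cc -o edit main.o kbd.o

# each object file is recompiled only if one of its sources is newer than it
main.o: main.c defs.h
	cc -c main.c

kbd.o: kbd.c defs.h
	cc -c kbd.c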
I have a project whose make process generates different build artifacts for each configuration. For example, if initiated with make a=a0 b=b0, it builds object files into builds/a0.b0, generates a binary myproject.a0.b0, and finally updates a generic executable symlink to point to the most recently built binary: ln -s myproject.a0.b0 myproject (a rough sketch of this Makefile follows the list below). For this project, this is a useful feature mainly because:
It separates the object files for different configurations (so when I rebuild in another configuration I don't have to recompile every single source file with the new defines and configuration set, which is unfortunately a very common procedure).
It retains the binaries for each configuration (so it's not required to rebuild to use a different configuration if it has already been built).
It makes a copy of (or link to) the last built binary, which is useful in testing.
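Very roughly, the existing Makefile does something like this (variable and file names are invented for illustration; recipe lines must begin with a tab):

a ?= a0
b ?= b0
BUILDDIR := builds/$(a).$(b)
BIN := myproject.$(a).$(b)
OBJS := $(addprefix $(BUILDDIR)/,main.o util.o)

# relink the per-configuration binary, then repoint the generic link at it
$(BIN): $(OBJS)
	$(CC) -o $@ $^
	ln -sf $(BIN) myproject

# a per-configuration object directory keeps other configurations' objects intact
$(BUILDDIR)/%.o: %.c | $(BUILDDIR)
	$(CC) -DA=$(a) -DB=$(b) -c -o $@ $<

$(BUILDDIR):
	mkdir -p $@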
Right now this is implemented in an ugly, decades-old, non-portable Makefile. I'd like to reproduce the same behavior in CMake to allow easier building on other platforms, but I have not been able to reproduce it in any reasonable manner. It seems like it would require adding targets with add_library/add_executable for each possible permutation of input parameters, which seems wrong. And I'm not sure I could keep the same usage, allowing make, make a=a0, or make b=b0 a=a0, as opposed to what specifying a CMake target would require: make myproject-a0.b0.
Is this possible to do in CMake? Namely:
Specify the build directory based on input parameters.
Accept the parameters as make arguments that can be left out (defaulted) to select the appropriate target at the level of the Makefile (so it's not required to rerun CMake for a new configuration).
Xcode 4 is putting the build time into the executables it creates. When I build the same code twice, the binaries differ by a few bytes belonging to a Unix timestamp.
Is there a way to prevent this from happening?
(I'm running expensive tests and benchmarks after each build and caching the results based on a hash of the executables, but the ever-changing executables break my cache and pollute the benchmark results with duplicates.)
I've worked around this by switching to building the project myself the "old skool" way, with Makefiles & gcc.
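For anyone debugging the same problem, the differing bytes can be located by comparing two builds of the same source (the paths are placeholders):

cmp -l build1/MyApp build2/MyApp   # prints the offset and values of every differing byte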