What does it mean when a package is in the go/pkg/mod/cache dir but it has no source code extracted?

I'm trying to understand how the source code for third-party dependencies is or is not compiled into my Go binary. I'm building in a Docker container, so I can see precisely what's fetched for my build without interference from other builds.
After my go build completes I see source code files for several dependencies under go/pkg/mod/$module@$version directories. The Module cache documentation tells me that these directories contain "extracted contents of a module .zip file. This serves as a module root directory for a downloaded module." My best guess is that the presence of extracted source code for these dependencies indicates that "yes, these dependencies are definitely compiled into your binary."
I also see many more dependencies pulled into go/pkg/mod/cache/download/$module directories. The Module cache documentation tells me that this directory contains "files downloaded from module proxies and files derived from version control systems," which I don't fully understand. As far as I can see, these files do not include any extracted source code, though there are several .zip files that I assume contain the source. For the most part these files seem to be .mod files that just contain text representing some sort of dependency graph.
My question is: if a third-party dependency has module files under go/pkg/mod/cache/download but no source code under go/pkg/mod/$module@$version, does that mean that dependency's code was NOT compiled into my Go binary?
I don't understand why the Go build pulls in all these module files but only has extracted source code for some of the third-party modules. Perhaps Go preemptively parses and pulls module information for the full transitive set of modules referenced from the modules my first-party code imports, but perhaps many of those modules don't end up being needed for my binary's compile + build process and therefore don't get extracted. If that's not true and the answer to my question is no, then I don't understand how or why my binary can link in those dependencies without go build fetching their source code.

As mentioned in "Compile and install packages and dependencies"
Compiled packages are cached automatically.
The "GOPATH and Modules" section adds:
When using modules, GOPATH is no longer used for resolving imports.
However, it is still used to store downloaded source code (in GOPATH/pkg/mod) and compiled commands (in GOPATH/bin).
So if you see extracted sources under pkg/mod that have no counterpart under pkg/mod/cache/download (or the other way around), try a go mod tidy, which will
add missing and remove unused modules
From there, you should see the same set of modules between the extracted sources (pkg/mod/$module@$version) and the downloaded module files (pkg/mod/cache/download).
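For illustration, assuming the commands are run from the module root (the exact cache paths depend on your GOPATH), something along these lines:

go mod tidy       # add missing and remove unused module requirements in go.mod
go mod download   # fill pkg/mod/cache/download for the modules the build needs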

Based on the OP's comment
I need to know exactly what's included in the binary for compliance reasons.
I would recommend a completely different approach: dumping the list of symbols contained in the binary and then correlating it with whatever information is needed.
The command
go tool nm -type /path/to/the/executable/image/file
would dump the symbols (the names of the functions) whose code was taken from the standard library packages, from third-party and/or vendored packages, and from internal packages compiled and linked into the binary, and print to its standard output stream a sequence of lines of the form
address type name
which you can then process programmatically.
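As a rough, untested sketch of such processing, a small Go program could read that output on its standard input and reduce it to the distinct package paths that contributed symbols; note that the way it splits symbol names below is a heuristic assumption, not something guaranteed by the tool:

package main

import (
    "bufio"
    "fmt"
    "os"
    "sort"
    "strings"
)

func main() {
    // Expected input: lines of the form "address type name [gotype]"
    // as printed by "go tool nm -type /path/to/binary".
    pkgs := map[string]bool{}
    sc := bufio.NewScanner(os.Stdin)
    for sc.Scan() {
        fields := strings.Fields(sc.Text())
        if len(fields) < 3 {
            continue // skip undefined symbols and blank lines
        }
        name := fields[2] // e.g. github.com/foo/bar.(*Type).Method
        slash := strings.LastIndex(name, "/")
        dot := strings.Index(name[slash+1:], ".")
        if dot < 0 {
            continue // not a package-qualified symbol
        }
        pkgs[name[:slash+1+dot]] = true // keep just the package path
    }
    list := make([]string, 0, len(pkgs))
    for p := range pkgs {
        list = append(list, p)
    }
    sort.Strings(list)
    for _, p := range list {
        fmt.Println(p)
    }
}

You would pipe the nm output into it, e.g. go tool nm -type /path/to/binary | go run ./nmpkgs (the nmpkgs directory name is made up for this example).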
Another approach you might employ is to use go list with various flags to query the program's source code about the packages and/or modules which will be used when building: whatever that command outputs to describe the full dependency graph of the source code is what go build will use when building, provided the source code is not changed between the two calls.
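For example (run from the module root), the following invocations are the sort of thing meant here:

go list -m all
go list -deps -f '{{if not .Standard}}{{.ImportPath}} ({{.Module}}){{end}}' ./...

Note that go list -m all prints the whole module build list, which can be a superset of the modules whose packages actually get compiled into the binary, while go list -deps reflects the packages the build will really use.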
Yet another possibility is to build the program using go build -x, save the debug trace it produces on its standard error stream, and parse it for the exact module names the command reported as used during the build.
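For instance (the trace file name is arbitrary, and this works best with a cold build cache, such as a fresh container):

go build -x ./... 2> build-trace.txt

Each module whose source is actually compiled shows up in that trace via paths of the form .../pkg/mod/<module>@<version>/..., so searching the trace for pkg/mod/ gives a reasonable picture of the extracted modules the build touched.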

Related

Why do go module versions sometimes need 2 lines in go.sum

Why do individual module versions sometimes need 2 lines in go.sum?
one line is just for the module version (v0.1.1 in the example below)
one line also has /go.mod tacked onto the version (v0.1.1/go.mod in the example below).
For example:
github.com/foo/bar v0.1.1 h1:kDgnGXZpvZUi7ym6Rm23yVn3gRqBag+vU6M/wytZR9c=
github.com/foo/bar v0.1.1/go.mod h1:MZcarCLffCxoj/EF1yhRb4HvOSmCkm5Z8FPmzWrMG+g=
The reason I ask is because sometimes when I go get a package, an indirect dependency will be generated in go.sum with only the second line from the example above, and then the build will fail with 410 gone for that package#version. However if I manually go get the indirect dependency, the build no longer fails with 410 gone.
I believe this only happens with private repositories, so I understand it will not play well with sum.golang.org. However, I'd like to figure out if it's possible to avoid getting the 410 in the first place, especially with regards to automated module updates, etc.
The v0.1.1/go.mod entry contains the checksum for the go.mod file in isolation. That is needed to ensure consistency any time you are loading or changing dependencies.
The v0.1.1 entry (without the /go.mod suffix) contains the checksum for the full source code of the module, including all of the .go source files for the packages within it.
The two parts are downloaded separately so that you don't need to download the full source code for dependencies that you don't intend to build or test (a fairly common situation for projects with casual contributors). But because the go command downloads them separately, it needs separate checksums for them.
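A quick way to see those two checksums for a single module version, using the example path above, is:

go mod download -json github.com/foo/bar@v0.1.1

The JSON it prints should contain both a Sum field (the hash of the whole module, i.e. the first go.sum line) and a GoModSum field (the hash of just its go.mod file, i.e. the second line); downloading the full module, for example via the manual go get you mention, is what gets the missing whole-module hash recorded.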

go: Why do I need the pkg directory if I cannot delete the source files in src?

I wrote a function (not! main) and ran go install. This command generated a path and a package in my pkg directory. I tested the function by using it in a main function, generated the .exe, and everything worked just fine.
After that, I wanted to see whether I had understood the concept of packages in Go correctly, so I deleted the source file of the function in the src directory and deleted the main .exe. I did not remove the package file in my pkg directory. Then I tried to go install the main .exe again, but it didn't work: "package can not be found". I obviously misunderstood the whole concept, because I thought I could use the packages in pkg without the source files in src. If my conclusion was correct, why do I need the pkg directory at all?
For more explanation take a look at this picture please:
In /bin is the binary code of the main function "hello". This main function also contains the function "reverse" of the "stringutil" package.
By generating the "hello.exe", Go also generates the package "stringutil" into pkg.
My question is: Should I not be able to delete "reverse.go" in src and still be able to use the same function because it was already put into pkg?
Is it just the way the AST works now that they've rewritten the compiler in Go? It checks for GOPATH/src/**/*.go when it parses the imports; then, when the linker goes to build the final binary, it'll go check pkg. So the compiler errors out first, when trying to feed the AST to the assembler, because of the incomplete source tree.
Thank you very much!!!
It is true that the pkg dir is usually used as a cache directory, but it is also possible to use packages without having the source code available, via a feature named binary-only packages; as far as I know, it has been available since Go 1.7.
However, there is a caveat to this approach: the version of the compiler used to build the package and the version used to consume it when generating a new library/executable must match. The compiled files must also match the OS/architecture pair you are building for, so if you want cross-compilation, you will need to distribute your package for every OS/architecture pair you intend to build for.
This project has a demo for the aforementioned feature.
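If memory serves (the feature existed from Go 1.7 up to Go 1.12 and was removed afterwards), the stub source file that ships alongside the pre-compiled archive looks roughly like the sketch below; the package name is simply the one from the question:

//go:binary-only-package

// Stub for a binary-only package: the toolchain reads only the package
// clause, imports and documentation from this file and takes the actual
// code from the pre-compiled .a archive placed under pkg/.
package stringutil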
I hope my explanation was detailed enough :)
The pkg dir is just a local cache for compiling; it's not something you can depend upon and it doesn't replace .go files. It is not a static-library dir but a temporary build-products dir. It speeds up compilation: you don't strictly need the pkg dir, but compiles might be slower if it is empty, and it is for the compiler's use, not yours.
As you've discovered, you do need the src dir.
To link a static library into your project, you could copy out that .a file and use ldflags to link it, but then you lose all the nice things like cross-compiling and having the entire source for your app, so, unlike in C for example, people don't typically do that.

Only put .proto protocol buffer file in a repository?

I wonder what the best practice is for protocol buffers regarding a source repository (e.g. git):
Do I have to put ONLY the .proto file in the repository and let everyone else who uses the source code regenerate the class code with the protoc compiler? Or is it best practice to put both the .proto files AND the source code generated by the protoc compiler?
You should never check in generated code if you can avoid it.
If you check in generated code, you take on multiple risks, such as:
You risk losing the knowledge of how to correctly regenerate that code. If it's not automated as part of the build, it's too easy to forget to document, or to have the documentation be wrong.
You risk the generated code getting out-of-sync with the schema. For example, someone could make a change to the .proto file but forget to update the generated code. Their changes won't actually "take effect" until someone else later on regenerates the generated code for some other reason -- and then all of a sudden they see side effects they weren't expecting.
Your generated code might be for a different version of protocol buffers than what the builder has installed. In this case it won't work correctly, since it's necessary to use the exact same version of the compiler and runtime library.
If for some reason you absolutely have to check in generated code, I highly recommend creating an automated test that checks if the checked-in code matches what protoc would generate if run fresh. (For example, the protobuf repository itself contains checked-in copies of generated code for descriptor.proto because this code is needed to compile protoc, creating a circular dependency. But there is a unit test that checks that the checked-in code matches what protoc would generate.)
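As a rough sketch of such a test in Go, assuming protoc and its Go plugin are on the PATH and that the checked-in pair is foo.proto / foo.pb.go in the package directory (those names and the flags are placeholders for illustration):

package pb_test

import (
    "bytes"
    "os"
    "os/exec"
    "path/filepath"
    "testing"
)

func TestGeneratedCodeIsUpToDate(t *testing.T) {
    tmp := t.TempDir()

    // Regenerate the code into a temporary directory.
    cmd := exec.Command("protoc", "--go_out="+tmp, "--go_opt=paths=source_relative", "foo.proto")
    cmd.Dir = "." // the directory that holds foo.proto
    if out, err := cmd.CombinedOutput(); err != nil {
        t.Fatalf("protoc failed: %v\n%s", err, out)
    }

    fresh, err := os.ReadFile(filepath.Join(tmp, "foo.pb.go"))
    if err != nil {
        t.Fatal(err)
    }
    checkedIn, err := os.ReadFile("foo.pb.go")
    if err != nil {
        t.Fatal(err)
    }
    if !bytes.Equal(fresh, checkedIn) {
        t.Error("foo.pb.go is stale: rerun protoc and commit the result")
    }
}

The point is simply that CI fails as soon as the checked-in file drifts from what protoc would generate.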
If your project is commonly used in its source code form (e.g. a library or a program every user is supposed to compile himself), I would make available release packages that have the generated files.
But I wouldn't put the generated files into the repository directly. And if most users will use a compiled binary, it is not that important to provide easy-to-compile source packages either. The protobuf generator then becomes just another build dependency.

Working with digital signatures in Go

I would like to use signatures for a program that I am writing in Go, but I can't figure out the documentation, which is here. In particular, I would like to use the SignPKCS1v15 and VerifyPKCS1v15 functions, but I'm not sure exactly what I have to pass as arguments. I would greatly benefit from some example code of these two functions. Thanks.
Note: The message that I would like to send is a struct that I defined.
I think the src\pkg\crypto\rsa\pkcs1v15_test.go file in the Go source tree should be a good start.
An update striving to provide more context: the Go source contains many tests for the code in its standard library (and the crypto/rsa package is a part of it), so whenever you have no idea how to use a standard package (or, actually, any other Go package), a good place to start is to look at the tests involving that package, as testing code naturally uses the package! Tests are kept in files ending in _test.go, usually have meaningful names and are located in the same directories as the actual code implementing a particular package.
So in your particular case you could do this:
Download the Go source package of the version matching your compiler (what go version shows) and unpack it somewhere.
Navigate to the directory matching the package of interest. Code for standard Go packages is located in the "pkg" directory under the "src" top-level directory, so if you're interested in the crypto/rsa package, you need the src/pkg/crypto/rsa directory.
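To make that more concrete, here is a minimal, untested sketch of signing and verifying a struct with these two functions; the struct, the JSON serialization, the 2048-bit key and the use of SHA-256 are all just illustrative choices:

package main

import (
    "crypto"
    "crypto/rand"
    "crypto/rsa"
    "crypto/sha256"
    "encoding/json"
    "fmt"
)

type Message struct {
    From string
    Body string
}

func main() {
    // Generate a throwaway key pair; in practice you would load a persistent key.
    key, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        panic(err)
    }

    // The struct must be serialized to bytes before it can be hashed and signed.
    msg := Message{From: "alice", Body: "hello"}
    payload, err := json.Marshal(msg)
    if err != nil {
        panic(err)
    }

    // SignPKCS1v15 signs a hash of the message, not the message itself,
    // so hash the payload first and tell the function which hash was used.
    digest := sha256.Sum256(payload)
    sig, err := rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
    if err != nil {
        panic(err)
    }

    // Verification takes the public key, the same hash identifier, the digest
    // and the signature; a nil error means the signature is valid.
    err = rsa.VerifyPKCS1v15(&key.PublicKey, crypto.SHA256, digest[:], sig)
    fmt.Println("signature valid:", err == nil)
}

The essential point is that SignPKCS1v15 and VerifyPKCS1v15 operate on a hash of your serialized message, so you serialize the struct to bytes, hash those bytes, and pass the same crypto.Hash identifier to both calls.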

ada95 have 3 files .ali, .adb and .o - can I compile

I've found some old college work with my final Ada95 project on it. Sadly, the disc was corrupted, and I have only managed to recover 3 files (the source and executable couldn't be recovered):
project.adb, project.ali and project.o
Are these 3 files enough to compile a new exe? I'm downloading the GNAT compiler now, but I have to admit I have forgotten almost everything Ada-related...
Frank
[EDIT]
shucks.... using GCC to compile project.adb throws an error about a missing .ads file, which I cannot recover.
Is it possible to extract this / compile just the ".o" or ".ali" files? Or, am I stuffed?
project.adb is a source file.
Since you say that gcc complains about a missing .ads file, that indicates that project.adb contains a package body. You can manually construct a corresponding package spec by putting the following into project.ads:
package Project is
end Project;
Now that's almost certainly not enough, because the project spec probably had some type and constant declarations in it, so you'd have to analyze your package body and identify what it references. Infer what those declarations should look like and add them. Oh, and if your package body "with's" any packages that are not part of the standard Ada library, you'll have to recover those as well.
If you do manage to get your reverse engineered spec and the body to compile, you'll still have to create a "driver" program that "with's" the project package, and calls whatever functions and/or procedures that carried out the function of your project (and you'll have to pull the specs of those subprograms--which match their appearance in the package body--into the spec as well.)
Frankly, if it were me, I'd spend more time on trying to use some disk recovery tools to pull whatever else I could off the disk.
In Ada95 (and Ada 2005) one mostly works with .adb files (and occasionally with .ads files); everything else is generated along the way. In your case the .adb file is surely linked up to other .ads files.
However, .ads files are usually small (obviously, unless you are attempting really exotic things such as 'the dining philosophers') and pertain to the algorithmic/mathematical structure of the program, so if you can dig out what you did in your project, it should not be impossible to restore them!
