Getting all dependencies of a Go project - go

For my project I am trying to get all of its dependencies and sub-dependencies. I need the specific version of each of these dependencies. Not only do I need the dependencies of my project, but also the dependencies of the dependencies, and so on down to the root.
For my project, go list -m all works for everything except indirect dependencies that have not opted into using go.mod files. Right now my workflow is to take an initial batch of repositories, download them from git, and then run "GO111MODULE=on go build ./..." and "GO111MODULE=on go list -m -json all" to get the list of dependencies. I do not check for go.mod, as all of the repositories I am scanning use go.mod files.
For the list of dependencies that comes out of this initial batch I have some questions. For modules without go.mod files, I used this as a reference: "https://blog.golang.org/using-go-modules"
-Path = received from go list -m all; it can be GitHub, gopkg, or whatever is used to download the Go package.
Without go.mod
-"GO111MODULE=on go mod init <PATH from parent go.mod>"
-"GO111MODULE=on go build ./..."
-"GO111MODULE=on go mod tidy"
-"GO111MODULE=on go list -m -json all"
-From there I get a list of the dependencies of this module.
With go.mod
-"GO111MODULE=on go build ./..."
-"GO111MODULE=on go mod tidy"
-"GO111MODULE=on go list -m -json all"
Should I be running go build on each dependency that has a go.mod file? For the ones without a go.mod file, I understand this has to be done, since otherwise the go.mod file would not be populated with the dependencies. But for modules that already have a go.mod file, will go build pull in extra things that my project does not actually use, such as test files and other files that are not needed when I simply import that project? I understand that it's better to get some unused dependencies than to miss some, but the sheer number of dependencies is getting a bit overwhelming.
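As a rough illustration of the last step in the workflow above, here is a minimal Go sketch that decodes the output of "go list -m -json all" and separates direct from indirect modules. The field names (Path, Version, Main, Indirect) are taken from the module struct documented in "go help list"; the sample data, the module paths and the helper name decodeModules are hypothetical and only there to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Module holds a subset of the fields that `go list -m -json` prints
// for each module (see `go help list` for the full set).
type Module struct {
	Path     string
	Version  string
	Main     bool
	Indirect bool
}

// decodeModules parses the output of `go list -m -json all`, which is a
// sequence of concatenated JSON objects rather than a single JSON array.
func decodeModules(output string) ([]Module, error) {
	dec := json.NewDecoder(strings.NewReader(output))
	var mods []Module
	for {
		var m Module
		if err := dec.Decode(&m); err == io.EOF {
			return mods, nil
		} else if err != nil {
			return nil, err
		}
		mods = append(mods, m)
	}
}

// sample stands in for real `go list -m -json all` output; the module
// paths and versions are hypothetical.
const sample = `{
	"Path": "example.com/app",
	"Main": true
}
{
	"Path": "example.com/direct",
	"Version": "v1.4.0"
}
{
	"Path": "example.com/transitive",
	"Version": "v0.2.1",
	"Indirect": true
}`

func main() {
	mods, err := decodeModules(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, m := range mods {
		if m.Main {
			continue // skip the module being scanned itself
		}
		kind := "direct"
		if m.Indirect {
			kind = "indirect"
		}
		fmt.Printf("%s %s (%s)\n", m.Path, m.Version, kind)
	}
}
```

In a real scan the sample string would be replaced by the captured standard output of the go command for each repository.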

One option is to analyze the go.sum file (go.sum is created when you run go list -u).
The go command uses the go.sum file to ensure that future downloads of these modules retrieve the same bits as the first download, to ensure the modules your project depends on do not change unexpectedly, whether for malicious, accidental, or other reasons. Both go.mod and go.sum should be checked into version control. (Using Go Modules - Adding a dependency)
The go.sum file lists the checksum (and version tag) of each direct and indirect dependency required by the module.
% cat go.sum
...
github.com/bmizerany/perks v0.0.0-20141205001514-d9a9656a3a4b/go.mod h1:ac9efd0D1fsDb3EJvhqgXRbFx7bs2wqZ10HQPeU8U/Q=
github.com/c2h5oh/datasize v0.0.0-20171227191756-4eba002a5eae/go.mod h1:S/7n9copUssQ56c7aAgHqftWO4LTf4xY6CGWt8Bc+3M=
github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
github.com/cockroachdb/apd v1.1.0/go.mod h1:8Sl8LxpKi29FqWXR16WEFZRNSz3SoPzUzeMeY4+DwBQ=
...
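If you go down the go.sum route, the file is easy to read programmatically: each line has three whitespace-separated fields (module path, version, hash), and a version ending in /go.mod means the hash covers only that module's go.mod file. A minimal sketch, assuming that line format; the type and function names are my own, and the two sample lines are copied from the excerpt above.

```go
package main

import (
	"fmt"
	"strings"
)

// sumEntry is one parsed go.sum line.
type sumEntry struct {
	Module    string
	Version   string
	Hash      string
	GoModOnly bool // true when the hash covers only the module's go.mod file
}

// parseGoSum splits go.sum text into entries, skipping lines that do not
// have exactly three fields (such as the "..." elisions in the excerpt).
func parseGoSum(text string) []sumEntry {
	var entries []sumEntry
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 3 {
			continue
		}
		e := sumEntry{Module: fields[0], Version: fields[1], Hash: fields[2]}
		if strings.HasSuffix(e.Version, "/go.mod") {
			e.Version = strings.TrimSuffix(e.Version, "/go.mod")
			e.GoModOnly = true
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	const excerpt = `github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=`
	for _, e := range parseGoSum(excerpt) {
		fmt.Println(e.Module, e.Version, e.GoModOnly)
	}
}
```

Keep in mind that go.sum can contain entries for versions that are not part of the final build, so it is better treated as an upper bound than as an exact dependency list.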

The phrase “all the dependencies” is unfortunately ambiguous.
go list -m all lists all modules whose selected versions are determined by your go.mod file. That is one possible definition of “all the dependencies”, but it is a much broader set of modules than I think most people intend when they talk about the “dependencies” of a module.
In practice, go list -test all (without the -m flag) is the broadest set of dependencies I would generally care about: it includes all of the packages listed in import statements within your module (i.e. everything you need in order to run go test ./... within your module), plus all of the packages transitively needed to run go test on those packages.
(In particular, go list -test all is also the set of packages that will be resolved when you run go mod tidy.)
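To turn that package-level view back into a list of modules and versions, one possible approach is to run go list with -json and collect the Module field of each package. The sketch below assumes the ImportPath, Standard and Module fields documented in "go help list"; the function name modulesInUse and the sample data are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"sort"
	"strings"
)

// Package holds the few fields of `go list -json` output used here.
type Package struct {
	ImportPath string
	Standard   bool // true for standard library packages
	Module     *struct {
		Path    string
		Version string
		Main    bool
	}
}

// modulesInUse reads a stream of package objects, as printed by
// `go list -test -json all`, and returns module path -> version for
// every non-main module that provides at least one listed package.
func modulesInUse(output string) (map[string]string, error) {
	dec := json.NewDecoder(strings.NewReader(output))
	used := make(map[string]string)
	for {
		var p Package
		if err := dec.Decode(&p); err == io.EOF {
			return used, nil
		} else if err != nil {
			return nil, err
		}
		if p.Standard || p.Module == nil || p.Module.Main {
			continue
		}
		used[p.Module.Path] = p.Module.Version
	}
}

func main() {
	// Hypothetical sample standing in for real go list output.
	const sample = `{"ImportPath":"fmt","Standard":true}
{"ImportPath":"example.com/app/cmd","Module":{"Path":"example.com/app","Main":true}}
{"ImportPath":"example.com/lib/util","Module":{"Path":"example.com/lib","Version":"v1.0.2"}}`
	used, err := modulesInUse(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	paths := make([]string, 0, len(used))
	for p := range used {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, used[p])
	}
}
```

Modules that appear in go list -m all but never show up here provide no package that your build or tests import.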

Related

Go mod vendored dependencies being downloaded not ignored

I have multiple dependencies for an application. Is it possible to have some dependencies vendored (kept locally in the application so that I can add debugging when deployed) and others downloaded via go.mod/go.sum? When I attempt this, the dependencies that are vendored and listed in modules.txt get pulled down regardless. Am I missing a step? Do I also need to update the imports or go.mod/go.sum to prevent this?
vendor/modules.txt
# github.com/sendgrid/sendgrid-go v2.0.0+incompatible
github.com/sendgrid/sendgrid-go
# github.com/sendgrid/smtpapi-go v0.4.0
github.com/sendgrid/smtpapi-go

Is it possible to automatically load transitive dependencies with Gazelle?

I'd like to use Gazelle to manage my Go dependencies (and their dependencies) in Bazel. Running bazel run //:gazelle update-repos firebase.google.com/go adds a properly configured go_repository to my WORKSPACE file:
go_repository(
    name = "com_google_firebase_go",
    importpath = "firebase.google.com/go",
    sum = "h1:3TdYC3DDi6aHn20qoRkxwGqNgdjtblwVAyRLQwGn/+4=",
    version = "v3.13.0+incompatible",
)
However, this does not work out of the box. Running bazel build @com_google_firebase_go//:go_default_library returns an error:
ERROR: /private/var/tmp/_bazel_spencerconnaughton/9b09d78e8f2190e9af61aa37bcab571e/external/com_google_firebase_go/BUILD.bazel:3:11: no such package '@org_golang_google_api//option': The repository '@org_golang_google_api' could not be resolved and referenced by '@com_google_firebase_go//:go'
ERROR: Analysis of target '@com_google_firebase_go//:go_default_library' failed; build aborted: Analysis failed
INFO: Elapsed time: 0.596s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (23 packages loaded, 133 targets configured)
Is there a way to tell gazelle to load the @org_golang_google_api transitive dependency and others without needing to run update-repos for each one?
I have been struggling with this as well, but it seems that you can simply solve it by using go mod :)
If you use the go mod and go get commands to generate go.mod files, the transitive dependencies are included automatically. You can then use this go.mod file in your bazel-gazelle command ;)
Let's say your project is called github.com/somesampleproject:
go mod init github.com/somesampleproject
Then, use go get to add your dependency to the go.mod file:
go get firebase.google.com/go
go mod actually handles the transitive dependencies as well, as described in the documentation here. So your go.mod file should contain all of your required dependencies now :)
I quickly tried this out with gazelle in one of our projects (as described in the gazelle documentation)
bazel run //:gazelle -- update-repos -from_file=<insert-subfolder-here>/go.mod
Now our WORKSPACE file includes the firebase dependency, but also the dependencies that firebase has itself.
For bonus points you could use tools to validate your go.mod file to make sure you have no dependencies with known security bugs. A quick Google search returned Snyk, nancy and, apparently, built-in support in IDEs such as VS Code ;)

Building Go module without main file

I have a small module that contains some shared code. The module looks like the following :
Shared
├── go.mod
├── go.sum
├── logging
│   └── logger.go
└── db-utils
    └── db.go
If I try to build the Shared module from inside its directory, I get an error saying that there are no Go files in this module.
bash-3.2$ go build .
no Go files in /Users/xxxx/go/src/Shared
Is there a way to build a Go module that only has packages inside and doesn't have a main.go file? The motivation here is to use those packages in other go modules that can't access the internet or private repo in order to retrieve the packages.
The go command stores downloaded modules in the module cache as source code, not compiled object files. go build rebuilds the object files on demand and stores them in a separate build cache.
If you want to distribute a non-public module for use with other modules, you have a few options:
You can publish the module to a private repository — typically accessed with credentials stored in a .netrc file — and use the GOPRIVATE environment variable to tell the go command not to try to fetch it from the public proxy and checksum servers.
You can provide a private GOPROXY server or directory containing the source code, and use the GOPROXY environment variable to instruct the go command to use that proxy.
You can publish the source code as a directory tree and use a replace directive in the consumer's go.mod file to slot it into the module graph.
If you only need to build the files in either the logging or the db-utils directory, you could execute the following from the root directory Shared:
go build <INSERT_MODULE_NAME_FROM_GO_MOD>/logging
go build <INSERT_MODULE_NAME_FROM_GO_MOD>/db-utils
I'm not certain if those commands will work if one of the files has a dependency on a file from the other directory.
Another command that will probably build the entire project is this:
go build ./...
Is there a way to build a go module that only has packages inside and doesn't have a main.go file?
No. The input for the build process is a package, not a module. Note how it says [packages] in the CLI documentation of go build.
When building a package leads to multiple packages being compiled, that is merely a consequence of direct and indirect import statements coming from .go-files located in the package you are building.
Note that Go does not support compiling packages to binaries to distribute closed-source libraries or such. This was not always the case, though. See #28152 and Binary-Only packages. The closest which exists to supporting that are plugins, but they are a work in progress and require resolution of symbols at runtime.

Is there an easier way to keep local Go packages updated

I am using multiple packages that I import into different projects. These range from custom adapters for my business logic, shared by Lambda and Google Cloud Functions, to other public packages. The way I do this right now is to vendor them and include them for cloud functions. For applications that can be compiled and deployed on a VM, I compile them separately. This works fine for me; however, it's a pain to develop these modules.
If I update the method signatures and names in the package, I have to push my changes to github / gitlab (my package path is something like gitlab.com/groupName/projectName/pkg/packageName) and then do a go get -u <packageName> to update the package.
Even this does not always update it; sometimes I am stuck with an older version with no idea how to update it. I wonder whether there is an easier way of working with this.
For sake of clarity:
Exported package 1
Path: gitlab.com/some/name/group/pkg/clients/psql
psql-client
|
|_ pkg
|
|_psql.go
Application 1 uses psql-client
Path: gitlab.com/some/name/app1
Application 2 uses psql-client
Path: gitlab.com/some/name/app2
My understanding is that (a) you are using the new Go modules system, and that (b) part of the problem is that you don't want to keep pushing changes to github or gitlab across different repositories when you are doing local development.
In other words, if you have your changes locally, it sounds like you don't want to round-trip those changes through github/gitlab in order for those changes to be visible across your related repositories that you are working on locally.
Most important advice
It is greatly complicating your workflow to have > 1 module in a single repository.
As is illustrated by your example, in general it is almost always more work on an on-going basis to have > 1 module in a single repository. It is also very hard to get right. For most people, the cost is almost always not worth it. Also, often the benefit is not what people expect, or in some cases, there is no practical benefit to having > 1 module in a repo.
I would definitely recommend you follow the commonly followed rule of "1 repo == 1 module", at least for now. This answer has more details about why.
Working with multiple repos
Given you are using Go modules, one approach is you can add a replace directive to a module's go.mod file that informs that Go module about the on-disk location of other Go modules.
Example structure
For example, if you had three repos repo1, repo2, repo3, you could clone them so that they all sit next to each other on your local disk:
myproject/
├── repo1
├── repo2
└── repo3
Then, if repo1 depends on repo2 and repo3, you could set the go.mod file for repo1 to know the relative on-disk location of the other two modules:
repo1 go.mod:
replace github.com/me/repo2 => ../repo2
replace github.com/me/repo3 => ../repo3
When you are inside the repo1 directory or any of its child directories, a go command like go build or go test ./... will use the on-disk versions of repo2 and repo3.
repo2 go.mod:
If repo2 depends on repo3, you could also set:
replace github.com/me/repo3 => ../repo3
repo3 go.mod:
If for example repo3 does not depend on either of repo1 or repo2, then you would not need to add a replace to its go.mod.
Additional details
The replace directive is covered in more detail in the replace FAQ on the Modules wiki.
Finally, it depends on your exact use case, but a common solution at this point is to use gohack, which automates some of this process. In particular, it creates a mutable copy of a dependency (by default in $HOME/gohack, but the location is controlled by the $GOHACK variable). gohack also sets your current go.mod file to have a replace directive pointing to that mutable copy.
go get is transitive, so you can just add it to your build process. A typical Go project build is basically:
go get -u ./... && go test ./... && go build ./cmd/myapp
Which gets & updates dependencies, runs all project tests, then builds the binary.

Should go.sum file be checked in to the git repository?

I have a program with source code hosted on GitHub that uses Go Modules introduced in go 1.11.
go.mod file describes my dependencies, but go.sum file seems to be a lockfile. Should I be adding go.sum to my repository or should I gitignore it?
https://github.com/golang/go/wiki/Modules#releasing-modules-all-versions:
Ensure your go.sum file is committed along with your go.mod file.
(Building on a previous answer.)
Yes, commit go.sum.
Ensure your go.sum file is committed along with your go.mod file. See FAQ below for more details and rationale.
From the FAQ:
Should I commit my 'go.sum' file as well as my 'go.mod' file?
Typically your module's go.sum file should be committed along with your go.mod file.
go.sum contains the expected cryptographic checksums of the content of specific module versions.
If someone clones your repository and downloads your dependencies using the go command, they will receive an error if there is any mismatch between their downloaded copies of your dependencies and the corresponding entries in your go.sum.
In addition, go mod verify checks that the on-disk cached copies of module downloads still match the entries in go.sum.
Note that go.sum is not a lock file as used in some alternative dependency management systems. (go.mod provides enough information for reproducible builds).
See very brief rationale here from Filippo Valsorda on why you should check in your go.sum. See the "Module downloading and verification" section of the tip documentation for more details. See possible future extensions being discussed for example in #24117 and #25530.
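To make "cryptographic checksums" a little more concrete, here is a minimal sketch of how an h1: value for a single go.mod file could be computed. It follows my understanding of the Hash1 scheme in golang.org/x/mod/sumdb/dirhash (a SHA-256 over a manifest of per-file SHA-256 sums, base64-encoded); treat that as an assumption and use the dirhash package itself for anything that matters. The function name h1GoMod and the sample go.mod content are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// h1GoMod computes an "h1:" style checksum for the contents of one
// go.mod file. Assumption: the manifest has one line per file of the
// form "<hex sha256>  <name>\n", and for a go.mod-only hash the single
// file name is "go.mod".
func h1GoMod(content string) string {
	fileSum := sha256.Sum256([]byte(content))
	manifest := fmt.Sprintf("%x  %s\n", fileSum[:], "go.mod")
	outer := sha256.Sum256([]byte(manifest))
	return "h1:" + base64.StdEncoding.EncodeToString(outer[:])
}

func main() {
	// Hypothetical go.mod content; any change to it changes the checksum.
	fmt.Println(h1GoMod("module example.com/m\n\ngo 1.13\n"))
}
```

Because the checksum is derived only from file contents, anyone who downloads the same module version can recompute it and compare it with the entry in go.sum, which is what makes a committed go.sum useful for detecting changed downloads.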
