Best practice(s) with Go Modules

I'm "all in" on Go Modules, and mostly I prefer the experience. Before modules -- perhaps like many others -- I treated dependencies as if I worked in a monorepo: each of my projects had its own GOPATH, and I'd often clone from scratch and pull the then-latest versions of all dependencies.
Using Modules, I think I'm breaking the best practice:
For per-commit builds, my projects' go.mod file contains only my primary -- and often only one -- explicit dependencies. Effectively, I don't commit go.mod; I leave my build process to generate it and then run the build. My thinking is that, apart from e.g. specific platforms I'm using, where my familiarity means I'm confident pinning to a specific version, for other dependencies I'd rather maintain currency and get #vLatest.
When I get to building releases, I go mod tidy and commit the resulting go.mod to source control as the basis of the build.
Besides:
potentially breaking builds (which is acceptable for currency);
the absence of go.sum and package hashes (which I'm not independently verifying but trusting, e.g. proxy.golang.org); and
the repetition of pulling dependencies which is unavoidable anyway with my build process,
Is this approach bad?

For building releases, dependency immutability and build reproducibility are critical. Relying on go mod tidy to recreate go.mod assumes each module's git tag is immutable and always available, which is not the case. To ensure that module tags are persistent and immutable, a Go module repository (proxy) is recommended. Refer to the Go 1.11 documentation for a list of "always on" module repositories and enterprise proxies. A short video, "Go Module and Dependency Management - GoCenter and Project Athens", discusses immutable dependency management.
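To illustrate the release-oriented workflow, a CI step might verify the committed module files instead of regenerating them (a minimal sketch using standard go tooling; the CI context is assumed):

# in CI, with go.mod and go.sum committed to source control
go mod verify                  # check cached module downloads against recorded hashes
go build -mod=readonly ./...   # fail rather than silently editing go.mod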

Related

What are the benefits of having a vendor folder?

I can't really grasp the purpose of having a vendor folder. Based on what I learned, it seems the vendor folder is only beneficial if you're trying to make your repo compatible with golang versions earlier than 1.11. We are running golang 1.12.14.
When I brought this up to my coworker he said:
Please use vendor with modules - Go doesn't have a global artifactory. This is, currently, the best option to make sure you have hermetic builds and your code doesn't break when somebody changes something in their repo.
I thought this was what Go modules do? I asked this question, and a commenter said I shouldn't use vendor. Does it make sense to add `go mod vendor` to a pre-commit hook?
Go modules bring the guarantee that you can build your packages deterministically, by locking dependency versions in go.mod and recording their checksums in go.sum. That said, the promise to deterministically build your project only stands if your dependencies remain accessible in the future. You don't know if that is going to be the case.
Vendoring, on the other hand, with or without Go modules, brings stronger guarantees, as it lets you commit the dependencies next to the code. Even if a remote repository is no longer accessible (deleted, renamed, etc.), you will still be able to build your project.
Another alternative is to use Go modules along with a proxy. You can find more information in the official documentation. You can also look at some OSS implementations like gomods/athens or goproxy/goproxy. If you don't feel like setting up and maintaining your own proxy, some commercial offers are available on the market.
So should you go mod vendor each time you commit? It's ultimately up to you, depending on the kind of guarantees you want. But yes, leveraging a proxy or vendoring your dependencies helps you get closer to reproducible builds (a minimal vendoring workflow is sketched below).
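For reference, the vendoring workflow is short (a sketch; note that since Go 1.14 the go command uses a consistent vendor directory automatically, so -mod=vendor only needs to be passed explicitly on older versions):

go mod vendor                # copy all dependencies into ./vendor
git add vendor               # commit them next to the code
go build -mod=vendor ./...   # build strictly from the vendored sources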
Note: with Go 1.17, go mod vendor (see the Go 1.17 release notes on Go commands) might be easier to use:
vendor contents
If the main module specifies go 1.17 or higher, go mod vendor now annotates vendor/modules.txt with the go version indicated by each vendored module in its own go.mod file.
The annotated version is used when building the module's packages from vendored source code.
If the main module specifies go 1.17 or higher, go mod vendor now omits go.mod and go.sum files for vendored dependencies, which can otherwise interfere with the ability of the go command to identify the correct module root when invoked within the vendor tree.
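For example, an entry in vendor/modules.txt would then carry the annotation (structure shown for illustration; the module path and versions are made up):

# github.com/some/dependency v1.2.3
## explicit; go 1.17
github.com/some/dependency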
A vendor folder is a great way to organize and manage third-party dependencies in your project. It is especially useful when your code relies on external libraries or frameworks.
Benefits of having a vendor folder:
It helps reduce dependency conflicts.
It allows you to keep a separate copy of each library or framework installed in your project.
It helps keep the project structure clean and organized.
It makes it easy to install, update, and remove dependencies with minimal effort.
It makes it easier to switch between different versions of a library or framework.

Best practices for external benchmarks when using Go Modules

I have a Go repository, and within it I have some benchmarks (in a _test suffixed package). These benchmarks compare it to, among other things, some third party libraries. I am not using these libraries in my non-benchmark code.
I am now migrating my repo to Go modules. I do not want those third-party libraries in my go.mod, since my library doesn't need them for normal usage, and I don't want to tie my module to them unnecessarily.
What is the recommended go-mod way to do this? My ideas:
build tag on the benchmarks
benchmarks to another repo
module within my module
If someone wants to run your benchmark (for example, to check whether its stated results hold for their machine configuration), then they need to know what versions of dependencies those benchmarks were originally run with. The information needed to reproduce your test and benchmark results belongs in your go.mod file.
But note that “having a minimum version” is not the same as “importing”.
If a user builds your package but does not build and run its test, or if they build some other package within your module, then they will not need to download the source code for the benchmark dependency even if that dependency is included in your go.mod file.
(And the proposal in https://golang.org/issue/36460 doubles-down on that property: if implemented, that proposal would avoid loading dependencies of packages that are never imported, potentially pruning out large chunks of the dependency graph.)
So if you really don't want users to have to build the dependencies of your benchmark, put the benchmark in a separate package from the one that you expect your users to import.
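For instance, a nested module dedicated to benchmarks keeps those requirements out of the root go.mod entirely (a sketch; the module paths and versions are hypothetical):

// benchmarks/go.mod -- a separate module, invisible to importers of the root module
module github.com/you/yourlib/benchmarks

require (
github.com/third/party v1.2.3
github.com/you/yourlib v0.0.0
)

replace github.com/you/yourlib => ../ // benchmark against the local working copy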

Multiple modules within the same project

I've been playing with Go modules and I was wondering what the best practice is in terms of the following directory structure:
project
├── go.mod
├── main.go
└── players
├── go.mod
├── players.go
└── players_test.go
I was having problems importing the players package into my root project at first, but I noticed I could do this in the root go.mod file:
module github.com/<name>/<project>
require (
github.com/<name>/players v0.0.0
)
replace github.com/<name>/players => ./players
This then allows me to do import "github.com/<name>/players" in my main.go file.
Now this approach works, and it was taken from here, but I'm not sure whether it's the correct approach or whether it's just meant for temporarily updating a local package while it's outside version control.
Another option, which seems a little overkill, is to make every module its own repository.
TL;DR: What's the best-practice approach to having multiple modules within the same repository and importing them in other modules / a root main.go file?
In general, a module should be a collection of packages.
Still, you can create modules of single packages. As Volker said, this might only make sense if you want these packages to have a different lifecycle. It could also make sense when you want to import these modules from another project and don't want the overhead of the whole collection of packages.
In General:
A module is a collection of related Go packages that are versioned together as a single unit.
Modules record precise dependency requirements and create reproducible builds.
Most often, a version control repository contains exactly one module defined in the repository root. (Multiple modules are supported in a single repository, but typically that would result in more work on an on-going basis than a single module per repository).
Summarizing the relationship between repositories, modules, and packages:
1. A repository contains one or more Go modules.
2. Each module contains one or more Go packages.
3. Each package consists of one or more Go source files in a single directory.
Source of the Quote: https://github.com/golang/go/wiki/Modules#modules
To answer the question: you can do it the way you have shown in your approach.
I understand this is an old question, but there are some more details that are worth mentioning when managing multiple modules in one repository, with or without go.work.
TL;DR
Each approach has pros and cons, but if you are working on a large code base with many modules, I'd suggest sticking to version handling based on commits or tags, and using Go Workspaces for your day-to-day development.
Go Module Details
replace Directive with No Versioning
When you use a replace directive pointing to a local directory, you will find the version of the dependency module recorded as v0.0.0-00010101000000-000000000000. Essentially, you get no version information.
With the main go.mod defining the github.com/name/project module path, github.com/name/project cannot make a reproducible build, because the target of the replace directive may have had its content updated. This can be especially problematic if the replace target, github.com/name/project/players, is used by many modules: any change in such a common package can result in a behaviour change for all dependents, all at the same time.
If that's not a concern, the replace directive should work absolutely fine. In such a setup, go.work may be a layer you don't really need.
With Versioning
If you want version handling that ensures reproducible, deterministic builds across multiple modules, you can take a few different approaches.
One go.mod, one repository
This is probably the simplest approach: each module has a clear commit history and versioning. As long as you refer to the module via its remote repository, this is probably the easiest setup to start with, and the dependency setup is very clear.
However, note that this approach means managing multiple repositories, and getting go.work to help requires appropriate local directory mapping, which can be difficult for someone new to the code base.
Commit based versioning
It is still possible to define a dependency deterministically, with version information, within a single repository, so that you can build your code. The commit-based approach requires the fewest steps and still works nicely. There are some catches to be noted, though.
For github.com/name/project to have a dependency for github.com/name/project/players, you need to ensure the code you need is in the remote repository. This is because github.com/name/project will pull the code and commit information from the remote repository, even if the same code is available on your local copy of the repository. This ensures that the version of github.com/name/project/players is taken from the commit reference, such as v0.1.1-0.20220418015705-5f504416395d (ref: details of "pseudo-version")
The module name must match the directory structure. For example, if you have the single repository github.com/name/project and a module under /src/mymodule/, the module name must be github.com/name/project/src/mymodule. This is because, when module path resolution takes place, Go finds the root of the repository (in the above example, github.com/name/project.git) and then follows the directory path based on the module name.
If you are working in a private repository, you will need to ensure the go.sum check doesn't block you. You can simply use GOPRIVATE=github.com/name/project to specify paths for which checksum verification should be skipped (see the sketch below).
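In practice, once the dependency's commit has been pushed, resolving it might look like this (a sketch; the module path and commit SHA are placeholders):

# from the module that depends on players:
go get github.com/name/project/players@5f504416395d   # resolves the commit to a pseudo-version
GOPRIVATE=github.com/name/project go build ./...      # skip proxy and checksum database for private paths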
Tag based versioning
Instead of using the commit SHA, you can use Git tags.
But because there can be many modules in one repository, Go needs to find which tag maps to which module. For example, with the following directory structure:
# All assumed to be using `github.com/name/project` prefix before package name
mypackage/ # v1.0.0
anotherpackage/ # v0.5.1
nested/dependency/ # v0.8.3
You will need to create tags in github.com/name/project whose names exactly match the directory structure:
mypackage/v1.0.0
anotherpackage/v0.5.1
nested/dependency/v0.8.3
This way, each tag is correctly resolved by Go modules, and your dependencies stay deterministic (the tag commands are sketched below).
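Creating those tags is plain Git (the names shown match the illustrative layout above):

git tag mypackage/v1.0.0
git tag anotherpackage/v0.5.1
git tag nested/dependency/v0.8.3
git push origin --tags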
go.work Behaviour
If you have a go.work in a parent directory with go work use ./players, etc., that takes precedence and the local files are used, even when you have a version specified in your go.mod.
For local development that spans multiple projects, a Go Workspace is a great way to work on several things at once, without needing to push the dependency's code change first. But at the same time, an actual release will still require broken-up commits, so that the first commit can be referenced later by other code changes.
go.work is said to be a file you rarely need to commit to the repository. You must be aware, though, of the impact of having a go.work in parent paths (workspace setup is sketched below).
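Setting up such a workspace takes two commands (a sketch; directory names follow the earlier example, and Go 1.18+ is required):

go work init .          # create go.work listing the root module
go work use ./players   # add the nested module; its local files now take precedence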
--
References:
https://go.dev/doc/modules/managing-source: Discussion around repository setup
https://go.dev/ref/mod: Go Modules Reference
Side Note:
I have given a talk about this at Go Conference, hosted in Japan - you can find some demo code, slides, etc. here if you are curious to know more with examples.
In 2022, the best-practice approach to having multiple modules within the same repository and importing them in other modules is a "Go module workspace", released with Go 1.18 and the new go work command.
See "Proposal: Multi-Module Workspaces in cmd/go" and issue 45713:
The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode.
The go.work file specifies a set of local modules that comprise a workspace.
When invoked in workspace mode, the go command will always select these modules and a consistent set of dependencies.
go.work file (the proposal used a directory directive; the released Go 1.18 syntax is use):
go 1.18
use (
./baz // foo.org/bar/baz
./tools // golang.org/x/tools
)
replace golang.org/x/net => example.com/fork/net v1.4.5
You now have CL 355689
cmd/go: add GOWORK to go env command
GOWORK will be set to the go.work file's path, if in workspace mode
or will be empty otherwise.
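So, for example (the output path is illustrative):

$ go env GOWORK
/home/user/project/go.work

When not in workspace mode, the command prints an empty line.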

multiple go projects and sharing a vendor directory (in go before 1.11)

I have started learning go (1.7.4) and have a project which currently produces two executables. I have a directory structure as below following the standard go layout:
GOPATH=`pwd`
bin
src/
src/<project1>
src/<project1>/vendor
src/<project1>/glide.yaml
src/<project2>
src/<project2>/vendor
src/<project2>/glide.yaml
pkg/
Project 1 and project 2 share a lot of dependencies.
Is there a way to share the vendor directory between project1 and project2 and still pin the versions to ensure reproducible builds?
I don't want to duplicate the glide.yaml and vendor directories for each project as it bloats the build and violates DRY.
The pkg directory is the obvious the way to do this but unlike vendor I don't have a dependency manager tool like glide to ensure a specific version is used (see also my related question).
A possibly related issue is how this project is organised. I believe in go it would be more conventional for each project sub-directory to map to a single github repository. However, for my project I want to build at least two executables. I realise you can do this by using different package names but it confuses go and glide. I wrestled with getting this to work under a single project and decided/discovered it was easier to use the standard go layout and work two levels up. For example, an advantage is that "go build" etc. in the subdirectories just works without having to name the package. I can also have my build, test and package machinery at the top level operate on all projects and keep my go environment separate from any others.
The programs are not complex enough to warrant separate git repositories (even as submodules). If there is a pattern that makes this work it might render my original question moot.
It should be possible to have a shared vendor directory. The way I am doing it involves Go 1.11 and the new Go feature called modules, but I am pretty sure it should also work with vendor and tools like glide and dep. To use dep/glide, your directory structure might look like this:
- src
- projects
- project1
- project2
- vendor
- glide.yaml
And you can build either from the projects folder, using go build -o p1 project1/*.go, or from an individual project folder, using go build.
The same structure, but outside of GOPATH, will work for Go 1.11 modules. You would have to set the GO111MODULE variable to "on" or "auto" (see the sketch below). Mind you, Go modules store dependencies in another location and download them automatically during the build process when needed.
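For example, with the layout above placed outside GOPATH (a sketch; the module path is a placeholder):

export GO111MODULE=on                       # explicit opt-in for Go 1.11/1.12
cd projects/project1
go mod init example.com/projects/project1   # creates go.mod
go build                                    # dependencies are fetched into the module cache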
Note: glide github page recommends switching to dep as the more official tool
Edit: Just tested it with dep. It works for me.
I recommend looking at the new module system - https://github.com/golang/go/wiki/Modules
It allows you to pin the versions of the packages you use:
module github.com/my/thing
require (
github.com/some/dependency v1.2.3
github.com/another/dependency/v4 v4.0.0
)

Language/Platform/Build-Independent Dependency Manager

I'm in need of a dependency manager that is not tied to a particular language or build system. I've looked into several excellent tools (Gradle, Bazel, Hunter, Biicode, Conan, etc.), but none satisfy my requirements (see below). I've also used Git Submodules and Mercurial Subrepos.
My needs are well described in a presentation by Daniel Pfeifer at Meeting C++ 2014. To summarize the goals of this dependency tool (discussed at 18:55 of the linked video):
Not just a package manager
Supports pre-built or source dependencies
Can download or find locally - no unnecessary downloads
Fetches using a variety of methods (i.e. download, or VCS clones, etc.)
Integrated with the system installer - can check if lib is installed
No need to adapt source code in any way
No need to adapt the build system
Cross-platform
Further requirements or clarifications I would add:
Suitable for third-party and/or versioned dependencies, but also capable of specifying non-versioned and/or co-developed dependencies (probably specified by a git/mercurial hash or tag).
Provides a mechanism to override the specified fetching behavior to use some alternate dependency version of my choosing.
No need to manually set up a dependency store. I'm not opposed to a central dependency location as a way to avoid redundant or circular dependencies. However, we need the simplicity of cloning a repo and executing some top-level build script that invokes the dependency manager and builds everything.
Despite the requirement that I should not have to modify my build system, obviously some top-level build must wield the dependency manager and then feed those dependencies to the individual builds. The requirement means that the individual builds should not be aware of the dependency manager. For example, if using CMake for a C++ package, I should not need to modify its CMakeLists.txt to make special function calls to locate dependencies. Rather, the top-level build manager should invoke the dependency manager to retrieve the dependencies and then provide arguments CMake can consume in traditional ways (i.e. find_package or add_subdirectory). In other words, I should always have the option of manually doing the work of the top-level build and dependency manager and the individual build should not know the difference.
Nice-to-have:
A way to interrogate the dependency manager after-the-fact to find where a dependency was placed. This would allow me to create VCS hooks to automatically update the hash in dependency metadata of co-developed source repo dependencies. (Like submodules or subrepos do).
After thoroughly searching the available technologies, comparing against package managers in various languages (i.e. npm), and even having a run at my own dependency manager tool, I have settled on Conan. After diving deep into Conan, I find that it satisfies most of my requirements out of the box and is readily extensible.
Prior to looking into Conan, I saw BitBake as the model of what I was looking for. However, it is Linux-only and heavily geared toward embedded Linux distros. Conan has essentially the same recipe features as BitBake and is truly cross-platform.
Here are my requirements and what I found with Conan:
Not just a package manager
Supports pre-built or source dependencies
Conan supports classic release or dev dependencies and also allows you to package source. If binaries with particular configurations/settings do not exist in the registry (or "repository", in Conan parlance), a binary will be built from source.
Can download or find locally - no unnecessary downloads
Integrated with the system installer - can check if lib is installed
Conan maintains a local registry as a cache. So independent projects that happen to share dependencies don't need to redo expensive downloads and builds.
Conan does not prevent you from finding system packages instead of the declared dependencies. If you write your build script to be passed prefix paths, you can change the path of individual dependencies on the fly.
Fetches using a variety of methods (i.e. download, or VCS clones, etc.)
Implementing the source function of the recipe gives full control over how a dependency is fetched. Conan supports recipes that download/clone the source, and recipes can also "snapshot" the source, packaging it with the recipe itself.
No need to adapt source code in any way
No need to adapt the build system
Conan supports a variety of generators to make dependencies consumable by your chosen build system. The agnosticism toward any particular build system is Conan's real win, and ultimately what makes dependency management with the likes of Bazel, Buckaroo, etc. cumbersome by comparison.
Cross-platform
Python. Check.
Suitable for third-party and/or versioned dependencies, but also capable of specifying non-versioned and/or co-developed dependencies (probably specified by a git/mercurial hash or tag).
Built with semver in mind, but it can use any string identifier as a version. Additionally, it has user and channel to act as namespaces for package versions.
Provides a mechanism to override the specified fetching behavior to use some alternate dependency version of my choosing.
You can prevent the fetch of a particular dependency by not including it in the install command. Or you can modify or override the generated prefix info to point to a different location on disk.
No need to manually set up a dependency store. I'm not opposed to a central dependency location as a way to avoid redundant or circular dependencies. However, we need the simplicity of cloning a repo and executing some top-level build script that invokes the dependency manager and builds everything.
Despite the requirement that I should not have to modify my build system, obviously some top-level build must wield the dependency manager and then feed those dependencies to the individual builds. The requirement means that the individual builds should not be aware of the dependency manager. For example, if using CMake for a C++ package, I should not need to modify its CMakeLists.txt to make special function calls to locate dependencies. Rather, the top-level build manager should invoke the dependency manager to retrieve the dependencies and then provide arguments CMake can consume in traditional ways (i.e. find_package or add_subdirectory). In other words, I should always have the option of manually doing the work of the top-level build and dependency manager and the individual build should not know the difference.
Conan caches dependencies in a local registry. This is seamless. The canonical pattern you'll see in Conan's documentation is to add some Conan-specific calls in your build scripts, but this can be avoided. Once again, if you write your build scripts to consume prefix paths and/or input arguments, you can pass the info in and not use Conan at all. I think the Conan CMake generators could use a little work to make this more elegant. As a fallback, Conan lets me write my own generator.
A way to interrogate the dependency manager after-the-fact to find where a dependency was placed. This would allow me to create VCS hooks to automatically update the hash in dependency metadata of co-developed source repo dependencies. (Like submodules or subrepos do).
The generators point to these locations. And with the full capability of Python, you can customize this to your heart's content.
Currently, co-developing dependent projects is the biggest question mark for me. Meaning, I don't know whether Conan has something out of the box to make tracking commits easy, but I'm confident the hooks are in there to add this customization.
Other things I found in Conan:
Conan provides the ability to download or build toolchains that I need during development. It uses Python virtualenv to make enabling/disabling these custom environments easy without polluting my system installations.
