How should I use vendor in Go 1.6? - go

First I have read this answer: Vendoring in Go 1.6, then I use it as my example.
My gopath is GOPATH="/Users/thinkerou/xyz/", and the follow like:
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou$ pwd
/Users/baidu/xyz/src/ou
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou$ ls
main.go vendor
Now, I use go get, then becomes this:
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou$ ls
main.go vendor
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou$ cd vendor/
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou/vendor$ ls
vendor.json
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/ou/vendor$ cd ../..
thinkerou#MacBook-Pro-thinkerou:~/xyz/src$ ls
github.com ou
thinkerou#MacBook-Pro-thinkerou:~/xyz/src$ cd github.com/
thinkerou#MacBook-Pro-thinkerou:~/xyz/src/github.com$ ls
zenazn
vendor.json is this:
{
"comment": "",
"package": [
{
"path": "github.com/zenazn/goji"
}
]
}
then, I should use what commands? why have no use vendor? My go version is 1.6.2.

With Go1.6, vendoring is built in as you read. What does this mean? Only one thing to keep in mind:
When using the go tools such as go build or go run, they first check to see if the dependencies are located in ./vendor/. If so, use it. If not, revert to the $GOPATH/src/ directory.
The actual "lookup paths" in Go 1.6 are, in order:
./vendor/github.com/zenazn/goji
$GOPATH/src/github.com/zenazn/goji
$GOROOT/src/github.com/zenazn/goji
With that said, go get will continue to install into you $GOPATH/src; and, go install will install into $GOPATH/bin for binaries or $GOPATH/pkg for package caching.
So, how do I use ./vendor?!?!
Hehe, armed with the knowledge above, it's pretty simple:
mkdir -p $GOPATH/src/ou/vendor/github.com/zenazn/goji
cp -r $GOPATH/src/github.com/zenazn/goji/ $GOPATH/src/ou/vendor/github.com/zenazn/goji
In short, to use vendoring, you copy the files using the same github.com/zenazn/goji full path, into your vendor director.
Now, the go build/install/run tooling will see and use your vendor folder.
An easier way instead of copying everything manually
Instead of finding and copying all 25+ vendor items, managing their versions, updating other projects etc... It would be better to use a dependency management tool. There are many out there and a little googling will point to you several.
Let me mention two that works with the vendor folder and doesn't fight you:
godep
govendor
In short, these tools will inspect your ou code, find the remote dependencies, and copy them from your $GOPATH/src to your $GOPATH/src/ou/vendor directory (actually, whatever current directory you are in when you run them).
For example, say you have all of your dependencies installed and working normally in your $GOPATH/src/ou/ project using the normal GOPATH/src/github installation of your dependencies. Your project runs and your tests validate everything is working with the exact version of the repos you have. With Godep as an example, you'd run this from your project root folder $GOPATH/src/ou/:
godep save ./...
This would copy all dependencies your project uses into your ./vendor folder.
Godep is by far and large the most popular. They have their own Slack channel on the Gopher Slack group. And, it's the one I use on my teams.
Govendor is another alternative I read has a nice sync feature. I haven't used it though.
Over Usage of Dependency Management Tool
This is purely opinion, and I'm sure haters will downvote... But as I need to finish my blog post on the subject, let me mention here that most people worry too much about depdency management in Go.
Yes, there is a need to lock in a repo to a version you depend on so you can ensure your system builds in production. Yes there is a need to ensure no breaking changes to a way a dependency is interrupting something.
Use dependency management for those, absolutely.
But, there is overuse of simple projects that lock in huge amounts of dependencies when in reality...
You may only need to lock in only 1 dependencies; otherwise, you want the latest version of MySQL drivers and test assertion frameworks for bug fixes.
This is where using the ./vendor/ folder apart from dependency managrment tools can really shine: you'd only need to copy that repo that need you lock in.
You selectively pick the one misbehaving repo and put it into your ./vendor/ folder. By doing this, you are telling your consumers:
Hey, this one repo needs to be held back at this revision. All others are fine and use the latest of those and update often with go get -u ./...; but, this one failed with newer versions so don't upgrade this one repo.
But if blanketly saving all your dependencies with a dependency management tool, you are basically telling your consumers:
There may or may not be a problem with one or more repos out of the 20 in the vendor folder. You may or may not be able to update them. You may or may not be able to get the latest MySQL driver. We simply don't know which may or may not be causing problems and just locked in something that worked at the time that I ran godep save. So yeah, upgrade at your own risk.
Personally, I have ran into this several times. A dependency was updated with a breaking change, and we have dozens of repos dependent on it. Vendoring just that one repo in /vendor allows us to use that one version of dependency, while go get ./... continues to run normally for all other repos to get the latest. We run with the latest bug fixes in PSQL and MySQL and others (there are constant fixes for these!) and so on.

Related

Go update all modules

Using this module as an example (using a specific commit so others will see
what I see):
git clone git://github.com/walles/moar
Set-Location moar
git checkout d24acdbf
I would like a way to tell Go to "update everything". Assume that the module
will work with the newest version of everything. Below are five ways I found to
do this, assume each is run on a clean clone. This results in a go.mod of 19
lines:
go get -u
This results in a go.mod of 14 lines:
go get -u
go mod tidy
This results in a go.mod of 13 lines:
go mod tidy
If I just manually delete everything in require and run go mod tidy, I get
12 lines. If I just manually delete everything in require and run go get -u, I get 11 lines. My question is, why are these methods producing different
results, and what is the "right way" to do what I am trying to do?
tl;dr;
this is what you want:
go get -u
go mod tidy
and to recursively update packages in any subdirectories:
go get -u ./...
The inconsistencies you are seeing is due to the inherent organic nature of software.
Using your example, commit d24acdbf of git://github.com/walles/moar most likely was checked in by the maintainer without running go mod tidy (explaining the longer 19 lines). If the maintainer had, then you would see the 13 line version you see at the end.
go get -u on it's own is more aggressive in pulling in dependencies. Also, the mere fact of updating dependencies to their latest (compatible) version, may in & of itself pull in new direct/indirect dependencies. These dependencies may grow even further if you tried this tomorrow (the latest version of some sub-dependency adds new functionality, so it needs new dependencies). So there may be a valid reason the repo maintainer fixes at a particular (non-latest) version.
go mod tidy cleans up this aggressive dependency analysis.
P.S. It's a common misconception that dependencies will shrink after go mod tidy: tracking go.sum, in some cases this file will grow after a tidy (though, not in this case)
Run go get -u && go mod tidy 1
More details:
go get -u (same as go get -u .) updates the package in the current directory, hence the module that provides that package, and its dependencies to the newer minor or patch releases when available. In typical projects, running this in the module root is enough, as it likely imports everything else.
go get -u ./... will expand to all packages rooted in the current directory, which effectively also updates everything (all modules that provide those packages).
Following from the above, go get -u ./foo/... will update everything that is rooted in ./foo
go get -u all updates everything including test dependencies; from Package List and Patterns
When using modules, all expands to all packages in the main module and their dependencies, including dependencies needed by tests of any of those.
go get will also add to the go.mod file the require directives for dependencies that were just updated.
go mod tidy makes sure go.mod matches the source code in the module. In your project it results in 12 lines because those are the bare minimum to match the source code.
go mod tidy will prune go.sum and go.mod by removing the unnecessary checksums and transitive dependencies (e.g. // indirect), that were added to by go get -u due to newer semver available. It may also add missing entries to go.sum.
Note that starting from Go 1.17, newly-added indirect dependencies in go.mod are arranged in a separate require block.
1: updates dependencies' newest minor/patch versions, go.mod, go.sum
There are some hidden dragons, here is what I recommend:
go get -u ./... walks all packages in your project. This is the command you want to use.
go get -t -u ./... walks all packages in your project and also downloads tests files of these dependencies. Probably you don’t need that.
go get -u updates in the current directory only. Useful for small single-package projects, just use the first version.
go get -u specific.com/package updates just one (or more separated by space) packages (and dependencies).
go get -u specific.com/package#version the same but to a specific version.
go get -u all updates modules from the build list from go.mod. This is useful for listing (go list -m -u all) but not too useful for updates.

Is there a good way to initialize go mods with defaulting to dependency versions in the GOPATH?

I'm starting the process of switching my applications to using go modules from currently not using any dependency manager. I want to use all of the same versions of the dependencies I currently use to avoid the risk of a different version of something causing unforeseen issues. Since I have a microservice architecture with a lot of applications I'm trying to figure out if there's a better way to do this than checking each application and its individual dependencies against what is currently in the build server's GOPATH.
Is there any way, even if just once when first initializing go mods, to have go modules default to the versions in the GOPATH.
If that's not possible (which I have a strong feeling it's not), is it possible to use go list or something similar to print the imported dependencies and the current git sha that exists in the GOPATH?
From the root of your project directory:
go mod init
To pull in build (and optionally test dependencies):
go build ./... # ... notation will scan any subdirectories for any nested packages/tools
go test ./... # optional
The above will pull the latest (semver) version of each dependency. This may NOT be the version you are using with a GOPATH build.
So to ensure you get the latest commit (which is what GOPATH builds use) - I'd go through each dependency in go.mod and issue a manual update to master. So for example lets say you had logrus as a dependency, to update to the latest commit:
go get github.com/sirupsen/logrus#master
If the latest semver matches the latest commit - no change will occur - but if not you will get a tag version plus commit style pseudo-version.
The Go wiki has other go-modules daily workflows e.g. fast-forward, a month/year from now, to pull the latest version of your dependencies into your go.mod (and go.sum):
go get -u ./...
but again be aware if a dependency does not use semver or has switched to a v2 breaking change - the above will not work.
The best practice is to eyeball the repo:
does it support go-modules (i.e. does it have a go.mod at the top level or at the import path level)
is the git repo tagged
and if the latest commit is tagged
only then can you be sure you are getting the version you expect.

Should I commit the yarn.lock file and what is it for?

Yarn creates a yarn.lock file after you perform a yarn install.
Should this be committed to the repository or ignored? What is it for?
Yes, you should check it in, see Migrating from npm
What is it for?
The npm client installs dependencies into the node_modules directory non-deterministically. This means that based on the order dependencies are installed, the structure of a node_modules directory could be different from one person to another. These differences can cause works on my machine bugs that take a long time to hunt down.
Yarn resolves these issues around versioning and non-determinism by using lock files and an install algorithm that is deterministic and reliable. These lock files lock the installed dependencies to a specific version and ensure that every install results in the exact same file structure in node_modules across all machines.
Depends on what your project is:
Is your project an application? Then: Yes
Is your project a library? If so: No
A more elaborate description of this can be found in this GitHub issue where one of the creators of Yarn eg. says:
The package.json describes the intended versions desired by the original author, while yarn.lock describes the last-known-good configuration for a given application.
Only the yarn.lock-file of the top level project will be used. So unless ones project will be used standalone and not be installed into another project, then there's no use in committing any yarn.lock-file – instead it will always be up to the package.json-file to convey what versions of dependencies the project expects then.
I see these are two separate questions in one. Let me answer both.
Should you commit the file into repo?
Yes. As mentioned in ckuijjer's answer it is recommended in Migration Guide to include this file into repo. Read on to understand why you need to do it.
What is yarn.lock?
It is a file that stores the exact dependency versions for your project together with checksums for each package. This is yarn's way to provide consistency for your dependencies.
To understand why this file is needed you first need to understand what was the problem behind original NPM's package.json. When you install the package, NPM will store the range of allowed revisions of a dependency instead of a specific revision (semver). NPM will try to fetch update the dependency latest version of dependency within the specified range (i.e. non-breaking patch updates). There are two problems with this approach.
Dependency authors might release patch version updates while in fact introducing a breaking change that will affect your project.
Two developers running npm install at different times may get the different set of dependencies. Which may cause a bug to be not reproducible on two exactly same environments. This will might cause build stability issues for CI servers for example.
Yarn on the other hand takes the route of maximum predictability. It creates yarn.lock file to save the exact dependency versions. Having that file in place yarn will use versions stored in yarn.lock instead of resolving versions from package.json. This strategy guarantees that none of the issues described above happen.
yarn.lock is similar to npm-shrinkwrap.json that can be created by npm shrinkwrap command. Check this answer explaining the differences between these two files.
You should:
add it to the repository and commit it
use yarn install --frozen-lockfile and NOT yarn install as a default both locally and on CI build servers.
(I opened a ticket on yarn's issue tracker to make a case to make frozen-lockfile default behavior, see #4147).
Beware to NOT set the frozen-lockfile flag in the .yarnrc file as that would prevent you from being able to sync the package.json and yarn.lock file. See the related yarn issue on github
yarn install may mutate your yarn.lock unexpectedly, making yarn claims of repeatable builds null and void. You should only use yarn install to initialize a yarn.lock and to update it.
Also, esp. in larger teams, you may have a lot of noise around changes in the yarn lock only because a developer was setting up their local project.
For further information, read upon my answer about npm's package-lock.json as that applies here as well.
This was also recently made clear in the docs for yarn install:
yarn install
Install all the dependencies listed within package.json
in the local node_modules folder.
The yarn.lock file is utilized as follows:
If yarn.lock is present and is enough to satisfy all the dependencies
listed in package.json, the exact versions recorded in yarn.lock are
installed, and yarn.lock will be unchanged. Yarn will not check for
newer versions.
If yarn.lock is absent, or is not enough to satisfy
all the dependencies listed in package.json (for example, if you
manually add a dependency to package.json), Yarn looks for the newest
versions available that satisfy the constraints in package.json. The
results are written to yarn.lock.
If you want to ensure yarn.lock is not updated, use --frozen-lockfile.
From My experience I would say yes we should commit yarn.lock file. It will ensure that, when other people use your project they will get the same dependencies as your project expected.
From the Doc
When you run either yarn or yarn add , Yarn will generate a yarn.lock file within the root directory of your package. You don’t need to read or understand this file - just check it into source control. When other people start using Yarn instead of npm, the yarn.lock file will ensure that they get precisely the same dependencies as you have.
One argue could be, that we can achieve it by replacing ^ with --. Yes we can, but in general, we have seen that majority of npm packages comes with ^ notation, and we have to change notation manually for ensuring static dependency version.But if you use yarn.lock it will programatically ensure your correct version.
Also as Eric Elliott said here
Don’t .gitignore yarn.lock. It is there to ensure deterministic dependency resolution to avoid “works on my machine” bugs.
Not to play the devil's advocate, but I have slowly (over the years) come around to the idea that you should NOT commit the lock files.
I know every bit of documentation they have says that you should. But what good can it possibly do?! And the downsides far outweigh the benefits, in my opinion.
Basically, I have spent countless hours debugging issues that have eventually been solved by deleting lock files. For example, the lock files can contain information about which package registry to use, and in an enterprise environment where different users access different registries, it's a recipe for disaster.
Additionally, the lock files can really mess up your dependency tree. Because yarn and npm create a complex tree and keep external modules of different versions in different places (e.g. in the node_modules folder within a module in the top node_modules folder of your app), if you update dependencies frequently, it can create a real mess. Again, I have spent tons of time trying to figure out what an old version of a module was still being used in a dependency wherein the module version had been updated, only to find that deleting the lock file and the node_modules folder solved all the hard-to-diagnose problems.
I even have shell aliases now that delete the lock files (and sometimes node_modules folders as well!) before running yarn or npm.
Just the other side of the coin, I guess, but blindly following this dogma can cost you........
I'd guess yes, since Yarn versions its own yarn.lock file:
https://github.com/yarnpkg/yarn
It's used for deterministic package dependency resolution.
Yes! yarn.lock must be checked in so any developer who installs the dependencies get the exact same output! With npm [that was available in Oct 2016], for instance, you can have a patch version (say 1.2.0) installed locally while a new developer running a fresh install might get a different version (1.2.1).
Yes, You should commit it. For more about yarn.lock file, refer the official docs here

Library dependencies in Go

I've created a library/package in Go and the consensus was that only applications include a vendor folder in their project and libraries don't.
So now I included my package in another (govendor'ed) project and everything worked fine, untill it got to Jenkins and it had to use its local resources, where 2 of the dependencies were missing.
My project readme says all you need to do is go get my project and you're done. But that's not the case in case you're using govendoring.
What should be the approach for my library? Can this be solved, or is this 'problem' just something the end-user has to solve because they use govendor?
This is more of an opinion question I guess, however I'll share what I use.
I use git subtree for vendoring sub repos in my tree then add a //go:generate line to update it later on, for example:
➜ git subtree add --prefix vendor/xxx/yyy/zzz https://github.com/xxx/yyy/zzz master --squash
Then add //go:generate git subtree pull --prefix vendor/xxx/yyy/zzz https://github.com/xxx/yyy/zzz master --squash to one of my library files.
And just run go generate before I make release.
That solves the vendoring issue without the need of any external tools.
Live example: https://github.com/OneOfOne/xxhash/blob/master/xxhash_cgo.go

Any smart method to get exp/html back after Go1?

I've installed the Go release version as root.
Go1 removed all exp/ code.
Is there smart method to get exp/* back after Go1?
(I mean how to install in my local GOPATH?)
[My Solution]
# pull from go repository to $HOME/repo/go
cd $HOME/repo
hg clone https://go.googlecode.com/hg/go
# make symbolic link to your GOPATH(eg. $HOME/go)
cd $HOME/go/src
ln -s $HOME/repo/go/src/pkg/exp .
The exp/html library was incomplete which is why it was removed for Go1.
However if you really want to use it then
go get code.google.com/p/go/src/pkg/exp/html
may install it back for you. If you want a slightly more complete html parser then you might checkout http://code.google.com/p/go-html-transform/ as well it has an html5 parser as well as a css selector based scraping and transformation library.
EDIT: Apparently trying to go get the package that way doesn't really work. It appears the only way to install this is to checkout the go source code and then install from source. This is actually a really quick an painless process if you want to go that route.
Building from source is the way to do this. When you do the hg update step though, note that since the exp tree is not tagged go1, that hg update release won't get it for you. Instead hg update weekly will get it, and is probably what you want.
Edit: Weekly releases were discontinued after Go 1, so hg update weekly will access increasingly stale code. A better strategy is hg update tip, then copy the exp directory or directories of interest somewhere and recompile it with whatever Go version you are using, Go 1.0.1, for example.
Note: with go 1.4 (Q4, 2014), the url for that exp package will change (again):
code.google.com/p/go.exp => golang.org/x/exp
That means now:
go get golang.org/x/exp
See "Go 1.4 subrepo renaming".
Regarding the html package, it is in net/html, so this will become (as commented by andybalholm):
go get golang.org/x/net/html
The exp packages have been moved to different repositories now, to make them easier to install. Now you can install the former exp/html with go get "golang.org/x/net/html".
This answer is outdated.
This is covered in the golang wiki:
https://code.google.com/p/go-wiki/wiki/InstallingExp
% cd $GOPATH/src
% hg clone https://code.google.com/p/go go-exp
requesting all changes
adding changesets
adding manifests
adding file changes
added 13323 changesets with 50185 changes to 7251 files (+5 heads)
updating to branch default
3464 files updated, 0 files merged, 0 files removed, 0 files unresolved
% mv go-exp/src/pkg/exp .
% rm -rf go-exp
% go install exp/...
Then, to use it:
import "exp/proxy"
I tried this a few months ago and it worked pretty well. Also, when I ran go install ... I limited it to only the package I was interested in: go install exp/html (if I recall, correctly).

Resources