I am using dep to handle my Go dependencies. Is it best practice to also commit the vendor directory into version control? Or is best practice to always execute dep ensure after checking out a repository?
The dep tool's FAQ answers this:
Should I commit my vendor directory?
It's up to you:
Pros
It's the only way to get truly reproducible builds, as it guards against upstream renames, deletes and commit history overwrites.
You don't need an extra dep ensure step to sync vendor/ with Gopkg.lock after most operations, such as go get, cloning, getting latest, merging, etc.
Cons
Your repo will be bigger, potentially a lot bigger, though prune can help minimize this problem.
PR diffs will include changes for files under vendor/ when Gopkg.lock is modified; however, files in vendor/ are hidden by default on GitHub.
Imagine what would happen to your project if a dependency was taken offline by its author. Until Go has a central server that holds all packages and guarantees they cannot be deleted, many people will see the need to commit the vendor folder.
I don't commit vendor/ for well-available sources, because in that case many commits are bloated with changes under vendor/. When I want to update, I do so and then commit the updated Gopkg.* files.
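The "don't commit vendor/" workflow described above can be sketched like this. The dep commands are shown as comments, since dep may not be installed on your machine; the directory name depdemo is made up for the demo:

```shell
rm -rf depdemo && mkdir depdemo
# keep vendor/ out of version control; only Gopkg.toml and Gopkg.lock are tracked
printf 'vendor/\n' > depdemo/.gitignore
# after cloning or pulling, teammates regenerate vendor/ from the lock file:
#   dep ensure
# to update dependencies, refresh and commit only the changed Gopkg.* files:
#   dep ensure -update
#   git add Gopkg.toml Gopkg.lock && git commit -m "update dependencies"
cat depdemo/.gitignore
```

With this setup, vendor/ is a disposable build artifact and every clone reconstructs it from Gopkg.lock.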
Related
I am using go modules on go1.12 to handle my Go dependencies. Is it best practice to also commit the vendor/ directory into version control?
This is somewhat related to Is it best-practice to commit the `vendor` directory?, which asks this question in the case of using dep. With dep, committing vendor/ is the only way to get truly reproducible builds. What about for Go modules?
I'd like to give some arguments in favour of committing vendor, go.mod and go.sum.
I agree with the accepted answer's arguments that it's technically unnecessary and bloats the repo.
But here is a list of counter-arguments:
Building the project doesn't depend on some code being available on GitHub/GitLab/... or the Go proxy servers. Open-source projects may disappear because of censorship, author incentives, licensing changes or other reasons, which did happen on npm, the JavaScript package manager, and broke many projects. Not in your repo, not your code.
We may use internal or 3rd-party private Go modules, which may also disappear or become inaccessible, but if they are committed in vendor, they are part of our project. Nothing breaks unexpectedly.
Private Go modules may not follow semantic versioning, which means the Go tools will rely on the latest commit hash when fetching them on-the-fly. Repo history may be rewritten (e.g. rebase) and you, a colleague or your CI job may end up with different code for the dependencies they use.
Committing vendor can improve your code review process. We always commit dependency changes in a separate commit, so they can be easily reviewed if you're curious.
Here's an interesting observation related to bloating the repo. If I do a code review and a team member has included a new dependency with 300 files (or updated one with 300 files changed), I would be pretty curious to dive into that and start a discussion about code quality, the need for the change, or alternative Go modules. This may actually lead to a decrease in your binary size and overall complexity.
If I just see a single new line in go.mod in a new Merge Request, chances are I won't even think about it.
CI/CD jobs which perform compilation and build steps don't need to waste time and network downloading the dependencies every time the CI job is executed. All needed dependencies are local and present (go build -mod=vendor).
These are off the top of my head; if I remember something else, I'll add it here.
Unless you need to modify the vendored packages, you shouldn't commit them. Go modules already give you reproducible builds: go.mod records the exact versions of your dependencies and go.sum records their checksums, which the go tool will respect and follow.
The vendor directory can be recreated by running the go mod vendor command, and it's even ignored by default by go build unless you ask it to use it with the -mod=vendor flag.
Read more details:
Go wiki: How do I use vendoring with modules? Is vendoring going away?
Command go: Modules and vendoring
Command go: Make vendored copies of dependencies
I am new to Go and was wondering whether it's safe to commit the Gopkg.lock file in a VCS?
Yes, you should commit it. This file makes much less sense if everybody has their own version.
It makes the build reproducible from one developer to another, and more importantly it ensures the deployment build environment won't use an unexpected set of dependency versions.
Currently, we have all vendored libraries in src/vendor, which makes docker-compose build quite fast. However, adding vendored libraries to source control has the disadvantage of libraries not being updated and also heavily polluting the diff of pull requests.
Is there a way in between, maybe with caching?
Is there a way in between, maybe with caching?
Yes, several. But don't fight the system/preferred method.
Use $GOPATH/src/MyProject/vendor like you are already doing.
adding vendored libraries to source control has the disadvantage of libraries not being updated...
That all depends on your team's management of your repo. If everyone ignores the vendor directory, then yes, it will get stale.
Personally, I make it a "1st of the month" habit to go through and refresh all dependencies, run our test suites, and if there are no errors, update for QA integration testing on the dev server and keep an eye on the error logs after release. Tools like godep and gostatus greatly help keep your GOPATH in sync with the latest versions, so you can update your vendor folder(s) quickly.
Just make sure it is a dedicated commit, so it can be reverted in a hurry if an issue creeps up.
also heavily polluting the diff of pull requests
First of all, that's just a process task. I enforce rebasing on all pull requests and reject all merges in all repos. This keeps a very clean git history; but, more to the point, rebasing moves your local commits to after the vendor updates. You shouldn't ever get a conflict unless someone added the same package, which at that point is easy: just take the latest one and be done.
It sounds like there are process issues to work out rather than anything to worry about in /vendor management.
I am really frustrated with git's submodule feature. Either I still don't get it right, or it just doesn't work as I expect. The following project situation is given:
Project
| .git
| projsrc
| source (submodule)
| proj.sln
In this scenario, source points to another repository containing the common source data shared across all our projects. There is a lot of development happening under source as well as under projsrc. Unfortunately, Project points to some commit of the source submodule and not to its actual HEAD. This is the usual git behaviour, as far as I understand it.
I already found out that
git submodule update
just gets the version of the submodule which was committed together with the main Project. However, I would really like to always be up to date with the submodule's development, but I don't have any real clue how to do this right. Hence my question is:
Is it possible to attach the Project to the HEAD of the submodule, regardless of whether this will break the compilation of Project or not? I just don't want to always go into the submodule directory and do git pull there, since I think I could lose my changes made in the submodule directory, because it is simply attached to a commit and not really to any branch.
Please consider the following constraints:
Developers in our group are not that familiar with all the VCSs around. We used a really huge svn repository before, without any external-repo features at all.
We are working on Windows
A click-and-forget solution would be best, since most of the project members are rather scared of using a command line interface :)
The reason why a submodule points to a particular revision is important. If it pointed to a HEAD, builds would be unreproducible: if you checked out yesterday's version of Project, you would never know which exact commit source's HEAD referred to yesterday.
That's why it always stores a particular revision SHA.
To pull all submodules you could use Easy way to pull latest of all git submodules
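A self-contained sketch of pulling a submodule to its upstream tip, assuming a reasonably recent git. The lib and Project names mirror the question; -c protocol.file.allow=always is only needed because the demo uses local paths as submodule URLs:

```shell
set -e
rm -rf subdemo && mkdir subdemo

# a stand-in for the shared 'source' repository
git -C subdemo init -q lib
git -C subdemo/lib config user.email dev@example.com
git -C subdemo/lib config user.name Dev
echo v1 > subdemo/lib/common.c
git -C subdemo/lib add common.c
git -C subdemo/lib commit -qm "v1"

# the Project repository embedding it as a submodule
git -C subdemo init -q Project
git -C subdemo/Project config user.email dev@example.com
git -C subdemo/Project config user.name Dev
git -C subdemo/Project -c protocol.file.allow=always submodule add ../lib source
git -C subdemo/Project commit -qm "add source submodule"

# upstream development moves ahead...
echo v2 > subdemo/lib/common.c
git -C subdemo/lib commit -qam "v2"

# ...and one command brings the submodule up to its branch tip
git -C subdemo/Project -c protocol.file.allow=always submodule update --remote --merge
cat subdemo/Project/source/common.c
```

Note that Project still records a specific commit of source; `update --remote` just moves that recorded commit forward, and you then commit the new pointer in Project as usual.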
I am not good at Git and submodules, but I think some simple rules would be very helpful.
Commit and push from the sub-directory.
Go back to the root of your project and check the status to see if you need to commit and push again.
When pulling, try to use a script to bundle the pull and submodule update together, and only run it at the root of your project.
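The "bundle pull and submodule update in a script" rule might look like this minimal wrapper (the script name sync.sh is my choice):

```shell
# write a tiny wrapper to be run only from the project root
cat > sync.sh <<'EOF'
#!/bin/sh
# pull the superproject, then bring every submodule to the recorded commit
set -e
git pull
git submodule update --init --recursive
EOF
chmod +x sync.sh
cat sync.sh
```

Team members who are scared of the command line can then be told to run one script instead of remembering the two-step dance.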
Consider this:
source is pointing to HEAD (as you would want).
you make changes to source inside your Project (you commit them but do not push them)
now you have two HEADs: one in your Project's source, another in the common source.
Which one would you want to be present in your Project when you run submodule update?
The problem (and the main feature) of git in your case is that you consider commit and push as an atomic operation. It isn't. Git is decentralized. There is no common HEAD. You might have multiple repositories with different HEADs.
Consider this:
you have 3 developers (A, B and C) with a git project.
they all pull the project's HEAD.
each developer makes changes to the project
now there are three HEADs: A's HEAD, B's HEAD and C's HEAD.
Which HEAD do you consider the "true" HEAD?
So, to answer your question: if you want the common source submodule to always be synchronized with a central repository, git isn't your choice. And probably no VCS would help you with that.
You should treat a git submodule as a third-party library which should be updated manually with these two steps:
Pull your submodule ("download the third-party library")
Commit your project with the updated submodule ("place the new version of the third-party library in your project")
If you want to make changes to the submodule, you should do the same in reverse order:
Commit your submodule ("make your changes to the third-party library")
Push your submodule ("send your changes to the third-party library maintainer")
Commit your project with the updated submodule ("place the new version of the third-party library in your project")
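The steps above can be sketched end to end with local repositories. The names pushdemo, lib.git and seed are made up, a bare repo stands in for the library's upstream, and -c protocol.file.allow=always is only needed because the demo uses local paths:

```shell
set -e
rm -rf pushdemo && mkdir pushdemo

# bare repo standing in for the third-party library's upstream
git -C pushdemo init -q --bare lib.git
git -C pushdemo clone -q lib.git seed
git -C pushdemo/seed config user.email dev@example.com
git -C pushdemo/seed config user.name Dev
echo v1 > pushdemo/seed/lib.c
git -C pushdemo/seed add lib.c
git -C pushdemo/seed commit -qm "v1"
git -C pushdemo/seed push -q origin HEAD

# the parent project embeds the library as a submodule
git -C pushdemo init -q Project
git -C pushdemo/Project config user.email dev@example.com
git -C pushdemo/Project config user.name Dev
git -C pushdemo/Project -c protocol.file.allow=always submodule add ../lib.git source
git -C pushdemo/Project commit -qm "add source submodule"

# 1) commit your change inside the submodule
git -C pushdemo/Project/source config user.email dev@example.com
git -C pushdemo/Project/source config user.name Dev
echo v2 > pushdemo/Project/source/lib.c
git -C pushdemo/Project/source commit -qam "improve lib"

# 2) push it to the library's upstream
git -C pushdemo/Project/source push -q origin HEAD

# 3) record the new submodule commit in the parent project
git -C pushdemo/Project add source
git -C pushdemo/Project commit -qm "bump source to latest"
```

After the last commit, Project's history pins exactly which revision of the library it was built against.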
I'm going to start tracking a project I'm working on using TortoiseGit. I have a lot of .c and .h files, and I also have .exe, .obj, .pdb, .ilk, .suo, etc. I would like to create a snapshot of everything, all those files, so that I can roll back to a prior revision if necessary. However, after a few weeks I want to upload all those revisions to GitHub, but I would like people to see only the .c and .h file changes and have only those files visible in the clean public version of the project. I'm new to git and not sure how best to go about this. The closest question I found (but don't really understand) is here:
Push a branch of a git repo to a new remote (github), hiding its history
Is that what I want to do? Can someone break it down for me with step-by-step instructions that I can follow using gitk (Git GUI with msysgit) or TortoiseGit? My experience level: I've read the Git Book but not the advanced sections yet. Thanks.
I think the link to the question and the answers given for it are the way to go for this.
Another way (which many may frown upon) is that you can put your git repo inside another git repo. This way, you commit the local binaries etc. in the outer repo but ignore them in the inner one. Also ignore the inner repo's .git in the outer repo. This will enable you to go back to an older version of the binaries and the corresponding version of the source.