How to store Go dependencies?

I am using godep to resolve my project's dependencies.
My problem is that the repositories for those dependencies might be removed, and then my project wouldn't build.
I am looking for a way to store the dependencies in Artifactory, or some other solution.
Please advise.
Regards.

Okay, so godep may be the standard way of doing this, but I have usually found it a bit complicated. In my opinion, it is simpler to use a Makefile that sets a custom GOPATH and to include the dependencies with your code (with their .git folders removed). This way the versions are frozen and no one needs to run godep restore or something similar.
You can make recipes such as make deploy, which builds your code, runs gofmt, cleans the pkg files and installs the binary to bin/ under your custom GOPATH, so you can then just run the binary.
You can have another one, such as make install, that installs any missing dependencies.
I've also managed to add a watch recipe to my Makefile that uses inotify-tools on a Linux-based system to look for changes and trigger a rebuild.
Internally, all recipes use standard go commands, but you get rid of godep and of maintaining its JSON file. Upgrading a dependency may be a bit of a problem, as you have to manually copy the whole directory into your custom path and remove the .git/ folder.
Our company uses this method and it seems to work quite nicely for us.
Plus, this method basically gets you away from the $GOPATH/src/github.com/repoName/ kind of paths.
If I seem unclear, let me know and I'll add a gist on GitHub.

Related

Should I commit vendor directory with go mod?

I am using go modules on go1.12 to handle my Go dependencies. Is it best practice to also commit the vendor/ directory into version control?
This is somewhat related to Is it best-practice to commit the `vendor` directory? which asks this question in the case of using dep. With dep, committing vendor/ is the only way to get truly reproducible builds. What about Go modules?
I'd like to give some arguments in favour of committing vendor, go.mod and go.sum.
I agree with the accepted answer's arguments that it's technically unnecessary and bloats the repo.
But here is a list of counter-arguments:
Building the project doesn't depend on some code being available on GitHub/GitLab/... or the Go proxy servers. Open source projects may disappear because of censorship, authors' incentives, licensing changes or other reasons I can't currently think of. This did happen on npm, the JavaScript package manager, and it broke many projects. Not in your repo, not your code.
We may have used internal or 3rd party Go modules (private) which may also disappear or become inaccessible, but if they are committed in vendor, they are part of our project. Nothing breaks unexpectedly.
Private Go modules may not follow semantic versioning, which means the Go tools will rely on the latest commit hash when fetching them on-the-fly. Repo history may be rewritten (e.g. rebase) and you, a colleague or your CI job may end up with different code for the dependencies they use.
Committing vendor can improve your code review process. We typically commit dependency changes in a separate commit, so they can be easily viewed if you're curious.
Here's an interesting observation related to bloating the repo. If I do a code review and a team member has included a new dependency with 300 files (or updated one with 300 files changed), I would be pretty curious to dive deep into that and start a discussion about code quality, the need for this change or alternative Go modules. This may actually lead to a decrease in your binary size and overall complexity.
If I just see a single new line in go.mod in a new Merge Request, chances are I won't even think about it.
CI/CD jobs which perform compilation and build steps need not waste time and network bandwidth downloading the dependencies every time the CI job is executed. All needed dependencies are local and present (go build -mod=vendor).
These are off the top of my head; if I remember something else, I'll add it here.
Unless you need to modify the vendored packages, you shouldn't. Go modules already give you reproducible builds, as the go.mod file records the exact versions of your dependencies (pseudo-versions include the commit hash) and go.sum records their checksums, which the go tool will respect and follow.
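The vendor directory can be recreated by running the go mod vendor command, and with Go 1.12 go build even ignores it by default unless you ask it to use it with the -mod=vendor flag. (Since Go 1.14, the go command uses vendor/ by default when the directory exists and go.mod declares go 1.14 or later.)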
Read more details:
Go wiki: How do I use vendoring with modules? Is vendoring going away?
Command go: Modules and vendoring
Command go: Make vendored copies of dependencies

Go dep keep package even if not currently used

Go dep's dep ensure command will remove packages not currently in use. There's one particular package we use for debugging, github.com/sanity-io/litter. The challenge we're facing is that if we run dep ensure outside of a debugging session, dep will remove that package.
One solution could be to call that package in some backstage spot in the code that won't bother anyone, thereby showing dep that we are, in fact, using this package. But that sounds ugly, hacky, and could get removed by a future developer on the team.
So, the question is: how do we tell dep to keep a package, even if it's not currently in use?
Add to the beginning of Gopkg.toml:
required = ["github.com/sanity-io/litter"]
The Gopkg.toml docs state about required:
Use this for: linters, generators, and other development tools that
Are needed by your project
Aren't imported by your project, directly or transitively
You don't want to put them in your GOPATH, and/or you want to lock the version
Please note that this only pulls in the sources of these dependencies.
It does not install or compile them.
You should use required for your dependency; take a look at the dep documentation on the required section of Gopkg.toml.

Golang Workspaces In Practice

According to the Go documentation, they would like you to have a workspace that you put all your projects in [1]. However, as far as I can tell, this all falls apart as soon as you want to make a project that does not use Go exclusively.
Take, for example, a project that is made up of many microservices. Let's say that it is structured like this:
app/
    authentication/ (Using Rust)
    users/ (Using NodeJS)
    posts/ (Using Go)
Only one part of the app would be written in Go, and that part is nested in a subdirectory of the app. How would I apply the Go workspace philosophy to this situation?
[1] https://golang.org/doc/code.html#Workspaces
Using a different GOPATH per project is a very good and simple approach. In my experience this also works better than vendor since you can also install binaries and keep them on different versions.
vg is a simple tool that helps manage workspaces; it integrates with your shell and automatically detects workspaces when you cd into them.
Disclaimer: I am one of the authors of the tool.
As of Go 1.11, Go has modules. Among other things, modules enable you to have isolated source trees (with any number of packages and their own dependencies) outside of your $GOPATH.
You create a new module by running go mod init <module name> (you must be outside of $GOPATH/src to do this). This will create a go.mod file in the current folder, and any go command you run in that folder (or any folder beneath) will use that folder as your project root.
You can read more about using go modules as workspaces in this post: https://aliceh75.github.io/using-modules-for-workspaces-in-golang (disclaimer: I wrote it), and you can read more about Go modules on the Go Modules Wiki:
https://github.com/golang/go/wiki/Modules
You can put app/ in $GOPATH/src. Then whenever you're ready to build, you specify the path of your source files, relative to where they are in GOPATH.
For example:
If your app source is in $GOPATH/src/app/ and your .go files are in $GOPATH/src/app/posts/, then you can build a source file (let's say posts.go in app/posts/) with go build $GOPATH/src/app/posts/posts.go or, better, go build posts/posts.go with app/ as your current working directory.
Just set GOPATH according to your Go files:
GOPATH=$PROJECT_PATH/app/posts
Then put your source code under
$PROJECT_PATH/app/posts/src/package

Project structure. Scientific Python projects

I am looking for a better way to structure my research projects. I have the following setup:
There are projects a, b, c and a library lib. Each project tackles a different research question and the library carries code that is used across projects. Thus all projects depend on lib. Things get more complicated as project c depends on projects a and b as well. When I work on project c, I will also update a, b or lib simultaneously. Each project is in a separate git repository.
So far I have dealt with this situation by including the dependencies above via git submodule, with all the source files located in the root dir of the project. The advantage is that I keep track of which version of lib my projects depend on. Also, one of my projects could depend on an outdated version of lib. I run everything from the root directory without "installing" any of the packages to site-packages or so. When a path is not set correctly, I override it via sys.path.insert.
However, the following points make me want to change layout:
I keep losing track of which version of lib I am editing.
I want to make use of automated testing tools (tox, Jenkins, etc.), which seem to be much easier to handle with a standard project setup.
sys.path.insert can lead to subtle problems which are hard to debug.
I usually want all my projects to work with the tip of lib anyway.
Therefore I am currently rearranging all projects (especially lib) to be in line with the standard Python directory structure (source stored in a subdirectory, root contains a setup.py file) to be able to work in a virtualenv. Then I can list all my dependencies in requirements.txt. First I install lib in develop mode via pip install -e . and then I run pip freeze > requirements.txt, which then includes a line similar to this:
-e git+<path_to_remote>@<sha>#egg=lib
So again I have generated a dependency on a specific commit (sha), as with git submodule, ensuring that I can check out an old commit and the project should run. I can now install everything in a virtualenv and have got rid of my path problems. Great.
I face some new trouble though. One problem is how to update the sha in requirements.txt. The easiest (but probably not most elegant) solution I see is to write a pre-commit hook that updates the sha before committing. Is there a better way?
And more generally - do you see a better solution given my setup?
As far as I can see, you have mostly solved your problem and there are only small bits left.
1) Don't use hashes to identify versions of your libraries. Even if you don't publish your libraries to the Cheese Shop, do normal library versioning (semver) and tag your git repositories accordingly. This way you will have human-readable and manageable versions in the git+https://github.com/... URLs of your dependencies.
2) Set up tox so that it lets you test both against the stable versions of your dependencies (the ones you tagged last) and against master, straight from the latest repo revision.

Jenkins + Cmake + JIRA = CI of multiple interdependent projects?

We have a number of small projects within our system running on Linux (Slackware 7-11, slowly migrating to RHEL 6.0). Around 50-100 applications and 15-20 libraries. Almost all our applications use one or more of our libraries. Our source tree looks something like this:
/app1
/app2
/app3
/include
/foo/app4
/foo/app5
/foo/app6
/foo/lib1
/foo/lib2
/lib/lib3
/lib/lib4
/lib/include
Now, I've done some work creating some CMakeLists.txt files and built most of the libs and some of the apps. I'm fairly comfortable with using cmake to build. I did this with v2.6, and I recently (an hour ago) upgraded to 2.8. Each of the above projects has its own CMakeLists.txt file, specific to the project, to do building and installation (no packaging, yet).
I have a requirement to make use of and enforce continuous integration. I've installed and played around with Jenkins, and from what I've seen I'm very impressed. I'm also evaluating JIRA to do our issue tracking.
Just to get things up and going, I've done a cmake install on all the libs, so the apps can find them in the filesystem. Headers are installed to /usr/local/include and libs to /usr/local/lib. Is this a bad thing to do? Would it be better to tell cmake to look for the lib's source directory, use the export interface or the recently introduced ExternalProject_Add?
Because I'm going to be using Jenkins, I cannot be guaranteed that cmake can find the source or build directory. Of course, I can tell Jenkins to build the projects in order (or at least, build the dependencies first). If an update to a library breaks the building of another project, then I guess it'll be up to someone with 3/4 of a wit to determine this.
Thank you in advance
Just to get things up and going, I've done a cmake install on all the libs, so the apps can find them in the filesystem. Headers are installed to /usr/local/include and libs to /usr/local/lib. Is this a bad thing to do?
No, it is not a bad thing to do, but your build should reproduce its resources from scratch. Portability and fixing build bugs will become an issue if things need to be pre-installed on the system outside of the build process. If you are able to do it in one of the other ways you mentioned, I would suggest that, but if it's going to make your build that much longer, it's something you need to feel out. My ideology is that everything should be movable to a new Jenkins machine with a fresh install at the drop of a hat. Again, this isn't always achievable, but it is something to strive for.
Because I'm going to be using Jenkins, I cannot be guaranteed that
cmake can find the source or build directory. Of course, I can tell
Jenkins to build the projects in order (or at least, build the
dependencies first). If an update to a library breaks the building of
another project, then I guess it'll be up to someone with 3/4 of a wit
to determine this.
Well, one of the things I do with interdependent jobs is have the successful build of one job trigger the jobs that depend on it. So, for example, if B depends on A and A fails, B will never be run, and whoever created the issue in build A is responsible for it, and so on. This prevents a cascading effect of broken builds that were all caused by one broken dependency. I would suggest that you keep the files of a particular build in its job folder and tell the dependent job where the required files are located. Again, keep your builds separate and clean.
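The triggering rule can be sketched in a few lines of code. This is only an illustration of the idea, not how Jenkins implements it: runPipeline is a hypothetical function, the job names are borrowed from the source tree in the question, and the dependencies between them are made up for the example. A job is built only when everything it depends on has built successfully; otherwise it is marked as skipped rather than failed.

```go
package main

import (
	"fmt"
	"sort"
)

// runPipeline builds jobs so that dependencies always come first. deps maps
// a job to the jobs it depends on, and build reports whether a job succeeded.
// The result maps each job to "ok", "failed" or "skipped"; a job is skipped
// when any of its dependencies did not end up "ok".
func runPipeline(deps map[string][]string, build func(job string) bool) (map[string]string, error) {
	result := make(map[string]string)
	visiting := make(map[string]bool)

	var visit func(job string) error
	visit = func(job string) error {
		if _, done := result[job]; done {
			return nil
		}
		if visiting[job] {
			return fmt.Errorf("dependency cycle at %q", job)
		}
		visiting[job] = true
		defer delete(visiting, job)

		upstreamOK := true
		for _, dep := range deps[job] {
			if err := visit(dep); err != nil {
				return err
			}
			if result[dep] != "ok" {
				upstreamOK = false
			}
		}
		switch {
		case !upstreamOK:
			result[job] = "skipped" // never triggered, so it cannot be blamed
		case build(job):
			result[job] = "ok"
		default:
			result[job] = "failed"
		}
		return nil
	}

	jobs := make([]string, 0, len(deps))
	for job := range deps {
		jobs = append(jobs, job)
	}
	sort.Strings(jobs) // deterministic order for the top-level jobs
	for _, job := range jobs {
		if err := visit(job); err != nil {
			return nil, err
		}
	}
	return result, nil
}

func main() {
	// Hypothetical dependencies: app4 needs lib1, app5 needs lib1 and lib2.
	deps := map[string][]string{
		"app4": {"lib1"},
		"app5": {"lib1", "lib2"},
	}
	// Pretend that lib2 is the only job whose build breaks.
	result, err := runPipeline(deps, func(job string) bool { return job != "lib2" })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	names := make([]string, 0, len(result))
	for job := range result {
		names = append(names, job)
	}
	sort.Strings(names)
	for _, job := range names {
		fmt.Println(job, result[job])
	}
}
```

In this example only lib2 is reported as failed, while app5 is merely skipped, which keeps the responsibility with whoever broke lib2.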
I'm also evaluating JIRA to do our issue tracking.
I highly recommend JIRA as an issue tracking system for a company. You might want to look at this Jenkins plugin for integration. If you're using git and you don't mind hosting your code off site, I would give GitHub issues a shot as well.
Good luck, you seem to be on the right track.

Resources