How to limit / stop Go's go tool from going to the internet

We recently brought Golang into the company I work for, for general use, but we hit a snag in the rollout because Go can use the go get command to fetch packages from the internet. Typically when we roll out Java and Python we are able to limit where developers can pull packages from by pointing them at our internal Artifactory.
So with Python we can change where they pull from by configuring pip to pull from our internal Artifactory, and with Java we can alter their settings.xml and pom.xml to point to our internal packages.
I know that during development you can fetch dependencies into your local environment and then compile them into a standalone binary. What I am looking for is some mechanism that stops people from going out and pulling from the open internet.
Does something like this exist in Go? Can I stop people from going to the internet and go get'ing packages?
Any help would be greatly appreciated.

It depends on your definition of "roll out", but typically there are three stages:
Development - at this point you can't prevent arbitrary go get calls, apart from putting the development machines behind restrictive proxies or similar technical measures.
Deployment - since Go programs can (should) be deployed as single binaries, go get is not used at all during deployment.
Building deployment artefacts - this is probably your issue:
The usual approach is not to fetch dependencies when building Go programs. Instead, dependencies are fetched during development, and made part of the source tree using vendoring, for example by using the dep tool.
At this point, the build step no longer needs to fetch any dependencies. The choice of which dependencies are allowed now becomes part of the rest of your process, such as code reviews.
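For illustration, here is a minimal sketch of that vendoring step using dep (assuming dep is installed and the project sits on the GOPATH):

    dep init      # creates Gopkg.toml, Gopkg.lock and a vendor/ directory
    dep ensure    # copies the pinned dependency sources into vendor/
    git add Gopkg.toml Gopkg.lock vendor/    # the dependencies now travel with the source tree

On newer Go versions with module support, a similar lock-down can be achieved by pointing the GOPROXY environment variable at an internal mirror (for example an Artifactory instance), or by setting it to off so that builds cannot fetch anything from the public internet at all.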

Related

Approach for using GitLab-CI for complex builds

I'm new to continuous integration. I'm interested in systems that would be able to test whether the changes I made to the code break its compilation across a list of different build types.
Properties of the code (which I will call CodeA):
1.) Has dependencies on numerical libraries like SUNDIALS and PETSc
2.) Has dependencies on two other codes (CodeB and CodeC), which themselves have dependencies on things like HDF5, MPI, etc.
Is it feasible to use the CI feature of GitLab to set up a system that would be able to build CodeA (linked with CodeB and CodeC) on Linux machines with different system flavors (Ubuntu, openSUSE, RHEL, Fedora, etc.)?
Most of the examples I've found of using GitLab for CI have been things like testing whether HelloWorld.cpp compiles when lines are changed in it. Just simple builds with very little external dependency management/integration.
So it sounds like you've got a few really great questions in here. I'll break them apart as I see them and you let me know if this fully answers your question.
How can I build in different flavors of linux?
The approach I would take would be to use Dockerfiles, as Connor Shea mentioned in the comments. This allows you to continue using generic build agents in your CI system while testing across multiple platforms.
Another option would be to look at how you're distributing your application and see if you could use a snap package. That would allow you to not have to worry about the environment you're deploying to.
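As a rough illustration of the Docker approach, a .gitlab-ci.yml along these lines runs the same build script inside several distribution images (the image tags and the build script are placeholders; each job can install its distribution-specific packages such as HDF5 or MPI before building):

    build-ubuntu:
      image: ubuntu:22.04
      script:
        - ./ci/build.sh

    build-fedora:
      image: fedora:latest
      script:
        - ./ci/build.sh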
How do I deal with dependencies?
This is where it's really useful to consider an artifact repository. Both JFrog's Artifactory and Sonatype's Nexus work wonders here. This will allow you to hook up your build pipeline for any app or library and push an artifact that the others can consume. These can be locked down with a set of credentials that you supply to your build.
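As a sketch of what the publishing step can look like with Artifactory's plain HTTP deploy API (the URL, repository name and credential variables here are hypothetical), a job could push its output with a single PUT:

    curl -u "$ARTIFACTORY_USER:$ARTIFACTORY_PASS" \
         -T build/CodeB-1.2.3.tar.gz \
         "https://artifactory.example.com/artifactory/libs-release-local/CodeB/1.2.3/CodeB-1.2.3.tar.gz"

CodeA's pipeline can then download the exact versions of CodeB and CodeC it needs instead of rebuilding them every time.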
I hope this helped.

Build dependencies and local builds with continuous integration

Our company currently uses TFS for source control and build server. Most of our projects are written in C/C++, but we also have some .NET projects and wouldn't want to be limited if we need to use other languages in the future.
We'd like to use Git for our source control and we're trying to understand what would be the best choice for a build server. We have started looking into TeamCity, but there are some issues we're having trouble with which will probably be relevant regardless of our choice of build server:
Build dependencies - We'd like to be able to control the build dependencies for each <project, branch>. For example, have <MyProj, feature_branch> depend on <InfraProj1, feature_branch> and <InfraProj2, master>.
From what we’ve seen, to do that we might need to use Gradle or something similar to build our projects instead of plain MSBuild. Is this correct? Are there simpler ways of achieving this?
Local builds - Obviously we'd like to be able to build projects locally as well. This becomes somewhat of a problem when project dependencies are introduced, as we need a way to reference these resources or copy them locally for the build to succeed. How is this usually solved?
I'd appreciate any input, but a sample setup which covers these issues will also be a great help.
IMHO, both issues you mention really fall into the configuration management category and are thus, as you say, unrelated to the choice of build server.
A workspace for a project build (doesn't matter if centralized or local) should really contain all necessary resources for the build.
How can you achieve that? Have a project "metadata" git repo with a "content" file listing all your project components and their dependencies (each with its own git/other repo) and their exact versions, effectively tying them together coherently. (You may find it useful to store other metadata in this repo down the road as well, such as component-specific SCM info if you use a mix of SCMs across the workspace.)
A workspace pull wrapper script would first pull this metadata git repo, parse the content file, and then pull all the other project components and their dependencies according to the content file info. Any build in such a workspace would have all the parts it needs.
When the time comes to modify either the code in a project component or the version of one of the dependencies, you'll also need to update this content file in the metadata git repo to reflect the change and commit it; this is how your project makes progress coherently, as a whole.
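As a minimal sketch of that idea (component names, URLs and versions are hypothetical), the content file could list one component per line as <name> <git URL> <exact ref>:

    MyProj      git@git.example.com:group/MyProj.git      9f3c2ab
    InfraProj1  git@git.example.com:group/InfraProj1.git  4d81e0c
    InfraProj2  git@git.example.com:group/InfraProj2.git  v1.4.2

and the pull wrapper, run after cloning the metadata repo, would fetch exactly those versions:

    while read -r name url ref; do
        git clone "$url" "$name"
        git -C "$name" checkout "$ref"
    done < content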
Of course, actually managing dependencies is another matter. Tons of opinions out there, some even conflicting.

What is the most effective way to lock down external dependency "versions" in Golang?

By default, Go pulls imported dependencies by grabbing the latest version in master (GitHub) or default (Mercurial) if it cannot find the dependency on your GOPATH. And while this workflow is quite simple to grasp, it has become somewhat difficult to tightly control. Because any software change incurs some risk, I'd like to reduce the risk of this potential change in a manageable and repeatable way and avoid inadvertently picking up changes to a dependency, especially when running clean builds via a CI server or preparing to deploy.
What is the most effective way I can pin (i.e. lock down or capture) a package dependency so I don't find myself unable to reproduce an old package, or even worse, unexpectedly broken when I'm about to release?
---- Update ----
Additional info on the Current State of Go Packaging. While I ended up (as of 7.20.13) capturing dependencies in a 3rd party folder and managing updates (ala Camlistore), I'm still looking for a better way...
Here is a great list of options.
Also, be sure to see the Go 1.5 vendor/ experiment to learn how Go might deal with the problem in future versions.
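For reference, that experiment was opt-in via an environment variable in Go 1.5 (and on by default in 1.6); with it enabled, packages under ./vendor/ are preferred over GOPATH when resolving imports:

    export GO15VENDOREXPERIMENT=1
    go build ./...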
You might find the way Camlistore does it interesting.
See the third party directory and in particular the update.pl and rewrite-imports.sh scripts. These scripts update the external repositories, change imports if necessary, and make sure that a static version of the external repositories is checked in with the rest of the Camlistore code.
This means that Camlistore has a completely repeatable build, as it is self-contained, but the third-party components can still be updated under the control of the Camlistore developers.
There is a project to help you manage your dependencies. Check out gopack.
godep
I started using godep early last year (2014) and have been very happy with it (it addressed the concerns I mentioned in my original question). I am no longer using custom scripts to manage the vendoring of dependencies, as godep just takes care of it. It has been excellent for ensuring that no drift is introduced regardless of timing or a machine's package state. It works with the existing go get mechanism and introduces the ability to pin (godep save) and restore (godep restore) dependencies based on Godeps/Godeps.json.
Check it out:
https://github.com/tools/godep
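For a rough idea of the workflow (as it worked at the time of writing):

    go get github.com/tools/godep    # install godep itself
    godep save ./...                 # record exact versions in Godeps/Godeps.json and copy the sources into the project
    godep restore                    # later, or on another machine: check out those exact versions into GOPATH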
There is no built-in tooling for this in Go. However, you can fork the dependencies yourself, either on local disk or in a cloud service, and only merge in upstream changes once you've vetted them.
The 3rd party repositories are completely under your control. 'go get' clones tip, you're right, but you're free to check out any revision of the cloned-by-go-get or cloned-by-you repository. As long as you don't do 'go get -u', nothing touches the 3rd party repositories already sitting on your hard disk.
Effectively, your external, locally cloned, dependencies are always locked down by default.
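So pinning a particular dependency can be as simple as checking out the revision you want in its local clone (the import path below is just an example):

    cd "$GOPATH/src/github.com/some/dependency"
    git checkout v1.2.3    # or an exact commit hash
    # builds keep using this revision as long as you avoid 'go get -u'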

Is it bad form to have your continuous integration system commit to a repository?

I have recently been charged with building out our "software infrastructure" and so I am putting together a continuous integration server.
After a build completes, would it be considered bad form for the CI system to check some of the artifacts it creates into a tag, so that they can be fetched easily later (or, if the build breaks, so you can more easily recreate the problem)?
For the record we use SVN and BuildMaster (free edition) here.
This is more of a best practices question rather than a how-to question. (It is pretty easy to do with BuildMaster)
Seth
If you believe this approach would be beneficial to you, go ahead and do it. As long as you maintain a clear trace of what source code was used to build each artifact, you'll be fine.
You should keep this artifact repository separated from the source code repository.
It is however a little odd to use a source code repository for this - these are typically used for things that will change, something your artifacts most definitely should not.
Source code repositories are also often used in a context where you want to check out "everything", for example the entire trunk. With artifacts you are typically looking for a specific version, and checking out all of them would only be done when exporting them to some other medium.
There are several artifact repositories specialized for this, for example Artifactory or Apache Archiva, but a properly backed-up file server with thought-through access settings might be a simple and good-enough solution.
I would say it's a smell to check in binaries as a tag. Your build artifacts should be associated with a particular build version in your build system, and that build should be associated with a particular checkin. You should be able to recreate the exact source code from that information. If what you're looking for is a one-stop function to open the precise source-code revision that generated the broken build, I'd suggest that you invest some time into building a PowerShell module that will do that for you.
Something with a signature like:
OpenBuild -projectName "some project name" -buildNumber "some build number"
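A minimal sketch of what such a function could look like, assuming the build server records the SVN revision of each build in a file the script can read (the share path and repository URL are hypothetical):

    function OpenBuild {
        param([string]$ProjectName, [string]$BuildNumber)
        # look up the revision the build server recorded for this build
        $revision = Get-Content "\\buildserver\builds\$ProjectName\$BuildNumber\revision.txt"
        # check out exactly the source that produced that build
        svn checkout -r $revision "https://svn.example.com/$ProjectName/trunk" "C:\src\$ProjectName-$BuildNumber"
    }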

What should the repository contain?

I am trying to set up a Continuous Integration process. For my various build tasks (compiling, testing, documentation, etc.) I need tools that perform these tasks (csc, NUnit, NDoc, etc.). My question is: should these tools also go into my source control repository?
Why I think they should is because I read in an online article that the developer environment should be as similar as possible to the build server environment. To fulfill this requirement, the article suggested that you put everything required for your build in the repository, so that when you check out the code (or the build server checks out the code) you are ready to build the project right away without first installing any other tools. But on the other hand, if I put these tools with my source code in the repository, then the build server will have to install them whenever a build is run.
Is it OK to install these tools? Won't it increase the time for each build unnecessarily?
It's often more trouble than it's worth to try to check tools into source control. Rather, write a list of software requirements that must be installed before the source can be checked out and built (one thing that would need to be on this list in any case is the source control system itself). Even if you rely on software being in source control, some tools might still need to be installed in certain paths or be otherwise configured (registry entries come to mind).
I would certainly not check in the compiler itself to source control, and I probably wouldn't check in NUnit or NDoc either. Just install these beforehand, as they are not likely to change too much over the lifetime of your project. Your build script might want to check that the expected version(s) of the required software packages are installed before the build may proceed.
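For instance, the build script could fail fast when a required tool is missing (a POSIX shell sketch; the tool names are just the ones from the question):

    for tool in csc nunit-console; do
        command -v "$tool" >/dev/null || { echo "missing required tool: $tool" >&2; exit 1; }
    done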
Unless you're customizing the tools there's probably no reason to put their source code in your repository. However there are excellent reasons for putting your config files in the repository.
Re-installing the tools for every single build is overkill and will slow you down.
However, it's far better to have a server dedicated to continuous integration so that you know its state; you can be sure nobody has installed anything that may have an impact on the outcome of the build.
If you want to be able to re-generate today's build next year, you need to be able to re-create your environment first. Make sure you'll be able to re-install your tools (exact same version), either by keeping them on your server (installing the newer versions in different directories), or storing the whole package in your configuration management tool.
Think about how you would create another continuous integration server, either to have two of them, or for a second site, or to recover after a disaster. Document how the continuous integration server was set up.
What really needs to be version-controlled is the build scripts, which access the right versions of the tools, especially if you opt to install several versions of them.
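For example (a shell sketch with hypothetical paths), a version-controlled build script can name the exact tool version it expects instead of relying on whatever happens to be first on the PATH:

    NUNIT=/opt/buildtools/nunit-2.6.4/nunit-console
    "$NUNIT" build/MyProject.Tests.dll    # always the pinned 2.6.4, not whichever NUnit is on the PATH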
