Since Go 1.13, Go modules use https://proxy.golang.org/ to cache repositories. Suppose I have a private repository as a Go module at github.com/Ihtkas/libraries, and I import that module in another local Go file, sort.go. When I build the local code with GIT_TERMINAL_PROMPT=1, Go builds sort.go using my login credentials for the repository. In this case, does Go cache the private repository in proxy.golang.org? When someone else imports the same private package and uses valid credentials to access it, is the package in the private repo served from proxy.golang.org with just the authentication forwarded to github.com?
My exact question is:
Does Go in any way hold the private repo code on the proxy server?
From https://index.golang.org:
If I don't set GOPRIVATE and request a private module from these services, what leaks?
The proxy and checksum database protocols only send module paths and versions to the remote server. If you request a private module, the mirror will try to download it just as any Go user would and fail in the same way. Information about failed requests isn't published anywhere. The only trace of the request will be in internal logs, which are governed by the privacy policy.
With GOPRIVATE working as described at https://golang.org/cmd/go/#hdr-Module_configuration_for_non_public_modules:
The GOPRIVATE environment variable controls which modules the go command considers to be private (not available publicly) and should therefore not use the proxy or checksum database. The variable is a comma-separated list of glob patterns (in the syntax of Go's path.Match) of module path prefixes. For example,
GOPRIVATE=*.corp.example.com,rsc.io/private
causes the go command to treat as private any module with a path prefix matching either pattern, including git.corp.example.com/xyzzy, rsc.io/private, and rsc.io/private/quux.
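As a rough illustration of the prefix matching described above, here is a Go sketch that checks module paths against a GOPRIVATE-style pattern list. The helper name matchPrefixPatterns is my own and the code only mirrors the documented behaviour using path.Match; it is not the go command's actual implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether some path prefix of target matches one
// of the comma-separated glob patterns in globs (hypothetical helper).
func matchPrefixPatterns(globs, target string) bool {
	elems := strings.Split(target, "/")
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A pattern containing N slashes is compared against the first
		// N+1 path elements of the module path.
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, err := path.Match(glob, prefix); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	const goprivate = "*.corp.example.com,rsc.io/private"
	modules := []string{
		"git.corp.example.com/xyzzy",
		"rsc.io/private",
		"rsc.io/private/quux",
		"rsc.io/quote",
		"github.com/spf13/viper",
	}
	for _, m := range modules {
		fmt.Printf("%-30s private=%v\n", m, matchPrefixPatterns(goprivate, m))
	}
}
```

Modules reported as private are the ones the go command would fetch directly, bypassing the proxy and checksum database.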
To sum it up: if it is a private module, the proxy service tries to access it and fails. I assume Go will then fall back to accessing it directly, circumventing the proxy altogether. To avoid this round trip, add your private repositories to GOPRIVATE, and if you are still concerned about it, use something like Wireshark to make doubly sure that your private modules are accessed directly.
Background
At my company, we use Bitbucket to host our git repos. All traffic to the server flows through a custom, non-standard port. Cloning from our repos looks something like git clone ssh://git@stash.company.com:9999/repo/path/name.git.
The problem
I would like to create Go modules hosted on this server and managed by go mod. However, the fact that traffic has to flow through port 9999 makes this very difficult, because go mod operates on the standard ports and doesn't seem to provide a way to customise ports for different modules.
My question
Is it possible to use go mod to manage Go modules hosted on a private git server with a non-standard port?
Attempted solutions
Vendoring
This seems to be the closest to offering a solution. First I go mod vendor the Go application that wants to use these Go modules, then I git submodule the Go module in the vendor/ directory. This works perfectly up to the point that a module needs to be updated or added. go mod tidy will keep failing to download or update the other Go modules because it cannot access the "git URL" of the custom Go module, even when the -e flag is set.
Editing .gitconfig
Editing the .gitconfig to replace the URL without the port with the URL with the port is a solution that will work but is a very dirty hack. Firstly, these edits will have to be done for any new modules, and for every individual developer. Secondly, this might break other git processes when working on these repositories.
The go tool uses git under the hood, so you'd want to configure git in your environment to use an alternate url. Something like
git config --global url."ssh://git@stash.company.com:9999/".insteadOf "https://stash.company.com/"
Though I recall that bitbucket/stash sometimes provides an extra suffix for reasons I don't recall, so you might need to do something like this:
git config --global url."ssh://git@stash.company.com:9999/".insteadOf "https://stash.company.com/scm/"
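To make it clearer what such a rule does, here is a small Go sketch of the rewriting git applies for insteadOf: a URL that starts with the configured value has that prefix replaced by the base, and the longest matching prefix wins. The function rewrite and the rules map are made up for illustration; git does this internally, and nothing here calls git.

```go
package main

import (
	"fmt"
	"strings"
)

// rewrite mimics url.<base>.insteadOf: rules maps each insteadOf prefix to
// its replacement base. The longest matching prefix is used; a URL that
// matches no rule is returned unchanged. (Hypothetical helper.)
func rewrite(rules map[string]string, url string) string {
	bestPrefix, bestBase := "", ""
	for prefix, base := range rules {
		if strings.HasPrefix(url, prefix) && len(prefix) > len(bestPrefix) {
			bestPrefix, bestBase = prefix, base
		}
	}
	return bestBase + strings.TrimPrefix(url, bestPrefix)
}

func main() {
	rules := map[string]string{
		"https://stash.company.com/":     "ssh://git@stash.company.com:9999/",
		"https://stash.company.com/scm/": "ssh://git@stash.company.com:9999/",
	}
	for _, u := range []string{
		"https://stash.company.com/repo/path/name.git",
		"https://stash.company.com/scm/repo/path/name.git",
		"https://github.com/spf13/viper.git",
	} {
		fmt.Println(u, "=>", rewrite(rules, u))
	}
}
```

Note that the replacement is purely textual, so trailing slashes on both sides of the rule need to line up, otherwise the rewritten URL ends up with a doubled or missing slash.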
ADDITIONAL EDIT
User bcmills mentioned below that you can also serve the go-import metadata over HTTPS and use whatever vanity URL you like, provided you control the domain resolution. This can be done with varying degrees of sophistication, from a simple nginx rule to static content generators, dedicated vanity services, or even running your own module proxy with Athens.
This still doesn't completely solve the problem of build environment configuration, however, since you'll want the user to set GOPRIVATE or GOPROXY or both, depending on your configuration.
Also, if your chosen domain is potentially globally resolvable, you might want to consider registering it anyway to keep it from being registered by a potentially-malicious third party.
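For the go-import route, a minimal sketch of what the vanity host has to return might look like the following. The import prefix go.company.example/name is invented for the example, the repo root reuses the clone URL from the question, and I am assuming the go command will accept an ssh:// repo root for git. The meta tag format itself (import prefix, VCS, repo root) is the documented one.

```go
package main

import (
	"fmt"
	"html"
	"net/http"
	"net/http/httptest"
)

const (
	importPrefix = "go.company.example/name" // hypothetical vanity import path
	repoRoot     = "ssh://git@stash.company.com:9999/repo/path/name.git"
)

// goImportMeta builds the <meta> tag the go command looks for when it
// requests https://<import path>?go-get=1.
func goImportMeta(prefix, vcs, root string) string {
	return fmt.Sprintf(`<meta name="go-import" content="%s %s %s">`,
		html.EscapeString(prefix), vcs, html.EscapeString(root))
}

// handler answers go-get requests with a tiny HTML page carrying the tag.
func handler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprintf(w, "<!DOCTYPE html><html><head>%s</head><body></body></html>",
		goImportMeta(importPrefix, "git", repoRoot))
}

func main() {
	// Exercise the handler in-process; a real deployment would register it
	// with http.HandleFunc and serve it over HTTPS on the vanity domain.
	req := httptest.NewRequest("GET", "https://go.company.example/name?go-get=1", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Println(rec.Body.String())
}
```

Developers would still need GOPRIVATE (or an internal GOPROXY) set so that the public proxy and checksum database are not consulted for that prefix.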
I have Artifactory set up and working, serving other artifacts (RPM, etc)
I would like to have local copies of public and private Go programs and libraries
to ensure version consistency
to let public repositories get bugs out
to keep public repositories secure from unauthorized alterations
I've created a Go repository in Artifactory, and populated it with, as an example, spf13/viper using jfrog-cli (which created a zip file and a mod file).
Questions:
Is the zip file the proper way to store Go modules in Artifactory?
How does one use the zip file in a Go program? E.g. the URL to get the zip file is http://hostname/artifactory/reponame/github.com/spf13/viper/@v/v1.6.1.zip (and .mod for the mod file). Do I, for example, set GOPATH to some value?
Is there a way to ensure all requirements are automatically included in the local Artifactory repository at the time the primary package (e.g. viper) is added to it?
Answering 3rd question first -
Here's another article that will help - https://jfrog.com/blog/why-goproxy-matters-and-which-to-pick/. There are two ways to publish private go modules to Artifactory. The first is a traditional way i.e. via JFrog CLI that's highlighted in another article.
Another way is to point a remote repository to a private GitHub repository. This capability was added recently. In this case, a virtual repository will have two remotes. The first remote repository defaults to GoCenter via which public go modules are fetched. The second remote repository points to private VCS systems.
Setting GOPROXY to ONLY the virtual Go modules repository will ensure that Artifactory continues to be a source of truth for both public and private Go modules. If you want to store compiled Go binaries, you can use a local generic repository, but I would advise using a custom layout to structure the contents of a generic repository.
Answering the first 2 questions -
Go modules are Go's dependency management system, similar to what Maven is for Java. In Artifactory, there are 3 files for every Go module version: go.mod, .info, and the archive file.
Artifactory follows the GOPROXY protocol, hence the dependencies mentioned in the go.mod will be automatically fetched from the virtual repository. This will include the archive file too which is a collection of packages (source files).
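To show how those three files map onto URLs, here is a Go sketch that builds the GOPROXY-protocol paths for one module version. The base URL is just the placeholder from the question, and proxyURLs and escapePath are names I made up; the real Artifactory endpoint for your repository may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the proxy protocol's case encoding: every uppercase
// ASCII letter becomes '!' followed by its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyURLs returns the .info, .mod and .zip URLs a GOPROXY server exposes
// for the given module version (hypothetical helper).
func proxyURLs(base, module, version string) []string {
	prefix := strings.TrimSuffix(base, "/") + "/" + escapePath(module) + "/@v/" + escapePath(version)
	return []string{prefix + ".info", prefix + ".mod", prefix + ".zip"}
}

func main() {
	base := "http://hostname/artifactory/reponame" // placeholder from the question
	for _, u := range proxyURLs(base, "github.com/spf13/viper", "v1.6.1") {
		fmt.Println(u)
	}
}
```

You normally never fetch these by hand: with GOPROXY pointing at the repository, the go command requests them itself, so there is no need to touch GOPATH.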
There's additional metadata that's stored for public go modules such as tile and lookup requests since GoSumDB requests are cached to ensure that Artifactory remains the source of truth for modules and metadata even in an air-gapped environment.
I want to set up a development environment that allows reusing some artifacts from public Maven repositories like Maven Central and Codehaus. Specifically, I like the concept of transitive dependencies.
In our company, our production network cannot export any data outside, but we can push data inside. We already have some gateways to copy files from the outside into our network. Therefore, I could use this to copy the required packages manually, but we would miss the power of Maven. In our case, the perfect solution would be to be able to get data from public repositories but be forbidden to deploy to the external repo.
So I would like to have your expert view on this problem.
We can use various means, as long as the inability to export data outside our network is guaranteed:
External packages are created on a disk area that is read-only from production servers.
Some HTTP requests are filtered.
Using a repository manager, such as Nexus.
In the repository management guide, Nexus talks about this possibility (http://books.sonatype.com/nexus-book/reference/confignx-sect-manage-repo.html). I would like a confirmation from you guys about how secure it is. Specifically, this has to be updated only by the IT manager.
Regards,
Loïc.
This is completely feasible and a common setup with Nexus. Here are the steps roughly.
Lock all developers and CI server inside the network disallowing direct access to outside servers
Set up Nexus to proxy external repositories like Central as desired
Allow Nexus to reach those external repositories via the proxy
Configure developers and CI server machines to access Nexus to get the dependencies (and transitive dependencies) as desired
Optionally you can also
Configure CI servers to deploy any internal packages to Nexus
Configure deployment tools to get components for deployment from Nexus
Also note this can be done via different repository formats and toolchains. The common one is Maven, but Nexus also supports npm, NuGet, RubyGems, sites, YUM and others.
And if you want to make some of your packages in Nexus available to the outside you can configure this as well following multiple options.
Also note that a proxy repository is by definition read-only in terms of deployments to it directly. That's what a hosted repository is for...
I have an internal maven repo that I want all traffic to go through while I am on the corp network, but when I am not on the network, I would like to use public repos whenever possible. Is there any way for me to do this in settings.xml with mirrors? I would like to have the internal repo "mirror" external ones, but when it is not reachable to fall back to the external ones.
I would like to avoid using profiles unless they can auto-detect and fall back. I would rather not use special flags to enable/disable internal use.
If you have "some" control over the repo, you could configure a "repository" of your corporate repo to be a proxy of the public repos. (I use Nexus, but it should not be very different in Artifactory.)
Then you put the corporate repo and the public repos in your repository list, in the right order (corporate first, public next).
Do you know a way to configure Nexus OSS so that it publishes the artifact repository to a remote server in a form that can be statically served, e.g. by Apache Httpd? I'd like to use this static copy to serve only my own artifacts, so the nexus server could actively trigger an update in case there is something new published.
Technically, I think it should be possible to create the metadata for the repo and store it in a static file, but I'm not sure about that. Any hints appreciated.
If there is another repo manager to achieve that, it would be fine for me as well.
I clearly understand the advantages to use the repo manager directly, but due to IT rules I can run Nexus only internally and it would be necessary to have these artifacts available in a (private) repo copy on the Internet as well.
A typical way to solve this IT requirement of only exposing known servers like Apache httpd is to set up Apache httpd as a reverse proxy as documented here.
You can use that approach in a more restrictive way by only exposing a specific repository or better repository group (so you can combine snapshots and releases) and tying that together with a specific user or a specifically restricted setup of the anonymous user that is used by default when no credentials are passed through.
Also, if you need more help, feel free to contact us on the user mailing list or on HipChat.