how to handle go import absolute paths and github forks? - go

There are plenty of questions around this, including why you shouldn't use import "./my/path" and why it only works because some legacy go code requires it.
If this is correct, how do you handle encapsulation of a project and by extension github forks? In every other lang, I can do a github fork of a project, or git clone, and everything is encapsulated there. How do I get the same behaviour out of a go project?
Simple example using the go "hello world" example.
hello.go
package main
import ("fmt"
"github.com/golang/examples/stringutil")
func main() {
fmt.Printf(stringutil.Reverse("hello, world")+"\n")
}
The above works great. But if I want to use my own stringutil which is in a subdirectory and will compile to a single binary, I still need the complete path:
package main
import ("fmt"
"github.com/myrepo/examples/util/stringutil")
func main() {
fmt.Printf(stringutil.Reverse("hello, world")+"\n")
}
Now, if someone copies or forks my repo, it has a direct dependency on "github.com/myrepo/", even though this is used entirely internally!
What if there are 20 different files that import utils/? I need to change each one each time someone forks? That is a lot of extraneous changes and a nonsensical git commit.
What am I missing here? Why are relative paths such a bad thing? How do I fork a project that refers to its own subsidiary directories (and their packages) without changing dozens of files?

As for the reasoning behind not allowing relative imports, you can read this discussion for some perspective: https://groups.google.com/forum/#!msg/golang-nuts/n9d8RzVnadk/07f9RDlwLsYJ
Personally I'd rather have them enabled, at least for internal imports, exactly for the reason you are describing.
Now, how to deal with the situation?
If your fork is just a small fix from another project that will probably be accepted as a PR soon - just manually edit the git remotes for it to refer to your own git repo and not the original one. If you're using a vendoring solution like godep, it will work smoothly since saving it will just vendor your forked code, and go get is never used directly.
If your fork is a big change and you intend to remain forked, rewrite all the import paths. You can automate it with sed or you can use gofmt -r that supports rewriting of the code being formatted.
[EDIT] I also found this tool which is designed to help with that situation: https://github.com/rogpeppe/govers
I've done both 1 and 2 - when I just had a small bugfix to some library I just changed the remote and verndored it. When I actually forked a library without intent of merging my changes back, I changed all the import paths and continued to use my repo only.
I can also think of an addition to vendoring tools allowing automation of this stuff, but I don't think any of them support it currently.

Related

What is the correct setting for GOPATH?

I was trying to follow a tutorial on using aws with go. When I gave the command "go get github.com/aws/aws-sdk-go/aws", I still got failure to import. I was left wondering if the "go get" succeeded or not.
Following the guidance provided in the answers here, I updated my GOPATH variable and now the import succeeds.
Your hello.go should start something like this:
package main
import (
"github.com/aws/aws-sdk-go/aws"
)
See this example in the playground. So when importing a remote module use the full path in the import statement.
Re another question you asked in the comments (and I mentioned in comments to another of your questions) - go mod init initialises a module. See this article for information. When using modules, GOPATH is no longer used for resolving imports (see this article). So basically GOPATHis the old way of doing things; go modules is the new way (it solves a lot of issues but having the two approaches can be confusing to someone new to the language because some tutorials assume GOPATH and others use modules).
For completeness you have other questions about environmental variables. I think this may be due to a misunderstanding about how these work. When you enter export GOPATH=XXX in a terminal session that will update the environment within that session (that is that terminal window only; it will not have any impact on any other sessions you have open, including vscode). If you want to set a system wide environmental variable then you need to update a configuration file (often ~/.bashrc but this depends upon your OS see this article for info). After doing that restart the applications (or ideally youre PC) to pick up the new setting.
When you run go get github.com/aws/aws-sdk-go/aws; go tool will download the package into $GOPATH/src/github.com/aws/aws-sdk-go/aws directory.
That is why you need to import package by its full name (which is same as path to files under $GOPATH/src directory:
import "github.com/aws/aws-sdk-go/aws"
You can find more about how source code is organized under GOPATH and how imports work here and specifically about remote (github) imports here.
NOTE: GOPATH resolution works only when your project doesn't use modules (you don't have go.mod file). If you're following a tutorial and it doesn't mention use of modules above should work fine.

managing hardcoded import paths

In Go, it is common that some packages are versioned. So a program might look like this:
package main
import (
"github.com/go-gl/gl/v3.3-core/gl"
"github.com/go-gl/glfw/v3.2/glfw"
)
// ... do stuff
Sometimes, I might want to update the version of glfw. Lets imagine GLFW 3.3 bindings come to Go and I want to update from 3.2.
I might have multiple Go files in a project all using glfw. I don't want to go into each of them and update the version of the import by hand. Ideally I wouldn't be copying that long path around, either, and I could define it in one place per project.
Maybe I could write a script to find+replace "github.com/go-gl/glfw/v3.2/glfw"
Maybe I could template the file with Genny
Maybe I could create a symlink inside the root Go path "glfw" -> "github.com/go-gl/glfw/v3.2/glfw", update it when changing version, and just use import "glfw"
but this information then lives "outside" the project, so no-one cloning my project knows what version to use
but this is a global change and I might have multiple projects which want to depend on different versions
Ideally I would be able to do something like this in each source file:
package main
import (
$gl
$glfw
)
And in some project-level dot file, something like:
gl=github.com/go-gl/gl/v3.3-core/gl
glfw=github.com/go-gl/glfw/v3.2/glfw
Or, a command-line argument attached to go build defining constants that could look something like:
go build -Dgl=github.com/go-gl/gl/v3.3-core/gl -Dglfw=github.com/go-gl/glfw/v3.2/glfw
How is everyone else handling this currently?
See github.com/golang/go/wiki/Modules for the recommended way of managing package versions.

Facilitating git merges with multi-package go modules

We have two versions of a go module hosted by different people. Let's call them example.org/devproject and company.com/project. Different programs obviously import the two projects by different import paths, as appropriate, depending on whether they want the version maintained by example.org or company.com. Because the modules have numerous internal packages, the source files themselves contain references to their own module's name in many import statements.
My question is how example.org can make it as easy as possible for company.com to pull from our git repository, because right now a straight git pull messes up the import statements. Note that stack overflow has similar questions on this prior to go modules. However, this question is specifically about projects with go.mod files using go 1.11 or later, so please do not refer me to non-module answers or suggest not using go modules yet, because that is a different question.
I've tried the following that unfortunately do not work super well:
Use some abstract module name like module self in the go.mod file and add replace self => example.org/devproject. Then internal packages can just import "self/subpackage", and only the one replace line needs to change between repositories. However, that means all including projects need to use a replace directive, and we'd need to convince company.com to change all of their imports relative to "self", which isn't super likely. More importantly, this totally breaks go get and is probably a bad idea.
Use semi-automated scripts to maintain a git branch pull-me that is always exactly one commit ahead of master but has all the import paths and go.mod file fixed up for company.com. This is what we currently do, but it's annoying because the scripts don't know how to parse go source code, so just blindly replace example.org/devproject with company.com/project everywhere, meaning there is no guarantee that just because master works pull-me will, too. We somehow can't get gomvpkg working in a go module outside of a gopath (it will change package statements, but not the imports, which are the tricky part, and won't recurse into all the subpackages).
Create a fake gopath, copy the master branch into it, use gomvpkg to move stuff, then copy the files back with a revised import path, then delete the fake gopath. I can't 100% rule this out, but I gave up after gomvpkg just kept refusing to do the right thing and I got frustrated that something so conceptually simple was so difficult to make work in practice.
Are there any other solutions we should be considering? Are there any tools like govmpkg that understand go modules and will rewrite import paths without also trying to move stuff around in a gopath?
As is probably clear, the dynamics here are that example.org is much more motivated to get stuff upstreamed than company.com is to pull from us, so ideally most of the hoop jumping would happen on the example.org side, and there would maybe be a one-time simple non-intrusive request to company.com to do something to make this work.

Implications of not using repo path for my own packages

Suppose I decide to keep all personally developed packages organized
as follows:
$GOPATH/
bin/
pkg/
src/
somepkg1
somepkg2
...
somepkgN
Further, suppose there is a great deal of code reuse among them, so I
decide to keep the whole $GOPATH workspace under the same Git
repository (each package could be a submodule), as opposed to more
traditional scenario where subpackages are less coherent (co-existing
solely because of using go get from the same workspace):
$GOPATH/
bin/
pkg/
src/github.com/<me>/
somepkg1
somepkg2
...
somepkgN
I can see that with the former approach (not using github.com/<me>/
in the package paths), go get would not be able to fetch packages as
they are not "declaring" themselves to be available online. However,
one can easily work around that by using git submodules, so all
packages would be fetched in the first place (note it's a tightly
controlled ecosystem so there will be no name clashes).
Is there any other limitation (besides go get) of not using full
paths for packages?
(I am mostly concerned about limitations arising from certain code
refactoring/analysis tools that exploit the repository path as base path convention that
allows go get to look for the package online.)
For the Go compiler and all elements of the go tool except go get the package import path is an almost opaque string containing the import path. You can lay out your code like you want (the compiler itself happy compiles files from different folders into one package). If you don't need or want your code to be go getable there is no need to use a repo path. The analysis and refactoring tools in golang.org/x/tools work on the opaque import paths (as far as I know) and do not access the network.

Is it better to have golang statements go out to github or have relative path and why?

In golang import statements, is it better to have something like:
import(
'project/package'
)
or
import(
'github.com/owner/project/package'
)
for files local to your project?
For either option, why would you want to do one over the other?
I ask this because while it's easy and more intuitive to do the first one, I've also seen a lot of big projects like Kubernetes and Etcd do it the second way.
These are exactly the same as far as the go build tools are concerned. The fact that package is in the directory
$GOPATH/src/github.com/owner/project/package
or in the directory
$GOPATH/src/project/package
makes no difference.
The only difference is that with the former you can use go get to fetch the source automatically, and the latter you have to clone the code yourself.
What you don't want to use are paths relative to the project itself, like
import "./project/package"
That is not compatible with all the go tools, and is highly discouraged.

Resources