Organising Go programs - packages or something else? - go

I have read the Go Tour and Googled "golang packages" but I have not yet found any advice about best practice in Go for organising moderately sized applications.
If I have an application that conceptually has several distinct parts, perhaps 10^3-10^4 LOC, and I don't intend to create reusable libraries for use in other applications, should all the source code files be package main?
To clarify ...
For example, let's say my program will have the following major chunks:
Something that manages a bunch of persistently stored data
allowing usual create, read, update, delete operations
Something that allows a human to view the stored data
Something that coordinates / mediates between these
Something that periodically fetches data updates from a web-service using SOAP.
So that would be MVC plus a fetcher of data.
From looking around at what people do, I now suspect I should
create $GOPATH/src/myprogramname
in there put some main.go with package main and func main() { ... } in it.
create some subdirectories like
$GOPATH/src/myprogramname/model
$GOPATH/src/myprogramname/view
$GOPATH/src/myprogramname/control
$GOPATH/src/myprogramname/fetch
have the .go files in those subdirectories begin with package fetch, etc., where the package name always matches the subdirectory name.
my main.go will probably import ( ... "fetch"; "model"; "view"; "control" )
as main.go grows, split it into other reasonably sized .go files named according to purpose.
build the program, including *.go in the above package subdirectories by
cd $GOPATH/src/myprogramname
go build
Is that all I need to do? Is that the properly idiomatic Go way of organising things? Is there more I should know or be thinking of? Is there some canonical webpage or PDF I overlooked and should read to find out this stuff?
In short, I don't want a 10,000 line main.go with everything in it. What are the idiomatic Go principles for organising code into files, subdirectories, packages and any other organisational units corresponding to normal conceptual divisions according to well-known structured-programming and/or OO principles?

You could break down your project into several layers based on the encapsulation level of your functions, i.e. having low-level functions in separate packages and logic functions in your main package. (You could take inspiration from MVC-like architectures.)
Since we don't have any details about your code, it is hard to see what kind of architecture would be best suited.
But in the end your choice will come down to the balance between code simplicity and reusability.

The general "best practice" in Go seems to be having each package provide a type or a service. Most of the packages in the standard library expose one or two types, and functions for working with those types. Some, like net/http and testing, provide a service - not in the "microservices" sense of something executable in itself, but rather a set of functionality related to a specific activity.
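As a rough illustration of "one package, one type plus the functions that work on it", here is a minimal sketch of what the data-management part from the question might expose. The names Record, Store and ErrNotFound are hypothetical, the storage is an in-memory map rather than anything persistent, and the code is written as package main only so that it runs on its own; in a real project it would sit in its own directory as package model.

```go
package main

import (
	"errors"
	"fmt"
)

// Record is the one type this hypothetical package is built around.
type Record struct {
	ID   int
	Name string
}

// ErrNotFound is returned when no record has the requested ID.
var ErrNotFound = errors.New("record not found")

// Store keeps records in memory; a real implementation would persist them.
type Store struct {
	next    int
	records map[int]Record
}

// NewStore returns an empty store whose first assigned ID is 1.
func NewStore() *Store {
	return &Store{next: 1, records: make(map[int]Record)}
}

// Create stores a new record and returns it with its assigned ID.
func (s *Store) Create(name string) Record {
	r := Record{ID: s.next, Name: name}
	s.records[r.ID] = r
	s.next++
	return r
}

// Read returns the record with the given ID, or ErrNotFound.
func (s *Store) Read(id int) (Record, error) {
	r, ok := s.records[id]
	if !ok {
		return Record{}, ErrNotFound
	}
	return r, nil
}

// Update replaces an existing record, or returns ErrNotFound.
func (s *Store) Update(r Record) error {
	if _, ok := s.records[r.ID]; !ok {
		return ErrNotFound
	}
	s.records[r.ID] = r
	return nil
}

// Delete removes the record with the given ID if it exists.
func (s *Store) Delete(id int) {
	delete(s.records, id)
}

func main() {
	s := NewStore()
	r := s.Create("first")
	r.Name = "renamed"
	if err := s.Update(r); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	got, err := s.Read(r.ID)
	fmt.Println(got.Name, err)
}
```

The caller only ever touches Store and Record, which is roughly the shape many standard library packages have: a small number of exported types and the operations on them.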

Related

How to merge similar multi-package projects in go

Several organizations distribute variants of the same project, and we regularly pull changes from one another. It would be great if we could eventually merge code repositories and maybe, maybe have a common source tree managed by a consortium. However, each member would probably want the option of distributing their own variant without too much pain for customers in case there is trouble upstreaming changes required to work with newer products.
The project consists of three packages:
1. A library
2. A compiler executable that outputs go code that needs to import the library
3. A utility executable that uses code generated by #2 and links with #1
A big annoyance, when pulling changes back and forth, is gratuitous differences in import paths. We basically have to edit every version of import "github.com/companyA/whatever" to import "companyB.com/whatever". Of course these problems would go away with (gasp) relative import paths. If we resorted to such heresy, our compiler can just hard-code the absolute import path in generated code to isolate end users from the library's import path. It would also require only one gratuitous difference in the source trees (the line in the compiler that outputs import statements) rather than a bunch.
But anyway, I know relative import paths are bad - this is a tricky situation. I know this is similar to questions such as this or this, because the answer of just asking end users to create a directory called companyB.com and cloning something from companyA in there is just not going to fly for practical and political reasons.
I already know that go is not really good at accommodating this scenario, so I'm also not asking for a magic bullet to make go handle something it can't. Another thing that unfortunately won't fly is asking customers to curl whatever | sh, as this is viewed as too much of a liability (deemed "training customers to do dangerous things"). Maybe we could forego go get and have everyone clone to some neutral non-DNS-name under $GOPATH/src, but we would need to do this without a "flag day" in which code suddenly breaks if it's in the wrong place.
My question is whether anyone has successfully merged SDK-type projects with existing end users, and if so, how did you do it, what worked, and what didn't? Did you in fact avoid relative import paths or gnarly GOPATH hacking, and if so was it worth it? What mechanisms did you employ (environment variables, configuration files, .project-config files in the current working directory, abusing the vendor directory, code-generation packages that figure out their absolute import path at compilation time) to make this work smoothly? Did you just muddle through with a huge amount of sed or maybe gofmt -r? Are there tricks involving clever use of .gitattributes or go generate to rewrite import paths on checkout/checkin?
Merging them is pretty easy - cross-merge so that they all match, pick one (or create a new one) as the canonical source of truth, then migrate all references to the canonical import and make all future updates there.
The problem arises here:
each member would probably want the option of distributing their own variant without too much pain for customers in case there is trouble upstreaming changes required to work with newer products
There's no particularly good way to do that without using any of the known solutions you've already ruled out, and possibly no way at all, depending on your threshold for "too much pain".
Without knowing more about the situation it's hard to suggest options, but if there's any way that each company can abstract their portion out into a separate library they could maintain and update at their pace while using a smaller shared library with shared responsibility, that would likely be the best option - something like the model used by Terraform for its providers? Basically you'd have shared maintenance of a shared "core" and then independent maintenance of "vendor-specific" packages.
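To make the "shared core plus vendor-specific packages" idea a little more concrete, here is a minimal single-file sketch. It is not how Terraform itself is implemented; Provider, Registry, companyA and companyB are all hypothetical names, and in practice the interface and registry would live in the shared core module while each company's implementation would live in its own separately maintained package.

```go
package main

import (
	"fmt"
	"sort"
)

// Provider is the contract the shared core would own (hypothetical).
type Provider interface {
	Name() string
	Generate(input string) string
}

// Registry maps provider names to implementations.
type Registry struct {
	providers map[string]Provider
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{providers: make(map[string]Provider)}
}

// Register adds a provider and rejects duplicate names.
func (r *Registry) Register(p Provider) error {
	if _, dup := r.providers[p.Name()]; dup {
		return fmt.Errorf("provider %q already registered", p.Name())
	}
	r.providers[p.Name()] = p
	return nil
}

// Lookup returns the provider registered under name, if any.
func (r *Registry) Lookup(name string) (Provider, bool) {
	p, ok := r.providers[name]
	return p, ok
}

// Names returns the registered provider names in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.providers))
	for n := range r.providers {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// companyA and companyB stand in for vendor-specific packages.
type companyA struct{}

func (companyA) Name() string              { return "companyA" }
func (companyA) Generate(in string) string { return "A:" + in }

type companyB struct{}

func (companyB) Name() string              { return "companyB" }
func (companyB) Generate(in string) string { return "B:" + in }

func main() {
	reg := NewRegistry()
	for _, p := range []Provider{companyA{}, companyB{}} {
		if err := reg.Register(p); err != nil {
			fmt.Println(err)
			return
		}
	}
	for _, name := range reg.Names() {
		p, _ := reg.Lookup(name)
		fmt.Println(p.Generate("input"))
	}
}
```

With this split only the core's import path has to be agreed on; each vendor package can change at its own pace as long as it keeps satisfying the interface.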

Where to put main.go for a Go binary?

Let's say I have an awesome Go binary that denormalizes some data, and I'd like to place it under:
//some/path/denormalize/
Ideally my package name would be denormalize which would allow me to follow the guidelines for "[when] designing a package, consider how the two parts of a qualified identifier work together, not the member name alone". Thus a potential member function might be denormalize.FromX(...). However, if I place the main.go for the denormalize binary under denormalize/ package I can't use the intended package name as now the package would be named main.
Some options I've considered:
Place main.go under denormalize/ and then place the rest of the code under internal/ or api/ or pkg/. Pros: This seems to be what most folks do. Cons: The package name is not meaningful, and it violates the principle of ensuring that the two parts of the qualifier work together since api.FromX(...) no longer makes sense. This leads to more verbose names such as api.DenormalizeFromX(...).
Place main.go under denormalize/main/ and the rest of the code under denormalize/. Pros: Package name stays meaningful. Cons: a subdirectory then has a dependency on a parent directory which has some code smell.
Are there other options that I have not considered for where to place the main.go such that it won't force me to use a different package name?

What is the proper way to organize / structure Go package folders and files?

I know this might be controversial or not very broad but I'm going to try to be very specific and relate to other questions.
Ok so when I make a Go program what things should I be thinking in terms of how I should organize my project? (E.g. should I think ok I'm going to have controllers of some sort so I should have a controller subdirectory that's going to do this so I should have that)
How should I structure a package?
For example, in the program I'm currently working on, I'm trying to make a SteamBot using this package
But while I'm writing it I don't know if I should move certain methods into their own files, e.g. I'd have something like
func (t *tradeBot) acceptTrade() {}
func (t *tradeBot) declineTrade() {}
func (t *tradeBot) monitorTrade() {}
func (t *tradeBot) sendTrade() {}
Each method is going to have quite a bit of code, so should I move each method into its own file or just have one file with 3000 lines of code in it?
Also, is it OK to use global variables so that I can set a variable once and then use it in multiple functions, or is this bad and should I pass the variables as arguments?
Also should I order my files like:
package
imports
constants
variables
functions
methods
Or do I just put things where I need them?
The first place to look for inspiration is the Go standard library. The Go standard library is a pretty good guide of what "good" Go code should look like. A library isn't quite the same as an application, but it's definitely a good introduction.
In general, you would not break up every method into its own file. Go tends to prefer larger files that cover a topic. 3000 lines seems quite large, though. Even the net/http server.go is only 2200 lines, and it's pretty big.
Global mutable variables are just as bad in Go as in any language. Since Go relies so heavily on concurrent programming, global mutable variables are quite horrible. Don't do that. One common exception is sync structures like sync.Pool or sync.Once, which sometimes are package global, but are also designed to be accessed concurrently. There are also sometimes "Default" versions of structures, like http.DefaultClient, but you can still pass explicit ones to functions. Again, look through the Go standard library and see what is common and what is rare.
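A minimal sketch of the alternative to a mutable global: the dependency is passed in explicitly, with a fallback to a package-level default in the same spirit as http.DefaultClient. Fetcher, NewFetcher and the example URL are hypothetical names chosen for illustration, and nothing here touches the network.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Fetcher carries its dependencies explicitly instead of reading globals.
type Fetcher struct {
	client  *http.Client
	baseURL string
}

// NewFetcher uses the given client, falling back to http.DefaultClient
// when the caller passes nil.
func NewFetcher(client *http.Client, baseURL string) *Fetcher {
	if client == nil {
		client = http.DefaultClient
	}
	return &Fetcher{client: client, baseURL: baseURL}
}

// URL builds the request URL for a path; it performs no network access.
func (f *Fetcher) URL(path string) string {
	return f.baseURL + path
}

func main() {
	// Each caller decides which client to use; no shared state is mutated.
	custom := &http.Client{Timeout: 5 * time.Second}
	f := NewFetcher(custom, "https://example.invalid")
	fmt.Println(f.URL("/updates"), f.client.Timeout)
}
```

Because the client is a field rather than a global, two goroutines can use differently configured fetchers without coordinating, and a test can pass in its own client.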
Just a few tips that you hopefully find useful:
Organize code into multiple files and packages by features, not by layers. This becomes more important the bigger your application becomes. package controllers with one or two controllers is probably ok, but putting hundreds of unrelated controllers in the same package doesn't make any sense. The following article explains it very well: http://www.javapractices.com/topic/TopicAction.do?Id=205
Global variables sometimes make code easier to write; however, they should be used with care. I think unexported global variables for things like a logger, debug flags, etc. are OK.

How should I organise structs, variables and interfaces in Go?

I have a codebase where one file contains quite a lot of structs, interfaces and variables alongside functions, and I'm not sure whether I need to separate these into separate files with a suffix appended to the filename. So for example accounts.go would become accounts_struct.go and accounts_interface.go, holding the structs and interfaces respectively.
What would be a good approach for the file organisation when you have growing codebase for Structs, Variables and Interfaces?
A good model to check out is the source code of Go itself: http://golang.org/src
(in addition to the official "Effective Go")
You will see that this approach (separating based on language items like struct, interface, ...) is never used.
All the files are based on features, and it is best to use a proximity principle approach, where you can find in the same file the definition of what you are using.
Generally, those features are grouped in one file per package, except for large ones, where one package is composed of many files (net, net/http)
If you want to separate anything, separate the source (xxx.go) from the tests/benchmarks (xxx_test.go)
As Thomas Jay Rush adds in the comments
Sometimes source code is automatically generated -- especially data structure definitions.
If the data structures are in the same file as the hand-wrought code, one must build capacity to preserve the hand-wrought portion in the code-generation phase.
If the data structures are separated in a different file, then inclusion allows one to simply write the data structure file without worry.
Dave Cheney offers an interesting perspective in "Absolute Unit (Test) # LondonGophers" (March 2019)
You should take a broader view of the "unit" under test.
The units are not each internal function you write, but a whole package. Specifically the public API of a package.
Organizing your files to facilitate testing their Public API is a good idea.
accounts_struct_test.go would not, in that regard, make much sense.
See also "How I organize packages in Go" by Bartłomiej Klimczak
Sometimes, a few handlers or repositories are needed.
For example, some information can be stored in a database and then sent via an event to a different part of your platform. Keeping only one repository with a method like saveToDb() isn’t that handy at all.
All elements like that are split by functionality: repository_order.go or service_user.go.
If there are more than 3 types of the object, they are moved to a separate subfolder.
Here is my mental model for designing a package.
a. A package should encompass one idea or concept. http is a concept, http client or http message is not.
b. A file in a package should encompass a set of related types. A good rule of thumb is that if two files share the same set of imports, merge them. Using the previous example, http/client.go and http/server.go are a good level of granularity.
c. Don't do one file per type, that's not idiomatic Go.
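To illustrate the feature-based grouping the answers above recommend, here is a minimal sketch of what a single accounts.go might hold: the struct, the interface its functions depend on, and the functions themselves, all side by side. Account, Notifier, Withdraw and logNotifier are hypothetical names, and the code is written as package main only so it can run standalone.

```go
package main

import "fmt"

// Account is the data structure this feature revolves around.
type Account struct {
	ID      string
	Balance int // in minor units, e.g. cents
}

// Notifier is declared next to the code that consumes it, not in a
// separate accounts_interface.go.
type Notifier interface {
	Notify(accountID, msg string)
}

// Withdraw debits the account and reports the change through n.
func Withdraw(a *Account, amount int, n Notifier) error {
	if amount <= 0 {
		return fmt.Errorf("invalid amount %d", amount)
	}
	if amount > a.Balance {
		return fmt.Errorf("insufficient funds: have %d, want %d", a.Balance, amount)
	}
	a.Balance -= amount
	n.Notify(a.ID, fmt.Sprintf("withdrew %d", amount))
	return nil
}

// logNotifier is a trivial Notifier that records messages in memory.
type logNotifier struct {
	msgs []string
}

func (l *logNotifier) Notify(accountID, msg string) {
	l.msgs = append(l.msgs, accountID+": "+msg)
}

func main() {
	acct := &Account{ID: "acc-1", Balance: 100}
	n := &logNotifier{}
	if err := Withdraw(acct, 30, n); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(acct.Balance, n.msgs)
}
```

A reader who opens this one file sees everything needed to follow Withdraw, which is the proximity principle in practice; a matching accounts_test.go would then exercise Withdraw rather than the individual types.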

When writing a single package meant to be used as a command, which is idiomatic: name all identifiers as private or name all identifiers as public?

In Go, public names start with an upper case letter and private names start with a lower case letter.
I'm writing a program that is not a library and is a single package. Is there any Go idiom that stipulates whether my identifiers should be all public or all private? I do not plan on using this package as a library or as something that should be imported from another Go program.
I can't think of any reason why I'd want a mixture. It "feels" like going all private is the proper choice.
I don't think I got any concrete answer, but Nate was closest with telling me to think of "exporting vs non-exporting" instead of "public and private".
This leads me to believe that not exporting anything is the best approach. In the worst case scenario, if I end up importing code from my application in another package, I will have to rethink what should be exported and what shouldn't be. Which is a good thing IMO.
If you are attempting to adjust your mindset to be more Go idiomatic, you should stop thinking of variables, functions, and methods as public or private. The more accurate term is exported or not exported. It definitely has a more C like feel to it.
As others have stated exporting really isn't needed for application program code. If for organizational reasons you decide to break your program up into packages, you could use sub-packages. At work we've decided to do just this. We have:
projectgopath/src/projectname
projectname/subcomponent1
projectname/subcomponent2
So far I am really liking this structure. It aids in separation of concerns, but does not go to the extent of making a package outside of the main project. The intent is clear. The sub-package's intended use is for this program only...
The new go build and go install commands seem to deal very well with it. We group components together in packages and expose only the necessary bits via exports.
In the described situation both approaches are equally valid, so it's more or less a matter of personal preferences. In my case I'm using camelCase identifiers for package main, mostly out of habit.
A lot of my go files started their life in isolated commands and were moved to packages as they could be reused by a few commands around the same topic.
I think you should keep unexported everything that couldn't possibly be called from elsewhere (supposing one day you make it an importable package), and export the big functions that make sense from elsewhere (if any), along with struct fields when they are orthogonal (I mean when changing the value of one field doesn't break the consistency of the struct value).
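Here is a minimal sketch of the "orthogonal fields" rule of thumb from the last answer. Point and span are hypothetical types: Point's fields are independent, so exporting them costs nothing, while span has an invariant (start must not exceed end), so its fields stay unexported and values are built through a constructor. Note that unexported fields are only hidden from other packages; code inside the same package can still reach them.

```go
package main

import (
	"errors"
	"fmt"
)

// Point's fields are independent of each other, so any combination of
// values is valid and exporting them is harmless.
type Point struct {
	X, Y int
}

// span must keep start <= end, so its fields are unexported and values
// are created through newSpan, which enforces the invariant.
type span struct {
	start, end int
}

// newSpan validates its arguments before building a span.
func newSpan(start, end int) (span, error) {
	if start > end {
		return span{}, errors.New("start must not exceed end")
	}
	return span{start: start, end: end}, nil
}

// length is never negative for a span built by newSpan.
func (s span) length() int {
	return s.end - s.start
}

func main() {
	p := Point{X: 1, Y: 2}
	p.X = 10 // safe: no other field depends on X

	s, err := newSpan(3, 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p, s.length())
}
```

If the package is later imported from elsewhere, Point can be used freely while span would need a deliberate decision about what to export, which matches the "rethink it when you need it" approach described above.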
