How to merge similar multi-package projects in go - go

Several organizations distribute variants of the same project, and we regularly pull changes from one another. It would be great if we could eventually merge code repositories and maybe, maybe have a common source tree managed by a consortium. However, each member would probably want the option of distributing their own variant without too much pain for customers in case there is trouble upstreaming changes required to work with newer products.
The project consists of three packages:
1. A library
2. A compiler executable that outputs Go code that needs to import the library
3. A utility executable that uses code generated by #2 and links with #1
A big annoyance, when pulling changes back and forth, is gratuitous differences in import paths. We basically have to edit every version of import "github.com/companyA/whatever" to import "companyB.com/whatever". Of course these problems would go away with (gasp) relative import paths. If we resorted to such heresy, our compiler can just hard-code the absolute import path in generated code to isolate end users from the library's import path. It would also require only one gratuitous difference in the source trees (the line in the compiler that outputs import statements) rather than a bunch.
But anyway, I know relative import paths are bad - this is a tricky situation. I know this is similar to questions such as this or this, because the answer of just asking end users to create a directory called companyB.com and cloning something from companyA in there is just not going to fly for practical and political reasons.
I already know that go is not really good at accommodating this scenario, so I'm also not asking for a magic bullet to make go handle something it can't. Another thing that unfortunately won't fly is asking customers to curl whatever | sh, as this is viewed as too much of a liability (deemed "training customers to do dangerous things"). Maybe we could forego go get and have everyone clone to some neutral non-DNS-name under $GOPATH/src, but we would need to do this without a "flag day" in which code suddenly breaks if it's in the wrong place.
My question is whether anyone has successfully merged SDK-type projects with existing end users, and if so, how did you do it, what worked, and what didn't? Did you in fact avoid relative import paths or gnarly GOPATH hacking, and if so was it worth it? What mechanisms did you employ (environment variables, configuration files, .project-config files in the current working directory, abusing the vendor directory, code-generation packages that figure out their absolute import path at compilation time) to make this work smoothly? Did you just muddle through with a huge amount of sed or maybe gofmt -r? Are there tricks involving clever use of .gitattributes or go generate to rewrite import paths on checkout/checkin?

Merging them is pretty easy - cross-merge so that they all match, pick one (or create a new one) as the canonical source of truth, then migrate all references to the canonical import and make all future updates there.
The problem arises here:
each member would probably want the option of distributing their own variant without too much pain for customers in case there is trouble upstreaming changes required to work with newer products
There's no particularly good way to do that without using one of the known solutions you've already ruled out, and possibly no way at all, depending on your threshold for "too much pain".
Without knowing more about the situation it's hard to suggest options, but if there's any way that each company can abstract their portion out into a separate library they could maintain and update at their pace while using a smaller shared library with shared responsibility, that would likely be the best option - something like the model used by Terraform for its providers? Basically you'd have shared maintenance of a shared "core" and then independent maintenance of "vendor-specific" packages.
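A minimal sketch of that "shared core plus vendor-specific packages" split, collapsed into one file so it runs on its own. The Provider interface, the registry functions and the product names are all invented for illustration; this is not Terraform's actual plugin API, only the general shape of the idea.

```go
package main

import (
	"fmt"
	"sort"
)

// Provider is the contract a shared core package would define.
// (Hypothetical interface, for illustration only.)
type Provider interface {
	Name() string
	Supports(product string) bool
}

// providers is the core's registry of vendor implementations.
var providers = map[string]Provider{}

// Register is called by each vendor package, typically from init().
func Register(p Provider) { providers[p.Name()] = p }

// ProvidersFor returns the sorted names of providers supporting a product.
func ProvidersFor(product string) []string {
	var names []string
	for name, p := range providers {
		if p.Supports(product) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// In a real layout each type below would live in its own repository,
// maintained by that company at its own pace.
type companyA struct{}

func (companyA) Name() string                 { return "companyA" }
func (companyA) Supports(product string) bool { return product == "widget-v1" }

type companyB struct{}

func (companyB) Name() string { return "companyB" }
func (companyB) Supports(product string) bool {
	return product == "widget-v1" || product == "widget-v2"
}

func main() {
	Register(companyA{})
	Register(companyB{})
	fmt.Println(ProvidersFor("widget-v2"))
}
```

With this shape, support for a newer product lands in one vendor's package without waiting for an upstream merge, while only the core's import path has to be agreed on by everyone.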

Related

Using unexported functions/types from stdlib in Go

Disclaimer: yes I know that this is not "supposed to be done" and "use interface composition and delegation" and "the authors of the language know better". However, I am confronted with a choice of either copy-pasting from the standard library and creating my own packages, or doing what I am asking. So please do not reply with "What you want to do is wrong, you are a bad dev and you should feel bad."
So, in Go we have the http stdlib package. This package has a number of functions for dealing with HTTP Range headers and responses (parsers, a struct for "offset+size" and so forth). For various reasons I want to use something that is very similar to ServeContent but works a bit differently (long story short - the amount of plumbing needed to do the ReaderAt gymnastics is suboptimal for what I want to accomplish) so I want to parse the HTTP Range header myself, using the utility functions/structs from the http stdlib package and then deal with them manually. Basically, I want a changed version of ServeContent :-)
Is there a way for me to "reopen" the http stdlib package to use its unexported identifiers? ABI is not a concern for me as the source is mine, the program gets compiled from scratch every time etc. etc. and it does not need binary compatibility with older/other Go versions. I.e. I am able to ensure that the build is going to be done on a specific Go version and there are tests to check whether an unexported identifier has disappeared. So...
If there is a package called foo in the Go standard library, but it only exposes a MagicMegamethod that does the thing I do not need, and uses usefulFunc and usefulStruct that I want to get access to, is there a way for me to get access to those identifiers? Either by reopening the package, or using some other way... that does not involve copy-pasting dozens of lines from stdlib without tests etc.
There exist (rather gruesome) ways of accessing unexported symbols, but they require nontrivial amounts of tricky code, so there's unlikely to be a net win.
Since you've ruled out the "don't do this" direction, it seems that the answer is either NO or use the methods described in the post I linked to (and this repo).
FWIW I'd personally just copy the code I need from the standard library and tweak it to my needs. This would likely take less time than the time it took you to write this SO question :-)
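To give a sense of how little code the "write your own" route can need, here is a deliberately simplified Range parser. It is not the standard library's implementation: it handles only a single bytes range, and the names byteRange and parseSingleRange are invented for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// byteRange is an offset plus length within a resource of known size.
type byteRange struct {
	start, length int64
}

// parseSingleRange handles "bytes=a-b", "bytes=a-" and "bytes=-n" for a
// resource of the given size. Multi-range headers are rejected (sketch only).
func parseSingleRange(header string, size int64) (byteRange, error) {
	const prefix = "bytes="
	if !strings.HasPrefix(header, prefix) {
		return byteRange{}, errors.New("invalid range: missing bytes= prefix")
	}
	spec := strings.TrimSpace(strings.TrimPrefix(header, prefix))
	if strings.Contains(spec, ",") {
		return byteRange{}, errors.New("multiple ranges are not handled by this sketch")
	}
	dash := strings.Index(spec, "-")
	if dash < 0 {
		return byteRange{}, errors.New("invalid range: missing '-'")
	}
	startStr := strings.TrimSpace(spec[:dash])
	endStr := strings.TrimSpace(spec[dash+1:])

	if startStr == "" {
		// Suffix form "-n": the last n bytes of the resource.
		n, err := strconv.ParseInt(endStr, 10, 64)
		if err != nil || n <= 0 {
			return byteRange{}, errors.New("invalid suffix range")
		}
		if n > size {
			n = size
		}
		return byteRange{start: size - n, length: n}, nil
	}

	start, err := strconv.ParseInt(startStr, 10, 64)
	if err != nil || start < 0 {
		return byteRange{}, errors.New("invalid range start")
	}
	if start >= size {
		return byteRange{}, errors.New("range start is beyond the end of the resource")
	}
	if endStr == "" {
		// Open-ended form "a-": from start to the end of the resource.
		return byteRange{start: start, length: size - start}, nil
	}
	end, err := strconv.ParseInt(endStr, 10, 64)
	if err != nil || end < start {
		return byteRange{}, errors.New("invalid range end")
	}
	if end >= size {
		end = size - 1 // clamp to the last byte
	}
	return byteRange{start: start, length: end - start + 1}, nil
}

func main() {
	r, err := parseSingleRange("bytes=500-999", 10000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.start, r.length)
}
```

A copy under your own control like this can also carry its own tests, which addresses the "copy-pasting without tests" worry from the question.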

Organising Go programs - packages or something else?

I have read the Go Tour and Googled "golang packages" but I have not yet found any advice about best practice in Go for organising moderately sized applications.
If I have an application that conceptually has several distinct parts, perhaps 10^3-10^4 LOC, and I don't intend to create reusable libraries for use in other applications, should all the source code files be package main?
To clarify ...
For example, let's say my program will have the following major chunks:
Something that manages a bunch of persistently stored data
allowing usual create, read, update, delete operations
Something that allows a human to view the stored data
Something that coordinates / mediates between these
Something that periodically fetches data updates from a web-service using SOAP.
So that would be MVC plus a fetcher of data.
From looking around at what people do, I now suspect I should
create $GOPATH/src/myprogramname
in there put some main.go with package main and func main() { ... } in it.
create some subdirectories like
$GOPATH/src/myprogramname/model
$GOPATH/src/myprogramname/view
$GOPATH/src/myprogramname/control
$GOPATH/src/myprogramname/fetch
have the .go files in those subdirectories begin with package fetch, etc. Where the package name always matches the subdirectory name.
my main.go will probably import ( ... "fetch"; "model"; "view"; "control" )
as main.go grows, split it into other reasonably sized .go files named according to purpose.
build the program, including *.go in the above package subdirectories by
cd $GOPATH/src/myprogramname
go build
Is that all I need to do? Is that the properly idiomatic Go way of organising things? Is there more I should know or be thinking of? Is there some canonical webpage or PDF I overlooked and should read to find out this stuff?
In short, I don't want a 10,000 line main.go with everything in it. What are the idiomatic Go principles for organising code into files, subdirectories, packages and any other organisational units corresponding to normal conceptual divisions according to well-known structured-programming and/or OO principles?
You could break down your project into several layers based on the encapsulation level of your functions, i.e. having low-level functions in separate packages and logic functions in your main package. (You could take inspiration from MVC-like architectures.)
Since we don't have any details about your code, it is hard to see what kind of architecture would be best suited.
But in the end your choice will be based on the code simplicity / re-usability balance.
The general "best practice" in Go seems to be having each package provide a type or a service. Most of the packages in the standard library expose one or two types, and functions for working with those types. Some, like net/http and testing, provide a service - not in the "microservices" sense of something executable in itself, but rather a set of functionality related to a specific activity.
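As a sketch of "a package provides a type plus functions for working with it", here is what the model chunk from the question might expose. Everything is placed in one file so it runs standalone; in the GOPATH layout from the question it would sit in $GOPATH/src/myprogramname/model and be imported as "myprogramname/model". The Record and Store types are invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is returned when no record has the requested ID.
var ErrNotFound = errors.New("record not found")

// Record is one stored item (hypothetical shape).
type Record struct {
	ID    int
	Title string
}

// Store is the single type this would-be package is built around.
// It keeps records in memory; a real one would persist them.
type Store struct {
	records map[int]Record
	nextID  int
}

// NewStore returns an empty Store.
func NewStore() *Store {
	return &Store{records: make(map[int]Record), nextID: 1}
}

// Create adds a record and returns it with its assigned ID.
func (s *Store) Create(title string) Record {
	r := Record{ID: s.nextID, Title: title}
	s.records[r.ID] = r
	s.nextID++
	return r
}

// Read looks a record up by ID.
func (s *Store) Read(id int) (Record, error) {
	r, ok := s.records[id]
	if !ok {
		return Record{}, ErrNotFound
	}
	return r, nil
}

// Update changes the title of an existing record.
func (s *Store) Update(id int, title string) error {
	r, ok := s.records[id]
	if !ok {
		return ErrNotFound
	}
	r.Title = title
	s.records[id] = r
	return nil
}

// Delete removes a record by ID.
func (s *Store) Delete(id int) error {
	if _, ok := s.records[id]; !ok {
		return ErrNotFound
	}
	delete(s.records, id)
	return nil
}

func main() {
	store := NewStore()
	rec := store.Create("first entry")
	if err := store.Update(rec.ID, "renamed entry"); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	got, _ := store.Read(rec.ID)
	fmt.Println(got.ID, got.Title)
}
```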

What are possible pros and cons for prefixing folders with digits?

My company strongly suggests using subfolder names prefixed by digits for larger projects. This is recorded in the company's code convention articles.
This should look something like this
ApplicationRoot/
    SomeSubFolder/
        00_SubSubFolder/
        01_SubSubFolder/
        02_SubSubFolder/
    AnotherSubFolder/
        00_SubSubFolder/
        01_SubSubFolder/
        02_SubSubFolder/
Somehow this feels like a useless overhead to me but I have no valid arguments against that.
Maybe more experienced people can tell me about scenarios which show why this is a bad habit or tell me why it is good - besides the possibility to force the folders to be in a certain order?
it's useful only if the order is important (e.g. order of running scripts). otherwise it's bad (in my opinion). the arguments are:
some products don't allow it. e.g. java package structure maps directly to directory structure. but package name can't start with a digit.
can't use convention over configuration. some tools help you a lot with software development and they assume you are doing it same way as rest of the world (because it's a good practise). you will have a lot of configuration to make them accept your structure (e.g. maven)
human perception. we look for data by names, not by numbers. when i navigate to a file in e.g. krusader/total commander and i have a dozen of dirs i type a letter because i know the folder name.
confusion. if those numbers mean nothing then it introduces confusion to other people. they will always ask 'why', they will always be afraid to modify, add or remove anything because they will think someone did it because of some very important reason. that's a clear violation of KISS and least surprise principles (such things heavily affect new developers' entry barrier)
no flexibility. sometimes it's good to have custom folder names. for whatever reason, e.g. automatic search of configuration in multiple directories (often used in java/spring). but having such a naming convention makes it more difficult to do. sometimes when you want to use automatic naming translation it also may be harder as your target format may not support names starting with digits (e.g. logins)
overhead. if there is no reason to keep it then any overhead should be removed. again: KISS
last but not least. developer/architect is always the one that makes decisions about software design, layout, used techniques etc. if his hands are tied because of senseless rules invented by non-technical bureaucrats from the previous epoch, that's nothing but trouble

When writing a single package meant to be used as a command, which is idiomatic: name all identifiers as private or name all identifiers as public?

In Go, public names start with an upper case letter and private names start with a lower case letter.
I'm writing a program that is not library and is a single package. Is there any Go idiom that stipulates whether my identifiers should be all public or all private? I do not plan on using this package as a library or as something that should be imported from another Go program.
I can't think of any reason why I'd want a mixture. It "feels" like going all private is the proper choice.
I don't think I got any concrete answer, but Nate was closest with telling me to think of "exporting vs non-exporting" instead of "public and private".
This leads me to believe that not exporting anything is the best approach. In the worst case scenario, if I end up importing code from my application in another package, I will have to rethink what should be exported and what shouldn't be. Which is a good thing IMO.
If you are attempting to adjust your mindset to be more Go idiomatic, you should stop thinking of variables, functions, and methods as public or private. The more accurate term is exported or not exported. It definitely has a more C like feel to it.
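The exported/unexported rule is mechanical enough that the standard library ships a helper for it. The short sketch below uses go/ast.IsExported to classify a few identifier names; the names themselves and the exportedNames helper are made up for the example.

```go
package main

import (
	"fmt"
	"go/ast"
)

// exportedNames (hypothetical helper) keeps only the identifiers that Go
// would export, i.e. those beginning with an upper-case letter.
func exportedNames(names []string) []string {
	var out []string
	for _, n := range names {
		if ast.IsExported(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Illustrative identifiers from an imaginary command package.
	names := []string{"Run", "loadConfig", "Config", "parseFlags"}
	fmt.Println(exportedNames(names))
}
```

In a single-package command nothing outside the package can import these names anyway, so the distinction only starts to matter once code moves into a package that something else imports.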
As others have stated exporting really isn't needed for application program code. If for organizational reasons you decide to break your program up into packages, you could use sub-packages. At work we've decided to do just this. We have:
projectgopath/src/projectname
projectname/subcomponent1
projectname/subcomponent2
So far I am really liking this structure. It aids in separation of concerns, but does not go to the extent of making a package outside of the main project. The intent is clear. The sub-package's intended use is for this program only...
The new go build and go install commands seem to deal very well with it. We group components together in packages and expose only the necessary bits via exports.
In the described situation both approaches are equally valid, so it's more or less a matter of personal preferences. In my case I'm using camelCase identifiers for package main, mostly out of habit.
A lot of my go files started their life in isolated commands and were moved to packages as they could be reused by a few commands around the same topic.
I think you should make private all that couldn't possibly be called from elsewhere (supposing one day you make it an importable package) and make public the big functions that can be understood from elsewhere (if any) and structs fields when they are orthogonal (I mean when a change of the value of one field doesn't break the consistency of the struct value).

why use com.company.project structure? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Does anyone know of the practical reasons for the com.company.project package structure and why it has become the de-facto standard?
Does anyone actually store everything in a folder-structure that directly reflects this package structure? (this is coming from an Actionscript perspective, btw)
Preventing name-clashes is a rather practical reason for package structures like this. If you're using real domain names that you own and everybody else chooses their package names by the same rule, clashes are highly unlikely.
Esp. in the Java world this is "expected behaviour". It also kind of helps if you want to find documentation for a legacy library you're using that no one can remember anymore where it was coming from ;-)
Regarding storing files in such a package structure: In the Java world packages are effectively folders (or paths within a .jar file) so: Yes, quite a few people do store their files that way.
Another practical advantage of such a structure is, that you always know if some library was developed in-house or not.
I often skip the com. as even small orgs have several TLDs, but definitely useful to have the owner's name in the namespace, so when you start onboarding third-party libraries, you don't have namespace clashes.
Just think how many Utility or Logging namespaces there would be around, here at least we have Foo.Logging and Bar.Logging, and the dev can alias one namespace away :)
If you start with a domain name you own, expressed backwards, then it is only after that point that you can clash with anyone else following the same structure, as nobody else owns that domain name.
It's only used on some platforms.
Several reasons are:
Using domain names makes it easier to achieve uniqueness, without adding a new registry
As far as hierarchical structuring goes, going from major to minor is natural
For the second point, consider the example of storing dated records in a hierarchical file structure. It's much more sensible to arrange it hierarchically as YYYY/MM/DD than say DD/MM/YYYY: at the root level you see folders that organize records by year, then at the next level by month, and then finally by day. Doing it the other way (by days or months at the root level) would probably be rather awkward.
For domain names, it usually goes subsub.sub.domain.suffix, i.e. from minor to major. That's why when converting this to a hierarchical package name, you get suffix.domain.sub.subsub.
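The minor-to-major reversal can be sketched in a few lines. The function name reverseDomain is invented for this example, and it does nothing beyond reversing the dot-separated components (no handling of hyphens, keywords or other characters that are not valid in package names).

```go
package main

import (
	"fmt"
	"strings"
)

// reverseDomain turns "subsub.sub.domain.suffix" into
// "suffix.domain.sub.subsub" by reversing the dot-separated components.
func reverseDomain(domain string) string {
	parts := strings.Split(domain, ".")
	for i, j := 0, len(parts)-1; i < j; i, j = i+1, j-1 {
		parts[i], parts[j] = parts[j], parts[i]
	}
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(reverseDomain("subsub.sub.domain.suffix"))
}
```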
For the first point, here is an excerpt from Java Language Specification 3rd Edition that may shed some light into this package naming convention:
7.7 Unique Package Names
Developers should take steps to avoid the possibility of two published packages having the same name by choosing unique package names for packages that are widely distributed. This allows packages to be easily and automatically installed and catalogued. This section specifies a suggested convention for generating such unique package names. Implementations of the Java platform are encouraged to provide automatic support for converting a set of packages from local and casual package names to the unique name format described here.
If unique package names are not used, then package name conflicts may arise far from the point of creation of either of the conflicting packages. This may create a situation that is difficult or impossible for the user or programmer to resolve. The class ClassLoader can be used to isolate packages with the same name from each other in those cases where the packages will have constrained interactions, but not in a way that is transparent to a naïve program.
You form a unique package name by first having (or belonging to an organization that has) an Internet domain name, such as sun.com. You then reverse this name, component by component, to obtain, in this example, com.sun, and use this as a prefix for your package names, using a convention developed within your organization to further administer package names.
The name of a package is not meant to imply where the package is stored within the Internet; for example, a package named edu.cmu.cs.bovik.cheese is not necessarily obtainable from Internet address cmu.edu or from cs.cmu.edu or from bovik.cs.cmu.edu. The suggested convention for generating unique package names is merely a way to piggyback a package naming convention on top of an existing, widely known unique name registry instead of having to create a separate registry for package names.
