When writing a single package meant to be used as a command, which is idiomatic: name all identifiers as private or name all identifiers as public? - go

In Go, public names start with an upper case letter and private names start with a lower case letter.
I'm writing a program that is not a library and is a single package. Is there any Go idiom that stipulates whether my identifiers should be all public or all private? I do not plan on using this package as a library or as something that should be imported from another Go program.
I can't think of any reason why I'd want a mixture. It "feels" like going all private is the proper choice.
I don't think I got any concrete answer, but Nate was closest with telling me to think of "exporting vs non-exporting" instead of "public and private".
This leads me to believe that not exporting anything is the best approach. In the worst case scenario, if I end up importing code from my application in another package, I will have to rethink what should be exported and what shouldn't be. Which is a good thing IMO.

If you are attempting to adjust your mindset to be more Go idiomatic, you should stop thinking of variables, functions, and methods as public or private. The more accurate terms are exported and unexported. It definitely has a more C-like feel to it.
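To make the exported/unexported distinction concrete, here is a minimal sketch (the names are invented for illustration). In a package main the choice has no practical effect, because nothing can import package main anyway:

package main

// lowercase: unexported, visible only inside this package
func runTrades() {}

// Uppercase: exported, but since package main cannot be imported,
// exporting it buys you nothing here
func RunTrades() { runTrades() }

func main() { RunTrades() }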
As others have stated, exporting really isn't needed for application code. If for organizational reasons you decide to break your program up into packages, you could use sub-packages. At work we've decided to do just this. We have:
projectgopath/src/projectname
projectname/subcomponent1
projectname/subcomponent2
So far I am really liking this structure. It aids in separation of concerns, but does not go to the extent of making a package outside of the main project. The intent is clear. The sub-package's intended use is for this program only...
The new go build and go install commands seem to deal very well with it. We group components together in packages and expose only the necessary bits via exports.
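A rough sketch of that layout (the paths follow the structure above; the function names are hypothetical): the sub-package exports only the entry point main actually needs, and keeps its helpers unexported:

// projectname/subcomponent1/subcomponent1.go
package subcomponent1

// Run is the one exported entry point the main program uses.
func Run() error { return start() }

// start and any other helpers stay unexported.
func start() error { return nil }

// projectname/main.go
package main

import "projectname/subcomponent1"

func main() {
	if err := subcomponent1.Run(); err != nil {
		panic(err)
	}
}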

In the described situation both approaches are equally valid, so it's more or less a matter of personal preference. In my case I'm using camelCase identifiers in package main, mostly out of habit.

A lot of my Go files started their life as isolated commands and were later moved into packages so they could be reused by a few commands around the same topic.
I think you should make private (unexported) everything that couldn't possibly be called from elsewhere (supposing you one day make it an importable package), and make public the big functions that can be understood from the outside (if any), as well as struct fields when they are orthogonal (that is, when changing the value of one field doesn't break the consistency of the struct value).
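A small sketch of that guideline (the type and fields are made up): export the orthogonal fields, and keep the fields that must stay consistent with each other unexported behind methods:

type Job struct {
	// Orthogonal: each of these can change independently without
	// breaking the other, so exporting them is safe.
	Name    string
	Retries int

	// These two must stay in sync, so they remain unexported and
	// are only modified through methods.
	buf  []byte
	size int
}

func (j *Job) Append(p []byte) {
	j.buf = append(j.buf, p...)
	j.size = len(j.buf)
}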

Related

Using unexported functions/types from stdlib in Go

Disclaimer: yes I know that this is not "supposed to be done" and "use interface composition and delegation" and "the authors of the language know better". However, I am confronted with a choice of either copy-pasting from the standard library and creating my own packages, or doing what I am asking. So please do not reply with "What you want to do is wrong, you are a bad dev and you should feel bad."
So, in Go we have the http stdlib package. This package has a number of functions for dealing with HTTP Range headers and responses (parsers, a struct for "offset+size", and so forth). For various reasons I want something that is very similar to ServeContent but works a bit differently (long story short: the amount of plumbing needed to do the ReaderAt gymnastics is suboptimal for what I want to accomplish), so I want to parse the HTTP Range header myself using the utility functions/structs from the http stdlib package, and then deal with them manually. Basically, I want a changed version of ServeContent :-)
Is there a way for me to "reopen" the http stdlib package to use its unexported identifiers? ABI is not a concern for me as the source is mine, the program gets compiled from scratch every time, etc., and it does not need binary compatibility with older/other Go versions. I.e. I am able to ensure that the build is done on a specific Go version, and there are tests to catch the case where an unexported identifier disappears. So...
If there is a package called foo in the Go standard library, but it only exposes a MagicMegamethod that does the thing I do not need, and uses usefulFunc and usefulStruct that I want to get access to, is there a way for me to get access to those identifiers? Either by reopening the package, or using some other way... that does not involve copy-pasting dozens of lines from stdlib without tests etc.
There exist (rather gruesome) ways of accessing unexported symbols, but they require nontrivial amounts of tricky code, so there's unlikely to be a net win.
Since you've ruled out the "don't do this" direction, it seems that the answer is either NO or use the methods described in the post I linked to (and this repo).
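For reference, the trick usually involved is //go:linkname, which binds a local, body-less declaration to another package's unexported symbol. A heavily hedged sketch (foo and usefulFunc are the hypothetical names from the question, and the signature is invented): you also need an empty .s file in the package so the compiler accepts the missing body, newer Go versions restrict linknaming into the standard library, and the whole thing can break on any Go release.

package main

import (
	_ "unsafe" // required for //go:linkname
)

// Pull foo's unexported usefulFunc into this package. The local
// signature must match the real one exactly; a mismatch compiles
// but misbehaves at run time.
//
//go:linkname usefulFunc foo.usefulFunc
func usefulFunc(s string) (string, error)

Note that unexported types like usefulStruct can't be pulled across this way at all, which is part of why the approach gets gruesome quickly.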
FWIW I'd personally just copy the code I need from the standard library and tweak it to my needs. This would likely take less time than the time it took you to write this SO question :-)

What is the proper way to organize / structure Go package folders and files?

I know this might be controversial or not very broad, but I'm going to try to be very specific and relate to other questions.
OK, so when I make a Go program, what should I be thinking about in terms of how to organize my project? (E.g. should I think, "I'm going to have controllers of some sort, so I should have a controllers subdirectory to handle that"?)
How should I structure a package?
For example, in the current program I'm working on I'm trying to make a SteamBot using this package.
But while I'm writing it, I don't know if I should split certain methods out into their own files, e.g. I'd have something like
func (t *tradeBot) acceptTrade() {}
func (t *tradeBot) declineTrade() {}
func (t *tradeBot) monitorTrade() {}
func (t *tradeBot) sendTrade() {}
Each method is going to have quite a bit of code, so should I split each method into its own file, or just have one file with 3000 lines of code in it?
Also, should I use global variables so that I can set a variable once and then use it in multiple functions, or is that bad and I should pass the variables as arguments?
Also, should I order my files like:
package
imports
constants
variables
functions
methods
Or do I just put things where I need them?
The first place to look for inspiration is the Go standard library. It is a pretty good guide to what "good" Go code should look like. A library isn't quite the same as an application, but it's definitely a good introduction.
In general, you would not break up every method into its own file. Go tends to prefer larger files that cover a topic. 3000 lines seems quite large, though. Even the net/http server.go is only 2200 lines, and it's pretty big.
Global mutable variables are just as bad in Go as in any language. Since Go relies so heavily on concurrent programming, global mutable variables are quite horrible. Don't do that. One common exception is sync structures like sync.Pool or sync.Once, which sometimes are package global, but are also designed to be accessed concurrently. There are also sometimes "Default" versions of structures, like http.DefaultClient, but you can still pass explicit ones to functions. Again, look through the Go standard library and see what is common and what is rare.
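As a small sketch of that last point (the function is invented): take the dependency as a parameter and fall back to the package-level default, rather than reaching for your own global:

import "net/http"

// fetch takes its client explicitly; callers that don't care can
// pass nil and get the package default.
func fetch(client *http.Client, url string) (*http.Response, error) {
	if client == nil {
		client = http.DefaultClient
	}
	return client.Get(url)
}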
Just a few tips that you hopefully find useful:
Organize code into multiple files and packages by features, not by layers. This becomes more important the bigger your application becomes. package controllers with one or two controllers is probably ok, but putting hundreds of unrelated controllers in the same package doesn't make any sense. The following article explains it very well: http://www.javapractices.com/topic/TopicAction.do?Id=205
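For example, a feature-based layout for the trade bot above might look like this (names are hypothetical), rather than controllers/, models/, and so on:

projectname/trade (accepting, declining, monitoring, sending trades)
projectname/steam (Steam API client)
projectname/storage (persistence used by the above)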
Global variables sometimes make code easier to write; however, they should be used with care. I think unexported global variables for things like loggers, debug flags, etc. are OK.
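For instance, a minimal sketch of the kind of unexported globals that tend to be acceptable (names invented):

import (
	"log"
	"os"
)

// Unexported package-level logger and debug flag: global, but set
// once at startup and never mutated concurrently afterwards.
var (
	logger = log.New(os.Stderr, "app: ", log.LstdFlags)
	debug  = os.Getenv("APP_DEBUG") != ""
)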

Artifact naming convention

We're doing a big project on OSGi and adding some commons modules. There's some discussion about naming the artifact.
So, one possibility when naming the module is for example:
cmns-definitions (for common definitions); another is cmns-definition; still another is cmns-def. This also has some effect on the package name. Right now it's
xx.xxx.xxx.xxx.xxx.commons.definitions; if we changed to cmns-def, it would be xx.xxx.xxx.xxx.xxx.commons.def.
Inside this package will be classes like enums and other definitions to be used throughout the system.
I personally lean toward cmns-definitions, since there isn't only one definition inside the package. Other people point out that java.util doesn't have only one utility in it either, for example. Still, to me java.util is an abbreviation. It can mean java utility or java utilities. The same thing happens with commons-lang.
How would you name the package? Why would you choose this name?
cmns-definitions
cmns-definition
cmns-def
Bonus question: How to name something like cmns-exceptions? That's how I name it. Would you name it cmns-xcpt?
EDIT:
I'm throwing in my own thoughts on this in the hope of being either confirmed or contradicted. If you can, please do.
The way I see it, the underlying reason you name something is to make it easier to understand what's inside it. Or, according to Peter Kriens, to make it easy to remember and to be able to automate processes via patterns. Both are valid arguments.
My reasoning is as follows in terms of pattern:
1) When a substantivation occurs and it's well known in the industry, follow it in your naming.
E.g.:
"features" is a case in point. We have a module called cmns-features. Does this mean we have many features in this module? No. It means "the module that implements the "features" file from Apache Karaf".
"commons" is a substantivation of "common" that is well accepted in the industry. It doesn't mean "many commons"; it means "common code".
If I see extr-commons as a module name, I know that it contains common code for extr (in this case extraction), for example.
2) When the classes inside the module cooperate to give a distinct, "one and only one" meaning to the whole, use the singular form to name it.
The majority of modules are included here. If I name something cmns-persistence-jpa, I mean that whatever classes inside cooperate together to provide the jpa implementation of cmns-persistence-api. I don't expect 2 implementations inside it, but actually a myriad of classes that together make one implementation. Crystal clear to me. No?
3) When classes are grouped with the sole purpose of gathering them by affinity, but they don't cooperate toward a single purpose, use the plural.
This is the case, for example, with cmns-definitions (enums used by the whole system).
Alternatively, using an abbreviation circumvents the problem, e.g. cmns-def, which a human reader can also expand to cmns-definitions. Many people also use "xxxx-util" to mean xxxx-utilities.
A third option for packing things together is to use a name that itself implies a plurality. The word "api" comes to mind, but any word that groups things would do, like "pack".
Supporting these cases (3) are well-known modules like commons-collections (using the plural), commons-dbcp (using an abbreviation), commons-lang (again an abbreviation), and anything that uses "api" to pack classes together by affinity.
From apache:
commons-collections -> many powerful data structures that accelerate development of most significant Java applications
commons-lang -> host of helper utilities for the java.lang API
commons-dbcp -> package of several database connection pools
'it is just a name ...'
I find, over a long career, that these "just names" can make a tremendous difference in productivity. I do not think it makes a difference whether you use definitions, definition, or def, as long as you're consistent and use patterns in the name that are easy to remember and can be used to automate processes. A build based on a consistent naming scheme is infinitely easier to work with than a build with "nice human display" names that are ad hoc and have no discernible pattern.
If you use patterns, names tend to become shorter. The people working with these names usually spend a lot of time with them, so their readability is not nearly as important as their mnemonic value. It turns out that abbreviations of 3 or 4 characters are surprisingly powerful. One of the reasons they work well is that there is only one possible abbreviation, while if you go longer there are many candidates.
Anyway, the most important part is overall consistency. Good luck.
definitions (or def or definition) is a bad name because it doesn't convey any semantics to a reader. You're in an object-oriented world (I suppose); try to follow its conventions and principles. Modules in Maven should be named after the biggest "abstraction" they contain. "Definition" is a form, not a meaning.
Your question is similar to: "Which class name is better FileUtilities or FileUtils". Answer: none.
Basically, what you do with the definitions and exceptions is provide a kind of API for your other modules. So I propose combining the definitions and exceptions, and adding the interfaces to them; then it makes sense to call it all cmns-api. I normally prefer singular names as they are shorter, but you are free to decide, as it is just a name.

Is fully qualified naming vs the using directive simply a matter of opinion?

Now that I work in a mostly solo environment, I find that I actually type fully qualified method calls more and more, instead of making use of the using directive. Previously, I just stayed consistent with the most prominent coding practice on the team.
Personally, I find it easier to read verbose code at a glance, I type fast (especially with autocompletion), and I find myself using Google more often as my source of documentation, where a fully qualified name returns a much narrower result set. These are obviously very arbitrary reasons to prefer fully qualifying over the using directive.
In this day and age of refactoring tools, is there a concrete reason why using the using directive is superior to fully qualified or vice versa, or is this purely a personal discretion issue like comment spacing? Finally, which do you prefer and why?
Probably could call this subjective.
Where I work, and what I prefer, is to use using statements. It keeps the names/lines short enough, which just makes day-to-day life easier. Plus, you can just hover over something to see the fully qualified name.
I use usings whenever possible. The less there is to read, the less I have to parse:
System.Windows.Forms.Form form = new System.Windows.Forms.Form();
is just way more work than
var form = new Form();
Please note that this does require your entire shop to commit to not doing something silly, such as creating your own super-duper Form class, which would cause ambiguity.
I generally prefer using/imports for real code and fully qualified code for examples/etc.
Readability.
Think about yourself in a year, trying to read your code. You want it to be explicit and short, and to have only one source of information for each piece of data (the DRY principle).

Is there any downside to redundant qualifiers? Any benefit?

For example, referencing something as System.Data.Datagrid as opposed to just Datagrid. Please provide examples and explanation. Thanks.
The benefit is that you don't need to add an import for everything you use, especially if it's the only thing you use from a particular namespace; it also prevents collisions.
The downside, of course, is that the code balloons out in size and gets harder to read the more you use specific qualifiers.
Personally I tend to use imports for most things unless I know for sure I will only be using something from a particular namespace once or twice, so it won't impact the readability of my code.
You're being very explicit about the type you're referencing, and that is a benefit. Although, in the very same process you're giving up code clarity, which is clearly a downside in my case, as I want code to be readable and understandable. I go for the short version unless I have a conflict between different namespaces that can only be solved by referencing the classes explicitly, unless I make an alias for it with the using keyword:
using Datagrid = System.Data.Datagrid;
Actually the full path is global::System.Data.DataGrid. The point of using a more qualified path is to avoid having to use additional using statements, especially if the introduction of another using will cause problems with type resolution. More fully qualified identifiers exist so that you can be explicit when you need to be explicit, but if the class's namespace is clear, then the DataGrid version is clearer to many.
I generally use the shortest form available in order to keep the code as clean and readable as possible. That's what using directives are for, after all, and tooltips in the VS editor give you instant detail on the provenance of a type.
I also tend to use a namespace tag for RCWs in a COM interop layer, to call out those variables explicitly in the code (they may need special attention on lifecycle and collection), e.g.
using _Interop = Some.Interop.Namespace;
In terms of performance there is no upside/downside. Everything is resolved at compile time and the generated MSIL is identical whether you use fully-qualified names or not.
The reason why its use is prevalent in the .NET world is because of auto-generated code, such as designer markup. In that case it would be better to fully-qualify names like class names because of possible conflicts with other classes you may have in your code.
If you have a tool like ReSharper, it will actually tell you which fully qualified references are unnecessary (e.g. by graying them out) so you can lop them off. If you frequently cut and paste code across your various code bases, it would be a must to fully qualify them. (Then again, why would you want to cut and paste all the time? It's a bad form of code reuse!)
I don't think there is really a downside, just readability vs. actual time spent coding. In general, if you don't have namespaces with ambiguous objects, I don't think it's really needed. Another thing to consider is the level of use. If you have one method that uses reflection and you are all right with typing System.Reflection 10 times, then it's not a big deal; but if you plan on using a namespace a lot, then I would recommend a using directive.
Depending on your situation, extra qualifiers will generate a warning (if this is what you mean by redundant). If you then treat warnings as errors, that's a pretty serious downside.
I've run into this with GCC for example.
struct A {
	int A::b; // warning: extra qualification 'A::' on member 'b'
};
