Any documentation/article about the `&MyType{}` pattern in golang? - go

In most Go codebases I look at, people use types by reference:
type Foo struct {}
myFoo := &Foo{}
I usually take the opposite approach: I pass everything by value and only pass by reference when I want to do something destructive to the value. That lets me easily spot destructive functions (which are fairly rare).
But seeing how commonplace references are, I guess it's not just a matter of taste. I get that there's a cost to duplicating values, but is it that much of a game changer? Or are there other reasons why references are preferred?
It would be great if someone could point me to an article or documentation about why references are preferred.
Thanks!

Go is pass by value. I use pointers like in your example as much as possible so that I don't have to think about whether I'm accidentally duplicating objects. Go is mostly used for networking and scalable services, where performance is a priority. The obvious downside, as you say, is that any function receiving the pointer can modify the object it points to.
Otherwise there is no rule as to which you should use. Both are quite ok.
Also, somewhat related to the question, see the "Pointers vs. Values" section of Effective Go in the Go docs.
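To make the trade-off concrete, here is a minimal sketch of the two styles. The type Foo comes from the question; the Count field and the function names IncrementCopy and IncrementInPlace are made up for illustration.

```go
package main

import "fmt"

// Foo mirrors the type from the question; the Count field is hypothetical.
type Foo struct {
	Count int
}

// IncrementCopy receives its own copy of the caller's Foo.
// The caller's value is untouched; the updated copy is returned.
func IncrementCopy(f Foo) Foo {
	f.Count++
	return f
}

// IncrementInPlace receives a pointer, so the change is visible to the caller.
func IncrementInPlace(f *Foo) {
	f.Count++
}

func main() {
	a := Foo{Count: 1}
	b := IncrementCopy(a)
	fmt.Println(a.Count, b.Count) // 1 2

	p := &Foo{Count: 1}
	IncrementInPlace(p)
	fmt.Println(p.Count) // 2
}
```

With the value style, the signature alone tells you the argument cannot be changed. With the pointer style, you avoid the copy but have to read the function to know whether it mutates.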

Related

When to use references versus types versus boxes and slices versus vectors as arguments and return types?

I've been working with Rust the past few days to build a new library (related to abstract algebra) and I'm struggling with some of the best practices of the language. For example, I implemented a longest common subsequence function taking &[&T] for the sequences. I figured this was Rust convention, as it avoided copying the data (T, which may not be easily copy-able, or may be big). When changing my algorithm to work with simpler &[T]'s, which I needed elsewhere in my code, I was forced to put the Copy type constraint in, since it needed to copy the T's and not just copy a reference.
So my higher-level question is: what are the best-practices for passing data between threads and structures in long-running processes, such as a server that responds to queries requiring big data crunching? Any specificity at all would be extremely helpful as I've found very little. Do you generally want to pass parameters by reference? Do you generally want to avoid returning references as I read in the Rust book? Is it better to work with &[&T] or &[T] or Vec<T> or Vec<&T>, and why? Is it better to return a Box<T> or a T? I realize the word "better" here is considerably ill-defined, but hope you'll understand my meaning -- what pitfalls should I consider when defining functions and structures to avoid realizing my stupidity later and having to refactor everything?
Perhaps another way to put it is, what "algorithm" should my brain follow to determine where I should use references vs. boxes vs. plain types, as well as slices vs. arrays vs. vectors? I hesitate to start using references and Box<T> returns everywhere, as I think that'd get me a sort of "Java in Rust" effect, and that's not what I'm going for!

(Idiomatic?) Difference between new(T) and &T{...}?

I started kidding around with Go and am a little irritated by the new function. It seems to be quite limited, especially when considering structures with anonymous fields or inline initialisations. So I read through the spec and stumbled over the following paragraph:
Calling the built-in function new or taking the address of a composite literal allocates storage for a variable at run time.
So I have the suspicion that new(T) and &T{} will behave in the exact same way, is that correct? And if that is correct, in what situation should new be used?
Yes, you are correct. new is not that useful with structs, but it is with basic types. new(int) will get you a pointer to a zero-valued int, and you can't do &int{} or similar.
In any case, in my experience, you rarely want that, so new is rarely used. You can just declare a plain int and pass around a pointer to it. In fact, doing this is probably better because it frees you from thinking about allocating on the stack vs. on the heap, as the compiler will decide for you.
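A small sketch of the equivalence described above. Point, ZeroPoints and IntPointer are hypothetical names chosen for the example, not anything from the spec.

```go
package main

import "fmt"

// Point is a hypothetical struct used only for illustration.
type Point struct {
	X, Y int
}

// ZeroPoints returns two pointers to zero-valued Points:
// one allocated with new, one with the address of a composite literal.
func ZeroPoints() (*Point, *Point) {
	return new(Point), &Point{}
}

// IntPointer returns a pointer to an int holding v by taking the
// address of the parameter, which is already a copy of the argument.
func IntPointer(v int) *int {
	return &v
}

func main() {
	a, b := ZeroPoints()
	fmt.Println(*a == *b) // true: both point to zero-valued Points

	// A composite literal can also set fields.
	c := &Point{X: 1, Y: 2}
	fmt.Println(c.X, c.Y) // 1 2

	i := new(int)       // pointer to a zero-valued int
	j := IntPointer(42) // pointer to an int with a chosen value
	fmt.Println(*i, *j) // 0 42
}
```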

Should private and protected variables, methods, and classes be commented?

When creating a private or protected variable, method, class, etc., should it be commented with the documentation comment?
Yes! The comments are to help any developer - yourself included - when reviewing, maintaining or extending the code in future. Whether it's public/private shouldn't be an influencing factor, quite simply if you think something isn't clear enough without a comment, put one in.
(Of course the best documentation is clear self-documenting code in the first place)
Some people will no doubt tell you that nothing needs to be commented (and technically they are right, in that comments have no effect on output). However, it comes down to coding style, as your tag suggests. I personally always comment all variables in addition to giving them a descriptive name. Remember that other people may want to work with your source, or you might want to in a year's time, in which case it's worth the few seconds to document it while you still know what it does.
Definitely yes. When for example you find a bug in your code after like three months, with commenting it will be easier to recall what this code was supposed to do.
Commenting individual variables is occasionally helpful, but more often than not variables will have logical groupings that will be expected to uphold certain invariants. A comment describing how the group as a whole is supposed to behave will often be more useful than comments describing individual variables.
For example, an EditablePolygon class in Java might contain four essential fields:
int[] xCoords;
int[] yCoords;
int numCoords;
int sharedPortion;
and expect to uphold the invariants that both arrays will always be the same length, that this length will be >= numCoords, and that all coordinates of interest will be in array slots below numCoords. It may further specify that there may exist multiple EditablePolygon objects sharing the same arrays, provided that all but one such object has a sharedPortion greater than numCoords or equal to the array length, and that one object's sharedPortion is no less than the numCoords value of any of the others [making a clone of a shape require a defensive copy unless a change is requested to part of the original which was shared with the clone, or to any part of the clone (which is entirely shared with the original)].
Note that the most important things for the comments to document are (1) the array lengths may exceed the number of points, and (2) certain portions of the array may be shared. The first may be somewhat obvious from the code, but the second will likely be far less obvious. The field sharedPortion does have some meaning in isolation, but its meaning and purpose can really only be understood in relation to the other variables.
It's good practice to document methods and classes. Moreover, Javadoc for public methods deserves more emphasis, since it acts as a reference manual for external callers. Similarly, Javadoc can be beneficial for public variables, though I personally am not in favor of commenting variables.

Should data be descriptive of itself? In what cases should it preferably be or not be?

I am not sure how to put this easily into a simple question.
I am just going to use an example.
Say I am sending some parameter to a web browser from a server. The Javascript will know what to do with it. Say it was a setting for some page element that could have 4 different values. I could make it be 0-3, or I could make it be "bright", "dark", "transparent", "none". Do you see what I mean? In one case the data is descriptive.
Now step outside of the realm of web development. In fact, step away from any facet of programming that would NOT require one method or the other, and think of some that would prefer one over the other, meaning it would be beneficial to the overall goals if it was done in a descriptive manner, or beneficial if it was done in a cryptic manner.
Can you think of some examples where you would want one over the other?
PS: I may need help with the tags on this one guys.
The benefit of the numeric variant is smaller data size. That can be useful if you are communicating a lot of data or communicating over a bandwidth-restricted channel. Comparing numbers is also generally faster than comparing strings.
The alternative with meaningful names is beneficial when you need easy extensibility and maintainability. You can see what the value means without using any other translation table. Also you can enable others to add new values by defining some naming rules.
The benefits of using one strategy over the other are quite similar to the benefits of strong vs. weak typing. Values like "bright", "dark" etc. are strongly typed, while 0, 1, 2 are weakly typed.
The most important benefits of using strongly typed data are 1) that it is easy for other people to know what the value means and how to use it and 2) that you will get a meaningful error early if you use an illegal value.
The benefit of weak typing is that you may introduce new values without having to change intermediate modules. I.e. you could introduce "4" without changing intermediate modules that don't really have to understand what the value means.
I would definitely go for "bright", "dark" etc.
NB! Some would probably argue that "bright" is a string and so is weakly typed in the same way as "1", but this depends on the perspective.
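The two variants don't have to be mutually exclusive. As a rough sketch (in Go, with the made-up names Appearance, appearanceNames and ParseAppearance), a small numeric type can be stored and compared compactly in code while still being printed and parsed as the descriptive names from the question:

```go
package main

import "fmt"

// Appearance is a hypothetical setting with the four values from the question.
type Appearance uint8

const (
	Bright Appearance = iota // 0
	Dark                     // 1
	Transparent              // 2
	None                     // 3
)

// appearanceNames maps each numeric value to its descriptive name.
var appearanceNames = [...]string{"bright", "dark", "transparent", "none"}

// String returns the descriptive name, or a placeholder for unknown values.
func (a Appearance) String() string {
	if int(a) < len(appearanceNames) {
		return appearanceNames[a]
	}
	return fmt.Sprintf("Appearance(%d)", uint8(a))
}

// ParseAppearance converts a descriptive name back to its numeric value
// and reports an error for names it does not know.
func ParseAppearance(s string) (Appearance, error) {
	for i, name := range appearanceNames {
		if name == s {
			return Appearance(i), nil
		}
	}
	return 0, fmt.Errorf("unknown appearance %q", s)
}

func main() {
	fmt.Println(Dark, uint8(Dark)) // dark 1

	a, err := ParseAppearance("transparent")
	fmt.Println(a, err) // transparent <nil>
}
```

An illegal name is rejected at the parsing boundary, which is the "meaningful error early" benefit mentioned above, while the value inside the program stays a single byte.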

How do you decide which parts of the code shall be consolidated/refactored next?

Do you use any metrics to make a decision which parts of the code (classes, modules, libraries) shall be consolidated or refactored next?
I don't use any metrics which can be calculated automatically.
I use code smells and similar heuristics to detect bad code, and then I fix it as soon as I notice it. I don't have any checklist for finding problems. Mostly it's a gut feeling that "this code looks messy", followed by reasoning about why it is messy and figuring out a solution. Simple refactorings like giving a more descriptive name to a variable or extracting a method take only a few seconds. More intensive refactorings, such as extracting a class, might take up to an hour or two (in which case I might leave a TODO comment and refactor it later).
One important heuristic that I use is the Single Responsibility Principle. It makes the classes nicely cohesive. In some cases I use the size of the class in lines of code as a cue to look more carefully at whether a class has multiple responsibilities. In my current project I've noticed that when writing Java, most of the classes are less than 100 lines long, and often when the size approaches 200 lines, the class does many unrelated things and it is possible to split it up to get more focused, cohesive classes.
Each time I need to add new functionality I search for existing code that does something similar. Once I find such code I consider refactoring it to solve both the original task and the new one. Of course, I don't decide to refactor each time; most often I reuse the code as it is.
I generally only refactor "on-demand", i.e. if I see a concrete, immediate problem with the code.
Often when I need to implement a new feature or fix a bug, I find that the current structure of the code makes this difficult, such as:
too many places to change because of copy&paste
unsuitable data structures
things hardcoded that need to change
methods/classes too big to understand
Then I will refactor.
I sometimes see code that seems problematic and which I'd like to change, but I resist the urge if the area is not currently being worked on.
I see refactoring as a balance between future-proofing the code, and doing things which do not really generate any immediate value. Therefore I would not normally refactor unless I see a concrete need.
I'd like to hear about experiences from people who refactor as a matter of routine. How do you stop yourself from polishing so much you lose time for important features?
We use cyclomatic complexity to identify the code that needs to be refactored next.
I use Source Monitor and routinely refactor methods when the complexity metric goes above around 8.0.
