Do three dots contain multiple meanings? - go

As I understand it, "..." stands for the length of the array in the snippet below.
var days = [...]string{"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}
On the other hand, I guess that "..." unpacks the slice y into int arguments in the snippet below, though I'm not really sure about this.
x := []int{1,2,3}
y := []int{4,5,6}
x = append(x, y...)
Now, the difference in the two meanings makes it hard for me to understand what "..." is.

You've noted two cases of ... in Go. In fact, there are 3:
[...]int{1,2,3}
Evaluates at compile time to [3]int{1,2,3}
a := make([]int, 500)
SomeVariadicFunc(a...)
Unpacks a as the arguments to a function. This matches the one you missed, the variadic definition:
func SomeVariadicFunc(a ...int)
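A minimal sketch that puts the three uses side by side. The function name sum is an illustrative assumption and does not come from the question.

```go
package main

import "fmt"

// sum is a hypothetical variadic function: inside it, nums is an ordinary []int.
func sum(nums ...int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

func main() {
	// 1. Array literal: the compiler counts the elements, so days is a [3]string.
	days := [...]string{"Sun", "Mon", "Tue"}
	fmt.Println(len(days))

	// 2. Variadic declaration (see sum above), called with individual arguments.
	fmt.Println(sum(1, 2, 3))

	// 3. Passing an existing slice to a variadic parameter.
	x := []int{1, 2, 3}
	y := []int{4, 5, 6}
	x = append(x, y...)
	fmt.Println(sum(x...))
}
```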
Now the further question (from the comments on the OP) -- why can ... work semantically in all these cases? The answer is that in English (and other languages), this is known as an ellipsis. From that article
Ellipsis (plural ellipses; from the Ancient Greek: ἔλλειψις,
élleipsis, "omission" or "falling short") is a series of dots that
usually indicates an intentional omission of a word, sentence, or
whole section from a text without altering its original meaning.
Depending on their context and placement in a sentence, ellipses can
also indicate an unfinished thought, a leading statement, a slight
pause, and a nervous or awkward silence.
In the array case, this matches the "omission of a word, sentence, or whole section" definition. You're omitting the size of the array and letting the compiler figure it out for you.
In the variadic cases, it uses the same meaning, but differently. It also has hints of "an unfinished thought". We often use "..." to mean "and so on." In "I'm going to get bread, eggs, milk...", the "..." signifies "other things similar to bread, eggs, and milk". The use in, e.g., append means "an element of this list, and all the others." This is perhaps the less immediately intuitive usage, but to a native speaker, it makes sense. Perhaps a more "linguistically pure" construction would have been a[0]... or even a[0], a[1], a[2]... but that would cause obvious problems with empty slices (which do work with the ... syntax), not to mention being verbose.
In general, "..." is used to signify "many things", and in this way both uses of it make sense. Many array elements, many slice elements (albeit one is creation, and the other is calling).
I suppose the hidden question is "is this good language design?" On one hand, once you know the syntax, it makes perfect sense to most native speakers of English, so in that sense it's successful. On the other hand, there's value in not overloading symbols in this way. I probably would have chosen a different symbol for array unpacking, but I can't fault them for using a symbol that was probably intuitive to the language designers. Especially since the array version isn't even used terribly often.
As mentioned, this is of no issue to the compiler, because the cases can never overlap. You can never have [...] also mean "unpack this", so there's no symbol conflict.
(Aside: There is another use of it in Go I omitted, because it's not in the language itself, but the build tool. Typing something like go test ./... means "test this package, and all packages in subdirectories of this one". But it should be pretty clear with my explanation of the other uses why it makes sense here.)

Just FYI, myfunc(s...) does not mean "unpack" the input s.
Rather, "bypass" would be a more suitable expression.
If s is a slice s := []string{"a", "b", "c"},
myfunc(s...) is not equivalent to myfunc(s[0], s[1], s[2]).
This simple code shows it.
Also, see the official Go specification (slightly modified for clarity):
Given the function
func Greeting(prefix string, who ...string)
If the final argument is assignable to a slice type []T and is
followed by ..., it is passed unchanged as the value for a ...T
parameter. In this case no new slice is created.
Given the slice s and call
s := []string{"James", "Jasmine"}
Greeting("goodbye:", s...)
within Greeting, who will have the same value as s with the same underlying
array.
If it "unpacks" the input argument, a new slice with a different array should be created (which is not the case).
Note: It's not real "bypass" because the slice itself (not the underlying array) is copied into the function (there is no 'reference' in Go). But, that slice within the function points to the same original underlying array, so it would be a better description than "unpack".
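A small sketch of the difference described above. The function shout is a hypothetical helper invented for this illustration; it writes into its variadic parameter so the sharing becomes visible.

```go
package main

import "fmt"

// shout is a hypothetical variadic function that modifies its argument in place.
func shout(who ...string) {
	for i := range who {
		who[i] += "!"
	}
}

func main() {
	s := []string{"James", "Jasmine"}

	// s... passes the slice itself, so the callee writes into s's backing array.
	shout(s...)
	fmt.Println(s) // [James! Jasmine!]

	// Listing the elements builds a fresh slice, so these writes never reach s.
	shout(s[0], s[1])
	fmt.Println(s) // [James! Jasmine!]
}
```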

Related

Go Ints and Strings are immutable OR mutable?

From what I have read on the internet, ints and strings are immutable by nature.
But the following code shows that after changing the values of these types, the variables still point to the same address. This seems to contradict how these types behave in Python.
Can anyone please explain this to me?
Thanks in advance.
package main
import (
"fmt"
)
func main() {
num := 2
fmt.Println(&num)
num = 3
fmt.Println(&num) // address value of the num does not change
str := "2"
fmt.Println(&str)
str = "34"
fmt.Println(&str) // address value of the str does not change
}
A number is immutable by nature. 7 is 7, and it won't be 8 tomorrow. That doesn't mean that which number is stored in a variable cannot change. Variables are variable. They're mutable containers for values which may be mutable or immutable.
A Go string is immutable by language design; the string type doesn't support any mutating operators (like appending or replacing a character in the middle of the string). But, again, assignment can change which string a variable contains.
In Python (CPython at least), a number is implemented as a kind of object, with an address and fields like any other object. When you do tricks with id(), you're looking at the address of the object "behind" the variable, which may or may not change depending on what you do to it, and whether or not it was originally an interned small integer or something like that.
In Go, an integer is an integer. It's stored as an integer. The address of the variable is the address of the variable. The address of the variable might change if the garbage collector decides to move it (making the numeric value of the address more or less useless), but it doesn't reveal to you any tricks about the implementation of arithmetic operators, because there aren't any.
Strings are more complicated than integers; they are kind of object-ish internally, being a structure containing a pointer and a size. But taking the address of a string variable with &str doesn't tell you anything about that internal structure, and it doesn't tell you whether the Go compiler decided to use a de novo string value for an assignment, or to modify the old one in place (which it could, without breaking any rules, if it could prove that the old one would never be seen again by anything else). All it tells you is the address of str. If you wanted to find out whether that internal pointer changed you would have to use reflection... but there's hardly ever any practical reason to do so.
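As a rough illustration of "the address of the variable is the address of the variable", the sketch below keeps a pointer taken before the assignment and compares it afterwards. The function name is made up for this example.

```go
package main

import "fmt"

// sameAddressAfterAssign reports whether &num is unchanged by an assignment
// and whether the old pointer observes the new value.
func sameAddressAfterAssign() bool {
	num := 2
	before := &num
	num = 3 // overwrites the contents of the variable; the variable stays put
	return before == &num && *before == 3
}

func main() {
	fmt.Println(sameAddressAfterAssign()) // true
}
```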
When you read that a string is immutable, it means you cannot modify it by index, e.g.:
x := "hello"
x[2] = 'r'
// compile error: cannot assign to x[2]
As a comment says, reassigning the whole variable (rather than part of it through an index) has nothing to do with mutability, and you can do it.
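If you do need a changed string, one common approach is to build a new one. The sketch below assumes byte-wise (ASCII) indexing, and replaceByte is a hypothetical helper name.

```go
package main

import "fmt"

// replaceByte returns a copy of s with the byte at index i replaced by b.
// The original string is left untouched.
func replaceByte(s string, i int, b byte) string {
	buf := []byte(s) // copies the string's bytes into a mutable slice
	buf[i] = b
	return string(buf)
}

func main() {
	x := "hello"
	y := replaceByte(x, 2, 'r')
	fmt.Println(x, y) // hello herlo
}
```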

Safety of using reflect.StringHeader in Go?

I have a small function which passes the pointer of Go string data to C (Lua library):
func (L *C.lua_State) pushLString(s string) {
gostr := (*reflect.StringHeader)(unsafe.Pointer(&s))
C.lua_pushlstring(L, (*C.char)(unsafe.Pointer(gostr.Data)), C.ulong(gostr.Len))
// lua_pushlstring copies the given string, not keeping the original pointer.
}
It works in simple tests, but from the documentation it's unclear whether this is safe at all.
According to the Go documentation, the memory of reflect.StringHeader should be pinned for gostr, but StringHeader.Data is already a uintptr, "an integer value with no pointer semantics" - which is itself odd because if it has no pointer semantics, wouldn't the field be completely useless as the memory may be moved right after the value is read? Or is the field treated specially like reflect.Value.Pointer? Or perhaps there is a different way of getting a C pointer from a string?
it's unclear whether this is safe at all.
Tapir Liu (https://twitter.com/TapirLiu/) of Go101 (https://github.com/go101/go101) gives a clue as to the "safety" of reflect.StringHeader in this tweet:
Since Go 1.20, the reflect.StringHeader and reflect.SliceHeader types will be deprecated and not recommended to be used.
Accordingly, two functions, unsafe.StringData and unsafe.SliceData, will be introduced in Go 1.20 to take over the use cases of two old reflect types.
That was initially discussed in CL 401434, then in issue 53003.
The reason for deprecation is that reflect.SliceHeader and reflect.StringHeader are commonly misused.
As well, the types have always been documented as unstable and not to be relied upon.
We can see in Github code search that usage of these types is ubiquitous.
The most common use cases I've seen are:
converting []byte to string:
Equivalent to *(*string)(unsafe.Pointer(&mySlice)), which is never actually officially documented anywhere as something that can be relied upon.
Under the hood, the shape of a string is smaller than that of a slice, so this seems valid per the unsafe rules.
converting string to []byte:
commonly seen as *(*[]byte)(unsafe.Pointer(&string)), which is by-default broken because the Cap field can be past the end of a page boundary (example here, in widely used code) -- this violates unsafe rule.
grabbing the Data pointer field for ffi or some other niche use
converting a slice of one type to a slice of another type
Ian Lance Taylor adds:
One of the main use cases of unsafe.Slice is to create a slice whose backing array is a memory buffer returned from C code or from a call such as syscall.MMap.
I agree that it can be used to (unsafely) convert from a slice of one type to a slice of a different type.
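A hedged sketch of how the Go 1.20 functions can stand in for the header types in the two conversions listed above. It assumes Go 1.20 or newer; the helper names bytesToString and stringToBytes are invented for this example, and neither helper is safe if the caller later mutates the shared memory.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// Assumption: the caller never modifies b afterwards.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes returns a []byte view of s without copying.
// The result must be treated as read-only, because string bytes are immutable.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	s := bytesToString([]byte("hello"))
	b := stringToBytes("world")
	// Unlike the old *(*[]byte)(unsafe.Pointer(&s)) trick, cap(b) is well defined here.
	fmt.Println(s, len(b), cap(b))
}
```

For the cgo case in the question, the same unsafe.StringData pointer could be passed where gostr.Data was used, but that part is not shown here because it needs a C toolchain.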

Copying reference to pointer or by value

I think I understand the answer from here but just in case, I want to explicitly ask about the following (my apologies if you think it is the same question, but to me, it feels different on the concerns):
func f() *int {
d := 6
pD := new(int)
pD = &d // option 1
*pD = d // option 2
return pD
}
The first option, where I just copy the reference as a pointer, is more optimal performance-wise (this is an educated guess, but it seems obvious). I would prefer this method/pattern.
The second option would (shallow) copy (?) instead. What I presume is that this method, because it copies, I have no concerns about GC sweeping the instance of 'd'. I often use this method due to my insecurity (or ignorance as a beginner).
What I am concerned about (or more so, insecure about) is this: in the first method (where the address of 'd' is transferred), will the GC recognize that the 'd' variable is referenced by a pointer, and therefore not sweep it? Is it then safe to use this method instead? I.e., can I safely pass around the pointer 'pD' returned from func 'f()' for the lifetime of the application?
Reference: https://play.golang.org/p/JWNf5yRd_B
There is no better place to look than the official documentation:
func NewFile(fd int, name string) *File {
if fd < 0 {
return nil
}
f := File{fd, name, nil, 0}
return &f
}
Note that, unlike in C, it's perfectly OK to return the address
of a local variable; the storage associated with the variable survives
after the function returns. In fact, taking the address of a composite
literal allocates a fresh instance each time it is evaluated, so we
can combine these last two lines.
(source: "Effective Go")
So the first option (returning a pointer to a local variable) is absolutely safe and even encouraged. By performing escape analysis the compiler can tell that a variable escapes its local scope and allocates it on the heap instead.
In short: No.
First: There are no "references" in Go. Forget about this idea now, otherwise you'll hurt yourself. Really. Thinking about "by reference" is plain wrong.
Second: Performance is totally the same. Forget about this type of nano optimisations now. Especially when dealing with int. If and only if you have a performance problem: Measure, then optimize. It might be intuitively appealing to think "Handing around a tiny pointer of 8 bytes must be much faster than copying structs with 30 or even 100 bytes." It is not, at least it is not that simple.
Third: Just write it as func f() *int { d := 6; return &d }. There is no need to do any fancy dances here.
Fourth: Option 2 makes a "deep copy" of the int. But this might be misleading, as there are no "shallow copies" of an int, so I'm unsure whether I understand what you are asking here. Go has no notion of deep vs. shallow copy. If you copy a pointer value, the pointer value is copied. Remember the first point? There are no references in Go. A pointer value is a value; if you copy it, you have a copy of the pointer value. Such a copy does absolutely nothing to the value pointed to; in particular, it doesn't copy it. This would hint that copies in Go are not "deep". Forget about deep/shallow copy when talking about Go. (Of course you can implement functions which perform a "deep copy" of your custom objects.)
Fifth: Go has a properly working garbage collector. It makes absolutely no difference what you do: While an object is live it won't be collected and once it can be collected it will be. You can pass, return, copy, hand over, take address, dereference pointers or whatever you like, it just does not matter. The GC works properly. (Unless you are deliberately looking for pain and errors by using package unsafe.)
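To make the two options from the question concrete, here is a small sketch. The names byAddress and byCopy are made up for this example; both return a pointer that stays valid after the function returns.

```go
package main

import "fmt"

// byAddress returns the address of a local variable (option 1).
func byAddress() *int {
	d := 6
	return &d
}

// byCopy allocates a new int and copies the value into it (option 2).
func byCopy() *int {
	d := 6
	pD := new(int)
	*pD = d
	return pD
}

func main() {
	a, b := byAddress(), byAddress()
	*a = 7 // each call yields its own variable, so b is unaffected
	fmt.Println(*a, *b, *byCopy()) // 7 6 6
	fmt.Println(a != b)            // true
}
```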

Parsing s-expressions in Go

Here's a link to lis.py if you're unfamiliar: http://norvig.com/lispy.html
I'm trying to implement a tiny lisp interpreter in Go. I've been inspired by Peter Norvig's Lis.py lisp implementation in Python.
My problem is I can't think of a single somewhat efficient way to parse the s-expressions. I had thought of a counter that would increment by 1 when it sees a "(" and that would decrement when it sees a ")". This way when the counter is 0 you know you've got a complete expression.
But the problem with that is that it means you have to loop for every single expression which would make the interpreter incredibly slow for any large program.
Any alternative ideas would be great because I can't think of any better way.
There is an S-expression parser implemented in Go at Rosetta code:
S-expression parser in Go
It might give you an idea of how to attack the problem.
You'd probably need to have an interface "Sexpr" and ensure that your symbol and list data structures satisfy the interface. Then you can use the fact that an S-expression is simply "a single symbol" or "a list of S-expressions".
That is, if the first character is "(", it's not a symbol, but a list, so start accumulating a []Sexpr, reading each contained Sexpr at a time, until you hit a ")" in your input stream. Any contained list will already have had its terminal ")" consumed.
If it's not a "(", you're reading a symbol, so read until you hit a non-symbol-constituent character, unconsume it and return the symbol.
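The recursive reading described above could be sketched roughly as follows. This is not the Rosetta Code implementation; the names Sexpr, Symbol, List, parser and Parse are assumptions made for this example, and numbers, strings and quoting are deliberately ignored. Every character is consumed once, so a whole program is read in a single pass.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// Sexpr is either a Symbol or a List of Sexprs.
type Sexpr interface{ String() string }

type Symbol string
type List []Sexpr

func (s Symbol) String() string { return string(s) }

func (l List) String() string {
	parts := make([]string, len(l))
	for i, e := range l {
		parts[i] = e.String()
	}
	return "(" + strings.Join(parts, " ") + ")"
}

// parser holds the input and the current read position.
type parser struct {
	src []rune
	pos int
}

func (p *parser) skipSpace() {
	for p.pos < len(p.src) && unicode.IsSpace(p.src[p.pos]) {
		p.pos++
	}
}

// read consumes exactly one S-expression starting at p.pos.
func (p *parser) read() (Sexpr, error) {
	p.skipSpace()
	if p.pos >= len(p.src) {
		return nil, errors.New("unexpected end of input")
	}
	if p.src[p.pos] == ')' {
		return nil, errors.New("unexpected ')'")
	}
	if p.src[p.pos] == '(' {
		p.pos++ // consume "("
		list := List{}
		for {
			p.skipSpace()
			if p.pos >= len(p.src) {
				return nil, errors.New("missing ')'")
			}
			if p.src[p.pos] == ')' {
				p.pos++ // consume the list's own ")"
				return list, nil
			}
			elem, err := p.read() // a nested list consumes its own ")"
			if err != nil {
				return nil, err
			}
			list = append(list, elem)
		}
	}
	// Otherwise it is a symbol: read until whitespace or a parenthesis.
	start := p.pos
	for p.pos < len(p.src) && !unicode.IsSpace(p.src[p.pos]) &&
		p.src[p.pos] != '(' && p.src[p.pos] != ')' {
		p.pos++
	}
	return Symbol(string(p.src[start:p.pos])), nil
}

// Parse reads the first S-expression found in s.
func Parse(s string) (Sexpr, error) {
	p := &parser{src: []rune(s)}
	return p.read()
}

func main() {
	expr, err := Parse("(define (sq x) (* x x))")
	fmt.Println(expr, err)
}
```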
In 2022, you can also try eigenhombre/l1, a small Lisp 1 written in Go by John Jacobsen.
It is presented in "(Yet Another) Lisp In Go"
It includes, in commit b3a84e1, a parser and tests for S-expressions.
func TestSexprStrings(T *testing.T) {
var tests = []struct {
input sexpr
want string
}{
{Nil, "()"},
{Num(1), "1"},
{Num("2"), "2"},
{Cons(Num(1), Cons(Num("2"), Nil)), "(1 2)"},
{Cons(Num(1), Cons(Num("2"), Cons(Num(3), Nil))), "(1 2 3)"},
{Cons(
Cons(
Num(3),
Cons(
Num("1309875618907812098"),
Nil)),
Cons(Num(5), Cons(Num("6"), Nil))), "((3 1309875618907812098) 5 6)"},
}

Go receiver methods calling syntax confusion

I was just reading through Effective Go and in the Pointers vs. Values section, near the end it says:
The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers. This is because pointer methods can modify the receiver; invoking them on a copy of the value would cause those modifications to be discarded.
To test it, I wrote this:
package main
import (
"fmt"
"reflect"
)
type age int
func (a age) String() string {
return fmt.Sprintf("%d year(s) old", int(a))
}
func (a *age) Set(newAge int) {
if newAge >= 0 {
*a = age(newAge)
}
}
func main() {
var vAge age = 5
pAge := new(age)
fmt.Printf("TypeOf =>\n\tvAge: %v\n\tpAge: %v\n", reflect.TypeOf(vAge),
reflect.TypeOf(pAge))
fmt.Printf("vAge.String(): %v\n", vAge.String())
fmt.Printf("vAge.Set(10)\n")
vAge.Set(10)
fmt.Printf("vAge.String(): %v\n", vAge.String())
fmt.Printf("pAge.String(): %v\n", pAge.String())
fmt.Printf("pAge.Set(10)\n")
pAge.Set(10)
fmt.Printf("pAge.String(): %v\n", pAge.String())
}
And it compiles, even though the document says it shouldn't since the pointer method Set() should not be invocable through the value var vAge. Am I doing something wrong here?
That's valid because vAge is addressable. See the last paragraph in Calls under the language spec:
A method call x.m() is valid if the method set of (the type of) x
contains m and the argument list can be assigned to the parameter list
of m. If x is addressable and &x's method set contains m, x.m() is
shorthand for (&x).m().
vAge is not considered as only a "value variable", because it's a known location in memory that stores a value of type age. Looking at vAge only as its value, vAge.Set(10) is not valid as an expression on its own, but because vAge is addressable, the spec declares that it's okay to treat the expression as shorthand for "get the address of vAge, and call Set on that" at compile-time, when we will be able to verify that Set is part of the method set for either age or *age. You're basically allowing the compiler to do a textual expansion on the original expression if it determines that it's necessary and possible.
Meanwhile, the compiler will allow you to call age(23).String() but not age(23).Set(10). In this case, we're working with a non-addressable value of type age. Since it's not valid to say &age(23), it can't be valid to say (&age(23)).Set(10); the compiler won't do that expansion.
Looking at the Effective Go example, you're not directly calling b.Write() at the scope where we know b's full type. You're instead making a temporary copy of b and trying to pass it off as a value of the interface type io.Writer. The problem is that the implementation of Fprintf doesn't know anything about the object being passed in except that it has promised it knows how to receive Write(), so it doesn't know to take a ByteSlice and turn it into a *ByteSlice before calling the function. The decision of whether to address b has to happen at compile time, and Fprintf was compiled with the precondition that its first argument would know how to receive Write() without being referenced.
You may think that if the system knows how to take an age pointer and convert it to an age value, it should be able to do the reverse; it doesn't really make sense to be able to, though. In the Effective Go example, if you were to pass b instead of &b, you'd modify a slice that would no longer exist after Fprintf returns, which is hardly useful. In my age example above, it literally makes no sense to take the value 23 and overwrite it with the value 10. In the first case, it makes sense for the compiler to stop and ask the programmer what she really meant to do when handing b off. In the latter case, it of course makes sense for the compiler to refuse to modify a constant value.
Furthermore, I don't think the system is dynamically extending age's method set to *age; my wild guess is that pointer types are statically given a method for each of the base type's methods, which just dereferences the pointer and calls the base's method. It's safe to do this automatically, as nothing in a receive-by-value method can change the pointer anyway. In the other direction, it doesn't always make sense to extend a set of methods that are asking to modify data by wrapping them in a way that the data they modify disappears shortly thereafter. There are definitely cases where it makes sense to do this, but this needs to be decided explicitly by the programmer, and it makes sense for the compiler to stop and ask for such.
tl;dr I think that the paragraph in Effective Go could use a bit of rewording (although I'm probably too long-winded to take the job), but it's correct. A pointer of type *X effectively has access to all of X's methods, but 'X' does not have access to *X's. Therefore, when determining whether an object can fulfill a given interface, *X is allowed to fulfill any interface X can, but the converse is not true. Furthermore, even though a variable of type X in scope is known to be addressable at compile-time--so the compiler can convert it to a *X--it will refuse to do so for the purposes of interface fulfillment because doing so may not make sense.
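A compact sketch of the rules summarized above, reusing the age type from the question. The interface name setter is a hypothetical addition for illustration only.

```go
package main

import "fmt"

type age int

func (a age) String() string { return fmt.Sprintf("%d year(s) old", int(a)) }

func (a *age) Set(newAge int) {
	if newAge >= 0 {
		*a = age(newAge)
	}
}

// setter is a hypothetical interface used only for this illustration.
type setter interface{ Set(int) }

func main() {
	var vAge age = 5
	vAge.Set(10) // vAge is addressable, so this is shorthand for (&vAge).Set(10)
	fmt.Println(vAge) // 10 year(s) old

	var s setter = &vAge // *age has Set in its method set
	s.Set(20)
	fmt.Println(vAge) // 20 year(s) old

	// var bad setter = vAge // does not compile: age's method set lacks Set
	// age(23).Set(10)       // does not compile: age(23) is not addressable
}
```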

Resources