Are Go ints and strings immutable or mutable?

From what I have read online, ints and strings are immutable by nature.
But the following code shows that after changing the values of variables of these types, they still point to the same address. This contradicts my understanding of how these types behave in Python.
Can anyone please explain this to me?
Thanks in advance.
package main

import (
	"fmt"
)

func main() {
	num := 2
	fmt.Println(&num)
	num = 3
	fmt.Println(&num) // address value of the num does not change
	str := "2"
	fmt.Println(&str)
	str = "34"
	fmt.Println(&str) // address value of the str does not change
}

A number is immutable by nature. 7 is 7, and it won't be 8 tomorrow. That doesn't mean that which number is stored in a variable cannot change. Variables are variable. They're mutable containers for values which may be mutable or immutable.
A Go string is immutable by language design; the string type doesn't support any mutating operators (like appending or replacing a character in the middle of the string). But, again, assignment can change which string a variable contains.
In Python (CPython at least), a number is implemented as a kind of object, with an address and fields like any other object. When you do tricks with id(), you're looking at the address of the object "behind" the variable, which may or may not change depending on what you do to it, and whether or not it was originally an interned small integer or something like that.
In Go, an integer is an integer. It's stored as an integer. The address of the variable is the address of the variable. The address of the variable might change if the garbage collector decides to move it (making the numeric value of the address more or less useless), but it doesn't reveal to you any tricks about the implementation of arithmetic operators, because there aren't any.
Strings are more complicated than integers; they are kind of object-ish internally, being a structure containing a pointer and a size. But taking the address of a string variable with &str doesn't tell you anything about that internal structure, and it doesn't tell you whether the Go compiler decided to use a de novo string value for an assignment, or to modify the old one in place (which it could, without breaking any rules, if it could prove that the old one would never be seen again by anything else). All it tells you is the address of str. If you wanted to find out whether that internal pointer changed you would have to use reflection... but there's hardly ever any practical reason to do so.
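To make the distinction concrete, here is a minimal sketch that compares both the address of the variable and the string's internal data pointer across an assignment. It assumes Go 1.20 or later, where unsafe.StringData is available (so reflection is not strictly required); the helper name observe is made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// observe is a hypothetical helper: it reports whether the variable's address
// and the string's internal data pointer stay the same across an assignment.
func observe() (sameVar, sameData bool) {
	str := "2"
	varBefore := &str
	dataBefore := unsafe.StringData(str)

	str = "34" // assignment replaces the value stored in the variable

	varAfter := &str
	dataAfter := unsafe.StringData(str)
	return varBefore == varAfter, dataBefore == dataAfter
}

func main() {
	sameVar, sameData := observe()
	fmt.Println("variable address unchanged:", sameVar)
	fmt.Println("string data pointer unchanged:", sameData)
}
```

The variable stays where it is, while the pointer inside the string value now refers to the bytes of a different literal. Neither observation says anything about whether strings are mutable.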

When you read that a string is immutable, it means you cannot modify it through an index. For example:
x := "hello"
x[2] = 'r'
// compile error: cannot assign to x[2]
As a comment says, assigning a new value to the whole variable (rather than changing part of the string through an index) has nothing to do with mutability, and it is always allowed.
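If you do need a string that differs in one position, the usual route is to build a new string. The sketch below does that through a []byte copy; replaceAt is a hypothetical helper name, and it works on bytes, so it is only safe for single-byte (ASCII) characters.

```go
package main

import "fmt"

// replaceAt is a hypothetical helper that returns a copy of s with the byte
// at index i replaced by c. The original string is never modified.
func replaceAt(s string, i int, c byte) string {
	b := []byte(s) // copies the string's bytes into a new, mutable slice
	b[i] = c
	return string(b) // copies again into a new string value
}

func main() {
	x := "hello"
	y := replaceAt(x, 2, 'r')
	fmt.Println(x) // hello
	fmt.Println(y) // herlo

	x = y          // reassigning the variable is always allowed
	fmt.Println(x) // herlo
}
```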

Related

Create repeatable byte array of Go struct which contains a pointer

I want to be able to create repeatable byte arrays of structs in Go so I can hash them and then verify that hash at some point.
I am currently following this simple approach to create a byte array from a struct with:
[]byte(fmt.Sprintf("%v", struct))...)
This works perfectly until my struct holds an embedded struct with a pointer, for example:
type testEmbeddedPointerStruct struct {
T *testSimpleStruct
}
In my tests this creates a different byte array each time. I think this may be because the address held in the pointer changes from run to run?
Is there a way of creating a repeatable byte array digest even if the struct holds a pointer?
Thanks
... I think it may be because with the pointer the address in memory changes ...
That's the obvious candidate, yes. You have chosen a very simple encoding, in which pointer fields are encoded as a hexadecimal representation of the pointer, rather than any value found at the target of the pointer.
Is there a way of creating a repeatable byte array digest even if the struct holds a pointer?
You may need to define more precisely what "repeat of same value" means to you [1], but in general, this is really an encoding problem. The encoding/gob package could perhaps give you an encoding you would like here, though note that unlike %v formatting, it encodes only exported struct fields and keeps the various names. It has the effect of "flattening" any pointer data, but won't work for cyclic data structures.
(You can write your own simpler encoder that simply follows pointers when it encounters them, and otherwise works like %v.)
[1] For example, suppose you have:
type T struct {
I int
P *Sub
}
type Sub struct {
J int
}
// ...
s2 := Sub{2}
s3 := Sub{3}
t1 := T{1, &s2}
t2 := T{1, &s3}
Obviously printing t1 and t2 (while flattening away pointers) produces an encoded version of {1 2} and {1 3} respectively, so these are not the same value. However, if we change s3 itself to:
s3 := Sub{2}
we now have two different entities, t1 and t2, that both "contain as a value" {1 2}. In Go, t1 and t2 are different because their pointers differ. Their values, in other words, are different. In the proposed encoding, t1 and t2 both encode the same, so they are the same value.
This is the kind of thing that occurs with pointers: the underlying data may be the same—the "same value" in one sense—but the objects holding those values may differ in location, so that if one object is modified, the other is not. If you run such objects through an encode-then-decode process that makes them share the pointed-to value, you may give up the ability to modify one object without modifying the other, or to distinguish between them.
Since you get to choose how to do the encoding, you get to decide exactly what you want to have happen here. But you must make that choice on purpose, not just accidentally.
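As a rough illustration of an encoding that follows pointers, the sketch below hashes the JSON encoding of a struct. Note that this swaps in encoding/json rather than encoding/gob or a hand-written encoder; like gob it only sees exported fields and cannot handle cyclic data. The helper name digest is an assumption made for this example, and T and Sub are borrowed from the footnote.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

type Sub struct {
	J int
}

type T struct {
	I int
	P *Sub
}

// digest is a hypothetical helper: it hashes the JSON encoding of v.
// encoding/json follows pointers, so the pointed-to value is encoded,
// not the address. Only exported fields are included.
func digest(v any) ([32]byte, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return [32]byte{}, err
	}
	return sha256.Sum256(b), nil
}

func main() {
	t1 := T{1, &Sub{2}}
	t2 := T{1, &Sub{2}} // different pointer, same pointed-to data
	t3 := T{1, &Sub{3}}

	d1, _ := digest(t1)
	d2, _ := digest(t2)
	d3, _ := digest(t3)

	fmt.Println(d1 == d2) // true: equal once the pointer is flattened away
	fmt.Println(d1 == d3) // false: the pointed-to data differs
}
```

This deliberately makes t1 and t2 "the same value", which is exactly the choice described above: the digest no longer distinguishes two objects that merely hold equal data.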

What does Fiddle.dlwrap and Fiddle.dlunwrap do in Ruby?

I'm trying to understand how Ruby's Fiddle library works. I understand mostly how it interacts with libffi, but there's just one thing that's still baffling me: what on earth do Fiddle.dlwrap and Fiddle.dlunwrap do exactly?
The documentation just says
dlunwrap(addr)
Returns the hexadecimal representation of a memory pointer address addr
dlwrap(val)
Returns a memory pointer of a function’s hexadecimal address location val
(from ruby-doc.org)
I've tried experimenting with them passing in various different types of objects and strings. The methods always return values no matter what you pass in (not just hexadecimal strings of pointers to C functions). It seems that dlwrap is merely returning the memory address of the passed-in object, but two things don't make too much sense in this case:
If I pass in a short string, and create a pointer using the result as the address, the memory it points to is not the string.
If I pass in a number, it just returns the object ID of the number.
If anyone has some secret knowledge of the inner workings of Fiddle, and is willing to share, please help :)

Do three dots contain multiple meanings?

As I understand it, "..." stands for the length of the array in the snippet below.
days := [...]string{"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}
On the other hand, my guess is that "..." unpacks the slice y into int arguments in the snippet below. I'm not really sure about this.
x := []int{1,2,3}
y := []int{4,5,6}
x = append(x, y...)
Now, the difference in the two meanings makes it hard for me to understand what "..." is.
You've noted two cases of ... in Go. In fact, there are 3:
[...]int{1,2,3}
Evaluates at compile time to [3]int{1,2,3}
a := make([]int, 500)
SomeVariadicFunc(a...)
Unpacks a as the arguments to a function. This matches the one you missed, the variadic definition:
func SomeVariadicFunc(a ...int)
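The three forms can be seen side by side in a small runnable sketch. The function name sum is made up for this illustration.

```go
package main

import "fmt"

// sum is a hypothetical variadic function: inside it, nums is a []int.
func sum(nums ...int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

func main() {
	// 1. Array literal: the compiler counts the elements, so the type is [3]int.
	arr := [...]int{1, 2, 3}
	fmt.Println(len(arr)) // 3

	// 2. Variadic definition, called with individual arguments.
	fmt.Println(sum(1, 2, 3)) // 6

	// 3. Passing an existing slice to the variadic parameter.
	s := []int{4, 5, 6}
	fmt.Println(sum(s...)) // 15

	// An empty slice works too.
	fmt.Println(sum([]int{}...)) // 0
}
```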
Now the further question (from the comments on the OP) -- why can ... work semantically in all these cases? The answer is that in English (and other languages), this is known as an ellipsis. From the Wikipedia article on the ellipsis:
Ellipsis (plural ellipses; from the Ancient Greek: ἔλλειψις,
élleipsis, "omission" or "falling short") is a series of dots that
usually indicates an intentional omission of a word, sentence, or
whole section from a text without altering its original meaning.
Depending on their context and placement in a sentence, ellipses can
also indicate an unfinished thought, a leading statement, a slight
pause, and a nervous or awkward silence.
In the array case, this matches the "omission of a word, sentence, or whole section" definition. You're omitting the size of the array and letting the compiler figure it out for you.
In the variadic cases, it uses the same meaning, but differently. It also has hints of "an unfinished thought". We often use "..." to mean "and so on." "I'm going to get bread, eggs, milk..." in this case "..." signifies "other things similar to breads, eggs, and milk". The use in, e.g., append means "an element of this list, and all the others." This is perhaps the less immediately intuitive usage, but to a native speaker, it makes sense. Perhaps a more "linguistically pure" construction would have been a[0]... or even a[0], a[1], a[2]... but that would cause obvious problems with empty slices (which do work with the ... syntax), not to mention being verbose.
In general, "..." is used to signify "many things", and in this way both uses of it make sense. Many array elements, many slice elements (albeit one is creation, and the other is calling).
I suppose the hidden question is "is this good language design?" On one hand, once you know the syntax, it makes perfect sense to most native speakers of English, so in that sense it's successful. On the other hand, there's value in not overloading symbols in this way. I probably would have chosen a different symbol for array unpacking, but I can't fault them for using a symbol that was probably intuitive to the language designers. Especially since the array version isn't even used terribly often.
As mentioned, this is of no issue to the compiler, because the cases can never overlap. You can never have [...] also mean "unpack this", so there's no symbol conflict.
(Aside: There is another use of it in Go I omitted, because it's not in the language itself, but the build tool. Typing something like go test ./... means "test this package, and all packages in subdirectories of this one". But it should be pretty clear with my explanation of the other uses why it makes sense here.)
Just FYI, myfunc(s...) does not mean "unpack" the input s.
Rather, "bypass" would be a more suitable expression.
If s is a slice s := []string{"a", "b", "c"},
myfunc(s...) is not equivalent to myfunc(s[0], s[1], s[2]).
This simple code shows it.
Also, see the official Go specification (slightly modified for clarity):
Given the function
func Greeting(prefix string, who ...string)
If the final argument is assignable to a slice type []T and is
followed by ..., it is passed unchanged as the value for a ...T
parameter. In this case no new slice is created.
Given the slice s and call
s := []string{"James", "Jasmine"}
Greeting("goodbye:", s...)
within Greeting, who will have the same value as s with the same underlying
array.
If it "unpacks" the input argument, a new slice with a different array should be created (which is not the case).
Note: It's not real "bypass" because the slice itself (not the underlying array) is copied into the function (there is no 'reference' in Go). But, that slice within the function points to the same original underlying array, so it would be a better description than "unpack".
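The difference between passing s... and passing the elements one by one can be observed by letting the callee modify its parameter. This is a small sketch; the function name shout is hypothetical.

```go
package main

import "fmt"

// shout is a hypothetical variadic function that modifies its parameter in place.
func shout(who ...string) {
	for i := range who {
		who[i] += "!"
	}
}

func main() {
	s := []string{"James", "Jasmine"}

	shout(s...)    // who shares s's underlying array
	fmt.Println(s) // [James! Jasmine!]

	shout(s[0], s[1]) // a new slice is created for this call
	fmt.Println(s)    // [James! Jasmine!] (the second call did not touch s)
}
```

If s... really expanded into separate arguments, the first call could not have changed s either.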

Does the actual value of an enum class enumeration remain constant/invariant?

Given code for an incomplete server like:
enum class Command : uint32_t {
LOGIN,
MESSAGE,
JOIN_CHANNEL,
PART_CHANNEL,
INVALID
};
Can I expect that converting Command::LOGIN to an integer will always give the same value?
Across compilers?
Across compiler versions?
If I add another enumeration?
If I remove an enumeration?
Converting Command::LOGIN would look something like this:
uint32_t number = static_cast<uint32_t>(Command::LOGIN);
Some extra information on what I am doing here. This enumeration is fed onto the wire by converting it to an integer and sending it along to the server/client. I do not particularly care what the number is, as long as it always stays the same. If it will not stay the same, then obviously I will have to provide my own numbers in the usual way.
Now my sneaking suspicion is that it will change depending on what compiler was used to compile the code, but I would like to know for sure.
Bonus question: How does the compiler/language determine what number to use for Command::LOGIN?
Before submitting this question, I have noticed some changes from say 3137527848 to 0 and back, so it is obviously not valid to rely on it not changing. I am still curious about how this number is determined, and how or why that number is changing.
From the C++11 Standard (or rather, n3485):
[dcl.enum]/2
If the first enumerator has no initializer, the value of the corresponding constant is zero. An enumerator-definition without an initializer gives the enumerator the value obtained by increasing the value of the previous enumerator by one.
Additionally, [expr.static.cast]/9
A value of a scoped enumeration type can be explicitly converted to an integral type. The value is unchanged if the original value can be represented by the specified type.
I think it's obvious that the values of the enumerators can be represented by uint32_t; if they couldn't be, [dcl.enum]/5 says "if the initializing value of an enumerator cannot be represented by the underlying type, the program is ill-formed."
So as long as you use the underlying type for the conversion (either explicitly or via std::underlying_type<Command>::type), the values of those enumerators are fixed as long as you don't add any enumerators before them (in the same enumeration) or alter their order.
As Nicolas Louis Guillemo pointed out, be aware of possible different endianness when transferring the value.
If you assign explicit integer values to your enum constants then you are guaranteed to always have the same value when converting to the integer type.
Just do something like the following:
enum class Command : uint32_t {
LOGIN = 12,
MESSAGE = 46,
JOIN_CHANNEL = 5,
PART_CHANNEL = 0,
INVALID = 42
};
If you don't specify any values explicitly, the values are set implicitly, starting from zero and increasing by one with each move down the list.
Quoting from draft n3485:
[dcl.enum] paragraph 2
The enumeration type declared with an enum-key of only enum is an
unscoped enumeration, and its enumerators are unscoped enumerators.
The enum-keys enum class and enum struct are semantically equivalent;
an enumeration type declared with one of these is a scoped
enumeration, and its enumerators are scoped enumerators. [...] The
identifiers in an enumerator-list are declared as constants, and can
appear wherever constants are required. An enumerator-definition with
= gives the associated enumerator the value indicated by the constant-expression. If the first enumerator has no initializer, the
value of the corresponding constant is zero. An
enumerator-definition without an initializer gives the enumerator the
value obtained by increasing the value of the previous enumerator by
one.
The drawback of relying on this is that if the list order somehow changes in the future, your code might silently break, so I would advise you to be explicit.
Command::LOGIN will always be 0 as long as it's the first enum in the list. Just be careful with the rest of the enums, because they will have different binary representations based on if the computer is using big endian or little endian.

Go receiver methods calling syntax confusion

I was just reading through Effective Go and in the Pointers vs. Values section, near the end it says:
The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers. This is because pointer methods can modify the receiver; invoking them on a copy of the value would cause those modifications to be discarded.
To test it, I wrote this:
package main

import (
	"fmt"
	"reflect"
)

type age int

func (a age) String() string {
	return fmt.Sprintf("%d year(s) old", int(a))
}

func (a *age) Set(newAge int) {
	if newAge >= 0 {
		*a = age(newAge)
	}
}

func main() {
	var vAge age = 5
	pAge := new(age)
	fmt.Printf("TypeOf =>\n\tvAge: %v\n\tpAge: %v\n", reflect.TypeOf(vAge),
		reflect.TypeOf(pAge))
	fmt.Printf("vAge.String(): %v\n", vAge.String())
	fmt.Printf("vAge.Set(10)\n")
	vAge.Set(10)
	fmt.Printf("vAge.String(): %v\n", vAge.String())
	fmt.Printf("pAge.String(): %v\n", pAge.String())
	fmt.Printf("pAge.Set(10)\n")
	pAge.Set(10)
	fmt.Printf("pAge.String(): %v\n", pAge.String())
}
And it compiles, even though the document says it shouldn't since the pointer method Set() should not be invocable through the value var vAge. Am I doing something wrong here?
That's valid because vAge is addressable. See the last paragraph in Calls under the language spec:
A method call x.m() is valid if the method set of (the type of) x
contains m and the argument list can be assigned to the parameter list
of m. If x is addressable and &x's method set contains m, x.m() is
shorthand for (&x).m().
vAge is not considered as only a "value variable", because it's a known location in memory that stores a value of type age. Looking at vAge only as its value, vAge.Set(10) is not valid as an expression on its own, but because vAge is addressable, the spec declares that it's okay to treat the expression as shorthand for "get the address of vAge, and call Set on that" at compile-time, when we will be able to verify that Set is part of the method set for either age or *age. You're basically allowing the compiler to do a textual expansion on the original expression if it determines that it's necessary and possible.
Meanwhile, the compiler will allow you to call age(23).String() but not age(23).Set(10). In this case, we're working with a non-addressable value of type age. Since it's not valid to say &age(23), it can't be valid to say (&age(23)).Set(10); the compiler won't do that expansion.
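The addressable and non-addressable cases can be put next to each other in a short sketch that reuses the age type from the question. The line that the compiler rejects is left commented out so that the program still builds.

```go
package main

import "fmt"

type age int

func (a age) String() string {
	return fmt.Sprintf("%d year(s) old", int(a))
}

func (a *age) Set(newAge int) {
	if newAge >= 0 {
		*a = age(newAge)
	}
}

func main() {
	var vAge age = 5

	vAge.Set(10)      // shorthand for (&vAge).Set(10), because vAge is addressable
	(&vAge).Set(11)   // the explicit form of the same call
	fmt.Println(vAge) // 11 year(s) old

	// A value method works on a non-addressable value.
	fmt.Println(age(23).String()) // 23 year(s) old

	// A pointer method does not: the next line fails to compile,
	// because the address of age(23) cannot be taken.
	// age(23).Set(10)
}
```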
Looking at the Effective Go example, you're not directly calling b.Write() at the scope where we know b's full type. You're instead making a temporary copy of b and trying to pass it off as a value of the interface type io.Writer. The problem is that the implementation of Fprintf doesn't know anything about the object being passed in except that it has promised it knows how to receive Write(), so it doesn't know to take a ByteSlice and turn it into a *ByteSlice before calling the function. The decision of whether to take the address of b has to happen at compile time, and Fprintf was compiled with the precondition that its first argument would know how to receive Write() without having its address taken.
You may think that if the system knows how to take an age pointer and convert it to an age value, it should be able to do the reverse; it doesn't really make sense to be able to, though. In the Effective Go example, if you were to pass b instead of &b, you'd modify a slice that would no longer exist after Fprintf returns, which is hardly useful. In my age example above, it literally makes no sense to take the value 23 and overwrite it with the value 10. In the first case, it makes sense for the compiler to stop and ask the programmer what she really meant to do when handing b off. In the latter case, it of course makes sense for the compiler to refuse to modify a constant value.
Furthermore, I don't think the system is dynamically extending age's method set to *age; my wild guess is that pointer types are statically given a method for each of the base type's methods, which just dereferences the pointer and calls the base's method. It's safe to do this automatically, as nothing in a receive-by-value method can change the pointer anyway. In the other direction, it doesn't always make sense to extend a set of methods that are asking to modify data by wrapping them in a way that the data they modify disappears shortly thereafter. There are definitely cases where it makes sense to do this, but this needs to be decided explicitly by the programmer, and it makes sense for the compiler to stop and ask for such.
tl;dr I think that the paragraph in Effective Go could use a bit of rewording (although I'm probably too long-winded to take the job), but it's correct. A pointer of type *X effectively has access to all of X's methods, but 'X' does not have access to *X's. Therefore, when determining whether an object can fulfill a given interface, *X is allowed to fulfill any interface X can, but the converse is not true. Furthermore, even though a variable of type X in scope is known to be addressable at compile-time--so the compiler can convert it to a *X--it will refuse to do so for the purposes of interface fulfillment because doing so may not make sense.

Resources