How does Go implement the conversion between []byte and string? - go

I am not able to get the answer by checking the generated assemblies:
{
a := []byte{'a'}
s1 := string(a)
a[0] = 'b'
fmt.Println(s1) // a
}
{
a := "a"
b := []byte(a)
b[0] = 'b'
fmt.Println(a) // a
}
Why does the observed behavior happen?
Is there a description of how Go interprets these lines of code?
What does the Go compiler do for type conversions?

This isn't so much a compiler issue as it is a language specification issue. The compiler can and will do strange things sometimes--what matters here is that whatever machine code the compiler ends up spitting out, it follows the rules laid out in the language specification.
As mentioned in the comments, the language specification defines the conversion of byte slices to and from string types like this:
Converting a slice of bytes to a string type yields a string whose successive bytes are the elements of the slice.
Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
In order to understand the behavior of your examples, you have to also read the definition of string types, also in the specification:
Strings are immutable: once created, it is impossible to change the contents of a string.
Because []byte is mutable, behind the scenes Go must make a copy of the relevant data when converting to and from a string. This can be verified by printing the address of the 0th element of the []byte object and the pointer to the first byte of data in the string object. Here is an example (and a Go Playground version):
package main
import (
"fmt"
"reflect"
"unsafe"
)
func main() {
a := "a"
b := []byte(a)
ah := (*reflect.StringHeader)(unsafe.Pointer(&a))
fmt.Printf("a: %4s # %#x\n", a, ah.Data)
fmt.Printf("b: %v # %p\n\n", b, b)
c := []byte{'a'}
d := string(c)
dh := (*reflect.StringHeader)(unsafe.Pointer(&d))
fmt.Printf("c: %v # %p\n", c, c)
fmt.Printf("d: %4s # %#x\n", d, dh.Data)
}
The output looks like this:
a: a # 0x4c1ab2
b: [97] # 0xc00002c008
c: [97] # 0xc00002c060
d: a # 0x554e21
Notice that the pointer locations of the string and []byte are not the same and do not overlap. Therefore there is no expectation that changes to the []byte values will affect the string values in any way.
Okay, technically the result didn't have to be this way because I didn't make any changes in my example to the values of b or c. Technically the compiler could have taken a shortcut and simply called b a length=1 []byte starting at the same memory address as a. But that optimization would not be allowed if I did something like this instead:
package main
import (
"fmt"
"reflect"
"unsafe"
)
func main() {
a := "a"
b := []byte(a)
b[0] = 'b'
ah := (*reflect.StringHeader)(unsafe.Pointer(&a))
fmt.Printf("a: %4s # %#x\n", a, ah.Data)
fmt.Printf("b: %v # %p\n\n", b, b)
}
Output:
a: a # 0x4c1ab2
b: [98] # 0xc00002c008
See this in action at the Go Playground.
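The same guarantee can also be checked without unsafe or addresses, by writing to the slice and then looking at the string. This is only a minimal sketch; the helper names bytesToStringIsIndependent and stringToBytesIsIndependent are made up for illustration:

```go
package main

import "fmt"

// bytesToStringIsIndependent converts a byte slice to a string, overwrites
// the slice, and reports whether the string kept its original contents.
func bytesToStringIsIndependent() bool {
	b := []byte("abc")
	s := string(b)
	b[0] = 'X'
	return s == "abc"
}

// stringToBytesIsIndependent converts a string to a byte slice, overwrites
// the slice, and reports whether the string kept its original contents.
func stringToBytesIsIndependent() bool {
	s := "abc"
	b := []byte(s)
	b[0] = 'X'
	return s == "abc" && string(b) == "Xbc"
}

func main() {
	fmt.Println(bytesToStringIsIndependent()) // true
	fmt.Println(stringToBytesIsIndependent()) // true
}
```

Because the check relies only on observable behavior, it holds no matter which copy-avoiding optimizations a particular compiler version applies.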

Related

Why reflect.ValueOf() gives different output in called function and calling function in golang?

Why the outputs of s and s1 differs although the starting value is same ? Help me understand
package main
import (
"fmt"
"reflect"
)
type Auth struct {
a, b interface{}
}
func main() {
fmt.Println("--------p1,s1--------")
p1 := Auth{}
fmt.Println(p1)
s1 := reflect.ValueOf(p1)
fmt.Println(s1)
fmt.Println("-------s---------")
callFunc(p1)
}
func callFunc(a ...interface{}) {
s := reflect.ValueOf(a)
fmt.Println(s)
}
P.S. The filename is nil.go.
Running the code with go run nil.go produces this output:
--------p1,s1--------
{<nil> <nil>}
{<nil> <nil>}
-------s---------
[{<nil> <nil>}]
The fmt package documents that when printing reflect.Value values, it prints the value wrapped inside them:
Except when printed using the verbs %T and %p, special formatting considerations apply for operands that implement certain interfaces. In order of application:
If the operand is a reflect.Value, the operand is replaced by the concrete value that it holds, and printing continues with the next rule.
[...]
callFunc() has a variadic parameter:
func callFunc(a ...interface{})
This means a is a slice. So when printing reflect.ValueOf(a), you'll get the slice printed. And the fmt package also documents that:
For compound objects, the elements are printed using these rules, recursively, laid out like this:
struct: {field0 field1 ...}
array, slice: [elem0 elem1 ...]
maps: map[key1:value1 key2:value2 ...]
pointer to above: &{}, &[], &map[]
Slices are printed enclosed in square brackets.
Note that if callFunc() were not variadic:
func callFunc(a interface{}) {
s := reflect.ValueOf(a)
fmt.Println(s)
}
Then the output for s would be the same as for s1 (try it on the Go Playground):
--------p1,s1--------
{<nil> <nil>}
{<nil> <nil>}
-------s---------
{<nil> <nil>}
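The difference between the two signatures can also be seen by asking reflection for the kind of the parameter instead of printing it. A small sketch, assuming hypothetical helper names describeVariadic and describeSingle:

```go
package main

import (
	"fmt"
	"reflect"
)

// Auth mirrors the struct from the question.
type Auth struct {
	a, b interface{}
}

// describeVariadic reports the kind and length of the variadic parameter
// itself, which is a []interface{} inside the function.
func describeVariadic(args ...interface{}) (kind string, n int) {
	v := reflect.ValueOf(args)
	return v.Kind().String(), v.Len()
}

// describeSingle reports the kind of the value stored in a plain
// interface{} parameter.
func describeSingle(arg interface{}) string {
	return reflect.ValueOf(arg).Kind().String()
}

func main() {
	p := Auth{}
	kind, n := describeVariadic(p)
	fmt.Println(kind, n)           // slice 1
	fmt.Println(describeSingle(p)) // struct
}
```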

DeepEqual incorrect after serializing map into gob

I've encountered some strange behavior with reflect.DeepEqual. I have an object of type map[string][]string, with one key whose value is an empty slice. When I use gob to encode this object, and then decode it into another map, these two maps are not equal according to reflect.DeepEqual (even though the content is identical).
package main
import (
"fmt"
"bytes"
"encoding/gob"
"reflect"
)
func main() {
m0 := make(map[string][]string)
m0["apple"] = []string{}
// Encode m0 to bytes
var network bytes.Buffer
enc := gob.NewEncoder(&network)
enc.Encode(m0)
// Decode bytes into a new map m2
dec := gob.NewDecoder(&network)
m2 := make(map[string][]string)
dec.Decode(&m2)
fmt.Printf("%t\n", reflect.DeepEqual(m0, m2)) // false
fmt.Printf("m0: %+v != m2: %+v\n", m0, m2) // they look equal to me!
}
Output:
false
m0: map[apple:[]] != m2: map[apple:[]]
A couple notes from follow-up experiments:
If I make the value of m0["apple"] a nonempty slice, for example m0["apple"] = []string{"pear"}, then DeepEqual returns true.
If I keep the value as an empty slice but I construct the identical map from scratch rather than with gob, then DeepEqual returns true:
m1 := make(map[string][]string)
m1["apple"] = []string{}
fmt.Printf("%t\n", reflect.DeepEqual(m0, m1)) // true!
So it's not strictly an issue with how DeepEqual handles empty slices; it's some strange interaction between that and gob's serialization.
This is because you encode an empty slice, and during decoding the encoding/gob package only allocates a slice if the one provided (the target to decode into) is not big enough to accommodate the encoded values. This is documented at: gob: Types and Values:
In general, if allocation is required, the decoder will allocate memory. If not, it will update the destination variables with values read from the stream.
Since there are 0 elements encoded, and a nil slice is perfectly capable of accommodating 0 elements, no slice will be allocated. We can verify this if we print the result of comparing the slices to nil:
fmt.Println(m0["apple"] == nil, m2["apple"] == nil)
Output of the above is (try it on the Go Playground):
false true
Note that the fmt package prints nil slices and empty non-nil slices the same way, as [], so you cannot rely on its output to judge whether a slice is nil or not.
And reflect.DeepEqual() treats a nil slice and an empty but non-nil slice differently (they are not deeply equal):
Note that a non-nil empty slice and a nil slice (for example, []byte{} and []byte(nil)) are not deeply equal.
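If the nil versus empty distinction does not matter for your comparison, one option is to compare lengths and elements yourself. The following is only a sketch for the map[string][]string type from the question; equalIgnoringNil and strictEqual are illustrative names, not library functions:

```go
package main

import (
	"fmt"
	"reflect"
)

// equalIgnoringNil compares two maps key by key and treats a nil slice
// and an empty non-nil slice as equal.
func equalIgnoringNil(a, b map[string][]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, av := range a {
		bv, ok := b[k]
		if !ok || len(av) != len(bv) {
			return false
		}
		for i := range av {
			if av[i] != bv[i] {
				return false
			}
		}
	}
	return true
}

// strictEqual is a thin wrapper around reflect.DeepEqual for comparison.
func strictEqual(a, b map[string][]string) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	m0 := map[string][]string{"apple": {}}  // empty, non-nil slice
	m2 := map[string][]string{"apple": nil} // nil slice, like the decoded map
	fmt.Println(strictEqual(m0, m2))      // false
	fmt.Println(equalIgnoringNil(m0, m2)) // true
}
```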

How to get memory size of variable?

Does anybody know how to get memory size of the variable (int, string, []struct, etc) and print it? Is it possible?
var i int = 1
//I want to get something like this:
fmt.Println("Size of i is: %?", i)
//Also, it would be nice if I could store the value into a string
You can use the unsafe.Sizeof function for this.
It returns the size in bytes, occupied by the value you pass into it.
Here's a working example:
package main
import "fmt"
import "unsafe"
func main() {
a := int(123)
b := int64(123)
c := "foo"
d := struct {
FieldA float32
FieldB string
}{0, "bar"}
fmt.Printf("a: %T, %d\n", a, unsafe.Sizeof(a))
fmt.Printf("b: %T, %d\n", b, unsafe.Sizeof(b))
fmt.Printf("c: %T, %d\n", c, unsafe.Sizeof(c))
fmt.Printf("d: %T, %d\n", d, unsafe.Sizeof(d))
}
Take note that some platforms explicitly disallow the use of unsafe, because it is.. well, unsafe. This used to include AppEngine. Not sure if that is still the case today, but I imagine so.
As @Timur Fayzrakhmanov notes, reflect.TypeOf(variable).Size() will give you the same information. The same restriction applies to the reflect package as to the unsafe package, i.e. some platforms may not allow its use.
You can do it with either unsafe.Sizeof(), or reflect.Type.Size()
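A quick way to convince yourself that the two approaches agree is to compare them directly. A minimal sketch; the record type and the sizesAgree helper are made up for this example:

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// record is an arbitrary struct used only for this illustration.
type record struct {
	FieldA float32
	FieldB string
}

// sizesAgree reports whether unsafe.Sizeof and reflect.Type.Size return
// the same number of bytes for a record value.
func sizesAgree() bool {
	var r record
	return unsafe.Sizeof(r) == reflect.TypeOf(r).Size()
}

func main() {
	fmt.Println(sizesAgree()) // true
}
```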
The size of a variable can be determined by using unsafe.Sizeof(a). The result will remain the same for a given type (i.e. int, int64, string, struct etc), irrespective of the value it holds. However, for type string, you may be interested in the size of the string that the variable references, and this is determined by using the len(a) function on a given string. The following snippet illustrates that the size of a variable of type string is fixed (8 bytes in the run shown here, which corresponds to a platform with 4-byte pointers; on typical 64-bit platforms it is 16, one pointer plus one length) while the length of the string that the variable references can vary:
package main
import "fmt"
import "unsafe"
func main() {
s1 := "foo"
s2 := "foobar"
fmt.Printf("s1 size: %T, %d\n", s1, unsafe.Sizeof(s1))
fmt.Printf("s2 size: %T, %d\n", s2, unsafe.Sizeof(s2))
fmt.Printf("s1 len: %T, %d\n", s1, len(s1))
fmt.Printf("s2 len: %T, %d\n", s2, len(s2))
}
Output:
s1 size: string, 8
s2 size: string, 8
s1 len: string, 3
s2 len: string, 6
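To make the comparison independent of the platform's pointer width, the size can be expressed in machine words instead of bytes. This is a sketch that assumes the standard gc toolchain, where a string variable holds a data pointer and a length; stringHeaderWords is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeaderWords returns the size of a string variable measured in
// machine words (units of unsafe.Sizeof(uintptr(0))).
func stringHeaderWords(s string) uintptr {
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	// The variable size is the same for both strings; only len differs.
	fmt.Println(stringHeaderWords("foo"), stringHeaderWords("foobar"))
	fmt.Println(len("foo"), len("foobar"))
}
```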
The last part of your question is about assigning the length (i.e. an int value) to a string. This can be done by s := strconv.Itoa(i) where i is an int variable and the string returned by the function is assigned to s.
Note: the name of the converter function is Itoa, possibly a short form for Integer to ASCII. Most Golang programmers are likely to misread the function name as Iota.
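Putting the two pieces together, the size can be converted to an int and then stored in a string. A minimal sketch, with int64SizeAsString as an illustrative helper name:

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// int64SizeAsString returns the size in bytes of an int64 variable,
// formatted as a decimal string.
func int64SizeAsString() string {
	var v int64
	// unsafe.Sizeof returns a uintptr, so convert it before calling Itoa.
	return strconv.Itoa(int(unsafe.Sizeof(v)))
}

func main() {
	s := "Size of v is: " + int64SizeAsString()
	fmt.Println(s) // Size of v is: 8
}
```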
I've written a package which calculates the actual memory size consumed by variable at runtime: https://github.com/DmitriyVTitov/size
It has single function, so basic usage is:
fmt.Println(size.Of(varName))
unsafe.Sizeof() is the correct solution.
var i int
var u uint
var up uintptr
fmt.Printf("i Type:%T Size:%d\n", i, unsafe.Sizeof(i))
fmt.Printf("u Type:%T Size:%d\n", u, unsafe.Sizeof(u))
fmt.Printf("up Type:%T Size:%d\n", up, unsafe.Sizeof(up))
The int, uint, and uintptr types are usually 32 bits wide on 32-bit systems and 64 bits wide on 64-bit systems. When you need an integer value you should use int unless you have a specific reason to use a sized or unsigned integer type.

Golang : interface to swap two numbers

I want to swap two numbers using interface but the interface concept is so confusing to me.
http://play.golang.org/p/qhwyxMRj-c
This is the code and playground. How do I use interface and swap two input numbers? Do I need to define two structures?
type num struct {
value interface{}
}
type numbers struct {
b *num
c *num
}
func (a *num) SwapNum(var1, var2 interface{}) {
var a num
temp := var1
var1 = var2
var2 = temp
}
func main() {
a := 1
b := 2
c := 3.5
d := 5.5
SwapNum(a, b)
fmt.Println(a, b) // 2 1
SwapNum(c, d)
fmt.Println(c, d) // 5.5 3.5
}
First of all, the interface{} type is simply a type which accepts all values as it is an interface with an empty method set and every type can satisfy that. int for example does not have any methods, neither does interface{}.
For a function which swaps the values of two variables you first need to make sure these variables are actually modifiable. Values passed to a function are always copied (except reference types like slices and maps but that is not our concern at the moment). You can make a parameter modifiable by passing a pointer to the variable.
So with that knowledge you can go on and define SwapNum like this:
func SwapNum(a interface{}, b interface{})
Now SwapNum is a function that accepts two parameters of any type.
You can't write
func SwapNum(a *interface{}, b *interface{})
as this would only accept parameters of type *interface{} and not just any type.
(Try it for yourself here).
So we have a signature, the only thing left is swapping the values.
func SwapNum(a interface{}, b interface{}) {
*a, *b = *b, *a
}
No, this will not work that way. By using interface{} we must do runtime type assertions to check whether we're doing the right thing or not. So the code must be expanded using the reflect package. This article might get you started if you don't know about reflection.
Basically we will need this function:
func SwapNum(a interface{}, b interface{}) {
ra := reflect.ValueOf(a).Elem()
rb := reflect.ValueOf(b).Elem()
tmp := ra.Interface()
ra.Set(rb)
rb.Set(reflect.ValueOf(tmp))
}
This code makes a reflection of a and b using reflect.ValueOf() so that we can inspect them. In the same line we're assuming that we've got pointer values and dereference them by calling .Elem() on them. This basically translates to ra := *a and rb := *b.
After that, we save the current value of *a by requesting it with .Interface() and assigning it to tmp (effectively making a copy).
Finally, we set the value of a to b with ra.Set(rb), which translates to *a = *b, and then assign to b the old value of a, which we stored in tmp before. For this, we need to convert tmp back to a reflection of itself so that rb.Set() can be used (it takes a reflect.Value as parameter).
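Note that this function has to be called with the addresses of the variables, not with the values as in the question's main. A complete program, as a sketch that reuses the function unchanged:

```go
package main

import (
	"fmt"
	"reflect"
)

// SwapNum is the reflection-based function from above, unchanged.
// Both arguments must be non-nil pointers to values of the same type.
func SwapNum(a interface{}, b interface{}) {
	ra := reflect.ValueOf(a).Elem()
	rb := reflect.ValueOf(b).Elem()
	tmp := ra.Interface()
	ra.Set(rb)
	rb.Set(reflect.ValueOf(tmp))
}

func main() {
	a, b := 1, 2
	c, d := 3.5, 5.5
	SwapNum(&a, &b) // pass addresses, not values
	SwapNum(&c, &d)
	fmt.Println(a, b) // 2 1
	fmt.Println(c, d) // 5.5 3.5
}
```

Calling SwapNum(a, b) with plain values compiles, but panics at run time because .Elem() is not defined for an int value.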
Can we do better?
Yes! We can make the code more type safe, or better, make the definition of Swap type safe
by using reflect.MakeFunc. In the doc (follow the link) is an example which is very
like what you're trying. Essentially you can fill a function prototype with content
by using reflection. As you supplied the prototype (the signature) of the function the
compiler can check the types, which it can't when the value is reduced to interface{}.
Example usage:
var intSwap func(*int, *int)
a,b := 1, 0
makeSwap(&intSwap)
intSwap(&a, &b)
// a is now 0, b is now 1
The code behind this:
swap := func(in []reflect.Value) []reflect.Value {
ra := in[0].Elem()
rb := in[1].Elem()
tmp := ra.Interface()
ra.Set(rb)
rb.Set(reflect.ValueOf(tmp))
return nil
}
makeSwap := func(fptr interface{}) {
fn := reflect.ValueOf(fptr).Elem()
v := reflect.MakeFunc(fn.Type(), swap)
fn.Set(v)
}
The code of swap is basically the same as that of SwapNum. makeSwap is the same
as the one used in the docs where it is explained pretty well.
Disclaimer: The code above makes a lot of assumptions about what is given and
what the values look like. Normally you need to check, for example, that the given
values to SwapNum actually are pointer values and so forth. I left that out for
reasons of clarity.
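For completeness, here is one way the omitted checks could look. This is a sketch under my own assumptions, not the only reasonable design: SwapChecked is a hypothetical name, and it returns an error instead of panicking.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// SwapChecked swaps the values that a and b point to. It returns an error
// if either argument is not a non-nil pointer or if the pointed-to types differ.
func SwapChecked(a, b interface{}) error {
	ra, rb := reflect.ValueOf(a), reflect.ValueOf(b)
	if ra.Kind() != reflect.Ptr || rb.Kind() != reflect.Ptr {
		return errors.New("swap: both arguments must be pointers")
	}
	if ra.IsNil() || rb.IsNil() {
		return errors.New("swap: nil pointer argument")
	}
	ea, eb := ra.Elem(), rb.Elem()
	if ea.Type() != eb.Type() {
		return fmt.Errorf("swap: type mismatch: %s vs %s", ea.Type(), eb.Type())
	}
	tmp := reflect.New(ea.Type()).Elem() // scratch value of the pointed-to type
	tmp.Set(ea)
	ea.Set(eb)
	eb.Set(tmp)
	return nil
}

func main() {
	a, b := 1, 2
	err := SwapChecked(&a, &b)
	fmt.Println(err, a, b) // <nil> 2 1

	f := 2.5
	fmt.Println(SwapChecked(&a, &f)) // type mismatch error
	fmt.Println(SwapChecked(a, b))   // not pointers error
}
```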

Convert between slices of different types

I get a byte slice ([]byte) from a UDP socket and want to treat it as an integer slice ([]int32) without changing the underlying array, and vice versa. In C(++) I would just cast between pointer types; how would I do this in Go?
As others have said, casting the pointer is considered bad form in Go. Here are examples of the proper Go way and the equivalent of the C array casting.
WARNING: all code untested.
The Right Way
In this example, we are using the encoding/binary package to convert each set of 4 bytes into an int32. This is better because we are specifying the endianness. We are also not using the unsafe package to break the type system.
import "encoding/binary"
const SIZEOF_INT32 = 4 // bytes
data := make([]int32, len(raw)/SIZEOF_INT32)
for i := range data {
// assuming little endian
data[i] = int32(binary.LittleEndian.Uint32(raw[i*SIZEOF_INT32:(i+1)*SIZEOF_INT32]))
}
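Since the question also asks for the reverse direction, the same package can be used to go from []int32 back to []byte. A self-contained sketch, assuming little-endian data as above; the function names bytesToInt32s and int32sToBytes are made up for illustration. Both directions copy the data, which is the price of staying within the type system.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const sizeofInt32 = 4 // bytes

// bytesToInt32s decodes little-endian 4-byte groups into int32 values.
// Trailing bytes that do not fill a whole group are ignored.
func bytesToInt32s(raw []byte) []int32 {
	data := make([]int32, len(raw)/sizeofInt32)
	for i := range data {
		data[i] = int32(binary.LittleEndian.Uint32(raw[i*sizeofInt32:]))
	}
	return data
}

// int32sToBytes encodes each int32 as 4 little-endian bytes.
func int32sToBytes(data []int32) []byte {
	raw := make([]byte, len(data)*sizeofInt32)
	for i, v := range data {
		binary.LittleEndian.PutUint32(raw[i*sizeofInt32:], uint32(v))
	}
	return raw
}

func main() {
	raw := []byte{1, 0, 0, 0, 0xff, 0xff, 0xff, 0xff}
	ints := bytesToInt32s(raw)
	fmt.Println(ints)                // [1 -1]
	fmt.Println(int32sToBytes(ints)) // [1 0 0 0 255 255 255 255]
}
```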
The Wrong Way (C array casting)
In this example, we are telling Go to ignore the type system. This is not a good idea because it may fail in another implementation of Go. It is assuming things not in the language specification. However, this one does not do a full copy. This code uses unsafe to access the "SliceHeader" which is common to all slices. The header contains a pointer to the data (C array), the length, and the capacity. Instead of just converting the header to the new slice type, we first need to change the length and capacity since there are fewer elements if we treat the bytes as a new type.
import (
"reflect"
"unsafe"
)
const SIZEOF_INT32 = 4 // bytes
// Get the slice header
header := *(*reflect.SliceHeader)(unsafe.Pointer(&raw))
// The length and capacity of the slice are different.
header.Len /= SIZEOF_INT32
header.Cap /= SIZEOF_INT32
// Convert slice header to an []int32
data := *(*[]int32)(unsafe.Pointer(&header))
You do what you do in C, with one exception: Go does not allow you to convert from one pointer type to another. Well, it does, but you must use unsafe.Pointer to tell the compiler that you are aware that all rules are broken and you know what you are doing. Here is an example:
package main
import (
"fmt"
"unsafe"
)
func main() {
b := []byte{1, 0, 0, 0, 2, 0, 0, 0}
// step by step
pb := &b[0] // to pointer to the first byte of b
up := unsafe.Pointer(pb) // to *special* unsafe.Pointer, it can be converted to any pointer
pi := (*[2]uint32)(up) // to pointer to the first uint32 of array of 2 uint32s
i := (*pi)[:] // creates slice to our array of 2 uint32s (optional step)
fmt.Printf("b=%v i=%v\n", b, i)
// all in one go
p := (*[2]uint32)(unsafe.Pointer(&b[0]))
fmt.Printf("b=%v p=%v\n", b, p)
}
Obviously, you should be careful about using the "unsafe" package, because the Go compiler is not holding your hand anymore. For example, you could write pi := (*[3]uint32)(up) here and the compiler wouldn't complain, but you would be in trouble.
Also, as other people have pointed out already, the bytes of a uint32 might be laid out differently on different computers, so you should not assume they are laid out as you need them to be.
So the safest approach would be to read your array of bytes one by one and make whatever you need out of them.
Alex
The short answer is you can't. Go won't let you cast a slice of one type to a slice of another type. You will have to loop through the array and create another array of the type you want while casting each item in the array. This is generally regarded as a good thing since type safety is an important feature of Go.
Since Go 1.17, there is a simpler way to do this using the unsafe package.
package main
import (
"unsafe"
)
// Converted to int so that it can divide len(bs); unsafe.Sizeof itself yields a uintptr.
const SIZEOF_INT32 = int(unsafe.Sizeof(int32(0))) // 4 bytes
func main() {
bs := make([]byte, 8) // placeholder input; &bs[0] below panics on an empty slice
// Do stuff with `bs`. Maybe do some checks ensuring that len(bs) % SIZEOF_INT32 == 0
data := unsafe.Slice((*int32)(unsafe.Pointer(&bs[0])), len(bs)/SIZEOF_INT32)
_ = data // use data here
// A more verbose alternative requiring `import "reflect"`
// data := unsafe.Slice((*int32)(unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&bs)).Data)), len(bs)/SIZEOF_INT32)
}
Go 1.17 and beyond
Go 1.17 introduced the unsafe.Slice function, which does exactly this.
Converting a []byte to a []int32:
package main
import (
"fmt"
"unsafe"
)
func main() {
theBytes := []byte{
0x33, 0x44, 0x55, 0x66,
0x11, 0x22, 0x33, 0x44,
0x77, 0x66, 0x55, 0x44,
}
numInts := uintptr(len(theBytes)) * unsafe.Sizeof(theBytes[0]) / unsafe.Sizeof(int32(0))
theInts := unsafe.Slice((*int32)(unsafe.Pointer(&theBytes[0])), numInts)
for _, n := range theInts {
fmt.Printf("%04x\n", n)
}
}
Playground.
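The unsafe.Slice approach can be wrapped in a helper that performs the length and alignment checks before reinterpreting the memory. This is only a sketch under my own assumptions: bytesAsInt32s is a hypothetical name, it requires Go 1.17 or later, and the resulting values still depend on the machine's byte order.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesAsInt32s reinterprets bs as a []int32 without copying. It returns
// ok == false when bs is empty, when its length is not a multiple of 4,
// or when its first byte is not suitably aligned for int32.
func bytesAsInt32s(bs []byte) (ints []int32, ok bool) {
	const size = int(unsafe.Sizeof(int32(0)))
	if len(bs) == 0 || len(bs)%size != 0 {
		return nil, false
	}
	p := unsafe.Pointer(&bs[0])
	if uintptr(p)%unsafe.Alignof(int32(0)) != 0 {
		return nil, false
	}
	return unsafe.Slice((*int32)(p), len(bs)/size), true
}

func main() {
	bs := make([]byte, 8)
	if ints, ok := bytesAsInt32s(bs); ok {
		ints[1] = -1    // all four bytes become 0xff on any byte order
		fmt.Println(bs) // the write is visible through the byte slice
	}
}
```

Both slices share the same backing array, so the int32 view must not be used after the byte slice's memory is reused for something else.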
I had the problem that the size is not known at compile time, and tweaked the previous unsafe method with the following code.
Given a byte slice b ...
the int32 slice is (*(*[]int32)(unsafe.Pointer(&b)))[:len(b)/4]
The array to slice example may be given a fictional large constant and the slice bounds used in the same way since no array is allocated.
You can do it with the "unsafe" package
package main
import (
"fmt"
"unsafe"
)
func main() {
var b [8]byte = [8]byte{1, 2, 3, 4, 5, 6, 7, 8}
var s *[4]uint16 = (*[4]uint16)(unsafe.Pointer(&b))
var i *[2]uint32 = (*[2]uint32)(unsafe.Pointer(&b))
var l *uint64 = (*uint64)(unsafe.Pointer(&b))
fmt.Println(b)
fmt.Printf("%04x, %04x, %04x, %04x\n", s[0], s[1], s[2], s[3])
fmt.Printf("%08x, %08x\n", i[0], i[1])
fmt.Printf("%016x\n", *l)
}
/*
* example run:
* $ go run /tmp/test.go
* [1 2 3 4 5 6 7 8]
* 0201, 0403, 0605, 0807
* 04030201, 08070605
* 0807060504030201
*/
Perhaps it was not available when the earlier answers were given, but it would seem that the binary.Read function would be a better answer than "the right way" given above.
This method allows you to read binary data from a reader directly into the value or buffer of your desired type. You can do this by creating a reader over your byte array buffer. Or, if you have control of the code that is giving you the byte array, you can replace it to read directly into your buffer without the need for the interim byte array.
See https://golang.org/pkg/encoding/binary/#Read for the documentation and a nice little example.
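Applied to the question's case, a reader over the byte slice can be decoded straight into a []int32. A minimal sketch, assuming little-endian input; readInt32s is an illustrative helper name:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// readInt32s decodes raw as a sequence of little-endian int32 values
// using binary.Read. Trailing bytes that do not fill 4 bytes are ignored.
func readInt32s(raw []byte) ([]int32, error) {
	out := make([]int32, len(raw)/4)
	if err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	vals, err := readInt32s([]byte{1, 0, 0, 0, 2, 0, 0, 0})
	fmt.Println(vals, err) // [1 2] <nil>
}
```

Like the loop-based version, this copies the data into a new slice rather than sharing the underlying array.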
http://play.golang.org/p/w1m5Cs-ecz
package main
import (
"fmt"
"strings"
)
func main() {
s := []interface{}{"foo", "bar", "baz"}
b := make([]string, len(s))
for i, v := range s {
b[i] = v.(string)
}
fmt.Println(strings.Join(b, ", "))
}
func crackU32s2Bytes(us []uint32) []byte {
var bs []byte
var ptrBs = (*reflect.SliceHeader)(unsafe.Pointer(&bs))
var ptrUs = (*reflect.SliceHeader)(unsafe.Pointer(&us))
ptrBs.Data = ptrUs.Data
ptrBs.Len = ptrUs.Len*4
ptrBs.Cap = ptrBs.Len
return bs
}
func crackBytes2U32s(bs []byte) []uint32 {
var us []uint32
var ptrBs = (*reflect.SliceHeader)(unsafe.Pointer(&bs))
var ptrUs = (*reflect.SliceHeader)(unsafe.Pointer(&us))
ptrUs.Data = ptrBs.Data
ptrUs.Len = ptrBs.Len/4
ptrUs.Cap = ptrUs.Len
return us
}

Resources