Why does io.WriterTo's WriteTo method return an int64 rather than an int? - go

Most of the output methods in Go's io package return (int, error), for example io.Writer's Write([]byte) method and the io.WriteString(io.Writer, string) function. However, a few of the output methods, such as io.WriterTo's WriteTo method, return (int64, error) instead. This makes it inconvenient to implement WriteTo in terms of Write or WriteString without storing an intermediate value and type converting it from int to int64. What is the reason for this discrepancy?

It's possible that WriteTo copies more than int32 bytes of data.
With the io.Readerand io.Writer interfaces, the amount of data is limited by the size of the given slice, which has a length limited by int for the current architecture.

The signature of the Writer.Write() method:
Write(p []byte) (n int, err error)
It writes the contents of a slice. Quoting from the spec: Slice types:
A slice is a descriptor for a contiguous segment of an underlying array...
As we all know, the slice has an underlying array. Quoting again from the Spec: Array types:
The length is part of the array's type; it must evaluate to a non-negative constant representable by a value of type int.
So the maximum length of an array is limited by the maximum value of the int type (which is 2147483647 in case of 32 bit and 9223372036854775807 in case of 64 bit architectures).
So back to the Writer.Write() method: since it writes the content of the passed slice, it is guaranteed that the number of written bytes will not be more that what fits into an int.
Now WriteTo.WriteTo() method:
WriteTo(w Writer) (n int64, err error)
A slice or array is nowhere mentioned. You have no guarantees that the result will fit into an int, so the int64 is more than justified.
Example: BigBuffer
Imagine a BigBuffer implementation which temporarily writes data into an array or slice. The implementation may manage multiple arrays so that if one is full (e.g. reached max int), continues in another one. Now if this BigBuffer implements the WriteTo interface and you call this method to write the content into an os.File, the result will be more than max int.

Related

Casting []uint32 to []byte without copying in golang

I'm working on a processor simulator in golang (for educational purposes). I need a type for memory unit for addressing. It may contain either a slice of memory (memory type is []byte) or one or several registers (they have []uint32 type), must be readable and writable. So, is there an option to convert []uint32 to []byte? I know there's an unsafe module, but I'm not sure how exactly to do this conversion. In other words, I need something like reinterpret_cast in C++
I know memory unit can be an interface with different implementations for memory, single register and several register, but it's not so efficient. Making register a byte slice also decreases performance
The unsafe.Slice function makes it more convenient to convert an arbitrary pointer to a slice of any type. You could even make a generic version to cast a slice of any type to bytes if you were so inclined:
func castToBytes[T any](s []T) []byte {
if len(s) == 0 {
return nil
}
size := unsafe.Sizeof(s[0])
return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), int(size)*len(s))
}

encode object to bytes by golang unsafe?

func Encode(i interface{}) ([]byte, error) {
buffer := bytes.NewBuffer(make([]byte, 0, 1024))
// size := unsafe.Sizeof(i)
size := reflect.TypeOf(i).Size()
fmt.Println(size)
ptr := unsafe.Pointer(&i)
startAddr := uintptr(ptr)
endAddr := startAddr + size
for i := startAddr; i < endAddr; i++ {
bytePtr := unsafe.Pointer(i)
b := *(*byte)(bytePtr)
buffer.WriteByte(b)
}
return buffer.Bytes(), nil
}
func TestEncode(t *testing.T) {
test := Test{10, "hello world"}
b, _ := Encode(test)
ptr := unsafe.Pointer(&b)
newTest := *(*Test)(ptr)
fmt.Println(newTest.X)
}
I am learning how to use golang unsafe and wrote this function for encoding any object. I meet with two problems, first, dose unsafe.Sizeof(obj) always return obj's pointer size? Why it different from reflect.TypeOf(obj).Size()? Second, I want to iterate the underlying bytes of obj and convert it back to obj in TestEncode function by unsafe.Pointer(), but the object's values all corrupt, why?
First, unsafe.Sizeof returns the bytes that needs to store the type. It is a little bit tricky, but it does not mean bytes that needs to store the data.
For example, a slice, as it is well known, stores 3 4-byte ints on a 32bit machine. One uintptr for memory address of the underlying array, and two int32 for len and cap. So no matter how long a slice is or what type it is of, a slice takes always 12 bytes on a 32 bit machine. Likely, a string uses 8 bytes: 1 uintptr for address and 1 int32 for len.
As for difference between reflect.TypeOf().Size, it is about interface. reflect.TypeOf looks into the interface and gets an concrete type, and reports bytes needed about the concrete type, while unsafe.Sizeof just returns 8 for an interface type: 2 uintptr for a pointer to the data and a pointer to the method lists.
Second part is quite clear now. For one, unsafe.Pointer is taking the address of the interface, instead of the concrete type. Two, in TestEncode, unsafe.Pointer is taking address to the 12-byte slice "header". There might be other errors, but with the two mentioned, they are meaningless to spot.
Note: I avoid talking about orders of the uintptr and int32 not only because I don't know, but also becuase they are not documented, unsafe, and implentation depended.
Note 2: Conclusion: Don't try to dump memory of a Go data.
Note 3: I change everything to 32 bit becuase playground is using it, so it is easier to check.

Conversion of a slice of string into a slice of custom type

I'm quite new to Go, so this might be obvious. The compiler does not allow the following code:
(http://play.golang.org/p/3sTLguUG3l)
package main
import "fmt"
type Card string
type Hand []Card
func NewHand(cards []Card) Hand {
hand := Hand(cards)
return hand
}
func main() {
value := []string{"a", "b", "c"}
firstHand := NewHand(value)
fmt.Println(firstHand)
}
The error is:
/tmp/sandbox089372356/main.go:15: cannot use value (type []string) as type []Card in argument to NewHand
From the specs, it looks like []string is not the same underlying type as []Card, so the type conversion cannot occur.
Is it, indeed, the case, or did I miss something?
If it is the case, why is it so? Assuming, in a non-pet-example program, I have as input a slice of string, is there any way to "cast" it into a slice of Card, or do I have to create a new structure and copy the data into it? (Which I'd like to avoid since the functions I'll need to call will modify the slice content).
There is no technical reason why conversion between slices whose elements have identical underlying types (such as []string and []Card) is forbidden. It was a specification decision to help avoid accidental conversions between unrelated types that by chance have the same structure.
The safe solution is to copy the slice. However, it is possible to convert directly (without copying) using the unsafe package:
value := []string{"a", "b", "c"}
// convert &value (type *[]string) to *[]Card via unsafe.Pointer, then deref
cards := *(*[]Card)(unsafe.Pointer(&value))
firstHand := NewHand(cards)
https://play.golang.org/p/tto57DERjYa
Obligatory warning from the package documentation:
unsafe.Pointer allows a program to defeat the type system and read and write arbitrary memory. It should be used with extreme care.
There was a discussion on the mailing list about conversions and underlying types in 2011, and a proposal to allow conversion between recursively equivalent types in 2016 which was declined "until there is a more compelling reason".
The underlying type of Card might be the same as the underlying type of string (which is itself: string), but the underlying type of []Card is not the same as the underlying type of []string (and therefore the same applies to Hand).
You cannot convert a slice of T1 to a slice of T2, it's not a matter of what underlying types they have, if T1 is not identical to T2, you just can't. Why? Because slices of different element types may have different memory layout (different size in memory). For example the elements of type []byte occupy 1 byte each. The elements of []int32 occupy 4 bytes each. Obviously you can't just convert one to the other even if all values are in the range 0..255.
But back to the roots: if you need a slice of Cards, why do you create a slice of strings in the first place? You created the type Card because it is not a string (or at least not just a string). If so and you require []Card, then create []Card in the first place and all your problems go away:
value := []Card{"a", "b", "c"}
firstHand := NewHand(value)
fmt.Println(firstHand)
Note that you are still able to initialize the slice of Card with untyped constant string literals because it can be used to initialize any type whose underlying type is string. If you want to involve typed string constants or non-constant expressions of type string, you need explicit conversion, like in the example below:
s := "ddd"
value := []Card{"a", "b", "c", Card(s)}
If you have a []string, you need to manually build a []Card from it. There is no "easier" way. You can create a helper toCards() function so you can use it everywhere you need it.
func toCards(s []string) []Card {
c := make([]Card, len(s))
for i, v := range s {
c[i] = Card(v)
}
return c
}
Some links for background and reasoning:
Go Language Specification: Conversions
why []string can not be converted to []interface{} in golang
Cannot convert []string to []interface {}
What about memory layout means that []T cannot be converted to []interface in Go?
From the specs, it looks like []string is not the same underlying type as []Card, so the type conversion cannot occur.
Exactly right. You have to convert it by looping and copying over each element, converting the type from string to Card on the way.
If it is the case, why is it so? Assuming, in a non-pet-example program, I have as input a slice of string, is there any way to "cast" it into a slice of Card, or do I have to create a new structure and copy the data into it? (Which I'd like to avoid since the functions I'll need to call will modify the slice content).
Because conversions are always explicit and the designers felt that when a conversion implicitly involves a copy it should be made explicit as well.

Go's value method receiver vs pointer method receiver

I've read a Tour of Go and Effective Go, http://golang.org/doc/effective_go.html#pointers_vs_values, but still have a difficult time understanding when you would define a method on a struct using a value method receiver instead of a pointer method receiver. In other words, when would this:
type ByteSlice []byte
func (slice ByteSlice) Append(data []byte) []byte {
}
be preferable over this?
func (p *ByteSlice) Append(data []byte) {
slice := *p
*p = slice
}
Slices are one place where it's not always obvious at first. The Slice header is small, so copying it is cheap, and the underlying array is referenced via a pointer, so you can manipulate the contents of a slice with a value receiver. You can see this in the sort package, where the methods for the sortable types are defined without pointers.
The only time you need to use a pointer with a slice, is if you're going to manipulate the slice header, which means changing the length or capacity. For an Append method, you would want:
func (p *ByteSlice) Append(data []byte) {
*p = append(*p, data...)
}
There is an FAQ entry on that matter:
Should I define methods on values or pointers?
First, and most important, does the method need to modify the receiver? If it does, the receiver must be a pointer.
...
Second is the consideration of efficiency. If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver.

Why is the method receiver not required to be a pointer when implementing sort.Interface for golang types?

I am reading the docs for the sort stdlib package and the sample code reads like this:
type ByAge []Person
func (a ByAge) Len() int { return len(a) }
func (a ByAge) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a ByAge) Less(i, j int) bool { return a[i].Age < a[j].Age }
As I've learnt, function that mutate a type T needs to use *T as its method receiver.
In the case of Len, Swap and Less why does it work ? Or am I misunderstanding the difference between using T vs *T as method receivers ?
Go has three reference types:
map
slice
channel
Every instance of these types holds a pointer to the actual data internally. This means that
when you pass a value of one of these types the value is copied like every other value but the
internal pointer still points to the same value.
Quick example (run on play):
func dumpFirst(s []int) {
fmt.Printf("address of slice var: %p, address of element: %p\n", &s, &s[0])
}
s1 := []int{1, 2, 3}
s2 := s1
dumpFirst(s1)
dumpFirst(s2)
will print something like:
address of slice var: 0x1052e110, address of element: 0x1052e100
address of slice var: 0x1052e120, address of element: 0x1052e100
You can see: the address of the slice variable changes but the address of the first element in that slice remains the same.
I just had a minor epiphany regarding this exact same question.
As has already been explained, a typed slice (not a pointer to a slice) can implement the sort.Interface interface; part of the reason for this is that, even though the slice is being copied, one of its fields is a pointer to an array, so any modification of that backing array will be reflected in the original slice.
Normally, though, this isn't enough of a justification for a bare slice to be an acceptable receiver. It's generally incorrect to try to modify a struct as a receiver of a method, because any append() calls will change the slice copy's length without modifying the original slice's headers. Slice modification may even trigger the initialization of a new backing array, completely disconnecting the copied receiver from the original slice.
By the very nature of sorting, however, this isn't a problem in the sort.Sort case. The only array-modifying operation it exercises is Swap, meaning the array's required memory will remain the same, so the slice will not change size, and therefore there will be no changes to the slice's actual values (starting index, length, and array pointer).
I'm sure this was obvious to a lot of people, but it just dawned on me, and I thought it might be useful to others wondering why sort plays nicely with bare slices.

Resources