Go - Failing escape analysis on different slice headers with shared data

I'm working on a project where I frequently convert []int32 to []byte. I wrote a function intsToBytes to perform an in-place conversion and minimize copying. I noticed that Go's escape analysis doesn't realize that ints and bytes reference the same underlying data. As a result, ints is allocated on the stack, gets overwritten by the next function's stack data, and bytes lives on, referencing the overwritten data.
The only solution I can think of involves copying the data into a new byte slice. Is there a way to avoid copying the data?
func pack() []byte {
    ints := []int32{1, 2, 3, 4, 5} // This does not escape, so it is allocated on the stack
    bytes := intsToBytes(ints)     // 'ints' and 'bytes' are different slice headers
    return bytes
    // After the return, the []int32{...} is deallocated and can be overwritten
    // by the next function's stack data
}
func intsToBytes(i []int32) []byte {
    const SizeOfInt32 = 4
    // Get the slice header
    header := *(*reflect.SliceHeader)(unsafe.Pointer(&i))
    header.Len *= SizeOfInt32
    header.Cap *= SizeOfInt32
    // Convert the slice header to a []byte
    data := *(*[]byte)(unsafe.Pointer(&header))
    /* Potential solution:
    outData := make([]byte, len(data))
    copy(outData, data)
    return outData
    */
    return data
}
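For what it's worth, on Go 1.20 and newer the same no-copy reinterpretation can be written with unsafe.Slice and unsafe.SliceData instead of patching a reflect.SliceHeader. This is only a sketch of the conversion mechanics; the returned slice still aliases i's backing array, so the lifetime concern described above remains:

func intsToBytes(i []int32) []byte {
    const SizeOfInt32 = 4
    if len(i) == 0 {
        return nil
    }
    // Reinterpret the backing array of i as bytes without copying.
    // The result shares memory with i, so i's data must outlive it.
    return unsafe.Slice((*byte)(unsafe.Pointer(unsafe.SliceData(i))), len(i)*SizeOfInt32)
}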

Related

How do I do a FAST conversion of an int array to an array of bytes?

I have a process that needs to pack a large array of int16s to a protobuf every few milliseconds. Understanding the protobuf side of it isn't critical, since all I really need is a way to convert a bunch of int16s (160-16k of them) to []byte. It's a CPU-critical operation, so I don't want to do something like this:
for _, sample := range listOfIntegers {
    protobufObject.ByteStream = append(protobufObject.ByteStream, byte(sample>>8))
    protobufObject.ByteStream = append(protobufObject.ByteStream, byte(sample&0xff))
}
(If you're interested, this is the protobuf)
message ProtobufObject {
    bytes byte_stream = 1;
    ... = 2;
    etc.
}
There has to be a faster way to supply that list of ints as a block of memory to the protobuf. I've fiddled with the cgo library to get access to memcpy, but I suspect I've been destroying an underlying Go data structure because I get crashes in totally unrelated sections of code.
A faster version of the above code is:
protobufObject.ByteStream = make([]byte, len(listOfIntegers)*2)
for i, n := range listOfIntegers {
    j := i * 2
    protobufObject.ByteStream[j+1] = byte(n)
    protobufObject.ByteStream[j] = byte(n >> 8)
}
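An equivalent way to spell out the byte order is encoding/binary; assuming the stream is meant to be big-endian, as in the loop above:

buf := make([]byte, len(listOfIntegers)*2)
for i, n := range listOfIntegers {
    // PutUint16 writes the value big-endian into buf[i*2 : i*2+2].
    binary.BigEndian.PutUint16(buf[i*2:], uint16(n))
}
protobufObject.ByteStream = buf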
You can avoid copying the data when running on a big-endian architecture.
Use the unsafe package to reinterpret the []int16 slice header as a []byte slice header, then use it again to get a pointer to that header and scale the length and capacity for the conversion.
b := *(*[]byte)(unsafe.Pointer(&listOfIntegers))
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&b))
hdr.Len *= 2
hdr.Cap *= 2
protobufObject.ByteStream = b

How to return []byte from internal void * in C function by CGO?

I am wrapping a C function with the following definition:
int parser_shift(parser* parser, void* buffer, int length);
It removes up to length bytes from the parser's internal buffer of unparsed bytes, storing them in the given buffer.
Now I wish to wrap it into a Go function with the following definition:
func (p *Parser) Shift() []byte {
    var buffer []byte
    // TODO:
    return buffer
}
What is the right way to finish the TODO above with cgo?
I tried the following, however it crashed with this error: Error in "/path/to/my/program': free(): invalid next size (fast): 0x00007f8fe0000aa0:
var buffer []byte
bufStr := C.CString(string(buffer))
defer C.free(unsafe.Pointer(bufStr))
C.parser_shift(p.Cparser, unsafe.Pointer(bufStr), C.int(8192))
buffer = []byte(C.GoString(bufStr))
return buffer
Assuming that your parser_shift function returns the number of bytes actually stored in the buffer, you can do something like this:
func (p *Parser) Shift() []byte {
    var buffer [8192]byte
    parsed := int(C.parser_shift(p.Cparser, unsafe.Pointer(&buffer[0]), C.int(len(buffer))))
    return buffer[:parsed]
}
There is no need to convert to or from a string; just pass the function the memory that you want it to write to.
Try this function:
// C pointer, length to Go []byte
func C.GoBytes(unsafe.Pointer, C.int) []byte
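For completeness, a sketch of Shift built around C.GoBytes with a C-allocated scratch buffer; it copies the parsed bytes into Go-managed memory and assumes parser_shift returns the number of bytes it stored:

func (p *Parser) Shift() []byte {
    const bufSize = 8192
    // Allocate the scratch buffer on the C heap so the C code can write into it freely.
    buf := C.malloc(C.size_t(bufSize))
    defer C.free(buf)
    parsed := C.parser_shift(p.Cparser, buf, C.int(bufSize))
    // C.GoBytes copies 'parsed' bytes from the C buffer into a new Go []byte.
    return C.GoBytes(buf, parsed)
}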

Implementation of io.ReadWriteSeeker in golang

Is there an implementation of io.ReadWriteSeeker I can use in Go?
Since bytes.Buffer does not implement the Seek method, I need to find such an implementation to use as a buffer that is written by a zip.Writer and then read back with seeking.
I also won't go with Reader(buff.Bytes()) converting via a memory copy, because I can't afford double the memory for the buffered data.
In addition, when using os.File as the option, if I don't call f.Sync, it will never touch the file system, right? Thanks.
My simplified code:
func process() {
    buff := new(bytes.Buffer)
    zipWriter := zip.NewWriter(buff)
    // here to add data into zipWriter in sequence
    zipWriter.Close()
    upload(buff) // upload(io.ReadSeeker)
}
For example, using the same underlying array for both buffers (uBuf and zBuf):
package main

import (
    "archive/zip"
    "bytes"
    "io"
)

func upload(io.ReadSeeker) {}

func process() {
    zBuf := new(bytes.Buffer)
    zipWriter := zip.NewWriter(zBuf)
    // add data into zipWriter in sequence
    zipWriter.Close()
    uBuf, zBuf := zBuf.Bytes(), nil
    // upload(io.ReadSeeker)
    upload(bytes.NewReader(uBuf))
}

func main() {}
Playground: https://play.golang.org/p/8TKmnL_vRY9
Package bytes
import "bytes"
func (*Buffer) Bytes
func (b *Buffer) Bytes() []byte
Bytes returns a slice of length b.Len() holding the unread portion of
the buffer. The slice is valid for use only until the next buffer
modification (that is, only until the next call to a method like Read,
Write, Reset, or Truncate). The slice aliases the buffer content at
least until the next buffer modification, so immediate changes to the
slice will affect the result of future reads.
The tuple assignment statement
uBuf, zBuf := zBuf.Bytes(), nil
gets the slice descriptor for the zipped bytes (zBuf.Bytes()) and assigns it to the slice descriptor uBuf. A slice descriptor is a struct with a pointer to the underlying array, the slice length, and the slice capacity. For example,
type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}
Then, for safety, we assign nil to zBuf to ensure that no further changes can be made to its underlying array, which is now used by uBuf.
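If a true io.ReadWriteSeeker is needed rather than an io.ReadSeeker over the buffered bytes, the os.File option mentioned in the question also works; a minimal sketch (error handling trimmed, names assumed):

f, err := os.CreateTemp("", "zipbuf-*")
if err != nil {
    // handle the error
}
defer os.Remove(f.Name()) // clean up the temporary file
defer f.Close()

zipWriter := zip.NewWriter(f)
// add data into zipWriter in sequence
zipWriter.Close()

// Rewind so the upload reads the archive from the beginning.
if _, err := f.Seek(0, io.SeekStart); err != nil {
    // handle the error
}
upload(f) // *os.File implements io.ReadWriteSeeker

As for the f.Sync question: data written to an *os.File does reach the file system through the OS even without calling f.Sync; Sync only forces the OS to flush it to stable storage.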

encode object to bytes by golang unsafe?

func Encode(i interface{}) ([]byte, error) {
    buffer := bytes.NewBuffer(make([]byte, 0, 1024))
    // size := unsafe.Sizeof(i)
    size := reflect.TypeOf(i).Size()
    fmt.Println(size)
    ptr := unsafe.Pointer(&i)
    startAddr := uintptr(ptr)
    endAddr := startAddr + size
    for i := startAddr; i < endAddr; i++ {
        bytePtr := unsafe.Pointer(i)
        b := *(*byte)(bytePtr)
        buffer.WriteByte(b)
    }
    return buffer.Bytes(), nil
}

func TestEncode(t *testing.T) {
    test := Test{10, "hello world"}
    b, _ := Encode(test)
    ptr := unsafe.Pointer(&b)
    newTest := *(*Test)(ptr)
    fmt.Println(newTest.X)
}
I am learning how to use Go's unsafe package and wrote this function for encoding any object. I ran into two problems. First, does unsafe.Sizeof(obj) always return obj's pointer size? Why does it differ from reflect.TypeOf(obj).Size()? Second, I want to iterate over the underlying bytes of obj and convert them back into an obj in the TestEncode function using unsafe.Pointer(), but the object's values are all corrupted. Why?
First, unsafe.Sizeof returns the number of bytes needed to store a value of the given type. It is a little bit tricky, but that is not the same as the number of bytes needed to store the data that value refers to.
For example, a slice, as is well known, stores three 4-byte words on a 32-bit machine: one uintptr for the address of the underlying array, and two ints for len and cap. So no matter how long a slice is or what element type it has, a slice always takes 12 bytes on a 32-bit machine. Similarly, a string uses 8 bytes: one uintptr for the address and one int for len.
As for the difference from reflect.TypeOf().Size(), it comes down to interfaces: reflect.TypeOf looks inside the interface, finds the concrete type, and reports the bytes needed for that concrete type, while unsafe.Sizeof just returns 8 (on a 32-bit machine) for an interface value: two uintptrs, one pointing to the data and one to the type and method table.
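To see the difference concretely, a small check (assuming Test is the struct from the question; the exact numbers depend on the architecture):

var v interface{} = Test{10, "hello world"}
fmt.Println(unsafe.Sizeof(v))         // size of the interface header: two words
fmt.Println(reflect.TypeOf(v).Size()) // size of the concrete Test value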
The second part should be clear now. For one, in Encode, unsafe.Pointer takes the address of the interface value instead of the concrete value. For another, in TestEncode, unsafe.Pointer takes the address of the 12-byte slice header rather than the data it points to. There might be other errors, but with those two present there is little point hunting for them.
Note: I avoid talking about the order of the uintptr and int fields not only because I don't know it, but also because it is undocumented, unsafe, and implementation dependent.
Note 2: Conclusion: don't try to dump the raw memory of Go data.
Note 3: I describe everything in 32-bit terms because the playground uses a 32-bit architecture, so it is easier to check.
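To make the two fixes concrete, here is a sketch that takes the address of the concrete value instead of the interface, and decodes from the slice's data rather than its header. Note 2 still stands: this is only a memory dump, the string field's backing bytes are not part of the struct, so it is not a usable serialization.

func encodeRaw(t Test) []byte {
    size := unsafe.Sizeof(t) // size of the concrete struct, not of an interface header
    out := make([]byte, size)
    for off := uintptr(0); off < size; off++ {
        // Read the struct's memory one byte at a time, starting at &t.
        out[off] = *(*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(&t)) + off))
    }
    return out
}

func decodeRaw(b []byte) Test {
    // Point at the first byte of the dumped data, not at the slice header.
    return *(*Test)(unsafe.Pointer(&b[0]))
}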

how to store a slice of byte slices?

I would like to understand how to store several byte slices separately in a slice. As hopefully illustrated below, I want the storage struct to store the compressed result of n, which ends up in buf.
type storage struct {
    compressed []byte
}

func (s *storage) compress(n []byte) {
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    w.Write(n)
    w.Close()
    store := buf.Bytes()
    s.compressed = append(s.compressed, store)
}
In your code, compressed is a slice of bytes. If you want to store multiple byte slices, you need a slice of byte slices, so change the type of compressed to [][]byte, as sketched below.
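Applied to the code in the question, a sketch of the adjusted struct and method (with the gzip errors surfaced, which the original ignored):

type storage struct {
    compressed [][]byte // one entry per compressed input
}

func (s *storage) compress(n []byte) error {
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    if _, err := w.Write(n); err != nil {
        return err
    }
    if err := w.Close(); err != nil {
        return err
    }
    // Append this compressed result as its own []byte element.
    s.compressed = append(s.compressed, buf.Bytes())
    return nil
}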
