encode object to bytes by golang unsafe? - go

func Encode(i interface{}) ([]byte, error) {
buffer := bytes.NewBuffer(make([]byte, 0, 1024))
// size := unsafe.Sizeof(i)
size := reflect.TypeOf(i).Size()
fmt.Println(size)
ptr := unsafe.Pointer(&i)
startAddr := uintptr(ptr)
endAddr := startAddr + size
for i := startAddr; i < endAddr; i++ {
bytePtr := unsafe.Pointer(i)
b := *(*byte)(bytePtr)
buffer.WriteByte(b)
}
return buffer.Bytes(), nil
}
func TestEncode(t *testing.T) {
test := Test{10, "hello world"}
b, _ := Encode(test)
ptr := unsafe.Pointer(&b)
newTest := *(*Test)(ptr)
fmt.Println(newTest.X)
}
I am learning how to use Go's unsafe package and wrote this function to encode any object. I ran into two problems. First, does unsafe.Sizeof(obj) always return the size of obj's pointer? Why does it differ from reflect.TypeOf(obj).Size()? Second, I want to iterate over the underlying bytes of obj and convert them back to obj in the TestEncode function via unsafe.Pointer(), but the object's values are all corrupted. Why?

First, unsafe.Sizeof returns the number of bytes needed to store a value of the type itself. This is a little tricky: it does not include the data that the value refers to.
For example, a slice is stored as three 4-byte words on a 32-bit machine: one pointer to the underlying array and two ints for len and cap. So no matter how long a slice is or what its element type is, the slice itself always takes 12 bytes on a 32-bit machine. Likewise, a string uses 8 bytes: one pointer for the address and one int for len.
As for the difference from reflect.TypeOf().Size(), it comes down to the interface. reflect.TypeOf looks inside the interface, finds the concrete type, and reports the bytes needed for that concrete type, while unsafe.Sizeof just returns 8 for an interface type on a 32-bit machine: two pointer-sized words, one referring to the data and one to the type information or method table.
The second part is now clear. For one, unsafe.Pointer(&i) in Encode takes the address of the interface value, not of the concrete value inside it. For another, in TestEncode, unsafe.Pointer(&b) takes the address of the 12-byte slice "header", not of the bytes it points to. There might be other errors, but with these two in place there is little point in hunting for them.
Note: I avoid discussing the order of the pointer and int fields, not only because I don't know it but also because it is undocumented, unsafe, and implementation-dependent.
Note 2: Conclusion: don't try to dump the memory of a Go value.
Note 3: I changed everything to 32-bit because the playground uses it, so it is easier to check.

Related

Go vet reports "possible misuse of reflect.SliceHeader"

I have the following code snippet which "go vet" complains about with the warning "possible misuse of reflect.SliceHeader". I cannot find much information about this warning other than this. After reading it, it is still not clear to me how to do this in a way that makes go vet happy and avoids possible GC issues.
The goal of the snippet is to have a go function copy data to memory which is managed by an opaque C library. The Go function expects a []byte as a parameter.
func Callback(ptr unsafe.Pointer, buffer unsafe.Pointer, size C.longlong) C.longlong {
...
sh := &reflect.SliceHeader{
Data: uintptr(buffer),
Len: int(size),
Cap: int(size),
}
buf := *(*[]byte)(unsafe.Pointer(sh))
err := CopyToSlice(buf)
if err != nil {
log.Fatal("failed to copy to slice")
}
...
}
https://pkg.go.dev/unsafe#go1.19.4#Pointer
Pointer represents a pointer to an arbitrary type. There are four
special operations available for type Pointer that are not available
for other types:
A pointer value of any type can be converted to a Pointer.
A Pointer can be converted to a pointer value of any type.
A uintptr can be converted to a Pointer.
A Pointer can be converted to a uintptr.
Pointer therefore allows a program to defeat the type system and read
and write arbitrary memory. It should be used with extreme care.
The following patterns involving Pointer are valid. Code not using
these patterns is likely to be invalid today or to become invalid in
the future. Even the valid patterns below come with important caveats.
Running "go vet" can help find uses of Pointer that do not conform to
these patterns, but silence from "go vet" is not a guarantee that the
code is valid.
(6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data
field to or from Pointer.
As in the previous case, the reflect data structures SliceHeader and
StringHeader declare the field Data as a uintptr to keep callers from
changing the result to an arbitrary type without first importing
"unsafe". However, this means that SliceHeader and StringHeader are
only valid when interpreting the content of an actual slice or string
value.
var s string
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s)) // case 1
hdr.Data = uintptr(unsafe.Pointer(p)) // case 6 (this case)
hdr.Len = n
In this usage hdr.Data is really an alternate way to refer to the
underlying pointer in the string header, not a uintptr variable
itself.
In general, reflect.SliceHeader and reflect.StringHeader should be used only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual slices or strings, never as plain structs. A program should not declare or allocate variables of these struct types.
// INVALID: a directly-declared header will not hold Data as a reference.
var hdr reflect.StringHeader
hdr.Data = uintptr(unsafe.Pointer(p))
hdr.Len = n
s := *(*string)(unsafe.Pointer(&hdr)) // p possibly already lost
It looks like JimB (in the comments) hinted at the most correct answer, though he didn't post it as an answer and didn't include an example. The following passes go vet, staticcheck, and golangci-lint, and doesn't segfault, so I think it is the correct answer.
func Callback(ptr unsafe.Pointer, buffer unsafe.Pointer, size C.longlong) C.longlong {
...
buf := unsafe.Slice((*byte)(buffer), size)
err := CopyToSlice(buf)
if err != nil {
log.Fatal("failed to copy to slice")
}
...
}

Casting []uint32 to []byte without copying in golang

I'm working on a processor simulator in Go (for educational purposes). I need a type that represents an addressable unit of memory. It may contain either a slice of memory (of type []byte) or one or several registers (of type []uint32), and it must be readable and writable. So, is there a way to convert []uint32 to []byte? I know there's an unsafe package, but I'm not sure exactly how to do this conversion. In other words, I need something like reinterpret_cast in C++.
I know the memory unit could be an interface with different implementations for memory, a single register, and several registers, but that is not as efficient. Making a register a byte slice also decreases performance.
The unsafe.Slice function makes it more convenient to convert an arbitrary pointer to a slice of any type. You could even make a generic version to cast a slice of any type to bytes if you were so inclined:
func castToBytes[T any](s []T) []byte {
if len(s) == 0 {
return nil
}
size := unsafe.Sizeof(s[0])
return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), int(size)*len(s))
}
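A short usage sketch of the function above, applied to a register file. The variable names are illustrative. Note that which byte of a uint32 appears first in the byte view depends on the machine's byte order, so the example only checks properties that hold on any architecture:

```go
package main

import (
	"fmt"
	"unsafe"
)

// castToBytes is the generic helper from the answer above, repeated here
// so the example is self-contained.
func castToBytes[T any](s []T) []byte {
	if len(s) == 0 {
		return nil
	}
	size := unsafe.Sizeof(s[0])
	return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), int(size)*len(s))
}

func main() {
	regs := []uint32{0x11223344, 0}
	mem := castToBytes(regs)
	fmt.Println(len(mem)) // 8: two registers of four bytes each

	// No copy was made: a write through the byte view changes the register.
	mem[4] = 0xFF
	fmt.Println(regs[1] != 0) // true
}
```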

How is this code generating memory aligned slices?

I'm trying to do direct i/o on linux, so I need to create memory aligned buffers. I copied some code to do it, but I don't understand how it works:
package main
import (
"fmt"
"golang.org/x/sys/unix"
"unsafe"
"yottaStore/yottaStore-go/src/yfs/test/utils"
)
const (
AlignSize = 4096
BlockSize = 4096
)
// Looks like dark magic
func Alignment(block []byte, AlignSize int) int {
return int(uintptr(unsafe.Pointer(&block[0])) & uintptr(AlignSize-1))
}
func main() {
path := "/path/to/file.txt"
fd, err := unix.Open(path, unix.O_RDONLY|unix.O_DIRECT, 0666)
defer unix.Close(fd)
if err != nil {
panic(err)
}
file := make([]byte, 4096*2)
a := Alignment(file, AlignSize)
offset := 0
if a != 0 {
offset = AlignSize - a
}
file = file[offset : offset+BlockSize]
n, readErr := unix.Pread(fd, file, 0)
if readErr != nil {
panic(readErr)
}
fmt.Println(a, offset, offset+utils.BlockSize, len(file))
fmt.Println("Content is: ", string(file))
}
I understand that I'm generating a slice twice as big as what I need, and then extracting a memory-aligned block from it, but the Alignment function doesn't make sense to me.
How does the Alignment function work?
If I try to fmt.Println the intermediate steps of that function I get different results. Why? I guess because observing it changes its memory alignment (like in quantum physics :D)
Edit:
Example with fmt.Println, where I don't need any more alignment:
package main
import (
"fmt"
"golang.org/x/sys/unix"
"unsafe"
)
func main() {
path := "/path/to/file.txt"
fd, err := unix.Open(path, unix.O_RDONLY|unix.O_DIRECT, 0666)
defer unix.Close(fd)
if err != nil {
panic(err)
}
file := make([]byte, 4096)
fmt.Println("Pointer: ", &file[0])
n, readErr := unix.Pread(fd, file, 0)
fmt.Println("Return is: ", n)
if readErr != nil {
panic(readErr)
}
fmt.Println("Content is: ", string(file))
}
Your AlignSize is a power of 2. In binary representation it is a single 1 bit followed by zeros:
fmt.Printf("%b", AlignSize) // 1000000000000
A slice allocated by make() may have a memory address that is more or less random, consisting of ones and zeros following randomly in binary; or more precisely the starting address of its backing array.
Since you allocate twice the required size, that's a guarantee that the backing array will cover an address space that has an address in the middle somewhere that ends with as many zeros as the AlignSize's binary representation, and has BlockSize room in the array starting at this. We want to find this address.
This is what the Alignment() function does. It gets the starting address of the backing array with &block[0]. In Go there's no pointer arithmetic, so in order to do something like that, we have to convert the pointer to an integer (there is integer arithmetic of course). In order to do that, we have to convert the pointer to unsafe.Pointer: all pointers are convertible to this type, and unsafe.Pointer can be converted to uintptr (which is an unsigned integer large enough to store the uninterpreted bits of a pointer value), on which–being an integer–we can perform integer arithmetic.
We use bitwise AND with the value uintptr(AlignSize-1). Since AlignSize is a power of 2 (contains a single 1 bit followed by zeros), the number one less is a number whose binary representation is full of ones, as many as trailing zeros AlignSize has. See this example:
x := 0b1010101110101010101
fmt.Printf("AlignSize : %22b\n", AlignSize)
fmt.Printf("AlignSize-1 : %22b\n", AlignSize-1)
fmt.Printf("x : %22b\n", x)
fmt.Printf("result of & : %22b\n", x&(AlignSize-1))
Output:
AlignSize : 1000000000000
AlignSize-1 : 111111111111
x : 1010101110101010101
result of & : 110101010101
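So the result of & is how far the address lies past the previous multiple of AlignSize. If you subtract it from AlignSize, you get the offset to add to the starting address to reach an address that has as many trailing zeros as AlignSize itself: the result is "aligned" to a multiple of AlignSize.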
So we will use the part of the file slice starting at offset, and we only need BlockSize:
file = file[offset : offset+BlockSize]
Edit:
Looking at your modified code trying to print the steps: I get an output like:
Pointer: 0xc0000b6000
Unsafe pointer: 0xc0000b6000
Unsafe pointer, uintptr: 824634466304
Unpersand: 0
Cast to int: 0
Return is: 0
Content is:
Note nothing is changed here. Simply the fmt package prints pointer values using hexadecimal representation, prefixed by 0x. uintptr values are printed as integers, using decimal representation. Those values are equal:
fmt.Println(0xc0000b6000, 824634466304) // output: 824634466304 824634466304
Also note the rest is 0 because in my case 0xc0000b6000 is already a multiple of 4096; in binary it is 1100000000000000000010110110000000000000.
Edit #2:
When you use fmt.Println() to debug parts of the calculation, that may change escape analysis and may change the allocation of the slice (from stack to heap). This depends on the used Go version too. Do not rely on your slice being allocated at an address that is (already) aligned to AlignSize.
See related questions for more details:
Mix print and fmt.Println and stack growing
why struct arrays comparing has different result
Addresses of slices of empty structs

Calling kernel32's ReadProcessMemory in Go

I'm trying to manipulate processes on Windows using Go language,
and I'm starting off by reading other process' memory by using ReadProcessMemory.
However, for most of the addresses I get Error: Only part of a ReadProcessMemory or WriteProcessMemory request was completed. error. Maybe my list of arguments is wrong, but I can't find out why.
Can anyone point out what I am doing wrong here?
package main
import (
"fmt"
)
import (
windows "golang.org/x/sys/windows"
)
func main() {
handle, _ := windows.OpenProcess(0x0010, false, 6100) // 0x0010 PROCESS_VM_READ, PID 6100
procReadProcessMemory := windows.MustLoadDLL("kernel32.dll").MustFindProc("ReadProcessMemory")
var data uint = 0
var length uint = 0
for i := 0; i < 0xffffffff; i += 2 {
fmt.Printf("0x%x\n", i)
// BOOL ReadProcessMemory(HANDLE hProcess, LPCVOID lpBaseAddress, LPVOID lpBuffer, DWORD nSize, LPDWORD lpNumberOfBytesRead)
ret, _, e := procReadProcessMemory.Call(uintptr(handle), uintptr(i), uintptr(data), 2, uintptr(length)) // read 2 bytes
if (ret == 0) {
fmt.Println(" Error:", e)
} else {
fmt.Println(" Length:", length)
fmt.Println(" Data:", data)
}
}
windows.CloseHandle(handle)
}
uintptr(data) is incorrect: it takes the value of data (0, of type uint) and converts it to the uintptr type, yielding the same value in another type, which on x86 is a null pointer. The same applies to uintptr(length).
Note that Go is not C, and you can't really play dirty games with pointers in it, or, rather, you can, but only through using the unsafe built-in package and its Pointer type which is like void* (pointing somewhere in a data memory block) in C.
What you need is something like
import "unsafe"
var (
data [2]byte
length uint32
)
ret, _, e := procReadProcessMemory.Call(uintptr(handle), uintptr(i),
uintptr(unsafe.Pointer(&data[0])),
2, uintptr(unsafe.Pointer(&length))) // read 2 bytes
Observe what was done here:
A variable of type "array of two bytes" is declared;
The address of the first element of this array is taken;
That address is type-converted to the type unsafe.Pointer;
The obtained value is then type-converted to uintptr.
The last two steps are needed because Go features garbage collection:
In Go, when you take the address of a value in memory and store it in a variable, the GC knows about this pointer, and the value whose address was taken won't be garbage-collected as long as that pointer is reachable, even if the pointer is the only reference left.
Even if you make that address value lose the type information it maintains — through type-converting it to unsafe.Pointer, the new value is still considered by GC and behaves like "normal" values containing addresses — as explained above.
By type-converting such value to uintptr you make GC stop considering it as a pointer. Hence this type is there only for FFI/interop.
In other words, in
var data [2]byte
a := &data[0]
p := unsafe.Pointer(a)
i := uintptr(p)
there are only three references to the value in data: that variable itself, a and p, but not i.
You should keep these rules in mind when calling outside code, because you should never pass around uintptr-typed values: they're only for marshaling data to the called functions and unmarshaling it back, and have to be used "on the spot", in the same scope as the values they are type-converted from/to.
Also observe that in Go, you can't just take the address of a variable of an integer type and supply that address to a function which expects a pointer to a memory block of an appropriate size. You have to deal with byte arrays and after the data has been written by the called function, you need to explicitly convert it to a value of the type you need. That's why there's no "type casts" in Go but only "type conversions": you can't reinterpret the data type of a value through type-conversion, with the uintptr(unsafe.Pointer) (and back) being a notable exception for the purpose of FFI/interop, and even in this case you basically convert a pointer to a pointer, just transfer it through the GC boundary.
To "serialize" and "deserialize" a value of an integer type you might use the encoding/binary standard package or hand-roll no-brainer simple functions which do bitwise shifts and or-s and so on ;-)
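Both approaches can be sketched briefly. The two bytes here are made-up stand-ins for what the called function would write into data, and little-endian order is an assumption that matches x86:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWord interprets the two bytes filled in by the callee as a
// little-endian 16-bit value using encoding/binary.
func decodeWord(data [2]byte) uint16 {
	return binary.LittleEndian.Uint16(data[:])
}

// decodeWordManual does the same with explicit shifts and ORs.
func decodeWordManual(data [2]byte) uint16 {
	return uint16(data[0]) | uint16(data[1])<<8
}

func main() {
	data := [2]byte{0x4D, 0x5A} // hypothetical bytes read from the other process
	fmt.Printf("0x%04x\n", decodeWord(data))                // 0x5a4d
	fmt.Println(decodeWord(data) == decodeWordManual(data)) // true
}
```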
2015-10-05, updated as per the suggestion of James Henstridge.
Note that after the function returns and ret signals that there was no error, you have to check the value of the length variable.

Why does io.WriterTo's WriteTo method return an int64 rather than an int?

Most of the output methods in Go's io package return (int, error), for example io.Writer's Write([]byte) method and the io.WriteString(io.Writer, string) function. However, a few of the output methods, such as io.WriterTo's WriteTo method, return (int64, error) instead. This makes it inconvenient to implement WriteTo in terms of Write or WriteString without storing an intermediate value and type converting it from int to int64. What is the reason for this discrepancy?
It's possible that WriteTo copies more bytes than an int32 can represent (more than 2147483647).
With the io.Reader and io.Writer interfaces, the amount of data is limited by the size of the given slice, whose length is limited by int for the current architecture.
The signature of the Writer.Write() method:
Write(p []byte) (n int, err error)
It writes the contents of a slice. Quoting from the spec: Slice types:
A slice is a descriptor for a contiguous segment of an underlying array...
As we all know, the slice has an underlying array. Quoting again from the Spec: Array types:
The length is part of the array's type; it must evaluate to a non-negative constant representable by a value of type int.
So the maximum length of an array is limited by the maximum value of the int type (which is 2147483647 in case of 32 bit and 9223372036854775807 in case of 64 bit architectures).
So back to the Writer.Write() method: since it writes the content of the passed slice, it is guaranteed that the number of written bytes will not be more than what fits into an int.
Now the WriterTo.WriteTo() method:
WriteTo(w Writer) (n int64, err error)
A slice or array is nowhere mentioned. You have no guarantees that the result will fit into an int, so the int64 is more than justified.
Example: BigBuffer
Imagine a BigBuffer implementation which temporarily writes data into an array or slice. The implementation may manage multiple arrays so that if one is full (e.g. it reached max int), it continues in another one. Now if this BigBuffer implements the io.WriterTo interface and you call this method to write the content into an os.File, the result may be more than max int.

Resources