Index Out of Range when using binary.PutVarint(...) - go

http://play.golang.org/p/RqScJVvpS7
package main
import (
"fmt"
"math/rand"
"encoding/binary"
)
func main() {
buffer := []byte{0, 0, 0, 0, 0, 0, 0, 0}
num := rand.Int63()
count := binary.PutVarint(buffer, num)
fmt.Println(count)
}
I had this working a while ago, when num was just an incrementing uint64 and I was using binary.PutUvarint, but now that it's a random int64 with binary.PutVarint I get an error:
panic: runtime error: index out of range
goroutine 1 [running]:
encoding/binary.PutUvarint(0x1042bf58, 0x8, 0x8, 0x6ccb, 0xff9faa4, 0x9acb0442, 0x7fcfd52, 0x4d658221)
/usr/local/go/src/encoding/binary/varint.go:44 +0xc0
encoding/binary.PutVarint(0x1042bf58, 0x8, 0x8, 0x6ccb, 0x7fcfd52, 0x4d658221, 0x14f9e0, 0x104000e0)
/usr/local/go/src/encoding/binary/varint.go:83 +0x60
main.main()
/tmp/sandbox010341234/main.go:12 +0x100
What am I missing? I would have thought this to be a trivial change...
EDIT: I just tried extending my buffer array. For some odd reason it works and I get a count of 10. How can that be? int64 is 64 bits = 8 bytes, right?

Quoting the doc of encoding/binary:
The varint functions encode and decode single integer values using a variable-length encoding; smaller values require fewer bytes. For a specification, see https://developers.google.com/protocol-buffers/docs/encoding.
So binary.PutVarint() uses a variable-length encoding, not a fixed-length one. When passing an int64, it needs more than 8 bytes for large numbers (up to binary.MaxVarintLen64, which is 10) and fewer than 8 bytes for small numbers. Since the number you're encoding is random, it will have random bits even in its highest byte.
See this simple example:
buffer := make([]byte, 100)
for num := int64(1); num < 1<<60; num <<= 4 {
count := binary.PutVarint(buffer, num)
fmt.Printf("Num=%d, bytes=%d\n", num, count)
}
Output:
Num=1, bytes=1
Num=16, bytes=1
Num=256, bytes=2
Num=4096, bytes=2
Num=65536, bytes=3
Num=1048576, bytes=4
Num=16777216, bytes=4
Num=268435456, bytes=5
Num=4294967296, bytes=5
Num=68719476736, bytes=6
Num=1099511627776, bytes=6
Num=17592186044416, bytes=7
Num=281474976710656, bytes=8
Num=4503599627370496, bytes=8
Num=72057594037927936, bytes=9
The essence of variable-length encoding is that small numbers use fewer bytes, but this can only be achieved if in turn big numbers may use more than 8 bytes (the size of an int64).
Details of the specific encoding are on the linked page.
A very easy example would be: A byte is 8 bits. Use 7 bits of the output byte as the "useful" bits to encode the data/number. If the highest bit is 1, that means more bytes are required. If highest bit is 0, we're done. You can see that small numbers can be encoded using 1 output byte (e.g. n=10), while we're using 1 extra bit for every 7-bit useful data, so if the input number uses all the 64 bits, we will end up with more than 8 bytes: 10 groups are required to cover 64 bits, so we will need 10 bytes (9 groups is only 9*7=63 bits).

Does Go use something like space padding for structs?

I was playing around in Go and trying to calculate the size of struct objects, and I found something interesting. Take a look at the following structs:
type Something struct {
anInteger int16 // 2 bytes
anotherInt int16 // 2 bytes
yetAnother int16 // 2 bytes
someBool bool // 1 byte
} // I expected 7 bytes total
type SomethingBetter struct {
anInteger int16 // 2 bytes
anotherInt int16 // 2 bytes
yetAnother int16 // 2 bytes
someBool bool // 1 byte
anotherBool bool // 1 byte
} // I expected 8 bytes total
type Nested struct {
Something // 7 bytes expected at first
completingByte bool // 1 byte
} // 8 bytes expected at first sight
But the result I got using unsafe.Sizeof(...) was as follows:
Something -> 8 bytes
SomethingBetter -> 8 bytes
Nested -> 12 bytes; even after finding out that "Something" uses 8 bytes, I thought this might use 9 bytes
I suspect that Go does something like padding, but I don't know how or why it does that. Is there some formula or logic? If it uses space padding, is it done randomly or based on some rules?
Yes, we have padding! If your system architecture is 32-bit, the word size is 4 bytes, and if it is 64-bit, the word size is 8 bytes. Now, what is the word size? "Word size" refers to the number of bits processed by a computer's CPU in one go (these days, typically 32 bits or 64 bits). Data bus size, instruction size, and address size are usually multiples of the word size.
For example, suppose this struct:
type data struct {
a bool // 1 byte
b int64 // 8 bytes
}
This struct is not 9 bytes: when our word size is 8, the CPU reads the 1-byte bool in the first cycle, and the remaining 7 bytes of that word are padding.
Imagine:
p: padding
+-----------------------------------------+----------------+
| 1-byte bool | p | p | p | p | p | p | p | int-64 |
+-----------------------------------------+----------------+
first 8 bytes second 8 bytes
For better performance, order your struct fields from largest to smallest.
This layout wastes space on padding:
type data struct {
a string // 16 bytes size 16
b int32 // 4 bytes size 20
// 4 bytes padding size 24
c string // 16 bytes size 40
d int32 // 4 bytes size 44
// 4 bytes padding size 48 - Aligned on 8 bytes
}
This one is better:
type data struct {
a string // 16 bytes size 16
c string // 16 bytes size 32
d int32 // 4 bytes size 36
b int32 // 4 bytes size 40
// no padding size 40 - Aligned on 8 bytes (5 words)
}
See here for more examples.

What's happening with this method?

type IntSet struct {
words []uint64
}
func (s *IntSet) Has(x int) bool {
word, bit := x/64, uint(x%64)
return word < len(s.words) && s.words[word]&(1<<bit) != 0
}
Let's go through what I think is going on:
A new type called IntSet is declared. Underneath its type declaration is a uint64 slice.
A method called Has() is created. It can only receive IntSet types; after playing around with ints she returns a bool.
Before she can play she needs two ints. She stores these babies on the stack.
Lost for words.
This method's purpose is to report whether the set contains the non-negative value x. Here is the Go test:
func TestExample1(t *testing.T) {
//!+main
var x, y IntSet
fmt.Println(x.Has(9), x.Has(123)) // "true false"
//!-main
// Output:
// true false
}
Looking for some guidance understanding what this method is doing inside. And why the programmer did it in such complicated means (I feel like I am missing something).
The return statement:
return word < len(s.words) && s.words[word]&(1<<bit) != 0
Is the order of operations this?
return (word < len(s.words)) && (s.words[word]&(1<<bit) != 0)
And what are [word] and & doing within:
s.words[word]&(1<<bit) != 0
edit: I am slowly beginning to see that:
s.words[word]&(1<<bit) != 0
is just indexing a slice, but I don't understand the &
As I read the code, I scribbled some notes:
package main
import "fmt"
// A set of bits
type IntSet struct {
// bits are grouped into 64 bit words
words []uint64
}
// x is the index for a bit
func (s *IntSet) Has(x int) bool {
// The word index for the bit
word := x / 64
// The bit index within a word for the bit
bit := uint(x % 64)
if word < 0 || word >= len(s.words) {
// error: word index out of range
return false
}
// the bit set within the word
mask := uint64(1 << bit)
// true if the bit in the word set
return s.words[word]&mask != 0
}
func main() {
nBits := 2*64 + 42
// round up to whole word
nWords := (nBits + (64 - 1)) / 64
bits := IntSet{words: make([]uint64, nWords)}
// bit 127 = 1 * 64 + 63
bits.words[1] = 1 << 63
fmt.Printf("%b\n", bits.words)
for i := 0; i < nWords*64; i++ {
has := bits.Has(i)
if has {
fmt.Println(i, has)
}
}
has := bits.Has(127)
fmt.Println(has)
}
Playground: https://play.golang.org/p/rxquNZ_23w1
Output:
[0 1000000000000000000000000000000000000000000000000000000000000000 0]
127 true
true
The Go Programming Language Specification
Arithmetic operators
& bitwise AND integers
peterSO's answer is spot on - read it. But I figured this might also help you understand.
Imagine I want to store some random numbers in the range 1 - 8. After I store these numbers I will be asked if the number n (also in the range of 1 - 8) appears in the numbers I recorded earlier. How would we store the numbers?
One, probably obvious, way would be to store them in a slice or maybe a map. Maybe we would choose a map since lookups will be constant time. So we create our map
seen := map[uint8]struct{}{}
Our code might look something like this
type IntSet struct {
seen map[uint8]struct{}
}
func (i *IntSet) AddValue(v uint8) {
i.seen[v] = struct{}{}
}
func (i *IntSet) Has(v uint8) bool {
_, ok := i.seen[v]
return ok
}
For each number we store we take up (at least) 1 byte (8 bits) of memory. If we were to store all 8 numbers we would be using 64 bits / 8 bytes.
However, as the name implies, this is an int Set. We don't care about duplicates, we only care about membership (which Has provides for us).
But there is another way we could store these numbers, and we could do it all within a single byte. Since a byte provides 8 bits, we can use these 8 bits as markers for values we have seen. The initial value (in binary notation) would be
00000000 == uint8(0)
If we did an AddValue(3) we could change the 3rd bit and end up with
00000100 == uint8(4)
     ^
     |______ 3rd bit
If we then called AddValue(8) we would have
10000100 == uint8(132)
^    ^
|    |______ 3rd bit
|___________ 8th bit
So after adding 3 and 8 to our IntSet we have the internally stored integer value of 132. But how do we take 132 and figure out whether a particular bit is set? Easy, we use bitwise operators.
The & operator is a bitwise AND. It returns the bits that are set in both of the numbers on each side of the operator. For example
  10001100     01110111     11111111
& 01110100   & 01110000   & 00000001
  --------     --------     --------
  00000100     01110000     00000001
So to find out if n is in our set we simply do
our_set_value & (1 << (value_we_are_looking_for - 1))
which if we were searching for 4 would yield
10000100
& 00001000
----------
0 <-- so 4 is not present
or if we were searching for 8
10000100
& 10000000
----------
10000000 <-- so 8 is present
You may have noticed I subtracted 1 from value_we_are_looking_for. This is because I am fitting 1-8 into our 8-bit number. The code you posted is zero-based instead: the value 0 maps to the lowest bit, so it never has to subtract 1.
Assuming you understand all of that, here's where things get interesting. So far we have been storing our values in a uint8, so we could only have 8 values. But there are larger integer types that have more bits, like uint64. Instead of 8 values, we can store 64 values! But what happens if the range of values we want to track exceeds 0-63? What if we want to store 65? This is where the slice of words comes from in the original code.
Since the code posted is zero-based, from now on I will be as well.
We can use the first uint64 to store the numbers 0 - 63. When we want to store the numbers 64-127 we need a new uint64. So our slice would be something like
[ uint64_of_0-63, uint64_of_64-127, uint64_of_128-191, etc]
Now, to answer the question about whether a number is in our set we need to first find the uint64 whose range would contain our number. If we were searching for 110 we would want to use the uint64 located at index 1 (uint64_of_64-127) because 110 would fall in that range.
To find the index of the word we need to look at, we take the whole number value of n / 64. In the case of 110 we would get 1, which is exactly what we want.
Now we need to examine the specific bit of that word. The bit that needs to be checked would be the remainder when dividing 110 by 64, or 46. So if bit 46 (counting from 0) of the word at index 1 is set, then we have seen 110 before.
This is how it might look in code
type IntSet struct {
words []uint64
}
func (s *IntSet) Has(x int) bool {
word, bit := x/64, uint(x%64)
return word < len(s.words) && s.words[word]&(1<<bit) != 0
}
func (s *IntSet) AddValue(x int) {
word := x / 64
bit := x % 64
if word < len(s.words) {
s.words[word] |= (1 << uint64(bit))
}
}
And here is some code to test it
func main() {
rangeUpper := 1000
bits := IntSet{words: make([]uint64, (rangeUpper/64)+1)}
bits.AddValue(127)
bits.AddValue(8)
bits.AddValue(63)
bits.AddValue(64)
bits.AddValue(998)
fmt.Printf("%b\n", bits.words)
for i := 0; i < rangeUpper; i++ {
if ok := bits.Has(i); ok {
fmt.Printf("Found %d\n", i)
}
}
}
OUTPUT
Found 8
Found 63
Found 64
Found 127
Found 998
Note
The |= is another bitwise operator, OR. It combines the two values, keeping a 1 anywhere there is a 1 in either value
  10000000     00000001     00000001
| 01000000   | 10000000   | 00000001
  --------     --------     --------
  11000000     10000001     00000001   <-- important that we
                                           can set the value
                                           multiple times
Using this method we can reduce the cost of storage for 65535 numbers from 131KB (as 2-byte values) to just 8KB (65536 bits). This type of bit manipulation for set membership is very common in implementations of Bloom filters.
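One limitation of the AddValue above is that it silently ignores values beyond the preallocated slice. As a sketch of one alternative (not necessarily what the original author intended), an Add method can grow the slice on demand, so that the zero value of IntSet is usable:

```go
package main

import "fmt"

// IntSet is a set of small non-negative integers stored as a bit vector.
type IntSet struct {
	words []uint64
}

// Has reports whether the set contains the non-negative value x.
func (s *IntSet) Has(x int) bool {
	word, bit := x/64, uint(x%64)
	return word < len(s.words) && s.words[word]&(1<<bit) != 0
}

// Add inserts the non-negative value x, appending zero words
// until the word that holds x exists.
func (s *IntSet) Add(x int) {
	word, bit := x/64, uint(x%64)
	for word >= len(s.words) {
		s.words = append(s.words, 0)
	}
	s.words[word] |= 1 << bit
}

func main() {
	var s IntSet // no make() needed
	s.Add(0)
	s.Add(1000)
	fmt.Println(s.Has(0), s.Has(1000), s.Has(999), len(s.words))
}
```

The trade-off is that a single large value allocates every word below it, which is fine for dense sets and wasteful for sparse ones.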
An IntSet represents a set of integers. The presence in the set of any of a contiguous range of integers can be established by writing a single bit in the IntSet. Likewise, checking whether a specific integer is in the IntSet can be done by checking whether the particular bit corresponding to that integer is set.
So the code is finding the specific uint64 in the IntSet corresponding to the integer:
word := x/64
and then the specific bit in that uint64:
bit := uint(x%64)
and then checking first that the integer being tested is in the range supported by the IntSet:
word < len(s.words)
and then whether the specific bit corresponding to the specific integer is set:
&& s.words[word]&(1<<bit) != 0
This part:
s.words[word]
pulls out the specific uint64 of the IntSet that tracks whether the integer in question is in the set.
&
is a bitwise AND.
(1<<bit)
means take a 1, shift it to the bit position representing the specific integer being tested.
Performing the bitwise AND between that uint64 and the bit-shifted 1 will return 0 if the bit corresponding to the integer is not set, and a non-zero value (the bit-shifted 1 itself) if the bit is set (meaning the integer in question is a member of the IntSet).
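A small sketch of that last step in isolation, using a made-up helper named hasBit. It shows why the original code compares against 0 rather than 1: the result of the AND is the mask itself, not a boolean.

```go
package main

import "fmt"

// hasBit (hypothetical helper) reports whether bit number bit,
// counting from 0 at the low end, is set in word.
func hasBit(word uint64, bit uint) bool {
	mask := uint64(1) << bit
	return word&mask != 0
}

func main() {
	word := uint64(0x84) // binary 10000100: bits 2 and 7 are set

	fmt.Println(word & (1 << 7)) // the mask value 128, not 1
	fmt.Println(word & (1 << 3)) // 0, bit 3 is clear
	fmt.Println(hasBit(word, 7), hasBit(word, 3))
}
```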

How to return the port number in 2 bytes to client in socks5 proxy?

I am trying to implement socks5 proxy server.
Most things are clear according to the rfc but I'm stuck interpreting client port and writing my port number in bytes.
I made a function that takes an int and returns 2 bytes. This function first converts the number into binary, then literally splits the bits as a string, then converts them back to bytes. However, this seems wrong, because if the rightmost bits are 0 they are lost.
Here is the function
func getBytesOfInt(i int) []byte {
binary := fmt.Sprintf("%b", i)
if i < 255 {
return []byte{byte(i)}
}
first := binary[:8]
last := binary[9:]
fmt.Println(binary, first, last)
i1, _ := strconv.ParseInt(first, 2, 64)
i2, _ := strconv.ParseInt(last, 2, 64)
return []byte{byte(i1), byte(i2)}
}
Can you please explain how I am supposed to parse the number and get 2 bytes and, most importantly, how I am going to convert it back to an integer?
Currently, if you give 1024 to this function it will return []byte{0x80, 0x0}, which is 128 in decimal, but as you see the right bits are lost; there's only one 0, which is useless.
Your code has multiple problems. First, [:8] and [9:] skip an element (index 8), see: https://play.golang.org/p/yuhh4ZeJFNL
Also, you should interpret the second byte as the low byte of the int and the first as the high byte, rather than literally cutting the binary string. For example, 4 should be interpreted as [0x0, 0x4] instead of [0x4, 0x0], which would be 1024.
If you want to keep using strconv you should use:
n := len(binary)
first := binary[:n-8]
last := binary[n-8:]
However, it is very inefficient.
I would suggest b[0], b[1] = byte(i>>8), byte(i&255) to encode, and i = int(b[0])<<8 + int(b[1]) to decode.
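The standard library can also do the shifting for you. Below is a minimal sketch using encoding/binary with big-endian (network) byte order, which is the order SOCKS5 uses for the port field. The function names portToBytes and bytesToPort are invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// portToBytes encodes a port as 2 bytes, high byte first.
func portToBytes(port uint16) []byte {
	b := make([]byte, 2)
	binary.BigEndian.PutUint16(b, port)
	return b
}

// bytesToPort decodes the first 2 bytes of b as a big-endian port.
// It panics if b is shorter than 2 bytes.
func bytesToPort(b []byte) uint16 {
	return binary.BigEndian.Uint16(b)
}

func main() {
	b := portToBytes(1024)
	fmt.Printf("% x\n", b)      // 04 00
	fmt.Println(bytesToPort(b)) // 1024
}
```

Taking a uint16 rather than an int makes it impossible to pass a value that does not fit in 2 bytes.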

go - encoding unsigned 16 bit float in binary

In Go, how can I encode a float into a byte array as a 16 bit unsigned float with 11 explicit bits of mantissa and 5 bits of explicit exponent?
There doesn't seem to be a clean way to do it. The only thing I can think of is encoding it as in Convert byte array "[]uint8" to float64 in GoLang and manually truncating the bits.
Is there a "go" way to do this?
Here's the exact definition:
A 16 bit unsigned float with 11 explicit bits of mantissa and 5 bits of explicit exponent
The bit format is loosely modeled after IEEE 754. For example, 1 microsecond is represented as 0x1, which has an exponent of zero, presented in the 5 high order bits, and mantissa of 1, presented in the 11 low order bits. When the explicit exponent is greater than zero, an implicit high-order 12th bit of 1 is assumed in the mantissa. For example, a floating value of 0x800 has an explicit exponent of 1, as well as an explicit mantissa of 0, but then has an effective mantissa of 4096 (12th bit is assumed to be 1). Additionally, the actual exponent is one-less than the explicit exponent, and the value represents 4096 microseconds. Any values larger than the representable range are clamped to 0xFFFF.
I am not sure whether I understand the encoding correctly (see my comment on the original question), but here is a function which may do what you want:
func EncodeFloat(seconds float64) uint16 {
us := math.Floor(1e6*seconds + 0.5)
if us < 0 {
panic("cannot encode negative value")
} else if us > (1<<30)*4095+0.5 {
return 0xffff
}
usInt := uint64(us)
expBits := uint16(0)
if usInt >= 2048 {
exp := uint16(1)
for usInt >= 4096 {
exp++
usInt >>= 1
}
usInt -= 2048
expBits = exp << 11
}
return expBits | uint16(usInt)
}
(code is at http://play.golang.org/p/G599VOBMcL )
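For completeness, here is a sketch of the inverse direction. It decodes exactly what EncodeFloat above produces, so it inherits the same uncertainty about whether that reading of the format is the intended one. The name DecodeFloat is made up, and it returns whole microseconds rather than seconds to keep the arithmetic exact.

```go
package main

import "fmt"

// DecodeFloat (hypothetical) reverses EncodeFloat from the answer above
// and returns the value in microseconds.
func DecodeFloat(v uint16) uint64 {
	exp := v >> 11             // 5 high-order bits: explicit exponent
	mant := uint64(v & 0x7ff) // 11 low-order bits: explicit mantissa
	if exp == 0 {
		// Small values are stored directly.
		return mant
	}
	// Restore the implicit 12th bit, then undo the right shifts
	// that the encoder applied (one fewer than the exponent).
	return (mant | 0x800) << (exp - 1)
}

func main() {
	for _, v := range []uint16{0x1, 0x7ff, 0x800, 0x1000, 0xffff} {
		fmt.Printf("0x%04x -> %d us\n", v, DecodeFloat(v))
	}
}
```

Because the encoder drops low bits when shifting, decoding an encoded value gives back the original only up to that truncation.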

Convert uint64 to int64 without loss of information

The problem with the following code:
var x uint64 = 18446744073709551615
var y int64 = int64(x)
is that y is -1. Without loss of information, is the only way to convert between these two number types to use an encoder and decoder?
buff bytes.Buffer
Encoder(buff).encode(x)
Decoder(buff).decode(y)
Note, I am not attempting a straight numeric conversion in your typical case. I am more concerned with maintaining the statistical properties of a random number generator.
Your conversion does not lose any information. All the bits are untouched. It is just that:
uint64(18446744073709551615) = 0xFFFFFFFFFFFFFFFF
int64(-1) = 0xFFFFFFFFFFFFFFFF
Try:
var x uint64 = 18446744073709551615 - 3
and you will have y = -4.
For instance:
var x uint64 = 18446744073709551615 - 3
var y int64 = int64(x)
fmt.Printf("%b\n", x)
fmt.Printf("%b or %d\n", y, y)
Output:
1111111111111111111111111111111111111111111111111111111111111100
-100 or -4
Seeing -1 would be consistent with a process running as 32-bit.
See for instance the Go 1.1 release notes (which made int and uint 64 bits wide on 64-bit platforms)
x := ^uint32(0) // x is 0xffffffff
i := int(x) // i is -1 on 32-bit systems, 0xffffffff on 64-bit
fmt.Println(i)
Using fmt.Printf("%b\n", y) can help to see what is going on (see ANisus' answer)
As it turned out, the OP (wheaties) confirmed in the comments that it was initially run in 32 bits (hence this answer), but then realized that 18446744073709551615 is 0xffffffffffffffff (-1) anyway: see ANisus' answer.
The types uint64 and int64 can both represent 2^64 discrete integer values.
The difference between the two is that uint64 holds only non-negative integers (0 thru 2^64-1), whereas int64 holds both negative and non-negative integers, using 1 bit to hold the sign (-2^63 thru 2^63-1).
As others have said, if your generator is producing 0xffffffffffffffff, uint64 will represent this as the raw integer (18,446,744,073,709,551,615), whereas int64 will interpret it as a two's complement value and return -1.
