Why does big.NewInt(0).Bytes() return [] instead of [0] in Go?

I find it weird that running big.NewInt(0).Bytes() returns [] instead of [0]. Is it really supposed to work that way?
https://play.golang.org/p/EEaS8sCvhFb

big.Int is a struct. It's idiomatic to make the zero value useful whenever possible. big.Int is no exception: The zero value for an Int represents the value 0.
It's an implementation detail, but the data of the Int is stored in a slice. The zero value for slices is nil, that is: no elements.
So this is very convenient, and very efficient. 0 is probably the most frequent value, and there may be cases where an initial big.Int won't get changed, and so no slice for the internal representation will be allocated.
See related: Is there another way of testing if a big.Int is 0?

From the documentation:
Bytes returns the absolute value of x as a big-endian byte slice.
The package API doesn't define how many bytes long the slice will be. In this case, it's using the smallest number of bytes needed to convey the whole number.
The more likely reason why this happens is an implementation detail: The big.Int maintains the bytes of the number in a slice. nil slices in Go (the zero value of a slice) have length 0. When a big.Int value is initially created, we'd expect it to also have a value of 0. Therefore, it simplifies the implementation if an empty slice internally corresponds to a numerical value of 0, without needing to perform extra checks or padding.
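A minimal sketch of the behavior described above, using only math/big. The helper names minimalBytes and paddedBytes are made up for this illustration. FillBytes (available since Go 1.15) is one way to get a fixed-width result such as [0] when a caller needs it; it is not the only option.

```go
package main

import (
	"fmt"
	"math/big"
)

// minimalBytes returns |n| as a big-endian slice using as few bytes as
// possible. For n == 0 that means no bytes at all.
// (Hypothetical helper, named for this example only.)
func minimalBytes(n int64) []byte {
	return big.NewInt(n).Bytes()
}

// paddedBytes returns |n| as a big-endian slice of exactly width bytes,
// zero-padded on the left. FillBytes panics if the value does not fit.
// (Hypothetical helper, named for this example only.)
func paddedBytes(n int64, width int) []byte {
	return big.NewInt(n).FillBytes(make([]byte, width))
}

func main() {
	var zero big.Int // the zero value is ready to use and represents 0

	fmt.Println(len(zero.Bytes()), len(minimalBytes(0))) // 0 0
	fmt.Println(minimalBytes(256))                       // [1 0]
	fmt.Println(paddedBytes(0, 1))                       // [0]
	fmt.Println(zero.Sign() == 0)                        // true
}
```

If the goal is only to test for zero, Sign() == 0 avoids building a byte slice at all.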

Related

for loop value semantic in golang

This is my first question about Go on SO. The code below shows that n has the same address in each iteration. I am aware that some people say such a for loop has value semantics, and that what is actually ranged over is a copy of the slice, not the slice itself. Why does n have the same address in each iteration? Is it because each element of the slice is copied individually, rather than the whole slice being copied once beforehand? If only one element at a time is copied from the original slice, can a single memory address be reused in each iteration?
package main
import (
	"fmt"
)
func main() {
	numbers := []int{1, 2}
	for i, n := range numbers {
		fmt.Println(&n, &numbers[i])
	}
}
A sample result from go playground:
0xc000122030 0xc000122020
0xc000122030 0xc000122028
Your question is slightly off: it is not a copy of the slice's elements that is being iterated over. In Go, when you pass a slice you really pass a pointer to memory plus the length and capacity of that memory; this is called a slice header. The header is copied, but the copy points to the same underlying memory. So when you pass a []int to a function and change the values in that function, the values are changed in the original []int in the calling code as well.
This is in contrast to an array like [5]int, which is passed by value, meaning it really is copied when you pass it around. In Go, structs, strings, numbers and arrays are passed by value. Slices are also passed by value, but as described above, the value in this case contains a pointer to memory. Passing a copy of a pointer still lets you change the memory it points to.
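The difference between passing a slice and passing an array can be sketched as follows. The function names are invented for this example.

```go
package main

import "fmt"

// setFirstSlice receives a copy of the slice header. The copy still
// points at the caller's backing array, so the write is visible outside.
func setFirstSlice(s []int, v int) {
	s[0] = v
}

// setFirstArray receives a full copy of the array, so the write is
// lost when the function returns.
func setFirstArray(a [3]int, v int) {
	a[0] = v
}

func main() {
	s := []int{1, 2, 3}
	a := [3]int{1, 2, 3}

	setFirstSlice(s, 99)
	setFirstArray(a, 99)

	fmt.Println(s[0], a[0]) // 99 1
}
```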
Now to your experiment:
for i, n := range numbers
will create two variables before the loop starts: the integers i and n. In each loop iteration, i is incremented by 1 and n is assigned the value of numbers[i] (a copy of the integer value, that is).
This means there really are only two variables, i and n. The same n is reused in every iteration, which is why its address is identical in your output. (This describes Go before 1.22; since Go 1.22, each iteration of a for loop gets its own i and n.)
The addresses of numbers[i] are different of course, they are the memory addresses of the items in the array.
The Go Wiki has a Common Mistakes page talking about this exact issue. It also provides an explanation of how to avoid this issue in real code. The quick answer is that this is done for efficiency, and has little to do with the slice. n is a single variable / memory location that gets assigned a new value on each iteration.
If you want additional insight into why this happens under the hood, take a look at this post.
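A small sketch of the practical consequence: the loop variable is a copy of the element, so assigning to it does not touch the slice, while indexing does. The function names are hypothetical, and the example avoids taking the address of the loop variable so that it behaves the same on every Go version.

```go
package main

import "fmt"

// doubleViaCopy assigns to the loop variable, which is only a copy of
// each element, so the slice is left unchanged.
func doubleViaCopy(nums []int) {
	for _, n := range nums {
		n *= 2
		_ = n // the doubled copy is discarded
	}
}

// doubleInPlace writes through the index, so the slice elements change.
func doubleInPlace(nums []int) {
	for i := range nums {
		nums[i] *= 2
	}
}

// elementPointers returns the address of every element of nums.
// Each address is distinct because each element has its own storage.
func elementPointers(nums []int) []*int {
	ptrs := make([]*int, 0, len(nums))
	for i := range nums {
		ptrs = append(ptrs, &nums[i])
	}
	return ptrs
}

func main() {
	nums := []int{1, 2}

	doubleViaCopy(nums)
	fmt.Println(nums) // [1 2]

	doubleInPlace(nums)
	fmt.Println(nums) // [2 4]

	p := elementPointers(nums)
	fmt.Println(p[0] != p[1]) // true
}
```

When a pointer to each element is needed, taking &nums[i] is the usual way to get one.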

When to use fixed value protobuf type? Or under what scenarios?

I want to transfer a serialized protobuf message over TCP and I've tried to use the first field to indicate the total length of the serialized message.
I know that the encoded length of an int32 varies with its value. So maybe a fixed32 is a good choice.
But at the end of the Encoding chapter, I found that I can't depend on it even if I use a fixed32 with field number 1, because the Field Order section says that the order may change.
My question is when do I use fixed value types? Are there any example scenarios?
"My question is when do I use fixed value types?"
When it comes to serializing values, there's always a tradeoff. If we look at the Protobuf-documentation, we see we have a few options when it comes to 32-bit integers:
int32: Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead.
uint32: Uses variable-length encoding.
sint32: Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.
fixed32: Always four bytes. More efficient than uint32 if values are often greater than 2^28.
sfixed32: Always four bytes.
int32 is a variable-length data-type. Any information that is not specified in the type itself, needs to be expressed somehow. To deserialize a variable-length number, we need to know what the length is. That is contained in the serialized message as well, which requires additional storage space. The same goes for an optional negative sign. The resulting message may be smaller because of this, but may be larger as well.
Say we have a lot of integers between 0 and 255 to encode. It would be cheaper to send this information as two bytes (one byte with the actual value, and one byte to indicate that we have just one byte) than to send a full 32-bit (4-byte) integer [fictional values, actual implementation may differ]. On the other hand, if we want to serialize a large value that can only fit in 4 bytes, the result may be larger (4 bytes plus an additional byte to indicate that the value is 4 bytes, for a total of 5 bytes). In this case it will be more efficient to use a fixed32. We simply know a fixed32 is 4 bytes; we don't need to serialize the fact that it is a 4-byte number.
And if we look at fixed32 it actually mentions that the tradeoff point is around 2^28 (for unsigned integers).
So some types are good [as in, more efficient in terms of storage space] for large values, some for small values, some for positive/negative values. It all depends on what the actual values represent.
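The 2^28 tradeoff point can be checked with a short sketch. It uses Go's encoding/binary package, whose varint functions are documented as following the Protocol Buffers encoding; the helper name varintLen is made up here. The count covers only the value itself, not the field's tag byte, which both uint32 and fixed32 fields carry.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// varintLen reports how many bytes v occupies when written as a
// base-128 varint (7 payload bits per byte).
// (Hypothetical helper, named for this example only.)
func varintLen(v uint32) int {
	buf := make([]byte, binary.MaxVarintLen32)
	return binary.PutUvarint(buf, uint64(v))
}

func main() {
	values := []uint32{0, 127, 128, 1<<28 - 1, 1 << 28, 1<<32 - 1}
	for _, v := range values {
		fmt.Printf("%10d: varint %d bytes, fixed32 4 bytes\n", v, varintLen(v))
	}
}
```

Values below 2^28 need at most 4 varint bytes, so the varint is never worse than fixed32 there; from 2^28 upward the varint needs 5 bytes and fixed32 is one byte smaller.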
"Are there any example scenarios?"
32-bit hashes (e.g. CRC-32), IPv4 addresses/masks. Predictable message sizes could also be relevant.

C++11 Strange notation [0:size())

Am I to understand from
Stroustrup C++ Programming Language - Invariants
that the notation above is a range initializer or is this interpretive instruction to convey mathematically that the Vector class array range is between 0 and some predetermined size?
Should I even be using this book because it contains errors such as accessing a struct member from a variable of that struct using . instead of ->?
It's a half-closed interval. He's saying the index to a vector must be in the range of 0 up to but not including the vector's size. So 0 would be a valid index (assuming the vector is not empty), but size() would not. This is not a code example.
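The interval can also be written as a bounds check. This is only an illustration of the notation, written in Go rather than C++, and the function name inRange is invented for it.

```go
package main

import "fmt"

// inRange reports whether i lies in the half-open interval [0:size),
// that is, 0 is included and size itself is excluded.
func inRange(i, size int) bool {
	return 0 <= i && i < size
}

func main() {
	v := []int{10, 20, 30}

	fmt.Println(inRange(0, len(v)))        // true: first valid index
	fmt.Println(inRange(len(v)-1, len(v))) // true: last valid index
	fmt.Println(inRange(len(v), len(v)))   // false: one past the end
}
```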

What size Integer is guaranteed to be 4 bytes?

How big does an Integer have to be in Java to definitely be 4 bytes long when converted into a byte[] using ByteBuffer.allocate(int_value).array()?
I ask this because I use Integers for Entity Ids in a game I'm working on, and it's much cheaper to generate 4-byte Ids as opposed to filling each byte[] with bytes that hold the value 0x00.
As far as I understand you, you're making wrong assumptions here. There is no conversion/truncation/expansion done with allocate() nor array() - you just allocate int_value amount of bytes, and get bytes[int_value]-sized array from array() call if array() is supported at all. https://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html#allocate%28int%29
To make the array 4 bytes long, simply use ByteBuffer.allocate(4), that's all. Then, if you want, use putInt(somevalue), and you get a 4-byte buffer filled with the given int, because that's the size of a Java int (32 bits, as per https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html), regardless of its value.
Note: you're probably approaching this from a wrong angle, btw. It's best to use big buffers, giving you continuous memory regions, and simply segment them based on some metric, e.g. for 4 byte (int) cells, allocate 4*totalInts and then, e.g. get(4*i) etc, or use bulk getting.
An integer (the primitive int) in Java is always 4 bytes long because the type isn't dynamic. See Primitive Data Types.
However, if your purpose is just to create an empty byte array, then just create it. There is no need to fill it with zeros, because in Java the default value for bytes is 0.
If you want to ensure, that the byte array has the length 4, you could use
ByteBuffer.allocate(4).putInt(int_value).array();

What are Go arrays indexed by?

I am writing some 'memory allocator' type code, using an array and indexes rather than pointers. I'm hoping that the size of an array index is smaller than a pointer. I care because I am storing 'pointers' as integer indexes into an array rather than as 64-bit pointers.
I can't see anything in the Go spec that says what an array is indexed by. Obviously it's some kind of integer. Passing very large values makes the runtime complain that I can't pass negative numbers, so I'm guessing that it's somehow cast to a signed integer. So is it an int32? I'm guessing it's not an int64 because I didn't touch the top bit (which would have indicated a negative number in two's complement).
Arrays may be indexed by any integer type.
The Array types section of the Go Programming Language Specification says that in an array type definition,
The length is part of the array's type and must be a constant
expression that evaluates to a non-negative integer value.
In an index expression such as a[x]:
x must be an integer value and 0 <= x < len(a)
But there is a limitation on the magnitude of an index; the description of Length and capacity says:
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
So the declared size of an array, or the index in an index expression, can be of any integer type (int, uint, uintptr, int8, int16, int32, int64, uint8, uint16, uint32, uint64), but it must be non-negative and within the range of type int (which is the same size as either int32 or int64 -- though it's a distinct type from either).
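A minimal sketch of what this means for the allocator use case in the question: indexes can be stored as a narrower integer type and used directly in index expressions. The node type, its fields, and valueAt are hypothetical names chosen for this example.

```go
package main

import "fmt"

// node links to its successor by a 32-bit index into a pool slice
// instead of by a pointer. (Hypothetical type for illustration.)
type node struct {
	value int
	next  uint32
}

// valueAt indexes the pool with a uint32. Any integer type is accepted
// in an index expression, as long as the value is within range.
func valueAt(pool []node, i uint32) int {
	return pool[i].value
}

func main() {
	pool := []node{{value: 10, next: 1}, {value: 20, next: 0}}

	var i8 uint8 = 1
	var i64 int64 = 0
	fmt.Println(pool[i8].value, pool[i64].value) // 20 10
	fmt.Println(valueAt(pool, pool[0].next))     // 20
	fmt.Printf("%T\n", len(pool))                // int
}
```

With uint32 indexes the pool is limited to 2^32 entries, which is the price of the smaller links.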
It's a very interesting question indeed. I have not found any direct rules in the documentation either; instead I've found two great discussions in Groups.
In the first one, among many things, I've found an answer why indexes are implemented as int - but not uint:
Algorithms can benefit from the ability to express negative offsets
and such. If indexes were unsigned you'd always need a conversion in
these cases.
The second one specifically talks about possibility (but possibility only!) of using int64 for large arrays, mentioning limitations of len and cap functions (which limitations are actually mentioned in the doc):
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
I do agree, though, that more... official point of view wouldn't hurt. )
Arrays and slices are indexed by ints. An int is defined as being a 32- or 64-bit signed integer. At the time of writing, the most common implementation (6g) used 32-bit integers regardless of the architecture, with a plan for int to eventually be 64 bits on 64-bit machines and therefore the same size as a pointer. That change has since been made: as of Go 1.1, int is 64 bits on 64-bit platforms.
The language spec defines 3 implementation dependent numeric types:
uint either 32 or 64 bits
int same size as uint
uintptr an unsigned integer large enough to store the uninterpreted bits of a pointer value
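The sizes of these three types on a given platform can be inspected with unsafe.Sizeof. This is a small sketch; the function name sizes is made up, and the printed numbers depend on the platform the program is compiled for.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sizes returns the size in bytes of int, uint and uintptr on the
// platform the program was compiled for.
// (Hypothetical helper, named for this example only.)
func sizes() (intSize, uintSize, ptrSize uintptr) {
	return unsafe.Sizeof(int(0)), unsafe.Sizeof(uint(0)), unsafe.Sizeof(uintptr(0))
}

func main() {
	i, u, p := sizes()
	fmt.Printf("int: %d bytes, uint: %d bytes, uintptr: %d bytes\n", i, u, p)
	fmt.Println("int and uint have the same size:", i == u)
}
```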
