The official documentation says uint64 is an unsigned 64-bit integer. Does that mean any uint64 number takes 8 bytes of storage, no matter how small or how large it is?
Edit:
Thanks for everyone's answers!
I raised the doubt when I noticed that binary.PutUvarint consumes up to 10 bytes to store a large uint64, even though the maximum uint64 should only take 8 bytes.
I then found the answer to my doubt in the source code of the Go standard library:
Design note:
// At most 10 bytes are needed for 64-bit values. The encoding could
// be more dense: a full 64-bit value needs an extra byte just to hold bit 63.
// Instead, the msb of the previous byte could be used to hold bit 63 since we
// know there can't be more than 64 bits. This is a trivial improvement and
// would reduce the maximum encoding length to 9 bytes. However, it breaks the
// invariant that the msb is always the "continuation bit" and thus makes the
// format incompatible with a varint encoding for larger numbers (say 128-bit).
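For anyone curious, here is a minimal check of that 10-byte worst case, using nothing beyond the standard library:

    package main

    import (
        "encoding/binary"
        "fmt"
        "math"
    )

    func main() {
        // binary.MaxVarintLen64 == 10: the worst case described in the note above.
        buf := make([]byte, binary.MaxVarintLen64)

        fmt.Println(binary.PutUvarint(buf, math.MaxUint64)) // 10 bytes for the largest value
        fmt.Println(binary.PutUvarint(buf, 1))              // 1 byte for a small value
    }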
According to http://golang.org/ref/spec#Size_and_alignment_guarantees:
type                                  size in bytes

byte, uint8, int8                     1
uint16, int16                         2
uint32, int32, float32                4
uint64, int64, float64, complex64     8
complex128                           16
So, yes, uint64 will always take 8 bytes.
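As a quick sanity check, unsafe.Sizeof reports a value's size in bytes based purely on its type, so it gives 8 for any uint64, however small or large:

    package main

    import (
        "fmt"
        "unsafe"
    )

    func main() {
        var small uint64 = 1
        var large uint64 = 18446744073709551615 // maximum uint64

        // Sizeof depends only on the type, never on the stored value.
        fmt.Println(unsafe.Sizeof(small)) // 8
        fmt.Println(unsafe.Sizeof(large)) // 8
    }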
Simply put: yes, a 64-bit fixed-size integer type will always take 8 bytes. It would be an unusual language where that isn't the case.
There are languages/platforms which support variable-length numeric types where the storage in memory does depend on the value, but you wouldn't then specify the number of bits in the type in such a simple way, as that can vary.
The Go Programming Language Specification
Numeric types
A numeric type represents sets of integer or floating-point values.
The predeclared architecture-independent numeric types are:
uint64 the set of all unsigned 64-bit integers (0 to 18446744073709551615)
Yes, exactly 64 bits or 8 bytes.
Just remember the simple rule: a fixed-size type occupies exactly the memory space its name states, and 8 bits = 1 byte.
Therefore 64 bits = 8 bytes.
Related
I want to transfer a serialized protobuf message over TCP and I've tried to use the first field to indicate the total length of the serialized message.
I know that an int32 will change its length after encoding. So maybe a fixed32 is a good choice.
But at the end of the Encoding chapter, I found that I can't depend on it even if I use a fixed32 with field number 1, because the Field Order section says that the order may change.
My question is when do I use fixed value types? Are there any example scenarios?
"My question is when do I use fixed value types?"
When it comes to serializing values, there's always a tradeoff. If we look at the Protobuf documentation, we see we have a few options when it comes to 32-bit integers:
int32: Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead.
uint32: Uses variable-length encoding.
sint32: Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.
fixed32: Always four bytes. More efficient than uint32 if values are often greater than 2^28.
sfixed32: Always four bytes.
int32 is a variable-length data type. Any information that is not specified in the type itself needs to be expressed somehow. To deserialize a variable-length number, we need to know its length; that is contained in the serialized message as well, which requires additional storage space. The same goes for an optional negative sign. The resulting message may be smaller because of this, but may be larger as well.
Say we have a lot of integers between 0 and 255 to encode. It would be cheaper to send this information as two bytes (one byte with the actual value, and one byte to indicate that we have just one byte) than to send a full 32-bit (4-byte) integer [fictional values; the actual implementation differs]. On the other hand, if we want to serialize a large value that only just fits in 4 bytes, the result may be larger (4 bytes plus an additional byte to indicate that the value is 4 bytes long; a total of 5 bytes). In that case it is more efficient to use a fixed32: we simply know a fixed32 is 4 bytes, so we don't need to serialize that fact.
And if we look at fixed32, the documentation actually mentions that the tradeoff point is around 2^28 (for unsigned integers).
So some types are good [as in, more efficient in terms of storage space] for large values, some for small values, some for positive/negative values. It all depends on what the actual values represent.
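To make that tradeoff concrete, here is a small sketch using Go's encoding/binary package, whose uvarint format is the same base-128 varint scheme Protobuf uses:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    func main() {
        buf := make([]byte, binary.MaxVarintLen64)

        // A small value: the varint takes 1 byte, beating fixed32's constant 4 bytes.
        fmt.Println(binary.PutUvarint(buf, 100)) // 1

        // Just past the 2^28 tradeoff point: the varint takes 5 bytes, losing to fixed32.
        fmt.Println(binary.PutUvarint(buf, 1<<28)) // 5
    }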
"Are there any example scenarios?"
32-bit hashes (e.g. CRC-32) and IPv4 addresses/masks. Predictable message sizes can also be relevant.
There are int, int32, and int64 in Golang.
int32 has 32 bits,
int64 has 64 bits,
int has 32 or 64 or some other number of bits, depending on the environment.
I think int32 and int64 would be totally enough for a program.
I don't know why the int type should exist; won't it make the behavior of our code harder to predict?
Also, in C++ the int and long types have unspecified lengths. I think that makes our programs fragile. I'm quite confused.
Usually each platform operates best with an integral type of its native size.
By using plain int you tell your compiler that you don't really care which bit width is used, and you let it choose the one it will work fastest with. Note that you always want to write your code so that it is as platform-independent as possible...
On the other hand, the int32 / int64 types are useful if you need the integer to be of a specific size. This might be useful, e.g., if you want to save binary files (don't forget about endianness), or if you have a large array of integers (whose values fit in 32 bits), where saving half the memory would be significant, etc.
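For instance, a minimal sketch of writing a fixed-size integer with an explicit byte order, so the binary output is identical on every platform:

    package main

    import (
        "bytes"
        "encoding/binary"
        "fmt"
    )

    func main() {
        var buf bytes.Buffer

        // int32 plus an explicit byte order gives a platform-independent layout;
        // binary.Write rejects plain int precisely because its size is not fixed.
        if err := binary.Write(&buf, binary.LittleEndian, int32(42)); err != nil {
            panic(err)
        }
        fmt.Println(buf.Len(), buf.Bytes()) // 4 [42 0 0 0]
    }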
Usually the size of int is equal to the natural word size of the target. So if your program doesn't care about the size of int (the minimal int range is enough), it can perform best on a variety of compilers.
When you need a specific size, you can of course use int32 etc.
In versions of Go up to 1.0, int was just a synonym for int32, a 32-bit integer. Since int is used for indexing slices, this prevented slices from having more than about 2 billion elements.
In Go 1.1, int was made 64 bits wide on 64-bit platforms, and therefore large enough to index any slice that fits in main memory. Therefore:
int32 is the type of 32-bit integers;
int64 is the type of 64-bit integers;
int is the smallest integer type that can index all possible slices.
In practice, int is large enough for most practical uses. Using int64 is only necessary when manipulating values that are larger than the largest possible slice index, while int32 is useful in order to save memory and reduce memory traffic when the larger range is not necessary.
The root cause of this is array addressability. If you came into a situation where you needed to call make([]byte, 5e9), your 32-bit executable would be unable to comply, while your 64-bit executable could continue to run. Addressing an array with int64 on a 32-bit build is wasteful; addressing an array with int32 on a 64-bit build is insufficient. Using int, you can address an array up to its maximum allocation size on both architectures without having to code a distinction using int32/int64.
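You can see what a given build uses via strconv.IntSize (the width of int in bits) or unsafe.Sizeof:

    package main

    import (
        "fmt"
        "strconv"
        "unsafe"
    )

    func main() {
        // strconv.IntSize is 32 or 64, depending on the platform the
        // program was compiled for; unsafe.Sizeof reports the byte count.
        fmt.Println(strconv.IntSize)       // 64 on a 64-bit build, 32 on a 32-bit one
        fmt.Println(unsafe.Sizeof(int(0))) // 8 or 4, correspondingly
    }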
I can't find out whether it is possible to have a char / byte type in proto.
I can see various types here:
https://developers.google.com/protocol-buffers/docs/proto
https://developers.google.com/protocol-buffers/docs/encoding
but I can't find a byte type or even int16 types there.
No, there is no fixed 1-byte type. Fixed-length encodings have 4- and 8-byte variants only. Most other numeric values are encoded as "varint"s, whose length varies with magnitude (and sign, but "zigzag" comes into play there). So you can store bytes with values 0-127 in one byte, and 128-255 in two bytes. 16-bit values will take between 1 and 3 bytes depending on magnitude (and sign/zigzag, etc.).
For multiples, there is "bytes" for the 8-bit version, and "packed" for the rest; this avoids the cost of a field header per value.
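Those byte counts are easy to verify with Go's encoding/binary package, which implements the same varint and zigzag schemes:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    func main() {
        buf := make([]byte, binary.MaxVarintLen64)

        fmt.Println(binary.PutUvarint(buf, 127))   // 1 byte: values 0-127
        fmt.Println(binary.PutUvarint(buf, 255))   // 2 bytes: values 128-16383
        fmt.Println(binary.PutUvarint(buf, 65535)) // 3 bytes for the 16-bit maximum

        // Signed varints are zigzag-encoded, so small negative values stay small.
        fmt.Println(binary.PutVarint(buf, -64)) // 1 byte
        fmt.Println(binary.PutVarint(buf, -65)) // 2 bytes
    }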
If I store an integer field in int32...will this use more space than int64?
From what I understand, a varint adjusts its size to the magnitude of the number being stored.
No, this only impacts the generated code. Any combination of [s|u]int{32|64} uses "varint" encoding, so the size is generally related to the magnitude, at least after noting the difference in negative numbers. In particular, a negative number that doesn't use sint* will be disproportionately large (10 bytes, IIRC), regardless of whether it is 32 or 64.
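A one-line demonstration of that 10-byte case: Protobuf encodes a non-sint negative value as its two's-complement uint64, and Go's uvarint shows the resulting length:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    func main() {
        buf := make([]byte, binary.MaxVarintLen64)

        // -1 reinterpreted as uint64 is the maximum value, so the varint
        // needs the full 10 bytes regardless of the declared 32/64 width.
        fmt.Println(binary.PutUvarint(buf, uint64(int64(-1)))) // 10
    }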
I'm having an overflow error in VB 6.0 when using the Long datatype because of really big values. How do I overcome this? Is there any datatype available that is larger than Long?
Depending on how big your really big values are, the VB6 Currency data type might be a good choice.
It supports values in the range -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
You could use a Double instead of a Long, since it can hold larger numbers. The conversion function is CDbl() instead of CLng().
In VB 6.0, a Long is 32 bits and can hold values up to 2,147,483,647.
A Double is 64 bits and can hold values up to 1.79769313486231570E+308.
EDIT: Please refer to this reference
I believe the upcoming VB in MSVS2010 has the CLonger (64 bits), CEvenLongerYet (128 bits) and CTooDamnLongForSensibleUse (256 bits) data types.
</humor>
Here are some options from the VB6 reference manual topic on data types:

Long (long integer): 4 bytes. Range -2,147,483,648 to 2,147,483,647.

Single (single-precision floating-point): 4 bytes. Range -3.402823E38 to -1.401298E-45 for negative values; 1.401298E-45 to 3.402823E38 for positive values. About 6 or 7 significant figures of accuracy.

Double (double-precision floating-point): 8 bytes. Range -1.79769313486231E308 to -4.94065645841247E-324 for negative values; 4.94065645841247E-324 to 1.79769313486232E308 for positive values. About 15 or 16 significant figures of accuracy.

Currency (scaled integer): 8 bytes. Range -922,337,203,685,477.5808 to 922,337,203,685,477.5807.

Decimal: 14 bytes. Range +/-79,228,162,514,264,337,593,543,950,335 with no decimal point; +/-7.9228162514264337593543950335 with 28 places to the right of the decimal; the smallest non-zero number is +/-0.0000000000000000000000000001.
Try to avoid division by zero. In VB 6.0, 0/0 raises an overflow error (rather than a division-by-zero error), so if both the numerator and the denominator in your code can be zero, guard the division, for example by substituting 1 for a zero denominator:
0/0 = overflow; 0/1 = 0 (no overflow)