Why does an empty slice have 24 bytes? - go

I want to understand what happens when an empty slice is created with make([]int, 0). I wrote this code to test:
emptySlice := make([]int, 0)
fmt.Println(len(emptySlice))
fmt.Println(cap(emptySlice))
fmt.Println(unsafe.Sizeof(emptySlice))
The length and capacity returned are obviously both 0, but the size of the slice is 24 bytes. Why?
24 bytes would be 3 int64 values, right? An internal array of 24 bytes would be something like [3]int{}, so why does an empty slice occupy 24 bytes?

If you read the documentation for unsafe.Sizeof, it explains what's going on here:
The size does not include any memory possibly referenced by x. For instance, if x is a slice, Sizeof returns the size of the slice descriptor, not the size of the memory referenced by the slice.
All data types in Go are statically sized. Even though slices have a dynamic number of elements, that cannot be reflected in the size of the data type, because then it would not be static.
The "slice descriptor", as implied by the name, is all the data that describes a slice. That is what is actually stored in a slice variable.
Slices in Go have 3 attributes: The underlying array (memory address), the length of the slice (memory offset), and the capacity of the slice (memory offset). In a 64-bit application, memory addresses and offsets tend to be stored in 64-bit (8-byte) values. That is why you see a size of 24 (= 3 * 8 ) bytes.

unsafe.Sizeof is the size of the object in memory, exactly the same as sizeof in C and C++. See How to get memory size of variable?
A slice has a size, but also the ability to resize, so the maximum resizing capacity must be stored somewhere. Being resizable also means it can't be a static array; it needs to store a pointer to some other (possibly dynamically allocated) array.
The whole thing means it needs to store { begin, end, last valid index } or { begin, size, capacity }. That's a tuple of 3 values, so its in-memory representation is at least 3 × 8 bytes on 64-bit platforms, unless you want to limit the maximum size and capacity to much smaller than 2^64 bytes.
It's exactly the same situation for C++ types with the same dynamic-sizing capability: std::string and std::vector are also typically 24-byte types, although on some implementations 8 bytes of padding are added for alignment reasons, resulting in a 32-byte string type. See:
C++ sizeof Vector is 24?
Why is sizeof array(type string) 24 bytes with a single space element?
Why is sizeof(string) == 32?
In fact, Go's strings.Builder, which is the closest equivalent to C++'s std::string, has a size of 32 bytes. See demo.

Related

Accessing more Window Extra Bytes than LongPointer

I'm trying to figure out how to correctly work with the Extra Bytes that you can have Windows allocate for your windows and window classes.
If I'm reading the docs correctly, you can tell Windows to allocate a specified amount of memory for your window or window class.
But there are only two methods I could find to access and modify that data: SetWindowLongPtr and GetWindowLongPtr.
The problem is, with those methods you can only set a LongPtr's worth of data at a time, so 64 or 32 bits depending on your system.
Can somebody explain this to me? Is there a method I am missing, or is this as it should be?
(Get|Set)WindowLong() accesses a value at the specified nIndex as a whole LONG.
(Get|Set)WindowLongPtr() accesses a value at the specified nIndex as a whole LONG_PTR.
So yes, this does mean that (Get|Set)WindowLongPtr() will access a different number of bytes depending on whether you are compiling your project for 32-bit or 64-bit. As such, if you want to read/write a smaller number of bytes, you will have to read/write a whole LONG/_PTR and do some bit shifting as needed.
Even though you can specify an arbitrary byte count for the WNDCLASS/EX::cbWndExtra field, in reality it needs to be large enough to hold at least sizeof(LONG/_PTR) number of bytes at the last byte offset you intend to specify in the nIndex parameter.
This is stated in the GetWindowLong()/SetWindowLong() and GetWindowLongPtr()/SetWindowLongPtr() documentation:
nIndex
Type: int
The zero-based offset to the value to be retrieved. Valid values are in the range zero through the number of bytes of extra window memory, minus four; for example, if you specified 12 or more bytes of extra memory, a value of 8 would be an index to the third 32-bit integer.
nIndex
Type: int
The zero-based offset to the value to be retrieved. Valid values are in the range zero through the number of bytes of extra window memory, minus the size of a LONG_PTR.
From the documentation for both Extra Class Memory and Extra Window Memory:
Because extra memory is allocated from the system's local heap, an application should use extra [class or window] memory sparingly. The RegisterClassEx function fails if the amount of extra [class or window] memory requested is greater than 40 bytes. If an application requires more than 40 bytes, it should allocate its own memory and store a pointer to the memory in the extra [class or window] memory.

nanopb oneof size requirements

I came across nanopb and want to use it in my project. I'm writing code for an embedded device, so memory limitations are a real concern.
My goal is to transfer data items from device to device; each data item has a 32-bit identifier and a value. The value can be anything from a single char to a float to a long string. I'm wondering what would be the most efficient way of declaring the messages for this type of problem.
I was thinking something like this:
message data_msg {
  message data_item {
    int32 id = 1;
    oneof value {
      int32 ival = 2;  // protobuf has no int8; varint encoding keeps small values small on the wire
      float fval = 3;
      string sval = 4;
    }
  }
  repeated data_item items = 1;
}
But as I understand it, this is converted into a C union, which is the size of the largest member. Say I limit the string to 50 chars; then the union is always 50 bytes long, even if I only need 4 bytes for a float.
Have I understood this correctly or is there some other way to accomplish this?
Thanks!
Your understanding is correct: the structure size in C will equal the size of the largest member of the oneof. However, this is only the size in memory; the message size after serialization will be the minimum needed for the contents.
If the size in memory is an issue, there are several options available. The default of allocating maximum possibly needed size makes memory management easy. If you want to dynamically allocate only the needed amount of memory, you'll have to decide how you want to do it:
Using PB_ENABLE_MALLOC compilation option, you can use the FT_POINTER field type for the string and other large fields. The memory will then be allocated using malloc() from the system heap.
With FT_CALLBACK, instead of allocating any memory, you'll get a callback where you can read out the string and process or store it any way you want. For example, if you wanted to write the string to SD card, you could do so without ever storing it completely in memory.
In overall system design, static allocation for maximum size required is often the easiest to test for. If the data fits once, it will always fit. If you go for dynamic allocation, you'll need to analyze more carefully what is the maximum possible memory usage needed.

Memory Pool Returned Memory Alignment

In my application I have a memory pool. I allocate all memory at startup in the form of a uint64 array (64-bit machine), then construct objects in this array using placement new. So object 1 starts at position pool[0], object 2 starts at position pool[1], and so on, since each object spans at least 64 bits, or a multiple of sizeof(uint64) if it needs more slots.
Am I correct to assume that all memory returned from the pool will be aligned correctly, since each uint64 in the array will be properly aligned by the compiler? If so, does using uint32 on a 32-bit machine in the same manner work?
You are correct to assume that there will not be any padding. Compilers mostly pack on 2-byte or 4-byte boundaries (and this can be controlled).
You should verify this on your specific target using __alignof__.
The keyword __alignof__ allows you to inquire about how an object is aligned, or the minimum alignment usually required by a type. Its syntax is just like sizeof.
However, an 8-byte allocation might not be 64-bit aligned if the array itself was allocated starting at an address with only 32-bit alignment.
You could use aligned_alloc(8, size) to allocate the memory, then cast it to an array of uint64.

What postscript object is 8-bytes normally, but 9-bytes packed?

A packed array in PostScript is supposed to be a space-saving feature, where objects can be squeezed tightly in memory by omitting extraneous information. A null, for instance, can be just a single byte because it carries no information. Booleans could be a single byte, too. Integers could be 5 (or 3) bytes, if the number is small. Reference objects would need the full 8 bytes that a normal object does. But the PostScript manual says that packed objects occupy 1-9 bytes!
When the PostScript language scanner encounters a procedure delimited by
{ … }, it creates either an array or a packed array, according to the current packing
mode (see the description of the setpacking operator in Chapter 8). An
array value occupies 8 bytes per element. A packed array value occupies 1 to 9
bytes per element, depending on each element’s type and value; a typical average
is 2.5 bytes per element. --PLRM 3ed, B.2. Virtual Memory Use, p. 742
So what object gets bigger when packed? And Why? Hydrogen-bonding??!
Any value that you can't represent in 7 bytes or fewer will need 9 bytes.
The packed format starts with a byte that says how many data bytes follow, so any value that needs all 8 bytes of data will take 9 bytes including the leading length byte.
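The arithmetic can be sketched as follows. This is only an illustration of the length-prefix idea described above, not the actual PostScript packed encoding, which is implementation-defined:

```go
package main

import "fmt"

// packedSize illustrates the idea: one leading byte records how many
// data bytes follow, so the total is 1 plus the minimal number of
// bytes needed to hold the value.
func packedSize(v uint64) int {
	n := 1 // at least one data byte
	for v > 0xFF {
		v >>= 8
		n++
	}
	return 1 + n // data bytes plus the leading length byte
}

func main() {
	fmt.Println(packedSize(0x7F))               // 2: 1 length byte + 1 data byte
	fmt.Println(packedSize(300))                // 3: 1 length byte + 2 data bytes
	fmt.Println(packedSize(0xFFFFFFFFFFFFFFFF)) // 9: 1 length byte + all 8 data bytes
}
```

So a value that already needs the full 8 bytes of data grows to 9 bytes once the length prefix is added.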

Behind Windows x64's 44-bit virtual memory address limit

http://www.alex-ionescu.com/?p=50.
I read the above post. The author explains why Windows x64 supports only a 44-bit virtual memory address space, using a singly linked list as an example.
struct { // 8-byte header
ULONGLONG Depth:16;
ULONGLONG Sequence:9;
ULONGLONG NextEntry:39;
} Header8;
The first sacrifice to make was to reduce the space for the sequence
number to 9 bits instead of 16 bits, reducing the maximum sequence
number the list could achieve. This still only left 39 bits for the
pointer — a mediocre improvement over 32 bits. By forcing the
structure to be 16-byte aligned when allocated, 4 more bits could be
won, since the bottom bits could now always be assumed to be 0.
I can't understand this. What does "By forcing the structure to be 16-byte aligned when allocated, 4 more bits could be won, since the bottom bits could now always be assumed to be 0" mean?
16 is 0010000 in binary
32 is 0100000 in binary
64 is 1000000 in binary
etc
You can see that for all numbers that are multiples of 16, the last four bits are always zero.
So, instead of storing these bits you can leave them out and add them back in when it's time to use the pointer.
For a 2^N-byte aligned pointer, its address is always divisible by 2^N - which means that lower N bits are always zero. You can store additional information in them:
encode ptr payload = ptr | payload
decode_ptr data = data & ~mask
decode_payload data = data & mask
where mask is (1 << N) - 1 - i.e. a number with low N bits set.
This trick is often used to save space in low-level code (payload can be GC flags, type tag, etc.)
In effect you're not storing a pointer, but a number from which a pointer can be extracted. Of course, care should be taken to not dereference the number as a pointer without decoding.
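The encode/decode pseudocode above can be written out concretely. A minimal sketch in Go, assuming a 16-byte-aligned address (the address value here is made up for illustration):

```go
package main

import "fmt"

const alignBits = 4               // 16-byte alignment: low 4 bits are always zero
const mask = (1 << alignBits) - 1 // 0xF, the bits available for the payload

// encode packs a small payload into the unused low bits of an
// aligned address; the decode functions recover the two halves.
func encode(addr, payload uint64) uint64 { return addr | payload }

func decodeAddr(data uint64) uint64 { return data &^ mask } // clear the low bits

func decodePayload(data uint64) uint64 { return data & mask } // keep only the low bits

func main() {
	addr := uint64(0x7FFFFFF0) // divisible by 16, so its low 4 bits are 0
	tagged := encode(addr, 0xA)
	fmt.Printf("%#x %#x\n", decodeAddr(tagged), decodePayload(tagged)) // 0x7ffffff0 0xa
}
```

The same masking is what the Windows kernel structure relies on: because the allocator guarantees 16-byte alignment, those 4 zero bits never need to be stored, and the 39 stored bits effectively address 39 + 4 = 43 bits of aligned memory.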
