What is the size of a DIBSECTION? - winapi

I am looking to find the size of a device-independent bitmap structure for use with GetObject in the Windows API. I have an HBITMAP. GetObject says that to get information about it, I can pass a buffer with either the size of a BITMAP structure or the size of a DIBSECTION. I don't know what the exact sizes of the BITMAP and DIBSECTION structs are; can anyone tell me what they are on both 32-bit and 64-bit systems?

You don't need to do any math manually. Simply declare a DIBSECTION variable and then pass a pointer to it to GetObject(), using sizeof on that variable as the size, e.g.:
DIBSECTION dib;
GetObject(hBitmap, sizeof(dib), &dib);
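As a rough sketch of how you can use the return value (the helper name DescribeBitmap is just mine for illustration, and hBitmap is assumed to come from elsewhere): GetObject() returns the number of bytes it wrote, so comparing it with sizeof(DIBSECTION) and sizeof(BITMAP) tells you whether the handle was a DIB section or a plain device-dependent bitmap.
#include <windows.h>

// Sketch: query a bitmap handle; hBitmap is assumed to be valid.
void DescribeBitmap(HBITMAP hBitmap)
{
    DIBSECTION dib = {};
    int written = GetObject(hBitmap, sizeof(dib), &dib);

    if (written == sizeof(DIBSECTION)) {
        // hBitmap is a DIB section: every member of dib is filled in.
    } else if (written == sizeof(BITMAP)) {
        // hBitmap is a device-dependent bitmap: only dib.dsBm is filled in.
    } else {
        // 0 means the call failed.
    }
}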

I took out a piece of paper and added up everything myself.
A DIBSECTION contains 5 parts.
typedef struct tagDIBSECTION {
    BITMAP           dsBm;
    BITMAPINFOHEADER dsBmih;
    DWORD            dsBitfields[3];
    HANDLE           dshSection;
    DWORD            dsOffset;
} DIBSECTION, *LPDIBSECTION, *PDIBSECTION;
So let's start with BITMAP.
typedef struct tagBITMAP {
    LONG   bmType;
    LONG   bmWidth;
    LONG   bmHeight;
    LONG   bmWidthBytes;
    WORD   bmPlanes;
    WORD   bmBitsPixel;
    LPVOID bmBits;
} BITMAP, *PBITMAP, *NPBITMAP, *LPBITMAP;
A LONG is a 32-bit signed integer, so 4 bytes. A WORD is an unsigned short, which is 2 bytes. And LPVOID is a pointer: 4 bytes on a 32-bit system, 8 bytes on a 64-bit one.
4+4+4+4+2+2 = 20. But wait, each member has to be aligned properly. On a 64-bit system the pointer needs 8-byte alignment, so its offset must be divisible by 8. 20 is not, so we add 4 bytes of padding to get 24. Adding the 8-byte pointer gives us 32.
The size of the BITMAPINFOHEADER is 40 bytes. It's divisible by 8, so nothing fancy needed. We're at 72 now.
Back to the DIBSECTION. There's an array of three DWORDs, and each DWORD is an unsigned int (4 bytes). Adding 12 to 72 gives us 84.
Now there's a handle. A handle is basically a pointer, whose size is 4 or 8 bytes depending on 32-bit or 64-bit. Time to check whether 84 is divisible by 8. It's not, so we add 4 bytes of padding to get 88. Then add the 8-byte pointer to get 96.
Finally there's the last DWORD and the total reaches 100 on a 64-bit system.
But what about sizeof()? Can't you just do sizeof(DIBSECTION)? After all, magic numbers are bad. Ken White said in the comments that I didn't need to do any math. I disagree with this. First, as a programmer, it's essential to understand what is happening and why; nothing could be more elementary than memory on a computer. Second, I only tagged the post as winapi. If you scroll down on the GetObject page, the function is exported from Gdi32.dll. Any Windows program has access to Gdi32.dll; not every Windows program has access to sizeof(). Third, it may be important for people who need to know the math to have the steps shown. Not everyone programs in a high-level language. It might even be a question on an exam.
Perhaps the real question is whether a struct whose members add up to 100 bytes gets padded to 104 when memory is granted on a 64-bit system. (It does: the struct's overall alignment is 8, so trailing padding is added and sizeof(DIBSECTION) comes out to 104.)
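If you do have a compiler handy, here's a quick sketch to check the arithmetic above against what the headers actually produce (the expected values in the comments are for a 64-bit build):
#include <windows.h>
#include <cstddef>
#include <cstdio>

int main()
{
    std::printf("sizeof(BITMAP)           = %zu\n", sizeof(BITMAP));                    // 32
    std::printf("sizeof(BITMAPINFOHEADER) = %zu\n", sizeof(BITMAPINFOHEADER));          // 40
    std::printf("offsetof(dshSection)     = %zu\n", offsetof(DIBSECTION, dshSection));  // 88
    std::printf("offsetof(dsOffset)       = %zu\n", offsetof(DIBSECTION, dsOffset));    // 96
    std::printf("sizeof(DIBSECTION)       = %zu\n", sizeof(DIBSECTION));                // 104 (trailing padding)
}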

Related

Pointers to static variables must respect canonical form?

Assuming I have the following example:
#include <cstdint>

struct Dummy {
    uint64_t m{0llu};

    template < class T > static uint64_t UniqueID() noexcept {
        static const uint64_t uid = 0xBEA57;
        return reinterpret_cast< uint64_t >(&uid);
    }

    template < class T > static uint64_t BuildID() noexcept {
        static const uint64_t id = UniqueID< T >()
            // dummy bits for the sake of example (whole last byte is used)
            | (1llu << 60llu) | (1llu << 61llu) | (1llu << 63llu);
        return id;
    }

    // Copy bits 48 through 55 over to bits 56 through 63 to keep canonical form.
    uint64_t GetUID() const noexcept {
        return ((m & ~(0xFFllu << 56llu)) | ((m & (0xFFllu << 48llu)) << 8llu));
    }

    uint64_t GetPayload() const noexcept {
        return *reinterpret_cast< uint64_t * >(GetUID());
    }
};

template < class T > inline Dummy DummyID() noexcept {
    return Dummy{Dummy::BuildID< T >()};
}
Knowing very well that the resulting pointer is the address of a static variable in the program:
When I call GetUID(), do I need to make sure that bit 47 is repeated through bit 63?
Or can I just AND with a mask of the lower 48 bits and ignore this rule?
I was unable to find any information about this, and I assume that those 16 bits are likely to always be 0.
This example is strictly limited to the x86_64 architecture (x32).
In user-space code for mainstream x86-64 OSes, you can normally assume that the upper bits of any valid address are zero.
AFAIK, all the mainstream x86-64 OSes use a high-half kernel design where user-space addresses are always in the lower canonical range.
If you wanted this code to work in kernel code, too, you would want to sign-extend with x <<= 16; x >>= 16; using signed int64_t x.
If the compiler can't keep 0x0000FFFFFFFFFFFF = (1ULL<<48)-1 around in a register across multiple uses, 2 shifts might be more efficient anyway. (mov r64, imm64 to create that wide constant is a 10-byte instruction that can sometimes be slow to decode or fetch from the uop cache.) But if you're compiling with -march=haswell or newer, then BMI1 is available so the compiler can do mov eax, 48 / bzhi rsi, rdi, rax. Either way, though, one AND or BZHI is only 1 cycle of critical path latency for the pointer vs. 2 for 2 shifts. Unfortunately BZHI isn't available with an immediate operand. (x86 bitfield instructions mostly suck compared to ARM or PowerPC.)
Your current method of extracting bits [55:48] and using them to replace the current bits [63:56] is probably slower, because the compiler has to mask out the old high byte and then OR in the new high byte. That's already at least 2 cycles of latency, so you might as well just shift, or mask, which can be faster.
x86 has crap bitfield instructions so that was never a good plan. Unfortunately ISO C++ doesn't provide any guaranteed arithmetic right shift, but on all actual x86-64 compilers, >> on a signed integer is a 2's complement arithmetic shift. If you want to be really careful about avoiding UB, do the left shift on an unsigned type to avoid signed integer overflow.
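Here's a minimal sketch of the two variants (the function names are just mine for illustration): sign-extension from bit 47 with a pair of 16-bit shifts, and the simpler mask of the low 48 bits that's enough for user-space pointers.
#include <cstdint>

// Re-canonicalize by sign-extending from bit 47: works for user-space and
// kernel addresses. The left shift is done on the unsigned type to avoid
// signed-overflow UB; the right shift relies on >> being arithmetic for
// signed types, which is true on all actual x86-64 compilers.
inline uint64_t canonicalize_48(uint64_t tagged) noexcept {
    return static_cast<uint64_t>(static_cast<int64_t>(tagged << 16) >> 16);
}

// Zero-extend: just keep the low 48 bits. Enough for user-space addresses
// on mainstream x86-64 OSes, where the upper bits of a valid pointer are 0.
inline uint64_t zero_extend_48(uint64_t tagged) noexcept {
    return tagged & ((1ULL << 48) - 1);
}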
int64_t is guaranteed to be a 2's complement type with no padding if it exists.
I think int64_t is actually a better choice than intptr_t, because if you have 32-bit pointers, e.g. the Linux x32 ABI (32-bit pointers in x86-64 long mode), your code might still Just Work, and casting a uint64_t to a pointer type will simply discard the upper bits. So it doesn't matter what you did to them, and zero-extension first will hopefully optimize away.
So your uint64_t member would just end up storing a pointer in the low 32 and your tag bits in the high 32, somewhat inefficiently but still working. Maybe check sizeof(void*) in a template to select an implementation?
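A sketch of that sizeof(void*) check, using if constexpr inside a template so only the matching branch gets instantiated (this is my own illustration, not the only way to do it):
#include <cstdint>

template < class Ptr = void * >
inline Ptr RecoverPointer(uint64_t tagged) noexcept {
    if constexpr (sizeof(Ptr) == 8) {
        // 64-bit pointers: re-extend from bit 47 before converting back.
        return reinterpret_cast< Ptr >(static_cast<int64_t>(tagged << 16) >> 16);
    } else {
        // 32-bit pointers (e.g. the x32 ABI): going through uintptr_t just
        // discards the tag bits stored in the upper half of the uint64_t.
        return reinterpret_cast< Ptr >(static_cast<uintptr_t>(tagged));
    }
}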
Future proofing
x86-64 CPUs with 5-level page tables for 57-bit canonical addresses are probably coming at some point soonish, to allow use of large memory mapped non-volatile storage like Optane / 3DXPoint NVDIMMs.
Intel has already published a proposal for a PML5 extension https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf (see https://en.wikipedia.org/wiki/Intel_5-level_paging for a summary). There's already support for it in the Linux kernel so it's ready for the appearance of actual HW.
(I can't find out if it's expected in Ice Lake or not.)
See also Why in 64bit the virtual address are 4 bits short (48bit long) compared with the physical address (52 bit long)? for more about where the 48-bit virtual address limit comes from.
So you can still use the high 7 bits for tagged pointers and maintain compat with PML5.
If you assume user-space, then you can use the top 8 bits and zero-extend, because you're assuming the 57th bit (bit 56) = 0.
Redoing sign- (or zero-) extension of the low bits was already optimal; we're just changing it to a different width that only re-extends the bits we disturb. And we're disturbing few enough high bits that it should be future proof even on systems that enable PML5 mode and use wide virtual addresses.
On a system with 48-bit virtual addresses, broadcasting bit 56 to the upper 7 bits still works, because a canonical 48-bit address already has bit 56 equal to bit 47. And if you don't disturb those lower bits, they don't need to be re-written.
And BTW, your GetUID() returns an integer; it's not clear why you need that to return the static address.
It may also be cheaper for it to return &uid (just a RIP-relative LEA) than to load and re-canonicalize your m member value. Move static const uint64_t uid = 0xBEA57; to a static member variable instead of keeping it inside one member function.

Why does type int exist in Go

There are int, int32, and int64 in Go.
int32 has 32 bits,
int64 has 64 bits,
int has 32 or 64 or some other number of bits, depending on the environment.
I think int32 and int64 would be totally enough for a program.
I don't understand why the int type should exist; doesn't it make the behavior of our code harder to predict?
Also, in C++, the int and long types have unspecified sizes. I think that makes our programs fragile. I'm quite confused.
Usually each platform operates best with an integral type of its native size.
By using plain int you tell your compiler that you don't really care which bit width is used, and you let it choose the one it will work fastest with. Note that you always want to write your code so that it is as platform-independent as possible...
On the other hand, the int32 / int64 types are useful if you need the integer to be of a specific size. This might be useful, for example, if you want to write binary files (don't forget about endianness), or if you have a large array of integers (whose values fit in 32 bits), where saving half the memory would be significant, etc.
Usually the size of int is equal to the natural word size of the target. So if your program doesn't care about the size of int (the minimal int range is enough), it can perform best across a variety of compilers.
When you need a specific size, you can of course use int32 etc.
In versions of Go up to 1.0, int was just a synonym for int32, a 32-bit integer. Since int is used for indexing slices, this prevented slices from having more than about 2 billion elements.
In Go 1.1, int was made 64 bits long on 64-bit platforms, and therefore large enough to index any slice that fits in main memory. Therefore:
int32 is the type of 32-bit integers;
int64 is the type of 64-bit integers;
int is the smallest integer type that can index all possible slices.
In practice, int is large enough for most practical uses. Using int64 is only necessary when manipulating values that are larger than the largest possible slice index, while int32 is useful in order to save memory and reduce memory traffic when the larger range is not necessary.
The root cause for this is array addressability. If you came into a situation where you needed to call make([]byte, 5e9), your 32-bit executable would be unable to comply, while your 64-bit executable could continue to run. Addressing an array with int64 on a 32-bit build is wasteful; addressing an array with int32 on a 64-bit build is insufficient. Using int, you can address an array up to its maximum allocation size on both architectures without having to code a distinction using int32/int64.

What is dwLowDateTime and dwHighDateTime

I know they are members of the FILETIME struct, but what are the low-order and high-order parts of the file time?
Older compilers did not have support for 64 bit types. So the structure splits the 64 bit value into two 32 bit parts. The low part contains the least significant 32 bits. The high part contains the most significant 32 bits.
So if you have the two 32 bit parts, the corresponding 64 bit value is
low + 2^32 * high
The officially sanctioned way to get a 64-bit value from the two 32-bit parts is via the ULARGE_INTEGER union.
From the FILETIME documentation:
It is not recommended that you add and subtract values from the FILETIME structure to obtain relative times. Instead, you should copy the low- and high-order parts of the file time to a ULARGE_INTEGER structure, perform 64-bit arithmetic on the QuadPart member, and copy the LowPart and HighPart members into the FILETIME structure.
Do not cast a pointer to a FILETIME structure to either a ULARGE_INTEGER* or __int64* value because it can cause alignment faults on 64-bit Windows.
That is legacy stuff. The point was to represent a 64-bit value with a couple of 32-bit values. So afterwards you'll end up doing:
FILETIME ft;
// get time here
__int64 fileTime64;
memcpy( &fileTime64, &ft, sizeof( __int64 ) );
Or, as Microsoft wants you to do it:
FILETIME ft;
// get time here
ULARGE_INTEGER ul;
ul.LowPart = ft.dwLowDateTime;
ul.HighPart = ft.dwHighDateTime;
__int64 fileTime64 = ul.QuadPart;
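And for the reverse direction the documentation mentions (copying a 64-bit result back into the structure), it's the same thing the other way around (variable names here are just for illustration):
ULARGE_INTEGER ul2;
ul2.QuadPart = fileTime64;            // do the 64-bit arithmetic on QuadPart
FILETIME ftOut;
ftOut.dwLowDateTime  = ul2.LowPart;   // least significant 32 bits
ftOut.dwHighDateTime = ul2.HighPart;  // most significant 32 bits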

XChangeProperty for an atom property on a system where Atom is 64 bits

The X11 protocol defines an atom as a 32-bit integer, but on my system the Atom type in Xlib is a typedef for unsigned long, which is a 64-bit integer. The manual for Xlib says that property formats have a maximum size of 32 bits. There seems to be some conflict here. I can think of three possible solutions.
If Xlib treats properties of type XA_ATOM as a special case, then you can simply pass 32 for 'format' and an array of atoms for 'data'. This seems unclean and hackish, and I highly doubt that this is correct.
The manual for Xlib appears to be ancient. Since Atom is 64 bits long on my system, should I pass 64 for the 'format' parameter even though 64 is not listed as an allowed value?
Rather than an array of Atoms, should I pass an array of uint32_t values for the 'data' parameter? This seems like it would most likely be the correct solution to me, but this is not what they did in some sources I've looked up that use XChangeProperty, such as SDL.
SDL appears to use solution 1 when setting the _NET_WM_WINDOW_TYPE property, but I suspect that this may be a bug. On systems with little endian byte order (LSB first), this would appear to work if the property has only one element.
Has anyone else encountered this problem? Any help is appreciated.
For the property routines you always want to pass an array of 'long', 'short' or 'char'. This is always true independent of the actual bit width. So, even if your long or atom is 64 bits, it will be translated to 32 bits behind the scenes.
The format is the number of bits used on the server side, not the client side. So for format 8 you must pass a char array, for format 16 you always use a short array, and for format 32 you always use a long array. This is completely independent of the actual lengths of short or long on a given machine. 32-bit values such as Atom or Window always go in a 'long'.
This may seem odd, but it is for a good reason: the C standard does not guarantee that types exist with exactly the same widths as on the server (for instance, a machine might have no native 16-bit type). However, a short is guaranteed to have at least 16 bits and a long is guaranteed to have at least 32 bits. So by specifying the client API in terms of 'short' and 'long', you can write portable code and always have room for the full X ID in the C type.
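To make that concrete, here's a small sketch (the _NET_WM_WINDOW_TYPE atoms from the question are just used as an example): the data array is declared as Atom, i.e. unsigned long, while format stays 32, and Xlib truncates each element to 32 bits when it builds the request.
#include <X11/Xlib.h>
#include <X11/Xatom.h>

int main()
{
    Display *dpy = XOpenDisplay(nullptr);
    if (!dpy) return 1;
    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 100, 100, 0, 0, 0);

    Atom wmWindowType = XInternAtom(dpy, "_NET_WM_WINDOW_TYPE", False);
    Atom typeNormal   = XInternAtom(dpy, "_NET_WM_WINDOW_TYPE_NORMAL", False);

    // One Atom per element; each 'long' slot is sent as a 32-bit value.
    Atom data[1] = { typeNormal };
    XChangeProperty(dpy, win, wmWindowType, XA_ATOM, 32, PropModeReplace,
                    reinterpret_cast<unsigned char *>(data), 1);

    XFlush(dpy);
    XCloseDisplay(dpy);
    return 0;
}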

Pointer increment difference b/w 32-bit and 64-bit

I was trying to run some drivers coded for 32-bit Vista (x86) on 64-bit Win7 (amd64), and they were not working. After a lot of debugging and trial and error, I got them to work on the latter, but I don't know why it works. This is what I did:
In many places, buffer pointers pointed to an array of structures (different structures in different places), and to increment them, some places used this type of statement:
ptr = (PVOID)((PCHAR)ptr + offset);
And at some places:
ptr = (PVOID)((ULONG)ptr + offset);
The second one was returning garbage, so I changed them all to the first one. But I found many sample drivers on the net that follow the second one. My questions:
1. Where are these macros defined? (Google didn't help much.)
2. I understand all the P_ macros are pointers; why was a pointer cast to ULONG? How does this work on 32-bit?
3. PCHAR obviously changes its width according to the environment. Do you know any place to find documentation for this?
They should be defined in WinNT.h (they are in the SDK; I don't have the DDK at hand).
ULONG is unsigned long; on a 32-bit system, this is the size of a pointer, so a pointer can be converted back and forth to ULONG without loss. Not so on a 64-bit system, where casting the value will truncate it. People cast to ULONG to get byte-based pointer arithmetic (even though this has undefined behavior, as you found out).
Pointer arithmetic always works in units of the underlying type, i.e. in CHARs for PCHAR; this equates to byte arithmetic.
Any C book should elaborate on the precise semantics of pointer arithmetic.
The reason this code fails on 64-bit is that it is casting pointers to ULONG. ULONG is a 32-bit value while pointers on 64-bit are 64-bit values. So you will be truncating the pointer whenever you use the ULONG cast.
The PCHAR cast, assuming PCHAR is defined as char *, is fine, provided the intention is to increment the pointer by an explicit number of bytes.
Both forms have the same intention, but only one of them is valid where pointers are larger than 32 bits.
Pointer arithmetic works like this. If you have:
T *p;
and you do:
p + n;
(where n is a number), then the resulting address differs from p by n * sizeof(T) bytes.
To give a concrete example, if you have a pointer to a DWORD:
DWORD *pdw = &some_dword_in_memory;
and you add one to it:
pdw = pdw + 1;
then you will be pointing to the next DWORD. The address pdw points to will have increased by sizeof(DWORD), i.e. 4 bytes.
The casts you mention are there to control how the address offsets they apply get scaled. This is normally only done in low-level code which has been passed a BYTE (or char or void) buffer but knows the data inside it is really some other type.
ULONG is defined in WinDef.h in the Windows SDK and is always 32 bits wide, so when you cast a 64-bit pointer to ULONG you truncate the pointer to 32 bits.
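To sum it up in code, here's a sketch (the helper name is just mine): the PCHAR cast advances the address by bytes on either architecture, while the pointer-sized ULONG_PTR type from the Windows headers is what you would use if you really need integer arithmetic on an address.
#include <windows.h>

// Advance a buffer pointer by a byte offset.
void Advance(PVOID &ptr, SIZE_T offsetInBytes)
{
    // Works on 32-bit and 64-bit: pointer arithmetic on PCHAR (char *)
    // moves the address forward by exactly offsetInBytes bytes.
    ptr = (PVOID)((PCHAR)ptr + offsetInBytes);

    // Equivalent integer version: ULONG_PTR is an unsigned integer wide
    // enough to hold a pointer on either architecture, unlike ULONG,
    // which truncates a 64-bit address.
    // ptr = (PVOID)((ULONG_PTR)ptr + offsetInBytes);
}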
