According to MSDN, RECT and RECTL are identical structures. Is there any difference between them at all and if not whats the point of having both of them instead of just one?
There's no difference between them, as documented in the MSDN article. To understand why they both exist, you have to go back in history, back to Windows version 3 and earlier. Those were 16-bit versions of Windows, unlike the versions of Windows that everybody uses today. The Windows SDK version for Windows 3.1 declared the RECT structure like this, in windows.h:
typedef struct tagRECT
{
int left;
int top;
int right;
int bottom;
} RECT;
And the ole2.h header file declared RECTL using long for the structure elements. 16-bit C and C++ compilers back then implemented int as a 16-bit type, fitting the word size of a 16-bit processor, and implemented long as a 32-bit type.
32-bit compilers, as used in modern Windows versions, made int a 32-bit type, fitting the word size of a 32-bit processor. And kept long as a 32-bit type. Which made the difference between the two structure types disappear.
Related
I have a codebase which contains AVX512 intrinsic instructions and was build using intel compiler. I am trying to run the same thing using GNU compiler. While compiling the code with -mavx512f flag using gcc, I am getting declaration error only for some AVX512 instructions like _mm512_mask_i32logather_pd.
Standalone Implementation
#include <iostream>
#include <immintrin.h>
int main() {
__m512d set = _mm512_undefined_pd();
__mmask16 msk = 42440;
__m512i v_index = _mm512_set_epi32(64,66,70,96,98,100,102,104,106,112,114,116,118,120,124,256);
int scale = 8;
int count_size = 495*4;
float *src_ptr = (float*)malloc(count_size*sizeof(float));
__m512 out_512 = (__m512)_mm512_mask_i32logather_pd(set, msk, v_index, (float*)src_ptr, _MM_SCALE_8);
return 0;
}
After running this standalone implementation for the function through gcc I am getting the error as
error: ‘_mm512_mask_i32logather_pd’ was not declared in this scope; did you mean ‘_mm512_mask_i32gather_pd’?
Running the same code using icc with -xCORE-AVX512 flag runs perfectly fine.
Is this because the GNU compiler doesn't support all the AVX512 instructions even though most of the instructions works perfectly fine by using -mavx512f flag?
Relevant information
gcc version - 11.2.0
ubuntu version - 22.04
icc version 2021.6.0
GCC has intrinsics for all AVX-512 instructions. It doesn't always have every alternate version of every intrinsic that differ only in their C semantics, not the underlying instruction they expose.
I think the only difference between the regular _mm512_mask_i32gather_pd intrinsic (which GCC supports) is that logather takes a __m512i vindex instead of __m256i. But uses only the low half, hence the lo in the name. (I looked at them in the intrinsics guide - same pseudocode, just a difference in C/C++ function signature. And they're listed as intrinsics for the same single instruction). There doesn't seem to be a higather intrinsic that includes a shuffle; you need to do the extracting yourself.
vgatherdpd gathers 8 double elements to fill a __m512d, using 32-bit indices. The corresponding 8 indices are only a total of 32 bytes wide. That's why the regular more widely-supported intrinsic only takes a __m256i vindex arg.
Your code strangely bothers to initialize 64 bytes (16 indices), not shuffling the high half down. Also you're merge-masking into _mm512_undefined_pd(), which seems a weird example. But pretty obviously this isn't intended to be useful, since you're also loading from uninitialized malloc. You're casting the result to a __m512, I guess using this instruction to gather pairs of float instead of individual doubles? If so, yeah it's more efficient to gather fewer elements, but it's a weird way to make a minimal simple example for an intrinsic you're looking for. I wonder if perhaps you were looking for _mm512_mask_i32gather_ps to gather 16x float elements, merging into a __m512 vector. (The non-_mask_ version gathers all 16 elements, and you don't have to supply a merge target; that's often what you want.)
If you do have your 8 indices in a wider vector for some reason (e.g. as a result of computation and you're going to do 2 gathers after shuffling), you can just cast the vector type:
__m512i vindex = ...; // the part we want is only the low half
__m512d result = something to merge into;
result = _mm512_mask_i32gather_pd(result, mask, _mm512_castsi512_si256(vindex),
src_ptr, _MM_SCALE_8);
Your cast to (float*) in the arg list to the intrinsic makes no sense: it actually takes a void* so you can gather 64-bit chunks from anything (and yes it's strict-aliasing and alignment safe, not following C rules). But the normal type would be double*, since this is a _pd gather.
In your example, it would be simpler to just __m256 vindex = _mm256_setr_epi32(...); (Or set, if you like the highest-element-first order for the argument list.)
As per the documentation (https://golang.org/pkg/unsafe/#Sizeof) unsafe.Sizeof returns the size of the given expression in bytes. A size of any given expression can ideally be denoted by a uint32 or uint64. Then why does Golang return a uintptr instead? Isn't that confusing? A uintptr is supposed to hold a pointer to some data value but in this case it is not actually a pointer it is just a number right?
There are a lot of good answers in the comments, which boil down to "because that's big enough, yet not too big". I think, though, it might be helpful to view this from a historical perspective, with particular attention to how this all came about in the C programming language.
In very old (pre-standard) C, if you go far back enough in time, there was not even an explicit unsigned integer type. The PDP-11 had:
char, which was 8 bits and signed;
int, which was 16 bits and signed; and
pointers, which were 16 bits and unsigned.
That is:
int i;
int *u;
was how you made two integers, i being signed, and u being unsigned. Setting i to 32767 (0x7fff) and then incrementing it gave you -32768 (0x8000), which gradually increased to -1 (0xffff) and then zero. Setting u to 32767 and then incrementing it gave you 32768, which gradually increased to 65535, and then rolled over to zero.
The lack of distinction between integers and pointers meant that device drivers could read:
struct {
int csr;
int blk;
int bar;
int bcr;
};
0177440->bcr = count;
0177440->blk = block;
0177440->bar = addr;
0177440->csr = READ | GO;
which might be how one told a device to read some bytes or blocks.
(This is also why struct member names, like st_ino in struct stat, were all prefixed like this: st_ino just meant "some integer offset" and you could use the st_ino member with any pointer, or even with an ordinary variable. The prefix meant you could #include multiple headers without having their struct member names collide.)
All of this turned untenable when C was made to work on 32-bit and other machines. C grew an unsigned integer type, rather than pressing pointers into service as unsigned integers, and Steve Johnson's PCC compiler turned unsigned into a modifier, that could be applied to char and short as well as int. A lot of experimentation occurred. Eventually, in 1989, C was first standardized with most of the syntax and semantics that we have now (though new standards have added new types, and many functions, and so on).
Some of the early C pioneers were involved with creating Go, with particular influence from Ken Thompson. There is a quote on the Wikipedia page that is appropriate here:
When the three of us [Thompson, Rob Pike, and Robert Griesemer] got started, it was pure research. The three of us got together and decided that we hated C++. [laughter] ... [Returning to Go,] we started off with the idea that all three of us had to be talked into every feature in the language, so there was no extraneous garbage put into the language for any reason.
As we see from the early days of C, a pointer-as-integer is a suitable unsigned type that can not only hold any pointer, but, if treated as unsigned, can also hold any object size. A pointer-as-integer is not directly usable as a pointer, of course, and with a GC system and concurrency, we need the language itself to have pointers. But we also need to be able to write the runtime support for the language,1 for which we need integer-ized pointers, which also covers all of our needs for object sizes. So one type, built in to the compiler, covers all the requirements. That is as simple as possible, but no simpler.
1I say "we" as if I had anything to do with it. It's just obvious, once you have implemented a few runtime systems.
Few months ago i get myself a laptop with cpu intel i7-2630qm with a 64-bit windows. While practising my programming skils under this system , I encountered some difference in terms of integer size which makes me think that it's probably due to my new 64-bit system.
Let's take a look at a code.
The C Code :
#include <stdio.h>
int main(void)
{
int num = 20;
printf("%d %lld\n" , num , num);
return 0;
}
The Question :
1.) I remember before getting this new laptop , which mean that time i'm still using my old 32-bit system , when i run this code , the program will print the integer 20 while some random number next to it due to the %lld specifier.
2.)But this phenomena no longer happen when i'm using my new laptop , it will instead print both integer correctly , even if i change the variable num to type short.
3.)Is it on a 64-bit system , there's new integer promotion which will promote int to long long when it's use as an argument??Or is it short integer can be promoted to long long which is 64-bit too when pass as an argument??
4.)Besides that I'm quite confuse with one thing , on 16-bit system , int would be 16-bit and it would be 32-bit when it's on a 32-bit system.But why isn't it become 64-bit when it's on a 64-bit??
==================================================================================
Addon :
1.)I choose "console program(64-bit)" as my project on the IDE while using my new laptop but "console program" on my 32-bit old PC system.
2.)I've check the size of int under "console program(64-bit)" project using sizeof operator and it returns 32-bit while short still remain 16-bit.The only change is long type , it's 64-bit and long long still remain its usual 64-bit size.
You are seeing this side-effect because the calling convention is different for x64 code. The function arguments in 32-bit x86 code are passed on the stack. The printf() function will read a word from the stack that isn't part of the activation frame. The odds that it contains a value of 0 are extremely low.
In x64 code, the first 4 arguments for a function are passed through cpu registers, not the stack. The odds that the high word of the 64-bit register is zero by chance are quite good. Left there by a previous 64-bit operation that worked with small numbers. But certainly not guaranteed.
Trying to reason out the defined behavior of undefined behavior is otherwise not useful. Other than trying to guess how the language is implemented for the core that's in your machine. There are better resources for that. Learning the machine code that's applicable to your compiler is an excellent shortcut. Together with the decent debugger that shows you how your C code got translated into machine code. Machine code has no undefined behavior.
I do not have access to an windows 64-bit compiler right now, but my guess is the following.
Your question is not about integer promotion, but regarding how parameters are passed from the function caller to the called function. This is beyond the C specification, but it is interesting to know.
In 32-bit, all parameters are divided into 32-bit blocks as all registers can hold 32 bits. So in this case we have the following stack layout:
[ 32-bit format string pointer ][ num as 32-bit ][ num as 32-bit ] junk...
In 64-bit, all parameters are divided into 64-bit blocks as all registers can hold 64 bits. So the stack will contain the following:
[ 64-bit format string pointer ][ num as 64-bit ][ num as 64-bit ] junk...
The upper 32 bits of the 64-bit registers holding 32-bit values are conveniently set to zero.
So when printf is reading a 64-bit number, it will load the equivalent of two 32-bit registers on a 32-bit platform but only one 64-bit register, with high bits cleared, on a 64-bit platform.
(1 and 2) As already stated, the behaviour in this situation is undefined, so the compiler is allowed to behave differently for any reason or indeed no reason at all.
(3) The compiler is allowed to define int as 64-bit, in which case no promotion would be necessary because all the variables in question would be the same size. But it almost certainly doesn't.
(4) On most or all 64-bit compilers, int is 32-bits. This is because int has been 32 bits for so long that programmers have come to expect it and changing it would break existing code. As far as I know this isn't officially part of the standard, but it's one of those de-facto standards that are even harder to change. :-)
Everything you are describing is specific to whatever spec your compiler is using and the platform you are on (with the exception that long is guaranteed to be at least the same size as int):
Wikipedia entries:
long long
int
The c99 standard seeks to end this ambiguity by adding specific types; int32_t, uint64_t, etc. There's also a POSIX spec that defines u_int32_t, etc.
Edit: I missed the question about printf(), sorry. As #nos points out in the comments on your question, passing something other than a long long to %lld results in undefined behavior. This means there is no rhyme or reason as to what it will do; unicorns spontaneously appearing would not be out of the question.
Oh - and on every compiler and OS I know, int is 32 bit. Changing that has the potential to break things that depend on it being 32 bit.
I need to store the maximum value of an NSInteger into an NSInteger? What is the correct syntax to do it?
Thanks.
The maximum value of an NSInteger is NSIntegerMax.
The maximum value for an NSInteger is NSIntegerMax
(from Foundation Constants Reference)
For 32-bit & 64 bit, there are two conventions:
a)ILP32
b)LP64
The 32-bit runtime uses a convention called ILP32, in which integers, long integers, and pointers are 32-bit quantities. The 64-bit runtime uses the LP64 convention; integers are 32-bit quantities, and long integers and pointers are 64-bit quantities. These conventions match the ABI for apps running on OS X (and similarly, the Cocoa Touch conventions match the data types used in Cocoa), making it easy to write interoperable code between the two operating systems.
Table 1-1 all of the integer types commonly used in Objective-C code. Each entry includes the size of the data type and its expected alignment in memory. The highlighted table entries indicate places where the LP64 convention differs from the ILP32 convention. These size differences indicate places where your code’s behavior changes when compiled for the 64-bit runtime. The compiler defines the LP64 macro when compiling for the 64-bit runtime.
for 64 bit max range for NSInteger is : LONG_MAX : 9223372036854775807
Took me a little while for me to realise why I was getting a different value from NSIntegerMax when using NSUInteger!!
And the maximum for a NSUInteger is NSUIntegerMax
(also from http://developer.apple.com/library/ios/#documentation/cocoa/reference/foundation/Miscellaneous/Foundation_Constants/Reference/reference.html)
I was trying to run some drivers coded for 32-bit vista (x86) on 64-bit win7 (amd64) and it was not running. After a lot of debugging, and hit-and-trial, I made it to work on the latter, but I don't know the reason why it's working. This is what I did:
At many places, buffer pointers pointed to an array of structures(different at different places), and to increment them, at some places this type of statement was used:
ptr = (PVOID)((PCHAR)ptr + offset);
And at some places:
ptr = (PVOID)((ULONG)ptr + offset);
The 2nd one was returning garbage, so I changed them all to 1st one. But I found many sample drivers on the net following the second one. My questions:
Where are these macros
defined(google didn't help much)?
I understand all the P_ macros are
pointers, why was a pointer casted
to ULONG? How does this work on
32-bit?
PCHAR obviously changes the
width according to the environment. Do you know any place to find documentation for this?
they should be defined in WinNT.h (they are in the SDK; don't have the DDK at hand)
ULONG is unsigned long; on a 32-bit system, this is the size of a pointer. So a pointer
can be converted back and forth to ULONG without loss - but not so on a 64-bit system
(where casting the value will truncate it). People cast to ULONG to get byte-base pointer
arithmetic (even though this has undefined behavior, as you found out)
Pointer arithmetic always works in units of the underlying type, i.e. in CHARs for PCHAR; this equates to bytes arithmetic
Any C book should elaborate on the precise semantics of pointer arithmetic.
The reason this code fails on 64-bit is that it is casting pointers to ULONG. ULONG is a 32-bit value while pointers on 64-bit are 64-bit values. So you will be truncating the pointer whenever you use the ULONG cast.
The PCHAR cast, assuming PCHAR is defined as char * is fine, provided the intention is to increment the pointer by an explicit number of bytes.
Both macros have the same intention but only one of them is valid where pointers are larger than 32-bits.
Pointer arithmetic works like this. If you have:
T *p;
and you do:
p + n;
(where n is a number), then the value of p will change by n * sizeof(T).
To give a concrete example, if you have a pointer to a DWORD:
DWORD *pdw = &some_dword_in_memory;
and you add one to it:
pdw = pdw + 1;
then you will be pointing to the next DWORD. The address pdw points to will have increased by sizeof(DWORD), i.e. 4 bytes.
The macros you mention are using casts to cause the address offsets they apply to be multiplied by different amounts. This is normally only done in low-level code which has been passed a BYTE (or char or void) buffer but knows the data inside it is really some other type.
ULONG is defined in WinDef.h in Windows SDK and is always 32-bit, so when you cast a 64-bit pointer into ULONG you truncate the pointer to 32 bits.