Is there efficient way to do this?
That's something you could be using a union for:
union {
UINT64 ui64;
struct {
DWORD d0;
DWORD d1;
} y;
} un;
un.ui64 = 27;
// Use un.y.d0 and un.y.d1
An example (under Linix so using different types):
#include <stdio.h>
union {
long ui64;
struct {
int d0;
int d1;
} y;
} un;
int main (void) {
un.ui64 = 27;
printf ("%d %d\n", un.y.d0, un.y.d1);
return 0;
}
This produces:
27 0
Thought I would provide an example using LARGE_INTEGER FOR the windows platform.
If I have a variable called "value" that is 64 bit, then I do:
LARGE_INTEGER li;
li.QuadPart = value;
DWORD low = li.LowPart;
DWORD high = li.HighPart;
Yes, this copies it, but I like the readability of it.
Keep in mind that 64-bit integers have alignment restrictions at least as great as 32-bit integers on all platforms. Therefore, it's perfectly safe to cast a pointer to a 64-bit integer as a pointer to a 32-bit.
ULONGLONG largeInt;
printf( "%u %u\n", ((DWORD *)&largeInt)[ 0 ], ((DWORD *)&largeInt)[ 1 ] );
Obviously, Pax's solution is a lot cleaner, but this is technically more efficient since it doesn't require any data copying.
Related
I'm wondering if it's possible to hint to gcc that a pointer points to an aligned boundary. if I have a function:
void foo ( void * pBuf ) {
uint64_t *pAligned = pBuf;
pAligned = ((pBuf + 7) & ~0x7);
var = *pAligned; // I want this to be aligned 64 bit access
}
And I know that pBuf is 64 bit aligned, is there any way to tell gcc that pAligned points to a 64 bit boundary? If I do:
uint64_t *pAligned __attribute__((aligned(16)));
I believe that means that the address of the pointer is 64 bit aligned, but it doesn't tell the compiler that the what it points to is aligned, and therefore the compiler would likely tell it to do an unaligned fetch here. This could slow things down if I'm looping through a large array.
There are several ways to inform GCC about alignment.
Firstly you can attach align attribute to pointee, rather than pointer:
int foo() {
int __attribute__((aligned(16))) *p;
return (unsigned long long)p & 3;
}
Or you can use (relatively new) builtin:
int bar(int *p) {
int *pa = __builtin_assume_aligned(p, 16);
return (unsigned long long)pa & 3;
}
Both variants optimize to return 0 due to alignment.
Unfortunately the following does not seem to work:
typedef int __attribute__((aligned(16))) *aligned_ptr;
int baz(aligned_ptr p) {
return (unsigned long long)p & 3;
}
and this one does not either
typedef int aligned_int __attribute__((aligned (16)));
int braz(aligned_int *p) {
return (unsigned long long)p & 3;
}
even though docs suggest the opposite.
Is it OK to use both parts of a union if you know the parts don't overlap? Like in this example, is it OK to use both buf[31] as well as ps?
struct PtrSize {
const char *data;
size_t size;
};
class SmallStringOrNot {
union {
PtrSize ps;
char buf[32];
} pb;
public:
bool IsSmallString() const {
return pb.buf[31] != 0;
}
SmallStringOrNot(const char *str) {
size_t len = strlen(str);
if (len && len < 31) {
memcpy(pb.buf, str, len);
pb.buf[31] = len;
} else {
pb.ps.data = str;
pb.ps.size = len;
pb.buf[31] = 0; // is this OK, accessing buf right after ps?
}
}
PtrSize AsPtrSize() const {
if (IsSmallString()) {
return PtrSize{pb.buf, pb.buf[31]};
} else {
return pb.ps;
}
}
};
Unfortunately the code is not OK: you are at least not in "undefined behaviour"-zone, since in C++ it is always legal to access a union through a char member, but you have no guarantee that by modifying buf[31] you are not altering ps.data or ps.size. In a 128-bit machine you would almost surely be doing it.
On more normal architectures, your code should be fine but for a 100% guarantee you should refer to the compiler documentation, since size_t could in principle be bigger than a void*. For example, even on a 64-bit machine you could theoretically have a 192-bit ps.size member (which summed with the 64 bit of the ps.data pointer would make the PtrSize completely overlap the buffer.
#include <iostream>
#include <Windows.h>
#include <string>
using namespace std;
HANDLE hPort = CreateFile("COM2",
GENERIC_WRITE|GENERIC_READ,0,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
DCB dcb;
bool writebyte(char*data)
{
DWORD byteswritten;
if (!GetCommState(hPort,&dcb))
{
printf("\nSerial port can't be open\n");
return false;
}
dcb.BaudRate = CBR_9600;
dcb.ByteSize = 8;
dcb.Parity = NOPARITY;
dcb.StopBits = ONESTOPBIT;
if (!SetCommState(hPort,&dcb))
return false;
bool retVal = WriteFile(hPort,data,1,&byteswritten,NULL);
return retVal;
}
int ReadByte()
{
int Val;
BYTE Byte;
DWORD dwBytesTransferred;
DWORD dwCommModemStatus;
if (!GetCommState(hPort,&dcb))
return 0;
SetCommMask(hPort,EV_RXCHAR | EV_ERR);
WaitCommEvent (hPort,&dwCommModemStatus,0);
if (dwCommModemStatus & EV_RXCHAR)
ReadFile (hPort,&Byte,1,&dwBytesTransferred,0);
Val = Byte;
return Val;
}
int main() {
POINT p;
int x;
int y;
int z;
while(0==0){
GetCursorPos(&p);
x = p.x;
y = p.y;
HDC hDC;
hDC = GetDC(NULL);
cin >> z;
cout << GetPixel(hDC, x, y) << endl;
Sleep(z);
ReleaseDC(NULL, hDC);
char data = GetPixel(hDC, x, y);
if (writebyte(&data))
cout <<" DATA SENT.. " << (int)data<< "\n";
}
}
in the part of sending data through serial communication, instead of sending the data as GetPixel(hDC, x, y), it only sends the value "-1" . I was thinking it is because char is only for small integers and the output I was giving is a very very long number. I tried to change it to long int but i still get the same result. That it only sends "-1". I thought that the solution might be converting char to long int or long int to char before sending the data but I don't know how..can someone help me?
Why do you use hDC after releasing it?
ReleaseDC(NULL, hDC);
char data = GetPixel(hDC, x, y);
GetPixel will return -1 (CLR_INVALID) in case of an error (see MSDN).
And, by the way, a COLORREF is not a char, so you lose Information when storing the return value of GetPixel in char data. You should store the complete COLORREF and send/receive all of it's bytes (so send/receive sizeof(COLORREF) bytes).
Also be aware of byte order. If you are transferring multi byte data between two machines then you must assure that both agree on the order of the bytes. If for example one machine is little endian and the other big endian, then they store COLORREF with different byte order in memory. One is storing the COLORREF 0x00BBGGRR in memory as { 0, 0xbb, 0xgg, 0xrr } whereas the other is storing it as { 0xrr, 0xgg, 0xbb, 0 }. So you need to define a transmit byte order which both sides use independant of their host byte order. If you don't want to invent the wheel new, you can take a look at network byte order and reuse that. Socket API gives you some functions like ntohl and htonl which help you in converting from host byte order to network byte order and vice versa.
I ported one project from Visual C++ 6.0 to VS 2010 and found that a critical part of the code (scripting engine) now runs in about three times slower than in was before.
After some research I managed to extract code fragment which seems to cause the slowdown. I minimized it as much as possible, so it ill be easier to reproduce the problem.
The problem is reproduced when assigning a complex class (Variant) which contains another class (String), and the union of several other fields of simple types.
Playing with the example I discovered more "magic":
1. If I comment one of unused (!) class members, the speed increases, and the code finally runs faster than those complied with VS 6.2
2. The same is true if I remove the "union" wrapper"
3. The same is true event if change the value of the filed from 1 to 0
I have no idea what the hell is going on.
I have checked all code generation and optimization switches, but without any success.
The code sample is below:
On my Intel 2.53 GHz CPU this test, compiled under VS 6.2 runs 1.0 second.
Compiled under VS 2010 - 40 seconds
Compiled under VS 2010 with "magic" lines commented - 0.3 seconds.
The problem is reproduces with any optimization switch, but the "Whole program optimization" (/GL) should be disabled. Otherwise this too smart optimizer will know that out test actually does nothing, and the test will run 0 seconds.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
class String
{
public:
char *ptr;
int size;
String() : ptr(NULL), size( 0 ) {};
~String() {if ( ptr != NULL ) free( ptr );};
String& operator=( const String& str2 );
};
String& String::operator=( const String& string2 )
{
if ( string2.ptr != NULL )
{
// This part is never called in our test:
ptr = (char *)realloc( ptr, string2.size + 1 );
size = string2.size;
memcpy( ptr, string2.ptr, size + 1 );
}
else if ( ptr != NULL )
{
// This part is never called in our test:
free( ptr );
ptr = NULL;
size = 0;
}
return *this;
}
struct Date
{
unsigned short year;
unsigned char month;
unsigned char day;
unsigned char hour;
unsigned char minute;
unsigned char second;
unsigned char dayOfWeek;
};
class Variant
{
public:
int dataType;
String valStr; // If we comment this string, the speed is OK!
// if we drop the 'union' wrapper, the speed is OK!
union
{
__int64 valInteger;
// if we comment any of these fields, unused in out test, the speed is OK!
double valReal;
bool valBool;
Date valDate;
void *valObject;
};
Variant() : dataType( 0 ) {};
};
void TestSpeed()
{
__int64 index;
Variant tempVal, tempVal2;
tempVal.dataType = 3;
tempVal.valInteger = 1; // If we comment this string, the speed is OK!
for ( index = 0; index < 200000000; index++ )
{
tempVal2 = tempVal;
}
}
int main(int argc, char* argv[])
{
int ticks;
char str[64];
ticks = GetTickCount();
TestSpeed();
sprintf( str, "%.*f", 1, (double)( GetTickCount() - ticks ) / 1000 );
MessageBox( NULL, str, "", 0 );
return 0;
}
This was rather interesting. First I was unable to reproduce the slow down in release build, only in debug build. Then I turned off SSE2 optimizations and got the same ~40s run time.
The problem seems to be in the compiler generated copy assignment for Variant. Without SSE2 it actually does a floating point copy with fld/fstp instructions because the union contains a double. And with some specific values this apparently is a really expensive operation. The 64-bit integer value 1 maps to 4.940656458412e-324#DEN which is a denormalized number and I believe this causes problems. When you leave tempVal.valInteger uninitialized it may contain a value that works faster.
I did a small test to confirm this:
union {
uint64_t i;
volatile double d1;
};
i = 0xcccccccccccccccc; //with this value the test takes 0.07 seconds
//i = 1; //change to 1 and now the test takes 36 seconds
volatile double d2;
for(int i=0; i<200000000; ++i)
d2 = d1;
So what you could do is define your own copy assignment for Variant that just does a simple memcpy of the union.
Variant& operator=(const Variant& rhs)
{
dataType = rhs.dataType;
union UnionType
{
__int64 valInteger;
double valReal;
bool valBool;
Date valDate;
void *valObject;
};
memcpy(&valInteger, &rhs.valInteger, sizeof(UnionType));
valStr = rhs.valStr;
return *this;
}
I have an existing C codebase that works on x86.
I'm now compiling it for x64.
What I'd like to do is cast a size_t to a DWORD, and throw an exception if there's a loss of data.
Q: Is there an idiom for this?
Here's why I'm doing this:
A bunch of Windows APIs accept DWORDs as arguments, and the code currently assumes sizeof(DWORD)==sizeof(size_t). That assumption holds for x86, but not for x64. So when compiling for x64, passing size_t in place of a DWORD argument, generates a compile-time warning.
In virtually all of these cases the actual size is not going to exceed 2^32. But I want to code it defensively and explicitly.
This is my first x64 project, so... be gentle.
see boost::numeric_cast
http://www.boost.org/doc/libs/1_33_1/libs/numeric/conversion/doc/numeric_cast.html
I just defined a function to perform the cast.
I included an assert-like behavior to insure I'm not silently rubbishing pointers.
DWORD ConvertSizeTo32bits(size_t sz, char *file, int line)
{
if (!(0 <= sz && sz <= INT32_MAX)) {
EmitLogMessage("Invalid Pointer size: %d file(%s) line(%d)",
sz, file, line);
ExitProcess( 0 );
}
return (DWORD) sz;
}
#define size_t_to_DWORD(st,dw) if ((DWORD)(st) != st) RaiseException(exLossOfData, 0, 0, NULL); else dw = (DWORD)(st)
size_t st;
DWORD dw;
st = 0xffffffff;
size_t_to_DWORD(st,dw); // this succeeds
st = 0xffffffff1;
size_t_to_DWORD(st,dw); // this throws
EDIT:
Or better yet, do this so you can use it in an expression:
DWORD MyRaiseException()
{
RaiseException(1, 0, 0, NULL);
return 0;
}
#define size_t_to_DWORD(st) (DWORD)(st) != (st) ? MyRaiseException() : (DWORD)(st)
void main(void)
{
size_t st;
DWORD dw;
st = 0xffffffff1;
dw = size_t_to_DWORD(st);
printf("%u %u\n", st, dw);
}