Fastest ways to set and get a bit

I'm trying to develop ultra-fast functions for setting and getting bits in uint32 arrays. For example, you can say "set bit 1035 to 1"; the uint32 at index 1035 / 32 is then used with bit position 1035 % 32. I especially don't like the branching in the SetBit function.
Here is my approach:
void SetBit(uint32* data, const uint32 bitpos, const bool newval)
{
    if (newval)
    {
        // Set on
        data[bitpos >> 5u] |= (1u << (31u - (bitpos & 31u)));
        return;
    }
    else
    {
        // Set off
        data[bitpos >> 5u] &= ~(1u << (31u - (bitpos & 31u)));
        return;
    }
}
and
bool GetBit(const uint32* data, const uint32 bitpos)
{
    return (data[bitpos >> 5u] >> (31u - (bitpos & 31u))) & 1u;
}
Thank you!

First, I would drop the 31u - ... from all expressions: all it does is reorder the bits in your private representation of the bit set, so you can flip this order without anyone noticing.
Second, you can get rid of the branch by using a clever bit hack:
void SetBit(uint32* data, const uint32 bitpos, const bool f)
{
    uint32 &w = data[bitpos >> 5u];
    uint32 m = 1u << (bitpos & 31u);
    // -f is all ones when f is true and all zeros when false, so this
    // clears the bit and then ORs it back in only when f is set.
    w = (w & ~m) | (-f & m);
}
Third, you can simplify your getter by letting the compiler do the conversion:
bool GetBit(const uint32* data, const uint32 bitpos)
{
    return data[bitpos >> 5u] & (1u << (bitpos & 31u));
}
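To sanity-check the branchless setter against the simplified getter, here is a minimal test sketch. It assumes uint32 is a typedef for a 32-bit unsigned integer (e.g. uint32_t from <cstdint>); the two functions are copied from the answer above.

#include <cstdint>
#include <cassert>

typedef uint32_t uint32; // assumed typedef; the question does not show it

void SetBit(uint32* data, const uint32 bitpos, const bool f)
{
    uint32 &w = data[bitpos >> 5u];
    uint32 m = 1u << (bitpos & 31u);
    w = (w & ~m) | (-f & m);   // -f: all ones if f, all zeros otherwise
}

bool GetBit(const uint32* data, const uint32 bitpos)
{
    return data[bitpos >> 5u] & (1u << (bitpos & 31u));
}

int main()
{
    uint32 bits[64] = {0};         // room for 2048 bits
    SetBit(bits, 1035, true);
    assert(GetBit(bits, 1035));
    SetBit(bits, 1035, false);
    assert(!GetBit(bits, 1035));
    return 0;
}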

Related

How to feed a random source with uint32s

I'm trying to implement 32-bit random sources (MT19937-32, LFSR113 & LFSR88, among others) in Go, but math/rand's Source interface expects an Int63() int64 method.
How do we convert a uint32 to an int64 (a non-negative int64, i.e. 63 bits)?
Here's the LFSR88 code (some methods and consts omitted):
type LFSR88 struct {
s1, s2, s3, b uint32
}
.
.
.
func (lfsr *LFSR88) Uint32() uint32 {
lfsr.b = (((lfsr.s1 << 13) ^ lfsr.s1) >> 19)
lfsr.s1 = (((lfsr.s1 & 4294967294) << 12) ^ lfsr.b)
lfsr.b = (((lfsr.s2 << 2) ^ lfsr.s2) >> 25)
lfsr.s2 = (((lfsr.s2 & 4294967288) << 4) ^ lfsr.b)
lfsr.b = (((lfsr.s3 << 3) ^ lfsr.s3) >> 11)
lfsr.s3 = (((lfsr.s3 & 4294967280) << 17) ^ lfsr.b)
return (lfsr.s1 ^ lfsr.s2 ^ lfsr.s3)
}
Converting a uint32 to an int64 is quite simple:
var u32 uint32 = /* some number */
var i64 int64 = int64(u32)
The problem with this alone is that you'll end up with an int64 whose upper 32 bits are all zero, so you probably want to combine two of them:
var u1, u2 uint32 = /* two numbers */
var i64 int64 = int64(u1) | int64(u2)<<32
If the value must be non-negative (as rand.Source's Int63 requires), mask off the high bit of the second word first, e.g. int64(u2&0x7FFFFFFF)<<32, which gives exactly 63 random bits.

Windows GetPixel typedef C++

How do I define the type for GetPixel and / or what else am I missing to use GetPixel?
#include <windows.h>

class PollPixelArray
{
public:
    PollPixelArray(HDC hdcMonitor, LPRECT lprcMonitor);
    unsigned long createHex(int r, int g, int b);
private:
    PollPixelArray();
};

PollPixelArray::PollPixelArray(HDC hdcMonitor, LPRECT lprcMonitor)
{
    GetPixel(hdcMonitor, 50, 100);
}

unsigned long PollPixelArray::createHex(int r, int g, int b)
{
    return (((r & 0xff) << 16) + ((g & 0xff) << 8) + (b & 0xff));
}
GetPixel always returns the same unsigned long / DWORD / COLORREF no matter the X or Y coordinates.
while (tempX<40){
COLORREF tempREF = GetPixel(hdcMonitor, tempX, tempY); //COLORREF | unsigned long |
unsigned int dummy = GetRValue(tempREF);
std::cout << "RGB: " << ("%d", dummy);
dummy = GetGValue(tempREF);
std::cout << "," << ("%d", dummy);
dummy = GetBValue(tempREF);
std::cout << "," << ("%d", dummy);
std::cout << " at " << ("%d", tempX) << ", " << ("%d", tempY) << std::endl;
tempX++;
tempY++;
}
The loop always returns 255,255,255 for the RGB values.
The HDC Callback function:
#include <windows.h>
#include <iostream>
#include "pollpixelarray.h"

BOOL CALLBACK MonitorEnumProc(HMONITOR hMonitor, HDC hdcMonitor, LPRECT lprcMonitor, LPARAM dwData)
{
    PollPixelArray pixels(hdcMonitor, lprcMonitor);
    return TRUE;
}

int main()
{
    EnumDisplayMonitors(NULL, NULL, MonitorEnumProc, 0);
    std::cin.get();
    return 0;
}
This has nothing at all to do with typedefs, and GetPixel itself works correctly. There are a couple of plausible explanations for the behaviour you observe:
The device context is not valid, or
The coordinates that you pass are outside the bounds of the device.
Looking at your code, both are likely to be the case.
The documentation for the hdc parameter of EnumDisplayMonitors says:
If this parameter is NULL, the hdcMonitor parameter passed to the callback function will be NULL, and the visible region of interest is the virtual screen that encompasses all the displays on the desktop.
The documentation for GetPixel says:
If the pixel is outside of the current clipping region, the return value is CLR_INVALID (0xFFFFFFFF defined in Wingdi.h).
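A quick way to see both failure modes is to read through a device context that is known to be valid, such as the screen DC returned by GetDC(NULL), and to check the return value against CLR_INVALID explicitly. A minimal sketch (not the asker's code; the coordinates are arbitrary):

#include <windows.h>
#include <iostream>

int main()
{
    // A DC for the (primary) screen is always valid to read from.
    HDC hdcScreen = GetDC(NULL);

    COLORREF c = GetPixel(hdcScreen, 50, 100);   // arbitrary on-screen coordinates
    if (c == CLR_INVALID) {
        std::cout << "Pixel outside the clipping region or invalid DC" << std::endl;
    } else {
        std::cout << "RGB: " << (int)GetRValue(c) << ","
                  << (int)GetGValue(c) << ","
                  << (int)GetBValue(c) << std::endl;
    }

    ReleaseDC(NULL, hdcScreen);
    return 0;
}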

Combine three 32-bit identifiers into one 32-bit identifier?

Given three identifiers, combine them into a single 32-bit value.
It is known that the first identifier may take at most (2^8)-1 different values; similarly, the second at most (2^8)-1 and the third at most (2^10)-1. Therefore the total count of identifiers of all kinds will not exceed (2^32)-1.
An example solution would be to keep a map:
key: 32 bits,
value: 8 (or 10) bits.
The value would begin at 0 and be incremented every time a new identifier is provided.
Can it be done better (instead of three maps)? Do you see a problem with this solution?
To clarify, an identifier can hold ANY value from the range [0, 2^32). The only information given is that the total number of distinct values will not exceed (2^8)-1 (or (2^10)-1 for the third identifier).
The identifiers can have the same values (it's completely random). Consider the randomness source to be memory addresses handed out by the OS for heap-allocated memory (e.g. using a pointer as an identifier). I realize this might work differently on x64 systems; however, I hope the general problem's solution is similar to this specific one.
This means that simple bit shifting is out of the question.
You could try something like this:-
#include <map>
#include <iostream>
class CombinedIdentifier
{
public:
CombinedIdentifier (unsigned id1, unsigned id2, unsigned id3)
{
m_id [0] = id1;
m_id [1] = id2;
m_id [2] = id3;
}
// version to throw exception on ID not found
static CombinedIdentifier GetIdentifier (unsigned int id)
{
// search m_store for a value = id
// if found, get key and return it
// else....throw an exception->id not found
}
// version to return found/not found instead of throwing an exception
static bool GetIdentifier (unsigned int id, CombinedIdentifier &out)
{
// search m_store for a value = id
// if found, get key and save it to 'out' and return true
// else....return false
}
int operator [] (int index) { return m_id [index]; }
bool operator < (const CombinedIdentifier &rhs) const
{
    // Lexicographic comparison, so the map gets a valid strict weak ordering.
    if (m_id [0] != rhs.m_id [0]) return m_id [0] < rhs.m_id [0];
    if (m_id [1] != rhs.m_id [1]) return m_id [1] < rhs.m_id [1];
    return m_id [2] < rhs.m_id [2];
}
bool operator == (const CombinedIdentifier &rhs) const
{
return m_id [0] == rhs.m_id [0] &&
m_id [1] == rhs.m_id [1] &&
m_id [2] == rhs.m_id [2];
}
bool operator != (const CombinedIdentifier &rhs) const
{
return !operator == (rhs);
}
int GetID ()
{
int
id;
std::map <CombinedIdentifier, int>::iterator
item = m_store.find (*this);
if (item == m_store.end ())
{
id = m_store.size () + 1;
m_store [*this] = id;
}
else
{
id = item->second;
}
return id;
}
private:
int
m_id [3];
static std::map <CombinedIdentifier, int>
m_store;
};
std::map <CombinedIdentifier, int>
CombinedIdentifier::m_store;
int main ()
{
CombinedIdentifier
id1 (2, 4, 10),
id2 (9, 14, 1230),
id3 (4, 1, 14560),
id4 (9, 14, 1230);
std::cout << "id1 = " << id1.GetID () << std::endl;
std::cout << "id2 = " << id2.GetID () << std::endl;
std::cout << "id3 = " << id3.GetID () << std::endl;
std::cout << "id4 = " << id4.GetID () << std::endl;
}
You can get this with bit shifting and unsafe code.
There is an article on SO: What are bitwise shift (bit-shift) operators and how do they work?
Then you can use the whole 32-bit range for your three values:
---- 8 bits ---- | ---- 8 bits ---- | ---- 10 bits ---- | ---- unused 6 bits ----
unsigned int result = (unsigned int)firstValue << (8 + 10 + 6);
result |= (unsigned int)secondValue << (10 + 6);
result |= (unsigned int)thirdValue << 6;
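For completeness, a small sketch of the corresponding unpacking with shifts and masks (same field layout as the diagram above; the function names are mine, not from the answer):

#include <cstdint>

// Pack: 8 bits | 8 bits | 10 bits | 6 unused bits, as in the layout above.
uint32_t Pack(uint32_t first, uint32_t second, uint32_t third)
{
    return (first << 24) | (second << 16) | (third << 6);
}

// Recover the three fields again.
void Unpack(uint32_t packed, uint32_t &first, uint32_t &second, uint32_t &third)
{
    first  = (packed >> 24) & 0xFF;    // top 8 bits
    second = (packed >> 16) & 0xFF;    // next 8 bits
    third  = (packed >> 6)  & 0x3FF;   // next 10 bits
}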
I think you could make use of a Perfect Hash Function. In particular, the link provided in that article to Pearson Hashing seems appropriate. You might even be able to cut-and-paste the C program included in the second article, except for the fact that its output is a 64-bit number, not a 32-bit one. But if you modify it slightly from
for (j=0; j<8; j++) {
// standard Pearson hash (output is h)
to
for (j=0; j<4; j++) {
// standard Pearson hash (output is h)
You'll have what you need.
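For reference, a minimal sketch of the idea (this is not the linked program): a Pearson-style hash produces one byte per pass over the input, so running four passes, each seeded with a different starting byte, yields a 32-bit value. The permutation table below is only illustrative; a real table should be a random permutation of 0..255.

#include <stdint.h>
#include <stddef.h>

/* Illustrative permutation table; call init_table() once before hashing.
   i*167+13 (mod 256) visits every byte value exactly once because 167 is odd,
   so it is a valid (if weak) permutation. */
static uint8_t T[256];

static void init_table(void)
{
    for (int i = 0; i < 256; ++i)
        T[i] = (uint8_t)(i * 167 + 13);
}

/* 32-bit Pearson-style hash: four 8-bit hashes, each with a different seed byte. */
uint32_t pearson32(const uint8_t *data, size_t len)
{
    uint32_t out = 0;
    if (len == 0)
        return 0;
    for (int j = 0; j < 4; ++j) {                 /* the j < 4 loop from above */
        uint8_t h = T[(data[0] + j) & 0xFF];
        for (size_t i = 1; i < len; ++i)
            h = T[h ^ data[i]];
        out |= (uint32_t)h << (8 * j);
    }
    return out;
}

Note that a generic hash like this can still collide, so if the combined identifier must be unique, a map-based scheme like the one in the first answer may still be needed as a fallback.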

What is the most efficient way to subtract signed integral data in binary (bits)?

I'm working in C on a PC, trying to leverage as little C++ as possible, working with binary data stored in unsigned char format, although other formats are certainly possible if worthwhile. The goal is subtracting two signed integer values (which can be ints, signed ints, longs, signed longs, signed shorts, etc.) in binary without converting to other data formats. The raw data is just prepackaged as unsigned char, though, with the user basically knowing which of the signed integer formats should be used for reading (i.e. we know how many bytes to read at once). Even though the data is stored as an unsigned char array, it is meant to be read as signed two's-complement integers.
One common way we're often taught in school is adding the negative. Negation, in turn, is often taught as flipping all the bits and adding 1 (0x1), resulting in two additions (perhaps a bad thing?); or, as other posts point out, as flipping all the bits above the lowest set bit. I'm wondering if there is a more efficient way, one that may not be easily described as a pen-and-paper operation but works because of the way data is stored in bit format. Here are some prototypes I've written, which may not be the most efficient way, but which summarize my progress so far based on textbook methodology.
The addends are passed by reference in case I have to manually extend them to balance their length. Any and all feedback will be appreciated! Thanks in advance for considering.
void SubtractByte(unsigned char* & a, unsigned int & aBytes,
unsigned char* & b, unsigned int & bBytes,
unsigned char* & diff, unsigned int & nBytes)
{
NegateByte(b, bBytes);
// a - b == a + (-b)
AddByte(a, aBytes, b, bBytes, diff, nBytes);
// Restore b to its original state so input remains intact
NegateByte(b, bBytes);
}
void AddByte(unsigned char* & a, unsigned int & aBytes,
unsigned char* & b, unsigned int & bBytes,
unsigned char* & sum, unsigned int & nBytes)
{
// Ensure that both of our addends have the same length in memory:
BalanceNumBytes(a, aBytes, b, bBytes, nBytes);
bool aSign = !((a[aBytes-1] >> 7) & 0x1);
bool bSign = !((b[bBytes-1] >> 7) & 0x1);
// Add bit-by-bit to keep track of carry bit:
unsigned int nBits = nBytes * BITS_PER_BYTE;
unsigned char carry = 0x0;
unsigned char result = 0x0;
unsigned char a1, b1;
// init sum
for (unsigned int j = 0; j < nBytes; ++j) {
for (unsigned int i = 0; i < BITS_PER_BYTE; ++i) {
a1 = ((a[j] >> i) & 0x1);
b1 = ((b[j] >> i) & 0x1);
AddBit(&a1, &b1, &carry, &result);
SetBit(sum, j, i, result==0x1);
}
}
// MSB and carry determine if we need to extend:
if (((aSign && bSign) && (carry != 0x0 || result != 0x0)) ||
((!aSign && !bSign) && (result == 0x0))) {
++nBytes;
sum = (unsigned char*)realloc(sum, nBytes);
sum[nBytes-1] = (carry == 0x0 ? 0x0 : 0xFF); //init
}
}
void FlipByte (unsigned char* n, unsigned int nBytes)
{
for (unsigned int i = 0; i < nBytes; ++i) {
n[i] = ~n[i];
}
}
void NegateByte (unsigned char* n, unsigned int nBytes)
{
// Flip each bit:
FlipByte(n, nBytes);
unsigned char* one = (unsigned char*)malloc(nBytes);
unsigned char* orig = (unsigned char*)malloc(nBytes);
one[0] = 0x1;
orig[0] = n[0];
for (unsigned int i = 1; i < nBytes; ++i) {
one[i] = 0x0;
orig[i] = n[i];
}
// Add binary representation of 1
AddByte(orig, nBytes, one, nBytes, n, nBytes);
free(one);
free(orig);
}
void AddBit(unsigned char* a, unsigned char* b, unsigned char* c,
unsigned char* result) {
*result = ((*a + *b + *c) & 0x1);
*c = (((*a + *b + *c) >> 1) & 0x1);
}
void SetBit(unsigned char* bytes, unsigned int byte, unsigned int bit,
bool val)
{
// shift desired bit into LSB position, and AND with 00000001
if (val) {
// OR with 00001000
bytes[byte] |= (0x01 << bit);
}
else{ // (!val), meaning we want to set to 0
// AND with 11110111
bytes[byte] &= ~(0x01 << bit);
}
}
void BalanceNumBytes (unsigned char* & a, unsigned int & aBytes,
unsigned char* & b, unsigned int & bBytes,
unsigned int & nBytes)
{
if (aBytes > bBytes) {
nBytes = aBytes;
b = (unsigned char*)realloc(b, nBytes);
bBytes = nBytes;
b[nBytes-1] = ((b[0] >> 7) & 0x1) ? 0xFF : 0x00;
} else if (bBytes > aBytes) {
nBytes = bBytes;
a = (unsigned char*)realloc(a, nBytes);
aBytes = nBytes;
a[nBytes-1] = ((a[0] >> 7) & 0x1) ? 0xFF : 0x00;
} else {
nBytes = aBytes;
}
}
The first thing to notice is that signed vs. unsigned doesn't matter to the generated bit pattern in two's complement. All that changes is the interpretation of the result.
The second thing to notice is that an addition has carried if the result is less than either input when done with unsigned arithmetic.
void AddByte(unsigned char* & a, unsigned int & aBytes,
             unsigned char* & b, unsigned int & bBytes,
             unsigned char* & sum, unsigned int & nBytes)
{
    // Ensure that both of our addends have the same length in memory:
    BalanceNumBytes(a, aBytes, b, bBytes, nBytes);
    unsigned char carry = 0;
    for (unsigned int j = 0; j < nBytes; ++j) { // reverse the loop for big-endian data
        unsigned char partial = a[j] + b[j];
        // a[j] + b[j] carried if the 8-bit result wrapped below a[j];
        // adding the incoming carry can wrap it at most once more.
        unsigned char newcarry = (partial < a[j]) || ((unsigned char)(partial + carry) < partial);
        sum[j] = partial + carry;
        carry = newcarry;
    }
}
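Building on that, subtraction does not need a separate negation pass at all: a - b == a + ~b + 1, so you can add the complemented bytes of b on the fly with the carry initialised to 1. A minimal little-endian sketch under those assumptions (the helper name is mine, not from the question; a, b and diff are all nBytes long, two's complement):

#include <stddef.h>

/* a - b == a + ~b + 1: complement b on the fly and start with carry = 1. */
void SubtractBytes(const unsigned char *a, const unsigned char *b,
                   unsigned char *diff, size_t nBytes)
{
    unsigned int carry = 1;                      /* the "+ 1" of two's complement */
    for (size_t j = 0; j < nBytes; ++j) {
        unsigned int t = (unsigned int)a[j] + (unsigned char)~b[j] + carry;
        diff[j] = (unsigned char)t;              /* low 8 bits of the partial sum */
        carry = t >> 8;                          /* carry out: 0 or 1 */
    }
}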

Monochrome Bitmap SetPixel/GetPixel problems... Win32 C Code

This is some of my bitmask code (monochrome bitmaps). There is no problem with the Bitmask_Create() function. I have tested it with opening, loading and saving windows monochrome bitmaps, and it works great. However, the GetPixel and SetPixel functions I've made don't seem to work right. In some instances they seem to work fine depending on the bitmap dimensions.
If anyone could help, I would appreciate it. It's driving me insane.
Thanks.
typedef struct _GL_BITMASK GL_BITMASK;
struct _GL_BITMASK {
int nWidth; // Width in pixels
int nHeight; // Height in pixels
int nPitch; // Width of scanline in bytes (may have extra padding to align to DWORD)
BYTE *pData; // Pointer to the first byte of the first scanline (top down)
};
int BitMask_GetPixel(GL_BITMASK *pBitMask, int x, int y)
{
INT nElement = ((y * pBitMask->nPitch) + (x / 8));
PBYTE pElement = pBitMask->pData + nElement;
BYTE bMask = 1 << (7 - (x % 8));
return *pElement & bMask;
}
void BitMask_SetPixel(GL_BITMASK *pBitMask, int x, int y, int nPixelColor)
{
INT nElement = x / 8;
INT nScanLineOffset = y * pBitMask->nPitch;
PBYTE pElement = pBitMask->pData + nScanLineOffset + nElement;
BYTE bMask = 1 << (7 - (x % 8));
if(*pElement & bMask)
{
if(!nPixelColor) return;
else *pElement ^= bMask;
}
else
{
if(nPixelColor) return;
else *pElement |= bMask;
}
}
GL_BITMASK *BitMask_Create(INT nWidth, INT nHeight)
{
GL_BITMASK *pBitMask;
int nPitch;
nPitch = ((nWidth / 8) + 3) & ~3;
pBitMask = (GL_BITMASK *)GlobalAlloc(GMEM_FIXED, (nPitch * nHeight) + sizeof(GL_BITMASK));
if(!pBitMask)
return (GL_BITMASK *)NULL;
pBitMask->nPitch = nPitch;
pBitMask->nWidth = nWidth;
pBitMask->nHeight = nHeight;
pBitMask->pData = (PBYTE)pBitMask + sizeof(GL_BITMASK);
return pBitMask;
}
I think your formula for calculating pitch is just a little bit off. It happens to work for many widths (including every multiple of 8), but it fails whenever the width is 1 to 7 more than a multiple of 32. Try:
nPitch = ((nWidth + 31) / 8) & ~3;
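A quick throwaway check (mine, not part of the answer) that compares this formula against the straightforward "bytes per row, rounded up to a multiple of 4" computation over a range of widths; it should print no mismatches:

#include <stdio.h>

int main(void)
{
    for (int w = 1; w <= 128; ++w) {
        int pitch    = ((w + 31) / 8) & ~3;       /* formula from the answer above */
        int expected = ((w + 7) / 8 + 3) & ~3;    /* ceil(w/8) rounded up to 4 */
        if (pitch != expected)
            printf("mismatch at width %d: %d vs %d\n", w, pitch, expected);
    }
    return 0;
}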
I figured it out... I had the two tests backwards for nPixelColor in SetPixel()
if(*pElement & bMask)
{
if(nPixelColor) return; // this was !nPixelColor
else *pElement ^= bMask;
}
else
{
if(!nPixelColor) return; // this was nPixelColor
else *pElement |= bMask;
}
