When comparing two Hex values, if one is shorter than the other, what do I fill in for the shorter?

Sorry for the confusing title if it doesn't make much sense.
I'm somewhat new to systems and C in general, so when it comes to bits/bytes/hex, I'm on the sidelines trying to figure it out.
My question: suppose I have a sequence of bytes, let's say
0x01020304, and I want to use the bitwise & operator with 0xFF. Which bytes in the former am I comparing FF to, since it's a much shorter value?
I think I remember that the shorter value is aligned to the right, as in FF, and then filled in with 0s on the left, so that the 0s are compared with 0x010203 and FF is compared with 04.
Am I correct in thinking this, or do I fill in this comparison another way?

Here's a hint:
0xFF is the same as 0x000000FF.
So...
int f1 = 0x01020304; // 16909060 in decimal
int f2 = 0x000000ff; // 255 in decimal
int f3 = f1&f2; // gives us 4 in decimal
same with this code:
int f1 = 0x01020304; // 16909060 in decimal
int f2 = 0xff; // 255 in decimal
int f3 = f1&f2; // gives us 4 in decimal
Hope that helps.
In other words: Yup, you're right with your considerations.
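To make the zero-padding concrete, here is a minimal sketch in C that pulls each byte out of a 32-bit value by shifting and masking with 0xFF. The helper name byte_at is made up for this illustration, and index is assumed to be in the range 0..3.

```c
#include <stdint.h>

/* Return byte number `index` of `value`, where index 0 is the least
   significant byte. The shift moves the wanted byte to the bottom, and the
   mask 0xFF (that is, 0x000000FF) clears every bit above it.
   Assumption: index is 0, 1, 2 or 3. */
uint8_t byte_at(uint32_t value, unsigned index)
{
    return (uint8_t)((value >> (8u * index)) & 0xFFu);
}
```

With value 0x01020304, index 0 yields 0x04 and index 3 yields 0x01.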

Related

circular byte shifting in an array

I'm coding an LED display (7x48) and the language I'm working in is BASIC (no former experience in that language, but in C/C++) and I have a small issue.
I have an array (red[20] of byte), and an example of a current state is below.
To make it easier here, let's say it's red[3]:
10011010 01011100 01011101
Now I need to shift the array by 1, so in the next cycle it's supposed to be
00110100 10111000 10111011
So what happened is that the whole array shifted 1 bit to the left, with the top bit wrapping around to the end.
The BASIC I'm working with doesn't have any .NET APIs, so I need the low-level code (it doesn't have to be BASIC, I can translate it; I just need an idea of how to do it, and since I'm limited to 8KB of code memory I have to fully optimize it).
If most significant bit is 1:
    subtract value of most significant bit
    multiply by 2
    add 1
otherwise:
    multiply by 2
You should be able to use bit shift operations:
http://msdn.microsoft.com/en-us/library/2d9yb87a.aspx
Let x be the element you want to shift:
x = (x<<1) | (x>>23)
or in general, if you want to shift left by y bits and there are a total of n bits:
x = (x<<y) | (x>>(n-y))
I don't know BASIC well, but here's what I would do in a C++/Java/C#-style language:
Assuming you have red[] of length n, with unsigned elements:
int b = 32; // Number of bits per element (use 8 for the byte array in your example)
int y = 1; // Number of bits to shift to the left
int carry = 0; // The bits to carry over (I'm assuming that they move up the array from red[0] to red[1], etc.)
for (int i = 0; i < n; i++)
{
    int newCarry = red[i] >> (b - y); // the top y bits, which are about to be shifted out
    red[i] = (red[i] << y) | carry;
    carry = newCarry;
}
// Complete the loop: the bits shifted out of the last element wrap around to the first
red[0] |= carry;
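For the byte array in the question, where red[0] holds the leftmost bits and the bit shifted out of the front reappears at the end, a minimal C sketch could look like the following. The function name rotate_left_1 is an assumption for illustration; it rotates by exactly one bit and uses no multiplication, which should translate to BASIC fairly directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate an array of bytes left by one bit, treating a[0] as the most
   significant byte. The top bit of a[0] wraps around into the bottom
   bit of a[n-1]. */
void rotate_left_1(uint8_t *a, size_t n)
{
    if (n == 0) {
        return;
    }
    uint8_t wrap = (uint8_t)(a[0] >> 7);          /* bit that leaves the front */
    for (size_t i = 0; i + 1 < n; i++) {
        /* shift this byte and pull in the top bit of the next byte */
        a[i] = (uint8_t)((a[i] << 1) | (a[i + 1] >> 7));
    }
    a[n - 1] = (uint8_t)((a[n - 1] << 1) | wrap); /* wrapped bit enters at the end */
}
```

Applied to the bytes 10011010 01011100 01011101 from the question, one call produces 00110100 10111000 10111011.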

checking duplicate character in a string

In an interview I was asked to check whether a given string has duplicate characters. Googling this question, I came across a solution that uses bit manipulation.
#include <stdbool.h>
bool check(char*name)
{
    int i;
    int checker=0;
    for(i=0;name[i]!=0;i++)
    {
        int val=name[i]-'a';
        if((checker&(1<<val))>0)return false;
        checker|=(1<<val);
    }
    return true;
}
I checked this code and it works fine, but I didn't understand the logic behind these lines.
> if((checker&(1<<val))>0)return false;
> checker|=(1<<val);
My second question: will this work if a string is too long or contains Unicode (2-byte wide chars)?
The algorithm uses 1 bit per ASCII character to record membership in the set of characters seen so far. So it at least works for English lowercase letters: there are 26 of them, with consecutive ASCII codes.
a= 000001, b= 000010, c= 000100, etc.
'aacaaccc' and 'ac' and 'ca' would all encode to 000101 regardless of the number of occurrences of a and c. Thus the string length doesn't matter.
You are right about the 2-byte chars. The Latin character set would also cause problems, but the issue of mixing cases (upper and lower) could easily be resolved by masking off bit 5 (value 32) to convert to upper case (or ORing with 32 to convert to lower case).
The ASCII character table assigns an integer to all characters:
@ = 64 = 01**0**00000 ...
A = 65 = 01**0**00001 ... a = 97 = 01**1**00001
B = 66 = 01**0**00010 ... b = 98 = 01**1**00010
..
Z = 90 = 01**0**11010 ... z = 122 = 01**1**11010
Upper and lower case characters differ only on that particular bit and 'a' - 32 == 'A' or the other way round: 'B' + 32 == 'b' or 'B' | 32 == 'b', where | is the bitwise OR operator.
This is known as bit masking. Here checker is the bit mask.
The first expression, if((checker&(1<<val))>0), tests the bit, and the second expression, checker|=(1<<val), sets the bit.
The left shift operator computes 1 * 2^val, so you get something like 001000 (for 'd').
The & operator yields a non-zero result only when both checker's val-th bit and the new value (001000) are 1, so you know whether that character was already seen.
The | operator simply sets the val-th bit to 1. So if at some point checker was 010000, it now becomes 011000.
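The same test-then-set idea extends beyond lowercase letters if the single int is replaced by a larger bit set. The following is only a sketch under stated assumptions: the name has_duplicate_byte is made up, bytes are assumed to be 8 bits wide, and the function compares bytes rather than characters, so multi-byte encodings such as UTF-8 would need decoding first. Note that it returns true when a duplicate IS found, the opposite of check() above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Report whether any byte value occurs more than once in s.
   seen[] is a 256-bit set: one bit for every possible 8-bit byte value. */
bool has_duplicate_byte(const char *s)
{
    uint32_t seen[8] = { 0 };                    /* 8 words * 32 bits = 256 bits */
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        uint32_t bit = UINT32_C(1) << (c % 32);  /* position inside the word */
        if (seen[c / 32] & bit) {
            return true;                         /* bit already set: seen before */
        }
        seen[c / 32] |= bit;                     /* mark this byte value as seen */
    }
    return false;
}
```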

How to compute the integer absolute value

How to compute the integer absolute value without using if condition.
I guess we need to use some bitwise operation.
Can anybody help?
Same as existing answers, but with more explanations:
Let's assume a twos-complement number (as it's the usual case and you don't say otherwise) and let's assume 32-bit:
First, we perform an arithmetic right-shift by 31 bits. This shifts in all 1s for a negative number or all 0s for a non-negative one. (Note that the behaviour of the actual >> operator on negative numbers is implementation-defined in C and C++. It will usually perform an arithmetic shift too, but let's just assume pseudocode or actual hardware instructions, since it sounds like homework anyway.)
mask = x >> 31;
So what we get is 111...111 (-1) for negative numbers and 000...000 (0) for positives
Now we XOR this with x, getting the behaviour of a NOT for mask=111...111 (negative) and a no-op for mask=000...000 (positive):
x = x XOR mask;
And finally subtract our mask, which means +1 for negatives and +0/no-op for positives:
x = x - mask;
So for positives we perform an XOR with 0 and a subtraction of 0 and thus get the same number. And for negatives, we got (NOT x) + 1, which is exactly -x when using twos-complement representation.
Set the mask as right shift of integer by 31 (assuming integers are stored as two's-complement 32-bit values and that the right-shift operator does sign extension).
mask = n>>31
XOR the mask with number
mask ^ n
Subtract mask from result of step 2 and return the result.
(mask^n) - mask
Assume int is of 32-bit.
int my_abs(int x)
{
int y = (x >> 31);
return (x ^ y) - y;
}
One can also perform the above operation as:
return n*(((n>0)<<1)-1);
where n is the number whose absolute value needs to be calculated.
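As a sketch of the mask/XOR/subtract recipe that does not depend on how >> treats negative numbers, the mask can be built in unsigned arithmetic. The name branchless_abs is hypothetical, and the code assumes a two's-complement int32_t. The input INT32_MIN has no positive counterpart, so the final conversion for that one value is implementation-defined (typically it comes back unchanged).

```c
#include <stdint.h>

/* Branch-free absolute value using mask = all ones for negative x,
   all zeros otherwise, then (x XOR mask) - mask.
   Assumption: int32_t is two's complement. */
int32_t branchless_abs(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = 0u - (ux >> 31);      /* 0xFFFFFFFF if negative, else 0 */
    return (int32_t)((ux ^ mask) - mask); /* NOT x, plus 1, when negative */
}
```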
In C, you can use unions to perform bit manipulations on doubles. The following will work in C; integer and float arguments can be passed too, since they are converted to double first.
/**
* Calculates the absolute value of a double.
* @param x An 8-byte floating-point double
* @return A positive double
* @note Uses bit manipulation and does not care about NaNs
*/
double abs(double x)
{
union{
uint64_t bits;
double dub;
} b;
b.dub = x;
//Sets the sign bit to 0
b.bits &= 0x7FFFFFFFFFFFFFFF;
return b.dub;
}
Note that this assumes that doubles are 8 bytes with the sign in the top bit (IEEE 754 binary64), and that <stdint.h> is included for uint64_t.
I wrote my own, before discovering this question.
My answer is probably slower, but still valid:
int abs_of_x = ((x*(x >> 31)) | ((~x + 1) * ((~x + 1) >> 31)));
If you are not allowed to use the minus sign you could do something like this:
int absVal(int x) {
return ((x >> 31) + x) ^ (x >> 31);
}
For assembly, the most efficient would be to initialize a value to 0, subtract the integer, and then take the max:
pxor mm1, mm1 ; set mm1 to all zeros
psubw mm1, mm0 ; make each mm1 word contain the negative of each mm0 word
pmaxsw mm1, mm0 ; mm1 will contain only the positive (larger) values - the absolute value
In C#, you can implement abs() without using any local variables:
public static long abs(long d) => (d + (d >>= 63)) ^ d;
public static int abs(int d) => (d + (d >>= 31)) ^ d;
Note: regarding 0x80000000 (int.MinValue) and 0x8000000000000000 (long.MinValue):
As with all of the other bitwise/non-branching methods shown on this page, this gives the single non-mathematical result abs(int.MinValue) == int.MinValue (likewise for long.MinValue). These represent the only cases where result value is negative, that is, where the MSB of the two's-complement result is 1 -- and are also the only cases where the input value is returned unchanged. I don't believe this important point was mentioned elsewhere on this page.
The code shown above depends on the value of d used on the right side of the xor being the value of d updated during the computation of the left side. To C# programmers this will seem obvious, because C# guarantees that operands are evaluated left to right. The reason I mention this is that in C or C++ one needs to be more cautious: there, the equivalent expression modifies d and reads it again with no sequencing between the two, which is undefined behavior, so the compiler is free to fetch the operands in either order.
If you don't want to rely on implementation of sign extension while right bit shifting, you can modify the way you calculate the mask:
mask = ~((n >> 31) & 1) + 1
then proceed as was already demonstrated in the previous answers:
(n ^ mask) - mask
What is the programming language you're using? In C# you can use the Math.Abs method:
int value1 = -1000;
int value2 = 20;
int abs1 = Math.Abs(value1);
int abs2 = Math.Abs(value2);

Pass two integers as one integer

I have two integers that I need to pass through one integer and then get the values of two integers back.
I am thinking of using Logic Operators (AND, OR, XOR, etc) .
Using the C programming language, it could be done as follows, assuming that the two integers are non-negative and no larger than 65535.
void take2IntegersAsOne(unsigned int x)
{
    // int1 is stored in the bottom half of x, so take just that part.
    int int1 = x & 0xFFFF;
    // int2 is stored in the top half of x, so slide that part of the number
    // into the bottom half, and take just that part.
    int int2 = (x >> 16) & 0xFFFF;
    // use int1 and int2 here. They must both be at most 0xFFFF, or 65535 in decimal
}
void pass2()
{
    int int1 = 345;
    int int2 = 2342;
    // unsigned arithmetic avoids overflowing a signed int when int2 >= 0x8000
    take2IntegersAsOne( (unsigned int)int1 | ((unsigned int)int2 << 16) );
}
This relies on an int being stored in 4 bytes, which is true on most current platforms but is not guaranteed by C. The example uses the first two bytes to store one of the integers, and the next two bytes for the second. This does impose the limit that each of the integers must have a small enough value to fit into just 2 bytes.
The shift operators << and >> are used to slide the bits of an integer up and down. Shifting by 16 moves the bits by two bytes (as there are 8 bits per byte).
0xFFFF represents the bit pattern where all of the bits in the lower two bytes of the number are 1s. So ANDing with it (using the & operator) causes all the bits that are not in these bottom two bytes to be switched off (back to zero). This can be used to remove any parts of the 'other integer' from the one you're currently extracting.
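If fixed-width types are available, the 16-bit limit can be stated in the types themselves instead of in comments. This is a minimal sketch; the names pack16, unpack_low and unpack_high are made up for illustration.

```c
#include <stdint.h>

/* Store `low` in bits 0..15 and `high` in bits 16..31 of one 32-bit value. */
uint32_t pack16(uint16_t low, uint16_t high)
{
    return (uint32_t)low | ((uint32_t)high << 16);
}

/* Keep only the bottom 16 bits. */
uint16_t unpack_low(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}

/* Slide the top 16 bits down; the unsigned shift fills with zeros. */
uint16_t unpack_high(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}
```

Packing 345 (0x0159) as the low half and 2342 (0x0926) as the high half gives 0x09260159.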
There are two parts to this question. First, how do you bitmask two 32-bit Integers into a 64-bit Long Integer?
As others have stated, let's say I have a function that takes an X and Y coordinate, and returns a longint representing that Point's linear value. I tend to call this linearization of 2d data:
public long asLong(int x, int y) {
    // mask y so that a negative y is not sign-extended over the bits holding x
    return ( ((long)x) << 32 ) | (y & 0xFFFFFFFFL);
}
public int getX(long location) {
    return (int)((location >> 32) & 0xFFFFFFFFL);
}
public int getY(long location) {
    return (int)(location & 0xFFFFFFFFL);
}
Forgive me if I'm paranoid about order of operations, sometimes other operations are greedier than <<, causing things to shift further than they should.
Why does this work? When might it fail?
It's convenient that integers tend to be exactly half the size of longints. What we're doing is casting x to a long, shifting it left until it sits entirely to the left of y, and then doing a union operation (OR) to combine the bits of both.
Let's pretend they're 4-bit numbers being combined into an 8-bit number:
x = 14      : 1110
y = 5       : 0101
x = x << 4  : 1110 0000
p = x | y   :    1110 0000
              OR      0101
              ------------
                 1110 0101
Meanwhile, the reverse:
p = 229     : 1110 0101
x = p >> 4  : 1111 1110 //depending on your language and data type, sign extension
                        //can cause the bits to smear on the left side as they're
                        //shifted, as shown here. Doesn't happen in unsigned types
x = x & 0xF:
      1111 1110
  AND 0000 1111
  -------------
      0000 1110 //AND selects only the bits we have in common
y = p & 0xF:
      1110 0101
  AND 0000 1111
  -------------
      0000 0101 //AND strikes again
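The sign-extension smear shown above can be sidestepped by doing the packing in unsigned arithmetic. Here is a sketch in C for two 32-bit values in one 64-bit value; the names pack32, unpack_x and unpack_y are assumptions for illustration. Converting an out-of-range uint32_t back to int32_t is implementation-defined in C, although mainstream compilers wrap around as expected.

```c
#include <stdint.h>

/* x goes in the upper 32 bits, y in the lower 32 bits. Each value is first
   converted to uint32_t so that a negative value cannot sign-extend into
   the other half. */
uint64_t pack32(int32_t x, int32_t y)
{
    return ((uint64_t)(uint32_t)x << 32) | (uint64_t)(uint32_t)y;
}

/* Unsigned right shift brings in zeros, so no mask is needed here. */
int32_t unpack_x(uint64_t p)
{
    return (int32_t)(uint32_t)(p >> 32);
}

/* Keep only the lower 32 bits. */
int32_t unpack_y(uint64_t p)
{
    return (int32_t)(uint32_t)(p & 0xFFFFFFFFu);
}
```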
This sort of approach came into being a long time ago, in environments that needed to squeeze every bit out of their storage or transmission space. If you're not on an embedded system or immediately packing this data for transmission over a network, the practicality of this whole procedure starts to break down really rapidly:
It's way too much work just for boxing a return value that almost always immediately needs to be unboxed and read by the caller. That's kind of like digging a hole and then filling it in.
It greatly reduces your code readability. "What type is returned?" Uh... an int.. and another int... in a long.
It can introduce hard-to-trace bugs down the line. For instance, if you use unsigned types and ignore the sign extension, then later on migrate to a platform that causes those types to go two's complement. If you save off the longint, and try to read it later in another part of your code, you might hit an off-by-one error on the bitshift and spend an hour debugging your function only to find out it's the parameter that's wrong.
If it's so bad, what are the alternatives?
This is why people were asking you about your language. Ideally, if you're in something like C or C++, it'd be best to say
struct Point { int x; int y; };
struct Point getPosition(void) {
    struct Point result = { 14, 5 };
    return result;
}
Otherwise, in HLLs like Java, you might wind up with an inner class to achieve the same functionality:
public class Example {
public class Point {
public int x;
public int y;
public Point(int x, int y) { this.x=x; this.y=y; }
}
public Point getPosition() {
return new Point(14,5);
}
}
In this case, getPosition returns an Example.Point - if you keep using Point often, promote it to a full class of its own. In fact, the Java standard library has several point classes already, including java.awt.Point and java.awt.geom.Point2D.Float
Finally, many modern languages now have syntactic sugar for either boxing multiple values into tuples or directly returning multiple values from a function. This is kind of a last resort. In my experience, any time you pretend that data isn't what it is, you wind up with problems down the line. But if your method absolutely must return two numbers that really aren't part of the same data at all, tuples or arrays are the way to go.
The reference for the c++ stdlib tuple can be found at
http://www.cplusplus.com/reference/std/tuple/
Well.. @Felice is right, but if they both fit in 16 bits there's a way:
output_int = (first_int << 16) | second_int
(where | means 'or')
to pack them, and
first_int = (output_int >> 16) & 0xffff
second_int = output_int & 0xffff
(where & means 'and')
to extract them.
Two integers can't fit into one integer, or at least you can't get the two original ones back.
But anyway, if the two original integers are bounded to a known number of bits you can (in pseudocode):
First integer
OR with
(Second integer SHIFTLEFT(nOfBits))
To get the two integers back,
mask the merged integer with a number whose binary representation is nOfBits ones, and you obtain the first integer; then
ShiftRight the merged integer by nOfBits, and you have the second one back.
You could store two 16-bit integers within a 32-bit integer: the first one in the first 16 bits and the second one in the last 16 bits. To retrieve and compose the value you use the shift operators.

Convert string to integer (not atoi!)

I want to be able to take, as input, a character pointer to a number in base 2 through 16 and, as a second parameter, what base the number is in, and then convert that to its representation in base 2. The integer can be of arbitrary length. My solution now does what the atoi() function does, but I was curious, purely out of academic interest, whether a lookup table solution is possible.
I have found that this is simple for binary, octal, and hexadecimal. I can simply use a lookup table for each digit to get a series of bits. For instance:
0xF1E ---> (F = 1111) (1 = 0001) (E = 1110) ---> 111100011110
0766 ---> (7 = 111) (6 = 110) (6 = 110) ---> 111110110
1000 ---> ??? ---> 1111101000
However, my problem is that I want to do this look up table method for odd bases, like base 10. I know that I could write the algorithm like atoi does and do a bunch of multiplies and adds, but for this specific problem I'm trying to see if I can do it with a look up table. It's definitely not so obvious with base 10, though. I was curious if anyone had any clever way to figure out how to generate a generic look up table for Base X -> Base 2. I know that for base 10, you can't just give it one digit at a time, so the solution would likely have to lookup a group of digits at a time.
I am aware of the multiply and add solution but since these are arbitrary length numbers, the multiply and add operations are not free so I'd like to avoid them, if at all possible.
You will have to use a look up table with an input width of m base b symbols returning n bits so that
n = log2(b) * m
for positive integers b, n and m. So if b is not a power of two, there will be no (simple) look up table solution.
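For the power-of-two case the relation is exact with m = 1: one hexadecimal digit always maps to n = 4 bits, so a 16-entry table is enough. The following is a sketch only; hex_to_bits is a hypothetical helper, and the caller is assumed to supply an output buffer of at least 4 * strlen(hex) + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Translate each hex digit of `hex` into 4 characters of '0'/'1' in `out`.
   Returns 0 on success, -1 if a character is not a hex digit.
   Assumption: out has room for 4 * strlen(hex) + 1 bytes. */
int hex_to_bits(const char *hex, char *out)
{
    static const char *const nibble[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
    };
    static const char digits[] = "0123456789abcdef";

    for (; *hex != '\0'; hex++) {
        char c = *hex;
        if (c >= 'A' && c <= 'F') {
            c = (char)(c - 'A' + 'a');      /* fold upper case to lower case */
        }
        const char *p = strchr(digits, c);  /* position in digits = digit value */
        if (p == NULL) {
            return -1;
        }
        memcpy(out, nibble[p - digits], 4); /* table lookup, no arithmetic */
        out += 4;
    }
    *out = '\0';
    return 0;
}
```

For the input F1E this produces 111100011110, matching the example in the question.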
I do not think that there is a solution. The following example with base 10 illustrates why.
65536 = 1 0000 0000 0000 0000
Changing the last digit from 6 to 5 will flip all bits.
65535 = 0 1111 1111 1111 1111
And almost the same will hold if you process the input starting from the end. Changing the first digit from 6 to 5 flips a significant number of bits.
55536 = 0 1101 1000 1111 0000
This is not possible in bases that aren't powers of two to convert to base-2. The reason that it is possible for base 8 (and 16) is that the way the conversion works is following:
octal ABC = 8^2*A + 8^1*B + 8^0*C (decimal)
= 0b1000000*A + 0b1000*B + C (binary)
so if you have the lookup table of A = (0b000 to 0b111), then the multiplication is always by 1 and some trailing zeros, so the multiplication is simple (just shifting left).
However, consider the 'odd' base of 10. When you look at the powers of 10:
10^1 = 0b1010
10^2 = 0b1100100
10^3 = 0b1111101000
10^4 = 0b10011100010000
..etc
You'll notice that the multiplication never gets simple, so you can't have any lookup tables and do bitshifts and ors, no matter how big you group them. It will always overlap. The best you can do is have a lookup table of the form: (a,b) where a is the digit position, and b is the digit (0..9). Then, you are only reduced to adding n numbers, rather than multiplying and adding n numbers (plus the cost of the memory of the lookup table)
How big are the strings? You can potentially convert the multiply-and-add to a lookup-and-add by doing something like this:
Store the numbers 0-9, 10, 20, 30, 40, ... 90, 100, 200, ... 900, 1000, 2000, ... , 9000, 10000, ... in the target base in a table.
For each character starting with the rightmost, index appropriately into the table and add it to a running result.
Of course I'm not sure how well this will actually perform, but it's a thought.
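A rough sketch of that lookup-and-add idea in C follows. It is deliberately limited to values that fit in a uint64_t (at most 19 decimal digits) to stay short; an arbitrary-length version would store big numbers in the table and use big-number addition instead. The names build_table and parse_decimal are made up for this illustration, and build_table must be called once before parsing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DIGITS 19  /* every 19-digit decimal number fits in uint64_t */

/* table[pos][d] holds d * 10^pos, where pos 0 is the rightmost digit. */
static uint64_t table[MAX_DIGITS][10];

/* Fill the table once; multiplication happens only here, not while parsing. */
void build_table(void)
{
    uint64_t place = 1;
    for (int pos = 0; pos < MAX_DIGITS; pos++) {
        for (int d = 0; d < 10; d++) {
            table[pos][d] = place * (uint64_t)d;
        }
        if (pos + 1 < MAX_DIGITS) {
            place *= 10;
        }
    }
}

/* Convert a decimal string using only table lookups and additions.
   Returns 0 on success, -1 on empty, too long, or non-digit input. */
int parse_decimal(const char *s, uint64_t *out)
{
    size_t len = strlen(s);
    if (len == 0 || len > MAX_DIGITS) {
        return -1;
    }
    uint64_t total = 0;
    for (size_t i = 0; i < len; i++) {
        char c = s[len - 1 - i];        /* walk from the rightmost digit */
        if (c < '0' || c > '9') {
            return -1;
        }
        total += table[i][c - '0'];
    }
    *out = total;
    return 0;
}
```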
The algorithm is quite simple. Language agnostic would be:
total = 0
base <- input_base
for each character in input:
total <- total*base + number(char)
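In C++:
#include <cctype>
#include <stdexcept>
#include <string>
// Helper to convert a digit to a number
unsigned int number( char ch )
{
    if ( ch >= '0' && ch <= '9' ) return ch-'0';
    ch = static_cast<char>( std::toupper( static_cast<unsigned char>(ch) ) );
    if ( ch >= 'A' && ch <= 'F' ) return 10 + (ch-'A');
    // reject anything that is not a hexadecimal digit
    throw std::invalid_argument( "not a digit" );
}
unsigned int parse( std::string const & input, unsigned int base )
{
    unsigned int total = 0;
    for ( std::string::size_type i = 0; i < input.size(); ++i )
    {
        total = total*base + number(input[i]);
    }
    return total;
}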
Of course, you should take care of possible errors (incoherent input: base 2 and input string 'af12') or any other exceptional condition.
Start with a running count of 0.
For each character in the string (reading left to right)
Multiply count by base.
Convert character to int value (0 through base - 1)
Add character value to running count.
How accurate do you need to be?
If you're looking for perfection, then multiply-and-add is really your only recourse. And I'd be very surprised if it's the slowest part of your application.
If order-of-magnitude is good enough, use a lookup table to find the closest power of 2.
Example 1: 1234, closest power of 2 is 1024.
Example 2: 98765, closest is 65536
You could also drive this by counting the number of digits, and multiplying the appropriate power of 2 by the leftmost digit. This can be implemented as a left-shift:
Example 3: 98765 has 5 digits, closest power of 2 to 10000 is 8192 (2^13), so result is 9 << 13
I wrote this before your clarifying comment, so it probably isn't quite as applicable. I'm not sure if a lookup table approach is possible or not. If you really don't need arbitrary precision, then take advantage of the runtime.
If a C/C++ solution is acceptable, I believe that something like the following is what you are looking for. It probably contains bugs in edge cases, but it does compile and work as expected, at least for positive numbers. Making it really work is an exercise for the reader.
/*
* NAME
* convert_num - convert a numerical string (str) of base (b) to
* a printable binary representation
* SYNOPSIS
* int convert_num(char const* s, int b, char** o)
* DESCRIPTION
* Generates a printable binary representation of an input number
* from an arbitrary base. The input number is passed as the ASCII
* character string `s'. The input string consists of characters
* from the ASCII character set {'0'..'9','A'..('A'+b-10)} where
* letter characters may be in either upper or lower case.
* RETURNS
* The number of characters from the input string `s' which were
* consumed by this operation. The output string is placed into
* newly allocated storage which is pointed to by `*o' upon successful
* completion. An error is signalled by returning `-1'.
*/
#include <math.h> /* frexp */
#include <stdlib.h> /* strtoul, malloc */
int
convert_num(char const *str, int b, char **out)
{
int rc = -1;
char *endp = NULL;
char *outp = NULL;
unsigned long num = strtoul(str, &endp, b);
if (endp != str) { /* then we have some numbers */
int numdig = -1;
rc = (endp - str); /* we have this many base `b' digits! */
frexp((double)num, &numdig); /* we need this many base 2 digits */
if (numdig < 1) numdig = 1; /* zero still needs one digit */
if ((outp=malloc(numdig+1)) == NULL) {
return -1;
}
*out = outp; /* return the buffer */
outp += numdig; /* make sure it is NUL terminated */
*outp-- = '\0';
while (numdig-- != 0) { /* fill it in from LSb to MSb */
*outp-- = ((num & 1) ? '1' : '0');
num >>= 1;
}
}
return rc;
}
