I am currently stuck on a simple bit-shifting problem. When I assign a short variable a value and shift it with << 8, I get an extra 0xffff (two extra bytes) when I store the result back into the short. For long, however, it is fine. So I am wondering why this happens.
I mean, a short isn't supposed to hold more than 2 bytes, but my short values clearly appear to contain an extra 2 bytes with the value 0xffff.
I'm seeking your wisdom.. :)
This image describes the problem. Clearly, when the sign bit (bit 15) of the short is set to 1 after the shift operation, the whole upper 2 bytes turn into 0xffff. This is demonstrated by 127 (0x7f) passing the test but 129 (0x81) failing it: when 0x81 is shifted left by 8, its top bit lands in bit 15 (the sign bit) and sets it to 1. 257 (0x101), on the other hand, does not set bit 15 after shifting, so it turns out fine.
There are several problems with your code.
First, you are doing bit-shift operations on signed variables, which may have unexpected results. Use unsigned short instead of short for bit shifting, unless you are sure of what you are doing.
You are explicitly casting a short to unsigned short and then storing the result back in a variable of type short. I'm not sure what you expect to happen here; the cast is pointless and prevents nothing.
The issue is related to that. 129 << 8 is 33024, a value too big to fit in a signed short. You are accidentally setting the sign bit, causing the number to become negative. You would see that if you printed it with %d instead of %x.
Because short is implicitly promoted to int when passed as a parameter to printf(), you see the 32-bit version of this negative number, which has its 16 most significant bits set accordingly. This is where the leading ffff comes from.
You don't have this problem with long because, even though it is signed, it is still large enough to store 33024 without setting the sign bit.
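A minimal sketch of the situation described above (variable names are just for illustration); on a typical two's-complement machine with a 16-bit short and 32-bit int, the plain short prints the leading ffff:

```c
#include <stdio.h>

int main(void)
{
    short s          = 129 << 8;   /* 33024 does not fit in a signed short;
                                      the conversion is implementation-defined
                                      and typically yields -32512 (0x8100)   */
    unsigned short u = 129 << 8;   /* fits fine as an unsigned value         */
    long l           = 129L << 8;  /* long is wide enough, no problem        */

    /* s is promoted to int for printf(); the negative value is
       sign-extended, which is where the leading ffff comes from. */
    printf("short:          %x (%d)\n", s, s);   /* ffff8100 (-32512) */
    printf("unsigned short: %x (%u)\n", u, u);   /* 8100 (33024)      */
    printf("long:           %lx (%ld)\n", l, l); /* 8100 (33024)      */
    return 0;
}
```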
I got confused by all this shifting business since I saw two different results of shifting the same number. I know there are tons of questions about this, but it seems I still couldn't find what I was looking for (feel free to post a link to a question or a website that could help).
So, first I have seen the number 13 written in binary as 001101 (not a whole word of bits).
When shifting it to the left by 2, that example kept the top bit (the sign bit, probably) and the result was 0|10100 = 20. However, elsewhere I have seen the number 13 represented as 01101, and there 01101 << 2 was 0|0100 = 4. I know shifting left is the same as multiplying by the base; still, this confused me. Should I represent 13 as 001101 or 01101 before applying the shift?
I think the bits that overflow are simply discarded, judging by the results.
Thank you !!
This behaviour corresponds to integer widths of 5 and 4 bits (not counting the sign bit), so it seems overflow is indeed the problem. If it isn't, could you add some context as to where these strange results occur?
001101, 01101, and also 1101, 00001101, and other sizes all have an equal claim to "being" 13. You can't really say that 13 has one definitive size; rather, it is the operation that has a size (which may be infinite, in which case a left shift never wraps).
So you have to decide what size of shift you're doing, independently of the value you're shifting. Common choices are 32 or 64 bits, but you're certainly not limited to that, although "strange" sizes take more effort to implement on typical machines and in typical programming languages.
The sign bit is never deliberately kept in left shifts, by the way; there is no useful way to do so. Forcibly keeping it means the wrapping happens in a really odd way, instead of the usual wrapping modulo a power of two (which has nice properties).
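As a small illustration of the width point (a sketch; the masks are chosen just to emulate 4-, 5- and 6-bit operations), the two "different" answers for 13 << 2 come purely from the assumed width:

```c
#include <stdio.h>

int main(void)
{
    unsigned x = 13;                        /* binary 1101                 */

    printf("%u\n", x << 2);                 /* 52: full int width, no wrap */
    printf("%u\n", (x << 2) & 0x3Fu);       /* 52: 6-bit result, 110100    */
    printf("%u\n", (x << 2) & 0x1Fu);       /* 20: 5-bit wrap,    10100    */
    printf("%u\n", (x << 2) & 0x0Fu);       /*  4: 4-bit wrap,     0100    */
    return 0;
}
```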
I'm working on a problem out of Cracking the Coding Interview which requires swapping the odd and even bits of an integer with as few instructions as possible (e.g. bits 0 and 1 are swapped, bits 2 and 3 are swapped, etc.).
The author's solution revolves around using one mask to grab the odd bits and another mask to grab the even bits, and then shifting each group over by 1.
I get her solution, but I don't understand how she grabbed the even/odd bits. She creates two bit masks --both in hex -- for a 32 bit integer. The two are: 0xaaaaaaaa and 0x55555555. I understand she's essentially creating the equivalent of 1010101010... for a 32 bit integer in hexadecimal and then ANDing it with the original num to grab the even/odd bits respectively.
What I don't understand is why she used hex? Why not just code in 10101010101010101010101010101010? Did she use hex to reduce verbosity? And when should you use one over the other?
It's to reduce verbosity. Binary 10101010101010101010101010101010, hexadecimal 0xaaaaaaaa, and decimal 2863311530 all represent exactly the same value; they just use different bases to do so. The only reason to use one or another is for perceived readability.
Most people would clearly not want to use decimal here; it looks like an arbitrary value.
The binary is clear: alternating 1s and 0s, but with so many, it's not obvious that this is a 32-bit value, or that there isn't an adjacent pair of 1s or 0s hiding in the middle somewhere.
The hexadecimal version takes advantage of chunking. Assuming you recognize that 0x0a == 0b1010, you can mentally picture the 8 groups of 1010 in the assumed value.
Another possibility would be octal 25252525252, since... well, maybe not. You can see that something is alternating, but unless you use octal a lot, it's not clear what that alternating pattern in binary is.
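For completeness, here is a sketch of the swap the question describes, written with the hexadecimal masks (this is the general technique, not necessarily the book's exact code):

```c
#include <stdio.h>
#include <stdint.h>

/* Swap every even-numbered bit with the odd-numbered bit above it. */
static uint32_t swap_odd_even_bits(uint32_t x)
{
    uint32_t odd  = x & 0xAAAAAAAAu;        /* bits 1, 3, 5, ... */
    uint32_t even = x & 0x55555555u;        /* bits 0, 2, 4, ... */
    return (odd >> 1) | (even << 1);
}

int main(void)
{
    printf("%u\n", (unsigned)swap_odd_even_bits(2));  /* 1:  10     -> 01     */
    printf("%u\n", (unsigned)swap_odd_even_bits(23)); /* 43: 010111 -> 101011 */
    return 0;
}
```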
Everyone knows about overflow in programming languages; if it happens, the program crashes. However, it is not clear to me what actually happens to data that goes out of bounds. Could you explain, with an example in C++ or Java? For example, an Integer can hold at most 4 bytes; what will happen if one puts more than 4 bytes' worth of data into an Integer? How will the compiler identify this undefined behaviour?
What will happen if one puts more than 4 bytes' worth of data into an Integer?
Typically the value will roll over [1], meaning it will jump from one end of its range to the other.
This can be seen even in the Windows calculator. Start with the highest possible signed 32-bit value, 2,147,483,647. Now add one to it: the result wraps around to -2,147,483,648. We overflowed the maximum value of a signed Dword (2^31 - 1).
[1] This is a typical result. Some architectures might actually generate an exception on integer overflow, so you shouldn't count on this behavior.
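A small illustration in C (a sketch; note that in C and C++ only unsigned wrap-around is actually guaranteed, while signed overflow is undefined behaviour and merely tends to roll over on two's-complement machines):

```c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int u = UINT_MAX;      /* 4294967295               */
    u = u + 1;                      /* well-defined: wraps to 0 */
    printf("unsigned wrap: %u\n", u);

    /* For signed int, INT_MAX + 1 is undefined behaviour in the
       language, even though on most hardware it appears to roll
       over from INT_MAX to INT_MIN as in the calculator above.  */
    printf("INT_MAX = %d, INT_MIN = %d\n", INT_MAX, INT_MIN);
    return 0;
}
```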
How will the compiler identify this undefined behaviour?
The compiler won't identify it. That's the problem. C# can mitigate this with the checked keyword, which checks to make sure that any arithmetic done on an integer will not cause overflow/underflow.
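Standard C has no direct equivalent of checked, but as a sketch (assuming GCC or Clang, since __builtin_add_overflow is a compiler extension rather than standard C), you can request an explicit overflow check:

```c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    int a = INT_MAX;
    int sum;

    /* Performs the addition and reports whether it overflowed,
       instead of silently invoking undefined behaviour.        */
    if (__builtin_add_overflow(a, 1, &sum)) {
        printf("overflow detected\n");
    } else {
        printf("sum = %d\n", sum);
    }
    return 0;
}
```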
The X11 protocol defines an atom as a 32-bit integer, but on my system the Atom type in Xlib's headers is a typedef for unsigned long, which is a 64-bit integer. The manual for Xlib says that property formats have a maximum size of 32 bits. There seems to be some conflict here. I can think of three possible solutions:
1. If Xlib treats properties of type XA_ATOM as a special case, then you can simply pass 32 for 'format' and an array of atoms for 'data'. This seems unclean and hackish, and I highly doubt that this is correct.
2. The manual for Xlib appears to be ancient. Since Atom is 64 bits long on my system, should I pass 64 for the 'format' parameter even though 64 is not listed as an allowed value?
3. Rather than an array of Atoms, should I pass an array of uint32_t values for the 'data' parameter? This seems like it would most likely be the correct solution to me, but it is not what some sources I've looked at that use XChangeProperty, such as SDL, actually do.
SDL appears to use solution 1 when setting the _NET_WM_WINDOW_TYPE property, but I suspect that this may be a bug. On systems with little endian byte order (LSB first), this would appear to work if the property has only one element.
Has anyone else encountered this problem? Any help is appreciated.
For the property routines you always want to pass an array of 'long', 'short', or 'char'. This is true regardless of the actual bit widths of those types. So even if your long (and therefore Atom) is 64 bits, it will be translated to 32 bits behind the scenes.
The format is the number of server side bits used, not client side. So, for format 8, you must pass a char array, for format 16, you always use a short array and for format 32 you always use a long array. This is completely independent of the actual lengths of short or long on a given machine. 32 bit values such as Atom or Window always are in a 'long'.
This may seem odd, but it is for a good reason: the C standard does not guarantee that types exist with exactly the same widths as on the server. A machine might, for instance, have no native 16-bit type. However, a 'short' is guaranteed to have at least 16 bits and a 'long' is guaranteed to have at least 32 bits, so by defining the client API in terms of 'short' and 'long' you can write portable code and always have room for the full X id in the C type.
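A minimal sketch of what that looks like in practice (assuming an X display is available; _NET_WM_WINDOW_TYPE is the property mentioned above): the data buffer is an array of Atom, i.e. long, even though the format passed is 32.

```c
#include <X11/Xlib.h>
#include <X11/Xatom.h>
#include <stdio.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) {
        fprintf(stderr, "cannot open display\n");
        return 1;
    }

    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 100, 100, 0, 0, 0);

    Atom wm_window_type = XInternAtom(dpy, "_NET_WM_WINDOW_TYPE", False);
    Atom type_normal    = XInternAtom(dpy, "_NET_WM_WINDOW_TYPE_NORMAL", False);

    /* format is 32 (server-side bits), but the client-side buffer is an
       array of Atom (long); Xlib packs it down to 32 bits on the wire. */
    XChangeProperty(dpy, win, wm_window_type, XA_ATOM, 32, PropModeReplace,
                    (unsigned char *)&type_normal, 1);

    XFlush(dpy);
    XCloseDisplay(dpy);
    return 0;
}
```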
I've been reading up on the Y2038 problem and I understand that time_t will eventually revert to the lowest representable negative number because it'll try to "increment" the sign bit.
According to that Wikipedia page, changing time_t to an unsigned integer cannot be done because it would break programs that handle early dates. (Which makes sense.)
However, I don't understand why it wasn't made an unsigned integer in the first place. Why not just store January 1, 1970 as zero rather than some ridiculous negative number?
Because letting it start at signed −2,147,483,648 is equivalent to letting it start at unsigned 0. It doesn't change the range of values a 32-bit integer can hold: a 32-bit integer has 4,294,967,296 different states either way. The problem isn't the starting point; the problem is the maximum value the integer can hold. The only way to mitigate the problem is to upgrade to 64-bit integers.
Also (as I just realized): 1970 was set as 0 so that we could reach back in time as well (reaching back to 1901 seemed sufficient at the time). If time_t had been unsigned, the epoch would have had to begin in 1901 to reach back from 1970, and we would have had the same problem again.
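A small sketch of where the signed 32-bit range actually lands (this just prints the two boundary instants; it assumes the C library's gmtime() can represent them):

```c
#include <stdio.h>
#include <stdint.h>
#include <time.h>

static void show(int32_t seconds, const char *label)
{
    time_t t = (time_t)seconds;     /* widen to the platform's time_t */
    char buf[64];
    strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", gmtime(&t));
    printf("%s %s\n", label, buf);
}

int main(void)
{
    show(INT32_MAX, "last 32-bit moment:");   /* 2038-01-19 03:14:07 UTC */
    show(INT32_MIN, "wraps around to:   ");   /* 1901-12-13 20:45:52 UTC */
    return 0;
}
```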
There's a more fundamental problem here than using unsigned values. If we used unsigned values, we would only get one more bit of timekeeping. This would have a definite positive impact (it would double the amount of time we could represent), but we would then hit the same problem much later in the future. More generally, for any fixed-precision integer value, we'd eventually have a problem along these lines.
When UNIX was being developed in the 1970s, having a roughly 68-year clock sounded fine, though clearly a 136-year clock would have been better. If they had used more bits, we'd have a much longer clock, say 1000 years, but after that much time elapsed we'd be right back in the same bind and would probably think back and say "why didn't they use more bits?"
Because not all systems have to deal purely with "past" and "future" values. Even in the '70s, when Unix was created and its time system defined, they had to deal with dates back in the '60s or earlier. So a signed integer made sense.
Once everyone switches to a 64-bit time_t, we won't have to worry about a Y2038-type problem for another 2 billion or so 136-year periods.