Computers store integers in two's complement. For a signed 32-bit int, 0xFFFFFFFF represents -1. Given that, it is easy to write C code that initializes a signed integer to -1:
int a = 0xffffffff;
printf("%d\n", a);
Obviously, the result is -1.
However, in Go, the same logic prints something different.
a := int(0xffffffff)
fmt.Printf("%d\n", c)
The code snippet prints 4294967295, the maximum number an uint32 type can hold. Even if I cast c explicitly in fmt.Printf("%d\n", int(c)), the result is still the same.
The same problem appears when bit operations are applied to signed integers: the signed value seems to become unsigned.
So what is Go doing in such a situation?
The problem here is that the size of int is not fixed; it is platform dependent and may be 32 or 64 bits. In the latter case, assigning 0xffffffff to it is equivalent to assigning 4294967295, which is what you see printed. (On a 32-bit platform, int(0xffffffff) would not even compile, because the constant 4294967295 overflows a 32-bit int.)
Now if you convert that value to int32 (which is 32-bit), you'll get your -1:
a := int(0xffffffff)
fmt.Printf("%d\n", a)
b := int32(a)
fmt.Printf("%d\n", b)
This will output (try it on the Go Playground):
4294967295
-1
Also note that in Go it is not possible to assign 0xffffffff directly to a value of type int32, because the value would overflow; nor is it valid to create a typed constant with an illegal value, such as int32(0xffffffff). Spec: Constants:
The values of typed constants must always be accurately representable by values of the constant type.
So this gives a compile-time error:
var c int32 = 0xffffffff // constant 4294967295 overflows int32
But you may simply do:
var c int32 = -1
You may also do:
var c = ^int32(0) // -1
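Note that the representability rule applies only to constants. A conversion of a non-constant value simply truncates it to the target size, so this minimal complete program (using a uint32 variable rather than a constant) also yields -1:
package main

import "fmt"

func main() {
	u := uint32(0xffffffff) // the constant 4294967295 is representable in uint32
	c := int32(u)           // non-constant conversion: the bits are reinterpreted
	fmt.Println(c)          // -1
}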
Related: Does Go compiler's evaluation differ for constant expression and other expression
func main() {
	var a = math.MaxInt64
	fmt.Println(a + 1)             //-9223372036854775808
	fmt.Println(math.MaxInt64 + 1) //constant 9223372036854775808 overflows int
}
Why do the two ways behave differently?
In the second example math.MaxInt64 + 1 is a constant expression and is computed at compile time. The spec says:
Constant expressions are always evaluated exactly; intermediate values and the constants themselves may require precision significantly larger than supported by any predeclared type in the language.
However, when the value of the expression is passed to fmt.Println, it has to be converted to an actual predeclared type, in this case int, which is a signed 64-bit integer here and is incapable of representing the constant.
A constant may be given a type explicitly by a constant declaration or conversion, or implicitly when used in a variable declaration or an assignment or as an operand in an expression. It is an error if the constant value cannot be represented as a value of the respective type.
In the first example a + 1 is not a constant expression, rather it's normal arithmetic because a was declared to be a variable and so the constant expression math.MaxInt64 is converted to an int. It's the same as:
var a int = math.MaxInt64
Normal arithmetic is allowed to overflow:
For signed integers, the operations +, -, *, /, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow.
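For instance, this minimal program (using int8 only to keep the numbers small) wraps around rather than failing:
package main

import "fmt"

func main() {
	var x int8 = 127 // maximum int8 value
	x++              // overflows at run time: wraps per two's complement, no panic
	fmt.Println(x)   // -128
}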
With minor modifications you can make the examples the same:
func main() {
	const a = math.MaxInt64
	fmt.Println(a + 1)             //constant 9223372036854775808 overflows int
	fmt.Println(math.MaxInt64 + 1) //constant 9223372036854775808 overflows int
}
The error appears because math.MaxInt64 + 1 is an untyped constant expression whose result must fit in an int (the default type here), and it doesn't. If you instead declare a variable of type int64 and add the constant math.MaxInt64 to it, the addition happens at run time and is allowed to wrap:
package main

import (
	"fmt"
	"math"
)

func main() {
	var a int64 = 1
	fmt.Println(math.MaxInt64 + a) //-9223372036854775808
	fmt.Printf("%T\n", 1)          //int
	//fmt.Println(math.MaxInt64 + 1) //constant 9223372036854775808 overflows int
}
Note: This question is different from Fastest way to calculate a 128-bit integer modulo a 64-bit integer.
Here's a C# fiddle:
https://dotnetfiddle.net/QbLowb
Given the pseudocode:
UInt64 a = 9228496132430806238;
UInt32 d = 585741;
How do I calculate
UInt32 r = a % d?
The catch, of course, is that I am not using a compiler that supports the UInt64 data type.¹ But I do have access to the Windows ULARGE_INTEGER union:
typedef union _ULARGE_INTEGER {
    struct {
        DWORD LowPart;
        DWORD HighPart;
    };
    ULONGLONG QuadPart;
} ULARGE_INTEGER;
Which really means that I can turn my code above into:
//9228496132430806238 = 0x80123456789ABCDE
UInt32 a = 0x80123456; //high part
UInt32 b = 0x789ABCDE; //low part
UInt32 d = 585741;     //divisor
How to do it
But now comes how to do the actual calculation. I can start with the pencil-and-paper long division:
________________________
585741 ) 0x80123456 0x789ABCDE
To make it simpler, we can work in variables:
u1 = a / d; //integer truncation math
v1 = a % d; //modulus
Now we are working entirely with 32-bit unsigned types, which my compiler does support.
But now I've brought myself to a standstill, because now I have to calculate:
v1||b / d
where || denotes concatenation. In other words, I have to perform division of a 64-bit value, which is what I was unable to do in the first place!
This must be a solved problem already. But the only questions I can find on Stack Overflow are from people trying to calculate:
a^b mod n
or other cryptographically large multi-precision operations, or approximations using floating point.
Bonus Reading
Microsoft Research: Division and Modulus for Computer Scientists
https://stackoverflow.com/questions/36684771/calculating-large-mods-by-hand
Fastest way to calculate a 128-bit integer modulo a 64-bit integer (unrelated question; I hate you people)
¹ It does support Int64, though I don't think that helps me.
Working with Int64 support
I was hoping for the generic solution for performing a modulus on a ULARGE_INTEGER (and even a LARGE_INTEGER) in a compiler without native 64-bit support. That would be the correct, good, perfect, and ideal answer, which other people would be able to use when they need it.
But there is also the reality of the problem I have, and it can lead to an answer that is generally not useful to anyone else:
cheating by calling one of the Win32 large integer functions (although there is none for modulus)
cheating by using 64-bit support for signed integers
I can check if a is positive. If it is, I know my compiler's built-in support for Int64 will handle:
UInt32 r = a % d; //for a >= 0
Then there's the question of how to handle the other case: a is negative.
UInt32 ModU64(ULARGE_INTEGER a, UInt32 d)
{
    //Hack: Our compiler does support Int64, just not UInt64.
    //Use that Int64 support if the high bit in a isn't set.
    Int64 sa = (Int64)a.QuadPart;
    if (sa >= 0)
        return (sa % d);

    //sa is negative. What to do...what to do.
    //If we want to continue to work with 64-bit integers,
    //we could now treat our number as two 64-bit values:
    //   a == aHigh + aLow
    //   aHigh = 0x8000000000000000 (the high bit)
    //   aLow  = a & 0x7fffffffffffffff (the low 63 bits)
    //
    //   a mod d = (aHigh + aLow) % d
    //           = ((aHigh % d) + (aLow % d)) % d //<--Is this even true!?
    Int64 aLow  = sa & 0x7fffffffffffffff; //low 63 bits of a
    Int64 aHigh = 0x8000000000000000;
    UInt32 rLow  = aLow % d;  //remainder from the low portion
    UInt32 rHigh = aHigh % d; //this doesn't work, because it's "-1 mod d"
    Int64 r = (rHigh + rLow) % d;
    return r;
}
Answer
It took a while, but I finally got an answer. I would post it as an answer, but people mistakenly decided that my unique question was an exact duplicate.
UInt32 ModU64(ULARGE_INTEGER a, UInt32 d)
{
    UInt32 Al = a.LowPart;
    UInt32 Ah = a.HighPart;

    //a = Ah*2^32 + Al, so:
    //   a mod d = ((Ah mod d)*(2^32 mod d) + (Al mod d)) mod d,
    //where 2^32 mod d is computed as ((0xFFFFFFFF mod d) + 1) mod d to stay in 32 bits.
    //Caution: the multiplication below can overflow 32 bits when d > 0x10000.
    UInt32 base = (0xFFFFFFFF % d + 1) % d; //2^32 mod d
    UInt32 remainder = ((Ah % d) * base + (Al % d)) % d;
    return remainder;
}
Fiddle
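For what it's worth, the split-by-halves identity is easy to sanity-check in Go, where native uint64 support gives the expected value directly (Go is used here purely for verification; the constants are the ones from the question):
package main

import "fmt"

func main() {
	const a uint64 = 9228496132430806238 // 0x80123456789ABCDE
	const d uint64 = 585741

	ah := a >> 32        // high 32 bits: 0x80123456
	al := a & 0xFFFFFFFF // low 32 bits:  0x789ABCDE

	// a = ah*2^32 + al, so:
	//   a mod d = ((ah mod d)*(2^32 mod d) + al mod d) mod d,
	// where 2^32 mod d is computed as (0xFFFFFFFF mod d + 1) mod d.
	base := (uint64(0xFFFFFFFF)%d + 1) % d
	r := ((ah%d)*base + al%d) % d

	fmt.Println(a%d, r) // both print the same remainder
}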
I just updated my ALU32 class code in this related QA:
Cant make value propagate through carry
since CPU-independent (non-assembly) code for mul/div was requested there. The divider solves your problem; however, it uses binary long division, so it is a bit slower than stacking up 32-bit mul/mod/div operations. Here is the relevant part of the code:
void ALU32::div(DWORD &c,DWORD &d,DWORD ah,DWORD al,DWORD b)
{
DWORD ch,cl,bh,bl,h,l,mh,ml;
int e;
// edge cases
if (!b ){ c=0xFFFFFFFF; d=0xFFFFFFFF; cy=1; return; }
if (!ah){ c=al/b; d=al%b; cy=0; return; }
// align a,b for binary long division m is the shifted mask of b lsb
for (bl=b,bh=0,mh=0,ml=1;bh<0x80000000;)
{
e=0; if (ah>bh) e=+1; // e = cmp a,b {-1,0,+1}
else if (ah<bh) e=-1;
else if (al>bl) e=+1;
else if (al<bl) e=-1;
if (e<=0) break; // a<=b ?
shl(bl); rcl(bh); // b<<=1
shl(ml); rcl(mh); // m<<=1
}
// binary long division
for (ch=0,cl=0;;)
{
sub(l,al,bl); // a-b
sbc(h,ah,bh);
if (cy) // a<b ?
{
if (ml==1) break;
shr(mh); rcr(ml); // m>>=1
shr(bh); rcr(bl); // b>>=1
continue;
}
al=l; ah=h; // a>=b ?
add(cl,cl,ml); // c+=m
adc(ch,ch,mh);
}
cy=0; c=cl; d=al;
if ((ch)||(ah)) cy=1; // overflow
}
See the linked QA for a description of the class and the subfunctions used. The idea behind a/b is simple:
definition
let's assume we have 64/64-bit division (the modulus falls out as a by-product) and want to use 32-bit arithmetic, so:
(ah,al) / (bh,bl) = (ch,cl)
Each 64-bit QWORD is represented as a pair of high and low 32-bit DWORDs.
align a,b
Exactly as when computing division on paper, we must align b with a. Find sh such that:
(bh,bl)<<sh <= (ah,al)
(bh,bl)<<(sh+1) > (ah,al)
and compute m so
(mh,ml) = 1<<sh
Beware: once bh >= 0x80000000, stop the shifting, or b would overflow.
divide
Set the result c = 0 and then simply subtract b from a while a >= b. For each subtraction, add m to c. Once a < b, shift both b and m right to realign. Stop if m == 0 or a == 0.
result
c holds the 64-bit result of the division, so use cl; similarly, a holds the remainder, so use al as your modulus result. You can check whether ch and ah are zero; if not, overflow occurred (the result is bigger than 32 bits). The same handling covers edge cases like division by zero...
Now, since you want 64bit/32bit division, simply set bh = 0... To implement this I needed 64-bit operations (+, -, <<, >>), which I built by stacking up 32-bit operations with carry (that is the reason the ALU32 class was created in the first place); for more info see the link above.
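To see the align/subtract/shift idea without the carry plumbing, here is a compact sketch of the same binary long division in Go (Go and its native uint64 are assumptions purely for illustration; the ALU32 code above does the equivalent with 32-bit halves and explicit carries):
package main

import "fmt"

// divmod performs binary long division of a by b, returning the quotient and
// remainder; m plays the role of the shifted mask from the ALU32 version.
func divmod(a, b uint64) (q, r uint64) {
	if b == 0 {
		panic("division by zero")
	}
	m := uint64(1)
	// align: shift b (and the mask m) left while b < a and b's top bit is clear
	for b < a && b&(1<<63) == 0 {
		b <<= 1
		m <<= 1
	}
	// divide: subtract where possible, adding m to the quotient,
	// then shift b and m right to realign; stop once the mask is exhausted
	for m != 0 {
		if a >= b {
			a -= b
			q += m
		}
		b >>= 1
		m >>= 1
	}
	return q, a // what is left in a is the remainder
}

func main() {
	q, r := divmod(9228496132430806238, 585741)
	fmt.Println(q, r)
	fmt.Println(r == 9228496132430806238%585741) // true
}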
Is there any way to perform an unsigned shift (namely, an unsigned right shift) in Go? Something like this in Java:
0xFF >>> 3
The only thing I could find on this matter is this post, but I'm not sure what I have to do.
Thanks in advance.
The Go Programming Language Specification
Numeric types
A numeric type represents sets of integer or floating-point values.
The predeclared architecture-independent numeric types include:
uint8 the set of all unsigned 8-bit integers (0 to 255)
uint16 the set of all unsigned 16-bit integers (0 to 65535)
uint32 the set of all unsigned 32-bit integers (0 to 4294967295)
uint64 the set of all unsigned 64-bit integers (0 to 18446744073709551615)
int8 the set of all signed 8-bit integers (-128 to 127)
int16 the set of all signed 16-bit integers (-32768 to 32767)
int32 the set of all signed 32-bit integers (-2147483648 to 2147483647)
int64 the set of all signed 64-bit integers (-9223372036854775808 to 9223372036854775807)
byte alias for uint8
rune alias for int32
The value of an n-bit integer is n bits wide and represented using
two's complement arithmetic.
There is also a set of predeclared numeric types with
implementation-specific sizes:
uint either 32 or 64 bits
int same size as uint
uintptr an unsigned integer large enough to store the uninterpreted bits of a pointer value
Conversions are required when different numeric types are mixed in an
expression or assignment.
Arithmetic operators
<< left shift integer << unsigned integer
>> right shift integer >> unsigned integer
The shift operators shift the left operand by the shift count
specified by the right operand. They implement arithmetic shifts if
the left operand is a signed integer and logical shifts if it is an
unsigned integer. There is no upper limit on the shift count. Shifts
behave as if the left operand is shifted n times by 1 for a shift
count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the
same as x/2 but truncated towards negative infinity.
In Go, a right shift of an unsigned integer is a logical (unsigned) shift; Go has both signed and unsigned integer types.
It depends on what type the value 0xFF is. Assume it's one of the unsigned integer types, for example, uint.
package main
import "fmt"
func main() {
	n := uint(0xFF)
	fmt.Printf("%X\n", n)
	n = n >> 3
	fmt.Printf("%X\n", n)
}
Output:
FF
1F
Assume it's one of the signed integer types, for example, int.
package main
import "fmt"
func main() {
	n := int(0xFF)
	fmt.Printf("%X\n", n)
	n = int(uint(n) >> 3)
	fmt.Printf("%X\n", n)
}
Output:
FF
1F
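The difference from Java's >>> only matters when the value is negative. If you start from a signed type and want a logical shift, convert through the same-size unsigned type; a minimal sketch:
package main

import "fmt"

func main() {
	x := int32(-8)                     // 0xFFFFFFF8 in two's complement
	fmt.Println(x >> 1)                // -4: arithmetic shift copies the sign bit
	fmt.Println(int32(uint32(x) >> 1)) // 2147483644: logical shift, like Java's -8 >>> 1
}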
In gdb,
(gdb) p -2147483648
$28 = 2147483648
(gdb) pt -2147483648
type = unsigned int
Since -2147483648 is within the range of type int, why is gdb treating it as an unsigned int?
(gdb) pt -2147483647-1
type = int
(gdb) p -2147483647-1
$27 = -2147483648
I suspect that gdb applies the unary negation operator after determining the type of the integer literal:
In case 1, gdb parses 2147483648, which overflows the int type and becomes unsigned int; then it applies the negation.
In case 2, 2147483647 is a valid int and stays int when the negation and subtraction are subsequently applied.
gdb appears to be following a set of rules for determining the type of a decimal integer literal that are inconsistent with the rules given by the C standard.
I'll assume your system has 32-bit int and long int types, using two's complement and no padding bits (a common choice for 32-bit systems, and consistent with what you're seeing). The ranges of int and unsigned int are then:
int: -2147483648 .. +2147483647
unsigned int: 0 .. 4294967295
and the ranges of long int and unsigned long int are the same.
2147483647 is within the range of type int, so that's its type.
Since the value 2147483648 is outside the range of type int, gdb is apparently choosing to treat it as an unsigned int. And -2147483648 is not an integer literal; it's an expression consisting of a unary - operator applied to the constant 2147483648. Since gdb treats 2147483648 as an unsigned int, it also treats -2147483648 as an unsigned int, and unary - on an unsigned type wraps around, yielding 2147483648.
As for -2147483647-1, that's an expression all of whose operands are of type int, and there's no overflow.
In all versions of ISO C, though, an unsuffixed decimal literal can never be of type unsigned int. In C90, its type is the first of:
int
long int
unsigned long int
that can represent its value. Under C99 rules (and later), the type of a decimal integer constant is the first of:
int
long int
long long int
that can represent its value.
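For example, on a system where int and long are both 32 bits, the constant 2147483648 has type unsigned long int under C90 (the first type in its list that can hold the value) but long long int under C99; under neither set of rules is it ever unsigned int.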
I don't know whether there's a way to tell gdb to use C rules for integer literals.
I have a small sample function:
#define VALUE 0
int test(unsigned char x) {
    if (x >= VALUE)
        return 0;
    else
        return 1;
}
My compiler warns me that the comparison (x>=VALUE) is true in all cases, which is right, because x is an unsigned character and VALUE is defined with the value 0. So I changed my code to:
if ( ((signed int) x ) >= ((signed int) VALUE ))
But the warning comes again. I tested it with three GCC versions (all versions > 4.0, sometimes you have to enable -Wextra).
In the changed case, I have this explicit cast, and it should be a signed int comparison. Why does it claim that the comparison is always true?
Even with the cast, the comparison is still always true. The compiler still determines that (signed int)0 has the value 0, and still determines that (signed int)x is non-negative, since every value of unsigned char is representable in signed int. (Converting from unsigned to signed is implementation-defined when the value is out of range for the signed type, but that cannot happen here.)
So the compiler continues warning because it continues to eliminate the else case altogether.
Edit: To silence the warning, write your code as
#define VALUE 0
int test(unsigned char x) {
#if VALUE == 0
    return 0;  /* x >= VALUE always holds, so the function always returns 0 */
#else
    return x >= VALUE ? 0 : 1;
#endif
}
x is an unsigned char, meaning it is between 0 and 255. Since an int is wider than a char, converting unsigned char to signed int preserves the char's original value. Since this value is always >= 0, your if is always true.
All the values of an unsigned char fit perfectly in your int, so even with the cast you will never get a negative value. The cast you would need is to signed char; however, in that case you should declare x as signed in the function signature. There is no point lying to the clients that you need an unsigned value when in fact you need a signed one.
The #define of VALUE to 0 means that your function is reduced to this:
int test(unsigned char x) {
    if (x >= 0)
        return 0;
    else
        return 1;
}
Since x is always passed in as an unsigned char, then it will always have a value between 0 and 255 inclusive, regardless of whether you cast x or 0 to a signed int in the if statement. The compiler therefore warns you that x will always be greater than or equal to 0, and that the else clause can never be reached.