Fibonacci Function in RISC V - overflow

I'm implementing the Fibonacci function in RISC-V, so that f(0) = 0, f(1) = 1, ..., up to f(47).
My output matches everything up to f(46), but with n = 47 the code below prints -1323752223.
Is this some sort of overflow, since I get a negative integer value? Where should I look to fix the error?
.data
n: .word 47
.text
.globl main
main:
li x2, 0 # Used to determine if n (x7) equals 0
li x3, 1 # Used to determine if n (x7) equals 1
li x5, 0 # First number
li x6, 1 # Second number
lw x7, n # Limit
li x8, 1 # Counter
beq x7, x2, DO # If n == 0 then jump to DO (which should print 0). Implements f(0) = 0
beq x7, x3, WRITE # If n == 1 then jump to WRITE (which should print 1). Implements f(1) = 1
LOOP: beq x8, x7, EXIT # Compare the counter x8 (which starts at 1) to n (the limit). If x8 == x7 jump to EXIT
add x4, x5, x6 # Add x5 to x6 and store in x4
ori x5, x6, 0 # Assign the second number to my first number
ori x6, x4, 0 # Assign the sum of x5 and x6 to my second number
addi x8, x8, 1 # Add 1 to my counter
j LOOP # Jump to loop
EXIT:
li x17, 1 # Select system call 1 (print integer)
add x10,x4,x0 # Copy x4 (which holds the result of the loop above) into x10
ecall # Print the integer in x10
li x17, 5 # Select system call 5 (read integer)
ecall # Wait for an integer from the input console
li x17, 10 # Select system call 10 (exit)
ecall # Terminate the program
DO:
li x4, 0 # load 0 in x10 (x10 will be used by the SysCall to print) and print
add x10,x4,x0
li x17, 1
ecall
li x17, 5
ecall
li x17, 10
ecall
WRITE: li x4, 1 # load 1 in x10 and print
add x10,x4,x0
li x17,1
ecall
li x17, 5
ecall
li x17, 10
ecall

Yes, you have found the boundary of what signed 32-bit integers can hold.
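fib(47) = 2971215073 won't fit in a signed 32-bit integer, but it does fit as unsigned. However, the print-integer ecall used here (#1) treats its argument as signed, so the value has to be printed another way (RARS documents a separate PrintIntUnsigned service, ecall 36).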
fib(48) = 4807526976 won't fit in 32-bit form, even as unsigned.
List of fibonacci numbers:
https://www.math.net/list-of-fibonacci-numbers
If you want to represent larger numbers, you will need a strategy:
If precision is not important, you can switch to floating point, where the results for such large numbers will be inexact.
Use two integers together for 64-bit arithmetic; you'll be good up to fib(92) = 7540113804746346429.
Use variable-length integers, for precision limited only by computer memory and compute time.
Use some combination of the above.
And finally, you can detect and issue an error on overflow of your chosen arithmetic data type.  Overflow detection on RISC V is somewhat simple but not really obvious.
Technically, addition of 2 arbitrary 32-bit numbers results in a 33 bit answer — but no more than 33 bits, so we can use that knowledge of math in detecting overflow.
First, in a 32-bit number the top bit is either the sign bit (when signed data type), or the magnitude bit (when unsigned data type).
If you add two positive numbers (as is the case with fib) and the sign bit is set, you have signed overflow — but a proper bit pattern, if interpreted as unsigned.  However, you won't be able to print the number properly using the ecall #1, because it will print the number as signed and interpret as negative.  You can write your own unsigned number print; you can also look for this case and simply stop the program from printing and issue an error instead.
Going beyond that you can check for overflow in 32-bit unsigned addition by seeing if the resulting value is less than one of the input operands.
The last approach is also used in multiword addition to make a carry from the lower word(s) to higher word(s).
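As a rough illustration of the two-word approach, here is a minimal sketch in C (C rather than RISC-V, so the carry logic is easy to follow). The names U64Pair, add_pair and fib_pair are made up for this example. The carry out of the low word is detected as described above: the wrapped sum is smaller than one of its operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: one 64-bit value kept as two 32-bit words. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} U64Pair;

/* Adds a and b into *out. Returns true if the sum needs more than 64 bits. */
bool add_pair(U64Pair a, U64Pair b, U64Pair *out) {
    uint32_t lo = a.lo + b.lo;               /* wraps modulo 2^32 */
    uint32_t carry = (lo < a.lo) ? 1u : 0u;  /* wrapped sum < operand => carry */
    uint32_t hi = a.hi + b.hi;
    bool overflow = hi < a.hi;               /* same test on the high word */
    hi += carry;
    if (hi < carry) {
        overflow = true;                     /* adding the carry wrapped too */
    }
    out->lo = lo;
    out->hi = hi;
    return overflow;
}

/* Computes fib(n) as an unsigned 64-bit value split across two words.
   Returns false if fib(n) does not fit. */
bool fib_pair(unsigned n, U64Pair *out) {
    U64Pair a = {0u, 0u};  /* fib(0) */
    U64Pair b = {1u, 0u};  /* fib(1) */
    if (n == 0) {
        *out = a;
        return true;
    }
    for (unsigned i = 2; i <= n; i++) {
        U64Pair next;
        if (add_pair(a, b, &next)) {
            return false;
        }
        a = b;
        b = next;
    }
    *out = b;
    return true;
}
```

With this sketch, fib_pair(47, &r) leaves r.hi == 0 and r.lo == 2971215073, and fib_pair(48, &r) needs the high word. Because both words are treated as unsigned, the sketch only reports failure once the sum exceeds 64 unsigned bits; to stay within the signed range up to fib(92), one would additionally check the top bit of hi.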

Related

MIPS32 Multiplication of Three 32-bit Integers

I am trying to write a MIPS32 program that will multiply three numbers together. For example: (A * B * C) where A, B, and C are 32 bit signed numbers.
I know from this link that I need to do the following to multiply A*B (assume A is stored in $s0, B is stored in $s1, and C is stored in $s2):
mult $s0, $s1
mfhi $t0
mflo $t1
If I am assuming my result of the multiplication of A * B is 64 bits, how do I calculate the result of (A * B) with C?

MIPS overflow logic

Good evening. I am trying to figure out how to determine whether an integer qualifies as a 16-bit integer in MIPS.
I understand that 2^15 - 1 = 32767, that is 2^(16-1) - 1, is the largest value a signed 16-bit binary number can hold. Anyway, I am trying to determine if an integer passes the test. I wrote this:
addi $s3, $zero, 32767
bgt $t2, $s3, else #branch to else if t2>s3
move $v0, $t2 #if no overflow; place t2 in v0
addi $v1, $zero, 0 #if no overflow; place zero in v1
else:
addi $v0, $zero, 0 #if overflow; place 0 in v0
addi $v1, $zero, -1 #if overflow; place -1 in v1
Anyway, there's a problem with my logic when I try to evaluate negative numbers. I have an assignment due tomorrow. I am learning MIPS programming. I am not a programming snob, so any helpful advice is appreciated. Thank you for your time.
This is too late for your assignment.[1]
It's a bit unclear whether you want to test a) if a number N encoded as a 32-bit two's complement number can also be encoded as a 16-bit two's complement number, or b) if a 32-bit unsigned number can be encoded as a 16-bit number.
In case b) you just need to test whether any bit above the 16th is set:
#Assume $t0 is the number to test
lui $t1, 0xffff #$t1 = 0xffff0000
and $t1, $t1, $t0 #$t1 is zero if all higher bits of $t0 are zero
beq $t1, $0, fits16bits #Jump to label if fits
#Here the number doesn't fit 16 bits
For case a), the key is to understand that, just as 0x00f1 and 0x000000f1 are the same number (the leading zeros are not significant), 0xffff and 0xffffffff are the same number in two's complement (the number -1).
To extend a 16-bit two's complement number to 32 bits we need to perform a sign extension, i.e. replicate the most significant bit (the sign bit) of the original number in the upper 16 bits.
So 0x7fff becomes 0x00007fff, and 0xc000 becomes 0xffffc000.
A simple way to test that all the upper 17 bits are equal is to shift them right arithmetically so that if they actually are equal we end up with 0x00000000 or 0xffffffff.
sra $t1, $t0, 15 #Shift right 15 bits, duplicating the MSb
beqz $t1, fits16bits #Jump to label if fits (all zeros)
addiu $t1, $t1, 1 #Add 1
beqz $t1, fits16bits #Jump to label if fits (before the +1 it was all ones)
#Here the number doesn't fit
[1] Maybe it's better this way.
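For comparison, the two tests can be sketched in C. The function names fits_unsigned16 and fits_signed16 are made up for this example. The signed test uses a logical shift on the unsigned bit pattern instead of an arithmetic shift, so it compares the upper 17 bits against all zeros or all ones directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Case b): no bit above the low 16 may be set. */
bool fits_unsigned16(uint32_t value) {
    return (value & 0xFFFF0000u) == 0u;
}

/* Case a): the upper 17 bits (bits 31..15) must all be equal,
   i.e. the value is a sign extension of a 16-bit number. */
bool fits_signed16(int32_t value) {
    uint32_t top17 = (uint32_t)value >> 15;  /* logical shift of the bit pattern */
    return top17 == 0u || top17 == 0x1FFFFu;
}
```

Under these definitions 32767 and -32768 pass the signed test, while 32768 and -32769 do not.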

Algorithm to find out matched bits and non matched bits from two streams

I am adding corresponding bits of two bit streams in Java like below:
1 0 1 1 0 0
1 0 1 0 1 0
====================
2 0 2 1 1 0
After this I am adding result as:
2+0+2+1+1+0 = 6
Now I have to find out the number of 1s (non-matched bits) and 2s (matched bits) that make up the result (6). I tried hard to devise an algorithm that can tell me the exact number of 1s and 2s the result is made up of, but I have been unable to create one so far.
The scheme allows multiplying each addition result by a constant. Individual bits can also be subtracted to achieve the above goal.
I can also multiply these individual bits, just as I am adding them above, but then I cannot add the bits or the results. A bit can be multiplied by itself or by any other bit. I can even represent the bits with numbers of my choice; that is, I can say that 1 = 2 and 0 = 3, and then I have:
For addition (Pascal Paillier):
2 3 2 2 3 3
2 3 2 3 2 3
====================
4 6 4 5 5 6
For Multiplication (RSA)
2 3 2 2 3 3
2 3 2 3 2 3
====================
4 9 4 6 6 9
The only purpose is to find the number of similar bits (1&1) and non-similar bits (0&1, 1&0) from the overall number, which is generated either by addition (Pascal Paillier) or by multiplication (RSA).
Furthermore, 2nd bit-stream can be represented with different numbers than the above.
Following can also be used:
Multiplication with bits and results and exponential with a constant
Addition/Subtraction among bits and result and multiplication with a constant only
Further detail:
I am using the Pascal Paillier homomorphic algorithm to encrypt these individual bits. Pascal Paillier supports only addition over encrypted data, so I can only add. I have to send this number to some application which has to find out the exact number of matched bits and non-matched bits.
Also, I can use RSA, but it only allows multiplication.
In general, from two original streams of bits a and b, you will want to perform something like
popcnt(a XOR b) = # of non-matching bits (pairs of 0,1 and 1,0)
popcnt(a AND b) = # of matching 1-bits (pairs of 1,1)
where popcnt is the Hamming weight of the resultant string.
If you happen to be compiling it on a processor which supports the SSE4 POPCNT instruction (AMD and Intel processors produced in the last 7 years), this will likely be the most performant way of solving the problem. As an example, an implementation for both (testing an int-sized number of bits from streams a and b at a time)
int nonmatch(int a, int b) {
return __builtin_popcount(a ^ b);
}
int match(int a, int b) {
return __builtin_popcount(a & b);
}
will yield assembly code
; nonmatch
xor %esi, %edi
popcnt %edi, %eax
retq
; match
and %esi, %edi
popcnt %edi, %eax
retq
respectively.
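For streams longer than one int, the same two counts can be accumulated word by word. The following is a minimal sketch under a few assumptions: the streams are packed into arrays of 32-bit words, the names popcount32 and count_streams are made up for this example, and a portable bit-clearing loop stands in for the POPCNT instruction so that it does not depend on a compiler builtin.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable Hamming weight: clears the lowest set bit until none remain. */
static unsigned popcount32(uint32_t v) {
    unsigned count = 0;
    while (v != 0u) {
        v &= v - 1u;
        count++;
    }
    return count;
}

/* Counts, over `words` 32-bit words of two packed bit streams,
   the positions where both bits are 1 and the positions where they differ. */
void count_streams(const uint32_t *a, const uint32_t *b, size_t words,
                   size_t *both_ones, size_t *differing) {
    *both_ones = 0;
    *differing = 0;
    for (size_t i = 0; i < words; i++) {
        *both_ones += popcount32(a[i] & b[i]);
        *differing += popcount32(a[i] ^ b[i]);
    }
}
```

For the streams 101100 and 101010 from the question, this gives two positions where both bits are 1 and two positions where the bits differ, which matches the two 2s and two 1s in the column sums.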

Why does a 4 bit adder/subtractor implement its overflow detection by looking at BOTH of the last two carry-outs?

This is the diagram we were given for class:
Why wouldn't you just use C4 in this image? If C4 is 1, then the last addition resulted in an overflow, which is what we're wondering. Why do we need to look at C3?
Overflow flag indicates an overflow condition for a signed operation.
Some points to remember in a signed operation:
MSB is always reserved to indicate sign of the number
Negative numbers are represented in 2's complement
An overflow results in invalid operation
Two's complement overflow rules:
If the sum of two positive numbers yields a negative result, the sum has overflowed.
If the sum of two negative numbers yields a positive result, the sum has overflowed.
Otherwise, the sum has not overflowed.
For Example:
**Ex1:**
0111 (carry)
0101 ( 5)
+ 0011 ( 3)
==================
1000 ( 8) ;invalid (V=1) (C3=1) (C4=0)
**Ex2:**
1011 (carry)
1001 (-7)
+ 1011 (−5)
==================
0100 ( 4) ;invalid (V=1) (C3=0) (C4=1)
**Ex3:**
1110 (carry)
0111 ( 7)
+ 1110 (−2)
==================
0101 ( 5) ;valid (V=0) (C3=1) (C4=1)
In a signed operation if the two leftmost carry bits (the ones on the far left of the top row in these examples) are both 1s or both 0s, the result is valid; if the left two carry bits are "1 0" or "0 1", a sign overflow has occurred. Conveniently, an XOR operation on these two bits can quickly determine if an overflow condition exists. (Ref:Two's complement)
Overflow vs Carry: Overflow can be considered as a two's complement form of a Carry. In a signed operation overflow flag is monitored and carry flag is ignored. Similarly in an unsigned operation carry flag is monitored and overflow flag is ignored.
Overflow for signed numbers occurs when the carry-in into the most significant bit is not equal to the carry out.
For example, working with 8 bits, 65 + 64 = 129 actually results in an overflow. This is because the result is 1000 0001 in binary, which is -127 in 2's complement. If you work through this example, you can see that the carry into the most significant bit is 1 while the carry out is 0.
It is possible to have a correct computation even when the carry flag is high.
Consider
1000 1000 = -120
+ 1111 1111 = -1
=(1) 10000111 = -121
There is a carry out of 1, but there has been no overflow.
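The two 8-bit examples above can be checked with a small C sketch that computes both carries explicitly. The struct Add8 and the function add8 are names invented for this illustration; the carry into bit 7 is obtained by adding only the low seven bits of each operand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of an 8-bit addition with both carries exposed. */
typedef struct {
    uint8_t sum;         /* low 8 bits of the result */
    bool carry_into_msb; /* carry from bit 6 into bit 7 */
    bool carry_out;      /* carry out of bit 7 (the carry flag) */
    bool overflow;       /* signed overflow: the two carries differ */
} Add8;

Add8 add8(uint8_t a, uint8_t b) {
    Add8 r;
    unsigned low7 = (a & 0x7Fu) + (b & 0x7Fu);  /* add bits 0..6 only */
    unsigned full = (unsigned)a + (unsigned)b;  /* 9-bit result */
    r.sum = (uint8_t)(full & 0xFFu);
    r.carry_into_msb = ((low7 >> 7) & 1u) != 0u;
    r.carry_out = ((full >> 8) & 1u) != 0u;
    r.overflow = r.carry_into_msb != r.carry_out;  /* XOR of the two carries */
    return r;
}
```

Here add8(65, 64) reports a carry into the top bit but no carry out, so overflow is set, while add8(0x88, 0xFF), which is -120 + -1, has both carries set and no overflow.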
I would like to give a more general answer to this question, for any positive number of bits.
Let's call the last carry output C1, the second-to-last carry output C0, the sum's sign output S0, and the sign bits of A and B respectively A0 and B0.
Then the following holds (in the first line + means XOR; in the second line * means AND and + means OR):
S0 = A0 + B0 + C0
C1 = A0*B0 + A0*C0 + B0*C0
Let's now walk through the possibilities.
If C1 == 1 there are two possibilities:
  if C0 == 0: A0 and B0 must both have been 1 (and thus both A and B must be negative). This means S0 has to be 0, meaning the solution was positive while A and B were negative => overflow
  if C0 == 1: either
    A and B have opposite signs, so overflow is not possible => no overflow
    A0 and B0 are both 1 (and thus A and B must both be negative). This means S0 has to be 1, meaning the solution was negative => no overflow
If C1 == 0 there are two possibilities:
  if C0 == 0: either
    A0 and B0 are both 0 (and thus A and B must both be positive). This means S0 has to be 0, meaning the solution was positive => no overflow
    A and B have opposite signs => no overflow
  if C0 == 1: A0 and B0 must both be 0 (and thus A and B must both be positive). This means S0 has to be 1, meaning the solution was negative while A and B were positive => overflow
Hope that helps someone out there.

maximum value of xor operation

I came up with this question.
There is an encryption algorithm which uses bitwise XOR operations extensively. This encryption algorithm uses a sequence of non-negative integers x1, x2, ... xn as key. To implement this algorithm efficiently, Xorq needs to find maximum value for (a xor xj) for given integers a, p and q such that p <= j <= q. Help Xorq to implement this function.
Input
First line of input contains a single integer T (1<=T<=6). T test cases follow.
First line of each test case contains two integers N and Q separated by a single space (1 <= N <= 100,000; 1 <= Q <= 50,000). Next line contains N integers x1, x2, ... xn separated by a single space (0 <= xj < 2^15). Each of next Q lines describe a query which consists of three integers ai, pi and qi (0 <= ai < 2^15, 1 <= pi <= qi <= N).
Output
For each query, print the maximum value for (ai xor xj) such that pi <= j <= qi in a single line.
Sample Input
1
15 8
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
10 6 10
1023 7 7
33 5 8
182 5 10
181 1 13
5 10 15
99 8 9
33 10 14
Sample Output
13
1016
41
191
191
15
107
47
Explanation
First Query (10 6 10): x6 xor 10 = 12,
x7 xor 10 = 13, x8 xor 10 = 2, x9 xor 10 = 3, x10 xor 10 = 0,
therefore answer for this query is 13.
Second Query (1023 7 7): x7 xor 1023 = 1016,
therefore answer for this query is 1016.
Third Query (33 5 8): x5 xor 33 = 36, x6 xor 33 = 39,
x7 xor 33 = 38, x8 xor 33 = 41, therefore answer for this query is 41.
Fourth Query (182 5 10): x5 xor 182 = 179,
x6 xor 182 = 176, x7 xor 182 = 177, x8 xor 182 = 190,
x9 xor 182 = 191, x10 xor 182 = 188,
therefore answer for this query is 191.
I tried this by first making the binary lengths of the numbers in the given range equal and then comparing a bit by bit with each of the xj values, but it exceeds the time limit.
The maximum time limit in Java is 5 seconds.
I haven't gone through your code in detail, but you seem to have loops over the range of r = p - 1; r < q - 1; r++, and it would be nice not to have to do this.
Given ai, we want to find a value of xi in the given range with as many of its top bits the inverse of ai as possible. Everything is between 0 and 2^15, so there aren't many bits to worry about. For n = 1 to 15 you could divide the xi up according to its n highest bits, so dividing it into 2, 4, 8, 16.. 32768 portions. For each portion keep a list in sorted order of the positions where each possible value is found, so for the top bit you will have two lists, one giving the positions at which the bit pattern is 0.............. and one giving the position at which the bit pattern is 1............ For each triple, you can use binary chop on a particular portion to find if there are any positions within your range at which the top n bits have the bit pattern you are looking for. If they do, fine. If not you will have to accept that one of the xor positions is 0 and slightly modify the pattern you look for with one more top bit set.
The setup cost is 15 linear passes over the xi, which is probably less time than it takes you to read it in. For each line you could do 15 binary chops to see which values of xi match in the top n bits, and modify the pattern of top bits you look for if you can't match a particular bit.
I think your program would be clearer if you separated the I/O from the problem code by making the problem code a separate subroutine. This would also make it easier to compare one version of the problem code with another, to see which is faster and if they both get the same answer.
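The prefix-bucket idea described above could be sketched roughly as follows in C. This is only one possible reading of it: the names PrefixIndex, build_index and max_xor are invented for the example, positions are 0-based rather than 1-based, and all values are assumed to lie in [0, 2^15). For each prefix length it stores the positions of every prefix value in increasing order, then answers a query with one binary search per bit.

```c
#include <stdlib.h>

#define BITS 15  /* assumption: all values are in [0, 2^15) */

/* For prefix length l, pos[l][start[l][v] .. start[l][v+1]-1] lists, in
   increasing order, the positions whose top l bits equal v. */
typedef struct {
    int *start[BITS + 1];
    int *pos[BITS + 1];
} PrefixIndex;

/* Builds the index with one counting pass per prefix length.
   Returns 0 on success, -1 on allocation failure (cleanup of earlier
   levels is omitted in this sketch). Assumes n >= 1. */
int build_index(PrefixIndex *ix, const int *x, int n) {
    for (int l = 1; l <= BITS; l++) {
        int shift = BITS - l;
        int buckets = 1 << l;
        int *start = calloc((size_t)buckets + 1, sizeof *start);
        int *cursor = malloc((size_t)buckets * sizeof *cursor);
        int *pos = malloc((size_t)n * sizeof *pos);
        if (!start || !cursor || !pos) {
            free(start); free(cursor); free(pos);
            return -1;
        }
        for (int i = 0; i < n; i++) start[(x[i] >> shift) + 1]++;
        for (int v = 0; v < buckets; v++) start[v + 1] += start[v];
        for (int v = 0; v < buckets; v++) cursor[v] = start[v];
        for (int i = 0; i < n; i++) pos[cursor[x[i] >> shift]++] = i;
        free(cursor);
        ix->start[l] = start;
        ix->pos[l] = pos;
    }
    return 0;
}

/* Binary chop: is there a position in [p, q] whose top l bits equal v? */
static int has_in_range(const PrefixIndex *ix, int l, int v, int p, int q) {
    const int *pos = ix->pos[l];
    int lo = ix->start[l][v];
    int end = ix->start[l][v + 1];
    int hi = end;
    while (lo < hi) {  /* find the first stored position >= p */
        int mid = lo + (hi - lo) / 2;
        if (pos[mid] < p) lo = mid + 1; else hi = mid;
    }
    return lo < end && pos[lo] <= q;
}

/* Maximum of (a xor x[j]) for p <= j <= q, with 0-based inclusive bounds. */
int max_xor(const PrefixIndex *ix, int a, int p, int q) {
    int prefix = 0;  /* top bits of the best x found so far */
    for (int l = 1; l <= BITS; l++) {
        int abit = (a >> (BITS - l)) & 1;
        int want = (prefix << 1) | (abit ^ 1);  /* prefer the inverse bit */
        if (has_in_range(ix, l, want, p, q)) {
            prefix = want;
        } else {
            prefix = (prefix << 1) | abit;      /* accept a 0 in the xor */
        }
    }
    return a ^ prefix;
}
```

With x = 1..15 as in the sample input, the first query (10 6 10) becomes max_xor(&ix, 10, 5, 9) in 0-based positions. Each query costs 15 binary searches regardless of the width of the range, at the price of about 15 * N integers of index storage.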
The biggest inefficiency that I can spot in the original algorithm is that N can be up to 100,000 but a and x are always below 2^15. So I would write pseudocode something like this:
bool set[32768] = { false };
for (j = p; j <= q; j++) set[x[j]] = true;
for (k = 32767; !set[a ^ k]; k--);
return k;
This bounds the search for the maximum at 32768 xor operations in the worst case.
