This is what I have so far but I can't get it to work. I need to have it input a dividend and a divisor and output the result along with the remainder. Example: if the input is 33 followed by 6 the output will be 5 followed by 3 since 33/6 is 5 remainder 3.
INP //ask the user
BRZ QUIT // halt the execution if input zero
STA DIVIDEND // store in dividend variable
INP // input dividor
BRZ QUIT // halt the execution if input zero
STA DIVISOR // store in divider variable
LDA DIVIDEND // load into acc
LOOP STA RESULT // store the temp result
LDA RESULT // load the result
SUB DIVISOR // subtract the dividor to acc BRP
BRP LOOP //loop if acc is positive or zero
LDA RESULT // load the result into acc
OUT // display the result
QUIT HLT // halt if zero
HLT // halt the execution
DIVIDEND DAT //declare variable
DIVISOR DAT
RESULT DAT
Currently you correctly get the input, calculate the remainder and output it. You just miss the part where you calculate the quotient and output it. The quotient is really the number of times you jump back to the start of the loop. So, increment a counter in each iteration of the loop. Then when you exit the loop you'll have counted one too many, so for the quotient you would output one less than that value.
Other remarks:
It is not necessary to perform LDA RESULT if it immediately follows STA RESULT, since that value is still in the accumulator -- no need to reload it.
There is no need to have HLT followed by HLT. That second one will never be executed.
It is not useful to say in comments what an LMC instruction does... For instance, "ask the user" is not a useful comment next to INP. Comments should explain something more -- something that is specific to this program. Most of your comments are just saying what someone could look up in a LMC language specification. That is not the purpose of comments.
So here is your code with the extra counter for getting the quotient, and with the above remarks taken into account.
#input: 19 9
INP // Input the dividend
BRZ QUIT // Zero is not considered a valid input
STA DIVIDEND
INP // Input divisor
BRZ QUIT // Division by zero is not allowed
STA DIVISOR
LDA ZERO // Initialise quotient
STA QUOTIENT
LDA DIVIDEND // Let dividend be the initial value of the remainder
// Repeat as long as remainder would be non-negative:
LOOP STA REMAINDER
LDA QUOTIENT // Increment quotient
ADD ONE
STA QUOTIENT
LDA REMAINDER // Reduce remainder
SUB DIVISOR
BRP LOOP
// Output the results
LDA QUOTIENT // quotient is one too great now
SUB ONE
OUT
LDA REMAINDER
OUT
QUIT HLT
// constants:
ZERO DAT 0
ONE DAT 1
// variables:
DIVIDEND DAT
DIVISOR DAT
QUOTIENT DAT
REMAINDER DAT
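For readers who want to check the counting logic outside an LMC simulator, here is a minimal Python sketch of the same scheme: count every pass through the loop, then subtract one at the end. The function name `divide_by_subtraction` is only illustrative and does not come from the LMC program, and unlike the LMC version it raises an error on a zero divisor instead of halting silently.

```python
def divide_by_subtraction(dividend, divisor):
    """Mirror the LMC loop: count subtractions, then undo the one extra count."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects a non-negative dividend and a positive divisor")
    quotient = 0
    acc = dividend                  # LDA DIVIDEND
    while True:
        remainder = acc             # LOOP STA REMAINDER
        quotient += 1               # LDA QUOTIENT / ADD ONE / STA QUOTIENT
        acc = remainder - divisor   # LDA REMAINDER / SUB DIVISOR
        if acc < 0:                 # BRP LOOP falls through once acc is negative
            break
    return quotient - 1, remainder  # quotient was counted one too many times


print(divide_by_subtraction(33, 6))
print(divide_by_subtraction(19, 9))
```

The first call uses the example from the question (33 and 6), the second the `#input: 19 9` line of the listing above.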
I'm trying to learn RISC-V and wrote a factorial function, but it's running into a simulator error, hinting at a possible infinite loop. I'm not really sure how to debug my code at the moment, and was wondering if people could drop hints on what I might be doing wrong.
Thank you!
.globl factorial
.data
n: .word 8
.text
main:
la t0, n #t0 corresponds to n
lw a0, 0(t0)
jal ra, factorial
addi a1, a0, 0
addi a0, x0, 1
ecall # Print Result
addi a1, x0, '\n'
addi a0, x0, 11
ecall # Print newline
addi a0, x0, 10
ecall # Exit
factorial:
addi sp sp -16
sw s0 0(sp) #s0 corresponds to i, initialised to n
sw s1 4(sp) #s1 corresponds to factorial that will be constantly updated; also initialised to 1
sw s2 8(sp) #s2 corresponds to n, or t0
sw s3 12(sp)
add s2 x0 t0
addi s1 x0 1
add s0 x0 t0
addi s3 x0 4 #this is what we use to decrement s0 (i) by 1 each time
loop:
beq s0 x0 exit
mul s1 s1 s0
sub s0 s0 s3
j loop
exit:
lw s0 0(sp)
lw s1 4(sp)
lw s2 8(sp)
lw s3 12(sp)
addi sp sp 16
ret
How to debug a factorial function I'm writing in RISC-V assembly?
I'm not really sure how to debug my code at the moment,
So, you want to learn debugging. Yes, this is a mandatory skill for any programming, especially assembly language. Debugging is an interactive process, which is poorly suited to a Q & A format.
The normal approach is to run every line of code and verify that it does what you think it is doing. If any line of code doesn't do what you expect, then that's what to work on. Every single line has to work properly or else the program won't run properly.
In assembly we call this single stepping. The behavior of an instruction includes both the effect it has on the registers, and the effect on memory — the state of the program, if you will. We verify that the registers and memory are all updated as expected, and also that it goes on to the proper next instruction — flow of control is equally important, and can also meet or mismatch expectations.
We should write small amounts of code and run them to verify they are working, rather than write a whole program and then see if it compiles/assembles and runs. Much better to build incrementally onto working code, as often debugging a small piece of new code will change your understanding (e.g. of the machine, or of the problem you're trying to solve), and hence make writing the rest easier.
When testing some code, verify it by single-stepping with the smallest possible input first: for factorial, for example, run it first with f(1), get that working, then work on f(2).
When doing function calls, you'll need to switch roles, first considering the caller, then the callee, then the caller again. At the point of the call, verify the arguments are in the right registers and the stack, if applicable. At the first instruction of the called function, verify the same, and also make note of the return address value (in the ra register) and the stack pointer value (sp), before stepping through the function. When you store values to memory, verify the values and where they go, so that when you later use memory you are getting what you expect.
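One way to know what to expect while single-stepping is to write down the intended state at each step before running the assembly. The Python sketch below is such a reference model, assuming a loop that multiplies a running product by a counter and decrements the counter by 1 until it reaches 0. The name `factorial_trace` is hypothetical, and the model describes the intended behaviour rather than the code in the question.

```python
def factorial_trace(n):
    """Return the (counter, product) pairs expected at each loop test."""
    counter, product = n, 1
    states = []
    while counter != 0:
        states.append((counter, product))  # state when the loop test is reached
        product *= counter
        counter -= 1
    states.append((counter, product))      # state when the loop finally exits
    return states


# Start with the smallest inputs, as suggested above.
for n in (1, 2, 3):
    print(n, factorial_trace(n))
```

While stepping through the simulator, compare the registers that hold the counter and the product against these pairs at every loop test; the first mismatch points at the instruction to look at.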
I am doing a project in assembly (NASM) about Pascal's triangle.
The project requires calculating Pascal's triangle from line 0 to line 63.
My first problem is where to store the results of the calculation: in memory.
My second problem is which layout to use in memory. I see three options. The first is to declare a full matrix, like this:
memoryLabl: resd 64*64 ; 64 rows (line 0 to line 63) of 64 columns each
The problem with this approach is that half of the matrix is never used, which makes my program inefficient. So on to the second option,
which is to declare a separate memory label for every line,
for example:
line0: dd 1
line1: dd 1,1
line2: dd 1,2,1 ; with pre-filled data for example purposes
...
line63: resd 64 ; reserve space for 64 dword entries
Doing it this way is like doing it by hand.
Some others in my class tried to use a macro, as you can see here,
but I don't understand it.
That brings me to the last option, which is the one I used.
It is like the first one, but with a triangular matrix:
I declare only the amount of memory that I need.
Storing line 0 to line 63 of Pascal's triangle gives a triangular matrix, because every new line adds one cell.
I allocated 2080 dwords for the triangular matrix.
Here is where the 2080 dwords come from:
line0 has 1 dword /* 1 number in the first line */
line1 has 2 dwords /* 2 numbers in the second line */
line2 has 3 dwords /* 3 numbers in the third line */
...
line63 has 64 dwords /* 64 numbers in the final line */
So in the end the total is 1 + 2 + ... + 64 = 64*65/2 = 2080.
I gave every number 1 dword.
Now that the memory for the results exists, let's start the calculation.
First, in Pascal's triangle the first cell of every line has the value 1.
Here is pseudocode showing how I put 1 in the first cell of every line:
s=0
for(i=0;i<64;i++):
s = s+i
mov dword[x+s*4],1 /* x is addresses of triangle matrices */
Second, in Pascal's triangle the last cell of each line is also equal to 1.
Again I will use pseudocode to keep it simple:
s=0
for(i=2;i<64;i++):
s = s+i
mov dword[x+s*4],1
I start from i = 2 because line0 (i = 0) and line1 (i = 1) are already complete: they hold only one and two values respectively, as explained above.
So the two pieces of pseudocode make my triangle look like this in memory:
1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
...
1 1
Now comes the hard part: using these values to fill in all the remaining cells of the triangle.
Let's start with the idea.
Take cell[line][row]:
we have cell[2][1] = cell[1][0]+cell[1][1]
and cell[3][1]= cell[2][0]+cell[2][1]
cell[3][2]= cell[2][1]+cell[2][2]
In general we have
cell[line][row]= cell[line-1][row-1]+cell[line-1][row]
My problem is that I could not turn this relation into ASM instructions, because a triangular matrix is awkward to work with.
Can anyone help me express it with a relation, very basic pseudocode, or asm code?
TL;DR: you just need to traverse the array sequentially, so you don't have to work out the indexing. See the 2nd section.
To random-access index into a (lower) triangular matrix, note that row r starts after a triangle of size r-1. A triangle of size n has n*(n+1)/2 total elements, using Gauss's formula for the sum of the numbers from 1 to n. So a triangle of size r-1 has (r-1)*r/2 elements. Indexing a column within a row is of course trivial once we know the address of the start of the row.
Each DWORD element is 4 bytes wide, and we can take care of that scaling as part of the multiply, because lea lets us shift and add as well as put the result in a different register. We simplify n*(n-1)/2 elements * 4 bytes / elem to n*(n-1) * 2 bytes.
The above reasoning works for 1-based indexing, where row 1 has 1 element. We have to adjust for that if we want zero-based indexing by adding 1 to row indices before the calculation, so we want the size of a triangle with r+1 - 1 rows, thus r*(r+1)/2 * 4 bytes. It helps to put the linear array index into a triangle to quickly double-check the formula:
0
4 8
12 16 20
24 28 32 36
40 44 48 52 56
60 64 68 72 76 80
84 88 92 96 100 104 108
The 4th row, which we're calling "row 3", starts 24 bytes from the start of the whole array. That's (3+1)*(3+1-1) * 2 = (3+1)*3 * 2; yes the r*(r+1)/2 formula works.
;; given a row number in EDI, and column in ESI (zero-extended into RSI)
;; load triangle[row][col] into eax
lea ecx, [2*rdi + 2]
imul ecx, edi ; ecx = r*(r+1) * 2 bytes
mov eax, [triangle + rcx + rsi*4]
This assumes 32-bit absolute addressing is ok (32-bit absolute addresses no longer allowed in x86-64 Linux?). If not, use a RIP-relative LEA to get the triangle base address in a register, and add that to rsi*4. x86 addressing modes can only have 3 components when one of them is a constant. But that is the case here for your static triangle, so we can take full advantage by using a scaled index for the column, the calculated row offset as the base, and the actual array address as the displacement.
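As a quick cross-check of the zero-based formula, the following Python sketch recomputes the byte offsets and prints the same triangle of offsets shown earlier. The helper names `row_offset_bytes` and `element_offset_bytes` are made up for this illustration.

```python
def row_offset_bytes(r, elem_size=4):
    """Byte offset of the start of zero-based row r: r*(r+1)/2 elements."""
    return r * (r + 1) // 2 * elem_size


def element_offset_bytes(r, c, elem_size=4):
    """Byte offset of triangle[r][c]; column c must satisfy 0 <= c <= r."""
    if not 0 <= c <= r:
        raise IndexError("column outside the triangle")
    return row_offset_bytes(r, elem_size) + c * elem_size


# Rebuild the offset table for the first 7 rows.
for r in range(7):
    print(" ".join(str(element_offset_bytes(r, c)) for c in range(r + 1)))
```

With 4-byte elements, `row_offset_bytes(r)` is the same quantity the `lea` / `imul` pair leaves in ECX, namely r*(r+1) * 2.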
Calculating the triangle
The trick here is that you only need to loop over it sequentially; you don't need random access to a given row/column.
You read one row while writing the one below. When you get to the end of a row, the next element is the start of the next row. The source and destination pointers will get farther and farther from each other as you go down the rows, because the destination is always 1 whole row ahead. And you know the length of a row = row number, so you can actually use the row counter as the offset.
global _start
_start:
mov esi, triangle ; src = address of triangle[0,0]
lea rdi, [rsi+4] ; dst = address of triangle[1,0]
mov dword [rsi], 1 ; triangle[0,0] = 1 special case: there is no source
.pascal_row: ; do {
mov rcx, rdi ; RCX = one-past-end of src row = start of dst row
xor eax, eax ; EAX = triangle[row-1][col-1] = 0 for first iteration
;; RSI points to start of src row: triangle[row-1][0]
;; RDI points to start of dst row: triangle[row ][0]
.column:
mov edx, [rsi] ; tri[r-1, c] ; will load 1 on the first iteration
add eax, edx ; eax = tri[r-1, c-1] + tri[r-1, c]
mov [rdi], eax ; store to triangle[row, col]
add rdi, 4 ; ++dst
add rsi, 4 ; ++src
mov eax, edx ; becomes col-1 src value for next iteration
cmp rsi, rcx
jb .column ; }while(src < end_src)
;; RSI points to one-past-end of src row, i.e. start of next row = src for next iteration
;; RDI points to last element of dst row (because dst row is 1 element longer than src row)
mov dword [rdi], 1 ; [r,r] = 1 end of a row
add rdi, 4 ; this is where dst-src distance grows each iteration
cmp rdi, end_triangle
jb .pascal_row
;;; triangle is constructed. Set a breakpoint here to look at it with a debugger
xor edi,edi
mov eax, 231
syscall ; Linux sys_exit_group(0), 64-bit ABI
section .bss
; you could just as well use resd 64*65/2
; but put a label on each row for debugging convenience.
ALIGN 16
triangle:
%assign i 0
%rep 64
row %+ i: resd i + 1
%assign i i+1
%endrep
end_triangle:
I tested this and it works: correct values in memory, and it stops at the right place. But note that integer overflow happens before you get down to the last row. This would be avoided if you used 64-bit integers (a simple change to register names and offsets, and don't forget to change resd to resq). 64 choose 32 is 1832624140942590534 = 2^60.66.
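The pointer-chasing scheme can be modelled in Python with two list indices standing in for RSI and RDI. This is only a sketch under that assumption, not a translation of every instruction; `build_triangle` and `first_wrapped_row` are hypothetical names, and the optional `bits` argument imitates fixed-width register wraparound so the overflow remark can be checked.

```python
import math


def build_triangle(rows, bits=None):
    """Fill a flat list row by row using only a src and a dst index."""
    total = rows * (rows + 1) // 2
    tri = [0] * total
    tri[0] = 1                       # triangle[0][0] has no source row
    src, dst = 0, 1
    while dst < total:
        end_src = dst                # one past the end of the source row
        left = 0                     # triangle[r-1][c-1], zero in column 0
        while src < end_src:
            above = tri[src]
            value = left + above
            if bits is not None:
                value &= (1 << bits) - 1   # wrap like a fixed-width register
            tri[dst] = value
            dst += 1
            src += 1
            left = above
        tri[dst] = 1                 # last element of the destination row
        dst += 1
    return tri


def first_wrapped_row(rows, bits):
    """First zero-based row where fixed-width results differ from exact ones."""
    exact, wrapped = build_triangle(rows), build_triangle(rows, bits)
    for r in range(rows):
        start = r * (r + 1) // 2
        if exact[start:start + r + 1] != wrapped[start:start + r + 1]:
            return r
    return None


print(first_wrapped_row(64, 32), first_wrapped_row(64, 64))
```

The wraparound here is treated as unsigned; interpreting the dwords as signed would make the values look wrong slightly earlier.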
The %rep block to reserve space and label each row as row0, row1, etc. is from my answer to the question you linked about macros, much more sane than the other answer IMO.
You tagged this NASM, so that's what I used because I'm familiar with it. The syntax you used in your question was MASM (until the last edit). The main logic is the same in MASM, but remember that you need OFFSET triangle to get the address as an immediate, instead of loading from it.
I used x86-64 because 32-bit is obsolete, but I avoided too many registers, so you can easily port this to 32-bit if needed. Don't forget to save/restore call-preserved registers if you put this in a function instead of a stand-alone program.
Unrolling the inner loop could save some instructions copying registers around, as well as the loop overhead. This is a somewhat optimized implementation, but I mostly limited it to optimizations that make the code simpler as well as smaller / faster. (Except maybe for using pointer increments instead of indexing.) It took a while to make it this clean and simple. :P
Different ways of doing the array indexing would be faster on different CPUs. e.g. perhaps use an indexed addressing mode (relative to dst) for the loads in the inner loop, so only one pointer increment is needed. But if you want it to run fast, SSE2 or AVX2 vpaddd could be good. Shuffling with palignr might be useful, but probably also unaligned loads instead of some of the shuffling, especially with AVX2 or AVX512.
But anyway, this is my version; I'm not trying to write it the way you would, you need to write your own for your assignment. I'm writing for future readers who might learn something about what's efficient on x86. (See also the performance section in the x86 tag wiki.)
How I wrote that:
I started writing the code from the top, but quickly realized that off-by-one errors were going to be tricky, and I didn't want to just write it the stupid way with branches inside the loops for special cases.
What ended up helping was writing the comments for the pre and post conditions on the pointers for the inner loop. That made it clear I needed to enter the loop with eax=0, instead of with eax=1 and storing eax as the first operation inside the loop, or something like that.
Obviously each source value only needs to be read once, so I didn't want to write an inner loop that reads [rsi] and [rsi+4] or something. Besides, that would have made it harder to get the boundary condition right (where a nonexistent value has to read as 0).
It took some time to decide whether I was going to have an actual counter in a register for row length or row number, before I ended up just using an end-pointer for the whole triangle. It wasn't obvious before I finished that using pure pointer increments / compares was going to save so many instructions (and registers when the upper bound is a build-time constant like end_triangle), but it worked out nicely.
How can I use the "if statement"?
I have a 9-bit number. I need to check the MSB. If the MSB is "1" then I have to do the XOR operation with "100101" (reduction polynomial).
If the MSB is zero then I have to skip the bit.
My main aim is to reduce the 9-bit number to 5-bit.
For example:
Here m = 5
Loop 1 (2m-2 = 8)
101010100 (MSB is the 9th bit)
100101
x01111100
MSB = 1 (true), XOR with reduction polynomial.
Result: 01111100 (8 bit result, removed the 9th bit)
Loop 2 (7)
01111100 (MSB is the 8th bit)
100101
MSB = 0 (false), skip and end the loop.
Result: 01111100 (still 8 bit result, but we are not using the MSB for the next loop)
Loop 3 (6)
1111100 (MSB is the 7th bit)
100101
x110110
MSB = 1 (true), XOR with reduction polynomial.
Result: 0110110 (7 bit result)
Loop 4 (m = 5)
110110 (MSB is the 6th bit)
100101
x10011 (Final result)
MSB = 1 (true), XOR with reduction polynomial.
Final result: 010011 (6 bit result, but we can discard the MSB)
Could you please give me some idea about it?
Many Thanks!
I know next to nothing about VHDL, but it looks like you're trying to implement something like a Galois LFSR or, more generally, reduction modulo a polynomial in GF(2^n).
If so, you don't need to use an "if statement" for this.
Instead, just extract the MSB of your number and XOR it with each of the bits in the rest of the number that are set in your reduction polynomial.
If the MSB is set, this is equivalent to XORing your number with the reduction polynomial; if it's unset, then nothing happens, since XORing a bit with 0 does nothing.
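To make the "no if statement" idea concrete, here is a small Python model (not VHDL) of the reduction. It assumes the numbers from the question: a 9-bit input, m = 5 and the reduction polynomial 100101. The function name `reduce_mod_poly` and its parameters are illustrative only. The current top bit is extracted and used as a 0/1 multiplier on the shifted polynomial, so an unset bit XORs in zero.

```python
def reduce_mod_poly(value, poly=0b100101, m=5, width=9):
    """Reduce a width-bit value to m bits without branching on the MSB."""
    for i in range(width - 1, m - 1, -1):   # bit positions 8, 7, 6, 5
        msb = (value >> i) & 1              # current top bit, 0 or 1
        value ^= (poly << (i - m)) * msb    # XOR with the polynomial only if msb == 1
    return value


print(format(reduce_mod_poly(0b101010100), "05b"))
```

In hardware the multiplication by `msb` corresponds to ANDing each polynomial bit with the extracted MSB before the XOR.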
If you have a std_logic_vector (let's call it v), you can check any particular bit of it using:
if v(5) = '1' then
end if;
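For checking MSBs and LSBs you can use the built-in attributes 'left and 'right. They return the leftmost and rightmost index of the vector, so use them as an index into v. For example:
if v(v'left) = '1' then -- check the MSB
end if;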
I am having a bit of an issue with this problem. I am taking a Pascal programming class and this problem was in my logic book. I am required to have the user enter a series of (+) numbers and once he/she enters a (-) number, the program should find the sum of all the (+) numbers. I accomplished this, but now I am attempting part two of this problem, which requires me to utilize a nested loop to run the program x amount of times based on the user's input.
I do not know how to rerun the summation process based on a number the user enters. In other words, the program is required to
1) Ask the user how many times he/she would like to run the program
2) Begin the nested loop that prompts the user for a series of positive numbers
3) User enters numbers as loop asks for them
4) A negative number then signals the end of the series
5) After the repeat until loop, the program should then add all of the positive numbers together
Steps 2-4 are one iteration of the program. I need this to run x times, based on user input.
The following code is what I have so far and honestly I am stumped:
program summation;
var num, sum, counter, userValue : integer;
begin
writeln('Run program how many times?');
readln(userValue);
for counter := 1 to userValue do
begin
sum := 0;
repeat
writeln('Enter a number: ');
readln(num);
if num >= 0 then
begin
sum := num + sum;
end;
until num < 0;
writeln('The sum is: ', sum);
readln();
end;
end.
Update [6/27] 3:40 Pacific Time
Output:
I attempted to upload an image of the output, but I require 10 rep points. Anyway, the program's output is as follows:
How many times would you like the program to run?
2
Enter a number:
1
Enter a number:
1
Enter a number:
-1 <-- negative number signals one iteration of the nested loop
Enter a number:
2
Enter a number:
-3 <-- negative number signals one iteration of the nested loop
The sum is: 6
The negative number signals the program to stop an iteration. However, I would like the program to repeat the summation of a sequence three times.
Update [6/27] 7:25PM Pacific Time
Currently my program executes correctly. By correctly I mean it (1) Asks the user how many times he/she would like to run it. (2) The nested loop begins and prompts user for a series of numbers. (3) Once a negative number is entered it signals the end of the series. (4) The program sums the positive numbers. (5) The program restarts by asking the user for another series of numbers. (6) Once again a negative number ends the series. (7) Error begins here: Once the program iterates (series of number prompts) according to the user defined number, it adds all of the sums from previous iterations to the final sum. This is not my goal. My goal is to have separate sums (one for each run) not all sums "summed" at the final iteration.
In summary (pun intended), your final corrected listing is:
program summation;
var num, sum, counter, userValue : integer;
begin
{ Prompt the user for how many times they wish to sum up a list of numbers }
writeln('Run program how many times?');
readln(userValue);
{ For each time the user wants to sum numbers, prompt them for the numbers }
for counter := 1 to userValue do
begin
sum := 0; { start with a sum of 0 for this list }
{ Repeatedly request a number from the user until they enter a negative to end }
repeat
{ Prompt for and get the number }
writeln('Enter a number: ');
readln(num);
if num >= 0 then
sum := num + sum; { add the number to the sum if it's not negative }
until num < 0; { finish with this sum if the number entered is negative }
{ Write out the last calculated sum }
writeln('The sum is: ', sum);
readln; { Let the user press enter to go around and request another sum }
end;
end.
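The structure of the corrected listing, an outer counted loop with the sum reset inside it and an inner loop that stops on a negative number, can be tried out with canned input. The Python sketch below is only a model of that control flow; `sum_runs` is a hypothetical helper that reads from a list instead of the keyboard.

```python
def sum_runs(values):
    """First value is the number of runs; each run ends at a negative number."""
    it = iter(values)
    runs = next(it)
    sums = []
    for _ in range(runs):
        total = 0                 # reset for every run, as in the Pascal listing
        while True:
            num = next(it)
            if num < 0:           # a negative number ends this series
                break
            total += num
        sums.append(total)
    return sums


print(sum_runs([2, 1, 1, -1, 2, -3]))
```

The input list reproduces the session from the question: two runs, the series 1, 1 ended by -1, then the series 2 ended by -3. Each run gets its own sum because `total` is reset inside the outer loop.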
In numPromptLoop change the name of the NUM parameter to SUM
I am trying to decipher some assembly code that involves multiple left rotations on an 8-bit binary number.
For reference, the code is:
lab: rol dl,1
rol dl,1
dec ecx
jnz lab
The dec and jnz aren't an issue; they are there to show that the 2 rols are executed several times.
What I am trying to do is figure out a mathematical equivalent of this code, such as a formula. I'm certainly not looking for a complete formula to tell me the whole code, but I would like to know if there is a formula that gives the equivalent (in denary) of a single left rotation.
I've tried figuring this out with a couple of different numbers, but cannot see a link between the two results. For example: if the start number is 115 it comes out as 220, but if the start number is 99 it comes out as 216.
Given your sample results, I assume we are treating the 8-bit quantity as unsigned.
The 7 low-order bits are shifted left, multiplying that part of the number by 2, and the high-order bit wraps around to become the new low-order bit.
Thus, (x % 128) * 2 + (x / 128), using the usual integer div/mod operators.
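The formula can be checked against a bitwise rotate for every 8-bit value. In this Python sketch `rol8` and `rol8_formula` are made-up names. The sample values from the question (115 giving 220, 99 giving 216) match six single rotations, which is an inference from the numbers rather than something the question states.

```python
def rol8(x, count=1):
    """Rotate an unsigned 8-bit value left by count bits."""
    count %= 8
    return ((x << count) | (x >> (8 - count))) & 0xFF


def rol8_formula(x):
    """Single left rotation written with integer div/mod only."""
    return (x % 128) * 2 + (x // 128)


# The arithmetic formula and the bitwise rotate agree on every byte value.
assert all(rol8(x) == rol8_formula(x) for x in range(256))

x = 115
for _ in range(6):        # six single rotations, i.e. three passes of the loop
    x = rol8_formula(x)
print(x, rol8(99, 6))
```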
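Shifting a byte containing the number X one bit position to the left is equal to multiplying X by 2, as long as the result still fits in 8 bits. A rotate differs from a shift only in that the bit pushed out of the top re-enters at bit 0:
x << 1 <==> x = x * 2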