LC3 simulator: check if a register is greater than 0

How do I check if a register is greater than 0?
For example: I want to check if R2 is greater than 0
This is what I did:
ADD R2, R2, #0
But this doesn't check if R2 is greater than 0, it seems like it sets R2's value to 0

Checking whether a register is greater than zero is a two-step process.
First set the condition code (CC) register, then use the BR instruction to branch on a condition.
ADD R2, R2, #0 ; Store R2 in R2; this has no effect other than setting the CC register.
BRnz LESS_THAN_OR_ZERO ; Branch if R2 <= 0, based on the CC register set by the last instruction
[statements here] ; if we are here then R2 > 0
BR DONE ; optional if we don't want to execute the next section of code. unconditional branch to done
LESS_THAN_OR_ZERO
[more statements here] ; if we are here then R2 <= 0
DONE
[more statements here]
More about the CC register: it is set to N, Z, or P based on the value written by the last instruction that writes to a register. That means LD, LEA, LDI, LDR, ADD, AND, and NOT update the CC register automatically.
Check out the ISA documentation for the BR instruction.
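To see why the ADD/BR pair works, here is a small Python model of the condition codes (a sketch only, not LC-3 code). The helper names to_signed16, condition_code and is_greater_than_zero are made up for illustration; registers are treated as 16-bit two's-complement values.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def condition_code(result):
    """Return 'N', 'Z' or 'P' for a 16-bit result written to a register."""
    signed = to_signed16(result)
    if signed < 0:
        return "N"
    return "Z" if signed == 0 else "P"


def is_greater_than_zero(r2):
    """Model ADD R2, R2, #0 followed by BRnz: fall through only when P is set."""
    cc = condition_code(r2 + 0)        # ADD R2, R2, #0 leaves R2 unchanged, sets CC
    branch_taken = cc in ("N", "Z")    # BRnz LESS_THAN_OR_ZERO
    return not branch_taken
```

Note that 0x8000 and above count as negative here, so is_greater_than_zero(0xFFFF) is False: the check is a signed one.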


How to debug a factorial function I'm writing in RISC-V assembly?

I'm trying to learn RISC-V and wrote a factorial function, but it's running into a simulator error, hinting at a possible infinite loop. I'm not really sure how to debug my code at the moment, and was wondering if people could drop hints on what I might be doing wrong.
Thank you!
.globl factorial
.data
n: .word 8
.text
main:
la t0, n #t0 corresponds to n
lw a0, 0(t0)
jal ra, factorial
addi a1, a0, 0
addi a0, x0, 1
ecall # Print Result
addi a1, x0, '\n'
addi a0, x0, 11
ecall # Print newline
addi a0, x0, 10
ecall # Exit
factorial:
addi sp sp -16
sw s0 0(sp) #s0 corresponds to i, initialised to n
sw s1 4(sp) #s1 corresponds to factorial that will be constantly updated; also initialised to 1
sw s2 8(sp) #s2 corresponds to n, or t0
sw s3 12(sp)
add s2 x0 t0
addi s1 x0 1
add s0 x0 t0
addi s3 x0 4 #this is what we use to decrement s0 (i) by 1 each time
loop:
beq s0 x0 exit
mul s1 s1 s0
sub s0 s0 s3
j loop
exit:
lw s0 0(sp)
lw s1 4(sp)
lw s2 8(sp)
lw s3 12(sp)
addi sp sp 16
ret
I'm not really sure how to debug my code at the moment,
So, you want to learn debugging.  Yes, this is a mandatory skill for any programming, especially assembly language.  Debugging is an interactive process, which is poorly suited to a Q & A format.
The normal approach is to run every line of code and verify that it does what you think it is doing.  If any line of code doesn't do what you expect, then that's what to work on.  Every single line has to work properly or else the program won't run properly.
In assembly we call this single stepping.  The behavior of an instruction includes both the effect it has on the registers, and the effect on memory — the state of the program, if you will.  We verify that the registers and memory are all updated as expected, and also that it goes on to the proper next instruction — flow of control is equally important, and can also meet or mismatch expectations.
We should write small amounts of code and run them to verify they are working, rather than write a whole program and then see if it compiles/assembles and runs.  Much better to build incrementally onto working code, as often debugging a small piece of new code will change your understanding (e.g. of the machine, or of the problem you're trying to solve), and hence make writing the rest easier.
When testing some code, verify it by single stepping with the smallest possible input first: for factorial, for example, run f(1) first and get that working, then work on f(2).
When doing function calls, you'll need to switch roles, first considering the caller, then the callee, then the caller again.  At the point of the call, verify the arguments are in the right registers and the stack, if applicable.  At the first instruction of the called function, verify the same, and also make note of the return address value (in the ra register) and the stack pointer value (sp), before stepping through the function.  When you store values to memory, verify the values and where they go, so that when you later use memory you are getting what you expect.
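To make single stepping concrete, here is a small Python model of the loop from the question (a sketch, not a RISC-V simulator). The function name trace_loop and the max_steps cut-off are my own. It records s0 and s1 each time the beq would execute, which is what you would read off the register display while stepping.

```python
def trace_loop(s0, step, max_steps=8):
    """Model the code at `loop:`; return the (s0, s1) pairs seen at each beq."""
    s1 = 1                                  # addi s1 x0 1
    states = []
    for _ in range(max_steps):
        states.append((s0, s1))
        if s0 == 0:                         # beq s0 x0 exit
            break
        s1 = (s1 * s0) & 0xFFFFFFFF         # mul s1 s1 s0 (low 32 bits)
        s0 = (s0 - step) & 0xFFFFFFFF       # sub s0 s0 s3
    return states


if __name__ == "__main__":
    for s0, s1 in trace_loop(1, 4):
        print(f"s0={s0:#010x} s1={s1:#010x}")
```

With the smallest input, trace_loop(1, 1) reaches s0 == 0 after one pass. trace_loop(1, 4), using the value the question loads into s3, wraps past zero and never satisfies the exit test within the cut-off. That is the kind of mismatch between expectation and behaviour that single stepping exposes.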

How are Mathematical Equality Operators Handled at the Machine-Code Level

So I wanted to ask a rather existential question today, and it's one that I feel as though most programmers skip over and just accept as something that works, without really asking the question of "how" it works. The question is rather simple: how is the >= operator compiled down to machine code, and what does that machine code look like? Down at the very bottom, it must be a greater than test, mixed with an "is equal" test. But how is this actually implemented? Thinking about it seems rather paradoxical, because at the very bottom there cannot be a > or == test. There needs to be something else. I want to know what this is.
How do computers test for equality and greater than at the fundamental level?
Indeed there is no > or == test as such. Instead, the lowest level comparison in assembler works by binary subtraction.
On x86, the opcode for integer comparisons is CMP. It is really the one instruction to rule them all. How it works is described for example in 80386 Programmer's reference manual:
CMP subtracts the second operand from the first but, unlike the SUB instruction, does not store the result; only the flags are changed.
CMP is typically used in conjunction with conditional jumps and the SETcc instruction. (Refer to Appendix D for the list of signed and unsigned flag tests provided.) If an operand greater than one byte is compared to an immediate byte, the byte value is first sign-extended.
Basically, CMP A, B (In Intel operand ordering) calculates A - B, and then discards the result. However, in an x86 ALU, arithmetic operations set condition flags inside the flag register of the CPU based on the result of the operation. The flags relevant to arithmetic operations are
Bit  Name  Function
0    CF    Carry Flag -- Set on high-order bit carry or borrow; cleared otherwise.
6    ZF    Zero Flag -- Set if result is zero; cleared otherwise.
7    SF    Sign Flag -- Set equal to high-order bit of result (0 is positive, 1 if negative).
11   OF    Overflow Flag -- Set if result is too large a positive number or too small a negative number (excluding sign-bit) to fit in destination operand; cleared otherwise.
For example if the result of calculation is zero, the Zero Flag ZF is set. CMP A, B executes A - B and discards the result. The result of subtraction is 0 iff A == B. Thus the ZF will be set only when the operands are equal, cleared otherwise.
Carry flag CF would be set iff the unsigned subtraction would result in borrow, i.e. A - B would be < 0 if A and B are considered unsigned numbers and A < B.
Sign flag is set whenever the MSB of the result is set. This means that the result as a signed number is considered negative in 2's complement. However, if you consider the 8-bit subtraction 01111111 (127) - 10000000 (-128), the result is 11111111, which interpreted as an 8-bit signed 2's complement number is -1, even though 127 - (-128) should be 255. A signed integer overflow happened. The sign flag alone doesn't tell which of the signed quantities was greater; the OF overflow flag tells whether a signed overflow happened in the previous arithmetic operation.
Now, depending on the place where this is used, a Byte Set on Condition SETcc or a Jump if Condition is Met Jcc instruction is used to decode the flags and act on them. If the boolean value is used to set a variable, then a clever compiler would use SETcc; Jcc would be a better match for an if...else.
Now, there are 2 choices for >=: either we want a signed comparison or an unsigned comparison.
int a, b;
bool r1, r2;
unsigned int c, d;
r1 = a >= b; // signed
r2 = c >= d; // unsigned
In Intel assembly the names of conditions for unsigned inequality use the words above and below; conditions for signed inequality use the words greater and less. Thus, for r2 the compiler could decide to use Set on Above or Equal, i.e. SETAE, which sets the target byte to 1 if (CF=0). For r1 the result would be decoded by SETGE - Set Byte on Greater or Equal, which means (SF=OF) - i.e. the result of the subtraction interpreted as a 2's complement number is non-negative without overflow, or negative with overflow happening.
Finally an example:
#include <stdbool.h>
bool gte_unsigned(unsigned int a, unsigned int b) {
return a >= b;
}
The resulting optimized code on x86-64 Linux is:
cmp edi, esi
setae al
ret
Likewise for signed comparison
bool gte_signed(int a, int b) {
return a >= b;
}
The resulting assembly is
cmp edi, esi
setge al
ret
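As a cross-check of the flag rules above, here is a Python sketch that models what CMP does to CF, ZF, SF and OF, and how SETAE and SETGE decode them. The function names cmp_flags, setae and setge are mine and the model is simplified to these four flags; it is not a description of any real CPU's internals.

```python
def cmp_flags(a, b, bits=32):
    """Model CMP a, b (Intel operand order): compute a - b and return the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "CF": a < b,                                  # unsigned borrow
        "ZF": result == 0,
        "SF": bool(result & sign),                    # MSB of the result
        # signed overflow: operand signs differ and the result's sign differs from a's
        "OF": bool((a ^ b) & (a ^ result) & sign),
    }


def setae(flags):
    """Unsigned >= : CF = 0."""
    return int(not flags["CF"])


def setge(flags):
    """Signed >= : SF = OF."""
    return int(flags["SF"] == flags["OF"])
```

For the 8-bit example from the text, cmp_flags(0x7F, 0x80, bits=8) sets CF, SF and OF. So setge gives 1 (127 >= -128 as signed values) while setae gives 0 (127 >= 128 is false as unsigned values).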
Here's a simple C function:
bool lt_or_eq(int a, int b)
{
return (a <= b);
}
On x86-64, GCC compiles this to:
.file "lt_or_eq.c"
.text
.globl lt_or_eq
.type lt_or_eq, @function
lt_or_eq:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl -4(%rbp), %eax
cmpl -8(%rbp), %eax
setle %al
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size lt_or_eq, .-lt_or_eq
The important part is the cmpl -8(%rbp), %eax; setle %al; sequence. Basically, it's using the cmp instruction to compare the two arguments numerically, and set the state of the flags based on that comparison (for a signed comparison the zero, sign and overflow flags are the ones that matter). It then uses setle to set the %al register to 1 or 0 depending on the state of those flags: 1 if ZF is set or SF differs from OF, 0 otherwise. The caller gets the return value from the %al register.
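The setle condition can be checked with a few lines of Python. This is only a model of the flag logic, with a function name (setle) and a bits parameter that I chose for illustration; a and b stand for the two arguments, and the subtraction is a - b as in the cmpl above.

```python
def setle(a, b, bits=32):
    """Model cmp + setle for two's-complement integers: 1 if ZF=1 or SF != OF."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    zf = result == 0
    sf = bool(result & sign)
    of = bool((a ^ b) & (a ^ result) & sign)   # signed overflow of a - b
    return int(zf or sf != of)
```

The overflow flag is what keeps the answer right at the extremes: setle(0x80, 0x7F, bits=8) is 1 because -128 <= 127, even though the 8-bit difference 0x01 looks positive.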
First the computer needs to figure out the type of the data. In a language like C this happens at compile time; Python would dispatch to different type-specific tests at run time. Assuming we are coming from a compiled language, and that we know the values we are comparing are integers, the compiler would make sure that the values are in registers and then issue:
SUBS r1, r2
BGE #target
This subtracts the registers and then branches based on the resulting flags (for BGE, signed greater than or equal). These instructions are built-in operations on the CPU (which I'm assuming here is ARM-like; there are many variations).

LC3 Multiplication

So I have an LC3 coding assignment where we have to implement and test user subroutines for input and output of unsigned integers in decimal format. Now for our input we have to do a sequence of keystrokes to construct a single integer value by applying a Repeated Multiplication algorithm, which would be multiplication by 10 via 4 additions. I am not really understanding this concept of multiplication by 4 additions. Could anyone please explain?
x is number you want to multiply by 10
a = x+x = 2x
b = a+a = 4x
c = b+b = 8x
d = a+c = 10x
If your value is in R1 you can try the following:
ADD R2, R1, R1 ;Value = Value x 10
ADD R4, R2, R2
ADD R1, R4, R4
ADD R1, R1, R2
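The same four additions can be checked in Python, along with how they would be used to build a number from keystrokes. This is a sketch with made-up names (times_ten, read_decimal); values are masked to 16 bits to mimic LC-3 registers.

```python
MASK16 = 0xFFFF


def times_ten(x):
    """Multiply by 10 using four additions, as in the ADD sequence above."""
    a = (x + x) & MASK16      # 2x
    b = (a + a) & MASK16      # 4x
    c = (b + b) & MASK16      # 8x
    return (a + c) & MASK16   # 2x + 8x = 10x


def read_decimal(digits):
    """Build an unsigned value from digit characters: value = value*10 + digit."""
    value = 0
    for ch in digits:
        value = (times_ten(value) + (ord(ch) - ord("0"))) & MASK16
    return value
```

For example, read_decimal("1234") goes 0 -> 1 -> 12 -> 123 -> 1234, multiplying the running value by 10 before adding each new digit.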

How can I write an interpreter for 'eq' for Hack Assembly language?

I am reading and studying The Elements of Computing Systems but I am stuck at one point. Sample chapter can be found here.
Anyway, I am trying to implement a Virtual Machine (or a byte code to assembly translator) but I am stuck at one point.
You can find the assembly notation here.
The goal is to implement a translator that will translate a specific byte code to this assembly code.
An example I have done successfully is for the byte code
push constant 5
which is translated to:
@5
D=A
@256
M=D
As I said, the assembly language for Hack is found in the link I provided but basically:
@5 // Load constant 5 to Register A
D=A // Assign the value in Reg A to Reg D
@256 // Load constant 256 to Register A
M=D // Store the value found in Register D to Memory Location[A]
Well this was pretty straight forward. By definition memory location 256 is the top of the stack. So
push constant 5
push constant 98
will be translated to:
@5
D=A
@256
M=D
@98
D=A
@257
M=D
which is all fine..
I also want to give one more example:
push constant 5
push constant 98
add
is translated to:
@5
D=A
@256
M=D
@98
D=A
@257
M=D
@257 // Here starts the translation for 'add' // Load top of stack to A
D=M // D = M[A]
@256 // Load top of stack to A
A=M // A = M[A]
D=D+A
@256
M=D
I think it is pretty clear.
However I have no idea how I can translate the byte code
eq
to Assembly. Definition for eq is as follows:
Three of the commands (eq, gt, lt) return Boolean values. The VM
represents true and false as -1 (minus one, 0xFFFF) and 0 (zero,
0x0000), respectively.
So I need to pop two values to registers A and D respectively, which is quite easy. But how am I supposed to create an Assembly code that will check against the values and push 1 if the result is true or 0 if the result is false?
The assembly code supported for Hack Computer is as follows:
I can do something like:
push constant 5
push constant 6
sub
which will hold the value 0 if 2 values pushed to the stack are equal or !0 if not but how does that help? I tried using D&A or D&M but that did not help much either..
I can also introduce a conditional jump but how am I supposed to know what instruction to jump to? Hack Assembly code does not have something like "skip the next 5 instructions" or etc..
[edit by Spektre] target platform summary as I see it
16bit Von Neumann architecture (address is 15 bits with 16 bit Word access)
Data memory 32KW (Read/Write)
Instruction (Program) memory 32KW (Read only)
native 16 bit registers A,D
general purpose 16 bit registers R0-R15 mapped to Data memory at 0x0000 - 0x000F
these are most likely used also for: SP(R0),LCL(R1),ARG(R2),This(R3),That(R4)
Screen is mapped to Data memory at 0x4000-0x5FFF (512x256 B/W pixels 8KW)
Keyboard is mapped to Data memory at 0x6000 (ASCII code if last hit key?)
It appears there is another chapter which more definitively defines the Hack CPU. It says:
The Hack CPU consists of the ALU specified in chapter 2 and three
registers called data register (D), address register (A), and program
counter (PC). D and A are general-purpose 16-bit registers that can be
manipulated by arithmetic and logical instructions like A=D-1 , D=D|A
, and so on, following the Hack machine language specified in chapter
4. While the D-register is used solely to store data values, the contents of the A-register can be interpreted in three different ways,
depending on the instruction’s context: as a data value, as a RAM
address, or as a ROM address
So apparently "M" accesses are to RAM locations controlled by A. There's the indirect addressing I was missing. Now everything clicks.
With that confusion cleared up, now we can handle OP's question (a lot more easily).
Let's start with implementing subroutine calls with the stack.
; subroutine calling sequence
@returnaddress ; sets the A register
D=A
@subroutine
0 ; jmp
returnaddress:
...
subroutine: ; D contains return address
; all parameters must be passed in memory locations, e.g, R1-R15
; ***** subroutine entry code *****
@STK
AM=M+1 ; bump stack pointer; also set A to new SP value
M=D ; write the return address into the stack
; **** subroutine entry code end ***
<do subroutine work using any or all registers>
; **** subroutine exit code ****
@STK
A=M ; A = address of the top stack entry
D=M ; fetch the return address from the stack
@STK
M=M-1 ; move stack pointer back
A=D
0; jmp ; jmp to return address
; **** subroutine exit code end ****
The "push constant" instruction can easily be translated to store into a dynamic location in the stack:
@<constant> ; sets A register
D=A ; save the constant someplace safe
@STK
AM=M+1 ; bump stack pointer; also set A to new SP value
M=D ; write the constant into the stack
If we wanted to make a subroutine to push constants:
pushR2: ; value to push in R2
@R15 ; save return address in R15
M=D ; we can't really use the stack,...
@R2 ; because we are pushing on it
D=M
@STK
AM=M+1 ; bump stack pointer; also set A to new SP value
M=D ; write the value from R2 into the stack
@R15
A=M
0 ; jmp
And to call the "push constant" routine:
@<constant>
D=A
@R2
M=D
@returnaddress ; sets the A register
D=A
@pushR2
0 ; jmp
returnaddress:
To push a variable value X:
@X
D=M
@R2
M=D
@returnaddress ; sets the A register
D=A
@pushR2
0 ; jmp
returnaddress:
A subroutine to pop a value from the stack into the D register:
popD:
@R15 ; save return address in R15
M=D ; we can't really use the stack,...
@STK
A=M ; A = address of the top stack entry
D=M ; fetch the popped value
@STK
M=M-1 ; decrement stack pointer
@R15
A=M
0 ; jmp
Now, to do the "EQ" computation that was OP's original request:
EQ: ; compare values on top of stack, return boolean in D
@R14 ; save return address (popD uses R15 for its own)
M=D
@EQReturn1
D=A
@popD
0; jmp
EQReturn1:
@R2
M=D ; save first popped value
@EQReturn2
D=A
@popD
0; jmp
EQReturn2:
; here D has 2nd popped value, R2 has first
@R2
D=D-M
@EQTrue
equal; jmp ; difference is zero: the values are equal
D=0 ; not equal: false
@EQDone
0; jmp
EQTrue:
D=-1 ; equal: true (0xFFFF)
EQDone: ; D contains 0 or FFFF here
@R14
A=M ; fetch return address
0; jmp
Putting it all together:
@5 ; push constant 5
D=A
@R2
M=D
@returnaddress1
D=A
@pushR2
0 ; jmp
returnaddress1:
@X ; now push X
D=M
@R2
M=D
@returnaddress2
D=A
@pushR2
0 ; jmp
returnaddress2:
@returnaddress3 ; pop and compare the values
D=A
@EQ
0 ; jmp
returnaddress3:
At this point, OP can generate code to push D onto the stack:
@R2 ; push D onto stack
M=D
@returnaddress4
D=A
@pushR2
0 ; jmp
returnaddress4:
or he can generate code to branch on the value of D:
@jmptarget
EQ ; jmp
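On the translator side, the "which instruction do I jump to" problem is usually handled by letting the assembler resolve symbolic labels and having the translator emit a fresh pair of labels for every eq it translates. Below is a minimal Python sketch of that idea. It assumes standard Hack assembler syntax (@label, (LABEL), D;JEQ, 0;JMP) and a stack pointer kept in SP that points at the next free slot, which differs from the fixed address 256 in the question and from the STK convention used above. The function name translate_eq and the label scheme EQ_TRUE_n / EQ_END_n are made up for illustration.

```python
def translate_eq(label_id):
    """Return Hack assembly lines for the VM command `eq`.

    label_id must be different for every eq in the program so that the
    generated labels never collide.
    """
    true_label = f"EQ_TRUE_{label_id}"
    end_label = f"EQ_END_{label_id}"
    return [
        # pop y into D; SP now points at y's old slot
        "@SP", "AM=M-1", "D=M",
        # step down to x and compute D = x - y
        "A=A-1", "D=M-D",
        # jump if the difference is zero
        f"@{true_label}", "D;JEQ",
        # not equal: overwrite x with false (0)
        "@SP", "A=M-1", "M=0",
        f"@{end_label}", "0;JMP",
        # equal: overwrite x with true (-1, i.e. 0xFFFF)
        f"({true_label})",
        "@SP", "A=M-1", "M=-1",
        f"({end_label})",
    ]


if __name__ == "__main__":
    print("\n".join(translate_eq(0)))
```

A translator would keep a counter and call translate_eq(counter) each time it meets an eq, incrementing the counter afterwards. gt and lt follow the same shape with a different jump condition.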
As I wrote in my last comment, there is a branchless way: compute the return value from the operands directly.
Let's take an easy operation like eq for now.
If I get it right, eq a,d is something like a=(a==d),
where true is 0xFFFF and false is 0x0000.
If a==d then a-d==0, and this can be used directly:
compute a=a-d
compute the OR cascade of all bits of a
if the result is 0 (operands equal) return 0xFFFF
if the result is 1 (operands differ) return 0
this can be achieved by a table or by OR_Cascade(a)-1
the OR cascade
I do not see any bit shift operations in your description
so you need to use a+a instead of a<<1
and if shift right is needed then you need to implement divide by 2
So, to summarize, eq a,d could look like this:
a=a-d;
a=(a|(a>>1)|(a>>2)|...|(a>>15))&1
a=a-1;
You just need to encode this in your assembly.
As you do not have division or shifts directly supported, this may be better:
a=a-d;
a=(a|(a<<1)|(a<<2)|...|(a<<15))&0x8000
a=(a>>15)-1;
the less-than and greater-than comparisons are much more complicated
you need to compute the carry flag of the subtraction
or use the sign of the result (MSB of the result)
if you limit the operands to 15 bits then it is just the 15th bit
for full 16-bit operands you need to compute the 16th bit of the result
for that you need to know quite a bit about logic circuits and ALU summation principles
or divide the values into 8-bit pairs and do a 2x8-bit subtraction cascade
so a=a-d will become:
sub al,dl
sbc ah,dh
and the carry/sign is in the 8th bit of result which is accessible
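The branchless eq can be sketched in Python to check the idea. This is a model only: eq_branchless is a name I made up, the left shift is done as x + x as suggested above, and the final extraction of bit 15 uses Python's >> purely for brevity (on Hack that last step would need another trick, such as testing the sign). Values are masked to 16 bits.

```python
MASK16 = 0xFFFF


def eq_branchless(a, d):
    """Return 0xFFFF if a == d, else 0x0000, without comparing a and d directly."""
    diff = (a - d) & MASK16
    acc = diff
    shifted = diff
    for _ in range(15):
        shifted = (shifted + shifted) & MASK16   # x << 1 done as x + x
        acc |= shifted                           # OR cascade of all shifted copies
    nonzero = (acc & 0x8000) >> 15               # 1 if any bit of diff was set
    return (nonzero - 1) & MASK16                # 0 -> 0xFFFF (true), 1 -> 0 (false)
```

After 15 doublings every set bit of the difference has passed through bit 15, so bit 15 of the accumulated OR is 1 exactly when the operands differ.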

Clueless About insrwi Instruction

I have looked it up and nothing explains it well. It says that rlwimi can be used to be equivalent to it, but I don't know that instruction either.
Code with it in there:
andi. r0, r6, 3 # while(r6 != 3)
bdnzf eq, loc_90014730 # if(CTR != 0) loc_90014730();
insrwi r4, r4, 8,16 # ????
srwi. r0, r5, 4 # r0 = r5 >> 4;
insrwi r4, r4, 16,0
(r4 == 0)
I've been stuck on this instruction for a while. Please, don't just give me the result, please give me a detailed explanation.
I think you need to do some experiments with rlwimi to fully explain it to yourself, but here is what I find helpful.
There is a programming note in Book 1 of the Power PC Programming Manual for rlwimi that provides a little more detail on inslwi and insrwi:
rlwimi can be used to insert an n-bit field that is left-justified in
the low-order 32 bits of register RS, into RAL starting at bit
position b, by setting SH=32-b, MB=b, and ME=(b+n)-1. It can be used
to insert an n-bit field that is right-justified in the low-order 32
bits of register RS, into RAL starting at bit position b, by setting
SH=32-(b+n), MB=b, and ME=(b+n)-1.
It also helps to compare the results of insrwi and inslwi. Here are two examples tracing through the rlwimi procedure, where r4=0x12345678.
insrwi r4,r4,8,16 is equivalent to rlwimi r4,r4,8,16,23
Rotate left 8 bits and notice it puts the last 8 bits of the original r4 in those positions that match the generated mask: 0x34567812
Generate the mask: 0x0000FF00
Insert the last 8 bits, which were those 8 bits that were right justified in r4, under the control of the generated mask: 0x12347878
So insrwi takes n bits from the right side (starting at bit 32) and inserts them into the destination register starting at bit b.
inslwi r4,r4,8,16 is equivalent to rlwimi r4,r4,16,16,23
Rotate left 16 bits and notice it puts the first 8 bits of the original r4 in those positions that match the generated mask: 0x56781234
Generate the mask: 0x0000FF00
Insert the first 8 bits, which were those 8 bits that were left justified in r4, under the control of the generated mask: 0x12341278
So inslwi takes n bits from the left side (starting at bit 0) and inserts them into the destination register starting at bit b.
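If you want to experiment without hardware, the programming note translates into a few lines of Python. This is a sketch for 32-bit values only, using IBM bit numbering (bit 0 is the most significant bit); the helper names rotl32 and mask32 are mine, and mask32 assumes MB <= ME (no wrap-around masks).

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def mask32(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB); assumes mb <= me."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32


def rlwimi(ra, rs, sh, mb, me):
    """Rotate rs left by sh, then insert it into ra under the mask mb..me."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)


def insrwi(ra, rs, n, b):
    """Insert the right-justified n-bit field of rs into ra at bit b."""
    return rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)


def inslwi(ra, rs, n, b):
    """Insert the left-justified n-bit field of rs into ra at bit b."""
    return rlwimi(ra, rs, 32 - b, b, b + n - 1)
```

With r4 = 0x12345678, insrwi(r4, r4, 8, 16) gives 0x12347878 and inslwi(r4, r4, 8, 16) gives 0x12341278, matching the traces above. Following it with insrwi r4, r4, 16, 0 as in the question's code gives 0x78787878, i.e. the low byte copied into all four bytes.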
PowerISA 2.07 [1] states insrwi is an extended mnemonic of rlwimi, with the equivalent rlwimi instruction and how they are related.
Probably PowerISA has the detail level you want. :)
[1] https://www.power.org/documentation/power-isa-version-2-07/ (or google, pdf)
