With only ADD, AND, NOT How do we OR - lc3

In LC-3, how can I store the result of R1 OR R2 in register R3?
ld r1, a ;move the value of a into r1
ld r2, b ;move the value of b into r2
and r3, r1, r2 ;
halt ;stop

You can use De Morgan's law to achieve this: OR is the complement of the AND of the complements.
X ∨ Y = ¬(¬X ∧ ¬Y)
In LC3 code it's:
LD R1, a ;move the value of a into R1
LD R2, b ;move the value of b into R2
NOT R1, R1
NOT R2, R2
AND R3, R1, R2
NOT R3, R3
HALT

Is this a GCC bug or am I doing something wrong?

I am trying to get the final accumulate in the code below to use the ARM Cortex-M7 SMLAL instruction (32*32->64-bit multiply-accumulate). If I include the T3 = T3 + 1 then it does use this, but if I comment it out it does a full 64*64-bit multiply and accumulate using 3 multiply and 2 add instructions. I don't actually want to add 1 to T3, so it needs to go.
I've broken the code down so that I could analyse it in more detail, and it definitely seems that the cast of T3 to int32_t, which throws away the bottom 32 bits of the multiply, isn't being picked up by the compiler, so it thinks T3 still has 64 bits. But when I add the simple increment of T3 it gets it right. I tried adding zero, but then it goes back to the full 64*64-bit multiply.
I'm using -O2 optimisation in STM's STM32CubeIDE, which uses a version of GCC. Other optimisation levels never use SMLAL or unroll everything.
int64_t T4 = 0;
osc = key * NumHarmonics;
harmonic = 0;
do
{
if (OscLevel[osc] > 1)
{
OscPhase[osc] = OscPhase[osc] + (uint32_t)(T2);
int32_t T5 = Sine[(OscPhase[osc] >> 16) & 0x0000FFFF];
int64_t T6 = (int64_t)T1 * Tremelo[harmonic];
int32_t T3 = (int32_t)(T6 >> 32); // grab the most significant register
// T3 = T3 + 1; // needs the +1 to force use of SMLAL in next instruction ! (+0 doesn't help)
T4 = T4 + (int64_t)T3 * (int64_t)T5; // should be SMLAL but does a full 64*64 mult if no +1 above
}
osc++;
harmonic++;
}
while (harmonic < NumHarmonics);
OscTotal = T4;
Without the addition:
800054e: 4b13 ldr r3, [pc, #76] ; (800059c <main+0xd8>)
8000550: f853 1024 ldr.w r1, [r3, r4, lsl #2]
8000554: ea4f 79e1 mov.w r9, r1, asr #31
8000558: fba7 4501 umull r4, r5, r7, r1
800055c: fb07 f309 mul.w r3, r7, r9
8000560: fb01 3202 mla r2, r1, r2, r3
8000564: 4415 add r5, r2
8000566: e9dd 2300 ldrd r2, r3, [sp]
800056a: 1912 adds r2, r2, r4
800056c: 416b adcs r3, r5
800056e: e9cd 2300 strd r2, r3, [sp]
}
osc++;
8000572: 3001 adds r0, #1
harmonic++;
With the addition:
8000542: 4b0b ldr r3, [pc, #44] ; (8000570 <main+0xac>)
8000544: f853 3020 ldr.w r3, [r3, r0, lsl #2]
8000548: fbc3 6701 smlal r6, r7, r3, r1
}
osc++;
800054c: 3201 adds r2, #1
harmonic++;

LC3 continue getting a trap was executed with an illegal vector number

I am trying to create a program that will present the number input in binary to the user. Currently, all I have is the setup to get the user's number once they are finished typing all of their characters, however I don't understand why the code below will not run.
.ORIG x3000
RESET
AND R1, R1, #0
AND R2, R2, #0
AND R3, R3, #0
AND R4, R4, #0
ASCII .FILL #-48 ;ASCII CONVERSION
LD R5, ASCII ;
AND R6, R6, #0 ;NEGATIVE FLAG
DISPLAY .STRINGZ "\nTYPE A NUMBER THEN PRESS ENTER: "
LEA R0 DISPLAY
PUTS
loop
LOOP
GETC
OUT
AND R4, R4, #0 ;CHECK IF LF
ADD R4, R4, #-10 ;
ADD R4, R4, R0 ;
BRZ READY
LD R4, CHECKN ;check if negative
AND R4, R4, #0 ;
ADD R4, R4, R0 ;
BRZ NEGATIVE ;
ADD R1, R0, R5
BRNZP MULTIPLY
ADD R2, R1, R3
BRNZP LOOP
NEGATIVE
ADD R6, R6, #1
BRNZP LOOP
;multiply by adding the same number 10 times
MULTIPLY
ADD R3, R2, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
ADD R3, R3, R2
BRNZP LOOP
CHECKN .FILL #-45
READY
HALT
.END

LC3 trap executed illegal vector number

I'm trying to count the number of characters in LC3 simulator and keep getting "a trap was executed with an illegal vector number".
These are the objects I execute
charcount.obj:
0011000000000000
0101010010100000
0010011000010000
1111000000100011
0110001011000000
0001100001111100
0000010000001000
1001001001111111
0001001001100001
0001001001000000
0000101000000001
0001010010100001
0001011011100001
0110001011000000
0000111111110110
0010000000000100
0001000000000010
1111000000100001
1111000000100101
1110001011111111
0000000000110000
and verse:
.ORIG x3100
.STRINGZ "Simple Simon met a pieman,"
.STRINGZ "Going to the fair;"
.STRINGZ "Says Simple Simon to the pieman,"
.STRINGZ "Let me taste your ware."
.FILL x04
.END
Looks like we're going to need more information before we can help you much. I understand you've provided us with some binary, and I ran that through the LC3 simulator. Here's where I'm a bit lost: which string would you like to count, and where is it stored?
After trying to piece together what you've provided, here's what I've found.
Registers:
R0 x0061 97
R1 x0000 0
R2 x0000 0
R3 xE2FF -7425
R4 x0000 0
R5 x0000 0
R6 x0000 0
R7 x3003 12291
PC x3004 12292
IR x62C0 25280
CC Z
Memory:
x3000 0101010010100000 x54A0 AND R2, R2, #0
x3001 0010011000010000 x2610 LD R3, x3012
x3002 1111000000100011 xF023 TRAP IN
x3003 0110001011000000 x62C0 LDR R1, R3, #0
x3004 0001100001111100 x187C ADD R4, R1, #-4
x3005 0000010000001000 x0408 BRZ x300E
x3006 1001001001111111 x927F NOT R1, R1
x3007 0001001001100001 x1261 ADD R1, R1, #1
x3008 0001001001000000 x1240 ADD R1, R1, R0
x3009 0000101000000001 x0A01 BRNP x300B
x300A 0001010010100001 x14A1 ADD R2, R2, #1
x300B 0001011011100001 x16E1 ADD R3, R3, #1
x300C 0110001011000000 x62C0 LDR R1, R3, #0
x300D 0000111111110110 x0FF6 BRNZP x3004
x300E 0010000000000100 x2004 LD R0, x3013
x300F 0001000000000010 x1002 ADD R0, R0, R2
x3010 1111000000100001 xF021 TRAP OUT
x3011 1111000000100101 xF025 TRAP HALT
x3012 1110001011111111 xE2FF LEA R1, x3112
x3013 0000000000110000 x0030 NOP
The values displayed in the registers are what I get when I stop after the instruction at x3003. LD R3, x3012 loads the word stored at x3012, which is xE2FF, so R3 ends up holding xE2FF rather than the address of your verse at x3100. LDR R1, R3, #0 then loads the 0 stored at memory location xE2FF into R1, and the problems mount from there.
I would recommend displaying your asm code and then commenting each line so we can better understand what you're trying to accomplish.

How can I write a simple LC-3 program

How can I write a simple LC-3 program that compares the two numbers in R1 and R2 and puts the value 0 in R0 if R1 = R2, 1 if R1 > R2 and -1 if R1 < R2?
The comparison is done using simple arithmetic: subtract R2 from R1 and branch on the sign of the result.
In my example we compare 2 and 6, so the result is -1.
.ORIG x3000
LD R1, NUMBER1 ;load NUMBER1 into R1
LD R2, NUMBER2 ;load NUMBER2 into R2
AND R0, R0, #0 ;initialize R0 with 0
NOT R3, R2 ;R3 = NOT R2
ADD R3, R3, #1 ;R3 = -R2 (two's complement negation of NUMBER2)
ADD R4, R3, R1 ;R4 = R1 - R2, which also sets the condition codes
BRz Equals ;we jump to Equals if NUMBER1 = NUMBER2 (we could jump directly to Done)
BRn GreaterR2 ;we jump to GreaterR2 if NUMBER1 < NUMBER2
BRp GreaterR1 ;we jump to GreaterR1 if NUMBER1 > NUMBER2
Equals BRnzp Done ;nothing to do, because R0 = 0 (THIS IS NOT NECESSARY)
GreaterR2 ADD R0, R0, #-1 ;R0 = -1
BRnzp Done
GreaterR1 ADD R0, R0, #1 ;R0 = 1
BRnzp Done
Done HALT ;THE END
NUMBER1 .FILL #2 ;/ Here we declare the numbers we want to compare
NUMBER2 .FILL #6 ;\
.END
.ORIG x3000
AND R1, R1, x0
AND R2, R2, x0
LD R6, RESET
LEA R0, LINE1
PUTS
GETC
OUT
ADD R1, R6, R0
LEA R0, LINE2
PUTS
GETC
OUT
ADD R2, R6, R0
JSR COMPARE
HALT
;////////// COMPARE FUNCTION BEGINS /////////////
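COMPARE
ST R7, SAVER7 ; save the return address, because PUTS (a TRAP) may overwrite R7
AND R3, R3, x0
NOT R2, R2
ADD R2, R2, x1
ADD R3, R1, R2
BRn NEG
ADD R3, R3, x0
BRp POS
ADD R3, R3, x0
BRz EQ
AND R5, R5, x0
ADD R5, R5, R1
RET
NEG LEA R0, N ; triggers when R3 IS NEGATIVE
PUTS
LD R7, SAVER7 ; restore the return address
RET
POS LEA R0, P ; triggers when R3 IS POSITIVE
PUTS
LD R7, SAVER7 ; restore the return address
RET
EQ LEA R0, E ; triggers when R3 IS ZERO
PUTS
LD R7, SAVER7 ; restore the return address
RET
SAVER7 .BLKW 1 ; storage for the saved return address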
N .STRINGZ "\nX IS LESS THAN Y"
P .STRINGZ "\nX IS GREATER THAN Y"
E .STRINGZ "\nX IS EQUAL TO Y"
RESET .FILL xFFD0; RESET = -48 AS THIS IS ASCII RESETER FOR OUR PROGRAM
LINE1 .STRINGZ "ENTER X : "
LINE2 .STRINGZ "\nENTER Y : "
.END

ARM Assembly: Absolute Value Function: Are two or three lines faster?

In my embedded systems class, we were asked to re-code the given C-function AbsVal into ARM Assembly.
We were told that the best we could do was 3-lines. I was determined to find a 2-line solution and eventually did, but the question I have now is whether I actually decreased performance or increased it.
The C-code:
unsigned long absval(signed long x){
unsigned long int signext;
signext = (x >= 0) ? 0 : -1; //This can be done with an ASR instruction
return (x + signext) ^ signext;
}
The TA/Professor's 3-line solution
ASR R1, R0, #31 ; R1 <- (x >= 0) ? 0 : -1
ADD R0, R0, R1 ; R0 <- R0 + R1
EOR R0, R0, R1 ; R0 <- R0 ^ R1
My 2-line solution
ADD R1, R0, R0, ASR #31 ; R1 <- x + (x >= 0) ? 0 : -1
EOR R0, R1, R0, ASR #31 ; R0 <- R1 ^ (x >= 0) ? 0 : -1
There are a couple of places I can see potential performance differences:
The addition of one extra Arithmetic Shift Right call
The removal of one memory fetch
So, which one is actually faster? Does it depend upon the processor or memory access speed?
Here is another two-instruction version:
cmp r0, #0
rsblt r0, r0, #0
Which translates to the simple code:
if (r0 < 0)
{
r0 = 0-r0;
}
That code should be pretty fast, even on modern ARM-CPU cores like the Cortex-A8 and A9.
Dive over to ARM.com and grab the Cortex-M3 datasheet. Section 3.3.1 on page 3-4 has the instruction timings. Fortunately they're quite straightforward on the Cortex-M3.
We can see from those timings that in a perfect 'no wait state' system your professor's example takes 3 cycles:
ASR R1, R0, #31 ; 1 cycle
ADD R0, R0, R1 ; 1 cycle
EOR R0, R0, R1 ; 1 cycle
; total: 3 cycles
and your version takes two cycles:
ADD R1, R0, R0, ASR #31 ; 1 cycle
EOR R0, R1, R0, ASR #31 ; 1 cycle
; total: 2 cycles
So yours is, theoretically, faster.
You mention "The removal of one memory fetch", but is that true? How big are the respective routines? Since we're dealing with Thumb-2 we have a mix of 16-bit and 32-bit instructions available. Let's see how they assemble:
Their version (adjusted for UAL syntax):
.syntax unified
.text
.thumb
abs:
asrs r1, r0, #31
adds r0, r0, r1
eors r0, r0, r1
Assembles to:
00000000 17c1 asrs r1, r0, #31
00000002 1840 adds r0, r0, r1
00000004 4048 eors r0, r1
That's 3x2 = 6 bytes.
Your version (again, adjusted for UAL syntax):
.syntax unified
.text
.thumb
abs:
add.w r1, r0, r0, asr #31
eor.w r0, r1, r0, asr #31
Assembles to:
00000000 eb0071e0 add.w r1, r0, r0, asr #31
00000004 ea8170e0 eor.w r0, r1, r0, asr #31
That's 2x4 = 8 bytes.
So instead of removing a memory fetch you've actually increased the size of the code.
But does this affect performance? My advice would be to benchmark.
