I tried to write the instruction x0000, which means BR with nzp=000 and offset 0.
I wrote BR #0 in the simulator.
Instead of giving me that x0000 on the simulator,
I get 0x0E00 which means nzp is 111.
What is the correct way of doing this?
You cannot have nzp=000. A number is either negative, zero or positive.
According to this course on LC3 from UPenn
LC-3 has three 1-bit condition code registers
N - negative
Z - zero
P - positive (greater than zero)
Exactly one will be set at all times.
Based on the last instruction that altered a register.
You can use NOP, because nzp=000 means the branch is never taken, so execution just moves on to the next instruction.
Another option is LABEL .fill x0000, because the encoding of BR with nzp=000 and offset 0 is just x0000.
BR will assemble to branch-always, which is exactly the same as BRnzp, which is why you're seeing that in the assembled code.
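To see why the two encodings differ, here is a minimal sketch of the BR bit layout (opcode 0000 in bits 15..12, n/z/p in bits 11..9, a 9-bit PC offset in bits 8..0). The function name encode_br is made up for illustration; it is not part of any LC-3 tool.

```python
def encode_br(n: int, z: int, p: int, pc_offset9: int) -> int:
    """Encode an LC-3 BR instruction word (hypothetical helper for illustration)."""
    opcode = 0b0000                      # BR opcode occupies bits 15..12
    return (
        (opcode << 12)
        | (n << 11)                      # branch if last result was negative
        | (z << 10)                      # branch if last result was zero
        | (p << 9)                       # branch if last result was positive
        | (pc_offset9 & 0x1FF)           # 9-bit two's complement offset
    )

print(hex(encode_br(1, 1, 1, 0)))  # plain BR (same as BRnzp) with offset 0
print(hex(encode_br(0, 0, 0, 0)))  # nzp=000: the word you get from .fill x0000
```

The first call produces the 0x0E00 the simulator showed; the second is the all-zero word that never branches.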
I am trying to understand how modern CPUs work. I am focused on RISC-V. There are a few types of branches:
BEQ
BNE
BLT
BGE
BLTU
BGEU
I use the Venus simulator to test this, and I am also trying to simulate it myself. So far it works, but I cannot understand how branches are calculated.
From what I have read, the ALU unit has just one signal output - ZERO (apart from its math output) which is active whenever the output is zero. But just how can I determine if the branch should be taken or not based just on the ZERO output? And how are they calculated?
Example code:
addi t0, zero, 9
addi t1, zero, 10
blt t0, t1, end
end:
Example of branches:
BEQ - subtract 2 numbers, if ZERO is active, branch
BNE - subtract 2 numbers, if ZERO is not active, branch
BLT - and here I am a little bit confused; should I subtract and then look at the sign bit, or what?
BGE / BGEU - and how to differentiate these? What math instructions should I use?
Yes, the ZERO output gives you equal / not-equal. You can also use XOR instead of SUB for equality comparisons if that runs faster (ready earlier in a partial clock cycle) and/or uses less power (fewer transistors switching).
Fun fact: MIPS only has eq / ne and signed-compare-against-zero branch conditions, all of which can be tested fast without carry propagation or any other cascading bits. That mattered because it checked branch conditions in the first half cycle of exec, in time to forward to fetch, keeping branch latency down to 1 cycle which the branch-delay slot hid on classic MIPS pipelines. For other conditions, like blt between two registers, you need slt and branch on that. RISC-V has true hardware instructions for blt between two registers, vs. MIPS's bltz against zero only.
Why use an ALU with only a zero output? That makes it unusable for comparisons other than exact equality.
You need other outputs to determine GT / GE / LE / LT (and their unsigned equivalents) from a subtract result.
For unsigned conditions, all you need is zero and a carry/borrow (unsigned overflow) flag.
The sign bit of the result on its own is not sufficient for signed conditions because signed overflow is possible. For example, (-1) - (-2) = +1 (sign bit clear), and indeed -1 > -2. But with 8-bit wraparound, 0x80 - 0x7F = 0x01 = +1 (sign bit also clear), even though -128 < 127. The sign bit of a number on its own is only useful when comparing against zero.
If you widen the result (by sign-extending the inputs and doing one more bit of add/sub) that makes signed overflow impossible so that 33rd bit is a signed-less-than result directly.
You can also get a signed-less-than result from signed_overflow XOR signbit instead of actually widening + adding. You might also want an ALU output for signed overflow, if RISC-V has any architectural way for software to check for signed-integer overflow.
Signed overflow can be computed by looking at the carry in and carry out from the MSB (the sign bit). If those differ, you have overflow, i.e. OF = XOR of those two carries. See also http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt for a detailed look at unsigned carry vs. signed overflow with 2-bit and 4-bit examples.
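As a rough illustration of the carry-in / carry-out rule, here is a small Python model of an 8-bit adder. The names add_with_flags and sub_with_flags are made up for this sketch, and subtraction is modelled as a + ~b + 1, so the carry-out here is the raw adder carry (ARM-style, the inverse of an x86-style borrow).

```python
def add_with_flags(a: int, b: int, carry_in: int = 0, width: int = 8):
    """Return (result, carry_out, overflow) for a width-bit add (illustrative model)."""
    mask = (1 << width) - 1
    low_mask = mask >> 1                      # every bit below the MSB
    a &= mask
    b &= mask
    # Carry into the MSB comes from adding only the lower width-1 bits.
    carry_into_msb = ((a & low_mask) + (b & low_mask) + carry_in) >> (width - 1)
    total = a + b + carry_in
    carry_out = total >> width                # carry out of the MSB
    overflow = carry_into_msb ^ carry_out     # signed overflow: the two carries differ
    return total & mask, carry_out, overflow

def sub_with_flags(a: int, b: int, width: int = 8):
    """a - b computed as a + ~b + 1, the way an adder-based ALU does it."""
    mask = (1 << width) - 1
    return add_with_flags(a, ~b & mask, 1, width)

print(sub_with_flags(0x80, 0x7F))  # -128 - 127: result 0x01 with signed overflow
print(sub_with_flags(0xFF, 0xFE))  # -1 - (-2):  result 0x01 without overflow
```

Both subtractions leave the sign bit clear, but only the first one overflows, which is why the overflow output is needed to tell the two cases apart.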
In CPUs with a FLAGS register (e.g. x86 and ARM), those ALU outputs actually go into a special register with named bits. You can look at an x86 manual for conditional-jump instructions to see how condition names like l (signed less-than) or b (unsigned below) map to those flags:
signed conditions:
jl (aka RISC-V blt) : Jump if less (SF≠ OF). That's output signbit not-equal to Overflow Flag, from a subtract / cmp
jle : Jump if less or equal (ZF=1 or SF≠ OF).
jge (aka RISC-V bge) : Jump if greater or equal (SF=OF).
jg (aka RISC-V bgt) : Jump short if greater (ZF=0 and SF=OF).
If you decide to have your ALU just produce a "signed-less-than" output instead of separate SF and OF outputs, that's fine. SF==OF is just !(SF != OF).
(x86 also has some mnemonic synonyms for the same opcode, like jl = jnge. There are "only" 16 FLAGS predicates, including OF=0 alone (test for overflow, not a compare result), and the parity flag. You only care about the actual signed/unsigned compare conditions.)
If you think through some example cases, like testing that INT_MAX > INT_MIN you'll see why these conditions make sense, like that example I showed above for 8-bit numbers.
unsigned:
jb (aka RISC-V bltu) : Jump if below (CF=1). That's just testing the carry flag.
jae (aka RISC-V bgeu) : Jump short if above or equal (CF=0).
ja (aka RISC-V bgtu) : Jump short if above (CF=0 and ZF=0).
(Note that x86 subtract sets CF = borrow output, so 1 - 2 sets CF=1. Some other ISAs (e.g. ARM) invert the carry flag for subtract. When implementing RISC-V this will all be internal to the CPU, not architecturally visible to software.)
I don't know if RISC-V actually has all of these different branch conditions, but x86 does.
There might be simpler ways to implement a signed or unsigned comparator than doing subtraction at all.
But if you already have an add/subtract ALU and want to piggyback on that then you might just want it to generate Carry and Signed-less-than outputs as well as Zero.
That way you don't need a separate sign-flag output, or to grab the MSB of the integer result. It's just one extra XOR gate inside the ALU to combine those two things.
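A minimal sketch of that idea in Python, assuming a 32-bit ALU with three outputs (Zero, borrow/Carry, and Signed-less-than). The function names alu_compare and branch_taken are hypothetical, and real hardware would compute these with gates rather than Python comparisons.

```python
def alu_compare(a: int, b: int, width: int = 32):
    """Model a - b and return (zero, borrow, signed_lt)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    diff = (a - b) & mask
    zero = diff == 0
    borrow = a < b                      # unsigned borrow out of the subtract (x86-style CF)
    sign = diff >> (width - 1)          # sign bit of the result (SF)
    sign_a = a >> (width - 1)
    sign_b = b >> (width - 1)
    # Signed overflow of a - b: operands have different signs and the
    # result's sign differs from a's sign.
    overflow = (sign_a != sign_b) and (sign != sign_a)
    signed_lt = bool(sign) != overflow  # SF != OF
    return zero, borrow, signed_lt

def branch_taken(op: str, a: int, b: int, width: int = 32) -> bool:
    """Decide a RISC-V conditional branch from the three ALU outputs."""
    zero, borrow, signed_lt = alu_compare(a, b, width)
    return {
        "beq": zero,
        "bne": not zero,
        "blt": signed_lt,
        "bge": not signed_lt,
        "bltu": borrow,
        "bgeu": not borrow,
    }[op]

print(branch_taken("blt", 9, 10))                   # the example from the question
print(branch_taken("blt", 0x80000000, 0x7FFFFFFF))  # INT_MIN < INT_MAX (signed)
print(branch_taken("bltu", 0x80000000, 0x7FFFFFFF)) # same bits compared as unsigned
```

The last two calls use the same bit patterns but give opposite answers, which is the whole difference between blt and bltu.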
You don't have to do subtraction to compare two (signed or unsigned) numbers.
You can use cascaded 7485 chips (4-bit magnitude comparators), for example.
With this chip you can do all Branch computation without doing any subtraction.
I've been looking around for info on how Ethereum deals with jumps and jump destinations. From various blogs and the yellow paper what I found is as follows:
The operand taken by JUMP and the first of the two operands taken by JUMPI are the value that the PC is set to (assuming the second stack value != 0 in the case of JUMPI).
However, looking at this contract's creation code (as opcodes) the first few opcodes/values are:
PUSH1 0x60
PUSH1 0x40
MSTORE
CALLDATASIZE
ISZERO
PUSH2 0x00f8
JUMPI
As I understand it this means that if the value pushed to the stack by ISZERO != 0 then PC will change to 0x00f8 as JUMPI takes two from the stack, checks if the second is 0 and if not sets PC to the value of its first operand.
The problem I am having is that 0x00f8 in decimal is 248. The 248th position in the contract appears to be MSTORE and not a JUMPDEST, which would cause the contract to fail in its execution as JUMP* can only point to a valid JUMPDEST.
Presumably contracts don't jump to invalid destinations on purpose?
If anyone could explain how jumps and jump destinations are resolved I would be very grateful.
In case it helps others:
The confusion arose from the EVM reading byte by byte and NOT word by word.
From the example in the question, 0x00f8 would be the 248th byte, not the 248th word.
As each opcode is 1 byte long, the PC is normally incremented by 1 when reading an opcode.
However in the case of a PUSH instruction, information on how many of the following bytes are to be taken as its operand is also included.
For example PUSH2 takes the 2 bytes that follow it, PUSH6 takes 6 bytes that follow it, and so on. Here PC would be incremented by 1 for the PUSH and then 2 or 6 respectively for each byte of the data used by the PUSH.
Just want to point out that there is a difference in JUMP and JUMPI.
JUMP takes just 1 element from the stack: the destination, which is generally an offset pushed onto the stack as a hex value.
JUMPI is a conditional jump that takes the top 2 elements from the stack: the destination and the condition.
In the example you gave, the condition is the result of ISZERO (which checks whether the topmost element of the stack is 0).
So if that returns true, execution jumps to the destination, which is the offset 0x00f8 (248 in decimal).
If the condition is false, it just increases the program counter by 1.
In the contract you mentioned, there is a JUMPDEST opcode at program counter 248.
The program counter of each opcode depends on the opcodes before it, e.g. how many bytes of immediate data each PUSH carries. For example:
PUSH1 0x60 - PC[0]
PUSH1 0x40 - PC[2]
MSTORE - PC[4]
CALLDATASIZE- PC[5]
ISZERO - PC[6]
PUSH2 0x00f8- PC[7]
JUMPI - PC[10]
Maybe this website will give you a better understanding on opcodes https://ethervm.io/
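The byte offsets listed above can be reproduced with a short Python sketch that walks the bytecode and skips PUSH data. The helper names opcode_offsets and is_valid_jumpdest are made up for illustration; the opcode values used (PUSH1..PUSH32 = 0x60..0x7f, JUMPDEST = 0x5b) are the standard EVM ones.

```python
def opcode_offsets(code: bytes):
    """Return a list of (pc, opcode_byte) for every instruction in the bytecode."""
    PUSH1, PUSH32 = 0x60, 0x7F
    pc = 0
    instructions = []
    while pc < len(code):
        op = code[pc]
        instructions.append((pc, op))
        if PUSH1 <= op <= PUSH32:
            # PUSHn carries n bytes of immediate data after the opcode byte.
            pc += 1 + (op - PUSH1 + 1)
        else:
            pc += 1
    return instructions

def is_valid_jumpdest(code: bytes, dest: int) -> bool:
    """A jump target must be a JUMPDEST opcode, not a 0x5b byte inside PUSH data."""
    JUMPDEST = 0x5B
    return (dest, JUMPDEST) in opcode_offsets(code)

# First bytes of the contract from the question:
# PUSH1 0x60, PUSH1 0x40, MSTORE, CALLDATASIZE, ISZERO, PUSH2 0x00f8, JUMPI
prefix = bytes.fromhex("606060405236156100f857")
for pc, op in opcode_offsets(prefix):
    print(f"PC[{pc}] opcode 0x{op:02x}")
```

Counting this way, 0x00f8 is the byte at offset 248, not the 248th instruction.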
So I have a word, and I want to loop through and test the left most bit. I have my word and I'm passing it to my subroutine, I know how to build a loop, I'm just not sure how to test the left most bit in the word.
Thanks for any help
The best way to do this is with bit masking -- perform a bitwise AND between the word you want to check and a bit mask with a 1 in any position you wish to test. i.e. in binary:
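my word: 11
bitmask: 10
result:  10 (my word & bitmask)
You can see that the 1 on the right side drops out, while the 1 on the left survives because the mask has a 1 in that position. So to do something similar on a 16-bit number: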
0x0230 & 0x8000 = 0x0000
0xC020 & 0x8000 = 0x8000 != 0x0000
The important thing to note here is that if the bit is not present the AND returns a 0, and if the bit is present it returns something else. It doesn't matter what it is, just that it's not zero.
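The same test written as a small Python sketch, assuming a 16-bit word (the helper name msb_is_set is made up for illustration):

```python
def msb_is_set(word: int, width: int = 16) -> bool:
    """Test the leftmost bit of a width-bit word with a bit mask."""
    mask = 1 << (width - 1)       # 0x8000 for a 16-bit word
    return (word & mask) != 0     # any nonzero result means the bit was set

print(msb_is_set(0x0230))  # top bit clear
print(msb_is_set(0xC020))  # top bit set
```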
Not sure if it applies to your specific task, but a simple approach could be performing a logical/arithmetic left shift on the word. This is simply done by adding the word to itself (which is equal to multiplying by 2 and thus shifting all the bits to the left by 1 "spot"). After doing this, the condition codes will be set (assuming you're using the GPRs) and you can test whether the left most bit is a 1 or a 0 by checking if the shifted word is positive OR zero (hence the left most bit is a 0), or negative (hence the left most bit is a 1). Loop over the whole word following this approach and you'll be able to determine the value of each bit in your word. Hope this helps.
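A rough Python model of that loop, assuming a 16-bit word. The "negative" check stands in for the N condition code, and this sketch tests the current word before each shift so that the first bit reported is the original left most bit; the function name bits_msb_first is made up for illustration.

```python
def bits_msb_first(word: int, width: int = 16):
    """Read the bits of a word from left to right using add-to-self as a left shift."""
    mask = (1 << width) - 1
    word &= mask
    bits = []
    for _ in range(width):
        # In two's complement the word is "negative" exactly when its top bit is 1.
        is_negative = word >= (1 << (width - 1))
        bits.append(1 if is_negative else 0)
        word = (word + word) & mask   # adding the word to itself shifts it left by one
    return bits

print(bits_msb_first(0xC020))
```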
I have been hanging on it too much time...
I've read «A Painless Guide to CRC Error Detection Algorithms» several times. Maybe I don't completely understand the theory, but the practice seemed as clear as the sky, and yet something is wrong.
I'm not asking about code or a particular implementation, but about the concept (a plain method).
I do this:
1. Take a single byte.
2. Take a uint and fill it with 0xffffffff.
3. Check if the highest bit is 1.
4. Shift one bit to the left.
5. Put the next bit from source byte.
6. If the check in step 3 was true, then XOR it with 0x04C11DB7.
7. After data is end, reverse (reflect) working uint.
8. XOR it with 0xffffffff
And it works... but only with zeros (I've checked 1,2,3,4 bytes of zeros). But when I take a byte 0x01 it fails (online calculators show different result). I just can't catch what am I doing wrong.
Step by step (mine version with lowest bit first):
01.Initialization 0xffffffff
02.Shift<< 0xfffffffe
03.Place that single 1 0xffffffff
04.XOR 0xfb3ee248
05.Shift<< 0xf67dc490
06.XOR 0xf2bcd927
07.Shift<< 0xe579b24e
08.XOR 0xe1b8aff9
09.Shift<< 0xc3715ff2
10.XOR 0xc7b04245
11.Shift<< 0x8f60848a
12.XOR 0x8ba1993d
13.Shift<< 0x1743327a
14.XOR 0x13822fcd
15.Shift<< 0x27045f9a
16.Shift<< 0x4e08bf34
17.Reflect 0x2cfd1072
18.XOR (0xffffffff) 0xd302ef8d (the result)
Please help! What is wrong with it?
At last, I've got the recipe. It took a lot of time, but I reinvented it ))
Share it with anyone, who need it:
1. Take the first 4 bytes of the message (if it is shorter than 4 bytes, pad it with zeros). You may need to reflect the bits in EVERY byte (I had to, but I think it depends on the particular architecture). Put the result into Register (a uint).
2. XOR Register with 0xFFFFFFFF.
3. Shift Register one bit to the left.
4. Place the next bit of the message (lowest bit first) into the right side of Register.
5. If the bit shifted out was 1, then XOR Register with 0x04C11DB7.
6. Repeat steps 3-5 until the end of the message.
7. Repeat steps 3-5 for 32 zero bits (if the message is shorter than 32 bits, this number must match the input length instead).
8. Reflect the bits of the whole Register.
9. XOR Register with 0xFFFFFFFF.
That's it - you have the CRC32, which all online calculators show and, at least, correct for deflate, PNG, etc.
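Here is a minimal Python sketch of those steps, checked against zlib.crc32. It is one possible reading of the recipe, not the only one: loading the first 4 bytes and XORing them with 0xFFFFFFFF is modelled by inverting the first 32 bits of the bit stream, and the zero padding is covered by the 32 zero bits appended to the message. The names reflect and crc32_bitwise are made up for illustration.

```python
import zlib

POLY = 0x04C11DB7

def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def crc32_bitwise(data: bytes) -> int:
    # Message bits, lowest bit of each byte first (i.e. every byte reflected),
    # followed by 32 zero bits.
    bits = [(byte >> i) & 1 for byte in data for i in range(8)]
    bits += [0] * 32
    # XORing the first 32 bits with ones is the same as loading them into the
    # register and XORing the register with 0xFFFFFFFF.
    for i in range(32):
        bits[i] ^= 1
    reg = 0
    for bit in bits:
        shifted_out = (reg >> 31) & 1
        reg = ((reg << 1) & 0xFFFFFFFF) | bit   # shift left, bring in the next bit
        if shifted_out:
            reg ^= POLY
    return reflect(reg, 32) ^ 0xFFFFFFFF        # reflect the register, final XOR

for message in (b"\x00", b"\x01", b"123456789"):
    print(message, hex(crc32_bitwise(message)), hex(zlib.crc32(message)))
```

A table-driven or reflected-polynomial version would be much faster; this one only mirrors the bit-by-bit description.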
I have been instructed by my teacher to prepend a 0 to hexadecimal numbers when writing instructions, as some assemblers look for a leading 0 in an operand to differentiate a number from a label. I am confused: if the number already starts with a 0, what should be done in such a case?
For Example,
AND BL, 0FH
Is there a need of adding 0 before that hexa number or not? Please help me out. Thanks
EDIT:
Sorry if I was not clear enough before. What I meant was that in the above example, a 0 is already present; do I need to convert it to,
AND BL, 00FH
Except for the special cases like 0 or 1, I tend to encode my hex numbers with the full complement of digits just so it's easier to see what the intent is:
mov al, 09h
mov ax, 0123h
and so on.
For cases where the number starts with an alpha character (like deadbeef), I prefix it with an extra 0.
But no, it's not usually (a) necessary to do this if your hex number already begins with a digit.
In any case, I'd be putting most numbers into an equ statement rather than sprinkling magic numbers throughout the code. Would you rather see:
mov ax, 80
or:
mov ax, lines_per_screen
(a) Of course, it depends on your assembler but, from memory, all the ones I've used work this way.
No, there's no need (and including more than one leading 0 is fairly unusual).
Your example is an apt one though -- without the leading 0 to tell it this was a number, the assembler would normally interpret FH as a symbol rather than a number.
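A toy Python sketch of that rule, assuming an assembler that treats any operand starting with a decimal digit as a number and anything else as a symbol. The function names classify_operand and parse_hex_literal are hypothetical and do not come from any real assembler.

```python
import re

def classify_operand(token: str) -> str:
    """Numbers must start with a decimal digit; everything else is a symbol."""
    return "number" if token[0].isdigit() else "symbol"

def parse_hex_literal(token: str) -> int:
    """Parse an h-suffixed hex literal such as 0FH or 0123h."""
    if not re.fullmatch(r"[0-9][0-9A-Fa-f]*[Hh]", token):
        raise ValueError(f"not a hex literal: {token!r}")
    return int(token[:-1], 16)    # drop the trailing H and convert

print(classify_operand("FH"), classify_operand("0FH"))
print(parse_hex_literal("0FH"), parse_hex_literal("00FH"))
```

Under this rule 0FH and 00FH are the same number, so the second leading zero adds nothing.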