I've been looking around for info on how Ethereum deals with jumps and jump destinations. From various blogs and the yellow paper what I found is as follows:
The operand taken by JUMP and the first of the two operands taken by JUMPI are the value the PC is set to (assuming the second stack value != 0 in the case of JUMPI).
However, looking at this contract's creation code (as opcodes) the first few opcodes/values are:
PUSH1 0x60
PUSH1 0x40
MSTORE
CALLDATASIZE
ISZERO
PUSH2 0x00f8
JUMPI
As I understand it, this means that if the value pushed to the stack by ISZERO != 0, then PC will change to 0x00f8: JUMPI takes two values from the stack, checks whether the second is 0, and if not, sets PC to the value of its first operand.
The problem I am having is that 0x00f8 in decimal is 248. The 248th position in the contract appears to be MSTORE and not a JUMPDEST, which would cause the contract to fail in its execution as JUMP* can only point to a valid JUMPDEST.
Presumably contracts don't jump to invalid destinations on purpose?
If anyone could explain how jumps and jump destinations are resolved I would be very grateful.
In case it helps others:
The confusion arose from the EVM reading byte by byte and NOT word by word.
From the example in the question, 0x00f8 would be the 248th byte, not the 248th word.
As each opcode is 1 byte long PC is normally incremented by 1 when reading an opcode.
However in the case of a PUSH instruction, information on how many of the following bytes are to be taken as its operand is also included.
For example, PUSH2 takes the 2 bytes that follow it, PUSH6 takes the 6 bytes that follow it, and so on. PC is incremented by 1 for the PUSH opcode itself and then by 2 or 6 respectively to skip the data bytes, so the next instruction is read at PC + 3 or PC + 7.
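To make the byte-offset point concrete, here is a minimal Python sketch that walks bytecode one byte at a time and skips PUSH data. The function name `instruction_offsets` and the partial opcode table are my own for illustration; only the opcodes from the question are named, everything else is printed as raw hex.

```python
# Opcode values for the instructions that appear in the question.
NAMES = {0x15: "ISZERO", 0x36: "CALLDATASIZE", 0x52: "MSTORE",
         0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST"}

def instruction_offsets(code: bytes):
    """Return (pc, text) pairs, stepping over the immediate bytes of PUSHn."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:            # PUSH1 .. PUSH32
            n = op - 0x5F                 # number of data bytes that follow
            data = code[pc + 1: pc + 1 + n]
            out.append((pc, f"PUSH{n} 0x{data.hex()}"))
            pc += 1 + n                   # 1 for the opcode, n for its data
        else:
            out.append((pc, NAMES.get(op, f"0x{op:02x}")))
            pc += 1
    return out

# The first few bytes of the creation code from the question.
code = bytes.fromhex("606060405236156100f857")
for pc, text in instruction_offsets(code):
    print(pc, text)
```

With this input the JUMPI ends up at byte offset 10, even though it is only the seventh instruction.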
Just want to point out that there is a difference between JUMP and JUMPI.
JUMP takes 1 element from the stack: the destination, which is generally a byte offset that was pushed to the stack beforehand.
JUMPI is a conditional jump that takes the top 2 elements from the stack: the destination and the condition.
In the example you gave, the condition is the result of ISZERO (which pushes 1 if the topmost element of the stack is 0, and 0 otherwise).
So if that result is nonzero, it will jump to the destination, which is the offset 0x00f8 (248 in decimal).
If the condition is 0, it will just increase the program counter by 1.
In the contract you mentioned, it is a JUMPDEST opcode at (Program counter)248.
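A rough Python model of that behaviour, in case it helps. The names `valid_jumpdests` and `jumpi` are hypothetical, the stack is a plain list whose last item is the top, and this is only a sketch of the rule described here, not a full EVM.

```python
JUMPDEST = 0x5B

def valid_jumpdests(code: bytes) -> set:
    """Byte offsets that hold a JUMPDEST opcode and are not inside PUSH data."""
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        # PUSH1..PUSH32 (0x60..0x7f) are followed by 1..32 data bytes.
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return dests

def jumpi(pc: int, stack: list, code: bytes) -> int:
    """Return the next PC after executing a JUMPI located at `pc`."""
    dest = stack.pop()        # top of stack: destination
    cond = stack.pop()        # second item: condition
    if cond == 0:
        return pc + 1         # fall through to the next opcode
    if dest not in valid_jumpdests(code):
        raise ValueError(f"invalid jump destination {dest}")
    return dest
```

Note that a 0x5b byte inside PUSH data does not count as a destination, which is why the scan has to skip PUSH operands just like the program counter does.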
The program counter depends on the opcode: every opcode occupies 1 byte, and a PUSHn opcode is followed by n bytes of data that the program counter also has to step over. e.g.
PUSH1 0x60 - PC[0]
PUSH1 0x40 - PC[2]
MSTORE - PC[4]
CALLDATASIZE - PC[5]
ISZERO - PC[6]
PUSH2 0x00f8 - PC[7]
JUMPI - PC[10]
Maybe this website will give you a better understanding on opcodes https://ethervm.io/
I'm participating in a ctf where one task is to reverse a row of input bytes using an assembly-ish environment. The input is x bytes long and the last byte is always 0x00. One example would be :
Input 4433221100, output 0011223344
I'm thinking that a loop that loops until it reaches input 00 is a place to start.
Do any of you have a suggestion on how to approach this? I don't need specific code examples, but some advice to point me in the right direction would be great. I only have basic alu operations, jumps and conditional jumps, storing and reading memory addresses, and some other basic stuff available. All alu operations are mod 256.
Yes, searching for the 0 byte to find the end (and thus the length) is one way to start. Depending on where you want the destination, it's possible to copy in the same loop that searches for the end.
If you want to reverse in-place, you need to find the end first (with a separate loop). Then you can load from both ends, store registers to opposite locations, and walk your pointers inward until they cross, standard in-place reverse that you can find examples of anywhere.
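As a sketch of that two-pass in-place approach, here it is in Python with a `bytearray` standing in for memory. The function name is made up, and I'm assuming the terminating 0 is part of what gets reversed, since the example output in the question starts with 00.

```python
def reverse_in_place(mem: bytearray, start: int) -> int:
    """Reverse the 0-terminated run at mem[start:], terminator included.

    Returns the number of bytes reversed.
    """
    # Pass 1: walk forward until the terminating 0 byte is found.
    end = start
    while mem[end] != 0:
        end += 1

    # Pass 2: swap from both ends and walk the pointers inward
    # until they meet or cross.
    lo, hi = start, end
    while lo < hi:
        mem[lo], mem[hi] = mem[hi], mem[lo]
        lo += 1
        hi -= 1
    return end - start + 1
```

In real assembly the swap would be two loads into registers followed by two stores to the opposite addresses.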
If you want to make a reversed copy into other space, you could do it in one pass over the source (without finding the length first). Store output starting from the end of a buffer, decrementing the output pointer as you increment the read pointer. When you're done, you have a pointer to the start of the reversed copy, which you can pass to an output function. You won't know where you're going to stop, so the buffer needs to be big enough. But since you're just passing the pointer to another function, it's fine that you don't know (until you're done copying) where the start of the reversed copy will be.
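The one-pass variant could look roughly like this in Python. `reversed_copy` is a hypothetical name, `buf` is assumed to be large enough, and the returned index plays the role of the pointer to the start of the reversed copy.

```python
def reversed_copy(src: bytes, buf: bytearray) -> int:
    """Copy the 0-terminated run in `src` into the tail of `buf`, reversed.

    Returns the index in `buf` where the reversed copy starts.
    """
    out = len(buf)            # output pointer starts one past the end
    i = 0                     # read pointer
    while True:
        b = src[i]
        out -= 1              # output pointer walks backwards...
        buf[out] = b
        i += 1                # ...while the read pointer walks forwards
        if b == 0:            # the terminator is copied too, then we stop
            break
    return out

buf = bytearray(16)
start = reversed_copy(bytes.fromhex("4433221100"), buf)
print(start, buf[start:].hex())
```

The length is never computed up front; the only cost is that the caller must size the buffer for the worst case.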
You could still separately find the length and then copy, but that would be pointlessly inefficient.
If you need the reversed copy to start at some known position in another buffer (e.g. to append to another string or array), you would need the length or a pointer to the end before you store anything, so it's a 2-pass operation like reversing in-place.
You can then read the source backwards and write the destination forwards (or "output" each byte 1 at a time to some IO stream). Your loop termination condition could be a down-counter or a pointer compare using a pointer in a register, comparing src against the already-known start of the source or dst against the calculated end of the destination.
Or you can read the source forwards until you reach the position you found for the end, storing in reverse order starting from the calculated end of where the destination should go.
(If your machine is like 6502 and can easily index into a static array, but not easily keep a whole pointer in a register, obviously you'll want to use indices that count from 0. That makes detecting the start even easier, like sub reg, 1 / jnz if subtract already sets flags for a conditional branch to test.)
save your stack pointer in a variable
repeat
    read the next byte of the string
    push the byte onto the stack
until the byte was 0
repeat
    pull a byte from the stack
    output the byte
until the stack pointer is back at old_stackpointer
In 6502 assembler this could look like:
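        tsx                 ; save the current stack pointer
        stx OLD_STACKPTR
        ldy #$ff            ; Y wraps to 0 on the first iny
loop:
        iny
        lda INPUT,y         ; load the next byte (sets Z if it is 0)
        pha                 ; push it; pha does not change the flags
        bne loop            ; repeat until the 0 byte has been pushed
        ldy #$ff
loop2:
        iny
        pla                 ; pull the bytes back in reverse order
        sta INPUT,y         ; overwrite the input with the reversed bytes
        tsx
        cpx OLD_STACKPTR    ; done once the stack is back where it started
        bne loop2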
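For anyone who wants to trace the stack approach without a 6502 at hand, here is the same idea as a small Python sketch. A Python list stands in for the hardware stack, and `reverse_via_stack` is just a name I picked.

```python
def reverse_via_stack(mem: bytearray) -> None:
    """Reverse the 0-terminated run at the start of `mem`, using a stack."""
    stack = []                # an empty list plays the role of OLD_STACKPTR

    # First loop: push every byte, up to and including the terminating 0.
    y = 0
    while True:
        a = mem[y]
        stack.append(a)
        y += 1
        if a == 0:
            break

    # Second loop: pull until the stack is back at its saved depth,
    # writing the bytes over the input in the order they come off.
    y = 0
    while stack:
        mem[y] = stack.pop()
        y += 1

data = bytearray.fromhex("4433221100")
reverse_via_stack(data)
print(data.hex())
```

On a real 6502 the hardware stack is a single 256-byte page and Y is 8 bits wide, so this only works for short inputs.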
I know from
C Function alignment in GCC
that I can align functions using
__attribute__((optimize("align-functions=32")))
Now, what if I want a function to start at an "odd" address, as in, I want it to start at an address of the form 32(2k+1), where k is any integer?
I would like the function to start at address (decimal) 32 or 96 or 160, but not 0 or 64 or 128.
Context: I'm doing a research project on code caches, and I want a function aligned in one level of cache but misaligned in another.
GCC doesn't have options to do that.
Instead, compile to asm and do some text manipulation on that output. e.g. gcc -O3 -S foo.c then run some script on foo.s to odd-align before some function labels, before compiling to a final executable with gcc -o benchmark foo.s.
One simple (if wasteful) way, which costs between 32 and 95 bytes of padding, is:
.balign 64 # byte-align by 64
.space 32 # emit 32 bytes (of zeros)
starts_half_way_into_a_cache_line:
testfunc1:
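To see where the 32 to 95 byte figure comes from, here is the arithmetic as a small Python sketch. Both function names are hypothetical; `offset` is the position in the section before the directives, and `minimal_padding` is only the theoretical minimum, not something the directives above achieve.

```python
def simple_padding(offset: int) -> int:
    """Bytes emitted by `.balign 64` followed by `.space 32` at `offset`."""
    to_boundary = (-offset) % 64      # 0..63 bytes to the next 64-byte boundary
    return to_boundary + 32           # then always 32 more

def minimal_padding(offset: int) -> int:
    """Fewest bytes needed to reach an address with address % 64 == 32."""
    return (32 - offset) % 64

for off in (0, 1, 32, 33, 63):
    print(off, simple_padding(off), minimal_padding(off))
```

The worst case for the simple version is a function that would already have landed on an odd 32-byte boundary: it still gets pushed a full 64 bytes further.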
Tweaking GCC/clang output after compilation is in general a good way to explore what gcc should have done. All references to other code/data inside and outside the function use symbol names; nothing depends on relative distances between functions or absolute addresses until after you assemble (and link), so editing the asm source at this point is totally safe. (Another answer proposes copying final machine code around; that's very fragile, see the comments under it.)
An automated text-manipulation script will let you run your experiment on larger amounts of code. It can be as simple as
awk '/^testfunc.*:/ { print ".p2align 6; .skip 32"; print $0 }' foo.s
to do this before every label that matches the pattern ^testfunc.*. (Assuming no leading underscore name mangling.)
Or even use sed which has a convenient -i option to do it "in-place" by renaming the output file over the original, or perl has something similar. Fortunately, compiler output is pretty formulaic, for a given compiler it should be a pretty easy pattern-matching problem.
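If you'd rather keep the script in one file with your benchmark harness, the same text manipulation can be sketched in Python. The function name `odd_align` is made up; the pattern and the inserted directives are the ones from the awk one-liner, with the same assumption about no leading-underscore name mangling.

```python
import re

# Same pattern as the awk one-liner: a label starting with "testfunc".
LABEL = re.compile(r"^testfunc.*:")

def odd_align(asm_text: str) -> str:
    """Insert the alignment directives before every matching label line."""
    out = []
    for line in asm_text.splitlines():
        if LABEL.match(line):
            out.append(".p2align 6; .skip 32")
        out.append(line)
    return "\n".join(out) + "\n"

example = "\t.text\ntestfunc1:\n\tret\nother:\n\tret\n"
print(odd_align(example), end="")
```

You would read foo.s, pass its contents through this function, and write the result back out before assembling.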
Keep in mind that the effects of code-alignment aren't always purely local. Branches in one function can alias (in the branch-predictor) with branches from another function depending on alignment details.
It can be hard to know exactly why a change affects performance, especially if you're talking about early in a function where it shifts branch addresses in the rest of the function by a couple bytes. You're not talking about changes like that, though, just shifting the whole function around. But it will change alignment relative to other functions, so tests that call multiple functions alternating with each other, or if the functions call each other, can be affected.
Other effects of alignment include uop-cache packing on modern x86, as well as alignment relative to fetch blocks (beyond the obvious effect of leaving unused space in an I-cache line).
Ideally you'd only insert 0..63 bytes to reach a desired position relative to a 64-byte boundary. This section is a failed attempt at getting that to work.
.p2align and .balign (see footnote 1) support an optional 3rd arg which specifies a maximum amount of padding, so we're close to being able to do it with GAS directives. We can maybe build on that to detect whether we're close to an odd or even boundary by checking whether it inserted any padding or not. (Assuming we're only talking about 2 cases, not the 4 cases of 16-byte relative to 64-byte for example.)
# DOESN'T WORK, and maybe not fixable
1: # local label
.balign 64,,31 # pad with up to 31 bytes to reach 64-byte alignment
2:
.balign 32 # byte-align by 32, maybe to the position we want, maybe not
.ifne 2b - 1b
# there is space between labels 2 and 1 so that balign reached a 64-byte boundary
.space 32
.endif # else it was already an odd boundary
But unfortunately this doesn't work: Error: non-constant expression in ".if" statement. If the code between the 1: and 2: labels has fixed size, like .long 0xdeadbeef, it will assemble just fine. So apparently GAS won't let you query with a .if how much padding an alignment directive inserted.
Footnote 1: .align is either .p2align (power of 2) or .balign (byte) depending on which target you're assembling for. Instead of remembering which is which on which target, I'd recommend always using .p2align or .balign, not .align.
As this question is tagged assembly, here are two spots in my (NASM 8086) sources that "anti align" following instructions and data. (Here just with an alignment to even addresses, ie 2-byte alignment.) Both were based on the calculation done by NASM's align macro.
https://hg.ulukai.org/ecm/ldebug/file/683a1d8ccef9/source/debug.asm#l1161
times 1 - (($ - $$) & 1) nop ; align in-code parameter
call entry_to_code_sel, exc_code
https://hg.ulukai.org/ecm/ldebug/file/683a1d8ccef9/source/debug.asm#l7062
; $ - $$ = offset into section
; % 2 = 1 if odd offset, 0 if even
; 2 - = 1 if odd, 2 if even
; % 2 = 1 if odd, 0 if even
; resb (2 - (($-$$) % 2)) % 2
; $ - $$ = offset into section
; % 2 = 1 if odd offset, 0 if even
; 1 - = 0 if odd, 1 if even
resb 1 - (($-$$) % 2) ; make line_out aligned
trim_overflow: resb 1 ; actually part of line_out to avoid overflow of trimputs loop
line_out: resb 263
resb 1 ; reserved for terminating zero
line_out_end:
Here is a simpler way to achieve anti-alignment:
align 2
nop
This is more wasteful, though: it uses up 2 bytes if the target anti-alignment would already have been satisfied before this sequence. My prior examples will not reserve any more space than necessary.
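The difference between the two variants is easy to check with a little arithmetic. This Python sketch uses hypothetical function names, and `offset` stands for the section offset that NASM writes as $ - $$.

```python
def pad_exact(offset: int) -> int:
    """Bytes emitted by `times 1 - (($ - $$) & 1) nop` at `offset`."""
    return 1 - (offset & 1)           # 1 byte if even, 0 if already odd

def pad_align_then_nop(offset: int) -> int:
    """Bytes emitted by `align 2` followed by a single `nop` at `offset`."""
    return (offset & 1) + 1           # 0 or 1 to reach even, then 1 for the nop

for off in range(4):
    print(off, pad_exact(off), pad_align_then_nop(off))
```

Both leave the following item at an odd offset; the second form just spends 2 bytes instead of 0 whenever the offset was odd to begin with.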
I believe GCC only lets you align on powers of 2.
If you want to get around this for testing, you could compile your functions as position-independent code (-fPIC or -fPIE) and then write a separate loader that manually copies the function into an area that was mmap'd as read/write. You can then change the permissions to make it executable. Of course, for a proper performance comparison, you would want to make sure the aligned code that you are comparing it against was also compiled with -fPIC/-fPIE.
I can probably give you some example code if you need it, just let me know.
I tried to write the instruction x0000, which means BR with nzp=000 and offset 0.
I wrote BR #0 in the simulator.
Instead of giving me x0000, the simulator gives me 0x0E00, which means nzp is 111.
What is the correct way of doing this?
You cannot have nzp=000. A number is either negative, zero or positive.
According to this course on LC3 from UPenn
LC-3 has three 1-bit condition code registers
N - negative
Z - zero
P - positive (greater than zero)
Exactly one will be set at all times.
Based on the last instruction that altered a register.
You can use NOP: with nzp=000 the branch is never taken, so the PC just moves on to the next instruction.
Another option is to write LABEL .FILL x0000, because the encoding of BR with nzp=000 and offset 0 is just x0000.
BR will assemble to branch-always, which is exactly the same as BRnzp, which is why you're seeing that in the assembled code.
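The bit layout behind those numbers can be checked with a few lines of Python. `encode_br` is a hypothetical helper, not part of any LC-3 tool; it just packs the fields of the BR format (opcode 0000 in bits 15-12, n/z/p in bits 11-9, a 9-bit PC offset in bits 8-0).

```python
def encode_br(n: bool, z: bool, p: bool, offset: int) -> int:
    """Pack an LC-3 BR instruction into a 16-bit word."""
    if not -256 <= offset <= 255:
        raise ValueError("offset does not fit in 9 signed bits")
    # The opcode for BR is 0000, so bits 15-12 stay clear.
    return (n << 11) | (z << 10) | (p << 9) | (offset & 0x1FF)

print(hex(encode_br(True, True, True, 0)))     # what plain BR #0 assembles to
print(hex(encode_br(False, False, False, 0)))  # the x0000 word from the question
```

The first call gives 0xe00 and the second gives 0x0, which matches what the simulator shows and what .FILL x0000 would place in memory.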
I am referring to BrokenThorn's OS development tutorial, and currently reading the part on developing a complete first stage bootloader that loads the second stage - Bootloaders 4.
In the part of converting Logical Block Address (LBA) to Cylinder-Head-Sector (CHS) format, this is the code that is used -
LBACHS:
xor dx, dx ; prepare dx:ax for operation
div WORD [bpbSectorsPerTrack] ; divide by sectors per track
inc dl ; add 1 (absolute sector formula)
mov BYTE [absoluteSector], dl
xor dx, dx ; prepare dx:ax for operation
div WORD [bpbHeadsPerCylinder] ; mod by number of heads (absolute head formula)
mov BYTE [absoluteHead], dl ; everything else was already done from the first formula
mov BYTE [absoluteTrack], al ; not much else to do :)
ret
I am not able to understand the logic behind this conversion. I tried using a few sample values to walk through it and see how it works, but that got me even more confused. Can someone explain how this conversion works and the logic used?
I am guessing that your LBA value is being stored in AX as you are performing division on some value.
As some background, absoluteSector is the CHS sector number, absoluteHead is the CHS head number, and absoluteTrack is the CHS cylinder number. For this purpose, cylinder and track are the same number under a different name.
Also, the 16-bit DIV in your code takes whatever is in the DX:AX register pair and divides it by the operand. The remainder of the division ends up in the DX register, while the quotient ends up in the AX register. That is why DX is cleared with xor dx, dx before each DIV.
Next, the *X registers are 16-bit registers, where * is one of ABCD. They are made up of a high and a low component, referred to as *H and *L respectively. For example, the DX register has DH for the upper 8 bits and DL for the lower 8 bits.
Finally, the BYTE and WORD modifiers simply state the size of the data that will be used/transferred.
The first value you must extract is the sector number, which is obtained by dividing the LBA value by the number of sectors per track and taking the remainder. The DL register will then contain the sector number minus one. This is because sector numbering starts at 1, unlike most values, which start at zero. To fix this, we add one to the DL register to get the correct sector value. This value is stored in memory at absoluteSector.
The next value you must extract is the head number, which is obtained by dividing the quotient of the last DIV operation (still in AX) by the number of heads per cylinder. The remainder in the DL register is the head number, which we store at absoluteHead.
Finally, we get the track number. The last division already produced it as the quotient, which is in the AL register. We then store this value at absoluteTrack.
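If it helps to try sample values, the same two divisions can be written as a short Python sketch. The function name is my own, and the geometry used in the example (18 sectors per track, 2 heads) is just an assumed floppy layout for illustration.

```python
def lba_to_chs(lba: int, sectors_per_track: int, heads_per_cylinder: int):
    """Mirror the two DIVs in LBACHS and return (cylinder, head, sector)."""
    # First DIV: quotient stays in AX, remainder lands in DX (DL).
    quotient, remainder = divmod(lba, sectors_per_track)
    sector = remainder + 1            # inc dl: sectors count from 1

    # Second DIV: divide the previous quotient by the number of heads.
    cylinder, head = divmod(quotient, heads_per_cylinder)
    return cylinder, head, sector

for lba in (0, 17, 18, 36, 2879):
    print(lba, lba_to_chs(lba, 18, 2))
```

For example, LBA 18 is the first sector on the second head of cylinder 0, so it comes out as cylinder 0, head 1, sector 1.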
Hope this cleared things up a little bit.
-Adrian
I have been instructed by my teacher to put a 0 before hex numbers when writing instructions, as some assemblers look for a leading 0 to differentiate a number from a label. I am confused about what should be done if the number already starts with a 0.
For Example,
AND BL, 0FH
Is there a need to add a 0 before that hex number or not? Please help me out. Thanks
EDIT:
Sorry if I was not clear enough before. What I meant was that in the above example, a 0 is already present; do I need to convert it to,
AND BL, 00FH
Except for the special cases like 0 or 1, I tend to encode my hex numbers with the full complement of digits just so it's easier to see what the intent is:
mov al, 09h
mov ax, 0123h
and so on.
For cases where the number starts with an alpha character (like deadbeef), I prefix it with an extra 0.
But no, it's not usually (a) necessary to do this if your hex number already begins with a digit.
In any case, I'd be putting most numbers into an equ statement rather than sprinkling magic numbers throughout the code. Would you rather see:
mov ax, 80
or:
mov ax, lines_per_screen
(a) Of course, it depends on your assembler but, from memory, all the ones I've used work this way.
No, there's no need (and including more than one leading 0 is fairly unusual).
Your example is an apt one though -- without the leading 0 to tell it this was a number, the assembler would normally interpret FH as a symbol rather than a number.
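The rule can be illustrated with a toy classifier in Python. This is only a sketch of the convention described above (a number must start with a decimal digit and a hex literal ends in H); the names `classify` and `to_hex_literal` are made up, and a real assembler's lexer will have more cases.

```python
import re

# A hex literal in this convention: starts with 0-9, then hex digits, then H.
HEX_LITERAL = re.compile(r"^[0-9][0-9A-Fa-f]*[Hh]$")

def classify(token: str) -> str:
    """Say whether a token would be read as a number or as a symbol."""
    return "number" if HEX_LITERAL.match(token) else "symbol"

def to_hex_literal(value: int) -> str:
    """Format a value, adding a leading 0 only when the first digit is A-F."""
    digits = format(value, "X")
    if digits[0] in "ABCDEF":
        digits = "0" + digits
    return digits + "H"

for tok in ("0FH", "FH", "00FH"):
    print(tok, classify(tok))
```

So 0FH and 00FH are both read as the number 15, while FH would be looked up as a symbol name.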