Adding breakpoints manually to assembly code - debugging

I have the following assembly code:
foo:
mov $0x1,%rax
callq bar
retq
bar:
dec %r8
cmp $0x0,%r8
je end
callq foo
mov $0x5,%rax
mov $0x1,%rdi
end:
retq
_start:
mov $0x3,%r8
callq 4000d4 <bar>
I want to debug the function foo manually by adding breakpoints at the start and end of the function (patching the int 3 opcode into the machine code). The problem is that the last time foo is called, it executes callq bar and does not return to the last line of foo, so the breakpoint there is never hit. How can we solve this problem?

Related

Putting assembly code into Macro creates an unknown token in expression error

I created a quine in assembly and wanted to put parts of the code into a macro. However, as soon as I do that, I get an error saying:
<instantiation>:9:6: error: unknown token in expression
mov , %rcx
Grace.s:28:1: note: while in macro instantiation
notmain(x)
Here is the code in question with the macro (Gets error)
.macro nomain
_main:
push %rbp
mov %rsp, %rbp
lea name(%rip), %rdi
lea write(%rip), %rsi
call _fopen
mov %rax, %rdi
mov $10, %rdx
mov $9, %rcx
mov $34, %r8
lea quine(%rip), %r9
lea quine(%rip), %rsi
call _fprintf
leave
ret
.endmacro
.data
quine: .string ".data%1$cquine: .string %3$c%4$s%3$c%1$cname: .string %3$cGrace_kid.s%3$c%1$cwrite: .string %3$cw%3$c%1$c%1$c.text%1$c.globl _main%1$c#main function%1$c_main:%1$c%2$cpush %%rbp%1$c%2$cmov %%rsp, %%rbp%1$c%2$clea name(%%rip), %%rdi%1$c%2$clea write(%%rip), %%rsi%1$c%2$ccall _fopen%1$c%2$cmov %%rax, %%rdi%1$c%2$cmov $10, %%rdx%1$c%2$cmov $9, %%rcx%1$c%2$cmov $34, %%r8%1$c%2$clea quine(%%rip), %%r9%1$c%2$clea quine(%%rip), %%rsi%1$c%2$ccall _fprintf%1$c%2$cleave%1$c%2$cret%1$c"
name: .string "Grace_kid.s"
write: .string "w"
.text
.globl _main
#main function
nomain
and without macro (Works)
.data
quine: .string ".data%1$cquine: .string %3$c%4$s%3$c%1$cname: .string %3$cGrace_kid.s%3$c%1$cwrite: .string %3$cw%3$c%1$c%1$c.text%1$c.globl _main%1$c#main function%1$c_main:%1$c%2$cpush %%rbp%1$c%2$cmov %%rsp, %%rbp%1$c%2$clea name(%%rip), %%rdi%1$c%2$clea write(%%rip), %%rsi%1$c%2$ccall _fopen%1$c%2$cmov %%rax, %%rdi%1$c%2$cmov $10, %%rdx%1$c%2$cmov $9, %%rcx%1$c%2$cmov $34, %%r8%1$c%2$clea quine(%%rip), %%r9%1$c%2$clea quine(%%rip), %%rsi%1$c%2$ccall _fprintf%1$c%2$cleave%1$c%2$cret%1$c"
name: .string "Grace_kid.s"
write: .string "w"
.text
.globl _main
#main function
_main:
push %rbp
mov %rsp, %rbp
lea name(%rip), %rdi
lea write(%rip), %rsi
call _fopen
mov %rax, %rdi
mov $10, %rdx
mov $9, %rcx
mov $34, %r8
lea quine(%rip), %r9
lea quine(%rip), %rsi
call _fprintf
leave
ret

Why is GDB filling the 0s of a memory address with 5s during a register info?

I am using GDB on an x64 CPU. As you can see, I am trying to read the value of the rip register, and for some reason the address of the instruction it points to is displayed with 5s where I expect 0s.
Dump of assembler code for function main:
0x0000000000001139 <+0>: push rbp
0x000000000000113a <+1>: mov rbp,rsp
0x000000000000113d <+4>: sub rsp,0x10
0x0000000000001141 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x0000000000001148 <+15>: mov DWORD PTR [rbp-0x4],0x0
0x000000000000114f <+22>: jmp 0x1161 <main+40>
0x0000000000001151 <+24>: lea rdi,[rip+0xeac] # 0x2004
0x0000000000001158 <+31>: call 0x1030 <puts@plt>
0x000000000000115d <+36>: add DWORD PTR [rbp-0x4],0x1
0x0000000000001161 <+40>: cmp DWORD PTR [rbp-0x4],0x9
0x0000000000001165 <+44>: jle 0x1151 <main+24>
0x0000000000001167 <+46>: mov eax,0x0
0x000000000000116c <+51>: leave
0x000000000000116d <+52>: ret
End of assembler dump.
(gdb) break main
Breakpoint 1 at 0x1141: file Desktop/myprogram.c, line 6.
(gdb) run
Starting program: /home/william/Desktop/a.out
Breakpoint 1, main () at Desktop/myprogram.c:6
6 int i = 0;
(gdb) info register rip
rip 0x555555555141 0x555555555141 <main+8>
As you can see, the rip register contains the address of the mov instruction listed above, but for some reason all the 0s have been replaced with 5s. Any idea why?
Before a position-independent executable runs, it has no base address, so the addresses in the file are relative to 0. This matches what you'll see from objdump -drwC -Mintel /bin/ls or a similar command.
On running the executable to create a process, the OS's program-loader maps it to an address. x86-64 Linux chooses a page address that starts with 0x0000555555555... when GDB disables ASLR.
If you run it outside GDB, or with set disable-randomization off, then the address will still start with 0x000055555, but be randomized in some range.

How to set breakpoint at the very beginning of a method call in Xcode so that I can check $rdi $rsi etc?

I would like to set a breakpoint at the very beginning of a method call so that I can check its $rdi, $rsi, etc.
In Xcode, when I set a symbolic breakpoint at a method call say -[HelperClass doThingWithBlock:], it stops at the first line of the method body, which is already several instructions after the beginning of the method call, as shown in the disassembly of the method call below.
DebugBlock`-[HelperClass doThingWithBlock:]:
0x109844aa0 <+0>: pushq %rbp
0x109844aa1 <+1>: movq %rsp, %rbp
0x109844aa4 <+4>: subq $0x30, %rsp
0x109844aa8 <+8>: leaq -0x18(%rbp), %rax
0x109844aac <+12>: movq %rdi, -0x8(%rbp)
0x109844ab0 <+16>: movq %rsi, -0x10(%rbp)
0x109844ab4 <+20>: movq $0x0, -0x18(%rbp)
0x109844abc <+28>: movq %rax, %rdi
0x109844abf <+31>: movq %rdx, %rsi
0x109844ac2 <+34>: callq 0x109844c74 ; symbol stub for: objc_storeStrong
0x109844ac7 <+39>: leaq 0x15a2(%rip), %rax ; @"hi"
0x109844ace <+46>: movl $0x16, %ecx
0x109844ad3 <+51>: movl %ecx, %edx
-> 0x109844ad5 <+53>: movq -0x18(%rbp), %rsi
0x109844ad9 <+57>: movq %rsi, %rdi
0x109844adc <+60>: movq %rsi, -0x20(%rbp)
0x109844ae0 <+64>: movq %rax, %rsi
0x109844ae3 <+67>: movq -0x20(%rbp), %rax
0x109844ae7 <+71>: callq *0x10(%rax)
0x109844aea <+74>: xorl %ecx, %ecx
0x109844aec <+76>: movl %ecx, %esi
0x109844aee <+78>: leaq -0x18(%rbp), %rdx
0x109844af2 <+82>: movq %rdx, %rdi
0x109844af5 <+85>: movb %al, -0x21(%rbp)
0x109844af8 <+88>: callq 0x109844c74 ; symbol stub for: objc_storeStrong
0x109844afd <+93>: addq $0x30, %rsp
0x109844b01 <+97>: popq %rbp
0x109844b02 <+98>: retq
Jason's advice is good for more complex problems, but this is a pretty common requirement, so break set has an option specifically to control pushing the breakpoint past the prologue:
(lldb) break set -n main --skip-prologue 0
lldb will advance the breakpoint location to the first source line in the function when you have debug information. The idea is that most people with source level information are more interested in printing the arguments with their names instead of looking at the registers that were used to pass them in.
If you were working with a simple C function, you could set an address breakpoint evaluating the function name to an address, e.g.
(lldb) br s -n main
Breakpoint 1: where = a.out`main + 11 at a.c:3, address = 0x0000000100000f8b
(lldb) br s -a `main`
Breakpoint 2: address = 0x0000000100000f80
(lldb)
The backtick notation `` evaluates the expression in the backtick to an address/value. And breakpoint set --address works as you'd expect.
There's even some special magic built in where you don't need backticks for things that are expecting an address, like br s -a. You can do br s -a main and it will work -- and as a very special bonus, you can add offsets to functions here like br s -a main+5 which is normally not a valid C expression.
Unfortunately we're working with an objc method -[HelperClass doThingWithBlock:] which you can't drop in as an expression like I did with main. I think in this case you're going to need to find the address of it yourself, e.g. you might disassemble an instruction at the start like dis -c 1 -n '-[ViewController setRepresentedObject:]', and then feed that address into br s -a.

x86 asm - 12 bytes subtracted from esp. Only 8 needed

I've compiled this code with gcc (gcc -ggdb -mpreferred-stack-boundary=2 -o demo demo.c) and disassembled it to look at the assembly. (I know it uses unsafe functions; this was an exercise in buffer overflows.)
#include<stdio.h>
CanNeverExecute()
{
printf("I can never execute\n");
exit(0);
}
GetInput()
{
char buffer[8];
gets(buffer);
puts(buffer);
}
main()
{
GetInput();
return 0;
}
Here is the assembly for the GetInput() Function:
(gdb) disas GetInput
Dump of assembler code for function GetInput:
0x08048432 <+0>: push ebp
0x08048433 <+1>: mov ebp,esp
0x08048435 <+3>: sub esp,0xc
=> 0x08048438 <+6>: lea eax,[ebp-0x8]
0x0804843b <+9>: mov DWORD PTR [esp],eax
0x0804843e <+12>: call 0x8048320 <gets@plt>
0x08048443 <+17>: lea eax,[ebp-0x8]
0x08048446 <+20>: mov DWORD PTR [esp],eax
0x08048449 <+23>: call 0x8048340 <puts@plt>
0x0804844e <+28>: leave
0x0804844f <+29>: ret
End of assembler dump.
Here is the assembly for the Main() Function:
(gdb) disas main
Dump of assembler code for function main:
0x08048450 <+0>: push ebp
0x08048451 <+1>: mov ebp,esp
0x08048453 <+3>: call 0x8048432 <GetInput>
0x08048458 <+8>: mov eax,0x0
0x0804845d <+13>: pop ebp
0x0804845e <+14>: ret
End of assembler dump.
I've set a breakpoint at line 13 (gets(buffer)).
From main(), I can see that the ebp value is pushed onto the stack. Then, when GetInput() is called, the return address is also pushed onto the stack. Once inside GetInput(), the ebp value is pushed onto the stack again.
Now this is where I get confused:
0x08048435 <+3>: sub esp,0xc
The buffer variable is only 8 bytes, so 8 bytes should be subtracted from esp to allow for the buffer local variable.
The stack:
(gdb) x/8xw $esp
0xbffff404: 0x08048360 0x0804847b 0x002c3ff4 0xbffff418
0xbffff414: 0x08048458 0xbffff498 0x00147d36 0x00000001
(gdb) x/x &buffer
0xbffff408: 0x0804847b
0x08048458 is the return address, 0xbffff418 is the old value of ebp, and the first 4 bytes of the buffer variable hold 0x0804847b, so I guess the other 4 bytes are 0x002c3ff4. But there seem to be another 4 bytes on the stack.
So my question is, why is it subtracting 12 bytes if only 8 bytes is needed? What's the extra 4 bytes for?
Thank you
It's because of this instruction:
mov DWORD PTR [esp],eax
Apparently, your puts and gets implementations take their argument on the stack.
The value at [ebp-0xc] is actually [esp] now, which is why that dword is reserved ahead of time.
Why is it done this way? It is more efficient: you don't have to pop and push, you just move eax to [esp], so you save at least one instruction. I guess this code has gone through some optimization, because this is clever.

Illegal instruction in Assembly

I really do not understand why this simple code works fine on the first attempt, but when I put it in a procedure, an error appears:
NTVDM CPU has encountered an illegal instruction
CS:db22 IP:4de4 OP:f0 ff ff ff ff
The first code segment works just fine:
.model small
.stack 100h
.code
start:
mov ax,@data
mov ds,ax
mov es,ax
MOV AH,02H ;sets cursor up
MOV BH,00H
MOV DH,02
MOV DL,00
INT 10H
EXIT:
MOV AH,4CH
INT 21H
END
However, this generates an error:
.model small
.stack 100h
.code
start:
mov ax,@data
mov ds,ax
mov es,ax
call set_cursor
PROC set_cursor near
MOV AH,02H ;sets cursor up
MOV BH,00H
MOV DH,02
MOV DL,00
INT 10H
RET
set_cursor ENDP
EXIT:
MOV AH,4CH
INT 21H
END
Note: Nothing is wrong with the Windows configuration. I have tried many sample programs that work fine.
Thanks
You left out a JMP:
call set_cursor
jmp EXIT ; <== you forgot this part
PROC set_cursor near
What's happening is that after call set_cursor returns, you fall through into the proc and execute it again. Then, when you hit the ret, it pops the stack and you jump to, well, who knows?
Edit: As someone else pointed out, you're better off putting your PROC after your main code ends, instead of sticking it in the middle and jumping around it. But you've probably figured that out already :)
You should move the code of the procedure after the part where you exit the program (or follow egrunin's advice).
The reason for the crash is that the code in the procedure is executed a second time after the call returns. During that second execution, the code crashes on RET because there is no valid return address on the stack.

Resources