How to fix gdb skipping breakpoints - debugging

I wrote an assembly code to test multiplication and division with integers. After assembling the code with -g and compiling it with gcc -g; I put a series of breakpoints in GDB, before and after the arithmetic operations.
Some breakpoints are skipped and GDB ends the debugging session.
What could be happening? I attach example code and picture of it.
Debian 10.8 (gcc 8.3.0-6)and Nasm assembler.
I would appreciate the help.
test code
section .bss
data: resq 1
section .text
global main
main:
push rbp
mov rbp,rsp
mov rax,77
mov rbx,2
mul rbx
mov [rel data],rax
mov rsp,rbp
pop rbp
mov rax,60
mov rdi,0
syscall
ok this is the gdb output; consider two breakpoints, one is mul bx, and the other on mov rsp,rbp; before the syscall exit.
Breakpoint 2, in__libc_csu_init()
(gdb)c
Continuing.
[inferior 1 process(1974) exited normally]
(gdb)

Related

Compile assembly (NASM with int 80h system calls) in windows

I am learning to code assembly (NASM). But i have problem, i am coding online but i want to convert this code below to exe and run it. (By clicking double click on it, not in cmd). And i dont have a clue how to do it. i know i must use a nasm from https:://www.nasm.us and a linker. For the linker i want to use ld from mingw. but i dont know how to do it. i didnt find any thing on the internet
section .data
msg: db "Eneter your name : ", 10
msg_l: equ $-msg
hello: db "Hello, "
hello_l: equ $-hello
section .bss
name: resb 255
section .text
global _start:
_start:
mov eax, 4
mov ebx, 1
mov ecx, msg
mov edx, msg_l
int 80h
mov eax, 3
mov ebx, 0
mov ecx, name
mov edx, 255
int 80h
mov eax, 4
mov ebx, 1
mov ecx, hello
mov edx, hello_l
int 80h
mov eax, 4
mov ebx, 1
mov ecx, name
mov edx, 255
int 80h
mov eax, 1
mov ebx, 0
int 80h
The best way to work with linux in windows is to use wsl2. The windows subsystem will allow you to use real linux system calls. There is a learning curve but its worth it.
Follow a guide on how to install ws2.
Go to the windows store and download one of the few linux terminals. I use ubuntu.
Install gcc in the terminal so that you will have a gnu compiler, gnu assembler, and the gnu linker(ld).
Install nasm in the terminal. Not the windows app version.
After everything is set up, you can get a nice workflow going.
you would open the terminal.
change directories which will get you into the c drive: cd /mnt/c
create a folder in the c drive where you want to do your work
change directories to that folder: cd foldername
create a nasm asm file and put some code into it.
then you can use nasm to assemble, ld to link, execute
When you assemble with nasm you can now use elf:
nasm -f elf32 main.asm
ld -m elf_i386 main.o -o main
./main
or:
nasm -f elf64 main.asm
ld main.o -o main
./main

Implementing the "." word from Forth in x86 assembly

I am trying to make a function that, prints a number out on screen. Eventually, I'll make it able to take the top stack item, print it, and then pop it (like the "." word in Forth). But for now, I am trying to keep it simple. I think that I need to align the call stack in some way - and I figured that pushing and popping an arbitrary register before and after calling printf (rbx) would do the trick - but I am still getting a segmentation fault. A backtrace in GDB hasn't helped me make any progress either. Does anyone know why this code is causing a segmentation fault, and how to fix it?
How I am assembling (GAS):
gcc -masm=intel
.data
format_num: .ascii "%d\0"
.text
.global _main
.extern _printf
print_num:
push rbx
lea rdi, format_num[RIP]
mov esi, 250
xor eax, eax
call _printf
pop rbx
ret
_main:
call print_num
mov rdi, 0
mov rax, 0x2000001
syscall

Successive sys_write syscalls not working as expected, NASM bug on OS X?

I'm trying to learn MacOS assembly using NASM and I can't get a trivial program to work. I'm trying a variation of the "Hello, World" where the two words are independently called by a macro. My source code looks like this:
%macro printString 2
mov rax, 0x2000004 ; write
mov rdi, 1 ; stdout
mov rsi, %1
mov rdx, %2
syscall
%endmacro
global start
section .text
start:
printString str1,str1.len
printString str2,str2.len
mov rax, 0x2000001 ; exit
mov rdi, 0
syscall
section .data
str1: db "Hello,",10,
.len: equ $ - str1
str2: db "world",10
.len: equ $ - str2
The expected result should be:
$./hw
Hello,
World
$
Instead I get:
$./hw
Hello,
$
What am I missing? How do I fix it?
EDIT: I am compiling & running with the following commands:
/usr/local/bin/nasm -f macho64 hw.asm
ld -macosx_version_min 10.7.0 -lSystem -o hw hw.o
./hw
NASM 2.11.08 and 2.13.02+ have bugs with macho64 output. What you are observing seems to be something I saw specifically with 2.13.02+ recently when using absolute references. The final linked program has incorrect fixups applied so the reference to str2 is incorrect. The incorrect fixup causes us to print out memory that isn't str2.
NASM has a bug report about this issue in their system. I have added a specific example of this failure based on the code in the question. Hopefully the NASM developers will be able to reproduce the failure and create a fix.
Update: As of June 2018 my view is that there are enough recurring bugs and regressions in NASM that I do not recommend NASM at this point in time for Macho-64 development.
Another recommendation I have for Macho-64 development is to use RIP relative addressing rather than absolute. RIP relative addressing is the default for 64-bit programs on later versions of MacOS.
In NASM you can use the default rel directive in your file to change the default from absolute to RIP relative addresses. For this to work you will have to change from using mov register, variable to lea register, [variable] when trying to move the address of a variable to a register. Your revised code could look like:
default rel
%macro printString 2
mov rax, 0x2000004 ; write
mov rdi, 1 ; stdout
lea rsi, [%1]
mov rdx, %2
syscall
%endmacro
global start
section .text
start:
printString str1,str1.len
printString str2,str2.len
mov rax, 0x2000001 ; exit
mov rdi, 0
syscall
section .data
str1: db "Hello,",10
.len: equ $ - str1
str2: db "world",10
.len: equ $ - str2

Assembly code of malloc

I want to view the assembly code of malloc(), calloc() and free() but when I print the assembly code on radare2 it gives me the following code:
push rbp
mov rbp, rsp
sub rsp, 0x10
mov eax, 0xc8
mov edi, eax
call sym.imp.malloc
xor ecx, ecx
mov qword [local_8h], rax
mov eax, ecx
add rsp, 0x10
pop rbp
ret
How can I see sym.imp.malloc function code? Is there any way to see the code or any website to see the assembly?
Since libc is an open-source library, it is freely available and you can simply read the source code.
The source-code of malloc is available on many places online (example), and you can view the source of different versions of libc under malloc/malloc.c here.
The symbol sym.imp.malloc is how radare flags the address of malloc in the PLT (Procedure Linkage Table) and not the function itself.
Reading the Assembly of the function can be done in several ways:
Open your local libc library with radare2, seek to malloc, analyze the function and then print its disassmbly:
$ r2 /usr/lib/libc.so.6
[0x00020630]> s sym.malloc
[0x0007c620]> af
[0x0007c620]> pdf
If you want to see malloc when linked to another binary you need to open the binary in debug mode, then step to main to make it load the library, then search for the address of malloc, seek to it, analyze the function and print the disassembly:
$ r2 -d /bin/ls
Process with PID 20540 started...
= attach 20540 20540
bin.baddr 0x00400000
Using 0x400000
Assuming filepath /bin/ls
asm.bits 64
[0x7fa764841d80]> dcu main
Continue until 0x004028b0 using 1 bpsize
hit breakpoint at: 4028b0
[0x004028b0]> dmi libc malloc~name=malloc$
vaddr=0x7fa764315620 paddr=0x0007c620 ord=4162 fwd=NONE sz=388 bind=LOCAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=5225 fwd=NONE sz=388 bind=LOCAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=5750 fwd=NONE sz=388 bind=GLOBAL type=FUNC name=malloc
vaddr=0x7fa764315620 paddr=0x0007c620 ord=7013 fwd=NONE sz=388 bind=GLOBAL type=FUNC name=malloc
[0x004028b0]> s 0x7fa764315620
[0x7fa764315620]> af
[0x7fa764315620]> pdf

Incremental linking causes unexpected disassembly for MASM program

A while ago I posted this question regarding strange behavior I was experiencing in trying to step through a MASM program.
Essentially, given the following code:
; Tell MASM to use the Intel 80386 instruction set.
.386
; Flat memory model, and Win 32 calling convention
.MODEL FLAT, STDCALL
; Treat labels as case-sensitive (required for windows.inc)
OPTION CaseMap:None
include windows.inc
include masm32.inc
include user32.inc
include kernel32.inc
include macros.asm
includelib masm32.lib
includelib user32.lib
includelib kernel32.lib
.DATA
BadText db "Error...", 0
GoodText db "Excellent!", 0
.CODE
main PROC
int 3
mov eax, 6
xor eax, eax
_label: add eax, ecx
dec ecx
jnz _label
cmp eax, 21
jz _good
_bad: invoke StdOut, addr BadText
jmp _quit
_good: invoke StdOut, addr GoodText
_quit: invoke ExitProcess, 0
main ENDP
END main
I could not get the int 3 instruction to trigger. It was clear why it didn't, examining the disassembly:
00400FFD add byte ptr [eax],al
00400FFF add ah,cl
--- [User path]\main.asm
mov eax, 6
00401001 mov eax,6
xor eax, eax
00401006 xor eax,eax
_label: add eax, ecx
The int 3 instruction had been replaced with add al,cl, but I had no idea why. I managed to track the problem to whether or not Incremental Linking was enabled. The above disassembly was generated with Incremental Linking disabled (/INCREMENTAL:NO option on the command line). Re-enabling it would result in something like the following:
.CODE
main PROC
int 3
00401010 int 3
mov eax, 6
00401011 mov eax,6
xor eax, eax
00401016 xor eax,eax
I should note that the interleaving lines are references back to the original code (I guess a feature of Visual Studio's disassembly window). With Incremental Linking enabled, the disassembly corresponds exactly to what I had written in the program, which is how I expected it to behave all along.
So, why would disabling Incremental Linking cause the disassembly of my program to be altered? What could be happening behind the scenes that would actually alter how the program executes?
The "add" instruction is a two byte instruction, the second of which is the 1-byte opcode of your int3. The first byte of the two byte add instruction is probably some garbage just before the entrypoint. The address of the add instruction is probably 1 byte before where the int3 instruction would be.
I quickly assembled and then disassembled those two instructions with GNU as en objdump, and the result is:
8: 00 cc add %cl,%ah
a: cc int3
Here you can clearly see that the the add instruction contains the second byte 0xcc, while int3 is 0xcc
IOW make sure that you start disassembling on the entry point to avoid this problem.

Resources