Debugging NASM local labels with gdb

I have been having some issues debugging code assembled by nasm with gdb: it seems like gdb doesn't do well with nasm local labels. nasm generates a local symbol named «function».label, which seems to confuse gdb, as it loses track of which function it is in.
Here is one scenario in which it gives a sub-optimal debugging experience:
section .text
global _start
_start:
call foo
ud2
foo:
push rbp
mov rbp, rsp
call bar
.end:
pop rbp
ret
bar:
ret
Compile and debug:
$ nasm -f elf64 -g -F DWARF example.asm -o example.o
$ ld example.o -o example
$ gdb ./example
Reading symbols from ./example...done.
(gdb) b foo
Breakpoint 1 at 0x400087: file example.asm, line 10.
(gdb) run
Starting program: /home/mvanotti/orga2/gdb/example
Breakpoint 1, foo () at example.asm:10
10 push rbp
(gdb) ni
11 mov rbp, rsp
(gdb) ni
12 call bar
(gdb) ni
Program received signal SIGILL, Illegal instruction.
_start () at example.asm:7
7 ud2
As you can see, nexti keeps executing even after bar returns. I believe this happens because the next instruction in foo belongs to the foo.end symbol, so gdb does not recognize it as the return point inside foo. Adding any other instruction before the .end label in the asm file fixes the issue.
Similarly, the backtrace gets all messed up when it steps into a local label:
(gdb)
foo.end () at example.asm:14
14 pop rbp
(gdb) bt
#0 foo.end () at example.asm:14
#1 0x0000000000000000 in ?? ()
(gdb)
This also affects yasm and lldb.
There is no clear workaround for this. I couldn't find an option in nasm to not emit the function.label symbols, or an easy way to remove them. strip, for example, has a --wildcard option, but it only accepts shell-style glob patterns, which are too basic to express something like the regexp .+\..*. The closest I got was strip --wildcard -N "*.*", but that also matches .something.
In gas, this is solved by giving local labels a .L prefix (for example .Llocal_label); such symbols are discarded automatically and never reach the final symbol table.
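The pattern that strip --wildcard could not express can be applied to the symbol names outside of strip. The following is a minimal Python sketch, not a tested fix: it assumes NASM local labels always appear as «function».label, the helper names is_nasm_local_label and strip_command are made up for illustration, and the list of symbol names would have to come from a tool such as nm. Whether removing these symbols actually changes gdb's behaviour is not verified here.

```python
import re

# Assumption: a NASM local label shows up in the symbol table as
# "<function>.<label>", i.e. it contains a dot that is not the first character.
NASM_LOCAL = re.compile(r"[^.]+\..+")


def is_nasm_local_label(name: str) -> bool:
    """True for names like 'foo.end', False for 'foo' or '.something'."""
    return NASM_LOCAL.fullmatch(name) is not None


def strip_command(binary: str, symbols: list[str]) -> list[str]:
    """Build (but do not run) a strip invocation for the NASM local labels."""
    cmd = ["strip"]
    cmd += [f"--strip-symbol={s}" for s in symbols if is_nasm_local_label(s)]
    cmd.append(binary)
    return cmd


if __name__ == "__main__":
    names = ["_start", "foo", "foo.end", "bar", ".something"]
    print(strip_command("example", names))
```

With the names from the example above, only foo.end is selected and .something is left alone.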

Related

Debugging Issue on Assemble Code by using GDB

I tried to use gdb to debug my assembly code but got this error message:
(gdb) run
Starting program: /root/assembler_program/bsawp.o
/bin/bash: /root/assembler_program/bsawp.o: cannot execute binary file
The code:
.section .text
.globl _start
_start:
nop
movl 0x12345678 , %ebx
bswap %ebx
movl $1 , %eax
int $0x80
Then I used gdb:
(gdb) break *_start+1
Breakpoint 1 at 0x400079
(gdb) run
Starting program: /root/assembler_program/bsawp
Breakpoint 1, 0x0000000000400079 in _start ()
(gdb) step
Single stepping until exit from function _start,
which has no line number information.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400079 in _start ()
Could you please provide any hint and suggestion?
Thanks!
Best regards

Here:
Starting program: /root/assembler_program/bsawp.o
you are trying to run a relocatable object file. Don't do that.
You must link your object into an executable first. Something like this:
gcc -nostdlib -nostartfiles test.s
Here:
Starting program: /root/assembler_program/bsawp
you apparently did link bsawp.o into an executable. The crash is happening here:
(gdb) x/i $pc
=> 0x400079 <_start+1>: mov 0x12345678,%ebx
This instruction is trying to load a value from address 0x12345678 into a register, but that address does not point to a valid memory location.
You most likely meant to load the constant 0x12345678, in which case the instruction you want is:
movl $0x12345678, %ebx
With that fix, I get the expected result:
(gdb) run
Starting program: /tmp/a.out
[Inferior 1 (process 238270) exited with code 022]
(gdb) p/x 022
$1 = 0x12
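To see why 022 (octal) is the expected exit code, the arithmetic can be sketched in Python. This is only an illustration: the helper bswap32 is a made-up name that mimics the byte swap, and it relies on the fact that only the low 8 bits of the value passed to exit are reported to the parent process.

```python
def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value, like the x86 bswap instruction."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


ebx = bswap32(0x12345678)   # value left in %ebx when int $0x80 is executed
exit_status = ebx & 0xFF    # only the low 8 bits reach the parent process

print(hex(ebx))             # 0x78563412
print(oct(exit_status))     # 0o22
```

gdb prints the exit code in octal, which is why it shows 022 rather than 18 or 0x12.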

Why does this assembly code throw a seg fault?

The book Assembly Language Step by Step provides the following code as a sandbox:
section .data
section .text
global _start
_start:
nop
; insert sandbox code here
nop
Any example that I include in the space for sandbox is creating a segmentation fault. For example, adding this code:
mov ax, 067FEh
mov bx, ax
mov cl, bh
mov ch, bl
Then compiling with:
nasm -f macho sandbox.asm
ld -o sandbox -e _start sandbox.o
creates a seg fault when I run it on OS X. Is there a way to get more information about what's causing the segmentation fault?

The problem you have is that you have created a program that runs past the end of the code that you have written.
When your program executes, the loader will end up issuing a jmp to your _start. Your code then runs, but you do not have anything to return to the OS at the end, so it will simply continue running, executing whatever bytes happen to be in RAM after your code.
The simplest fix would be to properly exit the code. For example:
mov eax, 0x1 ; system call number for exit
sub esp, 4 ; OS X system calls need "extra space" on the stack
int 0x80
Since you are not generating any actual output, you would need to step through with a debugger to see what's going on. After compiling you could use lldb to step through.
lldb ./sandbox
image dump sections
Make note of the address listed that is of type code for your executable (not dyld). It will likely be 0x0000000000001fe6. Continuing within lldb:
b s -a 0x0000000000001fe6
run
register read
step
register read
step
register read
At this point you should be past the NOPs and see things changing in registers. Have fun!
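As a rough guide to what the register reads should show for the four sandbox instructions from the question, here is a small Python sketch that mirrors them on plain integers. It only models the 16-bit and 8-bit values involved; the variable names simply follow the register names.

```python
ax = 0x67FE                    # mov ax, 067FEh
bx = ax                        # mov bx, ax
bh, bl = bx >> 8, bx & 0xFF    # high and low bytes of bx
cl = bh                        # mov cl, bh
ch = bl                        # mov ch, bl
cx = (ch << 8) | cl            # cx as the debugger would display it

print(hex(ax), hex(bx), hex(cx))   # 0x67fe 0x67fe 0xfe67
```

In other words, cx ends up holding the bytes of ax in swapped order.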

GNU assembler did not produce a program that I can execute

I tried assembling some intermediate code generated by gcc. I used the command as -o hello hello.s, which, as far as I can tell, is the correct syntax. When I tried to run the program, it said bash: ./hello: cannot execute binary file. It doesn't seem like there's a problem with the assembly code, since it was the code generated by gcc, and it doesn't seem like there's anything wrong with how I invoked the assembler, since that seems to be the right syntax according to this manual. Can anyone help me with this?

Working with GNU Assembler
Assume that your assembly file is called hello.s and looks something like (assuming a 32-Bit Linux target):
.data
msg: .asciz "Hello World\n"
msglen = .-msg
.text
.global _start
_start:
/* Use int $0x80/eax=4 to write to STDOUT */
/* Output Hello World */
mov $4, %eax /* write system call */
mov $1, %ebx /* File descriptor 1 = STDOUT */
mov $msg, %ecx /* The message to output */
mov $msglen, %edx /* length of message */
int $0x80 /* make the system call */
/* Exit the program with int $0x80/eax=1 */
mov $1, %eax /* 1 = exit system call */
mov $0, %ebx /* value to exit with */
int $0x80 /* make the system call */
This is a 32-bit Linux assembler program in AT&T syntax that displays Hello World to standard output using 32-bit system calls via int $0x80. It doesn't use any C functions so can be assembled with the GNU assembler as and linked with the GNU linker ld to produce a final executable.
as --32 hello.s -o hello.o
ld -melf_i386 hello.o -o hello
The first line assembles hello.s into a 32-bit ELF object called hello.o . hello.o is then linked to a 32-bit ELF executable called hello with the second command. The GNU linker assumes by default that your program starts execution at the label _start .
Alternatively you can use GCC to assemble and link this program with this command:
gcc -nostdlib -m32 hello.s -o hello
This will produce a 32-bit ELF executable called hello . The -nostdlib tells GCC not to link in the C runtime library and allows us to use _start as our program's entry point.
If your assembler program is intended to be linked to the C runtime and library so that it can utilize functions like C's printf then things are a bit different. Assume you have this program that needs printf (or any of the C library functions):
.data
msg: .asciz "Hello World\n"
.text
.global main
main:
push %ebp /* Setup the stack frame */
mov %esp, %ebp /* Stack frames make GDB debugging easier */
push $msg /* Message to print */
call printf
add $4,%esp /* cleanup the stack */
xor %eax, %eax /* Return 0 when exiting */
mov %ebp, %esp /* destroy our stack frame */
pop %ebp
ret /* Return to C runtime that called us
and allow it to do program termination */
Your entry point must now be main on most *nix type systems. The reason is that the C runtime will have an entry point called _start that does C runtime initialization and then makes a call to the function called main which we supply in our assembler code. To compile/assemble and link this we can use:
gcc -m32 hello.s -o hello
Note: on Windows, the C runtime calls WinMain instead of main for GUI (non-console) programs, and 32-bit symbol names carry a leading underscore (for example _main).
Working with NASM
In the comments you also asked about NASM so I'll provide some information when assembling with it. Assume that your assembly file is called hello.asm and looks something like (It doesn't require the C runtime libraries):
SECTION .data ; data section
msg db "Hello World", 13, 10
len equ $-msg
SECTION .text ; code section
global _start ; make label available to linker
_start: ; default entry point expected by the linker
mov edx,len ; length of string to print
mov ecx,msg ; pointer to string
mov ebx,1 ; write to STDOUT (file descriptor 1)
mov eax,4 ; write command
int 0x80 ; interrupt 80 hex, call kernel
mov ebx,0 ; exit code, 0=normal
mov eax,1 ; exit command to kernel
int 0x80 ; interrupt 80 hex, call kernel
Then to build it into an executable you can use commands like these:
nasm -f elf32 hello.asm -o hello.o
gcc -nostdlib -m32 hello.o -o hello
The first command assembles hello.asm to the ELF object file hello.o. The second command links hello.o into the executable hello. -nostdlib keeps the C runtime from being linked in (so functions like printf won't be available).
Alternatively you can skip using GCC and use the linker directly like this:
nasm -f elf32 hello.asm -o hello.o
ld -melf_i386 hello.o -o hello
If you need the C runtime and library for calling things like printf then it is a bit different. Assume you have this NASM code that needs printf:
extern printf
SECTION .data ; Data section, initialized variables
msg: db "Hello World", 13, 10, 0
SECTION .text ; Code section.
global main ; the standard gcc entry point
main: ; the program label for the entry point
push ebp ; Setup the stack frame
mov ebp, esp ; Stack frames make GDB debugging easier
push msg ; Message to print
call printf
add esp, 4 ; Cleanup the stack
mov eax, 0 ; Return value of 0
mov esp, ebp ; Destroy our stack frame
pop ebp
endit:
ret ; Return to C runtime that called us
; and allow it to do program termination
Then to build it into an executable you can use commands like these:
nasm -f elf32 hello.asm -o hello.o
gcc -m32 hello.o -o hello

Neither a compiler nor an assembler generates an executable file. Both generate an object file, which can then be linked with other object and/or library files to generate an executable.
The command gcc -c, for example, invokes just the compiler; it can take a source file like hello.c as input and generate an object file like hello.o as output.
Likewise, as can take an assembly language source file like hello.s and generate an object file like hello.o.
The linker is a separate tool that generates executables from object files.
It just happens that compiling and linking in one step is so convenient that that's what the gcc command does by default; gcc hello.c -o hello invokes the compiler and the linker to generate an executable file.
Note that the gcc command isn't just a compiler. It's a driver program that invokes the preprocessor, the compiler proper, the assembler, and/or the linker. (The preprocessor and assembler can be thought of as components of the compiler; in some cases they aren't even separate programs, and a compiler can generate object code directly instead of assembly code.)
In fact, you can perform the same multi-step process in one command for assembly language as well:
gcc hello.s -o hello
will invoke the assembler and linker and generate an executable file.
This is specific to gcc (and probably to most other compilers for Unix-like systems). Other implementations might be organized differently.

Why am I getting a warning about absolute addressing with immediate operands?

This little program works fine on OS X, using nasm:
global _main
extern _puts
section .text
default rel
_main:
push rbp
lea rdi, [message]
call _puts
pop rbp
ret
message:
db 'Hello, world', 0
Here's how it runs:
$ nasm -fmacho64 hello.asm && gcc hello.o && ./a.out
Hello, world
But if I replace the LEA instruction (with a memory operand) with an equivalent MOV immediate:
global _main
extern _puts
section .text
default rel
_main:
push rbp
mov rdi, message ; <---- Should have same effect as lea rdi, [message]
call _puts
pop rbp
ret
message:
db 'Hello, world', 0
The program will run but with a warning message, that I know has been asked about before on Stack Overflow:
$ nasm -fmacho64 hello.asm && gcc hello.o && ./a.out
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not allowed in code signed PIE, but used in _main from hello.o. To fix this warning, don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
Hello, world
My question is why does this warning occur? I see the error is complaining about the linker not liking absolute addressing; however the MOV command is clearly using an immediate operand, not an absolute address! Is the warning mislabeled? I'm puzzled that
As an aside, this distinction does not happen under Linux. Removing default rel and the underscores on main and puts gives me a warning-free run on Ubuntu. What is OS X doing differently here? Is it a case of the assembler default configurations being set differently? Or is it something weird like OS X following AMD's ABI more closely than Ubuntu?

Emacs gdb - display arrow when debugging assembly

I'm trying to debug an assembly program using gdb and Emacs. My problem is that, when I try to debug step-by-step, it doesn't show a pointer arrow at the current executing line. The code I'm trying to debug is:
SECTION .data ; Section containing initialised data
EatMsg: db "Eat at Joe's!",10
EatLen: equ $-EatMsg
SECTION .bss ; Section containing uninitialized data
SECTION .text ; Section containing code
global _start ; Linker needs this to find the entry point!
_start:
nop ; This no-op keeps gdb happy...
mov eax,4 ; Specify sys_write call
mov ebx,1 ; Specify File Descriptor 1: Standard Output
mov ecx,EatMsg ; Pass offset of the message
mov edx,EatLen ; Pass the length of the message
int 80H ; Make kernel call
MOV eax,1 ; Code for Exit Syscall
mov ebx,0 ; Return a code of zero
int 80H ; Make kernel call
and I'm compiling with these lines:
nasm -f elf -g -F stabs eatsyscall.asm -l eatsyscall.lst
ld -melf_i386 -o eatsyscall eatsyscall.o
In Emacs, when I'm executing the line after the breakpoint, no arrow pointing to that line appears. Is it possible to have one?

First of all, I hope you are still looking for a solution, since it has been 2 years! If you are, try coaxing nasm into generating debugging information in DWARF format instead of STABS, i.e. the following:
nasm -f elf -g -F dwarf eatsyscall.asm ...
that seems to work for me (TM)

Try to download nasm2.5 or the latest available, it should work
