Difference between i386 and x86-64 memory stack

While trying to make a very tiny program with NASM and GCC on my Ubuntu machine, I noticed something weird.
The following code compiles fine under 64-bit NASM and GCC:
global main
extern puts
section .text
main:
push rax
mov rdi, message
call puts
jmp exit
exit:
;return stack memory
pop rax
ret
message:
db "Hello from NASM!", 0
But when I compile the same code (with only the registers changed) under 32-bit NASM and GCC, it either results in a segmentation fault or prints random characters. Why is this happening? Does the x64 architecture store data on the stack differently than i386? If so, how can this behaviour be prevented?

In 32-bit mode, most calling conventions (cdecl, stdcall, etc.) expect arguments to be pushed on the stack rather than passed in registers, unlike in 64-bit mode. With cdecl the caller also has to adjust the stack pointer after calling puts, so you would need to do something like:
mov edx, message
push edx
call puts
add esp, 4
for the program to produce the same output in 32-bit mode. I usually write assembly code in MASM and GAS, so the NASM syntax may need adjusting.
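For reference, a complete 32-bit version of the program could look like the sketch below. It assumes a Linux system with the 32-bit C library installed, assembled with nasm -f elf32 and linked with gcc -m32 -no-pie. The file name hello32.asm and the extra stack padding for 16-byte alignment are additions that were not part of the original question.

```nasm
; hello32.asm (hypothetical file name)
; build (assumed): nasm -f elf32 hello32.asm && gcc -m32 -no-pie hello32.o -o hello32
global main
extern puts

section .text
main:
    sub  esp, 8         ; padding so esp is 16-byte aligned at the call
                        ; (assumption: current i386 System V ABI on Linux)
    push message        ; cdecl: the single argument goes on the stack
    call puts
    add  esp, 12        ; caller cleanup: 4 bytes of argument + 8 bytes of padding
    xor  eax, eax       ; return 0 from main
    ret

section .rodata
message:
    db "Hello from NASM!", 0
```

Note that, unlike the 64-bit version, nothing is passed in a register here: puts reads its argument from the stack, and the caller removes it again after the call.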

Related

Trouble debugging assembly code for greater of two numbers

I wrote the following code to check if the 1st number- 'x' is greater than the 2nd number- 'y'. For x>y output should be 1 and for x<=y output should be 0.
section .txt
global _start
global checkGreater
_start:
mov rdi,x
mov rsi,y
call checkGreater
mov rax,60
mov rdi,0
syscall
checkGreater:
mov r8,rdi
mov r9,rsi
cmp r8,r9
jg skip
mov [c],byte '0'
skip:
mov rax,1
mov rdi,1
mov rsi,c
mov rdx,1
syscall
ret
section .data
x db 7
y db 5
c db '1',0
But for some reason (surely a mistake on my end), the code always gives 0 as the output when executed.
I am using the following commands to run the code on Ubuntu 20.04.1 LTS with nasm 2.14.02-1
nasm -f elf64 fileName.asm
ld -s -o fileName fileName.o
./fileName
Where did I make a mistake?
And how should one debug assembly codes? I looked into printing the arguments received in checkGreater, but that turns out to be a headache in itself.
Note: If someone is wondering why I didn't use x and y directly in checkGreater, I want to extend the comparison to user inputs, which is why I wrote the code this way.

The instructions
mov rdi,x
mov rsi,y
write the address of x into rdi, and of y into rsi. The further code then goes on to compare the addresses, which are always x<y, since x is defined above y.
What you should have written instead is
mov rdi,[x]
mov rsi,[y]
But then you have another problem: x and y variables are 1 byte long, while the destination registers are 8 bytes long. So simply doing the above fix will read extraneous bytes, leading to useless results. The final correction is to either fix the size of the variables (writing dq instead of db), or read them as bytes:
movzx rdi,byte [x]
movzx rsi,byte [y]
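Putting the fix together, a corrected version of the whole program might look like the following sketch. It keeps the names from the question and is meant to be built with the same nasm -f elf64 and ld commands. The section name .text, the local label .print, and the use of xor to zero edi are choices made here, not part of the original code.

```nasm
section .text
global _start

_start:
    movzx rdi, byte [x]     ; load the VALUE of x, zero-extended to 64 bits
    movzx rsi, byte [y]     ; load the VALUE of y, zero-extended to 64 bits
    call  checkGreater
    mov   rax, 60           ; sys_exit
    xor   edi, edi          ; exit status 0
    syscall

checkGreater:
    cmp   rdi, rsi          ; compare the values, not the addresses
    jg    .print            ; x > y: keep the default '1'
    mov   byte [c], '0'     ; x <= y: overwrite the output character
.print:
    mov   rax, 1            ; sys_write
    mov   rdi, 1            ; stdout
    mov   rsi, c            ; address of the output character
    mov   rdx, 1            ; one byte
    syscall
    ret

section .data
x db 7
y db 5
c db '1', 0
```

With x = 7 and y = 5 this prints 1; swapping the two values makes it print 0.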
As for
And how should one debug assembly codes
The main tool for you is an assembly-level debugger, like EDB on Linux or x64dbg on Windows. In fact, most debuggers, even the ones intended for languages like C++, are capable of displaying disassembly for the program being debugged. So you can use e.g. GDB, or even a GUI wrapper for it like Qt Creator or Eclipse. Just be sure to switch to machine code mode, or use the appropriate commands like GDB's disassemble, stepi, info registers, etc.
Note that you don't have to build EDB or GDB from source: they are likely already packaged in the Linux distribution you use. E.g. on Ubuntu the packages are called edb-debugger and gdb.

GDB Debugger: An internal issue to GDB has been detected

I'm new to GNU Debugger. I've been playing around with it, debugging assembly files (x86_64 Linux) for a day or so, and just a few hours ago I "discovered" the TUI interface.
My first attempt using the TUI interface was to see the register changes as I execute each line at a time of a simple Hello World program (in asm). Here is the code of the program
section .data
text db "Hello, World!", 10
len equ $-text
section .text
global _start
_start:
nop
call _printText
mov rax, 60
mov rdi, 0
syscall
_printText:
nop
mov rax, 1
mov rdi, 1
mov rsi, text
mov rdx, len
syscall
ret
After creating the executable file in the terminal of linux I write
$ gdb -q ./hello -tui
Then I created three breakpoints: one right of the _start, another right after _printText and the last just above the mov rax, 60 for the SYS_EXIT.
After this:
1) I run the program.
2) On gdb mode I write layout asm to see the written code.
3) I write layout regs.
4) Finally I use stepi to see how the registers change according to the written hello world program.
The thing is that when the RIP register points to the address of ret, corresponding to SYS_EXIT and I hit Enter I get the following message in console
[Inferior 1 (process 2059) exited normally]
/build/gdb-cXfXJ3/gdb-7.11.1/gdb/thread.c:1100: internal-error: finish_thread_state: Assertion `tp' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
If I type n It appears this (as it says, it quits if I type y):
This is a bug, please report it. For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.
/build/gdb-cXfXJ3/gdb-7.11.1/gdb/thread.c:1100: internal-error: finish_thread_state: Assertion `tp' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n)
As I don't know what a core file of GDB is (or what it is useful for), I type n and the debugging session closes.
Does anyone know why this is happening and how can be fixed?
By the way, I'm new in Assembly also, so if this occurs because of something wrong in the program I'd also appreciate if anyone can point that out.

I use the same GDB version as you and I always use the TUI features; but I've never had this problem. However, when I use your code the internal GDB error occurs. But if I make one change in your write syscall function, the error does not manifest.
Although you are not calling another function from within a function, I generally create a stack frame by including at least the "push rbp", "mov rbp, rsp", and "leave" instructions in my x86-64 functions. This may be a band-aid or a workaround with respect to the "bug".
_printText:
push rbp
mov rbp, rsp
mov rax, 1
mov rdi, 1
mov rsi, text
mov rdx, len
syscall
leave
ret

Does anyone know why this is happening
It's happening because there is a bug in GDB (more precisely, an assertion that GDB internal variable tp is not NULL has been violated).
and how can be fixed?
You should try to reproduce this with a current version of GDB (the bug may have already been fixed), and file a bug report if it still occurs (as the message tells you).
I don't know what a core file of GDB is (or what it is useful for),
It's only useful to GDB developers.

Why does this assembly code throw a seg fault?

The book Assembly Language Step by Step provides the following code as a sandbox:
section .data
section .text
global _start
_start:
nop
; insert sandbox code here
nop
Any example that I include in the space for sandbox is creating a segmentation fault. For example, adding this code:
mov ax, 067FEh
mov bx, ax
mov cl, bh
mov ch, bl
Then compiling with:
nasm -f macho sandbox.asm
ld -o sandbox -e _start sandbox.o
creates a seg fault when I run it on OS X. Is there a way to get more information about what's causing the segmentation fault?

The problem you have is that you have created a program that runs past the end of the code that you have written.
When your program executes, the loader will end up issuing a jmp to your _start. Your code then runs, but you do not have anything to return to the OS at the end, so it will simply continue running, executing whatever bytes happen to be in RAM after your code.
The simplest fix would be to properly exit the code. For example:
push dword 0 ; exit status, passed on the stack
mov eax, 0x1 ; system call number for exit
sub esp, 4 ; OS X system calls need "extra space" on the stack
int 0x80
Since you are not generating any actual output, you would need to step through with a debugger to see what's going on. After compiling you could use lldb to step through.
lldb ./sandbox
image dump sections
Make note of the address listed that is of type code for your executable (not dyld). It will likely be 0x0000000000001fe6. Continuing within lldb:
b s -a 0x0000000000001fe6
run
register read
step
register read
step
register read
At this point you should be past the NOPs and see things changing in registers. Have fun!

Mach-O 64-bit format does not support 32-bit absolute addresses. NASM [duplicate]

When I use nasm -f macho64 asm1.asm I get the following error:
asm1.asm:14: error: Mach-O 64-bit format does not support 32-bit absolute addresses
This is asm1.asm
SECTION .data ;initialized data
msg: db "Hello world, this is assembly", 10, 0
SECTION .text ;asm code
extern printf
global _main
_main:
push rbp
mov rbp, rsp
push msg
call printf
mov rsp, rbp
pop rbp
ret
I'm really new to assembly and barely know what these commands do. Any idea what's wrong here?

Mac OS X, like other UNIX/POSIX systems, uses a different calling convention for 64-bit code. Instead of pushing all the arguments to the stack, it uses RDI, RSI, RDX, RCX, R8, and R9 for the first 6 arguments. So instead of using push msg, you'll need to use something like mov RDI, msg.

Besides what Drew McGowen points out, rax needs to be zeroed (no vector parameters).
But -f win64 or -f elf64 will work on this code. I suspect a bug in -f macho64 (but I'm not sure what macho64 is "supposed" to do). Until this gets fixed(?), the workaround is to use default rel or mov rdi, rel msg. I "think" that'll work for ya.

x64 nasm: pushing memory addresses onto the stack & call function

I'm pretty new to x64-assembly on the Mac, so I'm getting confused porting some 32-bit code in 64-bit.
The program should simply print out a message via the printf function from the C standard library.
I've started with this code:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
push msg
call _printf
mov rsp, rbp
pop rbp
ret
Compiling it with nasm this way:
$ nasm -f macho64 main.s
Returned following error:
main.s:12: error: Mach-O 64-bit format does not support 32-bit absolute addresses
I've tried to fix that problem by changing the code to this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
mov rax, msg ; shouldn't rax now contain the address of msg?
push rax ; push the address
call _printf
mov rsp, rbp
pop rbp
ret
It compiled fine with the nasm command above but now there is a warning while compiling the object file with gcc to actual program:
$ gcc main.o
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not
allowed in code signed PIE, but used in _main from main.o. To fix this warning,
don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
Since it's a warning not an error I've executed the a.out file:
$ ./a.out
Segmentation fault: 11
Hope anyone knows what I'm doing wrong.

The 64-bit OS X ABI largely complies with the System V ABI - AMD64 Architecture Processor Supplement. Its code model is very similar to the Small position independent code model (PIC) with the differences explained here. In that code model all local and small data is accessed directly using RIP-relative addressing. As noted in the comments by Z boson, the image base for 64-bit Mach-O executables is beyond the first 4 GiB of the virtual address space, therefore push msg is not only an invalid way to put the address of msg on the stack, but it is also an impossible one since PUSH does not support 64-bit immediate values. The code should rather look similar to:
; this is what you *would* do for later args on the stack
lea rax, [rel msg] ; RIP-relative addressing
push rax
But in that particular case one need not push the value on the stack at all. The 64-bit calling convention mandates that the first 6 integer/pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9, exactly in that order. The first 8 floating-point or vector arguments go into XMM0, XMM1, ..., XMM7. Only after all the available registers are used, or when there are arguments that cannot fit in any of those registers (e.g. an 80-bit long double value), is the stack used. 64-bit immediate values are loaded using MOV (the QWORD variant) and not PUSH. Simple return values are passed back in the RAX register. The caller must also provide stack space for the callee to save some of the registers.
printf is a special function because it takes a variable number of arguments. When calling such functions, AL (the low byte of RAX) should be set to the number of floating-point arguments passed in the vector registers. Also note that RIP-relative addressing is preferred for data that lies within 2 GiB of the code.
Here is how gcc translates printf("This is a test\n"); into assembly on OS X:
xorl %eax, %eax # (1)
leaq L_.str(%rip), %rdi # (2)
callq _printf # (3)
L_.str:
.asciz "This is a test\n"
(this is AT&T style assembly, source is left, destination is right, register names are prefixed with %, data width is encoded as a suffix to the instruction name)
At (1) zero is put into AL (by zeroing the whole RAX which avoids partial-register delays) since no floating-point arguments are being passed. At (2) the address of the string is loaded in RDI. Note how the value is actually an offset from the current value of RIP. Since the assembler doesn't know what this value would be, it puts a relocation request in the object file. The linker then sees the relocation and puts the correct value at link time.
I am not a NASM guru, but I think the following code should do it:
default rel ; make [rel msg] the default for [msg]
section .data
msg: db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp ; re-aligns the stack by 16 before call
mov rbp, rsp
xor eax, eax ; al = 0 FP args in XMM regs
lea rdi, [rel msg]
call _printf
mov rsp, rbp
pop rbp
ret
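To illustrate the register rules described above with more than one argument, here is a hedged sketch that calls printf with an integer and a double. The labels fmt and val and the format string are made up for this example. It assumes the same nasm -f macho64 and gcc build steps as in the question.

```nasm
default rel                 ; make RIP-relative addressing the default

section .data
fmt: db "%d %f", 10, 0      ; hypothetical format string
val: dq 3.5                 ; a double-precision constant

section .text
global _main
extern _printf

_main:
    push  rbp               ; also re-aligns the stack to 16 bytes
    mov   rbp, rsp
    lea   rdi, [fmt]        ; 1st integer/pointer argument: format string
    mov   esi, 42           ; 2nd integer/pointer argument
    movsd xmm0, [val]       ; 1st floating-point argument
    mov   eax, 1            ; AL = number of vector registers used
    call  _printf
    xor   eax, eax          ; return 0 from main
    pop   rbp
    ret
```

If the conventions are followed as described, this should print 42 3.500000. The integer and floating-point arguments are counted separately, which is why the double lands in XMM0 even though it is the third argument in the C-level call.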

No answer yet has explained why NASM reports
Mach-O 64-bit format does not support 32-bit absolute addresses
The reason NASM won't do this is explained in Agner Fog's Optimizing Assembly manual. In section 3.3, Addressing modes, under the subsection titled "32-bit absolute addressing in 64 bit mode", he writes:
32-bit absolute addresses cannot be used in Mac OS X, where addresses are above 2^32 by default.
This is not a problem on Linux or Windows. In fact I already showed this works at static-linkage-with-glibc-without-calling-main. That hello world code uses 32-bit absolute addressing with elf64 and runs fine.
@HristoIliev suggested using rip relative addressing but did not explain that 32-bit absolute addressing would work as well on Linux. In fact, if you change lea rdi, [rel msg] to lea rdi, [msg], it assembles and runs fine with nasm -felf64 but fails with nasm -fmacho64.
Like this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
xor al, al
lea rdi, [msg]
call _printf
mov rsp, rbp
pop rbp
ret
You can check that this is an absolute 32-bit address and not rip relative with objdump. However, it's important to point out that the preferred method is still rip relative addressing. Agner in the same manual writes:
There is absolutely no reason to use absolute addresses for simple memory operands. Rip-relative addresses make instructions shorter, they eliminate the need for relocation at load time, and they are safe to use in all systems.
So when would you use 32-bit absolute addresses in 64-bit mode? Static arrays are a good candidate. See the following subsection, "Addressing static arrays in 64 bit mode". The simple case would be e.g.:
mov eax, [A+rcx*4]
where A is the absolute 32-bit address of the static array. This works fine with Linux but once again you can't do this with Mac OS X because the image base is larger than 2^32 by default. To do this on Mac OS X see examples 3.11c and 3.11d in Agner's manual. In example 3.11c you could do
mov eax, [(imagerel A) + rbx + rcx*4]
where you use the extern reference from Mach-O, __mh_execute_header, to get the image base. In example 3.11d you use rip relative addressing and load the address like this
lea rbx, [rel A]; rel tells nasm to do [rip + A]
mov eax, [rbx + 4*rcx] ; A[i]
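As a small worked example of the rip relative variant, the sketch below sums a four-element static array and returns the sum as the exit status. To keep it self-contained it targets Linux (nasm -f elf64, then ld) and uses the Linux exit system call. The array contents, the loop, and the file layout are assumptions added here and do not come from Agner's manual.

```nasm
section .data
A: dd 10, 20, 30, 40            ; static array of four 32-bit integers

section .text
global _start

_start:
    lea   rbx, [rel A]          ; rip relative load of the array's address
    xor   eax, eax              ; running sum
    xor   ecx, ecx              ; index i (writing ecx also clears the top of rcx)
.loop:
    add   eax, [rbx + 4*rcx]    ; sum += A[i]
    inc   ecx
    cmp   ecx, 4
    jb    .loop
    mov   edi, eax              ; exit status = sum (100)
    mov   eax, 60               ; sys_exit on Linux x86-64
    syscall
```

After running the program, echo $? in the shell should show 100. Because only the lea refers to A by name, the same loop body works unchanged wherever the array ends up in the address space.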

According to the documentation for the x86 64bit instruction set http://download.intel.com/products/processor/manual/325383.pdf
PUSH only accepts 8, 16 and 32bit immediate values (64bit registers and register addressed memory blocks are allowed though).
PUSH msg
Where msg is a 64bit immediate address will not compile as you found out.
What calling convention is _printf defined as in your 64bit library?
Is it expecting the parameter on the stack or using a fast-call convention where the parameters are in registers? Because x86-64 makes more general purpose registers available, the fast-call convention is used more often.
