gnu assembler: get address of label/variable [INTEL SYNTAX] - syntax

I have a code like this:
.bss
woof: .long 0
.text
bleh:
...some op codes here.
now I would like to move the address of woof into eax. What's the intel syntax code here for doing that? The same goes with moving bleh's address into, say, ebx.
Your help is much appreciated!

The bss section can't have any actual objects in it. Some assemblers may still allow you to switch to the .bss section, but all you can do there is say something like: x: . = . + 4.
In most assemblers these days and specifically in gnu for intel, there is no longer a .bss directive, so you temporarily switch to bss and create the bss symbol in one shot with something like: .comm sym,size,alignment. This is why you are presumably getting an error ".bss directive not recognized" or something like that.
And then you can get the address with either:
lea woof, %eax
or
movl $woof, %eax
Update: aha, intel syntax, not intel architecture. OK:
.intel_syntax noprefix
lea esi,fun
lea esi,[fun]
mov eax,OFFSET FLAT:fun
.att_syntax
lea fun, %eax
mov $fun, %eax
.data
fun: .long 0x123
All the lea forms should generate the same code.

Related

Implementing the "." word from Forth in x86 assembly

I am trying to make a function that, prints a number out on screen. Eventually, I'll make it able to take the top stack item, print it, and then pop it (like the "." word in Forth). But for now, I am trying to keep it simple. I think that I need to align the call stack in some way - and I figured that pushing and popping an arbitrary register before and after calling printf (rbx) would do the trick - but I am still getting a segmentation fault. A backtrace in GDB hasn't helped me make any progress either. Does anyone know why this code is causing a segmentation fault, and how to fix it?
How I am assembling (GAS):
gcc -masm=intel
.data
format_num: .ascii "%d\0"
.text
.global _main
.extern _printf
print_num:
push rbx
lea rdi, format_num[RIP]
mov esi, 250
xor eax, eax
call _printf
pop rbx
ret
_main:
call print_num
mov rdi, 0
mov rax, 0x2000001
syscall

What is the correct constant for the exit system call?

I am trying to learn x86_64 assembly, and am using GCC as my assembler. The exact command I'm using is:
gcc -nostdlib tapydn.S -D__ASSEMBLY__
I'm mainly using gcc for its preprocessor. Here is tapydn.S:
.global _start
#include <asm-generic/unistd.h>
syscall=0x80
.text
_start:
movl $__NR_exit, %eax
movl $0x00, %ebx
int $syscall
This results in a segmentation fault. I believe the problem is with the following line:
movl $__NR_exit, %eax
I used __NR_exit because it was more descriptive than some magic number. However, it appears that my usage of it is incorrect. I believe this to be the case because when I change the line in question to the following, it runs fine:
movl $0x01, %eax
Further backing up this trail of thought is the contents of usr/include/asm-generic/unistd.h:
#define __NR_exit 93
__SYSCALL(__NR_exit, sys_exit)
I expected the value of __NR_exit to be 1, not 93! Clearly I am misunderstanding its purpose and consequently its usage. For all I know, I'm getting lucky with the $0x01 case working (much like undefined behaviour in C++), so I kept digging...
Next, I looked for the definition of sys_exit. I couldn't find it. I tried using it anyway as follows (with and without the preceeding $):
movl $sys_exit, %eax
This wouldn't link:
/tmp/cc7tEUtC.o: In function `_start':
(.text+0x1): undefined reference to `sys_exit'
collect2: error: ld returned 1 exit status
My guess is that it's a symbol in one of the system libraries and I'm not linking it due to my passing -nostdlib to GCC. I'd like to avoid linking such a large library for just one symbol if possible.
In response to Jester's comment about mixing 32 and 64 bit constants, I tried using the value 0x3C as suggested:
movq $0x3C, %eax
movq $0x00, %ebx
This also resulting a segmentation fault. I also tried swapping out eax and ebx for rax and rbx:
movq $0x3C, %rax
movq $0x00, %rbx
The segmentation fault remained.
Jester then commented stating that I should be using syscall rather than int $0x80:
.global _start
#include <asm-generic/unistd.h>
.text
_start:
movq $0x3C, %rax
movq $0x00, %rbx
syscall
This works, but I was later informed that I should be using rdi instead of rbx as per the System V AMD64 ABI:
movq $0x00, %rdi
This also works fine, but still ends up using the magic number 0x3C for the system call number.
Wrapping up, my questions are as follows:
What is the correct usage of __NR_exit?
What should I be using instead of a magic number for the exit system call?
The correct header file to get the system call numbers is sys/syscall.h. The constants are called SYS_### where ### is the name of the system call you are interested in. The __NR_### macros are implementation details and should not be used. As a rule of thumb, if an identifier begins with an underscore it should not be used, if it begins with two it should definitely not be used. The arguments go into rdi, rsi, rdx, r10, r8, and r9. Here is a sample program for Linux:
#include <sys/syscall.h>
.globl _start
_start:
mov $SYS_exit,%eax
xor %edi,%edi
syscall
These conventions are mostly portable to other UNIX-like operating systems.

Unrelocated address when linking with link.exe

Problem
When I compile my assembly code with as (binutils) and link using link.exe (Visual Studio 2015) the program crashes because of an unrelocated address.
When linking with gcc (gcc hello-64-gas.obj -o hello-64-gas.exe) the program runs correctly without crash though.
Am I correctly assuming that the object file generated by as should be compiler independent, since abi compatibility problems are in the hands of the assembly code writer?
Since I am a beginner, any explanation of my mistakes/incorrect assumptions is appreciated.
Platform
Windows 10, 64 bit
Linker: Visual Studio 2015 using the native command tools command prompt (x64)
Compiler: as from MinGW-w64
Example
The following code does not link correctly:
# hello-64-gas.asm print a string using printf
# Assemble: as hello-64-gas.asm -o hello-64-gas.obj --64
# Link: link -subsystem:CONSOLE hello-64-gas.obj -out:hello-64-gas.exe libcmt.lib libvcruntime.lib libucrt.lib legacy_stdio_definitions.lib
.intel_syntax noprefix
.global main
# Declare needed C functions
.extern printf
.section .data
msg: .asciz "Hello world"
fmt: .asciz "%s(%d; %f)\n"
myDouble: .double 2.33, -1.0
.text
main:
sub rsp, 8*5
mov rcx, offset flat: fmt
mov rdx, offset flat: msg
mov r8, 0xFF
mov r9, offset flat: myDouble
mov r9, [r9]
movq xmm4, r9
call printf
add rsp, 8*5
mov rax, 0
ret
When debugging it seems mov r9, offset flat: myDouble is not relocated: mov r9,18h, where 18h would be correct if the .data section where at position zero. Looking at the relocation table with objdump -dr hello-64-gas.obj yields:
...
19: 49 c7 c1 18 00 00 00 mov $0x18,%r9
1c: R_X86_64_32S .data
...
Variation (workaround?)
Replacing mov with movabs seems to work:
# hello-64-gas.asm print a string using printf
# Assemble: as hello-64-gas.asm -o hello-64-gas.obj --64
# Link: link -subsystem:CONSOLE hello-64-gas.obj -out:hello-64-gas.exe libcmt.lib libvcruntime.lib libucrt.lib legacy_stdio_definitions.lib
.intel_syntax noprefix
.global main
# Declare needed C functions
.extern printf
.section .data
msg: .asciz "Hello world"
fmt: .asciz "%s(%d; %f)\n"
myDouble: .double 2.33, -1.0
.text
main:
sub rsp, 8*5
movabs rcx, offset flat: fmt
movabs rdx, offset flat: msg
mov r8, 0xFF
movabs r9, offset flat: myDouble
mov r9, [r9]
movq xmm4, r9
call printf
add rsp, 8*5
mov rax, 0
ret
This does somehow run correctly when linked using link.exe.
The relocation that the GNU assembler is using for your references to myDouble, along with fmt and msg, isn't supported by Microsoft's linker. This relocation, called R_X86_64_32S by the GNU utilities and having a value of 0x11, isn't documented in Microsoft's PECOFF specification. As can be evidenced by using Microsoft's DUMPBIN on your object file, Microsoft's linker seems to use relocations with this value for some other undocumented purpose:
RELOCATIONS #1
Symbol Symbol
Offset Type Applied To Index Name
-------- ---------------- ----------------- -------- ------
00000007 EHANDLER 7 .data
0000000E EHANDLER 7 .data
0000001C EHANDLER 7 .data
00000029 REL32 00000000 C printf
As work around you can use either use:
a LEA instruction with RIP relative addressing, which generates a R_X86_64_PC32/REL32 relocation
as you found out yourself, a MOVABS instruction, which generates a R_X86_64_64/ADDR64 relocation
a 32-bit MOV instruction which generates a R_X86_64_32/ADDR32 relocation
In order these would be written as:
lea r9, [rip + myDouble]
movabs r9, offset myDouble
mov r9d, offset myDouble
These, along with mov r9, offset myDouble, are four different instructions with different encodings and subtly different semantics each requiring a different type of relocation.
The LEA instruction encodes myDouble as a 32-bit signed offset relative to RIP. This is the preferable instruction to use here, as it takes only 4 bytes to encode the address and it allows the executable to be loaded anywhere in the 64-bit address space. The only limitation is that executable needs to be less than 2G in size, but this is a fundamental limitation x64 PECOFF executables anyways.
The MOVABS encodes myDouble as a 64-bit absolute address. While in theory this allows myDouble to be located anywhere in the 64-bit address space, even more than 2G away from the instruction, it takes 8 bytes of encoding space and doesn't actually get you anything under Windows.
The 32-bit MOV instruction encodes myDouble as an unsigned 32-bit absolute address. It has the disadvantage of requiring the the executable to be loaded somewhere in the first 4G of address space. Because of this you need to use the /LARGEADDRESSAWARE:NO flag with the Microsoft linker otherwise you'll get an error.
The 64-bit MOV instruction you're using encodes myDouble as a 32-bit signed absolute address. This also limits where the executable can be loaded, and requires a type of relocation that Microsoft's PECOFF format isn't documented as having and isn't supported by Microsoft's linker.

x86 asm printf causes segfault when using intel syntax (gcc)

I'm just starting to learn x86 assembly, and I am a bit confused as to why this little example doesn't work. All I want to do is to print the content of the eax register as a decimal value. This is my code in AT&T Syntax:
.data
intout:
.string "%d\n"
.text
.globl main
main:
movl $666, %eax
pushl %eax
pushl $intout
call printf
movl $1, %eax
int $0x80
Which I compile and run as follows:
gcc -m32 -o hello helloworld.S
./hello
This works as excepted (Printing 666 to the console). On a little side note, I would like to point out that I don't understand what exactly "movl $1, %eax" and "int $0x80" are supposed to accomplish here. I'm also a not sure what "pushl $intout" does. Why is my output composed out of two separate stack entries? And what exactly does the .string macro do?
These are only side questions however, since my real problem is that I can't find a way to make this run using the much easier to read/write/comprehend Intel syntax.
Here is the code:
.intel_syntax noprefix
.data
intout:
.string "%d\n"
.text
.globl main
main:
mov eax, 666
push eax
push intout
call printf
mov eax, 1
int 0x80
Running this same as above, it just prints "Segmentation fault".
What am I doing wrong?
You need to use push OFFSET intout otherwise the 32-bit value stored at intout will be pushed on the stack, rather than its address.
intout is just a label, which is basically a name assigned to an address in your program. The .string "%d\n" directive that follows it defines a sequence of bytes in your program, both allocating memory and initializing that memory. Specifically it allocates 4 bytes in the .data section and initializes them with the characters '%', 'd', '\n', and '\0'. Since the label intout is defined just before the .string line it has the address of the first byte in the string.
The line push intout results in a instruction that reads the 4 bytes starting at the address of referred to by intout and pushes them on to the stack (specifically it subtracts 4 from ESP and then copies them to the 4 bytes now pointed to by ESP.) The line push $intout (or push OFFSET intout) pushes the 4 bytes that make up the 32-bit address of intout on the stack.
This means that the line push intout pushes a meaningless value on to the stack. The function printf ends up interpreting it as a pointer, an address where the format string is supposed to be stored, but since it doesn't point to valid location in memory your program crashes.

x64 nasm: pushing memory addresses onto the stack & call function

I'm pretty new to x64-assembly on the Mac, so I'm getting confused porting some 32-bit code in 64-bit.
The program should simply print out a message via the printf function from the C standart library.
I've started with this code:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
push msg
call _printf
mov rsp, rbp
pop rbp
ret
Compiling it with nasm this way:
$ nasm -f macho64 main.s
Returned following error:
main.s:12: error: Mach-O 64-bit format does not support 32-bit absolute addresses
I've tried to fix that problem byte changing the code to this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
mov rax, msg ; shouldn't rax now contain the address of msg?
push rax ; push the address
call _printf
mov rsp, rbp
pop rbp
ret
It compiled fine with the nasm command above but now there is a warning while compiling the object file with gcc to actual program:
$ gcc main.o
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not
allowed in code signed PIE, but used in _main from main.o. To fix this warning,
don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
Since it's a warning not an error I've executed the a.out file:
$ ./a.out
Segmentation fault: 11
Hope anyone knows what I'm doing wrong.
The 64-bit OS X ABI complies at large to the System V ABI - AMD64 Architecture Processor Supplement. Its code model is very similar to the Small position independent code model (PIC) with the differences explained here. In that code model all local and small data is accessed directly using RIP-relative addressing. As noted in the comments by Z boson, the image base for 64-bit Mach-O executables is beyond the first 4 GiB of the virtual address space, therefore push msg is not only an invalid way to put the address of msg on the stack, but it is also an impossible one since PUSH does not support 64-bit immediate values. The code should rather look similar to:
; this is what you *would* do for later args on the stack
lea rax, [rel msg] ; RIP-relative addressing
push rax
But in that particular case one needs not push the value on the stack at all. The 64-bit calling convention mandates that the fist 6 integer/pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9, exactly in that order. The first 8 floating-point or vector arguments go into XMM0, XMM1, ..., XMM7. Only after all the available registers are used or there are arguments that cannot fit in any of those registers (e.g. a 80-bit long double value) the stack is used. 64-bit immediate pushes are performed using MOV (the QWORD variant) and not PUSH. Simple return values are passed back in the RAX register. The caller must also provide stack space for the callee to save some of the registers.
printf is a special function because it takes variable number of arguments. When calling such functions AL (the low byte of RAX) should be set to the number of floating-point arguments, passed in the vector registers. Also note that RIP-relative addressing is preferred for data that lies within 2 GiB of the code.
Here is how gcc translates printf("This is a test\n"); into assembly on OS X:
xorl %eax, %eax # (1)
leaq L_.str(%rip), %rdi # (2)
callq _printf # (3)
L_.str:
.asciz "This is a test\n"
(this is AT&T style assembly, source is left, destination is right, register names are prefixed with %, data width is encoded as a suffix to the instruction name)
At (1) zero is put into AL (by zeroing the whole RAX which avoids partial-register delays) since no floating-point arguments are being passed. At (2) the address of the string is loaded in RDI. Note how the value is actually an offset from the current value of RIP. Since the assembler doesn't know what this value would be, it puts a relocation request in the object file. The linker then sees the relocation and puts the correct value at link time.
I am not a NASM guru, but I think the following code should do it:
default rel ; make [rel msg] the default for [msg]
section .data
msg: db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp ; re-aligns the stack by 16 before call
mov rbp, rsp
xor eax, eax ; al = 0 FP args in XMM regs
lea rdi, [rel msg]
call _printf
mov rsp, rbp
pop rbp
ret
No answer yet has explained why NASM reports
Mach-O 64-bit format does not support 32-bit absolute addresses
The reason NASM won't do this is explained in Agner Fog's Optimizing Assembly manual in section 3.3 Addressing modes under the subsection titled 32-bit absolute addressing in 64 bit mode he writes
32-bit absolute addresses cannot be used in Mac OS X, where addresses are above 2^32 by
default.
This is not a problem on Linux or Windows. In fact I already showed this works at static-linkage-with-glibc-without-calling-main. That hello world code uses 32-bit absolute addressing with elf64 and runs fine.
#HristoIliev suggested using rip relative addressing but did not explain that 32-bit absolute addressing in Linux would work as well. In fact if you change lea rdi, [rel msg] to lea rdi, [msg] it assembles and runs fine with nasm -efl64 but fails with nasm -macho64
Like this:
section .data
msg db 'This is a test', 10, 0 ; something stupid here
section .text
global _main
extern _printf
_main:
push rbp
mov rbp, rsp
xor al, al
lea rdi, [msg]
call _printf
mov rsp, rbp
pop rbp
ret
You can check that this is an absolute 32-bit address and not rip relative with objdump. However, it's important to point out that the preferred method is still rip relative addressing. Agner in the same manual writes:
There is absolutely no reason to use absolute addresses for simple memory operands. Rip-
relative addresses make instructions shorter, they eliminate the need for relocation at load
time, and they are safe to use in all systems.
So when would use use 32-bit absolute addresses in 64-bit mode? Static arrays is a good candidate. See the following subsection Addressing static arrays in 64 bit mode. The simple case would be e.g:
mov eax, [A+rcx*4]
where A is the absolute 32-bit address of the static array. This works fine with Linux but once again you can't do this with Mac OS X because the image base is larger than 2^32 by default. To to this on Mac OS X see example 3.11c and 3.11d in Agner's manual. In example 3.11c you could do
mov eax, [(imagerel A) + rbx + rcx*4]
Where you use the extern reference from Mach O __mh_execute_header to get the image base. In example 3.11c you use rip relative addressing and load the address like this
lea rbx, [rel A]; rel tells nasm to do [rip + A]
mov eax, [rbx + 4*rcx] ; A[i]
According to the documentation for the x86 64bit instruction set http://download.intel.com/products/processor/manual/325383.pdf
PUSH only accepts 8, 16 and 32bit immediate values (64bit registers and register addressed memory blocks are allowed though).
PUSH msg
Where msg is a 64bit immediate address will not compile as you found out.
What calling convention is _printf defined as in your 64bit library?
Is it expecting the parameter on the stack or using a fast-call convention where the parameters on in registers? Because x86-64 makes more general purpose registers available the fast-call convention is used more often.

Resources