Clean x86_64 assembly output with gcc? [duplicate] - gcc

This question already has answers here:
How to remove "noise" from GCC/clang assembly output?
(3 answers)
Closed 6 years ago.
I've been teaching myself GNU Assembly for a while now by writing statements in C, compiling them with "gcc -S" and studying the output. This works alright on x86 (and compiling with -m32) but on my AMD64 box, for this code (just as an example):
int main()
{
return 0;
}
GCC gives me:
.file "test.c"
.text
.globl main
.type main, #function
main:
.LFB2:
pushq %rbp
.LCFI0:
movq %rsp, %rbp
.LCFI1:
movl $0, %eax
leave
ret
.LFE2:
.size main, .-main
.section .eh_frame,"a",#progbits
.Lframe1:
.long .LECIE1-.LSCIE1
.LSCIE1:
.long 0x0
.byte 0x1
.string "zR"
.uleb128 0x1
.sleb128 -8
.byte 0x10
.uleb128 0x1
.byte 0x3
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.align 8
.LECIE1:
.LSFDE1:
.long .LEFDE1-.LASFDE1
.LASFDE1:
.long .LASFDE1-.Lframe1
.long .LFB2
.long .LFE2-.LFB2
.uleb128 0x0
.byte 0x4
.long .LCFI0-.LFB2
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI1-.LCFI0
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
.section .note.GNU-stack,"",#progbits
Compared with:
.file "test.c"
.text
.globl main
.type main, #function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
movl $0, %eax
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
.section .note.GNU-stack,"",#progbits
on x86.
Is there a way to make GCC -S on x86_64 output Assembly without the fluff?

The stuff that goes into .eh_frame section is unwind descriptors, which you only need to unwind stack (e.g. with GDB). While learning assembly, you could simply ignore it. Here is a way to do the "clean up" you want:
gcc -S -o - test.c | sed -e '/^\.L/d' -e '/\.eh_frame/Q'
.file "test.c"
.text
.globl main
.type main,#function
main:
pushq %rbp
movq %rsp, %rbp
movl $0, %eax
leave
ret
.size main,.Lfe1-main

You can try placing the code you want to study in a function.
E.g.:
int ftest(void)
{
return 0;
}
int main(void)
{
return ftest();
}
If you look at the assembly-source for test it will be as clean as you need.
..snip..
test:
.LFB2:
pushq %rbp
.LCFI0:
movq %rsp, %rbp
.LCFI1:
movl $0, %eax
leave
ret
..snip..

I've found that using the -Os flag makes things clearer. I tried it your tiny example, but it made very little difference.
That said, I remember it being helpful when I was learning assembly (on a Sparc).

Related

GCC LLVM assemble on different CPU arches

Hi there Im trying to understand more abt the compilers;
when using gcc -S
it generates a .s file like
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 12
.globl _main
.p2align 4, 0x90
_main: ## #main
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp0:
.cfi_def_cfa_offset 16
Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp2:
.cfi_def_cfa_register %rbp
subq $16, %rsp
leaq L_.str(%rip), %rdi
movb $0, %al
callq _printf
xorl %ecx, %ecx
movl %eax, -4(%rbp) ## 4-byte Spill
movl %ecx, %eax
addq $16, %rsp
popq %rbp
retq
.cfi_endproc
.section __TEXT,__cstring,cstring_literals
L_.str: ## #.str
.asciz "Hello World!\n"
.subsections_via_symbols
my question is : is this .s assemble file generated based on specific cpu arch like x86 or its cpu arch irrelevant?
if its irrelevant, will we add cpu config like ARMv7s in command "gcc -O " ?
Question 2:
llvm-gcc -S generated code is pretty much different from a assemble language; is that a cpu arch irrelevant LLVM IR language? and the LLVM backend handle the rest of jobs to convert it to specific cpu arches?
many thanks

Why does osx 64-bit asm syscall segfault

I'm trying to write an x86-64 hello world in assembly on OSX, but whenever I make a syscall to write, it's segfaulting. I've tried the equivalent syscall via Gnu C inline assembly and it works, so I'm thoroughly confused:
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main:
.cfi_startproc
movq 0x2000004, %rax
movq 1, %rdi
leaq _hi(%rip), %rsi
movq 12, %rdx
syscall
xor %rax, %rax
ret
.cfi_endproc
.section __DATA,__data
.globl _hi
_hi:
.asciz "Hello world\n"
This is based off of the following Gnu C, which works:
#include <string.h>
int main() {
char *hw = "Hello World\n";
unsigned long long result;
asm volatile ("movq %1, %%rax\n"
"movq %2, %%rdi\n"
"movq %3, %%rsi\n"
"movq %4, %%rdx\n"
"syscall\n"
: "=rax" (result)
: "Z" (0x2000004),
"Z" (1),
"r" (hw),
"Z" (12)
: "rax", "rdi", "rsi", "rdx");
}
The C block when compiled generates the following asm:
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## #main
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
leaq L_.str(%rip), %rcx
movq %rcx, -8(%rbp)
## InlineAsm Start
movq $33554436, %rax
movq $1, %rdi
movq %rcx, %rsi
movq $12, %rdx
syscall
## InlineAsm End
movq %rcx, -16(%rbp)
xorl %eax, %eax
popq %rbp
ret
.cfi_endproc
.section __TEXT,__cstring,cstring_literals
L_.str: ## #.str
.asciz "Hello World\n"
Your problem is on these few lines:
movq 0x2000004, %rax
movq 1, %rdi
leaq _hi(%rip), %rsi
movq 12, %rdx
Be aware that with at&t syntax that if you want to use constants you MUST prefix them with a $ (dollar sign) otherwise you are referencing memory addresses. Without a $ sign your value is an immediate indirect address.
For instance:
movq 0x2000004, %rax
attempts to move the quadword from memory address 0x2000004 and place it in %rax.
You probably just have to modify your code to look like:
movq $0x2000004, %rax
movq $1, %rdi
leaq _hi(%rip), %rsi
movq $12, %rdx
Notice that I have added a dollar sign to the beginning of each constant.
Here is a simple 64-bit "Hello World" (or Hello StackOverflow) in this case. It should build on OSX. Give it a try:
section .data
string1 db 0xa, " Hello StackOverflow!!!", 0xa, 0xa, 0
len equ $ - string1
section .text
global _start
_start:
; write string to stdout
mov rax, 1 ; set write to command
mov rsi, string1 ; string1 to source index
mov rdi, rax ; set destination index to 1 (stdout) already in rax
mov rdx, len ; set length in rdx
syscall ; call kernel
; exit
xor rdi,rdi ; zero rdi (rdi hold return value)
mov rax, 0x3c ; set syscall number to 60 (0x3c hex)
syscall ; call kernel
; **Compile/Output**
;
; $ nasm -felf64 -o hello-stack_64.o hello-stack_64.asm
; $ ld -o hello-stack_64 hello-stack_64.o
; $ ./hello-stack_64
;
; Hello StackOverflow!!!

Where does C function's TAN return its value in 64-bit GCC?

I am linking my assembly function with GCC on linux 64-bit. The library I use is TAN from math.h. I link it with;
gcc -s prog.o -o prog -lm
The program works but the return value is 0.0000000 (for 3.4 radian). I use extrn in my assembly code;
extrn tan
extrn printf
I use xmm0 to pass the argument (in radian) to the TAN function. Now I am not sure which register is used to return the value from TAN. Is it xmm0, st0 or in RAX? I can't find a decent reference on this.
For my gcc, it's xmm0.
Here's a C program:
#include <stdio.h>
#include <math.h>
int main () {
double x = tan(M_PI/4.0);
// RESULT: x=1.000000
printf ("x=%f\n", x);
return 0;
}
And here's the corresponding "gcc -S":
.Ltext0:
.section .rodata
.LC1:
.string "x=%f\n"
.text
.globl main
.type main, #function
main:
.LFB0:
.file 1 "x.cpp"
.loc 1 4 0
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
.LBB2:
.loc 1 6 0
movabsq $4607182418800017407, %rax
movq %rax, -8(%rbp)
.loc 1 8 0
movq -8(%rbp), %rax
movq %rax, -24(%rbp)
movsd -24(%rbp), %xmm0
movl $.LC1, %edi
movl $1, %eax
call printf
.loc 1 9 0
movl $0, %eax
.LBE2:
.loc 1 10 0
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc

Clang's ASM output vs GCC's

(I don't know almost anything about assembly language yet).
I'm trying to follow this tutorial.
The problem is that his compiler, and my test setup (gcc on Linux 32 bit) produces completely different, and significantly less output than my main setup (clang on OSX 64 bit).
Here are my outputs for int main() {}
gcc on Linux 32 bit
$ cat blank.c
int main() {}
$ gcc -S blank.c
$ cat blank.s
.file "blank.c"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
popl %ebp
.cfi_def_cfa 4, 4
.cfi_restore 5
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",#progbits
clang on Mac OSX 64 bit
$ cat blank.c
int main() {}
$ clang -S blank.c
$ cat blank.s
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## #main
Leh_func_begin0:
## BB#0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
movl $0, %eax
popq %rbp
ret
Leh_func_end0:
.section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support
EH_frame0:
Lsection_eh_frame0:
Leh_frame_common0:
Lset0 = Leh_frame_common_end0-Leh_frame_common_begin0 ## Length of Common Information Entry
.long Lset0
Leh_frame_common_begin0:
.long 0 ## CIE Identifier Tag
.byte 1 ## DW_CIE_VERSION
.asciz "zR" ## CIE Augmentation
.byte 1 ## CIE Code Alignment Factor
.byte 120 ## CIE Data Alignment Factor
.byte 16 ## CIE Return Address Column
.byte 1 ## Augmentation Size
.byte 16 ## FDE Encoding = pcrel
.byte 12 ## DW_CFA_def_cfa
.byte 7 ## Register
.byte 8 ## Offset
.byte 144 ## DW_CFA_offset + Reg (16)
.byte 1 ## Offset
.align 3
Leh_frame_common_end0:
.globl _main.eh
_main.eh:
Lset1 = Leh_frame_end0-Leh_frame_begin0 ## Length of Frame Information Entry
.long Lset1
Leh_frame_begin0:
Lset2 = Leh_frame_begin0-Leh_frame_common0 ## FDE CIE offset
.long Lset2
Ltmp2: ## FDE initial location
Ltmp3 = Leh_func_begin0-Ltmp2
.quad Ltmp3
Lset3 = Leh_func_end0-Leh_func_begin0 ## FDE address range
.quad Lset3
.byte 0 ## Augmentation size
.byte 4 ## DW_CFA_advance_loc4
Lset4 = Ltmp0-Leh_func_begin0
.long Lset4
.byte 14 ## DW_CFA_def_cfa_offset
.byte 16 ## Offset
.byte 134 ## DW_CFA_offset + Reg (6)
.byte 2 ## Offset
.byte 4 ## DW_CFA_advance_loc4
Lset5 = Ltmp1-Ltmp0
.long Lset5
.byte 13 ## DW_CFA_def_cfa_register
.byte 6 ## Register
.align 3
Leh_frame_end0:
.subsections_via_symbols
Is it possible to generate similar assembly output on my Mac, so I can follow the tutorial? or is this assembly code platform-specific? And if it is, what flags on clang can I use to generate less verbose/boilerplate(?) code?
Make sure you instruct clang to generate 32 bit code with clang -m32 on Mac OSX 64 bit and you basically don't have to worry about the other differences.
Both the .cfi_XXX directives in the gcc output and the lines after .section __TEXT,__eh_frame in the clang output are used to generate the .eh_frame section for stack unwinding. For details, see: http://blog.mozilla.org/respindola/2011/05/12/cfi-directives/
Compile your program with gcc -fno-asynchronous-unwind-tables. Or just ignore various .cfi_XYZ directives. For the clang case, just don't pay attention to the __eh_frame section. Bear in mind that it's rather uncommon for two different compilers to generate identical code, even from identical source.

using winapi in mingw assembler

I'm tring to compile the following assembly code into shared library (dll)
.extern _GetProcAddress
.global _main
_main:
CALL _GetProcAddress
using the following commend:
i686-w64-mingw32-gcc -shared -o file.dll file.S
but I'm getting the following link error:
`_GetProcAddress' referenced in section `.text' of /tmp/ccamwU4N.o: defined in discarded section `.text' of /usr/lib/gcc/i686-w64-mingw32/4.6/../../../../i686-w64-mingw32/lib/../lib/libkernel32.a(dxprbs00553.o)
how can I fix this error and call winapi functions?
This code:
// file: winapi0.s
.extern ___main
.extern _GetStdHandle#4
.extern _WriteConsoleA#20
.data
_text:
.ascii "Hello World!\12\0"
.section .text.startup,"x"
.p2align 2,,3
.globl _main
_main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $52, %esp
call ___main
movl $-11, (%esp)
call _GetStdHandle#4
pushl %edx
movl $0, 16(%esp)
leal -12(%ebp), %edx
movl %edx, 12(%esp)
movl $14, 8(%esp)
movl $_text, 4(%esp)
movl %eax, (%esp)
call _WriteConsoleA#20
subl $20, %esp
xorl %eax, %eax
movl -4(%ebp), %ecx
leave
leal -4(%ecx), %esp
ret
Compiles for me just fine with:
gcc -Wall -O2 winapi0.s -o winapi0.exe
And it works by printing this:
Hello World!
It's a slightly cleaned up version of this C code translated by gcc into assembly with the -S switch:
// file: winapi.c
#include <windows.h>
#include <tchar.h>
TCHAR text[] = _T("Hello World!\n");
int main(void)
{
DWORD err;
WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE),
text,
sizeof(text) / sizeof(text[0]),
&err,
NULL);
return 0;
}
I got originally this:
.file "winapi.c"
.def ___main; .scl 2; .type 32; .endef
.section .text.startup,"x"
.p2align 2,,3
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB14:
.cfi_startproc
leal 4(%esp), %ecx
.cfi_def_cfa 1, 0
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
.cfi_escape 0x10,0x5,0x2,0x75,0
pushl %ecx
.cfi_escape 0xf,0x3,0x75,0x7c,0x6
subl $52, %esp
call ___main
movl $-11, (%esp)
call _GetStdHandle#4
pushl %edx
movl $0, 16(%esp)
leal -12(%ebp), %edx
movl %edx, 12(%esp)
movl $14, 8(%esp)
movl $_text, 4(%esp)
movl %eax, (%esp)
call _WriteConsoleA#20
subl $20, %esp
xorl %eax, %eax
movl -4(%ebp), %ecx
.cfi_def_cfa 1, 0
leave
leal -4(%ecx), %esp
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE14:
.globl _text
.data
_text:
.ascii "Hello World!\12\0"
.def _GetStdHandle#4; .scl 2; .type 32; .endef
.def _WriteConsoleA#20; .scl 2; .type 32; .endef
Can't you do the same for the 64-bit case?
Link with the import library for the Windows DLL that contains the function you want to call.
I believe that symbol is contained in kernel32.dll. You'll need to make an import library and then link with it.

Resources