I want to get the value of EIP with the following code, but it does not compile.
Command :
gcc -o xxx x86_inline_asm.c -m32 && ./xxx
File content of x86_inline_asm.c:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned int eip_val;
__asm__("mov %0,%%eip":"=r"(eip_val));
return 0;
}
How can I use inline assembly to get the value of EIP so that it compiles successfully for x86?
How should I modify the code and the command to make this work?
This sounds unlikely to be useful (vs. just taking the address of the whole function like void *tmp = main), but it is possible.
Just get a label address, or use . (the address of the current line), and let the linker worry about getting the right immediate into the machine code. So you're not architecturally reading EIP, just reading the value it currently has from an immediate.
asm volatile("mov $., %0" : "=r"(address_of_mov_instruction) );
AT&T syntax is mov src, dst, so what you wrote would be a jump if it assembled.
Architecturally, EIP points to the end of an instruction while it's executing, so arguably you should do:
asm volatile(
"mov $1f, %0 \n\t" // reference label 1 forward
"1:" // GAS local label
: "=r"(address_after_mov) // output operand list starts with a colon
);
I'm using asm volatile in case this asm statement gets duplicated multiple times inside the same function by inlining or something. If you want each case to get a different address, it has to be volatile. Otherwise the compiler can assume that all instances of this asm statement produce the same output. Normally that will be fine.
Architecturally, in 32-bit mode you don't have RIP-relative addressing for LEA, so the only good way to actually read EIP is call / pop (see "Reading program counter directly"). EIP is not a general-purpose register, so you can't just use it as the source or destination of a mov or any other instruction.
But really you don't need inline asm for this at all.
The Q&A "Is it possible to store the address of a label in a variable and use goto to jump to it?" shows how to use the GNU C extension where &&label takes its address.
int foo;
void *addr_inside_function() {
foo++;
lab1: ; // labels only go on statements, not declarations
void *tmp = &&lab1;
foo++;
return tmp;
}
There's nothing you can safely do with this address outside the function; I returned it just as an example to make the compiler put a label in the asm and see what happens. Without a goto to that label, it can still optimize the function pretty aggressively, but you might find it useful as an input for an asm goto(...) somewhere else in the function.
But anyway, it compiles on Godbolt to this asm
# gcc -O3 -m32
addr_inside_function:
.L2:
addl $2, foo
movl $.L2, %eax
ret
#clang -O3 -m32
addr_inside_function:
movl foo, %eax
leal 1(%eax), %ecx
movl %ecx, foo
.Ltmp0: # Block address taken
addl $2, %eax
movl %eax, foo
movl $.Ltmp0, %eax # retval = label address
retl
So clang loads the global, computes foo+1 and stores it, then after the label computes foo+2 and stores that (instead of loading twice). So you still can't usefully jump to the label from anywhere, because it depends on having foo's old value in eax, and on the desired behaviour being to store foo+2.
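Inside the function that owns the label, the address is safe to use with a computed goto. The sketch below is only an illustration (count_steps and its label names are made up, not taken from the question) and relies on the GNU C &&label and goto * extensions, so it assumes GCC or clang:

```c
/* Counts how many times the loop body runs, dispatching through
 * label addresses (GNU C extension, hypothetical example). */
static int count_steps(int n)
{
    int steps = 0;
    void *target;

loop:
    /* Pick the next label at run time and jump to it. */
    target = (n > 0) ? &&body : &&done;
    goto *target;

body:
    steps++;
    n--;
    goto loop;

done:
    return steps;
}
```

Calling count_steps(3) runs the body three times and returns 3. Because every jump stays inside the function, the compiler knows all possible targets and the problem described above does not arise.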
I don't know gcc inline assembly syntax for this, but for masm:
call next0
next0: pop eax ;eax = eip for this line
In the case of Masm, $ represents the current location, and since call is a 5 byte instruction, an alternative syntax without a label would be:
call $+5
pop eax
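For GCC, the same call / pop idea might be written roughly as below. This is a sketch under assumptions, not a tested recipe for every toolchain: current_pc is a hypothetical helper name, the asm is AT&T syntax (so it assumes the default -masm=att), and the non-x86 fallback using __builtin_return_address only returns an address near the caller, not the exact program counter.

```c
#include <stdint.h>

/* Hypothetical helper: returns an address at (or near) the point of use. */
static inline uintptr_t current_pc(void)
{
    uintptr_t pc;
#if defined(__i386__)
    /* call pushes the address of the next instruction (label 1),
     * and pop retrieves it into a register chosen by the compiler. */
    __asm__ volatile(
        "call 1f\n"
        "1:\tpop %0"
        : "=r"(pc));
#elif defined(__x86_64__)
    /* 64-bit mode has RIP-relative addressing, so a single LEA is enough
     * and the stack (including the red zone) is left untouched. */
    __asm__ volatile("lea 0(%%rip), %0" : "=r"(pc));
#else
    /* Assumption: on other targets, settle for the return address. */
    pc = (uintptr_t)__builtin_return_address(0);
#endif
    return pc;
}
```

The 32-bit branch is safe only because the i386 ABI has no red zone below the stack pointer; that is why the 64-bit branch avoids call / pop.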
I am testing simple inline assembly code using gcc. And I find the result of the following code unexpected:
#include <stdio.h>
int main(void) {
unsigned x0 = 0, x1 = 1, x2 = 2;
__asm__ volatile("movl %1, %0;\n\t"
"movl %2, %1"
:"=r"(x0), "+r"(x1)
:"r"(x2)
:);
printf("%u, %u\n", x0, x1);
return 0;
}
The printed result is 1, 1, rather than the expected 1, 2. Then I compiled the code with -S option and found out gcc generated the code as
movl %eax, %edx;
movl %edx, %eax;
%0 and %2 are using the same register, why?
I want gcc to generate, say,
movl %eax, %edx;
movl %ecx, %eax;
If I add "0"(x1) to the input constraints, gcc will generate the code above. Does it mean that all registers need to be initialized before being used in inline assembly?
Moving my comment to an 'Answer' so this question can be closed.
To prevent the compiler from re-using a register for both an input and an output, you can use the early clobber constraint (for example =&r (x)), which informs the compiler that the register associated with the parameter is written before the instruction is finished using the input operands.
While sharing registers can be a good thing (since it reduces the number of registers that must be made available before calling your asm), it can also cause problems (as you have seen). So, either make sure you have finished using all the inputs before writing to the output, or use & to tell the compiler not to do this optimization.
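Applied to the code from the question, the early clobber fix could look like the sketch below. The wrapper name copy_then_overwrite is made up for illustration, and the plain C branch is only there so the example builds on non-x86 targets:

```c
/* Hypothetical wrapper around the two movl instructions from the question:
 * *x0 receives the old value of *x1, then *x1 receives x2. */
static void copy_then_overwrite(unsigned *x0, unsigned *x1, unsigned x2)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned out0;
    unsigned inout1 = *x1;
    __asm__ volatile(
        "movl %1, %0\n\t"   /* out0 = inout1 (written before x2 is read) */
        "movl %2, %1"       /* inout1 = x2 */
        : "=&r"(out0),      /* & = early clobber: never shares with an input */
          "+r"(inout1)
        : "r"(x2));
    *x0 = out0;
    *x1 = inout1;
#else
    /* Assumption: plain C stand-in for non-x86 builds. */
    *x0 = *x1;
    *x1 = x2;
#endif
}
```

With x1 = 1 and x2 = 2 this yields x0 = 1 and x1 = 2, the result the question expected.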
For completeness, let me also point out that using inline asm is usually a bad idea.
I am writing a boot loader in asm and want to add some compiled C code to my project.
I created a test function here:
test.c
__asm__(".code16\n");
void print_str() {
__asm__ __volatile__("mov $'A' , %al\n");
__asm__ __volatile__("mov $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");
}
And here is the asm code (the boot loader):
hw.asm
[org 0x7C00]
[BITS 16]
[extern print_str] ;nasm tip
start:
mov ax, 0
mov ds, ax
mov es, ax
mov ss, ax
mov sp, 0x7C00
mov si, name
call print_string
mov al, ' '
int 10h
mov si, version
call print_string
mov si, line_return
call print_string
call print_str ;call function
mov si, welcome
call print_string
jmp mainloop
mainloop:
mov si, prompt
call print_string
mov di, buffer
call get_str
mov si, buffer
cmp byte [si], 0
je mainloop
mov si, buffer
;call print_string
mov di, cmd_version
call strcmp
jc .version
jmp mainloop
.version:
mov si, name
call print_string
mov al, ' '
int 10h
mov si, version
call print_string
mov si, line_return
call print_string
jmp mainloop
name db 'MOS', 0
version db 'v0.1', 0
welcome db 'Developped by Marius Van Nieuwenhuyse', 0x0D, 0x0A, 0
prompt db '>', 0
line_return db 0x0D, 0x0A, 0
buffer times 64 db 0
cmd_version db 'version', 0
%include "functions/print.asm"
%include "functions/getstr.asm"
%include "functions/strcmp.asm"
times 510 - ($-$$) db 0
dw 0xaa55
I need to call the C function like a simple asm function.
Without the extern and the call print_str, the asm code boots in VMware.
I tried to compile with:
nasm -f elf32
But then I can't use org 0x7C00.
Compiling & Linking NASM and GCC Code
This question has a more complex answer than one might believe, although it is possible. Can the first stage of a bootloader (the original 512 bytes that get loaded at physical address 0x07c00) make a call into a C function? Yes, but it requires rethinking how you build your project.
For this to work you can no longer use -f bin with NASM. This also means you can't use org 0x7c00 to tell the assembler what address the code expects to start from. You'll need to do this through a linker (either use LD directly or GCC for linking). Since the linker will lay things out in memory we can't rely on placing the boot sector signature 0xaa55 in our output file ourselves. We can get the linker to do that for us.
The first problem you will discover is that the default linker scripts used internally by GCC don't lay things out the way we want. We'll need to create our own. Such a linker script will have to set the origin point (Virtual Memory Address aka VMA) to 0x7c00, place the code from your assembly file before the data and place the boot signature at offset 510 in the file. I'm not going to write a tutorial on Linker scripts. The Binutils Documentation contains almost everything you need to know about linker scripts.
OUTPUT_FORMAT("elf32-i386");
/* We define an entry point to keep the linker quiet. This entry point
* has no meaning with a bootloader in the binary image we will eventually
* generate. Bootloader will start executing at whatever is at 0x07c00 */
ENTRY(start);
SECTIONS
{
. = 0x7C00;
.text : {
/* Place the code in hw.o before all other code */
hw.o(.text);
*(.text);
}
/* Place the data after the code */
.data : SUBALIGN(2) {
*(.data);
*(.rodata*);
}
/* Place the boot signature at LMA/VMA 0x7DFE */
.sig 0x7DFE : {
SHORT(0xaa55);
}
/* Place the uninitialised data in the area after our bootloader
* The BIOS only reads the 512 bytes before this into memory */
.bss : SUBALIGN(4) {
__bss_start = .;
*(COMMON);
*(.bss)
. = ALIGN(4);
__bss_end = .;
}
__bss_sizeb = SIZEOF(.bss);
/* Remove sections that won't be relevant to us */
/DISCARD/ : {
*(.eh_frame);
*(.comment);
}
}
This script should create an ELF executable that can be converted to a flat binary file with OBJCOPY. We could have output as a binary file directly but I separate the two processes out in the event I want to include debug information in the ELF version for debug purposes.
Now that we have a linker script we must remove the ORG 0x7c00 and the boot signature. For simplicity's sake we'll try to get the following code (hw.asm) to work:
extern print_str
global start
bits 16
section .text
start:
xor ax, ax ; AX = 0
mov ds, ax
mov es, ax
mov ss, ax
mov sp, 0x7C00
call print_str ; call function
; Halt the processor so we don't keep executing code beyond this point
cli
hlt
You can include all your other code, but this sample will still demonstrate the basics of calling into a C function.
Assuming the code above is in hw.asm, you can now generate the ELF object hw.o using this command:
nasm -f elf32 hw.asm -o hw.o
You compile each C file with something like:
gcc -ffreestanding -c kmain.c -o kmain.o
I placed the C code you had into a file called kmain.c. The command above will generate kmain.o. I noticed you aren't using a cross compiler so you'll want to use -fno-PIE to ensure we don't generate position-independent code. -ffreestanding tells GCC the C standard library may not exist, and main may not be the program entry point. You'd compile each C file in the same way.
To link this code to a final executable and then produce a flat binary file that can be booted we do this:
ld -melf_i386 --build-id=none -T link.ld kmain.o hw.o -o kernel.elf
objcopy -O binary kernel.elf kernel.bin
You specify all the object files to link with the LD command. The LD command above will produce a 32-bit ELF executable called kernel.elf. This file can be useful in the future for debugging purposes. Here we use OBJCOPY to convert kernel.elf to a binary file called kernel.bin. kernel.bin can be used as a bootloader image.
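Before booting the image it can be worth checking that the signature really ended up at offset 510. The helper below is a minimal sketch for that check; has_boot_signature is a made-up name, and the caller is assumed to have read kernel.bin into a buffer already:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the first 512-byte sector ends with the boot signature.
 * The 16-bit value 0xAA55 is stored little-endian, so the bytes on disk
 * are 0x55 at offset 510 and 0xAA at offset 511. */
static int has_boot_signature(const uint8_t *image, size_t size)
{
    if (size < 512)
        return 0;   /* too short to be a complete boot sector */
    return image[510] == 0x55 && image[511] == 0xAA;
}
```

If this returns 0 for kernel.bin, the likely suspects are the .sig section address in the linker script or code and data that grew past offset 510.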
You should be able to run it with QEMU using this command:
qemu-system-i386 -fda kernel.bin
When run, the letter A should appear on the last line of the display. This is what we'd expect from the print_str code.
GCC Inline Assembly is Hard to Get Right
If we take your example code in the question:
__asm__ __volatile__("mov $'A' , %al\n");
__asm__ __volatile__("mov $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");
The compiler is free to reorder these __asm__ statements if it wanted to. The int $0x10 could appear before the MOV instructions. If you want these 3 lines to be output in this exact order you can combine them into one like this:
__asm__ __volatile__("mov $'A' , %al\n\t"
"mov $0x0e, %ah\n\t"
"int $0x10");
These are basic assembly statements. It's not required to specify __volatile__ on them as they are already implicitly volatile, so it has no effect. From the original poster's answer it is clear they want to eventually use variables in __asm__ blocks. This is doable with extended inline assembly (the instruction string is followed by a colon : followed by constraints):
With extended asm you can read and write C variables from assembler and perform jumps from assembler code to C labels. Extended asm syntax uses colons (‘:’) to delimit the operand parameters after the assembler template:
asm [volatile] ( AssemblerTemplate
: OutputOperands
[ : InputOperands
[ : Clobbers ] ])
This answer isn't a tutorial on inline assembly. The general rule of thumb is that one should not use inline assembly unless you have to. Inline assembly done wrong can create hard to track bugs or have unusual side effects. Unfortunately doing 16-bit interrupts in C pretty much requires it, or you write the entire function in assembly (ie: NASM).
This is an example of a print_str function that takes a NUL-terminated string and prints each character out one by one using Int 10h/AH=0Eh:
#include <stdint.h>
__asm__(".code16gcc\n");
void print_str(char *str) {
while (*str) {
/* AH=0x0e, AL=char to print, BH=page, BL=fg color */
__asm__ __volatile__ ("int $0x10"
:
: "a" ((0x0e<<8) | *str++),
"b" (0x0000));
}
}
hw.asm would be modified to look like this:
push welcome
call print_str ;call function
The idea is that when this is assembled/compiled (using the commands in the first section of this answer) and run, it prints out the welcome message. Unfortunately it will almost never work, and may even crash some emulators like QEMU.
code16 is Almost Useless and Should Not be Used
In the last section we learned that a simple function that takes a parameter ends up not working and may even crash an emulator like QEMU. The main problem is that the __asm__(".code16\n"); statement really doesn't work well with the code generated by GCC. The Binutils AS documentation says:
‘.code16gcc’ provides experimental support for generating 16-bit code from gcc, and differs from ‘.code16’ in that ‘call’, ‘ret’, ‘enter’, ‘leave’, ‘push’, ‘pop’, ‘pusha’, ‘popa’, ‘pushf’, and ‘popf’ instructions default to 32-bit size. This is so that the stack pointer is manipulated in the same way over function calls, allowing access to function parameters at the same stack offsets as in 32-bit mode. ‘.code16gcc’ also automatically adds address size prefixes where necessary to use the 32-bit addressing modes that gcc generates.
.code16gcc is what you really need to be using, not .code16. This forces the GNU assembler on the back end to emit address and operand prefixes on certain instructions so that the addresses and operands are treated as 4 bytes wide, and not 2 bytes.
The hand-written code in NASM doesn't know it will be calling C code, nor does NASM have a directive like .code16gcc. You'll need to modify the assembly code to push 32-bit values on to the stack in real mode. You will also need to override the call instruction so that the return address is treated as a 32-bit value, not 16-bit. This code:
push welcome
call print_str ;call function
Should be:
jmp 0x0000:setcs
setcs:
cld
push dword welcome
call dword print_str ;call function
GCC has a requirement that the direction flag be cleared before calling any C function. I added the CLD instruction to the top of the assembly code to make sure this is the case. GCC code also needs to have CS set to 0x0000 to work properly. The FAR JMP does just that.
You can also drop the __asm__(".code16gcc\n"); on modern GCC that supports the -m16 option. -m16 automatically places a .code16gcc into the file that is being compiled.
Since GCC also uses the full 32-bit stack pointer it is a good idea to initialize ESP with 0x7c00, not just SP. Change mov sp, 0x7C00 to mov esp, 0x7C00. This ensures the full 32-bit stack pointer is 0x7c00.
The modified kmain.c code should now look like:
#include <stdint.h>
void print_str(char *str) {
while (*str) {
/* AH=0x0e, AL=char to print, BH=page, BL=fg color */
__asm__ __volatile__ ("int $0x10"
:
: "a" ((0x0e<<8) | *str++),
"b" (0x0000));
}
}
and hw.asm:
extern print_str
global start
bits 16
section .text
start:
xor ax, ax ; AX = 0
mov ds, ax
mov es, ax
mov ss, ax
mov esp, 0x7C00
jmp 0x0000:setcs ; Set CS to 0
setcs:
cld ; GCC code requires direction flag to be cleared
push dword welcome
call dword print_str ; call function
cli
hlt
section .data
welcome db 'Developped by Marius Van Nieuwenhuyse', 0x0D, 0x0A, 0
The bootloader can be built with these commands:
gcc -fno-PIC -ffreestanding -m16 -c kmain.c -o kmain.o
ld -melf_i386 --build-id=none -T link.ld kmain.o hw.o -o kernel.elf
objcopy -O binary kernel.elf kernel.bin
When run with qemu-system-i386 -fda kernel.bin it should display the welcome message.
In Most Cases GCC Produces Code that Requires 80386+
There are a number of disadvantages to GCC generated code using .code16gcc:
ES=DS=CS=SS must be 0
Code must fit in the first 64kb
GCC code has no understanding of 20-bit segment:offset addressing.
For anything but the most trivial C code, GCC doesn't generate code that can run on a 286/186/8086. It runs in real mode but it uses 32-bit operands and addressing not available on processors earlier than 80386.
If you want to access memory locations above the first 64kb then you need to be in Unreal Mode(big) before calling into C code.
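The segment:offset arithmetic that GCC knows nothing about is simple to state: the physical address is the segment shifted left by 4 bits plus the offset. The sketch below (real_mode_linear is a hypothetical helper name) shows why 0x07C0:0x0000 and 0x0000:0x7C00 both name the boot sector, and why the code above insists on segment registers of 0 so that GCC's plain offsets are also the physical addresses:

```c
#include <stdint.h>

/* Physical address for a real-mode segment:offset pair.
 * No 1 MiB wrap-around is applied, which matches a CPU with the
 * A20 line enabled (an assumption of this sketch). */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, real_mode_linear(0x07C0, 0x0000) and real_mode_linear(0x0000, 0x7C00) both give 0x7C00, while 0xFFFF:0xFFFF gives 0x10FFEF, just above the first megabyte.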
If you want to produce real 16-bit code from a more modern C compiler I recommend OpenWatcom C:
The inline assembly is not as powerful as GCC
The inline assembly syntax is different but it is easier to use and less error prone than GCC's inline assembly.
Can generate code that will run on antiquated 8086/8088 processors.
Understands 20-bit segment:offset real mode addressing and supports the concept of far and huge pointers.
wlink the Watcom linker can produce basic flat binary files usable as a bootloader.
Zero Fill the BSS Section
The BIOS boot sequence doesn't guarantee that memory is actually zero. This causes a potential problem for the zero initialized region BSS. Before calling into C code for the first time the region should be zero filled by our assembly code. The linker script I originally wrote defines a symbol __bss_start that is the offset of the BSS memory and __bss_sizeb is the size in bytes. Using this info you can use the STOSB instruction to easily zero fill it. At the top of hw.asm you can add:
extern __bss_sizeb
extern __bss_start
And after the CLD instruction and before calling any C code you can do the zero fill this way:
; Zero fill the BSS section
mov cx, __bss_sizeb ; Size of BSS computed in linker script
mov di, __bss_start ; Start of BSS defined in linker script
rep stosb ; AL still zero, Fill memory with zero
Other Suggestions
To reduce the bloat of the code generated by the compiler it can be useful to use -fomit-frame-pointer. Compiling with -Os can optimize for space (rather than speed). We have limited space (512 bytes) for the initial code loaded by the BIOS so these optimizations can be beneficial. The command line for compiling could appear as:
gcc -fno-PIC -fomit-frame-pointer -ffreestanding -m16 -Os -c kmain.c -o kmain.o
I am writing a boot loader in asm and want to add some compiled C code to my project.
Then you need to use a 16-bit x86 compiler, such as OpenWatcom.
GCC cannot safely build real-mode code, as it is unaware of some important features of the platform, including memory segmentation. Inserting the .code16 directive will make the compiler generate incorrect output. Despite appearing in many tutorials, this piece of advice is simply incorrect, and should not be used.
First I want to show how to link compiled C code with an assembled file.
I put together some Q/A from SO and arrived at this.
C code:
func.c
//__asm__(".code16gcc\n");when we use eax, 32 bit reg we cant use this as truncate
//problem
#include <stdio.h>
int x = 0;
int madd(int a, int b)
{
return a + b;
}
void mexit(){
__asm__ __volatile__("mov $0, %ebx\n");
__asm__ __volatile__("mov $1, %eax \n");
__asm__ __volatile__("int $0x80\n");
}
char* tmp;
///how to direct use of arguments in asm command
void print_str(int a, char* s){
x = a;
__asm__("mov x, %edx\n");// ;third argument: message length
tmp = s;
__asm__("mov tmp, %ecx\n");// ;second argument: pointer to message to write
__asm__("mov $1, %ebx\n");//first argument: file handle (stdout)
__asm__("mov $4, %eax\n");//system call number (sys_write)
__asm__ __volatile__("int $0x80\n");//call kernel
}
void mtest(){
printf("%s\n", "Hi");
//putchar('a');//why not work
}
///gcc -c func.c -o func
Assembly code:
hello.asm
extern mtest
extern printf
extern putchar
extern print_str
extern mexit
extern madd
section .text ;section declaration
;we must export the entry point to the ELF linker or
global _start ;loader. They conventionally recognize _start as their
;entry point. Use ld -e foo to override the default.
_start:
;write our string to stdout
push msg
push len
call print_str;
call mtest ;print "Hi"; call printf inside a void function
; use add inside func.c
push 5
push 10
call madd;
;direct call of <stdio.h> printf()
push eax
push format
call printf; ;printf(format, eax)
call mexit; ;exit to OS
section .data ;section declaration
format db "%d", 10, 0
msg db "Hello, world!",0xa ;our dear string
len equ $ - msg ;length of our dear string
; nasm -f elf32 hello.asm -o hello
;Link two files
;ld hello func -o hl -lc -I /lib/ld-linux.so.2
; ./hl run code
;chain to assemble, compile, Run
;; gcc -c func.c -o func && nasm -f elf32 hello.asm -o hello && ld hello func -o hl -lc -I /lib/ld-linux.so.2 && echo &&./hl
Chain commands for assemble, compile and Run
gcc -c func.c -o func && nasm -f elf32 hello.asm -o hello && ld hello func -o hl -lc -I /lib/ld-linux.so.2 && echo && ./hl
Edit[toDO]
Write boot loader code instead of this version
Some explanation on how ld, gcc, nasm works.
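On the comment in func.c asking how to use arguments directly in an asm statement: with extended asm the arguments can be bound to registers through constraints, so the x and tmp globals are not needed. The following is a sketch, not the author's code; write_stdout is a made-up name, the int $0x80 path assumes 32-bit Linux, and other targets fall back to the ordinary write() call so the example still builds:

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical replacement for print_str: writes len bytes of s to stdout
 * and returns the number of bytes written (or a negative error value). */
static long write_stdout(const char *s, size_t len)
{
#if defined(__linux__) && defined(__i386__)
    long ret;
    __asm__ volatile(
        "int $0x80"
        : "=a"(ret)      /* EAX: return value */
        : "a"(4),        /* EAX: system call number (sys_write) */
          "b"(1),        /* EBX: file handle (stdout) */
          "c"(s),        /* ECX: pointer to message */
          "d"(len)       /* EDX: message length */
        : "memory");     /* the kernel reads the buffer behind s */
    return ret;
#else
    /* Assumption: anywhere else, use the C library wrapper instead. */
    return (long)write(STDOUT_FILENO, s, len);
#endif
}
```

Keeping the whole system call in one asm statement also stops the compiler from placing other code between the register setup and the int $0x80.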
I'm learning assembly and I found nothing that helps me do this. Is it even possible? I can't make this work.
I want this code to take the "b" value, put it in %eax and then move the content of %eax in my output and print that ASCII character, "0" in this case.
char a;
int b=48;
__asm__ (
//Here's the "Error: operand type mismatch for `mov'
"movl %0, %%eax;"
"movl %%eax, %1;"
:"=r"(a)
:"r" (b)
:"%eax"
);
printf("%c\n",a);
The instruction responsible for the error is this one:
movl %0, %%eax
So, in order to figure out why it's causing an error, we need to understand what it says. It's a 32-bit MOV instruction (the l suffix in AT&T syntax means "long", aka DWORD). The destination operand is the 32-bit EAX register. The source operand is the first input/output operand, a. In other words, this:
"=r"(a)
which says that char a; is to be used as an output-only register.
As such, what the inline assembler wants to do is to generate code like the following:
movl %dl, %eax
(assuming, for the sake of argument that a is allocated in the dl register, but it could just as easily have been allocated in any of the 8-bit registers). The problem is, that code is invalid because there is an operand size mismatch. The source operand and destination operand are different sizes: one is 32 bits while the other is 8 bits. This cannot work.
A workaround is the movzx/movsx instructions (introduced with the 80386) which move an 8 (or 16) bit source operand into a 32-bit destination operand, either with zero extension or sign extension, respectively. In AT&T syntax, the form that moves an 8-bit source into a 32-bit destination would be movzbl (for zero extension, used with unsigned values) or movsbl (for sign extension, used with signed values).
But wait—this is the wrong workaround. Your code is invalid for another reason: a is uninitialized! And not only is a uninitialized, but you've told the inline assembler via the output constraints it is an output-only operand (the = sign)! So you can't read from it—you can only store into it.
You have your operand notation backwards. What you really wanted was something like the following:
__asm__(
"movl %1, %%eax;"
"movl %%eax, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
Of course, that's still going to give you an operand size mismatch, but it's now on the second assembly instruction. What this is telling the inline assembler to emit is the following code:
movl $48, %edx
movl %edx, %eax
movl %eax, %dl
which is invalid because a 32-bit source (%eax) cannot be moved into an 8-bit destination (%dl). And you can't fix this with movzx/movsx, because that is used to extend, not truncate. The way to write this would be the following:
movl $48, %edx
movl %edx, %eax
movb %al, %dl
where the last instruction is an 8-bit move, from an 8-bit source register to an 8-bit destination register.
In inline assembly, this would be written as:
__asm__(
"movl %1, %%eax;"
"movb %%al, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
However, this is not the correct way to use inline assembly. You've manually hard-coded the EAX register inside of the inline assembly block, which means that you had to clobber it. The problem with this is that it ties the compiler's hands behind its back when it comes to register allocation. What you're supposed to do is put everything that goes into and out of the inline assembly block in the input and output operands. This lets the compiler handle all register allocation in the most optimal way possible. The code should look as follows:
char a;
int b = 48;
int temp;
__asm__(
"movl %2, %0\n\t"
"movb %b0, %1"
: "=r"(temp),
"=r"(a)
: "r" (b)
:
);
A lot of changes happened here:
I introduced another temporary variable (appropriately named temp) and added it to the output-only operands list. This causes the compiler to allocate a register for it automatically, which we then use inside of the asm block.
Now that we're letting the compiler do the register allocation, we don't need a clobber list, so that's left empty.
The b modifier is needed on the source operand for the movb instruction to ensure that the byte-sized portion of that register is used, rather than the entire 32-bit register.
Instead of using semicolons at the end of each asm instruction, I used \n\t (except on the last one). This is what is recommended for use in inline assembly blocks, and it gets you nicer assembly output listings because it matches what the compiler does internally.
Even better would be to introduce symbolic names for the operands, making the code more readable:
char a;
int b = 48;
int temp;
__asm__(
"movl %[input], %[temp]\n\t"
"movb %b[temp], %[dest]"
: [temp] "=r"(temp),
[dest] "=r"(a)
: [input] "r" (b)
:
);
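To try this out, the statement can be wrapped in a small function as sketched below. The name low_byte_via_asm is invented for the example. One deliberate change from the code above: the constraints are "q" instead of "r", because in 32-bit mode only EAX, EBX, ECX and EDX have byte-sized sub-registers, and "r" could otherwise pick a register that movb cannot address. The plain C branch is an assumption added so the sketch builds on non-x86 targets.

```c
/* Hypothetical wrapper: returns the low byte of b as a char. */
static char low_byte_via_asm(int b)
{
#if defined(__i386__) || defined(__x86_64__)
    char a;
    int temp;
    __asm__(
        "movl %[input], %[temp]\n\t"   /* 32-bit copy into the temporary */
        "movb %b[temp], %[dest]"       /* 8-bit copy of its low byte */
        : [temp] "=q"(temp),
          [dest] "=q"(a)
        : [input] "r"(b));
    return a;
#else
    /* Assumption: plain C stand-in for non-x86 builds. */
    return (char)b;
#endif
}
```

Here low_byte_via_asm(48) returns '0', the character the question wanted to print.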
And, at this point, if you hadn't noticed already, you'd see that this code is enormously silly. You don't need all those temporaries and register-register shuffling. You can just do:
movl $48, %eax
and the value 48 is already in al, since al is the low 8 bits of the 32-bit register eax.
Or, you can do:
movb $48, %al
which is just an 8-bit move of the value 48 explicitly into the 8-bit register al.
But, in fact, if you're calling printf, the argument must be passed as an int (not a char, since it's a variadic function), so you definitely want:
movl $48, %eax
When you start using inline assembly, the compiler can't easily optimize through it, so you get inefficient code. All you really needed was:
int a = 48;
printf("%c\n",a);
Which produces the following assembly code:
pushl $48
pushl $AddressOfFormatString
call printf
addl $8, %esp
or, equivalently:
movl $48, %eax
pushl %eax
pushl $AddressOfFormatString
call printf
addl $8, %esp
Now, I imagine you're saying to yourself something like: "Yes, but if I do that, then I'm not using inline assembly!" To which my response is: exactly. You don't need inline assembly here, and in fact, you should not be using it, because it just causes problems. It's more difficult to write and leads to inefficient code generation.
If you want to learn assembly language programming, get an assembler and use that—not a C compiler's inline assembler. NASM is a popular and excellent choice, as is YASM. If you want to stick with using the Gnu assembler so you can stick with this tortuous AT&T syntax, then run as.
Since a is defined as a character (char a;), :"=r"(a) will assign an 8-bit register. The 32-bit register EAX cannot be loaded from an 8-bit register with movl: movl %dl, %eax (movl %0, %%eax) will cause this error. There are the sign extend and zero extend instructions movsx and movzx (Intel syntax), in AT&T syntax: movs... and movz... for this purpose.
Change
movl %0, %%eax;
to
movzbl %0, %%eax;
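The difference between the zero-extending (movz...) and sign-extending (movs...) forms can be seen in plain C. The two helpers below are illustrative names only, and they model what the instructions compute rather than being guaranteed to compile to them:

```c
#include <stdint.h>

/* What movzbl computes: the upper 24 bits are filled with zeros. */
static int32_t zero_extend8(uint8_t byte)
{
    return (int32_t)byte;
}

/* What movsbl computes: the upper 24 bits copy bit 7 of the byte.
 * Written with xor/subtract so no implementation-defined cast is needed. */
static int32_t sign_extend8(uint8_t byte)
{
    return (int32_t)(byte ^ 0x80) - 0x80;
}
```

For the byte 0x30 both give 48, but for 0x80 zero extension gives 128 while sign extension gives -128, which is why the choice depends on whether the source is treated as unsigned or signed.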
This is about finding Fibonacci numbers using a recursive approach, which I asked about in my previous question. Using one of the solutions from the answers, the run time of the program was almost 0. I attached GDB to the program, checked the assembly instructions, and found the following:
#include <iostream>
#include <type_traits> // std::integral_constant
template<size_t N>
struct fibonacci:std::integral_constant<size_t,fibonacci<N-1>{}+fibonacci<N-2>{}>{};
template<> struct fibonacci<1> : std::integral_constant<size_t,1> {};
template<> struct fibonacci<0> : std::integral_constant<size_t,0> {};
int main() {
int out = 0;
constexpr int number = 40;
out = fibonacci<number>();
std::cout<<"Fibonacci Series Of "<<number<<" is "<<out<<std::endl;
}
I compiled my program with the following flags, and the disassembly of main is:
$ g++ -g -gdwarf-2 -Wall -fdump-tree-all -std=c++11 fibonacci.cpp -o fibcpp
(gdb) disassemble main
Dump of assembler code for function main():
0x0000000000400890 <+0>: push %rbp
0x0000000000400891 <+1>: mov %rsp,%rbp
0x0000000000400894 <+4>: sub $0x10,%rsp
0x0000000000400898 <+8>: movl $0x0,-0x8(%rbp)
0x000000000040089f <+15>: movl $0x28,-0x4(%rbp)
0x00000000004008a6 <+22>: lea -0x9(%rbp),%rax
0x00000000004008aa <+26>: mov %rax,%rdi
=> 0x00000000004008ad <+29>: callq 0x400952 <std::integral_constant<unsigned long, 102334155ul>::operator unsigned long() const>
0x00000000004008b2 <+34>: mov %eax,-0x8(%rbp)
0x00000000004008b5 <+37>: mov $0x400a15,%esi
We can see (at the line marked with the arrow =>) that 102334155, which is fibonacci(40), is there. This indicates that all of the calculation indeed happened at compile time.
When we compile the program with the extra flag -fdump-tree-all, we get many internal dump files, and normally the fibonacci.gimple file is where template-instantiated code would go. However, in this case I did not find anything related to this calculation in the fibonacci.gimple file.
My question is: in which file does g++ calculate and store this information? My aim here is to understand more about the compile-time calculation/manipulation that happens in a C++ program.
From your disassembly it seems that the "method" operator unsigned long() is called and not inlined. When you look at its disassembly, you should see the actual returned value. It is the instantiation of integral_constant<>::operator value_type() with size_t = unsigned long as value_type.
But you might already know all that... You want to actually see it. The message https://gcc.gnu.org/ml/gcc/2011-06/msg00110.html suggests that others thought about an -ftrace-template-instantiation option, but no one has implemented it yet.
EDIT: There is lots of information about debugging and tracing templates in this question.
#include <stdio.h>
int main(void){
int sum = 0;
sum += 0xabcd;
printf("%x", sum);
return 0;
}
This is my code, and when I use gdb I get different addresses for break main and break *main.
When I type disassemble main it shows this:
Dump of assembler code for function main:
0x080483c4 <+0>: push %ebp
0x080483c5 <+1>: mov %esp,%ebp
0x080483c7 <+3>: and $0xfffffff0,%esp
0x080483ca <+6>: sub $0x20,%esp
0x080483cd <+9>: movl $0x0,0x1c(%esp)
0x080483d5 <+17>: addl $0xabcd,0x1c(%esp)
0x080483dd <+25>: mov $0x80484c0,%eax
0x080483e2 <+30>: mov 0x1c(%esp),%edx
0x080483e6 <+34>: mov %edx,0x4(%esp)
0x080483ea <+38>: mov %eax,(%esp)
0x080483ed <+41>: call 0x80482f4 <printf@plt>
0x080483f2 <+46>: mov $0x0,%eax
0x080483f7 <+51>: leave
0x080483f8 <+52>: ret
End of assembler dump.
So when I type [break *main] it stops at 0x080483c4, but when I type [break main] it stops at 0x080483cd.
Why is the start address different?
Why is the address different.
Because break function and break *address are not the same thing (*address specifies the address of the function's first instruction, before the stack frame and arguments have been set up).
In the first case, GDB skips the function prologue (setting up the current frame).
Total guess - and prepared to be totally wrong.
*main is the address of the function.
Breaking inside main stops at the first available address inside the function once it is being executed.
Note that 0x080483cd is the first place a debugger can stop, as it is modifying a variable (i.e. assigning zero to sum).
When you break at 0x080483c4, this is before the setup assembler that C knows nothing about.