Assembly inline AT&T Type mismatch - gcc

I'm learning assembly and I found nothing that helps me do this. Is it even possible? I can't make this work.
I want this code to take the "b" value, put it in %eax and then move the content of %eax in my output and print that ASCII character, "0" in this case.
char a;
int b=48;
__asm__ (
//Here's the "Error: operand type mismatch for `mov'
"movl %0, %%eax;"
"movl %%eax, %1;"
:"=r"(a)
:"r" (b)
:"%eax"
);
printf("%c\n",a);

The instruction responsible for the error is this one:
movl %0, %%eax
So, in order to figure out why it's causing an error, we need to understand what it says. It's a 32-bit MOV instruction (the l suffix in AT&T syntax means "long", aka DWORD). The destination operand is the 32-bit EAX register. The source operand is the first input/output operand, a. In other words, this:
"=r"(a)
which says that char a; is to be used as an output-only register.
As such, what the inline assembler wants to do is to generate code like the following:
movl %dl, %eax
(assuming, for the sake of argument that a is allocated in the dl register, but it could just as easily have been allocated in any of the 8-bit registers). The problem is, that code is invalid because there is an operand size mismatch. The source operand and destination operand are different sizes: one is 32 bits while the other is 8 bits. This cannot work.
A workaround is the movzx/movsx instructions (introduced with the 80386) which move an 8 (or 16) bit source operand into a 32-bit destination operand, either with zero extension or sign extension, respectively. In AT&T syntax, the form that moves an 8-bit source into a 32-bit destination would be movzbl (for zero extension, used with unsigned values) or movsbl (for sign extension, used with signed values).
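To make the difference between the two extensions concrete, here is a minimal sketch. The function names zero_extend8 and sign_extend8 are hypothetical, and the "q" constraint (a register that has a byte-sized name) is my choice rather than something from the question. On non-x86 targets the sketch falls back to plain casts.

```c
#include <stdint.h>

/* Hypothetical helper: widen an unsigned byte to 32 bits with movzbl. */
uint32_t zero_extend8(uint8_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t out;
    /* %1 is an 8-bit register, %0 a 32-bit one; movzbl fills the top 24 bits with 0. */
    __asm__("movzbl %1, %0" : "=r"(out) : "q"(v));
    return out;
#else
    return (uint32_t)v;   /* assumption: plain C fallback on other targets */
#endif
}

/* Hypothetical helper: widen a signed byte to 32 bits with movsbl. */
int32_t sign_extend8(int8_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    int32_t out;
    /* movsbl copies the sign bit of the byte into the top 24 bits. */
    __asm__("movsbl %1, %0" : "=r"(out) : "q"(v));
    return out;
#else
    return (int32_t)v;    /* assumption: plain C fallback on other targets */
#endif
}
```

Both instructions only ever widen; neither can be used to squeeze a 32-bit value into an 8-bit destination.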
But wait—this is the wrong workaround. Your code is invalid for another reason: a is uninitialized! And not only is a uninitialized, but you've told the inline assembler via the output constraints it is an output-only operand (the = sign)! So you can't read from it—you can only store into it.
You have your operand notation backwards. What you really wanted was something like the following:
__asm__(
"movl %1, %%eax;"
"movl %%eax, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
Of course, that's still going to give you an operand size mismatch, but it's now on the second assembly instruction. What this is telling the inline assembler to emit is the following code:
movl $48, %edx
movl %edx, %eax
movl %eax, %dl
which is invalid because a 32-bit source (%eax) cannot be moved into an 8-bit destination (%dl). And you can't fix this with movzx/movsx, because that is used to extend, not truncate. The way to write this would be the following:
movl $48, %edx
movl %edx, %eax
movb %al, %dl
where the last instruction is an 8-bit move, from an 8-bit source register to an 8-bit destination register.
In inline assembly, this would be written as:
__asm__(
"movl %1, %%eax;"
"movb %%al, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
However, this is not the correct way to use inline assembly. You've manually hard-coded the EAX register inside of the inline assembly block, which means that you had to clobber it. The problem with this is that it ties the compiler's hands behind its back when it comes to register allocation. What you're supposed to do is put everything that goes into and out of the inline assembly block in the input and output operands. This lets the compiler handle all register allocation in the most optimal way possible. The code should look as follows:
char a;
int b = 48;
int temp;
__asm__(
"movl %2, %0\n\t"
"movb %b0, %1"
: "=r"(temp),
"=r"(a)
: "r" (b)
:
);
A lot of changes happened here:
I introduced another temporary variable (appropriately named temp) and added it to the output-only operands list. This causes the compiler to allocate a register for it automatically, which we then use inside of the asm block.
Now that we're letting the compiler do the register allocation, we don't need a clobber list, so that's left empty.
The b modifier is needed on the source operand for the movb instruction to ensure that the byte-sized portion of that register is used, rather than the entire 32-bit register.
Instead of using semicolons at the end of each asm instruction, I used \n\t (except on the last one). This is what is recommended for use in inline assembly blocks, and it gets you nicer assembly output listings because it matches what the compiler does internally.
Even better would be to introduce symbolic names for the operands, making the code more readable:
char a;
int b = 48;
int temp;
__asm__(
"movl %[input], %[temp]\n\t"
"movb %b[temp], %[dest]"
: [temp] "=r"(temp),
[dest] "=r"(a)
: [input] "r" (b)
:
);
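For reference, the named-operand version can be wrapped in a function roughly as follows. This is a sketch under assumptions: the name low_byte_via_asm is hypothetical, and I have swapped "=r" for "=q" so that the %b modifier always has a byte-addressable register to name when building with -m32. Other targets get a plain cast.

```c
/* Hypothetical wrapper: returns the low 8 bits of b. */
char low_byte_via_asm(int b)
{
#if defined(__i386__) || defined(__x86_64__)
    char a;
    int temp;
    __asm__(
        "movl %[input], %[temp]\n\t"   /* 32-bit copy of b into a scratch register */
        "movb %b[temp], %[dest]"       /* 8-bit copy of its low byte into a */
        : [temp] "=q"(temp),           /* assumption: "q" instead of "r", see lead-in */
          [dest] "=q"(a)
        : [input] "r"(b));
    return a;
#else
    return (char)b;                    /* assumption: plain C fallback elsewhere */
#endif
}
```

Calling low_byte_via_asm(48) and passing the result to printf("%c\n", ...) prints the character 0, which is what the question was after.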
And, at this point, if you hadn't noticed already, you'd see that this code is enormously silly. You don't need all those temporaries and register-register shuffling. You can just do:
movl $48, %eax
and the value 48 is already in al, since al is the low 8 bits of the 32-bit register eax.
Or, you can do:
movb $48, %al
which is just an 8-bit move of the value 48 explicitly into the 8-bit register al.
But, in fact, if you're calling printf, the argument must be passed as an int (not a char, since it's a variadic function), so you definitely want:
movl $48, %eax
When you start using inline assembly, the compiler can't easily optimize through it, so you get inefficient code. All you really needed was:
int a = 48;
printf("%c\n",a);
Which produces the following assembly code:
pushl $48
pushl $AddressOfFormatString
call printf
addl $8, %esp
or, equivalently:
movl $48, %eax
pushl %eax
pushl $AddressOfFormatString
call printf
addl $8, %esp
Now, I imagine you're saying to yourself something like: "Yes, but if I do that, then I'm not using inline assembly!" To which my response is: exactly. You don't need inline assembly here, and in fact, you should not be using it, because it just causes problems. It's more difficult to write and leads to inefficient code generation.
If you want to learn assembly language programming, get an assembler and use that, not a C compiler's inline assembler. NASM is a popular and excellent choice, as is YASM. If you want to stick with the GNU assembler so you can keep using this tortuous AT&T syntax, then run as directly.

Since a is defined as a character (char a;), :"=r"(a) will assign an 8-bit register. The 32-bit register EAX cannot be loaded from an 8-bit register with movl: movl %dl, %eax (movl %0, %%eax) will cause this error. The sign-extend and zero-extend instructions exist for this purpose: movsx and movzx in Intel syntax, movs... and movz... in AT&T syntax.
Change
movl %0, %%eax;
to
movzbl %0, %%eax;

Related

How get EIP from x86 inline assembly by gcc

I want to get the value of EIP from the following code, but the compilation does not pass
Command :
gcc -o xxx x86_inline_asm.c -m32 && ./xxx
File content of x86_inline_asm.c:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned int eip_val;
__asm__("mov %0,%%eip":"=r"(eip_val));
return 0;
}
How to use the inline assembly to get the value of EIP, and it can be compiled successfully under x86.
How to modify the code and use the command to complete it?
This sounds unlikely to be useful (vs. just taking the address of the whole function like void *tmp = main), but it is possible.
Just get a label address, or use . (the address of the current line), and let the linker worry about getting the right immediate into the machine code. So you're not architecturally reading EIP, just reading the value it currently has from an immediate.
asm volatile("mov $., %0" : "=r"(address_of_mov_instruction) );
AT&T syntax is mov src, dst, so what you wrote would be a jump if it assembled.
Architecturally, EIP holds the address of the end of an instruction while it's executing, so arguably you should do:
asm volatile(
"mov $1f, %0 \n\t" // reference label 1 forward
"1:" // GAS local label
: "=r"(address_after_mov)
);
I'm using asm volatile in case this asm statement gets duplicated multiple times inside the same function by inlining or something. If you want each case to get a different address, it has to be volatile. Otherwise the compiler can assume that all instances of this asm statement produce the same output. Normally that will be fine.
In 32-bit mode you don't have RIP-relative addressing for LEA, so the only good way to actually read EIP architecturally is call / pop (see "Reading program counter directly"). EIP is not a general-purpose register, so you can't just use it as the source or destination of a mov or any other instruction.
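A minimal sketch of reading the program counter at run time, assuming GCC or clang. The function name read_pc is hypothetical. The 32-bit branch uses the call / pop pair; the 64-bit branch uses a RIP-relative LEA instead, because pushing from inline asm in 64-bit code would overwrite the red zone. Other targets fall back to the GNU C label-address extension.

```c
#include <stddef.h>

/* Hypothetical helper; noinline so every call reports the same code address. */
__attribute__((noinline)) void *read_pc(void)
{
    void *pc;
#if defined(__x86_64__)
    /* 64-bit mode: address of local label 1, computed relative to RIP. */
    __asm__ volatile("lea 1f(%%rip), %0\n"
                     "1:"
                     : "=r"(pc));
#elif defined(__i386__)
    /* 32-bit mode: call pushes the address of label 1, pop retrieves it. */
    __asm__ volatile("call 1f\n"
                     "1:\tpop %0"
                     : "=r"(pc));
#else
    /* Assumption: other targets use the GNU C &&label extension instead. */
here:
    pc = &&here;
#endif
    return pc;
}
```

The returned pointer is only useful for printing or comparing; jumping to it from outside the function is not safe.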
But really you don't need inline asm for this at all.
The question "Is it possible to store the address of a label in a variable and use goto to jump to it?" shows how to use the GNU C extension where &&label takes its address.
int foo;
void *addr_inside_function() {
foo++;
lab1: ; // labels only go on statements, not declarations
void *tmp = &&lab1;
foo++;
return tmp;
}
There's nothing you can safely do with this address outside the function; I returned it just as an example to make the compiler put a label in the asm and see what happens. Without a goto to that label, it can still optimize the function pretty aggressively, but you might find it useful as an input for an asm goto(...) somewhere else in the function.
But anyway, it compiles on Godbolt to this asm
# gcc -O3 -m32
addr_inside_function:
.L2:
addl $2, foo
movl $.L2, %eax
ret
#clang -O3 -m32
addr_inside_function:
movl foo, %eax
leal 1(%eax), %ecx
movl %ecx, foo
.Ltmp0: # Block address taken
addl $2, %eax
movl %eax, foo
movl $.Ltmp0, %eax # retval = label address
retl
So clang loads the global once, computes foo+1 and stores it, then after the label computes foo+2 and stores that (instead of loading twice). So you still can't usefully jump to the label from anywhere, because it depends on having foo's old value in eax, and on the desired behaviour being to store foo+2.
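For contrast, the usual safe use of &&label is a computed goto that stays inside the same function. The following is a sketch with a hypothetical function name and made-up return values; it relies on the GNU C labels-as-values extension, so it needs GCC or clang.

```c
/* Hypothetical example: pick a branch through a table of label addresses. */
int label_dispatch(int idx)
{
    /* &&label yields the address of a label in this function as a void *. */
    static void *const targets[] = { &&case_zero, &&case_one, &&case_other };

    if (idx < 0 || idx > 1)
        idx = 2;              /* clamp everything else to the last entry */
    goto *targets[idx];       /* computed goto: jump to the stored address */

case_zero:
    return 100;
case_one:
    return 200;
case_other:
    return -1;
}
```

Because the goto and its targets live in one function, the compiler knows every possible jump destination and keeps the code at each label valid as an entry point.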
I don't know gcc inline assembly syntax for this, but for masm:
call next0
next0: pop eax ;eax = eip for this line
In the case of Masm, $ represents the current location, and since call is a 5 byte instruction, an alternative syntax without a label would be:
call $+5
pop eax

Is it necessary to initialize all the used registers in inline assembly?

I am testing simple inline assembly code using gcc. And I find the result of the following code unexpected:
#include <stdio.h>
int main(void) {
unsigned x0 = 0, x1 = 1, x2 = 2;
__asm__ volatile("movl %1, %0;\n\t"
"movl %2, %1"
:"=r"(x0), "+r"(x1)
:"r"(x2)
:);
printf("%u, %u\n", x0, x1);
return 0;
}
The printed result is 1, 1, rather than the expected 1, 2. Then I compiled the code with -S option and found out gcc generated the code as
movl %eax, %edx;
movl %edx, %eax;
%0 and %2 are using the same register, why?
I want gcc to generate, say,
movl %eax, %edx;
movl %ecx, %eax;
If I add "0"(x1) to the input constraints, gcc will generate the code above. Does it mean that all registers need to be initialized before being used in inline assembly?
Moving my comment to an 'Answer' so this question can be closed.
To prevent the compiler from re-using a register for both an input and an output, you can use the early clobber constraint (for example "=&r"(x)), which informs the compiler that the register associated with the parameter is written before the instruction is finished using the input operands.
While sharing a register this way can be a good thing (since it reduces the number of registers that must be made available before calling your asm), it can also cause problems (as you have seen). So, either make sure you have finished using all the inputs before writing to the output, or use & to tell the compiler not to do this optimization.
For completeness, let me also point out that using inline asm is usually a bad idea.
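Applied to the code from the question, the early clobber fix might look like the sketch below. The wrapper name copy_then_replace and its pointer parameters are hypothetical, added only so the sequence can be called and checked; non-x86 targets get the equivalent plain C.

```c
/* Hypothetical wrapper: *x0 receives the old *x1, then *x1 receives x2. */
void copy_then_replace(unsigned *x0, unsigned *x1, unsigned x2)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned out0;
    unsigned inout1 = *x1;
    __asm__ volatile(
        "movl %1, %0\n\t"   /* out0 = inout1; x2 has not been read yet */
        "movl %2, %1"       /* inout1 = x2 */
        : "=&r"(out0),      /* & keeps out0 out of the register holding x2 */
          "+r"(inout1)
        : "r"(x2));
    *x0 = out0;
    *x1 = inout1;
#else
    *x0 = *x1;              /* assumption: plain C fallback elsewhere */
    *x1 = x2;
#endif
}
```

With x0 = 0, x1 = 1 and x2 = 2 as in the question, this leaves x0 == 1 and x1 == 2, the result the asker expected.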

Unexpected GCC inline ASM behaviour (clobbered variable overwritten)

On my computer, the compiled executable appears to skip "mov %2, %%ax" at the top of the loop when "add %1, %%ax" is uncommented.
Can anyone double-check or comment?
#include <stdio.h>
int main() {
short unsigned result, low ,high;
low = 0;
high = 1;
__asm__ (
"movl $10, %%ecx \n\t"
"loop: mov %2, %%ax \n\t"
// "add %1, %%ax \n\t" // uncomment and result = 10
"mov %%ax, %0 \n\t"
"subl $1, %%ecx \n\t"
"jnz loop"
: "=r" (result)
: "r" (low) , "r" (high)
: "%ecx" ,"%eax" );
printf("%d\n", result);
return 0;
}
The generated assembly follows:
movl $1, %esi
xorl %edx, %edx
/APP
movl $10 ,%ecx
loop: mov %si, %ax
mov %dx, %bx
add %bx, %ax
mov %ax, %dx
subl $1, %ecx
jnz loop
/NO_APP
Thanks to Jester, the solution is:
: "=&r" (result) // early clobber modifier
GCC inline assembly is advanced programming, with a lot of pitfalls. Make sure you actually need it and can't replace it with a standalone assembly module, or with C code using intrinsics or vector support.
If you insist on inline assembly, you should be prepared to at least look at the generated assembly code and try to figure out any mistakes from there. Obviously the compiler does not omit anything that you write into the asm block, it just substitutes the arguments. If you look at the generated code, you might see something like this:
add %dx, %ax
mov %ax, %dx
Apparently the compiler picked dx for both argument 0 and 1. It is allowed to do that, because by default it assumes that the input arguments are consumed before any outputs are written. To signal that this is not the case, you must use an early clobber modifier for your output operand, so it would look like "=&r".
PS: Even when inline assembly seems to work, it may have hidden problems that will bite you another day, when the compiler happens to make other choices. You should really avoid it.
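Putting the fix together, the loop from the question could be written roughly as below. This is a sketch, not the asker's final code: the function name add_in_loop is hypothetical, and I have replaced the named label loop with a GAS local label (1: / 1b) so the asm can be emitted more than once without a duplicate-symbol error.

```c
/* Hypothetical wrapper: computes low + high ten times over, keeping the last sum. */
unsigned short add_in_loop(unsigned short low, unsigned short high)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned short result;
    __asm__(
        "movl $10, %%ecx\n\t"
        "1:\n\t"
        "movw %[high], %%ax\n\t"
        "addw %[low], %%ax\n\t"
        "movw %%ax, %[result]\n\t"     /* written while low and high are still needed */
        "subl $1, %%ecx\n\t"
        "jnz 1b"
        : [result] "=&r"(result)       /* early clobber: never shares a register with an input */
        : [low] "r"(low), [high] "r"(high)
        : "eax", "ecx", "cc");
    return result;
#else
    return (unsigned short)(low + high);   /* assumption: plain C fallback elsewhere */
#endif
}
```

Without the & the compiler may again place result in the same register as low, and the second loop iteration would add a stale sum instead of the original input.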

How to move a 64bit pointer into the RAX register?

I have the following code in a GNU C program:
void *segment = malloc(1024);
asm volatile("mov $%0, %%rax" : : "r" (segment));
And I get the following error:
Error: illegal immediate register operand %rax
What is wrong with %rax?
While FrankH's points are valid, strictly speaking the cause of this error is the dollar sign. Dollar signs in AT&T assembler syntax denote constants (immediates). So "mov $1, %%eax" would work. However, your code generates:
mov $%rax, %rax
$%rax is meaningless and generates an error. This will resolve the error:
void *segment = malloc(1024);
asm volatile("mov %0, %%rax" : : "r" (segment));
Since malloc will return its value in rax, this will (most likely) generate "mov %rax, %rax".
In other words, it will still be meaningless, unsafe and inefficient, but it will compile without error.
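If the goal really is to have a pointer in RAX when some asm runs, one option is to let the compiler load it through the "a" register constraint instead of writing the mov by hand. A minimal sketch, with a hypothetical function name, that simply reads RAX back to show the value arrived; outside x86-64 it just returns its argument.

```c
#include <stddef.h>

/* Hypothetical helper: routes p through rax and returns what the asm saw there. */
void *via_rax(void *p)
{
#if defined(__x86_64__)
    void *seen;
    /* "a"(p) asks the compiler to have p in rax on entry to the asm. */
    __asm__ volatile("mov %%rax, %0" : "=r"(seen) : "a"(p));
    return seen;
#else
    return p;   /* assumption: no rax on this target, nothing to demonstrate */
#endif
}
```

Since the compiler picks the register itself, no clobber entry for rax is needed and the value cannot be overwritten between the C code and the asm.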
Assuming this code is intended to be more than an experiment to teach you something about using asm, you will need to provide more details to get a more useful answer.

Explanation of Asm code

The following GCC inline asm is taken from LuaJit's coco library. Can someone provide a line by line explanation of what it does?
static inline void coco_switch(coco_ctx from, coco_ctx to)
{
__asm__ __volatile__ (
"movl $1f, (%0)\n\t"
"movl %%esp, 4(%0)\n\t"
"movl %%ebp, 8(%0)\n\t"
"movl 8(%1), %%ebp\n\t"
"movl 4(%1), %%esp\n\t"
"jmp *(%1)\n" "1:\n"
: "+S" (from), "+D" (to) : : "eax", "ebx", "ecx", "edx", "memory", "cc");
}
Thanks
My ASM is a bit fuzzy about the details, but I think I can give you a general idea.
ESP: Stack pointer, EBP: Base pointer.
movl $1f, (%0)
Store the address of label 1 (defined on the last line; 1f means the next label 1 going forward) into the memory that parameter 0 (from) points to.
movl %%esp, 4(%0)
Move the content of register ESP into (from + 4).
movl %%ebp, 8(%0)
Move the content of register EBP into (from + 8).
movl 8(%1), %%ebp
Move the content of (to + 8) into register EBP.
movl 4(%1), %%esp
Move the content of (to + 4) into register ESP.
jmp *(%1)
Jump to address contained in (to).
The "1:" is a jump label.
In the constraints, "S" places from in the ESI register and "D" places to in the EDI register; the "+" modifier marks each operand as both read and written. The list at the end of the statement is the "clobber" list, naming registers (plus "memory" and the condition codes, "cc") possibly modified by the ASM code, so the compiler can take steps to maintain consistency (i.e., not relying on e.g. ECX still containing the same value as before).
I guess that coco_ctx means "coco context". So: The function saves the current stack frame in the "from" structure, and sets the stack frame to what's saved in the "to" structure. Basically, it jumps from the current function into another function.
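The offsets 0, 4 and 8 used by the asm suggest a three-slot context. The sketch below models only that data movement in plain C. Everything in it is hypothetical: the struct name, the field names and the model_switch function are mine, and the real coco_ctx definition is not shown in the question. It does not switch stacks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout implied by the offsets in the asm (32-bit slots). */
struct ctx32_sketch {
    uint32_t resume_addr;  /* offset 0: where execution continues */
    uint32_t saved_esp;    /* offset 4: stack pointer */
    uint32_t saved_ebp;    /* offset 8: frame pointer */
};

/* Models the six instructions; cpu stands in for the live EIP/ESP/EBP values. */
void model_switch(struct ctx32_sketch *from, const struct ctx32_sketch *to,
                  struct ctx32_sketch *cpu)
{
    *from = *cpu;   /* first three movl: save label 1, ESP and EBP into from */
    *cpu = *to;     /* last three: load EBP and ESP from to, then jmp *(to) */
}
```

When some later switch names this context as its target, the saved resume address sends execution to label 1, right after the jmp, which is how the original coco_switch call appears to return.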
DevSolar has the right answer -- I'll just add that you can learn a little more about what EBP and ESP are for here.

Resources