Basic GCC inline assembly question - gcc

I want to move the value of the variable "userstack" into the ESP register and then do an absolute jump to the memory address contained in the variable "location".
This is what I've got:
// These are the two variables that contain memory addresses
uint32_t location = current_running->LOCATION;
uint32_t userstack = current_running->user_stack;
// And then something like this
__asm__ volatile ("movl userstack, %esp");
__asm__ volatile ("ljmp $0x0000, location");
However when I try to compile I get the errors:
"Error: suffix or operands invalid for ljmp" and "undefined reference to `userstack'".
Any help would be very much appreciated.

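Take a look at the manual.
I think you'd need something like this:
asm volatile ("movl %0, %%esp" : : "g" (userstack));
asm volatile ("ljmp $0x0000, %0" : : "g" (location));
Basically, GCC needs to know what and where userstack and location may be (registers, memory operands, floating, restricted subset of registers, etc.), and that is specified by "g", which here means a general operand. Inputs go after the second colon, and once a statement has operands, literal register names need a doubled percent sign (%%esp). Note that ljmp with an immediate segment also expects an immediate offset, so the second statement may still be rejected when %0 expands to a register or memory operand; an indirect jump such as jmp *%0 with an "r" operand is the usual way to jump to an address held in a variable.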
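As a smaller illustration of the operand syntax described above, here is a minimal sketch that reads the stack pointer into a C variable instead of overwriting it. It does not perform the stack switch or the jump from the question. The function name read_stack_pointer is hypothetical, and the non-x86 fallback is only an assumption added so the file builds on other targets.

```c
#include <stdint.h>

/* Hypothetical helper: returns the current stack pointer value. */
uintptr_t read_stack_pointer(void)
{
    uintptr_t sp;
#if defined(__x86_64__)
    /* Outputs follow the first colon. %%rsp needs two percent signs
       because the statement has operands. */
    __asm__ volatile ("movq %%rsp, %0" : "=r" (sp));
#elif defined(__i386__)
    __asm__ volatile ("movl %%esp, %0" : "=r" (sp));
#else
    /* Assumption: on other targets, approximate with a local's address. */
    volatile char marker = 0;
    sp = (uintptr_t) &marker;
#endif
    return sp;
}
```

The same layout (template, then outputs, then inputs, then clobbers, each separated by a colon) applies when the operand is an input such as userstack.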

Related

Confusion about different clobber description for arm inline assembly

I'm learning ARM inline assembly, and am confused about a very simple function: assign the value of x to y (both are int type). Why are different clobber descriptions required on arm32 and arm64?
Here is the code:
#include <arm_neon.h>
#include <stdio.h>
void asm_test()
{
int x = 10;
int y = 0;
#ifdef __aarch64__
asm volatile(
"mov %w[in], %w[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 not working, but r1 or x1 works
);
#else
asm volatile(
"mov %[in], %[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 works, but r1 not working
);
#endif
printf("y is %d\n", y);
}
int main() {
asm_test();
return 0;
}
Tested on my rooted android phone, for arm32, r0 generates correct result but r1 won't. For arm64, r1 or x1 generate correct result, and r0 won't. Why on arm32 and arm64 they are different? What is the concrete rule for this and where can I find it?
ARM / AArch64 syntax is mov dst, src, so the operands in your template are reversed.
Your asm statement only works if the compiler happens to pick the same register for both "=r" output and "r" input (or something like that, given extra copies of x floating around).
Different clobbers simply perturb the compiler's register-allocation choices. Look at the generated asm (gcc -S or on https://godbolt.org/, especially with -fverbose-asm.)
Undefined Behaviour from getting the constraints mismatched with the instructions in the template string can still happen to work; never assume that an asm statement is correct just because it works with one set of compiler options and surrounding code.
BTW, x86 AT&T syntax does use mov src, dst, and many GNU C inline-asm examples / tutorials are written for that. Assembly language is specific to the ISA and the toolchain, but a lot of architectures have an instruction called mov. Seeing a mov does not mean this is an ARM example.
Also, you don't actually need a mov instruction to use inline asm to copy a value. Just tell the compiler you want the input to be in the same register it picks for the output, whatever that happens to be:
// not volatile: has no side effects and produces the same output if the input is the same; i.e. the output is a pure function of the input.
asm (""
: "=r"(output) // pick any register
: "0"(input) // pick the same register as operand 0
: // no clobbers
);
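Putting the two points above together, a minimal sketch of a corrected copy function might look like the following. The function name copy_via_asm is hypothetical. On ARM and AArch64 it uses mov with the destination first; on any other GCC target it falls back to the empty template with a matching constraint.

```c
/* Hypothetical helper: copies `input` to the result through an asm statement. */
int copy_via_asm(int input)
{
    int output;
#if defined(__aarch64__)
    /* Destination first: mov dst, src. %w selects the 32-bit W register. */
    __asm__ ("mov %w[out], %w[in]" : [out] "=r" (output) : [in] "r" (input));
#elif defined(__arm__)
    __asm__ ("mov %[out], %[in]" : [out] "=r" (output) : [in] "r" (input));
#else
    /* Empty template: "0" forces the input into the output's register. */
    __asm__ ("" : "=r" (output) : "0" (input));
#endif
    return output;
}
```

No clobber list is needed in either form, because the template touches only the registers the compiler chose for the operands.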

Inline assembly multiplication "undefined reference" on inputs

Trying to multiply 400 by 2 with inline assembly, using the fact imul implicitly multiplies by eax. However, I'm getting "undefined reference" errors to $1 and $2 when building:
int c;
int a = 400;
int b = 2;
__asm__(
".intel_syntax;"
"mov eax, $1;"
"mov ebx, $2;"
"imul %0, ebx;"
".att_syntax;"
: "=r"(c)
: "r" (a), "r" (b)
: "eax");
std::cout << c << std::endl;
Do not use fixed registers in inline asm, especially if you have not listed them as clobbers and have not made sure inputs or outputs don't overlap them. (This part is basically a duplicate of segmentation fault(core dumped) error while using inline assembly)
Do not switch syntax in inline assembly, as the compiler will substitute operands in the wrong syntax. Use -masm=intel if you want Intel syntax.
To reference arguments in an asm template string use % not $ prefix. There's nothing special about $1; it gets treated as a symbol name just like if you'd used my_extern_int_var. When linking, the linker doesn't find a definition for a $1 symbol.
Do not mov stuff around unnecessarily. Also remember that just because something seems to work in a certain environment, that doesn't guarantee it's correct and will work everywhere every time. Doubly so for inline asm. You have to be careful. Anyway, a fixed version could look like:
__asm__(
"imul %0, %1"
: "=r"(c)
: "r" (a), "0" (b)
: );
Has to be compiled using -masm=intel. Notice b has been put into the same register as c.
"using the fact imul implicitly multiplies by eax"
That's not true for the normal 2-operand form of imul. It works the same as other instructions, doing dst *= src so you can use any register, and not waste uops writing the high half anywhere if you don't even want it.
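For builds that keep GCC's default AT&T syntax (no -masm=intel), the same two-operand multiply can be sketched as follows. The function name mul_asm is hypothetical, and the plain C fallback for non-x86 targets is an assumption added only so the snippet compiles everywhere.

```c
/* Hypothetical helper: multiplies two ints with a two-operand imul. */
int mul_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    int c;
    /* AT&T operand order is "imull src, dst", so this computes %0 = %0 * %2.
       "0" places a in the same register as the output c. */
    __asm__ ("imull %2, %0"
             : "=r" (c)
             : "0" (a), "r" (b)
             : "cc");
    return c;
#else
    return a * b; /* fallback on targets without imul */
#endif
}
```

No fixed register appears in the template, so there is nothing to list as a register clobber; "cc" only documents that the flags are modified.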

How do memory operands work in avr-gcc inline assembly?

I'm trying to write a custom memory-copy function for AVR as inline assembly, because avr-gcc will always use a loop for memcpy and struct assignment, which is inefficient in terms of time. I want to use memory operands to avoid having to add a "memory" clobber. I currently have this:
void copy_2_bytes (char *restrict dst, char *restrict src)
{
struct S {
char x[2];
};
__asm__(
" ld __tmp_reg__,%[src]+\n"
" st %[dst]+,__tmp_reg__\n"
" ld __tmp_reg__,%[src]+\n"
" st %[dst]+,__tmp_reg__\n"
: [dst] "=m" ( *(struct S *)dst )
: [src] "m" ( *(struct S *)src )
);
}
This compiles, but it's incorrect in general because it modifies the pointer register pairs corresponding to the memory operands. It's easy to see that gcc assumes that the registers stay unchanged, for example by adding "*dst = 0;" after the assembly.
On the other hand, the Y and Z registers support the "ldd" and "std" instructions, which also take an immediate offset, so they can be used to access multiple bytes without being modified. But then there doesn't seem to be a way to force gcc to not select the X register, which doesn't support that.
UPDATE
Actually, if gcc determines that the address of the memory operand is constant, it will pass the constant address into the assembly, instead of a register pair. So now, I have absolutely no idea how to deal with this. Are there some magic instructions or assembly macros which can deal with both pointer registers and constant addresses at the same time?

Loading SSE registers

I'm working on a homework project for an OS development class. One task is to save the context of the SSE registers upon interrupt. Now, saving and restoring the context is easy (fxsave/fxrstor). But I have a problem with testing. I want to put some sample data into one of the registers, but all I get is error interrupt 6. Here is the code:
// load some SSE registers
struct Vec4 {
int x, y, z, w;
} vec = { 0, 1, 2, 3 };
asm volatile ( "movl %0, %%eax"
: /* no output */
: "r"( &vec )
:
);
asm volatile ( "movups (%eax), %xmm0" );
I searched the internet for a solution. All I found is that it might have something to do with effective address space. But I don't know what that is.
You need to use a memory operand as a constraint in the inline assembly. This is much better than generating the address yourself (as you tried with the & operator) and loading it into a register, because the latter will not work if the address is RIP-relative or relocatable.
asm volatile ( "movups %0, %%xmm0"
: /* no output */
: "m"( vec )
:
);
And you need to use "%%" (two percent signs) before register names once the asm statement has operands.
Read more about gcc's constraints here: http://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints . The title is somewhat misleading, as this concept is far from simple :-)
I found out what the problem is. Execution of SSE instructions must be enabled by setting some flags in the CR0 and CR4 registers. More info here: http://wiki.osdev.org/SSE
You're making this way harder than it needs to be - just use the intrinsics in the *mmintrin.h headers, e.g.
#include <emmintrin.h>
__m128i vec = _mm_set_epi32(3, 2, 1, 0);
If you need to put this in a specific XMM register then use the above example as a starting point, then generate asm, e.g. using gcc -S and use the generated asm as a template for your own code.
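A minimal sketch of the memory-operand approach, written as a self-checking helper, is shown here. It assumes SSE is already enabled (as it is in ordinary user-space programs), the function name load_and_get_lane is hypothetical, and the non-SSE2 fallback is only there so the snippet builds on other targets.

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

struct Vec4 { int x, y, z, w; };

/* Hypothetical helper: loads *v into an XMM register, then returns lane i. */
int load_and_get_lane(const struct Vec4 *v, int i)
{
    int lanes[4];
#if defined(__SSE2__)
    __m128i reg;
    /* "m" lets the compiler form the address of *v;
       "=x" lets it choose which XMM register receives the data. */
    __asm__ ("movups %1, %0" : "=x" (reg) : "m" (*v));
    _mm_storeu_si128((__m128i *) lanes, reg);
#else
    /* Assumption: plain C fallback when SSE2 is unavailable. */
    lanes[0] = v->x; lanes[1] = v->y; lanes[2] = v->z; lanes[3] = v->w;
#endif
    return lanes[i & 3];
}
```

Because the destination is an "=x" output rather than a hard-coded %%xmm0, the compiler knows which register holds the loaded value and will not reuse it unexpectedly.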

Use both SSE2 intrinsics and gcc inline assembler

I have tried to mix SSE2 intrinsics and inline assembler in gcc. But if I specify a variable as xmm0/register as input then in some cases I get a compiler error. Example:
#include <emmintrin.h>
int main() {
__m128i test = _mm_setzero_si128();
asm ("pxor %%xmm0, %%xmm0" : : "xmm0" (test) : );
}
When compiled with gcc version 4.6.1 I get:
>gcc asm_xmm.c
asm_xmm.c: In function ‘main’:
asm_xmm.c:10:3: error: matching constraint references invalid operand number
asm_xmm.c:7:5: error: matching constraint references invalid operand number
The strange thing is that in some cases where I have other input variables/registers it suddenly works with xmm0 as input but not xmm1, etc. And in another case I was able to specify xmm0-xmm4 but not above. A little confused/frustrated about this :S
Thanks :)
You should let the compiler do the register assignment. Here's an example of pshufb (for gcc too old to have tmmintrin for SSSE3):
static inline __m128i __attribute__((always_inline))
_mm_shuffle_epi8(__m128i xmm, __m128i xmm_shuf)
{
__asm__("pshufb %1, %0" : "+x" (xmm) : "xm" (xmm_shuf));
return xmm;
}
Note the "x" qualifier on the arguments and simply %0 in the assembly itself, where the compiler will substitute in the register it selected.
Be careful to use the right modifiers. "+x" means xmm is both an input and an output operand. If you are sloppy with these modifiers (e.g. using "=x", meaning output only, when you needed "+x") you will run into cases where it sometimes works and sometimes doesn't.
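The same "+x" pattern works with plain SSE2 instructions, which makes it easier to try out than pshufb. The sketch below uses paddd; the function name add_via_paddd is hypothetical, and the plain C fallback is an assumption for targets without SSE2.

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: adds two ints using a four-lane paddd. */
int add_via_paddd(int a, int b)
{
#if defined(__SSE2__)
    __m128i acc = _mm_set1_epi32(a);
    __m128i rhs = _mm_set1_epi32(b);
    /* "+x": acc is read by paddd and then overwritten with the sum.
       "xm": rhs may live in an XMM register or in (aligned) memory. */
    __asm__ ("paddd %1, %0" : "+x" (acc) : "xm" (rhs));
    return _mm_cvtsi128_si32(acc); /* lowest 32-bit lane */
#else
    return a + b; /* fallback without SSE2 */
#endif
}
```

Changing "+x" to "=x" here would tell the compiler that the old value of acc is never read, so it would be free to skip initializing the register before the asm statement.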

Resources