Can't call fseek with inline assembly - gcc

#include "stdio.h"
void fseek(void *, int, int);
main () {
FILE* f = fopen("myfile", "rb");
asm("push 2");
asm("push 0");
asm("push f");
asm("call fseek");
asm("add esp, 12");
}
gcc -masm=intel call.c
call.c:(.text+0x2c): undefined reference to `f'
call.c:(.text+0x31): undefined reference to `fseek'
I have also tried AT&T syntax but got the same result.

You cannot write it like this: there is no guarantee that a symbol named f exists in the generated assembly. f is a local C variable that lives in a register or on the stack, not an assembler-level symbol.
The solution is to use GCC's extended asm syntax. For example, push f could be rewritten as:
asm volatile ("pushl %0"
: /* no output operands */
: "m" (f)
: /* no clobbered operands */);
As for the call to fseek, I believe your code should be fine (it works on my laptop). What platform are you on? Do you have glibc or a similar library providing the C standard library?
Also note that your declaration of fseek is wrong: the C standard declares it as int fseek(FILE *stream, long offset, int whence), so it returns an int and takes a long offset.
Just for your information, you may try this style of an indirect call:
asm volatile ("call *%0"
: /* no output operands */
: "r"(fseek)
: /* no clobbered operands */);
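To make the operand mechanism concrete, here is a minimal sketch (not the asker's program) that reads a C local through an "m" operand instead of naming it in the template. The helper name load_local is hypothetical, and the template assumes GCC's default AT&T dialect on x86; other targets fall back to plain C.

```c
/* Hypothetical helper: returns the value of a C local by reading it
   through an extended-asm memory operand instead of by symbol name. */
static long load_local(long value)
{
    long out;
#if defined(__x86_64__) || defined(__i386__)
    /* AT&T order is "mov src, dst": %1 is the memory slot holding
       value, %0 is whatever register the compiler picked for out. */
    __asm__ ("mov %1, %0"
             : "=r" (out)     /* output: any general-purpose register */
             : "m" (value));  /* input: the local, wherever it lives */
#else
    out = value;              /* assumption: no asm path on other targets */
#endif
    return out;
}
```

Calling load_local(42) should simply hand back 42; the point is only that the compiler, not the programmer, decides how the local is addressed.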

Related

Confusion about different clobber description for arm inline assembly

I'm learning ARM inline assembly and am confused by a very simple function that assigns the value of x to y (both are int). Why do arm32 and arm64 require different clobber descriptions?
Here is the code:
#include <arm_neon.h>
#include <stdio.h>
void asm_test()
{
int x = 10;
int y = 0;
#ifdef __aarch64__
asm volatile(
"mov %w[in], %w[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 not working, but r1 or x1 works
);
#else
asm volatile(
"mov %[in], %[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 works, but r1 not working
);
#endif
printf("y is %d\n", y);
}
int main() {
asm_test();
return 0;
}
Tested on my rooted Android phone: on arm32, r0 generates the correct result but r1 does not; on arm64, r1 or x1 generates the correct result but r0 does not. Why do arm32 and arm64 differ? What is the concrete rule for this, and where can I find it?
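ARM / AArch64 syntax is mov dst, src, so your template has the operands reversed: it writes the input register and never writes the output register.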
Your asm statement only works if the compiler happens to pick the same register for both "=r" output and "r" input (or something like that, given extra copies of x floating around).
Different clobbers simply perturb the compiler's register-allocation choices. Look at the generated asm (gcc -S or on https://godbolt.org/, especially with -fverbose-asm.)
Undefined Behaviour from getting the constraints mismatched with the instructions in the template string can still happen to work; never assume that an asm statement is correct just because it works with one set of compiler options and surrounding code.
BTW, x86 AT&T syntax does use mov src, dst, and many GNU C inline-asm examples / tutorials are written for that. Assembly language is specific to the ISA and the toolchain, but a lot of architectures have an instruction called mov. Seeing a mov does not mean this is an ARM example.
Also, you don't actually need a mov instruction to use inline asm to copy a value. Just tell the compiler you want the input to be in the same register it picks for the output, whatever that happens to be:
// not volatile: has no side effects and produces the same output if the input is the same; i.e. the output is a pure function of the input.
asm (""
: "=r"(output) // pick any register
: "0"(input) // pick the same register as operand 0
: // no clobbers
);
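For reference, a sketch of the copy with the operands in the right order for each assembler. The function name copy_int is hypothetical, the x86 branch assumes GCC's default AT&T dialect, and targets not listed fall back to the empty-template approach.

```c
/* Hypothetical helper: copies x to y through one mov instruction,
   with the operand order each assembler syntax expects. */
static int copy_int(int x)
{
    int y;
#if defined(__aarch64__)
    /* AArch64: destination first; %w selects the 32-bit register name */
    __asm__ ("mov %w[out], %w[in]" : [out] "=r" (y) : [in] "r" (x));
#elif defined(__arm__)
    /* 32-bit ARM: destination first */
    __asm__ ("mov %[out], %[in]" : [out] "=r" (y) : [in] "r" (x));
#elif defined(__x86_64__) || defined(__i386__)
    /* x86 AT&T syntax: source first */
    __asm__ ("movl %[in], %[out]" : [out] "=r" (y) : [in] "r" (x));
#else
    /* Any other target: no instruction, input and output share a register */
    __asm__ ("" : "=r" (y) : "0" (x));
#endif
    return y;
}
```

No clobber list is needed in any branch, because every register the instruction touches is already declared as an operand.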

How to suppress "warning: control reaches end of non-void function"

I have some PowerPC assembly code translated with a gcc cross compiler with this function:
uint32_t fill_cache(void)
{
__asm__ ("addi 3, 0, 0\n"); /* R3 = 0 */
/* More asm here modifying R3 and filling the cache lines. */
}
which, under the PowerPC EABI, returns the value computed in R3. When compiling I get
foo.c:105: warning: control reaches end of non-void function
Is there a way to teach gcc that a value is actually returned? Or is there a way to suppress the warning (without removing -Wall or adding -Wno-*)? I would like to very selectively suppress this warning for only this function in order to leave the general warning level as high as possible.
It is not an option to make this function return void since the value computed is required by the caller.
Solution 1: with diagnostic pragmas you can locally suppress certain diagnostic checks. The specific option (which also is implied by -Wall) that complains for no return in a non-void function is -Wreturn-type. So the specific code to suppress the warning is:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreturn-type"
/* Your code here */
#pragma GCC diagnostic pop
You can find out which option is causing the warning by compiling with -fdiagnostics-show-option. It will simply append the option to the warning message.
Solution 2: define a register variable and put it in the desired register. Refer to the variable in an inline assembler template, with the resulting code:
uint32_t fill_cache(void)
{
register uint32_t cacheVal __asm__ ("r3");
__asm__ __volatile__ ("addi %0, 0, 0" : "=r" (cacheVal));
/* More code here */
return cacheVal;
}
The volatile modifier is to ensure that the instruction is not removed or in some other way affected undesirably by the optimization strategy.
Solution 2 is preferred for at least two reasons:
1. Using the value of a non-void function that ends without a return statement is undefined behavior as far as the standard is concerned.
2. There's no risk of suppressing (new) diagnostic warnings that there was no intention to suppress in the first place.
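The function could also be declared naked. In that case the compiler generates no prologue or epilogue and assumes that the programmer preserves all necessary registers, puts the return value into the correct register(s), and emits the return instruction. The naked attribute is target-specific, so check that your GCC port supports it.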
uint32_t fill_cache(void) __attribute__((naked)); // Declaration
// attribute should be specified in declaration not in implementation
uint32_t fill_cache(void)
{
__asm__ ("addi 3, 0, 0\n"); /* R3 = 0 */
/* More asm here modifying R3 and filling the cache lines. */
}
A bit late but maybe someone will step in this as well :)
PS: To the best of my knowledge, __asm__ and __volatile__ are the spellings that keep working in strict ISO modes such as -std=c89, where the plain asm keyword is disabled. In the GNU dialects there is practically no difference between __asm__ and asm, and the underscore-less style asm volatile is common there.

gcc arm -- ensuring args are retained when inlining functions with inline asm statements

I have a series of functions that are ultimately implemented with an SVC call. For instance:
void func(int arg) {
asm volatile ("svc #123");
}
As you might imagine, the SVC operates on 'arg', which is presumably in a register. If I explicitly add a 'noinline' attribute to the definition, everything works as you'd expect.
But were the function inlined at a higher optimization level, the code that loads 'arg' into a register would be omitted, as there is apparently no reference to 'arg'.
I've tried adding a 'used' attribute to the declaration of 'arg' itself -- but gcc apparently yields a warning in this case.
I've also tried adding "dummy" asm statements such as
asm ("" : "=r"(arg));
But this didn't appear to work in general. (maybe i need to say volatile here as well???)
Anyway, it seems unfortunate to have an explicit function call for a routine whose body essentially consists of one asm statement.
A relevant recipe is in the GCC manual, in the section Assembler Instructions with C Expression Operands, which uses sysint in the same role as your svc instruction. The idea is to define a local register variable bound to a specific register, and then use extended asm syntax to add inputs and outputs to the inline assembly block.
I tried to compile the following code:
#include <stdint.h>
__attribute__((always_inline))
uint32_t func(uint32_t arg) {
register uint32_t r0 asm("r0") = arg;
register uint32_t result asm("r0");
asm volatile ("svc #123":"=r" (result) : "0" (r0));
return result;
}
uint32_t foo(void) {
return func(2);
}
This is the disassembly of the compiled (with -O2 flag) object file:
00000000 <func>:
0: ef00007b svc 0x0000007b
4: e12fff1e bx lr
00000008 <foo>:
8: e3a00002 mov r0, #2
c: ef00007b svc 0x0000007b
10: e12fff1e bx lr
func is expanded inline and the argument is put in r0 correctly. I believe volatile is necessary, because if you don't make use of the return value of the service call, then the compiler might assume that the assembly piece of code is not necessary.
You should use a single asm block: the compiler is free to treat two asm blocks independently unless told otherwise, so requirements placed on the second block have no effect on the first.
You are assuming registers will be in their right places because of the calling convention.
What about something like this? (didn't test)
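void func(int arg) {
    asm volatile (
        "mov r0, %[code]\n\t"
        "svc #123"
        :                      /* no outputs */
        : [code] "r" (arg)     /* the function argument, in any register */
        : "r0", "memory"       /* r0 is overwritten; the handler may touch memory */
    );
}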
For more information, see ARM GCC Inline Assembler Cookbook.

can't find a register in class 'CREG' while reloading 'asm' - memcpy inline asm

I am trying to get an early version of Linux to compile; you can download the source code from git://github.com/azru0512/linux-0.12.git. While compiling kernel/blk_drv/ramdisk.c, I got the error messages below:
ramdisk.c:36:10: error: can't find a register in class 'CREG' while reloading 'asm'
ramdisk.c:40:10: error: can't find a register in class 'CREG' while reloading 'asm'
ramdisk.c:36:10: error: 'asm' operand has impossible constraints
ramdisk.c:40:10: error: 'asm' operand has impossible constraints
The relevant code in ramdisk.c is:
if (CURRENT-> cmd == WRITE) {
(void) memcpy(addr,
CURRENT->buffer,
len);
} else if (CURRENT->cmd == READ) {
(void) memcpy(CURRENT->buffer,
addr,
len);
} else
panic("unknown ramdisk-command");
And memcpy is defined as:
extern inline void * memcpy(void * dest,const void * src, int n)
{
__asm__("cld\n\t"
"rep\n\t"
"movsb"
::"c" (n),"S" (src),"D" (dest)
:"cx","si","di");
return dest;
}
I guess it's a problem with the inline asm in memcpy (include/string.h), so I removed the clobber list from it, but without luck. Could you help me find out what's going wrong? Thanks!
GCC's syntax for this has changed / evolved a bit.
You must now specify each of the special target registers as an output operand:
...("...instructions..."
: "=c"(n), "=S"(src), "=D"(dest)
and then additionally as the same registers as source operands:
: "0"(n), "1"(src), "2"(dest)
and finally you need to clobber "memory" (I can't remember offhand if this affects condition codes, if so you would also need "cc"):
: "memory")
Next, because this instruction must not be moved or deleted, you need to mark it volatile (or __volatile__). Without it the instructions were deleted in my test case, most likely because GCC may drop a non-volatile asm whose outputs are all unused, and the callers here discard memcpy's return value.
Last, it's no longer a good idea to attempt to override memcpy because gcc "knows" how to implement the function. You can override gcc's knowledge with -fno-builtin.
This compiles (for me anyway, with a somewhat old gcc on an x86-64 machine):
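extern inline void * memcpy(void * dest,const void * src, int n)
{
    void * ret = dest; /* the asm advances dest, so keep the original pointer */
    __asm__ volatile("cld\n\t"
        "rep\n\tmovsb\n\t"
        : "=c" (n), "=S" (src), "=D" (dest)   /* all three registers are modified */
        : "0" (n), "1" (src), "2" (dest)      /* and each starts from the C value */
        : "memory", "cc");
    return ret;
}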
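A more compact way to express the same constraints, as a sketch only: the "+" modifier marks an operand as both read and written, which replaces each paired output and matching input. The name copy_bytes is hypothetical (chosen so it does not clash with the builtin memcpy), the count is a size_t so it fills the whole count register on x86-64, and non-x86 targets simply defer to the library memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte copy with rep movsb on x86, memcpy elsewhere. */
static void *copy_bytes(void *dest, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    void *d = dest;  /* working copy; the asm leaves it pointing past the end */
    __asm__ volatile ("cld\n\t"
                      "rep movsb"
                      : "+c" (n), "+S" (src), "+D" (d)  /* read and written */
                      :                                 /* no pure inputs */
                      : "memory", "cc");
#else
    memcpy(dest, src, n);  /* assumption: no asm path on other targets */
#endif
    return dest;           /* dest itself was never handed to the asm */
}
```

Because the asm only ever sees the working copy d, the function can return dest directly.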
This exact problem and its reasons are discussed in GCC's Bugzilla:
Bug 43998 - inline assembler: can't set clobbering for input register
GCC won't allow input and output registers to also be listed as clobbers.
If your asm modifies an input register, add a dummy output bound to the same register:
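unsigned int operation;
unsigned int dummy;
/* cpuid also overwrites ebx, ecx and edx, so they go in the clobber list;
   volatile keeps the asm even though dummy is never read */
asm volatile ("cpuid" : "=a" (dummy) : "0" (operation) : "ebx", "ecx", "edx");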
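As a fuller sketch of the same idea, every register that cpuid writes can be declared as an operand, so nothing has to appear in the clobber list at all. run_cpuid and the regs layout are hypothetical names, and the code assumes a reasonably recent GCC or Clang on x86; elsewhere it just reports that cpuid is unavailable.

```c
#include <stdint.h>

/* Hypothetical helper: on x86, runs cpuid and stores EAX, EBX, ECX, EDX
   in regs[0..3], returning 1. On other targets, zeroes regs and returns 0. */
static int run_cpuid(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t a = leaf, b, c = subleaf, d;
    /* eax and ecx are inputs that get overwritten, hence "+";
       ebx and edx are written only, hence "=". */
    __asm__ volatile ("cpuid"
                      : "+a" (a), "=b" (b), "+c" (c), "=d" (d));
    regs[0] = a; regs[1] = b; regs[2] = c; regs[3] = d;
    return 1;
#else
    (void) leaf; (void) subleaf;
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}
```

With leaf 0, regs[0] holds the highest supported basic leaf and regs[1], regs[3], regs[2] (in that order) hold the vendor string.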

Use both SSE2 intrinsics and gcc inline assembler

I have tried to mix SSE2 intrinsics and inline assembler in gcc. But if I specify a variable as xmm0/register as input then in some cases I get a compiler error. Example:
#include <emmintrin.h>
int main() {
__m128i test = _mm_setzero_si128();
asm ("pxor %%xmm0, %%xmm0" : : "xmm0" (test) : );
}
When compiled with gcc version 4.6.1 I get:
>gcc asm_xmm.c
asm_xmm.c: In function ‘main’:
asm_xmm.c:10:3: error: matching constraint references invalid operand number
asm_xmm.c:7:5: error: matching constraint references invalid operand number
The strange thing is that in some cases where I have other input variables/registers it suddenly works with xmm0 as input but not xmm1, etc. And in another case I was able to specify xmm0-xmm4 but not above. A little confused/frustrated about this :S
Thanks :)
You should let the compiler do the register assignment. Here's an example of pshufb (for gcc too old to have tmmintrin for SSSE3):
static inline __m128i __attribute__((always_inline))
_mm_shuffle_epi8(__m128i xmm, __m128i xmm_shuf)
{
__asm__("pshufb %1, %0" : "+x" (xmm) : "xm" (xmm_shuf));
return xmm;
}
Note the "x" constraint on the operands and the plain %0 in the assembly itself, where the compiler substitutes the register it selected.
Be careful to use the right modifiers. "+x" means xmm is both an input and an output parameter. If you are sloppy with these modifiers (eg using "=x" meaning output only when you needed "+x") you will run into cases where it sometimes works and sometimes doesn't.
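To illustrate the difference, here is a small sketch in the same style that XORs two 16-byte blocks with pxor. xor16 is a hypothetical helper; the asm path is used only when the compiler defines __SSE2__, and a plain C loop stands in otherwise.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: out = a XOR b over 16 bytes. */
static void xor16(uint8_t out[16], const uint8_t a[16], const uint8_t b[16])
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *) a);
    __m128i vb = _mm_loadu_si128((const __m128i *) b);
    /* "+x": va is read and then overwritten with the result.
       "xm": vb may sit in an XMM register or in (aligned) memory.
       The compiler chooses the registers; the template only says %0, %1. */
    __asm__ ("pxor %1, %0" : "+x" (va) : "xm" (vb));
    _mm_storeu_si128((__m128i *) out, va);
#else
    for (int i = 0; i < 16; i++)  /* assumption: scalar fallback without SSE2 */
        out[i] = a[i] ^ b[i];
#endif
}
```

Had the first operand been declared "=x", the compiler would be free to assume the old contents of va are not needed, and the result could be wrong depending on register allocation.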

Resources