memcpy on gcc code sourcery for ARM - gcc

I compile my code with ARM CodeSourcery (arm-none-eabi-gcc), the Lite Edition I think.
I define a struct variable inside a function, and do a memcpy like
typedef struct {
char src[6];
char dst[6];
uint16_t a;
uint16_t b;
uint32_t c;
uint16_t d;
} Info_t;
Info_t Info;
memcpy(Info.src, src, sizeof(Info.src));
memcpy(Info.dst, dst, sizeof(Info.dst));
The first memcpy goes through, but the second one causes an abort.
I heard that gcc optimizes memcpy and that this can result in a non-aligned struct access?
I tried aligning the struct variable to a word boundary, but it did not help.
Can anyone give more details on gcc's memcpy and this alignment issue?
Thanks!

As far as I understand, the memcpy() issue on ARM is related to the compiler substituting an optimized implementation.
"In many cases, when compiling calls to memcpy(), the ARM C compiler will generate calls to specialized, optimised, library functions instead. Since RVCT 2.1, these specialized functions are part of the ABI for the ARM architecture (AEABI), and include:
__aeabi_memcpy
This function is the same as ANSI C memcpy, except that the return value is void.
__aeabi_memcpy4
This function is the same as __aeabi_memcpy; but may assume the pointers are 4-byte aligned.
__aeabi_memcpy8
This function is the same as __aeabi_memcpy but may assume the pointers are 8-byte aligned."
Details can be found here : http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka3934.html
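To make the alignment point concrete, here is a minimal sketch built on the Info_t struct from the question. The helpers is_word_aligned and info_set_addrs are hypothetical names added for illustration. On a typical ABI, dst starts at byte offset 6, so it is not 4-byte aligned even when the struct variable itself is; aligning the variable cannot change that. Whether this is what triggers the abort in the asker's build is an assumption, not something the thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char src[6];
    char dst[6];
    uint16_t a;
    uint16_t b;
    uint32_t c;
    uint16_t d;
} Info_t;

/* Returns nonzero when p sits on a 4-byte boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % 4u) == 0;
}

/* Hypothetical helper: fills both address fields with plain memcpy calls.
 * Plain memcpy makes no alignment assumption; only a variant such as
 * __aeabi_memcpy4 may assume word-aligned pointers. */
static void info_set_addrs(Info_t *info, const char *src, const char *dst)
{
    assert(info != NULL && src != NULL && dst != NULL);
    memcpy(info->src, src, sizeof info->src);
    memcpy(info->dst, dst, sizeof info->dst);
}
```

Checking offsetof(Info_t, dst) and is_word_aligned(Info.dst) on the target is a quick way to see which fields a word-aligned copy routine must not be handed.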

Related

linker - use own stdlib implementation

I have a problem. A requirement for the project is that we cannot link our app with the standard library (so -nostdlib is on in gcc).
my_stdlib.c contains implementations of all the functions (my_memset, my_memcpy, ...), but the linker needs memcpy to copy structs such as
MyStruct s = my_struct;
and complains about "undefined reference to `memcpy'", which is of course correct.
Is it possible to remap memcpy to my_memcpy using a linker script, parameters passed to ld, or some other way, so that the linker uses our implementation to copy structs?
Probably -Wl,--wrap=memcpy could help, but I cannot rename my_memcpy to __wrap_memcpy.
At the GCC level, you can redirect the memcpy symbol to a different symbol using:
void *memcpy (void *, const void *, size_t) __asm__ ("my_memcpy");
This will apply to internally-generated memcpy calls, too. (With GCC. I think it does not change the internal call sites with Clang.)
Compile with -fno-builtin. This should avoid it.

Stack Base Memory Address

Is there a simple way to find the stack base pointer programmatically? I am coding for an STM32F4 microcontroller and compiling with arm-none-eabi-gcc compiler.
When I was using the Arm C compiler packaged with Keil uVision 5 I could use the ABI function __user_initial_stackheap() to retrieve the stack base, but that doesn't seem to work with gcc.
This depends on how the different memory sections are set up (typically in a linker script). For instance, the linker script for an STM32F4 may define the stack base as:
__stack = ORIGIN(RAM) + LENGTH(RAM);
Then the linker script variables can be accessed in C code with
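#include <stdint.h>
extern uint32_t __stack; /* defined by the linker script; only its address is meaningful */
void foo() {
/* take the symbol's address and convert it to an integer explicitly */
uintptr_t stack_base = (uintptr_t)&__stack;
}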

What is the correct jmp_buf size?

I have a static library compiled with GCC for the ARM Cortex-M3 processor. This library has a jmp_buf in its interface.
struct png_struct_def {
#ifdef PNG_SETJMP_SUPPORTED
jmp_buf jmpbuf;
#endif
png_error_ptr error_fn;
// a lot more members ...
};
typedef struct png_struct_def png_struct;
When I pass a png_struct address to the library function, it stores the value not in error_fn but in the last field of jmpbuf. Obviously the library was compiled with a different assumption about the size of jmp_buf.
Both compiler versions are arm-none-eabi-gcc. Why is the code incompatible, and what is the "correct" jmp_buf size? I can see in the disassembly that only the first half of the jmp_buf is used. Why does the size change between versions of GCC when it is too large anyway?
Edit:
The library is compiled with another library that I can't recompile because the source code is not available. This other library uses this interface. So I can't change the structure of the interface.
You may simply re-order the data declaration. I would suggest the following,
typedef struct iface { /* "if" is a C keyword and cannot be used as a tag */
int some_value;
union
{
jmp_buf jmpbuf;
char pad[511];
} __attribute__ ((__transparent_union__));
} *ifp;
The issue is that, depending on the ARM library, different registers may be saved. At a maximum, 16 32-bit general-purpose registers and 32 64-bit NEON registers might be saved, which gives around 320 bytes. If your struct is not instantiated many times, you can afford to over-allocate. This should work no matter which definition of jmp_buf you get.
If you cannot recompile the library, you may try to use
typedef struct iface { /* "if" is a C keyword and cannot be used as a tag */
char pad[LIB_JMPBUF_SZ];
int some_value;
} *ifp;
where LIB_JMPBUF_SZ is the jmp_buf size you calculated for the library. The libc may have changed the definition of jmp_buf between versions. Also, even though the compiler names match, one may support floating point while the other does not, and so on. Even if the versions match, it is conceivable that the compiler configuration gives different jmp_buf sizes.
Both suggestions are non-portable. The second will not work if your own code calls setjmp() or longjmp(); that is, I assume that the library uses these functions and the caller only allocates the space.
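One way to see which size a given toolchain uses is to compile a tiny check with each compiler and compare. The sketch below is only illustrative: the struct names and helper functions are hypothetical, and LIB_JMPBUF_SZ is set to the local sizeof(jmp_buf) as a stand-in. In the real case it would be the number recovered from the prebuilt library, for example from its disassembly.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Assumption for this sketch: stand-in for the jmp_buf size the prebuilt
 * library was compiled with. Replace with the value you determined. */
#define LIB_JMPBUF_SZ sizeof(jmp_buf)

/* Layout as the current toolchain sees it. */
struct iface_native {
    jmp_buf jmpbuf;
    int some_value;
};

/* Layout forced to the library's idea of the jmp_buf size. */
struct iface_padded {
    char pad[LIB_JMPBUF_SZ];
    int some_value;
};

/* Size of jmp_buf for the toolchain compiling this file. */
static size_t local_jmpbuf_size(void)
{
    return sizeof(jmp_buf);
}

/* Nonzero when the field after the buffer lands at the same offset
 * in both layouts, i.e. caller and library agree. */
static int layouts_agree(void)
{
    return offsetof(struct iface_native, some_value)
        == offsetof(struct iface_padded, some_value);
}

/* Hypothetical start-up check a caller could run once. */
static void check_jmpbuf_layout(void)
{
    assert(layouts_agree());
}
```

If layouts_agree() is false once LIB_JMPBUF_SZ holds the library's value, the two sides disagree about where the fields after the jmp_buf live, which matches the symptom described in the question.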

Equivalent for GCC's naked attribute

I've got an application written in pure C, mixed with some functions that contain pure ASM. Naked attribute isn't available for x86 (why? why?!) and my asm functions don't like when prologue and epilogue is messing with the stack. Is it somehow possible to create a pure assembler function that can be referenced from C code parts? I simply need the address of such ASM function.
Just use asm() outside a function block. The argument of asm() is simply ignored by the compiler and passed directly on to the assembler. For complex functions a separate assembly source file is the better option to avoid the awkward syntax.
Example:
#include <stdio.h>
asm("_one: \n\
movl $1,%eax \n\
ret \n\
");
int one();
int main() {
printf("result: %d\n", one());
return 0;
}
PS: Make sure you understand the calling conventions of your platform. Often you cannot simply copy and paste assembly code.
PPS: If you care about performance, use extended asm instead. Extended asm essentially inlines the assembly code into your C/C++ code and is much faster, especially for short assembly functions. For larger assembly functions a separate assembly source file is preferable, so this answer is really a hack for the rare case that you need a function pointer to a small assembly function.
Good news, everyone: the GCC developers finally implemented __attribute__((naked)) for x86. The feature will be available in GCC 8.
Certainly, just create a .s file (assembly source), which is run through gas (the assembler) to create a normal object file.

Getting GCC to compile without inserting call to memcpy

I'm currently using GCC 4.5.3, compiled for PowerPC 440, and am compiling some code that doesn't require libc. I don't have any direct calls to memcpy(), but the compiler seems to be inserting one during the build.
There are linker options like -nostdlib, -nostartfiles, -nodefaultlibs but I'm unable to use them as I'm not doing the linking phase. I'm only compiling. With something like this:
$ powerpc-440-eabi-gcc -O2 -g -c -o output.o input.c
If I check the output.o with nm, I see a reference to memcpy:
$ powerpc-440-eabi-nm output.o | grep memcpy
U memcpy
$
The GCC man page makes it clear how to remove calls to memcpy and other libc calls with the linker, but I don't want the compiler to insert them in the first place, as I'm using a completely different linker (not GNU's ld, and it doesn't know about libc).
Thanks for any help you can provide.
There is no need for -fno-builtin or -ffreestanding, as they unnecessarily disable many important optimizations.
This is actually "optimized" by gcc's tree-loop-distribute-patterns, so to disable the unwanted behavior while keeping the useful builtin capabilities, you can just use:
-fno-tree-loop-distribute-patterns
Musl-libc uses this flag for its build and has the following note in their configure script (I looked through the source and didn't find any macros, so this should be enough)
# Check for options that may be needed to prevent the compiler from
# generating self-referential versions of memcpy,, memmove, memcmp,
# and memset. Really, we should add a check to determine if this
# option is sufficient, and if not, add a macro to cripple these
# functions with volatile...
#
tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
You can also add this as an attribute to individual functions in gcc using its optimize attribute, so that other functions can benefit from calling mem*()
__attribute__((optimize("no-tree-loop-distribute-patterns")))
size_t strlen(const char *s){ //without attribute, gcc compiles to jmp strlen
size_t i = -1ull;
do { ++i; } while (s[i]);
return i;
}
Alternatively, (at least for now) you may add a confounding null asm statement into your loop to thwart the pattern recognition.
size_t strlen(const char *s){
size_t i = -1ull;
do {
++i;
asm("");
} while (s[i]) ;
return i;
}
GCC emits calls to memcpy in some circumstances, for example when you copy a structure.
There is no way to change this GCC behaviour, but you can try to avoid it by modifying your code so that no such copy occurs. Your best bet is to look at the assembly to figure out why gcc emitted the memcpy and then work around it. This is going to be annoying, though, since you basically need to understand how gcc works.
Extract from http://gcc.gnu.org/onlinedocs/gcc/Standards.html:
Most of the compiler support routines used by GCC are present in libgcc, but there are a few exceptions. GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp. Finally, if __builtin_trap is used, and the target does not implement the trap pattern, then GCC will emit a call to abort.
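As an illustration of the struct-copy case, here is a minimal sketch with two ways of copying the same data. The type and function names are hypothetical. Whether the assignment actually becomes a memcpy call depends on the GCC version, target, struct size, and optimization level, so check the object file with nm. The volatile byte loop rests on the assumption that volatile accesses keep the compiler from folding the loop into a library call, which is the fallback the musl configure comment hints at; it is not a documented guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload, large enough that a compiler may prefer a call. */
struct blob {
    unsigned char bytes[256];
};

/* Plain struct assignment: GCC is free to lower this to a memcpy call. */
static void copy_by_assignment(struct blob *dst, const struct blob *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* Byte-by-byte copy through volatile pointers, so every load and store
 * must be performed individually (assumed to defeat idiom recognition). */
static void copy_by_volatile_bytes(struct blob *dst, const struct blob *src)
{
    volatile unsigned char *d = dst->bytes;
    const volatile unsigned char *s = src->bytes;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < sizeof dst->bytes; ++i)
        d[i] = s[i];
}

/* Compares two blobs without calling memcmp. Returns 1 when equal. */
static int blobs_equal(const struct blob *a, const struct blob *b)
{
    size_t i;

    for (i = 0; i < sizeof a->bytes; ++i)
        if (a->bytes[i] != b->bytes[i])
            return 0;
    return 1;
}
```

The volatile version is slower than an optimized memcpy, so it only makes sense for the few copies that must not reference libc.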
You need to disable that optimization with -fno-builtin. I had this problem once when trying to compile memcpy for a C library. It called itself. Oops!
You can also make your binary a "freestanding" one:
The ISO C standard defines (in clause 4) two classes of conforming implementation. A conforming hosted implementation supports the whole standard [...]; a conforming freestanding implementation is only required to provide certain library facilities: those in <float.h>, <limits.h>, <stdarg.h>, and <stddef.h>; since AMD1, also those in <iso646.h>; and in C99, also those in <stdbool.h> and <stdint.h>. [...].
The standard also defines two environments for programs, a freestanding environment, required of all implementations and which may not have library facilities beyond those required of freestanding implementations, where the handling of program startup and termination are implementation-defined, and a hosted environment, which is not required, in which all the library facilities are provided and startup is through a function int main (void) or int main (int, char *[]).
An OS kernel would be a freestanding environment; a program using the facilities of an operating system would normally be in a hosted implementation.
(paragraph added by me)
More here. And the corresponding gcc option/s (keywords -ffreestanding or -fno-builtin) can be found here.
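Code can tell which of the two environments it is being compiled for through the standard __STDC_HOSTED__ macro, which GCC sets to 0 under -ffreestanding. The following is a small sketch; the function names are hypothetical.

```c
#if __STDC_HOSTED__
#include <assert.h> /* not among the headers a freestanding implementation must provide */
#endif

/* Returns 1 in a hosted build and 0 in a freestanding one. */
static int is_hosted(void)
{
    return __STDC_HOSTED__;
}

/* Hypothetical self-check: only the hosted build may rely on assert(). */
static void check_environment(void)
{
#if __STDC_HOSTED__
    assert(is_hosted() == 1);
#endif
}
```

Guarding hosted-only headers this way lets the same source build both with and without -ffreestanding.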
This is quite an old question, but I've hit the same issue, and none of the solutions here worked.
So I defined this function:
static __attribute__((always_inline)) inline void* imemcpy (void *dest, const void *src, size_t len) {
char *d = dest;
const char *s = src;
while (len--)
*d++ = *s++;
return dest;
}
And then used it instead of memcpy. This has solved the inlining issue for me permanently. Not very useful if you are compiling some sort of library though.
