GCC: How to disable heap usage entirely on an MCU?

I have an application that runs on an ARM Cortex-M based MCU and is written in C and C++. I use gcc and g++ to compile it and would like to completely disable any heap usage.
In the MCU startup file the heap size is already set to 0. In addition to that, I would also like to disallow any accidental heap use in the code.
In other words, I would like the linker (and/or the compiler) to give me an error when the malloc, calloc, free functions or the new, new[], delete, delete[] operators are used.
So far I've tried -nostdlib which gives me issues like undefined reference to _start. I also tried -nodefaultlibs but that one still does not complain when I try to call malloc. What is the right way to do this?
Notes:
This app runs on “bare metal”, there is no operating system.
I would also like to avoid any malloc usage in 3rd-party code (vendor-specific libraries, the standard library, printf etc.).
I'm fully okay with not using the parts of the C / C++ standard libraries that would require dynamic memory allocations.
I'd prefer a compile-time rather than a run-time solution.

I'm not sure it's the best way to go, but you can use the --wrap option of ld (which can be passed through gcc using -Wl).
The idea is that --wrap asks ld to redirect references to a symbol to your custom one; for example, with --wrap=malloc, ld resolves every undefined reference to malloc to __wrap_malloc instead of the original malloc (which stays reachable as __real_malloc).
Now, if you pass --wrap=malloc without defining __wrap_malloc, you will get away with it as long as nobody uses malloc, but as soon as anything references malloc you'll get a link error.
$ cat test-nomalloc.c
#include <stdlib.h>
int main() {
#ifdef USE_MALLOC
malloc(10);
#endif
return 0;
}
$ gcc test-nomalloc.c -Wl,--wrap=malloc
$ gcc test-nomalloc.c -DUSE_MALLOC -Wl,--wrap=malloc
/tmp/ccIEUu9v.o: In function `main':
test-nomalloc.c:(.text+0xa): undefined reference to `__wrap_malloc'
collect2: error: ld returned 1 exit status
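For new you can use the mangled names _Znwm (operator new(unsigned long)) and _Znam (operator new[](unsigned long)), which should be what every new comes down to in the end. These names apply where size_t is unsigned long (e.g. x86-64); on a 32-bit ARM target size_t is unsigned int, so the corresponding names are _Znwj and _Znaj.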
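To illustrate why wrapping the mangled operator new symbol covers new-expressions, here is a minimal sketch that replaces the global allocation function and counts how often it runs. It is a run-time demonstration on a hosted system, not the link-time check itself; g_new_calls and count_single_new are hypothetical names introduced only for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, used only for this illustration.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function
// (mangled as _Znwm on x86-64, _Znwj on 32-bit ARM).
void* operator new(std::size_t size)
{
    ++g_new_calls;
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching deallocation functions (unsized and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times operator new ran while allocating one int.
std::size_t count_single_new()
{
    const std::size_t before = g_new_calls;
    int* volatile p = new int(42);  // volatile keeps the allocation from being elided
    delete p;
    return g_new_calls - before;
}
```

A new-expression ends up in this one function, which is why an undefined __wrap symbol for it turns any use of new into a link error. Array new-expressions reference the separate operator new[] symbol, so that one has to be wrapped as well.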

(posted as an answer because it won't fit in a comment)
If the OS you're running supports the use of LD_PRELOAD, this code should detect attempts to use the heap:
/* remove the LD_PRELOAD from the environment so it
doesn't kill any child process the app may spawn */
static void lib_init(void) __attribute__((constructor));
static void lib_init( void )
{
unsetenv( "LD_PRELOAD" );
}
void *malloc( size_t bytes )
{
kill( getpid(), SIGSEGV );
return( NULL );
}
void *calloc( size_t n, size_t bytes )
{
kill( getpid(), SIGSEGV );
return( NULL );
}
void *realloc( void *ptr, size_t bytes )
{
kill( getpid(), SIGSEGV );
return( NULL );
}
void *valloc( size_t bytes )
{
kill( getpid(), SIGSEGV );
return( NULL );
}
void *memalign( size_t alignment, size_t bytes )
{
kill( getpid(), SIGSEGV );
return( NULL );
}
int posix_memalign( void **ptr, size_t alignment, size_t bytes )
{
*ptr = NULL;
kill( getpid(), SIGSEGV );
return( -1 );
}
Assuming new is implemented using malloc() and delete is implemented using free(), that will catch all heap usage and give you a core file with a stack trace, assuming core files are enabled.
Add the proper headers, compile the file:
gcc [-m32|-m64] -shared -fPIC heapdetect.c -o heapdetect.so
Run your app:
LD_PRELOAD=/path/to/heapdetect.so /your/app/here args ...


How to keep unused function in firmware image use arm-none-eabi-gcc toolchain?

I am trying to create a firmware image that runs on an STM32F0xx MCU. It is similar to a flash algorithm: it provides function calls that control the STM32F0xx pins, but it is more complicated than a flash algorithm, so it uses the STM32 HAL library and the Mbed library.
The compiler and linker use the "-ffunction-sections" and "-fdata-sections" flags.
So I tried "__attribute__((used))" to keep the function in the firmware image, but it failed.
The arm-none-eabi-gcc toolchain version is 4.9.3.
My code looks like this:
extern "C" {
__attribute__((__used__)) void writeSPI(uint32_t value)
{
for (int i = 0; i < spiPinsNum; i++) {
spiPins[i] = (((value >> i) & 0x01) != 0) ? 1 : 0;
}
__ASM volatile ("movs r0, #0"); // set R0 to 0 show success
__ASM volatile ("bkpt #0"); // halt MCU
}
}
After a successful build, the writeSPI symbol is not in the image.
I also tried declaring the function static, the "-uXXXXX" flag, and creating a new section.
Question: How can I keep the writeSPI function code when using the "-ffunction-sections" and "-fdata-sections" flags?
One way to ensure a wanted function doesn't get garbage collected is to create a function pointer to it within a function that is itself used. You don't have to do anything with the function pointer, just initialize it (if the optimizer discards the unused pointer, declaring it volatile keeps the reference).
void(*dummy)(uint32_t)=&writeSPI;
An alternative would be to omit the -ffunction-sections flag from the compilation units that contain functions that should not be stripped, but that may involve significant restructuring of your code base.
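The function-pointer approach can be sketched as follows. This is a host-compilable illustration under stated assumptions: spiPins is a plain int array standing in for the pin objects of the question, keep_writeSPI_referenced is a hypothetical helper name, and the halt instructions are left out. Whether the linker then keeps the section still depends on how --gc-sections is used in the actual build.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the pin array in the question.
static const int spiPinsNum = 4;
static int spiPins[spiPinsNum];

extern "C" __attribute__((used)) void writeSPI(std::uint32_t value)
{
    // Drive each pin from the corresponding bit of value.
    for (int i = 0; i < spiPinsNum; i++) {
        spiPins[i] = ((value >> i) & 0x01u) ? 1 : 0;
    }
}

// Call this from code that is known to be linked in (for example main()).
// The volatile qualifier stops the compiler from discarding the otherwise
// unused pointer, so a reference to writeSPI stays in the object file.
void keep_writeSPI_referenced()
{
    void (*volatile keep)(std::uint32_t) = &writeSPI;
    (void)keep;
}
```

The pointer is never called; its only purpose is to leave a relocation against writeSPI in a section the linker already keeps.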

Does MSP430 GCC support newer C++ standards? (like 11, 14, 17)

I'm writing some code that would greatly benefit from the concise syntax of lambdas, which were introduced with C++ 11. Is this supported by the compiler?
How do I specify the compiler flags when compiling using Energia or embedXcode?
As of February 2018, up to C++14 is supported with some limitations:
http://processors.wiki.ti.com/index.php/C%2B%2B_Support_in_TI_Compilers
There isn't much about this topic on the TI site, or at least I don't know enough C++ to give you a detailed and precise response.
The implementation of the embedded ABI is described in this document, which is mainly derived from the Itanium C++ ABI. It explains nothing about the implementation of lambdas or the auto keyword (or perhaps I'm not able to derive this information from the documentation).
So I decided to test directly in Energia. Apparently the g++ version is 4.6.3, so it should support both.
And in fact (from a compilation point of view; I don't have my MSP here to test the code) it can compile something like:
// In template.hpp
#ifndef TEMPLATE_HPP_
#define TEMPLATE_HPP_
template<class T>
T func(T a) {
auto c = [&](int n) { return n + a; };
return c(0);
}
#endif /* TEMPLATE_HPP_ */
// in the sketch main
#include "template.hpp"
void setup() { int b = func<int>(0); }
void loop() { }
(The template works only if it is in a header; in the main sketch it raises an error.) To compile this sketch I had to modify one internal file of the editor. The maximum supported standard seems to be -std=c++0x, and the compilation flags are in the file:
$ENERGIA_ROOT/hardware/energia/msp430/platform.txt
In my setup the root is /opt/energia. Inside that file I modified line 32 (compiler.cpp.flags) and added the option. Notice that -std=c++11 is not supported (it raises an error).
compiler.cpp.flags=-std=c++0x -c -g -O2 {compiler.mlarge_flag} {compiler.warning_flags} -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD
Unfortunately I have zero experience with embedXcode :\
Mimic std::function
std::function is not provided, thus you have to write some sort of class that mimics it. Something like:
// callback.hpp
#ifndef CALLBACK_HPP_
#define CALLBACK_HPP_
template <class RET, class ARG>
class Callback {
RET (*_f)(ARG);
public:
Callback() : _f(0) { };
Callback(RET (*f)(ARG)) : _f(f) { };
bool is_set() const { return (_f) ? true : false; }
RET operator()(ARG a) const { return is_set() ? _f(a) : 0; }
};
#endif /* CALLBACK_HPP_ */
// sketch
#include "callback.hpp"
// | !! empty capture!
void setup() { // V
auto clb = Callback<int, char>([](char c) { return (int)c; });
if (clb.is_set())
auto b = clb('a');
}
void loop() {}
may do the job, and it uses a simple trick:
The closure type for a lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function having the same parameter and return types as the closure type’s function call operator. [C++11 standard 5.1.2]
As long as you leave the capture empty, you are guaranteed a conversion to a function pointer, so you can store it without issues. The code I have written:
requires a first template parameter RET, which is the return type
requires a second template parameter ARG, which is the single argument of the callback. In most cases you can use void* as a common argument type (cast a struct pointer to a void pointer, pass it as the argument, and cast it back inside the function; the operation costs nothing)
implements two constructors: the default constructor initializes the function pointer to NULL, while the second directly assigns the callback. No copy constructor is written explicitly; the compiler-generated one copies the function pointer, which is sufficient for this class.
implements a method to call the function (overloading operator()) and one to check whether the callback actually exists.
Again: this code compiles with no warnings, but I don't know if it works on the MSP430, since I cannot test it (it works on a common amd64 Linux system).
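The void* argument pattern mentioned in the list above can be sketched like this. The Callback class is repeated so the example is self-contained; Counter and make_increment_callback are hypothetical names used only for illustration, and the sketch has been reasoned about for a hosted g++, not tested on an MSP430.

```cpp
// Minimal copy of the Callback class from the answer, so the sketch stands alone.
template <class RET, class ARG>
class Callback {
    RET (*_f)(ARG);
public:
    Callback() : _f(0) {}
    Callback(RET (*f)(ARG)) : _f(f) {}
    bool is_set() const { return _f != 0; }
    RET operator()(ARG a) const { return is_set() ? _f(a) : RET(); }
};

// Hypothetical context type, used only for illustration.
struct Counter {
    int value;
};

// Builds a callback that increments the Counter passed through void*.
// The lambda has an empty capture, so it converts to a plain function pointer.
inline Callback<int, void*> make_increment_callback()
{
    return Callback<int, void*>([](void* ctx) -> int {
        Counter* c = static_cast<Counter*>(ctx);  // cast back to the real type
        return ++c->value;
    });
}
```

State that would otherwise be captured by the lambda travels through the void* argument instead, which keeps the closure convertible to a function pointer.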

Neither ld wrap nor LD_PRELOAD working to intercept system call

I am trying to use -Wl,-wrap=sendto -Wl,-wrap,sendto in my final g++ link command, which links my app, to replace the standard sendto function with my own.
I compile the following source code with:
gcc -c -o wrap.o wrap.c
and include the wrap.o in the final g++ command that links the app (the rest of the app is C++ hence the use of g++):
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
ssize_t __real_sendto(int, const void *, size_t, int, const struct sockaddr *, socklen_t);
ssize_t __wrap_sendto
(
int sockfd,
const void *buf,
size_t len,
int flags,
const struct sockaddr *dest_addr,
socklen_t addrlen
)
{
printf("my wrap sendto ...\n");
return __real_sendto(sockfd, buf, len, flags, dest_addr, addrlen);
}
When I use sendto in my own source code, the wrapper is in fact used, but all the 3rd-party shared objects I link against in my final g++ command that use sendto still use the system sendto, i.e. not my wrapper. How can I get my sendto wrapper used throughout?
I have also tried an LD_PRELOAD approach with a sendto and dlsym(RTLD_NEXT) inside but that did not work either.
How can I figure out why the 3rd party library keeps on using the libc sendto directly?
When I use ldd to find all shared object dependencies of my compiled app and then run objdump -T on each of them, grepping for sendto, I get UND (undefined) for all the 3rd-party shared objects. The shared objects that do define it, however, are:
/lib64/libpthread.so.0
000000000000ed80 w DF .text 0000000000000064 GLIBC_2.2.5 sendto
/lib64/libc.so.6
00000000000e98d0 w DF .text 0000000000000064 GLIBC_2.2.5 sendto
I see in glibc sendto.c on git the following:
weak_alias (__libc_sendto, sendto)
weak_alias (__libc_sendto, __sendto)
The --wrap sendto option does not define the sendto symbol in your binary. Instead, it replaces references to this symbol with __wrap_sendto and leaves sendto undefined.
In other words, your executable does not provide sendto, so the run-time symbol resolution picks the one from glibc.
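To fix this you need to define sendto in your executable. Try dlsym once again, but this time without the LD_PRELOAD/shim library (RTLD_NEXT requires _GNU_SOURCE to be defined before including <dlfcn.h>, and older glibc versions need -ldl at link time):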
ssize_t sendto
(
int sockfd,
const void *buf,
size_t len,
int flags,
const struct sockaddr *dest_addr,
socklen_t addrlen
)
{
ssize_t (*libc_sendto)(int, const void *, size_t, int, const struct sockaddr *, socklen_t)
= dlsym(RTLD_NEXT, "sendto");
printf("my wrap sendto ...\n");
return libc_sendto(sockfd, buf, len, flags, dest_addr, addrlen);
}
If third-party libraries keep finding the wrong sendto after this, then I see only one (not particularly likely) possibility: the shared libraries are linked with -Bsymbolic/-Bsymbolic-functions and provide their own sendto.
Also, since you've tagged this question as g++, make sure that your symbol names don't get mangled - use extern "C".
I have eventually managed to figure out what was going on here. Even though the strace states sendto is being called:
[pid 17956] sendto(4, "abc"..., 2052, 0, NULL, 0) = 2052
what was in fact happening was that send(...) was being called (this is probably possible because the last three parameters are 0, NULL, 0). The moment I made an interceptor for send(...) it worked.

How to find the address & length of a C++ function at runtime (MinGW)

As this is my first post to stackoverflow I want to thank you all for your valuable posts that helped me a lot in the past.
I use MinGW (gcc 4.4.0) on Windows-7(64) - more specifically I use Nokia Qt + MinGW but Qt is not involved in my Question.
I need to find the address and -more important- the length of specific functions of my application at runtime, in order to encode/decode these functions and implement a software protection system.
I already found a way to compute the length of a function, by assuming that static functions placed one after another in a source file are also placed sequentially in the compiled object file and subsequently in memory.
Unfortunately this is true only if the whole CPP file is compiled with option: "g++ -O0" (optimization level = 0).
If I compile it with "g++ -O2" (which is the default for my project) the compiler seems to relocate some of the functions and as a result the computed function length seems to be both incorrect and negative(!).
This is happening even if I put a "#pragma GCC optimize 0" line in the source file, which is supposed to be the equivalent of a "g++ -O0" command line option.
I suppose that "g++ -O2" instructs the compiler to perform some global file-level optimization (some function relocation?) which is not avoided by using the #pragma directive.
Do you have any idea how to prevent this, without having to compile the whole file with -O0 option?
OR: Do you know of any other method to find the length of a function at runtime?
I prepared a small example for you, with the results for different compilation options, to highlight the case.
The Source:
// ===================================================================
// test.cpp
//
// Intention: To find the addr and length of a function at runtime
// Problem: The application output is correct when compiled with: "g++ -O0"
// but it's erroneous when compiled with "g++ -O2"
// (although a directive "#pragma GCC optimize 0" is present)
// ===================================================================
#include <stdio.h>
#include <math.h>
#pragma GCC optimize 0
static int test_01(int p1)
{
putchar('a');
putchar('\n');
return 1;
}
static int test_02(int p1)
{
putchar('b');
putchar('b');
putchar('\n');
return 2;
}
static int test_03(int p1)
{
putchar('c');
putchar('\n');
return 3;
}
static int test_04(int p1)
{
putchar('d');
putchar('\n');
return 4;
}
// Print a HexDump of a specific address and length
void HexDump(void *startAddr, long len)
{
unsigned char *buf = (unsigned char *)startAddr;
printf("addr:%ld, len:%ld\n", (long )startAddr, len);
len = (long )fabs(len);
while (len)
{
printf("%02x.", *buf);
buf++;
len--;
}
printf("\n");
}
int main(int argc, char *argv[])
{
printf("======================\n");
long fun_len = (long )test_02 - (long )test_01;
HexDump((void *)test_01, fun_len);
printf("======================\n");
fun_len = (long )test_03 - (long )test_02;
HexDump((void *)test_02, fun_len);
printf("======================\n");
fun_len = (long )test_04 - (long )test_03;
HexDump((void *)test_03, fun_len);
printf("Test End\n");
getchar();
// Just a trick to block optimizer from eliminating test_xx() functions as unused
if (argc > 1)
{
test_01(1);
test_02(2);
test_03(3);
test_04(4);
}
}
The (correct) Output when compiled with "g++ -O0":
[note the 'c3' byte (= assembly 'ret') at the end of all functions]
======================
addr:4199344, len:37
55.89.e5.83.ec.18.c7.04.24.61.00.00.00.e8.4e.62.00.00.c7.04.24.0a.00.00.00.e8.42
.62.00.00.b8.01.00.00.00.c9.c3.
======================
addr:4199381, len:49
55.89.e5.83.ec.18.c7.04.24.62.00.00.00.e8.29.62.00.00.c7.04.24.62.00.00.00.e8.1d
.62.00.00.c7.04.24.0a.00.00.00.e8.11.62.00.00.b8.02.00.00.00.c9.c3.
======================
addr:4199430, len:37
55.89.e5.83.ec.18.c7.04.24.63.00.00.00.e8.f8.61.00.00.c7.04.24.0a.00.00.00.e8.ec
.61.00.00.b8.03.00.00.00.c9.c3.
Test End
The erroneous Output when compiled with "g++ -O2":
(a) function test_01 addr & len seem correct
(b) functions test_02, test_03 have negative lengths,
and fun. test_02 length is also incorrect.
======================
addr:4199416, len:36
83.ec.1c.c7.04.24.61.00.00.00.e8.c5.61.00.00.c7.04.24.0a.00.00.00.e8.b9.61.00.00
.b8.01.00.00.00.83.c4.1c.c3.
======================
addr:4199452, len:-72
83.ec.1c.c7.04.24.62.00.00.00.e8.a1.61.00.00.c7.04.24.62.00.00.00.e8.95.61.00.00
.c7.04.24.0a.00.00.00.e8.89.61.00.00.b8.02.00.00.00.83.c4.1c.c3.57.56.53.83.ec.2
0.8b.5c.24.34.8b.7c.24.30.89.5c.24.08.89.7c.24.04.c7.04.
======================
addr:4199380, len:-36
83.ec.1c.c7.04.24.63.00.00.00.e8.e9.61.00.00.c7.04.24.0a.00.00.00.e8.dd.61.00.00
.b8.03.00.00.00.83.c4.1c.c3.
Test End
This is happening even if I put a "#pragma GCC optimize 0" line in the source file, which is supposed to be the equivalent of a "g++ -O0" command line option.
I don't believe this is true: it is supposed to be the equivalent of attaching __attribute__((optimize(0))) to subsequently defined functions, which causes those functions to be compiled with a different optimisation level. But this does not affect what goes on at the top level, whereas the command line option does.
If you really must do horrible things that rely on top level ordering, try the -fno-toplevel-reorder option. And I suspect that it would be a good idea to add __attribute__((noinline)) to the functions in question as well.
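A minimal sketch of the two suggestions, assuming GCC: the functions are marked noinline, and a hypothetical helper entry_distance computes the signed byte distance between two entry points using pointer-sized integers. The distance only approximates a function length when the file is built with -fno-toplevel-reorder; the language itself gives no such guarantee, and alignment padding is included in the result.

```cpp
#include <cstdint>
#include <cstdio>

// noinline keeps each function as a separate body in the object file.
__attribute__((noinline)) int test_01(int)
{
    std::putchar('a');
    std::putchar('\n');
    return 1;
}

__attribute__((noinline)) int test_02(int)
{
    std::putchar('b');
    std::putchar('b');
    std::putchar('\n');
    return 2;
}

typedef int (*test_fn)(int);

// Signed byte distance from the entry point of first to the entry point of second.
std::intptr_t entry_distance(test_fn first, test_fn second)
{
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(first);
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(second);
    return static_cast<std::intptr_t>(b - a);
}
```

With g++ -O2 -fno-toplevel-reorder, entry_distance(test_01, test_02) would be expected to be positive, but that is an assumption about GCC's layout behaviour and should be checked against the generated map file.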

Change stack size for a C++ application in Linux during compilation with GNU compiler

In OSX during C++ program compilation with g++ I use
LD_FLAGS= -Wl,-stack_size,0x100000000
but in SUSE Linux I constantly get errors like:
x86_64-suse-linux/bin/ld: unrecognized option '--stack'
and similar.
I know that it is possible to use
ulimit -s unlimited
but this is not a good solution, since a single user is not always allowed to do that.
How can I increase the stack size in Linux with GCC for a single application?
You can set the stack size programmatically with setrlimit, e.g.
#include <stdio.h>
#include <sys/resource.h>
int main (int argc, char **argv)
{
const rlim_t kStackSize = 16 * 1024 * 1024; // min stack size = 16 MB
struct rlimit rl;
int result;
result = getrlimit(RLIMIT_STACK, &rl);
if (result == 0)
{
if (rl.rlim_cur < kStackSize)
{
rl.rlim_cur = kStackSize;
result = setrlimit(RLIMIT_STACK, &rl);
if (result != 0)
{
fprintf(stderr, "setrlimit returned result = %d\n", result);
}
}
}
// ...
return 0;
}
Note: even when using this method to increase stack size you should not declare large local variables in main() itself, since you may well get a stack overflow as soon as you enter main(), before the getrlimit/setrlimit code has had a chance to change the stack size. Any large local variables should therefore be defined only in functions which are subsequently called from main(), after the stack size has successfully been increased.
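The setrlimit approach can also be wrapped in a small helper that respects the hard limit, which an unprivileged process cannot raise. This is a sketch under the assumption of a POSIX system; ensure_stack_limit is a hypothetical name, not part of any library.

```cpp
#include <sys/resource.h>

// Raises the soft stack limit to at least wanted bytes.
// Returns true if the soft limit is at least wanted afterwards.
bool ensure_stack_limit(rlim_t wanted)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return false;                          // could not query the limit

    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return true;                           // already large enough

    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < wanted)
        return false;                          // request exceeds the hard limit

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl) == 0;  // raise the soft limit only
}
```

As in the note above, main() would call ensure_stack_limit first and only then call the function that owns the large local variables.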
Instead of stack_size, use --stack like so:
gcc -Wl,--stack,4194304 -o program program.c
This example should give you 4 MB of stack space. Works on MinGW's GCC, but as the manpage says, "This option is specific to the i386 PE targeted port of the linker" (i.e. only works for outputting Windows binaries). Seems like there isn't an option for ELF binaries.
This is an old topic, but none of the flags suggested here worked for me. However, I found that -Wl,-z,stack-size=4194304 (example for 4 MB) seems to work.
Consider using -fsplit-stack option https://gcc.gnu.org/wiki/SplitStacks
Change it with the ulimit bash builtin, or setrlimit(), or at login with PAM (pam_limits.so).
It's a settable user resource limit; see RLIMIT_STACK in setrlimit(2).
http://bytes.com/topic/c/answers/221976-enlarge-stack-size-gcc
