duplicate symbol of a function defined in a header file - c++11

Suppose I have a header file file_ops.hpp that looks something like this
#pragma once
bool systemIsLittleEndian() {
uint16_t x = 0x0011;
uint8_t *half_x = (uint8_t *) &x;
if (*half_x == 0x11)
return true;
else
return false;
}
I initially thought it had something to do with the implementation, but as it turns out, I'll get duplicate symbols with just
#pragma once
bool systemIsLittleEndian() { return true; }
If I make it inline, the linker errors go away. That's not something I want to rely on, since inline is a request not a guarantee.
What causes this behavior? I'm not dealing with a scenario where I'm returning some kind of singleton.
There are other methods that are marked as
bool MY_LIB_EXPORT someFunc();// implemented in `file_ops.cpp`
Are these related somehow (mixing exported functions and "plain old functions")? Clearly I can just move the implementation to file_ops.cpp; I'm just intrigued as to why this happens.

If I make it inline, the linker errors go away. That's not something I want to rely on, since inline is a request not a guarantee.
It's OK to inline the function.
Even if the call is not actually inlined in the object code, the language guarantees that it will not cause linker errors or undefined behavior, as long as the definition is identical in every translation unit.
If you #include the .hpp in hundreds of .cpp files, you may notice a bit of code bloat but the program is still correct.
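To make that guarantee concrete, here is a minimal sketch of the header with the function marked inline. The includes and the memcpy-based byte read are my additions rather than part of the original header; the logic is otherwise the same check as in the question.

```cpp
// Sketch of file_ops.hpp (keep the #pragma once from the original at the top).
#include <cstdint>
#include <cstring>

// 'inline' allows one identical definition per translation unit;
// the linker keeps a single copy instead of reporting duplicates.
inline bool systemIsLittleEndian() {
    const std::uint16_t x = 0x0011;
    std::uint8_t first_byte = 0;
    std::memcpy(&first_byte, &x, 1);  // read the lowest-addressed byte of x
    return first_byte == 0x11;
}
```

Every .cpp file that includes this header gets its own definition, and the one-definition rule permits that for inline functions regardless of whether the compiler actually inlines any call.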
What causes this behavior? I'm not dealing with a scenario where I'm returning some kind of singleton.
The #include mechanism is a convenience that saves you from manually repeating exactly the same code in multiple files. In the end, every translation unit that #includes other files gets the lines of code from the files it #includes.
If you #include file_ops.hpp in, let's say, file1.cpp and file2.cpp, it's as if you have:
file1.cpp:
bool systemIsLittleEndian() {
uint16_t x = 0x0011;
uint8_t *half_x = (uint8_t *) &x;
if (*half_x == 0x11)
return true;
else
return false;
}
file2.cpp:
bool systemIsLittleEndian() {
uint16_t x = 0x0011;
uint8_t *half_x = (uint8_t *) &x;
if (*half_x == 0x11)
return true;
else
return false;
}
When you compile those two .cpp files and link them together to create an executable, the linker notices that there are two definitions of the function named systemIsLittleEndian. That's the source of the linker error.
One solution without using inline
One solution to your problem, without using inline, is:
Declare the function in the .hpp file.
Define it in the appropriate .cpp file.
file_ops.hpp:
bool systemIsLittleEndian(); // Just the declaration.
file_ops.cpp:
#include "file_ops.hpp"
// The definition.
bool systemIsLittleEndian() {
uint16_t x = 0x0011;
uint8_t *half_x = (uint8_t *) &x;
if (*half_x == 0x11)
return true;
else
return false;
}
Update
Regarding
bool MY_LIB_EXPORT someFunc();// implemented in `file_ops.cpp`
There is a lot of information on the web about this. It is a Microsoft/Windows issue. Here are a couple of starting points to learn about it.
Exporting from a DLL Using __declspec(dllexport)
Importing into an Application Using __declspec(dllimport)
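The question does not show how MY_LIB_EXPORT is defined, so the following is only a sketch of a common pattern, not the asker's actual macro. MY_LIB_BUILDING is a hypothetical name for a macro that the library's own build would define; the non-Windows branch uses the GCC/Clang visibility attribute.

```cpp
// Hypothetical definition of MY_LIB_EXPORT; the real one is not shown in the question.
// MY_LIB_BUILDING is an assumed macro, normally passed by the build system
// (e.g. -DMY_LIB_BUILDING) when compiling the library itself.
#define MY_LIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MY_LIB_BUILDING)
#    define MY_LIB_EXPORT __declspec(dllexport)  // building the DLL
#  else
#    define MY_LIB_EXPORT __declspec(dllimport)  // using the DLL
#  endif
#else
#  define MY_LIB_EXPORT __attribute__((visibility("default")))  // GCC/Clang
#endif

bool MY_LIB_EXPORT someFunc();                  // declaration, as in the header

bool MY_LIB_EXPORT someFunc() { return true; }  // definition, as in file_ops.cpp
```

A macro like this only controls whether a symbol is visible outside the shared library. It has no bearing on the duplicate-symbol error, which comes from the non-inline definition in the header.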

how to fix wrong GCC ARM startup code pointer to initialized and zero variables?

[skip to UPDATE2 and save some time :-)]
I use ARM Cortex-M4, with CMSIS 5-5.7.0 and FreeRTOS, compiling using GCC for ARM (10_2021.10)
My variables are not initialized as they should.
My startup code is pretty simple, the entry point is the reset handler (CMSIS declared startup_ARMCM4.s as deprecated and recommend using the C code startup code so this is what I do).
Here is my code:
__attribute__((__noreturn__)) void Reset_Handler(void)
{
DataInit();
SystemInit(); /* CMSIS System Initialization */
main();
}
static void DataInit(void)
{
typedef struct {
uint32_t const* src;
uint32_t* dest;
uint32_t wlen;
} __copy_table_t;
typedef struct {
uint32_t* dest;
uint32_t wlen;
} __zero_table_t;
extern const __copy_table_t __copy_table_start__;
extern const __copy_table_t __copy_table_end__;
extern const __zero_table_t __zero_table_start__;
extern const __zero_table_t __zero_table_end__;
for (__copy_table_t const* pTable = &__copy_table_start__; pTable < &__copy_table_end__; ++pTable) {
for(uint32_t i=0u; i<pTable->wlen; ++i) {
pTable->dest[i] = pTable->src[i];
}
}
for (__zero_table_t const* pTable = &__zero_table_start__; pTable < &__zero_table_end__; ++pTable) {
for(uint32_t i=0u; i<pTable->wlen; ++i) {
pTable->dest[i] = 0u;
}
}
}
__copy_table_start__, __copy_table_end__ etc. have the wrong values, and so no data is copied to the appropriate place in RAM.
I tried adding __libc_init_array() before DataInit(), as suggested in this answer, and removing the nostartfiles flag from the linker, but at some point __libc_init_array() jumps to an illegal address and I get a HardFault interrupt.
Is there a different method to fix it? Maybe one where I can keep the nostartfiles flag?
UPDATE:
Looking at the memory, where __copy_table_start__ is located, I see the data there is valid (even without the use of __libc_init_array()). It seems that pTable doesn't get the correct value.
I tried using __data_start__, __data_end__, __bss_start__, __bss_end__ and __etext instead of the above variables. The linker file says they can be used in code without definition, but they cannot (maybe that's a clue?). In any case, they didn't work either.
UPDATE2:
Found the actual problem:
All struct members get the same value (modifying one changes all the others), and it happens with every struct. I have no idea how this is possible. In other words, the value of __copy_table_start__.src is, for example, 0x14651234, __copy_table_start__.dest is 0x00100000, and __copy_table_start__.wlen is 0x0365. When looking at pTable, all members are 0x14651234.

Trap memory accesses inside a standard executable built with MinGW

So my problem sounds like this.
I have some platform dependent code (embedded system) which writes to some MMIO locations that are hardcoded at specific addresses.
I compile this code with some management code inside a standard executable (mainly for testing) but also for simulation (because it takes longer to find basic bugs inside the actual HW platform).
To work around the hardcoded pointers, I just redefine them to point at variables inside a memory pool, and this works really well.
The problem is that there is specific hardware behavior on some of the MMIO locations (W1C, write 1 to clear, for example) which makes "correct" testing hard to impossible.
These are the solutions I thought of:
1 - Somehow redefine the accesses to those registers and try to insert some immediate function to simulate the dynamic behavior. This is not really usable since there are various ways to write to the MMIO locations (pointers and stuff).
2 - Somehow leave the addresses hardcoded and trap the illegal access through a seg fault, find the location that triggered, extract exactly where the access was made, handle and return. I am not really sure how this would work (and even if it's possible).
3 - Use some sort of emulation. This will surely work, but it will void the whole purpose of running fast and native on a standard computer.
4 - Virtualization ?? Probably will take a lot of time to implement. Not really sure if the gain is justifiable.
Does anyone have any idea if this can be accomplished without going too deep? Maybe there is a way to make the compiler define a memory area for which every access generates a callback. I'm not really an expert in x86/gcc stuff.
Edit: It seems that it's not really possible to do this in a platform-independent way, and since it will be Windows only, I will use the available API (which seems to work as expected). Found this Q here:
Is set single step trap available on win 7?
I will put the whole "simulated" register file inside a number of pages, guard them, and trigger a callback from which I will extract all the necessary info, do my stuff, then continue execution.
Thanks all for responding.
I think #2 is the best approach. I routinely use approach #4, but I use it to test code that is running in the kernel, so I need a layer below the kernel to trap and emulate the accesses. Since you have already put your code into a user-mode application, #2 should be simpler.
The answers to this question may provide help in implementing #2. How to write a signal handler to catch SIGSEGV?
What you really want to do, though, is to emulate the memory access and then have the segv handler return to the instruction after the access. This sample code works on Linux. I'm not sure if the behavior it is taking advantage of is undefined, though.
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static void segv_handler(int, siginfo_t *, void *);
int main()
{
struct sigaction action = { 0, };
action.sa_sigaction = segv_handler;
action.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &action, NULL);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %u\n", a);
return 0;
}
static void segv_handler(int, siginfo_t *info, void *ucontext_arg)
{
ucontext_t *ucontext = static_cast<ucontext_t *>(ucontext_arg);
ucontext->uc_mcontext.gregs[REG_RAX] = 1234;
ucontext->uc_mcontext.gregs[REG_RIP] += 2;
}
The code to read the register is written in assembly to ensure that both the destination register and the length of the instruction are known.
This is how a Windows version of prl's answer could look:
#include <stdint.h>
#include <stdio.h>
#include <windows.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *);
int main()
{
SetUnhandledExceptionFilter(segv_handler);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %u\n", a);
return 0;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *ep)
{
// only handle read access violation of REG_ADDR
if (ep->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION ||
ep->ExceptionRecord->ExceptionInformation[0] != 0 ||
ep->ExceptionRecord->ExceptionInformation[1] != (ULONG_PTR)REG_ADDR)
return EXCEPTION_CONTINUE_SEARCH;
ep->ContextRecord->Rax = 1234;
ep->ContextRecord->Rip += 2;
return EXCEPTION_CONTINUE_EXECUTION;
}
So, the solution (code snippet) is as follows:
First of all, I have a variable:
__attribute__ ((aligned (4096))) int g_test;
Second, inside my main function, I do the following:
AddVectoredExceptionHandler(1, VectoredHandler);
DWORD old;
VirtualProtect(&g_test, 4096, PAGE_READWRITE | PAGE_GUARD, &old);
The handler looks like this:
LONG WINAPI VectoredHandler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
static ULONG_PTR last_addr; /* pointer-sized, so the address is not truncated on 64-bit builds */
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION) {
last_addr = ExceptionInfo->ExceptionRecord->ExceptionInformation[1];
ExceptionInfo->ContextRecord->EFlags |= 0x100; /* Single step to trigger the next one */
return EXCEPTION_CONTINUE_EXECUTION;
}
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
DWORD old;
VirtualProtect((PVOID)(last_addr & ~PAGE_MASK), 4096, PAGE_READWRITE | PAGE_GUARD, &old);
return EXCEPTION_CONTINUE_EXECUTION;
}
return EXCEPTION_CONTINUE_SEARCH;
}
This is only a basic skeleton of the functionality. Basically, I guard the page on which the variable resides, and I keep linked lists holding callback function pointers and values for each address in question. I check that the faulting address is in my list, then I trigger the callback.
On the first guard hit, the system disables the page protection, but I can call my PRE_WRITE callback, where I can save the variable's state. Because a single step is requested through EFlags, the guard hit is followed immediately by a single-step exception (which means that the variable has been written), and I can trigger a WRITE callback. All the data required for the operation is contained in the ExceptionInformation array.
When someone tries to write to that variable:
*(int *)&g_test = 1;
A PRE_WRITE followed by a WRITE will be triggered.
When I do:
int x = *(int *)&g_test;
A READ will be issued.
In this way i can manipulate the data flow in a way that does not require modifications of the original source code.
Note: This is intended to be used as part of a test framework and any penalty hit is deemed acceptable.
For example, W1C (Write 1 to clear) operation can be accomplished:
void MYREG_hook(reg_cbk_t type)
{
/** We need to save the pre-write state
* This is safe since we are assured to be called with
* both PRE_WRITE and WRITE in the correct order
*/
static int pre;
switch (type) {
case REG_READ: /* Called pre-read */
break;
case REG_PRE_WRITE: /* Called pre-write */
pre = g_test;
break;
case REG_WRITE: /* Called after write */
g_test = pre & ~g_test; /* W1C */
break;
default:
break;
}
}
This was also possible with seg-faults on illegal addresses, but I had to trigger one for each read or write and keep track of a "virtual register file", so the penalty was bigger. With guard pages I can guard only specific areas of memory, or none, depending on the registered monitors.
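The W1C rule applied in the REG_WRITE case of the hook can also be checked in isolation, away from any exception handling. This is only a sketch of the register semantics; w1c_apply is a hypothetical helper name, not part of the framework described above.

```cpp
#include <cstdint>

// Pure model of a write-1-to-clear register: every bit set in 'written'
// clears the matching bit of the value the register held before the write.
std::uint32_t w1c_apply(std::uint32_t before, std::uint32_t written) {
    return before & ~written;
}
```

For example, writing 0x5 to a register holding 0xF leaves 0xA, and writing 0 changes nothing. This matches the pre & ~g_test expression in the hook, where g_test holds the value just written.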

Does MSP430 GCC support newer C++ standards? (like 11, 14, 17)

I'm writing some code that would greatly benefit from the concise syntax of lambdas, which were introduced with C++ 11. Is this supported by the compiler?
How do I specify the compiler flags when compiling using Energia or embedXcode?
As of February 2018, up to C++14 is supported with some limitations:
http://processors.wiki.ti.com/index.php/C%2B%2B_Support_in_TI_Compilers
There isn't much about this topic on the TI site, or, at least, I don't know enough C++ to give you a detailed and precise response.
The implementation of the embedded ABI is described in this document, which is mainly derived from the Itanium C++ ABI. It explains nothing about the implementation of lambdas or the auto keyword (or perhaps I'm not able to derive this information from the documentation).
Thus I decided to directly test in Energia. Apparently the g++ version is 4.6.3, thus it should support both.
And in fact (from a compilation point of view, I don't have my MSP here to test the code) it can compile something like:
// In template.hpp
#ifndef TEMPLATE_HPP_
#define TEMPLATE_HPP_
template<class T>
T func(T a) {
auto c = [&](int n) { return n + a; };
return c(0);
}
#endif /* TEMPLATE_HPP_ */
// in the sketch main
#include "template.hpp"
void setup() { int b = func<int>(0); }
void loop() { }
(The template works only if it is in a header; placing it in the main sketch raises an error.) To compile this sketch I had to modify one internal file of the editor. The maximum supported standard seems to be -std=c++0x, and the compilation flags are in the file:
$ENERGIA_ROOT/hardware/energia/msp430/platform.txt
In my setup the root is /opt/energia. Inside that file I modified line 32 (compiler.cpp.flags) and added the option. Notice that -std=c++11 is not supported (it raises an error).
compiler.cpp.flags=-std=c++0x -c -g -O2 {compiler.mlarge_flag} {compiler.warning_flags} -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD
Unfortunately I have zero experience with embedXcode :\
Mimic std::function
std::function is not provided, thus you have to write some sort of class that mimics it. Something like:
// callback.hpp
#ifndef CALLBACK_HPP_
#define CALLBACK_HPP_
template <class RET, class ARG>
class Callback {
RET (*_f)(ARG);
public:
Callback() : _f(0) { };
Callback(RET (*f)(ARG)) : _f(f) { };
bool is_set() const { return (_f) ? true : false; }
RET operator()(ARG a) const { return is_set() ? _f(a) : 0; }
};
#endif /* CALLBACK_HPP_ */
// sketch
#include "callback.hpp"
void setup() {
// Note the empty capture list []: it is what allows the conversion to a function pointer.
auto clb = Callback<int, char>([](char c) { return (int)c; });
if (clb.is_set())
auto b = clb('a');
}
void loop() {}
may do the work, and it uses a simple trick:
The closure type for a lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function having the same parameter and return types as the closure type’s function call operator. [C++11 standard 5.1.2]
As soon as you leave the capture empty, you are assured to have a "conversion" to a function pointer, thus you can store it without issues. The code I have written:
requires a first template parameter RET, which is the return type
requires a second template parameter ARG, which is the type of the callback's single argument. In most cases you can use void* as a common argument type (cast a struct pointer to void* when calling and cast it back inside the function; the operation costs nothing)
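implements two constructors: the default constructor initializes the function pointer to NULL, while the second directly assigns the callback. Notice that no copy constructor is declared; since the class only holds a function pointer, the compiler-generated copy should be enough, and you only need to write one if you add members that need special handling.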
implements a method to call the function (overloading the operator ()) and to check if the callback actually exists.
Again: this stuff compiles with no warnings, but I don't know if it works on the MSP430, since I cannot test it (it works on a common amd64 linux system).
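The void* suggestion from the list above can be sketched as follows. The Callback class is repeated from the answer so the example is self-contained; Counter and bump are hypothetical names used only for illustration, and like the rest of this answer it has only been reasoned about for a host compiler, not an MSP430.

```cpp
// Callback repeated from the answer above so that this sketch stands alone.
template <class RET, class ARG>
class Callback {
    RET (*_f)(ARG);
public:
    Callback() : _f(0) {}
    Callback(RET (*f)(ARG)) : _f(f) {}
    bool is_set() const { return _f != 0; }
    RET operator()(ARG a) const { return is_set() ? _f(a) : 0; }
};

// Hypothetical context struct, only for illustration.
struct Counter { int value; };

// Passes a Counter through void* and recovers it inside the lambda.
int bump(Counter& c) {
    // Empty capture list, so the lambda converts to int (*)(void*).
    Callback<int, void*> clb([](void* p) -> int {
        Counter* ctx = static_cast<Counter*>(p);  // cast back to the real type
        return ++ctx->value;
    });
    return clb(&c);  // Counter* converts implicitly to void*
}
```

Because the state travels through the argument instead of a capture, the lambda stays convertible to a plain function pointer, which is what the class stores.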

link functions with mismatching signature

I'm playing around with the gcc and g++ compilers, trying to compile some C code with them. My purpose is to see how the compiler/linker ensures that, when a module containing a function declaration is linked with a module containing that function's implementation, the correct functions are linked (in terms of parameters passed and values returned).
For example, let's take a look at this code:
#include <stdio.h>
extern int foo(int b, int c);
int main()
{
int f = foo(5, 8);
printf("%d",f);
}
After compilation, my symbol table has a symbol for foo, but the ELF file format has no place that describes the arguments taken or the function signature (int(int,int)). So basically, if I write some other code such as this:
char foo(int a, int b, int c)
{
return (char) ( a + b + c );
}
and compile that module, it will also have a symbol called foo. What happens if I link these modules together? I have never thought about this, or about how a compiler would overcome this weakness. I know that g++ generates a prefix for every symbol according to its namespace, but does it also take the signature into account? If anyone has encountered this, it would be great if they could shed some light on the problem.
The problem is solved with name mangling.
In compiler construction, name mangling (also called name decoration)
is a technique used to solve various problems caused by the need to
resolve unique names for programming entities in many modern
programming languages.
It provides a way of encoding additional information in the name of a
function, structure, class or another datatype in order to pass more
semantic information from the compilers to linkers.
The need arises where the language allows different entities to be
named with the same identifier as long as they occupy a different
namespace (where a namespace is typically defined by a module, class,
or explicit namespace directive) or have different signatures (such as
function overloading).
Note the simple example:
Consider the following two definitions of f() in a C++ program:
int f (void) { return 1; }
int f (int) { return 0; }
void g (void) { int i = f(), j = f(0); }
These are distinct functions, with no relation to each other apart
from the name. If they were natively translated into C with no
changes, the result would be an error — C does not permit two
functions with the same name. The C++ compiler therefore will encode
the type information in the symbol name, the result being something
resembling:
int __f_v (void) { return 1; }
int __f_i (int) { return 0; }
void __g_v (void) { int i = __f_v(), j = __f_i(0); }
Notice that g() is mangled even though there is no conflict; name
mangling applies to all symbols.
Wow, I kept exploring and testing this on my own, and I came up with an answer that quite amazed me.
I wrote the following code and compiled it with gcc:
main.c
#include <stdio.h>
extern int foo(int a, char b);
int main()
{
int g = foo(5, 6);
printf("%d", g);
return 0;
}
foo.c
typedef struct {
    int a;
    int b;
    char c;
    char d;
} mystruct;
mystruct foo(int a, int b)
{
    mystruct my;
    my.a = a;
    my.b = a + 1;
    my.c = (char) b;
    my.d = (char) (b + 1);
    return my;
}
First I compiled foo.c to foo.o with gcc and checked the symbol table using readelf; it had an entry called foo.
After that I compiled main.c to main.o, checked the symbol table, and it also had an entry called foo. I linked the two together and, surprisingly, it worked. I ran the resulting executable and, as expected, hit a segmentation fault. That makes sense: the actual implementation of foo in foo.o probably expects three parameters (the first being the address of the struct to return), which main.o does not pass under its declaration of foo. The implementation then reads memory that does not belong to it from main's stack frame, tries to access the addresses it thinks it received, and ends up with a segmentation fault. That's fine.
Then I compiled both modules again with g++ instead of gcc, and what came up was amazing. The symbol entry in foo.o was _Z3fooii and in main.o it was _Z3fooic. My guess is that the ii suffix means int, int and the ic suffix means int, char, which refers to the parameters that should be passed to the function, hence allowing the linker to match a function declaration with the actual implementation. So I changed my foo declaration in main.c to
extern int foo(int a, int b);
recompiled, and this time got the symbol _Z3fooii. I linked both modules again and this time the link succeeded. I tried running it and again hit a segmentation fault, which also makes sense, since the return type is apparently not checked. Anyway, this confirmed my original thought: g++ includes the function's parameter types in the symbol name, which forces the linker to match a declaration only with an implementation that takes the same parameters.
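The guess about the ii and ic suffixes can be checked by demangling the two symbols. The sketch below uses abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang; demangle is a hypothetical wrapper name.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Returns the demangled form of an Itanium-ABI symbol, or the input on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the returned buffer is malloc-allocated
    return result;
}
```

With this, _Z3fooii demangles to foo(int, int) and _Z3fooic to foo(int, char). Neither name mentions the return type: for ordinary functions this mangling scheme encodes only the name and parameter types, which is consistent with the mismatched return type still linking and crashing at run time.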

C++ shared library symbols versioning

I'm trying to create a library with two versions of the same function using the
__asm__(".symver ......
approach
library.h
#ifndef CTEST_H
#define CTEST_H
int first(int x);
int second(int x);
#endif
library.cpp
#include "simple.h"
#include <stdio.h>
__asm__(".symver first_1_0,first@LIBSIMPLE_1.0");
int first_1_0(int x)
{
printf("lib: %s\n", __FUNCTION__);
return x + 1;
}
__asm__(".symver first_2_0,first@@LIBSIMPLE_2.0");
int first_2_0(int x)
{
int y;
printf("lib: %d\n", y);
printf("lib: %s\n", __FUNCTION__);
return (x + 1) * 1000;
}
int second(int x)
{
printf("lib: %s\n", __FUNCTION__);
return x + 2;
}
And here is the version script file
LIBSIMPLE_1.0{
global:
first; second;
local:
*;
};
LIBSIMPLE_2.0{
global:
first;
local:
*;
};
When I build the library using gcc, everything works well, and I am able to link to the library binary. Using the nm tool, I see that both the first() and second() function symbols are exported.
Now, when I try to use g++, none of the symbols are exported.
So I tried to use an extern "C" directive to wrap both declarations
extern "C" {
int first(int x);
int second(int x);
}
nm shows that the second() function symbol is exported, but first() still remains unexported, and mangled.
What am I missing here to make this work? Or is it impossible to achieve this with the C++ compiler?
I don't know why, with 'extern "C"', 'first' was not exported - suspect there is something else interfering.
Otherwise C++ name mangling is certainly a pain here. The 'asm' directives (AFAIK) require the mangled names for C++ functions, not the simple 'C' name. So 'int first(int)' would need to be referenced as (e.g.) '_Z5firsti' instead of just 'first'. This is, of course, a real pain as far as portability goes...
The linker map file is more forgiving, as it supports 'extern "C++" {...}' blocks to list C++ symbols in their as-written form - 'int first(int)'.
This whole process is a maintenance nightmare. What I'd really like would be a function attribute which could be used to specify the alias and version...
Just to add a reminder that C++11 now supports inline namespaces which can be used to provide symbol versioning in C++.
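A minimal sketch of that idea, assuming a hypothetical namespace simple with versions v1 and v2 (none of these names come from the question). Note that this is versioning at the source and mangled-name level, not ELF symbol versioning via .symver.

```cpp
namespace simple {

namespace v1 {
// Old behaviour, still reachable explicitly as simple::v1::first.
int first(int x) { return x + 1; }
}

inline namespace v2 {
// Current behaviour: the inline namespace makes this simple::first.
int first(int x) { return (x + 1) * 1000; }
}

}  // namespace simple
```

Callers that write simple::first(1) get the v2 function, while code that needs the old behaviour can name simple::v1::first. The two functions have different mangled names, so both can be exported from the same library without colliding.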
