ARM GCC unaligned access - gcc

If TStruct is packed, this code ends with Str.D == 0x00223344 (not 0x11223344). Why? The compiler is ARM GCC 4.7.
#include <string.h>
typedef struct {
unsigned char B;
unsigned int D;
} __attribute__ ((packed)) TStruct;
volatile TStruct Str;
int main( void) {
memset((void *)&Str, 0, sizeof(Str));
Str.D = 0x11223344;
if(Str.D != 0x11223344) {
return 1;
}
return 0;
}

I guess your problem has nothing to do with unaligned access, but with structure definition. int is not necessarily 32 bit long. According to the C standard, int is at least 16 bit long, and char is at least 8 bits long.
My guess is that your compiler lays out TStruct so that it looks like this:
struct {
unsigned char B : 8;
unsigned int D : 24;
} ...;
When you assign 0x11223344 to Str.D, then according to the C standard, the compiler must only make sure that at least 16 bits (0x3344) are written to Str.D. You didn't specify that Str.D is 32 bits long, only that it is at least 16 bits long.
Your compiler may also arrange the struct like this:
struct {
unsigned char B : 16;
unsigned int D : 16;
} ...;
B is at least 8 bits long, and D is at least 16 bits long, all ok.
Probably, what you want to do, is:
#include <stdint.h>
typedef struct {
uint8_t B;
uint32_t D;
} __attribute__((packed)) TStruct;
That way you can ensure that the 32-bit value 0x11223344 is written to Str.D in full. It is a good idea to use fixed-width types in packed structs.
As for unaligned access of a member inside a struct, the compiler should take care of it. If a compiler knows the structure definition, then when you are accessing Str.D it should take care of any unaligned access and bit/byte operations.
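As a minimal sketch of the fixed-width layout suggested above, the following checks the packed layout and moves the 32-bit member byte-wise with memcpy, which does not rely on the hardware tolerating unaligned word accesses. The names PackedPair and store_and_load are hypothetical, the packed attribute is GCC/Clang-specific, and the sketch does not claim to explain the behaviour the asker observed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TStruct, using fixed-width types. */
typedef struct {
    uint8_t  B;
    uint32_t D;   /* sits at byte offset 1 because of the packed attribute */
} __attribute__((packed)) PackedPair;

/* Store a value into D and read it back, copying byte-wise so that no
   aligned 32-bit load or store is required at the unaligned offset. */
uint32_t store_and_load(uint32_t value)
{
    PackedPair s;
    unsigned char *d_bytes = (unsigned char *)&s + offsetof(PackedPair, D);
    uint32_t out;

    memset(&s, 0, sizeof s);
    memcpy(d_bytes, &value, sizeof value);
    memcpy(&out, d_bytes, sizeof out);
    return out;
}
```

Checking sizeof(PackedPair) and offsetof(PackedPair, D) at build time is a cheap way to confirm that the compiler produced the layout you expect.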

Related

What differences in behaviour can there be for a single program between C and C++? [duplicate]

C and C++ have many differences, and not all valid C code is valid C++ code.
(By "valid" I mean standard code with defined behavior, i.e. not implementation-specific/undefined/etc.)
Is there any scenario in which a piece of code valid in both C and C++ would produce different behavior when compiled with a standard compiler in each language?
To make it a reasonable/useful comparison (I'm trying to learn something practically useful, not to try to find obvious loopholes in the question), let's assume:
Nothing preprocessor-related (which means no hacks with #ifdef __cplusplus, pragmas, etc.)
Anything implementation-defined is the same in both languages (e.g. numeric limits, etc.)
We're comparing reasonably recent versions of each standard (e.g. say, C++98 and C90 or later)
If the versions matter, then please mention which versions of each produce different behavior.
Here is an example that takes advantage of the difference between function calls and object declarations in C and C++, as well as the fact that C90 allows the calling of undeclared functions:
#include <stdio.h>
struct f { int x; };
int main() {
f();
}
int f() {
return printf("hello");
}
In C++ this will print nothing because a temporary f is created and destroyed, but in C90 it will print hello because functions can be called without having been declared.
In case you were wondering about the name f being used twice, the C and C++ standards explicitly allow this, and to make an object you have to say struct f to disambiguate if you want the structure, or leave off struct if you want the function.
For C++ vs. C90, there's at least one way to get different behavior that's not implementation defined. C90 doesn't have single-line comments. With a little care, we can use that to create an expression with entirely different results in C90 and in C++.
int a = 10 //* comment */ 2
+ 3;
In C++, everything from the // to the end of the line is a comment, so this works out as:
int a = 10 + 3;
Since C90 doesn't have single-line comments, only the /* comment */ is a comment. The first / and the 2 are both parts of the initialization, so it comes out to:
int a = 10 / 2 + 3;
So, a correct C++ compiler will give 13, but a strictly correct C90 compiler 8. Of course, I just picked arbitrary numbers here -- you can use other numbers as you see fit.
The following, valid in C and C++, is going to (most likely) result in different values in i in C and C++:
int i = sizeof('a');
See Size of character ('a') in C/C++ for an explanation of the difference.
Another one from this article:
#include <stdio.h>
int sz = 80;
int main(void)
{
struct sz { char c; };
int val = sizeof(sz); // sizeof(int) in C,
// sizeof(struct sz) in C++
printf("%d\n", val);
return 0;
}
C90 vs. C++11 (int vs. double):
#include <stdio.h>
int main()
{
auto j = 1.5;
printf("%d", (int)sizeof(j));
return 0;
}
In C auto means local variable. In C90 it's ok to omit variable or function type. It defaults to int. In C++11 auto means something completely different, it tells the compiler to infer the type of the variable from the value used to initialize it.
Another example that I haven't seen mentioned yet, this one highlighting a preprocessor difference:
#include <stdio.h>
int main()
{
#if true
printf("true!\n");
#else
printf("false!\n");
#endif
return 0;
}
This prints "false!" in C and "true!" in C++. In C (before C23), any identifier that remains in an #if expression after macro expansion evaluates to 0. C++ makes one exception: true evaluates to 1.
Per C++11 standard:
a. The comma operator performs lvalue-to-rvalue conversion in C but not C++:
char arr[100];
int s = sizeof(0, arr); // The comma operator is used.
In C++ the value of this expression will be 100 and in C this will be sizeof(char*).
b. In C++ the type of enumerator is its enum. In C the type of enumerator is int.
enum E { a, b, c };
sizeof(a) == sizeof(int); // In C
sizeof(a) == sizeof(E); // In C++
This means that sizeof(int) may not be equal to sizeof(E).
c. In C++ a function declared with an empty parameter list takes no arguments. In C an empty parameter list means that the number and types of the function's parameters are unknown.
int f(); // int f(void) in C++
// int f(*unknown*) in C
This program prints 1 in C++ and 0 in C:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int d = (int)(abs(0.6) + 0.5);
printf("%d", d);
return 0;
}
This happens because there is double abs(double) overload in C++, so abs(0.6) returns 0.6 while in C it returns 0 because of implicit double-to-int conversion before invoking int abs(int). In C, you have to use fabs to work with double.
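The C side of this difference can be sketched with two small functions. The names round_via_abs and round_via_fabs are hypothetical, and the explicit (int) cast only spells out the conversion that C would otherwise apply silently when a double is passed to int abs(int).

```c
#include <math.h>
#include <stdlib.h>

/* Mirrors the C behaviour of abs(0.6): the argument is converted to int
   first, so the fractional part is lost before abs() ever runs. */
int round_via_abs(double x)
{
    return (int)(abs((int)x) + 0.5);
}

/* fabs() works on double, so the fractional part survives. */
int round_via_fabs(double x)
{
    return (int)(fabs(x) + 0.5);
}
```

For an input of 0.6 the first function yields 0 and the second yields 1, matching the C and C++ results described above.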
#include <stdio.h>
int main(void)
{
printf("%d\n", (int)sizeof('a'));
return 0;
}
In C, this prints whatever the value of sizeof(int) is on the current system, which is typically 4 in most systems commonly in use today.
In C++, this must print 1.
Another sizeof trap: boolean expressions.
#include <stdio.h>
int main() {
printf("%d\n", (int)sizeof !0);
}
It equals sizeof(int) in C, because the expression is of type int, but is typically 1 in C++ (though it's not required to be). In practice they are almost always different.
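The sizeof cases mentioned in these answers (character literals, logical expressions, enumerators) can be checked from the C side with a few one-line functions. This is only a sketch: the enum Color and the function names are made up for illustration, and the same source compiled as C++ would be expected to report different sizes for at least the first two.

```c
#include <stddef.h>

/* Hypothetical enum, used only to have an enumerator to measure. */
enum Color { RED, GREEN, BLUE };

/* In C a character constant has type int. */
size_t size_of_char_literal(void) { return sizeof('a'); }

/* In C the result of ! has type int. */
size_t size_of_logical_not(void) { return sizeof(!0); }

/* In C an enumeration constant such as RED has type int. */
size_t size_of_enumerator(void) { return sizeof(RED); }
```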
An old chestnut that depends on the C compiler, not recognizing C++ end-of-line comments...
...
int a = 4 //* */ 2
+2;
printf("%i\n",a);
...
The C++ Programming Language (3rd Edition) gives three examples:
sizeof('a'), as @Adam Rosenfield mentioned;
// comments being used to create hidden code:
int f(int a, int b)
{
return a //* blah */ b
;
}
Structures etc. hiding stuff in outer scopes, as in your example.
Another one listed by the C++ Standard:
#include <stdio.h>
int x[1];
int main(void) {
struct x { int a[2]; };
/* size of the array in C */
/* size of the struct in C++ */
printf("%d\n", (int)sizeof(x));
}
Inline functions in C default to external scope, whereas those in C++ do not.
Compiling the following two files together would print "I am inline" in the case of GNU C but nothing for C++.
File 1
#include <stdio.h>
struct fun{};
int main()
{
fun(); // In C, this calls the inline function from file 2, whereas in C++
// this would create a variable of struct fun
return 0;
}
File 2
#include <stdio.h>
inline void fun(void)
{
printf("I am inline\n");
}
Also, C++ implicitly treats any const global as static unless it is explicitly declared extern, unlike C in which extern is the default.
#include <stdio.h>
struct A {
double a[32];
};
int main() {
struct B {
struct A {
short a, b;
} a;
};
printf("%d\n", (int)sizeof(struct A)); /* cast: sizeof yields size_t, not int */
return 0;
}
This program prints 128 (32 * sizeof(double)) when compiled using a C++ compiler and 4 when compiled using a C compiler.
This is because C does not have the notion of scope resolution. In C structures contained in other structures get put into the scope of the outer structure.
struct abort
{
int x;
};
int main()
{
abort();
return 0;
}
Returns with exit code of 0 in C++, or 3 in C.
This trick could probably be used to do something more interesting, but I couldn't think of a good way of creating a constructor that would be palatable to C. I tried making a similarly boring example with the copy constructor, that would let an argument be passed, albeit in a rather non-portable fashion:
struct exit
{
int x;
};
int main()
{
struct exit code;
code.x=1;
exit(code);
return 0;
}
VC++ 2005 refused to compile that in C++ mode, though, complaining about how "exit code" was redefined. (I think this is a compiler bug, unless I've suddenly forgotten how to program.) It exited with a process exit code of 1 when compiled as C though.
Don't forget the distinction between the C and C++ global namespaces. Suppose you have a foo.cpp
#include <cstdio>
void foo(int r)
{
printf("I am C++\n");
}
and a foo2.c
#include <stdio.h>
void foo(int r)
{
printf("I am C\n");
}
Now suppose you have a main.c and main.cpp which both look like this:
extern void foo(int);
int main(void)
{
foo(1);
return 0;
}
When compiled as C++, it will use the symbol in the C++ global namespace; in C it will use the C one:
$ diff main.cpp main.c
$ gcc -o test main.cpp foo.cpp foo2.c
$ ./test
I am C++
$ gcc -o test main.c foo.cpp foo2.c
$ ./test
I am C
int main(void) {
const int dim = 5;
int array[dim];
}
This is rather peculiar in that it is valid in C++ and in C99, C11, and C17 (though optional in C11, C17); but not valid in C89.
In C99+ it creates a variable-length array, which has its own peculiarities over normal arrays, as it has a runtime type instead of compile-time type, and sizeof array is not an integer constant expression in C. In C++ the type is wholly static.
If you add an initializer:
int main(void) {
const int dim = 5;
int array[dim] = {0};
}
the code is still valid C++ but no longer valid C, because variable-length arrays cannot have an initializer.
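Since a variable-length array cannot take an initializer in C, one common workaround is to zero it with memset after the declaration. The sketch below assumes a compiler that supports VLAs (they are optional from C11 on) and a length greater than zero; the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* sizeof applied to a VLA is evaluated at run time, so the result
   depends on n rather than being a compile-time constant. */
size_t vla_bytes(size_t n)
{
    int array[n];
    return sizeof array;
}

/* Zero the VLA by hand instead of writing "= {0}", then sum it. */
int sum_zeroed_vla(size_t n)
{
    int array[n];
    int sum = 0;

    memset(array, 0, sizeof array);
    for (size_t i = 0; i < n; i++)
        sum += array[i];
    return sum;
}
```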
Empty structures have size 0 in C (where they are a GNU extension rather than standard C) and 1 in C++:
#include <stdio.h>
typedef struct {} Foo;
int main()
{
printf("%zu\n", sizeof(Foo));
return 0;
}
This concerns lvalues and rvalues in C and C++.
In the C programming language, both the pre-increment and the post-increment operators return rvalues, not lvalues. This means that they cannot be on the left side of the = assignment operator. Both these statements will give a compiler error in C:
int a = 5;
a++ = 2; /* error: lvalue required as left operand of assignment */
++a = 2; /* error: lvalue required as left operand of assignment */
In C++ however, the pre-increment operator returns an lvalue, while the post-increment operator returns an rvalue. It means that an expression with the pre-increment operator can be placed on the left side of the = assignment operator!
int a = 5;
a++ = 2; // error: lvalue required as left operand of assignment
++a = 2; // No error: a gets assigned to 2!
Now why is this so? The post-increment increments the variable, and it returns the variable as it was before the increment happened. This is actually just an rvalue. The former value of the variable a is copied into a register as a temporary, and then a is incremented. But the former value of a is returned by the expression, it is an rvalue. It no longer represents the current content of the variable.
The pre-increment first increments the variable, and then it returns the variable as it became after the increment happened. In this case, we do not need to store the old value of the variable in a temporary register. We just retrieve the new value of the variable after it has been incremented. So the pre-increment returns an lvalue: it returns the variable a itself. We can assign this lvalue to something else, as in the following statements. This is an implicit conversion of an lvalue into an rvalue.
int x = a;
int x = ++a;
Since the pre-increment returns an lvalue, we can also assign something to it. The following two statements are identical. In the second assignment, first a is incremented, then its new value is overwritten with 2.
int a;
a = 2;
++a = 2; // Valid in C++.

ARM GCC compiler "buggy" conversion

Problem
I am working on flash memory optimization for an STM32F051. It turned out that conversion between the float and int types consumes a lot of flash.
Digging into this, I found that the conversion to int takes around 200 bytes of flash memory, while the conversion to unsigned int takes around 1500 bytes!
int and unsigned int differ only in how the 'sign' bit is interpreted, so this behavior is a great mystery to me.
Note: Performing the 2-stage conversion float -> int -> unsigned int also consumes only around 200 bytes.
Questions
Analyzing that, I have such questions:
1) What is the mechanism of the float to unsigned int conversion? Why does it take so much memory, when the conversion float -> int -> unsigned int takes so little? Is it connected with the IEEE 754 standard?
2) Are any problems to be expected when the conversion float -> int -> unsigned int is used instead of a direct float -> unsigned int?
3) Is there any way to wrap the float -> unsigned int conversion while keeping the memory footprint low?
Note: A similar question has already been asked here (Trying to understand how the casting/conversion is done by compiler,e.g., when cast from float to int), but there is still no clear answer, and my question is about memory usage.
Technical data
Compiler: ARM-NONE-EABI-GCC (gcc version 4.9.3 20141119 (release))
MCU: STM32F051
MCU's core: 32 bit ARM CORTEX-M0
Code example
float -> int (~200 bytes of flash)
int main() {
volatile float f;
volatile int i;
i = f;
return 0;
}
float -> unsigned int (~1500 bytes! of flash)
int main() {
volatile float f;
volatile unsigned int ui;
ui = f;
return 0;
}
float ->int-> unsigned int (~200 bytes of flash)
int main() {
volatile float f;
volatile int i;
volatile unsigned int ui;
i = f;
ui = i;
return 0;
}
There is no fundamental reason why the conversion from float to unsigned int should be larger than the conversion from float to signed int. In practice, the float to unsigned int conversion can be made smaller than the float to signed int conversion.
I did some investigation using the GNU Arm Embedded Toolchain (Version 7-2018-q2) and, as far as I can see, the size problem is due to a flaw in the gcc runtime library. For some reason this library does not provide a specialized version of the __aeabi_f2uiz function for Armv6-M; instead it falls back on a much larger general version.
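One way to wrap the two-stage conversion from the question is sketched below. The helper name float_to_uint_small is hypothetical, and the sketch assumes a 32-bit int and inputs that are only meaningful in the range 0 <= f < 2^31; values outside that range are clamped, because converting an out-of-range float to int is undefined behaviour. Whether this actually saves flash depends on the toolchain, since the float comparisons pull in their own runtime helpers on a core without an FPU.

```c
#include <limits.h>

/* Hypothetical wrapper: goes through int so that only the float -> int
   runtime routine is needed. Assumes int is 32 bits wide. */
unsigned int float_to_uint_small(float f)
{
    if (!(f > 0.0f))
        return 0u;                     /* zero, negatives and NaN clamp to 0 */
    if (f >= 2147483648.0f)
        return (unsigned int)INT_MAX;  /* too large for int: clamp */
    return (unsigned int)(int)f;       /* truncates toward zero */
}
```

Unlike a direct float -> unsigned int conversion, this cannot represent values between 2^31 and 2^32 - 1, which is the main trade-off of going through int.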

c alignment of pointers

I'm wondering if it's possible to hint to gcc that a pointer points to an aligned boundary. If I have a function:
#include <stdint.h>
void foo ( void * pBuf ) {
/* round the address up to the next 8-byte boundary */
uint64_t *pAligned = (uint64_t *)(((uintptr_t)pBuf + 7) & ~(uintptr_t)0x7);
uint64_t var = *pAligned; // I want this to be aligned 64 bit access
(void)var;
}
And I know that pBuf is 64 bit aligned, is there any way to tell gcc that pAligned points to a 64 bit boundary? If I do:
uint64_t *pAligned __attribute__((aligned(16)));
I believe that means that the pointer variable itself is aligned, but it doesn't tell the compiler that what it points to is aligned, and therefore the compiler would likely emit an unaligned fetch here. This could slow things down if I'm looping through a large array.
There are several ways to inform GCC about alignment.
First, you can attach the aligned attribute to the pointee rather than to the pointer:
int foo() {
int __attribute__((aligned(16))) *p;
return (unsigned long long)p & 3;
}
Or you can use (relatively new) builtin:
int bar(int *p) {
int *pa = __builtin_assume_aligned(p, 16);
return (unsigned long long)pa & 3;
}
Both variants optimize to return 0 due to alignment.
Unfortunately the following does not seem to work:
typedef int __attribute__((aligned(16))) *aligned_ptr;
int baz(aligned_ptr p) {
return (unsigned long long)p & 3;
}
and this one does not either
typedef int aligned_int __attribute__((aligned (16)));
int braz(aligned_int *p) {
return (unsigned long long)p & 3;
}
even though docs suggest the opposite.
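Applied to the loop from the question, the builtin variant could look like the sketch below. The function name sum_aligned is hypothetical, __builtin_assume_aligned is a GCC/Clang builtin rather than standard C, and the caller is assumed to guarantee that the buffer really is 8-byte aligned; passing a misaligned pointer breaks that promise and the behaviour is then undefined.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum count 64-bit words from a buffer that the caller promises is
   8-byte aligned. The builtin returns the same pointer, but lets the
   compiler assume the alignment when it generates the loads. */
uint64_t sum_aligned(const void *buf, size_t count)
{
    const uint64_t *p = __builtin_assume_aligned(buf, 8);
    uint64_t total = 0;

    for (size_t i = 0; i < count; i++)
        total += p[i];
    return total;
}
```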

Is there a way in gcc to get a warning when a constexpr can't be evaluated at compile time?

I'm using gcc 5.1.0 (C++14) and I was experimenting with constexpr. It is very annoying to verify whether the constexpr functions I've implemented are evaluated at compile time. I couldn't find any flag that produces a warning in that situation.
Here is an example:
example.cpp -----------------------------------------
#include <stdlib.h>
const char pruVar[] = "12345678901234567";
[[gnu::noinline]] constexpr unsigned int myStrlen(const char* cstr)
{
unsigned int i=0;
for(;cstr[i]!=0;++i);
return i;
}
struct CEXAMPLE
{
unsigned int size;
constexpr CEXAMPLE(const char* s): size(myStrlen(s))
{
}
};
int main(void)
{
CEXAMPLE c(pruVar);
unsigned int size = myStrlen(pruVar);
void* a = malloc(c.size + size);
if (a != nullptr)
return 0;
else
return 1;
}
In the example, CEXAMPLE::CEXAMPLE is evaluated at compile time, including the call to myStrlen inside it, but the call to myStrlen in main is evaluated at runtime. The only way I have to find this out is to look at the generated assembly. This website is very useful too: http://gcc.godbolt.org/
If you know how to make the compiler warn about this, or something similar, I'd appreciate it.
myStrlen(pruVar) can be evaluated at compile time; the compiler is just choosing not to in this instance.
If you want to force the compiler to evaluate it at compile time or error if this is not possible, assign the result to a constexpr variable:
constexpr unsigned int size = myStrlen(pruVar);
^^^^^^^^^
You could also use an enum, or a std::integral_constant:
enum : unsigned int { size = myStrlen(pruVar) };
std::integral_constant<unsigned int, myStrlen(pruVar)> size;
Because template arguments must be evaluated at compile time, a helper template can be used.
namespace helper {
template<class T, T v> constexpr T enforce_compiletime() {
constexpr T cv = v;
return cv;
}
}
#define compiletime(arg) ::helper::enforce_compiletime<decltype(arg), (arg)>()
This allows compile time enforcement without an additional constexpr variable, which is handy in order to calculate value lookup tables.
constexpr uint32_t bla(uint8_t blub) {
switch (blub) {
case 5:
return 22;
default:
return 23;
}
}
struct SomeStruct {
uint32_t a;
uint32_t b;
};
SomeStruct aStruct = {compiletime(bla(5)), compiletime(bla(6))};

Cocoa:NSUInteger vs unsigned int When the Range is Very Small?

I have an unsigned int variable and it can only have the values of 0 -> 30. What should I use: unsigned int or NSUInteger? (for both 32 and 64 bit)
I’d go with either NSUInteger (as the idiomatic general unsigned integer type in Cocoa) or uint8_t (if size matters). If I expected to be using 0–30 values in several places for the same type of data, I’d typedef it to describe what it represents.
Running this:
int sizeLong = sizeof(unsigned long);
int sizeInt = sizeof(unsigned int);
NSLog(@"%d, %d", sizeLong, sizeInt);
on 64bits gives:
8, 4
on 32 bits gives:
4, 4
So yes, on 64-bit an unsigned long (NSUInteger) takes twice as much memory as NSUInteger does on 32-bit.
It makes very little difference in your case, there is no right or wrong. I might use an NSUInteger, just to match with Cocoa API stuff.
NSUInteger is defined like this:
#if __LP64__ || TARGET_OS_EMBEDDED || TARGET_OS_IPHONE || TARGET_OS_WIN32 || NS_BUILD_32_LIKE_64
typedef long NSInteger;
typedef unsigned long NSUInteger;
#else
typedef int NSInteger;
typedef unsigned int NSUInteger;
#endif
When you really want to use an unsigned type, choosing between unsigned int and NSUInteger does not matter because those types are equal (comparing the range and size in 32 & 64 bit). The same applies to int and NSInteger:
#if __LP64__ || (TARGET_OS_EMBEDDED && !TARGET_OS_IPHONE) || TARGET_OS_WIN32 || NS_BUILD_32_LIKE_64
typedef long NSInteger;
typedef unsigned long NSUInteger;
#else
typedef int NSInteger;
typedef unsigned int NSUInteger;
#endif
Personally, I have just been bitten by this choice. I went down the NSUInteger route and have just spent HOURS looking into an obscure bug.
I had code that picked a random number and returned an NSUInteger. The code relied on the number overflowing. However, I did not anticipate that the size of the number varies between 32-bit and 64-bit systems. The rest of my code assumed (incorrectly) that the number would be at most 32 bits in size. As a result the code worked perfectly on 32-bit devices, but on the iPhone 5S it all fell apart.
There is nothing wrong with using NSUInteger; however, it's worth remembering that its range is significantly larger on 64-bit, so factor this variability into any maths you do with that number.
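The pitfall described here can be sketched in plain C. NSUInteger is a typedef for unsigned int or unsigned long depending on the target, so arithmetic that relies on wrap-around behaves differently between the two. The function names below are hypothetical. When code depends on 32-bit overflow, a fixed-width type such as uint32_t keeps the behaviour the same on both targets.

```c
#include <stdint.h>

/* Always wraps modulo 2^32, whatever the target. */
uint32_t wrap_add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Wraps modulo 2^32 where unsigned long is 32 bits wide, but modulo
   2^64 where it is 64 bits wide (for example on LP64 targets). */
unsigned long wrap_add_ulong(unsigned long a, unsigned long b)
{
    return a + b;
}
```

Adding 1 to 4294967295 yields 0 with the first function everywhere, while the second yields 0 only where unsigned long is 32 bits wide.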

Resources