edit: it's probably a gcc bug. reported: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108798
int8_t has a range of -128 to 127, so when compiling the code
#include <stdint.h>
int main(){
int8_t i = 128;
(void)i;
}
with
gcc t.c -Woverflow
why does it not catch the 128 overflow? fwiw gcc catches the overflow only when you go over 255, for example
#include <stdint.h>
int main(){
int8_t i = 256;
(void)i;
}
is caught. so why isn't 128 caught?
interestingly, it catches the 128 overflow with -Woverflow -Wpedantic, resulting in:
prog.c:3:15: warning: overflow in conversion from 'int' to 'int8_t' {aka 'signed char'} changes value from '128' to '-128' [-Woverflow]
3 | int8_t ii=128;
tested with gcc 12.1.0 (released 2022-05-06)
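Independent of which warnings the compiler emits, the range stated above can be checked explicitly. A minimal sketch, assuming a helper named fits_int8 (a hypothetical name, not part of gcc or the standard library):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if v is representable as int8_t,
   i.e. lies within [INT8_MIN, INT8_MAX] = [-128, 127]. */
bool fits_int8(int v)
{
    return v >= INT8_MIN && v <= INT8_MAX;
}
```

With this, fits_int8(127) holds while fits_int8(128) and fits_int8(256) do not, which matches the range the question expects -Woverflow to enforce.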
I have a short program that causes a segmentation fault on an RPi4 after it has been run several times (e.g. 10 times in a loop).
I am using Raspbian GNU/Linux 10 (buster) and default gcc compiler (sudo apt install build-essential)
gcc --version
gcc (Raspbian 8.3.0-6+rpi1) 8.3.0
Do you think this is a gcc compiler problem? Maybe I am missing some special settings for RPi4.
I am using this to build:
gcc threads.c -o threads -l pthread
The output is sometimes (not always) something like this:
...
in thread_dummy, loop: 003
Segmentation fault
The code is here:
#include <stdio.h> /* for puts() */
#include <unistd.h> /* for sleep() */
#include <stdlib.h> /* for EXIT_SUCCESS */
#include <pthread.h>
#define PTR_SIZE (0xFFFFFF)
#define PTR_CNT (10)
void* thread_dummy(void* param)
{
void* ptr = malloc(PTR_SIZE);
//fprintf(stderr, "thread num: %03i, stack: %08X, heap: %08X - %08X\n", (int)param, (unsigned int)&param, (unsigned int)ptr, (unsigned int)((unsigned char*)ptr + PTR_SIZE));
fprintf(stderr, "in thread_dummy, loop: %03i\n", (int)param);
sleep(1);
free(ptr);
pthread_detach(pthread_self());
return NULL;
}
int main(void)
{
void* ptrs[PTR_CNT];
pthread_t threads[PTR_CNT];
for(int i=0; i<PTR_CNT; ++i)
{
ptrs[i] = malloc(PTR_SIZE);
//fprintf(stderr, "main num: %03i, stack: %08X, heap: %08X - %08X\n", i, (unsigned int)&ptrs, (unsigned int)ptrs[i], (unsigned int)((unsigned char*)ptrs[i] + PTR_SIZE));
fprintf(stderr, "in main, loop: %03i\n", i);
}
fprintf(stderr, "-----------------------------------------------------------\n");
for(int i=0; i<PTR_CNT; ++i)
pthread_create(&threads[i], 0, thread_dummy, (void*)i);
for(int i=0; i<PTR_CNT; ++i)
pthread_join(threads[i], NULL);
for(int i=0; i<PTR_CNT; ++i)
free(ptrs[i]);
return EXIT_SUCCESS;
}
UPDATE:
I also tested it with new gcc, but the problem remains...
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/arm-linux-gnueabihf/11.1.0/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../configure --enable-languages=c,c++,fortran --with-cpu=cortex-a72 --with-fpu=neon-fp-armv8 --with-float=hard --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.1.0 (GCC)
pthread_create is like malloc, and pthread_detach or pthread_join is like free. You are basically doing something like "double free" - you detach a thread and join it at the same time. Either detach or join the thread.
You could remove pthread_join from main, but the more logical fix is to remove pthread_detach(...) from inside the thread; it is useless there anyway, because the thread terminates right after.
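The "join only, never detach" variant can be sketched as follows. This is a reduced illustration, not the asker's full program: the names worker, run_workers and WORKER_CNT are hypothetical, and the malloc/sleep calls from the question are left out.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define WORKER_CNT 10 /* hypothetical count, mirrors PTR_CNT in the question */

/* Worker: does its job and returns. It does NOT call pthread_detach,
   because the creating thread is going to join it. */
static void *worker(void *param)
{
    int idx = (int)(intptr_t)param;
    fprintf(stderr, "in worker, loop: %03i\n", idx);
    return NULL;
}

/* Creates WORKER_CNT threads and joins each one exactly once.
   Returns the number of threads that were joined successfully. */
int run_workers(void)
{
    pthread_t threads[WORKER_CNT];
    int created[WORKER_CNT] = {0};
    int joined = 0;

    for (int i = 0; i < WORKER_CNT; ++i)
        created[i] = (pthread_create(&threads[i], NULL, worker,
                                     (void *)(intptr_t)i) == 0);

    /* Join only the threads that were actually created. */
    for (int i = 0; i < WORKER_CNT; ++i)
        if (created[i] && pthread_join(threads[i], NULL) == 0)
            ++joined;

    return joined;
}
```

Calling run_workers() from main and comparing the result with WORKER_CNT shows whether every thread was released exactly once. Going through intptr_t for the loop index is a choice made here to avoid a size-mismatch warning on 64-bit targets; the original (void*)i cast is kept out of the sketch for that reason.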
I have built a simple program to calculate the exp of a value, but I get an error:
#include <stdint.h>
#include "util.h"
#include <math.h>
#include <stdio.h>
int main() {
double value = -150;
Start_Timer();
for(int i=0; i<500 ;i++){
result = exp(value);
value++;
}
Stop_Timer();
User_Time=End_Time-Begin_Time;
printf("User_Time: %ld - %ld = %ld - \n", End_Time,Begin_Time,User_Time);
printf("The Exponential of %f is %e\n", value, result); /* value is a double, so %ld was wrong; result is assumed to be a double as well */
return 0;
}
Any idea how to use exp in a benchmark for testing?
I have figured out that the exp function needs -x and -lm when compiling (see "C Failing to compile: Can't find math.h functions"). How can I use them in the test?
I tried to edit the makefile in riscv-test/benchmark, but it is a little bit tricky for me.
Error message: https://github.com/riscv/riscv-tests/issues/142
I am trying to use lapack functions from C.
Here is some test code, copied from this question
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "clapack.h"
#include "cblas.h"
void invertMatrix(float *a, unsigned int height){
int info, ipiv[height];
info = clapack_sgetrf(CblasColMajor, height, height, a, height, ipiv);
info = clapack_sgetri(CblasColMajor, height, a, height, ipiv);
}
void displayMatrix(float *a, unsigned int height, unsigned int width)
{
int i, j;
for(i = 0; i < height; i++){
for(j = 0; j < width; j++)
{
printf("%1.3f ", a[height*j + i]);
}
printf("\n");
}
printf("\n");
}
int main(int argc, char *argv[])
{
int i;
float a[9], b[9], c[9];
srand(time(NULL));
for(i = 0; i < 9; i++)
{
a[i] = 1.0f*rand()/RAND_MAX;
b[i] = a[i];
}
displayMatrix(a, 3, 3);
return 0;
}
I compile this with gcc:
gcc -o test test.c \
-lblas -llapack -lf2c
n.b.: I've tried those libraries in various orders, and I've also tried other libs like latlas, lcblas, lgfortran, etc.
The error message is:
/tmp//cc8JMnRT.o: In function `invertMatrix':
test.c:(.text+0x94): undefined reference to `clapack_sgetrf'
test.c:(.text+0xb4): undefined reference to `clapack_sgetri'
collect2: error: ld returned 1 exit status
clapack.h is found and included (installed as part of atlas). clapack.h includes the offending functions --- so how can they not be found?
The symbols are actually in the library libalapack (found using strings). However, adding -lalapack to the gcc command seems to require adding -lcblas (lots of undefined cblas_* references). Installing cblas automatically uninstalls atlas, which removes clapack.h.
So, this feels like some kind of dependency hell.
I am on FreeBSD 10 amd64, all the relevant libraries seem to be installed and on the right paths.
Any help much appreciated.
Thanks
Ivan
I uninstalled everything remotely relevant --- blas, cblas, lapack, atlas, etc. --- then reinstalled atlas (from ports) alone, and then the lapack and blas packages.
This time around, /usr/local/lib contained a new lib file: libcblas.so --- previous random installations must have deleted it.
The gcc line that compiles is now:
gcc -o test test.c \
-llapack -lblas -lalapack -lcblas
Changing the order of the -l arguments doesn't seem to make any difference.
I am having trouble calling a C function from ARM assembly; the reverse works fine. The architecture is Cortex-M3 and the board is an Arduino Due. The compiler is gcc.
Here's the assembly code:
.syntax unified
.section .text
.thumb_func
.cpu cortex-m3
.extern my_c_add
.global call_my_c_add
call_my_c_add: # r0 - x, r1 - y
bl my_c_add
bx lr # return
And here's the c code:
#include <Arduino.h>
#include <SPI.h>
#include <Ethernet.h>
extern "C" unsigned int call_my_c_add (unsigned int, unsigned int);
unsigned int my_c_add(unsigned int, unsigned int);
unsigned int x=20;
unsigned int y = 15;
void setup()
{
Serial.begin(115200);
Serial.println("exiting setup");
}
void loop()
{
unsigned int z = 0;
z = call_my_c_add (x, y);
Serial.print("c calling asm calling c, addition is - ");
Serial.println(z);
}
unsigned int my_c_add(unsigned int x, unsigned int y)
{
return (x+y);
}
The error I get is -
small_sample.S.o: In function `call_my_c_add':
small_sample.S:12: undefined reference to `my_c_add'
collect2: ld returned 1 exit status
Here's the command I use for linking -
arm-none-eabi-g++ -O3 -Wl,--gc-sections -mcpu=cortex-m3 -T flash.ld -Wl,-Map,mapfile -o elffile -L somefile -lm -lgcc -mthumb -Wl,--cref -Wl,--check-sections -Wl,--gc-sections -Wl,--entry=Reset_Handler -Wl,--unresolved-symbols=report-all -Wl,--warn-common -Wl,--warn-section-align -Wl,--start-group some.c.o some2.cpp.o assembly.S.o somelib.a -Wl,--end-group
The g++ compiler mangles function names. You probably need to add extern "C" to the declaration of my_c_add as well, to disable mangling for that function.
Try to run arm-none-eabi-nm on the two object files, and check that the name of the symbol defined in the object compiled from C/C++ is the same as the symbol in the object compiled from assembly.
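A sketch of what such a declaration can look like. The __cplusplus guard is the usual idiom for code that may be compiled either as C or as C++; where exactly it goes in the sketch file is an assumption. With C linkage, the symbol is emitted as plain my_c_add, which is the name that bl my_c_add in the assembly refers to.

```c
/* When compiled as C++, give the declaration C linkage so the symbol
   is emitted as plain `my_c_add` instead of a mangled name. */
#ifdef __cplusplus
extern "C" {
#endif

unsigned int my_c_add(unsigned int x, unsigned int y);

#ifdef __cplusplus
}
#endif

/* The definition inherits the linkage of the earlier declaration. */
unsigned int my_c_add(unsigned int x, unsigned int y)
{
    return x + y;
}
```

After rebuilding, arm-none-eabi-nm on the object file should list my_c_add under exactly that name.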
I am currently experimenting with the GCC vector extensions. However, I am wondering how to go about getting sqrt(vec) to work as expected.
As in:
typedef double v4d __attribute__ ((vector_size (16)));
v4d myfunc(v4d in)
{
return some_sqrt(in);
}
and at least on a recent x86 system have it emit a call to the relevant intrinsic sqrtpd. Is there a GCC builtin for sqrt that works on vector types or does one need to drop down to the intrinsic level to accomplish this?
Looks like it's a bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54408 I don't know of any workaround other than doing it component-wise. The vector extensions were never meant to replace platform-specific intrinsics anyway.
Some funky code to this effect:
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <utility>
template <::std::size_t...> struct indices { };
template <::std::size_t M, ::std::size_t... Is>
struct make_indices : make_indices<M - 1, M - 1, Is...> {};
template <::std::size_t... Is>
struct make_indices<0, Is...> : indices<Is...> {};
typedef float vec_type __attribute__ ((vector_size(4 * sizeof(float))));
template <::std::size_t ...Is>
vec_type sqrt_(vec_type const& v, indices<Is...> const)
{
vec_type r;
::std::initializer_list<int>{(r[Is] = ::std::sqrt(v[Is]), 0)...};
return r;
}
vec_type sqrt(vec_type const& v)
{
return sqrt_(v, make_indices<4>());
}
int main()
{
vec_type v = {1.0f, 4.0f, 9.0f, 16.0f}; // initialized so the reads in sqrt_ are well defined
return sqrt(v)[0];
}
You could also try your luck with auto-vectorization, which is separate from the vector extension.
You can loop over the vectors directly
#include <math.h>
typedef double v2d __attribute__ ((vector_size (16)));
v2d myfunc(v2d in) {
v2d out;
for(int i=0; i<2; i++) out[i] = sqrt(in[i]);
return out;
}
The sqrt function has to handle special cases such as signed zero and NaN, but if you rule these out with -Ofast, both Clang and GCC produce simply sqrtpd.
https://godbolt.org/g/aCuovX
GCC might have a bug, because I had to loop to 4, even though there are only 2 elements, to get optimal code.
With AVX and AVX512, however, both GCC and Clang are ideal:
AVX
https://godbolt.org/g/qdTxyp
AVX512
https://godbolt.org/g/MJP1n7
My reading of the question is that you want the square root of 4 packed double precision values... that's 32 bytes. Use the appropriate AVX intrinsic:
#include <x86intrin.h>
typedef double v4d __attribute__ ((vector_size (32)));
v4d myfunc (v4d v) {
return _mm256_sqrt_pd(v);
}
x86-64 gcc 10.2 and x86-64 clang 10.0.1
using -O3 -march=skylake :
myfunc:
vsqrtpd %ymm0, %ymm0 # (or just `ymm0` for Intel syntax)
ret
ymm0 is the return value register.
That said, it just so happens there is a builtin: __builtin_ia32_sqrtpd256, which doesn't require the intrinsics header. I would definitely discourage its use however.