I'm trying to compile LightZPng at warning level 4. I get a lot of C4127 ("conditional expression is constant") warnings on lines that clearly don't deserve it. An example:
#define MAX_BITS 15
int values_per_bitlen[ MAX_BITS + 1 ];
for ( int i = 0; i <= MAX_BITS; ++i ) // C4127 is here
    values_per_bitlen[ i ] = 0;
How can this code be changed to avoid the warning other than #pragma?
There's a piece of code at the top of LightZ.cpp that goes like this:
#define for if (false) {} else for
That means your actual statement is:
if (false) {} else for ( int i = 0; i <= MAX_BITS; ++i )
which is why you're getting the constant-expression warning (it's the false, not the i <= MAX_BITS as I thought).
Simply comment out or delete that line from the file (I can't actually figure out why they would do that).
Yes, that is odd. It's truly not a constant expression, since i changes in the loop, so this would appear to be a problem with VS2005. For what it's worth, VS2008 does exactly the same thing.
Strangely enough, a project with just this in it does not complain, so it may well be some weird edge-case problem with Microsoft's warning-generation code:
#define MAX_BITS 15
int values_per_bitlen[ MAX_BITS + 1 ];
int main(int argc, char* argv[]) {
    for ( int i = 0; i <= MAX_BITS; ++i )
        values_per_bitlen[ i ] = 0;
    return 0;
}
However, you haven't actually asked a question. What is it that you want to know, or want us to do?
Update:
See "Windows programmer"'s answer for the actual cause - there's a "#define for if (false) {} else for" at the top of LightZ.cpp which is causing the problem.
I tested it on my VS2005 and the warning does not appear, even at warning level 4.
A simple procedure for you to follow:
- Create a new Console App, place only the above code in it, and see if the warning shows up again.
- If not, check for differences in the project settings.
- If it does, I would assume that an optimization setting may be causing it.
According to Charles Nicholson, Visual Studio 2005 gives this warning with the "do...while(0)" trick:
#define MULTI_LINE_MACRO   \
    do {                   \
        doSomething();     \
        doSomethingElse(); \
    } while(0)
If you absolutely must, you can use the __pragma directive to selectively disable that warning around a particular code fragment.
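For example, a minimal sketch of that approach (this relies on MSVC's __pragma form, which, unlike #pragma, can appear inside a macro definition; I haven't verified this exact expansion against VS2005):

#define MULTI_LINE_MACRO                \
    __pragma(warning(push))             \
    __pragma(warning(disable:4127))     \
    do {                                \
        doSomething();                  \
        doSomethingElse();              \
    } while(0)                          \
    __pragma(warning(pop))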
Related
I'm new to kernel development, and I need to write a Linux kernel module that performs several matrix multiplications (I'm working on an x86_64 platform). I'm trying to use fixed-point values for these operations; however, during compilation, the compiler reports this error:
error: SSE register return with SSE disabled
I don't know much about SSE or this issue in particular, but from what I've found, and according to most answers to questions about this problem, it is related to the use of floating-point (FP) arithmetic in kernel space, which is rarely a good idea (hence the use of fixed-point arithmetic). This error seems weird to me because I'm pretty sure I'm not using any FP values or operations, yet it keeps popping up, and in ways that seem strange to me. For instance, I have this block of code:
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
const int scale = 16;
#define DOUBLE_TO_FIXED(x) ((x) * (1 << scale))
#define FIXED_TO_DOUBLE(x) ((x) / (1 << scale))
#define MULT(x, y) ((((x) >> 8) * ((y) >> 8)) >> 0)
#define DIV(x, y) (((x) << 8) / (y) << 8)
#define OUTPUT_ROWS 6
#define OUTPUT_COLUMNS 2
struct matrix {
    int rows;
    int cols;
    double *data;
};
double outputlayer_weights[OUTPUT_ROWS * OUTPUT_COLUMNS] =
{
    0.7977986, -0.77172316,
    -0.43078753, 0.67738613,
    -1.04312621, 1.0552227,
    -0.32619684, 0.14119884,
    -0.72325027, 0.64673559,
    0.58467862, -0.06229197
};
...
void matmul(struct matrix *A, struct matrix *B, struct matrix *C)
{
    int i, j, k, a, b, sum, fixed_prod;

    if (A->cols != B->rows) {
        return;
    }
    for (i = 0; i < A->rows; i++) {
        for (j = 0; j < B->cols; j++) {
            sum = 0;
            for (k = 0; k < A->cols; k++) {
                a = DOUBLE_TO_FIXED(A->data[i * A->rows + k]);
                b = DOUBLE_TO_FIXED(B->data[k * B->rows + j]);
                fixed_prod = MULT(a, b);
                sum += fixed_prod;
            }
            /* Commented the following line, causes error */
            //C->data[i * C->rows + j] = sum;
        }
    }
}
...
static int __init insert_matmul_init(void)
{
    printk(KERN_INFO "INSERTING MATMUL");
    return 0;
}

static void __exit insert_matmul_exit(void)
{
    printk(KERN_INFO "REMOVING MATMUL");
}

module_init(insert_matmul_init);
module_exit(insert_matmul_exit);
which compiles with no errors (I left out code that I found irrelevant to the problem). I have commented out any error-prone lines so that the module compiles cleanly, and I am trying to solve the errors one by one. However, when I uncomment this line:
C->data[i * C->rows + j] = sum;
I get this error message on a previous (unmodified) line of code:
error: SSE register return with SSE disabled
sum += fixed_prod;
~~~~^~~~~~~~~~~~~
From what I understand, there are no FP operations taking place, at least in this section, so I need help figuring out what might be causing this error. Maybe my fixed-point implementation is flawed (I'm no expert in that matter either), or maybe I'm missing something obvious. Just in case, I have tested the same logic in a user-space program (using floating-point values) and it seems to work fine. In any case, any help in solving this issue would be appreciated. Thanks in advance!
Edit: I have included the definition of matrix and an example matrix above. I have been using the default kbuild command for building external modules; here is what my Makefile looks like:
obj-m = matrix_mult.o
KVERSION = $(shell uname -r)

all:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
Linux compiles kernel code with -mgeneral-regs-only on x86, which produces this error in functions that do anything with FP or SIMD. (Except via inline asm, because then the compiler doesn't see the FP instructions, only the assembler does.)
From what I understand, there are no FP operations taking place, at least in this section, so I need help figuring out what might be causing this error.
GCC optimizes whole functions when optimization is enabled, and you are using FP inside that function: the DOUBLE_TO_FIXED macro does an FP multiply and a truncating conversion to integer, and you assign the result to an int, since the MCVE you eventually provided shows struct matrix containing double *data.
If you stop the compiler from using FP instructions (like Linux does by building with -mgeneral-regs-only), it refuses to compile your file instead of doing software floating-point.
The only odd thing is that it pins the error on an integer += instead of on one of the statements that compile to a mulsd and cvttsd2si.
If you disable optimization (-O0 -mgeneral-regs-only) you get a more obvious location for the same error (https://godbolt.org/z/Tv5nG6nd4):
<source>: In function 'void matmul(matrix*, matrix*, matrix*)':
<source>:9:33: error: SSE register return with SSE disabled
9 | #define DOUBLE_TO_FIXED(x) ((x) * (1 << scale))
| ~~~~~^~~~~~~~~~~~~~~
<source>:46:21: note: in expansion of macro 'DOUBLE_TO_FIXED'
46 | a = DOUBLE_TO_FIXED(A->data[i * A->rows + k]);
| ^~~~~~~~~~~~~~~
If you really want to know what's going on with the GCC internals, you could dig into it with -fdump-tree-... options, e.g. on the Godbolt compiler explorer there's a dropdown for GCC Tree / RTL output that would let you look at the GIMPLE or RTL internal representation of your function's logic after various analyzer passes.
But if you just want to know whether there's a way to make this function work: no, obviously not, unless you compile the file without -mgeneral-regs-only. All functions in a file compiled that way must only be called by callers that have used kernel_fpu_begin() before the call (and kernel_fpu_end() after).
You can't safely use kernel_fpu_begin() inside a function compiled to allow it to use SSE / x87 registers; after optimization, it might already have corrupted user-space FPU state before that call happens. The symptom of getting this wrong is not a fault but silently corrupted user-space state, so don't assume "happens to work" means correct. Also, depending on how GCC optimizes, the code-gen might be fine with your version but broken with an earlier or later GCC or clang. I somewhat expect that a kernel_fpu_begin() at the top of this function would end up being called before the compiler did anything with FP instructions, but that doesn't mean it would be safe and correct.
See also Generate and optimize FP / SIMD code in the Linux Kernel on files which contains kernel_fpu_begin()?
Apparently -msse2 overrides -mgeneral-regs-only, so -mgeneral-regs-only is probably just an alias for -mno-mmx -mno-sse plus whatever option disables x87. So you might be able to use __attribute__((target("sse2"))) on a function without changing the build options for it, but that would be x86-specific. Of course, so is -mgeneral-regs-only. And there isn't a -mno-general-regs-only option to override the kernel's normal CFLAGS.
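A hypothetical sketch of what that might look like (the function name is made up, and the attribute-based opt-in is an untested assumption, not something verified against the kernel's build):

/* x86-specific: ask GCC to allow SSE2 code-gen for just this function.
 * Callers would still need kernel_fpu_begin()/kernel_fpu_end() around
 * every call. */
__attribute__((target("sse2")))
static void matmul_fp(struct matrix *A, struct matrix *B, struct matrix *C)
{
    /* ... FP loop would go here ... */
}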
I don't have a specific suggestion for the best way to set up a build option, if you really do think it's worth using kernel_fpu_begin() at all here (rather than using fixed-point the whole way through).
Obviously if you do save/restore the FPU state, you might as well use it for the loop instead of using FP to convert to fixed-point and back.
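To illustrate that last point, here's a minimal sketch (mine, not from the question) of going fixed-point the whole way: pre-scale the weights to Q16.16 integers offline, so the struct holds int *data and the module never touches a double. The values are round(weight * 65536), computed by hand, so treat them as approximate:

/* Q16.16 fixed-point weights, pre-converted offline from the doubles
 * in the question; no FP arithmetic happens in the kernel. */
#define FIXED_SHIFT 16

struct fixed_matrix {
    int rows;
    int cols;
    int *data;   /* Q16.16 entries instead of double */
};

static int outputlayer_weights_fixed[OUTPUT_ROWS * OUTPUT_COLUMNS] = {
     52285, -50576,
    -28232,  44393,
    -68362,  69155,
    -21378,   9254,
    -47399,  42384,
     38318,  -4082,
};

With that, matmul can drop the DOUBLE_TO_FIXED calls entirely and work on the integer data directly.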
I have been debugging a strange issue for the past few hours that only occurred in a release build (-O3) but not in a debug build (-g and no optimizations). Finally, I could pin it down to the "count trailing zeroes" builtin giving me wrong results, and now I wonder whether I've just found a GCC bug or whether I'm missing something.
The short story is that apparently, GCC evaluates __builtin_ctz wrongly with -O2 and -O3 in some situations, but does fine with no optimizations or -O1. The same applies to the long variants __builtin_ctzl and __builtin_ctzll.
My initial assumption is that __builtin_ctz(0) should resolve to 32, because it is the unsigned int (32-bit) version of the builtin and thus there are 32 trailing zero bits. I have not found anything stating that these builtins are undefined for the input being zero, and practical work with them has me convinced that they are not.
Let's have a look at the code I'd like to talk about now:
#include <iostream>

bool test_basic;
bool test_ctz;
bool test_result;

int ctz(const unsigned int x) {
    const int q = __builtin_ctz(x);
    test_ctz = (q == 32);
    return q;
}

int main(int argc, char** argv) {
    {
        const int q = __builtin_ctz(0U);
        test_basic = (q == 32);
    }
    {
        const int q = ctz(0U);
        test_result = (q == 32);
    }
    std::cout << "test_basic=" << test_basic << std::endl;
    std::cout << "test_ctz=" << test_ctz << std::endl;
    std::cout << "test_result=" << test_result << std::endl;
}
The code basically does three tests, storing the results in those boolean values:
test_basic is true if __builtin_ctz(0U) resolves to 32.
test_ctz is true if __builtin_ctz(x) equals 32 within the function ctz.
test_result is true if the result of ctz(0) equals 32.
Because I call ctz once in my main function and pass zero to it, I expect all three bools to be true by the end of the program. This actually is the case if I compile it without any optimizations or -O1. However, when I compile it with -O2, test_ctz becomes false. I consulted the Compiler Explorer to find out what the hell is going on. (Note that I am using g++ 7.5 myself, but I could reproduce this with any later version as well. In the Compiler Explorer, I picked the latest it has to offer, which is 10.2.)
Let's have a look at the code compiled with -O1 first. I see that test_ctz is simply set to 1. I guess that's because these builtins are treated as constexpr and the whole rather simple function ctz is evaluated at compile-time. The result is correct (under my initial assumption) and so I'm fine with that.
So what could possibly go wrong from here? Well, let's look at the code compiled with -O2. Nothing much has changed, just that test_ctz is now set to 0! And that's that, beyond any logic: the compiler apparently evaluates q == 32 to being false, but then q is returned from the function and we compare that against 32, and suddenly it's true (test_result). I have no explanation for this. Am I missing something? Have I found some demonical GCC bug?
It gets even funnier if you printf the value of q just before test_ctz is set: the console then prints 32, so the computation actually works as expected - at runtime. Yet at compile-time, the compiler thinks q is not 32 and test_ctz is forced to false. Indeed, if I change the declaration of q from const int to volatile int and thus force the computation at runtime, everything works as expected, so luckily there's a simple workaround.
To conclude, I'd like to note that I also use the "count leading zeroes" builtins (__builtin_clz and long versions) and I could not observe the same problem there; they work just fine.
I have not found anything stating that these builtins are undefined for the input being zero
How could you have missed it??? From the GCC online docs, Other Builtins:
Built-in Function: int __builtin_ctz (unsigned int x)
Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.
So what could possibly go wrong from here?
Code behaving differently at different optimization levels is, in 99% of cases, a clear indication of undefined behavior in your code. In this case the compiler's optimizations simply make different decisions than the hardware instruction does. If the compiler generates BSF on the x86 architecture, the result is still undefined; from the Intel docs: "If the content of the source operand is 0, the content of the destination operand is undefined." Oh, and there's also TZCNT, in which case TZCNT will produce the operand size when the input operand is zero, which maybe better explains the behavior of your code.
Am I missing something?
Yes. You are missing that __builtin_ctz(0) is undefined.
Have I found some demonical GCC bug?
No.
I'd like to note that I also use the "count leading zeroes" builtins (__builtin_clz and long versions) and I could not observe the same problem there; they work just fine
As can be seen in the GCC docs, __builtin_clz(0) is also undefined behavior.
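If you need a well-defined result of 32 for a zero input, guard the zero case yourself. A minimal sketch (the wrapper name is mine):

#include <limits.h>

/* __builtin_ctz(0) is undefined, so handle zero explicitly. */
int ctz_total(unsigned int x)
{
    return x == 0 ? (int)(sizeof(x) * CHAR_BIT) : __builtin_ctz(x);
}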
I'm aware of this SO question and this SO question. The element of novelty in this one is its focus on Xcode, and its use of square brackets to dereference a pointer to void.
The following program compiles with no warning in Xcode 4.5.2 and compiles with a warning on GCC 4.2. Even though I don't have Visual Studio right now, I remember that it would consider this a compiler error, and MSDN and the Internet agree.
#include <stdio.h>
int main(int argc, const char * argv[])
{
    int x = 24;
    void *xPtr = &x;
    int *xPtr2 = (int *)&xPtr[1];
    printf("%p %p\n", xPtr, xPtr2);
}
If I change the third line of the body of main to:
int *xPtr2 = (int *)(xPtr + 1);
It compiles with no warnings on both GCC and Xcode.
I would like to know how I can turn this silence into warnings or errors, on GCC and especially Xcode/LLVM, including for the fact that main is declared int but does not explicitly return any value. (By the way, I think -Wall does the trick on GCC.)
That isn't wrong at all...
The compiler doesn't know how big the pointed-to block is: a void[] is effectively a void*.
That's why char* used as strings need to be \0-terminated.
You cannot turn on a warning for that, as it isn't possible to determine the size of the memory pointed to by a pointer at compile time:
void *v = NULL;
((int *)v)[1] = 0;  // invalid: v doesn't point to any storage
void *v2 = malloc(sizeof(int) * 2);
((int *)v2)[1] = 0; // valid: v2 points to enough memory for two ints
(Note: typed inline on SO -- sorry for any non-working code.)
I am using the gcc compiler. I am working on code that frequently involves writing chunks of statements inside a single #define directive. For example:
#include <stdio.h>

#define DO_RR(x) do {       \
    for (i = 0; i < x; i++) \
        printf("%d", i);    \
} while(0);

int main() {
    int i = 0;
    DO_RR(5)
    return 0;
}
Now I want to be able to single-step through the statements in DO_RR. However, when I try, control jumps directly from the DO_RR statement in main to the next statement and does not single-step. Is there any way to step inside these preprocessor blocks?
You cannot: #defines are expanded by the preprocessor and are not present as separate lines in the compiled code.
To supplement @Angelom's answer: you can work around this by using functions. Move whatever code you can from the #define into a function, and you will be able to step through the function call.
Ideally, and most often, you can replace the entire #define with an inline function.
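A minimal sketch of that workaround applied to the DO_RR macro from the question (the function name is mine):

#include <stdio.h>

/* Same body as DO_RR, but now there are real source lines for the
 * debugger: 'step' in gdb will enter the call and stop on each line. */
static void do_rr(int x)
{
    int i;
    for (i = 0; i < x; i++)
        printf("%d", i);
}

int main(void)
{
    do_rr(5);
    return 0;
}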
I am attempting to use OpenMP to create a parallel for loop in Visual Studio 2005 Professional. I have included omp.h and specified the /openmp compiler flag. However, I cannot get even the simplest parallel for loop to compile.
#pragma omp parallel for
for ( int i = 0; i < 10; ++i )
{
    int a = i + i;
}
The above produces Compiler Error C3005 on the #pragma line.
Google hasn't been much help: I only found one obscure Japanese website with a user having similar issues, and no mention of a resolution.
A standard parallel block compiles fine.
#pragma omp parallel
{
    // Do some stuff
}
That is, until you try to add a for loop:
#pragma omp parallel
{
    #pragma omp for
    for ( int i = 0; i < 10; ++i )
    {
        int a = i + i;
    }
}
The above causes Compiler Error C3001. It seems 'for' confuses the compiler, but it shouldn't. Any ideas?
I found the problem. Some genius defined the following macro deep within the headers:
#define for if ( false ) ; else for
My only guess is that this was used to get variables declared in for loops to scope properly in Visual C++ 6. Undefining or commenting out the macro resolved the issue.
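If you can't simply delete the line from the header, a minimal sketch of undoing it at the top of your own source file (before any use of for):

/* Undo the for-scoping hack so '#pragma omp parallel for' sees the
 * real keyword again. */
#ifdef for
#undef for
#endif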