I am attempting to use OpenMP to create a parallel for loop in Visual Studio 2005 Professional. I have included omp.h and specified the /openmp compiler flag. However, I cannot get even the simplest parallel for loop to compile.
#pragma omp parallel for
for ( int i = 0; i < 10; ++i )
{
int a = i + i;
}
The above produces Compiler Error C3005 at the #pragma line.
Google hasn't been much help. I only found one obscure Japanese website with a user having similar issues. No mention of a resolution.
A standard parallel block compiles fine.
#pragma omp parallel
{
// Do some stuff
}
That is, until you try to add a for loop.
#pragma omp parallel
{
#pragma omp for
for ( int i = 0; i < 10; ++i )
{
int a = i + i;
}
}
The above causes Compiler Error C3001. It seems 'for' is confusing to the compiler, but it shouldn't be. Any ideas?
I found the problem. Some genius defined the following macro deep within the headers:
#define for if ( false ) ; else for
My only guess is that this was used to get variables declared in for loops to scope properly in Visual C++ 6. Undefining or commenting out the macro resolved the issue.
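For anyone curious, here is a rough sketch of the VC++ 6 behaviour that macro papers over (a hypothetical snippet, not taken from the actual headers):

// Old (pre-standard) VC++ 6 scoping: the loop variable leaked out of the loop.
for ( int i = 0; i < 10; ++i ) { /* ... */ }
// Under VC6, 'i' was still in scope here, so declaring another 'int i'
// below would trigger a redefinition error.

// The macro rewrites every 'for' into this form:
if ( false ) ; else for ( int i = 0; i < 10; ++i ) { /* ... */ }
// The loop becomes the else-branch of a dead if, which (on VC6) keeps 'i'
// from leaking past the loop, at the cost of hijacking the 'for' keyword.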
I'm new to kernel development, and I need to write a Linux kernel module that performs several matrix multiplications (I'm working on an x86_64 platform). I'm trying to use fixed-point values for these operations, however during compilation, the compiler encounters this error:
error: SSE register return with SSE disabled
I don't know much about SSE or this issue in particular, but from what I've found, and according to most answers to questions about this problem, it is related to the use of floating-point (FP) arithmetic in kernel space, which is rarely a good idea (hence the use of fixed-point arithmetic). This error seems weird to me because I'm pretty sure I'm not using any FP values or operations, yet it keeps popping up in ways that seem strange to me. For instance, I have this block of code:
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
const int scale = 16;
#define DOUBLE_TO_FIXED(x) ((x) * (1 << scale))
#define FIXED_TO_DOUBLE(x) ((x) / (1 << scale))
#define MULT(x, y) ((((x) >> 8) * ((y) >> 8)) >> 0)
#define DIV(x, y) (((x) << 8) / (y) << 8)
#define OUTPUT_ROWS 6
#define OUTPUT_COLUMNS 2
struct matrix {
    int rows;
    int cols;
    double *data;
};
double outputlayer_weights[OUTPUT_ROWS * OUTPUT_COLUMNS] =
{
0.7977986, -0.77172316,
-0.43078753, 0.67738613,
-1.04312621, 1.0552227 ,
-0.32619684, 0.14119884,
-0.72325027, 0.64673559,
0.58467862, -0.06229197
};
...
void matmul (struct matrix *A, struct matrix *B, struct matrix *C) {
    int i, j, k, a, b, sum, fixed_prod;

    if (A->cols != B->rows) {
        return;
    }

    for (i = 0; i < A->rows; i++) {
        for (j = 0; j < B->cols; j++) {
            sum = 0;
            for (k = 0; k < A->cols; k++) {
                a = DOUBLE_TO_FIXED(A->data[i * A->rows + k]);
                b = DOUBLE_TO_FIXED(B->data[k * B->rows + j]);
                fixed_prod = MULT(a, b);
                sum += fixed_prod;
            }
            /* Commented the following line, causes error */
            //C->data[i * C->rows + j] = sum;
        }
    }
}
...
static int __init insert_matmul_init (void)
{
    printk(KERN_INFO "INSERTING MATMUL");
    return 0;
}

static void __exit insert_matmul_exit (void)
{
    printk(KERN_INFO "REMOVING MATMUL");
}

module_init (insert_matmul_init);
module_exit (insert_matmul_exit);
which compiles with no errors (I left out code that I found irrelevant to the problem). I have commented out any error-prone lines so that the module compiles cleanly, and I am trying to solve the errors one by one. However, when I uncomment this line:
C->data[i * C->rows + j] = sum;
I get this error message in a previous (unmodified) line of code:
error: SSE register return with SSE disabled
sum += fixed_prod;
~~~~^~~~~~~~~~~~~
From what I understand, there are no FP operations taking place, at least in this section, so I need help figuring out what might be causing this error. Maybe my fixed-point implementation is flawed (I'm no expert in that matter either), or maybe I'm missing something obvious. Just in case, I have tested the same logic in a user-space program (using floating-point values) and it works fine. In any case, any help in solving this issue would be appreciated. Thanks in advance!
Edit: I have included the definition of matrix and an example matrix. I have been using the default kbuild command for building external modules; here is what my Makefile looks like:
obj-m = matrix_mult.o
KVERSION = $(shell uname -r)
all:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
Linux compiles kernel code with -mgeneral-regs-only on x86, which produces this error in functions that do anything with FP or SIMD. (Except via inline asm, because then the compiler doesn't see the FP instructions, only the assembler does.)
From what I understand, there are no FP operations taking place, at least in this section, so I need help figuring out what might be causing this error.
GCC optimizes whole functions when optimization is enabled, and you are using FP inside that function. You're doing an FP multiply and a truncating conversion to integer with your DOUBLE_TO_FIXED macro and assigning the result to an int, since the MCVE you eventually provided shows struct matrix containing double *data.
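Concretely, with double *data the macro expands to an FP expression (the annotation is mine):

a = DOUBLE_TO_FIXED(A->data[i * A->rows + k]);
/* after preprocessing: */
a = ((A->data[i * A->rows + k]) * (1 << scale));  /* double multiply (mulsd), then an
                                                     implicit truncating conversion
                                                     to int (cvttsd2si) */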
If you stop the compiler from using FP instructions (like Linux does by building with -mgeneral-regs-only), it refuses to compile your file instead of doing software floating-point.
The only odd thing is that it pins the error on an integer += instead of one of the statements that compile to a mulsd and cvttsd2si.
If you disable optimization (-O0 -mgeneral-regs-only) you get a more obvious location for the same error (https://godbolt.org/z/Tv5nG6nd4):
<source>: In function 'void matmul(matrix*, matrix*, matrix*)':
<source>:9:33: error: SSE register return with SSE disabled
9 | #define DOUBLE_TO_FIXED(x) ((x) * (1 << scale))
| ~~~~~^~~~~~~~~~~~~~~
<source>:46:21: note: in expansion of macro 'DOUBLE_TO_FIXED'
46 | a = DOUBLE_TO_FIXED(A->data[i * A->rows + k]);
| ^~~~~~~~~~~~~~~
If you really want to know what's going on with the GCC internals, you could dig into it with -fdump-tree-... options, e.g. on the Godbolt compiler explorer there's a dropdown for GCC Tree / RTL output that would let you look at the GIMPLE or RTL internal representation of your function's logic after various analyzer passes.
But if you just want to know whether there's a way to make this function work: no, obviously not, unless you compile a file without -mgeneral-regs-only. All functions in a file compiled that way must only be called by callers that have used kernel_fpu_begin() before the call (and kernel_fpu_end() after).
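As a rough sketch of that caller-side pattern (assuming do_fp_work() lives in a separate file that is built without -mgeneral-regs-only):

#include <asm/fpu/api.h>   /* kernel_fpu_begin() / kernel_fpu_end() */

/* Defined in a separate translation unit compiled with SSE allowed. */
extern void do_fp_work(void);

static void caller(void)
{
    kernel_fpu_begin();   /* save the current FPU/SIMD state, disable preemption */
    do_fp_work();         /* FP/SSE code is only safe between begin and end */
    kernel_fpu_end();
}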
You can't safely use kernel_fpu_begin inside a function compiled to allow it to use SSE / x87 registers; after optimization, the function might already have clobbered user-space FPU state before the call to kernel_fpu_begin() happens. The symptom of getting this wrong is not a fault; it's corrupting user-space state, so don't assume that "happens to work" means correct. Also, depending on how GCC optimizes, the code-gen might be fine with your version but broken with earlier or later GCC or clang versions. I somewhat expect that kernel_fpu_begin() at the top of this function would get called before the compiler did anything with FP instructions, but that doesn't mean it would be safe and correct.
See also Generate and optimize FP / SIMD code in the Linux Kernel on files which contains kernel_fpu_begin()?
Apparently -msse2 overrides -mgeneral-regs-only, so that's probably just an alias for -mno-mmx -mno-sse and whatever option disables x87. So you might be able to use __attribute__((target("sse2"))) on a function without changing build options for it, but that would be x86-specific. Of course, so is -mgeneral-regs-only. And there isn't a -mno-general-regs-only option to override the kernel's normal CFLAGS.
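If you went that route, the per-function override would look something like this (an untested sketch; the kernel_fpu_begin()/kernel_fpu_end() caveats above still apply):

/* GCC/x86-specific: allow SSE2 code generation for just this function,
 * overriding -mgeneral-regs-only for the rest of the file. */
static void __attribute__((target("sse2"))) fp_heavy_helper(void)
{
    /* FP/SSE code would go here; every caller still has to wrap the call
     * in kernel_fpu_begin()/kernel_fpu_end(). */
}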
I don't have a specific suggestion for the best way to set up a build option if you really do think it's worth using kernel_fpu_begin at all here (rather than using fixed-point the whole way through).
Obviously if you do save/restore the FPU state, you might as well use it for the loop instead of using FP to convert to fixed-point and back.
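For illustration, here is a minimal sketch of what keeping everything in fixed point could look like (an s16.16 layout; the type, field, and helper names are mine, not from your module):

#include <linux/types.h>

#define FIX_SHIFT 16
/* Intended for compile-time constants (e.g. static initializers); used that
 * way, the FP math is done by the compiler, not at runtime. */
#define FIX_CONST(x) ((s32)((x) * (1 << FIX_SHIFT)))

static const s32 outputlayer_weights_fixed[OUTPUT_ROWS * OUTPUT_COLUMNS] = {
    FIX_CONST(0.7977986), FIX_CONST(-0.77172316),
    /* ... */
};

struct fix_matrix {
    int rows;
    int cols;
    s32 *data;            /* already fixed point: no double anywhere */
};

static s32 fix_mul(s32 a, s32 b)
{
    /* widen to 64 bits so the intermediate product cannot overflow */
    return (s32)(((s64)a * b) >> FIX_SHIFT);
}

static void matmul_fixed(const struct fix_matrix *A,
                         const struct fix_matrix *B,
                         struct fix_matrix *C)
{
    int i, j, k;

    if (A->cols != B->rows)
        return;

    for (i = 0; i < A->rows; i++) {
        for (j = 0; j < B->cols; j++) {
            s32 sum = 0;

            for (k = 0; k < A->cols; k++)
                sum += fix_mul(A->data[i * A->cols + k],
                               B->data[k * B->cols + j]);
            C->data[i * C->cols + j] = sum;
        }
    }
}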
Windows 10, LLVM 7, GCC 8.1, Visual Studio 2019.
#include <iostream>
#include <fstream>
using namespace std;

char exe[1000000] = {};
int n = 0;
int filesize;

void read() {
    int pointer = 0;
    cin >> filesize;
    fstream f;
    f.open("s.exe", ios::in | ios::app | ios::binary);
    f.seekp(pointer, ios::beg);
    while (pointer < filesize) {
        f.read((char*)&n, sizeof(char));
        exe[pointer] = n;
        pointer += 1;
    }
    f.close();
}

void showMassive() {
    int pointer = 0;
    while (pointer < filesize) {
        cout << pointer << ":" << (unsigned char)exe[pointer] << endl;
        pointer += 1;
    }
}

void showAssembler() {
}

void write() {
    int pointer = 0;
    fstream f;
    f.open("s1.exe", ios::out | ios::app | ios::binary);
    f.seekp(pointer, ios::beg);
    while (pointer < filesize) {
        n = exe[pointer];
        pointer += 1;
        f.write((char*)&n, sizeof(char));
    }
    f.close();
}

void MachineCodeOptimizer() {
    // some code
    exe[1031] += 1; // just for example
}

int main() {
    read();
    showMassive();
    showAssembler();
    MachineCodeOptimizer();
    write();
    return 0;
}
This is the code. Clang creates an exe of 312 KB at best (with -O1). GCC produces a 66 KB exe regardless. What is happening? Why such a difference between compilers? I looked at the machine code, but I don't understand it. I have now tried Visual Studio 2019: 26 KB! Visual Studio 2019 gives a result close to hand-written assembler (in file size).
Clang and GCC are two completely independent compilers. When you write code in your source language, you only specify what you want the machine to execute, not how it should do that. Compilers are free to choose their own way to get there, as long as they stay within the limits specified by your source language. So it's not surprising that the two resulting executables differ in file size. The instructions chosen by the two compilers might also differ a lot (or completely), since there are, for example, a dozen different ways to represent loops in machine code (including taking advantage of parallel execution on the target processor, or not). You might want to check out Matt Godbolt's talk from 2017 (https://www.youtube.com/watch?v=bSkpMdDe4g4); it gives a short but thorough introduction to what compilers actually do (for you) behind the scenes.
When I try to run my code on an Intel MIC, it gives an error like:
"offload error: cannot offload to MIC - device is not available"
My sample code is:
#include <stdio.h>
#include <omp.h>

int main()
{
    int N = 10;
    int i, a[N];

    #pragma offload target(mic)
    #pragma omp parallel
    #pragma omp for
    for (i = 0; i < N; i++)
    {
        a[i] = i;
        printf("a[%d] :: %d \n", i, a[i]);
        printf(".....................:\n\n");
    }
    return 0;
}
One of two things is happening. Either the card is not booted, which you can check with:
sudo micctrl -s
Or the runtime cannot find dependent libraries. This is most likely due to not sourcing the compiler environment variables:
source /opt/intel/composerxe/bin/compilervars.sh intel64
I believe you have not set up the compiler's environment.
Compiler Environment:
source /opt/intel/composerxe/bin/compilervars.sh intel64
Also include the offload header:
#include "offload.h"
I am using the Boost library to get the current system time. My code works, but Visual Studio 2010 crashes after the program exits: the debugger breaks while trying to free a non-existent pointer. I know this is because of the Boost native code, since there is no error if I comment out the Boost portion of the code.
So far I have tried using #pragma managed as explained on MSDN, but with no success. Can someone give me some suggestions? (I also tried the GetSystemTime function to get the time, but it does not give microsecond precision like Boost.)
My code:
#pragma managed(push, off)
void GetSystemDateTime(SDateTime& stimeblock);
#pragma managed(pop)

int main()
{
    // C++/CLI code
    SDateTime stimestruct[1];

    // Call to the function containing the Boost code
    GetSystemDateTime(stimestruct[0]);
}
Function definition:
#pragma managed(push, off)
void GetSystemDateTime(SDateTime& timeblock)
{
    // SYSTEMTIME time;
    // GetSystemTime(&time);
    // WORD millis = (time.wSecond * 1000) + time.wMilliseconds;
    boost::posix_time::ptime now = boost::posix_time::microsec_clock::local_time();
    std::tm pt_tm = to_tm(now);
    std::cout << now << std::endl;
    //std::cout << time.wYear<< time.wMonth<<time.wDay //<<time.wHour<<time.wMinute<<time.wSecond<<time.wMilliseconds << std::endl;

    // Break the ISO timestamp into its components
    std::string timestring = to_iso_string(now);
    std::string sYear    = timestring.substr(0, 4);
    std::string sMonth   = timestring.substr(4, 2);
    std::string sDay     = timestring.substr(6, 2);
    std::string sHour    = timestring.substr(9, 2);
    std::string sMinute  = timestring.substr(11, 2);
    std::string sSecond  = timestring.substr(13, 2);
    std::string sUSecond = timestring.substr(16);

    std::istringstream isYear(sYear);
    std::istringstream isMonth(sMonth);
    std::istringstream isDay(sDay);
    std::istringstream isHour(sHour);
    std::istringstream isMinute(sMinute);
    std::istringstream isSec(sSecond);
    std::istringstream isUSec(sUSecond);

    // Use each one like an input stream
    int iYear, iMonth, iDay, iHour, iMinute, iSecond, iUSecond;
    isYear   >> iYear;
    isMonth  >> iMonth;
    isDay    >> iDay;
    isHour   >> iHour;
    isMinute >> iMinute;
    isSec    >> iSecond;
    isUSec   >> iUSecond;

    timeblock.uiYear    = iYear;
    timeblock.usiMonth  = iMonth;
    timeblock.usiDay    = iDay;
    timeblock.usiHour   = iHour;
    timeblock.usiMinute = iMinute;
    timeblock.usiSec    = iSecond;
    timeblock.udiUSec   = iUSecond;
    // Display version information
}
#pragma managed(pop)
I've seen this error caused by using a static variable in native code in a C++/CLI assembly.
The only workaround I found was to remove the static variable, e.g., by moving it to class or file scope.
However, if this static variable is in the boost code, doing so may not be easy/possible. In that case, you could create a separate C++ file that's compiled without /clr, use the boost function in that file, and link that into your C++/CLI assembly.
This error seems to be caused by the compiler generating incorrect code. I filed a bug with Microsoft, which was closed "won't fix", but the compiler team gave some other workarounds in their response.
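A rough sketch of that split (the file names are mine; the SDateTime definition stays wherever it already lives):

// TimeUtil.h -- plain declaration, no Boost headers visible to /clr code
struct SDateTime;   // defined in your existing header
void GetSystemDateTime(SDateTime& stimeblock);

// TimeUtil.cpp -- compiled WITHOUT /clr (a per-file setting in the project)
#include "TimeUtil.h"
#include <boost/date_time/posix_time/posix_time.hpp>

void GetSystemDateTime(SDateTime& stimeblock)
{
    boost::posix_time::ptime now =
        boost::posix_time::microsec_clock::local_time();
    // ... fill stimeblock from 'now' as before ...
}

// main.cpp -- compiled with /clr, sees only the declaration
#include "TimeUtil.h"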
Try using
#pragma managed(push, off)
#pragma managed(pop)
around the #include lines for all boost header files.
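For example (assuming the posix_time header is the one being pulled in):

#pragma managed(push, off)
#include <boost/date_time/posix_time/posix_time.hpp>
#pragma managed(pop)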
I've been stuck on the same problem for a few days now.
This is the best workaround I have found, and it also explains why this is happening.
Look at the end (numbers 7 and 9).
Hope this helps: http://www.codeproject.com/Articles/442784/Best-gotchas-of-Cplusplus-CLI
I'm trying to compile LightZPng at warning level 4. I get a lot of C4127 warnings ("conditional expression is constant") on lines that clearly don't deserve them. An example:
#define MAX_BITS 15
int values_per_bitlen[ MAX_BITS + 1 ];
for ( int i = 0; i <= MAX_BITS; ++i ) // C4127 is here
    values_per_bitlen[ i ] = 0;
How can this code be changed to avoid the warning other than #pragma?
There's a piece of code at the top of LightZ.cpp that goes like this:
#define for if (false) {} else for
That means your actual statement is:
if (false) {} else for ( int i = 0; i <= MAX_BITS; ++i )
which is why you're getting the constant-expression warning (it's the false, not the i <= MAX_BITS as I thought).
Simply comment out or delete that line from the file (I can't actually figure out why they would do that).
Yes, that is odd. It's truly not a constant expression, since i changes in the loop. So this would appear to be a problem with VS2005. For what it's worth, VS2008 does exactly the same thing.
Strangely enough, a project with just this in it does not complain, so it may well be some weird edge-case problem with Microsoft's warning-generation code:
#define MAX_BITS 15
int values_per_bitlen[ MAX_BITS + 1 ];

int main(int argc, char* argv[]) {
    for ( int i = 0; i <= MAX_BITS; ++i )
        values_per_bitlen[ i ] = 0;
    return 0;
}
However, you haven't actually asked a question. What is it that you want to know, or want us to do?
Update:
See "Windows programmer"'s answer for the actual cause - there's a "#define for if (false) {} else for" at the top of LightZ.cpp which is causing the problem.
I tested it on my VS2005 and the warning does not appear, even at warning level 4.
A simple procedure for you to follow:
- Create a new console app, place only the above code in it, and see if the warning shows up again.
- If not, check for differences in the project settings.
- If yes, I would assume that your optimization settings may be causing it.
According to Charles Nicholson, Visual Studio 2005 gives this warning with the "do...while(0)" trick:
#define MULTI_LINE_MACRO   \
    do {                   \
        doSomething();     \
        doSomethingElse(); \
    } while(0)
If you absolutely must, you can use the __pragma directive to selectively disable that warning around a particular code fragment.
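Applied to the macro above, that usually looks something like this (a sketch; __pragma is MSVC-specific and, unlike #pragma, can appear inside a macro definition):

#define MULTI_LINE_MACRO                \
    __pragma(warning(push))             \
    __pragma(warning(disable:4127))     \
    do {                                \
        doSomething();                  \
        doSomethingElse();              \
    } while(0)                          \
    __pragma(warning(pop))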