Pragma directive to get thread number not working Rcpp - parallel-processing

I am trying to parallelise a loop using pragma directives in Rcpp. Aside from a warning message during compilation that pragma is not recognised (although this appears to be a non-issue from what I have read), other pragma commands are not working. This is the minimal example I have been using (content of the for-loop is irrelevant):
sourceCpp(code = '
#include <Rcpp.h>
#include <omp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
int test() {
#pragma omp parallel for
int foo = omp_get_num_threads() ;
for(int i = 0; i < 2; i++) {
Rcout << i ;
}
return foo ;
}')
Here is my error:
"C:/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/PROGRA~1/R/R-40~1.4/include" -DNDEBUG -I"C:/Users/User/Documents/R/win-library/4.0/Rcpp/include" -I"C:/Users/User/AppData/Local/Temp/RtmpWW0LXx/sourceCpp-x86_64-w64-mingw32-1.0.4.6" -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c file2fe83fae2189.cpp -o file2fe83fae2189.o
file2fe83fae2189.cpp:9: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
#pragma omp parallel for
C:/rtools40/mingw64/bin/g++ -std=gnu++11 -shared -s -static-libgcc -o sourceCpp_90.dll tmp.def file2fe83fae2189.o -LC:/PROGRA~1/R/R-40~1.4/bin/x64 -lR
C:/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: file2fe83fae2189.o:file2fe83fae2189.cpp:(.text+0x106): undefined reference to `omp_get_num_threads'
collect2.exe: error: ld returned 1 exit status
I am on a Windows machine so the MacOS compiler issue should not apply, and my num_threads call is inside the pragma section. Any ideas on what is going wrong here?

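While this stuff can be finicky, the key omission is that you must tell Rcpp you want an OpenMP compilation. Your compile line has no -fopenmp, so the pragma is ignored (hence the warning) and the OpenMP runtime is not linked (hence the undefined reference to omp_get_num_threads). You request OpenMP via the plugin, // [[Rcpp::plugins(openmp)]], or, in a package (which is what you should probably use anyway), by setting PKG_CXXFLAGS and PKG_LIBS to $(SHLIB_OPENMP_CXXFLAGS) in src/Makevars or src/Makevars.win.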
Anyway, here is a worked example I just derived from an older C++ example I had hanging around.
Code
#include <Rcpp.h>
#include <omp.h>
#include <sstream> // std::ostringstream
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::export]]
int foo() {
int th_id, nthreads;
#pragma omp parallel private(th_id)
{
th_id = omp_get_thread_num();
std::ostringstream ss; // allow for better synchronization
ss << "Hello World from thread " << th_id << std::endl;
Rcpp::Rcout << ss.str();
#pragma omp barrier
#pragma omp master
{
nthreads = omp_get_num_threads();
Rcpp::Rcout << "There are " << nthreads << " threads" << std::endl;
}
}
return 0;
}
/*** R
foo()
*/
Output
On my machine with a hyperthreaded six-core CPU:
> Rcpp::sourceCpp("answer.cpp")
> foo()
Hello World from thread 0
Hello World from thread 1
Hello World from thread 8
Hello World from thread 10
Hello World from thread 4
Hello World from thread 9
Hello World from thread 11
Hello World from thread 7
Hello World from thread 3
Hello World from thread 5
Hello World from thread 6
Hello World from thread 2
There are 12 threads
[1] 0
>
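A related point from the question: omp_get_num_threads() reports the size of the current team, so it only returns more than 1 when called inside a parallel region. Below is a minimal sketch of that in plain C++ without Rcpp. The function names are hypothetical, and the _OPENMP guard is an addition so the file still compiles when OpenMP is not enabled, in which case both functions simply report one thread.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Team size as seen from serial code: 1, with or without OpenMP enabled.
int threads_outside_region() {
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

// Team size as seen from inside a parallel region.
int threads_inside_region() {
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // One thread records the team size; the others wait at the
        // implicit barrier that ends the single construct.
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    return n;
}
```

Compiled with g++ -fopenmp, threads_inside_region() would typically return the number of hardware threads (or OMP_NUM_THREADS if set), while threads_outside_region() stays at 1.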


Why does my ISR declaration break my program?

I am trying to make two LEDs blink on my Arduino Uno R3 (for learning purposes). I use avr-gcc and avrdude to compile and load my program.
The first one I make blink within a while loop in main. I am trying to use Timer0 to turn the second one on and off.
First, the code that works :
#include <avr/io.h>
#include <util/delay.h>
int main() {
TCCR0B |= (1 << CS02) | (1 << CS00);
TIMSK0 |= (1 << TOIE0);
DDRD = 1 << PD3;
DDRB = 1 << PB5;
PORTB = 0;
while(1) {
PORTD ^= 1 << PD3;
_delay_ms(500);
}
return 0;
}
As expected, this code makes my LED blink on and off, and start again every second. I am also setting up (but not using) the second LED and the timer.
Now, the issues start when I add an interrupt vector:
...
#include <avr/interrupt.h>
volatile uint8_t intrs;
ISR(TIMER0_OVF_vect) {
if (++intrs >= 62) { // meant to execute every second
PORTB ^= (1 << PB5);
intrs = 0;
}
}
int main() {
intrs = 0;
... // old setup
sei();
while(1) { ... }
}
Now, none of the LEDs blink. Even weirder, none of them blink when I remove the sei(). The only way I've found to make the first LED blink again is to comment out the ISR declaration or to mark it ISR_NAKED.
So, what gives?
PS : I use a makefile to compile & load. When I run it, it looks like this:
$ make
avr-gcc -c -Os -DF_CPU=16000000UL -mmcu=atmega328p -Wall -Wextra main.c
avr-gcc -o prog.elf main.o
avr-objcopy -O ihex -R .eeprom prog.elf prog.hex
avrdude -C/etc/avrdude.conf -v -V -carduino -patmega328p -P/dev/ttyACM0 -b115200 -D -Uflash:w:prog.hex
.. # avrdude logs
I use the arduino framework with setup() and loop() functions. It might not be an optimal choice, but it's easier. Timer 0 is used by wiring.c, which is responsible for delay functions (not _delay_ms() which doesn't use interrupts). This can be disabled, as explained in this post or timer 2 can be used instead. In the latter case, your second code works fine. Could it be that you face a similar problem?
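As a side note on the constant 62 in the question's ISR, the arithmetic can be checked on a desktop compiler. This is a sketch under stated assumptions: F_CPU is 16 MHz (from the makefile), CS02 | CS00 selects the /1024 prescaler (per the ATmega328P datasheet's Timer0 clock-select table), and Timer0 is 8 bits wide, so it overflows every 256 ticks. The function names are hypothetical.

```cpp
#include <cstdint>

// Whole overflows per second of an 8-bit timer (it wraps every 256 ticks).
constexpr std::uint32_t overflows_per_second(std::uint32_t f_cpu_hz,
                                             std::uint32_t prescaler) {
    return f_cpu_hz / (prescaler * 256u);
}

// Milliseconds between LED toggles when the ISR toggles once every
// `count` overflows.
constexpr double toggle_period_ms(std::uint32_t f_cpu_hz,
                                  std::uint32_t prescaler,
                                  std::uint32_t count) {
    return 1000.0 * count * prescaler * 256.0 / f_cpu_hz;
}

// 16 MHz / (1024 * 256) is about 61.04 overflows per second.
static_assert(overflows_per_second(16000000u, 1024u) == 61,
              "expected 61 whole overflows per second");
```

With these numbers, toggling every 62 overflows gives a period of roughly 1016 ms, so the "every second" comment is a close approximation rather than exact.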

std::async blocks even with std::launch::async flag depending on whether the returned future is used or ignored [duplicate]

Description of the problem
std::async seems to block even with std::launch::async flag:
#include <iostream>
#include <future>
#include <chrono>
int main(void)
{
using namespace std::chrono_literals;
auto f = [](const char* s)
{
std::cout << s;
std::this_thread::sleep_for(2s);
std::cout << s;
};
std::cout << "start\n";
(void)std::async(std::launch::async, f, "1\n");
std::cout << "in between\n";
(void)std::async(std::launch::async, f, "2\n");
std::cout << "end\n";
return 0;
}
The output shows that execution is serialized, even with the std::launch::async flag:
start
1
1
in between
2
2
end
But if I keep the returned std::future, it suddenly stops blocking!
The only change I made is removing (void) and assigning the results to auto r1 and auto r2 instead:
#include <iostream>
#include <future>
#include <chrono>
int main(void)
{
using namespace std::chrono_literals;
auto f = [](const char* s)
{
std::cout << s;
std::this_thread::sleep_for(2s);
std::cout << s;
};
std::cout << "start\n";
auto r1 = std::async(std::launch::async, f, "1\n");
std::cout << "in between\n";
auto r2 = std::async(std::launch::async, f, "2\n");
std::cout << "end\n";
return 0;
}
And, the result is quite different. It definitely shows that the execution is in parallel.
start
in between
1
end
2
1
2
I used gcc for CentOS devtoolset-7.
gcc (GCC) 7.2.1 20170829 (Red Hat 7.2.1-1)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
My Makefile is:
.PHONY: all clean
all: foo
SRCS := $(shell find . -name '*.cpp')
OBJS := $(SRCS:.cpp=.o)
foo: $(OBJS)
gcc -o $@ $^ -lstdc++ -pthread
%.o: %.cpp
gcc -std=c++17 -c -g -Wall -O0 -pthread -o $@ $<
clean:
rm -rf foo *.o
Question
Is this behaviour in the specification?
Or is it a gcc implementation bug?
Why does this happen?
Can someone explain this to me, please?
The std::future destructor will block if it's a future from std::async and is the last reference to the shared state. I believe what you're seeing here is that the call to async returns a future, but that future is not captured, so its destructor fires at the end of the statement and blocks until the task finishes, which causes the tasks to run serially.
Explicitly capturing the return value causes the two destructors to fire only at the end of the function, which leaves both tasks running until they’re done.
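The two cases can be checked without relying on timing. The following is a minimal sketch, not the asker's program: the function names are hypothetical, and a std::promise is used as a gate so that the second function would deadlock if std::async itself blocked.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Discarded future: the temporary is destroyed at the end of the statement,
// and its destructor waits for the task, so `done` is already true afterwards.
bool discarded_future_blocks() {
    std::atomic<bool> done{false};
    (void)std::async(std::launch::async, [&done] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        done = true;
    });
    return done.load();
}

// Kept future: std::async returns while the task is still waiting on the gate.
bool kept_future_does_not_block() {
    std::promise<void> gate;
    std::future<void> opened = gate.get_future();
    auto task = std::async(std::launch::async, [&opened] { opened.wait(); });
    // Reaching this line shows the call above returned before the task ended.
    bool still_running =
        task.wait_for(std::chrono::seconds(0)) != std::future_status::ready;
    gate.set_value();  // release the task
    task.get();        // wait for it before `opened` goes out of scope
    return still_running;
}
```

Both functions are expected to return true: the first because the statement does not complete until the task has finished, the second because the task cannot finish until the gate is opened after std::async has returned.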

EIGEN library with MKL rvalue references warning

I am trying to use the Eigen library linked with the MKL library (icc version 17.0.4) with the code:
#define EIGEN_USE_MKL_ALL
#define lapack_complex_float std::complex<float>
#define lapack_complex_double std::complex<double>
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/Eigenvalues>
#include <complex>
#include <Eigen/PardisoSupport>
using namespace Eigen;
using Eigen::MatrixXd;
int main()
{
int size = 3;
MatrixXd A(size,size);
A(0,0)=1.0; A(0,1)=-0.5; A(0,2)=0.2;
A(1,0)=0.7; A(1,1)=-1.3; A(1,2)=-2.0;
A(2,0)=0.7; A(2,1)=-1.3; A(2,2)=-2.0;
std::cout << A << std::endl;
VectorXd vec(3);
vec(0) = 2;
vec(1) = 3;
vec(2) = 4;
std::cout << A*vec << "\n";
std::cout << A.eigenvalues() << "\n";
}
I compile via
icc -I${MKLROOT}/include -L${MKLROOT}/lib -Wl,-rpath,${MKLROOT}/lib \
-lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl \
-L/Users/user/eigen -I/Users/user/eigen
However, I receive the following warning:
/Users/user/eigen/Eigen/src/Core/DenseStorage.h(372): warning #3495: rvalue references
are a C++11 feature DenseStorage(DenseStorage&& other) EIGEN_NOEXCEPT
How to solve this warning?
Eigen seems to detect that your compiler supports rvalue references. You can either disable that by defining -DEIGEN_HAS_RVALUE_REFERENCES=0 via the command line or before including Eigen in your source by:
#define EIGEN_HAS_RVALUE_REFERENCES 0
Preferably, tell icc that it shall compile with C++11 support (I assume -std=c++11 works for icc as well).

C++ gettid() was not declared in this scope

I would like to get the thread ID of both threads using the gettid function. I do not want to make the syscall directly; I want to use this function.
A simple program is:
#include <iostream>
#include <boost/thread/thread.hpp>
#include <boost/date_time/date.hpp>
#include <unistd.h>
#include <sys/types.h>
using namespace boost;
using namespace std;
boost::thread thread_obj;
boost::thread thread_obj1;
void func(void)
{
char x;
cout << "enter y to interrupt" << endl;
cin >> x;
pid_t tid = gettid();
cout << "tid:" << tid << endl;
if (x == 'y') {
cout << "x = 'y'" << endl;
cout << "thread interrupt" << endl;
}
}
void real_main() {
cout << "real main thread" << endl;
pid_t tid = gettid();
cout << "tid:" << tid << endl;
boost::system_time const timeout = boost::get_system_time() + boost::posix_time::seconds(3);
try {
boost::this_thread::sleep(timeout);
}
catch (boost::thread_interrupted &) {
cout << "thread interrupted" << endl;
}
}
int main()
{
thread_obj1 = boost::thread(&func);
thread_obj = boost::thread(&real_main);
thread_obj.join();
}
It gives an error on compilation, even though gettid() is used according to the man page:
$g++ -std=c++11 -o Intrpt Interrupt.cpp -lboost_system -lboost_thread
Interrupt.cpp: In function ‘void func()’:
Interrupt.cpp:17:25: error: ‘gettid’ was not declared in this scope
pid_t tid = gettid();
This is a silly glibc bug. Work around it like this:
#include <unistd.h>
#include <sys/syscall.h>
#define gettid() syscall(SYS_gettid)
The man page you refer to can be read online here. It clearly states:
Note: There is no glibc wrapper for this system call; see NOTES.
and
NOTES
Glibc does not provide a wrapper for this system call; call it using syscall(2).
The thread ID returned by this call is not the same thing as a POSIX thread ID (i.e., the opaque value returned by pthread_self(3)).
So you can't. The only way to use this function is through the syscall.
But you probably shouldn't anyway. You can use pthread_self() (and compare using pthread_equal(t1, t2)) instead. It's possible that boost::thread has its own equivalent too.
In addition to the solution provided by Glenn Maynard, it may be appropriate to check the glibc version and define the suggested macro for gettid() only if it is lower than 2.30, since glibc 2.30 added its own gettid() wrapper.
#if __GLIBC__ == 2 && __GLIBC_MINOR__ < 30
#include <sys/syscall.h>
#define gettid() syscall(SYS_gettid)
#endif
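An alternative that avoids the macro entirely is a small wrapper function. This is a Linux-only sketch; the name current_tid is hypothetical and chosen so that it cannot clash with the gettid() declaration that newer glibc versions provide.

```cpp
#include <sys/syscall.h>  // SYS_gettid
#include <sys/types.h>    // pid_t
#include <unistd.h>       // syscall, getpid

// Kernel thread ID of the calling thread, obtained via the raw system call.
inline pid_t current_tid() {
    return static_cast<pid_t>(syscall(SYS_gettid));
}
```

In the initial thread of a process, current_tid() returns the same value as getpid(); in any other thread it returns a different value, which is what distinguishes the two threads in the question's program.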

Overriding functions from dynamic libraries

Hello, I have a program with a global function that I'd like to customize at run time. Say there are many versions of the function foo() scattered over shared libraries. Based on the system configuration detected at run time, I'd like to use the function from the appropriate library.
File loader.cpp:
#include <dlfcn.h>
#include <iostream>
void __attribute__((weak)) foo();
int main(int argc, char *argv[])
{
void* dl = dlopen("./other.so", RTLD_NOW | RTLD_GLOBAL);
if (!dl)
{
std::cerr << dlerror() << std::endl;
return 1;
}
if (foo)
{
foo();
}
else
{
std::cerr << "No foo?" << std::endl;
}
dlclose(dl);
return 0;
}
File other.cpp:
#include <iostream>
void foo()
{
std::cout << "FOO!" << std::endl;
}
I compile the program with
g++ -Wall -fPIC -o loaded loader.cpp -ldl
g++ -Wall -fPIC -shared -o other.so other.cpp
However, the weak symbol is not overridden. Any hints?
Symbols are resolved during load time of the image in which they are referenced. So when your executable is loaded, the reference to foo is already resolved. A later dlopen won't go back and rebind symbols that are already resolved; it can only affect later loads.
You'll have to use dlsym instead, or set LD_PRELOAD:
martin@mira:/tmp$ LD_PRELOAD=/tmp/other.so ./loaded
FOO!
You compiled the shared lib with g++.
As a result, the name of the function is mangled:
$ nm -S other.so |grep foo
0000000000000690 000000000000002e T _Z3foov
If you make it pure C code and compile with gcc instead of g++, you'll find it working as you expect.
Alternatively, define it as follows:
extern "C" void foo()
{
std::cout << "FOO!" << std::endl;
}
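Combining the two answers, a dlsym-based lookup might look like the sketch below. The helper name lookup_symbol is hypothetical, and the sketch assumes the library exports foo with C linkage (the extern "C" variant above); otherwise the mangled name _Z3foov would have to be looked up instead.

```cpp
#include <dlfcn.h>

// Returns the address of `symbol` in `library`, or nullptr if the library
// cannot be loaded or the symbol is not found. The handle is deliberately
// left open so that the returned pointer stays valid.
void* lookup_symbol(const char* library, const char* symbol) {
    void* handle = dlopen(library, RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        return nullptr;
    }
    return dlsym(handle, symbol);
}
```

In loader.cpp this could replace the weak declaration: cast the result of lookup_symbol("./other.so", "foo") to void (*)() with reinterpret_cast, check it for nullptr, and call it. On older glibc versions the program still needs to be linked with -ldl, as in the question's build command.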
