Problems with creating a CUDA shared library and libpthread

I am currently trying to create a library with CUDA routines, but I am running into trouble. I will explain my problem using a rather minimal example; my actual library will be larger.
I have successfully written test.cu, a source file containing a __global__ CUDA function and a wrapper around it (to allocate and copy memory). I can also successfully compile this file into a shared library using the following commands:
nvcc -c test.cu -o test.o -lpthread -lrt -lcuda -lcudart -Xcompiler -fPIC
gcc -m64 -shared -fPIC -o libtest.so test.o -lpthread -lrt -lcuda -lcudart -L/opt/cuda/lib64
The resulting libtest.so exports all my needed symbols.
I now compile my purely C main.c and link it against my library:
gcc -std=c99 main.c -o main -lpthread -ltest -L.
This step is also successful, but upon executing ./main all CUDA functions that are called return an error:
test.cu:17:cError(): cudaGetDeviceCount: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMalloc: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMemcpy: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaMemcpy: [38] no CUDA-capable device is detected
test.cu:17:cError(): cudaFree: [38] no CUDA-capable device is detected
(Error messages are created through a debugging function of my own)
During my initial steps I encountered the exact same problem when I was creating an executable directly from test.cu, because I had forgotten to link against libpthread (-lpthread). But, as you can see above, I have linked everything against libpthread. According to ldd, both libtest.so and main depend on libpthread, as they should.
I am using CUDA 5 (yes, I do realize it is a beta) with gcc 4.6.3 and NVIDIA driver version 302.06.03 on Arch Linux.
Some help in solving this problem would be more than appreciated!

Here's a trivial example...
// File: test.cu
#include <stdio.h>
__global__ void myk(void)
{
    printf("Hello from thread %d block %d\n", threadIdx.x, blockIdx.x);
}
extern "C"
void entry(void)
{
    myk<<<1,1>>>();
    printf("CUDA status: %d\n", cudaDeviceSynchronize());
}
Compile/link with nvcc -m64 -arch=sm_20 -o libtest.so --shared -Xcompiler -fPIC test.cu.
// File: main.c
#include <stdio.h>
void entry(void);
int main(void)
{
    entry();
}
Compile/link with gcc -std=c99 -o main main.c -L. -ltest.

Related

Why is CMake putting "std=gnu++14" in Clang invocations when I don't specify GCC?

I have a pretty simple CMake-based project that builds one lib and a small executable that uses it. It builds fine on Linux with GCC, but fails on Mac OS with loads of errors of the following kind. The inciting line of code in main.cpp is the second one here:
#include <cstdlib>
#include <memory>
The first of many similar errors:
[build] In file included from /Users/me/data/series2server/main.cpp:2:
[build] In file included from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.3.sdk/usr/include/c++/v1/memory:671:
[build] /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.3.sdk/usr/include/c++/v1/__algorithm/search.h:34:19: error: no member named 'make_pair' in namespace 'std::__1'
[build] return _VSTD::make_pair(__first1, __first1); // Everything matches an empty sequence
This appears to be a mismatch between Clang and GCC uses of std. But I can't figure out why CMake is configuring things to call clang++ but putting "std=gnu++14" in the compiler invocation. I did a full-text search for "std=gnu" in the whole source tree and didn't find it. I do see this in various CMakeLists.txt files:
set( CMAKE_CXX_STANDARD 14 )
A compiler invocation is below. Where might I look for where this gnu option is specified? Thanks!
[build] cd /Users/me/data/series2server/build/restbed && /usr/bin/clang++ -DBUILD_SSL -I/Users/me/data/series2server/restbed/source -isystem /Users/me/data/series2server/restbed/dependency/asio/asio/include -isystem /Users/me/data/series2server/restbed/dependency/openssl/include -Wall -Wextra -Weffc++ -pedantic -Wno-unknown-pragmas -Wno-deprecated-declarations -Wno-non-virtual-dtor -DASIO_STANDALONE=YES -Wno-deprecated-declarations -g -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.3.sdk -std=gnu++14 -MD -MT restbed/CMakeFiles/restbed-static.dir/source/corvusoft/restbed/detail/service_impl.cpp.o -MF CMakeFiles/restbed-static.dir/source/corvusoft/restbed/detail/service_impl.cpp.o.d -o CMakeFiles/restbed-static.dir/source/corvusoft/restbed/detail/service_impl.cpp.o -c /Users/me/data/series2server/restbed/source/corvusoft/restbed/detail/service_impl.cpp

From n.m.'s comment 10 years ago, for clarity:
set( CMAKE_CXX_STANDARD 14 ) makes CMake pass -std=gnu++14 to gcc or clang, unless the CXX_EXTENSIONS target property (or the CMAKE_CXX_EXTENSIONS variable) is set to OFF.
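One way to confirm which mode a source file was actually compiled in is to test the __STRICT_ANSI__ macro, which GCC and Clang define for the strict flags (-std=c++14, -std=c99) but not for the GNU variants (-std=gnu++14, -std=gnu99). The sketch below is written in C for brevity, and the macro behaves the same way in C++. The helper names strict_iso_mode and report_std_mode are made up for illustration and are not part of CMake or of the project in the question.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when the compiler ran in a strict ISO mode
 * (e.g. -std=c99, -std=c++14) and 0 when GNU extensions are enabled
 * (e.g. -std=gnu99, -std=gnu++14, or the GCC/Clang default). */
int strict_iso_mode(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}

/* Prints which mode was detected; intended as a one-off build sanity check. */
void report_std_mode(void)
{
    printf("GNU extensions: %s\n", strict_iso_mode() ? "off" : "on");
}
```

Calling report_std_mode() from a target built with CMAKE_CXX_EXTENSIONS set to OFF should report "off" if the flag took effect.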

Link only what is used/needed with Clang on MacOS

On macOS, the dynamic linking behaviour seems to be fundamentally different from that of other *nix systems.
The problem is that on macOS, Clang records every library given at link time as a dependency of the produced binary, regardless of whether its symbols are needed or not.
Minimal C-example:
main.c
#include <stdio.h>
int main(void)
{
puts("Hello World!");
return 0;
}
foo.c
int foo(void)
{
return 42;
}
Compiling on MacOS with
clang -dynamiclib -o libfoo.dylib foo.c
clang -o main main.c -L. -lfoo
and then running otool -L main, one finds that the main binary depends on libfoo.dylib, although it does not need any symbols from libfoo.dylib.
Under Linux, the result is different: Using
gcc -shared -fPIC -o libfoo.so foo.c
gcc -o main main.c -L. -lfoo
to compile, ldd main then shows no dependence on libfoo.so.
Tested with Apple Clang 12.0.0 and gcc 10.2.1 (Debian 10.2.1-6).
How could one reproduce the behaviour on Linux (“Link what you use/need”), when compiling and linking on MacOS?

Compiling OpenMP to WebAssembly

I am trying to compile a multithreaded application to WebAssembly. The application uses OpenMP for multithreading.
To compile I am using the Emscripten framework.
I have already downloaded the source files for OpenMP and compiled it for my host machine using make. With the following command I can get it to link with a simple demo application on my machine:
g++ -Wall -Werror -pedantic main.o -o main.x /$PATH_TO_OPENMP/build/runtime/src/libgomp.a -pthread -lstdc++ -Wl,--no-as-needed -ldl
I then tried to compile OpenMP to the LLVM bitcode format used by Emscripten. To do so I ran 'emmake make', so that Emscripten executes the OpenMP makefiles with a suitable compiler. As Emscripten does not like shared object files, I compiled it to static library (.a) files.
This works and actually gives me object files to which I can link.
I then wanted to link my demo application with the following command
em++ -Wall -Werror -pedantic main.o -o main.html /home/main/data/Programming/openMP/openmp_web/build/runtime/src/libgomp.a -pthread -lstdc++ -Wl,--no-as-needed -ldl
But I get these warnings saying that it could not link the OpenMP object files:
shared:WARNING: object /tmp/emscripten_temp_ONa0eU_archive_contents/kmp_atomic.cpp.o is not a valid object file for emscripten, cannot link
.
.
shared:WARNING: object /tmp/emscripten_temp_ONa0eU_archive_contents/kmp_str.cpp.o is not a valid object file for emscripten, cannot link
shared:WARNING: object /tmp/emscripten_temp_ONa0eU_archive_contents
So I figured I must have compiled OpenMP with the wrong compiler. I then tried to change the compiler when building the library by using the following commands:
cmake -DCMAKE_C_COMPILER=emcc -DCMAKE_CXX_COMPILER=em++ -DLIBOMP_LIB_TYPE=normal -DLIBOMP_ENABLE_SHARED=OFF -DCMAKE_BUILD_TYPE=Release -DLIBOMP_ARCH=x86_64 OPENMP_STANDALONE_BUILD=1 ..
emmake make
But this just gives strange errors about an unknown OS and architecture:
/home/main/data/Programming/openMP/openmp_web/runtime/src/kmp_platform.h:82:2: error: Unknown OS
/home/main/data/Programming/openMP/openmp_web/runtime/src/kmp_platform.h:203:2: error: Unknown or unsupported architecture
In file included from /home/main/data/Programming/openMP/openmp_web/runtime/src/kmp_alloc.cpp:13:
In file included from /home/main/data/Programming/openMP/openmp_web/runtime/src/kmp.h:77:
/home/main/data/Programming/openMP/openmp_web/runtime/src/kmp_os.h:171:2: error: "Can't determine size_t printf format specifier."
Does anyone have an idea on what I could do differently?
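The "Unknown OS" and "Unknown or unsupported architecture" messages come from #error directives in kmp_platform.h, which suggests the header classifies the target by testing predefined compiler macros and gives up when none match. The C sketch below shows that pattern in miniature. It is not the OpenMP runtime's actual code, platform_name and platform_is_known are hypothetical helpers, and it assumes that emcc predefines __EMSCRIPTEN__.

```c
#include <string.h>

/* Hypothetical helper illustrating macro-based platform detection.
 * A header without the __EMSCRIPTEN__ branch may fall through to its
 * "unknown" case when compiled with emcc. */
const char *platform_name(void)
{
#if defined(__EMSCRIPTEN__)
    return "emscripten";
#elif defined(__linux__)
    return "linux";
#elif defined(__APPLE__)
    return "darwin";
#elif defined(_WIN32)
    return "windows";
#else
    return "unknown";
#endif
}

/* Returns 1 if the detected platform is one this sketch recognises. */
int platform_is_known(void)
{
    return strcmp(platform_name(), "unknown") != 0;
}
```

Under this reading, changing the compiler alone would not be enough: the library sources would also need a branch for the new target.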

OMP parallel being ignored in object file

NOTE: I do not understand object files, linking, or makefiles very well. I only understand enough to get a program running.
I'm working on a GPU accelerated version of a previous project of mine that works with no problems. I currently am testing a modified version of a make file I have used for other CUDA programs.
The file:
exe: main.o b.o
	gcc -fopenmp -L /usr/local/cuda/lib64 -o exe main.o b.o -lcudart -lglfw -lGL
main:
	gcc -fopenmp -o main.o main.c -Ofast -march=native -mtune=native -lglfw -lGL -I /usr/local/cuda/include
b.o: b.cu b.h
	nvcc -Xcompiler -fPIC -ccbin clang-3.8 -c -o b.o b.cu
b.cu is a CUDA file containing some test functions; it does not affect anything yet.
When I run the compiled program, it uses only a single core and runs at 1/4 of the frame rate (which is what would be expected from single-threaded execution on a 4-core CPU).
I've Googled as many questions as I can, but I have not found any results that work for me.
System info:
OS: Ubuntu 18.04 bionic
CPU: AMD A8-3850
GPU: GeForce GTX 1060 6GB
RAM: 7974MiB
GCC: 7.3.0
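GCC does not act on #pragma omp lines in a file compiled without -fopenmp (at most it warns about an unknown pragma). One way to see whether a given object file was built with OpenMP enabled is therefore to test the standard _OPENMP macro from inside that file. Below is a minimal C sketch. openmp_enabled and sum_to are illustrative names, not functions from the project in the question.

```c
/* Returns 1 if this translation unit was compiled with OpenMP support
 * (e.g. gcc -fopenmp), 0 if the pragma below is being ignored. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Sums 1..n. With OpenMP the loop is split across threads; without it the
 * pragma is skipped and the loop runs serially with the same result. */
long sum_to(int n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
```

If openmp_enabled() returns 0 when called from main.c, the object file was produced by a rule that did not pass -fopenmp, whatever the link line says.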

gcc: confused about -static -shared -fPIE -fPIC -Wl,-pie

I'm trying to build clang with all libraries statically linked in, so that I can run it on CentOS 6 with its ancient GCC 4.4.
At first, I thought adding the option -static by turning on LLVM_BUILD_STATIC would be enough. But the link stage fails with:
dynamic STT_GNU_IFUNC symbol `strcmp' with pointer equality in `/usr/lib/../lib64/libc.a(strcmp.o)' can not be used when making an executable; recompile with -fPIE and relink with -pie
So, I add -fPIE -Wl,-pie to CMAKE_CXX_FLAGS, and it says
-- Performing Test HAVE_CXX_ATOMICS_WITH_LIB
-- Performing Test HAVE_CXX_ATOMICS_WITH_LIB - Failed
CMake Error at cmake/modules/CheckAtomic.cmake:49 (message):
Host compiler must support std::atomic!
Call Stack (most recent call first):
cmake/config-ix.cmake:307 (include)
CMakeLists.txt:590 (include)
I checked the cmake/modules/CheckAtomic.cmake file. It compiles the following code
#include <atomic>
std::atomic<float> x(0.0f);
int main() { return (float)x; }
with command
/home/hailin/gcc-4.8.3-boost-1.55/rtf/bin/g++ -fPIE -Wl,-pie -DHAVE_CXX_ATOMICS_WITHOUT_LIB -std=c++11 -static -lm
/home/hailin/gcc-4.8.3-boost-1.55/rtf/bin/g++ -fPIE -Wl,-pie -DHAVE_CXX_ATOMICS_WITH_LIB -std=c++11 -static -lm -latomic
The command with the option -Wl,-pie reproduces the same error.
It seems like a dead end. Is there a conflict between -shared and -fPIE -Wl,-pie?

Old question, but in case someone else hits it: apparently you need to pass -pie to the compiler driver (gcc/g++), not just the linker (-Wl,-pie). Some startup object files differ for PIE (e.g. Scrt1.o instead of crt1.o) and these are passed by the driver to the linker, so the driver needs to know that you're making a PIE.
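To check which code model the compiler applied to a particular source file, the predefined __PIE__ and __PIC__ macros can be inspected: GCC and Clang define __PIE__ for -fpie/-fPIE and __PIC__ for -fpic/-fPIC, and in practice PIE builds define __PIC__ as well. This reflects compile-time flags only and says nothing about whether the final link used -pie. The helper name code_model in the C sketch below is made up for illustration.

```c
/* Hypothetical helper: reports how this file was compiled.
 * __PIE__ is checked first because PIE builds typically define __PIC__ too. */
const char *code_model(void)
{
#if defined(__PIE__)
    return "pie";
#elif defined(__PIC__)
    return "pic";
#else
    return "non-pic";
#endif
}
```

Printing code_model() from a test program built with the same flags as the failing check shows whether -fPIE reached the compile step.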
