Just to see what kind of code CUDA is generating I like to compile to ptx in addition to an object file. Since some of my loop unrolling can take quite a while I'd like to be able to compile *.cu→*.ptx→*.o instead of wasting time with both *.cu→*.ptx and *.cu→*.o, which I'm currently doing.
Simply adding -ptx to the nvcc *.cu line gives the desired ptx output.
Using ptxas -c to compile *.ptx to *.o works, but causes an error in my executable linking: Relocations in generic ELF (EM: 190).
Attempting to compile the *.ptx with nvcc fails silently, outputting nothing.
Is there some option I need to pass to ptxas? How should I properly compile via ptx with separate compilation? Alternatively, can I just tell nvcc to keep the ptx?
Alternatively, can I just tell nvcc to keep the ptx?
Yes, you can tell nvcc to keep all intermediate files, one of which will be the .ptx file.
nvcc -keep ...
Keeping all the intermediate files is a bit messy, but I'm sure you can come up with a script to tidy things up, and only save the files you want.
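The tidy-up script mentioned above could look roughly like this. It is only a sketch, assuming the intermediates are written to a dedicated directory (nvcc accepts -keep-dir for that). The function name keep_only_ptx and the directory and file names are placeholders, not anything from the original post.

```shell
#!/bin/sh
# keep_only_ptx KEEP_DIR DEST_DIR
# Move every *.ptx file from KEEP_DIR into DEST_DIR, then delete KEEP_DIR
# together with the remaining intermediate files.
# Both directory names are hypothetical; DEST_DIR must not be inside KEEP_DIR.
keep_only_ptx() {
    keep_dir=$1
    dest_dir=$2
    [ -d "$keep_dir" ] || return 1
    mkdir -p "$dest_dir" || return 1
    for f in "$keep_dir"/*.ptx; do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$dest_dir"/ || return 1
    done
    rm -rf "$keep_dir"
}

# Intended use (assumes nvcc is installed; kernel.cu is a placeholder):
#   mkdir -p nvcc_tmp
#   nvcc -c -keep -keep-dir nvcc_tmp kernel.cu -o kernel.o
#   keep_only_ptx nvcc_tmp ptx
```

Using a separate keep directory means the cleanup never touches the source tree, so a mistake in the script cannot delete the .cu or .o files.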
I have been trying to compile a large Fortran code consisting of many files on an HPC system using CMake. CMake configures properly and generates a Makefile, but when I run make I get the error '/bin/sh: ftn: command not found'. The Makefile tries to compile the code like this:
ftn -o CMakeFiles/s3d.x.dir/modules/param_m.f90.o
When I compile it on my personal system or on another HPC system, the command looks like this:
mpif90 -o CMakeFiles/s3d.x.dir/modules/param_m.f90.o
I don't know why CMake does not substitute an actual compiler for 'ftn'.
I would really appreciate any suggestions.
I have some C++ library code that I want strictly compiled for a quick check, and I don't want any files produced to be used for later stages (assembly, linkage, etc.)
I can do
g++ -S main.cpp
but this will give me an assembly file that I'm just going to wind up deleting anyway.
Is there an option that tells the compiler to just compile a source file without producing any output files?
EDIT[0]: I'm using mingw on Windows.
gcc has the option -fsyntax-only:
Check the code for syntax errors, but don’t do anything beyond that.
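For the g++ case in the question, the invocation could be wrapped as in the sketch below. The helper name check_syntax and the use of a CXX variable to pick the compiler are illustrative assumptions; the only flag taken from the answer is -fsyntax-only.

```shell
# check_syntax FILE...
# Run only the compiler front end on each file. With -fsyntax-only no
# object or assembly file is written.
# CXX defaults to g++; set it to use another GCC-compatible driver.
check_syntax() {
    status=0
    for src in "$@"; do
        if "${CXX:-g++}" -fsyntax-only "$src"; then
            echo "OK   $src"
        else
            echo "FAIL $src"
            status=1
        fi
    done
    return $status
}

# Example: check_syntax main.cpp
```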
It's simple to run the preprocessor on C/C++ code:
g++ -E <file>.cpp
The file passes through the preprocessor, which outputs the preprocessed code.
I have an OpenCL kernel in a .cl file. How can I achieve the same?
This is what I tried, and it failed:
g++ -E -I. -std=c++11 -g -O3 -march=native -I/path/to/opencl/include/ -Wno-unused-result kernel.cl -L/path/to/opencl/lib/x86_64/ -lOpenCL -lquadmath
g++: warning: kernel.cl: linker input file unused because linking not done
Thanks
OpenCL code can run on a different architecture from the one you are compiling on. The preprocessed code may differ depending on compile-time settings that reflect the physical configuration of the target.
The most reliable way to generate the preprocessed code for AMD devices is to ask the framework to save its temporary files, which include the preprocessed output.
On Linux, all you need to do for AMD is set an environment variable, i.e.:
export AMD_OCL_BUILD_OPTIONS_APPEND="-save-temps"
When you compile your OpenCL program you will see a few files in /tmp. The one with the .i extension is the preprocessed file. It may differ from what you would get by running cpp on the host architecture.
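If a host-side approximation is good enough, the warning in the question can be avoided by telling gcc what language the file is in: gcc picks the language from the file extension and treats an unrecognized one such as .cl as linker input. The sketch below shows that idea. The helper name preprocess_cl is hypothetical, and as noted above the result may differ from what the OpenCL runtime's own preprocessor produces for the target device.

```shell
# preprocess_cl KERNEL [extra preprocessor flags...]
# Run only the host C preprocessor over an OpenCL kernel file.
#   -x c  forces the input language, since gcc does not recognize .cl
#   -E    stops after preprocessing
#   -P    omits the '# <line>' markers from the output
# CC defaults to gcc; this is a host-side approximation only.
preprocess_cl() {
    kernel=$1
    shift
    "${CC:-gcc}" -E -P -x c "$@" "$kernel"
}

# Example: preprocess_cl kernel.cl -I. > kernel.i
```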
I was trying to do code coverage on a simple hello world program in C++.
The target device is an arm processor and hence I am using GNU ARM toolchain.
arm-elf-gcc -mcpu=arm7tdmi -O2 -g -c main.c -o main.exe creates a .gcno file but fails to create a .gcda file which is needed by gcov to find out the code coverage.
Normally, when I run g++/gcc -fprofile-arcs -ftest-coverage on a .cpp file, it first creates a .gcno file and an .exe. After I run a.exe, the .gcda file is generated.
Here, when I try to run main.exe to generate the .gcda, I get the error: Program too big to fit in memory.
How do I resolve this issue?
Am I going wrong somewhere?
Thanks,
A-J
Obviously, you have to run your executable on the target device. The target device must have a filesystem. Upon exit, the executable writes coverage information using ordinary POSIX functions - open, fcntl, write, close, etc. Look at gcov-io.c in GCC sources. Make sure you can successfully link libgcov.a into your executable, that you have write permission on the target device, etc.
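For comparison, a native (non-cross) coverage run can be sketched as below, with compile, link, and run as separate steps. Note that -c stops before linking, so it yields an object file whatever name is given to -o. The helper name coverage_demo and the CC/GCOV variables are assumptions for illustration. With a cross toolchain the run step has to happen on the target instead.

```shell
# coverage_demo SRC
# Build SRC (a .c file in the current directory) with coverage
# instrumentation, run it, and produce a gcov report.
# Assumes a native gcc/gcov pair; CC and GCOV may override the tool names.
coverage_demo() {
    src=$1
    base=${src%.c}
    # Compile only: writes base.o and the notes file base.gcno.
    "${CC:-gcc}" -fprofile-arcs -ftest-coverage -c "$src" -o "$base.o" || return 1
    # Link: -fprofile-arcs at link time pulls in libgcov.
    "${CC:-gcc}" -fprofile-arcs "$base.o" -o "$base" || return 1
    # Run: the data file base.gcda is written when the program exits.
    "./$base" || return 1
    # Report: reads base.gcno and base.gcda, writes SRC.gcov.
    "${GCOV:-gcov}" "$src"
}

# Example: coverage_demo main.c
```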
I have a problem compiling a Fortran program with the gfortran compiler.
The main program is located in main.f. So I type in the console:
gfortran D:\test\test.f
But it shows a lot of errors such as:
C:\Users\Efds~1\AppData\Local\Temp\cchFNGgc.o:test.f:<.test+0x3a>: undefined reference to '_gridw_'
C:\Users\Efds~1\AppData\Local\Temp\cchFNGgc.o:test.f:<.test+0x3a>: undefined reference to '_gridz_'
etc.
I think it's because the functions gridw, gridz etc. are located in other *.f files. But I don't know how to link them all together.
I also tried the Compaq Visual Fortran compiler, but it didn't help.
A basic command for compiling and linking multiple source files into one executable would be
gfortran -o executable source1.f source2.f source3.f
taking care that any .f file you specify is named to the right of any other source files on which it depends. All of this, and much more besides, is well covered in the compiler's documentation.
As noted above, you can compile several files with the same command, but it's quite unusual.
You may prefer to first compile to object files (".o"):
gfortran -c gridw.f
gfortran -c gridz.f
And then compile and link the program:
gfortran test.f gridw.o gridz.o
If you have many files to link, it may be worthwhile to build a library:
ar cru mylib.a gridw.o gridz.o
gfortran test.f mylib.a
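If you name your library libSOMETHING.a, you can link it with -lSOMETHING. The linker does not search the current directory by default, so add -L. if the library lives there:
gfortran test.f -L. -lSOMETHING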
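Put together, the object-file and library steps might be scripted as in this sketch. The function name build_with_lib and the FC variable are illustrative assumptions. It expects fixed-form .f files without spaces in their names, and passes -L. so the linker looks for the library in the current directory.

```shell
# build_with_lib MAIN LIBNAME SOURCE...
# Compile each SOURCE to an object file, archive the objects into
# libLIBNAME.a, then compile MAIN and link it against that library.
# FC defaults to gfortran. File names must not contain spaces.
build_with_lib() {
    main_src=$1
    libname=$2
    shift 2
    objs=
    for src in "$@"; do
        obj=${src%.f}.o
        "${FC:-gfortran}" -c "$src" -o "$obj" || return 1
        objs="$objs $obj"
    done
    # r: insert or replace members, c: create the archive quietly,
    # s: write a symbol index
    ar rcs "lib$libname.a" $objs || return 1
    # -L. adds the current directory to the library search path.
    "${FC:-gfortran}" "$main_src" -L. -l"$libname" -o "${main_src%.f}"
}

# Example: build_with_lib test.f grid gridw.f gridz.f
```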