I am trying to measure register spilling in my CUDA project in Visual Studio. To do so I am using the flag -Xptxas -v,-abi=no, as described here:
http://on-demand.gputechconf.com/gtc-express/2011/presentations/register_spilling.pdf
In my VS 2010 project properties I tried putting this flag in:
1. properties / cuda / host / additional compilation flags: no effect.
2. properties / cuda / command line: the compilation exits with -1.
3. properties / c / command line: compilation error.
In the CUDA properties I have also set these flags to Yes: Generate GPU debug information and Verbose PTXAS output. I am looking for the output in the Output window.
How do I do this properly?
I have a GPU with CC = 2.1.
EDIT:
So the correct place to put the flag, as the answers indicate, is properties / cuda / command line. But I still do not get the expected output (even in sample projects). Below are the other options I have set in the properties:
cuda/device.
C interleaved in PTXAS output - No
Code generation - compute_20, sm_21
generate GPU debug info - Yes
max used register - 0
verbose ptxas output (yes/ no - tested both).
I think the steps are pretty straightforward. I did a clean install of VS2010 Express, followed by an install of CUDA 5.0 for windows 7.
I chose the VectorAdd sample code, which is in the CUDA 5.0 samples package. By default, my project was set up to compile for Win32 and Debug.
The only change I had to make was to select Project...Properties...CUDA C/C++...Command Line
I then added the -Xptxas -v options in the Additional Options text box at the bottom of the properties dialog.
After that, press Apply and OK.
Then hit F7 to build the project, and you should see output like this in the Output window (your Output window should automatically display "Build" output when you are compiling):
1>------ Rebuild All started: Project: vectorAdd, Configuration: Debug Win32 -----
1>
1> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"../../common/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -Xptxas -v -g -DWIN32 -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MTd " -o "Win32/Debug/vectorAdd.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\vectorAdd.cu" -clean
1> Compiling CUDA source file vectorAdd.cu...
1>
1> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"../../common/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -Xptxas -v -g -DWIN32 -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MTd " -o "Win32/Debug/vectorAdd.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\vectorAdd.cu"
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_10'
1> ptxas : info : Used 4 registers, 32 bytes smem, 4 bytes cmem[1]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_20'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 48 bytes cmem[0]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_30'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 336 bytes cmem[0]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_35'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 336 bytes cmem[0]
1> tmpxft_00001438_00000000-39_vectorAdd.compute_10.ii
1> vectorAdd_vs2010.vcxproj -> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\../../bin/win32/Debug/vectorAdd.exe
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========
Note that whether or not you see any actual spilling is a function of the code you are compiling. This code has no spilling, but if there were any, this is where the compiler would report it.
You don't need the -abi=no option in order to see the spilling results of the compiler.
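If the build log is long, it can help to pull the spill counters out automatically. The following is a minimal sketch, assuming the ptxas lines have the same wording as in the CUDA 5.0 output shown above; the function name parse_spills and the sample string are made up for illustration, and other toolkit versions may format these lines differently.

```python
import re

# Matches the line that names the kernel and the target architecture.
ENTRY_RE = re.compile(r"Compiling entry function '([^']+)' for '([^']+)'")
# Matches the per-function summary line that carries the spill counters.
SPILL_RE = re.compile(
    r"(\d+) bytes stack frame, (\d+) bytes spill stores, (\d+) bytes spill loads"
)


def parse_spills(log_text):
    """Return a list of (kernel, arch, stack, spill_stores, spill_loads) tuples."""
    results = []
    current = None  # (kernel, arch) taken from the most recent entry-function line
    for line in log_text.splitlines():
        entry = ENTRY_RE.search(line)
        if entry:
            current = (entry.group(1), entry.group(2))
            continue
        spill = SPILL_RE.search(line)
        if spill and current is not None:
            stack, stores, loads = (int(g) for g in spill.groups())
            results.append((current[0], current[1], stack, stores, loads))
    return results


# Hypothetical sample: a few lines copied from the build output above.
SAMPLE_LOG = """\
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_20'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 48 bytes cmem[0]
"""

if __name__ == "__main__":
    for kernel, arch, stack, stores, loads in parse_spills(SAMPLE_LOG):
        status = "spills" if (stores or loads) else "no spills"
        print(f"{kernel} [{arch}]: {status} "
              f"(stack={stack}, stores={stores}, loads={loads})")
```

To use it on a real build, copy the contents of the Output window into a text file and pass the file contents to parse_spills. Targets such as sm_10 that print no "spill stores" line simply produce no entry.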
Note that individual file options can override project settings (right click on one of your project source files, then click properties), but if you haven't modified any of these, they should not override your project settings.
There are probably other project settings that can interfere with this as well, so my suggestion is to try one of the CUDA sample codes that you haven't modified, and use the above steps as a sanity check to demonstrate that you can get it working there first. Then try it on your project.
Make sure you are modifying the settings (e.g. Win32/x64, Release/Debug) that correspond to the project you are actually building.
EDIT: The above case uses CUDA 5.0. The original question did not specify CUDA version. I found that with a previous version of CUDA in Visual Studio, the command line "Additional options" method did not seem to work, but using the selection/dropdown box to specify Verbose PTXAS output (Yes) did work.
EDIT2: OK I did a clean install of VS2010 followed by a clean install of CUDA 4.2 toolkit, and I was able to reproduce the issue. I used the following steps to be able to see the actual ptxas verbose output:
In Tools...Settings select "Expert Settings"
In Project...Properties...Configuration Properties...CUDA C/C++...Device change the ptxas verbose drop-down box to "Yes (--ptxas-options=-v)"
In Tools...Options...Projects and Solutions...Build and Run change the "MSBuild project build output verbosity" setting from "Minimal" to "Normal"
Then select Build...Rebuild Solution, and you should see the ptxas verbose output in the build output window.
I am using --ptxas-options=-v (without spaces), but admittedly I am still using some older CUDA version.
As for where to put:
Ad 1) properties / cuda / host / additional compilation flags -- this will alter your CPU code compilation of the CUDA source (functions marked as __host__). This is -not- where you want to put the flag.
Ad 2) properties / cuda / command line -- this should alter your GPU code compilation. If compiling exits with an error, what is the error message?
Ad 3) properties / c / command line -- this will affect your native C/C++ compiler, which understands neither --ptxas-options nor -Xptxas.
Related
I am using Visual Studio 2017 for my projects and would like to use the Visual Leak Detector to find memory leaks.
Furthermore, the project is set up to use CMake, so I included the following lines in my CMakeLists.txt:
cmake_minimum_required(VERSION 3.11)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++14 -Wall -pedantic")
message("Enabling Visual Leak Detector")
find_library(VLD vld HINTS "C:/somePath/Visual Leak Detector/lib/Win64/")
find_file(VLD_H vld.h HINTS "C:/somePath/Visual Leak Detector/include")
include_directories("${VLD_H}")
link_libraries("${VLD}")
message("${VLD_H}")
message("${VLD}")
message("Enabling Visual Leak Detector Done")
add_executable(main main.cpp)
However, Visual Studio doesn't seem to find vld.h under this configuration.
If I use the automatically included/linked libraries and directories in a regular solution, VLD is found, can be included, and runs, which makes me believe that the above CMakeLists.txt just needs to be tweaked a bit.
Does anyone have an idea what I could do or try?
Error:
Error C1083 Cannot open include file: 'vld.h': No such file or directory
Error (active) E1696 cannot open source file "vld.h" main - x64-Debug
Output:
1> Command line: C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO\2017\COMMUNITY\COMMON7\IDE\COMMONEXTENSIONS\MICROSOFT\CMAKE\CMake\bin\cmake.exe -G "Ninja" -DCMAKE_INSTALL_PREFIX:PATH="C:\something\CMakeBuilds\8ab4cf5c-e7da-fe3d-9c4a-91e8ace77e1f\install\x64-Debug" -DCMAKE_CXX_COMPILER="C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.15.26726/bin/HostX64/x64/cl.exe" -DCMAKE_C_COMPILER="C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.15.26726/bin/HostX64/x64/cl.exe" -DCMAKE_BUILD_TYPE="Debug" -DCMAKE_MAKE_PROGRAM="C:\PROGRAM FILES (X86)\MICROSOFT VISUAL STUDIO\2017\COMMUNITY\COMMON7\IDE\COMMONEXTENSIONS\MICROSOFT\CMAKE\Ninja\ninja.exe" "C:\some\Advpt\Ex1"
1> Working directory: C:\something\CMakeBuilds\8ab4cf5c-e7da-fe3d-9c4a-91e8ace77e1f\build\x64-Debug
1> Enabling Visual Leak Detector
1> C:/somePath/Visual Leak Detector/include/vld.h
1> C:/somePath/Visual Leak Detector/lib/Win64/vld.lib
1> Enabling Visual Leak Detector Done
1> -- Configuring done
1> -- Generating done
1> -- Build files have been written to: C:/something/CMakeBuilds/8ab4cf5c-e7da-fe3d-9c4a-91e8ace77e1f/build/x64-Debug
1> Starting CMake target info extraction ...
1> CMake server connection made.
1> Extracted includes paths.
1> Extracted CMake variables.
1> Extracted source files and headers.
1> Extracted global settings.
1> Extracted code model.
1> Extracted CTest info.
1> Collating data ...
1> Target info extraction done.
I am studying a program that compiles successfully for CUDA (v6.5). But when I switch to OpenMP I get the following errors:
Error 9 error C2668: 'thrust::raw_reference_cast' : ambiguous call to overloaded function C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\thrust\detail\function.h 55 1
Error 10 error C2780: 'enable_if_non_const_reference_or_tuple_of_iterator_references::type>::type thrust::detail::host_unary_transform_functor::operator ()(Tuple)' : expects 1 arguments - 0 provided C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\thrust\detail\function.h 55 1
Error 11 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -IC:\Boost -I"C:\Program Files (x86)\Microsoft SDKs\MPI\Include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" --keep-dir x64\Debug-OMP(home) -maxrregcount=0 --machine 64 --compile -cudart static -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP -DENABLE_LOGGING -DENABLE_DEBUG_LOGGING -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd /openmp" -o build\gopt-mpi\x64\Debug-OMP(home)\kernels_factory.cu.obj "C:\ALI\APOP\gopt-mpi\APOP_projects\src\cuda\kernels_factory.cu"" exited with code 2. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets 593 9
I work under Windows with Visual Studio 2013 Community and switched to OpenMP according to https://github.com/thrust/thrust/wiki/Device-Backends. I made two changes in the project properties, both under CUDA C/C++, Host:
1. "Additional compiler options" set to "/openmp"
2. "Preprocessor definitions" set to THRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP
What should I do to correct the situation?
I've installed CUDA 7.5 on Windows 7-SP1 and I'm using Visual Studio 2013.
Unfortunately, I can't run any CUDA code. I can't even build the sample bandwidthTest. I get the following error:
C:\ProgramData\NVIDIA Corporation\CUDA
Samples\v7.5\1_Utilities\bandwidthTest>"C:\Program Files\NVIDIA GPU
Computing Toolkit\CUDA\v7.5\bin\nvcc.exe"
-gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" -gencode=arch=compute_37,code=\"sm_37,compute_37\" -gencode=arch=compute_50,code=\"sm_50,compute_50\" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I./ -I../../common/inc -I./
-I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5/include" -I../../common/inc -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" --keep-dir x64\Release -maxrregcount=0
--machine 64 --compile -cudart static -Xcompiler "/wd 4819" -DWIN32 -DWIN32 -D_MBCS -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MT " -o x64/Release/bandwidthTest.cu.obj "C:\ProgramData\NVIDIA
Corporation\CUDA
Samples\v7.5\1_Utilities\bandwidthTest\bandwidthTest.cu" 1> nvcc
fatal : Compiler 'cl.exe' in PATH different than the one specified
with -ccbin 1>C:\Program Files
(x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA
7.5.targets(604,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe"
-gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" -gencode=arch=compute_37,code=\"sm_37,compute_37\" -gencode=arch=compute_50,code=\"sm_50,compute_50\" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I./ -I../../common/inc -I./
-I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5/include" -I../../common/inc -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" --keep-dir x64\Release -maxrregcount=0
--machine 64 --compile -cudart static -Xcompiler "/wd 4819" -DWIN32 -DWIN32 -D_MBCS -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MT " -o x64/Release/bandwidthTest.cu.obj "C:\ProgramData\NVIDIA
Corporation\CUDA
Samples\v7.5\1_Utilities\bandwidthTest\bandwidthTest.cu"" exited with
code 1.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
My environment variables are:
Path:
C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\bin\x86_amd64;C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin;C:\Program Files\Microsoft Visual Studio 12.0\Common7\IDE;C:\TDM-GCC-64\bin;C:\TDM-GCC-64\x86_64-w64-mingw32;C:\ProgramData\Oracle\Java\javapath;C:\Program
Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin;C:\Program
Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\libnvvp;c:\Program Files
(x86)\Intel\iCLS Client\;c:\Program Files\Intel\iCLS
Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;c:\Program
Files (x86)\Hewlett-Packard\HP Performance Advisor;C:\Program
Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program
Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files
(x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files
(x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files
(x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft
SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
Server\100\DTS\Binn\;C:\Program Files\TortoiseSVN\bin;C:\Program Files
(x86)\Windows Kits\8.1\Windows Performance Toolkit\;C:\Program
Files\Microsoft SQL Server\110\Tools\Binn\;C:\Program Files
(x86)\Microsoft SDKs\TypeScript\1.0\;C:\Program Files (x86)\Microsoft
SQL Server\110\Tools\Binn\;C:\Program Files\Microsoft SQL
Server\110\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL
Server\110\Tools\Binn\ManagementStudio\;C:\Program Files
(x86)\Microsoft SQL Server\110\DTS\Binn\;C:\Program
Files\R\R-3.1.3\bin\i386\;C:\Program Files (x86)\NVIDIA
Corporation\PhysX\Common
CUDA_PATH:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
CUDA_PATH_V7_5:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
I also changed
INCLUDES += "-I$(TOP)/include" $(SPACE)
to
INCLUDES += "-I$(TOP)/include" "-I$(TOP)/include/cudart" "-IC:/Program
Files (x86)/Microsoft Visual Studio 12.0/VC/include" $(SPACE)
in nvcc.profile.
But no luck so far!! :(
Could you please help me?!
Thanks
Problem
There are three versions of the Visual Studio compiler: for the x86, x86_64 and ARM platforms (and I heard a fourth is coming soon).
The problem most likely comes from the fact that you are compiling for a platform that is different from the platform of the compiler you have in PATH.
Solution
You should never have Visual Studio's bin folders in your global PATH variable.
Remove everything related to Visual Studio from your PATH. Visual Studio IDE and CUDA tools are smart enough to find the compiler without your help (via registry entries).
If you want to run developer tools from the command line (without IDE) at the same time, use:
Visual Studio Command prompt in Start menu (which uses vcvarsall.bat script)
or use vcvarsall.bat directly (which temporarily sets up the environment for building against a platform given as a parameter)
or roll out your own script
If the above answer doesn't work for you, here's what I did with Visual Studio 2013 and CUDA 6.5 for an x64 compile.
I edited
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.props
so that it contains:
<VCBinDir Condition="'$(Platform)' == 'Win32'">$(VC_ExecutablePath_x86_x86)</VCBinDir>
<VCBinDir Condition="'$(Platform)' == 'x64'">$(VC_ExecutablePath_x64)</VCBinDir>
I had the same problem. The root cause was a duplicate cl.exe: I had BullseyeCoverage (a tool that analyzes code coverage) installed, and it ships its own program named cl.exe. After I uninstalled it, the problem was gone!
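To check whether more than one cl.exe is reachable through PATH, a small script can list every PATH directory that contains a file with that name. This is only a sketch: the helper name find_on_path is made up for illustration, and it looks at the PATH of the process that runs it, which may differ from the environment Visual Studio passes to nvcc.

```python
import os


def find_on_path(exe_name, path_value=None):
    """Return, in PATH order, every directory that contains a file named exe_name."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        # PATH entries are sometimes quoted or padded with spaces.
        directory = directory.strip().strip('"')
        if not directory:
            continue
        candidate = os.path.join(directory, exe_name)
        if os.path.isfile(candidate) and directory not in hits:
            hits.append(directory)
    return hits


if __name__ == "__main__":
    matches = find_on_path("cl.exe")
    if len(matches) > 1:
        print("More than one cl.exe is visible through PATH:")
    for directory in matches:
        print("  " + directory)
```

If the list contains more than one directory, or a directory that does not belong to the Visual Studio version given with -ccbin, that entry is a likely candidate for the "Compiler 'cl.exe' in PATH different than the one specified with -ccbin" error.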
I am working on CUDA with Visual Studio 2010. I installed the CUDA toolkit and SDK but one of the SDK examples is not building successfully.
The output console shows:
1>_CUDA_Build_Rule:
1> Compiling with CUDA Build Rule...
1> The system cannot find the path specified.
1>E:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\common\Cuda.targets(45,5): error MSB3721: The command "echo "$(CUDA_BIN_PATH)\nvcc.exe" -arch sm_10 -ccbin "E:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -Xcompiler "/EHsc /W3 /nologo /Od /Zi /MTd " -I"E:\CUDA\include;../../common/inc" -maxrregcount=32 --compile -o "$(IntDir)\$(InputName).cu.obj" "E:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\src\bandwidthTest\bandwidthTest.cu"
1>E:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\common\Cuda.targets(45,5): error MSB3721: "$(CUDA_BIN_PATH)\nvcc.exe" -arch sm_10 -ccbin "E:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -Xcompiler "/EHsc /W3 /nologo /Od /Zi /MTd " -I"E:\CUDA\include;../../common/inc" -maxrregcount=32 --compile -o "$(IntDir)\$(InputName).cu.obj" "E:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\src\bandwidthTest\bandwidthTest.cu"" exited with code 1.
1>
1>Build FAILED.
There is some problem with nvcc.exe.
When I execute nvcc.exe in the command prompt, it shows:
nvcc fatal: No input file specified
I'm afraid yours is a non-programming question.
Anyway, if you run nvcc from the command line without arguments, you get that error message because you are not specifying which file you want to compile.
Below, I'm pointing out some other threads with the same problem you detailed. I hope they could be useful to you:
Visual Studio 2010 - how to fix Error MSB3721 - exiting with code 1
CUDA Visual Studio 2010 Express build error
Fixing Visual Studio Express error when cleaning 64-bit projects using CUDA 4.1 nvcc compiler
I was trying to compile some CUDA codes under visual studio 2010 with CUDA 4.2 (I created this CUDA project using Parallel Nsight 2.2), but I encountered an atomic problem "error : identifier "atomicAdd" is undefined", which I still can't solve after checking several forums.
So I tried to get some information from the CUDA SDK samples. First, I ran the simpleAtomicIntrinsics sample in the CUDA SDK, which passed its test. Then I copied all the files in this sample to a new CUDA 4.2 project in Visual Studio 2010 and compiled them. Here is the result:
1> E:\CUDA exercise Codes\CUDA_EXERCISES\CUDA_EXERCISES\CUDA_EXERCISES>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_10,code=\"sm_10,compute_10\" --use-local-env --cl-version 2010 -ccbin "c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\simpleAtomicIntrinsics.cu.obj" "E:\CUDA exercise Codes\CUDA_EXERCISES\CUDA_EXERCISES\CUDA_EXERCISES\simpleAtomicIntrinsics.cu"
1> simpleAtomicIntrinsics.cu
1> tmpxft_00007220_00000000-3_simpleAtomicIntrinsics.compute_20.cudafe1.gpu
1> tmpxft_00007220_00000000-7_simpleAtomicIntrinsics.compute_20.cudafe2.gpu
1> simpleAtomicIntrinsics.cu
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(33): error : identifier "atomicAdd" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(36): error : identifier "atomicSub" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(39): error : identifier "atomicExch" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(42): error : identifier "atomicMax" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(45): error : identifier "atomicMin" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(48): error : identifier "atomicInc" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(51): error : identifier "atomicDec" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(54): error : identifier "atomicCAS" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(59): error : identifier "atomicAnd" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(62): error : identifier "atomicOr" is undefined
1>
1>e:\cuda exercise codes\cuda_exercises\cuda_exercises\cuda_exercises\simpleAtomicIntrinsics_kernel.cu(65): error : identifier "atomicXor" is undefined
1>
1> 11 errors detected in the compilation of "C:/Users/NIEXIA~1/AppData/Local/Temp/tmpxft_00007220_00000000-9_simpleAtomicIntrinsics.compute_10.cpp1.ii".
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 4.2.targets(361,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_10,code=\"sm_10,compute_10\" --use-local-env --cl-version 2010 -ccbin "c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\simpleAtomicIntrinsics.cu.obj" "E:\CUDA exercise Codes\CUDA_EXERCISES\CUDA_EXERCISES\CUDA_EXERCISES\simpleAtomicIntrinsics.cu"" exited with code 2.
1>
1>Build FAILED.
By the way, I can run other samples such as clock, matrixMul, etc. under this vs2010 CUDA Project. (This means the include path is set correctly)
I googled it and found the following link: "Some issue with Atomic add in CUDA kernel operation". I changed the properties of both the project and the .cu file according to it, but still can't solve the problem.
Any suggestion?
Try to compile with the flag -arch sm_20
Atomics are unavailable under compute architecture 1.0, but you're still trying to compile for it according to your build log. Try removing references to compute_10 and sm_10 from your CUDA project properties and compiling for just compute architecture 2.0 (GeForce 400 series and newer).
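A quick way to spot such leftover targets is to scan the nvcc command line from the build log. The sketch below is illustrative only: the helper name targets_below is invented, it recognizes just the -gencode=arch=compute_NN and -arch sm_NN spellings that appear in this thread, and the threshold of 11 assumes that the atomics in question need at least compute capability 1.1.

```python
import re

# Matches the virtual architecture in flags such as -gencode=arch=compute_20,...
GENCODE_RE = re.compile(r"-gencode[= ]arch=compute_(\d+)")
# Matches the shorthand form, e.g. "-arch sm_20" or "-arch=sm_20".
ARCH_RE = re.compile(r"-arch[= ]+(?:sm|compute)_(\d+)")


def targets_below(command_line, minimum):
    """Return the sorted architecture numbers in command_line that are below minimum."""
    found = {int(m) for m in GENCODE_RE.findall(command_line)}
    found |= {int(m) for m in ARCH_RE.findall(command_line)}
    return sorted(arch for arch in found if arch < minimum)


# The -gencode flags copied from the build log in the question.
SAMPLE_CMD = (
    r'-gencode=arch=compute_20,code=\"sm_20,compute_20\" '
    r'-gencode=arch=compute_10,code=\"sm_10,compute_10\"'
)

if __name__ == "__main__":
    # Assumption: 11 stands for compute capability 1.1.
    too_old = targets_below(SAMPLE_CMD, 11)
    if too_old:
        print("Targets that cannot use these atomics:",
              ", ".join(f"compute_{arch}" for arch in too_old))
```

For the command line in the question this reports compute_10, which matches the "11 errors detected in the compilation of ... compute_10.cpp1.ii" line in the log.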
"Atomics are unavailable under compute architecture 1.0, but you're still trying to compile for it according to your build log. Try removing references to compute_10 and sm_10 from your CUDA project properties and compiling for just compute architecture 2.0 (GeForce 400 series and newer)."
That's absolutely right. BTW, if you are compiling rodrigob_doppia's source code (boosted_learning), you can just add this line to your machine configuration:
set(CUDA_NVCC_FLAGS "-arch=sm_20" CACHE STRING "nvcc flags" FORCE)
This simply switches the arch flag to sm_20, the same as suggested above.