The minimum required Cuda capability is 3.5 - amazon-ec2

After installing TensorFlow and its dependencies on a g2.2xlarge EC2 instance I tried to run an MNIST example from the getting started page:
python tensorflow/models/image/mnist/convolutional.py
But I get the following warning:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device
(device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute
capability 3.0. The minimum required Cuda capability is 3.5.
Is this a hard requirement? Any chance I could comment that check out in a fork of TensorFlow? It would be super nice to be able to train models in AWS.

There is a section in the official installation page that explains how to enable Cuda compute capability 3.0, but you need to build TensorFlow from source.
$ TF_UNOFFICIAL_SETTING=1 ./configure
# Same as the official settings above
WARNING: You are configuring unofficial settings in TensorFlow. Because some
external libraries are not backward compatible, these settings are largely
untested and unsupported.
Please specify a list of comma-separated Cuda compute capabilities you want to
build with. You can find the compute capability of your device at:
https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases
your build time and binary size. [Default is: "3.5,5.2"]: 3.0
Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished

Currently only GPUs with compute capability >= 3.5 are officially supported. However, GitHub user @infojunkie has offered a patch that makes it possible to use TensorFlow with a GPU with compute capability 3.0.
The official fix is in development. Meanwhile, check out the discussion on the GitHub issue for adding this support.

There is a simple trick. You don't even have to build TF from source.
In the file tensorflow\python\_pywrap_tensorflow.pyd there are two matches of the regex 3\.5.*5\.2. Just replace the 3.5 in both matches with 3.0.
Tested on Windows 10, Anaconda 4.2.13, Python 3.5.2, TensorFlow 0.12, CUDA 8, NVidia GTX 660m (CUDA cap. 3.0).
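The replacement described above can also be scripted instead of done in a hex editor. The following is only a sketch of that idea: the function names are hypothetical, and the non-greedy match with a 64-byte cap is an assumption added so that an unrelated "3.5" elsewhere in the binary is left alone. Because "3.0" has the same length as "3.5", the file size does not change.

```python
import re
import shutil
from pathlib import Path

# Regex from the answer: "3.5", then any bytes, then "5.2".
# Assumption: non-greedy and capped at 64 bytes to avoid matching across
# unrelated parts of the binary.
PATTERN = re.compile(rb"3\.5(.{0,64}?5\.2)")


def patch_capability(data):
    """Return (patched_bytes, replacement_count); the length is unchanged."""
    return PATTERN.subn(rb"3.0\1", data)


def patch_file(path):
    """Back up the file next to itself as <name>.bak, then patch it in place."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    patched, count = patch_capability(path.read_bytes())
    path.write_bytes(patched)
    return count
```

Per the answer, patch_file should report 2 replacements for TensorFlow 0.12; any other count suggests the binary differs from the one described and the backup should be restored.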

Related

Why is Tensorflow GPU extremely slow when creating models and training models compared to the CPU version?

I would first like to give you some information about how I installed tensorflow and other packages before explaining the problem. It took me a lot of time to get tensorflow running on my GPU (Nvidia RTX 3070, Windows 10 system). First, I installed CUDA (v10.1), downloaded cuDNN (v7.6), and copied the cuDNN files into the correct CUDA installation folders (as described here: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-windows).
I want to use tensorflow 2.3.0 and checked if the Cuda and cuDNN versions are compatible using the table on this page: https://www.tensorflow.org/install/source
Then I opened the anaconda prompt window, activated my new environment (>> activate [MyEnv]) and installed the required packages. I read that it is important to install tensorflow first, so the first package I installed was tensorflow-gpu, followed by a bunch of other packages. Later, I ran into the problem that my GPU was not found when I typed in
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
The response was "Num GPUs Available: 0"
I did a lot of googling and found the following discussion:
https://github.com/ContinuumIO/anaconda-issues/issues/12194#issuecomment-751700156
where it is mentioned that a faulty tensorflow build is installed when using conda install tensorflow-gpu in the anaconda prompt window. Instead (when using Python 3.8, as I do), one has to pass the correct tensorflow build explicitly in the prompt window. So I set up a new environment and used
conda install tensorflow-gpu=2.3 tensorflow=2.3=mkl_py38h1fcfbd6_0
to install tensorflow. So far so good. Now the cudatoolkit (version 10.1.243) and cudnn (version 7.6.5) packages, which were missing in my first environment, are included with the tensorflow package and thus in my second environment [MyEnv2].
I start VSCode, select the correct environment, and retest if the gpu can be found by repeating the test:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
And... it works. The GPU is found and everything looks good at first sight.
So what's the problem?
Using tensorflow on the GPU is extremely slow. Not only when training models, but also when creating the model with
model = models.Sequential()
model.add(...)
(...)
model.summary()
Running the same code sample on the CPU finishes almost immediately, whereas running it on the GPU takes more than 10 minutes! (When I look at the Task Manager performance tab, nothing happens: neither the CPU nor the GPU seems to do anything while the code runs on the GPU.) And this happens when just creating the model, without any training!
After compiling the model and starting the training, the same problem occurs. Training on the CPU gives me immediate feedback on the epoch progress, while training on the GPU seems to freeze the program, as nothing happens for several minutes. (Maybe "freezing" is the wrong word: I can still switch between tabs in VSCode, so the program itself is not frozen.) Another confusing aspect is that when the training on the GPU finally starts after minutes of waiting, I only get NaNs for the loss and MAE. In the Task Manager I can observe that the model needs about 7.5 GB of VRAM; the RTX 3070 comes with 8 GB. When I run the same code on the CPU, loss and MAE look perfectly fine...
I really have no idea what is happening and why I am getting this strange behaviour when I run tensorflow on my gpu. Do you have any ideas?
Nvidia RTX 3070 cards are based on the Ampere architecture, for which compatible CUDA versions start with 11.x.
You can upgrade TensorFlow from 2.3 to 2.4, because that version supports the Ampere architecture.
So to benefit from your GPU card, the compatible combination is TF 2.4, CUDA 11.0 and cuDNN 8.0.
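The pairing above can be written down as a small check to run before setting up an environment. This is a minimal sketch that encodes only the combinations quoted in this thread (TF 2.3 with CUDA 10.1 and cuDNN 7.6, TF 2.4 with CUDA 11.0 and cuDNN 8.0) and the statement that Ampere needs CUDA 11.x or newer. The table and function names are hypothetical, and the table is not a complete compatibility list.

```python
# Build pairings quoted in this thread; not an exhaustive compatibility table.
TF_BUILDS = {
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}

# Per the answer above, Ampere cards need CUDA 11.x or newer.
MIN_CUDA_FOR_AMPERE = (11, 0)


def parse_version(text):
    """Turn a dotted version such as "10.1.243" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_ampere(tf_version):
    """True if the listed build of tf_version uses CUDA 11.0 or newer."""
    build = TF_BUILDS[tf_version]
    return parse_version(build["cuda"]) >= MIN_CUDA_FOR_AMPERE


for version in sorted(TF_BUILDS):
    print(version, "supports Ampere:", supports_ampere(version))
```

On an installed GPU build of TensorFlow 2.3 or later, tf.sysconfig.get_build_info() should report the CUDA and cuDNN versions the binary was built against, which can be compared with the table in the same way.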

Installing CUDA Windows 10

I am trying to install the CUDA toolkit in order to be able to use ThunderSVM on my personal computer.
However I keep getting the following message in the GUI installer:
"You already have a newer version of the NVIDIA Frameview SDK installed"
I read in the CUDA forums that this most probably results from having installed Geforce Experience (which I have installed). So I tried removing it from the Programs and Features windows panel. However I still got the error, so my guess is that the "Nvidia Corporation" folder was not removed.
In the same question, they also suggested performing a custom install. However, I could not find any information on how to do a custom install of the CUDA toolkit. I would really appreciate it if someone could explain how to do this custom install or how to safely remove the previous drivers. I thought of using DDU, but I read that it may sometimes lead to trouble.
I had the same problem while I was trying to get TensorFlow to use my NVIDIA GTX1070 GPU for calculations. Here's what allowed me to perform the CUDA Toolkit installation on my Windows 10 machine.
As the error message in the installer says - you already have a newer Frameview SDK installed. It was the case for me.
Go to Settings/Uninstall or modify programs.
Remove the NVIDIA Frameview program. It should be there with GeForce Experience, PhysX, etc.
Uninstalling only this NVIDIA program didn't cause any driver problems for my machine and I was able to progress through the CUDA Toolkit installation.
I just ran into the same problem and fixed it.
This problem occurred because you chose the default installation configuration, which may include components that are already installed. In my case, I already had NVIDIA Nsight Compute installed, and it was the culprit during my first few install attempts.
Unchecking the redundant components should help.

Getting a better CPU-based OpenCL driver (OS X)

While browsing the Web, I came across this page from the PyOpenCL project:
Py OpenCL Mac OS Install Readme
On this page, something strange is alleged:
"OS X has support for both CPU- and GPU-based OpenCL built in. Unfortunately, the built-in drivers can be temperamental, and they have not advanced as quickly as one might like. To make PyOpenCL use a more up-to-date (and open-source) CPU-based OpenCL driver, type the following:
conda install osx-pocl-opencl pocl (OS X)
Note that, by installing osx-pocl-opencl, you will no longer be able to use PyOpenCL to talk to the system-wide Apple OpenCL drivers. To regain access to those drivers, simply uninstall osx-pocl-opencl and reinstall pyopencl afterwards."
Is this true? Are there faster OpenCL drivers for MacOS?
I'm not interested in installing PyOpenCL. Is there a way for me to get my hands on those "faster" drivers?
It seems that they're suggesting you use pocl, an open source OpenCL implementation with support for OpenCL 1.2 features (and some 2.0 features). I can't comment on the performance, but it's definitely true that the official Mac OS OpenCL drivers are pretty finicky.
I found this readme detailing the steps to build and install pocl for OSX.

H2O XGBoost bug and OS limitation

I have two questions.
1)
I'm testing H2O 3.10.5.1 version for xgboost modeling.
There is a known bug (PUBDEV-4585) that binary save/load of xgboost doesn't work.
Has it been fixed in the recent version? Confirmation is needed in order to make a decision with the server admin whether to upgrade the system or not.
2)
H2O.ai xgboost documentation says there is some limitation to platforms.
The "compilation OS" is Ubuntu 14.04, but is there a limitation for other Linux distributions such as Red Hat?
h2o.xgboost.available() returns TRUE but I need to make sure.
Thanks
Ad.1 Yes it's been fixed in version 3.18.0.1
Ad.2 The distro itself isn't really important. What matters more is which exact version of Red Hat we are talking about (since different versions come bundled with different library versions) and whether you can upgrade libraries on your own if necessary. For example, if you want to run the GPU version you'll need a certain version of glibc (2.17 or newer, if I remember correctly). For the CPU version most recent Linux distributions should be ok.
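The glibc requirement mentioned above can be checked on the target server before deciding on an upgrade. A minimal sketch, assuming the 2.17 threshold recalled in the answer (it is stated from memory there, so treat it as unconfirmed); the helper names are hypothetical, while platform.libc_ver() is part of the Python standard library.

```python
import platform


def version_tuple(text):
    """Turn "2.17" into (2, 17); non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(minimum="2.17", detected=None):
    """Compare a glibc version against a minimum.

    If detected is None, the running interpreter's libc is queried with
    platform.libc_ver(); on systems that do not use glibc this returns False.
    """
    if detected is None:
        lib, detected = platform.libc_ver()
        if lib != "glibc":
            return False
    return version_tuple(detected) >= version_tuple(minimum)


print("glibc >= 2.17:", glibc_at_least())
```

Passing detected explicitly, for example glibc_at_least("2.17", detected="2.12"), makes it possible to evaluate a version reported by the server admin without running the script on that machine.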

installing intel opencl sdk but cannot find platform at clGetPlatformIDs

I want to install the Intel OpenCL SDK, and I followed everything in the OpenCL installation guide on Intel's website:
http://software.intel.com/en-us/articles/intel-sdk-for-opencl-applications-xe-2013-release-notes#_Installation_Notes
I did everything written there, but it doesn't work.
Specifically, I can compile the source, but at runtime the clGetPlatformIDs function cannot find a platform. The error code is -1001, and there is no -1001 error code in the cl.h file. If I uninstall the SDK, then naturally I cannot compile at all: the compiler reports that it cannot find many functions and defined values. After I install the OpenCL SDK, those messages disappear and the code compiles properly, so I think the installation itself succeeded. But at runtime, no platform is found. What's the problem? I've been struggling with this for about a week. Please help me.
---add---
I forgot to mention my OS and hardware:
My OS is Red Hat Enterprise 6.3 (Santiago).
My CPU is Intel Xeon CPU E5-2690.
The code I tried works without problems on other machines and platforms.
Thanks.
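For what it is worth, -1001 is CL_PLATFORM_NOT_FOUND_KHR, which is defined in cl_ext.h by the cl_khr_icd extension rather than in cl.h. The ICD loader returns it when it finds no registered vendor platform. On Linux the loader conventionally looks for *.icd files in /etc/OpenCL/vendors, each containing the name or path of a vendor library. The sketch below lists those registrations; the function name is hypothetical, and whether the Intel SDK registers itself in this directory on Red Hat 6.3 is an assumption to verify, not something stated in the question.

```python
from pathlib import Path


def list_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Return [(icd_file_name, library, found)] for each *.icd file.

    found is True or False for absolute library paths, and None for bare
    library names, which are resolved by the dynamic linker and cannot be
    checked by path alone.
    """
    entries = []
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return entries
    for icd_file in sorted(directory.glob("*.icd")):
        library = icd_file.read_text().strip()
        found = Path(library).is_file() if library.startswith("/") else None
        entries.append((icd_file.name, library, found))
    return entries


for name, library, found in list_icd_entries():
    print(name, "->", library, "(found: %s)" % found)
```

An empty result, or an entry pointing at a library that does not exist, would be consistent with code that compiles and links but finds no platform at runtime.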
