GPU driver and CUDA are not enabled and accessible by PyTorch on macOS

PyTorch cannot see the GPU driver or CUDA: torch.cuda.is_available() returns False.
I am using macOS Mojave 10.14.6.
I have installed the CUDA 10.0 build of PyTorch.
I ran the verification steps from https://pytorch.org/get-started/locally/ and constructing a randomly initialized tensor works fine.
But when I tried
import torch
torch.cuda.is_available()
it returned False.
So I followed the instructions on the PyTorch site and installed Anaconda and CUDA.
Then I tried this:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
In the terminal, I got:
fatal error: 'string.h' file not found
#include_next <string.h>
I searched Stack Overflow and found the question "Build Pytorch from source". So I tried:
$ find /Library/Developer/CommandLineTools/usr -type f -name string.h
which returned /Library/Developer/CommandLineTools/usr/include/c++/v1/string.h
Doesn't this mean I already have string.h?
How can I solve this problem?

Are you installing from a conda env? According to the GitHub README, this should work:
- Create a conda env
- conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing (that installs some requirements)
Then this (which I assume you've already done):
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive
And finally set up the conda variable and install:
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
There are issues on GitHub reporting this behavior, which suggest adding NO_CUDA=1 to the build command:
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ NO_CUDA=1 python setup.py install
Note the NO_CUDA=1 flag. This issue has also been mentioned in the forums, and it may be caused by the OS and driver versions. If that is the case, I recommend trying Nvidia Docker (hopefully it has Mac support) with a PyTorch container from https://ngc.nvidia.com/catalog/landing
If that also fails, your best bet is to install without CUDA support.
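If you end up with a CPU-only build, it helps to make scripts fall back to the CPU instead of assuming CUDA. A minimal sketch, assuming a hypothetical helper named pick_device (it is not part of PyTorch); the only PyTorch call it relies on is torch.cuda.is_available(), the same check used in the question:

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable CUDA device, else "cpu".

    pick_device is a hypothetical helper name used only for illustration.
    """
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Using device:", pick_device())
```

The returned string can then be passed to torch.device(...) when creating tensors or moving a model.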

Related

How can I install LightGBM? Error about "Could NOT find OpenMP_C" on Mac

I am using macOS and installing LightGBM:
pip uninstall lightgbm
git clone --recursive https://github.com/Microsoft/LightGBM ; cd LightGBM
export CXX=g++-8 CC=gcc-8
mkdir build ; cd build
cmake ..
make -j4
I can't install gcc@8, gcc@7, and so on,
so I tried brew install gcc and it worked.
But cmake .. failed. The error message:
Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
error
Could NOT find OpenMP_C
LightGBM uses OpenMP to parallelize some computations. If you are building small models and / or working with small datasets and do not need the speedups this parallelism offers, you can install lightgbm (the LightGBM Python package) from source without OpenMP support.
git clone --recursive https://github.com/Microsoft/LightGBM
cd LightGBM/python-package
pip install --install-option="--nomp" .
To build lightgbm with OpenMP support, install OpenMP on your system. Since you said you are on a Mac, you could (for example) use the Homebrew package manager to do that.
brew install libomp
Additional information on building lightgbm can be found in the LightGBM documentation.
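Before rebuilding, you can ask Python whether the dynamic loader can see an OpenMP runtime at all. A rough sketch using only the standard library; openmp_runtime_path is a hypothetical helper name, and a None result is only a hint, because ctypes.util.find_library does not search every location (a Homebrew prefix outside the default search path can be missed):

```python
import ctypes.util


def openmp_runtime_path():
    """Return the name or path of an OpenMP runtime the loader finds, or None."""
    # "omp" is LLVM's runtime (what Homebrew's libomp provides);
    # "gomp" is GCC's runtime.
    for name in ("omp", "gomp"):
        found = ctypes.util.find_library(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    print("OpenMP runtime:", openmp_runtime_path())
```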

Issues with scikit-bio on Mac M1

I tried installing scikit-bio by running conda install -c https://conda.anaconda.org/biocore scikit-bio per the documentation, but verifying the installation via python -m skbio.test yielded the following error: Error while finding module specification for 'skbio.test' (ModuleNotFoundError: No module named 'skbio').
Next, I tried installing with pip install numpy and pip install scikit-bio, but that yielded a huge wall of errors. Tried the installation test anyway, got the same ModuleNotFoundError.
I'm on a MacBook Air 2020 with an M1 processor, so I'm not sure if that's causing the issue.
I don't know if you still have problems with scikit-bio, but the solution that worked for me was to use the sse2neon header: https://github.com/DLTcollab/sse2neon
The following instructions should work:
git clone https://github.com/biocore/scikit-bio
cd scikit-bio
wget https://raw.githubusercontent.com/DLTcollab/sse2neon/master/sse2neon.h
Open the simde-sse2.h file and replace each of the following two lines:
#include <xmmintrin.h>
#include <emmintrin.h>
with:
#include "sse2neon.h"
then run:
pip install .
Note that this solution doesn't work for newer Python releases. I used Python 3.8 for this solution.
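If you would rather script the header edit than do it by hand, the replacement described above can be sketched like this. use_sse2neon is a hypothetical helper name, and the sketch works on the text of the header; reading and writing simde-sse2.h is left to you:

```python
import re

# Matches an #include of either x86 SSE header, tolerating extra whitespace.
_SSE_INCLUDE = re.compile(
    r"^[ \t]*#[ \t]*include[ \t]*<(?:xmmintrin|emmintrin)\.h>[ \t]*$",
    re.MULTILINE,
)


def use_sse2neon(header_text):
    """Replace each xmmintrin.h / emmintrin.h include with sse2neon.h."""
    return _SSE_INCLUDE.sub('#include "sse2neon.h"', header_text)


if __name__ == "__main__":
    sample = "#include <xmmintrin.h>\n#include <emmintrin.h>\n"
    print(use_sse2neon(sample), end="")
```

Each of the two lines is replaced separately, so the result includes sse2neon.h twice, matching the manual instructions.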

Error when importing XGBoost on Apple M1?

Has anyone figured out how to make XGBoost work with Apple M1?
I have tried multiple things to fix it, but it does not work.
I have tried reinstalling it with pip, pip3, python -m pip, and conda install; brew install libomp; brew install gcc@8; and downloading the source code and compiling locally.
It seems XGBoost does not work on Apple M1.
Here is the error, which occurs when I import xgboost in my script:
XGBoostError: XGBoost Library (libxgboost.dylib) could not be loaded.
Likely causes:
* OpenMP runtime is not installed (vcomp140.dll or libgomp-1.dll for Windows, libomp.dylib for Mac OSX, libgomp.so for Linux and other UNIX-like OSes). Mac OSX users: Run `brew install libomp` to install OpenMP runtime.
* You are running 32-bit Python on a 64-bit OS
Error message(s): ['dlopen(/opt/anaconda3/envs/msc-env/lib/python3.8/site-packages/xgboost/lib/libxgboost.dylib, 6): Library not loaded: /usr/local/opt/libomp/lib/libomp.dylib\n Referenced from: /opt/anaconda3/envs/msc-env/lib/python3.8/site-packages/xgboost/lib/libxgboost.dylib\n Reason: image not found']
I had the same issue on a MacBook Pro (13-inch, M1, 2020) with the Apple M1 chip. Fortunately, after some hours of research I found the solution. Just follow these instructions:
brew install libomp
conda install -c conda-forge py-xgboost
https://discuss.xgboost.ai/t/xgboost-on-apple-m1/2004/8
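The dlopen message in the question names the exact file the loader could not find. /usr/local is the prefix used by Intel (or Rosetta) Homebrew, while native Apple Silicon Homebrew installs under /opt/homebrew, which may be why brew install libomp alone did not help. A small sketch for pulling that path out of the error text and checking it; missing_library is a hypothetical helper name:

```python
import os
import re


def missing_library(error_text):
    """Return the path after 'Library not loaded:' in a dlopen error, or None."""
    # Stop at whitespace, a quote, or a backslash so that a literal "\n"
    # inside a printed Python list is not captured as part of the path.
    match = re.search(r"Library not loaded:\s*([^\s\\']+)", error_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    message = (
        "Library not loaded: /usr/local/opt/libomp/lib/libomp.dylib\n"
        " Reason: image not found"
    )
    path = missing_library(message)
    print(path, "exists" if os.path.exists(path) else "is missing")
```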
How to install xgboost in python on MacOS?
A combination of the answers from cherry (first) and Christoffer (second) worked for me with the miniforge interpreter:
Make sure gcc-11 (and g++-11) is installed; if not, install it with
brew install gcc@11
brew install cmake
Then, do the following:
git clone --recursive https://github.com/dmlc/xgboost
mkdir xgboost/my_build
cd xgboost/my_build
CC=gcc-11 CXX=g++-11 cmake ..
make -j4
cd ../python-package
/Users/xx/miniforge3/envs/MLEnv/bin/python setup.py install
using the path to your miniforge environment.
I put Terminal in Rosetta mode before installing brew. This way I'm essentially running the Intel versions of the packages. I provided more details in this gist.
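When mixing native and Rosetta installs it is easy to lose track of which mode a given interpreter runs in. A quick sketch using platform.machine() from the standard library; describe_arch is a hypothetical helper name, and the labels are only my own wording:

```python
import platform


def describe_arch(machine):
    """Map a platform.machine() value to a short human-readable label."""
    if machine == "arm64":
        return "native Apple Silicon (arm64)"
    if machine == "x86_64":
        return "x86_64 (Intel, or Rosetta on an M1 Mac)"
    return "other: " + machine


if __name__ == "__main__":
    print(describe_arch(platform.machine()))
```

Run it with the same interpreter you use to import xgboost, since the compiled library has to match that interpreter's architecture.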

Building caffe2 with conda failed

I built caffe2 with Anaconda following the page.
The server has a single Titan X with cuDNN 7 and CUDA 9 but no NCCL, so I downloaded NCCL 2 from NVIDIA, extracted it to path/to/local/nccl2, and edited line 42 of ./pytorch/conda/integrated/build.sh to read: "export NCCL_ROOT_DIR=path/to/local/nccl2".
I need to use caffe2 with Python 2, so I added "conda_args+=(" --python 2.7") " to ./pytorch/scripts/build_anaconda.sh to use Python 2.7.
The build succeeded, but when I run python2 test.py (from caffe2.python import core)
it tells me:
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_hip
Segmentation fault (core dumped)
My questions are:
a. Why does the conda build not support the GPU?
b. If I am using a single GPU, is NCCL necessary for building?
c. How do I fix "No module named caffe2_pybind11_state_hip"?
PyTorch or Caffe2: caffe2
How you installed PyTorch (conda, pip, source): conda
Build command you used (if compiling from source): ./scripts/build_anaconda.sh --install-locally --cuda 9.0 --cudnn 7
OS: ubuntu16
PyTorch version:
Python version: 2.7
CUDA/cuDNN version: 9.1/7
GPU models and configuration: ??
GCC version (if compiling from source): 5.4.0
CMake version: not installed
Versions of any other relevant libraries:
Thank you very much!
First of all, get CUDA and install it:
sudo apt-get update && sudo apt-get install wget -y --no-install-recommends
wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
Now proceed with the installation from source (do it in an environment):
FULL_CAFFE2=1 python setup.py install
You can find more info here: https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile#install-with-gpu-support
Follow the procedure below; it worked for me:
ubuntu@test:~$ cd $HOME
ubuntu@test:~$ conda create -n caffe2
ubuntu@test:~$ source activate caffe2
(caffe2) ubuntu@test:~$ git clone --recursive https://github.com/pytorch/pytorch.git && cd pytorch
(caffe2) ubuntu@test:~/pytorch$ git submodule update --init
(caffe2) ubuntu@test:~/pytorch$ CONDA_INSTALL_LOCALLY=1 ./scripts/build_anaconda.sh --cuda 8.0 --cudnn 7 -DUSE_CUDA=ON -DUSE_NCCL=ON

TensorFlow installation issue: ImportError: No module named tensorflow

Environment: Ubuntu 14.04 (64-bit), Python 2.7.11
First, I installed TensorFlow using the virtualenv installation method.
$ sudo apt-get install python-pip python-dev python-virtualenv
$ virtualenv --system-site-packages ~/tensorflow
$ source ~/tensorflow/bin/activate
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
$ pip install --upgrade $TF_BINARY_URL
Then I tested my installation and an error appeared, so I know TensorFlow was not installed successfully.
>>> import tensorflow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
I don't know how to solve the problem. Please help me; it has cost me a whole day. I tried uninstalling TensorFlow and then installing it with the pip installation method, but I get the same error.
The protobuf version is 3.1.0.
Are you running python in the same virtual environment you installed tensorflow in?
To access your tensorflow installation, you have to first "activate" the virtualenv in any new terminals, as follows:
source ~/tensorflow/bin/activate
python
import tensorflow as tf
If you run the above in a new terminal, does it solve your problem?
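To see which interpreter you are actually running and whether it can find the package, something like the following can help. It is a sketch for Python 3 (importlib.util is not available on the Python 2.7 from the question), and in_virtualenv and module_available are hypothetical helper names:

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtualenv or venv."""
    # Classic virtualenv sets sys.real_prefix; venv makes base_prefix differ.
    return hasattr(sys, "real_prefix") or (
        getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )


def module_available(name):
    """True when the top-level module `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in virtualenv:", in_virtualenv())
    print("tensorflow importable:", module_available("tensorflow"))
```

If the interpreter path is not under ~/tensorflow/bin, the virtualenv is not active in that terminal.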
When you ran
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
you specified that you are going to use an Nvidia card.
To run TensorFlow on a GPU (Nvidia graphics card), you need to satisfy all of Nvidia's requirements.
Nvidia requires some special privileges to its CUDA cores.
You also need to add the CUDA paths to the LD_LIBRARY_PATH environment variable; check the Nvidia documentation. You also need to install profiling support, which is provided by the libcupti-dev library, the NVIDIA CUDA Profiling Tools Interface. This library provides advanced profiling support. To install it, issue the following command:
sudo apt-get install libcupti-dev
But if you want to run TensorFlow in CPU-only mode, do not run $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl. With it you override the TF_BINARY_URL variable so that it points to the Nvidia CUDA (GPU) build.
So, to use the CPU, remove $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl from your steps, keep only $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl, and reinstall.
I hope this clears up the problem.
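To illustrate why the order of the two export lines matters: two exports of the same variable behave like two assignments, so the last one wins. A toy sketch that mimics this with a plain dict instead of the real shell environment; wheel_flavor is a hypothetical helper name, and the URLs are the ones from the question:

```python
def wheel_flavor(url):
    """Classify a TensorFlow wheel URL as "gpu" or "cpu" from its path."""
    return "gpu" if "/gpu/" in url or "tensorflow_gpu" in url else "cpu"


CPU_URL = "https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl"
GPU_URL = "https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl"

# Stand-in for the shell environment: the second assignment overwrites the first.
env = {}
env["TF_BINARY_URL"] = CPU_URL
env["TF_BINARY_URL"] = GPU_URL

print(wheel_flavor(env["TF_BINARY_URL"]))
```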
In case your prerequisite Python packages are not installed properly, check several things.
$ source $HOME/tensorflow/bin/activate
$ which python
$ which pip
Please check that these binaries are in $HOME/tensorflow/bin. If so, try
$ pip install -I --upgrade $TF_BINARY_URL
where the -I (--ignore-installed) option forces pip to reinstall packages that are already installed.
Installation of TensorFlow on Ubuntu 18.04
Download the Anaconda Python package and install it from the shell using bash:
$ bash anaconda*.sh
Edit the .bashrc script in your home directory:
$ sudo apt-get install python3-pip
$ sudo apt-get update
$ cd
$ nano .bashrc
(nano is the text editor.)
Insert the following line at the end of the file:
export PATH=~/anaconda3/bin:$PATH
Create a virtual environment using conda:
$ conda create -n myenv python=3.5
Specify the required Python version. Do not use 3.7, as there is a compatibility issue with TensorFlow 10.
$ source activate myenv
$ pip install -U tensorflow
$ python
>>> import tensorflow as tf
>>> (if you get this prompt back without an error, the installation was successful)
>>> exit()
$ source deactivate
This is fully tested; if an issue arises, do let me know.
Whenever you install Python packages, I would suggest doing it in a virtual environment.

Resources