Dask: When reading from HDFS, pyarrow/hdfs.py returns OSError: Getting symbol hdfsNewBuilder failed - hadoop

I was trying to run dask-on-yarn with my research group's Hadoop cluster.
I tried each of the following instructions:
dd.read_parquet('hdfs://file.parquet', engine='fastparquet')
dd.read_parquet('hdfs://file.parquet', engine='pyarrow')
dd.read_csv('hdfs://file.csv')
Each time, the following error message occurs:
~/miniconda3/envs/dask/lib/python3.8/site-packages/fsspec/core.py in get_fs_token_paths(urlpath, mode, num, name_function, storage_options, protocol)
521 path = cls._strip_protocol(urlpath)
522 update_storage_options(options, storage_options)
--> 523 fs = cls(**options)
524
525 if "w" in mode:
~/miniconda3/envs/dask/lib/python3.8/site-packages/fsspec/spec.py in __call__(cls, *args, **kwargs)
52 return cls._cache[token]
53 else:
---> 54 obj = super().__call__(*args, **kwargs)
55 # Setting _fs_token here causes some static linters to complain.
56 obj._fs_token_ = token
~/miniconda3/envs/dask/lib/python3.8/site-packages/fsspec/implementations/hdfs.py in __init__(self, host, port, user, kerb_ticket, driver, extra_conf, **kwargs)
42 AbstractFileSystem.__init__(self, **kwargs)
43 self.pars = (host, port, user, kerb_ticket, driver, extra_conf)
---> 44 self.pahdfs = HadoopFileSystem(
45 host=host,
46 port=port,
~/miniconda3/envs/dask/lib/python3.8/site-packages/pyarrow/hdfs.py in __init__(self, host, port, user, kerb_ticket, driver, extra_conf)
38 _maybe_set_hadoop_classpath()
39
---> 40 self._connect(host, port, user, kerb_ticket, extra_conf)
41
42 def __reduce__(self):
~/miniconda3/envs/dask/lib/python3.8/site-packages/pyarrow/io-hdfs.pxi in pyarrow.lib.HadoopFileSystem._connect()
~/miniconda3/envs/dask/lib/python3.8/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
OSError: Getting symbol hdfsNewBuilderfailed
How should I resolve this problem?
My Environment
Here are my packages in this conda env:
# Name Version Build Channel
_libgcc_mutex 0.1 main
abseil-cpp 20200225.2 he1b5a44_0 conda-forge
arrow-cpp 0.17.1 py38h1234567_9_cpu conda-forge
attrs 19.3.0 py_0
aws-sdk-cpp 1.7.164 hc831370_1 conda-forge
backcall 0.2.0 py_0
blas 1.0 mkl
bleach 3.1.5 py_0
bokeh 2.1.1 py38_0
boost-cpp 1.72.0 h7b93d67_1 conda-forge
brotli 1.0.7 he6710b0_0
brotlipy 0.7.0 py38h7b6447c_1000
bzip2 1.0.8 h7b6447c_0
c-ares 1.15.0 h7b6447c_1001
ca-certificates 2020.6.24 0
certifi 2020.6.20 py38_0
cffi 1.14.0 py38he30daa8_1
chardet 3.0.4 py38_1003
click 7.1.2 py_0
cloudpickle 1.4.1 py_0
conda-pack 0.4.0 py_0
cryptography 2.9.2 py38h1ba5d50_0
curl 7.71.0 hbc83047_0
cytoolz 0.10.1 py38h7b6447c_0
dask 2.19.0 py_0
dask-core 2.19.0 py_0
dask-yarn 0.8.1 py38h32f6830_0 conda-forge
decorator 4.4.2 py_0
defusedxml 0.6.0 py_0
distributed 2.19.0 py38_0
entrypoints 0.3 py38_0
fastparquet 0.3.2 py38heb32a55_0
freetype 2.10.2 h5ab3b9f_0
fsspec 0.7.4 py_0
gflags 2.2.2 he6710b0_0
glog 0.4.0 he6710b0_0
grpc-cpp 1.30.0 h9ea6770_0 conda-forge
grpcio 1.27.2 py38hf8bcb03_0
heapdict 1.0.1 py_0
icu 67.1 he1b5a44_0 conda-forge
idna 2.10 py_0
importlib-metadata 1.7.0 py38_0
importlib_metadata 1.7.0 0
intel-openmp 2020.1 217
ipykernel 5.3.0 py38h5ca1d4c_0
ipython 7.16.1 py38h5ca1d4c_0
ipython_genutils 0.2.0 py38_0
jedi 0.17.1 py38_0
jinja2 2.11.2 py_0
jpeg 9b h024ee3a_2
json5 0.9.5 py_0
jsonschema 3.2.0 py38_0
jupyter_client 6.1.3 py_0
jupyter_core 4.6.3 py38_0
jupyterlab 2.1.5 py_0
jupyterlab_server 1.1.5 py_0
krb5 1.18.2 h173b8e3_0
ld_impl_linux-64 2.33.1 h53a641e_7
libcurl 7.71.0 h20c2e04_0
libedit 3.1.20191231 h7b6447c_0
libevent 2.1.10 hcdb4288_1 conda-forge
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libllvm9 9.0.1 h4a3c616_0
libpng 1.6.37 hbc83047_0
libprotobuf 3.12.3 hd408876_0
libsodium 1.0.18 h7b6447c_0
libssh2 1.9.0 h1ba5d50_1
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_1
llvmlite 0.33.0 py38hd408876_0
locket 0.2.0 py38_1
lz4-c 1.9.2 he6710b0_0
markupsafe 1.1.1 py38h7b6447c_0
mistune 0.8.4 py38h7b6447c_1000
mkl 2020.1 217
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.1.0 py38h23d657b_0
mkl_random 1.1.1 py38h0573a6f_0
msgpack-python 1.0.0 py38hfd86e86_1
nbconvert 5.6.1 py38_0
nbformat 5.0.7 py_0
ncurses 6.2 he6710b0_1
notebook 6.0.3 py38_0
numba 0.50.1 py38h0573a6f_0
numpy 1.18.5 py38ha1c710e_0
numpy-base 1.18.5 py38hde5b4d6_0
olefile 0.46 py_0
openssl 1.1.1g h7b6447c_0
packaging 20.4 py_0
pandas 1.0.5 py38h0573a6f_0
pandoc 2.9.2.1 0
pandocfilters 1.4.2 py38_1
parquet-cpp 1.5.1 2 conda-forge
parso 0.7.0 py_0
partd 1.1.0 py_0
pexpect 4.8.0 py38_0
pickleshare 0.7.5 py38_1000
pillow 7.1.2 py38hb39fc2d_0
pip 20.1.1 py38_1
prometheus_client 0.8.0 py_0
prompt-toolkit 3.0.5 py_0
protobuf 3.12.3 py38he6710b0_0
psutil 5.7.0 py38h7b6447c_0
ptyprocess 0.6.0 py38_0
pyarrow 0.17.1 py38h1234567_9_cpu conda-forge
pycparser 2.20 py_0
pygments 2.6.1 py_0
pyopenssl 19.1.0 py38_0
pyparsing 2.4.7 py_0
pyrsistent 0.16.0 py38h7b6447c_0
pysocks 1.7.1 py38_0
python 3.8.3 hcff3b4d_2
python-dateutil 2.8.1 py_0
python_abi 3.8 1_cp38 conda-forge
pytz 2020.1 py_0
pyyaml 5.3.1 py38h7b6447c_1
pyzmq 19.0.1 py38he6710b0_1
re2 2020.07.01 he1b5a44_0 conda-forge
readline 8.0 h7b6447c_0
requests 2.24.0 py_0
send2trash 1.5.0 py38_0
setuptools 47.3.1 py38_0
six 1.15.0 py_0
skein 0.8.0 py38h32f6830_1 conda-forge
snappy 1.1.8 he6710b0_0
sortedcontainers 2.2.2 py_0
sqlite 3.32.3 h62c20be_0
tbb 2020.0 hfd86e86_0
tblib 1.6.0 py_0
terminado 0.8.3 py38_0
testpath 0.4.4 py_0
thrift 0.13.0 py38he6710b0_0
thrift-cpp 0.13.0 h62aa4f2_2 conda-forge
tk 8.6.10 hbc83047_0
toolz 0.10.0 py_0
tornado 6.0.4 py38h7b6447c_1
traitlets 4.3.3 py38_0
typing_extensions 3.7.4.2 py_0
urllib3 1.25.9 py_0
wcwidth 0.2.5 py_0
webencodings 0.5.1 py38_1
wheel 0.34.2 py38_0
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
zeromq 4.3.2 he6710b0_2
zict 2.0.0 py_0
zipp 3.1.0 py_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.4 h0b5b093_3
The Hadoop cluster is running version Hadoop 2.7.0-mapr-1607.
The Cluster object is created with:
from dask_yarn import YarnCluster
# Create a cluster where each worker has two cores and eight GiB of memory
cluster = YarnCluster(
    environment='conda-env-packed-for-worker-nodes.tar.gz',
    worker_env={
        # See https://github.com/dask/dask-yarn/pull/30#issuecomment-434001858
        'ARROW_LIBHDFS_DIR': '/opt/mapr/hadoop/hadoop-0.20.2/c++/Linux-amd64-64/lib',
    },
)
Suspected Cause
I suspect that the version mismatch between the hadoop-0.20.2 directory in the ARROW_LIBHDFS_DIR environment variable and the Hadoop CLI version (Hadoop 2.7.0) might be a cause of the problem.
I had to manually point pyarrow at this file (using this setup: https://stackoverflow.com/a/62749053/1147061). The required file, libhdfs.so, is not provided under /opt/mapr/hadoop/hadoop-2.7.0/. Installing libhdfs3 via conda install -c conda-forge libhdfs3 does not resolve the requirement either.
Might this be the problem?

(a partial answer)
To use libhdfs3 (which is poorly maintained these days), you would need to call
dd.read_csv('hdfs://file.csv', storage_options={'driver': 'libhdfs3'})
and, of course, install libhdfs3. This will not help with the hadoop library option; the two are independent code paths.
I also suspect that getting the JNI libhdfs (without the "3") working is a case of locating the right .so file.
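As a rough aid for locating the right .so file, the following sketch walks a directory tree for libhdfs.so candidates and tries to check whether each one exports the hdfsNewBuilder symbol named in the error. The function names are hypothetical, the /opt/mapr search root is taken from the paths in the question, and the symbol check is only a hint: loading can fail for unrelated reasons, such as libjvm not being found.

```python
import ctypes
import os


def find_libhdfs(root):
    """Return sorted paths of files named libhdfs.so* found under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # "libhdfs3.so" does not match this prefix, so libhdfs3 is skipped.
            if name.startswith("libhdfs.so"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def exports_symbol(lib_path, symbol="hdfsNewBuilder"):
    """Return True if the shared library loads and exports `symbol`.

    A False result is a hint, not proof: the load itself may fail for
    reasons unrelated to the symbol (missing dependencies, wrong arch).
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return False
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Search root assumed from the question; adjust for your cluster.
    for path in find_libhdfs("/opt/mapr"):
        print(path, exports_symbol(path))
```

If a candidate reports True, pointing ARROW_LIBHDFS_DIR at its directory would be the next thing to try.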

Related

Why does `conda list cudnn` have no output after `conda install pytorch torchvision cudatoolkit=10.2 -c pytorch` installation [duplicate]

This question already has answers here:
How to run pytorch with NVIDIA "cuda toolkit" version instead of the official conda "cudatoolkit" version?
(2 answers)
Closed 2 years ago.
Please feel free to vote "Reopen" at the bottom of this question. The reason is that I have marked this as a duplicate although the answers there are not clear enough for this question.
As soon as the question is reopened, I can add the following answer: cuDNN is an integrated part of the conda pytorch cudatoolkit installer and is listed together with pytorch (see py3.7_cuda102_cudnn7_0), not as a stand-alone package of the kind you might get by installing it explicitly with pip or conda (not recommended! See all answers of How to run pytorch with NVIDIA “cuda toolkit” version instead of the official conda “cudatoolkit” version?). With conda's cudatoolkit, cuDNN is installed automatically together with CUDA just for pytorch, and because it is built into a conda package it does not show up on its own.
I have installed pytorch using conda install pytorch torchvision cudatoolkit=10.2 -c pytorch.
https://github.com/pytorch/pytorch/issues/17445#issuecomment-466838819 says that cuDNN is automatically included.
To check the install, following How to get the CUDA version?, I ran conda list cudnn. It lists nothing, even though cuDNN is needed to use the GPU for ML, while conda list cuda does list cudatoolkit.
Here is the output of anaconda prompt of both commands.
(base) PS C:\Users\Admin> conda list cudnn
# packages in environment at C:\Users\Admin\anaconda3:
#
# Name Version Build Channel
(base) PS C:\Users\Admin> conda list cuda
# packages in environment at C:\Users\Admin\anaconda3:
#
# Name Version Build Channel
cudatoolkit 10.2.89 h74a9793_1
I find a 445 MB cudnn file "cudnn64_7.dll" in C:\Users\Admin\anaconda3\Lib\site-packages\torch\lib, and there are some small files in C:\Users\Admin\anaconda3\Lib\site-packages\torch\backends\cudnn and C:\Users\Admin\anaconda3\Lib\site-packages\torch\include\ATen.
Questions:
Is cudnn installed as a part of pytorch only?
Is there a way to get conda list cudnn to work?
####
Output of just conda list mentions cudnn at
pytorch 1.6.0 py3.7_cuda102_cudnn7_0 pytorch
Full list:
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py37_0
alabaster 0.7.12 py37_0
anaconda 2020.07 py37_0
anaconda-client 1.7.2 py37_0
anaconda-navigator 1.9.12 py37_0
anaconda-project 0.8.4 py_0
argh 0.26.2 py37_0
asn1crypto 1.3.0 py37_1
astroid 2.4.2 py37_0
astropy 4.0.1.post1 py37he774522_1
atomicwrites 1.4.0 py_0
attrs 19.3.0 py_0
autopep8 1.5.3 py_0
babel 2.8.0 py_0
backcall 0.2.0 py_0
backports 1.0 py_2
backports.functools_lru_cache 1.6.1 py_0
backports.shutil_get_terminal_size 1.0.0 py37_2
backports.tempfile 1.0 py_1
backports.weakref 1.0.post1 py_1
bcrypt 3.1.7 py37he774522_1
beautifulsoup4 4.9.1 py37_0
bitarray 1.4.0 py37he774522_0
bkcharts 0.2 py37_0
blas 1.0 mkl
bleach 3.1.5 py_0
blosc 1.19.0 h7bd577a_0
bokeh 2.1.1 py37_0
boto 2.49.0 py37_0
bottleneck 1.3.2 py37h2a96729_1
brotlipy 0.7.0 py37he774522_1000
bzip2 1.0.8 he774522_0
ca-certificates 2020.6.24 0
certifi 2020.6.20 py37_0
cffi 1.14.0 py37h7a1dbc1_0
chardet 3.0.4 py37_1003
click 7.1.2 py_0
cloudpickle 1.5.0 py_0
clyent 1.2.2 py37_1
colorama 0.4.3 py_0
comtypes 1.1.7 py37_1001
conda 4.8.3 py37_0
conda-build 3.18.11 py37_0
conda-env 2.6.0 1
conda-package-handling 1.6.1 py37h62dcd97_0
conda-verify 3.4.2 py_1
console_shortcut 0.1.1 4
contextlib2 0.6.0.post1 py_0
cryptography 2.9.2 py37h7a1dbc1_0
cudatoolkit 10.2.89 h74a9793_1
curl 7.71.1 h2a8f88b_1
cycler 0.10.0 py37_0
cython 0.29.21 py37ha925a31_0
cytoolz 0.10.1 py37he774522_0
dask 2.20.0 py_0
dask-core 2.20.0 py_0
decorator 4.4.2 py_0
defusedxml 0.6.0 py_0
diff-match-patch 20200713 py_0
distributed 2.20.0 py37_0
docutils 0.16 py37_1
entrypoints 0.3 py37_0
et_xmlfile 1.0.1 py_1001
fastcache 1.1.0 py37he774522_0
filelock 3.0.12 py_0
flake8 3.8.3 py_0
flask 1.1.2 py_0
freetype 2.10.2 hd328e21_0
fsspec 0.7.4 py_0
future 0.18.2 py37_1
get_terminal_size 1.0.0 h38e98db_0
gevent 20.6.2 py37he774522_0
glob2 0.7 py_0
gmpy2 2.0.8 py37h0964b28_3
greenlet 0.4.16 py37he774522_0
h5py 2.10.0 py37h5e291fa_0
hdf5 1.10.4 h7ebc959_0
heapdict 1.0.1 py_0
html5lib 1.1 py_0
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha925a31_3
idna 2.10 py_0
imageio 2.9.0 py_0
imagesize 1.2.0 py_0
importlib-metadata 1.7.0 py37_0
importlib_metadata 1.7.0 0
intel-openmp 2019.4 245
intervaltree 3.0.2 py_1
ipykernel 5.3.2 py37h5ca1d4c_0
ipython 7.16.1 py37h5ca1d4c_0
ipython_genutils 0.2.0 py37_0
ipywidgets 7.5.1 py_0
isort 4.3.21 py37_0
itsdangerous 1.1.0 py37_0
jdcal 1.4.1 py_0
jedi 0.17.1 py37_0
jinja2 2.11.2 py_0
joblib 0.16.0 py_0
jpeg 9b hb83a4c4_2
json5 0.9.5 py_0
jsonschema 3.2.0 py37_1
jupyter 1.0.0 py37_7
jupyter_client 6.1.6 py_0
jupyter_console 6.1.0 py_0
jupyter_core 4.6.3 py37_0
jupyterlab 2.1.5 py_0
jupyterlab_server 1.2.0 py_0
keyring 21.2.1 py37_0
kiwisolver 1.2.0 py37h74a9793_0
krb5 1.18.2 hc04afaa_0
lazy-object-proxy 1.4.3 py37he774522_0
libarchive 3.4.2 h5e25573_0
libcurl 7.71.1 h2a8f88b_1
libiconv 1.15 h1df5818_7
liblief 0.10.1 ha925a31_0
libllvm9 9.0.1 h21ff451_0
libpng 1.6.37 h2a8f88b_0
libsodium 1.0.18 h62dcd97_0
libspatialindex 1.9.3 h33f27b4_0
libssh2 1.9.0 h7a1dbc1_1
libtiff 4.1.0 h56a325e_1
libxml2 2.9.10 h464c3ec_1
libxslt 1.1.34 he774522_0
llvmlite 0.33.0 py37ha925a31_0
locket 0.2.0 py37_1
lxml 4.5.2 py37h1350720_0
lz4-c 1.9.2 h62dcd97_0
lzo 2.10 he774522_2
m2w64-gcc-libgfortran 5.3.0 6
m2w64-gcc-libs 5.3.0 7
m2w64-gcc-libs-core 5.3.0 7
m2w64-gmp 6.1.0 2
m2w64-libwinpthread-git 5.0.0.4634.697f757 2
markupsafe 1.1.1 py37hfa6e2cd_1
matplotlib 3.2.2 0
matplotlib-base 3.2.2 py37h64f37c6_0
mccabe 0.6.1 py37_1
menuinst 1.4.16 py37he774522_1
mistune 0.8.4 py37hfa6e2cd_1001
mkl 2019.4 245
mkl-service 2.3.0 py37hb782905_0
mkl_fft 1.0.14 py37h6288b17_0
mkl_random 1.0.4 py37h343c172_0
mock 4.0.2 py_0
more-itertools 8.4.0 py_0
mpc 1.1.0 h7edee0f_1
mpfr 4.0.2 h62dcd97_1
mpir 3.0.0 hec2e145_1
mpmath 1.1.0 py37_0
msgpack-python 1.0.0 py37h74a9793_1
msys2-conda-epoch 20160418 1
multipledispatch 0.6.0 py37_0
navigator-updater 0.2.1 py37_0
nbconvert 5.6.1 py37_1
nbformat 5.0.7 py_0
networkx 2.4 py_1
ninja 1.9.0 py37h74a9793_0
nltk 3.5 py_0
nose 1.3.7 py37_1004
notebook 6.0.3 py37_0
numba 0.50.1 py37h47e9c7a_0
numexpr 2.7.1 py37h25d0782_0
numpy 1.17.0 py37h19fb1c0_0
numpy-base 1.17.0 py37hc3f5095_0
numpydoc 1.1.0 py_0
olefile 0.46 py37_0
openpyxl 3.0.4 py_0
openssl 1.1.1g he774522_0
packaging 20.4 py_0
pandas 1.0.5 py37h47e9c7a_0
pandoc 2.10 0
pandocfilters 1.4.2 py37_1
paramiko 2.7.1 py_0
parso 0.7.0 py_0
partd 1.1.0 py_0
path 13.1.0 py37_0
path.py 12.4.0 0
pathlib2 2.3.5 py37_1
pathtools 0.1.2 py_1
patsy 0.5.1 py37_0
pep8 1.7.1 py37_0
pexpect 4.8.0 py37_1
pickleshare 0.7.5 py37_1001
pillow 7.2.0 py37hcc1f983_0
pip 20.1.1 py37_1
pkginfo 1.5.0.1 py37_0
pluggy 0.13.1 py37_0
ply 3.11 py37_0
powershell_shortcut 0.0.1 3
prometheus_client 0.8.0 py_0
prompt-toolkit 3.0.5 py_0
prompt_toolkit 3.0.5 0
psutil 5.7.0 py37he774522_0
py 1.9.0 py_0
py-lief 0.10.1 py37ha925a31_0
pybson 0.5.9 pypi_0 pypi
pyclustering 0.9.3.1 pypi_0 pypi
pycodestyle 2.6.0 py_0
pycosat 0.6.3 py37he774522_0
pycparser 2.20 py_2
pycrypto 2.6.1 py37he774522_10
pycurl 7.43.0.5 py37h7a1dbc1_0
pydocstyle 5.0.2 py_0
pyflakes 2.2.0 py_0
pygments 2.6.1 py_0
pylint 2.5.3 py37_0
pymongo 3.9.0 py37ha925a31_0
pynacl 1.4.0 py37h62dcd97_1
pyodbc 4.0.30 py37ha925a31_0
pyopenssl 19.1.0 py_1
pyparsing 2.4.7 py_0
pyqt 5.9.2 py37h6538335_2
pyreadline 2.1 py37_1
pyrsistent 0.16.0 py37he774522_0
pysocks 1.7.1 py37_1
pytables 3.6.1 py37h1da0976_0
pytest 5.4.3 py37_0
python 3.7.7 h81c818b_4
python-dateutil 2.8.1 py_0
python-jsonrpc-server 0.3.4 py_1
python-language-server 0.34.1 py37_0
python-libarchive-c 2.9 py_0
pytorch 1.6.0 py3.7_cuda102_cudnn7_0 pytorch
pytz 2020.1 py_0
pywavelets 1.1.1 py37he774522_0
pywin32 227 py37he774522_1
pywin32-ctypes 0.2.0 py37_1001
pywinpty 0.5.7 py37_0
pyyaml 5.3.1 py37he774522_1
pyzmq 19.0.1 py37ha925a31_1
qdarkstyle 2.8.1 py_0
qt 5.9.7 vc14h73c81de_0
qtawesome 0.7.2 py_0
qtconsole 4.7.5 py_0
qtpy 1.9.0 py_0
regex 2020.6.8 py37he774522_0
requests 2.24.0 py_0
rope 0.17.0 py_0
rtree 0.9.4 py37h21ff451_1
ruamel_yaml 0.15.87 py37he774522_1
scikit-image 0.16.2 py37h47e9c7a_0
scikit-learn 0.23.1 py37h25d0782_0
scipy 1.5.0 py37h9439919_0
seaborn 0.10.1 py_0
send2trash 1.5.0 py37_0
setuptools 49.2.0 py37_0
simplegeneric 0.8.1 py37_2
singledispatch 3.4.0.3 py37_0
sip 4.19.8 py37h6538335_0
six 1.15.0 py_0
snappy 1.1.8 h33f27b4_0
snowballstemmer 2.0.0 py_0
sortedcollections 1.2.1 py_0
sortedcontainers 2.2.2 py_0
soupsieve 2.0.1 py_0
sphinx 3.1.2 py_0
sphinxcontrib 1.0 py37_1
sphinxcontrib-applehelp 1.0.2 py_0
sphinxcontrib-devhelp 1.0.2 py_0
sphinxcontrib-htmlhelp 1.0.3 py_0
sphinxcontrib-jsmath 1.0.1 py_0
sphinxcontrib-qthelp 1.0.3 py_0
sphinxcontrib-serializinghtml 1.1.4 py_0
sphinxcontrib-websupport 1.2.3 py_0
spyder 4.1.4 py37_0
spyder-kernels 1.9.2 py37_0
sqlalchemy 1.3.18 py37he774522_0
sqlite 3.32.3 h2a8f88b_0
statsmodels 0.11.1 py37he774522_0
sympy 1.6.1 py37_0
tbb 2020.0 h74a9793_0
tblib 1.6.0 py_0
terminado 0.8.3 py37_0
testpath 0.4.4 py_0
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 he774522_0
toml 0.10.1 py_0
toolz 0.10.0 py_0
torchvision 0.7.0 py37_cu102 pytorch
tornado 6.0.4 py37he774522_1
tqdm 4.47.0 py_0
traitlets 4.3.3 py37_0
typed-ast 1.4.1 py37he774522_0
typing_extensions 3.7.4.2 py_0
ujson 1.35 py37hfa6e2cd_0
unicodecsv 0.14.1 py37_0
urllib3 1.25.9 py_0
vc 14.1 h0510ff6_4
vs2015_runtime 14.16.27012 hf0eaf9b_3
watchdog 0.10.3 py37_0
wcwidth 0.2.5 py_0
webencodings 0.5.1 py37_1
werkzeug 1.0.1 py_0
wheel 0.34.2 py37_0
widgetsnbextension 3.5.1 py37_0
win_inet_pton 1.1.0 py37_0
win_unicode_console 0.5 py37_0
wincertstore 0.2 py37_0
winpty 0.4.3 4
wrapt 1.11.2 py37he774522_0
xlrd 1.2.0 py37_0
xlsxwriter 1.2.9 py_0
xlwings 0.19.5 py37_0
xlwt 1.3.0 py37_0
xmltodict 0.12.0 py_0
xz 5.2.5 h62dcd97_0
yaml 0.2.5 he774522_0
yapf 0.30.0 py_0
zeromq 4.3.2 ha925a31_2
zict 2.0.0 py_0
zipp 3.1.0 py_0
zlib 1.2.11 h62dcd97_4
zope 1.0 py37_1
zope.event 4.4 py37_0
zope.interface 4.7.1 py37he774522_0
zstd 1.4.5 ha9fde0e_0
You can try installing cuDNN explicitly by using conda install -c anaconda cudnn. Then conda list cudnn should show cuDNN.
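Since the question notes that cuDNN appears only inside the pytorch build string, a small sketch like the one below may help confirm what is bundled without installing anything extra. The helper names are hypothetical. The first function only parses a build string of the form shown in the conda list output; the second asks an installed torch directly and returns None if torch cannot be imported.

```python
import re


def parse_build_string(build):
    """Extract (cuda, cudnn) tags from a build string such as
    'py3.7_cuda102_cudnn7_0'; return None if the pattern is absent."""
    match = re.search(r"cuda(\d+)_cudnn(\d+)", build)
    return match.groups() if match else None


def torch_cudnn_info():
    """Ask the installed torch about its CUDA/cuDNN; None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "cuda_build": torch.version.cuda,
        "cudnn_available": torch.backends.cudnn.is_available(),
        "cudnn_version": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    print(parse_build_string("py3.7_cuda102_cudnn7_0"))
    print(torch_cudnn_info())
```

The build-string parse is purely textual, so it says what the package was built against; the torch query reflects what the running interpreter can actually load.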

Keras callback on_epoch_end throws error (Nonetype has no len())

I'm training a lot of neural networks using hyperopt at the moment. Sometimes a run completes without problems and sometimes it fails, and I don't understand why. The error refers to the validation data, but that is always the same. I don't use any k-fold CV yet. The parameters stored in the dictionary setup ('Monitor', 'Patience', 'MinDelta', 'Epochs' and 'BatchSize') stay unchanged too. As you can see below, shuffle is also set to False. I've just tried to train the network manually with the same hyperparameters and it went through. GPU VRAM should be enough because I have trained larger networks without problems (more neurons, higher batch size). Does anyone have any suggestions or guesses as to what could lead to this error?
Here are some relevant code snippets:
from tensorflow.keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor=setup['Monitor'], patience=setup['Patience'], mode='min',
                               restore_best_weights=True, min_delta=setup['MinDelta'])
history = autoencoder.fit(trainx, trainy, epochs=setup['Epochs'], batch_size=setup['BatchSize'],
                          validation_data=(valx, valy), callbacks=[early_stopping], verbose=0, shuffle=False)
Used parameters of the last error that was shown:
{'AFunction': 'relu', 'BatchNorm': False, 'BatchSize': 56, 'Bottleneck': 16, 'Dataset': 'ml100k',
'Date': '2020-03-14__22_46_30', 'DecNeurons': 480, 'Decay': 0.0006340241989020302,
'Dropout': 0.0003539460040469268, 'EncNeurons': 256, 'Epochs': 100, 'ID': 3,
'LR': 0.3869023252696237, 'Layers': 4, 'Metric': 'RMSE', 'MinDelta': 0.01,
'Monitor': 'val_root_mean_squared_error', 'MP': True, 'Noise': 0.02, 'Normalize': False,
'Optimizer': 'adam', 'Patience': 25, 'RDigits': 5, 'Split': 'Movie', 'WeightInit': 0,
'Neurons': [480, 408, 328, 256, 16, 256, 328, 408, 480], 'NeuronSum': 2960,
'Loss': <function MMSE at 0x000001DED8DD60D0>, 'IO': 943}
I'm getting the following error:
File "<ipython-input-3-3d2f96200fbe>", line 26, in <module>
best = fmin(fn=ae,space=parameterspace,algo=algo,trials = bayes_trials,max_evals=5000)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\fmin.py", line 482, in fmin
show_progressbar=show_progressbar,
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\base.py", line 686, in fmin
show_progressbar=show_progressbar,
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\fmin.py", line 509, in fmin
rval.exhaust()
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\fmin.py", line 330, in exhaust
self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\fmin.py", line 286, in run
self.serial_evaluate()
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\fmin.py", line 165, in serial_evaluate
result = self.domain.evaluate(spec, ctrl)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\hyperopt\base.py", line 894, in evaluate
rval = self.fn(pyll_rval)
File "<ipython-input-1-5d0cd446e015>", line 229, in ae
validation_data=(valx,valy),callbacks=[early_stopping],verbose=0,shuffle=False)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 728, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 372, in fit
prefix='val_')
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\contextlib.py", line 88, in __exit__
next(self.gen)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 685, in on_epoch
self.callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 298, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 1238, in on_epoch_end
self.model.set_weights(self.best_weights)
File "C:\Users\Admin\Anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 1322, in set_weights
if len(params) != len(weights):
TypeError: object of type 'NoneType' has no len()
Used environment:
(tf2) C:\Users\Admin>conda list
# packages in environment at C:\Users\Admin\Anaconda3\envs\tf2:
#
# Name Version Build Channel
_tflow_select 2.1.0 gpu
absl-py 0.8.1 py36_0
alabaster 0.7.12 py36_0
alembic 1.3.2 py_0 conda-forge
appdirs 1.4.3 py_1 conda-forge
asn1crypto 1.3.0 py36_0
astor 0.8.0 py36_0
astroid 2.3.3 py36_0
attrs 19.3.0 py_0
babel 2.8.0 py_0
backcall 0.1.0 py36_0
backports 1.0 py_2 conda-forge
bayesian-optimization 1.0.1 py_0 conda-forge
blas 1.0 mkl
bleach 3.1.0 py36_0
ca-certificates 2020.1.1 0 anaconda
certifi 2019.11.28 py36_0 anaconda
cffi 1.13.2 py36h7a1dbc1_0
chardet 3.0.4 py36_1003
click 7.0 py_0 conda-forge
cloudpickle 1.2.2 py_0
colorama 0.4.3 py_0
configparser 3.7.3 py36_1 conda-forge
cryptography 2.8 py36h7a1dbc1_0
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
cycler 0.10.0 py36h009560c_0
databricks-cli 0.9.1 py_0 conda-forge
decorator 4.4.1 py_0
defusedxml 0.6.0 py_0
docker-py 4.1.0 py36_0 conda-forge
docker-pycreds 0.4.0 py_0 conda-forge
docutils 0.15.2 py36_0
entrypoints 0.3 py36_0
flask 1.1.1 py_1 conda-forge
floweaver 2.0.0a5 py_0 conda-forge
freetype 2.9.1 ha9979f8_1
future 0.18.2 pypi_0 pypi
gast 0.2.2 py36_0
gitdb2 2.0.6 py_0 conda-forge
gitpython 3.0.5 py_0 conda-forge
google-pasta 0.1.8 py_0
gorilla 0.3.0 py_0 conda-forge
grpcio 1.16.1 py36h351948d_1
h5py 2.9.0 py36h5e291fa_0
hdf5 1.10.4 h7ebc959_0
hyperopt 0.2.3 pypi_0 pypi
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha66f8fd_1
idna 2.8 py36_0
imagesize 1.2.0 py_0
importlib_metadata 1.3.0 py36_0
intel-openmp 2019.4 245
ipykernel 5.1.3 py36h39e3cac_1
ipython 7.11.1 py36h39e3cac_0
ipython_genutils 0.2.0 py36_0
ipywidgets 7.5.1 py_0
isort 4.3.21 py36_0
itsdangerous 1.1.0 py_0 conda-forge
jedi 0.15.2 py36_0
jinja2 2.10.3 py_0
joblib 0.14.1 py_0
jpeg 9b hb83a4c4_2
jsonschema 3.2.0 py36_0
jupyter 1.0.0 py36_7
jupyter_client 5.3.4 py36_0
jupyter_console 6.0.0 py36_0
jupyter_core 4.6.1 py36_0
keras 2.3.1 py36h21ff451_0 conda-forge
keras-applications 1.0.8 py_0
keras-preprocessing 1.1.0 py_1
keyring 20.0.0 py36_0
kiwisolver 1.1.0 py36ha925a31_0
lazy-object-proxy 1.4.3 py36he774522_0
libgpuarray 0.7.6 hfa6e2cd_1003 conda-forge
libpng 1.6.37 h2a8f88b_0
libprotobuf 3.11.2 h7bd577a_0
libsodium 1.0.16 h9d3ae62_0
m2w64-gcc-libgfortran 5.3.0 6
m2w64-gcc-libs 5.3.0 7
m2w64-gcc-libs-core 5.3.0 7
m2w64-gmp 6.1.0 2
m2w64-libwinpthread-git 5.0.0.4634.697f757 2
mako 1.1.0 py_0 conda-forge
markdown 3.1.1 py36_0
markupsafe 1.1.1 py36he774522_0
matplotlib 3.1.1 py36hc8f65d3_0
mccabe 0.6.1 py36_1
mistune 0.8.4 py36he774522_0
mkl 2019.4 245
mkl-service 2.3.0 py36hb782905_0
mkl_fft 1.0.15 py36h14836fe_0
mkl_random 1.1.0 py36h675688f_0
mlflow 1.5.0 py36_1 conda-forge
more-itertools 8.0.2 py_0
msys2-conda-epoch 20160418 1
nbconvert 5.6.1 py36_0
nbformat 4.4.0 py36_0
networkx 2.2 pypi_0 pypi
notebook 6.0.2 py36_0
numpy 1.17.4 py36h4320e6b_0
numpy-base 1.17.4 py36hc3f5095_0
numpydoc 0.9.2 py_0
openssl 1.1.1 he774522_0 anaconda
opt_einsum 3.1.0 py_0
packaging 20.0 py_0
palettable 3.3.0 py_0 conda-forge
pandas 0.25.3 py36ha925a31_0
pandoc 2.2.3.2 0
pandocfilters 1.4.2 py36_1
parso 0.5.2 py_0
pickleshare 0.7.5 py36_0
pip 19.3.1 py36_0
plotly 3.10.0 pypi_0 pypi
prometheus_client 0.7.1 py_0
prometheus_flask_exporter 0.12.1 py_0 conda-forge
prompt_toolkit 2.0.10 py_0
protobuf 3.11.2 py36h33f27b4_0
psutil 5.6.7 py36he774522_0
pycodestyle 2.5.0 py36_0
pycparser 2.19 py36_0
pyflakes 2.1.1 py36_0
pygments 2.5.2 py_0
pygpu 0.7.6 py36hc8d92b1_1000 conda-forge
pylint 2.4.4 py36_0
pyopenssl 19.1.0 py36_0
pyparsing 2.4.6 py_0
pyqt 5.9.2 py36h6538335_2
pyreadline 2.1 py36_1
pyrsistent 0.15.6 py36he774522_0
pysocks 1.7.1 py36_0
python 3.6.10 h9f7ef89_0
python-dateutil 2.8.1 py_0
python-editor 1.0.4 py_0 conda-forge
pytz 2019.3 py_0
pywin32 227 py36he774522_1
pywin32-ctypes 0.2.0 py36_0
pywinpty 0.5.7 py36_0
pyyaml 5.3 py36hfa6e2cd_0 conda-forge
pyzmq 18.1.0 py36ha925a31_0
qt 5.9.7 vc14h73c81de_0
qtawesome 0.6.0 py_0
qtconsole 4.6.0 py_1
qtpy 1.9.0 py_0
querystring_parser 1.2.4 py_0 conda-forge
requests 2.22.0 py36_1
retrying 1.3.3 py36_2
rope 0.14.0 py_0
scikit-learn 0.22.1 py36h6288b17_0
scipy 1.3.2 py36h29ff71c_0
seaborn 0.10.0 py_0 anaconda
send2trash 1.5.0 py36_0
setuptools 44.0.0 py36_0
simplejson 3.17.0 py36hfa6e2cd_0 conda-forge
sip 4.19.8 py36h6538335_0
six 1.13.0 py36_0
smmap2 2.0.5 py_0 conda-forge
snowballstemmer 2.0.0 py_0
sphinx 2.3.1 py_0
sphinxcontrib-applehelp 1.0.1 py_0
sphinxcontrib-devhelp 1.0.1 py_0
sphinxcontrib-htmlhelp 1.0.2 py_0
sphinxcontrib-jsmath 1.0.1 py_0
sphinxcontrib-qthelp 1.0.2 py_0
sphinxcontrib-serializinghtml 1.1.3 py_0
spyder 3.3.6 py36_0
spyder-kernels 0.5.2 py36_0
sqlalchemy 1.3.12 py36hfa6e2cd_0 conda-forge
sqlite 3.30.1 he774522_0
sqlparse 0.3.0 py_0 conda-forge
tabulate 0.8.6 py_0 conda-forge
tensorboard 2.0.0 pyhb38c66f_1
tensorflow 2.0.0 gpu_py36hfdd5754_0
tensorflow-base 2.0.0 gpu_py36h390e234_0
tensorflow-estimator 2.0.0 pyh2649769_0
tensorflow-gpu 2.0.0 h0d30ee6_0
termcolor 1.1.0 py36_1
terminado 0.8.3 py36_0
testpath 0.4.4 py_0
theano 1.0.4 py36h6538335_1001 conda-forge
tornado 6.0.3 py36he774522_0
tqdm 4.43.0 pypi_0 pypi
traitlets 4.3.3 py36_0
typed-ast 1.4.0 py36he774522_0
urllib3 1.25.7 py36_0
vc 14.1 h0510ff6_4
vs2015_runtime 14.16.27012 hf0eaf9b_1
vs2015_win-64 14.0.25420 h55c1224_11
waitress 1.4.1 py_0 conda-forge
wcwidth 0.1.7 py36_0
webencodings 0.5.1 py36_1
websocket-client 0.57.0 py36_0 conda-forge
werkzeug 0.16.0 py_0
wheel 0.33.6 py36_0
widgetsnbextension 3.5.1 py36_0
win_inet_pton 1.1.0 py36_0
wincertstore 0.2 py36h7fe50ca_0
winpty 0.4.3 4
wrapt 1.11.2 py36he774522_0
yaml 0.2.2 hfa6e2cd_1 conda-forge
zeromq 4.3.1 h33f27b4_3
zipp 0.6.0 py_0
zlib 1.2.11 h62dcd97_3
This looks like it is caused by a bug (yet to be fixed) that occurs when training only sees worsening performance.
self.best_weights is initialized to None (https://github.com/tensorflow/tensorflow/blob/295ad2781683835be974faba0a191528d8079768/tensorflow/python/keras/callbacks.py#L1190) and is only overwritten with weight values at the end of an epoch (https://github.com/tensorflow/tensorflow/blob/295ad2781683835be974faba0a191528d8079768/tensorflow/python/keras/callbacks.py#L1221) if the performance has improved.
If the data/hyperparameter configuration is such that the performance only worsens, and you've set restore_best_weights=True, then it calls self.model.set_weights(self.best_weights) with None, and that's where the error is raised.
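The control flow described above can be mimicked in plain Python. The class below is a simplified stand-in written for illustration (it is not the Keras EarlyStopping class, and all names are hypothetical). It shows how best_weights can stay None and how a guard avoids passing None on. One way no epoch ever counts as an improvement, offered here as an assumption rather than something the question confirms, is a monitored value that is NaN from the first epoch, since every comparison with NaN is false.

```python
import math


class EarlyStoppingSketch:
    """Simplified stand-in for the restore-best-weights logic (mode='min').

    NOT the Keras class; it only mirrors the control flow described above.
    """

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = math.inf
        self.best_weights = None  # stays None until an epoch counts as improved
        self.wait = 0

    def on_epoch_end(self, current, weights):
        """Record one epoch; return True when training should stop."""
        if current + self.min_delta < self.best:
            self.best = current
            self.best_weights = weights
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


def weights_to_restore(stopper, current_weights):
    """Guard: fall back to the current weights if none were ever recorded."""
    if stopper.best_weights is None:
        return current_weights
    return stopper.best_weights


if __name__ == "__main__":
    stopper = EarlyStoppingSketch(patience=2)
    last = None
    for epoch, loss in enumerate([math.nan, math.nan]):
        last = [epoch]
        if stopper.on_epoch_end(loss, weights=last):
            break
    print(stopper.best_weights)  # None: no epoch ever improved
    print(weights_to_restore(stopper, last))
```

In real training code the same guard could live in a small EarlyStopping subclass, or the problem could be sidestepped by bounding the learning rate in the search space so that the monitored metric stays finite.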

Conda environment export to yaml file fails

I have created an environment but when I try to export it with
conda env export --name ENVNAME > ENVNAME.yml
I get the following error message:
InvalidVersionSpec: Invalid version '(>=': unable to convert to expression tree: ['(']
In response to the question from FlyingTeller, here is what conda list gives (sorry for the long list):
# Name Version Build Channel
altair 3.2.0 py38_0
appdirs 1.4.3 pypi_0 pypi
appnope 0.1.0 py38_0
asn1crypto 1.3.0 py38_0
attrs 19.3.0 py_0
backcall 0.1.0 py38_0
black 19.10b0 pypi_0 pypi
blas 1.0 mkl
bleach 3.1.0 py_0
ca-certificates 2020.1.1 0
certifi 2019.11.28 py38_0
cffi 1.14.0 py38hb5b8e2f_0
chardet 3.0.4 py38_1003
click 7.0 pypi_0 pypi
cloudpickle 1.3.0 py_0
cryptography 2.8 py38ha12b0ac_0
cycler 0.10.0 py38_0
decorator 4.4.1 py_0
defusedxml 0.6.0 py_0
entrypoints 0.3 py38_0
et_xmlfile 1.0.1 py38_0
ezc3d 1.2.4 py38hbf1eeb5_0 conda-forge
freetype 2.9.1 hb4e5f40_0
idna 2.8 py38_1000
importlib_metadata 1.5.0 py38_0
intel-openmp 2020.0 166
ipykernel 5.1.4 py38h39e3cac_0
ipython 7.12.0 py38h5ca1d4c_0
ipython-genutils 0.2.0 pypi_0 pypi
ipython_genutils 0.2.0 py38_0
jdcal 1.4.1 py_0
jedi 0.16.0 py38_0
jinja2 2.11.1 py_0
joblib 0.14.1 py_0
jsonschema 3.2.0 py38_0
jupyter_client 5.3.4 py38_0
jupyter_core 4.6.1 py38_0
kiwisolver 1.1.0 py38ha1b3eb9_0 conda-forge
libcxx 9.0.1 1 conda-forge
libedit 3.1.20181209 hb402a30_0
libffi 3.2.1 h475c297_4
libgfortran 3.0.1 h93005f0_2
libpng 1.6.37 ha441bb4_0
libsodium 1.0.16 h3efe00b_0
littleutils 0.2.2 py_0 conda-forge
llvm-openmp 4.0.1 hcfea43d_1
markupsafe 1.1.1 py38h1de35cc_0
matplotlib 3.1.3 py38_0
matplotlib-base 3.1.3 py38h9aa3819_0
mistune 0.8.4 py38h1de35cc_1000
mkl 2019.4 233
mkl-service 2.3.0 py38hfbe908c_0
mkl_fft 1.0.15 py38h5e564d8_0
mkl_random 1.1.0 py38h6440ff4_0
more-itertools 8.2.0 py_0
mpmath 1.1.0 py38_0
nb-black 1.0.7 pypi_0 pypi
nb_conda_kernels 2.2.2 py38_0 conda-forge
nbconvert 5.6.1 py38_0
nbformat 5.0.4 py_0
ncurses 6.1 h0a44026_1
notebook 6.0.3 py38_0
numpy 1.18.1 py38h7241aed_0
numpy-base 1.18.1 py38h6575580_1
openpyxl 3.0.3 py_0
openssl 1.1.1d h1de35cc_4
outdated 0.2.0 py_0 conda-forge
pandas 1.0.1 py38h6c726b0_0
pandas-flavor 0.2.0 py_0 conda-forge
pandoc 2.2.3.2 0
pandocfilters 1.4.2 py38_1
parso 0.6.1 py_0
pathspec 0.7.0 pypi_0 pypi
patsy 0.5.1 py38_0
peakutils 1.3.2 py_0 conda-forge
pexpect 4.8.0 py38_0
pickleshare 0.7.5 py38_1000
pingouin 0.3.2 py_0 conda-forge
pip 20.0.2 py38_1
prometheus_client 0.7.1 py_0
prompt_toolkit 3.0.3 py_0
ptyprocess 0.6.0 py38_0
pycparser 2.19 py_0
pygments 2.5.2 py_0
pyopenssl 19.1.0 py38_0
pyparsing 2.4.6 py_0
pyrsistent 0.15.7 py38h1de35cc_0
pysocks 1.7.1 py38_0
python 3.8.1 h359304d_1
python-dateutil 2.8.1 py_0
pytz 2019.3 py_0
pyzmq 18.1.1 py38h0a44026_0
readline 7.0 h1de35cc_5
regex 2020.1.8 pypi_0 pypi
requests 2.22.0 py38_1
scikit-learn 0.22.1 py38h27c97d8_0
scipy 1.4.1 py38h44e99c9_0
seaborn 0.10.0 py_0
send2trash 1.5.0 py38_0
setuptools 45.2.0 py38_0
six 1.14.0 py38_0
spyder-kernels 1.8.1 py38_0
sqlite 3.31.1 ha441bb4_0
statsmodels 0.11.0 py38h1de35cc_0
terminado 0.8.3 py38_0
testpath 0.4.4 py_0
tk 8.6.8 ha441bb4_0
toml 0.10.0 pypi_0 pypi
toolz 0.10.0 py_0
tornado 6.0.3 py38h1de35cc_3
traitlets 4.3.3 py38_0
typed-ast 1.4.1 pypi_0 pypi
urllib3 1.25.8 py38_0
wcwidth 0.1.8 py_0
webencodings 0.5.1 py38_1
wheel 0.34.2 py38_0
wurlitzer 2.0.0 py38_0
xarray 0.15.0 py_0
xlrd 1.2.0 py_0
xlwt 1.3.0 py38_0
xz 5.2.4 h1de35cc_4
zeromq 4.3.1 h0a44026_3
zipp 2.2.0 py_0
zlib 1.2.11 h1de35cc_3
I ran into a similar issue. It is due to some (everlasting) incompatibilities between conda and pip. In my case the pip package nb-black was causing the error: running pip uninstall nb-black followed by conda env export > environment.yml solved it. Uninstalling black before exporting your environment might help.
I had a similar issue, which was caused by my own package, created with setuptools and distributed as wheel.
I have used a wrong format, with extra trailing dot, to specify required dependency versions in setup.py of my package.
# setup.py
import setuptools

setuptools.setup(
    name="mypackage",
    version="1.0.0",
    python_requires=">=3.7",
    install_requires=[
        "pandas>=0.25",
        "pyodbc>=4.",  # Warning! This should be without the trailing dot: "pyodbc>=4"
        "requests>=2.26",
    ],
)
When the package is installed in a conda environment, this information is saved to %CONDA_PREFIX%/Lib/site-packages/mypackage-1.0.0.dist-info/METADATA. This is where the conda env export command fails when reading package versions.
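This kind of typo can be caught before the wheel is built by scanning the requirement strings for a version that ends in a dot. The following is a minimal heuristic sketch, not a full PEP 440 validator; the function name find_suspect_requirements is hypothetical and is not part of setuptools or conda.

```python
import re

# Matches a comparison operator followed by a version that ends in a dot,
# e.g. ">=4." in "pyodbc>=4.". This is only a heuristic for this one typo.
_TRAILING_DOT = re.compile(
    r"(===|==|~=|!=|<=|>=|<|>)\s*[0-9][0-9A-Za-z.]*\.\s*(,|;|$)"
)


def find_suspect_requirements(requirements):
    """Return the requirement strings whose version specifier ends with a dot."""
    return [req for req in requirements if _TRAILING_DOT.search(req)]


install_requires = [
    "pandas>=0.25",
    "pyodbc>=4.",  # the malformed entry from the setup.py above
    "requests>=2.26",
]

print(find_suspect_requirements(install_requires))  # prints ['pyodbc>=4.']
```

Running a check like this over install_requires before building would flag the bad entry without needing conda at all.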
On Windows, you can use this method, suggested by RLashofRegas at
https://github.com/conda/conda/issues/9624#issuecomment-801623523
In Windows PowerShell, run
cd C:\Users\<user>\Anaconda3\envs\<env name>\Lib\site-packages
then
Get-ChildItem -File -Recurse -Filter METADATA | Select-String "4.7.0<4.8.0" | Select-Object -Unique Path
where the text inside the quotation marks is the version string from your error message, e.g. 4.7.0<4.8.0
This gives the path to the package file that is causing the trouble.
You can either fix its METADATA file or uninstall the package.
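The same search can be done from Python on any platform. This is a rough sketch using only the standard library; the function name find_metadata_with is hypothetical, and unlike Select-String it does a plain, case-sensitive substring match rather than a regular-expression match.

```python
from pathlib import Path


def find_metadata_with(site_packages, needle):
    """Return sorted paths of METADATA files under site_packages that contain needle."""
    matches = []
    for metadata in Path(site_packages).rglob("METADATA"):
        if not metadata.is_file():
            continue
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        text = metadata.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            matches.append(metadata)
    return sorted(matches)


# Example call (path and search text taken from the PowerShell commands above):
# for path in find_metadata_with(
#     r"C:\Users\<user>\Anaconda3\envs\<env name>\Lib\site-packages", "4.7.0<4.8.0"
# ):
#     print(path)
```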
I had a similar issue. I used the export command in debug mode to identify the package causing the problem. Then, uninstalling the package fixed the issue.
conda env export -vv

anaconda-navigator not opening on CentOS 7.3

I'm new to Anaconda. I installed Anaconda2-4.4.0-Linux-x86_64.sh on a CentOS 7.3 server and, after installation, entered anaconda-navigator in the terminal (I use PuTTY as an emulator to connect to the server). I got the following error message:
This application failed to start because it could not find or load the
Qt platform plugin "xcb" in "".
Available platform plugins are: minimal, offscreen, xcb.
Reinstalling the application may fix this problem.
Aborted (core dumped)
Any thoughts or suggestions on how to overcome this issue? Below are my conda info and conda list outputs.
conda Info:
platform : linux-64
conda version : 4.3.21
conda is private : False
conda-env version : 4.3.21
conda-build version : not installed
python version : 2.7.13.final.0
requests version : 2.14.2
root environment : /home/AD/soarelu/Download/ENTER (writable)
default environment : /home/AD/soarelu/Download/ENTER
envs directories : /home/AD/soarelu/Download/ENTER/envs
/home/AD/soarelu/.conda/envs
package cache : /home/AD/soarelu/Download/ENTER/pkgs
/home/AD/soarelu/.conda/pkgs
channel URLs : https://repo.continuum.io/pkgs/free/linux-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/r/linux-64
https://repo.continuum.io/pkgs/r/noarch
https://repo.continuum.io/pkgs/pro/linux-64
https://repo.continuum.io/pkgs/pro/noarch
config file : None
netrc file : None
offline mode : False
user-agent : conda/4.3.21 requests/2.14.2 CPython/2.7.13 Linux/3.10.0-514.26.2.el7.x86_64 CentOS Linux/7.3.1611 glibc/2.17
UID:GID : 1946947662:1946800513
conda list
packages in environment at /home/AD/soarelu/Download/ENTER:
_license 1.1 py27_1
alabaster 0.7.10 py27_0
anaconda 4.4.0 np112py27_0
anaconda-client 1.6.3 py27_0
anaconda-navigator 1.6.2 py27_0
anaconda-project 0.6.0 py27_0
asn1crypto 0.22.0 py27_0
astroid 1.4.9 py27_0
astropy 1.3.2 np112py27_0
babel 2.4.0 py27_0
backports 1.0 py27_0
backports_abc 0.5 py27_0
beautifulsoup4 4.6.0 py27_0
bitarray 0.8.1 py27_0
blaze 0.10.1 py27_0
bleach 1.5.0 py27_0
bokeh 0.12.5 py27_1
boto 2.46.1 py27_0
bottleneck 1.2.1 np112py27_0
cairo 1.14.8 0
cdecimal 2.3 py27_2
cffi 1.10.0 py27_0
chardet 3.0.3 py27_0
click 6.7 py27_0
cloudpickle 0.2.2 py27_0
clyent 1.2.2 py27_0
colorama 0.3.9 py27_0
conda 4.3.21 py27_0
conda-env 2.6.0 0
configparser 3.5.0 py27_0
contextlib2 0.5.5 py27_0
cryptography 1.8.1 py27_0
curl 7.52.1 0
cycler 0.10.0 py27_0
cython 0.25.2 py27_0
cytoolz 0.8.2 py27_0
dask 0.14.3 py27_1
datashape 0.5.4 py27_0
dbus 1.10.10 0
decorator 4.0.11 py27_0
distributed 1.16.3 py27_0
docutils 0.13.1 py27_0
entrypoints 0.2.2 py27_1
enum34 1.1.6 py27_0
et_xmlfile 1.0.1 py27_0
expat 2.1.0 0
fastcache 1.0.2 py27_1
flask 0.12.2 py27_0
flask-cors 3.0.2 py27_0
fontconfig 2.12.1 3
freetype 2.5.5 2
funcsigs 1.0.2 py27_0
functools32 3.2.3.2 py27_0
futures 3.1.1 py27_0
get_terminal_size 1.0.0 py27_0
gevent 1.2.1 py27_0
glib 2.50.2 1
greenlet 0.4.12 py27_0
grin 1.2.1 py27_3
gst-plugins-base 1.8.0 0
gstreamer 1.8.0 0
h5py 2.7.0 np112py27_0
harfbuzz 0.9.39 2
hdf5 1.8.17 1
heapdict 1.0.0 py27_1
html5lib 0.999 py27_0
icu 54.1 0
idna 2.5 py27_0
imagesize 0.7.1 py27_0
ipaddress 1.0.18 py27_0
ipykernel 4.6.1 py27_0
ipython 5.3.0 py27_0
ipython_genutils 0.2.0 py27_0
ipywidgets 6.0.0 py27_0
isort 4.2.5 py27_0
itsdangerous 0.24 py27_0
jbig 2.1 0
jdcal 1.3 py27_0
jedi 0.10.2 py27_2
jinja2 2.9.6 py27_0
jpeg 9b 0
jsonschema 2.6.0 py27_0
jupyter 1.0.0 py27_3
jupyter_client 5.0.1 py27_0
jupyter_console 5.1.0 py27_0
jupyter_core 4.3.0 py27_0
lazy-object-proxy 1.2.2 py27_0
libffi 3.2.1 1
libgcc 4.8.5 2
libgfortran 3.0.0 1
libiconv 1.14 0
libpng 1.6.27 0
libsodium 1.0.10 0
libtiff 4.0.6 3
libtool 2.4.2 0
libxcb 1.12 1
libxml2 2.9.4 0
libxslt 1.1.29 0
llvmlite 0.18.0 py27_0
locket 0.2.0 py27_1
lxml 3.7.3 py27_0
markupsafe 0.23 py27_2
matplotlib 2.0.2 np112py27_0
mistune 0.7.4 py27_0
mkl 2017.0.1 0
mkl-service 1.1.2 py27_3
mpmath 0.19 py27_1
msgpack-python 0.4.8 py27_0
multipledispatch 0.4.9 py27_0
navigator-updater 0.1.0 py27_0
nbconvert 5.1.1 py27_0
nbformat 4.3.0 py27_0
networkx 1.11 py27_0
nltk 3.2.3 py27_0
nose 1.3.7 py27_1
notebook 5.0.0 py27_0
numba 0.33.0 np112py27_0
numexpr 2.6.2 np112py27_0
numpy 1.12.1 py27_0
numpydoc 0.6.0 py27_0
odo 0.5.0 py27_1
olefile 0.44 py27_0
openpyxl 2.4.7 py27_0
openssl 1.0.2l 0
packaging 16.8 py27_0
pandas 0.20.1 np112py27_0
pandocfilters 1.4.1 py27_0
pango 1.40.3 1
partd 0.3.8 py27_0
path.py 10.3.1 py27_0
pathlib2 2.2.1 py27_0
patsy 0.4.1 py27_0
pcre 8.39 1
pep8 1.7.0 py27_0
pexpect 4.2.1 py27_0
pickleshare 0.7.4 py27_0
pillow 4.1.1 py27_0
pip 9.0.1 py27_1
pixman 0.34.0 0
ply 3.10 py27_0
prompt_toolkit 1.0.14 py27_0
psutil 5.2.2 py27_0
ptyprocess 0.5.1 py27_0
py 1.4.33 py27_0
pycairo 1.10.0 py27_0
pycosat 0.6.2 py27_0
pycparser 2.17 py27_0
pycrypto 2.6.1 py27_6
pycurl 7.43.0 py27_2
pyflakes 1.5.0 py27_0
pygments 2.2.0 py27_0
pylint 1.6.4 py27_1
pyodbc 4.0.16 py27_0
pyopenssl 17.0.0 py27_0
pyparsing 2.1.4 py27_0
pyqt 5.6.0 py27_2
pytables 3.3.0 np112py27_0
pytest 3.0.7 py27_0
python 2.7.13 0
python-dateutil 2.6.0 py27_0
pytz 2017.2 py27_0
pywavelets 0.5.2 np112py27_0
pyyaml 3.12 py27_0
pyzmq 16.0.2 py27_0
qt 5.6.2 4
qtawesome 0.4.4 py27_0
qtconsole 4.3.0 py27_0
qtpy 1.2.1 py27_0
readline 6.2 2
requests 2.14.2 py27_0
rope 0.9.4 py27_1
ruamel_yaml 0.11.14 py27_1
scandir 1.5 py27_0
scikit-image 0.13.0 np112py27_0
scikit-learn 0.18.1 np112py27_1
scipy 0.19.0 np112py27_0
seaborn 0.7.1 py27_0
setuptools 27.2.0 py27_0
simplegeneric 0.8.1 py27_1
singledispatch 3.4.0.3 py27_0
sip 4.18 py27_0
six 1.10.0 py27_0
snowballstemmer 1.2.1 py27_0
sortedcollections 0.5.3 py27_0
sortedcontainers 1.5.7 py27_0
sphinx 1.5.6 py27_0
spyder 3.1.4 py27_0
sqlalchemy 1.1.9 py27_0
sqlite 3.13.0 0
ssl_match_hostname 3.4.0.2 py27_1
statsmodels 0.8.0 np112py27_0
subprocess32 3.2.7 py27_0
sympy 1.0 py27_0
tblib 1.3.2 py27_0
terminado 0.6 py27_0
testpath 0.3 py27_0
tk 8.5.18 0
toolz 0.8.2 py27_0
tornado 4.5.1 py27_0
traitlets 4.3.2 py27_0
unicodecsv 0.14.1 py27_0
unixodbc 2.3.4 0
wcwidth 0.1.7 py27_0
werkzeug 0.12.2 py27_0
wheel 0.29.0 py27_0
widgetsnbextension 2.0.0 py27_0
wrapt 1.10.10 py27_0
xlrd 1.0.0 py27_0
xlsxwriter 0.9.6 py27_0
xlwt 1.2.0 py27_0
xz 5.2.2 1
yaml 0.1.6 0
zeromq 4.1.5 0
zict 0.1.2 py27_0
zlib 1.2.8 3

Scrapy shell Error

I am a newbie to Scrapy and am going through the tutorials.
I ran this command and got an error.
C:\Users\Sandra\Anaconda>scrapy shell 'http://scrapy.org'
In particular, what does this error mean? URLError: <urlopen error [Errno 10051] A socket operation was attempted to an unreachable network>
Full Error message:
2015-08-20 23:35:08 [scrapy] INFO: Scrapy 1.0.3 started (bot: scrapybot)
2015-08-20 23:35:08 [scrapy] INFO: Optional features available: ssl, http11, boto
2015-08-20 23:35:08 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2015-08-20 23:35:10 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, CoreStats, SpiderState
2015-08-20 23:35:10 [boto] DEBUG: Retrieving credentials from metadata server.
2015-08-20 23:35:10 [boto] ERROR: Caught exception reading instance data
Traceback (most recent call last):
File "C:\Users\Sandra\Anaconda\lib\site-packages\boto\utils.py", line 210, in retry_url
r = opener.open(req, timeout=timeout)
File "C:\Users\Sandra\Anaconda\lib\urllib2.py", line 431, in open
response = self._open(req, data)
File "C:\Users\Sandra\Anaconda\lib\urllib2.py", line 449, in _open
'_open', req)
File "C:\Users\Sandra\Anaconda\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Users\Sandra\Anaconda\lib\urllib2.py", line 1227, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Users\Sandra\Anaconda\lib\urllib2.py", line 1197, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 10051] A socket operation was attempted to an unreachable network>
2015-08-20 23:35:10 [boto] ERROR: Unable to read instance data, giving up
2015-08-20 23:35:10 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-08-20 23:35:10 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-08-20 23:35:10 [scrapy] INFO: Enabled item pipelines:
2015-08-20 23:35:10 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
Traceback (most recent call last):
File "C:\Users\Sandra\Anaconda\Scripts\scrapy-script.py", line 5, in <module>
sys.exit(execute())
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\commands\shell.py", line 63, in run
shell.start(url=url)
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\shell.py", line 44, in start
self.fetch(url, spider)
File "C:\Users\Sandra\Anaconda\lib\site-packages\scrapy\shell.py", line 81, in fetch
url = any_to_uri(request_or_url)
File "C:\Users\Sandra\Anaconda\lib\site-packages\w3lib\url.py", line 232, in any_to_uri
return uri_or_path if u.scheme else path_to_file_uri(uri_or_path)
File "C:\Users\Sandra\Anaconda\lib\site-packages\w3lib\url.py", line 213, in path_to_file_uri
x = moves.urllib.request.pathname2url(os.path.abspath(path))
File "C:\Users\Sandra\Anaconda\lib\nturl2path.py", line 58, in pathname2url
raise IOError, error
Error: Bad path: C:\Users\Sandra\Anaconda\'http:\scrapy.org'
Here is the list of installed packages:
# packages in environment at C:\Users\Sandra\Anaconda:
#
_license 1.1 py27_0
alabaster 0.7.3 py27_0
anaconda 2.3.0 np19py27_0
argcomplete 0.8.9 py27_0
astropy 1.0.3 np19py27_0
babel 1.3 py27_0
backports.ssl-match-hostname 3.4.0.2
bcolz 0.9.0 np19py27_0
beautiful-soup 4.3.2 py27_1
beautifulsoup4 4.3.2
binstar 0.11.0 py27_0
bitarray 0.8.1 py27_1
blaze 0.8.0
blaze-core 0.8.0 np19py27_0
blz 0.6.2 np19py27_1
bokeh 0.9.0 np19py27_0
boto 2.38.0 py27_0
bottleneck 1.0.0 np19py27_0
cdecimal 2.3 py27_1
certifi 14.05.14 py27_0
cffi 1.1.2 py27_0
characteristic 14.3.0
clyent 0.3.4 py27_0
colorama 0.3.3 py27_0
conda 3.16.0 py27_0
conda-build 1.14.0 py27_0
conda-env 2.4.2 py27_0
configobj 5.0.6 py27_0
crcmod 1.7
cryptography 0.9.3 py27_0
cssselect 0.9.1 py27_0
cython 0.22.1 py27_0
cytoolz 0.7.3 py27_0
datashape 0.4.5 np19py27_0
decorator 3.4.2 py27_0
docopt 0.6.2
docutils 0.12 py27_1
dynd-python 0.6.5 np19py27_0
enum34 1.0.4 py27_0
fastcache 1.0.2 py27_0
filechunkio 1.6
flask 0.10.1 py27_1
funcsigs 0.4 py27_0
futures 3.0.2 py27_0
gcs-oauth2-boto-plugin 1.9
gevent 1.0.1 py27_0
gevent-websocket 0.9.3 py27_0
google-api-python-client 1.4.0
google-apitools 0.4.3
greenlet 0.4.7 py27_0
grin 1.2.1 py27_2
gsutil 4.12
h5py 2.5.0 np19py27_1
hdf5 1.8.15.1 2
httplib2 0.9.1
idna 2.0 py27_0
ipaddress 1.0.7 py27_0
ipython 3.2.0 py27_0
ipython-notebook 3.2.0 py27_0
ipython-qtconsole 3.2.0 py27_0
itsdangerous 0.24 py27_0
jdcal 1.0 py27_0
jedi 0.8.1 py27_0
jinja2 2.7.3 py27_2
jsonschema 2.4.0 py27_0
launcher 1.0.0 1
llvmlite 0.5.0 py27_0
lxml 3.4.4 py27_0
markupsafe 0.23 py27_0
matplotlib 1.4.3 np19py27_1
menuinst 1.0.4 py27_0
mistune 0.5.1 py27_1
mock 1.0.1 py27_0
mrjob 0.4.4
multipledispatch 0.4.7 py27_0
networkx 1.9.1 py27_0
nltk 3.0.3 np19py27_0
node-webkit 0.10.1 0
nose 1.3.7 py27_0
numba 0.19.1 np19py27_0
numexpr 2.4.3 np19py27_0
numpy 1.9.2 py27_0
oauth2client 1.4.7
odo 0.3.2 np19py27_0
openpyxl 1.8.5 py27_0
pandas 0.16.2 np19py27_0
patsy 0.3.0 np19py27_0
pattern 2.6
pbs 0.110
pep8 1.6.2 py27_0
pillow 2.8.2 py27_0
pip 7.1.0 py27_1
ply 3.6 py27_0
protorpc 0.10.0
psutil 2.2.1 py27_0
py 1.4.27 py27_0
pyasn1 0.1.7 py27_0
pyasn1-modules 0.0.5
pycosat 0.6.1 py27_0
pycparser 2.14 py27_0
pycrypto 2.6.1 py27_3
pyflakes 0.9.2 py27_0
pygments 2.0.2 py27_0
pyopenssl 0.15.1 py27_1
pyparsing 2.0.3 py27_0
pyqt 4.10.4 py27_1
pyreadline 2.0 py27_0
pytables 3.2.0 np19py27_0
pytest 2.7.1 py27_0
python 2.7.9 1
python-dateutil 2.4.2 py27_0
python-gflags 2.0
pytz 2015.4 py27_0
pywin32 219 py27_0
pyyaml 3.11 py27_1
pyzmq 14.7.0 py27_0
queuelib 1.2.2 py27_0
requests 2.7.0 py27_0
retry-decorator 1.0.0
rodeo 0.2.3
rope 0.9.4 py27_1
rsa 3.1.4
runipy 0.1.3 py27_0
scikit-image 0.11.3 np19py27_0
scikit-learn 0.16.1 np19py27_0
scipy 0.15.1 np19py27_0
scrapy 1.0.3
seaborn 0.5.1 np19py27_0
service-identity 14.0.0
setuptools 18.1 py27_0
simplejson 3.6.5
six 1.9.0 py27_0
snowballstemmer 1.2.0 py27_0
sockjs-tornado 1.0.1 py27_0
socksipy-branch 1.1
sphinx 1.3.1 py27_0
sphinx-rtd-theme 0.1.7
sphinx_rtd_theme 0.1.7 py27_0
spyder 2.3.5.2 py27_0
spyder-app 2.3.5.2 py27_0
sqlalchemy 1.0.5 py27_0
ssl_match_hostname 3.4.0.2 py27_0
statsmodels 0.6.1 np19py27_0
sympy 0.7.6 py27_0
tables 3.2.0
toolz 0.7.2 py27_0
tornado 4.2 py27_0
twisted 15.3.0 py27_0
ujson 1.33 py27_0
unicodecsv 0.9.4 py27_0
uritemplate 0.6
w3lib 1.12.0 py27_0
werkzeug 0.10.4 py27_0
wheel 0.24.0 py27_0
xlrd 0.9.3 py27_0
xlsxwriter 0.7.3 py27_0
xlwings 0.3.5 py27_0
xlwt 1.0.0 py27_0
zlib 1.2.8 0
zope.interface 4.1.2 py27_1
That particular error message is being generated by boto (boto 2.38.0 py27_0), which is used to connect to Amazon S3. Scrapy doesn't have this enabled by default.
If you're just going through the tutorial and haven't done anything other than what you've been instructed to do, then it could be a configuration problem. Launching Scrapy with the shell argument from the command line will still use the configuration and the associated settings file. By default, Scrapy will look in:
/etc/scrapy.cfg or c:\scrapy\scrapy.cfg (system-wide),
~/.config/scrapy.cfg ($XDG_CONFIG_HOME) and ~/.scrapy.cfg ($HOME) for global (user-wide) settings, and
scrapy.cfg inside a scrapy project’s root (see next section).
EDIT:
In reply to the comments, this appears to be a bug with Scrapy when boto is present (bug here).
In response to "how to disable the download handler", add the following to your settings.py file:
DOWNLOAD_HANDLERS = {
    's3': None,
}
Your settings.py file should be in the root of your Scrapy project folder (one level deeper than your scrapy.cfg file).
If you've already got DOWNLOAD_HANDLERS in your settings.py file, just add a new entry for 's3' with a None value.
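As a minimal sketch of that second case, assuming a settings.py that already defines one custom handler (the "ftp" entry and the class path myproject.handlers.CustomFTPHandler are hypothetical placeholders, not real Scrapy names):

```python
# settings.py (sketch)

DOWNLOAD_HANDLERS = {
    # Hypothetical pre-existing entry; replace with whatever you already have.
    "ftp": "myproject.handlers.CustomFTPHandler",
    # Setting the scheme to None disables the S3 download handler.
    "s3": None,
}
```

Existing entries stay as they are; only the 's3' key is new.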
EDIT 2:
I'd highly recommend setting up virtual environments for your projects. Look into virtualenv and its usage. I'd make this recommendation regardless of the packages used for this project, but doubly so given how many packages you have installed.
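Maybe you should use double quotes (") instead of single quotes ('). The Windows command prompt does not treat single quotes as quoting characters, so Scrapy receives the argument with the quotes still attached and tries to interpret it as a file path, which matches the "Bad path" error in your traceback.
My Python version is 2.7.10 on win32, and my Scrapy version is 1.0.3.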
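The traceback in the question shows w3lib's any_to_uri falling back to path_to_file_uri when the argument has no URL scheme. A small Python 3 sketch of that check, using only the standard library, shows why the stray quote matters (the helper name has_url_scheme is hypothetical and is not part of w3lib or Scrapy):

```python
from urllib.parse import urlparse


def has_url_scheme(arg):
    """Return True if arg parses with a URL scheme such as http."""
    return bool(urlparse(arg).scheme)


# Argument as received when the URL is wrapped in double quotes on Windows:
print(has_url_scheme("http://scrapy.org"))    # True
# Argument as received with single quotes, which are passed through literally:
print(has_url_scheme("'http://scrapy.org'"))  # False
```

With the leading quote in place, no scheme is detected, so the argument is treated as a local path.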

Resources