pyarrow hdfs.connect on Windows

I want to use pyarrow to read from and write to HDFS.
I installed Hadoop on my Windows 10 64-bit system by following:
https://github.com/MuhammadBilalYar/Hadoop-On-Window/wiki/Step-by-step-Hadoop-2.8.0-installation-on-Window-10
I then installed pyarrow with pip.
But when I try to connect to HDFS from Python, I get the following error:
Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> pyarrow.hdfs.connect()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\TIKI_git\ai-core-python\venv\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
extra_conf=extra_conf)
File "C:\TIKI_git\ai-core-python\venv\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
self._connect(host, port, user, kerb_ticket, driver, extra_conf)
File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm
I checked my path variables as described in
http://wesmckinney.com/blog/python-hdfs-interfaces/
What can I do to fix this problem?
Is it even possible to use the pyarrow.hdfs.connect function on Windows?
Thanks for your help!
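The "Unable to load libjvm" message indicates that the native HDFS driver could not locate the JVM shared library. The following is a minimal diagnostic sketch, assuming the loader searches beneath the directory named by the JAVA_HOME environment variable. The helper name find_jvm_libraries is hypothetical and not part of pyarrow; it only walks a directory looking for the usual JVM library file names.

```python
import os

# Common file names of the JVM shared library on Windows, Linux and macOS.
JVM_LIBRARY_NAMES = ("jvm.dll", "libjvm.so", "libjvm.dylib")


def find_jvm_libraries(java_home):
    """Return the paths of all JVM shared libraries found under java_home."""
    matches = []
    for root, _dirs, files in os.walk(java_home):
        for name in files:
            if name in JVM_LIBRARY_NAMES:
                matches.append(os.path.join(root, name))
    return matches


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    if not java_home:
        print("JAVA_HOME is not set")
    else:
        found = find_jvm_libraries(java_home)
        print(found if found else "no JVM library found under JAVA_HOME")
```

If this prints nothing useful, JAVA_HOME may point at a directory that does not contain a JVM (for example a stale path), which is one thing to rule out before looking at the Hadoop variables.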

Related

Spyder Python IDE unable to start console

Today, after restarting my editor, I discovered that it can no longer start the console (I have been using it daily for years).
The following message appears in the console:
Traceback (most recent call last):
File "C:\Anaconda3\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Anaconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Anaconda3\lib\site‑packages\spyder_kernels\console\__main__.py", line 23, in
start.main()
File "C:\Anaconda3\lib\site‑packages\spyder_kernels\console\start.py", line 253, in main
import_spydercustomize()
File "C:\Anaconda3\lib\site‑packages\spyder_kernels\console\start.py", line 43, in import_spydercustomize
import spydercustomize
ModuleNotFoundError: No module named 'spydercustomize'
My version of python is:
Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
and from a Python prompt I can see this:
>>> import spyder
>>> spyder.__version__
'4.2.5'
>>>
Does anybody have an idea of what could have happened and how to fix this without reinstalling everything?
P.S. I noticed that the substring "site‑packages" in the console message contains a character resembling "-" that Windows File Explorer does not interpret (it can't find the folder). It looks like another kind of minus sign. The folder "C:\Anaconda3\Lib\site-packages\spyder_kernels\customize" and its contents are actually present on my PC.
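To confirm which character is hiding in a copied path, a short sketch like the one below lists every non-ASCII character together with its Unicode name. The function name is made up for this example, and the sample path assumes the odd character is U+2011; paste the actual text from the console to check.

```python
import unicodedata


def non_ascii_characters(text):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (index, "U+{:04X}".format(ord(ch)), unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Sample path: the hyphen in "site-packages" is replaced by U+2011 (an assumption).
path = "C:\\Anaconda3\\lib\\site\u2011packages\\spyder_kernels"
for index, codepoint, name in non_ascii_characters(path):
    print(index, codepoint, name)
```

If the character turns out to be a typographic hyphen, it may have been introduced when the traceback was displayed or copied rather than being part of the real folder name.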

Tensorflow CPU Version installation error (DLL load fail) Win7, Python 3.5.0

I have been struggling with this for 3 days. I already have TensorFlow working on 3 other machines but need to install it on a high-performance CPU going forward.
I get the following DLL import error when trying to set up TensorFlow using pip.
Processor: Intel(R) Xeon(R) CPU
System Type: 64-bit
Based on other solutions, I have already:
Followed all instructions on the TensorFlow installation page
Installed the Microsoft Visual C++ 2015 Redistributable Update
Tried the Conda package after installing multiple Anaconda versions
Tried some alternate whl files, as I could not find my CPU in the AVX2 CPU list
Tried multiple individual versions (Python 3.5, 3.6; tensorflow 1.12.0, 1.9.0, 1.6.0, etc.), i.e. without Anaconda
pywrap_tensorflow_internal import
ImportError: DLL load failed with error code -1073741795
============================
(venv) c:\software\Python35\venv\Lib\site-packages>echo %Path%
c:\software\Python35\venv\Scripts;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\software\Python35;C:\software\Python35\venv\Lib\site-packages;C:\software\Python35\venv\Lib\site-packages\tensorflow\include\tensorflow;C:\software\curl
(venv) c:\software\Python35>pip show tensorflow
Name: tensorflow
Version: 1.12.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: opensource#google.com
License: Apache 2.0
Location: c:\software\python35\venv\lib\site-packages
Requires: termcolor, grpcio, keras-preprocessing, wheel, absl-py, six, numpy, keras-applications, astor, tensorboard, protobuf, gast
Required-by:
(venv) c:\software\Python35>cd c:\software\python35\venv\lib\site-packages
(venv) c:\software\Python35\venv\Lib\site-packages>python
Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Traceback (most recent call last):
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "c:\software\Python35\venv\lib\imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "c:\software\Python35\venv\lib\imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: DLL load failed with error code -1073741795
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "c:\software\Python35\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "c:\software\Python35\venv\lib\imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "c:\software\Python35\venv\lib\imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: DLL load failed with error code -1073741795
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
I tried installing Python 3.6.0 with TensorFlow 1.5.0 and got a "google.protobuf.pyext import _message" error. So, as per this, I upgraded to Python 3.6.1. I am able to import tensorflow now! So it looks like I have to stay with TensorFlow version 1.5.0.
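The error code in the traceback can be decoded. Windows documents NTSTATUS values as unsigned 32-bit hex numbers, while the ImportError prints the same value as a signed integer. The sketch below does the conversion; the helper name is for illustration only.

```python
def ntstatus_hex(signed_code):
    """Convert a signed 32-bit Windows status code to its unsigned hex form."""
    return "0x{:08X}".format(signed_code & 0xFFFFFFFF)


print(ntstatus_hex(-1073741795))  # prints 0xC000001D
```

0xC000001D is the value Windows lists as STATUS_ILLEGAL_INSTRUCTION. That would be consistent with a wheel compiled for CPU instructions (such as AVX) that this processor does not support, and with the older TensorFlow 1.5.0 importing successfully, though this reading is an inference and not something confirmed in the post.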

Tensorflow install on Windows with anaconda and no internet connection

I cannot install TensorFlow on Windows because there is no internet connection, due to my company's security policy.
I installed Anaconda and Python by transferring the files over the intranet.
Please let me know how to install TensorFlow with no internet connection.
==========================================================================
In addition, when I ran the command below after installing TensorFlow, I ran into another problem:
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Daisy\Anaconda3\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
from tensorflow.python import *
File "C:\Users\Daisy\Anaconda3\lib\site-packages\tensorflow\python\__init__.py", line 63, in <module>
from tensorflow.core.framework.graph_pb2 import *
File "C:\Users\Daisy\Anaconda3\lib\site-packages\tensorflow\core\framework\graph_pb2.py", line 6, in <module>
from google.protobuf import descriptor as _descriptor
ImportError: No module named 'google'
I don't know how to solve this one.
If you can download the whl file and transfer it to your workstation, then you can run:
pip.exe install --upgrade --no-deps <tensorflow whl file name>
This should keep pip from trying to connect to the internet to download TensorFlow's dependencies, as Anaconda already has most of these.
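Because --no-deps skips dependency installation, any dependency that Anaconda does not already provide has to be transferred and installed the same way. The ImportError above ("No module named 'google'") suggests that the protobuf package is one such case. Below is a small sketch, using only the standard library, that lists which modules cannot be found before TensorFlow is imported. The function name and the module names passed to it are illustrative assumptions, not an official dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing


# Example module names only; adjust to the packages the wheel actually requires.
print(missing_modules(["google.protobuf", "numpy", "six"]))
```

Each module reported as missing would need its own wheel downloaded on a connected machine and installed offline in the same manner.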

python 2.7 OSX 10.10 package problems

I'm using Mac OS X 10.10.1.
Yesterday I started getting errors from the same packages that I normally use.
After a few hours I decided to remove Python from my Mac.
I installed Python again following these instructions:
http://docs.python-guide.org/en/latest/starting/install/osx/#install-osx
(in a nutshell, I installed Python 2.7 using Homebrew)
from the shell:
$python
Python 2.7.8 (default, Oct 19 2014, 16:02:00)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
from the shell:
$ which python
/usr/local/bin/python
from the shell:
$ which -a python
/usr/local/bin/python
/usr/local/bin/python
I installed pip.
I installed a few packages using the $ pip install command.
When I try to just import openpyxl, I get this:
import openpyxl
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/openpyxl/__init__.py", line 27, in <module>
from openpyxl.workbook import Workbook
File "/usr/local/lib/python2.7/site-packages/openpyxl/workbook/__init__.py", line 25, in <module>
from .workbook import *
File "/usr/local/lib/python2.7/site-packages/openpyxl/workbook/workbook.py", line 11, in <module>
import threading
File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 14, in <module>
from time import time as _time, sleep as _sleep
ImportError: cannot import name time
please can you help me out?
thanks
d
Additional info:
Meanwhile, I tried something else to fix the problem (it didn't fix it).
Basically, I installed virtualenv and created a basic virtual env.
I now have a new folder with Python and all the packages.
(venv_002)danielepemys-MacBook-Pro:my_python_virtualenv danielepemy$ which python
/Users/danielepemy/my_python_virtualenv/venv_002/bin/python
Everything looks fine: Python works, pip works, and the packages are listed in the virtual env.
(venv_002)danielepemys-MacBook-Pro:my_python_virtualenv danielepemy$ pip list
pip (1.5.6)
setuptools (3.6)
wsgiref (0.1.2)
XlsxWriter (0.6.4)
when I run a simple test script with XlsxWriter I get:
python ..//internet_speed_test_002.py
Traceback (most recent call last):
File "..//internet_speed_test_002.py", line 28, in <module>
excel_file.close()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/workbook.py", line 286, in close
self._store_workbook()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/workbook.py", line 509, in _store_workbook
xml_files = packager._create_package()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/packager.py", line 142, in _create_package
self._write_core_file()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/packager.py", line 325, in _write_core_file
core._assemble_xml_file()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/core.py", line 57, in _assemble_xml_file
self._write_dcterms_created()
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/core.py", line 122, in _write_dcterms_created
date = self._localtime_to_iso8601_date(date)
File "/Users/danielepemy/my_python_virtualenv/venv_002/lib/python2.7/site-packages/xlsxwriter/core.py", line 76, in _localtime_to_iso8601_date
return date.strftime("%Y-%m-%dT%H:%M:%SZ")
AttributeError: 'module' object has no attribute 'struct_time'
Do you have any files called time.py or time.pyc in your current directory? If so, rename or delete them (.pyc files can simply be deleted).
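A quick way to check for this kind of shadowing, as a sketch: look in the directory you run the script from for a file whose name matches the standard library module. The helper name shadowing_files is made up for this example.

```python
import os


def shadowing_files(module_name, directory="."):
    """Return local files that would shadow the given module name on import."""
    candidates = [module_name + ".py", module_name + ".pyc"]
    return [
        name for name in candidates
        if os.path.exists(os.path.join(directory, name))
    ]


# Check the current working directory for a stray time.py / time.pyc.
print(shadowing_files("time"))
```

An empty list means no such file exists in that directory; any name it returns is a candidate to rename or delete, as suggested above.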

Symbol not found: _PQclear (caldav server tools broken after Mavericks update on OSX Server)

After upgrading my server to Mavericks and Server.app version 3, these tools stopped working properly:
calendarserver_export
calendarserver_manage_principals
...
When I enter this command, for example:
sudo calendarserver_export -u <account_name>
this error is displayed:
Traceback (most recent call last):
File "/Applications/Server.app/Contents/ServerRoot/usr/sbin/calendarserver_export", line 32, in <module>
from calendarserver.tools.export import main
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/calendarserver/tools/export.py", line 50, in <module>
from calendarserver.tools.cmdline import utilityMain, WorkerService
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/calendarserver/tools/cmdline.py", line 21, in <module>
from calendarserver.tap.caldav import CalDAVServiceMaker, CalDAVOptions
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/calendarserver/tap/caldav.py", line 87, in <module>
from twistedcaldav.upgrade import UpgradeFileSystemFormatStep, PostDBImportStep
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/twistedcaldav/upgrade.py", line 63, in <module>
from calendarserver.tap.util import getRootResource, FakeRequest, directoryFromConfig
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/calendarserver/tap/util.py", line 87, in <module>
from txdav.base.datastore.subpostgres import PostgresService
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/txdav/base/datastore/subpostgres.py", line 35, in <module>
import pgdb
File "/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/pgdb.py", line 66, in <module>
from _pg import *
ImportError: dlopen(/Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/_pg.so, 2): Symbol not found: _PQclear
Referenced from: /Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/_pg.so
Expected in: flat namespace
in /Applications/Server.app/Contents/ServerRoot/usr/share/caldavd/lib/python/_pg.so
What all these tools have in common is that they access the CalDAV data via Python, through PyGreSQL (the CalDAV data is hosted in a Postgres database).
I managed to reproduce this problem very simply as follows:
#login with root
bob $ sudo -i
#launch python and import pg (postgres module for python)
root $ python
Python 2.7.5 (default, Mar 9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pg
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/PyGreSQL-4.1.1-py2.7-macosx-10.9-intel.egg/pg.py", line 31, in <module>
from _pg import *
ImportError: dlopen(/Library/Python/2.7/site-packages/PyGreSQL-4.1.1-py2.7-macosx-10.9-intel.egg/_pg.so, 2): Symbol not found: _PQclear
Referenced from: /Library/Python/2.7/site-packages/PyGreSQL-4.1.1-py2.7-macosx-10.9-intel.egg/_pg.so
Expected in: flat namespace
in /Library/Python/2.7/site-packages/PyGreSQL-4.1.1-py2.7-macosx-10.9-intel.egg/_pg.so
But if I launch Python from my user account (not root), I don't get the error.
I assume there is something wrong with the path, or something similar, but I can't find what.
I need root credentials to be able to use calendarserver_export.
Any clues?
For some reason, in Mavericks, you have to run these commands as the "calendar" user (an alias of _calendar).
So the command should be:
sudo -u calendar calendarserver_export -u
