pyspark: pip install couldn't find a version - pip

I am trying to install pyspark using pip install as shown below, but I get the following errors.
(python_virenv)edamame$ pip install pyspark
Collecting pyspark
Could not find a version that satisfies the requirement pyspark (from versions: )
No matching distribution found for pyspark
Does anyone have any idea? Thanks!

As of Spark 2.2, PySpark is now available in PyPI.
pip install pyspark
As of Spark 2.1, PySpark is pip installable but not yet from PyPI; publishing to PyPI is under consideration for 2.2 in this ticket. To install PySpark, you just need to download Spark 2.1+ and run setup.py through pip:
cd spark-2.1/python/
pip install -e .
Big thanks to #Holden!
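To confirm that either install route actually worked, one option is to ask Python's packaging metadata for the installed version. This is only a minimal sketch: the helper name installed_version is made up for illustration, and it relies on importlib.metadata, which needs Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    # Prints a version string if pyspark is installed here, otherwise None.
    print("pyspark:", installed_version("pyspark"))
```

Run it with the same interpreter (or virtualenv) that you used for pip install, otherwise it may report on a different environment.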

pyspark is not in PyPI, so you cannot install it directly with pip install.
Instead, you can download a suitable version of Spark here: http://spark.apache.org/downloads.html, which gives you a compressed TAR file. Unpack it, and pyspark is in its python folder.
To open the Python version of the Spark shell, go into your Spark directory and type:
bin/pyspark
or, on Windows:
bin\pyspark
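If you want to import pyspark from a regular Python session instead of going through bin/pyspark, a rough sketch is to put the unpacked distribution's python folder on sys.path. The function names here are hypothetical, and the sketch assumes the usual layout in which Spark ships py4j as a source zip under python/lib.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the paths that expose pyspark from an unpacked Spark distribution."""
    python_dir = os.path.join(spark_home, "python")
    # Assumption: py4j is bundled as python/lib/py4j-<version>-src.zip.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_spark_to_sys_path(spark_home):
    """Prepend the Spark python folder and the py4j zip to sys.path."""
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)
```

For example, add_spark_to_sys_path("/opt/spark") followed by import pyspark, where /opt/spark stands in for wherever you unpacked the TAR file.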

pyspark doesn't even exist in PyPI, as you can see from https://pypi.python.org/pypi?%3Aaction=search&term=pyspark&submit=search. That's why pip is telling you it can't find it.

PySpark can be installed in the following ways.
Download Spark from: Spark Downloads
Extract the compressed file. From the extracted folder, execute
./bin/pyspark
You might want to add the bin folder to the $PATH variable of your shell as well.
Or,
You can install it from the CDH distribution:
Add the CDH keys following the steps here:
http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_cdh5_install.html
Install Spark following the steps here:
http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_ig_spark_install.html#xd_583c10bfdbd326ba--6eed2fb8-14349d04bee--7ef8
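For the first option, a quick way to check whether the bin folder really ended up on your PATH is to look the launcher up the same way the shell would. A small sketch using the standard library's shutil.which; the wrapper name find_on_path is just for illustration.

```python
import shutil


def find_on_path(command, path=None):
    """Return the full path of an executable found on PATH, or None.

    When path is None, the PATH of the current process is searched.
    """
    return shutil.which(command, path=path)


if __name__ == "__main__":
    # None here means the Spark bin folder is not on PATH for this process.
    print(find_on_path("pyspark"))
```

If this prints None right after you edited your shell profile, open a new terminal first so that the updated PATH is picked up.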


pyspark 3.x in pypi limited to hadoop 2.7.4

I want to pip install pyspark into my python3 virtual environment but the only choice I have is the PyPI version compiled with Hadoop 2.7.4 dependencies. I need a Hadoop 3.x version since 2.7.4 is too old for modern AWS S3 integration.
Does anyone know why there isn't an option to pip install pyspark with Hadoop 3.x support?
Is my only option to build my own pyspark from source?
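Not an answer to the "why", but to confirm which Hadoop version a given pip-installed pyspark actually bundles, you could inspect the jar file names inside the package. This is a sketch under assumptions: it expects the jars to sit in a jars directory inside the pyspark package, and the jar name patterns (hadoop-common, hadoop-client, hadoop-client-api, hadoop-client-runtime) are my guess at the naming, so adjust the regex if your install differs.

```python
import importlib.util
import os
import re

# Assumed naming: hadoop-common-X.Y.Z.jar, hadoop-client-X.Y.Z.jar,
# hadoop-client-api-X.Y.Z.jar, hadoop-client-runtime-X.Y.Z.jar
_HADOOP_JAR = re.compile(
    r"^hadoop-(?:common|client(?:-api|-runtime)?)-(\d+(?:\.\d+)+)\.jar$"
)


def pyspark_jars_dir():
    """Return <pyspark package>/jars, or None if pyspark is not installed."""
    spec = importlib.util.find_spec("pyspark")
    if spec is None or spec.origin is None:
        return None
    return os.path.join(os.path.dirname(spec.origin), "jars")


def bundled_hadoop_versions(jars_dir):
    """Collect the distinct Hadoop versions found in the jar file names."""
    versions = set()
    for name in os.listdir(jars_dir):
        match = _HADOOP_JAR.match(name)
        if match:
            versions.add(match.group(1))
    return sorted(versions)


if __name__ == "__main__":
    jars = pyspark_jars_dir()
    if jars and os.path.isdir(jars):
        print(bundled_hadoop_versions(jars))
    else:
        print("pyspark (or its jars directory) was not found")
```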

pip trying to install using old version of python

I just uninstalled the wrong version of Python (3.8) and installed Python 3.7.4.
Now I'm trying to install packages using the command pip install X and get the following error:
C:\Users\User>pip install cv2
Fatal error in launcher: Unable to create process using '"c:\users\user\appdata\local\programs\python\python38-32\python.exe" "C:\Users\User\AppData\Local\Programs\Python\Python38-32\Scripts\pip.exe" install cv2': The system cannot find the file specified.
Clearly it is still trying to use the old Python 3.8 installation, even though I have uninstalled it and reinstalled pip several times.
Any idea why it's trying to look for this old path, and how I can change the default path it is using?
(By the way, this is just a matter of convenience, because at the moment the command python -m pip install X does seem to work.)
Try the command python -m pip install --upgrade pip
Delete and clear the environment configuration, then install again.
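The reason python -m pip keeps working is that it runs pip under whichever interpreter you invoke, whereas the pip.exe launcher has an interpreter path baked in (the python38-32 path in the error message). If you install packages from a script, the same idea can be sketched like this; pip_command and run_pip are hypothetical helper names.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip under the current interpreter and return its exit code."""
    return subprocess.run(pip_command(*args), check=False).returncode


if __name__ == "__main__":
    # Shows which interpreter pip would run under.
    print(pip_command("--version"))
```

Calling run_pip("install", "X") then installs into the interpreter that is running the script, regardless of what the stale launcher points to.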

Unable to install coremltools

I am trying to convert a pb file to a Core ML file. To do this I need to install coremltools. However, when I try to install it with pip, I get this error:
ERROR: Could not find a version that satisfies the requirement coremltools (from versions: none)
ERROR: No matching distribution found for coremltools
I have tried to install it in a Python 2.7 environment, still no joy:
pip install coremltools
Collecting coremltools
ERROR: Could not find a version that satisfies the requirement coremltools (from versions: none)
ERROR: No matching distribution found for coremltools
Rorys-MBP:~ roryhodgson$
The only explanation I could find is that coremltools requires Python 2.7. Check which Python your pip belongs to by running pip --version. If you just typed pip install coremltools, chances are that your machine's pip command (assuming it is running macOS) is using the default macOS Python, which is probably 3.5.2 or greater.
I fixed this issue by creating an environment in which my Python version was 2.7:
pip install virtualenv
Create a virtual environment:
virtualenv --python=/usr/bin/python2.7 py27
Activate it:
source py27/bin/activate
Lastly, install coremltools:
pip install -U coremltools
When you are done, just deactivate the environment by running deactivate in the terminal, and that's it.
All this is available at the following source: satoshi.blogs.com
If you install from GitHub, then you will not need to install Python 2.7 or fiddle with virtual environments.
pip install "git+https://github.com/apple/coremltools"
The command above installs coremltools by cloning the Git repository.
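Since "from versions: none" usually means pip found nothing on the index that fits the interpreter and platform it is running under, it can help to print exactly what that interpreter is before trying either fix. A minimal sketch; interpreter_summary is a made-up helper name.

```python
import platform
import sys


def interpreter_summary():
    """Describe the interpreter that `python -m pip` would install into."""
    return {
        "executable": sys.executable,
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(f"{key}: {value}")
```

Comparing this output with what pip --version reports shows whether pip and python refer to the same installation.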

pip search shows apache-beam 2.9 but pip install apache-beam only installs apache-beam 2.2

In a fresh virtual environment, I ran
pip search apache-beam
I got
apache-beam (2.9.0)
Then I run
pip install apache-beam
pip list
But I got apache-beam 2.2 installed, instead of 2.9
apache-beam 2.2.0
I then run
python -m apache_beam.examples.wordcount --output cout
I got the error
The Apache Beam SDK for Python is supported only on Python 2.7.
According to this document
https://towardsdatascience.com/hands-on-apache-beam-building-data-pipelines-in-python-6548898b66a5
Beam 2.9 will support Python 3. pip search finds apache-beam 2.9, but pip install still gives me apache-beam 2.2.
Please help.
I had the same requirement, and installing Apache Beam this way worked for me.
Step 01: Make sure Python 3.7 or above is installed.
Step 02: Pick a Beam version; I chose 2.27 for my requirement:
pip3 install apache-beam==2.27.0
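One plausible explanation for the 2.2 vs 2.9 mismatch, which the thread does not confirm, is that pip skips releases whose declared Python requirement excludes the running interpreter and falls back to the newest release that does not exclude it. The selection logic can be sketched as below; the release metadata in the example is hypothetical and only there to illustrate the mechanism, not taken from the real apache-beam index entries.

```python
def newest_compatible(releases, python_version):
    """Pick the newest release whose Python bounds admit python_version.

    releases: list of (version_tuple, min_python, max_python_exclusive),
    where a bound of None means "no restriction declared".
    """
    compatible = [
        version
        for version, lowest, upper in releases
        if (lowest is None or python_version >= lowest)
        and (upper is None or python_version < upper)
    ]
    return max(compatible) if compatible else None


if __name__ == "__main__":
    # Hypothetical metadata: an old release with no declared bound and a
    # newer one that declares Python 2.7 only.
    releases = [
        ((2, 2, 0), None, None),
        ((2, 9, 0), (2, 7), (3, 0)),
    ]
    print(newest_compatible(releases, (3, 7)))  # (2, 2, 0)
    print(newest_compatible(releases, (2, 7)))  # (2, 9, 0)
```

Pinning the version explicitly, as in the pip3 command above, sidesteps this selection: pip then either installs that exact release or fails outright instead of silently falling back.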

Problem with tensorflow while downloading rasa_core

I’m trying to update the rasa bot source code from a friend of mine but I had a problem when trying to download rasa_core:
(cha_env) C:\Users\antoi\Documents\Programming\Nathalie\Chatbot_RASA_room_reservation>pip install rasa_core
Collecting rasa_core
...
Collecting tensorflow==1.10.0 (from rasa_core)
Could not find a version that satisfies the requirement tensorflow==1.10.0 (from rasa_core) (from versions: )
No matching distribution found for tensorflow==1.10.0 (from rasa_core)
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
Do you have any idea? Is it because of the requirements?
I tried to install tensorflow according to this answer:
python3 -m pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.12.0-py3-none-any.whl
But I still have the same issue.
I remember seeing this issue when my colleague was installing rasa_core. Try installing tensorflow separately, then continue with the rasa_core installation. That should solve the problem:
pip install tensorflow
If you get a No matching distribution found for tensorflow error, try the following.
Note: This works on 64-bit Windows for tensorflow version 1.12.0. For all variants of the tensorflow distribution, please check https://storage.googleapis.com/tensorflow/
python -m pip install --upgrade https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-windows-x86_64-1.12.0-rc2.zip
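Because the note above is specific to 64-bit Windows, it is worth checking whether the Python inside your environment is itself a 64-bit build. TensorFlow wheels are published for 64-bit interpreters, so a 32-bit Python is one common cause of this error; whether that applies to this thread is an assumption on my part. A minimal sketch, with interpreter_bits as a hypothetical helper name:

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of this Python build."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}, "
          f"{interpreter_bits()}-bit")
```

Run it inside the activated environment (cha_env in the question), since that is the interpreter pip resolves wheels for.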
