Running Scipy on Heroku - heroku

I got Numpy and Matplotlib running on Heroku, and I'm trying to install Scipy as well. However, Scipy requires BLAS[1] to install, which is not presented on the Heroku platform. After contacting Heroku support, they suggested me to build BLAS as a static library to deploy, and setup the necessary environment variables.
So, I compiled libblas.a on a 64bit Linux box, and set the following variables as described in [2] :
$ heroku config
BLAS => .heroku/vendor/lib/libfblas.a
LD_LIBRARY_PATH => .heroku/vendor/lib
LIBRARY_PATH => .heroku/vendor/lib
PATH => bin:/usr/local/bin:/usr/bin:/bin
PYTHONUNBUFFERED => true
After adding scipy==0.10.1 in my requirements.txt, the push still fails.
File "scipy/integrate/setup.py", line 10, in configuration
blas_opt = get_info('blas_opt',notfound_action=2)
File "/tmp/build_h5l5y31i49e8/lib/python2.7/site-packages/numpy/distutils/system_info.py", line 311, in get_info
return cl().get_info(notfound_action)
File "/tmp/build_h5l5y31i49e8/lib/python2.7/site-packages/numpy/distutils/system_info.py", line 462, in get_info
raise self.notfounderror(self.notfounderror.__doc__)
numpy.distutils.system_info.BlasNotFoundError:
Blas (http://www.netlib.org/blas/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [blas]) or by setting
the BLAS environment variable.
It seem that pip is not aware of the BLAS environment variable, so I check the environment using heroku run python:
(venv)bash-3.2$ heroku run python
Running python attached to terminal... import up, run.1
Python 2.7.2 (default, Oct 31 2011, 16:22:04)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system('bash')
~ $ echo $BLAS
.heroku/vendor/lib/libfblas.a
~ $ ls .heroku/vendor/lib/libfblas.a
.heroku/vendor/lib/libfblas.a
~ $
And it seems fine. Now I have no idea how to solve this.
[1] http://www.netlib.org/blas/
[2] http://www.scipy.org/Installing_SciPy/Linux

I managed to get this working on the cedar stack by building numpy and scipy offline as bdists and then modifying the heroku python buildpack to unzip these onto the dyno's vendor/venv areas directly. You can also use the buildpack to set environment variables that persist.
Heroku haven't officially published buildpacks yet - search for 'heroku buildpacks' for more thirdparty/heroku ones and information.
My fork of the python build pack is here:
https://wyn#github.com/wyn/heroku-buildpack-python.git
The changes are in the bin/compile where I source two new steps, a scipy/numpy step and an openopt step. The scripts for the two steps are in bin/steps/npscipy and bin/steps/openopt. I also added some variables to bin/release. Note that I am assuming installation through a setup.py file rather than the requirements.txt approach (c.f. https://devcenter.heroku.com/articles/python-pip#traditional_distributions).
I also download the blas/lapack/atlas/gfortran binaries that were used to build numpy/scipy onto the dyno as there are c extensions that need to link through to them. The reason for building everything offline and downloading is that pip-installing numpy/scipy requires you to have a fortran compiler + associated dev environment and this made my slugs too big.
It seems to work, the slug size is now 35mb and scaling seems fast too. All but one of the numpy tests pass and all of the scipy tests pass.
This is still work in progress for me but I thought I'd share.

In case someone else comes to this like I did...
The excellent answer on this question from #coshx sadly no longer works (at least I could not get it to work). What I did, however was the following:
I forked npsicpy-binaries repository from #coshx and changed all the tar files such that they do not have venv as the root folder inside (my fork is here)
I forked the npsp-helloworld repository from #coshx and made it use a requirements.txt file instead of the setup.py (my fork is here - this means that you can use the whole pip approach).
I forked the heroku-buildpack-python repository from Heroku, took the npscipy file from #coshx and changed it to work with this latest version of the build pack (my fork is here - you can see that there is no venv set up, for example).
After doing those three things I have the npsp-helloworld application working perfectly. You just need to make sure you set the buildpack appropriately when creating the application and you are good to go:
$ heroku create --stack=cedar --buildpack=https://github.com/kmp1/heroku-buildpack-python.git
NOTE: I haven't worked out how to make my own binaries yet, so the three libraries (scipy, numpy and scikit-learn) are not the latest versions so make sure when you install it you do (if anyone can build these I would gladly accept a pull request for them):
pip install scipy==0.11.0
pip install numpy==1.7.0
pip install scikit-learn==0.13.1
By the way - I am really sorry if I have not done things in the correct way, etiquette-wise. I'm still learning git and this whole open source thing.

Another good option is the conda buildpack, which allows you to add any of the free Linux64 packages available through Anaconda/Miniconda to a Heroku app. Some of the most popular packages include numpy, scipy, scikit-learn, statsmodels, pandas, and cvxopt. While the buildpack makes it fairly simple to add packages to an app, the downsides are that the buildback takes up a lot of space and that you have to wait on Anaconda to update the libraries in the repository.
If you are starting a new Python app on Heroku, you can add the conda buildpack using the command:
$ heroku create YOUR_APP_NAME --buildpack https://github.com/kennethreitz/conda-buildpack.git
If you have already setup a Python app on Heroku, you can add the conda buildpack to the existing app using the command:
$ heroku config:add BUILDPACK_URL=https://github.com/kennethreitz/conda-buildpack.git
Or, if you need to specify the app by name:
$ heroku config:add BUILDPACK_URL=https://github.com/kennethreitz/conda-buildpack.git --app YOUR_APP_NAME
To use the buildpack, you will need to include two text files in the app directory, requirements.txt and conda-requirements.txt. Just as with the standard Python buildpack, the requirements.txt file lists packages that should be installed using pip. Packages that should be installed using conda are listed in the conda-requirements.txt file. Some of the most useful scientific packages include numpy, scipy, scikit-learn, statsmodels, pandas, and cvxopt. The full list of available conda packages can be found at repo.continuum.io.
For example:
$ cat requirements.txt
gunicorn==0.14.2
requests==0.11.1
$ cat conda-requirements.txt
scipy
numpy
cvxopt
That’s it! You can now add Anaconda packages to a Python app on Heroku.

The slug compiler is not aware of your environment variables, which is why it fails during push, and not once running.
The only real option you have is to look at the user_env_compile addon that's currently in labs beta.
http://devcenter.heroku.com/articles/labs-user-env-compile

For those who wish to use Python 3.4 in production, I've built numpy 1.8.1, scipy 0.14.0, and scikit-learn 0.15-git (0.14 doesn't work with the others for some reason) as binaries on Ubuntu 10.04 LTS 64-bit, which works on the Heroku cedar stack. Here is the git repo.
My heroku buildpack is quite similar to that of kmp. Note that the bin/steps/npscipy file links to my repository of binaries above.

I'm putting this here in case someone stumbled on this like I do. Since I am using python 3.4, thenovices buildpack doesn't work out for me.
I tried to look at conda buildpack, but it seems like a little bit of an overkill for my application which only requires scipy and numpy. What finally works for me is this excellent guide, using a multi buildpacks approach. The scipy installation was indeed a bit long though. Hope this helps! :)

Related

Trying to Avoid Using Two Package Managers (pip and Poetry) for the Same Project

After a fair bit of thrashing, I successfully installed the Python Camelot PDF table extraction tool (https://pypi.org/project/camelot-py/) and it works for the intended purpose. But in order to get it to work, aside from having to correct a deprecated dependency (by editing pyproject.toml and setting PyPDF2 =”2.12.1”) I used pip to install Camelot from within a Poetry (my preferred package manager) environment- because I haven’t yet figured out any other way.
Since I’m very new to Python and package management (but not to programming) I have some holes in my basic understanding that I need to patch up. I thought that using two package managers on the same project in principle defeats the purpose of using package managers, so I feel like I’m lucky that it works. Would love some input on what I’m missing.
The documentation for Camelot provides instructions for installing via pip and conda (https://camelot-py.readthedocs.io/en/master/user/install-deps.html), but not Poetry. As I understand (or misunderstand) it, packages are added to Poetry environments via the pyproject.toml file and then calling "poetry install."
I updated pyrpoject.toml as follows, having identified the current Camelot version as 0.10.1 (camelot --version):
[tool.poetry.dependencies]
python = "^3.8"
PyPDF2 = "2.12.1"
camelot = "^0.9.0"
This led to the error:
Because camelot3 depends on camelot (^0.9.0) which doesn't match any versions, version solving failed.
Same problem if I set (camelot = "0.10.1"). So I took the Camelot reference out of pyproject.toml, and ran the following command from within my Poetry virtual environment:
pip install “camelot-py[base]”
I was able to successfully proceed from here, but that doesn’t feel right. Is it wrong to try to force this project into Poetry, and should I instead consider using different package managers for different projects? Am I misunderstanding how Poetry works? What else am I missing here?
Whenever you see pip install 'Something[extra]' you can replace it with poetry add 'Something[extra]'.
Alternatively you can write it directly in the pyproject.toml and then run poetry install instead:
[tool.poetry.dependencies]
# ...
Something = { extras = ["extra"] }
Note that in your question you wrote camelot in the pyproject.toml but it is camelot-py that you should have written.

How to install solaris (on colab or locally)?

I try since multiple days to install solaris (https://github.com/CosmiQ/solaris) locally, on google colab or on renkulab (https://renkulab.io/). Up to now, without any luck. I tried on all platform different approaches:
Creating a conda environment (as recommended by the authors)
Directly through pip
And also cloning the repository and access the folders and functions directly
All of these approaches failed so far. Mostly there is a wheel building error for GDAL. Which i have installed first. I do not find any proper documentation or other failure descriptions which makes me question myself... Maybe here someone has experience with this library?
I highly appreciate every hint.
Thanks a lot
Colab Setup
I can get it set up in Colab with the following:
First Cell: Install Mamba/Conda
!pip install -q condacolab
import condacolab
condacolab.install()
This will trigger a runtime restart - it does this on purpose.
Second Cell: Install Solaris prerequisites
I'm assuming we want the GPU-enabled version. If not, there is another YAML in the solaris repository for a CPU-only environment.
!wget https://raw.githubusercontent.com/CosmiQ/solaris/main/environment-gpu.yml
!mamba env update -n base -f environment-gpu.yml
Manually restart the runtime after this completes!
Third Cell: Install Solaris
!pip install solaris
That should be it. Following these steps, I could import the module and use the entrypoints, e.g.,
Module import ✅
import solaris
Example entrypoint ✅
!make_masks -h
There were some future deprecation warnings from NumPy about some syntax in the TensorFlow code, but otherwise, seems functional. However, I don't personally use this tool, so I don't know if there is more to verify.

Pip install local package in conda environemnt

I recently developed a package my_package and am hosting it on GitHub. For easy installation and use, I have following setup.py:
from setuptools import setup
setup(name='my_package',
version='1.0',
description='My super cool package',
url='https://github.com/my_name/my_package',
packages=['my_package'],
python_requieres='3.9',
install_requires=[
'some_package==1.0.0'
])
Now I am trying to install this package in a conda environment:
conda create --name myenv python=3.9
conda activate myenv
pip install git+'https://github.com/my_name/my_package'
So far so good. If I try to use it in the project folder, everything works perfectly. If I try to use the packet outside the project folder (still inside the conda environment), I get the following error:
ModuleNotFoundError: No module named 'my_package'
I am working on windows, if that matters.
EDIT:
I'm verifying that both python and pip are pointing towards the correct version with:
which pip
which python
/c/Anaconda3/envs/my_env/python
/c/Anaconda3/envs/my_env/Scripts/pip
Also, when I run:
pip show my_package
I get a description of my package. So pip finds it, but as soon as I try to import my_package in the script, I get the described error.
I also verified that the package is installed in my environment. So in /c/Anaconda3/envs/my_env/lib/site-packages there is a folder my_package-1.0.dist-info/
Further: python "import sys, print(sys.path)"
shows, among other paths, /c/Anaconda3/envs/my_env/lib/site-packages. So it is in the path.
Check if you are using some explicit shebang in your script pointing to other Python interpreters.
Eg. using the system default Python:
#!/bin/env python
...
While inside your environment myenv, try to uninstall your package first, to do a clean test:
pip uninstall my_package
Also, you have a typo in your setup.py: python_requieres --> python_requires.
And I actually tried to install with your setup.py, and also got ModuleNotFoundError - but because it didn't properly install due to install_requires:
ERROR: Could not find a version that satisfies the requirement some_package==1.0.0
So, check also that everything installs without errors and warnings.
Hope that helps.
First thing I would like to point out (not the solution) regards the following statement you made:
If I try to use it in the project folder [...] If I try to use the packet outside the project folder [...]
I understand "project folder" means the "my_package" folder (inside the git repository). If that is the case, I would like to point out that you are mixing two situations: that of testing a (remote) package installation, while in your (local) repository. Which is not necessarily wrong, but error-prone.
Whenever testing the setup/install process of a package, make sure to move far from your repository (say, "/tmp/" equivalent in Windows) and, preferably, use a fresh environment. That will eliminate "noise" in your tests.
First thing I would tell you to do -- if not already -- is to create a fresh conda env and install your package from an empty/new folder. Eg,
$ conda env create -n test_my_package ipython pip
$ cd /tmp # equivalent temporary or new in your Windows
$ pip install git+https://github.com/my_name/my_package
If that doesn't work (maybe a problem with your pip' git+http code), do another way: create a release for your package (eg, "v1") and then install the released version by indicating the zip package URL (that you get from your "my_package" releases page on Github):
$ pip install https://github.com/my_name/my_package/archive/v1.zip

ModuleNotFoundError: No module named 'yaml'

I have used a YAML file and have imported PyYAML into my project.
The code works fine in PyCharm, however on creation of an egg and running the egg gives an error as module not found on command prompt.
You have not provided quite enough information for an exact answer, but, for missing python modules, simply run
py -m pip install PyYaml
or, in some cases
python pip install PyYaml
You may have imported it in your project (on PyCharm) but you have to make sure it is installed and imported outside of the IDE, and on your system, where the python interpreter runs it
I have not made an .egg for some time (you really should be consider using wheels for distributing packages), but IIRC an .egg should have a requires.txt file with an entry that specifies dependency on pyyaml.
You normally get that when setup() in your setup.py has an argument install_requires:
setup(
...
install_requires=['pyyaml<4']
...
)
(PyYAML 4.1 was retracted because there were problems with that version, but it might be in your local cache of PyPI as it was in my case, hence the <4, which restricts installation to the latest 3.x release)

jython 2.7 package installation

Jython Package installation issue, using pip
Hi, I have installed Jython2.7 configured with pydev in eclipse neon, also configured python 3.6 package
I am able to install packages for python using pip installer?
pip install "packagename"
Below are some of the packages in python/Lib/Site-packages directory
I was able to install all the packages
How do I use pip installer to install packages for jython?
I tried to install Jip package with
jython install setup.py
The binary File got installed in the Jython/Lib/Site-packages folder
However, I am not able to use it.
where and how do I get Jython package binaries like jip?
Also, Please let me know how to search jython packages?
Also, How to make pip install library packages in jython?
Any other configuration like jython home, etc that should be made?
This answer is going to be really generic but I just recently have slogged my way through the setup for jython/jip/pip and here's roughly what I had to do.
Firstly, I'm running Windows 7 64 Bit from behind a proxy (work machine.)
Had to install jython 2.7.0 instead of 2.7.1 because (I think anyway) 2.7.1 requires admin privileges which I don't have on my work PC.
Pip didn't install correctly during the Jython installation and I spent an obscene amount of time trying to get it installed and functioning as I knew it from my cpython days. NOTE: Just because you get pip installed, doesn't mean you can use any package on a python package repo. As of 2.7.0, Jython doesn't have end to end capability to interpret/compile some libraries that rely on certain python wrappers of native OS function calls. I believe 2.7.1 makes solid progress in the direction of supporting all needed native calls but don't quote me on that. For example, I tried to use wxPython to make a simple GUI to test my jython install. Trying to install it from pip kept causing really non-specific error info that took me a lot of time to figure out that the cause was jython simply couldn't compile the wxPython source so beware.
I had to set environment variables 'http_proxy' and 'https_proxy' in the form of http://proxyhosturl:port and https://proxyhosturl:port respectively to get out from behind the proxies without having to invoke pip with the proxy switch every time I called it.
To actually install pip, have a look here. These instructions are for Python and Linux/Unix but the principle is roughly the same. Just use jython -m instead of python -m and ignore the '$' at the start of each command line.
Also be sure to CD to your python_home/bin folder when invoking the ez_install exe.
If that doesn't work (didn't for me), try using get-pip.py script with these instructions https://pip.pypa.io/en/stable/installing/ (remember jython instead of python etc.). Download it, cd to the download location and follow the noted install steps. Worth noting is about half way down the install instructions where it details installing from local archives (source/binary zip or tar.gz archives of pip and setuptools as better described here: https://packaging.python.org/tutorials/installing-packages/#installing-from-local-archives).
The links to the bin archives of pip and setuptools are here:
https://pypi.python.org/pypi/setuptools
https://pypi.python.org/pypi/pip
It may also be worth making sure that your PATH environment variable has the jython/bin path in the variable value. The jython installer should do this but, again, mine did not.
If all goes well, you should be able to invoke pip with the --version switch and if it prints a line with the installed pip version info then you should be good to go
Another quirky issue I had was I could invoke a function of pip one time and any subsequent times I would get a stack trace ending with something along the lines of an object not having a certain property. I fixed that by finding my temp directory by opening a windows explorer instance and typing %TEMP% in the address bar and hitting enter, it should take you to a subdir of your AppData folder and there you may see a folder with the name of the package you were trying to install and the text "_pip" somewhere in the directory name. Delete the directory and try the pip install command again. I had to do this + invoke pip install pip -U to update my install to the latest version. Then pip began behaving correctly in my instance.
pip search numpy (or your library name) will generate a list of results with the same logic it uses to locate your desired package when you call pip install but, again, just because it returns a matching package doesn't mean it will compile when you install it (numpy doesn't work because of the missing java to C native function calls I described earlier.) The trade off is that you can import code artifacts from Java JAR files in your Jython script files and leverage their functionality with relative ease. Between the public Java APIs available and the python packages that work with the jython interpretor, you can (in my experience) come up with a way to accomplish your task. See the following info on JIP, Maven and IDEs.
IDE and jython integration (Eclipse)
- If you are stuck using Eclipse (like me) it actually has pretty decent support for python development. Install the PyDev plugin for Eclipse from Help -> Install Software. Put in this URL https://marketplace.eclipse.org/content/pydev-python-ide-eclipse, hit tab, and select the PyDev plugin and hit 'finish.'
- Setup the jython interpretor info from Windows -> Preferences -> PyDev. Provide the path to your jython.jar file.
- You should now be able to use File -> New PyDev project to create a basic python project and configure it to use your version of Jython and Java.
Brief Overview of Jip and Maven
- jip is a jython package that is invoked very similarly to pip but instead will download JAR files from the Maven Central Repository instead of python packages from pypi.com, for example. See the install instructions described here. Note the install procedure for a global jip install which differ from just pip install jip. https://pypi.python.org/pypi/jip/
- I never got jip to work exactly as I wished because there's not a ton of documentation on it outside of what I already linked. However, if you install a JAR using jip, you have to go to your project in Eclipse and actually add the JARs themselves to your PYTHONPATH in order for import statements and editing to have intellisense and so that you don't get a classnotfound exception at runtime. See following screen shot.
- There is a JIP config file that you can use similar to the pip config ini file but I have yet to find any exhaustive documentation on it's setup.
Note in the above screen shot the first entry in the External libraries entries. By default, pip places installed packages in that directory so to enable eclipse to find them, you need to also ensure that location is entered.
In Conclusion
- I have more to add to this answer and I will do so as soon as possible. In the meantime, see this example project I've loaded into github.
https://github.com/jheidlage1222/jython_java_integration_example
It shows basic config and how to interface with JARs from python code. I used the apache httpcomponents library as an example. Good luck amigo.

Resources