requirements.txt vs Pipfile in Heroku Flask webapp deployment?

I'm trying to deploy a Flask webapp to Heroku and I have seen conflicting information as to which files I need to include in the git repository.
My webapp is built within a virtual environment (venv), so I have a Pipfile and a Pipfile.lock. Do I also need a requirements.txt? Will one supersede the other?
Another related question: if a certain package fails to install in the virtual environment, can I manually add it to the requirements.txt or Pipfile? Would that effectively do the same thing as pipenv install ..., or does that command do something beyond adding the package to the list of requirements (considering that Heroku installs the packages upon deployment)?

You do not need requirements.txt.
The Pipfile and Pipfile.lock that Pipenv uses are designed to replace requirements.txt. If you include all three files, Heroku will ignore the requirements.txt and just use Pipenv.
If you have build issues with a particular library locally, I urge you to dig into that and get everything working on your local machine. But this isn't technically required... as long as the Pipfile and Pipfile.lock contain the right information (including hashes), Heroku will try to install your dependencies.
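Purely as an illustration (not taken from the answer above), a minimal Pipfile for a Flask app on Heroku could look roughly like this; the gunicorn entry and the Python version are assumptions:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
flask = "*"
gunicorn = "*"    # assumed WSGI server referenced from the Procfile

[requires]
python_version = "3.10"

Manually adding a line under [packages] and re-locking with pipenv lock gets you most of the way to what pipenv install <package> does; the difference is that pipenv install also resolves and installs the package into your local virtual environment at the same time.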

Related

Build a requirements.txt for a Streamlit app that directs to a Python script, which in turn determines the proper version of the package to be installed

I am preparing a Streamlit app for object detection that can do both training and inference.
I want to prepare a single requirements.txt file that works whether the app is run locally or deployed to Streamlit Cloud.
On Streamlit Cloud I obviously won't be doing training, because that requires a GPU; in that case the user should clone the GitHub repo and run the app locally.
Now I come to the dependencies and requirements.txt. If I am running locally, I want to include, say, opencv-python, but if I am running on Streamlit Cloud, I would want to include opencv-python-headless instead. PS: I have a couple of cases like this, e.g. pytorch+cuXX for local (GPU enabled) and pytorch for Streamlit Cloud (CPU only).
My question is how to reflect this in the requirements.txt file. I came across pip environment markers for conditional installation of pip packages, but the available environment markers cannot tell if the app is local or deployed. I wonder if there is a solution to this.
I came across this response, which could help a lot. As far as I understand it, I can write a setup.py and use subprocess.run to pip install the package, doing the preliminary work in that setup.py to determine the correct version to install.
Given this (if it is a correct approach), I would then call this setup.py from requirements.txt, where each line behaves like a pip install <package_name>. I don't know yet how to do such a thing. If the overall approach is not suitable for my case, I would appreciate your help.
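For context, this is roughly what pip environment markers look like in a requirements.txt (the package split here is only an illustration); as noted above, the available markers cover things like platform and Python version, so they cannot distinguish a local run from a Streamlit Cloud deployment:

# illustrative only: choose the OpenCV build based on the platform marker
opencv-python-headless; sys_platform == "linux"
opencv-python; sys_platform == "darwin" or sys_platform == "win32"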

Heroku deploy and install from wheel saved in git repo

Is there a way to install a dependency (listed in requirements.txt) not from PyPI but from a wheel saved in the git repo while deploying?
This question might sound odd at first, but it is simply due to the fact that I cannot share the wheel on PyPI.
First you deploy your application to Heroku, then you can use a Heroku bash session to install any requirements.
Run this command to start a bash session on Heroku:
heroku run bash
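Inside that session, the wheel committed to the repo can be installed directly from its path; a sketch, with a placeholder path (keep in mind that heroku run starts a one-off dyno, so packages installed this way do not persist into the running web dynos):

# from the one-off dyno's shell; the wheel path is a placeholder
pip install ./wheels/privatepkg-1.0-py3-none-any.whl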

Getting pipenv working in a Virtualenv where the App is working

I have my Django App working in a Virtualenv.
I would like to switch to pipenv. However, pipenv install fails with a dependency error.
Given that the App is working, I guess all the libraries are in the Virtualenv.
When getting the App working through Virtualenv + pip, I had to resolve the library dependencies, but I was able to and got it working. The thinking behind moving to pipenv is to avoid dependency issues in a multiple-member team setup.
Is there a way to tell pipenv to just take the versions of the libraries in the virtualenv and just go with it?
If you have a setup.py file you can install it with pipenv install . (note the trailing dot). Or even better, make it an editable development dependency: pipenv install -e . --dev.
You can also create a Pipfile/virtual env from a requirements.txt file. So you could do a pip freeze, then install from the requirements file.
Freezing your dependencies
From your working app virtual env, export your dependencies to a requirements file.
pip freeze > frozen-reqs.txt
Then create a new virtual env with pipenv, and install from the frozen requirements.
pipenv install -r frozen-reqs.txt
Then go into the Pipfile and start removing everything but the top-level dependencies, and re-lock. Also, wherever possible, avoid pinning requirements, as this makes dependency resolution much harder.
You can use pipenv graph and pipenv graph --reverse to help with this.
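Putting those steps together, the workflow might look roughly like this (pipenv lock at the end re-locks after you have trimmed the Pipfile by hand):

# inside the existing, working virtualenv: export the installed packages
pip freeze > frozen-reqs.txt
# create a pipenv-managed environment and Pipfile from the frozen list
pipenv install -r frozen-reqs.txt
# inspect the dependency tree to spot the top-level packages
pipenv graph
# see what depends on a given package before removing it from the Pipfile
pipenv graph --reverse
# after trimming the Pipfile down to unpinned, top-level dependencies
pipenv lock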

Conda environment from .yaml offline

I would like to create a Conda environment from a .yaml file on an offline machine (i.e. no Internet access). On an online machine this works perfectly fine:
conda env create -f environment.yaml
However, it doesn't work on an offline machine as the packages are then not found. How do I do this?
If that's not possible, is there another easy way to get my complete Conda environment to an offline machine (including both Conda- and pip-installed packages)?
Going through the packages one by one to install them from the .tar.bz2 files works, but it is quite cumbersome, so I would like to avoid that.
If you can use pip to install the packages, you should take a look at devpi, particularly its server. devpi can cache packages normally installed from PyPI, so only on the first install does it actually retrieve them. You have to configure pip to retrieve the packages from the devpi server.
As you don't want to list all the packages and their dependencies by hand, you should, on a machine connected to the internet:
install the devpi server (I have that running in a Docker container)
run your installation
examine the devpi repository and gather all the .tar.bz2 and .whl files out of there (you might be able to tar the whole thing)
On the non-connected machine:
Install the devpi server and client
use the devpi client to upload all the packages you gathered (using devpi upload) to the devpi server
make sure you have pip configured to look at the devpi server
run pip; it will find all the packages on the local server.
devpi has a small learning curve, which is already worth traversing because of the speed-up and the ability to install private packages (i.e. ones not uploaded to PyPI) as normal dependencies, by just building the package and uploading it to your local devpi server.
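As a rough sketch of the devpi side on the offline machine (the index URL, password, and file name below are placeholders, not values from the answer):

# upload each gathered release file to the local devpi index
devpi use http://localhost:3141/root/dev
devpi login root --password ''
devpi upload path/to/somepackage-1.0-py3-none-any.whl
# then point pip at the index, e.g. in ~/.pip/pip.conf:
# [global]
# index-url = http://localhost:3141/root/dev/+simple/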
I guess that Anthon's solution above is pretty good, but just in case anybody is interested in an easy solution that worked for me:
I first created a .yaml file specifying the environment using conda env export > file.yaml. Then, following the instructions on http://support.esri.com/en/technical-article/000014951, I:
automatically downloaded all the necessary installation files for the conda-installed packages and created a channel from those files (I just adapted the code from the link above to work with my .yaml file instead of the conda list file they used)
automatically downloaded the necessary files for the pip-installed packages by looping through the pip entries in the .yaml file and using pip download on each of them
automatically created separate conda and pip requirement lists from the .yaml file
created the environment using conda create with the offline flag, the file with the conda requirements, and my custom channel
finally, installed the pip requirements using pip install with the pip requirements file and the folder containing the pip installation files passed via --find-links
That worked for me. The only problem is that, when you need to specify a different operating system than the one you are running, pip download can only fetch binaries, and for some packages no binaries are available. That was okay for me for now, as the target machine has the same characteristics, but it might be a problem in the future, so I am planning to look into the solution suggested by Anthon.
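A hedged sketch of that workflow, with placeholder file and directory names (the channel-building step itself follows the Esri article linked above and is not spelled out here):

# on the online machine
conda env export > file.yaml                       # spec of the working environment
# ...build the local conda channel from the downloaded packages (see the linked article)...
pip download -r pip-reqs.txt -d ./pip-packages     # fetch the pip-installed packages
# on the offline machine
conda create -n myenv --offline -c file:///path/to/local-channel --file conda-reqs.txt
pip install --no-index --find-links ./pip-packages -r pip-reqs.txt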

Is there a durable store for deb files (like a maven repo?)

I have a Maven-built Docker image that was dependent on libssl1.0.2_1.0.2d-3_amd64.deb, but this now returns a 404 and has been replaced by libssl1.0.2_1.0.2e-1_amd64.deb.
This is a problem because Maven builds are meant to be durable, i.e. you can rebuild them at any point in the future. The main Maven repo is durable, so artefacts taken from that will be there in the future. I could move the debs I need into the Maven repo, but that is a bit of an abuse of other people's storage...
So is there a durable store of Debian files that is guaranteed to exist... well, at least until the revolution/meteor strike/Jurassic resurrection etc.
You can do this yourself with free, open-source tools: you can create your own APT repository for storing Debian packages. If you are interested in using a GPG signature to sign the repository metadata, read this.
Once you've created the repository, you can create a configuration file in /etc/apt/sources.list.d/ pointing to your repository. You can run apt-get update to refresh your system's apt cache, and then run apt-get install to install the package of your choice.
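For example, that configuration file might contain a single line like the following (the hostname, suite, and component are placeholders; [trusted=yes] is only needed if you skip GPG signing):

# /etc/apt/sources.list.d/myrepo.list
deb [trusted=yes] https://apt.example.com/debian stable main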
BTW, you can install a particular version of a package by running: apt-get install packagename=version.
For example, to install version 1.0 of "test", run: apt-get install test=1.0.
Alternatively, if you don't want to deal with any of this yourself, you can just use packagecloud.io for hosting Debian, RPM, RubyGem, and Python PyPI repositories. A lot of people who use our service have your exact use case, where Debian packages they depend on disappear from public repositories, so they upload them to us.
