How to make pip do as well as conda in solving for the Python version during environment creation and update

One of the reasons I prefer conda is that, given a list of packages in a create, some of which depend on python:
mamba env create -f environment.yml
it will solve for and install the latest version of Python that permits version alignment of all other packages. Moreover, one can at any time update the version of Python, if it becomes feasible to do so, by simply performing an update:
mamba env update --file environment.yml --prune
Of course, this strategy fails when there is a package that only pip can provide via a - pip: section of environment.yml, because conda does not query PyPI during the solve and therefore does not include the packages in the - pip: section when aligning versions. Therefore, when the only advantage of conda is such solving but some packages are pip-only, one might wish to create a requirements.txt and perform operations analogous to those in the examples above, including solving for and installing the correct version of Python.
Is there a way to perform these operations using pip or other PyPI-aware environment managers, that is to say, environment managers that can perform full version alignment among all listed packages so long as they are available on PyPI?
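To make the kind of solve described above concrete, here is a minimal sketch that picks the newest Python minor version accepted by a set of Requires-Python-style constraints. It is not something pip itself does (pip installs into an existing interpreter and cannot install Python). The package names, constraints, and candidate versions are hypothetical, and only the simple >=X.Y and <X.Y forms are handled; a real tool would need the full PEP 440 specifier rules.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, int]


def parse_version(text: str) -> Version:
    """Turn '3.10' into (3, 10); only major.minor is considered."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def satisfies(version: Version, spec: str) -> bool:
    """Check a version against comma-separated '>=X.Y' / '<X.Y' clauses."""
    for clause in spec.split(","):
        clause = clause.strip()
        if not clause:
            continue
        if clause.startswith(">="):
            if version < parse_version(clause[2:]):
                return False
        elif clause.startswith("<") and not clause.startswith("<="):
            if version >= parse_version(clause[1:]):
                return False
        else:
            # Anything else (==, ~=, !=, <=, ...) is outside this sketch.
            raise ValueError(f"unsupported clause in this sketch: {clause!r}")
    return True


def newest_compatible_python(
    requires_python: Dict[str, str], candidates: List[str]
) -> Optional[str]:
    """Return the newest candidate accepted by every package, or None."""
    for text in sorted(candidates, key=parse_version, reverse=True):
        version = parse_version(text)
        if all(satisfies(version, spec) for spec in requires_python.values()):
            return text
    return None


# Hypothetical constraints, for illustration only.
constraints = {
    "package-a": ">=3.8",
    "package-b": ">=3.7,<3.12",
    "package-c": ">=3.9,<3.11",
}
print(newest_compatible_python(constraints, ["3.8", "3.9", "3.10", "3.11", "3.12"]))
```

Versions are compared as integer tuples rather than strings so that 3.10 sorts after 3.9.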

Related

How to create a conda environment file without local development packages?

I have a conda environment with packages installed via conda install. I also have two local development packages that were each installed with pip install -e .. Now, conda env export shows everything, including both local development packages. However, I don't want conda to include them when creating the same environment on other machines - I want to keep doing it via pip install -e ..
So how can I exclude both local packages when creating the environment.yml file? Do I need to manually remove them or is there a way to do this from the command line?
While there are some alternative flags for conda env export that change output behavior (e.g., --from-history is most notable), there really isn't anything as specific as OP describes. Instead, manually remove the offending lines.
Note that YAMLs do support all pip install commands, so the editable installs can also be included. For example, https://stackoverflow.com/a/59864744/570918.
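If removing the offending lines by hand becomes tedious, the edit can be scripted. The following is only a sketch: it treats the exported YAML as plain text and drops any dependency line whose package name is in a given list. The function name drop_packages, the package name my-local-pkg, and the sample export are all hypothetical, and the exact form in which conda env export records an editable install may differ from what is shown here.

```python
import re
from typing import Iterable, List


def drop_packages(yaml_text: str, names: Iterable[str]) -> str:
    """Remove list entries whose leading package name is in `names`."""
    unwanted = {name.lower() for name in names}
    kept: List[str] = []
    for line in yaml_text.splitlines():
        # Entries look like "  - name=1.0=build" (conda) or "    - name==1.0" (pip).
        match = re.match(r"^\s*-\s+([A-Za-z0-9_.\-]+)", line)
        if match and match.group(1).lower() in unwanted:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical export, for illustration only.
exported = """\
name: demo
channels:
  - defaults
dependencies:
  - python=3.9.7
  - numpy=1.21.2
  - pip:
    - my-local-pkg==0.1.0
    - requests==2.26.0
"""
print(drop_packages(exported, ["my-local-pkg"]))
```

Because this is a line filter rather than a YAML parser, it leaves everything else in the file byte-for-byte unchanged.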
Consider Prioritizing the YAML Specification
In a software engineering setting, I would expect that users should not even be hitting development environments with conda install or pip install commands. Instead, the team should have a manually written, version-controlled YAML to begin with, and all installations/changes to the environment should be managed by editing the YAML file and using conda env update to propagate those changes to the environment.
That is, conda env export should not be necessary because the environment already has a well-defined means of creation.

conda create environment not responding

I want to install python 2.7 as a conda environment.
conda create -n python2 python=2.7 anaconda
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment:
it's been running for the last 12 hours.
If all that is actually needed is a Python 2.7 environment (not the full Anaconda distribution), then see @jakub's answer. However, Conda is perfectly capable of creating an Anaconda distribution environment with Python 2.7, and it should not take 12+ hours to solve.
Why so long? Channels!
The extremely long solve is almost certainly aggravated by your channel priorities. An "Anaconda" distribution should source most - if not all - of its software from the anaconda channel (part of defaults channel). However, most users eventually add conda-forge into their global channels and give it higher or equal priority (e.g., channel_priority: flexible). When this is the case, Conda will spend a bunch of time trying to satisfy the packages specified within the anaconda metapackage with the latest versions from conda-forge, and that's what tends to bog things down.
Option 1: Avoid Mixing Anaconda and Conda Forge
If you want a faster Anaconda install, then install only from Anaconda
conda create -n anaconda27 --override-channels -c defaults python=2.7 anaconda
Everything in the anaconda metapackage was originally intended to be sourced from the anaconda channel, so this shouldn't be so unreasonable.
Note that if you have conda-forge prioritized globally, this will be an issue every time you install in this environment (so remember to override channels).
Option 2: Mamba
Another option is Mamba. It's a faster (compiled) drop-in alternative to the conda CLI functionality. It seems both to solve faster and to be less prone to mutating unrelated packages when requesting changes, but that's just my anecdotal experience.
# install it in your *base* env (only need this once)
conda install -n base conda-forge::mamba
# use it like you would `conda`
mamba create -n python2 python=2.7 anaconda
The anaconda package is a metapackage, meaning it tells conda to install other packages. It will install hundreds of packages, and it turns out this can stress conda. One typically does not need all of the packages in the anaconda metapackage -- it is often better to install only the packages one requires.
Try to create an environment without anaconda and instead specify only the packages you need.
conda create -n python2 python=2.7

What is the right way to update Anaconda and Conda base & environments?

Just wondering what the right way is to update an Anaconda and Conda installation and its virtual environments. Here is my confusion, step by step:
When I run the command conda update anaconda, it updates/downgrades a lot of packages.
Then I ran conda update conda, which again updates/downgrades some packages.
Next, I ran conda update --update-all, and it starts downgrading/upgrading different packages.
Lastly, just to make sure that everything's updated, I ran conda update anaconda again. I was expecting a message like Everything's up to date but to my surprise it was again showing a huge list of packages that needed to be updated/downgraded again?
What am I doing wrong here? It appears to me as if I am going in circles with these commands. Any help?
You're not doing anything wrong per se, but it just doesn't make much sense to ever run conda update anaconda and conda update --all right after each other on the same env - they represent two completely different configurations.
Update Anaconda
Anaconda is a Python distribution that bundles together a ton of packages. Presumably, a bunch of testing goes into verifying that all the package versions and builds are compatible with each other. Because this takes time to do, the Anaconda team only releases new distributions (i.e., a new anaconda version) every couple months or so. If you want a stable set of packages that have been tested for interoperability, then do conda update anaconda.
Update All
In between Anaconda releases, new versions of many packages are still released on the Anaconda channel, and if you run conda update --all you're going to inevitably get ahead of the versions specified in the anaconda bundle. If you want the newest individual package releases and don't mind potentially working with package builds that aren't thoroughly tested for integration, then run conda update --all.
It may be worth noting that people who prioritize having access to the latest versions of packages often seem to prefer Conda Forge, because it tends to have more frequent package releases. However, in my opinion, there's almost no point to installing Anaconda if you're going to switch most packages to Conda Forge anyway. Instead, just install Miniconda and only install what you want from Conda Forge at the start.
Update None
Personally, I will rarely run conda update on an env once I've hardened the requirements for a project. Every time you update an env, you risk breaking code that you've already written. Instead, Conda makes it quite easy to create new envs, and if they have a lot of overlap with other envs, then the envs can be quite light due to sharing packages across envs via hardlinking.
Update Conda
The one exception to everything above is the conda package, which is the very infrastructure you're using to manage packages and envs. That, one should update just as one would any other package manager (e.g., pip or Homebrew).
Found the answers in this useful post by Anaconda
Keeping Anaconda Up To Date
Below is a question that gets asked so often that I decided it would be helpful to publish an answer explaining the various ways in which Anaconda can be kept up to date. The question was originally asked on StackOverflow.
I have Anaconda installed on my computer and I’d like to update it. In Navigator I can see that there are several individual packages that can be updated, but also an anaconda package that sometimes has a version number and sometimes says custom. How do I proceed?
The Answer
What 95% of People Actually Want
In most cases what you want to do when you say that you want to update Anaconda is to execute the command:
conda update --all
This will update all packages in the current environment to the latest version—with the small print being that it may use an older version of some packages in order to satisfy dependency constraints (often this won’t be necessary and when it is necessary the package plan solver will do its best to minimize the impact).
This needs to be executed from the command line, and the best way to get there is from Anaconda Navigator, then the “Environments” tab, then click on the triangle beside the root environment, selecting “Open Terminal”:
This operation will only update the one selected environment (in this case, the root environment). If you have other environments you’d like to update you can repeat the process above, but first click on the environment. When it is selected there is a triangular marker on the right (see image above, step 3). Or, from the command line, you can provide the environment name (-n envname) or path (-p /path/to/env). For example, to update your dspyr environment from the screenshot above:
conda update -n dspyr --all
Update Individual Packages
If you are only interested in updating an individual package then simply click on the blue arrow or blue version number in Navigator, e.g. for astroid or astropy in the screenshot above, and this will tag those packages for an upgrade. When you are done you need to click the “Apply” button:
Or from the command line:
conda update astroid astropy
Updating Just the Packages in the Standard Anaconda Distribution
If you don’t care about package versions and just want “the latest set of all packages in the standard Anaconda Distribution, so long as they work together,” then you should take a look at this gist.
Why Updating the Anaconda Package is Almost Always a Bad Idea
In most cases, updating the Anaconda package in the package list will have a surprising result—you may actually downgrade many packages (in fact, this is likely if it indicates the version as custom). The gist above provides details.
Leverage conda Environments
Your root environment is probably not a good place to try and manage an exact set of packages—it is going to be a dynamic working space with new packages installed and packages randomly updated. If you need an exact set of packages, create a conda environment to hold them. Thanks to the conda package cache and the way file linking is used, doing this is typically fast and consumes very little additional disk space. For example:
conda create -n myspecialenv -c bioconda -c conda-forge python=3.5 pandas beautifulsoup seaborn nltk
The conda documentation has more details and examples.
pip, PyPI, and setuptools?
None of this is going to help with updating packages that have been installed from PyPI via pip, or any packages installed using python setup.py install. conda list will give you some hints about the pip-based Python packages you have in an environment, but it won’t do anything special to update them.
Commercial Use of Anaconda or Anaconda Enterprise
It’s pretty much exactly the same story, with the exception that you may not be able to update the root environment if it was installed by someone else (say, to /opt/anaconda/latest). If you’re not able to update the environments you are using, you should be able to clone and then update:
conda create -n myenv --clone root
conda update -n myenv --all
The other way in is simply,
anaconda-navigator
In the resulting GUI, the only difference with respect to this question is that where you see "Installed" there is a drop-down menu with an "Updatable" option; select it and then simply click the dependencies you want to update for any given environment.
General info
I'm sure everyone knows this, but for anyone who doesn't: Anaconda Navigator is a point-and-click GUI that is already part of Anaconda, and it is simply brilliant for managing, installing, updating and deleting all the dependencies.
With respect to the question it is great for managing all the dependencies inside new envs, creating new envs, loading new channels. It works great remotely via X11 if you have Anaconda loaded on a remote cluster/server.
The bonus for me is that I've never known it fail.
conda install conda=4.8.2
works, as it installs a specific version and the "/" spinner will not spin for long.

Best practices with pip and conda for consistency

I know there are a lot of questions on the coexistence and interchangeability/non-interchangeability of pip and conda. That is not my question: I know I need both for my work, I use both, and for the most part, my conda envs are a manageable mess.
But here's the thing: there's many ways to install pip. I happened to get conda going first, so my pip is through anaconda/bin/pip. It is the only pip on my machine. Here are my questions:
Is this sensible? Do I want my pip to be /usr/bin/pip and be independent of global conda? It feels not-sensible.
If I install a new pip through say brew or easy_install, should I start downloading packages through this new pip? Would that be awful and mess everything up?
Thanks!
Pip always requires a version of Python to be installed, and is associated with that specific Python installation. By default, pip installs packages for its own Python, into the related site-packages directory inside the Python library directory. The exact location of this directory depends on your operating system and how you installed conda.
If you install pip via Homebrew or with another installation of Python, you should not use that pip and expect it to install for conda. For that matter, if you create a new conda environment, you should not expect that the pip in that environment will install packages into another environment.
There is the --user option to pip, which installs packages into a directory in your user account (on *nix systems, this is ~/.local; I can't recall for Windows where this is). These packages will be able to be found by all Python versions with the same major and minor version number. However, it is not recommended to install packages with the intent of sharing them among several Pythons this way, because if the different Pythons were compiled with different compilers, you may run into trouble.
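To check which Python a given pip belongs to, it can help to ask the interpreter directly. The sketch below uses only the standard library; the function name describe_python is made up for this example, and the reported paths are what this interpreter would typically use (some system Pythons patch these locations). Run it with the python of the environment you care about.

```python
import site
import sys
import sysconfig
from typing import Dict


def describe_python() -> Dict[str, str]:
    """Collect the locations tied to the interpreter running this script."""
    return {
        # The interpreter itself; `python -m pip` run with it installs for it.
        "interpreter": sys.executable,
        # Root of the environment (for a conda env, the env directory).
        "prefix": sys.prefix,
        # Usual install location for pure-Python packages of this interpreter.
        "site-packages": sysconfig.get_paths()["purelib"],
        # Where `pip install --user` would place packages for this Python.
        "user site": site.getusersitepackages(),
    }


for label, path in describe_python().items():
    print(f"{label:>14}: {path}")
```

Invoking pip as python -m pip with a known interpreter avoids any ambiguity about which pip on the PATH is being used.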

Difference between pip freeze and conda list

I am using both "pip freeze" and "conda list" to list the packages installed in my environment, but what are their differences?
If the goal is only to list all the installed packages, pip list or conda list are the way to go.
pip freeze, like conda list --export, is more for generating requirements files for your environment. For example, if you have created a package in your customized environment with certain dependencies, you can do conda list --export > requirements.txt. When you are ready to distribute your package to other users, they can easily duplicate your environment and the associated dependencies with conda create --name <envname> --file requirements.txt.
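For a feel of what a requirements-style listing contains, here is a rough approximation built on the standard library's importlib.metadata. It is a sketch, not a replacement for pip freeze: the real command also handles editable installs and omits a few packaging tools by default, and this version knows nothing about conda packages that ship without Python metadata. The function name freeze_like is made up for this example.

```python
from importlib import metadata
from typing import List


def freeze_like() -> List[str]:
    """Return sorted 'name==version' strings for distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip installs with missing metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in freeze_like():
    print(line)
```

The output depends entirely on which interpreter runs the script, which is the same reason pip freeze and conda list can disagree inside one environment.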
The differences between conda and pip need a longer discussion. There are plenty of explanations on StackOverflow. This article by Jake VanderPlas is a great read as well.
You might also find this table useful. It lists operation equivalences between conda, pip and virtualenv.

Resources