Conda: Choose where packages are downloaded for each environment

I am running several different conda environments and I'd like to specify where the packages are downloaded to for each of them, rather than having all of them in my $HOME.
I have found this question which, at the time of writing, has no answers. However, my question is different: I don't want to set pkgs_dirs in my .condarc, because I want a different download dir for each project (space is not an issue).
How do I define the package directory for a specific conda env?
To note, I'm creating environments with conda env create -f my_env.yml -p complete-env.

A fundamental concept of conda is that packages get downloaded and extracted to a shared cache, from which they are selectively linked into the different conda environments. You want to work against this concept, so whatever you do will be hacky and have repercussions.
You could install a separate Miniconda for each of your projects, and (try to) make sure that they don't know about each other by removing all conda-related files and environment settings from your home directory, or even use a different HOME for each project. Before working with a project, you'd have to put the correct conda on the PATH.
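A rough sketch of that approach (the installer file name and project paths are just placeholders):
bash Miniconda3-latest-Linux-x86_64.sh -b -p /data/projectA/miniconda3
bash Miniconda3-latest-Linux-x86_64.sh -b -p /data/projectB/miniconda3
# before working on project A, put its conda first on the PATH
export PATH=/data/projectA/miniconda3/bin:$PATH
Each installation then keeps its own package cache under its prefix, e.g. /data/projectA/miniconda3/pkgs.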
Or you could install Miniconda on a dedicated drive apart from your home directory, and put the conda environments inside your home directory. That would prevent conda from hard-linking the files. It would still download the packages into the shared cache, but then copy only the relevant files into each of your projects. Of course, copying is slower than hard-linking.
Specifying the package directory per environment rather than per conda installation is not possible, as darthbith has already pointed out in a comment.

Why conda doesn't remove packages for removed environment?

I am not an expert in informatics. I deleted an environment that had many packages, one of them being psi4, using the command:
conda remove --name myenv --all
However, in the folder:
~/anaconda3/pkgs
there are still some folders like:
psi4-1.3.2+ecbda83-py37h06ff01c_1, psi4-rt-1.3.2-py37h6cf1279_1
The same happened for other packages that I manually identified, so I assume it happens for the rest of the packages that belonged to this environment. The problem is that these files take up disk space and I don't know how many packages are in this situation, or which ones.
Is there some way to delete all these non-used folders in order to free space?
Thanks in advance.
The command you used just removes the environment (or an installed package), not the downloaded binary files. You can clean those up using:
conda clean -a
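If you'd rather see what would be deleted before removing anything, conda clean also accepts a dry-run flag:
conda clean -a --dry-run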

Can I remove the anaconda while leaving the conda (Ubuntu)? [duplicate]

I had installed Anaconda on my system before I knew the difference between Anaconda and Miniconda. I would like to downsize to Miniconda since I don't want the bloat of Anaconda, but I already have a handful of environments set up.
So far the only way I can think of to migrate is to completely get rid of everything right now, install Miniconda, and then recreate my environments by hand, but that seems quite arduous. Is there a smarter way of going about this?
I agree with @darthbith: Export the envs to YAML files (conda env export), then recreate them once you have Miniconda installed (conda env create).
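For each env, that would look something like this (the env name is a placeholder):
conda env export -n myenv > myenv.yml
# after installing Miniconda
conda env create -f myenv.yml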
While there are some experimental tools for packaging and moving envs (i.e., so you avoid having to redownload packages), they only work on a single-env basis, so I can't see how one could avoid making multiple copies of many of the shared files going that route. Instead, if you let Conda handle the environment (re)creation, it will leverage hardlinks to minimize disk usage, which seems to be one of your aims.
It may be possible to avoid redownloading packages during the environment recreations by retaining the pkgs directory in the root of your Anaconda install, then copying its contents over into the pkgs of the Miniconda install. I would only copy folders/tarballs that don't conflict with the ones that come with Miniconda. After finishing environment recreation, then a conda clean -p would likely be in order, since Anaconda includes many packages that likely aren't getting reused.
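A rough sketch of that, assuming default install locations and GNU cp (the -n flag avoids overwriting anything Miniconda already ships):
cp -rn ~/anaconda3/pkgs/* ~/miniconda3/pkgs/
# after recreating the environments, drop whatever never got linked
conda clean -p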

deleting conda environment safely?

I'm new to anaconda and conda. I have created an identical environment in two different directories. Is it safe to just delete the env folder of the environment that I no longer need, or do I need to do something in the anaconda prompt to remove the environment thoroughly? I'm not sure whether creating an environment in a local folder leaves a trace in the registry or somewhere else on the computer that needs to be removed too.
conda remove --name myenv --all
Another option is
conda env remove --name myenv
Effectively no difference from the accepted answer, but personally I prefer to use conda env commands when operating on whole envs, and reserve conda remove for managing individual packages.
The difference between these and deleting the folder manually is that Conda provides action hooks for when packages are removed, and so allows packages to execute pre-unlink and post-unlink scripts.
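If the environment was created at a path (with -p) rather than under a name, the same commands accept a prefix instead, e.g.:
conda remove -p ./path/to/envdir --all
conda env remove -p ./path/to/envdir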

"~/miniconda3/bin" does not prepended to PATH for custom environments

I use conda 4.7.11 with auto_activate_base: false in ~/.condarc. I installed htop using conda install -c conda-forge htop and it was installed at ~/miniconda3/bin/htop. When I am in the base environment I am able to use htop, because ~/miniconda3/bin is prepended to the PATH variable. But when all environments are deactivated, only ~/miniconda3/condabin is prepended to PATH. When I am in any environment other than base, ~/miniconda3/envs/CUSTOM_ENV/bin and ~/miniconda3/condabin are prepended to PATH but not ~/miniconda3/bin, which is why I can use htop only from the base environment. So my question is: how can I use htop installed with conda from all environments, including the case when all environments are deactivated?
Please, don't suggest using package managers like apt or yum in my case (CentOS), because I have no root access to use this package manager. Thank you in advance.
Conda environments aren't nested, so what is in base is not inherited by the others. Isolation of environments is the imperative requirement, so it should make sense that the content in base env is not accessible when it isn't activated.
Option 1: Environment Stacking
However, there is an option to explicitly stack environments, which at this point literally means what you're asking for, namely keeping the previous environment's bin/ in the PATH variable. So, if you have htop installed only in base, you can retain access to it in other envs like so:
conda activate base
conda activate --stack my_env
If you decide to go this route, I think it would be prudent to be very minimal about what you install in base. Of course, you could also create a non-base env to stack on, but then it might be a bother to have to always activate this env, whereas in default installs, base auto-activates.
Starting with Conda v4.8 there will be an auto_stack configuration option:
conda config --set auto_stack true
See the documentation on environment stacking for details.
Option 2: Install by Default
If you want to have htop in every env but not outside of Conda envs, then the naive solution is to install it in every env. Conda offers a simple solution to this called Default Packages, which lives in the Conda config under the key create_default_packages. Running the following will tell Conda to always install htop when creating a new env:
conda config --add create_default_packages htop
Unfortunately that won't update any existing envs, so you'd still have to go back and do that (e.g., Install a package into all envs). There's also a --no-default-packages flag for ignoring default packages when creating new envs.
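For the existing envs, a rough bash sketch (it assumes all your envs are named, since it parses names out of conda env list):
for env in $(conda env list | awk '$1 !~ /^#/ {print $1}'); do
  conda install -n "$env" -c conda-forge htop -y
done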
Option 3: Global Installs
A Word of Caution
The following two options are not official recommendations, so caveat emptor and, if you do ever use them, be sure to report such a non-standard manipulation of $PATH when reporting problems/troubleshooting in the future.
Linking
Another option (although more manual) is to create a folder in your user directory (e.g., ~/.local/bin) that you add to $PATH in your .bashrc and create links in there to the binaries that you wish to "export" globally. I do this with a handful of programs that I wanted to use independently of Conda (e.g., emacs) even though they are installed and managed by Conda.
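For example, with default paths (the htop link is just an illustration):
mkdir -p ~/.local/bin
# in ~/.bashrc:  export PATH="$HOME/.local/bin:$PATH"
ln -s ~/miniconda3/bin/htop ~/.local/bin/htop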
Dedicated Env
If you plan to do this with a bunch of software, then it might work to dedicate an env to such global software and just add its whole ./bin dir to $PATH. Do not do this with base - Conda wants to strictly manage that itself since Conda v4.4. Furthermore, do not do this with anything Python-related: stick strictly to native (compiled) software (e.g., htop is a good example). If an additional Python of the same version ends up on your $PATH this can create a mess in library loading. I've never attempted this and prefer the manual linking because I know exactly what I'm exporting.
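A minimal sketch of the idea, with tools as a made-up env name:
conda create -n tools -c conda-forge htop
# in ~/.bashrc, after the Conda initialization block:
export PATH="$HOME/miniconda3/envs/tools/bin:$PATH"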

Conda environment from .yaml offline

I would like to create a Conda environment from a .yaml file on an offline machine (i.e. no Internet access). On an online machine this works perfectly fine:
conda env create -f environment.yaml
However, it doesn't work on an offline machine as the packages are then not found. How do I do this?
If that's not possible, is there another easy way to get my complete Conda environment to an offline machine (including both Conda and pip installed packages)?
Going through the packages one by one to install them from the .tar.bz2 files works, but it is quite cumbersome, so I would like to avoid that.
If you can use pip to install the packages, you should take a look at devpi, particularly its server. devpi can cache packages normally installed from PyPI, so only the first install actually retrieves them. You have to configure pip to retrieve the packages from the devpi server.
As you don't want to list all the packages and their dependencies by hand, you should, on a machine connected to the internet:
install the devpi server (I have that running in a Docker container)
run your installation
examine the devpi repository and gather all the .tar.bz2 and .whl files from there (you might be able to tar the whole thing)
On the non-connected machine:
install the devpi server and client
use the devpi client to upload all the packages you gathered (using devpi upload) to the devpi server
make sure you have pip configured to look at the devpi server
run pip, it will find all the packages on the local server.
devpi has a small learning curve, which is already worth traversing because of the speed-up and the ability to install private packages (i.e. not uploaded to PyPI) as normal dependencies, by just generating the package and uploading it to your local devpi server.
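To illustrate configuring pip to look at the local devpi server, something along these lines should do (host, port and index name are devpi's defaults and may differ in your setup; some-package is a placeholder):
pip config set global.index-url http://localhost:3141/root/pypi/+simple/
pip install some-package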
I guess that Anthon's solution above is pretty good but just in case anybody is interested in an easy solution that worked for me:
I first created a .yaml file specifying the environment using conda env export > file.yaml. Following the instructions on http://support.esri.com/en/technical-article/000014951, I then did the following:
automatically downloaded all the necessary installation files for the conda-installed packages and created a channel from those files, adapting the code from the link above to work with my .yaml file instead of the conda list file they used
automatically downloaded the necessary files for the pip-installed packages by looping through the pip entries in the .yaml file and running pip download on each of them
automatically created separate conda and pip requirement lists from the .yaml file
created the environment using conda create with the --offline flag, the conda requirements file and my custom channel
installed the pip requirements using pip install with the pip requirements file and the folder containing the pip installation files passed to --find-links
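To give a flavour of the final commands (folder and file names are placeholders; conda index requires conda-build, and the downloaded packages go into platform subfolders such as local-channel/linux-64/):
conda index ./local-channel
conda create -n myenv --offline -c file:///absolute/path/to/local-channel --file conda-requirements.txt
pip install --no-index --find-links ./pip-packages -r pip-requirements.txt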
That worked for me. The only problem is that you can only download binaries with pip download if you need to specify a different operating system than the one you are running, and for some packages no binaries are available. That was okay for me for now as the target machine has the same characteristics, but it might be a problem in the future, so I am planning to look into the solution suggested by Anthon.
