Creating Anaconda Environment from YML File Choking on Common Packages - os, pip, pandas - pip

Why is Anaconda choking on common packages when creating an environment from a YML file? Anaconda COMES with these packages pre-installed in root (or so I thought?)
YML file:
name: rasterenv
channels:
- conda-forge
dependencies:
- gdal>=2.2.3
- rasterio
- cython
- jupyter
- matplotlib
- numpy
- pyproj
- shapely
- rasterio
- pandas
- geopandas
- os
- matplotlib
- seaborn
- fiona
- OSMnx
- pip:
  - pygeotools
  - pygeoprocessing
Trying to build file with: conda env create -f path/to/file
If I create an environment with JUST uncommon packages like rasterio, it appears to work. BUT, I want an environment with all of them! What gives here?
Error is:
- os
If I remove os from the list, the error then becomes:
- matplotlib

As @sinoroc pointed out in the comments, os is part of the Python standard library and should not be listed as a dependency. (When you do list it as a dependency, the installer looks for a package called os in all available repositories [PyPI or, in this case, the conda channels] and won't find it.)
You can see which packages are part of the standard library by checking the docs here:
(Also there have been a few questions on SO on how to find out if a particular package is part of the std lib, e.g. How to check if a module/library/package is part of the python standard library?) When you create a new environment the packages from the std lib are the only ones which are available by default. Anything else needs to be installed.
Additionally there are two packages in your yaml file that are listed twice (rasterio and matplotlib) which makes me think that you manually created that file. You can generate a conda environment file by activating an environment and running conda env export > environment.yml which will create a file called environment.yml with all required dependencies.
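As a rough illustration of both points (standard-library names and duplicate entries), here is a minimal sketch that cleans a dependency list before it goes into a YAML file. The function name clean_dependencies is hypothetical, and the fallback set used on interpreters older than Python 3.10 is an assumption, not a complete list.

```python
import sys

# sys.stdlib_module_names exists on Python 3.10+. The fallback set below is a
# hand-picked assumption for older interpreters, not a complete list.
STDLIB = getattr(sys, "stdlib_module_names", frozenset({"os", "sys", "re", "json", "math"}))


def clean_dependencies(deps):
    """Drop standard-library names and repeated packages, keeping the original order."""
    seen = set()
    cleaned = []
    for dep in deps:
        # Strip a version constraint such as "gdal>=2.2.3" to get the bare name.
        name = dep
        for sep in (">=", "<=", "==", "=", ">", "<"):
            name = name.split(sep)[0]
        name = name.strip()
        if name in STDLIB or name.lower() in seen:
            continue
        seen.add(name.lower())
        cleaned.append(dep)
    return cleaned


deps = ["gdal>=2.2.3", "rasterio", "matplotlib", "rasterio", "pandas", "os", "matplotlib"]
print(clean_dependencies(deps))
```

With the list above, os and the second rasterio and matplotlib entries are dropped, and everything else is kept in order.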


Specifying --use-deprecated=legacy-resolver in conda YAML file

I'm creating an environment in an Azure DevOps pipeline from a .yml file. However, one of my modules has dependency issues, causing conda env create -n env-name --file conda.yml to get stuck. I know that I need to use --use-deprecated=legacy-resolver but since I'm creating the environment from a YAML file I don't know how to specify it in my YAML file (rather than directly running pip install).
channels:
- conda-forge
- nodefaults
dependencies:
- python=3.9.12
- pip>=19.0
- pip:
  - numpy==1.22.0
  - pandavro
  - scikit-learn
  - ipykernel
  - pyspark
  - mlflow
  - mltable
I've tried adding [--use-deprecated=legacy-resolver] after one of my modules (e.g. pandavro [--use-deprecated=legacy-resolver]) but it seems like Conda doesn't recognize this syntax.
It seems like this feature hasn't yet been implemented, based on these posts:
The workaround that worked for me was removing all pip dependencies from my conda.yml file (i.e. the - pip: line and every line nested under it) and putting them in a separate requirements.txt file instead. I then added another step to my ADO pipeline to run pip install -r requirements.txt --use-deprecated=legacy-resolver.
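A minimal sketch of that split, assuming the environment file has already been parsed into a dict (for example by a YAML parser). The function name split_pip_dependencies is hypothetical, and the dict below is a shortened copy of the question's file.

```python
def split_pip_dependencies(env):
    """Return (env_without_pip_section, pip_requirements) from a parsed conda env dict."""
    conda_deps = []
    pip_reqs = []
    for dep in env.get("dependencies", []):
        # The pip section is the only dict-valued entry in a conda dependency list.
        if isinstance(dep, dict) and "pip" in dep:
            pip_reqs.extend(dep["pip"])
        else:
            conda_deps.append(dep)
    # Copy the env so the caller's dict is left untouched.
    stripped = dict(env, dependencies=conda_deps)
    return stripped, pip_reqs


env = {
    "channels": ["conda-forge", "nodefaults"],
    "dependencies": [
        "python=3.9.12",
        "pip>=19.0",
        {"pip": ["numpy==1.22.0", "pandavro", "scikit-learn"]},
    ],
}
conda_env, requirements = split_pip_dependencies(env)
requirements_txt = "\n".join(requirements) + "\n"  # contents for requirements.txt
print(conda_env["dependencies"])
print(requirements_txt)
```

The conda step then only sees the conda packages, and the resolver flag can be passed on the separate pip install -r requirements.txt step.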

Azure ML not able to create conda environment (exit code: -15)

When I try to run the experiment defined in this notebook, I encounter an error while it is creating the conda env. The error occurs when the cell below is executed:
from azureml.core import Experiment, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.widgets import RunDetails
# Create a Python environment for the experiment
sklearn_env = Environment("sklearn-env")
# Ensure the required packages are installed (we need scikit-learn, Azure ML defaults, and Azure ML dataprep)
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
sklearn_env.python.conda_dependencies = packages
# Get the training dataset
diabetes_ds = ws.datasets.get("diabetes dataset")
# Create a script config
script_config = ScriptRunConfig(source_directory=experiment_folder,
arguments = ['--regularization', 0.1, # Regularizaton rate parameter
'--input-data', diabetes_ds.as_named_input('training_data')], # Reference to dataset
# submit the experiment
experiment_name = 'mslearn-train-diabetes'
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=script_config)
Every time I run this, creating the conda env fails as shown below:
Creating conda environment...
Running: ['conda', 'env', 'create', '-p', '/home/azureuser/.azureml/envs/azureml_000000000000', '-f', 'azureml-environment-setup/mutated_conda_dependencies.yml']
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Installing pip dependencies: ...working...
Attempting to clean up partially built conda environment: /home/azureuser/.azureml/envs/azureml_000000000000
Remove all packages in environment /home/azureuser/.azureml/envs/azureml_000000000000:
Creating conda environment failed with exit code: -15
I could not find anything useful on the internet, and this is not the only script where it fails; I have sometimes hit the same issue when running other experiments. One workaround that worked in the above case was moving pandas from pip to conda, after which it was able to create the conda env. Example below:
# Ensure the required packages are installed (we need scikit-learn, Azure ML defaults, and Azure ML dataprep)
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
# Ensure the required packages are installed (we need scikit-learn, Azure ML defaults, and Azure ML dataprep)
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip','pandas'],
The error message (or the logs from Azure) is also not much help. I would appreciate it if a proper solution is available.
Edit: I have only recently started learning to use Azure for machine learning, so I am not sure whether I am missing something. I assumed the example notebooks should work as is, hence this question.
short answer
Totally been in your shoes before. This code sample seems a smidge out of date. Using this notebook as a reference, can you try the following?
packages = CondaDependencies.create(
longer answer
Using pip with Conda is not always smooth sailing. In this instance, conda isn't reporting up the issue that pip is having. The solution is to create and test this environment locally, where we can get more information, which will at least give you a more informative error message.
Install anaconda or miniconda (or use an Azure ML Compute Instance which has conda pre-installed)
Make a file called environment.yml that looks like this
name: aml_env
dependencies:
- python=3.8
- pip=21.0.1
- pip:
  - azureml-defaults
  - azureml-dataprep[pandas]
  - scikit-learn==0.24.1
Create this environment with the command conda env create -f environment.yml.
Respond to any error message you discover.
If there's no error, use this new environment.yml with Azure ML like so
sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './environment.yml')
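To make the "create and test locally" step a bit more concrete, here is a minimal sketch of running a command and keeping everything it prints, so that pip's output is not lost. The helper name run_and_capture is hypothetical, and the command actually executed below is a stand-in so the sketch runs on a machine without conda.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return (exit_code, combined stdout and stderr)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so pip errors are not dropped
        text=True,
    )
    return result.returncode, result.stdout


# Stand-in command so the sketch runs without conda installed.
code, output = run_and_capture([sys.executable, "-c", "print('hello')"])
print(code, output.strip())
```

On a machine with conda, the same helper could be called as run_and_capture(["conda", "env", "create", "-f", "environment.yml"]) and the returned text inspected for the underlying pip error.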
more context
My guess is that the error happens when you reference a pip requirements file from a conda environment file. In this scenario, conda calls pip install -r requirements.txt, and if that command errors out, conda can't report the error.
name: aml_env
dependencies:
- python=3.8
- pip=21.0.1
- pip:
  - -r requirements.txt
What worked for me looking at the previous notebook 05 - Train Models.ipynb:
packages = CondaDependencies.create(conda_packages=['pip', 'scikit-learn'],
You have to:
Remove 'azureml-dataprep[pandas]' from pip_packages
Change the order of conda_packages - pip should go first

Conda does not set up properly path for JDK for pyjnius

I have installed pyjnius with conda. However, when I try to import pyjnius it fails
> from jnius import autoclass
line 12, in <module>
from .jnius import * # noqa
ImportError: DLL load failed: The specified module could not be found.
Together with pyjnius, conda also installs openjdk. pyjnius then looks for jvm.dll in the PATH directories. The DLL can be found in
but conda does not include it in PATH. It adds another folder in PATH:
while this directory is missing: JRE has not been installed, only JDK. I can, obviously, include first directory in my PATH, however, this would bypass conda virtual environments concept. How can I solve this problem in an elegant way?
Here's environment.yml to reproduce the problem:
name: example-env
channels:
- conda-forge
dependencies:
- python=3.7
- Cython
- pyjnius
Next, I create and activate as follows:
conda env update --file environment.yml
conda activate example-env
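One possible way to keep the fix inside the environment is to adjust PATH only for the running process, before jnius is imported. This is a sketch under assumptions rather than a confirmed fix: the helper names are hypothetical, and it assumes jvm.dll lives somewhere under the active environment's prefix (sys.prefix).

```python
import os
import sys
from pathlib import Path


def find_library_dir(prefix, filename="jvm.dll"):
    """Return the first directory under prefix that contains filename, or None."""
    for path in sorted(Path(prefix).rglob(filename)):
        return path.parent
    return None


def prepend_to_path(directory):
    """Prepend directory to PATH for the current process only."""
    os.environ["PATH"] = str(directory) + os.pathsep + os.environ.get("PATH", "")


# sys.prefix points at the active conda environment when its python is running.
jvm_dir = find_library_dir(sys.prefix)
if jvm_dir is not None:
    prepend_to_path(jvm_dir)
    # from jnius import autoclass  # import only after PATH has been adjusted
```

Because the change is made through os.environ, it affects only this Python process and its children, not the system-wide PATH.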

conda: proceed in two steps to avoid conflicts

I wrote a conda environment file in order to gather the minimum set of packages needed for setting up my environment. Say that my file is made of packages A, B, C and D as deps. When creating the environment through:
conda env create -f environment.yml
I get that D is conflicting, without any additional information (conflicting with A, B or C? Which is the underlying conflicting library?). In order to solve the problem, I had to proceed in two steps: 1- create the environment using a modified environment file which contains just the A, B and C packages; 2- install D separately through a conda install command. It works.
Is that normal, or at least not unusual, behavior that I should live with? Or is it a sign of an unstable environment which may lead to trouble in the future?
here is my current environment file. The conflicting package is the last commented one.
name: jupyterhub
channels:
- anaconda
- conda-forge
- r
dependencies:
- git
- python
- numpy
- matplotlib
- h5py
- scipy
- pandas
- scikit-learn
- sympy
- notebook
- jupyterlab
- jupyterhub
- oauthenticator
- configurable-http-proxy
- gfortran_linux-64
- openmpi
- eigen
- boost
- xeus-cling
- cmake
- pip
- libiconv
- r-essentials
- r-base
# - mantid/label/nightly::mantid-framework
You're installing a lot of packages, and none of them with a version number. This is unstable by definition. Every time you install from that environment file, you can get a different version of any of those packages, and each new version might change its prerequisites and their versions.
With that environment file, you cannot even predict which versions of Python and R will be installed.
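As a small illustration of the point about missing version numbers, the sketch below lists which entries of a dependency list carry no version constraint. The function name unpinned is hypothetical, and the pinned versions in the sample list are made up for the example.

```python
import re


def unpinned(deps):
    """Return the string dependencies that carry no version constraint."""
    # A conda spec is treated as pinned if it contains =, < or > anywhere.
    return [d for d in deps if isinstance(d, str) and not re.search(r"[=<>]", d)]


# Sample list: the two versions shown are invented for illustration.
deps = ["python", "numpy", "pandas=1.5.3", "r-base>=4.1"]
print(unpinned(deps))
```

Run against the environment file in the question, every entry would be reported, which is the instability described above.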

Installing older version of h2o in conda virtual environment on Windows

I'm struggling to figure out conda virtual environments on Windows. All I want is to be able to have different versions of h2o installed at the same time, because of their insane decision not to let you load files saved in even the most minor different version.
I created a virtual environment by cloning my base anaconda:
conda create -n h203_14_0_7 --clone base
I then activated the virtual environment like so:
C:\ProgramData\Anaconda3\Scripts\activate h203_14_0_7
Now that I'm in the virtual environment (I see the (h203_14_0_7) at the beginning of the prompt), I want to uninstall the version of h2o in this virtual environment, so I tried:
pip uninstall h2o
But this output
which to me looks like it's going to uninstall the global h2o rather than the virtual environment's h2o. So I think it's using the global pip instead of the pip it should have cloned off the base. So how do I use the virtual environment's pip to uninstall h2o just for my virtual environment, and how can I be sure that it's doing the right thing?
I then ran
conda install pip
and it seems that after that I was able to use pip to uninstall h2o only from the virtual environment (I hope). I then downloaded the older h2o version from here:
but when I try install it I get
(h203_14_0_7) C:\ProgramData\Anaconda3\envs\h203_14_0_7>pip install C:\Users\dan25\Downloads\h2o-3-jenkins-rel-weierstrass-7.tar.gz
Processing c:\users\dan25\downloads\h2o-3-jenkins-rel-weierstrass-7.tar.gz
Complete output from command python egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\h203_14_0_7\lib\", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\dan25\\AppData\\Local\\Temp\\pip-sf7r_6pm-build\\'
So what now?
I had trouble getting that approach to ever work. It felt like some kind of global dependency was in there, somewhere.
So, I personally just uninstall, and install the desired version, as I need to move between versions. (Actually, I am more likely to use a different VirtualBox or AWS image for each.)
However I noticed searching for conda on the H2O jira that there is a lot of activity recently. They might all be pointing out the same bug you have found, but if so it sounds like it is something getting enough attention to get fixed.
Aside: finding old versions (and your edit showing install problems)
To find, e.g., 3.14.0.7, google it with "h2o". The top hit is
The "rel-weierstrass" represents 3.14.0, and the 7 is in the URL. (I've yet to see a full list of all the rel-XXX names, but google will always find at least one in the series, even if it won't find the exact minor version.)
Download the zip file you find there. Inside you will find both an R package, and a whl package for Python. So unzip it, extract the one you want, then pip install it.
These zip files are always on S3 (AFAIK). The link you showed was a source snapshot, on github.
Install requirements:
pip install requests tabulate numpy scikit-learn
Extract the archive:
zcat h2o-3-jenkins-rel-weierstrass-7.tar.gz | tar xvf -
cd into Python directory and build:
cd h2o-py
../gradlew build
I have this working now. I think the trick is to make sure you do NOT have h2o installed on your base python. I did the following:
pip uninstall h2o
conda create --name h2o-base pip
conda activate h2o-base
conda install numpy
conda install pandas
conda install requests
conda install tabulate
conda install colorama
conda install future
conda install jupyter
python -m pip install ipykernel
conda deactivate
And now, to install specific versions of h2o, you need the URL of the .whl file for that version. You can find a list of the URLs of all the old versions here:
So, for example, to install version 3.18.0.8:
conda create --name h2o-3-18-0-8 --clone h2o-base
conda activate h2o-3-18-0-8
pip install
python -m ipykernel install --user --name h2o-3-18-0-8 --display-name "Python (h2o-3-18-0-8)"
or version 3.20.0.2 (make sure to conda deactivate first):
conda create --name h2o-3-20-0-2 --clone h2o-base
conda activate h2o-3-20-0-2
pip install
python -m ipykernel install --user --name h2o-3-20-0-2 --display-name "Python (h2o-3-20-0-2)"
This set-up allows me to have multiple versions of h2o installed on the same computer and if I have to use serialized models I just have to run python from the virtual environment with the correct version of h2o installed. I think this is preferable to uninstalling and reinstalling h2o each time.
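The per-version steps above follow one pattern, so they can be generated rather than retyped. This is a minimal sketch, assuming conda run is available in your conda version; the function name env_commands is hypothetical, and the wheel URL is a placeholder that must be replaced with the real .whl URL for the version you want.

```python
def env_commands(version, wheel_url, base_env="h2o-base"):
    """Build the commands that clone the base env and install one h2o version into it."""
    env_name = "h2o-" + version.replace(".", "-")
    return [
        # Clone the shared base environment under a version-specific name.
        ["conda", "create", "--name", env_name, "--clone", base_env],
        # Install the wheel with the clone's own pip, without activating it.
        ["conda", "run", "--name", env_name, "pip", "install", wheel_url],
        # Register a Jupyter kernel for the new environment.
        ["conda", "run", "--name", env_name, "python", "-m", "ipykernel", "install",
         "--user", "--name", env_name, "--display-name", f"Python ({env_name})"],
    ]


# Placeholder URL: substitute the actual wheel URL for this version.
commands = env_commands("3.18.0.8", "https://example.invalid/h2o.whl")
for cmd in commands:
    print(" ".join(cmd))
```

Each inner list can be passed to subprocess.run as is; the sketch only prints the commands so nothing is installed by accident.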
Here is the environments.yml file if you want to skip all the manual installs above:
name: h2o-base
channels:
- conda-forge
- defaults
dependencies:
- asn1crypto=0.24.0=py37_1003
- backcall=0.1.0=py_0
- bleach=3.0.2=py_0
- ca-certificates=2018.10.15=ha4d7672_0
- certifi=2018.10.15=py37_1000
- cffi=1.11.5=py37hfa6e2cd_1001
- chardet=3.0.4=py37_1003
- colorama=0.4.0=py_0
- cryptography=2.3=py37h74b6da3_0
- cryptography-vectors=2.3.1=py37_1000
- decorator=4.3.0=py_0
- entrypoints=0.2.3=py37_1002
- future=0.16.0=py37_1002
- icu=58.2=vc14_0
- idna=2.7=py37_1002
- ipykernel=5.1.0=pyh24bf2e0_0
- ipython=7.0.1=py37h39e3cac_1000
- ipython_genutils=0.2.0=py_1
- ipywidgets=7.4.2=py_0
- jedi=0.13.1=py37_1000
- jinja2=2.10=py_1
- jpeg=9b=vc14_2
- jsonschema=2.6.0=py37_1002
- jupyter=1.0.0=py_1
- jupyter_client=5.2.3=py_1
- jupyter_console=6.0.0=py_0
- jupyter_core=4.4.0=py_0
- libflang=5.0.0=vc14_20180208
- libpng=1.6.34=vc14_0
- libsodium=1.0.16=vc14_0
- llvm-meta=5.0.0=0
- markupsafe=1.0=py37hfa6e2cd_1001
- mistune=0.8.4=py37hfa6e2cd_1000
- nbconvert=5.3.1=py_1
- nbformat=4.4.0=py_1
- notebook=5.7.0=py37_1000
- openblas=0.2.20=vc14_8
- openmp=5.0.0=vc14_1
- openssl=1.0.2p=hfa6e2cd_1001
- pandas=0.23.4=py37h830ac7b_1000
- pandoc=2.3.1=0
- pandocfilters=1.4.2=py_1
- parso=0.3.1=py_0
- pickleshare=0.7.5=py37_1000
- pip=18.1=py37_1000
- prometheus_client=0.4.2=py_0
- prompt_toolkit=2.0.6=py_0
- pycparser=2.19=py_0
- pygments=2.2.0=py_1
- pyopenssl=18.0.0=py37_1000
- pyqt=5.6.0=py37h764d66f_7
- pysocks=1.6.8=py37_1002
- python=3.7.0=hc182675_1005
- python-dateutil=2.7.3=py_0
- pytz=2018.5=py_0
- pywinpty=0.5.4=py37_1002
- pyzmq=17.1.2=py37hf576995_1001
- qt=5.6.2=vc14_1
- qtconsole=4.4.2=py_1
- requests=2.19.1=py37_1001
- send2trash=1.5.0=py_0
- setuptools=40.4.3=py37_0
- simplegeneric=0.8.1=py_1
- sip=4.18.1=py37h6538335_0
- six=1.11.0=py37_1001
- tabulate=0.8.2=py_0
- terminado=0.8.1=py37_1001
- testpath=0.4.2=py37_1000
- tornado=5.1.1=py37hfa6e2cd_1000
- traitlets=4.3.2=py37_1000
- urllib3=1.23=py37_1001
- vc=14=0
- vs2015_runtime=14.0.25420=0
- wcwidth=0.1.7=py_1
- webencodings=0.5.1=py_1
- wheel=0.32.1=py37_0
- widgetsnbextension=3.4.2=py37_1000
- win_inet_pton=1.0.1=py37_1002
- wincertstore=0.2=py37_1002
- winpty=0.4.3=4
- zeromq=4.2.5=vc14_2
- zlib=1.2.11=vc14_0
- blas=1.0=mkl
- icc_rt=2017.0.4=h97af966_0
- intel-openmp=2019.0=118
- m2w64-gcc-libgfortran=5.3.0=6
- m2w64-gcc-libs=5.3.0=7
- m2w64-gcc-libs-core=5.3.0=7
- m2w64-gmp=6.1.0=2
- m2w64-libwinpthread-git=
- mkl=2019.0=118
- mkl_fft=1.0.6=py37hdbbee80_0
- mkl_random=1.0.1=py37h77b88f5_1
- msys2-conda-epoch=20160418=1
- numpy=1.15.2=py37ha559c80_0
- numpy-base=1.15.2=py37h8128ebf_0
