Specifying --use-deprecated=legacy-resolver in conda YAML file - pip

I'm creating an environment in an Azure DevOps pipeline from a .yml file. However, one of my modules has dependency issues, causing conda env create -n env-name --file conda.yml to get stuck. I know that I need to use --use-deprecated=legacy-resolver but since I'm creating the environment from a YAML file I don't know how to specify it in my YAML file (rather than directly running pip install).
channels:
- conda-forge
- nodefaults
dependencies:
- python=3.9.12
- pip>=19.0
- pip:
  - numpy==1.22.0
  - pandavro
  - scikit-learn
  - ipykernel
  - pyspark
  - mlflow
  - mltable
I've tried adding [--use-deprecated=legacy-resolver] after one of my modules (e.g. pandavro [--use-deprecated=legacy-resolver]) but it seems like Conda doesn't recognize this syntax.

It seems like this feature hasn't yet been implemented, based on these posts:
https://github.com/conda/conda/issues/3763
https://github.com/conda/conda/issues/6805
The workaround that worked for me was removing all pip dependencies from my conda.yml file (i.e. the - pip: line and everything nested under it) and putting them in a separate requirements.txt file instead. I then added another step to my ADO pipeline to run pip install -r requirements.txt --use-deprecated=legacy-resolver.
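If you would rather not split the file by hand, the separation can be scripted. This is only a rough sketch, not a conda or pip feature: it assumes the YAML is indented in the usual way (pip packages nested under the - pip: line), and the name split_pip_section is made up for illustration. It uses plain string handling so no YAML library is required.

```python
def split_pip_section(yaml_text):
    """Split conda environment YAML text into (conda_yaml, requirements).

    Assumes the pip packages are list items indented deeper than the
    "- pip:" line they belong to.
    """
    kept_lines = []
    requirements = []
    pip_indent = None  # indentation of "- pip:" while we are inside its block

    for line in yaml_text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())

        if pip_indent is not None:
            # Deeper-indented list items still belong to the pip block.
            if stripped.startswith("- ") and indent > pip_indent:
                requirements.append(stripped[2:].strip())
                continue
            pip_indent = None  # block ended; treat this line normally

        if stripped == "- pip:":
            pip_indent = indent
            continue  # drop the "- pip:" line itself

        kept_lines.append(line)

    return "\n".join(kept_lines) + "\n", "\n".join(requirements) + "\n"


EXAMPLE = """channels:
  - conda-forge
  - nodefaults
dependencies:
  - python=3.9.12
  - pip>=19.0
  - pip:
    - numpy==1.22.0
    - pandavro
"""

conda_yaml, requirements_txt = split_pip_section(EXAMPLE)
print(conda_yaml)
print(requirements_txt)
```

Writing the two strings to conda.yml and requirements.txt gives the pair of files used by the two pipeline steps described above.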

Related

Weird error in creating conda environment from yml file? (PackagesNotFoundError for the yml file itself)

I'm reinstalling Conda after a PC factory reset and trying to re-create an old conda environment from a yml file that I created with
conda env export --prefix $path_to_old_env_dir > voice_dep.yml
The resulting yml file looks ok to me, here's what it looks like:
name: voiceeda
channels:
- defaults
- conda-forge
dependencies:
- ca-certificates=2022.12.7=h5b45459_0
- libsqlite=3.40.0=hcfcfb64_0
- openssl=1.1.1s=hcfcfb64_1
- pip=22.3.1=pyhd8ed1ab_0
- python=3.9.13=h6244533_2
- setuptools=66.1.1=pyhd8ed1ab_0
- sqlite=3.40.0=hcfcfb64_0
- tzdata=2022g=h191b570_0
- ucrt=10.0.22621.0=h57928b3_0
- vc=14.3=hb6edc58_10
- vs2015_runtime=14.34.31931=h4c5c07a_10
- wheel=0.38.4=pyhd8ed1ab_0
- pip:
  - anyio==3.6.2
  - argon2-cffi==21.3.0
  - argon2-cffi-bindings==21.2.0
  - arrow==1.2.3
...
but when I try to run
conda create -n voiceeda -f voice_dep.yml
The following odd error occurs.
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- voice_dep.yml
I'd understand if it couldn't find a particular package (I can remove versions etc. if so), but why is it saying it can't find the yml file itself? I'm very confused and wondering if I missed a crucial setup step during conda installation or something. I'm on Windows 10, and installed Anaconda to a D drive (conda version 23.1.0 and Python 3.9.13).
Any help would be much appreciated, thank you!

Azure ML not able to create conda environment (exit code: -15)

When I try to run the experiment defined in this notebook, I encounter an error while the conda env is being created. The error occurs when the cell below is executed:
from azureml.core import Experiment, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.widgets import RunDetails
# Create a Python environment for the experiment
sklearn_env = Environment("sklearn-env")
# Ensure the required packages are installed (we need scikit-learn, Azure ML defaults, and Azure ML dataprep)
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
                                    pip_packages=['azureml-defaults','azureml-dataprep[pandas]'])
sklearn_env.python.conda_dependencies = packages
# Get the training dataset
diabetes_ds = ws.datasets.get("diabetes dataset")
# Create a script config
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='diabetes_training.py',
                                arguments = ['--regularization', 0.1, # Regularization rate parameter
                                             '--input-data', diabetes_ds.as_named_input('training_data')], # Reference to dataset
                                environment=sklearn_env)
# submit the experiment
experiment_name = 'mslearn-train-diabetes'
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=script_config)
RunDetails(run).show()
run.wait_for_completion()
Every time I run this, creating the conda env fails as shown below:
Creating conda environment...
Running: ['conda', 'env', 'create', '-p', '/home/azureuser/.azureml/envs/azureml_000000000000', '-f', 'azureml-environment-setup/mutated_conda_dependencies.yml']
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Installing pip dependencies: ...working...
Attempting to clean up partially built conda environment: /home/azureuser/.azureml/envs/azureml_000000000000
Remove all packages in environment /home/azureuser/.azureml/envs/azureml_000000000000:
Creating conda environment failed with exit code: -15
I could not find anything useful on the internet, and this is not the only script where it fails; I have sometimes hit the same issue when running other experiments. One change that worked in the above case was moving pandas from pip to conda, after which the conda env was created successfully. Before and after:
# Before: pandas pulled in through pip via azureml-dataprep[pandas]
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
                                    pip_packages=['azureml-defaults','azureml-dataprep[pandas]'])
# After: pandas installed by conda instead
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip','pandas'],
                                    pip_packages=['azureml-defaults','azureml-dataprep'])
The error message (or the logs from Azure) is also not much help. I would appreciate a proper solution if one is available.
Edit: I have only recently started learning to use Azure for machine learning, so I am not sure whether I am missing something. I assumed the example notebooks should work as is, hence this question.
short answer
Totally been in your shoes before. This code sample seems a smidge out of date. Using this notebook as a reference, can you try the following?
packages = CondaDependencies.create(
    pip_packages=['azureml-defaults','scikit-learn']
)
longer answer
Using pip with Conda is not always smooth sailing. In this instance, conda isn't surfacing the error that pip is hitting. The solution is to create and test this environment locally, which will at least give you a more informative error message.
Install anaconda or miniconda (or use an Azure ML Compute Instance which has conda pre-installed)
Make a file called environment.yml that looks like this
name: aml_env
dependencies:
- python=3.8
- pip=21.0.1
- pip:
  - azureml-defaults
  - azureml-dataprep[pandas]
  - scikit-learn==0.24.1
Create this environment with the command conda env create -f environment.yml.
Respond to any error message that comes up.
If there's no error, use this new environment.yml with Azure ML like so
sklearn_env = Environment.from_conda_specification(name = 'sklearn-env', file_path = './environment.yml')
more context
My guess is that the error occurs when a conda environment file references a pip requirements file. In this scenario, conda calls pip install -r requirements.txt, and if that command errors out, conda can't report the error.
requirements.txt
scikit-learn==0.24.1
azureml-dataprep[pandas]
environment.yml
name: aml_env
dependencies:
- python=3.8
- pip=21.0.1
- pip:
  - -r requirements.txt
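One way to get at the hidden error is to run the pip step yourself and print whatever it writes to stderr. The helper below is a small sketch of that idea; run_and_report is a made-up name, and the sketch assumes it is run with the Python interpreter of the environment you are debugging.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command, echo its output, and return (exit_code, stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.stdout:
        print(result.stdout, end="")
    if result.returncode != 0:
        # This is the text that gets lost when conda drives pip for you.
        print(f"Command failed with exit code {result.returncode}:", file=sys.stderr)
        print(result.stderr, end="", file=sys.stderr)
    return result.returncode, result.stderr
```

Calling run_and_report([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]) from inside the activated environment runs the same kind of pip install that conda would trigger, but with pip's own error text visible.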
What worked for me looking at the previous notebook 05 - Train Models.ipynb:
packages = CondaDependencies.create(conda_packages=['pip', 'scikit-learn'],
                                    pip_packages=['azureml-defaults'])
You have to:
Remove 'azureml-dataprep[pandas]' from pip_packages
Change the order of conda_packages - pip should go first

Changing read timed out limit in pip when installing multiple packages using anaconda

I am creating an environment from an environment.yml file via
conda env create -f environment.yml
But I get
raise ReadTimeoutError(self._pool, None, 'Read timed out.')
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
failed
CondaEnvException: Pip failed
My environment.yml has a structure like this
name: relightable-nr
channels:
- pytorch
- defaults
dependencies:
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
  - absl-py==0.8.0
  - astor==0.8.0
  - astroid==2.3.3
  - wrapt==1.11.2
  - xarray==0.13.0
prefix: /root/anaconda3/envs/envn
I read How to solve ReadTimeoutError: HTTPSConnectionPool(host='pypi.python.org', port=443) with pip? and Pip Install Timeout Issue
I changed my conda default timeout to 300, but how do I change the pip timeout in my case?
Pip will pull configuration options from a pip.conf (Unix) or pip.ini (Windows) file located in either a global, user, or environment scope, and settings such as timeout can be configured there. See the Pip User Guide section on Config Files.
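For illustration, pip's config files are INI-style, with general options under a [global] section. The sketch below only renders the text of such a file; the function name render_pip_config is made up, and where to save the result depends on your platform and the scope you want (running pip config debug lists the locations pip actually reads).

```python
import configparser
import io


def render_pip_config(timeout_seconds):
    """Return pip config file text that sets the network timeout."""
    config = configparser.ConfigParser()
    # "timeout" under [global] corresponds to pip's --timeout option.
    config["global"] = {"timeout": str(timeout_seconds)}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_pip_config(300))
```

Because the setting lives in a config file rather than on the command line, it should also apply when conda invokes pip for the pip: section of the YAML. Running pip config set global.timeout 300 is an alternative that writes the same setting for you.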
While that answers the question proper, I would be remiss were I not to mention that all the packages listed in the YAML can come from Conda. A more appropriate solution would be to reconfigure the YAML to not hit PyPI in the first place, e.g.,
name: relightable-nr
channels:
- pytorch
- conda-forge
- defaults
dependencies:
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- absl-py=0.8.0
- astor=0.8.0
- astroid=2.3.3
- wrapt=1.11.2
- xarray=0.13.0
but perhaps you abridged the YAML and left out packages that only have PyPI builds. Still, I would recommend getting everything possible from Conda.
You can use:
sudo pip install --default-timeout=100 <name_of_your_library>

Creating Anaconda Environment from YML File Choking on Common Packages - os, pip, pandas

Why is Anaconda choking on common packages when creating an environment from a YML file? Anaconda COMES with these packages pre-installed in root (or so I thought?)
YML file:
---
name: rasterenv
channels:
- conda-forge
dependencies:
- gdal>=2.2.3
- rasterio
- cython
- jupyter
- matplotlib
- numpy
- pyproj
- shapely
- rasterio
- pandas
- geopandas
- os
- matplotlib
- seaborn
- fiona
- OSMnx
- pip:
  - pygeotools
  - pygeoprocessing
Trying to build file with: conda env create -f path/to/file
If I create an environment with JUST uncommon packages like rasterio, it appears to work. BUT, I want an environment with all! What gives here?
Error is:
ResolvePackageNotFound:
- os
If I remove os from the list, the error then becomes:
ResolvePackageNotFound:
- matplotlib
As @sinoroc pointed out in the comments, os is part of the Python standard library and should not be listed as a dependency. (When you do list it as a dependency, conda looks for a package called os in the configured channels [conda-forge on anaconda.org in this case] and won't find it.)
You can see which packages are part of the standard library by checking the docs here: https://docs.python.org/3/library/
(Also there have been a few questions on SO on how to find out if a particular package is part of the std lib, e.g. How to check if a module/library/package is part of the python standard library?) When you create a new environment the packages from the std lib are the only ones which are available by default. Anything else needs to be installed.
Additionally there are two packages in your yaml file that are listed twice (rasterio and matplotlib) which makes me think that you manually created that file. You can generate a conda environment file by activating an environment and running conda env export > environment.yml which will create a file called environment.yml with all required dependencies.
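If you do maintain the file by hand, repeated entries are easy to catch with a few lines of Python. This is just a sketch: find_duplicates is a made-up helper, and it compares bare package names after stripping simple version constraints such as >=2.2.3.

```python
from collections import Counter


def find_duplicates(dependencies):
    """Return package names listed more than once, in first-seen order."""
    names = []
    for spec in dependencies:
        name = spec
        # Strip simple version constraints, e.g. "gdal>=2.2.3" -> "gdal".
        for separator in (">=", "<=", "==", "=", ">", "<"):
            name = name.split(separator)[0]
        names.append(name.strip().lower())
    counts = Counter(names)
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


conda_dependencies = [
    "gdal>=2.2.3", "rasterio", "cython", "jupyter", "matplotlib", "numpy",
    "pyproj", "shapely", "rasterio", "pandas", "geopandas", "os",
    "matplotlib", "seaborn", "fiona", "OSMnx",
]
print(find_duplicates(conda_dependencies))
```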

Installing older version of h2o in conda virtual environment on Windows

I'm struggling to figure out conda virtual environments on windows. All I want is to be able to have different versions of h2o installed at the same time because of their insane decision to not allow you to be able to load files saved in even the most minor different version.
I created a virtual environment by cloning my base anaconda:
conda create -n h203_14_0_7 --clone base
I then activated the virtual environment like so:
C:\ProgramData\Anaconda3\Scripts\activate h203_14_0_7
Now that I'm in the virtual environment (I see the (h203_14_0_7) at the beginning of the prompt), I want to uninstall the version of h2o in this virtual environment so I tried:
pip uninstall h2o
But this output
which to me looks like it's going to uninstall the global h2o rather than the virtual environment h2o. So I think it's using the global pip instead of the pip it should have cloned off the base. So how do I use the virtual environment pip to uninstall h2o just for my virtual environment, and how can I be sure that it's doing the right thing?
I then ran
conda install pip
and it seems that after that I was able to use pip to uninstall h2o only from the virtual environment (I hope). I then downloaded the older h2o version from here: https://github.com/h2oai/h2o-3/releases/tag/jenkins-rel-weierstrass-7
but when I try to install it I get
(h203_14_0_7) C:\ProgramData\Anaconda3\envs\h203_14_0_7>pip install C:\Users\dan25\Downloads\h2o-3-jenkins-rel-weierstrass-7.tar.gz
Processing c:\users\dan25\downloads\h2o-3-jenkins-rel-weierstrass-7.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\h203_14_0_7\lib\tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\dan25\\AppData\\Local\\Temp\\pip-sf7r_6pm-build\\setup.py'
So what now?
I had trouble (e.g. https://0xdata.atlassian.net/browse/PUBDEV-3370 ) getting that approach to ever work. It felt like some kind of global dependency was in there, somewhere.
So, I personally just uninstall, and install the desired version, as I need to move between versions. (Actually, I am more likely to use a different VirtualBox or AWS image for each.)
However I noticed searching for conda on the H2O jira that there is a lot of activity recently. They might all be pointing out the same bug you have found, but if so it sounds like it is something getting enough attention to get fixed.
Aside: finding old versions (and your edit showing install problems)
To find, e.g. 3.14.0.7, google it with "h2o". The top hit is http://h2o-release.s3.amazonaws.com/h2o/rel-weierstrass/7/index.html
The "rel-weierstrass" represents 3.14.0, and the 7 is in the URL. (I've yet to see a full list of all the rel-XXX names, but google will always find at least one in the series, even if it won't find the exact minor version.)
Download the zip file you find there. Inside you will find both an R package, and a whl package for Python. So unzip it, extract the one you want, then pip install it.
These zip files are always on S3 (AFAIK). The link you showed was a source snapshot, on github.
Install requirements:
pip install requests tabulate numpy scikit-learn
Extract the archive:
zcat h2o-3-jenkins-rel-weierstrass-7.tar.gz | tar xvf -
cd into Python directory and build:
cd h2o-py
../gradlew build
I have this working now. I think the trick is to make sure you do NOT have h2o installed on your base python. I did the following:
pip uninstall h2o
conda create --name h2o-base pip
conda activate h2o-base
conda install numpy
conda install pandas
conda install requests
conda install tabulate
conda install colorama
conda install future
conda install jupyter
python -m pip install ipykernel
conda deactivate
And now to install specific versions of h2o, you need the URL of the .whl file for that version. You can find a list of the URLs of all the old versions here: https://github.com/h2oai/h2o-3/blob/master/Changes.md
So for example to install version 3.18.0.8:
conda create --name h2o-3-18-0-8 --clone h2o-base
conda activate h2o-3-18-0-8
pip install http://h2o-release.s3.amazonaws.com/h2o/rel-wolpert/8/Python/h2o-3.18.0.8-py2.py3-none-any.whl
python -m ipykernel install --user --name h2o-3-18-0-8 --display-name "Python (h2o-3-18-0-8)"
or version 3.20.0.2 (make sure to conda deactivate first):
conda create --name h2o-3-20-0-2 --clone h2o-base
conda activate h2o-3-20-0-2
pip install http://h2o-release.s3.amazonaws.com/h2o/rel-wright/2/Python/h2o-3.20.0.2-py2.py3-none-any.whl
python -m ipykernel install --user --name h2o-3-20-0-2 --display-name "Python (h2o-3-20-0-2)"
This set-up allows me to have multiple versions of h2o installed on the same computer and if I have to use serialized models I just have to run python from the virtual environment with the correct version of h2o installed. I think this is preferable to uninstalling and reinstalling h2o each time.
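The two wheel URLs above follow the same shape, so the URL and the environment name can be derived from the release details. The sketch below is inferred only from those two examples; the helper names are made up, and the pattern may not hold for every release, so check the result against the list in Changes.md.

```python
def h2o_wheel_url(codename, build, version):
    """Build the S3 URL of an H2O Python wheel from its release details.

    Pattern inferred from the 3.18.0.8 and 3.20.0.2 URLs above.
    """
    return (
        "http://h2o-release.s3.amazonaws.com/h2o/"
        f"rel-{codename}/{build}/Python/h2o-{version}-py2.py3-none-any.whl"
    )


def h2o_env_name(version):
    """Name the conda environment after the version, e.g. h2o-3-18-0-8."""
    return "h2o-" + version.replace(".", "-")


print(h2o_env_name("3.18.0.8"))
print(h2o_wheel_url("wolpert", 8, "3.18.0.8"))
```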
Here is the environments.yml file if you want to skip all the manual installs above:
name: h2o-base
channels:
- conda-forge
- defaults
dependencies:
- asn1crypto=0.24.0=py37_1003
- backcall=0.1.0=py_0
- bleach=3.0.2=py_0
- ca-certificates=2018.10.15=ha4d7672_0
- certifi=2018.10.15=py37_1000
- cffi=1.11.5=py37hfa6e2cd_1001
- chardet=3.0.4=py37_1003
- colorama=0.4.0=py_0
- cryptography=2.3=py37h74b6da3_0
- cryptography-vectors=2.3.1=py37_1000
- decorator=4.3.0=py_0
- entrypoints=0.2.3=py37_1002
- future=0.16.0=py37_1002
- icu=58.2=vc14_0
- idna=2.7=py37_1002
- ipykernel=5.1.0=pyh24bf2e0_0
- ipython=7.0.1=py37h39e3cac_1000
- ipython_genutils=0.2.0=py_1
- ipywidgets=7.4.2=py_0
- jedi=0.13.1=py37_1000
- jinja2=2.10=py_1
- jpeg=9b=vc14_2
- jsonschema=2.6.0=py37_1002
- jupyter=1.0.0=py_1
- jupyter_client=5.2.3=py_1
- jupyter_console=6.0.0=py_0
- jupyter_core=4.4.0=py_0
- libflang=5.0.0=vc14_20180208
- libpng=1.6.34=vc14_0
- libsodium=1.0.16=vc14_0
- llvm-meta=5.0.0=0
- markupsafe=1.0=py37hfa6e2cd_1001
- mistune=0.8.4=py37hfa6e2cd_1000
- nbconvert=5.3.1=py_1
- nbformat=4.4.0=py_1
- notebook=5.7.0=py37_1000
- openblas=0.2.20=vc14_8
- openmp=5.0.0=vc14_1
- openssl=1.0.2p=hfa6e2cd_1001
- pandas=0.23.4=py37h830ac7b_1000
- pandoc=2.3.1=0
- pandocfilters=1.4.2=py_1
- parso=0.3.1=py_0
- pickleshare=0.7.5=py37_1000
- pip=18.1=py37_1000
- prometheus_client=0.4.2=py_0
- prompt_toolkit=2.0.6=py_0
- pycparser=2.19=py_0
- pygments=2.2.0=py_1
- pyopenssl=18.0.0=py37_1000
- pyqt=5.6.0=py37h764d66f_7
- pysocks=1.6.8=py37_1002
- python=3.7.0=hc182675_1005
- python-dateutil=2.7.3=py_0
- pytz=2018.5=py_0
- pywinpty=0.5.4=py37_1002
- pyzmq=17.1.2=py37hf576995_1001
- qt=5.6.2=vc14_1
- qtconsole=4.4.2=py_1
- requests=2.19.1=py37_1001
- send2trash=1.5.0=py_0
- setuptools=40.4.3=py37_0
- simplegeneric=0.8.1=py_1
- sip=4.18.1=py37h6538335_0
- six=1.11.0=py37_1001
- tabulate=0.8.2=py_0
- terminado=0.8.1=py37_1001
- testpath=0.4.2=py37_1000
- tornado=5.1.1=py37hfa6e2cd_1000
- traitlets=4.3.2=py37_1000
- urllib3=1.23=py37_1001
- vc=14=0
- vs2015_runtime=14.0.25420=0
- wcwidth=0.1.7=py_1
- webencodings=0.5.1=py_1
- wheel=0.32.1=py37_0
- widgetsnbextension=3.4.2=py37_1000
- win_inet_pton=1.0.1=py37_1002
- wincertstore=0.2=py37_1002
- winpty=0.4.3=4
- zeromq=4.2.5=vc14_2
- zlib=1.2.11=vc14_0
- blas=1.0=mkl
- icc_rt=2017.0.4=h97af966_0
- intel-openmp=2019.0=118
- m2w64-gcc-libgfortran=5.3.0=6
- m2w64-gcc-libs=5.3.0=7
- m2w64-gcc-libs-core=5.3.0=7
- m2w64-gmp=6.1.0=2
- m2w64-libwinpthread-git=5.0.0.4634.697f757=2
- mkl=2019.0=118
- mkl_fft=1.0.6=py37hdbbee80_0
- mkl_random=1.0.1=py37h77b88f5_1
- msys2-conda-epoch=20160418=1
- numpy=1.15.2=py37ha559c80_0
- numpy-base=1.15.2=py37h8128ebf_0
