I remember the conda solver being more robust / producing more correct results when I was creating fairly complex environments with up to a hundred dependencies.
Has the pip solver improved over the years? I don't have a good example handy, but from what I remember, if transitive dependencies coming from package A conflicted with transitive dependencies coming from package B (a simplified example), pip didn't do a good job, whereas conda always produced correctly solved environments.
Are there any (documented or undocumented) examples where the pip solver falls short? E.g. like in the example above.
P.S. With the recent licensing changes coming from Anaconda, Inc. I have a customer that's considering standardizing on pip and forgoing conda.
pip doesn't do a good job of resolving transitive dependencies.
pip's dependency resolution algorithm is not a complete resolver.
The current resolution logic has the following characteristics:
- Only one top-level specification of a requirement is allowed (otherwise pip raises a "double requirement" exception).
- "First found wins" behavior for dependency requirements/constraints, which results in not respecting all constraints. pip 10+ prints a warning when this occurs.
Discussed here
pip has been trying to solve this for a long time, and pip 20.3 ships a new resolver that should address this, but it still has its own issues.
Related
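As a rough illustration (packageA, packageB, and sharedlib are made-up names for the sketch, not packages from the question), the shortfall shows up when two requirements pull in conflicting versions of the same transitive dependency:

# Suppose packageA requires sharedlib<2.0 and packageB requires sharedlib>=2.0.
# Legacy resolver (pip < 20.3): installs whichever constraint it sees first
# and only prints a warning about the conflict.
pip install packageA packageB

# New resolver (default since 20.3, opt-in earlier via a feature flag):
# backtracks to find a compatible set, or fails with a ResolutionImpossible error.
pip install --use-feature=2020-resolver packageA packageB

# pip check reports any dependency constraints the legacy resolver left broken.
pip check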
I would like to replace as many packages on my computer as possible with the corresponding Chocolatey packages, so they can be upgraded automatically.
Is there a possibility to scan the installed Apps and point out which of them have a chocolatey equivalent?
Thanks a bunch!
Yes, but it's probably not what you want to hear.
You can do this with the Package Synchronization feature, but this feature requires a Chocolatey for Business license (C4B). Automatic Synchronization is a similarly named feature (all paid licenses have it), but this only removes packages for which the related software was uninstalled outside of Chocolatey.
With the free version, you will have to instead synchronize your package state manually.
Note: I don't recommend doing this for packages you don't maintain on the community feed. The likelihood of getting malware is low, but I'd be more concerned with a poor search term causing the wrong package to get installed instead, or accidentally installing a less "official" package maintained by someone who is not as diligent with updates or has abandoned the package.
However, this should be a perfectly safe procedure for packages you develop and maintain (and in reality you'll probably know all the package IDs and versions anyway, so you'll skip straight to step 3). Doubly so if you are installing from a private feed you or your organization controls.
1. Query your installed programs from Windows. Take note of the version of each one so you can install the matching package version.
2. Do a package search for each program and record its package ID.
choco list --order-by-popularity --version VERSION should help you avoid less official or less maintained packages for the same software, and get you the correct package version. The top of the list is the most popular.
This is not perfect, as some software really only gets installed by a single version of the package that either self-updates or pulls from a "latest" URL. In these cases the package version is not usually updated or accurate.
3. Install each piece of software by its package ID. Do this one command at a time so you can specify the correct version (a command sketch follows these steps).
choco install -n skips running the installation PowerShell script, so it effectively only "imports" the package for management without performing the install.
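A rough sketch of that manual workflow (the package ID 7zip and the version 19.0 are placeholders; substitute the values you find in steps 1 and 2):

# Step 1: list installed programs and their versions from the registry (PowerShell).
Get-ItemProperty HKLM:\Software\Microsoft\Windows\CurrentVersion\Uninstall\*, HKLM:\Software\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall\* |
    Select-Object DisplayName, DisplayVersion |
    Sort-Object DisplayName

# Step 2: search the feed for a matching package, most popular first.
choco list 7zip --order-by-popularity --version 19.0

# Step 3: "import" the package at the installed version without re-running the installer.
choco install 7zip --version 19.0 -n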
I am wondering if there is an easy way to install all packages that reside in a Conda Channel. To be specific I would like to install all packages from the ESRI channel.
Thank you.
No, and installing all of them is likely impossible anyway. The packages there vary widely in how recently they were last updated, so I highly doubt you could install every package into the same environment.
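If you want to see why, you can list everything the channel carries and note how much the versions and build dates vary (a sketch that assumes a working conda install; the channel name comes from the question):

# List every package and version available on the esri channel only.
conda search "*" --channel esri --override-channels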
I am new to Python and starting work on a large project that will be distributed to users. I am also the first in my company to be using Python, and I wanted to get recommendations on the best way to install Python and packages, so that I don't head off in the wrong direction.
I require data analysis frameworks (pandas, numpy, scipy, matplotlib, statsmodels, pymongo) and my initial approach was to install Python 3.5 directly, and then use pip install on each package.
I ran into problems similar to those others have found [Unable to find vcvarsall], and resolved them. The next problem was with BLAS and LAPACK missing when installing scipy. At this point I decided Anaconda was the way to go, rather than individual pip installs, and I was easily able to set everything up.
One problem with Anaconda is that it installs a lot of packages which I will never use, and may not have some which I would like to use in future, e.g. TensorFlow (presumably I can pip install to get extra ones that are not included?).
An in-between solution seems to be Miniconda, which I believe would have fixed the BLAS/LAPACK problem with scipy.
So my question is: can someone with experience of developing data analysis projects in Python, that will be deployed to users' Windows desktops, and with server-side components running on Linux, provide recommendation of what they would do if starting from scratch at new organization?
(I'm currently in favour of heading down the Anaconda route.)
Personally, I think Anaconda (conda) is better. First of all, conda is a cross-platform package manager, and it is easy to install and use. Second, conda has the functionality of virtualenv, and you can use conda create to create environments. Finally, there are Anaconda Cloud and conda-forge; those communities can help you solve conda issues, build packages, and share ideas.
Moreover, Anaconda (conda) does indeed install a lot of packages, but those are all dependencies. For example, when you conda install scikit-learn, conda will automatically install its dependencies, such as numpy and scipy.
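As a minimal sketch of that workflow (the environment name analysis is just an example), you could create a dedicated environment for the stack listed in the question and pip-install anything your conda channels don't carry:

# Create an isolated environment with the data-analysis stack from the question.
conda create -n analysis python=3.5 pandas numpy scipy matplotlib statsmodels pymongo

# Activate it (on Windows: "activate analysis"; with newer conda: "conda activate analysis").
source activate analysis

# Packages not available from your conda channels can still be added with pip.
pip install tensorflow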
For me, on OS X, conda update --all often downgrades some libraries while updating many others.
Is this usual? Or something possibly in my setup?
Earlier this year, it was pillow for many months.
Surprisingly, today it was several of the HDF5 related libraries, numba and llvmlite.
So conda update numba brings numba back to the most recent version, and so on with the other 8 libraries, but why doesn't conda update --all do this anyway?
It's a compatibility issue. Anaconda is a stable set of packages. When you update Anaconda, you update to this stable list.
However, when you update individual packages, they might cause incompatibility issues with the rest of the Anaconda distribution so they aren't considered stable. That's why when you use conda update --all, it gets you to the latest stable Anaconda distribution, which might or might not have the version of the individual package you wanted.
See here: https://github.com/ContinuumIO/anaconda-issues/issues/39
Edit: This behavior has changed. It now tries to upgrade all packages as far as possible (except Python across major/minor versions) such that no packages will be incompatible with each other.
See here: http://continuum.io/blog/advanced-conda-part-1#conda-update-all
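In practice the distinction looks roughly like this (a sketch; which packages actually move will depend on your environment):

# Update everything to the curated, mutually tested Anaconda release.
conda update anaconda

# Update every package as far as its dependency metadata allows; some packages
# may be held back (or downgraded) to keep the environment consistent.
conda update --all

# Update just one package, letting conda adjust its dependencies as needed.
conda update numba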
Some libraries depend on specific lower versions for compatibility purposes. conda update --all will try to update packages as much as possible, but it always maintains compatibility with the version restrictions in each package's metadata. Note that the anaconda package does not come into play here (assuming you have a recent version of conda), because conda update --all ignores it.
Unfortunately, it's not always easy to see what depends on what, but there are some tricks. One way is to pin each package to the version you want and then run conda update --all. It should generate an unsatisfiability hint that will give you an idea of what is causing the problem. Another way is to search through the package metadata.
For numba, I can suggest that the problem is likely related to numbapro. There are a few packages that depend on hdf5. You can use conda info <package> to see the dependencies of a package (like conda info h5py).
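A minimal sketch of the pinning trick (the install path and the pinned versions below are placeholders; adjust them to your setup):

# Pin the packages you care about by listing them in the environment's pinned file,
# which conda's solver consults on every install/update.
cat >> ~/anaconda3/conda-meta/pinned << 'EOF'
numba ==0.30.1
llvmlite ==0.15.0
EOF

# Ask conda to update everything; if the pins can't all be satisfied together,
# the resulting hint points at the conflicting packages.
conda update --all

# Inspect a package's dependency metadata directly, as suggested above.
conda info h5py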
I am writing a Ruby gem that needs to use an open-source program distributed as a Python package. I don't have the time to port the Python program to Ruby, and I want to manage the external dependency as automatically as possible.
I'm thinking of using the Gem.pre_install hook to automatically easy_install the python package I'm interested in.
http://rubygems.rubyforge.org/rubygems-update/Gem.html#method-c-pre_install
I'd appreciate suggestions of better ways, or support of pre_install, if it's the accepted practice.
Quite an old question, but worth a reply. Sorry, I haven't been checking stackoverflow for babushka-related questions :)
If the Python package is available via pip, then you could do something like this:
dep 'blah.gem' do
requires 'something.pip'
end
dep 'something.pip'
Then, babushka blah.gem would handle the install, including installing rubygems and pip as required.
Ben
You may want to look at Babushka for describing non-ruby dependencies.
I don't know whether installing the python package in the pre_install hook would be polite behaviour.