Solving Unsolvable Dependencies - pip

In Python, a conda environment is supposed to solve the multiple-version problem, but "solving" takes a long time (although mamba and others do it faster). Not only that, but sometimes the version conflicts are unsolvable.
What I don't understand is why there can't be multiple versions of the same package in the same environment at the same time. For example, suppose I have packages A, B and C
with the following dependencies:
I import A and B
A imports C(version=1)
B imports C(version=2)
This is then unsolvable. But if I could name the packages as Cv1 and Cv2, then
in A I could
import Cv1 as C
and in B I could
import Cv2 as C
So this is solvable.
Why isn't this done?
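For reference, Python can already load two copies of a module under different names if the normal import statement is bypassed. Below is a minimal sketch, assuming two hypothetical single-file versions of C written to temporary directories; the names Cv1 and Cv2, the VERSION attribute and the load_as helper are made up for illustration.

```python
import importlib.util
import pathlib
import tempfile


def load_as(alias, path):
    """Load the module file at `path` under the module name `alias`."""
    spec = importlib.util.spec_from_file_location(alias, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Two hypothetical versions of package C, written to separate directories.
root = pathlib.Path(tempfile.mkdtemp())
for version in (1, 2):
    version_dir = root / f"v{version}"
    version_dir.mkdir()
    (version_dir / "C.py").write_text(f"VERSION = {version}\n")

# What A would do: "import Cv1 as C"
C_for_A = load_as("Cv1", root / "v1" / "C.py")
# What B would do: "import Cv2 as C"
C_for_B = load_as("Cv2", root / "v2" / "C.py")

print(C_for_A.VERSION, C_for_B.VERSION)  # prints "1 2"
```

This works for self-contained modules. It gets harder for real packages, because sys.modules is keyed by module name, so code inside C that does a plain "import C" sees only one copy, and objects created by the two copies have distinct types even when the class names match.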

Related

Gradle dependency resolution with ranges

How does Gradle resolve dependencies when version ranges are involved? Unfortunately, I couldn't find sufficient explanation on the official docs.
Suppose we have a project A that has two declared dependencies B and C. Those dependencies specify a range of versions, i.e. for B it is [2.0, 3.0) and for C it is [1.0, 1.5].
B itself does not have a dependency on C until version 2.8; that version introduces a strict dependency on C in the range [1.1, 1.2].
Looking at this example, we might determine that the resolved versions are:
B: 2.8 (because it is the highest version in the range)
C: 1.2 (because given B at 2.8, this is the highest version that satisfies both required ranges)
In general, it is not clear to me how this entire algorithm is carried out exactly. In particular, if ranges are involved, every possible choice of a concrete version inside a range might introduce different transitive dependencies (as in my example with B introducing a dependency on C only at version 2.8) which can declare dependencies with ranges themselves, and so on, making the number of possibilities explode quickly.
Does it apply some sort of greedy strategy, in that it tries to settle on a version as early as possible and, if a dependency encountered later conflicts with the version already chosen, backtracks and chooses another version?
Any help in understanding this is very much appreciated.
EDIT: I've read here that the problem in general is NP-hard. So does Gradle actually simplify the process somehow, to make it solvable in polynomial time? And if so, how?
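The greedy-with-backtracking strategy described in the question can be sketched as follows. This is not Gradle's actual algorithm, only an illustration of the idea on the example above. The lists of available versions are assumptions (the question only gives ranges), and versions are encoded as (major, minor) tuples.

```python
# Hypothetical repository: available versions per module, as (major, minor).
AVAILABLE = {
    "B": [(2, 0), (2, 5), (2, 8)],
    "C": [(1, 0), (1, 1), (1, 2), (1, 5)],
}

# Transitive constraints: only B 2.8 depends on C, in the range [1.1, 1.2].
TRANSITIVE = {
    ("B", (2, 8)): {"C": lambda v: (1, 1) <= v <= (1, 2)},
}

# Constraints declared by project A: B in [2.0, 3.0), C in [1.0, 1.5].
ROOT = {
    "B": lambda v: (2, 0) <= v < (3, 0),
    "C": lambda v: (1, 0) <= v <= (1, 5),
}


def resolve(pending, chosen):
    """Pick the highest admissible version first; backtrack on conflict.

    `pending` is a list of (module, predicate) constraints still to satisfy,
    `chosen` maps each module to the version picked so far.
    """
    if not pending:
        return chosen
    (name, accepts), rest = pending[0], pending[1:]
    if name in chosen:
        # Already settled: the earlier choice must satisfy this constraint too.
        return resolve(rest, chosen) if accepts(chosen[name]) else None
    for version in sorted(AVAILABLE[name], reverse=True):
        if not accepts(version):
            continue
        extra = list(TRANSITIVE.get((name, version), {}).items())
        result = resolve(rest + extra, {**chosen, name: version})
        if result is not None:
            return result
    return None  # no version of `name` works; the caller tries its next option


print(resolve(list(ROOT.items()), {}))  # {'B': (2, 8), 'C': (1, 2)}
```

On this data the search first tries C 1.5, rejects it because B 2.8 requires C in [1.1, 1.2], and settles on C 1.2. In the worst case such a search is exponential, which is consistent with the general problem being NP-hard.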

Sampling from a joint distribution in Pyro

I understand how to sample from multidimensional categorical, or multivariate normal (with dependence within each column). For example, for a multivariate categorical, this can be done as below:
import pyro as p
import pyro.distributions as d
import torch as t
p.sample("obs1", d.Categorical(logits=logit_pobs1).independent(1), obs=t.t(obs1))
My question is, how can we do the same if there are multiple distributions? For example, the following is not what I want, as obs1, obs2 and obs3 are independent of each other.
p.sample("obs1", d.Categorical(logits=logit_pobs1).independent(1), obs=t.t(obs1))
p.sample("obs2", d.Normal(loc=mu_obs2, scale=t.ones(mu_obs2.shape)).independent(1), obs=t.t(obs2))
p.sample("obs3", d.Bernoulli(logits=logit_pobs3).independent(1), obs=obs3)
I would like to do something like
p.sample("obs", d.joint(d.Bernoulli(...), d.Normal(...), d.Bernoulli(...)).independent(1),obs)

How to use the same vendor package of a dependency in golang?

I have a project, say P, that depends on a library L, which in turn depends on a vendor library V.
Now P also depends (directly) on V. Is it possible to tell the compiler to just use the V library in L/vendor? Or must I add V to P/vendor, which I think is really redundant?

Algorithm for dependency resolution

I'm in the process of writing a package manager, and for that I want the dependency resolution to be as powerful as possible.
Each package has a list of versions, and each version contains the following information:
A comparable ID
Dependencies (a list of packages and for each package a set of acceptable versions)
Conflicts (a list of packages and for each package a set of versions that cause issues together with this version)
Provides (a list of packages and for each package a set of versions that this package also provides/contains)
For the current state I have a list of packages and their current versions.
I now want to, given the list of available packages and the current state, be able to get a version for each package in a list of packages, taking the given constraints into account (dependencies, conflicting packages, packages provided by other packages) and get back a list of versions for each of these packages. Circular dependencies are possible.
If no valid state can be reached, the versions of the existing packages may be changed, though this should only be done if necessary. Should it not be possible to reach a valid state, as much information about the reason as possible should be available (to tell the user "it could work if you remove X" etc.).
If possible it should also be possible to "lock" packages to a specific version, in which case the version of the package may NOT be changed.
What I'm trying to accomplish is very similar to what existing package managers already do, with the difference that the latest version of a package does not necessarily need to be used (an assumption which most package managers seem to make).
The only idea I have so far is building a structure of all possible states, for all possible versions of the packages in question, and then removing invalid states. I really hope this is not the only solution, since it feels very "brute force"-ish. Staying under a few seconds for ~500 available packages with ~100 versions each, and ~150 installed packages would be a good goal (though the faster the better).
I don't believe this is a language-specific question, but to better illustrate it, here is a bit of pseudocode:
struct Version
    integer id
    list<Package, set<integer>> dependencies
    list<Package, set<integer>> conflicts
    list<Package, set<integer>> provides
struct Package
    string id
    list<Version> versions
struct State
    map<Package, Version> packages
    map<Package, boolean> isVersionLocked
State resolve(State initialState, list<Package> availablePackages, list<Package> newPackages)
{
    // do stuff here
}
(if you should have actual code or know about an existing implementation of something that does this (in any language, C++ preferred) feel free to mention it anyway)
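A rough Python rendering of part of the pseudocode above, together with a check that a candidate state satisfies every dependency and conflict. This is only a sketch: "provides" and version locking are left out to keep it short, packages are referred to by name, and the example packages (app, lib, tool) are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    id: int
    # package name -> set of acceptable (or conflicting) version ids
    dependencies: dict = field(default_factory=dict)
    conflicts: dict = field(default_factory=dict)


def is_valid(state, packages):
    """Check a candidate state {package name: installed version id}.

    `packages` maps package name -> {version id: Version}.
    """
    for name, version_id in state.items():
        version = packages[name][version_id]
        for dep, accepted in version.dependencies.items():
            if state.get(dep) not in accepted:
                return False  # dependency missing or at a rejected version
        for other, rejected in version.conflicts.items():
            if state.get(other) in rejected:
                return False  # a conflicting version is installed
    return True


# Hypothetical data: app 1 needs lib 1 or 2 and conflicts with tool 3.
packages = {
    "app": {1: Version(1, dependencies={"lib": {1, 2}}, conflicts={"tool": {3}})},
    "lib": {1: Version(1), 2: Version(2), 3: Version(3)},
    "tool": {3: Version(3)},
}

print(is_valid({"app": 1, "lib": 2}, packages))             # True
print(is_valid({"app": 1, "lib": 3}, packages))             # False
print(is_valid({"app": 1, "lib": 1, "tool": 3}, packages))  # False
```

Checking a single state like this is cheap. The expensive part is searching the space of possible states.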
It's NP-hard
Some bad news: This problem is NP-hard, so unless P=NP, there is no algorithm that can efficiently solve all instances of it. I'll prove this by showing how to convert, in polynomial time, any given instance of the NP-hard problem 3SAT into a dependency graph structure suitable for input to your problem, and how to turn the output of any dependency resolution algorithm on that problem back into a solution to the original 3SAT problem, again in polynomial time. The logic is basically that if there was some algorithm that could solve your dependency resolution problem in polynomial time, then it would also solve any 3SAT instance in polynomial time -- and since computer scientists have spent decades looking for such an algorithm without finding one, this is believed to be impossible.
I'll assume in the following that at most one version of any package can be installed at any time. (This is equivalent to assuming that there are implicit conflicts between every pair of distinct versions of the same package.)
First, let's formulate a slightly relaxed version of the dependency resolution problem in which we assume that no packages are already installed. All we want is an algorithm that, given a "target" package, either returns a set of package versions to install that (a) includes some version of the target package and (b) satisfies all dependency and conflict properties of every package in the set, or returns "IMPOSSIBLE" if no set of package versions will work. Clearly if this problem is NP-hard, then so is the more general problem in which we also specify a set of already-installed package versions that are not to be changed.
Constructing the instance
Suppose we are given a 3SAT instance containing n clauses and k variables. We will create 2 packages for each variable: one corresponding to the literal x_k, and one corresponding to the literal !x_k. The x_k package will have a conflict with the !x_k package, and vice versa, ensuring that at most one of these two packages will ever be installed by the package manager. All of these "literal" packages will have just a single version, and no dependencies.
For each clause we will also create a single "parent" package, and 7 versions of a "child" package. Each parent package will be dependent on any of the 7 versions of its child package. Child packages correspond to ways of choosing at least one item from a set of 3 items, and will each have 3 dependencies on the corresponding literal packages. For example, a clause (p, !q, r) will have child package versions having dependencies on the literal packages (p, q, !r), (!p, !q, !r), (!p, q, r), (p, !q, !r), (p, q, r), (!p, !q, r), and (p, !q, r): the first 3 versions satisfy exactly one of the literals p, !q or r; the next 3 versions satisfy exactly 2; and the last satisfies all 3.
Finally, we create a "root" package, which has all of the n parent clause packages as its dependencies. This will be the package that we ask the package manager to install.
If we run the package manager on this set of 2k + 8n + 1 package versions, asking it to install the root package, it will either return "IMPOSSIBLE", or a list of package versions to install. In the former case, the 3SAT problem is unsatisfiable. In the latter case, we can extract values for the variables easily: if the literal package for x_k was installed, set x_k to true; if the literal package !x_k was installed, set x_k to false. (Note that there won't be any variables with neither literal package installed: each variable appears in at least one clause, and each clause produces 7 child package versions, at least one of which must be installed, and which will force installation of one of the two literals for that variable.)
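The construction above can be sketched in Python. The encoding is an assumption of this sketch: a clause is a tuple of three non-zero integers over distinct variables (v stands for x_v, -v for !x_v), and a literal package is named by the same integer. The install function is a small backtracking stand-in for a real package manager, and the parent and root packages are implicit in its loop over clauses.

```python
from itertools import product


def build_children(clauses):
    """Return, for each clause, its 7 child versions.

    Each child version is a tuple of the three literal packages it depends
    on: one sign per variable of the clause, with at least one literal of
    the clause satisfied.
    """
    children = []
    for clause in clauses:
        versions = []
        for signs in product((1, -1), repeat=3):
            deps = tuple(s * abs(lit) for s, lit in zip(signs, clause))
            if any(d == lit for d, lit in zip(deps, clause)):
                versions.append(deps)
        children.append(versions)  # 8 sign patterns minus the falsifying one
    return children


def install(children, installed=frozenset()):
    """Install the root: one child version per clause, no conflicting literals.

    Returns the set of installed literal packages, or None for "IMPOSSIBLE".
    """
    if not children:
        return installed
    for deps in children[0]:
        # Literal packages v and -v conflict with each other.
        if all(-lit not in installed for lit in deps):
            result = install(children[1:], installed | set(deps))
            if result is not None:
                return result
    return None


def solve_3sat(clauses):
    """Read the variable assignment off the installed literal packages."""
    literals = install(build_children(clauses))
    if literals is None:
        return None
    return {abs(lit): lit > 0 for lit in literals}


clauses = [(1, -2, 3), (-1, 2, 3), (1, 2, -3)]
print(solve_3sat(clauses))
```

The stand-in solver is itself exponential in the worst case. The point of the reduction is only that building the packages and reading back the assignment take polynomial time.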
Even some restrictions are hard
This construction doesn't make any use of pre-installed packages or "Provides" information, so the problem remains NP-hard even when those aren't permitted. More interestingly, given our assumption that at most one version of any package can be installed at a time, the problem remains NP-hard even if we don't permit conflicts: instead of making the literals x_k and !x_k separate packages with conflict clauses in each direction, we just make them two different versions of the same package!

Is the resolution problem in OSGi NP-Complete?

The resolution problem is described in the modularity chapter of the OSGi R4 core specification. It's a constraint satisfaction problem and certainly a challenging problem to solve efficiently, i.e. not by brute force. The main complications are the uses constraint, which has non-local effects, and the freedom to drop optional imports to obtain a successful resolution.
NP-Completeness is dealt with elsewhere on StackOverflow.
There has already been plenty of speculation about the answer to this question, so please avoid speculation. Good answers will include a proof or, failing that, a compelling informal argument.
The answer to this question will be valuable to those projects building resolvers for OSGi, including the Eclipse Equinox and Apache Felix open source projects, as well as to the wider OSGi community.
Yes.
The approach taken by the edos paper Pascal quoted can be made to work with OSGi. Below I’ll show how to reduce any 3-SAT instance to an OSGi bundle resolution problem. This site doesn’t seem to support mathematical notation, so I’ll use the kind of notation that’s familiar to programmers instead.
Here’s a definition of the 3-SAT problem we’re trying to reduce:
First define A to be a set of propositional atoms and their negations A = {a(1), … ,a(k),na(1), … ,na(k)}. In simpler language, each a(i) is a boolean and we define na(i)=!a(i)
Then 3-SAT instances S have the form: S = C(1) & … & C(n)
where C(i) = L(i,1) | L(i,2) | L(i,3) and each L(i,j) is a member of A
Solving a particular 3-SAT instance involves finding a set of values, true or false for each a(i) in A, such that S evaluates to true.
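To make the definition concrete, here is a minimal brute-force sketch in Python. The encoding is my own choice for illustration: a literal is the pair ("a", i) for a(i) or ("na", i) for na(i), a clause C(i) is a tuple of three literals, and S is a list of clauses.

```python
from itertools import product


def evaluate(S, values):
    """Evaluate S = C(1) & ... & C(n) under `values`, a dict {i: bool} for a(i)."""
    def literal(kind, i):
        return values[i] if kind == "a" else not values[i]
    return all(any(literal(*lit) for lit in clause) for clause in S)


def solve(S, k):
    """Try all 2**k assignments of a(1)..a(k); return the first that works."""
    for bits in product((False, True), repeat=k):
        values = dict(enumerate(bits, start=1))
        if evaluate(S, values):
            return values
    return None


# (a(1) | na(2) | a(3)) & (na(1) | a(2) | a(3))
S = [
    (("a", 1), ("na", 2), ("a", 3)),
    (("na", 1), ("a", 2), ("a", 3)),
]
print(solve(S, 3))
```

The loop over all 2**k assignments is the brute force that a polynomial-time algorithm would have to avoid.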
Now let’s define the bundles we’ll use to construct an equivalent resolution problem. In the following, all bundle and package versions are 0 and import version ranges are unrestricted except where specified.
The whole expression S will be represented by Bundle BS
Each clause C(i) will be represented by a bundle BC(i)
Each atom a(j) will be represented by a bundle BA(j) version 1
Each negated atom na(j) will be represented by a bundle BA(j) version 2
Now for the constraints, starting with the atoms:
BA(j) version 1
-export package PA(j) version 1
-for each clause C(i) containing atom a(j) export PM(i) and add PA(j) to PM(i)’s uses directive
BA(j) version 2
-export package PA(j) version 2
-for each clause C(i) containing negated atom na(j) export PM(i) and add PA(j) to PM(i)’s uses directive
BC(i)
-export PC(i)
-import PM(i) and add it to the uses directive of PC(i)
-for each atom a(j) in clause C(i) optionally import PA(j) version [1,1] and add PA(j) to the uses directive of the PC(i) export
-for each atom na(j) in clause C(i) optionally import PA(j) version [2,2] and add PA(j) to the uses directive of the PC(i) export
BS
-no exports
-for each clause C(i) import PC(i)
-for each atom a(j) in A import PA(j) [1,2]
A few words of explanation:
The AND relationship between the clauses is implemented by having BS import from each BC(i) a package PC(i) that is only exported by this bundle.
The OR relationship works because BC(i) imports package PM(i) which is only exported by the bundles representing its members, so at least one of them must be present, and because it optionally imports some PA(j) version x from each bundle representing a member, a package unique to that bundle.
The NOT relationship between BA(j) version 1 and BA(j) version 2 is enforced by uses constraints. BS imports each package PA(j) without version constraints, so it must import either PA(j) version 1 or PA(j) version 2 for each j. Also, the uses constraints ensure that any PA(j) imported by a clause bundle BC(i) acts as an implied constraint on the class space of BS, so BS cannot be resolved if both versions of PA(j) appear in its implied constraints. So only one version of BA(j) can be in the solution.
Incidentally, there is a much easier way to implement the NOT relationship - just add the singleton:=true directive to each BA(j). I haven’t done it this way because the singleton directive is rarely used, so this seems like cheating. I’m only mentioning it because in practice, no OSGi framework I know of implements uses based package constraints properly in the face of optional imports, so if you were to actually create bundles using this method then testing them could be a frustrating experience.
Other remarks:
A reduction of 3-SAT that doesn't use optional imports is also possible, although it is longer. It basically involves an extra layer of bundles to simulate the optionality using versions. A reduction of 1-in-3 SAT is equivalent to a reduction of 3-SAT and looks simpler, although I haven't stepped through it.
Apart from proofs that use singleton:=true, all of the proofs I know about depend on the transitivity of uses constraints. Note that both singleton:=true and transitive uses are non-local constraints.
The proof above actually shows that the OSGi resolution problem is NP-Complete or worse (i.e. NP-hard). To demonstrate that it’s not worse we need to show that any solution to the problem can be verified in polynomial time. Most of the things that need to be checked are local, e.g. looking at each non-optional import and checking that it is wired to a compatible export. Verifying these is O(num-local-constraints). Constraints based on singleton:=true need to look at all singleton bundles and check that no two have the same bundle symbolic name. The number of checks is less than num-bundles * num-bundles. The most complicated part is checking that the uses constraints are satisfied. For each bundle this involves walking the uses graph to gather all of the constraints and then checking that none of these conflict with the bundle’s imports. Any reasonable walking algorithm would turn back whenever it encountered a wire or uses relationship it had seen before, so the maximum number of steps in the walk is (num-wires-in-framework + num-uses-in-framework). The maximum cost of checking that a wire or uses relationship hasn't been walked before is less than the log of this. Once the constrained packages have been gathered, the cost of the consistency check for each bundle is less than num-imports-in-bundle * num-exports-in-framework. Everything here is a polynomial or better.
This paper provides a demonstration: http://www.cse.ucsd.edu/~rjhala/papers/opium.html
From memory I thought this paper contained the demonstration; sorry for not checking that beforehand. Here is the other link that I meant to copy, which I'm sure provides a demonstration on page 48: http://www.edos-project.org/bin/download/Main/Deliverables/edos%2Dwp2d1.pdf
