Interview question about dependency management [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
In one of my recent interviews, I was asked how a package manager such as npm or pip might work internally when figuring out which dependencies to install first.
For instance, say you want to install a package A which depends upon package B, which in turn depends on package C. In such a case, the package C should be installed first followed by B and then A.
The dependency trail can get a lot more complicated, and I believe it can be represented as a graph. The question is to figure out whether a cyclic dependency exists among the packages and, if not, to print the packages in the order in which they should be installed.
I couldn't come up with a correct/optimal solution in time, but maybe someone from here can help?
Thanks!

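The dependencies between packages can be modeled as a directed graph, with an edge from each package to every package it depends on. A valid dependency graph is a Directed Acyclic Graph (DAG).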
A dependency graph is invalid if it contains a cycle, for which you can refer to the following algorithm for detecting a cycle in a directed graph: https://www.geeksforgeeks.org/detect-cycle-in-a-graph/
If there's no cycle in the graph, then you can perform a topological sort to obtain the order in which the dependencies should be installed: https://www.geeksforgeeks.org/topological-sorting/
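Both steps can be combined in one pass. The following is a minimal sketch using Kahn's algorithm, which is one of several ways to do a topological sort and not necessarily what npm or pip do internally. The function name install_order and the input shape (a dict mapping each package to the list of packages it depends on) are assumptions made for illustration.

```python
from collections import deque


def install_order(dependencies):
    """Return packages in a valid install order, or raise ValueError on a cycle.

    `dependencies` maps each package name to the packages it depends on.
    """
    # Collect every package, including ones that only appear as a dependency.
    packages = set(dependencies)
    for deps in dependencies.values():
        packages.update(deps)

    # unmet[p] counts the dependencies of p that are not installed yet.
    unmet = {p: len(set(dependencies.get(p, ()))) for p in packages}

    # dependents[d] lists the packages that are waiting on d.
    dependents = {p: [] for p in packages}
    for pkg, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(pkg)

    # Start with packages that have no dependencies at all.
    ready = deque(sorted(p for p in packages if unmet[p] == 0))
    order = []
    while ready:
        pkg = ready.popleft()
        order.append(pkg)
        for dependent in sorted(dependents[pkg]):
            unmet[dependent] -= 1
            if unmet[dependent] == 0:
                ready.append(dependent)

    # Packages on a cycle never reach zero unmet dependencies.
    if len(order) != len(packages):
        raise ValueError("cyclic dependency detected")
    return order


print(install_order({"A": ["B"], "B": ["C"]}))
```

For the example in the question (A depends on B, B depends on C), this yields C, then B, then A. If the graph contains a cycle, some packages are never emitted, which is how the cycle is detected.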
Hope this answers your question. Cheers!

Please see the npm documentation on its dependency resolution algorithm for more detail.
Dependency Resolution
I am taking the exact example provided in the npm documentation.
Note: the dependency resolution algorithm changed in npm v3, so this example applies to npm v3 and above.
Let's consider the following example:
Module-A, depends on Module B v1.0.
Module-C, depends on Module B v2.0.
Note the sequence of modules mentioned because it plays a significant role in the dependency resolution.
Module A comes first in the sequence and depends on Module B v1.0, so npm installs both Module A and its dependency, Module B, flat inside the /node_modules directory.
Next in the sequence is Module C, which also depends on Module B but on a different version. npm handles this by nesting the new, different version of Module B under the module that requires it.
Now what happens if we install another module that depends on Module B v1.0? or Module B v2.0?
So lets say :
Module-D, depends on Module B v2.0.
Module-E, depends on Module B v1.0.
Because B v1.0 is already a top-level dependency, we cannot install B v2.0 as a top level dependency. Therefore Module B v2.0 is installed as a nested dependency of Module D, even though we already have a copy installed, nested beneath Module C. Module B v1.0 is already a top-level dependency, we do not need to duplicate and nest it. We simply install Module E and it shares Module B v1.0 with Module A.
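The placement rule described so far can be sketched in a few lines. This is only a toy model under stated assumptions: versions are compared by exact equality (real npm matches semver ranges), only the location of each copy of the dependency is tracked, and the names place and install and the tree layout are hypothetical, not part of npm.

```python
def place(tree, requester, dep, version):
    """Record where `dep`@`version` lands when `requester` needs it."""
    top = tree["top"]
    if dep not in top:
        # The top-level slot is free: install flat.
        top[dep] = version
    elif top[dep] != version:
        # A different version occupies the top level: nest under the requester.
        tree["nested"].setdefault(requester, {})[dep] = version
    # Otherwise the top-level copy already satisfies the requester and is shared.


def install(requests):
    """Process (requester, dependency, version) triples in install order."""
    tree = {"top": {}, "nested": {}}
    for requester, dep, version in requests:
        place(tree, requester, dep, version)
    return tree


requests = [
    ("A", "B", "1.0"),
    ("C", "B", "2.0"),
    ("D", "B", "2.0"),
    ("E", "B", "1.0"),
]
print(install(requests))
```

With the install order A, C, D, E, Module B v1.0 ends up at the top level and is shared by A and E, while C and D each get their own nested copy of B v2.0. Feeding the same requests in a different order changes which version wins the top-level slot.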
Now the interesting part: what happens if we update Module A to v2.0, which depends on Module B v2.0 rather than Module B v1.0?
The key is to remember that install order matters.
Even though Module A was installed first (as v1.0) via our package.json, using npm install command means that Module A v2.0 is the last package installed.
As a result, npm does the following things when we install module A v2.0
It removes Module A v1.0.
It installs Module A v2.0.
It leaves Module B v1.0 because Module E v1.0 still depends on it.
It installs Module B v2.0 as a nested dependency under Module A v2.0, since Module B v1.0 is already occupying the top level in the directory hierarchy.
Finally, let’s also update Module E to v2.0, which also depends on Module B v2.0 instead of Module B v1.0, just like the Module A update.
npm performs the following things:
It removes Module E v1.0.
It installs Module E v2.0.
It removes Module B v1.0 because nothing depends on it anymore.
It installs Module B v2.0 in the top level of the directory because there is no other version of Module B there.
Now, this is clearly not ideal. We have Module B v2.0 in nearly every directory. To get rid of duplication, we can run:
npm dedupe
This command resolves all of the packages dependencies on Module B v2.0 by redirecting them to the top level copy of Module B v2.0 and removes all the nested copies.
Conclusion
So the key takeaway from this example is that installation order matters, and a consistent order can be ensured only by using the npm command when adding or updating any package in the project. The dependency tree generated by npm may differ between local development machines, but that won't affect the behavior of your application: even though the trees are different, both sufficiently install and point all your dependencies at all their dependencies, and so on, down the tree. You still have everything you need; it just happens to be in a different configuration.
If you want your node_modules directory to be the same everywhere, note that the npm install command, when used exclusively to install packages from a package.json, will always produce the same tree. This is because install order from a package.json is always alphabetical. Same install order means that you will get the same tree.
You can reliably get the same dependency tree by removing your node_modules directory and running npm install whenever you make a change to your package.json.

Related

Do dependency versions affect substitution of included builds in a Gradle composite build?

Context
I have one project (A) that depends on another (B), both using the Gradle build system. Project A usually declares a dependency on a specific version of B. When modifying project B, it's convenient to immediately see the effects on project A. Using includeBuild in settings.gradle or --include-build on the command line, the build of project A is reconfigured to include the local development version of project B instead of a published artifact (a "composite build"). However, this local copy of project B declares a development version number, not the stable release version requested by project A. To ensure that the included build of B satisfies the declared dependency in A, I have been changing A to request a version range like org.example:B:[v6, ) instead of org.example:B:v6.0. However, I see that the build succeeds even when A specifies nonsensical version numbers that do not correspond to B.
Question
How do versions affect this dependency substitution process?
Research
The documentation I have read (linked below) shows how to include a build, but does not mention version numbers.
https://docs.gradle.org/current/samples/sample_composite_builds_basics.html
https://docs.gradle.org/current/userguide/composite_builds.html#defining_composite_builds
Some hints may be present on this page covering "brute force" modifications of dependency resolution:
In composite builds, the rule that you have to match the exact
requested dependency attributes is not applied: when using composites,
Gradle will automatically match the requested attributes. In other
words, it is implicit that if you include another build, you are
substituting all variants of the substituted module with an equivalent
variant in the included build.
Although this deals with "variants" and their "attributes", which according to the dependency terminology documentation seem to be distinct from module versioning, the statement seems related. Can something similar be said about dependency module versions?

Are go modules meant to be installed as executable programs or packages?

Can Go modules be built as an executable program? Or, are they meant to be published as libraries for code reuse?
Building an executable and publishing a library are not mutually exclusive (note that modules are not compiled, packages are).
A module is a collection of related Go packages that are versioned together as a single unit.
Modules record precise dependency requirements and create reproducible builds.
https://github.com/golang/go/wiki/Modules#modules
Whether these packages contain a main package or not is irrelevant.
They're intended to work as packages, like something you would install from NPM for a JavaScript project, or from PIP in a Python project.

How to build project whose dependencies depend on another version of a project dependency

Imagine you have a project which requires two modules A and B. I will call the project module P. Let's say that P requires A v1.0.0 and B v1.1.0, and that A requires B v1.0.0. Furthermore, B did not adhere to semantic versioning rules, so the version change from v1.0.0 -> v1.1.0 introduced a breaking API change. So
P only builds with v1.1.0 and A only builds with v1.0.0.
Dependency graph:
P -> A (v1.0.0) -> B(v1.0.0)
P -> B (v1.1.0)
Is there any way to build this project with different versions? I heard about vendoring, but I'm not sure whether it would cause the dependency to use a different B module version.
And if vendoring could resolve the conflicting package versions, does the go tool recognize vendored modules when a dependency's git repository does not include a vendor folder? (Some people say you should not upload the vendor folder; in this case module A does not ship with one, but its developer ran go mod vendor locally.) Does the go get command respect the vendor folders of dependencies, or can it detect that a module used vendoring without an upstream vendor folder?
This seems like a conflict the module system cannot resolve. Since Go uses semantic versioning it will try to get B v1.1.0 to resolve both dependencies and then the build will break if A cannot work with B 1.1.0.
The best way to resolve it is to fix B by not breaking the API in a non-major version.
Lacking that, you could fork B into a separate module (with a different module name from the original B) and use an old version in A. Create BFORK=Bv1.0.0, and then you'll have:
P -> B (v1.1.0)
A -> BFORK vX.X.X
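To see why both requirements collapse onto a single copy of B, here is a rough sketch of the selection rule behind Go's minimal version selection: for each module path, the build uses the highest version that any module in the build requires. This is a simplification under stated assumptions. The real go command walks the go.mod files of the whole module graph and treats different major versions as different module paths. The names parse and select_versions and the flat list of requirements are made up for illustration.

```python
def parse(version):
    """Turn 'v1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def select_versions(requirements):
    """Pick, per module, the highest version any requirer asks for.

    `requirements` is an iterable of (requirer, module, version) triples.
    """
    selected = {}
    for _requirer, module, version in requirements:
        if module not in selected or parse(version) > parse(selected[module]):
            selected[module] = version
    return selected


requirements = [
    ("P", "A", "v1.0.0"),
    ("P", "B", "v1.1.0"),
    ("A", "B", "v1.0.0"),
]
print(select_versions(requirements))
```

Here B resolves to v1.1.0 for both P and A, which is exactly the situation where A breaks. Renaming the fork gives it a different module path, so it is selected independently of B.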

OSGi classloader issues

I am very new to OSGi.
I am developing a plugin A (an OSGi bundle) which depends on two libraries, say B-1.0 and C-1.0. Now suppose the library C-1.0 depends on library B-2.0 (note: a different version of library B). My plugin then has two different versions of library B on its classpath. How can I handle this situation?
From studying OSGi over the last 4-5 days, I understand that it creates a classloader for each plugin in the JIRA application, so that dependency version mismatches do not occur between plugins. But what should a developer do if a plugin itself needs two different versions of a library jar?
Can I create two different classloaders in a single osgi bundle through OSGi, say one for package X and another one for package Y ?
Please help me in any of the above scenarios or point me to the right direction.
Thanks in advance.
Remember that bundles do not depend on other bundles!!
Bundles import packages that are exported by other bundles. (unless you have used Require-Bundle, but you should not). So to rephrase the scenario from your example:
Bundle A imports package org.foo. Bundle C exports package org.foo, and OSGi wires the import to the export. So far so good.
Bundle C also imports package org.bar. Bundle B 1.0 exports package org.bar. Therefore OSGi wires these together and everything is still fine.
Now... bundle A also imports package org.wibble. Bundle B 2.0 exports package org.wibble. This is fine as well! Bundles B 1.0 and B 2.0 are simply different bundles as far as OSGi is concerned, they can both be installed at the same time.
So when you look at the dependencies the way they actually work, you find that it's perfectly possible for A to import code that comes from two different versions of B. However there is a limitation. Consider the following:
Bundle D imports packages org.foo and org.bar v1.0 (yes, packages are versioned).
Bundle E exports package org.foo, which satisfies the import in D. Bundle E also imports package org.bar v2.0.
Some other bundles (say F v1 and F v2) export the 2 versions of the org.bar packages.
Actually this scenario can still work. D can import version 1.0 of package org.bar from somewhere, and E can import version 2.0 of package org.bar from somewhere else, at the same time as D is importing package org.foo from E. I personally find this pretty incredible! But it does not work if org.foo "uses" org.bar, where "uses" means that some types in org.bar are visible in the API of org.foo. In this case, bundle D would be exposed to 2 different copies org.bar, which is not allowed, so OSGi will prevent bundle D from running by not allowing it to enter RESOLVED or ACTIVE states.
In an OSGi bundle or plugin you'll have a META-INF file (the manifest) that defines which packages you import. If you pass the extra argument version=2.0, the bundle will use the class from B-2.0; if you don't specify anything, it will resolve to whichever one the classloader loads first.
i.e.
import-package(C 1.0):
b.some.package; version="2.0" or b.some.package; version="[2.0,4.0)"
import-package(A 1.0):
b.some.package; version="1.0" or b.some.package; version="1.0"
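To make the version syntax above concrete: in OSGi, square brackets mean an inclusive bound, parentheses mean an exclusive bound, and a bare version such as "2.0" is read as "2.0 or higher". The sketch below checks a version against such a range. It is an illustration only, handles purely numeric versions (no qualifier segment), and the helper names parse_version and in_range are hypothetical rather than part of any OSGi API.

```python
def parse_version(text):
    """Parse '2.0' into (2, 0, 0); missing segments default to 0."""
    parts = [int(p) for p in text.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def in_range(version, range_text):
    """Return True if `version` satisfies an OSGi-style version range."""
    v = parse_version(version)
    range_text = range_text.strip()
    if range_text[0] in "[(":
        low_text, high_text = range_text[1:-1].split(",")
        low, high = parse_version(low_text), parse_version(high_text)
        # '[' and ']' include the bound, '(' and ')' exclude it.
        above = v >= low if range_text[0] == "[" else v > low
        below = v <= high if range_text[-1] == "]" else v < high
        return above and below
    # A bare version means "this version or anything newer".
    return v >= parse_version(range_text)


print(in_range("2.0", "[2.0,4.0)"))
print(in_range("4.0", "[2.0,4.0)"))
```

Under this reading, an import declared with version="[2.0,4.0)" accepts B 2.0 but not B 1.0 or B 4.0, while an import declared with a bare version="1.0" accepts both B 1.0 and B 2.0.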
Hope this helps
Anup
Since each OSGi bundle has its own classloader, there will be 4 bundles in the runtime, and also 4 classloaders (A, B-1.0, B-2.0, C-1.0).
You may have two copies of the same class included in B (one from 1.0 and another from 2.0). If you run this, you may simply run into a ClassCastException in the A code because two versions of B classes are not the same.
OSGi provides a "uses" clause to detect this type of situations early. For example, C may have a uses clauses like the following:
Export-Package: c.some.package;uses="b.some.package";version="1.0"
Import-Package: b.some.package;version="2.0"
In this case, you will have an earlier failure (while resolving A), known as a uses conflict, because C places a constraint for its consumer on an acceptable version of B.
Conceptually, the only way to fix this problem is to have consumers of B (A and C in this case) agree on the version of B.

Equinox Bundle import conflict

1) Bundle A reexports package com.X, which it gets from bundle C
2) Bundle B exports package com.X
3) Now bundle D has a dependency on both A and B.
From where will the bundle D get the package com.X from?
The first question is why you have 2 bundles defining the same package - this is called split packages and isn't recommended because you can have problems with shadowing.
With Import-Package the runtime will pick either bundle A or B to resolve the package dependency and you can't control this directly (you can do various tricks like the Eclipse guys do by setting mandatory properties on the exports).
With Require-Bundle you'll end up with a merged com.X package, so you'll see the superset of types, but I'm not sure what happens if you have overlapping types.
The simplest thing is to avoid split packages in the first place.

Resources