If D depends on B and C, which each depend on A, I want ABCD (or ACBD) as the result; that is, I want to generate a flat sequence from the graph such that every node appears before any of its descendants. For example, we may need to install the dependencies of X before installing X.
What is a good algorithm for this?
In questions like this, terminology is crucial in order to find the correct links.
The dependencies you describe form a partially ordered set (poset). Simply put, that is a set with an order operator, for which the comparison of some pairs might be undefined. In your example B and C are incomparable (B does not depend on C, and C does not depend on B).
An extension of the order operator is one that respects the original partial order and adds some extra comparisons to previously incomparable elements. In the extreme: a linear extension leads to a total ordering. For every partial ordering such an extension exists.
Algorithms to obtain a linear extension from a poset are called topological sorting. Wikipedia provides the following very simple algorithm:
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)
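As a rough illustration, the pseudocode above could be written in Python along these lines. This is only a sketch: the function name `topological_sort` and the representation of edges as `(before, after)` pairs are my own assumptions, and instead of deleting edges it tracks each node's remaining indegree, which has the same effect.

```python
from collections import deque

def topological_sort(nodes, edges):
    """Return the nodes in dependency order; edges are (before, after) pairs."""
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    # S in the pseudocode: every node with no incoming edges.
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1              # stands in for "remove edge e"
            if indegree[m] == 0:
                ready.append(m)

    # Nodes left unprocessed means some edges were never removed: a cycle.
    if len(order) != len(nodes):
        raise ValueError("graph has at least one cycle")
    return order

# D depends on B and C, which each depend on A.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(topological_sort(["A", "B", "C", "D"], edges))  # ['A', 'B', 'C', 'D']
```

Using a queue gives ABCD for the example; taking nodes from S in a different order could just as validly give ACBD.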
In a group of friends, each friend except one spies on exactly one other friend. Every friend owns some valuables, whose worth is a positive integer. Find a group of friends with the largest total of valuables such that no friend in the group spies on any other friend in the group.
Example: We have the following graph for one of the possible test cases. The value above each vertex is the positive number of valuables owned by them.
The best possible group is [A,F,D,J,H] = 92 value
Looks like we can achieve the solution by ignoring the traversal through the graph and calculating the combinations of all possible groups.
Unfortunately not able to think of a dynamic programming approach or how to get started.
Given the constraints on your graph, you will always have:
one disjoint connected sub-graph that is a tree consisting of zero-or-more simple paths all branching off the friend who is not spying on anyone; and
zero-or-more disjoint connected sub-graphs where each sub-graph contains exactly one cycle and zero-or-more simple paths branching off the cycle.
To get the best sum of valuables:
Within each branching path, there can be at most two non-selected vertices between each selected vertex. (If there were three non-selected vertices then you would get a better result if the middle vertex of those three were selected; i.e. (A)->B->C->D->(E) where () is a selected vertex will always give a better result as (A)->B->(C)->D->(E)).
Within each cycle, unless blocked by a selected adjacent vertex in a branching path, there can be at most two non-selected vertices between each selected vertex. (for similar reasons to branching paths).
If you have the connected sub-graph (similar to the bottom of your example but E spies I, rather than spying nothing):
          -----------
          V         |
C -> D -> E -> I -> J <- H
Then you can either start with the cycle and work outwards to the branching paths or from the branching paths inwards. If you consider the cycle-outwards then:
if E is selected then D, I and J cannot be and given that the minimum separation of 1 is achieved, then C and H must be selected.
if I is selected then E and J cannot be selected and H and either C or D can be selected.
if J is selected then E, I and H cannot be selected and either C or D can be selected.
if none of the vertices on the cycle are selected then H and either C or D can be selected (which, in this case, is always a strictly worse option than selecting I as well, since I does not block selection from the branching paths).
This gives possible solutions for that connected sub-graph of:
C, E, H
C, I, H
D, I, H
C, J
D, J
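One way to sanity-check that list is to brute-force the example sub-graph. The sketch below assumes the edges C-D, D-E, E-I, I-J, J-E and H-J (treated as undirected, since spying in either direction rules a pair out) and enumerates every group that cannot be extended any further. Because all values are positive, the best group must be one of these. The helper names are made up for illustration, and this is only practical for very small sub-graphs.

```python
from itertools import combinations

# Undirected view of the example sub-graph (assumed from the diagram above).
EDGES = [("C", "D"), ("D", "E"), ("E", "I"), ("I", "J"), ("J", "E"), ("H", "J")]
VERTICES = ["C", "D", "E", "H", "I", "J"]

def is_independent(group):
    """True if no two members of the group are joined by a spying edge."""
    return not any(a in group and b in group for a, b in EDGES)

def maximal_independent_sets():
    independent = [
        frozenset(combo)
        for size in range(1, len(VERTICES) + 1)
        for combo in combinations(VERTICES, size)
        if is_independent(combo)
    ]
    # Keep only the groups that are not a proper subset of another valid group.
    return [g for g in independent if not any(g < other for other in independent)]

for group in sorted(maximal_independent_sets(), key=sorted):
    print(sorted(group))
```

This yields the same five candidates as listed above; picking the one with the largest total then only needs the actual values.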
You can perform a similar operation for each disjoint sub-graph. For the one disjoint sub-graph containing the friend who is not spying on anyone, that friend can either be selected or not selected, and each branching simple path can then be evaluated independently, choosing to skip either one or two vertices between selected vertices and finding the optimum total.
Find the solution with the highest value for the sub-graph and move on to the next sub-graph.
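For a single branching simple path taken on its own, the "skip one or two vertices" choice can be evaluated with a small dynamic programme. The following is a minimal sketch under that assumption only: it ignores how the path attaches to the cycle or to the non-spying friend, and the function name and sample values are hypothetical.

```python
def best_path_sum(values):
    """Largest total along a simple path with no two adjacent vertices selected."""
    take, skip = 0, 0   # best totals with the previous vertex selected / not selected
    for v in values:
        # Selecting this vertex requires the previous one to be unselected.
        take, skip = skip + v, max(take, skip)
    return max(take, skip)

print(best_path_sum([3, 2, 7, 10]))  # 13: select the first and last vertices
```

Handling the junction with the cycle (or with the root of the tree) would mean running this once for each allowed state of the vertex the path attaches to.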
A tree T will be called structured if for every d, all nodes at distance d from the root contain the same type of data.
Our goal is to construct a new tree T' which is the "intersection" of some subtrees of T.
That is, T' has the same structure (same data types at the same order) as T, and every node v in T' is the result of an agreement/intersection of corresponding nodes in T.
An example will show this best:
In the tree T below, each leaf represents a colored permutation of 2 elements.
Our goal is to find all possible colored permutations of 3 elements that are both allowed by the restrictions in T and have the same color. For example, 231 is an allowed permutation but doesn't have a mutual color.
Well then, how should I construct T'? What data structure should I use?
My intuition says a BFS-style algorithm will get the job done nicely, yet I suspect a better logic could be applied.
For example, here is one idea I had and I am not sure how to proceed with: maybe I can store T in some fancy database, compute all allowed permutations together, and then call the corresponding colored leaves?
The goal is to sort a list X of n unknown variables {x0, x1, x2, ... x(n-1)} using a list C of m comparison results (booleans). Each comparison is between two of the n variables, e.g. x2 < x5, and the pair indices for each of the comparisons are fixed and given ahead of time. Also given: All pairs in C are unique (even when flipped, e.g. the pair x0, x1 means there is no pair x1, x0), and never compare a variable against itself. That means C has at most n*(n-1)/2 entries.
So the question is: can I prove that my list C of m comparisons is sufficient to sort the list X? Obviously it would be if C had the largest possible length (that is, contained all possible comparisons). But what about shorter lists?
Then, once it has been proven that C contains enough information to sort, how do I actually go about performing the sort?
Let's imagine that you have the collection of objects to be sorted and form a graph from them with one node per object. You're then given a list of pairs indicating how the comparisons go. You can think of these as edges in the graph: if you know that object x compares less than object y, then you can draw an edge from x to y.
Assuming that the results of the comparisons are consistent - that is, you don't have any cycles - you should have a directed acyclic graph.
Think about what happens if you topologically sort this DAG. What you'll end up with is one possible ordering of the values that's consistent with all of the constraints. The reason for this is that in a topological ordering, you won't place an element x before an element y if there is any transitive series of edges leading from y to x, and there's a transitive series of edges leading from y to x if there's a chain of comparisons that transitively indicates that y precedes x.
You can actually make a stronger claim: the set of all topological orderings of the DAG is exactly the set of all possible orderings that satisfy all the constraints. We've already argued that every topological ordering satisfies all the constraints, so all we need to do now is argue that every sequence satisfying all the constraints is a valid topological ordering. The argument here is essentially that if you obey all the constraints, you never place any element in the sequence before something that transitively compares less than it, so you never place any element in the sequence before something that has a path to it.
This then gives us a nice way to solve the problem: take the graph formed this way and see if it has exactly one topological ordering. If so, then that ordering is the unique sorted order. If not, then there are two or more orderings.
So how best to go about this? Well, one of the standard algorithms for doing a topological sort is to annotate each node with its indegree, then repeatedly pull off a node of indegree zero and adjust the indegrees of its successors. The DAG has exactly one topological ordering if in the course of performing this algorithm, at every stage there is exactly one node of indegree zero, since in that case the topological ordering is forced.
With the right setup and data structures, you can implement this to run in time O(n + m), where n is the number of nodes and m is the number of constraints. I'll leave those details as a proverbial exercise to the reader. :-)
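For what it's worth, one possible shape for that exercise is sketched below. It assumes the comparisons arrive as `(i, j)` index pairs meaning x_i < x_j (flip any pair whose comparison came out false before passing it in); the function name is made up for illustration.

```python
def unique_sorted_order(n, comparisons):
    """Return the forced order of indices 0..n-1, or None if it is not unique.

    comparisons: iterable of (i, j) pairs, each meaning x_i < x_j.
    """
    successors = [[] for _ in range(n)]
    indegree = [0] * n
    for i, j in comparisons:
        successors[i].append(j)
        indegree[j] += 1

    ready = [v for v in range(n) if indegree[v] == 0]
    order = []
    while ready:
        if len(ready) > 1:
            return None               # two candidates for the next slot
        v = ready.pop()
        order.append(v)
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)

    # A short result means a cycle, i.e. inconsistent comparisons.
    return order if len(order) == n else None

print(unique_sorted_order(3, [(2, 0), (0, 1), (2, 1)]))  # [2, 0, 1]
print(unique_sorted_order(3, [(0, 1), (0, 2)]))          # None
```

Every node and every edge is touched a constant number of times, which is where the O(n + m) bound comes from.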
Your problem can be reduced to the well-known Topological sort.
To prove that "C contains enough information to sort" is to prove the uniqueness of topological sort:
If a topological sort has the property that all pairs of consecutive vertices in the sorted order are connected by edges, then these edges form a directed Hamiltonian path in the DAG. If a Hamiltonian path exists, the topological sort order is unique; no other order respects the edges of the path. Conversely, if a topological sort does not form a Hamiltonian path, the DAG will have two or more valid topological orderings, for in this case it is always possible to form a second valid ordering by swapping two consecutive vertices that are not connected by an edge to each other. Therefore, it is possible to test in linear time whether a unique ordering exists, and whether a Hamiltonian path exists, despite the NP-hardness of the Hamiltonian path problem for more general directed graphs (Vernet & Markenzon 1997).
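The consecutive-edge test from that passage is short enough to sketch directly. The snippet assumes you already have some valid topological order of the DAG (from any topological sort) and the edges as `(before, after)` pairs; the function name is hypothetical.

```python
def order_is_unique(order, edges):
    """True if every pair of consecutive vertices in `order` is joined by an edge.

    `order` must already be a valid topological order of the DAG; in that case
    the consecutive edges form a Hamiltonian path and the order is the only one.
    """
    edge_set = set(edges)
    return all((a, b) in edge_set for a, b in zip(order, order[1:]))

# x0 < x1 < x2 is forced here, because both consecutive pairs are compared.
print(order_is_unique([0, 1, 2], [(0, 1), (1, 2)]))          # True
# 1 and 2 are never compared, so swapping them gives a second valid order.
print(order_is_unique([0, 1, 2], [(0, 1), (0, 2)]))          # False
```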
I have a set of element pairs. Each of these pairs means: in the final sequence, the first element precedes the second element.
The set of pairs contains enough pairs to reconstruct a unique sequence.
E.g.:
If my set of pairs is {(A, B), (A, C), (C, B)}
= A precedes B, A precedes C and C precedes B.
my final sequence is ACB.
Now, I need an algorithm to reconstruct sequences from this kind of pair sets.
Efficiency is critical. Any smart tip is welcome!
Create a directed graph from those pairs, then perform topological sort.
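If Python is an option, a minimal sketch of this can lean on the standard library's `graphlib` module (available from Python 3.9). The wrapper name `sequence_from_pairs` is my own; only `TopologicalSorter`, `add` and `static_order` come from the library.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def sequence_from_pairs(pairs):
    """Build the sequence from (first, second) pairs, first preceding second."""
    sorter = TopologicalSorter()
    for first, second in pairs:
        sorter.add(second, first)   # declare `first` as a predecessor of `second`
    # static_order() raises graphlib.CycleError if the pairs contradict each other.
    return list(sorter.static_order())

print("".join(sequence_from_pairs({("A", "B"), ("A", "C"), ("C", "B")})))  # ACB
```

The sort runs in time linear in the number of elements plus the number of pairs, so it should hold up when efficiency matters.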
This is the problem of topological sorting of a directed graph.
I'm looking for a simple algorithm to 'serialize' a directed graph. In particular I've got a set of files with interdependencies on their execution order, and I want to find the correct order at compile time. I know it must be a fairly common thing to do - compilers do it all the time - but my google-fu has been weak today. What's the 'go-to' algorithm for this?
Topological Sort (From Wikipedia):
In graph theory, a topological sort or topological ordering of a directed acyclic graph (DAG) is a linear ordering of its nodes in which each node comes before all nodes to which it has outbound edges. Every DAG has one or more topological sorts.
Pseudo code:
L ← Empty list where we put the sorted elements
Q ← Set of all nodes with no incoming edges
while Q is non-empty do
    remove a node n from Q
    insert n into L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into Q
if graph has edges then
    output error message (graph has a cycle)
else
    output message (proposed topologically sorted order: L)
I would expect that tools which need this simply walk the tree in a depth-first manner and, when they hit a leaf, just process it (e.g. compile) and remove it from the graph (or mark it as processed, and treat nodes with all leaves processed as leaves).
As long as it's a DAG, this simple stack-based walk should be trivial.
I've come up with a fairly naive recursive algorithm (pseudocode):
Map<Object, List<Object>> source; // map of each object to its dependency list
List<Object> dest; // destination list

function resolve(a):
    if (dest.contains(a)) return;
    foreach (b in source[a]):
        resolve(b);
    dest.add(a);

foreach (a in source):
    resolve(a);
The biggest problem with this is that it has no ability to detect cyclic dependencies: it can go into infinite recursion (i.e. stack overflow ;-p). The only way around that which I can see would be to flip the recursive algorithm into an iterative one with a manual stack, and manually check the stack for repeated elements.
Anyone have something better?
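One common way to get cycle detection without giving up the recursion is to mark each node as "in progress" while its dependencies are being resolved; meeting an in-progress node again means a cycle. The Python sketch below is a variation on the pseudocode above, not a drop-in translation: the `state` dictionary and the `ValueError` are my own additions, and `source` is assumed to map each object to the list of objects it depends on.

```python
def resolve_all(source):
    """Return every object after all of its dependencies, or raise on a cycle."""
    dest = []
    state = {}                      # absent = unvisited, 1 = in progress, 2 = done

    def resolve(a):
        if state.get(a) == 2:
            return
        if state.get(a) == 1:       # `a` is still being resolved further up the stack
            raise ValueError(f"cyclic dependency involving {a!r}")
        state[a] = 1
        for b in source.get(a, ()):
            resolve(b)
        state[a] = 2
        dest.append(a)

    for a in source:
        resolve(a)
    return dest

print(resolve_all({"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}))
# ['A', 'B', 'C', 'D']
```

The dictionary lookup also replaces the linear `dest.contains(a)` scan. It is still recursive, though, so a very deep dependency chain could hit Python's recursion limit.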
If the graph contains cycles, how can there exist allowed execution orders for your files?
It seems to me that if the graph contains cycles, then you have no solution, and this is reported correctly by the above algorithm.