I have a graph that contains exactly one cycle, but I need to sort it using a "topological sort" (of course, an actual topological sort does not handle cycles). How can this be done?
For example:
A -> B
B -> C
C -> D
D -> A
Possible solutions are:
A -> B -> C -> D
B -> C -> D -> A
C -> D -> A -> B
D -> A -> B -> C
I see there is this algorithm as suggested here but it's too overcomplicated for my use case.
There are a few approaches and implementations for topological sort. The most intuitive one I have found is to:
Identify nodes with no incoming edges (Can be done through creating an adjacency list and a dictionary containing incoming edge counts for vertices).
Add these to the sorted list
Remove each such node from the graph and decrement the incoming edge count of every node it points to.
Repeat the process until there are no nodes left with an incoming edge count of 0.
If the sorted list is then smaller than the number of vertices, you know that the graph has a cycle and can terminate the algorithm.
There are many code samples with various implementations online but this algorithm should help guide the implementation for a basic topological sort.
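The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name `kahn_topological_sort` and the `(order, has_cycle)` return shape are assumptions made for the example.

```python
from collections import deque

def kahn_topological_sort(nodes, edges):
    """Return (order, has_cycle) for a directed graph given as (u, v) pairs."""
    adjacency = {n: [] for n in nodes}   # adjacency list
    in_degree = {n: 0 for n in nodes}    # incoming edge counts
    for u, v in edges:
        adjacency[u].append(v)
        in_degree[v] += 1

    # Start with every node that has no incoming edges.
    ready = deque(n for n in nodes if in_degree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # "Remove" the node by decrementing the counts of its successors.
        for successor in adjacency[node]:
            in_degree[successor] -= 1
            if in_degree[successor] == 0:
                ready.append(successor)

    # Nodes on a cycle never reach count 0, so they are missing from the order.
    has_cycle = len(order) < len(nodes)
    return order, has_cycle

dag_result = kahn_topological_sort(
    ["a", "b", "c", "d"], [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
)
cycle_result = kahn_topological_sort(
    ["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
)
```

For the four-node cycle from the question, no node ever has a count of 0, so the sketch returns an empty order and reports the cycle.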
I am trying to convert a DAG to a binary tree. Consider the following graph
I want to have the following output for above tree.
Since A,B,C,E forms a diamond, to convert it into a tree, I need to move B and C into a line.
I have tried following:
Topological Sort : Output is A -> D -> B -> C -> E -> F .
Topological Sort with order : A -> [B,C,D] -> E -> F
Topological Path gives us a straight path. But, I want to preserve the sequence if possible i.e. A -> D. However, if there is a diamond, I want a node to only have one parent and sequence these parents as well.
Is there a way to generate a tree from a DAG for above cases?
Algorithm in pseudo-code
Run a topological sort on the graph
For every node B, in reverse order of the topological sort:
    If B has more than one parent:
        Order its parents A1, A2, ..., An in the order of the topological sort
        For every i in 1..n-1:
            Add an arc from Ai to A(i+1)
            Remove the arc from Ai to B
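A possible Python rendering of this pseudo-code is shown below. The graph in the question was given as an image, so the example edges are an assumption (a diamond A, B, C, E plus A -> D and E -> F), and the function names are hypothetical.

```python
from collections import deque

def topological_order(nodes, edges):
    """Kahn-style topological sort; assumes the input is a DAG."""
    successors = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1
    ready = deque(n for n in nodes if in_degree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
    return order

def linearize_parents(nodes, edges):
    """Rewire a DAG so that every node ends up with at most one parent."""
    order = topological_order(nodes, edges)
    position = {n: i for i, n in enumerate(order)}
    parents = {n: set() for n in nodes}
    for u, v in edges:
        parents[v].add(u)

    for b in reversed(order):
        if len(parents[b]) > 1:
            # Parents A1..An sorted by their position in the topological order.
            ordered = sorted(parents[b], key=position.__getitem__)
            for a_i, a_next in zip(ordered, ordered[1:]):
                parents[a_next].add(a_i)   # add arc Ai -> A(i+1)
                parents[b].discard(a_i)    # remove arc Ai -> B
    return {(p, child) for child, ps in parents.items() for p in ps}

# Assumed example graph: diamond A-B-C-E, plus A -> D and E -> F.
example_nodes = ["A", "B", "C", "D", "E", "F"]
example_edges = [("A", "B"), ("A", "C"), ("A", "D"),
                 ("B", "E"), ("C", "E"), ("E", "F")]
tree_edges = linearize_parents(example_nodes, example_edges)
```

On this assumed input the diamond is flattened into the chain A -> B -> C -> E, while A -> D and E -> F are kept.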
Proof of correctness
The algorithm always terminates, since it is a loop of fixed length; its time complexity is O(N^2) where N is the number of nodes
Immediately after the step on a given node B, B has no more than one parent
If the step on a given node C has already been executed, then executing the step on a node B that comes before C in the topological order only adds arcs to nodes that come before C in the topological order; hence once a node's step has been executed, it never gains new parents.
This proves that the algorithm terminates and that every node has at most one parent after executing the algorithm. Since we only remove parents from nodes which had more than one parent, I think it also satisfies your question.
Does reversing the result of a topological sort on a graph where all edges are in the wrong direction result in a valid topological order, as if the edges were reversed before the sort?
a -> b
a -> c
b -> d
c -> d
could give a toposort of a b c d. Reversing this list gives d c b a. Reversing all edges in the graph before toposorting could also give d c b a. Is this true in the general case? I'm guessing no, but I can't find an example that fails.
It obviously is, if you look at it from the right angle.
After a topo-sort, if we store all nodes in a list, all arrows out of any node point in the same direction along the list. If we reverse the list, all arrows now point in the opposite direction. Since all arrows still point the same way, the reversed list is a valid topological sort of the graph with its edges reversed.
And the other approach, first flipping all edges and then performing a toposort obviously yields a valid toposort, or the toposort algorithm is broken.
The exact total order produced by the two approaches might differ, but they're both valid.
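The argument can be checked mechanically on the example from the question. The helper `is_topological_order` below is a hypothetical name introduced for this sketch; it simply tests that every edge points forward in the list.

```python
def is_topological_order(order, edges):
    """True if every node appears once and every edge (u, v) has u before v."""
    position = {node: i for i, node in enumerate(order)}
    if len(position) != len(order):
        return False
    return all(position[u] < position[v] for u, v in edges)

edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
reversed_edges = [(v, u) for u, v in edges]
order = ["a", "b", "c", "d"]
reversed_order = order[::-1]

forward_ok = is_topological_order(order, edges)
reversed_ok = is_topological_order(reversed_order, reversed_edges)
mismatch_ok = is_topological_order(reversed_order, edges)
```

Reversing the list flips the sign of every position difference, which is exactly what reversing every edge requires.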
I'm trying to learn about the Euler Tour algorithm and why it's popular for tree traversal. However, I'm failing to see the difference between a Euler Tour and a Pre-order traversal of a tree.
Let's say you have the tree:
A
/ \
B E
/ \ \
C D F
If you performed the euler tour algorithm, it would be:
A -> B -> C -> B -> D -> B -> A -> E -> F -> E -> A
But what's the purpose of this? It just seems like the exact same version of recursive pre-order:
A -> B -> C -> D -> E -> F
Obviously, in Euler Tour, you have each node value at least twice in the path, but that's only due to the recursive nature of the algorithm when you program it. If you wanted, you could do the same calculations you were doing with Euler Tour... with Pre-order, right?
If somebody could help explain Euler Tour and why it's used over other traversals, that'd be very much appreciated. Thanks.
With the Euler tour you can derive additional information from the result.
You can for example see if a node is a leaf. This would be the case if the predecessor and successor of a node are the same.
Additionally you would be able to calculate the depth of a node by adding +1 to a counter for every forward leg and subtracting 1 for every backward leg.
This information is often useful when dealing with trees in your algorithms.
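As a rough sketch of these two observations, the code below builds the Euler tour of the example tree and derives the leaves and depths from the tour alone. The `children` dictionary layout and the function names are assumptions made for the illustration.

```python
def euler_tour(children, root):
    """List every node each time the walk enters or returns to it."""
    tour = [root]
    for child in children.get(root, []):
        tour.extend(euler_tour(children, child))
        tour.append(root)  # coming back up from the child
    return tour

def leaves_from_tour(tour):
    """A node is a leaf if its predecessor and successor in the tour match."""
    return {tour[i] for i in range(1, len(tour) - 1) if tour[i - 1] == tour[i + 1]}

def depths_from_tour(tour):
    """+1 for every forward leg (new node), -1 for every backward leg."""
    depth = 0
    depths = {tour[0]: 0}
    for node in tour[1:]:
        if node in depths:
            depth -= 1          # backward leg to an already seen node
        else:
            depth += 1          # forward leg to a new node
            depths[node] = depth
    return depths

children = {"A": ["B", "E"], "B": ["C", "D"], "E": ["F"]}
tour = euler_tour(children, "A")
leaves = leaves_from_tour(tour)
depths = depths_from_tour(tour)
```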
Note that the postorder information is also present in your Euler tour. If we list each node only the last time it appears, we get
C -> D -> B -> F -> E -> A
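A small sketch of this extraction, using the tour from the question as a hard-coded list: keeping each node at its first appearance gives the pre-order, and keeping it at its last appearance gives the post-order.

```python
tour = ["A", "B", "C", "B", "D", "B", "A", "E", "F", "E", "A"]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
preorder = list(dict.fromkeys(tour))

# Scanning the tour backwards keeps the last occurrences; reverse again
# to restore the original left-to-right direction.
postorder = list(dict.fromkeys(reversed(tour)))[::-1]
```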
Unfortunately we cannot obtain the inorder (symmetric order) from the tour. In your example this is clear from looking at node E and its child F, there is no way we can actually see whether the child is left or right.
The Euler traversal method can be extended a little to include the three recursive orders of a tree (pre- in- post-). Follow the outline of the tree and visit each node three times, before entering the left child, between left and right child, and after the right child. If a child is missing, make consecutive visits.
We can extend your example in the following way, adding numbers to your earlier tour:
A1 -> B1 -> C123 -> B2 -> D123 -> B3 -> A2 -> E12 -> F123 -> E3 -> A3
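The three-visit rule can be sketched as below. The tree is stored as a hypothetical `node -> (left, right)` mapping with `None` for a missing child, and E is assumed to have F as its right child, as in the drawing. Filtering the visits by number then yields each of the three recursive orders.

```python
def three_visit_tour(tree, root):
    """Visit each node three times: before left, between children, after right."""
    visits = []

    def walk(node):
        left, right = tree.get(node, (None, None))
        visits.append((node, 1))      # before entering the left child
        if left is not None:
            walk(left)
        visits.append((node, 2))      # between left and right child
        if right is not None:
            walk(right)
        visits.append((node, 3))      # after the right child

    walk(root)
    return visits

def compact(visits):
    """Merge consecutive visits to the same node, e.g. C1, C2, C3 -> C123."""
    merged = []
    for node, number in visits:
        if merged and merged[-1][0] == node:
            merged[-1] = (node, merged[-1][1] + str(number))
        else:
            merged.append((node, str(number)))
    return " -> ".join(node + numbers for node, numbers in merged)

tree = {"A": ("B", "E"), "B": ("C", "D"), "E": (None, "F")}
visits = three_visit_tour(tree, "A")
tour_text = compact(visits)
preorder = [n for n, k in visits if k == 1]
inorder = [n for n, k in visits if k == 2]
postorder = [n for n, k in visits if k == 3]
```

Unlike the plain tour, the second visits here recover the inorder sequence, because a missing left child shows up as two consecutive visits before the right child is entered.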
I've got a routing problem where I need to retrieve the best n solutions between two points. I am using Dijkstra for the optimal solution and Yen Top K algorithm on top of that to get the n best solutions.
However, there is a twist to it: you can have multiple parallel edges between two vertices. Let's imagine a bus network:
Line A: a -> b -> c -> d -> e
Line B: b -> c -> d -> e -> f
Line C: a -> b -> c -> g -> h
When you build your graph, how do you handle these parallel connections?
I am thinking of building the graph like:
Line A: a->b,a->c,a->d,a->e,b->c,b->d,b->e,c->d,c->e,d->e
Line B: b->c,b->d,b->e,b->f,c->d,c->e,c->f,d->e,d->f,e->f
Line C: a->b,a->c,a->g,a->h,b->c,b->g,b->h,c->g,c->h,g->h
With that I have direct edges for when I don't have to change bus.
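A minimal sketch of how such a graph could be generated: every ordered pair of stops on a line becomes one edge tagged with the line name, so edges from different lines between the same two stops stay distinguishable. The function name `direct_edges` and the `(src, dst, line)` tuple layout are assumptions, and weights are left out.

```python
from itertools import combinations

def direct_edges(lines):
    """One edge per ordered stop pair on each line, tagged with the line name."""
    edges = []
    for name, stops in lines.items():
        # combinations() preserves the order of stops along the line.
        for src, dst in combinations(stops, 2):
            edges.append((src, dst, name))
    return edges

lines = {
    "A": ["a", "b", "c", "d", "e"],
    "B": ["b", "c", "d", "e", "f"],
    "C": ["a", "b", "c", "g", "h"],
}
edges = direct_edges(lines)
line_a_edges = [e for e in edges if e[2] == "A"]
parallel_b_c = [e for e in edges if e[:2] == ("b", "c")]
```

In this sketch the pair b->c ends up with three parallel edges, one per line.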
For each Vertex I go through I add a connection penalty weight.
So if I want to go from a to e, I would probably get Line A: taking Line C a->b and then Line B b->e might come out longer because of the connection time, even if the travel time of Line C a->b plus Line B b->e is shorter than the route on Line A.
However I still need to handle parallel connections. So I guess I need to sort the parallel edges by weight.
Currently this is based on static timing information but at some point it should take actual schedule information into account. And depending on that your weights between two vertices could change. E.g. by the time you would get to point b the fastest connection via Line C wouldn't be the fastest anymore as you would have just missed Line C etc.
Are there any resources anywhere that explain how you would handle these more complex situations?
One approach could be to reduce the problem back to a simple graph (no parallel edges), by "splitting nodes"
That means, if you have a node u, with edges (v,u)_1, (v,u)_2, ..., (v,u)_k, you can split u to: u, u_1, u_2,...,u_k, with edges: (u_1,u), (u_2,u), ..., (u_k,u), (v,u_1), (v,u_2), ...., (v,u_k), and weights:
w(u_i,u) = 0 for all i and w(v,u_i) = w((v,u)_i) for all i
Now you can run any algorithm designed for simple graphs with ease; the number of vertices grows only linearly with the number of parallel edges.
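The splitting described above might look like the sketch below, followed by a plain Dijkstra run on the resulting simple graph. The intermediate node names (such as `u_1<-v`) are a hypothetical naming scheme chosen for the example and assume they do not clash with real node names.

```python
import heapq
from collections import defaultdict

def split_parallel_edges(edges):
    """Replace k parallel edges (v, u)_i by v -> u_i (weight w_i) and u_i -> u (weight 0)."""
    groups = defaultdict(list)
    for v, u, w in edges:
        groups[(v, u)].append(w)
    simple = []
    for (v, u), weights in groups.items():
        if len(weights) == 1:
            simple.append((v, u, weights[0]))   # nothing to split
            continue
        for i, w in enumerate(weights, start=1):
            u_i = f"{u}_{i}<-{v}"               # hypothetical intermediate node name
            simple.append((v, u_i, w))
            simple.append((u_i, u, 0))
    return simple

def dijkstra(edges, source):
    """Shortest distances from source on a graph without parallel edges."""
    graph = defaultdict(list)
    for a, b, w in edges:
        graph[a].append((b, w))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                            # stale heap entry
        for nxt, w in graph[node]:
            candidate = d + w
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return dist

multi_edges = [("v", "u", 4), ("v", "u", 2), ("u", "t", 1)]
simple_edges = split_parallel_edges(multi_edges)
distances = dijkstra(simple_edges, "v")
```

Which intermediate node a shortest path passes through tells you which of the original parallel edges was used.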
I have a weighted, directed graph with multiple edges starting from and ending at the same nodes.
Ex.
Multiple edges going from node A to node B.
What would be the best search algorithm to get all of the paths to a certain node and the associated costs of these paths?
Since you want all the paths, I'd use a simple breadth-first search. However, I also suggest that you collapse all the parallel edges into a single edge that has a list of weights.
Once you get all the different paths (that is, all the paths in which the nodes visited are different), for each path you can calculate all the possible alternative parallel routes.
So if you've got the following edges:
A -> C (5)
A -> C (3)
A -> B (7)
C -> B (1)
C -> B (4)
The first step transforms it into:
A -> C (5,3)
A -> B (7)
C -> B (1,4)
The breadth-first search on this graph will yield the following paths between A and B:
A -> B (7)
A -> C -> B (5,3) + (1,4)
So for the second path, you'll get the following costs:
5 + 1
5 + 4
3 + 1
3 + 4
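The two steps can be sketched as follows: collapse the parallel edges into weight lists, enumerate the node-distinct paths breadth-first, then expand each path into all weight combinations. The sketch assumes the parallel pair (1, 4) runs from C to B, which is what the listed path A -> C -> B requires; the function names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import product

def collapse(edges):
    """Merge parallel edges into one edge carrying a list of weights."""
    graph = defaultdict(lambda: defaultdict(list))
    for src, dst, weight in edges:
        graph[src][dst].append(weight)
    return graph

def simple_paths_bfs(graph, start, target):
    """Breadth-first enumeration of paths that never revisit a node."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            yield path
            continue
        for nxt in graph.get(path[-1], {}):
            if nxt not in path:
                queue.append(path + [nxt])

def path_costs(graph, path):
    """All total costs obtainable by picking one parallel edge per hop."""
    weight_lists = [graph[a][b] for a, b in zip(path, path[1:])]
    return [sum(choice) for choice in product(*weight_lists)]

edges = [("A", "C", 5), ("A", "C", 3), ("A", "B", 7), ("C", "B", 1), ("C", "B", 4)]
graph = collapse(edges)
paths = list(simple_paths_bfs(graph, "A", "B"))
costs = {tuple(p): path_costs(graph, p) for p in paths}
```

Restricting the search to node-distinct paths keeps the enumeration finite even if the graph contains cycles.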
This isn't going to be any faster in itself than just doing a BFS on the original graph, but a simpler graph is easier to handle.
If you have to output the cost of each path, there is nothing better than a plain DFS (or BFS). Since the problem is output sensitive and you might just have O(E + V) paths, you cannot accomplish anything better in terms of big-O notation.
As already stated, you can do brute force/backtracking using depth-first search, etc.
Don't expect to find any shortcuts for this. If your graph is dense enough there can be lots of paths, and even just counting how many of them there are is #P-complete (i.e. intractable).
(If your problem is different, for example repeated nodes are allowed or you only want to find the shortest path, then there could be a tractable solution.)
Do you allow cycles, that is, a directed path from a to b and another path from b back to a (b -> ... -> a)? In that case you will end up with an unlimited number of paths.