Efficient way to keep only longest paths in a DAG?

Is there an efficient way to remove all edges that are not part of a longest path between two nodes in a DAG?
For example, for the DAG (1->2, 2->3, 2->4, 1->3, 1->4) I want to remove the edges 1->3 and 1->4, since the paths 1->2->3 and 1->2->4 are longer.
Edit: I think the best way is to use a topological sort and traverse the array from right to left, aggregating for each node its descendants. Then for each edge a->b we can check whether b is reachable from a through the other direct descendants of a (and if so, delete the edge). But I couldn't find an implementation and I'm not sure it's correct. Is anyone aware of an implementation of something like this?

The algorithm you suggest is correct: if an edge is on any longest path, then that edge must be the longest, and only, path from its source vertex to its target vertex. You can therefore remove any edge that is redundant, i.e. any edge a->b for which some other path from a to b exists.
The name for what you are trying to do is "transitive reduction": https://en.wikipedia.org/wiki/Transitive_reduction
While your algorithm works, I don't see that the topological sort is doing you any good. The simple algorithm is to do a search from each vertex to find other ways of reaching its adjacent vertices.
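As a rough illustration of the simple search-per-edge approach described above, here is a minimal Python sketch. The function name `transitive_reduction` and the edge-list input format are assumptions made for this example, and the input is assumed to be a DAG. It runs one depth-first search per edge, so it costs roughly O(E * (V + E)) rather than anything cleverer.

```python
from collections import defaultdict


def transitive_reduction(edges):
    """Return the edges of a DAG that are not implied by a longer path.

    `edges` is a list of (source, target) pairs; the graph must be acyclic.
    """
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    def reachable_without_direct_edge(src, dst):
        # Depth-first search from src that starts from every direct
        # successor except dst, so the edge src -> dst itself is skipped.
        stack = [w for w in succ[src] if w != dst]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in succ[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    # Keep an edge only if there is no other way to get from u to v.
    return [(u, v) for u, v in edges
            if not reachable_without_direct_edge(u, v)]
```

On the example from the question, `transitive_reduction([(1, 2), (2, 3), (2, 4), (1, 3), (1, 4)])` keeps 1->2, 2->3 and 2->4 and drops 1->3 and 1->4.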

Related

How to get path from one node to all other nodes in a weighted tree in minimum time?

I just want to get the distance of source node from every node. But it is different than graph problems since it is a tree and path between every node is unique so I expect answer to be in more efficient time.
Is it possible to get answer in efficient time?
You're absolutely right that in a tree, the difficulty of finding a path between two nodes is a lot lower than in a general graph because once you find any path (at least, one without cycles) you know it's the shortest. So all you have to do is just find all paths starting at the given node and going to each other node. You can do this with either a depth-first or a breadth-first search in time O(n). To find the lengths, just keep track of the lengths of the edges you've seen along the paths you've traveled as you travel them.
This is not different from "graph problems": a tree is a special case of a graph. Dijkstra's algorithm is a standard graph search algorithm. Just modify it a little: keep all of the path lengths as you find them, and don't worry about the compare-update step, since each node is reached by exactly one path and you're going to keep all of the results. Continue until you run out of nodes to check, and there are your path lengths.
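To make the O(n) traversal concrete, here is a minimal sketch in Python. The name `tree_distances` and the `(u, v, weight)` edge format are assumptions for this example, and the input is assumed to be an undirected weighted tree. Because every node is reached by exactly one path, the first distance recorded for a node is already final.

```python
from collections import defaultdict, deque


def tree_distances(edges, source):
    """Distance from `source` to every node of an undirected weighted tree.

    `edges` is a list of (u, v, weight) triples.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt, w in adj[node]:
            if nxt not in dist:          # first visit is the only path
                dist[nxt] = dist[node] + w
                queue.append(nxt)
    return dist
```

Each node and each edge is handled a constant number of times, so the running time is linear in the size of the tree.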

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm, but before anything else join all vertices as indicated by the set of fixed edges. That is, if K = {{1,2}, {4,5}}, I apply Kruskal's algorithm, but instead of each node starting in its own individual set, nodes 1 and 2 start in the same set and nodes 4 and 5 start in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. The problem only asks for ONE such MST. I'm not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the edge weights of the edges you need to fix. Pick the lowest edge weight that appears in your graph, say min_w, subtract 1 from it, and assign this new weight, i.e. (min_w-1), to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component) would need to be handled separately. It might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything; you simply manipulate your graph a bit before running the regular algorithm.
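For comparison, the pre-joining variant from the question can be sketched in a few lines of Python. This is only an illustration under assumptions: the name `constrained_mst`, the `(weight, u, v)` tuple format, and the node numbering 0..num_nodes-1 are all made up for this example. The fixed edges are unioned first, and then ordinary Kruskal runs over the remaining edges.

```python
def constrained_mst(num_nodes, edges, fixed):
    """Lightest spanning tree that contains every edge in `fixed`.

    `edges` and `fixed` are lists of (weight, u, v) with nodes numbered
    0..num_nodes-1. Raises ValueError if the fixed edges form a cycle.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False          # already connected: edge would close a cycle
        parent[ra] = rb
        return True

    tree = []
    # Step 1: force the fixed edges in before looking at any weights.
    for w, u, v in fixed:
        if not union(u, v):
            raise ValueError("fixed edges contain a cycle")
        tree.append((w, u, v))
    # Step 2: regular Kruskal on everything else, cheapest first.
    for w, u, v in sorted(edges):
        if union(u, v):
            tree.append((w, u, v))
    return tree
```

If a fixed edge also appears in `edges`, it is skipped in step 2 because its endpoints are already in the same set, so passing the full edge list is harmless.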
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, Kruskal's algorithm can apparently also be used, since each of its steps picks an edge that connects two trees of the current forest in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a matroid, the greedy approach yields the desired result (namely a weight-minimal tree among those containing the starting set) regardless of the independent set you start with.

dijkstra/prim's algorithm...a little help?

I was wondering, for Dijkstra's and Prim's algorithms, what happens when they are choosing between more than one vertex to go to, and more than one of those vertices has the same weight.
For example
Example Image http://img688.imageshack.us/img688/7613/exampleu.jpg
It doesn't matter. Usually the tie will be broken in some arbitrary way like which node was added to the priority queue first.
The goal of Dijkstra is to find a shortest path. If you wanted to find all shortest paths, you would then have to worry about ties.
There could be multiple MSTs, and whichever arbitrary tiebreaking rules you use might give you a different one, but it'll still be a MST.
For example, you can imagine a triangle A-B-C where all the edge weights are one. There are three MSTs in this case, and they are all minimum.
The same goes for Dijkstra and the shortest path spanning tree -- there could be multiple shortest path spanning trees.
Correct me if I'm wrong, but your graph doesn't have any alternate paths for Dijkstra's algorithm to apply.
Dijkstra's algorithm expands (or "relaxes") all the edges from the touched but not yet expanded node (or "gray" node) with the smallest cost.
If two nodes have the same cost, well... it's just random :)
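One common way to make the tie-breaking explicit is to push an insertion counter into the priority queue alongside the cost, so equal-cost entries come out in the order they were added. The sketch below is only an illustration: the function name `dijkstra` and the adjacency-dict format are assumptions for this example, not something taken from the question.

```python
import heapq
from itertools import count


def dijkstra(adj, source):
    """Shortest distances and one shortest-path tree from `source`.

    `adj` maps each node to a list of (neighbor, weight) pairs with
    non-negative weights. Returns (dist, parent).
    """
    dist = {source: 0}
    parent = {source: None}
    order = count()                      # tie-breaker: insertion order
    heap = [(0, next(order), source)]
    done = set()
    while heap:
        d, _, node = heapq.heappop(heap)
        if node in done:
            continue                     # stale queue entry
        done.add(node)
        for nxt, w in adj.get(node, []):
            nd = d + w
            # Strict '<' means the first shortest path found is kept on ties.
            if nxt not in dist or nd < dist[nxt]:
                dist[nxt] = nd
                parent[nxt] = node
                heapq.heappush(heap, (nd, next(order), nxt))
    return dist, parent
```

On a diamond-shaped graph where A reaches D through either B or C at the same cost, the distances come out the same whichever tie-breaking rule is used; only the recorded parent of D depends on it.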

How do I test whether a tree has a perfect matching in linear time?

Give a linear-time algorithm to test whether a tree has a perfect matching,
that is, a set of edges that touches each vertex of the tree exactly once.
This is from Algorithms by S. Dasgupta, and I just can't seem to nail this problem down. I know I need to use a greedy approach in some manner, but I just can't figure this out. Help?
Pseudocode is fine; once I have the idea, I can implement in any language trivially.
The algorithm has to be linear in anything. O( V + E ) is fine.
I think I have the solution. Since we know the graph is a tree, we know of the existence of leaf nodes: nodes with one edge and no children. In order for such a node to be included in the perfect matching, that edge MUST exist in the final solution.
Ergo, we can find all edges connecting to a leaf node, add them to the solution, and remove the touched edges from the graph. If, at the end of this process, we are left with any nodes untouched, there exists no perfect matching.
In the case of a "graph",
The first step of the problem should be to find the connected components.
Since every edge in the final answer connects two vertices, it belongs to exactly one of the connected components.
Then, the perfect matching could be found for each connected component.
Linear in what? If you mean linear in the number of edges, keep the edges as an ordered incidence list, i.e., every edge (vi, vj) in some total order. Then you can compare the two lists in time O(n), where n is the number of edges.
The working algorithm would be something as follows:
While the tree still has a leaf:
    add the edge from the leaf to its parent to the solution
    delete the edge from the leaf to its parent
    delete all edges from the parent to any other vertices
    delete the leaf and the parent from the tree
If the tree is empty then the answer is yes. Otherwise, there's no perfect matching.
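The leaf-stripping idea can be sketched in Python as a single bottom-up pass, which is one way to keep it linear. This is a sketch under assumptions: the name `tree_perfect_matching` and the vertex numbering 0..n-1 are made up for this example, and the input is assumed to be a connected tree. Visiting vertices children-first means every still-unmatched vertex behaves like a leaf whose only remaining partner is its parent.

```python
def tree_perfect_matching(n, edges):
    """Return a perfect matching of a tree as a list of edges, or None.

    Vertices are numbered 0..n-1 and `edges` must describe a tree.
    """
    if n == 0:
        return []
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from vertex 0: records parents and a pre-order.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = []
    stack = [0]
    while stack:
        node = stack.pop()
        order.append(node)
        for nxt in adj[node]:
            if not seen[nxt]:
                seen[nxt] = True
                parent[nxt] = node
                stack.append(nxt)

    matched = [False] * n
    matching = []
    # Reversed pre-order visits every vertex after all of its children.
    for node in reversed(order):
        if matched[node]:
            continue
        p = parent[node]
        # An unmatched vertex can only pair with its parent at this point.
        if p == -1 or matched[p]:
            return None
        matched[node] = matched[p] = True
        matching.append((node, p))
    return matching
```

Building the adjacency lists, the traversal, and the matching pass each touch every vertex and edge a constant number of times, so the whole thing is O(V + E).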
I think that it's a simplified problem of finding a Hamiltonian path in a graph:
http://en.wikipedia.org/wiki/Hamiltonian_path
http://en.wikipedia.org/wiki/Hamiltonian_path_problem
I think that there are many solutions on the internet to this problem, but in general finding a Hamiltonian cycle is an NP-complete problem.

Find the shortest Path between two nodes (vertices)

I have a list of interconnected edges (E), how can I find the shortest path connecting from one vertex to another?
I am thinking about using lowest common ancestors, but the edges don't have a clearly defined root, so I don't think the solution works.
Shortest path is defined by the minimum number of vertices traversed.
Note: There could be multiple paths connecting two vertices, so obviously breadth first search won't work
Dijkstra's algorithm will do this for you.
I'm not sure if you need a path between every pair of nodes or between two particular nodes. Since someone has already given an answer addressing the former, I will address the latter.
If you don't have any prior knowledge about the graph (if you do, you can use a heuristic-based search such as A*) then you should use a breadth-first search.
The Floyd-Warshall algorithm would be a possible solution to your problem, but there are also other solutions to solve the all-pairs shortest path problem.
"Shortest path is defined by the minimum number of vertices traversed"
That is the same as the minimum number of edges plus one.
You can use a standard breadth-first search and it will work fine. If there is more than one path connecting two vertices, just save one of them; it will not affect anything, because the weight of every edge is 1.
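A minimal sketch of that breadth-first search in Python, assuming the edge list describes an undirected graph and that node labels are never `None` (the name `shortest_path` and the input format are assumptions for this example). The first parent recorded for each vertex is kept, which is exactly the "just save one of them" rule.

```python
from collections import defaultdict, deque


def shortest_path(edges, start, goal):
    """Fewest-edges path from `start` to `goal`, or None if unreachable.

    `edges` is a list of (u, v) pairs of an undirected graph.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    parent = {start: None}               # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent:        # keep only the first path found
                parent[nxt] = node
                queue.append(nxt)
    return None
```

Because BFS explores vertices in order of their edge distance from `start`, the first time `goal` is dequeued its parent chain is a path with the fewest edges.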
Additional 2 cents. Take a look at networkx. There are interesting algos already implemented for what you need, and you can choose the best suited.
