All-pair shortest path for minimum spanning tree - algorithm

I am trying to solve an algorithm challenge about graphs, which I have managed to break down to the following: Given an undirected spanning tree, find the 2 leaves such that the cost between them is minimal.
I know the Floyd-Warshall algorithm can find all-pairs shortest paths with time complexity O(N^3) and space complexity O(N^2). The input size for this problem is N = 10^5, so O(N^3) time and O(N^2) space are too much.
Is there a way to optimize space and time complexity for this problem?

Elaborating on what @Codor said: in an MST there is exactly one path between any pair of nodes, so that path is also the shortest path.
To calculate the shortest path between all pairs, you can follow this algorithm.
Find the centre of the MST to use as its root by repeatedly removing leaf nodes until only one or two nodes are left.
Complexity: finding the centre node of a tree
this way can be achieved in O(V), i.e. linear time.
If two nodes are left, choose one of them as the root. Calculate the distance of every other node from the root using breadth-first search (BFS).
Complexity: O(V+E) ~ O(V) in the case of a tree.
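The leaf-removal step can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the function name tree_centres and the adjacency format (a dict mapping each node to a list of its neighbours, ignoring weights) are assumptions made for the example.

```python
def tree_centres(adj):
    """Return the one or two centre nodes of a tree.

    adj is assumed to map each node to a list of neighbouring nodes
    (edge weights are ignored here).
    """
    remaining = len(adj)
    if remaining <= 2:
        return list(adj)
    degree = {u: len(adj[u]) for u in adj}
    leaves = [u for u in adj if degree[u] == 1]
    while remaining > 2:
        # Peel off the current layer of leaves.
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for v in adj[leaf]:
                degree[v] -= 1
                if degree[v] == 1:
                    next_leaves.append(v)
        leaves = next_leaves
    return leaves
```

Each node is removed once and each edge is looked at a constant number of times, which is where the linear bound comes from.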
Now you can find the distance between any pair of nodes a and b. First find their lowest common ancestor, lca(a,b).
Then there are two cases:
If lca(a,b) = r (the root of the tree), then
dis(a,b) = dis[a] + dis[b]
If lca(a,b) = c (which is not the root node), then
dis(a,b) = dis[a] + dis[b] - 2 * dis[c]
where dis(x,y) is the distance between nodes x and y,
and dis[x] is the distance of node x from the root node. Since dis[r] = 0, the second formula also covers the first case.
If the ancestor lookup is implemented using union-find with union by rank:
Complexity: O(h) per pair (a,b), where h is the height of the tree.
h is about X/2, where X is the diameter of the tree, because the tree is rooted at its centre.
So the total complexity depends on the number of leaf node pairs.
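Here is a rough sketch of the distance computation described above. Note that it finds the common ancestor by simply walking parent pointers upward (O(h) per pair) rather than using union-find, and that the names root_distances, lca and tree_distance, as well as the adjacency format (node mapped to a list of (neighbour, weight) pairs), are assumptions made for this example.

```python
from collections import deque

def root_distances(adj, root):
    """BFS over a weighted tree; returns (dist, parent, depth) from root."""
    dist = {root: 0}
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, w in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return dist, parent, depth

def lca(a, b, parent, depth):
    """Lowest common ancestor found by walking parent pointers upward."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a

def tree_distance(a, b, dist, parent, depth):
    """dis(a,b) = dis[a] + dis[b] - 2 * dis[lca(a,b)]."""
    c = lca(a, b, parent, depth)
    return dist[a] + dist[b] - 2 * dist[c]
```

The formula holds for any choice of root; rooting at the centre only keeps the upward walk short.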

Related

Weighted Directed Graph best method for shortest path

For a question I was doing I'm confused about why the answer would be a BFS and not Dijkstra's algorithm.
The question was: There is a weighted digraph G=(V,E) with n nodes and m edges. Each edge has a weight of 1 or 2. The question was to figure out which algorithm to use to find the shortest path in G from a given vertex u to a given vertex v. The options were:
a) O(n+m) time using a modified BFS
b) O(n+m) time using a modified DFS
c) O(mlogn) time using Dijkstra's Algorithm
d) O(n^3) time using modified Floyd-Warshall algorithm
The answer is a) O(n+m) time using a modified BFS.
I know that when comparing BFS to DFS, BFS is better for finding shortest paths. I also know Dijkstra's algorithm is similar to a BFS and, if I'm not mistaken, Dijkstra's algorithm is better for weighted graphs like in this case. I'm assuming BFS is better because it says modified BFS, but what exactly would "modified" mean, or is there another reason BFS would be better?
Since every edge has a weight of either 1 or 2, for every edge of weight 2 from node a to node b you can create a new node c with an edge from a to c of weight 1 and an edge from c to b of weight 1. This turns the graph into one with only edges of weight 1, which can be BFS'd normally to find the shortest path from u to v. Since you only add O(m) new nodes and O(m) new edges, this keeps the BFS's time complexity at O(n+m).
Another possibility is to, at each layer of BFS, store another list of nodes that are attained by edges with a weight of 2 from the current layer, and consider them at the same time as nodes attained two layers later. This approach is a bit more finicky though.
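A minimal sketch of the first approach (splitting each weight-2 edge with a dummy node) might look like this. The function name shortest_path_weights_1_2 and the input format (nodes numbered 0..n-1, edges given as (a, b, w) triples) are assumptions made for the example.

```python
from collections import deque

def shortest_path_weights_1_2(n, edges, u, v):
    """Shortest distance from u to v in a digraph whose edge weights are 1 or 2.

    Returns None if v is unreachable from u.
    """
    adj = [[] for _ in range(n)]
    for a, b, w in edges:
        if w == 1:
            adj[a].append(b)
        elif w == 2:
            c = len(adj)        # index of a fresh dummy node
            adj.append([b])     # c -> b, weight 1
            adj[a].append(c)    # a -> c, weight 1
        else:
            raise ValueError("edge weights must be 1 or 2")

    # Plain BFS on the now unit-weight graph.
    dist = [None] * len(adj)
    dist[u] = 0
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if dist[y] is None:
                dist[y] = dist[x] + 1
                queue.append(y)
    return None
```

At most m dummy nodes and m extra edges are created, so the BFS still runs in O(n+m).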

Algorithm to traverse k nodes of an undirected, weighted graph (and return to the origin) at the lowest cost

I am looking for an algorithm to do the following:
In an undirected, weighted graph with cycles
-find a path that visits exactly k nodes
-minimize the total cost(weight)
-each node can be visited only once
-return to the origin
edit: The start (and end) vertex is set in advance.
If I wanted to visit all nodes, the Traveling Salesman algorithm (and all its variations) would work. But in my case, the "salesman" needs to head home after visiting k nodes.
Both approximate and exact algorithms are fine in this case.
Since your problem includes the TSP for k=n as a special case, in general it will be NP-complete. For small k you can adapt the dynamic programming solution of Bellman (1962) to solve it in time O(2^k n^3).
Let T(u,S) be the length of the shortest route starting at vertex u with vertices in S visited already. Then you want the smallest of T(u0,{u0}) over all starting vertices u0. T satisfies the recurrence
T(u,S) = min { d(u,v)+T(v,S+{v}) | v in V\S } if |S|<k
T(u,S) = d(u,u0) if |S|=k
for distances d(u,v). The DP table has 2^k n entries, each entry takes O(n) time to compute, and you have to compute it n times, once for each starting vertex.
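The recurrence can be sketched with memoisation as below, for a single fixed start vertex u0. The function name shortest_k_tour, the distance-matrix input d and the bitmask encoding of S are assumptions made for this example. The number of (u, S) states grows with the number of vertex subsets of size up to k, so this is only practical for small inputs.

```python
from functools import lru_cache

def shortest_k_tour(d, u0, k):
    """Cheapest tour that starts at u0, visits exactly k distinct
    vertices (u0 included) and returns to u0.

    d is assumed to be an n x n matrix of distances; use float('inf')
    where there is no edge. Assumes 2 <= k <= n.
    """
    n = len(d)

    @lru_cache(maxsize=None)
    def T(u, visited):
        # visited is a bitmask of the vertices already on the route (u included).
        if bin(visited).count("1") == k:
            return d[u][u0]          # |S| = k: head home
        best = float("inf")
        for v in range(n):
            if not visited & (1 << v):
                best = min(best, d[u][v] + T(v, visited | (1 << v)))
        return best

    return T(u0, 1 << u0)
```

To try every possible origin instead, call the function once per starting vertex and take the minimum.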

How to find longest increasing subsequence among all simple paths of an unweighted general graph?

Let G = (V, E) be an unweighted general graph in which every vertex v has a weight w(v).
An increasing subsequence of a simple path p in G is a sequence of vertices of p in which the weights of all vertices along this sequence increase. The simple paths can be closed paths.
A longest increasing subsequence (LIS) of a simple path p is an increasing subsequence of p that has the maximum number of vertices.
The question is: how do we find a longest increasing subsequence among all simple paths of G?
Note that the graph is undirected, therefore it is not a directed acyclic graph (DAG).
Here's a very fast algorithm for solving this problem. The longest increasing subsequence in the graph is a subsequence of a path in the graph, and each path must belong purely to a single connected component. So if we can solve this problem on connected components, we can solve it for the overall graph by finding the best solution across all connected components.
Next, think about the case where you're solving this problem for a connected graph G. In that case, the longest increasing subsequence you could find would be formed by sorting the nodes by their weight, then traversing from the lowest-weight node to the second, then to the third, then to the fourth, etc. If there are any ties or duplicates, you can just skip them. In other words, you can solve this problem by
1. Sorting all the nodes by weight,
2. Discarding all but one node of each weight, and
3. Forming an LIS by visiting each node in sequence.
This leads to a very fast algorithm for the overall problem. In time O(m + n), find all connected components. For each connected component, use the preceding algorithm in time O(Sort(n)), where Sort(n) is the time required to sort n elements (which could be Θ(n log n) if you use heapsort, Θ(n + U) for bucket sort, Θ(n lg U) for radix sort, etc.). Then, return the longest sequence you find.
Overall, the runtime is O(m + n + Sort(n)), which beats my previous approach and should be a lot easier to code up.
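A rough sketch of this component-by-component approach is below. It takes the answer's claim at face value: it only collects the distinct weights inside each connected component and does not check that a single simple path can actually visit those nodes in that order. The function name lis_by_components and the input format (nodes 0..n-1, an undirected edge list, a weight list w) are assumptions made for the example.

```python
def lis_by_components(n, edges, w):
    """Return a list of nodes with strictly increasing weights, all taken
    from one connected component, that is as long as possible."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    seen = [False] * n
    best = []
    for s in range(n):
        if seen[s]:
            continue
        # Collect the connected component containing s (iterative DFS).
        seen[s] = True
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        # Keep one node per distinct weight, then order by weight.
        by_weight = {}
        for u in comp:
            by_weight.setdefault(w[u], u)
        candidate = [by_weight[x] for x in sorted(by_weight)]
        if len(candidate) > len(best):
            best = candidate
    return best
```

Here Python's built-in sort plays the role of Sort(n), so this version runs in O(m + n log n).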
I had originally posted this answer, which I'll leave up because I think it's interesting:
Imagine that you pick a simple path out of the graph G and look at the longest increasing subsequence of that path. Although the path walks all over the graph and might have lots of intermediary nodes, the longest increasing subsequence of that path really only cares about
the first node on the path that's also a part of the LIS, and
from that point, the next-largest value in the path.
As a result, we can think about forming an LIS like this. Start at any node in the graph. Now, travel to any node in the graph that (1) has a higher value than the current node and (2) is reachable from the current node, then repeat this process as many times as desired. The goal is to do so in a way that gives the longest possible sequence of increasing values.
We can model this process as finding a longest path in a DAG. Each node in the DAG represents a node in the original graph G, and there's an edge from a node u to a node v if
there's a path from u to v in G, and
w(u) < w(v).
This is a DAG because of that second condition, even though the original graph isn't a DAG.
So we can solve this overall problem in a two-step process. First, build the DAG described above. To do so:
1. Find the connected components of the original graph G and label each node with its connected component number. Time: O(m + n).
2. For each node u in G, construct a corresponding node u' in a new DAG D. Time: O(n).
3. For each node u in G, and for each node v in G that's in the same connected component as u, if w(u) < w(v), add an edge from u' to v'. Time: Θ(n^2) in the worst case, Θ(n) in the best case.
Second, find the longest path in D. This path corresponds to the longest increasing subsequence of any simple path in G. Time: O(m + n).
Overall runtime: Θ(n^2) in the worst case, Θ(m + n) in the best case.

Time complexity of Hill Climbing algorithm for finding local min/max in a graph

What is the time complexity (order of growth) of an algorithm that finds a local minimum in a graph with n nodes, where each node has at most d neighbors?
Detail: We have a graph with n nodes. Each node in the graph has an integer value. Each node has maximum of d neighbors. We are looking for a node that has the lowest value among its neighbors. The graph is represented by an adjacency list. The algorithm starts by selecting random nodes and, within these nodes, it selects the node with minimum value (let's say node u). Starting from node u, the algorithm finds a neighbor v, where value(v) < value(u). Then, it continues with v and repeats the above step. The algorithm terminates when the node does not have any neighbor with a lower value. What is the time complexity of this algorithm and why?
Time complexity is O(n + d), because you can have n nodes connected like this, where each number shows the value of the node:
16-15-14-13-12-11-10-9-8-7-6-5-4-3-2-1
And you can randomly select these nodes, marked by "!":
!-!-!-13-12-11-10-9-8-7-6-5-4-3-2-1
So you select the node with value 14 and, following the described algorithm, you will check all the remaining nodes and edges until you reach the node with value 1.
The worst-case complexity for the task "find one element" is O(N), where N is the length of your input, and the length of your input here is N = G(n,d) = n + d.
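For reference, the descent described in the question can be sketched like this. The function name hill_climb_min is hypothetical, and so is the choice to pass the randomly sampled start nodes in as a list (so the caller does the sampling); adj maps each node to its neighbour list and value maps each node to its integer value.

```python
def hill_climb_min(adj, value, starts):
    """Start from the lowest-valued node in starts and keep moving to any
    neighbour with a strictly lower value. Returns (local_min_node, moves)."""
    u = min(starts, key=lambda x: value[x])
    moves = 0
    while True:
        # Look for any neighbour that improves on the current node.
        lower = next((v for v in adj[u] if value[v] < value[u]), None)
        if lower is None:
            return u, moves
        u = lower
        moves += 1

# The 16-node chain from the example: node i holds the value 16 - i.
n = 16
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
values = {i: n - i for i in range(n)}
print(hill_climb_min(chain, values, [0, 1, 2]))
```

On this chain the walk starts at the node with value 14 and has to step through every remaining node before it stops at the node with value 1.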

How to find longest path in graph?

We are given an Adjacency list of the form
U -> (U,V,C) -> (U,V,C) ...
U2 -> ...
U3 -> ...
.
.
etc
(U,V,C) means there's an edge from U to V with cost C.
The given Adjacency list is for a single connected tree with N nodes thus containing N-1 edges.
A set of nodes F = {F1, F2, F3, ..., Fk} is given.
Now the question is what is the best way to find the longest path amongst the nodes in F?
Is it possible to do it in O(N)?
Is DFS from each node in F the only option?
I understood your question as asking to find a pair of nodes from the set F so that the unique path between those two nodes is as long as it can be. The path is unique because your graph is a tree.
The problem can be solved trivially by doing DFS from every node in F as you mention, for an O(n k) solution where n is the size of the graph and k is the size of the set F.
However, you can potentially solve it faster with a divide and conquer approach. Pick any node R from the graph, and use a single DFS to tabulate the distances Dist(R, a) to every other node a and, at the same time, partition the nodes into subtrees S1,...,Sm, where m is the number of edges from R; that is, these are the m trees hanging off the root R. Now, for any f and g that belong to different subtrees, the path between them has length Dist(R, f) + Dist(R, g), so it is possible to search for the longest such path in O(k^2) time. In addition, you then have to recurse into the subproblems S1,...,Sm to cover the case where the longest path is inside one of those trees. The overall complexity can be lower than O(n k) but the math is left as an exercise to the reader.
If I understood your question correctly, you are trying to find the longest-cost path in a spanning tree.
You can find the path in just 2 complete traversals, i.e. O(2N) ~ O(N).
Follow the steps below.
1. Pick any node in the spanning tree.
2. Run any traversal (DFS or BFS) from that node and find the longest-cost path from it. This will not necessarily be your longest-cost path, as you started from an arbitrarily picked node.
3. Run BFS or DFS one more time from the last node of the longest-cost path found at step 2. This time, the longest-cost path you get will be the longest-cost path in the spanning tree.
You do not have to run DFS from each node.
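The two traversals can be sketched as follows, assuming non-negative edge costs. The function names farthest and longest_path and the adjacency format (node mapped to a list of (neighbour, cost) pairs) are assumptions made for the example. Note that this finds the longest path in the whole tree, as this answer describes, not only among the nodes in F.

```python
def farthest(adj, start):
    """Iterative DFS from start; returns (farthest_node, its_distance)."""
    dist = {start: 0}
    stack = [start]
    while stack:
        u = stack.pop()
        for v, c in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + c
                stack.append(v)
    end = max(dist, key=dist.get)
    return end, dist[end]

def longest_path(adj):
    """Return (end_a, end_b, cost) of a longest-cost path in the tree."""
    start = next(iter(adj))          # step 1: pick any node
    a, _ = farthest(adj, start)      # step 2: farthest node from it
    b, cost = farthest(adj, a)       # step 3: farthest node from that one
    return a, b, cost
```

Each call to farthest touches every node and edge once, so the whole thing is two linear passes.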
