Find the shortest path from source to target in a weighted-undirected graph in O(V + E) time - algorithm

I've been tasked with designing an algorithm that finds the shortest path in an weighted-undirected graph with V nodes and E edges in O(V + E) time. The graph weights are all positive integers and no weight is greater than 15.
I believe I can use Dijkstra's algorithm to find the shortest path from a source node to a target node, but I don't think it satisfies the runtime constraints.
Knowing at the runtimes of BFS and DFS, I'm thinking that some sort of modification with those algorithms will get me to O(V + E), but I'm not sure what direction to head in or how I can leverage the <= 15 weight constraint on the edges.
Any help is appreciated.

You can use Dijkstra's algorithm, but you have to be a little careful with the priority queue.
Since all the weights are integers from 1 to 15, there can only be 16 different priorities in the queue at any one time. You can use this fact to implement all your priority queue operations in constant time. That will change the complexity of the algorithm from O(|V| + |E| log |V|) to O(|V| + |E|)
There are lots of ways to make that priority queue. Basically you partition the entries into lists of entries with the same priority, and then you only have to prioritize the 16 lists. It's reasonable to keep those 16 lists in a circular array.

The algorithm that You're looking for is called Dial's Algorithm as it works also in graphs that contain cycles. It's complexity is O(E + WV). In case, W>>V you can replace one bucket per W with buckets for weights 1, 2-3, 4-7, 8-15 etc.
It's an optimization on Dijkstra, which uses the fact, that given the range of weights, You're able to replace the Fibonacci Heap with buckets which will decrease the find_node operation from O(logn) to O(1).
The algorithm in detail is well described on GeeksForGeeks and Wikipedia among others.
You should also be interested in Directed Acyclic Graph Shortest Path in Cormen's Introduction to Algorithms on p. 655 or on GeeksForGeeks

Related

Time complexity of two-way BFS

Time complexity of traditional (one-way) BFS is O(V+E) when an adjacency list is used. What is it in case of two-way BFS?
Based on the answer here, I know:
BFS will traverse 1 + B + B^2 + ... + B^k vertices; whereas
Bi-directional BFS will traverse 2 + 2B^2 + ... + 2B^(k/2) vertices.
But I don't know how to derive the time complexity based on this.
I'm pretty sure that the time complexity is O(V+E) too.
Let's imagine the following graph: You have two binary trees with the same depth. Then for each of the binary trees you connect all the leaves to a vertice and you connect the roots. Starting at the two points connected to the leaves of the binary trees we will see that both algorithms have to look at every vertex. Bi-directional has to look at all edges too. But one-way can skip 2^(d - 1) - 1 edges (d is the depth). So I somehow doubt that it is even worth it to have a more complex algorithm and do it bi-directional. My thought is that you might find a solution faster if the points are "close" but for the extreme cases that take the longest to calculate it doesn't help.
Btw. the number of vertices that are traversed is an approximation at best because you discard all vertices that were already discovered. So the number is probably a lot smaller than B^k.

Shortest Path in a Directed Acyclic Graph with two types of costs

I am given a directed acyclic graph G = (V,E), which can be assumed to be topologically ordered (if needed). The edges in G have two types of costs - a nominal cost w(e) and a spiked cost p(e).
The goal is to find the shortest path from a node s to a node t which minimizes the following cost:
sum_e (w(e)) + max_e (p(e)), where the sum and maximum are taken over all edges in the path.
Standard dynamic programming methods show that this problem is solvable in O(E^2) time. Is there a more efficient way to solve it? Ideally, an O(E*polylog(E,V)) algorithm would be nice.
---- EDIT -----
This is the O(E^2) solution I found using dynamic programming.
First, order all costs p(e) in an ascending order. This takes O(Elog(E)) time.
Second, define the state space consisting of states (x,i) where x is a node in the graph and i is in 1,2,...,|E|. It represents "We are in node x, and the highest edge weight p(e) we have seen so far is the i-th largest".
Let V(x,i) be the length of the shortest path (in the classical sense) from s to x, where the highest p(e) encountered was the i-th largest. It's easy to compute V(x,i) given V(y,j) for any predecessor y of x and any j in 1,...,|E| (there are two cases to consider - the edge y->x is has the j-th largest weight, or it does not).
At every state (x,i), this computation finds the minimum of about deg(x) values. Thus the complexity is O(|E| * sum_(x\in V) deg(x)) = O(|E|^2), as each node is associated to |E| different states.
I don't see any way to get the complexity you want. Here's an algorithm that I think would be practical in real life.
First, reduce the graph to only vertices and edges between s and t, and do a topological sort so that you can easily find shortest paths in O(E) time.
Let W(m) be the minimum sum(w(e)) cost of paths max(p(e)) <= m, and let P(m) be the smallest max(p(e)) among those shortest paths. The problem solution corresponds to W(m)+P(m) for some cost m. Note that we can find W(m) and P(m) simultaneously in O(E) time by finding a shortest W-cost path, using P-cost to break ties.
The relevant values for m are the p(e) costs that actually occur, so make a sorted list of those. Then use a Kruskal's algorithm variant to find the smallest m that connects s to t, and calculate P(infinity) to find the largest relevant m.
Now we have an interval [l,h] of m-values that might be the best. The best possible result in the interval is W(h)+P(l). Make a priority queue of intervals ordered by best possible result, and repeatedly remove the interval with the best possible result, and:
stop if the best possible result = an actual result W(l)+P(l) or W(h)+P(h)
stop if there are no p(e) costs between l and P(h)
stop if the difference between the best possible result and an actual result is within some acceptable tolerance; or
stop if you have exceeded some computation budget
otherwise, pick a p(e) cost t between l and P(h), find a shortest path to get W(t) and P(t), split the interval into [l,t] and [t,h], and put them back in the priority queue and repeat.
The worst case complexity to get an exact result is still O(E2), but there are many economies and a lot of flexibility in how to stop.
This is only a 2-approximation, not an approximation scheme, but perhaps it inspires someone to come up with a better answer.
Using binary search, find the minimum spiked cost θ* such that, letting C(θ) be the minimum nominal cost of an s-t path using edges with spiked cost ≤ θ, we have C(θ*) = θ*. Every solution has either nominal or spiked cost at least as large as θ*, hence θ* leads to a 2-approximate solution.
Each test in the binary search involves running Dijkstra on the subset with spiked cost ≤ θ, hence this algorithm takes time O(|E| log2 |E|), well, if you want to be technical about it and use Fibonacci heaps, O((|E| + |V| log |V|) log |E|).

Building MST from a graph with "very few" edges in linear time

I was at an interview and interviewer asked me a question:
We have a graph G(V,E), we can find MST using prim's or kruskal algorithm. But these algorithms do not take into the account that there are "very few" edges in G. How can we use this information to improve time complexity of finding MST? Can we find MST in linear time?
The only thing I could remember was that Kruskal's algorithm is faster in a sparse graphs while Prim's algorithm is faster in really dense graphs. But I couldn't answer him how to use prior knowledge about the number of edges to make MST in linear time.
Any insight or solution would be appreciated.
Kruskal's algorithm is pretty much linear after sorting the edges. If you use a union find structure like disjoint set forest The complexity for processing a single edge will be in the order of lg*(n) where n is the number of vertices and this function grows so slowly that for this case can be considered constant. However the problem is that to sort the edges you still need a O(m * log(m)). Where m is the number of edges.
Prim's algorithm will not be able to take advantage of the fact that the edges are very few.
One approach that you can use is something like a 'reversed' MST approach where you start off with all edges and remove the longest edge until the graph becomes disconnected. You keep doing that until only n - 1 edges are left. Still note that this will be better than Kruskal only if the number of edges to remove k are few enough so that k * n < m * log(m).
Lets say |E| = |V| +c ,c being a small constant. You can run DFS on the graph and every time you detect a circle, remove the largest edge. you must do that c +1 times. O(c+1 * |E|) = O(E) linear time in theory.

Linear Time Algorithm For MST

I was wondering if anyone can point to a linear time algorithm to find the MST of a graph when there is a small number of weights (I.e edges can only have 2 different weights).
I could not find anything on google other than Prim's, Kruskal's, Boruvka's none of which seem to have any properties that would reduce the run time in this special case. I'm guessing to make it linear time it would have to be some sort of modification of BFS (which finds the MST when the weights are uniform).
The cause of the lg V factor in Prim's O(V lg V) runtime is the heap that is used to find the next candidate edge. I'm pretty sure that it is possible to design a priority queue that does insertion and removal in constant time when there's a limited number of possible weights, which would reduce Prim to O(V).
For the priority queue, I believe it would suffice with an array whose indices covers all the possible weights, where each element points to a linked list that contains the elements with that weight. You'd still have a factor of d (the number of distinct weights) for figuring out which list to get the next element out of (the "lowest" non-empty one), but if d is a constant, then you'll be fine.
Elaborating on Aasmund Eldhuset's answer: if the weights in the MST are restricted to numbers in the range 0, 1, 2, 3, ..., U-1, then you can adapt many of the existing algorithms to run in (near) linear time if U is a constant.
For example, let's take Kruskal's algorithm. The first step in Kruskal's algorithm is to sort the edges into ascending order of weight. You can do this in time O(m + U) if you use counting sort or time O(m lg U) if you use radix sort. If U is a constant, then both of these sorting steps take linear time. Consequently, the runtime for running Kruskal's algorithm in this case would be O(m α(m)), where α(m) is the inverse Ackermann function, because the limiting factor is going to be the runtime of maintaining the disjoint-set forest.
Alternatively, look at Prim's algorithm. You need to maintain a priority queue of the candidate distances to the nodes. If you know that all the edges are in the range [0, U), then you can do this in a super naive way by just storing an array of U buckets, one per possible priority. Inserting into the priority queue then just requires you to dump an item into the right bucket. You can do a decrease-key by evicting an element and moving it to a lower bucket. You can then do a find-min by scanning the buckets. This causes the algorithm runtime to be O(m + nU), which is linear if U is a constant.
Barder and Burkhardt in 2019 proposed this approach to find MSTs in linear time given the non-MST edges are given in ascending order of their weights.

Directed graph (topological sort)

Say there exists a directed graph, G(V, E) (V represents vertices and E represents edges), where each edge (x, y) is associated with a weight (x, y) where the weight is an integer between 1 and 10.
Assume s and tare some vertices in V.
I would like to compute the shortest path from s to t in time O(m + n), where m is the number of vertices and n is the number of edges.
Would I be on the right track in implementing topological sort to accomplish this? Or is there another technique that I am overlooking?
The algorithm you need to use for finding the minimal path from a given vertex to another in a weighted graph is Dijkstra's algorithm. Unfortunately its complexity is O(n*log(n) + m) which may be more than you try to accomplish.
However in your case the edges are special - their weights have only 10 valid values. Thus you can implement a special data structure(kind of a heap, but takes advantage of the small dataset for the wights) to have all operations constant.
One possible way to do that is to have 10 lists - one for each weight. Adding an edge in the data structure is simply append to a list. Finding the minimum element is iteration over the 10 lists to find the first one that is non-empty. This still is constant as no more than 10 iterations will be performed. Removing the minimum element is also pretty straight-forward - simple removal from a list.
Using Dijkstra's algorithm with some data structure of the same asymptotic complexity will be what you need.

Resources