shortest path between 2 vertices in undirected weighted graph - algorithm

I am trying to find the shortest path between 2 vertices in an undirected weighted graph. It is also known that the weights are integers less than log(log|V|), where |V| is the number of vertices. It is easy to solve using the Bellman-Ford or Dijkstra algorithms, but is there any algorithm which can do it faster?
So far, I have been thinking of using BFS and splitting each edge of weight w into w edges of weight 1, but that is not really a good idea if |V| is a large number. No, it is not my homework, I am just wondering.

One way to think of this question is to improve the running time of Dijkstra's algorithm for finding the shortest path between two vertices in an undirected weighted graph. In this case, you can use a binary heap as the data structure. A heap is a complete binary tree with the heap property: every parent node is smaller (in a min-heap) or greater (in a max-heap) than its child nodes. Here you can use a min-heap to store the cost to each node from the starting node.
More information about heap can be found here: https://courses.csail.mit.edu/6.006/fall10/handouts/recitation10-8.pdf
With a heap, the running time of Dijkstra's algorithm can be reduced from O(V^2) to O(E log V): extracting the minimum-distance vertex from the heap takes O(log V) (reading the minimum is O(1), but restoring the heap property after removing it takes O(log V)), and updating the distances to vertices takes O(E log V) in total (each update fixes the heap in O(log V), and in the worst case every one of the E edges triggers an update).
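As a rough sketch of the heap-based version (the graph encoding and names below are my own; Python's heapq has no decrease-key operation, so stale heap entries are simply skipped):

```python
import heapq

def dijkstra(graph, source, target):
    """Shortest path in a weighted graph given as {u: [(v, w), ...]}.

    Uses heapq as the min-heap; instead of decrease-key, outdated
    heap entries are detected and skipped when popped.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)              # extract-min: O(log V)
        if d > dist.get(u, float("inf")):
            continue                            # stale entry, skip
        if u == target:
            return d
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))   # O(log V) per update
    return None

# Example: undirected graph stored with both directions listed.
g = {
    "a": [("b", 2), ("c", 5)],
    "b": [("a", 2), ("c", 1)],
    "c": [("a", 5), ("b", 1)],
}
print(dijkstra(g, "a", "c"))  # 3 (a -> b -> c)
```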
Hope this helps.

Related

To implement Dijkstra’s shortest path algorithm on unweighted graphs so that it runs in linear time, which data structure should be used?

To implement Dijkstra’s shortest path algorithm on unweighted graphs so that it runs in linear time, the data structure to be used is:
Queue
Stack
Heap
B-Tree
I found below answers:
A queue, because we can find single-source shortest paths in an unweighted graph using the breadth-first search (BFS) algorithm, which uses a queue and runs in O(m + n) time (i.e. linear in the number of vertices and edges).
A min-heap, because when all edges have the same weight, deleting a node from the min-heap requires no adjustment, so each deletion takes O(1); for n-1 nodes that is O(n) overall.
Can someone explain which one is the correct answer?
Please note that if the graph is unweighted, no Dijkstra is needed: a simple BFS works perfectly in O(E + V), i.e. linear time.
A simple implementation of the algorithm needs a queue data structure.
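The BFS-with-a-queue approach might be sketched like this (the adjacency-list encoding and names are my own assumptions):

```python
from collections import deque

def bfs_shortest_path(adj, source, target):
    """Unweighted single-source shortest path; O(V + E) with a queue."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        if u == target:
            return dist[u]
        for v in adj.get(u, []):
            if v not in dist:            # each vertex is enqueued at most once
                dist[v] = dist[u] + 1
                q.append(v)
    return None

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(bfs_shortest_path(adj, 0, 3))  # 2
```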

Minimum spanning tree to minimize cost

Can someone please help me solve this problem?
We have a set E of roads, a set H of highways, and a set V of different cities. We also have a cost x(i) associated with each road i and a cost y(i) associated with each highway i. We want to build the roads to connect the cities, with the conditions that there is always a path between any pair of cities and that we can build at most one highway, which may be cheaper than a road.
Set E and set H are different, and their respective costs are unrelated.
Design an algorithm to build the roads (with at most one highway) that minimize the total cost.
So, what we have is a fully connected graph of edges.
Solution steps:
1. Find the minimum spanning tree for the roads alone and take its cost as the current minimum.
2. Add one highway to the road graph and compute the minimum spanning tree cost again.
3. Compare the cost from step 2 with the current minimum and keep the smaller one.
4. Remove that highway.
5. Go back to step 2 and repeat for each remaining highway.
Total cost: O(m · MST(n)), i.e. one MST computation per highway.
Using Prim's or Kruskal's to build an MST: O(E log V).
The problem is the constraint of at most 1 highway.
1. Naive method to solve this:
For each possible highway, build the MST from scratch.
Time complexity of this solution: O(H E log V)
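A minimal sketch of this naive method, assuming an edge encoding of (weight, u, v) tuples (all names here are illustrative):

```python
def kruskal_cost(n, edges):
    """Total MST weight via Kruskal's algorithm with union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    cost = 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # edge joins two components
            parent[ru] = rv
            cost += w
    return cost

def naive_best(n, roads, highways):
    """Rebuild the MST once per candidate highway: O(H * E log V)."""
    best = kruskal_cost(n, roads)
    for hw in highways:
        best = min(best, kruskal_cost(n, roads + [hw]))
    return best

roads = [(4, 0, 1), (3, 1, 2), (5, 2, 3), (6, 0, 3)]
highways = [(1, 0, 3)]
print(naive_best(4, roads, highways))  # 8
```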
2. Alternative
Idea: if you have built an MST, you can refine it into a better spanning tree when an additional edge, not considered before, becomes available.
Suppose the new edge connects (u,v). If you use this edge, you can remove the most expensive edge in the path between vertices u and v in the MST. You can find the path naively in O(V) time.
Using this idea, the time complexity is the cost to build the initial MST O(E log V) and the time to try to refine the MST with each of the H highways. The total algorithmic complexity is therefore O(E log V + H V), which is better than the first solution.
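The refinement idea might be sketched as follows (names and the (weight, u, v) edge encoding are my own assumptions; the u-v path is searched naively in O(V)):

```python
from collections import defaultdict

def kruskal(n, edges):
    """Kruskal's MST with union-find; returns (mst_edges, total_cost)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst, cost = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))
            cost += w
    return mst, cost

def max_edge_on_path(mst, u, v):
    """Max edge weight on the unique u-v path in the tree: O(V) DFS."""
    adj = defaultdict(list)
    for w, a, b in mst:
        adj[a].append((b, w))
        adj[b].append((a, w))
    stack = [(u, -1, 0)]            # (vertex, parent, max weight so far)
    while stack:
        x, p, best = stack.pop()
        if x == v:
            return best
        for y, w in adj[x]:
            if y != p:
                stack.append((y, x, max(best, w)))

def best_with_one_highway(n, roads, highways):
    mst, cost = kruskal(n, roads)
    best = cost
    for w, u, v in highways:        # try swapping each highway in
        saved = max_edge_on_path(mst, u, v) - w
        if saved > 0:
            best = min(best, cost - saved)
    return best

roads = [(4, 0, 1), (3, 1, 2), (5, 2, 3), (6, 0, 3)]
highways = [(1, 0, 3)]
print(best_with_one_highway(4, roads, highways))  # 8
```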
3. Optimized refinement
Instead of doing a naive path search as in the second method, we can find a faster way. One related problem is LCA (lowest common ancestor). A good way of solving LCA is with jump pointers. First you root the tree; then each vertex gets jump pointers towards the root (1 step, 2 steps, 4 steps, etc.). Pre-processing costs O(V log V) time, and finding the LCA of 2 vertices is O(log V) in the worst case (actually O(log(depth of tree)), which is usually better).
Once you have found the LCA, that implicitly gives you the path between vertices u and v. However, to find the most expensive edge to delete could be expensive since traversing the path is costly.
In 1-dimensional problems, the range-maximum-query (RMQ) can be employed. This uses a segment tree to solve the RMQ in O(log N) time.
Instead of a 1-dimensional space (like an array), we have a tree. However, we can apply the same idea, and build a segment tree-like structure. In fact, this is equivalent to bundling an extra piece of information with each jump pointer. To find the LCA, each vertex in the tree will have log(tree depth) jump pointers towards the root. We can bundle the maximum edge weight of the edges we jump over with the jump pointer. The cost of adding this information is the same as creating the jump pointer in the first place. Therefore, a slight refinement to the LCA algorithm allows us to find the maximum edge weight on the path between vertices u and v in O(log (depth)) time.
Finally, putting it together, the algorithmic complexity of this 3rd solution is O(E log V + H log V) or equivalently O((E+H) log V).
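A sketch of the jump-pointer (binary lifting) idea with the maximum edge weight bundled into each pointer, assuming the tree is given as an adjacency list of (neighbour, weight) pairs (all names are illustrative):

```python
def build_lifting(n, tree_adj, root=0):
    """Jump pointers that also carry the max edge weight jumped over.

    up[k][v] is the 2^k-th ancestor of v; mx[k][v] is the heaviest
    edge on that jump. Preprocessing is O(V log V).
    """
    LOG = max(1, n.bit_length())
    up = [[root] * n for _ in range(LOG)]
    mx = [[0] * n for _ in range(LOG)]
    depth = [0] * n
    seen = [False] * n
    stack = [(root, root, 0)]               # iterative rooting DFS
    while stack:
        v, p, w = stack.pop()
        if seen[v]:
            continue
        seen[v] = True
        up[0][v], mx[0][v] = p, w
        depth[v] = 0 if v == p else depth[p] + 1
        for u, wt in tree_adj[v]:
            if not seen[u]:
                stack.append((u, v, wt))
    for k in range(1, LOG):                 # double the jump length
        for v in range(n):
            up[k][v] = up[k - 1][up[k - 1][v]]
            mx[k][v] = max(mx[k - 1][v], mx[k - 1][up[k - 1][v]])
    return up, mx, depth

def max_edge_query(up, mx, depth, u, v):
    """Max edge weight on the u-v path: O(log depth) per query."""
    best = 0
    if depth[u] < depth[v]:
        u, v = v, u
    diff, k = depth[u] - depth[v], 0
    while diff:                             # lift u to v's depth
        if diff & 1:
            best = max(best, mx[k][u])
            u = up[k][u]
        diff >>= 1
        k += 1
    if u == v:
        return best
    for k in range(len(up) - 1, -1, -1):    # lift both below the LCA
        if up[k][u] != up[k][v]:
            best = max(best, mx[k][u], mx[k][v])
            u, v = up[k][u], up[k][v]
    return max(best, mx[0][u], mx[0][v])

# Path tree 0-1 (4), 1-2 (3), 2-3 (5):
tree_adj = {0: [(1, 4)], 1: [(0, 4), (2, 3)], 2: [(1, 3), (3, 5)], 3: [(2, 5)]}
up, mx, depth = build_lifting(4, tree_adj)
print(max_edge_query(up, mx, depth, 0, 3))  # 5
```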

How can a heap be used to optimize Prim's minimum spanning tree algorithm?

I have to solve a question that is something like this:
I am given a number N which represents the number of points I have. Each point has two coordinates: X and Y.
I can find the distance between two points with the following formula:
abs(x2-x1)+abs(y2-y1),
(x1,y1) being the coordinates of the first point, (x2,y2) the coordinates of the second point and abs() being the absolute value.
I have to find the minimum spanning tree, meaning I must have all my points connected with the sum of the edges being minimal. Prim's algorithm is good, but it is too slow. I read that I can make it faster using a heap but I didn't find any article that explains how to do that.
Can anyone explain to me how Prim's algorithm works with a heap (some sample code would be good, but not necessary), please?
It is possible to solve this problem efficiently (in O(n log n) time), but it is not that easy. Just using Prim's algorithm with a heap does not help (it actually makes it even slower), because its time complexity is O(E log V), which is O(n^2 log n) in this case.
However, you can use the Delaunay triangulation to reduce the number of edges in the graph. The Delaunay triangulation graph is planar, so it has a linear number of edges. That's why running Prim's algorithm with a heap on it gives O(n log n) time complexity (there are O(n) edges and n vertices). You can read more about it here (covering this algorithm in detail and proving its correctness would make my answer far too long): http://en.wikipedia.org/wiki/Euclidean_minimum_spanning_tree. Note that even though the article is about the Euclidean MST, the approach for your case is essentially the same (it is possible to build the Delaunay triangulation for the Manhattan distance efficiently, too).
A description of the Prim's algorithm with a heap itself is already present in two other answers to your question.
From the Wikipedia article on Prim's algorithm:
[S]toring vertices instead of edges can improve it still further. The heap should order the vertices by the smallest edge-weight that connects them to any vertex in the partially constructed minimum spanning tree (MST) (or infinity if no such edge exists). Every time a vertex v is chosen and added to the MST, a decrease-key operation is performed on all vertices w outside the partial MST such that v is connected to w, setting the key to the minimum of its previous value and the edge cost of (v,w).
While it was pointed out that Prim's with a heap is O(E log V), which is O(n^2 log n) in the worst case, I can explain what makes the heap faster in cases other than that worst case, since that has still not been answered.
What makes Prim's so costly at O(V^2) is the updating required in each iteration of the algorithm. In general, Prim's works by keeping a table of your vertices with the lowest cost of connecting them to the growing tree and picking the cheapest vertex to add until all are added. Every time you add a vertex, you must go back to your table and update any vertices that can now be reached with less weight, and then walk through the whole table to decide which vertex is cheapest to add next. This setup - having to pick the next vertex (O(V)) V times - gives the O(V^2).
The heap is able to help this running time in all cases besides the worst case, because it fixes this bottleneck. With a min-heap, you can read the minimum weight under consideration in O(1). Additionally, it costs O(log V) to fix the heap after inserting an element, which happens up to E times, giving O(E log V) to maintain the heap for Prim's. This becomes the new bottleneck, which is what gives rise to the final running time of O(E log V).
So, depending on how much you know about your data, Prim's with a heap can certainly be more efficient than without!
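A sketch of Prim's with a binary heap, assuming a 0-indexed adjacency list (names are my own); since Python's heapq lacks decrease-key, the code pushes duplicate entries and skips vertices already in the tree:

```python
import heapq

def prim_mst_cost(adj, start=0):
    """Prim's algorithm with a min-heap of candidate edges.

    adj[u] is a list of (v, weight) pairs. Instead of decrease-key,
    stale heap entries are skipped on pop (lazy deletion).
    """
    n = len(adj)
    in_tree = [False] * n
    heap = [(0, start)]          # (weight of edge into the tree, vertex)
    cost = 0
    while heap:
        w, u = heapq.heappop(heap)   # cheapest crossing edge: O(log E)
        if in_tree[u]:
            continue                 # already added via a cheaper edge
        in_tree[u] = True
        cost += w
        for v, wt in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (wt, v))
    return cost

adj = [
    [(1, 2), (2, 5)],
    [(0, 2), (2, 1)],
    [(0, 5), (1, 1)],
]
print(prim_mst_cost(adj))  # 3 (edges 0-1 and 1-2)
```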

Is the runtime of BFS and DFS on a binary tree O(N)?

I realize that runtime of BFS and DFS on a generic graph is O(n+m), where n is number of nodes and m is number of edges, and this is because for each node its adjacency list must be considered. However, what is the runtime of BFS and DFS when it is executed on a binary tree? I believe it should be O(n) because the possible number of edges that can go out of a node is constant (i.e., 2). Please confirm if this is the correct understanding. If not, then please explain what is the correct time complexity of BFS and DFS on a binary tree?
The time complexities for BFS and DFS are O(|V| + |E|); for a connected graph that is just O(|E|), or in your case, O(m).
In a binary tree, m is equal to n-1 so the time complexity is equivalent to O(|V|). m refers to the total number of edges, not the average number of adjacent edges per vertex.
Yes, O(n) is correct.
Also note that the number of edges can be expressed exactly as the number of nodes minus 1. This can quite easily be seen by considering that each node except the root has exactly one edge from its parent to itself, and these edges account for all edges in the tree.
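For instance, both traversals touch each node exactly once (a minimal sketch; the Node class and names are my own):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def bfs(root):
    """Level-order traversal: each node is enqueued/dequeued once -> O(n)."""
    out, q = [], deque([root])
    while q:
        node = q.popleft()
        if node:
            out.append(node.val)
            q.append(node.left)
            q.append(node.right)
    return out

def dfs(root):
    """Preorder traversal: each node is pushed/popped once -> O(n)."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        if node:
            out.append(node.val)
            stack.append(node.right)   # right first so left is visited first
            stack.append(node.left)
    return out

tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(bfs(tree))  # [1, 2, 3, 4, 5]
print(dfs(tree))  # [1, 2, 4, 5, 3]
```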

Max weight euclidean spanning tree

A maximum spanning tree can be found by running Kruskal's algorithm (just changing the edge comparison and considering the maximum-weight edges first). I am interested in finding the maximum-weight Euclidean spanning tree. Does there exist a better algorithm (with better worst-case running time) than Kruskal's to find such a spanning tree?
Monma et al. solve this in O(n log h) time and O(n) space, where n is the number of points and h is the size of the convex hull.
The algorithm (p.10 of the paper) is fairly simple so it should be accessible even without understanding the full proof.
