I am having a hard time seeing the O(mn) bound for the straightforward implementation of Dijkstra's algorithm (without a heap). In my implementation and others I have found, the main loop iterates n-1 times (once for each vertex other than the source), each iteration finds the minimum vertex in O(n) (examining each vertex in the queue and taking the one with minimum distance to the source), and the extracted minimum vertex has at most n-1 neighbors, so updating all neighbors is O(n). This seems to me to lead to a bound of O(n^2). My implementation is provided below.
public int[] dijkstra(int s) {
    int[] dist = new int[vNum];
    LinkedList<Integer> queue = new LinkedList<>();
    for (int i = 0; i < vNum; i++) {
        queue.add(i);                // add all vertices to the queue
        dist[i] = Integer.MAX_VALUE; // set all initial shortest paths to max int value
    }
    dist[s] = 0; // the source is 0 away from itself

    while (!queue.isEmpty()) { // iterates over n - 1 vertices, O(n)
        int minV = getMinDist(queue, dist);  // get vertex with minimum distance from source, O(n)
        queue.remove(Integer.valueOf(minV)); // remove the Integer object, not the element at that index
        if (dist[minV] == Integer.MAX_VALUE) continue; // unreachable: skip, so the sum below cannot overflow
        for (int neighbor : adjList[minV]) { // O(n), at most n - 1 neighbors
            int shortestPath = dist[minV] + edgeLengths[minV][neighbor];
            if (shortestPath < dist[neighbor]) {
                dist[neighbor] = shortestPath; // a new shortest path has been found
            }
        }
    }
    return dist;
}
I don't think this is correct, but I am having trouble seeing where m factors in.
Your implementation indeed removes the M factor, at least if we consider only simple graphs (no multiple edges between two vertices). It is O(N^2)! The complexity would be O(N*M) if you iterated through all the possible edges instead of the vertices.
EDIT: Well, it is actually O(M + N^2), to be more specific. Updating the distance of a vertex takes O(1) time in your algorithm, and it can happen each time you consider an edge, in other words, up to M times. That's why there is an M in the complexity.
Unfortunately, if you want to use a simple binary heap, the complexity is going to be O(M log M) (or, equivalently, O(M log N)). Why? You are not able to quickly change values inside a heap. So if dist[v] suddenly decreases, because you've found a new, better path to v, you can't just change it in the heap, because you don't know its location there. You may instead put a duplicate of v in your heap, but then the size of the heap becomes O(M). Even if you store each vertex's location and update it cleverly, so that the heap holds only O(N) items, you still have to restore the heap property after each change, which takes O(log(size of heap)). A value may change up to O(M) times, which gives you O(M log M) (or O(M log N)) complexity.
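To make the duplicate-entry variant concrete, here is a minimal sketch using java.util.PriorityQueue (plus java.util.Arrays), assuming the same vNum, adjList and edgeLengths fields as your code above; everything else is illustrative, not a definitive implementation:

// Sketch: Dijkstra with a binary heap and duplicate entries, O(M log M).
// Assumes the vNum / adjList / edgeLengths fields from the question's code.
public int[] dijkstraHeap(int s) {
    int[] dist = new int[vNum];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[s] = 0;
    // Heap entries are {vertex, distance}, ordered by distance.
    PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
    heap.add(new int[] {s, 0});
    while (!heap.isEmpty()) {
        int[] top = heap.poll();
        int v = top[0], d = top[1];
        if (d > dist[v]) continue; // stale duplicate, skip it
        for (int neighbor : adjList[v]) {
            int nd = d + edgeLengths[v][neighbor];
            if (nd < dist[neighbor]) {
                dist[neighbor] = nd;
                heap.add(new int[] {neighbor, nd}); // duplicate entry instead of decrease-key
            }
        }
    }
    return dist;
}

The "d > dist[v]" check discards stale duplicates when they are popped, so each of the O(M) heap entries is processed once, which is where the O(M log M) bound comes from.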
What is the time complexity of this particular implementation of Dijkstra's algorithm?
I know several answers to this question say O(E log V) when you use a min heap, and so does this article and this article. However, the article here says O(V + E log E), and it has similar (but not exactly the same) logic as the code below.
Different implementations of the algorithm can change the time complexity. I'm trying to analyze the complexity of the implementation below, but optimizations like checking visitedSet and ignoring repeated vertices in minHeap are making me doubt myself.
Here is the pseudo code:
// this part is O(V)
for each vertex in graph {
    distanceMap[vertex] = infinity
}

// initialize source vertex
minHeap.add(source vertex and 0 distance)
distanceMap[source] = 0
visitedSet.add(source vertex)

// loop through vertices: O(V)?
while (minHeap is not empty) {
    // Removing from heap is O(log n) but is n the V or E?
    vertex and distance = minHeap.removeMin
    if (distance > saved distance in distanceMap) continue while loop
    visitedSet.add(vertex)

    // looping through edges: O(E)?
    for (each neighbor of vertex) {
        if (visitedSet contains neighbor) continue for loop
        totalDistance = distance + weight to neighbor
        if (totalDistance < saved distance in distanceMap) {
            // Adding to heap is O(log n) but is n the V or E?
            minHeap.add(neighbor and totalDistance)
            distanceMap[neighbor] = totalDistance
        }
    }
}
Notes:
Each vertex that is reachable from the source vertex will be visited at least once.
Each edge (neighbor) of each vertex is checked but ignored if in visitedSet
A neighbor is added to the heap only if it has a shorter distance than the currently known distance. (Unknown distances are assumed to have a default length of infinity.)
What is the actual time complexity of this implementation and why?
Despite the staleness check, this implementation of Dijkstra may put Ω(E) items in the priority queue. This costs Ω(E log E) with any comparison-based priority queue.
Why not E log V? Well, assuming a connected, simple, nontrivial graph, we have Θ(E log V) = Θ(E log E), since log(V−1) ≤ log E < log V² = 2 log V.
The O(E + V log V)-time implementations of Dijkstra's algorithm depend on an (amortized) constant-time DecreaseKey operation, avoiding multiple entries for an individual vertex. In practice, however, the implementation in this question will likely be faster on sparse graphs.
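If you want at most one queue entry per vertex without writing an indexed heap, a balanced BST such as java.util.TreeSet can emulate DecreaseKey by removing the old entry and inserting the new one, both O(log V). This is only a sketch under assumed adj/weight structures (usual java.util imports), and it yields O(E log V), not the O(E + V log V) of a Fibonacci heap:

// Sketch: Dijkstra with at most one queue entry per vertex, O(E log V).
// adj[v] lists v's neighbors, weight[v][u] gives edge weights (assumed names).
static int[] dijkstra(int n, List<Integer>[] adj, int[][] weight, int source) {
    int[] dist = new int[n];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[source] = 0;
    // Entries are {distance, vertex}; the vertex tiebreaker keeps entries unique.
    TreeSet<int[]> queue = new TreeSet<>(
            Comparator.<int[]>comparingInt(e -> e[0]).thenComparingInt(e -> e[1]));
    queue.add(new int[] {0, source});
    while (!queue.isEmpty()) {
        int v = queue.pollFirst()[1];
        for (int u : adj[v]) {
            int nd = dist[v] + weight[v][u];
            if (nd < dist[u]) {
                queue.remove(new int[] {dist[u], u}); // "decrease-key": drop the stale entry
                dist[u] = nd;
                queue.add(new int[] {nd, u});
            }
        }
    }
    return dist;
}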
I am having quite a hard time figuring out the time complexity of my algorithm. I know that the "for" portion of the algorithm will run in O(n), but I am unsure of my while loop. The problem involves creating a binary tree from a given vector. Each node is evaluated for its value and its index in the vector, so, essentially, every following node must be to the right of the previous one, and depending on whether its value is greater or smaller, it will be a child node or a parent node. The children of a parent node must be smaller in value.
I have used the while loop for the case where a child node is smaller than the next node to be placed, and I follow up through the parents until I find the spot for the new node to be placed. I believe this will run, in the worst case, k-1 times, k being the depth of the tree, but how would I represent this as a time complexity? O(kn)? Is that linear?
for (int i = 0; i < vecteur_source.size(); i++) {
    if (i == 0) {
        // do bla....
    } else if ((vecteur_source.at(i) > vecteur_source.at(i-1)) && (m_map_index_noeud.at(i-1)->parent)) {
        int v = m_map_index_noeud.at(i-1)->parent->index;
        while (vecteur_source.at(i) >= vecteur_source.at(v)) {
            v = m_map_index_noeud.at(v)->parent->index;
        }
    }
}
Allow me to simplify this into pseudocode:
# I'm assuming this takes constant or linear time
do the thing for i = 0

for i ← 1 to n-1:
    if source[i] > source[i-1] and nodes[i-1] is not the root node:
        v ← nodes[i-1].parent.index
        while source[i] >= source[v]:
            v ← nodes[v].parent.index
If I've understood it properly from your code, then your analysis is correct: the outer loop iterates O(n) times, and the inner loop iterates up to O(h) times, where h is the height of the tree, so the time complexity is O(nh).
This is not linear time unless h is guaranteed to be at most a constant. More usually, h is O(log n) for balanced trees, or Θ(n) in the worst case for unbalanced trees; for example, inserting a strictly decreasing sequence makes every node a child of the previous one, producing a path of height n - 1.
I have a graph G with a starting node S and an ending node E. What's special about this graph is that instead of the edges having costs, here it's the nodes that have a cost. I want to find a path (a set of nodes W) between S and E such that max(W) is minimized. (In fact, I am not interested in W itself, just max(W).) Equivalently: if I remove all nodes with cost larger than k, what's the smallest k such that S and E are still connected?
I have one idea, but I want to know if it is correct and optimal. Here's my current pseudocode:
L := priority queue of nodes (minimum weight on top)
L.add(S, S.weight)
while (!L.empty) {
    X = L.poll()
    return X.weight if (X == E)
    mark X visited
    foreach (unvisited neighbour N of X, N not in L) {
        N.weight = max(N.weight, X.weight)
        L.add(N, N.weight)
    }
}
I believe it is worst case O(n log n), where n is the number of nodes.
Here are some details for my specific problem (percolation), but I am also interested in algorithms for this problem in general. Node weights are uniformly distributed at random between 0 and a given max value. My nodes are Poisson distributed on the R² plane, and an edge between two nodes exists if the distance between them is less than a given constant. There are potentially very many nodes, so they are generated on the fly (hidden in the foreach in the pseudocode). My starting node is at (0,0) and the ending node is any node at a distance larger than R from (0,0).
EDIT: The weights on the nodes are floating point numbers.
Starting from an empty graph, you can insert vertices (and their edges to existing neighbours) one at a time in increasing weight order, using a fast union/find data structure to maintain the set of connected components. This is just like Kruskal's algorithm for building minimum spanning trees, but instead of adding edges one at a time, for each vertex v that you process you combine the components of all of v's already-inserted neighbours.
You also keep track of which two components contain the start and end vertices. (Initially comp(S) = S and comp(E) = E; before each union operation, the two input components X and Y can be checked to see whether either one is comp(S) or comp(E), and the latter updated accordingly in O(1) time.) As soon as these two components become a single component (i.e. comp(S) = comp(E)), you stop. The vertex just added is the maximum-weight vertex on the path between S and E that minimises the maximum weight of any vertex.
[EDIT: Added time complexity info]
If the graph contains n vertices and m edges, it will take O(n log n) time to sort the vertices by weight. There will be at most m union operations (since every edge could be used to combine two components). With a simple disjoint-set data structure, all of these union operations can be done in O(m + n log n) time, and this becomes the overall time complexity; if path compression is also used, the union/find part drops to O(m A(n)), where A(n) is the incredibly slowly growing inverse Ackermann function, but the overall time complexity remains unchanged because the initial sorting dominates.
Assuming integer weights, Pham Trung's binary search approach takes O((n + m) log maxW) time, where maxW is the weight of the heaviest vertex in the graph. On sparse graphs (where m = O(n)) this becomes O(n log maxW), while mine becomes O(n log n), so his algorithm beats mine whenever log(maxW) << log(n), i.e. whenever all weights are very small. If his algorithm is called on a graph with large weights but only a small number of distinct weights, one possible optimisation would be to sort the weights in O(n log n) time and then replace each weight with its rank in the sorted order.
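Here is a minimal Java sketch of the union/find approach described above (usual java.util imports assumed). The names w, adj and minimaxNodeWeight are illustrative, and for brevity it calls find(S)/find(E) after each insertion instead of tracking comp(S)/comp(E) explicitly:

// Sketch: bottleneck node weight between s and e via Kruskal-style insertion.
// w[v] is v's weight, adj[v] its neighbour list; all names here are illustrative.
static int minimaxNodeWeight(int n, int[] w, List<Integer>[] adj, int s, int e) {
    int[] parent = new int[n];
    boolean[] inserted = new boolean[n];
    for (int v = 0; v < n; v++) parent[v] = v;

    Integer[] order = new Integer[n];
    for (int v = 0; v < n; v++) order[v] = v;
    Arrays.sort(order, Comparator.comparingInt(v -> w[v])); // O(n log n)

    for (int v : order) {                      // insert vertices in increasing weight order
        inserted[v] = true;
        for (int u : adj[v]) {
            if (inserted[u]) union(parent, v, u);
        }
        if (find(parent, s) == find(parent, e)) return w[v]; // s and e just became connected
    }
    return -1; // s and e are not connected at all
}

static int find(int[] parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]]; // path halving
        x = parent[x];
    }
    return x;
}

static void union(int[] parent, int a, int b) {
    parent[find(parent, a)] = find(parent, b);
}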
This problem can be solved by using binary search.
Assume that the solution is x. Starting from the start node, we use BFS or DFS to explore the graph, visiting only those nodes whose weight is <= x. If, in the end, Start and End are connected, then x is a feasible solution, and we can find the optimal x by binary search.
Pseudo code
int min = min_value_of_all_nodes;
int max = max_value_of_all_nodes;
int result = max;

while (min <= max) {
    int mid = (min + max) >> 1;
    if (BFS(mid)) { // use breadth-first search to explore the graph
        result = min(mid, result);
        max = mid - 1;
    } else {
        min = mid + 1;
    }
}
print result;
Note: we only need to consider weights that actually occur in the graph, so the number of binary search steps can be reduced to O(log n), where n is the number of distinct weights.
If the weights are floats, just use the following approach:
List<Double> listWeight; // sorted list of distinct weights

int min = 0;
int max = listWeight.size() - 1;
int result = max;

while (min <= max) {
    int mid = (min + max) >> 1;
    if (BFS(listWeight.get(mid))) { // use breadth-first search to explore the graph
        result = min(mid, result);
        max = mid - 1;
    } else {
        min = mid + 1;
    }
}
print listWeight.get(result);
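For completeness, here is a minimal Java sketch of the BFS(x) feasibility test the pseudocode relies on; adj, weight, start and end are assumed names, not from the question:

// Sketch: is `end` reachable from `start` using only nodes of weight <= x?
// adj[v] is v's neighbour list, weight[v] its node weight (assumed names).
static boolean bfs(double x, List<Integer>[] adj, double[] weight, int start, int end) {
    if (weight[start] > x || weight[end] > x) return false;
    boolean[] seen = new boolean[adj.length];
    ArrayDeque<Integer> queue = new ArrayDeque<>();
    seen[start] = true;
    queue.add(start);
    while (!queue.isEmpty()) {
        int v = queue.poll();
        if (v == end) return true;
        for (int u : adj[v]) {
            if (!seen[u] && weight[u] <= x) { // skip nodes above the threshold
                seen[u] = true;
                queue.add(u);
            }
        }
    }
    return false;
}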
Let G = (V, E) be a directed graph with nodes v_1, v_2,..., v_n. We say that G is an ordered graph if it has the following properties.
Each edge goes from a node with lower index to a node with a higher index. That is, every directed edge has the form (v_i, v_j) with i < j.
Each node except v_n has at least one edge leaving it. That is, for every node v_i with i < n, there is at least one edge of the form (v_i, v_j).
Give an efficient algorithm that takes an ordered graph G and returns the length of the longest path that begins at v_1 and ends at v_n.
If you want to see the nice LaTeX version: here
My attempt:
Dynamic programming: Opt(i) = max{Opt(j)} + 1, over all j such that j is reachable from i.
Is there perhaps a better way to do this? I think even with memoization my algorithm will still be exponential. (This is just from an old midterm review I found online.)
Your approach is right; you will have to compute
Opt(i) = max{Opt(j)} + 1 for all j such that j is reachable from i
However, this is exponential only if you run it without memoization. With memoization, by the time you are at node i you will already have the memoized optimal value for every node j with j > i.
For the worst-case complexity, let us assume that every pair of nodes that could be connected is connected. That is, v_1 is connected to (v_2, v_3, ..., v_n), and in general v_i is connected to (v_(i+1), v_(i+2), ..., v_n).
Number of vertices (V) = n
Hence, number of edges (E) = n*(n-1)/2 = O(V^2)
Let us focus our attention on a vertex v_k. For this vertex, we have to go through the already derived optimal values of the (n-k) nodes after it.
Number of ways of reaching v_k directly = (k-1)
Hence the worst-case time complexity is Σ_{k=1..n} (k-1)(n-k) = n(n-1)(n-2)/6, a sum of a degree-2 polynomial in k, and hence O(n^3).
Simplistically, the worst-case time complexity is O(n^3) == O(V^3) == O(E) * O(V) == O(EV).
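A minimal memoized sketch of the recurrence in Java, assuming adj[i] lists the direct successors of v_i (all with index > i) and defining opt(i) as the longest path from v_i to v_n, with a large negative sentinel marking "v_n unreachable"; all names are illustrative:

// Sketch: memoized longest path from v_i to v_n in an ordered graph.
static Integer[] memo;                             // null = not yet computed
static final int NO_PATH = Integer.MIN_VALUE / 2;  // sentinel, still safe to add 1 to

static int opt(int i, int n, List<Integer>[] adj) {
    if (i == n) return 0;          // base case: the empty path at v_n
    if (memo[i] != null) return memo[i];
    int best = NO_PATH;
    for (int j : adj[i]) {         // each edge is examined once across all calls
        best = Math.max(best, opt(j, n, adj) + 1);
    }
    return memo[i] = best;
}

// Usage: memo = new Integer[n + 1]; int answer = opt(1, n, adj);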
Thanks to the first property, this problem can be solved in O(V^2), or even better in O(E), where V is the number of vertices and E is the number of edges. Indeed, it uses a dynamic programming approach which is quite similar to the one you give. Let opt[i] be the length of the longest path from v_1 to v_i. Then
opt[i] = max(opt[j]) + 1, where j < i and v_j and v_i are connected;
using this equation, it can be solved in O(V^2).
Even better, we can solve this in another order.
int LongestPath() {
    for (int v = 1; v <= V; ++v) opt[v] = -1; // -1 marks "not reachable from v_1"
    opt[1] = 0;
    for (int v = 1; v <= V; ++v) {
        if (opt[v] >= 0) {
            /* Each edge is visited at most once,
               so the total runtime is bounded by |E|. */
            for (int w : adj[v]) // adj[v]: successors of v, all with index > v
                opt[w] = max(opt[v] + 1, opt[w]);
        }
    }
    return opt[V];
}
Is it possible to modify the Floyd-Warshall algorithm for solving the shortest path problem on a directed weighted graph with n nodes, such that each shortest path has no more than m steps? More precisely, for each pair of nodes i and j, you are to find the shortest directed path from i to j that contains no more than m nodes. Does the time complexity still remain O(n^3)?
Meanwhile, I found an O(n^3 log m) algorithm for the all-pairs shortest paths (APSP) problem on a graph with n nodes, such that no path contains more than m nodes.
Given two n x n matrices, say A = [a_ij] and B = [b_ij], their distance product is the n x n matrix C = [c_ij] = A x B defined by c_ij = min over 1 <= k <= n of (a_ik + b_kj).
This is related to APSP in the following way. Given the weighted matrix E of distances in the graph, E^n (the n-th power of E under the distance product) is the matrix of all-pairs shortest paths. If we add the constraint that no path contains more than m nodes, then the matrix E^m is the solution to APSP. Since the m-th power can be computed with O(log m) distance products by repeated squaring, this gives an O(n^3 log m) algorithm.
Here, one may find faster algorithms for computing the distance product of matrices in some special cases, but the trivial O(n^3) one is enough for me, since the overall time is almost as fast as the Floyd-Warshall approach.
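A minimal Java sketch of this repeated-squaring scheme, assuming the input matrix holds edge weights as longs, with a large INF where no edge exists and 0 on the diagonal; all names are illustrative:

// Sketch: shortest paths over a bounded number of steps via distance products.
// INF is chosen so that INF + INF does not overflow a long.
static final long INF = Long.MAX_VALUE / 4;

// Distance product: c[i][j] = min_k (a[i][k] + b[k][j]); O(n^3).
static long[][] distanceProduct(long[][] a, long[][] b) {
    int n = a.length;
    long[][] c = new long[n][n];
    for (long[] row : c) Arrays.fill(row, INF);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i][j] = Math.min(c[i][j], a[i][k] + b[k][j]);
    return c;
}

// E^m by repeated squaring: O(log m) products, O(n^3 log m) overall.
static long[][] power(long[][] e, int m) {
    int n = e.length;
    long[][] result = new long[n][n]; // identity for the distance product:
    for (long[] row : result) Arrays.fill(row, INF);
    for (int i = 0; i < n; i++) result[i][i] = 0;
    for (long[][] base = e; m > 0; m >>= 1) {
        if ((m & 1) == 1) result = distanceProduct(result, base);
        base = distanceProduct(base, base);
    }
    return result;
}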
Yes and Yes.
Each iteration of the algorithm adds a single unit to the length of the paths you search for. So if you limit the iterations to m, then you find paths of length at most m.
The complexity will remain O(n^3) in the worst case of m -> n. However, a more precise estimate would be O(m * n^2).
I believe that this could be done with a different data structure, one that allows you to keep track of the number of steps.
Since Floyd-Warshall is usually done with a connectivity matrix (where matrix[j][k] represents the distance between nodes j and k), instead of making that an integer matrix we can make it a matrix of structs, each holding two integers: the distance between two nodes and the number of steps between them.
I wrote up something in C++ to explain what I mean:
#include <iostream>
using namespace std;

#define INF 9999999 // renamed from INFINITY, which clashes with the <cmath> macro

struct floydEdge
{
    int weight;
    int steps;
};

floydEdge matrix[10][10];

int main()
{
    // We initialize the matrix
    for (int j = 0; j < 10; j++)
    {
        for (int k = 0; k < 10; k++)
        {
            matrix[j][k].weight = INF;
            matrix[j][k].steps = 0;
        }
    }
    // We input some weights (the graph is treated as undirected here)
    for (int j = 0; j < 10; j++)
    {
        int a, b;
        cin >> a >> b;
        cin >> matrix[a][b].weight;
        matrix[b][a].weight = matrix[a][b].weight;
        matrix[a][b].steps = matrix[b][a].steps = 1;
    }
    // We do the Floyd-Warshall, checking the step counts as well as the weights.
    // Note: the intermediate node i must be the OUTERMOST loop for Floyd-Warshall
    // to be correct, so the loops are ordered i, j, k here.
    for (int i = 0; i < 10; i++)
    {
        for (int j = 0; j < 10; j++)
        {
            for (int k = 0; k < 10; k++)
            {
                // We check if there is a shorter path between nodes j and k using node i,
                // and whether that path has at most 4 steps (the step limit in this example).
                if ((matrix[j][k].weight > matrix[j][i].weight + matrix[i][k].weight) && (matrix[j][i].steps + matrix[i][k].steps <= 4))
                {
                    matrix[j][k].weight = matrix[k][j].weight = matrix[j][i].weight + matrix[i][k].weight;
                    matrix[j][k].steps = matrix[k][j].steps = matrix[j][i].steps + matrix[i][k].steps;
                }
                // If the path is not shorter, we also check whether an equally short path
                // exists that requires FEWER steps than the path we currently have.
                else if ((matrix[j][k].weight == matrix[j][i].weight + matrix[i][k].weight) && (matrix[j][i].steps + matrix[i][k].steps < matrix[j][k].steps))
                {
                    matrix[j][k].weight = matrix[k][j].weight = matrix[j][i].weight + matrix[i][k].weight;
                    matrix[j][k].steps = matrix[k][j].steps = matrix[j][i].steps + matrix[i][k].steps;
                }
            }
        }
    }
    return 0;
}
I believe (but I am not completely sure) that this works perfectly, always giving the shortest paths between all nodes. Give it a try and let me know!