Floyd-Warshall Algorithm - Why this order?

In the algorithm you have 3 loops for shortest[u,v,x], which run
x from 1 to n,
u from 1 to n,
v from 1 to n.
Why is the loop order x,u,v and not x,v,u or u,v,x?

Assume (u,v) are indices of vertices and x denotes the index such that dist[u,v,x] is the shortest distance between vertex u and vertex v using only intermediate vertices numbered <= x.
Hence the recurrence for dist[u,v,x] uses dist[u,v,x-1], dist[u,x,x-1] and dist[x,v,x-1].
All three of those values belong to level x-1, so every entry of level x-1 must be computed before any entry of level x.
Therefore the loop for x has to be the outermost loop. The two inner loops can be ordered either u,v or v,u, because the entries within one level do not depend on each other.
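To make the orders concrete, here is a minimal C++ sketch (identifiers are mine, 0-based indices) with x outermost:

#include <algorithm>
#include <vector>

const long long INF = 1e18;

// dist is the n x n matrix of edge costs (INF where there is no edge,
// 0 on the diagonal). Afterwards dist[u][v] is the shortest-path cost.
void floydWarshall(std::vector<std::vector<long long>>& dist, int n) {
    for (int x = 0; x < n; ++x)        // intermediate vertex: must be outermost
        for (int u = 0; u < n; ++u)    // u and v may be swapped freely
            for (int v = 0; v < n; ++v)
                if (dist[u][x] < INF && dist[x][v] < INF)
                    dist[u][v] = std::min(dist[u][v], dist[u][x] + dist[x][v]);
}

Swapping the u and v loops changes nothing, but hoisting u or v above x would mix entries from different levels and gives wrong answers on some graphs.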

Related

Pairs of vertices in graph having specific distance

Given a tree with N vertices and a positive number K, find the number of distinct pairs of vertices which have a distance of exactly K between them. Note that pairs (v, u) and (u, v) are considered to be the same pair (1 ≤ N ≤ 50000, 1 ≤ K ≤ 500).
I am not able to find an optimal solution for this. I can do a BFS from each vertex and count the number of vertices reachable from it with distance less than or equal to K, but in the worst case the complexity will be O(N^2). Is there any faster way?
You can achieve that in a simpler way.
Run a DFS on the tree and, for each vertex, calculate its distance from the root; save those in an array (O(1) access).
For each pair of vertices in your graph:
Find their LCA (lowest common ancestor; there are algorithms that answer this in O(1) after preprocessing).
Assume u and v are two arbitrary vertices and w is their LCA. Subtract the distance from w to the root from the distance from u to the root; now you have the distance between u and w. Do the same for v; so in O(1) you have the distances (v,w) and (u,w). Sum them together and you get the distance (u,v); now all you have to do is compare it to K.
Final complexity is O(n^2).
Improving upon the other answer's approach, we can make some key observations.
To calculate the distance between two nodes you need their LCA (lowest common ancestor) and their depths, as the other answer does. The formula used here is:
Dist(u, v) = depth[u] + depth[v] - 2 * depth[ lca(u, v) ]
depth[x] denotes the distance of x from the root, precomputed once using a DFS starting from the root node.
Now comes the key observation: you already have the distance value K. Assume that dist(u, v) = K and use this assumption to calculate (predict) the depth of the LCA. Putting K into the formula above gives:
depth[ lca(u, v) ] = (depth[u] + depth[v] - K) / 2
Now that you have the depth of the LCA, you know that the distance between u and the LCA is depth[u] - depth[ lca(u, v) ], and between v and the LCA it is depth[v] - depth[ lca(u, v) ]; let these be X and Y respectively.
Since the LCA is the lowest common ancestor, the X-th ancestor of u and the Y-th ancestor of v should both be the LCA. So if the X-th ancestor of u and the Y-th ancestor of v are indeed the same node, our assumption about the distance was correct and the distance between the two nodes is K.
You can calculate the X-th and Y-th ancestors of the nodes in O(logN) using the binary lifting technique, with O(N logN) preprocessing that can be folded directly into the DFS when a node is visited for the first time.
Corner Cases:
The calculated depth of the LCA must not be fractional or negative.
If the depth of node u or v equals the calculated LCA depth, then that node is itself the ancestor of the other node.
Consider this tree (the original answer illustrated this case with a small figure, not reproduced here):
Assuming K = 4, the formula above gives depth[lca] = 1, and the X-th and Y-th ancestors of u and v turn out to be the same node at depth 1, which would seem to validate our assumption. But this is not true: the actual distance between u and v is 2, because their real LCA is at depth 2. To handle this case, also calculate the (X-1)-th ancestor of u and the (Y-1)-th ancestor of v, and check that they are different.
Final complexity: O(NlogN) preprocessing, then O(logN) per pair tested.
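Here is a hedged C++ sketch of the pair test described above (the names up, kthAncestor and distEqualsK are mine; the counting loop over pairs and the input handling are omitted):

#include <vector>
using namespace std;

const int LOG = 17;                  // 2^17 > 50000
int n;
vector<vector<int>> adj;             // tree adjacency list
vector<int> depth_;                  // depth_[root] = 0, set before the DFS
vector<vector<int>> up;              // up[v][j] = 2^j-th ancestor of v

void dfs(int v, int parent) {        // call as dfs(root, root);
    up[v][0] = parent;               // the root is its own parent
    for (int j = 1; j < LOG; ++j)    // (convert to iterative for deep trees)
        up[v][j] = up[up[v][j - 1]][j - 1];
    for (int u : adj[v])
        if (u != parent) { depth_[u] = depth_[v] + 1; dfs(u, v); }
}

int kthAncestor(int v, int k) {      // assumes k <= depth_[v]
    for (int j = 0; j < LOG; ++j)
        if (k & (1 << j)) v = up[v][j];
    return v;
}

// True iff dist(u, v) == K, using the predicted-LCA-depth test.
bool distEqualsK(int u, int v, int K) {
    int s = depth_[u] + depth_[v] - K;
    if (s < 0 || s % 2 != 0) return false;    // corner case: negative or fractional
    int d = s / 2;                            // predicted depth of lca(u, v)
    if (d > depth_[u] || d > depth_[v]) return false;
    int X = depth_[u] - d, Y = depth_[v] - d;
    if (kthAncestor(u, X) != kthAncestor(v, Y)) return false;
    // corner case: if the ancestors one level deeper also coincide,
    // the true LCA is deeper than predicted and the distance is not K
    if (X > 0 && Y > 0 && kthAncestor(u, X - 1) == kthAncestor(v, Y - 1))
        return false;
    return true;
}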

How can I get the antichain elements in SPOJ-DIVREL?

Problem: http://www.spoj.com/problems/DIVREL
In the question, we just need to find the maximum number of elements that are pairwise non-multiples (no a divisible by b) from a given set of elements. If we add an edge from each element to its multiples, the resulting graph is a DAG.
The question then reduces to finding the minimum number of chains that cover all the vertices, which equals the maximum antichain cardinality by Dilworth's theorem, since this is a partially ordered set.
The minimum number of chains can be found using bipartite matching (it is a minimum path cover), but I am unable to recover the antichain elements themselves.
To compute the antichain you can:
Compute a maximum bipartite matching (e.g. with a maximum-flow algorithm) on a new bipartite graph D which has an edge from LHS a to RHS b if and only if a divides b.
Use the matching to compute a minimum vertex cover (e.g. with the algorithm described in the proof of König's theorem).
The antichain is given by all vertices not in the vertex cover.
There cannot be an edge between two such elements, as otherwise we would have found an edge not covered by the vertex cover, a contradiction.
The algorithm to find the min vertex cover is (from the link above):
Let S(0) consist of all vertices unmatched by M.
For integer j ≥ 0, let S(2j+1) be the set of all vertices v such that v is adjacent via some edge in E \ M to a vertex in S(2j) and v has not been included in any previously-defined set S(k), where k < 2j+1. If there is no such vertex, but there remain vertices not included in any previously-defined set S(k), arbitrarily choose one of these and let S(2j+1) consist of that single vertex.
For integer j ≥ 1, let S(2j) be the set of all vertices u such that u is adjacent via some edge in M to a vertex in S(2j−1). Note that for each v in S(2j−1) there is a vertex u to which it is matched, since otherwise v would have been in S(0). Therefore M sets up a one-to-one correspondence between the vertices of S(2j−1) and the vertices of S(2j).
The union of the odd-indexed subsets is the vertex cover.
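As an illustration, here is a hedged C++ sketch of the whole pipeline on D: Kuhn's augmenting-path matching, then the alternating reachability from the König proof to mark the sets (all identifiers are mine; for DIVREL you would add an edge a -> b whenever element a divides element b):

#include <vector>
using namespace std;

int nL, nR;                      // sizes of the LHS and RHS
vector<vector<int>> adjD;        // adjD[a] = RHS vertices b with an edge a -> b
vector<int> matchL, matchR;      // matched partner, or -1
vector<char> used, visL, visR;

bool tryKuhn(int a) {            // augmenting-path search from LHS vertex a
    for (int b : adjD[a]) {
        if (used[b]) continue;
        used[b] = 1;
        if (matchR[b] == -1 || tryKuhn(matchR[b])) {
            matchL[a] = b; matchR[b] = a;
            return true;
        }
    }
    return false;
}

// Alternating reachability from an unmatched LHS vertex: LHS -> RHS along
// non-matching edges, RHS -> LHS along matching edges (Konig's construction).
void altDfs(int a) {
    visL[a] = 1;
    for (int b : adjD[a])
        if (!visR[b]) {
            visR[b] = 1;
            if (matchR[b] != -1 && !visL[matchR[b]]) altDfs(matchR[b]);
        }
}

void computeCoverAndAntichain() {
    matchL.assign(nL, -1); matchR.assign(nR, -1);
    for (int a = 0; a < nL; ++a) {
        used.assign(nR, 0);
        tryKuhn(a);
    }
    visL.assign(nL, 0); visR.assign(nR, 0);
    for (int a = 0; a < nL; ++a)
        if (matchL[a] == -1) altDfs(a);
    // vertex cover = { LHS a with !visL[a] } union { RHS b with visR[b] };
    // the antichain is every vertex NOT in that cover.
}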

Complete graph with only two possible costs. What's the shortest path's cost from 0 to N - 1

You are given a complete undirected graph with N vertices. All but K edges have a cost of A. Those K edges have a cost of B and you know them (as a list of pairs). What's the minimum cost from node 0 to node N - 1.
2 <= N <= 500k
0 <= K <= 500k
1 <= A, B <= 500k
The problem is, obviously, when those K edges cost more than the other ones and node 0 and node N - 1 are connected by a K-edge.
Dijkstra doesn't work. I've even tried something very similar with a BFS.
Step 1: Let G(0) be the set of nodes adjacent to node 0 via "good" edges.
Step 2: For each node in G(0):
    compute G(node)
    if G(node) contains N - 1
        return step
    else
        add node to some queue
repeat step 2 and increment step
The problem is that this uses up a lot of time due to the fact that for every node you have to make a loop from 0 to N - 1 in order to find the "good" adjacent nodes.
Does anyone have any better ideas? Thank you.
Edit: Here is a link from the ACM contest: http://acm.ro/prob/probleme/B.pdf
This is laborious case work:
A < B and 0 and N-1 are joined by A -> trivial.
B < A and 0 and N-1 are joined by B -> trivial.
B < A and 0 and N-1 are joined by A ->
Do BFS on graph with only K edges.
A < B and 0 and N-1 are joined by B ->
You can check in O(N) time whether there is a path of length 2*A (try every vertex in the middle).
To check other path lengths, the following algorithm should do the trick:
Let X(d) be the set of nodes reachable from 0 using d shorter edges. You can find X(d) as follows: take each vertex v with unknown distance and iteratively check edges between v and vertices from X(d-1). If you find a short edge, then v is in X(d); otherwise you stepped on a long edge. Since there are at most K long edges, you can step on them at most K times in total, so you find the distance of every vertex in O(N + K) total time.
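A C++ sketch of that expansion (identifiers are mine; the K long edges are stored in a hash set keyed by the ordered pair, inserted in both directions):

#include <unordered_set>
#include <vector>
using namespace std;

// Returns the number of cheap edges on a shortest cheap-only path from
// `source` to every vertex (-1 if unreachable using cheap edges alone).
vector<int> cheapDistances(int N, int source,
                           const unordered_set<long long>& longEdge) {
    auto isLong = [&](int u, int v) {
        return longEdge.count(1LL * u * N + v) > 0;
    };
    vector<int> dist(N, -1);
    dist[source] = 0;
    vector<int> frontier = {source};          // X(d-1)
    vector<int> unknown;
    for (int v = 0; v < N; ++v)
        if (v != source) unknown.push_back(v);
    for (int d = 1; !frontier.empty() && !unknown.empty(); ++d) {
        vector<int> next, still;
        for (int v : unknown) {
            bool reached = false;
            for (int u : frontier)
                if (!isLong(u, v)) { reached = true; break; }  // cheap edge found
            // every failed probe above hit one of the K long edges, and each
            // long edge can be probed at most once over the whole run
            if (reached) { dist[v] = d; next.push_back(v); }
            else still.push_back(v);
        }
        frontier.swap(next);
        unknown.swap(still);
    }
    return dist;
}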
I propose a solution to a somewhat more general problem where you might have more than two types of edges and the edge weights are not bounded. For your scenario the idea is probably a bit overkill, but the implementation is quite simple, so it might be a good way to go about the problem.
You can use a segment tree to make Dijkstra more efficient. You will need the operations
set an upper bound in a range: given U, L, R, for all x[i] with L <= i <= R, set x[i] = min(x[i], U)
find a global minimum
The upper bounds can be pushed down the tree lazily, so both operations can be implemented in O(log n).
When relaxing outgoing edges, look for the edges with cost B, sort them and update the ranges in between all at once.
The runtime should be O(n log n + m log m) if you sort all the edges upfront (by outgoing vertex).
EDIT: Got accepted with this approach. The good thing about it is that it avoids any kind of special casing. It's still ~80 lines of code.
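For reference, here is a sketch of the segment tree primitive the answer describes, assuming 0-based indices (identifiers are mine): range chmin with lazy propagation, global minimum with its position, and a point assignment that the Dijkstra loop would use to disable a vertex once it is finalized.

#include <climits>
#include <vector>
using namespace std;

struct SegTree {
    int n;
    vector<long long> mn, lazy;   // subtree minimum / pending upper bound
    vector<int> pos;              // index attaining the subtree minimum

    SegTree(int n) : n(n), mn(4 * n, LLONG_MAX), lazy(4 * n, LLONG_MAX),
                     pos(4 * n, 0) { build(1, 0, n - 1); }
    void build(int x, int l, int r) {
        if (l == r) { pos[x] = l; return; }
        int m = (l + r) / 2;
        build(2 * x, l, m); build(2 * x + 1, m + 1, r);
        pos[x] = pos[2 * x];
    }
    void apply(int x, long long u) {          // impose upper bound u on node x
        if (u < mn[x]) mn[x] = u;
        if (u < lazy[x]) lazy[x] = u;
    }
    void push(int x) {
        if (lazy[x] != LLONG_MAX) {
            apply(2 * x, lazy[x]); apply(2 * x + 1, lazy[x]);
            lazy[x] = LLONG_MAX;
        }
    }
    void pull(int x) {
        int c = mn[2 * x] <= mn[2 * x + 1] ? 2 * x : 2 * x + 1;
        mn[x] = mn[c]; pos[x] = pos[c];
    }
    // x[i] = min(x[i], u) for all i in [ql, qr]
    void chmin(int x, int l, int r, int ql, int qr, long long u) {
        if (qr < l || r < ql) return;
        if (ql <= l && r <= qr) { apply(x, u); return; }
        push(x);
        int m = (l + r) / 2;
        chmin(2 * x, l, m, ql, qr, u); chmin(2 * x + 1, m + 1, r, ql, qr, u);
        pull(x);
    }
    // overwrite x[i] with v (used to disable an extracted vertex)
    void pointSet(int x, int l, int r, int i, long long v) {
        if (l == r) { mn[x] = v; return; }
        push(x);
        int m = (l + r) / 2;
        i <= m ? pointSet(2 * x, l, m, i, v) : pointSet(2 * x + 1, m + 1, r, i, v);
        pull(x);
    }
    // the global minimum and its position are mn[1] and pos[1]
};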
In the case when A < B, I would go with kind of a BFS, where you would check where you can't reach instead of where you can. Here's the pseudocode:
G(k) is the set of nodes reachable by k cheap edges and no fewer. We start with G(0) = {v0}.

while G(k) isn't empty and G(k) doesn't contain vN-1 and k*A < B
    cnt = array[N] of zeroes
    for every node n in G(k)
        for every expensive edge (n,m)
            cnt[m]++
    # now cnt[m] == |G(k)| iff m can't be reached by a cheap edge from any node in G(k)
    set G(k+1) to {m; cnt[m] < |G(k)|} except {n; n is in G(0),...,G(k)}
    k++
This way you avoid iterating through the (many) cheap edges and only iterate through the relatively few expensive edges.
As you have correctly noted, the problem case is when A > B and the edge from 0 to N-1 has a cost of A.
In this case you can simply delete all edges in the graph that have a cost of A, because an optimal route that beats the direct cost-A edge can only use edges of cost B.
Then you can perform a simple BFS, since the costs of all remaining edges are the same. This gives optimal performance, as pointed out by this link: Finding shortest path for equal weighted graph
Moreover, you can stop your BFS when the total cost exceeds A.
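A short C++ sketch of that case (identifiers are mine; cheapAdj is the adjacency list built from the K cost-B edges, and the direct cost-A edge from 0 to N-1 serves as the fallback):

#include <algorithm>
#include <queue>
#include <vector>
using namespace std;

// Case A > B with edge (0, N-1) costing A: BFS over only the K cheap edges,
// stopping once a path can no longer beat the direct cost-A edge.
long long solveExpensiveDirect(int N, long long A, long long B,
                               const vector<vector<int>>& cheapAdj) {
    vector<int> d(N, -1);
    queue<int> q;
    d[0] = 0; q.push(0);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        if ((d[u] + 1) * B >= A) break;   // extending is already no better than A
        for (int v : cheapAdj[u])
            if (d[v] == -1) { d[v] = d[u] + 1; q.push(v); }
    }
    if (d[N - 1] != -1) return min(A, d[N - 1] * B);
    return A;                             // fall back to the direct cost-A edge
}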

Time complexity of Prim's MST Algorithm

Can someone explain to me why Prim's algorithm using an adjacency matrix results in a time complexity of O(V^2)?
(Sorry in advance for the sloppy-looking ASCII math; I don't think we can use LaTeX to typeset answers.)
The traditional way to implement Prim's algorithm with O(V^2) complexity is to keep, in addition to the adjacency matrix, an array, let's call it distance, which holds the minimum distance from each vertex to the tree built so far.
This way, we only ever check distance to find the next target, and since we do this V times and there are V entries in distance, our complexity is O(V^2).
This on its own wouldn't be enough, as the values in distance would quickly become out of date. To update the array, at the end of each step we iterate through the adjacency-matrix row of the newly added vertex and update distance appropriately. This doesn't affect our time complexity, since each step still takes O(V+V) = O(2V) = O(V). Therefore our algorithm is O(V^2).
Without distance we would have to iterate through all E edges every single time, which at worst means V^2 edges, so our time complexity would be O(V^3).
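A compact C++ sketch of this scheme (identifiers are mine; the graph is assumed connected, and a 0 entry in the matrix means there is no edge):

#include <climits>
#include <vector>
using namespace std;

// O(V^2) Prim using the auxiliary distance array described above.
long long primTotalWeight(const vector<vector<int>>& adj) {
    int V = adj.size();
    vector<int> dist(V, INT_MAX);   // cheapest known edge from the tree to each vertex
    vector<bool> inTree(V, false);
    dist[0] = 0;
    long long total = 0;
    for (int step = 0; step < V; ++step) {
        int u = -1;                 // O(V): closest vertex not yet in the tree
        for (int v = 0; v < V; ++v)
            if (!inTree[v] && (u == -1 || dist[v] < dist[u])) u = v;
        inTree[u] = true;
        total += dist[u];
        for (int v = 0; v < V; ++v) // O(V): refresh distance against u's matrix row
            if (!inTree[v] && adj[u][v] != 0 && adj[u][v] < dist[v])
                dist[v] = adj[u][v];
    }
    return total;
}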
Proof:
To see that, without the distance array, the naive approach cannot run in O(V^2) time, consider that on each iteration with a tree of size n there are V-n vertices that could potentially be added.
To decide which one to choose, we must check each of these to find its minimum distance from the tree, and then compare those minima against each other.
In the worst case each of these nodes has a connection to every node in the tree, giving n * (V-n) edges to inspect and a cost of O(n(V-n)) per iteration.
Summing these steps as n goes from 1 to V, the final time complexity is:
sum over n = 1 to V of O(n(V-n)) = O((V-1) V (V+1) / 6) = O(V^3)
QED
Note: This answer just borrows jozefg's answer and tries to explain it more fully since I had to think a bit before I understood it.
Background
An Adjacency Matrix representation of a graph constructs a V x V matrix (where V is the number of vertices). The value of cell (a, b) is the weight of the edge linking vertices a and b, or zero if there is no edge.
Adjacency Matrix
A B C D E
--------------
A 0 1 0 3 2
B 1 0 0 0 2
C 0 0 0 4 3
D 3 0 4 0 1
E 2 2 3 1 0
Prim's Algorithm is an algorithm that takes a graph and a starting node, and finds a minimum spanning tree on the graph - that is, it finds a subset of the edges so that the result is a tree that contains all the nodes and the combined edge weights are minimized. It may be summarized as follows:
Place the starting node in the tree.
Repeat until all nodes are in the tree:
Find all edges that join nodes in the tree to nodes not in the tree.
Of those edges, choose one with the minimum weight.
Add that edge and the connected node to the tree.
Analysis
We can now start to analyse the algorithm like so:
At every iteration of the loop, we add one node to the tree. Since there are V nodes, it follows that there are O(V) iterations of this loop.
Within each iteration of the loop, we need to find and test the edges joining the tree to the remaining nodes. If there are E edges, the naive searching implementation uses O(E) to find the minimum-weight crossing edge.
So in combination, we should expect the complexity to be O(VE), which may be O(V^3) in the worst case.
However, jozefg gave a good answer to show how to achieve O(V^2) complexity.
Distance to Tree
            |  A   B   C   D   E
            |--------------------
Iteration 0 |  0   1*  #   3   2
          1 |  0   0   #   3   2*
          2 |  0   0   3   1*  0
          3 |  0   0   3*  0   0
          4 |  0   0   0   0   0
NB. # = infinity (not connected to tree)
    * = minimum weight edge in this iteration
Here the distance vector represents the smallest weighted edge joining each node to the tree, and is used as follows:
Initialize with the edge weights to the starting node A with complexity O(V).
To find the next node to add, simply find the minimum element of distance (and remove it from the list). This is O(V).
After adding a new node, there are O(V) new edges connecting the tree to the remaining nodes; for each of these determine if the new edge has less weight than the existing distance. If so, update the distance vector. Again, O(V).
Using these three steps reduces the searching time from O(E) to O(V), and adds an extra O(V) step to update the distance vector at each iteration. Since each iteration is now O(V), the overall complexity is O(V^2).
First of all, it's obviously at least O(V^2), because that is how big the adjacency matrix is.
Looking at http://en.wikipedia.org/wiki/Prim%27s_algorithm, you need to execute the step "Repeat until Vnew = V" V times.
Inside that step, you need to work out the shortest link between any vertex in Vnew and any vertex outside Vnew. Maintain an array of size V that holds, for each vertex, either infinity (if the vertex is already in Vnew) or the length of the shortest link between some vertex in Vnew and that vertex (initially this is just the lengths of the links between the starting vertex and every other vertex). To find the next vertex to add to Vnew, just search this array, at cost V. Once you have a new vertex, look at all the links from that vertex to every other vertex and see if any of them give a shorter link from Vnew to that vertex; if they do, update the array. This also costs V.
So you have V steps (V vertices to add), each taking cost V, which gives you O(V^2).

Floyd's Algorithm Explanation

Concerning
floyds(int a[][100], int n):
What does 'a' represent, and what does each of the two dimensions of a represent?
What does 'n' represent?
I have a list of locations and a list of connections between those locations, and I have computed the distance for each pair of locations that is directly connected. Now I need to find the shortest path between any two given locations (Floyd's), but I need to understand how to apply floyds(int a[][100], int n) to my locations array, city dictionaries, and connection arrays.
FYI - Using objective C - iOS.
n is the number of nodes in the graph.
a is the distance matrix of the graph: a[i][j] is the cost (or distance) of the edge from node i to node j.
(Also read the definition of adjacency matrix if you need more help with the concept.)
/* Assume a function edgeCost(i,j) which returns the cost of the edge from i to j
   (infinity if there is none).
   Also assume that n is the number of vertices and edgeCost(i,i) = 0.
*/

int path[][];
/* A 2-dimensional matrix. At each step in the algorithm, path[i][j] is the shortest path
   from i to j using intermediate vertices (1..k-1). Each path[i][j] is initialized to
   edgeCost(i,j).
*/

procedure FloydWarshall ()
    for k := 1 to n
        for i := 1 to n
            for j := 1 to n
                path[i][j] = min ( path[i][j], path[i][k]+path[k][j] );
See http://en.wikipedia.org/wiki/Floyd-Warshall; the Wikipedia article explains the algorithm well.
floyd-warshall(W)   // W is the adjacency-matrix representation of the graph
    n = W.rows
    for k = 1 to n
        for i = 1 to n
            for j = 1 to n
                W[i][j] = min(W[i][j], W[i][k] + W[k][j])
    return W
It's a DP algorithm. After the k-th iteration, W[i][j] is the length of the shortest path between i and j whose intermediate vertices (excluding i and j) come from the set {1,2,...,k}. In min(W[i][j], W[i][k]+W[k][j]), the first term W[i][j] is the shortest path computed at the (k-1)-th iteration; its intermediate vertices come from {1,2,...,k-1}, so it does not include vertex k. In W[i][k]+W[k][j] we include vertex k in the path. Whichever of the two is smaller is the shortest path at the k-th iteration.
Basically, we check whether we should include vertex k in the path or not.
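To connect this back to the original question: build the matrix a from your connection list and then run the triple loop. A C++ sketch (the Edge struct and allPairs are illustrative names, not part of the question's code; the same steps translate directly to Objective-C):

#include <algorithm>
#include <vector>
using namespace std;

const double INF = 1e18;   // stands in for "no connection"

struct Edge { int i, j; double dist; };   // one computed connection

// Build the distance matrix from the connection list, then run Floyd-Warshall.
// Locations are numbered 0..n-1; the result d[a][b] is the shortest distance
// between locations a and b (INF if no route exists).
vector<vector<double>> allPairs(int n, const vector<Edge>& edges) {
    vector<vector<double>> d(n, vector<double>(n, INF));
    for (int i = 0; i < n; ++i) d[i][i] = 0;            // edgeCost(i, i) = 0
    for (const Edge& e : edges)                         // undirected connections
        d[e.i][e.j] = d[e.j][e.i] = min(d[e.i][e.j], e.dist);
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                d[i][j] = min(d[i][j], d[i][k] + d[k][j]);
    return d;
}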
