Floyd–Warshall algorithm

Is there a simple explanation for why this snippet finds the shortest distance between two vertices
for (k = 0; k < n; ++k)
    for (i = 0; i < n; ++i)
        for (j = 0; j < n; ++j)
            if (d[i][k] + d[k][j] < d[i][j])
                d[i][j] = d[i][k] + d[k][j]
and this doesn't
for (i = 0; i < n; ++i)
    for (j = 0; j < n; ++j)
        for (k = 0; k < n; ++k)
            if (d[i][k] + d[k][j] < d[i][j])
                d[i][j] = d[i][k] + d[k][j]
(the for loop over k is the innermost one in the second snippet)

Because the idea is to improve every i - j path by trying, at each step, to route it through node k.
The names do not matter: you can use i, j, k as the loop variables instead of k, i, j if you want, but you must keep the logic above in mind. In that case, you will be trying to improve the j - k paths by going through i at each step:
for i = 0, n
    for j = 0, n
        for k = 0, n
            if d[j, i] + d[i, k] < d[j, k]
                d[j, k] = d[j, i] + d[i, k]
You cannot just reorder the for loops without also changing the condition, because that gives you a different algorithm - who knows what it does.

In
for (k = 0; k < n; ++k)
    for (i = 0; i < n; ++i)
        for (j = 0; j < n; ++j)
            if (d[i][k] + d[k][j] < d[i][j])
                d[i][j] = d[i][k] + d[k][j]
The outermost loop over k refers to the vertices that may lie on the path between Vi and Vj. So when k=1, for example, you are considering all paths between vertices Vi and Vj that include vertex V1, as in
Vi .... V1 .... Vj
More importantly, from among those paths you are choosing the best with the relaxation
if (d[i][k] + d[k][j] < d[i][j])
    d[i][j] = d[i][k] + d[k][j]
Again, each iteration is focused on two vertices Vi and Vj, and it chooses the best path between them.
In your other instance, the one that fails, you are not choosing the best among the paths between two fixed vertices Vi and Vj; instead you are relaxing all over the place, never waiting long enough to find out which path between two fixed vertices is the best.
On Geekviewpoint, a site which I rely on a lot, they consistently use u and v for the vertices and t for the outermost loop, which makes it easy to remember that t is the temporary (intermediate) vertex and so not one of the endpoints. (I wish they had actually explained it, since it's not obvious to everyone.)
// dynamically find the shortest distance between each pair
for (int t = 0; t < n; t++) {
    for (int v = 0; v < n; v++) {
        for (int u = 0; u < n; u++) {
            if (dist[v][u] > (long) dist[v][t] + dist[t][u]) {
                dist[v][u] = dist[v][t] + dist[t][u];
                pred[v][u] = pred[t][u];
            }
        }
    }
}

I found a counterexample for the second, flawed algorithm.
When i=0, j=1 it tries to find an intermediary, but none exists yet.
Later, when an intermediary does become available for i=0, j=1, that pair is never checked again.

Basically, when the outer loop is at value k, it means you are about to admit another vertex: every way to go from i to j using only the intermediate vertices 1..k-1 has already been computed.
Then you admit vertex k and check again whether there is a cheaper way to go from i to j through it, so you write d[i][j] = min(d[i][j], d[i][k] + d[k][j]).
So if you want to write
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
your update should be d[j][k] = min(d[j][k], d[j][i] + d[i][k])


summary of the algorithm of K sum

It is the well-known Twelvefold way:
https://en.wikipedia.org/wiki/Twelvefold_way
where we want to find the number of solutions to the following equation:
X1 + X2 + ... + XK = target
from the given array:
vector<int> vec(N);
We can assume vec[i] > 0. There are 3 cases; for example, with
vec = {1,2,3}, target = 5, K = 3:
The Xi may repeat and solutions may repeat (order matters).
6 solutions: {1,2,2}, {2,1,2}, {2,2,1}, {1,1,3}, {1,3,1}, {3,1,1}
The Xi may repeat but solutions may not repeat.
2 solutions: {1,2,2}, {1,1,3}
The Xi may not repeat and solutions may not repeat.
0 solutions.
The idea must be to use dynamic programming:
dp[i][k] = the number of solutions for target = i, K = k.
And the recurrence relation is:
if (i >= vec[n]) dp[i][k] += dp[i - vec[n]][k - 1];
For the three cases, the results depend on the running order of i, n, k. I know the answer when there is no restriction on K (a sum of any number of variables):
case 1:
int KSum(vector<int>& vec, int target) {
    vector<int> dp(target + 1);
    dp[0] = 1;
    for (int i = 1; i <= target; ++i)
        for (int n = 0; n < vec.size(); n++)
            if (i >= vec[n]) dp[i] += dp[i - vec[n]];
    return dp.back();
}
case 2:
for (int n = 0; n < vec.size(); n++)
    for (int i = 1; i <= target; ++i)
case 3:
for (int n = 0; n < vec.size(); n++)
    for (int i = target; i >= 1; --i)
When there is additional variable k, do we just simply add the for loop
for(int k = 1; k <= K; k++)
at the outermost layer?
EDIT:
I tried case 1, just adding the for loop over K innermost:
int KSum(vector<int> vec, int target, int K) {
    vector<vector<int>> dp(K + 1, vector<int>(target + 1, 0));
    dp[0][0] = 1;
    for (int n = 0; n < vec.size(); n++)
        for (int i = 1; i <= target; ++i)
            for (int k = 1; k <= K; k++)
                if (i >= vec[n]) dp[k][i] += dp[k - 1][i - vec[n]];
    return dp[K][target];
}
Is it true for case 2 and case 3?
In your solution without the variable K, dp[i] represents how many solutions there are that achieve sum i.
Including the variable K means that we have added another dimension to our subproblem. This dimension doesn't have to be on a specific axis: your dp array could look like dp[i][k] or dp[k][i].
dp[i][k] means: how many solutions accumulate sum i using k numbers (duplicate or unique).
dp[k][i] means: using k numbers, how many solutions accumulate sum i.
Both are the same thing, meaning that you can add the loop outside or inside.

Invert Arnold's Cat map - negative array indexes

I'm trying to implement Arnold's Cat map for N*N images using the following formula
for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        desMatrix[(i + j) % N][(i + 2 * j) % N] = srcMatrix[i][j];
    }
}
To invert the process I do:
for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        srcMatrix[(j - i) % N][(2 * i - j) % N] = desMatrix[i][j];
    }
}
Is the implementation correct?
It seems to me that for certain values of j and i I might get negative indexes from (j-i) and (2*i-j); how should I handle those cases, since matrix indexes are only positive?
In general, when a modulo (%) operation needs to work on negative indexes, you can simply add the modulus as many times as needed. Since
x % N == (x + a*N) % N
for every natural number a, and in this case you have i and j constrained to [0, N), you can write (N + j - i) and ensure that even if j is 0 and i is N-1, the result is always non-negative. By the same token, (2*N + 2*i - j) is always non-negative.
In this case, though, this is not necessary. To invert your map, you can repeat the forward step with the assignment reversed. Since the map's matrix has determinant 1, it is area-preserving and a bijection on the N*N grid, so you are assured of getting all your points back (i.e. a covering of M(i+1) yields a covering of M(i)).
for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        newMatrix[i][j] = desMatrix[(i + j) % N][(i + 2 * j) % N];
    }
}
At this point newMatrix and srcMatrix ought to be identical.
(Actually, you're already running the reverse transformation as your forward one: the one I set up to reverse yours is the commonly used form of the forward transformation.)

Algorithmic complexity of O(N)

I recently started playing with algorithms from this Princeton course and I observed the following pattern:
O(N)
double max = a[0];
for (int i = 1; i < N; i++)
    if (a[i] > max) max = a[i];
O(N^2)
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        if (a[i] + a[j] == 0)
            cnt++;
O(N^3)
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        for (int k = j+1; k < N; k++)
            if (a[i] + a[j] + a[k] == 0)
                cnt++;
The common pattern here is that as the loop nesting grows, the exponent also increases.
Is it safe to assume that if I have 20 for loops my complexity would be O(N^20)?
PS: Note that 20 is just a random number I picked, and yes if you nest 20 for loops in your code there is clearly something wrong with you.
It depends on what the loops do. For example, if I change the end of the 2nd loop to just do 3 iterations like this:
for (int i = 0; i < N; i++)
    for (int j = i; j < i+3; j++)
        if (a[i] + a[j] == 0)
            cnt++;
we get back to O(N)
The key is whether the number of iterations in the loop is related to N and increases linearly as N does.
Here is another example, where the 2nd loop goes to N^2:
for (int i = 0; i < N; i++)
    for (int j = i; j < N*N; j++)
        if (a[i] + a[j] == 0)
            cnt++;
This would be O(N^3).
Yes, if the length of the loop is proportional to N and the loops are nested within each other like you described.
In your specific pattern, yes. But it is not safe to assume that in general. You need to check whether the number of iterations in each loop is O(n) regardless of the state of all the enclosing loops. Only after you have verified that this is the case can you conclude that the complexity is O(n^(loop nesting depth)).
Yes. Even though you shrink the iteration interval (e.g. starting j at i+1), Big-O notation describes behavior as N increases towards infinity, and since all your loops' lengths grow in proportion to N, such an algorithm would indeed have time complexity O(N^20).
I strongly recommend that you understand why a doubly nested loop, with each loop running from 0 to N, is O(N^2). Use summations to count the number of steps in the for loops, then drop constants and lower-order terms, and you will get the Big-Oh of the algorithm.

Big -Oh Computation (refresher help)

Okay, so I have a midterm later today and one of the items I am reviewing is big-O. Now, I did the homework way back in the day and got 100%... but I can't find it now and I am unsure of what I am doing. So could someone give me an explanation as to what I am doing wrong... and if I am doing it right, well, maybe you know why I am doubting myself?
Thanks!
Also, I remember that for my homework I was using summations, working from the inside out. When I finished each summation I used some "formula" to calculate the highest power of n, kept that value, and moved on to the next summation, and so on until all the summations were completed.
Problem 1.
sum = 0;
for (i = 0; i < n; i++)
    sum++;
So, since I forgot the whole summation aspect of this, my gut instinct tells me this is O(N), because the maximum number of iterations is N... since it is just one for loop.
Problem 2.
sum = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        sum++;
For this one, I "think" it is O(N^2) for the highest run time, since both loops depend on n, and the body can run N * N times in total.
Problem 3.
sum = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < n * n; j++)
        sum++;
This is where I get stuck... I feel like I actually need to use the summation layout along with the formula for adding them up. The inner loop can maximize at n*n, so n^2. On top of that, it can maximize at N again for the outermost loop... so I would guess O(N^3).
Problem 4.
sum = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < i; j++)
        sum++;
Again, I am more lost on this one. The inner loop can run i times... which depends on i, which in turn depends on N... So I see several interacting bounds, and I am honestly unsure how to combine them into a maximized runtime. (I really need to remember that summation setup and formula.)
Same goes for the next problems, no clue where to start, and I'd rather not try to because I don't want to get the wrong thinking in my head. I am positive once I see the formula again it will instantly click, because I got it before...I just lost it somehow.
Any help appreciated!
Problem 5:
sum = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < i * i; j++)
        for (k = 0; k < j; k++)
            sum++;
Problem 6:
sum = 0;
for (i = 1; i < n; i++)
    for (j = 1; j < i * i; j++)
        if (j % i == 0)
            for (k = 0; k < j; k++)
                sum++;
For problems 4 to 6, write out the summations and work from the inside out.
E.g. problem 4:
inner loop - iterates from 0 to (i-1), which gives us i iterations.
outer loop - sums that over i = 0 .. n-1, giving 0 + 1 + ... + (n-1) = n(n-1)/2.
combined - n(n-1)/2 is O(n^2), not O(n): the inner bound i is not a constant, it grows with n. The same technique (sum the inner counts, then simplify) handles problems 5 and 6.

Are these 2 knapsack algorithms the same? (Do they always output the same thing)

In my code, assuming C is the capacity, N is the number of items, w[j] is the weight of item j, and v[j] is the value of item j, does it do the same thing as the 0-1 knapsack algorithm? I've been trying my code on some data sets, and it seems to be the case. The reason I'm wondering is that the 0-1 knapsack algorithm we've been taught is 2-dimensional, whereas this is 1-dimensional:
for (int j = 0; j < N; j++) {
    if (C - w[j] < 0) continue;
    for (int i = C - w[j]; i >= 0; --i) { // loop backwards to prevent double counting
        dp[i + w[j]] = max(dp[i + w[j]], dp[i] + v[j]); // looping fwd is for the unbounded problem
    }
}
printf("max value without double counting (loop backwards) %d\n", dp[C]);
Here is my implementation of the 0-1 knapsack algorithm: (with the same variables)
for (int i = 0; i < N; i++) {
    for (int j = 0; j <= C; j++) {
        if (j - w[i] < 0) dp2[i][j] = i == 0 ? 0 : dp2[i - 1][j];
        else dp2[i][j] = max(i == 0 ? 0 : dp2[i - 1][j],
                             (i == 0 ? 0 : dp2[i - 1][j - w[i]]) + v[i]); // guard i == 0 so row -1 is never read
    }
}
printf("0-1 knapsack: %d\n", dp2[N - 1][C]);
Yes, your algorithm gets you the same result. This enhancement to the classic 0-1 Knapsack is reasonably popular: Wikipedia explains it as follows:
Additionally, if we use only a 1-dimensional array m[w] to store the current optimal values and pass over this array i + 1 times, rewriting from m[W] to m[1] every time, we get the same result for only O(W) space.
Note that they specifically mention your backward loop.
