Implement Dijkstra's Algorithm with a d-ary heap - complexity-theory

As you can see in this link: http://en.wikipedia.org/wiki/D-ary_heap#Applications
It says in Wikipedia that the optimal choice of d is d=m/n (it leads to a total time complexity of O(m logm/n n) )
It seems to me that this guess was pulled out of thin air. Is there a simple way of proving (or even explaining) that this is indeed the optimal d ?
Thanks in advance

From the explanation itself you can deduct that you have n delete min operations each requiring O(d log(n)/log(d)) and m decrease priority operations of O(log(n)/log(d)). The combined work is then (m*log(n)+n*d*log(n))/log(d).
If you fill in the suggested d value, the global behavior is as stated O(m*log(n)/log(d)). If you take any other d, one of the terms is bigger then the average, resulting in an increased complexity.

Related

How is pre-computation handled by complexity notation?

Suppose I have an algorithm that runs in O(n) for every input of size n, but only after a pre-computation step of O(n^2) for that given size n. Is the algorithm considered O(n) still, with O(n^2) amortized? Or does big O only consider one "run" of the algorithm at size n, and so the pre-computation step is included in the notation, making the true notation O(n+n^2) or O(n^2)?
It's not uncommon to see this accounted for by explicitly separating out the costs into two different pieces. For example, in the range minimum query problem, it's common to see people talk about things like an ⟨O(n2), O(1)⟩-time solution to the problem, where the O(n2) denotes the precomputation cost and the O(1) denotes the lookup cost. You also see this with string algorithms sometimes: a suffix tree provides an O(m)-preprocessing-time, O(n+z)-query-time solution to string searching, while Aho-Corasick string matching offers an O(n)-preprocessing-time, O(m+z)-query-time solution.
The reason for doing so is that the tradeoffs involved here really depend on the use case. It lets you quantitatively measure how many queries you're going to have to make before the preprocessing time starts to be worth it.
People usually care about the total time to get things done when they are talking about complexity etc.
Thus, if getting to the result R requires you to perform steps A and B, then complexity(R) = complexity(A) + complexity(B). This works out to be O(n^2) in your particular example.
You have already noted that for O analysis, the fastest growing term dominates the overall complexity (or in other words, in a pipeline, the slowest module defines the throughput).
However, complexity analysis of A and B will be typically performed in isolation if they are disjoint.
In summary, it's the amount of time taken to get the results that counts, but you can (and usually do) reason about the individual steps independent of one another.
There are cases when you cannot only specify the slowest part of the pipeline. A simple example is BFS, with the complexity O(V + E). Since E = O(V^2), it may be tempting to write the complexity of BFS as O(E) (since E > V). However, that would be incorrect, since there can be a graph with no edges! In those cases, you will still need to iterate over all the vertices.
The point of O(...) notation is not to measure how fast the algorithm is working, because in many specific cases O(n) can be significantly slower than, say O(n^3). (Imagine the algorithm which runs in 10^100 n steps vs. the one which runs in n^3 / 2 steps.) If I tell you that my algorithm runs in O(n^2) time, it tells you nothing about how long it will take for n = 1000.
The point of O(...) is to specify how the algorithm behaves when the input size grows. If I tell you that my algorithm runs in O(n^2) time, and it takes 1 second to run for n = 500, then you'll expect rather 4 seconds to for n = 1000, not 1.5 and not 40.
So, to answer your question -- no, the algorithm will not be O(n), it will be O(n^2), because if I double the input size the time will be multiplied by 4, not by 2.

Big Oh - O(n) vs O(n^2)

I've recently finished two tests for a data a structures class and I've got a question related to O(n) vs O(n^2) wrong twice. I was wondering if I could get help understanding the problem. The problem is:
Suppose that Algorithm A has runtime O(n^2) and Algorithm B has runtime O(n). What can we say about the runtime of these two algorithms when n=17?
a) We cannot say anything about the specific runtimes when n=17
b) Algorithm A will run much FASTER than Algorithm B
c) Algorithm A will run much SLOWER than Algorithm B
For both tests I answered C based on: https://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions. I knew B made no sense based on the link provided. Now I am starting to think that its A. I'm guessing its A because n is small. If that is the cases I am wondering when is n sufficiently larger enough that C would true.
There are actually two issues here.
The first is the one you mentioned. Orders of growth are asymptotic. They just say that there exists some n0 for which, for any n > n0, the function is bounded in some way. They say nothing about specific values of n, only "large enough" ones.
The second problem (which you did not mention), is that O is just an upper bound (as opposed to Θ), and so even for large enough n you can't compare the two. So if A = √n and B = n, then obviously B grows faster than A. However, A and B still fit the question, as √ n = O(n2) and n = O(n).
The answer is A.
Big Oh order of a function f(x) is g(x) if f(x)<=K*g(x) forall x>some real number
Big Oh of 3*n+2 and n is O(n) since 4*n is greater than both functions for all x>2 . since both the Big oh notation of the functions are same we cannot say that they run in the same time for some value.For example at n=0 the value of first function is 2 and the second one is 0
So we cannot exactly relate the running times of two functions for some value.
The answer is a): You can't really say anything for any specific number just given the big O notation.
Counter-example for c: B has a runtime of 1000*n (= O(n)), A has a runtime of n^2.
When doing algorithm analysis, specifically Big Oh, you should really only think about input sizes tending towards infinity. With such a small size (tens vs. thousands vs. millions), there is not a significant difference between the two. However, in general O(n) should run faster than O(n^2), even if it the difference is less than few milliseconds. I suspect the key word in that question is much.
My answer is based on my experience in competitive programming, which require a basic understanding of the O or called Big O.
When you talk about which one is faster and which one is slower, of course, basic calculation is done that. O(n) is faster than O(n^2), big oh is used based on worst case scenario.
Now when exactly that happen? Well, in competitive programming, we used 10^8 thumb rule. It's mean if an algorithm complexity is O(n) and then there is around n = 10^8 with time limit around 1 second, the algorithm can solve the problem.
But what if the algorithm complexity is O(n^2)? No, then, it will need around (10^8)^2 which is more than 1 second. (1-second computer can process around 10^8 operation).
So, for 1 second time, the max bound for O(n^2) is around 10^4 meanwhile for O(n) can do up to 10^8. This is where we can clearly see the different between the two complexity in 1 second time pass on a computer.

TSP: Worst case ratio grows

As part of my high school thesis I am describing the heuristics for the Traveling Salesman Problem.
I was reading this sort of case study (Page 8) but I cannot unserstand what these sentences mean:
The running time for NN as described is Θ(N^2
). [...]
In particular, we are guaranteed that NN (I)/OPT (I) ≤ ( 0. 5 ) ( log_{2} N + 1 ).
That part is very clear to me. But then:
No substantially
better guarantee is possible, however, as there are instances for which the ratio
grows as Θ( logN).
What's the meaning of there are instances for which?
The same thing happens with the greedy algorithm:
...but the worst
examples known for Greedy only make the ratio grow as ( logN)/( 3 log log N).
So what's the meaning of those statements? Does it have to do with non-Euclidean instances (I wouldn't think so because you just have to read through the column of a distance matrix in order to solve it)? Or just instances with e.g. multiple nodes at the same distance from the starting node that requires the algorithm to split the solution tree or something similar?
EDIT:
Thanks to #templatetypedef (your answer will still be accepted as correct), it all makes sense. However I would like to ask if someone knows any example (or even just a link) of these specific graphs (I doesn't matter of which algorithm). I don't think that it is too much off-topic, it would rather add something useful to the topic.
Take a look at these two statements side-by-side:
In particular, we are guaranteed that NN (I)/OPT (I) ≤ ( 0. 5 ) ( log_{2} N + 1 ).
No substantially better guarantee is possible, however, as there are instances for which the ratio grows as Θ( logN).
This first statement says that algorithm NN, in the worst case, produces an answer that's (roughly) within 1/2 lg N of the true answer (to see this, just multiply both sides by OPT(I)). That's great news! The natural follow-up question, then, is whether the actual bound is even tighter than that. For example, that result doesn't rule out the possibility that we might also have NN(I) / OPT(I) ≤ log log N or that NN(I) / OPT(I) ≤ 2. Those are much tighter bounds.
That's where the second statement comes in. This statement says that there are known instances of TSP (that is, specific graphs with weights on them) where the ratio NN(I) / OPT(I) is Θ(log n) (that is, the ratio is proportional to the log of the number of nodes in the graph). Because we already know about inputs like this, there's no possible way that NN(I) / OPT(I) could be bounded from above by something like log log n or 2, since those bounds are too tight.
However, that second statement in isolation is not very useful. We know that there are inputs that would cause the algorithm to produce something that's off by a log factor, but might it still be possible that there are inputs that cause it to be off by a lot more; say, by an exponential factor? With the first statement, we know that this can't happen, since we know we never get more than a log factor off.
Think of these statements in two steps: the first statement gives an upper bound on how bad the approximation can be - it's never worse than a 1/2 lg N + 1 factor of optimal. In the second statement gives a lower bound on how bad the approximation can be - there are specific cases where the algorithm can't do any better than a Θ(log N) approximation of the optimal answer.
(Note that the Θ(log n) here has nothing to do with the runtime - it's just a way of saying "something logarithmic.")
Going forward, keep an eye out for both upper bounds and lower bounds. The two collectively tell you a lot more than any one individually does.
Best of luck with the thesis!

What is meant when it's said that the union-find data structure will be "linear in the real world?"

I am undertaking the algorithms course on Coursera, there is a section where the author mentions the following
the running time of weighted quick union with path compression is
going be linear in the real world and actually could be improved to
even a more interesting function called the Ackermann function, which
is even more slowly growing than lg. And another point about this
is it seems that this is so close to being linear that is time proportional to N instead of time proportional to N times the
slowly growing function in N. Is there a simple algorithm that is
linear? And people, looked for a long time for that, and actually it
works out to be the case that we can prove that there is no such
algorithm. (emphasis added)
(You can find the entire transcript here)
In all other sources including Wikipedia "linear" is used when time increases proportionally with the input size, and in weighted quick-union with path compression this is certainly not the case.
What exactly is meant by "linear in the real world" here?
The runtime of m operations on a union-find data structure with path compression and union-by-rank is O(mα(m)), where α(m) is the inverse Ackermann function. This function is so slowly-growing that you cannot express an input to it for which the output is 6 in scientific notation. In other words, for any possible value of m that fits into the universe (or even that has size around 2num atoms in the universe), we have that α(m) ≤ 5. Therefore, for any "reasonable" input the cost of m operations will be O(m · 6) = O(m), which is linear.
Of course, the runtime isn't linear because α(m) does indeed grow, just very, very slowly. However, it's usually fine to approximate the runtime as O(m) because there's no possible way you'd ever notice the runtime of the function deviating from a simple linear function.
Hope this helps!
Here are some chunks from the transcript:
And what was proved
by Hopcroft Ulman and Tarjan was that if
you have N objects, any sequence of M
union and find operations will touch the
array at most a c (N + M lg star N) times.
And now, lg N is kind of a funny function....
And another point
about this is it< /i> seems that this is
so close to being linear that is t ime
proportional to N instead of time
proportional to N times the slowly growing
function in N.
(end quote)
You are pointing out that the cost of an individual operation grows very slowly with the number of objects, but they are looking at how the total cost of a number of operations grows with the number of objects involved so N times a per-operation cost that grows only very slowly with N is still just over linear in N.

Which complexity is better?

Assume that a graph has N nodes and M edges, and the total number of iterations is k.
(k is a constant integer, larger than 1, independent of N and M)
Let D=M/N be the average degree of the graph.
I have two graph-based iterative search algorithms.
The first algorithm has the complexity of O(D^{2k}) time.
The second algorithm has the complexity of O(k*D*N) time.
Based on their Big O time complexity, which one is better?
Some told me that the first one is better because the number of nodes N in a graph is usually much larger than D in real world.
Others said that the second one is better because k is exponentially increased for the first one, but is linearly increased for the second one.
Summary
Neither of your two O's dominate the other, so the right approach is to chose the algorithm based on the inputs
O Domination
The first it better when D<1 (sparse graphs) and similar.
The second is better when D is relatively large
Algorithm Selection
The important parameter is not just the O but the actual constant in front of it.
E.g., an O(n) algorithm which is actually 100000*n is worse than O(n^2) which is just n^2 when n<100000.
So, given the graph and the desired iteration count k, you need to estimate the expected performance of each algorithm and chose the best one.
Big-O notation describes how a function grows, when its arguments grow. So if you want to estimate growth of algorithm time consumption, you should estimate first how D and N will grow. That requires some additional information from your domain.
If we assume that N is going to grow anyway. For D you have several choices:
D remains constant - the first algorithm is definitely better
D grows proportionally to N - the second algorithm is better
More generally: if D grows faster than N^(1/(2k-1)), you should select the first algorithm, otherwise - the second one.
For every fixed D, D^(2k) is a constant, so the first algorithm will beat the second if M is large enough. However, what is large enough depends on D. If D isn't constant or limited, the two complexities cannot be compared.
In practice, you would implement both algorithms, find a good approximation for their actual speed, and depending on your values pick the one that will be faster.

Resources