I came across this problem while studying. It asks you to consider a data structure on which a sequence of n operations is performed. The kth operation has a cost of k if k is a perfect square and a cost of 1 otherwise. What is the total cost of the operations, and what is the amortized cost of each operation?
I am having a bit of difficulty coming up with a summation that captures the perfect-square costs in a form where I can see what the sum yields. Any thoughts/advice?
The sum of i^2 from 1 to n can be calculated as n(n+1)(2n+1)/6. I found it in a math book; see also http://mathworld.wolfram.com/Sum.html, formula (6).
To apply it here, note that over n operations the perfect squares are 1^2, 2^2, ..., m^2 with m = floor(sqrt(n)), so they contribute m(m+1)(2m+1)/6, which is proportional to m^3 = sqrt(n)^3 = n^(3/2). Adding a cost of 1 for each of the remaining operations, the total cost is O(n^(3/2)), and the amortized cost per operation is O(n^(3/2))/n = O(sqrt(n)).
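For intuition, here is a minimal Python sketch (a rough illustration of the argument above, not the poster's code) that compares the brute-force total against that closed form and shows the amortized cost growing like sqrt(n):

    import math

    def total_cost(n):
        """Total cost when the k-th operation costs k if k is a perfect square, else 1."""
        cost = 0
        for k in range(1, n + 1):
            r = math.isqrt(k)
            cost += k if r * r == k else 1
        return cost

    # Closed form: the squares up to n are 1^2..m^2 with m = floor(sqrt(n)),
    # so the total is (n - m) * 1 + m(m+1)(2m+1)/6 = Theta(n^(3/2)),
    # and the amortized cost per operation is Theta(sqrt(n)).
    for n in [100, 10_000, 1_000_000]:
        m = math.isqrt(n)
        closed = (n - m) + m * (m + 1) * (2 * m + 1) // 6
        t = total_cost(n)
        print(n, t, closed, t / n)  # last column grows roughly like sqrt(n)/3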
Say I need to calculate the time complexity of a function whose cost is
1⁶ + 2⁶ + 3⁶ + ... + n⁶. I am pretty sure this would be O(n⁷), but I only figure that because I know that the sum of i from i = 0 to n is in O(n²). I cannot find a simple closed-form formula for a summation of i^k. Can anyone provide more detail on how to actually calculate the time complexity?
Thanks!
An easy proof that it's Θ(n⁷) is to observe that:
1⁶+2⁶+3⁶+...+n⁶ <= n⁶+n⁶+...+n⁶ = n⁷
(replacing all numbers with n makes the sum larger).
and
1⁶+2⁶+3⁶+...+n⁶ >= (n/2+1)⁶+...+n⁶ >= (n/2)⁶+(n/2)⁶+...+(n/2)⁶ = n⁷/2⁷
(in the first step, we discard the terms with base at most n/2, and in the second step we replace each remaining base with n/2. Both steps can only decrease the sum.) (Note: I've assumed n is even, but you can extend to odd n with a bit of minor fiddling.)
Thus 1⁶+2⁶+3⁶+...+n⁶ is bounded above and below by a constant factor of n⁷ and so by definition is Θ(n⁷).
As David Eisenstat suggests in the comments, another proof is to consider the (continuous) graphs y=x⁶ and y=(x+1)⁶ from 0 to n. The areas under these curves bound the sum from below and above, and are readily calculated via integrals: the first is n⁷/7 and the second is ((n+1)⁷-1)/7. This shows that the sum is n⁷/7 + o(n⁷).
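As a quick numerical sanity check (an illustration, not part of either proof), the ratio of the sum to n⁷ indeed approaches 1/7:

    # Numerical check that 1^6 + 2^6 + ... + n^6 grows like n^7 / 7.
    for n in [10, 100, 1000]:
        s = sum(i**6 for i in range(1, n + 1))
        print(n, s / n**7)  # ratio approaches 1/7 ~= 0.142857 as n grows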
I have been stuck on this exercise from my professor for two days now:
"Consider the ordered binary tree over a set S ⊆ ℕ, built by repeated insertion, where the elements of S are inserted in the order of a permutation picked uniformly at random. Prove that the height of the tree is O(log n) with high probability."
My work so far has been to study the probabilistic analysis of randomized algorithms. For example, the CLRS book has a chapter "12.4 Randomly built binary search trees" where it is proven that the expected height of a binary search tree built by repeated insertion over a random permutation is O(log n). Many other books prove this bound. But this is not what we are looking for. We want to prove a much stronger statement: that the height is O(log n) with high probability. I've studied the classic paper "A Note on the Height of Binary Search Trees" (Luc Devroye, 1986), where he proves that the height is ~ 4.31107... log n with high probability, but the analysis is way out of my league; I couldn't follow the key points of the paper.
Every book and article I've seen cites Devroye's paper and says "it can also be proven that with high probability the height is O(log n)".
How should I proceed further?
Thanks in advance.
I will outline my best idea based on well-known probability results. You will need to add details and make it rigorous.
First let's consider the process of descending through pivots to a random node in a binary search tree. Suppose that your random node is known to be somewhere in a range of m consecutive values. At the next step which adds to the length of the path, a pivot is picked at some rank j within that range (1 <= j <= m). With probability (j-1)/m our random node is below the pivot and is now in a range of length j-1. With probability 1/m it was the pivot itself. And with probability (m-j)/m it is above the pivot and is now in a range of length m-j. Within those ranges, the unknown node is still uniformly distributed.
The obvious next step is to pass to a continuous approximation. We pick a random real number x from 0 to m to be the next pivot. With probability x/m we are then in a range of size x, and with probability (m-x)/m we are in a range of size m-x. We have therefore shrunk our range by a random factor X that is either x/m or (m-x)/m. The distribution of X is known, and the successive factors sampled in the continuous approximation are independent.
One more note. log(X) has both an expected value E and a variance V that can be calculated. Since X is always between 0 and 1, its expected value is negative.
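As a sanity check (a side illustration, not part of the argument), here is a quick Monte Carlo sketch of E and V under the continuous model above, with the range normalized so that m = 1:

    import math
    import random

    # Continuous model from above, normalized to m = 1: pick x uniformly in
    # (0, 1); with probability x the node lands in the part of size x,
    # otherwise in the part of size 1 - x.  X is the resulting shrink factor.
    random.seed(0)
    samples = []
    for _ in range(1_000_000):
        x = random.random()
        shrink = x if random.random() < x else 1.0 - x
        samples.append(math.log(shrink))

    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    print(mean, var)  # E comes out negative (around -0.5 here), V is a finite constant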
Now pick ε with 0 < ε. The outline of the proof is as follows.
Show that at each step, the expected error from the discrete to the continuous approximation increases by at most O(1).
Show that the probability that the sum of (ε - 1/E) log(n) samples of log(X) fails to be below -log(n) is O(1/n).
Show that the probability that a random node fails to be at depth (2ε - 1/E) log(n) or less is O(1/n).
Show that the probability that a random permutation has ANY node at depth (3ε - 1/E) log(n) or more is O(1/log(n)).
Let's go.
1. Show that at each step, the error from the discrete to the continuous approximation increases by at most O(1).
The two roundoffs in each step are each at most 1, and the errors carried over from the previous step shrink, but do not increase, in the next step. So the error increases by at most 2 per step.
2. Show that the probability that the sum of (ε - 1/E) log(n) samples of log(X) fails to be below -log(n) is O(1/n).
The expected value of the sum of (ε - 1/E) log(n) samples of log(X) is (ε E - 1) log(n). Since E is negative, this is below our target of -log(n). Now we can use the Bernstein inequalities to bound the probability of being that far from the mean. That bound is exponentially small in the number of samples; since we have Θ(log(n)) samples, it works out to O(1/n).
3. Show that the probability that a random node fails to be at depth (2ε - 1/E) log(n) or less is O(1/n).
With probability 1 - O(1/n), after (ε - 1/E) log(n) steps the continuous approximation has shrunk its range to within 1. There were O(log(n)) steps, and therefore the error between the continuous and discrete processes is at most O(log(n)) (look back at step 1 to see that). So we just have to show that the odds of failing to go from O(log(n)) possibilities down to 1 in another ε log(n) steps is at most O(1/n).
This would follow if we could show that, for any given constant a, the odds of failing to go from a·k possibilities down to 1 within k steps is exponentially small in k. (Here k is ε log(n).)
For that, record a 1 every time the next pivot cuts the current range at least in half, and a 0 otherwise; each step is a 1 with probability at least 1/2, and in k steps there are 2^k possible 0/1 sequences. For any fixed a, once k is large enough, cutting the range in half k/4 times is more than enough to reduce a·k possibilities to 1, so failure requires fewer than k/4 ones among the k steps. A little playing around with the binomial formula and Stirling's approximation gives an upper bound on the probability of that happening of the form O(k · c^k) for some constant c < 1, which is sufficient for our purposes.
4. Show that the probability that a random permutation has ANY node at depth at least (3ε - 1/E) log(n) is O(1/log(n)).
By step 3, the proportion of random nodes in random binary trees that are at depth at least (2ε - 1/E) log(n) is at most p/n for some constant p.
Any tree with a node at depth (3ε - 1/E) log(n) has n nodes, of which at least ε log(n) (the ancestors along that deep path) are at depth at least (2ε - 1/E) log(n). So if the probability of having a node at depth (3ε - 1/E) log(n) exceeded p / (ε log(n)), those trees alone would contribute too many nodes of depth at least (2ε - 1/E) log(n). Therefore, by the pigeonhole principle, we have our upper bound on the likelihood.
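To complement the outline, here is a small empirical sketch (an illustration only, not part of the proof) that builds binary search trees from random permutations and compares their heights with log(n):

    import math
    import random

    def bst_height(perm):
        """Height of the BST obtained by inserting the keys of perm in order."""
        root = perm[0]
        children = {root: [None, None]}  # key -> [left child, right child]
        height = 0
        for key in perm[1:]:
            node, depth = root, 1
            while True:
                side = 0 if key < node else 1
                child = children[node][side]
                if child is None:
                    children[node][side] = key
                    children[key] = [None, None]
                    height = max(height, depth)
                    break
                node, depth = child, depth + 1
        return height

    random.seed(1)
    n = 10_000
    heights = [bst_height(random.sample(range(n), n)) for _ in range(20)]
    # The heights cluster around a constant multiple of log(n); Devroye's
    # 4.31107... log(n) is the asymptotic high-probability bound.
    print(max(heights), max(heights) / math.log(n))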
I have to find the time complexity of a binary search that calculates the dividing point as mid = high - 2 (instead of mid = (low + high)/2), so as to know how much slower or faster the modified algorithm would be.
The worst-case scenario is that the searched item is the very first one. In this case, since you always subtract 2 from n, you will have roughly n/2 steps, which is linear complexity. The best case is that the searched item is exactly at position n - 2, which takes constant time. The average complexity, as n -> infinity, is linear as well.
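To make that concrete, here is one hedged reading of the modified search in Python (a sketch, not code from the question), with a guard so mid never drops below low; it discards only a constant number of elements per comparison, hence the linear worst case described above:

    def modified_binary_search(arr, target):
        """'Binary' search over a sorted list that uses mid = high - 2."""
        low, high = 0, len(arr) - 1
        while low <= high:
            mid = max(high - 2, low)   # guard so mid never falls below low
            if arr[mid] == target:
                return mid
            elif arr[mid] < target:
                low = mid + 1          # at most two elements remain on the right
            else:
                high = mid - 1         # only a constant chunk is discarded
        return -1

    print(modified_binary_search(list(range(100)), 0))   # worst case: a linear number of probes
    print(modified_binary_search(list(range(100)), 97))  # best case: found immediately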
Hint: You can derive the answer based on the recurrence formula for binary search.
We have T(n) = T(floor(n/2)) + O(1)
Since we divide into two equal halves, we have floor(n/2). You should rewrite the given formula to describe the modified version. Furthermore, since the modified version divides the range into two very unbalanced parts, you could use the Akra-Bazzi method to solve the resulting recurrence.
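For example, under one reading of mid = high - 2 (a sketch of what the rewritten recurrence might look like, not the poster's own derivation):

    % Each comparison splits the current range of size n into two very
    % unbalanced parts of sizes roughly n - 3 and 2.  In the worst case we
    % always recurse into the larger part:
    T(n) = T(n - 3) + O(1) \;\Longrightarrow\; T(n) = \Theta(n)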
I feel stupid for asking this question, but...
For the "closest pair of points" problem (see this if unfamiliar with it), why is the worst-case running time of the brute-force algorithm O(n^2)?
If, say, n = 4, then there would only be 12 possible pairs of points to compare in the search space, if we also count comparing two points in either direction. If we don't compare two points twice, then it's 6.
O(n^2) doesn't add up to me.
The actual number of comparisons is n(n-1)/2, or n^2/2 - n/2.
But, in big-O notation, you are only concerned with the dominant term. At very large values of n, the n/2 term becomes less important, as does the 1/2 coefficient on the n^2 term. So we just say it's O(n^2).
Big-O notation isn't meant to give you the exact formula for the time taken or number of steps. It only gives you the order of the complexity/time so you can get a sense of how it scales for large inputs.
Applying brute force, we are forced to check all the possible pairs. Assuming N points, for each point there are N-1 other points to which we need to calculate the distance, so the total number of distances calculated is N * (N-1). But in the process we double-counted: the distance from A to B is the same whether we compute it as A to B or B to A. Hence N*(N-1)/2, which is O(N^2).
In big-O notation, you can factor out multiplied constants, so:
O(k*(n^2)) = O(n^2)
The idea is that the constant (1/2 in the OP's example, since distance is symmetric) doesn't really tell us anything new about the complexity. It still grows with the square of the input.
In the brute-force version of the algorithm you compare all possible pairs of points. For each of the n points you have (n - 1) other points to compare against, and if we take every pair once we end up with (n * (n - 1)) / 2 comparisons. The worst-case complexity of O(n^2) means that the number of operations is bounded by k * n^2 for some constant k. Big-O notation can't tell you the exact number of operations, only a function it is proportional to (up to a constant) as the size of the data (n) increases.
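For reference, a small brute-force sketch (a generic illustration, not code from the question) that makes the pair counting explicit:

    import math
    from itertools import combinations

    def closest_pair(points):
        """Brute-force closest pair; also counts the distance computations."""
        best = (math.inf, None, None)
        comparisons = 0
        for p, q in combinations(points, 2):  # each unordered pair exactly once
            comparisons += 1
            d = math.dist(p, q)
            if d < best[0]:
                best = (d, p, q)
        # combinations(points, 2) yields n*(n-1)/2 pairs, hence O(n^2) work
        return best, comparisons

    pts = [(0, 0), (3, 4), (1, 1), (5, 2)]
    (dist, p, q), cmps = closest_pair(pts)
    print(dist, p, q, cmps)  # for n = 4 points: 4*3/2 = 6 comparisons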
I'm reading this paper Product quantization for nearest neighbor search.
On the last row of Table II, page 5, it says that "the complexity given in this table for searching the k smallest elements is the average complexity for n >> k and when the elements are arbitrarily ordered", which is n + k log k log log n.
I guess we can use a linear selection algorithm to get the k nearest neighbors (unsorted) in O(n), and then sort those k neighbors in O(k log k), so we would have O(n + k log k) in total. But where does the log log n term come from?
The paper gives a reference to the TAOCP book for this, but I don't have the book at hand; could anyone explain it for me?
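For what it's worth, here is a small sketch of the reasoning in the question (an illustration, not the paper's exact procedure), using NumPy's introselect-based np.partition for the average-case linear selection step:

    import numpy as np

    # Select the k smallest of n distances, then sort just those k.
    rng = np.random.default_rng(0)
    n, k = 1_000_000, 100
    distances = rng.random(n)

    smallest_k = np.partition(distances, k)[:k]  # introselect: O(n) on average
    k_nearest = np.sort(smallest_k)              # O(k log k)
    print(k_nearest[:5])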
First, Table II reports the complexity of each of the steps, so you have to add all the terms to measure the complexity of ADC.
The last line of the table gives a single complexity for both SDC and ADC, which is:
n + k log k log log n
This term corresponds to the average cost of the selection algorithm that we employ to find the k smallest values in a set of n variables, which we copied from Donald Knuth's book [25].
I don't have the book at hand so I cannot check, but it sounds right.
From the authors