Binary Counter Amortized Analysis - algorithm

I guess you already know that if all the entries in the array start at 0 and at each step we increment the counter by 1 (by flipping 0's and 1's), then the amortized cost of k increments is O(k).
But what happens if the array starts at the value n? I thought that maybe the complexity of k increments is now O(log(n) + k), because at the beginning the number of 1's is at most log(n).
Any suggestions?
Thanks in advance

You are right. There is more than one way to prove this; one of them is with a potential function. This link (and many others) explains the potential method. However, textbooks usually require that the initial value of the potential function be 0. Let's generalise to the case where it is not.
For the binary counter, the potential function of the counter is the number of bits set to 1. When you increment, you spend k+1 time to flip k 1's to 0 and one 0 to 1. The potential decreases by k-1. So the amortised time of this increment = ActualTime+(PotentialAfter-PotentialBefore) = k+1-(k-1) = 2 (constant).
Now look at the section "Relation between amortized and actual time" in the wikipedia link.
TotalAmortizedTime = TotalActualTime + SumOfChangesToPotential
Since the SumOfChangesToPotential is telescoping, it is equal to FinalPotential-InitialPotential. So:
TotalAmortizedTime = TotalActualTime + FinalPotential-InitialPotential
Which gives:
TotalActualTime = TotalAmortizedTime - FinalPotential + InitialPotential <= TotalAmortizedTime + InitialPotential
So, as you say, the total time for a sequence of k increments starting with n is O(log n + k).
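If you want to convince yourself numerically, here is a minimal Python sketch (the function name and the LSB-first representation are my own choices, not from the question) that counts the actual flips for k increments starting from the value n and compares them with the bound 2k + InitialPotential:
def total_flip_cost(start, k):
    # Binary counter stored LSB-first; count every bit flip over k increments.
    bits = [int(b) for b in reversed(bin(start)[2:])]
    flips = 0
    for _ in range(k):
        i = 0
        while i < len(bits) and bits[i] == 1:
            bits[i] = 0          # flip a trailing 1 to 0
            flips += 1
            i += 1
        if i == len(bits):
            bits.append(0)       # grow the counter if it overflowed
        bits[i] = 1              # flip the first 0 to 1
        flips += 1
    return flips

n, k = 1234567, 1000
print(total_flip_cost(n, k), "<=", 2 * k + bin(n).count("1"))
The right-hand side is exactly TotalAmortizedTime + InitialPotential from the derivation above, so the printed inequality should always hold.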

Related

Finding the theoretical bound of local spikes in an array

You are given an array A[1..n], which consists of randomly permuted distinct integers.
An element A[i] of this array is said to be a local spike if it is larger than all of its preceding elements (in other words, A[i] > A[j] for all j < i).
Show that the expected number of local spikes in A is O(log n).
If anybody can give me pointers to this question, it would be much appreciated!
It is similar to the reasoning about quicksort's time complexity.
So even though it is more about statistics, it can serve as a nice example of reasoning about algorithm complexity. Maybe it would be more suited to the CS Stack Exchange than to Statistics? That being said, let's dive into the rabbit hole.
First, since all the numbers are distinct, we can omit the part about an array of random integers and simply take the integers 1, 2, ..., N without loss of generality.
Now we can change the way of looking at the problem. Instead of having the array we can say that we are choosing a random number from the range 1..N without repetition.
Another observation is that by choosing a number X, regardless of whether it is a local spike or not, we disqualify all the smaller numbers from ever being a local spike.
Since we are now choosing the numbers, we can thus discard every Y with Y < X from the candidate pool. This is valid because, regardless of the position of a number smaller than the spike, nothing changes for the subsequent spikes: a spike always has to be bigger than the maximum of the previous elements.
So the question becomes how many times can we repeat this procedure of:
Select a number from the pool of candidates as a new spike
Discard all the lower numbers
Before we discard the whole candidate pool (starting with the full 1..N range). Not surprisingly, this is almost the same as the expected depth of quicksort's recursion, which is log(n).
A quick explanation if you don't want to check the wiki: most of the time, we will discard about half of the candidates. Sometimes less, sometimes more, but in the long run half is a rather good estimate. A more in-depth explanation can be found here.
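As a sanity check of this picture, here is a small simulation (a sketch under my own assumptions; the helper name is made up) of the choose-and-discard procedure. The average number of rounds tracks the harmonic number H_N, which is about ln N:
import math
import random

def rounds_until_pool_empty(N):
    # Repeatedly pick a random remaining candidate as the next spike,
    # then discard it together with every smaller number.
    low, rounds = 1, 0
    while low <= N:
        pick = random.randint(low, N)
        low = pick + 1
        rounds += 1
    return rounds

N, trials = 10_000, 2_000
avg = sum(rounds_until_pool_empty(N) for _ in range(trials)) / trials
print(avg, math.log(N))   # avg is about H_N = 1 + 1/2 + ... + 1/N, close to ln N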
An elegant way to determine the solution to this problem is the following:
Define binary random variables X1, X2, ..., Xn by
Xi = 1 if A[i] is a local spike
Xi = 0 if A[i] is not a local spike
We see that the total number of local spikes is always the sum of the Xi. And we know that
E[X1 + X2 + ... + Xn] = E[X1] + E[X2] + ... + E[Xn]
by the linearity of expectation. So we must now turn our attention to deducing E[Xi] for each i.
Now E[Xi] = P(A[i] is a spike). What is the probability that A[i] > A[j] for all j < i?
This is just the probability that the maximum element of A[1], A[2], ..., A[i] is A[i]. But this maximum element could be located anywhere from A[1] to A[i] with equal probability. So the probability is 1/i that the maximum element is A[i].
So E[Xi] = 1/i. Then we see that
E[total number of spikes] = E[X1] + E[X2] + ... + E[Xn] = 1/1 + 1/2 + ... + 1/n
This is the nth harmonic number, Hn. And it is well known that Hn ~ ln(n). This is because ln(n) <= Hn <= ln(n) + 1 for all n (easy proof involving Riemann sums, but requires a smidge of calculus). This shows that there are O(log n) spikes, on average.
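A quick empirical check of both the 1/i probabilities and the harmonic-number total (the names below are mine, chosen just for this sketch):
import math
import random

def count_spikes(perm):
    best, spikes = float("-inf"), 0
    for x in perm:
        if x > best:        # larger than everything before it
            spikes += 1
            best = x
    return spikes

n, trials = 1_000, 2_000
avg = sum(count_spikes(random.sample(range(n), n)) for _ in range(trials)) / trials
H_n = sum(1 / i for i in range(1, n + 1))
print(avg, H_n, math.log(n))   # avg is close to H_n, and H_n is within 1 of ln(n)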

What is the time complexity of this BFS algorithm?

I looked at LeetCode question 270. Perfect Squares:
Given an integer n, return the least number of perfect square numbers that sum to n.
A perfect square is an integer that is the square of an integer; in other words, it is the product of some integer with itself. For example, 1, 4, 9, and 16 are perfect squares while 3 and 11 are not.
Example 1:
Input: n = 12
Output: 3
Explanation: 12 = 4 + 4 + 4.
I solved it using the following algorithm:
def numSquares(n):
    # Candidate perfect squares 1, 4, 9, ..., up to n (in ascending order).
    squares = [i**2 for i in range(1, int(n**0.5) + 1)]
    step = 1
    queue = {n}                      # BFS frontier of remaining values
    while queue:
        tempQueue = set()
        for node in queue:
            for square in squares:
                if node - square == 0:
                    return step      # reached 0 after `step` subtractions
                if node < square:
                    break            # all remaining squares are too large
                tempQueue.add(node - square)
        queue = tempQueue
        step += 1
It basically tries to go from the goal number to 0 by subtracting each possible square, which are [1, 4, 9, ..., ⌊√n⌋²], and then does the same work for each of the numbers obtained.
Question
What is the time complexity of this algorithm? The branching factor at every level is sqrt(n), but some branches are destined to end early... which makes me wonder how to derive the time complexity.
If you think about what you're doing, you can imagine that you're doing a breadth-first search over a graph with n + 1 nodes (all the natural numbers between 0 and n, inclusive) and some number of edges m, which we'll determine later on. Your graph is essentially represented as an adjacency list, since at each point you iterate over all the outgoing edges (squares less than or equal to your number) and stop as soon as you consider a square that's too large. As a result, the runtime will be O(n + m), and all we have to do now is work out what m is.
(There's another cost here in computing all the squares up to and including n, but that takes time O(√n), which is dominated by the O(n) term.)
If you think about it, the number of outgoing edges from each number k will be given by the number of perfect squares less than or equal to k. That value is equal to ⌊√k⌋ (check this for a few examples - it works!). This means that the total number of edges is upper-bounded by
√0 + √1 + √2 + ... + √n
We can show that this sum is Θ(n^(3/2)). First, we'll upper-bound the sum by O(n^(3/2)), which we can do by noting that
√0 + √1 + √2 + ... + √n
≤ √n + √n + √n + ... + √n ((n+1) times)
= (n + 1)√n
= O(n^(3/2)).
To lower-bound it by Ω(n^(3/2)), notice that
√0 + √1 + √2 + ... + √n
≥ √(n/2) + √(n/2 + 1) + ... + √n (drop the first half of the terms)
≥ √(n/2) + √(n/2) + ... + √(n/2)
= (n/2)√(n/2)
= Ω(n^(3/2)).
So overall, the number of edges is Θ(n^(3/2)), and using a standard analysis of breadth-first search we can see that the runtime will be O(n^(3/2)).
This bound is likely not tight, because this assumes that you visit every single node and every single edge, which isn't going to happen. However, I'm not sure how to tighten things much beyond this.
As a note - this would be a great place to use A* search instead of breadth-first search, since you can fairly easily come up with heuristics to underestimate the remaining total distance (say, take the number and divide it by the largest perfect square less than it). That would cause the search to focus on extremely promising paths that jump rapidly toward 0 before less-good paths, like, say, always taking steps of size one.
Hope this helps!
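If you want to see the edge bound concretely, here is a small check (my own helper, not part of the original solution) that sums ⌊√k⌋ over k = 0..n and compares it with n^(3/2); the ratio settles near 2/3, consistent with the Θ(n^(3/2)) bound above:
import math

def edge_count(n):
    # Total outgoing edges of the implicit graph: one edge per perfect square <= k.
    return sum(math.isqrt(k) for k in range(n + 1))

for n in (100, 10_000, 1_000_000):
    print(n, edge_count(n) / n**1.5)   # roughly 2/3 for large n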
Some observations:
The number of squares up to n is √n (floored to the nearest integer)
After the first iteration of the while loop, tempQueue will have √n entries
tempQueue can never have more than n entries, since all these values are positive, less than n and unique.
Every natural number can be written as the sum of four integer squares (Lagrange's four-square theorem). So your BFS algorithm's while loop will iterate at most 4 times. If the return statement did not get executed during any of the first 3 iterations, it is guaranteed to be executed in the 4th.
Every statement (except for the initialisation of squares) runs in constant time, even the call to .add().
The initialisation of squares has a list comprehension loop that has √n iterations, and range runs in constant time, so that initialisation has a time complexity of O(√n).
Now we can set a ceiling to the number of times the if node-square == 0 statement is executed (or any other statement in the innermost loop's body):
1⋅√n + √n⋅√n + n⋅√n + n⋅√n
Each of the 4 terms corresponds to an iteration of the while loop. The left factor of each product corresponds to the maximum size of queue in that particular iteration, and the factor at the right corresponds to the size of squares (always the same). This simplifies to:
√n + n + 2n^(3/2)
In terms of time complexity this is:
O(n^(3/2))
This is the worst case time complexity. When the while loop only has to iterate twice, it is O(n), and when only once (when n is a square), it is O(√n).
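To see how loose that ceiling is in practice, here is an instrumented copy of the solution (my own variant, added for illustration) that counts how many times the innermost loop body actually runs:
def numSquaresCounted(n):
    # Same BFS as numSquares above, but also count innermost-loop executions.
    squares = [i**2 for i in range(1, int(n**0.5) + 1)]
    step, inner = 1, 0
    queue = {n}
    while queue:
        tempQueue = set()
        for node in queue:
            for square in squares:
                inner += 1
                if node - square == 0:
                    return step, inner
                if node < square:
                    break
                tempQueue.add(node - square)
        queue = tempQueue
        step += 1

for n in (12, 500, 9_999):
    step, inner = numSquaresCounted(n)
    ceiling = int(n**0.5 + n + 2 * n**1.5)
    print(n, step, inner, ceiling)   # inner never exceeds the ceiling, usually by far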

Average case of binary counter

INCREMENT(A)
    i = 0
    while i < A.length and A[i] == 1
        A[i] = 0
        i = i + 1
    if i < A.length
        A[i] = 1
I am currently studying amortized analysis on my own, and I am thinking about the differences between average-case analysis and amortized analysis. I know that the amortized cost of the binary counter operation INCREMENT(Array) is O(1), but what if I want to analyze the average case of INCREMENT? I was thinking of assuming that the average number of bits we need to flip is n/2, where n is the total number of bits, but I saw the answer in Average Case Time Complexity Analysis of Binary Counter Increment, which does not make much sense to me. Can anyone please explain? This will be helpful because I really want to know the answer :D
I assume that "average" means that we pick a random array of 0's and 1's of length n, with each possible array equally likely. This is equivalent to setting each of the n elements of the array to 0 with probability 1/2 and to 1 with the same probability.
What is the probability that the body of the while loop is executed at least once? It is 1/2 (it is executed if and only if the first element of the array is 1). What is the probability that the body of the loop is executed at least twice? It is the probability that the first two elements are equal to 1, which is 1/2 * 1/2 = 1/4 (since the events that the first and the second elements are equal to 1 are independent). We can show by induction that the probability that the body of the while loop is executed at least i times (1 <= i <= n) is (1/2)^i.
Since the expected number of iterations equals the sum of the probabilities of doing at least i iterations, it is the sum over 1 <= i <= n of (1/2)^i, which is bounded above by the infinite series 1/2 + 1/4 + 1/8 + ... = 1, clearly a constant. All other operations outside the while loop are executed a constant number of times regardless of the input. Thus, the total time is constant on average.
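Here is a quick simulation of that expectation (a sketch; the helper name and the choice n = 32 are mine):
import random

def increment_iterations(bits):
    # Number of while-loop iterations INCREMENT performs on this LSB-first array.
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        i += 1
    if i < len(bits):
        bits[i] = 1
    return i

n, trials = 32, 100_000
avg = sum(increment_iterations([random.randint(0, 1) for _ in range(n)])
          for _ in range(trials)) / trials
print(avg)   # close to 1/2 + 1/4 + ... = 1, regardless of n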

Analysis of the complexity of incrementing binary counter with a given initial content

The key point is that the binary counter starts with some arbitrary content. Is the increment still amortized constant time? How can we prove it?
Let's say that we have the binary counter 11010 and we increment it, so it becomes 11011, and so on.
What is the amortised cost of a single increment operation?
The amortised cost of each operation is O(1).
Let n be the number of bits in the counter.
In all increment operations, you need to change the LSb
In half of the operations, you need to change the 2nd LSb
In 1/4 of the operations, you need to change the 3rd LSb
...
In 1/2^(n-2) of the operations, you need to change the (n-1)th LSb (the 2nd MSb)
In 1/2^(n-1) of the operations, you need to change the n'th LSb (the MSb).
This gives an average cost per increment of:
1 + 1/2 + 1/4 + ... + 1/2^(n-1) <=(*) 2
To prove it formally, use induction on the number of bits.
(*) follows from the sum of a geometric series with a=1, r=1/2: summing infinitely many terms gives SUM = a/(1-r) = 1/(1/2) = 2. Since our sum keeps only finitely many of those terms, it is strictly smaller than 2.
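A small tally of flip frequencies per bit position backs this up (my own sketch; the starting content 11010 is taken from the question above):
def flips_per_position(start_bits, k):
    # start_bits is LSB-first; count, per position, how many of the k increments flip it.
    bits = list(start_bits)
    counts = [0] * (len(bits) + k.bit_length() + 1)
    for _ in range(k):
        i = 0
        while i < len(bits) and bits[i] == 1:
            bits[i] = 0
            counts[i] += 1
            i += 1
        if i == len(bits):
            bits.append(0)
        bits[i] = 1
        counts[i] += 1
    return counts

k = 1 << 12
counts = flips_per_position([0, 1, 0, 1, 1], k)   # counter starts at 11010
print([round(c / k, 3) for c in counts[:6]])      # roughly 1, 1/2, 1/4, 1/8, ...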

Selection i'th smallest number algorithm

I'm reading the Introduction to Algorithms book (second edition), the chapter about medians and order statistics, and I have a few questions about the randomized and non-randomized selection algorithms.
The problem:
Given an unordered array of integers, find i'th smallest element in the array
a. The Randomized_Select algorithm is simple, but I cannot understand the math that explains its running time. Is it possible to explain it without deep math, in a more intuitive way? As for me, I'd think that it should run in O(n log n), and in the worst case it should be O(n^2), just like quicksort. On average, randomizedPartition returns a point near the middle of the array, the array is divided in two at each call, and the next recursive call processes only half of the array. RandomizedPartition costs (p-r+1) <= n, so we get O(n log n). In the worst case it would choose the maximum element of the array every time, dividing the array into two parts of sizes (n-1) and 0 at each step. That's O(n^2).
The next one (the Select algorithm) is even more incomprehensible than the previous one:
b. How does it differ from the previous one? Is it faster on average?
c. The algorithm consists of five steps. In the first one we divide the array into n/5 parts, each with 5 elements (except possibly the last one). Then each part is sorted using insertion sort and we select the 3rd element (the median) of each. Because we have sorted these elements, we can be sure that the previous two are <= this pivot element and the last two are >= it. Then we need to select the median among these medians. The book states that we recursively call the Select algorithm on these medians. How can we do that? In the Select algorithm we are using insertion sort, and if we swap two medians, do we need to swap all four (or even more, at deeper levels) elements that are "children" of each median? Or do we create a new array that contains only the previously selected medians and search for the median among them? If yes, how can we put them back into the original array, since we changed their order?
The other steps are pretty simple and look like those in the randomized_partition algorithm.
The randomized select runs in O(n) expected time. Look at this analysis.
Algorithm:
Randomly choose an element xj
Split the set into a "lower than" set L and a "bigger than" set B
If the size of L is i-1, we found it
If the size of L is bigger, look up in L
Otherwise, look up in B (a short code sketch of this scheme follows the cost breakdown below)
The total cost is the sum of :
The cost of splitting the array of size n
The cost of lookup in L or the cost of looking up in B
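Here is a minimal Python sketch of this scheme (list-based, not the in-place CLRS version; the names are mine):
import random

def randomized_select(arr, i):
    # Return the i-th smallest element (1-based) of arr.
    pivot = random.choice(arr)
    L = [x for x in arr if x < pivot]   # "lower than" set
    B = [x for x in arr if x > pivot]   # "bigger than" set
    if i <= len(L):
        return randomized_select(L, i)                          # lookup in L
    if i > len(arr) - len(B):
        return randomized_select(B, i - (len(arr) - len(B)))    # lookup in B
    return pivot                         # otherwise the pivot is the i-th smallest

print(randomized_select([7, 2, 9, 4, 1, 8], 3))   # prints 4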
Edited: I tried to restructure my post.
You can notice that:
We always go next into the set with the greater number of elements
The number of elements in this set is n - rank(xj)
1 <= rank(xj) <= n, so 0 <= n - rank(xj) <= n - 1
The randomness of the chosen element xj directly determines the randomness of the number of elements
that are greater than xj (and of those that are smaller than xj)
If xj is the element chosen, then you know that the cost is O(n) + cost(n - rank(xj)). Let's call rank(xj) = rj.
To give a good estimate we need to take the expected value of the total cost, which is
T(n) = E(cost) = sum {each possible xj}p(xj)(O(n) + T(n - rank(xj)))
xj is random. After this it is pure math.
We obtain :
T(n) = 1/n * ( O(n) + sum {all possible values of rj when we continue} (O(n) + T(n - rj)) )
T(n) = 1/n * ( O(n) + sum {1 <= rj <= n, rj != i} (O(n) + T(n - rj)) )
Here you can change variable, vj = n - rj:
T(n) = 1/n * ( O(n) + sum {0 <= vj <= n-1, vj != n-i} (O(n) + T(vj)) )
We pull the O(n) out of the sum; it gets multiplied by the number of terms, gaining a factor of n:
T(n) = 1/n * ( O(n) + O(n^2) + sum {0 <= vj <= n-1, vj != n-i} T(vj) )
We pull the O(n) and O(n^2) out past the 1/n factor; each loses a factor of n:
T(n) = O(1) + O(n) + 1/n * ( sum {0 <= vj <= n-1, vj != n-i} T(vj) )
Check the link on how this is computed.
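You can also iterate the recurrence numerically to see the linear behaviour (a sanity check under my own simplifications: the O(n) cost is taken as exactly n and the excluded term is kept in the sum, which only makes T larger):
def recurrence_table(limit):
    # T(n) = n + (1/n) * sum of T(0..n-1), the shape of the recurrence above.
    T = [0.0] * (limit + 1)
    running_sum = 0.0
    for n in range(1, limit + 1):
        T[n] = n + running_sum / n
        running_sum += T[n]
    return T

T = recurrence_table(100_000)
for n in (10, 1_000, 100_000):
    print(n, T[n] / n)   # the ratio approaches 2, i.e. T(n) = O(n)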
For the non-randomized version :
You say yourself:
In avg randomizedPartition returns near middle of the array.
That is exactly why the randomized algorithm works, and that is exactly what is used to construct the deterministic algorithm. Ideally you want to pick the pivot deterministically such that it produces a good split, but the best value for a good split is already the solution! So at each step they want a value which is good enough: "at least 3/10 of the array below the pivot and at least 3/10 of the array above". To achieve this they split the original array into groups of 5 at each step, and again this is a mathematical choice.
I once created an explanation for this (with diagram) on the Wikipedia page for it... http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm
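To make the answer to (c) concrete: yes, the medians are copied into a new, smaller array, and Select is called recursively on that copy; the original array is only rearranged later, by the partition step around the returned pivot. A compact list-based sketch of this idea (my own code, not the book's in-place pseudocode):
def select(arr, i):
    # i-th smallest (1-based), deterministic median-of-medians version.
    if len(arr) <= 5:
        return sorted(arr)[i - 1]
    # Steps 1-2: medians of the groups of five, copied into a NEW array.
    medians = [sorted(arr[j:j + 5])[len(arr[j:j + 5]) // 2]
               for j in range(0, len(arr), 5)]
    # Step 3: recursive call on the (much smaller) array of medians.
    pivot = select(medians, (len(medians) + 1) // 2)
    # Steps 4-5: partition around the pivot, then recurse on one side only.
    lower = [x for x in arr if x < pivot]
    higher = [x for x in arr if x > pivot]
    if i <= len(lower):
        return select(lower, i)
    if i > len(arr) - len(higher):
        return select(higher, i - (len(arr) - len(higher)))
    return pivot

print(select(list(range(100, 0, -1)), 42))   # prints 42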
