INCREMENT(A)
    i = 0
    while i < A.length and A[i] == 1
        A[i] = 0
        i = i + 1
    if i < A.length
        A[i] = 1
I am studying amortized analysis on my own and thinking about the difference between average-case analysis and amortized analysis. I know that the amortized cost of the binary counter operation INCREMENT(Array) is O(1), but what if I want to analyze the average case of INCREMENT? I was tempted to assume that the average number of bits we need to flip is n/2, where n is the total number of bits, but I saw the answer in "Average Case Time Complexity Analysis of Binary Counter Increment", which does not make much sense to me. Can anyone please explain? This would be helpful because I really want to know the answer :D
I assume that "average" means that we pick a random array of 0s and 1s of length n, with each possible array equally likely. That is equivalent to setting each of the n elements of the array to 0 with probability 1/2 and to 1 with the same probability.
What is the probability that the body of the while loop is executed at least once? It is 1/2 (it is executed if and only if the first element of the array is 1). What is the probability that the body of the loop is executed at least twice? It is the probability that the first two elements are equal to 1, which is 1/2 * 1/2 = 1/4 (the events that the first and the second elements are equal to 1 are independent). We can show by induction that the probability that the body of the while loop is executed at least i times (1 <= i <= n) is (1/2)^i.
It means that the loop does at least one iteration with probability 1/2, at least two with probability 1/4, at least three with probability 1/8, and so on. Thus the expected number of iterations is the sum over 1 <= i <= n of (1/2)^i, which is bounded above by the sum of the infinite series 1/2 + 1/4 + 1/8 + ... = 1 (clearly a constant). All other operations besides the while loop are executed a constant number of times regardless of the input. Thus, the total time complexity is constant on average.
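A quick way to convince yourself of this is to simulate it. Here is a minimal Python sketch (the function names increment_cost and average_cost are mine, just for illustration) that runs INCREMENT on uniformly random bit arrays and measures the average number of while-loop iterations; empirically it stays below 1 for any n, matching the bound above.

import random

def increment_cost(A):
    """Run INCREMENT on the bit list A (in place) and return the
    number of while-loop iterations (bits flipped from 1 to 0)."""
    i = 0
    while i < len(A) and A[i] == 1:
        A[i] = 0
        i += 1
    if i < len(A):
        A[i] = 1
    return i

def average_cost(n, trials=100_000):
    """Average number of loop iterations over random n-bit arrays."""
    total = 0
    for _ in range(trials):
        A = [random.randint(0, 1) for _ in range(n)]
        total += increment_cost(A)
    return total / trials

# Expected value is sum over 1 <= i <= n of (1/2)^i < 1 for every n.
for n in (4, 16, 64):
    print(n, average_cost(n))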
I have an integer array of length N containing the values 0, 1, 2, ..., (N-1), representing a permutation of integer indexes.
What's the most efficient way to determine whether the permutation has odd or even parity, given that I also have O(N) parallel compute available?
For example, you can sum N numbers in log(N) time with parallel computation. I expect to find the parity of the permutation in log(N) as well, but I cannot seem to find an algorithm. I also do not know what this "complexity order with parallel computation" is called.
The number in each array slot is the proper slot for that item. Think of it as a direct link from the "from" slot to the "to" slot. An array like this is very easy to sort in O(N) time with a single CPU just by following the links, so it would be a shame to have to use a generic sorting algorithm to solve this problem. Thankfully...
You can do this easily in O(log N) time with Ω(N) CPUs.
Let A be your array. Since each array slot has a single link out (the number in that slot) and a single link in (that slot's number is in some slot), the links break down into some number of cycles.
The parity of the permutation is the oddness of N-m, where N is the length of the array and m is the number of cycles, so we can get your answer by counting the cycles.
First, make an array S of length N, and set S[i] = i.
Then:
Repeat ceil(log_2(N)) times:
    foreach i in [0,N), in parallel:
        if S[i] < S[A[i]] then:
            S[A[i]] = S[i]
        A[i] = A[A[i]]
When this is finished, every S[i] will contain the smallest index in the cycle containing i. The first pass of the inner loop propagates the smallest S[i] to the next slot in the cycle by following the link in A[i]. Then each link is made twice as long, so the next pass will propagate it to 2 new slots, etc. It takes at most ceil(log_2(N)) passes to propagate the smallest S[i] around the cycle.
Let's call the smallest slot in each cycle the cycle's "leader". The number of leaders is the number of cycles. We can find the leaders just like this:
foreach i in [0,N), in parallel:
    if (S[i] == i) then:
        S[i] = 1 // leader
    else:
        S[i] = 0 // not leader
Finally, we can just add up the elements of S to get the number of cycles in the permutation, from which we can easily calculate its parity.
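Here is a minimal sequential Python sketch of the whole procedure (my own code; the name permutation_parity is mine). Each pass is computed from snapshots of S and A, which mimics the synchronous parallel semantics; the parity is then (N - number of cycles) mod 2.

import math

def permutation_parity(perm):
    """Parity of a permutation (0 = even, 1 = odd) via cycle counting,
    sequentially simulating the O(log N)-pass pointer-jumping scheme above."""
    N = len(perm)
    A = list(perm)         # links; doubled in length each pass
    S = list(range(N))     # S[i] will become the smallest index in i's cycle
    for _ in range(math.ceil(math.log2(max(N, 2)))):
        old_S, old_A = S[:], A[:]          # snapshot = one synchronous step
        for i in range(N):                 # "in parallel"
            if old_S[i] < old_S[old_A[i]]:
                S[old_A[i]] = old_S[i]     # push the smaller label forward
            A[i] = old_A[old_A[i]]         # double the link length
    cycles = sum(1 for i in range(N) if S[i] == i)   # count cycle leaders
    return (N - cycles) % 2

print(permutation_parity([1, 2, 0]))   # 0: a 3-cycle is even
print(permutation_parity([1, 0, 2]))   # 1: a single transposition is odd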
You didn't specify a machine model, so I'll assume that we're working with an EREW PRAM. The complexity measure you care about is called "span", the number of rounds the computation takes. There is also "work" (number of operations, summed over all processors) and "cost" (span times number of processors).
From the point of view of theory, the obvious answer is to modify an O(log n)-depth sorting network (AKS or Goodrich's Zigzag Sort) to count swaps, then return (number of swaps) mod 2. The code is very complex, and the constant factors are quite large.
A more practical algorithm is to use Batcher's bitonic sorting network instead, which raises the span to O(log^2 n) but has reasonable constant factors (such that people actually use it in practice to sort on GPUs).
I can't think of a practical deterministic algorithm with span O(log n), but here's a randomized algorithm with span O(log n) with high probability. Assume n processors and let the (modifiable) input be Perm. Let Coin be an array of n Booleans.
In each of O(log n) passes, the processors do the following in parallel, where i ∈ {0…n-1} identifies the processor, and swaps ← 0 initially. Lower case variables denote processor-local variables.
Coin[i] ← true with probability 1/2, false with probability 1/2
(barrier synchronization required in asynchronous models)
if Coin[i]
    j ← Perm[i]
    if not Coin[j]
        Perm[i] ← Perm[j]
        Perm[j] ← j
        swaps ← swaps + 1
    end if
end if
(barrier synchronization required in asynchronous models)
Afterwards, we sum up the local values of swaps and mod by 2.
Each pass reduces the number of i such that Perm[i] ≠ i by 1/4 of the current total in expectation. Thanks to the linearity of expectation, the expected total after r passes is at most n(3/4)^r, so after r = 2 log_{4/3} n = O(log n) passes the expected total is at most 1/n, which (by Markov's inequality) also bounds the probability that the algorithm has not converged to the identity permutation as required. On failure, we can just switch to the O(n)-span serial algorithm without blowing up the expected span, or just try again.
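Below is a minimal sequential Python sketch of this randomized scheme (the name parity_randomized is mine). The inner loop simulates one parallel pass; because a swap only happens when Coin[i] is true and Coin[Perm[i]] is false, the updates of distinct i never touch the same slots, so a sequential scan within a pass produces the same result as the parallel round.

import random

def parity_randomized(perm):
    """Parity of a permutation (0 = even, 1 = odd), sequentially
    simulating the coin-flipping passes described above. Converges to
    the identity permutation in O(log n) passes with high probability."""
    Perm = list(perm)
    n = len(Perm)
    swaps = 0
    while any(Perm[i] != i for i in range(n)):
        Coin = [random.random() < 0.5 for _ in range(n)]
        for i in range(n):               # one "parallel" pass
            if Coin[i]:
                j = Perm[i]
                if not Coin[j]:
                    Perm[i] = Perm[j]    # compose with the transposition (i j)
                    Perm[j] = j
                    swaps += 1
    return swaps % 2

print(parity_randomized([1, 2, 0]))   # 0: a 3-cycle is even
print(parity_randomized([1, 0, 2]))   # 1: a single transposition is odd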
So the scenario goes: the increment algorithm, for a k-bit binary string A[0...k-1], where A[0] is the least significant bit and A[k-1] is the most significant bit, is:
Increment(A, k):
    i ← 0
    while i < k and A[i] = 1 do
        A[i] ← 0
        i ← i + 1
    if i < k then
        A[i] ← 1
But here the cost to flip the bit at index i is 2^i, instead of 1.
I'm lost trying to prove that the amortized cost of this modified binary increment algorithm is O(log n). No matter how I approach it, it still seems like the amortized cost would be O(1), just with a bigger constant.
(Figure: aggregate analysis of the increment function — over n increments, bit i flips floor(n/2^i) times, for a total of sum over 0 <= i < k of floor(n/2^i) < 2n.)
If I follow up on this breakdown and multiply by 2^i inside the sigma (since the cost of flipping the ith bit is 2^i), I get nk for n increments, which still gives me an amortized cost of O(1).
I am not sure what I'm doing wrong here. Intuitively it makes sense for it to still be O(1), since the high cost of the higher bits just cancels out the low probability of their being flipped.
If we increment the counter from 0 up to 2^m, how many times does each bit flip?
Bit 0 flips 2^m times. Bit 1 flips 2^(m-1) times. Bit 2 flips 2^(m-2) times, etc.
If we calculate the total costs:
Bit 0 costs 1 * 2^m = 2^m. Bit 1 costs 2 * 2^(m-1) = 2^m. Bit 2 costs 4 * 2^(m-2) = 2^m, etc.
Every bit that changes has the same total cost, and there are m+1 bits that change, so the total cost is (m+1) * 2^m.
If the number of increments is n = 2^m, then the amortized cost per increment is
(m+1) * 2^m / n
= ((log_2 n) + 1) * n / n
= 1 + log_2 n
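As a sanity check, here is a small Python sketch (entirely my own, with made-up names like weighted_increment) that counts from 0 up to 2^m while charging 2^i for every flip of bit i, and compares the measured total with (m+1) * 2^m.

def weighted_increment(A, k):
    """Increment the k-bit counter A in place and return the cost of this
    increment, charging 2^i for each flip of bit i."""
    cost = 0
    i = 0
    while i < k and A[i] == 1:
        A[i] = 0
        cost += 1 << i
        i += 1
    if i < k:
        A[i] = 1
        cost += 1 << i
    return cost

def total_cost(m):
    """Total weighted cost of counting from 0 up to 2^m."""
    k = m + 1                       # m+1 bits are enough to hold 2^m
    A = [0] * k
    return sum(weighted_increment(A, k) for _ in range(2 ** m))

for m in range(1, 8):
    # Measured total equals (m+1) * 2^m, i.e. 1 + log_2(n) per increment.
    print(m, total_cost(m), (m + 1) * 2 ** m)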
The key point is that the binary counter has some arbitrary content at the beginning. Is the increment still amortized constant time? How can we prove it?
Let's say that we have the binary counter 11010 and we increment it, so it's now 11011, and so on.
What is the amortised cost of a single increment operation?
The amortised cost of each operation is O(1).
Let n be the number of bits in the counter.
In all increment operations, you need to change the LSb
In half of the operations, you need to change the 2nd LSb
In 1/4 of the operations, you need to change the 3rd LSb
...
In 1/2^(n-2) of the operations, you need to change the (n-1)th LSb (2nd MSb)
In 1/2^(n-1) of the operations, you need to change the nth LSb (MSb)
This gives you an average performance of:
1 + 1/2 + 1/4 + ... + 1/2^(n-1) <= (*) 2
To prove it formally, use induction on the number of bits.
(*) follows from the sum of a geometric series with a = 1, r = 1/2: summing to infinity gives SUM = a/(1-r) = 1/(1/2) = 2. Since our sum has only finitely many terms, it is strictly smaller than 2.
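For intuition, here is a small Python sketch (mine, purely illustrative) that measures what fraction of increments flips each bit; the fractions come out as roughly 1, 1/2, 1/4, ..., and their sum, the average number of flips per increment, stays below 2.

def bit_flip_fractions(num_bits, num_increments):
    """Fraction of increments that flips each bit of the counter.
    Expected fractions: about 1, 1/2, 1/4, ... for bits 0, 1, 2, ..."""
    A = [0] * num_bits
    flips = [0] * num_bits
    for _ in range(num_increments):
        i = 0
        while i < num_bits and A[i] == 1:
            A[i] = 0
            flips[i] += 1
            i += 1
        if i < num_bits:
            A[i] = 1
            flips[i] += 1
    return [f / num_increments for f in flips]

fractions = bit_flip_fractions(num_bits=8, num_increments=1 << 8)
print(fractions)          # roughly [1.0, 0.5, 0.25, 0.125, ...]
print(sum(fractions))     # the average cost per increment, below 2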
I guess you already know that if all the entries in the array start at 0 and at each step we increment the counter by 1 (by flipping 0's and 1's), then the total amortized cost for k increments is O(k).
But what happens if the array starts at the value n? I thought that maybe the complexity for k increments is now O(log(n) + k), because at the beginning the number of 1's is at most log(n).
Any suggestions?
Thanks in advance
You are right. There is more than one way to prove this; one of them is with a potential function. This link (and many others) explains the potential method. However, textbooks usually require that the initial value of the potential function is 0. Let's generalise to the case where it is not.
For the binary counter, the potential function of the counter is the number of bits set to 1. When you increment, you spend k+1 time to flip k 1's to 0 and one 0 to 1. The potential decreases by k-1. So the amortised time of this increment = ActualTime+(PotentialAfter-PotentialBefore) = k+1-(k-1) = 2 (constant).
Now look at the section "Relation between amortized and actual time" in the wikipedia link.
TotalAmortizedTime = TotalActualTime + SumOfChangesToPotential
Since the SumOfChangesToPotential is telescoping, it is equal to FinalPotential-InitialPotential. So:
TotalAmortizedTime = TotalActualTime + FinalPotential - InitialPotential
Which gives:
TotalActualTime = TotalAmortizedTime - FinalPotential + InitialPotential <= TotalAmortizedTime + InitialPotential (since the potential is never negative)
So, as you say, the total time for a sequence of k increments starting at value n is O(log n + k): each increment has constant amortized cost, so TotalAmortizedTime = O(k), and the initial potential is the number of 1 bits in n, which is O(log n).
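Here is a minimal Python sketch of this accounting (the names are mine): it starts a counter at an arbitrary value n, performs k increments, records the actual cost and the amortized cost ActualTime + (PotentialAfter - PotentialBefore) of each one, and checks that the total actual time never exceeds 2k plus the initial potential.

def popcount(bits):
    """Potential function: the number of bits set to 1."""
    return sum(bits)

def increment(bits):
    """Increment the counter in place; return the actual cost (bit writes)."""
    cost = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        cost += 1
        i += 1
    if i < len(bits):
        bits[i] = 1
        cost += 1
    return cost

def demo(n, k, width=64):
    """Start the counter at value n and perform k increments."""
    bits = [(n >> j) & 1 for j in range(width)]     # LSb first
    initial_potential = popcount(bits)
    total_actual = 0
    for _ in range(k):
        phi_before = popcount(bits)
        cost = increment(bits)
        phi_after = popcount(bits)
        total_actual += cost
        # Amortized cost of one increment = actual + (phi_after - phi_before) <= 2
        assert cost + (phi_after - phi_before) <= 2
    # TotalActualTime <= TotalAmortizedTime + InitialPotential <= 2k + InitialPotential
    print(total_actual, "<=", 2 * k + initial_potential)

demo(n=0b11010, k=1000)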
I'm reading the Introduction to Algorithms book, second edition, the chapter on Medians and Order Statistics, and I have a few questions about the randomized and non-randomized selection algorithms.
The problem:
Given an unordered array of integers, find the i'th smallest element in the array.
a. The Randomized_Select algorithm is simple, but I cannot understand the math that explains its running time. Is it possible to explain it in a more intuitive way, without deep math? As for me, I'd think it should run in O(n log n), and in the worst case in O(n^2), just like quicksort: on average, randomizedPartition returns a pivot near the middle of the array, the array is divided in two on each call, and the next recursive call processes only half of the array. RandomizedPartition costs (p-r+1) <= n, so we would have O(n log n). In the worst case it would choose the maximum element of the array every time and divide the array into parts of size (n-1) and 0 at each step. That's O(n^2).
The next one (the Select algorithm) is even harder for me to understand than the previous one:
b. What is its difference compared to the previous one? Is it faster on average?
c. The algorithm consists of five steps. In the first one we divide the array into n/5 parts, each with 5 elements (except possibly the last one). Then each part is sorted using insertion sort, and we select the 3rd element (the median) of each. Because these elements are sorted, we can be sure that the previous two are <= this pivot element and the last two are >= it. Then we need to select the median among these medians. The book states that we recursively call the Select algorithm on these medians. How can we do that? In the Select algorithm we use insertion sort, and if we swap two medians, do we have to swap all four (or even more, at deeper recursion levels) elements that are "children" of each median? Or do we create a new array that contains only the previously selected medians and search for the median among them? If so, how do we put them back into the original array, given that we changed their order?
The other steps are pretty simple and look like the ones in the randomized_partition algorithm.
Randomized select runs in expected O(n) time. Look at this analysis.
Algorithm:
Randomly choose an element xj
Split the set into a "lower than" set L and a "bigger than" set B
If the size of the "lower than" set is i-1, we found it (xj is the i'th smallest)
If the size is bigger, then look up in L
Otherwise look up in B
The total cost is the sum of:
The cost of splitting the array of size n
The cost of the lookup in L or the cost of the lookup in B
Edit: I tried to restructure my post.
You can notice that :
We always continue in the set with the greater number of elements (pessimistically, to get an upper bound)
The number of elements in this set is n - rank(xj)
1 <= rank(xj) <= n, so 0 <= n - rank(xj) <= n - 1
The randomness of the chosen element xj directly determines the randomness of the number of elements which are greater than xj (and which are smaller than xj)
If xj is the element chosen, then you know that the cost is O(n) + cost(n - rank(xj)). Let's call rank(xj) = rj.
To give a good estimate we need to take the expected value of the total cost, which is
T(n) = E(cost) = sum {each possible xj}p(xj)(O(n) + T(n - rank(xj)))
xj is random. After this it is pure math.
We obtain :
T(n) = 1/n * ( O(n) + sum {all possible values of rj when we continue} (O(n) + T(n - rj)) )
T(n) = 1/n * ( O(n) + sum {1 <= rj <= n, rj != i} (O(n) + T(n - rj)) )
Here you can change variable, vj = n - rj
T(n) = 1/n * ( O(n) + sum {0 <= vj <= n - 1, vj != n - i} (O(n) + T(vj)) )
We pull the O(n) out of the sum and gain a factor:
T(n) = 1/n * ( O(n) + O(n^2) + sum {0 <= vj <= n - 1, vj != n - i} T(vj) )
We pull 1/n into the O(n) and O(n^2) terms and lose a factor:
T(n) = O(1) + O(n) + 1/n * sum {0 <= vj <= n - 1, vj != n - i} T(vj)
Check the link on how this is computed.
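Here is a minimal Python sketch of randomized select (plain quickselect, with my own names; duplicates are grouped with the pivot) matching the description above: pick a random pivot, split into the "lower than" set L and the "bigger than" set B, and recurse into the side that contains the i'th smallest element. The expected time is O(n); the worst case is O(n^2).

import random

def randomized_select(arr, i):
    """Return the i'th smallest element of arr (1-indexed).
    Expected O(n) time, O(n^2) in the worst case."""
    pivot = random.choice(arr)
    L = [x for x in arr if x < pivot]     # "lower than" set
    E = [x for x in arr if x == pivot]    # copies of the pivot
    B = [x for x in arr if x > pivot]     # "bigger than" set
    if i <= len(L):
        return randomized_select(L, i)    # the answer lies in L
    if i <= len(L) + len(E):
        return pivot                      # the pivot itself is the answer
    return randomized_select(B, i - len(L) - len(E))   # the answer lies in B

print(randomized_select([7, 1, 5, 3, 9, 3], 4))   # 4th smallest -> 5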
For the non-randomized version:
You say yourself:
On average, randomizedPartition returns a pivot near the middle of the array.
That is exactly why the randomized algorithm works, and that is exactly what is used to construct the deterministic algorithm. Ideally you would want to pick the pivot deterministically so that it produces a good split, but the best value for a good split is already the solution! So at each step they want a value which is good enough: "at least 3/10 of the array below the pivot and at least 3/10 of the array above". To achieve this they split the original array into groups of 5 at each step, and again that is a mathematical choice.
I once created an explanation for this (with diagram) on the Wikipedia page for it... http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm
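For completeness, here is a compact Python sketch of the deterministic median-of-medians idea (my own illustrative code, not the CLRS in-place pseudocode). It also answers part (c) above in the simplest possible way: the group medians are copied into a new, separate list and Select is called recursively on that list, so nothing in the original array has to be reordered or mapped back.

def select(arr, i):
    """Deterministic selection (median of medians): the i'th smallest element,
    1-indexed, in worst-case O(n) time."""
    if len(arr) <= 5:
        return sorted(arr)[i - 1]

    # Steps 1-2: split into groups of 5 and take the median of each group.
    # The medians go into a separate list; the original order is untouched.
    medians = [sorted(arr[j:j + 5])[len(arr[j:j + 5]) // 2]
               for j in range(0, len(arr), 5)]

    # Step 3: recursively select the median of the medians as the pivot.
    pivot = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around the pivot, as in randomized_partition.
    L = [x for x in arr if x < pivot]
    E = [x for x in arr if x == pivot]
    B = [x for x in arr if x > pivot]

    # Step 5: recurse into the part containing the i'th smallest element.
    if i <= len(L):
        return select(L, i)
    if i <= len(L) + len(E):
        return pivot
    return select(B, i - len(L) - len(E))

print(select(list(range(100, 0, -1)), 37))   # -> 37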