I was studying for my final when I ran into this problem.
For 1a, I think it's O(1) amortized, because it uses x mod N, which is sparse enough, with linear probing in case of a collision.
However I'm not sure how to state or prove that exactly.
For 1b, it would hash into the same place, so it would linearly probe more each time it inserts, but I'm not sure how to derive a runtime from that either.
For 1a, there are no collisions except on the last insertion: N first collides with 0, then, after probing one slot further, with 1, and so on. The total cost is 1 + 1 + ... + 1 (n-1 times) + n = 2n - 1, so the amortized cost is (2n-1)/n, which is O(1).
For 1b, the i-th insertion has (i-1) collisions plus the insert itself, so it costs i. The total cost is 1 + 2 + ... + (n-1) + n = n(n+1)/2 over n insertions, so the amortized cost is (n+1)/2, which is O(n).
[edited, my original analysis was for open hashing not open addressing] For 1a) h(x) = x mod N, n < N, so the hash values will be 0, 1, ..., n - 2, 0. All insertions will be collision-free, apart from the last one. The last insertion will use a linear probe. First probe goes to bucket 0, but it is taken and the key is different. The next probe is at slot 1, with same result, until it reaches the first empty bucket at (n - 1). Hence you need (n - 1) extra operations for total of (2n - 1). The amortized cost is (2n - 1)/n per insertion.
For 1b) the hash table degenerates into linked list. Insertion is linear in the size, there are n insertions, hence (n + 1) * n / 2 operations total. That is (n + 1)/2 per insertion.
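The two totals can be checked with a small simulation. This is only a sketch: the function name `count_probes`, the sizes N = 16 and n = 10, and the exact key sequences are assumptions chosen to match the cases described in the answers, and "cost" is counted as the number of slots examined per insertion.

```python
def count_probes(keys, table_size, hash_fn):
    """Insert keys with linear probing; return slots examined per insertion."""
    table = [None] * table_size
    probes_per_insert = []
    for key in keys:
        slot = hash_fn(key) % table_size
        probes = 1
        while table[slot] is not None:      # collision: step to the next slot
            slot = (slot + 1) % table_size
            probes += 1
        table[slot] = key
        probes_per_insert.append(probes)
    return probes_per_insert


N, n = 16, 10   # assumed sizes with n < N

# Case 1a: keys 0, 1, ..., n-2 followed by N, hashed with h(x) = x mod N.
keys_a = list(range(n - 1)) + [N]
cost_a = count_probes(keys_a, N, lambda x: x % N)

# Case 1b: every key hashes to the same slot (here slot 0).
keys_b = list(range(n))
cost_b = count_probes(keys_b, N, lambda x: 0)

print(sum(cost_a), 2 * n - 1)            # 19 19
print(sum(cost_b), n * (n + 1) // 2)     # 55 55
```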
I have an integer array of length N containing the values 0, 1, 2, .... (N-1), representing a permutation of integer indexes.
What's the most efficient way to determine if the permutation has odd or even parity, given I have parallel compute of O(N) as well?
For example, you can sum N numbers in O(log N) time with parallel computation. I expect to find the parity of a permutation in O(log N) as well, but cannot seem to find an algorithm. I also do not know what this "complexity order with parallel computation" is called.
The number in each array slot is the proper slot for that item. Think of it as a direct link from the "from" slot to the "to" slot. An array like this is very easy to sort in O(N) time with a single CPU just by following the links, so it would be a shame to have to use a generic sorting algorithm to solve this problem. Thankfully...
You can do this easily in O(log N) time with Ω(N) CPUs.
Let A be your array. Since each array slot has a single link out (the number in that slot) and a single link in (that slot's number is in some slot), the links break down into some number of cycles.
The parity of the permutation is the oddness of N-m, where N is the length of the array and m is the number of cycles, so we can get your answer by counting the cycles.
First, make an array S of length N, and set S[i] = i.
Then:
Repeat ceil(log_2(N)) times:
    foreach i in [0,N), in parallel:
        if S[i] < S[A[i]] then:
            S[A[i]] = S[i]
        A[i] = A[A[i]]
When this is finished, every S[i] will contain the smallest index in the cycle containing i. The first pass of the inner loop propagates the smallest S[i] to the next slot in the cycle by following the link in A[i]. Then each link is made twice as long, so the next pass will propagate it to 2 new slots, etc. It takes at most ceil(log_2(N)) passes to propagate the smallest S[i] around the cycle.
Let's call the smallest slot in each cycle the cycle's "leader". The number of leaders is the number of cycles. We can find the leaders just like this:
foreach i in [0,N), in parallel:
    if (S[i] == i) then:
        S[i] = 1 //leader
    else
        S[i] = 0 //not leader
Finally, we can just add up the elements of S to get the number of cycles in the permutation, from which we can easily calculate its parity.
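The whole procedure can be simulated sequentially to check the logic. This is a sketch, not a parallel implementation: the function name `parity_by_cycle_leaders` is made up for illustration, and each round builds fresh arrays so that every simulated processor reads the values from before the round, which is what a synchronous PRAM would do.

```python
def parity_by_cycle_leaders(perm):
    """Return 0 for an even permutation of 0..n-1, 1 for an odd one."""
    n = len(perm)
    a = list(perm)          # links; each round doubles their length
    s = list(range(n))      # s[i] converges to the smallest index in i's cycle
    rounds = (n - 1).bit_length() if n > 1 else 0   # equals ceil(log2(n))
    for _ in range(rounds):
        new_s = list(s)
        for i in range(n):              # "in parallel": reads use the old s and a
            if s[i] < s[a[i]]:
                new_s[a[i]] = s[i]
        new_a = [a[a[i]] for i in range(n)]
        s, a = new_s, new_a
    cycles = sum(1 for i in range(n) if s[i] == i)  # count the leaders
    return (n - cycles) % 2


print(parity_by_cycle_leaders([1, 0, 2]))   # 1 (a single transposition)
print(parity_by_cycle_leaders([1, 2, 0]))   # 0 (a 3-cycle is even)
```

Because A is a permutation, no two values of i write to the same S[A[i]] within a round, so the simulated writes never conflict.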
You didn't specify a machine model, so I'll assume that we're working with an EREW PRAM. The complexity measure you care about is called "span", the number of rounds the computation takes. There is also "work" (number of operations, summed over all processors) and "cost" (span times number of processors).
From the point of view of theory, the obvious answer is to modify an O(log n)-depth sorting network (AKS or Goodrich's Zigzag Sort) to count swaps, then return (number of swaps) mod 2. The code is very complex, and the constant factors are quite large.
A more practical algorithm is to use Batcher's bitonic sorting network instead, which raises the span to O(log^2 n) but has reasonable constant factors (such that people actually use it in practice to sort on GPUs).
I can't think of a practical deterministic algorithm with span O(log n), but here's a randomized algorithm with span O(log n) with high probability. Assume n processors and let the (modifiable) input be Perm. Let Coin be an array of n Booleans.
In each of O(log n) passes, the processors do the following in parallel, where i ∈ {0…n-1} identifies the processor, and swaps ← 0 initially. Lower case variables denote processor-local variables.
Coin[i] ← true with probability 1/2, false with probability 1/2
(barrier synchronization required in asynchronous models)
if Coin[i]
    j ← Perm[i]
    if not Coin[j]
        Perm[i] ← Perm[j]
        Perm[j] ← j
        swaps ← swaps + 1
    end if
end if
(barrier synchronization required in asynchronous models)
Afterwards, we sum up the local values of swaps and mod by 2.
Each pass reduces the number of i such that Perm[i] ≠ i by 1/4 of the current total in expectation. Thanks to the linearity of expectation, the expected total is at most n(3/4)^r, so after r = 2 log_{4/3} n = O(log n) passes, the expected total is at most 1/n, which in turn bounds the probability that the algorithm has not converged to the identity permutation as required. On failure, we can just switch to the O(n)-span serial algorithm without blowing up the expected span, or just try again.
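A sequential simulation of the passes is sketched below. The function name `parity_by_random_shortcuts` and the `rng` parameter are assumptions made for illustration, and, as a simplification, the loop keeps running passes until the permutation is the identity instead of stopping after a fixed O(log n) passes and falling back.

```python
import random

def parity_by_random_shortcuts(perm, rng=None):
    """Return (parity, passes) for a permutation of 0..n-1; parity 1 means odd."""
    rng = rng or random.Random()
    perm = list(perm)
    n = len(perm)
    swaps = 0
    passes = 0
    while any(perm[i] != i for i in range(n)):
        coin = [rng.random() < 0.5 for _ in range(n)]
        for i in range(n):              # one iteration per simulated processor
            if coin[i]:
                j = perm[i]
                if not coin[j]:
                    perm[i] = perm[j]   # shortcut i past j ...
                    perm[j] = j         # ... and make j a fixed point
                    swaps += 1
        passes += 1
    return swaps % 2, passes


parity, _ = parity_by_random_shortcuts([1, 0, 2], random.Random(0))
print(parity)   # 1
```

Running the processors one after another gives the same result as running them together here: processor i only touches Perm[i] and Perm[j] with Coin[j] false, and no other active processor reads or writes those two slots in the same pass.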
According to Wikipedia, partition-based selection algorithms such as quickselect have runtime of O(n), but I am not convinced by it. Can anyone explain why it is O(n)?
In the normal quick-sort, the runtime is O(n log n). Every time we partition the branch into two branches (greater than the pivot and lesser than the pivot), we need to continue the process in both branches, whereas quickselect only needs to process one branch. I totally understand these points.
However, if you think in the Binary Search algorithm, after we chose the middle element, we are also searching only one side of the branch. So does that make the algorithm O(1)? No, of course, the Binary Search Algorithm is still O(log N) instead of O(1). This is also the same thing as the search element in a Binary Search Tree. We only search for one side, but we still consider O(log n) instead of O(1).
Can someone explain why in quickselect, if we continue the search in one side of pivot, it is considered O(1) instead of O(log n)? I consider the algorithm to be O(n log n), O(N) for the partitioning, and O(log n) for the number of times to continue finding.
There are several different selection algorithms, from the much simpler quickselect (expected O(n), worst-case O(n^2)) to the more complex median-of-medians algorithm (Θ(n)). Both of these algorithms work by using a quicksort partitioning step (time O(n)) to rearrange the elements and position one element into its proper position. If that element is at the index in question, we're done and can just return that element. Otherwise, we determine which side to recurse on and recurse there.
Let's now make a very strong assumption - suppose that we're using quickselect (pick the pivot randomly) and on each iteration we manage to guess the exact middle of the array. In that case, our algorithm will work like this: we do a partition step, throw away half of the array, then recursively process one half of the array. This means that on each recursive call we end up doing work proportional to the length of the array at that level, but that length keeps decreasing by a factor of two on each iteration. If we work out the math (ignoring constant factors, etc.) we end up getting the following time:
Work at the first level: n
Work after one recursive call: n / 2
Work after two recursive calls: n / 4
Work after three recursive calls: n / 8
...
This means that the total work done is given by
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...)
Notice that this last term is n times the sum of 1, 1/2, 1/4, 1/8, etc. If you work out this infinite sum, despite the fact that there are infinitely many terms, the total sum is exactly 2. This means that the total work is
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...) = 2n
This may seem weird, but the idea is that if we do linear work on each level but keep cutting the array in half, we end up doing only roughly 2n work.
An important detail here is that there are indeed O(log n) different iterations here, but not all of them are doing an equal amount of work. Indeed, each iteration does half as much work as the previous iteration. If we ignore the fact that the work is decreasing, you can conclude that the work is O(n log n), which is correct but not a tight bound. This more precise analysis, which uses the fact that the work done keeps decreasing on each iteration, gives the O(n) runtime.
Of course, this is a very optimistic assumption - we almost never get a 50/50 split! - but using a more powerful version of this analysis, you can say that if you can guarantee any constant factor split, the total work done is only some constant multiple of n. If we pick a totally random element on each iteration (as we do in quickselect), then in expectation we only need to pick two elements before we end up picking some pivot element in the middle 50% of the array, which means that, in expectation, only two rounds of picking a pivot are required before we end up picking something that gives at worst a 25/75 split. This is where the expected runtime of O(n) for quickselect comes from.
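For reference, here is a minimal sketch of quickselect with a random pivot. The function name, the 0-based `k`, and the optional `rng` parameter are choices made for this illustration; it also builds new lists on each step instead of partitioning in place, which keeps the code short at the cost of extra memory.

```python
import random

def quickselect(items, k, rng=None):
    """Return the k-th smallest element (0-based) of items."""
    rng = rng or random.Random()
    items = list(items)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = items[rng.randrange(len(items))]
        lower = [x for x in items if x < pivot]      # O(len(items)) partition work
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower                            # continue on one side only
        elif k < len(lower) + len(equal):
            return pivot
        else:
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


print(quickselect([7, 1, 5, 3, 9], 2))   # 5
```

Each pass does work proportional to the current list length and then keeps only one side, which is the shrinking-work pattern described above.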
A formal analysis of the median-of-medians algorithm is much harder because the recurrence is difficult to analyze. Intuitively, the algorithm works by doing a small amount of work to guarantee a good pivot is chosen. However, because there are two different recursive calls made, an analysis like the above won't work correctly. You can either use an advanced result called the Akra-Bazzi theorem, or use the formal definition of big-O to explicitly prove that the runtime is O(n). For a more detailed analysis, check out "Introduction to Algorithms, Third Edition" by Cormen, Leiserson, Rivest, and Stein.
Let me try to explain the difference between selection & binary search.
Binary search does O(1) operations in each step. There are log(N) steps in total, which makes it O(log(N)).
The selection algorithm performs O(n) operations in each step, but this 'n' keeps reducing by half each time. There are log(N) steps in total.
This makes it N + N/2 + N/4 + ... + 1 (log(N) terms) < 2N = O(N)
For binary search it is 1 + 1 + ... (log(N) times) = O(log N)
In Quicksort, the recursion tree is lg(N) levels deep and each of these levels requires O(N) amount of work. So the total running time is O(NlgN).
In Quickselect, the recursion tree is lg(N) levels deep and each level requires only half the work of the level above it. This produces the following:
N * (1/1 + 1/2 + 1/4 + 1/8 + ...)
or
N * Summation(1/2^i)
0 <= i <= lgN
The important thing to note here is that i goes from 0 to lgN, not from 0 to N.
Even if the summation ran to infinity it would evaluate to exactly 2, so the finite sum is less than 2. Hence Quickselect = O(2N) = O(N).
Quicksort does not have a big-O of n log n: its worst-case runtime is n^2.
I assume you're asking about Hoare's selection algorithm (quickselect), not the naive selection algorithm that is O(kn). Like quicksort, quickselect has a worst-case runtime of O(n^2) (if bad pivots are chosen), not O(n). It can run in expected O(n) time because it only recurses into one side, as you point out.
Because for selection, you're not necessarily sorting. When the values come from a small range, you can simply count how many items there are with each value. So an O(n) median can be found by counting how many times each value comes up, and picking the value that has 50% of the items above and below it. It's 1 pass through the array, simply incrementing a counter for each element, so it's O(n).
For example, if you have an array "a" of 8 bit numbers, you can do the following:
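int histogram[256];
int i, sum;
/* Clear the counters. */
for (i = 0; i < 256; i++)
{
    histogram[i] = 0;
}
/* One pass over the data: count how often each value occurs. */
for (i = 0; i < numItems; i++)
{
    histogram[a[i]]++;
}
/* Walk the counters until more than half of the items are covered.
   Assumes numItems > 0. */
i = 0;
sum = histogram[0];
while (sum <= numItems / 2)
{
    i++;
    sum += histogram[i];
}
At the end, the variable "i" will contain the 8-bit value of the median (the upper median when numItems is even). This takes one pass through the array "a" to count the values, plus a partial pass through the 256-entry histogram to find the final value.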
It's all in the title. Suppose $X$ is an array of n floats. The empirical CDF is the function (of t):
Fn(t) = (1/n) sum{1{Xi <= t} : i=1,...,n}
This has to be computed for t_1 < t_2 < ... < t_m (i.e. for m different, sorted values of t). My question is: what is the numerical complexity of computing this? I think O(n log(n)) + O(m log(n)) [sort the array, then perform m binary searches, one for each value of t]
but I may be naive. Can anyone confirm?
Edit:
Sorry for the mess. While writing the question, I realized that I was imposing some constraints that are not in the original problem. I respond to Yves's question below.
The Xi are not sorted.
The t_j are sorted and equi-spaced.
m is smaller than n, but not by orders of magnitudes: typically m~n/4.
The given expression, a sum of N 0/1 terms, is clearly O(N).
UPDATE:
If the Xi are presorted, the function is trivially CDFi = CDF(Xi) = i/N, and the computation is in a way O(0)!
If the Xi are unsorted, you'll need to sort first in O(N.Log(N)), unless the range of the variable allows a faster sorting such as Counting sort.
If you only need to evaluate for a small number of Xi, say K, then you can consider using the naïve summation, as K.N can beat N.Log(N).
UPDATE: (second change by the OP)
Else, sort the Xi if necessary and sort the tj if necessary. Then a single linear pass will suffice. Total complexity will be one of:
O(n.Log(n) + m.Log(m))
O(n.Log(n) + m)
O(n + m.Log(m))
O(n + m).
If m < Log(n) and the Xi are unsorted, use the naïve formula. Complexity O(m.n).
Possibly there could be better options when m>n.
UPDATE: final specs: Xi unsorted, Tj sorted, m < n.
The solution I would choose is as follows:
1) Sort the Xi.
2) "Merge" the sorted Xi and Tj. This means, progress simultaneously in the X and T lists, keeping two running indexes; make sure to always increment the index that causes the shortest move; use CDF(Tj)=i/n. This is a linear process. (Very close to a merge in mergesort.)
Global complexity is O(n.Log(n)), the merging term O(n) being absorbed in the former.
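The sort-then-merge steps can be sketched as follows. The function name `ecdf_by_merge` is made up for this example, and it assumes a non-empty X and a non-decreasing list of T values.

```python
def ecdf_by_merge(xs, ts):
    """Evaluate F_n(t) = #{x <= t} / n for each t in the sorted list ts."""
    ys = sorted(xs)                 # step 1: O(n log n)
    n = len(ys)
    result = []
    i = 0                           # number of sorted values known to be <= t
    for t in ts:                    # step 2: one linear pass over both lists
        while i < n and ys[i] <= t:
            i += 1
        result.append(i / n)
    return result


X = [0.125, 6, 3.25, 9, 1.4375, 6, 3.125, 7]
print(ecdf_by_merge(X, [1, 2, 3, 4, 5, 6]))
# [0.125, 0.25, 0.25, 0.5, 0.5, 0.75]
```

The index i never moves backwards, so the merge pass costs O(n + m) in total.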
UPDATE: uniform sampling.
When the Tj values are equi-spaced, let Tj = T0 + D.j, you can use a histogram approach.
Allocate an array of m+1 counters, initially 0. For every Xi, compute a bin index as Floor((Xi - T0) / D). Clamp negative values to 0 and values larger than m to m. Increment that bin. In the end, every bin will tell you how many X values are in range [Tj, Tj+1[.
Compute the prefix sum of the counters. They will now tell you how many X values are smaller than Tj+1, and CDF(Tj+1)=Counter[j]/n (up to the treatment of values exactly equal to Tj+1).
[Caution, this is an unchecked sketch, can be wrong in details.]
Total computation will take n bin incrementations followed by a prefix sum on m elements, i.e. O(n) operations.
# Input data
X = [0.125, 6, 3.25, 9, 1.4375, 6, 3.125, 7]
n = len(X)
# Sampling points (1 to 6)
T0 = 1
DT = 1
m = 6
# Initialize the counters: O(m)
C = [0] * m
# Accumulate the histogram: O(n)
for x in X:
    i = max(0, int((x - T0) / DT))
    if i < m:
        C[i] += 1
# Compute the prefix sum: O(m)
for i in range(m - 1):
    C[i + 1] += C[i]
# Reduce: O(m); C[i] becomes the fraction of X values below T0 + (i + 1) * DT
for i in range(m):
    C[i] /= float(n)
# Display
print("T=", C)
T= [0.25, 0.25, 0.5, 0.5, 0.5, 0.75]
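If the values are wanted exactly at the sample points, with the "<=" convention from the question, one possible variant of the histogram idea rounds the bin index up instead of down. This is a sketch under assumptions: the function name `ecdf_equispaced` is invented here, X is non-empty, and the floating-point division may misplace a value that lies within rounding error of a sample point.

```python
import math

def ecdf_equispaced(xs, t0, dt, m):
    """Return [F_n(t0 + j*dt) for j in range(m)], with F_n(t) = #{x <= t} / n."""
    n = len(xs)
    counts = [0] * m
    for x in xs:
        # Smallest j with x <= t0 + j*dt; values at or below t0 land in bin 0.
        j = max(0, math.ceil((x - t0) / dt))
        if j < m:                   # values above the last sample point are dropped
            counts[j] += 1
    for j in range(1, m):           # prefix sum: O(m)
        counts[j] += counts[j - 1]
    return [c / n for c in counts]


X = [0.125, 6, 3.25, 9, 1.4375, 6, 3.125, 7]
print(ecdf_equispaced(X, 1, 1, 6))
# [0.125, 0.25, 0.25, 0.5, 0.5, 0.75]
```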
A CDF Fn(t) is always a non-decreasing function in [0..1]. Therefore I assume your notation is saying to count the number of elements Xi <= t and return that count divided by n.
Thus if t is very large, you have n/n = 1. For very small, it's 0/n = 0 as we'd expect.
This is a poor definition of an empirical CDF. See for example Law, Averill M., Simulation & Modeling, 4th ed., p 301 for some more advanced ideas.
The simplest efficient way to compute your function (given that m, the number of Fn(t) values you need, is unknown) is first to sort the inputs Xi. This requires O(n log n) time, but needs to be done only once no matter how many t values you're processing.
Let's call the sorted values Yi. To find the count of Yi values <= t is the same as finding i such that Yi <= t < Yi+1. This can be done by binary search in O(log n) time for a given value of t. Divide by n and you have the Fn(t) value required. Of course you can repeat this m times to get the job done in O(m log n) time.
However you say your special case is m presorted values of t_j. You can find all the i values with a single pass over the Yi and simultaneously over the t_j, in the fashion of the merge operation in mergesort. With this you find all the answers in O(m + n) time.
Putting this together with the sorting cost, you have O(m + n + n log n) = O(m + n log n).
Note this is always faster than using the binary search lookup m times, O(n log n + m log n) = O((m + n) log n).
The only case where you'd want to skip the presorting is when m grows more slowly than log n, i.e. m = o(log n). This is because with no presorting, processing all the t_j needs O(mn) time: you must touch all n elements to count the number <= t_j. Consequently, if m = o(log n), skipping the presort takes o(n log n) time, i.e. it is asymptotically faster than the presort method.
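For comparison, the binary-search variant and the no-presort variant can be sketched with the standard library. The function names are made up for this example, and both assume a non-empty X.

```python
from bisect import bisect_right

def ecdf_by_binary_search(xs, ts):
    """O(n log n) sort once, then O(log n) per query t."""
    ys = sorted(xs)
    n = len(ys)
    # bisect_right returns the number of sorted values <= t.
    return [bisect_right(ys, t) / n for t in ts]

def ecdf_naive(xs, ts):
    """No presorting: O(n) per query, O(m n) in total."""
    n = len(xs)
    return [sum(1 for x in xs if x <= t) / n for t in ts]


X = [0.125, 6, 3.25, 9, 1.4375, 6, 3.125, 7]
print(ecdf_by_binary_search(X, [1, 2, 3, 4, 5, 6]))
# [0.125, 0.25, 0.25, 0.5, 0.5, 0.75]
```

Unlike the merge pass, neither variant needs the query points to be sorted.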
I'm reading the book Introduction to Algorithms, second edition, the chapter about Medians and Order Statistics, and I have a few questions about the randomized and non-randomized selection algorithms.
The problem:
Given an unordered array of integers, find i'th smallest element in the array
a. The Randomized_Select algorithm is simple, but I cannot understand the math that explains its running time. Is it possible to explain it without deep math, in a more intuitive way? I would think that it should run in O(n log n), and in O(n^2) in the worst case, just like quicksort. On average RandomizedPartition returns an index near the middle of the array, so the array is halved on each call and the next recursive call processes only half of the array. RandomizedPartition costs (r-p+1) <= n, so we have O(n log n). In the worst case it would choose the maximum element every time and split the array into parts of size (n-1) and 0 at each step. That's O(n^2).
The next one (the Select algorithm) is harder to understand than the previous one:
b. How does it differ from the previous one? Is it faster on average?
c. The algorithm consists of five steps. In the first one we divide the array into n/5 groups of 5 elements each (except possibly the last one). Then each group is sorted using insertion sort, and we select its 3rd element (the median). Because the group is sorted, we can be sure that the previous two elements are <= this median and the last two are >= it. Then we need to select the median of the medians. The book states that we recursively call the Select algorithm on these medians. How can we do that? The Select algorithm uses insertion sort, and if we swap two medians, we would need to swap all four (or even more, at deeper levels) elements that are "children" of each median. Or do we create a new array that contains only the previously selected medians, and search for the median among them? If so, how can we put them back into the original array, given that we changed their order previously?
The other steps are pretty simple and look like the randomized_partition algorithm.
Randomized select runs in O(n) expected time. Look at this analysis.
Algorithm :
Randomly choose an element
split the set in "lower than" set L and "bigger than" set B
if the size of "lower than" is i-1, we found it
if the size is bigger, then look up in L
otherwise look up in B
The total cost is the sum of :
The cost of splitting the array of size n
The cost of lookup in L or the cost of looking up in B
Edited: I Tried to restructure my post
You can notice that :
Suppose we always continue in the "bigger than" set B
The number of elements in this set is n - rank(xj)
1 <= rank(xj) <= n, so 0 <= n - rank(xj) <= n - 1
The randomness of the element xj directly affects the randomness of the number of elements which are greater than xj (and which are smaller than xj)
if xj is the element chosen, then you know that the cost is O(n) + cost(n - rank(xj)). Let's call rank(xj) = rj.
To give a good estimate we need to take the expected value of the total cost, which is
T(n) = E(cost) = sum {each possible xj}p(xj)(O(n) + T(n - rank(xj)))
xj is random. After this it is pure math.
We obtain :
T(n) = 1/n *( O(n) + sum {all possible values of rj when we continue}(O(n) + T(n - rj)) )
T(n) = 1/n *( O(n) + sum {1 <= rj <= n, rj != i}(O(n) + T(n - rj)) )
Here you can change variable, vj = n - rj
T(n) = 1/n *( O(n) + sum { 0 <= vj <= n - 1, vj != n-i}(O(n) + T(vj) ))
We move O(n) outside the sum, gaining a factor of n
T(n) = 1/n *( O(n) + O(n^2) + sum { 0 <= vj <= n - 1, vj != n-i}( T(vj) ))
We move O(n) and O(n^2) outside, losing a factor of n
T(n) = O(1) + O(n) + 1/n *( sum { 0 <= vj <= n - 1, vj != n-i} T(vj) )
Check the link on how this is computed.
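To get a feel for where such a recurrence leads, it can be evaluated numerically. The sketch below does not use the exact recurrence above; it uses a common pessimistic variant as an assumption: the split costs exactly n, the pivot rank is uniform, and the search always continues in the larger of the two parts. The function name `expected_cost_bound` is made up for this example.

```python
def expected_cost_bound(max_n):
    """t[n] = n + (1/n) * sum over pivot positions k of t[max(k, n-1-k)]."""
    t = [0.0] * (max_n + 1)
    for n in range(1, max_n + 1):
        # k elements fall below the pivot and n-1-k above; keep the larger part.
        total = sum(t[max(k, n - 1 - k)] for k in range(n))
        t[n] = n + total / n
    return t


t = expected_cost_bound(300)
# The ratio t[n] / n stays below 4, i.e. the cost is linear in n.
print(max(t[n] / n for n in range(1, 301)))
```

Under these assumptions an induction gives t[n] <= 4n, which is what the printed ratio illustrates.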
For the non-randomized version :
You say yourself:
In avg randomizedPartition returns near middle of the array.
That is exactly why the randomized algorithm works, and that is exactly what is used to construct the deterministic algorithm. Ideally you want to pick the pivot deterministically such that it produces a good split, but the best value for a good split is already the solution! So at each step they want a value which is good enough, "at least 3/10 of the array below the pivot and at least 3/10 of the array above". To achieve this they split the original array into groups of 5 at each step, and again it is a mathematical choice.
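A minimal sketch of the deterministic version follows. It is not the book's in-place procedure: the function name is invented, and as a simplifying assumption the group medians are copied into a new list, so the original array never has to be rearranged to find the pivot.

```python
def median_of_medians_select(items, k):
    """Return the k-th smallest element (0-based) of items, deterministically."""
    items = list(items)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # Sort each group of at most five elements and take its median.
        groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
        medians = [g[len(g) // 2] for g in groups]
        # Recursive call on the separate list of medians picks the pivot.
        pivot = median_of_medians_select(medians, len(medians) // 2)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower
        elif k < len(lower) + len(equal):
            return pivot
        else:
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


print(median_of_medians_select(list(range(100, 0, -1)), 49))   # 50
```

Only the pivot value comes back from the recursive call; the partition step then runs on the full current list, exactly as in the randomized version.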
I once created an explanation for this (with diagram) on the Wikipedia page for it... http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm