What does kth largest/smallest element mean? - algorithm

I'm currently studying selection algorithms, namely, median of medians.
I came across the following two sentences:
In computer science, a selection algorithm is an algorithm for finding
the kth smallest number in a list or array;
In computer science, the median of medians is an approximate (median)
selection algorithm, frequently used to supply a good pivot for an
exact selection algorithm, mainly the quickselect, that selects the
kth largest element of an initially unsorted array.
What does kth smallest/largest element mean?
To make the question a bit more concrete, consider the following (unsorted) array:
[19, 1, 7, 20, 8, 10, 19, 24, 23, 6]
For example, what is the 5th smallest element? And what is the 5th largest element?

If you sort the array from smallest to largest, the kth smallest element is the kth element in the sorted array. The kth largest element is the kth from the end in the sorted array. Let's examine your example array in Python:
In [2]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])
Out[2]: [1, 6, 7, 8, 10, 19, 19, 20, 23, 24]
The smallest element is 1, second smallest is 6, and so on. So the kth smallest is the kth element from the left. Similarly, 24 is the largest, 23 the second largest, and so on, so the kth largest element is the kth element from the right. So if k = 5:
In [3]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])[4] # index 4 is 5th from the start
Out[3]: 10
In [4]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])[-5] # index -5 is 5th from the end
Out[4]: 19
Note that you don't have to sort the array in order to get the kth smallest/largest value. Sorting is just an easy way to see which value corresponds to k.
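As a minimal sketch of getting the kth value without fully sorting, Python's standard heapq module can be used. The helper names kth_smallest and kth_largest are made up for this illustration, and this is not the median-of-medians algorithm itself, only a simple way to check answers:

```python
import heapq

def kth_smallest(values, k):
    """Return the kth smallest value (k is 1-based) without a full sort."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # nsmallest returns the k smallest values in ascending order,
    # so the last of them is the kth smallest.
    return heapq.nsmallest(k, values)[-1]

def kth_largest(values, k):
    """Return the kth largest value (k is 1-based) without a full sort."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # nlargest returns the k largest values in descending order,
    # so the last of them is the kth largest.
    return heapq.nlargest(k, values)[-1]

data = [19, 1, 7, 20, 8, 10, 19, 24, 23, 6]
print(kth_smallest(data, 5))  # 10
print(kth_largest(data, 5))   # 19
```

Duplicates count separately here, which is why the 5th largest is 19 (after 24, 23, 20 and the other 19).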

Related

Can QuickSelect find smallest element in an Array with duplicate values?

Does the QuickSelect algorithm work with duplicate values?
If I have an array
int[] array = {9, 8, 7, 6, 6, 6, 5, 0, 1, 2, 3, 4, 5, 5, 7, 200};
Will it be able to get the kth smallest element even though there are duplicates?
Yes, it works. By the end of every iteration, all elements less than the current pivot are stored to the left of the pivot.
Consider the case where all elements are the same. Every iteration then ends with the pivot at the left end of the current range, and the next iteration continues with a range that is one element shorter. So we need k iterations to find the kth smallest element.
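For illustration, here is a small quickselect sketch that handles duplicates. Note the assumptions: it uses a three-way split (less than, equal to, greater than the pivot) on copies of the data rather than the in-place partition described above, and the function name quickselect and the random pivot choice are just choices made for this example. Grouping the values equal to the pivot also avoids the slow "k iterations" behaviour when many elements are the same.

```python
import random

def quickselect(values, k):
    """Return the kth smallest element (k is 1-based); duplicates are allowed."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)  # work on a copy, leave the input untouched
    while True:
        pivot = random.choice(items)
        less = [x for x in items if x < pivot]
        equal_count = items.count(pivot)
        if k <= len(less):
            # The answer is among the elements smaller than the pivot.
            items = less
        elif k <= len(less) + equal_count:
            # The kth smallest is one of the copies of the pivot.
            return pivot
        else:
            # Skip everything <= pivot and search the larger elements.
            k -= len(less) + equal_count
            items = [x for x in items if x > pivot]

array = [9, 8, 7, 6, 6, 6, 5, 0, 1, 2, 3, 4, 5, 5, 7, 200]
print(quickselect(array, 9))  # 6
```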

kth largest element in range interval

Given a list of overlapping intervals of integers, I need to find the kth largest element.
Example:
List { (3,4), (2,8), (4,8), (1,3), (7,9) }
These intervals represent the numbers
[3, 4], [2, 3, 4, 5, 6, 7, 8], [4, 5, 6, 7, 8], [1, 2, 3], and [7, 8, 9].
If we merge and sort it in decreasing order, we get
9, 8, 8, 8, 7, 7, 7, 6, 6, 5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 1
Now the 4th largest number in the list is 8.
Can anyone please explain an efficient algorithm (one that doesn't have to generate the list) to find the kth largest element given only a list of intervals?
Find the largest number: go through the intervals and examine their right endpoints. In your case it is 9. Set count = 0 and L = 9.
Check which intervals (a, b) contain L, i.e. satisfy a <= L && L <= b. Mark each such interval as visited and increment count once for it. In your case only (7,9) contains 9, so count = 1.
If count is still less than k, decrement the current largest number (L -= 1), clear the history of visited intervals, and check the intervals again.
Every time you meet the current number L within an unvisited interval, increment count and mark the interval as visited. As soon as count becomes equal to k, the current number L is your answer.
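The descending sweep described above could be sketched in Python roughly as follows. The function name kth_largest_in_intervals is made up for this example, and intervals are assumed to be (a, b) pairs of integers with a <= b, both ends inclusive. Instead of marking intervals as visited, the sketch simply counts how many intervals contain the current number, which amounts to the same thing.

```python
def kth_largest_in_intervals(intervals, k):
    """Return the kth largest number (k is 1-based) of the multiset the intervals stand for."""
    highest = max(b for a, b in intervals)
    lowest = min(a for a, b in intervals)
    count = 0
    # Walk down from the largest number, one integer at a time.
    for current in range(highest, lowest - 1, -1):
        # Each interval containing `current` contributes one copy of it.
        count += sum(1 for a, b in intervals if a <= current <= b)
        if count >= k:
            return current
    raise ValueError("k is larger than the total number of elements")

intervals = [(3, 4), (2, 8), (4, 8), (1, 3), (7, 9)]
print(kth_largest_in_intervals(intervals, 4))  # 8
```

The full list is never generated, but the running time is proportional to the width of the value range times the number of intervals, so this is only a good fit when the numbers are not spread too far apart.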

Multi-way KK differencing algorithm vs. Greedy algorithm?

It has been proven that the Karmarkar-Karp (KK) differencing algorithm always performs better than greedy for 2-way partitioning problems, i.e. partitioning a set of n integers into 2 subsets with equal sums. Can this be extended to k-way partitioning as well? If not, is there any example where greedy performs better than KK in k-way partitioning?
KK's superiority cannot be generalized to k-way partitioning. In fact, it's easy to give a counter-example where the greedy algorithm performs better.
Let the performance measure be the maximum subset sum of the final partition.
Now, take this set of integers:
S = [10 7 5 5 6 4 10 11 12 9 10 4 3 4 5] and k=4 (partitioning into 4 equal subsets)
Skipping the intermediate steps, the KK algorithm gives the result [28, 26, 26, 26] whereas greedy gives the final partition [27, 27, 27, 24]. Since 28 > 27, greedy performed better for this example.
There is an issue with the KK solution provided above: its subset sums do not add up to the sum of S.
Sum(S) = 105
Sum([28, 26, 26, 26]) = 106
Sum([27, 27, 27, 24]) = 105
Greedy algorithm gives a result of
{{12, 6, 5, 4}{11, 7, 5, 4}{10, 10, 4, 3}{10, 9, 5}}
[27, 27, 27, 24]
KK algorithm gives a result of
{{5, 12, 6, 4}{5, 10, 7, 4}{5, 11, 10}{4, 3, 10, 9}}
[27, 26, 26, 26]
Since the highest sums are equal (27=27) and KK's lowest sum is greater than the Greedy Algorithm's (26>24), KK algorithm performs better. There are circumstances where Greedy Algorithm can still perform better than KK, but this example isn't one of them.
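The greedy result quoted above can be checked with a short sketch. It assumes the usual greedy rule (take the numbers in decreasing order and always add the next one to the subset with the smallest current sum) and breaks ties by picking the first such subset. Both the tie-breaking rule and the name greedy_partition are assumptions made for this example.

```python
def greedy_partition(numbers, k):
    """Greedy k-way partition: largest number first, into the currently lightest subset."""
    subsets = [[] for _ in range(k)]
    sums = [0] * k
    for x in sorted(numbers, reverse=True):
        # Index of the subset with the smallest sum (first one on ties).
        target = sums.index(min(sums))
        subsets[target].append(x)
        sums[target] += x
    return subsets, sums

S = [10, 7, 5, 5, 6, 4, 10, 11, 12, 9, 10, 4, 3, 4, 5]
subsets, sums = greedy_partition(S, 4)
print(subsets)  # [[12, 6, 5, 4], [11, 7, 5, 4], [10, 10, 4, 3], [10, 9, 5]]
print(sums)     # [27, 27, 27, 24]
```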

Need to understand answer of algorithm

I am trying to solve the Longest Monotonically Increasing Subsequence problem using JavaScript. In order to do that, I need to understand what a longest monotonically increasing subsequence is. Currently I am following the Wikipedia article. What I don't understand about its example is that the longest increasing subsequence is given as 0, 2, 6, 9, 13, 15 for the list 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15, …. The question is: why does the answer not have 3 between 2 and 6, and 8 between 6 and 9, etc.? How does that answer come from that list?
First of all, consider the name "Longest Monotonically Increasing Subsequence". From the given array you need to pick out the longest sequence of numbers that appear in strictly increasing order, keeping them in the same order as in the array. The picked numbers do not have to be adjacent in the array. There can be many strictly increasing subsequences, but you need to find the longest one.
So let's walk through this array: a[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15}
Some of the monotonically increasing subsequences here are:
{0,8,12,14,15} Length = 5
{0,4,12,14,15} Length = 5
{0,1,9,13,15} Length = 5 and so on.
But if you keep going like this, you will find that the longest subsequence is:
{0, 2, 6, 9, 13, 15}, Length = 6, so this is the answer.
Every time you pick a number, the next number must be larger than the previous one and must appear later in the array. Take the list {0, 2, 6, 9, 13, 15}: when you pick 9, the next number has to be larger than 9 and come after it. In the array, 13 comes after 9 and 13 > 9, so you can pick 13. You can also pick 11, but that creates another branch, like:
{0, 2, 6, 9, 11, 15} which is another solution.
This also answers your question: 3 appears in the array after 6, 9 and 13, so it cannot be placed between 2 and 6, and 8 appears before 2, so it cannot be placed between 6 and 9.
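To make the idea concrete, here is a minimal sketch of the straightforward O(n^2) dynamic-programming approach. The function name longest_increasing_subsequence is made up for this example, and since several subsequences of length 6 exist, the one it returns need not be the one quoted from Wikipedia.

```python
def longest_increasing_subsequence(seq):
    """Return one longest strictly increasing subsequence of seq (O(n^2) time)."""
    if not seq:
        return []
    n = len(seq)
    length = [1] * n   # length[i]: longest increasing subsequence ending at index i
    prev = [-1] * n    # prev[i]: index of the element before seq[i] in that subsequence
    for i in range(n):
        for j in range(i):
            # seq[j] comes earlier and is smaller, so seq[i] can extend its subsequence.
            if seq[j] < seq[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    # Start from the end of a longest subsequence and follow the links backwards.
    end = max(range(n), key=lambda i: length[i])
    result = []
    while end != -1:
        result.append(seq[end])
        end = prev[end]
    return result[::-1]

a = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(len(longest_increasing_subsequence(a)))  # 6
```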
First of all, the title of your question says "Longest increasing CONTIGUOUS subsequence", which is a slight variation of the original LIS problem. In the original problem the result need not consist of contiguous values from the original array, as pointed out in the examples above. Follow this link for a decent explanation of the LIS algorithm, which has an O(n^2) solution that can be optimized to O(n log n):
http://www.algorithmist.com/index.php/Longest_Increasing_Subsequence
For the contiguous variant of LIS, here is a decent solution:
http://worldofbrock.blogspot.com/2009/10/how-to-find-longest-continuous.html
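For comparison, the contiguous variant can be handled in a single pass. This is only a sketch of one possible approach, not necessarily the one in the linked post, and the name longest_increasing_run is made up for this example.

```python
def longest_increasing_run(seq):
    """Return the longest strictly increasing run of adjacent elements in seq."""
    if not seq:
        return []
    best_start, best_len = 0, 1
    start = 0  # index where the current run begins
    for i in range(1, len(seq)):
        if seq[i] <= seq[i - 1]:
            # The run is broken, so a new one starts here.
            start = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    return seq[best_start:best_start + best_len]

a = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(longest_increasing_run(a))  # [0, 8]
```

On the example array the values alternate up and down, so the longest contiguous run has only 2 elements, while the longest (non-contiguous) increasing subsequence has 6.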

Compare all elements inside a 2D array with each other

I have a perfectly square 64x64 2D array of integers that will never have a value greater than 64. I was wondering if there is a really fast way to compare all of the elements with each other and display the ones that are the same, in a unique way.
At the current moment I have this
2D int array named array
loop from i = 0 to 64
    loop from j = 0 to 64
        loop from k = (j+1) to 64
            loop from z = 0 to 64
                if(array[i][j] == array[k][z])
                    print "element [i][j] is same as [k][z]"
As you see having 4 nested loops is quite a stupid thing that I would like not to use. Language does not matter at all whatsoever, I am just simply curious to see what kind of cool solutions it is possible to use. Since value inside any integer will not be greater than 64, I guess you can only use 6 bits and transform array into something fancier. And that therefore would require less memory and would allow for some really fancy bitwise operations. Alas I am not quite knowledgeable enough to think in that format, and therefore would like to see what you guys can come up with.
Thanks to anyone in advance for a really unique solution.
There's no need to sort the array via an O(m log m) algorithm; you can use an O(m) bucket sort. (Letting m = n*n = 64*64).
An easy O(m) method using lists is to set up an array H of n+1 integers, initialized to -1; also allocate an array L of m integers to use as list links. For the i'th array element, with value A[i], set k=A[i] and L[i]=H[k] and H[k]=i. When that's done, each H[k] is the head of a list of the entries whose value is k. For 2D arrays, treat array element A[i,j] as A[i+n*(j-1)].
Here's a python example using python lists, with n=7 for ease of viewing results:
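    import random

    n = 7
    m = n * n
    # m random values in 1..n stand in for the flattened n x n array
    a = [random.randint(1, n) for i in range(m)]
    # h[k] collects the indices of all entries whose value is k
    h = [[] for i in range(n + 1)]
    for i in range(m):
        k = a[i]
        h[k].append(i)
    for i in range(1, n + 1):
        print('With value %2d: %s' % (i, h[i]))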
Its output looks like:
With value 1: [1, 19, 24, 28, 44, 45]
With value 2: [3, 6, 8, 16, 27, 29, 30, 34, 42]
With value 3: [12, 17, 21, 23, 32, 41, 47]
With value 4: [9, 15, 36]
With value 5: [0, 4, 7, 10, 14, 18, 26, 33, 38]
With value 6: [5, 11, 20, 22, 35, 37, 39, 43, 46, 48]
With value 7: [2, 13, 25, 31, 40]
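The H/L linked-list scheme described above (as opposed to Python lists of lists) might look like the following sketch. The names build_lists and indices_with_value are made up for this example; head plays the role of H and link the role of L.

```python
def build_lists(a, max_value):
    """Chain together the indices of equal values using two flat integer arrays."""
    head = [-1] * (max_value + 1)  # head[k]: most recent index with value k, or -1
    link = [-1] * len(a)           # link[i]: next index with the same value as a[i], or -1
    for i, k in enumerate(a):
        link[i] = head[k]
        head[k] = i
    return head, link

def indices_with_value(head, link, k):
    """Follow the chain for value k; indices come out most recent first."""
    result = []
    i = head[k]
    while i != -1:
        result.append(i)
        i = link[i]
    return result

a = [3, 1, 3, 2, 1, 3]
head, link = build_lists(a, max_value=3)
print(indices_with_value(head, link, 3))  # [5, 2, 0]
```

Building the chains takes one pass over the m entries and no sorting, which is where the O(m) bound comes from.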
class temp {
    int i, j;
    int value;
}
Then fill an array of these objects, temp array[64][64], and sort it by value (you can do this in Java by implementing the Comparable interface). Equal elements then end up next to each other, and you can read off i, j for each of them.
Sorting the m = 64*64 elements takes O(m log m) comparisons, compared with O(m^2) for comparing every pair of elements directly.
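The same idea (keep each value together with its coordinates, sort, then read off neighbours with equal values) can be sketched in Python, where tuples take the place of the temp class. The function name duplicates_by_sorting is made up for this example.

```python
from itertools import groupby

def duplicates_by_sorting(grid):
    """Return (value, positions) for every value that occurs more than once in grid."""
    # Keep each value together with its coordinates, then sort by value.
    cells = sorted((value, i, j)
                   for i, row in enumerate(grid)
                   for j, value in enumerate(row))
    groups = []
    # After sorting, equal values are adjacent, so groupby collects them.
    for value, group in groupby(cells, key=lambda cell: cell[0]):
        positions = [(i, j) for _, i, j in group]
        if len(positions) > 1:
            groups.append((value, positions))
    return groups

grid = [[1, 2],
        [2, 1]]
print(duplicates_by_sorting(grid))
# [(1, [(0, 0), (1, 1)]), (2, [(0, 1), (1, 0)])]
```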
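Use quicksort on the flattened array, then iterate through it, keeping a temporary copy of the previous value and checking whether the value under the "cursor" (the value you're currently looking at) is the same. Note that sorting only the values loses their original positions, so this finds which values are duplicated, not where they were.
array[64][64];
flat = flatten(array);   // all 64*64 values in one 1D array
quicksort(flat);
temp = flat[0];
for x = 1 to length(flat) - 1 {
    // start at 1 so the first element is not compared with itself
    if(temp == flat[x]) {
        print "duplicate value found: " + flat[x];
    }
    temp = flat[x];
}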

Resources