Mauritius national flag problem - algorithm

I've already written a solution for the Dutch national flag problem.
This time I want to try something more difficult: the Mauritius national flag problem, with 4 colours instead of 3. Any suggestions for an effective algorithm?
The Mauritius national flag problem asks you to sort a given list of pairs by the order of colors in the Mauritius national flag (Red, Blue, Yellow, Green). Within each color, the numbers must be sorted in ascending order too.
Scheme Programming Sample Input:
( (R . 3) (G . 6) (Y . 1) (B . 2) (Y . 7) (G . 3) (R . 1) (B . 8) )
Output:
( (R . 1) (R . 3) (B . 2) (B . 8) (Y . 1) (Y . 7) (G . 3) (G . 6) )

Here is what I came up with. Instead of colors, I am using numbers.
// l  - index at which 0 should be inserted.
// m1 - index at which 1 should be inserted.
// m2 - index at which 2 should be inserted.
// h  - index at which 3 should be inserted.
l = m1 = m2 = 0;
h = arr.length - 1;
while (m2 <= h) {
    if (arr[m2] == 0) {
        swap(arr, m2, l);
        l++;
        // m1 should be incremented if it is less than l, as a 1 can only
        // come after all the 0's.
        if (m1 < l) {
            m1++;
        }
        // Why not always increment m2, as we do in the 3-flag partition
        // when we see a 0? Take an example: suppose arr[l] = 1 and
        // arr[m2] = 0. We swap arr[l] with arr[m2] and increment l.
        // Now arr[m2] is 1. But if arr[m1] is 2, then we should swap
        // arr[m1] with arr[m2]. That's why arr[m2] needs to be processed
        // again for the sake of arr[m1]. In any case, m2 should not be
        // less than l, so we increment it only in that case.
        if (m2 < l) {
            m2++;
        }
    }
    // From here it is exactly the same as the 3-flag case.
    else if (arr[m2] == 1) {
        swap(arr, m1, m2);
        m1++;
        m2++;
    }
    else if (arr[m2] == 2) {
        m2++;
    }
    else {
        swap(arr, m2, h);
        h--;
    }
}
Similarly we can write for five flags.
l=m1=m2=m3=0;
h= arr.length-1;
while(m3 <= h) {
if (arr[m3] == 0) {
swap(arr, m3, l);
l++;
if (m1 < l) {
m1++;
}
if(m2<l) {
m2++;
}
if(m3<l) {
m3++;
}
}
else if(arr[m3]==1) {
swap(arr, m1, m3);
m1++;
if(m2<m1) {
m2++;
}
if(m3<m1) {
m3++;
}
}
else if(arr[m3] ==2){
swap(arr,m2,m3);
m2++;
m3++;
}
else if(arr[m3]==3) {
m3++;
}
else {
swap(arr, m3, h);
h--;
}
}

This is just like the Dutch national flag problem, but with four colors, and essentially the same strategy applies. Assume we have the following layout (where ^ marks the position being scanned):
RRRRBBB???????????YYYYGGGG
       ^
When we scan:
red: swap the first blue with the current node
blue: do nothing
yellow: swap the current node with the last ?
green: swap the last yellow with the last ?, then swap the current node with that swapped ?
So we need to keep track of one more pointer than in the three-color case: the first blue, the first ?, the last ?, and the last Y.
In general, the same strategy works for any number of colors, but an increasing number of swaps is needed.
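A minimal Python sketch of the strategy above, under some assumptions that are not in the original: colors are encoded as the integers 0 (red), 1 (blue), 2 (yellow) and 3 (green), and the names four_way_partition, b, i, j and y are made up for illustration. The green case is written as a single three-way rotation rather than two separate swaps; it should give the same result and avoids special handling when a region is empty. It only groups the colors; ordering the numbers within each color would need a separate step.

```python
def four_way_partition(a):
    """Partition a list of values in {0, 1, 2, 3} in place, in one pass."""
    b = 0            # a[:b] is red (0), a[b:i] is blue (1)
    i = 0            # a[i:j+1] has not been scanned yet
    j = len(a) - 1   # a[j+1:y+1] is yellow (2)
    y = len(a) - 1   # a[y+1:] is green (3)
    while i <= j:
        if a[i] == 0:
            # red: swap with the first blue, both regions grow/shift right
            a[b], a[i] = a[i], a[b]
            b += 1
            i += 1
        elif a[i] == 1:
            # blue: already in place
            i += 1
        elif a[i] == 2:
            # yellow: swap with the last unscanned element, rescan a[i]
            a[i], a[j] = a[j], a[i]
            j -= 1
        else:
            # green: last unscanned -> a[i], last yellow -> a[j], green -> a[y]
            a[i], a[j], a[y] = a[j], a[y], a[i]
            j -= 1
            y -= 1
    return a
```

For example, four_way_partition([1, 3, 2, 0]) rearranges the list to [0, 1, 2, 3].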

Basically, maintain the following :
a[0-p] => '0'
a[p-q] => '1'
a[q-r] => '2'
a[r-s] => traversing!
a[s-length] => '3'
Code:
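int p = -1, q = -1, r = 0, s = a.length - 1;
while (r <= s) {
    if (a[r] == 0) {
        // Move the 0 to the slot just after the 1s, then to the slot just
        // after the 0s, so the '1' and '2' regions both shift right by one
        // and stay contiguous (e.g. {1, 2, 0} must become {0, 1, 2}).
        exchange(a, q + 1, r);
        exchange(a, p + 1, q + 1);
        p++; q++; r++;
    } else if (a[r] == 1) {
        // q is the last index of the '1' region (q == p when there are no 1s).
        exchange(a, q + 1, r);
        q++; r++;
    } else if (a[r] == 2) {
        r++;
    } else {
        exchange(a, r, s);
        s--;
    }
}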

I do have a similar kind of code but instead of
function sort(a:string[]){
let low = 0;
let mid1 = 0;
let mid2 = a.length-1;
let high = a.length-1;
while(mid1 <= mid2){
switch(a[mid1]){
case '0':
[a[mid1],a[low]] = [a[low],a[mid1]];
mid1++;
low++;
break;
case '1':mid1++;break;
case '2':
case '3':[a[mid1],a[mid2]] = [a[mid2],a[mid1]];
mid2--;
break;
}
}
//sort 2 and 3
while(mid1 <= high){
switch(a[mid1]){
case '2': mid1++; break;
case '3': [a[mid1],a[high]] = [a[high],a[mid1]]
high--;
break;
}
}
}

function sort3(a:string[]):void{
let low = 0;
let mid1 = 0;
let mid2 = 0;
let high = a.length - 1;
while(mid2<=high){
switch(a[mid2]){
case '0': [a[mid2],a[low]] = [a[low],a[mid2]];
low++;
if(mid1<low)
mid1++;
if(mid2<mid1)
mid2++;
break;
case '1': [a[mid2],a[mid1]] = [a[mid1],a[mid2]];
mid1++;
mid2++;
break;
case '2':mid2++
break;
case '3':[a[mid2],a[high]] = [a[high],a[mid2]];
high--;
}
}
}

let a:string[] = ['1','2','1','0','2','4','3','0','1','3'];
function sort3(a:string[]):void{
let low = 0;
let mid1 = 0;
let mid2 = 0;
let mid3 = 0;
let high = a.length - 1;
while(mid3<=high){
switch(a[mid3]){
case '0': [a[mid3],a[low]] = [a[low],a[mid3]];
low++;
if(mid1 < low)
mid1++;
if(mid2 < mid1)
mid2++;
if(mid3 < mid2)
mid3++;
break;
case '1': [a[mid3],a[mid1]] = [a[mid1],a[mid3]];
mid1++;
if(mid2 < mid1)
mid2++;
if(mid3 < mid2)
mid3++
break;
case '2': [a[mid2],a[mid3]] = [a[mid3],a[mid2]];
mid2++;
mid3++;
break;
case '3':
mid3++;break;
case '4': [a[mid3],a[high]] = [a[high],a[mid3]];
high--;
}
}
}

Related

Print entire path of Maximum cost in a maze from first cell to last cell when moving right and bottom is allowed

I need help with an enhancement to a very popular dynamic programming question: min/max cost path.
Question : There is a 2D matrix which has values (0,1,-1).
0 -> no cherry. can go here
1 -> cherry present. can go here
-1 -> thorn present. can't go here
We need to print the maximum number of cherries collected and the entire path along which that maximum can be collected.
input :
{{0, 1, -1}, {1, 0, -1},{1,1,1}};
output :
4
(0,0) -> (1,0) -> (2,0) -> (2,1) -> (2,2)
I can write the code to print the maximum number of cherries collected, but I can't work out how to store the entire path. Since we decide which cell to choose while backtracking, it seems a little tough. I didn't find any help on the web in this regard. I'm stuck; any help would be appreciated.
public int cherryPickup(int[][] grid) {
if (grid.length == 0) {
return -1;
}
int[][] dp = new int[grid.length][grid[0].length];
setDp(dp);
int forwardMax = getForwardMax(grid, dp, 0, 0);
return forwardMax;
}
private void setDp(int[][] dp) {
for (int i = 0; i < dp.length; i++) {
for (int j = 0; j < dp[0].length; j++) {
dp[i][j] = -1;
}
}
}
private int getForwardMax(int[][] grid, int[][] dp, int i, int j) {
if(dp[i][j] != -1) {
return dp[i][j];
}
if (grid[i][j] == -1) {
dp[i][j] = 0;
return dp[i][j];
}
if (i == grid.length - 1 && j == grid[0].length - 1) {
dp[i][j] = grid[i][j];
return dp[i][j];
}
if (i == grid.length - 1) {
dp[i][j] = grid[i][j] + getForwardMax(grid, dp, i, j + 1);
return dp[i][j];
}
if (j == grid[0].length - 1) {
dp[i][j] = grid[i][j] + getForwardMax(grid, dp, i + 1, j);
return dp[i][j];
}
dp[i][j] = grid[i][j] + Math.max(getForwardMax(grid, dp, i + 1, j), getForwardMax(grid, dp, i, j + 1));
return dp[i][j];
}
Following the suggestion in the comments, I added a path[][] array to store which neighbouring cell gives the maximum.
The code below also sets (1,1) to 1, which is incorrect.
private int getForwardMax(int[][] grid, int[][] dp, int i, int j, int[][] path) {
if(dp[i][j] != -1) {
return dp[i][j];
}
if (grid[i][j] == -1) {
dp[i][j] = 0;
return dp[i][j];
}
if (i == grid.length - 1 && j == grid[0].length - 1) {
dp[i][j] = grid[i][j];
return dp[i][j];
}
if (i == grid.length - 1) {
dp[i][j] = grid[i][j] + getForwardMax(grid, dp, i, j + 1, path);
path[i][j] =1;
return dp[i][j];
}
if (j == grid[0].length - 1) {
dp[i][j] = grid[i][j] + getForwardMax(grid, dp, i + 1, j, path);
path[i][j] =1;
return dp[i][j];
}
int left = getForwardMax(grid, dp, i + 1, j, path);
int right = getForwardMax(grid, dp, i, j + 1, path);
int max = Math.max(left, right);
if(max == left) {
path[i+1][j] = 1;
} else {
path[i][j+1] = 1;
}
dp[i][j] = grid[i][j] + max;
return dp[i][j];
}
If you write your dynamic programming top-down (as you did), restoring the actual path is very easy.
You have a function getForwardMax which, for a given cell, returns the maximum amount we can collect from that cell by moving right or down.
You also know the starting position, so all you need to do is build the answer step by step:
Let's say you're in some cell (r,c).
If there is only one possible move (you're at a border), just make it.
Otherwise, we can move either to (r+1,c) or to (r,c+1).
From getForwardMax we also know how much we will earn by moving to each of those cells and completing the path to the goal.
So we just pick the move that gives the better result.
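The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: best_path and best are hypothetical names, best(r, c) plays the role of getForwardMax, and (unlike the question's code, which scores a thorn as 0) thorns and out-of-range cells are scored as minus infinity so the walk can never step onto them. On a tie it arbitrarily prefers moving down.

```python
from functools import lru_cache


def best_path(grid):
    """Return (max cherries, path) moving only right or down, or (None, [])."""
    rows, cols = len(grid), len(grid[0])
    blocked = float("-inf")  # assumption: thorns are unreachable, not worth 0

    @lru_cache(maxsize=None)
    def best(r, c):
        # Best total collectable from (r, c) to the bottom-right cell.
        if r >= rows or c >= cols or grid[r][c] == -1:
            return blocked
        if r == rows - 1 and c == cols - 1:
            return grid[r][c]
        return grid[r][c] + max(best(r + 1, c), best(r, c + 1))

    if best(0, 0) == blocked:
        return None, []  # no valid path at all

    # Walk from the start, always taking the move with the better score.
    path = [(0, 0)]
    r = c = 0
    while (r, c) != (rows - 1, cols - 1):
        if best(r + 1, c) >= best(r, c + 1):
            r += 1
        else:
            c += 1
        path.append((r, c))
    return best(0, 0), path
```

With the grid from the question, best_path([[0, 1, -1], [1, 0, -1], [1, 1, 1]]) gives 4 and the path (0,0) -> (1,0) -> (2,0) -> (2,1) -> (2,2).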
OK, bottom-up DP is a correct solution, as is yours. I just realized you don't need a separate path[][] to store the path and iterate over it.
You can use a simple while loop and choose the better of the two options, right and down.
If both happen to have the same value, you need not worry: a grid can have multiple correct solutions, so choosing either one on a tie still gives a correct solution.
We start from (0,0).
If the value in the cell below, dp[x+1][y], plus the current grid[x][y] equals dp[x][y], we move down; otherwise, if dp[x][y+1] plus grid[x][y] equals dp[x][y], we move right.
Snippet:
int x = 0,y = 0;
while(x != rows-1 || y != cols-1){
System.out.println("( " + x + " , " + y + " )");
if(x+1 < rows && grid[x][y] + dp[x+1][y] == dp[x][y]){
x++;
}else if(y + 1 < cols && grid[x][y] + dp[x][y+1] == dp[x][y]){
y++;
}
}
System.out.println("( " + x + " , " + y + " )");
Full Code: https://ideone.com/lRZ6E5

Binary search of a Matrix

Write an efficient algorithm that searches for a value in an m x n matrix.
This matrix has the following properties:
- Integers in each row are sorted from left to right.
- The first integer of each row is greater than or equal to the last integer of the previous row.
Example:
Consider the following matrix:
[
[1, 3, 5, 7],
[10, 11, 16, 20],
[23, 30, 34, 50]
]
Given target = 3, return 1 (1 corresponds to true).
Return 0 / 1 (0 if the element is not present, 1 if the element is present) for this problem.
My solution works on NetBeans but fails on the website. Any help will be appreciated.
https://www.interviewbit.com/problems/matrix-search/
public class Solution {
public int searchMatrix(ArrayList<ArrayList<Integer>> a, int b) {
int r = a.size();
int c = a.get(0).size();
int start = 0;
int end = r - 1;
// default value is last row for edge case
int biRow = r -1; // row to search column
//binary search 1st value of rows
while (start <= end) {
int mid = (start + end) / 2;
if (b == a.get(mid).get(0)) {
return 1;
}
if (a.get(mid).get(0) < b && b < a.get(end).get(0)) {
if (mid + 1 >= end) {
biRow = mid;
break;
}
} if (b < a.get(mid).get(0)) {
end = mid - 1;
} else {
start = mid + 1;
}
}
//binary search column of biRow
start = 0;
end = c-1;
while (start <= end) {
int mid = (start + end) / 2;
if (b == a.get(biRow).get(mid)) {
return 1;
}
if (b < a.get(biRow).get(mid)) {
end = mid - 1;
} else {
start = mid + 1;
}
}
return 0;
}
}
Okay, the first thing you MUST NOT do is physically concatenate the matrix into a 1D vector: that alone is O(N*M), which is linear in the number of elements and not what we want.
// Easy but TLE code
int Solution::searchMatrix(vector<vector<int> > &A, int B) {
vector<int> v;
for(auto a : A) v.insert(v.end(), a.begin(), a.end());
return binary_search(v.begin(), v.end(), B);
}
So the point is that you have to do the binary search directly on the matrix. It is pretty much the same, except that you now have to write the binary search yourself.
Since you never touch most of the elements, this is O(lg (N*M)).
// Less Easy but AC code
int Solution::searchMatrix(vector<vector<int> > &A, int B) {
int m = A.size(), n = A[0].size(), lo = 0, hi = m*n-1, mi, row, col;
while(lo <= hi){
mi = lo + ((hi-lo) >> 1);
row = mi / n;
col = mi % n;
if(A[row][col] == B) return 1;
else if(A[row][col] > B) hi = mi - 1;
else lo = mi + 1;
}
return 0;
}
I think the shared program seems to have a logical error.
When updating the end value in the first while loop, if the end value is equal to start, biRow can not be updated.
It worked well when I updated the code like below.
public class Solution {
public int searchMatrix(ArrayList<ArrayList<Integer>> a, int b) {
int r = a.size();
int c = a.get(0).size();
int start = 0;
int end = r - 1;
// default value is last row for edge case
int biRow = r -1; // row to search column
//binary search 1st value of rows
int mid = 0;
while (start <= end) {
mid = (start + end) / 2;
if ( b >= a.get(mid).get(0) && b <= a.get(mid).get(c-1)) {
break;
}
if (b < a.get(mid).get(0)) {
end = mid-1;
} else {
start = mid+1;
}
}
biRow = mid;
//binary search column of biRow
start = 0;
end = c-1;
while (start <= end) {
mid = (start + end) / 2;
if (b == a.get(biRow).get(mid)) {
return 1;
}
if (b < a.get(biRow).get(mid)) {
end = mid - 1;
} else {
start = mid + 1;
}
}
return 0;
}
}
There is a logical error in your row search loop. I made a correction and also added the boundary conditions. The time complexity of this algorithm is O(log r + log c), i.e. O(log N) for N = r*c elements.
public class Solution {
public int searchMatrix(ArrayList<ArrayList<Integer>> a, int b) {
int r = a.size();
int c = a.get(0).size();
// return 0 if b is less than 1st element or greater than last element
if (b < a.get(0).get(0) || b > a.get(r - 1).get(c - 1))
return 0;
int start = 0;
int end = r - 1;
// default value is last row for edge case
int biRow = r - 1; // row to search column
// binary search 1st value of rows
while (start <= end) {
int mid = (start + end) / 2;
if (b == a.get(mid).get(0)) {
return 1;
}
if (b >= a.get(mid).get(0) && b <= a.get(mid).get(c - 1)) {
biRow = mid;
break;
}
if (b < a.get(mid).get(0)) {
end = mid - 1;
} else {
start = mid + 1;
}
}
// binary search column of biRow
start = 0;
end = c - 1;
while (start <= end) {
int mid = (start + end) / 2;
if (b == a.get(biRow).get(mid)) {
return 1;
}
if (b < a.get(biRow).get(mid)) {
end = mid - 1;
} else {
start = mid + 1;
}
}
return 0;
}
}
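For comparison, here is a short Python sketch of the same two-stage idea (binary search for the row, then binary search inside it). The function name search_matrix is made up for illustration, and the row stage looks for the last row whose first element is <= target, which is one possible way to pick the row and not necessarily what the answers above do. The column stage uses bisect_left from the standard library.

```python
from bisect import bisect_left


def search_matrix(matrix, target):
    """Return 1 if target is in the row-wise sorted matrix, else 0."""
    if not matrix or not matrix[0]:
        return 0

    # Stage 1: find the last row whose first element is <= target.
    lo, hi = 0, len(matrix) - 1
    row = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if matrix[mid][0] <= target:
            row = mid
            lo = mid + 1
        else:
            hi = mid - 1
    if row == -1:
        return 0  # target is smaller than every element

    # Stage 2: ordinary binary search inside that row.
    cells = matrix[row]
    pos = bisect_left(cells, target)
    return 1 if pos < len(cells) and cells[pos] == target else 0
```

Both stages are logarithmic, so the total work is O(log r + log c) for r rows and c columns.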
Since the rows and columns are sorted, a binary search is appropriate, as you said. Here is a binary search (on a matrix) implementation in Ruby:
def binary_search_on_matrix(matrix,target)
row_size = matrix.size
column_size = matrix[0].size
left_index = 0
right_index = (row_size * column_size) - 1
while (left_index <= right_index)
mid_point = left_index + ((right_index - left_index) / 2)
row = mid_point / column_size
col = mid_point % column_size
value = matrix[row][col]
if (value == target)
return true
elsif (value > target)
right_index = mid_point - 1
else
left_index = mid_point + 1
end
end
return false
end
You can first convert your 2D array into a 1D array and then perform a binary search on it. Note that the copy itself takes O(m*n) time. You can refer to the code given below:
#include <stdio.h>

void search(int a[][10], int search, int m, int n)
{
    int arr[100], i = 0, j = 0, k = -1;
    /* Flatten the matrix row by row; afterwards k is the index of the last element. */
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            arr[++k] = a[i][j];
    int first = 0, last = k, middle = (first + last) / 2;
    while (first <= last)
    {
        if (arr[middle] < search)
        {
            first = middle + 1;
        }
        else if (arr[middle] == search)
        {
            /* Positions are reported 1-based. */
            printf("\n Element found at position: ( %d , %d )", (middle / n) + 1, (middle % n) + 1);
            printf(" \n Row : %d", (middle / n) + 1);
            printf("\n column : %d", (middle % n) + 1);
            break;
        }
        else
        {
            last = middle - 1;
        }
        middle = (first + last) / 2;
    }
    if (first > last)
    {
        printf("\n Element not found! ");
    }
}
This function prints the row and column of the element being searched for, if it exists. You can modify this code if you want the function to return a value depending on the search result.

For all labeled regions in a matrix find their vertices for drawing

I am looking for an algorithm that would get me data necessary for drawing labeled regions of a matrix in a 3d application.
The input looks like this:
For each region I need to find vertices of its outer boundary in CCW order.
I already can find the vertices of all horizontal or vertical edges by looking at the neighbours, but my implementation finds vertices from left to right, from top to bottom and not in the CCW order. Here is my code.
for (int i = 1; i < columns-1; i++)
for (int j = 1; j < rows - 1; j++) {
if (grid[i][j] > 0) { // not background
if ((grid[i + 1][j] != id) && (grid[i][j - 1] != id)) {
getCellTopLeftCoord(i, j, &x, &y);
polyPath[id]->Add(gcnew mPoint(x + width, y));
}
if ((grid[i - 1][j] != id) && (grid[i][j - 1] != id)) {
getCellTopLeftCoord(i, j, &x, &y);
polyPath[id]->Add(gcnew mPoint(x, y));
}
... // etc..
here are the boundaries I am interested in:
The following procedure should work if there aren't any unconnected surfaces with repeated labels:
Traverse the matrix from top to bottom and from left to right. If you encounter a non-null cell with a label that you haven't treated yet, create the path for that label.
The point you have found is guaranteed to be a northwest corner. Put that point into your path.
Now create a list of directions and start by going south. Because you are walking along the border anticlockwise, you should always have an occupied cell to the left and an unoccupied cell to the right. (Occupied here refers to a cell with the desired label.)
When you try to find the next direction, continue in the last direction and check the cells ahead of you to your right and left. If both are unoccupied, turn left. If at least the right one is occupied, turn right. Otherwise, continue straight on.
When you change direction, append the current point to your path.
Update the coordinates according to the current direction. Repeat until you reach your original coordinates.
This method will not give you the diagonal lines around the area labelled 4 in your sketch; it will follow the axis-aligned jagged outline.
Here's an example implementation in Javascript. The cell data is contained in the two-dimensional array m. cell looks up a cell, but accounts for out-of bounds look-ups. path creates the path for a single label. paths creates a list of paths; it calls path:
function cell(x, y) {
if (y < 0) return 0;
if (y >= m.length) return 0;
if (x < 0) return 0;
if (x >= m[y].length) return 0;
return m[y][x];
}
function path(x, y, c) {
var x0 = x;
var y0 = y;
var res = [{x: x, y: y}];
var dir = "s";
var l, r;
y++;
while (x != x0 || y != y0) {
var old = dir;
switch (dir) {
case "n": l = (cell(x - 1, y - 1) == c) ? 1 : 0;
r = (cell(x, y - 1) == c) ? 2 : 0;
dir = ["w", "n", "e", "e"][l + r];
break;
case "e": l = (cell(x, y - 1) == c) ? 1 : 0;
r = (cell(x, y) == c) ? 2 : 0;
dir = ["n", "e", "s", "s"][l + r];
break;
case "s": l = (cell(x, y) == c) ? 1 : 0;
r = (cell(x - 1, y) == c) ? 2 : 0;
dir = ["e", "s", "w", "w"][l + r];
break;
case "w": l = (cell(x - 1, y) == c) ? 1 : 0;
r = (cell(x - 1, y - 1) == c) ? 2 : 0;
dir = ["s", "w", "n", "n"][l + r];
break;
}
if (dir != old) res.push({x: x, y: y});
switch (dir) {
case "n": y--; break;
case "e": x++; break;
case "s": y++; break;
case "w": x--; break;
}
}
return res;
}
function paths() {
var res = {};
for (var y = 0; y < m.length; y++) {
for (var x = 0; x < m[y].length; x++) {
var c = m[y][x];
if (c && !(c in res)) {
res[c] = path(x, y, c);
}
}
}
return res;
}

Find order statistic in union of 2 sorted lists on logarithmic time [duplicate]

This is a homework question, binary search has already been introduced:
Given two arrays, respectively N and M elements in ascending order, not necessarily unique:
What is a time efficient algorithm to find the kth smallest element in the union of both arrays?
They say it takes O(logN + logM) where N and M are the arrays lengths.
Let's name the arrays a and b. Obviously we can ignore all a[i] and b[i] where i > k.
First let's compare a[k/2] and b[k/2]. Let b[k/2] > a[k/2]. Therefore we can discard also all b[i], where i > k/2.
Now we have all a[i], where i < k and all b[i], where i < k/2 to find the answer.
What is the next step?
I hope I am not answering your homework, as it has been over a year since this question was asked. Here is a tail-recursive solution that takes O(log(len(a)+len(b))) steps.
Assumption: the inputs are correct, i.e., k is a 0-based index in the range [0, len(a)+len(b)).
Base cases:
If length of one of the arrays is 0, the answer is kth element of the second array.
Reduction steps:
If mid index of a + mid index of b is less than k:
If mid element of a is greater than mid element of b, we can ignore the first half of b, adjust k.
Otherwise, ignore the first half of a, adjust k.
If k is less than sum of mid indices of a and b:
If mid element of a is greater than mid element of b, we can safely ignore second half of a.
Otherwise, we can ignore second half of b.
Code:
def kthlargest(arr1, arr2, k):
    # Despite the name, this returns the k-th smallest element (0-based).
    if len(arr1) == 0:
        return arr2[k]
    elif len(arr2) == 0:
        return arr1[k]
    mida1 = len(arr1) // 2  # integer division
    mida2 = len(arr2) // 2
    if mida1 + mida2 < k:
        if arr1[mida1] > arr2[mida2]:
            return kthlargest(arr1, arr2[mida2+1:], k - mida2 - 1)
        else:
            return kthlargest(arr1[mida1+1:], arr2, k - mida1 - 1)
    else:
        if arr1[mida1] > arr2[mida2]:
            return kthlargest(arr1[:mida1], arr2, k)
        else:
            return kthlargest(arr1, arr2[:mida2], k)
Please note that my solution is creating new copies of smaller arrays in every call, this can be easily eliminated by only passing start and end indices on the original arrays.
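One possible way to drop the copies, sketched in Python: keep a half-open window [lo, hi) on each array and shrink the windows instead of slicing. The name kth_smallest and the window variables are made up for illustration; the reduction steps are the same as in the recursive version, with k as a 0-based index.

```python
def kth_smallest(a, b, k):
    """Return the k-th smallest (0-based) element of two sorted lists."""
    if not 0 <= k < len(a) + len(b):
        raise IndexError("k out of range")
    lo_a, hi_a = 0, len(a)  # active window of a is a[lo_a:hi_a]
    lo_b, hi_b = 0, len(b)  # active window of b is b[lo_b:hi_b]
    while True:
        if lo_a == hi_a:        # a is exhausted: answer is in b
            return b[lo_b + k]
        if lo_b == hi_b:        # b is exhausted: answer is in a
            return a[lo_a + k]
        mid_a = (hi_a - lo_a) // 2  # offsets of the middle elements
        mid_b = (hi_b - lo_b) // 2
        if mid_a + mid_b < k:
            # The smaller middle and everything before it come before k.
            if a[lo_a + mid_a] > b[lo_b + mid_b]:
                lo_b += mid_b + 1
                k -= mid_b + 1
            else:
                lo_a += mid_a + 1
                k -= mid_a + 1
        else:
            # The larger middle and everything after it come after k.
            if a[lo_a + mid_a] > b[lo_b + mid_b]:
                hi_a = lo_a + mid_a
            else:
                hi_b = lo_b + mid_b
```

Each iteration removes at least one element from one window and roughly halves it, so no list is ever copied.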
You've got it, just keep going! And be careful with the indexes...
To simplify a bit I'll assume that N and M are > k, so the complexity here is O(log k), which is O(log N + log M).
Pseudo-code:
i = k/2
j = k - i
step = k/4
while step > 0
    if a[i-1] > b[j-1]
        i -= step
        j += step
    else
        i += step
        j -= step
    step /= 2
if a[i-1] > b[j-1]
    return a[i-1]
else
    return b[j-1]
For the demonstration you can use the loop invariant i + j = k, but I won't do all your homework :)
Many people have answered this "kth smallest element from two sorted arrays" question, but usually with only general ideas, without clear working code or an analysis of the boundary conditions.
Here I'd like to walk through it carefully, the way I worked through it myself, to help novices understand, along with my working Java code. A1 and A2 are two arrays sorted in ascending order, with lengths size1 and size2 respectively. We need to find the k-th smallest element in the union of those two arrays. We reasonably assume that (k > 0 && k <= size1 + size2), which implies that A1 and A2 can't both be empty.
First, let's approach this question with a slow O(k) algorithm. The method is to compare the first elements of both arrays, A1[0] and A2[0], and put the smaller one, say A1[0], into our pocket. Then compare A1[1] with A2[0], and so on. Repeat this until our pocket holds k elements. Very important: in the first step, we can only commit to A1[0] in our pocket. We can NOT include or exclude A2[0]!
The following O(k) code gives you the element one before the correct answer. I use it here to show my idea and to analyse the boundary conditions. The correct code comes after this one:
private E kthSmallestSlowWithFault(int k) {
int size1 = A1.length, size2 = A2.length;
int index1 = 0, index2 = 0;
// base case, k == 1
if (k == 1) {
if (size1 == 0) {
return A2[index2];
} else if (size2 == 0) {
return A1[index1];
} else if (A1[index1].compareTo(A2[index2]) < 0) {
return A1[index1];
} else {
return A2[index2];
}
}
/* in the next loop, we always assume there is one next element to compare with, so we can
* commit to the smaller one. What if the last element is the kth one?
*/
if (k == size1 + size2) {
if (size1 == 0) {
return A2[size2 - 1];
} else if (size2 == 0) {
return A1[size1 - 1];
} else if (A1[size1 - 1].compareTo(A2[size2 - 1]) < 0) {
return A1[size1 - 1];
} else {
return A2[size2 - 1];
}
}
/*
* only when k > 1, below loop will execute. In each loop, we commit to one element, till we
* reach (index1 + index2 == k - 1) case. But the answer is not correct, always one element
* ahead, because we didn't merge base case function into this loop yet.
*/
int lastElementFromArray = 0;
while (index1 + index2 < k - 1) {
if (A1[index1].compareTo(A2[index2]) < 0) {
index1++;
lastElementFromArray = 1;
// commit to one element from array A1, but that element is at (index1 - 1)!!!
} else {
index2++;
lastElementFromArray = 2;
}
}
if (lastElementFromArray == 1) {
return A1[index1 - 1];
} else {
return A2[index2 - 1];
}
}
The most powerful idea is that in each loop iteration we always use the base-case approach. After committing to the current smallest element, we get one step closer to the target: the k-th smallest element. Never jump into the middle and get yourself confused and lost!
By looking at the base cases k == 1 and k == size1+size2 in the code above, and combining them with the fact that A1 and A2 can't both be empty, we can turn the logic into the more concise style below.
Here is slow but correct working code:
private E kthSmallestSlow(int k) {
// System.out.println("this is an O(k) speed algorithm, very concise");
int size1 = A1.length, size2 = A2.length;
int index1 = 0, index2 = 0;
while (index1 + index2 < k - 1) {
if (size1 > index1 && (size2 <= index2 || A1[index1].compareTo(A2[index2]) < 0)) {
index1++; // here we commit to original index1 element, not the increment one!!!
} else {
index2++;
}
}
// below is the (index1 + index2 == k - 1) base case
// also eliminate the risk of referring to an element outside of index boundary
if (size1 > index1 && (size2 <= index2 || A1[index1].compareTo(A2[index2]) < 0)) {
return A1[index1];
} else {
return A2[index2];
}
}
Now we can try a faster algorithm that runs in O(log k). Similarly, compare A1[k/2 - 1] with A2[k/2 - 1]; if A1[k/2 - 1] is smaller, then all the elements from A1[0] to A1[k/2 - 1] should be in our pocket. The idea is to not commit to just one element per iteration; the first step commits k/2 elements. Again, we can NOT include or exclude A2[0] to A2[k/2 - 1] yet. So in the first step, we can't advance by more than k/2 elements. In the second step, we can't advance by more than k/4 elements, and so on.
After each step, we get much closer to the k-th element. At the same time each step gets smaller and smaller, until we reach (k-1 == index1+index2). Then we can refer to the simple and powerful base case again.
Here is the correct working code:
private E kthSmallestFast(int k) {
// System.out.println("this is an O(log k) speed algorithm with meaningful variables name");
int size1 = A1.length, size2 = A2.length;
int index1 = 0, index2 = 0, step = 0;
while (index1 + index2 < k - 1) {
step = (k - index1 - index2) / 2;
int step1 = index1 + step;
int step2 = index2 + step;
if (size1 > step1 - 1
&& (size2 <= step2 - 1 || A1[step1 - 1].compareTo(A2[step2 - 1]) < 0)) {
index1 = step1; // commit to element at index = step1 - 1
} else {
index2 = step2;
}
}
// the base case of (index1 + index2 == k - 1)
if (size1 > index1 && (size2 <= index2 || A1[index1].compareTo(A2[index2]) < 0)) {
return A1[index1];
} else {
return A2[index2];
}
}
Some people may worry: what if (index1+index2) jumps over k-1? Could we miss the base case (k-1 == index1+index2)? That's impossible. Each step advances by at most half of the remaining distance, just as 0.5+0.25+0.125+... never goes beyond 1.
Of course, it is very easy to turn the above code into a recursive algorithm:
private E kthSmallestFastRecur(int k, int index1, int index2, int size1, int size2) {
// System.out.println("this is an O(log k) speed algorithm with meaningful variables name");
// the base case of (index1 + index2 == k - 1)
if (index1 + index2 == k - 1) {
if (size1 > index1 && (size2 <= index2 || A1[index1].compareTo(A2[index2]) < 0)) {
return A1[index1];
} else {
return A2[index2];
}
}
int step = (k - index1 - index2) / 2;
int step1 = index1 + step;
int step2 = index2 + step;
if (size1 > step1 - 1 && (size2 <= step2 - 1 || A1[step1 - 1].compareTo(A2[step2 - 1]) < 0)) {
index1 = step1;
} else {
index2 = step2;
}
return kthSmallestFastRecur(k, index1, index2, size1, size2);
}
I hope the above analysis and Java code help you understand. But never copy my code as your homework! Cheers ;)
Here's a C++ iterative version of @lambdapilgrim's solution (see the explanation of the algorithm there):
#include <algorithm> // is_sorted
#include <cassert>
#include <cstddef>   // size_t
#include <iterator>
template<class RandomAccessIterator, class Compare>
typename std::iterator_traits<RandomAccessIterator>::value_type
nsmallest_iter(RandomAccessIterator firsta, RandomAccessIterator lasta,
RandomAccessIterator firstb, RandomAccessIterator lastb,
size_t n,
Compare less) {
assert(std::is_sorted(firsta, lasta, less) && std::is_sorted(firstb, lastb, less));
for ( ; ; ) {
assert(n < static_cast<size_t>((lasta - firsta) + (lastb - firstb)));
if (firsta == lasta) return *(firstb + n);
if (firstb == lastb) return *(firsta + n);
size_t mida = (lasta - firsta) / 2;
size_t midb = (lastb - firstb) / 2;
if ((mida + midb) < n) {
if (less(*(firstb + midb), *(firsta + mida))) {
firstb += (midb + 1);
n -= (midb + 1);
}
else {
firsta += (mida + 1);
n -= (mida + 1);
}
}
else {
if (less(*(firstb + midb), *(firsta + mida)))
lasta = (firsta + mida);
else
lastb = (firstb + midb);
}
}
}
It works for all 0 <= n < (size(a) + size(b)) indexes and has O(log(size(a)) + log(size(b))) complexity.
Example
#include <functional> // greater<>
#include <iostream>
#define SIZE(a) (sizeof(a) / sizeof(*a))
int main() {
int a[] = {5,4,3};
int b[] = {2,1,0};
int k = 1; // find minimum value, the 1st smallest value in a,b
int i = k - 1; // convert to zero-based indexing
int v = nsmallest_iter(a, a + SIZE(a), b, b + SIZE(b),
SIZE(a)+SIZE(b)-1-i, std::greater<int>());
std::cout << v << std::endl; // -> 0
return v;
}
My attempt for first k numbers, kth number in 2 sorted arrays, and in n sorted arrays:
// require() is recognizable by node.js but not by browser;
// for running/debugging in browser, put utils.js and this file in <script> elements,
if (typeof require === "function") require("./utils.js");
// Find K largest numbers in two sorted arrays.
function k_largest(a, b, c, k) {
var sa = a.length;
var sb = b.length;
if (sa + sb < k) return -1;
var i = 0;
var j = sa - 1;
var m = sb - 1;
while (i < k && j >= 0 && m >= 0) {
if (a[j] > b[m]) {
c[i] = a[j];
i++;
j--;
} else {
c[i] = b[m];
i++;
m--;
}
}
debug.log(2, "i: "+ i + ", j: " + j + ", m: " + m);
if (i === k) {
return 0;
} else if (j < 0) {
while (i < k) {
c[i++] = b[m--];
}
} else {
while (i < k) c[i++] = a[j--];
}
return 0;
}
// find k-th largest or smallest number in 2 sorted arrays.
function kth(a, b, kd, dir){
sa = a.length; sb = b.length;
if (kd<1 || sa+sb < kd){
throw "Mission Impossible! I quit!";
}
var k;
//finding the kd_th largest == finding the smallest k_th;
if (dir === 1){ k = kd;
} else if (dir === -1){ k = sa + sb - kd + 1;}
else throw "Direction has to be 1 (smallest) or -1 (largest).";
return find_kth(a, b, k, sa-1, 0, sb-1, 0);
}
// find k-th smallest number in 2 sorted arrays;
function find_kth(c, d, k, cmax, cmin, dmax, dmin){
sc = cmax-cmin+1; sd = dmax-dmin+1; k0 = k; cmin0 = cmin; dmin0 = dmin;
debug.log(2, "=k: " + k +", sc: " + sc + ", cmax: " + cmax +", cmin: " + cmin + ", sd: " + sd +", dmax: " + dmax + ", dmin: " + dmin);
c_comp = k0-sc;
if (c_comp <= 0){
cmax = cmin0 + k0-1;
} else {
dmin = dmin0 + c_comp-1;
k -= c_comp-1;
}
d_comp = k0-sd;
if (d_comp <= 0){
dmax = dmin0 + k0-1;
} else {
cmin = cmin0 + d_comp-1;
k -= d_comp-1;
}
sc = cmax-cmin+1; sd = dmax-dmin+1;
debug.log(2, "#k: " + k +", sc: " + sc + ", cmax: " + cmax +", cmin: " + cmin + ", sd: " + sd +", dmax: " + dmax + ", dmin: " + dmin + ", c_comp: " + c_comp + ", d_comp: " + d_comp);
if (k===1) return (c[cmin]<d[dmin] ? c[cmin] : d[dmin]);
if (k === sc+sd) return (c[cmax]>d[dmax] ? c[cmax] : d[dmax]);
m = Math.floor((cmax+cmin)/2);
n = Math.floor((dmax+dmin)/2);
debug.log(2, "m: " + m + ", n: "+n+", c[m]: "+c[m]+", d[n]: "+d[n]);
if (c[m]<d[n]){
if (m === cmax){ // only 1 element in c;
return d[dmin+k-1];
}
k_next = k-(m-cmin+1);
return find_kth(c, d, k_next, cmax, m+1, dmax, dmin);
} else {
if (n === dmax){
return c[cmin+k-1];
}
k_next = k-(n-dmin+1);
return find_kth(c, d, k_next, cmax, cmin, dmax, n+1);
}
}
function traverse_at(a, ae, h, l, k, at, worker, wp){
var n = ae ? ae.length : 0;
var get_node;
switch (at){
case "k": get_node = function(idx){
var node = {};
var pos = l[idx] + Math.floor(k/n) - 1;
if (pos<l[idx]){ node.pos = l[idx]; }
else if (pos > h[idx]){ node.pos = h[idx];}
else{ node.pos = pos; }
node.idx = idx;
node.val = a[idx][node.pos];
debug.log(6, "pos: "+pos+"\nnode =");
debug.log(6, node);
return node;
};
break;
case "l": get_node = function(idx){
debug.log(6, "a["+idx+"][l["+idx+"]]: "+a[idx][l[idx]]);
return a[idx][l[idx]];
};
break;
case "h": get_node = function(idx){
debug.log(6, "a["+idx+"][h["+idx+"]]: "+a[idx][h[idx]]);
return a[idx][h[idx]];
};
break;
case "s": get_node = function(idx){
debug.log(6, "h["+idx+"]-l["+idx+"]+1: "+(h[idx] - l[idx] + 1));
return h[idx] - l[idx] + 1;
};
break;
default: get_node = function(){
debug.log(1, "!!! Exception: get_node() returns null.");
return null;
};
break;
}
worker.init();
debug.log(6, "--* traverse_at() *--");
var i;
if (!wp){
for (i=0; i<n; i++){
worker.work(get_node(ae[i]));
}
} else {
for (i=0; i<n; i++){
worker.work(get_node(ae[i]), wp);
}
}
return worker.getResult();
}
sumKeeper = function(){
var res = 0;
return {
init : function(){ res = 0;},
getResult: function(){
debug.log(5, "## sumKeeper.getResult: returning: "+res);
return res;
},
work : function(node){ if (node!==null) res += node;}
};
}();
maxPicker = function(){
var res = null;
return {
init : function(){ res = null;},
getResult: function(){
debug.log(5, "## maxPicker.getResult: returning: "+res);
return res;
},
work : function(node){
if (res === null){ res = node;}
else if (node!==null && node > res){ res = node;}
}
};
}();
minPicker = function(){
var res = null;
return {
init : function(){ res = null;},
getResult: function(){
debug.log(5, "## minPicker.getResult: returning: ");
debug.log(5, res);
return res;
},
work : function(node){
if (res === null && node !== null){ res = node;}
else if (node!==null &&
node.val !==undefined &&
node.val < res.val){ res = node; }
else if (node!==null && node < res){ res = node;}
}
};
}();
// find k-th smallest number in n sorted arrays;
// need to consider the case where some of the subarrays are taken out of the selection;
function kth_n(a, ae, k, h, l){
var n = ae.length;
debug.log(2, "------** kth_n() **-------");
debug.log(2, "n: " +n+", k: " + k);
debug.log(2, "ae: ["+ae+"], len: "+ae.length);
debug.log(2, "h: [" + h + "]");
debug.log(2, "l: [" + l + "]");
for (var i=0; i<n; i++){
if (h[ae[i]]-l[ae[i]]+1>k) h[ae[i]]=l[ae[i]]+k-1;
}
debug.log(3, "--after reduction --");
debug.log(3, "h: [" + h + "]");
debug.log(3, "l: [" + l + "]");
if (n === 1) // one array left: index from its current lower bound
return a[ae[0]][l[ae[0]]+k-1];
if (k === 1)
return traverse_at(a, ae, h, l, k, "l", minPicker);
if (k === traverse_at(a, ae, h, l, k, "s", sumKeeper))
return traverse_at(a, ae, h, l, k, "h", maxPicker);
var kn = traverse_at(a, ae, h, l, k, "k", minPicker);
debug.log(3, "kn: ");
debug.log(3, kn);
var idx = kn.idx;
debug.log(3, "last: k: "+k+", l["+kn.idx+"]: "+l[idx]);
k -= kn.pos - l[idx] + 1;
l[idx] = kn.pos + 1;
debug.log(3, "next: "+"k: "+k+", l["+kn.idx+"]: "+l[idx]);
if (h[idx]<l[idx]){ // all elements in a[idx] selected;
//remove a[idx] from the arrays.
debug.log(4, "All elements selected in a["+idx+"].");
debug.log(5, "last ae: ["+ae+"]");
ae.splice(ae.indexOf(idx), 1);
h[idx] = l[idx] = "_"; // For display purpose only.
debug.log(5, "next ae: ["+ae+"]");
}
return kth_n(a, ae, k, h, l);
}
function find_kth_in_arrays(a, k){
if (!a || a.length<1 || k<1) throw "Mission Impossible!";
var ae=[], h=[], l=[], n=0, s, ts=0;
for (var i=0; i<a.length; i++){
s = a[i] && a[i].length;
if (s>0){
ae.push(i); h[i] = s-1; l[i] = 0; // h and l are indexed by array index, as kth_n() expects
ts+=s;
}
}
if (k>ts) throw "Too few elements to choose from!";
return kth_n(a, ae, k, h, l);
}
/////////////////////////////////////////////////////
// tests
// To show everything: use 6.
debug.setLevel(1);
var a = [2, 3, 5, 7, 89, 223, 225, 667];
var b = [323, 555, 655, 673];
//var b = [99];
var c = [];
debug.log(1, "a = (len: " + a.length + ")");
debug.log(1, a);
debug.log(1, "b = (len: " + b.length + ")");
debug.log(1, b);
for (var k=1; k<a.length+b.length+1; k++){
debug.log(1, "================== k: " + k + "=====================");
if (k_largest(a, b, c, k) === 0 ){
debug.log(1, "c = (len: "+c.length+")");
debug.log(1, c);
}
try{
result = kth(a, b, k, -1);
debug.log(1, "===== The " + k + "-th largest number: " + result);
} catch (e) {
debug.log(0, "Error message from kth(): " + e);
}
debug.log(1, "==================================================");
}
debug.log(1, "################# Now for the n sorted arrays ######################");
debug.log(1, "####################################################################");
x = [[1, 3, 5, 7, 9],
[-2, 4, 6, 8, 10, 12],
[8, 20, 33, 212, 310, 311, 623],
[8],
[0, 100, 700],
[300],
[],
null];
debug.log(1, "x = (len: "+x.length+")");
debug.log(1, x);
for (var i=0, num=0; i<x.length; i++){
if (x[i]!== null) num += x[i].length;
}
debug.log(1, "total number of elements: "+num);
// to test k in specific ranges:
var start = 0, end = 25;
for (k=start; k<end; k++){
debug.log(1, "=========================== k: " + k + "===========================");
try{
result = find_kth_in_arrays(x, k);
debug.log(1, "====== The " + k + "-th smallest number: " + result);
} catch (e) {
debug.log(1, "Error message from find_kth_in_arrays: " + e);
}
debug.log(1, "=================================================================");
}
debug.log(1, "x = (len: "+x.length+")");
debug.log(1, x);
debug.log(1, "total number of elements: "+num);
The complete code with debug utils can be found at: https://github.com/brainclone/teasers/tree/master/kth
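As a cross-check for find_kth_in_arrays, here is a minimal reference sketch in Python. It is not the approach used above: it simply merges the arrays lazily and takes the k-th item, which costs roughly O(k log n) for n arrays. The function name kth_smallest_reference is my own, chosen for illustration.

```python
import heapq
from itertools import islice


def kth_smallest_reference(arrays, k):
    """Return the k-th smallest (1-based) element across several sorted lists."""
    # Skip None and empty entries, like the test input x above.
    usable = [arr for arr in arrays if arr]
    total = sum(len(arr) for arr in usable)
    if k < 1 or k > total:
        raise ValueError("k must be between 1 and %d" % total)
    # heapq.merge lazily yields the merged, sorted sequence.
    merged = heapq.merge(*usable)
    # Skip the first k-1 items and return the next one.
    return next(islice(merged, k - 1, None))
```

Running it on the same x for every k and comparing with the output of find_kth_in_arrays is a cheap way to catch off-by-one errors.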
Most of the answers I found here search both arrays at once. That works, but it is harder to implement because there are many edge cases to take care of. Besides, most of the implementations are recursive, which adds the space cost of the recursion stack. So instead I decided to binary-search only the smaller array and adjust the pointer into the second array based on the pointer into the first. The following implementation has O(log(min(n,m))) time complexity and O(1) space complexity.
public static int kth_two_sorted(int []a, int b[],int k){
if(a.length > b.length){
return kth_two_sorted(b,a,k);
}
if(a.length + b.length < k){
throw new RuntimeException("wrong argument");
}
int low = 0;
int high = k;
if(a.length <= k){
high = a.length-1;
}
while(low <= high){
int sizeA = low+(high - low)/2;
int sizeB = k - sizeA;
boolean shrinkLeft = false;
boolean extendRight = false;
if(sizeA != 0){
if(sizeB !=b.length){
if(a[sizeA-1] > b[sizeB]){
shrinkLeft = true;
high = sizeA-1;
}
}
}
if(sizeA!=a.length){
if(sizeB!=0){
if(a[sizeA] < b[sizeB-1]){
extendRight = true;
low = sizeA;
}
}
}
if(!shrinkLeft && !extendRight){
return Math.max(a[sizeA-1],b[sizeB-1]) ;
}
}
throw new IllegalArgumentException("we can't be here");
}
We maintain a range [low, high] for array a and narrow it as the algorithm proceeds. sizeA is the number of the k items that come from array a, and it is derived from low and high. sizeB is defined the same way for array b, computed so that sizeA+sizeB=k. Then, based on the values at those two borders, we conclude whether to extend to the right in array a or shrink to the left. If we stay in the same position, we have found the solution, and we return the max of the values at position sizeA-1 in a and sizeB-1 in b.
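The idea described above can be sketched in Python as follows. This is my own minimal restatement, not a line-by-line port of the Java code: it clamps the search range to [max(0, k - len(b)), min(k, len(a))] so that every index it touches is valid, and it handles the case where all k items come from one array.

```python
def kth_two_sorted(a, b, k):
    """Return the k-th smallest (1-based) element of two sorted lists."""
    if len(a) > len(b):
        a, b = b, a  # always binary-search the shorter list
    if k < 1 or k > len(a) + len(b):
        raise ValueError("k out of range")
    # size_a = how many of the k smallest items are taken from a.
    low = max(0, k - len(b))
    high = min(k, len(a))
    while low < high:
        size_a = (low + high) // 2
        size_b = k - size_a
        # a[size_a] is the first item NOT taken from a,
        # b[size_b - 1] is the last item taken from b.
        if a[size_a] < b[size_b - 1]:
            low = size_a + 1   # we must take more items from a
        else:
            high = size_a      # taking size_a items from a is enough
    size_a, size_b = low, k - low
    candidates = []
    if size_a > 0:
        candidates.append(a[size_a - 1])
    if size_b > 0:
        candidates.append(b[size_b - 1])
    # The k-th smallest is the largest of the items taken.
    return max(candidates)
```

The loop runs O(log(min(n,m))) times and uses O(1) extra space, matching the complexity claimed for this approach.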
Here's my code based on Jules Olleon's solution:
int getNth(vector<int>& v1, vector<int>& v2, int n)
{
int step = n / 4;
int i1 = n / 2;
int i2 = n - i1;
while(!(v2[i2] >= v1[i1 - 1] && v1[i1] > v2[i2 - 1]))
{
if (v1[i1 - 1] >= v2[i2 - 1])
{
i1 -= step;
i2 += step;
}
else
{
i1 += step;
i2 -= step;
}
step /= 2;
if (!step) step = 1;
}
if (v1[i1 - 1] >= v2[i2 - 1])
return v1[i1 - 1];
else
return v2[i2 - 1];
}
int main()
{
int a1[] = {1,2,3,4,5,6,7,8,9};
int a2[] = {4,6,8,10,12};
//int a1[] = {1,2,3,4,5,6,7,8,9};
//int a2[] = {4,6,8,10,12};
//int a1[] = {1,7,9,10,30};
//int a2[] = {3,5,8,11};
vector<int> v1(a1, a1+9);
vector<int> v2(a2, a2+5);
cout << getNth(v1, v2, 5);
return 0;
}
Here is my implementation in C; you can refer to @Jules Olléon's explanation of the algorithm: the idea behind the algorithm is that we maintain i + j = k, and find i and j such that a[i-1] < b[j-1] < a[i] (or the other way round). Now since there are i elements in 'a' smaller than b[j-1], and j-1 elements in 'b' smaller than b[j-1], b[j-1] is the i + j-1 + 1 = kth smallest element. To find such i, j the algorithm does a dichotomic search on the arrays.
int find_k(int A[], int m, int B[], int n, int k) {
if (m <= 0 )return B[k-1];
else if (n <= 0) return A[k-1];
int i = ( m/double (m + n)) * (k-1);
if (i < m-1 && i<k-1) ++i;
int j = k - 1 - i;
int Ai_1 = (i > 0) ? A[i-1] : INT_MIN, Ai = (i<m)?A[i]:INT_MAX;
int Bj_1 = (j > 0) ? B[j-1] : INT_MIN, Bj = (j<n)?B[j]:INT_MAX;
if (Ai >= Bj_1 && Ai <= Bj) {
return Ai;
} else if (Bj >= Ai_1 && Bj <= Ai) {
return Bj;
}
if (Ai < Bj_1) { // the answer can't be within A[0,...,i]
return find_k(A+i+1, m-i-1, B, n, j);
} else { // the answer can't be within B[0,...,j]
return find_k(A, m, B+j+1, n-j-1, i);
}
}
Here's my solution. The C++ code prints the kth smallest value as well as the number of loop iterations needed to find it, which in my opinion is on the order of log(k). However, the code requires k to be no larger than the length of the first array, which is a limitation.
#include <iostream>
#include <vector>
#include<math.h>
using namespace std;
template<typename comparable>
comparable kthSmallest(vector<comparable> & a, vector<comparable> & b, int k){
int idx1; // Index in the first array a
int idx2; // Index in the second array b
comparable maxVal, minValPlus;
float iter = k;
int numIterations = 0;
if(k > a.size()){ // Checks if k is larger than the size of first array
cout << " k is larger than the first array" << endl;
return -1;
}
else{ // If all conditions are satisfied, initialize the indexes
idx1 = k - 1;
idx2 = -1;
}
for ( ; ; ){
numIterations ++;
if(idx2 == -1 || b[idx2] <= a[idx1] ){
maxVal = a[idx1];
minValPlus = b[idx2 + 1];
idx1 = idx1 - ceil(iter/2); // Binary search
idx2 = k - idx1 - 2; // Ensures sum of indices = k - 2
}
else{
maxVal = b[idx2];
minValPlus = a[idx1 + 1];
idx2 = idx2 - ceil(iter/2); // Binary search
idx1 = k - idx2 - 2; // Ensures sum of indices = k - 2
}
if(minValPlus >= maxVal){ // Check if kth smallest value has been found
cout << "The number of iterations to find the " << k << "(th) smallest value is " << numIterations << endl;
return maxVal;
}
else
iter/=2; // Reduce search space of binary search
}
}
int main(){
//Test Cases
vector<int> a = {2, 4, 9, 15, 22, 34, 45, 55, 62, 67, 78, 85};
vector<int> b = {1, 3, 6, 8, 11, 13, 15, 20, 56, 67, 89};
// Input k < a.size()
int kthSmallestVal;
for (int k = 1; k <= a.size() ; k++){
kthSmallestVal = kthSmallest<int>( a ,b ,k );
cout << k <<" (th) smallest Value is " << kthSmallestVal << endl << endl << endl;
}
}
Basically, with this approach you can discard k/2 elements at each step.
k changes recursively from k => k/2 => k/4 => ... until it reaches 1.
So the time complexity is O(log k).
At k=1, we return the smaller of the two arrays' current first elements.
The following code is in Java. Please note that we subtract 1 from the indices in the code because Java array indices start from 0 and not 1; e.g. k=3 is represented by the element at index 2 of an array.
private int kthElement(int[] arr1, int[] arr2, int k) {
if (k < 1 || k > (arr1.length + arr2.length))
return -1;
return helper(arr1, 0, arr1.length - 1, arr2, 0, arr2.length - 1, k);
}
private int helper(int[] arr1, int low1, int high1, int[] arr2, int low2, int high2, int k) {
if (low1 > high1) {
return arr2[low2 + k - 1];
} else if (low2 > high2) {
return arr1[low1 + k - 1];
}
if (k == 1) {
return Math.min(arr1[low1], arr2[low2]);
}
int i = Math.min(low1 + k / 2, high1 + 1);
int j = Math.min(low2 + k / 2, high2 + 1);
if (arr1[i - 1] > arr2[j - 1]) {
return helper(arr1, low1, high1, arr2, j, high2, k - (j - low2));
} else {
return helper(arr1, i, high1, arr2, low2, high2, k - (i - low1));
}
}
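For readers who prefer a loop to recursion, the same "discard about k/2 elements per step" idea can be sketched iteratively in Python. The name kth_by_halving is my own; the logic follows the Java helper above, with low1 and low2 playing the same roles.

```python
def kth_by_halving(arr1, arr2, k):
    """Return the k-th smallest (1-based) element of two sorted lists."""
    if k < 1 or k > len(arr1) + len(arr2):
        raise ValueError("k out of range")
    low1 = low2 = 0  # first not-yet-discarded index in each list
    while True:
        if low1 == len(arr1):          # arr1 exhausted
            return arr2[low2 + k - 1]
        if low2 == len(arr2):          # arr2 exhausted
            return arr1[low1 + k - 1]
        if k == 1:
            return min(arr1[low1], arr2[low2])
        half = k // 2
        # i and j point one past the candidate block in each list.
        i = min(low1 + half, len(arr1))
        j = min(low2 + half, len(arr2))
        if arr1[i - 1] > arr2[j - 1]:
            # Everything in arr2[low2:j] is among the k smallest; drop it.
            k -= j - low2
            low2 = j
        else:
            # Everything in arr1[low1:i] is among the k smallest; drop it.
            k -= i - low1
            low1 = i
```

Because the loop keeps only two indices and k, it avoids the recursion stack while keeping the O(log k) step count.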
The first pseudocode provided above does not work for many values. For example,
here are two arrays:
int[] a = { 1, 5, 6, 8, 9, 11, 15, 17, 19 };
int[] b = { 4, 7, 8, 13, 15, 18, 20, 24, 26 };
It did not work for k=3 and k=9 on these arrays. I have another solution, given below.
private static void traverse(int pt, int len) {
int temp = 0;
if (len == 1) {
int val = 0;
while (k - (pt + 1) - 1 > -1 && M[pt] < N[k - (pt + 1) - 1]) {
if (val == 0)
val = M[pt] < N[k - (pt + 1) - 1] ? N[k - (pt + 1) - 1]
: M[pt];
else {
int t = M[pt] < N[k - (pt + 1) - 1] ? N[k - (pt + 1) - 1]
: M[pt];
val = val < t ? val : t;
}
++pt;
}
if (val == 0)
val = M[pt] < N[k - (pt + 1) - 1] ? N[k - (pt + 1) - 1] : M[pt];
System.out.println(val);
return;
}
temp = len / 2;
if (M[pt + temp - 1] < N[k - (pt + temp) - 1]) {
traverse(pt + temp, temp);
} else {
traverse(pt, temp);
}
}
But... it also does not work for k=5. There is an even/odd catch with k that keeps it from being simple.
public class KthSmallestInSortedArray {
public static void main(String[] args) {
int a1[] = {2, 3, 10, 11, 43, 56},
a2[] = {12, 13, 14, 24, 34, 36},
k = 4;
System.out.println(findKthElement(a1, a2, k));
}
private static int findKthElement(int a1[], int a2[], int k) {
/** k must not exceed the combined length of both arrays **/
if (a1.length + a2.length < k) {
throw new IllegalArgumentException();
}
/** K must be greater than zero **/
if (k <= 0) {
throw new IllegalArgumentException();
}
/**
* Finding begin, l and end such that
* begin <= l < end
* a1[0].....a1[l-1] and
* a2[0]....a2[k-l-1] are the smallest k numbers
*/
int begin = Math.max(0, k - a2.length);
int end = Math.min(a1.length, k);
while (begin < end) {
int l = begin + (end - begin) / 2;
/** Can we include a1[l] in the k smallest numbers */
if ((l < a1.length) &&
(k - l > 0) &&
(a1[l] < a2[k - l - 1])) {
begin = l + 1;
} else if ((l > 0) &&
(k - l < a2.length) &&
(a1[l - 1] > a2[k - l])) {
/**
* This is the case where we can discard
* a1[l-1] from the set of k smallest numbers
*/
end = l;
} else {
/**
* We found our answer since both inequalities were
* false
*/
begin = l;
break;
}
}
if (begin == 0) {
return a2[k - 1];
} else if (begin == k) {
return a1[k - 1];
} else {
return Math.max(a1[begin - 1], a2[k - begin - 1]);
}
}
}
Here is my solution in Java. I will try to optimize it further.
public class FindKLargestTwoSortedArray {
public static void main(String[] args) {
int[] arr1 = { 10, 20, 40, 80 };
int[] arr2 = { 15, 35, 50, 75 };
FindKLargestTwoSortedArray(arr1, 0, arr1.length - 1, arr2, 0,
arr2.length - 1, 6);
}
public static void FindKLargestTwoSortedArray(int[] arr1, int start1,
int end1, int[] arr2, int start2, int end2, int k) {
if ((start1 <= end1 && start1 >= 0 && end1 < arr1.length)
&& (start2 <= end2 && start2 >= 0 && end2 < arr2.length)) {
int midIndex1 = (start1 + (k - 1) / 2);
midIndex1 = midIndex1 >= arr1.length ? arr1.length - 1 : midIndex1;
int midIndex2 = (start2 + (k - 1) / 2);
midIndex2 = midIndex2 >= arr2.length ? arr2.length - 1 : midIndex2;
if (arr1[midIndex1] == arr2[midIndex2]) {
System.out.println("element is " + arr1[midIndex1]);
} else if (arr1[midIndex1] < arr2[midIndex2]) {
if (k == 1) {
System.out.println("element is " + arr1[midIndex1]);
return;
} else if (k == 2) {
System.out.println("element is " + arr2[midIndex2]);
return;
}else if (midIndex1 == arr1.length-1 || midIndex2 == arr2.length-1 ) {
if(k==(arr1.length+arr2.length)){
System.out.println("element is " + arr2[midIndex2]);
return;
}else if(k==(arr1.length+arr2.length)-1){
System.out.println("element is " + arr1[midIndex1]);
return;
}
}
int remainingElementToSearch = k - (midIndex1-start1);
FindKLargestTwoSortedArray(
arr1,
midIndex1,
(midIndex1 + remainingElementToSearch) >= arr1.length ? arr1.length-1
: (midIndex1 + remainingElementToSearch), arr2,
start2, midIndex2, remainingElementToSearch);
} else if (arr1[midIndex1] > arr2[midIndex2]) {
FindKLargestTwoSortedArray(arr2, start2, end2, arr1, start1,
end1, k);
}
} else {
return;
}
}
}
This is inspired by the algorithm in a wonderful YouTube video.
Link to code complexity (log(n)+log(m))
Link to Code (log(n)*log(m))
Implementation of (log(n)+log(m)) solution
I would like to add my explanation of the problem.
This is a classic problem in which we have to exploit the fact that the two arrays are sorted.
We are given two sorted arrays: arr1 of size sz1 and arr2 of size sz2.
Checking if k is valid
a) If k > (sz1+sz2), the kth smallest element of the union of the two sorted arrays does not exist, so return "Data Invalid".
Managing Edge Cases
b) Otherwise k is valid and feasible. We pad both arrays with a -infinity value at the front and a +infinity value at the end to cover the edge cases k = 1, 2 and k = (sz1+sz2-1), (sz1+sz2), etc.
Both arrays now have sizes (sz1+2) and (sz2+2) respectively.
Main Algorithm
Now we do a binary search on arr1, looking for an index i with startIndex <= i <= endIndex
such that, with the corresponding index j in arr2 found from the constraint {(i-1)+(j-1) = (k-1)},
if (arr2[j-1] < arr1[i] < arr2[j]), then arr1[i] is the kth smallest (Case 1)
else if (arr1[i-1] < arr2[j] < arr1[i]), then arr2[j] is the kth smallest (Case 2)
else either arr1[i] < arr2[j-1] < arr2[j] (Case 3)
or arr2[j-1] < arr2[j] < arr1[i] (Case 4)
We know that the kth smallest element has (k-1) elements smaller than it in the union of both arrays. So:
In Case 1, there are (k-1) elements smaller than arr1[i] in total: i-1 of them are in arr1, and because (arr2[j-1] < arr1[i] < arr2[j]), j-1 of them are in arr2, where j was chosen so that (i-1)+(j-1) = (k-1). So the kth smallest element is arr1[i].
The answer may not always come from the first array, so we also check Case 2, which works just like Case 1 because (i-1)+(j-1) = (k-1): if (arr1[i-1] < arr2[j] < arr1[i]), there are k-1 elements smaller than arr2[j] in the union of both arrays, so arr2[j] is the kth smallest element.
In Case 3, to reach Case 1 or Case 2 we need to increase i (j follows from the constraint {(i-1)+(j-1) = (k-1)}), i.e. move to the right part in the binary search: startIndex = middleIndex.
In Case 4, to reach Case 1 or Case 2 we need to decrease i (j again follows from the constraint), i.e. move to the left part in the binary search: endIndex = middleIndex.
Now, how do we choose startIndex and endIndex at the beginning of the binary search over arr1?
startIndex = 1, and endIndex remains to be decided:
if k > sz1, endIndex = (sz1+1); else endIndex = k.
If k is greater than the size of the first array, we may have to binary-search the entire array arr1; otherwise we only need its first k elements, because the remaining sz1-k elements can never contribute to the kth smallest.
Code shown below
// Complexity O(log(n)+log(m))
#include<bits/stdc++.h>
using namespace std;
#define f(i,x,y) for(int i = (x);i < (y);++i)
#define F(i,x,y) for(int i = (x);i > (y);--i)
int max(int a,int b){return (a > b?a:b);}
int min(int a,int b){return (a < b?a:b);}
int mod(int a){return (a > 0?a:((-1)*(a)));}
#define INF 1000000
int func(int *arr1,int *arr2,int sz1,int sz2,int k)
{
if((k <= (sz1+sz2))&&(k > 0))
{
int s = 1,e,i,j;
if(k > sz1)e = sz1+1;
else e = k;
while((e-s)>1)
{
i = (e+s)/2;
j = ((k-1)-(i-1));
j++;
if(j > (sz2+1)){s = i;}
else if((arr1[i] >= arr2[j-1])&&(arr1[i] <= arr2[j]))return arr1[i];
else if((arr2[j] >= arr1[i-1])&&(arr2[j] <= arr1[i]))return arr2[j];
else if(arr1[i] < arr2[j-1]){s = i;}
else if(arr1[i] > arr2[j]){e = i;}
else {;}
}
i = e,j = ((k-1)-(i-1));j++;
if((arr1[i] >= arr2[j-1])&&(arr1[i] <= arr2[j]))return arr1[i];
else if((arr2[j] >= arr1[i-1])&&(arr2[j] <= arr1[i]))return arr2[j];
else
{
i = s,j = ((k-1)-(i-1));j++;
if((arr1[i] >= arr2[j-1])&&(arr1[i] <= arr2[j]))return arr1[i];
else return arr2[j];
}
}
else
{
cout << "Data Invalid" << endl;
return -INF;
}
}
int main()
{
int n,m,k;
cin >> n >> m >> k;
int arr1[n+2];
int arr2[m+2];
f(i,1,n+1)
cin >> arr1[i];
f(i,1,m+1)
cin >> arr2[i];
arr1[0] = -INF;
arr2[0] = -INF;
arr1[n+1] = +INF;
arr2[m+1] = +INF;
int val = func(arr1,arr2,n,m,k);
if(val != -INF)cout << val << endl;
return 0;
}
For the solution of complexity O(log(n)*log(m)):
I had simply missed the fact that for each i, j can be found directly from the constraint {(i-1)+(j-1)=(k-1)}. Instead, for each i I ran a further binary search on the second array to find j such that arr2[j] <= arr1[i]. So this solution can be optimized further.
#include <bits/stdc++.h>
using namespace std;
int findKthElement(int a[],int start1,int end1,int b[],int start2,int end2,int k){
if(start1 >= end1)return b[start2+k-1];
if(start2 >= end2)return a[start1+k-1];
if(k==1)return min(a[start1],b[start2]);
int aMax = INT_MAX;
int bMax = INT_MAX;
if(start1+k/2-1 < end1) aMax = a[start1 + k/2 - 1];
if(start2+k/2-1 < end2) bMax = b[start2 + k/2 - 1];
if(aMax > bMax){
return findKthElement(a,start1,end1,b,start2+k/2,end2,k-k/2);
}
else{
return findKthElement(a,start1 + k/2,end1,b,start2,end2,k-k/2);
}
}
int main(void){
int t;
scanf("%d",&t);
while(t--){
int n,m,k;
cout<<"Enter the size of 1st Array"<<endl;
cin>>n;
int arr[n];
cout<<"Enter the Element of 1st Array"<<endl;
for(int i = 0;i<n;i++){
cin>>arr[i];
}
cout<<"Enter the size of 2nd Array"<<endl;
cin>>m;
int arr1[m];
cout<<"Enter the Element of 2nd Array"<<endl;
for(int i = 0;i<m;i++){
cin>>arr1[i];
}
cout<<"Enter The Value of K";
cin>>k;
sort(arr,arr+n);
sort(arr1,arr1+m);
cout<<findKthElement(arr,0,n,arr1,0,m,k)<<endl;
}
return 0;
}
Time complexity is O(log(min(n,m)))
Below is C# code to find the k-th smallest element in the union of two sorted arrays. Time complexity: O(log k)
public static int findKthSmallestElement1(int[] A, int startA, int endA, int[] B, int startB, int endB, int k)
{
int n = endA - startA;
int m = endB - startB;
if (n <= 0)
return B[startB + k - 1];
if (m <= 0)
return A[startA + k - 1];
if (k == 1)
return A[startA] < B[startB] ? A[startA] : B[startB];
int midA = (startA + endA) / 2;
int midB = (startB + endB) / 2;
if (A[midA] <= B[midB])
{
if (n / 2 + m / 2 + 1 >= k)
return findKthSmallestElement1(A, startA, endA, B, startB, midB, k);
else
return findKthSmallestElement1(A, midA + 1, endA, B, startB, endB, k - n / 2 - 1);
}
else
{
if (n / 2 + m / 2 + 1 >= k)
return findKthSmallestElement1(A, startA, midA, B, startB, endB, k);
else
return findKthSmallestElement1(A, startA, endA, B, midB + 1, endB, k - m / 2 - 1);
}
}
Check this code.
import math
def findkthsmallest():
A=[1,5,10,22,30,35,75,125,150,175,200]
B=[15,16,20,22,25,30,100,155,160,170]
lM=0
lN=0
hM=len(A)-1
hN=len(B)-1
k=17
while True:
if k==1:
return min(A[lM],B[lN])
cM=hM-lM+1
cN=hN-lN+1
tmp = cM/float(cM+cN)
iM=int(math.ceil(tmp*k))
iN=k-iM
iM=lM+iM-1
iN=lN+iN-1
if A[iM] >= B[iN]:
if iN == hN or A[iM] < B[iN+1]:
return A[iM]
else:
k = k - (iN-lN+1)
lN=iN+1
hM=iM-1
if B[iN] >= A[iM]:
if iM == hM or B[iN] < A[iM+1]:
return B[iN]
else:
k = k - (iM-lM+1)
lM=iM+1
hN=iN-1
if hM < lM:
return B[lN+k-1]
if hN < lN:
return A[lM+k-1]
if __name__ == '__main__':
print(findkthsmallest())

Algorithms: Interesting diffing algorithm

This came up in a real-world situation, and I thought I would share it, as it could lead to some interesting solutions. Essentially, the algorithm needs to diff two lists, but let me give you a more rigorous definition of the problem.
Mathematical Formulation
Suppose you have two lists, L and R each of which contain elements from some underlying alphabet S. Moreover, these lists have the property that the common elements that they have appear in order: that is to say, if L[i] = R[i*] and L[j] = R[j*], and i < j then i* < j*. The lists need not have any common elements at all, and one or both may be empty. [Clarification: You may assume no repetitions of elements.]
The problem is to produce a sort of "diff" of the lists, which may be viewed as new list of ordered pairs (x,y) where x is from L and y is from R, with the following properties:
If x appears in both lists, then (x,x) appears in the result.
If x appears in L, but not in R, then (x,NULL) appears in the result.
If y appears in R, but not in L, then (NULL,y) appears in the result.
and finally
The result list has "the same" ordering as each of the input lists: it shares, roughly speaking, the same ordering property as above with each of the lists individually (see example).
Examples
L = (d)
R = (a,b,c)
Result = ((d,NULL), (NULL,a), (NULL,b), (NULL,c))
L = (a,b,c,d,e)
R = (b,q,c,d,g,e)
Result = ((a,NULL), (b,b), (NULL,q), (c,c), (d,d), (NULL,g), (e,e))
Does anyone have any good algorithms to solve this? What is the complexity?
There is a way to do this in O(n), if you're willing to make a copy of one of the lists in a different data structure. This is a classic time/space tradeoff.
Create a hash map of the list R, with the key being the element and the value being the original index into the array; in C++, you could use unordered_map from tr1 or boost.
Keep an index to the unprocessed portion of list R, initialized to the first element.
For each element in list L, check the hash map for a match in list R. If you do not find one, output (L value, NULL). If there is a match, get the corresponding index from the hash map. For each unprocessed element in list R up to the matching index, output (NULL, R value). For the match, output (value, value).
When you have reached the end of list L, go through the remaining elements of list R and output (NULL, R value).
Edit: Here is the solution in Python. To those who say this solution depends on the existence of a good hashing function - of course it does. The original poster may add additional constraints to the question if this is a problem, but I will take an optimistic stance until then.
def FindMatches(listL, listR):
result=[]
lookupR={}
for i in range(0, len(listR)):
lookupR[listR[i]] = i
unprocessedR = 0
for left in listL:
if left in lookupR:
for right in listR[unprocessedR:lookupR[left]]:
result.append((None,right))
result.append((left,left))
unprocessedR = lookupR[left] + 1
else:
result.append((left,None))
for right in listR[unprocessedR:]:
result.append((None,right))
return result
>>> FindMatches(('d'),('a','b','c'))
[('d', None), (None, 'a'), (None, 'b'), (None, 'c')]
>>> FindMatches(('a','b','c','d','e'),('b','q','c','d','g','e'))
[('a', None), ('b', 'b'), (None, 'q'), ('c', 'c'), ('d', 'd'), (None, 'g'), ('e','e')]
The worst case, as defined and using only equality, must be O(n*m). Consider the following two lists:
A[] = {a,b,c,d,e,f,g}
B[] = {h,i,j,k,l,m,n}
Assume there exists exactly one match between those two "ordered" lists. It will take O(n*m) comparisons since there does not exist a comparison which removes the need for other comparisons later.
So, any algorithm you come up with is going to be O(n*m), or worse.
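To make the equality-only case concrete, here is a minimal Python sketch of a diff that never hashes or orders elements and only tests them with ==. The function name diff_equality_only is my own. It relies on the question's assumptions (no repeated elements, common elements in the same relative order), and in the worst case it performs on the order of n*m comparisons.

```python
def diff_equality_only(left, right):
    """Diff two lists using only equality tests between elements."""
    result = []
    next_r = 0  # index of the first element of right not yet emitted
    for x in left:
        match = None
        # Linear scan of the unprocessed part of right.
        for j in range(next_r, len(right)):
            if right[j] == x:
                match = j
                break
        if match is None:
            result.append((x, None))
        else:
            # Everything in right before the match is missing from left.
            result.extend((None, y) for y in right[next_r:match])
            result.append((x, x))
            next_r = match + 1
    # Whatever is left over in right has no partner in left.
    result.extend((None, y) for y in right[next_r:])
    return result
```

An element of left with no match forces a scan of the whole unprocessed tail of right, which is where the n*m worst case comes from.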
Diffing ordered lists can be done in linear time by traversing both lists and matching as you go. I will try to post some pseudo Java code in an update.
Since we don't know the ordering algorithm and can't determine any ordering based on less than or greater than operators, we must consider the lists unordered. Also, given how the results are to be formatted you are faced with scanning both lists (at least until you find a match and then you can bookmark and start from there again). It will still be O(n^2) performance, or yes more specifically O(nm).
This is exactly like sequence alignment, you can use the Needleman-Wunsch algorithm to solve it. The link includes the code in Python. Just make sure you set the scoring so that a mismatch is negative and a match is positive and an alignment with a blank is 0 when maximizing. The algorithm runs in O(n * m) time and space, but the space complexity of this can be improved.
Scoring Function
int score(char x, char y){
if ((x == ' ') || (y == ' ')){
return 0;
}
else if (x != y){
return -1;
}
else if (x == y){
return 1;
}
else{
puts("Error!");
exit(2);
}
}
Code
#include <stdio.h>
#include <stdlib.h> /* malloc, calloc, exit */
#include <string.h> /* strlen */
#include <stdbool.h>
int max(int a, int b, int c){
bool ab, ac, bc;
ab = (a > b);
ac = (a > c);
bc = (b > c);
if (ab && ac){
return a;
}
if (!ab && bc){
return b;
}
if (!ac && !bc){
return c;
}
}
int score(char x, char y){
if ((x == ' ') || (y == ' ')){
return 0;
}
else if (x != y){
return -1;
}
else if (x == y){
return 1;
}
else{
puts("Error!");
exit(2);
}
}
void print_table(int **table, char str1[], char str2[]){
unsigned int i, j, len1, len2;
len1 = strlen(str1) + 1;
len2 = strlen(str2) + 1;
for (j = 0; j < len2; j++){
if (j != 0){
printf("%3c", str2[j - 1]);
}
else{
printf("%3c%3c", ' ', ' ');
}
}
putchar('\n');
for (i = 0; i < len1; i++){
if (i != 0){
printf("%3c", str1[i - 1]);
}
else{
printf("%3c", ' ');
}
for (j = 0; j < len2; j++){
printf("%3d", table[i][j]);
}
putchar('\n');
}
}
int **optimal_global_alignment_table(char str1[], char str2[]){
unsigned int len1, len2, i, j;
int **table;
len1 = strlen(str1) + 1;
len2 = strlen(str2) + 1;
table = malloc(sizeof(int*) * len1);
for (i = 0; i < len1; i++){
table[i] = calloc(len2, sizeof(int));
}
for (i = 0; i < len1; i++){
table[i][0] += i * score(str1[i], ' ');
}
for (j = 0; j < len2; j++){ /* the first row spans str2, so it is bounded by len2 */
table[0][j] += j * score(' ', str2[j]);
}
for (i = 1; i < len1; i++){
for (j = 1; j < len2; j++){
table[i][j] = max(
table[i - 1][j - 1] + score(str1[i - 1], str2[j - 1]),
table[i - 1][j] + score(str1[i - 1], ' '),
table[i][j - 1] + score(' ', str2[j - 1])
);
}
}
return table;
}
void prefix_char(char ch, char str[]){
int i;
for (i = strlen(str); i >= 0; i--){
str[i+1] = str[i];
}
str[0] = ch;
}
void optimal_global_alignment(int **table, char str1[], char str2[]){
unsigned int i, j;
char *align1, *align2;
i = strlen(str1);
j = strlen(str2);
/* an alignment has at most i + j columns, plus the terminating '\0' */
align1 = malloc(sizeof(char) * (i + j + 1));
align2 = malloc(sizeof(char) * (i + j + 1));
align1[0] = align2[0] = '\0';
while((i > 0) && (j > 0)){
if (table[i][j] == (table[i - 1][j - 1] + score(str1[i - 1], str2[j - 1]))){
prefix_char(str1[i - 1], align1);
prefix_char(str2[j - 1], align2);
i--;
j--;
}
else if (table[i][j] == (table[i - 1][j] + score(str1[i-1], ' '))){
prefix_char(str1[i - 1], align1);
prefix_char('_', align2);
i--;
}
else if (table[i][j] == (table[i][j - 1] + score(' ', str2[j - 1]))){
prefix_char('_', align1);
prefix_char(str2[j - 1], align2);
j--;
}
}
while (i > 0){
prefix_char(str1[i - 1], align1);
prefix_char('_', align2);
i--;
}
while(j > 0){
prefix_char('_', align1);
prefix_char(str2[j - 1], align2);
j--;
}
puts(align1);
puts(align2);
}
int main(int argc, char * argv[]){
int **table;
if (argc == 3){
table = optimal_global_alignment_table(argv[1], argv[2]);
print_table(table, argv[1], argv[2]);
optimal_global_alignment(table, argv[1], argv[2]);
}
else{
puts("Requires two string arguments!");
}
return 0;
}
Sample IO
$ cc dynamic_programming.c && ./a.out aab bba
__aab
bb_a_
$ cc dynamic_programming.c && ./a.out d abc
___d
abc_
$ cc dynamic_programming.c && ./a.out abcde bqcdge
ab_cd_e
_bqcdge
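The C program prints two aligned strings. To get the list of pairs the question asks for, the same dynamic program can be sketched in Python as below. This is my own adaptation, not a port of the C code: the name align_pairs is hypothetical, gaps score 0, a match scores +1, a mismatch scores -1, and the traceback only takes the diagonal on a true match so that no (x, y) pair with x != y is ever emitted.

```python
def align_pairs(left, right):
    """Align two sequences and return the diff as a list of pairs."""
    n, m = len(left), len(right)
    # table[i][j] = best score for aligning left[:i] with right[:j];
    # row 0 and column 0 stay 0 because gaps cost nothing.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 1 if left[i - 1] == right[j - 1] else -1
            table[i][j] = max(table[i - 1][j - 1] + step,
                              table[i - 1][j],
                              table[i][j - 1])
    # Trace back from the bottom-right corner.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if (left[i - 1] == right[j - 1]
                and table[i][j] == table[i - 1][j - 1] + 1):
            pairs.append((left[i - 1], right[j - 1]))
            i -= 1
            j -= 1
        elif table[i][j] == table[i - 1][j]:
            pairs.append((left[i - 1], None))
            i -= 1
        else:
            pairs.append((None, right[j - 1]))
            j -= 1
    while i > 0:
        pairs.append((left[i - 1], None))
        i -= 1
    while j > 0:
        pairs.append((None, right[j - 1]))
        j -= 1
    pairs.reverse()
    return pairs
```

For "abcde" against "bqcdge" this yields the pairs from the question's second example. For "d" against "abc" it puts the unmatched d last, like the sample output above; the question's ordering rule allows either placement.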
No real tangible answer, only vague intuition. Because you don't know the ordering algorithm, only that the data is ordered in each list, it sounds vaguely like the algorithms used to "diff" files (e.g. in Beyond Compare) and match sequences of lines together. Or also vaguely similar to regexp algorithms.
There can also be multiple solutions. (never mind, not if there are not repeated elements that are strictly ordered. I was thinking too much along the lines of file comparisons)
This is a pretty simple problem since you already have an ordered list.
//this is very rough pseudocode
stack aList;
stack bList;
List resultList;
char aVal;
char bVal;
while(aList.Count > 0 || bList.Count > 0)
{
aVal = aList.Peek(); //grab the top item in A
bVal = bList.Peek(); //grab the top item in B
if(aVal < bVal || bVal == null)
{
resultList.Add(new Tuple(aList.Pop(), null));
}
else if(bVal < aVal || aVal == null)
{
resultList.Add(new Tuple(null, bList.Pop()));
}
else //equal
{
resultList.Add(new Tuple(aList.Pop(), bList.Pop()));
}
}
Note... this code WILL NOT compile. It is just meant as a guide.
EDIT Based on the OP comments
If the ordering algorithm is not exposed, then the lists must be considered unordered.
If the lists are unordered, then the algorithm has a time complexity of O(n^2), specifically O(nm) where n and m are the number of items in each list.
EDIT
Algorithm to solve this
L(a,b,c,d,e)
R(b,q,c,d,g,e)
//pseudo code... will not compile
//Note, this modifies aList and bList, so make copies.
List aList;
List bList;
List resultList;
var aVal;
var bVal;
while(aList.Count > 0)
{
aVal = aList.Pop();
for(int bIndex = 0; bIndex < bList.Count; bIndex++)
{
bVal = bList[bIndex]; //look at the item at bIndex without removing it
if(aVal.RelevantlyEquivalentTo(bVal))
{
//The bList items that come BEFORE the match are definitely not in aList
for(int tempIndex = 0; tempIndex < bIndex; tempIndex++)
{
resultList.Add(new Tuple(null, bList.Pop()));
}
//This 'popped' item is the same as bVal right now
resultList.Add(new Tuple(aVal, bList.Pop()));
//Set aVal to null so it doesn't get added to resultList again
aVal = null;
//Break because it's guaranteed not to be in the rest of the list
break;
}
}
//No Matches
if(aVal != null)
{
resultList.Add(new Tuple(aVal, null));
}
}
//aList is now empty, and all the items left in bList need to be added to result set
while(bList.Count > 0)
{
resultList.Add(new Tuple(null, bList.Pop()));
}
The result set will be
L(a,b,c,d,e)
R(b,q,c,d,g,e)
Result ((a,null),(b,b),(null,q),(c,c),(d,d),(null,g),(e,e))
I don't think you have enough information. All you've asserted is that elements that match match in the same order, but finding the first matching pair is an O(nm) operation unless you have some other ordering that you can determine.
SELECT distinct l.element, r.element
FROM LeftList l
FULL OUTER JOIN RightList r
ON l.element = r.element
ORDER BY l.id, r.id
Assumes the ID of each element is its ordering. And of course, that your lists are contained in a Relational Database :)

Resources