Related
I was recently given this interview question and I'm curious what a good solution to it would be.
Say I'm given a 2d array where all the
numbers in the array are in increasing
order from left to right and top to
bottom.
What is the best way to search and
determine if a target number is in the
array?
Now, my first inclination is to utilize a binary search since my data is sorted. I can determine if a number is in a single row in O(log N) time. However, it is the 2 directions that throw me off.
Another solution I thought may work is to start somewhere in the middle. If the middle value is less than my target, then I can be sure it is in the left square portion of the matrix from the middle. I then move diagonally and check again, reducing the size of the square that the target could potentially be in until I have honed in on the target number.
Does anyone have any good ideas on solving this problem?
Example array:
Sorted left to right, top to bottom.
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Here's a simple approach:
Start at the bottom-left corner.
If the target is less than that value, it must be above us, so move up one.
Otherwise we know that the target can't be in that column, so move right one.
Goto 2.
For an NxM array, this runs in O(N+M). I think it would be difficult to do better. :)
Edit: Lots of good discussion. I was talking about the general case above; clearly, if N or M are small, you could use a binary search approach to do this in something approaching logarithmic time.
Here are some details, for those who are curious:
History
This simple algorithm is called a Saddleback Search. It's been around for a while, and it is optimal when N == M. Some references:
David Gries, The Science of Programming. Springer-Verlag, 1989.
Edsgar Dijkstra, The Saddleback Search. Note EWD-934, 1985.
However, when N < M, intuition suggests that binary search should be able to do better than O(N+M): For example, when N == 1, a pure binary search will run in logarithmic rather than linear time.
Worst-case bound
Richard Bird examined this intuition that binary search could improve the Saddleback algorithm in a 2006 paper:
Richard S. Bird, Improving Saddleback Search: A Lesson in Algorithm Design, in Mathematics of Program Construction, pp. 82--89, volume 4014, 2006.
Using a rather unusual conversational technique, Bird shows us that for N <= M, this problem has a lower bound of Ω(N * log(M/N)). This bound make sense, as it gives us linear performance when N == M and logarithmic performance when N == 1.
Algorithms for rectangular arrays
One approach that uses a row-by-row binary search looks like this:
Start with a rectangular array where N < M. Let's say N is rows and M is columns.
Do a binary search on the middle row for value. If we find it, we're done.
Otherwise we've found an adjacent pair of numbers s and g, where s < value < g.
The rectangle of numbers above and to the left of s is less than value, so we can eliminate it.
The rectangle below and to the right of g is greater than value, so we can eliminate it.
Go to step (2) for each of the two remaining rectangles.
In terms of worst-case complexity, this algorithm does log(M) work to eliminate half the possible solutions, and then recursively calls itself twice on two smaller problems. We do have to repeat a smaller version of that log(M) work for every row, but if the number of rows is small compared to the number of columns, then being able to eliminate all of those columns in logarithmic time starts to become worthwhile.
This gives the algorithm a complexity of T(N,M) = log(M) + 2 * T(M/2, N/2), which Bird shows to be O(N * log(M/N)).
Another approach posted by Craig Gidney describes an algorithm similar the approach above: it examines a row at a time using a step size of M/N. His analysis shows that this results in O(N * log(M/N)) performance as well.
Performance Comparison
Big-O analysis is all well and good, but how well do these approaches work in practice? The chart below examines four algorithms for increasingly "square" arrays:
(The "naive" algorithm simply searches every element of the array. The "recursive" algorithm is described above. The "hybrid" algorithm is an implementation of Gidney's algorithm. For each array size, performance was measured by timing each algorithm over fixed set of 1,000,000 randomly-generated arrays.)
Some notable points:
As expected, the "binary search" algorithms offer the best performance on rectangular arrays and the Saddleback algorithm works the best on square arrays.
The Saddleback algorithm performs worse than the "naive" algorithm for 1-d arrays, presumably because it does multiple comparisons on each item.
The performance hit that the "binary search" algorithms take on square arrays is presumably due to the overhead of running repeated binary searches.
Summary
Clever use of binary search can provide O(N * log(M/N) performance for both rectangular and square arrays. The O(N + M) "saddleback" algorithm is much simpler, but suffers from performance degradation as arrays become increasingly rectangular.
This problem takes Θ(b lg(t)) time, where b = min(w,h) and t=b/max(w,h). I discuss the solution in this blog post.
Lower bound
An adversary can force an algorithm to make Ω(b lg(t)) queries, by restricting itself to the main diagonal:
Legend: white cells are smaller items, gray cells are larger items, yellow cells are smaller-or-equal items and orange cells are larger-or-equal items. The adversary forces the solution to be whichever yellow or orange cell the algorithm queries last.
Notice that there are b independent sorted lists of size t, requiring Ω(b lg(t)) queries to completely eliminate.
Algorithm
(Assume without loss of generality that w >= h)
Compare the target item against the cell t to the left of the top right corner of the valid area
If the cell's item matches, return the current position.
If the cell's item is less than the target item, eliminate the remaining t cells in the row with a binary search. If a matching item is found while doing this, return with its position.
Otherwise the cell's item is more than the target item, eliminating t short columns.
If there's no valid area left, return failure
Goto step 2
Finding an item:
Determining an item doesn't exist:
Legend: white cells are smaller items, gray cells are larger items, and the green cell is an equal item.
Analysis
There are b*t short columns to eliminate. There are b long rows to eliminate. Eliminating a long row costs O(lg(t)) time. Eliminating t short columns costs O(1) time.
In the worst case we'll have to eliminate every column and every row, taking time O(lg(t)*b + b*t*1/t) = O(b lg(t)).
Note that I'm assuming lg clamps to a result above 1 (i.e. lg(x) = log_2(max(2,x))). That's why when w=h, meaning t=1, we get the expected bound of O(b lg(1)) = O(b) = O(w+h).
Code
public static Tuple<int, int> TryFindItemInSortedMatrix<T>(this IReadOnlyList<IReadOnlyList<T>> grid, T item, IComparer<T> comparer = null) {
if (grid == null) throw new ArgumentNullException("grid");
comparer = comparer ?? Comparer<T>.Default;
// check size
var width = grid.Count;
if (width == 0) return null;
var height = grid[0].Count;
if (height < width) {
var result = grid.LazyTranspose().TryFindItemInSortedMatrix(item, comparer);
if (result == null) return null;
return Tuple.Create(result.Item2, result.Item1);
}
// search
var minCol = 0;
var maxRow = height - 1;
var t = height / width;
while (minCol < width && maxRow >= 0) {
// query the item in the minimum column, t above the maximum row
var luckyRow = Math.Max(maxRow - t, 0);
var cmpItemVsLucky = comparer.Compare(item, grid[minCol][luckyRow]);
if (cmpItemVsLucky == 0) return Tuple.Create(minCol, luckyRow);
// did we eliminate t rows from the bottom?
if (cmpItemVsLucky < 0) {
maxRow = luckyRow - 1;
continue;
}
// we eliminated most of the current minimum column
// spend lg(t) time eliminating rest of column
var minRowInCol = luckyRow + 1;
var maxRowInCol = maxRow;
while (minRowInCol <= maxRowInCol) {
var mid = minRowInCol + (maxRowInCol - minRowInCol + 1) / 2;
var cmpItemVsMid = comparer.Compare(item, grid[minCol][mid]);
if (cmpItemVsMid == 0) return Tuple.Create(minCol, mid);
if (cmpItemVsMid > 0) {
minRowInCol = mid + 1;
} else {
maxRowInCol = mid - 1;
maxRow = mid - 1;
}
}
minCol += 1;
}
return null;
}
I would use the divide-and-conquer strategy for this problem, similar to what you suggested, but the details are a bit different.
This will be a recursive search on subranges of the matrix.
At each step, pick an element in the middle of the range. If the value found is what you are seeking, then you're done.
Otherwise, if the value found is less than the value that you are seeking, then you know that it is not in the quadrant above and to the left of your current position. So recursively search the two subranges: everything (exclusively) below the current position, and everything (exclusively) to the right that is at or above the current position.
Otherwise, (the value found is greater than the value that you are seeking) you know that it is not in the quadrant below and to the right of your current position. So recursively search the two subranges: everything (exclusively) to the left of the current position, and everything (exclusively) above the current position that is on the current column or a column to the right.
And ba-da-bing, you found it.
Note that each recursive call only deals with the current subrange only, not (for example) ALL rows above the current position. Just those in the current subrange.
Here's some pseudocode for you:
bool numberSearch(int[][] arr, int value, int minX, int maxX, int minY, int maxY)
if (minX == maxX and minY == maxY and arr[minX,minY] != value)
return false
if (arr[minX,minY] > value) return false; // Early exits if the value can't be in
if (arr[maxX,maxY] < value) return false; // this subrange at all.
int nextX = (minX + maxX) / 2
int nextY = (minY + maxY) / 2
if (arr[nextX,nextY] == value)
{
print nextX,nextY
return true
}
else if (arr[nextX,nextY] < value)
{
if (numberSearch(arr, value, minX, maxX, nextY + 1, maxY))
return true
return numberSearch(arr, value, nextX + 1, maxX, minY, nextY)
}
else
{
if (numberSearch(arr, value, minX, nextX - 1, minY, maxY))
return true
reutrn numberSearch(arr, value, nextX, maxX, minY, nextY)
}
The two main answers give so far seem to be the arguably O(log N) "ZigZag method" and the O(N+M) Binary Search method. I thought I'd do some testing comparing the two methods with some various setups. Here are the details:
The array is N x N square in every test, with N varying from 125 to 8000 (the largest my JVM heap could handle). For each array size, I picked a random place in the array to put a single 2. I then put a 3 everywhere possible (to the right and below of the 2) and then filled the rest of the array with 1. Some of the earlier commenters seemed to think this type of setup would yield worst case run time for both algorithms. For each array size, I picked 100 different random locations for the 2 (search target) and ran the test. I recorded avg run time and worst case run time for each algorithm. Because it was happening too fast to get good ms readings in Java, and because I don't trust Java's nanoTime(), I repeated each test 1000 times just to add a uniform bias factor to all the times. Here are the results:
ZigZag beat binary in every test for both avg and worst case times, however, they are all within an order of magnitude of each other more or less.
Here is the Java code:
public class SearchSortedArray2D {
static boolean findZigZag(int[][] a, int t) {
int i = 0;
int j = a.length - 1;
while (i <= a.length - 1 && j >= 0) {
if (a[i][j] == t) return true;
else if (a[i][j] < t) i++;
else j--;
}
return false;
}
static boolean findBinarySearch(int[][] a, int t) {
return findBinarySearch(a, t, 0, 0, a.length - 1, a.length - 1);
}
static boolean findBinarySearch(int[][] a, int t,
int r1, int c1, int r2, int c2) {
if (r1 > r2 || c1 > c2) return false;
if (r1 == r2 && c1 == c2 && a[r1][c1] != t) return false;
if (a[r1][c1] > t) return false;
if (a[r2][c2] < t) return false;
int rm = (r1 + r2) / 2;
int cm = (c1 + c2) / 2;
if (a[rm][cm] == t) return true;
else if (a[rm][cm] > t) {
boolean b1 = findBinarySearch(a, t, r1, c1, r2, cm - 1);
boolean b2 = findBinarySearch(a, t, r1, cm, rm - 1, c2);
return (b1 || b2);
} else {
boolean b1 = findBinarySearch(a, t, r1, cm + 1, rm, c2);
boolean b2 = findBinarySearch(a, t, rm + 1, c1, r2, c2);
return (b1 || b2);
}
}
static void randomizeArray(int[][] a, int N) {
int ri = (int) (Math.random() * N);
int rj = (int) (Math.random() * N);
a[ri][rj] = 2;
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (i == ri && j == rj) continue;
else if (i > ri || j > rj) a[i][j] = 3;
else a[i][j] = 1;
}
}
}
public static void main(String[] args) {
int N = 8000;
int[][] a = new int[N][N];
int randoms = 100;
int repeats = 1000;
long start, end, duration;
long zigMin = Integer.MAX_VALUE, zigMax = Integer.MIN_VALUE;
long binMin = Integer.MAX_VALUE, binMax = Integer.MIN_VALUE;
long zigSum = 0, zigAvg;
long binSum = 0, binAvg;
for (int k = 0; k < randoms; k++) {
randomizeArray(a, N);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findZigZag(a, 2);
end = System.currentTimeMillis();
duration = end - start;
zigSum += duration;
zigMin = Math.min(zigMin, duration);
zigMax = Math.max(zigMax, duration);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findBinarySearch(a, 2);
end = System.currentTimeMillis();
duration = end - start;
binSum += duration;
binMin = Math.min(binMin, duration);
binMax = Math.max(binMax, duration);
}
zigAvg = zigSum / randoms;
binAvg = binSum / randoms;
System.out.println(findZigZag(a, 2) ?
"Found via zigzag method. " : "ERROR. ");
//System.out.println("min search time: " + zigMin + "ms");
System.out.println("max search time: " + zigMax + "ms");
System.out.println("avg search time: " + zigAvg + "ms");
System.out.println();
System.out.println(findBinarySearch(a, 2) ?
"Found via binary search method. " : "ERROR. ");
//System.out.println("min search time: " + binMin + "ms");
System.out.println("max search time: " + binMax + "ms");
System.out.println("avg search time: " + binAvg + "ms");
}
}
This is a short proof of the lower bound on the problem.
You cannot do it better than linear time (in terms of array dimensions, not the number of elements). In the array below, each of the elements marked as * can be either 5 or 6 (independently of other ones). So if your target value is 6 (or 5) the algorithm needs to examine all of them.
1 2 3 4 *
2 3 4 * 7
3 4 * 7 8
4 * 7 8 9
* 7 8 9 10
Of course this expands to bigger arrays as well. This means that this answer is optimal.
Update: As pointed out by Jeffrey L Whitledge, it is only optimal as the asymptotic lower bound on running time vs input data size (treated as a single variable). Running time treated as two-variable function on both array dimensions can be improved.
I think Here is the answer and it works for any kind of sorted matrix
bool findNum(int arr[][ARR_MAX],int xmin, int xmax, int ymin,int ymax,int key)
{
if (xmin > xmax || ymin > ymax || xmax < xmin || ymax < ymin) return false;
if ((xmin == xmax) && (ymin == ymax) && (arr[xmin][ymin] != key)) return false;
if (arr[xmin][ymin] > key || arr[xmax][ymax] < key) return false;
if (arr[xmin][ymin] == key || arr[xmax][ymax] == key) return true;
int xnew = (xmin + xmax)/2;
int ynew = (ymin + ymax)/2;
if (arr[xnew][ynew] == key) return true;
if (arr[xnew][ynew] < key)
{
if (findNum(arr,xnew+1,xmax,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ynew+1,ymax,key));
} else {
if (findNum(arr,xmin,xnew-1,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ymin,ynew-1,key));
}
}
Interesting question. Consider this idea - create one boundary where all the numbers are greater than your target and another where all the numbers are less than your target. If anything is left in between the two, that's your target.
If I'm looking for 3 in your example, I read across the first row until I hit 4, then look for the smallest adjacent number (including diagonals) greater than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I do the same for those numbers less than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I ask, is anything inside the two boundaries? If yes, it must be 3. If no, then there is no 3. Sort of indirect since I don't actually find the number, I just deduce that it must be there. This has the added bonus of counting ALL the 3's.
I tried this on some examples and it seems to work OK.
Binary search through the diagonal of the array is the best option.
We can find out whether the element is less than or equal to the elements in the diagonal.
I've been asking this question in interviews for the better part of a decade and I think there's only been one person who has been able to come up with an optimal algorithm.
My solution has always been:
Binary search the middle diagonal, which is the diagonal running down and right, containing the item at (rows.count/2, columns.count/2).
If the target number is found, return true.
Otherwise, two numbers (u and v) will have been found such that u is smaller than the target, v is larger than the target, and v is one right and one down from u.
Recursively search the sub-matrix to the right of u and top of v and the one to the bottom of u and left of v.
I believe this is a strict improvement over the algorithm given by Nate here, since searching the diagonal often allows a reduction of over half the search space (if the matrix is close to square), whereas searching a row or column always results in an elimination of exactly half.
Here's the code in (probably not terribly Swifty) Swift:
import Cocoa
class Solution {
func searchMatrix(_ matrix: [[Int]], _ target: Int) -> Bool {
if (matrix.isEmpty || matrix[0].isEmpty) {
return false
}
return _searchMatrix(matrix, 0..<matrix.count, 0..<matrix[0].count, target)
}
func _searchMatrix(_ matrix: [[Int]], _ rows: Range<Int>, _ columns: Range<Int>, _ target: Int) -> Bool {
if (rows.count == 0 || columns.count == 0) {
return false
}
if (rows.count == 1) {
return _binarySearch(matrix, rows.lowerBound, columns, target, true)
}
if (columns.count == 1) {
return _binarySearch(matrix, columns.lowerBound, rows, target, false)
}
var lowerInflection = (-1, -1)
var upperInflection = (Int.max, Int.max)
var currentRows = rows
var currentColumns = columns
while (currentRows.count > 0 && currentColumns.count > 0 && upperInflection.0 > lowerInflection.0+1) {
let rowMidpoint = (currentRows.upperBound + currentRows.lowerBound) / 2
let columnMidpoint = (currentColumns.upperBound + currentColumns.lowerBound) / 2
let value = matrix[rowMidpoint][columnMidpoint]
if (value == target) {
return true
}
if (value > target) {
upperInflection = (rowMidpoint, columnMidpoint)
currentRows = currentRows.lowerBound..<rowMidpoint
currentColumns = currentColumns.lowerBound..<columnMidpoint
} else {
lowerInflection = (rowMidpoint, columnMidpoint)
currentRows = rowMidpoint+1..<currentRows.upperBound
currentColumns = columnMidpoint+1..<currentColumns.upperBound
}
}
if (lowerInflection.0 == -1) {
lowerInflection = (upperInflection.0-1, upperInflection.1-1)
} else if (upperInflection.0 == Int.max) {
upperInflection = (lowerInflection.0+1, lowerInflection.1+1)
}
return _searchMatrix(matrix, rows.lowerBound..<lowerInflection.0+1, upperInflection.1..<columns.upperBound, target) || _searchMatrix(matrix, upperInflection.0..<rows.upperBound, columns.lowerBound..<lowerInflection.1+1, target)
}
func _binarySearch(_ matrix: [[Int]], _ rowOrColumn: Int, _ range: Range<Int>, _ target: Int, _ searchRow : Bool) -> Bool {
if (range.isEmpty) {
return false
}
let midpoint = (range.upperBound + range.lowerBound) / 2
let value = (searchRow ? matrix[rowOrColumn][midpoint] : matrix[midpoint][rowOrColumn])
if (value == target) {
return true
}
if (value > target) {
return _binarySearch(matrix, rowOrColumn, range.lowerBound..<midpoint, target, searchRow)
} else {
return _binarySearch(matrix, rowOrColumn, midpoint+1..<range.upperBound, target, searchRow)
}
}
}
A. Do a binary search on those lines where the target number might be on.
B. Make it a graph : Look for the number by taking always the smallest unvisited neighbour node and backtracking when a too big number is found
Binary search would be the best approach, imo. Starting at 1/2 x, 1/2 y will cut it in half. IE a 5x5 square would be something like x == 2 / y == 3 . I rounded one value down and one value up to better zone in on the direction of the targeted value.
For clarity the next iteration would give you something like x == 1 / y == 2 OR x == 3 / y == 5
Well, to begin with, let us assume we are using a square.
1 2 3
2 3 4
3 4 5
1. Searching a square
I would use a binary search on the diagonal. The goal is the locate the smaller number that is not strictly lower than the target number.
Say I am looking for 4 for example, then I would end up locating 5 at (2,2).
Then, I am assured that if 4 is in the table, it is at a position either (x,2) or (2,x) with x in [0,2]. Well, that's just 2 binary searches.
The complexity is not daunting: O(log(N)) (3 binary searches on ranges of length N)
2. Searching a rectangle, naive approach
Of course, it gets a bit more complicated when N and M differ (with a rectangle), consider this degenerate case:
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
And let's say I am looking for 9... The diagonal approach is still good, but the definition of diagonal changes. Here my diagonal is [1, (5 or 6), 17]. Let's say I picked up [1,5,17], then I know that if 9 is in the table it is either in the subpart:
5 6 7 8
6 7 8 9
10 11 12 13 14 15 16
This gives us 2 rectangles:
5 6 7 8 10 11 12 13 14 15 16
6 7 8 9
So we can recurse! probably beginning by the one with less elements (though in this case it kills us).
I should point that if one of the dimensions is less than 3, we cannot apply the diagonal methods and must use a binary search. Here it would mean:
Apply binary search on 10 11 12 13 14 15 16, not found
Apply binary search on 5 6 7 8, not found
Apply binary search on 6 7 8 9, not found
It's tricky because to get good performance you might want to differentiate between several cases, depending on the general shape....
3. Searching a rectangle, brutal approach
It would be much easier if we dealt with a square... so let's just square things up.
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
17 . . . . . . 17
. .
. .
. .
17 . . . . . . 17
We now have a square.
Of course, we will probably NOT actually create those rows, we could simply emulate them.
def get(x,y):
if x < N and y < M: return table[x][y]
else: return table[N-1][M-1] # the max
so it behaves like a square without occupying more memory (at the cost of speed, probably, depending on cache... oh well :p)
EDIT:
I misunderstood the question. As the comments point out this only works in the more restricted case.
In a language like C that stores data in row-major order, simply treat it as a 1D array of size n * m and use a binary search.
I have a recursive Divide & Conquer Solution.
Basic Idea for one step is: We know that the Left-Upper(LU) is smallest and the right-bottom(RB) is the largest no., so the given No(N) must: N>=LU and N<=RB
IF N==LU and N==RB::::Element Found and Abort returning the position/Index
If N>=LU and N<=RB = FALSE, No is not there and abort.
If N>=LU and N<=RB = TRUE, Divide the 2D array in 4 equal parts of 2D array each in logical manner..
And then apply the same algo step to all four sub-array.
My Algo is Correct I have implemented on my friends PC.
Complexity: each 4 comparisons can b used to deduce the total no of elements to one-fourth at its worst case..
So My complexity comes to be 1 + 4 x lg(n) + 4
But really expected this to be working on O(n)
I think something is wrong somewhere in my calculation of Complexity, please correct if so..
The optimal solution is to start at the top-left corner, that has minimal value. Move diagonally downwards to the right until you hit an element whose value >= value of the given element. If the element's value is equal to that of the given element, return found as true.
Otherwise, from here we can proceed in two ways.
Strategy 1:
Move up in the column and search for the given element until we reach the end. If found, return found as true
Move left in the row and search for the given element until we reach the end. If found, return found as true
return found as false
Strategy 2:
Let i denote the row index and j denote the column index of the diagonal element we have stopped at. (Here, we have i = j, BTW). Let k = 1.
Repeat the below steps until i-k >= 0
Search if a[i-k][j] is equal to the given element. if yes, return found as true.
Search if a[i][j-k] is equal to the given element. if yes, return found as true.
Increment k
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
public boolean searchSortedMatrix(int arr[][] , int key , int minX , int maxX , int minY , int maxY){
// base case for recursion
if(minX > maxX || minY > maxY)
return false ;
// early fails
// array not properly intialized
if(arr==null || arr.length==0)
return false ;
// arr[0][0]> key return false
if(arr[minX][minY]>key)
return false ;
// arr[maxX][maxY]<key return false
if(arr[maxX][maxY]<key)
return false ;
//int temp1 = minX ;
//int temp2 = minY ;
int midX = (minX+maxX)/2 ;
//if(temp1==midX){midX+=1 ;}
int midY = (minY+maxY)/2 ;
//if(temp2==midY){midY+=1 ;}
// arr[midX][midY] = key ? then value found
if(arr[midX][midY] == key)
return true ;
// alas ! i have to keep looking
// arr[midX][midY] < key ? search right quad and bottom matrix ;
if(arr[midX][midY] < key){
if( searchSortedMatrix(arr ,key , minX,maxX , midY+1 , maxY))
return true ;
// search bottom half of matrix
if( searchSortedMatrix(arr ,key , midX+1,maxX , minY , maxY))
return true ;
}
// arr[midX][midY] > key ? search left quad matrix ;
else {
return(searchSortedMatrix(arr , key , minX,midX-1,minY,midY-1));
}
return false ;
}
I suggest, store all characters in a 2D list. then find index of required element if it exists in list.
If not present print appropriate message else print row and column as:
row = (index/total_columns) and column = (index%total_columns -1)
This will incur only the binary search time in a list.
Please suggest any corrections. :)
If O(M log(N)) solution is ok for an MxN array -
template <size_t n>
struct MN * get(int a[][n], int k, int M, int N){
struct MN *result = new MN;
result->m = -1;
result->n = -1;
/* Do a binary search on each row since rows (and columns too) are sorted. */
for(int i = 0; i < M; i++){
int lo = 0; int hi = N - 1;
while(lo <= hi){
int mid = lo + (hi-lo)/2;
if(k < a[i][mid]) hi = mid - 1;
else if (k > a[i][mid]) lo = mid + 1;
else{
result->m = i;
result->n = mid;
return result;
}
}
}
return result;
}
Working C++ demo.
Please do let me know if this wouldn't work or if there is a bug it it.
class Solution {
public boolean searchMatrix(int[][] matrix, int target) {
if(matrix == null)
return false;
int i=0;
int j=0;
int m = matrix.length;
int n = matrix[0].length;
boolean found = false;
while(i<m && !found){
while(j<n && !found){
if(matrix[i][j] == target)
found = true;
if(matrix[i][j] < target)
j++;
else
break;
}
i++;
j=0;
}
return found;
}}
129 / 129 test cases passed.
Status: Accepted
Runtime: 39 ms
Memory Usage: 55 MB
Given a square matrix as follows:
[ a b c ]
[ d e f ]
[ i j k ]
We know that a < c, d < f, i < k. What we don't know is whether d < c or d > c, etc. We have guarantees only in 1-dimension.
Looking at the end elements (c,f,k), we can do a sort of filter: is N < c ? search() : next(). Thus, we have n iterations over the rows, with each row taking either O( log( n ) ) for binary search or O( 1 ) if filtered out.
Let me given an EXAMPLE where N = j,
1) Check row 1. j < c? (no, go next)
2) Check row 2. j < f? (yes, bin search gets nothing)
3) Check row 3. j < k? (yes, bin search finds it)
Try again with N = q,
1) Check row 1. q < c? (no, go next)
2) Check row 2. q < f? (no, go next)
3) Check row 3. q < k? (no, go next)
There is probably a better solution out there but this is easy to explain.. :)
As this is an interview question, it would seem to lead towards a discussion of Parallel programming and Map-reduce algorithms.
See http://code.google.com/intl/de/edu/parallel/mapreduce-tutorial.html
I recently came across this question and thought I could share it here, since I wasn't able to get it.
We are given a 5*5 grid numbered from 1-25, and a set of 5 pairs of points,that are start and end points of a path on the grid.
Now we need to find 5 corresponding paths for the 5 pairs of points, such that no two paths should overlap. Also note that only vertical and horizontal moves are allowed. Also the combined 5 path should cover the entire grid.
For example we are given the pair of points as:
P={1,22},{4,17},{5,18},{9,13},{20,23}
Then the corresponding paths will be
1-6-11-16-21-22
4-3-2-7-12-17
5-10-15-14-19-18
9-8-13
20-25-24-23
What I have thought of so far:
Maybe i can compute all paths from source to destination for all pairs of points and then check if there's no common point in the paths. However this seems to be of higher time complexity.
Can anyone propose a better algorithm? I would be glad if one could explain through a pseudo code.Thanks
This problem is essentially the Hamiltonian path/cycle problem problem (since you can connect the end of one path to the start of another, and consider all the five paths as a part of one big cycle). There are no known efficient algorithms for this, as the problem is NP-complete, so you do essentially need to try all possible paths with backtracking (there are fancier algorithms, but they're not much faster).
Your title asks for an approximation algorithm, but this is not an optimization problem - it's not the case that some solutions are better than others; all correct solutions are equally good, and if it isn't correct, then it's completely wrong - so there is no possibility for approximation.
Edit: The below is a solution to the original problem posted by the OP, which did not include the "all cells must be covered" constraint. I'm leaving it up for those that might face the original problem.
This can be solved with a maximum flow algorithm, such as Edmonds-Karp.
The trick is to model the grid as a graph where there are two nodes per grid cell; one "outgoing" node and one "incoming" node. For each adjacent pair of cells, there are edges from the "outgoing" node in either cell to the "incoming" node in the other cell. Within each cell, there is also an edge from the "incoming" to the "outgoing" node. Each edge has the capacity 1. Create one global source node that has an edge to all the start nodes, and one global sink node to which all end nodes have an edge.
Then, run the flow algorithm; the resulting flow shows the non-intersecting paths.
This works because all flow coming in to a cell must pass through the "internal" edge from the "incoming" to the "ougoing" node, and as such, the flow through each cell is limited to 1 - therefore, no paths will intersect. Also, Edmonds-Karp (and all Floyd-Warshall based flow algorithms) will produce integer flows as long as all capacities are integers.
Here's a program written in Python that walks all potential paths. It uses recursion and backtracking to find the paths, and it marks a grid to see which locations are already being used.
One key optimization is that it marks the start and end points on the grid (10 of the 25 points).
Another optimization is that it generates all moves from each point before starting the "walk" across the grid. For example, from point 1 the moves are to points 2 & 6; from point 7, the moves are to points 2, 6, 8 & 12.
points = [(1,22), (4,17), (5,18), (9,13), (20,23)]
paths = []
# find all moves from each position 0-25
moves = [None] # set position 0 with None
for i in range(1,26):
m = []
if i % 5 != 0: # move right
m.append(i+1)
if i % 5 != 1: # move left
m.append(i-1)
if i > 5: # move up
m.append(i-5)
if i < 21: # move down
m.append(i+5)
moves.append(m)
# Recursive function to walk path 'p' from 'start' to 'end'
def walk(p, start, end):
for m in moves[start]: # try all moves from this point
paths[p].append(m) # keep track of our path
if m == end: # reached the end point for this path?
if p+1 == len(points): # no more paths?
if None not in grid[1:]: # full coverage?
print
for i,path in enumerate(paths):
print "%d." % (i+1), '-'.join(map(str, path))
else:
_start, _end = points[p+1] # now try to walk the next path
walk(p+1, _start, _end)
elif grid[m] is None: # can we walk onto the next grid spot?
grid[m] = p # mark this spot as taken
walk(p, m, end)
grid[m] = None # unmark this spot
paths[p].pop() # backtrack on this path
grid = [None for i in range(26)] # initialize the grid as empty points
for p in range(len(points)):
start, end = points[p]
paths.append([start]) # initialize path with its starting point
grid[start] = grid[end] = p # optimization: pre-set the known points
start, end = points[0]
walk(0, start, end)
Well, I started out thinking about a brute force algorithm, and I left that below, but it turns out it's actually simpler to search for all answers rather than generate all configurations and test for valid answers. Here's the search code, which ended up looking much like #Brent Washburne's. It runs in 53 milliseconds on my laptop.
import java.util.Arrays;
class Puzzle {
final int path[][];
final int grid[] = new int[25];
Puzzle(int[][] path) {
// Make the path endpoints 0-based for Java arrays.
this.path = Arrays.asList(path).stream().map(pair -> {
return new int[] { pair[0] - 1, pair[1] - 1 };
}).toArray(int[][]::new);
}
void print() {
System.out.println();
for (int i = 0; i < grid.length; i += 5)
System.out.println(
Arrays.toString(Arrays.copyOfRange(grid, i, i + 5)));
}
void findPaths(int ip, int i) {
if (grid[i] != -1) return; // backtrack
grid[i] = ip; // mark visited
if(i == path[ip][1]) // path complete
if (ip < path.length - 1) findPaths(ip + 1, path[ip + 1][0]); // find next path
else print(); // solution complete
else { // continue with current path
if (i < 20) findPaths(ip, i + 5);
if (i > 4) findPaths(ip, i - 5);
if (i % 5 < 4) findPaths(ip, i + 1);
if (i % 5 > 0) findPaths(ip, i - 1);
}
grid[i] = -1; // unmark
}
void solve() {
Arrays.fill(grid, -1);
findPaths(0, path[0][0]);
}
public static void main(String[] args) {
new Puzzle(new int[][]{{1, 22}, {4, 17}, {5, 18}, {9, 13}, {20, 23}}).solve();
}
}
Old, bad answer
This problem is doable by brute force if you think about it "backward:" assign all the grid squares to paths and test to see if the assignment is valid. There are 25 grid squares and you need to construct 5 paths, each with 2 endpoints. So you know the paths these 10 points lie on. All that's left is to label the remaining 15 squares with the paths they lie on. There are 5 possibilities for each, so 5^15 in all. That's about 30 billion. All that's left is to build an efficient checker that says whether a given assignment is a set of 5 valid paths. This is simple to do by linear time search. The code below finds your solution in about 2 minutes and takes a bit under 11 minutes to test exhaustively on my MacBook:
import java.util.Arrays;
public class Hacking {
static class Puzzle {
final int path[][];
final int grid[] = new int[25];
Puzzle(int[][] path) { this.path = path; }
void print() {
System.out.println();
for (int i = 0; i < grid.length; i += 5)
System.out.println(
Arrays.toString(Arrays.copyOfRange(grid, i, i + 5)));
}
boolean trace(int p, int i, int goal) {
if (grid[i] != p) return false;
grid[i] = -1; // mark visited
boolean rtn =
i == goal ? !Arrays.asList(grid).contains(p) : nsew(p, i, goal);
grid[i] = p; // unmark
return rtn;
}
boolean nsew(int p, int i, int goal) {
if (i < 20 && trace(p, i + 5, goal)) return true;
if (i > 4 && trace(p, i - 5, goal)) return true;
if (i % 5 < 4 && trace(p, i + 1, goal)) return true;
if (i % 5 > 0 && trace(p, i - 1, goal)) return true;
return false;
}
void test() {
for (int ip = 0; ip < path.length; ip++)
if (!trace(ip, path[ip][0] - 1, path[ip][1] - 1)) return;
print();
}
void enumerate(int i) {
if (i == grid.length) test();
else if (grid[i] != -1) enumerate(i + 1); // already known
else {
for (int ip = 0; ip < 5; ip++) {
grid[i] = ip;
enumerate(i + 1);
}
grid[i] = -1;
}
}
void solve() {
Arrays.fill(grid, -1);
for (int ip = 0; ip < path.length; ip++)
grid[path[ip][0] - 1] = grid[path[ip][1] - 1] = ip;
enumerate(0);
}
}
public static void main(String[] args) {
new Puzzle(new int[][]{{1, 22}, {4, 17}, {5, 18}, {9, 13}, {20, 23}}).solve();
}
}
The starting array:
[ 0, -1, -1, 1, 2]
[-1, -1, -1, 3, -1]
[-1, -1, 3, -1, -1]
[-1, 1, 2, -1, 4]
[-1, 0, 4, -1, -1]
The result:
[ 0, 1, 1, 1, 2]
[ 0, 1, 3, 3, 2]
[ 0, 1, 3, 2, 2]
[ 0, 1, 2, 2, 4]
[ 0, 0, 4, 4, 4]
This is question asked in one of the interview. Please suggest some view.
Given an array containing all positive integers. You have to arrange elements in such a way that odd elements are at odd position and even elements are at even positions.
PS. No extra space. O(N) solution
Iterate over the even positions until you find an odd number. Iterate over the odd positions until you find and even number (using a different index). Swap the two numbers, and repeat.
Are you allowed to double the size of the array? Otherwise, the question doesn't make sense. Why?!? assume you are given an array full of odd numbers, can you think of any solution then? No, there is not.
So, I assume that you are allowed to double the size of the array. Then for any i, put the i-element ( a(i) ) into the location 2*i or 2*i +1 depending on whether a(i) is even or odd resp.
Two two new Arrays OddArray and EvenArray of same size as that of given array. Traverse through the given array and keep sending all the odd to OddArray and keep at odd positions and even number to EvenArray keeping numbers at even positions.
The efficiency will be O(n) and extra memory will be 2n where n is the size of original array.
list1 = [5, 7, 6, 8, 10, 3, 4, 9, 2, 1, 12]
odd_list = []
even_list = []
for i in range(len(list1)):
if((list1[i] % 2) == 0):
even_list.append(list1[i])
else:
odd_list.append(list1[i])
print(list1)
j = 0
k = 0
for i in range(0, len(list1)):
if((i % 2 == 0) and (j < len(odd_list))):
list1[i] = odd_list[j]
j += 1
elif(k < len(even_list)):
list1[i] = even_list[k]
k += 1
print(list1)
//Putting even number on even position and odd number on odd position
package com.learnJava;
public class ArrangeArray {
private int [] array={2,5,7,8,1,6,9};
private int len=array.length;
public static void main(String [] args)
{
ArrangeArray a=new ArrangeArray();
a.print();
a.arrange();
a.print();
}
public void print()
{
for(int i=0;i<array.length;i++)
{
System.out.print(array[i] + " ");
}
System.out.println();
}
public void arrange()
{
int oddinx=1;
int evenidx=0;
while(true)
{
while(evenidx<len && array[evenidx]%2==0)
{
evenidx+=2;
}
while(oddinx<len && array[oddinx]%2==1)
{
oddinx+=2;
}
if (evenidx < len && oddinx < len)
swap (evenidx, oddinx);
else
break;
}
}
public void swap(int a,int b)
{
int tmp=array[b];
array[b]=array[a];
array[a]=tmp;
}
}
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
The community reviewed whether to reopen this question 12 months ago and left it closed:
Original close reason(s) were not resolved
Improve this question
You are given as input an unsorted array of n distinct numbers, where n is a power of 2. Give an algorithm that identifies the second-largest number in the array, and that uses at most n+log₂(n)−2 comparisons.
Start with comparing elements of the n element array in odd and even positions and determining largest element of each pair. This step requires n/2 comparisons. Now you've got only n/2 elements. Continue pairwise comparisons to get n/4, n/8, ... elements. Stop when the largest element is found. This step requires a total of n/2 + n/4 + n/8 + ... + 1 = n-1 comparisons.
During previous step, the largest element was immediately compared with log₂(n) other elements. You can determine the largest of these elements in log₂(n)-1 comparisons. That would be the second-largest number in the array.
Example: array of 8 numbers [10,9,5,4,11,100,120,110].
Comparisons on level 1: [10,9] ->10 [5,4]-> 5, [11,100]->100 , [120,110]-->120.
Comparisons on level 2: [10,5] ->10 [100,120]->120.
Comparisons on level 3: [10,120]->120.
Maximum is 120. It was immediately compared with: 10 (on level 3), 100 (on level 2), 110 (on level 1).
Step 2 should find the maximum of 10, 100, and 110. Which is 110. That's the second largest element.
sly s's answer is derived from this paper, but he didn't explain the algorithm, which means someone stumbling across this question has to read the whole paper, and his code isn't very sleek as well. I'll give the crux of the algorithm from the aforementioned paper, complete with complexity analysis, and also provide a Scala implementation, just because that's the language I chose while working on these problems.
Basically, we do two passes:
Find the max, and keep track of which elements the max was compared to.
Find the max among the elements the max was compared to; the result is the second largest element.
In the picture above, 12 is the largest number in the array, and was compared to 3, 1, 11, and 10 in the first pass. In the second pass, we find the largest among {3, 1, 11, 10}, which is 11, which is the second largest number in the original array.
Time Complexity:
All elements must be looked at, therefore, n - 1 comparisons for pass 1.
Since we divide the problem into two halves each time, there are at most log₂n recursive calls, for each of which, the comparisons sequence grows by at most one; the size of the comparisons sequence is thus at most log₂n, therefore, log₂n - 1 comparisons for pass 2.
Total number of comparisons <= (n - 1) + (log₂n - 1) = n + log₂n - 2
def second_largest(nums: Sequence[int]) -> int:
def _max(lo: int, hi: int, seq: Sequence[int]) -> Tuple[int, MutableSequence[int]]:
if lo >= hi:
return seq[lo], []
mid = lo + (hi - lo) // 2
x, a = _max(lo, mid, seq)
y, b = _max(mid + 1, hi, seq)
if x > y:
a.append(y)
return x, a
b.append(x)
return y, b
comparisons = _max(0, len(nums) - 1, nums)[1]
return _max(0, len(comparisons) - 1, comparisons)[0]
The first run for the given example is as follows:
lo=0, hi=1, mid=0, x=10, a=[], y=4, b=[]
lo=0, hi=2, mid=1, x=10, a=[4], y=5, b=[]
lo=3, hi=4, mid=3, x=8, a=[], y=7, b=[]
lo=3, hi=5, mid=4, x=8, a=[7], y=2, b=[]
lo=0, hi=5, mid=2, x=10, a=[4, 5], y=8, b=[7, 2]
lo=6, hi=7, mid=6, x=12, a=[], y=3, b=[]
lo=6, hi=8, mid=7, x=12, a=[3], y=1, b=[]
lo=9, hi=10, mid=9, x=6, a=[], y=9, b=[]
lo=9, hi=11, mid=10, x=9, a=[6], y=11, b=[]
lo=6, hi=11, mid=8, x=12, a=[3, 1], y=11, b=[9]
lo=0, hi=11, mid=5, x=10, a=[4, 5, 8], y=12, b=[3, 1, 11]
Things to note:
There are exactly n - 1=11 comparisons for n=12.
From the last line, y=12 wins over x=10, and the next pass starts with the sequence [3, 1, 11, 10], which has log₂(12)=3.58 ~ 4 elements, and will require 3 comparisons to find the maximum.
I have implemented this algorithm in Java answered by #Evgeny Kluev. The total comparisons are n+log2(n)−2. There is also a good reference:
Alexander Dekhtyar: CSC 349: Design and Analyis of Algorithms. This is similar to the top voted algorithm.
public class op1 {
private static int findSecondRecursive(int n, int[] A){
int[] firstCompared = findMaxTournament(0, n-1, A); //n-1 comparisons;
int[] secondCompared = findMaxTournament(2, firstCompared[0]-1, firstCompared); //log2(n)-1 comparisons.
//Total comparisons: n+log2(n)-2;
return secondCompared[1];
}
private static int[] findMaxTournament(int low, int high, int[] A){
if(low == high){
int[] compared = new int[2];
compared[0] = 2;
compared[1] = A[low];
return compared;
}
int[] compared1 = findMaxTournament(low, (low+high)/2, A);
int[] compared2 = findMaxTournament((low+high)/2+1, high, A);
if(compared1[1] > compared2[1]){
int k = compared1[0] + 1;
int[] newcompared1 = new int[k];
System.arraycopy(compared1, 0, newcompared1, 0, compared1[0]);
newcompared1[0] = k;
newcompared1[k-1] = compared2[1];
return newcompared1;
}
int k = compared2[0] + 1;
int[] newcompared2 = new int[k];
System.arraycopy(compared2, 0, newcompared2, 0, compared2[0]);
newcompared2[0] = k;
newcompared2[k-1] = compared1[1];
return newcompared2;
}
private static void printarray(int[] a){
for(int i:a){
System.out.print(i + " ");
}
System.out.println();
}
public static void main(String[] args) {
//Demo.
System.out.println("Origial array: ");
int[] A = {10,4,5,8,7,2,12,3,1,6,9,11};
printarray(A);
int secondMax = findSecondRecursive(A.length,A);
Arrays.sort(A);
System.out.println("Sorted array(for check use): ");
printarray(A);
System.out.println("Second largest number in A: " + secondMax);
}
}
the problem is:
let's say, in comparison level 1, the algorithm need to be remember all the array element because largest is not yet known, then, second, finally, third. by keep tracking these element via assignment will invoke additional value assignment and later when the largest is known, you need also consider the tracking back. As the result, it will not be significantly faster than simple 2N-2 Comparison algorithm. Moreover, because the code is more complicated, you need also think about potential debugging time.
eg: in PHP, RUNNING time for comparison vs value assignment roughly is :Comparison: (11-19) to value assignment: 16.
I shall give some examples for better understanding. :
example 1 :
>12 56 98 12 76 34 97 23
>>(12 56) (98 12) (76 34) (97 23)
>>> 56 98 76 97
>>>> (56 98) (76 97)
>>>>> 98 97
>>>>>> 98
The largest element is 98
Now compare with lost ones of the largest element 98. 97 will be the second largest.
nlogn implementation
public class Test {
public static void main(String...args){
int arr[] = new int[]{1,2,2,3,3,4,9,5, 100 , 101, 1, 2, 1000, 102, 2,2,2};
System.out.println(getMax(arr, 0, 16));
}
public static Holder getMax(int[] arr, int start, int end){
if (start == end)
return new Holder(arr[start], Integer.MIN_VALUE);
else {
int mid = ( start + end ) / 2;
Holder l = getMax(arr, start, mid);
Holder r = getMax(arr, mid + 1, end);
if (l.compareTo(r) > 0 )
return new Holder(l.high(), r.high() > l.low() ? r.high() : l.low());
else
return new Holder(r.high(), l.high() > r.low() ? l.high(): r.low());
}
}
static class Holder implements Comparable<Holder> {
private int low, high;
public Holder(int r, int l){low = l; high = r;}
public String toString(){
return String.format("Max: %d, SecMax: %d", high, low);
}
public int compareTo(Holder data){
if (high == data.high)
return 0;
if (high > data.high)
return 1;
else
return -1;
}
public int high(){
return high;
}
public int low(){
return low;
}
}
}
Why not to use this hashing algorithm for given array[n]? It runs c*n, where c is constant time for check and hash. And it does n comparisons.
int first = 0;
int second = 0;
for(int i = 0; i < n; i++) {
if(array[i] > first) {
second = first;
first = array[i];
}
}
Or am I just do not understand the question...
In Python2.7: The following code works at O(nlog log n) for the extra sort. Any optimizations?
def secondLargest(testList):
secondList = []
# Iterate through the list
while(len(testList) > 1):
left = testList[0::2]
right = testList[1::2]
if (len(testList) % 2 == 1):
right.append(0)
myzip = zip(left,right)
mymax = [ max(list(val)) for val in myzip ]
myzip.sort()
secondMax = [x for x in myzip[-1] if x != max(mymax)][0]
if (secondMax != 0 ):
secondList.append(secondMax)
testList = mymax
return max(secondList)
public static int FindSecondLargest(int[] input)
{
Dictionary<int, List<int>> dictWinnerLoser = new Dictionary<int, List<int>>();//Keeps track of loosers with winners
List<int> lstWinners = null;
List<int> lstLoosers = null;
int winner = 0;
int looser = 0;
while (input.Count() > 1)//Runs till we get max in the array
{
lstWinners = new List<int>();//Keeps track of winners of each run, as we have to run with winners of each run till we get one winner
for (int i = 0; i < input.Count() - 1; i += 2)
{
if (input[i] > input[i + 1])
{
winner = input[i];
looser = input[i + 1];
}
else
{
winner = input[i + 1];
looser = input[i];
}
lstWinners.Add(winner);
if (!dictWinnerLoser.ContainsKey(winner))
{
lstLoosers = new List<int>();
lstLoosers.Add(looser);
dictWinnerLoser.Add(winner, lstLoosers);
}
else
{
lstLoosers = dictWinnerLoser[winner];
lstLoosers.Add(looser);
dictWinnerLoser[winner] = lstLoosers;
}
}
input = lstWinners.ToArray();//run the loop again with winners
}
List<int> loosersOfWinner = dictWinnerLoser[input[0]];//Gives all the elemetns who lost to max element of array, input array now has only one element which is actually the max of the array
winner = 0;
for (int i = 0; i < loosersOfWinner.Count(); i++)//Now max in the lossers of winner will give second largest
{
if (winner < loosersOfWinner[i])
{
winner = loosersOfWinner[i];
}
}
return winner;
}
I was recently given this interview question and I'm curious what a good solution to it would be.
Say I'm given a 2d array where all the
numbers in the array are in increasing
order from left to right and top to
bottom.
What is the best way to search and
determine if a target number is in the
array?
Now, my first inclination is to utilize a binary search since my data is sorted. I can determine if a number is in a single row in O(log N) time. However, it is the 2 directions that throw me off.
Another solution I thought may work is to start somewhere in the middle. If the middle value is less than my target, then I can be sure it is in the left square portion of the matrix from the middle. I then move diagonally and check again, reducing the size of the square that the target could potentially be in until I have honed in on the target number.
Does anyone have any good ideas on solving this problem?
Example array:
Sorted left to right, top to bottom.
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Here's a simple approach:
Start at the bottom-left corner.
If the target is less than that value, it must be above us, so move up one.
Otherwise we know that the target can't be in that column, so move right one.
Goto 2.
For an NxM array, this runs in O(N+M). I think it would be difficult to do better. :)
Edit: Lots of good discussion. I was talking about the general case above; clearly, if N or M are small, you could use a binary search approach to do this in something approaching logarithmic time.
Here are some details, for those who are curious:
History
This simple algorithm is called a Saddleback Search. It's been around for a while, and it is optimal when N == M. Some references:
David Gries, The Science of Programming. Springer-Verlag, 1989.
Edsgar Dijkstra, The Saddleback Search. Note EWD-934, 1985.
However, when N < M, intuition suggests that binary search should be able to do better than O(N+M): For example, when N == 1, a pure binary search will run in logarithmic rather than linear time.
Worst-case bound
Richard Bird examined this intuition that binary search could improve the Saddleback algorithm in a 2006 paper:
Richard S. Bird, Improving Saddleback Search: A Lesson in Algorithm Design, in Mathematics of Program Construction, pp. 82--89, volume 4014, 2006.
Using a rather unusual conversational technique, Bird shows us that for N <= M, this problem has a lower bound of Ω(N * log(M/N)). This bound make sense, as it gives us linear performance when N == M and logarithmic performance when N == 1.
Algorithms for rectangular arrays
One approach that uses a row-by-row binary search looks like this:
Start with a rectangular array where N < M. Let's say N is rows and M is columns.
Do a binary search on the middle row for value. If we find it, we're done.
Otherwise we've found an adjacent pair of numbers s and g, where s < value < g.
The rectangle of numbers above and to the left of s is less than value, so we can eliminate it.
The rectangle below and to the right of g is greater than value, so we can eliminate it.
Go to step (2) for each of the two remaining rectangles.
In terms of worst-case complexity, this algorithm does log(M) work to eliminate half the possible solutions, and then recursively calls itself twice on two smaller problems. We do have to repeat a smaller version of that log(M) work for every row, but if the number of rows is small compared to the number of columns, then being able to eliminate all of those columns in logarithmic time starts to become worthwhile.
This gives the algorithm a complexity of T(N,M) = log(M) + 2 * T(M/2, N/2), which Bird shows to be O(N * log(M/N)).
Another approach posted by Craig Gidney describes an algorithm similar the approach above: it examines a row at a time using a step size of M/N. His analysis shows that this results in O(N * log(M/N)) performance as well.
Performance Comparison
Big-O analysis is all well and good, but how well do these approaches work in practice? The chart below examines four algorithms for increasingly "square" arrays:
(The "naive" algorithm simply searches every element of the array. The "recursive" algorithm is described above. The "hybrid" algorithm is an implementation of Gidney's algorithm. For each array size, performance was measured by timing each algorithm over fixed set of 1,000,000 randomly-generated arrays.)
Some notable points:
As expected, the "binary search" algorithms offer the best performance on rectangular arrays and the Saddleback algorithm works the best on square arrays.
The Saddleback algorithm performs worse than the "naive" algorithm for 1-d arrays, presumably because it does multiple comparisons on each item.
The performance hit that the "binary search" algorithms take on square arrays is presumably due to the overhead of running repeated binary searches.
Summary
Clever use of binary search can provide O(N * log(M/N) performance for both rectangular and square arrays. The O(N + M) "saddleback" algorithm is much simpler, but suffers from performance degradation as arrays become increasingly rectangular.
This problem takes Θ(b lg(t)) time, where b = min(w,h) and t=b/max(w,h). I discuss the solution in this blog post.
Lower bound
An adversary can force an algorithm to make Ω(b lg(t)) queries, by restricting itself to the main diagonal:
Legend: white cells are smaller items, gray cells are larger items, yellow cells are smaller-or-equal items and orange cells are larger-or-equal items. The adversary forces the solution to be whichever yellow or orange cell the algorithm queries last.
Notice that there are b independent sorted lists of size t, requiring Ω(b lg(t)) queries to completely eliminate.
Algorithm
(Assume without loss of generality that w >= h)
Compare the target item against the cell t to the left of the top right corner of the valid area
If the cell's item matches, return the current position.
If the cell's item is less than the target item, eliminate the remaining t cells in the row with a binary search. If a matching item is found while doing this, return with its position.
Otherwise the cell's item is more than the target item, eliminating t short columns.
If there's no valid area left, return failure
Goto step 2
Finding an item:
Determining an item doesn't exist:
Legend: white cells are smaller items, gray cells are larger items, and the green cell is an equal item.
Analysis
There are b*t short columns to eliminate. There are b long rows to eliminate. Eliminating a long row costs O(lg(t)) time. Eliminating t short columns costs O(1) time.
In the worst case we'll have to eliminate every column and every row, taking time O(lg(t)*b + b*t*1/t) = O(b lg(t)).
Note that I'm assuming lg clamps to a result above 1 (i.e. lg(x) = log_2(max(2,x))). That's why when w=h, meaning t=1, we get the expected bound of O(b lg(1)) = O(b) = O(w+h).
Code
public static Tuple<int, int> TryFindItemInSortedMatrix<T>(this IReadOnlyList<IReadOnlyList<T>> grid, T item, IComparer<T> comparer = null) {
if (grid == null) throw new ArgumentNullException("grid");
comparer = comparer ?? Comparer<T>.Default;
// check size
var width = grid.Count;
if (width == 0) return null;
var height = grid[0].Count;
if (height < width) {
var result = grid.LazyTranspose().TryFindItemInSortedMatrix(item, comparer);
if (result == null) return null;
return Tuple.Create(result.Item2, result.Item1);
}
// search
var minCol = 0;
var maxRow = height - 1;
var t = height / width;
while (minCol < width && maxRow >= 0) {
// query the item in the minimum column, t above the maximum row
var luckyRow = Math.Max(maxRow - t, 0);
var cmpItemVsLucky = comparer.Compare(item, grid[minCol][luckyRow]);
if (cmpItemVsLucky == 0) return Tuple.Create(minCol, luckyRow);
// did we eliminate t rows from the bottom?
if (cmpItemVsLucky < 0) {
maxRow = luckyRow - 1;
continue;
}
// we eliminated most of the current minimum column
// spend lg(t) time eliminating rest of column
var minRowInCol = luckyRow + 1;
var maxRowInCol = maxRow;
while (minRowInCol <= maxRowInCol) {
var mid = minRowInCol + (maxRowInCol - minRowInCol + 1) / 2;
var cmpItemVsMid = comparer.Compare(item, grid[minCol][mid]);
if (cmpItemVsMid == 0) return Tuple.Create(minCol, mid);
if (cmpItemVsMid > 0) {
minRowInCol = mid + 1;
} else {
maxRowInCol = mid - 1;
maxRow = mid - 1;
}
}
minCol += 1;
}
return null;
}
I would use the divide-and-conquer strategy for this problem, similar to what you suggested, but the details are a bit different.
This will be a recursive search on subranges of the matrix.
At each step, pick an element in the middle of the range. If the value found is what you are seeking, then you're done.
Otherwise, if the value found is less than the value that you are seeking, then you know that it is not in the quadrant above and to the left of your current position. So recursively search the two subranges: everything (exclusively) below the current position, and everything (exclusively) to the right that is at or above the current position.
Otherwise, (the value found is greater than the value that you are seeking) you know that it is not in the quadrant below and to the right of your current position. So recursively search the two subranges: everything (exclusively) to the left of the current position, and everything (exclusively) above the current position that is on the current column or a column to the right.
And ba-da-bing, you found it.
Note that each recursive call only deals with the current subrange only, not (for example) ALL rows above the current position. Just those in the current subrange.
Here's some pseudocode for you:
bool numberSearch(int[][] arr, int value, int minX, int maxX, int minY, int maxY)
if (minX == maxX and minY == maxY and arr[minX,minY] != value)
return false
if (arr[minX,minY] > value) return false; // Early exits if the value can't be in
if (arr[maxX,maxY] < value) return false; // this subrange at all.
int nextX = (minX + maxX) / 2
int nextY = (minY + maxY) / 2
if (arr[nextX,nextY] == value)
{
print nextX,nextY
return true
}
else if (arr[nextX,nextY] < value)
{
if (numberSearch(arr, value, minX, maxX, nextY + 1, maxY))
return true
return numberSearch(arr, value, nextX + 1, maxX, minY, nextY)
}
else
{
if (numberSearch(arr, value, minX, nextX - 1, minY, maxY))
return true
reutrn numberSearch(arr, value, nextX, maxX, minY, nextY)
}
The two main answers give so far seem to be the arguably O(log N) "ZigZag method" and the O(N+M) Binary Search method. I thought I'd do some testing comparing the two methods with some various setups. Here are the details:
The array is N x N square in every test, with N varying from 125 to 8000 (the largest my JVM heap could handle). For each array size, I picked a random place in the array to put a single 2. I then put a 3 everywhere possible (to the right and below of the 2) and then filled the rest of the array with 1. Some of the earlier commenters seemed to think this type of setup would yield worst case run time for both algorithms. For each array size, I picked 100 different random locations for the 2 (search target) and ran the test. I recorded avg run time and worst case run time for each algorithm. Because it was happening too fast to get good ms readings in Java, and because I don't trust Java's nanoTime(), I repeated each test 1000 times just to add a uniform bias factor to all the times. Here are the results:
ZigZag beat binary in every test for both avg and worst case times, however, they are all within an order of magnitude of each other more or less.
Here is the Java code:
public class SearchSortedArray2D {
static boolean findZigZag(int[][] a, int t) {
int i = 0;
int j = a.length - 1;
while (i <= a.length - 1 && j >= 0) {
if (a[i][j] == t) return true;
else if (a[i][j] < t) i++;
else j--;
}
return false;
}
static boolean findBinarySearch(int[][] a, int t) {
return findBinarySearch(a, t, 0, 0, a.length - 1, a.length - 1);
}
static boolean findBinarySearch(int[][] a, int t,
int r1, int c1, int r2, int c2) {
if (r1 > r2 || c1 > c2) return false;
if (r1 == r2 && c1 == c2 && a[r1][c1] != t) return false;
if (a[r1][c1] > t) return false;
if (a[r2][c2] < t) return false;
int rm = (r1 + r2) / 2;
int cm = (c1 + c2) / 2;
if (a[rm][cm] == t) return true;
else if (a[rm][cm] > t) {
boolean b1 = findBinarySearch(a, t, r1, c1, r2, cm - 1);
boolean b2 = findBinarySearch(a, t, r1, cm, rm - 1, c2);
return (b1 || b2);
} else {
boolean b1 = findBinarySearch(a, t, r1, cm + 1, rm, c2);
boolean b2 = findBinarySearch(a, t, rm + 1, c1, r2, c2);
return (b1 || b2);
}
}
static void randomizeArray(int[][] a, int N) {
int ri = (int) (Math.random() * N);
int rj = (int) (Math.random() * N);
a[ri][rj] = 2;
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (i == ri && j == rj) continue;
else if (i > ri || j > rj) a[i][j] = 3;
else a[i][j] = 1;
}
}
}
public static void main(String[] args) {
int N = 8000;
int[][] a = new int[N][N];
int randoms = 100;
int repeats = 1000;
long start, end, duration;
long zigMin = Integer.MAX_VALUE, zigMax = Integer.MIN_VALUE;
long binMin = Integer.MAX_VALUE, binMax = Integer.MIN_VALUE;
long zigSum = 0, zigAvg;
long binSum = 0, binAvg;
for (int k = 0; k < randoms; k++) {
randomizeArray(a, N);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findZigZag(a, 2);
end = System.currentTimeMillis();
duration = end - start;
zigSum += duration;
zigMin = Math.min(zigMin, duration);
zigMax = Math.max(zigMax, duration);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findBinarySearch(a, 2);
end = System.currentTimeMillis();
duration = end - start;
binSum += duration;
binMin = Math.min(binMin, duration);
binMax = Math.max(binMax, duration);
}
zigAvg = zigSum / randoms;
binAvg = binSum / randoms;
System.out.println(findZigZag(a, 2) ?
"Found via zigzag method. " : "ERROR. ");
//System.out.println("min search time: " + zigMin + "ms");
System.out.println("max search time: " + zigMax + "ms");
System.out.println("avg search time: " + zigAvg + "ms");
System.out.println();
System.out.println(findBinarySearch(a, 2) ?
"Found via binary search method. " : "ERROR. ");
//System.out.println("min search time: " + binMin + "ms");
System.out.println("max search time: " + binMax + "ms");
System.out.println("avg search time: " + binAvg + "ms");
}
}
This is a short proof of the lower bound on the problem.
You cannot do it better than linear time (in terms of array dimensions, not the number of elements). In the array below, each of the elements marked as * can be either 5 or 6 (independently of other ones). So if your target value is 6 (or 5) the algorithm needs to examine all of them.
1 2 3 4 *
2 3 4 * 7
3 4 * 7 8
4 * 7 8 9
* 7 8 9 10
Of course this expands to bigger arrays as well. This means that this answer is optimal.
Update: As pointed out by Jeffrey L Whitledge, it is only optimal as the asymptotic lower bound on running time vs input data size (treated as a single variable). Running time treated as two-variable function on both array dimensions can be improved.
I think Here is the answer and it works for any kind of sorted matrix
bool findNum(int arr[][ARR_MAX],int xmin, int xmax, int ymin,int ymax,int key)
{
if (xmin > xmax || ymin > ymax || xmax < xmin || ymax < ymin) return false;
if ((xmin == xmax) && (ymin == ymax) && (arr[xmin][ymin] != key)) return false;
if (arr[xmin][ymin] > key || arr[xmax][ymax] < key) return false;
if (arr[xmin][ymin] == key || arr[xmax][ymax] == key) return true;
int xnew = (xmin + xmax)/2;
int ynew = (ymin + ymax)/2;
if (arr[xnew][ynew] == key) return true;
if (arr[xnew][ynew] < key)
{
if (findNum(arr,xnew+1,xmax,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ynew+1,ymax,key));
} else {
if (findNum(arr,xmin,xnew-1,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ymin,ynew-1,key));
}
}
Interesting question. Consider this idea - create one boundary where all the numbers are greater than your target and another where all the numbers are less than your target. If anything is left in between the two, that's your target.
If I'm looking for 3 in your example, I read across the first row until I hit 4, then look for the smallest adjacent number (including diagonals) greater than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I do the same for those numbers less than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I ask, is anything inside the two boundaries? If yes, it must be 3. If no, then there is no 3. Sort of indirect since I don't actually find the number, I just deduce that it must be there. This has the added bonus of counting ALL the 3's.
I tried this on some examples and it seems to work OK.
Binary search through the diagonal of the array is the best option.
We can find out whether the element is less than or equal to the elements in the diagonal.
I've been asking this question in interviews for the better part of a decade and I think there's only been one person who has been able to come up with an optimal algorithm.
My solution has always been:
Binary search the middle diagonal, which is the diagonal running down and right, containing the item at (rows.count/2, columns.count/2).
If the target number is found, return true.
Otherwise, two numbers (u and v) will have been found such that u is smaller than the target, v is larger than the target, and v is one right and one down from u.
Recursively search the sub-matrix to the right of u and top of v and the one to the bottom of u and left of v.
I believe this is a strict improvement over the algorithm given by Nate here, since searching the diagonal often allows a reduction of over half the search space (if the matrix is close to square), whereas searching a row or column always results in an elimination of exactly half.
Here's the code in (probably not terribly Swifty) Swift:
import Cocoa
class Solution {
func searchMatrix(_ matrix: [[Int]], _ target: Int) -> Bool {
if (matrix.isEmpty || matrix[0].isEmpty) {
return false
}
return _searchMatrix(matrix, 0..<matrix.count, 0..<matrix[0].count, target)
}
func _searchMatrix(_ matrix: [[Int]], _ rows: Range<Int>, _ columns: Range<Int>, _ target: Int) -> Bool {
if (rows.count == 0 || columns.count == 0) {
return false
}
if (rows.count == 1) {
return _binarySearch(matrix, rows.lowerBound, columns, target, true)
}
if (columns.count == 1) {
return _binarySearch(matrix, columns.lowerBound, rows, target, false)
}
var lowerInflection = (-1, -1)
var upperInflection = (Int.max, Int.max)
var currentRows = rows
var currentColumns = columns
while (currentRows.count > 0 && currentColumns.count > 0 && upperInflection.0 > lowerInflection.0+1) {
let rowMidpoint = (currentRows.upperBound + currentRows.lowerBound) / 2
let columnMidpoint = (currentColumns.upperBound + currentColumns.lowerBound) / 2
let value = matrix[rowMidpoint][columnMidpoint]
if (value == target) {
return true
}
if (value > target) {
upperInflection = (rowMidpoint, columnMidpoint)
currentRows = currentRows.lowerBound..<rowMidpoint
currentColumns = currentColumns.lowerBound..<columnMidpoint
} else {
lowerInflection = (rowMidpoint, columnMidpoint)
currentRows = rowMidpoint+1..<currentRows.upperBound
currentColumns = columnMidpoint+1..<currentColumns.upperBound
}
}
if (lowerInflection.0 == -1) {
lowerInflection = (upperInflection.0-1, upperInflection.1-1)
} else if (upperInflection.0 == Int.max) {
upperInflection = (lowerInflection.0+1, lowerInflection.1+1)
}
return _searchMatrix(matrix, rows.lowerBound..<lowerInflection.0+1, upperInflection.1..<columns.upperBound, target) || _searchMatrix(matrix, upperInflection.0..<rows.upperBound, columns.lowerBound..<lowerInflection.1+1, target)
}
func _binarySearch(_ matrix: [[Int]], _ rowOrColumn: Int, _ range: Range<Int>, _ target: Int, _ searchRow : Bool) -> Bool {
if (range.isEmpty) {
return false
}
let midpoint = (range.upperBound + range.lowerBound) / 2
let value = (searchRow ? matrix[rowOrColumn][midpoint] : matrix[midpoint][rowOrColumn])
if (value == target) {
return true
}
if (value > target) {
return _binarySearch(matrix, rowOrColumn, range.lowerBound..<midpoint, target, searchRow)
} else {
return _binarySearch(matrix, rowOrColumn, midpoint+1..<range.upperBound, target, searchRow)
}
}
}
A. Do a binary search on those lines where the target number might be on.
B. Make it a graph : Look for the number by taking always the smallest unvisited neighbour node and backtracking when a too big number is found
Binary search would be the best approach, imo. Starting at 1/2 x, 1/2 y will cut it in half. IE a 5x5 square would be something like x == 2 / y == 3 . I rounded one value down and one value up to better zone in on the direction of the targeted value.
For clarity the next iteration would give you something like x == 1 / y == 2 OR x == 3 / y == 5
Well, to begin with, let us assume we are using a square.
1 2 3
2 3 4
3 4 5
1. Searching a square
I would use a binary search on the diagonal. The goal is the locate the smaller number that is not strictly lower than the target number.
Say I am looking for 4 for example, then I would end up locating 5 at (2,2).
Then, I am assured that if 4 is in the table, it is at a position either (x,2) or (2,x) with x in [0,2]. Well, that's just 2 binary searches.
The complexity is not daunting: O(log(N)) (3 binary searches on ranges of length N)
2. Searching a rectangle, naive approach
Of course, it gets a bit more complicated when N and M differ (with a rectangle), consider this degenerate case:
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
And let's say I am looking for 9... The diagonal approach is still good, but the definition of diagonal changes. Here my diagonal is [1, (5 or 6), 17]. Let's say I picked up [1,5,17], then I know that if 9 is in the table it is either in the subpart:
5 6 7 8
6 7 8 9
10 11 12 13 14 15 16
This gives us 2 rectangles:
5 6 7 8 10 11 12 13 14 15 16
6 7 8 9
So we can recurse! probably beginning by the one with less elements (though in this case it kills us).
I should point that if one of the dimensions is less than 3, we cannot apply the diagonal methods and must use a binary search. Here it would mean:
Apply binary search on 10 11 12 13 14 15 16, not found
Apply binary search on 5 6 7 8, not found
Apply binary search on 6 7 8 9, not found
It's tricky because to get good performance you might want to differentiate between several cases, depending on the general shape....
3. Searching a rectangle, brutal approach
It would be much easier if we dealt with a square... so let's just square things up.
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
17 . . . . . . 17
. .
. .
. .
17 . . . . . . 17
We now have a square.
Of course, we will probably NOT actually create those rows, we could simply emulate them.
def get(x,y):
if x < N and y < M: return table[x][y]
else: return table[N-1][M-1] # the max
so it behaves like a square without occupying more memory (at the cost of speed, probably, depending on cache... oh well :p)
EDIT:
I misunderstood the question. As the comments point out this only works in the more restricted case.
In a language like C that stores data in row-major order, simply treat it as a 1D array of size n * m and use a binary search.
I have a recursive Divide & Conquer Solution.
Basic Idea for one step is: We know that the Left-Upper(LU) is smallest and the right-bottom(RB) is the largest no., so the given No(N) must: N>=LU and N<=RB
IF N==LU and N==RB::::Element Found and Abort returning the position/Index
If N>=LU and N<=RB = FALSE, No is not there and abort.
If N>=LU and N<=RB = TRUE, Divide the 2D array in 4 equal parts of 2D array each in logical manner..
And then apply the same algo step to all four sub-array.
My Algo is Correct I have implemented on my friends PC.
Complexity: each 4 comparisons can b used to deduce the total no of elements to one-fourth at its worst case..
So My complexity comes to be 1 + 4 x lg(n) + 4
But really expected this to be working on O(n)
I think something is wrong somewhere in my calculation of Complexity, please correct if so..
The optimal solution is to start at the top-left corner, that has minimal value. Move diagonally downwards to the right until you hit an element whose value >= value of the given element. If the element's value is equal to that of the given element, return found as true.
Otherwise, from here we can proceed in two ways.
Strategy 1:
Move up in the column and search for the given element until we reach the end. If found, return found as true
Move left in the row and search for the given element until we reach the end. If found, return found as true
return found as false
Strategy 2:
Let i denote the row index and j denote the column index of the diagonal element we have stopped at. (Here, we have i = j, BTW). Let k = 1.
Repeat the below steps until i-k >= 0
Search if a[i-k][j] is equal to the given element. if yes, return found as true.
Search if a[i][j-k] is equal to the given element. if yes, return found as true.
Increment k
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
public boolean searchSortedMatrix(int arr[][] , int key , int minX , int maxX , int minY , int maxY){
// base case for recursion
if(minX > maxX || minY > maxY)
return false ;
// early fails
// array not properly intialized
if(arr==null || arr.length==0)
return false ;
// arr[0][0]> key return false
if(arr[minX][minY]>key)
return false ;
// arr[maxX][maxY]<key return false
if(arr[maxX][maxY]<key)
return false ;
//int temp1 = minX ;
//int temp2 = minY ;
int midX = (minX+maxX)/2 ;
//if(temp1==midX){midX+=1 ;}
int midY = (minY+maxY)/2 ;
//if(temp2==midY){midY+=1 ;}
// arr[midX][midY] = key ? then value found
if(arr[midX][midY] == key)
return true ;
// alas ! i have to keep looking
// arr[midX][midY] < key ? search right quad and bottom matrix ;
if(arr[midX][midY] < key){
if( searchSortedMatrix(arr ,key , minX,maxX , midY+1 , maxY))
return true ;
// search bottom half of matrix
if( searchSortedMatrix(arr ,key , midX+1,maxX , minY , maxY))
return true ;
}
// arr[midX][midY] > key ? search left quad matrix ;
else {
return(searchSortedMatrix(arr , key , minX,midX-1,minY,midY-1));
}
return false ;
}
I suggest, store all characters in a 2D list. then find index of required element if it exists in list.
If not present print appropriate message else print row and column as:
row = (index/total_columns) and column = (index%total_columns -1)
This will incur only the binary search time in a list.
Please suggest any corrections. :)
If O(M log(N)) solution is ok for an MxN array -
template <size_t n>
struct MN * get(int a[][n], int k, int M, int N){
struct MN *result = new MN;
result->m = -1;
result->n = -1;
/* Do a binary search on each row since rows (and columns too) are sorted. */
for(int i = 0; i < M; i++){
int lo = 0; int hi = N - 1;
while(lo <= hi){
int mid = lo + (hi-lo)/2;
if(k < a[i][mid]) hi = mid - 1;
else if (k > a[i][mid]) lo = mid + 1;
else{
result->m = i;
result->n = mid;
return result;
}
}
}
return result;
}
Working C++ demo.
Please do let me know if this wouldn't work or if there is a bug it it.
class Solution {
public boolean searchMatrix(int[][] matrix, int target) {
if(matrix == null)
return false;
int i=0;
int j=0;
int m = matrix.length;
int n = matrix[0].length;
boolean found = false;
while(i<m && !found){
while(j<n && !found){
if(matrix[i][j] == target)
found = true;
if(matrix[i][j] < target)
j++;
else
break;
}
i++;
j=0;
}
return found;
}}
129 / 129 test cases passed.
Status: Accepted
Runtime: 39 ms
Memory Usage: 55 MB
Given a square matrix as follows:
[ a b c ]
[ d e f ]
[ i j k ]
We know that a < c, d < f, i < k. What we don't know is whether d < c or d > c, etc. We have guarantees only in 1-dimension.
Looking at the end elements (c,f,k), we can do a sort of filter: is N < c ? search() : next(). Thus, we have n iterations over the rows, with each row taking either O( log( n ) ) for binary search or O( 1 ) if filtered out.
Let me given an EXAMPLE where N = j,
1) Check row 1. j < c? (no, go next)
2) Check row 2. j < f? (yes, bin search gets nothing)
3) Check row 3. j < k? (yes, bin search finds it)
Try again with N = q,
1) Check row 1. q < c? (no, go next)
2) Check row 2. q < f? (no, go next)
3) Check row 3. q < k? (no, go next)
There is probably a better solution out there but this is easy to explain.. :)
As this is an interview question, it would seem to lead towards a discussion of Parallel programming and Map-reduce algorithms.
See http://code.google.com/intl/de/edu/parallel/mapreduce-tutorial.html