I was recently given this interview question and I'm curious what a good solution to it would be.
Say I'm given a 2d array where all the numbers in the array are in increasing order from left to right and top to bottom.
What is the best way to search and determine if a target number is in the array?
Now, my first inclination is to utilize a binary search since my data is sorted. I can determine if a number is in a single row in O(log N) time. However, it is the 2 directions that throw me off.
Another solution I thought may work is to start somewhere in the middle. If the middle value is less than my target, then I can be sure it is in the left square portion of the matrix from the middle. I then move diagonally and check again, reducing the size of the square that the target could potentially be in until I have honed in on the target number.
Does anyone have any good ideas on solving this problem?
Example array:
Sorted left to right, top to bottom.
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Here's a simple approach:
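1. Start at the bottom-left corner.
2. If the value there equals the target, we're done. If the target is less than that value, it must be above us, so move up one.
3. Otherwise we know that the target can't be in that column, so move right one.
4. Goto 2, stopping with "not found" once we step outside the array.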
For an NxM array, this runs in O(N+M). I think it would be difficult to do better. :)
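As a minimal sketch of the walk described above (the function name `saddleback_search` and the list-of-lists representation are assumptions for illustration, not part of the original answer):

```python
def saddleback_search(grid, target):
    """Return (row, col) of target in a row- and column-sorted grid, or None."""
    if not grid or not grid[0]:
        return None
    row, col = len(grid) - 1, 0  # start at the bottom-left corner
    while row >= 0 and col < len(grid[0]):
        value = grid[row][col]
        if value == target:
            return (row, col)
        if target < value:
            row -= 1  # the rest of this row (to the right) is even larger
        else:
            col += 1  # the rest of this column (above) is even smaller
    return None
```

Each iteration discards one row or one column, which is where the O(N+M) bound comes from.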
Edit: Lots of good discussion. I was talking about the general case above; clearly, if N or M are small, you could use a binary search approach to do this in something approaching logarithmic time.
Here are some details, for those who are curious:
History
This simple algorithm is called a Saddleback Search. It's been around for a while, and it is optimal when N == M. Some references:
David Gries, The Science of Programming. Springer-Verlag, 1989.
Edsger Dijkstra, The Saddleback Search. Note EWD-934, 1985.
However, when N < M, intuition suggests that binary search should be able to do better than O(N+M): For example, when N == 1, a pure binary search will run in logarithmic rather than linear time.
Worst-case bound
Richard Bird examined this intuition that binary search could improve the Saddleback algorithm in a 2006 paper:
Richard S. Bird, Improving Saddleback Search: A Lesson in Algorithm Design, in Mathematics of Program Construction, pp. 82--89, volume 4014, 2006.
Using a rather unusual conversational technique, Bird shows us that for N <= M, this problem has a lower bound of Ω(N * log(M/N)). This bound makes sense, as it gives us linear performance when N == M and logarithmic performance when N == 1.
Algorithms for rectangular arrays
One approach that uses a row-by-row binary search looks like this:
1. Start with a rectangular array where N < M. Let's say N is rows and M is columns.
2. Do a binary search on the middle row for value. If we find it, we're done.
3. Otherwise we've found an adjacent pair of numbers s and g, where s < value < g.
4. The rectangle of numbers above and to the left of s is less than value, so we can eliminate it.
5. The rectangle below and to the right of g is greater than value, so we can eliminate it.
6. Go to step (2) for each of the two remaining rectangles.
In terms of worst-case complexity, this algorithm does log(M) work to eliminate half the possible solutions, and then recursively calls itself twice on two smaller problems. We do have to repeat a smaller version of that log(M) work for every row, but if the number of rows is small compared to the number of columns, then being able to eliminate all of those columns in logarithmic time starts to become worthwhile.
This gives the algorithm a complexity of T(N,M) = log(M) + 2 * T(N/2, M/2), which Bird shows to be O(N * log(M/N)).
Another approach posted by Craig Gidney describes an algorithm similar to the approach above: it examines a row at a time using a step size of M/N. His analysis shows that this results in O(N * log(M/N)) performance as well.
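The row-by-row recursion described above can be sketched roughly as follows. This is only an illustration of the idea, not Bird's or Gidney's exact algorithm; the name `search_by_middle_row` and the half-open index ranges are assumptions made for the example.

```python
from bisect import bisect_left


def search_by_middle_row(grid, target):
    """Return (row, col) of target in a row- and column-sorted grid, or None."""

    def search(top, bottom, left, right):
        # Half-open ranges: rows [top, bottom), columns [left, right).
        if top >= bottom or left >= right:
            return None
        mid = (top + bottom) // 2
        # First column in [left, right) whose value is >= target.
        col = bisect_left(grid[mid], target, left, right)
        if col < right and grid[mid][col] == target:
            return (mid, col)
        # Rows above mid, columns left of col: all smaller than target.
        # Rows below mid, columns from col onward: all larger than target.
        return (search(top, mid, col, right)
                or search(mid + 1, bottom, left, col))

    if not grid or not grid[0]:
        return None
    return search(0, len(grid), 0, len(grid[0]))
```

Each call pays one binary search on the middle row and then recurses into the two rectangles that were not eliminated.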
Performance Comparison
Big-O analysis is all well and good, but how well do these approaches work in practice? The chart below examines four algorithms for increasingly "square" arrays:
(The "naive" algorithm simply searches every element of the array. The "recursive" algorithm is described above. The "hybrid" algorithm is an implementation of Gidney's algorithm. For each array size, performance was measured by timing each algorithm over a fixed set of 1,000,000 randomly-generated arrays.)
Some notable points:
As expected, the "binary search" algorithms offer the best performance on rectangular arrays and the Saddleback algorithm works the best on square arrays.
The Saddleback algorithm performs worse than the "naive" algorithm for 1-d arrays, presumably because it does multiple comparisons on each item.
The performance hit that the "binary search" algorithms take on square arrays is presumably due to the overhead of running repeated binary searches.
Summary
Clever use of binary search can provide O(N * log(M/N)) performance for both rectangular and square arrays. The O(N + M) "saddleback" algorithm is much simpler, but suffers from performance degradation as arrays become increasingly rectangular.
This problem takes Θ(b lg(t)) time, where b = min(w,h) and t = max(w,h)/b. I discuss the solution in this blog post.
Lower bound
An adversary can force an algorithm to make Ω(b lg(t)) queries, by restricting itself to the main diagonal:
Legend: white cells are smaller items, gray cells are larger items, yellow cells are smaller-or-equal items and orange cells are larger-or-equal items. The adversary forces the solution to be whichever yellow or orange cell the algorithm queries last.
Notice that there are b independent sorted lists of size t, requiring Ω(b lg(t)) queries to completely eliminate.
Algorithm
1. (Assume without loss of generality that w >= h)
2. Compare the target item against the cell t to the left of the top right corner of the valid area
   If the cell's item matches, return the current position.
   If the cell's item is less than the target item, eliminate the remaining t cells in the row with a binary search. If a matching item is found while doing this, return with its position.
   Otherwise the cell's item is more than the target item, eliminating t short columns.
3. If there's no valid area left, return failure
4. Goto step 2
Finding an item:
Determining an item doesn't exist:
Legend: white cells are smaller items, gray cells are larger items, and the green cell is an equal item.
Analysis
There are b*t short columns to eliminate. There are b long rows to eliminate. Eliminating a long row costs O(lg(t)) time. Eliminating t short columns costs O(1) time.
In the worst case we'll have to eliminate every column and every row, taking time O(lg(t)*b + b*t*1/t) = O(b lg(t)).
Note that I'm assuming lg clamps to a result above 1 (i.e. lg(x) = log_2(max(2,x))). That's why when w=h, meaning t=1, we get the expected bound of O(b lg(1)) = O(b) = O(w+h).
Code
public static Tuple<int, int> TryFindItemInSortedMatrix<T>(this IReadOnlyList<IReadOnlyList<T>> grid, T item, IComparer<T> comparer = null) {
if (grid == null) throw new ArgumentNullException("grid");
comparer = comparer ?? Comparer<T>.Default;
// check size
var width = grid.Count;
if (width == 0) return null;
var height = grid[0].Count;
if (height < width) {
var result = grid.LazyTranspose().TryFindItemInSortedMatrix(item, comparer);
if (result == null) return null;
return Tuple.Create(result.Item2, result.Item1);
}
// search
var minCol = 0;
var maxRow = height - 1;
var t = height / width;
while (minCol < width && maxRow >= 0) {
// query the item in the minimum column, t above the maximum row
var luckyRow = Math.Max(maxRow - t, 0);
var cmpItemVsLucky = comparer.Compare(item, grid[minCol][luckyRow]);
if (cmpItemVsLucky == 0) return Tuple.Create(minCol, luckyRow);
// did we eliminate t rows from the bottom?
if (cmpItemVsLucky < 0) {
maxRow = luckyRow - 1;
continue;
}
// we eliminated most of the current minimum column
// spend lg(t) time eliminating rest of column
var minRowInCol = luckyRow + 1;
var maxRowInCol = maxRow;
while (minRowInCol <= maxRowInCol) {
var mid = minRowInCol + (maxRowInCol - minRowInCol + 1) / 2;
var cmpItemVsMid = comparer.Compare(item, grid[minCol][mid]);
if (cmpItemVsMid == 0) return Tuple.Create(minCol, mid);
if (cmpItemVsMid > 0) {
minRowInCol = mid + 1;
} else {
maxRowInCol = mid - 1;
maxRow = mid - 1;
}
}
minCol += 1;
}
return null;
}
I would use the divide-and-conquer strategy for this problem, similar to what you suggested, but the details are a bit different.
This will be a recursive search on subranges of the matrix.
At each step, pick an element in the middle of the range. If the value found is what you are seeking, then you're done.
Otherwise, if the value found is less than the value that you are seeking, then you know that it is not in the quadrant above and to the left of your current position. So recursively search the two subranges: everything (exclusively) below the current position, and everything (exclusively) to the right that is at or above the current position.
Otherwise, (the value found is greater than the value that you are seeking) you know that it is not in the quadrant below and to the right of your current position. So recursively search the two subranges: everything (exclusively) to the left of the current position, and everything (exclusively) above the current position that is on the current column or a column to the right.
And ba-da-bing, you found it.
Note that each recursive call only deals with the current subrange only, not (for example) ALL rows above the current position. Just those in the current subrange.
Here's some pseudocode for you:
bool numberSearch(int[][] arr, int value, int minX, int maxX, int minY, int maxY)
{
    if (minX > maxX or minY > maxY) return false  // Empty subrange.
    if (minX == maxX and minY == maxY and arr[minX,minY] != value)
        return false
    if (arr[minX,minY] > value) return false  // Early exits if the value can't be in
    if (arr[maxX,maxY] < value) return false  // this subrange at all.
    int nextX = (minX + maxX) / 2
    int nextY = (minY + maxY) / 2
    if (arr[nextX,nextY] == value)
    {
        print nextX,nextY
        return true
    }
    else if (arr[nextX,nextY] < value)
    {
        // Everything at or before (nextX,nextY) in both directions is too small.
        if (numberSearch(arr, value, minX, maxX, nextY + 1, maxY))
            return true
        return numberSearch(arr, value, nextX + 1, maxX, minY, nextY)
    }
    else
    {
        // Everything at or after (nextX,nextY) in both directions is too large.
        if (numberSearch(arr, value, minX, nextX - 1, minY, maxY))
            return true
        return numberSearch(arr, value, nextX, maxX, minY, nextY - 1)
    }
}
The two main answers given so far seem to be the O(N+M) "ZigZag method" and the arguably O(log N) Binary Search method. I thought I'd do some testing comparing the two methods with some various setups. Here are the details:
The array is N x N square in every test, with N varying from 125 to 8000 (the largest my JVM heap could handle). For each array size, I picked a random place in the array to put a single 2. I then put a 3 everywhere possible (to the right and below of the 2) and then filled the rest of the array with 1. Some of the earlier commenters seemed to think this type of setup would yield worst case run time for both algorithms. For each array size, I picked 100 different random locations for the 2 (search target) and ran the test. I recorded avg run time and worst case run time for each algorithm. Because it was happening too fast to get good ms readings in Java, and because I don't trust Java's nanoTime(), I repeated each test 1000 times just to add a uniform bias factor to all the times. Here are the results:
ZigZag beat binary in every test for both avg and worst case times, however, they are all within an order of magnitude of each other more or less.
Here is the Java code:
public class SearchSortedArray2D {
static boolean findZigZag(int[][] a, int t) {
int i = 0;
int j = a.length - 1;
while (i <= a.length - 1 && j >= 0) {
if (a[i][j] == t) return true;
else if (a[i][j] < t) i++;
else j--;
}
return false;
}
static boolean findBinarySearch(int[][] a, int t) {
return findBinarySearch(a, t, 0, 0, a.length - 1, a.length - 1);
}
static boolean findBinarySearch(int[][] a, int t,
int r1, int c1, int r2, int c2) {
if (r1 > r2 || c1 > c2) return false;
if (r1 == r2 && c1 == c2 && a[r1][c1] != t) return false;
if (a[r1][c1] > t) return false;
if (a[r2][c2] < t) return false;
int rm = (r1 + r2) / 2;
int cm = (c1 + c2) / 2;
if (a[rm][cm] == t) return true;
else if (a[rm][cm] > t) {
boolean b1 = findBinarySearch(a, t, r1, c1, r2, cm - 1);
boolean b2 = findBinarySearch(a, t, r1, cm, rm - 1, c2);
return (b1 || b2);
} else {
boolean b1 = findBinarySearch(a, t, r1, cm + 1, rm, c2);
boolean b2 = findBinarySearch(a, t, rm + 1, c1, r2, c2);
return (b1 || b2);
}
}
static void randomizeArray(int[][] a, int N) {
int ri = (int) (Math.random() * N);
int rj = (int) (Math.random() * N);
a[ri][rj] = 2;
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (i == ri && j == rj) continue;
else if (i > ri || j > rj) a[i][j] = 3;
else a[i][j] = 1;
}
}
}
public static void main(String[] args) {
int N = 8000;
int[][] a = new int[N][N];
int randoms = 100;
int repeats = 1000;
long start, end, duration;
long zigMin = Integer.MAX_VALUE, zigMax = Integer.MIN_VALUE;
long binMin = Integer.MAX_VALUE, binMax = Integer.MIN_VALUE;
long zigSum = 0, zigAvg;
long binSum = 0, binAvg;
for (int k = 0; k < randoms; k++) {
randomizeArray(a, N);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findZigZag(a, 2);
end = System.currentTimeMillis();
duration = end - start;
zigSum += duration;
zigMin = Math.min(zigMin, duration);
zigMax = Math.max(zigMax, duration);
start = System.currentTimeMillis();
for (int i = 0; i < repeats; i++) findBinarySearch(a, 2);
end = System.currentTimeMillis();
duration = end - start;
binSum += duration;
binMin = Math.min(binMin, duration);
binMax = Math.max(binMax, duration);
}
zigAvg = zigSum / randoms;
binAvg = binSum / randoms;
System.out.println(findZigZag(a, 2) ?
"Found via zigzag method. " : "ERROR. ");
//System.out.println("min search time: " + zigMin + "ms");
System.out.println("max search time: " + zigMax + "ms");
System.out.println("avg search time: " + zigAvg + "ms");
System.out.println();
System.out.println(findBinarySearch(a, 2) ?
"Found via binary search method. " : "ERROR. ");
//System.out.println("min search time: " + binMin + "ms");
System.out.println("max search time: " + binMax + "ms");
System.out.println("avg search time: " + binAvg + "ms");
}
}
This is a short proof of the lower bound on the problem.
You cannot do it better than linear time (in terms of array dimensions, not the number of elements). In the array below, each of the elements marked as * can be either 5 or 6 (independently of other ones). So if your target value is 6 (or 5) the algorithm needs to examine all of them.
1 2 3 4 *
2 3 4 * 7
3 4 * 7 8
4 * 7 8 9
* 7 8 9 10
Of course this expands to bigger arrays as well. This means that this answer is optimal.
Update: As pointed out by Jeffrey L Whitledge, it is only optimal as the asymptotic lower bound on running time vs input data size (treated as a single variable). Running time treated as two-variable function on both array dimensions can be improved.
I think this is the answer, and it works for any kind of sorted matrix:
bool findNum(int arr[][ARR_MAX],int xmin, int xmax, int ymin,int ymax,int key)
{
if (xmin > xmax || ymin > ymax || xmax < xmin || ymax < ymin) return false;
if ((xmin == xmax) && (ymin == ymax) && (arr[xmin][ymin] != key)) return false;
if (arr[xmin][ymin] > key || arr[xmax][ymax] < key) return false;
if (arr[xmin][ymin] == key || arr[xmax][ymax] == key) return true;
int xnew = (xmin + xmax)/2;
int ynew = (ymin + ymax)/2;
if (arr[xnew][ynew] == key) return true;
if (arr[xnew][ynew] < key)
{
if (findNum(arr,xnew+1,xmax,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ynew+1,ymax,key));
} else {
if (findNum(arr,xmin,xnew-1,ymin,ymax,key))
return true;
return (findNum(arr,xmin,xmax,ymin,ynew-1,key));
}
}
Interesting question. Consider this idea - create one boundary where all the numbers are greater than your target and another where all the numbers are less than your target. If anything is left in between the two, that's your target.
If I'm looking for 3 in your example, I read across the first row until I hit 4, then look for the smallest adjacent number (including diagonals) greater than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I do the same for those numbers less than 3:
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
Now I ask, is anything inside the two boundaries? If yes, it must be 3. If no, then there is no 3. Sort of indirect since I don't actually find the number, I just deduce that it must be there. This has the added bonus of counting ALL the 3's.
I tried this on some examples and it seems to work OK.
Binary search through the diagonal of the array is the best option.
We can find out whether the element is less than or equal to the elements in the diagonal.
I've been asking this question in interviews for the better part of a decade and I think there's only been one person who has been able to come up with an optimal algorithm.
My solution has always been:
Binary search the middle diagonal, which is the diagonal running down and right, containing the item at (rows.count/2, columns.count/2).
If the target number is found, return true.
Otherwise, two numbers (u and v) will have been found such that u is smaller than the target, v is larger than the target, and v is one right and one down from u.
Recursively search the sub-matrix to the right of u and top of v and the one to the bottom of u and left of v.
I believe this is a strict improvement over the algorithm given by Nate here, since searching the diagonal often allows a reduction of over half the search space (if the matrix is close to square), whereas searching a row or column always results in an elimination of exactly half.
Here's the code in (probably not terribly Swifty) Swift:
import Cocoa
class Solution {
func searchMatrix(_ matrix: [[Int]], _ target: Int) -> Bool {
if (matrix.isEmpty || matrix[0].isEmpty) {
return false
}
return _searchMatrix(matrix, 0..<matrix.count, 0..<matrix[0].count, target)
}
func _searchMatrix(_ matrix: [[Int]], _ rows: Range<Int>, _ columns: Range<Int>, _ target: Int) -> Bool {
if (rows.count == 0 || columns.count == 0) {
return false
}
if (rows.count == 1) {
return _binarySearch(matrix, rows.lowerBound, columns, target, true)
}
if (columns.count == 1) {
return _binarySearch(matrix, columns.lowerBound, rows, target, false)
}
var lowerInflection = (-1, -1)
var upperInflection = (Int.max, Int.max)
var currentRows = rows
var currentColumns = columns
while (currentRows.count > 0 && currentColumns.count > 0 && upperInflection.0 > lowerInflection.0+1) {
let rowMidpoint = (currentRows.upperBound + currentRows.lowerBound) / 2
let columnMidpoint = (currentColumns.upperBound + currentColumns.lowerBound) / 2
let value = matrix[rowMidpoint][columnMidpoint]
if (value == target) {
return true
}
if (value > target) {
upperInflection = (rowMidpoint, columnMidpoint)
currentRows = currentRows.lowerBound..<rowMidpoint
currentColumns = currentColumns.lowerBound..<columnMidpoint
} else {
lowerInflection = (rowMidpoint, columnMidpoint)
currentRows = rowMidpoint+1..<currentRows.upperBound
currentColumns = columnMidpoint+1..<currentColumns.upperBound
}
}
if (lowerInflection.0 == -1) {
lowerInflection = (upperInflection.0-1, upperInflection.1-1)
} else if (upperInflection.0 == Int.max) {
upperInflection = (lowerInflection.0+1, lowerInflection.1+1)
}
return _searchMatrix(matrix, rows.lowerBound..<lowerInflection.0+1, upperInflection.1..<columns.upperBound, target) || _searchMatrix(matrix, upperInflection.0..<rows.upperBound, columns.lowerBound..<lowerInflection.1+1, target)
}
func _binarySearch(_ matrix: [[Int]], _ rowOrColumn: Int, _ range: Range<Int>, _ target: Int, _ searchRow : Bool) -> Bool {
if (range.isEmpty) {
return false
}
let midpoint = (range.upperBound + range.lowerBound) / 2
let value = (searchRow ? matrix[rowOrColumn][midpoint] : matrix[midpoint][rowOrColumn])
if (value == target) {
return true
}
if (value > target) {
return _binarySearch(matrix, rowOrColumn, range.lowerBound..<midpoint, target, searchRow)
} else {
return _binarySearch(matrix, rowOrColumn, midpoint+1..<range.upperBound, target, searchRow)
}
}
}
A. Do a binary search on those lines where the target number might be on.
B. Make it a graph : Look for the number by taking always the smallest unvisited neighbour node and backtracking when a too big number is found
Binary search would be the best approach, imo. Starting at 1/2 x, 1/2 y will cut it in half. IE a 5x5 square would be something like x == 2 / y == 3 . I rounded one value down and one value up to better zone in on the direction of the targeted value.
For clarity the next iteration would give you something like x == 1 / y == 2 OR x == 3 / y == 5
Well, to begin with, let us assume we are using a square.
1 2 3
2 3 4
3 4 5
1. Searching a square
I would use a binary search on the diagonal. The goal is to locate the smallest number that is not strictly lower than the target number.
Say I am looking for 4 for example, then I would end up locating 5 at (2,2).
Then, I am assured that if 4 is in the table, it is at a position either (x,2) or (2,x) with x in [0,2]. Well, that's just 2 binary searches.
The complexity is not daunting: O(log(N)) (3 binary searches on ranges of length N)
2. Searching a rectangle, naive approach
Of course, it gets a bit more complicated when N and M differ (with a rectangle), consider this degenerate case:
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
And let's say I am looking for 9... The diagonal approach is still good, but the definition of diagonal changes. Here my diagonal is [1, (5 or 6), 17]. Let's say I picked up [1,5,17], then I know that if 9 is in the table it is either in the subpart:
5 6 7 8
6 7 8 9
10 11 12 13 14 15 16
This gives us 2 rectangles:
5 6 7 8 10 11 12 13 14 15 16
6 7 8 9
So we can recurse! Probably beginning with the one with fewer elements (though in this case it kills us).
I should point out that if one of the dimensions is less than 3, we cannot apply the diagonal method and must use a binary search. Here it would mean:
Apply binary search on 10 11 12 13 14 15 16, not found
Apply binary search on 5 6 7 8, not found
Apply binary search on 6 7 8 9, not found
It's tricky because to get good performance you might want to differentiate between several cases, depending on the general shape....
3. Searching a rectangle, brutal approach
It would be much easier if we dealt with a square... so let's just square things up.
1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
17 . . . . . . 17
. .
. .
. .
17 . . . . . . 17
We now have a square.
Of course, we will probably NOT actually create those rows, we could simply emulate them.
def get(x,y):
    if x < N and y < M: return table[x][y]
    else: return table[N-1][M-1] # the max
so it behaves like a square without occupying more memory (at the cost of speed, probably, depending on cache... oh well :p)
EDIT:
I misunderstood the question. As the comments point out this only works in the more restricted case.
In a language like C that stores data in row-major order, simply treat it as a 1D array of size n * m and use a binary search.
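A minimal sketch of that flattened binary search, under the restricted assumption noted in the edit above (each row starts at or above the value where the previous row ended, so the whole grid is sorted in row-major order). The function name `search_fully_sorted` is hypothetical, and `divmod` stands in for the pointer arithmetic one would use in C:

```python
def search_fully_sorted(grid, target):
    """Binary search over a grid that is sorted in row-major order."""
    if not grid or not grid[0]:
        return None
    cols = len(grid[0])
    lo, hi = 0, len(grid) * cols - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        row, col = divmod(mid, cols)  # map the 1D index back to 2D
        value = grid[row][col]
        if value == target:
            return (row, col)
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

This does not work on the example array from the question, where a row may start below the end of the previous row.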
I have a recursive Divide & Conquer Solution.
The basic idea for one step: we know that the upper-left element (LU) is the smallest and the bottom-right element (RB) is the largest, so the given number (N) must satisfy N >= LU and N <= RB.
If N == LU or N == RB, the element is found; stop and return its position/index.
If N >= LU and N <= RB is FALSE, the number is not there; stop.
If N >= LU and N <= RB is TRUE, logically divide the 2D array into 4 equal 2D sub-arrays.
Then apply the same step to all four sub-arrays.
My algorithm is correct; I have implemented it on my friend's PC.
Complexity: every 4 comparisons can be used to reduce the total number of elements to one-fourth in the worst case.
So my complexity comes to 1 + 4 x lg(n) + 4.
But I really expected this to work in O(n).
I think something is wrong somewhere in my complexity calculation; please correct me if so.
The optimal solution is to start at the top-left corner, that has minimal value. Move diagonally downwards to the right until you hit an element whose value >= value of the given element. If the element's value is equal to that of the given element, return found as true.
Otherwise, from here we can proceed in two ways.
Strategy 1:
Move up in the column and search for the given element until we reach the end. If found, return found as true
Move left in the row and search for the given element until we reach the end. If found, return found as true
return found as false
Strategy 2:
Let i denote the row index and j denote the column index of the diagonal element we have stopped at. (Here, we have i = j, BTW). Let k = 1.
Repeat the below steps until i-k >= 0
Search if a[i-k][j] is equal to the given element. if yes, return found as true.
Search if a[i][j-k] is equal to the given element. if yes, return found as true.
Increment k
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11
public boolean searchSortedMatrix(int arr[][] , int key , int minX , int maxX , int minY , int maxY){
// base case for recursion
if(minX > maxX || minY > maxY)
return false ;
// early fails
// array not properly initialized
if(arr==null || arr.length==0)
return false ;
// arr[0][0]> key return false
if(arr[minX][minY]>key)
return false ;
// arr[maxX][maxY]<key return false
if(arr[maxX][maxY]<key)
return false ;
//int temp1 = minX ;
//int temp2 = minY ;
int midX = (minX+maxX)/2 ;
//if(temp1==midX){midX+=1 ;}
int midY = (minY+maxY)/2 ;
//if(temp2==midY){midY+=1 ;}
// arr[midX][midY] = key ? then value found
if(arr[midX][midY] == key)
return true ;
// alas ! i have to keep looking
// arr[midX][midY] < key ? search right quad and bottom matrix ;
if(arr[midX][midY] < key){
if( searchSortedMatrix(arr ,key , minX,maxX , midY+1 , maxY))
return true ;
// search bottom half of matrix
if( searchSortedMatrix(arr ,key , midX+1,maxX , minY , maxY))
return true ;
}
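// arr[midX][midY] > key ? everything at or after (midX, midY) in both directions is too large,
// so search the part before midX, then the part from midX onward that lies before midY
else {
if( searchSortedMatrix(arr , key , minX , midX-1 , minY , maxY))
return true ;
return searchSortedMatrix(arr , key , midX , maxX , minY , midY-1) ;
}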
return false ;
}
I suggest, store all characters in a 2D list. then find index of required element if it exists in list.
If not present print appropriate message else print row and column as:
row = (index/total_columns) and column = (index%total_columns -1)
This will incur only the binary search time in a list.
Please suggest any corrections. :)
If O(M log(N)) solution is ok for an MxN array -
template <size_t n>
struct MN * get(int a[][n], int k, int M, int N){
struct MN *result = new MN;
result->m = -1;
result->n = -1;
/* Do a binary search on each row since rows (and columns too) are sorted. */
for(int i = 0; i < M; i++){
int lo = 0; int hi = N - 1;
while(lo <= hi){
int mid = lo + (hi-lo)/2;
if(k < a[i][mid]) hi = mid - 1;
else if (k > a[i][mid]) lo = mid + 1;
else{
result->m = i;
result->n = mid;
return result;
}
}
}
return result;
}
Working C++ demo.
Please do let me know if this wouldn't work or if there is a bug in it.
class Solution {
public boolean searchMatrix(int[][] matrix, int target) {
if(matrix == null)
return false;
int i=0;
int j=0;
int m = matrix.length;
int n = matrix[0].length;
boolean found = false;
while(i<m && !found){
while(j<n && !found){
if(matrix[i][j] == target)
found = true;
if(matrix[i][j] < target)
j++;
else
break;
}
i++;
j=0;
}
return found;
}}
129 / 129 test cases passed.
Status: Accepted
Runtime: 39 ms
Memory Usage: 55 MB
Given a square matrix as follows:
[ a b c ]
[ d e f ]
[ i j k ]
We know that a < c, d < f, i < k. What we don't know is whether d < c or d > c, etc. We have guarantees only in 1-dimension.
Looking at the end elements (c,f,k), we can do a sort of filter: is N <= c ? search() : next(). Thus, we have n iterations over the rows, with each row taking either O( log( n ) ) for binary search or O( 1 ) if filtered out.
Let me give an example where N = j,
1) Check row 1. j <= c? (no, go next)
2) Check row 2. j <= f? (yes, bin search gets nothing)
3) Check row 3. j <= k? (yes, bin search finds it)
Try again with N = q,
1) Check row 1. q <= c? (no, go next)
2) Check row 2. q <= f? (no, go next)
3) Check row 3. q <= k? (no, go next)
There is probably a better solution out there but this is easy to explain.. :)
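For illustration, the filter-then-binary-search idea might look like the sketch below. The name `search_rows` is hypothetical, and the sketch relies only on each row being sorted, as the answer assumes:

```python
from bisect import bisect_left


def search_rows(grid, target):
    """Return (row, col) of target, binary searching only rows that could hold it."""
    for r, row in enumerate(grid):
        if not row or target > row[-1]:
            continue  # filtered out in O(1): the row's largest value is too small
        col = bisect_left(row, target)  # O(log n) search within this row
        if row[col] == target:
            return (r, col)
    return None
```

In the worst case every row passes the filter, so this is O(n log n) for an n x n matrix.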
As this is an interview question, it would seem to lead towards a discussion of Parallel programming and Map-reduce algorithms.
See http://code.google.com/intl/de/edu/parallel/mapreduce-tutorial.html
I need code for the ranking selection method on a genetic algorithm.
I have created roulette and tournament selection methods, but now I need ranking and I am stuck.
My roulette code is here (I am using atom struct for genetic atoms) :
const int roulette (const atom *f)
{
int i;
double sum, sumrnd;
sum = 0;
for (i = 0; i < N; i++)
sum += f[i].fitness + OFFSET;
sumrnd = rnd () * sum;
sum = 0;
for (i = 0; i < N; i++) {
sum += f[i].fitness + OFFSET;
if (sum > sumrnd)
break;
}
return i;
}
Where atom :
typedef struct atom
{
int geno[VARS];
double pheno[VARS];
double fitness;
} atom;
Rank selection is easy to implement when you already know roulette wheel selection. Instead of using the fitness as the probability of getting selected, you use the rank. So for a population of N solutions the best solution gets rank N, the second best rank N-1, etc. The worst individual has rank 1. Now use the roulette wheel and start selecting.
The probability for the best individual to be selected is N/( (N * (N+1))/2 ) or roughly 2 / N, for the worst individual it is 2 / (N*(N+1)) or roughly 2 / N^2.
This is called linear rank selection, because the ranks form a linear progression. You can also think of ranks forming a geometric progression, e.g. 1 / 2^n, where n ranges from 1 for the best individual to N for the worst. This of course gives a much higher probability to the best individual.
You can look at the implementation of some selection methods in HeuristicLab.
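A minimal sketch of linear rank selection as described above, assuming higher fitness is better. The function name `rank_select` and the `rng` parameter are assumptions made for the example (the question's own code is in C; this shows only the selection logic):

```python
import random


def rank_select(fitnesses, rng=random):
    """Return the index of one individual chosen by linear rank selection."""
    n = len(fitnesses)
    # order[k] is the index of the individual with rank k + 1 (worst first).
    order = sorted(range(n), key=lambda i: fitnesses[i])
    total = n * (n + 1) // 2      # sum of the ranks 1..n
    spin = rng.random() * total   # roulette wheel over ranks, not fitness
    cumulative = 0
    for rank, index in enumerate(order, start=1):
        cumulative += rank
        if spin < cumulative:
            return index
    return order[-1]              # guard against floating-point round-off
```

With this weighting the best individual is picked with probability N / (N(N+1)/2) and the worst with 1 / (N(N+1)/2), matching the figures given above.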
My code of Rank Selection in MatLab:
% Sort ascending so that row i of CurrentPop has rank i (rank PopLength is the fittest).
[NewFitness,SortIdx]=sort(Fitness);
CurrentPop=CurrentPop(SortIdx,1:IndLength);
% Linear ranking: selection probability proportional to rank, normalized to sum to 1.
ProbSelection=(1:PopLength)'/sum(1:PopLength);
CumProb=cumsum(ProbSelection);
% Spin the roulette wheel once per slot in the new population.
SelectInd=rand(PopLength,1);
SelectedPop=zeros(PopLength,IndLength);
for i=1:PopLength
    j=find(CumProb>=SelectInd(i),1);
    if isempty(j)
        j=PopLength; % guard against round-off in CumProb(end)
    end
    SelectedPop(i,1:IndLength)=CurrentPop(j,1:IndLength);
end
I've made a template genetic-algorithm class in C++.
My genetic algorithm library is split into GeneticAlgorithm and GAPopulation. Both are template classes, so you can see their source code in the API documents.
Here are source codes and API documents.
http://samchon.github.io/framework/api/cpp/d5/d28/classsamchon_1_1library_1_1GeneticAlgorithm.html
http://samchon.github.io/framework/api/cpp/d8/dcd/classsamchon_1_1library_1_1GAPopulation.html
How do you print numbers of the form 2^i * 5^j in increasing order?
For example:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing these data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you have had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works!, but contains no further comments, sorry):
#include <cstdint>
#include <iostream>
#include <limits>
#include <map>
#include <utility>
typedef std::int64_t Integer;
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;
// Computes 2^i * 5^j with integer arithmetic; returns -1 if it would overflow.
Integer result(const MyPair &myPair)
{
    const Integer maxValue = std::numeric_limits<Integer>::max();
    Integer value = 1;
    for (Integer k = 0; k < myPair.first; ++k)
    {
        if (value > maxValue / 2) return -1;
        value *= 2;
    }
    for (Integer k = 0; k < myPair.second; ++k)
    {
        if (value > maxValue / 5) return -1;
        value *= 5;
    }
    return value;
}
int main()
{
    MyMap myMap;
    MyPair firstValue(0,0);
    myMap[result(firstValue)] = firstValue;
    while (true)
    {
        auto it=myMap.begin();
        if (it->first < 0) break; // overflow
        MyPair myPair = it->second;
        std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
        myMap.erase(it);
        MyPair pair1 = myPair;
        ++pair1.first;
        myMap[result(pair1)] = pair1;
        MyPair pair2 = myPair;
        ++pair2.second;
        myMap[result(pair2)] = pair2;
    }
}
This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current, next)=
    member this.current = current
    member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
    if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
    else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
Works well with Results then being a stream type with current value and a next method.
Walking through it:
I define min for completeness.
I define a stream type to have a current value and a method to return a new stream, essentially the head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then increments that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define Squares to be a stream of increasing powers of 2.
AllPowers takes the start value and merges two streams: the start value multiplied by successive powers of 2 (Squares), and the stream produced by calling AllPowers on the start value multiplied by 5, since multiplying by 2 or by 5 are your only two options. You are effectively left with a tree of results.
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursion, compiler optimisations, etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
    if (s.current > 0) then
        do System.Console.WriteLine(s.current)
        PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in range(100):
    mult_2 = numbers[next_2]*2
    mult_5 = numbers[next_5]*5
    if mult_2 < mult_5:
        candidate = mult_2
        next_2 += 1
    else:
        candidate = mult_5
        next_5 += 1
    # The comparison here is to avoid appending duplicates
    if candidate > numbers[-1]:
        numbers.append(candidate)
print(numbers)
So we have two loops, one incrementing i and second one incrementing j starting both from zero, right? (multiply symbol is confusing in the title of the question)
You can do something very straightforward:
Add all items in an array
Sort the array
Or do you need another solution with more mathematical analysis?
EDIT: A smarter solution that leverages the similarity with the Merge Sort problem
If we imagine the infinite sets of numbers 2^i and 5^j as two independent streams/lists, this problem looks very much like the well-known Merge Sort problem.
So the solution steps are:
Get two numbers, one from each of the streams (of 2 and of 5)
Compare
Return smallest
Get the next number from the stream that the previously returned smallest came from
and that's it! ;)
PS: The complexity of Merge Sort is always O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable states are. Imagine an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
while (/* some terminate condition or go on forever */) {
    minNext = 0;
    nextCell = -1;
    // look through all addable cells
    for (int i=0; i<mu.length; ++i) {
        if (i==0 or mu[i-1]>mu[i]) {
            // row i holds the multiples of 5^i; its addable cell is in column mu[i] (0-indexed)
            // check for new minimum value
            if (minNext == 0 or 2^mu[i] * 5^i < minNext) {
                nextCell = i;
                minNext = 2^mu[i] * 5^i;
            }
        }
    }
    // print next smallest entry and update mu
    print(minNext);
    mu[nextCell]++;
}
I wrote this in Maple stopping after 12 iterations:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and also output the order in which the cells were added, which gave this:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
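For anyone who wants to run the partition idea, here is a minimal Python sketch of it. It assumes the layout of the matrix just above (row r holds the multiples of 5^r, column c multiplies by 2^c); the function name `smooth_2_5` and the variable names are mine, not from the Maple version.

```python
def smooth_2_5(count):
    """Return the first `count` numbers of the form 2^i * 5^j in increasing order.

    mu[r] is the number of cells already visited in row r, where row r holds
    5^r * 2^c for column c (assumed layout, matching the matrix above).
    """
    result = [1]
    mu = [1]  # only the top-left cell (the value 1) has been visited
    while len(result) < count:
        best_row, best_value = None, None
        # Rows 0..len(mu) are the only ones that can contain an addable cell;
        # row len(mu) is the first completely empty row.
        for r in range(len(mu) + 1):
            visited = mu[r] if r < len(mu) else 0
            addable = r == 0 or mu[r - 1] > visited
            if not addable:
                continue
            value = (2 ** visited) * (5 ** r)
            if best_value is None or value < best_value:
                best_row, best_value = r, value
        if best_row == len(mu):
            mu.append(0)
        mu[best_row] += 1
        result.append(best_value)
    return result


print(smooth_2_5(12))  # [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```

Each step only compares one candidate per row that has an addable cell, which is the point of tracking the partition.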
First of all, (as others mentioned already) this question is very vague!!!
Nevertheless, I am going to give it a shot based on your vague equation and the pattern in your expected result. I am not sure the following is exactly what you are trying to do, but it may give you some ideas about Java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
private static List<Integer> findIncreasingNumbers(int maxIteration) {
SortedSet<Integer> numbers = new TreeSet<Integer>();
SortedSet<Integer> numbers2 = new TreeSet<Integer>();
for (int i=0;i < maxIteration;i++) {
int n1 = (int)Math.pow(2, i);
numbers.add(n1);
for (int j=0;j < maxIteration;j++) {
int n2 = (int)Math.pow(5, i);
numbers.add(n2);
for (Integer n: numbers) {
int n3 = n*n1;
numbers2.add(n3);
}
}
}
numbers.addAll(numbers2);
return new ArrayList<Integer>(numbers);
}
/**
* Based on the following fuzzy question @ StackOverflow
* http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
*
*
* Result:
* 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
*/
public static void main(String[] args) {
List<Integer> numbers = findIncreasingNumbers(5);
for (Integer i: numbers) {
System.out.print(i + " ");
}
}
}
If you can do it in O(nlogn), here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean min-heap
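A rough Python sketch of the heap steps above, using the standard `heapq` module. One thing the pseudocode glosses over is duplicates (10 is reached both as 2*5 and as 5*2), so this version keeps a `seen` set; that set and the name `first_terms` are my additions.

```python
import heapq


def first_terms(count):
    """Return the first `count` numbers of the form 2^i * 5^j, smallest first."""
    heap = [1]
    seen = {1}  # values already pushed, so 10 = 2*5 = 5*2 enters the heap once
    terms = []
    while len(terms) < count:
        num = heapq.heappop(heap)
        terms.append(num)
        for candidate in (num * 2, num * 5):
            if candidate not in seen:
                seen.add(candidate)
                heapq.heappush(heap, candidate)
    return terms


print(first_terms(8))  # [1, 2, 4, 5, 8, 10, 16, 20]
```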
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
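As a quick sanity check of that claim, the following Python sketch orders a small grid of (i, j) pairs once by the actual product and once by i + j*log(5)/log(2), and compares the two orderings. The grid bounds (i < 8, j < 4) are arbitrary choices for illustration.

```python
import math

C = math.log(5) / math.log(2)  # log2(5), roughly 2.32

# A small, arbitrary grid of exponent pairs (i, j).
pairs = [(i, j) for i in range(8) for j in range(4)]

by_value = sorted(pairs, key=lambda p: 2 ** p[0] * 5 ** p[1])
by_log = sorted(pairs, key=lambda p: p[0] + p[1] * C)

print(by_value == by_log)  # True
print([2 ** i * 5 ** j for i, j in by_log[:8]])  # [1, 2, 4, 5, 8, 10, 16, 20]
```

For very large exponents, two keys could end up closer together than floating-point precision can resolve, so treat this as an illustration rather than a proof.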
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will possibly refer to as transforms of i,j or just transforms) gives us some clues of where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3, increasing j by 1 while decreasing i by 2 will give an increase of just 0.3. The problem then is finding, at each stage, the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thoughts I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to screen and that the syntax/structure isn't valid for a real C# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this but I strongly suspect it's better than some other solutions because at each stage all I need to do is check my set of transforms. Working out how many of these there are compared to "n" is non-trivial. It is clearly related, since the further you go the more transforms there are, but the number of new transforms becomes vanishingly small at higher numbers, so maybe it's just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it lets you calculate i,j without needing to calculate the product, so you can work out what the sequence would be without computing the actual numbers themselves.
For what it's worth, after the first 230 numbers (when int runs out of space) I had 9 transforms to check each time. And given it's only my total that overflowed, I ran it for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone has got the answer by now, but I just wanted to give a direction to this solution.
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used: starting from zero, determine whether each non-negative integer can be written with some values of i and j. This was facilitated by turning answer = (2^i)*(5^j) around and solving for i instead. That gave me i = log2 (answer/ (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^i) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}
Is there any nice algorithm to find the nearest prime number to a given real number? I only need to search within the first 100 primes or so.
At present, I've a bunch of prime numbers stored in an array and I'm checking the difference one number at a time (O(n)?).
Rather than a sorted list of primes, given the relatively small range targetted, have an array indexed by all the odd numbers in the range (you know there are no even primes except the special case of 2) and containing the closest prime. Finding the solution becomes O(1) time-wise.
The 100th prime is 541, so an array of about 270 [small] ints is all that is needed.
This approach is particularly valid, given the relative high density of primes (in particular relative to odd numbers), in the range below 1,000. (As this affects the size of a binary tree)
If you only need to search in the first 100 primes or so, just create a sorted table of those primes, and do a binary search. This will either get you to one prime number, or a spot between two, and you check which of those is closer.
Edit: Given the distribution of primes in that range, you could probably speed things up (a tiny bit) by using an interpolation search: instead of always starting at the middle of the table, use linear interpolation to guess at a more accurate starting point. The 100th prime number is 541, so if (for example) you wanted the one closest to 50, you'd start roughly a tenth of the way into the array instead of halfway. You can pretty much treat the primes as starting at 1, so just divide the number you want by the largest in your range to get a guess at the starting point.
Answers so far are rather complicated, given the task in hand. The first hundred primes are all less than 600. I would create an array of size 600 and place in each entry the value of the nearest prime to that index. Then, given a number to test, I would round it both up and down using the floor and ceil functions to get one or two candidate answers. A simple comparison of the distances to these candidates will give you a very fast answer.
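Here is a minimal Python sketch of that table idea. It assumes the input lies between 0 and 541 (the 100th prime) and breaks ties in favour of the smaller prime; both are my choices, as are the names `NEAREST` and `nearest_prime`.

```python
import math


def primes_up_to(limit):
    """Trial-division list of primes <= limit (fine for such a tiny range)."""
    return [n for n in range(2, limit + 1)
            if all(n % d for d in range(2, math.isqrt(n) + 1))]


LIMIT = 541  # the 100th prime
PRIMES = primes_up_to(LIMIT)

# NEAREST[k] is the prime closest to the integer k (ties go to the smaller prime).
NEAREST = [min(PRIMES, key=lambda p: (abs(p - k), p)) for k in range(LIMIT + 1)]


def nearest_prime(x):
    """Nearest prime to a real number x, assuming 0 <= x <= LIMIT."""
    candidates = {NEAREST[math.floor(x)], NEAREST[math.ceil(x)]}
    return min(candidates, key=lambda p: (abs(p - x), p))


print(nearest_prime(4.4))  # 5
print(nearest_prime(90))   # 89
```

The table is built once up front; each query is then two array reads and one comparison.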
The simplest approach would be to store the primes in a sorted list and modify your algorithm to do a binary search.
The standard binary search algorithm would return null for a miss, but it should be straight-forward to modify it for your purposes.
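One way to sketch the modified binary search in Python is with the standard `bisect` module, which returns an insertion point instead of a miss. The list below only holds the first 25 primes to keep the example short, and the tie-breaking rule (prefer the smaller prime) is an arbitrary choice of mine.

```python
from bisect import bisect_left

# The first 25 primes, for brevity; extend the list to cover the range you need.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


def nearest_prime(x, primes=PRIMES):
    """Return the entry of the sorted list `primes` closest to x."""
    pos = bisect_left(primes, x)  # first index with primes[pos] >= x
    if pos == 0:
        return primes[0]
    if pos == len(primes):
        return primes[-1]
    below, above = primes[pos - 1], primes[pos]
    # On a tie, prefer the smaller prime (an arbitrary choice).
    return below if x - below <= above - x else above


print(nearest_prime(8.7))  # 7
print(nearest_prime(50))   # 47 (tie between 47 and 53)
```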
The fastest algorithm? Create a lookup table with p[100]=541 elements and return the result for floor(x), with special logic for x on [2,3]. That would be O(1).
You should store your numbers in a sorted array; then you can use binary search. This algorithm has O(log n) worst-case performance.
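public static boolean p(int n){
    if(n<2) return false;   // 0, 1 and negative numbers are not prime
    if(n==2) return true;   // the only even prime
    if(n%2==0) return false;
    for(int i=3;i*i<=n;i+=2) {
        if(n%i==0)
            return false;
    }
    return true; }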
public static void main(String args[]){
String n="0";
int x = Integer.parseInt(n);
int z=x;
int a=0;
int i=1;
while(!p(x)){
a = i*(int)Math.pow(-1, i);
i++;
x+=a;
}
System.out.println( (int) Math.abs(x-z));}
This prints the distance from n to the nearest prime, for n>=2.
In python:
>>> def nearest_prime(n):
...     incr = -1
...     multiplier = -1
...     count = 1
...     while True:
...         if prime(n):
...             return n
...         else:
...             n = n + incr
...             multiplier = multiplier * -1
...             count = count + 1
...             incr = multiplier * count
...
>>> nearest_prime(3)
3
>>> nearest_prime(4)
3
>>> nearest_prime(5)
5
>>> nearest_prime(6)
5
>>> nearest_prime(7)
7
>>> nearest_prime(8)
7
>>> nearest_prime(9)
7
>>> nearest_prime(10)
11
<?php
$N1Diff = null;
$N2Diff = null;
$n1 = null;
$n2 = null;
$number = 16;
function isPrime($x) {
for ($i = 2; $i < $x; $i++) {
if ($x % $i == 0) {
return false;
}
}
return true;
}
for ($j = $number; ; $j--) {
if( isPrime($j) ){
$N1Diff = abs($number - $j);
$n1 = $j;
break;
}
}
for ($j = $number; ; $j++) {
if( isPrime($j) ){
$N2Diff = abs($number - $j);
$n2 = $j;
break;
}
}
if ($N1Diff <= $N2Diff) {
    echo $n1; // on a tie, prefer the smaller prime
} else {
    echo $n2;
}
If you want to write an algorithm, a Wikipedia search for prime number led me to another article on the Sieve of Eratosthenes. The algorithm looks fairly simple, and I'm thinking a recursive function would suit it well. (I could be wrong about that.)
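For reference, a small Python sketch of the Sieve of Eratosthenes. It uses a plain loop rather than recursion, which is the more common way to write it; the function name `sieve` and the limit of 541 (the 100th prime) are choices made for this example.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross off the multiples of n, starting at n*n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [k for k, flag in enumerate(is_prime) if flag]


primes = sieve(541)
print(len(primes), primes[-1])  # 100 541
```

The resulting list can then be searched for the prime nearest to a given number.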
If the array solution isn't a valid solution for you (it is the best one for your scenario), you can try the code below. After the "2 or 3" case, it will check every odd number away from the starting value until it finds a prime.
static int NearestPrime(double original)
{
int above = (int)Math.Ceiling(original);
int below = (int)Math.Floor(original);
if (above <= 2)
{
return 2;
}
if (below == 2)
{
return (original - 2 < 0.5) ? 2 : 3;
}
if (below % 2 == 0) below -= 1;
if (above % 2 == 0) above += 1;
double diffBelow = double.MaxValue, diffAbove = double.MaxValue;
for (; ; above += 2, below -= 2)
{
if (IsPrime(below))
{
diffBelow = original - below;
}
if (IsPrime(above))
{
diffAbove = above - original;
}
if (diffAbove != double.MaxValue || diffBelow != double.MaxValue)
{
break;
}
}
//edit to your liking for midpoint cases (4.0, 6.0, 9.0, etc)
return (int) (diffAbove < diffBelow ? above : below);
}
static bool IsPrime(int p) //intentionally incomplete due to checks in NearestPrime
{
for (int i = 3; i * i <= p; i += 2) // <= so that squares such as 9 and 25 are rejected
{
if (p % i == 0)
return false;
}
return true;
}
Lookup table with a size of 100 bytes (unsigned chars).
Round real number and use lookup table.
Maybe we can find the left and right nearest prime numbers, and then compare to get the nearest one. (I've assumed that the next prime number shows up within next 10 occurrences)
def leftnearestprimeno(n):
    n1 = n-1
    while(n1 >= 0):
        if isprime(n1):
            return n1
        else:
            n1 -= 1
    return -1
def rightnearestprimeno(n):
    n1 = n+1
    while(n1 < (n+10)):
        if isprime(n1):
            return n1
        else:
            n1 += 1
    return -1
n = int(input())
a = leftnearestprimeno(n)
b = rightnearestprimeno(n)
if (n - a) < (b - n):
    print("nearest: ", a)
elif (n - a) > (b - n):
    print("nearest: ", b)
else:
    print("nearest: ", a)  # in case the difference is equal, choose the min value
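Simplest answer:
Every prime number except 2 and 3 can be written in the form 6*x-1 or 6*x+1. Not every number of that form is prime, so each candidate still needs a primality check.
Let the number be N. Divide it by 6:
t=N/6;
Now the candidates around N are
a=6*t-1
b=6*t+1
Check which of them is prime and closer to N; if neither is prime, try the candidates for t-1 and t+1.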
Can anyone provide some pseudocode for a roulette selection function? How would I implement this? I don't really understand how to read this math notation. I want a general algorithm for this.
The other answers seem to be assuming that you are trying to implement a roulette game. I think that you are asking about roulette wheel selection in evolutionary algorithms.
Here is some Java code that implements roulette wheel selection.
Assume you have 10 items to choose from and you choose by generating a random number between 0 and 1. You divide the range 0 to 1 up into ten non-overlapping segments, each proportional to the fitness of one of the ten items. For example, this might look like this:
0 - 0.3 is item 1
0.3 - 0.4 is item 2
0.4 - 0.5 is item 3
0.5 - 0.57 is item 4
0.57 - 0.63 is item 5
0.63 - 0.68 is item 6
0.68 - 0.8 is item 7
0.8 - 0.85 is item 8
0.85 - 0.98 is item 9
0.98 - 1 is item 10
This is your roulette wheel. Your random number between 0 and 1 is your spin. If the random number is 0.46, then the chosen item is item 3. If it's 0.92, then it's item 9.
Here is a bit of python code:
import random
def roulette_select(population, fitnesses, num):
    """ Roulette selection, implemented according to:
        <http://stackoverflow.com/questions/177271/roulette
        -selection-in-genetic-algorithms/177278#177278>
    """
    total_fitness = float(sum(fitnesses))
    rel_fitness = [f/total_fitness for f in fitnesses]
    # Generate probability intervals for each individual
    probs = [sum(rel_fitness[:i+1]) for i in range(len(rel_fitness))]
    # Draw new population
    new_population = []
    for n in range(num):
        r = random.random()
        for (i, individual) in enumerate(population):
            if r <= probs[i]:
                new_population.append(individual)
                break
    return new_population
First, generate an array of the percentages you assigned, let's say p[1..n]
and assume the total is the sum of all the percentages.
Then get a random number between 1 and total, let's say r
Now, the algorithm in lua:
local c = 0
for i = 1,n do
c = c + p[i]
if r <= c then
return i
end
end
There are 2 steps to this: First create an array with all the values on the wheel. This can be a 2 dimensional array with colour as well as number, or you can choose to add 100 to red numbers.
Then simply generate a random number between 0 or 1 (depending on whether your language starts numbering array indexes from 0 or 1) and the last element in your array.
Most languages have built-in random number functions. In VB and VBScript the function is RND(). In Javascript it is Math.random()
Fetch the value from that position in the array and you have your random roulette number.
Final note: don't forget to seed your random number generator or you will get the same sequence of draws every time you run the program.
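Here is a really quick way to do it using stream selection in Java. It selects the indices of an array using the values as weights. No cumulative weights are needed: index i replaces the current selection with probability wts[i] / total, where total is the running sum so far, and this leaves every index selected with probability proportional to its weight.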
static int selectRandomWeighted(double[] wts, Random rnd) {
int selected = 0;
double total = wts[0];
for( int i = 1; i < wts.length; i++ ) {
total += wts[i];
if( rnd.nextDouble() <= (wts[i] / total)) selected = i;
}
return selected;
}
This could be further improved using Kahan summation or reading through the doubles as an iterable if the array was too big to initialize at once.
I wanted the same and so created this self-contained Roulette class. You give it a series of weights (in the form of a double array), and it will simply return an index from that array according to a weighted random pick.
I created a class because you can get a big speed up by only doing the cumulative additions once via the constructor. It's C# code, but enjoy the C like speed and simplicity!
class Roulette
{
double[] c;
double total;
Random random;
public Roulette(double[] n) {
random = new Random();
total = 0;
c = new double[n.Length+1];
c[0] = 0;
// Create cumulative values for later:
for (int i = 0; i < n.Length; i++) {
c[i+1] = c[i] + n[i];
total += n[i];
}
}
public int spin() {
double r = random.NextDouble() * total; // Create a random number between 0 and 1 and times by the total we calculated earlier.
//int j; for (j = 0; j < c.Length; j++) if (c[j] > r) break; return j-1; // Don't use this - it's slower than the binary search below.
//// Binary search for efficiency. Objective is to find index of the number just above r:
int a = 0;
int b = c.Length - 1;
while (b - a > 1) {
int mid = (a + b) / 2;
if (c[mid] > r) b = mid;
else a = mid;
}
return a;
}
}
The initial weights are up to you. Maybe it could be the fitness of each member, or a value inversely proportional to the member's position in the "top 50". E.g.: 1st place = 1.0 weighting, 2nd place = 0.5, 3rd place = 0.333, 4th place = 0.25 weighting etc. etc.
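To illustrate the same precompute-then-binary-search idea with rank-based weights, here is a short Python sketch using `itertools.accumulate` and `bisect`. The function names `make_wheel` and `spin` and the five-member ranking are hypothetical, chosen only for this example.

```python
import bisect
import itertools
import random


def make_wheel(weights):
    """Precompute the cumulative weights once, as the constructor above does."""
    return list(itertools.accumulate(weights))


def spin(cumulative, rng=random):
    """Return an index chosen with probability proportional to its weight."""
    r = rng.random() * cumulative[-1]
    # bisect_right finds the first cumulative value strictly greater than r;
    # the min() guards against r rounding up to the total.
    return min(bisect.bisect_right(cumulative, r), len(cumulative) - 1)


# Hypothetical ranking: 1st place weighs 1.0, 2nd 0.5, 3rd 0.333, and so on.
weights = [1.0 / rank for rank in range(1, 6)]
wheel = make_wheel(weights)
counts = [0] * len(weights)
for _ in range(10000):
    counts[spin(wheel)] += 1
print(counts)  # counts fall off roughly as 1, 1/2, 1/3, 1/4, 1/5
```

Zero weights are handled naturally: an entry with weight 0 adds nothing to the cumulative sum, so it can never be returned.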
Well, for an American Roulette wheel, you're going to need to generate a random integer between 1 and 38. There are 36 numbers, a 0, and a 00.
One of the big things to consider, though, is that in American roulette there are many different bets that can be made. A single bet can cover 1, 2, 3, 4, 5, 6, two different 12s, or 18. You may wish to create a list of lists where each number has additional flags to simplify that, or do it all in the programming.
If I were implementing it in Python, I would just create a Tuple of 0, 00, and 1 through 36 and use random.choice() for each spin.
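A minimal sketch of that idea follows. The pockets are stored as strings so that "0" and "00" stay distinct; that representation and the names `POCKETS` and `spin` are my choices.

```python
import random

# The 38 pockets of an American wheel: 0, 00, and 1 through 36.
POCKETS = ("0", "00") + tuple(str(n) for n in range(1, 37))


def spin(rng=random):
    """Return one pocket, each equally likely."""
    return rng.choice(POCKETS)


print(len(POCKETS))  # 38
print([spin() for _ in range(5)])
```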
This assumes some class "Classifier" which just has a String condition, String message, and double strength. Just follow the logic.
-- Paul
public static List<Classifier> rouletteSelection(int classifiers) {
List<Classifier> classifierList = new LinkedList<Classifier>();
double strengthSum = 0.0;
double probabilitySum = 0.0;
// add up the strengths of the map
Set<String> keySet = ClassifierMap.CLASSIFIER_MAP.keySet();
for (String key : keySet) {
/* used for debug to make sure wheel is working.
if (strengthSum == 0.0) {
ClassifierMap.CLASSIFIER_MAP.get(key).setStrength(8000.0);
}
*/
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double strength = classifier.getStrength();
strengthSum = strengthSum + strength;
}
System.out.println("strengthSum: " + strengthSum);
// compute the total probability. this will be 1.00 or close to it.
for (String key : keySet) {
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double probability = (classifier.getStrength() / strengthSum);
probabilitySum = probabilitySum + probability;
}
System.out.println("probabilitySum: " + probabilitySum);
while (classifierList.size() < classifiers) {
boolean winnerFound = false;
double rouletteRandom = random.nextDouble();
double rouletteSum = 0.0;
for (String key : keySet) {
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double probability = (classifier.getStrength() / strengthSum);
rouletteSum = rouletteSum + probability;
if (rouletteSum > rouletteRandom && (winnerFound == false)) {
System.out.println("Winner found: " + probability);
classifierList.add(classifier);
winnerFound = true;
}
}
}
return classifierList;
}
You can use a data structure like this:
Map<A, B> roulette_wheel_schema = new LinkedHashMap<A, B>()
where A is an integer that represents a pocket of the roulette wheel, and B is an index that identifies a chromosome in the population. The number of pockets is proportional to the fitness proportionate of each chromosome:
number of pockets = (fitness proportionate) · (scale factor)
Then we generate a random number between 0 and the size of the selection schema, and with this random number we get the index of the chromosome from the roulette.
We calculate the relative error between the fitness proportionate of each chromosome and the probability of being selected by the selection scheme.
The method getRouletteWheel returns the selection scheme based on previous data structure.
private Map<Integer, Integer> getRouletteWheel(
ArrayList<Chromosome_fitnessProportionate> chromosomes,
int precision) {
/*
* The number of pockets on the wheel
*
* number of pockets in roulette_wheel_schema = probability ·
* (10^precision)
*/
Map<Integer, Integer> roulette_wheel_schema = new LinkedHashMap<Integer, Integer>();
double fitness_proportionate = 0.0D;
double pockets = 0.0D;
int key_counter = -1;
double scale_factor = Math
.pow(new Double(10.0D), new Double(precision));
for (int index_cromosome = 0; index_cromosome < chromosomes.size(); index_cromosome++){
Chromosome_fitnessProportionate chromosome = chromosomes
.get(index_cromosome);
fitness_proportionate = chromosome.getFitness_proportionate();
fitness_proportionate *= scale_factor;
pockets = Math.rint(fitness_proportionate);
System.out.println("... " + index_cromosome + " : " + pockets);
for (int j = 0; j < pockets; j++) {
roulette_wheel_schema.put(Integer.valueOf(++key_counter),
Integer.valueOf(index_cromosome));
}
}
return roulette_wheel_schema;
}
I have worked out a Java code similar to that of Dan Dyer (referenced earlier). My roulette-wheel, however, selects a single element based on a probability vector (input) and returns the index of the selected element.
Having said that, the following code is more appropriate if the selection size is unitary and if you do not assume how the probabilities are calculated and zero probability value is allowed. The code is self-contained and includes a test with 20 wheel spins (to run).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;
/**
* Roulette-wheel Test version.
* Features a probability vector input with possibly null probability values.
* Appropriate for adaptive operator selection such as Probability Matching
* or Adaptive Pursuit, (Dynamic) Multi-armed Bandit.
* @version October 2015.
* @author Hakim Mitiche
*/
public class RouletteWheel {
/**
* Selects an element probabilistically.
* @param wheelProbabilities elements probability vector.
* @param rng random generator object
* @return selected element index
* @throws java.lang.Exception
*/
public int select(List<Double> wheelProbabilities, Random rng)
throws Exception{
double[] cumulativeProba = new double[wheelProbabilities.size()];
cumulativeProba[0] = wheelProbabilities.get(0);
for (int i = 1; i < wheelProbabilities.size(); i++)
{
double proba = wheelProbabilities.get(i);
cumulativeProba[i] = cumulativeProba[i - 1] + proba;
}
int last = wheelProbabilities.size()-1;
// Compare with a small tolerance: floating-point sums are rarely exactly 1.0
if (Math.abs(cumulativeProba[last] - 1.0) > 1e-9)
{
throw new Exception("The probabilities do not sum up to one ("
+ "sum="+cumulativeProba[last]+")");
}
double r = rng.nextDouble();
int selected = Arrays.binarySearch(cumulativeProba, r);
if (selected < 0)
{
/* Convert negative insertion point to array index.
to find the correct cumulative proba range index.
*/
selected = Math.abs(selected + 1);
}
/* skip indexes of elements with Zero probability,
go backward to matching index*/
int i = selected;
while (wheelProbabilities.get(i) == 0.0){
System.out.println("Index "+i+" has zero probability, moving back");
i--;
if (i<0) i=last;
}
selected = i;
return selected;
}
public static void main(String[] args){
RouletteWheel rw = new RouletteWheel();
int rept = 20;
List<Double> P = new ArrayList<>(4);
P.add(0.2);
P.add(0.1);
P.add(0.6);
P.add(0.1);
Random rng = new Random();
for (int i = 0 ; i < rept; i++){
try {
int s = rw.select(P, rng);
System.out.println("Element selected "+s+ ", P(s)="+P.get(s));
} catch (Exception ex) {
Logger.getLogger(RouletteWheel.class.getName()).log(Level.SEVERE, null, ex);
}
}
P.clear();
P.add(0.2);
P.add(0.0);
P.add(0.5);
P.add(0.0);
P.add(0.1);
P.add(0.2);
//rng = new Random();
for (int i = 0 ; i < rept; i++){
try {
int s = rw.select(P, rng);
System.out.println("Element selected "+s+ ", P(s)="+P.get(s));
} catch (Exception ex) {
Logger.getLogger(RouletteWheel.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
/**
* {@inheritDoc}
* @return
*/
@Override
public String toString()
{
return "Roulette Wheel Selection";
}
}
Below is a sample run for the probability vector P=[0.2,0.1,0.6,0.1],
with wheel elements [0,1,2,3]:
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 1, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 0, P(s)=0.2
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
The code also tests a wheel in which some elements have zero probability.
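Twenty spins are too few to judge whether the wheel respects the probabilities. As a rough sketch (not part of the code above), one could count selections over many spins and compare the observed frequencies with the input vector. The class and method names here are made up for illustration, and the selection uses a simple linear scan over the weights instead of the binary search used above; it assumes at least one weight is positive.

```java
import java.util.Random;

public class WheelFrequencyCheck {

    /**
     * Picks an index with probability proportional to weights[i].
     * Zero weights are skipped, so they can never be selected.
     * Assumes at least one weight is positive.
     */
    static int spin(double[] weights, Random rng) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double r = rng.nextDouble() * total;
        int lastPositive = -1;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0.0) {
                continue;
            }
            lastPositive = i;
            r -= weights[i];
            if (r < 0.0) {
                return i;
            }
        }
        // Only reached through rounding error: fall back to the last usable index.
        return lastPositive;
    }

    /** Spins the wheel repeatedly and counts how often each index is selected. */
    static int[] countSpins(double[] weights, int spins, long seed) {
        Random rng = new Random(seed);
        int[] counts = new int[weights.length];
        for (int s = 0; s < spins; s++) {
            counts[spin(weights, rng)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] p = {0.2, 0.0, 0.5, 0.0, 0.1, 0.2};
        int spins = 100_000;
        int[] counts = countSpins(p, spins, 42L);
        for (int i = 0; i < p.length; i++) {
            System.out.printf("%d: observed %.3f, expected %.1f%n",
                    i, counts[i] / (double) spins, p[i]);
        }
    }
}
```

With enough spins the observed frequencies should settle close to the input probabilities, and the zero-probability indexes should never appear.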
Anyone using the built-in random number generator of a programming language should be aware that the numbers it generates are not truly random. They are pseudo-random, so they should be used with caution.
Random Number Generator pseudo code
add one to a sequential counter
get the current value of the sequential counter
add the computer tick count, or some other small-interval timer value, to the counter value
optionally add additional numbers, such as a reading from an external piece of hardware like a plasma generator or some other source of somewhat random phenomena
divide the result by a very big prime number
359334085968622831041960188598043661065388726959079837 for example
get some digits from the far right of the decimal point of the result
use these digits as a random number
Use the random number digits to create random numbers between 1 and 38 (or 37 for a European wheel) for roulette.
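For the last step only, here is a minimal Java sketch of mapping a generator's output to a pocket number. It does not implement the hand-rolled generator described above; it substitutes the standard library's java.security.SecureRandom, which is my own choice for the example, and the class and method names are hypothetical. Random.nextInt(bound) returns a uniformly distributed value in [0, bound), so adding 1 gives a pocket from 1 to the number of pockets.

```java
import java.security.SecureRandom;
import java.util.Random;

public class RoulettePocket {

    /** Returns a pocket number between 1 and pockets, inclusive. */
    static int spin(Random rng, int pockets) {
        if (pockets <= 0) {
            throw new IllegalArgumentException("pockets must be positive");
        }
        // nextInt(pockets) is uniform over 0..pockets-1; shift it to 1..pockets.
        return rng.nextInt(pockets) + 1;
    }

    public static void main(String[] args) {
        // SecureRandom extends Random and is seeded from an OS entropy source.
        Random rng = new SecureRandom();
        for (int i = 0; i < 5; i++) {
            System.out.println("American wheel (38 pockets): " + spin(rng, 38));
        }
        for (int i = 0; i < 5; i++) {
            System.out.println("European wheel (37 pockets): " + spin(rng, 37));
        }
    }
}
```

Passing the generator in as a parameter keeps the mapping separate from the source of randomness, so a seeded java.util.Random can be swapped in when reproducible results are wanted.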