Inserting keys into a hash table of length 9 - data-structures

So, I'm working on a problem that requires me to insert keys in order into a hash table. I stopped inserting after 20 since there is no more room. I have provided the following picture to help with context. I created the hash table and found the number of collisions and the load factor. Collisions are resolved by open addressing. Sorry this isn't a question; I just need someone to look over it and tell me if it's all correct.

There are a number of errors and misunderstandings in your question.
You state that you 'stopped inserting after 20' but you show 15 keys.
There are 9 buckets in your hash table but then you state that the load factor is 1. Load factor is the number of keys (15 or 20) divided by the number of buckets (9) so it is not 1.
In a hash function h(k,i) k is the key and i is the number of buckets. In your case i is 9 and so the function (k mod 9 + 5i) mod 9 really makes no sense.
All hash functions should end with mod i.
There are not 15 collisions in the keys you provided. A collision only occurs when there's a previous value in the table.
This is all explained in the Wikipedia article on hash tables.
With the clarifications in the comments below this answer in mind, I used the following code to verify your conclusions:
import java.util.Arrays;
import java.util.stream.Stream;
public class Hashing {
    private static final int SIZE = 9;
    // 0 marks an empty slot, so all keys must be positive
    private final int[] keys = new int[SIZE];
    private int collisions = 0;
    public void add(int key) {
        int attempt = 0;
        // keep probing until an empty slot is found
        while (keys[hash(key, attempt)] > 0)
            attempt++;
        collisions += attempt;
        keys[hash(key, attempt)] = key;
    }
    private int hash(int key, int attempt) {
        return (key % SIZE + 5 * attempt) % SIZE;
    }
    public static void main(String[] args) {
        Hashing table = new Hashing();
        Stream.of(28, 5, 15, 19, 10, 17, 33, 12, 20).forEach(table::add);
        System.out.println("Table " + Arrays.toString(table.keys));
        System.out.println("Collisions " + table.collisions);
    }
}
And received the following output:
Table [20, 28, 19, 33, 12, 5, 15, 10, 17]
Collisions 15
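To see where the 15 comes from key by key, here is a minimal Python sketch of the same probing scheme. It assumes the probe function (k mod 9 + 5i) mod 9 with i as the attempt number, as in the Java code above; the names `probe` and `insert_all` are made up for this illustration.

```python
SIZE = 9  # number of buckets, as in the question

def probe(key, attempt):
    """Slot tried on the given attempt (0 = first try)."""
    return (key % SIZE + 5 * attempt) % SIZE

def insert_all(keys):
    """Insert keys in order; return the table and the failed probes per key."""
    table = [None] * SIZE          # None marks an empty slot
    failed = {}
    for key in keys:
        attempt = 0
        while table[probe(key, attempt)] is not None:
            attempt += 1
            if attempt >= SIZE:    # every slot has been tried: table is full
                raise OverflowError("hash table is full")
        table[probe(key, attempt)] = key
        failed[key] = attempt
    return table, failed

keys = [28, 5, 15, 19, 10, 17, 33, 12, 20]
table, failed = insert_all(keys)
print(table)                  # [20, 28, 19, 33, 12, 5, 15, 10, 17]
print(failed)                 # failed probes per key, e.g. 5 for key 20
print(sum(failed.values()))   # 15
print(len(keys) / SIZE)       # load factor: 9 keys in 9 buckets = 1.0
```

Counting one collision per failed probe reproduces the total of 15, and nine keys in nine buckets give a load factor of 1.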

Related

Am I following a good approach?

Please find the link to the discussed problem.
Sorting | Amazon Interview Question
I took the following approach, using sample arrays. I would like your suggestions on what other approaches could be used to solve the problem:
public class ThreeMachineInsertionSorting {
public static void main(String[] args) {
int[] ar1 = { 10, 9, 8, 7, 6, 5, 4, 3, 2, 1,-10};
int[] ar2 = { 20, 19, 18, 17, 16, 15, 14, 23, 12, 11, 10 };
int[] ar3 = { 20, 19, 21, 122, 10, 9, 11, 12, 4, 13, 18, 17 };
performInsertionSort(ar1, ar2, ar3);
}
private static void performInsertionSort(int[] ar1, int[] ar2, int[] ar3) {
int[][] arrayOfArrays = { ar1, ar2, ar3 };
int [] unSortedArray= mergeArrays(ar1,ar2,ar3,arrayOfArrays);
int i,j,key;
for(i=1;i<unSortedArray.length;i++){
key= unSortedArray[i];
j=i-1;
while(j>=0 && key<unSortedArray[j]){
unSortedArray[j+1] = unSortedArray[j];
j--;
}
unSortedArray[j+1] = key;
}
System.out.println("Size of the unSorted array is :=" + unSortedArray.length);
shareLoad(unSortedArray, arrayOfArrays);
}
private static void shareLoad(int[] sortedArray, int[][] arrayOfArrays) {
int loadFactor = sortedArray.length/3;
int index=0;
while(index<sortedArray.length){
for(int [] ar : arrayOfArrays){
for(int i=0; i<ar.length;i++){
if(i<=loadFactor){
ar[i] = sortedArray[index];
index++;
}
}
}
}
for(int [] ar : arrayOfArrays){
System.out.println("******************************************array properties**************************************");
System.out.println("Size of array::" + ar.length);
for(int i=0; i<ar.length;i++){
System.out.print(ar[i]+" ");
}
System.out.println("\n******************************************************************");
}
}
private static int[] mergeArrays(int[] ar1, int[] ar2, int[] ar3,int[][] arrayOfArrays ) {
int[] colaboratedArray = new int[ar1.length + ar2.length + ar3.length];
System.out.println("Length of multi-dimensional array :-"
+ arrayOfArrays.length);
int i = 0;
while (i < colaboratedArray.length) {
for (int[] ar : arrayOfArrays) {
for (int j = 0; j < ar.length; j++) {
colaboratedArray[i++] = ar[j];
}
}
}
return colaboratedArray;
}
}
Here is a bit of a naive solution.
Sort M1, M2 and M3 using in-place sorting algorithm.
Combine M1 and M3 by finding out the point where all elements from M1 can be swapped out for smaller elements in M3.
To find this point take 1/9 of the numbers in M3 and move to the buffer in M1. Notice that 1/9 of the numbers is exactly the 10% capacity available. We start with the smallest bucket in M3 and compare it to the largest bucket in M1. Just by comparing the largest value from the M3 bucket and the smallest value from the M1 bucket we can determine if we can swap them out completely or not. If we can then move the bucket from M1 to M3 and bring in new bucket from M3. Do this until we have M1 and M3 partitioned.
Now we must sort M1 and M3 again. Now we must take the smallest elements from M2 and move them to M1 until we have M1 and M2 partitioned. After that is done we can sort M1 and M2. Notice that M1 is now finished.
Finish by partitioning M2 and M3 and then sort both and the problem is solved.
Notice that this solution requires us to sort all machines three times, and each data element is in worst case moved 3 times between machines.
This solution can be improved if we can find the median of two unsorted sets: we can then partition the sets according to this median, transfer the elements, and sort only once at the end, when we know that the final numbers are on each machine.
It can be further improved if we can find the Kth largest element of 3 unsorted sets efficiently without transferring data.

Computing the floor log of a binary number

Given a number in binary in an n-bit system, the floor log of the number is defined as the index of its MSB. By scanning all the bits one by one I can determine the index of the MSB, but that takes O(n) time. Is there a faster way to do it?
Using c# as an example, for a byte, you can pre-compute a table and then just do a lookup
internal static readonly byte[] msbPos256 = new byte[256];
static ByteExtensions() {
msbPos256[0] = 8; // special value for when there are no set bits
msbPos256[1] = 0;
for (int i = 2; i < 256; i++) msbPos256[i] = (byte)(1 + msbPos256[i / 2]);
}
/// <summary>
/// Returns the integer logarithm base 2 (Floor(Log2(value))) of the specified value.
/// </summary>
/// <remarks>Example: Log2(10) returns 3.</remarks>
/// <param name="value">The number whose base 2 log is desired.</param>
/// <returns>The base 2 log of the value when it is greater than 0, or 0 when the value
/// equals 0.</returns>
public static byte Log2(this byte value) {
return msbPos256[value | 1];
}
For an unsigned 32-bit int, the following will work:
private static byte[] DeBruijnLSBsSet = new byte[] {
0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};
public static uint Log2(this uint value) {
value |= value >> 1;
value |= value >> 2;
value |= value >> 4;
value |= value >> 8;
return DeBruijnLSBsSet[unchecked((value | value >> 16) * 0x07c4acddu) >> 27];
}
This website is the go-to place for bit twiddling tricks
http://graphics.stanford.edu/~seander/bithacks.html
It has these, and a number of other techniques for achieving what you are asking for in your question.
There are a number of general tricks that utilize small lookup tables, as @hatchet says.
There is a notable alternative, however. If you want the fastest implementation and are using a low-level language, then a find-first-set or count-leading-zeros instruction is built into almost all ISAs and is supported by almost all compilers. See https://en.wikipedia.org/wiki/Find_first_set and use compiler intrinsics or inline assembly as appropriate.
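As a rough illustration in Python (not the C# or intrinsic versions discussed above), the MSB index can be found in about log2(n) steps by halving the search range, and Python's built-in `int.bit_length()` gives the same answer directly. The function names here are made up for this sketch, and the first one assumes a value that fits in 32 bits.

```python
def floor_log2_32(v):
    """MSB index of a 32-bit positive integer using 5 shift-and-test steps."""
    if not 0 < v < 2 ** 32:
        raise ValueError("expected a positive 32-bit value")
    result = 0
    for shift in (16, 8, 4, 2, 1):
        if v >> shift:        # is any bit set in the upper part?
            v >>= shift       # if so, continue the search there
            result += shift
    return result

def floor_log2(v):
    """Same result for any positive int, via the built-in bit_length()."""
    if v <= 0:
        raise ValueError("expected a positive value")
    return v.bit_length() - 1

print(floor_log2_32(10))  # 3
print(floor_log2(10))     # 3
```

The loop is a binary search over bit positions, so it needs log2(32) = 5 tests instead of up to 32.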

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of the form 2^i * 5^j in increasing order?
For example:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing this data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works!, but contains no further comments, sorry):
#include <math.h>
#include <map>
#include <iostream>
typedef long long Integer; // at least 64 bits on every conforming compiler
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;
Integer result(const MyPair &myPair)
{
return pow((double)2,(double)myPair.first) * pow((double)5,(double)myPair.second);
}
int main()
{
MyMap myMap;
MyPair firstValue(0,0);
myMap[result(firstValue)] = firstValue;
while (true)
{
auto it=myMap.begin();
if (it->first < 0) break; // overflow
MyPair myPair = it->second;
std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
myMap.erase(it);
MyPair pair1 = myPair;
++pair1.first;
myMap[result(pair1)] = pair1;
MyPair pair2 = myPair;
++pair2.second;
myMap[result(pair2)] = pair2;
}
}
This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current, next)=
member this.current = current
member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
This works well, with Results then being a stream type with a current value and a next method.
Walking through it:
I define min for completeness.
I define a stream type to have a current value and a method to return a new stream, essentially the head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then increments that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define Squares to be a stream increasing in powers of 2.
AllPowers takes the start value and merges the stream of that value multiplied by successive powers of 2 with the stream that results from multiplying it by 5, since these are your only two options. You are effectively left with a tree of results.
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursion, compiler optimisations, etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
if (s.current > 0) then
do System.Console.WriteLine(s.current)
PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
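As one example of another language, the same lazy-stream idea can be sketched with Python generators. This is only an illustration of the approach described above, not a translation of the F# code; the names `powers_of_two`, `merge` and `all_powers` are made up here, and both inputs to `merge` are assumed to be infinite and increasing.

```python
from itertools import islice

def powers_of_two(start):
    """start, 2*start, 4*start, ... forever."""
    while True:
        yield start
        start *= 2

def merge(a, b):
    """Merge two infinite increasing generators into one increasing generator."""
    x, y = next(a), next(b)
    while True:
        if x < y:
            yield x
            x = next(a)
        else:
            yield y
            y = next(b)

def all_powers(start):
    """start, then the merge of start*2^i (i >= 1) with all_powers(start*5)."""
    yield start
    yield from merge(powers_of_two(start * 2), all_powers(start * 5))

print(list(islice(all_powers(1), 12)))
# [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```

Each nested `all_powers` generator is only created when its first value is needed, so the tree of streams grows one level per power of 5 reached.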
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in range(100):
    mult_2 = numbers[next_2] * 2
    mult_5 = numbers[next_5] * 5
    if mult_2 < mult_5:
        candidate = mult_2
        next_2 += 1
    else:
        candidate = mult_5
        next_5 += 1
    # The comparison here is to avoid appending duplicates
    if candidate > numbers[-1]:
        numbers.append(candidate)
print(numbers)
So we have two loops, one incrementing i and second one incrementing j starting both from zero, right? (multiply symbol is confusing in the title of the question)
You can do something very straightforward:
Add all items in an array
Sort the array
Or do you need another solution with more mathematical analysis?
EDIT: A smarter solution that exploits the similarity with the merge sort problem
If we imagine the infinite sets of numbers 2^i and 5^j as two independent streams/lists, this problem looks very much like the well-known merge sort problem.
So the solution steps are:
Get two numbers, one from each of the streams (of 2 and of 5)
Compare
Return the smallest
Get the next number from the stream that the previously returned smallest came from
and that's it! ;)
PS: The complexity of merge sort is always O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable cells are. Imagine an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
print(1); // the first cell is already counted in mu
while (/* some terminate condition or go on forever */) {
    minNext = 0;
    nextCell = -1;
    // look through all addable cells
    for (int i=0; i<mu.length; ++i) {
        if (i==0 or mu[i-1]>mu[i]) {
            // the next unvisited cell in row i has exponent mu[i];
            // check for new minimum value
            if (minNext == 0 or 2^i * 5^mu[i] < minNext) {
                nextCell = i;
                minNext = 2^i * 5^mu[i];
            }
        }
    }
    // print the next entry in increasing order and update mu
    print(minNext);
    mu[nextCell]++;
}
I wrote this in Maple stopping after 12 iterations:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and also output the order in which the cells were added:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
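The partition idea can be tried out with a short Python sketch. It is an illustration only: it follows the matrix layout shown above (row r holds 5^r times the powers of 2), lets the list `mu` grow instead of padding it with zeros, and the function name is made up.

```python
def by_partition(count):
    """First `count` numbers of the form 2^i * 5^j, tracked with a partition mu."""
    mu = [1]      # mu[r] = number of visited cells in row r; cell (0, 0) = 1 is visited
    out = [1]
    while len(out) < count:
        best_row, best_val = None, None
        # rows 0..len(mu)-1 plus one brand new row are the candidates
        for r in range(len(mu) + 1):
            visited = mu[r] if r < len(mu) else 0
            # a cell is addable if it is in row 0 or the row above is longer
            if r == 0 or mu[r - 1] > visited:
                val = 5 ** r * 2 ** visited
                if best_val is None or val < best_val:
                    best_row, best_val = r, val
        if best_row == len(mu):
            mu.append(1)          # first cell of a new row
        else:
            mu[best_row] += 1
        out.append(best_val)
    return out

print(by_partition(12))
# [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```

Only one candidate per row is compared at each step, which is the "number of addable cells" mentioned above.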
First of all, (as others mentioned already) this question is very vague!!!
Nevertheless, I am going to give a shot based on your vague equation and the pattern as your expected result. So I am not sure the following will be true for what you are trying to do, however it may give you some idea about java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
private static List<Integer> findIncreasingNumbers(int maxIteration) {
SortedSet<Integer> numbers = new TreeSet<Integer>();
SortedSet<Integer> numbers2 = new TreeSet<Integer>();
for (int i=0;i < maxIteration;i++) {
int n1 = (int)Math.pow(2, i);
numbers.add(n1);
for (int j=0;j < maxIteration;j++) {
int n2 = (int)Math.pow(5, i);
numbers.add(n2);
for (Integer n: numbers) {
int n3 = n*n1;
numbers2.add(n3);
}
}
}
numbers.addAll(numbers2);
return new ArrayList<Integer>(numbers);
}
/**
* Based on the following fuzzy question @ StackOverflow
* http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
*
*
* Result:
* 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
*/
public static void main(String[] args) {
List<Integer> numbers = findIncreasingNumbers(5);
for (Integer i: numbers) {
System.out.print(i + " ");
}
}
}
If you can do it in O(nlogn), here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean min-heap
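A minimal Python sketch of these steps, using the standard `heapq` module. One detail is an addition to the steps above: a number such as 10 can be reached both as 2*5 and as 5*2, so the sketch keeps a `seen` set to avoid printing it twice. The function name is made up.

```python
import heapq
from itertools import islice

def two_five_products():
    """Yield numbers of the form 2^i * 5^j in increasing order, forever."""
    heap = [1]
    seen = {1}                      # values already pushed, to skip duplicates
    while True:
        num = heapq.heappop(heap)   # smallest value not yet printed
        yield num
        for nxt in (num * 2, num * 5):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, nxt)

print(list(islice(two_five_products(), 12)))
# [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```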
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
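The claim that ordering by i + j.(log(5)/log(2)) gives the same order as ordering by 2^i.5^j can be checked numerically with a few lines of Python. This is only a sanity check on small exponents (where floating-point rounding is not a concern), and the helper names are made up.

```python
import math
from itertools import product

C = math.log2(5)   # log(5)/log(2), about 2.32

def order_by_log(pairs):
    """Sort (i, j) pairs by the series B: i + j*C."""
    return sorted(pairs, key=lambda p: p[0] + p[1] * C)

def order_by_value(pairs):
    """Sort (i, j) pairs by the series A: 2^i * 5^j."""
    return sorted(pairs, key=lambda p: 2 ** p[0] * 5 ** p[1])

pairs = list(product(range(8), range(4)))   # small exponents only
print(order_by_log(pairs) == order_by_value(pairs))   # True
print(order_by_log(pairs)[:6])
# [(0, 0), (1, 0), (2, 0), (0, 1), (3, 0), (1, 1)]
```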
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will possibly refer to as transforms of i,j or just transforms) gives us some clues of where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3, increasing j by 1 while decreasing i by 2 will give an increase of just 0.3. The problem then is at each stage finding the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thoughts I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to screen and that the syntax/structure isn't valid for a real C# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this but I strongly suspect it's better than some other solutions, because at each stage all I need to do is check my set of transforms. Working out how many of these there are compared to "n" is non-trivial. It is clearly related, since the further you go the more transforms there are, but the number of new transforms becomes vanishingly small at higher numbers, so maybe it's just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product, so I can work out what the sequence would be without needing to calculate the actual numbers themselves.
For what it's worth, after the first 230 numbers (when int runs out of space) I had 9 transforms to check each time. And given it's only my total that overflowed, I ran it for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone has got the answer by now, but I just wanted to point to this solution.
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
#include <stdio.h>
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used. Starting from zero, determine whether each positive integer has values for i and j. This was facilitated by turning answer = (2^i)*(5^j) around and solving for i instead. That gave me i = log2 (answer / (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^i) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}

How do you generate a random number in between 1 and 7 by using a function which generates 1 to 5 [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Expand a random range from 1-5 to 1-7
I understood the solution using rejection sampling i.e
public static int rand7() {
while (true) {
int num = 5 * (rand5() - 1) + (rand5() - 1);
if (num < 21) return (num % 7 + 1);
}
}
but I am thinking of another solution, i.e rand5() is called 7 times and result is divided by 5, but I am not sure whether this is correct. Please let me know if is or isn't.
public static int rand7() {
int num = rand5()+rand5()+rand5()+rand5()+rand5()+rand5()+rand5();
return num/5;
}
EDIT: It looks like the probability of generating 1 is (1/5)^7 but to generate 2 it is 7*(1/5)^7. It is uneven so it is not going to work.
It will not be a uniform distribution (looks normal). And as Paul says, the proof follows from the Central Limit Theorem.
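The unevenness can be checked exactly by enumerating all 5^7 equally likely outcomes of seven rand5() calls. The following Python sketch does that for the sum-divided-by-5 idea from the question; the function name is made up.

```python
from collections import Counter
from itertools import product

def sum_div5_distribution():
    """Count how often each result of (sum of 7 rolls) // 5 occurs."""
    counts = Counter()
    for rolls in product(range(1, 6), repeat=7):   # all 5**7 outcomes
        counts[sum(rolls) // 5] += 1
    return counts

counts = sum_div5_distribution()
for result in sorted(counts):
    print(result, counts[result], counts[result] / 5 ** 7)
```

The result 7 occurs only when all seven calls return 5 (1 outcome out of 78125), and the result 1 occurs for 36 outcomes, while the middle values are far more common.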
No, it's not correct (at least not if the requirement is a uniform distribution, which is provided through rejection sampling).
If you had used two fewer + rand5() terms, you would compute an approximation of the average of the rand5 function. As it stands now, you basically compute an approximation of 7/5 × the average of the rand5 function (which should be about 4.2).
If you wanted a number between 1 and say 25, you could do
int[][] lut = { { 1, 2, 3, 4, 5},
{ 6, 7, 8, 9, 10},
...,
{ 21, 22, 23, 24, 25 } };
return lut[rand5() - 1][rand5() - 1];
This can not be done for 7 though, since 5 and 7 are co-prime. Rejection sampling is the best way to solve this.
It is not possible to create a uniform rand7() function on the basis of a uniform rand5() function with a fixed, bounded number of calls. However, it is possible to get as close as needed.
For example, if you are allowed to call rand5() 10 times in rand7(), you can get a fairly good quasi-uniform rand7(). If you are allowed to call it 100 times, then rand7() can be almost perfect, but never exactly uniform. This can be proven mathematically: n calls produce 5^n equally likely outcomes, and 5^n is never divisible by 7.
A simple solution is to use rand5() to generate the three bits of a 3-bit number: a roll of 1 or 2 gives a 0 bit, a roll of 4 or 5 gives a 1 bit, and a roll of 3 is rolled again. If the final result is zero, then redo. Here's some code:
public static int rand7() {
int returnValue = 0;
while (returnValue == 0) {
for (int i = 1; i <= 3; i++) {
returnValue = (returnValue << 1) + rand5_output_2();
}
}
return returnValue;
}
private static int rand5_output_2() {
while (true) {
int flip = rand5();
if (flip < 3) {
return 0;
}
else if (flip > 3) {
return 1;
}
}
}
While that other solution will generate a random number in the range 1..7, it will do so with a non-uniform probability distribution, i.e. the number 3 will be a lot more likely than the number 1.
In contrast, the rejection sampling approach will return all numbers in the range 1..7 with equal probability.
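The equal-probability claim for rejection sampling can be verified by enumerating the 25 equally likely pairs of rand5() results from a single pass of the loop in the question. This Python sketch mirrors that loop's arithmetic; the function name is made up.

```python
from collections import Counter
from itertools import product

def rand7_outcomes():
    """Tally what one pass of the rejection loop does for each of the 25 pairs."""
    counts = Counter()
    rejected = 0
    for a, b in product(range(1, 6), repeat=2):
        num = 5 * (a - 1) + (b - 1)    # uniform over 0..24
        if num < 21:
            counts[num % 7 + 1] += 1   # accepted: mapped to 1..7
        else:
            rejected += 1              # 21..24: the loop tries again
    return counts, rejected

counts, rejected = rand7_outcomes()
print(dict(counts))   # every value 1..7 is produced by exactly 3 pairs
print(rejected)       # 4
```

Since each accepted value corresponds to the same number of pairs and rejected passes simply repeat the experiment, the final result is uniform over 1..7.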

How to search for closest value in a lookup table?

I have a simple one-dimensional array of integer values that represent a physical set of part values I have to work with. I then calculate an ideal value mathematically.
How could I write an efficient search algorithm that will find the value in the array with the smallest absolute difference from my ideal value?
The array is predetermined and constant, so it can be sorted however I need.
Example
Lookup array:
100, 152, 256, 282, 300
Searching for an ideal value of 125 would find 100 in the array, whereas 127 would find 152.
The actual lookup array will be about 250 items long and never change.
Once the array is sorted, use binary search.
This is very similar to a binary search, except that if it does not find the exact key, it returns the index of a neighbouring key: the largest key smaller than the one provided (or index 0).
The logic is to search until the exact key is found, or until high and low are adjacent and there is nothing left to search between them.
Consider an array n[] = {1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
If you search for the key 2, then using the algorithm below:
Step 1: high=11, low=0, med=5
Step 2: high=5, low=0, med=2
Step 3: high=2, low=0, med=1 In this step the exact key is found. So it returns 1.
If you search for the key 3 (which is not present in the array), then using the algorithm below:
Step 1: high=11, low=0, med=5
Step 2: high=5, low=0, med=2
Step 3: high=2, low=0, med=1
Step 4: high=1, low=0, At this step high=low+1, i.e. there are no more elements to search. So it returns med=0.
Hope this helps...
public static <T> int binarySearch(List<T> list, T key, Comparator<T> compare) {
int low, high, med, c;
T temp;
high = list.size();
low = 0;
med = (high + low) / 2;
while (high != low+1) {
temp = list.get(med);
c = compare.compare(temp, key);
if (c == 0) {
return med;
} else if (c < 0){
low = med;
}else{
high = med;
}
med = (high + low) / 2;
}
return med;
}
/** ------------------------ Example -------------------- **/
public static void main(String[] args) {
List<Integer> nos = new ArrayList<Integer>();
nos.addAll(Arrays.asList(new Integer[]{1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20}));
search(nos, 2); // Output Search:2 Key:1 Value:2
search(nos, 3); // Output Search:3 Key:1 Value:2
search(nos, 10); // Output Search:10 Key:5 Value:10
search(nos, 11); // Output Search:11 Key:5 Value:10
}
public static void search(List<Integer> nos, int search){
int key = binarySearch(nos, search, new IntComparator());
System.out.println("Search:"+search+"\tKey:"+key+"\tValue:"+nos.get(key));
}
public static class IntComparator implements Comparator<Integer>{
@Override
public int compare(Integer o1, Integer o2) {
return o1.compareTo(o2);
}
}
The binary search algorithm from Wikipedia is as below:
int binary_search(int A[], int key, int imin, int imax)
{
// continue searching while [imin,imax] is not empty
while (imax >= imin)
{
// calculate the midpoint for roughly equal partition
int imid = midpoint(imin, imax);
if(A[imid] == key)
// key found at index imid
return imid;
// determine which subarray to search
else if (A[imid] < key)
// change min index to search upper subarray
imin = imid + 1;
else
// change max index to search lower subarray
imax = imid - 1;
}
// key was not found
return KEY_NOT_FOUND;
}
The end condition in case a key is not found is that imax < imin.
In fact, this condition can locate the nearest match. The nearest match will lie between imax and imin (taking into account either might be outside the array bounds). Note again that imax < imin in the end case. Some solutions use abs to find the difference, but we know that A[imax] < key < A[imin] so:
if imax < 0 return 0
if imin > A.count - 1 return A.count - 1
if (key - A[imax]) < (A[imin] - key) return imax
return imin
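A minimal Python sketch of this idea, assuming a non-empty sorted list. It follows the Wikipedia-style loop above and then picks between the two neighbours; the function name `closest_index` is made up, and ties go to the higher index, as in the pseudocode.

```python
def closest_index(a, key):
    """Index of the element of sorted list `a` that is nearest to `key`."""
    imin, imax = 0, len(a) - 1
    while imax >= imin:
        imid = (imin + imax) // 2
        if a[imid] == key:
            return imid
        if a[imid] < key:
            imin = imid + 1
        else:
            imax = imid - 1
    # Here imax == imin - 1, so a[imax] < key < a[imin] wherever both exist.
    if imax < 0:
        return 0                   # key is below the smallest element
    if imin > len(a) - 1:
        return len(a) - 1          # key is above the largest element
    if key - a[imax] < a[imin] - key:
        return imax
    return imin

table = [100, 152, 256, 282, 300]
print(table[closest_index(table, 125)])   # 100
print(table[closest_index(table, 127)])   # 152
```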
Python, brute force on an unsorted list (because it's fun writing Python), O(n):
table = (100, 152, 256, 282, 300)
value = 125
lookup_dict = dict([(abs(value-x),x) for x in table])
closest_val = lookup_dict[min(lookup_dict.keys())]
And a proper implementation that uses binary search to find the value, O(log n):
import bisect
def find_closest(sorted_values, value):
    '''Returns the closest entry in the sorted list 'sorted_values' to 'value'.'''
    if value <= sorted_values[0]:
        return sorted_values[0]
    if value >= sorted_values[-1]:
        return sorted_values[-1]
    insertpos = bisect.bisect(sorted_values, value)
    # compare the neighbours on either side of the insertion point
    if abs(sorted_values[insertpos-1] - value) <= abs(sorted_values[insertpos] - value):
        return sorted_values[insertpos-1]
    else:
        return sorted_values[insertpos]
Java has an Arrays.binarySearch function.
Given an array of [10, 20, 30] you would get these results:
Search for    Result
10             0
20             1
30             2
7             -1
9             -1
11            -2
19            -2
21            -3
29            -3
43            -4
Sample code:
import java.util.Arrays;
public class Solution {
public static void main(String[] args) {
int[] array = new int[]{10, 20, 30};
int[] keys = new int[]{10, 20, 30, 7, 9, 11, 19, 21, 29, 43};
for (int key: keys) {
System.out.println(Arrays.binarySearch(array, key));
}
}
}
Sample output:
0
1
2
-1
-1
-2
-2
-3
-3
-4
Basically, the negative numbers provide two crucial pieces of information. A negative result denotes that an exact match was not found, but we can still get a "close enough" match. The negative value is -(insertion point) - 1, so it indicates where the match would be: -2 means array[0] < key < array[1] and -3 means array[1] < key < array[2].
-1 means it is smaller than the minimum value in the array.
Example based on sample data on the initial question:
import java.util.Arrays;
public class Solution {
public static void main(String[] args) {
int[] array = new int[]{100, 152, 256, 282, 300};
int[] keys = new int[]{125, 127, 282, 4, 900, 126};
for (int key : keys) {
int index = Arrays.binarySearch(array, key);
if (index >= 0) {
System.out.println("Found " + key);
} else {
if (index == -1) {
//smaller than smallest in the array
System.out.println("Closest to " + key + " is " + array[0]);
} else if (-index > array.length) {
//larger than the largest in the array
System.out.println("Closest to " + key + " is " + array[array.length - 1]);
} else {
//in between
int before = array[0 - index - 2];
int after = array[0 - index - 1];
if (key - before < after - key) {
System.out.println("Closest to " + key + " is " + before);
} else if (key - before > after - key) {
System.out.println("Closest to " + key + " is " + after);
} else {
System.out.println("Same distance from " + key + " to " + before + " and " + after);
}
}
}
}
}
}
And the output:
Closest to 125 is 100
Closest to 127 is 152
Found 282
Closest to 4 is 100
Closest to 900 is 300
Same distance from 126 to 100 and 152
Just going through the array and computing abs(reference-array_value[i]) would take O(N).
Keep track of the index with the smallest difference as you go.

Resources