Programming Pearls: Column 9.3 Binary Search - range initialization - algorithm

In Section 9.3 Jon Bentley presents a modified binary search.
Here is a brief snippet of the typical implementation, followed by the better approach shown in 9.3:
if (arr[mid] < key) low = mid + 1;
else if (arr[mid] > key) high = mid - 1;
else return mid;
The modified, more efficient comparison uses a different invariant:
if (arr[mid] < key) low = mid;
else high = mid;
Outside the loop there is a check whether the key is at index 'high'. In the modified binary search the left index 'low' starts at -1 (instead of 0) and the 'high' index starts at n (instead of n-1), and the loop runs
while (low + 1 != high)
This modified search seems to work even if I set low = 0 and high = n-1.
But I would rather not second-guess Jon Bentley and his code. So why is he setting low to -1 and high to n? Is there any corner case where only this will work?

If you have an array that is empty (n == 0), then the check while (low + 1 != high) only terminates immediately if low begins at -1 and high at n, which is 0:
while((-1 + 1) != 0) // false, the loop body is never entered
If low began at 0 instead, or high began at n - 1 (which is -1 here), or both, then the loop would clearly perform at least one iteration:
while((0 + 1) != 0) // true
while((-1 + 1) != -1) // true
while((0 + 1) != -1) // true
That one iteration on an empty array will likely access an out-of-bounds index, which invokes undefined behavior.
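For reference, here is a minimal C sketch of the whole modified search described above (the function name and the final bounds check are my own framing, not Bentley's exact code):
int binary_search_modified(const int *arr, int n, int key)
{
    /* low starts at -1 and high at n, so low + 1 != high is already
       false when n == 0 and the loop body is never entered. */
    int low = -1;
    int high = n;
    while (low + 1 != high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] < key)
            low = mid;   /* invariant: arr[low] < key   */
        else
            high = mid;  /* invariant: arr[high] >= key */
    }
    /* the single check outside the loop */
    if (high >= n || arr[high] != key)
        return -1;
    return high;
}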

Related

How to generate number between a low and a high number using just a bit?

I was asked this question in an interview, so I don't want the solution, just guidance on how to approach it.
You are given two numbers, low and high, and a random generator that generates 0 or 1. I have to generate a number between low and high using that generator.
I can get the difference between the two numbers and somehow try to generate a number using bit manipulation, but I am not able to figure out how to do that.
You can do:
range = high - low
find n such that 2^(n-1) < range <= 2^n
run the random generator n times to generate an int thanks to its binary representation. Something like 010011010 (= 154 in decimal)
add the obtained number to low to get your final number!
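Here is a minimal C sketch of those steps; random_bit() is a stand-in for the given 0/1 generator, and the retry when the generated value exceeds the range is my own addition (the steps above leave it implicit) so that the result stays in bounds and uniform:
#include <stdlib.h>

/* placeholder for the given generator that returns 0 or 1 */
static int random_bit(void) { return rand() & 1; }

/* returns a uniformly distributed integer in [low, high]; assumes low <= high */
int random_between(int low, int high)
{
    unsigned range = (unsigned)(high - low);
    /* n = number of bits needed to represent range */
    unsigned r = range;
    int n = 0;
    while (r > 0) { r >>= 1; ++n; }
    for (;;) {
        unsigned value = 0;
        for (int i = 0; i < n; ++i)   /* build the number bit by bit */
            value = (value << 1) | (unsigned)random_bit();
        if (value <= range)           /* retry if the value falls outside the range */
            return low + (int)value;
    }
}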
Here's a basic bit-by-bit comparison algorithm that gives a random number between low and high, using a random-bit function:
Decrease high by 1 and increase low by 1 (in case the random bits introduced later all equal those in high or low).
Create booleans high_dec and low_inc to store whether at least one 1 in high has been changed into 0, and at least one 0 in low has been changed into 1, and set both of them to false (these will help avoid the result going out of range).
Compare high and low bit-by-bit from MSB to LSB with these cases:
If you find high:1 and low:1 then store a 1 if low_inc=false or store a random bit otherwise (and update high_dec as necessary).
If you find high:1 and low:0 then store a random bit (and update high_dec or low_inc as necessary).
If you find high:0 and low:1 then store a 0 if high_dec=false or store a 1 if low_inc=false or store a random bit otherwise.
If you find high:0 and low:0 then store a 0 if high_dec=false or store a random bit otherwise (and update low_inc as necessary).
Note that the distribution of the random numbers is only uniform if the lowest possible result is a power of 2, and the range is a power of 2. In all cases the whole range is used, but there may be an emphasis on values near the beginning or end of the range.
function between(a, b) {
  var lo = (a + 1).toString(2).split(''), // conversion to bit array because
      hi = (b - 1).toString(2).split(''), // there is no bit manipulation in JS
      lc = false,                         // low changed
      hc = false,                         // high changed
      result = [];
  while (lo.length < hi.length) lo.unshift(0);   // add leading zeros to low
  for (var i = 0; i < hi.length; i++) {          // iterate over bits, msb to lsb
    var bit = Math.round(Math.random());         // random bit generator
    if (hi[i] == 1) {
      if (lo[i] == 1) {                          // case hi:1 lo:1
        if (lc == false) bit = 1
        else if (bit == 0) hc = true;
      } else {                                   // case hi:1 lo:0
        if (bit == 0) hc = true
        else lc = true;
      }
    } else {
      if (lo[i] == 1) {                          // case hi:0 lo:1
        if (hc == false) bit = 0
        else if (lc == false) bit = 1;
      } else {                                   // case hi:0 lo:0
        if (hc == false) bit = 0
        else if (bit == 1) lc = true;
      }
    }
    result.push(bit);
  }
  return parseInt(result.join(''), 2);           // convert bit array to integer
}
document.write(between(999999, 1000100) + "<BR>");

Change the inequality in binary search code

Shouldn't this:
if(a[mid] < t)return BS(mid+1,high);
else return BS(low,mid);
be the same as this:
if(a[mid] > t)return BS(low,mid-1);
else return BS(mid,high);
But the second one doesn't work. Why?
Edit: By "doesn't work" I mean that the code doesn't reach the base case.
In calculating mid as (low+high)/2 it uses integer division.
In brief, by example:
Let low = 3, high = 4, and a[3] <= t.
Calling BS(low, high) gives
mid = (3+4)/2 = 3 // integer division
Since a[mid] is not greater than t, we return BS(mid, high), which is the same call as BS(low, high): an infinite loop.
The solution is to make the integer division work in your favour, so the code should be:
if(a[mid] >= t)return BS(low,mid);
else return BS(mid+1,high);
I think this will solve your issue.
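For completeness, here is a minimal C sketch of the corrected recursive search built around that comparison; the base cases are my own addition (the question only shows the recursive step), and the array and key are passed explicitly instead of being globals:
/* returns the index of the first element >= t in a[low..high], or -1 if none */
int BS(const int a[], int low, int high, int t)
{
    if (low > high)                 /* empty range */
        return -1;
    if (low == high)
        return (a[low] >= t) ? low : -1;
    int mid = (low + high) / 2;     /* integer division rounds down, so mid < high */
    if (a[mid] >= t)
        return BS(a, low, mid, t);      /* keep mid: it may be the answer     */
    else
        return BS(a, mid + 1, high, t); /* a[mid] < t, so mid can be discarded */
}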

Why is my Scala tail-recursion faster than the while loop?

Here are two solutions to exercise 4.9 in Cay Horstmann's Scala for the Impatient: "Write a function lteqgt(values: Array[Int], v: Int) that returns a triple containing the counts of values less than v, equal to v, and greater than v." One uses tail recursion, the other uses a while loop. I thought that both would compile to similar bytecode but the while loop is slower than the tail recursion by a factor of almost 2. This suggests to me that my while method is badly written.
import scala.annotation.tailrec
import scala.util.Random
object PerformanceTest {

  def main(args: Array[String]): Unit = {
    val bigArray: Array[Int] = fillArray(new Array[Int](100000000))
    println(time(lteqgt(bigArray, 25)))
    println(time(lteqgt2(bigArray, 25)))
  }

  def time[T](block: => T): T = {
    val start = System.nanoTime: Double
    val result = block
    val end = System.nanoTime: Double
    println("Time = " + (end - start) / 1000000.0 + " millis")
    result
  }

  @tailrec def fillArray(a: Array[Int], pos: Int = 0): Array[Int] = {
    if (pos == a.length)
      a
    else {
      a(pos) = Random.nextInt(50)
      fillArray(a, pos + 1)
    }
  }

  @tailrec def lteqgt(values: Array[Int], v: Int, lt: Int = 0, eq: Int = 0, gt: Int = 0, pos: Int = 0): (Int, Int, Int) = {
    if (pos == values.length)
      (lt, eq, gt)
    else
      lteqgt(values, v, lt + (if (values(pos) < v) 1 else 0), eq + (if (values(pos) == v) 1 else 0), gt + (if (values(pos) > v) 1 else 0), pos + 1)
  }

  def lteqgt2(values: Array[Int], v: Int): (Int, Int, Int) = {
    var lt = 0
    var eq = 0
    var gt = 0
    var pos = 0
    val limit = values.length
    while (pos < limit) {
      if (values(pos) > v)
        gt += 1
      else if (values(pos) < v)
        lt += 1
      else
        eq += 1
      pos += 1
    }
    (lt, eq, gt)
  }
}
Adjust the size of bigArray according to your heap size. Here is some sample output:
Time = 245.110899 millis
(50004367,2003090,47992543)
Time = 465.836894 millis
(50004367,2003090,47992543)
Why is the while method so much slower than the tailrec? Naively the tailrec version looks to be at a slight disadvantage, as it must always perform 3 "if" checks for every iteration, whereas the while version will often only perform 1 or 2 tests due to the else construct. (NB reversing the order I perform the two methods does not affect the outcome).
Test results (after reducing array size to 20000000)
Under Java 1.6.22 I get 151 and 122 ms for tail-recursion and while-loop respectively.
Under Java 1.7.0 I get 55 and 101 ms
So under Java 6 your while-loop is actually faster; both have improved in performance under Java 7, but the tail-recursive version has overtaken the loop.
Explanation
The performance difference is due to the fact that in your loop, you conditionally add 1 to the totals, while for recursion you always add either 1 or 0. So they are not equivalent. The equivalent while-loop to your recursive method is:
def lteqgt2(values: Array[Int], v: Int): (Int, Int, Int) = {
  var lt = 0
  var eq = 0
  var gt = 0
  var pos = 0
  val limit = values.length
  while (pos < limit) {
    gt += (if (values(pos) > v) 1 else 0)
    lt += (if (values(pos) < v) 1 else 0)
    eq += (if (values(pos) == v) 1 else 0)
    pos += 1
  }
  (lt, eq, gt)
}
and this gives exactly the same execution time as the recursive method (regardless of Java version).
Discussion
I'm not an expert on why the Java 7 VM (HotSpot) can optimize this better than your first version, but I'd guess it's because it's taking the same path through the code each time (rather than branching along the if / else if paths), so the bytecode can be inlined more efficiently.
But remember that this is not the case in Java 6. Why one while-loop outperforms the other is a question of JVM internals. Happily for the Scala programmer, the version produced from idiomatic tail-recursion is the faster one in the latest version of the JVM.
The difference could also be occurring at the processor level. See this question, which explains how code slows down if it contains unpredictable branching.
The two constructs are not identical. In particular, in the first case you don't need any jumps (on x86, you can use cmp, setle, and add, instead of having to use cmp and jb and, if you don't jump, add). Not jumping is faster than jumping on pretty much every modern architecture.
So, if you have code that looks like
if (a < b) x += 1
where you may add or you may jump instead, vs.
x += (a < b)
(which only makes sense in C/C++ where 1 = true and 0 = false), the latter tends to be faster as it can be turned into more compact assembly code. In Scala/Java, you can't do this, but you can do
x += if (a < b) 1 else 0
which a smart JVM should recognize is the same as x += (a < b), which has a jump-free machine code translation, which is usually faster than jumping. An even smarter JVM would recognize that
if (a < b) x += 1
is the same yet again (because adding zero doesn't do anything).
C/C++ compilers routinely perform optimizations like this. Being unable to apply any of these optimizations was not a mark in the JIT compiler's favor; apparently it can as of 1.7, but only partially (i.e. it doesn't recognize that adding zero is the same as conditionally adding one, but it does at least convert x += if (a<b) 1 else 0 into fast machine code).
Now, none of this has anything to do with tail recursion or while loops per se. With tail recursion it's more natural to write the if (a < b) 1 else 0 form, but you can do either; and with while loops you can also do either. It just so happened that you picked one form for tail recursion and the other for the while loop, making it look like recursion vs. looping was the change instead of the two different ways to do the conditionals.
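As a small illustration (my own C example, not taken from the answer), the two loops below compute the same count, but a compiler can translate the second one into straight-line cmp/setcc/add code with no branch in the loop body:
/* branchy form: the add happens only when the test succeeds */
int count_less_branchy(const int *values, int n, int v)
{
    int lt = 0;
    for (int i = 0; i < n; ++i)
        if (values[i] < v)
            lt += 1;
    return lt;
}

/* branchless form: the comparison itself yields 0 or 1 and is always added */
int count_less_branchless(const int *values, int n, int v)
{
    int lt = 0;
    for (int i = 0; i < n; ++i)
        lt += (values[i] < v);
    return lt;
}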

Scala PriorityQueue on Array[Int] performance issue with complex comparison function (caching is needed)

The problem involves the performance of Scala's PriorityQueue[Array[Int]] on large data sets. The following operations are needed: enqueue, dequeue, and filter. Currently, my implementation is as follows:
For every element of type Array[Int], there is a complex evaluation function: (I'm not sure how to write it in a more efficient way, because it excludes the position 0)
def eval_fun(a: Array[Int]) =
  if (a.size < 2) 3
  else {
    var ret = 0
    var i = 1
    while (i < a.size) {
      if ((a(i) & 0x3) == 1) ret += 1
      else if ((a(i) & 0x3) == 3) ret += 3
      i += 1
    }
    ret / a.size
  }
The ordering is based on the evaluation function, with the comparison reversed for descending order:
val arr_ord = new Ordering[Array[Int]] {
  def compare(a: Array[Int], b: Array[Int]) = eval_fun(b) compare eval_fun(a)
}
The PriorityQueue is defined as:
val pq: scala.collection.mutable.PriorityQueue[Array[Int]] = PriorityQueue()
Question:
Is there a more elegant and efficient way to write such an evaluation function? I'm thinking of using fold, but fold cannot exclude position 0.
Is there a data structure that gives a priority queue with unique elements? Applying a filter operation after each enqueue is not efficient.
Is there a caching method to reduce the evaluation computation? When adding a new element to the queue, every element may need to be evaluated by eval_fun again, which is unnecessary if the evaluated value of each element can be cached. Also, I should mention that two distinct elements may have the same evaluated value.
Is there a more efficient data structure that avoids the generic type? If the number of elements reaches 10,000 and the size of each element reaches 1,000, the performance is terribly slow.
Thank you.
(1) If you want maximum performance here, I would stick to the while loop, even if it is not terribly elegant. Otherwise, if you use a view on Array, you can easily drop the first element before going into the fold:
a.view.drop(1).foldLeft(0)((sum, a) => sum + ((a & 0x03) match {
  case 0x01 => 1
  case 0x03 => 3
  case _    => 0
})) / a.size
(2) You can maintain two structures, the priority queue and a set. Both combined give you a sorted set... So you could use collection.immutable.SortedSet, but there is no mutable variant in the standard library. Do you want equality based on the priority function, or on the actual array contents? Because in the latter case, you won't get around comparing arrays element by element for each insertion, undoing the effect of caching the priority function value.
(3) Just put the calculated priority along with the array in the queue. I.e.
implicit val ord = Ordering.by[(Int, Array[Int]), Int](_._1)
val pq = new collection.mutable.PriorityQueue[(Int, Array[Int])]
pq += eval_fun(a) -> a
Well, you can use a tail-recursive loop (generally these are considered more "idiomatic"):
def eval(a: Array[Int]): Int =
  if (a.size < 2) 3
  else {
    @annotation.tailrec
    def loop(ret: Int = 0, i: Int = 1): Int =
      if (i >= a.size) ret / a.size
      else {
        val mod3 = (a(i) & 0x3)
        if (mod3 == 1) loop(ret + 1, i + 1)
        else if (mod3 == 3) loop(ret + 3, i + 1)
        else loop(ret, i + 1)
      }
    loop()
  }
Then you can use that to initialise a cached priority value:
case class PriorityArray(a: Array[Int]) {
  lazy val priority = if (a.size < 2) 3 else {
    @annotation.tailrec
    def loop(ret: Int = 0, i: Int = 1): Int =
      if (i >= a.size) ret / a.size
      else {
        val mod3 = (a(i) & 0x3)
        if (mod3 == 2) loop(ret, i + 1)
        else loop(ret + mod3, i + 1)
      }
    loop()
  }
}
You may note also that I removed a redundant & op and have only the single conditional (for when it equals 2, rather than two checks for 1 && 3) – these should have some minimal effect.
There is not a huge difference from 0__'s proposal, which just came through.
My answers:
If evaluation is critical, keep it as it is. You might get better performance with recursion (not sure why, but it happens), but you'll certainly get worse performance with pretty much any other approach.
No, there isn't, but you can come pretty close to it just by modifying the dequeue operation:
def distinctDequeue[T](q: PriorityQueue[T]): T = {
  val result = q.dequeue()
  // guard against an empty queue before peeking at the head
  while (q.nonEmpty && q.head == result) q.dequeue()
  result
}
Otherwise, you'd have to keep a second data structure just to keep track of whether an element has been added or not. Either way, that equals sign is pretty heavy, but I have a suggestion to make it faster in the next item.
Note, however, that this requires that ties on the cost function get resolved in some other way.
Like 0__ suggested, put the cost on the priority queue. But you can also keep a cache on the function if that would be helpful. I'd try something like this:
import scala.collection.mutable.WrappedArray

val evalMap = scala.collection.mutable.HashMap[WrappedArray[Int], Int]()

def eval_fun(a: Array[Int]) =
  if (a.size < 2) 3
  else evalMap.getOrElseUpdate(a, {
    var ret = 0
    var i = 1
    while (i < a.size) {
      if ((a(i) & 0x3) == 1) ret += 1
      else if ((a(i) & 0x3) == 3) ret += 3
      i += 1
    }
    ret / a.size
  })

import scala.math.Ordering.Implicits._
val pq = new collection.mutable.PriorityQueue[(Int, WrappedArray[Int])]
pq += eval_fun(a) -> (a: WrappedArray[Int])
Note that I did not create a special Ordering -- I'm using the standard Ordering so that the WrappedArray will break the ties. There's little cost to wrap the Array, and you get it back with .array, but, on the other hand, you'll get the following:
Ties will be broken by comparing the array themselves. If there aren't many ties in the cost, this should be good enough. If there are, add something else to the tuple to help break ties without comparing the arrays.
That means all equal elements will be kept together, which will enable you to dequeue all of them at the same time, giving the impression of having kept only one.
And that equals will actually work, because WrappedArrays compare like Scala sequences do.
I don't understand what you mean by that fourth point.

Determining valid adjacent cells of a square stored as an array

I have an array (of 9 elements, say) which I must treat as a (3 by 3) square.
For the sake of simplifying the question, this is a one-based array (ie, indexing starts at 1 instead of 0).
My goal is to determine valid adjacent squares relative to a starting point.
In other words, how it's stored in memory: 1 2 3 4 5 6 7 8 9
How I'm treating it:
7 8 9
4 5 6
1 2 3
I already know how to move up and down and test for going out of bounds (1 <= current_index <= 9).
edit: I know the above test is overly general but it's simple and works.
//row_size = 3, row_step is -1, 0 or 1 depending on whether we're moving to the
//previous row, staying put, or moving to the next row respectively.
current_index += (row_size * row_step);
How do I test for an out of bounds condition when going left or right? Conceptually I know it involves determining if 3 (for example) is on the same row as 4 (or if 10 is even within the same square as 9, as an alternate example, given that multiple squares are in the same array back to back), but I can't figure out how to determine that. I imagine there's a modulo in there somewhere, but where?
Thanks very much,
Geoff
Addendum:
Here's the resulting code, altered for use with a zero-based array (I cleaned up the offset code present in the project) which walks adjacent squares.
bool IsSameSquare(int index0, int index1, int square_size) {
    //Assert for square_size != 0 here
    return (!((index0 < 0) || (index1 < 0))
            && ((index0 < square_size) && (index1 < square_size)))
           && (index0 / square_size == index1 / square_size);
}

bool IsSameRow(int index0, int index1, int row_size) {
    //Assert for row_size != 0 here
    return IsSameSquare(index0, index1, row_size * row_size)
           && (index0 / row_size == index1 / row_size);
}

bool IsSameColumn(int index0, int index1, int row_size) {
    //Assert for row_size != 0 here
    return IsSameSquare(index0, index1, row_size * row_size)
           && (index0 % row_size == index1 % row_size);
}
//for all possible adjacent positions
for (int row_step = -1; row_step < 2; ++row_step) {
    //move up, down or stay put.
    int row_adjusted_position = original_position + (row_size * row_step);
    if (!IsSameSquare(original_position, row_adjusted_position, square_size)) {
        continue;
    }
    for (int column_step = -1; column_step < 2; ++column_step) {
        if ((row_step == 0) && (column_step == 0)) { continue; }
        //hold on to the position that has had its row position adjusted.
        int new_position = row_adjusted_position;
        if (column_step != 0) {
            //move left or right
            int column_adjusted_position = new_position + column_step;
            //if we've gone out of bounds again for the column.
            if (IsSameRow(column_adjusted_position, new_position, row_size)) {
                new_position = column_adjusted_position;
            } else {
                continue;
            }
        } //if (column_step != 0)
        //if we get here we know it's safe, do something with new_position
        //...
    } //for each column_step
} //for each row_step
This is easier if you use 0-based indexing. These rules work if you subtract 1 from all your indexes:
Two indexes are in the same square if (a/9) == (b/9) and a >= 0 and b >= 0.
Two indexes are in the same row if they are in the same square and (a/3) == (b/3).
Two indexes are in the same column if they are in the same square and (a%3) == (b%3).
There are several ways to do this; I'm choosing a weird one just for fun: use modulus.
As your rows are size 3, just use modulo 3 and two simple rules:
If currPos mod 3 = 0 and (currPos+move) mod 3 = 1 then invalid
If currPos mod 3 = 1 and (currPos+move) mod 3 = 0 then invalid
This checks whether you are jumping to a new row; you could also do it with one rule like this:
if (currPos mod 3) - ((currPos+move) mod 3) > 1 then invalid
Cheers
You should be using a multidimensional array for this.
If your array class doesn't support multidimensional stuff, you should write up a quick wrapper that does.
