Python Quicksort Debugging - algorithm

I have implemented this quicksort, but I seem to have a bug that I cannot fix. Would someone mind taking a quick look at it?
The output for the example I give is close to the answer but some indices are misplaced.
def partition(array, pivot, start, end):
    # move pivot to the end
    temp = array[pivot]
    array[pivot] = array[end]
    array[end] = temp
    i = start
    j = end - 1
    while(i < j):
        # check from left for element bigger than pivot
        while(i < j and array[end] > array[i]):
            i = i + 1
        # check from right for element smaller than pivot
        while(i < j and array[end] < array[j]):
            j = j - 1
        # if we find a pair of misplaced elements swap them
        if(i < j):
            temp = array[i]
            array[i] = array[j]
            array[j] = temp
    # move pivot element to its position
    temp = array[i]
    array[i] = array[end]
    array[end] = temp
    # return pivot position
    return i
def quicksort_helper(array, start, end):
    if(start < end):
        pivot = (start + end) / 2
        r = partition(array, pivot, start, end)
        quicksort_helper(array, start, r - 1)
        quicksort_helper(array, r + 1, end)
def quicksort(array):
    quicksort_helper(array, 0, len(array) - 1)
array = [6, 0, 5, 1, 3, 4, -1, 10, 2, 7, 8, 9]
quicksort(array)
print array
I have a feeling the answer will be obvious but I can not find it.
Desired output:
[-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Actual output:
[-1, 0, 2, 3, 1, 4, 5, 6, 7, 8, 9, 10]

The critical repair is in the inner while loops, where you march i and j toward each other. If all you're worried about is swapping the correct non-pivot elements, the logic you posted is fine. However, that first loop needs to be
while(i <= j and array[end] > array[i]):
    i = i + 1
to ensure that i has the correct value for swapping the pivot element into the middle. Otherwise, you can swap it one element to the left of its proper position, which is why your sort fails.
You can also use Python's multiple assignment for a cleaner swap:
while(i < j):
    # check from left for element bigger than pivot
    while(i <= j and array[end] > array[i]):
        i = i + 1
    # check from right for element smaller than pivot
    while(i < j and array[end] < array[j]):
        j = j - 1
    # if we find a pair of misplaced elements swap them
    if(i < j):
        array[i], array[j] = array[j], array[i]
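When repairing a two-pointer partition like this, it can help to cross-check against a simpler scheme, especially on tiny inputs such as [2, 1], where the outer loop is never entered. The following is a minimal sketch and not the code from the question: it swaps in the Lomuto partition scheme (one left-to-right scan), keeps the middle-element pivot choice, and assumes Python 3 (`//` and `print()`). The names `boundary` and `pivot_value` are illustrative.

```python
def partition(array, start, end):
    """Lomuto partition: uses array[end] as the pivot and returns its final index."""
    pivot_value = array[end]
    boundary = start  # array[start:boundary] holds the elements smaller than the pivot
    for k in range(start, end):
        if array[k] < pivot_value:
            array[boundary], array[k] = array[k], array[boundary]
            boundary += 1
    # place the pivot between the smaller and the not-smaller elements
    array[boundary], array[end] = array[end], array[boundary]
    return boundary

def quicksort(array, start=0, end=None):
    """Sort array[start..end] in place."""
    if end is None:
        end = len(array) - 1
    if start < end:
        middle = (start + end) // 2
        # move the middle element to the end so that it becomes the pivot
        array[middle], array[end] = array[end], array[middle]
        r = partition(array, start, end)
        quicksort(array, start, r - 1)
        quicksort(array, r + 1, end)

data = [6, 0, 5, 1, 3, 4, -1, 10, 2, 7, 8, 9]
quicksort(data)
print(data)
```

Comparing the output of both versions on short lists (lengths 0 to 3) is usually enough to expose an off-by-one in the pivot placement.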

Related

Decompose string to form a valid expression

I am given a string S (of digits) and a number N. I want to insert an arbitrary number of '+' signs into S so that the sum becomes equal to N.
Ex:
S = 15112 and N = 28
Ans is : 15+11+2
S = 120012 and N = 33
Ans is : 1+20+012
S = 123 and N = 123
Ans is : 123
given : |S| <= 120 and N <= 10^6
It is guaranteed that S and N are given such that it is always possible to form a valid expression. Is there any algorithm which can solve this? I tried to think about it but couldn't come up with a solution.
There may be more efficient ways to do this, but since you have nothing so far…
You can simply find all combinations of a boolean array that indicates whether a plus should exist between the numbers or not.
For example: with an input of 112134, 1 + 12 + 13 + 4 can be represented with the boolean array [true, false, true, false, true], indicating that there is a plus after the 1st, 3rd, and 5th digits. The problem then reduces to finding which combinations add up to your number. There are a lot of ways to find combinations. Recursive backtracking is a classic.
In javascript/node this might look like this:
function splitOnIndexes(arr, a) {
// split the array into numbers based on the booleans
let current = "" + arr[0]
let output = []
for (let i = 0; i < a.length; i++) {
if (!a[i]) {
current += arr[i + 1]
} else {
output.push(current)
current = "" + arr[i + 1]
}
}
output.push(current)
return output
}
function findSum(input, total) {
function backtrack(n, k = 0, a = []) {
const sum = (arr) => arr.reduce((a, c) => a + parseInt(c), 0)
if (k === n) {
let ans = splitOnIndexes(input, a)
if (sum(ans) === total) {
console.log(ans.join(' + '))
}
} else {
k = k + 1
let c = [true, false]
for (let i = 0; i < 2; i++) {
a[k - 1] = c[i]
backtrack(n, k, a)
}
}
}
backtrack(input.length - 1)
}
findSum('15112', 28)
findSum('120012', 33)
findSum('123', 123)
As you can see, more than one answer is possible. Your first example is solved with both 15+1+12 and 15+11+2. If you only need one, you can of course stop early.
The idea is to use dynamic programming: you only care about sums between 0 and 10^6, and there are only 120 possible indexes. If dp[i][j] = x, it means that we went from index x of the string to index i (so we added a + before i) and reached a sum of j. This leads to an O(|S| * N) solution:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
string s;
long n;
long dp[123][1000001];
void solve (int index, long sum) {//index = what index of s still remains to scan. sum = the sum we have accumulated till now
if (sum >= n or index >= s.length()) return;
if (dp[index][sum] != -1) return;
if (index == n and sum == n) return;
long num = 0;
for (int i = 0; i < 7 && index + i < s.length(); i++) { //N has 6 digits at most
num = stoi(s.substr(index, i + 1));
solve(index + i + 1, sum + num);
if (sum + num <= n) {
dp[index + i + 1][sum + num] = index;
}
}
}
int main () {
cin >> s;
cin >> n;
for (int i = 0; i < 121; i++) {
for (int j = 0; j < 1000001; j++) {
dp[i][j] = -1;
}
}
solve(0, 0);
int sum = n;
int idx = s.length();
vector<string> nums;
//reconstruct solution
while (idx != 0) {
nums.push_back(s.substr(dp[idx][sum], idx - dp[idx][sum]));
idx = dp[idx][sum];
sum -= stoi(nums[nums.size() - 1]);
}
for (int i = nums.size() -1; i >= 0; i--) {
cout << nums[i];
if (i != 0) cout << "+";
}
}
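The same (index, sum) state space can be sketched more compactly in Python. This is a rough illustration under stated assumptions, not a port of the C++ above: it stores only the reachable sums per index in dictionaries, accepts parts with leading zeros (as in 1+20+012), and the names `decompose` and `parent` are made up for this example.

```python
def decompose(s, n):
    """Return one way to split the digit string s into parts summing to n, or None."""
    # parent[i] maps each sum reachable after consuming s[:i]
    # to the index at which the last part started.
    parent = [dict() for _ in range(len(s) + 1)]
    parent[0][0] = None
    for start in range(len(s)):
        sums_here = list(parent[start])
        if not sums_here:
            continue  # no split of s[:start] stays within n
        value = 0
        for stop in range(start + 1, len(s) + 1):
            value = value * 10 + int(s[stop - 1])
            if value > n:
                break  # a longer part can only be larger
            for total in sums_here:
                new_total = total + value
                if new_total <= n and new_total not in parent[stop]:
                    parent[stop][new_total] = start
    if n not in parent[len(s)]:
        return None
    # walk the parent pointers backwards to rebuild the parts
    parts = []
    stop, total = len(s), n
    while stop > 0:
        start = parent[stop][total]
        parts.append(s[start:stop])
        total -= int(s[start:stop])
        stop = start
    return "+".join(reversed(parts))

print(decompose("15112", 28))
print(decompose("120012", 33))
print(decompose("123", 123))
```

When several decompositions exist, this returns whichever one the parent pointers happen to record, so the first call may give either 15+1+12 or 15+11+2.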
This is a Ruby version with step by step explanation of the algorithm, so you can easily code in C++ (or I'll try later).
# Let's consider that we extracted the values from text, so we already have the string of int and the result as integer:
string_of_int = "15112"
result = 28
# The basic idea is to find a map (array) that tells how to group digits, for example
sum_map = [2, 1, 2]
# This means that string_of_int is mapped into the following numbers
# 15, 1, 12
# then sum the numbers, in this case 15+1+12 = 28
# To find the solution we need to map
# all the possible combinations of addition given the n digits of the string_of_int then check if the sum is equal to the result
# We call k the number of digits of string_of_int
# in ruby we can build an array called sum_maps
# containing all the possible permutations like this:
k = string_of_int.length # => 5
sum_maps = []
k.times do |length|
(1..k).to_a.repeated_permutation(length).each {|e| sum_maps << e if e.inject(:+) == k}
end
sum_maps
# => [[1, 5], [2, 4], [3, 3], [4, 2], [5, 1], [1, 1, 4], [1, 2, 3], [1, 3, 2], [1, 4, 1], [2, 1, 3], [2, 2, 2], [2, 3, 1], [3, 1, 2], [3, 2, 1], [4, 1, 1]]
# Now we must check which of the sum_maps gives us the required result.
#
# First, to keep the code short and DRY,
# better to define a couple of useful methods for the String class to use then:
class String
def group_digits_by(sum_map)
string_of_int_splitted = self.split("")
grouped_digits = []
sum_map.each { |n| grouped_digits << string_of_int_splitted.shift(n).join.to_i}
grouped_digits.reject { |element| element == 0 }
end
def sum_grouped_of_digits_by(sum_map)
group_digits_by(sum_map).inject(:+)
end
end
# So we can call the methods directly on the string
# for example, in ruby:
string_of_int.group_digits_by sum_map #=> [15, 1, 12]
string_of_int.sum_grouped_of_digits_by sum_map #=> 28
# Now that we have these methods, we just iterate through the sum_maps array
# and collect the grouped digits for each sum_map if the sum of grouped digits is equal to the result
# coded in ruby it is:
combinations = []
sum_maps.each { |sum_map| combinations << string_of_int.group_digits_by(sum_map) if string_of_int.sum_grouped_of_digits_by(sum_map) == result }
p combinations.uniq
# => [[15, 1, 12], [15, 11, 2]]
In short, written as a Ruby module it becomes:
module GuessAddition
class ::String
def group_digits_by(sum_map)
string_of_int_splitted = self.split("")
grouped_digits = []
sum_map.each { |n| grouped_digits << string_of_int_splitted.shift(n).join.to_i}
grouped_digits.reject { |element| element == 0 }
end
def sum_grouped_of_digits_by(sum_map)
group_digits_by(sum_map).inject(:+)
end
end
def self.guess_this(string_of_int, result)
k = string_of_int.length
sum_maps = []
k.times { |length| (1..k).to_a.repeated_permutation(length).each {|e| sum_maps << e if e.inject(:+) == k} }
combinations = []
sum_maps.each { |sum_map| combinations << string_of_int.group_digits_by(sum_map) if string_of_int.sum_grouped_of_digits_by(sum_map) == result }
combinations.uniq
end
end
p GuessAddition::guess_this("15112", 28) # => [[15, 1, 12], [15, 11, 2]]

What's wrong with my selection sort algorithm?

The answer might be obvious to the trained eye, but I've been hitting the books for a few hours now, my eyes are straining, and I can't seem to see the bug.
Below are two implementations of selection sort I wrote, and neither is sorting the input correctly. You can play with this code on an online interpreter.
def selection_sort_enum(array)
n = array.length - 1
0.upto(n - 1) do |i|
smallest = i
(i + 1).upto(n) do |j|
smallest = j if array[j] < array[i]
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
end
end
def selection_sort_loop(array)
n = array.length - 1
i = 0
while i <= n - 1
smallest = i
j = i + 1
while j <= n
smallest = j if array[j] < array[i]
j += 1
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
i += 1
end
end
Here's the test of the first implementation, selection_sort_enum:
puts "Using enum:"
a1 = [*1..10].shuffle
puts "Before sort: #{a1.inspect}"
selection_sort_enum(a1)
puts "After sort: #{a1.inspect}"
Here's the test of the second implementation, selection_sort_loop:
puts "Using while:"
a2 = [*1..10].shuffle
puts "Before sort: #{a2.inspect}"
selection_sort_loop(a2)
puts "After sort: #{a2.inspect}"
Here's the output of the first implementation, selection_sort_enum:
Using enum:
Before sort: [7, 5, 2, 10, 6, 1, 3, 4, 8, 9]
After sort: [4, 3, 1, 9, 5, 2, 6, 7, 8, 10]
Here's the output of the second implementation, selection_sort_loop:
Using while:
Before sort: [1, 10, 5, 3, 7, 4, 8, 9, 6, 2]
After sort: [1, 2, 4, 3, 6, 5, 7, 8, 9, 10]
In both the code snippets you are comparing with index i instead of index smallest.
This should work :
def selection_sort_enum(array)
n = array.length - 1
0.upto(n - 1) do |i|
smallest = i
(i + 1).upto(n) do |j|
smallest = j if array[j] < array[smallest]
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
end
end
def selection_sort_loop(array)
n = array.length - 1
i = 0
while i <= n - 1
smallest = i
j = i + 1
while j <= n
smallest = j if array[j] < array[smallest]
j += 1
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
i += 1
end
end
Output :
Using enum:
Before sort: [5, 6, 7, 9, 2, 4, 8, 1, 10, 3]
After sort: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Using while:
Before sort: [6, 5, 9, 2, 1, 3, 10, 4, 7, 8]
After sort: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Link to solution : http://ideone.com/pKLriY
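For readers who do not use Ruby, the same fix can be sketched in Python. This is an illustrative translation rather than code from the thread, and the function name `selection_sort` is an assumption. The key line is the comparison against `array[smallest]`, the smallest element found so far in the current pass.

```python
def selection_sort(array):
    """Sort the list in place and return it."""
    n = len(array)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            # compare against the smallest element found so far, not array[i]
            if array[j] < array[smallest]:
                smallest = j
        if smallest != i:
            array[i], array[smallest] = array[smallest], array[i]
    return array

print(selection_sort([7, 5, 2, 10, 6, 1, 3, 4, 8, 9]))
```

Comparing against `array[i]` instead would make `smallest` end up as the last index whose value is below `array[i]`, which is not necessarily the minimum of the unsorted tail.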
def selection_sort_enum(array)
n = array.length - 1
0.upto(n) do |i| # n instead of (n - 1)
smallest_index = i
(i + 1).upto(n) do |j|
smallest_index = j if array[j] < array[i]
end
puts "#{array}", smallest_index
array[i], array[smallest_index] = array[smallest_index], array[i] if i != smallest_index
end
end
You might be interested in this:
def selection_sort_enum(array)
n = array.length - 1
0.upto(n - 1) do |i|
smallest = i
(i + 1).upto(n) do |j|
smallest = j if array[j] < array[i]
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
end
array # <-- added to return the modified array
end
def selection_sort_loop(array)
n = array.length - 1
i = 0
while i <= n - 1
smallest = i
j = i + 1
while j <= n
smallest = j if array[j] < array[i]
j += 1
end
array[i], array[smallest] = array[smallest], array[i] if i != smallest
i += 1
end
array # <-- added to return the modified array
end
require 'fruity'
ARY = (1 .. 100).to_a.shuffle
compare do
_enum { selection_sort_enum(ARY.dup) }
_loop { selection_sort_loop(ARY.dup) }
end
Which results in:
# >> Running each test once. Test will take about 1 second.
# >> _enum is faster than _loop by 3x ± 1.0

Generating Ascending Sequence 2^p*3^q

I was interested in implementing a specific Shellsort method I read about that had the same time complexity as a bitonic sort. However, it requires the gap sequence to be the numbers in [1, N-1] of the form 2^p*3^q for non-negative integers p and q. In layman's terms, all the numbers in that range that have no prime factors other than 2 and 3. Is there a relatively efficient method for generating this sequence?
Numbers of that form are called 3-smooth. Dijkstra studied the closely related problem of generating 5-smooth or regular numbers, proposing an algorithm that generates the sequence S of 5-smooth numbers by starting S with 1 and then doing a sorted merge of the sequences 2S, 3S, and 5S. Here's a rendering of this idea in Python for 3-smooth numbers, as an infinite generator.
def threesmooth():
    S = [1]
    i2 = 0 # current index in 2S
    i3 = 0 # current index in 3S
    while True:
        yield S[-1]
        n2 = 2 * S[i2]
        n3 = 3 * S[i3]
        S.append(min(n2, n3))
        i2 += n2 <= n3
        i3 += n2 >= n3
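Since the generator is infinite, it has to be cut off at the array length to get the gap sequence the question asks for. A minimal sketch, assuming Python 3 and using `itertools.takewhile`; the helper name `gaps_below` is made up for this example, and the generator is repeated so the snippet runs on its own.

```python
from itertools import takewhile

def threesmooth():
    """Yield the numbers of the form 2^p * 3^q in ascending order."""
    S = [1]
    i2 = 0  # current index in 2S
    i3 = 0  # current index in 3S
    while True:
        yield S[-1]
        n2 = 2 * S[i2]
        n3 = 3 * S[i3]
        S.append(min(n2, n3))
        i2 += n2 <= n3  # advance in 2S when its candidate was used
        i3 += n2 >= n3  # advance in 3S when its candidate was used

def gaps_below(n):
    """All 3-smooth numbers in [1, n - 1], ascending."""
    return list(takewhile(lambda g: g < n, threesmooth()))

print(gaps_below(100))
```

Shellsort applies the gaps from largest to smallest, so the list would typically be traversed with `reversed(gaps_below(n))`.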
Simplest I can think of is to run a nested loop over p and q and then sort the result. In Python:
N=100
products_of_powers_of_2and3 = []
power_of_2 = 1
while power_of_2 < N:
    product_of_powers_of_2and3 = power_of_2
    while product_of_powers_of_2and3 < N:
        products_of_powers_of_2and3.append(product_of_powers_of_2and3)
        product_of_powers_of_2and3 *= 3
    power_of_2 *= 2
products_of_powers_of_2and3.sort()
print products_of_powers_of_2and3
result
[1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, 32, 36, 48, 54, 64, 72, 81, 96]
(before sorting the products_of_powers_of_2and3 is
[1, 3, 9, 27, 81, 2, 6, 18, 54, 4, 12, 36, 8, 24, 72, 16, 48, 32, 96, 64]
)
Given the size of products_of_powers_of_2and3 is of the order of log2N*log3N the list doesn't grow very fast and sorting it doesn't seem particularly inefficient. E.g. even for N = 1 million, the list is very short, 142 items, so you don't need to worry.
You can do it very easily in JavaScript:
arr = [];
n = 20;
function generateSeries() {
for (let i = 0; i < n; i++) {
for (let j = 0; j < n; j++) {
arr.push(Math.pow(2, i) * Math.pow(3, j))
}
}
sort();
}
function sort() {
arr.sort((a, b) => {
if (a < b) {return -1;}
if (a > b) {return 1;}
return 0;
});
}
function solution(N) {
arr = [];
if (N >= 0 && N <= 200 ) {
generateSeries();
console.log("arr >>>>>", arr);
console.log("result >>>>>", arr[N]);
return arr[N];
}
}
N = 200
res =[]
a,b = 2,3
for i in range(N):
    for j in range(N):
        temp1=a**i
        temp2=b**j
        temp=temp1*temp2
        if temp<=200:
            res.append(temp)
res = sorted(res)
print(res)

Generating integer partition by its number

I'm trying to generate the partition of a given integer N that is numbered K in lexicographical order, e.g. for N = 5, K = 3 we get:
5 = 1 + 1 + 1 + 1 + 1
5 = 1 + 1 + 1 + 2
5 = 1 + 1 + 3
5 = 1 + 2 + 2
5 = 1 + 4
5 = 2 + 3
5 = 5
And the third one is 1 + 1 + 3.
How can I generate this without generating every partition (in C, but most of all I need the algorithm)?
My plan is to find the maximal number in the partition (assuming we can compute the number of partitions d[i][j], where i is the number and j is the maximal integer in its partition), then decrease both the original number and the index we are looking for. So yes, I'm trying to use dynamic programming. Still working on the code.
This doesn't work at all:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
FILE *F1, *F2;
main()
{
long long i, j, p, n, k, l, m[102][102];
short c[102];
F1 = fopen("num2part.in", "r");
F2 = fopen ("num2part.out", "w");
n = 0;
fscanf (F1, "%lld %lld", &n, &k);
p = 0;
m[0][0] = 1;
for ( i = 0; i <= n; i++)
{
for (j = 1; j <= i; j++)
{
m[i][j] = m[i - j][j] + m[i][j - 1];
}
for (j = i + 1; j <= n; j++)
{
m[i][j] = m[i][i];
}
}
l = n;
p = n;
j = n;
while (k > 0)
{
while ( k < m[l][j])
{
if (j == 0)
{
while (l > 0)
{
c[p] = 1;
p--;
l--;
}
break;
}
j--;
}
k -=m[l][j];
c[p] = j + 1;
p--;
l -= c[p + 1];
}
//printing answer here, answer is contained in array from c[p] to c[n]
}
Here is some example Python code that generates the partitions:
cache = {}
def p3(n,val=1):
    """Returns number of ascending partitions of n if all values are >= val"""
    if n==0:
        return 1 # No choice in partitioning
    key = n,val
    if key in cache:
        return cache[key]
    # Choose next value x
    r = sum(p3(n-x,x) for x in xrange(val,n+1))
    cache[key]=r
    return r
def ascending_partition(n,k):
    """Generate the k-th (0-based) lexicographically ordered partition of n into integer parts"""
    P = []
    val = 1 # All values must be at least this
    while n:
        # Choose the next number
        for x in xrange(val,n+1):
            count = p3(n-x,x)
            if k >= count:
                # Keep trying to find the correct digit
                k -= count
            elif count: # Check that there are some valid positions with this digit
                # This must be the correct digit for this location
                P.append(x)
                n -= x
                val = x
                break
    return P
n=5
for k in range(p3(n)):
    print k,ascending_partition(n,k)
It prints:
0 [1, 1, 1, 1, 1]
1 [1, 1, 1, 2]
2 [1, 1, 3]
3 [1, 2, 2]
4 [1, 4]
5 [2, 3]
6 [5]
This can be used to generate an arbitrary partition without generating all the intermediate ones. For example, there are 9253082936723602 partitions of 300.
print ascending_partition(300,10**15)
prints
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 5, 7, 8, 8, 11, 12, 13, 14, 14, 17, 17, 48, 52]
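The code above targets Python 2 (`xrange`, the `print` statement). A minimal Python 3 sketch of the same counting-and-skipping idea follows, with `functools.lru_cache` standing in for the hand-written cache. The names `count_partitions` and `kth_partition` are illustrative, and k is 0-based as in the answer, so the question's K = 3 corresponds to k = 2.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_partitions(n, smallest=1):
    """Number of partitions of n into non-decreasing parts, each >= smallest."""
    if n == 0:
        return 1  # the empty partition
    return sum(count_partitions(n - x, x) for x in range(smallest, n + 1))

def kth_partition(n, k):
    """Return the k-th (0-based) partition of n in lexicographic order."""
    if not 0 <= k < count_partitions(n):
        raise IndexError("k is out of range")
    parts = []
    smallest = 1
    while n:
        for x in range(smallest, n + 1):
            count = count_partitions(n - x, x)
            if k >= count:
                k -= count  # skip every partition that continues with x
            else:
                parts.append(x)  # the wanted partition continues with x
                n -= x
                smallest = x
                break
    return parts

print(kth_partition(5, 2))
```

Each step only needs the counts, so the cost is governed by the size of the memo table, not by the number of partitions.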
def _yieldParts(num,lt):
    ''' It generates a combination set'''
    if not num:
        yield ()
    for i in range(min(num,lt),0,-1):
        for parts in _yieldParts(num-i,i):
            yield (i,)+parts
def patition(number,kSum,maxIntInTupple):
    ''' It generates a combination set in which each tuple has kSum entries that sum to number;
    maxIntInTupple is the maximum integer allowed in a tuple'''
    for p in _yieldParts(number,maxIntInTupple):
        if(len(p) <=kSum):
            if(len(p)<kSum):
                while len(p) < kSum:
                    p+=(0,)
            print p
patition(40,8,40)
Output:
-------
(40,0,0,0,0,0,0,0)
(39,1,0,0,0,0,0,0)
.
.
.
.

How to find the insertion point in an array using binary search?

The basic idea of binary search in an array is simple, but it might return an "approximate" index if the search fails to find the exact item (we might sometimes get back an index for which the value is larger or smaller than the searched value).
To find the exact insertion point, it seems that after we get the approximate location, we might need to "scan" left or right for the exact insertion location, so that, say, in Ruby, we can do arr.insert(exact_index, value).
I have the following solution, but the handling for the part when begin_index >= end_index is a bit messy. I wonder if a more elegant solution can be used?
(This solution doesn't scan for multiple matches if an exact match is found, so the index returned for an exact match may point to any index that corresponds to the value. But I think if they are all integers, we can always search for a - 1 after an exact match is found to find the left boundary, or search for a + 1 for the right boundary.)
My solution:
DEBUGGING = true
def binary_search_helper(arr, a, begin_index, end_index)
middle_index = (begin_index + end_index) / 2
puts "a = #{a}, arr[middle_index] = #{arr[middle_index]}, " +
"begin_index = #{begin_index}, end_index = #{end_index}, " +
"middle_index = #{middle_index}" if DEBUGGING
if arr[middle_index] == a
return middle_index
elsif begin_index >= end_index
index = [begin_index, end_index].min
return index if a < arr[index] && index >= 0 #careful because -1 means end of array
index = [begin_index, end_index].max
return index if a < arr[index] && index >= 0
return index + 1
elsif a > arr[middle_index]
return binary_search_helper(arr, a, middle_index + 1, end_index)
else
return binary_search_helper(arr, a, begin_index, middle_index - 1)
end
end
# for [1,3,5,7,9], searching for 6 will return index for 7 for insertion
# if exact match is found, then return that index
def binary_search(arr, a)
puts "\nSearching for #{a} in #{arr}" if DEBUGGING
return 0 if arr.empty?
result = binary_search_helper(arr, a, 0, arr.length - 1)
puts "the result is #{result}, the index for value #{arr[result].inspect}" if DEBUGGING
return result
end
arr = [1,3,5,7,9]
b = 6
arr.insert(binary_search(arr, b), b)
p arr
arr = [1,3,5,7,9,11]
b = 6
arr.insert(binary_search(arr, b), b)
p arr
arr = [1,3,5,7,9]
b = 60
arr.insert(binary_search(arr, b), b)
p arr
arr = [1,3,5,7,9,11]
b = 60
arr.insert(binary_search(arr, b), b)
p arr
arr = [1,3,5,7,9]
b = -60
arr.insert(binary_search(arr, b), b)
p arr
arr = [1,3,5,7,9,11]
b = -60
arr.insert(binary_search(arr, b), b)
p arr
arr = [1]
b = -60
arr.insert(binary_search(arr, b), b)
p arr
arr = [1]
b = 60
arr.insert(binary_search(arr, b), b)
p arr
arr = []
b = 60
arr.insert(binary_search(arr, b), b)
p arr
and result:
Searching for 6 in [1, 3, 5, 7, 9]
a = 6, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = 6, arr[middle_index] = 7, begin_index = 3, end_index = 4, middle_index = 3
a = 6, arr[middle_index] = 5, begin_index = 3, end_index = 2, middle_index = 2
the result is 3, the index for value 7
[1, 3, 5, 6, 7, 9]
Searching for 6 in [1, 3, 5, 7, 9, 11]
a = 6, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = 6, arr[middle_index] = 9, begin_index = 3, end_index = 5, middle_index = 4
a = 6, arr[middle_index] = 7, begin_index = 3, end_index = 3, middle_index = 3
the result is 3, the index for value 7
[1, 3, 5, 6, 7, 9, 11]
Searching for 60 in [1, 3, 5, 7, 9]
a = 60, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = 60, arr[middle_index] = 7, begin_index = 3, end_index = 4, middle_index = 3
a = 60, arr[middle_index] = 9, begin_index = 4, end_index = 4, middle_index = 4
the result is 5, the index for value nil
[1, 3, 5, 7, 9, 60]
Searching for 60 in [1, 3, 5, 7, 9, 11]
a = 60, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = 60, arr[middle_index] = 9, begin_index = 3, end_index = 5, middle_index = 4
a = 60, arr[middle_index] = 11, begin_index = 5, end_index = 5, middle_index = 5
the result is 6, the index for value nil
[1, 3, 5, 7, 9, 11, 60]
Searching for -60 in [1, 3, 5, 7, 9]
a = -60, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 1, middle_index = 0
a = -60, arr[middle_index] = 9, begin_index = 0, end_index = -1, middle_index = -1
the result is 0, the index for value 1
[-60, 1, 3, 5, 7, 9]
Searching for -60 in [1, 3, 5, 7, 9, 11]
a = -60, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 1, middle_index = 0
a = -60, arr[middle_index] = 11, begin_index = 0, end_index = -1, middle_index = -1
the result is 0, the index for value 1
[-60, 1, 3, 5, 7, 9, 11]
Searching for -60 in [1]
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 0, middle_index = 0
the result is 0, the index for value 1
[-60, 1]
Searching for 60 in [1]
a = 60, arr[middle_index] = 1, begin_index = 0, end_index = 0, middle_index = 0
the result is 1, the index for value nil
[1, 60]
Searching for 60 in []
[60]
This is the code from Java's java.util.Arrays.binarySearch as included in Oracle's Java:
/**
* Searches the specified array of ints for the specified value using the
* binary search algorithm. The array must be sorted (as
* by the {@link #sort(int[])} method) prior to making this call. If it
* is not sorted, the results are undefined. If the array contains
* multiple elements with the specified value, there is no guarantee which
* one will be found.
*
* @param a the array to be searched
* @param key the value to be searched for
* @return index of the search key, if it is contained in the array;
* otherwise, <tt>(-(<i>insertion point</i>) - 1)</tt>. The
* <i>insertion point</i> is defined as the point at which the
* key would be inserted into the array: the index of the first
* element greater than the key, or <tt>a.length</tt> if all
* elements in the array are less than the specified key. Note
* that this guarantees that the return value will be >= 0 if
* and only if the key is found.
*/
public static int binarySearch(int[] a, int key) {
return binarySearch0(a, 0, a.length, key);
}
// Like public version, but without range checks.
private static int binarySearch0(int[] a, int fromIndex, int toIndex,
int key) {
int low = fromIndex;
int high = toIndex - 1;
while (low <= high) {
int mid = (low + high) >>> 1;
int midVal = a[mid];
if (midVal < key)
low = mid + 1;
else if (midVal > key)
high = mid - 1;
else
return mid; // key found
}
return -(low + 1); // key not found.
}
The algorithm has proven to be appropriate, and I like the fact that you instantly know from the result whether it is an exact match or a hint about the insertion point.
This is how I would translate this into ruby:
# Inserts the specified value into the specified array using the binary
# search algorithm. The array must be sorted prior to making this call.
# If it is not sorted, the results are undefined. If the array contains
# multiple elements with the specified value, there is no guarantee
# which one will be found.
#
# @param [Array] array the ordered array into which value should be inserted
# @param [Object] value the value to insert
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Array] the resulting array
def self.insert(array, value, from_index=0, to_index=array.length)
array.insert insertion_point(array, value, from_index, to_index), value
end
# Searches the specified array for an insertion point of the specified value
# using the binary search algorithm. The array must be sorted prior to making
# this call. If it is not sorted, the results are undefined. If the array
# contains multiple elements with the specified value, there is no guarantee
# which one will be found.
#
# @param [Array] array the ordered array into which value should be inserted
# @param [Object] value the value to insert
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] the position where value should be inserted
def self.insertion_point(array, value, from_index=0, to_index=array.length)
raise(ArgumentError, 'Invalid Range') if from_index < 0 || from_index > array.length || from_index > to_index || to_index > array.length
binary_search = _binary_search(array, value, from_index, to_index)
if binary_search < 0
-(binary_search + 1)
else
binary_search
end
end
# Searches the specified array for the specified value using the binary
# search algorithm. The array must be sorted prior to making this call.
# If it is not sorted, the results are undefined. If the array contains
# multiple elements with the specified value, there is no guarantee which
# one will be found.
#
# @param [Array] array the ordered array in which the value should be searched
# @param [Object] value the value to search for
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] if >= 0 position of value, otherwise -(insertion_point + 1)
def self.binary_search(array, value, from_index=0, to_index=array.length)
raise(ArgumentError, 'Invalid Range') if from_index < 0 || from_index > array.length || from_index > to_index || to_index > array.length
_binary_search(array, value, from_index, to_index)
end
private
# Like binary_search, but without range checks.
#
# @param [Array] array the ordered array in which the value should be searched
# @param [Object] value the value to search for
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] if >= 0 position of value, otherwise -(insertion_point + 1)
def self._binary_search(array, value, from_index, to_index)
low = from_index
high = to_index - 1
while low <= high do
mid = (low + high) / 2
mid_val = array[mid]
if mid_val < value
low = mid + 1
elsif mid_val > value
high = mid - 1
else
return mid # value found
end
end
-(low + 1) # value not found.
end
Code returns the same values as OP provided for his test data.
Update 2020
Actually, the insertion problem of binary search has been well researched. There is a left insertion point and right insertion point. Code can be found on Wikipedia and Rosetta Code. For example, to find the left insertion point, the code is:
BinarySearch_Left(A[0..N-1], value) {
low = 0
high = N - 1
while (low <= high) {
// invariants: value > A[i] for all i < low
// value <= A[i] for all i > high
mid = (low + high) / 2
if (A[mid] >= value)
high = mid - 1
else
low = mid + 1
}
return low
}
One note: in languages with fixed-width integers, low + high can overflow, so mid really should be computed as low + floor((high - low) / 2).
Earlier answer:
Actually, instead of checking for begin_index >= end_index, it can be better handled using begin_index > end_index, and the solution is much cleaner:
def binary_search_helper(arr, a, begin_index, end_index)
if begin_index > end_index
return begin_index
else
middle_index = (begin_index + end_index) / 2
if arr[middle_index] == a
return middle_index
elsif a > arr[middle_index]
return binary_search_helper(arr, a, middle_index + 1, end_index)
else
return binary_search_helper(arr, a, begin_index, middle_index - 1)
end
end
end
# for [1,3,5,7,9], searching for 6 will return index for 7 for insertion
# if exact match is found, then return that index
def binary_search(arr, a)
return binary_search_helper(arr, a, 0, arr.length - 1)
end
Using iteration instead of recursion may also be faster, and it avoids the risk of a stack overflow.
sample for left insertion:
def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return left
sample for right insertion:
def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        if arr[mid] <= target:
            left = mid + 1
        else:
            right = mid - 1
    return left # add - 1 for right most index of target
sample for is present:
def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        n = arr[mid] # value at the midpoint
        if n < target:
            left = mid + 1
        elif n > target:
            right = mid - 1
        else:
            return True # or return mid for index
    return False # or return -1 for not found
sample test case:
arr = [1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10]
result = binary_search(arr, 5)
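In Python specifically, the standard library's `bisect` module already provides the left and right insertion points, which makes it a convenient reference for checking hand-written versions against this test case. A short sketch, assuming Python 3:

```python
from bisect import bisect_left, bisect_right

arr = [1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10]

left = bisect_left(arr, 5)    # first index whose value is >= 5
right = bisect_right(arr, 5)  # first index whose value is > 5
print(left, right)            # 4 11

# the target is present exactly when the left insertion point holds it
present = left < len(arr) and arr[left] == 5
print(present)                # True

# targets outside the range map to the two ends of the list
print(bisect_left(arr, 0), bisect_left(arr, 11))  # 0 16
```

The right-most index of the target is `right - 1`, matching the comment in the right-insertion sample.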

Resources