Binary search on unsorted arrays? - algorithm

I came across this document Binary Search Revisited where the authors have proved/explained that binary search can be used for unsorted arrays (lists) as well. I haven't grokked much of the document on a first reading.
Has anyone here already explored this?

I've just read the paper. As I read it, the author uses the term "binary search" for the bisection method used to find the zeros of a continuous function.
The examples in the paper are clearly inspired by problems such as finding a zero in an
interval (with a translation along the y axis) or finding the max/min of a function given as tabular data.
The arrays the paper considers are not filled at random; there is a rule for constructing them (the rule is tied to the function used to generate them).
That said, it is a good opportunity to tinker with different algorithms belonging to a common family and look for similarities and differences, and a good chance to broaden your experience.
Definitely not a new concept or an undervalued one.

Looking for 3 in this unsorted list with binary search (bisection):
L = 1 5 2 9 38 11 3
1- Take the midpoint of the whole list L: 9
3 < 9, so delete the right part of the list (38 11 3)
Here you can already see that you will never find 3
2- Take the midpoint of the remaining list 1 5 2: 5
3 < 5, so delete the right part of the list (5 2)
What remains is 1
Result: 3 not found
Two remarks:
1- The binary search or bisection algorithm treats right and left as an indication of order.
So I have bluntly applied the usual algorithm, assuming right is high and left is low.
If you assume the opposite, i.e. right is low and left is high, then trying to find 3 in this slightly different list also leads to "3 not found":
L' = 1 5 2 9 3 38 11
3 < 9, take the right part: 3 38 11
Midpoint: 38
3 < 38, take the right part: 11
3 not found
2- If you accept systematically re-applying the algorithm to the dropped part of the list, you end up searching for the element among all n elements. The complexity will be O(n), exactly the same as scanning the whole list from beginning to end to find your value.
The search time could be slightly shorter.
Why? Suppose you look one by one, from beginning to end, for the value 100000 in a sorted list. You will find it at the end of your list! :-)
If the list is instead unordered and your value 100000 happens to sit exactly at the midpoint... bingo!!
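A minimal sketch of the first trace above, assuming the usual convention that left is low and right is high. The function name `binary_search` is only illustrative; it is a textbook binary search that never checks whether its input is actually sorted.

```python
def binary_search(items, target):
    """Textbook binary search; only correct when items is sorted ascending."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if target < items[mid]:
            hi = mid - 1   # discard the right part
        else:
            lo = mid + 1   # discard the left part
    return -1              # not found


L = [1, 5, 2, 9, 38, 11, 3]
print(binary_search(L, 3))          # prints -1: the 3 sat in the discarded right part
print(binary_search(sorted(L), 3))  # prints 2: same values, sorted first
```

The call on the unsorted list inspects 9, 5 and 1, exactly as in the trace, and then gives up even though 3 is present.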

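Binary search can be implemented on a rotated sorted array/list, i.e. a sorted array whose elements have been cyclically shifted so that it is no longer sorted as a whole.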

Related

Select optimal pairings of elements

I have the following problem:
Given a bucket with symbols
1 1 2 3 3 4
and a book of recipes for creating pairs, e.g.:
12 13 24
select an optimal pairing from the bucket, leaving as few symbols as possible in the bucket. Using the example values above, the optimal pairing would be:
13 13 24, which would use all the symbols given.
Naive picking from the bucket could result in something like:
12 13, leaving the 3 and 4 unmatched. 3 and 4 cannot be matched because the book does not contain a recipe for that particular pair.
Notes:
The real problem consists, on average, of 500 elements in the bucket drawn from about 30 kinds of symbols.
We've tried to implement the solution using a brute-force algorithm, however I am afraid that even our grandchildren will not live long enough to see the result :).
There is no limit to the size of the recipe book; it could even contain every possible pair of symbols in the bucket. A pair made of the same element twice is not allowed.
The answer is not required to empty the bucket completely. It's just about getting the most pairs out of the bucket. It's okay to leave some in the bucket. It would be best to find the optimal solution, however a close approximation is also good enough.
I will appreciate an answer that proposes or hints at an algorithm to solve the problem.
Examples:
Bucket:
1 1 2 2 2 2 3 3 3 4 5 6 7 8 8
Recipe book:
12 34 15 68
Optimal result (one of possible):
{1 2} {1 2} {3 4} {6 8}
Leftover:
2 2 3 3 5 7 8
This problem is essentially the maximum matching problem with the small twist that you're allowed to have duplicate objects. Here's one way to solve this problem assuming you have a solver for maximum matching:
Create a node for each number in the input list.
For each recipe, for each pair of numbers matching that recipe, add an edge between the nodes for those numbers.
Run a maximum matching algorithm and return the pairs reported that way.
There are a good number of off-the-shelf maximum matching algorithms you can use, and if you need to code one up yourself, consider Edmonds' Blossom Algorithm, which is reasonably efficient and less tricky to code up than other approaches.
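A small sketch of the graph construction described above. The function names and the representation of recipes as tuples are assumptions made for illustration. The matcher here is deliberately NOT Edmonds' Blossom Algorithm: it is an exhaustive search with memoisation that only suits tiny buckets like the examples in the question, standing in for whatever maximum matching solver you actually use on 500 elements.

```python
from functools import lru_cache


def build_edges(bucket, recipes):
    """One node per bucket item; an edge joins two items whose symbols form a recipe."""
    allowed = {frozenset(r) for r in recipes}
    edges = []
    for i in range(len(bucket)):
        for j in range(i + 1, len(bucket)):
            if frozenset((bucket[i], bucket[j])) in allowed:
                edges.append((i, j))
    return edges


def max_matching_small(n, edges):
    """Exact maximum matching by exhaustive search (exponential; small n only)."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)

    @lru_cache(maxsize=None)
    def best(free):
        # free is a bitmask of the nodes that are still unmatched
        if free == 0:
            return ()
        v = (free & -free).bit_length() - 1      # lowest-numbered free node
        rest = free & ~(1 << v)
        result = best(rest)                      # option 1: leave v unmatched
        for u in adj[v]:                         # option 2: pair v with a free neighbour
            if (rest >> u) & 1:
                candidate = ((v, u),) + best(rest & ~(1 << u))
                if len(candidate) > len(result):
                    result = candidate
        return result

    return list(best((1 << n) - 1))


bucket = [1, 1, 2, 3, 3, 4]
recipes = [(1, 2), (1, 3), (2, 4)]
pairs = max_matching_small(len(bucket), build_edges(bucket, recipes))
print([(bucket[i], bucket[j]) for i, j in pairs])
```

On the first example from the question this finds three pairs, which uses every symbol in the bucket.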
First generate all possible pairs of symbols and store each pair together with the indices of its two symbols. If you have n symbols, n*(n-1)/2 pairs are generated (in the worst case, n=500, that is 124750 pairs).
Ex: a bucket with symbols 1 1 3
The pairs generated are (11,1,2)(13,1,3)(13,2,3).
General format: ( a[i]a[j], i, j ).
Now loop over the generated pairs and delete those that do not exist in the book of recipes, so that only pairs matching a recipe remain.
Next build a graph whose nodes are the remaining pairs, where two nodes are connected if the two pairs share no index (using 2 nested loops over the pairs).
Finally, run BFS or DFS on this graph and take the largest set of mutually connected pairs found, which is the answer to our problem.
If you want a C++/Java implementation, please don't hesitate to ask.

Can we use binary search with an unsorted array? [duplicate]

This question already has answers here:
Time complexity of binary search for an unsorted array
(3 answers)
Closed 6 years ago.
I have an array that looks like
2 6 8 5 34 1 12
Can I use a binary search on some subarray?
You can use binary search on only one kind of "unsorted" array - the rotated array.
It can be done in O(log n) time like a typical binary search, but uses an adjusted divide and conquer approach. You can find a discussion about it here.
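A minimal sketch of that adjusted binary search, assuming the array was sorted in ascending order, then rotated, and contains no duplicates. The function name `search_rotated` is made up for this example. Note that the array in the question (2 6 8 5 34 1 12) is not a rotation of a sorted array, so this does not apply to it.

```python
def search_rotated(a, target):
    """Return the index of target in a rotated sorted array, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[lo] <= a[mid]:
            # The left half a[lo..mid] is sorted.
            if a[lo] <= target < a[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # Otherwise the right half a[mid..hi] is sorted.
            if a[mid] < target <= a[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1


rotated = [12, 34, 1, 2, 5, 6, 8]   # 1 2 5 6 8 12 34 rotated by two positions
print(search_rotated(rotated, 5))   # prints 4
print(search_rotated(rotated, 7))   # prints -1
```

At every step at least one half is sorted, so a range check on that half tells you which side to keep, and the interval still halves each time.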
You can't. Binary search decides whether the value is on the left or the right side by checking whether it is less than or greater than the central item.
Array:
2 6 8 5 34 1 12
Suppose that you want to find '1'. In the first iteration the method compares '1' with the current central element (in this case '5'). The method would say: "As 1 is less than 5, I'm going to search the left side", but if we do this we will never find it, because there is no 1 on the left side.
But if we have:
1 2 5 6 8 12 34
Note that 1 is less than 6 (the central element), so in the next step the algorithm continues searching the left side, and that's OK. So, to use binary search the elements MUST BE sorted.
No, binary search needs a sorted array. You might have other properties of an array that enable you to make a search that is more efficient than a mere iteration, but the very nature of binary search necessitates sorted data.
If you know that part of your array is sorted you might extract that, of course, execute a binary search there and exclude this from the otherwise linear search.

How to display all ways to give change

As far as I know, counting every way to give change for a given sum and a starting till configuration is a classic Dynamic Programming problem.
I was wondering if there was a way to also display (or store) the actual change combinations that amount to the given sum while preserving the DP complexity.
I have never seen this issue discussed, and I would like some pointers or a brief explanation of how this can be done or why it cannot be done.
DP for the change problem has time complexity O(Sum * ValuesCount) and storage complexity O(Sum).
You can prepare extra data for this problem in the same time as the DP for change, but you need more storage, O(Sum * ValuesCount), and a lot of time to output all the variants, O(ChangeWaysCount).
To prepare the data for recovering the ways, make a second array B of arrays (or lists). When you increment an element of the count array A from some previous element, add the value used to the corresponding element of B. At the end, unwind all the ways from the last element.
Example: values 1,2,3, sum 4
index   0   1   2     3       4
A       0   1   2     3       4
B       -   1   1 2   1 2 3   1 2 3
We start unwinding from B[4] elements:
1-1-1-1 (B[4]-B[3]-B[2]-B[1])
2-1-1 (B[4]-B[2]-B[1])
2-2 (B[4]-B[2])
3-1 (B[4]-B[1])
Note that I have used only ways with non-increasing values to avoid permutation variants (i.e. 1-3 and 3-1)
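A sketch of the A/B bookkeeping and the unwinding step, assuming distinct positive coin values and an unlimited supply of each. The function name `change_ways` and the choice to set A[0] = 1 (one way to make the empty sum) are assumptions of this example.

```python
def change_ways(values, total):
    """Count and list the ways to form total, each way in non-increasing order."""
    A = [0] * (total + 1)            # A[s]: number of ways to form sum s
    A[0] = 1
    B = [[] for _ in range(total + 1)]   # B[s]: values that can be the largest coin of s
    for v in sorted(values):
        for s in range(v, total + 1):
            if A[s - v] > 0:
                A[s] += A[s - v]
                B[s].append(v)

    def unwind(s, cap):
        # Walk back from s, never using a value larger than the one just used.
        if s == 0:
            yield []
            return
        for v in B[s]:
            if v <= cap:
                for rest in unwind(s - v, v):
                    yield [v] + rest

    return A[total], list(unwind(total, max(values, default=0)))


count, ways = change_ways([1, 2, 3], 4)
print(count)   # prints 4
print(ways)    # prints [[1, 1, 1, 1], [2, 1, 1], [2, 2], [3, 1]]
```

Because a value is stored in B[s] only when the remainder s - v can be formed from values no larger than v, the unwinding never walks into a dead end, so the extra work is spent only on producing output.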

Find the smallest set group to cover all combinatory possibilities

I'm doing some exercises on combinatorial algorithms and trying to figure out how to solve the question below:
Given a group of 25 bits, set (choose) 15 of them (combinations, so order does not matter):
n!/(k!(n-k)!) = 3,268,760
Now, for each of these possibilities, construct a matrix in which every unique 25-bit member is crossed against every other 25-bit member, where
the two members of a pair must have at least 11 set bits in common (only ones, not zeros).
Let me try to illustrate by representing the members as binary data, so the first member would be:
0000000000111111111111111 (10 zeros and 15 ones) or (15 bits set out of 25 bits)
0000000001011111111111111 second member
0000000001101111111111111 third member
0000000001110111111111111 and so on....
...
1111111111111110000000000 up to here. This is member 3,268,760.
Now, crossing these values in a matrix: for 1 x 1 there are 15 bits in common. Since the result is >= 11 it is a "useful" result.
For 1 x 2 we have 14 bits in common, so that is also a valid result.
Doing that for all members, finally, crossing 1 x 3,268,760 results in 5 bits in common, and since that is < 11 it is not "useful".
What I need is to find out (by math or by algorithm) the minimum number of members needed to cover all possibilities with at least 11 bits in common.
In other words, a group of N members such that every member of the whole 3,268,760 x 3,268,760 universe has at least 11 bits in common with at least one member of the group.
Using a brute force algorithm I found that this can be achieved with 81 25-bit members. But I'm guessing that the number should be smaller (something near 12).
I was trying to use a brute force algorithm to generate all possible selections of 12 members out of the 3,268,760, but the number of possibilities
is so huge that it would take more than a hundred years to compute (3.156 x 10^69 combinations).
I've googled combinatorics, but there are so many fields that I don't know which one this problem fits into.
So any pointers to the relevant field of combinatorics, or to an algorithm for this problem, would be greatly appreciated.
PS: Just for reference, the "likeness" of two members is calculated using:
(Not(a xor b)) and a
After that there's a small recursive loop that counts the set bits, giving the number of common bits.
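For illustration, the likeness computation can be sketched as follows, assuming each member is stored as a 25-bit integer. The name `common_bits` is made up here, and counting the ones via `bin()` replaces the recursive loop mentioned above. The expression (Not(a xor b)) and a keeps exactly the bits that are 1 in both members, so it gives the same result as a plain bitwise AND.

```python
def common_bits(a, b):
    """Number of bit positions that are 1 in both a and b."""
    shared = ~(a ^ b) & a          # the formula from the question; equals a & b
    return bin(shared).count("1")


first = (1 << 15) - 1                              # 0000000000111111111111111
second = int("0000000001011111111111111", 2)       # the second member listed above
last = first << 10                                 # 1111111111111110000000000

print(common_bits(first, first))    # prints 15
print(common_bits(first, second))   # prints 14
print(common_bits(first, last))     # prints 5
```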
EDIT: As promised (#btilly) in the comment below, here's the 'fractal' image of the relations or link to image
The color scale ranges from red (15 bits match) to green (11 bits match) to black for values smaller than 10 bits.
This image is just a sample of the first 4096 groups.
tl;dr: you want to solve dominating set on a large, extremely symmetric graph. btilly is right that you should not expect an exact answer. If this were my problem, I would try local search starting with the greedy solution. Pick one set and try to get rid of it by changing the others. This requires data structures to keep track of which sets are covered exactly once.
EDIT: Okay, here's a better idea for a lower bound. For every k from 1 to the value of the optimal solution, there's a lower bound of [25 choose 15] * k / [maximum joint coverage of k sets]. Your bound of 12 (actually 10 by my reckoning, since you forgot some neighbors) corresponds to k = 1. Proof sketch: fix an arbitrary solution with m sets and consider the most coverage that can be obtained by k of the m. Build a fractional solution where all symmetries of the chosen k are averaged together and scaled so that each element is covered once. The cost of this solution is [25 choose 15] * k / [maximum joint coverage of those k sets], which is at least as large as the lower bound we're shooting for. It's still at least as small, however, as the original m-set solution, as the marginal returns of each set are decreasing.
Computing maximum coverage is in general hard, but there's a factor (e/(e-1))-approximation (≈ 1.58) algorithm: greedy, which it sounds as though you could implement quickly (note: you need to choose the set that covers the most uncovered other sets each time). By multiplying the greedy solution by e/(e-1), we obtain an upper bound on the maximum coverage of k elements, which suffices to power the lower bound described in the previous paragraph.
Warning: if this upper bound is larger than [25 choose 15], then k is too large!
This type of problem is extremely hard; you should not expect to be able to find the exact answer.
A greedy solution should produce a "fairly good" answer. But... how to be greedy?
The idea is to always choose the next element to be the one that matches as many of the currently unmatched possibilities as you can. Unfortunately, with over 3 million possible members that you have to try to match against millions of unmatched members (note that your best next guess might already match another member in your candidate set), even choosing that next element is probably not feasible.
So we'll have to be greedy about choosing the next element. We will choose each bit to maximize the sum of the probabilities of eventually matching all of the currently unmatched elements.
For that we will need a 2-dimensional lookup table P such that P(n, m) is the probability that two random members will turn out to have at least 11 bits in common, if m of the first n bits that are 1 in the first member are also 1 in the second. This table of 225 probabilities should be precomputed.
This table can easily be computed using the following rules:
P(15, m) is 0 if m < 11, 1 otherwise.
For n < 15:
P(n, m) = P(n+1, m+1) * (15-m) / (25-n) + P(n+1, m) * (10-n+m) / (25-n)
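The rules above can be turned into a table directly. This is a sketch under a few assumptions: the function name `build_table` and its parameters are invented for the example, exact fractions are used instead of floats, and states that cannot occur (more than 10 misses, since the second member has only 10 zero bits) are simply left out of the table.

```python
from fractions import Fraction


def build_table(total_bits=25, set_bits=15, needed=11):
    """P[(n, m)]: chance of ending with >= needed common bits, given m hits in n."""
    zeros = total_bits - set_bits
    P = {}
    # Base case: all 15 set bits of the first member have been examined.
    for m in range(set_bits + 1):
        if set_bits - m <= zeros:
            P[(set_bits, m)] = Fraction(1 if m >= needed else 0)
    # Fill backwards using the recurrence for n < 15.
    for n in range(set_bits - 1, -1, -1):
        for m in range(n + 1):
            if n - m > zeros:
                continue                                  # impossible state
            hit = Fraction(set_bits - m, total_bits - n)  # next bit is also 1
            miss = Fraction(zeros - n + m, total_bits - n)
            P[(n, m)] = (hit * P.get((n + 1, m + 1), 0)
                         + miss * P.get((n + 1, m), 0))
    return P


P = build_table()
print(P[(0, 0)], float(P[(0, 0)]))
```

P[(0, 0)] is the chance that two random members match at all, which gives a quick sanity check against a direct hypergeometric count.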
Now let's start with a few members that are "very far" from each other. My suggestion would be:
First 15 bits 1, rest 0.
First 10 bits 0, rest 1.
First 8 bits 1, last 7 1, rest 0.
Bits 1-4, 9-12, 16-23 are 1, rest 0.
Now starting with your universe of (25 choose 15) members, eliminate all of those that match one of the elements in your initial collection.
Next we go into the heart of the algorithm.
While there are unmatched members:
    Find the bit that appears in the most unmatched members (break ties randomly)
    Make that the first set bit of our candidate member for the group.
    While the candidate member has less than 15 set bits:
        Let p_best = 0, bit_best = 0;
        For each unset bit:
            Let p = 0
            For each unmatched member:
                p += P(n, m) where m = number of bits in common between
                     candidate member+this bit and the unmatched member
                     and n = bits in candidate member + 1
            If p_best < p:
                p_best = p
                bit_best = this unset bit
        Set bit_best as the next bit in our candidate member.
    Add the candidate member to our collection
    Remove all unmatched members that match this from unmatched members
The list of candidate members is our answer
I have not written code, so I have no idea how good an answer this algorithm will produce. But assuming that it does no better than your current result, for each of 77 candidate members (we cheated and started with 4) you have to make 271 passes through your unmatched candidates (25 to find the first bit, 24 to find the second, etc. down to 11 to find the 15th, and one more to remove the matched members). That's 20867 passes. If you have an average of 1 million unmatched members, that's on the order of 20 billion operations.
This won't be quick. But it should be computationally feasible.

Algorithm: efficient way to search an integer in a two dimensional integer array? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Given a 2d array sorted in increasing order from left to right and top to bottom, what is the best way to search for a target number?
Search a sorted 2D matrix
A time efficient program to find an element in a two dimensional matrix, the rows and columns of which are increasing monotonically. (Rows and columns are increasing from top to bottom and from left to right).
I can only think of binary search, if the 2D array was sorted.
I posed this problem as homework last semester, and two students, whom I had considered to be average, surprised me by coming up with a very elegant, straightforward, and (probably) optimal algorithm:
Find(k, tab, x, y)
    if x < 1 or y > number of columns then return "Not found"
    let m = tab[x][y]
    if k = m then return "Found"
    else if k > m then
        return Find(k, tab, x, y + 1)
    else
        return Find(k, tab, x - 1, y)
This algorithm eliminates either one line or one column at every call (note that it is tail recursive and could be transformed into a loop, thereby avoiding the recursive calls). Thus, if your matrix is n*m, the algorithm runs in O(n+m). This solution is better than the dichotomic search spin-off (which is the solution I expected when handing out this problem).
EDIT: I fixed a typo (k became x in the recursive calls) and also, as Chris pointed out, this should initially be called with the "upper right" corner, that is Find(k, tab, n, 1), where n is the number of lines.
Since the rows and columns are increasing monotonically, you can do a neat little search like this:
Start at the bottom left. If the element you are looking for is greater than the element at that location, go right. If it is less, go up. Repeat until you find the element or you hit an edge. Example (in hex to make formatting easier):
1 2 5 6 7
3 4 6 7 8
5 7 8 9 A
7 A C D E
Let's search for 8. Start at position (0, 3): 7. 8 > 7 so we go right. We are now at (1, 3): A. 8 < A so we go up. At (1, 2): 7, 8 > 7 so we go right. (2, 2): 8 -> 8 == 8 so we are done.
You'll notice, however, that this has only found one of the elements whose value is 8.
Edit: in case it wasn't obvious, this runs in O(n + m) average and worst case time.
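The walk from the bottom-left corner can be sketched as a loop. This is an illustration only: the function name `find_in_sorted_matrix` is made up, indices are 0-based (row, column), and the matrix is assumed to be a non-ragged list of rows that increase left to right and top to bottom.

```python
def find_in_sorted_matrix(tab, k):
    """Return (row, col) of one occurrence of k, or None if k is absent."""
    if not tab or not tab[0]:
        return None
    row, col = len(tab) - 1, 0          # start at the bottom-left corner
    while row >= 0 and col < len(tab[0]):
        m = tab[row][col]
        if k == m:
            return (row, col)
        if k > m:
            col += 1                    # everything above in this column is smaller
        else:
            row -= 1                    # everything to the right in this row is larger
    return None                         # walked off an edge


grid = [
    [0x1, 0x2, 0x5, 0x6, 0x7],
    [0x3, 0x4, 0x6, 0x7, 0x8],
    [0x5, 0x7, 0x8, 0x9, 0xA],
    [0x7, 0xA, 0xC, 0xD, 0xE],
]
print(find_in_sorted_matrix(grid, 0x8))   # prints (2, 2)
print(find_in_sorted_matrix(grid, 0xB))   # prints None
```

Each step discards one row or one column, so the loop runs at most n + m times.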
Assuming I read right you are saying that the bottom of row n is always less than the top of row n+1. If that is the case then I'd say the simplest way is to search the first row using a binary search for either the number or the next smallest number. Then you will have identified the column it is in. Then do a binary search of that column until you find it.
Start at (0,0)
while the value is too low, continue to the right (0,1), then (0,2) etc.
when reaching a value too high, go down one and left one (1,1)
Repeating those steps should bring you to the target.

Resources