Given a set of tuples (value, cost), is there an algorithm to find the combination of tuples that has the least cost for storing a given number - algorithm

I have a set of (value, cost) tuples: (2000000, 200), (500000, 75), (100000, 20).
Suppose X is any positive number.
Is there an algorithm to find the combination of tuples that has the least cost while the sum of their values can store X?
The sum of the tuple values can be equal to or greater than the given X.
ex.
given x = 800000 the answer should be (500000,75), (100000,20), (100000,20), (100000,20)
given x = 900000 the answer should be (500000,75), (500000,75)
given x = 1500000 the answer should be (2000000,200)
I can hardcode this, but the set and the tuples are subject to change, so if this can be replaced with a well-known algorithm it would be great.

This can be solved with dynamic programming, as you have no limit on the number of tuples and can afford sums higher than the provided number.
First, you can optimize the tuples. If one big tuple can be replaced by a number of smaller ones with equal or lower cost and equal or higher value, you can remove the bigger tuple entirely.
Also, it's fruitful for later use to order the tuples in the optimized set by value/cost in descending order: a tuple is better if its value/cost ratio is bigger.
Time complexity is O(N*T), where N is the given number divided by the common factor (F) of the optimized tuple values, and T is the number of tuples in the optimized set.
Memory complexity O(N).
Set up an array a of size N that will contain:
in a[i].cost, the best cost of a solution for i*F (0 for the special case "no solution yet")
in a[i].tuple, the tuple that led to the best solution
Recursion scheme:
the function gets n as a single parameter: the provided number divided by F on the initial call, and the leftover of the needed value divided by F on recursive calls
if array a for n is filled, return a[n].cost
otherwise set current_cost to MAXINT
for each tuple from best to worst try to add it to solution:
if value/F >= n, we've got some solution; compare the tuple cost to current_cost and, if it's better, update a[n].cost and a[n].tuple
if value/F < n, call recursively for n - value/F and compare the cost with the current solution; update the current solution and a[n].cost, a[n].tuple if needed
after all tuples are tried, return a[n].cost, or throw an exception if no solution exists
The tuple list can be retrieved from a by traversing through .tuple at each step.
It's possible to reduce the overall array size down to max(tuple.value/F), but you'll have to save a more or less complete solution instead of one best .tuple per element, and you'll have to manage the "sliding window" carefully.
It's possible to turn the recursion into a loop from 0 to n, as with many other dynamic programming algorithms.
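Below is a minimal Python sketch of this scheme, using memoized recursion on the remaining amount divided by the common factor F (function and variable names are illustrative; the tuple-pruning and ratio-ordering optimizations described above are omitted):

from math import gcd
from functools import reduce

def min_cost_cover(x, tuples):
    """tuples is a list of (value, cost). Returns (total_cost, chosen) covering
    at least x with minimum cost, or None if no combination works."""
    f = reduce(gcd, (v for v, _ in tuples))        # common factor F
    scaled = [(v // f, c) for v, c in tuples]
    n = -(-x // f)                                 # ceil(x / F)
    best = {}                                      # best[i] = (cost, tuple used)

    def solve(i):
        if i <= 0:
            return 0
        if i in best:
            return best[i][0]
        cur_cost, cur_tup = float('inf'), None
        for v, c in scaled:
            cost = c + (solve(i - v) if v < i else 0)
            if cost < cur_cost:
                cur_cost, cur_tup = cost, (v * f, c)
        best[i] = (cur_cost, cur_tup)
        return cur_cost

    if solve(n) == float('inf'):
        return None
    chosen, i = [], n                              # walk the stored tuples back
    while i > 0:
        tup = best[i][1]
        chosen.append(tup)
        i -= tup[0] // f
    return sum(c for _, c in chosen), chosen

print(min_cost_cover(800000, [(2000000, 200), (500000, 75), (100000, 20)]))
# (135, [(500000, 75), (100000, 20), (100000, 20), (100000, 20)])

As noted above, the recursion can be replaced by a bottom-up loop from 0 to n.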

Related

k-size possible number combinations ordered by each sum

Given a set of n numbers, what is the code that generates all possible k-size subsets in descending order of their sums?
Example:
Set={9,8,6,2,1} => n=5 and k=3. So the output is:
[9,8,6]
[9,8,2]
[9,8,1]
[9,6,2]
[9,6,1]
[8,6,2]
[8,6,1]
[9,2,1]
[8,2,1]
[6,2,1]
The most efficient algorithm is preferred, but an algorithm with exponential cost (enumerating all n-choose-k combinations) is still an acceptable answer.
One-by-one generation in MATLAB code is preferred for the implementation, or a solution in which the maximum size of the ordered list can be specified (so that, for greater n and k, one may use an approximation and return a list of a given size without computing all possibilities).
Note: 1) Please pay attention to the position of [9,2,1] in this ordered list; plain index ordering is not the correct answer.
2) This may be a type of lexicographical order.
Thanks to Divakar, Yvon, and Luis, here is one of the possible answers to this question.
The sorted set combinations end up in SSC:
combs = nchoosek(Set,k);
[~,ind] = sort(sum(combs,2),'descend');
SSC = combs(ind,:);
If you want the index of each number of a combination within Set (which has unique numbers), with num_arr the row index in SSC, use this code:
for i=1:k
Index(i)=find(Set(1,:)==SSC(num_arr,i));
end
this code returns [1,3,5] for [9,6,1] in Index.
For greater n:
In this case, the computation is very time-consuming or even impractical. An approximation may solve this issue; for such situations, you can find the first arbitrary answer by modifying nchoosek.m in MATLAB.
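For reference, a rough Python equivalent of this nchoosek/sort approach (a sketch only; variable names mirror the MATLAB code):

from itertools import combinations

Set = [9, 8, 6, 2, 1]
k = 3

# all k-element combinations, ordered by decreasing sum (the SSC list above)
SSC = sorted(combinations(Set, k), key=sum, reverse=True)

# 1-based index of each member of a chosen combination within Set
num_arr = SSC.index((9, 6, 1))
Index = [Set.index(v) + 1 for v in SSC[num_arr]]
print(SSC)
print(Index)    # [1, 3, 5]

Python's sort is stable, so equal-sum combinations keep their enumeration order, which matches the example output above (e.g. [9,6,1] before [8,6,2]).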

Algorithm to generate k element subsets in order of their sum

If I have an unsorted large set of n integers (say 2^20 of them) and would like to generate subsets with k elements each (where k is small, say 5) in increasing order of their sums, what is the most efficient way to do so?
Why I need to generate these subsets in this fashion is that I would like to find the k-element subset with the smallest sum satisfying a certain condition, and I thus would apply the condition on each of the k-element subsets generated.
Also, what would be the complexity of the algorithm?
There is a similar question here: Algorithm to get every possible subset of a list, in order of their product, without building and sorting the entire list (i.e Generators) about generating subsets in order of their product, but it wouldn't fit my needs due to the extremely large size of the set n
I intend to implement the algorithm in Mathematica, but could do it in C++ or Python too.
If your desired property of the small subsets (call it P) is fairly common, a probabilistic approach may work well:
Sort the n integers (for millions of integers i.e. 10s to 100s of MB of ram, this should not be a problem), and sum the k-1 smallest. Call this total offset.
Generate a random k-subset (say, by sampling k random numbers, mod n) and check it for P-ness.
On a match, note the sum-total of the subset. Subtract offset from this to find an upper bound on the largest element of any k-subset of equivalent sum-total.
Restrict your set of n integers to those less than or equal to this bound.
Repeat (goto 2) until no matches are found within some fixed number of iterations.
Note the initial sort is O(n log n). The binary search implicit in step 4 is O(log n).
Obviously, if P is so rare that random pot-shots are unlikely to get a match, this does you no good.
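A small Python sketch of that loop (the predicate P and the miss limit are placeholders you would supply; it assumes you want the minimum-sum k-subset satisfying P):

import random

def probabilistic_search(nums, k, P, max_misses=10000):
    pool = sorted(nums)
    offset = sum(pool[:k - 1])              # sum of the k-1 smallest values
    best = None
    misses = 0
    while misses < max_misses and len(pool) >= k:
        subset = random.sample(pool, k)     # random k-subset
        if P(subset):
            total = sum(subset)
            if best is None or total < sum(best):
                best = subset
            bound = total - offset          # largest element any better subset can use
            pool = [x for x in pool if x <= bound]   # the "restrict" step
            misses = 0
        else:
            misses += 1
    return best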
Even if only 1 in 1000 of the k-sized sets meets your condition, that's still far too many combinations to test. I believe the runtime scales with nCk (n choose k), where n is the size of your unsorted list. The answer by Andrew Mao has a link to this value. 10^28/1000 is still 10^25. Even at 1000 tests per second, that's still 10^22 seconds, or about 10^14 years.
If you are allowed to, I think you need to eliminate duplicate numbers from your large set. Each duplicate you remove will drastically reduce the number of evaluations you need to perform. Sort the list, then kill the dupes.
Also, are you looking for the single best answer here? Who will verify the answer, and how long would that take? I suggest implementing a Genetic Algorithm and running a bunch of instances overnight (for as long as you have the time). This will yield a very good answer, in much less time than the duration of the universe.
Do you mean 20 integers, or 2^20? If it's really 2^20, then you may need to go through a significant fraction of the (2^20 choose 5) subsets before you find one that satisfies your condition. On a modern 100k MIPS CPU, assuming just 1 instruction can compute a set and evaluate that condition, going through that entire set would still take more than 3 billion years. So if you even need to go through a fraction of that, it's not going to finish in your lifetime.
Even if the number of integers is smaller, this seems to be a rather brute force way to solve this problem. I conjecture that you may be able to express your condition as a constraint in a mixed integer program, in which case solving the following could be a much faster way to obtain the solution than brute force enumeration. Assuming your integers are w_i, i from 1 to N:
min   sum(i) w_i * x_i
s.t.  sum(i) x_i = k
      (some constraints on the w_i * x_i)
      x_i binary
If it turns out that the linear programming relaxation of your MIP is tight, then you would be in luck and have a very efficient way to solve the problem, even for 2^20 integers (example: the max-flow/min-cut problem). Also, you can use the approach of column generation to find a solution, since you may have a very large number of variables that cannot all be handled at the same time.
If you post a bit more about the constraint you are interested in, I or someone else may be able to propose a more concrete solution for you that doesn't involve brute force enumeration.
Here's an approximate way to do what you're saying.
First, sort the list. Then, consider some length-5 index vector v, corresponding to the positions in the sorted list, where the maximum index is some number m, and some other index vector v', with some max index m' > m. The smallest sum for all such vectors v' is always greater than the smallest sum for all vectors v.
So, here's how you can loop through the elements with approximately increasing sum:
sort arr
for i = 1 to N
for v = 5-element subsets of (1, ..., i)
set = arr{v}
if condition(set) is satisfied
break_loop = true
compute sum(set), keep set if it is the best so far
break if break_loop
Basically, this means that you no longer need to check for 5-element combinations of (1, ..., n+1) if you find a satisfying assignment in (1, ..., n), since any satisfying assignment with max index n+1 will have a greater sum, and you can stop after that set. However, there is no easy way to loop through the 5-combinations of (1, ..., n) while guaranteeing that the sum is always increasing, but at least you can stop checking after you find a satisfying set at some n.
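A Python sketch of that loop, with condition() standing in for the property being tested (names are illustrative):

from itertools import combinations

def smallest_satisfying(arr, k, condition):
    arr = sorted(arr)
    best = None
    for i in range(k, len(arr) + 1):        # grow the allowed prefix of the sorted list
        found = False
        for subset in combinations(arr[:i], k):
            if condition(subset):
                found = True
                if best is None or sum(subset) < sum(best):
                    best = subset
        if found:
            break                           # larger prefixes can only raise the minimum sum
    return best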
This looks to be a perfect candidate for map-reduce (http://en.wikipedia.org/wiki/MapReduce). If you know of any way of partitioning them smartly so that passing candidates are equally present in each node then you can probably get a great throughput.
Complete sort may not really be needed as the map stage can take care of it. Each node can then verify the condition against the k-tuples and output results into a file that can be aggregated / reduced later.
If you know of the probability of occurrence and don't need all of the results try looking at probabilistic algorithms to converge to an answer.

finding max value on each subset

I'm banging my head here. Let X={x1,x2,...,xn} be an integer set, and let A1,A2,...,Am be m subsets of X. For any i and j, Ai and Aj are not necessarily disjoint. Now the goal is to find the maximal value of each Ai (i=1,...,m) efficiently, with as few operations as possible.
For example, given X={2,4,6,3,1}, and its subsets A1={2,3,1}, A2={2,6,3,1}, A3={4,2,3,1}. We need to find Max{A1}, Max{A2}, Max{A3}, respectively.
The brute-force way for finding Max{A1}, Max{A2}, Max{A3} is to scan all the elements in each Ai, and (m*d) operations are required, with m the number of subsets of X, and d the average length of the subsets {Ai} of X.
Now, I have some observations:
(1) For any set Y⊆X, max{Y}≤max{X}.
For instance, since Max{X}=6 and 6 is in A2, Max{A2}=6 can be found directly.
(2) For any two sets A and B, if A∩B is non-empty, Max{A} and Max{B} can be identified as follows:
First, we find the maximum of the common part between A and B, denoted c=max{A∩B}.
Then, we find Max{A}=Max{Max{A-(A∩B)}, c} and Max{B}=Max{Max{B-(A∩B)}, c}.
I am not sure whether there are other interesting observations for finding these max values.
Any ideas are warmly welcome!
My question is: for the general case, when X={x1,x2,...,xn} and there are m subsets of X, denoted A1,A2,...,Am, are there more efficient techniques to find the max values Max{Ai} (i=1,...,m)?
Your help will be highly appreciated!
There is no method asymptotically better than brute force, assuming a typical representation of the given sets. Simply scanning through the sets to find the largest member of each requires linear time and linear time is optimal since every member of the set must be read in order to determine the maximum value.
Now if the input representation is not simply a listing of the elements in each set, then other bounds and algorithms may apply. For example, if we know the input sets are sorted and the length of each set is given as part of the input, we can obviously find the maximum elements in time linear only in the number of subsets, not in their total length.
If your sets are implemented in a hash (or, more generally, if you can otherwise check for the presence of a value in the set in O(1) time) you can improve on a brute-force approach.
Instead of iterating through the elements of the subset and maintaining the maximum, iterate over the elements of the parent set in descending order, checking for the presence of those elements in the subset. The first found element is necessarily the subset's maximum. Technically, this still takes O(n) time (n = subset cardinality) in the general case, but will generally carry a great performance benefit in practice. (If you have any data regarding the number and size of the subsets, and they favor this approach, you can improve on O(n) in the average case.)
This approach requires sorting the parent set's elements (n log n), however, so it may only be worthwhile if the number of subsets is much greater than the cardinality of the parent set.
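A short Python sketch of this descending scan (each subset is assumed to be stored as a hash set, i.e. with O(1) membership tests):

def subset_maxima(parent, subsets):
    parent_desc = sorted(parent, reverse=True)    # one shared O(n log n) sort
    maxima = []
    for s in subsets:
        # the first parent element found in s is necessarily max(s)
        maxima.append(next(x for x in parent_desc if x in s))
    return maxima

X = {2, 4, 6, 3, 1}
A1, A2, A3 = {2, 3, 1}, {2, 6, 3, 1}, {4, 2, 3, 1}
print(subset_maxima(X, [A1, A2, A3]))    # [3, 6, 4]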

Finding the hundred largest numbers in a file of a billion

I went to an interview today and was asked this question:
Suppose you have one billion integers which are unsorted in a disk file. How would you determine the largest hundred numbers?
I'm not even sure where I would start on this question. What is the most efficient process to follow to give the correct result? Do I need to go through the disk file a hundred times grabbing the highest number not yet in my list, or is there a better way?
Obviously the interviewers want you to point out two key facts:
You cannot read the whole list of integers into memory, since it is too large. So you will have to read it one by one.
You need an efficient data structure to hold the 100 largest elements. This data structure must support the following operations:
Get-Size: Get the number of values in the container.
Find-Min: Get the smallest value.
Delete-Min: Remove the smallest value to replace it with a new, larger value.
Insert: Insert another element into the container.
By evaluating the requirements for the data structure, a computer science professor would expect you to recommend using a Heap (Min-Heap), since it is designed to support exactly the operations we need here.
For example, for Fibonacci heaps, the operations Get-Size, Find-Min and Insert all are O(1) and Delete-Min is O(log n) (with n <= 100 in this case).
In practice, you could use a priority queue from your favorite language's standard library (e.g. priority_queue from #include <queue> in C++), which is usually implemented using a heap.
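For example, in Python the whole idea is a few lines with heapq (a sketch; numbers can be any iterable, e.g. values parsed one per line from the file):

import heapq

def top_k(numbers, k=100):
    heap = []                            # min-heap of the k largest values seen so far
    for n in numbers:
        if len(heap) < k:
            heapq.heappush(heap, n)
        elif n > heap[0]:                # heap[0] is the smallest of the current top k
            heapq.heapreplace(heap, n)   # pop the min and push n in one O(log k) step
    return sorted(heap, reverse=True)

(heapq.nlargest(100, numbers) wraps up the same bookkeeping in a single call.)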
Here's my initial algorithm:
create array of size 100 [0..99].
read first 100 numbers and put into array.
sort array in ascending order.
while more numbers in file:
get next number N.
if N > array[0]:
if N > array[99]:
shift array[1..99] to array[0..98].
set array[99] to N.
else
find, using binary search, first index i where N <= array[i].
shift array[1..i-1] to array[0..i-2].
set array[i-1] to N.
endif
endif
endwhile
This has the (very slight) advantage that there's no O(n^2) shuffling for the first 100 elements, just an O(n log n) sort, and that you very quickly identify and throw away those that are too small. It also uses a binary search (7 comparisons max) to find the correct insertion point rather than 50 (on average) for a simplistic linear search (not that I'm suggesting anyone else proffered such a solution, just that it may impress the interviewer).
You may even get bonus points for suggesting the use of optimised shift operations like memcpy in C provided you can be sure the overlap isn't a problem.
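For illustration, here is a compact Python sketch of that first scheme, using bisect for the binary search (a sketch, not tuned code):

from bisect import bisect_left, insort

def top100(numbers):
    arr = []                                 # kept sorted ascending; arr[0] is the cutoff
    for n in numbers:
        if len(arr) < 100:
            insort(arr, n)                   # fill phase
        elif n > arr[0]:                     # cheap rejection for most inputs
            i = bisect_left(arr, n)          # binary search for the insertion point
            arr[0:i - 1] = arr[1:i]          # shift the smaller block down one slot
            arr[i - 1] = n
    return arr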
One other possibility you may want to consider is to maintain three lists (of up to 100 integers each):
read first hundred numbers into array 1 and sort them descending.
while more numbers:
read up to next hundred numbers into array 2 and sort them descending.
merge-sort lists 1 and 2 into list 3 (only first (largest) 100 numbers).
if more numbers:
read up to next hundred numbers into array 2 and sort them descending.
merge-sort lists 3 and 2 into list 1 (only first (largest) 100 numbers).
else
copy list 3 to list 1.
endif
endwhile
I'm not sure, but that may end up being more efficient than the continual shuffling.
The merge-sort is a simple selection along the lines of (for merge-sorting lists 1 and 2 into 3):
list3.clear()
while list3.size() < 100:
while list1.peek() >= list2.peek():
list3.add(list1.pop())
endwhile
while list2.peek() >= list1.peek():
list3.add(list2.pop())
endwhile
endwhile
Simply put, pulling the top 100 values out of the combined list by virtue of the fact they're already sorted in descending order. I haven't checked in detail whether that would be more efficient, I'm just offering it as a possibility.
I suspect the interviewers would be impressed with the potential for "out of the box" thinking and the fact that you'd stated that it should be evaluated for performance.
As with most interviews, technical skill is one of the things they're looking at.
Create an array of 100 numbers, all set to -2^31.
Check if the first number you read from disk is greater than the first in the list. If it is, copy the array down one index and update it with the new number. If not, check against the next of the 100, and so on.
When you've finished reading all 1 billion numbers, you should have the highest 100 in the array.
Job done.
I'd traverse the list in order. As I go, I add elements to a set (or multiset, depending on duplicates). Once the set reaches 100 elements, I'd only insert if the value was greater than the min in the set (O(log m)), then delete the min.
Calling the number of values in the list n and the number of values to find m:
this is O(n * log m)
Speed of the processing algorithm is absolutely irrelevant (unless it's completely dumb).
The bottleneck here is I/O (it's specified that they are on disk). So make sure that you work with large buffers.
Keep a fixed array of 100 integers, initialised to Int.MinValue. As you read each of the 1 billion integers, compare it with the number in the first cell of the array (index 0). If it is larger, move up to the next cell; if it is still larger, keep moving up until you hit the end or a smaller value. Then store the value at that index and shift all values in the previous cells one cell down. Do this and you will find the 100 max integers.
I believe the quickest way to do this is by using a very large bit map to record which numbers are present. In order to represent a 32-bit integer this would need to be 2^32 / 8 bytes, which is about 536 MB. Scan through the integers, simply setting the corresponding bit in the bit map. Then look for the highest 100 entries.
NOTE: This finds the highest 100 distinct numbers, not the highest 100 instances of a number, if you see the difference.
This kind of approach is discussed in the very good book Programming Pearls which your interviewer may have read!
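A Python sketch of the bitmap idea (it assumes unsigned 32-bit integers; the final scan is written for clarity rather than speed):

def top100_bitmap(numbers):
    bitmap = bytearray(2 ** 32 // 8)          # one bit per possible value, ~512 MiB
    for n in numbers:
        bitmap[n >> 3] |= 1 << (n & 7)        # mark the value as present
    result = []
    for value in range(2 ** 32 - 1, -1, -1):  # walk down from the top looking for set bits
        if bitmap[value >> 3] & (1 << (value & 7)):
            result.append(value)
            if len(result) == 100:
                break
    return result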
You are going to have to check every number, there is no way around that.
Just as a slight improvement on solutions offered,
Given a list of 100 numbers:
9595
8505
...
234
1
You would check to see if the newly found value is > the min value of our array; if it is, insert it. However, doing a search from bottom to top can be quite expensive, and you may consider taking a divide and conquer approach, for example by evaluating the 50th item in the array and doing a comparison; then you know whether the value needs to be inserted in the first 50 items or the bottom 50. You can repeat this process for a much faster search, as we have eliminated 50% of our search space.
Also consider the data type of the integers. If they are 32-bit integers and you are on a 64-bit system, you may be able to do some clever memory handling and bitwise operations to deal with two numbers from disk at once if they are contiguous in memory.
I think someone should have mentioned a priority queue by now. You just need to keep the current top 100 numbers, know what the lowest is and be able to replace that with a higher number. That's what a priority queue does for you - some implementations may sort the list, but it's not required.
Assuming that the 1 billion + 100 numbers fit into memory,
the best sorting algorithm is heap sort: form a heap and get the first 100 numbers. Complexity is O(n log n) to build the heap plus O(100 log n) to fetch the first 100 numbers.
Improving the solution:
divide the implementation into two heaps (so that insertions are less complex) and, while fetching the first 100 elements, do a merge across the two heaps.
Here's some Python code which implements the algorithm suggested by Ferdinand Beyer above. Essentially it's a heap; the only difference is that deletion has been merged with the insertion operation.
import random
import math
class myds:
""" implement a heap to find k greatest numbers out of all that are provided"""
k = 0
getnext = None
heap = []
def __init__(self, k, getnext ):
""" k is the number of integers to return, getnext is a function that is called to get the next number, it returns a string to signal end of stream """
assert k>0
self.k = k
self.getnext = getnext
def housekeeping_bubbleup(self, index):
if index == 0:
return()
parent_index = int(math.floor((index-1)/2))
if self.heap[parent_index] > self.heap[index]:
self.heap[index], self.heap[parent_index] = self.heap[parent_index], self.heap[index]
self.housekeeping_bubbleup(parent_index)
return()
def insertonly_level2(self, n):
self.heap.append(n)
#pdb.set_trace()
self.housekeeping_bubbleup(len(self.heap)-1)
    def insertonly_level1(self, n):
        """ runs first k times only, can be as slow as i want """
        # while the heap holds fewer than k values, every number must be kept,
        # otherwise the heap could end up with fewer than k elements
        self.insertonly_level2(n)
        return()
def housekeeping_bubbledown(self, index, length):
child_index_l = 2*index+1
child_index_r = 2*index+2
child_index = None
if child_index_l >= length and child_index_r >= length: # No child
return()
elif child_index_r >= length: #only left child
if self.heap[child_index_l] < self.heap[index]: # If the child is smaller
child_index = child_index_l
else:
return()
else: #both child
if self.heap[ child_index_r] < self.heap[ child_index_l]:
child_index = child_index_r
else:
child_index = child_index_l
self.heap[index], self.heap[ child_index] = self.heap[child_index], self.heap[index]
self.housekeeping_bubbledown(child_index, length)
return()
def insertdelete_level1(self, n):
self.heap[0] = n
self.housekeeping_bubbledown(0, len(self.heap))
return()
def insert_to_myds(self, n ):
if len(self.heap) < self.k:
self.insertonly_level1(n)
elif n > self.heap[0]:
#pdb.set_trace()
self.insertdelete_level1(n)
else:
return()
def run(self ):
for n in self.getnext:
self.insert_to_myds(n)
print(self.heap)
# import pdb; pdb.set_trace()
return(self.heap)
def createinput(n):
    input_arr = list(range(n))
    random.shuffle(input_arr)
    with open('input', 'w') as f:          # write the shuffled values, one per line
        for value in input_arr:
            f.write(str(value))
            f.write('\n')

createinput(20)                            # generate some test data first
input_arr = []
with open('input') as f:
    input_arr = [int(x) for x in f]
myds_object = myds(4, iter(input_arr))
output = myds_object.run()
print(output)
If you find the 100th order statistic using quicksort-style partitioning (quickselect), it will work in O(billion) on average. But I doubt that, with such numbers and due to the random access needed for this approach, it would be faster than O(billion * log(100)).
Here is another solution (about an eon later, I have no shame, sorry!) based on the second one provided by @paxdiablo. The basic idea is that you should read another k numbers only if they're greater than the minimum you already have, and that sorting is not really necessary:
// your variables
n = 100
k = a number > n and << 1 billion
create array1[n], array2[k]
read first n numbers into array2
find minimum and maximum of array2
while more numbers:
if number > maximum:
store in array1
if array1 is full: // I don't need contents of array2 anymore
array2 = array1
array1 = []
else if number > minimum:
store in array2
if array2 is full:
x = n - array1.count()
find the x largest numbers of array2 and discard the rest
find minimum and maximum of array2
else:
discard the number
endwhile
// Finally
x = n - array1.count()
find the x largest numbers of array2 and discard the rest
return merge array1 and array2
The critical step is the function for finding the largest x numbers in array2, but you can use the fact that you know the minimum and maximum to speed that function up.
Actually, there are lots of possible optimisations, since you don't really need to sort it; you just need the x largest numbers.
Furthermore, if k is big enough and you have enough memory, you could even turn it into a recursive algorithm for finding the n largest numbers.
Finally, if the numbers are already sorted (in any order), the algorithm is O(n).
Obviously, this is just theoretically because in practice you would use standard sorting algorithms and the bottleneck would probably be the IO.
There are lots of clever approaches (like the priority queue solutions), but one of the simplest things you can do can also be fast and efficient.
If you want the top k of n, consider:
allocate an array of k ints
while more input
perform insertion sort of next value into the array
This may sound absurdly simplistic. You might expect this to be O(n^2), but it's actually only O(k*n), and if k is much smaller than n (as is postulated in the problem statement), it approaches O(n).
You might argue that the constant factor is too high because doing an average of k/2 comparisons and moves per input is a lot. But most values will be trivially rejected on the first comparison against the kth largest value seen so far. If you have a billion inputs, only a small fraction are likely to be larger than the 100th so far.
(You could construct a worst-case input where each value is larger than its predecessor, thus requiring k comparisons and moves for every input. But that is essentially a sorted input, and the problem statement said the input is unsorted.)
Even the binary-search improvement (to find the insertion point) only cuts the comparisons to ceil(log_2(k)), and unless you special case an extra comparison against the kth-so-far, you're much less likely to get the trivial rejection of the vast majority of inputs. And it does nothing to reduce the number of moves you need. Given caching schemes and branch prediction, doing 7 non-consecutive comparisons and then 50 consecutive moves doesn't seem likely to be significantly faster than doing 50 consecutive comparisons and moves. It's why many system sorts abandon Quicksort in favor of insertion sort for small sizes.
Also consider that this requires almost no extra memory and that the algorithm is extremely cache friendly (which may or may not be true for a heap or priority queue), and it's trivial to write without errors.
The process of reading the file is probably the major bottleneck, so the real performance gains are likely to come from there; by using a simple solution for the selection, you can focus your efforts on finding a good buffering strategy for minimizing the I/O.
If k can be arbitrarily large, approaching n, then it makes sense to consider a priority queue or other, smarter, data structure. Another option would be to split the input into multiple chunks, sort each of them in parallel, and then merge.

Generate all subset sums within a range faster than O((k+N) * 2^(N/2))?

Is there a way to generate all of the subset sums s1, s2, ..., sk that fall in a range [A,B] faster than O((k+N)*2^(N/2)), where k is the number of sums in [A,B]? Note that k is only known after we have enumerated all subset sums within [A,B].
I'm currently using a modified Horowitz-Sahni algorithm. For example, I first call it for the smallest sum greater than or equal to A, giving me s1. Then I call it again for the next smallest sum greater than s1, giving me s2. Repeat this until we find a sum s(k+1) greater than B. There is a lot of computation repeated between each iteration, even without rebuilding the initial two 2^(N/2) lists, so is there a way to do better?
In my problem, N is about 15, and the magnitude of the numbers is on the order of millions, so I haven't considered the dynamic programming route.
Check the subset sum on Wikipedia. As far as I know, it's the fastest known algorithm, which operates in O(2^(N/2)) time.
Edit:
If you're looking for multiple possible sums, instead of just 0, you can save the end arrays and just iterate through them again (which is roughly an O(2^(N/2)) operation) and save re-computing them. The set of all possible subset sums doesn't change with the target.
Edit again:
I'm not wholly sure what you want. Are we running K searches for one independent value each, or looking for any subset that has a value in a specific range that is K wide? Or are you trying to approximate the second by using the first?
Edit in response:
Yes, you do get a lot of duplicate work even without rebuilding the list. But if you don't rebuild the list, that's not O(k * N * 2^(N/2)). Building the list is O(N * 2^(N/2)).
If you know A and B right now, you could begin iteration, and then simply not stop when you find the right answer (the bottom bound), but keep going until it goes out of range. That should be roughly the same as solving subset sum for just one solution, involving only +k more ops, and when you're done, you can ditch the list.
More edit:
You have a range of sums, from A to B. First, you solve subset sum problem for A. Then, you just keep iterating and storing the results, until you find the solution for B, at which point you stop. Now you have every sum between A and B in a single run, and it will only cost you one subset sum problem solve plus K operations for K values in the range A to B, which is linear and nice and fast.
s = *i + *j; if s > B then ++i; else if s < A then ++j; else { print s; ... what_goes_here? ... }
No, no, no. I get the source of your confusion now (I misread something), but it's still not as complex as what you had originally. If you want to find ALL combinations within the range, instead of one, you will just have to iterate over all combinations of both lists, which isn't too bad.
Excuse my use of auto. C++0x compiler.
#include <algorithm>
#include <vector>

std::vector<int> sums;
std::vector<int> firstlist;
std::vector<int> secondlist;
// Fill in first/secondlist.
std::sort(firstlist.begin(), firstlist.end());
std::sort(secondlist.begin(), secondlist.end());
auto firstit = firstlist.begin();
auto secondit = secondlist.begin();
// Since we want all in a range, rather than just the first, we need to check all combinations. Horowitz/Sahni is only designed to find one.
for(; firstit != firstlist.end(); firstit++) {
    // reset the inner iterator for every element of the first list
    for(secondit = secondlist.begin(); secondit != secondlist.end(); secondit++) {
int sum = *firstit + *secondit;
if (sum > A && sum < B)
sums.push_back(sum);
}
}
It's still not great. But it could be optimized if you know in advance that N is very large, for example, mapping or hashmapping sums to iterators, so that any given firstit can find any suitable partners in secondit, reducing the running time.
It is possible to do this in O(N*2^(N/2)), using ideas similar to Horowitz Sahni, but we try and do some optimizations to reduce the constants in the BigOh.
We do the following
Step 1: Split into sets of N/2, and generate all possible 2^(N/2) sets for each split. Call them S1 and S2. This we can do in O(2^(N/2)) (note: the N factor is missing here, due to an optimization we can do).
Step 2: Next sort the larger of S1 and S2 (say S1) in O(N*2^(N/2)) time (we optimize here by not sorting both).
Step 3: Find Subset sums in range [A,B] in S1 using binary search (as it is sorted).
Step 4: Next, for each sum in S2, find using binary search the sets in S1 whose union with it gives a sum in range [A,B]. This is O(N*2^(N/2)). At the same time, find if that corresponding set in S2 is in the range [A,B]. The optimization here is to combine loops. Note: This gives you a representation of the sets (in terms of a pair of indices, one into S1 and one into S2), not the sets themselves. If you want all the sets, this becomes O(K + N*2^(N/2)), where K is the number of sets.
Further optimizations might be possible, for instance when sum from S2, is negative, we don't consider sums < A etc.
Since Steps 2,3,4 should be pretty clear, I will elaborate further on how to get Step 1 done in O(2^(N/2)) time.
For this, we use the concept of Gray Codes. Gray codes are a sequence of binary bit patterns in which each pattern differs from the previous pattern in exactly one bit.
Example: 00 -> 01 -> 11 -> 10 is a gray code with 2 bits.
There are gray codes which go through all possible N/2 bit numbers and these can be generated iteratively (see the wiki page I linked to), in O(1) time for each step (total O(2^(N/2)) steps), given the previous bit pattern, i.e. given current bit pattern, we can generate the next bit pattern in O(1) time.
This enables us to form all the subset sums, by using the previous sum and changing that by just adding or subtracting one number (corresponding to the differing bit position) to get the next sum.
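A small Python sketch of that Step 1 idea (illustrative helper name; each successive subset sum is obtained from the previous one by adding or subtracting a single element):

def subset_sums_gray(nums):
    n = len(nums)
    in_set = [False] * n
    total = 0
    yield total                               # the empty subset
    for i in range(1, 2 ** n):
        bit = (i & -i).bit_length() - 1       # the one bit that flips in the Gray code
        if in_set[bit]:
            total -= nums[bit]
        else:
            total += nums[bit]
        in_set[bit] = not in_set[bit]
        yield total

# list(subset_sums_gray([3, 5, 9])) -> [0, 3, 8, 5, 14, 17, 12, 9]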
If you modify the Horowitz-Sahni algorithm in the right way, then it's hardly slower than the original Horowitz-Sahni. Recall that Horowitz-Sahni works with two lists of subset sums: sums of subsets in the left half of the original list, and sums of subsets in the right half. Call these two lists of sums L and R. To obtain subsets that sum to some fixed value A, you can sort R, and then look up a number in R that matches each number in L using a binary search. However, the algorithm is asymmetric only to save a constant factor in space and time. It's a good idea for this problem to sort both L and R.
In my code below I also reverse L. Then you can keep two pointers into R, updated for each entry in L: A pointer to the last entry in R that's too low, and a pointer to the first entry in R that's too high. When you advance to the next entry in L, each pointer might either move forward or stay put, but they won't have to move backwards. Thus, the second stage of the Horowitz-Sahni algorithm only takes linear time in the data generated in the first stage, plus linear time in the length of the output. Up to a constant factor, you can't do better than that (once you have committed to this meet-in-the-middle algorithm).
Here is a Python code with example input:
# Input
terms = [29371, 108810, 124019, 267363, 298330, 368607,
438140, 453243, 515250, 575143, 695146, 840979, 868052, 999760]
(A,B) = (500000,600000)
# Subset iterator stolen from Sage
def subsets(X):
yield []; pairs = []
for x in X:
pairs.append((2**len(pairs),x))
for w in xrange(2**(len(pairs)-1), 2**(len(pairs))):
yield [x for m, x in pairs if m & w]
# Modified Horowitz-Sahni with toolow and toohigh indices
L = sorted([(sum(S),S) for S in subsets(terms[:len(terms)/2])])
R = sorted([(sum(S),S) for S in subsets(terms[len(terms)/2:])])
(toolow,toohigh) = (-1,0)
for (Lsum,S) in reversed(L):
    while toolow < len(R)-1 and R[toolow+1][0] < A-Lsum: toolow += 1
    while toohigh < len(R) and R[toohigh][0] <= B-Lsum: toohigh += 1
for n in xrange(toolow+1,toohigh):
print '+'.join(map(str,S+R[n][1])),'=',sum(S+R[n][1])
"Moron" (I think he should change his user name) raises the reasonable issue of optimizing the algorithm a little further by skipping one of the sorts. Actually, because each list L and R is a list of sizes of subsets, you can do a combined generate and sort of each one in linear time! (That is, linear in the lengths of the lists.) L is the union of two lists of sums, those that include the first term, term[0], and those that don't. So actually you should just make one of these halves in sorted form, add a constant, and then do a merge of the two sorted lists. If you apply this idea recursively, you save a logarithmic factor in the time to make a sorted L, i.e., a factor of N in the original variable of the problem. This gives a good reason to sort both lists as you generate them. If you only sort one list, you have some binary searches that could reintroduce that factor of N; at best you have to optimize them somehow.
At first glance, a factor of O(N) could still be there for a different reason: If you want not just the subset sum, but the subset that makes the sum, then it looks like O(N) time and space to store each subset in L and in R. However, there is a data-sharing trick that also gets rid of that factor of O(N). The first step of the trick is to store each subset of the left or right half as a linked list of bits (1 if a term is included, 0 if it is not included). Then, when the list L is doubled in size as in the previous paragraph, the two linked lists for a subset and its partner can be shared, except at the head:
0
|
v
1 -> 1 -> 0 -> ...
Actually, this linked list trick is an artifact of the cost model and never truly helpful. Because, in order to have pointers in a RAM architecture with O(1) cost, you have to define data words with O(log(memory)) bits. But if you have data words of this size, you might as well store each word as a single bit vector rather than with this pointer structure. I.e., if you need less than a gigaword of memory, then you can store each subset in a 32-bit word. If you need more than a gigaword, then you have a 64-bit architecture or an emulation of it (or maybe 48 bits), and you can still store each subset in one word. If you patch the RAM cost model to take account of word size, then this factor of N was never really there anyway.
So, interestingly, the time complexity for the original Horowitz-Sahni algorithm isn't O(N*2^(N/2)), it's O(2^(N/2)). Likewise the time complexity for this problem is O(K+2^(N/2)), where K is the length of the output.
