Is it possible to find the list of attributes which would yield the greatest sum without brute forcing?

I have about 2M records stored in a table.
Each record has a number and about 5K boolean attributes.
So the table looks something like this.
3, T, F, T, F, T, T, ...
29, F, F, T, F, T, T, ...
...
-87, T, F, T, F, T, T, ...
98, F, F, T, F, F, T, ...
And I defined SUM(A, B) as the sum of the numbers where the Ath and Bth attributes are true.
For example, from the sample data above: SUM(1, 3) = 3 + ... + (-87) because the 1st and the 3rd attributes are T for 3 and -87
3, (T), F, (T), F, T, T, ...
29, (F), F, (T), F, T, T, ...
...
-87, (T), F, (T), F, T, T, ...
98, (F), F, (T), F, F, T, ...
And SUM() can take any number of parameters: SUM(1) and SUM(5, 7, ..., 3455) are all possible.
Are there some smart algorithms for finding a list of attributes L where SUM(L) yields the maximum result?
Obviously, brute forcing is not feasible for this large data set.
It would be awesome if there is a way to find not only the maximum but top N lists.
EDIT
It seems like it is not possible to find THE answer without brute forcing. If I changed the question to find a "good estimation", would there be a good way to do it?
Or, what if I said the cardinality of L is fixed to something like 10, would there be a way to calculate the L?
I would be happy with any.

Unfortunately, this problem is NP-complete. Your options are limited to finding a good but non-maximal solution with an approximation algorithm, or using branch-and-bound and hoping that you don't hit exponential runtime.
Proof of NP-completeness
To prove that your problem is NP-complete, we reduce the set cover problem to your problem. Suppose we have a set U of N elements, and a set S of M subsets of U, where the union of all sets in S is U. The set cover problem asks for the smallest subset T of S such that every element of U is contained in an element of T. If we had a polynomial-time algorithm to solve your problem, we could solve the set cover problem as follows:
First, construct a table with M+N rows and M attributes. The first N rows are "element" rows, each corresponding to an element of U. These have value "negative enough"; -M-1 should be enough. For element row i, the jth attribute is true if the corresponding element is not in the jth set in S.
The last M rows are "set" rows, each corresponding to a set in S. These have value 1. For set row N+i, the ith attribute is false and all others are true.
The values of the element rows are small enough that any choice of attributes that excludes all element rows beats any choice of attributes that includes any element row. Since the union of all sets in S is U, picking all attributes excludes all element rows, so the best choice of attributes is the one that includes the most set rows without including any element rows. By the construction of the table, a choice of attributes will exclude all element rows if the union of the corresponding sets is U, and if it does, its score will be better the fewer attributes it includes. Thus, the best choice of attributes corresponds directly to a minimum cover of S.
If we had a good algorithm to pick a choice of attributes that produces the maximal sum, we could apply it to this table to generate the minimum cover of an arbitrary S. Thus, your problem is as hard as the NP-complete set cover problem, and you should not waste your time trying to come up with an efficient algorithm to generate the perfect choice of attributes.
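For concreteness, here is a small sketch of the table construction used in this reduction. U is a list of elements and S a list of Python sets; rows are returned as (value, attribute_bools) pairs, following the description above:

def set_cover_to_table(U, S):
    # Reduction sketch: attribute j corresponds to the set S[j].
    M = len(S)
    table = []
    # "Element" rows: value -M-1 ("negative enough");
    # attribute j is True iff the element is NOT in S[j].
    for e in U:
        table.append((-M - 1, [e not in S[j] for j in range(M)]))
    # "Set" rows: value 1; attribute i is False for set row i, True elsewhere.
    for i in range(M):
        table.append((1, [j != i for j in range(M)]))
    return table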

You could try a genetic algorithm approach, starting out with a certain (large) number of random attribute combinations, letting the worst x% die and mutating a certain percentage of the remaining population by adding/removing attributes.
There is no guarantee that you will find the optimal answer, but a good chance to find a good one within reasonable time.
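A rough sketch of what such a genetic search could look like, assuming the table is stored as a list of (value, attribute_bools) pairs; the population size, survival fraction and mutation scheme below are arbitrary illustrative choices, not tuned values:

import random

def genetic_search(rows, n_attrs, pop_size=200, generations=100, survive_frac=0.5):
    # An individual is a set of attribute indices; fitness is the SUM it yields.
    def score(attrs):
        return sum(v for v, bits in rows if all(bits[a] for a in attrs))

    # Start from random small attribute combinations.
    population = [set(random.sample(range(n_attrs), random.randint(1, 5)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[:int(pop_size * survive_frac)]  # let the worst x% die
        children = []
        while len(survivors) + len(children) < pop_size:
            child = set(random.choice(survivors))
            for _ in range(random.randint(1, 3)):   # mutate: flip a few attributes
                child ^= {random.randrange(n_attrs)}
            if child:
                children.append(child)
        population = survivors + children
    return max(population, key=score)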

No polynomial algorithms to solve this problem come to my mind. I can only suggest you a greedy heuristic:
For each attribute, compute its expected_score, i.e. the addend it would bring to your SUM if selected alone. In your example, the expected_score of attribute 1 is 3 + (-87) = -84.
Sort the attributes by expected_score in non-increasing order.
Following that order, greedily add attributes to L. Call actual_score the score that the attribute a will actually bring to your sum (it can be better or worse than expected_score, depending on the attributes you already have in L). If actual_score(a) is not strictly positive, discard a.
This will not give you the optimal L, but I think a "fairly good" one.
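A minimal sketch of this heuristic, again assuming the table is available as (value, attribute_bools) pairs and that an empty L keeps every row; the function names are illustrative:

def greedy_attributes(rows, n_attrs):
    def sum_for(attrs):
        return sum(v for v, bits in rows if all(bits[a] for a in attrs))

    # expected_score of an attribute = the sum it would give if selected alone.
    by_expected = sorted(range(n_attrs), key=lambda a: sum_for([a]), reverse=True)
    L, current = [], sum_for([])
    for a in by_expected:
        candidate = sum_for(L + [a])
        if candidate - current > 0:   # keep a only if its actual_score is strictly positive
            L.append(a)
            current = candidate
    return L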

Note: see below why this approach will not give the best results.
My first approach would be to start off with the special case L={} (which should give the sum of all integers) and add that to a list of solutions. From there add possible attributes as restrictions. In the first iteration, try each attribute in turn and remember those that gave a better result. After that iteration, put the remembered ones into a list of solutions.
In the second iteration, try to add another attribute to each of the remembered ones. Remember all those that improved the result. Remove duplicates from the remembered attribute combinations and add these to the list of solutions. Note that {m,n} is the same as {n,m}, so skip redundant combinations in order not to blow up your sets.
Repeat the second iteration until there are no more possible attributes that could be added to improve the final sum. If you then order the list of solutions by their sum, you get the requested solution.
Note that there are ~20G ways to select three attributes out of 5k, so you can't build a data structure containing those but you must absolutely generate them on demand. Still, the sheer amount can produce lots of temporary solutions, so you have to store those efficiently and perhaps even on disk. You can exploit the fact that you only need the previous iteration's solutions for the next iterations, not the ones before.
Another restriction here is that you can end up with less than N best solutions, because all those below L={} are not considered. In that case, I would accept all possible solutions until you have N solutions, and only once you have the N solutions discard those that don't give an improvement over the worst one.
Python code:
# `score` and `extensions` are assumed helpers: score(s) evaluates SUM for the
# attribute set s, and extensions(s) yields s extended by one more attribute.
solutions = [set()]
remembered = [set()]
while remembered:
    tmp = remembered
    remembered = []
    for s in tmp:                      # extend the previous iteration's solutions
        for xs in extensions(s):
            if score(xs) > score(s):
                remembered.append(xs)
    solutions.extend(remembered)
Why this doesn't work:
Consider a temporary solution consisting of the three records
-2, T, F
-2, F, T
+3, F, F
The overall sum of these is -1. When I now select the first attribute, I discard the second and third record, giving a sum of -2. When selecting the second attribute, I discard the first and third, giving the same sum of -2. When selecting both the first and second attribute, I discard all three records, giving a sum of zero, which is an improvement.

Related

Given values v1, v2, ..., which get updated one by one, maintain the maximum over subsets (v_i1, v_i2, ...)

To set some notation, we have an array of size N consisting of non-negative floats V = [v1, v2, ..., vN], as well as M subsets S1, S2, ..., SM of {1, 2, ..., N} (the subsets will overlap). We are interested in the quantities w_j = max(v_i for i in Sj). The problem is to devise a data structure which can maintain w_j as efficiently as possible, while the values in the array V get updated one by one. We should assume that M >> N.
One idea is to construct the "inverse" of the subsets S, namely subsets T1, T2, ..., TN of {1, 2, ..., M} such that i in Sj if and only if j in Ti. Then, if vi is updated, scan every j in Ti and calculate w_j from scratch. This takes O(TN) time, where T is the maximum size of any Ti subset.
I believe I see a way to maintain these in O(T log N) time, but the algorithm involves a rather convoluted structure of copies of binary search trees and lookup tables. Is there a simpler data structure to use or a simple known solution to this problem? Does this problem have a name?
As well, since we have M >> N, it would be ideal to reduce the complexity below O(M), but is this even possible?
Edit: The goal is to construct some data structure which allows efficiently maintaining the maximums when the V array is updated. You cannot construct this data structure in less than O(M) time, but it may be possible to update it in less than that whenever a single entry of the V array changes.
According to my comment, we have M sets that may overlap. On the other hand, each set contains at least one number, so we need to read each of the M sets (of size at least 1) at least once. As a result, the lower bound for this problem is Ω(M).
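For reference, a minimal sketch of the inverse-index baseline described in the question, i.e. rescanning every affected subset from scratch on each update (the O(T*N) approach); the class and method names here are made up for illustration:

class SubsetMaxima:
    def __init__(self, V, S):
        self.V = list(V)
        self.S = [list(s) for s in S]
        self.w = [max(self.V[i] for i in s) for s in self.S]
        # Inverse index: T[i] lists the subsets j that contain position i.
        self.T = [[] for _ in range(len(V))]
        for j, s in enumerate(self.S):
            for i in s:
                self.T[i].append(j)

    def update(self, i, value):
        self.V[i] = value
        for j in self.T[i]:            # rescan each affected subset from scratch
            self.w[j] = max(self.V[k] for k in self.S[j])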

Select the most elements that do not overlap so that the sum of their size is maximized

I'm trying to find an algorithm to the following problem.
Say I have a number of objects A, B, C,...
I have a list of valid combinations of these objects. Each combination is of length 2 or 4.
For example: AF, CE, CEGH, ADFG, and so on.
For combinations of two objects, e.g. AF, the length of the combination is 2. For combinations of four objects, e.g. CEGH, the length of the combination is 4.
I can only pick non-overlapping combinations, i.e. I cannot pick AF and ADFG because both require objects 'A' and 'F'. I can pick combinations AF and CEGH because they do not require common objects.
If my solution consists of only the two combinations AF and CEGH, then my objective is the sum of the length of the combinations, which is 2 + 4 = 6.
Given a list of objects and their valid combinations, how do I pick the most valid combinations that don't overlap with each other so that I maximize the sum of the lengths of the combinations? I do not want to formulate it as an IP as I am working with a problem instance with 180 objects and 10 million valid combinations and solving an IP using CPLEX is prohibitively slow. Looking for some other elegant way to solve it. Can I perhaps convert this to a network? And solve it using a max-flow algorithm? Or a Dynamic program? Stuck as to how to go about solving this problem.
My first attempt at showing this problem to be NP-hard was wrong, as it did not take into account the fact that only combinations of size 2 or 4 were allowed. However, using Jim D.'s suggestion to reduce from 3-dimensional matching (3DM), we can show that the problem is nevertheless NP-hard.
I'll show that the natural decision problem form of your problem ("Given a set O of objects, and a set C of combinations of either 2 or 4 objects from O, and an integer m, does there exist a subset D of C such that all sets in D are pairwise disjoint, and the union of all sets in D has size at least m?") is NP-hard. Clearly the optimisation problem (i.e., your original problem, where we seek an actual subset of combinations that maximises m above) is at least as hard as this problem. (To see that the optimisation problem is not "much" harder than the decision problem, notice that you could first find the maximum m value for which a solution exists using a binary search on m in which you solve a decision problem at each step, and then, once this maximal m value has been found, solving a series of decision problems in which each combination in turn is removed: if the solution after removing some particular combination is still "YES", then it may also be left out of all future problem instances, while if the solution becomes "NO", then it is necessary to keep this combination in the solution.)
Given an instance (X, Y, Z, T, k) of 3DM, where X, Y and Z are sets that are pairwise disjoint from each other, T is a subset of X*Y*Z (i.e., a set of ordered triples with first, second and third components from X, Y and Z, respectively) and k is an integer, our task is to determine whether there is any subset U of T such that |U| >= k and all triples in U are pairwise disjoint (i.e., to answer the question, "Are there at least k non-overlapping triples in T?"). To turn any such instance of 3DM into an instance of your problem, all we need to do is create a fresh 4-combination from each triple in T, by adding a distinct dummy value to each. The set of objects in the constructed instance of your problem will consist of the union of X, Y, Z, and the |T| dummy values we created. Finally, set m to k.
Suppose that the answer to the original 3DM instance is "YES", i.e., there are at least k non-overlapping triples in T. Then each of the k triples in such a solution corresponds to a 4-combination in the input C to your problem, and no two of these 4-combinations overlap, since by construction their 4th elements are all distinct, and by assumption the triples themselves are pairwise disjoint. Thus there are at least m = k non-overlapping 4-combinations in the instance of your problem, so the solution for that problem must also be "YES".
In the other direction, suppose that the solution to the constructed instance of your problem is "YES", i.e., there are at least m non-overlapping 4-combinations in C. We can simply take the first 3 elements of each of the 4-combinations (throwing away the fourth) to produce a set of k = m non-overlapping triples in T, so the answer to the original 3DM instance must also be "YES".
We have shown that a YES-answer to one problem implies a YES-answer to the other, thus a NO-answer to one problem implies a NO-answer to the other. Thus the problems are equivalent. The instance of your problem can clearly be constructed in polynomial time and space. It follows that your problem is NP-hard.
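The construction itself is mechanical; a small sketch in Python, with the triples given as tuples and the dummy objects created as fresh tags (both purely illustrative):

def reduce_3dm_to_combinations(triples):
    # Each triple (x, y, z) in T becomes a 4-combination by appending a
    # dummy object unique to that triple.
    combos = []
    for i, (x, y, z) in enumerate(triples):
        dummy = ("dummy", i)
        combos.append(frozenset([x, y, z, dummy]))
    return combos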
You can reduce this problem to the maximum weighted clique problem, which is, unfortunately, NP-hard.
Build a graph such that every combination is a vertex with weight equal to the length of the combination, and connect vertices if the corresponding combinations do not share any object (i.e. if you can pick both them at the same time). Then, a solution is valid if and only if it is a clique in that graph.
A simple search on google brings up a lot of approximation algorithms for this problem, such as this one.
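For concreteness, a small sketch of the graph construction described above, with combinations represented as Python sets of object labels; for an instance with 10 million combinations this quadratic edge enumeration would of course be far too slow, so it is only meant to illustrate the reduction:

from itertools import combinations as pairs

def build_compatibility_graph(combos):
    # One vertex per combination, weighted by its length; an edge joins two
    # vertices iff the combinations share no object, so a valid selection of
    # combinations is exactly a clique in this graph.
    weights = [len(c) for c in combos]
    edges = set()
    for u, v in pairs(range(len(combos)), 2):
        if not (combos[u] & combos[v]):
            edges.add((u, v))
    return weights, edges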

Finding number of possible sequences in an array, with additional conditions

There is a sequence {a1, a2, a3, a4, ..... aN}. A run is the maximal strictly increasing or strictly decreasing continuous part of the sequence. Eg. If we have a sequence {1,2,3,4,7,6,5,2,3,4,1,2} We have 5 possible runs {1,2,3,4,7}, {7,6,5,2}, {2,3,4}, {4,1} and {1,2}.
Given four numbers N, M, K, L, count the number of possible sequences of N numbers that have exactly M runs, where each number in the sequence is less than or equal to K and the difference between adjacent numbers is less than or equal to L.
The question was asked during an interview.
I could only think of a brute force solution. What is an efficient solution for this problem?
Use dynamic programming. For each number in the sequence, maintain separate counts of maximal increasing and maximal decreasing subsequences ending there. When you incrementally add a new number to the end, you can use these counts to update the counts for the new number. Complexity: O(n^2)
This can be rephrased as a recurrence problem. Look at your problem as finding #(N, M) (assume K and L are fixed; they are used in the recurrence conditions, so propagate accordingly). Now start with the more restricted count functions A(N, M; a) and D(N, M; a), where A counts those sets with last run ascending, D counts those with last run descending, and a is the value of the last element in the set.
Express #(N, M) in terms of A(N, M; a) and D(N, M; a) (it's the sum over all allowable a). You might note that there are relations between the two (like the reflection A(N, M; a) = D(N, M; K-a)) but that won't matter much for the calculation except to speed table filling.
Now A(N, M; a) can be expressed in terms of A(N-1, M; w), A(N-1, M-1; x), D(N-1, M; y) and D(N-1, M-1; z). The idea is that if you start with a set of size N-1 and know the direction of the last run and the value of the last element, you know whether adding element a will add to an existing run or add a run. So you can count the number of possible ways to get what you want from the possibilities of the previous case.
I'll let you write this recursion down. Note that this is where you account for L (only add up those that obey the L distance restriction) and K (look for end cases).
Terminate the recursion using the fact that A(1, 1; a) = 1, A(1, x>1; a) = 0 (and similarly for D).
Now, since this is a multiple recursion, be sure your implementation stores results in a table and begins by trying lookup (commonly called dynamic programming).
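For illustration, here is one way a bottom-up version of this recursion could look in Python. How length-1 sequences are counted (here: as a single run) and the exclusion of equal adjacent values are assumptions, since the problem statement leaves both open:

def count_sequences(N, M, K, L):
    # Count sequences of length N over values 1..K with exactly M runs and
    # adjacent values differing by at most L.
    if N == 1:
        return K if M == 1 else 0
    # A[m][a] / D[m][a]: sequences of the current length with m runs whose
    # last run is ascending / descending and whose last element is a.
    A = [[0] * (K + 1) for _ in range(M + 1)]
    D = [[0] * (K + 1) for _ in range(M + 1)]
    for a in range(1, K + 1):          # initialise for length 2
        for b in range(1, K + 1):
            if 0 < b - a <= L:
                A[1][b] += 1
            elif 0 < a - b <= L:
                D[1][b] += 1
    for _ in range(3, N + 1):
        newA = [[0] * (K + 1) for _ in range(M + 1)]
        newD = [[0] * (K + 1) for _ in range(M + 1)]
        for m in range(1, M + 1):
            for a in range(1, K + 1):
                if A[m][a]:
                    for b in range(a + 1, min(K, a + L) + 1):      # extend ascending run
                        newA[m][b] += A[m][a]
                    if m + 1 <= M:
                        for b in range(max(1, a - L), a):          # start a descending run
                            newD[m + 1][b] += A[m][a]
                if D[m][a]:
                    for b in range(max(1, a - L), a):              # extend descending run
                        newD[m][b] += D[m][a]
                    if m + 1 <= M:
                        for b in range(a + 1, min(K, a + L) + 1):  # start an ascending run
                            newA[m + 1][b] += D[m][a]
        A, D = newA, newD
    return sum(A[M][a] + D[M][a] for a in range(1, K + 1))

For example, count_sequences(3, 2, 3, 2) returns 10, matching a direct enumeration of the strictly up-then-down and down-then-up triples over {1, 2, 3}.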
I suppose you mean by 'brute force solution' what I might mean by 'straightforward solution involving nested-loops over N,M,K,L' ? Sometimes the straightforward solution is good enough. One of the times when the straightforward solution is good enough is when you don't have a better solution. Another of the times is when the numbers are not very large.
With that off my chest I would write the loops in the reverse direction, or something like that. I mean:
Create 2 auxiliary data structures, one to contain the indices of the numbers <=K, one for the indices of the numbers whose difference with their neighbours is <=L.
Run through the list of numbers and populate the foregoing auxiliary data structures.
Find the intersection of the values in those 2 data structures; these will be the indices of interesting places to start searching for runs.
Look in each of the interesting places.
Until someone demonstrates otherwise this is the most efficient solution.

How to find the subset with the greatest number of items in common?

Let's say I have a number of 'known' sets:
1 {a, b, c, d, e}
2 {b, c, d, e}
3 {a, c, d}
4 {c, d}
I'd like a function which takes a set as input (for example {a, c, d, e}) and finds the known set that has the greatest number of elements in common with it and no elements outside it. In other words, the known set of greatest cardinality that is a subset of the input. The answer doesn't have to be a proper subset. The answer in this case would be {a, c, d}.
EDIT: the above example was wrong, now fixed.
I'm trying to find the absolute most efficient way of doing this.
(In the below, I am assuming that the cost of comparing two sets is O(1) for the sake of simplicity. That operation is outside my control so there's no point thinking about it. In truth it would be a function of the cardinality of the two sets being compared.)
Candidate 1:
Generate all subsets of the input, then iterate over the known sets and return the largest one that is a subset. The downside to this is that the complexity will be something like O(2^n × m), where n is the cardinality of the input set and m is the number of 'known' sets.
Candidate 1a (thanks #bratbrat):
Iterate over all 'known' sets and calculate the cardinality of the intersection, and take the one with the highest value. This would be O(n) where n is the number of known sets.
Candidate 2:
Create an inverse table and calculate the euclidean distance between the input and the known sets. This could be quite quick. I'm not clear how I could limit this to include only subsets without a subsequent O(n) filter.
Candidate 3:
Iterate over all known sets and compare against the input. The complexity would be O(n) where n is the number of known sets.
I have at my disposal the set functions built into Python and Redis.
None of these seems particularly great. Ideas? The number of sets may get large (around 100,000 at a guess).
There's no possible way to do this in less than O(n) time... just reading the input is O(n).
A couple ideas:
Sort the sets by size (biggest first), and search for the first set which is a subset of the input set. Once you find one, you don't have to examine the rest.
If the number of possible items which could be in the sets is limited, you could represent them by bit-vectors. Then you could calculate a lookup table to tell you whether a given set is a subset of the input set. (Walk down the bits for each input set under consideration, word by word, indexing each word into the appropriate table. If you find an entry telling you that it's not a subset, again, you can move on directly to the next input set.) Whether this would actually buy you performance, depends on the implementation language. I imagine it would be most effective in a language with primitive integral types, like C or Java.
Take the union of the known sets. This becomes a dictionary of known elements.
Sort the known elements by their value (they're integers, right). This defines a given integer's position in a bit string.
Use the above to define bit strings for each of the known sets. This is a one time operation - the results should be stored to avoid recomputation.
For an input set, run it through the same transform to obtain its bit string.
To get the largest subset, run through the list of known bit strings, taking the intersection (logical and) with the input bit string. Count the '1' elements. Remember the largest one.
http://packages.python.org/bitstring
As mentioned in the comments, this can be paralleled up by subdividing the known sets and giving each thread its own subset to work on. Each thread serves up its best match and then the parent thread picks the best from the threads.
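A sketch of this bit-string pipeline using plain Python integers as bit vectors; note that it also checks that the known set is fully contained in the query ((mask & query_mask) == mask), since counting intersection bits alone could pick a set that is not a subset of the input:

def best_known_subset(known_sets, query):
    universe = sorted(set().union(*known_sets))             # union of known elements
    bit_of = {x: 1 << pos for pos, x in enumerate(universe)}

    def to_mask(s):
        return sum(bit_of[x] for x in s if x in bit_of)

    masks = [to_mask(s) for s in known_sets]                # one-time precomputation
    qmask = to_mask(query)

    best, best_size = None, -1
    for s, m in zip(known_sets, masks):
        if (m & qmask) == m:                                # known set is a subset of the query
            size = bin(m).count("1")
            if size > best_size:
                best, best_size = s, size
    return best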
How many searches are you making? In case you are searching multiple input sets you should be able to pre-process all the known sets (perhaps as a tree structure) and your search time for each query would be in the order of your query set size.
Eg: Create a Trie structure with all the known sets. Make sure to sort each set before inserting them. For the query, follow the links that are in the set.
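One way such a trie could look: each known set is inserted along the path of its sorted elements, and a query walks only the links labelled with elements of the (sorted) query set. The node layout and the choice to return just the best cardinality are illustrative:

class TrieNode:
    def __init__(self):
        self.children = {}
        self.terminal_size = 0      # cardinality of a known set ending here, 0 if none

def insert(root, known_set):
    node = root
    for x in sorted(known_set):
        node = node.children.setdefault(x, TrieNode())
    node.terminal_size = max(node.terminal_size, len(known_set))

def best_subset_size(root, query):
    # Largest known set fully contained in `query` (size only).
    q = sorted(query)
    best = 0
    stack = [(root, 0)]
    while stack:
        node, i = stack.pop()
        best = max(best, node.terminal_size)
        for j in range(i, len(q)):
            child = node.children.get(q[j])
            if child is not None:
                stack.append((child, j + 1))
    return best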

Generate all subset sums within a range faster than O((k+N) * 2^(N/2))?

Is there a way to generate all of the subset sums s1, s2, ..., sk that fall in a range [A,B] faster than O((k+N)*2^(N/2)), where k is the number of sums there are in [A,B]? Note that k is only known after we have enumerated all subset sums within [A,B].
I'm currently using a modified Horowitz-Sahni algorithm. For example, I first call it for the smallest sum greater than or equal to A, giving me s1. Then I call it again for the next smallest sum greater than s1, giving me s2. Repeat this until we find a sum s(k+1) greater than B. There is a lot of computation repeated between each iteration, even without rebuilding the initial two 2^(N/2)-element lists, so is there a way to do better?
In my problem, N is about 15, and the magnitude of the numbers is on the order of millions, so I haven't considered the dynamic programming route.
Check the subset sum problem on Wikipedia. As far as I know, the algorithm given there is the fastest known, operating in O(2^(N/2)) time.
Edit:
If you're looking for multiple possible sums, instead of just 0, you can save the end arrays and just iterate through them again (which is roughly an O(2^(n/2)) operation) and save re-computing them. The values of all the possible subsets don't change with the target.
Edit again:
I'm not wholly sure what you want. Are we running K searches for one independent value each, or looking for any subset that has a value in a specific range that is K wide? Or are you trying to approximate the second by using the first?
Edit in response:
Yes, you do get a lot of duplicate work even without rebuilding the list. But if you don't rebuild the list, that's not O(k * N * 2^(N/2)). Building the list is O(N * 2^(N/2)).
If you know A and B right now, you could begin iteration, and then simply not stop when you find the right answer (the bottom bound), but keep going until it goes out of range. That should be roughly the same as solving subset sum for just one solution, involving only +k more ops, and when you're done, you can ditch the list.
More edit:
You have a range of sums, from A to B. First, you solve subset sum problem for A. Then, you just keep iterating and storing the results, until you find the solution for B, at which point you stop. Now you have every sum between A and B in a single run, and it will only cost you one subset sum problem solve plus K operations for K values in the range A to B, which is linear and nice and fast.
s = *i + *j; if s > B then ++i; else if s < A then ++j; else { print s; ... what_goes_here? ... }
No, no, no. I get the source of your confusion now (I misread something), but it's still not as complex as what you had originally. If you want to find ALL combinations within the range, instead of one, you will just have to iterate over all combinations of both lists, which isn't too bad.
Excuse my use of auto. C++0x compiler.
#include <algorithm>
#include <vector>

std::vector<int> sums;
std::vector<int> firstlist;
std::vector<int> secondlist;
// Fill in first/secondlist. A and B are assumed to be the (inclusive) bounds of the target range.
std::sort(firstlist.begin(), firstlist.end());
std::sort(secondlist.begin(), secondlist.end());
// Since we want all in a range, rather than just the first, we need to check all combinations. Horowitz/Sahni is only designed to find one.
for (auto firstit = firstlist.begin(); firstit != firstlist.end(); ++firstit) {
    for (auto secondit = secondlist.begin(); secondit != secondlist.end(); ++secondit) {
        int sum = *firstit + *secondit;
        if (sum >= A && sum <= B)
            sums.push_back(sum);
    }
}
It's still not great. But it could be optimized if you know in advance that N is very large, for example, mapping or hashmapping sums to iterators, so that any given firstit can find any suitable partners in secondit, reducing the running time.
It is possible to do this in O(N*2^(N/2)), using ideas similar to Horowitz Sahni, but we try and do some optimizations to reduce the constants in the BigOh.
We do the following
Step 1: Split into sets of N/2, and generate all possible 2^(N/2) sets for each split. Call them S1 and S2. This we can do in O(2^(N/2)) (note: the N factor is missing here, due to an optimization we can do).
Step 2: Next sort the larger of S1 and S2 (say S1) in O(N*2^(N/2)) time (we optimize here by not sorting both).
Step 3: Find Subset sums in range [A,B] in S1 using binary search (as it is sorted).
Step 4: Next, for each sum in S2, use binary search to find the sets in S1 which, combined with it, give a sum in the range [A,B]. This is O(N*2^(N/2)). At the same time, check whether the corresponding sum from S2 alone is in the range [A,B]. The optimization here is to combine the loops. Note: This gives you a representation of the sets (in terms of indexes into S1 and S2), not the sets themselves. If you want all the sets, this becomes O(K + N*2^(N/2)), where K is the number of sets.
Further optimizations might be possible; for instance, when the sum from S2 is negative, we don't consider sums < A, etc.
Since Steps 2,3,4 should be pretty clear, I will elaborate further on how to get Step 1 done in O(2^(N/2)) time.
For this, we use the concept of Gray Codes. Gray codes are a sequence of binary bit patterns in which each pattern differs from the previous pattern in exactly one bit.
Example: 00 -> 01 -> 11 -> 10 is a gray code with 2 bits.
There are gray codes which go through all possible N/2 bit numbers and these can be generated iteratively (see the wiki page I linked to), in O(1) time for each step (total O(2^(N/2)) steps), given the previous bit pattern, i.e. given current bit pattern, we can generate the next bit pattern in O(1) time.
This enables us to form all the subset sums, by using the previous sum and changing that by just adding or subtracting one number (corresponding to the differing bit position) to get the next sum.
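A sketch of Step 1 using this Gray-code update: each successive subset sum is obtained from the previous one by adding or subtracting a single element, so enumerating all sums of one half costs O(1) per subset after the first:

def all_subset_sums(nums):
    # Enumerate the 2^n subset sums of `nums` in Gray-code order.
    n = len(nums)
    total = 0
    included = [False] * n
    sums = [0]                          # the empty subset
    for k in range(1, 2 ** n):
        # The bit that flips between the Gray codes of k-1 and k is the
        # lowest set bit of k.
        bit = (k & -k).bit_length() - 1
        if included[bit]:
            total -= nums[bit]
        else:
            total += nums[bit]
        included[bit] = not included[bit]
        sums.append(total)
    return sums

For example, all_subset_sums([1, 2, 4]) returns [0, 1, 3, 2, 6, 7, 5, 4].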
If you modify the Horowitz-Sahni algorithm in the right way, then it's hardly slower than original Horowitz-Sahni. Recall that Horowitz-Sahni works with two lists of subset sums: sums of subsets in the left half of the original list, and sums of subsets in the right half. Call these two lists of sums L and R. To obtain subsets that sum to some fixed value A, you can sort R, and then look up a number in R that matches each number in L using a binary search. However, the algorithm is asymmetric only to save a constant factor in space and time. It's a good idea for this problem to sort both L and R.
In my code below I also reverse L. Then you can keep two pointers into R, updated for each entry in L: A pointer to the last entry in R that's too low, and a pointer to the first entry in R that's too high. When you advance to the next entry in L, each pointer might either move forward or stay put, but they won't have to move backwards. Thus, the second stage of the Horowitz-Sahni algorithm only takes linear time in the data generated in the first stage, plus linear time in the length of the output. Up to a constant factor, you can't do better than that (once you have committed to this meet-in-the-middle algorithm).
Here is a Python code with example input:
# Input
terms = [29371, 108810, 124019, 267363, 298330, 368607,
         438140, 453243, 515250, 575143, 695146, 840979, 868052, 999760]
(A,B) = (500000,600000)
# Subset iterator stolen from Sage
def subsets(X):
    yield []; pairs = []
    for x in X:
        pairs.append((2**len(pairs),x))
        for w in xrange(2**(len(pairs)-1), 2**(len(pairs))):
            yield [x for m, x in pairs if m & w]
# Modified Horowitz-Sahni with toolow and toohigh indices (Python 2)
L = sorted([(sum(S),S) for S in subsets(terms[:len(terms)/2])])
R = sorted([(sum(S),S) for S in subsets(terms[len(terms)/2:])])
(toolow,toohigh) = (-1,0)
for (Lsum,S) in reversed(L):
    # Bounds checks come first so the indices never run past the end of R.
    while toolow < len(R)-1 and R[toolow+1][0] < A-Lsum: toolow += 1
    while toohigh < len(R) and R[toohigh][0] <= B-Lsum: toohigh += 1
    for n in xrange(toolow+1,toohigh):
        print '+'.join(map(str,S+R[n][1])),'=',sum(S+R[n][1])
"Moron" (I think he should change his user name) raises the reasonable issue of optimizing the algorithm a little further by skipping one of the sorts. Actually, because each list L and R is a list of sizes of subsets, you can do a combined generate and sort of each one in linear time! (That is, linear in the lengths of the lists.) L is the union of two lists of sums, those that include the first term, term[0], and those that don't. So actually you should just make one of these halves in sorted form, add a constant, and then do a merge of the two sorted lists. If you apply this idea recursively, you save a logarithmic factor in the time to make a sorted L, i.e., a factor of N in the original variable of the problem. This gives a good reason to sort both lists as you generate them. If you only sort one list, you have some binary searches that could reintroduce that factor of N; at best you have to optimize them somehow.
At first glance, a factor of O(N) could still be there for a different reason: If you want not just the subset sum, but the subset that makes the sum, then it looks like O(N) time and space to store each subset in L and in R. However, there is a data-sharing trick that also gets rid of that factor of O(N). The first step of the trick is to store each subset of the left or right half as a linked list of bits (1 if a term is included, 0 if it is not included). Then, when the list L is doubled in size as in the previous paragraph, the two linked lists for a subset and its partner can be shared, except at the head:
0
|
v
1 -> 1 -> 0 -> ...
Actually, this linked list trick is an artifact of the cost model and never truly helpful. Because, in order to have pointers in a RAM architecture with O(1) cost, you have to define data words with O(log(memory)) bits. But if you have data words of this size, you might as well store each word as a single bit vector rather than with this pointer structure. I.e., if you need less than a gigaword of memory, then you can store each subset in a 32-bit word. If you need more than a gigaword, then you have a 64-bit architecture or an emulation of it (or maybe 48 bits), and you can still store each subset in one word. If you patch the RAM cost model to take account of word size, then this factor of N was never really there anyway.
So, interestingly, the time complexity for the original Horowitz-Sahni algorithm isn't O(N*2^(N/2)), it's O(2^(N/2)). Likewise the time complexity for this problem is O(K+2^(N/2)), where K is the length of the output.
