algorithm to find products of a set of primes, in order, greater than x


Consider the finite set {2,3,5,...,n}. I am interested in primes but the question could apply to any set of numbers. I want to find all possible products of these numbers in ascending order, and in particular greater than or equal to some number x. Does anyone know a nice algorithm for this?
EDIT to clarify:
Each factor in the input set may be used any number of times. If the input were {2,3,5,7} the output would be {2,3,4,5,6,7,8,9,10,12,14,15,16,18,...}. The algorithm can stop as soon as it produces a result greater than or equal to some number x.

A Haskell implementation, as seen in this answer:
    hamm :: [Integer] -> [Integer]
    hamm []     = []
    hamm (p:ps) = xs        -- e.g. hamm [2,3,5]
      where xs = merge (hamm ps)            --   H({p} ∪ ps) = S,
                       (p : map (p*) xs)    -- S ⊇ {p} ∪ H(ps) ∪ { p*x | x ∊ S }

    merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b
                            | otherwise = y : merge a ys
    merge [] b = b
    merge a [] = a
merge here doesn't try to eliminate multiples, because there won't be any -- but only in case you're using only the primes in the input:
~> take 20 $ hamm [2,3,5,7]
[2,3,4,5,6,7,8,9,10,12,14,15,16,18,20,21,24,25,27,28]
If not, you need to use union instead, which drops equal elements:
    union a@(x:xs) b@(y:ys) | x < y     = x : union xs b
                            | x > y     = y : union a ys
                            | otherwise = x : union xs ys
    union [] b = b
    union a [] = a
Starting from (above) a given value efficiently might be an interesting challenge. A directly slice-generating code at the bottom of this answer could be taken as a starting point.
In general it is easy to skip along the ordered sequence until a value is passed over. In Haskell, it is done with a built-in dropWhile (< n),
~> take 10 $ dropWhile (< 100) $ hamm [2,3,5,7]
[100,105,108,112,120,125,126,128,135,140]
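When an upper bound is also known, skipping can be avoided by enumerating exponent combinations directly and keeping only the products that fall inside a window. The following Python sketch illustrates that idea; the name products_in_range and the upper bound hi are assumptions made for this example, and it is not the slice-generating code referred to above. It assumes the inputs are distinct primes.

```python
def products_in_range(primes, lo, hi):
    """Return all products of the given primes (each usable any number of
    times, at least one factor in total) with lo <= product < hi, sorted."""
    found = []

    def extend(start, value):
        # Multiply only by primes[start:], so each product is built exactly
        # once (its factors are appended in non-decreasing index order).
        for i in range(start, len(primes)):
            nxt = value * primes[i]
            if nxt >= hi:
                continue  # the primes need not be sorted, so keep scanning
            if nxt >= lo:
                found.append(nxt)
            extend(i, nxt)

    extend(0, 1)
    return sorted(found)


print(products_in_range([2, 3, 5, 7], 100, 141))
# [100, 105, 108, 112, 120, 125, 126, 128, 135, 140]
```

The recursion depth is bounded by log2(hi), since every step at least doubles the value.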

(Edit: made it produce all products in ascending order; let users filter them as they wish. This is a generalised Hamming numbers problem)
    genHamming :: Integral a => [a] -> [a]
    genHamming zs = hmng where
        hmng = 1 : foldr (||) [] [map (z*) hmng | z <- zs]
        []     || ys             = ys
        xs     || []             = xs
        (x:xs) || (y:ys) | x==y  = x : (xs || ys)
                         | x<y   = x : (xs || (y:ys))
                         | y<x   = y : (ys || (x:xs))
Example usage
Prelude Hamming> take 10 $ dropWhile (< 1000) $ genHamming [2,3,5]
[1000,1024,1080,1125,1152,1200,1215,1250,1280,1296]

You probably also want to include 2^0 * 3^0 * 5^0 * 7^0 = 1 in your output.
The way to do this is with a priority queue. If k is in the sequence, so are 2k, 3k, 5k and 7k. Start your output with 1, then add 2, 3, 5, and 7 to the priority queue. Pop 2 from the top of the queue and add 2*2=4, 2*3=6, 2*5=10 and 2*7=14 to the queue; the queue at that point will contain 3, 4, 5, 6, 7, 10 and 14. Pop 3 from the top of the queue and add 3*2=6, 3*3=9, 3*5=15 and 3*7=21 to the queue. And so on.
You will discover that many elements are duplicated; for instance, we added 6 to the priority queue twice in the example above. You can either add duplicates, and check each time you pop the queue if the element is the same as the prior member of the sequence, or you can keep a separate list of items in the queue and refrain from adding duplicates in the first place.
I discuss a priority queue that contains only distinct elements at my blog.
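The priority-queue procedure can be sketched in Python with the standard heapq module. This is a minimal illustration, not the blog's implementation: the function name products_by_heap is made up here, and duplicates are handled by the first of the two options above (push everything, skip repeats when popping).

```python
import heapq
from itertools import islice


def products_by_heap(factors):
    """Yield 1 and then every product of the factors in ascending order."""
    heap = [1]
    last = None
    while True:
        k = heapq.heappop(heap)
        if k == last:
            continue  # same value as the one just produced: a duplicate
        last = k
        yield k
        for f in factors:
            heapq.heappush(heap, k * f)  # if k is in the sequence, so is f*k


print(list(islice(products_by_heap([2, 3, 5, 7]), 15)))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18]
```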

Every integer greater than 1 is the product of a 'set of primes' because it is the product of its prime factors. It might be easier to start at your desired minimum number and strike out all numbers that have a prime factor not in your initial set. Continue the process until your result set is large enough. In effect you are doing a modified Sieve of Eratosthenes, removing all multiples of the primes not in your initial set.
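A simple way to try this idea is to test each candidate by dividing out the allowed primes and checking whether anything is left. Note that this is plain trial division per candidate rather than a true sieve, and the helper names below are invented for the sketch.

```python
from itertools import count, islice


def is_product_of(n, primes):
    """True if n > 1 has no prime factor outside `primes`."""
    if n < 2:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1  # anything left over is a factor not in the set


def products_from(x, primes):
    """Lazily yield, in ascending order, every qualifying integer >= x."""
    return (n for n in count(max(x, 2)) if is_product_of(n, primes))


print(list(islice(products_from(100, [2, 3, 5, 7]), 10)))
# [100, 105, 108, 112, 120, 125, 126, 128, 135, 140]
```

The cost grows with how sparse the products are near x, so this is mainly attractive for small x or dense prime sets.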

Because our application is written in python I came up with the following implementation which I wanted to share:
    def powers(x):
        y = x
        while True:
            yield y
            y *= x

    def products(factors):
        y0 = factors[0]
        if len(factors) == 1:
            yield from powers(y0)
        else:
            yield y0
            g1 = products(factors)
            y1 = y0 * next(g1)
            g2 = products(factors[1:])
            y2 = next(g2)
            while True:
                if y1 < y2:
                    yield y1
                    y1 = y0 * next(g1)
                else:
                    yield y2
                    y2 = next(g2)

    if __name__ == "__main__":
        import itertools
        for n in itertools.islice(products([2, 3, 5, 7]), 10**6):
            print(n)
No doubt the use of recursion with generators could be improved upon, but in practice the performance is good enough for our application. Beyond that, I'm still interested in how to start at a given minimum value efficiently, as mentioned in Will Ness' answer. Thanks to all those who contributed.

There are two algorithms that come to mind for this. First, you could calculate all possible products of the numbers and then sort them. Although this seems to be the naive approach, you could speed it up by 'remembering' the last product, dividing out one number, and multiplying a different number in its place. This would greatly reduce the number of operations necessary if, with the correct ordering of your permutations (look into Gray codes), you could minimize the total number of multiplications.
On the other hand, you could compute the set of all the original numbers, the set of products of pairs of original numbers, the set of products of three, and so on. Then you can sort each individual set (it shouldn't be hard to ensure they are nearly sorted anyway) and merge the sets together into one sorted list of products. This would take more operations, but each set would be nearly sorted, so the final list may take less time to construct overall.
The other algorithm would be to take the product of all the primes of interest, and call this P. Construct another list of all the original primes squared. Now, loop over all numbers up to P, and test if those are divisible by any of the values in the primes-squared array. If they are, toss them. If not, then add them to the output array. You can optimize this by only testing divisibility up to sqrt(i), where i is the iteration in your for loop. This is still likely slower than the above method though.

As every prime factor is allowed to appear many times, the sequence is infinite. So we can't generate all products and then sort them. We have to generate the sequence iteratively.
If a is a member of the sequence, then {2*a, 3*a, 5*a ... n*a} will also be members of the sequence, coming later.
So, the algorithm that comes to my mind is to have a (sorted, duplicate-free) buffer of the next sequence members. We extract and present the smallest value and insert all of its multiples into the buffer.
As it's difficult to predict the buffer content for your starting number x, this algorithm should start from the beginning and ignore the results until they reach the x threshold.
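A variation on the buffer idea avoids duplicates by construction instead of checking for them: each buffered value remembers the largest prime used so far and is only extended by primes at or above it. The sketch below assumes the inputs are distinct primes (so factorizations are unique); the name products_at_least is invented for this example, and it starts from the beginning and discards values below x as described above.

```python
import heapq
from itertools import islice


def products_at_least(primes, x):
    """Yield products of `primes` (one or more factors) that are >= x, ascending."""
    primes = sorted(primes)
    # Each entry is (value, index of the largest prime used so far).
    buffer = [(p, i) for i, p in enumerate(primes)]
    heapq.heapify(buffer)
    while True:
        value, i = heapq.heappop(buffer)
        if value >= x:
            yield value
        # Extend only with primes at or above index i: every product then
        # enters the buffer exactly once, so no duplicate check is needed.
        for j in range(i, len(primes)):
            heapq.heappush(buffer, (value * primes[j], j))


print(list(islice(products_at_least([2, 3, 5], 1000), 10)))
# [1000, 1024, 1080, 1125, 1152, 1200, 1215, 1250, 1280, 1296]
```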


Function including random number that can be inverted without the random number

Given a number x and a random number n, I am looking for two functions F and G so that:
y = F(x, n) where y is different for different values of n
x = G(y)
all numbers are (large, e.g. 256 bit) integers
For instance, given a list of numbers k1, k2, k3, k4 generated by applying F multiple times, it is possible to calculate k3 from k4 but not k4 from k3 (the random number prevents the inversion).
The problem is easy if we allow n (or something derived from it) to be used in G (it is basically asymmetric encryption), but that is not the goal.
Any idea?
Update
I found a function that works with infinite precision: F(x, n) = x * pow(p, n), where p is coprime to x.
    x = 29
    p = 5
    n = 20

    def f(x, n):
        return x * pow(p, n)

    y = f(x, n)  # => 2765655517578125
and G becomes
    def g(y):
        x = y
        while x % p == 0:
            x = x // p  # integer division, so x stays an exact integer
        return x

    g(y)  # => 29
Unfortunately this fails with overflow as soon as numbers become big (limited precision)
Second update: the problem has no solution
In fact let's start from a situation where the problem has a solution, which is when the domain of G and F is R.
In that case, choosing a random output from any function F' that has multiple outputs will work.
For instance, if F(x, n) = acos(x) + 2nπ, where n is a random integer,
then G(y) = cos(y). From y it is always possible to go back to x, but not the other way round without knowing n.
A similar example can be built with operation with module, which will work with Integer domains without the need of real numbers.
Anyway this will fail when the domain is the same finite set (like on physical memory) for F and G. It can be proved by contradiction.
Let's assume that for finite domains D1 = D2 of size N, a function F: D1 -> D2 exists that produces M outputs for each x, where M > 1.
Assuming that the function produces at least one output for each x in D1,
1 either |D2| > |D1|
2 or the outputs of F are the same for different values of x (some overlap must exist)
Now 1 contradicts the requirement that D1 = D2, while 2 contradicts the requirement that G(y) has a single output value.
If we relax 1 and allow |D2| > |D1|, then we can solve the problem. This can be done by adding n (or something derived from it), as suggested in some comments. For my specific scenario it probably makes more sense to use an EC public/private key, but that is another story.
Many Thanks

Based on your requirements, the following should work. If there is some other requirement that I did not understand from your question, please clarify, because this seems to suffice based on your definition. In that case, I will change or delete this answer.
f(x, n) = x | n;
g(y | n) = y;
where | means concatenation of bits. We can assign a fixed (maximum) number of bits for n and pad with zeros.
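With Python integers, the concatenation can be written with shifts. This is a minimal sketch under the assumption that n is given a fixed width of N_BITS bits (the constant and its value are chosen here for illustration only). Note that the output is N_BITS bits longer than x.

```python
N_BITS = 64  # assumed fixed width reserved for the random number n


def f(x, n):
    """Append the N_BITS-bit representation of n to the bits of x."""
    if not 0 <= n < (1 << N_BITS):
        raise ValueError("n does not fit in N_BITS bits")
    return (x << N_BITS) | n


def g(y):
    """Drop the low N_BITS bits, recovering x without knowing n."""
    return y >> N_BITS


assert g(f(29, 12345)) == 29
assert f(29, 1) != f(29, 2)  # different n give different y
```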

there can be no solution for this problem because:
for a constant x1 and variable r you would have an output set with all Integers in it.
for a constant x2 and variable r again you would have an output set with all Integers in it.
so at best you can have a function g which would take a number from the output set of function f and return all possible answers which are infinite.
this is similar to writing a reverse hashing function; which defies logic.

Conditional sampling of binary vectors (?)

I'm trying to find a name for my problem, so I don't have to re-invent wheel when coding an algorithm which solves it...
I have say 2,000 binary (row) vectors and I need to pick 500 from them. In the picked sample I do column sums and I want my sample to be as close as possible to a pre-defined distribution of the column sums. I'll be working with 20 to 60 columns.
A tiny example:
Out of the vectors:
110
010
011
110
100
I need to pick 2 to get column sums 2, 1, 0. The solution (exact in this case) would be
110
100
My ideas so far
one could maybe call this a binary multidimensional knapsack, but I did not find any algos for that
Linear Programming could help, but I'd need some step by step explanation as I got no experience with it
as exact solution is not always feasible, something like simulated annealing brute force could work well
a hacky way using constraint solvers comes to mind - first set the constraints tight and gradually loosen them until some solution is found - given that CSP should be much faster than ILP...?

My concrete, practical (if the approximation guarantee works out for you) suggestion would be to apply the maximum entropy method (in Chapter 7 of Boyd and Vandenberghe's book Convex Optimization; you can probably find several implementations with your favorite search engine) to find the maximum entropy probability distribution on row indexes such that (1) no row index is more likely than 1/500 (2) the expected value of the row vector chosen is 1/500th of the predefined distribution. Given this distribution, choose each row independently with probability 500 times its distribution likelihood, which will give you 500 rows on average. If you need exactly 500, repeat until you get exactly 500 (shouldn't take too many tries due to concentration bounds).

Firstly I will make some assumptions regarding this problem:
Regardless whether the column sum of the selected solution is over or under the target, it weighs the same.
The sum of the first, second, and third column are equally weighted in the solution (i.e. If there's a solution whereas the first column sum is off by 1, and another where the third column sum is off by 1, the solution are equally good).
The closest problem I can think of is the Subset sum problem, which itself can be thought of as a special case of the Knapsack problem.
However, both of these problems are NP-complete. This means that no polynomial-time algorithm is known that solves them, even though it is easy to verify a solution.
In my view, the two most promising approaches to this problem are linear programming and machine learning.
Depending on how many columns you are optimising in this problem, with linear programming you can control how finely tuned you want the solution to be, in exchange for time. You should read up on this, because it is fairly simple and efficient.
With machine learning, you need a lot of data sets (the set of vectors and the set of solutions). You don't even need to specify what you want; a lot of machine learning algorithms can generally deduce what you want them to optimise based on your data set.
Both solutions have pros and cons; you should decide which one to use yourself based on the circumstances and problem set.

This can definitely be modeled as an (integer!) linear program (many problems can). Once you have the model, you can use a program such as lp_solve to solve it.
We model whether vector i is selected as x_i, which can be 0 or 1.
Then for each column c, we have a constraint:
sum of all (x_i * value of i in column c) = target for column c
Taking your example, in lp_solve this could look like:
min: ;
+x1 +x4 +x5 >= 2;
+x1 +x4 +x5 <= 2;
+x1 +x2 +x3 +x4 <= 1;
+x1 +x2 +x3 +x4 >= 1;
+x3 <= 0;
+x3 >= 0;
bin x1, x2, x3, x4, x5;
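For an instance this small, the same 0/1 model can be checked by brute force in Python, which is a convenient way to see what the constraints accept. This is only a sanity check, not a substitute for a solver, and the extra condition on the number of selected vectors (pick) is an assumption added here to mirror the "pick 2" requirement of the question.

```python
from itertools import product

vectors = [(1, 1, 0), (0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 0, 0)]
target = (2, 1, 0)
pick = 2  # assumed extra constraint: number of selected vectors


def feasible_selections(vectors, target, pick):
    """Enumerate all 0/1 assignments x and keep those meeting every constraint."""
    solutions = []
    for x in product((0, 1), repeat=len(vectors)):
        if sum(x) != pick:
            continue
        # One equality per column: sum_i x_i * vectors[i][c] == target[c]
        sums = tuple(sum(xi * v[c] for xi, v in zip(x, vectors))
                     for c in range(len(target)))
        if sums == target:
            solutions.append(x)
    return solutions


print(feasible_selections(vectors, target, pick))
# [(0, 0, 0, 1, 1), (1, 0, 0, 0, 1)]
```

Both solutions pair one of the two 110 rows with 100. Enumeration is exponential in the number of vectors, so it is not an option for 2,000 rows.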

If you are fine with a heuristic based search approach, here is one.
Go over the list and, for each bit string, compute the sum of the squared digit-wise differences between the string and the goal; pick the string with the minimum. For example, if we are looking for 2, 1, 0, and we are scoring 0, 1, 1, we would do it in the following way:
Take the digit-wise (absolute) difference:
2, 0, 1
Square the digit-wise difference:
4, 0, 1
Sum:
5
As a side note, squaring the difference when scoring is a common method when doing heuristic search. In your case, it makes sense because bit strings that have a 1 as the first digit are a lot more interesting to us. In your case this simple algorithm would pick first 110, then 100, which is the best solution.
In any case, there are some optimizations that could be made to this, I will post them here if this kind of approach is what you are looking for, but this is the core of the algorithm.
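One way to read the scoring above as a complete procedure is a greedy loop: pick the best-scoring string, subtract it from the goal, and score the rest against what is still missing. The residual update is an interpretation assumed for this sketch (the answer only describes the scoring step), and greedy_pick is a made-up name.

```python
def greedy_pick(vectors, target, count):
    """Pick `count` vectors one at a time, each time taking the vector with the
    smallest squared distance to what is still missing from the target."""
    remaining = list(vectors)
    residual = list(target)
    chosen = []
    for _ in range(count):
        best = min(remaining,
                   key=lambda v: sum((r - b) ** 2 for r, b in zip(residual, v)))
        remaining.remove(best)  # each row can be used only once
        chosen.append(best)
        residual = [r - b for r, b in zip(residual, best)]
    return chosen


vectors = [(1, 1, 0), (0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 0, 0)]
print(greedy_pick(vectors, (2, 1, 0), 2))
# [(1, 1, 0), (1, 0, 0)]
```

Like any greedy method, this can get stuck in a poor selection on larger inputs, since an early pick is never revisited.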

You have a given target vector. You want to select M vectors out of N whose sum is closest to the target. Let's say you use the Euclidean distance to measure whether one selection is better than another.
If you want an exact sum, have a look at the k-sum problem, which is a generalization of the 3SUM problem. The problem is harder than the subset sum problem, because you want an exact number of elements to add up to a target value. There is a solution in O(N^(M/2) · lg N), but that means more than 2000^250 * 7.6 > 10^826 operations in your case (in the favorable case where vector operations have a cost of 1).
First conclusion: do not try to get an exact result unless your vectors have some characteristics that may reduce the complexity.
Here's a hill climbing approach:
sort the vectors by number of 1's: 111... first, 000... last;
use the polynomial time approximate algorithm for the subset sum;
you have an approximate solution with K elements. Because of the order of the elements (the big ones come first), K should be as small as possible:
if K >= M, you take the first M vectors of the solution and that's probably near the best you can do.
if K < M, you can remove the first vector and try to replace it with 2 or more vectors from the rest of the N vectors, using the same technique, until you have M vectors. To summarize: split the big vectors into smaller ones until you reach the correct number of vectors.
Here's a proof of concept with numbers, in Python:
    import random

    def distance(x, y):
        return abs(x-y)

    def show(ls):
        if len(ls) < 10:
            return str(ls)
        else:
            return ", ".join(map(str, ls[:5]+("...",)+ls[-5:]))

    def find(is_xs, target):
        # see https://en.wikipedia.org/wiki/Subset_sum_problem#Pseudo-polynomial_time_dynamic_programming_solution
        S = [(0, ())] # we store indices along with values to get the path
        for i, x in is_xs:
            T = [(x + t, js + (i,)) for t, js in S]
            U = sorted(S + T)
            y, ks = U[0]
            S = [(y, ks)]
            for z, ls in U:
                if z == target: # use the euclidean distance here if you want an approximation
                    return ls
                if z != y and z < target:
                    y, ks = z, ls
                    S.append((z, ls))

        ls = S[-1][1] # take the closest element to target
        return ls

    N = 2000
    M = 500
    target = 1000
    xs = [random.randint(0, 10) for _ in range(N)]
    print("Take {} numbers out of {} to make a sum of {}".format(M, N, target))
    xs = sorted(xs, reverse = True)
    is_xs = list(enumerate(xs))
    print("Sorted numbers: {}".format(show(tuple(is_xs))))

    ls = find(is_xs, target)
    print("FIRST TRY: {} elements ({}) -> {}".format(len(ls), show(ls), sum(x for i, x in is_xs if i in ls)))

    splits = 0
    while len(ls) < M:
        first_x = xs[ls[0]]
        js_ys = [(i, x) for i, x in is_xs if i not in ls and x != first_x]
        replace = find(js_ys, first_x)
        splits += 1
        if len(replace) < 2 or len(replace) + len(ls) - 1 > M or sum(xs[i] for i in replace) != first_x:
            print("Give up: can't replace {}.\nAdd the lowest elements.".format(first_x))
            ls += tuple([i for i, x in is_xs if i not in ls][len(ls)-M:])
            break

        print("Replace {} (={}) by {} (={})".format(ls[:1], first_x, replace, sum(xs[i] for i in replace)))
        ls = tuple(sorted(ls[1:] + replace)) # use a heap?
        print("{} elements ({}) -> {}".format(len(ls), show(ls), sum(x for i, x in is_xs if i in ls)))

    print("AFTER {} splits, {} -> {}".format(splits, ls, sum(x for i, x in is_xs if i in ls)))
The result is obviously not guaranteed to be optimal.
Remarks:
Complexity: find has a polynomial time complexity (see the Wikipedia page) and is called at most M^2 times, hence the complexity remains polynomial. In practice, the process is reasonably fast (split calls have a small target).
Vectors: to ensure that you reach the target with the minimum number of elements, you can improve the order of the elements. Your target is (t_1, ..., t_c): if you sort the t_j from max to min, you get the more important columns first. You can then sort the vectors by number of 1s and then by the presence of a 1 in the most important columns. E.g. target = 4 8 6 => 1 1 1 > 0 1 1 > 1 1 0 > 1 0 1 > 0 1 0 > 0 0 1 > 1 0 0 > 0 0 0.
find (Vectors): if the current sum exceeds the target in all the columns, you can no longer get closer to the target (any vector you add to the current sum will take you farther from it): don't add the sum to S (this is the z >= target case for numbers).

I propose a simple ad hoc algorithm, which, broadly speaking, is a kind of gradient descent algorithm. It seems to work relatively well for input vectors which have a distribution of 1s “similar” to the target sum vector, and probably also for all “nice” input vectors, as defined in a comment of yours. The solution is not exact, but the approximation seems good.
The distance between the sum vector of the output vectors and the target vector is taken to be Euclidean. Minimizing it means minimizing the sum of the squared differences between the sum vector and the target vector (the square root is not needed because it is monotonic). The algorithm does not guarantee to yield the sample that minimizes the distance from the target, but it makes a serious attempt at doing so, by always moving in some locally optimal direction.
The algorithm can be split into 3 parts.
First of all the first M candidate output vectors out of the N input vectors (e.g., N=2000, M=500) are put in a list, and the remaining vectors are put in another.
Then "approximately optimal" swaps between vectors in the two lists are done, until either the distance would not decrease any more, or a predefined maximum number of iterations is reached. An approximately optimal swap is one where removing the first vector from the list of output vectors causes a maximal decrease or minimal increase of the distance, and then, after the removal of the first vector, adding the second vector to the same list causes a maximal decrease of the distance. The whole swap is avoided if the net result is not a decrease of the distance.
Then, as a last phase, "optimal" swaps are done, again stopping on no decrease in distance or maximum number of iterations reached. Optimal swaps cause a maximal decrease of the distance, without requiring the removal of the first vector to be optimal in itself. To find an optimal swap all vector pairs have to be checked. This phase is much more expensive, being O(M(N-M)), while the previous "approximate" phase is O(M+(N-M))=O(N). Luckily, when entering this phase, most of the work has already been done by the previous phase.
    from typing import List, Tuple

    def get_sample(vects: List[Tuple[int]], target: Tuple[int], n_out: int,
                   max_approx_swaps: int = None, max_optimal_swaps: int = None,
                   verbose: bool = False) -> List[Tuple[int]]:
        """
        Get a sample of the input vectors having a sum close to the target vector.
        Closeness is measured in Euclidean metrics. The output is not guaranteed to be
        optimal (minimum square distance from target), but a serious attempt is made.
        The max_* parameters can be used to avoid too long execution times,
        tune them to your needs by setting verbose to True, or leave them None (∞).
        :param vects: the list of vectors (tuples) with the same number of "columns"
        :param target: the target vector, with the same number of "columns"
        :param n_out: the requested sample size
        :param max_approx_swaps: the max number of approximately optimal vector swaps,
                                 None means unlimited (default: None)
        :param max_optimal_swaps: the max number of optimal vector swaps,
                                  None means unlimited (default: None)
        :param verbose: print some info if True (default: False)
        :return: the sample of n_out vectors having a sum close to the target vector
        """
        def square_distance(v1, v2):
            return sum((e1 - e2) ** 2 for e1, e2 in zip(v1, v2))

        n_vec = len(vects)
        assert n_vec > 0
        assert n_out > 0
        n_rem = n_vec - n_out
        assert n_rem > 0
        output = vects[:n_out]
        remain = vects[n_out:]
        n_col = len(vects[0])
        assert n_col == len(target) > 0
        sumvect = (0,) * n_col
        for outvect in output:
            sumvect = tuple(map(int.__add__, sumvect, outvect))
        sqdist = square_distance(sumvect, target)
        if verbose:
            print(f"sqdist = {sqdist:4} after"
                  f" picking the first {n_out} vectors out of {n_vec}")

        if max_approx_swaps is None:
            max_approx_swaps = sqdist
        n_approx_swaps = 0
        while sqdist and n_approx_swaps < max_approx_swaps:
            # find the best vect to subtract (the square distance MAY increase)
            sqdist_0 = None
            index_0 = None
            sumvect_0 = None
            for index in range(n_out):
                tmp_sumvect = tuple(map(int.__sub__, sumvect, output[index]))
                tmp_sqdist = square_distance(tmp_sumvect, target)
                if sqdist_0 is None or sqdist_0 > tmp_sqdist:
                    sqdist_0 = tmp_sqdist
                    index_0 = index
                    sumvect_0 = tmp_sumvect
            # find the best vect to add,
            # but only if there is a net decrease of the square distance
            sqdist_1 = sqdist
            index_1 = None
            sumvect_1 = None
            for index in range(n_rem):
                tmp_sumvect = tuple(map(int.__add__, sumvect_0, remain[index]))
                tmp_sqdist = square_distance(tmp_sumvect, target)
                if sqdist_1 > tmp_sqdist:
                    sqdist_1 = tmp_sqdist
                    index_1 = index
                    sumvect_1 = tmp_sumvect
            if sumvect_1:
                tmp = output[index_0]
                output[index_0] = remain[index_1]
                remain[index_1] = tmp
                sqdist = sqdist_1
                sumvect = sumvect_1
                n_approx_swaps += 1
            else:
                break
        if verbose:
            print(f"sqdist = {sqdist:4} after {n_approx_swaps}"
                  f" approximately optimal swap{'s'[n_approx_swaps == 1:]}")

        diffvect = tuple(map(int.__sub__, sumvect, target))
        if max_optimal_swaps is None:
            max_optimal_swaps = sqdist
        n_optimal_swaps = 0
        while sqdist and n_optimal_swaps < max_optimal_swaps:
            # find the best pair to swap,
            # but only if the square distance decreases
            best_sqdist = sqdist
            best_diffvect = diffvect
            best_pair = None
            for i0 in range(n_out):  # n_out, not the global M, so the function is self-contained
                tmp_diffvect = tuple(map(int.__sub__, diffvect, output[i0]))
                for i1 in range(n_rem):
                    new_diffvect = tuple(map(int.__add__, tmp_diffvect, remain[i1]))
                    new_sqdist = sum(d * d for d in new_diffvect)
                    if best_sqdist > new_sqdist:
                        best_sqdist = new_sqdist
                        best_diffvect = new_diffvect
                        best_pair = (i0, i1)
            if best_pair:
                tmp = output[best_pair[0]]
                output[best_pair[0]] = remain[best_pair[1]]
                remain[best_pair[1]] = tmp
                sqdist = best_sqdist
                diffvect = best_diffvect
                n_optimal_swaps += 1
            else:
                break
        if verbose:
            print(f"sqdist = {sqdist:4} after {n_optimal_swaps}"
                  f" optimal swap{'s'[n_optimal_swaps == 1:]}")

        return output

    from random import randrange

    C = 30      # number of columns
    N = 2000    # total number of vectors
    M = 500     # number of output vectors
    F = 0.9     # fill factor of the target sum vector
    T = int(M * F)  # maximum value + 1 that can appear in the target sum vector
    A = 10000   # maximum number of approximately optimal swaps, may be None (∞)
    B = 10      # maximum number of optimal swaps, may be None (unlimited)

    target = tuple(randrange(T) for _ in range(C))
    vects = [tuple(int(randrange(M) < t) for t in target) for _ in range(N)]
    sample = get_sample(vects, target, M, A, B, True)
Typical output:
sqdist = 2639 after picking the first 500 vectors out of 2000
sqdist = 9 after 27 approximately optimal swaps
sqdist = 1 after 4 optimal swaps
P.S.: As it stands, this algorithm is not limited to binary input vectors, integer vectors would work too. Intuitively I suspect that the quality of the optimization could suffer, though. I suspect that this algorithm is more appropriate for binary vectors.
P.P.S.: Execution times with your kind of data are probably acceptable with standard CPython, but get better (like a couple of seconds, almost a factor of 10) with PyPy. To handle bigger sets of data, the algorithm would have to be translated to C or some other language, which should not be difficult at all.

Fast solution to Subset sum

Consider this way of solving the Subset sum problem:
    def subset_summing_to_zero(activities):
        subsets = {0: []}
        for (activity, cost) in activities.items():
            old_subsets = subsets
            subsets = {}
            for (prev_sum, subset) in old_subsets.items():
                subsets[prev_sum] = subset
                new_sum = prev_sum + cost
                new_subset = subset + [activity]
                if 0 == new_sum:
                    new_subset.sort()
                    return new_subset
                else:
                    subsets[new_sum] = new_subset
        return []
I have it from here:
http://news.ycombinator.com/item?id=2267392
There is also a comment which says that it is possible to make it "more efficient".
How?
Also, are there any other ways to solve the problem which are at least as fast as the one above?
Edit
I'm interested in any kind of idea which would lead to speed-up. I found:
https://en.wikipedia.org/wiki/Subset_sum_problem#cite_note-Pisinger09-2
which mentions a linear time algorithm. But I don't have the paper, perhaps you, dear people, know how it works? An implementation perhaps? Completely different approach perhaps?
Edit 2
There is now a follow-up:
Fast solution to Subset sum algorithm by Pisinger

I respect the alacrity with which you're trying to solve this problem! Unfortunately, you're trying to solve a problem that's NP-complete, meaning that any further improvement that breaks the polynomial time barrier will prove that P = NP.
The implementation you pulled from Hacker News appears to be consistent with the pseudo-polytime dynamic programming solution, where any additional improvements must, by definition, progress the state of current research into this problem and all of its algorithmic isoforms. In other words: while a constant speedup is possible, you're very unlikely to see an algorithmic improvement to this solution to the problem in the context of this thread.
However, you can use an approximate algorithm if you require a polytime solution with a tolerable degree of error. In pseudocode blatantly stolen from Wikipedia, this would be:
    initialize a list S to contain one element 0.
    for each i from 1 to N do
        let T be a list consisting of xi + y, for all y in S
        let U be the union of T and S
        sort U
        make S empty
        let y be the smallest element of U
        add y to S
        for each element z of U in increasing order do
            // trim the list by eliminating numbers close to one another
            // and throw out elements greater than s
            if y + cs/N < z ≤ s, set y = z and add z to S
    if S contains a number between (1 − c)s and s, output yes, otherwise no
Python implementation, preserving the original terms as closely as possible:
    from bisect import bisect

    def ssum(X,c,s):
        """ Simple impl. of the polytime approximate subset sum algorithm
        Returns True if the subset exists within our given error; False otherwise
        """
        S = [0]
        N = len(X)
        for xi in X:
            T = [xi + y for y in S]
            U = set().union(T,S)
            U = sorted(U) # Coercion to list
            S = []
            y = U[0]
            S.append(y)
            for z in U:
                if y + (c*s)/N < z and z <= s:
                    y = z
                    S.append(z)
        if not c: # For zero error, check equivalence
            return S[bisect(S,s)-1] == s
        return bisect(S,(1-c)*s) != bisect(S,s)
... where X is your bag of terms, c is your precision (between 0 and 1), and s is the target sum.
For more details, see the Wikipedia article.
(Additional reference, further reading on CSTheory.SE)

While my previous answer describes the polytime approximate algorithm for this problem, a request was specifically made for an implementation of Pisinger's pseudo-polynomial dynamic programming solution, which applies when all xi in X are positive:
    from bisect import bisect

    def balsub(X,c):
        """ Simple impl. of Pisinger's generalization of KP for subset sum problems
        satisfying xi >= 0, for all xi in X. Returns the state array "st", which may
        be used to determine if an optimal solution exists to this subproblem of SSP.
        """
        if not X:
            return False
        X = sorted(X)
        n = len(X)
        b = bisect(X,c)
        r = X[-1]
        w_sum = sum(X[:b])
        stm1 = {}
        st = {}
        for u in range(c-r+1,c+1):
            stm1[u] = 0
        for u in range(c+1,c+r+1):
            stm1[u] = 1
        stm1[w_sum] = b
        for t in range(b,n+1):
            for u in range(c-r+1,c+r+1):
                st[u] = stm1[u]
            for u in range(c-r+1,c+1):
                u_tick = u + X[t-1]
                st[u_tick] = max(st[u_tick],stm1[u])
            for u in reversed(range(c+1,c+X[t-1]+1)):
                for j in reversed(range(stm1[u],st[u])):
                    u_tick = u - X[j-1]
                    st[u_tick] = max(st[u_tick],j)
        return st
Wow, that was headache-inducing. This needs proofreading, because, while it implements balsub, I can't define the right comparator to determine if the optimal solution to this subproblem of SSP exists.

I don't know much python, but there is an approach called meet in the middle.
Pseudocode:
    Divide activities into two subarrays, A1 and A2
    for both A1 and A2, calculate subset hashes, H1 and H2, the way you do it in your question.
    for each (cost, a1) in H1
        if (H2.contains(-cost))
            return a1 + H2[-cost];
This will allow you to double the number of elements of activities you can handle in reasonable time.

I apologize for "discussing" the problem, but a "Subset Sum" problem where the x values are bounded is not the NP-hard version of the problem. Dynamic programming solutions are known for problems with bounded x values. They work by representing the x values as sums of unit lengths. The number of fundamental iterations in these dynamic programming solutions is linear in the total length of the x's. However, Subset Sum is NP-hard when the precision of the numbers equals N, that is, when the number of base-2 place values needed to state the x's equals N. For N = 40, the x's have to be in the billions. In the NP-hard problem the unit length of the x's increases exponentially with N. That is why the dynamic programming solutions are not polynomial-time solutions to the NP-hard Subset Sum problem. That being the case, there are still practical instances of the Subset Sum problem where the x's are bounded and the dynamic programming solution is valid.
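For bounded non-negative values, such a dynamic program fits in a few lines of Python by using one arbitrary-precision integer as a bit set of reachable sums. This is a sketch of the general technique, not code from any of the answers; the function name is invented, and it assumes non-negative integers (unlike the zero-sum variant in the question, which allows negative costs).

```python
def subset_sum_bounded(xs, target):
    """Decide whether some subset of the non-negative integers xs sums to target."""
    mask = (1 << (target + 1)) - 1  # sums above target are irrelevant
    reachable = 1                   # bit s set <=> sum s is reachable; start with 0
    for x in xs:
        # Every reachable sum s also makes s + x reachable.
        reachable = (reachable | (reachable << x)) & mask
    return bool((reachable >> target) & 1)


print(subset_sum_bounded([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(subset_sum_bounded([3, 34, 4, 12, 5, 2], 30))  # False
```

The work per element is proportional to target bits, which is exactly why the approach stops being polynomial once the values need as many bits as there are elements.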

Here are three ways to make the code more efficient:
The code stores a list of activities for each partial sum. It is more efficient in terms of both memory and time to just store the most recent activity needed to make the sum, and work out the rest by backtracking once a solution is found.
For each activity the dictionary is repopulated with the old contents (subsets[prev_sum] = subset). It is faster to simply grow a single dictionary.
Splitting the values in two and applying a meet in the middle approach.
Applying the first two optimisations results in the following code which is more than 5 times faster:
def subset_summing_to_zero2(activities):
    # Maps each reachable sum to the last activity used to reach it.
    subsets = {0: -1}
    for (activity, cost) in activities.items():
        # Iterate over a snapshot, since the dictionary grows inside the loop.
        for prev_sum in list(subsets.keys()):
            new_sum = prev_sum + cost
            if 0 == new_sum:
                # Backtrack through the stored activities to rebuild the subset.
                new_subset = [activity]
                while prev_sum:
                    activity = subsets[prev_sum]
                    new_subset.append(activity)
                    prev_sum -= activities[activity]
                return sorted(new_subset)
            if new_sum in subsets: continue
            subsets[new_sum] = activity
    return []
Also applying the third optimisation results in something like:
def subset_summing_to_zero3(activities):
    A = list(activities.items())
    mid = len(A) // 2
    def make_subsets(A):
        subsets = {0: -1}
        for (activity, cost) in A:
            # Iterate over a snapshot, since the dictionary grows inside the loop.
            for prev_sum in list(subsets.keys()):
                new_sum = prev_sum + cost
                if new_sum and new_sum in subsets: continue
                subsets[new_sum] = activity
        return subsets
    subsets = make_subsets(A[:mid])
    subsets2 = make_subsets(A[mid:])
    def follow_trail(new_subset, subsets, s):
        # Walk back from sum s to 0, collecting the activities along the way.
        while s:
            activity = subsets[s]
            new_subset.append(activity)
            s -= activities[activity]
    new_subset = []
    for s in subsets:
        if -s in subsets2:
            follow_trail(new_subset, subsets, s)
            follow_trail(new_subset, subsets2, -s)
            if len(new_subset):
                break
    return sorted(new_subset)
Define bound to be the largest absolute value of the elements.
The algorithmic benefit of the meet in the middle approach depends a lot on bound.
For a low bound (e.g. bound=1000 and n=300) the meet in the middle only gets a factor of about 2 improvement over the first improved method. This is because the dictionary called subsets is densely populated.
However, for a high bound (e.g. bound=100,000 and n=30) the meet in the middle takes 0.03 seconds compared to 2.5 seconds for the first improved method (and 18 seconds for the original code).
For high bounds, the meet in the middle will take about the square root of the number of operations of the normal method.
It may seem surprising that meet in the middle is only twice as fast for low bounds. The reason is that the number of operations in each iteration depends on the number of keys in the dictionary. After adding k activities we might expect there to be 2**k keys, but if bound is small then many of these keys will collide, so we will only have O(bound*k) keys instead.
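The collision effect can be checked directly with a small sketch. The helper count_distinct_sums is made up for this illustration and simply counts the dictionary keys that would exist after processing a list of integer costs:

```python
def count_distinct_sums(costs):
    """Count the distinct subset sums (including the empty sum 0)."""
    sums = {0}
    for c in costs:
        # Every existing sum can either skip or include the new cost.
        sums |= {s + c for s in sums}
    return len(sums)

# Widely spread costs: every subset gives a different sum, so 2**k keys.
print(count_distinct_sums([1, 2, 4, 8]))
# Tightly bounded costs: the sums collide heavily.
print(count_distinct_sums([1] * 10))
```

With integer costs in [-bound, bound], all sums after k activities lie in [-bound*k, bound*k], so there can never be more than 2*bound*k + 1 keys.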

Thought I'd share my Scala solution for the discussed pseudo-polytime algorithm described in wikipedia. It's a slightly modified version: it figures out how many unique subsets there are. This is very much related to a HackerRank problem described at https://www.hackerrank.com/challenges/functional-programming-the-sums-of-powers. Coding style might not be excellent, I'm still learning Scala :) Maybe this is still helpful for someone.
import java.io.ByteArrayInputStream
import scala.io.StdIn

object Solution extends App {
  var input = "1000\n2"
  System.setIn(new ByteArrayInputStream(input.getBytes()))
  println(calculateNumberOfWays(StdIn.readInt(), StdIn.readInt()))

  def calculateNumberOfWays(X: Int, N: Int) = {
    val maxValue = Math.pow(X, 1.0/N).toInt
    val listOfValues = (1 until maxValue + 1).toList
    val listOfPowers = listOfValues.map(value => Math.pow(value, N).toInt)
    // Each step keeps the previous sums and adds the sums that include the next power.
    val lists = (0 until maxValue).toList.foldLeft(List(List(0)): List[List[Int]]) ((newList, i) =>
      newList :+ (newList.last union (newList.last.map(y => y + listOfPowers.apply(i)).filter(z => z <= X)))
    )
    lists.last.count(_ == X)
  }
}

Determine whether a symbol is part of the ith combination nCr

UPDATE:
Combinatorics and unranking was eventually what I needed.
The links below helped a lot:
http://msdn.microsoft.com/en-us/library/aa289166(v=vs.71).aspx
http://www.codeproject.com/Articles/21335/Combinations-in-C-Part-2
The Problem
Given a list of N symbols say {0,1,2,3,4...}
And NCr combinations of these
eg. NC3 will generate:
0 1 2
0 1 3
0 1 4
...
...
1 2 3
1 2 4
etc...
For the ith combination (i = [1 .. NCr]) I want to determine whether a symbol (s) is part of it.
Func(N, r, i, s) = True/False or 0/1
eg. Continuing from above
The 1st combination contains 0 1 2 but not 3
F(N,3,1,"0") = TRUE
F(N,3,1,"1") = TRUE
F(N,3,1,"2") = TRUE
F(N,3,1,"3") = FALSE
Current approaches and tidbits that might help or be related.
Relation to matrices
For r = 2 eg. 4C2 the combinations are the upper (or lower) half of a 2D matrix
1,2 1,3 1,4
----2,3 2,4
--------3,4
For r = 3 it's the corner of a 3D matrix or cube
For r = 4 it's the "corner" of a 4D matrix, and so on.
Another relation
Ideally the solution would be of a form something like the answer to this:
Calculate Combination based on position
For the nth combination in the list of combinations of length r (with repetition allowed), the ith symbol can be calculated
Using integer division and remainder:
n/r^i % r = (0 for 0th symbol, 1 for 1st symbol....etc)
eg for the 6th comb of 3 symbols the 0th 1st and 2nd symbols are:
i = 0 => 6 / 3^0 % 3 = 0
i = 1 => 6 / 3^1 % 3 = 2
i = 2 => 6 / 3^2 % 3 = 0
The 6th comb would then be 0 2 0
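In Python the same digit extraction looks roughly like this. The function names symbol_at and combination_with_repetition are made up for this illustration:

```python
def symbol_at(n, r, i):
    """i-th symbol of the n-th combination over r symbols, repetition allowed."""
    return n // r**i % r

def combination_with_repetition(n, r, length):
    """All symbols of the n-th combination, from the 0th symbol upwards."""
    return [symbol_at(n, r, i) for i in range(length)]

print(combination_with_repetition(6, 3, 3))  # [0, 2, 0]
```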
I need something similar but with repetition not allowed.
Thank you for following this question this far :]
Kevin.

I believe your problem is that of unranking combinations or subsets.
I will give you an implementation in Mathematica, from the package Combinatorica, but the Google link above is probably a better place to start, unless you are familiar with the semantics.
UnrankKSubset::usage = "UnrankKSubset[m, k, l] gives the mth k-subset of set l, listed in lexicographic order."
UnrankKSubset[m_Integer, 1, s_List] := {s[[m + 1]]}
UnrankKSubset[0, k_Integer, s_List] := Take[s, k]
UnrankKSubset[m_Integer, k_Integer, s_List] :=
Block[{i = 1, n = Length[s], x1, u, $RecursionLimit = Infinity},
u = Binomial[n, k];
While[Binomial[i, k] < u - m, i++];
x1 = n - (i - 1);
Prepend[UnrankKSubset[m - u + Binomial[i, k], k-1, Drop[s, x1]], s[[x1]]]
]
Usage is like:
UnrankKSubset[5, 3, {0, 1, 2, 3, 4}]
{0, 3, 4}
This yields the 6th length-3 combination (index 5, counting from 0) of the set {0, 1, 2, 3, 4}.
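For readers without Mathematica, here is a rough Python sketch of lexicographic unranking. It is not a translation of the Combinatorica code; unrank_lex is a made-up name, it assumes 0 <= m < C(n, k), and it needs Python 3.8+ for math.comb:

```python
from math import comb

def unrank_lex(m, k, items):
    """Return the m-th (0-based) k-subset of items in lexicographic order."""
    n = len(items)
    result = []
    start = 0  # index of the smallest item still allowed
    for remaining in range(k, 0, -1):
        while True:
            # Number of subsets whose next element is items[start].
            block = comb(n - start - 1, remaining - 1)
            if m < block:
                break
            m -= block  # skip that whole block
            start += 1
        result.append(items[start])
        start += 1
    return result

print(unrank_lex(5, 3, [0, 1, 2, 3, 4]))  # [0, 3, 4]
```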

There's a very efficient algorithm for this problem, which is also contained in the recently published: Knuth, The Art of Computer Programming, Volume 4A (section 7.2.1.3).
Since you don't care about the order in which the combinations are generated, let's use the lexicographic order of the combinations where each combination is listed in descending order. Thus for r=3, the first 11 combinations of 3 symbols would be: 210, 310, 320, 321, 410, 420, 421, 430, 431, 432, 510. The advantage of this ordering is that the enumeration is independent of n; indeed it is an enumeration over all combinations of 3 symbols from {0, 1, 2, …}.
There is a standard method to directly generate the ith combination given i, so to test whether a symbol s is part of the ith combination, you can simply generate it and check.
Method
How many combinations of r symbols start with a particular symbol s? Well, the remaining r-1 positions must come from the s symbols 0, 1, 2, …, s-1, so it's (s choose r-1), where (s choose r-1) or C(s,r-1) is the binomial coefficient denoting the number of ways of choosing r-1 objects from s objects. As this is true for all s, the first symbol of the ith combination is the smallest s such that
Σ_{k=0}^{s} (k choose r-1) ≥ i.
Once you know the first symbol, the problem reduces to finding the (i - Σ_{k=0}^{s-1} (k choose r-1))-th combination of r-1 symbols, where we've subtracted those combinations that start with a symbol less than s.
Code
Python code (you can write C(n,r) more efficiently, but this is fast enough for us):
#!/usr/bin/env python
tC = {}
def C(n, r):
    if (n, r) in tC: return tC[(n, r)]
    if r > n-r: r = n-r
    if r < 0: return 0
    if r == 0: return 1
    tC[(n, r)] = C(n-1, r) + C(n-1, r-1)
    return tC[(n, r)]

def combination(r, k):
    '''Finds the kth combination of r letters.'''
    if r == 0: return []
    sum = 0
    s = 0
    while True:
        if sum + C(s, r-1) < k:
            sum += C(s, r-1)
            s += 1
        else:
            return [s] + combination(r-1, k-sum)

def Func(N, r, i, s): return s in combination(r, i)

for i in range(1, 20): print(combination(3, i))
print(combination(500, 10000000000000000000000000000000000000000000000000000000000000000))
Note how fast this is: it finds the 10000000000000000000000000000000000000000000000000000000000000000th combination of 500 letters (it starts with 542) in less than 0.5 seconds.

I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem that your problem falls under. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus's method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coefficient.
This class can easily be applied to your problem. If you have the rank (or index) to the binomial coefficient table, then simply call the class method that returns the K-indexes in an array. Then, loop through that returned array to see if any of the K-index values match the value you have. Pretty straightforward...

Efficiently get sorted sums of a sorted list

Given an ascending list of numbers, what is the most efficient algorithm you can think of to get the ascending list of sums of every two numbers in that list? Duplicates in the resulting list are irrelevant; you can remove them or avoid them if you like.
To be clear, I'm interested in the algorithm. Feel free to post code in any language and paradigm that you like.

Edit as of 2018: You should probably stop reading this. (But I can't delete it as it is accepted.)
If you write out the sums like this:
1   4   5   6   8   9
----------------------
2   5   6   7   9  10
    8   9  10  12  13
       10  11  13  14
           12  14  15
               16  17
                   18
You'll notice that since M[i,j] <= M[i,j+1] and M[i,j] <= M[i+1,j], then you only need to examine the top left "corners" and choose the lowest one.
e.g.
only 1 top left corner, pick 2
only 1, pick 5
6 or 8, pick 6
7 or 8, pick 7
9 or 8, pick 8
9 or 9, pick both :)
10 or 10 or 10, pick all
12 or 11, pick 11
12 or 12, pick both
13 or 13, pick both
14 or 14, pick both
15 or 16, pick 15
only 1, pick 16
only 1, pick 17
only 1, pick 18
Of course, when you have lots of top left corners then this solution devolves.
I'm pretty sure this problem is Ω(n²), because you have to calculate the sums for each M[i,j] -- unless someone has a better algorithm for the summation :)
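The "top left corners" idea can be sketched with a min-heap that holds the current corners. This is only an illustration, assuming a is already ascending and that a number may be added to itself (as in the table above). The name sorted_pair_sums is made up here:

```python
import heapq

def sorted_pair_sums(a):
    """Yield a[i] + a[j] (i <= j) in ascending order, skipping duplicates."""
    n = len(a)
    if n == 0:
        return
    heap = [(a[0] + a[0], 0, 0)]  # the single starting corner
    seen = {(0, 0)}               # cells already pushed onto the heap
    last = None
    while heap:
        total, i, j = heapq.heappop(heap)
        if total != last:
            yield total
            last = total
        # Popping a cell may expose its right and lower neighbours.
        for ni, nj in ((i, j + 1), (i + 1, j)):
            if ni <= nj < n and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + a[nj], ni, nj))

print(list(sorted_pair_sums([1, 4, 5, 6, 8, 9])))
```

Every cell is still pushed and popped once, so this does not beat the n² bound; the heap only keeps the corner selection cheap when there are many corners.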

Rather than coding this out, I figure I'll pseudo-code it in steps and explain my logic, so that better programmers can poke holes in my logic if necessary.
On the first step we start out with a list of numbers of length n. For each number we need to create a list of length n-1, because we aren't adding a number to itself. By the end we have a list of about n sorted lists that was generated in O(n^2) time.
step 1 (startinglist)
    create an empty list sumlist
    for each number num1 in startinglist
        create an empty list templist
        for each number num2 in startinglist other than num1
            add num1 plus num2 into templist
        add templist to sumlist
    return sumlist
In step 2 because the lists were sorted by design (add a number to each element in a sorted list and the list will still be sorted) we can simply do a mergesort by merging each list together rather than mergesorting the whole lot. In the end this should take O(n^2) time.
step 2 (sumlist)
    create an empty list mergedlist
    for each list templist in sumlist
        set mergedlist equal to: merge(mergedlist, templist)
    return mergedlist
The merge method would be then the normal merge step with a check to make sure that there are no duplicate sums. I won't write this out because anyone can look up mergesort.
So there's my solution. The entire algorithm is O(n^2) time. Feel free to point out any mistakes or improvements.
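For anyone who wants to try the two steps, here is a rough Python sketch. It deviates from the steps above in two labeled ways: each pair is generated once (i < j) instead of twice, and the rows are combined with a k-way heapq.merge rather than one pairwise merge per row, which costs O(n^2 log n) overall. The function name sorted_sums_by_merging is made up for this illustration, and numbers is assumed to be ascending:

```python
import heapq
from itertools import groupby

def sorted_sums_by_merging(numbers):
    """Ascending, duplicate-free sums of two different entries of numbers."""
    # Step 1: one sorted row per number (sorted because numbers is ascending).
    rows = [[x + y for y in numbers[i + 1:]] for i, x in enumerate(numbers)]
    # Step 2: merge all the sorted rows into one sorted stream.
    merged = heapq.merge(*rows)
    # Equal sums are adjacent after merging, so groupby drops the duplicates.
    return [total for total, _ in groupby(merged)]

print(sorted_sums_by_merging([1, 4, 5, 6, 8, 9]))
```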

You can do this in two lines in python with
allSums = set(a+b for a in X for b in X)
allSums = sorted(allSums)
The cost of this is n^2 (maybe an extra log factor for the set?) for the iteration and s * log(s) for the sorting where s is the size of the set.
The size of the set could be as big as n*(n-1)/2 for example if X = [1,2,4,...,2^n]. So if you want to generate this list it will take at least n^2/2 in the worst case since this is the size of the output.
However if you want to select the first k elements of the result you can do this in O(kn) using a selection algorithm for sorted X+Y matrices by Frederickson and Johnson (see here for gory details). Although this can probably be modified to generate them online by reusing computation and get an efficient generator for this set.
@deuseldorf, Peter
There is some confusion about (n!). I seriously doubt deuseldorf meant "n factorial", but simply "n, (very excited)!"

The best I could come up with is to produce a matrix of sums of each pair, and then merge the rows together, a-la merge sort. I feel like I'm missing some simple insight that will reveal a much more efficient solution.
My algorithm, in Haskell:
matrixOfSums list = [[a+b | b <- list, b >= a] | a <- list]
sortedSums list = foldl merge [] (matrixOfSums list)

-- A normal merge, save that we remove duplicates
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) = case compare x y of
    LT -> x:(merge xs (y:ys))
    EQ -> x:(merge xs (dropWhile (==x) ys))
    GT -> y:(merge (x:xs) ys)
I found a minor improvement, one that's more amenable to lazy stream-based coding. Instead of merging the columns pair-wise, merge all of them at once. The advantage being that you start getting elements of the list immediately.
-- wide-merge does a standard merge (ala merge-sort) across an arbitrary number of lists
-- wideNubMerge does this while eliminating duplicates
wideNubMerge :: Ord a => [[a]] -> [a]
wideNubMerge ls = wideNubMerge1 $ filter (/= []) ls
wideNubMerge1 [] = []
wideNubMerge1 ls = mini:(wideNubMerge rest)
    where mini = minimum $ map head ls
          rest = map (dropWhile (== mini)) ls
betterSortedSums list = wideNubMerge (matrixOfSums list)
However, if you know you're going to use all of the sums, and there's no advantage to getting some of them earlier, go with 'foldl merge []', as it's faster.

In SQL:
create table numbers(n int not null)
insert into numbers(n) values(1),(1), (2), (2), (3), (4)
select distinct num1.n+num2.n sum2n
from numbers num1
inner join numbers num2
on num1.n<>num2.n
order by sum2n
C# LINQ:
List<int> num = new List<int>{ 1, 1, 2, 2, 3, 4 };
var uNum = num.Distinct().ToList();
// Order the distinct sums so the output is ascending, as the question asks.
var sums = (from num1 in uNum
            from num2 in uNum
            where num1 != num2
            select num1 + num2).Distinct().OrderBy(x => x);
foreach (var s in sums)
{
    Console.WriteLine(s);
}

No matter what you do, without additional constraints on the input values, you cannot do better than O(n^2), simply because you have to iterate through all pairs of numbers. The iteration will dominate sorting (which you can do in O(n log n) or faster).

This question has been wracking my brain for about a day now. Awesome.
Anyways, you can't get away from the n^2 nature of it easily, but you can do slightly better with the merge since you can bound the range to insert each element in.
If you look at all the lists you generate, they have the following form:
(a[i], a[j]) | j>=i
If you flip it 90 degrees, you get:
(a[i], a[j]) | i<=j
Now, the merge process should be taking two lists i and i+1 (which correspond to lists where the first member is always a[i] and a[i+1]), you can bound the range to insert element (a[i + 1], a[j]) into list i by the location of (a[i], a[j]) and the location of (a[i + 1], a[j + 1]).
This means that you should merge in reverse in terms of j. I don't know (yet) if you can leverage this across j as well, but it seems possible.

If you are looking for a truly language agnostic solution then you will be sorely disappointed in my opinion because you'll be stuck with a for loop and some conditionals. However if you opened it up to functional languages or functional language features (I'm looking at you LINQ) then my colleagues here can fill this page with elegant examples in Ruby, Lisp, Erlang, and others.
