I am trying to find the best (realistic) algorithm for solving a cryptography challenge, in which:
the given cipher text C is made of about 6000 characters taken in the set S={A,B,C,...,Y,a,b,c,...y}. So |S| = 50.
the encryption scheme never produces two identical adjacent characters in C
25 letters in S are called Nulls, and are unknown
these Nulls must be removed from C to obtain the actual cipher text C' which can then be attacked.
the list of Nulls in C is named N and |N| is close to |C|/2 = 3000
so: |N| + |C'| = |C|
My aim is to identify the 25 Nulls, satisfying these two conditions:
there may not be two identical adjacent characters in C'
there may not be two identical adjacent Nulls in N
Obviously by brute force there are 50!/(25! 25!) = 126410606437752 combinations of 25 Nulls in S, so this is not a realistic approach.
I have tried to recursively explore the tree of sets of Nulls and 'cut branches' as much and as soon as possible.
For example, when adding a letter of S to the subset of Nulls, if the sequence "x n1n2 x" appears in C where x is not yet a Null and n1n2 are Nulls, then x should be a Null too.
However, this is not enough to bring the run-time below a few centuries...
Can you think of a cleverer algorithm for identifying these 25 Nulls?
Note: there might be more than one set of Nulls satisfying the two conditions
Let's try something like this:
Create a list of sets, each containing one character of S. Each set is a group of characters that must all lie on the same side (all Nulls or all non-Nulls).
while you have more than two sets:
    for each set:
        search the cipher text for X[<set-chars>]+X
        if found, union the set with the set that has X in it.
    if no sets were united, start recursing with two sets united.
You can speed things up if you keep a different cipher text for each set, with the characters of that set removed. If you do so, the search is easier: you are searching for XX, which has constant length. Every time you union two sets, you need to remove all the characters of both sets from the cipher text kept for the merged set.
The time this will take depends on the string C you are given.
An explanation about the sets: each set is a candidate for part of C' or of N. If you find that A and X are in the same group, then {A, X} is a subset of either N or C'. If you later find the same about Y and B, then {Y, B} is also a subset of one of them. Later, finding a substring YAXAXY means that Y is in the same group as A and X, and so is B, because it is with Y. At the end you are left with two groups, one for C' and one for N, which you cannot tell apart.
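As a rough illustration of the forced-merge step described above, here is a minimal Python sketch. The function name `forced_groups` and the toy text are made up for the example. It performs only the deterministic unions (remove one group, look for XX); the recursive guessing that is needed when no union is forced is left out.

```python
def forced_groups(text):
    """Group characters that are forced to lie on the same side (Null or non-Null)."""
    parent = {ch: ch for ch in set(text)}

    def find(ch):
        # Union-find lookup with path halving.
        while parent[ch] != ch:
            parent[ch] = parent[parent[ch]]
            ch = parent[ch]
        return ch

    changed = True
    while changed:
        changed = False
        for root in {find(ch) for ch in parent}:
            root = find(root)  # the group may have been merged meanwhile
            # Cipher text with this group's characters removed.
            rest = [ch for ch in text if find(ch) != root]
            for a, b in zip(rest, rest[1:]):
                if a == b:
                    # "X <group chars> X" occurred, so X belongs to this group.
                    parent[find(a)] = root
                    changed = True
                    break

    groups = {}
    for ch in parent:
        groups.setdefault(find(ch), []).append(ch)
    return sorted("".join(sorted(g)) for g in groups.values())


print(forced_groups("ABAxyxBAyx"))
```

If more than two groups remain after this pass, the answer's last step applies: tentatively unite two groups and recurse.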
elyashiv's method is the right one.
It is very fast.
I have produced the two sets C' and N, which are interchangeable: either one could be the Nulls.
The corresponding subsets S1 and S2 of S, which produce C' and N, satisfy S = S1 U S2.
Thank you.
Related
Suppose I have a list of N strings, known at compile-time.
I want to generate (at compile-time) a function that will map each string to a distinct integer between 1 and N inclusive. The function should take very little time or space to execute.
For example, suppose my strings are:
{"apple", "orange", "banana"}
Such a function may return:
f("apple") -> 2
f("orange") -> 1
f("banana") -> 3
What's a strategy to generate this function?
I was thinking to analyze the strings at compile time and look for a couple of constants I could mod or add by or something?
The compile-time generation time/space can be quite expensive (but obviously not ridiculously so).
Say you have m distinct strings, and let a_{i,j} be the jth character of the ith string. In the following, I'll assume that they all have the same length. This can be easily translated into any reasonable programming language by treating a_{i,j} as the null character if j ≥ |a_i|.
The idea I suggest is composed of two parts:
Find (at most) m - 1 positions differentiating the strings, and store these positions.
Create a perfect hash function by considering the strings as length-m vectors, and storing the parameters of the perfect hash function.
Obviously, in general, the hash function must check at least m - 1 positions. It's easy to see this by induction. For 2 strings, at least 1 character must be checked. Assume it's true for i strings: i - 1 positions must be checked. Create a new set of strings by appending 0 to the end of each of the i strings, and add a new string that is identical to one of the strings, except it has a 1 at the end. The appended position is the only one that separates the new string from its twin, so it must be checked in addition to the i - 1 positions already needed, giving i positions for i + 1 strings.
Conversely, it's always possible to find at most m - 1 positions sufficient for differentiating the strings (for some sets the number might of course be lower, as low as the logarithm of m to the base of the alphabet size). Again, it's easy to see this by induction. Two distinct strings must differ at some position. Placing the strings in a matrix with m rows, there must be some column where not all characters are the same. Partitioning the matrix by the character in that column into two or more parts, and applying the argument recursively to each part with at least 2 rows, shows this.
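The recursive partitioning argument can be turned into code fairly directly. The following is only a sketch under my own assumptions: the name `distinguishing_positions` is hypothetical, shorter strings are padded with the null character as described earlier, and the input strings are assumed to be distinct.

```python
def distinguishing_positions(strings):
    """Return at most len(strings) - 1 positions that tell all strings apart."""
    width = max(len(s) for s in strings)
    padded = [s.ljust(width, "\0") for s in strings]  # null-pad short strings
    positions = set()

    def split(group):
        if len(group) < 2:
            return
        # Some column must differ inside a group of distinct strings.
        col = next(j for j in range(width) if len({s[j] for s in group}) > 1)
        positions.add(col)
        parts = {}
        for s in group:
            parts.setdefault(s[col], []).append(s)
        for part in parts.values():
            split(part)

    split(padded)
    return sorted(positions)


print(distinguishing_positions(["apple", "orange", "banana"]))
```

Each call on a group of two or more strings adds at most one position and splits the group into at least two parts, which is where the m - 1 bound comes from.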
Say the m - 1 positions are p_1, ..., p_{m-1}. In the following, recall the meaning above of a_{i,p_j} for p_j ≥ |a_i|: it is the null character.
Let us define h(a_i) = \sum_{j=1}^{m-1} [q_j a_{i,p_j} % n], for random q_j and some n. Then h is known to be a universal hash function: the probability of a pairwise collision is P(x ≠ y ∧ h(x) = h(y)) ≤ 1/n.
Given a universal hash function, there are known constructions for creating a perfect hash function from it. Perhaps the simplest is creating a vector of size m^2 and successively trying the above h with n = m^2 with randomized coefficients, until there are no collisions. The expected number of attempts needed until this is achieved is at most 2, and the probability that more attempts are needed decreases exponentially.
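A minimal Python sketch of the retry loop follows. It deviates from the text in two labeled ways: for brevity it uses every character position rather than only the m - 1 distinguishing ones, and it reduces the sum modulo a large prime before taking it modulo n, which is my own assumption to keep the family well behaved when n = m^2 is not prime. The names `build_perfect_hash` and `PRIME` are made up for the example.

```python
import random

PRIME = 2**31 - 1  # assumed modulus: a prime larger than any character code used


def build_perfect_hash(strings, seed=0):
    """Try random coefficients until the hash has no collisions on `strings`."""
    rng = random.Random(seed)
    m = len(strings)
    n = m * m  # table size m^2, as in the construction above
    width = max(len(s) for s in strings)

    def code(s, j):
        # Characters past the end of a string count as the null character.
        return ord(s[j]) if j < len(s) else 0

    while True:
        q = [rng.randrange(1, PRIME) for _ in range(width)]

        def h(s, q=q):
            return sum(q[j] * code(s, j) for j in range(width)) % PRIME % n

        if len({h(s) for s in strings}) == m:  # no collisions: perfect on this set
            return h


words = ["apple", "orange", "banana"]
h = build_perfect_hash(words)
print({w: h(w) for w in words})
```

The result maps each string to a distinct slot in a table of size m^2; storing the desired integer 1..N in that slot gives the final function.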
It is simple. Make a dictionary and assign 1 to the first word, 2 to the second, and so on. No need to make things complicated, just number your words.
To make the lookup efficient, use a trie, binary search, or whatever tool your language provides.
The logic behind this was that (n-2)3^(n-3) counts a lot of repetitions, like (abc)***(abc), where abc is at both the start and the end, and the repeated strings total 3^4. The same happens as abc moves further along the string and the number of (abc) occurrences increases.
You can use dynamic programming to compute the number of forbidden strings.
The algorithm follows from the observation below:
"A legal string of size n is a legal string of size n - 1 extended with one letter, so that the last three letters of the resulting string are not all distinct."
So if we had all the legal strings of size n-1, we could try extending them to obtain the legal strings of size n.
To check whether the extended string is legal, we just need to know the last two letters of the previous string (of size n-1).
In the algorithm we will compute two arrays, where
different[i] # number of legal strings of length i in which last two letters are different
same[i] # number of legal strings of length i in which last two letters are the same
It can be easily proved that:
different[i+1] = different[i] + 2*same[i]
same[i+1] = different[i] + same[i]
This is a consequence of the following facts:
Any 'same' string of size i+1 can be obtained either from a 'same' string of size i (think BB -> BBB) or from a 'different' string (think AB -> ABB), and these are the only options.
Any 'different' string of size i+1 can be obtained either from a 'different' string of size i (think AB -> ABA) or from a 'same' string in two ways (AA -> AAB or AA -> AAC).
Having observed all this it is easy to write an algorithm that computes the result in O(n) time.
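The two recurrences can be sketched in Python as follows. This assumes the three-letter alphabet {A, B, C} used in the examples, with a string being forbidden when it contains three consecutive letters that are all different; the function name `count_legal` is made up for the example.

```python
def count_legal(n):
    """Number of strings of length n over {A, B, C} with no three
    consecutive letters that are all different."""
    if n == 0:
        return 1
    if n == 1:
        return 3
    same, different = 3, 6  # length 2: AA, BB, CC versus the six mixed pairs
    for _ in range(n - 2):
        # same[i+1] = different[i] + same[i]
        # different[i+1] = different[i] + 2*same[i]
        same, different = different + same, different + 2 * same
    return same + different


n = 5
print(count_legal(n), "legal,", 3**n - count_legal(n), "forbidden")
```

The number of forbidden strings of length n is then 3^n minus the legal count, computed in O(n) time.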
I suggest you use recursion, and look at two numbers:
F(n), the number of legal strings of length n whose last two symbols are the same.
G(n), the number of legal strings of length n whose last two symbols are different.
Is that enough to go on?
Get the ASCII values of the last three letters and add up the squares of these values. If the sum equals a certain value, then the string is forbidden. For A, B and C, this works fine.
To do this:
1) find out how to get characters from your string.
2) find out how to get ASCII value of a character.
3) Multiply these ASCII values with themselves.
4) Do that for the three letters each time and add their values.
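A small sketch of these four steps in Python, assuming the alphabet is {A, B, C} and that a string is forbidden when three consecutive letters are all different. The names `FORBIDDEN_SUM` and `has_forbidden_triple` are made up for the example. For these three letters, one A, one B and one C is the only triple whose squared ASCII values sum to this constant.

```python
# 65^2 + 66^2 + 67^2, the sum reached only by one A, one B and one C.
FORBIDDEN_SUM = sum(ord(c) ** 2 for c in "ABC")


def has_forbidden_triple(s):
    """True if some window of three letters contains A, B and C once each."""
    return any(
        sum(ord(c) ** 2 for c in s[i:i + 3]) == FORBIDDEN_SUM
        for i in range(len(s) - 2)
    )


print(has_forbidden_triple("AABCA"), has_forbidden_triple("AABBA"))
```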
I have a symmetric matrix like shown in the image attached below.
I've made up the notation A.B which represents the value at grid point (A, B). Furthermore, writing A.B.C gives me the minimum grid point value like so: MIN((A,B), (A,C), (B,C)).
As another example A.B.D gives me MIN((A,B), (A,D), (B,D)).
My goal is to find the minimum values for ALL combinations of letters (not repeating) for one row at a time e.g for this example I need to find min values with respect to row A which are given by the calculations:
A.B = 6
A.C = 8
A.D = 4
A.B.C = MIN(6,8,6) = 6
A.B.D = MIN(6, 4, 4) = 4
A.C.D = MIN(8, 4, 2) = 2
A.B.C.D = MIN(6, 8, 4, 6, 4, 2) = 2
I realize that certain calculations can be reused which becomes increasingly important as the matrix size increases, but the problem is finding the most efficient way to implement this reuse.
Can anyone point me in the right direction to finding an efficient algorithm/data structure I can use for this problem?
You'll want to think about the lattice of subsets of the letters, ordered by inclusion. Essentially, you have a value f(S) given for every subset S of size 2 (that is, every off-diagonal element of the matrix - the diagonal elements don't seem to occur in your problem), and the problem is to find, for each subset T of size greater than two, the minimum f(S) over all S of size 2 contained in T. (And then you're interested only in sets T that contain a certain element "A" - but we'll disregard that for the moment.)
First of all, note that if you have n letters, this amounts to asking Omega(2^n) questions, roughly one for each subset. (Excluding the zero- and one-element subsets and those that don't include "A" saves you n + 1 sets and a factor of two, respectively, which is allowed for big Omega.) So if you want to store all these answers for even moderately large n, you'll need a lot of memory. If n is large in your applications, it might be best to store some collection of pre-computed data and do some computation whenever you need a particular data point; I haven't thought about what would work best, but for example computing data only for a binary tree contained in the lattice would not necessarily help you any more than precomputing nothing at all.
With these things out of the way, let's assume you actually want all the answers computed and stored in memory. You'll want to compute these "layer by layer", that is, starting with the three-element subsets (since the two-element subsets are already given by your matrix), then four-element, then five-element, etc. This way, for a given subset T, when we're computing f(T) we will already have computed f for every subset strictly contained in T. There are several ways that you can make use of this, but I think the easiest might be to use two such subsets: let t1 and t2 be two different elements of T that you may select however you like; let S be the subset of T that you get when you remove t1 and t2. Write S1 for S plus t1 and write S2 for S plus t2. Now every pair of letters contained in T is either fully contained in S1, or it is fully contained in S2, or it is {t1, t2}. Look up f(S1) and f(S2) in your previously computed values, then look up f({t1, t2}) directly in the matrix, and store f(T) = the minimum of these 3 numbers.
If you never select "A" for t1 or t2, then indeed you can compute everything you're interested in while not computing f for any sets T that don't contain "A". (This is possible because the steps outlined above are only interesting whenever T contains at least three elements.) Good! This leaves just one question: how to store the computed values f(T). What I would do is use a 2^(n-1)-sized array; represent each subset-of-your-alphabet-that-includes-"A" by the (n-1)-bit number where the ith bit is 1 whenever the (i+1)th letter is in that set (so 0010110, which has bits 1, 2, and 4 set, represents the subset {"A", "C", "D", "F"} out of the alphabet "A" .. "H" - note I'm counting bits starting at 0 from the right, and letters starting at "A" = 0). This way, you can actually iterate through the sets in numerical order and don't need to think about how to iterate through all k-element subsets of an n-element set. (You do need to include a special case for when the set under consideration has 0 or 1 element, in which case you'll want to do nothing, or 2 elements, in which case you just copy the value from the matrix.)
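Here is a short Python sketch of that bitmask scheme, using the 4x4 example from the question. The function name `min_values_for_first_row` is made up, and the zeros on the diagonal are placeholders, since the diagonal never takes part in the computation.

```python
def min_values_for_first_row(matrix):
    """f[mask] = minimum matrix entry over all pairs inside the subset
    {letter 0} plus {letter i+1 for every set bit i of mask}."""
    n = len(matrix)
    f = [None] * (1 << (n - 1))  # f[0] is the set {A} alone: nothing to store
    for mask in range(1, 1 << (n - 1)):
        members = [i + 1 for i in range(n - 1) if (mask >> i) & 1]
        if len(members) == 1:
            # Two-element set {A, x}: copy the value from the matrix.
            f[mask] = matrix[0][members[0]]
            continue
        t1, t2 = members[0], members[1]   # never letter 0 ("A")
        s1 = mask & ~(1 << (t2 - 1))      # T without t2
        s2 = mask & ~(1 << (t1 - 1))      # T without t1
        f[mask] = min(f[s1], f[s2], matrix[t1][t2])
    return f


#         A  B  C  D
matrix = [[0, 6, 8, 4],
          [6, 0, 6, 4],
          [8, 6, 0, 2],
          [4, 4, 2, 0]]
f = min_values_for_first_row(matrix)
print(f[0b011], f[0b101], f[0b110], f[0b111])  # A.B.C, A.B.D, A.C.D, A.B.C.D
```

Iterating in numerical order works because removing an element from a set always yields a smaller mask, so both lookups are already filled in.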
Well, it looks simple to me, but perhaps I misunderstand the problem. I would do it like this:
let P be a pattern string in your notation X1.X2. ... .Xn, where Xi is a column in your matrix
first compute the array CS = [ (X1, X2), (X1, X3), ... (X1, Xn) ], which contains all combinations of X1 with every other element in the pattern; CS has n-1 elements, and you can easily build it in O(n)
now you must compute min (CS), i.e. finding the minimum value of the matrix elements corresponding to the combinations in CS; again you can easily find the minimum value in O(n)
done.
Note: since your matrix is symmetric, given P you just need to compute CS by combining the first element of P with all other elements: (X1, Xi) is equal to (Xi, X1)
If your matrix is very large, and you want to do some optimization, you may consider prefixes of P: let me explain with an example
when you have solved the problem for P = X1.X2.X3, store the result in an associative map, where X1.X2.X3 is the key
later on, when you solve a problem P' = X1.X2.X3.X7.X9.X10.X11 you search for the longest prefix of P' in your map: you can do this by starting with P' and removing one component (Xi) at a time from the end until you find a match in your map or you end up with an empty string
if you find a prefix of P' in your map, then you already know the solution for that problem, so you just have to find the solution for the problem resulting from combining the first element of the prefix with the suffix, and then compare the two results: in our example the prefix is X1.X2.X3, and so you just have to solve the problem for
X1.X7.X9.X10.X11, and then compare the two values and choose the min (don't forget to update your map with the new pattern P')
if you don't find any prefix, then you must solve the entire problem for P' (and again don't forget to update the map with the result, so that you can reuse it in the future)
This technique is essentially a form of memoization.
I apologize for not having the math background to put this question in a more formal way.
I'm looking to create a string of 796 letters (or integers) with certain properties.
Basically, the string is a variation on a De Bruijn sequence B(12,4), except order and repetition within each n-length subsequence are disregarded.
i.e. ABBB BABA BBBA are each equivalent to {AB}.
In other words, the main property of the string involves looking at consecutive groups of 4 letters within the larger string
(i.e. the 1st through 4th letters, the 2nd through 5th letters, the 3rd through 6th letters, etc)
And then producing the set of letters that comprise each group (repetitions and order disregarded)
For example, in the string of 9 letters:
A B B A C E B C D
the first 4-letter group is: ABBA, which is comprised of the set {AB}
the second group is: BBAC, which is comprised of the set {ABC}
the third group is: BACE, which is comprised of the set {ABCE}
etc.
The goal is for every combination of 1-4 letters from a set of N letters to be represented by the 1-4-letter resultant sets of the 4-element groups once and only once in the original string.
For example, if there is a set of 5 letters {A, B, C, D, E} being used
Then the possible 1-4 letter combinations are:
A, B, C, D, E,
AB, AC, AD, AE, BC, BD, BE, CD, CE, DE,
ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE,
ABCD, ABCE, ABDE, ACDE, BCDE
Here is a working example that uses a set of 5 letters {A, B, C, D, E}.
D D D D E C B B B B A E C C C C D A E E E E B D A A A A C B D D B
The 1st through 4th elements form the set: D
The 2nd through 5th elements form the set: DE
The 3rd through 6th elements form the set: CDE
The 4th through 7th elements form the set: BCDE
The 5th through 8th elements form the set: BCE
The 6th through 9th elements form the set: BC
The 7th through 10th elements form the set: B
etc.
* I am hoping to find a working example of a string that uses 12 different letters (a total of 793 4-letter groups within a 796-letter string) starting (and if possible ending) with 4 of the same letter. *
Here is a working solution for 7 letters:
AAAABCDBEAAACDECFAAADBFBACEAGAADEFBAGACDFBGCCCCDGEAFAGCBEEECGFFBFEGGGGFDEEEEFCBBBBGDCFFFFDAGBEGDDDDBE
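To make the required property concrete, here is a small checker sketched in Python. The names `window_sets` and `is_solution` are made up for the example; the check is simply that the sets formed by the consecutive k-letter groups are exactly the 1-to-k letter combinations of the alphabet, each occurring once.

```python
from itertools import combinations


def window_sets(s, k=4):
    """Set of letters in each consecutive k-letter group of s."""
    return [frozenset(s[i:i + k]) for i in range(len(s) - k + 1)]


def is_solution(s, alphabet, k=4):
    """True if every 1..k letter combination appears exactly once."""
    wanted = {
        frozenset(c)
        for r in range(1, k + 1)
        for c in combinations(alphabet, r)
    }
    seen = window_sets(s, k)
    return len(seen) == len(wanted) and set(seen) == wanted


five = "DDDDECBBBBAECCCCDAEEEEBDAAAACBDDB"
print(is_solution(five, "ABCDE"))
```

The same check applies unchanged to a 12-letter candidate of length 796.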
Beware that in order to attempt an exhaustive search (the VB answer is trying a naive version of that) you'll first have to solve the problem of generating all possible expansions while maintaining lexicographical order. Just ABC expands to all perms of AABC, plus all perms of ABBC, plus all perms of ABCC, which is 3*12 = 36 words instead of just AABC. If you just concatenate AABC and AABD, it would cover just 4 out of the 12 perms of AABC, and even that by accident. This expansion alone brings you exponential complexity - end of game. Plus you'll need to maintain the association between all expansions and the set (the set becomes a label).
Your best bet is to use one of the known efficient De Bruijn constructors and try to see if you can put your set-equivalence in there. Check out
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.674&rep=rep1&type=pdf
and
http://www.dim.uchile.cl/~emoreno/publicaciones/FINALES/copyrighted/IPL05-De_Bruijn_sequences_and_De_Bruijn_graphs_for_a_general_language.pdf
for a start.
If you know graphs, another viable option is to start with the De Bruijn graph and formulate your set-equivalence as a graph rewriting. The 2nd paper does De Bruijn graph partitioning.
BTW, try the VB answer just for A, B, AB (at least the expansion is small) - it will make AABBAB and construct ABBA or ABBAB (or throw in a decent language), both of which are wrong. You can even prove that it will always miss with 1st lexical expansions (that's what AAB, AAAB etc. are) just by examining the first 2 passes (it will always miss the 2nd A for NxA because (N-1)xA+B is in the string (1st expansion of {AB})).
Oh, and if we could establish how many of each letter an optimal solution should have (don't look at B(5,2), it's too easy and regular :-) a random search would be feasible - you generate candidates with provable traits (like AAAA, BBBB ... are present and not touching, and it has n1 A-s, n2 B-s ...) in a random arrangement and then test whether they are solutions (checking is much faster than exhaustive search in this case).
Cool problem. Just a draft/pseudo algo:
dim STR-A as string = getall(ABCDEFGHIJKL)
//custom function to generate concat list of all 793 4-char combos.
//should be listed side-by-side to form 3172 character-long string.
//different ordering may ultimately produce different results.
//brute-forcing all orders of combos is too much work (793! is a big #).
//need to determine how to find optimal ordering, for this particular
//approach below.
dim STR-B as string = "" // to hold the string you're searching for
dim STR-C as string = "" // to hold the sub-string you are searching in
dim STR-A-NEW as string = "" //variable to hold your new string
dim MATCH as boolean = false //variable to hold matching status
while len(STR-A) > 0
    //check each character in STR-A, which will be shortened by 1 char on each
    //pass.
    MATCH = false
    STR-B = left(STR-A, 4)
    STR-B = reduce(STR-B)
    //reduce(str) is a custom re-usable function to sort & remove duplicates
    for i as integer = 1 to len(STR-A) - 1
        STR-C = substr(STR-A, i, 4)
        //gives you the 4-character sequence beginning at position i
        STR-C = reduce(STR-C)
        IF STR-B = STR-C Then
            MATCH = true
            exit for
            //as long as there is even one match, you can throw-away the first
            //letter
        END IF
        //no manual increment here: for/next already advances i
    next
    IF MATCH = false then
        //if you didn't find a match, then the first letter should be saved
        //(taken from STR-A, since STR-B has been sorted and de-duplicated)
        STR-A-NEW += LEFT(STR-A, 1)
    END IF
    MATCH = false //re-init MATCH
    STR-A = RIGHT(STR-A, LEN(STR-A) - 1) //re-init STR-A
wend
Anyway -- there could be problems at this, and you'd need to write another function to parse your result string (STR-A-NEW) to prove that it's a viable answer...
I've been thinking about this one and I'm sketching out a solution.
Let's call a string of four symbols a word and we'll write S(w) to denote the set of symbols in word w.
Each word abcd has "follow-on" words bcde where a,...,e are all symbols.
Let succ(w) be the set of follow-on words v for w such that S(w) != S(v). succ(w) is the set of successor words that can follow on from the first symbol in w if w is in a solution.
For each non-empty set of symbols s of cardinality at most four, let words(s) be the set of words w such that S(w) = s. Any solution must contain exactly one word in words(s) for each such set s.
Now we can do a reasonable search. The basic idea is this: say we are exploring a search path ending with word w. The follow-on word must be a non-excluded word in succ(w). A word v is excluded if the search path contains some word w such that v in words(S(w)).
You can be slightly more cunning: if we track the possible "predecessor" words to a set s (i.e., words w with a successor v such that v in words(s)) and reach a point where every predecessor of s is excluded, then we know we have reached a dead end, since we'll never be able to obtain s from any extension of the current search path.
Code to follow after the weekend, with a bit of luck...
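In the meantime, the basic search can be sketched in Python as below. This is not the answer author's code: it is a plain depth-first search that implements only the "excluded word" rule (a new window may not repeat a set already on the path), without the predecessor-based dead-end check, and the name `find_sequence` is made up. It is only expected to be practical for very small instances.

```python
from itertools import combinations


def find_sequence(alphabet, k):
    """Depth-first search for a string whose k-letter windows produce every
    1..k letter set over `alphabet` exactly once. Returns None if none exists."""
    wanted = {
        frozenset(c)
        for r in range(1, k + 1)
        for c in combinations(alphabet, r)
    }
    total = len(wanted)

    def extend(s, used):
        if len(used) == total:
            return s
        for ch in alphabet:
            t = s + ch
            if len(t) < k:
                # Still building the first window; nothing to check yet.
                result = extend(t, used)
            else:
                key = frozenset(t[-k:])
                if key in used:
                    continue  # this set is already on the search path
                used.add(key)
                result = extend(t, used)
                used.remove(key)
            if result is not None:
                return result
        return None

    return extend("", set())


print(find_sequence("ABC", 2))
```

For the 12-letter case the search space is far too large for this naive version, which is why the extra pruning described above matters.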
Here is my proposal. I'll admit upfront that this is a performance and memory hog.
This may be overkill, but have a class, which we'll call UniqueCombination. It will contain a unique 1-4 character reduced combination of the input set (i.e. A, AB, ABC, ...). It will also contain a list of possible combinations (AB: {AABB, ABAB, BBAA, ...}). It will need a method that determines whether any of its possible combinations overlaps any possible combination of another UniqueCombination by three characters. It also needs an overload that takes a string.
Then we start with the string "AAAA" and find all of the UniqueCombinations that overlap this string. Then we find how many UniqueCombinations each of those possible matches overlaps with (we could be smart at this point and store this number). Then we pick the one with the smallest number of overlaps greater than 0, so that the ones with the fewest possible matches get used up first.
Then we find a specific combination for the chosen UniqueCombination and add it to the final string. We remove this UniqueCombination from the list, then find overlaps for the current string again. Rinse and repeat. (We could be smart, and on subsequent runs, while searching for overlaps, remove any of the unreduced combinations that are contained in the final string.)
Well, that's my plan; I will work on the code this weekend. Granted, this does not guarantee that the final 4 characters will be 4 of the same letter (it might actually be trying to avoid that, but I will look into that as well).
If there is a non-exponential solution at all, it may need to be formulated in terms of a recursive "growth" from a problem with a smaller size, i.e. to construct B(N,k) from B(N-1,k-1) or from B(N-1,k) or from B(N,k-1).
Systematic construction for B(5,2) - one step at a time :-) It's bound to get more complex later [card stands for cardinality, {AB} has card=2, I'll also call them 2-s, 3-s etc.] Note, 2-s and 3-s will be k-1 and k later (I hope).
Initial. Start with k-1 result and inject symbols for singletons
(unique expansion empty intersection):
ABCDE -> AABBCCDDEE
mark used card=2 sets: AB,BC,CD,DE
Rewriting. Form card=3 sets to inject symbols into marked card=2.
1st feasible lexicographic expansion fires (may have to backtrack for k>2)
it's OK to use already marked 2-s since they'll all get replaced
but may have to do a verification pass for higher k
AB->ACB, BC->BDC, CD->CED, DE->DAE ==> AACBBDCCEDDAEEB
mark/verify used 2s
normally keep marking/unmarking during the construction but also keep the old
mark list
marking/unmarking can get expensive if there's backtracking in #3
Unused: AB, BE
For higher k may need several recursive rewriting passes
possibly partitioning new sets into classes
Finalize: unused 2-s should overlap around the edge (that's why it's cyclic)
ABE - B can go to the beginning or end: AACBBDCCEDDAEEB
Note: a step from B(N-1,k) to B(N,k) may need injection of pseudo-singletons, like doubling or tripling A
B(5,2) -> B(5,3) -> B(5,4)
Initial. same: - ABCDE -> AAACBBBDCCCEDDDAEEEB
no use of marking 3-sets since they are all going to be changed
Rewriting.
choose systematic insertion positions
AAA_CBBB_DCCC_EDDD_AEEE_B
mark all 2-s released by this: AC,AD,BD,BE,CE
use marked 2-s to decide inserted symbols - notice total regularity:
AxCB D -> ADCB
BxDC E -> BEDC
CxED A -> CAED
DxAE B -> DBAE
ExBA C -> ECBA
Verify that 3-s are all used (marked inserted symbols just for fun)
AAA[D]CBBB[E]DCCC[A]EDDD[B]AEEE[C]B
Note: Systematic choice of insertion points deterministically dictated the insertions (only AD can fit 1st, AC would create a duplicate 2-set (AAC, ACC))
Note: It's not going to be so nice for B(6,2) and B(6,3) since the number of 2-s will exceed 2x the number of 1-s. This is important since 2-s sit naturally on the sides of 1-s like CBBBE and the issue is how to place them when you run out of 1-s.
B(5,3) is so symmetrical that just repeating #1 produces B(5,4):
AAAADCBBBBEDCCCCAEDDDDBAEEEECB
If I have a set of values (which I'll call x), and a number of subsets of x:
What is the best way to work out all possible combinations of subsets whose union is equal to x, but none of which intersect with each other?
An example might be:
if x is the set of the numbers 1 to 100, and I have four subsets:
a = 1-49
b = 50-100
c = 50-75
d = 76-100
then the possible combinations would be:
a + b
a + c + d
What you describe is called the Exact cover problem. The general solution is Knuth's Algorithm X, with the Dancing Links algorithm being a concrete implementation.
Given a well-order on the elements of x (make one up if necessary, this is always possible for finite or countable sets):
Let "sets chosen so far" be empty. Consider the smallest element e of x. Find all sets which contain e and which do not intersect with any of the sets chosen so far. For each such set in turn, recurse: add it to "sets chosen so far", and look at the smallest element of x not in any chosen set. If you reach a point where there is no element of x left, then you've found a solution. If you reach a point where there is no unchosen set that contains the element you're looking for and does not intersect any of the sets you have already selected, then you've failed to find a solution, so backtrack.
This uses stack proportional to the number of non-intersecting subsets, so watch out for that. It also uses a lot of time - you can be far more efficient if, as in your example, the subsets are all contiguous ranges.
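The backtracking procedure just described might look like this in Python. It is a sketch, not Knuth's Algorithm X with Dancing Links: the name `exact_covers` is made up, the subsets are passed as a dict from a label to a set, and the example reads the question's ranges as 1-49, 50-100, 50-75 and 76-100.

```python
def exact_covers(universe, subsets):
    """Yield every list of subset names that partitions `universe`."""

    def search(remaining, chosen):
        if not remaining:
            yield list(chosen)  # nothing left to cover: a solution
            return
        e = min(remaining)  # smallest element not yet covered
        for name, s in subsets.items():
            # s must contain e and must not touch anything already covered.
            if e in s and s <= remaining:
                chosen.append(name)
                yield from search(remaining - s, chosen)
                chosen.pop()

    yield from search(set(universe), [])


subsets = {
    "a": set(range(1, 50)),
    "b": set(range(50, 101)),
    "c": set(range(50, 76)),
    "d": set(range(76, 101)),
}
for cover in exact_covers(range(1, 101), subsets):
    print(" + ".join(cover))
```

Because the smallest uncovered element must be covered by exactly one chosen set, each solution is produced once, with no permuted duplicates.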
Here's a bad way (recursive, does a lot of redundant work). But at least it's actual code and is probably halfway to the "efficient" solution.
def unique_sets(sets, target):
    # Every set has been considered and the target is fully covered.
    if not sets and not target:
        yield []
    for i, s in enumerate(sets):
        # s is usable if it overlaps the remaining target and lies entirely inside it.
        intersect = s.intersection(target) and not s.difference(target)
        sets_without_s = sets[:i] + sets[i+1:]
        if intersect:
            for us in unique_sets(sets_without_s, target.difference(s)):
                yield us + [s]
        else:
            for us in unique_sets(sets_without_s, target):
                yield us

class named_set(set):
    # A set that prints as its name, to keep the output readable.
    def __init__(self, items, name):
        set.__init__(self, items)
        self.name = name
    def __repr__(self):
        return self.name

a = named_set(range(0, 50), name='a')
b = named_set(range(50, 100), name='b')
c = named_set(range(50, 75), name='c')
d = named_set(range(75, 100), name='d')
# Prints each valid combination, many of them more than once (the redundant work).
for s in unique_sets([a,b,c,d], set(range(0, 100))):
    print(s)
A way (may not be the best way) is:
Create a set of all the pairs of subsets which overlap.
For every combination of the original subsets, say "false" if the combination contains one or more of the pairs listed in Step 1, else say "true" if the union of the subsets equals x (e.g. if the total number of elements in the subsets equals the number of elements in x)
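A brute-force Python sketch of these two steps is below. The name `disjoint_covers` is made up, the subsets are passed as a dict from a label to a set, and the final check uses the union directly rather than the element count. It enumerates every combination, so it is only meant for a handful of subsets.

```python
from itertools import combinations


def disjoint_covers(universe, subsets):
    """Return the combinations of subset names that are pairwise disjoint
    and whose union is the whole universe."""
    universe = set(universe)
    names = sorted(subsets)
    # Step 1: every pair of subsets that overlap.
    overlapping = {
        frozenset((p, q))
        for p, q in combinations(names, 2)
        if subsets[p] & subsets[q]
    }
    result = []
    # Step 2: test every combination of the original subsets.
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            if any(frozenset(pair) in overlapping
                   for pair in combinations(combo, 2)):
                continue  # contains an overlapping pair: "false"
            if set().union(*(subsets[n] for n in combo)) == universe:
                result.append(combo)
    return result


subsets = {
    "a": set(range(1, 50)),
    "b": set(range(50, 101)),
    "c": set(range(50, 76)),
    "d": set(range(76, 101)),
}
print(disjoint_covers(range(1, 101), subsets))
```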
The actual algorithm seems largely dependent on the choice of subsets, product operation, and equate operation. For addition (+), it seems like you could find a summation to suit your needs (the sum of 1 to 100 is similar to your a + b example). If you can do this, your algorithm is obviously O(1).
If you have a tougher product or equate operator (let's say taking a product of two terms means summing the strings and finding the SHA-1 hash), you may be stuck doing nested loops, which would be O(n^x) where x is the number of terms/variables.
Depending on the subsets you have to work with, it might be advantageous to use a more naive algorithm. One where you don't have to compare the entire subset, but only upper and lower bounds.
If you are talking about random subsets, not necessarily ranges, then Nick Johnson's suggestion will probably be the best choice.