hash for particular array - algorithm

I have a very particular problem that I want to solve efficiently.
A geometry is defined by V volumes, numbered from 0 to V-1.
Each volume is bounded by several surfaces, numbered from 0 to N-1.
Geometry A (V=3, N=7):

Volume | Surfaces
-------+---------------
   0   | [0 3 5 6 2]
   1   | [5 4 2 1]
   2   | [4 0 1 3 6]
Note that a surface appears only once within a volume.
Also, a surface appears in at most 2 volumes of a geometry.
Here is the problem:
I have two different descriptions of the same underlying geometry and I want to find which volume in Geometry A corresponds to which volume in Geometry B. In other words, I have the same N surfaces, but the V volumes are defined differently.
Here is a Geometry B that could correspond to Geometry A above:
Geometry B (V=3, N=7):

Volume | Surfaces
-------+---------------
   0   | [1 5 4 2]
   1   | [3 6 5 0 2]
   2   | [0 1 3 6 4]
Given Geometry A and B, I want to bind each volume of Geometry A to its corresponding volume in Geometry B, as efficiently as possible.
A: 0 1 2
B: 1 0 2
Draft of solution:
Sort each array of surfaces in ascending or descending order, then sort the volumes by the lexicographic order of their surface arrays. The problem is easily and robustly solved this way.
Better solution:
Compute a quick, unique hash for each array, then sort the volumes by this hash. The hash should not depend on the order of surfaces in the array.
Why do I think a hash can be a good solution?
Take hash(Volume) = min([Surfaces])
This hash already has at most 1 collision, because a surface can only appear in 2 volumes!
Now, if I take hash(Volume) = min([Surfaces]) + max([Surfaces])*N, I still have at most 1 collision, but the probability becomes very small when there are many volumes and surfaces.
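For reference, here is a minimal Python sketch of this scheme (the function name and the encoding of a geometry as a list of surface lists are illustrative, not part of the original problem). It buckets by hash(V) = min + max*N and falls back to comparing sorted arrays on collision:

def bind_volumes(geom_a, geom_b, n_surfaces):
    # hash(V) = min(surfaces) + max(surfaces) * N, order-independent
    def h(surfs):
        return min(surfs) + max(surfs) * n_surfaces

    buckets = {}
    for j, surfs in enumerate(geom_b):
        buckets.setdefault(h(surfs), []).append(j)

    binding = []
    for surfs in geom_a:
        candidates = buckets[h(surfs)]
        if len(candidates) == 1:
            binding.append(candidates[0])
        else:
            # collision: fall back to exact comparison of sorted arrays
            key = sorted(surfs)
            binding.append(next(j for j in candidates
                                if sorted(geom_b[j]) == key))
    return binding

geom_a = [[0, 3, 5, 6, 2], [5, 4, 2, 1], [4, 0, 1, 3, 6]]
geom_b = [[1, 5, 4, 2], [3, 6, 5, 0, 2], [0, 1, 3, 6, 4]]
print(bind_volumes(geom_a, geom_b, 7))  # [1, 0, 2]

Note that on this very example volumes A0 and A2 collide (both have min 0 and max 6), so the fallback path is actually exercised.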

As mentioned, your solution is a good approximation for what you want. However, if you seek a perfect hash function, you can use the following method:
Let p_i be the i-th prime number, so that p_0 = 2, p_1 = 3, p_2 = 5, p_3 = 7, p_4 = 11, p_5 = 13, p_6 = 17, p_7 = 19, .... We can define a hash function on the values x_0, x_1, ..., x_k of an array as h(x_0, ..., x_k) = p_{x_0} p_{x_1} ... p_{x_k}. For repeated numbers, we can use the repetition count as the exponent of p_{x_i}: for example, if x_i is repeated 3 times, the power of p_{x_i} in h is p_{x_i}^3; in general, if the number of repetitions of x_i is a_i, we have h(x_0, ..., x_k) = p_{x_0}^{a_0} p_{x_1}^{a_1} ... p_{x_k}^{a_k}. By unique prime factorization, two arrays receive the same hash exactly when they contain the same values with the same multiplicities, regardless of order.
Hence, for geometry A we have:
Geometry A:

Volume | Surfaces        | Hash
-------+-----------------+----------------------------
   0   | [0, 3, 5, 6, 2] | 2 * 7 * 13 * 17 * 5 = 15470
   1   | [5, 4, 2, 1]    | 13 * 11 * 5 * 3     = 2145
   2   | [4, 0, 1, 3, 6] | 11 * 2 * 3 * 7 * 17 = 7854
And similarly for geometry B. As this function returns a unique value for each array (regardless of the order of its elements), you can match the volumes by their hash values. If N is not big, you can use a precomputed list of prime values.
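A hedged sketch of this prime-product hash in Python (the naive trial-division prime generator is illustrative; for large N a sieve or a precomputed table would be preferable):

def first_primes(n):
    # Naive trial division; fine for modest n
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

N = 7
P = first_primes(N)  # [2, 3, 5, 7, 11, 13, 17]

def prime_hash(surfaces):
    # Order-independent and perfect: one prime factor per surface
    h = 1
    for s in surfaces:
        h *= P[s]
    return h

print(prime_hash([0, 3, 5, 6, 2]))  # 15470
print(prime_hash([5, 4, 2, 1]))     # 2145
print(prime_hash([4, 0, 1, 3, 6]))  # 7854

The product grows quickly with the array length, so the hash needs big-integer arithmetic for large N; Python handles that natively.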

I found a pretty good hash function that should almost never have collisions:
V: [S_0 S_1 S_2 ... S_{N-1}]

u64 hash(V) = 0;
for i in {0 .. N-1}:
    hash(V) = hash(V) ^ (1 << (S_i & 63))
end
This gives a well-spread 64-bit number, and every value is possible (unlike Omg's solution, where most numbers are impossible to get, given that there is no repetition in the list of surfaces).
In the extreme case where there is a collision (which I will detect after sorting), I will simply compare the arrays lexicographically.
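A direct Python transcription of that pseudocode (hedged: the function name is illustrative):

def xor_hash(surfaces):
    # Each surface toggles one of 64 bit positions; XOR makes the
    # result independent of the order of the surfaces.
    h = 0
    for s in surfaces:
        h ^= 1 << (s & 63)
    return h

print(xor_hash([0, 3, 5, 6, 2]) == xor_hash([3, 6, 5, 0, 2]))  # True

Note that if N > 64, two surfaces sharing the same value mod 64 cancel each other out, which is exactly the collision case handled by the lexicographic fallback.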

Convert the permutation sequence A to B by selecting a set in A then reversing that set and inserting that set at the beginning of A

You are given sequences A and B, each consisting of N numbers and each a permutation of 1, 2, 3, ..., N. At each step, you choose a set S of elements of A, in order from left to right (the selected numbers are removed from A), then reverse S and insert all its elements at the beginning of A. Find a way to transform A into B in log2(N) steps.
Input: N <= 10^4 (the number of elements of sequences A and B) and the 2 permutation sequences A, B.
Output: K (the number of steps to convert A to B). The next K lines are the set of numbers S selected at each step.
Example:
Input:
5 // N
5 4 3 2 1 // A sequence
2 5 1 3 4 // B sequence
Output:
2
4 3 1
5 2
Step 0: S = {}, A = {5, 4, 3, 2, 1}
Step 1: S = {4, 3, 1}, A = {5, 2}. Then reverse S => S = {1, 3, 4}. Insert S to beginning of A => A = {1, 3, 4, 5, 2}
Step 2: S = {5, 2}, A = {1, 3, 4}. Then reverse S => S = {2, 5}. Insert S to beginning of A => A = {2, 5, 1, 3, 4}
My solution is to use backtracking to consider all possible choices of S in log2(N) steps. However, N is too large, so is there a better approach? Thank you.
For each operation of combined selecting/removing/prepending, you're effectively sorting the elements relative to a "pivot", and preserving order. With this in mind, you can repeatedly "sort" the items in backwards order (by that I mean, you sort on the most significant bit last), to achieve a true sort.
For an explicit example, let's take the sequence 7 3 1 8. Rewrite the terms with their respective positions in the final sorted list (which would be 1 3 7 8), to get 2 1 0 3.
7 -> 2 // 7 is at index 2 in the sorted array
3 -> 1 // 3 is at index 1 in the sorted array
1 -> 0 // so on
8 -> 3
This new array is equivalent to the original: we are just using indices to refer to the values indirectly (if you squint hard enough, we're kinda rewriting the unsorted list as pointers to the sorted list, rather than values).
Now, let's write these new values in binary:
2 10
1 01
0 00
3 11
If we were to sort this list, we'd first sort by the MSB (most significant bit) and then tiebreak only where necessary on the subsequent bit(s) until we're at the LSB (least significant bit). Equivalently, we can sort by the LSB first, and then sort all values on the next most significant bit, and continuing in this fashion until we're at the MSB. This will work, and correctly sort the list, as long as the sort is stable, that is- it doesn't change the order of elements that are considered equal.
Let's work this out by example: if we sorted these by the LSB, we'd get
2 10
0 00
1 01
3 11
-and then following that up with a sort on the MSB (but no tie-breaking logic this time), we'd get:
0 00
1 01
2 10
3 11
-which is the correct, sorted result.
Remember the "pivot" sorting note at the beginning? This is where we use that insight. We're going to take this transformed list 2 1 0 3, and sort it bit by bit, from the LSB to the MSB, with no tie-breaking. And to do so, we're going to pivot on the criteria <= 0.
This is effectively what we just did in our last example, so in the name of space I won't write it out again, but have a look again at what we did in each step. We took the elements with the bits we were checking that were equal to 0, and moved them to the beginning. First, we moved 2 (10) and 0 (00) to the beginning, and then the next iteration we moved 0 (00) and 1 (01) to the beginning. This is exactly the operation your challenge permits you to do.
Additionally, because our numbers are reduced to their indices, the max value is len(array)-1, and the number of bits is log2() of that, so overall we'll only need to do log2(n) steps, just as your problem statement asks.
Now, what does this look like in actual code?
from itertools import product
from math import log2, ceil

nums = [5, 9, 1, 3, 2, 7]
size = ceil(log2(len(nums)))  # bits needed for every index 0..len(nums)-1
bit_table = list(product([0, 1], repeat=size))
idx_table = {x: i for i, x in enumerate(sorted(nums))}
for bit_idx in range(size)[::-1]:  # least significant bit first
    subset_vals = [x for x in nums if bit_table[idx_table[x]][bit_idx] == 0]
    nums.sort(key=lambda x: bit_table[idx_table[x]][bit_idx])  # stable sort
    print(" ".join(map(str, subset_vals)))
You can of course use bitwise operators to accomplish the bit magic ((thing >> bit_idx) & 1) if you want, and you could del slices of the list + prepend instead of .sort()ing; this is just a proof of concept to show that it actually works. The actual output being:
1 3 7
1 7 9 2
1 2 3 5

Neighbors in the matrix - algorithm

I have a problem with coming up with an algorithm for the "graph" :(
Maybe one of you would be so kind and direct me somehow <3
The task is as follows:
We have a board of at least 3x3 (it doesn't have to be square; it can be 4x5, for example). The user specifies a sequence of moves (as in an Android lock pattern). The task is to check how many of the points given are adjacent to each other horizontally or vertically.
Here is an example:
Matrix:
1 2 3 4
5 6 7 8
9 10 11 12
The user entered the code: 10,6,7,3
The algorithm should return the number 3 because:
10 is a neighbor of 6
6 is a neighbor of 7
7 is a neighbor of 3
Eventually return 3
Second example:
Matrix:
1 2 3
4 5 6
7 8 9
The user entered the code: 7,8,6,3
The algorithm should return 2 because:
7 is a neighbor of 8
8 is not a neighbor of 6
6 is a neighbor of 3
Eventually return 2
Of course, the number of checks equals the length of the array minus 1.
Sorry for the Polish words "ile" ("how many") and "tutaj" ("here"); I'm Polish.
If all the codes are unique, use them as keys to a dictionary (with (row, col) pairs as values). Loop through the user input from the 2nd item to the end and check whether math.Abs(cur.row-prev.row) + math.Abs(cur.col-prev.col) == 1. This is not space efficient, but it handles the user input in linear time.
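A hedged Python sketch of that dictionary approach (names are illustrative; math.Abs above is Go-flavored, abs below is the Python equivalent):

def count_neighbors(matrix, code):
    # Map each cell value to its (row, col); assumes all values are unique
    pos = {val: (r, c)
           for r, row in enumerate(matrix)
           for c, val in enumerate(row)}
    count = 0
    for prev, cur in zip(code, code[1:]):
        (r1, c1), (r2, c2) = pos[prev], pos[cur]
        if abs(r1 - r2) + abs(c1 - c2) == 1:  # Manhattan distance 1
            count += 1
    return count

matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(count_neighbors(matrix, [10, 6, 7, 3]))  # 3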
The idea is that you have 4 conditions, one for each direction. Given any matrix of shape (n, m) filled with a sequence of integers, and given any element:
The element to the left or right is always +1 or -1 relative to the given element.
The element above or below is always +m or -m relative to the given element.
So, if abs(x-y) is 1 or m, then x and y are neighbors. (Caveat: the difference-of-1 test also accepts end-of-row/start-of-next-row pairs, such as 4 and 5 in the 3x4 matrix above, so a same-row check is needed to rule those out.)
I demonstrate this in python.
import numpy as np

def get_neighbors(seq, matrix):
    # Conditions: horizontal neighbors differ by 1, vertical neighbors by m
    check = lambda x, y, m: np.abs(x - y) == 1 or np.abs(x - y) == m
    # Consecutive pairs from the sequence, each paired with m
    params = zip(seq, seq[1:], [matrix.shape[1]] * (len(seq) - 1))
    neighbours = [check(*i) for i in params]
    count = sum(neighbours)
    return neighbours, count
seq = [10, 6, 7, 3]
matrix = np.arange(1, 13).reshape((3, 4))
neighbours, count = get_neighbors(seq, matrix)
print('Matrix:')
print(matrix)
print('')
print('Sequence:', seq)
print('')
print('Count of neighbors:',count)
Matrix:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
Sequence: [10, 6, 7, 3]
Count of neighbors: 3
Another example -
seq = [7,8,6,3]
matrix = np.arange(1,10).reshape((3,3))
neighbours, count = get_neighbors(seq, matrix)
Matrix:
[[1 2 3]
[4 5 6]
[7 8 9]]
Sequence: [7, 8, 6, 3]
Count of neighbors: 2
So your input is the width of a table, the height of a table, and a list of numbers.
W = 4, H = 3, list = [10,6,7,3]
There are two steps:
Convert the list of numbers into a list of row/column coordinates (1 to [1,1], 5 to [2,1], 12 to [3,4]).
In the new list of coordinates, find consecutive pairs which have one coordinate identical and the other differing by exactly 1.
Both steps are quite simple ("for" loops). Do you have problems with 1 or 2?
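If step 1 is the sticking point, it is just integer division and modulo; a minimal hedged sketch, assuming 1-based row-major numbering and the width W = 4 from the example:

W = 4  # table width from the example

def to_coord(v):
    # Cell number v (1-based, row-major) -> 1-based (row, col)
    return (v - 1) // W + 1, (v - 1) % W + 1

print([to_coord(v) for v in (1, 5, 12)])  # [(1, 1), (2, 1), (3, 4)]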

Practical algorithms for permuting external memory

On a spinning disk, I have N records that I want to permute. In RAM, I have an array of N indices that contain the desired permutation. I also have enough RAM to hold n records at a time. What algorithm can I use to execute the permutation on disk as quickly as possible, taking into account the fact that sequential disk access is a lot faster?
I have plenty of excess disk to use for intermediate files, if desired.
This is a known problem. Find the cycles in your permutation order. For instance, given five records to permute [1, 0, 3, 4, 2], you have cycles (0, 1) and (2, 3, 4). You do this by picking an unused starting position; follow the index pointers until you return to your starting point. The sequence of pointers describes a cycle.
You then permute the records with an internal temporary variable, one record long.
temp = disk[0]
disk[0] = disk[1]
disk[1] = temp
temp = disk[2]
disk[2] = disk[3]
disk[3] = disk[4]
disk[4] = temp
Note that you can also perform the permutation as you traverse the pointers. You will also need some method to recall which positions have already been permuted, such as clearing the permutation index (set it to -1).
Can you see how to generalize that?
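If it helps, here is a hedged Python sketch of the generalization (in-memory for clarity; on disk, the disk[] accesses become record reads and writes). It assumes the convention used above, new[i] = old[perm[i]], and marks finished positions with -1 as suggested:

def permute(disk, perm):
    # Follow each cycle of the permutation with a one-record buffer
    n = len(disk)
    for start in range(n):
        if perm[start] == -1:   # already handled in an earlier cycle
            continue
        temp = disk[start]
        i = start
        while perm[i] != start:
            disk[i] = disk[perm[i]]
            nxt = perm[i]
            perm[i] = -1        # mark position i as done
            i = nxt
        disk[i] = temp
        perm[i] = -1
    return disk

print(permute([10, 11, 12, 13, 14], [1, 0, 3, 4, 2]))
# [11, 10, 13, 14, 12]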
This is a problem of interval coordination. I'll simplify the notation slightly by changing the memory available to M records -- having upper- and lower-case N is a little confusing.
First, we re-cast the permutations as a series of intervals, the rotational span during which a record needs to reside in RAM. If a record needs to be written to a lower-numbered position, we increase the endpoint by the list size, to indicate the wraparound -- have to wait for the next disk rotation. For instance, using my earlier example, we expand the list:
[1, 0, 3, 4, 2]
0 -> 1
1 -> 0+5
2 -> 3
3 -> 4
4 -> 2+5
Now, we apply standard greedy scheduling resolution. First, sort by endpoint:
[0, 1]
[2, 3]
[3, 4]
[1, 5]
[4, 7]
Now, apply the algorithm for M-1 "lanes"; the extra one is needed for swap space. We fill each lane, appending the interval with the earliest endpoint, whose start-point doesn't overlap:
[0, 1] [2, 3] [3, 4] [4, 7]
[1, 5]
We can do this in a total of 7 "ticks" if M >= 3. If M=2, we defer the second lane by 2 rotations to [11, 15].
Sneftal's nice example gives us more trouble, with deeper overlap:
[0, 4]
[1, 5]
[2, 6]
[3, 7]
[4, 0+8]
[5, 1+8]
[6, 2+8]
[7, 3+8]
This requires 4 "lanes" if available, deferring lanes as needed if M < 5.
The pathological case is where every record in the permutation needs to be copied back one position, such as [3, 0, 1, 2], with M=2.
[0, 3]
[1, 4]
[2, 5]
[3, 6]
In this case, we walk through the deferral cycle multiple times. At the end of every rotation, we have to defer all remaining intervals by one rotation, resulting in
[0, 3] [3, 6] [2+4, 5+4] [1+4+4, 4+4+4]
Does that get you moving, or do you need more detail?
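For concreteness, a hedged Python sketch of the interval construction and the greedy lane filling described above (the whole-rotation deferral of unplaced intervals is only stubbed out):

def to_intervals(perm):
    # Interval [i, dest] during which record i must sit in RAM;
    # add len(perm) when the destination wraps around the disk.
    n = len(perm)
    return [(i, d if d > i else d + n) for i, d in enumerate(perm)]

def fill_lanes(intervals, num_lanes):
    # Greedy: sort by endpoint; append each interval to the first lane
    # whose last endpoint does not overlap the interval's start.
    lanes = [[] for _ in range(num_lanes)]
    deferred = []
    for s, e in sorted(intervals, key=lambda iv: iv[1]):
        for lane in lanes:
            if not lane or lane[-1][1] <= s:
                lane.append((s, e))
                break
        else:
            deferred.append((s, e))  # would be pushed to a later rotation
    return lanes, deferred

ivs = to_intervals([1, 0, 3, 4, 2])  # [(0,1), (1,5), (2,3), (3,4), (4,7)]
print(fill_lanes(ivs, 2))            # M = 3, so M-1 = 2 lanes
# ([[(0, 1), (2, 3), (3, 4), (4, 7)], [(1, 5)]], [])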
I have an idea, which might need further improvement. But here it goes:
Suppose the hdd has the following structure:
5 4 1 2 3
And we want to write out this permutation:
2 3 5 1 4
Since the hdd is a circular buffer, and assuming it can only rotate in one direction, we can express the above permutation using shifts, as such:
5 >> 2
4 >> 3
1 >> 1
2 >> 2
3 >> 2
So let's put that in an array, and since we know it is a circular array, let's put its mirrors side by side:
| 2 3 1 2 2 | 2 3 1 2 2 | 2 3 1 2 2 | 2 3 1 2 2 | ... Inf
Since we want to favor sequential reads (or writes), we can assign a cost function to the above series. Let the cost function be linear, i.e.:
0 1 2 3 4 5 6 7 8 9 10 ... Inf
Now, let us add the cost function to the above series, but how do we select the starting point?
The idea is to select the starting point such that we get the longest contiguous monotonically increasing sequence.
For example, if you select the 0 point to be on "3", you'll get
(1) | - 3 2 4 5 | 6 8 7 9 10 | ...
If you select the 0 point to be on "2", the one just right of "1", you'll get:
(2) | - - - 2 3 | 4 6 5 7 8 | ...
Since we are trying to favor consecutive reads, let's define our read-write function to work as follows:
f():
- At the currently pointed hdd location, read the current record into available RAM (namely, total space - 1, because we want to reserve 1 slot for swap).
- If no space is left in RAM for the read, assert and halt.
- At the current hdd location, if RAM holds the value that we want written to that location, read the current record into swap space, write the wanted value from RAM to the hdd, and discard the value from RAM.
- Whenever a value is placed onto the hdd, check whether the sequence is complete. If it is, return with success.
Now, note that if
shift amount <= n - 1 (n: the number of records we can hold in RAM)
holds, we can traverse the hard disk in one pass using the above function. For example:
current: 4 5 6 7 0 1 2 3
we want: 0 1 2 3 4 5 6 7
n : 5
We can start anywhere we want, say from the initial "4". We read 4 records sequentially (RAM now holds 4 records), and we start placing 0 1 2 3 (we can, because RAM holds n = 5 total: 4 are in use and 1 is reserved for swap). So the total is 4 sequential reads, followed by 8 read-write operations.
Using that analogy, it becomes clear that if we subtract n-1 from equations (1) and (2), the positions with value <= 0 are better suited as initial positions, because the ones above zero will definitely require another pass.
So we select eq. (2) and subtract; for, let's say, n = 3, we subtract 2 from eq. (2):
(2) | - - - 0 1 | 2 4 3 5 6 | ...
Now it is clear that, using f() and starting from 0, assuming n = 3, the opening operations will be: r, r, r-w, r-w, ...
So, how do we do the rest and find the minimum cost? We place an array with an initial minimum cost just below equation (2). The positions in that array signify where we want f() to be executed.
| - - - 0 1 | 2 4 3 5 6 | ...
| - - - 1 1 | 1 1 1 1 1 | ...
The second array, the one with 1's and 0's, tells the program where to execute f(). Note that if we guess those locations wrong, f() will assert.
Before we actually start placing records on the hdd, we of course want to check that the f() positions are correct. We check for assertions and try to minimize cost while removing all assertions. So, e.g.:
(1) 1111000000000000001111
(2) 1111111000000000000000
(1) obviously has a higher cost than (2). So the question simplifies to finding the 1-0 array.
Some ideas on finding the best array:
The simplest solution is to write out all 1's and turn assertions into 0's (essentially, a skip). This method is guaranteed to work.
Brute force: write an array as shown in (2) and start shifting 1's to the right, in an order that tries out every available permutation:
1111111100000000
1111111010000000
1111110110000000
...
Full random approach: plug in MT19937 and start permuting. Whenever you see a sharp drop in cost, stop and perform the hdd copy-paste. You won't find the global minimum, but you'll get a nice trade-off.
Genetic algorithms: for permutations where the shift count is much lower than n - 1, the methodology provided in this answer should (?) give a global minimum and smooth gradients. This allows one to use genetic algorithms without relying on mutations too much.
One advantage I find in this approach is that, since the OP mentioned that this is a real-life problem, it provides an easy(ish) way to change cost functions. It is easier to gauge the effect of, say, having lots of contiguous small files to copy vs. a single huge file. Or perhaps rrwwrrww is better than rrrrwwww?
Does any of this even make sense? We will have to try it out...

Perimeter of union of N rectangles

I want to know an efficient way to solve this problem:
Given N rectangles, each specified by its top-left and bottom-right corners, find the perimeter of the union of the N rectangles.
I only have an O(N^2) algorithm and it's too slow, so please find a more efficient algorithm.
You can assume that coordinate value is positive integer and less than 100000.
EDIT:
For example, in this case, the perimeter is 30.
An O(n^2) algorithm:

for x = 0 to maxx
    for i = 0 to N-1
        if lowx[i] = x
            for j = lowy[i] to highy[i]
                d[j]++
                if d[j] = 1 then ret++
        if highx[i] = x
            for j = lowy[i] to highy[i]
                d[j]--
                if d[j] = 0 then ret++
    for y = 0 to maxy
        if d[y] = 0 && d[y + 1] >= 1 then ret++
        if d[y] >= 1 && d[y + 1] = 0 then ret++

The final ret is the answer.
There's an O(n log n)-time sweepline algorithm. Apply the following steps to compute the total length of the horizontal edges of the union (the part of the boundary traced out by points on a vertical sweepline as it advances). Transpose the input and apply them again to compute the vertical edges.
For each rectangle, prepare a start event keyed by its left x-coordinate whose value is its y-interval, and a stop event keyed by its right x-coordinate whose value is its y-interval. Sort these events by x-coordinate and process them in order. At all times, we maintain a data structure capable of reporting the number of points at which the boundary intersects the sweepline. On the 2n - 1 intervals between event points, we add this number times the width of the interval to the perimeter.
The data structure we need supports the following operations in time O(log n).
insert(ymin, ymax) -- inserts the interval [ymin, ymax] into the data structure
delete(ymin, ymax) -- deletes the interval [ymin, ymax] from the data structure
perimeter() -- returns the perimeter of the 1D union of the contained intervals
Since the input coordinates are bounded integers, one possible implementation is via a segment tree. (There's an extension to real inputs that involves sorting the y-coordinates of the input and remapping them to small integers.) Each segment has some associated data
struct {
    int covers_segment;
    bool covers_lower;
    int interior_perimeter;
    bool covers_upper;
};
whose scope is the union of segments descended from it that are present in the input intervals. (Note that a very long segment has no influence on the leafmost levels of the tree.)
The meaning of covers_segment is that it's the number of intervals that have this segment in their decomposition. The meaning of covers_lower is that it's true if one of the segments descended from this one with the same lower endpoint belongs to the decomposition of some interval. The meaning of interior_perimeter is the 1D perimeter of segments in scope (as described above). The meaning of covers_upper is akin to covers_lower, with the upper endpoint.
Here's an example.

0 1 2 3 4 5 6 7 8 9
[---A---]
    [---B---] [-D-]
      [-C-]

The intervals are A = [0, 4]; B = [2, 6], with segment decomposition [2, 4], [4, 6]; C = [3, 5], decomposing into [3, 4], [4, 5]; and D = [7, 9], decomposing into [7, 8], [8, 9].
         c_s  c_l  i_p  c_u
[0, 1]    0    F    0    F
[0, 2]    0    F    0    F
[1, 2]    0    F    0    F
[0, 4]    1    T    0    T
[2, 3]    0    F    0    F
[2, 4]    1    T    1    T
[3, 4]    1    T    0    T
[0, 8]    0    T    2    F
[4, 5]    1    T    0    T
[4, 6]    1    T    1    T
[5, 6]    0    F    0    F
[4, 8]    0    T    2    F
[6, 7]    0    F    0    F
[6, 8]    0    F    1    F
[7, 8]    1    T    0    T
[0, 9]    0    T    2    T
[8, 9]    1    T    0    T
To insert (delete) an interval, insert (delete) its constituent segments by incrementing (decrementing) covers_segment. Then, for all ancestors of the affected segments, recalculate the other fields as follows.
if s.covers_segment == 0:
    s.covers_lower = s.lower_child.covers_lower
    s.interior_perimeter =
        s.lower_child.interior_perimeter +
        (1 if s.lower_child.covers_upper != s.upper_child.covers_lower else 0) +
        s.upper_child.interior_perimeter
    s.covers_upper = s.upper_child.covers_upper
else:
    s.covers_lower = true
    s.interior_perimeter = 0
    s.covers_upper = true
To implement perimeter, return
(1 if root.covers_lower else 0) +
root.interior_perimeter +
(1 if root.covers_upper else 0)
where root is the root of the segment tree.
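To make the sweep concrete, here is a hedged Python sketch that keeps the event structure described above but substitutes a naive coverage array for the segment tree (O(maxy) per slice instead of O(log n)); it is meant as a correctness reference, not the fast version:

def edge_length_one_axis(rects):
    # rects: (x1, y1, x2, y2) with x1 < x2 and y1 < y2, integer coords.
    # Sweep left to right; between consecutive event x's, add
    # (1D perimeter of the y-slice) * (gap width) to the total.
    events = []
    for x1, y1, x2, y2 in rects:
        events.append((x1, 1, y1, y2))    # start event
        events.append((x2, -1, y1, y2))   # stop event
    events.sort()
    maxy = max(y2 for _, _, _, y2 in rects)
    cover = [0] * maxy                    # cover[j] covers cell [j, j+1)
    total, prev_x = 0, None
    for x, delta, y1, y2 in events:
        if prev_x is not None and x > prev_x:
            covered = [1 if c > 0 else 0 for c in cover]
            # 1D perimeter = number of 0/1 transitions, padding with 0
            endpoints = sum(a != b for a, b in
                            zip([0] + covered, covered + [0]))
            total += endpoints * (x - prev_x)
        for j in range(y1, y2):
            cover[j] += delta
        prev_x = x
    return total

def union_perimeter(rects):
    transposed = [(y1, x1, y2, x2) for x1, y1, x2, y2 in rects]
    return edge_length_one_axis(rects) + edge_length_one_axis(transposed)

# Two 2x2 squares sharing one edge form a 4x2 rectangle: perimeter 12
print(union_perimeter([(0, 0, 2, 2), (2, 0, 4, 2)]))  # 12

Since the question bounds coordinates below 100000, the coverage array is feasible; the segment tree's job is to replace the linear rescan of cover with an O(log n) update and query.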
This might help in some cases of your problem:
Consider that this,
 _______
|       |_
|         |
|        _|
|___     |
    |    |
    |____|

has the same perimeter as this:

 _________
|         |
|         |
|         |
|         |
|         |
|_________|
On the one hand, the classic solution for this problem would be a sweepline-based "boolean merge" algorithm, which in its original form builds the union of these rectangles, i.e. builds the polygonal boundary of the result. The algorithm can easily be modified to calculate the perimeter of the resultant boundary without physically building it.
On the other hand, sweep-line-based "boolean merge" can do this for arbitrary polygonal input. Given that in your case the input is much more restricted (and simplified) - just a bunch of isothetic rectangles - it is quite possible that a more lightweight and clever solution exists.
Note, BTW, that union of such rectangles might actually be a multi-connected polygon, i.e. an area with holes in it.

Finding all possible combinations of row in a matrix where sum of columns represents a specific row vector

I need to find all possible combinations of rows in a matrix whose column-wise sums equal a specific row vector.
Example:
Consider the following matrix
| 0 0 2 |
| 1 1 0 |
| 0 1 2 |
| 1 1 2 |
| 0 1 0 |
| 2 1 2 |
I need to find the combinations of rows whose column-wise sums equal the row vector:
| 2 2 2 |
The possible combinations are:
1.
| 1 1 0 |
| 1 1 2 |
2.
| 0 1 0 |
| 2 1 2 |
What is the best way to find them?
ALGORITHM
One option is to turn this into the subset sum problem by choosing a base b and treating each row as a number in base b.
For example, with a base of 10 your initial problem turns into:
Consider the list of numbers
002
110
012
112
010
212
Find all subsets that sum to 222
This problem is well known and is solvable via dynamic programming (see the wikipedia page).
If all your entries are nonnegative, then you can use David Pisinger's linear-time algorithm, which has complexity O(nC), where C is the target number and n is the length of your list.
CHOICE OF BASE
The complexity of the algorithm is determined by the choice of the base b.
For the algorithm to be correct, you need to choose the base larger than the sum of the digits in each column. (This is needed to avoid spurious solutions caused by a carry from one digit overflowing into the next.)
However, note that if you choose a smaller base you will still get all the correct solutions, plus some incorrect ones. It may be worth using a smaller base (which will make the subset-sum algorithm work much faster), followed by a postprocessing stage that checks all the solutions found and discards any incorrect ones.
Too small a base will produce an exponential number of incorrect solutions to discard, so the best size of base will depend on the details of your problem.
EXAMPLE CODE
Python code to implement this algorithm:
from collections import defaultdict

A = [[0, 0, 2],
     [1, 1, 0],
     [0, 1, 2],
     [1, 1, 2],
     [0, 1, 0],
     [2, 1, 2]]
target = [2, 2, 2]
b = 10

def convert2num(a):
    # Interpret the row as a number written in base b
    t = 0
    for d in a:
        t = b * t + d
    return t

B = [convert2num(a) for a in A]
target_num = convert2num(target)

M = defaultdict(list)
for v, a in zip(B, A):
    M[v].append(a)  # Store a reverse index to allow us to look up rows

# First build the DP array
# Map from each reachable number to the set of last summands reaching it
DP = defaultdict(set)
DP[0] = set()
for v in B:
    for old_value in list(DP.keys()):
        new_value = old_value + v
        if new_value <= target_num:
            DP[new_value].add(v)

# Then search for solutions
def go(goal, sol):
    if goal == 0:
        # Double check
        assert list(map(sum, zip(*sol))) == target
        print(sol)
        return
    for v in DP[goal]:
        for a in M[v]:
            sol.append(a)
            go(goal - v, sol)
            sol.pop()

go(target_num, [])
This code assumes that b has been chosen large enough to avoid a carry from one digit into the next.
