Create Ancestor Matrix from given Binary Tree - algorithm

The question is, given an Ancestor Matrix, as a bitmap of 1s and 0s, to construct the corresponding Binary Tree. Can anyone give me an idea on how to do it? I found a solution on Stack Overflow, but the line a[root->data][temp[i]]=1 seems wrong: nothing guarantees that the nodes contain data 1 to n. A node may contain, say, 2000, in which case there is no a[2000][some_column], since there are only 7 nodes and hence 7 rows and columns in the matrix.

Two ways:
Normalize your node values such that they are all from 1 to n. If you have nodes 1, 2, 5000 for example, make them 1, 2, 3. You can do this by sorting or hashing your labels and keeping something like normalized[i] = normalized value of node i. normalized can be a map / hash table if you have very large labels or even text labels.
You might be able to use a sparse matrix for this, implementable with a hash table or a set: keep a hash table of hash tables. H[x] stores another hash table that stores your y values. So if in a naive matrix solution you had a[2000][5000] = 1, you would use H.get(2000) => returns a hash table H' of values stored on the 2000th row => H'.get(5000) => returns the value you want.
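Both ideas can be sketched in a few lines of Python. This is only an illustration: the helper name normalize_labels and the sample labels are made up here, and the normalized indices are 0-based rather than 1 to n.

```python
from collections import defaultdict

def normalize_labels(labels):
    """Map arbitrary (sortable) node labels to the compact range 0..n-1."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}

# Way 1: dense matrix indexed by normalized labels.
labels = [1, 2, 5000]
normalized = normalize_labels(labels)
n = len(normalized)
a = [[0] * n for _ in range(n)]
a[normalized[1]][normalized[5000]] = 1      # mark "1 is an ancestor of 5000"

# Way 2: sparse matrix as a hash table of hash tables, keyed by the raw labels.
H = defaultdict(dict)
H[2000][5000] = 1

# .get() with a default avoids creating rows for labels that were never stored.
print(a[0][2], H.get(2000, {}).get(5000, 0), H.get(7, {}).get(9, 0))
```

The dense version costs O(n^2) memory regardless of the label values; the sparse version only stores the entries that are actually set.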


perfect hash function for random integer

Here's the problem:
X is a set of non-negative integers (0 included) with n distinct elements that I know in advance. All of them are less than or equal to m. I want a collision-free hash function, as simple as possible, that maps them to 0..n-1.
For example:
X = [31,223,121,100,123,71], so n = 6, m = 223.
I want to find a hash function to map them to [0, 1, 2, 3, 4, 5].
If mapping to 0..n-1 is too difficult, then how to map X to a small range is also of interest.
Finding such a function is not too difficult, but finding one that is simple and easy to generate is hard.
It would be better to preserve the order of X.
Any clues?
My favorite perfect hash is pretty easy.
The hash function you generate has the form:
hash = table1[h1(key)%N] + table2[h2(key)%N]
h1 and h2 are randomly generated hash functions. In your case, you can generate random constants and then have h1(key)=key*C1/m and h2(key)=key*C2/m or something similarly simple.
To generate the perfect hash:
1. Generate random constants C1 and C2.
2. Imagine the bipartite graph, with table1 slots and table2 slots as vertices and an edge for each key between table1[h1(key)%N] and table2[h2(key)%N]. Run a DFS to see if the graph is acyclic. If not, go back to step 1.
3. Now that you have an acyclic graph, start at any key/edge in each connected component, and set its slots in table1 and table2 however you like to give it whatever hash you like.
4. Traverse the tree starting at the vertices adjacent to the edge you just set. For every edge you traverse, one of its slots will already be set. Set the other one to make the hash value come out however you like.
That's it. All of steps (2), (3) and (4) can be combined into a single DFS traversal pretty easily.
The complete description and analysis is in this paper.
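A minimal sketch of this construction, under assumptions that are not in the answer: h1 and h2 are taken to be (key * C) mod PRIME mod N with a fixed prime, N is chosen as 2n + 1, and the desired hash of each key is its position in the input list (which preserves the given order of X). The names build_perfect_hash and assign_tables are made up for the illustration.

```python
import random

PRIME = 2**31 - 1   # assumed modulus for h1/h2; keys must be smaller than this

def assign_tables(edges, size):
    """DFS over the bipartite slot graph; return (table1, table2) or None on a cycle."""
    # Vertices 0..size-1 are table1 slots, size..2*size-1 are table2 slots.
    adj = [[] for _ in range(2 * size)]
    for eid, (a, b, want) in enumerate(edges):
        adj[a].append((size + b, want, eid))
        adj[size + b].append((a, want, eid))
    value = [None] * (2 * size)
    for root in range(2 * size):
        if value[root] is not None or not adj[root]:
            continue
        value[root] = 0                      # free choice per connected component
        stack = [(root, -1)]                 # (vertex, edge we arrived through)
        while stack:
            v, via = stack.pop()
            for nb, want, eid in adj[v]:
                if eid == via:
                    continue
                if value[nb] is not None:
                    return None              # slot already set: the graph has a cycle
                value[nb] = want - value[v]  # make table1 + table2 == desired hash
                stack.append((nb, eid))
    value = [0 if v is None else v for v in value]   # unused slots
    return value[:size], value[size:]

def build_perfect_hash(keys, seed=0, max_tries=1000):
    """Retry random constants until the graph is acyclic, then return the hash function."""
    rng = random.Random(seed)
    size = 2 * len(keys) + 1                 # N (assumed; larger N makes cycles rarer)
    for _ in range(max_tries):
        c1, c2 = rng.randrange(1, PRIME), rng.randrange(1, PRIME)
        edges = [((k * c1 % PRIME) % size, (k * c2 % PRIME) % size, want)
                 for want, k in enumerate(keys)]
        tables = assign_tables(edges, size)
        if tables is not None:
            t1, t2 = tables
            return lambda k: t1[(k * c1 % PRIME) % size] + t2[(k * c2 % PRIME) % size]
    raise RuntimeError("no acyclic graph found; try a larger table size")

X = [31, 223, 121, 100, 123, 71]
h = build_perfect_hash(X)
print([h(x) for x in X])
```

The returned function is only meaningful for keys in X; any other key maps to an arbitrary integer. Table entries may be negative here, because the sketch uses plain integer arithmetic instead of working modulo n.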

Sum of Function defined on Subsets

I want to know if there are any fast approaches to solve the following problem. I have a list of codes somewhere in the thousands (A0, A1, A2, ...). There is a positive value attached to about a million distinct combinations (A0-A1, A2-A10, A1-A2-A10, ...). Let the values be denoted f(A0-A1). Note that not all the combinations have a value attached.
For each listed combination, I want to calculate the sum of the values attached to each set that contains the given combination. For instance, for A2-A10,
g(A2-A10) = f(A2-A10) + f(A1-A2-A10) + ...
I would like to do this with minimal time complexity. A simpler related problem is to find all combinations where g(C) is greater than a threshold value.
Key the existing combinations with a bit map, where bit n denotes whether An is in that particular coding. Store the values keyed by the bit map for each in your favorite hash-map structure. Thus, f(A0, A1, A10, A12) would be combo_val[11000000001010000...]
To sum all of the desired combinations, build a bit map of your root. For instance, with the combination above, we'd have root = 1100000000101000 (cutting off at 16 total elements for the sake of illustration).
Now simply loop through the keys of the hashmap, using root as a mask. Sum the desired values:
total = 0
for key in combo_val:
    # bitwise AND: key must contain every bit that is set in root
    if (root & key) == root:
        total += combo_val[key]
Does that get you moving?
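As a runnable sketch of the bit-map idea, assuming codes are identified by their integer index (so A10 is 10) and using made-up values for f; the helper names to_mask and g are chosen here for illustration. Python integers serve as arbitrarily long bit maps, with bit n set when An is present.

```python
def to_mask(code_indices):
    """Build a bit map with bit n set for every code An in the combination."""
    mask = 0
    for n in code_indices:
        mask |= 1 << n
    return mask

# Hypothetical values: f(A0-A1) = 5, f(A2-A10) = 3, f(A1-A2-A10) = 4.
combo_val = {
    to_mask([0, 1]): 5,
    to_mask([2, 10]): 3,
    to_mask([1, 2, 10]): 4,
}

def g(root, combo_val):
    """Sum f over every stored combination that is a superset of root."""
    return sum(val for key, val in combo_val.items() if (key & root) == root)

print(g(to_mask([2, 10]), combo_val))
```

Each query still scans all stored combinations, so computing g for every one of a million combinations this way is quadratic in the number of combinations.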
I thought waaay too long before coming up with the following approach.
Index the million combinations. So you know which you want. In your example:
0: A0-A1
1: A2-A10
2: A1-A2-A10
For each code, create an ordered list of combinations that contain that code. Call that code_combs. In your example:
A0: [0]
A1: [0, 2]
A2: [1, 2]
A10: [1, 2]
Now we have a combination of codes, like A2-A10. We create two arrays, one of codes, the other of indices. Set indices at 0. So:
codes = ['A2', 'A10']
indices = [0, 0]
And now do the following:
while not done:
    let max_comb = max(code_combs[codes[i]][indices[i]] over i in range(len(codes)))
    Advance each index until we are at the max_comb or greater
        (if we reach the end of any list, we are done)
    If all are at the same max_comb, we add its value.
        Advance all indexes by 1.
        (if we reach the end of any list, we are done)
Basically this is a k-way intersection of ordered lists. Now here is the trick. If we advance naively, this will be slightly faster because we only have to look at combinations that contain a code. However we can use a clever advance strategy like this:
Advance by 1, 2, 4, 8, etc until we reach or pass the point we want.
Do a binary search between the last two values until we find the point we want
(Be warned, implementing binary search is not always so easy to get right.)
And now we are crossing fingers. But if any one of our codes has few combinations that it is in, and there aren't too many codes in our combination, we can compute our intersection quite quickly.
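The intersection with the doubling-then-binary-search advance could be sketched as follows. The function names gallop_to and superset_sum and the sample values are assumptions for the illustration; the binary search is delegated to the standard bisect module rather than written by hand.

```python
from bisect import bisect_left

def gallop_to(lst, start, target):
    """Smallest index i >= start with lst[i] >= target (len(lst) if there is none)."""
    lo = hi = start
    step = 1
    # Advance by 1, 2, 4, 8, ... until we reach or pass the target.
    while hi < len(lst) and lst[hi] < target:
        lo = hi + 1
        hi += step
        step *= 2
    # Binary search between the last two probe positions.
    return bisect_left(lst, target, lo, min(hi, len(lst)))

def superset_sum(codes, code_combs, values):
    """Sum values[c] over every combination index c that appears in all the lists."""
    lists = [code_combs[code] for code in codes]
    if any(not lst for lst in lists):
        return 0
    indices = [0] * len(lists)
    total = 0
    while True:
        max_comb = max(lst[i] for lst, i in zip(lists, indices))
        for j, lst in enumerate(lists):
            indices[j] = gallop_to(lst, indices[j], max_comb)
            if indices[j] == len(lst):
                return total                      # one list is exhausted: done
        if all(lst[i] == max_comb for lst, i in zip(lists, indices)):
            total += values[max_comb]
            indices = [i + 1 for i in indices]
            if any(i == len(lst) for lst, i in zip(lists, indices)):
                return total

# Combination indices from the example: 0: A0-A1, 1: A2-A10, 2: A1-A2-A10.
code_combs = {'A0': [0], 'A1': [0, 2], 'A2': [1, 2], 'A10': [1, 2]}
values = [5, 3, 4]                                # hypothetical f values
print(superset_sum(['A2', 'A10'], code_combs, values))
```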

Matrix reordering to block diagonal form

Given a sparse matrix, how can the rows and columns be reordered so that it is in block-diagonal-like form via row and column permutations?
Row and column permutations are not necessarily coupled as in reverse Cuthill-McKee ordering; in short, you can independently perform any row or column permutation.
The overall goal is to cluster all the non-zero elements towards the diagonal.
Here is one approach.
First make a graph whose vertices are rows and columns. Every non-zero value is an edge between that row and that column.
You can then use a standard graph theory algorithm to detect the connected components of this graph. The single element ones represent all zero rows and columns. Number the others. Those components may have unequal numbers of rows and columns. You can distribute some zero rows and columns to them to make them square.
Your square components will be your blocks, and from the numbering of those components you know what order to put them in. Now just reorder rows and columns to achieve this structure and, voila! (The remaining zero rows/columns will result in a bunch of 0 blocks at the bottom right of the diagonal.)
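A rough sketch of the component-based reordering, assuming the matrix is given as a list of (row, column) positions of its non-zero entries. It uses union-find instead of an explicit graph search, and it skips the step of padding blocks with zero rows/columns to make them square; the function name block_diagonal_order is made up here.

```python
def block_diagonal_order(n_rows, n_cols, nonzeros):
    """Return (row_order, col_order) that group rows and columns by connected component."""
    parent = list(range(n_rows + n_cols))    # rows are 0..n_rows-1, columns follow

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    # Every non-zero entry joins its row vertex and its column vertex.
    for r, c in nonzeros:
        parent[find(r)] = find(n_rows + c)

    used_rows = {r for r, _ in nonzeros}
    used_cols = {c for _, c in nonzeros}
    comp_number = {}                         # component root -> block number

    def key(vertex, is_used):
        # All-zero rows/columns sort last; the rest sort by block number.
        number = comp_number.setdefault(find(vertex), len(comp_number))
        return (not is_used, number)

    row_order = sorted(range(n_rows), key=lambda r: key(r, r in used_rows))
    col_order = sorted(range(n_cols), key=lambda c: key(n_rows + c, c in used_cols))
    return row_order, col_order

nonzeros = [(0, 2), (2, 2), (1, 0), (3, 1), (3, 3)]
row_order, col_order = block_diagonal_order(4, 4, nonzeros)
print(row_order, col_order)
```

Permuting the rows by row_order and the columns by col_order places each component's entries in one contiguous rectangular block along the diagonal.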
Just an idea: make a new matrix Ab from the original block matrix A that contains the block-sparsity structure of A. E.g.:
A = [B 0 0; 0 0 C; 0 D 0]; % with matrices 0 (zero elements), B,C and D
Ab = [1 0 0; 0 0 2; 0 3 0]; % with identifiers 1, 2 and 3 (1-->B, 2-->C, 3-->D)
Then Ab is a simple sparse matrix (size 3x3 in the example). You can then use the reverse Cuthill-McKee ordering to get the permutations you want, and apply these permutations to Ab.
p = symrcm(Ab);
Abperm = Ab(p,p);
Then use the identifiers to create the ordered block matrix Aperm from Abperm and you'll have the desired result, I believe.
You'll need to be clever in assigning the identifiers to the individual blocks and so on, but this should be possible.

Finding the best pair of elements that don't exceed a certain weight?

I have a collection of objects, each of which has a weight and a value. I want to pick the pair of objects with the highest total value subject to the restriction that their combined weight does not exceed some threshold. Additionally, I am given two arrays, one containing the objects sorted by weight and one containing the objects sorted by value.
I know how to do it in O(n^2) but how can I do it in O(n)?
This is a combinatorial optimization problem, and the fact the values are sorted means you can easily try a branch and bound approach.
I think that I have a solution that works in O(n log n) time and O(n) extra space. This isn't quite the O(n) solution you wanted, but it's still better than the naive quadratic solution.
The intuition behind the algorithm is that we want to be able to efficiently determine, for any amount of weight, the maximum value we can get with a single item that uses at most that much weight. If we can do this, we have a simple algorithm for solving the problem: iterate across the array of elements sorted by value. For each element, see how much additional value we could get by pairing a single element with it (using the values we precomputed), then find which of these pairs is maximum. If we can do the preprocessing in O(n log n) time and can answer each of the above queries in O(log n) time, then the total time for the second step will be O(n log n) and we have our answer.
An important observation we need for the preprocessing step is as follows. Our goal is to build up a structure that can answer the question "which element with weight at most x has maximum value?" Let's think about how we might do this by adding one element at a time. If we have an element (value, weight) and the structure is empty, then we want to say that the maximum value we can get with at least "weight" units of free weight is "value". This means that everything in the range [weight, max_weight] should be set to value. Otherwise, suppose that the structure isn't empty when we try adding in (value, weight). In that case, we want to say that any portion of the range [weight, max_weight] whose value is less than value should be replaced by value.
The problem here is that when we do these insertions, there might be, on iteration k, O(k) different subranges that need to be updated, leading to an O(n^2) algorithm. However, we can use a very clever trick to avoid this. Suppose that we insert all of the elements into this data structure in descending order of value. In that case, when we add in (value, weight), each existing value in the data structure must be at least as high as our value. This means that if the range [weight, max_weight] intersects any existing range, that range already holds a value at least as high and so we don't need to update it. If we combine this with the fact that each range we add always spans from some weight up to max_weight, the only portion of the new range that could ever be added to the data structure is the range [weight, x), where x is the lowest weight stored in the data structure so far.
To summarize, assuming that we visit the (value, weight) pairs in descending order of value, we can update our data structure as follows:
If the structure is empty, record that the range [weight, max_weight] has value "value."
Otherwise, if the lowest weight recorded in the structure is at most weight, skip this element.
Otherwise, if the lowest weight recorded so far is x, record that the range [weight, x) has value "value."
Notice that this means that we are always adding ranges at the front of the list of ranges we have encountered so far. Because of this, we can think about storing the list of ranges as a simple array, where each array element tracks the lower endpoint of some range and the value assigned to that range. For example, we might track the ranges [3, 9), [9, 12), and [12, max_weight] as the array
3, 9, 12
If we then needed to add the range [1, 3), we could do so by prepending 1 to the list:
1, 3, 9, 12
If we represent this array in reverse (actually storing the ranges from high to low instead of low to high), this step of creating the array runs in O(n) time because at each point we just do O(1) work to decide whether or not to add another element onto the end of the array.
Once we have the ranges stored like this, to determine which of the ranges a particular weight falls into, we can just use a binary search to find the largest element not exceeding that weight. For example, to look up 6 in the above array we'd do a binary search to find 3.
Finally, once we have this data structure built up, we can just look at each of the objects one at a time. For each element, we see how much weight is left, use a binary search in the other structure to see what element it should be paired with to maximize the total value, and then find the maximum attainable value.
Let's trace through an example. Given maximum allowable weight 10 and the objects
Weight | Value
2 | 3
6 | 5
4 | 7
7 | 8
Let's see what the algorithm does. First, we need to build up our auxiliary structure for the ranges. We look at the objects in descending order of value, starting with the object of weight 7 and value 8. This means that if we ever have at least seven units of weight left, we can get 8 value. Our array now looks like this:
Weight: 7
Value: 8
Next, we look at the object of weight 4 and value 7. This means that with four or more units of weight left, we can get value 7:
Weight: 7 4
Value: 8 7
Repeating this for the next item (weight six, value five) does not change the array: if we ever had six or more units of free space left, we would never choose this object; we'd always take the seven-value item of weight four. We can tell this since there is already an object in the table whose range includes remaining weight six.
Finally, we look at the last item (value 3, weight 2). This means that if we ever have weight two or more free, we could get 3 units of value. The final array now looks like this:
Weight: 7 4 2
Value: 8 7 3
Finally, we just look at the objects in any order to see what the best option is. When looking at the object of weight 2 and value 3, since the maximum allowed weight is 10, we need to see how much value we can get with at most 10 - 2 = 8 weight. A binary search over the array tells us that this value is 8, so one option would give us a total value of 11. If we look at the object of weight 6 and value 5, a binary search tells us that with four remaining weight the best we can do would be to get 7 units of value, for a total of 12 value. Repeating this on the next two entries doesn't turn up anything new, so the optimum found has value 12, which is indeed the correct answer.
Hope this helps!
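An O(n log n) sketch in the same spirit, with one deliberate change: instead of the descending-value range array, it keeps a prefix maximum over the weight-sorted objects and also tracks the runner-up, so that an object is never paired with itself (a case the description above does not cover). The function name best_pair_value and the (weight, value) tuple layout are assumptions.

```python
from bisect import bisect_right

def best_pair_value(items, max_weight):
    """items: list of (weight, value). Best total value of two distinct items, or None."""
    order = sorted(items, key=lambda it: it[0])          # by weight
    weights = [w for w, _ in order]
    # best[k] / second[k]: positions of the two most valuable items in order[0..k].
    best, second = [], []
    b = s = None
    for k, (_, v) in enumerate(order):
        if b is None or v > order[b][1]:
            b, s = k, b
        elif s is None or v > order[s][1]:
            s = k
        best.append(b)
        second.append(s)
    answer = None
    for k, (w, v) in enumerate(order):
        # Last position whose weight still fits next to item k.
        j = bisect_right(weights, max_weight - w) - 1
        if j < 0:
            continue
        partner = best[j] if best[j] != k else second[j]  # never pair k with itself
        if partner is None:
            continue
        total = v + order[partner][1]
        if answer is None or total > answer:
            answer = total
    return answer

# The traced example: (weight, value) pairs with maximum allowable weight 10.
print(best_pair_value([(2, 3), (6, 5), (4, 7), (7, 8)], 10))
```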
Here is an O(n) time, O(1) space solution.
Let's call an object x better than an object y if and only if (x is no heavier than y) and (x is no less valuable) and (x is lighter or more valuable). Call an object x first-choice if no object is better than x. There exists an optimal solution consisting either of two first-choice objects, or a first-choice object x and an object y such that only x is better than y.
The main tool is to be able to iterate the first-choice objects from lightest to heaviest (= least valuable to most valuable) and from most valuable to least valuable (= heaviest to lightest). The iterator state is an index into the objects by weight (resp. value) and a max value (resp. min weight) so far.
Each of the following steps is O(n).
During a scan, whenever we encounter an object that is not first-choice, we know an object that's better than it. Scan once and consider these pairs of objects.
For each first-choice object from lightest to heaviest, determine the heaviest first-choice object that it can be paired with, and consider the pair. (All lighter objects are less valuable.) Since the latter object becomes lighter over time, each iteration of the loop is amortized O(1). (See also searching in a matrix whose rows and columns are sorted.)
Code for the unbelievers. Not heavily tested.
from collections import namedtuple
from operator import attrgetter

Item = namedtuple('Item', ('weight', 'value'))
sentinel = Item(float('inf'), float('-inf'))

def firstchoicefrombyweight(byweight):
    # Yield (x, most valuable object seen so far) in order of increasing weight;
    # the second component is x itself when x sets a new value record.
    bestsofar = sentinel
    for x in byweight:
        if x.value > bestsofar.value:
            bestsofar = x
        yield (x, bestsofar)

def firstchoicefrombyvalue(byvalue):
    # Yield the first-choice objects from most valuable (heaviest) to least.
    bestsofar = sentinel
    for x in byvalue:
        if x.weight < bestsofar.weight:
            bestsofar = x
            yield x

def optimize(items, maxweight):
    byweight = sorted(items, key=attrgetter('weight'))
    byvalue = sorted(items, key=attrgetter('value'), reverse=True)
    maxvalue = float('-inf')
    try:
        i = firstchoicefrombyvalue(byvalue)
        y = next(i)
        for x, z in firstchoicefrombyweight(byweight):
            if z is not x and x.weight + z.weight <= maxweight:
                maxvalue = max(maxvalue, x.value + z.value)
            while x.weight + y.weight > maxweight:
                y = next(i)
            if y is x:
                continue  # an object cannot be paired with itself
            maxvalue = max(maxvalue, x.value + y.value)
    except StopIteration:
        pass  # no first-choice object is light enough to pair with x
    return maxvalue

items = [Item(1, 1), Item(2, 2), Item(3, 5), Item(3, 7), Item(5, 8)]
for maxweight in range(3, 10):
    print(maxweight, optimize(items, maxweight))
This is similar to the Knapsack problem. I will use naming from it (num - weight, val - value).
The essential part:
1. Start with a = 0 and b = n-1, assuming 0 is the index of the heaviest object and n-1 is the index of the lightest object.
2. Increase a until objects a and b satisfy the limit.
3. Compare the current solution with the best solution.
4. Decrease b by one.
5. Go to 2.
It's the knapsack problem, except there is a limit of 2 items. You basically need to decide how much space you want for the first object and how much for the other. There are n significant ways to split the available space, so the complexity is O(n). Picking the most valuable objects to fit in those spaces can be done without additional cost.

Algorithm/Data Structure for finding combinations of minimum values easily

I have a symmetric matrix like shown in the image attached below.
I've made up the notation A.B which represents the value at grid point (A, B). Furthermore, writing A.B.C gives me the minimum grid point value like so: MIN((A,B), (A,C), (B,C)).
As another example A.B.D gives me MIN((A,B), (A,D), (B,D)).
My goal is to find the minimum values for ALL combinations of letters (not repeating) for one row at a time e.g for this example I need to find min values with respect to row A which are given by the calculations:
A.B = 6
A.C = 8
A.D = 4
A.B.C = MIN(6,8,6) = 6
A.B.D = MIN(6, 4, 4) = 4
A.C.D = MIN(8, 4, 2) = 2
A.B.C.D = MIN(6, 8, 4, 6, 4, 2) = 2
I realize that certain calculations can be reused which becomes increasingly important as the matrix size increases, but the problem is finding the most efficient way to implement this reuse.
Can anyone point me in the right direction to find an efficient algorithm/data structure I can use for this problem?
You'll want to think about the lattice of subsets of the letters, ordered by inclusion. Essentially, you have a value f(S) given for every subset S of size 2 (that is, every off-diagonal element of the matrix - the diagonal elements don't seem to occur in your problem), and the problem is to find, for each subset T of size greater than two, the minimum f(S) over all S of size 2 contained in T. (And then you're interested only in sets T that contain a certain element "A" - but we'll disregard that for the moment.)
First of all, note that if you have n letters, this amounts to asking Omega(2^n) questions, roughly one for each subset. (Excluding the zero- and one-element subsets and those that don't include "A" saves you n + 1 sets and a factor of two, respectively, which is allowed for big Omega.) So if you want to store all these answers for even moderately large n, you'll need a lot of memory. If n is large in your applications, it might be best to store some collection of pre-computed data and do some computation whenever you need a particular data point; I haven't thought about what would work best, but for example computing data only for a binary tree contained in the lattice would not necessarily help you any more than precomputing nothing at all.
With these things out of the way, let's assume you actually want all the answers computed and stored in memory. You'll want to compute these "layer by layer", that is, starting with the three-element subsets (since the two-element subsets are already given by your matrix), then four-element, then five-element, etc. This way, for a given subset T, when we're computing f(T) we will already have computed all f(S) for S strictly contained in T. There are several ways that you can make use of this, but I think the easiest might be to use two such subsets: let t1 and t2 be two different elements of T that you may select however you like; let S be the subset of T that you get when you remove t1 and t2. Write S1 for S plus t1 and write S2 for S plus t2. Now every pair of letters contained in T is either fully contained in S1, or it is fully contained in S2, or it is {t1, t2}. Look up f(S1) and f(S2) in your previously computed values, then look up f({t1, t2}) directly in the matrix, and store f(T) = the minimum of these 3 numbers.
If you never select "A" for t1 or t2, then indeed you can compute everything you're interested in while not computing f for any sets T that don't contain "A". (This is possible because the steps outlined above are only interesting whenever T contains at least three elements.) Good! This leaves just one question - how to store the computed values f(T). What I would do is use a 2^(n-1)-sized array; represent each subset-of-your-alphabet-that-includes-"A" by the (n-1) bit number where the ith bit is 1 whenever the (i+1)th letter is in that set (so 0010110, which has bits 1, 2, and 4 set, represents the subset {"A", "C", "D", "F"} out of the alphabet "A" .. "H" - note I'm counting bits starting at 0 from the right, and letters starting at "A" = 0). This way, you can actually iterate through the sets in numerical order and don't need to think about how to iterate through all k-element subsets of an n-element set. (You do need to include a special case for when the set under consideration has 0 or 1 element, in which case you'll want to do nothing, or 2 elements, in which case you just copy the value from the matrix.)
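A small sketch of this scheme, assuming the matrix is a list of lists with "A" at index 0 and picking t1 and t2 as the two lowest-numbered letters other than "A" (any choice would do). The function name min_over_subsets is made up, and the matrix is filled in from the values quoted in the question (A.B = 6, A.C = 8, A.D = 4, B.C = 6, B.D = 4, C.D = 2).

```python
def min_over_subsets(m):
    """f[mask] = minimum pair value within {A} plus the letters encoded in mask.

    Bit i of mask stands for letter i + 1 (bit 0 = "B", bit 1 = "C", ...).
    """
    n = len(m)
    f = [None] * (1 << (n - 1))          # f[0] is the set {A}: nothing to compute
    for mask in range(1, 1 << (n - 1)):  # numerical order visits subsets before supersets
        low = mask & -mask               # lowest set bit -> t1
        t1 = low.bit_length()            # letter index of t1
        rest = mask ^ low
        if rest == 0:
            f[mask] = m[0][t1]           # two-element set {A, t1}: copy from the matrix
            continue
        low2 = rest & -rest              # second lowest set bit -> t2
        t2 = low2.bit_length()
        s = rest ^ low2                  # S = T without t1 and t2 (A stays implicit)
        f[mask] = min(f[s | low], f[s | low2], m[t1][t2])   # f(S1), f(S2), f({t1, t2})
    return f

#     A  B  C  D
m = [[0, 6, 8, 4],
     [6, 0, 6, 4],
     [8, 6, 0, 2],
     [4, 4, 2, 0]]
f = min_over_subsets(m)
print(f[0b011], f[0b101], f[0b110], f[0b111])   # A.B.C, A.B.D, A.C.D, A.B.C.D
```

Each entry costs O(1) to compute, so the whole table takes O(2^(n-1)) time and memory.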
Well, it looks simple to me, but perhaps I misunderstand the problem. I would do it like this:
let P be a pattern string in your notation X1.X2. ... .Xn, where Xi is a column in your matrix
first compute the array CS = [ (X1, X2), (X1, X3), ... (X1, Xn) ], which contains all combinations of X1 with every other element in the pattern; CS has n-1 elements, and you can easily build it in O(n)
now you must compute min (CS), i.e. finding the minimum value of the matrix elements corresponding to the combinations in CS; again you can easily find the minimum value in O(n)
Note: since your matrix is symmetric, given P you just need to compute CS by combining the first element of P with all other elements: (X1, Xi) is equal to (Xi, X1)
If your matrix is very large, and you want to do some optimization, you may consider prefixes of P: let me explain with an example
when you have solved the problem for P = X1.X2.X3, store the result in an associative map, where X1.X2.X3 is the key
later on, when you solve a problem P' = X1.X2.X3.X7.X9.X10.X11 you search for the longest prefix of P' in your map: you can do this by starting with P' and removing one component (Xi) at a time from the end until you find a match in your map or you end up with an empty string
if you find a prefix of P' in your map then you already know the solution for that problem, so you just have to find the solution for the problem resulting from combining the first element of the prefix with the suffix, and then compare the two results: in our example the prefix is X1.X2.X3, and so you just have to solve the problem for
X1.X7.X9.X10.X11, and then compare the two values and choose the min (don't forget to update your map with the new pattern P')
if you don't find any prefix, then you must solve the entire problem for P' (and again don't forget to update the map with the result, so that you can reuse it in the future)
This technique is essentially a form of memoization.
