I'm working on a problem that requires an array (dA[j], j=-N..N) to be calculated from the values of another array (A[i], i=-N..N) based on a conservation-of-momentum rule (x+y=z+j). That is, for a given index j, I compute the product A[x]*A[y]*A[z] for every valid combination (x,y,z) satisfying the rule, and dA[j] is the sum of these products.
I'm currently precomputing the valid indices for each dA[j] by looping over x=-N..N and y=-N..N, calculating z=x+y-j, and storing the indices if abs(z) <= N.
Is there a more efficient method of computing this?
The reason I ask is that in the future I'd also like to be able to efficiently find, for each dA[j], all the terms that contain a specific A[i], essentially so that I can compute the Jacobian of dA[j] with respect to A[i].
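For reference, here is a minimal Python sketch of the precomputation described above (array size and values are placeholders); it also shows how the stored triples make the later Jacobian-style query straightforward:

# Precompute, for each j, the (x, y, z) index triples with x + y = z + j
# and all three indices in [-N, N]; dA[j] is then a sum of A[x]*A[y]*A[z].
N = 4                                       # illustrative size
A = {i: 1.0 for i in range(-N, N + 1)}      # placeholder values

triples = {j: [] for j in range(-N, N + 1)}
for j in range(-N, N + 1):
    for x in range(-N, N + 1):
        for y in range(-N, N + 1):
            z = x + y - j
            if -N <= z <= N:
                triples[j].append((x, y, z))

dA = {j: sum(A[x] * A[y] * A[z] for (x, y, z) in triples[j])
      for j in range(-N, N + 1)}

# Terms of dA[j] that involve a particular index i (the Jacobian-style query):
def terms_with(j, i):
    return [(x, y, z) for (x, y, z) in triples[j] if i in (x, y, z)]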
Update
For the sake of completeness, I figured out a way of doing this without any if statements: if you parametrize the equation x+y=z+j, given that j is a constant, you get the equation of a plane. The constraint that x, y, z must be integers between -N and N creates boundaries on this plane. The points that define this boundary are functions of N and j. So all you have to do is loop over your parametrized variables (s,t) within these boundaries, and you generate all the valid points using the vectors that define the plane (s*u + t*v + j*[0,0,1]).
For example, if you choose u=[1,0,-1] and v=[0,1,1] all the valid solutions for every value of j are bounded by a 6 sided polygon with points (-N,-N),(-N,-j),(j,N),(N,N),(N,-j), and (j,-N).
So for each j you go through all (2N+1)^2 combinations of x and y to find the ones satisfying x+y = z+j; the running time of your approach (per j) is O(N^2). I don't think your current idea is bad (and after playing with some pseudocode for this, I couldn't improve it significantly). I would note that once you've picked a j and a z, there are at most 2N+1 choices for x and y, so the best algorithm would still complete in O(N^2).
But consider the following improvement by a factor of 2 (for the overall program, not per j): if x+y = z+j, then (-x)+(-y) = (-z)+(-j) as well, so the valid triples for -j are exactly the negations of the valid triples for j.
You have a set of n objects for which integer positions are given. A group of objects is a set of objects at the same position (not necessarily all the objects at that position: there might be multiple groups at a single position). The objects can be moved to the left or right, and the goal is to move these objects so as to form k groups, and to do so with the minimum distance moved.
For example:
With initial positions at [4,4,7], and k = 3: the minimum cost is 0.
[4,4,7] and k = 2: minimum cost is 0
[1,2,5,7] and k = 2: minimum cost is 1 + 2 = 3
I've been trying to use a greedy approach (by calculating which move would be shortest) but that wouldn't work because every move involves two elements which could be moved either way. I haven't been able to formulate a dynamic programming approach as yet but I'm working on it.
This problem is a one-dimensional instance of the k-medians problem, which can be stated as follows. Given a set of points x_1...x_n, partition these points into k sets S_1...S_k and choose k locations y_1...y_k in a way that minimizes the sum over all x_i of |x_i - y_f(i)|, where y_f(i) is the location corresponding to the set to which x_i is assigned.
Due to the fact that the median is the population minimizer for absolute distance (i.e. L_1 norm), it follows that each location y_j will be the median of the elements x in the corresponding set S_j (hence the name k-medians). Since you are looking at integer values, there is the technicality that if S_j contains an even number of elements, the median might not be an integer, but in such cases choosing either the next integer above or below the median will give the same sum of absolute distances.
The standard heuristic for solving k-medians (and the related and more common k-means problem) is iterative, but this is not guaranteed to produce an optimal or even good solution. Solving the k-medians problem for general metric spaces is NP-hard, and finding efficient approximations for k-medians is an open research problem. Googling "k-medians approximation", for example, will lead to a bunch of papers giving approximation schemes.
http://www.cis.upenn.edu/~sudipto/mypapers/kmedian_jcss.pdf
http://graphics.stanford.edu/courses/cs468-06-winter/Papers/arr-clustering.pdf
In one dimension things become easier, and you can use a dynamic programming approach. A DP solution to the related one-dimensional k-means problem is described in this paper, and the source code in R is available here. See the paper for details, but the idea is essentially the same as what @SajalJain proposed, and can easily be adapted to solve the k-medians problem rather than k-means. For j<=k and m<=n let D(j,m) denote the cost of an optimal j-medians solution to x_1...x_m, where the x_i are assumed to be in sorted order. We have the recurrence
D(j,m) = min_q ( D(j-1,q) + Cost(x_{q+1},...,x_m) )
where q ranges from j-1 to m-1 and Cost is equal to the sum of absolute distances from the median. With a naive O(n) implementation of Cost, this would yield an O(n^3k) DP solution to the whole problem. However, this can be improved to O(n^2k) due to the fact that the Cost can be updated in constant time rather than computed from scratch every time, using the fact that, for a sorted sequence:
Cost(x_1,...,x_h) = Cost(x_2,...,x_h) + median(x_1...x_h)-x_1 if h is odd
Cost(x_1,...,x_h) = Cost(x_2,...,x_h) + median(x_2...x_h)-x_1 if h is even
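As a quick check of the even-h update on the question's own numbers: for the sorted list 1,2,5,7 we have Cost(2,5,7) = 3+0+2 = 5 with median 5, and the update gives Cost(1,2,5,7) = 5 + median(2,5,7) - 1 = 5 + 5 - 1 = 9, which matches computing it directly about the lower median 2 (1+0+3+5 = 9).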
See the writeup for more details. Except for the fact that the update of the Cost function is different, the implementation will be the same for k-medians as for k-means.
http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Wang+Song.pdf
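For concreteness, here is a small Python sketch of this DP (my own illustration, not the paper's code): it uses prefix sums for an O(1) median-cost query instead of the incremental update above, giving an O(k n^2) routine that reproduces the question's [1,2,5,7], k = 2 example:

def k_medians_cost(points, k):
    # Minimum total |x - center| when the sorted points are partitioned into
    # k contiguous groups, each served by its median (1-D k-medians).
    x = sorted(points)
    n = len(x)
    prefix = [0]
    for v in x:
        prefix.append(prefix[-1] + v)

    def cost(l, r):  # sum of distances to the median of x[l..r] (inclusive)
        mid = (l + r) // 2
        right = (prefix[r + 1] - prefix[mid]) - (r - mid + 1) * x[mid]
        left = (mid - l + 1) * x[mid] - (prefix[mid + 1] - prefix[l])
        return left + right

    INF = float('inf')
    # D[j][m] = optimal cost of covering x[0..m-1] with j medians
    D = [[INF] * (n + 1) for _ in range(k + 1)]
    D[0][0] = 0
    for j in range(1, k + 1):
        for m in range(j, n + 1):
            D[j][m] = min(D[j - 1][q] + cost(q, m - 1) for q in range(j - 1, m))
    return D[k][n]

print(k_medians_cost([1, 2, 5, 7], 2))  # prints 3, matching the example above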
As I understand it, the problem is:
we have n points on a line.
we want to place k positions on the line; I call them destinations.
move each of the n points to one of the k destinations so that the sum of distances is minimum; I call this sum the total cost.
destinations can overlap.
An obvious fact is that for each point we should look at the nearest destination on its left and the nearest destination on its right, and choose the nearer of the two.
Another important fact is that all destinations should be placed on points, because we can slide a destination left or right along the line until it reaches a point without increasing the total distance.
Given these facts, consider the following DP solution:
DP[i][j] is the minimum total cost for the first i points when we may use only j destinations and must place a destination on the i-th point.
To calculate DP[i][j], fix the previous destination (we have i choices; say it is on the k-th point), and for each choice calculate the movement cost of the points lying between the k-th point and the i-th point. Add this to DP[k][j - 1] and take the minimum over all k.
The calculation of the initial states (e.g. j = 1) and of the final answer is left as an exercise!
Task 0 - sort the positions of the objects in non-decreasing order
Let us define the 'center' of a group as the position to which its objects are shifted.
Now we have two observations:
For N positions the 'center' would be the position which is nearest to the mean of these N positions. Example, let 1,3,6,10 be the positions. Then mean = 5. Nearest position is 6. Hence the center for these elements is 6. This gives us the position with minimum cost of moving when all elements need to be grouped into 1 group.
Let N positions be grouped into K groups "optimally". When the (N+1)-th object is added, it will disturb only the K-th group, i.e., the first K-1 groups will remain unchanged.
From these observations, we build a dynamic programming approach.
Let Cost[i][k] and Center[i][k] be two 2D arrays.
Cost[i][k] = minimum cost when first 'i' objects are partitioned into 'k' groups
Center[i][k] stores the center of the 'i-th' object when Cost[i][k] is computed.
Let L be such that the elements i-L, i-L+1, ..., i-1 all share the same center, i.e.
(Center[i-L][k] = Center[i-L+1][k] = ... = Center[i-1][k]). These are the only objects that need to be considered in the computation for the i-th element (from observation 2).
Now Cost[i][k] will be
min( Cost[i-1][k-1], Cost[i-L-1][k-1] + computecost(i-L, i-L+1, ..., i) )
and then update Center[i-L ... i][k].
computecost() can be found trivially by finding the center (from observation 1) and summing the distances to it.
Time Complexity:
Sorting: O(N log N)
Total cost computation matrix = total elements * computecost = O(NK * N) = O(N^2 K)
Total = O(N log N + N^2 K) = O(N^2 K)
Let's look at k=1.
For k=1 and n odd, all points should move to the center point. For k=1 and n even, all points should move to either of the center points or any spot between them. By 'center' I mean in terms of number of points to either side, i.e. the median.
You can see this because if you select a target spot x with more points to its right than to its left, then moving the target one position to the right reduces the cost (unless there is exactly one more point on the right than on the left and the target spot is itself a point, in which case n is even and the target is on or between the two center points).
If your points are already sorted, this is an O(1) operation. If not, I believe it's O(n) (via an order statistic algorithm).
Once you've found the spot that all points are moving to, it's O(n) to find the cost.
Thus regardless of whether the points are sorted or not, this is O(n).
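A minimal Python illustration of the k = 1 case described above (sorting is used here for simplicity, so this version is O(n log n) rather than O(n)):

def one_group_cost(points):
    # Move every point to a median; any position between the two middle
    # elements is equally good when the number of points is even.
    xs = sorted(points)              # O(n) is possible with a selection algorithm
    target = xs[len(xs) // 2]
    return sum(abs(x - target) for x in xs)

print(one_group_cost([1, 2, 5, 7]))  # prints 9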
I want to find an algorithm to find the pair of bitstrings in an array that have the largest number of common set bits (among all pairs in the array). I know it is possible to do this by comparing all pairs of bitstrings in the array, but this is O(n^2). Is there a more efficient algorithm? Ideally, I would like the algorithm to work incrementally by processing one incoming bitstring in each iteration.
For example, suppose we have this array of bitstrings (of length 8):
B1:01010001
B2:01101010
B3:01101010
B4:11001010
B5:00110001
The best pair here is B2 and B3, which have four common set bits.
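For reference, the number of common set bits of a pair is just the popcount of the AND of the two bitstrings; in Python, for example:

b2, b3 = 0b01101010, 0b01101010
print(bin(b2 & b3).count('1'))   # prints 4, the common set bits of B2 and B3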
I found a paper that appears to describe such an algorithm (S. Taylor & T. Drummond (2011); "Binary Histogrammed Intensity Patches for Efficient and Robust Matching"; Int. J. Comput. Vis. 94:241–265), but I don't understand this description from page 252:
This can be incrementally updated in each iteration as the only [bitstring] overlaps that need recomputing are those for the new parent feature and any other [bitstrings] in the root whose "most overlapping feature" was one of the two selected for combination. This avoids the need for the O(N^2) overlap comparison in every iteration and allows a forest for a typically-sized database of 700 features to be built in under a second.
As far as I can tell, Taylor & Drummond (2011) do not purport to give an O(n) algorithm for finding the pair of bitstrings in an array with the largest number of common set bits. They sketch an argument that a record of the best such pairs can be updated in O(n) after a new bitstring has been added to the array (and two old bitstrings removed).
Certainly the explanation of the algorithm on page 252 is not very clear, and I think their sketch argument that the record can be updated in O(n) is incomplete at best, so I can see why you are confused.
Anyway, here's my best attempt to explain Algorithm 1 from the paper.
Algorithm
The algorithm takes an array of bitstrings and constructs a lookup tree. A lookup tree is a binary forest (set of binary trees) whose leaves are the original bitstrings from the array, whose internal nodes are new bitstrings, and where if node A is a parent of node B, then A & B = A (that is, all the set bits in A are also set in B).
For example, if the input is this array of bitstrings:
then the output is the lookup tree:
The algorithm as described in the paper proceeds as follows:
Let R be the initial set of bitstrings (the root set).
For each bitstring f1 in R that has no partner in R, find and record its partner (the bitstring f2 in R − {f1} which has the largest number of set bits in common with f1) and record the number of bits they have in common.
If there is no pair of bitstrings in R with any common set bits, stop.
Let f1 and f2 be the pair of bitstrings in R with the largest number of common set bits.
Let p = f1 & f2 be the parent of f1 and f2.
Remove f1 and f2 from R; add p to R.
Go to step 2.
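Here is a straightforward, unoptimized Python sketch of these steps as I read them (not the paper's optimized Algorithm 1); bitstrings are represented as Python ints, and duplicate bitstrings are treated as the same node:

def build_lookup_forest(bitstrings):
    # Repeatedly AND together the pair of root bitstrings sharing the most
    # set bits until no pair shares any; returns the remaining roots and a
    # parent map from each combined node to the node created from it.
    R = list(bitstrings)                  # step 1: the root set
    parent = {}
    while True:
        best = None                       # steps 2/4: best pair found so far
        for i in range(len(R)):
            for j in range(i + 1, len(R)):
                common = bin(R[i] & R[j]).count('1')
                if common > 0 and (best is None or common > best[0]):
                    best = (common, i, j)
        if best is None:                  # step 3: no pair with common set bits
            return R, parent
        _, i, j = best
        f1, f2 = R[i], R[j]
        p = f1 & f2                       # step 5: the parent of f1 and f2
        parent[f1] = parent[f2] = p
        R = [b for k, b in enumerate(R) if k not in (i, j)] + [p]  # step 6

roots, parent = build_lookup_forest(
    [0b01010001, 0b01101010, 0b01101010, 0b11001010, 0b00110001])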
Analysis
Suppose that the array contains n bitstrings of fixed length. Then the algorithm as described is O(n^3) because step 2 is O(n^2), and there are O(n) iterations, because at each iteration we remove two bitstrings from R and add one.
The paper contains an argument that step 2 is Ω(n^2) only on the first time around the loop, and on other iterations it is O(n) because we only have to find the partner of p "and any other bitstrings in R whose partner was one of the two selected for combination." However, this argument is not convincing to me: it is not clear that there are only O(1) other such bitstrings. (Maybe there's a better argument?)
We could bring the algorithm down to O(n^2) by storing the number of common set bits between every pair of bitstrings. This requires O(n^2) extra space.
Reference
S. Taylor & T. Drummond (2011). "Binary Histogrammed Intensity Patches for Efficient and Robust Matching". Int. J. Comput. Vis. 94:241–265.
Well for each bit position you could maintain two sets, those with that position on and those with it off. The sets could be placed in two binary trees for example.
Then you just intersect these sets, first for all eight bit positions, then for every combination of 7, and so on, until you find an intersection with two elements.
The complexity here grows exponentially in the bit size, but if it is small and fixed this isn't a problem.
Another way to do it might be to look at the n k-bit strings as n points in a kD space, and your task is to find the two points closest together. There are a number of geometric algorithms to do this.
For a part of a divide and conquer algorithm, I have the following question where the data structure is not fixed, so set is not to be taken literally:
Given a set X sorted wrt. some ordering of elements and subsets A and B together consisting of all elements in X, can sorted versions A' and B' of A and B be constructed in time linear in the number of elements in X ?
At the moment I am doing a standard sort at each recursive step giving the recursion
T(n) = 2*T(n/2) + O(n*log n)
for the complexity rather than
T(n) = 2*T(n/2) + O(n)
like in the procedural version, where one can utilize a structure with constant-time lookup on A and B to form A' and B' in linear time.
The added log n factor carries over to the overall complexity, giving O(n* (log n)^2) instead of O(n* log n).
EDIT:
Perhaps I am understanding the term lookup incorrectly. The creation of A' and B' in linear time is easy to do if membership of A and B can be checked in constant time.
I didn't succeed in my attempt at making things clearer by abstracting away the specifics, so here is the actual problem:
I am implementing the algorithm for the closest pair problem. Given a finite collection P of points in the plane, it finds a pair of points in P with the minimal distance. It works roughly as follows: if P has at least 4 points, form Px and Py, the points in P sorted by x- and y-coordinate. By splitting Px, form L and R, the left- and right-most halves of points. Recursively compute the closest pair distance in L and R, and let d be the minimum of the two. Now the minimum distance in P is either d or the distance from a point in L to a point in R. If the minimal distance is between points from separate halves, it will appear between a pair of points lying in the strip of width 2*d centered around the line x = x0, where x0 is the x-coordinate of a right-most point in L. It turns out that to find a potential minimal-distance pair in the strip, it is enough to compute, for every point in the strip, its distance to the seven following points, provided the strip points are kept in a collection sorted by y-coordinate.
It is in the steps of forming the sorted collections to pass into the recursion, and of sorting the strip points by y-coordinate, that I don't see how to make use, in Haskell, of having sorted P at the beginning of the recursion.
The following function may interest you:
partition :: (a -> Bool) -> [a] -> ([a], [a])
partition f xs = (filter f xs, filter (not . f) xs)
If you can compute set-membership in constant time, that is, there is a predicate of type a -> Bool that runs in constant time, then partition will run in time linear in the length of its input list. Furthermore, partition is stable, so that if its input list is sorted, then so are both output lists.
I would also like to point out that the above definition is meant to give the semantics of partition only; the real implementation in GHC only walks its input list once, even if the entire output is forced.
Of course, the real crux of the question is providing a constant-time predicate. The way you phrased the question leaves sets A and B quite unstructured -- you demand that we can handle any particular partitioning. In that case, I don't know of any particularly Haskell-y way of doing constant-time lookup in arbitrary sets. However, often these problems are a bit more structured: often, rather than set-membership, you are actually interested in whether some easily-computable property holds or not. In this case, the above is just what the doctor ordered.
I know very very little about Haskell but here's a shot anyway.
Given that (A+B) == X, can't you just iterate through X (in the sorted order) and add each element to A' or B' according to whether it is in A or B? Given constant-time lookup of an element x in the sets A and B, that would be linear.
Is there a way to generate all of the subset sums s1, s2, ..., sk that fall in a range [A,B] faster than O((k+N)*2^(N/2)), where k is the number of sums there are in [A,B]? Note that k is only known after we have enumerated all subset sums within [A,B].
I'm currently using a modified Horowitz-Sahni algorithm. For example, I first call it for the smallest sum greater than or equal to A, giving me s1. Then I call it again for the next smallest sum greater than s1, giving me s2. Repeat this until we find a sum s(k+1) greater than B. There is a lot of computation repeated between each iteration, even without rebuilding the initial two 2^(N/2) lists, so is there a way to do better?
In my problem, N is about 15, and the magnitude of the numbers is on the order of millions, so I haven't considered the dynamic programming route.
Check the subset sum problem on Wikipedia. As far as I know, it's the fastest known algorithm, which operates in O(2^(N/2)) time.
Edit:
If you're looking for multiple possible sums, instead of just 0, you can save the end arrays and just iterate through them again (which is roughly an O(2^(n/2)) operation) and save re-computing them. The values of all the possible subsets don't change with the target.
Edit again:
I'm not wholly sure what you want. Are we running K searches for one independent value each, or looking for any subset that has a value in a specific range that is K wide? Or are you trying to approximate the second by using the first?
Edit in response:
Yes, you do get a lot of duplicate work even without rebuilding the list. But if you don't rebuild the list, that's not O(k * N * 2^(N/2)). Building the list is O(N * 2^(N/2)).
If you know A and B right now, you could begin iteration, and then simply not stop when you find the right answer (the bottom bound), but keep going until it goes out of range. That should be roughly the same as solving subset sum for just one solution, involving only +k more ops, and when you're done, you can ditch the list.
More edit:
You have a range of sums, from A to B. First, you solve subset sum problem for A. Then, you just keep iterating and storing the results, until you find the solution for B, at which point you stop. Now you have every sum between A and B in a single run, and it will only cost you one subset sum problem solve plus K operations for K values in the range A to B, which is linear and nice and fast.
s = *i + *j; if s > B then ++i; else if s < A then ++j; else { print s; ... what_goes_here? ... }
No, no, no. I get the source of your confusion now (I misread something), but it's still not as complex as what you had originally. If you want to find ALL combinations within the range, instead of one, you will just have to iterate over all combinations of both lists, which isn't too bad.
Excuse my use of auto. C++0x compiler.
std::vector<int> sums;
std::vector<int> firstlist;
std::vector<int> secondlist;
// Fill in first/secondlist.
std::sort(firstlist.begin(), firstlist.end());
std::sort(secondlist.begin(), secondlist.end());
auto firstit = firstlist.begin();
auto secondit = secondlist.begin();
// Since we want all in a range, rather than just the first, we need to check all combinations. Horowitz/Sahni is only designed to find one.
for(; firstit != firstlist.end(); firstit++) {
for (secondit = secondlist.begin(); secondit != secondlist.end(); secondit++) {
int sum = *firstit + *secondit;
if (sum >= A && sum <= B)
sums.push_back(sum);
}
}
It's still not great. But it could be optimized if you know in advance that N is very large, for example, mapping or hashmapping sums to iterators, so that any given firstit can find any suitable partners in secondit, reducing the running time.
It is possible to do this in O(N*2^(N/2)), using ideas similar to Horowitz-Sahni, but we try to do some optimizations to reduce the constants in the big-O.
We do the following
Step 1: Split the input into two halves of N/2 elements each, and generate all 2^(N/2) possible subsets (and their sums) for each half. Call them S1 and S2. This we can do in O(2^(N/2)) (note: the N factor is missing here, due to an optimization we can do).
Step 2: Next sort the larger of S1 and S2 (say S1) in O(N*2^(N/2)) time (we optimize here by not sorting both).
Step 3: Find Subset sums in range [A,B] in S1 using binary search (as it is sorted).
Step 4: Next, for each sum in S2, use binary search to find the sets in S1 whose union with it gives a sum in the range [A,B]. This is O(N*2^(N/2)). At the same time, check whether the corresponding set in S2 is itself in the range [A,B]. The optimization here is to combine the loops. Note: this gives you a representation of the sets (in terms of an index into S1 and an index into S2), not the sets themselves. If you want all the sets, this becomes O(K + N*2^(N/2)), where K is the number of sets.
Further optimizations might be possible; for instance, when a sum from S2 is negative, we don't consider sums < A, etc.
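A minimal sketch of Steps 2-4 in Python (assuming S1_sums and S2_sums are the lists of subset sums from Step 1; this version returns only the sums, not the index pairs):

from bisect import bisect_left, bisect_right

def sums_in_range(S1_sums, S2_sums, A, B):
    # Step 2: sort one half.  Steps 3-4: for each sum s2 from the other half,
    # binary-search for the S1 sums t with A <= t + s2 <= B.
    S1_sums = sorted(S1_sums)
    out = []
    for s2 in S2_sums:
        lo = bisect_left(S1_sums, A - s2)
        hi = bisect_right(S1_sums, B - s2)
        out.extend(s1 + s2 for s1 in S1_sums[lo:hi])
    return out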
Since Steps 2,3,4 should be pretty clear, I will elaborate further on how to get Step 1 done in O(2^(N/2)) time.
For this, we use the concept of Gray Codes. Gray codes are a sequence of binary bit patterns in which each pattern differs from the previous pattern in exactly one bit.
Example: 00 -> 01 -> 11 -> 10 is a gray code with 2 bits.
There are Gray codes which go through all possible N/2-bit numbers, and these can be generated iteratively (see the Wikipedia page on Gray codes) in O(1) time per step (for a total of O(2^(N/2)) steps): given the current bit pattern, we can generate the next bit pattern in O(1) time.
This enables us to form all the subset sums, by using the previous sum and changing that by just adding or subtracting one number (corresponding to the differing bit position) to get the next sum.
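To make Step 1 concrete, here is a short Python sketch (my own, with illustrative names) that generates all subset sums of a list in Gray-code order, so that each sum is obtained from the previous one by adding or subtracting a single element:

def gray_code_subset_sums(items):
    # Visit all subsets of items in Gray-code order: each step flips exactly
    # one element in or out, so each new sum costs O(1) after the first.
    m = len(items)
    total, mask = 0, 0
    yield total, mask
    for w in range(1, 1 << m):
        bit = (w & -w).bit_length() - 1      # index of the lowest set bit of w
        if mask & (1 << bit):
            total -= items[bit]              # element leaves the subset
        else:
            total += items[bit]              # element joins the subset
        mask ^= 1 << bit
        yield total, mask

# Example: the subset sums of one half of the input, as in Step 1,
# then sorted as in Step 2.
half = [29371, 108810, 124019, 267363, 298330, 368607, 438140]
S1 = sorted(s for s, _ in gray_code_subset_sums(half))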
If you modify the Horowitz-Sahni algorithm in the right way, then it's hardly slower than original Horowitz-Sahni. Recall that Horowitz-Sahni works with two lists of subset sums: sums of subsets in the left half of the original list, and sums of subsets in the right half. Call these two lists of sums L and R. To obtain subsets that sum to some fixed value A, you can sort R, and then look up a number in R that matches each number in L using a binary search. However, the algorithm is asymmetric only to save a constant factor in space and time. It's a good idea for this problem to sort both L and R.
In my code below I also reverse L. Then you can keep two pointers into R, updated for each entry in L: A pointer to the last entry in R that's too low, and a pointer to the first entry in R that's too high. When you advance to the next entry in L, each pointer might either move forward or stay put, but they won't have to move backwards. Thus, the second stage of the Horowitz-Sahni algorithm only takes linear time in the data generated in the first stage, plus linear time in the length of the output. Up to a constant factor, you can't do better than that (once you have committed to this meet-in-the-middle algorithm).
Here is Python code with example input:
# Input
terms = [29371, 108810, 124019, 267363, 298330, 368607,
438140, 453243, 515250, 575143, 695146, 840979, 868052, 999760]
(A,B) = (500000,600000)
# Subset iterator stolen from Sage
def subsets(X):
    yield []; pairs = []
    for x in X:
        pairs.append((2**len(pairs),x))
        for w in xrange(2**(len(pairs)-1), 2**(len(pairs))):
            yield [x for m, x in pairs if m & w]
# Modified Horowitz-Sahni with toolow and toohigh indices
L = sorted([(sum(S),S) for S in subsets(terms[:len(terms)/2])])
R = sorted([(sum(S),S) for S in subsets(terms[len(terms)/2:])])
(toolow,toohigh) = (-1,0)
for (Lsum,S) in reversed(L):
    while toolow < len(R)-1 and R[toolow+1][0] < A-Lsum: toolow += 1
    while toohigh < len(R) and R[toohigh][0] <= B-Lsum: toohigh += 1
    for n in xrange(toolow+1,toohigh):
        print '+'.join(map(str,S+R[n][1])),'=',sum(S+R[n][1])
"Moron" (I think he should change his user name) raises the reasonable issue of optimizing the algorithm a little further by skipping one of the sorts. Actually, because each list L and R is a list of sizes of subsets, you can do a combined generate and sort of each one in linear time! (That is, linear in the lengths of the lists.) L is the union of two lists of sums, those that include the first term, term[0], and those that don't. So actually you should just make one of these halves in sorted form, add a constant, and then do a merge of the two sorted lists. If you apply this idea recursively, you save a logarithmic factor in the time to make a sorted L, i.e., a factor of N in the original variable of the problem. This gives a good reason to sort both lists as you generate them. If you only sort one list, you have some binary searches that could reintroduce that factor of N; at best you have to optimize them somehow.
At first glance, a factor of O(N) could still be there for a different reason: If you want not just the subset sum, but the subset that makes the sum, then it looks like O(N) time and space to store each subset in L and in R. However, there is a data-sharing trick that also gets rid of that factor of O(N). The first step of the trick is to store each subset of the left or right half as a linked list of bits (1 if a term is included, 0 if it is not included). Then, when the list L is doubled in size as in the previous paragraph, the two linked lists for a subset and its partner can be shared, except at the head:
0
|
v
1 -> 1 -> 0 -> ...
Actually, this linked list trick is an artifact of the cost model and never truly helpful. Because, in order to have pointers in a RAM architecture with O(1) cost, you have to define data words with O(log(memory)) bits. But if you have data words of this size, you might as well store each word as a single bit vector rather than with this pointer structure. I.e., if you need less than a gigaword of memory, then you can store each subset in a 32-bit word. If you need more than a gigaword, then you have a 64-bit architecture or an emulation of it (or maybe 48 bits), and you can still store each subset in one word. If you patch the RAM cost model to take account of word size, then this factor of N was never really there anyway.
So, interestingly, the time complexity for the original Horowitz-Sahni algorithm isn't O(N*2^(N/2)), it's O(2^(N/2)). Likewise the time complexity for this problem is O(K+2^(N/2)), where K is the length of the output.