I need to write an article about Guy Blelloch's parallel prefix sum. In his paper Prefix Sums and Their Applications he presents examples, but I am not sure about the requirements for this algorithm to work. I actually have two questions:
What if the length of the array over which the prefix sum is calculated is NOT a power of 2?
and
What if there are not enough processors to sum every pair of adjacent elements in parallel?
I hope someone can help me with this :D
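For what it's worth, here is a sketch (plain Python, with sequential loops standing in for the parallel steps) of the work-efficient exclusive scan from the paper. It handles question 1 by padding the input with the operator's identity (0 for addition) up to the next power of two; for question 2, note that the iterations of each inner loop are independent, so with fewer than n/2 processors each level is simply processed in chunks:

```python
def blelloch_exclusive_scan(a):
    """Exclusive prefix sum via up-sweep/down-sweep.
    Non-power-of-two inputs are padded with 0, the identity for +."""
    n = len(a)
    size = 1
    while size < n:
        size *= 2
    x = a + [0] * (size - n)          # pad to the next power of two

    # up-sweep (reduce): build partial sums up the implicit tree
    d = 1
    while d < size:
        for i in range(0, size, 2 * d):   # independent -> parallelizable
            x[i + 2 * d - 1] += x[i + d - 1]
        d *= 2

    # down-sweep: push prefixes back down
    x[size - 1] = 0
    d = size // 2
    while d >= 1:
        for i in range(0, size, 2 * d):   # independent -> parallelizable
            t = x[i + d - 1]
            x[i + d - 1] = x[i + 2 * d - 1]
            x[i + 2 * d - 1] += t
        d //= 2

    return x[:n]                      # discard the padding
```

The padded tail never affects the first n outputs, because adding the identity changes nothing.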
Given an algorithm of 2-swap permutation enumeration like Steinhaus–Johnson–Trotter algorithm (but not necessarily of adjacent items), I would like to find a way to do the following:
[the basics] A function that, from the starting vector [1,2,3..N], efficiently goes through all permutations (iteratively and/or recursively), each obtained by swapping 2 elements of the previous one.
A function that, given the index of some permutation in [1, N!], can compute it directly (I mean, find it without having to walk through all the preceding ones) and then keep enumerating from there.
The opposite: find the index of a given permutation relative to a given starting one.
In other words, the functions to slice a list of 2-swap permutations into arbitrary sized independent blocks.
Pseudocode and/or C-like code are very welcome.
Links to articles/books too.
Ref.: http://rosettacode.org/wiki/Permutations_by_swapping
I've posted Java code here for the Johnson-Trotter algorithm (with Even's speedup) implemented iteratively rather than recursively.
I think this may help you with your question 1, and possibly also 2 and 3.
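For reference, the plain Johnson-Trotter enumeration (without Even's speedup, which the linked Java code adds) can be sketched iteratively like this:

```python
def johnson_trotter(n):
    """Yield permutations of 1..n, each differing from the previous
    one by a single adjacent swap (Johnson-Trotter)."""
    perm = list(range(1, n + 1))
    direction = [-1] * n              # -1: looking left, +1: looking right
    yield tuple(perm)
    while True:
        # Find the largest "mobile" element: one whose neighbour in
        # its direction is smaller than it.
        mobile = -1
        for i in range(n):
            j = i + direction[i]
            if 0 <= j < n and perm[j] < perm[i]:
                if mobile == -1 or perm[i] > perm[mobile]:
                    mobile = i
        if mobile == -1:
            return                    # no mobile element: we are done
        chosen = perm[mobile]
        j = mobile + direction[mobile]
        # Swap with the neighbour, carrying the direction flags along.
        perm[mobile], perm[j] = perm[j], perm[mobile]
        direction[mobile], direction[j] = direction[j], direction[mobile]
        # Reverse the direction of every element larger than the chosen one.
        for i in range(n):
            if perm[i] > chosen:
                direction[i] = -direction[i]
        yield tuple(perm)
```

Even's speedup avoids the O(n) scan for the mobile element, bringing the per-permutation cost down to amortized O(1); this sketch keeps the simpler scan for clarity.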
I have N binary sequences of length L, where N and L may be very large, and those sequences may be very sparse, say with many more 0s than 1s.
I want to select M sequences from them, namely b_1, b_2, b_3..., such that
b_1 | b_2 | b_3 ... | b_M = 1111...11 (L 1s)
Is there an algorithm to achieve it?
My idea is:
STEP1: For each position from 1 to L, count the number of sequences that have a 1 at that position. Call this the position's 'owning number'.
STEP2: Take the position with the minimum owning number, and among the sequences owning that position, choose the one with the maximum number of 1s.
STEP3: Remove the chosen sequence, update the owning numbers, and go back to STEP2.
I believe that my method cannot generate the best answer.
Does anyone have a better idea?
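For concreteness, the three steps above can be sketched like this (representing each sequence as the set of positions holding a 1 is an assumption made for illustration):

```python
def greedy_cover(sequences, L):
    """Greedy selection following STEP1-STEP3.
    sequences: list of sets of positions (0..L-1) where the bit is 1.
    Returns indices of chosen sequences, or None if coverage is impossible."""
    remaining = set(range(L))             # positions not yet covered
    available = list(range(len(sequences)))
    chosen = []
    while remaining:
        # STEP1/STEP2: find the position covered by the fewest sequences
        owners = {p: [i for i in available if p in sequences[i]]
                  for p in remaining}
        pos = min(owners, key=lambda p: len(owners[p]))
        if not owners[pos]:
            return None                   # this position can never be covered
        # among its owners, pick the sequence covering the most uncovered 1s
        best = max(owners[pos], key=lambda i: len(sequences[i] & remaining))
        chosen.append(best)
        remaining -= sequences[best]
        available.remove(best)            # STEP3: ignore the chosen sequence
    return chosen
```

Note that picking by "most uncovered positions" (rather than raw number of 1s) is a small refinement; ties are broken arbitrarily, so this is a heuristic, not an exact solver.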
This is the well known set cover problem. It is NP-hard — in fact, its decision version is one of the canonical NP-complete problems and was among the 21 problems included in Karp's 1972 paper — and so no efficient algorithm is known for solving it.
The algorithm you describe in your question is known as the "greedy algorithm" and (unless your problem has some special features that you are not telling us) it's essentially the best known approach. It finds a collection of sets that is no more than O(log L) times the size of the smallest such collection, where L is the number of positions to cover.
Sounds like a typical backtrack task.
Yes, your algorithm sounds reasonable if you want a good answer quickly. If you want the combination with the fewest possible sequences, you can't do better than trying all combinations.
Depending on the exact structure of the problem, there is another technique that often works well (and actually gives an optimal result):
Let x[j] be a boolean variable representing whether to include the j-th binary sequence in the result. A zero-suppressed binary decision diagram (ZDD) can then represent (perhaps succinctly, depending on the characteristics of the problem) the family of sets of variables x[j] such that the OR of the corresponding binary sequences is all ones. Finding the smallest such set (thus minimizing the number of sequences included) is relatively easy if the ZDD is succinct. Details can be found in The Art of Computer Programming, section 7.1.4 (Volume 4A).
It's also easy to adapt to an exact cover, by taking the family of sets such that there is exactly one 1 for every position.
I have this problem in my textbook:
Given a group of n items, each with a distinct value V(i), what is the best way to divide the items into 3 groups so that the value of the group with the highest value is minimized? Give the value of this largest group.
I know how to do the 2-pile variant of this problem: it just requires running the knapsack algorithm backwards on the problem. However, I am pretty puzzled as to how to solve this problem. Could anyone give me any pointers?
Answer: Pretty much the same thing as the 0-1 knapsack, although 2D
Tough homework problem. This is essentially the optimization version of the 3-partition problem.
http://en.wikipedia.org/wiki/3-partition_problem
It is closely related to bin packing, partition, and subset-sum (and, as you noted, knapsack). However, it happens to be strongly NP-complete, which makes it harder than its cousins. Anyway, I suggest you start by looking at dynamic programming solutions to the related problems (I'd start with partition, but find a non-Wikipedia explanation of the DP solution).
Update: I apologize; I misled you. The 3-partition problem splits the input into sets of 3 items each, not into 3 sets. The rest of what I said still applies, but with the renewed hope that your variant isn't strongly NP-complete.
Let f[i][j][k] denote whether it is possible to have value j in the first set and value k in the second set, using the first i items.
So we have f[i][j][k] = f[i-1][j-v[i]][k] or f[i-1][j][k-v[i]].
and initially we have f[0][0][0] = True.
Finally, for every f[n][j][k] = True, the three groups have values j, k, and S - j - k (where S is the total value of all items), so take the minimum over max(j, k, S - j - k).
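A sketch of that recurrence in Python, rolling the i dimension away (illustrative only; the table is O(S^2), so this is only practical for small totals):

```python
def min_max_three_partition(values):
    """DP over the f[i][j][k] recurrence above: after processing all
    items, f[j][k] is True if one group can sum to j and another to k
    (the third group gets the rest). Returns the minimized largest sum."""
    total = sum(values)
    f = [[False] * (total + 1) for _ in range(total + 1)]
    f[0][0] = True
    for v in values:
        # iterate downwards so each item is used at most once
        for j in range(total, -1, -1):
            for k in range(total, -1, -1):
                if not f[j][k]:
                    continue
                if j + v <= total:
                    f[j + v][k] = True      # item goes to the first set
                if k + v <= total:
                    f[j][k + v] = True      # item goes to the second set
    best = total
    for j in range(total + 1):
        for k in range(total + 1 - j):
            if f[j][k]:
                best = min(best, max(j, k, total - j - k))
    return best
```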
I don't know about "The Best" mathematically speaking, but one obvious approach would be to build a population of groups initially with one item in each group. Then, for as long as you have more groups than the desired number of final groups, extract the two groups with the lowest values and combine them into a new group that you add back into the collection. This is similar to how Huffman compression trees are built.
Example:
1 3 7 9 10
becomes
4(1+3) 7 9 10
becomes
9 10 11(1+3+7)
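A minimal sketch of this merge-the-two-smallest idea using a heap (illustrative only; as the answer says, it is a heuristic, not guaranteed optimal):

```python
import heapq

def merge_to_k_groups(items, k):
    """Repeatedly combine the two lowest-valued groups until only k
    groups remain (Huffman-style). Returns the sorted group values."""
    heap = [(v, [v]) for v in items]       # (group value, group members)
    heapq.heapify(heap)
    while len(heap) > k:
        v1, g1 = heapq.heappop(heap)       # two smallest groups
        v2, g2 = heapq.heappop(heap)
        heapq.heappush(heap, (v1 + v2, g1 + g2))
    return sorted(v for v, _ in heap)
```

On the example above, 1 and 3 merge first, then 4 and 7, leaving 9, 10, 11 as shown.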
Suppose there are n numbers; let's say we have the following 4 numbers: 15, 20, 10, 25.
There are two containers A and B, and my job is to distribute the numbers between them so that the sums of the numbers in the two containers have the least difference.
In the above example, A should have 15+20 and B should have 10+25, so the difference = 0.
I thought of a method. It seems to work, but I don't know why.
Sort the number list in descending order first. In each round, take the maximum number out
and put it into the container with the smaller sum.
Btw, can it be solved by DP?
THX
In fact, your method doesn't always work. Consider 2, 4, 4, 5, 5. Your method produces (5,4,2)(5,4) with difference 2, while the best answer is (5,5)(4,4,2) with difference 0.
Yes, it can be solved by dynamic programming. Here are some useful links:
Tutorial and Code: http://www.cs.cornell.edu/~wdtseng/icpc/notes/dp3.pdf
A practice: http://people.csail.mit.edu/bdean/6.046/dp/ (then click Balanced Partition)
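The balanced-partition DP those links describe can be sketched with a set of reachable subset sums, which is equivalent to the boolean table (assuming non-negative integers):

```python
def min_partition_diff(nums):
    """Subset-sum DP for balanced partition: track every achievable
    subset sum, then pick the one closest to half the total."""
    total = sum(nums)
    reachable = {0}
    for v in nums:
        # each number either joins the subset or doesn't
        reachable |= {s + v for s in reachable}
    best = min(reachable, key=lambda s: abs(total - 2 * s))
    return abs(total - 2 * best)
```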
What's more, please note that if the scale of the problem is damn large (like you have 5 million numbers, etc.), you won't want to use DP, which needs too large a table. If this is the case, you want to use a kind of Monte Carlo-style algorithm (really a randomized local search):
divide the n numbers into two groups randomly (or use your method at this step if you like);
choose one number from each group; if swapping these two numbers decreases the difference of the sums, swap them;
repeat step 2 until "no swap has occurred for a long time".
You shouldn't expect this method to always find the best answer, but it is the only way I know to solve this problem at very large scale within reasonable time and memory.
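A sketch of that randomized-swap heuristic (the seed and the "long time" threshold are arbitrary choices; group sizes stay fixed after the initial split):

```python
import random

def local_search_partition(nums, max_stale=1000, seed=0):
    """Randomized swap heuristic: keep any swap between the two groups
    that decreases the difference of sums; stop after max_stale
    consecutive non-improving tries. Not guaranteed optimal."""
    rng = random.Random(seed)
    nums = list(nums)
    rng.shuffle(nums)                     # step 1: random initial division
    half = len(nums) // 2
    a, b = nums[:half], nums[half:]
    stale = 0
    while stale < max_stale:
        i, j = rng.randrange(len(a)), rng.randrange(len(b))
        diff = abs(sum(a) - sum(b))
        a[i], b[j] = b[j], a[i]           # step 2: try a swap
        if abs(sum(a) - sum(b)) < diff:
            stale = 0                     # improvement: keep it
        else:
            a[i], b[j] = b[j], a[i]       # undo and count as stale
            stale += 1
    return a, b
```

Recomputing the sums each iteration is wasteful; a real implementation would maintain them incrementally.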
Suppose I have a series of index numbers, each of which includes a check digit. If I have a large enough sample (say, 250 index numbers), is there a way to extract the algorithm that was used to generate the check digit?
I think there should be a programmatic approach, at least to find a set of possible algorithms.
UPDATE: The length of an index number is 8 digits, including the check digit.
No, not in the general case, since the number of possible algorithms is far greater than you might think. A sample space of 250 may not be enough for proper numerical analysis.
For an extreme example, let's say your samples are all 15 digits long. You would not be able to reliably detect the algorithm if it changed its behaviour for inputs longer than 15 digits.
If you wanted to be sure, you should reverse engineer the code that checks the numbers for validity (if available).
If you know that the algorithm is drawn from a smaller subset than "every possible algorithm", then it might be possible. But algorithms may be only half the story - there's also the case where multipliers, exponentiation and wrap-around points change even using the same algorithm.
paxdiablo is correct: you can't guess the algorithm without making further assumptions (or having the whole sample space, in which case you can define the algorithm by a lookup table).
However, if the check digit is calculated by some linear formula over the "data digits" (which is a very common case, as you can see in the Wikipedia article), then given enough samples you can recover the coefficients with Gaussian elimination.
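Before reaching for elimination, a practical first step is simply to test well-known candidate schemes against the samples. A sketch with Luhn and a generic weighted-mod-10 rule (the weight vector below is a made-up example, not a known standard):

```python
def luhn_check_digit(digits):
    """Luhn: double every second digit from the right, sum the digit
    sums, and take the complement mod 10."""
    total = 0
    for i, d in enumerate(reversed(digits)):
        d = d * 2 if i % 2 == 0 else d
        total += d - 9 if d > 9 else d
    return (10 - total % 10) % 10

def weighted_mod10(digits, weights):
    """Generic linear scheme: weighted digit sum mod 10."""
    return sum(w * d for w, d in zip(weights, digits)) % 10

def matches(samples, rule):
    """True if rule(data digits) equals the last digit of every sample."""
    return all(rule(s[:-1]) == s[-1] for s in samples)
```

Usage: build a list of candidate rules (Luhn, a few common weight vectors, etc.) and keep the ones for which `matches` holds on all 250 samples; survivors are your "set of possible algorithms".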