Combining permutation groups - algorithm

I am developing a probability analysis program for a board game. As part of an algorithm* I need to calculate the possible permutations of the partitions of a number (plus some padding), such that no partition component occupies an index higher than the total length of the permutation, in digits, minus the value of the component.
(It is extremely unlikely, however, that the number that will be partitioned will ever be higher than 8, and the length of the permutations will never be higher than 7.)
For instance, say I have the partition of 4, "211", and I want to find the permutations when there is a padding of 2, i.e. length of 5:
0 1 2 3 4 (array indexes)
5 4 3 2 1 (maximum value of partition component that can be allocated to each index)
2 1 1 _ _ (the partition plus 2 empty indexes)
This is represented as an array like so {2,1,1,0,0}
There are 6 permutations when 2 is in the 0 index (4! / 2! 2!), and there are 4 indexes that 2 can occupy (2 cannot be placed into the last index) so overall there are 24 permutations for this case (a 1 can occupy any index).
The output for input "21100":
21100, 21010, 21001, 20110, 20101, 20011
02110, 02101, 02011, 12100, 12010, 12001
00211, 10210, 11200, 10201, 01210, 01201
10021, 01021, 00121, 11020, 10120
01120
Note that this is simply the set of all permutations of "21100" minus those where 2 is in the 4th index. This is a relatively simple case.
The problem can be described as combining n different permutation groups, as the above case can be expressed as the combining of the permutations of x=1 n=4 and those of x=2 n=5, where x is the value count and n is the "space" count.
My difficulty is formulating a method that can obtain all possibilities computationally, and any advice would be greatly appreciated. Please excuse any muddling of terminology in my question.
*The algorithm answers the following question:
There is a set of n units that are attacked k times. Each
attack has p chance to miss and q (1 - p) chance to damage
a random unit from the set. A unit that is damaged for a second time is destroyed
and is removed from the set.
What is the probability of there being
x undamaged units, y damaged units and z destroyed units after the attacks?
If anyone knows a more direct approach to this problem, please let me know.

I am going to use the algorithm to generate all permutations of the multiset as given in the answer to this question
How to generate all the permutations of a multiset?
and then filter to fit my criteria.
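The generate-then-filter plan above could be sketched as follows. This is only a minimal illustration, not the linked answer's algorithm: it uses itertools.permutations with a set to remove duplicates (affordable because the length never exceeds 7), and the function name valid_arrangements is hypothetical. The filter assumes the rule from the example, namely that a component of value v may sit at index i only if v <= length - i.

```python
from itertools import permutations


def valid_arrangements(components):
    """Return the distinct orderings of `components` that satisfy the rule.

    Assumed rule (taken from the worked example): a component of value v
    may occupy index i only when v <= len(components) - i.
    """
    length = len(components)
    # A set removes the duplicates caused by repeated values (the multiset).
    distinct = set(permutations(components))
    allowed = [
        perm
        for perm in distinct
        if all(value <= length - index for index, value in enumerate(perm))
    ]
    return sorted(allowed, reverse=True)


if __name__ == "__main__":
    results = valid_arrangements([2, 1, 1, 0, 0])
    print(len(results))
    print(["".join(map(str, perm)) for perm in results])
```

For the "21100" example this keeps every ordering except the six that place the 2 in the last index.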

Related

Split array into four boxes such that sum of XOR's of the boxes is maximum

Given an array of integers which are needed to be split into four
boxes such that sum of XOR's of the boxes is maximum.
I/P -- [1,2,1,2,1,2]
O/P -- 9
Explanation: Box1--[1,2]
Box2--[1,2]
Box3--[1,2]
Box4--[]
I've tried using recursion but failed for larger test cases as the
Time Complexity is exponential. I'm expecting a solution using dynamic
programming.
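def max_Xor(b1,b2,b3,b4,A,index,size):
    # All numbers placed: the score is the sum of the four box XORs.
    if index == size:
        return b1+b2+b3+b4
    # Try putting A[index] into each of the four boxes.
    m=max(max_Xor(b1^A[index],b2,b3,b4,A,index+1,size),
          max_Xor(b1,b2^A[index],b3,b4,A,index+1,size),
          max_Xor(b1,b2,b3^A[index],b4,A,index+1,size),
          max_Xor(b1,b2,b3,b4^A[index],A,index+1,size))
    return m
def main():
    A = [1,2,1,2,1,2]  # sample input from above
    print(max_Xor(0,0,0,0,A,0,len(A)))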
Thanks in Advance!!
There are several things to speed up your algorithm:
Build in some start-up logic: it doesn't make sense to put anything into box 3 until boxes 1 & 2 are differentiated. In fact, you should generally have an order of precedence to keep you from repeating configurations in a different order.
Memoize your logic; this avoids repeating computations.
For large cases, take advantage of what value algebra exists.
This last item may turn out to be the biggest saving. For instance, if your longest numbers include several 5-bit and 4-bit numbers, it makes no sense to consider shorter numbers until you've placed those decently in the boxes, gaining maximum advantage for the leading bits. With only four boxes, you cannot have a sum from 3-bit numbers that dominates a single misplaced 5-bit number.
Your goal is to place an odd number of 5-bit numbers into 3 or all 4 boxes; against this, check only whether this "pessimizes" bit 4 of the remaining numbers. For instance, given six 5-bit numbers (range 16-31) and a handful of small ones (0-7), your first consideration is to handle only combinations that partition the 5-bit numbers by (3, 1, 1, 1), as this leaves that valuable fifth bit turned on in each set.
With a more even mixture of values in your input, you'll also need to consider how to distribute the 4-bits for a similar "keep it odd" heuristic. Note that, as you work from largest to smallest, you need worry only about keeping it odd, and watching the following bit.
These techniques should let you prune your recursion enough to finish in time.
We can use Dynamic programming here to break the problem into smaller sets then store their result in a table. Then use already stored result to calculate answer for bigger set.
For example:
Input -- [1,2,1,2,1,2]
We need to divide the array consecutively into 4 boxes such that the sum of the XORs of all boxes is maximised.
Let's take your test case, break the problem into smaller sets, and start by solving the smaller sets.
box = 1, num = [1,2,1,2,1,2]
ans = 1 3 2 0 1 3
Since we have only one box, all numbers go into it, so the answer for each prefix is its running XOR. We will store these answers in a table. Let's call the matrix DP.
DP[0] = [1 3 2 0 1 3]
DP[i][j] stores the answer for distributing the numbers at indexes 0..j into i+1 boxes (so row 0 is the one-box case).
Now let's take the case where we have two boxes, and take the numbers one by one.
num = [1] since we only have one number it will go into the first box.
DP[1][0] = 1
Let's add another number.
num = [1 2]
Now there are two ways to put this new number into a box.
case 1: 2 will go to the First box. Since we already have answer
for both numbers in one box. we will just use that.
answer = DP[0][1] + 0 (Second box is empty)
case 2: 2 will go to second box.
answer = DP[0][0] + 2 (only 2 is present in the second box)
Maximum of the two cases will be stored in DP[1][1].
DP[1][1] = max(3+0, 1+2) = 3.
Now for num = [1 2 1].
Again for new number we have three cases.
box1 = [1 2 1], box2 = [], DP[0][2] + 0
box1 = [1 2], box2 = [1], DP[0][1] + 1
box1 = [1 ], box2 = [2 1], DP[0][0] + 2^1
Maximum of these three will be answer for DP[1][2].
Similarly we can find answer of num = [1 2 1 2 1 2] box = 4
1 3 2 0 1 3
1 3 4 6 5 3
1 3 4 6 7 9
1 3 4 6 7 9
Also note that a xor b xor a = b. You can use this property to get the XOR of any segment of the array in constant time, as suggested in the comments.
This way you can break the problem into smaller subproblems and use their answers to compute the bigger ones. Hope this helps. After understanding the concept you can go ahead and implement it in better than exponential time.
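A minimal sketch of the table construction described above, assuming (as this answer does) that each box holds a consecutive run of the array and that boxes may be empty. The function name xor_box_table and the prefix array are illustrative choices, not part of the original answer. Row i of the returned table corresponds to i+1 boxes.

```python
def xor_box_table(nums, boxes=4):
    """Build the DP table for splitting `nums` into consecutive boxes.

    table[i][j] is the best sum of box XORs when the numbers at indexes
    0..j are split into i + 1 consecutive (possibly empty) boxes.
    Assumes `nums` is non-empty.
    """
    n = len(nums)
    # prefix[k] is the XOR of nums[0..k-1]; a segment XOR is then
    # prefix[end + 1] ^ prefix[start], using a xor b xor a = b.
    prefix = [0] * (n + 1)
    for i, value in enumerate(nums):
        prefix[i + 1] = prefix[i] ^ value

    # One box: everything up to j goes into it.
    table = [[prefix[j + 1] for j in range(n)]]
    for _ in range(1, boxes):
        prev = table[-1]
        row = []
        for j in range(n):
            best = prev[j]  # the newest box stays empty
            for k in range(j):
                # earlier boxes take 0..k, the newest box takes k+1..j
                best = max(best, prev[k] + (prefix[j + 1] ^ prefix[k + 1]))
            row.append(best)
        table.append(row)
    return table


if __name__ == "__main__":
    for row in xor_box_table([1, 2, 1, 2, 1, 2]):
        print(*row)
```

This runs in O(boxes * n^2) time; the last entry of the last row is the answer for the whole array.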
I would go bit by bit from the highest bit to the lowest bit. For every bit, try all combinations that distribute the still unused numbers that have that bit set so that an odd number of them is in each box, nothing else matters. Pick the best path overall. One issue that complicates this greedy method is that two boxes with a lower bit set can equal one box with the next higher bit set.
Alternatively, memoize the boxes state in your recursion as an ordered tuple.
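The memoization idea could look roughly like this. It is a sketch that keeps the question's rule that any number may go into any box, and it stores the box state as a sorted tuple so that configurations differing only in box order share one cache entry. The name max_xor_memo is hypothetical, and the number of distinct states can still grow large for wide inputs.

```python
from functools import lru_cache


def max_xor_memo(nums):
    """Best sum of four box XORs when each number may go into any box."""
    nums = tuple(nums)

    @lru_cache(maxsize=None)
    def best_from(index, boxes):
        # `boxes` is a sorted 4-tuple of the current XOR value of each box.
        if index == len(nums):
            return sum(boxes)
        best = 0
        tried = set()
        for i in range(4):
            nxt = list(boxes)
            nxt[i] ^= nums[index]
            state = tuple(sorted(nxt))
            if state in tried:
                continue  # same configuration up to box order
            tried.add(state)
            best = max(best, best_from(index + 1, state))
        return best

    return best_from(0, (0, 0, 0, 0))


if __name__ == "__main__":
    print(max_xor_memo([1, 2, 1, 2, 1, 2]))
```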

Replace two elements with their absolute difference and generate the minimum possible element in array

I have an array of size n and I can apply any number of operations(zero included) on it. In an operation, I can take any two elements and replace them with the absolute difference of the two elements. We have to find the minimum possible element that can be generated using the operation. (n<1000)
Here's an example of how operation works. Let the array be [1,3,4]. Applying operation on 1,3 gives [2,4] as the new array.
Ex: 2 6 11 3 => ans = 0
This is because 11-6 = 5 and 5-3 = 2 and 2-2 = 0
Ex: 20 6 4 => ans = 2
Ex: 2 6 10 14 => ans = 0
Ex: 2 6 10 => ans = 2
Can anyone tell me how can I approach this problem?
Edit:
We can use recursion to generate all possible cases and pick the minimum element from them. This would have complexity of O(n^2 !).
Another approach I tried is sorting the array and then making a recursive call where, starting from either index 0 or index 1, I apply the operation to all consecutive pairs of elements. This continues until there is only one element left in the array, and we can return the minimum at any point in the recursion. This has a complexity of O(n^2) but doesn't necessarily give the right answer.
Ex: 2 6 10 15 => (4 5) & (2 4 15) => (1) & (2 15) & (2 11) => (13) & (9). The minimum of this will be 1 which is the answer.
When you choose two elements for the operation, you subtract the smaller one from the bigger one. So if you choose 1 and 7, the result is 7 - 1 = 6.
Now having 2 6 and 8 you can do:
8 - 2 -> 6 and then 6 - 6 = 0
You may also write it like this: 8 - 2 - 6 = 0
Let's consider a different operation: you can take two elements and replace them by their sum or their difference.
Even though you can obtain completely different values using the new operation, the absolute value of the element closest to 0 will be exactly the same as using the old one.
First, let's try to solve this problem using the new operations, then we'll make sure that the answer is indeed the same as using the old ones.
What you are trying to do is choose two non-intersecting subsets of the initial array, then subtract the sum of the elements of the second set from the sum of the elements of the first. You want the two subsets whose result is as close to 0 as possible. In general this is an NP-hard problem, but it can be solved efficiently with a pseudopolynomial algorithm similar to the one for the knapsack problem, in O(n * sum of all elements).
Each element of initial array can either belong to the positive set (set which sum you subtract from), negative set (set which sum you subtract) or none of them. In different words: each element you can either add to the result, subtract from the result or leave untouched. Let's say we already calculated all obtainable values using elements from the first one to the i-th one. Now we consider i+1-th element. We can take any of the obtainable values and increase it or decrease it by the value of i+1-th element. After doing that with all the elements we get all possible values obtainable from that array. Then we choose one which is closest to 0.
Now the harder part, why is it always a correct answer?
Let's consider positive and negative sets from which we obtain minimal result. We want to achieve it using initial operations. Let's say that there are more elements in the negative set than in the positive set (otherwise swap them).
What if we have only one element in the positive set and only one element in the negative set? Then absolute value of their difference is equal to the value obtained by using our operation on it.
What if we have one element in the positive set and two in the negative one?
1) One of the negative elements is smaller than the positive element - then we just take them and use the operation on them. The result of it is a new element in the positive set. Then we have the previous case.
2) Both negative elements are larger than the positive one. Then if we remove the bigger element from the negative set we get a result closer to 0, so this case cannot happen.
Let's say we have n elements in the positive set and m elements in the negative set (n <= m) and we are able to obtain the absolute value of difference of their sums (let's call it x) by using some operations. Now let's add an element to the negative set. If the difference before adding new element was negative, decreasing it by any other number makes it smaller, that is farther from 0, so it is impossible. So the difference must have been positive. Then we can use our operation on x and the new element to get the result.
Now second case: let's say we have n elements in the positive set and m elements in the negative set (n < m) and we are able to obtain the absolute value of difference of their sums (again let's call it x) by using some operations. Now we add new element to the positive set. Similarly, the difference must have been negative, so x is in the negative set. Then we obtain the result by doing the operation on x and the new element.
Using induction we can prove that the answer is always correct.
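A minimal sketch of the pseudopolynomial step described above, under the assumption that the argument in this answer holds (so the smallest absolute signed sum equals the smallest element the original operation can produce). Each element is added, subtracted, or left unused, and at least one element must be used. The name min_reachable is hypothetical, and the input is assumed to be non-empty.

```python
def min_reachable(values):
    """Smallest |signed sum| over non-empty selections of `values`.

    Each element is added, subtracted, or skipped. The set of reachable
    sums holds at most 2 * sum(values) + 1 entries, which gives the
    O(n * sum) bound mentioned above.
    """
    reachable = set()  # signed sums that use at least one element
    for value in values:
        new_sums = {value, -value}  # start a selection with this element
        for total in reachable:
            new_sums.add(total + value)
            new_sums.add(total - value)
        reachable |= new_sums  # keeping old sums means "skip this element"
    return min(abs(total) for total in reachable)


if __name__ == "__main__":
    for example in ([2, 6, 11, 3], [20, 6, 4], [2, 6, 10, 14], [2, 6, 10]):
        print(example, "=>", min_reachable(example))
```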

Maximum Value taken by thief

Consider that we have N sacks of gold and a thief who wants to take as much gold as possible, subject to two rules:
1) The thief must take gold from contiguous sacks.
2) The thief must take the same amount of gold from every sack chosen.
N Sacks 1 <= N <= 1000
M quantity of Gold 0 <= M <= 100
Sample Input1:
3 0 5 4 4 4
Output:
16
Explanation:
4 is the minimum amount he can take from the sacks 3 to 6 to get the maximum value of 16.
Sample Input2:
2 4 3 2 1
Output:
8
Explanation:
2 is the minimum amount he can take from the sacks 1 to 4 to get the maximum value of 8.
I approached the problem by subtracting the values in the array from one another and taking the transition point from negative to positive, but this doesn't solve the problem.
EDIT: code provided by OP to find the index:
int temp[6];
for (i = 1; i < 6; i++) {
    for (j = i - 1; j >= 0; j--) {
        temp[j] = a[j] - a[i];
    }
}
for (i = 0; i < 6; i++) {
    if (temp[i] >= 0) {
        index = i;
        break;
    }
}
The best amount of gold (TBAG) taken from every sack is equal to weight of some sack. Let's put indexes of candidates in a stack in order.
When we meet heavier weight (than stack contains), it definitely continues "good sequence", so we just add its index to the stack.
When we meet lighter weight (than stack top), it breaks some "good sequences" and we can remove heavier candidates from the stack - they will not have chance to be TBAG later. Remove stack top until lighter weight is met, calculate potentially stolen sum during this process.
Note that stack always contains indexes of strictly increasing sequence of weights, so we don't need to consider items before index at the stack top (intermediate AG) in calculation of stolen sum (they will be considered later with another AG value).
for idx in Range(Sacks):
    while (not Stack.Empty) and (Sacks[Stack.Peek] >= Sacks[idx]): //smaller sack is met
        AG = Sacks[Stack.Pop]
        if Stack.Empty then
            firstidx = 0
        else
            firstidx = Stack.Peek + 1
        //range_length * smallest_weight_in_range
        BestSUM = MaxValue(BestSUM, AG * (idx - firstidx))
    Stack.Push(idx)
now check the rest:
repeat while loop without >= condition
Every item is pushed and popped once, so linear time and space complexity.
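The pseudocode above might translate to Python roughly as follows. This is a sketch: the function name max_gold is hypothetical, and instead of repeating the loop for the leftover stack it uses a sentinel weight of -1 after the last sack, which forces every remaining candidate to be popped (valid because the weights are non-negative).

```python
def max_gold(sacks):
    """Largest (amount per sack) * (number of contiguous sacks)."""
    best = 0
    stack = []  # indexes of sacks with strictly increasing weights
    for idx in range(len(sacks) + 1):
        # Past the end, use a sentinel lighter than any real sack.
        current = sacks[idx] if idx < len(sacks) else -1
        while stack and sacks[stack[-1]] >= current:
            amount = sacks[stack.pop()]
            first = stack[-1] + 1 if stack else 0
            # range_length * smallest_weight_in_range
            best = max(best, amount * (idx - first))
        stack.append(idx)
    return best


if __name__ == "__main__":
    print(max_gold([3, 0, 5, 4, 4, 4]))
    print(max_gold([2, 4, 3, 2, 1]))
```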
P.S. I feel that I have seen this problem before in another formulation...
I see two different approaches for the moment:
Naive approach: For each pair of indices (i,j) in the array, compute the minimum value m(i,j) of the array in the interval (i,j), and then compute score(i,j) = |j-i+1|*m(i,j). Then take the maximum score over all pairs (i,j).
-> Complexity of O(n^3).
Less naive approach:
1) Compute the set of values of the array.
2) For each value, compute the maximum score it can get. For that, you just have to iterate once over all the values of the array. For example, when your sample input is [3 0 5 4 4 4] and the current value you are looking at is 3, it will give you a score of 12. (You'll first find a score of 3 thanks to the first index, and then a score of 12 due to indices 2 to 5.)
3) Take the maximum over all values found at step 2.
-> Complexity here is O(n*m), since step 2 is done at most m times and each run of step 2 takes O(n).
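A short sketch of the less naive approach, with a hypothetical function name max_gold_by_value. For each distinct value it tracks the length of the current run of sacks holding at least that much gold.

```python
def max_gold_by_value(sacks):
    """O(n * m) scan: try every distinct sack value as the amount taken."""
    best = 0
    for amount in set(sacks):
        run = 0  # length of the current run of sacks with >= amount
        for gold in sacks:
            run = run + 1 if gold >= amount else 0
            best = max(best, amount * run)
    return best


if __name__ == "__main__":
    print(max_gold_by_value([3, 0, 5, 4, 4, 4]))
    print(max_gold_by_value([2, 4, 3, 2, 1]))
```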
Maybe there is a better complexity, but I don't have a clue yet.

Linear Hashing calculation?

I am currently studying for my exams and have come up against this question:
(5d) Suppose we are using linear hashing, and start with an empty table with 2 buckets (M = 2), split = 0 and a load factor of 0.9. Explain the steps we go through when the following hashes are added (in order):
5,7,12,11,9
The answer provided for this is:
*— —5— (0,1)
* — —5,7 —
split —*—5,7— — (0,1,2)
—12*—5,7— — —
split —12—5—*—7— (0,1,2,3)
split =M, M = 2*M, split = 0
*—12—5— —7—
*—12—5— —7,11—
split —*—5— —7,11—12— (0,1,2,3,4)
—*—5,9— —7,11—12—
split — —9*— —7,11—12—5— (0,1,2,3,4,5)
This answer doesn't make any sense to me and the lecturer did not go through this.
How do I tackle this question?
I edited your question because the answer looks like a list of descriptions of the hash table state as each operation is performed. Did your professor cover linear hashing at all? The Wikipedia description doesn't mention a load factor precisely, but it's in the original LH paper by Witold Litwin. It's integral to when a controlled split occurs. I also found these descriptions:
Let l denote the Linear Hashing scheme’s load factor, i.e., l = S/b where S is the total number of records and b is the number of buckets used.
Linear Hashing by Zhang, et al (PDF)
The linear hashing algorithm performs splits in a deterministic order, rather than splitting at a bucket that overflowed. The splits are performed in linear order (bucket 0 first, then bucket 1, then 2, ...), and a split is performed when any bucket overflows. If the bucket that overflows is not the bucket that is split (which is the common case), overflow techniques such as chaining are used, but the common case is that few overflow buckets are needed.
snip
Instead of splitting on every collision, you can do a split when the "load" (which is bytes stored / (num buckets * bucket size), i.e. utilization of the data structure) crosses some watermark. This is called controlled splitting; the previously described is called uncontrolled splitting.
Linear Hashing: A new Tool for File and Table Addressing Witold Litwin, Summary by: Steve Gribble and Armando Fox, Online Berkley.edu retrieved June 16
So basically, a load factor is a means of predictably controlling when a split will occur. One implementation of linear hashing appears to be called 'uncontrolled split', which adds a new bucket and performs a split whenever a collision occurs. Using a load factor of 0.9 only has a split occur when 90% of the table's buckets are full - or rather, would be full based on the prediction that items are assigned to the buckets uniformly.
Based on this and the Wikipedia article I just read, the setup is this:
Table is initially empty with two buckets (N = 2) - - (numbered 0 and 1)
N for number of buckets makes so much more sense to me than M, so I'm using that in my answer.
Apparently N is never changed even as new buckets are added to the table.
Our growth factor (L for bucket level) is 0. It is incremented every time every bucket in the table has been split once, which coincides with when our table has doubled in size.
Step pointer S (also called a split pointer) points to 0th bucket. It indicates which bucket will have a split applied to it next.
This follows the wikipedia article description I linked to above. Now we need to cover the hash and bucket assignment.
A decent hash function for integers you expect to have a normal distribution is to just use the integer itself. So for an input integer I, our hash H(I) is just I. I think this follows the answer key, which is good because the question is unanswerable without knowing H.
To determine which bucket an integer I is added to, one of two function values will be used, depending on whether or not the assignment points to before or after S.
First, calculate H(I) mod (N x 2^L), which is really just I mod (N x 2^L). I'm going to call this B(I) below for brevity (also for bucket). Call this the assignment address A.
If A is greater than or equal to S, we assign input I to address A and move on.
If A (B(I)) is less than S, we actually use a different hash function, which I'll call B'(I), calculated as I mod (N x 2^(L+1)), giving us an actual assignment address of A'.
I think the reasoning for this is to keep the assignment to buckets more even as buckets are split along the way, but I don't have the mathematical proof of its importance.
I think the * in the answer's notation above denotes the location of the split pointer S. In my notation for the rest of the question below:
Let - denote an empty bucket, i denote a bucket with the Integer i in it, and i,j denote a bucket with both i and j in it.
So the first step of your answer key "— —5— (0,1)" is saying bucket 0 is empty and bucket 1 has 5 in it. I would rewrite this as - 5 for clarity.
I'm thinking your answer breakdown reads like this:
Add 5 to the table.
The linear hashing algorithm puts it into the second bucket (index 1) because:
B(5) = 5 mod (2 x 2^0) = 5 mod (2 x 1) = 5 mod 2 = 1
1 is greater than S, which is still 0, so we use 1 as the address.
Table now has - 5 (0th bucket empty, 1st bucket with 5 in it).
N, L, and S are unchanged
Add 7 to the table.
B(7) = 7 mod 2 = 1, so 7 is added to the same bucket as 5. S still hasn't changed, so again 1 is used as the address.
Table now has - 5,7
A split occurs! Not because a bucket has overflowed, but because the load factor has been exceeded. 2 items added, 2 total buckets, 2/2 = 1.0 > 0.9 = do a split.
First a new bucket is added at the end of the table.
S is incremented to 1. N is not incremented. L is unchanged
The split is done on a bucket. A split means all the items in the bucket get their assignment recalculated based on the new hash table size. However, one key to linear hashing is that the actual buckets are split in order, so the 0th bucket is split even though the 1st bucket is the one that's full.
Post split, the table is now - 5,7 -, with buckets 0 and 2 empty, and 1 still with 5 and 7 in it.
Add 12 to the table.
B(12) = 12 mod (2 x 2^0) = 12 mod 2 = 0
S is 1 and B(12) is 0, so we calculate B'(12) instead for our address.
Coincidentally, this is 12 mod (2 x 2^(0+1)) = 12 mod 4, which is still 0, so 12 is added to the 0th bucket.
Table now has 12 5,7 -, only the 3rd, new bucket is empty.
A split occurs again, because 3/3 = 1.0 > 0.9. This split promises to be more interesting than the last!
A new bucket is added to the end of the table, giving us 12 5,7 - -
S = 1, so the bucket with 5,7 is split. That means new buckets are picked for 5 and 7.
Increment S to 2. This is done after the split target bucket is picked, but before the new buckets are assigned. This ensures the new table is more evenly distributed (again, my supposition, don't have proof).
5 mod 2 = 1, 1 < S, calculate 5 mod (2 x 2^1) = 5 mod 4 = 1. 5 is re-assigned to its same bucket.
7 mod 2 = 1, 1 < S, calculate 7 mod (2 x 2^1) = 7 mod 4 = 3. 7 is re-assigned to 3.
Table now has 12 5 - 7
S = 2, N still equals 2, and L still = 0. S has now reached N x 2^L = 2 x 2^0 = 2, so S is reset to 0 and L is incremented to 1.
Add 11 to the table.
B(11) = 11 mod (2 x 2^1) = 11 mod 4 = 3. 11 is assigned to the 3rd bucket.
Table now has 12 5 - 7,11, 4 items and 4 buckets, so a split occurs again.
S is 0 again, so the 0th bucket with 12 is reassigned after a new bucket is added. S is incremented to 1 before choosing a new bucket for 12.
B(12) = 12 mod (2 x 2^1) = 12 mod 4 = 0. 0 < 1, so recalculate
B'(12) = 12 mod (2 x 2^(1+1)) = 12 mod 8 = 4. 12 is assigned to the 4th bucket.
Table now contains - 5 - 7,11 12
Add 9 to the table.
I'll leave the steps to the last one for you. There are a few nuances to the LH algorithm that I'm not quite grasping. I might ask additional questions about them. But hopefully that's enough for you to get going on. In the future, I would recommend asking the course instructor directly.
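The walk-through above can be checked with a small simulation. This is a sketch under the same assumptions as the answer: H(I) = I, addresses computed as I mod (N x 2^L) or I mod (N x 2^(L+1)), and a controlled split whenever items / buckets exceeds the load factor after an insert. The function name linear_hash_trace and its parameters are hypothetical.

```python
def linear_hash_trace(keys, initial_buckets=2, max_load=0.9):
    """Insert `keys` one by one and return the table state after each."""
    n = initial_buckets  # N: never changes
    level = 0            # L: number of completed doubling rounds
    split = 0            # S: next bucket to be split
    buckets = [[] for _ in range(n)]
    count = 0
    states = []
    for key in keys:
        address = key % (n * 2 ** level)
        if address < split:
            # This bucket was already split in the current round.
            address = key % (n * 2 ** (level + 1))
        buckets[address].append(key)
        count += 1
        if count / len(buckets) > max_load:
            # Controlled split: add a bucket, redistribute bucket S.
            buckets.append([])
            moving, buckets[split] = buckets[split], []
            for item in moving:
                buckets[item % (n * 2 ** (level + 1))].append(item)
            split += 1
            if split == n * 2 ** level:
                split = 0
                level += 1
        states.append([list(bucket) for bucket in buckets])
    return states


if __name__ == "__main__":
    for state in linear_hash_trace([5, 7, 12, 11, 9]):
        print(state)
```

With the keys from the question, the state after the last insert has 9 in bucket 1, 7 and 11 in bucket 3, 12 in bucket 4 and 5 in bucket 5, which agrees with the last line of the answer key quoted in the question.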

Finding two consecutive 1's in a bitstring in less than n time?

I am trying to figure out a way to see whether a bitstring of size n has 2 consecutive ones, in less than n time.
For example, lets say we had a bitstring size 5 (index 0-4). If index 1 and 3 were both 0's, I could return false. But if they were both ones then I may have to do 5 peeks to find my answer.
The bitstring doesn't have to be length 5. For simplicity's sake, lets say it can be between 3 and 8.
The simplest solution might be to bitwise AND the original string with a version of itself that has been shifted left or right by 1 bit. If the resulting bit string is non-zero then you have at least one 11 in there:
test = (src & (src << 1));
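The same test in Python, as a small sketch that assumes the bitstring is held in a non-negative integer (the function name has_adjacent_ones is hypothetical):

```python
def has_adjacent_ones(bits: int) -> bool:
    """True if the binary representation of `bits` contains "11"."""
    # A set bit survives the AND only if its neighbour is also set.
    return (bits & (bits >> 1)) != 0


if __name__ == "__main__":
    print(has_adjacent_ones(0b01010))
    print(has_adjacent_ones(int("10110", 2)))
```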
