Create 2 pillars of equal height from an array of bricks - algorithm

Problem Statement:
There are N bricks (a1, a2, ..., aN); brick ai has length Li. Using the bricks provided, build the two tallest possible parallel pillars of equal height.
Constraints:
There are N bricks. 5<=N<=50
Length of each brick. 1<=L<=1000
Sum of the bricks lengths <= 1000
The brick lengths are not given in sorted order. Several bricks may have the same length. Not all bricks have to be used to build the pillars.
Example:
1st Example-
N = 5
2, 3, 4, 1, 6
Possible Sets:
(2, 6) and (3, 4, 1)
Answer: 8
My Approach:
First find the maximum possible height of the two pillars, i.e. floor(S/2) where S is the sum of all brick lengths. Then use DP to find every sum that can be formed from the bricks. Starting with the largest formable sum <= floor(S/2), take a single subset of elements that forms that sum. Then run the DP again on the remaining elements to check whether the same sum can be formed a second time. If it can, that sum is the answer; otherwise, move to the next largest formable sum from the first DP and repeat the whole process.
The problem with the above approach is that it checks only one subset of elements forming the required sum. Every subset that forms the required sum should be checked, and for each of them we should test whether the remaining elements can also form that sum. My trouble is implementing this within my current approach.
2nd Example-
N = 6
3, 2, 6, 4, 7, 1
Possible Sets:
(3, 2, 6) and (7, 4)
Answer: 11
The problem in my code can appear here depending on the order in which the brick lengths are given as input. The first subset found might be (3, 7, 1) = 11, but the remaining elements (2, 6, 4) cannot form a sum of 11. My code then moves on to the next possible maximum sum, i.e. 10, which is wrong.
Can someone suggest a better approach, or possible improvements to my current approach?

I think you can solve this with dynamic programming where for each pair (x, y) you work out whether it is possible to build pillars of heights x and y using disjoint subsets of the bricks considered so far.
Consider each brick in turn. At the start only (0, 0) is possible. When you see a brick of length L then for every possible pillar (x, y) there are three descendants - (x, y) (ignore the brick), (x + L, y) (use the brick on the first pillar), and (x, y + L) (use the brick on the second pillar).
So after you have considered all the bricks you will have a long list of possible pairs and you simply choose the pair which has two pillars of the same length and as long as possible. This will be practical as long as there are never too many different pairs (you can remove any duplicates from the list).
Assuming that the brick lengths are integers and the maximum pillar length is 1000 there are only 1001 * 1001 possible pairs, so this should be practical, and in fact it is probably easiest if you store pairs by having an array of size [1001, 1001] and setting entries [x, y] to 1 if pair (x, y) is possible and 0 otherwise.
For the first few steps of the example the reachable states are
(0,0) considering nothing
(0,3) (3,0) (0,0) considering 3
(0,5) (2,3) (3,2) (0,3) (5,0) (3,0) (0,2) (2,0) (0,0) considering 3 and 2
The number of reachable states grows very fast at first, but since we only consider values from 0..1000 and we only care about whether a state is reachable or not, we can maintain them in a boolean array of dimension 1001 x 1001.
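For concreteness, here is a minimal Java sketch of this pair DP (my own illustration of the approach above; the input format and all names are assumptions). It marks every reachable (x, y) pair brick by brick and reports the tallest x for which (x, x) is reachable:
import java.util.Scanner;
public class EqualPillars {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        int n = scan.nextInt();
        int[] a = new int[n];
        int total = 0;
        for (int i = 0; i < n; i++) {
            a[i] = scan.nextInt();
            total += a[i];
        }
        // reachable[x][y] == true means pillars of heights x and y can be built
        // from disjoint subsets of the bricks processed so far
        boolean[][] reachable = new boolean[total + 1][total + 1];
        reachable[0][0] = true;
        for (int len : a) {
            // work on a copy so the current brick is used at most once
            boolean[][] next = new boolean[total + 1][total + 1];
            for (int x = 0; x <= total; x++) {
                for (int y = 0; y <= total; y++) {
                    if (!reachable[x][y]) continue;
                    next[x][y] = true;        // ignore the brick
                    next[x + len][y] = true;  // put it on the first pillar
                    next[x][y + len] = true;  // put it on the second pillar
                }
            }
            reachable = next;
        }
        int best = 0;
        for (int h = 1; h <= total / 2; h++) {
            if (reachable[h][h]) best = h;
        }
        System.out.println(best); // 0 if no equal pillars can be built
    }
}
With the sum of lengths capped at 1000 the array is about a million booleans, so both the memory and the per-brick sweep stay cheap.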

Recently I tried to solve this problem while preparing for the Samsung Competency Test. I built a solution, and it might help you in your practice. Thanks to https://stackoverflow.com/users/240457/mcdowella, whose strategy made it possible for me to solve this.
public class Pillars {
    public static int h = 0, mh = 0;
    public static void main(String[] args) {
        java.util.Scanner scan = new java.util.Scanner(System.in);
        int n = scan.nextInt();
        int a[] = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = scan.nextInt();
            mh += a[i];
        }
        mh /= 2; // the two pillars split the total, so a single pillar can't exceed half the total
        maxHeight(0, 0, a, 0);
        // reached when the early exit below never fires; h stays 0 if no equal pillars can be built
        System.out.println("Maximum Height Formed with the given Data " + h);
    }
    public static void maxHeight(int x, int y, int a[], int i) {
        if (x == y && x != 0 && x > h) { // equal, non-empty pillars that beat the best so far
            h = x;
            if (h == mh) { // half the total is an upper bound, so print and exit early
                System.out.println("Maximum Height Formed with the given Data " + h);
                System.exit(0);
            }
        }
        if (i < a.length) {
            maxHeight(x + a[i], y, a, i + 1); // put brick i on the first pillar
            maxHeight(x, y + a[i], a, i + 1); // put brick i on the second pillar
            maxHeight(x, y, a, i + 1);        // leave brick i unused (not all bricks must be used)
        }
    }
}

Well, this question can be solved simply using recursion, but it will not be efficient for large values of n. Here is my code:
#include <iostream>
#include <vector>
using namespace std;
// Try every assignment of each unused brick to pillar 1 or pillar 2;
// p1 and p2 are the current pillar heights, ans the best equal height so far.
void solve(const vector<int>& a, vector<int>& vis, int p1, int p2, int n, int &ans){
    if(p1 == p2 && p1 > ans){
        ans = p1;
    }
    for(int i=0 ; i<n ; ++i){
        if(vis[i] == 0){
            vis[i] = 1;
            solve(a, vis, p1 + a[i], p2, n, ans); // brick i on the first pillar
            solve(a, vis, p1, p2 + a[i], n, ans); // brick i on the second pillar
            vis[i] = 0;
        }
    }
}
int main(){
    int n;
    cin>>n;
    vector<int> a(n);
    for(int i=0 ; i<n ; ++i){
        cin>>a[i];
    }
    vector<int> vis(n, 0);
    int ans = -1;
    solve(a, vis, 0, 0, n, ans);
    cout<<ans;
    return 0;
}

Related

Need help understanding the solution for the Jewelry Topcoder solution

I am fairly new to dynamic programming and don't yet understand most of the types of problems it can solve. Hence I am having trouble understanding the solution to the Jewelry TopCoder problem.
Can someone at least give me some hints as to what the code is doing?
Most importantly, is this problem a variant of the subset-sum problem? Because that's what I am studying to make sense of this problem.
What are these two functions actually counting? Why are we actually using two DP tables?
void cnk() {
    nk[0][0]=1;
    FOR(k,1,MAXN) {
        nk[0][k]=0;
    }
    FOR(n,1,MAXN) {
        nk[n][0]=1;
        FOR(k,1,MAXN)
            nk[n][k] = nk[n-1][k-1]+nk[n-1][k];
    }
}
void calc(LL T[MAXN+1][MAX+1]) {
    T[0][0] = 1;
    FOR(x,1,MAX) T[0][x]=0;
    FOR(ile,1,n) {
        int a = v[ile-1];
        FOR(x,0,MAX) {
            T[ile][x] = T[ile-1][x];
            if(x>=a) T[ile][x] += T[ile-1][x-a];
        }
    }
}
How is the original solution constructed by using the following logic ?
FOR(u,1,c) {
    int uu = u * v[done];
    FOR(x,uu,MAX)
        res += B[done][x-uu] * F[n-done-u][x] * nk[c][u];
}
done=p;
}
Any help would be greatly appreciated.
Let's consider the following task first:
"Given a vector V of N positive integers less than K, find the number of subsets whose sum equals S".
This can be solved in polynomial time with dynamic programming using some extra-memory.
The dynamic programming approach goes like this:
instead of solving the problem for N and S, we will solve all the problems of the following form:
"Find the number of ways to write sum s (with s ≤ S) using only the first n ≤ N of the numbers".
This is a common characteristic of the dynamic programming solutions: instead of only solving the original problem, you solve an entire family of related problems. The key idea is that solutions for more difficult problem settings (i.e. higher n and s) can efficiently be built up from the solutions of the easier settings.
Solving the problem for n = 0 is trivial (sum s = 0 can be expressed in one way -- using the empty set, while all other sums can't be expressed in any ways).
Now consider that we have solved the problem for all values up to a certain n and that we have these solutions in a matrix A (i.e. A[n][s] is the number of ways to write sum s using the first n elements).
Then, we can find the solutions for n+1, using the following formula:
A[n+1][s] = A[n][s - V[n+1]] + A[n][s].
Indeed, when we write the sum s using the first n+1 numbers we can either include or not V[n+1] (the n+1th term).
This is what the calc function computes. (the cnk function uses Pascal's rule to compute binomial coefficients)
Note: in general, if in the end we are only interested in answering the initial problem (i.e. for N and S), then the array A can be uni-dimensional (with length S) -- this is because whenever trying to construct solutions for n + 1 we only need the solutions for n, and not for smaller values.
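As a hedged illustration of that one-dimensional variant (the method name here is my own), in Java it could look like the sketch below; iterating the sums downward is what lets one array play the role of both row n and row n+1:
// Counts the subsets of v whose elements sum to exactly S.
// ways[s] = number of subsets of the values processed so far that sum to s.
static long countSubsetsWithSum(int[] v, int S) {
    long[] ways = new long[S + 1];
    ways[0] = 1; // the empty subset
    for (int value : v) {
        // go downward so each value is used at most once per subset
        for (int s = S; s >= value; s--) {
            ways[s] += ways[s - value];
        }
    }
    return ways[S];
}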
This problem (the one initially stated in this answer) is indeed related to the subset sum problem (finding a subset of elements with sum zero).
A similar type of dynamic programming approach can be applied if we have a reasonable limit on the absolute values of the integers used (we need to allocate an auxiliary array to represent all possible reachable sums).
In the zero-sum problem we are not actually interested in the count, thus the A array can be an array of booleans (indicating whether a sum is reachable or not).
In addition, another auxiliary array, B can be used to allow reconstructing the solution if one exists.
The recurrence would now look like this:
if (!A[s] && A[s - V[n+1]]) {
    A[s] = true;
    // the index of the last value used to reach sum _s_,
    // allows going backwards to reproduce the entire solution
    B[s] = n + 1;
}
Note: the actual implementation requires some additional care for handling the negative sums, which can not directly represent indices in the array (the indices can be shifted by taking into account the minimum reachable sum, or, if working in C/C++, a trick like the one described in this answer can be applied: https://stackoverflow.com/a/3473686/6184684).
I'll detail how the above ideas apply in the TopCoder problem and its solution linked in the question.
The B and F matrices.
First, note the meaning of the B and F matrices in the solution:
B[i][s] represents the number of ways to reach sum s using only the smallest i items
F[i][s] represents the number of ways to reach sum s using only the largest i items
Indeed, both matrices are computed using the calc function, after sorting the array of jewelry values in ascending order (for B) and descending order (for F).
Solution for the case with no duplicates.
Consider first the case with no duplicate jewelry values, using this example: [5, 6, 7, 11, 15].
For the remainder of the answer I will assume that the array has been sorted in ascending order (thus "first i items" will refer to the smallest i of them).
Each item given to Bob has value less (or equal) to each item given to Frank, thus in every good solution there will be a separation point such that Bob receives only items before that separation point, and Frank receives only items after that point.
To count all solutions we would need to sum over all possible separation points.
When, for example, the separation point is between the 3rd and 4th item, Bob would pick items only from the [5, 6, 7] sub-array (smallest 3 items), and Frank would pick items from the remaining [11, 15] sub-array (largest 2 items). In this case there is a single sum (s = 11) that can be obtained by both of them. Each time a sum can be obtained by both, we need to multiply the number of ways that each of them can reach the respective sum (e.g. if Bob could reach a sum s in 4 ways and Frank could reach the same sum s in 5 ways, then we could get 20 = 4 * 5 valid solutions with that sum, because each combination is a valid solution).
Thus we would get the following code by considering all separation points and all possible sums:
res = 0;
for (int i = 0; i < n; i++) {
    for (int s = 0; s <= maxS; s++) {
        res += B[i][s] * F[n-i][s]
    }
}
However, there is a subtle issue here. This would often count the same combination multiple times (for various separation points). In the example provided above, the same solution with sum 11 would be counted both for the separation [5, 6] - [7, 11, 15], as well as for the separation [5, 6, 7] - [11, 15].
To alleviate this problem we can partition the solutions by "the largest value of an item picked by Bob" (or, equivalently, by always forcing Bob to include in his selection the largest valued item from the first sub-array under the current separation).
In order to count the number of ways to reach sum s when Bob's largest valued item is the ith one (sorted in ascending order), we can use B[i][s - v[i]]. This holds because using the v[i] valued item implies requiring the sum s - v[i] to be expressed using subsets from the first i items (indices 0, 1, ... i - 1).
This would be implemented as follows:
res = 0;
for (int i = 0; i < n; i++) {
    for (int s = v[i]; s <= maxS; s++) {
        res += B[i][s - v[i]] * F[n - 1 - i][s];
    }
}
This is getting closer to the solution on TopCoder (in that solution, done corresponds to the i above, and uu = v[i]).
Extension for the case when duplicates are allowed.
When duplicate values can appear in the array, it's no longer easy to directly count the number of solutions when Bob's most valuable item is v[i]. We need to also consider the number of such items picked by Bob.
If there are c items that have the same value as v[i], i.e. v[i] = v[i+1] = ... v[i + c - 1], and Bob picks u such items, then the number of ways for him to reach a certain sum s is equal to:
comb(c, u) * B[i][s - u * v[i]] (1)
Indeed, this holds because the u items can be picked from the total of c which have the same value in comb(c, u) ways. For each such choice of the u items, the remaining sum is s - u * v[i], and this should be expressed using a subset from the first i items (indices 0, 1, ... i - 1), thus it can be done in B[i][s - u * v[i]] ways.
For Frank, if Bob used u of the v[i] items, the number of ways to express sum s will be equal to:
F[n - i - u][s] (2)
Indeed, since Bob uses the smallest i + u values, Frank can use any of the largest n - i - u values to reach the sum s.
By combining relations (1) and (2) from above, we obtain that the number of solutions where both Frank and Bob have sum s, when Bob's most valued item is v[i] and he picks u such items is equal to:
comb(c, u) * B[i][s - u * v[i]] * F[n - i - u][s].
This is precisely what the given solution implements.
Indeed, the variable done corresponds to variable i above, variable x corresponds to sums s, the index p is used to determine the c items with same value as v[done], and the loop over u is used in order to consider all possible numbers of such items picked by Bob.
Here's some Java code for this that references the original solution. It also incorporates qwertyman's fantastic explanations (to the extent feasible). I've added some of my comments along the way.
import java.util.*;
public class Jewelry {
int MAX_SUM=30005;
int MAX_N=30;
long[][] C;
// Generate all possible sums
// ret[i][sum] = number of ways to compute sum using the first i numbers from val[]
public long[][] genDP(int[] val) {
int i, sum, n=val.length;
long[][] ret = new long[MAX_N+1][MAX_SUM];
ret[0][0] = 1;
for(i=0; i+1<=n; i++) {
for(sum=0; sum<MAX_SUM; sum++) {
// Carry over the sum from i to i+1 for each sum
// Problem definition allows excluding numbers from calculating sums
// So we are essentially excluding the last number for this calculation
ret[i+1][sum] = ret[i][sum];
// DP: (Number of ways to generate sum using i+1 numbers =
// Number of ways to generate sum-val[i] using i numbers)
if(sum>=val[i])
ret[i+1][sum] += ret[i][sum-val[i]];
}
}
return ret;
}
// C(n, r) - all possible combinations of choosing r numbers from n numbers
// Leverage Pascal's polynomial co-efficients for an n-degree polynomial
// Leverage Dynamic Programming to build this upfront
public void nCr() {
C = new long[MAX_N+1][MAX_N+1];
int n, r;
C[0][0] = 1;
for(n=1; n<=MAX_N; n++) {
C[n][0] = 1;
for(r=1; r<=MAX_N; r++)
C[n][r] = C[n-1][r-1] + C[n-1][r];
}
}
/*
General Concept:
- Sort array
- Incrementally divide array into two partitions
+ Accomplished by using two different arrays - L for left, R for right
- Take all possible sums on the left side and match with all possible sums
on the right side (multiply these numbers to get totals for each sum)
- Adjust for common sums so as to not overcount
- Adjust for duplicate numbers
*/
public long howMany(int[] values) {
int i, j, sum, n=values.length;
// Pre-compute C(n,r) and store in C[][]
nCr();
/*
Incrementally split the array and calculate sums on either side
For eg. if val={2, 3, 4, 5, 9}, we would partition this as
{2 | 3, 4, 5, 9} then {2, 3 | 4, 5, 9}, etc.
First, sort it ascendingly and generate its sum matrix L
Then, sort it descendingly, and generate another sum matrix R
In later calculations, manipulate indexes to simulate the partitions
So at any point L[i] would correspond to R[n-i-1]. eg. L[1] = R[5-1-1]=R[3]
*/
// Sort ascendingly
Arrays.sort(values);
// Generate all sums for the "Left" partition using the sorted array
long[][] L = genDP(values);
// Sort descendingly by reversing the existing array.
// Java 8 doesn't support Arrays.sort for primitive int types
// Use Comparator or sort manually. This uses the manual sort.
for(i=0; i<n/2; i++) {
int tmp = values[i];
values[i] = values[n-i-1];
values[n-i-1] = tmp;
}
// Generate all sums for the "Right" partition using the re-sorted array
long[][] R = genDP(values);
// Re-sort in ascending order as we will be using values[] as reference later
Arrays.sort(values);
long tot = 0;
for(i=0; i<n; i++) {
int dup=0;
// How many duplicates of values[i] do we have?
for(j=0; j<n; j++)
if(values[j] == values[i])
dup++;
/*
Calculate total by iterating through each sum and multiplying counts on
both partitions for that sum
However, there may be count of sums that get duplicated
For instance, if val={2, 3, 4, 5, 9}, you'd get:
{2, 3 | 4, 5, 9} and {2, 3, 4 | 5, 9} (on two different iterations)
In this case, the subset {2, 3 | 5} is counted twice
To account for this, exclude the current largest number, val[i], from L's
sum and exclude it from R's i index
There is another issue of duplicate numbers
Eg. If values={2, 3, 3, 3, 4}, how do you know which 3 went to L?
To solve this, group the same numbers
Applying to {2, 3, 3, 3, 4} :
- Exclude 3, 6 (3+3) and 9 (3+3+3) from L's sum calculation
- Exclude 1, 2 and 3 from R's index count
We're essentially saying that we will exclude the sum contribution of these
elements to L and ignore their count contribution to R
*/
for(j=1; j<=dup; j++) {
int dup_sum = j*values[i];
for(sum=dup_sum; sum<MAX_SUM; sum++) {
// (ways to pick j numbers from dup) * (ways to get sum-dup_sum from i numbers) * (ways to get sum from n-i-j numbers)
if(n-i-j>=0)
tot += C[dup][j] * L[i][sum-dup_sum] * R[n-i-j][sum];
}
}
// Skip past the duplicates of values[i] that we've now accounted for
i += dup-1;
}
return tot;
}
}

Finding number of contiguous subarrays not including certain pairs

How many contiguous subarrays of an array exist such that they do not contain certain pairs of positions of the array? For example, if the array is {11,22,33,44} and we do not want to include, say, the position pairs (1,3) and (2,4), then the number of contiguous subarrays that exist is 7, namely {11}, {22}, {33}, {44}, {11,22}, {22,33}, {33,44}.
My attempt at solving the problem:
Suppose two excluded pairs exist among positions {..., a1, ..., a2, ..., b1, ..., b2}, where n is the number of elements and (...) indicates that there may be elements in between these positions.
Case 1: we cannot include {a1, b1} and {a2, b2}; then we just have to count the number of possible combinations, which is n*(n+1)/2 minus the number of combinations including {a2, b1}, which covers all the possible cases.
Case 2: if we cannot include the pairs {a1, b2} and {a2, b1}, then we just subtract the number of possibilities containing {a2, b1}, which covers all the possible cases.
Case 3: if we can't include the pairs {a1, a2} and {b1, b2}, then we have to individually subtract the possible subarrays including these positions.
The problem I'm facing is that I am not able to derive a formula or extend these cases to more than two pairs in order to count the number of possible solutions, even after formulating the cases above. So I need help with that.
Source: This is an interview question asked to my friend which he could not answer.
Okay, so let's look again at the array {11,22,33,44}.
Let's say you're trying to find all subarrays starting with 11 (index=1). Now you only need to look at restrictions (a,b) where index<=a and b is minimal, because if you have e.g. (2,3) this automatically implies (1,4) or (2,4).
To find the minimal b, use the following code (I used C#):
public static int findRestrictingIndex(int currentIndex, List<Tuple<int, int>> restrictions)
{
    int result = int.MaxValue;
    foreach (var pair in restrictions)
    {
        if (currentIndex <= pair.Item1)
        {
            if (pair.Item2 < result)
                result = pair.Item2;
        }
    }
    return result;
}
If you run this for the first position, you'll receive 3 (from the restriction (1,3), using the one-based positions of the question) as a result. The number of subarrays starting with 11 is then 3 - 1 = 2, because you can append numbers to 11 up to, but excluding, index 3.
Do this for every index and you'll be done:
int arrayLength = 4;
// excluded positions, zero-based
List<Tuple<int, int>> restrictions = new List<Tuple<int, int>>();
restrictions.Add(new Tuple<int, int>(0, 2));
restrictions.Add(new Tuple<int, int>(1, 3));
restrictions.Add(new Tuple<int, int>(4, 4)); // dummy element when there is no restriction
int numOfSubarrays = 0;
for (int currentIndex = 0; currentIndex < arrayLength; currentIndex++)
{
    numOfSubarrays += findRestrictingIndex(currentIndex, restrictions) - currentIndex;
}
Console.WriteLine(numOfSubarrays);

Converting this recursive solution to DP

Given a stack of integers, players take turns at removing either 1, 2, or 3 numbers from the top of the stack. Assuming that the opponent plays optimally and you select first, I came up with the following recursion:
int score(int n) {
    if (n <= 0) return 0;
    if (n <= 3) {
        return sum(v[0..n-1]);
    }
    // maximize over picking 1, 2, or 3 + value after opponent picks optimally
    return max(v[n-1] + min(score(n-2), score(n-3), score(n-4)),
               v[n-1] + v[n-2] + min(score(n-3), score(n-4), score(n-5)),
               v[n-1] + v[n-2] + v[n-3] + min(score(n-4), score(n-5), score(n-6)));
}
Basically, at each level comparing the outcomes of selecting 1, 2, or 3 and then your opponent selecting either 1, 2, or 3.
I was wondering how I could convert this to a DP solution as it is clearly exponential. I was struggling with the fact that there seem to be 3 dimensions to it: num of your pick, num of opponent's pick, and sub problem size, i.e., it seems the best solution for table[p][o][n] would need to be maintained, where p is the number of values you choose, o is the number your opponent chooses and n is the size of the sub problem.
Do I actually need the 3 dimensions? I have seen this similar problem: http://www.geeksforgeeks.org/dynamic-programming-set-31-optimal-strategy-for-a-game/ , but couldn't seem to adapt it.
Here is a way the problem can be converted into DP:
score[i] = max(sum[i] - score[i+1], sum[i] - score[i+2], sum[i] - score[i+3])
Here score[i] means the maximum score the player to move can obtain from the game on [i..n], where v[i] is the top of the stack. sum[i] is the sum of all elements on the stack from i onwards; sum[i] can be evaluated with a separate DP in O(N). The above DP can then be filled in with a table in O(N).
Edit :-
Following is a DP solution in JAVA :-
public class game {
    static boolean play_game(int[] stack) {
        if (stack.length <= 3)
            return true; // with at most 3 items the first player simply takes the whole stack
        int n = stack.length;
        int[] score = new int[n];
        // base cases: with 1, 2, or 3 items left, the player to move takes them all
        score[n-1] = stack[n-1];
        score[n-2] = score[n-1] + stack[n-2];
        score[n-3] = score[n-2] + stack[n-3];
        int sum = score[n-3]; // sum of stack[i..n-1], maintained as i decreases
        for (int i = n-4; i >= 0; i--) {
            sum = stack[i] + sum;
            // the opponent plays optimally on the rest, so leave them the minimum
            int min = Math.min(Math.min(score[i+1], score[i+2]), score[i+3]);
            score[i] = sum - min;
        }
        // the first player wins if the opponent's total (sum - score[0]) is smaller
        if (sum - score[0] < score[0])
            return true;
        return false;
    }
    public static void main(String args[]) {
        int[] stack = {12, 1, 7, 99, 3};
        System.out.printf("I win => " + play_game(stack));
    }
}
EDIT:-
To get a DP solution you need to visualize the problem's solution in terms of smaller instances of itself. In this case, since both players play optimally, after the first player's choice the second player also obtains an optimal score on the remaining stack, which is a subproblem of the original one. The only difficulty is how to represent this as a recurrence: to solve with DP you must first define a recurrence relation in terms of subproblems that precede the current problem in some order of computation. Now, whatever the second player wins, the first player loses, so effectively the first player gains the total sum minus the second player's score. Since the second player also plays optimally, we can express the solution with the recursion above.

number to unique permutation mapping of a sequence containing duplicates

I am looking for an algorithm that can map a number to a unique permutation of a sequence. I have found out about Lehmer codes and the factorial number system thanks to a similar question, Fast permutation -> number -> permutation mapping algorithms, but that question doesn't deal with the case where there are duplicate elements in the sequence.
For example, take the sequence 'AAABBC'. There are 6! = 720 ways it could be arranged, but I believe there are only 6! / (3! * 2! * 1!) = 60 unique permutations of this sequence. How can I map a number to a permutation in these cases?
Edit: changed the term 'set' to 'sequence'.
From Permutation to Number:
Let K be the number of character classes (example: AAABBC has three character classes)
Let N[K] be the number of elements in each character class (example: for AAABBC, we have N[K] = [3, 2, 1]), and let N = sum(N[K]).
Every legal permutation of the sequence then uniquely corresponds to a path in an incomplete K-way tree.
The unique number of the permutation then corresponds to the index of the tree-node in a post-order traversal of the K-ary tree terminal nodes.
Luckily, we don't actually have to perform the tree traversal -- we just need to know how many terminal nodes in the tree are lexicographically less than our node. This is very easy to compute, as at any node in the tree, the number of terminal nodes below the current node is equal to the number of permutations of the unused elements of the sequence, which has a closed-form solution that is a simple ratio of factorials.
So given our 6 original letters, if the first element of our permutation is a 'B', we determine that there are 5!/2!2!1! = 30 permutations that start with 'A', so our permutation number has to be at least 30. Had our first letter been a 'C', we could have calculated it as 5!/2!2!1! (not A) + 5!/3!1!1! (not B) = 30 + 20 = 50, or alternatively as
60 (total) - 5!/3!2!0! (C) = 50
Using this, we can take a permutation (e.g. 'BAABCA') and perform the following computations:
Permutation # = 5!/2!2!1! ('B') + 0 ('A') + 0 ('A') + 2!/1!1! ('B') + 1!/1! ('C') + 0 ('A')
= 30 + 2 + 1 = 33
Checking that this works: CBBAAA corresponds to
(5!/2!2!1! (not A) + 5!/3!1!1! (not B)) 'C'+ 4!/2!2!0! (not A) 'B' + 3!/2!1!0! (not A) 'B' = (30 + 20) +6 + 3 = 59
Likewise, AAABBC =
0 ('A') + 0 ('A') + 0 ('A') + 0 ('B') + 0 ('B') + 0 ('C') = 0
Sample implementation:
import math
import copy
from operator import mul
# Python 2 code; the permutation is given as a list of character-class indices
# (e.g. BAABCA with classes A=0, B=1, C=2 becomes [1, 0, 0, 1, 2, 0]).
def computePermutationNumber(inPerm, inCharClasses):
    permutation = copy.copy(inPerm)
    charClasses = copy.copy(inCharClasses)
    n = len(permutation)
    permNumber = 0
    for i, x in enumerate(permutation):
        # count the permutations whose symbol at this position is a smaller class
        for j in xrange(x):
            if charClasses[j] > 0:
                charClasses[j] -= 1
                permNumber += multiFactorial(n-i-1, charClasses)
                charClasses[j] += 1
        if charClasses[x] > 0:
            charClasses[x] -= 1
    return permNumber

def multiFactorial(n, charClasses):
    # number of distinct arrangements of the remaining multiset
    val = math.factorial(n) / reduce(mul, (map(lambda x: math.factorial(x), charClasses)))
    return val
From Number to Permutation:
This process can be done in reverse, though I'm not sure how efficiently:
Given a permutation number, and the alphabet that it was generated from, recursively subtract the largest number of nodes less than or equal to the remaining permutation number.
E.g. Given a permutation number of 59, we first can subtract 30 + 20 = 50 ('C') leaving 9. Then we can subtract 'B' (6) and a second 'B'(3), re-generating our original permutation.
Here is an algorithm in Java that enumerates the possible sequences by mapping an integer to the sequence.
public class Main {
private int[] counts = { 3, 2, 1 }; // 3 Symbols A, 2 Symbols B, 1 Symbol C
private int n = sum(counts);
public static void main(String[] args) {
new Main().enumerate();
}
private void enumerate() {
int s = size(counts);
for (int i = 0; i < s; ++i) {
String p = perm(i);
System.out.printf("%4d -> %s\n", i, p);
}
}
// calculates the total number of symbols still to be placed
private int sum(int[] counts) {
int n = 0;
for (int i = 0; i < counts.length; i++) {
n += counts[i];
}
return n;
}
// calculates the number of different sequences with the symbol configuration in counts
private int size(int[] counts) {
int res = 1;
int num = 0;
for (int pos = 0; pos < counts.length; pos++) {
for (int den = 1; den <= counts[pos]; den++) {
res *= ++num;
res /= den;
}
}
return res;
}
// maps the sequence number to a sequence
private String perm(int num) {
int[] counts = this.counts.clone();
StringBuilder sb = new StringBuilder(n);
for (int i = 0; i < n; ++i) {
int p = 0;
for (;;) {
while (counts[p] == 0) {
p++;
}
counts[p]--;
int c = size(counts);
if (c > num) {
sb.append((char) ('A' + p));
break;
}
counts[p]++;
num -= c;
p++;
}
}
return sb.toString();
}
}
The mapping used by the algorithm is as follows. I use the example given in the question (3 x A, 2 x B, 1 x C) to illustrate it.
There are 60 (=6!/3!/2!/1!) possible sequences in total, 30 (=5!/2!/2!/1!) of them have an A at the first place, 20 (=5!/3!/1!/1!) have a B at the first place, and 10 (=5!/3!/2!/0!) have a C at the first place.
The numbers 0..29 are mapped to all sequences starting with an A, 30..49 are mapped to the sequences starting with B, and 50..59 are mapped to the sequences starting with C.
The same process is repeated for the next place in the sequence, for example if we take the sequences starting with B we have now to map numbers 0 (=30-30) .. 19 (=49-30) to the sequences with configuration (3 x A, 1 x B, 1 x C)
A very simple algorithm for mapping a permutation consisting of n digits to a number is
number <- digit[0]*10^(n-1) + digit[1]*10^(n-2) + ... + digit[n-1]*10^0
You can find plenty of resources on algorithms for generating permutations. I guess you want to use this algorithm in bioinformatics; for example, you can use itertools.permutations from Python.
Assuming the resulting number fits inside a word (e.g. 32 or 64 bit integer) relatively easily, then much of the linked article still applies. Encoding and decoding from a variable base remains the same. What changes is how the base varies.
If you're creating a permutation of a sequence, you pick an item out of your bucket of symbols (from the original sequence) and put it at the start. Then you pick out another item from your bucket of symbols and put it on the end of that. You'll keep picking and placing symbols at the end until you've run out of symbols in your bucket.
What's significant is which item you picked out of the bucket of the remaining symbols each time. The number of remaining symbols is something you don't have to record because you can compute that as you build the permutation -- that's a result of your choices, not the choices themselves.
The strategy here is to record what you chose, and then present an array of what's left to be chosen. Then choose, record which index you chose (packing it via the variable base method), and repeat until there's nothing left to choose. (Just as above when you were building a permuted sequence.)
In the case of duplicate symbols it doesn't matter which one you picked, so you can treat them as the same symbol. The difference is that when you pick a symbol which still has a duplicate left, you haven't reduced the number of distinct symbols in the bucket to pick from next time.
Let's adopt a notation that makes this clear:
Instead of listing duplicate symbols left in our bucket to choose from like c a b c a a we'll list them along with how many are still in the bucket: c-2 a-3 b-1.
Note that if you pick c from the list, the bucket has c-1 a-3 b-1 left in it. That means next time we pick something, we have three choices.
But on the other hand, if I picked b from the list, the bucket has c-2 a-3 left in it. That means next time we pick something, we only have two choices.
When reconstructing the permuted sequence we just maintain the bucket the same way as when we were computing the permutation number.
The implementation details aren't trivial, but they're straightforward with standard algorithms. The only thing that might trip you up is what to do when a symbol in your bucket is no longer available.
Suppose your bucket was represented by a list of pairs (like above): c-1 a-3 b-1, and you choose c. Your resulting bucket is c-0 a-3 b-1. But c-0 is no longer a choice, so your list should only have two entries, not three. You could move the entire list down by 1, giving a-3 b-1, but if your list is long this is expensive. A fast and easy solution: move the last element of the bucket into the removed location and decrease your bucket size: c-0 a-3 b-1 becomes b-1 a-3 <empty>, or just b-1 a-3.
Note that we can do the above because it doesn't matter what order the symbols in the bucket are listed in, as long as it's the same way when we encode or decode the number.
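Purely as an illustration of that bookkeeping (the arrays and the helper name below are made up, not taken from any posted code), the swap-remove step could look like this in Java:
// symbols[] and counts[] are parallel arrays describing the bucket;
// size is the number of distinct symbols still available.
// Taking the symbol at `index` decrements its count; if the count hits zero,
// the last entry is swapped into its slot so the bucket stays compact.
static int takeSymbol(char[] symbols, int[] counts, int size, int index) {
    counts[index]--;
    if (counts[index] == 0) {
        symbols[index] = symbols[size - 1];
        counts[index] = counts[size - 1];
        size--;
    }
    return size; // the new bucket size
}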
As I was unsure of the code in gbronner's answer (or of my understanding), I recoded it in R as follows
ritpermz = function(n, parclass){
    return(factorial(n) / prod(factorial(parclass)))}

rankum <- function(confg, parclass){
    n = length(confg)
    permdex = 1
    for (i in 1:(n-1)){
        x = confg[i]
        if (x > 1){
            for (j in 1:(x-1)){
                if (parclass[j] > 0){
                    parclass[j] = parclass[j] - 1
                    permdex = permdex + ritpermz(n-i, parclass)
                    parclass[j] = parclass[j] + 1}}}
        parclass[x] = parclass[x] - 1
    }
    return(permdex)
}
which does produce a ranking with the right range of integers

Calculating number of moves from top left corner to bottom right with move in any direction

I was asked a problem in an interview; this is a similar problem I found, so I thought of asking here. The problem is:
There is a robot situated at (1,1) in an N x N grid; the robot can move in any direction: left, right, up, or down. I am also given an integer k, which denotes the maximum number of steps in a path. I have to calculate the number of possible ways to move from (1,1) to (N,N) in k or fewer steps.
I know how to solve the simplified version of this problem, the one where moves are allowed only to the right and down; that can be solved with dynamic programming. I tried applying the same technique here, but I don't think it can be solved using a 2-dimensional matrix: I tried a similar approach, counting the number of ways arriving from the left, above, and right and summing them in the downward pass, but the problem is that I don't know the number of ways arriving from below, which should also be added, so I end up going in circles. I was able to solve the problem using recursion, recursing on (N,N,k) toward up, left, and k-1 and summing the results, but I think this is also not correct, and even if it were correct it would have exponential complexity. I found problems similar to this one, so I wanted to know what a good approach for solving these types of problems would be.
Suppose you have an NxN matrix, where each cell gives you the number of ways to move from (1,1) to (i,j) in exactly k steps (some entries will be zero). You can now create an NxN matrix, where each cell gives you the number of ways to move from (1,1) to (i,j) in exactly k+1 steps - start off with the all-zero matrix, and then add in cell (i,j) of the previous matrix to cells (i+1, j), (i, j+1),... and so on.
The (N,N) entry of each of these matrices gives you the number of ways to move from (1,1) to (N,N) in exactly that number of steps - all you have to do now is add them all together.
Here is an example for the 2x2 case, where steps outside the
matrix are not allowed, and (1,1) is at the top left.
In 0 steps, you can only get to the (1,1) cell:
1 0
0 0
There is one path to 1,1. From here you can go down or right,
so there are two different paths of length 1:
0 1
1 0
From the top right path you can go left or down, and from the
bottom left you can go right or up, so both cells have paths
that can be extended in two ways, and end up in the same two
cells. We add two copies of the following, one from each non-zero
cell
1 0
0 1
giving us these totals for paths of length two:
2 0
0 2
There are two choices from each of the non-empty cells again
so we have much the same as before for paths of length three.
0 4
4 0
Two features of this are easy checks:
1) For each length of path, only two cells are non-zero,
corresponding to the length of the path being odd or even.
2) The number of paths at each stage is a power of two, because
each path corresponds to a choice at each step as to whether to
go horizontally or vertically. (This only holds for this simple
2x2 case).
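Here is a minimal Java sketch of that layer-by-layer construction (my own illustration, not code from the answer): keep one matrix per step count, reusing only the previous layer, and accumulate the (N, N) entry of every layer from 0 up to k. The counts grow exponentially, so long overflows quickly for larger N and k; a real solution would need big integers or a modulus.
// Counts walks from (1,1) to (N,N) of length at most k, moving up/down/left/right
// without leaving the grid. layer[i][j] = number of walks of exactly t steps
// ending at (i,j); the (N,N) entry of every layer is added to the total.
static long countWalks(int N, int k) {
    long[][] layer = new long[N + 2][N + 2]; // 1-based with a zero border
    layer[1][1] = 1;
    long total = layer[N][N]; // the 0-step walk counts only if N == 1
    int[] dr = {-1, 1, 0, 0};
    int[] dc = {0, 0, -1, 1};
    for (int t = 1; t <= k; t++) {
        long[][] next = new long[N + 2][N + 2];
        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                long ways = 0;
                for (int d = 0; d < 4; d++) {
                    ways += layer[i + dr[d]][j + dc[d]];
                }
                next[i][j] = ways;
            }
        }
        layer = next;
        total += layer[N][N];
    }
    return total;
}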
Update: This algorithm is incorrect. See the comments and mcdowella's answer. However, the corrected algorithm does not make a difference to the time complexity.
It can be done in O(k * N^2) time, at least. Pseudocode:
# grid[i,j] contains the number of ways we can get to i,j in at most n steps,
# where n is initially 0
grid := N by N array of 0s
grid[1,1] := 1
for n from 1 to k:
old := grid
for each cell i,j in grid:
# cells outside the grid considered 0 here
grid[i,j] := old[i,j] + old[i-1,j] + old[i+1,j] + old[i,j-1] + old[i,j+1]
return grid[N,N]
There might be an O(log k * (N*log N)^2) solution which is way more complex. Each iteration through the outer for loop is nothing but a convolution with a fixed kernel. So we can convolve the kernel with itself to get bigger kernels that fuse multiple iterations into one, and use FFT to compute the convolution.
Basically:
uniquePaths(row, column) =
  0, if row > N or column > N
  1, if row == N and column == N
  uniquePaths(row+1, column) + uniquePaths(row, column+1), otherwise
That is, the solution has optimal substructure and overlapping subproblems, so it can be solved using dynamic programming. Below is a memoization (lazy/on-demand) version of it (a related answer which basically returns the paths as well: Algorithm for finding all paths in a NxN grid) (you may refer to my blog for more details: http://codingworkout.blogspot.com/2014/08/robot-in-grid-unique-paths.html)
private int GetUniquePaths_DP_Memoization_Lazy(int?[][] DP_Memoization_Lazy_Cache, int row,
int column)
{
int N = DP_Memoization_Lazy_Cache.Length - 1;
if (row > N)
{
return 0;
}
if (column > N)
{
return 0;
}
if(DP_Memoization_Lazy_Cache[row][column] != null)
{
return DP_Memoization_Lazy_Cache[row][column].Value;
}
if((row == N) && (column == N))
{
DP_Memoization_Lazy_Cache[N][N] = 1;
return 1;
}
int pathsWhenMovedDown = this.GetUniquePaths_DP_Memoization_Lazy(DP_Memoization_Lazy_Cache,
row + 1, column);
int pathsWhenMovedRight = this.GetUniquePaths_DP_Memoization_Lazy(DP_Memoization_Lazy_Cache,
row, column + 1);
DP_Memoization_Lazy_Cache[row][column] = pathsWhenMovedDown + pathsWhenMovedRight;
return DP_Memoization_Lazy_Cache[row][column].Value;
}
where the caller is
int GetUniquePaths_DP_Memoization_Lazy(int N)
{
int?[][] DP_Memoization_Lazy_Cache = new int?[N + 1][];
for(int i =0;i<=N;i++)
{
DP_Memoization_Lazy_Cache[i] = new int?[N + 1];
for(int j=0;j<=N;j++)
{
DP_Memoization_Lazy_Cache[i][j] = null;
}
}
this.GetUniquePaths_DP_Memoization_Lazy(DP_Memoization_Lazy_Cache, row: 1, column: 1);
return DP_Memoization_Lazy_Cache[1][1].Value;
}
Unit Tests
[TestCategory(Constants.DynamicProgramming)]
public void RobotInGridTests()
{
int p = this.GetNumberOfUniquePaths(3);
Assert.AreEqual(p, 6);
int p1 = this.GetUniquePaths_DP_Memoization_Lazy(3);
Assert.AreEqual(p, p1);
var p2 = this.GetUniquePaths(3);
Assert.AreEqual(p1, p2.Length);
foreach (var path in p2)
{
Debug.WriteLine("===================================================================");
foreach (Tuple<int, int> t in path)
{
Debug.Write(string.Format("({0}, {1}), ", t.Item1, t.Item2));
}
}
p = this.GetNumberOfUniquePaths(4);
Assert.AreEqual(p, 20);
p1 = this.GetUniquePaths_DP_Memoization_Lazy(4);
Assert.AreEqual(p, p1);
p2 = this.GetUniquePaths(4);
Assert.AreEqual(p1, p2.Length);
foreach (var path in p2)
{
Debug.WriteLine("===================================================================");
foreach (Tuple<int, int> t in path)
{
Debug.Write(string.Format("({0}, {1}), ", t.Item1, t.Item2));
}
}
}
There will be an infinite number of ways. This is because you can form a loop of positions and thus infinite possibilities. For example, you can move from (0,0) to (0,1), then to (1,1), then (1,0), and back again to (0,0). This forms a loop of positions, so anyone can go round and round such loops, giving infinitely many possibilities.
