How to model the partition problem as a dynamic programming problem? - c++11

I am unable to understand how this partition problem can be thought of as a dynamic programming problem.
I have the following doubts:
1) It is not an optimization problem (or I am unable to see it as one), so why are we applying the DP approach to it?
2) DP problems satisfy 2 properties:
Overlapping Subproblems
Optimal Substructure
But I am unable to see how the problem satisfies the above properties.
The partition problem is to determine whether a given set can be partitioned into two subsets such that the sum of elements in both subsets is the same.
arr[] = {1, 5, 11, 5}
Output: true
The array can be partitioned as {1, 5, 5} and {11}
arr[] = {1, 5, 3}
Output: false
The array cannot be partitioned into equal sum sets.

The problem is NP-Complete but for smaller constraints it is solvable using dynamic programming.
The recurrence relation is the following, where f(index, sum) answers whether some subset of arr[index..arraySize-1] sums to exactly sum:
f(index, sum) = f(index+1, sum - arr[index]) or f(index+1, sum)
and for the base case:
if (index >= arraySize) {
    if (sum == 0)
        return true;
    else
        return false;
}
The time complexity and the memory complexity of this function will be O(arraySize * maximumSum). So if arraySize * maximumSum is small enough, the problem is solvable using dynamic programming.
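As an illustration, the recurrence and base case above can be sketched in Python with memoization (a hedged sketch, not the asker's C++ code; can_partition is a name chosen here):

```python
from functools import lru_cache

def can_partition(arr):
    """Return True if arr can be split into two subsets of equal sum."""
    total = sum(arr)
    if total % 2 != 0:          # an odd total can never be split evenly
        return False

    @lru_cache(maxsize=None)
    def f(index, remaining):
        # Base case: success only if we hit the target sum exactly.
        if remaining == 0:
            return True
        if index == len(arr) or remaining < 0:
            return False
        # Either take arr[index] toward the target, or skip it.
        return f(index + 1, remaining - arr[index]) or f(index + 1, remaining)

    return f(0, total // 2)

print(can_partition([1, 5, 11, 5]))  # True  ({1, 5, 5} and {11})
print(can_partition([1, 5, 3]))      # False
```

The memoization table has at most arraySize * (totalSum/2) entries, matching the O(arraySize * maximumSum) bound above.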

Related

Dynamic programming function in O(nk) time

Given two integer arrays, A of size n and B of size k, and knowing that all items
in array B are unique, I want to find an algorithm that finds indices j' < j'', such
that all elements of B belong to A[j' : j''] and the value |j'' - j'| is minimized, or
returns zero if there are no such indices at all. I also note that A can contain duplicates.
To provide more clarity, consider array A = {1, 2, 9, 6, 7, 8, 1, 0, 0, 6} and B = {1, 8, 6}. You can see that B ⊆ A[1 : 6] and B ⊆ A[4 : 7], but at the same time 7−4 < 6−1,
thus the algorithm should output j' = 4 and j'' = 7.
I want to find an algorithm that runs in O(nk) time.
My work so far: I was thinking that for each j' ∈ [n], I can compute the minimum j'' ≥ j' so that B ⊆ A[j' : j'']. If I assume B = {b1, ..., bk}, let Next[j'][i] denote the smallest index t ≥ j' such that a_t = b_i, i.e., the index of the next element at or after a_j' which equals b_i.
In particular, if such t doesn't exist, simply let Next[j'][i] = ∞. If I am able to show that the minimum j'' is
j'' = max_{i ∈ [k]} Next[j'][i],
then I think I will be able to design a dynamic programming algorithm to compute Next in O(nk) time. Any help on this dynamic programming problem would be much appreciated!
Just run a sliding window that maintains the invariant of including all elements of B. That's O(n) with a hashmap.
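The sliding-window answer can be sketched as follows (a hedged sketch using 0-based indices, so the example's 1-based answer (4, 7) appears as (3, 6); min_window is a name chosen here):

```python
def min_window(A, B):
    """Smallest (j1, j2), 0-based inclusive, with set(B) contained in A[j1..j2]; None if impossible."""
    need = set(B)
    count = {}          # counts of needed values inside the current window
    have = 0            # how many distinct needed values are currently covered
    best = None
    left = 0
    for right, x in enumerate(A):
        if x in need:
            count[x] = count.get(x, 0) + 1
            if count[x] == 1:
                have += 1
        # Shrink from the left while the window still covers all of B.
        while have == len(need):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            y = A[left]
            if y in need:
                count[y] -= 1
                if count[y] == 0:
                    have -= 1
            left += 1
    return best

print(min_window([1, 2, 9, 6, 7, 8, 1, 0, 0, 6], [1, 8, 6]))  # (3, 6)
```

Each index enters and leaves the window at most once, so the total work is O(n) with a hash map, independent of k beyond building the set.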

How more effectively find the minimal composition from n sets that satisfies the given condition?

We have N sets of triples,like
1. { (4; 0.1), (5; 0.3), (7; 0.6) }
2. { (7; 0.2), (8; 0.4), (1; 0.4) }
...
N. { (6; 0.3), (1; 0.2), (9 ; 0.5) }
and we need to choose exactly one pair from each triple, so that the sum of the first members of the chosen pairs is minimal, subject to the condition that the sum of the second members is not less than a given number P.
We can solve this by sorting all possible pair combinations by the sum of their first members (3^N combinations), and choosing from that sorted list the first one which also satisfies the second condition.
Could you please help suggest a better, non-trivial solution for this problem?
If there are no constraints on the values inside your triplets, then we are facing a pretty general version of integer programming problem, more specifically a 0-1 linear programming problem, as it can be represented as a system of equations with every coefficient being 0 or 1. You can find the possible approaches on the wiki page, but there is no fast-and-easy solution for this problem in general.
Alternatively, if the second numbers of each pair (the ones that need to sum up to >= P) are from a small enough range, we could view this as Dynamic Programming problem similar to a Knapsack problem. "Small enough" there is a bit hard to define because the original data has non-integer numbers. If they were integers, then the algorithmic complexity of solution I will describe is O(P * N). For non-integer numbers, they need to be first converted to integers by multiplying them all, as well as P, by a large enough number. In your example, the precision of each number is 1 digit after zero, so multiplying by 10 is enough. Hence, the actual complexity is O(M * P * N), where M is the factor everything was multiplied by to achieve integer numbers.
After this, we are essentially solving a modified Knapsack problem: instead of constraining the weight from above, we are constraining it from below, and on each step we are choosing a pair from a triplet, as opposed to deciding whether to put an item into the knapsack or not.
Let's define a function minimum_sum[i][s] which at values i, s represents the minimum possible sum (of first numbers in each pair we took) we can achieve if the sum of the second numbers in pairs taken so far is equal to s and we already considered the first i triplets. One exception to this definition is that minimum_sum[i][P] has the minimum for all sums exceeding P as well. If we can compute all values of this function, then minimum_sum[N][P] is the answer. The function values can be computed with something like this:
minimum_sum[0][0] = 0, all other values are set to infinity
for i = 0..N-1:
    for s = 0..P:
        for j = 0..2:
            minimum_sum[i+1][min(P, s+B[i][j])] = min(minimum_sum[i+1][min(P, s+B[i][j])], minimum_sum[i][s] + A[i][j])
A[i][j] here denotes the first number in the i-th triplet's j-th pair, and B[i][j] denotes the second number of the same triplet.
This solution is viable if N is large, but P is small and precision on Bs isn't too high. For instance, if N=50, there is little hope to compute 3^N possibilities, but with M*P=1000000 this approach would work extremely fast.
Python implementation of the idea above:
def compute(A, B, P):
    n = len(A)
    # note that I use 1,000,000 as "infinity" here, which might need to be increased depending on input data
    best = [[1000000 for i in range(P + 1)] for j in range(n + 1)]
    best[0][0] = 0
    for i in range(n):
        for s in range(P + 1):
            for j in range(3):
                best[i+1][min(P, s+B[i][j])] = min(best[i+1][min(P, s+B[i][j])], best[i][s] + A[i][j])
    return best[n][P]
Testing:
A=[[4, 5, 7], [7, 8, 1], [6, 1, 9]]
# second numbers in each pair after scaling them up to be integers
B=[[1, 3, 6], [2, 4, 4], [3, 2, 5]]
In [7]: compute(A, B, 0)
Out[7]: 6
In [14]: compute(A, B, 7)
Out[14]: 6
In [15]: compute(A, B, 8)
Out[15]: 7
In [20]: compute(A, B, 13)
Out[20]: 14

Longest subarray whose elements form a continuous sequence

Given an unsorted array of positive integers, find the length of the longest subarray whose elements when sorted are continuous. Can you think of an O(n) solution?
Example:
{10, 5, 3, 1, 4, 2, 8, 7}, answer is 5.
{4, 5, 1, 5, 7, 6, 8, 4, 1}, answer is 5.
For the first example, the subarray {5, 3, 1, 4, 2} when sorted can form the continuous sequence 1, 2, 3, 4, 5, which is the longest.
For the second example, the subarray {5, 7, 6, 8, 4} is the result subarray.
I can think of a method which, for each subarray, checks whether (maximum - minimum + 1) equals the length of that subarray; if true, it is a continuous subarray. Take the longest of all. But it is O(n^2) and cannot deal with duplicates.
Can someone give a better method?
Here is an algorithm to solve the original problem in O(n) without duplicates. Maybe it helps someone develop an O(n) solution that deals with duplicates.
Input: [a1, a2, a3, ...]
Map the original array to pairs where the 1st element is the value and the 2nd is the index in the array.
Array: [[a1, i1], [a2, i2], [a3, i3], ...]
Sort this array of pairs with some O(n) algorithm (e.g. Counting Sort) for integer sorting by value.
We get some other array:
Array: [[a3, i3], [a2, i2], [a1, i1], ...]
where a3, a2, a1, ... are in sorted order.
Run a loop through the sorted array of pairs.
In linear time we can detect consecutive groups of numbers a3, a2, a1. The definition of a consecutive group: next value = prev value + 1.
During that scan keep the current group size (n), the minimum value of the index (min), and the current sum of indices (actualSum).
On each step inside a consecutive group we can estimate the sum of indices, because for a valid subarray they would form an arithmetic progression with first element min, step 1, and size n (the group size seen so far).
This sum estimate can be done in O(1) time using the formula for an arithmetic progression:
estimate sum = (a1 + an) * n / 2
estimate sum = (min + min + (n - 1)) * n / 2
estimate sum = min * n + n * (n - 1) / 2
If on some loop step inside a consecutive group the estimated sum equals the actual sum, then the consecutive group seen so far satisfies the conditions. Save n as the current maximum result, or choose the maximum between the current maximum and n.
When the values stop forming a consecutive group, reset all values and do the same.
Code example: https://gist.github.com/mishadoff/5371821
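For reference, here is a direct Python sketch of the steps above, with sorted() standing in for a linear-time counting sort (longest_consecutive_subarray is a name chosen here; it assumes no duplicate values, as the answer states):

```python
def longest_consecutive_subarray(arr):
    """Length of the longest subarray whose elements form a consecutive run (distinct values)."""
    # Pair each value with its index, then sort by value.
    pairs = sorted((v, i) for i, v in enumerate(arr))
    best = 0
    k = 0
    while k < len(pairs):
        # Start of a new consecutive-value group.
        n = 1
        min_index = pairs[k][1]
        actual_sum = pairs[k][1]
        best = max(best, 1)
        # Extend the group while the next sorted value is prev value + 1.
        while k + 1 < len(pairs) and pairs[k + 1][0] == pairs[k][0] + 1:
            k += 1
            n += 1
            min_index = min(min_index, pairs[k][1])
            actual_sum += pairs[k][1]
            # Indices of a contiguous subarray form an arithmetic progression:
            # estimated sum = min * n + n * (n - 1) / 2.
            if actual_sum == min_index * n + n * (n - 1) // 2:
                best = max(best, n)
        k += 1
    return best

print(longest_consecutive_subarray([10, 5, 3, 1, 4, 2, 8, 7]))  # 5
```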
View the array S in its mathematical set definition:
S = I_0 ∪ I_1 ∪ ... ∪ I_k
where the I_j are disjoint integer segments. You can design a specific interval tree (based on a Red-Black tree or a self-balancing tree that you like :) ) to store the array in this mathematical definition. The node and tree structures should look like these:
struct node {
    int d, u;
    int count;
    struct node *n_left, *n_right;
};
Here, d is the lesser bound of the integer segment and u, the upper bound. count is added to take care of possible duplicates in the array : when trying to insert an already existing element in the tree, instead of doing nothing, we will increment the count value of the node in which it is found.
struct tree {
    struct node *root;
};
The tree will only store disjoint nodes, thus the insertion is a bit more complex than a classical Red-Black tree insertion. When inserting intervals, you must scan for potential overlaps with already existing intervals. In your case, since you will only insert singletons, this should not add too much overhead.
Given three nodes P, L and R, L being the left child of P and R the right child of P. Then, you must enforce L.u < P.d and P.u < R.d (and for each node, d <= u, of course).
When inserting an integer segment [x,y], you must find "overlapping" segments, that is to say, intervals [d,u] that satisfy both of the following inequalities:
y >= d - 1
AND
x <= u + 1
If the inserted interval is a singleton x, then you can only find up to 2 overlapping interval nodes N1 and N2 such that N1.d == x + 1 and N2.u == x - 1. Then you have to merge the two intervals and update count, which leaves you with N3 such that N3.d = N2.d, N3.u = N1.u and N3.count = N1.count + N2.count + 1. Since the delta between N1.d and N2.u is the minimal delta for two segments to be disjoint, then you must have one of the following :
N1 is the right child of N2
N2 is the left child of N1
So the insertion will still be in O(log(n)) in the worst case.
From here, I can't figure out how to handle the order in the initial sequence but here is a result that might be interesting : if the input array defines a perfect integer segment, then the tree only has one node.
UPD2: The following solution is for a version of the problem in which the subarray is not required to be contiguous. I misunderstood the problem statement. Not deleting this, as somebody may have an idea based on mine that will work for the actual problem.
Here's what I've come up with:
Create an instance of a dictionary (which is implemented as hash table, giving O(1) in normal situations). Keys are integers, values are hash sets of integers (also O(1)) – var D = new Dictionary<int, HashSet<int>>.
Iterate through the array A and for each integer n with index i do:
Check whether keys n-1 and n+1 are contained in D.
if neither key exists, do D.Add(n, new HashSet<int>)
if only one of the keys exists, e.g. n-1, do D.Add(n, D[n-1])
if both keys exist, do D[n-1].UnionWith(D[n+1]); D[n+1] = D[n] = D[n-1];
D[n].Add(n)
Now go through each key in D and find the hash set with the greatest length (finding length is O(1)). The greatest length will be the answer.
To my understanding, the worst case complexity will be O(n*log(n)), only because of the UnionWith operation. I don't know how to calculate the average complexity, but it should be close to O(n). Please correct me if I am wrong.
UPD: To speak code, here's a test implementation in C# that gives the correct result in both of the OP's examples:
var A = new int[] {4, 5, 1, 5, 7, 6, 8, 4, 1};
var D = new Dictionary<int, HashSet<int>>();
foreach (int n in A)
{
    if (D.ContainsKey(n-1) && D.ContainsKey(n+1))
    {
        D[n-1].UnionWith(D[n+1]);
        D[n+1] = D[n] = D[n-1];
    }
    else if (D.ContainsKey(n-1))
    {
        D[n] = D[n-1];
    }
    else if (D.ContainsKey(n+1))
    {
        D[n] = D[n+1];
    }
    else if (!D.ContainsKey(n))
    {
        D.Add(n, new HashSet<int>());
    }
    D[n].Add(n);
}
int result = int.MinValue;
foreach (HashSet<int> H in D.Values)
{
    if (H.Count > result)
    {
        result = H.Count;
    }
}
Console.WriteLine(result);
This will require two passes over the data. First create a hash map, mapping ints to bools. I updated my algorithm not to use std::map from the STL, which keeps its keys ordered internally. This algorithm uses hashing, and can easily be updated for any maximum or minimum combination, even potentially all possible values an integer can obtain.
#include <iostream>
using namespace std;

// MINIMUM must be at or below the smallest possible input value,
// otherwise indexing below goes out of bounds (the original used 0,
// which breaks on the negative test data).
const int MINIMUM = -100;
const int MAXIMUM = 100;
const unsigned int ARRAY_SIZE = MAXIMUM - MINIMUM + 1;

int main() {
    bool* hashOfIntegers = new bool[ARRAY_SIZE];
    //const int someArrayOfIntegers[] = {10, 9, 8, 6, 5, 3, 1, 4, 2, 8, 7};
    //const int someArrayOfIntegers[] = {10, 6, 5, 3, 1, 4, 2, 8, 7};
    const int someArrayOfIntegers[] = {-2, -3, 8, 6, 12, 14, 4, 0, 16, 18, 20};
    const int SIZE_OF_ARRAY = 11;

    // Initialize hashOfIntegers values to false.
    for (unsigned int i = 0; i < ARRAY_SIZE; i++) {
        hashOfIntegers[i] = false;
    }
    // Change appropriate values to true.
    for (int i = 0; i < SIZE_OF_ARRAY; i++) {
        // We subtract the MINIMUM value to normalize the MINIMUM value to a zero index for negative numbers.
        hashOfIntegers[someArrayOfIntegers[i] - MINIMUM] = true;
    }

    int sequence = 0;
    int maxSequence = 0;
    // Find the maximum sequence in the values.
    for (unsigned int i = 0; i < ARRAY_SIZE; i++) {
        if (hashOfIntegers[i]) sequence++;
        else sequence = 0;
        if (sequence > maxSequence) maxSequence = sequence;
    }
    cout << "MAX SEQUENCE: " << maxSequence << endl;
    delete[] hashOfIntegers;
    return 0;
}
The basic idea is to use the hash map as a bucket sort, so that you only have to do two passes over the data. This algorithm is O(2n), which in turn is O(n)
Don't get your hopes up, this is only a partial answer.
I'm quite confident that the problem is not solvable in O(n). Unfortunately, I can't prove it.
If there is a way to solve it in less than O(n^2), I'd suspect that the solution is based on the following strategy:
Decide in O(n) (or maybe O(n log n)) whether there exists a continuous subarray as you describe it with at least i elements. Let's call this predicate E(i).
Use bisection to find the maximum i for which E(i) holds.
The total running time of this algorithm would then be O(n log n) (or O(n log^2 n)).
This is the only way I could come up with to reduce the problem to another problem that at least has the potential of being simpler than the original formulation. However, I couldn't find a way to compute E(i) in less than O(n^2), so I may be completely off...
Here's another way to think of your problem: suppose you have an array composed only of 1s and 0s, and you want to find the longest consecutive run of 1s. This can be done in linear time by run-length encoding the 1s (ignoring the 0s). To transform your original problem into this run-length-encoding problem, compute a new array b[i] = (a[i] < a[i+1]). This doesn't have to be done explicitly; you can do it implicitly to achieve an algorithm with a constant memory requirement and linear complexity.
Here are 3 acceptable solutions:
The first is O(nlog(n)) in time and O(n) space, the second is O(n) in time and O(n) in space, and the third is O(n) in time and O(1) in space.
Build a binary search tree, then traverse it in order. Keep 2 pointers, one for the start of the max subset and one for the end, and keep the max_size value while iterating the tree. This is O(n*log(n)) time and O(n) space complexity.
You can always sort the number set using counting sort in linear time and run through the array, which means O(n) time and space complexity.
Assuming there is no overflow, or using a big-integer data type, and assuming the array is a mathematical set (no duplicate values), you can do it in O(1) of memory: calculate the sum of the array and the product of the array, then figure out which numbers it contains, assuming you have the min and max of the original set. In total it is O(n) time complexity.

Knapsack variation in Dynamic Programming

I'm trying to solve this exercise: we are given n items, where each has a given nonnegative weight w1, w2, ..., wn and value v1, v2, ..., vn, and a knapsack with max weight capacity W. I have to find a subset S of maximum value, subject to two restrictions: 1) the total weight of the set should not exceed W; 2) I can't take objects with consecutive indices.
For example, with n = 10, possible solutions are {1, 4, 6, 9}, {2, 4, 10} or {1, 10}.
How can I build a correct recurrence?
Recall that the knapsack recursive formula used for the DP solution is:
D(i,w) = max { D(i-1,w) , D(i-1,w-weight[i]) + value[i] }
In your modified problem, if you choose to take item i, you cannot take item i-1, resulting in the modification:
D(i,w) = max { D(i-1,w) , D(i-2,w-weight[i]) + value[i] }
                              ^
                              note here: i-2 instead of i-1
Similar to the classic knapsack, it is also an exhaustive search, and thus provides the optimal solution for the same reasons.
The idea is: given that you have decided to choose i, you cannot choose i-1, so find the optimal solution that uses at most the items up to i-2 (no change from the original if you decided to exclude i).
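The modified recurrence can be sketched in Python as a bottom-up table (a hedged sketch; knapsack_no_adjacent is a name chosen here, with 1-based items mapped onto 0-based lists):

```python
def knapsack_no_adjacent(weights, values, W):
    """Max value with total weight <= W and no two consecutive indices taken."""
    n = len(weights)
    # D[i][w]: best value using the first i items with capacity w.
    D = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            D[i][w] = D[i - 1][w]                         # exclude item i
            if weights[i - 1] <= w:
                take = values[i - 1]
                if i >= 2:
                    take += D[i - 2][w - weights[i - 1]]  # item i-1 is now forbidden
                D[i][w] = max(D[i][w], take)
    return D[n][W]

# Items 1 and 3 (values 10 and 30) are the best non-adjacent choice here.
print(knapsack_no_adjacent([1, 1, 1], [10, 20, 30], 3))  # 40
```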

Coin Change (Dynamic Programming)

We usually use the following recurrence relation for the coin change problem:
(P is the total money for which we need change and d_i is the coin available)
But can't we make it like this:
(V is the given sorted set of coins available, i and j are its subscripts with Vj being the highest value coin given)
C[p, Vi, j] = C[p, Vi, j-1]          if Vj > p
            = C[p-Vj, Vi, j] + 1     if Vj <= p
Is there anything wrong with what I wrote? Though the solution is not dynamic programming, isn't it more efficient?
Consider P = 6, V = {4, 3, 1}. You would pick 4, 1, 1 instead of 3, 3, so 3 coins instead of the optimal 2.
What you've written is similar to the greedy algorithm, which works only under certain conditions. (See - How to tell if greedy algorithm suffices for finding minimum coin change?)
Also, in your version you aren't actually using Vi within the recurrence, so it's just a waste of memory.
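To see the difference concretely, here is a hedged Python sketch contrasting the standard DP recurrence with the greedy scheme on the counterexample above (min_coins and greedy_coins are names chosen here):

```python
def min_coins(P, coins):
    """Fewest coins summing to P via the standard DP; None if impossible."""
    INF = float('inf')
    C = [0] + [INF] * P
    for p in range(1, P + 1):
        for d in coins:
            if d <= p and C[p - d] + 1 < C[p]:
                C[p] = C[p - d] + 1
    return C[P] if C[P] != INF else None

def greedy_coins(P, coins):
    """Greedy: repeatedly take the largest coin that fits (not always optimal)."""
    count = 0
    for d in sorted(coins, reverse=True):
        count += P // d
        P %= d
    return count if P == 0 else None

print(min_coins(6, [4, 3, 1]))     # 2  (3 + 3)
print(greedy_coins(6, [4, 3, 1]))  # 3  (4 + 1 + 1)
```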
