I am wondering whether a powerset problem can be transformed and reduced to the knapsack problem. They seem identical to me. Take the change-making problem, for example: we can think of it as a powerset problem in which every recursive stage launches 2 recursive calls (one takes the ith element, and the other bypasses it). I can also solve it with dynamic programming, just like the knapsack problem, so this makes me wonder whether every powerset problem can be transformed into a knapsack problem. Is that correct?
The following are code fragments for coin change making: one with O(2^N) time complexity and one using dynamic programming with O(N^2) runtime.
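// O(2^N) time complexity
int count = 0; // number of ways found so far

void bruteforce(int[] coins, int i, int N, String expr)
{
    if (i == coins.length) {
        if (N == 0)
            count++;
        return;
    }
    // take coins[i] (it may be taken again later), if it still fits
    if (N >= coins[i])
        bruteforce(coins, i, N - coins[i], expr + " " + coins[i]);
    // skip coins[i] and move on to the next coin
    bruteforce(coins, i + 1, N, expr);
}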
// O(N^2) time complexity
int dynamicProgramming(int[] coins, int N)
{
    int[] dp = new int[N + 1];
    dp[0] = 1;
    for (int i = 0; i < coins.length; i++)
        for (int j = coins[i]; j <= N; j++)
            dp[j] += dp[j - coins[i]];
    return dp[N];
}
Finding the powerset (generating all subsets of a set) can't be done with a complexity better than O(2^n), because there are 2^n subsets and merely printing them takes exponential time.
Problems like subset sum, knapsack or coin change are related to the powerset because you implicitly have to consider all subsets, but there is a big difference between them and the powerset. In these problems you are only counting some subsets and you aren't required to explicitly generate those subsets. For example, if the problem asks you to list all the ways to change X dollars into some coins, then you can't avoid exponential time in the worst case, because you have to generate all the desired subsets and there could be 2^n of them.
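The difference between counting and generating can be sketched as follows. This is only an illustration under assumptions: the class and method names are made up, the values are non-negative, each element may be used at most once (subset sum rather than coin change), and the array has fewer than 31 elements so a bitmask fits in an int.

```java
import java.util.ArrayList;
import java.util.List;

final class CountVersusList {
    // Counts the subsets of a that sum to target in O(n * target) time.
    static long countSubsets(int[] a, int target) {
        long[] dp = new long[target + 1];
        dp[0] = 1; // the empty subset
        for (int x : a)
            for (int s = target; s >= x; s--) // downward: each element used at most once
                dp[s] += dp[s - x];
        return dp[target];
    }

    // Lists the same subsets explicitly by trying all 2^n bitmasks.
    static List<List<Integer>> listSubsets(int[] a, int target) {
        List<List<Integer>> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << a.length); mask++) {
            List<Integer> subset = new ArrayList<>();
            int sum = 0;
            for (int i = 0; i < a.length; i++)
                if ((mask & (1 << i)) != 0) {
                    subset.add(a[i]);
                    sum += a[i];
                }
            if (sum == target) out.add(subset);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        System.out.println(countSubsets(a, 3)); // number of subsets only
        System.out.println(listSubsets(a, 3));  // the subsets themselves
    }
}
```

The counting version never materializes a subset, which is why it can stay polynomial in n and the target, while the listing version has to touch every candidate.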
Related
The code below computes whether s1 can be reduced to t and, if so, in how many ways.
Let's say the length of s1 is n and the length of t is m. The worst-case runtime of the code below is O(n^m) without memoization. Say we memoize the sub-problems of s1, i.e. the strings that recur. Then the runtime is O(m*n), since we need to recur m times for each n. Is this reasoning correct?
static int distinctSeq(String s1, String t) {
    if (s1.length() == t.length()) {
        if (s1.equals(t))
            return 1;
        else
            return 0;
    }
    int count = 0;
    for (int i = 0; i < s1.length(); i++) {
        // remove the character at position i and recurse on the shorter string
        String ss = s1.substring(0, i) + s1.substring(i + 1);
        count += distinctSeq(ss, t);
    }
    return count;
}
As @meowgoesthedog already mentioned, your initial solution has a time complexity of O(n!/m!):
If you are starting with s1 of length n, and n > m then you can go into n different states by excluding one symbol from the original string.
You will continue doing it until your string has the length of m. The number of ways to come to the length of m from n using the given algorithm is n*(n-1)*(n-2)*...*(m+1), which is effectively n!/m!
For each string of length m formed by excluding symbols from initial string of length n you will have to compare string derived from s1 and t, which will require m operations (length of the strings), so the complexity from the previous step should be multiplied by m, but considering that you have factorial in the big-O, another *m won't change the asymptotic complexity.
Now about the memoization. If you add memoization, the algorithm only transitions to states that weren't already visited, which means that the task is to count the number of distinct subsequences of s1. For simplicity we will assume that all symbols of s1 are different. The number of states with length x is the number of ways of removing n-x different symbols from s1, disregarding the order, which is a binomial coefficient: C(n,x) = n!/((n-x)! * x!).
The algorithm will transition through all lengths between n and m, so the overall number of states is Sum(k=m...n, C(n,k)) = n!/((n-m)! * m!) + n!/((n-m-1)! * (m+1)!) + ... + n!/(0! * n!). The sum has n-m+1 terms, so it is within a factor of n-m+1 of its largest term, which is the one with k as close as possible to n/2. If m is less than n/2, then C(n, n/2) is present in the sum; otherwise C(n,m) is the largest element in it. So, up to that polynomial factor, the complexity of the algorithm with memoization is O(max(C(n,n/2), C(n,m))).
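A memoized variant can be sketched like this to observe the number of states in practice. The class name, the statesVisited field and the choice of a HashMap keyed on the current string are assumptions made for illustration; only strings longer than t are stored, since strings of length m are handled by the base case.

```java
import java.util.HashMap;
import java.util.Map;

final class MemoDistinctSeq {
    // Number of distinct strings expanded by the last call to count().
    static int statesVisited;

    static int count(String s1, String t) {
        Map<String, Integer> memo = new HashMap<>();
        int result = rec(s1, t, memo);
        statesVisited = memo.size();
        return result;
    }

    private static int rec(String s, String t, Map<String, Integer> memo) {
        if (s.length() == t.length())
            return s.equals(t) ? 1 : 0;
        Integer cached = memo.get(s);
        if (cached != null)
            return cached; // this string was already expanded
        int count = 0;
        for (int i = 0; i < s.length(); i++)
            count += rec(s.substring(0, i) + s.substring(i + 1), t, memo);
        memo.put(s, count);
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("abcd", "ab"));
        System.out.println(statesVisited);
    }
}
```

For "abcd" and "ab" the memo holds the one string of length 4 and the four strings of length 3, which matches C(4,4) + C(4,3) for the lengths above m.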
Problem: Find best way to cut a rod of length n.
Each cut is integer length.
Assume that each length i rod has a price p(i).
Given: a rod of length n, and a list of prices p, which provides the price of each possible integer length between 0 and n.
Find best set of cuts to get maximum price.
Can use any number of cuts, from 0 to n−1.
There is no cost for a cut.
Below I present a naive algorithm for this problem.
CUT-ROD(p,n)
    if(n == 0)
        return 0
    q = -infinity
    for i = 1 to n
        q = max(q, p[i]+CUT-ROD(p,n-1))
    return q
How can I prove that this algorithm is exponential? Step-by-step.
I can see that it is exponential. However, I'm not able to prove it.
Let's translate the code to C++ for clarity:
#include <algorithm>
#include <vector>

std::vector<int> prices; // prices[i] is the price used at step i

int cut_rod(int n) {
    if (n == 0) {
        return 0;
    }
    int q = -1;
    int res = cut_rod(n - 1); // computed once and reused in the loop
    for (int i = 0; i < n; i++) {
        q = std::max(q, prices[i] + res);
    }
    return q;
}
Note: We are caching the result of cut_rod(n-1) to avoid unnecessarily increasing the complexity of the algorithm. Here, we can see that cut_rod(n) calls cut_rod(n-1), which calls cut_rod(n-2), and so on until cut_rod(0). For cut_rod(n), we see that the function iterates over the array n times. Therefore the time complexity of the algorithm is O(n + (n-1) + (n-2) + ... + 1) = O(n(n+1)/2) = O(n^2).
EDIT:
If we are using the exact same algorithm as the one in the question, its time complexity is O(n!) since cut-rod(n) calls cut-rod(n-1) n times. cut-rod(n-1) calls cut-rod(n-2) n-1 times and so on. Therefore the time complexity is equal to O(n*(n-1)*(n-2)...1) = O(n!).
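I am unsure if this counts as a step-by-step solution, but it can be shown easily by induction/substitution. For the textbook version of the algorithm, where the recursive call is CUT-ROD(p, n-i), the number of calls satisfies T(0) = 1 and T(n) = 1 + T(0) + T(1) + ... + T(n-1). Just assume T(i) = 2^i for all i < n; then we show that it holds for n: T(n) = 1 + (2^0 + 2^1 + ... + 2^(n-1)) = 1 + (2^n - 1) = 2^n.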
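The 2^n call count can also be checked empirically. The sketch below assumes the textbook form of the recursion, in which the recursive call is made on n - i (this differs from the pseudocode as posted in the question); the class name, the call counter and the sample prices are illustrative only.

```java
final class CutRodCalls {
    static long calls; // number of invocations of cutRod since the last reset

    // Naive recursion; p[i] is the price of a piece of length i (p[0] is unused).
    static int cutRod(int[] p, int n) {
        calls++;
        if (n == 0)
            return 0;
        int q = Integer.MIN_VALUE;
        for (int i = 1; i <= n; i++)
            q = Math.max(q, p[i] + cutRod(p, n - i));
        return q;
    }

    // Resets the counter, runs the recursion and reports how many calls were made.
    static long countCalls(int[] p, int n) {
        calls = 0;
        cutRod(p, n);
        return calls;
    }

    public static void main(String[] args) {
        int[] p = {0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30}; // sample prices
        for (int n = 0; n <= 10; n++)
            System.out.println(n + " -> " + countCalls(p, n) + " calls");
    }
}
```

Each increase of n by one doubles the number of calls, in line with T(n) = 2^n.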
This is a practice question for the understanding of Divide and conquer algorithms.
You are given an array of N sorted integers. All the elements are distinct except one element, which is repeated twice. Design an O(log N) algorithm to find that element.
I get that array needs to be divided and see if an equal counterpart is found in the next index, some variant of binary search, I believe. But I can't find any solution or guidance regarding that.
You cannot do it in O(log n) time because at any step, even if you divide the array into 2 parts, you cannot decide which part to consider for further processing and which part to discard.
On the other hand if the consecutive numbers are all present in the array then by looking at the index and the value in the index we can decide if the duplicate number is in left side or right side of the array.
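For that special case, a binary search over index versus value could look like the sketch below. It assumes the array holds consecutive integers starting at a[0] with exactly one value repeated once; the class and method names are made up for illustration.

```java
final class ConsecutiveDuplicate {
    // Assumes a is sorted, holds consecutive integers, and exactly one value appears twice.
    // Before the second copy of the duplicate, a[i] - a[0] == i; from it onward, a[i] - a[0] == i - 1.
    static int findDuplicate(int[] a) {
        int lo = 1, hi = a.length - 1; // the second copy sits somewhere in [1, n - 1]
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] - a[0] < mid)
                hi = mid;     // already shifted: the second copy is at mid or to the left
            else
                lo = mid + 1; // not shifted yet: the second copy is to the right
        }
        return a[lo];
    }

    public static void main(String[] args) {
        System.out.println(findDuplicate(new int[] {3, 4, 5, 5, 6, 7}));
    }
}
```

The comparison a[mid] - a[0] < mid is what makes the left/right decision possible; without the consecutive-values assumption there is no such test.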
D&C should look something like this
// Searches a[i..j] (both inclusive) for an index k with a[k] == a[k+1].
int Twice(int a[], int i, int j) {
    if (i >= j)
        return -1;
    int k = (i + j) / 2; // i <= k < j, so a[k+1] is always in range
    if (a[k] == a[k + 1])
        return k;
    int m = Twice(a, i, k);
    return m != -1 ? m : Twice(a, k + 1, j);
}
int Twice(int a[], int n) {
    return Twice(a, 0, n - 1);
}
But it has complexity O(n). As said above, it is not possible to find an O(lg n) algorithm for this problem.
Given:
an array of integers
values K and M
Question:
Find the maximum sum that we can obtain over all K-element subsets of the given array such that the sum is less than M.
Is there a non-dynamic-programming solution to this problem?
Or can only dp[i][j][k] solve this type of problem?
Can you please explain the algorithm?
Many people have commented correctly that the answer below from years ago, which uses dynamic programming, incorrectly encodes solutions allowing an element of the array to appear in a "subset" multiple times. Luckily there is still hope for a DP based approach.
Let dp[i][j][k] = true if there exists a size k subset of the first i elements of the input array summing up to j
Our base case is dp[0][0][0] = true
Now, either the size k subset of the first i elements uses a[i + 1], or it does not, giving the recurrence
dp[i + 1][j][k] = dp[i][j - a[i + 1]][k - 1] OR dp[i][j][k]
Put everything together:
given A[1...N]
initialize dp[0...N][0...M][0...K] to false
dp[0][0][0] = true
for i = 0 to N - 1:
    for j = 0 to M:
        for k = 0 to K:
            if dp[i][j][k]:
                dp[i + 1][j][k] = true
            if j >= A[i + 1] and k >= 1 and dp[i][j - A[i + 1]][k - 1]:
                dp[i + 1][j][k] = true
max_sum = 0
for j = 0 to M:
    if dp[N][j][K]:
        max_sum = j
return max_sum
giving O(NMK) time and space complexity.
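As a rough Java sketch of the recurrence above: the class and method names are assumptions, the input is taken to be non-negative, the bound is treated as "at most M", and -1 (rather than 0) is returned when no size-K subset fits, so that "no solution" can be told apart from a sum of 0.

```java
final class KSubsetDp3D {
    // Largest sum <= M of exactly K elements of a (non-negative values), or -1 if none exists.
    static int maxSum(int[] a, int K, int M) {
        int n = a.length;
        // dp[i][j][k]: some k-element subset of the first i elements sums to j
        boolean[][][] dp = new boolean[n + 1][M + 1][K + 1];
        dp[0][0][0] = true;
        for (int i = 0; i < n; i++) {
            int x = a[i]; // the (i + 1)-th element in the 1-based notation of the text
            for (int j = 0; j <= M; j++)
                for (int k = 0; k <= K; k++) {
                    boolean skip = dp[i][j][k];
                    boolean take = j >= x && k >= 1 && dp[i][j - x][k - 1];
                    dp[i + 1][j][k] = skip || take;
                }
        }
        for (int j = M; j >= 0; j--)
            if (dp[n][j][K])
                return j;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(maxSum(new int[] {1, 3, 5, 6}, 2, 10));
    }
}
```

Because layer i + 1 only reads layer i, an element can never be counted twice in the same subset.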
Stepping back, we've made one assumption here implicitly, which is that A[1...N] are all non-negative. With negative numbers, initializing the second dimension 0...M is not correct. Consider a size K subset made up of a size K - 1 subset with sum exceeding M and one other sufficiently negative element of A[] such that the overall sum no longer exceeds M. Similarly, our size K - 1 subset could sum to some extremely negative number and then, with a sufficiently positive element of A[], sum to M. In order for our algorithm to still work in both cases, we would need to increase the second dimension from M to the difference between the sum of all positive elements in A[] and the sum of all negative elements (the sum of the absolute values of all elements in A[]).
As for whether a non dynamic programming solution exists, certainly there is the naive exponential time brute force solution and variations that optimize the constant factor in the exponent.
Beyond that? Well your problem is closely related to subset sum and the literature for the big name NP complete problems is rather extensive. And as a general principle algorithms can come in all shapes and sizes -- it's not impossible for me to imagine doing say, randomization, approximation, (just choose the error parameter to be sufficiently small!) plain old reductions to other NP complete problems (convert your problem into a giant boolean circuit and run a SAT solver). Yes these are different algorithms. Are they faster than a dynamic programming solution? Some of them, probably. Are they as simple to understand or implement, without say training beyond standard introduction to algorithms material? Probably not.
This is a variant of the Knapsack or subset-problem, where in terms of time (at the cost of exponential growing space requirements as the input size grows), dynamic programming is the most efficient method that CORRECTLY solves this problem. See Is this variant of the subset sum problem easier to solve? for a similar question to yours.
However, since your problem is not exactly the same, I'll provide an explanation anyways. Let dp[i][j] = true, if there is a subset of length i that sums to j and false if there isn't. The idea is that dp[][] will encode the sums of all possible subsets for every possible length. We can then simply find the largest j <= M such that dp[K][j] is true. Our base case dp[0][0] = true because we can always make a subset that sums to 0 by picking one of size 0.
The recurrence is also fairly straightforward. Suppose we've calculated the values of dp[][] using the first n values of the array. To find all possible subsets of the first n+1 values of the array, we can simply take the n+1_th value and add it to all the subsets we've seen before. More concretely, we have the following code:
initialize dp[0..K][0..M] to false
dp[0][0] = true
for i = 0 to N - 1:
    for s = 0 to K - 1:
        for j = M to 0:
            if dp[s][j] && A[i] + j < M:
                dp[s + 1][j + A[i]] = true
for j = M to 0:
    if dp[K][j]:
        print j
        break
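One way to write the two-dimensional table in Java is sketched below. It deliberately iterates the subset size s downward, which is an assumption that departs from the pseudocode: with s going downward, an entry written for the current element is never read again for that same element, so each element is used at most once. The names are illustrative, values are assumed non-negative, and the bound is treated as "at most M".

```java
final class KSubsetDp2D {
    // Largest sum <= M of exactly K elements of a (non-negative values), or -1 if none exists.
    static int maxSum(int[] a, int K, int M) {
        // dp[s][j]: some s-element subset of the elements seen so far sums to j
        boolean[][] dp = new boolean[K + 1][M + 1];
        dp[0][0] = true;
        for (int x : a)
            for (int s = K - 1; s >= 0; s--)     // downward so x is not reused
                for (int j = M - x; j >= 0; j--) // only sums that stay within M
                    if (dp[s][j])
                        dp[s + 1][j + x] = true;
        for (int j = M; j >= 0; j--)
            if (dp[K][j])
                return j;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(maxSum(new int[] {1, 3, 5, 6}, 2, 10));
    }
}
```

With the single element 4, K = 2 and M = 8 this returns -1, since 4 + 4 would require using the same element twice.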
We're looking for a subset of K elements for which the sum of the elements is a maximum, but less than M.
We can place bounds [X, Y] on the largest element in the subset as follows.
First we sort the N integers, values[0] ... values[N-1], so that values[0] is the smallest.
The lower bound X is the largest integer for which
values[X] + values[X-1] + .... + values[X-(K-1)] < M.
(If X is N-1, then we've found the answer.)
The upper bound Y is the largest integer less than N for which
values[0] + values[1] + ... + values[K-2] + values[Y] < M.
With this observation, we can now bound the second-highest term for each value of the highest term Z, where
X <= Z <= Y.
We can use exactly the same method, since the form of the problem is exactly the same. The reduced problem is finding a subset of K-1 elements, taken from values[0] ... values[Z-1], for which the sum of the elements is a maximum, but less than M - values[Z].
Once we've bound that value in the same way, we can put bounds on the third-largest value for each pair of the two highest values. And so on.
This gives us a tree structure to search, hopefully with far fewer combinations to search than N choose K.
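A possible rendering of this bounded search in Java is sketched below. It is one interpretation of the idea, not necessarily the exact procedure intended: for each candidate largest index z it uses prefix sums to compute the largest and smallest sums achievable with values[z] on top, stops as soon as the largest one fits, and skips z when even the smallest one does not. All names, including the NONE sentinel, are assumptions.

```java
import java.util.Arrays;

final class BoundedKSubsetSearch {
    static final long NONE = Long.MIN_VALUE; // sentinel: no valid subset exists

    // Largest sum of exactly K elements that is strictly less than M, or NONE.
    static long maxSumBelow(int[] input, int K, long M) {
        int[] values = input.clone();
        Arrays.sort(values);
        long[] prefix = new long[values.length + 1]; // prefix[i] = values[0] + ... + values[i-1]
        for (int i = 0; i < values.length; i++)
            prefix[i + 1] = prefix[i] + values[i];
        return search(values, prefix, values.length, K, M);
    }

    // Best sum of k elements taken from values[0..end-1] that is < budget.
    private static long search(int[] values, long[] prefix, int end, int k, long budget) {
        if (k == 0)
            return 0 < budget ? 0 : NONE;
        long best = NONE;
        for (int z = end - 1; z >= k - 1; z--) { // z: index of the largest chosen element
            long maxWithZ = prefix[z + 1] - prefix[z + 1 - k]; // the k largest up to z
            if (maxWithZ < budget) {
                best = Math.max(best, maxWithZ);
                break; // smaller z cannot give a larger sum
            }
            long minWithZ = values[z] + prefix[k - 1]; // values[z] plus the k-1 smallest
            if (minWithZ >= budget)
                continue; // values[z] is too large to be part of any valid subset
            long sub = search(values, prefix, z, k - 1, budget - values[z]);
            if (sub != NONE)
                best = Math.max(best, values[z] + sub);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(maxSumBelow(new int[] {1, 3, 5, 6}, 2, 10));
    }
}
```

The worst case is still exponential; the bounds only prune branches that cannot contain a better answer.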
Felix is correct that this is a special case of the knapsack problem. His dynamic programming algorithm takes O(K*M) size and O(K*K*M) amount of time. I believe his use of the variable N really should be K.
There are two books devoted to the knapsack problem. The latest one, by Kellerer, Pferschy and Pisinger [2004, Springer-Verlag, ISBN 3-540-40286-1], gives an improved dynamic programming algorithm on their page 76, Figure 4.2, that takes O(K+M) space and O(KM) time, which is a huge reduction compared to the dynamic programming algorithm given by Felix. Note that there is a typo on the book's last line of the algorithm, where it should be c-bar := c-bar - w_(r(c-bar)).
My C# implementation is below. I cannot say that I have extensively tested it, and I welcome feedback on this. I used BitArray to implement the concept of the sets given in the algorithm in the book. In my code, c is the capacity (which in the original post was called M), and I used w instead of A as the array that holds the weights.
An example of its use is:
int[] optimal_indexes_for_ssp = new SubsetSumProblem(12, new List<int> { 1, 3, 5, 6 }).SolveSubsetSumProblem();
where the array optimal_indexes_for_ssp contains [0,2,3] corresponding to the elements 1, 5, 6.
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class SubsetSumProblem
{
    private int[] w; // weights
    private int c;   // capacity

    public SubsetSumProblem(int c, IEnumerable<int> w)
    {
        if (c < 0) throw new ArgumentOutOfRangeException("Capacity for subset sum problem must be at least 0, but input was: " + c.ToString());
        this.w = w.ToArray();
        this.c = c;
    }

    public int[] SolveSubsetSumProblem()
    {
        int n = w.Length;
        int[] r = new int[c + 1];           // r[k]: index of the item that first reached sum k
        BitArray R = new BitArray(c + 1);   // R[k]: sum k is reachable
        R[0] = true;
        BitArray Rp = new BitArray(c + 1);  // sums reachable by adding item j
        for (int d = 0; d <= c; d++) r[d] = 0;
        for (int j = 0; j < n; j++)
        {
            Rp.SetAll(false);
            for (int k = 0; k <= c; k++)
                if (R[k] && k + w[j] <= c) Rp[k + w[j]] = true;
            for (int k = w[j]; k <= c; k++) // since Rp[k]=false for k<w[j]
                if (Rp[k])
                {
                    if (!R[k]) r[k] = j;
                    R[k] = true;
                }
        }

        // largest reachable sum not exceeding the capacity
        int capacity_used = 0;
        for (int d = c; d >= 0; d--)
            if (R[d])
            {
                capacity_used = d;
                break;
            }

        // walk back through r to recover the chosen indexes
        List<int> result = new List<int>();
        while (capacity_used > 0)
        {
            result.Add(r[capacity_used]);
            capacity_used -= w[r[capacity_used]];
        }
        if (capacity_used < 0) throw new Exception("Subset sum program has an internal logic error");
        return result.ToArray();
    }
}
This was an interview question that I was asked to solve: Given an unsorted array, find 2 numbers and their sum in the array. (That is, find three numbers in the array such that one is the sum of the other two.) Please note, I have seen questions about finding 2 numbers when the sum (int k) is given. However, this question expects you to find both the numbers and the sum in the array. Can it be solved in O(n), O(log n) or O(n log n)?
There is a standard solution of going through each integer and then doing a binary search on it. Is there a better solution?
// requires: import java.util.Arrays;
public static void findNumsAndSum(int[] l) {
    if (l == null || l.length < 3) {
        return;
    }
    // sort the array
    Arrays.sort(l);
    for (int i = 0; i < l.length; i++) {
        for (int j = i + 1; j < l.length; j++) {
            int sum = l[i] + l[j];
            if (l[l.length - 1] < sum) {
                continue;
            }
            // with non-negative values the sum can only sit to the right of j
            if (Arrays.binarySearch(l, j + 1, l.length, sum) >= 0) {
                System.out.println("Found the sum: " + l[i] + "+" + l[j]
                        + "=" + sum);
            }
        }
    }
}
This is very similar to the standard problem 3SUM, which many of the related questions along the right are about.
Your solution is O(n^2 lg n); there are O(n^2) algorithms based on sorting the array, which work with slight modification for this variant. The best known lower bound is Ω(n lg n) (because you can use it to perform a comparison sort, if you're clever about it). If you can find a subquadratic algorithm or a tighter lower bound, you'll get some publications out of it. :)
Note that if you're willing to bound the integers to fall in the range [-u, u], there's a solution for the a + b + c = 0 problem in time O(n + u lg u) using the Fast Fourier Transform. It's not immediately obvious to me how to adjust it to the a + b = c problem, though.
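The O(n^2) sorting-based approach mentioned above could be adapted to the a + b = c variant roughly as follows. This is a sketch under assumptions: the class and method names are made up, the three numbers must come from three different positions, and the first triple found is returned (or null if there is none).

```java
import java.util.Arrays;

final class SumTriple {
    // Returns {a, b, c} with a + b == c taken from three different positions, or null.
    static int[] find(int[] input) {
        int[] v = input.clone();
        Arrays.sort(v);
        for (int k = 0; k < v.length; k++) { // v[k] plays the role of the sum c
            int i = 0, j = v.length - 1;
            while (true) {
                if (i == k) i++; // never reuse position k as an addend
                if (j == k) j--;
                if (i >= j) break;
                long sum = (long) v[i] + v[j];
                if (sum == v[k])
                    return new int[] {v[i], v[j], v[k]};
                if (sum < v[k]) i++; else j--;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(find(new int[] {5, 1, 9, 4})));
    }
}
```

Sorting costs O(n lg n) and each of the n candidate sums gets one linear two-pointer scan, giving O(n^2) overall.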
You can solve it in O(nlog(n)) as follows:
Sort your array in ascending order in O(nlog(n)). You need 2 indices pointing to the left and right ends of your array. Let's call them i and j, i being the left one and j the right one.
Now calculate the sum of array[i] + array[j].
If this sum is greater than k, reduce j by one.
If this sum is smaller than k, increase i by one.
Repeat until the sum equals k.
So with this algorithm you can find the solution in O(nlog(n)), and it is pretty simple to implement.
Sorry. It seems that I didn't read your post carefully enough ;)