Variation on 0/1 Knapsack Algorithm

I'm very new to programming and have been asked to solve a problem for work. Right now we are dealing with a typical 0/1 knapsack problem, in which the benefit/value is maximized given mass and volume constraints.
My task is basically to reverse this and minimize either the volume or the mass given a value constraint. In other words, I want my benefit score to be greater than or equal to a set value, and then see how small I can get the knapsack given that threshold value.
I have tried researching this problem elsewhere and am sure it probably has a formal name, but I have been unable to find it. If anyone has any information I would greatly appreciate it. I am at a bit of a loss as to how to go about solving this type of problem, since you cannot use the same recurrences.

Let's call the weight of item i w(i), and its value v(i). Order the items arbitrarily, and define f(i, j) to be the minimum possible capacity of a knapsack that holds a subset of the first i items totalling at least a value of j.
To calculate f(i, j), we can either include the ith item or not in the knapsack, so
f(i>0, j>0) = min(g(i, j), h(i, j)) # Can include or exclude ith item; pick the best
f(_, 0) = 0 # Don't need any capacity to reach value of 0
f(i<=0, j>0) = infinity # Can't get a positive value with <= 0 items
g(i, j) = f(i-1, j) # Capacity needed if we exclude ith item
h(i, j) = f(i-1, max(0, j-v(i))) + w(i) # Capacity needed if we include ith item
In the last line, max(0, j-v(i)) just makes sure that the second argument in the recursive call to f() does not go negative in the case where v(i) > j.
Memoising this gives a pseudopolynomial O(nc)-time, O(nc)-space algorithm, where n is the number of items and c is the value threshold. You can save space (and possibly time, although not in the asymptotic sense) by calculating it in bottom-up fashion -- this would bring the space complexity down to O(c), since while calculating f(i, ...) you only ever need access to f(i-1, ...), so you only need to keep the previous and current "rows" of the DP matrix.
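For concreteness, here is a minimal bottom-up sketch of this recurrence in Python, keeping only the previous and current rows as described; the function name and the example data are mine:

INF = float("inf")

def min_capacity(w, v, c):
    # prev[j] holds f(i-1, j); row 0 is the base case: value 0 needs no
    # capacity, and any positive value is unreachable with zero items
    prev = [0] + [INF] * c
    for i in range(len(w)):
        cur = [0] * (c + 1)
        for j in range(1, c + 1):
            exclude = prev[j]                        # g(i, j): skip item i
            include = prev[max(0, j - v[i])] + w[i]  # h(i, j): take item i
            cur[j] = min(exclude, include)
        prev = cur
    return prev[c]

print(min_capacity([3, 5, 2], [10, 40, 30], 50))  # 7: take the items worth 40 and 30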

If I understand your question correctly, the problem you wish to solve is of the form:
let mass_i be the mass of item i, let vol_i the volume, and let val_i be its value.
Let x_i be a binary variable, where x_i is one if and only if the item is in the knapsack.
minimize (mass_1 * x_1 + ... + mass_n * x_n) //The case where you are minimizing mass
s.t. mass_1 * x_1 + ... + mass_n * x_n >= MinMass
vol_1 * x_1 + ... + vol_n * x_n >= MinVolume
val_1 * x_1 + ... + val_n * x_n >= MinValue
x_i in {0,1} for all i
A trick you can use to make it more "knapsacky" is to substitute x_i with 1 - y_i, where y_i is one if and only if item i is NOT in the knapsack. Then you get an equivalent problem of the form:
let mass_i be the mass of item i, let vol_i the volume, and let val_i be its value.
Let y_i be a binary variable, where y_i is one if and only if the item is NOT in the knapsack.
maximize mass_1 * y_1 + ... + mass_n * y_n //The case where you are minimizing mass
s.t. mass_1 * y_1 + ... + mass_n * y_n <= mass_1 + ... + mass_n - MinMass
vol_1 * y_1 + ... + vol_n * y_n <= vol_1 + ... + vol_n - MinVolume
val_1 * y_1 + ... + val_n * y_n <= val_1 + ... + val_n - MinValue
y_i in {0,1} for all i
which is a knapsack problem with 3 constraints. The solution y can easily be transformed into an equivalent solution for your original problem by setting x_i = 1 - y_i.
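If you would rather hand a formulation like this straight to a solver, here is a minimal sketch of the original minimization (value constraint only, as in the question) using the PuLP library; the library choice and all of the data are my own assumptions:

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

mass = [3, 5, 2, 8]     # hypothetical item masses
val = [10, 40, 30, 50]  # hypothetical item values
MIN_VALUE = 60          # the benefit threshold

prob = LpProblem("min_mass_knapsack", LpMinimize)
x = [LpVariable(f"x{i}", cat="Binary") for i in range(len(mass))]

prob += lpSum(m * xi for m, xi in zip(mass, x))              # objective: total mass
prob += lpSum(v * xi for v, xi in zip(val, x)) >= MIN_VALUE  # value threshold

prob.solve()
print([i for i, xi in enumerate(x) if value(xi) > 0.5])      # chosen items: [1, 2]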

Related

Find a permutation that minimizes the sum

I have an array of elements [(A1, B1), ..., (An, Bn)] (all are positive floats and Bi <= 1) and I need to find the permutation that minimizes the sum A1 + B1 * A2 + B1 * B2 * A3 + ... + B1 * ... * B(n-1) * An.
Definitely I can just try all of them and select the one which gives the smallest sum (this gives the correct result in O(n!)).
I tried rewriting the sum as A1 + B1 * (A2 + B2 * (A3 + B3 * (... + B(n-1) * An))) and using a greedy algorithm that grabs the biggest Ai element at each step (this does not yield a correct result).
Now when I look at the latter equation, it looks to me that I see the optimal substructure A(n-1) + B(n-1) * An, and therefore I should use dynamic programming, but I cannot figure out the correct direction. Any thoughts?
I think this can be solved in O(N log(N)).
Any permutation can be obtained by swapping pairs of adjacent elements; this is why bubble sort works, for example. So let's take a look at the effect of swapping entries (A[i], B[i]) and (A[i+1], B[i+1]). We want to find out in which cases it's a good idea to make this swap. This has effect only on the ith and i+1th terms, all others stay the same. Also, both before and after the swap, both terms have a factor B[1]*B[2]*...*B[i-1], which we can call C for now. C is a positive number.
Before the swap, the two terms we're dealing with are C*A[i] + C*B[i]*A[i+1], and afterwards they are C*A[i+1] + C*B[i+1]*A[i]. This is an improvement if the difference between the two is positive:
C*(A[i] + B[i]*A[i+1] - A[i+1] - B[i+1]*A[i]) > 0
Since C is positive, we can ignore that factor and look just at the As and Bs. We get
A[i] - B[i+1]*A[i] > A[i+1] - B[i]*A[i+1]
or equivalently
(1 - B[i+1])*A[i] > (1 - B[i])*A[i+1]
Both of these expressions are nonnegative; if one of B[i] or B[i+1] is one, then the term containing 'one minus that variable' is zero (so we should swap if B[i] is one but not if B[i+1] is one); if both variables are one, then both terms are zero. Let's assume for now that neither is equal to one; then we can rewrite further to obtain
A[i]/(1 - B[i]) > A[i+1]/(1 - B[i+1])
So we should compute this expression D[i] := A[i]/(1 - B[i]) for both terms and swap them if the left one is greater than the right one. We can extend this to the case where one or both Bs are one by defining D[i] to be infinitely big in that case.
OK, let's recap - what have we found? If there is a pair i, i+1 where D[i] > D[i+1], we should swap those two entries. That means that the only case where we cannot improve the result by swapping, is when we have reordered the pairs so that the D[i] values are in increasing order -- that is, all the cases with B[i] = 1 come last (recall that that corresponds to D[i] being infinitely large) and otherwise in increasing order of D[i] value. We can achieve that by sorting with respect to the D[i] value. A quick examination of our steps above shows that the order of pairs with equal D[i] value does not impact the final value.
Computing all D[i] values can be done in a single, linear-time pass. Sorting can be done with an O(N log(N)) algorithm (we needed the swapping-of-neighbouring-elements stuff only as an argument/proof to show that this is the optimal solution, not as part of the implementation).
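A direct implementation is short. A minimal sketch, where the weighted_sum helper and the sample pairs are purely illustrative:

def sort_key(pair):
    # D[i] = A[i] / (1 - B[i]); pairs with B[i] == 1 sort last (treated as infinite)
    a, b = pair
    return a / (1 - b) if b < 1 else float("inf")

def weighted_sum(pairs):
    # A1 + B1*A2 + B1*B2*A3 + ...
    total, factor = 0.0, 1.0
    for a, b in pairs:
        total += factor * a
        factor *= b
    return total

pairs = [(3.0, 0.5), (1.0, 0.9), (2.0, 0.1)]
print(weighted_sum(sorted(pairs, key=sort_key)))  # 2.35 for this data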

How can I find a faster algorithm for this special case of Longest Common Sub-sequence (LCS)?

I know the LCS problem needs time ~O(mn), where m and n are the lengths of the two sequences X and Y respectively. But my problem is a little bit easier, so I expect a faster algorithm than ~O(mn).
Here is my problem:
Input:
a positive integer Q and two sequences X = x1, x2, ..., xn and Y = y1, y2, ..., yn, both of length n.
Output:
True, if the length of the LCS of X and Y is at least n - Q;
False, otherwise.
The well-known algorithm costs O(n^2) here, but actually we can do better, because whenever we have to eliminate more than Q elements from either sequence without finding a common element, the answer is False. Someone said there should be an algorithm as good as O(Q*n), but I cannot figure it out.
UPDATE:
Already found an answer!
I was told I can just calculate the diagonal band of the table c[i,j], because |i-j| > Q means there are already more than Q unmatched elements in the two sequences. So we only need to calculate c[i,j] when |i-j| <= Q.
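To illustrate, here is a minimal sketch of that banded computation, assuming both sequences have length n; the sentinel handling for out-of-band cells is my own:

def lcs_at_least(X, Y, Q):
    # True iff LCS(X, Y) >= n - Q; only cells with |i - j| <= Q are computed
    n = len(X)
    if Q >= n:
        return True
    NEG = -(n + 1)  # sentinel for cells outside the band
    prev = [0 if j <= Q else NEG for j in range(n + 1)]
    for i in range(1, n + 1):
        cur = [NEG] * (n + 1)
        if i <= Q:
            cur[0] = 0
        for j in range(max(1, i - Q), min(n, i + Q) + 1):
            if X[i - 1] == Y[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[n] >= n - Q

print(lcs_at_least("abcde", "axcye", 2))  # True: LCS "ace" has length 3 >= 5 - 2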
Here is one possible way to do it:
1. Let's assume that f(prefix_len, deleted_cnt) is the leftmost position in Y such that prefix_len elements of X were already processed and exactly deleted_cnt of them were deleted. Obviously, there are only O(N * Q) states because deleted_cnt cannot exceed Q.
2. The base case is f(0, 0) = 0 (nothing was processed, thus nothing was deleted).
3. Transitions:
a) Remove the current element: f(i + 1, j + 1) = min(f(i + 1, j + 1), f(i, j)).
b) Match the current element with the leftmost possible element from Y that is equal to it and located after f(i, j) (let's assume that it has index pos): f(i + 1, j) = min(f(i + 1, j), pos).
4. So the only question remaining is how to get the leftmost matching element located to the right of a given position. Let's precompute the following pairs: (position in Y, element of X) -> the leftmost occurrence of the element of Y equal to this element of X to the right of this position in Y, and put them into a hash table. It looks like O(n^2), but it is not: for a fixed position in Y, we never need to go further right than Q + 1 positions. Why? If we went further, we would skip more than Q elements! So we can use this fact to examine only O(N * Q) pairs and get the desired time complexity. Once we have this hash table, finding pos during step 3 is just one hash table lookup. Here is pseudocode for this step:
map = EmptyHashMap()
for i = 0 ... n - 1:
    for j = i + 1 ... min(n - 1, i + q + 1):
        map[(i, Y[j])] = min(map[(i, Y[j])], j)
Unfortunately, this solution uses hash tables so it has O(N * Q) time complexity on average, not in the worst case, but it should be feasible.
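Putting the steps together, here is a rough Python sketch (0-indexed, storing f(i, j) as the number of Y elements consumed so far; all names are my own):

def lcs_at_least_q(X, Y, Q):
    # True iff at most Q elements of X can be deleted so the rest match in order
    n = len(X)
    INF = n + 1

    # step 4: next_pos[(p, e)] = leftmost index j >= p with Y[j] == e,
    # looking at most Q positions ahead (skipping more of Y cannot help)
    next_pos = {}
    for p in range(n):
        for j in range(p, min(n, p + Q + 1)):
            next_pos.setdefault((p, Y[j]), j)

    # steps 1-2: f[i][j] = minimum number of Y elements consumed after
    # processing the first i elements of X with exactly j of them deleted
    f = [[INF] * (Q + 1) for _ in range(n + 1)]
    f[0][0] = 0
    for i in range(n):
        for j in range(Q + 1):
            if f[i][j] >= INF:
                continue
            if j + 1 <= Q:  # transition a): delete X[i]
                f[i + 1][j + 1] = min(f[i + 1][j + 1], f[i][j])
            pos = next_pos.get((f[i][j], X[i]))  # transition b): match X[i]
            if pos is not None:
                f[i + 1][j] = min(f[i + 1][j], pos + 1)
    return any(f[n][j] <= n for j in range(Q + 1))

print(lcs_at_least_q("abcde", "axcye", 2))  # True: delete 'b' and 'd'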
You can also view this as an edit distance problem: the cost of making the strings equal must not be greater than Q; if it is greater than Q, the answer must be False.
Suppose the size of string x is m and the size of string y is n. We create a two-dimensional array d[0..m][0..n], where d[i][j] denotes the edit distance between the i-length prefix of x and the j-length prefix of y.
The computation of array d is done using dynamic programming, which uses the following recurrence:
d[i][0] = i , for i <= m
d[0][j] = j , for j <= n
d[i][j] = d[i - 1][j - 1], if x[i] == y[j],
d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + 1), otherwise.
The answer for the LCS, if m > n, is then m - d[m][m - n].
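For reference, here is a direct implementation of that recurrence (plain edit distance, nothing specific to this problem):

def edit_distance(x, y):
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]

print(edit_distance("kitten", "sitting"))  # 3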

Formulating a bilinear optimization program as an integer linear program

In my work, I came across the following problem: Given a similarity matrix D, where $d_{i,j} \in \Re$ represents the similarity between objects $i$ and $j$, I would like to select $k$ objects, for $k \in \{1, \dots, n\}$, in such a way that minimizes the similarity between the selected objects. My first attempt to formally formulate this problem was using the following integer program:
minimize $d_{1,2}X_1X_2 + d_{1,3}X_1X_3 + \dots + d_{1,n}X_1X_n + d_{2,1}X_2X_1 + \dots + d_{n,n-1}X_nX_{n-1}$
such that $X_1 + X_2 + \dots + X_n = k$ and $X_y \in \{0,1\}$, for $y = 1, \dots, n$
In the above program, $X_y$ indicates whether or not object $y$ was selected. Clearly, the above program is not linear. I tried to make the objective function linear by using variables such as $X_{1,2}$, which indicates whether or not both objects 1 and 2 were selected. However, I am struggling to formulate the constraint that exactly $k$ objects must be chosen, i.e., the previous constraint $X_1 + X_2 + \dots + X_n = k$.
Since I am not an expert in mathematical programming, I wonder if you could help me with this.
Thank you in advance!
All the best,
Arthur
You were on the right path, just missing one thing:
Let x_i be 1 if object i is chosen and 0 otherwise.
Let y_ij be 1 if objects i & j are both chosen and 0 otherwise
The IP goes as follows:
minimize
sum d_ij y_ij
s.t.
sum x_i = k
x_i + x_j - 1 <= y_ij for all i < j
x & y binary variables
The strange-looking linking constraint forces y_ij = 1 whenever x_i + x_j = 2, i.e., whenever both objects are chosen; the minimization then keeps y_ij at 0 otherwise.
Only define one y variable for each pair!
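For concreteness, a minimal sketch of this IP using the PuLP library; the library choice and the similarity data are my own assumptions:

import itertools
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

n, k = 4, 2
d = [[0, 5, 2, 4],   # hypothetical symmetric similarity matrix
     [5, 0, 3, 1],
     [2, 3, 0, 6],
     [4, 1, 6, 0]]

prob = LpProblem("min_similarity_selection", LpMinimize)
x = [LpVariable(f"x{i}", cat="Binary") for i in range(n)]
y = {(i, j): LpVariable(f"y{i}_{j}", cat="Binary")
     for i, j in itertools.combinations(range(n), 2)}  # one y per pair i < j

prob += lpSum(d[i][j] * y[i, j] for i, j in y)  # pairwise similarity of chosen objects
prob += lpSum(x) == k                           # choose exactly k objects
for i, j in y:
    prob += x[i] + x[j] - 1 <= y[i, j]          # linking: both chosen forces y_ij = 1

prob.solve()
print([i for i in range(n) if value(x[i]) > 0.5])  # [1, 3] for this data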
Hope this helps

SPOJ: Weighted Sum

You are given N integers, A[1] to A[N]. You have to assign weights to these integers such that their weighted sum is maximized. The weights should satisfy the following conditions :
Each weight should be a positive integer.
W[1] = 1
W[i] should be in the range [2, W[i-1] + 1] for i > 1
Weighted sum is defined as S = A[1] * W[1] + A[2] * W[2] + ... + A[N] * W[N]
e.g.: n = 4, array[] = {1, 2, 3, -4}, answer = 6 when we assign the respective weights {1, 2, 3, 2}.
So, as far as my understanding and research go, no greedy solution is possible for it. I worked through many test cases with pen and paper, but couldn't find a greedy strategy.
Any ideas/hints/approaches, people?
Let dp[i][j] equal the maximum weighted sum we can make from A[1..i] by assigning weight j to A[i]. Clearly dp[i][j] = j*A[i] + max(dp[i - 1][(j - 1)..N]). There are O(N^2) states and our recurrence takes O(N) for each state so the overall time complexity will be O(N^3). To reduce it to O(N^2) we can notice that there is significant overlap in our recurrence.
If dp[i][j] = j * A[i] + max(dp[i - 1][(j - 1)..N]), then
dp[i][j - 1] = (j - 1)*A[i] + max(dp[i - 1][(j - 2)..N])
             = (j - 1)*A[i] + max(dp[i - 1][j - 2], max(dp[i - 1][(j - 1)..N]))
             = (j - 1)*A[i] + max(dp[i - 1][j - 2], dp[i][j] - j*A[i])
Which means the recurrence takes only O(1) to compute, giving you O(N^2) time overall.
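Here is a sketch of the O(N^2) solution in Python; it uses an explicit suffix-maximum array over the previous row, which is equivalent to the overlap observation above, and the example data is from the question:

NEG = float("-inf")

def max_weighted_sum(A):
    n = len(A)
    dp = [NEG] * (n + 2)  # dp[j] = best sum so far when the current weight is j
    dp[1] = A[0]          # W[1] = 1
    for i in range(1, n):
        suf = [NEG] * (n + 2)
        for j in range(n, 0, -1):           # suf[j] = max(dp[j..n])
            suf[j] = max(dp[j], suf[j + 1])
        ndp = [NEG] * (n + 2)
        for j in range(2, i + 2):           # W[i+1] lies in [2, W[i] + 1] <= i + 1
            ndp[j] = j * A[i] + suf[j - 1]  # previous weight must be >= j - 1
        dp = ndp
    return max(dp)

print(max_weighted_sum([1, 2, 3, -4]))  # 6, with weights {1, 2, 3, 2}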
Fairly standard dynamic-programming methods can be used to solve this problem in O(N³) time. Let V(k,u) denote the best value that can be gotten using elements k...N when W[k-1] has the value u. Observe that V(k,u) is the maximum value of g*A[k] + V(k+1,g) as g ranges from 2 to u+1, and that V(N,u) is (u+1)*A[N] if A[N] is positive, else 2*A[N].
Note that u is at most k in any V(k,u) calculation, so there are N*(N-1)/2 possible values of (k,u), so the method as outlined uses O(N²) space and O(N³) time.
Here's a little insight that might enable you or someone else to come up with a really fast solution. Note that for an optimal solution, you can safely assume that at each step either you increase the weight by +1 from the previous weight, or you decrease the weight all the way down to the minimum of 2. To see this, suppose you have an optimal solution that violates the property. Then you have some weight > 2 at some position i-1 and the next weight is also > 2 at position i but not an increase. Now consider the maximal length weakly increasing sub-sequence of weights in the optimal solution starting at position i (weakly increasing means that at each step in the sub-sequence, the weight does not decrease). By assumption, the optimal solution with this sub-sequence is no worse than the same solution except with the sub-sequence having 1 subtracted from all its weights. But this means that increasing all the weights in the sub-sequence by 1 will also not make the optimal solution any worse. Thus for an optimal solution, at each step you can safely assume that either you increase the weight by 1 or you set the weight to the minimum of 2.

Print a polynomial using minimum number of calls

I keep getting these hard interview questions. This one really baffles me.
You're given a function poly that takes and returns an int. It's actually a polynomial with nonnegative integer coefficients, but you don't know what the coefficients are.
You have to write a function that determines the coefficients using as few calls to poly as possible.
My idea is to use recursion, knowing that I can get the constant coefficient from poly(0). So I want to replace poly with (poly - poly(0))/x, but I don't know how to do this in code, since I can only call poly. Anyone have an idea how to do this?
Here's a neat trick.
int N = poly(1)
Now we know that every coefficient in the polynomial is at most N.
int B = poly(N+1)
Now expand B in base N+1 and you have the coefficients.
Attempted explanation: Algebraically, the polynomial is
poly = p_0 + p_1 * x + p_2 * x^2 + ... + p_k * x^k
If you have a number b and expand it in base n, then you get
b = b_0 + b_1 * n + b_2 * n^2 + ...
where each b_i is uniquely determined and b_i < n. Since every coefficient of the polynomial is at most N, and therefore less than N + 1, the base-(N+1) digits of B = poly(N+1) are exactly the coefficients p_0, p_1, ..., p_k.
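A minimal sketch of the whole trick; the hidden polynomial in the example is made up:

def coefficients(poly):
    # N = poly(1) is the sum of all coefficients, so it bounds each one
    N = poly(1)
    if N == 0:
        return [0]   # the zero polynomial; base N + 1 = 1 would not work
    B = poly(N + 1)
    coeffs = []
    while B > 0:     # peel off base-(N+1) digits; these are the coefficients
        coeffs.append(B % (N + 1))
        B //= N + 1
    return coeffs    # coeffs[i] is the coefficient of x^i

hidden = lambda x: 2 + 3 * x + x ** 3  # a hypothetical hidden polynomial
print(coefficients(hidden))             # [2, 3, 0, 1]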
