Can not understand knapsack solutions - algorithm

In wikipedia the algorithm for Knapsack is as follows:
for i from 1 to n do
    for j from 0 to W do
        if j >= w[i] then
            T[i, j] := max(T[i-1, j], T[i-1, j-w[i]] + v[i])
        else
            T[i, j] := T[i-1, j]
        end if
    end for
end for
The same structure appears in every example I found online.
What I cannot understand is how this code accounts for the possibility that the max value comes from a smaller knapsack. E.g. if the knapsack capacity is 8, then perhaps the max value comes from capacity 7 (8 - 1).
I could not find any logic anywhere that considers this possibility. Is this idea wrong?

The Dynamic Programming solution of knapsack is basically recursive:
T(i,j) = max{ T(i-1,j) , T(i-1,j-w[i]) + v[i] }
//            ^          ^
//            |          add the element: your value is increased by v[i]
//            |          and the additional weight you can carry is decreased by w[i]
//            ignore the element
(The else condition is redundant in the recursive form if you set T(i,j) = -infinity for each j < 0).
The idea is exhaustive search: you start from one element and you have two possibilities: add it, or don't.
You check both options and choose the better of the two.
Since this is done recursively, you are effectively checking ALL possible ways to assign the elements to the knapsack.
Note that the solution on Wikipedia is basically a bottom-up evaluation of the same recursive formula.
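The recursive form can be sketched directly in Python. This is only a minimal illustration, assuming 0-based lists for the weights and values; the name knapsack_top_down and the sample numbers are made up for the example. Memoization keeps it from recomputing the same (i, j) pair.

```python
from functools import lru_cache


def knapsack_top_down(weights, values, capacity):
    """Best total value using items 1..n with total weight <= capacity."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def T(i, j):
        if j < 0:
            # Negative remaining capacity: this branch is not allowed.
            return float("-inf")
        if i == 0:
            # No items left to consider.
            return 0
        skip = T(i - 1, j)                                   # ignore item i
        take = T(i - 1, j - weights[i - 1]) + values[i - 1]  # add item i
        return max(skip, take)

    return T(n, capacity)


# A single item of weight 7 in a knapsack of capacity 8: the best packing
# leaves one unit of capacity unused, and the recursion still finds it.
print(knapsack_top_down([7], [10], 8))
```

Because T(i, j) means "at most j" rather than "exactly j", a packing that fills only part of the knapsack is covered without any extra logic.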

As I see it, you have misunderstood the concept of knapsack, so I will describe it here in detail until we reach the code part.
First, there are two versions of the problem:
0-1 knapsack problem: here the items are indivisible, you either take an item or not. It can be solved with dynamic programming. //this is the one you are facing problems with
Fractional knapsack problem: don't worry about this one for now.
You can understand the first problem as follows:
Given a knapsack with maximum capacity W, and a set S consisting of n items.
Each item i has some weight wi and benefit value bi (all wi and W are integer values).
So, how do we pack the knapsack to achieve the maximum total value of packed items?
In mathematical terms: maximize SUM(bi*xi) subject to SUM(wi*xi) <= W, where xi is 1 if item i is packed and 0 otherwise.
To solve this problem using dynamic programming, we set up a table V[0..n, 0..W] with one row for each available item (plus row 0 for the empty set of items), and one column for each weight from 0 to W.
We need to carefully identify the sub-problems.
The sub-problem is to compute V[k,w], i.e., to find an optimal solution for
Sk = {items labeled 1, 2, .. k} in a knapsack of size w (maximum value achievable given capacity w and items 1,…, k).
So, we arrive at this formula to solve our problem:
V[0,w] = 0
V[k,w] = V[k-1,w] if wk > w
V[k,w] = max(V[k-1,w], V[k-1,w-wk] + bk) otherwise
This algorithm only finds the max possible value that can be carried in the knapsack,
i.e., the value in V[n,W].
Finding the items that make up this maximum value is another topic.
I really hope this answer helps. I have a PowerPoint presentation that walks through filling the table and shows the algorithm step by step, but I don't know how to upload it to Stack Overflow. Let me know if you need any more help.
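Filling the table V can be sketched in Python as follows. This is a rough illustration only: the function name knapsack_table and the sample items are assumptions made for the example, and the lists are 0-based, so item k lives at index k-1.

```python
def knapsack_table(weights, benefits, W):
    """Return the full table V, where V[k][w] is the best value achievable
    with items 1..k in a knapsack of capacity w."""
    n = len(weights)
    # Row 0 (no items) is all zeros.
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        wk, bk = weights[k - 1], benefits[k - 1]
        for w in range(W + 1):
            V[k][w] = V[k - 1][w]              # item k is left out
            if wk <= w:                        # item k fits
                V[k][w] = max(V[k][w], V[k - 1][w - wk] + bk)
    return V


V = knapsack_table([2, 3, 4, 5], [3, 4, 5, 6], 5)
for row in V:
    print(row)
```

Since V[k][w] is the best value for capacity at most w, every row is non-decreasing from left to right. That is why a packing which uses only 7 units of an 8-unit knapsack is already counted in column 8, and no separate check of smaller knapsacks is needed.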

Related

Variant of Knapsack

I'm working on a program to solve a variant of the 0/1 Knapsack problem.
The original problem is described here: https://en.wikipedia.org/wiki/Knapsack_problem.
In case the link goes missing in the future, here is a summary of the 0/1 Knapsack problem (if you are familiar with it, skip this paragraph):
Let's say we have n items, each with weight wi and value vi. We want to put items in a bag, that supports a maximum weight W, so that the total value inside the bag is the maximum possible without overweighting the bag. Items cannot have multiple instances (i.e., we only have one of each). The objective of the problem is to maximize SUM(vi.xi) so that SUM(wi.xi) <= W and xi = 0, 1 (xi represents the state of an item being or not in the bag).
For my case, there are small differences in both conditions and objective:
The weight of all items is 1, wi = 1, i = 1...n
I always want to put exactly half the items in the bag. So, the maximum weight capacity of the bag is half (rounded up) of the number of items. W = ceil[n/2] or W = floor[(n+1)/2].
Also, the weight inside the bag must be equal to its maximum capacity: SUM(wi.xi) = W
Finally, instead of maximizing the value of the items inside the bag, the objective is that the value of the items inside is as close as possible to the value of the items outside. Hence, my objective is to minimize |SUM(vi.xi) - SUM[vi(1-xi)]|, which simplifies into minimize |SUM[vi(2xi - 1)]|.
Now, there is a pseudo-code for the original 0/1 Knapsack problem in the Wikipedia page above (you can find it on the bottom of this text), but I am having trouble adapting it to my scenario. Can someone help? (I am not asking for code, just for an idea, so language is irrelevant)
Thanks!
Wikipedia's pseudo-code for 0/1 Knapsack problem:
Assume w1, w2, ..., wn, W are strictly positive integers. Define
m[i,w] to be the maximum value that can be attained with weight less
than or equal to w using items up to i (first i items).
We can define m[i,w] recursively as follows:
m[0, w]=0
m[i, w] = m[i-1, w] if wi > w (the new item is more than the current weight limit)
m[i, w]= max(m[i-1, w], m[i-1, w-wi] + vi) if wi <= w.
The solution can then be found by calculating m[n,W].
// Input:
// Values (stored in array v)
// Weights (stored in array w)
// Number of distinct items (n)
// Knapsack capacity (W)
for j from 0 to W do:
    m[0, j] := 0
for i from 1 to n do:
    for j from 0 to W do:
        if w[i-1] <= j then:
            m[i, j] := max(m[i-1, j], m[i-1, j-w[i-1]] + v[i-1])
        else:
            m[i, j] := m[i-1, j]
Thanks to @harold, it seems that this problem is not a Knapsack problem but a Partition problem. Part of the pseudo-code I was seeking is on the corresponding Wikipedia page: https://en.wikipedia.org/wiki/Partition_problem
EDIT: well, actually, Partition problem algorithms tell you whether a set of items can be partitioned into 2 sets of equal value or not. If it can't, there are approximation algorithms, which say whether the set can be partitioned into 2 sets with the difference of their values being lower than d.
BUT, they don't tell you the resulting subsets, and that's what I was seeking.
I ended up finding a question here asking for that (here: Balanced partition), with a code example which I have tested and which works fine.
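For illustration, here is one way the variant described in the question could be sketched in Python. This is not the code from the linked Balanced partition thread; the function name balanced_half_partition and the sample data are assumptions. It records every reachable (item count, sum) pair together with one subset that produces it, then picks the subset of exactly ceil(n/2) items whose sum is closest to half the total.

```python
def balanced_half_partition(values):
    """Split indices into `inside` (exactly ceil(n/2) items) and `outside`
    so that the two value sums are as close as possible."""
    n = len(values)
    half = (n + 1) // 2
    total = sum(values)
    # reachable[(count, s)] = indices of one subset with `count` items summing to s
    reachable = {(0, 0): ()}
    for idx, v in enumerate(values):
        # Iterate over a snapshot so that each item is used at most once.
        for (count, s), chosen in list(reachable.items()):
            if count < half:
                key = (count + 1, s + v)
                if key not in reachable:
                    reachable[key] = chosen + (idx,)
    # Among subsets of exactly `half` items, |inside - outside| = |total - 2*s|.
    best_sum, inside = min(
        ((s, chosen) for (count, s), chosen in reachable.items() if count == half),
        key=lambda pair: abs(total - 2 * pair[0]),
    )
    outside = [i for i in range(n) if i not in inside]
    return list(inside), outside, abs(total - 2 * best_sum)


inside, outside, diff = balanced_half_partition([3, 1, 4, 2, 2])
print(inside, outside, diff)
```

The number of stored states is bounded by (number of items) x (number of distinct sums), so this stays pseudo-polynomial, but storing a subset per state costs memory; a parent-pointer table would be the leaner choice for large inputs.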

Partitioning a list of integers to minimize difference of their sums

Given a list of integers l, how can I partition it into 2 lists a and b such that d(a,b) = abs(sum(a) - sum(b)) is minimum. I know the problem is NP-complete, so I am looking for a pseudo-polynomial time algorithm i.e. O(c*n) where c = sum(l map abs). I looked at Wikipedia but the algorithm there is to partition it into exact halves which is a special case of what I am looking for...
EDIT:
To clarify, I am looking for the exact partitions a and b and not just the resulting minimum difference d(a, b)
To generalize, what is a pseudo-polynomial time algorithm to partition a list of n numbers into k groups g1, g2 ...gk such that (max(S) - min(S)).abs is as small as possible where S = [sum(g1), sum(g2), ... sum(gk)]
A naive, trivial and still pseudo-polynomial solution would be to use the existing solution to subset-sum, and repeat it for each target from sum(array)/2 down to 0 (returning the first subset found).
The complexity of this solution is O(W^2*n), where W is the sum of the array.
pseudo code:
for cand from sum(array)/2 to 0 descending:
    subset <- subsetSumSolver(array,cand)
    if subset != null:
        return subset
The above will return the maximal subset that is lower/equals sum(array)/2, and the other part is the complement for the returned subset.
However, a single run of the dynamic programming for subset-sum is enough.
Recall that the formula is:
f(0,i) = true
f(x,0) = false | x != 0
f(x,i) = f(x-arr[i],i-1) OR f(x,i-1)
When you build the matrix for x = sum(array)/2, you also fill in f for every target lower than the initial x, so all the candidate values are computed in one pass.
After you generate the DP matrix, just find the maximal value of x such that f(x,n)=true, and this is the best partition you can get.
The complexity in this case is O(Wn).
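A minimal Python sketch of this single-pass approach, including the walk back through the matrix to recover the two lists. It assumes the integers are non-negative (negative numbers would need an offset on the sums); the name min_difference_partition is made up for the example, and the table is indexed f[i][x] rather than f(x,i).

```python
def min_difference_partition(arr):
    """Split arr (non-negative integers) into two lists whose sums are as
    close as possible, using the subset-sum table."""
    total = sum(arr)
    target = total // 2
    n = len(arr)
    # f[i][x] is True if some subset of the first i items sums to exactly x.
    f = [[False] * (target + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        f[i][0] = True
    for i in range(1, n + 1):
        a = arr[i - 1]
        for x in range(1, target + 1):
            f[i][x] = f[i - 1][x] or (x >= a and f[i - 1][x - a])
    # Largest reachable sum that does not exceed half of the total.
    best = max(x for x in range(target + 1) if f[n][x])
    # Walk back through the table to recover which items form that sum.
    part_a, part_b = [], []
    x = best
    for i in range(n, 0, -1):
        if f[i - 1][x]:
            part_b.append(arr[i - 1])   # item i was not needed
        else:
            part_a.append(arr[i - 1])   # item i must be part of the sum
            x -= arr[i - 1]
    return part_a, part_b


a, b = min_difference_partition([1, 6, 11, 5])
print(a, b, abs(sum(a) - sum(b)))
```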
You can phrase this as a 0/1 integer linear programming optimization problem. Let wi be the ith number, and let xi be a 0/1 variable which indicates whether wi is in the first set or not. Then you want to minimize sum(xi wi) - sum((1 - xi) wi) subject to
sum(xi wi) >= sum((1 - xi) wi)
and also subject to all xi being 0 or 1. There has been a lot of research into optimizing 0/1 linear programming solvers. For large total sum W this may be an improvement over the O(W n) pseudo-polynomial time algorithm presented because the W factor is scary.
My first thought is to:
Sort list of integers
Create two empty lists A and B
While iterating from biggest integer to smallest integer...add next integer to the list with the smallest current sum.
This is, of course, not guaranteed to give you the best result, but (for non-negative integers) the difference between the two sums is bounded by the biggest integer in your list.
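The greedy idea above fits in a few lines of Python. This is a sketch only; the name greedy_partition and the sample list are assumptions made for the example.

```python
def greedy_partition(nums):
    """Largest-first greedy split: each number goes to the list whose
    current sum is smaller. Fast, but not always optimal."""
    a, b = [], []
    sum_a = sum_b = 0
    for x in sorted(nums, reverse=True):
        if sum_a <= sum_b:
            a.append(x)
            sum_a += x
        else:
            b.append(x)
            sum_b += x
    return a, b


a, b = greedy_partition([8, 7, 6, 5, 4])
print(a, sum(a), b, sum(b))
```

On this input the greedy split gives sums 17 and 13, while the split [8, 7] and [6, 5, 4] gives 15 and 15, which shows that the heuristic can miss the optimum even though its difference stays within the largest element.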

Computing Combinations

I am facing difficulty in coming up with a solution for the problem given below:
We are given n boxes B_1, ..., B_n. Each box has a weight, meaning that every ball in box B_i has weight C_i.
Each box contains some balls, specifically
{b1,b2,b3...,b_n} (b_i is the count of balls in box B_i).
We have to choose m balls such that the sum of the weights of the m chosen balls is less than a given number T.
How many ways are there to do it?
First, let's look at a similar problem:
If you were looking to maximize the sum (such that it is still smaller than T), you would be facing a variation of the subset-sum problem, which is NP-Hard. The variation with a constant number of items is discussed in this thread: Sum-subset with a fixed subset size.
An alternative way to look at the problem is as a 2-dimensional knapsack problem, where weight = cost, with an extra dimension for the number of elements. This concept is discussed in this thread: What's the fastest way to solve knapsack prob with two properties
Now, look at your problem: finding the number of possible ways to achieve a sum which is smaller than or equal to T is still NP-Hard.
Assume you had a polynomial algorithm to do it, let it be A.
Running A(T) and A(T-1) gives you two numbers. If A(T) > A(T-1), some selection sums to exactly T, so the answer to the subset-sum problem is true; otherwise it is false. So given a polynomial solution to this problem, we could prove P=NP.
You can solve it by using dynamic programming techniques.
Let f[i][j][k] denote the number of ways to choose j balls from B_1 to B_i with sum of weights exactly k. Since the sum only has to stay below the bound, the answer you want is the sum of f[n][m][k] over all k < T (use k <= T if the bound is inclusive).
Initially, let f[0][0][0] = 1 and f[i][j][k] = 0 for all other i,j,k
for i = 1 to n
    for j = 0 to m
        for k = 0 to T
            for x = 0 to min(b_i,j) # choose x balls from B_i
                y = x * C_i
                if y <= k
                    f[i][j][k] = f[i][j][k] + f[i-1][j-x][k-y] * Comb(b_i,x)
Comb(n,k) is the number of ways to choose k elements from n elements.
The time complexity is O(n m T b) where b is the maximum number of balls in a box.
Note that, because of the T in the big-O notation, this is a pseudo-polynomial algorithm: its running time is not polynomial in the size of the input. However, in practice, when T is relatively small, this algorithm is still feasible.
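A compact Python sketch of this counting DP, under a few assumptions that are not stated in the question: the weights are positive integers, balls within a box are distinguishable (hence the Comb factor), and the bound is strict (sum < T). The name count_selections is made up for the example. Only the previous layer of f is kept, since layer i depends on layer i-1 alone.

```python
from math import comb


def count_selections(counts, weights, m, T):
    """Number of ways to pick m balls with total weight strictly below T.
    counts[i] is the number of balls in box i, weights[i] their weight."""
    if T <= 0:
        return 0
    # f[j][k]: ways to pick j balls with total weight exactly k (k in 0..T-1)
    f = [[0] * T for _ in range(m + 1)]
    f[0][0] = 1
    for b_i, c_i in zip(counts, weights):
        g = [[0] * T for _ in range(m + 1)]
        for j in range(m + 1):
            for k in range(T):
                for x in range(min(b_i, j) + 1):   # take x balls from this box
                    y = x * c_i
                    if y > k:
                        break                      # larger x only weighs more
                    g[j][k] += f[j - x][k - y] * comb(b_i, x)
        f = g
    return sum(f[m])


# Two balls of weight 1 in the first box, one ball of weight 2 in the second.
print(count_selections([2, 1], [1, 2], 2, 4))
```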

how to solve this (selecting intervals)

I'm given some intervals I = {I(1), I(2), ..., I(m)} where I(i) = [a_i, b_i] (1<=a_i<=b_i<=n). You may assume that any two intervals are either disjoint or one covers the other, so there are no partially overlapping pairs such as {[1,5], [3,6]} or {[2,5], [5,7]}. Also, {[1,1], [2,2], ..., [n,n]} must be included in I.
Let C(i) = b_i - a_i + 1.
I want to find {I(c_1), I(c_2), ..., I(c_k)} that do not overlap each other, with C(c_1) + C(c_2) + ... + C(c_k) = T. (1 <= T <= n).
I found an O(n*T) DP solution using the Subset Sum problem, and I think it's NP, but I'm not sure. Can I do better than O(n*T)?
The Subset Sum problem (given a set of numbers and a target number, find out if there is a subset that sums to this target) reduces to this problem with a simple reduction:
Given an instance of subset-sum: S={c_1,c_2,..,c_n},T - create an instance of this problem by creating n non-overlapping intervals, where interval i contains c_i points (easy to do by laying them out in ascending order). The same T remains.
Now, the answer to the subset-sum problem is true if and only if there is a subset of intervals whose sizes sum to T. It is basically the same problem, since by construction none of the intervals overlap each other.
From this we can conclude - your problem is NP-Hard.
Moreover, if we could solve the problem better than O(T*n), we could use the same approach to solve the subset sum problem better than O(T*n) (see notes 1 and 2).
However, AFAIK, the best pseudo-polynomial solution to subset sum is O(T*n), so if you have such a solution - stick with it.
(1) Converting the problem is O(n)
(2) This claim is true for this specific reduction alone, and NOT for the general case of polynomial reductions.
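The construction used in the reduction can be sketched in Python as below. The function name subset_sum_to_intervals is an assumption for the example, and the numbers are assumed to be positive integers.

```python
def subset_sum_to_intervals(sizes):
    """Lay out one interval per number, left to right, so that interval i
    contains exactly sizes[i] points and no two intervals overlap."""
    intervals = []
    start = 1
    for c in sizes:
        intervals.append((start, start + c - 1))   # b - a + 1 == c
        start += c
    return intervals


print(subset_sum_to_intervals([3, 1, 4]))
```

The conversion is a single pass over the numbers, which is the O(n) cost mentioned in note (1).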

Which algorithm can be used to solve this variation of the partition-prob?

This is the problem:
You have two arrays A and B, of equal length. You have to partition their elements into two groups P and Q such that:
(1) The difference between the sums of P and Q is minimized.
(2) If A[i] goes into one of P or Q, B[i] must go into the other.
Here is a link to the actual problem: http://opc.iarcs.org.in/index.php/problems/EQGIFTS
This is my logic (to solve the actual problem):
if the input is : a b c d e f g, a list of values and the index of
a,b,c,d,e,f is 0,1,2,3,4,5 respectively
if t is a index of a,b,c,d,e,f,g the program checks for t and i such
that: the value at [t] > value at [t-i] , beginning with t = 5, and i
= 1, and increasing the value of i by 1 and decreasing the value of t
by 1.
as soon as it finds a match, it swaps the values of both the indices
and sorts the values beginning from [t-1].
the resulting list of values is the output.
I don't know what is wrong with this algorithm, but it produces a wrong answer for all the test cases.
I know it can be solved using dynamic programming, and that it is a variation of the partition problem. But I don't know how to change the partition algorithm to solve this problem.
Reduce the problem to partition problem:
Create a third array D[i] = B[i] - A[i] for each i.
Now the problem is a classic partition problem on the array D, and you can use its DP solution to have a pseudo-polynomial time solution.
Correctness Proof:
If there is a solution on D (sum(D_1) = sum(D_2)), then there are indices i_1,...,i_k assigned to D_1 and j_1,...,j_m assigned to D_2 (and each index is among the i's or the j's), such that:
sum(D[i's]) = sum(D[j's])
From the construction, it means:
sum(B[i]-A[i]) = sum(B[j]-A[j]) (for each relevant i's,j's)
and thus:
sum(B[i's]) - sum(A[i's]) = sum (B[j's]) - sum(A[j's])
From this:
sum(B[i's]) + sum(A[j's]) = sum(B[j's]) + sum(A[i's])
which is exactly what we wanted: each index contributes to both parts, with one part getting B and the other getting A.
The other direction is similar.
QED
Complexity of the problem:
The problem is still NP-Hard with the simple reduction:
Given an instance of Partition Problem (S=[a_1,a_2,...,a_n]), create the instance of this problem:
A=S, B=[0,...,0]
It is easy to see that an optimal solution to this instance gives the needed partition for the original partition problem, and thus the problem is NP-Hard.
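To make the reduction concrete, here is a small Python sketch that computes the minimum achievable difference through the array D. It is a simplified set-based variant of the partition DP, not code from the linked problem, and the name min_gift_difference is made up. If P receives B[i] and Q receives A[i], index i adds +D[i] to sum(P) - sum(Q); with the roles swapped it adds -D[i].

```python
def min_gift_difference(A, B):
    """Smallest possible |sum(P) - sum(Q)| when, for every index i,
    one of A[i], B[i] goes to P and the other goes to Q."""
    D = [b - a for a, b in zip(A, B)]
    # All values that sum(P) - sum(Q) can take after processing a prefix.
    reachable = {0}
    for d in D:
        reachable = {s + d for s in reachable} | {s - d for s in reachable}
    return min(abs(s) for s in reachable)


print(min_gift_difference([1, 2], [3, 5]))
```

The set never holds more than 2*sum(|D[i]|) + 1 values, so the running time is pseudo-polynomial, in line with the DP for the partition problem.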

Resources