Solving fractional knapsack problem with dynamic programming - algorithm

A few days ago, I was reading about greedy algorithms and dynamic programming for the fractional knapsack problem, and I saw that this problem can be solved optimally with the greedy method. Can anyone give an example or a solution to solve this problem with the dynamic programming method?
P.S: I know that the greedy method is the best way to solve this question, but I want to know how dynamic programming works for this issue.

Yes, you can solve the problem with dynamic programming.
Let f(i, j) denote the maximum total value that can be obtained using the first i elements using a knapsack whose capacity is j.
If you are familiar with the 0-1 knapsack problem, then you may remember that we had the exact same function. However, the recurrence for the 0-1 knapsack problem was f(i, j) = max{f(i - 1, j), V[i] + f(i - 1, j - W[i])} (the first argument considers the case in which we don't take the item at index i, and the second argument considers the case in which we do take the item at index i).
In the fractional knapsack problem, we are allowed to take fractional amounts of an item. Thus, our recurrence would look something like f(i, j) = max{f(i - 1, j), delta * V[i] + f(i - 1, j - delta * W[i])}, taken over all possible values of delta, where delta represents the fraction of item i that we take.
Now, if you step delta through sufficiently small increments, you should get the correct answer.
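For concreteness, here is a minimal Python sketch of that idea, assuming integer weights and capacity and a fixed number of steps for delta (the function name frac_knapsack_dp and the steps parameter are just for illustration, not from the question). Because the remaining capacity is rounded down to an integer index, the result is a lower bound that approaches the true optimum as the grid gets finer:

def frac_knapsack_dp(values, weights, capacity, steps=10):
    # f[i][j]: best value using the first i items with integer capacity j
    n = len(values)
    f = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(capacity + 1):
            best = f[i - 1][j]                    # take none of item i
            for s in range(1, steps + 1):
                delta = s / steps                 # fraction of item i taken
                w = delta * weights[i - 1]
                if w <= j:
                    # remaining capacity is floored here, so this underestimates slightly
                    best = max(best, delta * values[i - 1] + f[i - 1][int(j - w)])
            f[i][j] = best
    return f[n][capacity]

For example, frac_knapsack_dp([60, 100, 120], [10, 20, 30], 50, steps=30) recovers the greedy optimum of 240 up to floating-point rounding, since delta = 20/30 lies on the grid; with a coarser grid such as steps=10 the result falls slightly short.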

Related

Ordered Knapsack Problem Correctness/Proof

A thief is given the choice of n objects to steal, but only has one knapsack with a capacity of M weight. Each object i has weight w_i and profit p_i. Suppose he also knows the following: the order of these items when sorted by increasing weight is the same as their order when sorted by decreasing value. Give a greedy algorithm to find an optimal solution to this variant of the knapsack problem. Prove its correctness and running time.
So the greedy algorithm I came up with was to sort the items by increasing weight, which is also decreasing value. This means that the profit per weight is in decreasing order. So the thief keeps taking the highest-valued remaining item (the next one in sorted order) until taking another would push the weight over M. The running time would be O(n log n), since sorting takes O(n log n) and iterating through the list takes O(n). The part I am stuck on is the proof of correctness. Here is my proof so far:
Suppose there is an instance such that the solution stated above (referred to as GA) is not optimal. Let the optimal solution be referred to as OS, and let the items taken by OS be sorted in increasing value. Since OS is more optimal than GA, the profit earned from GA is less than or equal to the profit earned from OS. Since GA takes the item with the highest profit/weight ratio, the first element, i, must be greater than or equal to the first element of OS. Because OS is more optimal, there must exist an item i that is greater than or equal to an item j in the set of GA. But because GA and OS are done on the same set, and GA always takes the item with the highest profit/weight, there cannot be an item i in OS that is greater than an item j in GA.
Can anyone help with the proof? Thanks
Your approach to the solution is valid and the reasoning on the running time is correct. In the sequel, suppose that the input is "de-trivialized" in the sense that every occurring object actually fits into the knapsack and that it is impossible to select the entire input.
The ordering of the items generated by the sort is both
decreasing in value
increasing in weight
which makes it a special case of the general knapsack problem. The argumentation for the proof of correctness is as follows. Let i' denote the breaking index, which is the index of the first item in the sorted sequence that is rejected by the greedy algorithm. For clarity, call the corresponding object the breaking object. Note that
w_j > w_i' for each j > i'
holds, which means that the greedy algorithm also rejects every object succeeding the breaking object (as it does not fit into the knapsack, just like the breaking object).
In total, the greedy algorithm selects a prefix of the sorted sequence; we aim at showing that any optimal solution (which we consider fixed in the sequel) is the same prefix.
Note that the optimal solution, as it is optimal, does not leave space for an additional object.
Aiming at a contradiction, let k be the minimal index which occurs in the greedy solution but not in the optimal solution. As it is impossible to select object k additionally into the optimal solution, there must (via minimality of k) be some item in the optimal solution with an index
k' > k
which permits an exchange of items in the optimal solution. As
w_k < w_k' and p_k > p_k'
hold, object k' can be replaced by object k in the optimal solution, which yields a solution with larger profit than the optimal solution, contradicting its optimality.
Hence, there is no item in the greedy solution which is missing in the optimal solution, which means that the greedy solution is a subset of the optimal solution. On the other hand, the greedy solution is maximal with respect to inclusion, which means that the optimal solution cannot contain an item which is missing in the greedy solution.
Note that the greedy algorithm is also useful for the general knapsack problem; taking the better of the greedy solution and an item with maximum profit yields an approximation algorithm with ratio 2.
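A minimal Python sketch of the greedy prefix selection argued about above (the function name and the (weight, profit) pair representation are my own choices, not from the question):

def ordered_knapsack_greedy(items, capacity):
    # items: list of (weight, profit) pairs; in this variant, sorting by
    # increasing weight also sorts by decreasing profit
    items = sorted(items, key=lambda wp: wp[0])
    taken, total_weight, total_profit = [], 0, 0
    for weight, profit in items:
        if total_weight + weight > capacity:
            break          # every later item is at least as heavy, so stop
        taken.append((weight, profit))
        total_weight += weight
        total_profit += profit
    return taken, total_profit

Sorting dominates, which gives the O(n log n) running time claimed in the question.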

Solving knapsack with fractional knapsack approach

There are two well-known knapsack problems:
1) Given n items, each with a weight and a cost. We need to select items that will fit in our knapsack and have maximal total cost. It can be easily solved using dynamic programming.
2) Fractional knapsack: same as the first, but we may take a fraction of an item rather than only the whole item. This problem can be easily solved with a greedy algorithm.
Imagine we are using the greedy algorithm from the second problem to solve the first one. How can I prove that the solution we get is no more than two times worse than the optimal one?
As far as I can see, the greedy solution can be as inefficient as you want.
Imagine that you have a knapsack with capacity 1 and two (n = 2) items:
item  weight  cost  density
---------------------------
A     ε       ε     1        <- greedy choice
B     1       1-ε   1-ε      <- optimal choice
so the greedy algorithm takes A with cost ε when the optimal solution is to take B with cost 1-ε. The chosen (greedy) solution is
(1-ε)/ε = 1/ε - 1
times worse than the optimal one. Make ε as small as you want (say, ε = 1e-100) and you get an arbitrarily inefficient greedy solution.
Edit: in case of integer values only, just scale the sample above: you have a knapsack with capacity X and two (n = 2) items
item  weight  cost  density
---------------------------
A     1       1     1        <- greedy choice
B     X       X-1   1-1/X    <- optimal choice
in this case the greedy solution is
(X - 1) / 1 = X - 1
times worse than the optimal one. Finally, make X large enough (say, X = 1e100).
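A small Python sketch reproducing the integer example (the function name is mine; the greedy here takes whole items in order of decreasing cost/weight while they still fit):

def greedy_by_density(items, capacity):
    # items: list of (weight, cost) pairs
    items = sorted(items, key=lambda wc: wc[1] / wc[0], reverse=True)
    total_weight = total_cost = 0
    for weight, cost in items:
        if total_weight + weight <= capacity:
            total_weight += weight
            total_cost += cost
    return total_cost

X = 1000
items = [(1, 1), (X, X - 1)]            # items A and B from the table above
print(greedy_by_density(items, X))      # prints 1, while the optimum is X - 1 = 999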

Is this Dynamic Programming Solution to Text Justification just Brute Force?

I'm having trouble understanding the dynamic programming solution to the text justification problem as specified in the MIT OpenCourseWare lecture here. Some notes from that lecture are here, and page 3 of the notes is what I am referring to.
I thought that Dynamic Programming meant you memoize some of the computations so that you don't need to recompute, thus saving you time, but in the algorithm given in the lecture, I don't see any use of memoization, just a whole bunch of deep recursive calls, i.e. the main function is this:
DP[i] = min(badness (i, j) + DP[j] for j in range (i + 1, n + 1))
DP[n] = 0
where badness is a function that determines the amount of unused space after subtracting the length of the words from the line length. To me it looks like this algorithm calculates all possible "badness" values and chooses the smallest one, which seems like brute force to me. Where is the advantage Dynamic Programming usually gives us by memoizing past calculations so we don't have to recompute?
If you memoize the results, you don't have to compute each DP[i] several times.
That is, DP[0] "calls" DP[2] for example, but so does DP[1]. In the second time DP[2] is called, it won't be necessary to compute it again, you can just return the memoized value.
This also makes it easy to verify a polynomial upper bound for this algorithm. Since each DP[i] will perform O(n) operations, and there are n of them, the overall algorithm is O(n^2), assuming, of course, that badness(i, j) is O(1).
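To make the memoization explicit, here is a Python sketch of that recurrence; the cubic badness below is only a plausible stand-in for the lecture's exact definition, and the prefix sums keep each badness(i, j) call O(1) so the O(n^2) bound applies:

from functools import lru_cache

def justify(words, line_width):
    n = len(words)
    prefix = [0]                         # prefix sums of word lengths, for O(1) badness
    for w in words:
        prefix.append(prefix[-1] + len(w))

    def badness(i, j):
        # total length of words[i:j] plus the single spaces between them
        total = (prefix[j] - prefix[i]) + (j - i - 1)
        if total > line_width:
            return float("inf")
        return (line_width - total) ** 3

    @lru_cache(maxsize=None)             # each dp(i) is computed only once
    def dp(i):
        if i == n:
            return 0                     # DP[n] = 0
        return min(badness(i, j) + dp(j) for j in range(i + 1, n + 1))

    return dp(0)

Here dp(j) is requested by every dp(i) with i < j, but it is computed only on the first call; every later request just returns the cached value.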

Finding maximum subsequence below or equal to a certain value

I'm learning dynamic programming and I've been having a great deal of trouble understanding more complex problems. When given a problem, I've been taught to find a recursive algorithm, memoize it, and then create an iterative, bottom-up version. At almost every step I have an issue. For the recursive algorithm, I can write it in different ways, but often only one is suitable for dynamic programming, and I can't distinguish what aspects of a recursive algorithm make memoization easier. For memoization, I don't understand which values to use for indices. For the conversion to a bottom-up version, I can't figure out which order to fill the array/double array.
This is what I understand:
- it should be possible to split the main problem to subproblems
In terms of the problem mentioned, I've come up with a recursive algorithm that has these important lines of code:
int optionOne = values[i] + find(values, i+1, limit - values[i]);  // take item i and reduce the remaining limit
int optionTwo = find(values, i+1, limit);                          // skip item i
If I'm unclear or this is not the correct Q&A site, let me know.
Edit:
Example: Given array x: [4,5,6,9,11] and max value m: 20
The maximum subsequence in x under or equal to m would be [4, 5, 11], as 4 + 5 + 11 = 20.
I think this problem is NP-hard, meaning that unless P = NP there isn't a polynomial-time algorithm for solving the problem.
There's a simple reduction from the subset-sum problem to this problem. In subset-sum, you're given a set of n numbers and a target number k and want to determine whether there's a subset of those numbers that adds up to exactly k. You can solve subset-sum with a solver for your problem as follows: create an array of the numbers in the set and find the largest subsequence whose sum is less than or equal to k. If that adds up to exactly k, the set has a subset that adds up to k. Otherwise, it does not.
This reduction takes polynomial time, so because subset-sum is NP-hard, your problem is NP-hard as well. Therefore, I doubt there's a polynomial-time algorithm.
That said - there is a pseudopolynomial-time algorithm for subset-sum, which is described on Wikipedia. This algorithm uses DP in two variables and isn't strictly polynomial time, but it will probably work in your case.
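Here is a small Python sketch of that pseudopolynomial idea applied directly to the question's problem, assuming non-negative integer values (the function name is mine):

def max_sum_at_most(values, limit):
    # reachable[s] is True if some subsequence of `values` sums to exactly s
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for v in values:
        # iterate downward so each value is used at most once (0-1 style)
        for s in range(limit, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    return max(s for s in range(limit + 1) if reachable[s])

print(max_sum_at_most([4, 5, 6, 9, 11], 20))   # prints 20, e.g. 4 + 5 + 11

The running time is O(n * limit), which is polynomial in the numeric value of limit but not in its bit length, which is exactly why this does not contradict the NP-hardness argument above.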
Hope this helps!

Dynamic programming aspect in Kadane's algorithm

def kadane(a):
    max_so_far = 0
    max_ending_here = 0
    for x in a:                               # loop for each element of the array
        max_ending_here = max_ending_here + x # (a) extend the running sum
        if max_ending_here < 0:               # (b) a negative running sum never helps
            max_ending_here = 0
        if max_so_far < max_ending_here:      # (c) record the best seen so far
            max_so_far = max_ending_here
    return max_so_far
Can anyone help me understand the optimal substructure and overlapping subproblems (the bread and butter of DP) in the above algorithm?
According to this definition of overlapping subproblems, the recursive formulation of Kadane's algorithm (f[i] = max(f[i - 1] + a[i], a[i])) does not exhibit this property. Each subproblem would only be computed once in a naive recursive implementation.
It does however exhibit optimal substructure according to its definition here: we use the solution to smaller subproblems in order to find the solution to our given problem (f[i] uses f[i - 1]).
Consider the dynamic programming definition here:
In mathematics, computer science, and economics, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems and optimal substructure (described below). When applicable, the method takes far less time than naive methods that don't take advantage of the subproblem overlap (like depth-first search).
The idea behind dynamic programming is quite simple. In general, to solve a given problem, we need to solve different parts of the problem (subproblems), then combine the solutions of the subproblems to reach an overall solution. Often when using a more naive method, many of the subproblems are generated and solved many times. The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations.
This leaves room for interpretation as to whether or not Kadane's algorithm can be considered a DP algorithm: it does solve the problem by breaking it down into easier subproblems, but its core recursive approach does not generate overlapping subproblems, which is what DP is meant to handle efficiently - so this would put it outside DP's specialty.
On the other hand, you could say that it is not necessary for the basic recursive approach to lead to overlapping subproblems, but this would make any recursive algorithm a DP algorithm, which would give DP a much too broad scope in my opinion. I am not aware of anything in the literature that definitively settles this, however, so I wouldn't mark down a student or dismiss a book or article either way they labeled it.
So I would say that it is not a DP algorithm, just a greedy and / or recursive one, depending on the implementation. I would label it as greedy from an algorithmic point of view for the reasons listed above, but objectively I would consider other interpretations just as valid.
Note that I derived my explanation from this answer. It demonstrates how Kadane’s algorithm can be seen as a DP algorithm which has overlapping subproblems.
Identifying subproblems and recurrence relations
Imagine we have an array a from which we want to get the maximum subarray. To determine the max subarray that ends at index i the following recursive relation holds:
max_subarray_to(i) = max(max_subarray_to(i - 1) + a[i], a[i])
In order to get the maximum subarray of a we need to compute max_subarray_to() for each index i in a and then take the max() from it:
max_subarray = max( for i=1 to n max_subarray_to(i) )
Example
Now, let's assume we have an array [10, -12, 11, 9] from which we want to get the maximum subarray. This would be the work required running Kadane's algorithm:
result = max(max_subarray_to(0), max_subarray_to(1), max_subarray_to(2), max_subarray_to(3))
max_subarray_to(0) = 10 # base case
max_subarray_to(1) = max(max_subarray_to(0) + (-12), -12)
max_subarray_to(2) = max(max_subarray_to(1) + 11, 11)
max_subarray_to(3) = max(max_subarray_to(2) + 9, 9)
As you can see, max_subarray_to() is evaluated twice for each i apart from the last index 3, thus showing that Kadane's algorithm does have overlapping subproblems.
Kadane's algorithm is usually implemented using a bottom-up DP approach to take advantage of the overlapping subproblems and to compute each subproblem only once, hence making it O(n).
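A short Python sketch of the recursive formulation above, which makes the overlap visible; with the lru_cache line removed, the outer max() re-derives the same calls exactly as in the worked example, and with it each index is solved once (this is my own illustration, not code from the linked answer):

from functools import lru_cache

def max_subarray(a):
    @lru_cache(maxsize=None)
    def to(i):
        # max_subarray_to(i) = max(max_subarray_to(i - 1) + a[i], a[i])
        if i == 0:
            return a[0]
        return max(to(i - 1) + a[i], a[i])
    return max(to(i) for i in range(len(a)))

print(max_subarray([10, -12, 11, 9]))    # prints 20, the sum of the subarray [11, 9]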
