A thief is given the choice of n objects to steal, but has only one knapsack with a weight capacity of M. Each object i has weight w_i and profit p_i. Suppose he also knows the following: the order of these items when sorted by increasing weight is the same as their order when sorted by decreasing value. Give a greedy algorithm to find an optimal solution to this variant of the knapsack problem. Prove its correctness and analyze its running time.
The greedy algorithm I came up with is to sort the items by increasing weight, which is also decreasing value; this means the profit per weight is in decreasing order. The thief then takes items in that order for as long as they fit, i.e. until the next item would push the total weight over M. The running time is O(n log n), since sorting takes O(n log n) and iterating through the list takes O(n). The part I am stuck on is the proof of correctness. Here is my proof so far:
Suppose there is an instance for which the solution stated above (referred to as GA) is not optimal. Let the optimal solution be referred to as OS, and let the items taken by OS be sorted in increasing value. Since OS is better than GA, the profit earned from GA is less than or equal to the profit earned from OS. Since GA takes the item with the highest profit/weight ratio, its first element i must be greater than or equal to the first element of OS. Because OS is better, there must exist an item i in OS that is greater than or equal to some item j in the set of GA. But because GA and OS are chosen from the same set, and GA always takes the item with the highest profit/weight ratio, there cannot be an item i in OS that is greater than an item j in GA.
Can anyone help with the proof? Thanks
Your approach to the solution is valid and the reasoning on the running time is correct. In the sequel, suppose that the input is "de-trivialized" in the sense that every occurring object actually fits into the knapsack and that it is impossible to select the entire input.
The order of the items generated by the sorting is both
decreasing in value
increasing in weight
which makes this a special case of the general knapsack problem. The argument for the proof of correctness is as follows. Let i' denote the breaking index, i.e. the index of the first item in the sorted sequence which is rejected by the greedy algorithm. For clarity, call the corresponding object the breaking object. Note that
w_j > w_i' for each j > i'
holds, which means that the greedy algorithm also rejects every object succeeding the breaking object (as it does not fit into the knapsack, just like the breaking object).
In total, the greedy algorithm selects a prefix of the sorted sequence; we aim to show that any optimal solution (which we consider fixed in the sequel) is the same prefix.
Note that the optimal solution, as it is optimal, does not leave space for an additional object.
Aiming at a contradiction, let k be the minimal index which occurs in the greedy solution but not in the optimal solution. As it is impossible to additionally select object k into the optimal solution, there must (by the minimality of k) be some item in the optimal solution with an index
k' > k
which permits an exchange of items in the optimal solution. As
w_k < w_k' and p_k > p_k'
hold, object k' can be replaced by object k in the optimal solution (the total weight can only decrease, so the exchange stays feasible), which yields a feasible solution with larger profit than the optimal solution, contradicting its optimality.
Hence, there is no item in the greedy solution which is missing in the optimal solution, which means that the greedy solution is a subset of the optimal solution. On the other hand, the greedy solution is maximal with respect to inclusion, which means that the optimal solution cannot contain an item which is missing in the greedy solution.
Note that the greedy algorithm is also useful for the general knapsack problem; taking the better of the greedy solution and a single item with maximum profit yields an approximation algorithm with ratio 2.
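For reference, here is a minimal sketch of the greedy algorithm discussed above (the parallel-array representation and the names are my own illustration, not part of the original question):

import java.util.Arrays;

// Sketch of the greedy algorithm: sort by increasing weight (equivalently, by
// decreasing value) and take items from the front for as long as they still fit.
final class SpecialKnapsackGreedy {
    static double greedyProfit(double[] weight, double[] profit, double capacity) {
        int n = weight.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // increasing weight; by the problem's premise this is also decreasing value
        Arrays.sort(order, (a, b) -> Double.compare(weight[a], weight[b]));

        double remaining = capacity, total = 0;
        for (int i : order) {
            if (weight[i] > remaining) break; // breaking object: all later items are heavier
            remaining -= weight[i];           // greedy therefore keeps a prefix of the sorted order
            total += profit[i];
        }
        return total;
    }
}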
Related
1) Suppose we have the standard 0-1 knapsack problem. Given a set of n items numbered from 1 up to n, each with a weight w_i and a value v_i, along with a maximum weight capacity W, we need to select some of the objects so as to maximize the sum of v_i, such that the sum of w_i of the selected objects does not exceed the given number W.
maximize ∑(v_i * x_i), such that ∑(w_i * x_i) ≤ W
2) Now suppose we have the same problem, but we need to select objects so that the sum of their values is minimal, and the sum of their weights is not less than the given number.
minimize ∑(v_i * x_i), such that ∑(w_i * x_i) ≥ W.
Knowing that the first problem is NP-complete, how can I prove that the second one has the same complexity, in other words that it is NP-complete too?
If you want to prove that problem B is NP-complete, you have to show that there exists a polynomial-time reduction from A to B, where A is a known NP-complete problem (and, strictly speaking, that B itself is in NP).
A polynomial-time reduction from a problem A to a problem B is an algorithm that solves problem A using a polynomial number of calls to a subroutine for problem B, and polynomial time outside of those subroutine calls.
So in your case, you can easily give a polynomial-time reduction from the knapsack problem to the inverse knapsack problem.
These two problems are equivalent (finding an optimal solution to one immediately solves the other).
Let S be the set of objects, M be the sum of the weights of the objects of S, and W the capacity of the knapsack.
Then, we have:
(i) finding a subset of objects such that the sum of their weights does not exceed W and the sum of their values is maximal
is equivalent to
(ii) finding a subset of objects such that the sum of their weights is at least M-W and the sum of their values is minimal.
That is because if S' is an optimal solution of (i), then S\S' is an optimal solution of (ii) (and vice-versa).
This is a polynomial time reduction (O(1) calls to the subroutine, a polynomial number of operations outside them), so the inverse knapsack is indeed NP-complete.
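To make the single subroutine call explicit, here is a small sketch of this complement argument (the interface and names are hypothetical; any solver for problem (ii) could be plugged in):

// Hypothetical solver for problem (ii): returns a selection minimising the total value
// subject to the total weight being at least `threshold`.
interface InverseKnapsackSolver {
    boolean[] solve(long[] values, long[] weights, long threshold);
}

final class KnapsackViaInverseKnapsack {
    // Solves problem (i) (maximise value, weight <= capacity) with one call to a
    // solver for problem (ii), as described above: take the complement S \ S'.
    static boolean[] maximise(long[] values, long[] weights, long capacity,
                              InverseKnapsackSolver inverse) {
        long totalWeight = 0;
        for (long w : weights) totalWeight += w;          // M in the text
        boolean[] removed = inverse.solve(values, weights, totalWeight - capacity);
        boolean[] kept = new boolean[values.length];
        for (int i = 0; i < kept.length; i++) kept[i] = !removed[i];
        return kept;
    }
}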
The key idea seems to be to exchange value and weight and use binary search on the second problem to construct the reduction.
Given an instance I of the first formulation with values v_i and weights w_i, construct an instance of the second problem by exchanging the roles of profits and weights: the new values are the original weights and the new weights are the original values. Calling a solver for the second problem with threshold P then yields the minimum total original weight of a selection whose original profit is at least P. The candidate targets P range over sums of the original values, which are bounded by
n * v_max
where v_max is the maximum value. This number itself may be exponential in the encoding length of the input; however, we can use binary search over P to determine the maximum attainable profit for which the required weight does not exceed the capacity W. This can be done in
log( n * v_max )
iterations, a number which is polynomially bounded in the encoding size of the input, using the same number of calls to an algorithm for the second problem. The described algorithm is the polynomial reduction from the first problem to the second problem.
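A sketch of that reduction (illustrative names; the exhaustive helper only stands in for an arbitrary solver of the second problem so the example is self-contained):

final class MaxKnapsackViaMinKnapsack {
    // Problem 2 used as a black box: minimise the total value subject to the total
    // weight being at least `threshold`. Exhaustive, for tiny inputs only.
    static long minValueWithWeightAtLeast(long[] values, long[] weights, long threshold) {
        int n = values.length;
        long best = Long.MAX_VALUE;
        for (int mask = 0; mask < (1 << n); mask++) {
            long v = 0, w = 0;
            for (int i = 0; i < n; i++)
                if ((mask & (1 << i)) != 0) { v += values[i]; w += weights[i]; }
            if (w >= threshold) best = Math.min(best, v);
        }
        return best;
    }

    // Problem 1 (maximise value, weight <= capacity) via binary search over the target
    // profit P, calling the problem 2 solver with values and weights exchanged.
    static long maxProfit(long[] values, long[] weights, long capacity) {
        long lo = 0, hi = 0;
        for (long v : values) hi += v;                    // upper bound on the profit
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            // minimum original weight needed to reach an original profit of at least mid
            long neededWeight = minValueWithWeightAtLeast(weights, values, mid);
            if (neededWeight <= capacity) lo = mid; else hi = mid - 1;
        }
        return lo;
    }
}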
The inverse knapsack is one of my favorites. While I have never explicitly proven that it is NP-complete, I do know how to reformulate it in terms of the knapsack problem itself, which should do the trick:
Rather than adding objects to an empty bag, consider the problem of choosing objects to be removed from a bag that is full. Since the total weight of what remains cannot be less than the given minimum, the objects we remove may have total weight at most (total weight - minimum weight).
Since the value of what remains is to be minimized, the value of the removed objects must be maximized.
We are left with the original knapsack problem: select a group of items (to be removed) such that their total value is maximized and their total weight does not exceed (total weight - minimum weight). (At the end, we take the items we did NOT remove as the solution.)
We have reformulated the problem as exactly the original knapsack problem, therefore it must be NP-complete as well.
The beauty of this method is that I personally have no idea why knapsack is NP-complete; I just showed that the inverse knapsack and knapsack problems are perfectly equivalent.
Given a universe of elements U = {e_1, ..., e_n}, I have a collection of subsets of these elements C = {s_1, ..., s_m}. Now, given a positive integer k, I want to find a set of k elements which covers a maximal number of the subsets.
A concrete example: I have a collection of songs. Each song is composed of notes. If I only know how to play k distinct notes - which k notes would allow me to play the maximal number of songs, and what is this maximal number?
What is this problem called?
Brute force approach:
First find all distinct combinations of k elements out of the n.
Then, for every combination, count the number of subsets it covers.
And remember: a subset only counts as covered if every one of its elements is among the chosen k; choosing just some of its elements does not cover it.
Then pick the combination which gives the maximum answer.
But the brute force approach only works when k is less than about 10; a sketch is given below.
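A rough sketch of this brute force (illustrative names; elements are assumed to be numbered 0..n-1, and a set counts as covered only when all of its elements are chosen):

import java.util.*;

// Try every combination of k elements out of {0, ..., n-1} and count how many of
// the given sets are fully contained in the chosen elements.
final class MaxFullCoverageBruteForce {
    static int maxCovered(List<Set<Integer>> sets, int n, int k) {
        return search(sets, n, k, 0, 0, new int[k]);
    }

    private static int search(List<Set<Integer>> sets, int n, int k,
                              int pos, int start, int[] choice) {
        if (pos == k) {                                   // one complete k-combination
            Set<Integer> chosen = new HashSet<>();
            for (int e : choice) chosen.add(e);
            int covered = 0;
            for (Set<Integer> s : sets)
                if (chosen.containsAll(s)) covered++;     // counts only if fully covered
            return covered;
        }
        int best = 0;
        for (int e = start; e < n; e++) {                 // extend in increasing element order
            choice[pos] = e;
            best = Math.max(best, search(sets, n, k, pos + 1, e + 1, choice));
        }
        return best;
    }
}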
As the number of combinations grows exponentially and no better approach than this is known, the problem is NP-hard. It can be shown that your problem reduces to the vertex cover problem.
Consider the subsets as trees and the elements as nodes.
Now your problem is to select k elements such that the maximum number of trees is fully covered.
I want to know whether there exists an algorithm that calculates "all possible combinations" of a sorted list (floats and duplicates allowed) that reach a target sum and, if no combination equals the target sum, returns "all possible combinations" for the nearest sum below the target (lower bound), in polynomial or pseudo-polynomial time.
I checked the Balsub algorithm from "Linear Time Algorithms for Knapsack Problems with Bounded Weights" and also "A Faster Pseudopolynomial Time Algorithm for Subset Sum", but I'm not sure whether those problems are the same as mine with regard to time complexity.
This is an example:
Sorted List: {1.5, 2.25, 3.75, 3.81}
Target = 3.79
Results: {1.5, 2.25}, {3.75} = 3.75
Thanks
Not that I know of.
The idea of the usual pseudopolynomial solution for subset sum with small integers is that, while there is a very large number of combinations, there are relatively few sums to consider. So I can store, for each reachable subset sum, a list of the last index and value by which that sum was reached. I can then find my target answer and walk the data structure backwards to list the intermediate subset sums and index/value pairs that led to the final target. That gives a data structure representing a finite state machine that produces all possible answers. We can walk it forward with dynamic programming to produce one answer, or a count of answers, or recursively enumerate it to give all answers. (Knowing that the list of all answers is usually very long.)
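Here is a compact sketch of that bookkeeping for the integer case (illustrative names; it assumes small non-negative integers). For every reachable sum it records which items can be the last step towards that sum, then walks the structure backwards to enumerate every index subset hitting the target:

import java.util.*;

final class SubsetSumEnumerator {
    static List<List<Integer>> allSubsetsWithSum(int[] a, int target) {
        List<Set<Integer>> lastItem = new ArrayList<>();
        for (int s = 0; s <= target; s++) lastItem.add(new HashSet<>());
        boolean[] reachable = new boolean[target + 1];
        reachable[0] = true;                              // the empty subset
        for (int i = 0; i < a.length; i++) {
            for (int s = target; s >= a[i]; s--) {        // downwards: each item used at most once
                if (reachable[s - a[i]]) {
                    reachable[s] = true;
                    lastItem.get(s).add(i);               // item i can be the last step towards sum s
                }
            }
        }
        List<List<Integer>> out = new ArrayList<>();
        if (reachable[target]) walk(a, target, a.length, new ArrayDeque<>(), lastItem, out);
        return out;
    }

    // Enumerate subsets of items with indices < bound summing to s, largest index first.
    private static void walk(int[] a, int s, int bound, Deque<Integer> partial,
                             List<Set<Integer>> lastItem, List<List<Integer>> out) {
        if (s == 0) { out.add(new ArrayList<>(partial)); return; }
        for (int i : lastItem.get(s)) {
            if (i < bound) {
                partial.push(i);
                walk(a, s - a[i], i, partial, lastItem, out);
                partial.pop();
            }
        }
    }
}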
The problem with floating point is that now there are a very large number of subsets AND a very large number of intermediate sums. That trick doesn't work. You can round numbers off into buckets, and produce approximate answers that are close to your target. But they will be approximate, and the right answer remains a needle in a haystack.
Sorry.
I'm learning dynamic programming and I've been having a great deal of trouble understanding more complex problems. When given a problem, I've been taught to find a recursive algorithm, memoize it, and then create an iterative, bottom-up version. I have an issue at almost every step. For the recursive algorithm, I can write it in several different ways, but often only one of them is suitable for dynamic programming, and I can't distinguish what aspects of a recursive algorithm make memoization easier. For memoization, I don't understand which values to use as indices. For the conversion to a bottom-up version, I can't figure out in which order to fill the array/double array.
This is what I understand:
- it should be possible to split the main problem into subproblems
In terms of the problem mentioned, I've come up with a recursive algorithm that has these important lines of code:
int optionOne = values[i] + find(values, i+1, limit - values[i]); // take item i and reduce the remaining limit
int optionTwo = find(values, i+1, limit);                         // skip item i
If I'm unclear or this is not the correct qa site, let me know.
Edit:
Example: Given array x: [4,5,6,9,11] and max value m: 20
The maximum subsequence of x with sum under or equal to m would be [4,5,11], as 4+5+11 = 20
I think this problem is NP-hard, meaning that unless P = NP there isn't a polynomial-time algorithm for solving the problem.
There's a simple reduction from the subset-sum problem to this problem. In subset-sum, you're given a set of n numbers and a target number k and want to determine whether there's a subset of those numbers that adds up to exactly k. You can solve subset-sum with a solver for your problem as follows: create an array of the numbers in the set and find the largest subsequence whose sum is less than or equal to k. If that adds up to exactly k, the set has a subset that adds up to k. Otherwise, it does not.
This reduction takes polynomial time, so because subset-sum is NP-hard, your problem is NP-hard as well. Therefore, I doubt there's a polynomial-time algorithm.
That said - there is a pseudopolynomial-time algorithm for subset-sum, which is described on Wikipedia. This algorithm uses DP in two variables and isn't strictly polynomial time, but it will probably work in your case.
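For the integer example in the edit, a sketch of such a pseudopolynomial DP might look like this (illustrative names; it assumes non-negative integer values):

// reachable[s] is true if some subset of the values sums to exactly s; the answer is
// the largest reachable s that does not exceed the limit m.
final class MaxSubsetSumUpToLimit {
    static int maxSumAtMost(int[] values, int limit) {
        boolean[] reachable = new boolean[limit + 1];
        reachable[0] = true;                      // the empty subset
        for (int v : values) {
            for (int s = limit; s >= v; s--) {    // downwards, so each value is used at most once
                if (reachable[s - v]) reachable[s] = true;
            }
        }
        for (int s = limit; s > 0; s--) {
            if (reachable[s]) return s;
        }
        return 0;
    }

    public static void main(String[] args) {
        // Example from the question: [4, 5, 6, 9, 11] with m = 20 gives 20 (e.g. 4 + 5 + 11).
        // If the result equals the target exactly, the corresponding subset-sum instance
        // is a yes-instance, which is the reduction described in the answer above.
        System.out.println(maxSumAtMost(new int[] {4, 5, 6, 9, 11}, 20));
    }
}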
Hope this helps!
Suppose I have a finite set of numeric values of size n.
Question: Is there an efficient algorithm for enumerating the k-combinations of that set so that combination I precedes combination J iff the sum of the elements in I is less than or equal to the sum of the elements in J?
Clearly it's possible to simply enumerate the combinations and sort them according to their sums. If the set is large, however, brute enumeration of all combinations, let alone sorting, will be infeasible. If I'm only interested in obtaining the first m << choose(n,k) combinations ranked by sum, is it possible to obtain them before the heat death of the universe?
There is no polynomial algorithm for enumerating the set this way (unless P=NP).
If there were such an algorithm (call it A), then we could solve the subset sum problem polynomially:
1) Run A.
2) Do a binary search to find the subset that sums closest to the desired number.
Note that step 1 runs polynomially (assumption) and step 2 runs in O(log(2^n)) = O(n).
Conclusion: Since the subset sum problem is NP-complete, solving this problem efficiently would prove P=NP; thus there is no known polynomial solution to the problem.
Edit: Even though the problem is NP-hard, getting the "smallest" m subsets can be done in O(n + 2^m) by selecting the smallest m elements, generating all the subsets of these m elements, and choosing the m of those with the smallest sums. So for fairly small values of m it might be feasible to calculate.
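A sketch of that shortcut (illustrative; it assumes non-negative numbers and small m, and simply follows the procedure described above):

import java.util.*;

// Keep only the m smallest elements, enumerate all of their subsets, and return the
// m subsets with the smallest sums.
final class SmallestSubsetsBySum {
    static List<List<Double>> smallestM(double[] values, int m) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int t = Math.min(m, sorted.length);               // only sensible for small m (2^t subsets)
        List<List<Double>> subsets = new ArrayList<>();
        for (int mask = 0; mask < (1 << t); mask++) {
            List<Double> subset = new ArrayList<>();
            for (int i = 0; i < t; i++)
                if ((mask & (1 << i)) != 0) subset.add(sorted[i]);
            subsets.add(subset);
        }
        subsets.sort(Comparator.comparingDouble(
                (List<Double> s) -> s.stream().mapToDouble(Double::doubleValue).sum()));
        return new ArrayList<>(subsets.subList(0, Math.min(m, subsets.size())));
    }
}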