Counterexamples for the 0-1 knapsack problem with two knapsacks - algorithm

I came across the following question in this course:
Consider a variation of the Knapsack problem where we have two
knapsacks, with integer capacities W1 and W2. As usual, we are given
n items with positive values and positive integer weights. We want to
pick subsets S1, S2 with maximum total value such that the total weights of S1 and S2 are at most W1 and W2, respectively. Assume that every item fits in either knapsack. Consider the following two algorithmic approaches.
(1) Use the algorithm from lecture to pick a max-value feasible solution S1 for the first knapsack, and then run it again on the remaining items to pick a max-value feasible solution S2 for the second knapsack.
(2) Use the algorithm from lecture to pick a max-value feasible solution for a knapsack with capacity W1 + W2, and then split the chosen items into two sets S1, S2 whose total weights are at most W1 and W2, respectively.
Which of the following statements is true?
1. Algorithm (1) is guaranteed to produce an optimal feasible solution to the original problem provided W1 = W2.
2. Algorithm (1) is guaranteed to produce an optimal feasible solution to the original problem, but algorithm (2) is not.
3. Algorithm (2) is guaranteed to produce an optimal feasible solution to the original problem, but algorithm (1) is not.
4. Neither algorithm is guaranteed to produce an optimal feasible solution to the original problem.
The "algorithm from lecture" is on YouTube. https://www.youtube.com/watch?v=KX_6OF8X6HQ, which is 0-1 knapsack problem for one bag.
The correct answer to this question is option 4. This, this and this post present solutions to the problem. However, I'm having a hard time finding counterexamples showing that options 1 through 3 are incorrect. Can you cite any?
Edit:
The accepted answer doesn't provide a counterexample for option 1; see "2 knapsacks with same capacity - Why can't we just find the max-value twice" for that.

(Weight; Value): (3;10), (3;10), (4;2)
Capacities: 7, 3
The first method puts 3+3 into the first sack (value 20), and the remaining item of weight 4 does not fit into the second sack. The optimum is 22: put one weight-3 item and the weight-4 item into the first sack, and the other weight-3 item into the second.
(Weight; Value): (4;10), (4;10), (4;10), (2;1)
Capacities: 6, 6
The second method chooses 4+4+4 (value 30) for the combined capacity of 12, but this set cannot fit into the two sacks without dropping an item, while (4+2) and (4), with total value 21, is better.
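Both counterexamples can be verified mechanically. Here is a small brute-force sketch in Python (exponential, which is fine for three or four items; the function name is mine) that computes the true two-knapsack optimum by assigning every item to sack 1, sack 2, or neither:

    from itertools import product

    def two_knapsack_opt(items, cap1, cap2):
        # items is a list of (weight, value) pairs; try every assignment
        best = 0
        for assign in product((0, 1, 2), repeat=len(items)):  # 0 = leave out
            w1 = sum(w for (w, v), a in zip(items, assign) if a == 1)
            w2 = sum(w for (w, v), a in zip(items, assign) if a == 2)
            if w1 <= cap1 and w2 <= cap2:
                value = sum(v for (w, v), a in zip(items, assign) if a > 0)
                best = max(best, value)
        return best

    print(two_knapsack_opt([(3, 10), (3, 10), (4, 2)], 7, 3))           # 22, but method (1) gets 20
    print(two_knapsack_opt([(4, 10), (4, 10), (4, 10), (2, 1)], 6, 6))  # 21, but method (2)'s pick yields 20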

Related

Suggestions for fragment proposal algorithm

I'm currently trying to solve the following problem, but am unsure which algorithm I should be using. It's in the area of mass identification.
I have a series of "weights", w_i, which can sum up to a total weight. The as-measured total weight has an error associated with it, and so is inexact.
I need to find, given the total weight T, the closest k possible combinations of weights that can sum up to the total, where k is an input from the user. Each weight can be used multiple times.
Now, this sounds suspiciously like the bounded-integer multiple knapsack problem; however:
it is possible to go over the weight, and
I also want all of the solutions, ranked by error.
I can probably solve it using multiple sweeps of the knapsack problem, from (weight - error) to (weight + error), stepping in small increments; however, if the increment is too large, it is possible to miss certain weight combinations.
The number of weights is usually small (4 to 10), and the ratio of the total weight to the mean weight is usually around 2 or 3.
Does anyone know the names of an algorithm that might be suitable here?
Your problem effectively resembles the knapsack problem, which is an NP-complete problem.
For a really limited number of weights, you could run over every combination with repetition, followed by a sort; that gives a rather high number of operations: (n + k - 1)! / ((n - 1)! · k!) combinations of size k, plus an m·log(m) sort of the m results.
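A sketch of that enumerate-and-sort route in Python (the function name and the crude size bound are my own; it relies on the small weight counts mentioned in the question):

    from itertools import combinations_with_replacement
    import heapq

    def k_closest_sums(weights, total, k):
        # Enumerate every multiset of weights (repetition allowed) up to a
        # size bound, then keep the k sums closest to the target total.
        max_items = int(total // min(weights)) + 1  # crude bound on multiset size
        candidates = []
        for r in range(1, max_items + 1):
            for combo in combinations_with_replacement(weights, r):
                candidates.append((abs(sum(combo) - total), combo))
        return heapq.nsmallest(k, candidates)  # (error, combination) pairs

    print(k_closest_sums([3.1, 5.2, 7.4], 15.0, 3))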
Solving this kind of problem in a reasonable amount of time is often done with evolutionary algorithms nowadays.
If you take the following example from deap, an evolutionary algorithm framework in Python:
ga_knapsack.py, you will see that by modifying lines 58-59, which automatically discard an overweight solution, into something smoother (a linear penalty, for instance), it will give you solutions close to the optimal one in less time than brute force. The solutions are already sorted for you at the end, as you requested.
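The gist of that modification, independent of any particular framework, is to replace a hard "discard if overweight" rule with a smooth penalty so the search can climb back from infeasible solutions. A hypothetical sketch of such a fitness function (all names and the penalty rate are mine):

    def fitness(individual, weights, values, capacity, penalty_rate=10.0):
        # individual is a 0/1 vector selecting items; overweight solutions
        # are penalised linearly instead of being discarded outright.
        total_w = sum(w for w, bit in zip(weights, individual) if bit)
        total_v = sum(v for v, bit in zip(values, individual) if bit)
        if total_w > capacity:
            return total_v - penalty_rate * (total_w - capacity)
        return total_v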
As a first attempt I'd go for constraint programming (but then I almost always do, so take the suggestion with a pinch of salt):
Given W = w_1, ..., w_n for the weights and E = e_1, ..., e_n for the errors (you can also make the errors asymmetric), and the target T:
Find all sets S (or lists, if the weights are not unique) such that the sum (w_1 + e_1) + ... + (w_k + e_k), where each w_j ∈ W and each e_j ∈ E, is within some delta of T. You can derive delta from k, or just set it to some reasonably large value and decrease it as you solve the constraints.
I just realised that you also want to parametrise the expression w_n op e_m over op ∈ {+, -} (any combination of weights and error terms), and off the top of my head I don't know which constraint solver would allow you to do that. In any case, you can always fall back to Prolog. It may not fly, especially if you have a lot of weights, but it will give you solutions quickly.

What is an algorithm to split a group of items into 3 separate groups fairly?

I have this problem in my textbook:
Given a group of n items, each with a distinct value V(i), what is the best way to divide the items into 3 groups so that the group with the highest value is minimized? Give the value of this largest group.
I know how to do the 2-pile variant of this problem: it just requires running the knapsack algorithm backwards on the problem. However, I am pretty puzzled as to how to solve this one. Could anyone give me any pointers?
Answer: Pretty much the same thing as the 0-1 knapsack, although 2D
Tough homework problem. This is essentially the optimization version of the 3-partition problem.
http://en.wikipedia.org/wiki/3-partition_problem
It is closely related to bin packing, partition, and subset-sum (and, as you noted, knapsack). However, it happens to be strongly NP-complete, which makes it harder than its cousins. Anyway, I suggest you start by looking at dynamic programming solutions to the related problems (I'd start with partition, but find a non-Wikipedia explanation of the DP solution).
Update: I apologize; I have misled you. The 3-partition problem splits the input into sets of 3, not into 3 sets. The rest of what I said still applies, but with the renewed hope that your variant isn't strongly NP-complete.
Let f[i][j][k] denote whether it is possible, using the first i items, to have total value j in the first group and total value k in the second group (the remaining items form the third group).
Then f[i][j][k] = f[i-1][j][k] or f[i-1][j-v[i]][k] or f[i-1][j][k-v[i]],
and initially we have f[0][0][0] = True.
For every f[n][j][k] = True, update your answer depending on how you define "fairly"; here, minimize max(j, k, total - j - k).
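A sketch of this DP in Python, tracking the set of reachable (j, k) value pairs instead of a full 3-D boolean array (the names are mine):

    def min_max_group(values):
        # Split values into 3 groups, minimising the largest group's total.
        # Each reachable pair (j, k) means: some assignment of the items seen
        # so far puts value j in group 1 and k in group 2 (group 3 gets the rest).
        total = sum(values)
        f = {(0, 0)}
        for v in values:
            f = {(j + v, k) for j, k in f} | {(j, k + v) for j, k in f} | f
        return min(max(j, k, total - j - k) for j, k in f)

    print(min_max_group([1, 3, 7, 9, 10]))  # 10, e.g. {10}, {9, 1}, {7, 3}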
I don't know about "the best" mathematically speaking, but one obvious approach would be to build a population of groups, initially with one item in each group. Then, for as long as you have more groups than the desired number of final groups, extract the two groups with the lowest values and combine them into a new group that you add back into the collection. This is similar to how Huffman compression trees are built; see the sketch after the example below.
Example:
1 3 7 9 10
becomes
4(1+3) 7 9 10
becomes
9 10 11(1+3+7)
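A sketch of that merging heuristic with a min-heap, mirroring the example above (a heuristic, not an exact solver; names are mine):

    import heapq

    def merge_into_groups(values, target_groups=3):
        # Repeatedly merge the two lowest-valued groups, Huffman-style,
        # until only target_groups remain.
        heap = [(v, [v]) for v in values]
        heapq.heapify(heap)
        while len(heap) > target_groups:
            v1, g1 = heapq.heappop(heap)
            v2, g2 = heapq.heappop(heap)
            heapq.heappush(heap, (v1 + v2, g1 + g2))
        return [group for _, group in heap]

    print(merge_into_groups([1, 3, 7, 9, 10]))  # [[9], [10], [1, 3, 7]]

Note that on this very input the heuristic's largest group totals 11, while the DP above finds a split whose largest group totals only 10 ({10}, {9, 1}, {7, 3}), so the greedy merge can be suboptimal.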

backtracking algorithm for set cover

Can someone provide me with a backtracking algorithm to solve the "set cover" problem to find the minimum number of sets that cover all the elements in the universe?
The greedy approach almost always selects more sets than the optimal number of sets.
This paper uses Linear Programming Relaxation to solve covering problems.
Basically, the LP relaxation yields good bounds, and can be used to identify solutions that are optimum in many cases. Incidentally, when I last looked at open source LP solvers (~2003) I wasn't impressed (some gave incorrect results), but there seem to be some decent open source LP solvers now.
Your problem needs a little more clarification - it seems that you are given a family of subsets $$S_1,\ldots,S_n$$ of a set A, such that the union of the subsets equals A, and you want a minimum number of subsets whose union is still A.
The basic approach is branch and bound with some heuristics. E.g., if a particular element of A is in only one subset $$S_i$$, then you must select $$S_i$$. Similarly, if $$S_k$$ is a subset of $$S_j$$, then there's no reason to consider $$S_k$$; and if element $$a_i$$ is in every subset that $$a_j$$ is in, then you need not consider $$a_i$$ (any cover of $$a_j$$ covers it too).
For branch and bound you need good bounding heuristics. Lower bounds can come from independent sets: if there are k elements $$i_1,\ldots,i_k$$ in A such that no two of them appear together in any subset, then at least k subsets are needed. Better lower bounds come from the LP relaxation described above.
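A small branch-and-bound sketch along these lines, bounding only by solution size (no LP bound or dominance reductions; the names are mine):

    def min_set_cover(universe, subsets):
        # subsets is a list of sets whose union must equal universe.
        # Returns indices of a minimum-size cover.
        best = list(range(len(subsets)))  # trivial upper bound: take everything

        def search(uncovered, chosen):
            nonlocal best
            if not uncovered:
                if len(chosen) < len(best):
                    best = chosen[:]
                return
            if len(chosen) + 1 >= len(best):  # bound: cannot beat best
                return
            e = next(iter(uncovered))  # branch: some chosen set must cover e
            for i, s in enumerate(subsets):
                if e in s and i not in chosen:
                    chosen.append(i)
                    search(uncovered - s, chosen)
                    chosen.pop()

        search(set(universe), [])
        return best

    print(min_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]))  # [0, 3]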
The Espresso logic minimization system from Berkeley has a very high quality set covering engine.

Sort array in ascending order while minimizing "cost"

I'm taking comp 2210 (Data Structures) next semester and I've been doing the homework for the summer semester that is posted online. Until now, I've had no problems doing the assignments. Take a look at assignment 4 below, and see if you can give me a hint as to how to approach it. Please don't provide a complete algorithm, just an approach. Thanks!
A "costed sort" is an algorithm in which a sequence of values must be arranged in ascending order. The sort is
carried out by interchanging the position of two values one at a time until the sequence is in the proper order. Each
interchange incurs a cost, which is calculated as the sum of the two values involved in the interchange. The total
cost of the sort is the sum of the cost of the interchanges.
For example, suppose the starting
sequence were {3, 2, 1}. One possible
series of interchanges is
Interchange 1: {3, 1, 2} interchange cost = 3
Interchange 2: {1, 3, 2} interchange cost = 4
Interchange 3: {1, 2, 3} interchange cost = 5,
giving a total cost of 12
You are to write a program that determines the minimal cost to arrange a specific sequence of numbers.
Edit: The professor does not allow brute forcing.
If you want to surprise your professor, you could use Simulated Annealing. Then again, if you manage that, you can probably skip a few courses :). Note that this algorithm will only give an approximate answer.
Otherwise: try a Backtracking algorithm, or Branch and Bound. These will both find the optimal answer.
What do you mean "brute forcing?" Do you mean "try all possible combinations and select the cheapest?" Just checking.
I think "branch and bound" is what you're looking for - check any source on algorithms. It is "like" brute force, except as you try a sequence of moves, as soon as that sequence of moves is less optimal than any other sequence of moves tried so far, you can abandon the sequence that got you to that point - the cost. This is one flavor of the "backtracking" mentioned above.
My preferred language for doing this would be Prolog but I'm weird.
Simulated annealing is a probabilistic algorithm: if the solution space has local minima, then you may get trapped in one and end up with what you think is the right answer but isn't. There are ways around that, and the literature about it is easy to find, but I don't agree that it's the tool you want.
You could try the related genetic algorithms too if that's the way you want to go.
Have you learned trees? You could create a tree with all possible changes leading to the desired result. The trick, of course, is to avoid creating the whole tree -- particularly when a part of it is obviously not the best solution, right?
I think the appropriate approach is to think hard about what defining properties a minimal "cost" sort has. Then figure out the cost by simulating this ideal sort. The key element here is you don't have to implement a general minimal cost sorting algorithm.
For example, let's say the defining property of a minimal-cost sort is that every exchange puts at least one of the exchanged elements in its sorted position (I don't know if this is true). Every exchange-based sort would love to have this property, but it's not easy (possible?) in the general case. However, you can easily create a program that takes an unsorted array and its sorted version (which can itself be produced by a non-optimal algorithm) and then uses this information to decide the minimum cost to reach the sorted array from the unsorted one.
Description
I think the cheapest way to do this is to swap the cheapest misplaced item with the item that belongs in its spot; this moves the most expensive items only once. If there are n elements out of place, then at most n-1 swaps put them in place, at a cost of (n-1) times the cheapest misplaced item plus the cost of all the other out-of-place items.
If the globally cheapest element is not misplaced, and the spread between it and the cheapest misplaced one is great enough, it can be cheaper to first swap the globally cheapest element out of its correct spot and use it as the pivot instead. The cost is then roughly (n+1) times the globally cheapest item (it must be swapped in and back out) plus the cost of all out-of-place items plus the cost of the cheapest out-of-place item.
Example
For [4,1,2,3], this algorithm exchanges (1,2) to produce:
[4,2,1,3]
and then swaps (3,1) to produce:
[4,2,3,1]
and then swaps (4,1) to produce:
[1,2,3,4]
Notice that each misplaced item in [2,3,4] is moved only once and is swapped with the lowest cost item.
Code
Oops: "Please don't provide a complete algorithm, just an approach." Removed my code.
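For later readers: the strategy described above amounts to decomposing the permutation into cycles and fixing each cycle either with its own minimum or by borrowing the globally cheapest element. A minimal sketch, assuming distinct values:

    def min_sort_cost(a):
        # Minimum total cost to sort a by swaps, where swapping x and y
        # costs x + y. Assumes distinct positive values.
        pos = {v: i for i, v in enumerate(sorted(a))}  # value -> target index
        g = min(a)  # globally cheapest element
        seen = [False] * len(a)
        cost = 0
        for i in range(len(a)):
            if seen[i] or pos[a[i]] == i:
                continue
            # walk one cycle of the permutation
            cyc_sum, cyc_min, cyc_len, j = 0, a[i], 0, i
            while not seen[j]:
                seen[j] = True
                cyc_sum += a[j]
                cyc_min = min(cyc_min, a[j])
                cyc_len += 1
                j = pos[a[j]]
            # fix the cycle with its own min, or borrow the global min
            cost += min(cyc_sum + (cyc_len - 2) * cyc_min,
                        cyc_sum + cyc_min + (cyc_len + 1) * g)
        return cost

    print(min_sort_cost([3, 2, 1]))     # 4: swap 3 and 1 directly
    print(min_sort_cost([4, 1, 2, 3]))  # 12: matches the example above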
In an effort to only get you going on it, this may not make complete sense.
Determine all possible moves and the cost of each move, and store those somehow. Perform the least expensive move, then determine the moves that can be performed from this variation, storing those with the rest of your stored moves; perform the least expensive of those, and so on, until the array is sorted.
I love solving things like this.
This problem is also known as the Silly Sort problem in some ACM contests. Take a look at this solution using Divide & Conquer.
Try different sorting algorithms on the same input data and print the minimum.

What's the most insidious way to pose this problem?

My best shot so far:
A delivery vehicle needs to make a series of deliveries (d1,d2,...dn), and can do so in any order--in other words, all the possible permutations of the set D = {d1,d2,...dn} are valid solutions--but the particular solution needs to be determined before it leaves the base station at one end of the route (imagine that the packages need to be loaded in the vehicle LIFO, for example).
Further, the cost of the various permutations is not the same. It can be computed as the sum of the squares of the distances traveled between d(i-1) and d(i), where d0 is taken to be the base station, with the caveat that any segment that involves a change of direction costs 3 times as much (imagine this is happening on a railroad or in a pneumatic tube, where backing up disrupts other traffic).
Given the set of deliveries D represented as their distance from the base station (so abs(di-dj) is the distance between two deliveries) and an iterator permutations(D) which will produce each permutation in succession, find a permutation which has a cost less than or equal to that of any other permutation.
Now, a direct implementation from this description might lead to code like this:
function Cost(D) ...
function Best_order(D)
    for D1 in permutations(D)
        Found = true
        for D2 in permutations(D)
            Found = false if cost(D2) < cost(D1)
        return D1 if Found
Which is O(n * (n!)^2), i.e. pretty awful, especially compared to the O(n log n) someone with insight would find by simply sorting D.
My question: can you come up with a plausible problem description which would naturally lead the unwary into a worse (or differently awful) implementation of a sorting algorithm?
I assume you're using this question for an interview to see if the applicant can notice a simple solution in a seemingly complex question.
[This assumption is incorrect -- MarkusQ]
You give too much information.
The key to solving this is realizing that the points are in one dimension and that a sort is all that is required. To make this question more difficult, hide this fact as much as possible.
The biggest clue is the distance formula. It introduces a penalty for changing directions. The first thing that comes to mind is minimizing this penalty. To remove the penalty I have to order the deliveries in a single direction, and this ordering is the natural sort order.
I would remove the penalty for changing directions, it's too much of a give away.
Another major clue is the input values to the algorithm: a list of integers. Give them a list of permutations, or even all permutations. That sets them up to think that an O(n!) algorithm might actually be expected.
I would phrase it as:
Given a list of all possible permutations of n delivery locations, where each permutation of deliveries (d1, d2, ..., dn) has a cost defined as above, return the permutation P such that the cost of P is less than or equal to that of any other permutation.
All that really needs to be done is read in the first permutation and sort it.
If they construct a single loop to compare the costs ask them what the big-o runtime of their algorithm is where n is the number of delivery locations (Another trap).
This isn't a direct answer, but I think more clarification is needed.
Is di allowed to be negative? If so, sorting alone is not enough, as far as I can see.
For example:
d0 = 0
deliveries = (-1,1,1,2)
It seems the optimal path in this case would be 1 > 2 > 1 > -1.
Edit: This might not actually be the optimal path, but it illustrates the point.
You could rephrase it, having first found the optimal solution, as:
"Give me a proof that the following combination is optimal for the following set of rules, where optimal means the smallest number results from the sum of all stage costs, taking into account that all stages (A..Z) need to be present once and only once.
Combination:
A->C->D->Y->P->...->N
Stage costs:
A->B = 5,
B->A = 3,
A->C = 2,
C->A = 4,
...
...
...
Y->Z = 7,
Z->Y = 24."
That ought to keep someone busy for a while.
This reminds me of the Knapsack problem more than the Traveling Salesman. But the Knapsack is also an NP-hard problem, so you might be able to fool people into thinking up an overly complex solution using dynamic programming if they correlate your problem with the Knapsack. The basic problem there is:
can a value of at least V be achieved
without exceeding the weight W?
Now, the point is that a fairly good solution can be found when each value vj (your distances) is unique:
The knapsack problem with each type of item j having a distinct value per unit of weight (vj = pj/wj) is considered one of the easiest NP-complete problems. Indeed, empirical complexity is of the order of O((log n)^2), and very large problems can be solved very quickly, e.g. in 2003 the average time required to solve instances with n = 10,000 was below 14 milliseconds using commodity personal computers.
So you might want to state that several stops/packages can share the same vj, inviting people to think about the really hard degenerate case:
However, in the degenerate case of multiple items sharing the same value vj, it becomes much more difficult, with the extreme case where vj = constant being the subset sum problem, with a complexity of O(2^(N/2) · N).
So if you replace value per unit of weight with value per unit of distance, and state that several distances might actually share the same value (the degenerate case), some folks might fall into this trap.
Isn't this just the (NP-Hard) Travelling Salesman Problem? It doesn't seem likely that you're going to make it much harder.
Maybe phrasing the problem so that the actual algorithm is unclear - e.g. by describing the paths as single-rail railway lines so the person would have to infer from domain knowledge that backtracking is more costly.
What about describing the question in such a way that someone is tempted to do recursive comparisons - e.g. "can you speed up the algorithm by using the optimum max subset of your best (so far) results"?
BTW, what's the purpose of this - it sounds like the intent is to torture interviewees.
You need to be clearer on whether the delivery truck has to return to base (making it a round trip), or not. If the truck does return, then a simple sort does not produce the shortest route, because the square of the return from the furthest point to base costs so much. Missing some hops on the way 'out' and using them on the way back turns out to be cheaper.
If you trick someone into a bad answer (for example, by not giving them all the information) then is it their foolishness or your deception that has caused it?
How great is the wisdom of the wise, if they heed not their ego's lies?
