greedy algorithms - algorithm


I am new to algorithms and am currently studying with YouTube video tutorials/lectures and a book. I first watch the video, then read the book, and finally try a question from the book to make sure I have learned the topic correctly. I am currently up to greedy algorithms, and they are very confusing.
The book has various problems, but I am having trouble understanding and answering one in particular.
First, it gives the problem, which is (I've just copied the text):
There is a set of n objects of sizes {x1, x2, ..., xn} and a bin with
capacity B. All these are positive integers. Try to find a subset of these objects
so that their total size is smaller than or equal to B, but as close to B as possible.
All objects are 1-dimensional. For example, if the objects have sizes 4, 7, 10, 12, 15, and
B = 20, then we should choose 4 and 15 with total size 19 (or equivalently, 7 and 12).
For each of the following greedy algorithms, show that they are not optimal by creating
a counter-example. Try to make your examples as bad as you can, where "badness"
is measured by the ratio between the optimal and greedy solutions. Thus if the best
solution has value 10 and the greedy solution has value 5, then the ratio is 2.
How do I do this for the following?
1) Always choose the object with the largest size so that the total size of this and all
other objects already chosen does not exceed B. Repeat this for the remaining objects.

Consider the following instance of the problem:
You have a bin of capacity 2n, one object of size n+1, and the rest (at least two) are of size n.
It is easy to see that the optimal solution is two objects of size n (total 2n), while the greedy algorithm takes the one object of size n+1 and then cannot fit anything else (total n+1).
Since this works for every n, the ratio is 2n/(n+1), which gets arbitrarily close to 2 as n grows.
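To check the counterexample by hand or by machine, here is a minimal Python sketch. The function names greedy_largest_first and optimal_fill are made up for this illustration, and the optimal value is found by plain brute force, so it is only meant for tiny inputs.

```python
from itertools import combinations


def greedy_largest_first(sizes, capacity):
    """Rule 1: repeatedly take the largest object that still fits."""
    total = 0
    for size in sorted(sizes, reverse=True):
        if total + size <= capacity:
            total += size
    return total


def optimal_fill(sizes, capacity):
    """Brute force over all subsets (exponential, illustration only)."""
    best = 0
    for r in range(1, len(sizes) + 1):
        for combo in combinations(sizes, r):
            total = sum(combo)
            if best < total <= capacity:
                best = total
    return best


n = 10
sizes = [n + 1, n, n]   # one object of size n+1, two of size n
capacity = 2 * n
print(greedy_largest_first(sizes, capacity))  # n + 1
print(optimal_fill(sizes, capacity))          # 2n
```

With n = 10 the greedy total is 11 and the optimum is 20, a ratio of 20/11; larger n pushes the ratio closer to 2.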

This sounds similar to the 0-1 knapsack problem where each item has a different size and its value is simply its size, which means no item is preferred over another for any reason other than its size. In your code, you need to examine each item and calculate the maximum total size that results both with and without putting it into the bin, never exceeding the capacity of the bin.

Related

Why dynamic programming for 0/1 Knapsack?

I looked at many resources and also this question, but I am still confused about why we need dynamic programming to solve 0/1 knapsack.
The question is: I have N items, each with value Vi and weight Wi. We have a bag with total weight capacity W. How do we select items to get the best total value within the weight limit?
I am confused by the dynamic programming approach: why not just calculate the ratio (value / weight) for each item and repeatedly select the item with the best ratio whose weight is no more than the remaining capacity of the bag?

For your fraction-based approach you can easily find a counterexample.
Consider
W=[3, 3, 5]
V=[4, 4, 7]
Wmax=6
Your approach gives a value of V=7 (we take the last item since 7/5 > 4/3, and then nothing else fits), but taking the first two items gives the true optimum Vopt=8.

As other answers pointed out, there are edge cases with your approach.
To explain the recursive solution a bit better, and perhaps to understand it better I suggest you approach it with this reasoning:
For each "subsack"
If we have no fitting element there is no best element
If we only have one fitting element, the best choice is that element
If we have more than one fitting element, we take each element and calculate the best fit for its "subsack". The best choice is the highest valued element/subsack combination.
This algorithm works because it spans all the possible combinations of fitting elements and finds the one with the highest value.
A direct (for example greedy) solution, by contrast, is not available: the problem is NP-hard, so no polynomial-time exact algorithm is known.
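The "subsack" reasoning above can be sketched as a short recursive function. This is only one possible reading of it: the names best_value and subsack are invented here, the start index is an added assumption so that no item is used twice, and memoization is added so repeated subsacks are not recomputed.

```python
from functools import lru_cache


def best_value(weights, values, capacity):
    """Best total value of a 0/1 knapsack, via the 'subsack' recursion."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def subsack(start, remaining):
        # No fitting element: the best value for this subsack is 0.
        best = 0
        for i in range(start, n):
            if weights[i] <= remaining:
                # Take element i, then solve the smaller subsack that is left.
                candidate = values[i] + subsack(i + 1, remaining - weights[i])
                best = max(best, candidate)
        return best

    return subsack(0, capacity)


print(best_value([3, 3, 5], [4, 4, 7], 6))  # 8
```

Only later items (index i + 1 onwards) are considered after taking item i, which still covers every combination exactly once.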

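Just look at this counterexample:
Capacity 7, weight/value pairs (3/10), (4/12), (5/21). Greedy by ratio takes (5/21) first and then nothing else fits, for a value of 21, while (3/10) plus (4/12) fills the sack exactly for a value of 22.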

The greedy algorithm also fails when an item with a ratio of 1 would fill the knapsack on its own. Consider the following example:
n = 1, 2; P = 4, 18; W = 2, 18; P/W = 2, 1
Knapsack capacity = 18
The greedy algorithm takes the first item because its P/W ratio is greater, so the total profit is 4 (the second item no longer fits, since the remaining capacity drops to 16 after inserting the first item).
But the actual answer is 18.
There are many such corner cases where greedy fails to give the optimal solution; that's why we use dynamic programming for the 0/1 knapsack problem.
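To compare the two approaches on the counterexamples above, here is a small Python sketch. The names greedy_by_ratio and knapsack_dp are made up for this illustration; the dynamic program is the standard one-dimensional table over capacities and assumes positive integer weights.

```python
def greedy_by_ratio(weights, values, capacity):
    """Take items in order of decreasing value/weight ratio while they fit."""
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i],
                   reverse=True)
    total = 0
    remaining = capacity
    for i in order:
        if weights[i] <= remaining:
            remaining -= weights[i]
            total += values[i]
    return total


def knapsack_dp(weights, values, capacity):
    """Exact 0/1 knapsack: best[c] is the best value within capacity c."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Go downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


print(greedy_by_ratio([3, 3, 5], [4, 4, 7], 6), knapsack_dp([3, 3, 5], [4, 4, 7], 6))  # 7 8
print(greedy_by_ratio([2, 18], [4, 18], 18), knapsack_dp([2, 18], [4, 18], 18))        # 4 18
```

The table costs time proportional to (number of items) x (capacity), which is fine for small capacities but is not polynomial in the size of the input.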

a dynamic program about subarray sum

I've seen a dynamic programming problem like this:
Let's say there is an array like this: [600, 500, 300, 220, 210]
I want to find a subarray whose sum is as close to 1000 as possible while being at least 1000 (>= 1000).
How can I write the code? I already understand the 0/1 backpack (knapsack) problem but still cannot figure out this one.

A few things:
First, I think you are referring to "dynamic programming", not "a dynamic program"; read up here if you want to know the difference: https://en.wikipedia.org/wiki/Dynamic_programming
Second, I think you mean "closest to 1000 but NOT bigger than it (<= 1000)", since that is the usual constraint. If you were allowed to go over 1000, then the problem doesn't make sense to me because there is no constraint.
Like the backpack problem, this is an NP-hard problem: no polynomial-time algorithm is known for it, and the straightforward approach checks every possible combination of numbers, which can take a long time even for seemingly small set sizes. (NP stands for "nondeterministic polynomial time", not "non-polynomial".)
I believe that the correct answer from the 5 you provided is 500+220+210, which sums to 930, the largest that you can make without going over 1000.
The basic idea of dynamic programming is to break the problem into smaller, similar subproblems that are more easily computable and to reuse their results. A related divide-and-combine heuristic: if you had a million numbers and wanted to find the subset that is closest to 100000 but not over, you might divide the million into 100,000 subsets of 10 elements, find the closest to a smaller number within each of those subsets, then use the resulting set of 100,000 sums to repeat with 10,000 sets, etc., until you reduce it to a close-but-not-perfect solution.
For an NP-hard problem like this, such a divide-and-combine scheme only builds a close approximation, since its result isn't guaranteed to be optimal. An exact dynamic program over all reachable sums does exist, but its running time grows with the size of the target number, not just with the number of elements.
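For the interpretation used in this answer (largest total that does not exceed the target), a minimal Python sketch of the standard subset-sum table looks like this. The name closest_sum_at_most is invented for the illustration, and the code assumes positive integers; its running time is proportional to (number of elements) x (target), so it is exact but only practical for modest targets.

```python
def closest_sum_at_most(numbers, target):
    """Largest subset sum that does not exceed target (exact, pseudo-polynomial)."""
    # reachable[s] is True when some subset of the numbers seen so far sums to s.
    reachable = [False] * (target + 1)
    reachable[0] = True
    for x in numbers:
        # Go downwards so each number is used at most once.
        for s in range(target, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return max(s for s in range(target + 1) if reachable[s])


print(closest_sum_at_most([600, 500, 300, 220, 210], 1000))  # 930
```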

You can use the transaction optimizer from the EmerCoin wallet.
It does exactly what you're looking for.

This problem can be approached in two steps:
1. Define a function that takes a subarray and gives you a score for it, so that you can compare subarrays and take the best. The function could simply be
    if (sum(subarray) < 1000) return INFINITY
    else return sum(subarray) - 1000
   Note that you can also use dynamic programming to compute the sums of subarrays.
2. Assuming the length of your array is N, solve the problems of size 1 to N. If the array's length is 1, there is obviously only one possibility, and it's the best. If size > 1, take the solution of the problem of length size - 1, compare it with every subarray containing the last element of the array, and take the best subarray as the solution of the problem of length size.
I hope my explanation makes sense.
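A rough Python sketch of these two steps follows. It assumes "subarray" means a contiguous slice, which is how this answer reads; the function name best_subarray_at_least is made up here, and prefix sums play the role of the precomputed subarray sums.

```python
def best_subarray_at_least(numbers, target):
    """Contiguous slice whose sum is >= target and as close to it as possible."""
    # prefix[k] is the sum of the first k numbers, so any slice sum is a difference.
    prefix = [0]
    for x in numbers:
        prefix.append(prefix[-1] + x)

    best = None  # (score, start, end)
    # Grow the problem one element at a time; only slices ending at `end` are new.
    for end in range(1, len(numbers) + 1):
        for start in range(end):
            total = prefix[end] - prefix[start]
            if total >= target:
                score = total - target
                if best is None or score < best[0]:
                    best = (score, start, end)

    if best is None:
        return None  # no slice reaches the target
    _, start, end = best
    return numbers[start:end]


print(best_subarray_at_least([600, 500, 300, 220, 210], 1000))  # [500, 300, 220]
```

If arbitrary subsets are meant instead of contiguous slices, this is a subset-sum problem and needs a knapsack-style table instead.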

A new Bin-packing?

I'm looking into a kind of bin-packing problem, but not quite the standard one.
The problem asks to put n items into the minimum number of bins without the total weight in any bin exceeding that bin's capacity (the classical definition).
The difference is:
Each item has a weight and a bound, and the capacity of a bin is dynamically determined by the minimum bound of the items in that bin.
E.g.,
I have four items A[11,12], B[1,10], C[3,4], D[20,22] ([weight,bound]).
Now, if I put item A into a bin, call it b1, the capacity of b1 becomes 12. Next I try to put item B into b1, but this fails: the total weight would be 11+1 = 12, while the capacity of b1 would become 10, which is smaller than the total weight. So B is put into bin b2, whose capacity becomes 10. Then item C is put into b2, because the total weight is 1+3 = 4 and the capacity of b2 becomes 4.
I don't know whether this problem has already been solved in some field under some name, or whether it is a variant of bin packing that has been discussed somewhere.
I don't know whether this is the right place to post the question; any help is appreciated!

Usually with algorithm design for NP-hard problems, it's necessary to reuse techniques rather than whole algorithms. Here, the algorithms for standard bin packing that use branch-and-bound with column generation carry over well.
The idea is that we formulate an enormous set cover instance where the sets are the sets of items that fit into a single bin. Integer programming is a good technique for normal set cover, but there are so many sets that we need to do something else, i.e., column generation. There is a one-to-one correspondence between sets and columns, so we rip out the part of the linear programming solver that uses brute force to find a good column to enter and replace it with a solver for what turns out to be the knapsack analog of this problem.
This modified knapsack problem is: given items with weights, profits, and bounds, find the most profitable set of items whose total weight is at most the minimum bound in the set. The dynamic program for solving knapsack with small integer weights happily transfers over with no loss of efficiency. Just sort the items by descending bounds; then, when forming sets involving the most recent item, the weight limit is just that item's bound.
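A minimal Python sketch of that modified knapsack follows. It is an illustration under assumptions, not the column-generation code itself: the name best_bounded_set is invented, items are (weight, profit, bound) tuples with non-negative integer weights and bounds and non-negative profits, and a set is treated as feasible when its total weight is at most its smallest bound (matching the example in the question).

```python
def best_bounded_set(items):
    """Max total profit of a set whose total weight <= the minimum bound in the set."""
    # Sort by descending bound: the item being added always has the smallest bound so far.
    order = sorted(items, key=lambda it: it[2], reverse=True)
    max_bound = max((bound for _, _, bound in items), default=0)
    NEG = float("-inf")
    # dp[w] = best profit of a subset of earlier items with total weight exactly w.
    dp = [NEG] * (max_bound + 1)
    dp[0] = 0
    best = 0  # the empty set is always allowed
    for weight, profit, bound in order:
        room = bound - weight
        if room >= 0:
            # Sets in which this item has the smallest bound: earlier items may use `room`.
            best = max(best, max(dp[: room + 1]) + profit)
        # Ordinary 0/1 knapsack update so later items can combine with this one.
        for w in range(max_bound, weight - 1, -1):
            if dp[w - weight] != NEG:
                dp[w] = max(dp[w], dp[w - weight] + profit)
    return best


# (weight, profit, bound) for items A, B, C, D from the question, unit profits assumed.
print(best_bounded_set([(11, 1, 12), (1, 1, 10), (3, 1, 4), (20, 1, 22)]))  # 2
```

With unit profits the best set is {B, C} (weight 4, minimum bound 4). In column generation the profits would come from the dual values of the current LP solution.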

The following is based on Anony-mouse's answer. I am not an algorithm expert, so consider the following as "just my two cents", for what they are worth.
I think Anony-mouse is correct in starting with the smallest items (by bound). This is because a bin tends to get smaller in capacity the more items you add to it; a bin's maximum capacity is determined by the first item placed in it, and it can never get larger after that point.
So instead of starting with a large bin and having its capacity slowly reduced, and having to worry about taking out too-large items that previously fit, let's just try to keep bins' capacities as constant as possible. If we can keep the bins' capacities stable, we can use "standard" algorithms that know nothing about "bound".
So I'd suggest this:
1. Group all items by bound.
   This will allow you to use a standard bin packing algorithm per group because if all items have the same bound (i.e. bound is constant), it can essentially be disregarded. All that the bound means now is that you know the resulting bins' capacity in advance.
2. Start with the group with the smallest bound and perform a standard bin packing for its items.
   This will result in one or more bins that have a capacity equal to the bound of all items in them.
3. Proceed with the item group having the next-larger bound. See if there are any items that could still be put in an already existing bin (i.e. a bin produced by the previous steps).
   Note that bound can again be ignored; since all pre-existing bins already have a smaller capacity than these additional items' bound, the bins' capacity cannot be affected; only weight is relevant, so you can use "standard" algorithms.
   I suspect this step is an instance of the (multiple) knapsack problem, so look towards knapsack algorithms to determine how to distribute these items over and into the pre-existing, partially filled bins.
4. It's possible that the item group from the previous step has only been partially processed; there might be items left. These will go into one or more new bins: Basically, repeat step 3.
5. Repeat the above steps (from 3 onwards) until no more items are left.

It can still be written as an ILP instance, like so:
Make binary variables x_{i,j} signifying whether item j goes into bin i, helper variables y_i that signify whether bin i is used, and helper variables c_i that determine the capacity of bin i. The constants are s_j (size of item j), b_j (bound of item j), and M (a large enough constant). Now
minimize sum[i] y_i
subject to:
1: for all j:
(sum[i] x_{i,j}) = 1
2: for all i,j:
y_i ≥ x_{i,j}
3: for all i:
(sum[j] s_j * x_{i,j}) ≤ c_i
4: for all i,j:
c_i ≤ b_j + (M - M * x_{i,j})
5: x_{i,j} ϵ {0,1}
6: y_i ϵ {0,1}
The constraints mean:
1: every item is in exactly one bin
2: if an item is in a bin, then that bin is used
3: the items in a bin do not exceed the capacity of that bin
4: the capacity of a bin is no more than the lowest bound of the items that are in it (the thing with the big M prevents items that are not in the bin from changing the capacity, provided you choose M no less than the highest bound)
5 and 6: the variables are binary.
But the integrality gap can be atrocious.

First of all, I might be totally wrong, and there might exist an algorithm that is even better than mine.
Bin packing is NP-hard; in practice it is often handled efficiently, though not always optimally, with classic heuristics like First Fit. There are some improvements on this too, e.g. Korf's algorithm.
I aim to reduce this to normal bin packing by sorting the items by their bound. The steps are:
1. Sort items by bound: this helps in arranging the bins, since the limiting condition is the minimum bound in a bin.
2. Insert the smallest item (by bound) into a bin.
3. Check whether the next item (sorted by bound) can coexist in this bin. If it can, keep the item in the bin too. If not, try putting it in another bin or create another bin for it.
4. Repeat the procedure until all elements are arranged, going through the items in ascending order of bound.
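The steps above can be sketched in Python roughly as follows. This is a first-fit heuristic under assumptions, not a guaranteed-optimal packing: the name first_fit_by_bound and the (name, weight, bound) tuple layout are invented for the illustration. Because items arrive in ascending order of bound, a bin's capacity is fixed by the first item placed in it.

```python
def first_fit_by_bound(items):
    """Pack (name, weight, bound) items, visiting them in ascending order of bound."""
    bins = []  # each bin: {"items": [...], "load": total weight, "capacity": min bound}
    for name, weight, bound in sorted(items, key=lambda it: it[2]):
        if weight > bound:
            raise ValueError(f"item {name} cannot fit even in a bin of its own")
        for b in bins:
            # Existing bins have capacity <= bound, so adding this item cannot shrink them.
            if b["load"] + weight <= b["capacity"]:
                b["items"].append(name)
                b["load"] += weight
                break
        else:
            # No existing bin has room: open a new bin whose capacity is this item's bound.
            bins.append({"items": [name], "load": weight, "capacity": bound})
    return [b["items"] for b in bins]


items = [("A", 11, 12), ("B", 1, 10), ("C", 3, 4), ("D", 20, 22)]
print(first_fit_by_bound(items))  # [['C', 'B'], ['A'], ['D']]
```

On the four items from the question this uses three bins, the same count as the packing described there.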
I think this pretty much solves the problem. Please tell me if it doesn't. I am trying to implement it. If you have any suggestions or improvements, let me know too. :) Thank you

Algorithm to optimally group list of values

I have several numbers. I need to arrange them into several groups so that the sum of the numbers in each group is between a predefined min and max. The point is to leave as few numbers ungrouped as possible.
Input:
min, max: range for sum of numbers
N1, N2, N3 ... Ni: numbers to group
Output:
[N1,N3,N5],[Ni,Nj,Nk,Nm...]...: groups where sum of numbers is between min and max
Na,Nb,Nc...: numbers left ungrouped.

This problem could be viewed as bin packing into bins of size max, with a funny objective: minimize the number of items not packed into bins holding at least min. One idea from the bin-packing literature is that the "small" items (in this case, items that are small relative to max - min) are easy to pack but are accountable for most of the combinatorial explosion of possibilities. Thus some approximation algorithms for bin packing do something clever for big items and then fill in with the small. Another way to reduce the number of possibilities is to round the numbers to belong to a smaller set. It's somewhat obvious how to do that for bin packing (round up), but it's not clear what to do for this problem.
Okay, I'll give an example of how these ideas could be instantiated. Suppose that max = 1 and min = 1/2. Let's try to find a solution that is allowed to use max = 2 and min = 1/2 and is competitive with the optimum for max = 1. (That may sound terrible, but this sort of approximation guarantee where OPT is held to higher standards is sometimes used in the literature.)
First round every item's size up to a power of 2. Very large items, of size 4 or greater, can't be packed. Large items, of size 2 or 1 or 1/2, are given their own bins. Small items, of size 1/4 or less, are dealt with as follows. Whenever two items of size 1/4 or less have the same size, combine them into one super-item. Pack all of the new items of size 1/2 into their own bins. The remainder has total size less than 1/2. If there is space in another bin, put them there. Otherwise, give them their own bin.
The quality of the resulting solution for max = 2 is at least as good as the quality of OPT for max = 1. Take the optimal solution for max = 1 and round the item sizes. The set of bad bins remains the same, because no item is smaller, and each bin stores less than 2 because each item is less than twice as large as it used to be. Now it suffices to show that the packing algorithm I gave for powers of 2 is optimal. I'll leave that as an exercise.
I don't expect this instantly to generalize into a full algorithm. I have to get back to work, but the approach I would take would be to force OPT to deal with max = 1 while ALG gets to use max = 1 + epsilon, substitute powers of (1 + epsilon) for powers of two in the rounding step, and then figure out how to pack the small items, probably using a dynamic program since greed likely won't work.

If you're not worried about efficiency, simply generate each possible grouping and choose the one that is correct and optimal in the sense you describe. Clearly, this works for any finite list of numbers (and is, by definition, optimal).
If efficiency is desired, the problem seems to become somewhat more difficult. :D I'll keep thinking.
EDIT: Come to think of it, this problem seems at least as hard as "subset sum" and, as such, I don't think there is a solution significantly better than the one I give (i.e., if it is NP-hard, no polynomial-time algorithm is known that solves it).

A packing algorithm ... kind of

Given an array of items, each of which has a value and a cost, what's the best algorithm to determine the items required to reach a minimum value at the minimum cost? E.g.:
Item: Value -> Cost
-------------------
A 20 -> 11
B 7 -> 5
C 1 -> 2
MinValue = 30
naive solution: A + B + C + C + C. Value: 30, Cost 22
best option: A + B + B. Value: 34, Cost 21
Note that the overall value:cost ratio at the end is irrelevant (A + A would give you the best value for money, but A + B + B is a cheaper option which hits the minimum value).

This is the knapsack problem. (That is, the decision version of this problem is the same as the decision version of the knapsack problem, although the optimization version of the knapsack problem is usually stated differently.) It is NP-hard (which means no algorithm is known that is polynomial in the "size" -- number of bits -- in the input). But if your numbers are small (the largest "value" in the input, say; the costs don't matter), then there is a simple dynamic programming solution.
Let best[v] be the minimum cost to get a value of (exactly) v. Then you can calculate the values best[] for all v, by (initializing all best[v] to infinity and):
best[0] = 0
best[v] = min_(items i){cost[i] + best[v-value[i]]}
Then look at best[v] for every v from the minimum you want up to that minimum plus the largest item value; the smallest of those is the cost.
If you want the actual items (and not just the minimum cost), you can either maintain some extra data, or just look through the array of best[]s and infer from it.
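The recurrence above can be written out as a short Python sketch. The name min_cost_for_value is made up for this illustration, items are (value, cost) pairs that may be used any number of times (as in the question's example), and values are assumed to be positive integers with positive costs.

```python
def min_cost_for_value(items, min_value):
    """Cheapest way to reach a total value of at least min_value."""
    largest_value = max(value for value, _ in items)
    limit = min_value + largest_value  # no need to look beyond this total value
    INF = float("inf")
    # best[v] = minimum cost to get a value of exactly v.
    best = [INF] * (limit + 1)
    best[0] = 0
    for v in range(1, limit + 1):
        for value, cost in items:
            if value <= v and best[v - value] + cost < best[v]:
                best[v] = best[v - value] + cost
    # Any exact value from min_value upwards meets the requirement.
    return min(best[min_value:])


# (value, cost) for items A, B, C from the question.
print(min_cost_for_value([(20, 11), (7, 5), (1, 2)], 30))  # 21
```

For the example this finds cost 21, which is the A + B + B option. The running time is proportional to (number of items) x (minimum value plus largest value).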

This problem can be formulated as an integer linear program. It's NP-hard.
However, for small problems like your example, it's trivial to write a quick few lines of code that simply brute-force all the small combinations of purchase choices.
NP-hard doesn't mean impossible or even expensive; it means no algorithm is known whose running time stays polynomial as the problem grows, so large instances can become very slow to solve. In your case, with just three items, you can solve this in mere microseconds.
As for the exact question of what's the best algorithm in general, there are entire textbooks on it. A good start is good old Wikipedia.

Edit: This answer is redacted on account of being factually incorrect. Following the advice in this will only cause you harm.
This is not actually the knapsack problem, because that assumes that you cannot pack more items than there is space for in some container. In your case you want to find the cheapest combination that will fill up the space, allowing for the fact that overflow may occur.
My solution, which I don't know to be optimal but which should be pretty close, would be to compute the cost-benefit ratio for each item, find the item with the highest cost-benefit ratio, and fill the structure with this item until there isn't space for one more. Then I would test whether there is a combination of any of the other available items that could fill the remaining slot for less than the cost of one of the cheapest items. If such a combination exists, use it; otherwise use one more of the cheapest items.
Addendum: This may actually also be NP-complete, but I am not sure yet. Anyway, for all practical purposes this variation should be much faster than the naive solution.
