Find the max weight subset which to carry - algorithm

A person has items with the weights below.
[10, 10, 12, 15, 16, 20, 45, 65, 120, 140, 150, 178, 198, 200, 210, 233, 298, 306, 307, 310, 375, 400, 420, 411, 501, 550, 662, 690, 720, 731, 780, 790]
The maximum weight he can carry home is 3 kg (3000 grams). He wants to carry as much as possible.
Note: I tried a backtracking algorithm, but it only gives me subsets whose sum equals the target exactly; when no subset matches the sum exactly, it fails. I want to find the subset whose sum is closest to the target.

This is the subset sum problem, which is solvable with Dynamic Programming (basically an efficient implementation of your backtracking approach) by following these recursion formulas, where D(x,n) is true if some subset of the first n items sums to exactly x:
D(0,n) = True
D(x,0) = False | x > 0
D(x,n) = D(x-arr[n], n-1) OR D(x,n-1)
         ^                   ^
         item n is in subset item n is not in subset
By using bottom-up Dynamic Programming (creating a matrix and filling it from lower to higher) or top-down Dynamic Programming (memoizing every result and checking whether it has already been calculated before recursing), this is solvable in O(n*W), where n is the number of elements and W is the target weight (3000 in your case).
If you run bottom-up DP, the largest value of x such that D(x,n) = True is the maximum weight one can carry. To find the actual items, follow the table back, examine which item was added at each decision point, and yield the items that were added. Returning the actual set is explained in more detail in the thread: How to find which elements are in the bag, using Knapsack Algorithm [and not only the bag's value]? (That thread deals with the knapsack problem, of which your problem is the special case where weight=cost for each item.)
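A minimal Python sketch of the bottom-up idea, under a few assumptions: the weights are positive integers, the name best_subset is made up for illustration, and instead of a full n-by-W matrix it keeps a dictionary of reachable sums with one back-pointer per sum, which is enough to read the chosen items back.

```python
def best_subset(weights, capacity):
    """Return (best_sum, items) with best_sum <= capacity as large as possible."""
    # parent[s] = (previous_sum, item_index) for the first way sum s was reached.
    parent = {0: None}
    for idx, w in enumerate(weights):
        # Iterate over a snapshot so that each item is used at most once.
        for s in list(parent):
            t = s + w
            if t <= capacity and t not in parent:
                parent[t] = (s, idx)
    best = max(parent)
    # Walk the back-pointers to recover the items that make up the best sum.
    chosen, s = [], best
    while parent[s] is not None:
        s, idx = parent[s]
        chosen.append(weights[idx])
    return best, chosen[::-1]
```

For example, best_subset([4, 6, 9], 12) finds the sum 10 using the items 4 and 6. The work is bounded by n * W dictionary operations, matching the O(n*W) estimate.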

Using backtracking, we can frame the solution like this.
We will try to return the maximum subset weight that is nearest to, but not greater than, the given weight, using this pseudocode:
func(weight_till_now, curr_pos)
    if (weight_till_now > max_possible) return 0
    if (curr_pos >= N) return weight_till_now
    // Taking this weight in the current subset
    taken = func(weight_till_now + wt[curr_pos], curr_pos + 1)
    // Not taking it in the current subset
    skipped = func(weight_till_now, curr_pos + 1)
    return max(taken, skipped)
Calling this function with initial parameters 0,0 will give you the answer: it explores every subset and keeps the maximum over all subset weights, and any branch whose weight exceeds the maximum possible weight returns 0.
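A runnable Python sketch of this backtracking idea, assuming positive weights; the function name and the default arguments are illustrative. Note that when the items run out it returns the weight accumulated so far, so that a complete subset actually reports its own weight. Without memoization it is exponential in the number of items, so it only suits small inputs.

```python
def max_weight(weights, capacity, total=0, pos=0):
    """Largest subset weight that does not exceed capacity (plain backtracking)."""
    if total > capacity:
        return 0  # overweight branch: discard it
    if pos == len(weights):
        return total  # every item decided: report this subset's weight
    # Either take the current item or skip it, and keep the better outcome.
    taken = max_weight(weights, capacity, total + weights[pos], pos + 1)
    skipped = max_weight(weights, capacity, total, pos + 1)
    return max(taken, skipped)
```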

Related

Algorithm using a combination of numbers to achieve a target or exceed it in the most efficient way

I am looking into a problem given a list and a target. I can use any number in the list multiple times to achieve the target or slightly exceed it.
It needs to be the most efficient combo. The ones I have been finding try to hit the target and if they can't then we return nothing.
For example, if I have a target of 242 and a list of 40, 100, and 240,
the most efficient combination would be to use 40 four times and 100 once. That gives us 260.
I tried going down the approach of using remainders. I would start with the largest number, see what remains
Just going down the algo first (not the most efficient way)
242 % 240 --> Quotient: 1, Remainder: 2--> So Use 240 + 240 = 480.
242 % 100 --> Quotient: 2, Remainder: 42 --> Use 100, 100, 100 = 300 --> Better
242 % 40 --> Quotient: 6, Remainder: 2 --> Use 6*40 + 40 = 280 --> Even better.
Try to use a combo
242 % 240 --> Remainder is 2. Try using the next smallest size. 240 + 100 --> 340. Bad
242 % 100 --> Remainder is 42. Try using the next smallest size. 40 + 40. 100 + 100 + 40 + 40. 280. Better.
Last case doesn't matter.
None of these work. I need to determine that 100 + 40 + 40 + 40 + 40 = 260. This would be the best.
Do I need to go through every combination of potential values? Any direction would be helpful.
Here is a solution using A* search. It is guaranteed to find the path to the smallest amount over, using the least coins, in polynomial time. If a greedy solution works, it will get there very quickly. If it has to backtrack, it will backtrack as little as it needs to.
Note that the k field is a hack to break ties in every comparison made by heapq.heappush. Because k is unique for each entry, a comparison never falls through to the trailing prev field, which may be None (and comparing that would be a problem).
import heapq
from collections import namedtuple

def min_change(target, denominations):
    denominations = list(reversed(sorted(denominations)))
    k = 0
    CoinChain = namedtuple('CoinChain', ['over', 'est', 'k', 'coins', 'value', 'i', 'prev'])
    queue = [CoinChain(0, target/denominations[0], k, 0, 0, 0, None)]
    found = {}
    while True:  # will break out when we have our answer.
        chain = heapq.heappop(queue)
        if target <= chain.value:
            # Found it!
            answer = []
            while chain.prev is not None:
                answer.append(denominations[chain.i])
                chain = chain.prev
            return list(reversed(answer))
        elif chain.value in found and found[chain.value] <= chain.i:
            continue  # We can't be better than the solution that was here.
        else:
            found[chain.value] = chain.i  # Mark that we've been here.
            i = chain.i
            while i < len(denominations):
                k = k + 1
                heapq.heappush(
                    queue,
                    CoinChain(
                        max(chain.value + denominations[i] - target, 0),
                        chain.coins + 1,
                        k,
                        chain.coins + 1,
                        chain.value + denominations[i],
                        i,
                        chain
                    )
                )
                i += 1

print(min_change(242, [40, 100, 240]))
This is actually a kind of knapsack problem, also known as the "change-making" problem: https://en.wikipedia.org/wiki/Change-making_problem
The change-making problem addresses the question of finding the minimum number of coins (of certain denominations) that add up to a given amount of money. It is a special case of the integer knapsack problem, and has applications wider than just currency.
An easy way to implement it is backtracking with these rules:
Add the highest possible value to your solution.
Sum up the values of your solution.
If the threshold was exceeded, replace the last highest value with a lower one. If it is already the lowest possible value, remove it and lower the next value.
Example:
current: [] = 0, adding 240
current: [240] = 240, adding 240
current: [240, 240] = 480, replacing with lower
current: [240, 100] = 340, replacing with lower
current: [240, 40] = 280, replacing with lower not possible. Remove and lower next value
current: [100] = 100. Adding highest
current: [100, 240] = 340. Replacing with lower
current: [100, 100] = 200. Adding highest
....
Eventually you get to [100,40,40,40,40] = 260
I just read that there can be amounts that cannot be achieved. I suppose these are the rules:
If the value can be achieved with coins, the correct solution is the exact value with the lowest possible number of coins.
If the value cannot be achieved, then the best solution is the one that exceeds it with the lowest possible difference (and if several solutions have this same value, the one with the fewest coins wins).
Then you just use what I wrote, but you also remember solutions that exceeded the target. When you find a solution that exceeds it, you keep it. When you find a solution with better results (a smaller excess, or the same value with fewer coins), you replace it as your "best solution".
You have to go through all the possibilities (basically until the algorithm has deleted all the values and cannot do anything more) to find the optimal solution.
You have to remember the best solution so far and then return it at the end.
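The rules above (exact value with the fewest coins if possible, otherwise the smallest excess with the fewest coins) can also be sketched with a small dynamic-programming table instead of exhaustive backtracking. This is a swapped-in technique, not the procedure described above, and it assumes positive integer denominations; the name min_overshoot is hypothetical. It relies on the fact that some reachable value always lies in [target, target + smallest denomination), since multiples of the smallest denomination already land there.

```python
def min_overshoot(target, denominations):
    """Coins (reusable) whose sum is the smallest value >= target, fewest coins first."""
    limit = target + min(denominations)  # a reachable value exists below this bound
    INF = float("inf")
    coins = [0] + [INF] * (limit - 1)  # coins[v] = fewest coins summing exactly to v
    last = [None] * limit              # last[v] = one coin used in that best solution
    for v in range(1, limit):
        for d in denominations:
            if d <= v and coins[v - d] + 1 < coins[v]:
                coins[v] = coins[v - d] + 1
                last[v] = d
    # Smallest reachable value that meets or exceeds the target.
    best = next(v for v in range(target, limit) if coins[v] < INF)
    combo = []
    while best > 0:
        combo.append(last[best])
        best -= last[best]
    return combo
```

For the example in the question, min_overshoot(242, [40, 100, 240]) yields one 100 and four 40s (in some order), for a total of 260.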

Find the numbers in a collection that add up to the number closest to n which is less than n

Basically, I'm given a vector of numbers and a target number n. The goal is to find the numbers in the vector whose sum is as close to n as possible without exceeding it, and I should be able to use the same number in the vector multiple times.
For example:
{4, 3}, target: 10
In this case it should return a vector containing 3, 3, 4 in any order because their sum is 10
{60, 30}, target: 135
In this case it should return a vector containing either 60, 60 or 30, 30, 30, 30 because the sum of these is the closest sum to the target possible while it is still less than the target.
How could I make this algorithm in modern c++?
I have tried modifying https://stackoverflow.com/a/15496549/13879838 solution to what I want but I got stuck at the point that it uses a recursion and I can't find a way to check whether a solution is better than the previous one found by the algorithm.
If all numbers in vec are guaranteed to be positive, this can easily be solved using recursion.
solve(vec, target, currentSum, solution)
    if currentSum > target
        return
    if currentSum == target
        print(solution)
        return
    for value in vec
        solve(vec, target, currentSum + value, solution + value)
You start with
solve({4, 3}, 10, 0, {})
Here solution + value appends and currentSum + value adds.
As written, this only prints solutions that hit the target exactly; to get the closest sum, also keep track of the best currentSum seen so far that does not exceed the target.
The next step is to find an iterative approach and to add memoization.
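One possible iterative version, sketched in Python rather than C++ and assuming positive integers: it records, for every sum up to the target, one value that can be added last to reach it, and then walks back from the largest reachable sum. The name closest_sum is illustrative only.

```python
def closest_sum(values, target):
    """Numbers from values (reusable) whose sum is as large as possible, <= target."""
    # reached_by[s] = one value that can be added last to reach sum s.
    reached_by = {0: None}
    for s in range(1, target + 1):
        for v in values:
            if v <= s and (s - v) in reached_by:
                reached_by[s] = v
                break
    best = max(reached_by)  # largest reachable sum not exceeding the target
    picked = []
    while best > 0:
        picked.append(reached_by[best])
        best -= reached_by[best]
    return picked
```

With the inputs from the question, closest_sum([4, 3], 10) returns numbers summing to 10, and closest_sum([60, 30], 135) returns numbers summing to 120.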

Optimal solution to balancing out a set of numbers

So I'm writing for a side project and trying to optimise:
Given a set of n numbers (e.g. [4, 10, 15, 25, 3]), we want to make each number be roughly the same within a given tolerance (i.e. if we wanted exact, then it should be 11.4 in the above example).
We can add/remove from one and add to another. For example, we can -5 from [3] and +5 to [1] which would give us [9, 10, 10, 25, 3].
The constraint that I have is that we want the minimal number of "transfers" between each number (e.g. if we do -3.6 from [3], then it counts as one "transfer").
Not fussed about performance (most I can see it going to is a set of 50 numbers max) but really want to keep the transfers to a minimum.
We can assume the tolerance is +/- 1 to start but can dynamically change.
The goal of the algorithm is to make sure that each of the numbers in the list are roughly the same within a given tolerance. Thus, if the tolerance is zero, all the numbers must be equal to the average of all the values in the list (which will remain constant throughout the algorithm). Taking in account the tolerance, all numbers in the list must belong to the inclusive interval [average - 0.5*TOLERANCE, average + 0.5*TOLERANCE].
The main iteration of the algorithm involves retrieving the maximum and minimum values and "transferring" just enough from the maximum to the minimum so that the value furthest from the average (this can be either the minimum or the maximum) falls in the required interval. This process iterates till the maximum and minimum values are not more than TOLERANCE units away from each other.
Pseudocode for the algorithm will look as follows:
target = average of the values in the list
while dist(max, min) > TOLERANCE
    x = maximum of dist(max, target) and dist(min, target)
    transfer (x - 0.5*TOLERANCE) units from maximum into minimum
Here dist(a, b) can be defined simply as abs(a - b).
This algorithm runs in about O(n^2) time on average, requiring a bit more than n iterations, where n is the number of values.
This algorithm requires less than half the number of iterations the naive sub-optimal approach of averaging out only the minimum and maximum values in each iteration takes.
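The pseudocode can be sketched in Python as follows. This is only an illustration under stated assumptions: TOLERANCE is the total width of the allowed interval (so +/- 1 means a tolerance of 2), exact fractions are used to avoid floating-point comparisons in the loop condition, and the function name and the shape of the transfer log are made up.

```python
from fractions import Fraction

def balance(values, tolerance):
    """Even out values until max - min <= tolerance; returns (values, transfers)."""
    vals = [Fraction(v) for v in values]
    target = sum(vals) / len(vals)  # the average never changes
    half = Fraction(tolerance) / 2
    transfers = []                  # (from_index, to_index, amount)
    while max(vals) - min(vals) > tolerance:
        hi, lo = vals.index(max(vals)), vals.index(min(vals))
        # Distance of the value that is furthest from the average.
        x = max(vals[hi] - target, target - vals[lo])
        amount = x - half           # moves the furthest value onto the interval edge
        vals[hi] -= amount
        vals[lo] += amount
        transfers.append((hi, lo, amount))
    return vals, transfers
```

Calling balance([4, 10, 15, 25, 3], 2) leaves the total unchanged and brings every value within 1 of the average 11.4.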
In the code, the getMinMax function is simple enough: it returns the min / max values, their indexes and the distance (absolute value of the subtraction).
// the principle of the balance is to even out the most different numbers in the set (min and max)
const balance = (threshold, arr) => {
    const toBalance = Object.assign([], arr);
    let mm = getMinMax(toBalance);
    while (mm.distance > threshold) {
        toBalance[mm.maxIdx] -= mm.distance / 2;
        toBalance[mm.minIdx] += mm.distance / 2;
        mm = getMinMax(toBalance);
    }
    return toBalance;
};
To test it
const numbers = [4, 10, 15, 25, 3];
const threshold = 0;
const output = balance(threshold, numbers);
console.log(output);
// prints an array of five numbers, each approximately 11.4 (with some precision error)

Algorithm for scaling one list of ranges to another

I have a constant base list, like this:
[50, 100, 150, 200, 500, 1000]
The list defines ranges: 0 to 50, 50 to 100, and so on until 1000 to infinity.
I want to write a function for transforming any list of numbers into a list compatible with the above. By "compatible" I mean it has only numbers from that list in it, but the numbers are as close to the original value as possible. So for an example input of [111, 255, 950], I would get [100, 200, 1000]. So far I have a naive code that works like this:
for each i in input
{
    calculate proximity to each number in base list
    get the closest number
    remove that number from the base list
    return the number
}
This works fine for most scenarios, but breaks down when the input scale goes way out of hand. When I have an input like [1000, 2000, 3000], the first number gets the last number from the base list, then 2000 and 3000 get respectively 500 and 200 (since 1000 and then 500 are already taken). This results in a backwards list [1000, 500, 200].
How would I guard against that?
Approach 1
This can be solved in O(n^3) time by using the Hungarian algorithm where n is max(len(list),len(input)).
First set up a matrix that gives the cost of assigning each input to each number in the list.
matrix[i,j] = abs(input[i]-list[j])
Then use the Hungarian algorithm to find the minimum cost matching of inputs to numbers in the list.
If you have more numbers in the list than inputs, then add some extra dummy inputs which have zero cost of matching with any number in the list.
Approach 2
If the first approach is too slow, then you could use dynamic programming to compute the best fit.
The idea is to compute a function A(a,b) which gives the cost of the best match of the first a inputs to the first b numbers in your list, with both the inputs and the list sorted in ascending order.
A(a,b) = min( A(a-1,b-1)+matrix[a,b], A(a,b-1) )
This should give an O(n^2) solution but will require a bit more effort in order to read back the solution.
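A rough Python sketch of this dynamic program, including the read-back step. It assumes there are at least as many base numbers as inputs, sorts both sequences first, and returns the matches in ascending input order; fit_to_base is a hypothetical name.

```python
def fit_to_base(inputs, base):
    """Match each input to a distinct base number, minimizing total distance."""
    xs, ys = sorted(inputs), sorted(base)
    n, m = len(xs), len(ys)
    if n > m:
        raise ValueError("need at least as many base numbers as inputs")
    INF = float("inf")
    # A[a][b] = cheapest match of the first a inputs to the first b base numbers.
    A = [[0] * (m + 1)] + [[INF] * (m + 1) for _ in range(n)]
    for a in range(1, n + 1):
        for b in range(a, m + 1):
            use = A[a - 1][b - 1] + abs(xs[a - 1] - ys[b - 1])
            A[a][b] = min(use, A[a][b - 1])
    # Read the solution back: skip base numbers that were not needed.
    result, b = [], m
    for a in range(n, 0, -1):
        while A[a][b] == A[a][b - 1]:
            b -= 1
        result.append(ys[b - 1])
        b -= 1
    return result[::-1]
```

On the inputs from the question, [111, 255, 950] maps to [100, 200, 1000] and [1000, 2000, 3000] maps to [200, 500, 1000], so the output is no longer backwards.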

algorithm to find number of integers with given digits within a given range

If I am given the full set of digits in the form of a list and I want to know how many (valid) integers they can form within a given range [A, B], what algorithm can I use to do it efficiently?
For example, given a list of digits (containing duplicates and zeros) list={5, 3, 3, 2, 0, 0}, I want to know how many integers can be formed in the range [A, B]=[20, 400] inclusive. For example, in this case, 20, 23, 25, 30, 32, 33, 35, 50, 52, 53, 200, 203, 205, 230, 233, 235, 250, 253, 300, 302, 303, 305, 320, 323, 325, 330, 332, 335, 350, 352, 353 are all valid.
Step 1: Find the number of digits your answers are likely to have. In your example it is 2 or 3.
Step 2: For a given number size (number of digits):
Step 2a: Pick the possibilities for the first (most significant) digit. Find the min and max number starting with that digit (ascending or descending order of the rest of the digits). If both of them fall into the range:
Step 2a-i: Count the numbers starting with that first digit and update the count.
Step 2b: Else if both max and min are out of range (on the same side), ignore.
Step 2c: Otherwise, add each possible digit as the second most significant digit and repeat the same step.
Solving by example of your case:
For number size of 2 i.e. __:
0_ : Ignore since it starts with 0
2_ : Minimum=20, Max=25. Both are in range. So update count by 3 (second digit might be 0,3,5)
3_ : Minimum=30, Max=35. Both are in range. So update count by 4 (second digit might be 0,2,3,5)
5_ : Minimum=50, Max=53. Both are in range. So update count by 3 (second digit might be 0,2,3)
For size 3:
0__ : Ignore since it starts with 0
2__ : Minimum=200, max=253. Both are in range. Find the number of ways you can choose 2 numbers from a set of {0,0,3,3,5}, and update the count.
3__ : Minimum=300, max=353. Both are in range. Find the number of ways you can choose 2 numbers from a set of {0,0,2,3,5}, and update the count.
5__ : Minimum=500, max=533. Both are out of range. Ignore.
A more interesting case is when max limit is 522 (instead of 400):
5__ : Minimum=500, max=533. Max out of range.
    50_: Minimum=500, Max=503. Both in range. Add number of ways you can choose one digit from {0,2,3}
    52_: Minimum=520, Max=523. Max out of range.
        520: In range. Add 1 to count.
        523: Out of range. Ignore.
    53_: Minimum=530, Max=533. Both are out of range. Ignore.
def countComb(currentVal, digSize, maxVal, minVal, remSet):
    # Smallest and largest numbers that can still be built from this prefix.
    minPosVal, maxPosVal = calculateMinMax(currentVal, digSize, remSet)
    if maxVal >= minPosVal >= minVal and maxVal >= maxPosVal >= minVal:
        # Every completion of this prefix is in range: count them all at once.
        return numberPermutations(remSet, digSize, currentVal)
    elif (minPosVal < minVal and maxPosVal < minVal) or (minPosVal > maxVal and maxPosVal > maxVal):
        # Every completion is out of range on the same side.
        return 0
    else:
        count = 0
        for k in unique(remSet):  # unique() stands for the distinct digits, e.g. set(remSet)
            if currentVal == '' and k == '0':
                continue  # a number cannot start with 0
            tmpRemSet = [i for i in remSet]
            tmpRemSet.remove(k)
            count += countComb(currentVal + k, digSize, maxVal, minVal, tmpRemSet)
        return count
In your case: countComb('',2,400,20,['0','0','2','3','3','5']) + countComb('',3,400,20,['0','0','2','3','3','5']) will give the answer.
def calculateMinMax(currentVal, digSize, remSet):
    numRemain = digSize - len(currentVal)
    # Append the smallest (largest) remaining digits to the current prefix.
    minPosVal = int(currentVal + ''.join(sorted(remSet)[:numRemain]))
    maxPosVal = int(currentVal + ''.join(sorted(remSet, reverse=True)[:numRemain]))
    return minPosVal, maxPosVal
numberPermutations(remSet, digSize, currentVal): basically the number of ways you can arrange (digSize - len(currentVal)) values chosen from remSet. See permutations with repeats.
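The answer leaves numberPermutations undefined. One possible way to fill it in, as a sketch only: count the distinct ordered selections from the multiset of remaining digits by recursion over digit counts. The snake_case name and the nested helper are illustrative, not part of the original answer.

```python
from collections import Counter

def number_permutations(rem_set, dig_size, current_val):
    """Distinct ways to fill the remaining positions from the digits in rem_set."""
    need = dig_size - len(current_val)

    def count(pool, r):
        if r == 0:
            return 1
        total = 0
        for d in list(pool):
            if pool[d] > 0:
                pool[d] -= 1            # use one copy of digit d
                total += count(pool, r - 1)
                pool[d] += 1            # put it back for the next choice
        return total

    return count(Counter(rem_set), need)
```

For the 2__ case above, number_permutations(['0', '0', '3', '3', '5'], 3, '2') gives 8, matching the eight numbers from 200 to 253 listed in the question.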
If the range is small but the list is big, the easy solution is just loop over the range and check if every number can be generated from the list. The checking can be made fast by using a hash table or an array with a count for how many times each number in the list can still be used.
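A short Python sketch of this brute-force check, using a Counter as the table of how many times each digit may still be used; the function name is made up for illustration.

```python
from collections import Counter

def count_formable(digits, low, high):
    """Count integers in [low, high] that can be written with the given digits."""
    available = Counter(str(d) for d in digits)
    total = 0
    for x in range(low, high + 1):
        # Counter subtraction keeps only positive counts, so an empty result
        # means no digit of x is needed more often than it is available.
        if not (Counter(str(x)) - available):
            total += 1
    return total
```

On the example from the question, count_formable([5, 3, 3, 2, 0, 0], 20, 400) counts exactly the 31 numbers listed there.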
For a list of n digits, z of which are zero, a lower bound l, and an upper bound u...
Step 1: The Easy Stuff
Consider a situation in which you have a 2-digit lower bound and a 4-digit upper bound. While it might be tricky to determine how many 2- and 4-digit numbers are within the bounds, we at least know that all 3-digit numbers are. And if the bounds were a 2-digit number and a 5-digit number, you know that all 3- and 4-digit numbers are fair game.
So let's generalize this to a lower bound with a digits and an upper bound with b digits. For every k between a and b (not including a and b themselves), all k-digit numbers are within the range.
How many such numbers are there? Consider how you'd pick them: the first digit must be one of the n numbers which is non-zero (so one of (n - z) numbers), and the rest are picked from the yet-unpicked list, i.e. (n-1) choices for the second digit, (n-2) for the third, etc. So this is looking like a factorial, but with a weird first term. How many numbers of the n are picked? Why, k of them, which means we have to divide by (n - k)! to ensure we only pick k digits in total. So the equation for each k looks something like: (n - z)(n - 1)!/(n - k)! Plug in every k in the range (a, b), and you have the number of (a+1)- to (b-1)-digit numbers possible, all of which must be valid.
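As a quick sanity check of the formula, here is a Python sketch that compares it with direct enumeration. It assumes all n digits in the list are distinct (with repeated digits the formula would count the same number more than once); both function names are illustrative.

```python
from itertools import permutations
from math import factorial

def k_digit_count(n, z, k):
    """(n - z) * (n - 1)! / (n - k)!: k-digit numbers from n distinct digits, z of them zero."""
    return (n - z) * factorial(n - 1) // factorial(n - k)

def k_digit_count_brute(digits, k):
    """Enumerate every arrangement of k digits and skip those with a leading zero."""
    return sum(1 for p in permutations(digits, k) if p[0] != 0)
```

For the digit list { 7, 5, 2, 3, 0 } and k = 3, both give 4 * 4 * 3 = 48.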
Step 2: The Edge Cases
Things are a little bit trickier when you consider a- and b-digit numbers. I don't think you can avoid starting a depth-first search through all possible combinations of digits, but you can at least abort on an entire branch if it exceeds the boundary.
For example, if your list contained { 7, 5, 2, 3, 0 } and you had an upper bound of 520, your search might go something like the following:
Pick the 7: does 7 work in the hundreds place? No, because 700 > 520;
    abort this branch entirely (i.e. don't consider 752, 753, 750, 725, etc.)
Pick the 5: does 5 work in the hundreds place? Yes, because 500 <= 520.
    Pick the 7: does 7 work in the tens place? No, because 570 > 520.
        Abort this branch (i.e. don't consider 573, 570, etc.)
    Pick the 2: does 2 work in the tens place? Yes, because 520 <= 520.
        Pick the 7: does 7 work in the ones place? No, because 527 > 520.
        Pick the 3: does 3 work in the ones place? No, because 523 > 520.
        Pick the 0: does 0 work in the ones place? Yes, because 520 <= 520.
            Oh hey, we found a number. Make sure to count it.
    Pick the 3: does 3 work in the tens place? No; abort this branch.
    Pick the 0: does 0 work in the tens place? Yes.
        ...and so on.
...and then you'd do the same for the lower bound, but flipping the comparators. It's not nearly as efficient as the k-digit combinations in the (a, b) interval (i.e. O(1)), but at least you can avoid a good deal by pruning branches that must be impossible early on. In any case, this strategy ensures you only have to actually enumerate the two edge cases that are the boundaries, regardless of how wide your (a, b) interval is (or if you have 0 as your lower bound, only one edge case).
EDIT:
Something I forgot to mention (sorry, I typed all of the above on the bus home):
When doing the depth-first search, you actually only have to recurse when your first digit equals the first digit of the bound. That is, if your bound is 520 and you've just picked 3 as your first digit, you can just add (n-1)!/(n-3)! immediately and skip the entire branch, because all 3-digit numbers beginning with 3 are certainly below 520.

Resources