Algorithm: Fill different baskets - algorithm

Let's assume I have 3 different baskets, each with a fixed capacity,
and n products, each of which provides a different value for each basket. You can only pick whole products.
Each product is limited to a maximum amount (e.g. you can pick product A at most 5 times).
Every product adds 0 or more value to each basket, and products come in all kinds of variations.
Now I want a list of all possible combinations of products fitting in the baskets, ordered by accuracy (e.g. if basket 1 is 5% more full, the combination is 5% less accurate).
Edit: Example
Basket A capacity 100
Basket B capacity 80
Basket C capacity 30
fake products
Product 1 (A: 5, B: 10, C: 1)
Product 2 (A: 20, B: 0, C: 0)
There might be hundreds more products
Best fit with max 5 each would be
5 times Product 1
4 times Product 2
Result
A: 105
B: 50
C: 5
Accuracy: (qty_used / max_qty) * 100 = (160 / 210) * 100 = 76.190%
Next would be another combination with less accuracy
Any pointer in the right direction is highly appreciated. Thanks.
Edit:
Instead of the above method, accuracy should be expressed as an error, and the list should be in ascending order of error.
Error(Basket x) = (|max_qty(x) - qty_used(x)| / max_qty(x)) * 100
The overall error should be the weighted average of the errors of all baskets, weighted by capacity:
Total Error = [Σ (Error(x) * max_qty(x))] / [Σ (max_qty(x))]
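A minimal brute-force sketch of the error-ranked list, assuming the two example products and a limit of 5 per product. The function names `total_error` and `ranked_combinations` are made up for illustration. It enumerates every quantity vector, so it only works for a handful of products; with hundreds of products the search space grows exponentially and a smarter search would be needed.

```python
from itertools import product

def total_error(capacities, used):
    # Capacity-weighted average of the per-basket percentage errors.
    weighted = sum(abs(cap - u) / cap * 100 * cap for cap, u in zip(capacities, used))
    return weighted / sum(capacities)

def ranked_combinations(capacities, products, max_qty):
    # Try every quantity vector (0..max_qty per product) and sort by total error.
    results = []
    for qtys in product(range(max_qty + 1), repeat=len(products)):
        used = [sum(q * p[i] for q, p in zip(qtys, products))
                for i in range(len(capacities))]
        results.append((total_error(capacities, used), qtys, used))
    results.sort(key=lambda r: r[0])
    return results

capacities = [100, 80, 30]            # baskets A, B, C from the example
products = [(5, 10, 1), (20, 0, 0)]   # Product 1 and Product 2
ranking = ranked_combinations(capacities, products, max_qty=5)
for error, qtys, used in ranking[:3]:
    print(f"{error:.3f}% quantities={qtys} filled={used}")
```

For the example data the first entry is 5 times Product 1 and 4 times Product 2, with a total error of 6000 / 210, about 28.571%.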

Related

Algorithm to distribute a jackpot between winners

I'm building a betting pool system and I have to split the jackpot between all participants given the number of hits (accurate predictions of a certain sport game) they achieved, where more hits means a bigger prize.
For example, if we want to distribute a 1000 coins jackpot for this betting pool, we could use this distribution:
Is there any algorithm to calculate the prize given to each winner under these conditions?
Without knowing how you want to split the prize, one option is to calculate the total number of hits by all users, and divide the jackpot by that number to find the prize awarded to each hit.
You can then just go through and give each user a prize that is this number multiplied by the number of hits.
You can simply define how big the share is for each number of hits:
Hits, winWeight, numberOfWinners
5, 24, n(5)
4, 12, n(4)
3, 4, n(3)
2, 2, n(2)
1, 1, n(1)
Then you multiply these weights by the number of winners and get:
total=24*n(5)+12*n(4)+4*n(3)+2*n(2)+1*n(1)
Now you calculate how many coins:
jackpot/total * 24 = pricePerWinner for 5 hits
jackpot/total * 12 = pricePerWinner for 4 hits
jackpot/total * 4 = pricePerWinner for 3 hits
jackpot/total * 2 = pricePerWinner for 2 hits
jackpot/total * 1 = pricePerWinner for 1 hit
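The weighting scheme above could be sketched like this. The weights are the ones from the table; the winner counts are hypothetical values chosen only to make the example run, and `prize_per_winner` is an illustrative name.

```python
def prize_per_winner(jackpot, win_weight, winners):
    # total = sum of winWeight * numberOfWinners over all hit counts
    total = sum(win_weight[hits] * winners.get(hits, 0) for hits in win_weight)
    if total == 0:
        return {}  # nobody won anything, nothing to distribute
    return {hits: jackpot / total * weight for hits, weight in win_weight.items()}

win_weight = {5: 24, 4: 12, 3: 4, 2: 2, 1: 1}
winners = {5: 6, 4: 40, 3: 80, 2: 20, 1: 15}   # hypothetical winner counts
prizes = prize_per_winner(1000, win_weight, winners)
paid_out = sum(prizes[hits] * count for hits, count in winners.items())
print(prizes, paid_out)
```

Because every prize is a fraction of the same total, the payouts add up to the whole jackpot (up to floating-point rounding), whatever the winner counts are.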
First, calculate the total number of hits (hits * number of winners):
5*6 = 30
4*40 = 160
3*80 = 240
2*20 = 40
1*15 = 15
0*2 = 0
Adding them all up gives
30+160+240+40+15+0=485
The jackpot is 1000 coins, so
1000/485 ≈ 2.06, rounded down to 2
This means that each hit is worth 2 coins.
E.g. 5 hits would mean 10 coins per winner.
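The same per-hit calculation as a small sketch, assuming whole coins (integer division). `coins_per_hit` is an illustrative name. Note that rounding down leaves part of the jackpot undistributed.

```python
def coins_per_hit(jackpot, winners_by_hits):
    # Total hits = sum of (hits * number of winners with that many hits).
    total_hits = sum(hits * count for hits, count in winners_by_hits.items())
    if total_hits == 0:
        return 0, 0, jackpot
    per_hit = jackpot // total_hits            # whole coins per hit
    leftover = jackpot - per_hit * total_hits  # coins that are not paid out
    return total_hits, per_hit, leftover

winners_by_hits = {5: 6, 4: 40, 3: 80, 2: 20, 1: 15, 0: 2}
total_hits, per_hit, leftover = coins_per_hit(1000, winners_by_hits)
print(total_hits, per_hit, leftover)  # 485 2 30
```

With these numbers 30 coins remain after paying 2 coins per hit, so the pool would need a rule for the remainder.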

How to define an algorithm that gives a ranking number for a dentist?

I have some problems with defining an algorithm that will calculate a ranking number for a dentist.
Assume, we have three different dentists:
Dentist number 1: has 125 patients and has booked an appointment with 75 of them. 60% of them got an appointment.
Dentist number 2: has 5 patients and has booked an appointment with 4 of them. 80% of them got an appointment.
Dentist number 3: has 25 patients and has booked an appointment with 14 of them. 56% of them got an appointment.
If we use the formula:
patients booked time with / totalpatients * 100
this is not the right way to calculate the ranking, because the higher the percentage, the better the dentist appears, which is wrong. Calculated that way, the dentists would be ranked:
dentist number 2 would have a ranking of 1. (80% got a time).
dentist number 1 would have a ranking of 2 (60% got a time).
dentist number 3 would have a ranking of 3. (56% got a time).
But it should be this way:
dentist number 1 = ranking 1
dentist number 2 = ranking 2
dentist number 3 = ranking 3
I don't know how to make an algorithm that also takes the number of patients into account in the ranking calculation.
It is quite arbitrary how you define what makes a better dentist in terms of number of patients and the percentage of those that have an appointment with them.
Let's call the number of patients P, the number of those that have an appointment A, and the function determining how "good" a dentist is f. So f would be a function of P and A: f(P, A).
One component of f could indeed be what you already calculated: A/P.
Another component would have to be P, but I would think that the effect on f(P, A) of increasing P by 1 should be much higher for a low P than for a high P, so this component should not be a linear function. It would also be practical if this component had a value between 0 and 1, just like the other component.
Taking all this together, I suggest this definition of f, which will give a number between 0 and 1:
f(P,A) = 1/3 * P/(10 + P) + 2/3 * A/P
For the different dentists, this results in:
1: 1/3 * 125/135 + 2/3 * 75/125 = 0.7086419753...
2: 1/3 * 5/15 + 2/3 * 4/5 = 0.6444444444...
3: 1/3 * 25/35 + 2/3 * 14/25 = 0.6114285714...
You could play a bit with the constant factors in the formula, like increasing the term 10. Or you could change the factors 1/3 and 2/3 making sure that their sum is 1.
This is just one way to do it. There are an infinity of other ways...
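The suggested formula as a short sketch, with the constant 10 and the weight 1/3 exposed as parameters so they can be tuned. The names `dentist_score`, `k` and `w_volume` are illustrative.

```python
def dentist_score(patients, appointments, k=10, w_volume=1/3):
    # f(P, A) = w * P / (k + P) + (1 - w) * A / P, a value between 0 and 1
    volume = patients / (k + patients)   # grows quickly for small P, flattens for large P
    rate = appointments / patients       # share of patients with an appointment
    return w_volume * volume + (1 - w_volume) * rate

dentists = {1: (125, 75), 2: (5, 4), 3: (25, 14)}
scores = {name: dentist_score(p, a) for name, (p, a) in dentists.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

With the default parameters the ranking comes out as dentist 1, then 2, then 3, matching the order asked for in the question.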

Summarize different category rankings

I determine the rankings of e.g. 1000 participants in multiple categories.
The results look something like this:
Participant/Category/Place (lower is better):
A|1|1.
A|2|1.
A|3|1.
A|4|7.
B|1|2.
B|2|2.
B|3|2.
B|4|4.
[...]
Now I want to summarize the rankings. The standard method would be to sum up all places and divide by the number of categories:
Participant A: (1+1+1+7) / 4 = 2.5
Participant B: (2+2+2+4) / 4 = 2.5
But I want to prefer participant A, because he has won 3 of 4 categories.
I could define fixed points for all places, e.g.:
Place|Points
1|1000
2|500
3|250
4|125
5|62.5
6|31.25
7|15.625
[...]
Participant A: 1000+1000+1000+15.625 = 3015.625
Participant B: 500+500+500+125 = 1625
The problem now is that I want to give every place some points, so that low places can still be sorted. But if I keep dividing the available points by 2, the available number of decimal places becomes insufficient (available points / 2^number of places).
What can I do?
How about using harmonic mean?
4 / (1/1 + 1/1 + 1/1 + 1/7) = 1.272727
4 / (1/2 + 1/2 + 1/2 + 1/4) = 2.285714
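A quick check of the harmonic mean suggestion using `statistics.harmonic_mean` from the Python standard library. As with the places themselves, a lower result is better. The dictionary layout is just an assumption for the example.

```python
from statistics import harmonic_mean, mean

places = {"A": [1, 1, 1, 7], "B": [2, 2, 2, 4]}

# The arithmetic mean ties both participants; the harmonic mean rewards the wins.
arithmetic = {name: mean(p) for name, p in places.items()}
harmonic = {name: harmonic_mean(p) for name, p in places.items()}
order = sorted(harmonic, key=harmonic.get)   # best (lowest) first
print(arithmetic, harmonic, order)
```

The harmonic mean is dominated by the small values, so three first places outweigh one seventh place, and every place still contributes something to the result.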

Find the optimum number of non uniform bins

R - Problem: to find the optimum number of non-uniform bins to show a range of data points.
I have a bunch of data points (let us assume different prices of different mobiles). I need to categorize these mobile phones into some categories (based on the price). The bin size (in this example refers to the price range) need not be uniform (there might be lots of mobiles in the low price category and few in the long tail category).
Is there any efficient algorithm to find the optimum number of bins required and the number of data points (in this case mobile phones) which should go into each category?
This is not a standard formula, but I wanted to post it because it seems to work well with the data set I tested.
Find the average price of all the mobiles.
Ex: 5 mobiles with prices 10, 20, 40, 80, 200
Avg is 350/5 = 70
Subtract minimum price from average price: 70 - 10 = 60 -> name it N1
Subtract avg price from Max price: 200 - 70 = 130 -> name it N2
Find the ratio N2/N1 : 130/60: Roughly 2
This indicates that it is better to have 2 bins at the lower price range for every 1 bin at higher range.
So, for example take 2 bins below 70. Range 0 - 35(2 mobiles), 36 - 70(1 mobile)
1 bin above 70: Range 71 - 200(2 mobiles)
As you can see, number of bins and bin sizes are reasonably optimal.
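One possible reading of the heuristic above, as a sketch. It assumes positive prices, splits the range from 0 up to the average into round(N2/N1) equal bins, and uses a single bin from the average up to the maximum, as in the worked example. The function name `skewed_bins` is made up.

```python
from bisect import bisect_left

def skewed_bins(prices):
    mean = sum(prices) / len(prices)
    n1 = mean - min(prices)              # N1: distance from minimum to average
    n2 = max(prices) - mean              # N2: distance from average to maximum
    low_bins = max(1, round(n2 / n1)) if n1 > 0 else 1
    width = mean / low_bins
    # Equal-width bins below the average, one wide bin above it.
    edges = [i * width for i in range(low_bins + 1)] + [max(prices)]
    counts = [0] * (len(edges) - 1)
    for price in prices:
        # A price equal to an edge falls into the bin that ends at that edge.
        index = min(max(bisect_left(edges, price) - 1, 0), len(counts) - 1)
        counts[index] += 1
    return edges, counts

edges, counts = skewed_bins([10, 20, 40, 80, 200])
print(edges, counts)
```

For the five example prices this gives the edges 0, 35, 70 and 200, with 2, 1 and 2 mobiles in the three bins.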

How to optimize Cartesian product

Is there a better way to compute a Cartesian product? Since every Cartesian product use case is different, I think I need to explain what I am trying to achieve and why I ended up computing a Cartesian product. Please tell me whether a Cartesian product is the only solution for my problem and, if so, how to improve the performance.
Background:
We are trying to help customers to buy products cheaper.
Let's say a customer ordered 5 products (prod1, prod2, prod3, prod4, prod5).
Each ordered product has been offered by different vendors.
Representation Format 1:
Vendor 1 - offers prod1, prod2, prod4
vendor 2 - offers prod1, prod5
vendor 3 - offers prod1, prod2, prod5
vendor 4 - offers prod1
vendor 5 - offers prod2
vendor 6 - offers prod3, prod4
In other words
Representation Format 2:
Prod 1 - offered by vendor1, vendor2, vendor3, vendor4
Prod 2 - offered by vendor5, vendor3, vendor1
prod 3 - offered by vendor6
prod 4 - offered by vendor1, vendor6
prod 5 - offered by vendor3, vendor2
Now, to choose the best vendor based on price, we can sort the offers for each product by price and take the first one.
In that case we choose
prod 1 from vendor 1
prod 2 from vendor 5
prod 3 from vendor 6
prod 4 from vendor 1
prod 5 from vendor 3
Complexity:
Since we chose 4 unique vendors, we need to pay 4 shipping prices.
Also, each vendor has a minimum purchase order. If we don't meet it, we end up paying an extra charge as well.
To choose the best combination of products, we have to compute the Cartesian product of the offered products and the total price of each combination.
total price computation algorithm:
foreach unique vendor
    if (sum (product price offered by specific vendor * quantity) < minimum purchase order limit specified by specific vendor)
        totalprice += sum (product price * quantity) + minimum purchase charge + shipping price
    else
        totalprice += sum (product price * quantity) + shipping price
end foreach
In our case
{vendor1, vendor2, vendor3, vendor4}
{vendor1, vendor3, vendor5}
{vendor6}
{vendor1, vendor6}
{vendor2, vendor3}
4 * 3 * 1 * 2 * 2 = 48 combinations need to be computed to find the best combination.
{vendor1,vendor1, vendor6, vendor1, vendor2} = totalprice1,
{vendor1, vendor3, vendor6, vendor1, vendor2} = totalprice2,
*
{vendor4, vendor5, vendor6, vendor6, vendor3} = totalprice48
Now sort the computed total price to find the best combination.
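The brute-force approach described above could be sketched as follows. All prices, shipping fees and minimum-order values here are hypothetical, since the question gives none, and the names `total_price` and `best_assignment` are illustrative. It visits every element of the Cartesian product, so it shows the baseline rather than a faster alternative.

```python
from itertools import product

def total_price(assignment, offers, quantities, vendors):
    # Sum the purchase per vendor, then add shipping and any minimum-order charge.
    subtotals = {}
    for prod, vendor in assignment.items():
        subtotals[vendor] = subtotals.get(vendor, 0) + offers[prod][vendor] * quantities[prod]
    total = 0
    for vendor, subtotal in subtotals.items():
        info = vendors[vendor]
        total += subtotal + info["shipping"]
        if subtotal < info["min_order"]:
            total += info["min_order_charge"]
    return total

def best_assignment(offers, quantities, vendors):
    # Enumerate one vendor choice per product (the Cartesian product).
    products = list(offers)
    best = None
    for choice in product(*(offers[p] for p in products)):
        assignment = dict(zip(products, choice))
        price = total_price(assignment, offers, quantities, vendors)
        if best is None or price < best[0]:
            best = (price, assignment)
    return best

# Hypothetical data: unit prices per product and vendor, and vendor terms.
offers = {"prod1": {"v1": 10, "v2": 9}, "prod2": {"v1": 20, "v3": 18}}
quantities = {"prod1": 1, "prod2": 1}
vendors = {
    "v1": {"shipping": 5, "min_order": 0, "min_order_charge": 0},
    "v2": {"shipping": 5, "min_order": 20, "min_order_charge": 3},
    "v3": {"shipping": 5, "min_order": 0, "min_order_charge": 0},
}
best = best_assignment(offers, quantities, vendors)
print(best)
```

In this made-up data set, picking the cheapest vendor per product (v2 and v3) costs 40 in total, while buying both products from v1 costs 35 because only one shipping fee applies.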
Actual problem:
If the customer orders 15 products, and we assume each product is offered by 8 unique vendors, we end up computing 8^15 = 35184372088832 combinations, which takes more than a couple of hours. If the customer orders more than 20 products, it takes more than a couple of days.
Is there a way to approach this problem from a different angle?
Your problem can get even more complex. A simple example:
Product        1     2     3
Vendor 1      10    20    40
Vendor 2      20    10    40
----------------------------
needed cnt   100   100    25
You need 100 units of P1, 100 of P2, and 25 of P3.
P1 can be purchased for 1000 at V1, P2 for 1000 at V2, and P3 for 1000 at V1 or V2.
Now shipping is free if you purchase for at least 1500 at a vendor, but costs 200 per vendor otherwise.
So if you order everything at V1, you would pay 4000:
1000+2000+1000+0 (shipping) = 4000, or the same sum,
2000+1000+1000+0 = 4000, at V2, or split:
1000+0+0+200 = 1200 at V1 plus
0+1000+1000+0 = 2000 at V2,
which sums up to 3200 and could be found by your method.
But you could split the purchase of product 3 this way:
1000+0+500+0 = 1500 at V1 plus
0+1000+500+0 = 1500 at V2
which only sums up to 3000 and would not be found by your method.
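The three scenarios can be checked with a small sketch. It assumes shipping is free from a purchase of 1500 per vendor and costs 200 otherwise, that a vendor with no purchase charges nothing, and that the quantity of product 3 is divisible (the 500 + 500 split corresponds to 12.5 units each). The name `order_cost` is illustrative.

```python
def order_cost(allocation, prices, free_shipping_from=1500, shipping_fee=200):
    # allocation[vendor] = units of (P1, P2, P3) bought from that vendor
    total = 0
    for vendor, counts in allocation.items():
        subtotal = sum(count * price for count, price in zip(counts, prices[vendor]))
        if subtotal == 0:
            continue  # nothing ordered here, so no shipping either
        total += subtotal + (0 if subtotal >= free_shipping_from else shipping_fee)
    return total

prices = {"V1": (10, 20, 40), "V2": (20, 10, 40)}
all_at_v1 = {"V1": (100, 100, 25), "V2": (0, 0, 0)}
by_product = {"V1": (100, 0, 0), "V2": (0, 100, 25)}
split_p3 = {"V1": (100, 0, 12.5), "V2": (0, 100, 12.5)}

costs = [order_cost(a, prices) for a in (all_at_v1, by_product, split_p3)]
print(costs)  # [4000, 3200, 3000.0]
```

The cheapest option splits a single product across two vendors, which an enumeration that assigns each product to exactly one vendor never considers.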
AFAIK, there is established research on such topics; the keywords are matrices and systems of equations.
You can describe your problem as
f(c11, p11) + f(c12, p12) + f(c13, p13) = c1 => dc1
f(c21, p21) + f(c22, p22) + f(c23, p23) = c2 => dc2
...
f(c31, p31) + f(c32, p32) + f(c33, p33) = c3 => dc3
where cij is the count of product j at vendor i and pij is the price of product j at vendor i, but f(c11,p11) is not just count*price, but a function of count and price, since there might be a quantity discount. The right side is the purchase total for vendor i.
This is without the purchase discount, which has to be modeled on top. If the discount on shipping depends only on the total cost, it can be modeled just as ci => dci.
You would try to minimize sum (dc1+dc2+...+dcm).

Resources