Best algorithm for netting orders - algorithm

I am building a marketplace, and I want to build a matching mechanism for market participants orders.
For instance I receive these orders:
A buys 50
B buys 100
C sells 50
D sells 20
that can be represented as a List<Order>, where Order is a class with Participant, BuySell, and Amount.
I want to create a Match function that outputs 2 things:
A set of unmatched orders (List<Order>)
A set of matched orders (List<MatchedOrder>, where MatchedOrder has Buyer, Seller, and Amount)
The constraint is to minimize the number of orders (unmatched and matched), while leaving no possible match undone (i.e., in the end the unmatched orders can only be all buys or all sells)
So in the example above the result would be:
A buys 50 from C
B buys 20 from D
B buys 80 (outstanding)
This seems like a fairly complex algorithm to write, but a very common one in practice. Any pointers on where to look?

You can model this as a flow problem in a bipartite graph. Every selling node will be on the left, and every buying node will be on the right.
[figure: bipartite graph with sellers on the left and buyers on the right]
Then you must find the maximum amount of flow you can pass from source to sink.
You can use any maximum flow algorithm you want, e.g. Ford-Fulkerson. To minimize the number of orders, you can use a Min-Cost Max-Flow algorithm. There are a number of techniques to do that, including applying Cycle Canceling after finding a normal MaxFlow solution.
After running the algorithm, you'll have a residual network from which the matched amounts can be read.
[figure: residual network]
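As a rough illustration of the flow model, here is a minimal Python sketch using Edmonds-Karp (the BFS variant of Ford-Fulkerson). The capacities are an assumption, since the original diagram is lost: source to seller carries the sell amount, buyer to sink carries the buy amount, and seller to buyer edges are effectively unbounded. The names `max_flow` and `match_by_max_flow` are made up for this example. It only maximises the matched volume; it does not try to minimise the number of matches.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns (flow value, residual graph)."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges start at 0
    total = 0
    while True:
        # breadth-first search for an augmenting path
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, residual
        # walk back from the sink to collect the path, then push the bottleneck
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push

def match_by_max_flow(buys, sells):
    """buys/sells: {participant: amount}. Returns (matched volume, {(seller, buyer): qty})."""
    big = sum(sells.values())  # stands in for "unbounded" on seller -> buyer edges
    capacity = {"source": {("sell", s): q for s, q in sells.items()}}
    for s in sells:
        capacity[("sell", s)] = {("buy", b): big for b in buys}
    for b, q in buys.items():
        capacity[("buy", b)] = {"sink": q}
    total, residual = max_flow(capacity, "source", "sink")
    # the flow on seller -> buyer equals the residual capacity of the reverse edge
    matches = {}
    for s in sells:
        for b in buys:
            qty = residual[("buy", b)][("sell", s)]
            if qty > 0:
                matches[(s, b)] = qty
    return total, matches

total, matches = match_by_max_flow({"A": 50, "B": 100}, {"C": 50, "D": 20})
print(total, matches)
```

Whatever is left of each buyer's capacity on the edge into the sink is that buyer's outstanding quantity.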

Create a WithRemainingQuantity structure with 2 members: a pointer o to an order and an integer to store the unmatched quantity.
Consider 2 List<WithRemainingQuantity>, one for buys (Bq) and one for sells (Sq), both sorted by descending quantity of the contained order.
The algorithm matches the heads of the two queues until one of them is empty.
Algo (C++, assuming an Order class with getQty()):
#include <algorithm>
#include <list>

struct WithRemainingQuantity
{
    Order* o;
    int remainingQty; // initialised with o->getQty()
};
struct MatchedOrder
{
    Order* orderBuy;
    Order* orderSell;
    int matchedQty = 0;
};
// Bq and Sq must be populated and sorted by quantity, descending;
// the sort is what is meant to keep the number of matches low.
std::list<MatchedOrder> match(std::list<WithRemainingQuantity>& Bq,
                              std::list<WithRemainingQuantity>& Sq)
{
    std::list<MatchedOrder> l;
    while (!Bq.empty() && !Sq.empty())
    {
        // match as much as possible between the two heads
        int matchedQty = std::min(Bq.front().remainingQty, Sq.front().remainingQty);
        l.push_back(MatchedOrder{Bq.front().o, Sq.front().o, matchedQty});
        Bq.front().remainingQty -= matchedQty;
        Sq.front().remainingQty -= matchedQty;
        // drop every head that is now fully matched
        if (Bq.front().remainingQty == 0)
            Bq.pop_front();
        if (Sq.front().remainingQty == 0)
            Sq.pop_front();
    }
    return l;
}
The unmatched orders are the orders remaining in Bq or Sq (one of them is necessarily empty, according to the while clause).
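The same two-queue idea can be sketched in Python for quick experiments. This is an illustrative translation, not the answerer's code: `match_orders` and the tuple layouts are assumptions, and quantities are assumed to be positive integers.

```python
from collections import deque

def match_orders(buys, sells):
    """buys/sells: lists of (participant, amount). Returns (matched, unmatched)."""
    # mutable [participant, remaining] entries, largest quantity first
    bq = deque(sorted(([p, q] for p, q in buys), key=lambda e: -e[1]))
    sq = deque(sorted(([p, q] for p, q in sells), key=lambda e: -e[1]))
    matched = []
    while bq and sq:
        qty = min(bq[0][1], sq[0][1])
        matched.append((bq[0][0], sq[0][0], qty))  # (buyer, seller, amount)
        bq[0][1] -= qty
        sq[0][1] -= qty
        if bq[0][1] == 0:
            bq.popleft()
        if sq[0][1] == 0:
            sq.popleft()
    unmatched = [("buy", p, q) for p, q in bq] + [("sell", p, q) for p, q in sq]
    return matched, unmatched

matched, unmatched = match_orders([("A", 50), ("B", 100)], [("C", 50), ("D", 20)])
print(matched)    # [('B', 'C', 50), ('B', 'D', 20)]
print(unmatched)  # [('buy', 'B', 30), ('buy', 'A', 50)]
```

On the question's example this produces two matches and two outstanding buys (four records), whereas the expected result in the question has three, so the descending sort is best treated as a heuristic and not as a guarantee of the minimum.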

Related

Algorithm to find the optimal items to buy to reach certain criteria

We have n item types that we can buy (we have unlimited stock of each):
{ p:100.0f, a:10.0f, b:20.0f }
{ p:77.0f, a:20.0f, b:10.0f }
{ p:55.0f, a:0.0f, b:12.0f }
Let a and b be some arbitrary properties of the item (e.g. quality and performance; this is irrelevant to the problem). We then have two target values:
a: 12.0f
b: 4.0f
These two values are the item properties we are looking for, and they have to be matched precisely: we need to find the best combination of items to buy so that we reach our targets at the lowest p. Note that individual items can be used in fractional amounts (0.5 of a certain item has 0.5 of its p, a, and b values).
Task: Minimize p while matching the total of a and b with the required a and b; find the best configuration and print it (including the amount of each item we need).
Note that not all item types have to be used.
I've tried solving this as a knapsack problem, but I was unable to get it working.

Algorithm to calculate sum of points for groups with varying member count [closed]

Let's start with an example. In Harry Potter, Hogwarts has 4 houses with students sorted into each house. The same happens on my website and I don't know how many users are in each house. It could be 20 in one house 50 in another and 100 in the third and fourth.
Now, each student can earn points on the website and at the end of the year, the house with the most points will win.
But it's not fair to "only" sum the points, as a house with 100 students will have a much higher chance of winning, since it has more users to earn points. So I need to come up with an algorithm which is fair.
You can see an example here: https://worldofpotter.dk/points
What I do now is to sum all the points for a house, and then divide it by the number of users who have earned more than 10 points. This is still not fair, though.
Any ideas on how to make this calculation more fair?
Things we need to take into account:
* The percent of users earning points in each house
* Few users earning LOTS of points
* Many users earning FEW points (It's not bad earning few points. It still counts towards the total points of the house)
Link to MySQL dump(with users, houses and points): https://worldofpotter.dk/wop_points_example.sql
Link to CSV of points only: https://worldofpotter.dk/points.csv
I'd use something like Discounted Cumulative Gain, which is used for measuring the effectiveness of search engines.
The concept is as follows:
FUNCTION evalHouseScore (0_INDEXED_SORTED_ARRAY scores):
    score = 0;
    FOR (int i = 0; i < scores.length; i++):
        score += scores[i]/log2(i+2); // i+2 so that the first divisor is log2(2) = 1, not log2(0)
    END_FOR
    RETURN score;
END_FUNCTION;
This must be modified somehow, as this way of measuring focuses on the first result. As this is subjective, you should decide for yourself how to modify it. Below is the code with some constants (K and L), which you should try with different values:
FUNCTION evalHouseScore (0_INDEXED_SORTED_ARRAY scores):
    score = 0;
    FOR (int i = 0; i < scores.length; i++):
        score += scores[i]/log2(i+K);
    END_FOR
    RETURN L*score;
END_FUNCTION
Consider changing the logarithm.
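For experimenting with K and L, the pseudocode can be written as a small Python function. This is only a sketch: the name `eval_house_score` and the defaults `k=2`, `l=1.0` are choices made for the example, and `k` has to be at least 2 so that the first divisor `log2(k)` is not zero.

```python
import math

def eval_house_score(scores, k=2, l=1.0):
    """DCG-style house score: each member's points are discounted by log2(rank + k)."""
    ranked = sorted(scores, reverse=True)  # best member first, rank 0
    return l * sum(s / math.log2(i + k) for i, s in enumerate(ranked))

# tiny example: 8 / log2(2) + 4 / log2(3)
print(eval_house_score([4, 8]))
```

A larger `k` flattens the discount, so the top scorers dominate less; `l` only rescales the result and does not change the ranking of the houses.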
Tests:
int[] g = new int[] {758,294,266,166,157,132,129,116,111,88,83,74,62,60,60,52,43,40,28,26,25,24,18,18,17,15,15,15,14,14,12,10,9,5,5,4,4,4,4,3,3,3,2,1,1,1,1,1};
int[] s = new int[] {612,324,301,273,201,182,176,139,130,121,119,114,113,113,106,86,77,76,65,62,60,58,57,54,54,42,42,40,36,35,34,29,28,23,22,19,17,16,14,14,13,11,11,9,9,8,8,7,7,7,6,4,4,3,3,3,3,2,2,2,2,2,2,2,1,1,1};
int[] h = new int[] {813,676,430,382,360,323,265,235,192,170,107,103,80,70,60,57,43,41,21,17,15,15,12,10,9,9,9,8,8,6,6,6,4,4,4,3,2,2,2,1,1,1};
int[] r = new int[] {1398,1009,443,339,242,215,210,205,177,168,164,144,144,92,85,82,71,61,58,47,44,33,21,19,18,17,12,11,11,9,8,7,7,6,5,4,3,3,3,3,2,2,2,1,1,1,1};
The output is for different offsets:
1182
1543
1847
2286
904
1231
1421
1735
813
1120
1272
1557
It sounds like some sort of constraint between the houses may need to be introduced. I might suggest finding the person that earned the most points out of all the houses and using it as the denominator when rolling up the scores. This will guarantee the max value of a user's contribution is 1, then all the scores for a house can be summed and then divided by the number of users to normalize the house's score. That should give you a reasonable comparison. It does introduce issues with low numbers of users in a house that are high achievers in which you may want to consider lower limits to the number of house members. Another technique may be to introduce handicap scores for users to balance the scales. The algorithm will most likely flex over time based on the data you receive. To keep it fair it will take some responsive action after the initial iteration. Players can come up with some creative ways to make scoring systems work for them. Here is some pseudo-code in PHP that you may use:
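<?php
// Find the highest individual score across all houses
$mostPointsEarned = 0;
foreach ($houses as $house) {
    foreach ($house->getUsers() as $user) {
        $mostPointsEarned = max($mostPointsEarned, $user->getPoints());
    }
}
$houseScores = [];
foreach ($houses as $house) {
    $normalizedScores = [];
    foreach ($house->getUsers() as $user) {
        // each contribution is scaled to at most 1
        $normalizedScores[] = $user->getPoints() / $mostPointsEarned;
    }
    // average normalized contribution per house member
    $houseScores[] = array_sum($normalizedScores) / count($normalizedScores);
}
var_dump($houseScores);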
You haven't given any examples of what the preferred outcome should be, or of the situations you want to be immune against (3,2,1,1 compared to 5,2 etc.).
It's also a pity you haven't provided the dataset in a form that is easy to play with.
scala> val input = Map( // as seen on 2016-09-09 14:10 UTC on https://worldofpotter.dk/points
'G' -> Seq(758,294,266,166,157,132,129,116,111,88,83,74,62,60,60,52,43,40,28,26,25,24,18,18,17,15,15,15,14,14,12,10,9,5,5,4,4,4,4,3,3,3,2,1,1,1,1,1),
'S' -> Seq(612,324,301,273,201,182,176,139,130,121,119,114,113,113,106,86,77,76,65,62,60,58,57,54,54,42,42,40,36,35,34,29,28,23,22,19,17,16,14,14,13,11,11,9,9,8,8,7,7,7,6,4,4,3,3,3,3,2,2,2,2,2,2,2,1,1,1),
'H' -> Seq(813,676,430,382,360,323,265,235,192,170,107,103,80,70,60,57,43,41,21,17,15,15,12,10,9,9,9,8,8,6,6,6,4,4,4,3,2,2,2,1,1,1),
'R' -> Seq(1398,1009,443,339,242,215,210,205,177,168,164,144,144,92,85,82,71,61,58,47,44,33,21,19,18,17,12,11,11,9,8,7,7,6,5,4,3,3,3,3,2,2,2,1,1,1,1)
) // and the results on the website were: 1. R 1951, 2. H 1859, 3. S 990, 4. G 954
Here is what I thought of:
def singleValuedScore(individualScores: Seq[Int]) = individualScores
.sortBy(-_) // sort from most to least
.zipWithIndex // add indices e.g. (best, 0), (2nd best, 1), ...
.map { case (score, index) => score * (1 + index) } // here is the 'logic'
.max
input.mapValues(singleValuedScore)
res: scala.collection.immutable.Map[Char,Int] =
Map(G -> 1044,
S -> 1590,
H -> 1968,
R -> 2018)
The overall positions would be:
Ravenclaw with 2018 aggregated points
Hufflepuff with 1968
Slytherin with 1590
Gryffindor with 1044
Which corresponds to the ordering on that website: 1. R 1951, 2. H 1859, 3. S 990, 4. G 954.
The algorithm's output is the maximal product of a user's score and that user's rank within the house.
This measure is not affected by "long-tail" of users having low score compared to the active ones.
There are no hand-set cutoffs or thresholds.
You could experiment with the rank attribution (score * index or score * Math.sqrt(index) or score / Math.log(index + 1) ...)
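For readers who prefer Python, the same measure can be sketched as follows. The function name mirrors the Scala one, and the two sample inputs are the ones mentioned at the top of this answer.

```python
def single_valued_score(individual_scores):
    """Max over members of score * rank, where rank 1 is the best scorer."""
    ranked = sorted(individual_scores, reverse=True)
    return max(score * (1 + index) for index, score in enumerate(ranked))

print(single_valued_score([3, 2, 1, 1]))  # 4: the second member gives 2 * 2, the fourth 1 * 4
print(single_valued_score([5, 2]))        # 5: the single strong member dominates
```

Swapping `score * (1 + index)` for another rank attribution only requires changing the expression inside `max`. An empty house raises `ValueError` in this sketch.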
I take it that the fair measure is the number of points divided by the number of house members. Since you have the number of points, the exercise boils down to estimating the number of members.
We are in short supply of data here as the only hint we have on member counts is the answers on the website. This makes us vulnerable to manipulation, members can trick us into underestimating their numbers. If the suggested estimation method to "count respondents with points >10" would be known, houses would only encourage the best to do the test to hide members from our count. This is a real problem and the only thing I will do about it is to present a "manipulation indicator".
How could we then estimate member counts? Since we do not know anything other than test results, we have to infer the propensity to do the test from the actual results. And we have little other to assume than that we would have a symmetric result distribution (of the logarithm of the points) if all members tested. Now let's say the strong would-be respondents are more likely to actually test than weak would-be respondents. Then we could measure the extra dropout ratio for the weak by comparing the numbers of respondents in corresponding weak and strong test-point quantiles.
To be specific, of the 205 answers, there are 27 in the worst half of the overall weakest quartile, while 32 in the strongest half of the best quartile. So an extra 5 respondents of the very weakest have dropped out from an assumed all-testing symmetric population, and to adjust for this, we are going to estimate member count from this quantile by multiplying the number of responses in it by 32/27=about 1.2. Similarly, we have 29/26 for the next less-extreme half quartiles and 41/50 for the two mid quartiles.
So we would estimate members by simply counting the number of respondents but multiplying the number of respondents in the weak quartiles mentioned above by 1.2, 1.1 and 0.8 respectively. If however any result distribution within a house would be conspicuously skewed, which is not the case now, we would have to suspect manipulation and re-design our member count.
For the sample at hand, however, these adjustments to member counts are minor and yield the same house ranks as just counting the respondents without adjustments.
I amused myself a little with your question, some Python programming and some randomly generated data. As some people mentioned in the comments, you need to define what fairness is. If, as you said, you don't know the number of people in each house, you can use the number of participations of each house, which motivates participation (it can be unfair depending on the number of people in each house, but as you said you don't have this data in the first place).
The important part of the code is the following.
import numpy as np
from numpy.random import randint  # random integers

# initialize random seed
np.random.seed(4)
houses = ["Gryffindor", "Slytherin", "Hufflepuff", "Ravenclaw"]
houses_points = []
# generate random data for each house
for _ in houses:
    # houses_points.append(randint(0, 100, randint(60, 100)))
    houses_points.append(randint(0, 50, randint(2, 10)))
# count participation
houses_participations = []
houses_total_points = []
for house_id in range(len(houses)):
    houses_total_points.append(np.sum(houses_points[house_id]))
    houses_participations.append(len(houses_points[house_id]))
# sum the total number of participations
total_participations = np.sum(houses_participations)
# proposed model with weighted total participation points
houses_partic_points = []
for house_id in range(len(houses)):
    # integer division, so the weighted points stay whole numbers
    tmp = houses_total_points[house_id] * houses_participations[house_id] // total_participations
    houses_partic_points.append(tmp)
The results of this method are the following:
House Points per Participant
Gryffindor: [46 5 1 40]
Slytherin: [ 8 9 39 45 30 40 36 44 38]
Hufflepuff: [42 3 0 21 21 9 38 38]
Ravenclaw: [ 2 46]
House Number of Participations per House
Gryffindor: 4
Slytherin: 9
Hufflepuff: 8
Ravenclaw: 2
House Total Points
Gryffindor: 92
Slytherin: 289
Hufflepuff: 172
Ravenclaw: 48
House Points weighted by a participation factor
Gryffindor: 16
Slytherin: 113
Hufflepuff: 59
Ravenclaw: 4
You'll find the complete file with printing results here (https://gist.github.com/silgon/5be78b1ea0b55a20d90d9ec3e7c515e5).
You should enter some more rules to define the fairness.
Idea 1
You could set up the rule that anyone has to earn at least 10 points to enter the competition.
Then you can calculate the average points for each house.
Positive: Everyone needs to show some motivation.
Idea 2
Another approach would be to set the rule that from each house only the 10 best students will count for the competition.
Positive: Easy rule to calculate the points.
Negative: Students might become uninterested if they see they can't reach the top 10 places of their house.
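Both ideas are easy to try on real data. A minimal Python sketch, assuming each house is simply a list of per-student points; the function names, the threshold default, and the choice to sum (not average) the top students in Idea 2 are assumptions of this example.

```python
def average_of_qualified(points, threshold=10):
    """Idea 1: average over the students who earned at least `threshold` points."""
    qualified = [p for p in points if p >= threshold]
    return sum(qualified) / len(qualified) if qualified else 0.0

def top_n_total(points, n=10):
    """Idea 2: only the n best students of the house count."""
    return sum(sorted(points, reverse=True)[:n])

house = [46, 5, 1, 40]
print(average_of_qualified(house))  # 43.0
print(top_n_total(house, n=2))      # 86
```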
From my point of view, your problem is divided into a few points:
The best thing to do would be to reassign the players among the different Houses so that each House has the same number of players (as explained by #navid-vafaei).
You may not want to do that because you believe it could hurt your game's popularity with players who end up in a House they don't want; after all, the Sorting Hat's choice can be changed, at least in the movies or books.
In that case, you can sum the points of the House's students and divide by the number of students. You may leave out the students with a very low score. You may also leave out the students with very low activity, because students who skip school might be expelled.
The most important part of your algorithm, for me, is whether or not you give points for all valuable things:
In the Harry Potter story, the students earn points in the different subjects they chose at school, according to their score.
At the end of the year, there is a special award event. At that moment, the Headmaster gives points for valuable things which cannot be evaluated in school subjects, such as personal qualities (bravery, for example).

How to find the minimum value of M?

I'm trying to solve this problem:
You have N relatives. You will talk to ith relative for exactly Ti
minutes. Each minute costs you 1 dollar . After the conversation,
they will add a recharge of Xi dollars in your mobile. Initially, you
have M dollars balance in your mobile phone.
Find the minimum value of M, that you must have initially, in your
phone, so that you don't run out of balance during any of the call
(encounter negative balance).
Note : You can call relatives in any order. Each relative will be
called exactly once.
Input:
N
T1 X1
T2 X2
2
1 1
2 1
Output:
2
This looks easy to me at first but I'm not able to find the exact solution.
My Initial thoughts:
We have no problem where Xi > Ti as it will not reduce our initial
balance. We need to take care of the situation where we will run
into loss, i.e. Ti > Xi.
But I am unable to make expression which will result in minimum
initial value.
Need guidance in approaching this problem to find optimal solution.
UPDATE:-
The Binary Search approach seems to lead to a wrong result (as proved by the
test case provided in the comment below by user greybeard).
So, this is another approach. We maintain diff, the sum over all calls of the
difference between call cost and recharge amount.
Then we maintain two arrays/vectors.
If the recharge amount is strictly greater than the cost of the call, we put
the call in the first array, else we put it in the second array.
Then we sort the first array according to the cost and the second array
according to the recharge amount. We then update the diff by adding the
least amount of recharge among the calls where the cost is not less than the recharge.
Then we iterate through the first array and update our max
requirement, the requirement for each call, and the current balance. Finally, our answer
will be the maximum of the max requirement and the diff we have maintained.
Example :-
N = 2
T1 = 1 R1 = 1
T2 = 2 R2 = 1
Our first array contains nothing as all the calls have cost greater than
or equal to recharge amount. So, we place both calls in our second array.
The diff is 1 after reading the input, i.e. (1 - 1) + (2 - 1). Then, we add the min
recharge we can get from the calls to our diff (i.e. 1). Now, the diff stands
at 2. Then, as our first array contains no elements, our answer is equal to
the diff, i.e. 2, which matches the expected output.
Time Complexity :- O(nlogn)
Working Example:-
#include <bits/stdc++.h>
using namespace std;
#define MAXN 100007
int n, diff;
vector<pair<int,int> > v1, v2;
int main() {
    diff = 0;
    cin >> n;
    for (int i = 0; i < n; i++) {
        int cost, recharge;
        cin >> cost >> recharge;
        if (recharge > cost) {
            v1.push_back(make_pair(cost, recharge));  // profitable call, keyed by cost
        } else {
            v2.push_back(make_pair(recharge, cost));  // other calls, keyed by recharge
        }
        diff += (cost - recharge);
    }
    sort(v1.begin(), v1.end());
    sort(v2.begin(), v2.end());
    // add the smallest recharge among the non-profitable calls
    if (v2.size() > 0) diff += v2[0].first;
    int max_req = diff, req = 0, cur = 0;
    for (int i = 0; i < v1.size(); i++) {
        req = v1[i].first - cur;
        max_req = max(max_req, req);
        cur += v1[i].second - v1[i].first;
    }
    cout << max(max_req, diff) << endl;
    return 0;
}
(This is a wiki post: you are invited to edit, and don't need much reputation to do so without involving a moderator.)
Working efficiently means accomplishing the task at hand, with no undue effort. Aspects here:
the OP asks for guidance in approaching this problem to find an optimal solution, not for a solution (as this entirely similar, older question does).
the problem statement asks for the minimum value of M, not an optimal order of calls or how to find that.
To find the minimum balance initially required, categorise the relatives/(T, X)-pairs/calls (the order might have a meaning, if not for the problem as stated):
T < X: Leaves X-T more for calls to follow. Do in order of increasing cost.
Start assuming an initial balance of 1. For each call, if you can afford it, subtract its cost, add its refund and be done accounting for it. If you can't afford it (yet), put it on hold/the back burner/in a priority queue. At the end of the "rewarding calls", remove each head of the queue in turn, accounting for the necessary increases in initial balance.
This part ends with the highest balance reached so far.
T = X: No influence on any other call. Just do at top balance, in any order.
The top balance required for the whole sequence can't be lower than the cost of any single call, including these.
T > X: Leaves T-X less for subsequent calls. Do in order of decreasing refund.
(This may, as any call, go to a balance of zero before the refund.
As the order of calls does not change the total cost, the orders requiring the least initial balance will be those yielding the lowest final one. For the intermediate balance required by this category, don't forget that least refund.)
Combine the requirements from all categories.
Remember the request for guidance.
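For checking one's own attempt against small inputs, the three categories can be simulated directly. The following Python sketch assumes the ordering described above (T < X by increasing cost, then T = X, then T > X by decreasing refund) and tracks the largest shortfall along the way; `min_initial_balance` is a name made up for this example.

```python
def min_initial_balance(calls):
    """calls: list of (T, X) pairs. Returns the smallest M that never goes negative."""
    gains = sorted((c for c in calls if c[0] < c[1]), key=lambda c: c[0])    # increasing cost
    evens = [c for c in calls if c[0] == c[1]]                               # any order
    losses = sorted((c for c in calls if c[0] > c[1]), key=lambda c: -c[1])  # decreasing refund
    needed = 0  # initial balance required by the calls seen so far
    net = 0     # balance change accumulated before the current call
    for cost, refund in gains + evens + losses:
        needed = max(needed, cost - net)  # must be able to pay `cost` right now
        net += refund - cost
    return needed

print(min_initial_balance([(1, 1), (2, 1)]))  # 2, the sample from the question
print(min_initial_balance([(5, 5), (1, 0)]))  # 5: a single call's cost is a lower bound
```

The second example illustrates the remark that the required balance can never be lower than the cost of any single call.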

Find the best way to buy p Product from limit x Vendors

I have to buy 100 Products (or p Products) from 20 Vendors (or v Vendors). Every Vendor has all of these Products, but they sell them at different Prices.
I want to find the best price for the 100 Products. Assume that there is no Shipping Cost.
There are v^p ways, and I want the one with the best Price.
The problem would be easy without one requirement: LIMIT the number of Vendors in the Orders to x, because of Delivery Time (or some other reason).
So, the problem is: find the best way to buy p Products from at most x Vendors (there are v Vendors, x <= v).
I can generate all combinations of Vendors (there are C(v,x) combinations) and compare the Total Price. But there are so many combinations (with 20 Vendors, there can be around 185k combinations).
I am stuck at this idea.
If someone has had the same problem, please help me. Thank you very much.
This problem is equivalent to the non-metric k-center problem (cities = products, warehouses = vendors), which is NP-hard.
I would try mixed integer programming. Here's one formulation.
minimize sum over i, j of c(i, j) y(i, j) # cost of all of the orders
subject to
for all i: sum over j of y(i, j) = 1 # buy each product once
for all i, j: y(i, j) <= z(j) # buy products only from chosen vendors
sum over j of z(j) <= x # choose at most x vendors
for all i, j: 0 <= y(i, j) <= 1
for all j: z(j) in {0, 1}
The interpretation of the variables is that i is a product, j is a vendor, c(i, j) is the cost of product i from vendor j, y(i, j) is 1 if we buy product i from vendor j and 0 otherwise, z(j) is 1 if we buy from vendor j at all and 0 otherwise.
There are many free mixed integer program solvers available.
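For small instances, a solver's answer can be sanity-checked by brute force: once the set of chosen vendors is fixed (the z(j) variables), each product is simply bought from the cheapest chosen vendor. The Python sketch below enumerates all vendor subsets of size x, which is enough because allowing an extra vendor never raises the cost. `best_vendor_subset` and the sample price matrix are made up for illustration.

```python
from itertools import combinations

def best_vendor_subset(cost, x):
    """cost[i][j]: price of product i at vendor j. Returns (total price, chosen vendors)."""
    vendors = range(len(cost[0]))
    best = None
    for subset in combinations(vendors, x):
        # with the vendors fixed, buy every product at its cheapest chosen vendor
        total = sum(min(row[j] for j in subset) for row in cost)
        if best is None or total < best[0]:
            best = (total, subset)
    return best

prices = [
    [1, 5, 9],  # product 0 at vendors 0, 1, 2
    [8, 2, 9],  # product 1
    [7, 6, 3],  # product 2
]
print(best_vendor_subset(prices, 2))  # (9, (0, 1))
```

This runs in C(v, x) * p * x steps, so it is only meant as a check on small inputs, not as a replacement for the integer program.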
Not correct: as shown by #Per, the problem lacks optimal substructure.
My assumptions are as follows, from the master table you need to create a sub list which has only "x" vendor columns, and "Best Price" is the "Sum" of all the prices.
Use a dynamic programming approach
What you do is define two functions, Picking (i,k) and NotPicking(i,k).
What it means is getting the best with ability to pick vendors from 1,.. i with maximum of k vendors.
Picking (1,_) = Sum(All prices)
NotPicking (1,_) = INF
Picking (_,0) = INF
NotPicking (_,0) = INF
Picking (i,k) = Min (Picking(i-1,k-1) + NotPicking(i-1,k-1)) - D (The difference you get because of having this vendor)
NotPicking (i,k) = Min (Picking(i-1,k) + NotPicking(i-1,k))
You just solve it for a i from 1 to V and k from 1 to X
You calculate the difference by maintaining for each picking the whole product list, and calculating the difference.
How about using a Greedy Approach. Since you have a limitation on the vendors ( you need to use at least x of the total v vendors). That means you need to choose at least 1 product from each vendor of the x ... And here's an example solution:
For each vendor in v, sort the products by price, then you will have "v" sets of sorted prices. Now you can pick the min of these sets and sort again, producing a new set of "v" products, containing only the cheapest ones.
Now, if p <= v, then pick the first p items and you are done, otherwise pick all v items and repeat the same logic until you reach p.
I haven't worked this out and verified, but I guess it might work. Try this:
Add two more columns called "Highest Price" and "Lowest Price" to the table and generate data for it: they should hold the highest and lowest price for each product amongst all vendors.
Also add another column, called "Range" which should hold the (highest price - lowest price).
Now do this 100 (p) times:
Pick the row with highest range. Buy the product with least price on
that row. Once bought, mark that cell as 'bought' (maybe set null).
Recalculate lowest price, range for that row (ignoring cells marked as 'bought').
EDIT: The Hungarian algorithm is not the answer to your question unless you do not want to put a limit on vendors.
The algorithm you are looking for is Hungarian Algorithm.
There are many available implementations of it on the web.

Algorithm to share/settle expenses among a group

I am looking for an algorithm for the problem below.
Problem: There is a set of people who owe each other some money or none. I need an algorithm (the best and neatest) to settle expenses among this group.
Person AmtSpent
------ ---------
A 400
B 1000
C 100
Total 1500
Now, the expense per person is 1500/3 = 500. Meaning A gives B 100 and C gives B 400. I know I can start with the least spent amount and work forward.
Can someone point me to the best one, if you have one?
To sum up,
1. Find the total expense, and the expense per head.
2. Find the amount each person owes or is owed (-ve denotes outstanding).
3. Start with the least +ve amount. Allocate it to the -ve amount.
4. Keep repeating step 3 until you run out of -ve amounts.
5. Move to the next bigger +ve number. Keep repeating 3 & 4 as long as there are +ve numbers.
Or is there any better way to do?
The best way to get back to zero state (minimum number of transactions) was covered in this question here.
I have created an Android app which solves this problem. You can input expenses during the trip, it even recommends you "who should pay next". At the end it calculates "who should send how much to whom". My algorithm calculates minimum required number of transactions and you can setup "transaction tolerance" which can reduce transactions even further (you don't care about $1 transactions) Try it out, it's called Settle Up:
https://market.android.com/details?id=cz.destil.settleup
Description of my algorithm:
I have basic algorithm which solves the problem with n-1 transactions, but it's not optimal. It works like this: From payments, I compute balance for each member. Balance is what he paid minus what he should pay. I sort members according to balance increasingly. Then I always take the poorest and richest and transaction is made. At least one of them ends up with zero balance and is excluded from further calculations. With this, number of transactions cannot be worse than n-1. It also minimizes amount of money in transactions. But it's not optimal, because it doesn't detect subgroups which can settle up internally.
Finding subgroups which can settle up internally is hard. I solve it by generating all combinations of members and checking if sum of balances in subgroup equals zero. I start with 2-pairs, then 3-pairs ... (n-1)pairs. Implementations of combination generators are available. When I find a subgroup, I calculate transactions in the subgroup using basic algorithm described above. For every found subgroup, one transaction is spared.
The solution is optimal, but complexity increases to O(n!). This looks terrible, but the trick is that there will be just a small number of members in reality. I have tested it on a Nexus One (1 GHz processor) and the results are: up to 10 members: <100 ms, 15 members: 1 s, 18 members: 8 s, 20 members: 55 s. So up to 18 members the execution time is fine. A workaround for >15 members can be to use just the basic algorithm (it's fast and correct, but not optimal).
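The basic algorithm (without subgroup detection) can be sketched in Python as follows. This is not the app's source code: `settle` is a hypothetical name, an equal split among all members is assumed, and a two-pointer walk over the sorted balances stands in for repeatedly picking the poorest and the richest member.

```python
from fractions import Fraction

def settle(payments):
    """payments: {name: amount paid}. Returns a list of (debtor, creditor, amount)."""
    share = Fraction(sum(payments.values()), len(payments))
    # balance = what a member paid minus what they should have paid
    balances = sorted([Fraction(paid) - share, name] for name, paid in payments.items())
    transfers = []
    lo, hi = 0, len(balances) - 1  # lo: most in debt, hi: most in credit
    while lo < hi:
        amount = min(-balances[lo][0], balances[hi][0])
        if amount > 0:
            transfers.append((balances[lo][1], balances[hi][1], amount))
        balances[lo][0] += amount
        balances[hi][0] -= amount
        # at least one of the two is now settled and drops out
        if balances[lo][0] == 0:
            lo += 1
        if balances[hi][0] == 0:
            hi -= 1
    return transfers

for debtor, creditor, amount in settle({"A": 400, "B": 1000, "C": 100}):
    print(f"{debtor} pays {creditor} {amount}")
```

`Fraction` keeps the balances exact when the total does not divide evenly. Every transfer settles at least one member, which is where the bound of n-1 transactions comes from.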
Source code:
Source code is available inside a report about algorithm written in Czech. Source code is at the end and it's in English:
https://web.archive.org/web/20190214205754/http://www.settleup.info/files/master-thesis-david-vavra.pdf
You have described it already. Sum all the expenses (1500 in your case), divide by number of people sharing the expense (500). For each individual, deduct the contributions that person made from the individual share (for person A, deduct 400 from 500). The result is the net that person "owes" to the central pool. If the number is negative for any person, the central pool "owes" the person.
Because you have already described the solution, I don't know what you are asking.
Maybe you are trying to resolve the problem without the central pool, the "bank"?
I also don't know what you mean by "start with the least spent amount and work forward."
Javascript solution to the accepted algorithm:
const payments = {
  John: 400,
  Jane: 1000,
  Bob: 100,
  Dave: 900,
};
function splitPayments(payments) {
  const people = Object.keys(payments);
  const valuesPaid = Object.values(payments);
  const sum = valuesPaid.reduce((acc, curr) => curr + acc);
  const mean = sum / people.length;
  // sort from the person who paid least to the person who paid most
  const sortedPeople = people.sort((personA, personB) => payments[personA] - payments[personB]);
  // balance of each person relative to the equal share
  const sortedValuesPaid = sortedPeople.map((person) => payments[person] - mean);
  let i = 0;
  let j = sortedPeople.length - 1;
  let debt;
  while (i < j) {
    debt = Math.min(-(sortedValuesPaid[i]), sortedValuesPaid[j]);
    sortedValuesPaid[i] += debt;
    sortedValuesPaid[j] -= debt;
    console.log(`${sortedPeople[i]} owes ${sortedPeople[j]} $${debt}`);
    if (sortedValuesPaid[i] === 0) {
      i++;
    }
    if (sortedValuesPaid[j] === 0) {
      j--;
    }
  }
}
splitPayments(payments);
/*
Bob owes Jane $400
Bob owes Dave $100
John owes Dave $200
*/
I have recently written a blog post describing an approach to solve the settlement of expenses between members of a group where potentially everybody owes everybody else, such that the number of payments needed to settle the debts is the least possible. It uses a linear programming formulation. I also show an example using a tiny R package that implements the solution.
I had to do this after a trip with my friends, here's a python3 version:
import numpy as np
import pandas as pd
# setup inputs
people = ["Athos", "Porthos", "Aramis"] # friends names
totals = [300, 150, 90] # total spent per friend
# compute matrix
total_spent = np.array(totals).reshape(-1,1)
share = total_spent / len(totals)
mat = (share.T - share).clip(min=0)
# create a readable dataframe
column_labels = [f"to_{person}" for person in people]
index_labels = [f"{person}_owes" for person in people]
df = pd.DataFrame(data=mat, columns=column_labels, index=index_labels)
df.round(2)
Returns this dataframe:
              to_Athos  to_Porthos  to_Aramis
Athos_owes           0           0          0
Porthos_owes        50           0          0
Aramis_owes         70          20          0
Read it like: "Porthos owes $50 to Athos" ....
This isn't the optimized version, this is the simple version, but it's simple code and may work in many situations.
I'd like to make a suggestion to change the core parameters, from a UX-standpoint if you don't mind terribly.
Whether its services or products being expensed amongst a group, sometimes these things can be shared. For example, an appetizer, or private/semi-private sessions at a conference.
For things like an appetizer party tray, it's sort of implied that everyone has access but not necessarily that everyone had it. To charge each person to split the expense when say, only 30% of the people partook can cause contention when it comes to splitting the bill. Other groups of people might not care at all. So from an algorithm standpoint, you need to first decide which of these three choices will be used, probably per-expense:
Universally split
Split by those who partook, evenly
Split by proportion per-partaker
I personally prefer the second one in general because it handles every case: expenses used by only one person (who then owns the whole expense), by some of the people, or by the whole group. It also sidesteps the ethical question of proportional differences with a blanket rule: if you partook, you pay an even split regardless of how much you personally had. As a social element, I would treat someone who had a "small sample" of something just to try it, and then decided not to have any more, as justification for removing that person from the people splitting the expense.
So, small-sampling != partaking ;)
Then you take your list of expenses and handle each item atomically: look up who partook in that expense, apply an even split of it to each of those people, and update each person's running share of the bill. At the end, you report a total per person.
Pardon the pseudo-code:
list_of_expenses[] = getExpenseList()
list_of_agents_to_charge[] = getParticipantList()
for each expense in list_of_expenses
    list_of_partakers[] = getPartakerList(expense)
    for each partaker in list_of_partakers
        addChargeToAgent(expense.price / list_of_partakers.size, list_of_agents_to_charge[partaker])
Then just iterate through your list_of_agents_to_charge[] and report each total to each agent.
You can add support for a tip by simply treating the tip like an additional expense to your list of expenses.
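The pseudo-code above could be sketched in Python roughly as follows. This is a minimal illustration under assumptions: the function name `split_expenses`, the `(price, partakers)` tuple shape, and the sample names and prices are all made up for the example.

```python
from collections import defaultdict


def split_expenses(expenses):
    """Split each expense evenly among the people who partook in it.

    expenses: iterable of (price, partakers) pairs.
    Returns a dict mapping each person to their total charge.
    """
    charges = defaultdict(float)
    for price, partakers in expenses:
        if not partakers:
            continue  # assumption: an expense nobody partook in is not charged
        per_head = price / len(partakers)
        for person in partakers:
            charges[person] += per_head
    return dict(charges)


# Hypothetical bill: the tip is treated as one more shared expense.
expenses = [
    (30.0, ["Ann", "Bob", "Cy"]),  # appetizer tray shared by all three
    (12.0, ["Ann"]),               # item only Ann had
    (9.0, ["Ann", "Bob", "Cy"]),   # tip
]
totals = split_expenses(expenses)
print(totals)
```

Ann ends up with 25 and Bob and Cy with 13 each. Someone who only had a "small sample" would simply be left out of that expense's partaker list.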
Straightforward, as you do in your text:
Returns the amounts to be paid by everybody in the original array.
Negative values: this person gets some back.
Just hand whatever you owe to the next in line and then drop out. If you get some, just wait for the second round. When done, reverse the whole thing. After these two rounds everybody has paid the same amount.
procedure SettleDepth(Expenses: array of double);
var
  i: Integer;
  s: double;
begin
  // Sum all amounts and divide by the number of people
  // O(n)
  s := 0.0;
  for i := Low(Expenses) to High(Expenses) do
    s := s + Expenses[i];
  // High - Low + 1 is the number of elements
  s := s / (High(Expenses) - Low(Expenses) + 1);
  // In-place change to the owed amount
  // and hand on what you owe;
  // drop out if you're even
  for i := High(Expenses) downto Low(Expenses)+1 do begin
    Expenses[i] := s - Expenses[i];
    if (Expenses[i] > 0) then begin
      Expenses[i-1] := Expenses[i-1] + Expenses[i];
      Expenses.Delete(i);
    end else if (Expenses[i] = 0) then begin
      Expenses.Delete(i);
    end;
  end;
  Expenses[Low(Expenses)] := s - Expenses[Low(Expenses)];
  if (Expenses[Low(Expenses)] = 0) then begin
    Expenses.Delete(Low(Expenses));
  end;
  // hand on what you owe
  for i := Low(Expenses) to High(Expenses)-1 do begin
    if (Expenses[i] > 0) then begin
      Expenses[i+1] := Expenses[i+1] + Expenses[i];
    end;
  end;
end;
The idea (similar to what is asked, but with a twist that borrows a bit of the ledger concept) is to use a Pool Account: for each bill, members either pay into the Pool or get money from the Pool.
e.g.
In the image attached below, the Costco expenses are paid by Mr P, who needs $93.76 from the Pool, while the other members pay $46.88 each to the Pool.
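A minimal sketch of the pool bookkeeping for one bill, assuming an even split. The function name `pool_entries` and the member names `Q` and `R` are hypothetical. The bill total of $140.64 and the group size of three are inferred from the two figures quoted above, not stated in the answer.

```python
from decimal import Decimal


def pool_entries(amount, payer, members):
    """Return each member's movement against the pool for one bill.

    Positive means the member pays into the pool,
    negative means the member receives from the pool.
    """
    share = (amount / len(members)).quantize(Decimal("0.01"))
    entries = {member: share for member in members}
    # The payer already covered the whole bill, so the pool owes them the rest.
    entries[payer] = share - amount
    return entries


entries = pool_entries(Decimal("140.64"), "P", ["P", "Q", "R"])
print(entries)
```

Here `P` receives 93.76 from the pool and `Q` and `R` pay 46.88 each, so the pool nets to zero. When a bill does not divide evenly, the rounded shares leave a remainder of a cent or so that would need an explicit rule.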
There are better ways to do it, but minimizing the number of transactions means solving an NP-hard problem, which could really slow down your application. Anyway, this is how I implemented the solution in Java for my Android application, using priority queues:
class calculateTransactions {
    public static void calculateBalances(PriorityQueue<Balance> debtors, PriorityQueue<Balance> creditors) {
        // add members who are owed money to the debtors priority queue
        // add members who owe money to others to the creditors priority queue
    }

    public static void calculateTransactions() {
        results.clear(); // remove previously calculated transactions before calculating again
        // In this code, "debtors" are the members who are owed money and
        // "creditors" are the members who have to pay money to the group.
        // BalanceComparator is defined so that both queues are max priority queues.
        PriorityQueue<Balance> debtors = new PriorityQueue<>(members.size(), new BalanceComparator());
        PriorityQueue<Balance> creditors = new PriorityQueue<>(members.size(), new BalanceComparator());
        calculateBalances(debtors, creditors);
        /* Algorithm: Pick the largest element from debtors and the largest from creditors. Ex: If debtors = {4,3} and creditors = {2,7}, pick 4 as the largest debtor and 7 as the largest creditor.
         * Now, do a transaction between them. The debtor with a balance of 4 receives $4 from the creditor with a balance of 7 and hence, the debtor is eliminated from further
         * transactions. Repeat the same thing until there are no creditors or no debtors left.
         *
         * The priority queues help us find the largest creditor and debtor in constant time. However, adding/removing a member takes O(log n) time.
         * Optimisation: This algorithm produces correct results, but the number of transactions is not minimal. To minimize it, we could use a subset sum algorithm, which is an NP-complete problem.
         * The use of such a solution could really slow down the app! */
        while (!creditors.isEmpty() && !debtors.isEmpty()) {
            Balance rich = creditors.poll(); // remove and get the largest creditor
            Balance poor = debtors.poll(); // remove and get the largest debtor
            String richName = rich.name;
            BigDecimal richBalance = rich.balance;
            String poorName = poor.name;
            BigDecimal poorBalance = poor.balance;
            // calculate the amount to be sent from creditor to debtor
            BigDecimal min = richBalance.min(poorBalance);
            richBalance = richBalance.subtract(min);
            poorBalance = poorBalance.subtract(min);
            HashMap<String, Object> values = new HashMap<>(); // record the transaction details in a HashMap
            values.put("sender", richName);
            values.put("recipient", poorName);
            values.put("amount", currency.charAt(5) + min.toString());
            results.add(values);
            // Consider a member as settled if the outstanding balance is between 0.00 and 0.49, else add the member to the queue again
            BigDecimal threshold = new BigDecimal("0.49");
            if (poorBalance.compareTo(threshold) > 0) {
                // the debtor is not yet settled (balance greater than 0.49): add them back so they are available for further transactions
                debtors.add(new Balance(poorBalance, poorName));
            }
            if (richBalance.compareTo(threshold) > 0) {
                // the creditor is not yet settled (balance greater than 0.49): add them back so they are available for further transactions
                creditors.add(new Balance(richBalance, richName));
            }
        }
    }
}
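For comparison, the same greedy idea can be sketched in a few lines of Python with `heapq`. This is not a port of the Java code above: it assumes net balances are given as integer cents (so no 0.49 tolerance is needed), and it uses the neutral names `payers` and `receivers`. The function name `settle` and the sample balances are made up for illustration.

```python
import heapq


def settle(balances):
    """Greedily match the largest payer with the largest receiver.

    balances: name -> net amount in cents (positive = is owed, negative = owes).
    Returns a list of (payer, receiver, amount) tuples.
    """
    # heapq is a min-heap, so both heaps store negative amounts
    # in order to pop the largest magnitude first.
    receivers = [(-amt, name) for name, amt in balances.items() if amt > 0]
    payers = [(amt, name) for name, amt in balances.items() if amt < 0]
    heapq.heapify(receivers)
    heapq.heapify(payers)

    payments = []
    while receivers and payers:
        neg_credit, receiver = heapq.heappop(receivers)
        neg_debt, payer = heapq.heappop(payers)
        credit, debt = -neg_credit, -neg_debt
        amount = min(credit, debt)
        payments.append((payer, receiver, amount))
        # Whoever is not fully settled goes back into their heap.
        if credit > amount:
            heapq.heappush(receivers, (-(credit - amount), receiver))
        if debt > amount:
            heapq.heappush(payers, (-(debt - amount), payer))
    return payments


payments = settle({"Athos": 12000, "Porthos": -3000, "Aramis": -9000})
print(payments)
```

Each round settles at least one person completely, so n people with non-zero balances need at most n - 1 payments. As the Java comments note, this greedy matching does not guarantee the minimum number of payments.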
I've created a React app that implements a bin-packing approach to split trip expenses among friends with the least number of transactions.
Check out the TypeScript file SplitPaymentCalculator.ts, which implements it.
You can find the working app's link on the homepage of the repo.

Resources