Algorithm for multicriterial arrangement - algorithm

Let me describe problem in a form of a small fiction story.
The story
In a Brave New World new cities are built in a couple of days and only need to be populated. Moreover, there's no more long boring hiring process, no interviews and subjective decisions - every person passes several tests and their results are used to find best employees.
When new city is built, number of companies place their offices there and ask Super Mind to find best employees for them given a way to calculate person's score for their particular company. People on their side ask Super Mind to find work for them. They give him list of companies where they would like to work together with corresponding priorities. Super Mind is very humanistic, so its task is to find such arrangement that people get to the best companies they want, even if some companies will left without employees at all.
Formal definition
Now let me define the task more formally.
E - number of employees seeking for a job.
C - number of companies.
S(e,c) - score of employee e for company c.
Pr(e,c) - priority of company c in a personal "wishlist" of employee e.
P(c) - # of positions available in company c.
Task: obtain list of (e, c) tuples given following conditions:
employees with higher S(e,c) should always go first (e.g. if there's only one position left in company c and there are 2 candidates for it, it should be guaranteed that employee with higher score gets to this position).
employees should get to the company with highest priority available for them.
My algorithm
The only algorithm I can think of that guarantees all conditions is as follows. First I create list of all possible applications from employees to companies (A(e,c,s,p)), where s is a score of employee e for company c and p is company priority for this employee. Then I sort all applications by total score and run next recursive procedure:
def arrange(As, Ps, not_approved, approved):
# As - list of applications left
# Ps - map of type (company -> # of positions left)
# not_approved - set of not approved applications
# approved - set of approved applications (hold intermediate result)
if (empty(As))
return approved
a = head(As)
As_rest = tail(As)
if (cant_be_hired(a)) # if no places left in company from this application
return arrange(As_rest, Ps, not_approved + a, approved)
else if (highest_priority(a)) # if this application has highest of left priorities
return arrange(As_rest, Ps(c) - 1, not_approved, approved + a)
else
# if application can be accepted, but it has higher priorities left,
# check what will happen if we do not accept this application
check_result = arrange(As_left, Ps, not_approved + a, approved)
if (employee_is_hired_for_better_job(a, check_result))
# if employee can be hired to a job with higher priority,
# just return check_result - it is already an answer
return check_result
else
# otherwise accept this application and proceed for rest of them
return arrange(As_rest, Ps, not_approved, approved + a)
But, of course, this algorithm has very large computational complexity. Dynamic programming with caching check results helps a bit, but this is still too slow.
I was thinking of some kind of conditional optimization algorithm that always converges, however I'm not so closely familiar with this field to find appropriate one.
So, is there better algorithm?

Related

Fulfilling maximum customer orders

There is an inventory of products like eg. A- 10Units, B- 15units, C- 20Units and so on. We have some customer orders of some products like customer1{A- 10Units, B- 15Units}, customer2{A- 5Units, B- 10Units}, customer3{A- 5Units, B- 5Units}. The task is fulfill maximum customer orders with the limited inventory we have. The result in this case should be filling customer2 and customer3 orders instead of just customer1.[The background for this problem is a realtime online retail scenario, where we have millions of customers and millions of products and we are trying to fulfill the orders as efficiently as possible]
How do I solve this?Is there an algorithm for this kind of problem, something like optimisation?
Edit: The requirement here is fixed. The only aim here is maximizing the number of fulfilled orders regardless of value. But we have millions of users and millions of products.
This problem includes as a special case a knapsack problem. To see why consider only one product A: the storage amount of the product is your bag capacity, the order quantities are the weights and each rock value is 1. Your problem is to maximize the total value you can fit in the bag.
Don't expect an exact solution for your problem in polynomial time...
An approach I'd go for is a random search: make a list of the orders and compute a solution (i.e. complete orders in sequence, skipping the orders you cannot fulfill). Then change the solution by applying a permutation on the orders and see if it's better.
Keep going with search until time runs out or you're happy with the solution.
It can be solved by DP.
Firstly sort all your orders with respect to A in increasing order.
Use this DP :
DP[n][m][o] = DP[n-a][m-b][o-c] + 1 where n-a>=0 and m-b >=0 o-c>=0
DP[0][0][0] = 1;
Do bottom up computation :
Set DP[i][j][k] = 0 , for all i =0 to Amax; j= 0 to Bmax; k = 0 to Cmax
For Each n : 0 to Amax
For Each m : 0 to Bmax
For Each o : 0 to Cmax
if(n>=a && m>=b && o>= c)
DP[n][m][o] = DP[n-a][m-b][o-c] + 1;
You will then have to find the max value of DP[i][j][k] for all values of i,j,k possible. This is your answer. - O(n^3)
Reams have been written about order fulfillment and yet no one has come up with a standard answer. The reason being that companies have different approaches and different requirements.
There are so many variables that a one size solution that fits all is not possible.
You would have to sit down and ask hundreds of questions before you could even start to come up with an approach tailored to your customers needs.
Indeed those needs might also vary, based on the time of year, the day of the week, what promotions are currently being run, whether customers are ranked, numbers of picking and packing staff/machinery currently employed, nature, size, weight of products, where products are in the warehouse, whether certain products are in fast/automated picking lines, standard picking faces or in bulk. The list can appear endless.
Then consider whether all orders are to be filled or are you allowed to partially fill an order and back-order out of stock products.
Does the entire order have to fit in a single box or are multiple box orders permitted.
Are you dealing with multiple warehouses and if so can partial orders be sent from each or do they have to be transferred for consolidation.
Should precedence be given to local or overseas orders.
The amount of information that you need at your finger tips before you can even start to plan a methodology to fit your customers specific requirements can be enormous and sadly, you are not going to get a definitive answer. It does not exist.
Whilst I realise that this is not a) an answer or b) necessarily a welcome post, the hard truth is that you will require your customer to provide you with immense detail as to what it is that they wish to achieve, how and when.
You job, initially, is the play devils advocate, in attempting to nail them down.
P.S. Welcome to S.O.

Sorting a group into sub groups based on two characteristics

I think I have a moderately useful solution but really curious if anyone has a better way. I'm also unsure how to properly define the type of problem I have, which means I haven't done much Googling.
The problem:
I have an unknown number of players who are selecting roles for a game. Valid roles are 1-5. The players MUST select a primary role, but they also have an option for a secondary role. The options for their secondary role are "none", one of 1-5 (different from their primary), or "any". I must then put players into teams of 5, each team requires one of each number. The goal is to maximize the number of correctly formed teams.
My current solution:
Divide players into 3 groups: A) People who selected "none" as their secondary, B) People who selected a number as their secondary, C) people who selected "any" s their secondary
Fill up teams with players from group A, placing them as their primary role.
Count the number of open positions remaining in the existing, non-full teams
Count the number roles represented by group B (both primary and secondary)
Based on the count of open positions, start adding players from group B - the order for filling positions is biggest deficit to smallest deficit, with a preference to picking players who have as their other role a number with the biggest size (by representation) in group B
Fill in the left overs with group C
If worse came to worse I could always just throw people into groups regardless of their preferred role, but ideally I would create correct groups.
Do primary and secondary roles have the same bearing on team balance? If so, you could look at the problem from the point of the teams and the status of each team's current roles/needs.
For example, the first person on team A may be both 1 and 3, so team A has a 1 and a 3. The next person on team A should not be a 1 or a 3 in either the primary or secondary role. Using the remaining team needs (roles 2,4,5) you can iterate through the possible members until either 1) the team is filled/balanced or 2) there are no remaining candidates.
Is there more to the problem? I get the feeling I'm not getting the whole issue.

Database design for bus reservation

I'm developing a reservation module for buses and I have trouble designing the right database structure for it.
Let's take following case:
Buses go from A to D with stopovers at B and C. A Passenger can reserve ticket for any route, ie. from A to B, C to D, A to D, etc.
So each route can have many "subroutes", and bigger contain smaller ones.
I want to design a table structure for routes and stops in a way that would help easily search for free seats. So if someone reserves seat from A to B, then seats from B to C or D would be still be available.
All ideas would be appreciated.
I'd probably go with a "brute force" structure similar to this basic idea:
(There are many more fields that should exist in the real model. This is only a simplified version containing the bare essentials necessary to establish relationships between tables.)
The ticket "covers" stops through TICKET_STOP table, For example, if a ticket covers 3 stops, then TICKET_STOP will contain 3 rows related to that ticket. If there are 2 other stops not covered by that ticket, then there will be no related rows there, but there is nothing preventing a different ticket from covering these stops.
Liberal usage or natural keys / identifying relationships ensures two tickets cannot cover the same seat/stop combination. Look at how LINE.LINE_ID "migrates" alongside both edges of the diamond-shaped dependency, only to be merged at its bottom, in the TICKET_STOP table.
This model, by itself, won't protect you from anomalies such as a single ticket "skipping" some stops - you'll have to enforce some rules through the application logic. But, it should allow for a fairly simple and fast determination of which seats are free for which parts of the trip, something like this:
SELECT *
FROM
STOP CROSS JOIN SEAT
WHERE
STOP.LINE_ID = :line_id
AND SEAT.BUS_NO = :bus_no
AND NOT EXIST (
SELECT *
FROM TICKET_STOP
WHERE
TICKET_STOP.LINE_ID = :line_id
AND TICKET_STOP.BUS_ID = :bus_no
AND TICKET_STOP.TRIP_NO = :trip_no
AND TICKET_STOP.SEAT_NO = SEAT.SEAT_NO
AND TICKET_STOP.STOP_NO = STOP.STOP_NO
)
(Replace the parameter prefix : with what is appropriate for your DBMS.)
This query essentially generates all combinations of stops and seats for given line and bus, then discards those that are already "covered" by some ticket on the given trip. Those combinations that remain "uncovered" are free for that trip.
You can easily add: STOP.STOP_NO IN ( ... ) or SEAT.SEAT_NO IN ( ... ) to the WHERE clause to restrict the search on specific stops or seats.
From the perspective of bus company:
Usually one route is considered as series of sections, like A to B, B to C, C to D, etc. The fill is calculated on each of those sections separately. So if the bus leaves from A full, and people leave at C, then user can buy ticket at C.
We calculate it this way, that each route has ID, and each section belongs to this route ID. Then if user buys ticket for more than one section, then each section is marked. Then for the next passenger system checks if all sections along the way are available.

Have/Want List Matching Algorithm

Have/Want List Matching Algorithm
I am implementing an item trading system on a high-traffic site. I have a large number of users that each maintain a HAVE list and a WANT list for a number of specific items. I am looking for an algorithm that will allow me to efficiently suggest trading partners based on your HAVEs and WANTs matched with theirs. Ideally I want to find partners with the highest mutual trading potential (i.e. I have a ton of things you want, you have a ton of things I want). I don't need to find the global highest-potential pair (which sounds hard), just find the highest-potential pairs for a given user (or even just some high-potential pairs, not the global max).
Example:
User 1 HAS A,C WANTS B,D
User 2 HAS D WANTS A
User 3 HAS A,B,D WANTS C
User 1 goes to the site and clicks a button that says
"Find Trading Partners" and the top-ranked result is
User 3, followed by User 2.
An additional source of complexity is that the items have different values, and I want to match on the highest valued trade possible, rather than on the most number of matches between two traders. So in the example above, if all items are worth 1, but A and D are both worth 10, User 1 now gets matched with User 2 above User 3.
A naive way to do this would to compute the max trade value between the user looking for partners vs. all other users in the database. I'm thinking with some lookup tables on the right things I might be able to do better. I've tried googling around, since this seems like a classical problem, but I don't know the name for it.
Can anyone recommend a good approach to solving this problem? I've seen sites like the Magic Online Trading League that seem to solve it in realtime.
You could do this in O(n*k^2) (n is the number of people, k is the average number of items they have/want) by keeping hash tables (or, in a database, indexes) of all the people who have and want given items, then giving scores for all the people who have items the current user wants, and want items the current user has. Display the top 10 or 20 scores.
[Edit] Example of how this would be implemented in SQL:
-- Get score for #userid wants
SELECT UserHas.UserID, SUM(Items.Weight) AS Score
FROM UserWants
INNER JOIN UserHas ON UserWants.ItemID = UserHas.ItemID
INNER JOIN Items ON Items.ItemID = UserWants.ItemID
WHERE UserWants.UserID = #userid
GROUP BY UserWants.UserID, UserHas.UserID
This gives you a list of other users and their score, based on what items they have that the current user wants. Do the same for items the current user has the others want, then combine them somehow (add the scores or whatever you want) and grab the top 10.
This problem looks pretty similar to stable roomamates problem. I don't see any thing wrong with the SQL implementation that got highest votes but as some else suggested this is like a dating/match making problem similar to the lines of stable marriage problem but here all the participants are in one pool.
The second wikipedia entry also has a link to a practical solution in javascript which could be useful
You could maintain a per-item list (as a complement to per-user list). Item search is then spot on. Now you can allow your self brute force search for most valuable pair by checking most valuable items first. If you want more complex (arguably faster) search you could introduce set of items that often come together as meta-items, and look for them first.
Okay, what about this:
There are basically giant "Pools"
Each "pool" contains "sections." Each "Pool" is dedicated to people who own a specific item. Each section is for people who own that item, and want another.
What I mean:
Pool A (For those requesting A)
--Section B (For those requesting A that have B)
--Section C (For those requesting A that have C, even if they also have B)
Pool B
--Section A
--Section B
Pool C
--Section A
--Section C
Each section is filled with people.
"Deals" would consist of one "Requested" item, and a "Pack," you're willing to give any or all of the items up to get the item you requested.
Every "Deal" is calculated per-pool.... if you want a given item, you go to the pools of the items you'd be willing to give, and it find the Section which belongs to the item you are requesting.
Likewise, your deal is placed in the pools. So you can immediately find all of the applicable people, because you know EXACTLY which pools, and EXACTLY which sections to search in, no sorting necessary once they've entered the system.
And, then, age would have priority, older deals would be picked, rather than new ones.
Let's assume you can hash your items, or at least sort them. Assume your goal is to find the best result for a given user, on request, as in your original example. (Optimizing trading partners to maximize overall trade value is a different question.)
This would be fast. O(log n) for each insertion operation. Worst case O(n) for suggesting trading partners, but you bound this by processing time.
You're already maintaining a list of items per user.
Give each user a score equal to the sum of the values of the items they have.
Maintain a list of user-HAVES and user-WANTS per item (#Dialecticus), sorted by user score. (You can sort on demand, or keep the lists sorted dynamically every time a user changes their HAVE list.)
When a user user1 requests suggested trade partners
Iterate over their items item in order by value.
Iterate over the user-HAVES user2 for each item, in order by user score.
Compute trade value for user1 trades-with user2.
Remember best trade so far.
Keep hash of users processed so far to avoid recomputing value for a user multiple times.
Terminate when you run out of processing time (your real-time guarantee).
Sorting by item value and user score is the approximation that makes this fast. I'm not sure how sub-optimal it would be, though. There are certainly easy examples where this would fail to find the best trade if you don't run it to completion. In practice, it seems like it might be good enough. In the limit, you can make it optimal by letting it run until it exhausts the lists in step 4.1 and 4.2. There's extra memory cost associated with the inverted lists, but you didn't say you were memory constrained. And generally, if you want speed, it's not uncommon to trade-off space to get it.
I mark item by letter and user by number.
m - number of items in all have/want lists (have or want, not have and want)
x - number of users.
For each user you have list of his wants and haves. Left line is want list, right is have list (both will be sorted so we can use binary search).
1 - ABBCDE FFFGH
2 - CFGGH BE
3 - AEEGH BBDF
For each pair of users you generate two values and store them somewhere, you'd only generate it once and than actualize. Sorting first table and generating second, is O(m*x*log(m/x)) + O(log(m)) and will require O(x^2) extra memory. These values are: how many would first user get and how many another (if you want you can modify these values by multiplying them by value of particular item).
1-2 : 1 - 3 (user 1 gets 1) - (user 2 gets 3)
1-3 : 3 - 2
2-3 : 1 - 1
You also compute and store best trader for each user. After you've generated this helpful data you can quickly query.
Adding/Removing item - O(m*log(m/x)) (You loop through user's have/want list and do binary search on have/want list of every other user and actualize data)
Finding best connection - O(1) or O(x) (Depends on whether result stored in cache is correct or needs to be updated. You loop through user's pairs and do whatever you want with data to return to user the best connection)
By m/x I estimate number of items in single user's want/have list.
In this algorithm I'm assuming that all data isn't stored in Database (I don't know if binary search is possible with Databases) and that inserting/removing item into list is O(1).
PS. Sorry for bad english and I hope I've computed it all correctly and that it is working because I also need it.
Of course you could always seperate the system into three categories; "Wants," "Haves," and "Open Offers." So lets say User1 has Item A, User2 has Item B & C and is trading those for item A, but User1 still wants Item D, and User2 wants Item E. So User1 (assuming he's the trade "owner") puts a request, or want for Item D and Item E, thus the offer stands, and goes on the "Open Offers" list. If it isn't accepted or edited within two or so days, it's automatically cancelled. So User3 is looking for Item F and Item G, and searches on the "Have list" for Items F & G, which are split between User1 & User2. He realizes that User1 and User2's open offer includes requests for Items D & E, which he has. So he chooses to "join" the operation, and it's accepted on their terms, trading and swaping they items among them.
Lets say User1 now wants Item H. He simply searches on the "Have" list for the item, and among the results, he finds that User4 will trade Item H for Item I, which User1 happens to have. They trade, all is well.
Just make it BC only. That solves all problems.

Fair matchmaking for online games [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed last year.
Improve this question
Most online games arbitrarily form teams. Often times its up to the user, and they'll choose a fast server with a free slot. This behavior produces unfair teams and people rage quit. By tracking a player's statics (or any statics that can be gathered) how can you choose teams that are as fair as possible?
One of the more well-known systems now is Microsoft's TrueSkill algorithm.
People have also attempted to adapt the Elo system for team matchmaking, though it's more designed for 1-v-1 pairings.
After my previous answer, I realized that if you wanted to get really fancy you could use a really simple but powerful idea: Markov Chains.
The intuitive idea behind using a Markov Chain goes something like this:
Create a graph G=(V,E)
Let each vertex in V represent an entity
Let each edge in E represent a transitioning probability between entities. This means that the sum of the out degrees of each vertex must be 1.
At the start (time t=0) assign each entity a unit value of 1
At each time step, transition form entity i, j by the transition probability defined in 3.
Let t->infinity then the value of each entity at t=infinity is the equilibrium (that is the chance of a transition into an entity is the same as the total chance of a transition out of an entity.)
This idea has for example been used successfully to implement Google's page rank algorithm. To describe how you can use it consider the following:
V = players E = probability of transitioning form player to player based on relative win/loss ratios
Each player is a vertex.
An edge from player A to B (B is not equal to A) has probability X/N where N is the total number of games played by A and X is the total games lost to B. Add an edge from A to A with probability M/N where M is the total number of games won by A.
Assign a skill level of 1 to each player at the start.
Use the Power Method to find the dominant eigenvector of the link matrix constructed from the probabilities defined in 3.
The dominant eigenvector is the amount of skill each player has at t=infinity, that is
the amount of skill each player has once the markov chain has come to equilibrium. This is a very robust measure of each players skill using the topology of the win/loss space.
Some caveats: there are several problems when applying this directly, the biggest problem will be seperated webs (that is your markov chain will not be irreducible and so the power method will not be guaranteed to converge.) Lucky for you, google has dealt with all these problems and more when implementing their page rank algorithm and all that remains for you is to look up how they circumvent these problems if you are so inclined.
One way would be to simply create a list of players looking for matches at any given time, sorted by player rank. Once you've reached enough people to start a new match (or perhaps, two less than the required), group them as such:
Remove best and worst player and put them on team 1
Remove now-best and now-worst player (really second-best and second worst) and put them on team 2
If there are only two players left, place each one on different teams, depending on who has the lowest combined score. Otherwise, repeat:
Remove now-best and now-worst and put them on team 1
Remove now-best and now-worst and put them on team 2
etc. etc. etc. until your teams are filled.
If you decided to start a new match with less than the required, then here it is time to let the players wait for new people to join. As soon as a new person joins, you're going to want to put them on the open team with the least combined score.
Alternatively, if you wanted to avoid games that combined good and bad players on the same team, you could split up everyone into tiers, (groups based on their ranking) and only match people within the same tier. This would require a new open/sorted list for each extra tier.
Example
Game is 4v4
A - 1000 pnts
B - 800 pnts
C - 600 pnts
D - 400 pnts
E - 200 pnts
F - 100 pnts
As soon as you get these six, group them into teams as such:
Team 1: A, F, D (combined score 1500)
Team 2: B, E, C (combined score 1600)
Now, we wait for two more players to join.
First, player E comes along with 500 pnts. He goes to Team 1, because they have a lower combined score.
Then, player F comes with 800 pnts. He goes to Team 2, because are the only open team left.
Total teams:
Team 1: A, F, D, E (combined score 2000)
Team 2: B, E, C, F (combined score 2400)
Note that the teams were actually pretty fair until the last two came in. To be honest, the best way would be to only create the match when you have enough players to start it. But then the wait times might be too long for the player.
Adjust with how much you need before forming the match. Lower = less wait time, more possibly unfair. Higher = more wait time, less possibly unfair.
If you have a pre-game screen, lower would also offer more time for people to chat and talk with their to-be teammates while waiting.
It is difficult to estimate the skill of any one player by a single metric and such a method is prone to abuse. However, if you only care about implementing something simple that will work well try the following:
keep track of wins and losses
use the percentage of wins vs losses as the statistic to match players ( in some sense of the word match, i.e. group players with similar percentages)
This has the obvious downfall of the case where a player may have a win-loss ratio of 5-0 and another of 50-20, the first has an infinite percentage while the other has a more reasonable percentage. It makes sense for the matching system to acknowledge this and be far more confident that the latter player has actual more skill because of the consistency required; however, pitting the two players against each other would probably be a good thing because the 5-0 player is probably trying to work the system by playing versus weaker players so pitting him against a consistently good player would do everyone good.
Note, I speak from experience from playing only strategy games such as Warcraft 3 where this is the typical match making behaviour. It seems to me like the percentage of wins over losses is a great metric by which to match players.
Match based on multiple attributes. I've implemented a simple matchmaking system using AWS Cloudsearch (based on Apache Solr). For example matching based on the a combination of following fields is possible
{
"fields": {
"elo_rating": 3121.44,
"points": 404,
"randomizer": 35,
"last_login": "2014-10-09T22:57:57Z",
"weapons": [
"CANNON",
"GUN"
]
}
It is now possible to run queries inclusive of multiple fields like the following.
(and (or weapons:'GUN' weapons:'CANNON' weapons:'DRONE')(and last_login:['2013-05-25T00:00:00Z','2014-10-25T00:00:00Z'])(and points:[100, 200])(and elo_rating:[1000, 2000]))}

Resources