Safari lodge vehicle allocations - resource constraint / allocation algorithm?

I work at a safari lodge where we need to continuously allocate guests to vehicles for the duration of their stay, subject to the following constraints:
no more than 6 guests allocated to a vehicle at any time
guests must be allocated to the same vehicle for the duration of their stay
where a guest (or group of guests) pays for a private vehicle, no other guests may be allocated to that vehicle
private vehicle bookings impact the remaining number of vehicles available for allocation
guests travelling together must be allocated to the same vehicle(s)
the maximum number of available vehicles is fixed (=5)
the maximum number of potential guests to be allocated is fixed (=24)
at times, certain vehicles may be unavailable for a period
allocations are updated daily to capture the latest and last-minute bookings, and each update must treat the existing allocation for the current day as fixed when planning forward (i.e. tomorrow onwards)
the goal is to minimize the number of vehicles in use at any point, without breaking up groups or moving guests between vehicles during their stay.
I've come across a range of algorithmic approaches, from resource scheduling to greedy algorithms, but realistically I'm not technical enough to evaluate what I need and how it all hangs together. I have some basic coding experience in JavaScript and VBA (shudder), but I'm not especially mathematical. I'm viewing this as an interesting learning exercise that could make life a lot easier if I can find a way to automate much of the allocation logic. I can't quite form a 'picture' of the problem/approach in my mind - is it a large matrix, or some kind of tree?
Ideally, I'd be able to generate an algorithmic solution that would give me 90% of what I need, which I could then tweak manually (and minimally) for a final solution, taking into account the particular idiosyncrasies of certain guests or groups. This final solution would then need to be the input (starting point) to the next allocation update when the most recent list of (updated) bookings is received.
It's very possible that this problem is too complex for my technical abilities - but I won't know until I ask! Many thanks.
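For what it's worth, the shape of this is closer to bin packing than to a tree: each booking group is an atomic item whose size is its head count (or a whole vehicle, if private), existing assignments stay pinned, and new groups are first-fit into the fewest vehicles. Below is a minimal single-day sketch in Python; the data shapes (`groups` as tuples, an `existing` dict) are invented for illustration, and a real version would repeat the capacity check across every day of each group's stay:

```python
# Hedged first-fit sketch for the safari allocation problem. All names
# (allocate, existing, CAPACITY) are illustrative assumptions, not a
# definitive implementation.

CAPACITY = 6          # max guests per vehicle
VEHICLES = ["V1", "V2", "V3", "V4", "V5"]

def allocate(groups, existing, unavailable=()):
    """groups: list of (group_id, size, wants_private) still to assign.
    existing: dict vehicle -> list of (group_id, size, wants_private),
    carried over from the current day and never changed here."""
    load = {v: sum(g[1] for g in existing.get(v, [])) for v in VEHICLES}
    private = {v for v, gs in existing.items() if any(g[2] for g in gs)}
    plan = {v: list(gs) for v, gs in existing.items()}

    # Place large groups first: they are the hardest to fit.
    for gid, size, wants_private in sorted(groups, key=lambda g: -g[1]):
        for v in VEHICLES:
            if v in unavailable or v in private:
                continue
            if wants_private and load[v] > 0:
                continue                      # private groups need an empty vehicle
            if load[v] + size <= CAPACITY:
                plan.setdefault(v, []).append((gid, size, wants_private))
                load[v] += size
                if wants_private:
                    private.add(v)            # vehicle is now closed to others
                break
        else:
            raise ValueError(f"no vehicle can take group {gid}")
    return plan
```

First-fit-decreasing is a classic bin-packing heuristic that usually comes very close to the minimum vehicle count at this scale; with only 5 vehicles and 24 guests you could even brute-force the cases where the greedy pass fails, then hand-tweak the result as you describe.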

Related

Parallel Solving with PDPTW in OptaPlanner

I'm trying to increase OptaPlanner performance using parallel methods, but I'm not sure of the best strategy.
I have PDPTW:
vehicle routing
time-windowed (1 hr windows)
pickup and delivery
When a new customer wants to add a delivery, I'm trying to figure out a fast way (less than a second) to show them which time slots in a day are available (8am, 9am, 10am, etc). Each time slot has a different score outcome: some are very efficient, while some aren't bookable at all given the time/situation and the increased drive times.
For performance, I don't want to try each of the hour times in sequence as it's too slow.
How can I try the customer's delivery across all the time slots in parallel? It would make sense to run the solver first, before adding the customer's potential delivery window, and then share that solved original state across all of the candidate time slots, each being solved independently.
Is there an intuitive way to do this? Eg:
Reuse some of the original solving computation (the state before adding the new delivery). Maybe this can even be cached ahead of time?
Perhaps run all the time slot solving instances on separate servers (or at least multiple threads).
What is the recommended setup for something like this? It would be great to return an HTTP response within a second. This is for roughly 100-200 deliveries and 10-20 trucks.
Thanks!
A) If you optimize the assignment of 1 customer to 1 index in 1 of the vehicles, while pinning all other already-assigned customers, then you forgo all optimization benefits. It's not NP-hard.
You can still use OptaPlanner's <constructionHeuristic/> for this (<localSearch/> won't improve the score), with or without moveThreadCount to spread it across cores, even though the main benefit will just be the incremental score calculation, not the AI algorithms.
B) Optimize the assignment of all customers to an index of a vehicle. The real business benefits - like 25% less driving time - come when adding a new customer allows moving existing customer assignments too. The problem is that those existing customers have already been given a time window that they blocked out in their agenda. But that doesn't need to be a problem if those time windows are wide enough: they are just hard constraints. Wider time windows = more driving-time optimization opportunities (= more $$$, less CO₂ emissions).
What about getting a response within one second?
At that point, you don't need to publish (= share info with the customer) which vehicle will come at which time in which order. You only need to publish whether or not you accept the time window. There are two ways to accomplish this:
C) Decision table based (a relaxation): no more than 5 customers per vehicle per day.
Pitfall: if it gets 5 customers in the 5 corners of the country/state, it might still be infeasible. Factor in the average Euclidean distance between any 2 customer-location pairs to influence the decision.
D) By running OptaPlanner until termination on feasible=true, starting from a warm start of the previous schedule. If no such feasible solution is found within 1000ms, reject the time-window proposal.
Pitfall with D): if 2 requests come in at the same time, and you run them in parallel, so neither takes into account the other one, they could be feasible individually but infeasible together.
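To make the fan-out idea from the question concrete (solve the base schedule once, then probe every candidate slot concurrently), here is a hedged Python sketch. It deliberately does not use the real OptaPlanner API; `probe_slot` and `simulate_insertion_cost` are invented stand-ins for "warm-start the solver, terminate on feasible or after a timeout", as in option D:

```python
# Hedged sketch: evaluate all candidate time slots in parallel against a
# shared, already-solved base schedule. The solver itself is faked.
from concurrent.futures import ThreadPoolExecutor

SLOTS = [8, 9, 10, 11, 12, 13, 14, 15, 16]   # candidate hours

def simulate_insertion_cost(base_schedule, delivery, hour):
    # Toy placeholder: pretend edge-of-day slots come back infeasible
    # and mid-day insertions add the least driving time.
    if hour in (8, 16):
        return None
    return abs(hour - 12)

def probe_slot(base_schedule, delivery, hour, timeout_ms=1000):
    # A real version would clone base_schedule, pin all existing visits,
    # insert the delivery with a [hour, hour+1) window, and run the solver
    # with "terminate when feasible or after timeout_ms".
    return hour, simulate_insertion_cost(base_schedule, delivery, hour)

def available_slots(base_schedule, delivery):
    with ThreadPoolExecutor(max_workers=len(SLOTS)) as pool:
        results = list(pool.map(
            lambda h: probe_slot(base_schedule, delivery, h), SLOTS))
    return [(h, cost) for h, cost in results if cost is not None]

print(available_slots({"routes": []}, {"id": "new-delivery"}))
```

In practice each probe would be its own solver instance (threads via moveThreadCount, or separate servers behind a load balancer), and the pitfall from D) still applies: two concurrently accepted probes can be jointly infeasible, so re-validate against the single authoritative schedule before committing.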

Can I use a spreadsheet to calculate profit/loss and current position for trading of shares?

I have data about some trades of a share (currently only 20 transactions as a learning subset, with the full set covering around 50,000 transactions for different shares).
I'm trying to work out how to determine the profit for each trade, as well as the current position. Not all the shares held are sold in one trade, so there is sometimes a carry-over (whole and fractional values). Sometimes several buys or sells occur in a row, rather than a strict buy/sell/buy/sell alternation.
I'm NOT the one trading! (Just to make that clear).
I suppose, what I'm trying to get to is "what am I actually asking about?".
I suspect there is a perfectly standardised way of doing this, but I don't know what that is.
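For what it's worth, the standardised vocabulary here is "lot matching" (cost-basis accounting): each sell is matched against open buy lots under a convention such as FIFO (first in, first out) or average cost. A minimal FIFO sketch in Python, with trades as invented (side, quantity, price) tuples:

```python
# Hedged FIFO lot-matching sketch. Each sell consumes the oldest open
# buy lots first; realised P/L is (sell price - buy price) * matched qty.
from collections import deque

def fifo_pnl(trades):
    """trades: iterable of (side, qty, price); side is 'buy' or 'sell'.
    Returns (realised_pnl, open_position_qty)."""
    lots = deque()                        # open buy lots: [qty, price]
    realised = 0.0
    for side, qty, price in trades:
        if side == "buy":
            lots.append([qty, price])
            continue
        remaining = qty
        while remaining > 1e-12:          # float-safe zero test
            if not lots:
                raise ValueError("sell exceeds open position")
            lot = lots[0]
            matched = min(lot[0], remaining)
            realised += matched * (price - lot[1])
            lot[0] -= matched
            remaining -= matched
            if lot[0] <= 1e-12:
                lots.popleft()            # lot fully consumed
    return realised, sum(q for q, _ in lots)

# Example: buy 10 @ 2.0, sell 4 @ 3.0, sell 2.5 @ 2.5
print(fifo_pnl([("buy", 10, 2.0), ("sell", 4, 3.0), ("sell", 2.5, 2.5)]))
# -> (5.25, 3.5)
```

The same logic can be built in a spreadsheet, though it gets fiddly; note that FIFO is only one convention, and average-cost or specific-lot matching may be the required one depending on jurisdiction.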

Optimal shift scheduling algorithm

I have been trying for some time to solve a scheduling problem for a pool that I used to work at. The problem is as follows...
There are X many lifeguards that work at the pool, and each has a specific number of hours they would like to work. We hope to keep the average deviation from each lifeguard's desired number of hours as low as possible, and to be as fair as possible to all. Each lifeguard is also a college student, and thus has a different schedule of availability.
Each week the pool's schedule of events is different than the last, thus a new schedule must be created each week.
Within each day there will be so many lifeguards required for certain time intervals (e.g. 3 guards from 8am-10am, 4 guards from 10am-3pm, and 2 guards from 3pm-10pm). This is where the hard part comes in: there are no clearly defined shifts (slots) to place each of the lifeguards into, because creating a schedule may not even be possible given the lifeguards' availability plus the changing weekly pool schedule of events.
Therefore a schedule must be created from a blank slate provided only with...
The Lifeguards and their information (# of desired hours, availability)
The pool's schedule of events, plus number of guards required to be on duty at any moment
The problem can now be clearly defined as "Create a possible schedule that covers the required number of guards at all times each day of the week AND be as fair as possible to all lifeguards in scheduling."
Creating a possible schedule that covers the required number of guards at all times each day of the week is the part of the problem that is a necessity and must be solved completely. The second half, being as fair as possible to all lifeguards, significantly complicates the problem, leading me to believe I will need an approximation approach, since the number of possible ways to divide up a work day could be ridiculous - but a poor split may sometimes be unavoidable, as the only possible schedule may be ridiculous in terms of fairness.
Edit: One of the most commonly suggested algorithms I've found is the "Hospitals/Residents problem"; however, I don't believe it is applicable, since there are no clearly defined slots to place workers into.
One way to solve this is with constraint programming - the Wikipedia article provides links for several constraint programming languages and libraries. Here is a paper describing how to use constraint programming to solve scheduling problems.
Another option is to use a greedy algorithm to find a (possibly invalid) solution, and to then use local search to make the invalid solution valid, or else to improve the sub-optimal greedy solution. For example, start by assigning each lifeguard their preferred hours, which will result in too many guards being scheduled for some slots and will also result in some guards being assigned a ridiculous number of hours; then use local search to un-assign the guards with the most hours from the slots that have too many guards assigned.
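As a rough illustration of that greedy-then-repair loop, here is a hedged Python sketch; the data structures (`availability`, `required`, `desired`) are invented, and the repair pass only fixes overstaffing:

```python
# Hedged sketch of greedy assignment + local repair for lifeguard hours.
# availability[g] = set of hours guard g can work; required[h] = guards
# needed at hour h; desired[g] = hours guard g wants. Names illustrative.

def greedy_schedule(guards, hours, availability, required, desired):
    assigned = {g: set() for g in guards}
    # Greedy pass: give every guard their available hours, up to their wish.
    for g in guards:
        for h in sorted(availability[g]):
            if len(assigned[g]) < desired[g]:
                assigned[g].add(h)
    # Repair pass: while some hour is overstaffed, unassign the guard who
    # is furthest over (or least under) their desired hours.
    changed = True
    while changed:
        changed = False
        for h in hours:
            staffed = [g for g in guards if h in assigned[g]]
            if len(staffed) > required[h]:
                victim = max(staffed,
                             key=lambda g: len(assigned[g]) - desired[g])
                assigned[victim].discard(h)
                changed = True
    return assigned
```

A fuller local search would also swap hours between guards to repair coverage shortfalls and improve fairness, rejecting any move that breaks the required staffing levels.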
You need to turn your fairness criterion into an objective function. Then you can pick from any number of workforce scheduling tools. For instance, you describe wanting to minimize the average difference between desired and assigned hours. However, I'd suggest that you consider minimizing the maximum difference instead. This seems fairer (to me), and it will generally result in a different schedule.
The problem, however, is a bit more complex. For instance, if one guard is always getting shorted while the others all get their desired hours, that's unfair as well. So you might want to introduce variables into your fairness model that represent the cumulative discrepancy for each guard from previous weeks. Also, a one-hour discrepancy for a guard who wants to work four hours a week may be more unfair than for a guard who wants to work twenty. To handle things like that, you might want to weight the discrepancies.
You might have to introduce constraints, such as that no guard is assigned more than a certain number of hours, or that every guard has a certain amount of time between shifts, or that the number of slots assigned to any one guard in a week should not exceed some threshold. Many scheduling tools have capabilities to handle these kinds of constraints, but you have to add them to the model.
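To make the objective-function idea concrete, here is a hedged sketch of the min-max formulation as a small integer program using the PuLP library; all the data below is invented for illustration:

```python
# Hedged ILP sketch: minimise the maximum deviation from desired hours
# while meeting per-slot coverage. All data here is invented for demo.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, PULP_CBC_CMD

guards = ["ann", "bob", "cat"]
slots = list(range(8, 22))                       # one-hour slots, 8am-10pm
required = {h: 2 if 10 <= h < 15 else 1 for h in slots}
desired = {"ann": 6, "bob": 8, "cat": 4}         # wished-for weekly hours
available = {g: set(slots) for g in guards}      # everyone free, for demo

prob = LpProblem("fair_shifts", LpMinimize)
x = LpVariable.dicts("work", (guards, slots), cat="Binary")
max_dev = LpVariable("max_dev", lowBound=0)
prob += max_dev                                  # objective: min-max deviation

for h in slots:                                  # coverage at every hour
    prob += lpSum(x[g][h] for g in guards) >= required[h]
for g in guards:
    hours = lpSum(x[g][h] for h in slots)
    prob += hours - desired[g] <= max_dev        # linearised |hours - desired|
    prob += desired[g] - hours <= max_dev
    for h in set(slots) - available[g]:          # respect availability
        prob += x[g][h] == 0

prob.solve(PULP_CBC_CMD(msg=False))
for g in guards:
    print(g, sorted(h for h in slots if x[g][h].value() > 0.5))
```

Minimising the lpSum of per-guard deviation variables instead gives the "average" version; weights, or carried-over discrepancies from previous weeks, slot in as extra terms in the same way.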

How to find Best Price for a Deck of Collectible Cards?

Or The Traveling Salesman plays Magic!
I think this is a rather interesting algorithmic challenge. Curious if anyone has any good suggestions for solving it, or if it is already solvable in a known way.
TCGPlayer.com sells collectible cards for a variety of games, including Magic the Gathering. Instead of just selling cards from their inventory they are actually a re-seller from multiple vendors (50+). Each vendor has a different inventory of cards and a different price per card. Each vendor also charges a flat rate for shipping (usually). Given all of that, how would one find the best price for a deck of cards (say 40 - 100 cards)?
Just finding the best price for each card doesn't work because if you order 10 cards from 10 different vendors then you pay shipping 10 times, but if you order all 10 from one vendor you only pay shipping once.
The other night I wrote a simple HTML Scraper (using HTML Agility Pack) that grabs all the different prices for each card, and then finds all the vendors that carry all the cards in the deck, totals the price of the cards from each vendor and sorts by price. That was really easy. The total prices ended up being near the total median price for all the cards.
I did notice that some of the individual cards ended up being much higher than the median price. That raises the question of splitting an order over multiple vendors, but only if enough savings could be made by splitting the order up to cover the additional shipping (each added vendor adds another shipping charge).
Logically it seems that the best price will probably only involve a few different vendors, but if the cards are expensive enough (and some are) then in theory ordering each card from a different vendor could still result in enough savings to justify all the extra shipping.
If you were going to tackle this how would you do it? Pure brute force figuring every possible combination of card / vendor combinations? A process that is more likely to be done in my lifetime would seem to involve a methodical series of estimates over a fixed number of iterations. I have a couple ideas, but am curious what others might suggest.
I am looking more for the algorithm than actual code. I am currently using .NET though, if that makes any difference.
I would just be greedy.
Assume that you are going to eat the shipping cost and buy from all vendors. Work out the absolute lowest price you can get. Then, for each vendor, work out how much being able to buy some cards from them rather than from someone else saves you. Order the vendors by (shipping cost - incremental savings).
Starting with the vendor providing the least value, axe that vendor, redistribute their cards to the other vendors, and recalculate the incremental savings. Wash, rinse, and repeat until your most marginal vendor is saving you money.
This should find a good solution but is not guaranteed to find the best solution. Finding the absolute best solution, though, seems likely to be NP-hard.
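A hedged Python sketch of that elimination loop (a slight variant: it tries dropping each currently used vendor and keeps the best drop). The data shapes - `prices[vendor][card]` and a flat `shipping[vendor]` fee - are assumptions:

```python
# Hedged sketch of the greedy vendor-elimination heuristic. Assumed
# shapes: prices[v][c] = vendor v's price for card c (missing = out of
# stock), shipping[v] = v's flat shipping fee, deck = list of card names.

def order_cost(vendor_set, deck, prices, shipping):
    """Cheapest buy of the whole deck restricted to vendor_set.
    Returns (total_cost, vendors_actually_used) or (None, None)."""
    total, used = 0.0, set()
    for card in deck:
        offers = [(prices[v][card], v) for v in vendor_set if card in prices[v]]
        if not offers:
            return None, None              # this set can't cover the deck
        price, v = min(offers)
        total += price
        used.add(v)
    return total + sum(shipping[v] for v in used), used

def greedy_eliminate(deck, prices, shipping):
    vendors = set(prices)
    best, used = order_cost(vendors, deck, prices, shipping)
    if best is None:
        raise ValueError("deck cannot be covered even with every vendor")
    improved = True
    while improved:
        improved = False
        for v in sorted(used):             # try axing each used vendor
            cost, new_used = order_cost(vendors - {v}, deck, prices, shipping)
            if cost is not None and cost < best:
                best, vendors, used = cost, vendors - {v}, new_used
                improved = True
                break                      # recalculate from the new set
    return best, used
```

As the answer says, this only ever removes vendors, so it can miss the optimum, but it makes a decent warm start for something exact.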
This is isomorphic to the uncapacitated facility location problem.
card in the deck : client
vendor : possible facility location
vendor shipping rate : cost of opening a facility at a location
cost of a card with a particular vendor : "distance" from a client to a facility
Facility location is a well-studied problem in the combinatorial optimization literature.
Interesting question! :)
So if we have n cards and m vendors, the brute-force approach might have to check up to m^n combinations, right? (A bit less, since not every vendor has every card, but I guess that doesn't really matter in the grand scheme of things ;)
Let's for a second assume each vendor has each card and then see later-on how things change if they don't.
1. Find the cheapest one-vendor solution.
2. Order the cards by price; find the most expensive card that's cheaper at another vendor (call it vendor 2).
3. For all cards from vendor 1, move them to vendor 2 if they're cheaper there.
4. If having added vendor 2 doesn't make the order cheaper, undo and terminate; otherwise repeat from step 2.
If no single vendor has all the cards, you have to start with a multi-vendor situation. For each vendor, you might start by buying all the cards that exist there, then apply the algorithm to the remaining cards.
Obviously, you may not be able to exploit all subtleties in the pricing with this method. But if we assume that a large portion of the price differences is made up by individual high-price cards, I think you can find a reasonable solution with this way.
Ok, after writing all this I realized the m^n assumption is actually wrong.
Once you have chosen a set of vendors to buy from, you can simply choose the cheapest vendor for each card. This is a great advantage because the individual choices of where to buy each card don't interfere with each other.
What does this mean for our problem? At first glance, it means that the selection of vendors is the hard part (in terms of computational complexity), not the individual allocation of your buying choices. So instead of m^n, you get 2^m possible configurations in the worst case. What we need is a heuristic for choosing vendors rather than choosing individual cards - which might actually make the heuristic above even more justifiable.
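That observation is worth a sketch: once the vendor set is fixed, each card simply takes its cheapest in-set offer, so exhaustive search runs over vendor subsets (2^m) rather than per-card assignments (m^n). Same assumed `prices`/`shipping` shapes as above; capping the subset size keeps it tractable:

```python
# Hedged sketch: with a fixed vendor set, each card takes its cheapest
# in-set offer, so the search space is vendor subsets, not assignments.
from itertools import combinations

def subset_cost(vendor_set, deck, prices, shipping):
    total = 0.0
    for card in deck:
        offers = [prices[v][card] for v in vendor_set if card in prices[v]]
        if not offers:
            return None                    # subset can't cover the deck
        total += min(offers)
    return total + sum(shipping[v] for v in vendor_set)

def best_order(deck, prices, shipping, max_vendors=4):
    """Exact over all subsets of up to max_vendors vendors."""
    vendors = list(prices)
    best_cost, best_set = float("inf"), None
    for k in range(1, max_vendors + 1):
        for subset in combinations(vendors, k):
            cost = subset_cost(subset, deck, prices, shipping)
            if cost is not None and cost < best_cost:
                best_cost, best_set = cost, subset
    return best_cost, best_set
```

Charging shipping for every vendor in the subset is safe here, because any subset containing an unused vendor is dominated by the same subset without it, so the minimum is unaffected. And the cap is defensible: as noted above, the best orders rarely involve more than a handful of vendors.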
I myself have pondered this. Consider the following:
If it takes you a week to figure out, code, and debug an algorithm that only provides a 1% discount, would you do it?
The answer is probably "No" (unless you're spending your entire life savings on cards, in which case you may be crazy). =)... or Amazon.com
Consequently, there is already an easy approximating algorithm:
Wait until you're buying lots of cards (reduce the shipping overhead).
Buy the cards from 3 vendors:
- the two with the cheapest-but-most-diverse inventories
- a third which isn't really cheap but definitely has every card you'd want.
Optimize accordingly (for each card, buy from the cheaper one).
Also consider local vendors you could just walk to, pre-constructed decks, and trading.
Based on firsthand and secondhand experience, I can say you will find that you can get close to the median price while paying perhaps a few dollars more in shipping than you otherwise would. You may have to pay a tiny bit more for understocked cards, but those will be few and far between, and the shipping savings will make up for it.
I recall the old programming adage: "Never optimize, until it's absolutely necessary; chances are you won't need to, or would have optimized the wrong thing." (e.g. your time is a resource too, and also has monetary value)
edit: Given that, this is an amazingly cool problem and one should solve it if one has time.
My algorithm goes like this:
For each card, calculate the average available price, i.e. the sum of the prices offered by each vendor divided by the number of vendors.
Now, for that card, shortlist the vendors that offer it at or below the average price.
Now each card has a list of vendors. Take the intersection of these lists; this way we end up with the set of vendors providing the maximum number of cards at or below the average price.
I'm still thinking over the next steps, but I'm putting the rough idea down here.
We are then left with cards that fall outside the shortlist. For each such card, look at the price lists of the already-shortlisted vendors with the most cards; if the price difference is less than the shipping cost, add the card to that vendor's list.
I know this will require a lot of optimization, but this is what I have roughly figured out. Hope it helps.
How about this:
Calculate the average price per ordered card across all vendors.
For each vendor that has at least one of the cards, calculate the total savings for all cards in the order as the difference between each card's price at that vendor and the average price.
Start with the vendor with the highest total savings and select all of those cards from that vendor.
Continue to select vendors with the next highest total savings until you have all of the cards in the order selected. Skip vendors that don't have cards that you still need.
From the selected list of vendors, redistribute the card purchases to the vendors with the best price for that card.
From the remaining list of vendors, and if the list is small enough, you could then brute force any vendors with a low card count to see if you could move the cards to other vendors to eliminate the shipping cost.
I actually wrote this exact thing last year. The first thing I do after loading all the prices is weed out my card pool:
Each vendor can have multiple versions of each card, as there are reprints. Find the cheapest one.
Eliminate any offer where the card's price is greater than the cheapest card + shipping combo elsewhere. That is, if I can buy the card cheaper as a one-off from another vendor than by adding it to an existing order from your store, I will buy it from the other vendor.
Eliminate any vendor whose entire offering I can buy cheaper (for every card, and on the total + shipping) from another vendor. Basically, if another vendor out-prices you on every card, you are gone.
Unfortunately, this still leaves a huge pool.
Then I do some sorting and some brute-force-depth-first summing and some pruning and eventually end up with a result.
Anyway, I tuned it up to the point that I can do 70 cards and, within a minute, get within 5% of the optimal goal. And in an hour, less than 2%. And then, a couple of days later, the actual, final result.
I am going to read more about facility location. Thanks for that tip!
What about using a genetic algorithm? I think I'll try that one myself. You might seed the population by including both a chromosome with the lowest prices and another with the lowest shipping costs.
BTW, did you finally implement any of the solutions presented here? Which one? Why?
Cheers!

How to manage transactions, debt, interest and penalty?

I am making a BI system for a bank-like institution. This system should manage credit contracts, invoices, payments, penalties and interest.
Now, I need to write a method that builds an invoice. I have to calculate how much the customer has to pay right now. He has a debt, which he has to pay off. He also has to pay interest. If he was ever late with a due payment, penalties are applied for each day he is late.
I thought there were 2 ways of doing this:
By having only one original state - the contract's original state - and each time computing the monthly payment the customer has to make, taking the actual payments made into account.
By constantly creating intermediary states, going from the last intermediary state and considering only the events that took place between these two intermediary states. This means having a job that runs periodically (daily, monthly), takes the last saved state, applies the changes (due payments, actual payments, changes in global constants like the penalty rate, which is controlled by the Central Bank), and saves the resulting state.
The benefits of the first variant:
Always actual. If changes were made with a date from the past (a guy came with a paid invoice 5 days after he made the payment to the bank), they will be correctly reflected in the results.
The flaws of the first variant:
Takes a long time to compute.
Documents already printed may no longer match the current results if the data changes due to operations entered with a back date.
The benefits of the second variant:
Works fast, and aggregated data is always available for search and reports.
Simpler to compute
The flaws of the second variant:
Vulnerable to failed jobs.
Errors in the past propagate until the end, to the final results.
An intermediary result cannot be changed if new data about past transactions arrives (it can, but it's hard and has many implications, so I'd rather mark it as taboo).
Jobs cannot run successfully and without problems if an unfinished transaction exists (e.g. an issued invoice that hasn't yet been paid).
Is there any other way? Can I combine the benefits from these two? Which one is used in other similar systems you've encountered? Please share any experience.
Problems of this nature are always more complicated than they first appear. This is a consequence of what I like to call the Rumsfeldian problem of the unknown unknown. Basically, whatever you do now, be prepared to make adjustments for arbitrary future rules.
This is a tough proposition. Some future possibilities that may have a significant impact on your calculation model are back-dated payments, adjustments and charges. Forgiven interest periods may also become an issue (particularly if back-dated). You may also be required to provide various point-in-time (PIT) calculations, based either on what was "known" at that PIT (a past view of the past) or taking into account transactions occurring after the reference PIT that were back-dated to before it (a current view of the past). Calculations of this nature can be a real pain in the head.
My advice would be to calculate from scratch (i.e. the first variant). Implement optimizations (e.g. the second variant) only when necessary to meet performance constraints. Doing calculations from the beginning is a compute-intensive model, but it is generally more flexible with respect to accommodating unexpected left turns.
If performance is a problem but the frequency of complicating factors (e.g. back-dated transactions) is relatively low, you could explore a hybrid model employing the best of both variants. Here you store the current state and calculate forward using only those transactions posted since the last stored state to create a new current state. If you hit a "complication", re-do the entire account from the beginning to re-establish the current state.
Being able to accommodate the unexpected without triggering a rewrite is probably more important in the long run than shaving calculation time right now. Do not place restrictions on your computation model until you have to. Saving current state often brings with it a number of built-in assumptions and restrictions that reduce the wiggle room for accommodating future requirements.
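A minimal Python sketch of that hybrid, under invented data shapes: an append-only transaction log, a toy `apply_txn` posting rule standing in for the real interest/penalty logic, and the key move of throwing away the checkpoint whenever a back-dated item arrives:

```python
# Hedged sketch of the checkpoint-and-replay hybrid described above.
# Transactions are (effective_date, kind, amount) tuples; apply_txn is
# a toy stand-in for the real interest/penalty posting rules.
from datetime import date

def apply_txn(state, txn):
    """Toy posting rule: payments reduce the balance, charges add to it."""
    effective, kind, amount = txn
    delta = -amount if kind == "payment" else amount
    return {"balance": state["balance"] + delta, "as_of": effective}

class Account:
    def __init__(self, opening_balance):
        self.opening = {"balance": opening_balance, "as_of": date.min}
        self.log = []                   # kept sorted by effective date
        self.checkpoint = dict(self.opening)
        self.applied = 0                # log entries folded into checkpoint

    def post(self, txn):
        if self.log and txn[0] < self.log[-1][0]:
            # Back-dated "complication": forget the checkpoint so the next
            # read re-does the entire account from the beginning.
            self.checkpoint, self.applied = dict(self.opening), 0
        self.log.append(txn)
        self.log.sort()

    def current_state(self):
        state = dict(self.checkpoint)
        for txn in self.log[self.applied:]:   # replay only the new tail
            state = apply_txn(state, txn)
        self.checkpoint, self.applied = dict(state), len(self.log)
        return state

acct = Account(1000.0)
acct.post((date(2024, 1, 5), "payment", 200.0))
print(acct.current_state())                      # fast incremental path
acct.post((date(2024, 1, 2), "penalty", 15.0))   # back-dated -> full replay
print(acct.current_state())
```

The design choice is exactly the trade-off described above: the incremental path stays fast for the common case, while the from-scratch replay remains the single source of truth whenever a back-dated transaction would invalidate the stored state.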
