Task assigning algorithm

I'm trying to figure out the most efficient way of assigning tasks to people. Here's what I'm struggling with:
you have N people who are all eligible to do work
each person can handle up to K tasks at once
you have M waiting tasks
each task takes a variable length of time
The goal is to distribute the tasks among the people as evenly as possible. Once a person finishes one of their assigned tasks, one of the queued tasks is fed to them. Here's an example scenario.
There are 500 tasks in the queue and 50 people available to take them. Each person can take 2 tasks at once. Once a person finishes a given task, they'll be fed another. The tasks that have been waiting the longest get the highest priority.
One way to do it could be to have each of the 50 people with the capacity to take a task be assigned one based on the time they were last given a task. For example:
task 1 -> person 1
task 2 -> person 2
task 3 -> person 3
...
task 4 -> person 1
task 5 -> person 2
task 6 -> person 3
In other words, among the people who still have capacity, the one whose last assignment is oldest gets the next task fed to them. I'm unsure whether this is the right solution for even task distribution and would love to hear suggestions! Is there a name for this type of algorithm?
Another method could be to assign tasks based on who is currently serving the lowest number of tasks. If multiple people are tied for the lowest number of tasks, the task is assigned to whoever has been available (since their last task assignment) for the longest period of time.
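To make that second idea concrete, here's a rough sketch of what I mean (the Dispatcher class, the fixed capacity of 2 and the time.time() timestamps are just illustrative): pick the least-loaded available person, breaking ties by the oldest last-assignment time.

import time

class Dispatcher:
    def __init__(self, workers, capacity=2):
        self.capacity = capacity
        self.active = {w: 0 for w in workers}           # tasks currently in flight
        self.last_assigned = {w: 0.0 for w in workers}  # 0.0 = never assigned

    def pick_worker(self):
        available = [w for w, n in self.active.items() if n < self.capacity]
        if not available:
            return None
        # fewest active tasks first, then the oldest last-assignment time
        return min(available, key=lambda w: (self.active[w], self.last_assigned[w]))

    def assign(self, task):
        worker = self.pick_worker()
        if worker is None:
            return None                                 # no capacity: task stays queued
        self.active[worker] += 1
        self.last_assigned[worker] = time.time()
        return worker

    def finished(self, worker):
        self.active[worker] -= 1                        # frees capacity for the next task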

Please consider looking at this at a higher level.
The proposals so far have been greedy. They build one schedule and hope for the best.
The first thing you'll need to decide is whether that's what you want. Greedy assignment will produce spectacularly bad answers for some inputs, though if the inputs are "reasonable," and all you want is a reasonable answer, it may be fine.
On the other hand, finding the optimal assignment of tasks is NP-hard. As far as anyone knows, you'll need time exponential in the input size to be sure you have the best possible answer.
There are two intermediate approaches.
Randomized task scheduling algorithms. This is a huge topic. This paper is still a decent starting place, though it's now very out-of-date. Richard Karp is amazing. The nice thing about randomized algorithms is that they can provide very useful optimality guarantees.
Heuristic search. Define a single numeric metric for the goodness of a schedule. Start with a reasonable schedule (greedily determined or random). Put it on a search queue ordered by the metric, pull the schedule with the best metric off the queue, find all of its "children", i.e. schedules not considered before that result from all possible simple changes, add these to the queue, and repeat. Stop when you can't wait any longer; the current best is your answer. You can also structure this as a genetic algorithm, which is just a specialized heuristic search.
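A rough sketch of that loop, assuming a lower metric is better and that score() and neighbours() stand in for your metric and your "simple changes":

import heapq

def heuristic_search(initial_schedule, score, neighbours, budget=10000):
    best = initial_schedule
    best_score = score(initial_schedule)
    queue = [(best_score, 0, initial_schedule)]        # (metric, tiebreak, schedule)
    seen = {repr(initial_schedule)}                    # crude de-duplication
    counter = 1
    while queue and counter < budget:
        _, _, current = heapq.heappop(queue)           # best metric seen so far
        for child in neighbours(current):              # all simple changes
            key = repr(child)
            if key in seen:
                continue
            seen.add(key)
            child_score = score(child)
            if child_score < best_score:
                best, best_score = child, child_score
            heapq.heappush(queue, (child_score, counter, child))
            counter += 1
    return best                                        # the current best is the answer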

Keep two queues: one for tasks, another for free people waiting for a task. When a task arrives, the first person in the people queue takes it and goes. You don't have to think about task durations or per-person timing in this solution, since first-come, first-served is a fair way of distributing work. If you need some kind of prioritization in the future, both queues can become priority queues with little change.
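A tiny sketch, with dispatch() standing in for whatever actually hands the task to the person:

from collections import deque

waiting_tasks = deque()       # tasks waiting, oldest first
idle_people = deque()         # free people, longest-idle first

def on_task_arrived(task):
    if idle_people:
        dispatch(idle_people.popleft(), task)   # hand it straight to the next free person
    else:
        waiting_tasks.append(task)

def on_person_free(person):
    if waiting_tasks:
        dispatch(person, waiting_tasks.popleft())
    else:
        idle_people.append(person)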

As zsiar says, but use two priority queues. The top task in the task priority queue gets assigned to the top worker, assuming they are capable. If they're not, the task can't be done and must wait.
Workers in the worker priority queue get ordered by capacity, or time idle, or whatever seems fair. In fact it's not a real priority queue, since when a worker finishes a task we take them out and put them back into the queue at a higher position.
(If workers can do two tasks at once they are probably computers rather than people, so time idle isn't a useful metric. Only human workers care whether they are kept busy while others doss about.)
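Something like this, where capable() and start() are placeholders:

import heapq, itertools

task_heap = []       # (priority, arrival_order, task): lowest value = most urgent
worker_heap = []     # (-capacity, idle_since, arrival_order, worker)
arrival = itertools.count()

def add_task(priority, task):
    heapq.heappush(task_heap, (priority, next(arrival), task))
    try_dispatch()

def add_worker(capacity, idle_since, worker):
    heapq.heappush(worker_heap, (-capacity, idle_since, next(arrival), worker))
    try_dispatch()

def try_dispatch():
    while task_heap and worker_heap:
        priority, order, task = task_heap[0]            # most urgent waiting task
        _, _, _, worker = worker_heap[0]                # "best" free worker
        if not capable(worker, task):                   # placeholder: task must wait
            break
        heapq.heappop(task_heap)
        heapq.heappop(worker_heap)
        start(worker, task)                             # placeholder: actually run it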

Related

Need advice about an algorithm to solve a quite specific Job Shop Scheduling Problem

Job Shop Scheduling Problem (JSSP): I have jobs that consist of tasks and I have machines that can perform these tasks.
I should be able to add new jobs dynamically. E.g. I have a schedule for the first 5 jobs, and when the 6th arrives I need to be able to fit it into the schedule in the best way. It is possible to adjust the existing schedule within the given flexibility constraints.
Jobs consist of tasks, and each task is the same type of action. Think of painting objects with a paint sprayer: all the machines are the same (paint sprayers), and all of the tasks are the same.
Constraint 1. Jobs have a preferred deadline for completion, but the deadline is flexible to some extent.
Edit after #tucuxi answer: A flexible deadline means that the completion time can be extended by some delta if necessary.
Constraint 2. Between jobs there is a resting phase. Think of the paint drying. The resting phase has a minimal required duration, but it can be longer or shorter if necessary.
Edit after #tucuxi answer: So there is a planned rest time Tp, which is the desired but flexible value and can be increased or decreased if this allows for better scheduling, and there is a minimal rest time Tm, so Tp - Tadjustment >= Tm.
The machine is occupied by the job from the start to the completion.
Here are the parts that make this problem quite distinct from what I have read about.
Jobs arrive in batches of several jobs. For example a batch can contain 10 jobs of type Job_1 and 5 of Job_2. Different batches can contain different types of jobs. All the jobs from a batch should be finished as close to each other as possible. Not necessarily at the same time, but we need to minimize the delay between the completion of the first and last jobs of the batch.
Constraint 3. Machines are grouped. In each group only M machines can work simultaneously. Think about paint sprays that are connected to the common pressurizer that has limited performance.
The goal.
Given the description of the problem above, it should be possible to solve the JSSP. It should also be possible to add new jobs to the existing schedule.
Edit after #tucuxi answer: This is not a task that should be solved immediately; it is not a time-critical system. But it shouldn't take so long that it irritates the human who puts new tasks into the algorithm.
Question
Which of the many JSSP algorithms can help me solve this? I can implement an algorithm myself, if there is one. The closest I found is this: the Resource-Constrained Project Scheduling Problem. But I was not able to work out how to glue it onto a JSSP solving algorithm.
Edit after #tucuxi answer: No, I haven't tried it yet.
Are there any libraries that can be used to solve this problem? Python or C# are the preferred languages, but in the end it doesn't really matter.
I appreciate any help: keyword to search for, link, reference to a book, reference to a library.
Thank you.
I doubt that there is a pre-made algorithm that solves your exact problem.
If I had to solve it, I would first:
compile datasets of inputs that I can feed into candidate solvers.
think of a metric to rank outputs, so that I can compare the candidates to see which is better.
A baseline solver could be a brute-force search: test and rate all possible job schedulings for small sample problems. This is of course infeasible for large inputs, but for small inputs it allows you to compare the outputs of more efficient solvers to a known-best answer.
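A sketch of that baseline, assuming you already have a build_schedule() that applies your constraints to a job ordering and a score() metric to minimize (both placeholders):

from itertools import permutations

def brute_force_best(jobs, build_schedule, score):
    best, best_score = None, float("inf")
    for ordering in permutations(jobs):          # factorial blow-up: only for tiny inputs
        schedule = build_schedule(ordering)      # placeholder: apply all your constraints
        schedule_score = score(schedule)         # placeholder: your cost metric
        if schedule_score < best_score:
            best, best_score = schedule, schedule_score
    return best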
Your link is to localsolver.com, which appears to provide a library for specifying problem constraints and then solving them. It is not freely available, requiring a license to use, but it would seem that your problem can be readily modeled in it. Have you tried to do so? They appear to support both C++ and Python. Other free options exist, including OptaPlanner (2.8k stars on GitHub) or python-constraint (I have not looked into other languages).
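As a toy illustration of python-constraint's modeling style (not your full JSSP; the job names, slot range and fixed duration are made up):

from constraint import Problem

problem = Problem()
jobs = ["job1", "job2", "job3"]
for job in jobs:
    problem.addVariable(job + "_start", range(0, 6))   # discrete start slots
    problem.addVariable(job + "_machine", [0, 1])      # two identical machines

def no_overlap(start_a, machine_a, start_b, machine_b, duration=2):
    # jobs on the same machine must be at least one job-length apart
    return machine_a != machine_b or abs(start_a - start_b) >= duration

for a in jobs:
    for b in jobs:
        if a < b:
            problem.addConstraint(no_overlap,
                                  [a + "_start", a + "_machine",
                                   b + "_start", b + "_machine"])

solutions = problem.getSolutions()                     # every feasible toy schedule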
Note that a good metric is crucial to choosing a good algorithm: unless you have a clear cost function to minimize, choosing "a good algorithm" is impossible. In your description of the problem, I see several places where the cost is unclear:
job deadlines are flexible
minimal required rest times... which may be shortened
jobs from a batch should be finished as close together as possible
(not from specification): how long can you wait for an optimal vs a less-optimal-but-faster solution?

Parallel Solving with PDPTW in OptaPlanner

I'm trying to increase OptaPlanner performance using parallel methods, but I'm not sure of the best strategy.
I have PDPTW:
vehicle routing
time-windowed (1 hr windows)
pickup and delivery
When a new customer wants to add a delivery, I'm trying to figure out a fast way (less than a second) to show them what time slots are available in a day (8am, 9am, 10am, etc). Each time slot has different score outcomes. Some are very efficient and some aren't bookable depending on the time/situation with increased drive times.
For performance, I don't want to try each of the hour times in sequence as it's too slow.
How can I try the customer's delivery across all the time slots in parallel? It would make sense to run the solver first, before adding the customer's potential delivery, and then share that solved original state across all the candidate time slots, each solved independently with the new delivery added.
Is there an intuitive way to do this? Eg:
Reuse some of the original solving computation (the state before adding the new delivery). Maybe this can even be cached ahead of time?
Perhaps run all the time slot solving instances on separate servers (or at least multiple threads).
What is the recommended setup for something like this? It would be great to return an HTTP response within a second. This is for roughly 100-200 deliveries and 10-20 trucks.
Thanks!
A) If you optimize the assignment of 1 customer to 1 index in 1 of the vehicles, while pinning all other already assigned customers, then you forgo all optimization benefits. It's not NP-hard.
You can still use OptaPlanner <constructionHeuristic/> for this (<localSearch/> won't improve the score), with or without moveThreadCount to spread it across cores, even though the main benefit will just be the incremental score calculation, not the AI algorithms.
B) Optimize assignment of all customers to an index of a vehicle. The real business benefits - like 25% less driving time - come when adding a new customer allows moving existing customer assignments too. The problem is that those existing customers already received a time window they blocked out in their agenda. But that doesn't need to be a problem if those time windows are wide enough: those are just hard constraints. Wider time windows = more driving time optimization opportunities (= more $$$, less CO₂ emissions).
What about the response within one minute?
At that point, you don't need to publish (= share info with the customer) which vehicle will come at which time in which order. You only need to publish whether or not you accept the time window. There are two ways to accomplish this:
C) Decision table based (a relaxation): no more than 5 customers per vehicle per day.
Pitfall: if it gets 5 customers in the 5 corners of the country/state, then it might still be infeasible. Factor in the average Euclidean distance between any 2 customer location pairs to influence the decision.
D) By running OptaPlanner until it terminates with feasible=true, starting from a warm start of the previous schedule. If no such feasible solution is found within 1000ms, reject the time window proposal.
Pitfall with D): if 2 requests come in at the same time, and you run them in parallel, so neither takes into account the other one, they could be feasible individually but infeasible together.
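A hedged sketch of how the per-slot fan-out could look, where solve_with_slot() and is_feasible() are hypothetical wrappers around your solver setup (warm-started base solution, one run per slot, roughly a 1-second budget each):

from concurrent.futures import ThreadPoolExecutor, as_completed

def available_slots(base_solution, new_delivery, slots, budget_seconds=1.0):
    feasible_slots = []
    with ThreadPoolExecutor(max_workers=len(slots)) as pool:
        futures = {
            pool.submit(solve_with_slot, base_solution, new_delivery, slot, budget_seconds): slot
            for slot in slots}
        for future in as_completed(futures):
            if is_feasible(future.result()):        # no feasible plan => slot is rejected
                feasible_slots.append(futures[future])
    return sorted(feasible_slots)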

Assigning jobs to workers

There are N plumbers, and M jobs for them to do, where M > N in general.
If N > M then it's time for layoffs! :)
Properties of a job:
Each job should be performed in a certain time window which can vary per-job.
Location of each job varies per-job.
Some jobs require special skills. The skills needed to complete a job can vary per-job.
Some jobs have higher priority than others. The "reward" for some jobs is higher than others.
Properties of a plumber:
Plumbers have to drive from one job to the next which takes time. Say it's known what the travel time from each job to every other job site is.
Some plumbers have skills that others don't have.
The task is to find the optimal assignment of jobs to plumbers, so that the reward is maximized.
It's possible that not all jobs can be completed. For example, with one plumber and two jobs, it's possible that if they are doing job A, they can't do job B because there's not enough time to get from A to B once they are done with A and B is supposed to begin. In that case, optimal is to have the plumber do the job with the biggest reward and we are done.
I am thinking of a greedy algorithm that works like this:
def greedy_assign(jobs, plumbers, can_handle):
    # sketch: jobs are dicts with "id" and "reward"; can_handle(plumber, job) is assumed
    # to cover skills, time windows and travel time
    remaining = sorted(jobs, key=lambda job: job["reward"], reverse=True)
    assignment = {}                                     # job id -> plumber id
    while True:
        assigned_this_round = False
        for job in list(remaining):
            eligible = [p for p in plumbers if can_handle(p, job)]
            if not eligible:
                continue
            # break ties by giving the job to the plumber with the fewest other options,
            # e.g. if plumber A can do X and Y but plumber B can only do X, give X to B
            plumber = min(eligible,
                          key=lambda p: sum(can_handle(p, other) for other in remaining))
            assignment[job["id"]] = plumber["id"]
            remaining.remove(job)                       # remove from further consideration
            assigned_this_round = True
        if not assigned_this_round:
            break                                       # no jobs got assigned: stop
    return assignment
My question: is there a better way? Seems like an NP-hard problem but I have no proof of that. :)
I guess it's similar to the Assignment Problem.
It seems a bit different though because of the space/time wrinkle: a plumber could do either A or B, but not both, because of the distance between them (they can't get to B in time after finishing A). And jobs must be completed in certain time windows.
Also a plumber might not be able to take both jobs if they are too close in time (even if they are close in space). For example if B must be started before time_A_finished + time_to_travel_A_to_B, then B can't be done after A.
Thanks for any ideas! Any pointers on good stuff to read in this area is also appreciated.
Even routing just one plumber between jobs is as hard as the NP-hard traveling salesman problem.
I can suggest two general approaches for improving on your greedy algorithm. The first is local search. After obtaining a greedy solution, see if there are any small improvements to be made by assigning/reassigning/un-assigning a few jobs. Repeat until there are no obvious improvements or CPU time runs out.
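A minimal sketch of that local-search step, assuming reward() scores a whole assignment and feasible() checks time windows, travel and skills (all placeholders):

def local_search(assignment, jobs, plumbers, reward, feasible, max_rounds=1000):
    assignment = dict(assignment)                       # job -> plumber (greedy result)
    for _ in range(max_rounds):
        improved = False
        for job in jobs:
            current = assignment.get(job)
            for plumber in list(plumbers) + [None]:     # None means un-assigning the job
                if plumber is current:
                    continue
                candidate = dict(assignment)
                if plumber is None:
                    candidate.pop(job, None)
                else:
                    candidate[job] = plumber
                if feasible(candidate) and reward(candidate) > reward(assignment):
                    assignment, improved = candidate, True
        if not improved:
            break                                       # no obvious improvement left
    return assignment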
Another approach is linear programming with column generation. This is more powerful but a lot more involved. The idea is to set up a master program where we try to capture as much reward as possible by choosing to use or not use every feasible plumber schedule, subject to the packing constraints of only doing a job once and not using more plumber skills than are available. At each stage of solving the master program, the dual values corresponding to jobs and plumbers reflect the opportunity cost of doing a particular job/using a particular plumber. The subproblem is figuring out how to route a plumber so as to capture more (adjusted) reward than the plumber "costs". This subproblem is NP-hard (per the note above), but it may itself be amenable to dynamic programming or further linear programming techniques depending on how many jobs there are. You'll quickly bump into the outer limits of academic operations research following this path.

What data structure and algorithms to use to optimize concurrent jobs?

I have a series of file-watchers that trigger jobs. Every fixed interval of time, each file-watcher looks at its list and, if it finds a file, triggers a job. If not, it waits and comes back after that interval.
Some jobs are dependent on others, so running them in a proper order and with proper parallelism would be a good optimization. But I do not want to think about this myself.
What data structure and algorithms should I use to ask a computer to tell me what job to assign to what file-watcher (and in what order to put them)?
As input, I have the dependencies between the jobs, the arrival time of the files for each job, and the number of watchers. (For starters, I will pretend each job takes the same amount of time.) How do I spread the jobs among the watchers to avoid unnecessary waiting gaps and to obtain a faster run time?
(I am looking forward to tackling this optimization in an algorithmic way, but would like to start with some expert advice.)
EDIT: so far I have understood that I need a DAG (directed acyclic graph) to represent the dependencies and that I need to use topological sorting to optimize. But that gives me a single execution line, one thread. What if I have more, say 7?
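One hedged sketch of that: Kahn-style readiness tracking plus list scheduling over k watchers, under the stated simplification that every job takes one time unit (the deps mapping is whatever your input provides):

from collections import deque

def schedule(jobs, deps, k):
    # deps maps each job to the set of jobs it waits for; returns (time, watcher, job) triples
    prerequisites = {job: set(deps.get(job, ())) for job in jobs}
    ready = deque(job for job, pre in prerequisites.items() if not pre)
    done, plan, t = set(), [], 0
    while len(done) < len(jobs):
        batch = [ready.popleft() for _ in range(min(k, len(ready)))]
        if not batch:
            raise ValueError("dependency cycle: no job is ready")
        for watcher, job in enumerate(batch):
            plan.append((t, watcher, job))              # run up to k jobs in parallel
        done.update(batch)
        for job, pre in prerequisites.items():
            if job not in done and job not in ready and pre and pre <= done:
                ready.append(job)                       # all prerequisites finished
        t += 1
    return plan

With 7 watchers you would call schedule(jobs, deps, 7); dropping the unit-time assumption turns this into classic list scheduling, where a small heap of watcher finish times replaces the per-tick batches.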

How to optimize assignment of tasks to agents with these constraints?

I have an assignment problem as a part of my Master's Thesis, and I am looking for general direction in solving the same.
So, there is a list of agents and a list of tasks, with the number of tasks being greater than the number of agents.
The agents submit a prioritized ordered list of tasks they can/want to do. The length of the list is fixed to a number much smaller than the total number of tasks.
Every agent must be assigned a task. A task once assigned cannot be assigned to another agent.
The objective is to find an assignment such that the average priority/preference of the assigned tasks is the lowest. Additionally, if it is a complete solution, i.e. every agent is assigned a task, that is even better.
I have looked at the generalized assignment problems, and the Hungarian algorithm, but these do not cater to the specific situation where there is a cost to a task and also the possibility of the agent being unable to do some of the tasks.
Please help. Thank you.
If you want a general approach, you can model the problem as a Mixed Integer Program, introducing binary variables for the assignment of tasks to agents and putting the priority costs and (very high) non-assignment costs into the objective function. Mixed Integer Programs can be solved using a variety of solver programs, including CPLEX or Gurobi, which are free for academic purposes.
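A hedged sketch of that formulation in Python, using PuLP as the modeling layer (any MIP solver would do; the function, the big_penalty value and the pref[agent][task] structure holding each agent's priority for a task, lower = more preferred, are all just illustrative):

import pulp

def assign(agents, tasks, pref, big_penalty=1000):
    prob = pulp.LpProblem("agent_task_assignment", pulp.LpMinimize)
    # x[a, t] = 1 if agent a gets task t; only tasks the agent listed get a variable
    x = {(a, t): pulp.LpVariable(f"x_{a}_{t}", cat="Binary")
         for a in agents for t in pref[a]}
    # heavily penalized slack per agent, for the case where no listed task is left
    unassigned = {a: pulp.LpVariable(f"u_{a}", cat="Binary") for a in agents}
    for a in agents:
        prob += pulp.lpSum(x[a, t] for t in pref[a]) + unassigned[a] == 1
    for t in tasks:
        prob += pulp.lpSum(x[a, t] for a in agents if (a, t) in x) <= 1  # task used once
    # minimize total priority cost plus a large cost for leaving agents without a task
    prob += (pulp.lpSum(pref[a][t] * x[a, t] for (a, t) in x)
             + big_penalty * pulp.lpSum(unassigned.values()))
    prob.solve()
    return {a: t for (a, t) in x if x[a, t].value() > 0.5}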
