Scheduling - Assigning jobs to the most efficient worker - algorithm

This was asked by a friend of mine. I had no previous context, so I want to know what type of algorithm this problem belongs to. Any hints or suggestions will do.
Suppose we have a group of N workers on a car assembly line. Each worker can do 3 types of work, with their skill in each rated from 1 to 10. For example, Worker1's "paint surface" efficiency is rated 8, but his "assemble engine" efficiency is only rated 5.
The manager has a list of M jobs, each defined by a start time, duration, job type, and importance rated from 0 to 1. Each worker can only work on 1 job at a time, and each job can be worked on by only 1 worker. How can the manager assign the jobs to get the maximum output?
The output for a job = worker skill rating * job importance * duration.
For example, we have workers {w1, w2}
w1: paint_skill = 9, engine_skill = 8
w2: paint_skill = 10,engine_skill = 5
We have jobs {j1, j2}
j1: paint job, start_time = 0, duration = 10, importance = 0.5
j2: engine job, start_time = 3, duration = 10, importance = 0.9
We should assign w1 to j2 and w2 to j1: output = 8 * 0.9 * 10 + 10 * 0.5 * 10 = 72 + 50 = 122.
A greedy solution that matches the next available worker with the next job is clearly sub-optimal: in the example it would match w1 to j1, which is not optimal.
An exhaustive brute-force solution would guarantee the best output, but takes exponentially more time to compute on large job lists.
How can this problem be approached?
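For tiny instances like the one above, the brute force the question mentions is easy to write down. The following Python sketch (my own illustration; the dict/tuple encoding is an assumption, not from the question) enumerates every one-to-one worker-to-job assignment and keeps the best:

```python
from itertools import permutations

# Encoding of the example above; the field names are my own choice.
workers = [{"paint": 9, "engine": 8},   # w1
           {"paint": 10, "engine": 5}]  # w2
jobs = [("paint", 10, 0.5),   # j1: (type, duration, importance)
        ("engine", 10, 0.9)]  # j2

def total_output(assignment):
    # assignment[j] = index of the worker doing job j
    return sum(workers[w][jtype] * importance * duration
               for w, (jtype, duration, importance) in zip(assignment, jobs))

# Try every one-to-one assignment of workers to jobs.
best = max(permutations(range(len(workers)), len(jobs)), key=total_output)
print(best, total_output(best))  # (1, 0): w2 paints, w1 assembles -> 122.0
```

For many jobs this enumeration explodes factorially; since jobs here also carry time windows, the general problem is a scheduling/assignment hybrid, and each group of time-overlapping jobs can be handed to a weighted bipartite matching routine such as the Hungarian algorithm.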


Matching problem with multiple assignments

Introduction
I have a bipartite graph with workers (W) and tasks (T).
The goal is to assign all tasks to the workers so as to minimize the maximum time spent, i.e. finish the last task as soon as possible.
Question
What modifications to the Hungarian algorithm have to be made to accomplish this?
If the Hungarian algorithm is not useful, what would be a good mathematical approach?
Mathematically, I don't know how to work with multiple task assignments per worker.
I will implement it in Python once I understand the math theory.
Problem
Conditions:
A task can only be assigned to one worker
There is no restriction on the number of tasks
All tasks must be assigned
A worker may have multiple tasks assigned
There is no restriction on the number of workers
A worker may have no assignment
Workers are not all free to start working at the same time
Example
If I have 7 tasks T={T₁, T₂, T₃, T₄, T₅, T₆, T₇} and 3 workers W={W₁, W₂, W₃}, the workers will be free to start working at F={4, 7, 8} (where Fᵢ is the time Wᵢ needs before it is free to start working) and the cost matrix is:
A matching example could be (not necessarily optimal in this case; it is just an example):
W₁ = {T₁, T₂, T₃}
W₂ = {T₄, T₅}
W₃ = {T₆, T₇}
in this case the time spent by each worker is:
Time(W₁) = 4+5+4+3 = 16
Time(W₂) = 7+4+9 = 20
Time(W₃) = 8+1+7 = 16
Explained as:
For W₁, we have to wait:
4 until he is free
after that he will finish T₁ in 5
T₂ in 4
T₃ in 3
giving a total time of 16.
For W₂, we have to wait:
7 until he is free
after that he will finish T₄ in 4
T₅ in 9
giving a total time of 20.
For W₃, we have to wait:
8 until he is free
after that he will finish T₆ in 1
T₇ in 7
giving a total time of 16.
Goal
Minimize the maximum total time, not the sum of the totals.
If total times {9, 6, 6} (sum 21) form a solution, then {9, 9, 9} (sum 27) is a solution too, and {10, 1, 1} (sum 12) is not, because in the first and second cases the last task finishes at time 9 and in the third at time 10.
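Since the question's cost matrix was omitted, here is a Python sketch with made-up numbers that states the objective precisely: enumerate every task-to-worker assignment and keep the one that minimizes the maximum finishing time (release time plus assigned work). All numbers below are my own assumptions for illustration:

```python
from itertools import product

# Small made-up instance: F[i] is the time worker i becomes free,
# cost[i][j] is the time worker i needs for task j.
F = [4, 7, 8]
cost = [[5, 4, 3],
        [4, 9, 2],
        [1, 7, 6]]

def makespan(assignment):
    # assignment[j] = worker index chosen for task j
    finish = list(F)
    for task, worker in enumerate(assignment):
        finish[worker] += cost[worker][task]
    return max(finish)

# Exhaustively try all 3^3 task-to-worker assignments and keep the one
# that minimizes the maximum finishing time.
best = min(product(range(len(F)), repeat=len(cost[0])), key=makespan)
print(best, makespan(best))
```

This is makespan minimization on unrelated machines with release times; the plain Hungarian algorithm does not apply directly because a worker may receive several tasks, so exact solutions typically go through (M)IP or dynamic programming, and large instances through local search.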

Maximizing profit in doing the jobs

We are given two arrays M (money) and E (experience) of integers, each of size at most 50. After Bob does job i, two things happen:
(Let TE be Bob's total experience initialized by 0)
Bob's experience (i.e. TE) is incremented by E[i]
Then, he will receive money equal to TE*M[i]
What is the maximum profit Bob can make if he does the jobs in the best possible order?
For any i we know:
1 <= E[i] <= 10^5
1 <= M[i] <= 10
Example:
M[] = { 20, 30, 100 }
E[] = { 1, 1, 6 }
Answer: 980, with order job 3-1-2 = 6*100 + 7*20 + 8*30 = 600 + 140 + 240 = 980
I think the problem can be solved by a greedy algorithm (a special case of DP) as described below:
Sort the jobs by the ratio Exp/Money in descending order
If tied, sort the jobs by Money in ascending order
The sorted job sequence is then the order that yields the optimal solution.
My reasoning is as follows: the ratio Exp/Money can be interpreted as how much Exp you can buy with 1 money, so it is always better to choose the job with the higher ratio first, as this increases the experience factor for later jobs.
In the tie case, choose the job with the smaller money reward, so that the job with the higher money reward is multiplied by a larger experience factor later on.
For example:
E = {2,1,6,1}
M = {40,20,100,10}
Sorted jobs = {job4, job3, job2, job1}
= 1*10 + 7*100 + 8*20 + 10*40 = 10 + 700 + 160 + 400 = 1270
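The ratio rule can be made division-free with the standard exchange argument: for two adjacent jobs a and b, swapping them changes the total by E_a*M_b - E_b*M_a, so a should precede b exactly when E_a*M_b >= E_b*M_a. A Python sketch of that ordering (my own illustration, not from the question):

```python
from functools import cmp_to_key

def best_profit(E, M):
    # Job a precedes job b when E_a * M_b >= E_b * M_a,
    # i.e. sort by E/M in descending order without dividing.
    jobs = sorted(zip(E, M),
                  key=cmp_to_key(lambda a, b: b[0] * a[1] - a[0] * b[1]))
    te = profit = 0
    for e, m in jobs:
        te += e            # experience is gained first...
        profit += te * m   # ...then the money is paid out
    return profit

print(best_profit([1, 1, 6], [20, 30, 100]))  # 980, matching the example
```

Jobs with equal ratios can go in either order, since the exchange gain between them is zero.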

Greedy algorithm: highest value first vs earliest deadline first

Assume we have a set of n jobs to execute, each of which takes unit time. At any time we can serve exactly one job. Job i, 1<=i<=n earns us a profit if and only if it is executed no later than its deadline.
We call a set of jobs feasible if there exists at least one sequence that allows each job in the set to be performed no later than its deadline. In particular, a feasible set can always be scheduled in "earliest deadline first" order.
Show that the following greedy algorithm is optimal: at every step, add the job with the highest profit among those not yet considered, provided that the chosen set of jobs remains feasible.
MUST DO THIS FIRST: first show that it is always possible to re-schedule two feasible sequences (one of them computed by the greedy) in such a way that every job common to both sequences is scheduled at the same time. This new sequence might contain gaps.
UPDATE
I created an example that seems to disprove the algorithm:
Assume 4 jobs:
Job A has profit 1, time duration 2, deadline before day 3;
Job B has profit 4, time duration 1, deadline before day 4;
Job C has profit 3, time duration 1, deadline before day 3;
Job D has profit 2, time duration 1, deadline before day 2.
If we use the greedy algorithm with highest profit first, we only get jobs B & C. However, if we go deadline first, we can get three of the jobs, in the order C, D, B (job A never fits, since the four durations sum to 5 but the latest deadline is day 4).
I am not sure I am approaching this question the right way, since I seem to have created an example that disproves what the question asks me to prove.
This problem looks like Job Shop Scheduling, which is NP-complete (meaning no optimal polynomial-time greedy algorithm is known, even though experts have been trying to find one since the '70s). Here's a video on a more advanced form of that use case being solved with a greedy algorithm followed by Local Search.
If we presume your use case can indeed be relaxed to Job Shop Scheduling, then there are many optimization algorithms that can help, such as metaheuristics (including local search methods like Tabu Search and Simulated Annealing), (M)IP, Dynamic Programming, Constraint Programming, ... The reason there are so many choices is that none is perfect. I prefer metaheuristics, as they out-scale the others in all the research challenges I've seen.
In fact, none of "earliest deadline first", "highest profit first" and "highest profit/duration first" is a correct algorithm...
Assume 2 jobs:
Job A has profit 1, time duration 1, deadline before day 1;
Job B has profit 2, time duration 2, deadline before day 2;
Then "earliest deadline first" fails to get the correct answer. The correct answer is B.
Assume another 5 jobs:
Job A has profit 2, time duration 3, deadline before day 3;
Job B has profit 1, time duration 1, deadline before day 1;
Job C has profit 1, time duration 1, deadline before day 2;
Job D has profit 1, time duration 1, deadline before day 3;
Job E has profit 1, time duration 1, deadline before day 4;
Then "highest profit first" fails to get the correct answer. The correct answer is BCDE.
Assume another 4 jobs:
Job A has profit 6, time duration 4, deadline before day 6;
Job B has profit 4, time duration 3, deadline before day 6;
Job C has profit 4, time duration 3, deadline before day 6;
Job D has profit 0.0001, time duration 2, deadline before day 6;
Then "highest profit/duration first" fails to get the correct answer. The correct answer is BC (thanks to @dognose for the counter-example; see comments).
One correct algorithm is Dynamic Programming:
First order the jobs by deadline, ascending. dp[i][j] = k means that, using only the first i jobs and exactly j time units, the highest profit we can get is k. Initially dp[0][0] = 0.
Job info is stored in 3 arrays: profits in profit[i], durations in time[i], and deadlines in deadline[i], 1<=i<=n.
// sort by deadline in ascending order
...
// initially the 2-dimensional dp array is all -1; -1 means the state is unreachable
...
dp[0][0] = 0;
int maxDeadline = max(deadline); // max value of deadline
for(int i=0;i<n;i++) {
for(int j=0;j<=maxDeadline;j++) {
if(dp[i][j] == -1) continue;
// skip job i+1: the reachable state carries over unchanged
dp[i+1][j] = max(dp[i+1][j], dp[i][j]);
// do job i+1 if it still finishes by its deadline: "within the first i+1 jobs,
// using j+time[i+1] time units, what's the highest total profit?"
if(j + time[i+1] <= deadline[i+1]) {
dp[i+1][j+time[i+1]] = max(dp[i+1][j+time[i+1]], dp[i][j] + profit[i+1]);
}
}
}
// the max total profit is the max value in row dp[n]
The time and space complexity of the DP is O(n*m), where n is the job count and m the maximum deadline, so it depends heavily on how many jobs there are and how large the deadlines get. If n and/or m is very large it may be impractical, but for common inputs it works well.
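The same DP is compact in Python; this sketch (my own translation, with the skip-the-job transition made explicit and the job index rolled away) can be checked against the counter-examples above:

```python
def max_profit(jobs):
    # jobs: list of (profit, duration, deadline) tuples
    jobs = sorted(jobs, key=lambda j: j[2])   # by deadline, ascending
    horizon = max(d for _, _, d in jobs)
    dp = [-1] * (horizon + 1)                 # dp[t] = best profit using exactly t time units
    dp[0] = 0
    for profit, duration, deadline in jobs:
        new = dp[:]                           # skipping the job keeps every state
        for t in range(horizon + 1):
            if dp[t] != -1 and t + duration <= deadline:
                new[t + duration] = max(new[t + duration], dp[t] + profit)
        dp = new
    return max(dp)

# The counter-examples above:
print(max_profit([(1, 1, 1), (2, 2, 2)]))                                    # 2 (job B alone)
print(max_profit([(2, 3, 3), (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4)]))  # 4 (jobs BCDE)
```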
The problem is called Job sequencing with deadlines, and can be solved by two algorithms based on a greedy strategy:
Sort the input jobs by decreasing profit. For every job, put it into the solution list, kept sorted by increasing deadline. If, after including a job, some job in the solution has an index greater than its deadline, do not include this job.
Sort the input jobs by decreasing profit. For every job, put it into the solution list at the last possible index. If there is no free index less than or equal to the job's deadline, do not include the job.
public class JOB {
    public static void main(String[] args) {
        // Jobs are already sorted by profit in descending order.
        char name[] = {'1', '2', '3', '4'};
        int dl[] = {1, 1, 4, 1};
        int profit[] = {40, 30, 20, 10};

        int maxDeadline = 0;
        for (int d : dl) maxDeadline = Math.max(maxDeadline, d);

        // slot[t] holds the job executed in time unit t+1, '\0' if free.
        char slot[] = new char[maxDeadline];

        for (int i = 0; i < name.length; i++) {
            // Place job i in the latest free slot no later than its deadline.
            for (int j = dl[i] - 1; j >= 0; j--) {
                if (slot[j] == '\0') {
                    slot[j] = name[i];
                    break;
                }
            }
        }

        for (char c : slot)
            if (c != '\0') System.out.println(c);
    }
}

subset sum in pairs of two

This is an interview question:
4 men- each can cross a bridge in 1,3, 7, and 10 min. Only 2 people
can walk the bridge at a time. How many minutes would they take
to cross the bridge?
I can think of a solution manually: 10 and 7 go together; as soon as 7 reaches the destination, 3 hops on, and 10 and 3 finish together. Then 1 goes by itself, and the total time taken is 11. So, {10, 7} followed by {10, 3} followed by {1}.
I am unable to see how to turn this idea into a general algorithm. Can someone help me convert this idea into real code?
The problem you describe is not subset sum.
Yet you can:
order the array a by descending time
int time1 = 0; // total time taken by the first lane
int time2 = 0; // total time taken by the second lane
for i : 0..n-1
    if(time1 < time2) // add the time to the less loaded ("more available") lane
        time1 += a[i];
    else
        time2 += a[i];
    endif
endfor
return max(time1, time2);
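The pseudocode above is the classic longest-processing-time-first (LPT) greedy; a Python version (my own sketch, not from the answer):

```python
def crossing_time(times):
    # LPT: sort descending, then always put the next person
    # on the lane that frees up first.
    lane1 = lane2 = 0
    for t in sorted(times, reverse=True):
        if lane1 <= lane2:
            lane1 += t
        else:
            lane2 += t
    return max(lane1, lane2)

print(crossing_time([1, 3, 7, 10]))  # 11, matching the hand-derived schedule
```

Note that LPT is a heuristic: balancing n crossing times over two lanes is exactly the partition problem, so this greedy can be off by up to a factor of 7/6, and an exact answer needs a subset-sum style DP. For the four times in the question it does find the optimal 11.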
This is not a subset sum problem but a job shop scheduling problem; see the Wikipedia entry on Job shop scheduling. You have four "jobs", taking 1, 3, 7 and 10 minutes respectively, and two "lanes" for conducting them, i.e. the bridge's capacity of 2. Calculating an exact solution for job shop scheduling is hard in general.

How to find maximum task running at x time?

The problem description is as follows:
There are n events on a particular day d, each having a start time and a duration. Example:
e1 10:15:06 11ms (ms = milliseconds)
e2 10:16:07 12ms
......
I need to find the time x at which the maximum number of events were executing, and how many there were.
The solution I am thinking of is scanning every millisecond of day d, but that requires 86,400,000 * n calculations in total. Example:
Check at 00:00:00.001 how many events are running
Check at 00:00:00.002 how many events are running
...
Take the max over all of these.
The second solution I am thinking of is:
for event_i in all events
    running_events = 1
    for event_j in all events where event_j != event_i
        if event_j.start_time in range (event_i.start_time, event_i.start_time + event_i.duration)
            running_events++
Then take the max of running_events
Is there any better solution for this?
This can be solved in O(n log n) time:
Make an array of all event boundaries (each start and each end). This array is already partially sorted: O(n)
Sort the array: O(n log n); your library should be able to make use of the partial sortedness (Timsort does that very well); look into distribution-based sorting algorithms for a better expected running time.
Sort event boundaries ascending w.r.t. the boundary time
Sort event ends before event starts if touching intervals are considered non-overlapping
(Sort event ends after event starts if touching intervals are considered overlapping)
Initialise running = 0, running_best = 0, best_at = 0
For each event boundary:
If it's the start of an event, increment running
If running > running_best, set running_best = running and best_at = the current boundary time
If it's the end of an event, decrement running
Output best_at
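The steps above translate almost line-for-line into Python; this sketch (my own, not from the answer) treats touching intervals as non-overlapping, since ends sort before starts at the same instant:

```python
def busiest_time(events):
    # events: list of (start, duration) pairs
    boundaries = []
    for start, duration in events:
        boundaries.append((start, +1))             # event start
        boundaries.append((start + duration, -1))  # event end
    boundaries.sort()  # tuple order puts (-1) before (+1) on time ties
    running = running_best = 0
    best_at = None
    for t, delta in boundaries:
        running += delta
        if running > running_best:
            running_best, best_at = running, t
    return best_at, running_best

# The three intervals [0.5,1.5], [0,1.2], [1,3] from the example below:
print(busiest_time([(0.5, 1.0), (0, 1.2), (1, 2)]))  # (1, 3): three tasks run at t=1
```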
You could reduce the number of points you check by checking only the endpoints of the intervals: for each interval (task) I that runs from t1 to t2, you only need to check how many tasks are running at t1 and at t2 (assuming the task runs from t1 to t2 inclusive; if it is exclusive, check t1-EPSILON, t1+EPSILON, t2-EPSILON and t2+EPSILON).
It is easy to see (convince yourself why) that the maximum must be attained at one of these candidate points.
Example:
tasks run in `[0.5,1.5],[0,1.2],[1,3]`
candidates: 0,0.5,1,1.2,1.5,3
0 -> 1 tasks
0.5 -> 2 tasks
1 -> 3 tasks
1.2 -> 3 tasks (assuming inclusive, end of interval)
1.5 -> 2 tasks (assuming inclusive, end of interval)
3 -> 1 task (assuming inclusive, end of interval)
