Job scheduling algorithm with deadlines and execution times

Given an array of jobs where every job has a deadline (d_i > 0) and an associated execution time (e_i > 0), i.e.
we are given an array of pairs (d_i, e_i), can we find an arrangement of the jobs such that all of them can be scheduled? There may be more than one possible answer; any one will suffice.
E.g. for {(3,1),(3,2),(7,3)} = {J1,J2,J3},
the answer could be either {J1,J2,J3} or {J2,J1,J3}.
We can solve this problem using backtracking, but the running time will be very high. Can we solve it using a greedy or some other approach? Please provide a proof of its correctness.
Edit:
At most one job can run at a time.

Hint: After you have scheduled k initial jobs successfully, it is possible to find a satisfying full schedule only if there is a next job whose execution time added to the current time after the k previous jobs is less than or equal to the deadline time for the next job. Can you see why always choosing the next job with the earliest deadline at each step of choosing a job will determine whether there is or is not a solution, and if there is, will give a precise solution? Let me know if you'd like more details about how to prove that, but hopefully you can see it on your own now that I've pointed out what the correct greedy solution is.
UPDATE: Further hint: Assume you have a satisfying assignment in which two consecutive jobs are out of order with respect to their deadlines (if the overall ordering is out of order at all, some adjacent pair must be). By assumption it is possible to finish both of these jobs by the earlier of the two deadlines. Now swap the jobs: the earlier-deadline job finishes sooner than it did before, and the later-deadline job finishes exactly when the pair finished before, which is at or before the earlier deadline and hence before its own, later deadline. So the swapped schedule is still a satisfying assignment.
Thus, if a satisfying assignment exists, then there is another one that exists where jobs are ordered according to their deadlines. I.e., the greedy strategy will always find a satisfying assignment if one exists -- otherwise, there is no solution.
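The earliest-deadline-first check described above can be sketched in Python (a minimal sketch; the function name and the return convention are my own assumptions):

```python
def edf_schedule(jobs):
    """jobs: list of (deadline, exec_time) pairs.
    Returns a feasible ordering of job indices, or None if none exists."""
    # Earliest deadline first: sort job indices by deadline.
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    time = 0
    for i in order:
        time += jobs[i][1]      # finish time of job i in this order
        if time > jobs[i][0]:   # misses its deadline -> no schedule exists at all
            return None
    return order
```

By the exchange argument above, if this particular ordering misses a deadline, no ordering satisfies all deadlines.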

An O(n log n) greedy approach based on a heap data structure.
The input is an array of jobs:
struct Job
{
    char id;
    int deadLine;
    int profit;
};
Algorithm pseudocode:
1. Sort the input jobArray in non-decreasing order of deadLine.
2. Create a maxHeap (it will contain jobs); the basis of comparison is profit.
3. Let n = length of jobArray,
   initialize time = jobArray[n-1].deadLine,
   index = n-1.
4. While index >= 0 && jobArray[index].deadLine >= time
   4a) insert(maxHeap, jobArray[index])
   4b) index = index - 1
5. If maxHeap is not empty
       j = removeMax(maxHeap)
       print j.id
   time = time - 1
6. If time > 0
       goto step 4
   else
       return
This will print the jobs in reverse order; it can be modified to print them in the right order.
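For reference, the pseudocode above can be sketched in Python using the standard library's heapq (profits are negated to simulate a max-heap); the function name and the (id, deadline, profit) tuple layout are my own assumptions, and the result is reversed at the end so it comes out in the right order:

```python
import heapq

def schedule_max_profit(jobs):
    """jobs: list of (id, deadline, profit) with unit execution times.
    Returns the ids of the scheduled jobs in execution order."""
    jobs = sorted(jobs, key=lambda j: j[1])    # non-decreasing deadline
    heap = []                                  # max-heap by profit (negated)
    result = []
    i = len(jobs) - 1
    t = jobs[-1][1] if jobs else 0             # latest deadline = last time slot
    while t > 0:
        # push every job whose deadline allows it to occupy slot t
        while i >= 0 and jobs[i][1] >= t:
            heapq.heappush(heap, (-jobs[i][2], jobs[i][0]))
            i -= 1
        if heap:
            _, jid = heapq.heappop(heap)       # most profitable available job
            result.append(jid)
        t -= 1
    result.reverse()                           # slots were filled from the back
    return result
```

Each job is pushed and popped at most once, so the total cost is O(n log n).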

Related

Resource with schedules allocation problem

I have a task similar to this one, but with some differences (in bold). That is:
I have a set of J jobs that need to be completed. Jobs are organized into a set of directed graphs. Each job can have one or more preceding (parent) jobs that have to be completed before it starts.
All jobs take different times to complete, but the times are known.
I have a set of R human resources. Important! Each resource has a schedule of when it is available.
Some jobs can be preempted once started, e.g. if a resource is not available at some specific hours, it can continue processing the job afterwards.
A job may have a predefined amount of resources, which can be changed. If the algorithm can't find the desired solution (all jobs can't be finished by some known date), then we can increase/decrease the resource amount. The duration of a job changes proportionally with the amount, e.g. J1 initially requires 1 human resource with an estimated duration of 1 hour; if we assign 2 human resources, the duration decreases to 0.5 hours.
Resources have parameters that match tasks, e.g. if J1 has constraints {A,B}, then we can assign resource R1 with permissions {A,B}, which satisfies these constraints. If there are no appropriate resources, then we can assign any available one.
My goal: assign as many appropriate resources as possible so that each job finishes on time.
Is there a way to solve this problem via the simplex method? What would the equations look like? What would the complexity be? Could you please help me build a mathematical model or share some links? Any help would be appreciated. I can't get close to handling resource schedules without brute-force permutation.
I've tried to implement something similar to a greedy algorithm:
First I scheduled the tasks from the job start date without resource calendars, considering only the relations between tasks (I calculated an order for each task in a chain, where order 0 is the earliest predecessor).
E.g. in the chain J1 -> J2 -> J3, J3 must be done after finishing J1 and J2: J1.Order = 0, J2.Order = 1, J3.Order = 2.
Then I sorted the tasks by order and by operation count in the chain: the task with the higher count of children goes first; then I sorted by count of required resources, and finally by the initial duration of the task.
For each order I take a task and assign resources with the required skills that can do the task (the earlier the better), considering their calendars and their occupation with other jobs.
This algorithm doesn't take into account the predefined job finish date. I'm worried that it's far from the optimal solution and that another approach is needed.
A greedy algorithm with a proof of correctness would also work for me.
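The order-computation step described above can be sketched as a depth computation over the precedence DAG (a hypothetical sketch; names are my own, and resource calendars are ignored, only the chain order is derived):

```python
from collections import defaultdict

def task_orders(edges, tasks):
    """edges: list of (parent, child) precedence pairs; tasks: all task ids.
    Returns the order (chain depth) of each task: 0 = no predecessors."""
    parents = defaultdict(list)
    for p, c in edges:
        parents[c].append(p)

    memo = {}
    def depth(t):
        # order of a task = 1 + deepest order among its parents
        if t not in memo:
            memo[t] = 0 if not parents[t] else 1 + max(depth(p) for p in parents[t])
        return memo[t]

    return {t: depth(t) for t in tasks}
```

Tasks can then be sorted by this order (and by the secondary keys described) before resources are assigned.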

Critical path with given start and end times instead of cost

I'm having trouble computing the critical path of a network of activities. The data I have to work with is a little different from what I have seen in the simple examples on the web: I have the start and end times of each activity, from which I deduce its length. I have been using this algorithm to compute the earliest and latest start and end times for each activity:
"To find the earliest start times, start at activities with no predecessors, and say they start at time zero. Then repeatedly find an activity whose predecessors' start times have all been filled, and set the start time to the maximum predecessor finish time.
To find the latest start times, run the preceding algorithm backwards. Start at activities with no successors. Set their finish time to the maximum finish time from the previous phase. Repeatedly find a predecessor whose successors have all been evaluated. Set its finish time to the earliest successor's start time.
Now it is trivial to evaluate slack = latest start – earliest start. Some chain of events will have slack time equal to zero; this is the critical path."
source : https://stackoverflow.com/questions/6368404/find-the-critical-path-and-slack-time
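The quoted two-pass procedure can be sketched in Python (a sketch with my own names and data layout; activities are given as a duration map plus predecessor lists):

```python
def critical_path(durations, preds):
    """durations: {activity: duration}; preds: {activity: [predecessors]}.
    Returns (earliest_start, latest_start, slack) per activity."""
    succs = {a: [] for a in durations}
    for a, ps in preds.items():
        for p in ps:
            succs[p].append(a)

    # Forward pass: earliest start = max predecessor finish (0 if no predecessors).
    es = {}
    def early(a):
        if a not in es:
            es[a] = max((early(p) + durations[p] for p in preds.get(a, [])), default=0)
        return es[a]
    for a in durations:
        early(a)
    project_end = max(es[a] + durations[a] for a in durations)

    # Backward pass: latest finish = min successor latest start (project end if none).
    ls = {}
    def late(a):
        if a not in ls:
            lf = min((late(s) for s in succs[a]), default=project_end)
            ls[a] = lf - durations[a]
        return ls[a]
    for a in durations:
        late(a)

    slack = {a: ls[a] - es[a] for a in durations}
    return es, ls, slack
```

Activities with zero slack form the critical path.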
My code sometimes identifies the critical activities that compose the critical path correctly, but due to the data I have, it sometimes fails. I have found out when this happens: it's when the given times for an activity (from which its cost is deduced) do not respect the computed earliest and latest times. Right now I only take into account the cost of each activity, but obviously that is not enough, because in cases like the one in the picture below, the computed critical path is not accurate:
one fail case for the algorithm above http://imageshack.us/a/img688/2420/casemp.png
Obviously activity B is critical (if its end time is shifted, the end of the project is also shifted), but the algorithm computes a slack of 1...
I don't know how to change the algorithm to make it work for cases like the one above.
I found an easy way to identify the critical activities from the data I have: for each activity, I simulate a one-second delay (adding one second to its end time), propagate that delay to all successors, and test whether it affects the finish time of the last activity. If so, that activity is critical.
This way works great, and I now have a list of all the critical activities, but in some cases it can take several seconds (23 seconds for 450 activities with a lot of dependencies!). So I'm still trying to find a better way.

A Greedy algorithm for k-limited resources

I am studying greedy algorithms and I am wondering about the solution for a different case.
For interval selection problem we want to pick the maximum number of activities that do not clash with each other, so selecting the job with the earliest finishing time works.
Another example: we have n jobs given and we want to buy as few resources as possible. Here we can sort all the job endpoints from left to right; when we encounter a start point we increment a counter, and when we encounter an end point we decrement it. The largest value this counter reaches is the number of resources we need to buy.
But what if we have n tasks and only k resources, i.e. we cannot afford more than k resources? What would a greedy solution look like that removes as few tasks as possible to satisfy this?
Also, if there is a specific name for this last problem, I would be happy to hear it.
This looks like a general case of the version where we have only one resource.
Intuitively, it makes sense to still sort the jobs by end time and take them one by one in that order. Now, instead of the ending time of the last job, we keep track of the ending times of the last jobs accepted onto each of our k resources. For each job, we check whether the current job's starting time is greater than the ending time of the last job on any of our resources. If no such resource is found, we skip that job and move ahead. If one resource is found, we assign the job to that resource and update its ending time. If more than one resource can take the job, it makes sense to assign it to the resource with the latest end time.
I don't really have a proof of this greedy strategy, so it may well be wrong. But I cannot think of a case where changing the choice would enable us to fit more jobs.
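This strategy can be sketched as follows (a sketch under the stated assumptions; `bisect` keeps the k resource end times sorted, so the compatible resource with the latest end time is found in O(log k)):

```python
import bisect

def max_jobs_k_resources(intervals, k):
    """intervals: list of (start, end) with start >= 0; k identical resources.
    Greedy: sort by end time; assign each job to the compatible resource
    whose last job ends latest. Returns the number of jobs accepted."""
    intervals = sorted(intervals, key=lambda iv: iv[1])
    ends = [0] * k                 # sorted end times, one per resource
    accepted = 0
    for s, e in intervals:
        # rightmost resource whose last end time <= this job's start
        i = bisect.bisect_right(ends, s) - 1
        if i >= 0:
            ends.pop(i)            # that resource now ends at e
            bisect.insort(ends, e)
            accepted += 1
        # else: no compatible resource, skip the job
    return accepted
```

With k = 1 this degenerates to the classic interval scheduling greedy.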

load balancing algorithms - special example

Let's pretend I have two buildings in which I can build different units.
A building can only build one unit at a time, but has a FIFO queue of at most 5 units, which will be built in sequence.
Every unit has a build time.
I need to know the fastest way to get my units built, considering the units already in the build queues of my buildings.
"Famous" algorithms like round-robin don't work here, I think.
Are there any algorithms which can solve this problem?
This reminds me a bit of StarCraft :D
I would just add an integer to each building's queue which represents the time it is busy.
Of course you have to update this variable once per time unit. (Time units are seconds, "s", here.)
So let's say we have a building and we are submitting 3 units, each taking 5s to complete, which sums up to 15s total. We are at time = 0.
Then we have another building where we are submitting 2 units that each need 6 time units to complete.
So we can have a table like this:
Time 0
Building 1, 3 units, 15s to complete.
Building 2, 2 units, 12s to complete.
Time 1
Building 1, 3 units, 14s to complete.
Building 2, 2 units, 11s to complete.
Now if we want to add another unit that takes 2s, we can simply loop through the buildings and pick the one with the lowest time to complete.
In this case that would be building 2, which leads to time 2...
Time 2
Building 1, 3 units, 13s to complete
Building 2, 3 units, 10s+2s=12s to complete
...
Time 5
Building 1, 2 units, 10s to complete (5s are over, the first unit pops out)
Building 2, 3 units, 9s to complete
And so on.
Of course you have to take care of the upper bounds of your production facilities: if a building already has 5 elements queued, don't assign anything to it and pick the next building with the lowest time to complete.
I don't know if you can implement this easily with your engine, or if it even supports some kind of time units.
This just means updating all production facilities once per time unit, which is O(n), where n is the number of buildings that can produce something. Submitting a unit takes O(1), assuming you keep the buildings in sorted order, lowest time first, so it is just a first-element lookup. In that case you have to re-sort the list after manipulating the units, e.g. cancelling or adding.
Otherwise amit's answer seems possible, too.
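A minimal sketch of the submission step (the [time_remaining, units_queued] list layout and the function name are my own assumptions; here a linear scan replaces the sorted-list lookup):

```python
def assign_unit(buildings, build_time, max_queue=5):
    """buildings: list of [time_remaining, units_queued] entries.
    Picks the building with the lowest remaining build time that still
    has queue space, updates it, and returns its index (None if all full)."""
    best = None
    for idx, (remaining, queued) in enumerate(buildings):
        if queued < max_queue and (best is None or remaining < buildings[best][0]):
            best = idx
    if best is not None:
        buildings[best][0] += build_time   # building is busy for that much longer
        buildings[best][1] += 1
    return best
```

The per-time-unit update then just decrements each building's time_remaining (and pops finished units).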
This is an NP-complete problem (proof at the end of the answer), so your best hope of finding the ideal solution is trying all possibilities (2^n of them, where n is the number of tasks).
A possible heuristic was suggested in a comment (and improved in the comments by AShelly): sort the tasks from biggest to smallest and put them in one queue; each building takes the next element from the queue when it is done.
This is of course not always optimal, but I think it will give good results in most cases.
Proof that the problem is NP-complete:
Let S = {u | u is a unit that needs to be produced} (S is the set containing all 'tasks').
Claim: if a perfect split is possible (both queues finish at the same time), it is optimal. Let this finishing time be HalfTime.
This is true because in any other solution at least one of the queues has to finish at t > HalfTime, and thus that solution is not optimal.
Proof:
Assume we had an algorithm A that produces the best solution in polynomial time. Then we could solve the partition problem in polynomial time with the following algorithm:
1. Run A on the input.
2. If the two queues finish exactly at HalfTime, return True.
3. Else, return False.
This solves the partition problem because of the claim: if a partition exists, it will be found by A, since it is optimal. Steps 1-3 all run in polynomial time (1 by assumption, 2 and 3 trivially). So the suggested algorithm solves the partition problem in polynomial time; thus our problem is NP-complete.
Q.E.D.
Here's a simple scheme:
Let U be the list of units you want to build, and F the set of factories that can build them. For each factory, track the total time-til-complete, i.e. how long until its queue is completely empty.
Sort U by decreasing time-to-build. Maintain the sort order when inserting new items.
At the start, or at the end of any time tick in which a factory completes a unit or runs out of work:
Make a ready list of all the factories with space in their queue
Sort the ready list by increasing time-til-complete
Get the factory that will be done soonest
Take the first item from U and add it to that factory
Repeat until U is empty or all queues are full.
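The steps above amount to Longest Processing Time scheduling; here is a sketch using a min-heap of factory completion times (names are my own, and the FIFO capacity limit is omitted for brevity):

```python
import heapq

def lpt_schedule(build_times, num_factories):
    """Longest Processing Time: sort units by decreasing build time and
    always feed the factory that will be done soonest.
    Returns (unit build times assigned per factory, overall makespan)."""
    # min-heap of (time_til_complete, factory_index)
    heap = [(0, f) for f in range(num_factories)]
    heapq.heapify(heap)
    queues = [[] for _ in range(num_factories)]
    for t in sorted(build_times, reverse=True):
        done, f = heapq.heappop(heap)   # factory that frees up first
        queues[f].append(t)
        heapq.heappush(heap, (done + t, f))
    makespan = max(done for done, _ in heap)
    return queues, makespan
```

As noted below, LPT is guaranteed to finish within 4/3 of the optimal makespan.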
Googling "minimum makespan" may give you some leads into other solutions. This CMU lecture has a nice overview.
It turns out that if you know the set of work ahead of time, this problem is exactly Multiprocessor_scheduling, which is NP-Complete. Apparently the algorithm I suggested is called "Longest Processing Time", and it will always give a result no longer than 4/3 of the optimal time.
If you don't know the jobs ahead of time, it is a case of online Job-Shop Scheduling
The paper "The Power of Reordering for Online Minimum Makespan Scheduling" says:
"for many problems, including minimum makespan scheduling, it is reasonable to not only provide a lookahead to a certain number of future jobs, but additionally to allow the algorithm to choose one of these jobs for processing next and, therefore, to reorder the input sequence."
Because you have a FIFO on each of your factories, you essentially do have the ability to buffer the incoming jobs: you can hold them until a factory is completely idle, instead of trying to keep all the FIFOs full at all times.
If I understand the paper correctly, the upshot of the scheme is to:
Keep a fixed-size buffer of incoming jobs. In general, the bigger the buffer, the closer to ideal scheduling you get.
Assign a weight w to each factory according to a given formula, which depends on the buffer size. In the case where buffer size = number of factories + 1, use weights of (2/3, 1/3) for 2 factories and (5/11, 4/11, 2/11) for 3.
Once the buffer is full, whenever a new job arrives, remove the job with the least time to build and assign it to a factory with time-to-complete < w*T, where T is the total time-to-complete of all factories.
If there are no more incoming jobs, schedule the remainder of the jobs in U using the first algorithm I gave.
The main problem in applying this to your situation is that you don't know when (if ever) there will be no more incoming jobs. But perhaps just replacing that condition with "if any factory is completely idle" and then restarting will give decent results.

Algorithm for the allocation of work with dynamic programming

The problem is this:
We need to perform n jobs, each characterized by a gain {v1, v2, ..., vn}, a time required for its execution {t1, t2, ..., tn}, and a deadline {d1, d2, ..., dn} with d1 <= d2 <= ... <= dn. The gain is obtained only if the job is completed by its deadline, and we have a single machine. Describe an algorithm that computes the maximum gain that can be obtained.
I had thought of a recurrence with two parameters, one indicating the i-th job and the other the moment at which we are executing: OPT(i, d); if d + t_i <= d_i, then we add the gain v_i (a variant of multiway choice, i.e. a max over 1 <= i <= n).
My main problem is: how can I find out which jobs were previously carried out? Should I use a supporting data structure?
How would you write the recurrence equation?
Thank you!
My main problem is: how can I find out which jobs were previously carried out? Should I use a supporting data structure?
The trick is that you don't need to know which jobs are already completed, because you can execute them in order of increasing deadline.
Let's say some optimal solution (yielding the maximum profit) requires you to complete job A (deadline 10) and then job B (deadline 3). In that case you can safely swap A and B: both will still be completed in time, and the new arrangement yields the same total profit.
End of proof.
How would you write the recurrence equation?
You already have the general idea, but you don't need the loop (the min over 1 <= i <= n).
max_profit(current_job, start_time)
    if current_job > n
        return 0        // no jobs left
    end
    // skip this job
    result1 = max_profit(current_job + 1, start_time)
    // start doing this job now
    result2 = 0
    finish_time = start_time + T[current_job]
    if finish_time <= D[current_job]
        // only if we can finish it before its deadline
        result2 = max_profit(current_job + 1, finish_time) + V[current_job]
    end
    return max(result1, result2)
end
Converting it to DP should be trivial.
If you don't want O(n*max_deadline) complexity (e.g. when the d and t values are big), you can resort to recursion with memoization and store the results in a hash table instead of a two-dimensional array.
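A memoized Python version of the recurrence (a sketch; the (t, d, v) tuple layout and the function names are my own assumptions):

```python
from functools import lru_cache

def max_gain(jobs):
    """jobs: list of (exec_time, deadline, gain) tuples.
    Returns the maximum total gain on a single machine."""
    jobs = sorted(jobs, key=lambda j: j[1])   # non-decreasing deadline

    @lru_cache(maxsize=None)
    def best(i, start):
        if i == len(jobs):
            return 0
        t, d, v = jobs[i]
        skip = best(i + 1, start)             # don't run job i
        take = 0
        if start + t <= d:                    # can finish before its deadline
            take = best(i + 1, start + t) + v
        return max(skip, take)

    return best(0, 0)
```

The `lru_cache` plays the role of the hash table, so only reachable (i, start) states are stored.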
edit
If all jobs must be performed, but not all will be paid for, the problem stays the same: just push the jobs you don't have time for (the jobs you can't finish before their deadline) to the end. That's all.
First of all, I would pick the items with the biggest yield, meaning the jobs with the biggest value/time ratio that can still meet their deadline (if now + t1 exceeds d1, then the job is bogus). Afterwards I check the time between now + job_time and each deadline, obtaining a "chance to finish" for each job. The jobs that come first are the jobs with the biggest yield and the lowest chance to finish. The idea is to squeeze in the most valuable jobs.
CASES:
If a job with a yield of 5 needs 10 seconds to finish and its deadline comes in 600 seconds, while a job with the same yield needs 20 seconds to finish and its deadline comes in 22 seconds, then I run the second one.
If a job with a yield of 10 needs 10 seconds to finish and its deadline comes in 100 seconds, while another job with a yield of 5 needs 10 seconds to finish and its deadline comes in 100 seconds, I'll run the first one.
If their yields are identical and they take the same time to finish, while their deadlines come in 100 and 101 seconds respectively, I'll run the first one, as that gains more time.
.. so on and so forth..
Recursion in this case refers only to reordering the jobs by these parameters, "yield" and "chance to finish".
Remember to always increase "now" (by job_time) after inserting a job into the order.
Hope this answers it.
I read the comments above and understood that you are not looking for efficiency but for completion, so that takes yield out of the way and leaves us with just ordering by deadline. It's a classic sorting problem, solvable with divide-and-conquer quicksort:
http://en.wikipedia.org/wiki/Quicksort
