Understanding Machine Scheduling - data-structures

I'm currently learning about priority queues and heaps in my Data Structures class, and in the class PowerPoint slides there is a little section that introduces machine scheduling. I'm having difficulty understanding what is going on.
It begins by giving an example:
m identical machines
n jobs/tasks to be performed
assign jobs to machines so that the time at which the last job completes is minimum. --> The wording of this last part sort of throws me off... what exactly does the italicized portion mean? Can somebody word it differently?
Continuing with the example it says:
3 machines and 7 jobs
job times are [6, 2, 3, 5, 10, 7, 14]
possible schedule, followed by this picture:
(The example schedule is constructed by scheduling the jobs in the order they appear in the given job list (left to right); each job is scheduled on the machine on which it will complete earliest.)
Finish time = 21
Objective: find schedules with minimum finish time
And I don't really understand what is going on. I don't understand what is being accomplished, or how they came up with that little picture with the jobs and the different times...Can somebody help me out?

"The time at which the last job completes is minimum" = "the time at which all jobs are finished", if that helps.
In your example, that happens at time = 21. Clearly there are no jobs still running after that time, and all jobs have been scheduled (i.e. you can't schedule no jobs and say the minimum time is time = 0).
To explain the example:
The numbers in the list are the durations of the jobs. The job with duration 6 is scheduled first - since scheduling it on machines A, B or C will all end up with it finishing at time 6, which one doesn't really matter, so we just schedule it on machine A. Then the job with duration 2 is scheduled. Similarly it can go on B or C (if it were to go on A, it would finish at time 8, so that's not in line with our algorithm), and we schedule it on B. Then the job with duration 3 is scheduled. The respective end times for machines A, B and C would be 9, 5 and 3, so we schedule it on machine C. And so on.
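The "schedule each job on the machine that will complete it earliest" rule maps naturally onto a min-heap keyed on each machine's current finish time, which is presumably why this example shows up in a priority-queue lecture. Here is a minimal sketch (the function name `greedy_schedule` is mine, not from the slides):

```python
import heapq

def greedy_schedule(job_times, m):
    """Assign each job, in the given order, to the machine that
    would finish it earliest; return per-machine job lists and the
    makespan (the time at which the last job completes)."""
    # min-heap of (current finish time, machine index)
    machines = [(0, i) for i in range(m)]
    heapq.heapify(machines)
    assignment = [[] for _ in range(m)]
    for t in job_times:
        finish, i = heapq.heappop(machines)  # machine that frees up first
        assignment[i].append(t)
        heapq.heappush(machines, (finish + t, i))
    makespan = max(finish for finish, _ in machines)
    return assignment, makespan

jobs = [6, 2, 3, 5, 10, 7, 14]
assignment, makespan = greedy_schedule(jobs, 3)
print(makespan)  # 21, matching the slide's "Finish time = 21"
```

Each job costs one pop and one push, so scheduling n jobs on m machines is O(n log m).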
The given algorithm is not the best we can do, though (perhaps something enforces the given order, although that wouldn't make much sense). One better assignment:
A | 14 | 2 |         (14 ends at 14, 2 ends at 16)
B | 10 | 6 |         (10 ends at 10, 6 ends at 16)
C | 7 | 3 | 5 |      (7 ends at 7, 3 ends at 10, 5 ends at 15)
Here all jobs are finished at time = 16.
I've listed the actual job chosen for each slot in the slot itself, to hopefully clear up any remaining confusion (for example, on machine A, you can see that the jobs with duration 14 and 2 were scheduled, ending at time 16).
I'm sure the given algorithm was just an introduction to the problem and you'll get to always producing the best result soon.
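One standard improvement the course will likely cover is Longest Processing Time (LPT) first: sort the jobs in decreasing order, then apply the same earliest-completion greedy rule. On this input it also reaches a finish time of 16, which is in fact optimal here: the jobs sum to 47, so with 3 machines no schedule can finish before ceil(47/3) = 16. A sketch (not from the slides; the function name is mine):

```python
import heapq

def lpt_schedule(job_times, m):
    """Longest Processing Time first: place jobs in decreasing order
    of duration, each on the machine that frees up earliest."""
    machines = [0] * m  # min-heap of machine finish times
    heapq.heapify(machines)
    for t in sorted(job_times, reverse=True):
        heapq.heappush(machines, heapq.heappop(machines) + t)
    return max(machines)

print(lpt_schedule([6, 2, 3, 5, 10, 7, 14], 3))  # 16
```

LPT is only a heuristic in general (finding the true minimum finish time is NP-hard), but it is a well-known approximation for this problem.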
As for what's being accomplished by trying to get all jobs to finish as soon as possible: think of a computer with multiple cores, for example. There are many reasons you'd want tasks to finish as soon as possible. Perhaps you're playing a game and you have a bunch of tasks that work out what's happening (maybe there's a task assigned to each unit, or a few units, to determine what it does). You can only display after all tasks are finished, so if you don't try to finish as soon as possible, you'll unnecessarily make the game slow.

Related

A job allocation problem

I have n people who need to do N tasks. Each task takes some number of man-days, and tasks have dependencies, such as: task A must start after task B finishes. How should I arrange this? Thank you so much!
I set up 4 rules for this question:
each task has a maximum number of people who can work on it (such as 5)
there are logical dependencies between tasks
each task costs a fixed number of man-days
on any given day, the number of people working concurrently cannot exceed the number of available people (n)
Those are the rules, but I don't know how to calculate the minimum total time. Please tell me a method to solve it, or give me some inspiration, thank you so much!
Make a topological sort of the jobs.
Now you have a sequence of events (time; job start / job end).
At the start of a job, assign a worker if one is free; otherwise wait.
At the end of a job, free the worker and assign them to a waiting job, if one exists.
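A rough sketch of this event-driven approach, under the simplifying assumption of one worker per job (the function name `schedule_jobs` and the example jobs are mine, for illustration):

```python
import heapq
from collections import deque

def schedule_jobs(durations, deps, n_workers):
    """durations: {job: time}; deps: {job: [prerequisite jobs]}.
    Start a ready job whenever a worker is free; on each job's end,
    free its worker and release any newly unblocked jobs."""
    indegree = {j: len(deps.get(j, [])) for j in durations}
    children = {j: [] for j in durations}
    for job, prereqs in deps.items():
        for p in prereqs:
            children[p].append(job)
    ready = deque(j for j in durations if indegree[j] == 0)
    running = []  # min-heap of (finish time, job)
    free = n_workers
    now = 0
    finish = {}
    while ready or running:
        # start as many ready jobs as there are free workers
        while ready and free > 0:
            j = ready.popleft()
            free -= 1
            heapq.heappush(running, (now + durations[j], j))
        # advance time to the next job completion
        now, j = heapq.heappop(running)
        free += 1
        finish[j] = now
        for c in children[j]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return finish

# Two workers; C can only start after B is done
print(schedule_jobs({'B': 5, 'A': 10, 'C': 3}, {'C': ['B']}, 2))
# {'B': 5, 'C': 8, 'A': 10}
```

Processing jobs as their prerequisites complete is equivalent to consuming them in topological order, so a cyclic dependency graph would leave jobs stuck with a nonzero indegree.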
This problem is well-known to be NP-complete. Even the simple version with no dependencies, and only one worker per job, is NP-complete.
So you likely need to accept a reasonable heuristic. Here is one.
With a breadth-first search starting from the jobs that nothing else depends on, figure out for each job the minimum wall-clock time from starting that job to finishing everything (maxing out workers on the job and on everything that depends on it, even if that is not actually possible).
And now you start at the top. Always assign workers to jobs by the following rules:
Longest wall-clock time to finish first.
Break ties in favor of longest job.
Break ties in favor of job with fewest max workers.
In other words you're prioritizing putting people to work on the critical path and any slow jobs.
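The per-job "time to finish everything downstream" can be computed with a memoized traversal of the dependency graph. Here is a sketch, simplified to one worker per job rather than the answer's maxed-out-workers bound (the names `time_to_finish`, `durations`, and `children` are illustrative):

```python
def time_to_finish(durations, children):
    """For each job, the length of the longest chain of durations
    from that job to the end of the project (a critical-path-style
    lower bound, assuming one worker per job)."""
    memo = {}
    def walk(j):
        if j not in memo:
            memo[j] = durations[j] + max(
                (walk(c) for c in children.get(j, [])), default=0)
        return memo[j]
    for j in durations:
        walk(j)
    return memo

# A and B must both finish before C can start
durations = {'A': 3, 'B': 1, 'C': 2}
children = {'A': ['C'], 'B': ['C']}
ttf = time_to_finish(durations, children)
print(ttf)  # {'C': 2, 'A': 5, 'B': 3}

# Rule 1: longest wall-clock time to finish goes first
order = sorted(durations, key=lambda j: -ttf[j])
print(order)  # ['A', 'B', 'C']
```

The remaining tie-break rules can be folded into the sort key, e.g. `key=lambda j: (-ttf[j], -durations[j], max_workers[j])` for a hypothetical `max_workers` table.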

Preemptive SSTF algorithm

What happens in the preemptive SSTF algorithm if an arriving process has the same burst time (shortest) as the currently running process at that instant? Will the running process continue to run, or will the processor switch to the arriving process?
Example: at time instant 4, P1 has a remaining time of 6 ms and a new process P2 arrives with a burst of 6 ms. Will P1 continue to run, or will the processor switch to P2?
That is entirely system dependent. It may break the tie using the smallest arrival time first, or simply by the priority of the jobs. In general it is the priority, which is determined by a number of factors; that prevents a process from being stuck in the same state for long. These are the common ways in which the problem is resolved.
So long story short it depends on implementation.
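For illustration, one deterministic convention is to compare (remaining time, arrival time), so ties go to the earlier arrival and the already-running process keeps the CPU. This is just one possible tie-break, as noted above, not the definition of SSTF (the function name `pick_next` is mine):

```python
def pick_next(processes):
    """processes: list of (name, remaining_time, arrival_time).
    Shortest remaining time first; ties broken by earliest arrival,
    so the currently running process is not preempted on a tie."""
    return min(processes, key=lambda p: (p[1], p[2]))[0]

# P1 has been running since t=0 and has 6 ms left; P2 arrives at t=4
# with a 6 ms burst. With this tie-break, P1 keeps the CPU.
print(pick_next([("P1", 6, 0), ("P2", 6, 4)]))  # P1
```

Swapping the key to `(p[1], -p[2])` would instead favor the newest arrival, which is equally valid on some systems.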

Task Scheduling Optimization with dependency and worker constraint

We are confronted with a task scheduling problem.
Specs
We have N workers available, and a list of tasks to do.
Each task Ti needs Di worker-days to finish (Demand), and can hold no more than Ci workers working on it simultaneously (Capacity).
And some tasks can only start after other task(s) are done (Dependency).
The target is to achieve total minimal duration by allocating workers to those sequences.
Example
Number of workers: 10
Task list: [A, B, C]
Demand: [100, 50, 10] - unit: worker-days (task A needs 100 worker-days to finish, B needs 50, and C needs 10)
Capacity: [10, 10, 2] - unit: workers (task A can hold at most 10 workers at the same time, B can hold 10, and C can hold 2)
Dependency: {A: null, B: null, C: B} - A and B can start at any time, C can only start after B is done
Possible approaches to the example problem:
First assign B 10 workers, and it will take 50/10 = 5 days to finish. Then at day 5, we assign 2 workers to C, and 8 workers to A, it will take max(10/2 = 5, 100/8 = 12.5) = 12.5 days to finish. Then the total duration is 5 + 12.5 = 17.5 days.
First assign A 10 workers, and it takes 100/10 = 10 days to finish. Then at day 10, we assign 10 workers to B, which takes 50/10 = 5 days to finish. Then at day 15, we assign 2 workers to C, which takes 10/2 = 5 days to finish. The total duration is 10+5+5 = 20 days.
So the first practice is better, since 17.5 < 20.
But there are still many more possible allocation practices to the example problem, and we are not even sure about what is the best practice to get the minimal total duration for it.
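The arithmetic of the two example strategies can be checked directly. This is only a sketch of the worked example, not a general solver (function names are mine):

```python
def strategy_b_first():
    # Phase 1: all 10 workers on B: 50 worker-days / 10 workers = 5 days.
    phase1 = 50 / 10
    # Phase 2: C gets its capacity of 2 workers, A gets the remaining 8;
    # they run in parallel, so the phase lasts as long as the slower task.
    phase2 = max(10 / 2, 100 / 8)
    return phase1 + phase2

def strategy_a_first():
    # A with 10 workers, then B with 10, then C with its capacity of 2.
    return 100 / 10 + 50 / 10 + 10 / 2

print(strategy_b_first(), strategy_a_first())  # 17.5 20.0
```

This only verifies the two hand-built schedules; enumerating or optimizing over all allocations is the hard part of the question.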
What we want is an algorithm:
Input:
Nworker, Demand, Capacity, Dependency
Output: the worker allocation with the minimal total duration.
Possible allocation strategies we've considered for the tasks without dependencies:
First finish the tasks that other tasks depend on as soon as possible (say, finish B as soon as possible in the example)
Allocate workers to the tasks with maximum demand (say, first allocate all workers to A in the example)
But neither of the two proves to be the optimal strategy.
Any idea or suggestion would be appreciated. Thanks !
This sounds like Job Shop Scheduling with dependencies, which is NP-complete (or NP-hard). So scaling out and delivering an optimal solution in reasonable time is probably impossible.
I've got good results on similar cases (task assigning and dependent job scheduling) by first doing a construction heuristic (pretty much one of the two allocation strategies you have there) and then doing a local search (usually Late Acceptance or Tabu Search) to get near-optimal results.

Optimum algorithm

Two workers have several tasks. Assume the tasks have durations of 14, 7, 2, 4. The next task goes to the first worker that is free. The two workers have to finish the tasks in one day. The same task takes the same time on either worker. Our goal is to finish the tasks as soon as possible.
Two questions:
1. Show that the algorithm always completes the tasks prior to time 2*T, where T is the optimum completion time.
2. Express the optimum scheduling with a (multi-dimensional) recursion.
This is not a homework problem.
Please give me some suggestions.
What is a multi-dimensional recursion?
Since you ask for suggestions...
Try drawing the problem out. Have a timeline for worker #1 and worker #2 and specify what tasks they are working on for what stretches of time. Once you understand why this algorithm completes in less than 2*T time, you then can start figuring out how to formally prove it.
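As a starting point for that drawing exercise, here is a sketch comparing the greedy "next task to the first free worker" rule against a brute-force optimum for the given durations (function names are mine, for illustration):

```python
from itertools import combinations

def greedy_makespan(tasks):
    """Each task, in order, goes to whichever worker frees up first."""
    w1 = w2 = 0
    for t in tasks:
        if w1 <= w2:
            w1 += t
        else:
            w2 += t
    return max(w1, w2)

def optimal_makespan(tasks):
    """Brute force: try every split of the tasks between two workers."""
    total = sum(tasks)
    best = total
    for r in range(len(tasks) + 1):
        for subset in combinations(tasks, r):
            best = min(best, max(sum(subset), total - sum(subset)))
    return best

tasks = [14, 7, 2, 4]
g, opt = greedy_makespan(tasks), optimal_makespan(tasks)
print(g, opt)  # 14 14 -- on this input, greedy happens to be optimal
```

For the proof itself, note that when the last task to finish starts, the other worker has been busy the whole time; that observation is the core of the 2*T bound.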

Average waiting time in Round Robin scheduling

Waiting time is defined as how long each process has to wait before it gets its time slice.
In scheduling algorithms such as Shortest Job First and First Come First Served, we can find that waiting time easily: we just queue up the jobs and see how long each one had to wait before it got serviced.
When it comes to Round Robin or any other preemptive algorithm, a long-running job spends a little time in the CPU, is preempted, waits for its next turn to execute, and at some point in its turn executes to completion. I wanted to find out the best way to understand the 'waiting time' of jobs in such a scheduling algorithm.
I found a formula which gives waiting time as:
Waiting Time = (Final Start Time - Previous Time in CPU - Arrival Time)
But I fail to understand the reasoning for this formula. For example, consider a job A which has a burst time of 30 units, with round-robin happening every 5 units. There are two more jobs, B (10) and C (15).
The order in which these will be serviced would be:
0 A 5 B 10 C 15 A 20 B 25 C 30 A 35 C 40 A 45 A 50 A 55
Waiting time for A = 40 - 5 - 0
I chose 40 because after 40, A never waits. It just gets its time slices and goes on and on.
I chose 5 because A previously spent time in the processor between 30 and 35.
0 is the arrival time.
Well, I have a doubt about this formula: why is the slice 15 A 20 not accounted for?
Intuitively, I am unable to see how this gives us the waiting time for A, when we are only accounting for the penultimate execution and then subtracting the arrival time.
In my view, the waiting time for A should be:
Final start time - (sum of all the time it spent processing).
If this formula is wrong, why is it?
Please help clarify my understanding of this concept.
You've misunderstood what the formula means by "previous time in CPU". This actually means the same thing as what you call "sum of all times it spend in the processing". (I guess "previous time in CPU" is supposed to be short for "total time previously spent running on the CPU", where "previously" means "before the final start".)
You still need to subtract the arrival time because the process obviously wasn't waiting before it arrived. (Just in case this is unclear: The "arrival time" is the time when the job was submitted to the scheduler.) In your example, the arrival time for all processes is 0, so this doesn't make a difference there, but in the general case, the arrival time needs to be taken into account.
Edit: If you look at the example on the webpage you linked to, process P1 takes two time slices of four time units each before its final start, and its "previous time in CPU" is calculated as 8, consistent with the interpretation above.
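A small simulation (all processes arriving at t = 0, as in the example) confirms this reading. With waiting time computed as completion − burst − arrival, A gets 55 − 30 − 0 = 25, which equals 40 − 15 − 0 from the formula once "previous time in CPU" is taken as the total 15 units A ran before its final start (the function name is mine):

```python
from collections import deque

def round_robin_waiting(bursts, quantum):
    """bursts: {name: burst time}, all arriving at t = 0 in dict order.
    Returns {name: waiting time}, where waiting time is
    completion time - burst time - arrival time (arrival = 0 here)."""
    remaining = dict(bursts)
    queue = deque(bursts)
    now = 0
    completion = {}
    while queue:
        p = queue.popleft()
        run = min(quantum, remaining[p])  # run one time slice (or less)
        now += run
        remaining[p] -= run
        if remaining[p] == 0:
            completion[p] = now
        else:
            queue.append(p)  # preempted: back to the end of the queue
    return {p: completion[p] - bursts[p] for p in bursts}

print(round_robin_waiting({"A": 30, "B": 10, "C": 15}, 5))
# {'A': 25, 'B': 15, 'C': 25}
```

The simulated order of time slices matches the Gantt chart in the question (A B C A B C A C A A A), so the 25 for A is consistent with the timeline given there.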
Another way to compute it:
Waiting time = last start time - (time quantum × (n - 1))
Here n denotes the number of times the process appears in the Gantt chart. For A, the last start is at 50 and A appears 6 times, so the waiting time is 50 - 5×5 = 25.
