Algorithm to find optimized scheduling

I have 7 tasks, t1-t7, each with an associated execution time (t1 takes 1 time unit, t2 takes 5, etc.).
You don't have to pay the communication cost between t1 and t2 if you do them on the same processor. If you do t1 on p1 and t2 on p2, you need 5 time units to transfer data from p1 to p2 (5 is the weight of the edge t1-t2).
As illustrated in the right image, the communications c(1,2) and c(3,4) can happen at the same time; they both finish at the end of time unit 5.
The makespan is the time taken to finish all 7 tasks. Given that I can use as many processors as I want, which algorithm can I use to find the schedule with minimum makespan (or at least one near the minimum)?
Please note that the result in the right image may not be optimal.
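
To make the cost model concrete, here is a small Python sketch (the names are mine, not from the question) that computes the makespan of one given schedule: task execution times, edge communication weights, a processor assignment, and an execution order. Any exact or heuristic search for this problem needs an evaluator of this kind.

import collections

def makespan(exec_time, comm, proc, order):
    # exec_time: {task: execution time}
    # comm: {(u, v): cost}, meaning v needs data produced by u
    # proc: {task: processor}; order: tasks in topological execution order
    finish = {}
    busy = collections.defaultdict(int)   # next free time per processor
    for t in order:
        ready = 0
        for (u, v), c in comm.items():
            if v == t:
                # communication is free on the same processor
                delay = 0 if proc[u] == proc[t] else c
                ready = max(ready, finish[u] + delay)
        start = max(ready, busy[proc[t]])
        finish[t] = start + exec_time[t]
        busy[proc[t]] = finish[t]
    return max(finish.values())

# the question's t1/t2 example: separate processors pay the edge weight
print(makespan({"t1": 1, "t2": 5}, {("t1", "t2"): 5},
               {"t1": "p1", "t2": "p2"}, ["t1", "t2"]))   # 1 + 5 + 5 = 11
print(makespan({"t1": 1, "t2": 5}, {("t1", "t2"): 5},
               {"t1": "p1", "t2": "p1"}, ["t1", "t2"]))   # 1 + 5 = 6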

Max Tasks that can be completed in given time

I recently came across this question in a forum:
You are given a straight line from 0 to 10^9. You start at zero and there are n tasks you can perform. Each task is located at some point on the line and requires time t to be performed. To perform a task you need to reach its point and spend t time at that location.
Example: (5, 8) lies at point 5, so the travel distance is 5 and the work effort is 8.
Total effort is calculated as travel distance + time required to complete the work.
It takes one second to travel one unit of path.
Now we are given T seconds in total, and we need to complete as many tasks as possible and get back to the starting position.
Find the max number of tasks that you can finish in time T.
Example:
3 16 - 3 tasks and 16 units of total time
2 8 - task 1 at position 2 on the line, takes 8 s to complete
4 5 - task 2 at position 4 on the line, takes 5 s to complete
5 1 - task 3 at position 5 on the line, takes 1 s to complete

Output: 2
Explanation:
If we take task 1 at location 2, which requires 8 s, then getting to location 2 takes 2 s and completing the task takes 8 s, leaving us with only 6 s, which is not enough to complete another task.
On the other hand, skipping the first task leaves us enough time to complete the other two tasks.
Going to location 5 and coming back costs 2 × 5 = 10 s, and performing the tasks at locations 4 and 5 costs 5 + 1 = 6 s. Total time spent: 10 s + 6 s = 16 s.
I am new to graphs and DP, so I was not sure which approach to use: Hamiltonian cycle, Knapsack, or Longest Path.
Can someone please help me with the most efficient approach to solve this?
Let's iterate from the first task to the last, ordered by distance. As we go, after subtracting 2 * distance(i) + effort(i) to take the current task i as our farthest, the most tasks we can achieve can be found by greedily packing as many earlier tasks as possible into the remaining time, ordered by increasing effort.
Therefore, an efficient solution could insert each task seen so far into a data structure ordered by effort, dynamically updating the best solution so far. (I originally thought of using a treap and binary search, but j_random_hacker suggested a much simpler way in the comments below this answer.)
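
Here is a minimal sketch of that greedy, assuming the input format from the question; for clarity it re-packs with a plain O(n^2) scan instead of the ordered data structure mentioned above.

import bisect

def max_tasks(T, tasks):
    # tasks: list of (position, effort) pairs on a line starting at 0
    tasks.sort()                       # by distance from the start
    efforts = []                       # efforts of earlier tasks, sorted
    best = 0
    for pos, effort in tasks:
        budget = T - 2 * pos - effort  # take this task as our farthest
        if budget >= 0:
            count = 1                  # the current task itself
            for e in efforts:          # pack the cheapest earlier tasks
                if e > budget:
                    break
                budget -= e
                count += 1
            best = max(best, count)
        bisect.insort(efforts, effort)
    return best

print(max_tasks(16, [(2, 8), (4, 5), (5, 1)]))  # prints 2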
Suggestion:
For each task n, create a graph like this:
Join up these graphs for all the tasks.
Run a travelling salesman algorithm to find the minimum time to do all the tasks (i.e., visit all the nodes in the combined graph).
Remove tasks one at a time. This will give you a collection of results for different numbers of tasks performed. Choose the one that performs the most tasks while still staying under the time limit.
Since you are maximizing the number of tasks performed, start by removing the longest tasks, so that you are left with lots of short tasks.

Dynamic Programming - Break Scheduling Problem With Decreasing Work Capacity

Assume you are given an amount of tasks t1, t2, t3, ..., tn to finish on each day. Once you start working, you can only finish c1, c2, c3, ... tasks on successive working days before spending a day resting, which resets the capacity back to c1. You can spend multiple days resting, too. But you can only do the tasks that are given to you on that day. For example:
T[] = {10, 1, 4, 8} given tasks;
C[] = {8, 4, 2, 1} is the capacity of doing tasks for each day.
For this example, the optimal solution is to take a break on the 3rd day. That way you can complete 17 tasks in 4 days:
1st day 8 (maximum 10 tasks, but c1=8)
2nd day 1 (maximum 1 task, c2=4)
3rd day 0 (rest to reset to c1)
4th day 8 (maximum 8 tasks, c1=8)
Any other schedule would result in fewer tasks getting done.
I'm trying to find the recurrence relation for this dynamic programming problem. Can anyone help me? I found this question, but mine is different because of the decreasing work capacity and the different number of jobs each day.
If I understood you correctly, you have an amount of tasks to do, t(i), for every day i. You also have a given capacity restriction c(j) for the current streak day j, where j is reset to 0 if no task was done that day. The goal is to maximize the number of solved tasks.
The naive approach is to store, for each day i and every state j, how many tasks were done. Fill in the data for the first day. Then, for every following day, fill in the values for the "no break" case, which is
value(i-1, j-1) + min(t(i), c(j)). Choose the maximum from the previous day to fill the "break" entry. Repeat until the last day. Choose the highest value and trace back the path.
Memory consumption is straightforward: O(number of days * number of states).
If you are only interested in the value and not the schedule, the memory consumption would be O(number of states).
Time consumption is a bit more complex, so let's write some pseudo code:
For each day i
    For each possible state j
        add and write values
    choose maximum from previous day for break state
Choose the maximum over all states
For each day
    trace back the path
The choose-maximum function has a complexity of O(number of states).
This pseudo code results in a time consumption of O(number of days * number of states) as well.
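
To make the recurrence concrete, here is a small value-only Python sketch (so it uses O(number of states) memory, as noted above). The state j is the length of the current work streak, with j = 0 meaning a rest day; the names are mine, not from the original answer.

def max_tasks(T, C):
    # T[i]: tasks available on day i
    # C[j]: capacity on the (j+1)-th consecutive working day
    n = len(T)
    NEG = float("-inf")
    dp = [NEG] * (n + 1)    # dp[j] = best total so far with streak length j
    dp[0] = 0               # before day 1: no streak
    for i in range(n):
        nxt = [NEG] * (n + 1)
        nxt[0] = max(dp)    # rest today, streak resets
        for j in range(1, n + 1):
            if dp[j - 1] > NEG and j - 1 < len(C):
                nxt[j] = dp[j - 1] + min(T[i], C[j - 1])
        dp = nxt
    return max(dp)

print(max_tasks([10, 1, 4, 8], [8, 4, 2, 1]))  # prints 17

On the example above it returns 17, matching the 8 + 1 + rest + 8 schedule.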

Greedy Algorithm: Assigning jobs to minimize cost

What is the best approach to take if I want to find the minimum total cost of assigning n jobs to a person in a sequence, where each job has a cost assigned to it? For example, I have 2 jobs which have costs 4 and 5 respectively. The jobs take 6 and 10 minutes respectively. The finish time of the second job is the finish time of the first job plus the time taken by the second job. The total cost is the sum, over all jobs, of each job's finish time multiplied by its cost.
If you have to assign n jobs to one person (or one machine), then, in scheduling-literature terminology, you are looking to minimize the weighted flow time. The problem is polynomially solvable.
The shortest weighted processing time sequence is optimal.
Sort and reindex jobs such that p_1/w_1 <= p_2/w_2 <= ... <= p_n/w_n,
where p_i is the processing time of the i-th job and w_i is its weight or cost.
Then assign job 1 first, followed by job 2, and so on until job n.
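
A short sketch of this rule (assuming positive weights; the names are illustrative):

def schedule_swpt(jobs):
    # jobs: list of (processing_time, weight/cost) pairs
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0] / jobs[i][1])
    t, total = 0, 0
    for i in order:
        p, w = jobs[i]
        t += p              # finish time of job i
        total += w * t      # weighted completion cost
    return order, total

# the question's example: costs 4 and 5, times 6 and 10
print(schedule_swpt([(6, 4), (10, 5)]))
# ([0, 1], 104): 4*6 + 5*16 = 104, versus 114 in the other order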
If you look at what happens when you swap two adjacent jobs, you end up comparing terms like (A+c)m + (A+c+d)l and (A+d)l + (A+c+d)m, where A is the time consumed by earlier jobs, c and d are the two jobs' times, and l and m are their costs. Expanding and cancelling, the first version is smaller exactly when cl < dm, i.e. when c/m < d/l. So you could work out, for each job, the time taken by that job divided by its cost, and do the jobs with the smallest time per unit cost first. Sanity check: if you have a job that takes 10 years and has a cost of 1 cent, you want to do it last so that the 10-year wait doesn't get multiplied by any other costs.

Trying to gain intuition for work scheduling greedy algorithm

I have the following scenario: (since I don't know of a way to show LaTeX, here's a screenshot)
I'm having some trouble conceptualizing what's going on here. If I were to program this, I would probably attempt to structure it as some kind of heap where each node represents a worker, ordered from earliest to latest, and then run Prim's/Kruskal's algorithm on it. I don't know if I'm on the right track with that idea, but I need to flesh out my understanding of this problem so I can do the following:
Describe in detail the greedy choice
Show that if there's an optimal solution for which the greedy choice was not made, then an exchange can be made to conform with the greedy choice
Know how to implement a greedy algorithm solution, and its running time
So where should I be going with this idea?
This problem is very similar in nature to roster scheduling problems. Think of the committee as, say, a set of 'supervisors': you want a supervisor present whenever a worker is present. In this case, the supervisors come from the same set as the workers.
Here are some modeling ideas, and an Integer Programming formulation.
Time Slicing Idea
This sounds like a bad idea initially, but works really well in practice. We are going to create a lot of "time instants" T_i, from the start time of the first shift to the end time of the very last shift. It sometimes helps to think of T1, T2, T3, ..., TN as being time instants (say) five minutes apart. For every T_i, at least one worker is working on a shift. Therefore, that time instant has to be covered (coverage means there has to be at least one member of the committee also working at time T_i).
We really only need to worry about 2n time instants: the start and finish times of each of the n workers.
Coverage Property Requirement
For every time instant T_i, we want a worker from the committee present.
Let w1, w2, ..., wn be the workers, sorted by their start times s_i. (Worker w1 starts the earliest shift, and worker wn starts the very last shift.)
Introduce a new indicator (boolean) variable:
Y_i = 1 if worker i is part of the committee
Y_i = 0 otherwise.
Visualization
Now think of a 0-1 matrix, where the rows are the SORTED workers, and the columns are the time instants...
Construct a Time-Worker Matrix (0/1)
         t1  t2  t3  t4  t5  t6 ... tN
-------------------------------------------
w1        1   1
w2        1   1   1
w3            1   1   1
w4                    1   1   1
...
wn                                   1
-------------------------------------------
Total     2   3   2   2   1   1 ...  1
So the problem is to make sure that, for each column, at least one worker is selected to be part of the committee. The Total row shows the number of candidates for the committee at each time instant.
An Integer Programming based formulation
Objective: Minimize Sum(Y_i)
Subject to:
Y1 + Y2 >= 1 # coverage for time t1
Y1 + Y2 + Y3 >= 1 # coverage for time t2
...
More generally, the constraints are:
# Set-covering constraint for time instant T_j
Sum of Y_i over all workers i who are working at time T_j >= 1
Y_i binary for all i
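
As a sketch, here is the same formulation in PuLP (an open-source Python modelling library; the shift data below is made up for illustration):

import pulp  # pip install pulp

# hypothetical shift data: worker -> (start, end)
shifts = {"w1": (0, 4), "w2": (3, 7), "w3": (6, 10)}
# the 2n interesting instants: every shift start and end
instants = sorted({t for se in shifts.values() for t in se})

prob = pulp.LpProblem("committee", pulp.LpMinimize)
y = {w: pulp.LpVariable("y_" + w, cat="Binary") for w in shifts}
prob += pulp.lpSum(y.values())          # objective: minimize committee size
for t in instants:
    working = [w for w, (s, e) in shifts.items() if s <= t <= e]
    prob += pulp.lpSum(y[w] for w in working) >= 1   # cover instant t

prob.solve()
print([w for w in shifts if y[w].value() == 1])      # e.g. ['w1', 'w3']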
Preprocessing
This integer program, if attempted without preprocessing, can be very difficult and end up choking the solvers. But in practice there are quite a number of preprocessing ideas that can help immensely.
Make any forced assignments. (If there is ever a time instant with only one worker working, that worker has to be in the committee C.)
Separate into nice subproblems. Look at the time-worker matrix: if there are nice 'rectangles' in it that can be cut out without impacting any other time instant, each one is a wholly separate sub-problem to solve. This makes the solver go much, much faster.
Identical shifts: if lots of workers have exactly the same start and end times, then you can simply choose ANY one of them (say, the lexicographically first worker, WLOG) and remove all the other workers from consideration. (Makes a ton of difference in real-life situations.)
Dominating shifts: if one worker starts before and stays later than another worker, the 'dominating' worker can stay and all the 'dominated' workers can be removed from consideration for C. (These last two reductions are sketched in code below.)
All identical rows (and columns) in the time-worker matrix can be fused; you need to keep only one of them (de-duping).
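
A sketch of the identical-shifts and dominating-shifts reductions (the input format is hypothetical; note that coverage constraints must still be generated from all original shifts, since pruning only shrinks the candidate set):

def prune_candidates(shifts):
    # shifts: {worker: (start, end)}
    # identical shifts: keep only the lexicographically first worker
    rep = {}
    for w in sorted(shifts):
        rep.setdefault(shifts[w], w)
    cands = {w: iv for iv, w in rep.items()}
    # dominated shifts: drop candidates contained in another's shift
    def contained(a, b):
        return b[0] <= a[0] and a[1] <= b[1] and a != b
    return {w: iv for w, iv in cands.items()
            if not any(contained(iv, jv) for jv in cands.values())}

print(prune_candidates({"w1": (0, 4), "w2": (0, 4),
                        "w3": (1, 3), "w4": (5, 9)}))
# {'w1': (0, 4), 'w4': (5, 9)} -- w2 de-duped, w3 dominated by w1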
You could throw this into an IP solver (CPLEX, Excel, lp_solve etc.) and you will get a solution, if the problem size is not an issue.
Hope some of these ideas help.

Priority-based preemptive Shortest Job First: how to determine which process comes first

I have a question about the priority-based preemptive Shortest Job First algorithm. If two processes have the same priority, which one goes first: the one that arrived first, or the one with the smaller burst time? Similarly for burst time: if I have 2 processes with the same burst time, do I sort by priority? And what happens if 2 processes have the same burst time and the same priority?
For example, what would a Gantt chart based on this table look like?
     Arrival Time   Burst Time   Priority
p0        0             8            2
p1        4            15            5
p2        7             9            3
p3       13             5            1
p4        9            13            4
p5        0             6            1
As the name implies, you first pick the set of highest-priority jobs.
Then, from that set, you select the shortest job. In this case I presume 'burst time' represents the expected execution time (or time to yield).
Therefore, assuming that lower priority numbers represent 'higher'-priority jobs, p3 and p5 are the two highest-priority jobs.
At that point, what matters is the expected job size (burst time): you select the one with the shortest burst time, which in this case is p3.
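
Here is a small simulation sketch of this policy on the table above, using one reasonable tie-breaking convention (priority, then remaining burst time, then arrival order); the code and names are illustrative, not a standard reference implementation.

import heapq

def gantt(procs):
    # procs: (name, arrival, burst, priority); lower number = higher priority
    events = sorted(procs, key=lambda p: p[1])      # by arrival time
    ready, chart, t, i = [], [], 0, 0
    while i < len(events) or ready:
        while i < len(events) and events[i][1] <= t:
            name, arr, burst, prio = events[i]
            heapq.heappush(ready, (prio, burst, arr, name))
            i += 1
        if not ready:                    # CPU idle until the next arrival
            t = events[i][1]
            continue
        prio, rem, arr, name = heapq.heappop(ready)
        if chart and chart[-1][0] == name and chart[-1][2] == t:
            chart[-1] = (name, chart[-1][1], t + 1)  # extend current slot
        else:
            chart.append((name, t, t + 1))           # start a new slot
        t += 1
        if rem > 1:          # unfinished: requeue, so arrivals can preempt
            heapq.heappush(ready, (prio, rem - 1, arr, name))
    return chart

table = [("p0", 0, 8, 2), ("p1", 4, 15, 5), ("p2", 7, 9, 3),
         ("p3", 13, 5, 1), ("p4", 9, 13, 4), ("p5", 0, 6, 1)]
print(gantt(table))
# [('p5', 0, 6), ('p0', 6, 13), ('p3', 13, 18), ('p0', 18, 19),
#  ('p2', 19, 28), ('p4', 28, 41), ('p1', 41, 56)]

Note how p3's arrival at time 13 preempts p0, which still has 1 unit of burst left and resumes at time 18.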
