Profit dependent on the previous job time - Job Scheduling problem - algorithm

There are n jobs that need to be processed on a single machine. Job j requires tj time units to execute and has a profit value of pj. All the jobs are to be scheduled within W time units, where W is the sum of all tj.
Scheduling job j to start at time sj earns a profit of (W - sj)*pj.
I have already tried greedy approaches using pj and sj individually, as well as pj*tj, but I was able to come up with a counterexample for each. I think the problem can be solved by a greedy algorithm that takes jobs in decreasing order of pj/tj, but I am not able to prove it. I am just looking for some hints on how to prove it formally.

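An approach I have seen before is to consider swapping two adjacent jobs in a proposed schedule. Suppose jobs 1 and 2 run back to back in that order, and everything scheduled after them takes time K before we reach the end time W. Then job 1 earns p1(K + t1 + t2) and job 2 earns p2(K + t2). Leaving them unswapped is better if
p1(K + t1 + t2) + p2(K + t2) > p2(K + t1 + t2) + p1(K + t1)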
which simplifies to
p1t2 > p2t1
which simplifies to
p1 / t1 > p2 / t2
So if we sort in the way you guessed, no swap of adjacent jobs will increase the profit, and if there is a schedule which does not follow this rule you can improve it by swapping adjacent jobs. So I think your guess is correct.
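As a quick sanity check of the exchange argument, here is a minimal sketch that compares the pj/tj ordering against a brute-force search over all orderings on a tiny instance. The function names and the sample numbers are made up for illustration, and it assumes every tj is positive.

```python
from itertools import permutations

def total_profit(order, t, p):
    """Profit of running jobs in `order`: job j started at s earns (W - s) * p[j]."""
    W = sum(t)
    start = 0
    profit = 0
    for j in order:
        profit += (W - start) * p[j]
        start += t[j]
    return profit

def greedy_order(t, p):
    """Jobs sorted by decreasing profit-to-time ratio p[j] / t[j]."""
    return sorted(range(len(t)), key=lambda j: p[j] / t[j], reverse=True)

def brute_force_best(t, p):
    """Best profit over every possible ordering (only feasible for tiny n)."""
    return max(total_profit(o, t, p) for o in permutations(range(len(t))))

# Hypothetical instance: three jobs with times t and profits p.
t = [3, 1, 2]
p = [4, 3, 5]
order = greedy_order(t, p)
print(order, total_profit(order, t, p), brute_force_best(t, p))
```

A check like this does not replace the proof, but it is a cheap way to hunt for counterexamples before trying to prove a greedy rule.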

Related

Proving the greedy solution to the weighted task scheduling problem

I am attempting to prove the following algorithm is fully correct (partial correctness + termination), but I can only seem to prove for arbitrary example inputs (not general ones).
Here is my pseudo-code:
IN: List of jobs J, max index n
1: S ← an array indexed 0 to n, with null at each index
2: Sort J in non-increasing order of profits
3: for i from 0 to n
4:   Find the largest t such that S[t] = null and t ≤ J[i].deadline (if one exists)
     if an index t was found
5:     S[t] ← J[i]
OUT: S maximizes the profit of scheduled jobs that can be done in n time blocks of 1 unit each
So for example, I created a table with jobs and their associated attributes (deadlines and profit):
Job        J1   J2   J3   J4   J5
Deadline    2    1    3    2    1
Profit     62  100   20   40   20
From this example input, we'd be able to do J2,J1,J3 for a total profit of 182.
Can someone help me form a more generic way of showing my pseudo-code algorithm is fully correct?
Adding constraints helps you here.
In step 2 order J first by non-increasing profits, and then by non-increasing deadline.
Of all optimal solutions, at least one must be lexicographically last by the ordering of profits of jobs over time. (If multiple jobs have the same profit and deadlines, there may be multiple optimal solutions that are lexicographically last.)
Prove that any solution that does not have a job of the same profit at the same time as the first job that you place is either not optimal, or is not lexicographically last among optimal solutions.
Prove by induction that the solution you found places jobs of the same profits at the same times as any solution that is optimal and lexicographically last among optimal solutions.
Please note that this is an outline only. There are some subtle tricks needed at each step of this proof that I'm deliberately leaving as an exercise.
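For experimenting with the proof, here is a minimal runnable sketch of the greedy from the question, using the tie-break by deadline suggested above, together with a brute-force check over all subsets. It assumes unit-length jobs and slots numbered 1 to n; the function names are made up for illustration.

```python
from itertools import combinations

def schedule_by_profit(jobs):
    """jobs: list of (name, deadline, profit). Returns (names in slot order, total profit)."""
    n = len(jobs)
    slots = [None] * (n + 1)  # slots[1..n]; index 0 is unused
    # Non-increasing profit, ties broken by non-increasing deadline.
    for name, deadline, profit in sorted(jobs, key=lambda j: (-j[2], -j[1])):
        t = min(deadline, n)
        while t >= 1 and slots[t] is not None:  # latest free slot at or before the deadline
            t -= 1
        if t >= 1:
            slots[t] = (name, profit)
    chosen = [s for s in slots[1:] if s is not None]
    return [name for name, _ in chosen], sum(profit for _, profit in chosen)

def brute_force_profit(jobs):
    """Best profit over all feasible subsets (exponential, for checking only)."""
    best = 0
    for size in range(1, len(jobs) + 1):
        for subset in combinations(jobs, size):
            deadlines = sorted(d for _, d, _ in subset)
            # Feasible iff the i-th earliest deadline is at least i.
            if all(d >= i for i, d in enumerate(deadlines, start=1)):
                best = max(best, sum(p for _, _, p in subset))
    return best

jobs = [("J1", 2, 62), ("J2", 1, 100), ("J3", 3, 20), ("J4", 2, 40), ("J5", 1, 20)]
print(schedule_by_profit(jobs), brute_force_profit(jobs))
```

On the table from the question this reproduces the schedule J2, J1, J3 with profit 182.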

Assignment problem with multiple persons needed on each job

First, sorry if my English isn't so good, I'm not a native English speaker.
I'm facing an assignment problem.
I have a list of jobs, with a certain number of persons needed for each job. Each person tells me how many jobs they want to be assigned to, along with their preferences.
I tried to do that with the Hungarian Algorithm, but I can't get it to work. With a large number of jobs and spots, some persons were assigned the same job multiple times, which isn't ok.
I think it's all due to the fact that I treated each spot as an individual job and listed each person as many times as they need to be placed.
Do you know a better algorithm or a way to do it?
(It's not a coding problem, I'm doing it in Octave/Matlab for now, but I think I'll switch to Python.)
Thanks for your help.
In addition to Henrik's suggestion to use Linear Programming, your specific problem can also be solved using Minimum cost maximum flow.
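You make a bipartite graph between people and jobs (like in the Hungarian algorithm), where the cost on each middle edge is the preference score and its capacity is 1, so that nobody can get the same job twice.
The capacity of the edge from the source to each person is the number of jobs that person wants, and the capacity of the edge from each job to the sink is the number of people you need for that job.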
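To make the graph construction concrete, here is a rough pure-Python sketch using successive shortest paths (Bellman-Ford style). The class and function names are invented for this example, and the inputs `prefs`, `wanted` and `needed` are assumed data layouts rather than anything from the question. A library such as NetworkX would normally do this for you.

```python
from collections import deque

class MinCostFlow:
    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge: [to, capacity, cost, reverse index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def run(self, s, t):
        flow = cost = 0
        while True:
            # Cheapest augmenting path in the residual graph.
            dist = [float("inf")] * self.n
            dist[s] = 0
            prev = [None] * self.n
            in_queue = [False] * self.n
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not in_queue[v]:
                            queue.append(v)
                            in_queue[v] = True
            if prev[t] is None:
                return flow, cost
            # Bottleneck capacity along the path, then push that much flow.
            push, v = float("inf"), t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]

def assign(prefs, wanted, needed):
    """prefs[i][j]: rank person i gives job j (1 = favourite); wanted[i]: jobs
    person i wants; needed[j]: people needed on job j."""
    m, n = len(prefs), len(needed)
    source, sink = m + n, m + n + 1
    net = MinCostFlow(m + n + 2)
    for i in range(m):
        net.add_edge(source, i, wanted[i], 0)
        for j in range(n):
            net.add_edge(i, m + j, 1, prefs[i][j])  # capacity 1: no duplicate job
    for j in range(n):
        net.add_edge(m + j, sink, needed[j], 0)
    flow, cost = net.run(source, sink)
    pairs = [(i, e[0] - m) for i in range(m) for e in net.graph[i]
             if m <= e[0] < m + n and e[1] == 0]
    return flow, cost, sorted(pairs)

print(assign([[1, 2], [1, 2], [2, 1]], [1, 1, 1], [2, 1]))
```

If the returned flow is smaller than the total number of spots, not every job could be fully staffed with the given wishes.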
Assignment problems can be solved with linear programming:
Let xij = 1 if person i is assigned to job j and 0 otherwise. Let aij be the rank person i gives to job j: aij = 1 for the job they want most, aij = 2 for the next, and so on. If they only want k jobs, set aij to a very high number for all jobs beyond those k.
If you need at least bj workers on job j you have the constraint
x1j + ... + xmj >= bj (j = 1,...,n)
You also have the constraints xij >= 0 and xij <= 1 .
The linear function to minimize is
sum( aij xij ) over all i,j

Greedy Algorithm: Assigning jobs to minimize cost

What is the best approach for finding the minimum total cost when I assign n jobs, each with a cost, to one person in sequence? For example, I have 2 jobs with costs 4 and 5, which take 6 and 10 minutes respectively. The finish time of the second job is the finish time of the first job plus the time taken by the second job. The total cost is the sum, over all jobs, of the job's finish time multiplied by its cost.
If you have to assign n jobs to 1 person (or 1 machine), then in scheduling-literature terminology you are looking to minimize weighted flow time. The problem is polynomially solvable.
The weighted shortest processing time (WSPT) sequence is optimal.
Sort and reindex jobs such that p_1/w_1 <= p_2/w_2 <= ... <= p_n/w_n,
where, p_i is the processing time of the ith job and w_i is its weight or cost.
Then, assign job 1 first, followed by 2 and so on until n.
If you look at what happens when you swap two adjacent jobs, you end up comparing terms like (A+c)m + (A+c+d)l and (A+d)l + (A+c+d)m, where A is the time consumed by earlier jobs, c and d are times, and l and m are costs. The difference between the two is cl - dm, so the first version is smaller if cl < dm, that is, if c/m < d/l. So you could work out for each job the time taken by that job divided by its cost, and do the jobs with the smallest time per unit cost first. Sanity check: if you have a job that takes 10 years and has a cost of 1 cent, you want to do that last so that the 10 year wait doesn't get multiplied by any other costs.
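A small sketch of the ratio rule, checked against brute force on the two jobs from the question (times 6 and 10, costs 4 and 5). The function names are made up for illustration and costs are assumed to be positive.

```python
from itertools import permutations

def weighted_cost(order, times, costs):
    """Sum over jobs of (finish time of job) * (cost of job)."""
    finish = 0
    total = 0
    for j in order:
        finish += times[j]
        total += finish * costs[j]
    return total

def ratio_order(times, costs):
    """Jobs sorted by increasing time per unit cost."""
    return sorted(range(len(times)), key=lambda j: times[j] / costs[j])

times = [6, 10]
costs = [4, 5]
order = ratio_order(times, costs)
best = min(weighted_cost(o, times, costs) for o in permutations(range(len(times))))
print(order, weighted_cost(order, times, costs), best)
```

Here the first job goes first (6/4 < 10/5), giving 6*4 + 16*5 = 104, against 114 for the other order.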

Dynamic programming algorithm for unweighted interval scheduling?

I was wondering if someone could please help me reason about a DP algorithm for unweighted interval scheduling.
I'm given 2 arrays [t1,...,tn] and [d1,...,dn] where ti is the start time of job i and di is the duration of job i. Also the jobs are sorted by start time, so t1 <= t2 <= ... <= tn. I need to maximize the number of jobs that can be executed without any overlaps. I'm trying to come up with a DP algorithm and runtime for this problem. Any help would be much appreciated!
Thank you!
I am sorry I don't have any more time now to spend on this problem. Here is an idea, I think it lends itself nicely to Dynamic Programming. [Actually I think it is DP, but almost two decades have passed since I last studied such things...]
Suppose T = {t1, t2, ..., tn} is partitioned as follows:
T = {t1, t2, ..., tn} = {t1, t2, ..., tk} U {tk+1, tk+2, ..., tn}
= T1(k) U T2(k)
Let T2'(k) be the subset of T2(k) not containing the jobs overlapping T1(k).
Let opt(X) be the optimal value for a subset X of T. Then
opt(T) = max( opt( T1(k) ) + opt( T2'(k) ) )
where the maximum is taken over all possible k in {1, 2, ..., n}, since we want as many jobs as possible.
Of course you need to compute opt() recursively, and take into account overlaps.
Hope this helps!
It's easiest for me to think about if I suppose that you work out what the end time would be for each job and sort the jobs into order of increasing end time, although you can probably achieve the same thing using start times working in the opposite direction.
Consider each job in order of increasing end time. For each job work out the maximum number of jobs you can handle up to and including that job if you decide to work on that job. To work this out, look at the answers you have already computed that cover times up to the start time of that job and find the one that covers the maximum number of jobs. The best you can do while handling the job you are considering is one plus that maximum number.
When you have considered all the jobs, the maximum number you can cover is the maximum number you have computed when considering any job. You can work out which jobs to do by storing the previous job that you identified when working out the maximum score possible for a particular job, and then tracing these pointers back from the job with the maximum score.
With N jobs to consider you look back at at most N previously computed answers when working out the best possible score for each job so I think this is O(N^2)
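Here is a minimal sketch of that O(N^2) idea. It assumes positive durations and treats a job that ends exactly when another starts as non-overlapping; the function name is made up for illustration.

```python
def max_non_overlapping(starts, durations):
    """Return (count, chosen job indices) for a largest set of non-overlapping jobs."""
    # Process jobs in order of increasing end time.
    jobs = sorted(range(len(starts)), key=lambda i: starts[i] + durations[i])
    best = []  # best[k]: most jobs in a schedule whose last job is jobs[k]
    prev = []  # prev[k]: position of the previous job in that schedule, or None
    for k, i in enumerate(jobs):
        best_k, prev_k = 1, None
        for m in range(k):
            j = jobs[m]
            ends_in_time = starts[j] + durations[j] <= starts[i]
            if ends_in_time and best[m] + 1 > best_k:
                best_k, prev_k = best[m] + 1, m
        best.append(best_k)
        prev.append(prev_k)
    if not best:
        return 0, []
    # Trace the back-pointers from the job with the best score.
    k = max(range(len(best)), key=best.__getitem__)
    count = best[k]
    chosen = []
    while k is not None:
        chosen.append(jobs[k])
        k = prev[k]
    return count, chosen[::-1]

print(max_non_overlapping([0, 1, 3, 4], [2, 5, 2, 1]))
```

The inner loop over earlier jobs is what makes this quadratic; the well-known greedy by earliest end time solves the unweighted case faster, but the DP above follows the description more directly.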

Trying to gain intuition for work scheduling greedy algorithm

I have the following scenario: (since I don't know of a way to show LaTeX, here's a screenshot)
I'm having some trouble conceptualizing what's going on here. If I were to program this, I would probably attempt to structure this as some kind of heap where each node represents a worker, from earliest-to-latest, then run Prim's/Kruskal's algorithm on it. I don't know if I'm on the right track with that idea, but I need to flesh out my understanding of this problem so I can do the following:
Describe in detail the greedy choice
Show that if there's an optimal solution for which the greedy choice was not made, then an exchange can be made to conform with the greedy choice
Know how to implement a greedy algorithm solution, and its running time
So where should I be going with this idea?
This problem is very similar in nature to "Roster Scheduling problems." Think of the committee as say a set of 'supervisors' and you want to have a supervisor present, whenever a worker is present. In this case, the supervisor comes from the same set as the workers.
Here are some modeling ideas, and an Integer Programming formulation.
Time Slicing Idea
This sounds like a bad idea initially, but works really well in practice. We are going to create a lot of "time instants" T_i from the start time of the first shift to the end time of the very last shift. It sometimes helps to think of
T1, T2, T3, ..., TN as being time instants (say) five minutes apart. For every T_i at least one worker is working on a shift. Therefore, that time instant has to be covered. (Coverage means there has to be at least one member of the committee also working at time T_i.)
We really need to only worry about 2n Time instants: The start and finish times of each of the n workers.
Coverage Property Requirement
For every time instant Ti, we want a worker from the Committee present.
Let w1, w2...wn be the workers, sorted by their start times s_i. (Worker w1 starts the earliest shift, and worker wn starts the very last shift.)
Introduce a new Indicator variable (boolean):
Y_i = 1 if worker i is part of the committee
Y_i = 0 otherwise.
Visualization
Now think of a 0-1 matrix, where the rows are the SORTED workers, and the columns are the time instants...
Construct a Time-Worker Matrix (0/1)
         t1  t2  t3  t4  t5  t6  ...          tN
-------------------------------------------------
w1        1   1
w2        1   1
w3            1   1   1
w4                1   1   1
...
...
wn                               1   1   1   1
-------------------------------------------------
Total     2   4   3  ... ...     1   2   4   5
So the problem is to make sure that for each column, at least 1 worker is Selected to be part of the committee. The Total shows the number of candidates for the committee at each Time instant.
An Integer Programming based formulation
Objective: Minimize Sum(Y_i)
Subject to:
Y1 + Y2 >= 1 # coverage for time t1
Y1 + Y2 + Y3 >= 1 # coverage for time t2
...
More generally, the constraints are:
# Set Covering constraint for each time instant T_j
Sum of Y_i over all workers i that are working at time T_j >= 1
Y_i Binary for all i's
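To see the formulation on a toy instance, here is a rough sketch that builds the time-worker matrix and then searches for the smallest committee by brute force instead of calling an IP solver. It assumes each shift is a closed interval (start, end), and, as an added precaution that is not part of the description above, it also samples the midpoint between consecutive shift endpoints so that gaps between endpoints are not missed. All names are made up for illustration.

```python
from itertools import combinations

def time_instants(shifts):
    """Shift endpoints plus the midpoints between consecutive endpoints."""
    points = sorted({p for start, end in shifts for p in (start, end)})
    mids = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return sorted(points + mids)

def coverage_matrix(shifts, instants):
    """matrix[i][j] = 1 if worker i is on shift at instants[j]."""
    return [[1 if start <= t <= end else 0 for t in instants]
            for start, end in shifts]

def smallest_committee(shifts):
    """Smallest set of workers such that every instant with someone on shift
    also has a committee member on shift (exponential, tiny inputs only)."""
    instants = time_instants(shifts)
    matrix = coverage_matrix(shifts, instants)
    # Only instants where at least one worker is present need coverage.
    needed = [j for j in range(len(instants)) if any(row[j] for row in matrix)]
    n = len(shifts)
    for size in range(n + 1):
        for committee in combinations(range(n), size):
            if all(any(matrix[i][j] for i in committee) for j in needed):
                return list(committee)
    return None

print(smallest_committee([(0, 4), (3, 8), (7, 10), (2, 5)]))
```

Each column check in `smallest_committee` is exactly one set-covering constraint; a real solver would receive those constraints rather than enumerate subsets.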
Preprocessing
This Integer program, if attempted without preprocessing can be very difficult, and end up choking the solvers. But in practice there are quite a number of preprocessing ideas that can help immensely.
Make any forced assignments. If there is ever a time instant with only one worker working, that worker has to be in the committee C.
Separate into nice subproblems. Look at the time-worker matrix. If there are nice 'rectangles' in it that can be cut out without impacting any other time instant, then that is a wholly separate sub-problem to solve. This makes the solver go much, much faster.
Identical shifts. If lots of workers have the exact same start and end times, then you can simply choose ANY one of them (say, the lexicographically first worker, WLOG) and remove all the other workers from consideration. (This makes a ton of difference in real-life situations.)
Dominating shifts. If one worker starts before and stays later than another worker, the 'dominating' worker can stay and the 'dominated' worker can be removed from consideration for C.
De-duping. All the identical rows (and columns) in the time-worker matrix can be fused. You only need to keep one of them.
You could throw this into an IP solver (CPLEX, Excel, lp_solve etc.) and you will get a solution, if the problem size is not an issue.
Hope some of these ideas help.
