I recently came across this question in a forum:
You are given a straight line running from 0 to 10^9. You start at zero, and there are n tasks you can perform. Each task is located at some point on the line and requires time t to be performed. To perform a task you need to reach its point and spend t seconds at that location.
Example: the task (5, 8) lies at point 5, so the travel distance is 5 and the work effort is 8.
Total effort is calculated as travel distance + time required to complete the work.
It takes one second to travel one unit of path.
Now we are given a total of T seconds, and we need to complete as many tasks as possible and return to the starting position.
Find the maximum number of tasks you can finish in time T.
example:
3 16 - 3 tasks and 16 seconds of total time
2 8  - task 1 at position 2, takes 8 sec to complete
4 5  - task 2 at position 4, takes 5 sec to complete
5 1  - task 3 at position 5, takes 1 sec to complete
Output: 2
Explanation:
If we take task 1 at location 2, which requires 8 sec, then getting to location 2 takes 2 s and completing the task takes 8 s, leaving only 6 s, which is not enough to complete any other task.
On the other hand, skipping the first task leaves us enough time to complete the other two: going to location 5 and coming back costs 2 × 5 = 10 s, and performing the tasks at locations 4 and 5 costs 5 + 1 = 6 s, for a total of 10 s + 6 s = 16 s.
I am new to graphs and DP, so I was not sure which approach to use: Hamiltonian cycle, knapsack, or longest path.
Can someone please help me with the most efficient approach to solve this?
Let's iterate over the tasks from first to last, ordered by distance. As we go, if we treat the current task i as the farthest one we visit, we pay 2 * distance(i) + effort(i) up front; the most tasks we can then achieve is found by greedily packing as many earlier tasks as possible into the remaining time, taking them in increasing order of effort.
Therefore, an efficient solution can insert each task, as soon as it is seen, into a data structure ordered by effort, dynamically updating the best solution so far. (I originally thought of using a treap and binary search, but j_random_hacker suggested a much simpler way in the comments below this answer.)
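A minimal sketch of that simpler way, assuming it is the max-heap trick: keep the selected efforts in a max-heap and, whenever the set no longer fits with the current position as the farthest point, evict the largest effort.

```python
import heapq

def max_tasks(T, tasks):
    # tasks: list of (position, effort); travel costs one second per unit,
    # and we must return to the start, so reaching distance d costs 2*d.
    best, effort_sum = 0, 0
    chosen = []                          # max-heap of selected efforts (negated)
    for pos, effort in sorted(tasks):    # sweep tasks by increasing position
        heapq.heappush(chosen, -effort)
        effort_sum += effort
        # While infeasible with `pos` as the farthest point, evict the
        # largest effort; the surviving set stays feasible and minimal-sum.
        while chosen and effort_sum + 2 * pos > T:
            effort_sum += heapq.heappop(chosen)   # popped value is negative
        best = max(best, len(chosen))
    return best

print(max_tasks(16, [(2, 8), (4, 5), (5, 1)]))    # 2, as in the example
```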
Suggestion:
For each task, create a small graph whose edge weights encode the travel time to the task's location and the time required to perform it.
Join up these graphs for all the tasks.
Run a travelling salesman algorithm to find the minimum time to do all the tasks (= visit all the nodes in the combined graph).
Then remove tasks one at a time. This gives you a collection of results for different numbers of tasks performed; choose the one that does the most tasks while staying under the time limit.
Since you are maximizing the number of tasks performed, start by removing the longest tasks, so that you are left with lots of short ones.
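A hedged sketch of that removal loop. Because every task sits on a single line, the optimal "tour" for any chosen set collapses to driving out to the farthest task and back, so no actual TSP solver is needed here (this remains a heuristic: dropping the longest task is not always the optimal removal):

```python
def heuristic_max_tasks(T, tasks):
    # On a line, the cheapest way to visit any chosen set of tasks is to
    # drive out to the farthest one and back: 2*max(position) + sum(efforts).
    remaining = sorted(tasks, key=lambda t: t[1])    # shortest efforts first
    while remaining:
        total = 2 * max(p for p, _ in remaining) + sum(e for _, e in remaining)
        if total <= T:
            return len(remaining)
        remaining.pop()          # drop the task with the longest effort
    return 0

print(heuristic_max_tasks(16, [(2, 8), (4, 5), (5, 1)]))   # 2
```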
I have N tasks, where the i-th task takes A[i] time to process. Every task is independent of the others and can be scheduled at any time on any of the P processors. A task runs on only one processor, and a processor can process any number of tasks, but each processor works on only one task at a time and, once begun, must continue until that task is complete.
I want to minimize the amount of time it takes to complete all the tasks.
I am implementing this using a min-heap, i.e.:
Sort the tasks in descending order of time
Create a min-heap of size P initialized to 0
For each task i, pull the minimum from the heap, add the task time A[i] to it, and push it back onto the heap
The time to complete all the tasks is the maximum value in the heap. This has been working so far and I want to verify its correctness.
Do you think this breaks for any inputs?
I believe I am doing something like Greedy Number Partitioning
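A sketch of the described approach (longest-processing-time-first with a min-heap of processor loads):

```python
import heapq

def makespan(times, P):
    # LPT as described above: sort descending, then always hand the next
    # task to the currently least-loaded processor.
    loads = [0] * P                      # min-heap of processor loads
    heapq.heapify(loads)
    for t in sorted(times, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + t)
    return max(loads)

print(makespan([4, 2, 7, 1, 6], 2))      # 10: loads {7, 2, 1} and {6, 4}
```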
This is a polynomial-time algorithm for a problem that includes NP-complete problems as special cases (for example, with P=2 you have the partition problem). Therefore you should expect it to not always work.
The simplest case I could find where your algorithm breaks is weights 3, 3, 2, 2, 2 with P=2. Your algorithm combines things like this:
3 3 2 2 2
3,2 3,2 2
3,2,2 3,2
and will take 7. The better solution it doesn't find is:
3,3 2,2,2
which completes in 6.
I have 6 processes P1, P2, P3, P4, P5, and P6. I also have their start times and durations given in the problem.
process#  start  duration
1         1      1
2         3      1
3         0      6
4         5      2
5         5      4
6         8      1
Now I have to find the maximum number of completely non-overlapping processes. Two processes are completely non-overlapping if they do not overlap at any point in time.
So I made a Gantt chart and it is easy to see that the answer is 4.
P1, P2, P4 and P6 are completely non-overlapping.
Now I have to write a program to compute the same. On a Gantt chart I can easily 'see' the solution.
In the algorithm for my program, I don't know how to keep the time complexity down: currently I'm thinking of comparing each process's start and end times against every other process's, which is roughly O(n^2).
If I scale up from 6 processes to, say, 1000, O(n^2) will take a huge amount of time.
Is there a standard way of doing such problems - I mean problems that are easy to visualise on a Gantt chart? Otherwise, how do I make this algorithm better? Any suggestions?
There are different paths you could take to find a solution; here are some, in no particular order.
Is there already a solution on the net?
Most likely. An important point: Gantt charts are essentially intervals.
Could it be a graph problem?
Consider that each interval is a node. Imagine a start node at 0 (zero), connect every node to all nodes that start later than its end, and use a Dijkstra- or A*-like search to find a solution.
Could it be a dynamic programming problem?
Are there subproblems? Yes: take or skip an interval, then repeat (sketched below, after the last point).
Do I know a data structure that is used in this kind of problem?
Maybe: an augmented interval tree could be used here.
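A minimal sketch of that take-or-skip dynamic program, assuming intervals that merely touch at an endpoint count as non-overlapping. Sorting by end time lets a binary search find how many earlier intervals are compatible, giving O(n log n) overall:

```python
import bisect

def max_non_overlapping(processes):
    # processes: list of (start, duration)
    spans = sorted((s + d, s) for s, d in processes)   # sort by end time
    ends = [end for end, _ in spans]
    dp = [0] * (len(spans) + 1)          # dp[i]: best over first i intervals
    for i, (end, start) in enumerate(spans, 1):
        # "skip it" keeps dp[i-1]; "take it" adds 1 to the best solution
        # among intervals that end no later than this one starts.
        j = bisect.bisect_right(ends, start, 0, i - 1)
        dp[i] = max(dp[i - 1], 1 + dp[j])
    return dp[-1]

procs = [(1, 1), (3, 1), (0, 6), (5, 2), (5, 4), (8, 1)]
print(max_non_overlapping(procs))        # 4 (e.g. P1, P2, P4, P6)
```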
What is the best approach if I want to find the minimum total cost of assigning n jobs to one person in sequence, where each job has a cost? For example, I have 2 jobs with costs 4 and 5, taking 6 and 10 minutes respectively. The finish time of the second job is the finish time of the first job plus the second job's own duration. The total cost is the finish time of each job multiplied by its cost, summed over all jobs.
If you have to assign n jobs to 1 person (or 1 machine), then in scheduling literature terminology you are looking to minimize weighted flow time. The problem is polynomially solvable.
The shortest weighted processing time sequence is optimal.
Sort and reindex jobs such that p_1/w_1 <= p_2/w_2 <= ... <= p_n/w_n,
where, p_i is the processing time of the ith job and w_i is its weight or cost.
Then, assign job 1 first, followed by 2 and so on until n.
If you look at what happens when you swap two adjacent jobs, you end up comparing terms like (A+c)m + (A+c+d)l against (A+d)l + (A+c+d)m, where A is the time consumed by earlier jobs, c and d are the two processing times, and m and l are the corresponding costs. With some algebra and rearrangement you can see that the first version is smaller exactly when c/m < d/l. So work out, for each job, the time taken by that job divided by its cost, and do the jobs with the smallest time per unit cost first. Sanity check: if you have a job that takes 10 years and has a cost of 1 cent, you want to do it last, so that the 10-year wait doesn't get multiplied by any other job's cost.
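A minimal sketch of that rule, assuming strictly positive costs (the sort key divides by them):

```python
def min_weighted_flow_time(jobs):
    # jobs: list of (processing_time, cost); Smith's rule sorts by p/w.
    order = sorted(jobs, key=lambda j: j[0] / j[1])
    finish, total = 0, 0
    for p, w in order:
        finish += p              # this job completes at time `finish`
        total += finish * w
    return total, order

# The question's example: times 6 and 10, costs 4 and 5.
print(min_weighted_flow_time([(6, 4), (10, 5)]))
# (104, [(6, 4), (10, 5)]): 6*4 + 16*5 = 104, versus 114 the other way round.
```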
I have been working on this question and can't seem to find the right answer. Can someone please help me with this?
We are given N jobs [1,..,N]. We get a salary S(i) >= 0 for completing job i, and a deduction D(i) >= 0 that accrues for every day that passes before job i is done.
We need T(i) days to complete job i. If job i is done on day d, we get S(i) - d * D(i) in reward; the reward can be negative if d is too large.
We can switch between jobs and work on them in any order: if we start job 1, which takes 5 days, on day 1, we don't have to spend 5 consecutive days working on it.
How can we decide the best schedule of the jobs, so that we can complete all the jobs and get maximum salary?
I think shapiro is right. You need to determine an appropriate weighted-cost formula for each task, taking into account the days remaining, the per-day deduction, and maybe the total deduction.
Once you have the weighted cost, sort the task list by it and perform one day of work on the first task in the list (the one that will cost the most if not completed). Then recalculate the weighted costs now that a day has passed, re-sort the list, and repeat until all tasks are complete.
Generally, when you are optimizing schedules in the real world, this is the approach: figure out which task should be worked on first, do some work on it, then recalculate to see whether you should switch tasks or keep working on the current one.
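A hedged sketch of that recompute-and-resort loop; the priority function here is a hypothetical stand-in for whatever weighted-cost formula you settle on:

```python
def schedule_day_by_day(remaining, priority):
    # remaining: dict job -> days of work left
    # priority(job, day): hypothetical weighted-cost function; higher means
    # "this job costs us the most if we don't work on it today".
    day, log = 0, []
    while any(left > 0 for left in remaining.values()):
        day += 1
        job = max((j for j, left in remaining.items() if left > 0),
                  key=lambda j: priority(j, day))
        remaining[job] -= 1
        log.append((day, job))
    return log

# Hypothetical usage: three jobs with remaining days, prioritized by a
# made-up per-day deduction table.
jobs = {"A": 2, "B": 1, "C": 3}
deduction = {"A": 5.0, "B": 9.0, "C": 1.0}
print(schedule_day_by_day(jobs, lambda j, day: deduction[j]))
```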
Following the above discussion:
For each job i, calculate the one-day delay cost as X(i) = D(i) / T(i) and order the jobs by it, descending. Maybe even just order by D(i), since while you work on one job you are delaying all the others, so it makes sense to clear the one with the most expensive deduction first. Perform the jobs in this order to minimize the deduction fees.
Again, this assumes that S(i) is a fixed reward for each job, independent of the exact day it is finished, and that all jobs must be performed.
First, forget about S(i): you are doing all the jobs, so you get all the rewards anyway.
Second, there is no point in interrupting a task to switch to another. Say you have jobs A and B. The deduction for whichever finishes last is the same either way (it takes T(A) + T(B) days regardless of how you schedule them), and the deduction for the other job can only increase if you switch, because it will take longer to finish. So you are best off not switching.
Now the problem is to order the tasks so that you pay the minimum penalty. I'm not sure what's next.
You could pick the first job to minimize T(x) * sum(D) (since you commit to doing job x, everything else incurs T(x) days of delay).
Or you could pick the last job, since you know you will pay sum(T) * D(x) for it (you know when it will finish).
One argument says order by T(x), the other says order by D(x), and they are both wrong on their own.
Likely the solution is some dynamic programming in this space, but it escapes me at the moment.
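For what it's worth, those two pulls (small T(x) first, large D(x) first) can be reconciled exactly as in the weighted flow time question above: the same adjacent-swap argument suggests ordering by T(i)/D(i) ascending. A sketch under that assumption:

```python
def total_reward(jobs):
    # jobs: list of (S, D, T) = (salary, deduction per day, days of work).
    # Run jobs back to back, ordered by T/D ascending (Smith's rule);
    # jobs with D == 0 can safely go last.
    order = sorted(jobs, key=lambda j: j[2] / j[1] if j[1] else float("inf"))
    day, reward = 0, 0
    for S, D, T in order:
        day += T                  # this job completes on day `day`
        reward += S - day * D
    return reward, order

# Hypothetical example: (salary, deduction/day, days) triples.
print(total_reward([(100, 2, 5), (80, 5, 2), (50, 1, 10)]))
# (189, [(80, 5, 2), (100, 2, 5), (50, 1, 10)])
```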
Here is the problem. Suppose we have N workers and N jobs, and we want to assign exactly one worker to each job. Each worker i can do each job at some cost. Our goal is to minimize the total cost, on the condition that no single assignment's cost exceeds some value.
For example, with 10 workers and 10 jobs: worker 1 can do job 1 for $0.8, job 2 for $2.3, job 3 for $15.8, jobs 4 to 8 for $100 each, job 9 for $3.2, and job 10 for $15.3.
Worker 2 can do job 1 for $3.5, job 2 for $2.3, job 3 for $4.6, job 4 for $17, etc.
Our goal is to find a matching (an assignment) that minimizes the total cost while the cost of every single worker-job pair in the matching stays below a value like $50.
I would very much like to solve it in MATLAB if possible.
This is a slight variation of the Assignment Problem. To handle your additional constraint that no single job cost should be more than some value, just change all entries in the matrix of costs that are greater than this threshold to a huge value (bigger than the sum of all other entries will suffice), and solve as usual, using e.g. the Hungarian Algorithm.
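The question asks for MATLAB, but as a language-neutral sketch, SciPy's linear_sum_assignment solves the underlying assignment problem, and the thresholding trick wraps around it like this (the helper name and the example matrix are made up; I believe MATLAB's matchpairs fills the same role):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def capped_assignment(cost, limit):
    # Replace every entry above the cap with a value larger than the sum
    # of all allowed entries, then solve the plain assignment problem.
    cost = np.asarray(cost, dtype=float)
    big = cost[cost <= limit].sum() + 1.0
    masked = np.where(cost > limit, big, cost)
    rows, cols = linear_sum_assignment(masked)
    total = masked[rows, cols].sum()
    if total >= big:                     # a forbidden pair was unavoidable
        raise ValueError("no feasible assignment under the cap")
    return cols, total

# Hypothetical 3x3 cost matrix with a $50 cap on any single pairing.
costs = [[ 0.8,  2.3, 60.0],
         [ 3.5,  2.3,  4.6],
         [70.0, 17.0,  3.2]]
print(capped_assignment(costs, 50.0))    # jobs 0, 1, 2 at total cost 6.3
```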