Algorithm for the allocation of work with dynamic programming

The problem is this:
We need to perform n jobs, each characterized by a gain {v1, v2, ..., vn}, a time required for its execution {t1, t2, ..., tn}, and a deadline {d1, d2, ..., dn}, with d1 <= d2 <= ... <= dn. The gain is earned only if the job is finished by its deadline, and there is a single machine. I must describe an algorithm that computes the maximum gain that can be obtained.
I had thought of a recurrence with two parameters, one indicating the i-th job and the other the moment at which we start it: OPT(i, d). If d + t_i <= d_i, then it adds the gain v_i (so a variant of multiway choice, that is, min for 1<=i<=n).
My main problem is: how can I find which jobs were carried out previously? Do I use a supporting data structure?
How would you have written the recurrence equation?
Thank you!

My main problem is: how can I find which jobs were carried out previously? Do I use a supporting data structure?
The trick is that you don't need to know which jobs are already completed, because you can execute them in order of increasing deadline.
Suppose some optimal solution (yielding maximum profit) requires you to complete job A (deadline 10) and then job B (deadline 3). In that case you can safely swap A and B: B now finishes earlier than before, and A finishes at the time B used to finish, which is at most 3 and therefore before A's deadline of 10. Both are still completed in time, and the new arrangement yields the same total profit.
End of proof.
How would you have written the recurrence equation?
You already have the general idea, but you don't need a loop (min for 1<=i<=n).
max_profit(current_job, start_time)
    // no jobs left to consider
    if current_job > n
        return 0
    end
    // skip this job
    result1 = max_profit(current_job + 1, start_time)
    // start doing this job now
    result2 = 0
    finish_time = start_time + T[current_job]
    if finish_time <= D[current_job]
        // only if we can finish it before the deadline
        result2 = max_profit(current_job + 1, finish_time) + V[current_job]
    end
    return max(result1, result2)
end
Converting it to DP should be trivial.
If you don't want O(n*max_deadline) complexity (e.g., when d and t values are big), you can resort to recursion with memoization and store results in a hash-table instead of two-dimensional array.
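As a rough illustration of the memoized recursion described above, here is a minimal Python sketch. The function name max_gain and the use of functools.lru_cache as the hash table are my own choices, not part of the original answer; it assumes jobs may be given in any order and sorts them by deadline first.

```python
from functools import lru_cache

def max_gain(values, times, deadlines):
    """Maximum total gain on one machine; a job pays only if it ends by its deadline."""
    # Process jobs in order of increasing deadline (the exchange argument above).
    order = sorted(range(len(values)), key=lambda i: deadlines[i])
    v = [values[i] for i in order]
    t = [times[i] for i in order]
    d = [deadlines[i] for i in order]

    @lru_cache(maxsize=None)  # memoization keyed on (job, start)
    def best(job, start):
        if job == len(v):
            return 0
        # Option 1: skip this job.
        result = best(job + 1, start)
        # Option 2: run it now, but only if it finishes in time.
        finish = start + t[job]
        if finish <= d[job]:
            result = max(result, best(job + 1, finish) + v[job])
        return result

    return best(0, 0)

print(max_gain([10, 20, 30], [2, 2, 2], [2, 3, 4]))
```

Only the (job, start) pairs that are actually reachable get stored, which is the point of preferring memoization over a full table when deadlines are large.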
edit
If all jobs must be performed, but not all will be paid for, the problem stays the same. Just push jobs you don't have time for (jobs you can't finish before deadline) to the end. That's all.

First of all I would pick the items with the biggest yield, meaning the jobs that have the biggest value/time ratio and can still meet their deadline (if now+t1 exceeds d1, the job is out). Afterwards I check the time between now+job_time and each deadline, obtaining a "chance to finish" for each job. The jobs that come first will be the jobs with the biggest yield and the lowest chance to finish. The idea is to squeeze in the most valuable jobs.
CASES:
If a job with a yield of 5 needs 10 seconds to finish and its deadline comes in 600 seconds, and a job with the same yield needs 20 seconds to finish and its deadline comes in 22 seconds, then I run the second one.
If a job with a yield of 10 needs 10 seconds to finish and its deadline comes in 100 seconds, while another job with a yield of 5 needs 10 seconds to finish and its deadline comes in 100 seconds, I'll run the first one.
If their yields are identical and they take the same time to finish, while their deadlines come in 100 seconds and 101 seconds respectively, I'll run the first one, as it wins more time.
... and so on and so forth.
Recursion in this case refers only to reordering the jobs by the parameters "Yield" and "Chance to finish".
Remember to always increase "now" (+job_time) after inserting a job in the order.
Hope it answers.

I read the comments above and understood that you are not looking for efficiency but for completion, so that takes the yield out of the picture and leaves us with just ordering by deadline. That is the classic problem solved by the divide-and-conquer (divide et impera) Quicksort:
http://en.wikipedia.org/wiki/Quicksort

Related

Getting the maximum data an algorithm can process in a certain time span

This is now the second time I have been given an exercise where I have to determine (in this case for sorting algorithms) how many numbers I can sort with a certain algorithm (on my own computer) so that the algorithm runs for exactly one minute.
This is a practical exercise, meaning I must generate enough numbers for it to run that long. Now I ask myself, since I haven't had this problem in ten years of programming: how can I possibly do this? My first attempt was a bit brute-force and resulted in an instant StackOverflow.
I could make an array (or several) and fill it with random numbers, but determining how many would give a one-minute runtime would be a terribly long task, since you would always need to wait.
What can I do to find this out efficiently? Measure the difference between, let's say, 10 and 20 numbers and calculate how many it would take to fill a minute? Sounds easy, but algorithms (especially sorting algorithms) are rarely linear.
You know the time complexity of each algorithm in question. For example, bubble sort takes O(n*n) time. Make a relatively small sample run, say D=1000 records, and measure the time it takes (T milliseconds). For example, suppose it takes 15 seconds = 15000 milliseconds.
Now, with more or less accuracy, you can expect that D*2 records will take 4 times as long to process. Conversely, you need about D*sqrt(60000/T) records to fill 1 minute. For example, you need D*sqrt(60000/15000) = D*sqrt(4) = D*2 = 2000 records.
This method is not accurate enough to give an exact number, and in most cases there is no exact number of records: it fluctuates from run to run. Also, for many algorithms the running time depends on the values in your record set. For example, the worst case for quicksort is O(n*n), while the normal case is O(n*log(n)).
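The extrapolation above can be written down as a small helper. This is only a sketch under the assumption that the running time grows like n**exponent (exponent=2 for bubble sort); the function and parameter names are made up for illustration, and it does not cover n*log(n) algorithms, which would need a numeric solve instead.

```python
def records_for_target(sample_size, sample_ms, target_ms=60_000, exponent=2):
    """Estimate how many records fill target_ms, given one timed sample run.

    Assumes time ~ c * n**exponent, so n_target = n_sample * (target/sample)**(1/exponent).
    """
    scale = (target_ms / sample_ms) ** (1 / exponent)
    return round(sample_size * scale)

# The example from the answer: 1000 records took 15000 ms with an O(n*n) sort.
print(records_for_target(1000, 15_000))
```

In practice one would time a second run at the estimated size and repeat the extrapolation, since the constant factor drifts with cache effects and input values.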
You could use something like this:
long startTime = System.currentTimeMillis();
int times = 0;
boolean done = false;
while (!done) {
    // run algorithm
    times++;
    // stop once one minute (60000 ms) has elapsed
    if (System.currentTimeMillis() - startTime >= 60000)
        done = true;
}
Or, if you don't want to wait that long, you can replace the 60000 with 1000 and then multiply times by 60; it won't be very accurate, though.
It would be time-consuming to generate a new number every time, so you can use an array that you populate beforehand and then access with the times variable, or you can always use the same value, one you know is the most time-consuming to process, so that you get the minimum number of times it would run in a minute.

Job scheduling algorithm with deadline and execution time

Given an array of jobs where every job has a deadline (d_i > 0) and an associated execution time (e_i > 0), i.e.
we have been given an array of (d_i, e_i), can we find an arrangement of jobs such that all of them can be scheduled? There may be more than one possible answer; any one will suffice.
e.g. {(3,1),(3,2),(7,3)} for jobs {J1,J2,J3}
The answer could be either {J1,J2,J3} or {J2,J1,J3}.
We can solve this problem using backtracking, but the running time will be very high. Can we solve it using a greedy or any other approach? Please also show its correctness.
Edit:
At most one job can be run at a time .
Hint: After you have scheduled k initial jobs successfully, it is possible to find a satisfying full schedule only if there is a next job whose execution time added to the current time after the k previous jobs is less than or equal to the deadline time for the next job. Can you see why always choosing the next job with the earliest deadline at each step of choosing a job will determine whether there is or is not a solution, and if there is, will give a precise solution? Let me know if you'd like more details about how to prove that, but hopefully you can see it on your own now that I've pointed out what the correct greedy solution is.
UPDATE: Further hint: Assume that you have a satisfying assignment where two consecutive jobs are out of order according to their deadlines (if the overall ordering is out of deadline order anywhere, such a consecutive pair exists). Since the job with the earlier deadline runs second and still meets its deadline, both jobs finish before the earlier of the two deadlines. Now swap them: the earlier-deadline job finishes sooner than it did before, and the later-deadline job finishes at the time the pair used to finish, which is no later than the earlier deadline and hence no later than its own. No other job is affected, so the swapped schedule is still a satisfying assignment.
Thus, if a satisfying assignment exists, then there is another one that exists where jobs are ordered according to their deadlines. I.e., the greedy strategy will always find a satisfying assignment if one exists -- otherwise, there is no solution.
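The earliest-deadline-first check argued for above fits in a few lines. This is a sketch, not the answerer's code: the function name edf_schedule and the (deadline, exec_time) tuple layout are assumptions that mirror the (d_i, e_i) pairs in the question.

```python
def edf_schedule(jobs):
    """jobs: list of (deadline, exec_time). Returns job indices in run order, or None."""
    # Greedy choice: always run the job with the earliest deadline next.
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    now = 0
    for i in order:
        deadline, exec_time = jobs[i]
        now += exec_time
        if now > deadline:
            # By the exchange argument, no other ordering can succeed either.
            return None
    return order

# Example from the question: J1=(3,1), J2=(3,2), J3=(7,3).
print(edf_schedule([(3, 1), (3, 2), (7, 3)]))
```

Sorting dominates the cost, so the check runs in O(n log n).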
An O(n log n) greedy approach based on a heap data structure. As written, it treats every job as taking one unit of time and picks jobs by profit.
The input is an array of jobs:
struct Job
{
    char id;
    int deadLine;
    int profit;
};
Algorithm pseudocode:
1. Sort the input jobArray in non-decreasing order of deadLine.
2. Create a maxHeap (it will hold jobs). The basis of comparison is profit.
3. Let n = length of jobArray
   initialize time = jobArray[n-1].deadLine
   index = n-1
4. while index >= 0 && jobArray[index].deadLine >= time
   4a) insert(maxHeap, jobArray[index])
   4b) index = index-1
5. if maxHeap is not empty
       j = removeMax(maxHeap)
       print j.id
   time = time-1
6. if time > 0
       goto step 4.
   else
       return;
This will print the jobs in reverse order.
It can be modified to print them in the right order.
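For reference, the steps above might look like this in Python. It is a sketch under the same assumption the pseudocode makes implicitly (each job occupies exactly one time unit); the tuple layout (job_id, deadline, profit) and the function name are mine. Python's heapq is a min-heap, so profits are negated to get max-heap behaviour.

```python
import heapq

def schedule_unit_jobs(jobs):
    """jobs: list of (job_id, deadline, profit); each job takes one time unit.

    Returns the chosen job ids in execution order.
    """
    by_deadline = sorted(jobs, key=lambda j: j[1])
    heap = []  # entries are (-profit, job_id) so the largest profit pops first
    index = len(by_deadline) - 1
    time = by_deadline[-1][1] if by_deadline else 0
    chosen = []
    while time > 0:
        # Every job whose deadline is >= time may occupy the slot ending at `time`.
        while index >= 0 and by_deadline[index][1] >= time:
            job_id, _, profit = by_deadline[index]
            heapq.heappush(heap, (-profit, job_id))
            index -= 1
        if heap:
            _, job_id = heapq.heappop(heap)
            chosen.append(job_id)
        time -= 1
    chosen.reverse()  # slots were filled from the latest to the earliest
    return chosen

print(schedule_unit_jobs([("a", 2, 100), ("b", 1, 19), ("c", 2, 27), ("d", 1, 25), ("e", 3, 15)]))
```

Note that the loop runs once per time unit up to the largest deadline, so the O(n log n) bound only holds when that deadline is O(n).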

Algorithm to find middle of largest free time slot in period?

Say I want to schedule a collection of events in the period 00:00–00:59. I schedule them on full minutes (00:01, never 00:01:30).
I want to space them out as far apart as possible within that period, but I don't know in advance how many events I will have total within that hour. I may schedule one event today, then two more tomorrow.
I have the obvious algorithm in my head, and I can think of brute-force ways to implement it, but I'm sure someone knows a nicer way. I'd prefer Ruby or something I can translate to Ruby, but I'll take what I can get.
So the algorithm I can think of in my head:
Event 1 just ends up at 00:00.
Event 2 ends up at 00:30 because that time is the furthest from existing events.
Event 3 could end up at either 00:15 or 00:45. So perhaps I just pick the first one, 00:15.
Event 4 then ends up in 00:45.
Event 5 ends up somewhere around 00:08 (rounded up from 00:07:30).
And so on.
So we could look at each pair of taken minutes (say, 00:00–00:15, 00:15–00:30, 00:30–00:00), pick the largest range (00:30–00:00), divide it by two and round.
But I'm sure it can be done much nicer. Do share!
You can use bit reversing to schedule your events. Just take the binary representation of your event's sequential number, reverse its bits, then scale the result to given range (0..59 minutes).
An alternative is to generate the bit-reversed words in order (0000,1000,0100,1100,...).
This makes it easy to distribute up to 32 events. If more events are needed, after scaling the result you should check whether the resulting minute is already occupied and, if so, generate and scale the next word.
Here is the example in Ruby:
class Scheduler
  def initialize
    @word = 0
  end
  def next_slot
    bit = 32
    while (((@word ^= bit) & bit) == 0) do
      bit >>= 1;
    end
  end
  def schedule
    (@word * 60) / 64
  end
end
scheduler = Scheduler.new
20.times do
  p scheduler.schedule
  scheduler.next_slot
end
The method of generating bit-reversed words in order is borrowed from "Matters Computational", chapter 1.14.3.
Update:
Due to scaling from 0..63 to 0..59 this algorithm tends to make smallest slots just after 0, 15, 30, and 45. The problem is: it always starts filling intervals from these (smallest) slots, while it is more natural to start filling from largest slots. Algorithm is not perfect because of this. Additional problem is the need to check for "already occupied minute".
Fortunately, a small fix removes all these problems. Just change
while (((@word ^= bit) & bit) == 0) do
to
while (((@word ^= bit) & bit) != 0) do
and initialize @word with 63 (or keep initializing it with 0, but do one iteration to get the first event). This fix decrements the reversed word from 63 to zero; it always distributes events to the largest possible slots and allows no "conflicting" events for the first 60 iterations.
Other algorithm
The previous approach is simple, but it only guarantees that (at any moment) the largest empty slots are no more than twice as large as the smallest slots. Since you want to space events as far apart as possible, an algorithm based on Fibonacci numbers or on the golden ratio may be preferred:
Place the initial interval (0..59) in a priority queue (max-heap, priority = interval size).
To schedule an event, pop the priority queue, split the resulting interval in the golden proportion (1.618), use the split point as the time for this event, and put the two resulting intervals back into the priority queue.
This guarantees that the largest empty slots are no more than (approximately) 1.618 times as large as the smallest slots. For smaller slots the approximation worsens and the sizes are related as 2:1.
If it is not convenient to keep the priority queue between schedule changes, you can prepare an array of the 60 possible events in advance and extract the next value from this array every time you need a new event.
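One way the golden-ratio splitting could be sketched, in Python rather than Ruby. Several details here are my own assumptions and are not specified in the answer: intervals are half-open ranges of free minutes, the split point is rounded down, and ties between equally sized intervals go to the earlier one. The function name is hypothetical.

```python
import heapq

GOLDEN = (1 + 5 ** 0.5) / 2  # about 1.618

def golden_slots(count, start=0, end=60):
    """Return up to `count` distinct minutes in [start, end), splitting the largest gap each time."""
    # heapq is a min-heap, so sizes are stored negated; (lo, hi) is a half-open free range.
    heap = [(-(end - start), start, end)]
    slots = []
    while heap and len(slots) < count:
        neg_size, lo, hi = heapq.heappop(heap)
        size = -neg_size
        # Split in golden proportion; int() rounds down, keeping the point inside [lo, hi).
        split = lo + int(size / GOLDEN)
        slots.append(split)
        if split > lo:
            heapq.heappush(heap, (-(split - lo), lo, split))
        if hi > split + 1:
            heapq.heappush(heap, (-(hi - split - 1), split + 1, hi))
    return slots

print(golden_slots(8))
```

Calling golden_slots(60) once and storing the result gives the precomputed array of 60 events mentioned above. Unlike the asker's own scheme, this treats the hour as a line rather than a circle, so the first event does not land on 00:00.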
Since you can have at most 60 events to schedule, I suppose using a static table is worth a shot (compared to devising an algorithm and testing it). I mean, for you it is quite a trivial task to lay out events within the hour, but it is not so easy to tell a computer how to do it in a nice way.
So what I propose is to define a table with static values of the times at which to put the next event. It could be something like:
00:00, 01:00, 00:30, 00:15, 00:45...
Since you can't reschedule events and you don't know in advance how many events will arrive, I suspect your own proposal (with Roman's note of using 01:00) is the best.
However, if you have any sort of estimation on how many events will arrive at maximum, you can probably optimize it. For example, suppose you are estimating at most 7 events, you can prepare slots of 60 / (n - 1) = 10 minutes and schedule the events like this:
00:00
01:00
00:30
00:10
00:40
00:20
00:50 // 10 minutes apart
Note that the last few events might not arrive, so 00:50 has a low probability of being used.
This would be fairer than the non-estimation-based algorithm, especially in the worst-case scenario where all slots are used:
00:00
01:00
00:30
00:15
00:45
00:07
00:37 // Only 7 minutes apart
I wrote a Ruby implementation of my solution. It has the edge case that any events beyond 60 will all stack up at minute 0, because every free space of time is now the same size, and it prefers the first one.
I didn't specify how to handle events beyond 60, and I don't really care, but I suppose randomization or round-robin could solve that edge case if you do care.
each_cons(2) gets bigrams; the rest is probably straightforward:
class Scheduler
  def initialize
    @scheduled_minutes = []
  end
  def next_slot
    if @scheduled_minutes.empty?
      slot = 0
    else
      circle = @scheduled_minutes + [@scheduled_minutes.first + 60]
      slot = 0
      largest_known_distance = 0
      circle.each_cons(2) do |(from, unto)|
        distance = (from - unto).abs
        if distance > largest_known_distance
          largest_known_distance = distance
          slot = (from + distance/2) % 60
        end
      end
    end
    @scheduled_minutes << slot
    @scheduled_minutes.sort!
    slot
  end
  def schedule
    @scheduled_minutes
  end
end
scheduler = Scheduler.new
20.times do
  scheduler.next_slot
  p scheduler.schedule
end

load balancing algorithms - special example

Let's pretend I have two buildings in which I can build different units.
A building can only build one unit at a time but has a FIFO queue of at most 5 units, which will be built in sequence.
Every unit has a build time.
I need to know the best way to get my units built as fast as possible, considering the units already in the build queues of my buildings.
"Famous" algorithms like round-robin don't work here, I think.
Are there any algorithms that can solve this problem?
This reminds me a bit of starcraft :D
I would just add an integer to the building queue which represents the time it is busy.
Of course you have to update this variable once per timeunit. (Timeunits are "s" here, for seconds)
So let's say we have a building and we are submitting 3 units, each take 5s to complete. Which will sum up to 15s total. We are in time = 0.
Then we have another building where we are submitting 2 units that need 6 timeunits to complete each.
So we can have a table like this:
Time 0
Building 1, 3 units, 15s to complete.
Building 2, 2 units, 12s to complete.
Time 1
Building 1, 3 units, 14s to complete.
Building 2, 2 units, 12s to complete.
And we want to add another unit that takes 2s, we can simply loop through the selected buildings and pick the one with the lowest time to complete.
In this case this would be building 2. This would lead to Time2...
Time 2
Building 1, 3 units, 13s to complete
Building 2, 3 units, 11s+2s=13s to complete
...
Time 5
Building 1, 2 units, 10s to complete (5s are over, the first unit pops out)
Building 2, 3 units, 10s to complete
And so on.
Of course you have to respect the upper limits of your production facilities. If a building already has 5 elements queued, don't assign anything to it; instead pick the next building with the lowest time to complete.
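The selection rule described above (lowest remaining time among buildings that still have queue space) could be sketched as follows. The dictionary keys "queued" and "busy" and the function name are hypothetical, chosen only for this illustration.

```python
def pick_building(buildings, max_queue=5):
    """Return the building with the least remaining work that still has queue space.

    buildings: list of dicts with 'queued' (units in queue) and 'busy' (seconds left).
    Returns None when every queue is full.
    """
    candidates = [b for b in buildings if b["queued"] < max_queue]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b["busy"])

# State at "Time 1" in the table above.
buildings = [
    {"name": "Building 1", "queued": 3, "busy": 14},
    {"name": "Building 2", "queued": 2, "busy": 12},
]
target = pick_building(buildings)
target["queued"] += 1
target["busy"] += 2  # the new unit takes 2s
print(target["name"])
```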
I don't know if you can implement this easily with your engine, or if it even support some kind of timeunits.
This just means updating all production facilities once per time unit, which is O(n), where n is the number of buildings that can produce something. Submitting a unit takes O(1) if you keep the selected buildings in sorted order, lowest first, so it is just a first-element lookup. In this case you have to re-sort the list after changing the queued units, for example after cancelling or adding one.
Otherwise amit's answer seem to be possible, too.
This is an NP-complete (NPC) problem (proof at the end of the answer), so your best hope of finding the ideal solution is trying all possibilities (this will be 2^n possibilities, where n is the number of tasks).
A possible heuristic was suggested in a comment (and improved in comments by AShelly): sort the tasks from biggest to smallest and put them in one queue; every building can then take an element from the queue when it is done.
This is of course not always optimal, but I think it will give good results in most cases.
Proof that the problem is NPC:
Let S={u|u is a unit that needs to be produced}. (S is the set containing all 'tasks'.)
Claim: if a perfect split is possible (both queues finish at the same time), it is optimal. Let this time be HalfTime.
This is true because in any other split at least one of the queues has to finish at t>HalfTime, so it cannot be better.
Proof:
Assume we had an algorithm A that produces the best solution in polynomial time. Then we could solve the partition problem in polynomial time with the following algorithm:
1. run A on input
2. if the 2 queues finish exactly at HalfTime - return True.
3. else: return False
This solves the partition problem because of the claim: if the partition exists, it will be returned by A, since it is optimal. Steps 1, 2 and 3 all run in polynomial time (1 by assumption; 2 and 3 are trivial). So the algorithm we suggested solves the partition problem in polynomial time; thus our problem is NPC.
Q.E.D.
Here's a simple scheme:
Let U be the list of units you want to build, and F be the set of factories that can build them. For each factory, track the total time-til-complete, i.e. how long until its queue is completely empty.
Sort U by decreasing time-to-build. Maintain the sort order when inserting new items.
At the start, or at the end of any time tick after a factory completes a unit or runs out of work:
    Make a ready list of all the factories with space in the queue
    Sort the ready list by increasing time-til-complete
    Get the factory that will be done soonest
    Take the first item from U and add it to that factory
    Repeat until U is empty or all queues are full.
Googling "minimum makespan" may give you some leads into other solutions. This CMU lecture has a nice overview.
It turns out that if you know the set of work ahead of time, this problem is exactly Multiprocessor_scheduling, which is NP-Complete. Apparently the algorithm I suggested is called "Longest Processing Time", and it will always give a result no longer than 4/3 of the optimal time.
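A minimal sketch of the Longest Processing Time rule for the offline case, in Python. It deliberately ignores the 5-unit queue limit and the tick-by-tick simulation from the scheme above, and the function name lpt_schedule is made up for the example.

```python
import heapq

def lpt_schedule(build_times, factory_count):
    """Assign jobs longest-first to whichever factory currently has the least load.

    Returns (makespan, assignment), where assignment maps factory index -> list of build times.
    """
    loads = [(0, f) for f in range(factory_count)]  # min-heap of (current load, factory)
    heapq.heapify(loads)
    assignment = {f: [] for f in range(factory_count)}
    for t in sorted(build_times, reverse=True):
        load, f = heapq.heappop(loads)
        assignment[f].append(t)
        heapq.heappush(loads, (load + t, f))
    makespan = max(load for load, _ in loads)
    return makespan, assignment

makespan, assignment = lpt_schedule([3, 3, 2, 2, 2], 2)
print(makespan, assignment)
```

On this input LPT finishes at time 7, while the best split (3+3 versus 2+2+2) finishes at 6, which shows the heuristic is not always optimal yet stays within the 4/3 bound.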
If you don't know the jobs ahead of time, it is a case of online Job-Shop Scheduling
The paper "The Power of Reordering for Online Minimum Makespan Scheduling" says:
"for many problems, including minimum makespan scheduling, it is reasonable to not only provide a lookahead to a certain number of future jobs, but additionally to allow the algorithm to choose one of these jobs for processing next and, therefore, to reorder the input sequence."
Because you have a FIFO on each of your factories, you essentially do have the ability to buffer the incoming jobs: you can hold them until a factory is completely idle, instead of trying to keep all the FIFOs full at all times.
If I understand the paper correctly, the upshot of the scheme is to:
Keep a fixed-size buffer of incoming jobs. In general, the bigger the buffer, the closer to ideal scheduling you get.
Assign a weight w to each factory according to a given formula, which depends on buffer size. In the case where buffer size = number of factories + 1, use weights of (2/3, 1/3) for 2 factories; (5/11, 4/11, 2/11) for 3.
Once the buffer is full, whenever a new job arrives, you remove the job with the least time to build and assign it to a factory with a time-to-complete < w*T, where T is the total time-to-complete of all factories.
If there are no more incoming jobs, schedule the remainder of the jobs in U using the first algorithm I gave.
The main problem in applying this to your situation is that you don't know when (if ever) there will be no more incoming jobs. But perhaps just replacing that condition with "if any factory is completely idle", and then restarting, will give decent results.

Algorithm to process jobs with same priority

I am solving exercise problems from a book called Algorithms by Papadimitriou and Vazirani.
The following is the question:
A server has n customers waiting to be served. The service time required by each customer is known in advance: it is ti minutes for customer i. So if, for example, the customers are served in order of increasing i, then the ith customer has to wait for Sum(j = 1 to i) tj minutes.
We wish to minimize the total waiting time. Give an efficient algorithm for this.
My Attempt:
I thought of a couple of approaches but couldn't decide which is best, or whether some other approach beats mine.
Approach 1:
Serve them in round-robin fashion with a time slice of 5. However, I need to be careful when deciding the time slice: it shouldn't be too high or too low. So I thought of selecting the average of the service times as the time slice.
Approach 2:
Assume the jobs are sorted according to the time they take and are stored in an array A[1...n].
First serve A[1], then A[n], then A[2], then A[n-1], and so on.
I can't really decide which will be the better solution for this problem. Am I missing something?
Thanks,
Chander
You can solve this problem by adding a sorting step and improving on your round-robin approach:
First, sort the customers by service time.
Now, instead of just giving each customer a time slice t in round-robin manner, also check whether the customer has less than t/2 remaining time; if so, complete his task.
So:
for each customer in the sorted list, starting from the first
    serve the customer for time t
    if his remaining time is < t/2 then complete his service now
    else move to the next customer
Let me assume the "total waiting time" is the sum of the times each customer waits until the server finishes serving him/her, and assume the customers are served in order of increasing i. Then customer C1 waits t1 minutes, customer C2 waits t1+t2 minutes, customer C3 waits t1+t2+t3 minutes, ..., and customer Cn waits t1+t2+...+t{n-1}+tn minutes.
or:
C1 waits: t1
C2 waits: t1+t2
C3 waits: t1+t2+t3
...
Cn waits: t1+t2+t3+...+tn
The total waiting time adds up to n*t1+(n-1)*t2+...+1*tn
Again, this is based on the assumption that the customers are served in order of increasing i.
Now, which customer do you want to serve first?
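To experiment with the formula above, a small Python helper can compute the total waiting time for any serving order. The function name is hypothetical, and "waiting" is taken, as in this answer, to include the customer's own service time.

```python
def total_waiting_time(service_times):
    """Sum over customers of the time until each one's service is finished."""
    total = 0
    elapsed = 0
    for t in service_times:
        elapsed += t      # this customer is done at time `elapsed`
        total += elapsed  # equals n*t1 + (n-1)*t2 + ... + 1*tn overall
    return total

order = [3, 1, 2]
print(total_waiting_time(order))          # as given
print(total_waiting_time(sorted(order)))  # same customers, a different order
```

Comparing a few orderings this way makes the effect of the n, n-1, ..., 1 coefficients easy to see.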
