Lamport timestamps with variable message travel time - algorithm

I think I'm misunderstanding something about Lamport timestamps. The algorithm seems to assume that messages take the same time to travel between distributed endpoints.
Let's say that process p1 sends messages m1 and m2 sequentially to process p2. Following the pseudocode in the Algorithm section of the article, we have:
# we start at 0
time(p1) = 0
# we send m1
time(p1) = time(p1) + 1 = 1
send(m1, 1)
# we send m2
time(p1) = time(p1) + 1 = 2
send(m2, 2)
If m1 reaches p2 before m2 everything is fine. But if m2 comes first, we get:
# we start at 0
time(p2) = 0
# we receive m2 first
time(p2) = max(2, time(p2)) + 1 = max(2, 0) + 1 = 3
# we receive m1 second
time(p2) = max(1, time(p2)) + 1 = max(1, 3) + 1 = 4
So in p2's local time (time(p2)) m2 has time of 3, and m1 has time of 4. That is the opposite of the order in which the messages were originally sent.
Am I missing something fundamental or do Lamport timestamps require consistent travel times to work?

The time of a message is the time contained in the message, not the time of the process receiving the message.
The timer is logically shared between the two processes, so it needs to count events (sends) by both processes, which is why the receiver adds one to the time when it receives a message.
The algorithm attempts to maintain the two processes' timers in sync, even if messages are lost or delayed, which is why the receiver takes the maximum of its view of the time and the sender's view of the time found in the message. If these are not the same, some message has been lost or delayed. That will result in the two processes having different views of the time, but the clocks will be resynchronised when the next message is sent and received (in either direction).
If the two processes simultaneously send a message, the two messages will contain the same time, which is why it is not a total order. But the clocks will still eventually be resynchronised.
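The question's scenario can be sketched in runnable Python (the class and method names are mine, not from the article). The key point is that the ordering is read off the timestamps carried in the messages, not off p2's clock values at the moments of receipt:

```python
class LamportClock:
    """Minimal Lamport logical clock for one process."""
    def __init__(self):
        self.time = 0

    def send(self):
        # Increment before sending; the returned value travels with the message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Merge the sender's view of time with our own, then count the receive event.
        self.time = max(msg_time, self.time) + 1

p1, p2 = LamportClock(), LamportClock()
m1 = p1.send()   # m1 carries timestamp 1
m2 = p1.send()   # m2 carries timestamp 2

p2.receive(m2)   # m2 arrives first: p2.time = max(2, 0) + 1 = 3
p2.receive(m1)   # m1 arrives second: p2.time = max(1, 3) + 1 = 4

# m1's timestamp (1) is still smaller than m2's (2), so the send order
# is recoverable regardless of the arrival order.
print(m1, m2, p2.time)
```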

Related

Fischer's Mutual Exclusion Algorithm

Two processes are trying to enter their critical sections, executing the same program:
while true do begin
    // Noncritical section.
L:  if not(id=0) then goto L;
    id := i;
    pause(delay);
    if not(id=i) then goto L;
    // Inside critical section.
    id := 0;
end
The constant i identifies the process (i.e., has the value 1 or 2), and id is a global variable, with value 0 initially. The statement pause(delay) delays the execution for delay time units. It is assumed that the assignment id := i; takes at most t time units.
It has been proved that for delay > t the algorithm is correct.
I have two questions:
1) Suppose both processes A and B pass the check at label L. Suppose that at this point A is always chosen by the scheduler until it enters its critical section. Suppose that, while A is in its critical section, the scheduler dispatches process B; since B has already passed the check at label L, it can also enter its critical section. Where am I wrong?
2) Why is the algorithm incorrect if delay == t?
Suppose that processes A and B reach the label L at times t_A and t_B respectively (t_A < t_B), and that the difference between these times is at most t (the worst-case assignment time). If it were larger than t, process B would stop at label L and wait until id=0.
As a result, process B still sees id=0 and assigns its ID as well. But process A is not yet aware of this assignment. The only way for process A to learn about it is to wait for some time and re-check the value of id.
This waiting time must be larger than t. Why?
Let's consider two edge cases here
- Case 1: t_A = t_B; in other words, processes A and B reach label L at the same time. They both see id=0 and hence assign their IDs to it.
Let's assume that process A's assignment finishes in almost zero time and process B's assignment finishes in the worst-case time t. This means that process A has to delay for more than t time units in order to see process B's update to the variable id. If delay is at most t, the update will not be visible and both will enter the critical section. This alone is sufficient to claim that delay has to be larger than t.
- Case 2: t_B = t_A + t; in other words, process A reaches label L and starts assigning its ID, which takes the worst-case time t; after t time units (just as A's assignment is completing), process B reaches label L, still sees id=0 (because process A's assignment has not finished yet), and assigns its own ID, again taking the worst-case time t. Here too, if process A's delay is at most t, it will not see process B's update.
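Case 1 above can be checked with a small timing model in Python. All names, and the visibility rule that a write becomes visible strictly after it completes, are my assumptions for the sketch, not part of the original algorithm:

```python
def last_writer(time, writes):
    """Value of the shared variable id at `time`.
    writes: list of (completion_time, value); a write becomes visible
    strictly after it completes. id starts at 0."""
    val = 0
    for done, v in sorted(writes):
        if time > done:
            val = v
    return val

def fischer_case1(delay, t):
    """Both processes read id=0 at time 0 and pass the guard at L.
    A's assignment finishes almost instantly, B's takes worst-case t.
    Each re-checks id after `delay`. Returns (A enters, B enters)."""
    writes = [(0.0, 'A'), (t, 'B')]
    a_enters = last_writer(0.0 + delay, writes) == 'A'
    b_enters = last_writer(t + delay, writes) == 'B'
    return a_enters, b_enters

print(fischer_case1(delay=1.0, t=1.0))  # delay == t: both enter -> violation
print(fischer_case1(delay=1.5, t=1.0))  # delay > t: A sees B's write and loops back to L
```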

Minimum time to shift given weight boxes from one position to other when capacity of machines are given

Given n boxes of different weights and m machines of different weight-carrying capacities, find the minimum time required to move all boxes.
Machine capacities: C[0], C[1], C[2], ..., C[m-1].
Box weights: W[0], W[1], W[2], ..., W[n-1].
Each machine takes 1 minute to carry one load.
What would be the optimal approach? A recursive approach would try, for the current box, both assigning it to a given machine and not assigning it, and recur for the rest of the boxes.
Note: A single machine can carry boxes multiple times; each round trip takes exactly 1 unit of time.
Sort the machines in descending order of weight-carrying capacity.
Sort the boxes in increasing order of weight.
Now add boxes one by one to the current machine until the machine's capacity would be exceeded.
When it would be exceeded, move on to the next machine.
Pseudocode:
W[] // sorted in increasing order
C[] // sorted in decreasing order
i = 0           // pointer for box
j = 0           // pointer for machine
curr_weight = 0
time_taken = 0
while i < n:
    curr_weight = curr_weight + W[i]
    if curr_weight > C[j]:
        curr_weight = 0
        j = j + 1
        time_taken = time_taken + 1
    else:
        i = i + 1
end while
print time_taken + 1
Check the boundary cases, for example when j exceeds m-1.
Edit: in case the same machine can carry multiple times:
Maintain a sorted STACK of machines (sorted in descending order of capacity). As soon as a machine gets full and leaves for transportation, pop it from the STACK and enqueue it in a QUEUE. As soon as a machine is ready to carry again (after it has returned from its transportation job), dequeue it from the QUEUE and push it back onto the STACK.
Assumption: a machine takes 1 minute to move from source to destination, and 1 minute to move back from destination to source.
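A runnable Python sketch of the greedy above, with the boundary check on j added. The function name is mine; it follows the pseudocode's single-pass simplification, returning None when the machines run out:

```python
def min_time(weights, capacities):
    """Greedy from the answer: fill the largest machine with the
    lightest boxes, move to the next machine when it would overflow."""
    W = sorted(weights)                    # boxes, increasing
    C = sorted(capacities, reverse=True)   # machines, decreasing
    i = j = 0          # box / machine pointers
    curr_weight = 0
    time_taken = 0
    while i < len(W):
        if j >= len(C):
            return None        # boundary case: no machine left for this load
        if curr_weight + W[i] > C[j]:
            curr_weight = 0    # current machine is full: dispatch it
            j += 1
            time_taken += 1
        else:
            curr_weight += W[i]
            i += 1
    return time_taken + 1      # +1 for the last, partially filled machine

print(min_time([1, 2, 3], [6]))   # one machine carries everything in one load
print(min_time([2, 2], [3, 3]))   # one box per machine
```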

Scheduling - Assigning jobs to the most efficient worker

This was asked by a friend of mine. I had no previous context, so I want to know what type of algorithm this problem belongs to. Any hint or suggestions will do.
Suppose we have a group of N workers at a car assembly line. Each worker can do 3 types of work, with skills rated from 1 to 10. For example, Worker1's "paint surface" efficiency is rated 8, but "assemble engine" efficiency is only rated 5.
The manager has a list of M jobs, each defined by a start time, duration, job type, and importance (rated from 0 to 1). Each worker can work on only 1 job at a time, and each job can be worked by only 1 worker. How can the manager assign the jobs to get the maximum output?
The output for a job = worker skill rating * job importance * duration.
For example, we have workers {w1, w2}
w1: paint_skill = 9, engine_skill = 8
w2: paint_skill = 10,engine_skill = 5
We have jobs {j1, j2}
j1: paint job, start_time = 0, duration = 10, importance = 0.5
j2: engine job, start_time = 3, duration = 10, importance = 0.9
We should assign w1 to j2 and w2 to j1: output = 10 * 10 * 0.5 + 8 * 10 * 0.9 = 50 + 72 = 122.
A greedy solution that matches the next available worker with the next job is clearly sub-optimal: in the example it would match w1 to j1 and w2 to j2, which is not optimal.
An exhaustive brute-force search would guarantee the best output, but takes exponentially more time to compute on large job lists.
How can this problem be approached?
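For an instance this small, the brute force can be written directly as a search over all worker-per-job choices with an overlap check. The names and data layout below are mine:

```python
from itertools import product

def overlaps(a, b):
    # a, b: (job_type, start, duration, importance)
    return a[1] < b[1] + b[2] and b[1] < a[1] + a[2]

def best_output(workers, jobs):
    """workers: {name: {job_type: skill}}; jobs: list of
    (job_type, start, duration, importance). Tries every assignment
    of a worker to each job and keeps the best feasible total."""
    names = list(workers)
    best = 0.0
    for choice in product(names, repeat=len(jobs)):
        # A worker may not take two jobs whose time ranges overlap.
        feasible = all(
            choice[x] != choice[y] or not overlaps(jobs[x], jobs[y])
            for x in range(len(jobs)) for y in range(x + 1, len(jobs)))
        if feasible:
            total = sum(workers[w][jt] * imp * dur
                        for w, (jt, _, dur, imp) in zip(choice, jobs))
            best = max(best, total)
    return best

workers = {'w1': {'paint': 9, 'engine': 8},
           'w2': {'paint': 10, 'engine': 5}}
jobs = [('paint', 0, 10, 0.5), ('engine', 3, 10, 0.9)]
print(best_output(workers, jobs))  # 122.0: w2 -> paint job, w1 -> engine job
```

Since overlapping jobs force distinct workers, the problem has the flavour of a weighted assignment problem; one direction worth exploring for larger instances is weighted bipartite matching rather than this exponential enumeration.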

Feedback and HRRN Scheduling Algorithms?

These are examples from William Stallings' Operating Systems: Internals and Design Principles (7th ed). Below are the process arrival times and the service times:
HRRN:
I understand A and B, but I don't understand on what basis C is chosen before the others, and why D ends up last.
Feedback with q = 2
One source I read says it is a priority-based version of Round Robin, while our lecture notes describe it as another version of Shortest Process Next with quantum q. I've mixed everything up on this one and can't really find the correct logic. Most puzzling: why is there a block longer than 2, namely the final block of B?
I would be glad if you can explain the answers.
In the HRRN question, process B executes from 4-7 ms. Since process C arrived at 4 ms, it has waited 3 ms. Similarly, process D arrived at 6 ms and has waited 1 ms.
According to HRRN (response ratio = 1 + waiting time / service time):
ratio for C = 1 + 3/4 = 1.75
ratio for D = 1 + 1/5 = 1.2, therefore process C executes from 7-11 ms.
Now D has to wait 4 ms more until C completes; similarly, E waits 3 ms.
Ratio for D = 1 + (4+1)/5 = 2
Ratio for E = 1 + 3/2 = 2.5
Therefore E executes next and D executes last. Hope this clarifies. I have no idea about problem 2.
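The ratios above can be reproduced with a short simulation. The arrival/service times for A (0 ms / 4 ms) and B (2 ms / 3 ms) are my guesses, chosen so that B runs 4-7 ms as the answer states; C (4/4), D (6/5) and E (8/2) follow from the waits and ratios quoted:

```python
def hrrn(procs):
    """procs: {name: (arrival, service)}. Non-preemptive Highest
    Response Ratio Next: run to completion the ready process that
    maximises (waiting + service) / service. Returns the run order."""
    time = 0
    order = []
    remaining = dict(procs)
    while remaining:
        ready = [n for n, (a, s) in remaining.items() if a <= time]
        if not ready:
            # CPU idle: jump to the next arrival.
            time = min(a for a, s in remaining.values())
            continue
        def ratio(n):
            arrival, service = remaining[n]
            return (time - arrival + service) / service
        nxt = max(ready, key=ratio)
        order.append(nxt)
        time += remaining[nxt][1]
        del remaining[nxt]
    return order

procs = {'A': (0, 4), 'B': (2, 3), 'C': (4, 4), 'D': (6, 5), 'E': (8, 2)}
print(hrrn(procs))  # ['A', 'B', 'C', 'E', 'D']
```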

How to find maximum task running at x time?

The problem description is as follows:
There are n events on a particular day d, each with a start time and a duration. Example:
e1 10:15:06 11ms (ms = milliseconds)
e2 10:16:07 12ms
......
I need to find the time x at which the maximum number of events were executing, together with that number.
The solution I am thinking of is:
Scan every millisecond of day d. But that requires 86,400,000 * n calculations in total. Example:
Check at 00:00:00.001 how many events are running
Check at 00:00:00.002 how many events are running
Take the max over the whole day
The second solution I am thinking of is:
for event_i in all events:
    running_event = 1
    for event_j in all events where event_j != event_i:
        if event_j.start_time in Range(event_i.start_time, event_i.start_time + event_i.duration):
            running_event++
Then take the max of running_event.
Is there any better solution for this?
This can be solved in O(n log n) time:
Make an array of all event boundaries (starts and ends). This array is already partially sorted: O(n)
Sort the array: O(n log n); your library should be able to exploit the partial sortedness (Timsort does that very well); look into distribution-based sorting algorithms for better expected running time.
    Sort event boundaries ascending by boundary time
    Sort event ends before event starts if touching intervals are considered non-overlapping
    (Sort event ends after event starts if touching intervals are considered overlapping)
Initialise running = 0, running_best = 0, best_at = 0
For each event boundary:
    If it is the start of an event, increment running;
        if running > running_best, set running_best = running and best_at = the current boundary's time
    If it is the end of an event, decrement running
Output best_at
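A runnable version of the sweep, treating touching intervals as overlapping (so starts sort before ends at equal times); the function name is mine:

```python
def max_concurrent(intervals):
    """intervals: list of (start, end), endpoints inclusive.
    Returns (max running count, earliest time it is reached)."""
    START, END = 0, 1          # START < END: starts sort before ends at equal time
    points = []
    for s, e in intervals:
        points.append((s, START))
        points.append((e, END))
    points.sort()
    running = running_best = 0
    best_at = None
    for t, kind in points:
        if kind == START:
            running += 1
            if running > running_best:
                running_best = running
                best_at = t
        else:
            running -= 1
    return running_best, best_at

print(max_concurrent([(0.5, 1.5), (0, 1.2), (1, 3)]))  # (3, 1)
```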
You could reduce the number of points you check by checking only the endpoints of all intervals: for each interval (task) I that lasts from t1 to t2, you only need to check how many tasks are running at t1 and at t2 (assuming the task runs from t1 to t2 inclusive; if it is exclusive, check t1-EPSILON, t1+EPSILON, t2-EPSILON, t2+EPSILON).
It is easy to see (convince yourself why) that the maximum cannot occur at a point these candidates do not cover.
Example:
tasks run in `[0.5,1.5],[0,1.2],[1,3]`
candidates: 0,0.5,1,1.2,1.5,3
0 -> 1 task
0.5 -> 2 tasks
1 -> 3 tasks
1.2 -> 3 tasks (assuming inclusive, end of interval)
1.5 -> 2 tasks (assuming inclusive, end of interval)
3 -> 1 task (assuming inclusive, end of interval)
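The candidate-point check from this answer, sketched in Python (inclusive endpoints; O(n^2) but simple, and the names are mine):

```python
def count_running(tasks, t):
    # Number of tasks whose inclusive interval [s, e] contains t.
    return sum(1 for s, e in tasks if s <= t <= e)

def max_via_candidates(tasks):
    """Check only interval endpoints: the maximum overlap must be
    attained at one of them. Returns (max count, latest such endpoint)."""
    candidates = sorted({p for s, e in tasks for p in (s, e)})
    counts = [(count_running(tasks, t), t) for t in candidates]
    return max(counts)

tasks = [(0.5, 1.5), (0, 1.2), (1, 3)]
print(max_via_candidates(tasks))  # (3, 1.2): 3 tasks run at both t = 1 and t = 1.2
```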
