Understanding omission failure in distributed systems

Understanding omission failure in distributed systems - client-server

The following text says this which I'm not able to quite agree :
client C sends a request R to server S. The time taken by a communication link to transport R over the link is D. P is the maximum time needed by S to recieve , process and reply to R. If omission failure is assumed ; then if no reply to R is received within 2(D+P) , then C will never recieve a reply to R .
Why is the time here 2(D+P). As I understand shouldn't it be 2D+P ?

Omission failures either due to process crash either due to communication link failures are detected via timeouts.
But in an asynchronous system a timeout is an indication only that a process is not responding. The other process may have crashed or just has slowed down due to heavy processing load.
So usually as a timeout we pick a maximum period. So this could be the 2(D+P) and not your strict 2D+P. The maximum period allows to account for either the network became congested and therefore slower on the response return/ the slower processing in the process and the time to process the response R that the client takes to read the message from the incoming buffer and do the processing need to pass it up to the application level.
So I can not tell you exactly what accounts as what in that formula from your book, I can tell you though that on timeouts we are not as strict as you would expect.

Related

Question about implementing Raft's Client interaction

I'm actually learning MIT6.824,
https://www.youtube.com/channel/UC_7WrbZTCODu1o_kfUMq88g,
and try to implement its lab,
there's a paragraph in raft's paper describing client semantics:
Our goal for Raft is to implement linearizable seman- tics (each operation appears to execute instantaneously, exactly once, at some point between its invocation and its response). However, as described so far Raft can exe- cute a command multiple times: for example, if the leader crashes after committing the log entry but before respond- ing to the client, the client will retry the command with a new leader, causing it to be executed a second time. The solution is for clients to assign unique serial numbers to every command. Then, the state machine tracks the latest serial number processed for each client, along with the as- sociated response. If it receives a command whose serial number has already been executed, it responds immedi- ately without re-executing the
request.
Now I have passed MIT lab 3A, but I have responses map[string]string in kvserver,
which is a map from client's request id to response, but the problem is then the
map will keep increasing if client's keep sending request, Which is problemic in real project. How does Raft handle this in real project? Also, the MIT lab 3 says one client
will execute one command at a time, so probably I can optimize by deleting client's last request's response. But how does Raft handle this in real project where client's behavior is more free?

Cancel last sent message ZeroMQ (python) (dealer/router and pushh/pull)

How would one cancel the last sent message ?
I have this set up
The idea is that the client can ask for different types of large data.
The server reads the request from the client and answers an acknowledgement.
Once its data is ready, it pushes it through the other socket.
This enables queueing task on the server side when multiple clients are connected.
However, if the client decides that it does not need the data anymore, it can send a cancel message to the server.
I'm using asyncio.Queue for queueing messages, so I can easily empty the queue, however, I don't know how to drop a message that is in the push/pull pipe to free up the channel?
The kill switch example (Figure 19 - Parallel Pipeline with Kill Signaling) in https://zguide.zeromq.org/docs/chapter2/ is used to end the process. I just want to cancel it.
My idea was to close the socket on the server side and reopen it, but even with linger set to 0, the messages are not dropped.
EDIT: The messages are indeed dropped, but I feel the solution is wrong.

It doesn't really make any sense for ZeroMQ itself to have such a feature.
Suppose that it did have a cancel message feature. For it to operate as expected, you would be critically dependent on the speed of the network. You might develop on a slow network and their you have the time available to decide to cancel, submit the request and for that to take effect before anything has moved anywhere. But on a fast network you won't.
ZeroMQ is a bit like the post office. Once you have posted a letter, they are going to deliver it.
Other issues for a library developer would include how messages are identified, who can cancel a message, etc? It would get very complex for the library to do it and cater for all possible use cases, so it's not unreasonable that they've left such things as an exercise for the application developers.
Chop the Responses Up
You could divide the responses up into smaller messages, send them at some likely rate (proportionate to the network throughput) and check to see if a cancellation has been received before sending each chunk.
It's a bit fiddly, you'd need to know what kind of rate to send the smaller messages so that you don't starve the network, but don't over do it either.
Or, Convert to CSP
The problem lies in ZeroMQ implementing Actor Model, where the transport buffers messages. What you need is Communicating Sequential Processes, which does not buffer messages. You can implement this quite easily on top of ZeroMQ, basically all you need to do is have a two way message exchange going on basically like:
Peer1->Peer2: I'd like to send you a message
time passes
Peer2->Peer1: Okay send a message
Peer1->Peer2: Here is the message
time passes
Peer2->Peer1: I have received the message
end
And in doing this the peers would block, ie peer 1 does nothing else until it gets peer 2's final response.
This feels clunky, but it's what you have to do to reign in an Actor Model system and control where your messages are at any point in time. It's slower because there's more too-ing and fro-ing going on between the peers (in systems like Transputers, this was all done down at the electronic level, so it wasn't an encumberance on software).
The blocking can be a blessing, if throughput matters. Basically, if you find the sender is being blocked too much, that just means you haven't got enough receivers for the tasks they're performing. Actor Model can deceive, because buffering in the network / actor model implementation can temporarily soak up an excess of messages, adding a bit of latency that goes unnoticed.
Anyway, this way you can have a mechanism whereby the flow of messages is fully managed within the application, and not within the ZeroMQ library. If a client does send a "cancel my last request" message (using the above mechanism to send it), that either arrives before the reponse has started to be sent, or after the response has already been delivered to the client (using the mechanism above to send it). There is no intermediate state where a response is already on the way, but out of control of the applications.
CSP is a mode that I'd dearly like ZeroMQ to implement natively. It nearly does, in that you can control the socket high water marks. Unfortunately, a high water mark of 0 means "inifinite", not zero.
CSP itself is a 1970s idea, that saw some popularity and indeed silicon in the 1980s, early 1990s (Inmos, Transputers, Occam, etc) but has recently made something of a comeback in languages like Rust, Go, Erlang. There's even a MS-supplied library for .NET that does it too (not that they call it CSP).
The really big benefit of CSP is that it is algebraically analysable - a design can be analysed and proven to be free of deadlock, without having to do any testing. However, with Actor model systems you cannot do that, and testing will not confirm a lack of problems either. Complex, circular message flows in Actor model can easily lead to deadlock, but that might not occur until the network between computers becomes just a tiny bit busier. Deadlock can happen in CSP too, but it's basically guaranteed to happen every time, if the system has accidentally been architected to deadlock. This shows up in testing quite readily (so at least you know early on!).
As I alluded to early, CSP also doesn't deceive you into thinking there is enough compute resources in a system. If a sender has a strict schedule to keep, and the recipient(s) aren't keeping up, the sender ends up being blocked trying to send instead of waiting for fresh input. It's easy to detect that the real time requirement has not been met. Whereas with Actor model, the send launches messages off into some buffer, and so long as the receiver(s) on average keeps up, all appears to be OK. However, you have no visibility of whether messages are building up inside the (in this case) ZeroMQ's own buffers, so there is little notice of a trending problem in the overall system.

Are SNMP request sequential - are there chances they it can arrive in multiples

I am writing an SNMP agent and plan to write agent to process SNMP request one by one. Means that as when a request arrives at port 161 - will not accept any further request until response / timeout completes.
I am no sure of many SNMP clients - but is it that the SNMP request are sync and sequential - is there any way that they can come in bulk at a single time?

I think SNMP queries can easily come in bursts due to multiple independent managers polling your agent and/or a single anxious manager retrying the same command if your agent is not quick enough to respond.
When it comes to writing SNMP agents, the other consideration would be to estimate the maximum possible time for the agent to gather required data to respond. I believe it should not be the OID-average, but the OID-maximum. In other words, should your agent serve 100 OIDs, out of which querying one "slow" OID would lead to the entire (synchronous) agent to block and stop serving others - this situation might undermine the credibility of your agent on the network...
On top of that, if you happen to hit the same slow OID multiple time in a row (e.g. manager retries), the delay might be accumulating, effectively blocking out other queries.
To summarize: I think high-performance SNMP agent should have the following traits:
Support massively concurrent SNMP commands processing
Have non-blocking data source access for gathering managed objects data
Have some form of caching or rate limiting to protect computationally expensive data sources from cocky SNMP managers
On the other hand, if your SNMP agent is serving a small piece of static data on a low-power hardware and you do not expect too many managers ever talking to you, perhaps you could get away with a simplistic synchronous SNMP agent...
BTW, BSD sockets interface would hold a queue of unprocessed UDP packets so your agent would have a chance to catch up.

The premise of your question is flawed, as there is no concept of "coming in bulk at a single time" — no matter in which order the UDP datagrams making up an SNMP packet are received, and no matter how long a duration lies between the receipt of each packet by your network interface, your operating system will present the SNMP packets to you in receipt order, in sequence. You have one listen port, and one read buffer. So this synchronicity is already how network data processing works and you shouldn't worry about it.
I would say though, that if you are waiting for some resource to become available while processing an SNMP request (as suggested by your use of the word "timeout"), you probably ought to get on and start processing your other pending SNMP requests in the meantime, or you risk your whole stack grinding to a halt. It's not fair to make a manager wait some unknown duration for a response to request B just because some other manager made a request A that is experiencing a delay in being serviced. That being said, you probably do want some upper limit on how many requests can be serviced at any one time, to prevent potential DDoSsing — choosing this value can only be done by you, with your knowledge of the use case and the ecosystem.

Get requests are one OID per request, GetBulk request can ask for several OIDs in one request. Also SNMP client can use async mode sending multiple requests with minimal intervals and waiting for replies.
Packets can also arrive out-or-order due to network delays and equal-cost routes. Your can experiment sending requests with snmpget, snmpgetbulk, snmpbulkwalk and use tcpdump to see what is on the wire.
So, in general, your agent has to be ready to accept bursts of requests.
For simplicity, if request rate is low and your agent can reply fast enough, you can use one-by-one processing. Some of requests can fail in this case, but clients can retry request and finally get reply from agent.

ZeroMQ: How to set the HWM to become really a "zero" ( not the API def'd 0 == INFINITY )?

I have a question about the high water mark (HWM) for a ZeroMQ PUB/SUB connection. Effectively I want to set the HWM value to zero. i.e.: if a message can't be delivered, just drop it.
Unfortunately it seems that the HWM value can only be set as low as 1.
"0" means infinity according to the API docs and my testing.
In my opinion, using "0" to mean "infinite" in the API was a mistake
:/ It's probably unlikely to change.Is there a workaround that
doesn't require recompilation of ZeroMQ?
The problem I'm encountering is that with a non-zero HWM, when a connection fails, at least one message sits in the queue and is sent when the connection is re-established. By that time the message is no longer valid and shouldn't be trusted.
I've thought about discarding messages on the receiving side by including a time stamp generated at the sending side. Unfortunately the systems clocks are not connected to the internet and drift significantly. Syncing the clocks with an additional REQ/REP socket will introduce other complicated startup states to the sender and seems like an unnecessary workaround.

Yes:
. . . may try to hunt the same deer from the opposite forest
Based on the need,
declared in the initial problem motivation,
the solution may be derived from setting not the ZMQ_???HWM,
but another parameter:
ZMQ_CONFLATE: Keep only last message
Default value 0 ( false )
Applicable socket types ZMQ_PULL, ZMQ_PUSH, ZMQ_SUB, ZMQ_PUB, ZMQ_DEALERIf set, a socket shall keep only one message in its inbound / outbound queue, this message being the last message received/the last message to be sent. Ignores ZMQ_RCVHWM and ZMQ_SNDHWM options.Does not support multi-part messages, in particular, only one part of it is kept in the socket internal queue.
and some help may come from a "preventive-measure" hidden in:
ZMQ_IMMEDIATE: Queue messages only to completed connections
Applicable socket types all, only for connection-oriented transports.
By default queues will fill on outgoing connections even if the connection has not completed. This can lead to "lost" messages on sockets with round-robin routing (REQ, PUSH, DEALER).
If this option is set to 1, messages shall be queued only to completed connections. This will cause the socket to block if there are no other connections, but will prevent queues from filling on pipes awaiting connection.

How does a non-forking web server work?

Non-forking (aka single-threaded or select()-based) webservers like lighttpd or nginx are
gaining in popularity more and more.
While there is a multitude of documents explaining forking servers (at
various levels of detail), documentation for non-forking servers is sparse.
I am looking for a bird eyes view of how a non-forking web server works.
(Pseudo-)code or a state machine diagram, stripped down to the bare
minimum, would be great.
I am aware of the following resources and found them helpful.
The
World of SELECT()
thttpd
source code
Lighttpd
internal states
However, I am interested in the principles, not implementation details.
Specifically:
Why is this type of server sometimes called non-blocking, when select() essentially blocks?
Processing of a request can take some time. What happens with new requests during this time when there is no specific listener thread or process? Is the request processing somehow interrupted or time sliced?
Edit:
As I understand it, while a request is processed (e.g file read or CGI script run) the
server cannot accept new connections. Wouldn't this mean that such a server could miss a lot
of new connections if a CGI script runs for, let's say, 2 seconds or so?

Basic pseudocode:
setup
while true
select/poll/kqueue
with fd needing action do
read/write fd
if fd was read and well formed request in buffer
service request
other stuff
Though select() & friends block, socket I/O is not blocking. You're only blocked until you have something fun to do.
Processing individual requests normally involved reading a file descriptor from a file (static resource) or process (dynamic resource) and then writing to the socket. This can be done handily without keeping much state.
So service request above typically means opening a file, adding it to the list for select, and noting that stuff read from there goes out to a certain socket. Substitute FastCGI for file when appropriate.
EDIT:
Not sure about the others, but nginx has 2 processes: a master and a worker. The master does the listening and then feeds the accepted connection to the worker for processing.

select() PLUS nonblocking I/O essentially allows you to manage/respond to multiple connections as they come in a single thread (multiplexing), versus having multiple threads/processes handle one socket each. The goal is to minimize the ratio of server footprint to number of connections.
It is efficient because this single thread takes advantage of the high level of active socket connections required to reach saturation (since we can do nonblocking I/O to multiple file descriptors).
The rationale is that it takes very little time to acknowledge bytes are available, interpret them, then decide on the appropriate bytes to put on the output stream. The actual I/O work is handled without blocking this server thread.
This type of server is always waiting for a connection, by blocking on select(). Once it gets one, it handles the connection, then revisits the select() in an infinite loop. In the simplest case, this server thread does NOT block any other time besides when it is setting up the I/O.
If there is a second connection that comes in, it will be handled the next time the server gets to select(). At this point, the first connection could still be receiving, and we can start sending to the second connection, from the very same server thread. This is the goal.
Search for "multiplexing network sockets" for additional resources.
Or try Unix Network Programming by Stevens, Fenner, Rudoff

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio