Are SNMP requests sequential - is there a chance they can arrive in multiples?

I am writing an SNMP agent and plan to have it process SNMP requests one by one. That means that when a request arrives at port 161, the agent will not accept any further requests until the response is sent or a timeout occurs.
I am not sure about the behaviour of most SNMP clients - are SNMP requests synchronous and sequential, or is there any way they can arrive in bulk at the same time?

I think SNMP queries can easily come in bursts due to multiple independent managers polling your agent and/or a single anxious manager retrying the same command if your agent is not quick enough to respond.
When it comes to writing SNMP agents, the other consideration would be to estimate the maximum possible time for the agent to gather the data required to respond. I believe it should be the per-OID maximum, not the per-OID average. In other words, should your agent serve 100 OIDs, out of which querying one "slow" OID causes the entire (synchronous) agent to block and stop serving the others, this situation might undermine the credibility of your agent on the network...
On top of that, if you happen to hit the same slow OID multiple times in a row (e.g. manager retries), the delays might accumulate, effectively blocking out other queries.
To summarize: I think a high-performance SNMP agent should have the following traits:
Support massively concurrent SNMP commands processing
Have non-blocking data source access for gathering managed objects data
Have some form of caching or rate limiting to protect computationally expensive data sources from cocky SNMP managers
On the other hand, if your SNMP agent is serving a small piece of static data on a low-power hardware and you do not expect too many managers ever talking to you, perhaps you could get away with a simplistic synchronous SNMP agent...
BTW, the BSD sockets interface holds a queue of unprocessed UDP packets, so your agent would have a chance to catch up.
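To illustrate the concurrent shape described above, here is a minimal sketch (library-agnostic; decode_and_respond is a hypothetical placeholder for your PDU decoding and data gathering, not a real API) of an asyncio UDP listener that hands each datagram to its own task, so one slow OID does not stall replies to everyone else:
import asyncio

async def decode_and_respond(datagram):
    # Placeholder for real work: decode the PDU, query the (possibly slow)
    # data source, encode a response. Hypothetical helper, not a real API.
    return datagram

class AgentProtocol(asyncio.DatagramProtocol):
    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Hand each request to its own task; the receive loop is never blocked.
        asyncio.create_task(self.handle(data, addr))

    async def handle(self, data, addr):
        response = await decode_and_respond(data)
        self.transport.sendto(response, addr)

async def main():
    loop = asyncio.get_running_loop()
    await loop.create_datagram_endpoint(AgentProtocol, local_addr=("0.0.0.0", 161))
    await asyncio.Event().wait()                 # serve forever

asyncio.run(main())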

The premise of your question is flawed, as there is no concept of "coming in bulk at a single time": no matter in which order the UDP datagrams carrying SNMP requests are received, and no matter how much time passes between the receipt of each one by your network interface, your operating system will present them to you in receipt order, in sequence. You have one listen port and one read buffer, so this sequential delivery is already how network data processing works and you shouldn't worry about it.
I would say though, that if you are waiting for some resource to become available while processing an SNMP request (as suggested by your use of the word "timeout"), you probably ought to get on and start processing your other pending SNMP requests in the meantime, or you risk your whole stack grinding to a halt. It's not fair to make a manager wait some unknown duration for a response to request B just because some other manager made a request A that is experiencing a delay in being serviced. That being said, you probably do want some upper limit on how many requests can be serviced at any one time, to prevent potential DDoSsing — choosing this value can only be done by you, with your knowledge of the use case and the ecosystem.
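For instance, here is a small sketch (the names and the limit are illustrative, not a prescribed design) of serving requests concurrently while capping how many are in flight at once:
import asyncio

MAX_IN_FLIGHT = 32                       # pick from knowledge of your use case

async def handle_request(request):
    # Placeholder: gather the data for this request; may wait on a slow resource.
    await asyncio.sleep(0)
    return b"response"

async def serve_one(request, semaphore):
    async with semaphore:                # requests beyond the cap wait here
        return await handle_request(request)

async def serve_many(requests):
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    return await asyncio.gather(*(serve_one(r, semaphore) for r in requests))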

A Get request typically asks for one OID (though a single GetRequest PDU can carry several variable bindings), while a GetBulk request can ask for several OIDs in one request. An SNMP client can also operate asynchronously, sending multiple requests at minimal intervals and then waiting for the replies.
Packets can also arrive out of order due to network delays and equal-cost routes. You can experiment by sending requests with snmpget, snmpbulkget and snmpbulkwalk, and use tcpdump to see what is on the wire.
So, in general, your agent has to be ready to accept bursts of requests.
For simplicity, if the request rate is low and your agent can reply fast enough, you can use one-by-one processing. Some requests may fail in this case, but clients can retry and eventually get a reply from the agent.
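A minimal sketch of that simplistic one-by-one approach (build_response is a hypothetical stand-in for your request handling); the kernel's receive buffer absorbs short bursts while the current request is being served:
import socket

def build_response(request):
    # Placeholder for decoding the request and building a reply.
    return request

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 161))

while True:
    data, addr = sock.recvfrom(65535)    # next queued request, if any
    sock.sendto(build_response(data), addr)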

Related

Cancel last sent message ZeroMQ (python) (dealer/router and push/pull)

How would one cancel the last sent message?
I have this set up
The idea is that the client can ask for different types of large data.
The server reads the request from the client and answers an acknowledgement.
Once its data is ready, it pushes it through the other socket.
This enables queueing task on the server side when multiple clients are connected.
However, if the client decides that it does not need the data anymore, it can send a cancel message to the server.
I'm using asyncio.Queue for queueing messages, so I can easily empty the queue, however, I don't know how to drop a message that is in the push/pull pipe to free up the channel?
The kill switch example (Figure 19 - Parallel Pipeline with Kill Signaling) in https://zguide.zeromq.org/docs/chapter2/ is used to end the process. I just want to cancel it.
My idea was to close the socket on the server side and reopen it, but even with linger set to 0, the messages are not dropped.
EDIT: The messages are indeed dropped, but I feel the solution is wrong.
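Roughly, what I tried looks like this (a sketch; the endpoint name is just an example):
import zmq

def reset_push_socket(ctx, old_sock, endpoint="tcp://*:5555"):
    # Drop anything still queued on the old PUSH socket, then re-bind a fresh one.
    old_sock.setsockopt(zmq.LINGER, 0)   # discard unsent messages on close
    old_sock.close()
    new_sock = ctx.socket(zmq.PUSH)
    new_sock.bind(endpoint)
    return new_sock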
It doesn't really make any sense for ZeroMQ itself to have such a feature.
Suppose that it did have a cancel message feature. For it to operate as expected, you would be critically dependent on the speed of the network. You might develop on a slow network, where you have the time available to decide to cancel, submit the cancellation and have it take effect before anything has moved anywhere. But on a fast network you won't.
ZeroMQ is a bit like the post office. Once you have posted a letter, they are going to deliver it.
Other issues for a library developer would include how messages are identified, who can cancel a message, and so on. It would get very complex for the library to do this and cater for all possible use cases, so it's not unreasonable that they've left such things as an exercise for the application developers.
Chop the Responses Up
You could divide the responses up into smaller messages, send them at some likely rate (proportionate to the network throughput) and check to see if a cancellation has been received before sending each chunk.
It's a bit fiddly; you'd need to know what rate to send the smaller messages at so that you don't starve the network, but don't overdo it either.
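A sketch of that idea with pyzmq (the control socket used for cancellations and the chunk size are assumptions, not part of the original setup):
import zmq

def send_chunked(push_sock, control_sock, payload, chunk_size=64 * 1024):
    # Send payload in chunks, checking for a cancellation before each one.
    for offset in range(0, len(payload), chunk_size):
        if control_sock.poll(0):             # non-blocking check for a cancel message
            control_sock.recv()              # consume it
            return False                     # stopped before completion
        push_sock.send(payload[offset:offset + chunk_size])
    return True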
Or, Convert to CSP
The problem lies in ZeroMQ implementing the Actor Model, where the transport buffers messages. What you need is Communicating Sequential Processes (CSP), which does not buffer messages. You can implement this quite easily on top of ZeroMQ; all you need is a two-way message exchange along these lines:
Peer1->Peer2: I'd like to send you a message
time passes
Peer2->Peer1: Okay send a message
Peer1->Peer2: Here is the message
time passes
Peer2->Peer1: I have received the message
end
And in doing this the peers would block, i.e. Peer 1 does nothing else until it gets Peer 2's final response.
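A sketch of that exchange using two REQ/REP round trips in pyzmq (socket setup omitted; the token strings are just illustrative):
import zmq

def peer1_send(req_sock, message):
    req_sock.send(b"may-i-send")             # Peer1 -> Peer2: I'd like to send you a message
    assert req_sock.recv() == b"go-ahead"    # Peer2 -> Peer1: okay, send a message
    req_sock.send(message)                   # Peer1 -> Peer2: here is the message
    assert req_sock.recv() == b"got-it"      # Peer2 -> Peer1: received; Peer1 was blocked until now

def peer2_receive(rep_sock, handle):
    assert rep_sock.recv() == b"may-i-send"
    rep_sock.send(b"go-ahead")
    handle(rep_sock.recv())
    rep_sock.send(b"got-it")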
This feels clunky, but it's what you have to do to rein in an Actor Model system and control where your messages are at any point in time. It's slower because there's more to-ing and fro-ing going on between the peers (in systems like Transputers, this was all done down at the electronic level, so it wasn't an encumbrance on software).
The blocking can be a blessing, if throughput matters. Basically, if you find the sender is being blocked too much, that just means you haven't got enough receivers for the tasks they're performing. Actor Model can deceive, because buffering in the network / actor model implementation can temporarily soak up an excess of messages, adding a bit of latency that goes unnoticed.
Anyway, this way you can have a mechanism whereby the flow of messages is fully managed within the application, and not within the ZeroMQ library. If a client does send a "cancel my last request" message (using the above mechanism to send it), it either arrives before the response has started to be sent, or after the response has already been delivered to the client (again using the mechanism above). There is no intermediate state where a response is already on the way but out of the applications' control.
CSP is a mode that I'd dearly like ZeroMQ to implement natively. It nearly does, in that you can control the socket high-water marks. Unfortunately, a high-water mark of 0 means "infinite", not zero.
CSP itself is a 1970s idea, that saw some popularity and indeed silicon in the 1980s, early 1990s (Inmos, Transputers, Occam, etc) but has recently made something of a comeback in languages like Rust, Go, Erlang. There's even a MS-supplied library for .NET that does it too (not that they call it CSP).
The really big benefit of CSP is that it is algebraically analysable - a design can be analysed and proven to be free of deadlock, without having to do any testing. However, with Actor model systems you cannot do that, and testing will not confirm a lack of problems either. Complex, circular message flows in Actor model can easily lead to deadlock, but that might not occur until the network between computers becomes just a tiny bit busier. Deadlock can happen in CSP too, but it's basically guaranteed to happen every time, if the system has accidentally been architected to deadlock. This shows up in testing quite readily (so at least you know early on!).
As I alluded to earlier, CSP also doesn't deceive you into thinking there are enough compute resources in a system. If a sender has a strict schedule to keep and the recipient(s) aren't keeping up, the sender ends up being blocked trying to send instead of waiting for fresh input. It's easy to detect that the real-time requirement has not been met. Whereas with the Actor model, the send launches messages off into some buffer, and so long as the receiver(s) on average keep up, all appears to be OK. However, you have no visibility of whether messages are building up inside (in this case) ZeroMQ's own buffers, so there is little notice of a trending problem in the overall system.

How to create two UDP sockets where one sends requests and the other receives the answers?

I'm looking for a proper way to have one goroutine sending out request packets to specific servers while a second goroutine receives the responses and handles them, maybe even creating a new goroutine to handle each response.
The architecture of the game is that there are multiple masterservers, which can be asked for ip lists of registered servers.
After getting the ips and ports from the masterservers, each of the ips gets a request for its data, like server name, map, players, etc.
Also, are there better ways to handle this?
Currently I am creating a goroutine per request that also waits for a response afterwards.
The wait for a response times out after 35ms, and the client then sends 1.2 times the previous number of request packets to produce a small burst of requests. The timeout is also doubled on every retry.
I'd like to know if there are better strategies that have proven to be more robust and have a lower latency, that are not too complex.
Edit:
I only create the client-side sockets, but, if there is no better approach, I would have a client that sends UDP request packets containing a different socket's address as the sender value, so that the answers are received on a different socket that acts somewhat like a server where all the response packets are collected. The point is to separate the sending socket from the receiving socket.
This question is tagged as client-server because one of the sockets is supposed to act like a server, even though all it does is receive expected answers in response to request packets sent by the client socket.

Data structure used for a buffer

I attended a developer interview recently and I was asked the following question:
I have a server that can handle 20 requests. Which data structure is used to model this? What will happen if there are more than 20 requests? i.e., what will you do in case of buffer overflow?
I am not from a CS background; I am transitioning from a different field and am self-taught in programming and DSA. So I would like to know the answers to these questions. Thanks in advance!
Regarding a server that can handle 20 simultaneous requests:
Your question indicates that you are not yet thinking about this in a reasonable way and are probably quite far from understanding how it works. No problem -- it just means that maybe you have more to learn than you expect.
To help you along, I will write you the correct answer, full of terms you can google for:
When a client attempts to connect to your server, the kernel puts its request into a 'listen queue' attached to your server's listening 'socket'.
When your server is ready to service a request, it 'accepts' a connection from the listening socket, which creates a new socket for the communication between the client and server, and the server then processes the request.
If your server can handle 20 simultaneous requests, it typically means that it can have up to 20 threads processing connections at the same time. That is usually accomplished by using a 'thread pool' of limited size. When a thread in the pool is available, it gets a new connection from the listening socket (it might have to wait for one) and processes it, and it is only the fact that there are at most 20 of these threads that limits the number of requests you will handle simultaneously. (Nothing to do with a buffer of any kind, really.)
If the server is already processing 20 simultaneous requests when a new one comes in, then the client's request will wait in the socket listen queue until the server eventually picks it up, or it will timeout and fail if it has been waiting too long.
There is also a limit (the TCP backlog) on the number of connection requests that can be waiting in the listen queue. If a connection request comes in when the listen queue is full, it is immediately rejected. If you want your server to handle 20 simultaneous requests, then the listen queue should have length at least 20 in case 20 requests arrive at the same time -- they will all get queued until your server picks them up.
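A sketch of that arrangement (the port and the canned response are illustrative): a listen queue of 20 plus a pool of 20 threads, each of which picks up the next queued connection when it is free:
import socket
import threading

def worker(server):
    while True:
        conn, _addr = server.accept()        # waits for a queued connection
        with conn:
            conn.recv(4096)                  # read the request
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8080))
server.listen(20)                            # the TCP backlog: the kernel's listen queue

threads = [threading.Thread(target=worker, args=(server,)) for _ in range(20)]
for t in threads:
    t.start()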

How can I limit total concurrent subscriber connections to a ZeroMQ publisher endpoint?

When building a pub-sub service using ZeroMQ on a Linux system, is there any way to enforce concurrent subscriber limits?
For example, I might want to create a ZeroMQ publisher service on a resource-limited system, and want to prevent overloading the system by setting a limit of, say, 100 concurrent connections to the tcp publisher endpoint. After that limit is reached, all subsequent connection attempts from ZeroMQ subscribers would fail.
I understand ZeroMQ doesn't provide notifications about connect/disconnect, but I've been looking for socket options that might allow such limits -- so far, no luck.
Or is this something that should be handled at some other level, perhaps within the protocol?
Yes, ZeroMQ is a Can-Do messaging framework:
Besides the trivial Formal Communication Pattern Framework elements ( the library primitives ), the strongest power behind ZeroMQ is the ability to develop one's own messaging system(s).
In your case, it is enough to enrich the scene with a few additional things: a SUB-process -> PUB-process message-flow channel, so as to allow the PUB-side process to count the number of SUB-process instances concurrently connected, and to allow for a disconnect once the limit is reached ( a step delegated rather "back" to the SUB-process side as a suicide move, as the classical PUB-process, intentionally, has no instrumentation to manage subscriptions ).
Plus, add some dynamics for inter-node signalling to start re-counting, and/or equip the SUB-process side(s) with a self-advertising mechanism that pushes keep-alive signals to the PUB side. Expect this signalling to be a weak, informative-only indication, as there are many real-world situations where a decentralised node simply fails to deliver a "guaranteed-delivery" message, and a well-designed, distributed, low-latency, high-performance system has to cope well with this reality and have self-healing state-recovery policies designed and built into its own behaviour.
The ZeroMQ library is better thought of as a very powerful LEGO tool-box for designing cool distributed systems than as a ready-made, batteries-included, stiff, quasi-solution for just a few academic cases ( well, it might be considered such, but only for a no-brainer's life, while our lives are much more colourful & teasing, aren't they? )
So, "How to?"
It is definitely worth a few days to read both of Pieter Hintjens' books, and a few weeks to shift one's mind to start designing with ZeroMQ's full powers on one's side.
Plus just a few Python add-on habits: an early zmq.Context() setup, and not forgetting a finally: aContext.term().
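Those habits in a minimal sketch (the PUB socket and endpoint are just an example):
import zmq

aContext = zmq.Context()
aSocket = aContext.socket(zmq.PUB)
try:
    aSocket.bind("tcp://*:5556")
    # ... the actual PUB-side work goes here ...
finally:
    aSocket.close(linger=0)
    aContext.term()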
There's no way that I'm aware of to configure ZMQ to limit connections automatically... however, you have other options to accomplish what you're looking for. Perhaps the "traditional" way to accomplish this is with a second set of "network communication" sockets... perhaps REQ/REP from subscriber to publisher, asking for permission to connect.
You also have the option, depending on your version of ZMQ (and I've never used it and I can't find it in 5 minutes of searching, so I don't know how recent your version must be) to use XPUB/XSUB sockets, which can accomplish bi-directional communication. You can connect with XSUB, send a subscribe request, then receive a positive or negative response (you might have to play with your subscriber topics to communicate directly with just the single subscriber, I'm not sure), and react accordingly.
Either way, you'll be allowing a connection of some sort between the two systems and then either allowing it or terminating it depending on the situation. This could be less than completely ideal since you'll have to carve out a little overhead to handle connections that you'll be refusing... let's say you're saturated at 100 clients and all of a sudden get 100 new subscribe requests... you may or may not be able to cope with that sort of burst traffic.
You can also test out the overhead of alternative communication mediums... for instance, you could publish a web service that indicates subscriber status for a client to check first, but having clients connect that way may not be any better.
If you're absolutely at the limit of your resources, you'll have to set up a second server to handle subscriber status:
Server 1 is your publisher. You could set it up with a PUB socket and a REP socket.
Server 2 is your status server. It has a REQ socket. Have it subscribe to something like "system-status" or some such thing as that. It will also have your mechanism for communicating with new subscribers, be that a ZMQ socket or a web service or whatever else.
A client will request status from your status server. The status server will send a request to your publisher, which will increment its subscriber count and reply with success, or keep its subscriber count and reply with failure. This success or failure will be communicated back to the subscriber, which will use that information to connect or not.
Disconnections will have to be communicated in a similar way... and you'll have to use some sort of heartbeating round-robin to confirm clients weren't a victim of catastrophic failure.
This will allow your publisher to make intelligent choices about whether it has resources or not. If you just want to set a static number, you don't even need the connection between the status server and the publisher, you can just keep count on the status server... but just to ensure the overall health of the network then it's probably best not to go that simplistic route.
Anyway, those are just some ideas to accomplish what you're looking for. ZMQ gives you options with which to craft your solutions moreso than actual solutions.
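As one concrete illustration of the "ask for permission first" route (the ports, message strings and 100-slot limit are all assumptions), the publisher could run a REP socket next to its PUB socket and keep a simple count:
import zmq

MAX_SUBSCRIBERS = 100

ctx = zmq.Context()
pub = ctx.socket(zmq.PUB)
pub.bind("tcp://*:5556")
gate = ctx.socket(zmq.REP)                   # side channel for connect/disconnect requests
gate.bind("tcp://*:5557")

active = 0
while True:
    request = gate.recv_string()             # "connect" or "disconnect"
    if request == "connect" and active < MAX_SUBSCRIBERS:
        active += 1
        gate.send_string("ok")               # subscriber may now connect its SUB socket
    elif request == "connect":
        gate.send_string("refused")          # over the limit; subscriber should back off
    else:
        active = max(0, active - 1)
        gate.send_string("bye")
A subscriber would send "connect" first and only create its SUB socket on an "ok", and send "disconnect" when it goes away (with heartbeats to cover crashed clients, as noted above).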

How does a non-forking web server work?

Non-forking (aka single-threaded or select()-based) webservers like lighttpd or nginx are gaining in popularity more and more.
While there is a multitude of documents explaining forking servers (at various levels of detail), documentation for non-forking servers is sparse.
I am looking for a bird's-eye view of how a non-forking web server works. (Pseudo-)code or a state machine diagram, stripped down to the bare minimum, would be great.
I am aware of the following resources and found them helpful:
The World of SELECT()
thttpd source code
Lighttpd internal states
However, I am interested in the principles, not implementation details.
Specifically:
Why is this type of server sometimes called non-blocking, when select() essentially blocks?
Processing of a request can take some time. What happens with new requests during this time when there is no specific listener thread or process? Is the request processing somehow interrupted or time sliced?
Edit:
As I understand it, while a request is processed (e.g. a file is read or a CGI script is run) the server cannot accept new connections. Wouldn't this mean that such a server could miss a lot of new connections if a CGI script runs for, let's say, 2 seconds or so?
Basic pseudocode:
setup
while true
    select/poll/kqueue
    with fd needing action do
        read/write fd
        if fd was read and well formed request in buffer
            service request
    other stuff
Though select() & friends block, socket I/O is not blocking. You're only blocked until you have something fun to do.
Processing an individual request normally involves reading from a file descriptor backed by a file (static resource) or a process (dynamic resource) and then writing to the socket. This can be done handily without keeping much state.
So service request above typically means opening a file, adding it to the list for select, and noting that stuff read from there goes out to a certain socket. Substitute FastCGI for file when appropriate.
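To make the pseudocode concrete, here is a small sketch using Python's selectors module (the canned response and port are placeholders); one thread multiplexes the listening socket and every client socket:
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(listener):
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)                   # will not block: select said it is readable
    if data:                                 # "well formed request in buffer"
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    else:                                    # peer closed the connection
        sel.unregister(conn)
        conn.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("0.0.0.0", 8080))
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, accept)

while True:                                  # the select loop
    for key, _events in sel.select():        # blocks until some fd is ready
        key.data(key.fileobj)                # dispatch to accept() or handle()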
EDIT:
Not sure about the others, but nginx has 2 processes: a master and a worker. The master does the listening and then feeds the accepted connection to the worker for processing.
select() PLUS nonblocking I/O essentially allows you to manage/respond to multiple connections as they come in a single thread (multiplexing), versus having multiple threads/processes handle one socket each. The goal is to minimize the ratio of server footprint to number of connections.
It is efficient because a single thread can serve a large number of active socket connections before it becomes saturated (since we can do nonblocking I/O on multiple file descriptors).
The rationale is that it takes very little time to acknowledge bytes are available, interpret them, then decide on the appropriate bytes to put on the output stream. The actual I/O work is handled without blocking this server thread.
This type of server is always waiting for a connection, by blocking on select(). Once it gets one, it handles the connection, then revisits the select() in an infinite loop. In the simplest case, this server thread does NOT block any other time besides when it is setting up the I/O.
If there is a second connection that comes in, it will be handled the next time the server gets to select(). At this point, the first connection could still be receiving, and we can start sending to the second connection, from the very same server thread. This is the goal.
Search for "multiplexing network sockets" for additional resources.
Or try Unix Network Programming by Stevens, Fenner, Rudoff
