How does a non-forking web server work? - algorithm

Non-forking (aka single-threaded or select()-based) web servers like lighttpd or nginx are
becoming increasingly popular.
While there is a multitude of documents explaining forking servers (at
various levels of detail), documentation for non-forking servers is sparse.
I am looking for a bird's-eye view of how a non-forking web server works.
(Pseudo-)code or a state machine diagram, stripped down to the bare
minimum, would be great.
I am aware of the following resources and found them helpful:
The World of SELECT()
thttpd source code
Lighttpd internal states
However, I am interested in the principles, not implementation details.
Specifically:
Why is this type of server sometimes called non-blocking, when select() essentially blocks?
Processing of a request can take some time. What happens with new requests during this time when there is no specific listener thread or process? Is the request processing somehow interrupted or time sliced?
Edit:
As I understand it, while a request is processed (e.g. file read or CGI script run) the
server cannot accept new connections. Wouldn't this mean that such a server could miss a lot
of new connections if a CGI script runs for, let's say, 2 seconds or so?

Basic pseudocode:
    setup
    while true
        select/poll/kqueue
        with fd needing action do
            read/write fd
            if fd was read and well formed request in buffer
                service request
            other stuff
Though select() & friends block, socket I/O is not blocking. You're only blocked until you have something fun to do.
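A small sketch of that distinction, assuming Python's standard socket and select modules and a local socketpair() standing in for a real client connection (the variable names are illustrative only). It shows that a nonblocking recv() fails immediately when no data is there, while select() is the one call that tells you when a read will succeed:

```python
import select
import socket

# A connected pair of sockets stands in for a client and a server connection.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)  # recv() will never wait for data

# Nothing has been sent yet: a nonblocking recv() fails immediately.
try:
    server_side.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# select() with a zero timeout reports that nothing is ready.
ready_before, _, _ = select.select([server_side], [], [], 0)

client_side.sendall(b"GET / HTTP/1.0\r\n\r\n")

# Now select() reports the socket as readable, so recv() returns at once.
ready_after, _, _ = select.select([server_side], [], [], 1.0)
data = server_side.recv(1024)

print(would_block, ready_before, bool(ready_after), data)

server_side.close()
client_side.close()
```

In a server, the select() call would normally have no timeout (or a long one), so the process sleeps only when no connection needs attention.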
Processing an individual request normally involves reading from a file descriptor for a file (static resource) or a process (dynamic resource) and then writing to the socket. This can be done handily without keeping much state.
So service request above typically means opening a file, adding it to the list for select, and noting that stuff read from there goes out to a certain socket. Substitute FastCGI for file when appropriate.
EDIT:
Not sure about the others, but nginx has at least 2 processes: a master and one or more workers. The master opens the listening sockets and manages the workers; the workers accept connections on those sockets and process them.

select() plus nonblocking I/O essentially allows you to manage and respond to multiple connections as they come in, all in a single thread (multiplexing), versus having multiple threads or processes handle one socket each. The goal is to minimize the ratio of server footprint to number of connections.
It is efficient because a single thread can do nonblocking I/O on many file descriptors at once, so it takes a high number of active socket connections to saturate it.
The rationale is that it takes very little time to acknowledge bytes are available, interpret them, then decide on the appropriate bytes to put on the output stream. The actual I/O work is handled without blocking this server thread.
This type of server is always waiting for a connection, by blocking on select(). Once it gets one, it handles the connection, then revisits the select() in an infinite loop. In the simplest case, this server thread does NOT block any other time besides when it is setting up the I/O.
If there is a second connection that comes in, it will be handled the next time the server gets to select(). At this point, the first connection could still be receiving, and we can start sending to the second connection, from the very same server thread. This is the goal.
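The loop described above could look roughly like the following sketch, assuming Python's standard selectors module. The function names (make_listener, accept, handle, run_once) are made up for illustration, and the "request processing" is a toy that upper-cases whatever it receives; replies are assumed to be small enough to fit the socket buffer in one send().

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(listener):
    # A new connection is ready: accept it and watch it for incoming data.
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)
    if data:
        # Toy "request processing": reply with the upper-cased request.
        # A real server would buffer output and wait for EVENT_WRITE.
        conn.send(data.upper())
    else:
        # Empty read: the peer closed the connection.
        sel.unregister(conn)
        conn.close()

def make_listener(host="127.0.0.1", port=0):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))   # port 0 lets the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, accept)
    return listener

def run_once(timeout=1.0):
    # The only place the thread blocks: waiting for some socket to be ready.
    for key, _events in sel.select(timeout):
        key.data(key.fileobj)     # call accept() or handle() for that socket
```

A server would call make_listener() once and then run_once() in an endless loop. Connections that arrive while a handler runs wait in the kernel's listen backlog and are picked up on the next pass, which is why each handler must stay short.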
Search for "multiplexing network sockets" for additional resources.
Or try Unix Network Programming by Stevens, Fenner, and Rudoff.

Related

Are SNMP requests sequential - is there a chance they can arrive in multiples

I am writing an SNMP agent and plan to have it process SNMP requests one by one. That is, when a request arrives at port 161, the agent will not accept any further request until the response has been sent or a timeout occurs.
I am not sure how many SNMP clients behave, but are SNMP requests synchronous and sequential, or is there any way they can arrive in bulk at the same time?
I think SNMP queries can easily come in bursts due to multiple independent managers polling your agent and/or a single anxious manager retrying the same command if your agent is not quick enough to respond.
When it comes to writing SNMP agents, the other consideration would be to estimate the maximum possible time for the agent to gather the data required to respond. I believe it should not be the OID-average, but the OID-maximum. In other words, should your agent serve 100 OIDs, out of which querying one "slow" OID would cause the entire (synchronous) agent to block and stop serving others - this situation might undermine the credibility of your agent on the network...
On top of that, if you happen to hit the same slow OID multiple times in a row (e.g. manager retries), the delay might accumulate, effectively blocking out other queries.
To summarize: I think high-performance SNMP agent should have the following traits:
Support massively concurrent SNMP commands processing
Have non-blocking data source access for gathering managed objects data
Have some form of caching or rate limiting to protect computationally expensive data sources from cocky SNMP managers
On the other hand, if your SNMP agent is serving a small piece of static data on a low-power hardware and you do not expect too many managers ever talking to you, perhaps you could get away with a simplistic synchronous SNMP agent...
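The caching trait from the list above can be sketched as a small time-to-live cache placed in front of a slow data source. This is only an illustration: the TTLCache class, the fetch callback, and the 5-second default are assumptions, not part of any SNMP library.

```python
import time

class TTLCache:
    """Caches the result of a slow lookup per key for ttl seconds."""

    def __init__(self, fetch, ttl=5.0, clock=time.monotonic):
        self._fetch = fetch      # slow data-source function (hypothetical)
        self._ttl = ttl          # how long a cached value stays valid
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: do not touch the data source
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With this in place, a manager that retries the same slow OID several times within the TTL triggers only one expensive lookup.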
BTW, BSD sockets interface would hold a queue of unprocessed UDP packets so your agent would have a chance to catch up.
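The queueing behaviour can be observed with a short experiment, sketched here with Python's socket module on the loopback interface. The names agent and manager and the payloads are placeholders, and no real SNMP encoding is involved. Note that the queue is bounded by the socket's receive buffer, so a large enough burst would still lose datagrams.

```python
import socket

agent = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
agent.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
agent.settimeout(2.0)                 # do not hang forever if something is lost
address = agent.getsockname()

manager = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# A burst of requests arrives before the agent reads anything.
for i in range(5):
    manager.sendto(b"request-%d" % i, address)

# The agent now drains the kernel's receive queue one datagram at a time.
received = [agent.recvfrom(2048)[0] for _ in range(5)]
print(received)

manager.close()
agent.close()
```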
The premise of your question is flawed, as there is no concept of "coming in bulk at a single time" — no matter in which order the UDP datagrams making up an SNMP packet are received, and no matter how long a duration lies between the receipt of each packet by your network interface, your operating system will present the SNMP packets to you in receipt order, in sequence. You have one listen port, and one read buffer. So this synchronicity is already how network data processing works and you shouldn't worry about it.
I would say though, that if you are waiting for some resource to become available while processing an SNMP request (as suggested by your use of the word "timeout"), you probably ought to get on and start processing your other pending SNMP requests in the meantime, or you risk your whole stack grinding to a halt. It's not fair to make a manager wait some unknown duration for a response to request B just because some other manager made a request A that is experiencing a delay in being serviced. That being said, you probably do want some upper limit on how many requests can be serviced at any one time, to prevent potential DDoSsing — choosing this value can only be done by you, with your knowledge of the use case and the ecosystem.
A Get request usually asks for a single OID (although it may carry several variable bindings), whereas a GetBulk request can ask for several OIDs in one request. An SNMP client can also work in async mode, sending multiple requests at minimal intervals and then waiting for the replies.
Packets can also arrive out of order due to network delays and equal-cost routes. You can experiment by sending requests with snmpget, snmpbulkget, and snmpbulkwalk, and use tcpdump to see what is on the wire.
So, in general, your agent has to be ready to accept bursts of requests.
For simplicity, if the request rate is low and your agent can reply fast enough, you can use one-by-one processing. Some requests may fail in this case, but clients can retry and eventually get a reply from the agent.

Broadcast Server

I am writing a TCP server that accepts connections from multiple clients. The server gathers data from the system it is running on and transmits it to every connected client.
What design patterns would be best for this situation?
Example
Put all connections in an array, then loop through the array and send the data to each client one by one. Advantage: very easy to implement. Disadvantage: not very efficient when handling large amounts of data.
An easier way is to use some existing software to do this ... For example use https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=d5bedadd-e46f-4c97-af89-22d65ffee070 .
If you want to write your own, you will need a list (e.g. a linked list) to manage the connections.
Here is an example of a server http://examples.oreilly.com/jenut/Server.java
If you want to handle large amounts of data, one of the techniques is to have a queue associated with each of the subscribers at the server end. A multi-threaded program can send the data to the clients from those queues.
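A rough sketch of the queue-per-subscriber idea, written single-threaded with nonblocking sockets rather than with sender threads. The Broadcaster class and its method names are invented for this example, and error handling for disconnected clients is left out.

```python
import collections
import socket

class Broadcaster:
    def __init__(self):
        self.queues = {}   # socket -> deque of pending byte strings

    def add(self, conn):
        conn.setblocking(False)
        self.queues[conn] = collections.deque()

    def remove(self, conn):
        self.queues.pop(conn, None)
        conn.close()

    def broadcast(self, payload):
        # Queue the data for every client; nothing is sent yet.
        for pending in self.queues.values():
            pending.append(payload)

    def flush(self, conn):
        """Send as much queued data as conn accepts; return True when empty."""
        pending = self.queues[conn]
        while pending:
            try:
                sent = conn.send(pending[0])
            except BlockingIOError:
                return False                      # buffer full: retry later
            if sent < len(pending[0]):
                pending[0] = pending[0][sent:]    # keep the unsent tail
            else:
                pending.popleft()
        return True
```

Because each client has its own queue, one slow client only delays its own data. flush() would typically be called whenever select() or poll() reports that client's socket as writable.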
A number of patterns have been developed for distributed processing and servers, for instance in the ACE project: http://www.cs.wustl.edu/~schmidt/patterns-ace.html. The design might be focused around the events which announce either that data has been received and may be read, or that buffers have been emptied and more data may now be written. At least in the days when a 32-bit address space was the rule, you could have many more open connections than you had threads, so you would typically have a small number of threads waiting for events which announced that they could safely read or write without stalling until the other side co-operated. This may come from events, or from calls such as select() or poll() terminating. Patterns are also associated with http://zguide.zeromq.org/page:all#-MQ-in-a-Hundred-Words.

Client to server connection only sending not receiving

This is my case, I have a server listening for connections, and a client that I'm programming now. The client has nothing to receive from the server, yet it has to be sending status updates every 3 minutes.
I have the following at the moment:
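WSAStartup(0x101, &ws);                        /* request Winsock 1.1 */
sock = socket(AF_INET, SOCK_STREAM, 0);        /* TCP socket */
sa.sin_family = AF_INET;
sa.sin_port = htons(PORT_NET);
sa.sin_addr.s_addr = inet_addr("127.0.0.1");
connect(sock, (SOCKADDR*)&sa, sizeof(sa));     /* connect to the local server */
send(sock, (const char*)buffer, 128, 0);       /* flags is an int, so pass 0 rather than NULL */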
How should my approach be? Can I avoid looping recv?
That's rather dependent on what behaviour you want and your program structure.
By default a socket will block on any read or write operation, which means that if you try to have your server's main thread poll the connection, it will end up 'freezing' for 3 minutes or until the client closes the connection.
The absolute simplest functional solution (no multithreading) is to set the socket to non-blocking and poll it in the main thread. It sounds like you want to avoid doing that, though.
The most obvious way around that is to make a dedicated thread for every connection, plus the main listener socket. Your server listens for incoming connections and spawns a thread for each stream socket it creates. Then each connection thread blocks on its socket until it receives data, and either handles it itself or shunts it onto a shared queue.
That's a bulky and complex solution - multiple threads which need opening and closing, shared resources which need protecting.
Another option is to set the socket to non-blocking (under Win32, use ioctlsocket with FIONBIO, or setsockopt with SO_RCVTIMEO to set a receive timeout; under *nix, set the O_NONBLOCK flag with fcntl). That way it will return control if there's no data available to read. However, that means you need to poll the socket at reasonable intervals ("reasonable" being entirely up to you, and how quickly you need the server to act on new data).
Personally, for the lightweight use you're describing I'd use a combination of the above: a single dedicated thread which polls a socket (or an array of nonblocking sockets) every few seconds, sleeping in between, and simply pushes the data onto a queue for the main thread to act upon during its main loop.
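That combination might be sketched as follows, using Python's threading and queue modules instead of Win32 calls. The function names and the 2-second default interval are assumptions made for the example.

```python
import queue
import socket
import threading

def poll_sockets(socks, inbox, stop, interval):
    """Poll nonblocking sockets, pushing (socket, data) onto inbox."""
    for s in socks:
        s.setblocking(False)
    while not stop.is_set():
        for s in socks:
            try:
                data = s.recv(4096)
            except BlockingIOError:
                continue              # nothing to read on this socket yet
            if data:                  # empty data means the peer closed
                inbox.put((s, data))
        stop.wait(interval)           # sleep between polls, wake early on stop

def start_poller(socks, interval=2.0):
    inbox = queue.Queue()             # thread-safe hand-off to the main thread
    stop = threading.Event()
    thread = threading.Thread(
        target=poll_sockets, args=(socks, inbox, stop, interval), daemon=True)
    thread.start()
    return inbox, stop, thread
```

The main loop can then call inbox.get_nowait() on each pass and catch queue.Empty, so it never blocks on the network itself.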
There are a lot of ways to get into a mess with asynchronous programs, so it's probably best to keep it simple and get it working, until you're comfortable with the control flow.

Listening to multiple sockets: select vs. multi-threading

A server needs to listen to incoming data from several sockets (10-20). After some initializations, those sockets are created and do not change (i.e. no new sockets accepted, and none of them is expected to close during the lifetime of the server).
One option is to select() on all sockets, then deal with incoming data per socket (i.e. route to proper handling function).
Another option is to open one thread per socket and let each thread recv() and handle the input.
(The first option has the benefit of setting a timeout, but this is not an issue in this case,
since all the sockets are quite active).
Assuming a Windows server with enough memory that 20MB (for the 20 threads) is a non-issue, is either of these options expected to be faster than the other?
There's not much in it for your app. Typically, using a thread-per-socket is easier than asynchronous approaches because it's a simpler overall structure and it's easier to maintain state.
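For comparison, the thread-per-socket structure can be sketched in a few lines. This uses Python's threading module rather than Win32 threads, and reader, start_readers, and the handler callbacks are hypothetical names.

```python
import socket
import threading

def reader(sock, handler):
    """Block on one socket and hand every chunk of data to its handler."""
    while True:
        data = sock.recv(4096)        # blocks only this thread
        if not data:
            break                     # peer closed the connection
        handler(data)

def start_readers(handlers):
    """handlers maps each socket to the function that processes its input."""
    threads = []
    for sock, handler in handlers.items():
        t = threading.Thread(target=reader, args=(sock, handler), daemon=True)
        t.start()
        threads.append(t)
    return threads
```

Each thread keeps its per-socket state in ordinary local variables, which is the simplicity the answer refers to. Handlers that touch shared data would still need locking.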

Is this a scalable named pipe server implementation?

Looking at this example of named pipes using Overlapped I/O from MSDN, I am wondering how scalable it is. If you spin up a dozen clients and have them hit the server 10x/sec, with each batch of 10 being immediately after the other, before sleeping for a full second, it seems that eventually some of the instances of the client are starved.
Server implementation:
http://msdn.microsoft.com/en-us/library/aa365603%28VS.85%29.aspx
Client implementation (assuming call is made 10x/sec, and there are a dozen instances).
http://msdn.microsoft.com/en-us/library/aa365603%28VS.85%29.aspx
The fact that the web page points out that:
The pipe server creates a fixed number of pipe instances.
and
Although the example shows simultaneous operations on different pipe instances, it avoids simultaneous operations on a single pipe instance by using the event object in the OVERLAPPED structure. Because the same event object is used for read, write, and connect operations for each instance, there is no way to know which operation's completion caused the event to be set to the signaled state for simultaneous operations using the same pipe instance
you can probably safely assume that it's not as scalable as it could be; it's an API usage example after all; demonstration of functionality is usually the most important design constraint for such code.
If you need 12 clients making 10 connections per second then I'd personally have the server able to handle MORE than just 12 clients to allow for the period when the server is preparing for a new client to connect... Personally I'd switch to using sockets but that's just me (and I'm skewed that way because I've done lots of high performance sockets work and so have all the code)...

Resources