How to reduce long WebSocket IO pauses? - go

I have a tool called Tendermint, which is written in Golang. It processes transactions and creates blocks (details are intentionally omitted). Transactions can be submitted through the WebSocket server. Blocks are configured to be created ~ every second.
Now, when I open two or more WS connections and submit more transactions than the application can handle, periodically, Tendermint gets stuck.
During this time, it does not create any blocks, but instead spends the significant portion of its time handling WebSocket IO.
I still don't understand the exact nature of these pauses. Maybe someone here knows or can ask the right questions? Also, I'm wondering what the ways to limit the IO are? Throttle each connection?
NOTE: I'm using https://github.com/gorilla/websocket for WebSockets. Our WS server can be found here.
Thank you for your time!
UPD 1: I've managed to flatten the pauses by batching responses in our WS server (see https://github.com/tendermint/tendermint/issues/3905#issuecomment-684860429)

Related

Front-facing REST API with an internal message queue?

I have created a REST API - in a few words, my client hits a particular URL and she gets back a JSON response.
Internally, quite a complicated process starts when the URL is hit, and there are various services involved as a microservice architecture is being used.
I was observing some performance bottlenecks and decided to switch to a message queue system. The idea is that now, once the user hits the URL, a request is published on internal message queue waiting for it to be consumed. This consumer will process and publish back on a queue and this will happen quite a few times until finally, the same node servicing the user will receive back the processed response to be delivered to the user.
An asynchronous "fire-and-forget" pattern is now being used. But my question is, how can the node servicing a particular person remember who it was servicing once the processed result arrives back and without blocking (i.e. it can handle several requests until the response is received)? If it makes any difference, my stack looks a little like this: TomCat, Spring, Kubernetes and RabbitMQ.
In summary, how can the request node (whose job is to push items on the queue) maintain an open connection with the client who requested a JSON response (i.e. client is waiting for JSON response) and receive back the data of the correct client?
You have few different scenarios according to how much control you have on the client.
If the client behaviour cannot be changed, you will have to keep the session open until the request has not been fully processed. This can be achieved employing a pool of workers (futures/coroutines, threads or processes) where each worker keeps the session open for a given request.
This method has few drawbacks and I would keep it as last resort. Firstly, you will only be able to serve a limited amount of concurrent requests proportional to your pool size. Lastly as your processing is behind a queue, your front-end won't be able to estimate how long it will take for a task to complete. This means you will have to deal with long lasting sessions which are prone to fail (what if the user gives up?).
If the client behaviour can be changed, the most common approach is to use a fully asynchronous flow. When the client initiates a request, it is placed within the queue and a Task Identifier is returned. The client can use the given TaskId to poll for status updates. Each time the client requests updates about a task you simply check if it was completed and you respond accordingly. A common pattern when a task is still in progress is to let the front-end return to the client the estimated amount of time before trying again. This allows your server to control how frequently clients are polling. If your architecture supports it, you can go the extra mile and provide information about the progress as well.
Example response when task is in progress:
{"status": "in_progress",
"retry_after_seconds": 30,
"progress": "30%"}
A more complex yet elegant solution would consist in using HTTP callbacks. In short, when the client makes a request for a new task it provides a tuple (URL, Method) the server can use to signal the processing is done. It then waits for the server to send the signal to the given URL. You can see a better explanation here. In most of the cases this solution is overkill. Yet I think it's worth to mention it.
One option would be to use DeferredResult provided by spring but that means you need to maintain some pool of threads in request serving node and max no. of active threads will decide the throughput of your system. For more details on how to implement DeferredResult refer this link https://www.baeldung.com/spring-deferred-result

At what point are WebSockets less efficient than Polling?

While I understand that the answer to the above question is somewhat determined by your application's architecture, I'm interested mostly in very simple scenarios.
Essentially, if my app is pinging every 5 seconds for changes, or every minute, around when will the data being sent to maintain the open Web Sockets connection end up being more than the amount you would waste by simple polling?
Basically, I'm interested in if there's a way of quantifying how much inefficiency you incur by using frameworks like Meteor if an application doesn't necessarily need real-time updates, but only periodic checks.
Note that my focus here is on bandwidth utilization, not necessarily database access times, since frameworks like Meteor have highly optimized methods of requesting only updates to the database.
The whole point of a websocket connection is that you don't ever have to ping the app for changes. Instead, the client just connects once and then the server can just directly send the client changes whenever they are available. The client never has to ask. The server just sends data when it's available.
For any type of server-initiated-data, this is way more efficient with bandwidth than http polling. Besides giving you much more timely results (the results are delivered immediately rather than discovered by the client only on the next polling interval).
For pure bandwidth usage, the details would depend upon the exact circumstances. An http polling request has to set up a TCP connection and confirm that connection (even more data if its an SSL connection), then it has to send the http request, including any relevant cookies that belong to that host and including relevant headers and the GET URL. Then, the server has to send a response. And, most of the time all of this overhead of polling will be completely wasted bandwidth because there's nothing new to report.
A webSocket starts with a simple http request, then upgrades the protocol to the webSocket protocol. The webSocket connection itself need not send any data at all until the server has something to send to the client in which case the server just sends the packet. Sending the data itself has far less overhead too. There are no cookies, no headers, etc... just the data. Even if you use some keep-alives on the webSocket, that amount of data is incredibly tiny compared to the overhead of an HTTP request.
So, how exactly much you would save in bandwidth depends upon the details of the circumstances. If it takes 50 polling requests before it finds any useful data, then every one of those http requests is entirely wasted compared to the webSocket scenario. The difference in bandwidth could be enormous.
You asked about an application that only needs periodic checks. As soon as you have a periodic check that results in no data being retrieved, that's wasted bandwidth. That's the whole idea of a webSocket. You consume no bandwidth (or close to no bandwidth) when there's no data to send.
I believe #jfriend00 answered the question very clearly. However, I do want to add a thought.
By throwing in a worst case (and improbable) scenario for Websockets vs. HTTP, you would clearly see that a Websocket connection will always have an advantage in regards to bandwidth (and probably all-round performance).
This is the worst case scenario for Websockets v/s HTTP:
your code uses Websocket connections the exact same way it uses HTTP requests, for polling.
(which isn't something you would do, I know, but it is a worst case scenario).
Every polling event is answered positively - meaning that no HTTP requests were performed in vain.
This is the worst possible situation for Websockets, which are designed for pushing data rather than polling... even in this situation Websockets will save you both bandwidth and CPU cycles.
Seriously, even ignoring the DNS query (performed by the client, so you might not care about it) and the TCP/IP handshake (which is expensive for both the client and the server), a Websocket connection is still more performant and cost-effective.
I'll explain:
Each HTTP request includes a lot of data, such as cookies and other headers. In many cases, each HTTP request is also subject to client authentication... rarely is data given away to anybody.
This means that HTTP connections pass all this data (and possibly perform client authentication) once per request.[Stateless]
However, Websocket connections are stateful. The data is sent only once (instead of each time a request is made). Client authentication occurs only during the Websocket connection negotiation.
This means that Websocket connections pass the same data (and possibly perform client authentication) once per connection (once for all polls).
So even in this worst case scenario, where polling is always positive and Websockets are used for polling instead of pushing data, Websockets will still save your server both bandwidth and other resources (i.e. CPU time).
I think the answer to your question, simply put, is "never". Websockets are never less efficient than polling.

Cassandra throttling workload

I've been recently attempting to send a workload of read operations to a 2-node Cassandra cluster (version 2.0.9, with rf=2). My intention was to send a number of reads at a rate that is higher than the capacity of my backend servers, thereby overwhelming them and resulting in server-side queuing. To do so, I'm using the datastax java driver (cql version 2) to run my operations asynchronously (in other words, the calling thread doesn't block waiting for a response).
The problem is that I'm unable to reach a high-enough sending-rate to overload my backend servers. The # of requests that I'm sending is being somehow throttled by Cassandra. To confirm this, I've ran clients from two different machines simultaneously, and the total number of requests sent per unit time is still peaking at the same value. I'm wondering if there's a mechanism that is employed by Cassandra to throttle the amount of requests that are being received? Otherwise, what else might be causing this behavior?
Each request received by Cassandra will be handled by multiple thread pools implementing a staged event-driven architecture, where requests will be queued for each stage. You can use nodetool tpstats to inspect the current status of each queue. Once too many requests are about to overwhelm the server, Cassandra will shed load by dropping requests once queues are about to reach their capacity. You'll notice this by numbers shown in the dropped section of tpstats. In case no requests are dropped, all of them will eventually complete, but you may see higher latencies using nodetool cfhistograms or WriteTimeoutExceptions on the client.
The network bandwidth from Cassandra side is throttling the amount of requests that are being received.
As far as I know their is no other mechanism employed by Cassandra to prevent itself from receiving too much requests. Timeout Exception is the main mechanism that Cassandra use to avoid crashing when it is overloaded.
Yes, Cassandra has multiple ways to throttle incoming requests. The first action on your part would be to find out which mechanism is the culprit. Then you can tune this mechanism to fit your needs.
The first step to find out where the block occurs, would be to connect to JMX with jconsole or similar and look at the queues and block values.
If I would hazard a guess, check MessagingService for timeouts and dropped messages between nodes. Then check the native transport requests for blocked tasks before the request even get to the stages.

Web server and ZeroMQ patterns

I am running an Apache server that receives HTTP requests and connects to a daemon script over ZeroMQ. The script implements the Multithreaded Server pattern (http://zguide.zeromq.org/page:all#header-73), it successfully receives the request and dispatches it to one of its worker threads, performs the action, responds back to the server, and the server responds back to the client. Everything is done synchronously as the client needs to receive a success or failure response to its request.
As the number of users is growing into a few thousands, I am looking into potentially improving this. The first thing I looked at is the different patterns of ZeroMQ, and whether what I am using is optimal for my scenario. I've read the guide but I find it challenging understanding all the details and differences across patterns. I was looking for example at the Load Balancing Message Broker pattern (http://zguide.zeromq.org/page:all#header-73). It seems quite a bit more complicated to implement than what I am currently using, and if I understand things correctly, its advantages are:
Actual load balancing vs the round-robin task distribution that I currently have
Asynchronous requests/replies
Is that everything? Am I missing something? Given the description of my problem, and the synchronous requirement of it, what would you say is the best pattern to use? Lastly, how would the answer change, if I want to make my setup distributed (i.e. having the Apache server load balance the requests across different machines). I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
Some thoughts about the subject...
Keep it simple
I would try to keep things simple and "plain" ZeroMQ as long as possible. To increase performance, I would simply to change your backend script to send request out from dealer socket and move the request handling code to own program. Then you could just run multiple worker servers in different machines to get more requests handled.
I assume this was the approach you took:
I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
Only problem here is that there is no request retry in the backend. If worker fails to handle given task it is forever lost. However one could write worker servers so that they handle all the request they got before shutting down. With this kind of setup it is possible to update backend workers without clients to notice any shortages. This will not save requests that get lost if the server crashes.
I have the feeling that in common scenarios this kind of approach would be more than enough.
Mongrel2
Mongrel2 seems to handle quite many things you have already implemented. It might be worth while to check it out. It probably does not completely solve your problems, but it provides tested infrastructure to distribute the workload. This could be used to deliver the request to be handled to multithreaded servers running on different machines.
Broker
One solution to increase the robustness of the setup is a broker. In this scenario brokers main role would be to provide robustness by implementing queue for the requests. I understood that all the requests the worker handle are basically the same type. If requests would have different types then broker could also do lookups to find correct server for the requests.
Using the queue provides a way to ensure that every request is being handled by some broker even if worker servers crashed. This does not come without price. The broker is by itself a single point of failure. If it crashes or is restarted all messages could be lost.
These problems can be avoided, but it requires quite much work: the requests could be persisted to the disk, servers could be clustered. Need has to be weighted against the payoffs. Does one want to use time to write a message broker or the actual system?
If message broker seems a good idea the time which is required to implement one can be reduced by using already implemented product (like RabbitMQ). Negative side effect is that there could be a lot of unwanted features and adding new things is not so straight forward as to self made broker.
Writing own broker could covert toward inventing the wheel again. Many brokers provide similar things: security, logging, management interface and so on. It seems likely that these are eventually needed in home made solution also. But if not then single home made broker which does single thing and does it well can be good choice.
Even if broker product is chosen I think it is a good idea to hide the broker behind ZeroMQ proxy, a dedicated code that sends/receives messages from the broker. Then no other part of the system has to know anything about the broker and it can be easily replaced.
Using broker is somewhat developer time heavy. You either need time to implement the broker or time to get use to some product. I would avoid this route until it is clearly needed.
Some links
Comparison between broker and brokerless
RabbitMQ
Mongrel2

Web sockets make ajax/CORS obsolete?

Will web sockets when used in all web browsers make ajax obsolete?
Cause if I could use web sockets to fetch data and update data in realtime, why would I need ajax? Even if I use ajax to just fetch data once when the application started I still might want to see if this data has changed after a while.
And will web sockets be possible in cross-domains or only to the same origin?
WebSockets will not make AJAX entirely obsolete and WebSockets can do cross-domain.
AJAX
AJAX mechanisms can be used with plain web servers. At its most basic level, AJAX is just a way for a web page to make an HTTP request. WebSockets is a much lower level protocol and requires a WebSockets server (either built into the webserver, standalone, or proxied from the webserver to a standalone server).
With WebSockets, the framing and payload is determined by the application. You could send HTML/XML/JSON back and forth between client and server, but you aren't forced to. AJAX is HTTP. WebSockets has a HTTP friendly handshake, but WebSockets is not HTTP. WebSockets is a bi-directional protocol that is closer to raw sockets (intentionally so) than it is to HTTP. The WebSockets payload data is UTF-8 encoded in the current version of the standard but this is likely to be changed/extended in future versions.
So there will probably always be a place for AJAX type requests even in a world where all clients support WebSockets natively. WebSockets is trying to solve situations where AJAX is not capable or marginally capable (because WebSockets its bi-directional and much lower overhead). But WebSockets does not replace everything AJAX is used for.
Cross-Domain
Yes, WebSockets supports cross-domain. The initial handshake to setup the connection communicates origin policy information. The wikipedia page shows an example of a typical handshake: http://en.wikipedia.org/wiki/WebSockets
I'll try to break this down into questions:
Will web sockets when used in all web browsers make ajax obsolete?
Absolutely not. WebSockets are raw socket connections to the server. This comes with it's own security concerns. AJAX calls are simply async. HTTP requests that can follow the same validation procedures as the rest of the pages.
Cause if I could use web sockets to fetch data and update data in realtime, why would I need ajax?
You would use AJAX for simpler more manageable tasks. Not everyone wants to have the overhead of securing a socket connection to simply allow async requests. That can be handled simply enough.
Even if I use ajax to just fetch data once when the application started I still might want to see if this data has changed after a while.
Sure, if that data is changing. You may not have the data changing or constantly refreshing. Again, this is code overhead that you have to account for.
And will web sockets be possible in cross-domains or only to the same origin?
You can have cross domain WebSockets but you have to code your WS server to accept them. You have access to the domain (host) header which you can then use to accept / deny requests. This can, however, be spoofed by something as simple as nc. In order to truly secure the connection you will need to authenticate the connection by other means.
Websockets have a couple of big downsides in terms of scalability that ajax avoids. Since ajax sends a request/response and closes the connection (..or shortly after) if someone stays on the web page it doesn't use server resources when idling. Websockets are meant to stream data back to the browser, and they tie up server resources to do so. Servers have a limit in how many simultaneous connections they can keep open at one time. Not to mention depending on your server side technology, they may tie up a thread to handle the socket. So websockets have more resource intensive requirements for both sides per connection. You could easily exhaust all of your threads servicing clients and then no new clients could come in if lots of users are just sitting on the page. This is where nodejs, vertx, netty can really help out, but even those have upper limits as well.
Also there is the issue of state of the underlying socket, and writing the code on both sides that carry on the stateful conversation which isn't something you have to do with ajax style because it's stateless. Websockets require you create a low level protocol which is solved for you with ajax. Things like heart beating, closing idle connections, reconnection on errors, etc are vitally important now. These are things you didn't have to solve when using AJAX because it was stateless. State is very important to the stability of your app and more importantly the health of your server. It's not trivial. Pre-HTTP we built a lot of stateful TCP protocols (FTP, telnet, SSH), and then HTTP happened. And no one did that stuff much anymore because even with its limitations HTTP was surprisingly easier and more robust. Websockets bring back the good and the bad of stateful protocols. You'll learn soon enough if you didn't get a dose of that last go around.
If you need streaming of realtime data this extra overhead is warranted because polling the server to get streamed data is worse, but if all you are doing is user interaction->request->response->update UI, then ajax is easier and will use less resources because once the response is sent the conversation is over and no additional server resources are used. So I think it's a tradeoff and the architect has to decide which tool fits their problem. AJAX has its place, and websockets have their place.
Update
So the architecture of your server is what matters when we are talking about threads. If you are using a traditionally multi-threaded server (or processes) where a each socket connection gets its own thread to respond to requests then websockets matter a lot to you. So for each connection we have a socket, and eventually the OS will fall over if you have too many of these, and the same goes for threads (more so for processes). Threads are heavier than sockets (in terms of resources) so we try and conserve how many threads we have running simultaneously. That means creating a thread pool which is just a fixed number of threads that is shared among all sockets. But once a socket is opened the thread is used for the entire conversation. The length of those conversations govern how quickly you can repurpose those threads for new sockets coming in. The length of your conversation governs how much you can scale. However if you are streaming this model doesn't work well for scaling. You have to break the thread/socket design.
HTTP's request/response model makes it very efficient in turning over threads for new sockets. If you are just going to use request/response use HTTP its already built and much easier than reimplementing something like that in websockets.
Since websockets don't have to be request/response as HTTP and can stream data if your server has a fixed number of threads in its thread pool and you have the same number of websockets tying up all of your threads with active conversations, you can't service new clients coming in! You've reached your maximum capacity. That's where protocol design is important too with websockets and threads. Your protocol might allow you to loosen the thread per socket per conversation model that way people just sitting there don't use a thread on your server.
That's where asynchronous single thread servers come in. In Java we often call this NIO for non-blocking IO. That means it's a different API for sockets where sending and receiving data doesn't block the thread performing the call.
So traditional in blocking sockets when you call socket.read() or socket.write() they wait until the data is received or sent before returning control to your program. That means your program is stuck waiting for the socket data to come in or go out until you can do anything else. That's why we have threads so we can do work concurrently (at the same time). Send this data to client X while I wait on data from client Y. Concurrencies is the name of the game when we talk about servers.
In a NIO server we use a single thread to handle all clients and register callbacks to be notified when data arrives. For example
socket.read( function( data ) {
// data is here! Now you can process it very quickly without waiting!
});
The socket.read() call will return immediately without reading any data, but our function we provided will be called when it comes in. This design radically changes how you build and architect your code because if you get hung up waiting on something you can't receive any new clients. You have a single thread you can't really do two things at once! You have to keep that one thread moving.
NIO, Asynchronous IO, Event based program as this is all known as, is a much more complicated system design, and I wouldn't suggest you try and write this if you are starting out. Even very Senior programmers find it very hard to build a robust systems. Since you are asynchronous you can't call APIs that block. Like reading data from the DB or sending messages to other servers have to be performed asynchronously. Even reading/writing from the file system can slow your single thread down lowering your scalability. Once you go asynchronous it's all asynchronous all the time if you want to keep the single thread moving. That's where it gets challenging because eventually you'll run into an API, like DBs, that is not asynchronous and you have to adopt more threads at some level. So a hybrid approaches are common even in the asynchronous world.
The good news is there are other solutions that use this lower level API already built that you can use. NodeJS, Vertx, Netty, Apache Mina, Play Framework, Twisted Python, Stackless Python, etc. There might be some obscure library for C++, but honestly I wouldn't bother. Server technology doesn't require the very fastest languages because it's IO bound more than CPU bound. If you are a die hard performance nut use Java. It has a huge community of code to pull from and it's speed is very close (and sometimes better) than C++. If you just hate it go with Node or Python.
Yes, yes it does. :D
The earlier answers lack imagination. I see no more reason to use AJAX if websockets are available to you.

Resources