High-performance options for remote service access

I have a service, foo, running on machine A. I need to access that service from machine B. One way is to launch a web server on A and do it via HTTP: code running under the web server on A accesses foo and returns the results. Another is to write a socket server on A: the socket server accesses service foo and returns the result.
HTTP connection initiation and handshaking are expensive; I could write a raw socket server, but I want to avoid that. What other options are available for high-performance remote calls?

HTTP is just the protocol over the socket. If you are using TCP/IP networks, you are going to be using a socket. The HTTP connection initiation and handshake are not the expensive bits; it's TCP connection establishment that's really expensive.
If you use HTTP 1.1, you can use persistent connections (Keep-Alive), which drastically reduce this cost, bringing it closer to that of keeping a persistent socket open.
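For illustration (a minimal sketch in Java 11+ terms; the URL is a hypothetical placeholder), a single java.net.http.HttpClient instance pools connections and reuses them across requests by default, so only the first call pays the TCP/TLS setup cost:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveDemo {
    public static void main(String[] args) throws Exception {
        // One client, reused for every request: the underlying TCP (and TLS)
        // connections are pooled and kept alive between calls.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://machine-a.example/foo")) // hypothetical service endpoint
                .build();
        for (int i = 0; i < 3; i++) {
            // After the first call, subsequent requests skip the TCP handshake.
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
        }
    }
}
```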
It all depends on whether you want/need the higher-level protocol. Using HTTP means you will be able to consume this service from many more clients while writing much less documentation (if you write your own protocol, you will have to document it). HTTP servers also support things like authentication, cookies, and logging out of the box. If you don't need these sorts of capabilities, then HTTP might be a waste. But I've seen few projects that don't need at least some of them.

Adding to @Rob's answer: since the question doesn't pin down a particular application or specific performance bounds, it is worth looking at the available options in a broader context, namely inter-process communication (IPC).
The Wikipedia page on IPC lists the available options cleanly and is a good place to start.

What technology are you going to use? Let me answer for the Java world.
If your request rate is below 100/sec, you should not care about optimizations and should use the most versatile solution: HTTP.
A well-written asynchronous server like Netty can easily handle 1,000 HTTP requests per second on mid-range hardware.
If you need more, or have constrained resources, you can switch to a binary format. The most popular combination out there is Google Protobuf (multi-language) + Netty (Java).
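As a hedged sketch of what such an asynchronous server looks like (assuming Netty 4.1 on the classpath; the port and payload are illustrative):

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.Unpooled;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.http.*;

public class NettyHttpServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup workers = new NioEventLoopGroup(); // handles I/O
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(boss, workers)
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ch.pipeline()
                       .addLast(new HttpServerCodec())
                       .addLast(new HttpObjectAggregator(64 * 1024))
                       .addLast(new SimpleChannelInboundHandler<FullHttpRequest>() {
                           @Override
                           protected void channelRead0(ChannelHandlerContext ctx,
                                                       FullHttpRequest req) {
                               // Handlers run on the event loop: no thread per
                               // connection, hence the throughput.
                               FullHttpResponse res = new DefaultFullHttpResponse(
                                       HttpVersion.HTTP_1_1, HttpResponseStatus.OK,
                                       Unpooled.copiedBuffer("ok".getBytes()));
                               res.headers().set(HttpHeaderNames.CONTENT_LENGTH,
                                                 res.content().readableBytes());
                               ctx.writeAndFlush(res);
                           }
                       });
                 }
             });
            b.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```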
What you should know about HTTP performance:
HTTP can use keep-alive, which removes the reconnection cost for every request.
HTTP adds traffic overhead to every request and response: around 50-100 bytes of headers.
HTTP clients and servers consume additional CPU parsing HTTP headers; that becomes noticeable beyond the aforementioned 100 req/sec.
Be careful when selecting technology. Even in the 21st century it is hard to find a well-written HTTP server and client.

Related

Do we still need a connection pool for microservices talking HTTP/2?

As HTTP/2 supports multiplexing, do we still need a pool of connections for microservice communication?
If yes, what are the benefits of having such a pool?
Example:
Service A => Service B
Both the above services have only one instance available.
Multiple connections may help overcome the OS buffer size limitation for each connection (socket)? What else?
Yes, you still need a connection pool in a client contacting a microservice.
First, in general it's the server that controls the amount of multiplexing. A particular microservice server may decide that it cannot allow more than a small amount of multiplexing.
If a client wants to use that microservice at a higher load, it needs to be prepared to open multiple connections, and this is where the connection pool comes in handy.
This is also useful to handle load spikes.
Second, HTTP/2 has flow control, and that may severely limit the data throughput on a single connection. If the flow control windows are small (the default defined by the HTTP/2 specification is 65535 bytes, which is typically very small for microservices), then client and server will spend a considerable amount of time exchanging WINDOW_UPDATE frames to enlarge the flow control windows, which is detrimental to throughput.
To overcome this, you either need more connections (and again a client should be prepared for that), or you need larger flow control windows.
Third, in case of large HTTP/2 flow control windows, you may hit TCP congestion (and this is different from socket buffer size) because the consumer is slower than the producer. It may be a slow server for a client upload (REST request with a large payload), or a slow client for a server download (REST response with a large payload).
Again, to overcome TCP congestion, the solution is to open multiple connections.
Comparing HTTP/1.1 with HTTP/2 for the microservice use case, it's typical that the HTTP/1.1 connection pools are way larger (e.g. 10x-50x) than HTTP/2 connection pools, but you still want connection pools in HTTP/2 for the reasons above.
[Disclaimer: I'm the HTTP/2 implementer in Jetty.]
We had an initial implementation where the Jetty HttpClient was using the HTTP/2 transport with a hardcoded single connection per domain, because that's what HTTP/2 preached for browsers.
When exposed to real world use cases - especially microservices - we quickly realized how bad of an idea that was, and switched back to use connection pooling for HTTP/2 (like HttpClient always did for HTTP/1.1).
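As a hedged illustration (roughly the Jetty 10/11 client API; constructors and package names differ across Jetty versions), a client can configure both a small HTTP/2 connection pool and larger stream flow control windows:

```java
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.api.ContentResponse;
import org.eclipse.jetty.http2.client.HTTP2Client;
import org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2;

public class Http2PooledClient {
    public static void main(String[] args) throws Exception {
        HTTP2Client http2 = new HTTP2Client();
        // Enlarge the per-stream flow control window (spec default: 65535
        // bytes) to reduce WINDOW_UPDATE chatter on large payloads.
        http2.setInitialStreamRecvWindow(1024 * 1024);

        HttpClient client = new HttpClient(new HttpClientTransportOverHTTP2(http2));
        // A small pool of HTTP/2 connections per destination, rather than a
        // single hardcoded connection.
        client.setMaxConnectionsPerDestination(4);
        client.start();

        ContentResponse res = client.GET("https://service-b.example/"); // hypothetical microservice
        System.out.println(res.getStatus());
        client.stop();
    }
}
```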

Read_reply in tarantool-c is too slow

I am writing a C server and use Tarantool as the database via tarantool-c. However, every time I call read_reply(), requests per second drop so much that it's like using MySQL. How can I fix this?
I had a discussion with James and he shared the code. The code implements an HTTP server; this is how it processes a request:
Accept a new incoming HTTP request.
Send a request to Tarantool (using the binary protocol).
Wait for a reply from Tarantool (synchronously, unable to handle other incoming HTTP requests).
Answer the HTTP request.
The root of the problem here is that we are unable to utilize the full network bandwidth between the HTTP server and Tarantool. Such a server should use select() / poll() / epoll() (on Linux) / kqueue() (on FreeBSD), or a library like libev, to determine whether it is able to write to a socket, read from it, or accept a connection.
Let me briefly describe how it should operate to utilize the network as much as possible (from one thread), as a set of when-X-then-Y rules (a sketch follows the list):
When a new HTTP request arrives, register the need to send a request to Tarantool (let me call it a pending request).
When the socket to Tarantool is ready for writing and there is at least one pending request, form a request to Tarantool, save its ID (see tnt_stream.reqid) and write the request to the socket.
When the socket to Tarantool is ready for reading, read a reply from Tarantool, match its ID (see tnt_reply.sync) against the saved ones, write a response to the HTTP client, then close the client socket.
If HTTP keep-alive / pipelining needs to be supported, the server also needs to check sockets to HTTP clients for read() / write() readiness, and it needs to register pending HTTP responses rather than writing them as soon as they appear.
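These rules are language-agnostic; here is a minimal sketch of the same readiness-driven loop, written with Java NIO's Selector for brevity (a C implementation would use epoll()/kqueue() or libev in the same shape; the ports and addresses are hypothetical placeholders):

```java
import java.net.InetSocketAddress;
import java.nio.channels.*;
import java.util.Iterator;

public class ReadinessLoop {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();

        ServerSocketChannel httpListener = ServerSocketChannel.open();
        httpListener.bind(new InetSocketAddress(8080));
        httpListener.configureBlocking(false);
        httpListener.register(selector, SelectionKey.OP_ACCEPT);

        SocketChannel db = SocketChannel.open(); // socket to the database
        db.configureBlocking(false);
        db.connect(new InetSocketAddress("127.0.0.1", 3301));
        db.register(selector, SelectionKey.OP_CONNECT);

        while (true) {
            selector.select(); // block until some socket is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    // Rule 1: a new HTTP client arrived; register it and queue
                    // a pending request instead of calling the DB synchronously.
                    SocketChannel client = httpListener.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isConnectable()) {
                    ((SocketChannel) key.channel()).finishConnect();
                    key.interestOps(SelectionKey.OP_WRITE);
                } else if (key.isWritable()) {
                    // Rule 2: the DB socket is writable; write one pending
                    // request, saving its ID for later matching. In real code,
                    // keep OP_WRITE set only while pending requests exist.
                    key.interestOps(SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Rule 3: a reply (or client bytes) is readable; match the
                    // reply ID against the saved ones, answer the HTTP client.
                }
            }
        }
    }
}
```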
Aside from that, HTTP itself is not easy to implement properly: it will always surprise you if you don't control the implementations of both client and server.
So I will describe alternatives to implementing your own HTTP server that are compatible with Tarantool and able to operate with it asynchronously to achieve good performance. They are:
Using the http.server Tarantool module, which processes HTTP requests right in Tarantool, without an external service and without a connector to Tarantool.
Using the nginx_upstream_tarantool nginx module, which performs requests to Tarantool from nginx using the binary protocol.
There are cons and pros to both of these approaches. However, the disadvantages of http.server can be overcome by using nginx as a frontend that proxies requests to the Tarantool instance(s), while keeping http.server's advantages.
http.server cons:
No HTTPS support.
Bound to a single CPU / a single Tarantool instance.
Possibly worse performance than nginx (but not by much).
Possibly worse support for HTTP corner cases (though I don't know of any).
http.server pros:
Simpler to start developing.
Simpler configuration / deployment: the application's configuration is not spread across configs for Tarantool and nginx.
The cons and pros of nginx_upstream_tarantool are the reverse of http.server's. Also, I would mention specifically that nginx allows you to balance load across several Tarantool instances, which may form a replication group or be sharding frontends. This ability can be used to scale a service to the desired performance, both when proxying to http.server and with nginx_upstream_tarantool.
I guess you are also interested in benchmarking results for http.server and nginx_upstream_tarantool. Look at this measurement. Note, however, that it is quite synthetic: it performs small requests and answers with small responses. Real RPS numbers may differ depending on the size of requests and responses, the hardware, whether HTTPS is needed, etc.

Does HTTP/2 make websockets obsolete?

I'm learning about the HTTP/2 protocol. It's a binary protocol with small message frames. It allows stream multiplexing over a single TCP connection. Conceptually, it seems very similar to WebSockets.
Are there plans to obsolete websockets and replace them with some kind of headerless HTTP/2 requests and server-initiated push messages? Or will WebSockets complement HTTP/2?
Having just finished reading RFC 7540: HTTP/2 does obsolete websockets for all use cases except for pushing binary data from the server to a JS web client. HTTP/2 fully supports binary bidi streaming (read on), but browser JS doesn't have an API for consuming binary data frames, and AFAIK such an API is not planned.
For every other application of bidi streaming, HTTP/2 is as good or better than websockets, because (1) the spec does more work for you, and (2) in many cases it allows fewer TCP connections to be opened to an origin.
PUSH_PROMISE (colloquially known as server push) is not the issue here. That's just a performance optimization.
The main use case for Websockets in a browser is to enable bidirectional streaming of data. So, I think the OP's question becomes whether HTTP/2 does a better job of enabling bidirectional streaming in the browser, and I think that yes, it does.
First of all, it is bi-di. Just read the introduction to the streams section:
A "stream" is an independent, bidirectional sequence of frames
exchanged between the client and server within an HTTP/2 connection.
Streams have several important characteristics:
A single HTTP/2 connection can contain multiple concurrently open
streams, with either endpoint interleaving frames from multiple
streams.
Streams can be established and used unilaterally or shared by
either the client or server.
Streams can be closed by either endpoint.
Articles like this (linked in another answer) are wrong about this aspect of HTTP/2. They say it's not bidi. Look, there is one thing that can't happen with HTTP/2: After the connection is opened, the server can't initiate a regular stream, only a push stream. But once the client opens a stream by sending a request, both sides can send DATA frames across a persistent socket at any time - full bidi.
That's not much different from websockets: the client has to initiate a websocket upgrade request before the server can send data across, too.
The biggest difference is that, unlike websockets, HTTP/2 defines its own multiplexing semantics: how streams get identifiers and how frames carry the id of the stream they're on. HTTP/2 also defines flow control semantics for prioritizing streams. This is important in most real-world applications of bidi.
(That wrong article also says that the Websocket standard has multiplexing. No, it doesn't. It's not really hard to find that out: just open the Websocket RFC 6455, press ⌘-F, and type "multiplex". After you read
The protocol is intended to be extensible; future versions will likely introduce additional concepts such as multiplexing.
you will find that there is a 2013 draft extension for Websocket multiplexing. But I don't know which browsers, if any, support it. I wouldn't try to build my SPA webapp on the back of that extension, especially with HTTP/2 coming; the support may never arrive.)
Multiplexing is exactly the kind of thing that you normally have to do yourself whenever you open up a websocket for bidi, say, to power a reactively updating single page app. I'm glad it's in the HTTP/2 spec, taken care of once and for all.
If you want to know what HTTP/2 can do, just look at gRPC. gRPC is implemented across HTTP/2. Look specifically at the half and full duplex streaming options that gRPC offers. (Note that gRPC doesn't currently work in browsers, but that is actually because browsers (1) don't expose the HTTP/2 frame to the client javascript, and (2) don't generally support Trailers, which are used in the gRPC spec.)
Where might websockets still have a place? The big one is server->browser pushed binary data. HTTP/2 does allow server->browser pushed binary data, but it isn't exposed in browser JS. For applications like pushing audio and video frames, this is a reason to use websockets.
Edit: Jan 17 2020
Over time this answer has gradually risen up to the top (which is good, because this answer is more-or-less correct). However there are still occasional comments saying that it is not correct for various reasons, usually related to some confusion about PUSH_PROMISE or how to actually consume message-oriented server -> client push in a single page app.
If you need to build a real-time chat app, let's say, where you need to broadcast new chat messages to all the clients in the chat room that have open connections, you can (and probably should) do this without websockets.
You would use Server-Sent Events to push messages down and the Fetch API to send requests up. Server-Sent Events (SSE) is a little-known but well-supported API that exposes a message-oriented server-to-client stream. Although it doesn't look like it to the client JavaScript, under the hood your browser (if it supports HTTP/2) will reuse a single TCP connection to multiplex all of those messages. There is no efficiency loss, and in fact it's a gain over websockets because all the other requests on your page are also sharing that same TCP connection. Need multiple streams? Open multiple EventSources! They'll be automatically multiplexed for you.
Besides being more resource efficient and having less initial latency than a websocket handshake, Server-Sent Events have the nice property that they automatically fall back and work over HTTP/1.1. But when you have an HTTP/2 connection they work incredibly well.
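To make that concrete: the browser side is literally new EventSource("/events"); the server side is just a long-lived response of "data:" lines. A hedged sketch using the JDK's built-in com.sun.net.httpserver (purely illustrative; the endpoint and messages are made up):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class SseSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/events", exchange -> {
            // SSE is a streamed response with this content type...
            exchange.getResponseHeaders().add("Content-Type", "text/event-stream");
            exchange.sendResponseHeaders(200, 0); // 0 = stream the body
            try (OutputStream out = exchange.getResponseBody()) {
                for (int i = 0; i < 5; i++) {
                    // ...where each message is a "data:" line plus a blank line.
                    out.write(("data: message " + i + "\n\n")
                            .getBytes(StandardCharsets.UTF_8));
                    out.flush();
                    Thread.sleep(1000);
                }
            } catch (InterruptedException ignored) {
            }
        });
        server.start();
        // Browser side: new EventSource("/events").onmessage = e => { ... };
    }
}
```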
Here's a good article with a real-world example of accomplishing the reactively-updating SPA.
From what I understood, HTTP/2 is not a replacement for WebSocket but aims to standardize the SPDY protocol.
In HTTP/2, server push is used behind the scenes to improve resource loading by the browser. As a developer, you don't really care about it during development. With WebSocket, however, the developer gets an API that can consume and push messages over a unique full-duplex connection.
These are not the same things, and they should complement each other.
I say Nay (Websockets aren't obsolete).
The first and most often ignored issue is that HTTP/2 push isn't enforceable and might be ignored by proxies, routers, other intermediaries or even the browser.
e.g. (from the HTTP/2 draft):
An intermediary can receive pushes from the server and choose not to forward them on to the client. In other words, how to make use of the pushed information is up to that intermediary. Equally, the intermediary might choose to make additional pushes to the client, without any action taken by the server.
Hence, HTTP/2 Push can't replace WebSockets.
Also, HTTP/2 connections do close after a while.
It's true that the standard states that:
HTTP/2 connections are persistent. For best performance, it is expected that clients will not close connections until it is determined that no further communication with a server is necessary (for example, when a user navigates away from a particular web page) or until the server closes the connection.
But...
Servers are encouraged to maintain open connections for as long as possible but are permitted to terminate idle connections if necessary. When either endpoint chooses to close the transport-layer TCP connection, the terminating endpoint SHOULD first send a GOAWAY (Section 6.8) frame so that both endpoints can reliably determine whether previously sent frames have been processed and gracefully complete or terminate any necessary remaining tasks.
Even if the same connection allows for pushing content while it is open and even if HTTP/2 resolves some of the performance issues introduced by HTTP/1.1's 'keep-alive'... HTTP/2 connections aren't kept open indefinitely.
Nor can a webpage re-initiate an HTTP/2 connection once closed (unless we're back to long-polling, that is).
EDIT (2017, two years later)
Implementations of HTTP/2 show that multiple browser tabs/windows share a single HTTP/2 connection, meaning that push will never know which tab / window it belongs to, eliminating the use of push as a replacement for Websockets.
EDIT (2020)
I'm not sure why people started downvoting the answer. If anything, the years since the answer was initially posted proved that HTTP/2 can't replace WebSockets and wasn't designed to do so.
Granted, HTTP/2 might be used to tunnel WebSocket connections, but these tunneled connections will still require the WebSocket protocol and they will affect the way the HTTP/2 container behaves.
The answer is no. The goals of the two are very different. There is even an RFC for WebSocket over HTTP/2 which allows you to make multiple WebSocket connections over a single HTTP/2 TCP pipe.
WS over HTTP/2 will be a resource conservation play by decreasing the time to open new connections and allowing for more communication channels without the added expense of more sockets, soft IRQs, and buffers.
https://datatracker.ietf.org/doc/html/draft-hirano-httpbis-websocket-over-http2-01
Well, to quote from this InfoQ article:
Well, the answer is clearly no, for a simple reason: As we have seen above, HTTP/2 introduces Server Push which enables the server to proactively send resources to the client cache. It does not, however, allow for pushing data down to the client application itself. Server pushes are only processed by the browser and do not pop up to the application code, meaning there is no API for the application to get notifications for those events.
And so HTTP/2 push is really something between your browser and server, while WebSockets really expose APIs that can be used by both client (JavaScript, if it's running in a browser) and application code (running on the server) for transferring real-time data.
As of today, no.
HTTP/2, compared to HTTP, allows you to maintain a connection with a server. From there, you can have multiple streams of data at the same time. The intent is that you can push multiple things at the same time even without the client requesting them. For example, when a browser asks for an index.html, the server might want to also push index.css and index.js. The browser didn't ask for them, but the server might provide them without being asked because it can assume you're going to want them in a few seconds.
This is faster than the HTTP/1 alternative of getting index.html, parsing it, discovering it needs index.js and index.css and then building 2 other requests for those files. HTTP/2 lets the server push data the client hasn't even asked for.
In that context, it's similar to WebSocket, but not really by design. WebSocket is supposed to allow a bi-directional communication similar to a TCP connection, or a serial connection. It's a socket where both communicate with each other. Also, the major difference is that you can send any arbitrary data packets in raw bytes, not encapsulated in HTTP protocol. The concepts of headers, paths, query strings only happen during the handshake, but WebSocket opens up a data stream.
The other difference is that you get a lot more fine-tuned access to the WebSocket in JavaScript, whereas with HTTP, it's handled by the browser. All you get with HTTP is whatever you can fit in XHR/fetch(). That also means the browser will get to intercept and modify HTTP headers without you being able to control it (e.g. Origin, Cookies, etc.). Also, what HTTP/2 is able to push is sent to the browser. That means JS doesn't always (if ever) know things are being pushed. Again, it makes sense for index.css and index.js because the browser will cache them, but not so much for data packets.
It's really all in the name. HTTP stands for HyperText Transfer Protocol. We're geared around the concept of transferring assets. WebSocket is about building a socket connection where binary data gets passed around bidirectionally.
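To make the contrast concrete, here is a hedged sketch of that socket model using the WebSocket client built into Java 11's java.net.http (the echo endpoint is a hypothetical placeholder); note how either side can send at any time, with no request/response pairing:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class WsSketch {
    public static void main(String[] args) throws Exception {
        WebSocket ws = HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(URI.create("wss://echo.example/ws"), // hypothetical endpoint
                        new WebSocket.Listener() {
                            @Override
                            public CompletionStage<?> onText(WebSocket webSocket,
                                                             CharSequence data,
                                                             boolean last) {
                                // Server-initiated messages arrive here at any
                                // time: a stream, not a request/response cycle.
                                System.out.println("received: " + data);
                                webSocket.request(1); // ask for the next message
                                return null;
                            }
                        })
                .join();
        ws.sendText("hello", true); // the client can also send at any time
        Thread.sleep(2000);         // crude wait for the echo in this sketch
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "done");
    }
}
```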
The one we're not really discussing is SSE (Server-Sent Events). Pushing data to the application (JS) isn't HTTP/2's intent, but it is SSE's. SSE gets really strengthened with HTTP/2. But it's not a real replacement for WebSockets when what's important is the data itself, not the variable endpoints being reached. With WebSocket a new data stream is created for each endpoint, whereas with SSE it's shared with the already existing HTTP/2 session.
Summarized here are the objectives for each:
HTTP - Respond to a request with one asset
HTTP/2 - Respond to a request with multiple assets
SSE - Respond with a unidirectional text (UTF-8) event stream
WebSocket - Create a bidirectional binary data stream
Message exchange and simple streaming (not audio/video streaming) can be done with both HTTP/2 multiplexing and WebSockets. So there is some overlap, but WebSockets have a well-established protocol, a lot of frameworks/APIs, and less header overhead.
Here is a nice article about the topic.
No, WebSockets are not obsolete. However, HTTP/2 breaks websockets as defined for HTTP/1.1 (mostly by forbidding protocol upgrades via the Upgrade header), which is why this RFC:
https://datatracker.ietf.org/doc/html/rfc8441
defines a websocket bootstrapping procedure for HTTP/2.
For the time being (April 2020), HTTP/2 is not making WebSockets obsolete. The greatest advantage of WebSockets over HTTP/2 is that
HTTP/2 works only at the browser level, not the application level.
This means that HTTP/2 does not offer a JS API like WebSockets do for communicating and transferring JSON or other data to the server directly from the application (e.g. a website). So, as far as I believe, HTTP/2 will only make WebSockets obsolete if it starts offering an API like WebSockets for talking to the server. Until then, it is just an updated and faster version of HTTP/1.1.
No, HTTP/2 does not make websockets obsolete, but SSE over HTTP/2 offers a viable alternative. The minor caveat is that SSE does not support unsolicited events from server to client (and neither does HTTP/2): the client has to explicitly subscribe by creating an EventSource instance specifying the event source endpoint. So you may have to slightly reorganize how the client arranges for events to be delivered; I can't think of a scenario where this is actually a technical barrier.
SSE works with HTTP/1.1, but HTTP/2 makes using SSE generally viable and competitive with websockets in terms of efficiency, instead of practically unusable as in the HTTP/1.1 case. First, HTTP/2 multiplexes many event source connections (or rather "streams", in HTTP/2 terms) onto a single TCP connection, whereas HTTP/1.1 needs one connection for each. According to the HTTP/2 spec, millions of streams can be created per connection by default, with the recommended (configurable) minimum being 100, whereas browsers may be severely limited in the number of TCP connections they can make. The second reason is efficiency: many streams in HTTP/2 require much less overhead than the many connections required in HTTP/1.1.
One final thing: if you want to replace websockets with SSE, you're forgoing some of the tools / middleware built on top of websockets. In particular I'm thinking of socket.io (which is how a lot of people actually use websockets), but I'm sure there are plenty more.

Slow HTTP vs Web Sockets - Resource utilization

If a bunch of "Slow HTTP" connections to a server can consume so many resources as to cause a denial of service, why wouldn't a bunch of WebSockets to a server cause the same problem?
The accepted answer to a different SO question says that it is almost free to maintain an idle connection.
If it costs nothing to maintain an open TCP connection, why does "Slow HTTP" cause a denial of service?
A WebSocket and a "slow" HTTP connection both use an open connection. The difference is in expectations of the server design.
Typical HTTP servers do not need to handle a large number of open connections and are designed around the assumption that the number of open connections is small. If the server does not protect against slow clients, then an attacker can force a server designed around this assumption to hit a resource limit.
Here are a couple of examples showing how the different expectations can impact the design:
If you only have a few HTTP requests in flight at a time, then it's OK to use a thread per connection (a sketch of this design follows below). This is not a good design for a WebSocket server.
The default file descriptor limits are often adequate for typical HTTP scenarios, but not for a large number of connections.
It is possible to design an HTTP server to handle a large number of open connections and several servers do so out of the box.
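As a hedged caricature of the thread-per-connection assumption above (real servers parse requests properly; this just shows the resource model), one whole thread is dedicated for as long as each client, however slow, keeps its connection open:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerConnection {
    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(8080);
        while (true) {
            Socket client = server.accept();
            // One thread is tied up for the lifetime of this connection; a
            // deliberately slow client holds it (and its stack) the whole time.
            new Thread(() -> handle(client)).start();
        }
    }

    private static void handle(Socket client) {
        try (client) {
            // Blocks until the client sends something -- the "slow" part.
            client.getInputStream().read(new byte[8192]);
            client.getOutputStream().write(
                "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok".getBytes());
        } catch (IOException ignored) {
        }
    }
}
```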

Any HTTP proxies with explicit, configurable support for request/response buffering and delayed connections?

When dealing with mobile clients it is very common to have multisecond delays during the transmission of HTTP requests. If you are serving pages or services out of a prefork Apache the child processes will be tied up for seconds serving a single mobile client, even if your app server logic is done in 5ms. I am looking for a HTTP server, balancer or proxy server that supports the following:
1. A request arrives at the proxy. The proxy starts buffering the request, in RAM or on disk, including headers and POST/PUT bodies. The proxy DOES NOT open a connection to the backend server. This is probably the most important part.
2. The proxy server stops buffering the request when:
a size limit has been reached (say, 4KB), or
the request has been received completely, headers and body.
3. Only now, with (part of) the request in memory, is a connection opened to the backend and the request relayed.
4. The backend sends back the response. Again, the proxy server starts buffering it immediately (up to a more generous size, say 64KB).
5. Since the proxy has a big enough buffer, the backend response is stored completely in the proxy server in a matter of milliseconds, and the backend process/thread is free to process more requests. The backend connection is immediately closed.
6. The proxy sends the response back to the mobile client, as fast or as slow as the client is capable of, without a connection to the backend tying up resources.
I am fairly sure you can do 4-6 with Squid, and nginx appears to support 1-3 (and looks fairly unique in this respect). My question is: is there any proxy server that emphasizes these buffering and not-opening-connections-until-ready capabilities? Maybe there is just a bit of Apache config-fu that makes this buffering behaviour trivial? And is there one that is not a dinosaur like Squid and supports a lean single-process, asynchronous, event-based execution model?
(Side rant: I would be using nginx but it doesn't support chunked POST bodies, making it useless for serving stuff to mobile clients. Yes, cheap $50 handsets love chunked POSTs... sigh)
What about using both nginx and Squid (client — Squid — nginx — backend)? When returning data from a backend, Squid does convert it from C-T-E: chunked to a regular stream with Content-Length set, so maybe it can normalize POST also.
Nginx can do everything you want. The configuration parameters you are looking for are
http://wiki.codemongers.com/NginxHttpCoreModule#client_body_buffer_size
and
http://wiki.codemongers.com/NginxHttpProxyModule#proxy_buffer_size
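A hedged sketch of how those pieces fit together in an nginx config (directive names are from the nginx docs; the sizes are illustrative, not recommendations, and note that nginx buffers request bodies before contacting the upstream by default):

```nginx
server {
    listen 80;
    # Buffer the client request body (up to this size) before the
    # request is passed on -- the slow-client side happens here.
    client_body_buffer_size 4k;

    location / {
        # Buffer the backend response so the upstream connection is
        # released quickly, even if the mobile client drains it slowly.
        proxy_buffering   on;
        proxy_buffer_size 8k;
        proxy_buffers     8 8k;  # ~64k of response buffering
        proxy_pass        http://127.0.0.1:8000;  # hypothetical backend
    }
}
```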
Fiddler, a free tool from Telerik, does at least some of the things you're looking for.
Specifically, go to Rules | Custom Rules... and you can add arbitrary Javascript code at all points during the connection. You could simulate some of the things you need with sleep() calls.
I'm not sure this method gives you the fine buffering control you want, however. Still, something might be better than nothing?
Squid 2.7 can support 1-3 with a patch:
http://www.squid-cache.org/Versions/v2/HEAD/changesets/12402.patch
I've tested this and found it to work well, with the proviso that it only buffers to memory, not disk (unless it swaps, of course, and you don't want this), so you need to run it on a box that's appropriately provisioned for your workload.
Chunked POSTs are a problem for most servers and intermediaries. Are you sure you need to support them? Usually clients should retry the request when they get a 411.
Unfortunately, I'm not aware of a ready-made solution for this. In the worst case scenario, consider developing it yourself, say, using Java NIO -- it shouldn't take more than a week.
