I'd like to set an arbitrary timeout on my GraphQL requests. Say I make a request and it takes longer than 10 seconds: I'd like Apollo to send back an error.
Would I need to do this in both the Apollo client and the Apollo server (say, for additional service requests such as database calls)?

There are three different places where timeouts might make sense:
1. For the connection to the server
To have a timeout for requests sent to the server, you could build a wrapper around the network interface, which would reject query promises after x seconds.
2. For the query resolution on the GraphQL server
To implement a per-query timeout on the server, you could put the query start time on the context at the start of the query, and wrap each resolve function with a function that either returns the promise from the resolver, or rejects when the timeout has elapsed.
3. For the connection between your GraphQL server and the backends
To implement timeouts for requests to the backend, you can simply make the fetch-requests to the backends time out after a certain amount of time.
PS: It's worth noting that the solutions above will cause queries or requests to time out, but they won't cancel them, which means that your server or backends will most likely continue doing work that is now wasted. However, cancelling is an entirely different problem, and it's also harder to address.
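As a rough illustration of option 2, here is a minimal Python sketch rather than Apollo code. The names `make_context`, `with_deadline` and `QueryTimeout` are hypothetical, and the resolver signature `(parent, args, context)` is an assumption. The query deadline is stored on the context, and each resolver is wrapped so that it fails once the remaining budget is used up.

```python
import asyncio
import time


class QueryTimeout(Exception):
    """Raised when a query exceeds its overall time budget."""


def make_context(timeout_s: float) -> dict:
    # Store an absolute deadline on the per-query context.
    return {"deadline": time.monotonic() + timeout_s}


def with_deadline(resolver):
    """Wrap an async resolver so it fails once the query deadline has passed."""
    async def wrapped(parent, args, context):
        remaining = context["deadline"] - time.monotonic()
        if remaining <= 0:
            raise QueryTimeout("query deadline already exceeded")
        try:
            return await asyncio.wait_for(resolver(parent, args, context), remaining)
        except asyncio.TimeoutError:
            raise QueryTimeout("resolver exceeded the query deadline") from None
    return wrapped


@with_deadline
async def slow_resolver(parent, args, context):
    await asyncio.sleep(args["delay"])  # stands in for a backend call
    return "done"


async def demo():
    fast = await slow_resolver(None, {"delay": 0.01}, make_context(0.5))
    try:
        await slow_resolver(None, {"delay": 0.5}, make_context(0.05))
        slow = "done"
    except QueryTimeout:
        slow = "timed out"
    return fast, slow


result = asyncio.run(demo())
```

One difference from the promise-based approach described above: `asyncio.wait_for` does cancel the awaited coroutine when the timeout fires. Work that has already been handed to a backend may still continue there, so the caveat about wasted work largely applies anyway.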


Disallow queuing of requests in gRPC microservices

We have gRPC pods running in a k8s cluster. The service mesh we use is linkerd. Our gRPC microservices are written in Python (with asyncio gRPC as the concurrency mechanism), with the exception of the entry point. That microservice is written in golang (using the gin framework). We have an AWS API GW that talks to an NLB in front of the golang service. The golang service communicates with the backend via NodePort services.
Requests on our gRPC Python microservices can take a while to complete. Average is 8s, up to 25s in the 99th %ile. In order to handle the load from clients, we've horizontally scaled, and spawned more pods to handle concurrent requests.
When we send multiple requests to the system, even sequentially, we sometimes notice that requests go to the same pod as an ongoing request. What can happen is that this new request ends up getting "queued" on the server side (not fully "queued"; some progress gets made when context switches happen). The issue with queueing like this is that:
The earlier requests can start getting starved, and eventually timeout (we have a hard 30s cap from API GW).
The newer requests may also not get handled on time, and as a result get starved.
The symptom we're noticing is 504s which are expected from our hard 30s cap.
What's strange is that we have other pods available, but for some reason the loadbalancer isn't routing it to those pods smartly. It's possible that linkerd's smarter load balancing doesn't work well for our high latency situation (we need to look into this further, however that will require a big overhaul to our system).
One thing I wanted to try doing is to stop this queuing up of requests. I want the service to immediately reject the request if one is already in progress, and have the client (meaning the golang service) retry. The client retry will hopefully hit a different pod (do let me know if that won’t happen). In order to do this, I set "maximum_concurrent_rpcs" to 1 on the server side (Python server). When I sent multiple requests in parallel to the system, I didn't see any RESOURCE_EXHAUSTED exceptions (even when there was only 1 server pod). What I do notice is that the requests are no longer happening in parallel on the server; they happen sequentially (I think that’s a step in the right direction, since the first request doesn’t get starved). That being said, I’m not seeing the RESOURCE_EXHAUSTED error in golang. I do see a delay between the entry time in the golang client and the entry time in the Python service. My guess is that the queuing is now happening client-side (or potentially still server-side, but it’s not visible to me)?
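To make the intended behaviour concrete, here is a small standard-library simulation. It is not real gRPC code: `Pod`, `ResourceExhausted` and `call_with_retry` are hypothetical names, and the client picks the next pod itself, whereas in the real system that choice belongs to the load balancer and a retry is not guaranteed to land on a different pod. In an actual Python gRPC handler, the rejection would presumably be expressed with something like `context.abort(grpc.StatusCode.RESOURCE_EXHAUSTED, ...)`.

```python
import asyncio


class ResourceExhausted(Exception):
    """Stand-in for a gRPC RESOURCE_EXHAUSTED status (hypothetical name)."""


class Pod:
    """Simulated server pod that refuses a second request while one is running."""

    def __init__(self, name: str):
        self.name = name
        self.busy = False

    async def handle(self, request: str, work_s: float) -> str:
        if self.busy:
            # Reject immediately instead of queuing behind the in-flight request.
            raise ResourceExhausted(self.name)
        self.busy = True
        try:
            await asyncio.sleep(work_s)  # stands in for the slow RPC body
            return f"{request} handled by {self.name}"
        finally:
            self.busy = False


async def call_with_retry(pods, request, work_s, attempts=3):
    """Client side: on rejection, try the next pod rather than waiting."""
    for attempt in range(attempts):
        pod = pods[attempt % len(pods)]
        try:
            return await pod.handle(request, work_s)
        except ResourceExhausted:
            continue
    raise ResourceExhausted("all attempts rejected")


async def demo():
    pods = [Pod("pod-a"), Pod("pod-b")]
    return await asyncio.gather(
        call_with_retry(pods, "req-1", 0.05),
        call_with_retry(pods, "req-2", 0.05),
    )


results = asyncio.run(demo())
```

In this simulation the second request is rejected by the busy pod and completes on the other one, which is the effect I'm after.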
I then saw online that it may be possible for requests to get queued up on the client side as a default behavior in HTTP/2. I tried to test this out in a custom Python client that mimics the golang one with:
    import grpc

    channel = grpc.insecure_channel(
        "<some address>",
        options=[("grpc.max_concurrent_streams", 1)],
    )
    # create stub to server with channel…
However, I'm not seeing any change here either. (Note, this is a test dummy client; eventually I'll need to make this run in golang. Any help there would be appreciated as well.)
How can I get the desired effect here? Meaning server sends resource exhausted if already handling a request, golang client retries, and it hits a different pod?
Any other advice on how to fix this issue? I'm grasping at straws here.
Thank you!

Front-facing REST API with an internal message queue?

I have created a REST API - in a few words, my client hits a particular URL and she gets back a JSON response.
Internally, quite a complicated process starts when the URL is hit, and there are various services involved as a microservice architecture is being used.
I was observing some performance bottlenecks and decided to switch to a message queue system. The idea is that now, once the user hits the URL, a request is published on an internal message queue, where it waits to be consumed. The consumer will process it and publish the result back on a queue, and this will happen quite a few times until, finally, the same node servicing the user receives the processed response to deliver to the user.
An asynchronous "fire-and-forget" pattern is now being used. But my question is, how can the node servicing a particular person remember who it was servicing once the processed result arrives back and without blocking (i.e. it can handle several requests until the response is received)? If it makes any difference, my stack looks a little like this: TomCat, Spring, Kubernetes and RabbitMQ.
In summary, how can the request node (whose job is to push items on the queue) maintain an open connection with the client who requested a JSON response (i.e. client is waiting for JSON response) and receive back the data of the correct client?
You have a few different scenarios, depending on how much control you have over the client.
If the client behaviour cannot be changed, you will have to keep the session open until the request has been fully processed. This can be achieved by employing a pool of workers (futures/coroutines, threads or processes) where each worker keeps the session open for a given request.
This method has a few drawbacks and I would keep it as a last resort. Firstly, you will only be able to serve a limited number of concurrent requests, proportional to your pool size. Secondly, as your processing is behind a queue, your front-end won't be able to estimate how long it will take for a task to complete. This means you will have to deal with long-lasting sessions, which are prone to fail (what if the user gives up?).
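A minimal sketch of how the front-end node could match replies to waiting clients, using asyncio futures keyed by a correlation ID. `PendingRequests`, `handle_http_request` and `fake_publish` are hypothetical names, and the queue is simulated. With RabbitMQ, the ID would typically travel in the message's `correlation_id` property, with a per-node reply queue named in `reply_to`.

```python
import asyncio
import uuid


class PendingRequests:
    """Maps a correlation ID to the future of the client request awaiting it."""

    def __init__(self):
        self._pending: dict[str, asyncio.Future] = {}

    def register(self) -> tuple[str, asyncio.Future]:
        correlation_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        return correlation_id, future

    def resolve(self, correlation_id: str, payload: dict) -> bool:
        # Called by the queue consumer when a processed result arrives.
        future = self._pending.pop(correlation_id, None)
        if future is None or future.done():
            return False  # unknown ID, or the client already gave up
        future.set_result(payload)
        return True

    def discard(self, correlation_id: str) -> None:
        self._pending.pop(correlation_id, None)


async def handle_http_request(pending, publish, timeout_s=5.0):
    correlation_id, future = pending.register()
    await publish({"correlation_id": correlation_id})
    try:
        # The worker waits here without blocking other requests on the loop.
        return await asyncio.wait_for(future, timeout_s)
    finally:
        pending.discard(correlation_id)


async def demo():
    pending = PendingRequests()

    async def fake_publish(message):
        # Simulates the backend pipeline replying a little later.
        loop = asyncio.get_running_loop()
        loop.call_later(0.01, pending.resolve, message["correlation_id"], {"answer": 42})

    return await handle_http_request(pending, fake_publish)


response = asyncio.run(demo())
```

The timeout in `handle_http_request` is what bounds the long-lasting sessions mentioned above; without it, a lost reply would keep the entry in the map forever.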
If the client behaviour can be changed, the most common approach is to use a fully asynchronous flow. When the client initiates a request, it is placed within the queue and a Task Identifier is returned. The client can use the given TaskId to poll for status updates. Each time the client requests updates about a task you simply check if it was completed and you respond accordingly. A common pattern when a task is still in progress is to let the front-end return to the client the estimated amount of time before trying again. This allows your server to control how frequently clients are polling. If your architecture supports it, you can go the extra mile and provide information about the progress as well.
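The flow above might be sketched as follows. This is an in-memory illustration only: `TASKS`, `submit_task`, `update_task` and `poll_task` are hypothetical names, and a real deployment would need a task store visible to every front-end node.

```python
import uuid

# In-memory stand-in for a shared task store (assumption).
TASKS: dict[str, dict] = {}


def submit_task(payload: dict) -> dict:
    """Enqueue work and immediately return an identifier the client can poll."""
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {"status": "in_progress", "progress": 0, "result": None}
    # Publishing `payload` to the message queue would happen here.
    return {"task_id": task_id}


def update_task(task_id: str, progress: int, result=None) -> None:
    """Called by the consumer side as processing advances."""
    task = TASKS[task_id]
    task["progress"] = progress
    if result is not None:
        task["status"] = "completed"
        task["result"] = result


def poll_task(task_id: str, retry_after_seconds: int = 30) -> dict:
    """Build the body returned to a polling client."""
    task = TASKS.get(task_id)
    if task is None:
        return {"status": "unknown"}
    if task["status"] == "completed":
        return {"status": "completed", "result": task["result"]}
    return {
        "status": "in_progress",
        "retry_after_seconds": retry_after_seconds,
        "progress": f"{task['progress']}%",
    }


task_id = submit_task({"query": "example"})["task_id"]
update_task(task_id, 30)
in_progress = poll_task(task_id)
update_task(task_id, 100, result={"answer": 42})
finished = poll_task(task_id)
```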
Example response when task is in progress:
{"status": "in_progress",
"retry_after_seconds": 30,
"progress": "30%"}
A more complex yet elegant solution would be to use HTTP callbacks. In short, when the client makes a request for a new task, it provides a tuple (URL, Method) that the server can use to signal that the processing is done. The client then waits for the server to send the signal to the given URL. You can see a better explanation here. In most cases this solution is overkill, yet I think it's worth mentioning.
One option would be to use the DeferredResult provided by Spring, but that means you need to maintain a pool of threads in the request-serving node, and the maximum number of active threads will determine the throughput of your system. For more details on how to implement DeferredResult, refer to this link.

At what point are WebSockets less efficient than Polling?

While I understand that the answer to the above question is somewhat determined by your application's architecture, I'm interested mostly in very simple scenarios.
Essentially, if my app is pinging every 5 seconds for changes, or every minute, around when will the data being sent to maintain the open Web Sockets connection end up being more than the amount you would waste by simple polling?
Basically, I'm interested in if there's a way of quantifying how much inefficiency you incur by using frameworks like Meteor if an application doesn't necessarily need real-time updates, but only periodic checks.
Note that my focus here is on bandwidth utilization, not necessarily database access times, since frameworks like Meteor have highly optimized methods of requesting only updates to the database.
The whole point of a websocket connection is that you don't ever have to ping the app for changes. Instead, the client just connects once and then the server can just directly send the client changes whenever they are available. The client never has to ask. The server just sends data when it's available.
For any type of server-initiated data, this is far more efficient with bandwidth than http polling. It also gives you much more timely results (the results are delivered immediately rather than discovered by the client only on the next polling interval).
For pure bandwidth usage, the details would depend upon the exact circumstances. An http polling request has to set up a TCP connection and confirm that connection (even more data if it's an SSL connection), then it has to send the http request, including any relevant cookies that belong to that host and including relevant headers and the GET URL. Then, the server has to send a response. And, most of the time, all of this polling overhead will be completely wasted bandwidth because there's nothing new to report.
A webSocket starts with a simple http request, then upgrades the protocol to the webSocket protocol. The webSocket connection itself need not send any data at all until the server has something to send to the client in which case the server just sends the packet. Sending the data itself has far less overhead too. There are no cookies, no headers, etc... just the data. Even if you use some keep-alives on the webSocket, that amount of data is incredibly tiny compared to the overhead of an HTTP request.
So, exactly how much you would save in bandwidth depends upon the details of the circumstances. If it takes 50 polling requests before it finds any useful data, then every one of those http requests is entirely wasted compared to the webSocket scenario. The difference in bandwidth could be enormous.
You asked about an application that only needs periodic checks. As soon as you have a periodic check that results in no data being retrieved, that's wasted bandwidth. That's the whole idea of a webSocket. You consume no bandwidth (or close to no bandwidth) when there's no data to send.
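To put rough numbers on this, here is a back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration, not a measurement: 800 bytes of request plus response headers per poll, a 1000-byte upgrade handshake, and a keep-alive every 30 seconds made of an empty masked client ping (6 bytes) and an empty server pong (2 bytes). TCP/IP and TLS overhead is ignored on both sides.

```python
def polling_bytes(duration_s: int, interval_s: int, bytes_per_poll: int) -> int:
    """Total bytes spent on polls over the period, whether or not data changed."""
    return (duration_s // interval_s) * bytes_per_poll


def websocket_idle_bytes(duration_s: int, keepalive_s: int,
                         handshake_bytes: int, bytes_per_keepalive: int) -> int:
    """One-off upgrade handshake plus periodic keep-alive frames."""
    return handshake_bytes + (duration_s // keepalive_s) * bytes_per_keepalive


HOUR = 3600
# Assumed figures, not measurements (see the paragraph above).
poll_5s = polling_bytes(HOUR, 5, 800)                # 720 polls per hour
poll_60s = polling_bytes(HOUR, 60, 800)              # 60 polls per hour
ws_idle = websocket_idle_bytes(HOUR, 30, 1000, 8)    # 120 keep-alives per hour
```

Under these assumptions an idle hour costs 576,000 bytes when polling every 5 seconds, 48,000 bytes when polling every minute, and 1,960 bytes on the webSocket. Plug in your own header sizes and intervals to see where your application lands.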
I believe @jfriend00 answered the question very clearly. However, I do want to add a thought.
By throwing in a worst case (and improbable) scenario for Websockets vs. HTTP, you would clearly see that a Websocket connection will always have an advantage in regards to bandwidth (and probably all-round performance).
This is the worst case scenario for Websockets vs. HTTP:
your code uses Websocket connections the exact same way it uses HTTP requests, for polling.
(which isn't something you would do, I know, but it is a worst case scenario).
Every polling event is answered positively - meaning that no HTTP requests were performed in vain.
This is the worst possible situation for Websockets, which are designed for pushing data rather than polling... even in this situation Websockets will save you both bandwidth and CPU cycles.
Seriously, even ignoring the DNS query (performed by the client, so you might not care about it) and the TCP/IP handshake (which is expensive for both the client and the server), a Websocket connection is still more performant and cost-effective.
I'll explain:
Each HTTP request includes a lot of data, such as cookies and other headers. In many cases, each HTTP request is also subject to client authentication... rarely is data given away to anybody.
This means that HTTP connections pass all this data (and possibly perform client authentication) once per request (HTTP is stateless).
However, Websocket connections are stateful. The data is sent only once (instead of each time a request is made). Client authentication occurs only during the Websocket connection negotiation.
This means that Websocket connections pass the same data (and possibly perform client authentication) once per connection (once for all polls).
So even in this worst case scenario, where polling is always positive and Websockets are used for polling instead of pushing data, Websockets will still save your server both bandwidth and other resources (i.e. CPU time).
I think the answer to your question, simply put, is "never". Websockets are never less efficient than polling.

Jersey and AsyncResponse vs. Redirects

Currently I am using Jersey 1.0 and am about to switch to 2.0. For REST requests that may last over a second or two, I use the following pattern:
Client calls GET or PUT
Server returns a polling URL to the client
The client polls the URL until it gets a redirect to the completed resource
Pretty standard and straightforward. However, I noticed that Jersey 2.0 has an AsyncResponse capability. But it looks like this is done with no changes on the wire. In other words, the client still blocks for the result while the server is asynchronously processing the request.
So what good is this? Should I be using it instead of my current asynchronous approach for calls >1 second? Or is it really just to keep the connections freed on the server for calls that would be only a few hundred milliseconds?
I want my server to be as scalable as possible but the approach I use now can be tedious for the client. AsyncResponse seems super simple but I'm not sure how it would work for something like a heroku service where you want very short connection times.
AsyncResponse presumably gives you more scalability within the web app server for standard requests in terms of thread pooling resources, but I don't think it changes anything about the client experience, which will continue to block on read on its connection. Therefore, if you have already implemented a polling solution on the client side, this won't add much value for you, imho.

What is the disadvantage of using websocket where ajax will do?

Similar questions have been asked before and they all reached the conclusion that AJAX will not become obsolete. But in what ways is ajax better than websockets?
It's easy to fall back to flash or long polling, so browser compatibility seems to be a non-issue.
Websockets are bidirectional. Where ajax would make an asynchronous request, websocket client would send a message to the server. The POST/GET parameters can be encoded in JSON.
So what is wrong with using 100% websockets? If every visitor maintains a persistent websocket connection to the server, would that be more wasteful than making a few ajax requests throughout the visit session?
I think it would be more wasteful. For every connected client you need some sort of object/function/code/whatever on the server paired up with that one client. A socket handler, or a file descriptor, or however your server is set up to handle the connections.
With AJAX you don't need a 1:1 mapping of server-side resources to clients. Your number of clients can scale more independently of your server-side resources. Even node.js has its limits on how many connections it can handle and keep open.
The other thing to consider is that certain AJAX responses can be cached too. As you scale up you can add an HTTP cache to help reduce the load from frequent AJAX requests.
Short Answer
Keeping a websocket active has a cost, for both the client and the server, whereas Ajax will have a cost only once, depending on what you're doing with it.
Long Answer
Websockets are often misunderstood because of this whole "Hey, use Ajax, that will do !". No, Websockets are not a replacement for Ajax. They can potentially be applied to the same fields, but there are cases where using Websocket is absurd.
Let's take a simple example : A dynamic page which loads data after the page is loaded on the client side. It's simple, make an Ajax call. We only need one direction, from the server to the client. The client will ask for these data, the server will send them to the client, done. Why would you implement websockets for such a task ? You don't need your connection to be opened all the time, you don't need the client to constantly ask the server, you don't need the server to notify the client. The connection will stay open, it will waste resources, because to keep a connection open you need to constantly check it.
Now for a chat application things are totally different. You need your client to be notified by the server instead of forcing the client to ask the server every x seconds or milliseconds if something is new. It would make no sense.
To understand better, think of it as two people. One of the two is the server, the other is the client. Ajax is like sending a letter. The client sends a letter, the server responds with another letter. The fact is that, for a chat application, the conversation would go like this:
"Hey Server, got something for me ?
- No.
- Hey Server, got something for me ?
- No.
- Hey Server, got something for me ?
- Yes, here it is."
The server can't actually send a letter to the client, if the client never asked for an answer. It's a huge waste of resources. Because for every Ajax request, even if it's cached, you need to make an operation on the server side.
Now consider the case I discussed earlier, with the data loaded with Ajax. Imagine the client is on the phone with the server. Keeping the connection active has a cost. It costs electricity and you have to pay your operator. Now why would you need to call someone and keep him on the phone for an hour, if you just want that person to tell you 3 words? Send a goddamn letter.
In conclusion Websockets are not a total replacement for Ajax !
Sometimes you will need Ajax where Websocket usage is absurd.
Edit : The SSE case
That technology isn't used very widely but it can be useful. As its name states, Server-Sent Events are a one-way push from the server to the client. After opening the stream, the client doesn't request anything; the server just sends the data.
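For a feel of how lightweight that push is, here is a small sketch of what the server writes to the open response for each event in the `text/event-stream` format. The helper name `format_sse` and the sample payload are just illustrative.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of a multi-line payload gets its own "data:" field.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


message = format_sse('{"unread": 3}', event="inbox", event_id="7")
```

On the browser side, an `EventSource` would receive this as an `inbox` event whose data is the JSON string.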
In short :
- Unidirectional from the client : Ajax
- Unidirectional from the server : SSE
- Bidirectional : Websockets
Personally, I think that websockets will be used more and more in web applications instead of AJAX. They are not well suited to web sites where caching and SEO are of greater concern, but they will do wonders for webapps.
Projects such as DNode and socketstream help to remove the complexity and enable simple RPC-style coding. This means your client code just calls a function on the server, passing whatever data to that function it wants. And the server can call a function on the client and pass it data as well. You don't need to concern yourself with the nitty gritties of TCP.
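To give an idea of what RPC-style messaging over a socket amounts to, here is a minimal sketch of a dispatcher for JSON messages. This is not the wire protocol of DNode or socketstream; the envelope fields (`id`, `method`, `params`) and the names `rpc` and `handle_message` are assumptions made for illustration, and the socket itself is left out.

```python
import json

HANDLERS = {}


def rpc(func):
    """Register a function so that remote peers may call it by name."""
    HANDLERS[func.__name__] = func
    return func


@rpc
def add(a, b):
    return a + b


def handle_message(raw: str) -> str:
    """Decode one incoming text frame, run the call, and encode the reply."""
    request = json.loads(raw)
    handler = HANDLERS.get(request["method"])
    if handler is None:
        reply = {"id": request["id"], "error": f"unknown method {request['method']}"}
    else:
        reply = {"id": request["id"], "result": handler(*request.get("params", []))}
    return json.dumps(reply)


ok = handle_message(json.dumps({"id": 1, "method": "add", "params": [2, 3]}))
missing = handle_message(json.dumps({"id": 2, "method": "nope"}))
```

The `id` field is what lets the caller pair a reply with its request, since several calls can be in flight on the same connection.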
Furthermore, there is a lot of overhead with AJAX calls. For instance, a connection needs to be established and HTTP headers (cookies, etc.) are passed with every request. Websockets eliminate much of that. Some say that websockets are more wasteful, and perhaps they are right. But I'm not convinced that the difference is really that substantial.
I answered another related question in detail, including many links to related resources. You might check it out:
websocket api to replace rest api?
I think that sooner or later websocket-based frameworks will start to pop up, not just for writing real-time, chat-like parts of web apps, but also as standalone web frameworks. Once a permanent connection is created, it can be used for receiving all kinds of stuff, including UI parts of the web application which are now served, for example, through AJAX requests. This approach may hurt SEO in some way, although it can reduce the amount of traffic and load generated by asynchronous requests, which include redundant HTTP headers.
However, I doubt that websockets will replace or endanger AJAX, because there are numerous scenarios where permanent connections are unnecessary or unwanted. For example, mashup applications which use (one-time) single-purpose REST-based services that don't need to be permanently connected with clients.
There's nothing "wrong" about it.
The only difference is mostly readability. The main advantage of Ajax is that it allows you fast development because most of the functionality is written for you.
There's a great advantage in not having to re-invent the wheel every time you want to open a socket.
WS:// connections have far less overhead than "AJAX" requests.
As other people said, keeping the connection open can be overkill in some scenarios where you don't need server-to-client notifications, or where client-to-server requests happen with low frequency.
But another disadvantage is that websockets is a low-level protocol, not offering additional features over TCP once the initial handshake is performed. So when implementing a request-response paradigm over websockets, you will probably miss features that HTTP (a very mature and extensive protocol family) offers, like caching (client and shared caches), validation (conditional requests), safety and idempotence (with implications on how the agent behaves), range requests, content types, status codes, ...
That is, you reduce message sizes at a cost.
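As one example of what you would have to rebuild yourself, here is a simplified sketch of HTTP validation with an ETag. The helpers `make_etag` and `respond` are hypothetical, and a real `If-None-Match` header may also carry several tags or `*`, which this ignores.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


body = b'{"items": [1, 2, 3]}'
status_first, headers_first, _ = respond(body, None)
status_again, _, payload_again = respond(body, headers_first["ETag"])
status_changed, _, _ = respond(b'{"items": [1, 2, 3, 4]}', headers_first["ETag"])
```

With plain HTTP, browsers and shared caches already understand this exchange; over a websocket, both ends would need custom code for it.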
So my choice is AJAX for request-response, and websockets for server push and high-frequency, low-latency messaging.
If you want the connection to the server kept open, and the client would otherwise be polling the server continuously, then go for sockets; otherwise you are good to go with ajax.
Simple analogy:
Ajax asks questions (requests) to the server and the server gives answers (responses) to these questions. Now if you want to ask questions continuously, then ajax won't work: it has a large overhead which will require resources at both ends.
