Ways to wait if server is not available in gRPC from client side - go

I hope who ever is reading this is doing well.
Here's a scenario that I'm wondering about: there's a global ClientConn that is being used for all grpc requests to a server. Then that server goes down. I was wondering if there's a way to wait for this server to go up with some timeout in order for the usage of grpc in this scenario to be more resilient to failures(either a transient failure or server goes down). I was thinking keep looping if the clientConn state is connecting or a transient failure and if a timeout occurs when the clientConn state was a transient failure then return an error since the server might be down.
I was wondering if this would work if there are multiple requests coming in the client side that would need this ClientConn so then multiple go routines would be running this loop. Would appreciate any other alternatives, suggestions, or advice.

When you call grpc.Dial to connect to a server and receive a grpc.ClientConn, it will automatically handle reconnections for you. When you call a method or request a stream, it will fail if it can't connect to the server or if there is an error processing the request.
You could retry a few times if the error indicates that it is due to the network. You can check the grpc status codes in here https://github.com/grpc/grpc-go/blob/master/codes/codes.go#L31 and extract them from the returned error using status.FromError: https://pkg.go.dev/google.golang.org/grpc/status#FromError
You also have the grpc.WaitForReady option (https://pkg.go.dev/google.golang.org/grpc#WaitForReady) which can be used to block the grpc call until the server is ready if it is in a transient failure. In that case, you don't need to retry, but you should probably add a timeout that cancels the context to have control over how long you stay blocked.
If you want to even avoid trying to call the server, you could use ClientConn.WaitForStateChange (which is experimental) to detect any state change and call ClientConn.GetState to determine in what state is the connection to know when it is safe to start calling the server again.

Related

Invalid session / Session is disconnected

What can be the reasons that cause a socket.io session to be crashed and server returns invalid session or session is disconnected ?
There is a specific situation that causes these problems with the session. When a client fails to send the pings at the expected interval the server declares the client gone and deletes the session. If a client that falls into this situation later tries to send a ping or another request using the now invalidated session id it will receive one of these errors.
Another possible problem with the same outcome is when the client does send the pings at the correct intervals, but the server is blocked or too busy to process these pings in time.
So to summarize, if you think your clients are well behaved, I would look at potential blocking tasks in your server.
Ok, I'll illustrate my problem in this figure project's architecture .
In fact, I have a websocket between the react app and the rasa ( tool for creating chatbots) based on flask. bot response need to access to an external API to retrieve some data. Here where things go wrong. Sometimes, these requests take too long to return a response, and that's when websocket misbehave.

What if server didn't receive fd_close

I have a high performance client server system programmed from the scratch. i am still improving my system. the server using io overlapping to handle connections. the server correctly handles disconnections and resource deallocations. at the client side i used shutdown command with sd_receive to notify the server that the client has no data to receive after final send from the client. this works well. and server detects that as a graceful disconnection. rarely i have observed when the connection is very slow the server doesn't detect this. I feel that the shutdown partial closure doesn't reach the server. how can i handle this. this is important the server shouldn't contain this kind of connections if so the server can not be stopped. and i do not want to close all such connection by force.
at the client side i used shutdown command with sd_receive to notify the server that the client has no data to receive after final send from the client.
It doesn't do that.
this works well
It doesn't work at all. The shutdown command with SD_RECEIVE that you're using is completely pointless. A close, or a shutdown with SD_SEND or SD_BOTH, sends a FIN: shutdown with SD_RECEIVE does exactly nothing on the wire, and specifically it does not 'notify the server' of anything.
I feel that the shutdown partial closure doesn't reach the server.
It never reaches the server. Your code doesn't work the way you think it does. What reaches the server is the FIN, which in turn is the result of the close, not the shutdown SD_RECEIVE.
What you need here is a read timeout at the server end. As you're using select() or whatever is delivering you the events, you will have to implement the timeout manually yourself.

Long-polling vs websocket when expecting one-time response from server-side

I have read many articles on real-time push notifications. And the resume is that websocket is generally the preferred technique as long as you are not concerned about 100% browser compatibility. And yet, one article states that
Long polling - potentially when you are exchanging single call with
server, and server is doing some work in background.
This is exactly my case. The user presses a button which initiates some complex calculations on server-side, and as soon as the answer is ready, the server sends a push-notification to the client. The question is, can we say that for the case of one-time responses, long-polling is better choice than websockets?
Or unless we are concerned about obsolete browsers support and if I am going to start the project from scratch, websockets should ALWAYS be preferred to long-polling when it comes to push-protocol ?
The question is, can we say that for the case of one-time responses,
long-polling is better choice than websockets?
Not really. Long polling is inefficient (multiple incoming requests, multiple times your server has to check on the state of the long running job), particularly if the usual time period is long enough that you're going to have to poll many times.
If a given client page is only likely to do this operation once, then you can really go either way. There are some advantages and disadvantages to each mechanism.
At a response time of 5-10 minutes you cannot assume that a single http request will stay alive that long awaiting a response, even if you make sure the server side will stay open that long. Clients or intermediate network equipment (proxies, etc...) just make not keep the initial http connection open that long. That would have been the most efficient mechanism if you could have done that. But, I don't think you can count on that for a random network configuration and client configuration that you do not control.
So, that leaves you with several options which I think you already know, but I will describe here for completeness for others.
Option 1:
Establish websocket connection to the server by which you can receive push response.
Make http request to initiate the long running operation. Return response that the operation has been successfully initiated.
Receive websocket push response some time later.
Close webSocket (assuming this page won't be doing this again).
Option 2:
Make http request to initiate the long running operation. Return response that the operation has been successfully initiated and probably some sort of taskID that can be used for future querying.
Using http "long polling" to "wait" for the answer. Since these requests will likely "time out" before the response is received, you will have to regularly long poll until the response is received.
Option 3:
Establish webSocket connection.
Send message over webSocket connection to initiate the operation.
Receive response some time later that the operation is complete.
Close webSocket connection (assuming this page won't be using it any more).
Option 4:
Same as option 3, but using socket.io instead of plain webSocket to give you heartbeat and auto-reconnect logic to make sure the webSocket connection stays alive.
If you're looking at things purely from the networking and server efficiency point of view, then options 3 or 4 are likely to be the most efficient. You only have the overhead of one TCP connection between client and server and that one connection is used for all traffic and the traffic on that one connection is pretty efficient and supports actual push so the client gets notified as soon as possible.
From an architecture point of view, I'm not a fan of option 1 because it just seems a bit convoluted when you initiate the request using one technology and then send the response via another and it requires you to create a correlation between the client that initiated an incoming http request and a connected webSocket. That can be done, but it's extra bookkeeping on the server. Option 2 is simple architecturally, but inefficient (regularly polling the server) so it's not my favorite either.
There is an alterternative that don't require polling or having an open socket connection all the time.
It's called web push.
The Push API gives web applications the ability to receive messages pushed to them from a server, whether or not the web app is in the foreground, or even currently loaded, on a user agent. This lets developers deliver asynchronous notifications and updates to users that opt in, resulting in better engagement with timely new content.
Some perks are
You need to ask for notification permission
Your site needs to have a service worker running in foreground
having a service worker also means you need to have SSL / HTTPS

Why might an EventMachine outbound data buffer stop sending and just fill up forever (while other connections can still send)

I have an EventMachine server sending TCP data down to a Mac client (via GCDAsyncSocket). It always works flawlessly for a while, but inevitably the server suddenly stops sending data on a connection-by-connection basis. The connection is still maintained, and the server still receives data from the client, but it doesn't go the other way.
When this happens, I've discovered via connection#get_outbound_data_size that the connection send buffer is filling up infinitely (via #send_data) and not being sent to the client.
Are there specific (and hopefully fixable) reasons why this might occur? The reactor keeps humming along, and other active connections to the server continue working fine (though they sometimes fall into buffer hell as well).
I see one reason at least: when the remote client no longer read data from its side of the TCP connection (with a recv() call or whatever).
Then, the scenario is: the receiving TCP buffer on the client side becomes full. And the OS can no longer accepts TCP pacquets from its peer, since it cannot store them queue them. As a consequence, the sending TCP buffer on the server side becomes full too as your application continue to send paquets on the socket! Soon your server is no longer able to write into the socket since the send() system call will :
blocks undefinitively. (waiting for buffer to empty enough for the new paquet)
ot returns with an EWOULDBLOCK error. (if you configured your socket as a non-blocking one)
I usually met that kind of use case in TEST environment when I put a breakpoint in my code on the client side.
There was a patch was applied to GCDAsyncSocket on March 23 that prevents the reads from stopping. Did this patch solve your problem?

WinSock best accept() practices

Imagine you have a server which can handle only one client at a time. The server uses WSAAsyncSelect to be notified of new connections. In this case, what is the best way of handling FD_ACCEPT messages:
A > Accept the connection attempt right away but queue the client until its turn?
B > Do not accept the next connection attempt until we are done serving the currently connected client?
What do you guys think is the most efficient?
Here I describe the cons that I'm aware for both options. Hopefully this might help you decide.
A)
Upon a new client connection, it could send tons of data making your receive buffer become full, which causes unnecessary packets to be transmitted (see this). If you don't plan to receive any data from the client, shutdown receiving on that socket, thus if the client sends any data after that, the connection is reset. Moreover, if your protocol has strict rules, disconnect the client.
If the connection stays idle for too long, the system might disconnect it. To solve this, use setsockopt to set SO_KEEPALIVE on each client socket.
B)
If you don't accept the connection after a certain period (I guess the default is 60 seconds), it will timeout. In a normal (or most common) situation this indicates the server is overloaded, thus unable to answer in time. However, if the client is also designed by you, make the socket non-blocking, try to connect, then manage the timeout as you wish.
Ask yourself: what do you want the user experience to be at the other end? Do you want them to be stuck? Do you want them to time out? Do you want them to get a polite message?

Resources