Unable to connect to Elasticsearch intermittently - elasticsearch

I am trying to connect to Elasticsearch via the Jest client.
Sometimes, the client is not able to connect to the Elasticsearch cluster.
Stack trace:
org.apache.http.NoHttpResponseException: search-xxx-yyy.ap-southeast-1.es.amazonaws.com:443 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
The Elasticsearch cluster is in a public domain, so I do not understand why the client is unable to connect.
Also, the issue happens intermittently; if I retry the request, it sometimes connects.
Any help is appreciated. Thanks.

When JestClient initiates the HTTP request, it calls read() on the socket and blocks. When this read returns -1, it means the server closed the connection before or while the client was waiting for the response.
Why it happens
There are two main causes of NoHttpResponseException:
1. The server end of the connection was closed before the client attempted to send a request down it.
2. The server end of the connection was closed in the middle of a request.
Stale Connection (connection closed before request)
Most often this is a stale connection. When using persistent connections, a connection may sit unused in the connection pool for a while. If it is idle for longer than the server's or load balancer's HTTP keep-alive timeout, the server or load balancer will close it because it is idle. The Jakarta client isn't structured to receive a notification of this happening (it doesn't use NIO), so the connection sits around in a half-closed state. The only way the client can detect this state is by reading from the socket. So when you send a request, the write succeeds because the socket is only half closed (writes succeed until you close your own end), but the subsequent read indicates the socket was closed. This causes the request to fail.
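To make the half-closed behavior concrete, here is a minimal, self-contained Go sketch (an illustration, not the Jest/Jakarta client itself): a toy server closes an idle connection, the client's write still succeeds, and only the subsequent read reveals that the peer is gone, which is exactly the point at which the Jakarta client raises NoHttpResponseException.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Toy server: accept one connection and close it immediately,
	// simulating an idle-timeout close by a server or load balancer.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		c, _ := ln.Accept()
		c.Close() // sends FIN; the client's socket is now half-closed
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	time.Sleep(100 * time.Millisecond) // give the FIN time to arrive

	// The write still succeeds: our end of the connection is open.
	n, werr := conn.Write([]byte("GET / HTTP/1.1\r\nHost: x\r\n\r\n"))
	fmt.Println("write:", n, werr) // e.g. "write: 27 <nil>"

	// Only the read exposes the dead connection (EOF or a reset);
	// this is the moment the request fails on the client side.
	buf := make([]byte, 512)
	n, rerr := conn.Read(buf)
	fmt.Println("read:", n, rerr) // e.g. "read: 0 EOF" (or connection reset)
}
```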
Connection Closed Mid-Request
The other reason this might occur is that the connection was actually closed while the service was working on the request. Anything between your client and the service may close the connection, including load balancers, proxies, or the HTTP endpoint fronting your service. If your calls are long-running or you're transferring a lot of data, the window for something to go wrong is larger and the connection is more likely to be lost in the middle of the request. An example is a Java server process exiting after an OutOfMemoryError caused by trying to return a large amount of data. You can verify whether this is the problem by looking at TCP dumps to see whether the connection is closed while the request is in flight. Also, failures of this type usually occur some time after the request is sent, whereas stale connection failures always occur immediately when the request is made.
Diagnosing The Cause
- NoHttpResponseException is usually a stale connection (based on the problems I've observed and helped people with).
- When the failure always occurs immediately after submitting the request, a stale connection is almost certainly the cause.
- When failures occur some non-trivial amount of time after the request is made, the connection wasn't stale at that point; it is being closed in the middle of the request.
- TCP dumps can be more conclusive: you can see whether the connection is closed before or during the request.
What can be done about it
Use a better client
Nonblocking HTTP clients exist that allow the caller to know when a connection is closed without having to try to read from the connection.
Retry failed requests
If your call is safe to retry (e.g. it's idempotent), this is a good option. It also covers all sorts of transient failures besides stale connections. Note that a NoHttpResponseException isn't necessarily a stale connection, and it's possible that the service received the request, so you should take care to retry only when it is safe to do so. See the sketch below.
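As an illustration (a sketch in Go rather than Java, since the retry pattern itself is language-agnostic; the URL, attempt count, and backoff are placeholders):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// getWithRetry retries an idempotent GET a bounded number of times with
// simple linear backoff. Retrying blindly is only safe here because GET
// has no side effects; do not retry non-idempotent requests this way.
func getWithRetry(client *http.Client, url string, attempts int) (*http.Response, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			return resp, nil
		}
		lastErr = err
		time.Sleep(time.Duration(i+1) * 200 * time.Millisecond) // backoff
	}
	return nil, lastErr
}

func main() {
	resp, err := getWithRetry(http.DefaultClient, "https://example.com/", 3)
	if err != nil {
		fmt.Println("all attempts failed:", err)
		return
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	fmt.Println("status:", resp.Status)
}
```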

Related

Ways to wait on the client side if the server is not available in gRPC

I hope whoever is reading this is doing well.
Here's a scenario that I'm wondering about: there's a global ClientConn that is used for all gRPC requests to a server. Then that server goes down. I was wondering if there's a way to wait, with some timeout, for the server to come back up, so that this use of gRPC is more resilient to failures (either a transient failure or the server going down). My idea was to keep looping while the ClientConn state is CONNECTING or TRANSIENT_FAILURE, and if a timeout occurs while the state is TRANSIENT_FAILURE, return an error, since the server might be down.
I was also wondering whether this would work when multiple requests arrive on the client side that need this ClientConn, so that multiple goroutines would be running this loop. I would appreciate any other alternatives, suggestions, or advice.
When you call grpc.Dial to connect to a server and receive a grpc.ClientConn, it automatically handles reconnections for you. When you call a method or request a stream, the call fails if the client can't connect to the server or if there is an error processing the request.
You could retry a few times if the error indicates that it is due to the network. You can check the gRPC status codes here: https://github.com/grpc/grpc-go/blob/master/codes/codes.go#L31 and extract them from the returned error using status.FromError: https://pkg.go.dev/google.golang.org/grpc/status#FromError
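As a sketch of that approach (the standard gRPC health-check service stands in for your own RPC here, and the address is a placeholder):

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/credentials/insecure"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
	"google.golang.org/grpc/status"
)

func main() {
	// Placeholder address; swap in your own service and generated stub.
	conn, err := grpc.Dial("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	client := healthpb.NewHealthClient(conn)

	for attempt := 1; attempt <= 3; attempt++ {
		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
		_, err = client.Check(ctx, &healthpb.HealthCheckRequest{})
		cancel()
		if err == nil {
			log.Println("server reachable")
			return
		}
		// status.FromError extracts the gRPC status code from the error.
		if s, ok := status.FromError(err); ok && s.Code() == codes.Unavailable {
			time.Sleep(time.Duration(attempt) * 500 * time.Millisecond)
			continue // network-related failure: worth retrying
		}
		break // some other error: don't retry
	}
	log.Fatal(err)
}
```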
You also have the grpc.WaitForReady option (https://pkg.go.dev/google.golang.org/grpc#WaitForReady), which can be used to block the gRPC call until the server is ready when the connection is in a transient failure. In that case you don't need to retry, but you should add a timeout that cancels the context so you control how long you stay blocked.
If you want to avoid even trying to call the server, you can use ClientConn.WaitForStateChange (which is experimental) to detect any state change, and call ClientConn.GetState to determine what state the connection is in, so you know when it is safe to start calling the server again.
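A minimal sketch of that state-watching loop, assuming a conn obtained earlier from grpc.Dial (note that WaitForStateChange and Connect are marked experimental in grpc-go):

```go
package grpcutil

import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/connectivity"
)

// waitForReady blocks until conn reaches the READY state, or returns
// ctx.Err() if ctx times out or is cancelled first (e.g. because the
// server stayed down past your deadline).
func waitForReady(ctx context.Context, conn *grpc.ClientConn) error {
	for {
		state := conn.GetState()
		if state == connectivity.Ready {
			return nil
		}
		if state == connectivity.Idle {
			conn.Connect() // nudge an idle channel into connecting
		}
		// WaitForStateChange returns false when ctx is done before
		// the state changes.
		if !conn.WaitForStateChange(ctx, state) {
			return ctx.Err()
		}
	}
}
```

Since ClientConn is safe for concurrent use, multiple goroutines handling incoming requests can each run this loop against the shared ClientConn, which covers the multi-goroutine scenario from the question.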

How to keep a long-lived connection in HTTP/2?

I am reading the documentation of the Alexa Voice Service capabilities and came across the part on managing the HTTP/2 connection. I don't really understand how this down channel works behind the scenes. Is it using server push? Could server push be used to keep a connection open for a long time? Or is it just using some trick to keep the connection alive for a very long time?
As stated in the documentation, the client needs to establish a down channel stream with the server.
Based on what I read in RFC 7540 (https://www.rfc-editor.org/rfc/rfc7540) and its stream state diagram: once the client sends a HEADERS frame with the END_STREAM flag set, the stream becomes half-closed (local) from the client's point of view. So that is how the half-closed state for the device comes about. Correct me if I am wrong.
Regarding managing the HTTP connection, based on my understanding of what the documentation says: the client sets a timeout of 60 minutes for the GET request. After the request is sent, the server does not send any response, and the connection remains open for 60 minutes. But once a response is sent from the server, shouldn't the connection be closed? Or is it that when the server sends a response through the down channel stream, it does not set the END_STREAM flag, so the stream is not closed?
But once a response is sent from the server, the connection should be closed.
HTTP/1.1 and HTTP/2 use persistent connections, which means that a single connection can be used not just for one request/response, but for several request/response cycles.
Only HTTP/1.0 closed the connection after the response. For HTTP/2 this is not the case: the connection remains open until either peer decides to explicitly close it.
The recommendations about idle timeouts exist exactly to prevent the client from explicitly closing the connection too early when it sees no network traffic, independently of requests or responses.

How to make HTTP/2 requests with a persistent connection? (Any language)

How do I connect to https://api.push.apple.com using HTTP/2 with a persistent connection?
A persistent connection is needed to avoid rapid connection and disconnection:
APNs treats rapid connection and disconnection as a denial-of-service attack
https://developer.apple.com/library/ios/documentation/NetworkingInternet/Conceptual/RemoteNotificationsPG/Chapters/APNsProviderAPI.html
Is writing a client in C using https://nghttp2.org the only solution?
(If this question should be asked on another Stack Exchange site, please tell me.)
Non-persistent connections are a relic of the past. They were used in HTTP/1.0, but HTTP/1.1 already moved to a model where connections were persistent by default, and HTTP/2 (which is also multiplexed) continues with that model of connections being persistent by default.
Independently of the language you use to develop your application, any HTTP/2 compliant client will, by default, use persistent connections.
You only need to use the HTTP/2 client library in a way that you don't explicitly close the connection after every request you make.
These libraries typically employ a connection pool that keeps connections open until an idle timeout fires.
When your application makes HTTP requests, the library will pick an open connection and send the request. When the response arrives the library will not close the connection but instead put it back into the pool for the next usage.
Just study how the library you want to use allows you to make multiple requests without closing the connection.
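As an illustration in Go (any compliant HTTP/2 client behaves similarly; the URL is a placeholder for any HTTP/2-capable server), httptrace lets you observe the second request reusing the pooled connection:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
)

func main() {
	// Go's default client negotiates HTTP/2 over TLS automatically and
	// pools connections; nothing special is needed for persistence.
	client := &http.Client{}

	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Println("connection reused:", info.Reused)
		},
	}

	for i := 0; i < 2; i++ {
		req, err := http.NewRequest("GET", "https://example.com/", nil)
		if err != nil {
			fmt.Println(err)
			return
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

		resp, err := client.Do(req)
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println("proto:", resp.Proto) // "HTTP/2.0" if the server speaks h2

		// Drain and close the body so the connection returns to the pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
}
```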
I ran into this problem too!
If the connection is idle for a long time (about one hour), then poll() catches no socket status change. It always returns 0, even though on_frame_send_callback was invoked.
Can anyone figure out the problem?

Under what circumstances will a WebSocket stop reading from the buffer?

I have a WebSocket server. It accepts thousands of connections from clients, and it reads data from and writes data to those clients. It will work normally for weeks, but occasionally something goes wrong, maybe once every two weeks. Within a very short time, new clients establish connections to the server and immediately send a protocol message. On the server side, websocket.onOpen() is invoked, but the server fails to read the protocol data from the client. Later the client may close the connection, but on the server side the connection stays in the CLOSE_WAIT state and is never successfully closed. Via netstat I can see that the CLOSE_WAIT connections' receive buffer is not empty and keeps that value (the data is never read). So I guess that the server's failure to read the data, together with the closing FIN packet, leaves the connection stuck in the CLOSE_WAIT state.
So I want to know under what circumstances the WebSocket may fail to read data from its receive buffer.

Connecting timeout for WebSockets

I can't find any information that specifies the timeout for establishing a connection with a WebSocket server. That is, how long can the state be "CONNECTING" before it changes to "CLOSED"? When the target host doesn't exist, the state changes almost immediately; this is easy to find out. But what happens if the DNS lookup takes longer or the server is busy?
The same question arises for the "CLOSING" state when the connection goes through the closing handshake procedure. If the connection fails here, how long does it take until onClose() is called?
Are these two timeouts browser-specific? If so, does anyone know any concrete numbers?
