How to make the client temporarily stop sending heartbeats to the Eureka server - Spring

I have a requirement to keep incoming requests away from one instance of a cluster for some time. Suppose the service instance is about to execute a heavy batch process. While that batch process runs, I want to temporarily stop sending heartbeats to the Eureka server. Once the task is done, the instance should go back to being a normal active instance that receives incoming requests.
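For reference, a minimal sketch of one way to get that effect with Spring Cloud Netflix: rather than literally pausing the heartbeat, mark the instance OUT_OF_SERVICE in Eureka for the duration of the batch job and set it back to UP afterwards (the wiring below is illustrative, not a definitive implementation):

import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.InstanceInfo;
import org.springframework.stereotype.Service;

// Illustrative sketch: report this instance as OUT_OF_SERVICE to Eureka while
// a heavy batch job runs, then report it as UP again so it receives traffic.
@Service
public class BatchTrafficGate {

    private final ApplicationInfoManager infoManager;

    public BatchTrafficGate(ApplicationInfoManager infoManager) {
        this.infoManager = infoManager;
    }

    public void runHeavyBatch(Runnable batchJob) {
        // Eureka clients that honour instance status will stop routing here.
        infoManager.setInstanceStatus(InstanceInfo.InstanceStatus.OUT_OF_SERVICE);
        try {
            batchJob.run();
        } finally {
            // Re-register as a normal, active instance.
            infoManager.setInstanceStatus(InstanceInfo.InstanceStatus.UP);
        }
    }
}

Note that other Eureka clients only notice the status change after their local registry caches refresh, so expect a short propagation delay before traffic actually stops.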

Related

EWS - One or more subscriptions in the request reside on another Client Access server

I get this error when I'm using streaming subscriptions with impersonation.
After the connection opens and receives notifications successfully for a few minutes, it suddenly pops up a bunch of these errors for almost all subscriptions.
How can I avoid this error?
One or more subscriptions in the request reside on another Client Access server. GetStreamingEvents won't proxy in the event of a batch request., The Availability Web Service instance doesn't have sufficient permissions to perform the request
I need to keep the connection stable and avoid this error.
Sounds like you haven't used affinity: https://learn.microsoft.com/en-us/exchange/client-developer/exchange-web-services/how-to-maintain-affinity-between-group-of-subscriptions-and-mailbox-server
Also, if it's a multi-threaded application: ExchangeService isn't thread safe and shouldn't be used across multiple threads.
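For illustration, a rough sketch of setting those affinity headers with the ews-java-api port of the Managed API. The header names come from the linked article; treat the exact ews-java-api calls (for example getHttpHeaders()) as assumptions to verify against the version you use.

import java.net.URI;
import microsoft.exchange.webservices.data.core.ExchangeService;
import microsoft.exchange.webservices.data.core.enumeration.misc.ExchangeVersion;
import microsoft.exchange.webservices.data.credential.WebCredentials;

// Sketch: pin a group of streaming subscriptions to one backend server by
// anchoring every request for that group to the same mailbox. Use one
// ExchangeService per thread/group, since the class is not thread safe.
public class AffinityExample {

    public static ExchangeService newServiceForGroup(String anchorMailbox) throws Exception {
        ExchangeService service = new ExchangeService(ExchangeVersion.Exchange2010_SP2);
        service.setCredentials(new WebCredentials("serviceAccount@contoso.com", "password"));
        service.setUrl(new URI("https://outlook.office365.com/EWS/Exchange.asmx"));

        // Affinity headers from the linked Microsoft article: anchor all
        // requests for this group to one mailbox and ask Exchange to keep
        // the back-end server affinity.
        service.getHttpHeaders().put("X-AnchorMailbox", anchorMailbox);
        service.getHttpHeaders().put("X-PreferServerAffinity", "true");
        return service;
    }
}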

Kubernetes pods graceful shutdown with TCP connections (Spring boot)

I am hosting my services on Azure cloud. Sometimes I get "BackendConnectionFailure" without any apparent reason; after investigating, I found a correlation between this exception and autoscaling (scaling down), almost to the same second in most cases.
According to the documentation, the termination grace period is 30 seconds by default, which is the case here. The pod will be marked terminating and the load balancer will not consider it anymore, so it receives no more requests. According to this, if my service takes far less than 30 seconds, I should not need a preStop hook or any special implementation in my application (please correct me if I am wrong).
If the previous paragraph is correct, why does this exception occur relatively frequently? My suspicion is that it happens around the moment the pod is marked terminating, when the load balancer stops (or should stop) forwarding new requests to the pod.
Edit 1:
The Architecture is simply like this
Client -> Firewall(azure) -> API(azure APIM) -> Microservices(Spring boot) -> backend(third party) or azure RDB depending on the service
I think the exception comes from APIM. I found two patterns for this exception:
Pattern 1:
Message: The underlying connection was closed: The connection was closed unexpectedly.
Exception type: BackendConnectionFailure
Failed method: forward-request
Response time: 10.0 s
Pattern 2:
Message: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.
Exception type: BackendConnectionFailure
Failed method: forward-request
Response time: 3.6 ms
Spring Boot doesn't do graceful termination by default.
The Spring Boot app and its application container (not Linux container) are in control of what happens to existing connections during the termination grace period. The protocols being used and how a client reacts to a "close" also have a part to play.
If you get to the end of the grace period, then everything gets a hard reset.
Kubernetes
When a pod is deleted in k8s, the Pod Endpoint removal from Services is triggered at the same time as the SIGTERM signal to the container(s).
At this point the cluster nodes will be reconfigured to remove any rules directing new traffic to the Pod. Any existing TCP connections to the Pod/containers will remain in connection tracking until they are closed (by the client, server or network stack).
For HTTP Keep Alive or HTTP/2 services, the client will continue hitting the same Pod Endpoint until it is told to close the connection (or it is forcibly reset)
App
The basic rules are that, on SIGTERM, the application should (see the sketch after this list):
Allow running transactions to complete
Do any application cleanup required
Stop accepting new connections, just in case
Close any inactive connections it can (keep alive requests, websockets)
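A rough sketch of those rules for an embedded-Tomcat Spring Boot app, essentially the pattern from the issues linked below. On Spring Boot 2.3+ the built-in server.shutdown=graceful property does this for you; on older versions you may also need to register the customizer on the Tomcat factory explicitly.

import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.catalina.connector.Connector;
import org.springframework.boot.web.embedded.tomcat.TomcatConnectorCustomizer;
import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextClosedEvent;
import org.springframework.stereotype.Component;

// Sketch: pause the Tomcat connector on shutdown (stop accepting new
// connections) and give in-flight requests time to finish before the
// container is torn down.
@Component
public class GracefulShutdown implements TomcatConnectorCustomizer,
        ApplicationListener<ContextClosedEvent> {

    private volatile Connector connector;

    @Override
    public void customize(Connector connector) {
        this.connector = connector;
    }

    @Override
    public void onApplicationEvent(ContextClosedEvent event) {
        if (connector == null) {
            return;
        }
        connector.pause(); // stop accepting new connections
        Executor executor = connector.getProtocolHandler().getExecutor();
        if (executor instanceof ThreadPoolExecutor) {
            ThreadPoolExecutor threadPool = (ThreadPoolExecutor) executor;
            threadPool.shutdown(); // allow running transactions to complete
            try {
                // keep this below the pod's terminationGracePeriodSeconds
                threadPool.awaitTermination(25, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}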
There are some circumstances you might not be able to handle (it depends on the client):
A keep-alive connection that doesn't complete a request within the grace period can't be sent a Connection: close header. It will need a TCP-level FIN close.
A slow client with a long transfer: in a one-way HTTP transfer, these will have to be waited for or forcibly closed.
Although keepalive clients should respect a TCP FIN close, every client reacts differently. Microsoft APIM might be sensitive and produce the error even though there was no real world impact. It's best to load test your setup while scaling to see if there is a real world impact.
For more spring boot info see:
https://github.com/spring-projects/spring-boot/issues/4657
https://github.com/corentin59/spring-boot-graceful-shutdown
https://github.com/SchweizerischeBundesbahnen/springboot-graceful-shutdown
You can use a preStop sleep if needed. While the pod is removed from the service endpoints immediately, it still takes time (10-100ms) for the endpoint update to be sent to every node and for them to update iptables.
When your application receives a SIGTERM (from the Pod termination) it needs to first stop reporting it is ready (fail the readinessProbe) but still serve requests as they come in from clients. After a certain time (depending on your readinessProbe settings) you can shut down the application.
For Spring Boot there is a small library doing exactly that: springboot-graceful-shutdown
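If you want to see roughly what that library does, here is a minimal hand-rolled sketch (not the library's actual code): a health indicator that starts reporting DOWN once the context begins closing, plus a deliberate delay so the pod keeps serving until Kubernetes has removed it from the Service endpoints.

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextClosedEvent;
import org.springframework.stereotype.Component;

// Sketch: on SIGTERM (context starting to close), report DOWN on the health
// endpoint used by the readinessProbe but keep serving traffic for a short
// window before the rest of the shutdown proceeds.
@Component
public class ReadinessGate implements HealthIndicator,
        ApplicationListener<ContextClosedEvent> {

    private volatile boolean shuttingDown = false;

    @Override
    public Health health() {
        return shuttingDown ? Health.down().build() : Health.up().build();
    }

    @Override
    public void onApplicationEvent(ContextClosedEvent event) {
        shuttingDown = true;
        try {
            // Roughly readinessProbe periodSeconds * failureThreshold,
            // kept well under terminationGracePeriodSeconds.
            Thread.sleep(20_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}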

Websphere web plug-in to automatically propagate cluster node shutdown

Does the WebSphere web server plug-in automatically propagate the new configuration when a node in the application server cluster is shut down manually? I've been going through the documentation and it looks like the only way for the web server to act on this is by detecting the node state by itself.
Is there any workaround?
By default, the WAS Plug-in only detects that a JVM is down by failing to send it a request or failing to establish a new TCP connection.
If you use the "Intelligent Management for WebServers" features available in 8.5 and later, there is a control connection between the cell and the Plug-in that will proactively tell the Plugin that a server is down.
Backing up to the non-IM case, here's what happens during an unplanned shutdown of a JVM (from http://publib.boulder.ibm.com/httpserv/ihsdiag/plugin_questions.html#failover)
If an application server terminates unexpectedly, several things unfold. This is largely WebSphere edition independent.
The application server's operating system closes all open sockets.
WebServer threads waiting for the response in the WAS Plug-in are notified of EOF or ECONNRESET.
If the error occurred on a new connection to the application server, it will be marked down in the current webserver process. This server will not be retried until a configurable interval expires (RetryInterval).
If the error occurred on an existing connection to the application server, it will not be marked down.
Retryable requests that were in-flight are retried by the WAS Plug-in, as permitted.
If the backend servers use memory to memory session replication (ND only), the WLM component will tell the WAS Plug-in to use a specific replacement affinity server.
If the backend servers use any kind of session persistence, the failover is transparent. Session persistence is available in all websphere editions.
New requests, with or without affinity, are routed to the remaining servers.
After the RetryInterval expires, the WAS Plug-in will try to establish new connections to the server. If it remains down, failure will be relatively fast, and put the server back into the marked down state.

Understanding effects of Domino command to restart HTTP server

We have a Domino cluster which consists of two servers. Recently we have seen that one of the servers has memory problems, and the HTTP service goes down after 2 hours. So we plan to implement a scheduled server task which runs the command nserver -c "restart task http" until we find a solution for the memory leak. The HTTP service restarts in, say, 15 seconds. But what would happen if a user submits data during this small period? Will the cluster manager automatically manage the user session using the other server, and hence load balance the submit task? I am not sure about this. Failover runs fine in the normal case: when one of the servers goes down, the other server takes over the load. But we are not sure about the behavior of the "restart task http" command. Does the restart http task finish all pending threads, or does the Domino cluster manager switch to the other server to load balance the request?
Thanks in advance
The server should close out all HTTP requests prior to shutting down and restarting.

Is it possible to communicate with http requests between web and worker processes on Heroku?

I'm building an HTTP -> IRC proxy: it receives messages via HTTP requests and should then connect to an IRC server and post them to a channel (chat room).
This is all fairly straightforward, the one issue I have is that a connection to an IRC server is a persistent socket that should ideally be kept open for a reasonable period of time - unlike HTTP requests where a socket is opened and closed for each request (not always true I know). The implication of this is that a message bound for the same IRC server/room must always be sent via the same process (the one that holds a connection to the IRC server).
So I basically need to receive the HTTP request on my web processes, and then have them figure out which specific worker process has an open connection to the IRC server and route the message to that process.
I would prefer to avoid the complexity of a message queue within the IRC proxy app, as we already have one sitting in front of it that sends it the HTTP requests in the first place.
With that in mind my ideal solution is to have a shared datastore between the web and worker processes, and to have the worker processes maintain a table of all the IRC servers they're connected to. When a web process receives an HTTP request it could then look up the table to figure out if there is already a worker with a connection to the required IRC server and forward the message to it, or if there is no existing connection it could effectively act as a load balancer and pick an appropriate worker to forward the message to, so that worker can establish and hold a connection to the IRC server.
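For illustration only, a rough sketch of that routing-table idea in Java, assuming Redis (via Jedis) as the shared datastore and a hypothetical /messages endpoint on each worker; whether dyno-to-dyno HTTP is actually possible is exactly the open question below.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import redis.clients.jedis.Jedis;

// Illustration of the shared routing table described above: workers record
// which IRC server they hold a connection to, web processes look that up and
// forward the message to the owning worker over HTTP.
public class IrcRouter {

    private static final String WORKER_TABLE = "irc:connections"; // made-up key

    // Worker side: advertise "I hold the connection to this IRC server".
    public static void registerConnection(Jedis redis, String ircServer, String workerUrl) {
        redis.hset(WORKER_TABLE, ircServer, workerUrl);
    }

    // Web side: find the owning worker (if any) and forward the message.
    public static void forward(Jedis redis, String ircServer, String message) throws Exception {
        String workerUrl = redis.hget(WORKER_TABLE, ircServer);
        if (workerUrl == null) {
            // No worker holds this connection yet; pick one and let it connect.
            workerUrl = pickWorker();
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(workerUrl + "/messages"))
                .POST(HttpRequest.BodyPublishers.ofString(message))
                .build();
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding());
    }

    private static String pickWorker() {
        // Placeholder for whatever load-balancing rule you choose.
        return "http://worker-1.example.internal:8080";
    }
}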
Now to do this it would require my worker processes to be able to start an HTTP server and listen for requests from the web processes. On Heroku I know only web processes are added to the public-facing "routing mesh", which is fine; what I would like to know is: is it possible to send HTTP requests between a web and a worker process internally within Heroku's network (outside of the "routing mesh")?
I will use a message queue if I must, but as I said, I'd like to avoid it.
Thanks!
