How to restart a Kubernetes pod when the logs show a RabbitMQ connectivity issue - spring

I have a Spring Boot 2 standalone application (not a REST service) that connects to RabbitMQ and processes messages. The application is deployed in Kubernetes. It works well, but when RabbitMQ stays down for a longer period, I see a heartbeat exception (60 sec) in the logs, and eventually the connection is dropped and does not come back even when RabbitMQ is up again after some time:
Automatic retry connection to broker by spring-rabbitmq
https://www.rabbitmq.com/heartbeats.html
I tried to work around the issue by increasing the number of retries: https://stackoverflow.com/questions/45385119/how-configure-timeouts-retries-or-max-attempts-in-differents-queues-with-spring
but once the retries are exhausted, the issue still occurs.
How can I have Kubernetes reboot, or delete and recreate, the pod when the above issue appears in the logs?

The easiest way is to use Spring Boot Actuator, which has a /actuator/health endpoint. (Note that recent versions also add /actuator/health/liveness and /actuator/health/readiness.)
You can assign the endpoint to the livenessProbe property of k8s. Kubernetes will then restart the container automatically when the probe fails. If necessary, you can customize the conditions under which your app reports itself as down.
See the docs:
Kubernetes liveness probe
Spring actuator health
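As a rough sketch of what that could look like, here is a container spec fragment with a livenessProbe pointing at the health endpoint. The container name, image, port and all timing values are hypothetical and need adjusting. It also assumes the app includes spring-boot-starter-actuator plus an embedded web server (a standalone app without a web starter does not expose HTTP endpoints). I used /actuator/health rather than /actuator/health/liveness because, as far as I know, the RabbitMQ health indicator is part of the overall health endpoint by default but not of the liveness group.

```yaml
# Fragment of a Deployment's pod template; names and numbers are illustrative assumptions.
containers:
  - name: rabbit-consumer                  # hypothetical container name
    image: example/rabbit-consumer:1.0     # hypothetical image
    ports:
      - containerPort: 8080                # assumes the embedded server listens on 8080
    livenessProbe:
      httpGet:
        path: /actuator/health             # responds with 503 when an included indicator (e.g. rabbit) is DOWN
        port: 8080
      initialDelaySeconds: 60              # give the app time to start before the first probe
      periodSeconds: 30                    # probe every 30 seconds
      timeoutSeconds: 5                    # count a probe as failed if no response within 5 seconds
      failureThreshold: 3                  # restart the container after 3 consecutive failures
```

With these values the container would be restarted roughly 90 seconds after the health endpoint starts reporting DOWN. Keep in mind that a short broker outage will then also restart the pod, so the threshold should probably be longer than the outages you expect the client to recover from on its own.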

Related

How to retry indefinitely while connecting spring boot application to consul

We use Consul by HashiCorp for configuration management in our Spring Boot application. The Consul agent is occasionally unavailable when our app starts. It is inconvenient that the app fails if Consul isn't available for configuration.
I read about Consul Retry https://cloud.spring.io/spring-cloud-static/spring-cloud-consul/1.2.3.RELEASE/single/spring-cloud-consul.html#spring-cloud-consul-retry
Although it is a good solution for retrying, is there a way to make the application retry indefinitely until it connects to Consul?
We deploy this app on Tomcat. Is there a better solution than consul-retry for reconnection? Should we try reloading the app on Tomcat whenever it fails?
Can anyone tell me the best practice for reconnection in this scenario?

Spring Boot micro-service not connecting to Rabbit MQ server after server is online again

We are facing a problem: our RabbitMQ server sometimes crashes. To reconnect the microservices to RabbitMQ, we need to restart the Spring Boot microservices. Is there a way to skip the restarts, so that whenever RabbitMQ comes back up, the services' connections to RabbitMQ are re-created automatically and everything works as expected?

RabbitMQ on Kubernetes: Unacked messages in queue

We are having an issue with RabbitMQ that happens when we deploy the application to production; we are not able to reproduce it in our development environment.
We have a microservices architecture with multiple Spring Boot applications deployed on Kubernetes, with an autoscaler that scales based on usage. We notice that after some time Unacked messages appear in the queue, the number of Unacked messages keeps increasing, and eventually RabbitMQ seems to stop working.
Is there something we can check in order to identify the problem?

Quarkus Kafka: How to configure the number of retry attempts if we are not able to connect to the Kafka Broker?

I am working on a Quarkus application and intend to use Kafka to receive messages. However, I want to stop the application if it cannot reach the Kafka broker after retrying a certain number of times. The default configuration is to retry the connection an infinite number of times. The Smallrye Reactive Messaging Kafka documentation says we can use kafka.retry-attempts or mp.messaging.incoming.[channel-name].retry-attempts to configure the number of retries. I have tried both, but the application still keeps retrying.
Has anyone faced a similar issue, or can someone help me with a resolution?

My Pivotal Cloud Foundry app is crashing often while doing a health check

I have created a Spring Boot integration app and deployed it to a Pivotal Cloud Foundry (PCF) environment. It works for a couple of days and then starts to crash randomly. I checked the PCF logs and found this information about the crash.
OUT App instance exited with guid 3c348d47-48c4-403f-950a-29af1efa551d
payload: {"instance"=>"e2122543-214f-4806-62c7-00e1", "index"=>2,
"reason"=>"CRASHED", "exit_description"=>"Instance became unhealthy: Failed
to make HTTP request to '/health' on port 8080: timed out after 1.00
seconds", "crash_count"=>1, "crash_timestamp"=>1511959503256098495,
"version"=>"10cea919-d490-460d-83d6-5132c96ef781"}
CPU utilization is low, and memory is not leaking.
Information about the application deployed in PCF:
The Spring Boot integration app connects to IBM MQ queues, polls for messages, and then calls a couple of web services.
There is also another application, Service Bus, which makes a health check call to the PCF application to check whether it is available. If Service Bus finds that the PCF app is available, requests are routed to PCF; otherwise they are processed at the Service Bus end itself.
Please let me know how to find the root cause of the crash and fix it.
Thanks in advance. Please let me know if you need further details.
I have changed the health check type from http to port in the manifest.yml file.
The configuration change in the manifest file is as follows:
health-check-type: port
Now the app is not crashing. It is working fine. Hope this helps.
