I have two docker instances that I launch with docker-compose.
One holds a Cassandra instance
One holds a Spring Boot application that tries to connect to that application.
However, the Spring Boot application will always fail, because it's trying to connect to a Cassandra instance that is not ready yet to take connections.
I have tried:
Using restart:always in Docker-compose
This still doesn't always work, because the Cassandra might be up 'enough' to no longer crash the Spring Boot application, but not up 'enough' to have successfully created the Table/Column family. On top of that, this is a very hacky solution.
Using healthcheck
It seems like healthcheck in compose doesn't have restart capabilities
Using a bash script as entrypoint
In the hope that I could use netstat,ping,... whatever to determine that readiness state of Cassandra
Right now the only thing that really works is using that same bash script and sleep the process for x seconds, then start the jar. This is even more hacky...
Does anyone have an idea on how to solve this?
Thanks!
Does the spring boot service defined in the docker-compose.yml depends_on the cassandara service? If yes then the service is started only if the cassandra service is ready.
https://docs.docker.com/compose/compose-file/#depends_on
Take a look at this github repository, to find a healthcheck for the cassandra service.
https://github.com/docker-library/healthcheck
CONCLUSION
After some discussion we found out that docker-compose seems not to provide a functionality for waiting until services are up and healthy, such as Kubernetes and Openshift provide (See comments below). They recommend to use wrapper script (docker-entrypoint.sh) which waits for the depending service to come up, which make binaries necessary, the actual service shouldn't use such as the cassandra client binary. Additionally the service depending on cassandra could never get up if cassandra doesn't, which shouldn't happen.
A main thing with microservices is that they have to be resilient for failures and are not supposed to die or not to come up if a depending service is currently not available or unexpectedly disappears. Therefore the microservice should be implemented in a way so that it retries to get connection after startup or an unexpected disappearance. Unexpected is a word actually wrongly used in this context, because you should always expect such issues in a distributed environment, and even with docker-compose you will face issues like that as discussed in this topic.
The following link points to a tutorial which helped to integrate cassandra properly into a spring boot application. It provides a way to implement the retrieval of a cassandra connection with a retry behavior, therefore the service is resilient to a non existing cassandra database and will not fail to start anymore. Hope this helps others as well.
https://dzone.com/articles/containerising-a-spring-data-cassandra-application
Related
According to the documentation:
An application is considered ready as soon as application and
command-line runners have been called, see Spring Boot application
lifecycle and related Application Events.
So it doesn't seem to be considering external systems suchs a database. Is this correct?
How could we make the ReadinessStateHealthIndicator evaluate the state of such systems so the pod is taken away from the k8s service load balancer when they are failing of are not available?
We have a situation where we have an abundance of Spring boot applications running in containers (on OpenShift) that access centralized infrastructure (external to the pod) such as databases, queues, etc.
If a piece of central infrastructure is down, the health check returns "unhealthy" (rightfully so). The problem is that the liveliness check sees this, and restarts the pod (the readiness check then sees it's down too, so won't start the app). This is fine when only a few are available, but if many (potentially hundreds) of applications are using this, it forces restarts on all of them (crash loop).
I understand that central infrastructure being down is a bad thing. It "should" never happen. But... if it does (Murphy's law), it throws containers into a frenzy. Just seems like we're either doing something wrong, or we should reconfigure something.
A couple questions:
If you are forced to use centralized infrastructure from a Spring boot app running in a container on OpenShift/Kubernetes, should all actuator checks still be enabled for backend? (bouncing the container really won't fix the backend being down anyway)
Should the /actuator/health endpoint be set for both the liveliness probe and the readiness probe?
What common settings do folk use for the readiness/liveliness probe in a spring boot app? (timeouts/interval/etc).
Using actuator checks for liveness/readiness is the de-facto way to check for healthy app in Spring Boot Pod. Your application, once up, should ideally not go down or become unhealthy if a central piece, such as DB or Queueing service goes down , ideally you should add some sort of a resiliency which will either connect to alternate DR site or wait for certain time period for central service to come back up and app to reconnect. This is more of a technical failure on backend side causing a functional failure of your Application after it was started up cleanly.
Yes , both liveness and readiness is required as they both serve different purposes. Read this
In one of my previous projects, the settings used for readiness was around 30 seconds and liveness at around 90, but to be honest this is completely dependent on your application , if your app takes 1 minute to start , that is what your readiness time should be configured at , and your liveness should factor in the same along with any time required for making failover switch of your backend services.
Hope this helps.
I understand that Spring Boot has a built-in Tomcat server (or Jetty) which facilitates rapid development. But what do you do when you need to scale out your application because traffic has increased?
As pointed out in the comments, there is no silver bullet here, it depends on your infrastructure and there are several tools out there to help you, you only need to choose what works best for you.
For load balancing you can either choose something like an Nginx or leave it to spring cloud which also has a lot of other handy features for scaling/clustering.
Scaling shouldn't be very hard because spring boot runs on it's own server.
Some tools that help with scaling/clustering:
Spring boot app:
If you are going to scale, your app has to be near-stateless (e.g: you cannot have a scheduled task or something like that because when you scale to x instances, they are executed x times).
You can use the spring cloud project for extra added features like service discovery and other goodies that make scaling easier (e.g: When you spin up a new instance, it can get the config easily from a config server, 'register' to ease the loadbalancing between services, have cluster-like behaviour, etc...).
Infrastructure and containers:
Docker is a no-brainer here to handle easy launching of your applications and their replicas, if needed. If you can go further with resources and go with Kubernetes but it all depends on the use case.
Various servers (nodes), in case one of them fails and to easily distribute loads.
Ngnix for load balancing is pretty straightforward if you already don't have something done with spring cloud.
Database:
You really do NOT want to go with MySQL here because it can not scale well as your spring apps. You can choose something like Cassandra or Redis but that would mean restructuring your data model. Maybe the least-painful transition from MySQL to something NoSQL that can scale is a MongoDB (imho: Cassandra performs better).
Logging:
This can be a nightmare but spring also has a solution for this. Check out zipkin and spring sleuth.
Also, there are a lot resources here that talk a lot about architecture in general and how it is necessary to change the mindset when trying to run distributed services.
Hope this helps.
Update 2021-02-23
Today, Kubernetes is pretty much a de-facto standard when we talk about scaling and is preferred because of the rich set of features that you will be able to leverage and focus your app purely on business domain logic and can remove things like spring cloud for service discovery. If you can use some public clouds like EKS and GKE, you are better off without having to manage the clusters by yourself.
It provides autoscaling and built-in healthchecks. Starting from Spring Boot 2.4, you have many added benefits for running Spring Boot on K8s like dedicated healthcheck endpoints for liveness and readiness probes, graceful shutdown, etc....
On the database side, aim for something that is managed and scales easily such as AWS Aurora or similar.
An important thing to mention when managing spring boot services at scale is probably configuration management. A very useful solution that you can use out of the box is Consul. This will enable you to hot reload the configuration which is important when you have 50 services that you need to restart only to change one boolean variable. Depending on how big is your application, the startup can be costly, in terms of time as well as CPU/memory resources
In a distributed setup, how do I programatically start containers?
More specifically, does there exist any API similar to deploying and undeploying streams for setting up and tearing down containers?
There is currently no way to do this via an API. Containers are only known to the cluster after they are started. Upon initialization, the container registers itself with ZooKeeper. Running a container requires XD to be installed on that host which is currently a manual process: download,unzip,configure, as is starting the container. Some automation of operations will likely be provided in a future release.
I wrote a Spring Boot webservice that uses an embedded tomcat as container.
In case the system reboots I want to backup some information to a mysql database.
In my webservice I use #Scheduled() and #PreDestroy to run the backup.
This goes well when I stop the server with ^C.
But when I kill the process with an sysV skript (/etc/init.d) and the kill command - even though the daemon has a dependency on mysql, the mysql server is shut down before the backup is finished (resulting in SQL Exceptions in my log).
The reason for this is of course, that kill only sends a signal to stop the process.
How can I (from my sysv skript) synchroneously stop the running spring boot tomcat server?
If you include spring-boot-starter-actuator then that provides a REST endpoint for management. One of the endpoints provided is /shutdown. By hitting that endpoint, you will get a controlled shutdown of all resources, which ensures that #PreDestroy will be called. As this could be dangerous to have enabled by default, to use it you will need to add the following to your application.properties file:
endpoints.shutdown.enabled=true
Of course, once you have exposed that endpoint you need to ensure that there's a teeny bit of security applied to prevent just anybody shutting down your server.
On a related note, you may find my answer to Spring Boot application as a Service useful, where I provided the code for a full init.d script which makes use of this.
As an alternative to the "/shutdown" endpoint the Actuator also has an ApplicationPidListener (not enabled by default) that you can use to create a pidfile (which is commonly used in "init.d" style scripts to kill a process when you want to stop it). The JVM should respond to a kill (sigint) and Spring will shutdown gracefully.