`ApplicationRunner`: wait until another service is up - Spring

We have an ApplicationRunner that creates an admin user on the first start of a Spring Boot application.
To create the user, it must connect to an auth server (we use Keycloak). However, if the service is deployed together with the main application (via docker-compose up -d), it takes some time until the auth server is available; too long, in fact. The ApplicationRunner fails with a 502 Bad Gateway exception, because it is executed before the auth server is up and running.
How can the ApplicationRunner delay creating the admin user until the auth server is up?
Ideally, the ApplicationRunner should delay everything and provide some information about the "waiting state" during startup. If the auth server is still not available after, e.g., one minute, the application run should fail.
Notes
We are using docker-compose version 3. We are looking for an application-level solution, because the docker-compose docs state that this should be handled at the application level.

As long as you use Docker Compose, you can control startup and shutdown order in Compose:
my-service:
  image: my-company/my-service:1.0.0
  container_name: my-service
  restart: on-failure
  depends_on:
    my-auth-server-service:
      condition: service_healthy
  ports:
    - 8080:8080

my-auth-server-service:
  image: ...
  container_name: ...
  ...
However, this doesn't guarantee the "readiness" of the service itself, so an application-specific health or readiness check is required (also described in the link above).
You might want to define a HEALTHCHECK in the Dockerfile of the my-auth-server-service service (feel free to do this for every service) to detect health/readiness through a REST API call. As long as you use Spring, the actuator health endpoint is suitable.
Here is a short example; you might, however, want to define additional/custom logic to detect complete readiness (a sketch of such a custom indicator follows the Dockerfile snippet).
RUN apk --no-cache add curl
HEALTHCHECK --interval=20s --timeout=3s --retries=20 \
    CMD curl --fail http://localhost:8080/actuator/health || exit 1
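
If the plain actuator status is not fine-grained enough, that additional/custom logic can live in a custom Spring Boot HealthIndicator, which /actuator/health aggregates automatically. A minimal sketch (how initialized gets flipped is an assumption; wire it to whatever marks your service fully ready):

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class StartupHealthIndicator implements HealthIndicator {

    // Flip this once application-specific initialization has finished; until
    // then /actuator/health reports DOWN and the container stays unhealthy.
    private volatile boolean initialized;

    public void markInitialized() {
        this.initialized = true;
    }

    @Override
    public Health health() {
        return initialized
                ? Health.up().build()
                : Health.down().withDetail("startup", "still initializing").build();
    }
}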

Conceptually, the startup procedure may simply fail the complete application run, because it is an initialization script: without this initialization succeeding, the application cannot be used at all. The context / deployment should handle when to (re)start what.
If the initialization script fails (and consequently the complete application run), then docker-compose's restart: unless-stopped configuration on the main application will simply retry starting it. Eventually the auth server comes up in the meantime, and finally the main service is up and running.
It turns out that with an ApplicationRunner it is not that easy to stop the complete application context during startup. Instead, implementing SmartLifecycle and moving the logic into start() is a better idea: the method may simply throw an exception to fail the application run on startup, as sketched below.
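
A minimal sketch of that approach, assuming Spring Framework 5.1+ (which provides defaults for the remaining SmartLifecycle methods); the auth server URL, timings, and the admin-creation call are placeholders, not the original code:

import java.time.Duration;
import java.time.Instant;

import org.springframework.context.SmartLifecycle;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class AdminUserInitializer implements SmartLifecycle {

    private static final Duration MAX_WAIT = Duration.ofMinutes(1);

    private final RestTemplate restTemplate = new RestTemplate();
    private volatile boolean running;

    @Override
    public void start() {
        Instant deadline = Instant.now().plus(MAX_WAIT);
        // Poll the auth server and report the "waiting state" during startup.
        while (true) {
            try {
                restTemplate.getForEntity("http://my-auth-server-service:8080/health", Void.class);
                break; // auth server is reachable
            } catch (RuntimeException e) {
                if (Instant.now().isAfter(deadline)) {
                    // Throwing from start() fails the whole application run.
                    throw new IllegalStateException("Auth server not available after " + MAX_WAIT, e);
                }
                System.out.println("Waiting for auth server...");
                sleepQuietly(Duration.ofSeconds(2));
            }
        }
        createAdminUser();
        running = true;
    }

    private void createAdminUser() {
        // Call the auth server's admin API here (e.g. Keycloak's REST API).
    }

    private static void sleepQuietly(Duration d) {
        try {
            Thread.sleep(d.toMillis());
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the auth server", ie);
        }
    }

    @Override
    public void stop() {
        running = false;
    }

    @Override
    public boolean isRunning() {
        return running;
    }
}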

Related

OpenFaas: getting "Upstream HTTP request error: Post http://127.0.0.1:8082/: EOF" while deploying a long-running Spring Boot service

I am trying to deploy a long-running Spring Boot service on OpenFaas and am facing "Upstream HTTP request error: Post http://127.0.0.1:8082/: EOF".
I have a Spring Boot service which exposes APIs.
To deploy this service on OpenFaas, the following steps were performed:
The service depends on external dependencies, which are placed in a folder in the root project.
Updated build.gradle with implementation fileTree("$folderPath") to include the external dependencies.
Started the Spring Boot application run process in the Handler class, but we get "Upstream HTTP request error: Post http://127.0.0.1:8082/: EOF" when we try to invoke the Handler.
Tried increasing the exec_time, but it didn't help.
How do we run a long-running Spring Boot process in OpenFaas?
If the dependencies are loaded while building, how are they resolved at runtime inside the docker container?
Go HTTP clients usually return EOF in case of request timeouts, and the OpenFaas watchdog, gateway, and queue workers all use Go HTTP clients internally. Most probably your OpenFaas installation is not configured properly for long-running functions.
You can double-check this by making your Spring Boot API return immediately and seeing if that works, e.g. with a trivial endpoint like the sketch below.
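
For instance, a trivial endpoint that responds immediately (hypothetical class and route) isolates whether the timeouts are the culprit:

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class PingController {

    // Responds instantly; if invoking this through OpenFaas works while the
    // real API fails with EOF, the gateway/watchdog timeouts are too low.
    @GetMapping("/ping")
    public ResponseEntity<String> ping() {
        return ResponseEntity.ok("pong");
    }
}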
You can refer to this to configure your OpenFaas for long-running functions. This is an excellent sample function for long-running functions.
For your second question: if your faas build is succeeding (and I am assuming you are working with the standard OpenFaas Java template), it packages the complete output of the gradle build inside the docker container, which should carry all your dependency files as well.

ECS Health Check Issue with Spring Boot Management Port

Setup 1: (Not Working)
I have a task running in an ECS cluster, but it goes down because of a failed health check immediately after it starts.
My service is Spring Boot based and has both a traffic port (for service calls) and a management port (for health checks). I have permitAll() permission for the "*/health" path.
I configured the same by selecting the override-port option in the target group's health check tab as well.
Setup 2: (Working Fine)
I have the same setup in my docker-compose file and I can access the health check endpoint in my local container.
This is how I defined it in my compose file:
service:
  image: repo/a:name
  container_name: container-1
  ports:
    - "9904:9904" # traffic port
    - "8084:8084" # management port
So I tried configuring the management port in the container section of the task definition. I updated the corresponding service to this latest revision of the task definition, but when I save the service, I get an error. Is this the right way of handling this?
Error in ECS console:
Failed updating Service : The task definition is configured to use a dynamic host port,
but the target group with targetGroupArn arn:aws:elasticloadbalancing:us-east-2:{accountId}:targetgroup/ecs-container-tg/{someId} has a health check port specified.
Possible resolutions:
Is there a way I can specify this port mapping in the Dockerfile?
Another way to configure the management port mappings in the container config of the task definition within ECS? (Preferred)
Get rid of Spring Boot's actuator endpoint and implement our own endpoint for health? (BAD: I would need to implement a lot of things to show all the details that Spring Boot returns)
The task definition is configured to use a dynamic host port but the target group has a health check port specified.
Based on the error, it seems you have configured dynamic port mapping in the task definition; you can verify this in the task definition.
understanding-dynamic-port-mapping-in-amazon-ecs
With dynamic port mapping, the ECS scheduler will assign and publish a random port on the host, which will be different from 8082, so change the health check setting to the traffic port accordingly.
This will resolve the health check issue. Now, to your queries:
Is there a way I can specify this port mapping in the Dockerfile?
No. Port mapping happens at run time, not at build time; you can specify it in the task definition.
Another way to configure the management port mappings in the container config of the task definition within ECS? (Preferred)
You can use static port mapping, which means the published port and the exposed port are the same (8082:8082); with static port mapping the health check will work.
Get rid of Spring Boot's actuator endpoint and implement our own endpoint for health? (BAD: I would need to implement a lot of things to show all the details that Spring Boot returns)
A health check is a simple HTTP GET call for which the ALB expects a 200 HTTP status code in response, so you can create a simple endpoint that returns a 200 status code, like the sketch below.
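
A minimal sketch of such an endpoint (the path and response body are arbitrary choices):

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SimpleHealthController {

    // The ALB health check only looks at the HTTP status code,
    // so returning 200 with any body is sufficient.
    @GetMapping("/health")
    public ResponseEntity<String> health() {
        return ResponseEntity.ok("UP");
    }
}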
So, after 2 days of trying different things:
In the task definition, the networking mode should be "bridge".
In the task definition, leave the CPU and memory units empty; providing them at the container level should be enough.

How to check if a docker Cassandra instance is ready to take connections

I have two docker instances that I launch with docker-compose:
One holds a Cassandra instance
One holds a Spring Boot application that tries to connect to it.
However, the Spring Boot application always fails, because it tries to connect to a Cassandra instance that is not yet ready to take connections.
I have tried:
Using restart: always in docker-compose. This still doesn't always work, because Cassandra might be up 'enough' to no longer crash the Spring Boot application, but not up 'enough' to have successfully created the table/column family. On top of that, this is a very hacky solution.
Using healthcheck. It seems that healthcheck in compose doesn't have restart capabilities.
Using a bash script as entrypoint, in the hope that I could use netstat, ping, ... whatever to determine the readiness state of Cassandra.
Right now the only thing that really works is using that same bash script to sleep the process for x seconds and then start the jar. This is even more hacky...
Does anyone have an idea how to solve this?
Thanks!
Does the Spring Boot service defined in the docker-compose.yml declare depends_on the cassandra service? If yes, the service is started only after the cassandra service has been started; note, however, that this alone guarantees start order, not readiness.
https://docs.docker.com/compose/compose-file/#depends_on
Take a look at this github repository to find a healthcheck for the cassandra service:
https://github.com/docker-library/healthcheck
CONCLUSION
After some discussion we found out that docker-compose does not provide functionality for waiting until services are up and healthy in the way Kubernetes and OpenShift do (see comments below). The docs recommend using a wrapper script (docker-entrypoint.sh) that waits for the depending service to come up, but this makes binaries necessary that the actual service shouldn't need, such as the cassandra client binary. Additionally, the service depending on cassandra would never come up if cassandra doesn't, which shouldn't happen.
A main point of microservices is that they have to be resilient to failures: they are not supposed to die, or to fail to come up, just because a depending service is currently unavailable or unexpectedly disappears. 'Unexpected' is actually the wrong word in this context, because you should always expect such issues in a distributed environment, and even with docker-compose you will face issues like those discussed in this topic. Therefore the microservice should be implemented in a way that it retries to get a connection after startup or after an unexpected disconnect.
The following link points to a tutorial which helped to integrate cassandra properly into a spring boot application. It provides a way to implement the retrieval of a cassandra connection with retry behavior, so the service is resilient to a not-yet-available cassandra database and will no longer fail to start; a minimal sketch of the idea follows the link. Hope this helps others as well.
https://dzone.com/articles/containerising-a-spring-data-cassandra-application
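
As an illustration of that retry idea, here is a minimal sketch using the DataStax Java driver 4.x; the contact point, port, datacenter name, and timings are assumptions, not taken from the tutorial:

import java.net.InetSocketAddress;

import com.datastax.oss.driver.api.core.AllNodesFailedException;
import com.datastax.oss.driver.api.core.CqlSession;

public final class CassandraConnector {

    // Keep retrying until Cassandra accepts connections or the budget is exhausted.
    public static CqlSession connectWithRetry(int maxAttempts, long delayMillis) {
        for (int attempt = 1; ; attempt++) {
            try {
                return CqlSession.builder()
                        .addContactPoint(new InetSocketAddress("cassandra", 9042))
                        .withLocalDatacenter("datacenter1")
                        .build();
            } catch (AllNodesFailedException e) {
                if (attempt >= maxAttempts) {
                    throw new IllegalStateException(
                            "Cassandra not reachable after " + attempt + " attempts", e);
                }
                System.out.println("Cassandra not ready yet (attempt " + attempt + "), retrying...");
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for Cassandra", ie);
                }
            }
        }
    }
}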

Eureka First Discovery & Config Client Retry with Docker Compose

We have three Spring Boot applications:
Eureka Service
Config Server
Simple Web Service making use of Eureka and Config Server
I've set up the services so that we use Eureka First Discovery, i.e. the simple web application finds out about the config server from the eureka service.
When started separately (either locally or as individual docker images), everything is OK: the config server is started after the discovery service is running, and the simple web service is started once the config server is running.
When docker-compose is used to start the services, they obviously start at the same time and essentially race to get up and running. This isn't a blocker, as we've added failFast: true and retry values to the simple web service (see the sketch of these settings below) and also have the docker container restarting, so that the simple web service will eventually restart at a time when the discovery service and config server are both running, but this doesn't feel optimal.
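
For reference, the failFast and retry settings mentioned above typically live in the client's bootstrap configuration and look roughly like this (property names as documented by Spring Cloud Config; the values are assumptions, and retry additionally requires spring-retry and spring-boot-starter-aop on the classpath):

# locate the config server via Eureka (service id is the Eureka registration name)
spring.cloud.config.discovery.enabled=true
spring.cloud.config.discovery.service-id=configserver
# fail fast if the config server cannot be reached, then retry
spring.cloud.config.fail-fast=true
spring.cloud.config.retry.max-attempts=6
spring.cloud.config.retry.initial-interval=2000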
The unexpected behaviour we noticed was the following:
The simple web service reattempts a number of times to connect to the discovery service. This is sensible and expected.
At the same time, the simple web service attempts to contact the config server. Because it cannot contact the discovery service, it falls back to retrying a config server on localhost; e.g., the logs show retries going to http://localhost:8888. This wasn't expected.
The simple web service eventually connects to the discovery service, but the logs show it still tries to reach the config server at http://localhost:8888. Again, this wasn't ideal.
Three questions/observations:
Is it a sensible strategy for the config client to fall back to trying localhost:8888 when it has been configured to use discovery to find the config server?
Once the eureka connection is established, shouldn't the retry mechanism switch to trying the config server endpoint as indicated by Eureka? Setting higher/longer retry intervals and periods for the config server connection is pointless in this case, as the client will never connect while it is looking at localhost, so we're better off just failing fast.
Are there any properties that can override this behaviour?
I've created a sample github repo that demonstrates this behaviour:
https://github.com/KramKroc/eurekafirstdiscovery/tree/master

Spring Boot (Tomcat) based application as daemon - how to stop?

I wrote a Spring Boot web service that uses an embedded Tomcat as container.
In case the system reboots, I want to back up some information to a MySQL database.
In my web service I use @Scheduled and @PreDestroy to run the backup.
This goes well when I stop the server with ^C.
But when I kill the process with a SysV script (/etc/init.d) and the kill command, the MySQL server is shut down before the backup is finished, even though the daemon has a dependency on MySQL (resulting in SQL exceptions in my log).
The reason for this is, of course, that kill only sends a signal to stop the process.
How can I (from my SysV script) synchronously stop the running Spring Boot Tomcat server?
If you include spring-boot-starter-actuator, it provides a REST endpoint for management. One of the endpoints provided is /shutdown. By sending an HTTP POST to that endpoint, you get a controlled shutdown of all resources, which ensures that @PreDestroy will be called. As this would be dangerous to have enabled by default, you need to add the following to your application.properties file to use it:
endpoints.shutdown.enabled=true
Of course, once you have exposed that endpoint, you need to ensure that there's a teeny bit of security applied to prevent just anybody from shutting down your server.
On a related note, you may find my answer to Spring Boot application as a Service useful, where I provided the code for a full init.d script which makes use of this.
As an alternative to the /shutdown endpoint, the Actuator also has an ApplicationPidListener (not enabled by default) that you can use to create a pid file, which is commonly used in init.d-style scripts to kill a process when you want to stop it. The JVM will respond to a kill (SIGTERM by default) and Spring will shut down gracefully.
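
For the record, in current Spring Boot versions this class is called ApplicationPidFileWriter; a minimal registration sketch (the main class and pid file path are placeholders):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.ApplicationPidFileWriter;

@SpringBootApplication
public class MyApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(MyApplication.class);
        // Writes the process id to application.pid on startup, so an init.d
        // script can later do: kill $(cat application.pid)
        app.addListeners(new ApplicationPidFileWriter("application.pid"));
        app.run(args);
    }
}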
