ECS Health Check Issue with Spring Boot Management Port - spring-boot

Setup 1 (Not Working):
I have a task running in an ECS cluster, but it goes down immediately after starting because of a failing health check.
My service is Spring Boot based and has both a traffic port (for service calls) and a management port (for the health check). I have permitAll() permission for the "*/health" path.
I configured the same port by selecting the override port option in the target group's health check tab as well.
Setup 2 (Working Fine):
I have the same setup in my docker-compose file and I can access the health check endpoint in my local container.
This is how I defined it in my compose file:
service:
  image: repo/a:name
  container_name: container-1
  ports:
    - "9904:9904" # traffic port
    - "8084:8084" # management port
So I tried configuring the management port in the container section of the task definition. I then tried to update the corresponding service to this latest revision of the task definition, but when I save the service, I get an error. Is this the right way of handling this?
Error in ECS console:
Failed updating Service : The task definition is configured to use a dynamic host port,
but the target group with targetGroupArn arn:aws:elasticloadbalancing:us-east-2:{accountId}:targetgroup/ecs-container-tg/{someId} has a health check port specified.
Possible resolutions:
Is there a way I can specify this port mapping in the Dockerfile?
Is there another way to configure the management port mapping in the container config of the task definition within ECS? (Preferred)
Get rid of Spring Boot's actuator endpoint and implement our own health endpoint? (Bad: I would need to implement a lot to reproduce all the details Spring Boot returns)

The task definition is configured to use a dynamic host port, but the target group has a health check port specified.
Based on the error, it seems you have configured dynamic port mapping in the task definition; you can verify this in the task definition.
understanding-dynamic-port-mapping-in-amazon-ecs
With dynamic port mapping, the ECS scheduler assigns and publishes a random port on the host, which will be different from 8084, so change the target group's health check port setting to use the traffic port accordingly.
This will resolve the health check issue. Now, coming to your queries:
Is there a way I can specify this port mapping in the Dockerfile?
No. Port mapping happens at run time, not at build time; you specify it in the task definition.
Is there another way to configure the management port mapping in the container config of the task definition within ECS? (Preferred)
You can assign a static port mapping, meaning the published host port and the exposed container port are the same, e.g. 8084:8084; with a static port mapping, the health check on the management port will work.
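For illustration, a static mapping for the management port in a task definition could look roughly like the fragment below (a sketch only; the family name is an assumption, the image and container name are taken from the compose file above, and the YAML form assumes input such as the AWS CLI's register-task-definition --cli-input-yaml rather than the raw JSON console editor):
family: my-spring-service          # assumed task family name
networkMode: bridge                # bridge mode, as in the working setup noted below
containerDefinitions:
  - name: container-1
    image: repo/a:name
    portMappings:
      - containerPort: 9904        # traffic port
        hostPort: 9904             # static: host port equals container port
        protocol: tcp
      - containerPort: 8084        # management/health check port
        hostPort: 8084             # static, so the target group can health check this port
        protocol: tcp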
Get rid of Spring Boot's actuator endpoint and implement our own health endpoint? (Bad: I would need to implement a lot to reproduce all the details Spring Boot returns)
The health check is a simple HTTP GET call for which the ALB expects a 200 status code in response, so you could create a simple endpoint that returns a 200 status code.

So, after two days of trying different things, this is what worked:
In task definition, the networking mode should be "Bridge" type
In task definition, leave the CPU and memory units empty. Providing them at the container level should be enough.

Related

Spring Cloud Deployer Local is unable to spin up worker remote partitions when server.port property is set in master's application properties file

I am trying to build a batch service in an existing application that has the server.port=8080 property configured in its application.properties file. When I run the batch process and Spring Batch tries to bring up the remote partitions (separate JVMs), Spring Cloud Deployer Local throws an error saying:
"\r\n\r\n***************************\r\nAPPLICATION FAILED TO START\r\n***************************\r\n\r\nDescription:\r\n\r\nThe Tomcat connector configured to listen on port 8080 failed to start. The port may already be in use or the connector may be misconfigured.\r\n\r\nAction:\r\n\r\nVerify the connector's configuration, identify and stop any process that's listening on port 8080, or configure this application to listen on another port.
Is there a way to make the framework generate random ports for the worker partitions while keeping the server.port property that is already configured in application.properties as is?
Thanks.
A Spring Batch remote partitioning setup requires a message broker for the communication between the manager and workers, but it does not require any web capabilities. You seem to be deploying all your apps locally (manager and workers) as web applications, hence the port conflict when multiple workers are deployed.
You have at least two options:
Either set a random server port for each app (see how Spring Boot allows you to do that here); a minimal sketch is shown below this list
Or, if the number of workers is fixed, set ports to distinct values statically.
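For the first option, a minimal sketch of what the worker-side configuration could look like, assuming the workers can be given their own application.yml (or an equivalent property source); port 0 is Spring Boot's standard way of requesting a random free port:
server:
  port: 0   # bind to a random free port, avoiding the clash on 8080
# Alternatively, for a fixed number of workers, give each worker a distinct
# static port instead (the second option above).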

How to enable spring-actuator to collect metrics from multiple applications

I have a pod which consists of multiple containers, each running an application. How do I enable the actuator to fetch metrics from these applications? I couldn't find a way to do this.
There are four microservices running in the pod on different ports, say 8082, 8080, 8081 and 8083, but the actuator only scrapes metrics from the microservice running on 8080 (the default port).
I tried adding the application properties shown below to all of the services, but it didn't work. Here is the application.properties content:
management.endpoint.metrics.enabled=true
management.endpoints.web.exposure.include=*
management.endpoint.prometheus.enabled=true
management.metrics.export.prometheus.enabled=true
management.metrics.use-global-registry=true
management.server.port=8888
Expected output: I should be able to see the metrics from each application using the /metrics endpoint.
I was able to solve this problem. Here are the steps:
Configure a separate actuator port for each service in application.properties.
Expose the configured ports in the deployment YAML (in the case of Kubernetes).
That's it. A rough sketch of both steps follows.
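The sketch below uses assumed names and port values (one service shown; repeat with a distinct management port for each of the others):
# Step 1: application.properties of one service
# management.server.port=9080       # distinct actuator port for this service
#
# Step 2: container fragment of the Kubernetes Deployment, exposing that port
spec:
  template:
    spec:
      containers:
        - name: service-a            # assumed container name
          ports:
            - containerPort: 8080    # traffic port
            - containerPort: 9080    # actuator/management port to scrape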

Linux: Hostname to Spring Cloud Web Application not recognized

I have a development environment in Windows where I can access the Spring application by both IP address and hostname (PC name) for my Eureka Config Server / Client. When I move it to the Red Hat environment, the URL is not recognized if it is a hostname.
My main goal is to change the Eureka client status page to point to the Hystrix monitor for the Eureka client's Hystrix stream. The value of ${spring.cloud.client.hostname} resolves to the hostname; I was wondering how to make it resolve to the current IP of the Eureka client instead.
To be exact, here is an example of what I am trying to do.
eureka:
  instance:
    preferIPAddress: true
    statusPageUrlPath: http://${spring.cloud.client.hostname}:${eureka.cloud.config.port}/hystrix/monitor?stream=http%3A%2F%2F${spring.cloud.client.hostname}%3A${server.port}%2Factuator%2Fhystrix.stream
It just so happens that the client and the server are on the same machine, so I am content to use the client hostname for both the Eureka Config Server path and the Eureka client's Hystrix stream.
Note that I already set preferIPAddress to true, but the generated hostname is still the value from /etc/hostname. I have seen solutions that explicitly specify the IP address in the Eureka client instance, but I prefer to keep it dynamic so that the same code runs smoothly in both the development and deployment environments.
What can I do so that the hostname is recognized the same way as the IP address?
The answer dawned on me just now. I just changed the following:
${spring.cloud.client.hostname}
↓↓↓↓↓↓↓↓↓↓↓↓
${spring.cloud.client.ip-address}
I was led to the wrong conclusion because other sites tell you to use the configuration ${spring.cloud.client.ipAddress}, which does not work.
Probably there was a change in the Finchley / Spring Boot 2.0 version. If anyone can give me a link to documentation or a discussion describing the configuration change, it would be helpful.
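For completeness, a sketch of the configuration from the question with the working placeholder substituted (the port placeholders are carried over as-is from the question):
eureka:
  instance:
    preferIPAddress: true
    statusPageUrlPath: http://${spring.cloud.client.ip-address}:${eureka.cloud.config.port}/hystrix/monitor?stream=http%3A%2F%2F${spring.cloud.client.ip-address}%3A${server.port}%2Factuator%2Fhystrix.stream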

Spring Cloud Consul's discovery client returns failing instances

I use Spring Cloud Consul to discover the services I need for my application. The discovery client returns all registered instances for the requested service, including the failing ones.
Consul itself correctly marks the failing instances as failed (critical).
So why does the discovery client not remove critical/failed instances? Or where is the documentation that describes this behaviour?
After reading the source of Spring Cloud Consul I found out that I have to set the property spring.cloud.consul.discovery.queryPassing, as follows:
java -Dspring.cloud.consul.discovery.queryPassing=true ...
The relevant source files are:
ConsulDiscoveryClient.java
ConsulDiscoveryProperties.java
After setting this property, the discovery client for Consul will return only those instances which pass all Consul health checks.
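The same flag can also be set in the application configuration instead of on the command line; a minimal application.yml sketch, relying on relaxed binding of the queryPassing property:
spring:
  cloud:
    consul:
      discovery:
        query-passing: true   # only return instances that pass all Consul health checks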

Eureka First Discovery & Config Client Retry with Docker Compose

We've three Spring Boot applications:
Eureka Service
Config Server
Simple Web Service making use of Eureka and Config Server
I've set up the services so that we use a Eureka First Discovery, i.e. the simple web application finds out about the config server from the eureka service.
When started separately (either locally or as individual Docker images) everything is OK, i.e. the config server is started after the discovery service is running, and the simple web service is started once the config server is running.
When docker-compose is used to start the services, they obviously start at the same time and essentially race to get up and running. This isn't an issue, as we've added failFast: true and retry values to the simple web service and also have the Docker container restarting, so the simple web service will eventually restart at a time when the discovery service and config server are both running, but this doesn't feel optimal.
The unexpected behaviour we noticed was the following:
The simple web service reattempts a number of times to connect to the discovery service. This is sensible and expected
At the same time, the simple web service attempts to contact the config server. Because it cannot contact the discovery service, it falls back to retrying a config server on localhost, e.g. the logs show retries going to http://localhost:8888. This wasn't expected.
The simple web service will eventually connect to the discovery service successfully, but the logs show it still tries to establish communication with the config server at http://localhost:8888. Again, this wasn't ideal.
Three questions/observations:
Is it a sensible strategy for the config client to fall back to trying localhost:8888 when it has been configured to use discovery to find the config server?
When the Eureka connection is established, should the retry mechanism not switch to the config server endpoint indicated by Eureka? Essentially, putting in higher/longer retry intervals and periods for the config server connection is pointless in this case, as it is never going to connect while it is looking at localhost, so we are better off just failing fast.
Are there any properties that can override this behaviour?
I've created a sample github repo that demonstrates this behaviour:
https://github.com/KramKroc/eurekafirstdiscovery/tree/master
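For reference, a hedged sketch of the client-side settings mentioned above (fail-fast with retry and discovery-first config lookup), typically placed in bootstrap.yml for a Eureka-first setup and requiring spring-retry on the classpath; the service id and retry values here are assumptions:
spring:
  cloud:
    config:
      fail-fast: true                # the failFast setting mentioned above
      discovery:
        enabled: true                # locate the config server via Eureka
        service-id: config-server    # assumed Eureka service id of the config server
      retry:
        max-attempts: 20             # assumed retry tuning
        initial-interval: 3000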
