Question:
Is there an option within Spring or its embedded servlet container to open ports only when Spring is ready to handle traffic?
Situation:
In the current setup I run a Spring Boot application on Google Cloud Run.
Circumstances:
Cloud Run does not support liveness/readiness probes; it considers an open port as "application ready".
Cloud Run sends requests to the container even though Spring is not yet ready to handle them.
Spring starts its servlet container and opens its ports while still spinning up its beans.
Problem:
Traffic to an unready application results in a lot of HTTP 429 status codes.
This affects:
new deployments
scaling capabilities of Cloud Run
My desire:
Configure Spring/the servlet container to delay opening ports until the application is actually ready.
Delaying the opening of ports until the application is ready would ease much of the pain without interfering too much with the existing code base.
Are there any alternatives that would not cause too much pain?
Things I found and considered not viable
Using native-image is not an option, as it is considered experimental and consumes more RAM at compile time than our deployment pipeline agents are allowed to allocate (max 8 GB vs. the needed 13 GB).
Another answer I found: readiness check for google cloud run - how?
I don't see how it could satisfy my needs, since Spring Boot's startup time is still slow. That's why my initial idea was to delay opening the ports.
I did not have time to test the following, but one thing I stumbled upon is a blog post about using multiple processes within a container. Though it goes against recommended container principles, it seems viable until Cloud Run supports probes of any kind.
You are well aware that “Cloud Run currently does not have a readiness/liveness check to avoid sending requests to unready applications”, so I would say there is not much that can be done on Cloud Run's side except to:
Try to optimise the Spring Boot app as per the docs.
Make a heavier entrypoint in the Cloud Run service that takes care of more setup tasks. This Stack Overflow thread mentions how “a ‘heavier’ entrypoint will help post-deploy responsiveness, at the cost of slower cold-starts” (this is the most relevant solution from a Cloud Run perspective and outlines the issue correctly).
Run multiple processes in a container on Cloud Run, as you mentioned.
This question seems more directed at Spring Boot specifically, and I found an article with a similar requirement.
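On the Spring side, one thing worth testing: in the Spring Boot 2.x line the embedded Tomcat connector is, as far as I can tell, only started (and the port bound) after the application context has been fully refreshed. So moving expensive startup work into bean initialization, rather than into an ApplicationRunner/CommandLineRunner (which fire after the server is already up), keeps the port closed until that work is done. A minimal sketch, with a hypothetical warm-up task:

    import javax.annotation.PostConstruct; // jakarta.annotation on Spring Boot 3+
    import org.springframework.stereotype.Component;

    // Hypothetical warm-up task. Because @PostConstruct runs during bean
    // initialization, it completes before Spring Boot starts the Tomcat
    // connector and binds the HTTP port, so Cloud Run should not see the
    // port as open until this has finished.
    @Component
    public class StartupWarmUp {

        @PostConstruct
        public void warmUp() {
            // e.g. prime caches, open connection pools, pre-load reference data
            expensiveInitialization();
        }

        private void expensiveInitialization() {
            // placeholder for whatever actually makes your startup slow
        }
    }

If your slow startup work currently lives in a runner bean, shifting it this way is a small, low-risk change to the code base.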
However, if you absolutely need the app to be ready to serve when requests come in, there is another alternative to Cloud Run: Google Kubernetes Engine (GKE), which supports readiness/liveness probes.
Related
Assume I have a simple Spring Boot Hello World microservice providing a single endpoint:
GET http://hostname/hello?name=name
and it gives the text response: Hello name
This Spring Boot app is deployed as a fat executable JAR with an embedded Tomcat web server, so that when it is launched it is always running the Tomcat server, waiting for GET requests.
When I deploy this app on AWS Lambda, will AWS charge me only while it is processing a GET request, and not while it is idle, i.e. waiting for GET requests?
The same question applies when the app is deployed as a Docker container on AWS ECS Fargate or AWS EKS Fargate.
What would be the approximate AWS Lambda and AWS Fargate charges per day when 1,000 GET requests are made per day? Just a rough approximation.
Working out the charges for a Spring Boot app running on Lambda is actually not that hard. First, you need to work out how you're going to deploy it there. There are a few ways.
There's a quick start from AWS itself. It's really simple, only a few changes to make in your app, and you can do a one-line deployment with their SAM CLI. The problem is the cold starts. Your app will go cold after some time (AWS won't tell you how long that is; this article from 2017 has more info). If it needs to start up to serve a request, that could add 20 seconds or even multiple minutes to your response time. If you're doing some kind of batch jobs, maybe that's ok, but in most cases response times of 20 seconds plus are probably going to be a no-go. In your case, with 1,000 requests/day, if they are evenly spaced you would be fine because your app would never go cold. If they're grouped, you might need option 2.
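For reference, the entry point from that quick start looks roughly like this (a sketch assuming the aws-serverless-java-container-springboot2 dependency and that Application is your @SpringBootApplication class):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import com.amazonaws.serverless.exceptions.ContainerInitializationException;
    import com.amazonaws.serverless.proxy.model.AwsProxyRequest;
    import com.amazonaws.serverless.proxy.model.AwsProxyResponse;
    import com.amazonaws.serverless.proxy.spring.SpringBootLambdaContainerHandler;
    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.RequestStreamHandler;

    public class StreamLambdaHandler implements RequestStreamHandler {

        private static final SpringBootLambdaContainerHandler<AwsProxyRequest, AwsProxyResponse> handler;

        static {
            try {
                // Boots the Spring context once per (warm) Lambda container;
                // this static initializer is exactly where the cold-start cost lives.
                // Application = your @SpringBootApplication class (assumed name).
                handler = SpringBootLambdaContainerHandler.getAwsProxyHandler(Application.class);
            } catch (ContainerInitializationException e) {
                throw new RuntimeException("Could not initialize Spring Boot application", e);
            }
        }

        @Override
        public void handleRequest(InputStream input, OutputStream output, Context context)
                throws IOException {
            // Translates the API Gateway event into a servlet request and back.
            handler.proxyStream(input, output, context);
        }
    }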
Kotless from JetBrains is a great option for Kotlin. Again it's only a few small changes to your app and you're deployed on Lambda. Kotless has a built-in autoWarm option, which means your app is pinged every 5 minutes to stay warm so no more cold starts (you can change that to whatever frequency you want). It adds an extra cost but that cost is seriously tiny.
Pricing
For option 1, take your 1,000 requests/day: that's 30,000/month. Using region US East, a 400 ms response time, and 1,500 MB of memory (which should be plenty), that will cost $0.00 if you haven't used up your Lambda free tier. Even if you have, it comes to only around $0.30/month. With option 2, the auto-warming adds ~8,000 requests per month, which adds only a few cents to your monthly bill.
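As a sanity check, here is the arithmetic behind those numbers; the per-GB-second and per-request rates are assumptions based on the published us-east-1 Lambda pricing at the time of writing, so confirm them against the calculator:

    // Rough Lambda cost estimate. Assumed rates (us-east-1, at the time of
    // writing): $0.0000166667 per GB-second, $0.20 per million requests.
    public class LambdaCostEstimate {
        public static void main(String[] args) {
            double requestsPerMonth = 30_000;   // 1,000/day * 30 days
            double durationSeconds  = 0.4;      // 400 ms per request
            double memoryGb         = 1.5;      // ~1,500 MB

            double gbSeconds  = requestsPerMonth * durationSeconds * memoryGb;
            double computeUsd = gbSeconds * 0.0000166667;
            double requestUsd = requestsPerMonth / 1_000_000 * 0.20;

            // Prints: 18000 GB-s, compute $0.30, requests $0.01 -- and the
            // free tier (400,000 GB-s + 1M requests/month) covers all of it.
            System.out.printf("%.0f GB-s, compute $%.2f, requests $%.2f%n",
                    gbSeconds, computeUsd, requestUsd);
        }
    }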
In the future, Spring Native could also be a great option; as of now I've found it too immature, but with it you could forget about cold starts. So overall, Spring Boot on Lambda seems crazy considering how "heavy" it is, but it can actually be a really cheap option for you, especially if you use DynamoDB for your persistence layer and take advantage of its free tier too.
*All prices from the official AWS cost calculator
We recently added Mongock to our Spring 5 app (using the Spring runner), but we are having some issues during our deploys. The final step in our deploy process is a health check: the deployment server checks a health page every 5 s for 5 minutes, and once it gets the correct response the deployment is considered successful.
The issue is that Mongock seems to start the migration only around 30 s after the application context loads, so the health check passes and the migration can still fail after the service was "successfully" launched.
Using a standalone runner might solve this, but we really like having other beans available during the changelogs. So is there a way to force the changelogs to be processed as part of loading the application context? Or where is this delay coming from, and how can we reduce it?
You don't provide much information, but you say that Mongock starts 30 seconds after the application context is loaded. That could be happening for two reasons:
The most likely possibility is that you are using the runner type ApplicationRunner (the default). This means that Spring decides when to run it, after the entire context is loaded. From what you describe, the runner type InitializingBean is a better fit for you: it runs the migration while the context is still being initialized, so the health check cannot pass before the migration has finished.
Please try this:
    mongock:
      runner-type: InitializingBean
The other possibility is that you have multiple instances fighting for the lock. There is nothing we can do about that; this process is already optimised (although we are improving it even more). However, as said, I believe the issue is related to the runner type.
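If it helps to see the timing difference the runner type makes, here is a small plain-Spring demo (not Mongock code, just an illustration of when each hook fires):

    import org.springframework.beans.factory.InitializingBean;
    import org.springframework.boot.ApplicationArguments;
    import org.springframework.boot.ApplicationRunner;
    import org.springframework.stereotype.Component;

    // afterPropertiesSet() runs while the context is still being built, so
    // anything that waits for the context (your health check) also waits for it.
    @Component
    class DuringStartup implements InitializingBean {
        @Override
        public void afterPropertiesSet() {
            System.out.println("runs DURING context initialization");
        }
    }

    // run() only fires after the context is fully refreshed, which is why the
    // default runner can kick in "late" from the health check's perspective.
    @Component
    class AfterStartup implements ApplicationRunner {
        @Override
        public void run(ApplicationArguments args) {
            System.out.println("runs AFTER the context is ready");
        }
    }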
I have a problem with a server, called Server A:
Server A: Red Hat Enterprise Linux Server release 7.2 (Maipo)
Server B: Red Hat Enterprise Linux Server release 7.7 (Maipo)
jdk-8u231 is installed on all servers.
I have a Spring Boot application running on both servers.
Whenever I use JMeter to send 100 concurrent requests to the application on each server, the application running on Server B has no problem.
But on Server A the application stops responding: the process (PID) is still running, but I can't reach the actuator endpoints, can't open the Swagger page, and can't send new requests, and the log file shows nothing from that point on.
The thread dumps and heap dumps show no significant differences.
Could anyone show me how to analyse this problem?
I still have no idea why it occurs.
Well, I can only speculate here, but here are some ideas that might help:
There are two possible sources of the issue here: the Java application and Linux (plus its network policies, firewall, and so forth).
Since you don't know for sure what happens, try working by elimination.
Create a script that runs 100 concurrent requests, place it on Server A (the problematic one), and run it against "localhost" (obviously). If you see that it works, the issue is not in Java at all, but probably in some network policy or Linux setup. (A minimal Java version of such a script is sketched after this list.)
Place a log message in the controller of the Java application and examine the log. The log should print the request number among other things, so that you'll be able to tell whether you get stuck after a well-defined number of requests or it's always a different number.
Check the configuration of the Spring Boot application. Maybe the number of threads allocated to serve requests by the embedded web server that runs inside the Spring Boot application differs between the two setups (assuming you're not using a reactive stack; with embedded Tomcat this is server.tomcat.threads.max in recent Spring Boot versions). If that pool is exhausted you won't be able to call REST endpoints, actuator, etc.
If a JMX connection is available to the setup, connect via JMX and check the Tomcat MBeans (again, assuming there is a Tomcat under the hood) for pretty much the same information as in the previous point.
You've mentioned thread dumps. Take more than one: one before you run the JMeter test, one during the run (while everything still works), and one when everything is stuck.
In the thread dumps, check the actual stack traces; maybe all the threads are stuck working with the database or something similar and can't serve requests, as explained in the thread-configuration point above.
Examine the GC logs; maybe the GC is working so hard that you can't really interact with the application.
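Here is a minimal version of the test client from the first point, in plain Java (Java 11+ HttpClient; the URL and request count are placeholders to adapt):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Minimal stand-in for the "100 concurrent requests" script. Run it on
    // Server A itself so the network path is taken out of the equation.
    public class ConcurrentRequestTest {
        public static void main(String[] args) throws InterruptedException {
            int requests = 100;
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8080/actuator/health"))
                    .build();
            ExecutorService pool = Executors.newFixedThreadPool(requests);
            CountDownLatch done = new CountDownLatch(requests);
            for (int i = 0; i < requests; i++) {
                final int n = i;
                pool.submit(() -> {
                    try {
                        HttpResponse<String> response =
                                client.send(request, HttpResponse.BodyHandlers.ofString());
                        // Logging the request number shows whether you always
                        // get stuck after the same count (see the logging point).
                        System.out.println("request " + n + " -> " + response.statusCode());
                    } catch (Exception e) {
                        System.out.println("request " + n + " failed: " + e);
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();
            pool.shutdown();
        }
    }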
We are trying to revamp our batch job scheduling and monitoring process across the entire enterprise. Currently all our batch jobs are scheduled via Unix crontab and monitored through log files generated by shell scripts.
This process has a lot of disadvantages, and as the number of applications grows it gets really complicated.
Two copies of each application need to be deployed: one to the app server and one standalone (since business logic is shared between both). This complicates our build process too.
There is no easy-to-use web UI where we can see the status of jobs and manually rerun failed jobs remotely, without getting onto the Unix box.
There is no failover or load-balanced batch processing.
So I was thinking of using Quartz (with our existing Spring apps), deploying everything to the app servers, and no longer relying on the Unix crontab.
Is there a way I can write a centralized web application from which I can schedule and monitor jobs running on different Quartz schedulers on different app servers?
P.S: I know quartzdesk.com is one solution, but I don't want to enable RMI on my JVM.
You could use the Spring Boot scheduler as an orchestrator and call REST APIs for remote (or local, if you are small) execution. That way, as your setup grows, you can easily put the workers behind a load balancer.
If you have the possibility of using cloud services (like Amazon, Azure, or Google Cloud), this can be done easily using their load balancers. They also support Docker and can take care of any utilization peaks.
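A minimal sketch of that orchestrator idea; the worker URL, endpoint, and cron expression are hypothetical placeholders, and it assumes @EnableScheduling is present on a configuration class:

    import org.springframework.http.ResponseEntity;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;
    import org.springframework.web.client.RestTemplate;

    @Component
    public class JobOrchestrator {

        private final RestTemplate rest = new RestTemplate();

        // Hypothetical worker endpoint; in a real setup this URL would point
        // at a load balancer in front of your app servers.
        private static final String WORKER_URL =
                "http://batch-worker.internal/jobs/nightly-report/run";

        // Replaces the old crontab entry: run at 02:00 every day.
        @Scheduled(cron = "0 0 2 * * *")
        public void triggerNightlyReport() {
            ResponseEntity<String> response =
                    rest.postForEntity(WORKER_URL, null, String.class);
            // One central place to record job status instead of grepping
            // shell-script logs on each Unix box.
            System.out.println("nightly-report triggered: " + response.getStatusCode());
        }
    }

The status line is where you would hook in real monitoring (persist it, expose it on a web UI, alert on failures), which addresses the missing-web-UI complaint.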
I have an App Engine + Spring + Hibernate mobile/web backend configured with F2 instances and a D2 Cloud SQL instance. I've also configured warm-ups by setting the minimum number of idle instances to 1.
I have two questions:
Is it possible to configure Cloud SQL instances to scale up when needed?
My app takes about 20-40 seconds to start (after removing autowiring and applying all the optimization tips described here: https://developers.google.com/appengine/articles/spring_optimization). Still, I see high latency (~20-40 seconds) for some requests during load testing. I believe this is because App Engine starts new instances and it takes them this long to start. Once the instances are up and running, everything works fine, until too many users connect and the delays return. Is there a way to solve this other than configuring more minimum live instances?
Regarding your Cloud SQL question: Cloud SQL currently does not have auto-scaling capability.
As Tony said, you can't configure Cloud SQL to scale automatically with demand. You could, of course, provision it for a higher expected demand from the beginning.
As an aside, I'd like to suggest a few things you could do with your app servers:
Change from F2 to F4 or F4_1G (if you're using a lot of memory) to see if that reduces your startup time (a sample configuration is sketched at the end of this answer).
If you're not doing so yet, use AppStats [1] to get a better understanding of where your app's bottlenecks are. If it's only the startup time and (1) doesn't help, then I'm sorry to say that configuring more idle instances would be the answer you're looking for.
[1] https://developers.google.com/appengine/docs/java/tools/appstats
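For point (1), the instance class change is a one-line edit in appengine-web.xml; a sketch, assuming the Java App Engine standard environment (element names per the appengine-web.xml reference):

    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <!-- Was F2; F4 doubles memory/CPU, F4_1G adds extra memory. -->
      <instance-class>F4</instance-class>
      <automatic-scaling>
        <!-- Your existing warm-up setting: keep at least one idle instance. -->
        <min-idle-instances>1</min-idle-instances>
      </automatic-scaling>
    </appengine-web-app>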