Sending requests to a container before the app in the container is ready in Docker Swarm - Spring

I'm using Docker Swarm to orchestrate the containers of my microservices.
One of the microservices has 2 replicas, so requests are load-balanced between them.
When one of these 2 containers is stopped and then started again, the application inside it needs some time to start after the container itself is up.
However, as soon as the container is started, requests are routed to it, and since the app is not ready yet (it needs about 5 minutes to start) I get server connection errors.
Is there any configuration (maybe a parameter in docker-compose) for Swarm load balancing that stops requests from being sent to a container for some configured time after start and waits for the app?
I tried the healthcheck parameter in docker-compose, but it did not work.
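For reference, this is roughly the kind of healthcheck I tried (the service name, image, port, and health endpoint are placeholders; I'm assuming a Spring Boot actuator endpoint and curl being available in the image). My understanding is that once a healthcheck is defined, Swarm should only route requests to a task after it reports healthy, and start_period keeps failures during boot from counting against the retry limit:

```yaml
version: "3.4"   # start_period needs compose file format 3.4 or newer
services:
  myservice:                        # placeholder service name
    image: myorg/myservice:latest   # placeholder image
    deploy:
      replicas: 2
    healthcheck:
      # Placeholder check: a Spring Boot app would expose /actuator/health,
      # and curl has to exist in the image for this test to work.
      test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      # The app needs ~5 minutes to boot; failed checks during this window
      # are not counted against the retry limit.
      start_period: 300s
```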

Related

TCP connection limit/timeout in virtual machine and native macOS/ARM-based Mac gRPC Go client?

I am currently working on a gRPC microservice, which is deployed to a Kubernetes cluster. As usual, I am benchmarking and load-/stress-testing my service, trying out different load-balancing settings, the impact of SSL, and so forth.
Initially, I used my MacBook and my gRPC client written in Go and executed this setup either in Docker or directly in containerd with nerdctl. The framework I use for this is called Colima, which basically builds on a lean Alpine VM to provide the container engine. Here I ran into connection timeouts and refusals once I crossed a certain number of parallel sessions, which I suspect is caused by the container engine.
Therefore, I went ahead and ran my Go client natively on macOS. This setup somehow runs into the default 20s keepalive timeout for gRPC (https://grpc.github.io/grpc/cpp/md_doc_keepalive.html) as soon as the number of parallel connections exceeds, by some margin, the amount of traffic my service can actually work through (#1).
When I run the very same Go client on an x86 Ubuntu 22 desktop, there are no such issues whatsoever, and I can start far more sessions in parallel, which are then processed without ever hitting the 20s keepalive timeout.
Any ideas why this happens, and whether I could change my setup so that I can run my stress-test benchmarks from macOS?
#1: Let's say my service can process and answer 1 request per second. For stress testing, I now start 20 parallel sessions and would expect them to be processed sequentially.
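For reference, a minimal sketch of where client-side keepalive can be tuned in a grpc-go client; the target address and the concrete values are placeholders for experimentation, not the settings I actually run with:

```go
package main

import (
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Placeholder target; in my setup this points at the Kubernetes service.
	const target = "localhost:50051"

	conn, err := grpc.Dial(target,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		// Client-side keepalive: send a ping every 30s on an active connection
		// and wait up to 60s for the ack before declaring the connection dead.
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                30 * time.Second,
			Timeout:             60 * time.Second,
			PermitWithoutStream: true,
		}),
	)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	// ... create the service stub from conn and start the parallel sessions ...
}
```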

How can a client temporarily stop sending heartbeats to the Eureka server?

I have a requirement to keep incoming requests away from one instance of an instance cluster for some time. Think of a service instance that is about to execute a heavy batch process. While that batch process is running, I want to temporarily stop sending heartbeats to the Eureka server. After the task is done, the instance should go back to being a normal active instance that receives incoming requests.
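Something along the lines of the sketch below is what I'm after. Instead of literally stopping the heartbeats, it marks the instance OUT_OF_SERVICE via Netflix's ApplicationInfoManager, which I assume would have the same effect of taking the instance out of rotation; the surrounding class and method names are made up:

```java
import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.InstanceInfo;
import org.springframework.stereotype.Service;

@Service
public class HeavyBatchRunner {

    private final ApplicationInfoManager applicationInfoManager;

    public HeavyBatchRunner(ApplicationInfoManager applicationInfoManager) {
        this.applicationInfoManager = applicationInfoManager;
    }

    public void runBatch() {
        // Take this instance out of rotation before the heavy batch starts.
        applicationInfoManager.setInstanceStatus(InstanceInfo.InstanceStatus.OUT_OF_SERVICE);
        try {
            // ... heavy batch processing ...
        } finally {
            // Put the instance back into rotation once the batch is finished.
            applicationInfoManager.setInstanceStatus(InstanceInfo.InstanceStatus.UP);
        }
    }
}
```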

Why can Tomcat stop responding until the deployed app is stopped?

Tomcat 8 is running in a Docker container with a single app deployed there.
The app is mainly busy processing user requests and cron jobs (usually additional work needs to be done after a user request has finished).
What the problem looks like (judging by the logs):
The app (deployed under /mysoawesomeapp) is working as usual, processing requests and cron jobs.
Then there is a gap of a couple of minutes, as if the app had frozen.
Docker is running a health check against localhost:8080 every 30s, waiting 10s for a response, and then it restarts the container.
I can see the shutdown request in the logs, and after that I can also see health-check responses with status 200. It doesn't really matter at that point, since the server is already being shut down.
My question is: how is it possible that a request to localhost:8080, which would normally load the Tomcat home page, can be stalled until the server shutdown occurs? How can mysoawesomeapp have an impact on that? And how can I confirm it?
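One thing I intend to try in order to confirm it is a thread dump taken while the freeze is happening, to see whether Tomcat's request threads are all blocked (the container name and PID below are placeholders; this assumes a JDK, or at least a shell with kill, inside the container):

```
# If the image ships a full JDK, capture the dump on the host:
docker exec <container> jstack <tomcat-pid> > threaddump.txt

# With a JRE only, SIGQUIT makes the JVM print the thread dump
# to stdout/catalina.out, readable via `docker logs <container>`:
docker exec <container> kill -3 <tomcat-pid>
```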

On AWS Elastic Beanstalk, how can I keep instances ready but not serving until the main instance crashes?

I'm working on an app that currently cannot work correctly when it runs as multiple instances behind a load balancer that sends traffic to more than one of them. That is because the web sockets are coordinated via goroutines rather than via an external pub/sub system. The app also crashes sometimes, and it takes about 30s for it to come back up, so I would like to have another instance of it ready to serve when the live instance crashes.
Is there a good way to do that within Elastic Beanstalk, minimizing downtime?

How to find out the worker/socket number from within app code running in the Thin web server

I've recently started using a centralized log server (Graylog2) and have been happily adding information to be logged.
I'm also running several Ruby web applications (Rails and Sinatra) under the Thin web server, each with a number of workers (1-4). The workers listen on UNIX sockets.
I'd like to log which of the Thin workers (i.e. worker #1, worker #2) is serving the request. The idea is to check that all workers get roughly the same load from the load balancer.
There seems to be no HTTP header or ENV variable set up for this in Thin.
Does anyone know if this information can be made available to the web app running inside the worker?
