I am running a few tests with Eureka and seeing an issue: even though I shut down the microservices, Eureka still shows them as up and running, Ribbon gets the server list, and the call fails with a 404. I went through the Eureka docs on the 85% rule, but this one is still tricky. If I disable self-preservation mode it works, but I don't want to do that, since it is not recommended for production. So what is the best configuration to avoid this issue?
The configuration options are very rich on both the client and the server side, but first you must bear in mind that the default property values are meant to work for Netflix, where there are hundreds of microservices. When you have a small infrastructure, the 85% threshold is pretty strict. One way is to decrease it using the eureka.server.renewalPercentThreshold property. You need to estimate the best value for your needs, depending mainly on the number of instances that register in Eureka.
When you decide to switch self-preservation mode off, you can configure the eureka.server.evictionIntervalTimerInMs property, so that services disappear from the registry after a time period of your choosing. Moreover, you can configure (per each instance that registers in Eureka) eureka.instance.leaseExpirationDurationInSeconds, which is the time the Eureka server waits after the last received heartbeat from a service before evicting it.
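For illustration, a minimal sketch of what this could look like with Spring Cloud Netflix; the concrete values are only assumptions you would tune to your instance count and to how long you can tolerate stale registry entries:

```yaml
# Eureka server side (application.yml) - illustrative values only
eureka:
  server:
    renewal-percent-threshold: 0.49        # lower threshold for a small infrastructure
    eviction-interval-timer-in-ms: 15000   # run the eviction task more often than the 60s default
---
# Each instance that registers in Eureka - illustrative values only
eureka:
  instance:
    lease-renewal-interval-in-seconds: 10       # heartbeat frequency
    lease-expiration-duration-in-seconds: 30    # evict after 30s without a heartbeat
```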
The following classes are very well documented, and you can figure out what is configurable and may be useful for you:
com.netflix.discovery.EurekaClientConfig.java
com.netflix.appinfo.EurekaInstanceConfig.java
com.netflix.eureka.EurekaServerConfig.java
Question:
Is there an option within Spring or its embedded servlet container to open ports only when Spring is ready to handle traffic?
Situation:
In the current setup I use a Spring Boot application running on Google Cloud Run.
Circumstances:
Cloud Run does not support liveness/readiness probes; it considers an open port to mean "application ready".
Cloud Run sends requests to the container even though Spring is not ready to handle them.
Spring starts its servlet container and opens its ports while still spinning up its beans.
Problem:
Traffic to an unready application results in a lot of HTTP 429 status codes.
This affects:
new deployments
scaling capabilities of Cloud Run
My desire:
Configure Spring / the servlet container to delay opening its ports until the application is actually ready.
Delaying the opening of ports until the application is ready would ease a lot of pain without interfering too much with the existing code base.
Any alternatives not causing too much pain?
Things I found and considered not viable
Using native-image is not an option, as it is considered experimental and consumes more RAM at compile time than our deployment pipeline agents can allocate (max 8 GB vs. the needed 13 GB).
Another answer I found: readiness check for google cloud run - how?
I don't see how it could satisfy my needs, since Spring Boot startup time is still slow; that's why my initial idea was to delay opening the ports.
I did not have time to test the following, but one thing I stumbled upon is a blog post about using multiple processes within a container. Though it goes against recommended container principles, it seems viable until Cloud Run supports probes of any kind.
As you are well aware of the fact that "Cloud Run currently does not have a readiness/liveness check to avoid sending requests to unready applications", I would say there is not much that can be done on Cloud Run's side except:
Try and optimise the Spring Boot app as per the docs.
Make a heavier entrypoint in the Cloud Run service that takes care of more setup tasks. This Stack Overflow thread mentions how "A 'heavier' entrypoint will help post-deploy responsiveness, at the cost of slower cold-starts" (this is the most relevant solution from a Cloud Run perspective and outlines the issue correctly); a sketch of this idea is shown after this list.
Run multiple processes in a container in Cloud Run, as you mentioned.
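To make the Spring side of the "heavier entrypoint" concrete, here is a minimal sketch of pulling expensive initialisation into an eagerly created bean; the bean and the work it does are hypothetical, and the premise is that in a default (non-lazy) Spring Boot servlet setup this work finishes during context startup, before the first requests are served:

```java
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+

import org.springframework.stereotype.Component;

/**
 * Hypothetical warm-up bean: doing the expensive work eagerly during context
 * startup narrows the window in which an open port does not yet mean a
 * responsive application.
 */
@Component
public class StartupWarmUp {

    @PostConstruct
    public void warmUp() {
        // placeholders for whatever the first requests would otherwise trigger lazily
        preloadReferenceData();       // e.g. fill in-memory caches
        pingDownstreamDependencies(); // e.g. a cheap query to warm up the connection pool
    }

    private void preloadReferenceData() {
        // assumption: load reference data here
    }

    private void pingDownstreamDependencies() {
        // assumption: touch the datasource / external clients here
    }
}
```

This does not delay the port opening itself, but it keeps the slow work out of the request path once traffic arrives.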
This question seems more directed at Spring Boot specifically and I found an article with a similar requirement.
However, if you absolutely need the app to be ready to serve when requests come in, there is another alternative to Cloud Run: Google Kubernetes Engine (GKE), which supports readiness/liveness probes.
I have an app deployed on a WildFly server on the Jelastic PaaS. The app functions normally with a few users. I'm trying to do some load tests with JMeter, in this case calling a REST API 300 times in 1 second.
This leads to around a 60% error rate on the requests, all of them 503 (Service Temporarily Unavailable). I don't know what I have to tweak in the environment to get rid of those errors. I'm pretty sure it's not my app's fault, since it is not heavy and I get the same results even when load-testing the index page.
The topology of the environment is simply one WildFly node (with 20 cloudlets) and a PostgreSQL database with 20 cloudlets. I had fancier topologies, but while trying to narrow the problem down I cut out the load balancer (NGINX) and the multiple WildFly nodes.
Requests via the shared load balancer (i.e. when your internet-facing node does not have a public IP) face strict QoS limits to protect platform stability. The whole point of the shared load balancer is that it is shared by many users, so you can't take 100% of its resources for yourself.
With a public IP, your traffic goes straight from the internet to your node, so those QoS limits are not needed or applicable.
As stated in the documentation, you need a public IP for production workloads (and a load test should be considered 'production' in this context).
I don't know what I have to tweak in the environment to get rid of those errors
We don't know either, and as your question doesn't provide a sufficient level of detail, we can only come up with generic suggestions like:
Check the WildFly log for any suspicious entries. HTTP 503 is a server-side error, so it should be logged along with a stacktrace that will lead you to the root cause.
Check whether the WildFly instance(s) have enough headroom to operate in terms of CPU, RAM, etc.; this can be done, for example, with the JMeter PerfMon Plugin.
Check JVM- and WildFly-specific JMX metrics using JVisualVM or the aforementioned JMeter PerfMon Plugin.
Double-check the Undertow subsystem configuration for any connection/request/rate-limiting entries (a sketch of what such an entry looks like follows this list).
Use a profiler like JProfiler or YourKit to see the slowest functions, largest objects, etc.
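To illustrate the Undertow point, this is roughly what a concurrency limit looks like when added via jboss-cli; the filter name and the numbers are made up, but finding (or accidentally inheriting) something like this would explain 503s under load:

```
# illustrative only: a request-limit filter capping concurrent requests
/subsystem=undertow/configuration=filter/request-limit=load-test-limit:add(max-concurrent-requests=100, queue-size=50)

# the reference that activates the filter on the default host
/subsystem=undertow/server=default-server/host=default-host/filter-ref=load-test-limit:add
```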
We have 10 instances of the same microservice, registered with Eureka service discovery, and calls are routed to them through a gateway. We want to deploy code changes across these 10 instances, but the deployment should be atomic, meaning that at no point in time should two instances be running different code.
A simple strategy could be to bring down 9 of the instances --> deploy the changes on them --> bring them up --> bring down the remaining instance and, after the deployment change, bring it up again.
Is this the ideal strategy to follow in a production environment, or are there specific patterns for this?
The answers on blogs seem to discuss microservices patterns, but none talk about the scenario where some of the instances have the newer code version and others are yet to be updated.
The ideal strategy is to spin up a few new instances and start balancing requests to them progressively. The load balancer can do IP-address pinning so that, starting at a particular point in time, a given IP address only gets replies from the new instances.
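Since the instances already sit behind a gateway, one hedged way to do that progressive shift is Spring Cloud Gateway's weight predicate, assuming the old and new versions register under different service IDs; the service names, path and weights below are purely illustrative:

```yaml
spring:
  cloud:
    gateway:
      routes:
        - id: orders-old                # 90% of traffic stays on the current version
          uri: lb://ORDERS-SERVICE-V1
          predicates:
            - Path=/orders/**
            - Weight=orders, 9
        - id: orders-new                # 10% goes to the new version; raise as confidence grows
          uri: lb://ORDERS-SERVICE-V2
          predicates:
            - Path=/orders/**
            - Weight=orders, 1
```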
In an ideal production world, your atomic requirement is not there... Generally we deploy new code on, say, 10% of the servers, see how it performs in terms of exceptions and latency numbers, and if all is good we keep increasing that percentage.
But I completely understand that for some releases (for example, some DB changes, although there is a solution even for that, but that is another "what if") or for some scenarios we CANNOT have multiple code bases running. The first question to ask for any deployment is the "allowed downtime".
Let us assume you need minimum downtime... then the solution is to deploy on another 10 servers, test them out, and once all is OK, point your ELB at the new servers. Note that there will be a few minutes of downtime here, as we have the atomic requirement.
I have read a lot of threads now, but my problem still could not be solved sufficiently:
If running a Tomcat web server with a Spring REST backend, there must be a way to limit the possible requests per second/minute/... based on, let's say, the IP of a requestor.
My investigations led to the following possibilities so far:
Use Guava RateLimiter or https://github.com/weddini/spring-boot-throttling and check all requests in preHandle. But since this does not take into account which IPs requested at what time, something like a Redis store (IP / last access timestamp) would make more sense to check against; a rough sketch of this idea follows this list.
Put a more advanced web server in front of Tomcat which offers this functionality (e.g. apache2 or nginx).
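For reference, this is roughly the kind of interceptor I mean for the first option: one Guava RateLimiter per client IP held in memory. The limit of 10 requests per second is a placeholder, and a real version would still need eviction of stale entries or a Redis-backed store as mentioned above:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.springframework.web.servlet.HandlerInterceptor;

import com.google.common.util.concurrent.RateLimiter;

/** Illustrative per-IP throttling interceptor, registered via a WebMvcConfigurer. */
public class IpRateLimitInterceptor implements HandlerInterceptor {

    // one limiter per client IP; placeholder limit of 10 permits per second
    private final ConcurrentMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        String ip = request.getRemoteAddr();
        RateLimiter limiter = limiters.computeIfAbsent(ip, key -> RateLimiter.create(10.0));
        if (!limiter.tryAcquire()) {
            response.setStatus(429); // Too Many Requests
            return false;
        }
        return true;
    }
}
```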
Now I don't like the first solution, since the requests already hit the application itself, and the second solution adds an additional layer, which I can't really believe is necessary for such a basic problem.
So my question is: what methods and solutions am I missing here? I read something about the Tomcat SemaphoreValve, but it seems to just limit the overall rate of requests.
Would it be most efficient/possible to already filter with some basic functionality like iptables or fail2ban on port 8443 and simply drop requests from the same IP within a given time frame?
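Something along the lines of the following, using iptables' hashlimit module, is what I have in mind here; the rate and burst values are untested guesses:

```
# illustrative, untested: drop new connections from a source IP exceeding
# 20 requests per minute (burst of 10) on port 8443
iptables -A INPUT -p tcp --dport 8443 -m conntrack --ctstate NEW \
  -m hashlimit --hashlimit-above 20/minute --hashlimit-burst 10 \
  --hashlimit-mode srcip --hashlimit-name rest_api -j DROP
```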
Can someone please tell me the pros and cons of mod_jk vs mod_cluster?
We are looking to do very simple load balancing. We are going to be using sticky sessions and just need something to route new requests to a new server if one server goes down. I feel that mod_jk does this and does a good job, so why do I need mod_cluster?
If your JBoss version is 5.x or above, you should use mod_cluster; it will give you better performance and reliability than mod_jk. Here are some reasons:
better load balancing between app servers: the load-balancing logic is calculated from information and metrics provided directly by the application servers (bear in mind they have first-hand information about their own load), in contrast with mod_jk, where the logic is calculated by the proxy itself. For that, mod_cluster uses an extra connection between the servers and the proxy (apart from the data one), which is used to send this load information.
better integration with the lifecycle of the applications deployed on the servers: the servers keep the proxy informed about changes to the application on each respective node (for example, if you undeploy the application on one of the nodes, that node informs the proxy (mod_cluster) immediately, avoiding the inconvenient 404 errors).
it doesn't require AJP: you can also use it with HTTP or HTTPS.
better management of server lifecycle events: when a server shuts down or is restarted, it informs the proxy about its state, so that the proxy can reconfigure itself automatically.
You can use sticky sessions with mod_cluster as well, though of course, if one of the nodes fails, mod_cluster won't help to keep the user sessions (as would also happen with other balancers, unless you have the JBoss nodes in a cluster). But for the reasons given above (mainly keeping track of server lifecycle events and better load balancing), if one of the servers goes down, mod_cluster will handle it better and more transparently for the user: the proxy is informed immediately, so it will never send requests to that node until it is informed that the node has been restarted.
Remember that you can use mod_cluster with JBoss AS/EAP 5.x or JBoss Web 2.1.1 or above (in the case of Tomcat, I think it's version 6 or above).
To sum up, although your load-balancing use case is simple, mod_cluster offers better performance and scalability.
You can find more information on the JBoss site for mod_cluster and on its documentation page.
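For orientation only, the httpd side of a minimal mod_cluster setup usually boils down to something like the sketch below; the addresses are placeholders and the exact module file names differ between mod_cluster versions, so treat it as an illustration of the moving parts rather than a copy-paste configuration:

```apache
# modules shipped with the mod_cluster distribution (names vary by version,
# e.g. mod_cluster_slotmem.so in newer releases)
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule slotmem_module modules/mod_slotmem.so
LoadModule manager_module modules/mod_manager.so
LoadModule proxy_cluster_module modules/mod_proxy_cluster.so
LoadModule advertise_module modules/mod_advertise.so

# dedicated listener for the management channel the JBoss nodes talk to
Listen 192.168.1.10:6666
<VirtualHost 192.168.1.10:6666>
    EnableMCPMReceive
    ServerAdvertise on

    # status page, restricted to the internal network (placeholder range)
    <Location /mod_cluster-manager>
        SetHandler mod_cluster-manager
        Require ip 192.168.1.0/24
    </Location>
</VirtualHost>
```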