CircuitBreaker waitDurationInOpenState does not work - spring-boot

I am using spring-cloud-starter-circuitbreaker-resilience4j with Spring Boot, and the following is the configuration for the CircuitBreaker:
CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
        .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.TIME_BASED)
        .slidingWindowSize(5)
        .minimumNumberOfCalls(5)
        .waitDurationInOpenState(Duration.ofSeconds(10))
        .failureRateThreshold(0.7F)
        .recordExceptions(ResourceAccessException.class)
        .build();
When I call the endpoint 15 times, the circuit breaker goes to the OPEN state after 5 calls fail. While it is in the OPEN state, I expect it to remain there for 10 seconds, as configured above. But after a few milliseconds it goes to HALF_OPEN, tries more calls, and fails.
Is there anything in the configuration that I am missing?
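For reference, the config is plugged into the Spring Cloud factory roughly like this (a sketch; the Customizer bean below assumes the CircuitBreakerConfig above is exposed as a bean, which is not shown here):
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.timelimiter.TimeLimiterConfig;
import org.springframework.cloud.circuitbreaker.resilience4j.Resilience4JCircuitBreakerFactory;
import org.springframework.cloud.circuitbreaker.resilience4j.Resilience4JConfigBuilder;
import org.springframework.cloud.client.circuitbreaker.Customizer;
import org.springframework.context.annotation.Bean;

// Sketch only: registers the custom CircuitBreakerConfig as the default for
// every circuit breaker created by the Spring Cloud factory.
@Bean
public Customizer<Resilience4JCircuitBreakerFactory> defaultCustomizer(CircuitBreakerConfig circuitBreakerConfig) {
    return factory -> factory.configureDefault(id -> new Resilience4JConfigBuilder(id)
            .circuitBreakerConfig(circuitBreakerConfig)
            .timeLimiterConfig(TimeLimiterConfig.ofDefaults())
            .build());
}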
Thank you!

Related

RabbitMQ consumers are not adding up
I get the exact same issue. I have a RabbitMQ listener defined with the following annotation, using Spring AMQP:
@RabbitListener(queues = "#{documentCreationQueue.name}", concurrency = "10-20")
It correctly creates the 10 consumers, but it never increases beyond 10, even if I send 20 requests and the first 10 are still in progress.
Did you find a solution to your problem a couple of years ago?
@GaryRussell any suggestions?
I tried to define a SimpleRabbitListenerContainerFactory via the containerFactory attribute of the annotation and tried multiple settings (prefetchCount, consecutiveActiveTrigger, a taskExecutor with a thread pool, ...) without any success.
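For reference, a minimal sketch of the kind of factory I have been experimenting with (the bean name and the concrete values are illustrative, not my exact configuration):
import org.springframework.amqp.rabbit.config.SimpleRabbitListenerContainerFactory;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RabbitScalingConfig {

    // Sketch: mirrors the "10-20" concurrency range from the annotation above.
    @Bean
    public SimpleRabbitListenerContainerFactory scalingContainerFactory(ConnectionFactory connectionFactory) {
        SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
        factory.setConnectionFactory(connectionFactory);
        factory.setConcurrentConsumers(10);     // lower bound of the range
        factory.setMaxConcurrentConsumers(20);  // upper bound of the range
        factory.setPrefetchCount(1);            // low prefetch leaves messages in the queue instead of buffering them on the first 10 consumers
        factory.setConsecutiveActiveTrigger(1); // scale up after a single consecutive active delivery
        return factory;
    }
}
The listener would then reference it via @RabbitListener(..., containerFactory = "scalingContainerFactory").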

Spring Boot cannot configure database connections down to zero

I am deploying my Spring Boot REST API on AWS Fargate, which connects to AWS Aurora PostgreSQL Serverless v1.
I have configured Aurora to scale the ACU down to 0 when idle, so that I am not charged too much when I don't use the API.
Initially, my Spring Boot app maintains 10 idle connections by default, so I have tried to bring that down to zero by adding this to application.properties:
spring.datasource.minimumIdle=0
I can then see from the AWS console that the number of database connections has been reduced, but one connection remains forever.
Please suggest how to make it zero if you know how.
The Spring Boot database configuration is basically like this:
@Bean
@ConfigurationProperties(prefix = "spring.datasource")
public DataSource dataSource() {
    return DataSourceBuilder.create().build();
}
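As an aside, here is a sketch of a variant that makes the pool type explicit (illustrative only; DataSourceBuilder already defaults to HikariCP when it is on the classpath, so the behaviour should be the same):
import javax.sql.DataSource;

import com.zaxxer.hikari.HikariDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;

@Bean
@ConfigurationProperties(prefix = "spring.datasource")
public DataSource dataSource() {
    // Explicitly typed to HikariCP; spring.datasource.minimumIdle then binds
    // to Hikari's minimumIdle via relaxed binding, as before.
    return DataSourceBuilder.create().type(HikariDataSource.class).build();
}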
Edit 1
I used the suggestion in the comment to check if the connection really comes from Spring Boot.
It turns out there is no active connection, but /actuator/metrics/hikaricp.connections.idle always returns the value 1:
{"name":"hikaricp.connections.idle","description":"Idle connections","baseUnit":null,"measurements":[{"statistic":"VALUE","value":1.0}],"availableTags":[{"tag":"pool","values":["HikariPool-1"]}]}
It also does not seem to be related to the health check, because I have tried running the app locally and the result of /actuator/metrics/hikaricp.connections.idle still remains 1.
I set logging.level.root=trace to see what is happening.
Only two things keep printing in the log periodically:
The Hikari connection report, showing 1 idle connection
{"level":"DEBUG","ref":"|","marker":"INTERNAL","message":"HikariPool-1 - Before cleanup stats (total=1, active=0, idle=1, waiting=0)","logger":"com.zaxxer.hikari.pool.HikariPool","timestamp":"2022-06-14 16:15:16.799","thread":"HikariPool-1 housekeeper"}
{"level":"DEBUG","ref":"|","marker":"INTERNAL","message":"HikariPool-1 - After cleanup stats (total=1, active=0, idle=1, waiting=0)","logger":"com.zaxxer.hikari.pool.HikariPool","timestamp":"2022-06-14 16:15:16.800","thread":"HikariPool-1 housekeeper"}
{"level":"DEBUG","ref":"|","marker":"INTERNAL","message":"HikariPool-1 - Fill pool skipped, pool is at sufficient level.","logger":"com.zaxxer.hikari.pool.HikariPool","timestamp":"2022-06-14 16:15:16.800","thread":"HikariPool-1 housekeeper"}
The Tomcat NioEndpoint, which I think is not relevant:
{"level":"DEBUG","ref":"|","marker":"INTERNAL","message":"timeout completed: keys processed=0; now=1655198117181; nextExpiration=1655198117180; keyCount=0; hasEvents=false; eval=false","logger":"org.apache.tomcat.util.net.NioEndpoint","timestamp":"2022-06-14 16:15:17.181","thread":"http-nio-8445-Poller"}
Thanks to the suggestion in the comments, this turned out to be caused by the actuator health check, and it can be solved with the following setting:
management.health.db.enabled=false

Wire Logger configuration behavior on Spring Reactor Netty

While trying to configure the Reactor Netty wiretap feature on a WebFlux client to see the generated requests (and more detailed responses), I've ended up with the following working setup:
In a service bean:
private WebClient client = WebClient.builder()
        .clientConnector(new ReactorClientHttpConnector(
                HttpClient.create().wiretap("webClientLogger", LogLevel.DEBUG, AdvancedByteBufFormat.TEXTUAL)
        ))
        .baseUrl("https://...")
        .defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_NDJSON_VALUE)
        .codecs(codecs -> codecs.defaultCodecs().jackson2JsonDecoder(new Jackson2JsonDecoder(Jackson2ObjectMapperBuilder.json().build(),
                new MimeType("application", "json"), new MimeType("application", "x-ndjson"))))
        .build();
and in application.properties:
logging.level.webClientLogger = DEBUG
My question is: given that both the name and the level of the logger already seem to be defined in the wiretap method's signature, why is the above line in application.properties still required (commenting it out stops producing any log output)?
The relevant Reactor Netty documentation seems to reference reactor.netty.http.client.HttpClient; however, adding reactor.netty.http.client.HttpClient=DEBUG to application.properties without the above-mentioned line doesn't make any difference and still doesn't produce the log output.
As a note, I'm using Lombok with its @Slf4j annotation to materialize the logging behavior.
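For comparison, a minimal sketch of the no-argument wiretap variant, which (as far as I understand) logs under the reactor.netty.http.client.HttpClient category referenced by the documentation, using a hex-dump format by default (base URL and field name are illustrative):
private WebClient defaultWiretapClient = WebClient.builder()
        .clientConnector(new ReactorClientHttpConnector(
                HttpClient.create().wiretap(true)   // logs under reactor.netty.http.client.HttpClient
        ))
        .baseUrl("https://...")
        .build();
With that variant, logging.level.reactor.netty.http.client.HttpClient=DEBUG would control the output instead of the custom webClientLogger category.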

How to fix "NoHttpResponseException" when running Wiremock on jenkins?

I start a WireMock server in an integration test.
The IT passes on my local machine BUT some cases fail on the Jenkins server; the error is:
localhost:8089 failed to respond; nested exception is org.apache.http.NoHttpResponseException: localhost:8089 failed to respond
I tried adding sleep(3000) to my test, which fixes the issue, but I don't know the root cause, so the workaround is not a good idea.
I also tried using @AutoConfigureWireMock(port=8089) instead of WireMockServer to start the WireMock server, which also fixes the problem, BUT I don't know how to configure the WireMock server when using the @AutoConfigureWireMock(port=8089) annotation.
Here is my code to start the WireMock server; any suggestion to fix the "NoHttpResponseException"?
@ContextConfiguration(
        initializers = ConfigFileApplicationContextInitializer.class)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.DEFINED_PORT)
class BaseSpec extends Specification {

    @Shared
    WireMockServer wireMockServer

    def setupSpec() {
        wireMockServer = new WireMockServer(options().port(PORT).jettyHeaderBufferSize(12345)
                .notifier(new ConsoleNotifier(new Boolean(System.getenv("IT_WIREMOCK_LOG") ?: 'false')))
                .extensions(new ResponseTemplateTransformer(true)))
        wireMockServer.start()
    }
}
Apache HttpClient suffers from NoHttpResponseException from time to time. This is a very old problem.
Anyway, I guess in your case the problem might be caused by restarting the WireMock server between tests, while at the same time Apache HttpClient pools HTTP connections and tries to reuse them between tests. If this is the case, there are two solutions:
Disable pooling of HTTP connections in your tests (see the sketch after this list). This makes sense because it's considered normal for the WireMock server to be restarted during test execution. Alternatively, craft your WireMock stubs to always send "Connection": "close" among the headers; the outcome will be the same.
Switch from Apache HttpClient to Square's OkHttp. OkHttp, although it pools HTTP connections by default, is always able to gracefully recover from situations like a stale connection. Unfortunately, the library from Apache is not so smart.
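A minimal sketch of the first option, assuming the test can supply its own Apache HttpClient instance (how that client is wired into the application under test is up to your setup):
import org.apache.http.impl.NoConnectionReuseStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Sketch: every request gets a fresh connection, so a restarted WireMock
// server can never leave a stale pooled connection behind.
CloseableHttpClient testClient = HttpClients.custom()
        .setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
        .build();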
Correct; as already written by G. Demecki, it's not related to WireMock.
It's related to your application server, which is calling WireMock. Today it is common to reuse a connection to improve performance in a microservice infrastructure, so a Connection: close header, a request-scoped client, etc. is not useful.
Check the Apache HttpClient (httpclient-4.5.2) PoolingHttpClientConnectionManager documentation:
The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms
Each time a WireMock endpoint is destroyed and a new one is created for a new test class, it takes 2 seconds until your application detects that the previous connection is broken and a new one has to be opened.
If you don't wait those 2 seconds, such a NoHttpResponseException can be thrown, depending on when the last check happened.
So a Thread.sleep(2000); looks ugly, but it's not so bad as long as we know why it is required.
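If you prefer not to sleep, here is a sketch of shrinking that 2-second window on the client side instead, assuming the application's Apache HttpClient can be customized in tests (the 100 ms value is illustrative):
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Sketch: re-validate pooled connections that have been idle for more than
// 100 ms before reuse, so a restarted WireMock endpoint is detected quickly
// instead of after the 2000 ms default mentioned in the documentation.
PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setValidateAfterInactivity(100);

CloseableHttpClient client = HttpClients.custom()
        .setConnectionManager(connectionManager)
        .build();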
Each time a WireMock endpoint is destroyed (because the WireMock server is restarted between tests) and a new one is created for a new test, it takes 2 seconds (as stated in the documentation) until the application detects that the previous HTTP connection is broken and a new one has to be opened.
The solution is to simply override the default keep-alive connection behaviour for every stub using .withHeader("Connection", "close"). Something like:
givenThat(get("/endpoint_path")
        .withHeader("Authorization", equalTo(authHeader))
        .willReturn(
                ok()
                        .withBody(body)
                        .withHeader(HttpHeaders.CONNECTION, "close")
        )
)
It is also possible to do it globally, using a transformer:
public class NoKeepAliveTransformer extends ResponseDefinitionTransformer {

    @Override
    public ResponseDefinition transform(Request request,
                                        ResponseDefinition responseDefinition,
                                        FileSource files,
                                        Parameters parameters) {
        return ResponseDefinitionBuilder
                .like(responseDefinition)
                .withHeader(CONNECTION, "close")
                .build();
    }

    @Override
    public String getName() {
        return "keep-alive-disabler";
    }
}
This transformer then has to be registered when you create the WireMock server:
new WireMockServer(
        options()
                .port(port)
                .extensions(NoKeepAliveTransformer.class)
)
The solution that worked for us in this situation was just adding a retry handler to the Apache client:
@Configuration
public class FeignTestConfig {

    @Bean
    @Primary
    public HttpClient testClient() {
        return HttpClientBuilder.create().setRetryHandler((exception, executionCount, context) -> {
            if (executionCount > 3) {
                return false;
            }
            return exception instanceof org.apache.http.NoHttpResponseException
                    || exception instanceof SocketException;
        }).build();
    }
}
The SocketException is there as well, because sometimes that exception is thrown instead of NoHttpResponseException.

Readiness probe during Spring context startup

We are deploying our spring boot applications in OpenShift.
Currently we are trying to run a potentially long-running task (a database migration) before the web context is fully set up.
It is especially important that the app does not accept REST requests or process messages before the migration is fully run.
See the following minimal example:
// DemoApplication.java
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

// MigrationConfig.java
@Configuration
@Slf4j
public class MigrationConfig {

    @PostConstruct
    public void run() throws InterruptedException {
        log.info("Migration...");
        // long running task
        Thread.sleep(10000);
        log.info("...Migration");
    }
}

// Controller.java
@RestController
public class Controller {

    @GetMapping("/test")
    public String test() {
        return "test";
    }
}

// MessageHandler.java
@EnableBinding(Sink.class)
public class MessageHandler {

    @StreamListener(Sink.INPUT)
    public void handle(String message) {
        System.out.println("Received: " + message);
    }
}
This works fine so far: the configuration class is processed before the app responds to requests.
What we are worried about, however, is OpenShift's readiness probe: currently we use an actuator health endpoint to check whether the application is up and running.
If the migration takes a long time, OpenShift might stop the container, potentially leaving us with an inconsistent state in the database.
Does anybody have an idea how we could communicate that the application is starting, while preventing the REST controllers and message handlers from running?
Edit
There are multiple ways of blocking incoming REST requests; @martin-frey suggested a servlet filter.
The larger problem for us is the stream listener. We use Spring Cloud Stream to listen to a RabbitMQ queue.
I added an exemplary handler in the example above.
Do you have any suggestions on how to "pause" that?
What about a servlet filter that knows about the state of the migration? That way you should be able to handle any inbound request and return a response code to your liking. Also, there would be no need to block any request handlers until the system is fully up.
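A minimal sketch of such a filter (MigrationState is a hypothetical bean that flips to done once the migration task completes; the health-check path is also an assumption):
import java.io.IOException;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class MigrationGateFilter extends OncePerRequestFilter {

    private final MigrationState migrationState; // hypothetical bean tracking migration progress

    public MigrationGateFilter(MigrationState migrationState) {
        this.migrationState = migrationState;
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain) throws ServletException, IOException {
        // Let the health endpoint through so the readiness probe keeps working.
        if (!migrationState.isDone() && !request.getRequestURI().startsWith("/actuator/health")) {
            response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE, "Migration in progress");
            return;
        }
        filterChain.doFilter(request, response);
    }
}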
I think your app pod can run without interference if you set a large enough initialDelaySeconds for the initialization of your application. [0][1]
readinessProbe:
  httpGet:
    path: /_status/healthz
    port: 8080
  initialDelaySeconds: 10120
  timeoutSeconds: 3
  periodSeconds: 30
  failureThreshold: 100
  successThreshold: 1
Additionally, I recommend setting up the liveness probe with the same condition (but with a longer delay than the readiness probe's value); then you get automated recovery of your pods if the application has not come up within initialDelaySeconds.
[0] https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes
[1] https://docs.openshift.com/container-platform/latest/dev_guide/application_health.html
How about adding an init container whose only role is the DB migration, without the application?
Then another container serves the application. But be careful when deploying the application with more than one replica: the replicas will all execute the init container at the same time if you are using a Deployment.
If you need multiple replicas, you might want to consider StatefulSets instead.
Such database migrations are best handled by switching to a Recreate deployment strategy and doing the migration as a mid-lifecycle hook. At that point there are no instances of your application running, so it can be done safely. If you can't have downtime, then you need to be able to switch the application to some offline or read-only mode against a copy of your database while doing the migration.
Don't keep the context busy doing a long task in @PostConstruct. Instead, start the migration as a fully asynchronous task and allow Spring to build the rest of the context in the meantime. At the end of the task, just complete some shared Future with success or failure. Wrap the controller in a proxy (this can be facilitated with AOP, for example) where every method except the health check tries to get the value from that same future within a timeout. If it succeeds, the migration is done and all calls are available; if not, reject the call. Your proxy would serve as a gate, allowing only the part of the API that is critical to stay available while the migration is going on. The rest of it may simply respond with 503, indicating the service is not ready yet. Those 503 responses could also be improved by measuring and averaging the time the migration typically takes and returning that value in a Retry-After header.
With the MessageHandler you can do essentially the same thing: wait for the result of the future in the handle method (provided message handlers are allowed to hang indefinitely). Once the result is set, message handling will proceed from that moment on.
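A minimal sketch of that gate (the class name, the @Async setup and the ApplicationReadyEvent trigger are illustrative choices, not the only way to do it):
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Component;

// Hypothetical service: runs the migration off the main thread and exposes
// a future that gates request and message handling. Assumes @EnableAsync is configured.
@Component
public class MigrationService {

    private final CompletableFuture<Void> migrationDone = new CompletableFuture<>();

    @Async
    @EventListener(ApplicationReadyEvent.class)
    public void migrate() {
        try {
            // long-running migration work goes here
            migrationDone.complete(null);
        } catch (Exception e) {
            migrationDone.completeExceptionally(e);
        }
    }

    /** Blocks until the migration has finished; intended for the message handler. */
    public void awaitMigration() {
        migrationDone.join();
    }

    /** Timed check for the controller proxy: true only if the migration finished within the timeout. */
    public boolean isDone(Duration timeout) {
        try {
            migrationDone.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }
}
The handle method from the example would then call migrationService.awaitMigration() as its first statement, and the controller proxy would use isDone(timeout) to decide between serving the call and returning 503.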
