How to retry a failed action? - spring

I went through the Spring Statemachine documentation but did not find clear answers for some scenarios. I would greatly appreciate it if someone could clarify my questions.
Scenario 1: How to retry errors related to action failures? Let's say I have the states S1, S2 and S3, and when we transition from S1 to S2 I want to perform action A2. If action A2 fails, I want to retry it at some time interval. Is that possible using Spring Statemachine?
Consider AWS Step Functions, for example. All work in Step Functions states is done using a Task, and a Task can be configured for retry.
transitions
    .withExternal()
        .source(States.S1)
        .target(States.S2)
        .event(Events.E1)
        .action(action());
Scenario 2: Let's say the state machine has states S1, S2 and S3, and the current state is S2. If the server goes down, will the state machine execution pick up from where it left off on startup, or will we have to start all over again?
Scenario 3: When a Guard returns false (possibly because of an error condition) and prevents a transition, what happens next?

How to retry a failed action?
There are two types of actions in Spring Statemachine: transition actions and state actions. In scenario 1 you're talking about a transition action.
When you specify a transition action, you can also specify an error handler that is called if the action fails. This is covered in the Spring Statemachine documentation.
.withExternal()
    .source(States.S1)
    .target(States.S2)
    .event(Events.E1)
    .action(action(), errorAction());
In your errorAction() method you can implement your logic.
Possible options are:
transition to an earlier state and follow the same path again
transition to a specific state (e.g. a retry state) that holds your retry logic (e.g. a Task/Executor that retries the action N times) and then transitions to other states (e.g. action success => continue the normal flow; action failed after N retries => transition to a terminal failure state)
There's also the official Tasks example, that demonstrates recovery/retry logic (source code).
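As a rough illustration of the second option, the retry logic itself could look something like the plain-Java sketch below. It is not part of Spring Statemachine: the class name RetrySketch, the Outcome enum and the runWithRetry method are all hypothetical names made up for this example, and mapping the returned outcome to a state machine event is left to the caller.

```java
import java.time.Duration;

// Hypothetical helper, not a Spring Statemachine API.
final class RetrySketch {

    /** Result that the caller could translate into a success or failure event. */
    enum Outcome { SUCCESS, FAILED_AFTER_RETRIES }

    /**
     * Runs the action up to maxAttempts times, sleeping for the given delay
     * after each failed attempt except the last one.
     */
    static Outcome runWithRetry(Runnable action, int maxAttempts, Duration delay) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return Outcome.SUCCESS;
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    break; // no attempts left, report failure below
                }
                try {
                    Thread.sleep(delay.toMillis());
                } catch (InterruptedException interrupted) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return Outcome.FAILED_AFTER_RETRIES;
    }
}
```

Inside the retry state, something like RetrySketch.runWithRetry(() -> doA2(), 3, Duration.ofSeconds(5)) (where doA2 is a placeholder for the real work) would return SUCCESS or FAILED_AFTER_RETRIES, and the caller could then send the matching event to continue the normal flow or move to the failure state.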

Related

Project reactor - react to timeout happened downstream

Project Reactor has a variety of timeout() operators.
The most basic variant raises a TimeoutException if no item arrives within the given Duration. The exception is propagated downstream, and a cancel signal is sent upstream.
Basically my question is: is it possible to react (and do something) specifically to a timeout that happened downstream, not just to the cancellation that is sent after the timeout happened?
My question is based on the requirements of a real business case, and I'm also wondering whether there is a straightforward solution.
I'll simplify my code to make it easier to understand what I want to achieve.
Let's say I have the following reactive pipeline:
Flux.fromIterable(List.of(firstClient, secondClient))
    .concatMap(Client::callApi) // making API calls sequentially
    .collectList() // collecting results of API calls for further processing
    .timeout(Duration.ofMillis(3000)) // the entire process should not take more than duration specified
    .subscribe();
I have multiple clients for making API calls. The business requirement is to call them sequentially, so I call them with concatMap(). Then I should collect all the results, and the entire process should not take more than some Duration.
The Client interface:
interface Client {
    Mono<Result> callApi();
}
And the implementations:
Client firstClient = () ->
    Mono.delay(Duration.ofMillis(2000L)) // simulating delay of first api call
        .map(__ -> new Result())
        // !!! Pseudo-operator just to demonstrate what I want to achieve
        .doOnTimeoutDownstream(() ->
            log.info("First API call canceled due to downstream timeout!")
        );
Client secondClient = () ->
    Mono.delay(Duration.ofMillis(1500L)) // simulating delay of second api call
        .map(__ -> new Result())
        // !!! Pseudo-operator just to demonstrate what I want to achieve
        .doOnTimeoutDownstream(() ->
            log.info("Second API call canceled due to downstream timeout!")
        );
So, if I have not received and collected all the results within the specified amount of time, I need to know which API call was actually canceled due to the downstream timeout, and have some callback for this "event".
I know I could put a doOnCancel() callback on every client call (instead of the pseudo-operator I demonstrated) and it would work, but this callback reacts to cancellation, which may happen due to any error.
Of course, with proper exception handling (onErrorResume(), for example) it would work as I expect; however, I'm interested in whether there is a straightforward way to react specifically to a timeout in this case.

Breaking a possible infinite loop in AWS step functions

I am writing a state machine with the following functionality.
Start state -> Lambda1, which calls an external service's Describe API endpoint to get the state attribute of an item (for example "isOkay" or "isNotOkay") -> Choice state: depending on the state received, move to the next state if it is "isOkay", or call Lambda1 again if it is "isNotOkay". This repeats until it gets an "isOkay" state. How can I put a limit on this custom retry loop so that I don't get stuck if I never receive an "isOkay" response?
You can pass a counter in the step's input and have the Lambda increment it. On each retry, check the counter against a limit; if it exceeds the limit, fail the Lambda with a custom exception. Define a separate step to handle that exception.
https://docs.aws.amazon.com/step-functions/latest/dg/input-output-inputpath-params.html
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html
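The counter idea can be sketched in plain Java, independent of the actual Step Functions definition. Everything here is an assumption made for illustration: the class BoundedPollSketch, the RetryLimitExceededException type and the pollUntilOkay method are hypothetical names, and the Supplier stands in for the Lambda that calls the Describe endpoint. In a real state machine the counter would travel in the state input rather than in a local variable.

```java
import java.util.function.Supplier;

// Hypothetical illustration of a bounded retry loop, not AWS SDK code.
final class BoundedPollSketch {

    /** Stands in for the custom exception the Lambda would fail with. */
    static final class RetryLimitExceededException extends RuntimeException {
        RetryLimitExceededException(int attempts) {
            super("Gave up after " + attempts + " attempts");
        }
    }

    /**
     * Calls describeState until it returns "isOkay" and returns the number
     * of calls made, or throws once maxAttempts calls have been used up.
     */
    static int pollUntilOkay(Supplier<String> describeState, int maxAttempts) {
        int counter = 0;
        while (counter < maxAttempts) {
            counter++; // the Lambda would increment the counter in its output
            if ("isOkay".equals(describeState.get())) {
                return counter; // Choice state: move on to the next state
            }
            // Choice state: "isNotOkay", so loop back and call again
        }
        throw new RetryLimitExceededException(counter);
    }
}
```

The exception plays the role of the error that a separate step would handle, so the loop always terminates after at most maxAttempts calls.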

Spring Integration Flow : Circuit breaker for each endpoints or at flow level

I have successfully implemented some Spring Integration flows.
I am looking to have a circuit breaker, either the same one for each endpoint or one at the flow level.
I have already read this documentation https://docs.spring.io/spring-integration/reference/html/handler-advice.html, but I haven't found my answer.
Should I use some AOP?
Thanks
G.
I'm not sure what you have missed in the mentioned docs, but RequestHandlerCircuitBreakerAdvice is indeed over there: https://docs.spring.io/spring-integration/reference/html/handler-advice.html#circuit-breaker-advice
Advices like this should be applied in the Java DSL with this configuration option:
.transform(..., c -> c.advice(expressionAdvice()))
Pay attention to that advice(expressionAdvice()) call. The expressionAdvice() is a bean method. So, you can do something similar for the RequestHandlerCircuitBreakerAdvice and any of the endpoints in your flow that need to be guarded by the circuit breaker.
And yes, you can use a single bean for the RequestHandlerCircuitBreakerAdvice. It keeps separate state for each endpoint it is called against:
protected Object doInvoke(ExecutionCallback callback, Object target, Message<?> message) {
    AdvisedMetadata metadata = this.metadataMap.get(target);
    if (metadata == null) {
        this.metadataMap.putIfAbsent(target, new AdvisedMetadata());
        metadata = this.metadataMap.get(target);
    }
Thanks for your answer, #artem-bilan.
I really appreciate that a Spring Integration team member answered this.
After more thought, I have reformulated my problem.
Given an IntegrationFlow with a specific error channel, if there are more than a given number of errors in a given time span (e.g. more than 10 errors in 10 s), I want to stop polling the input channel.
So I redirect all the errors for this flow to the flow-specific error channel.
An error counter is incremented, and if the threshold is reached within the given time span, I stop the poller.
I have a second flow that monitors "stopped" pollers and restarts them after some time.
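For illustration only, the "more than N errors within a time span" check could be kept in a small sliding-window counter like the one below. This is a plain-Java sketch, not Spring Integration code: the ErrorRateTracker class and its recordError method are hypothetical, and stopping or restarting the poller is left to the caller.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window error counter, not a Spring Integration class.
final class ErrorRateTracker {
    private final int maxErrors;
    private final long windowMillis;
    private final Deque<Long> errorTimestamps = new ArrayDeque<>();

    ErrorRateTracker(int maxErrors, long windowMillis) {
        this.maxErrors = maxErrors;
        this.windowMillis = windowMillis;
    }

    /**
     * Records an error that happened at nowMillis and returns true when more
     * than maxErrors errors fall inside the window ending at nowMillis.
     */
    synchronized boolean recordError(long nowMillis) {
        errorTimestamps.addLast(nowMillis);
        // Drop errors that are older than the window.
        while (!errorTimestamps.isEmpty()
                && nowMillis - errorTimestamps.peekFirst() >= windowMillis) {
            errorTimestamps.removeFirst();
        }
        return errorTimestamps.size() > maxErrors;
    }
}
```

A handler subscribed to the flow's error channel could call recordError(System.currentTimeMillis()) for each error message and stop the poller when it returns true. With new ErrorRateTracker(10, 10_000L), the eleventh error inside any 10-second window trips the threshold.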
[UPDATE]
I did end up using your recommendations.
Mainly because if the framework doesn't solve your problem, you're probably wrong.
And I was wrong.
Thanks !

Cypress async form validation - how to capture (possibly) quick state changes

I have some async form validation code that I'd like to put under test using Cypress. The code is pretty simple -
1. on user input, enter async validation UI state (or stay in that state if there are previous validation requests that haven't been responded to)
2. send a request to the server
3. receive a response
4. if there are no pending requests, leave async validation UI state
Step 1 is the part I want to test. Right now, this means checking if some element has been assigned some class -- but the state changes can happen very fast, and most of the time (not always!) Cypress times out waiting for something that has ALREADY happened (in other words, step 4 has already occurred by the time we get around to seeing if step 1 happened).
So the failing test looks like:
cy.get("#some-input").type("...");
cy.get("#some-target-element").should("have.class", "class-to-check-for");
Usually, by the time Cypress gets to the second line, step 4 has already run and the test fails. Is there a common pattern I should know about to solve this? I would naturally prefer not to have to change the code under test.
Edit 1:
I'm not certain that I've 100% solved the "race" condition here, but if I use the underlying native elements (discarding the jQuery abstraction), I haven't had a failure yet.
So, changing:
cy.get("#some-input").type("...")
to:
cy.get("#some-input").then(jQueryObj => {
    let nativeElement = jQueryObj[0];
    nativeElement.value = "...";
    nativeElement.dispatchEvent(new Event("input")); // make sure the app knows this element changed
});
And then running Cypress' checks for what classes have / haven't been added has been effective.
You can stub the server request that happens during form validation and slow it down; see the delay parameter: https://docs.cypress.io/api/commands/route.html#Use-delays-for-responses
While the request is delayed, your app's validation UI is showing, so you can assert on it; once the request finishes, check that the UI goes away.

Project reactor processors v3.X

We are trying to migrate from 2.X to 3.X.
https://github.com/reactor/reactor-core/issues/375
We have used the EventBus as the event manager in our application (a low-latency FX system) and it works very well for us.
After the change we decided to give every module its own processor to handle events.
1. Does this usage seem correct from your point of view? Because of the lack of documentation at the current stage, and after reviewing everything we could, we don't really know what to do here.
2. We have tried to use Flux to perform an action every X interval.
For example: market updates arrive at 1000 per second, but we want to process an update only 4 times per second. After upgrading we are using:
A processor with a buffer, sending to another method.
In this method we have a Flux that gets the list and tries to work in parallel to complete its task.
We had 2 major problems:
1. Sometimes we receive a null event which, as far as we can tell, our system is not sending, so I suppose we may be misusing the processor.
// Definition of processor
ReplayProcessor<Event> classAEventProcessor = ReplayProcessor.create();
// Event handler subscribing
public void onMyEventX(Consumer<Event> consumer) {
    Flux<Event> handler = classAEventProcessor.filter(event -> event.getType().equals(EVENT_X));
    handler.subscribe(consumer);
}
In the example above, the event in the handler is sometimes null. Once that happens, the stream stops working until we restart the server (because the processor is only created on restart).
2. We have tried to use parallel, but sometimes some of the messages disappeared, so maybe we are misusing the framework.
// In constructor
tickProcessor.buffer(1024, Duration.of(250, ChronoUnit.MILLIS))
    .subscribe(markets -> handleMarkets(markets));
// Handler
Flux.fromIterable(getListToProcess())
    .parallel()
    .runOn(Schedulers.parallel())
    .doOnNext(entryMap -> {
        DoBlockingWork(entryMap);
    })
    .sequential()
    .subscribe();
The intention is that the processor wakes up every 250 ms and invokes the handler. The handler works with a parallel Flux in order to process faster.
*If DoBlockingWork takes more than 250 ms, I couldn't work out what the behavior will be.
UPDATE:
The EventBus was wrapped by us, and every event was subscribed through the wrapped event manager.
Now we have tried to create an event processor for every module, but it works very slowly. We have used TopicProcessor with a ThreadExecutor and it is still very slow, while EventBus did the same work at high speed.
Does anyone have any idea? BTW, when I tried DirectProcessor it seemed to work much better than the TopicProcessor.
Reactor 3 is built around the concept that you should avoid blocking as much as you can, so in your second snippet DoBlockingWork doesn't look good.
How are the events generated? Do you maybe have a listener-based asynchronous API to get them? If so, you could try using Flux.create.
For your use case of "we have 1000 events in 1 second, but only want to process 4", I'd chain a sample operator. For instance, sample(Duration.ofMillis(250)) will divide each second into 4 windows, from which it will only emit the last element.
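To make the "last element per window" behavior concrete, here is a plain-Java sketch of the semantics. It is not how Reactor implements sample (the real operator is timer-driven and works on a live stream); it only models the outcome on a list of already timestamped events. The SampleSketch class, the Timed holder and the lastPerWindow method are hypothetical names, and the events are assumed to be sorted by arrival time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "keep only the last element of each time window".
final class SampleSketch {

    /** A value tagged with its arrival time in milliseconds. */
    static final class Timed<T> {
        final long atMillis;
        final T value;

        Timed(long atMillis, T value) {
            this.atMillis = atMillis;
            this.value = value;
        }
    }

    /** Returns the last value seen in each window of windowMillis. */
    static <T> List<T> lastPerWindow(List<Timed<T>> events, long windowMillis) {
        List<T> sampled = new ArrayList<>();
        long currentWindow = -1;
        T last = null;
        boolean hasValue = false;
        for (Timed<T> event : events) {
            long window = event.atMillis / windowMillis;
            if (hasValue && window != currentWindow) {
                sampled.add(last); // the previous window closed, emit its last value
            }
            currentWindow = window;
            last = event.value;
            hasValue = true;
        }
        if (hasValue) {
            sampled.add(last); // emit the last value of the final window
        }
        return sampled;
    }
}
```

Feeding it 1000 events that arrive one per millisecond (values 0 to 999) with a 250 ms window yields four values: 249, 499, 749 and 999.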
The reference guide is being written, as well as a page where you can find links to external articles and learning material. There's a preview of the WIP reference guide here and the learning resources page here.

Resources