Spring Sleuth - Tracing Failures

In a microservice environment I see two main benefits from tracing requests through all microservice instances over an entire business process:
finding latency gaps between or within service instances
finding the root causes of failures, whether technical or related to the business case
Zipkin is a tool that addresses the first issue. But how can tracing be used to unveil failures in your microservice landscape? I definitely want to trace all error-afflicted spans, but not every request where nothing went wrong.
As mentioned here, a custom Sampler could be used:
Alternatively, you may register your own Sampler bean definition and programmatically make the decision which requests should be sampled. You can make more intelligent choices about which things to trace, for example, by ignoring successful requests, perhaps checking whether some component is in an error state, or really anything else.
So I tried to implement that, but either it doesn't work or I used it wrong.
So, as the blog post suggested, I registered my own Sampler:
@Bean
Sampler customSampler() {
    return new Sampler() {
        @Override
        public boolean isSampled(Span span) {
            boolean isErrorSpan = false;
            for (String tagKey : span.tags().keySet()) {
                if (tagKey.startsWith("error_")) {
                    isErrorSpan = true;
                }
            }
            return isErrorSpan;
        }
    };
}
And in my controller I create a new Span, which is tagged as an error if an exception is raised:
private final Tracer tracer;

@Autowired
public DemoController(Tracer tracer) {
    this.tracer = tracer;
}

@RequestMapping(value = "/calc/{i}")
public String calc(@PathVariable String i) {
    Span span = null;
    try {
        span = this.tracer.createSpan("my_business_logic");
        return "1 / " + i + " = " + new Float(1.0 / Integer.parseInt(i)).toString();
    } catch (Exception ex) {
        log.error(ex.getMessage(), ex);
        span.logEvent("ERROR: " + ex.getMessage());
        this.tracer.addTag("error_" + ex.hashCode(), ex.getMessage());
        throw ex;
    } finally {
        this.tracer.close(span);
    }
}
Now, this doesn't work. If I request /calc/a, the method Sampler.isSampled(Span) is called before the controller method throws a NumberFormatException. This means that when isSampled() checks the Span, it has no tags yet, and the Sampler method is not called again later in the process. Only if I open up the Sampler and allow every span to be sampled do I see my tagged error span later on in Zipkin. In that case Sampler.isSampled(Span) was called only once, but HttpZipkinSpanReporter.report(Span) was executed 3 times.
So what would the use case look like to transmit only traces which contain error spans? Is this even a correct way to tag a span with an arbitrary "error_" tag?

The sampling decision is taken per trace. That means that when the first request comes in and the span is created, you have to take the decision. You don't have any tags / baggage at that point, so you must not depend on the contents of tags to take this decision. That's the wrong approach.
You are taking a very custom approach. If you want to go that way (which is not recommended) you can create a custom implementation of a SpanReporter - https://github.com/spring-cloud/spring-cloud-sleuth/blob/master/spring-cloud-sleuth-core/src/main/java/org/springframework/cloud/sleuth/SpanReporter.java#L30 . The SpanReporter is the one that sends spans to Zipkin. You can create an implementation that wraps an existing SpanReporter implementation and delegates the execution to it only when some values of tags match. But from my perspective it doesn't sound right.
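A minimal sketch of such a wrapper, assuming the Sleuth 1.x SpanReporter interface with a single report(Span) method and Span.tags() as used in the question; the class name and the bean wiring (not shown) are illustrative, not an official recommendation:

import java.util.Map;
import org.springframework.cloud.sleuth.Span;
import org.springframework.cloud.sleuth.SpanReporter;

public class ErrorOnlySpanReporter implements SpanReporter {

    // the real reporter, e.g. the Zipkin reporter configured by Sleuth
    private final SpanReporter delegate;

    public ErrorOnlySpanReporter(SpanReporter delegate) {
        this.delegate = delegate;
    }

    @Override
    public void report(Span span) {
        // forward only spans carrying at least one "error_" tag, drop the rest
        for (Map.Entry<String, String> tag : span.tags().entrySet()) {
            if (tag.getKey().startsWith("error_")) {
                delegate.report(span);
                return;
            }
        }
    }
}

Note that this still samples every span; it only filters what gets reported to Zipkin, which is why the answer considers it a workaround rather than a proper solution.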

Related

How to track java method calls and get alerts in Dynatrace?

I have code similar to the one below. Every time a DB lock appears, I want to get an alert in Dynatrace that creates a problem, so that I can see it on the dashboard and possibly also get an email notification. The DB lock would appear if the update count is greater than 1.
private int removeDBLock(DataSource dataSource) {
    int updateCount = 0;
    final Timestamp lastAllowedDBLockTime = new Timestamp(System.currentTimeMillis() - (5 * 60 * 1000));
    final String query = format(RELEASE_DB_CHANGELOCK, lastAllowedDBLockTime.toString());
    try (Statement stmt = dataSource.getConnection().createStatement()) {
        updateCount = stmt.executeUpdate(query);
        if (updateCount > 0) {
            log.error("Stale DB Lock found. Locks Removed Count is {} .", updateCount);
        }
    } catch (SQLException e) {
        log.error("Error while trying to find and remove Db Change Lock. ", e);
    }
    return updateCount;
}
I tried using the event API mentioned here to trigger an event on my host and was successful in raising a problem alert on my dashboard:
https://www.dynatrace.com/support/help/dynatrace-api/environment-api/events/post-event/?request-parameters%3C-%3Ejson-model=json-model
But this would mean injecting an API call into my code just for monitoring, and it may lead to more external dependencies and hence more chances of failure.
I also tried creating a custom service detection by adding the class containing this method, and the method itself, to the custom service. But I do not know how to link this to an alert or an event that creates a problem on the dashboard.
Are there any best practices or solutions for how I can do this in Dynatrace? Any leads would be helpful.
I would take a look at Custom Services for Java, which will cause invocations of the method to be monitored in more detail.
Maybe you can extract a method which actually throws an exception, with the outer method handling it. Then it should be possible to alert on the exception.
There are also some more ways to configure the service via settings, e.g. raising an error based on a return value directly.
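A minimal sketch of that extraction, reusing the fields and constants from the question's snippet; the StaleDBLockException type and the method names are illustrative assumptions, not Dynatrace APIs:

private int removeDBLock(DataSource dataSource) {
    try {
        return releaseStaleLocks(dataSource);
    } catch (StaleDBLockException e) {
        // handled here, but visible to a custom service defined on releaseStaleLocks()
        log.error("Stale DB Lock found. Locks Removed Count is {} .", e.getRemovedCount(), e);
        return e.getRemovedCount();
    } catch (SQLException e) {
        log.error("Error while trying to find and remove Db Change Lock. ", e);
        return 0;
    }
}

// define the Dynatrace custom service on this method so the thrown exception can be alerted on
private int releaseStaleLocks(DataSource dataSource) throws SQLException, StaleDBLockException {
    final Timestamp lastAllowedDBLockTime = new Timestamp(System.currentTimeMillis() - (5 * 60 * 1000));
    final String query = format(RELEASE_DB_CHANGELOCK, lastAllowedDBLockTime.toString());
    try (Statement stmt = dataSource.getConnection().createStatement()) {
        int updateCount = stmt.executeUpdate(query);
        if (updateCount > 0) {
            throw new StaleDBLockException(updateCount);
        }
        return updateCount;
    }
}

// hypothetical exception carrying the number of removed locks
class StaleDBLockException extends Exception {
    private final int removedCount;
    StaleDBLockException(int removedCount) { this.removedCount = removedCount; }
    int getRemovedCount() { return removedCount; }
}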
See also documentation:
https://www.dynatrace.com/support/help/how-to-use-dynatrace/transactions-and-services/configuration/define-custom-services/
https://www.dynatrace.com/support/help/technology-support/application-software/java/configuration-and-analysis/define-custom-java-services/

Mono returned by ServerRequest.bodyToMono() method not extracting the body if I return ServerResponse immediately

I am using web reactive in Spring WebFlux. I have implemented a handler function for a POST request, and I want the server to return immediately. So I have implemented the handler as below:
public class Sample implements HandlerFunction<ServerResponse> {
    public Mono<ServerResponse> handle(ServerRequest request) {
        Mono<String> bodyMono = request.bodyToMono(String.class);
        bodyMono.map(str -> {
            System.out.println("body got is " + str);
            return str;
        }).subscribe();
        return ServerResponse.status(HttpStatus.CREATED).build();
    }
}
But the print statement inside the map function is not getting called, which means the body is not getting extracted.
If I do not return the response immediately and instead use
return bodyMono.then(ServerResponse.status(HttpStatus.CREATED).build())
then the map function is getting called.
So, how can I do processing on my request body in the background?
Please help.
EDIT
I tried using flux.share() like below:
Flux<String> bodyFlux = request.bodyToMono(String.class).flux().share();
Flux<String> processFlux = bodyFlux.map(str -> {
    System.out.println("body got is");
    try {
        Thread.sleep(1000);
    } catch (Exception ex) {
    }
    return str;
});
processFlux.subscribeOn(Schedulers.elastic()).subscribe();
return bodyFlux.then(ServerResponse.status(HttpStatus.CREATED).build());
In the above code, the map function sometimes gets called and sometimes not.
As you've found, you can't just arbitrarily subscribe() to the Mono returned by bodyToMono(), since in that case the body simply doesn't get passed into the Mono for processing. (You can verify this by putting a single() call in that Mono; it'll throw an exception since no element will be emitted.)
So, how can I do processing on my request body in the background?
If you really still want to just use reactor to do a long task in the background while returning immediately, you can do something like:
return request.bodyToMono(String.class).doOnNext(str -> {
    Mono.just(str).publishOn(Schedulers.elastic()).subscribe(s -> {
        System.out.println("proc start!");
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("proc end!");
    });
}).then(ServerResponse.status(HttpStatus.CREATED).build());
This approach immediately publishes the emitted element to a new Mono, set to publish on an elastic scheduler, that is then subscribed in the background. However, it's kind of ugly, and it's not really what reactor is designed to do. You may be misunderstanding the idea behind reactor / reactive programming here:
It's not written with the idea of "returning a quick result and then doing stuff in the background" - that's generally the purpose of a work queue, often implemented with something like RabbitMQ or Kafka. Its raison d'être is instead to be non-blocking, so a single thread is never idly blocked waiting for something else to complete.
The map() method isn't designed for side effects; it's designed to transform each object into another. For side effects, you want doOnNext() instead.
Reactor uses a single thread by default, so the "additional processing" in your map() method would still block that thread.
If your application is for anything more than quick demo purposes, and/or you need to make heavy use of this pattern, then I'd seriously consider setting up a proper work queue instead.
This is not possible.
Web servers (including Reactor Netty, Tomcat, etc) clean up and recycle resources when request processing is done. This means that when your controller handler is done, the HTTP resources, the request itself, reusable buffers, etc are recycled or closed. At that point, you cannot read from the request body anymore.
In your case, you need to read and buffer the whole request body first, then return a response and kick off a task for processing that request in a separate execution.
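A minimal sketch of that approach in the same handler context as the question; the process(String) method and the scheduler choice are illustrative assumptions:

public Mono<ServerResponse> handle(ServerRequest request) {
    return request.bodyToMono(String.class)                              // read and buffer the whole body first
            .doOnNext(body -> Mono.fromRunnable(() -> process(body))     // hypothetical processing method
                    .subscribeOn(Schedulers.elastic())                   // hand off to a separate execution
                    .subscribe())
            .then(ServerResponse.status(HttpStatus.CREATED).build());    // respond once the body has been consumed
}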

Spring Reactor: throw exception if NOT empty

Similar to Spring Reactor: How to throw an exception when publisher emit a value?
I have a finder method findSomePojo in my DAO which returns a SomePojo result. The finder calls Amazon DynamoDB APIs, and software.amazon.awssdk.services.dynamodb.model.GetItemResponse holds the output of the call.
So I am trying this hasElement() check in my service layer's createSomePojo method. (Not sure if I am using it correctly; I was trying and debugging.)
Basically:
I want to check if there is already an element; in that case it is illegal to save and I would not call the DAO's save, so I need to throw an exception.
Assuming that there is already a record of SomePojo in the DB, I invoke create_SomePojo of the service. But I see in the logs that the filter is not working, and I get an NPE when reactor invokes createModel_SomePojo, making me believe that it somehow throws the NPE even after the filter check.
///service SomePojoService: it has create_SomePojo, find_SomePojo, etc.
Mono<Void> create_SomePojo(reqPojo) {
    // Before calling the DAO's save I call the service find
    // (which basically calls the DAO's find, shown below after this method)
    Mono<Boolean> monoPresent = find_SomePojo(accountId, contentIdExtn)
            .filter(i -> i.getId() != null)
            .hasElement();
    System.out.println("monoPresent=" + monoPresent.toString());
    if (monoPresent.toString().equals("MonoHasElement")) {
        //*************it comes here, I see that***********//
        System.out.println("hrereee monoPresent=" + monoPresent);
        // Mono<Error> monoCheck=
        return monoPresent.handle((next, sink) -> sink.error(new SomeException(ITEM_ALREADY_EXISTS))).then();
    } else {
        return SomePojoRepo.save(reqPojo).then();
    }
}

Mono<SomePojo> find_SomePojo(id) {
    return SomePojoRepo.find(id);
}
==============================================================
///DAO: SomePojoRepo.java: it has save, find, delete
Mono<SomePojo> find(String id) {
    Mono<SomePojo> fallback = Mono.empty();
    Mono<GetItemResponse> monoFilteredResponse = monoFuture
            .filter(getItemResponse -> getItemResponse != null && getItemResponse.item().size() > 0);
    Mono<SomePojo> result = monoFilteredResponse
            .map(getItemResponse -> createModel_SomePojo(getItemResponse.item()));
    Mono<SomePojo> deferedResult = Mono.defer(() -> result.switchIfEmpty(fallback));
    return deferedResult;
}
I see there is a hasElement() method on Mono, but I am not sure how to use it correctly.
I can get the exception if I call the DAO save directly in my service create_SomePojo(reqPojo) without all this finder checking, because the primary key constraint will throw an exception which I can rethrow and then catch in the service. But what if I want to check in the service and throw an exception with error codes? The idea is not to pass a response error object down to the DAO layer.
Try using the Hooks.onOperatorDebug() hook to get a better debugging experience.
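For example (this is the standard reactor-core global switch; where you call it, e.g. at application startup, is up to you):

import reactor.core.publisher.Hooks;

// enable assembly-time stack capture so reactor errors point at the operator that failed
Hooks.onOperatorDebug();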
Correct way to use hasElement() (assuming that find_SomePojo never returns null):
Mono<Boolean> monoPresent = find_SomePojo(accountId, contentIdExtn)
        .filter(i -> i.getId() != null)
        .hasElement();
return monoPresent.flatMap(isPresent -> {
    if (isPresent) {
        return Mono.error(new SomeException(ITEM_ALREADY_EXISTS));
    } else {
        return SomePojoRepo.save(reqPojo);
    }
}).then();
Sidenote
There is a common misconception about what a Mono actually is. It does not hold any data - it is just a fragment of a pipeline which transmits signals and the data flowing through it. Therefore, the line System.out.println("monoPresent=" + monoPresent.toString()); makes no sense, because it just prints the hasElement() decorator around the existing pipeline. The internal name of that decorator is MonoHasElement; no matter what it will eventually emit (true or false), MonoHasElement is printed anyway.
Correct ways to print the signals (and the data transmitted along with them) are:
Mono.log(), Mono.doOnEach/doOnNext(System.out::println), or System.out.println("monoPresent=" + monoPresent.block()). Beware of the third one: it will block the whole thread until the data is emitted, so use it only if you know what you are doing.
Example with Monos printing to play with:
Mono<String> abc = Mono.just("abc").delayElement(Duration.ofSeconds(99999999));
System.out.println(abc); //this will print MonoDelayElement instantly
System.out.println(abc.block()); //this will print 'abc', if you are patient enough ;^)
abc.subscribe(System.out::println); //this will also print 'abc' after 99999999 seconds, but without blocking current thread

Service method transactionality when not using exceptions as flow control in Spring Boot

I have the following method in an @Service class which has @Transactional defined:
@Override
public Result add(@NonNull final UserSaveRequest request) {
    final Result<Email> emailResult = Email.create(request.getEmail());
    final Result<UserFirstName> userFirstNameResult = UserFirstName.create(request.getFirstName());
    final Result<UserLastName> userLastNameResult = UserLastName.create(request.getLastName());
    final Result combinedResult = Result.combine(emailResult, userFirstNameResult, userLastNameResult);
    if (combinedResult.isFailure()) {
        return Result.fail(combinedResult.getErrorMessage());
    }
    final Result<User> userResult = User.create(emailResult.getValue(), userFirstNameResult.getValue(), userLastNameResult.getValue());
    if (userResult.isFailure()) {
        return Result.fail(userResult.getErrorMessage());
    }
    this.userRepository.save(userResult.getValue());
    return Result.ok();
}
Now, as you can see, I utilize a Result class which can contain either a return value or an error message, as I don't think using exceptions for flow control is very clean.
The problem I now have is that the complete method is bound to one transaction, and if one database call fails, the whole transaction is rolled back. In my model, however, if something happens after the this.userRepository.save(userResult.getValue()); call that forces me to return a failed result, I can't undo that save(userResult.getValue()); call, seeing as I don't use exceptions for flow control.
Is this a problem that has an elegant solution, or is this a place where I need to make a trade-off between using exceptions as flow control and having to mentally keep track of the ordering of my statements in these kinds of situations?
Yes, you can trigger rollback manually. Try this:
TransactionAspectSupport.currentTransactionStatus().setRollbackOnly();
More information: https://docs.spring.io/spring/docs/5.0.7.RELEASE/spring-framework-reference/data-access.html#transaction-declarative-rolling-back
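A minimal sketch of how that could look near the end of the add() method above; the post-save check is a hypothetical placeholder, and this only works when the method actually runs through the @Transactional proxy:

this.userRepository.save(userResult.getValue());

// hypothetical follow-up step that may still fail after the save
final Result postSaveResult = somePostSaveCheck(userResult.getValue());
if (postSaveResult.isFailure()) {
    // mark the surrounding transaction for rollback so the save above is undone,
    // while still returning a Result instead of throwing
    // (TransactionAspectSupport is in org.springframework.transaction.interceptor)
    TransactionAspectSupport.currentTransactionStatus().setRollbackOnly();
    return Result.fail(postSaveResult.getErrorMessage());
}
return Result.ok();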

Domain Driven Design - complex validation of commands across different aggregates

I've only just begun with DDD and am currently trying to grasp the ways to do different things with it. I'm trying to design the system using asynchronous events (no event sourcing yet) with CQRS. Currently I'm stuck on validation of commands. I've read this question: Validation in a Domain Driven Design; however, none of the answers seem to cover complex validation across different aggregate roots.
Let's say I have these aggregate roots:
Client - contains a list of enabled services; each service can have a value-object list of discounts and their validity.
DiscountOrder - an order to enable more discounts on some of the services of a given client; contains order items with discount configuration.
BillCycle - each period when bills are generated is described by its own bill cycle.
Here's the usecase:
A discount order can be submitted. Each new discount period in the discount order must not overlap with any of the BillCycles. No two discounts of the same type can be active at the same time on one service.
Basically, using Hibernate in CRUD style, this would look something like this (Java code, but the question is language-agnostic):
public class DiscountProcessor {
    ...
    @Transactional
    public void processOrder(long orderId) {
        DiscOrder order = orderDao.get(orderId);
        BillCycle[] cycles = billCycleDao.getAll();
        for (OrderItem item : order.getItems()) {
            // Validate billcycle overlapping
            for (BillCycle cycle : cycles) {
                if (periodsOverlap(cycle.getPeriod(), item.getPeriod())) {
                    throw new PeriodsOverlapWithBillCycle(...);
                }
            }
            // Validate discount overlapping
            for (Discount d : item.getForService().getDiscounts()) {
                if (d.getType() == item.getType() && periodsOverlap(d.getPeriod(), item.getPeriod())) {
                    throw new PeriodsOverlapWithOtherItems(...);
                }
            }
            // Maybe some other validations in future or stuff
            ...
        }
        createDiscountsForOrder(order);
    }
}
Now here are my thoughts on implementation:
Basically, the order can be in three states: "DRAFT", "VALIDATED" and "INVALID". The "DRAFT" state can contain any kind of invalid data, the "VALIDATED" state should only contain valid data, and the "INVALID" state should contain invalid data.
Therefore, there should be a method which tries to switch the state of the order; let's call it order.validate(...). The method will perform the validations required for the shift of state (DRAFT -> VALIDATED or DRAFT -> INVALID) and, if successful, change the state and emit an OrderValidated or OrderInvalidated event.
Now, what I'm struggling with is the signature of said order.validate(...) method. To validate the order, it requires several other aggregates, namely BillCycle and Client. I can see these solutions:
1. Put those aggregates directly into the validate method, like order.validateWith(client, cycles) or order.validate(new OrderValidationData(client, cycles)). However, this seems a bit hackish.
2. Extract the required information from client and cycle into some kind of intermediate validation data object, something like order.validate(new OrderValidationData(client.getDiscountInfos(), getListOfPeriods(cycles))).
3. Do the validation in a separate service method which can do whatever it wants with whatever aggregates it wants (basically similar to the CRUD example above). However, this seems far from DDD, as the method order.validate() would become a dummy state setter, and calling it would make it possible to bring an order unintuitively into a corrupted state (status = "valid" but contains invalid data because nobody bothered to call the validation service).
What is the proper way to do it, and could it be that my whole thought process is wrong?
Thanks in advance.
What about introducing a delegate object to manipulate Order, Client, BillCycle?
class OrderingService {
    @Injected private ClientRepository clientRepository;
    @Injected private BillingRepository billRepository;

    Specification<Order> validSpec() {
        return new ValidOrderSpec(clientRepository, billRepository);
    }
}

class ValidOrderSpec implements Specification<Order> {
    @Override public boolean isSatisfiedBy(Order order) {
        Client client = clientRepository.findBy(order.getClientId());
        BillCycle[] billCycles = billRepository.findAll();
        // validate here
    }
}

class Order {
    void validate(Specification<Order> spec) {
        if (spec.isSatisfiedBy(this)) {
            validated();
        } else {
            invalidated();
        }
    }
}
The pros and cons of your three solutions, from my perspective:
order.validateWith(client, cycles)
It is easy to test the validation with order.
// file: OrderUnitTest
@Test public void should_change_to_valid_when_xxxx() {
    Client client = new ClientFixture()...build()
    BillCycle[] cycles = new BillCycleFixture()...build()
    Order order = new OrderFixture()...build();
    subject.validateWith(client, cycles);
    assertThat(order.getStatus(), is(VALID));
}
So far so good, but there seems to be some duplicate test code for the DiscountProcessor test.
// file: DiscountProcessor
@Test public void should_change_to_valid_when_xxxx() {
    Client client = new ClientFixture()...build()
    BillCycle[] cycles = new BillCycleFixture()...build()
    Order order = new OrderFixture()...build()
    DiscountProcessor subject = ...
    given(clientRepository).findBy(client.getId()).thenReturn(client);
    given(cycleRepository).findAll().thenReturn(cycles);
    given(orderRepository).findBy(order.getId()).thenReturn(order);
    subject.processOrder(order.getId());
    assertThat(order.getStatus(), is(VALID));
}

// or in mock style
@Test public void should_change_to_valid_when_xxxx() {
    Client client = mock(Client.class)
    BillCycle[] cycles = array(mock(BillCycle.class))
    Order order = mock(Order.class)
    DiscountProcessor subject = ...
    given(clientRepository).findBy(client.getId()).thenReturn(client);
    given(cycleRepository).findAll().thenReturn(cycles);
    given(orderRepository).findBy(order.getId()).thenReturn(order);
    given(client).....
    given(cycle1)....
    subject.processOrder(order.getId());
    verify(order).validated();
}
order.validate(new OrderValidationData(client.getDiscountInfos(), getListOfPeriods(cycles)))
Same as the above one: you still need to prepare data for both the OrderUnitTest and the DiscountProcessor test. But I think this one is better, as the order is not tightly coupled with Client and BillCycle.
order.validate()
Similar to my idea, if you keep the validation in the domain layer. Sometimes it is just not any entity's responsibility; consider a domain service or a specification object.
// file: OrderUnitTest
@Test public void should_change_to_valid_when_xxxx() {
    Client client = new ClientFixture()...build()
    BillCycle[] cycles = new BillCycleFixture()...build()
    Order order = new OrderFixture()...build();
    Specification<Order> spec = new ValidOrderSpec(clientRepository, cycleRepository);
    given(clientRepository).findBy(client.getId()).thenReturn(client);
    given(cycleRepository).findAll().thenReturn(cycles);
    subject.validate(spec);
    assertThat(order.getStatus(), is(VALID));
}
// file: DiscountProcessor
@Test public void should_change_to_valid_when_xxxx() {
    Order order = new OrderFixture()...build()
    Specification<Order> spec = mock(ValidOrderSpec.class);
    DiscountProcessor subject = ...
    given(orderingService).validSpec().thenReturn(spec);
    given(spec).isSatisfiedBy(order).thenReturn(true);
    given(orderRepository).findBy(order.getId()).thenReturn(order);
    subject.processOrder(order.getId());
    assertThat(order.getStatus(), is(VALID));
}
Do the 3 possible states reflect your domain, or is that just extrapolation? I'm asking because your sample code doesn't seem to change the Order state but rather throws an exception when it's invalid.
If it's acceptable for the order to stay DRAFT for a short period of time after being submitted, you could have DiscountOrder emit a DiscountOrderSubmitted domain event. A handler catches the event and (delegating to a domain service) examines whether the submit is legit or not. It would then issue a ChangeOrderState command to make the order either VALIDATED or INVALID.
You could even assume that the change is legit by default and have processOrder() take it directly to VALIDATED, until proven otherwise by a subsequent INVALID counter-order issued by the validation service.
This is not much different from your third solution or Hippoom's, though, except that every step of the process is made explicit with its own domain event. I guess that with your current aggregate design you're doomed to have a third-party orchestrator (as un-DDD and transaction-script-esque as it may sound) that controls the process, since the DiscountOrder aggregate doesn't have native access to all the information needed to tell whether a given transformation is valid or not.
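A rough sketch of that event-driven flow; all type names (event, command, bus, domain service) are illustrative assumptions rather than an established framework:

class DiscountOrderSubmittedHandler {

    private final OrderValidationService validationService; // domain service doing the cross-aggregate checks
    private final CommandBus commandBus;                     // hypothetical command dispatcher

    DiscountOrderSubmittedHandler(OrderValidationService validationService, CommandBus commandBus) {
        this.validationService = validationService;
        this.commandBus = commandBus;
    }

    // reacts to the DiscountOrderSubmitted domain event
    void on(DiscountOrderSubmitted event) {
        boolean legit = validationService.isLegit(event.getOrderId());
        OrderState target = legit ? OrderState.VALIDATED : OrderState.INVALID;
        commandBus.send(new ChangeOrderState(event.getOrderId(), target));
    }
}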
