Troubles registering custom metrics to Micrometer & Spring Boot 2

I'm having issues registering custom metrics with MeterRegistry.
It looks like Spring manages to register all of its own metrics (70-something), and then, when control comes back to me, I try to register my own too:
public MetricsAwareActiveTasksRepository(ActiveTasksRepository activeTasksRepository, MeterRegistry meterRegistry) {
this.activeTasksRepository = activeTasksRepository;
this.createdTaskIdsCounter = meterRegistry.counter(CustomBusinessMetrics.CREATED_TASK_IDS_COUNTER);
this.autoStoppedTasksCounter = meterRegistry.counter(CustomBusinessMetrics.AUTO_STOPPED_TASKS_COUNTER);
}
Unfortunately, at the point where my first metric is being registered, the process hangs here in Micrometer:
private Meter getOrCreateMeter(@Nullable DistributionStatisticConfig config,
BiFunction<Id, /*Nullable Generic*/ DistributionStatisticConfig, Meter> builder,
Id originalId, Id mappedId, Function<Meter.Id, ? extends Meter> noopBuilder) {
Meter m = meterMap.get(mappedId);
if (m == null) {
if (isClosed()) {
return noopBuilder.apply(mappedId);
}
synchronized (meterMapLock) {
m = meterMap.get(mappedId);
if (m == null) {
if (!accept(originalId)) {
//noinspection unchecked
return noopBuilder.apply(mappedId);
}
if (config != null) {
for (MeterFilter filter : filters) {
DistributionStatisticConfig filteredConfig = filter.configure(mappedId, config);
if (filteredConfig != null) {
config = filteredConfig;
}
}
}
m = builder.apply(mappedId, config);
meterMap = meterMap.plus(mappedId, m);
for (Consumer<Meter> onAdd : meterAddedListeners) {
onAdd.accept(m);
}
}
}
}
return m;
}
The last line I can reach in the debugger is where the synchronized block starts. In the debugger I see that the main thread becomes ZOMBIE, and nothing more happens. The lock is not being released.
Has anyone had this kind of problem? Am I doing something wrong here?
BTW Here's config also:
@Bean
CloudWatchMeterRegistry cloudWatchMeterRegistry(CloudWatchConfigProperties config, AmazonCloudWatchAsync amazonCloudWatch) {
CloudWatchMeterRegistry meterRegistry = new CloudWatchMeterRegistry(config, Clock.SYSTEM, amazonCloudWatch);
meterRegistry.config().commonTags(commonTags());
return meterRegistry;
}
As you can see, I also tried to register some "custom" metrics before Spring does, and that succeeds; however, it does not change the fact that registering later is not possible.
Also, after enabling the DEBUG logger, the Micrometer library says nothing; the last logs before it hangs forever come from Spring bean lifecycle methods, such as autowiring the particular constructor that registers the new metrics.
Versions are: spring-boot: 2.0.2, micrometer 1.0.4
Thanks in advance for any ideas.


Writing blocking operations in reactor tests with Spring and State Machine

I'm completely new to Reactor programming and I'm really struggling to migrate old integration tests since upgrading to the latest Spring Boot / State Machine.
Most integration tests have the same basic steps:
1. Call a method that returns a Mono, starts a state machine, and returns an object containing a generated unique id as well as some other information related to the initial request.
2. With the returned object, call a method that verifies whether a value has been updated in the database (using the information from the object retrieved in step 1).
3. Poll that database check at a fixed interval until either the value has changed or a predefined timeout occurs.
4. Check in another table in the database whether another object has been updated.
Below an example:
@Test
void testEndToEnd() {
var instance = ServiceInstance.buildDefault();
var updateRequest = UpdateRequest.build(instance);
// retrieve an update Response related to the request
// since a unique id is generated when triggering the update request
// before starting a stateMachine that goes through different steps
var updateResponse = service.updateInstance(updateRequest).block();
await().alias("Check if operation was successful")
.atMost(Duration.ofSeconds(120))
.pollInterval(Duration.ofSeconds(2))
.until(() -> expectOperationState(updateResponse, OperationState.SUCCESS));
// check if values are updated in secondary table
assertValuesInTransaction(updateResponse);
}
This was working fine before, but since the latest update it fails with the exception:
java.lang.IllegalStateException: block()/blockFirst()/blockLast() are blocking, which is not supported in thread parallel-6
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:83)
at reactor.core.publisher.Mono.block(Mono.java:1710)
I saw that it is good practice to test Reactor methods using StepVerifier, but I do not see how I can reproduce the part done with Awaitility, which polls to see whether the value has changed in the DB, since the method that checks the DB returns a Mono and not a Flux that keeps sending values.
Any idea how to accomplish this, or how to make the Spring stack accept blocking operations?
Thanks
My current stack :
Spring Boot 3.0.1
Spring State Machine 3.0.1
Spring 6
Junit 5.9.2
So, as discussed in the comments, here is an example with comments. I used flatMap to subscribe to what expectOperationState returns. Mono.fromCallable is also used: it checks the value from some method, and if it fails to emit anything within 3 seconds, a timeout exception is thrown. We could also try to get rid of the boolean value from expectOperationState and refactor the code to just return a Mono<Void> with a completion signal, but this basically shows how you can achieve what you want.
class TestStateMachine {
@Test
void testUntilSomeOperationCompletes() {
final Service service = new Service();
final UpdateRequest updateRequest = new UpdateRequest();
StepVerifier.create(service.updateInstance(updateRequest)
.flatMap(updateResponse -> expectOperationState(updateResponse, OperationState.SUCCESS))
)
.consumeNextWith(Assertions::assertTrue)
.verifyComplete();
}
private Mono<Boolean> expectOperationState(final UpdateResponse updateResponse, final OperationState success) {
return Mono.fromCallable(() -> {
while (true) {
boolean isInDb = checkValueFromDb(updateResponse);
if (isInDb) {
return true;
}
}
})
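// run the blocking poll loop on a scheduler meant for blocking work, not on the subscribing thread
.subscribeOn(Schedulers.boundedElastic())
// time out if the callable does not produce a value within 3 seconds, so that we do not check forever
.timeout(Duration.ofSeconds(3));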
}
private boolean checkValueFromDb(final UpdateResponse updateResponse) {
return true;
}
}
class Service {
Mono<UpdateResponse> updateInstance(final UpdateRequest updateRequest) {
return Mono.just(new UpdateResponse());
}
}
Here is an example without using Mono<Boolean> :
class TestStateMachine {
@Test
void test() {
final Service service = new Service();
final UpdateRequest updateRequest = new UpdateRequest();
StepVerifier.create(service.updateInstance(updateRequest)
.flatMap(updateResponse -> expectOperationState(updateResponse, OperationState.SUCCESS).timeout(Duration.ofSeconds(3)))
)
.verifyComplete();
}
private Mono<Void> expectOperationState(final UpdateResponse updateResponse, final OperationState success) {
return Mono.fromCallable(() -> {
while (true) {
boolean isInDb = checkValueFromDb(updateResponse);
if (isInDb) {
//return completed Mono
return Mono.<Void>empty();
}
}
})
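// run the blocking poll loop on a scheduler meant for blocking work, not on the subscribing thread
.subscribeOn(Schedulers.boundedElastic())
// time out if the callable does not produce a value within 3 seconds, so that we do not check forever
.timeout(Duration.ofSeconds(3))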
.flatMap(objectMono -> objectMono);
}
private boolean checkValueFromDb(final UpdateResponse updateResponse) {
return true;
}
}

@Version column is not working out of the box with Spring Data JDBC

I have my version column defined like this
@org.springframework.data.annotation.Version
protected long version;
With Spring Data JDBC it always tries to INSERT; updates are not happening. When I debug, I see that PersistentEntityIsNewStrategy is being used, which is the default strategy. It has an isNew() method to determine the state of the entity being persisted. I do see that version and id are used for this determination.
But my question is: who is responsible for incrementing the version column after every save, so that the second time .save() is called, isNew() can return false?
Should we fire a BeforeSaveEvent and handle incrementing the version column there? Would that be good enough to handle the optimistic lock?
Edit
I added an ApplicationListener to listen to BeforeSaveEvent like this.
public ApplicationListener<BeforeSaveEvent> incrementingVersion() {
return event -> {
Object entity = event.getEntity();
if (BaseDataModel.class.isAssignableFrom(entity.getClass())) {
BaseDataModel baseDataModel = (BaseDataModel) entity;
Long version = baseDataModel.getVersion();
if (version == null) {
baseDataModel.setVersion(0L);
} else {
baseDataModel.setVersion(version + 1L);
}
}
};
}
So now the version column works, but the rest of the auditable fields (@CreatedAt, @CreatedBy, @LastModifiedDate and @LastModifiedBy) are not set!
Edit2
Created a new ApplicationListener like the one below. In this case both my custom listener and Spring's RelationalAuditingListener get called, but it still doesn't solve the problem: because of the order of the listeners (the custom one followed by Spring's), markAudited invokes markUpdated instead of markCreated, since the version column has already been incremented. I tried giving my listener LOWEST_PRECEDENCE, still no luck.
My custom listener here
public class CustomRelationalAuditingEventListener
implements ApplicationListener<BeforeSaveEvent>, Ordered {
@Override
public void onApplicationEvent(BeforeSaveEvent event) {
Object entity = event.getEntity();
// handler.markAudited(entity);
if (BaseDataModel.class.isAssignableFrom(entity.getClass())) {
BaseDataModel baseDataModel = (BaseDataModel) entity;
if (baseDataModel.getVersion() == null) {
baseDataModel.setVersion(0L);
} else {
baseDataModel.setVersion(baseDataModel.getVersion() + 1L);
}
}
}
@Override
public int getOrder() {
return LOWEST_PRECEDENCE;
}
}
Currently, you have to increment the version manually and there is no optimistic locking, i.e. the version is only used for checking if an entity is new.
There is an open issue for support of optimistic locking and there is even a PR open for it.
Therefore it is likely that this feature will be available with an upcoming 1.1 milestone.

Spring Web-Flux: How to return a Flux to a web client on request?

We are working with Spring Boot 2.0.0.BUILD_SNAPSHOT and Spring WebFlux 5.0.0, and currently we can't transfer a Flux to a client on request.
Currently I am creating the flux from an iterator:
public Flux<ItemIgnite> getAllFlux() {
Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
return Flux.create(flux -> {
while(iterator.hasNext()) {
flux.next(iterator.next().getValue());
}
});
}
And on request I am simply doing:
@RequestMapping(value="/all", method=RequestMethod.GET, produces="application/json")
public Flux<ItemIgnite> getAllFlux() {
return this.provider.getAllFlux();
}
When I now call localhost:8080/all locally, I get a 503 status code after 10 seconds. Also, on the client side, when I request /all using the WebClient:
public Flux<ItemIgnite> getAllPoducts(){
WebClient webClient = WebClient.create("http://localhost:8080");
Flux<ItemIgnite> f = webClient.get().uri("/all").accept(MediaType.ALL).exchange().flatMapMany(cr -> cr.bodyToFlux(ItemIgnite.class));
f.subscribe(System.out::println);
return f;
}
Nothing happens. No data is transferred.
When I do the following instead:
public Flux<List<ItemIgnite>> getAllFluxMono() {
return Flux.just(this.getAllList());
}
and
@RequestMapping(value="/allMono", method=RequestMethod.GET, produces="application/json")
public Flux<List<ItemIgnite>> getAllFluxMono() {
return this.provider.getAllFluxMono();
}
It is working. I guess it's because all the data has already finished loading and is just transferred to the client as it usually would be without using a Flux.
What do I have to change to get the flux streaming the data to the web client which requests those data?
EDIT
I have data inside an ignite cache. So my getAllIterator is loading the data from the ignite cache:
public Iterator<Cache.Entry<String, ItemIgnite>> getAllIterator() {
return this.igniteCache.iterator();
}
EDIT
Adding flux.complete() like @Simon Baslé suggested:
public Flux<ItemIgnite> getAllFlux() {
Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
return Flux.create(flux -> {
while(iterator.hasNext()) {
flux.next(iterator.next().getValue());
}
flux.complete(); // see here
});
}
Solves the 503 problem in the browser. But it does not solve the problem with the WebClient. There is still no data transferred.
EDIT 3
using publishOn with Schedulers.parallel():
public Flux<ItemIgnite> getAllFlux() {
Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
return Flux.<ItemIgnite>create(flux -> {
while(iterator.hasNext()) {
flux.next(iterator.next().getValue());
}
flux.complete();
}).publishOn(Schedulers.parallel());
}
Does not change the result.
Here I post you what the WebClient receives:
value :[Item ID: null, Product Name: null, Product Group: null]
complete
So it seems the client receives one item (out of over 35,000) whose values are all null, and then the stream completes.
One thing that jumps out is that you never call flux.complete() in your create.
But there's actually a factory operator tailored to transforming an Iterable into a Flux, Flux.fromIterable, so you could just pass it the Iterable that your iterator comes from.
Edit: in case your Iterator is hiding complexity like a DB request (or any blocking I/O), be advised that this spells trouble: anything blocking in a reactive chain, if not isolated on a dedicated execution context using publishOn, has the potential to block not only the entire chain but other reactive processes as well (as threads can and will be used by multiple reactive processes).
Neither create nor fromIterable do anything in particular to protect from blocking sources. I think you are facing that kind of issue, judging from the hang you get with the WebClient.
The problem was the ItemIgnite object that I transfer. Flux does not seem to be able to handle it, because if I change my original code to the following:
public Flux<String> getAllFlux() {
Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
return Flux.create(flux -> {
while(iterator.hasNext()) {
flux.next(iterator.next().getValue().toString());
}
});
}
Everything is working fine. Without publishOn and without flux.complete(). Maybe someone has an idea why this is working.

How to perform new operation on @RetryOnFailure by jcabi

I am using jcabi-aspects to retry the connection to my URL http://xxxxxx:8080/hello until the connection comes back. As you know, @RetryOnFailure by jcabi has two fields, attempts and delay.
I want to derive the number of attempts on jcabi's @RetryOnFailure from an expiry time, i.e. attempts (12) = expiryTime (1 min = 60000 ms) / delay (5 s = 5000 ms). How do I do this? The code snippet is below.
@RetryOnFailure(attempts = 12, delay = 5)
public String load(URL url) {
return url.openConnection().getContent();
}
You can combine two annotations:
@Timeable(limit = 1, unit = TimeUnit.MINUTES)
@RetryOnFailure(attempts = Integer.MAX_VALUE, delay = 5, unit = TimeUnit.SECONDS)
public String load(URL url) {
return url.openConnection().getContent();
}
@RetryOnFailure will retry forever, but @Timeable will stop it in a minute.
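The arithmetic from the question (attempts = expiryTime / delay, i.e. 60000 / 5000 = 12) can also be written out without annotations. The following is only a minimal plain-Java sketch of that idea, not how jcabi implements it; the class name RetryUntilDeadline and the methods attemptsFor and retry are hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

public class RetryUntilDeadline {

    /** Number of attempts that fit into the expiry window, e.g. 60000 / 5000 = 12. */
    public static long attemptsFor(Duration expiry, Duration delay) {
        return expiry.toMillis() / delay.toMillis();
    }

    /**
     * Calls the task until it succeeds or the attempts are used up,
     * sleeping for the given delay between failed attempts.
     */
    public static <T> T retry(Callable<T> task, Duration expiry, Duration delay) throws Exception {
        long attempts = attemptsFor(expiry, delay);
        Exception last = new IllegalStateException("no attempts were made");
        for (long i = 0; i < attempts; i++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
                if (i < attempts - 1) {
                    Thread.sleep(delay.toMillis());
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(attemptsFor(Duration.ofMinutes(1), Duration.ofSeconds(5)));

        // Simulated connection that only succeeds on the third call.
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("not reachable yet");
            }
            return "connected";
        }, Duration.ofMillis(100), Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Deriving the attempt count from the two durations keeps the expiry time as the single value to tune; the time spent inside each failed call is not counted here, so the real elapsed time can exceed the expiry window.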
The library you picked (jcabi) does not have this feature. But luckily the very handy RetryPolicies from Spring-Batch have been extracted (so you can use them alone, without the batching):
Spring-Retry
One of the many classes you could use from there is TimeoutRetryPolicy:
RetryTemplate template = new RetryTemplate();
TimeoutRetryPolicy policy = new TimeoutRetryPolicy();
policy.setTimeout(30000L);
template.setRetryPolicy(policy);
Foo result = template.execute(new RetryCallback<Foo>() {
public Foo doWithRetry(RetryContext context) {
// Do stuff that might fail, e.g. webservice operation
return result;
}
});
The whole spring-retry project is very easy to use and full of features, like backOffPolicies, listeners, etc.

Non-Blocking Endpoint: Returning an operation ID to the caller - Would like to get your opinion on my implementation?

Boot Pros,
I recently started to program in spring-boot and I stumbled upon a question where I would like to get your opinion on.
What I try to achieve:
I created a Controller that exposes a GET endpoint named nonBlockingEndpoint. This nonBlockingEndpoint executes a pretty long operation that is resource-heavy and can run between 20 and 40 seconds (in the attached code, it is mocked by a Thread.sleep()).
Whenever the nonBlockingEndpoint is called, the Spring application should register that call and immediately return an operation ID to the caller.
The caller can then use this ID to query the status of this operation on another endpoint, queryOpStatus. At the beginning it will be "started", and once the controller is done serving the request it will change to a code such as SERVICE_OK. The caller then knows that the request was successfully completed on the server.
The solution that I found:
I have the following controller (note that it is explicitly not tagged with @Async).
It uses an APIOperationsManager to register that a new operation was started.
I use Java's CompletableFuture to run the long-running code as a new asynchronous task, via CompletableFuture.supplyAsync(() -> {...}).
I immediately return a response to the caller, saying that the operation is in progress.
Once the async task has finished, I use cf.thenRun() to update the operation status via the API Operations Manager.
Here is the code:
@GetMapping(path="/nonBlockingEndpoint")
public @ResponseBody ResponseOperation nonBlocking() {
// Register a new operation
APIOperationsManager apiOpsManager = APIOperationsManager.getInstance();
final int operationID = apiOpsManager.registerNewOperation(Constants.OpStatus.PROCESSING);
ResponseOperation response = new ResponseOperation();
response.setMessage("Triggered non-blocking call, use the operation id to check status");
response.setOperationID(operationID);
response.setOpRes(Constants.OpStatus.PROCESSING);
CompletableFuture<Boolean> cf = CompletableFuture.supplyAsync(() -> {
try {
// Here we will
Thread.sleep(10000L);
} catch (InterruptedException e) {}
// whatever the return value was
return true;
});
cf.thenRun(() ->{
// We are done with the super long process, so update our Operations Manager
APIOperationsManager a = APIOperationsManager.getInstance();
boolean asyncSuccess = false;
try {asyncSuccess = cf.get();}
catch (Exception e) {}
if(true == asyncSuccess) {
a.updateOperationStatus(operationID, Constants.OpStatus.OK);
a.updateOperationMessage(operationID, "success: The long running process has finished and this is your result: SOME RESULT" );
}
else {
a.updateOperationStatus(operationID, Constants.OpStatus.INTERNAL_ERROR);
a.updateOperationMessage(operationID, "error: The long running process has failed.");
}
});
return response;
}
Here is also the APIOperationsManager.java for completeness:
public class APIOperationsManager {
private static APIOperationsManager instance = null;
private Vector<Operation> operations;
private int currentOperationId;
private static final Logger log = LoggerFactory.getLogger(Application.class);
protected APIOperationsManager() {}
public static APIOperationsManager getInstance() {
if(instance == null) {
synchronized(APIOperationsManager.class) {
if(instance == null) {
instance = new APIOperationsManager();
instance.operations = new Vector<Operation>();
instance.currentOperationId = 1;
}
}
}
return instance;
}
public synchronized int registerNewOperation(OpStatus status) {
cleanOperationsList();
currentOperationId = currentOperationId + 1;
Operation newOperation = new Operation(currentOperationId, status);
operations.add(newOperation);
log.info("Registered new Operation to watch: " + newOperation.toString());
return newOperation.getId();
}
public synchronized Operation getOperation(int id) {
for(Iterator<Operation> iterator = operations.iterator(); iterator.hasNext();) {
Operation op = iterator.next();
if(op.getId() == id) {
return op;
}
}
Operation notFound = new Operation(-1, OpStatus.INTERNAL_ERROR);
notFound.setCrated(null);
return notFound;
}
public synchronized void updateOperationStatus (int id, OpStatus newStatus) {
iteration : for(Iterator<Operation> iterator = operations.iterator(); iterator.hasNext();) {
Operation op = iterator.next();
if(op.getId() == id) {
op.setStatus(newStatus);
log.info("Updated Operation status: " + op.toString());
break iteration;
}
}
}
public synchronized void updateOperationMessage (int id, String message) {
iteration : for(Iterator<Operation> iterator = operations.iterator(); iterator.hasNext();) {
Operation op = iterator.next();
if(op.getId() == id) {
op.setMessage(message);
log.info("Updated Operation status: " + op.toString());
break iteration;
}
}
}
private synchronized void cleanOperationsList() {
Date now = new Date();
for(Iterator<Operation> iterator = operations.iterator(); iterator.hasNext();) {
Operation op = iterator.next();
if((now.getTime() - op.getCrated().getTime()) >= Constants.MIN_HOLD_DURATION_OPERATIONS ) {
log.info("Removed operation from watchlist: " + op.toString());
iterator.remove();
}
}
}
}
The questions that I have
Is that concept a valid one that also scales? What could be improved?
Will I run into concurrency issues / race conditions?
Is there a better way to achieve the same in Spring Boot that I just haven't found yet? (Maybe with the @Async annotation?)
I would be very happy to get your feedback.
Thank you so much,
Peter P
It is a valid pattern to submit a long running task with one request, returning an id that allows the client to ask for the result later.
But there are some things I would suggest reconsidering:
Do not use an Integer as the id, as it allows an attacker to guess ids and get the results for those ids. Use a random UUID instead.
If you need to restart your application, all ids and their results will be lost. You should persist them to a database.
Your solution will not work in a cluster with many instances of your application, as each instance would only know its 'own' ids and results. This could also be solved by persisting them to a database or a Redis store.
The way you are using CompletableFuture gives you no control over the number of threads used for the asynchronous operation. It is possible to control this with standard Java, but I would suggest using Spring to configure the thread pool.
Annotating the controller method with @Async is not an option; that does not work. Instead, put all asynchronous operations into a simple service and annotate that with @Async. This has some advantages:
You can also use this service synchronously, which makes testing a lot easier.
You can configure the thread pool with Spring.
The /nonBlockingEndpoint should not return just the id, but a complete link to queryOpStatus, including the id. The client can then use this link directly without any additional information.
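To illustrate the thread-pool point from the list above with standard Java only: CompletableFuture.supplyAsync accepts an Executor as a second argument, so the long-running work runs on a pool whose size you choose instead of the common ForkJoinPool. This is a minimal sketch under assumptions; the class name BoundedAsyncDemo, the method startLongTask and the pool size of 4 are made up for illustration, and a Spring-configured executor would be the alternative this answer recommends.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedAsyncDemo {

    /** Runs the (simulated) long task on the given pool instead of the common ForkJoinPool. */
    public static CompletableFuture<Boolean> startLongTask(ExecutorService pool, long millis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(millis); // stands in for the real long-running work
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }, pool);
    }

    public static void main(String[] args) throws Exception {
        // At most 4 long-running operations execute at once; further ones wait in the queue.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<String> status = startLongTask(pool, 100)
                    .thenApply(ok -> ok ? "OK" : "INTERNAL_ERROR");
            System.out.println(status.get(5, TimeUnit.SECONDS));
        } finally {
            pool.shutdown();
        }
    }
}
```

Using thenApply on the returned future gives the follow-up step the task's result directly, so there is no need to call get() inside a thenRun callback.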
Additionally there are some low level implementation issues which you may also want to change :
Do not use Vector, it synchronizes on every operation. Use a List instead. Iterating over a List is also much easier; you can use for-loops or streams.
If you need to look up a value, do not iterate over a Vector or List, use a Map instead.
APIOperationsManager is a singleton. That makes no sense in a Spring application. Make it a normal POJO, create a bean of it, and get it autowired into the controller. Spring beans are singletons by default.
You should avoid doing complicated operations in a controller method. Instead, move everything into a service (which may be annotated with @Async). This makes testing easier, as you can test this service without a web context.
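The Map and UUID suggestions above could look roughly like the following. It is a minimal in-memory sketch, not a drop-in replacement for APIOperationsManager: the class name OperationRegistry, its Status enum and its method names are hypothetical, the expiry cleanup from the question is left out, and the points about persistence and clustering still apply.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class OperationRegistry {

    public enum Status { PROCESSING, OK, INTERNAL_ERROR }

    // Keyed by a random UUID, so ids cannot be guessed by counting upwards.
    private final Map<UUID, Status> operations = new ConcurrentHashMap<>();

    /** Registers a new operation and returns its id. */
    public UUID register() {
        UUID id = UUID.randomUUID();
        operations.put(id, Status.PROCESSING);
        return id;
    }

    /** Direct lookup by key; empty if the id is unknown. */
    public Optional<Status> find(UUID id) {
        return Optional.ofNullable(operations.get(id));
    }

    /** Updates a known operation; returns false if the id is unknown. */
    public boolean update(UUID id, Status status) {
        return operations.replace(id, status) != null;
    }

    public static void main(String[] args) {
        OperationRegistry registry = new OperationRegistry();
        UUID id = registry.register();
        registry.update(id, Status.OK);
        System.out.println(id + " -> " + registry.find(id).orElse(Status.INTERNAL_ERROR));
    }
}
```

Because the class has no static state, it can be registered as an ordinary Spring bean and injected into the controller, and ConcurrentHashMap takes care of concurrent access without synchronized methods.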
Hope this helps.
Do I need to make database access transactional?
As long as you write/update only one row, there is no need to make this transactional as this is indeed 'atomic'.
If you write/update many rows at once, you should make it transactional to guarantee that either all rows are updated or none.
However, if two operations (maybe from two clients) update the same row, the last one will always win.
