Spring WebFlux and reading from database

Spring 5 introduces the reactive programming style for REST APIs with WebFlux. I'm fairly new to it myself and was wondering whether wrapping synchronous calls to a database in Flux or Mono makes sense performance-wise. If yes, is this the way to do it:
@RestController
public class HomeController {
    private MeasurementRepository repository;
    public HomeController(MeasurementRepository repository) {
        this.repository = repository;
    }
    @GetMapping(value = "/v1/measurements")
    public Flux<Measurement> getMeasurements() {
        return Flux.fromIterable(repository.findByFromDateGreaterThanEqual(new Date(1486980000L)));
    }
}
Is there something like an asynchronous CrudRepository? I couldn't find it.

One option would be to use alternative SQL clients that are fully non-blocking. Some examples include:
https://github.com/mauricio/postgresql-async or https://github.com/finagle/roc. Of course, none of these drivers is officially supported by database vendors yet. Also, their functionality is far less attractive compared to mature JDBC-based abstractions such as Hibernate or jOOQ.
The alternative idea came to me from the Scala world. The idea is to dispatch blocking calls to an isolated thread pool so that blocking and non-blocking calls are not mixed together. This lets us control the overall number of threads and lets the CPU serve non-blocking tasks in the main execution context, with some potential optimizations.
Assuming that we have a JDBC-based implementation such as Spring Data JPA, which is indeed blocking, we can make its execution asynchronous and dispatch it on a dedicated thread pool.
@RestController
public class HomeController {
    private final MeasurementRepository repository;
    private final Scheduler scheduler;
    public HomeController(MeasurementRepository repository, @Qualifier("jdbcScheduler") Scheduler scheduler) {
        this.repository = repository;
        this.scheduler = scheduler;
    }
    @GetMapping(value = "/v1/measurements")
    public Flux<Measurement> getMeasurements() {
        // subscribeOn (not publishOn) runs the blocking call itself on the JDBC scheduler;
        // flatMapMany turns the returned collection into the declared Flux<Measurement>
        return Mono.fromCallable(() -> repository.findByFromDateGreaterThanEqual(new Date(1486980000L)))
                .subscribeOn(scheduler)
                .flatMapMany(Flux::fromIterable);
    }
}
Our Scheduler for JDBC should be backed by a dedicated thread pool whose size equals the number of connections in the connection pool.
@Configuration
public class SchedulerConfiguration {
    private final Integer connectionPoolSize;
    public SchedulerConfiguration(@Value("${spring.datasource.maximum-pool-size}") Integer connectionPoolSize) {
        this.connectionPoolSize = connectionPoolSize;
    }
    @Bean
    public Scheduler jdbcScheduler() {
        return Schedulers.fromExecutor(Executors.newFixedThreadPool(connectionPoolSize));
    }
}
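The isolation idea can also be sketched with the JDK alone, without Reactor or Spring. This is only an illustration under stated assumptions: `BlockingMeasurementStore`, the `jdbc-worker-` thread names and the pool size of 4 are hypothetical stand-ins, and `CompletableFuture` is swapped in for `Mono`/`Flux`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class JdbcPoolSketch {
    // Fixed pool with recognizable thread names; the size is meant to match the connection pool.
    static ExecutorService newJdbcPool(int connectionPoolSize) {
        AtomicInteger counter = new AtomicInteger();
        return Executors.newFixedThreadPool(connectionPoolSize, runnable -> {
            Thread thread = new Thread(runnable, "jdbc-worker-" + counter.incrementAndGet());
            thread.setDaemon(true); // do not keep the JVM alive for idle workers
            return thread;
        });
    }

    // Dispatches the blocking call to the dedicated pool; the caller gets a future immediately.
    static CompletableFuture<List<String>> loadOnPool(ExecutorService jdbcPool, BlockingMeasurementStore store) {
        return CompletableFuture.supplyAsync(store::findAll, jdbcPool);
    }

    public static void main(String[] args) {
        ExecutorService jdbcPool = newJdbcPool(4); // 4 is an assumed connection pool size
        BlockingMeasurementStore store = new BlockingMeasurementStore();
        List<String> rows = loadOnPool(jdbcPool, store).join();
        System.out.println(rows + " loaded on " + store.lastThread);
        jdbcPool.shutdown();
    }
}

// Hypothetical stand-in for a blocking, JDBC-backed repository.
class BlockingMeasurementStore {
    volatile String lastThread; // records which thread ran the blocking call

    List<String> findAll() {
        lastThread = Thread.currentThread().getName();
        try {
            Thread.sleep(50); // simulates a blocking database round trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.of("m1", "m2");
    }
}
```

Because the pool is fixed, at most as many blocking calls run at once as there are connections, and the calling thread is never the one that waits on the database.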
However, there are difficulties with this approach. The main one is transaction management. In JDBC, transactions are possible only within a single java.sql.Connection. To make several operations in one transaction, they have to share a connection. If we want to make some calculations in between them, we have to keep the connection. This is not very efficient, as we keep a limited number of connections idle while doing calculations in between.
This idea of an asynchronous JDBC wrapper is not new and is already implemented in the Scala library Slick 3. Finally, non-blocking JDBC may come along on the Java roadmap: it was announced at JavaOne in September 2016, and it is possible that we will see it in Java 10.

Based on this blog, you should rewrite your snippet in the following way:
@GetMapping(value = "/v1/measurements")
public Flux<Measurement> getMeasurements() {
    return Flux.defer(() -> Flux.fromIterable(repository.findByFromDateGreaterThanEqual(new Date(1486980000L))))
            .subscribeOn(Schedulers.elastic());
}

Obtaining a Flux or a Mono doesn’t necessarily mean it will run in a dedicated Thread. Instead, most operators continue working in the Thread on which the previous operator executed. Unless specified, the topmost operator (the source) itself runs on the Thread in which the subscribe() call was made.
If you have blocking persistence APIs (JPA, JDBC) or networking APIs to use, Spring MVC is the best choice for common architectures at least. It is technically feasible with both Reactor and RxJava to perform blocking calls on a separate thread but you would not be making the most of a non-blocking web stack.
So... How do I wrap a synchronous, blocking call?
Use Callable to defer execution. And you should use Schedulers.elastic because it creates a dedicated thread to wait for the blocking resource without tying up some other resource.
Schedulers.immediate() : Current thread.
Schedulers.single() : A single, reusable thread.
Schedulers.newSingle() : A per-call dedicated thread.
Schedulers.elastic() : An elastic thread pool. It creates new worker pools as needed, and reuses idle ones. This is a good choice for I/O blocking work for instance.
Schedulers.parallel() : A fixed pool of workers that is tuned for parallel work.
example:
Mono.fromCallable(() -> blockingRepository.save())
.subscribeOn(Schedulers.elastic());
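To make the "defer with a Callable" point concrete, here is a minimal JDK-only sketch. It is not Reactor code: `CompletableFuture` and a cached thread pool (which, like the elastic scheduler, creates threads as needed and reuses idle ones) are stand-ins, and `runDeferred` is a hypothetical helper name. As a side note, newer Reactor versions deprecate `Schedulers.elastic()` in favor of `Schedulers.boundedElastic()`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class DeferredCallSketch {
    // Runs the blocking call on the I/O pool; nothing executes until this method is invoked.
    static <T> CompletableFuture<T> runDeferred(Callable<T> blockingCall, ExecutorService ioPool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return blockingCall.call();
            } catch (Exception e) {
                throw new CompletionException(e); // surface checked exceptions through the future
            }
        }, ioPool);
    }

    public static void main(String[] args) {
        AtomicInteger saves = new AtomicInteger();
        Callable<Integer> save = saves::incrementAndGet; // only describes the call, does not run it
        System.out.println("before: " + saves.get()); // prints "before: 0"

        ExecutorService ioPool = Executors.newCachedThreadPool();
        try {
            Integer result = runDeferred(save, ioPool).join();
            System.out.println("after: " + saves.get() + ", result: " + result); // prints "after: 1, result: 1"
        } finally {
            ioPool.shutdown();
        }
    }
}
```

Creating the `Callable` has no side effect; the blocking work starts only when it is handed to the pool, which is the same reason `Mono.fromCallable` does nothing until it is subscribed.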

Spring Data supports reactive repository interfaces for MongoDB and Cassandra.
Spring Data MongoDB reactive interface
Spring Data MongoDB provides reactive repository support with Project Reactor and RxJava 1 reactive types. The reactive API supports reactive type conversion between reactive types.
public interface ReactivePersonRepository extends ReactiveCrudRepository<Person, String> {
    Flux<Person> findByLastname(String lastname);
    @Query("{ 'firstname': ?0, 'lastname': ?1}")
    Mono<Person> findByFirstnameAndLastname(String firstname, String lastname);
    // Accept parameter inside a reactive type for deferred execution
    Flux<Person> findByLastname(Mono<String> lastname);
    Mono<Person> findByFirstnameAndLastname(Mono<String> firstname, String lastname);
    @InfiniteStream // Use a tailable cursor
    Flux<Person> findWithTailableCursorBy();
}
public interface RxJava1PersonRepository extends RxJava1CrudRepository<Person, String> {
    Observable<Person> findByLastname(String lastname);
    @Query("{ 'firstname': ?0, 'lastname': ?1}")
    Single<Person> findByFirstnameAndLastname(String firstname, String lastname);
    // Accept parameter inside a reactive type for deferred execution
    Observable<Person> findByLastname(Single<String> lastname);
    Single<Person> findByFirstnameAndLastname(Single<String> firstname, String lastname);
    @InfiniteStream // Use a tailable cursor
    Observable<Person> findWithTailableCursorBy();
}

Related

Need to understand asynchronous usage of Spring WebClient

I have a doubt regarding the usage of WebClient in circumstances where you need to invoke another service that is slow to respond, then use its data to process something and return the result from your own API.
For example, my doSomething method is called by a service to retrieve the top order from a list. My service fetches the list from another service, "order-service".
@RestController
public class TestController {
    @Autowired
    private TestService testService;
    @GetMapping("/test")
    public String doSomething() {
        return WebClient.create()
                .get()
                .uri("http://localhost:9090/order-service")
                .retrieve()
                .bodyToMono(List.class)
                .block().get(0).toString();
        //may be do more processing on the list later.
    }
}
As you can see, it currently blocks the calling thread, which defeats the purpose of async. Also, in the future I might need to do more processing on this List before returning.
Am I using WebClient correctly (is it serving any purpose here)?

Primary/secondary datasource failover in Spring MVC

I have a Java web application built on the Spring framework which uses MyBatis. I see that the datasource is defined in beans.xml. Now I want to add a secondary data source as a backup. For example, if the application is not able to connect to the DB and gets an error, or if the server is down, it should be able to connect to a different datasource. Is there a configuration in Spring to do this, or will we have to code this manually in the application?
I have seen primary and secondary notations in Spring boot but nothing in Spring. I could achieve these in my code where the connection is created/retrieved, by connecting to the secondary datasource if the connection to the primary datasource fails/timed out. But wanted to know if this can be achieved by making changes just in Spring configuration.
Let me clarify things one by one.
Spring Boot has a @Primary annotation but there is no @Secondary annotation.
The purpose of the @Primary annotation is not what you have described. Spring does not automatically switch data sources in any way. @Primary merely tells Spring which data source to use when we don't specify one in a transaction. For more detail on this: https://www.baeldung.com/spring-data-jpa-multiple-databases
Now, how do we actually switch data sources when one goes down?
Most people don't manage this kind of high availability in code. People usually prefer to run two master database instances in active-passive mode which are kept in sync. For automatic failover, something like keepalived can be used. This is also a highly subjective and contentious topic, and there are a lot of things to consider here, such as whether we can afford replication lag and whether there are slaves running for each master (because then we have to switch slaves too, as the old master's slaves would now be out of sync). If you have databases spread across regions, this becomes even more difficult (read: awesome) and requires yet more engineering, planning, and design.
Now, since the question specifically mentions using application code for this, there is one thing you can do. I don't advise using it in production though. EVER. You can create an AspectJ advice around all your primary transactional methods using your own custom annotation. Let's call this annotation @SmartTransactional for our demo.
Sample code. I did not test it though:
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface SmartTransactional {}
public class SomeServiceImpl implements SomeService {
    @SmartTransactional
    @Transactional("primaryTransactionManager")
    public boolean someMethod() {
        //call a common method here for code reusability or create an abstract class
    }
}
public class SomeServiceSecondaryTransactionImpl implements SomeService {
    // same method name as the primary implementation, so the aspect can look it up by name
    @Transactional("secondaryTransactionManager")
    public boolean someMethod() {
        //call a common method here for code reusability or create an abstract class
    }
}
@Component
@Aspect
public class SmartTransactionalAspect {
    @Autowired
    private ApplicationContext context;
    @Pointcut("@annotation(...SmartTransactional)")
    public void smartTransactionalAnnotationPointcut() {
    }
    @Around("smartTransactionalAnnotationPointcut()")
    public Object methodsAnnotatedWithSmartTransactional(final ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = getMethodFromTarget(joinPoint);
        Object result = joinPoint.proceed();
        boolean failure = Boolean.TRUE; // check if result is failure
        if (failure) {
            String secondaryTransactionManagebeanName = ""; // get class name from joinPoint and append 'SecondaryTransactionImpl' instead of 'Impl' in the class name
            Object bean = context.getBean(secondaryTransactionManagebeanName);
            result = bean.getClass().getMethod(method.getName()).invoke(bean);
        }
        return result;
    }
}
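Stripped of Spring and AspectJ, the fallback logic the aspect is meant to implement can be sketched in plain Java. This is only an illustration: `withFailover` is a hypothetical helper, and it assumes a failure shows up as a runtime exception, whereas the aspect above inspects the returned result.

```java
import java.util.function.Supplier;

class FailoverSketch {
    // Tries the primary operation; if it throws, runs the secondary one instead.
    static <T> T withFailover(Supplier<T> primary, Supplier<T> secondary) {
        try {
            return primary.get();
        } catch (RuntimeException primaryFailure) {
            try {
                return secondary.get();
            } catch (RuntimeException secondaryFailure) {
                // keep the original cause visible when both data sources fail
                secondaryFailure.addSuppressed(primaryFailure);
                throw secondaryFailure;
            }
        }
    }

    public static void main(String[] args) {
        String value = FailoverSketch.<String>withFailover(
                () -> { throw new IllegalStateException("primary datasource down"); },
                () -> "row from secondary");
        System.out.println(value); // prints "row from secondary"
    }
}
```

Even in this reduced form, the caveat above still applies: retrying on a second database says nothing about whether the two databases hold the same data.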

How to propagate JTA state when using reactive-messaging?

I would like to propagate JTA state (= the transaction) between a transactional REST endpoint that emits a message to a reactive-messaging connector.
@Inject
@Channel("test")
Emitter<String> emitter;
@POST
@Transactional
public Response test() {
    emitter.send("test");
}
and
@ApplicationScoped
@Connector("test")
public class TestConnector implements OutgoingConnectorFactory {
    @Inject
    TransactionManager tm;
    @Override
    public SubscriberBuilder<? extends Message<?>, Void> getSubscriberBuilder(Config config) {
        return ReactiveStreams.<Message<?>>builder()
                .flatMapCompletionStage(message -> {
                    tm.getTransaction(); // = null
                    return message.ack();
                })
                .ignore();
    }
}
As I understand it, context-propagation is responsible for making the transaction available (see io.smallrye.context.jta.context.propagation.JtaContextProvider#currentContext). The problem seems to be that currentContext gets created on subscription, which happens when the injection point (Emitter<String> emitter) gets its instance, and that is too early to properly capture the transaction.
What am I missing?
By the way, I am having the same problem when using @Incoming / @Outgoing instead of the emitter. I have decided to give you this example because it is easy to understand and reproduce.
At the moment, you need to pass the current Transaction in the message metadata. That way, it will be propagated to your downstream components (as well as the connector).
Note that the Transaction tends to be attached to the request scope, which means that in your connector it may already be too late to use it. So, make sure your endpoint is asynchronous and only returns when the emitted message is acknowledged.
Context Propagation is not going to help in this case, as the underlying streams are built at startup time (at build time in Quarkus), so there are no captured contexts.

Spring-data-JPA - Executing complex multi join queries

I have a requirement for which I need to execute a bunch of random complex queries with multiple joins for reporting purposes. So I am planning to use entitymanager native query feature directly. I just tried and it seems to work.
@Service
public class SampleService {
    @Autowired
    private EntityManager entityManager;
    public List<Object[]> execute(String sql) {
        Query query = entityManager.createNativeQuery(sql);
        return query.getResultList();
    }
}
This code is invoked once every 30 seconds by a single-threaded scheduled process.
Question:
Should I be using entity manager or entity manager factory?
Should I close the connection here? or is it managed automatically?
How to reduce the DB connection pool - as it is not multi threaded app or Should I not be worried about that?
Any other suggestions!?
Should I be using entity manager or entity manager factory?
Injecting EntityManager Vs. EntityManagerFactory
EntityManager looks fine in this instance.
Should I close the connection here? or is it managed automatically?
No, I don't think you need to, as the manager handles this.
How to reduce the DB connection pool - as it is not multi threaded app or Should I not be worried about that?
I doubt you need to concern yourself with the connection pool unless you are expecting large volumes and your application is running slowly under load. Try doing some benchmarking; you may have much more capacity than you need and be prematurely optimising your app.
It is more likely that you would increase the number of connections rather than decrease it. To change the number of connections, you do that in application.properties (or application.yml).
Any other suggestions!?
Rather than a generic method I would consider having a separate repository class outside of the service and have that repository method do something specific. Make a method return a specific result or thing rather than pass in any sql.
As a rough outline of two separate classes (files), something like this:
@Service
public class SampleService {
    @Autowired
    private MyAuthorNativeRepository myAuthorNativeRepository;
    public List<Author> getAuthors() {
        return myAuthorNativeRepository.getAuthors();
    }
}
@Service
public class MyAuthorNativeRepository {
    @Autowired
    private EntityManager entityManager;
    public List<Author> getAuthors() {
        Query q = entityManager.createNativeQuery("SELECT blah blah FROM Author");
        List<Author> authors = new ArrayList<>();
        // getResultList() is untyped; each row of a multi-column native query is an Object[]
        for (Object rawRow : q.getResultList()) {
            Object[] row = (Object[]) rawRow;
            Author author = new Author();
            author.setName((String) row[0]); // assumes the first column is the name
            authors.add(author);
        }
        return authors;
    }
}

Spring Boot test: Wait for microservice's scheduler task

I'm trying to test a service which communicates with another one.
One of them generates audits, which are stored in memory until a scheduled task flushes them to a Redis node:
@Component
public class AuditFlushTask {
    private AuditService auditService;
    private AuditFlushTask(AuditService auditService) {
        this.auditService = auditService;
    }
    @Scheduled(fixedDelayString = "${fo.audit-flush-interval}")
    public void flushAudits() {
        this.auditService.flush();
    }
}
On the other hand, this service provides an endpoint that returns those flushed audits:
public Collection<String> listAudits(
) {
return this.boService.listRawAudits(deadlineTimestamp);
}
The problem is that I'm building an integration test to check whether this process works correctly, that is, whether the audits are provided properly.
So, I don't know how to "wait until audits have been flushed on the microservice".
Any ideas?
Don't test the framework: Spring almost certainly has tests which test fixed delays.
Instead, keep all logic within the service itself, and integration test that in isolation from the Spring @Scheduled function.
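A rough sketch of what that separation could look like, using only the JDK. Everything here is hypothetical: `AuditBufferSketch` stands in for the question's AuditService, and the `Consumer` sink stands in for the Redis write. The point is that `flush()` can be called directly from a test, so nothing has to wait for the scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class AuditBufferSketch {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> sink; // stand-in for the Redis store

    AuditBufferSketch(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    // Buffers an audit in memory until the next flush.
    synchronized void record(String audit) {
        pending.add(audit);
    }

    // The method a scheduled task would delegate to; returns how many audits were flushed.
    synchronized int flush() {
        if (pending.isEmpty()) {
            return 0;
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        sink.accept(batch);
        return batch.size();
    }

    public static void main(String[] args) {
        List<String> stored = new ArrayList<>();
        AuditBufferSketch audits = new AuditBufferSketch(stored::addAll);
        audits.record("login");
        audits.record("logout");
        int flushed = audits.flush(); // called directly, no scheduler involved
        System.out.println(flushed + " flushed: " + stored); // prints "2 flushed: [login, logout]"
    }
}
```

With the logic in one place, the scheduled method stays a one-line delegation, and a test only has to trigger `flush()` and then query the endpoint or the store.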
