Spring Reactor and consuming websocket messages

I'm creating a Spring Reactor application that consumes messages from a WebSocket server, transforms them, and then saves them to Redis and a SQL database. Both writes are reactive as well. Before being written, the messages will be windowed (with different timespans) and aggregated.
I'm not sure whether my approach is properly reactive, that is, whether I'm losing any of the reactive benefits (performance).
First, let me show you what I've got:
@Service
class WebSocketsConsumer {
public ConnectableFlux<String> webSocketFlux() {
return Flux.<String>create(emitter -> {
createWebSocketClient()
.execute(URI.create("wss://some-url-goes-here.com"), session -> {
WebSocketMessage initialMessage = session.textMessage("SOME_MSG_HERE");
Flux<String> flux = session.send(Mono.just(initialMessage))
.thenMany(session.receive())
.map(WebSocketMessage::getPayloadAsText)
.doOnNext(emitter::next);
Flux<String> sessionStatus = session.closeStatus()
.switchIfEmpty(Mono.just(CloseStatus.GOING_AWAY))
.map(CloseStatus::toString)
.doOnNext(emitter::next)
.flatMapMany(Flux::just);
return flux
.mergeWith(sessionStatus)
.then();
})
.subscribe(); // 1: highlighted by IntelliJ IDEA: `Calling 'subscribe' in non-blocking context`
})
.publish();
}
private ReactorNettyWebSocketClient createWebSocketClient() {
return new ReactorNettyWebSocketClient(
HttpClient.create(),
() -> WebsocketClientSpec.builder().maxFramePayloadLength(131072 * 100)
);
}
}
And
@Service
class WebSocketMessageDispatcher {
private final WebSocketsConsumer webSocketsConsumer;
private final Consumer<String> reactiveRedisConsumer;
private final Consumer<String> reactiveJdbcConsumer;
private Disposable webSocketsDisposable;
WebSocketMessageDispatcher(WebSocketsConsumer webSocketsConsumer, Consumer<String> redisConsumer, Consumer<String> dbConsumer) {
this.webSocketsConsumer = webSocketsConsumer;
this.reactiveRedisConsumer = redisConsumer;
this.reactiveJdbcConsumer = dbConsumer;
}
@EventListener(ApplicationReadyEvent.class)
public void onReady() {
ConnectableFlux<String> messages = webSocketsConsumer.webSocketFlux();
messages.subscribe(reactiveRedisConsumer);
messages.subscribe(reactiveJdbcConsumer);
webSocketsDisposable = messages.connect();
}
@PreDestroy
public void onDestroy() {
if (webSocketsDisposable != null) webSocketsDisposable.dispose();
}
}
Questions:
Is this a proper use of reactive streams? Maybe the Redis and database writes should be done in flatMap; however, IMO they can't be, because I want them to happen in the background and they will also aggregate messages over different time windows. Also note comment 1 in the code above, where IDEA flags my code. The code works, but I wonder what problems this warning might point to. Maybe I should use doOnNext not to call emitter::next but to invoke some message dispatcher there, with something like doOnNext(dispatcher::dispatchMessage)?
I want the WebSocket client to start as soon as the application is ready and to stop consuming messages when the application shuts down. Are the @EventListener(ApplicationReadyEvent.class) and @PreDestroy annotations, together with the code shown above, a proper way to handle this scenario in the reactive world?
As I said, saving to Redis and the SQL database is also reactive, i.e. those saves also produce a Mono<T>. Is subscribing to those Monos inside the subscribe of the WebSocket flux OK, or should it be done some other way (comments 2 and 3 in the code above)?

Related

Difference between DirectChannel and FluxMessageChannel

I was reading about Spring Integration's FluxMessageChannel here and here, but I still don't understand exactly what are the differences between using a DirectChannel and FluxMessageChannel when using Project Reactor. Since the DirectChannel is stateless and controlled by its pollers, I'd expect the FluxMessageChannel to not be needed. I'm trying to understand when exactly should I use each and why, when speaking on Reactive Streams applications that are implemented with Spring Integration.
I currently have a reactive project that uses DirectChannel, and it seems to work fine, even the documentation says:
the flow behavior is changed from an imperative push model to a reactive pull model
I'd like to understand when to use each of the channels and what is the exact difference when working with Reactive Streams.
The DirectChannel does not have any poller and its implementation is very simple: as soon as a message is sent to it, the handler is called, in the caller's own thread:
public class DirectChannel extends AbstractSubscribableChannel {
private final UnicastingDispatcher dispatcher = new UnicastingDispatcher();
private volatile Integer maxSubscribers;
/**
* Create a channel with default {@link RoundRobinLoadBalancingStrategy}.
*/
public DirectChannel() {
this(new RoundRobinLoadBalancingStrategy());
}
Where that UnicastingDispatcher is:
public final boolean dispatch(final Message<?> message) {
if (this.executor != null) {
Runnable task = createMessageHandlingTask(message);
this.executor.execute(task);
return true;
}
return this.doDispatch(message);
}
(There is no executor option for the DirectChannel)
private boolean doDispatch(Message<?> message) {
if (tryOptimizedDispatch(message)) {
return true;
}
...
protected boolean tryOptimizedDispatch(Message<?> message) {
MessageHandler handler = this.theOneHandler;
if (handler != null) {
try {
handler.handleMessage(message);
return true;
}
catch (Exception e) {
throw IntegrationUtils.wrapInDeliveryExceptionIfNecessary(message,
() -> "Dispatcher failed to deliver Message", e);
}
}
return false;
}
That's why I call it an "imperative push model". The caller in this case is going to wait until the handler finishes its job. And if you have a big flow, everything is stopped in the sender thread until the sent message has reached the end of the flow of direct channels. In short: the publisher is in charge of the whole execution, and it is blocked while that happens. You haven't faced any problems with your DirectChannel-based solution only because you haven't yet used reactive non-blocking threads, such as Netty in WebFlux or the MongoDB reactive driver.
The FluxMessageChannel was designed specifically for Reactive Streams, where the subscriber is in charge of handling a message, which it pulls from the Flux on demand. This way the publisher is free to do anything else right after sending, because handling the message is already the subscriber's responsibility.
I would say it is definitely OK to use DirectChannel as long as your handlers are not blocking. If they are blocking, you should go with FluxMessageChannel. Also don't forget that there are other channel types for different tasks: https://docs.spring.io/spring-integration/docs/current/reference/html/core.html#channel-implementations
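The difference between the two dispatch styles can be sketched with plain JDK classes. This is only an illustration under stated assumptions: DispatchSketch, sendDirect and sendWithHandOff are hypothetical names, not Spring Integration APIs, and a single-thread executor stands in for the subscriber side of a FluxMessageChannel.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration; these are not Spring Integration classes.
class DispatchSketch {

    // "Direct" dispatch: the handler runs on the sender's thread,
    // so this method returns only after the handler has finished.
    static String sendDirect(String message, Consumer<String> handler) {
        handler.accept(message);
        return Thread.currentThread().getName();
    }

    // Hand-off dispatch: the message is passed to the subscriber's executor
    // and the sender gets a future back immediately.
    static CompletableFuture<String> sendWithHandOff(
            Executor subscriber, String message, Consumer<String> handler) {
        return CompletableFuture.supplyAsync(() -> {
            handler.accept(message);
            return Thread.currentThread().getName();
        }, subscriber);
    }

    public static void main(String[] args) {
        Consumer<String> handler = m ->
                System.out.println(Thread.currentThread().getName() + " handled " + m);

        System.out.println("direct handler ran on: " + sendDirect("m1", handler));

        ExecutorService subscriber =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "subscriber-thread"));
        try {
            CompletableFuture<String> pending = sendWithHandOff(subscriber, "m2", handler);
            System.out.println("sender continues on: " + Thread.currentThread().getName());
            System.out.println("hand-off handler ran on: " + pending.join());
        } finally {
            subscriber.shutdown();
        }
    }
}
```

In the direct case a slow or blocking handler stalls the sender for its whole duration; in the hand-off case only the subscriber's thread is occupied.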

How can transactions be implemented in spring webflux without r2dbc driver

General problem description
Due to compatibility issues with the provided database I cannot use the provided r2dbc driver for the database. The only possible option is using the standard jdbc driver, but I have faced some issues getting transactions to work in the spring-webflux / Project Reactor context.
Transactions with jdbc usually rely on the connection being thread-local. In Project Reactor it is not guaranteed that each step of a Flux/Mono is executed on the same thread. What is more, I assume one of the major benefits of reactive programming is the ability to switch threads without having to worry about it. For this reason the standard spring jdbc TransactionManager cannot be used, and for r2dbc a ReactiveTransactionManager is implemented. As I am using jdbc in this case, I can neither use the JdbcTransactionManager, nor is a ReactiveTransactionManager available.
First of all: Is there a simple solution to this Problem?
"Hacky" solution
I will now elaborate further on the steps I already took to solve this issue for me. My idea was implementing a custom ReactiveTransactionManager, which is based on the provided JdbcTransactionManager. My assumption was that it would be possible to wrap a transaction around a Mono/Flux this way. The issue is that I did not take into account the issue described above: It works currently only in a ThreadLocal context as the underlying JdbcTransactions still rely on it. Due to this the inner transactions are handled (commit,rollback) individually if the thread is changed in between.
The following class is the implementation of my custom transaction manager to be included in a reactive stream.
public class JdbcReactiveTransactionManager implements ReactiveTransactionManager {
// Jdbc or connection based transaction manager
private final DataSourceTransactionManager transactionManager;
// ReactiveTransaction delegates everything to TransactionStatus.
static class JdbcReactiveTransaction implements ReactiveTransaction {
public JdbcReactiveTransaction(TransactionStatus transactionStatus) {
this.transactionStatus = transactionStatus;
}
private final TransactionStatus transactionStatus;
public TransactionStatus getTransactionStatus() {
return transactionStatus;
}
// [...]
}
@Override
public @NonNull Mono<ReactiveTransaction> getReactiveTransaction(TransactionDefinition definition)
throws TransactionException {
return Mono.just(transactionManager.getTransaction(definition)).map(JdbcReactiveTransaction::new);
}
@Override
public @NonNull Mono<Void> commit(@NonNull ReactiveTransaction transaction) throws TransactionException {
if (transaction instanceof JdbcReactiveTransaction t) {
transactionManager.commit(t.getTransactionStatus());
return Mono.empty();
} else {
return Mono.error(new IllegalTransactionStateException("Illegal ReactiveTransaction type used"));
}
}
@Override
public @NonNull Mono<Void> rollback(@NonNull ReactiveTransaction transaction) throws TransactionException {
if (transaction instanceof JdbcReactiveTransaction t) {
transactionManager.rollback(t.getTransactionStatus());
return Mono.empty();
} else {
return Mono.error(new IllegalTransactionStateException("Illegal ReactiveTransaction type used"));
}
}
The implemented solution works in all scenarios where the thread does not change. But a fixed thread is not what one usually wants to achieve with reactive approaches. Therefore the thread must be pinned using publishOn and subscribeOn. This is all very hacky and I do not consider it a good solution myself, but I do not see a better alternative currently. As this is only required for one use case right now I can probably make do with it, but I would really like to find a better solution.
Pinning the Thread
The example below shows that I need to use both publishOn and subscribeOn to pin the thread. If I omit either one of these, some statements won't be executed in the same thread. My current assumption is that Netty executes the parsing in a separate thread (or event loop). Therefore the additional publishOn is required.
public Mono<ServerResponse> allocateFlows(ServerRequest request) {
final val single = Schedulers.newSingle("AllocationService-allocateFlows");
return request.bodyToMono(FlowsAllocation.class)
.publishOn(single) // Why do I need this although I execute subscribeOn later?
.flatMapMany(this::someProcessingLogic)
.concatMapDelayError(this::someOtherProcessingLogic)
.as(transactionalOperator::transactional)
.subscribeOn(single, false)
.then(ServerResponse.ok().build());
}
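The thread-affinity problem described above can be reproduced with plain JDK classes. The sketch below is an illustration only: ThreadBoundTransactionSketch and CURRENT_TX are hypothetical names, and the ThreadLocal merely stands in for the thread-bound connection holder that a JDBC transaction manager relies on. It shows that state set on one thread is invisible on another, and that running every step on one single-thread executor keeps it visible.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; not Spring's transaction infrastructure.
class ThreadBoundTransactionSketch {

    // Stand-in for the thread-bound transaction/connection holder.
    static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    // "Begin" on one thread, then look the transaction up on another thread.
    static String beginAndReadOnDifferentThreads() {
        ExecutorService first = Executors.newSingleThreadExecutor();
        ExecutorService second = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    .runAsync(() -> CURRENT_TX.set("tx-1"), first)
                    .thenApplyAsync(ignored -> CURRENT_TX.get(), second)
                    .join();
        } finally {
            first.shutdown();
            second.shutdown();
        }
    }

    // Both steps run on the same single worker thread, so the state is still there.
    static String beginAndReadOnPinnedThread() {
        ExecutorService single = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    .runAsync(() -> CURRENT_TX.set("tx-1"), single)
                    .thenApplyAsync(ignored -> CURRENT_TX.get(), single)
                    .join();
        } finally {
            single.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("different threads: " + beginAndReadOnDifferentThreads()); // prints null
        System.out.println("pinned thread:     " + beginAndReadOnPinnedThread());     // prints tx-1
    }
}
```

Pinning trades away the freedom to switch threads, which is the cost the question calls hacky; the sketch only makes that trade-off visible and does not suggest a better transaction strategy.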

Spring Integration: Manual channel handling

What I want: Build a configurable library that
uses another library that has an internal routing and a subscribe method like: clientInstance.subscribe(endpoint, (endpoint, message) -> <handler>) , e.g. Paho MQTT library
later in my code I want to access the messages in a Flux.
My idea:
create MessageChannels like so:
integrationFlowContext
.registration(IntegrationFlows.from("message-channel:" + endpoint).bridge().get())
.register()
forward to reactive publishers:
applicationContext.registerBean(
"publisher:" + endpoint,
Publisher.class,
() -> IntegrationFlows.from("message-channel:" + endpoint).toReactivePublisher()
);
keep the message channels in a set or similar and implement the above handler: (endpoint, message) -> messageChannels.get(endpoint).send( <converter>(message))
later use (in a @PostConstruct method):
Flux
.from((Publisher<Message<?>>) applicationContext.getBean("publisher:" + endpoint))
.map(...)
.subscribe()
I doubt this to be the best way to do what I want. Feels like abusing spring integration. Any suggestions are welcome at this point.
In general however (at least in my tests) this seemed to be working. But when I run my application, I get errors like: "Caused by: org.springframework.messaging.core.DestinationResolutionException: no output-channel or replyChannel header available".
This is especially bad, since after this exception the publishers claim to not have a subscriber anymore. Thus, in a real application no more messages are processed.
I am not sure what this message means, but I can kind of reproduce it (but don't understand why):
@Test
public void channelTest() {
integrationFlowContext
.registration(
IntegrationFlows.from("any-channel").bridge().get()
)
.register();
registryUtil.registerBean(
"any-publisher",
Publisher.class,
() -> IntegrationFlows.from("any-channel").toReactivePublisher()
);
Flux
.from((Publisher<Message<?>>) applicationContext.getBean("any-publisher"))
.subscribe(System.out::println);
MessageChannel messageChannel = applicationContext.getBean("any-channel", MessageChannel.class);
try {
messageChannel.send(MessageBuilder.withPayload("test").build());
} catch (Throwable t) {
log.error("Error: ", t);
}
}
I of course read parts of the spring integration documentation, but don't quite get what happens behind the scenes. Thus, I feel like guessing possible error causes.
EDIT:
This, however works:
@TestConfiguration
static class Config {
GenericApplicationContext applicationContext;
Config(
GenericApplicationContext applicationContext,
IntegrationFlowContext integrationFlowContext
) {
this.applicationContext = applicationContext;
// optional here, but needed for some reason in my library,
// since I can't find the channel beans like I will do here,
// if I didn't register them like so:
//integrationFlowContext
// .registration(
// IntegrationFlows.from("any-channel").bridge().get())
// .register();
applicationContext.registerBean(
"any-publisher",
Publisher.class,
() -> IntegrationFlows.from("any-channel").toReactivePublisher()
);
}
@PostConstruct
void connect(){
Flux
.from((Publisher<Message<?>>) applicationContext.getBean("any-publisher"))
.subscribe(System.out::println);
}
}
@Autowired
ApplicationContext applicationContext;
@Autowired
IntegrationFlowContext integrationFlowContext;
@Test
@SneakyThrows
public void channel2Test() {
MessageChannel messageChannel = applicationContext.getBean("any-channel", MessageChannel.class);
try {
messageChannel.send(MessageBuilder.withPayload("test").build());
} catch (Throwable t) {
log.error("Error: ", t);
}
}
Thus apparently my issue above is related to messages arriving "too early" .. I guess?!
No, your issue is related to round-robin dispatching on the DirectChannel registered under the any-channel bean name.
You define two IntegrationFlow instances starting from that channel, each with its own subscriber, but at runtime both are subscribed to the same any-channel instance, and that channel comes with a round-robin load balancer by default. So one message goes to your Flux.from() subscriber, but the next goes to that bridge(), which doesn't know what to do with the message, so it tries to resolve a replyChannel header.
Therefore your solution with only the single IntegrationFlows.from("any-channel").toReactivePublisher() is correct. Alternatively, you could register a FluxMessageChannel and use it on one side for regular message sending and on the other side as a reactive source for Flux.from().
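The round-robin effect described above can be illustrated without Spring. In this minimal sketch, RoundRobinSketch and RoundRobinChannel are hypothetical stand-ins, not the real DirectChannel or its dispatcher; the point is only that with two subscribers on one channel, each message reaches exactly one of them, alternating. Which subscriber receives the first message depends on subscription order, which is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; not Spring Integration's DirectChannel.
class RoundRobinSketch {

    // A channel that delivers each message to exactly one subscriber, in turn.
    static final class RoundRobinChannel {
        private final List<Consumer<String>> subscribers = new ArrayList<>();
        private int next = 0;

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void send(String message) {
            Consumer<String> target = subscribers.get(next);
            next = (next + 1) % subscribers.size();
            target.accept(message);
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        RoundRobinChannel channel = new RoundRobinChannel();
        // First subscriber: plays the role of the bridge() flow, which has nowhere to send its output.
        channel.subscribe(m -> log.add("bridge got " + m));
        // Second subscriber: plays the role of the reactive publisher feeding Flux.from().
        channel.subscribe(m -> log.add("publisher got " + m));

        channel.send("m1");
        channel.send("m2");
        channel.send("m3");
        return log;
    }

    public static void main(String[] args) {
        // Prints: bridge got m1, publisher got m2, bridge got m3 (one per line).
        demo().forEach(System.out::println);
    }
}
```

With only one subscriber registered, every message reaches it, which matches the observation that the single toReactivePublisher() flow works.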

Vert.x: how to process HttpRequest with a blocking operation

I've just started with Vert.x and would like to understand what is the right way of handling potentially long (blocking) operations as part of processing a REST HttpRequest. The application itself is a Spring app.
Here is a simplified REST service I have so far:
public class MainApp {
// instantiated by Spring
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
Vertx.vertx().deployVerticle(alertsRestService);
}
}
public class AlertsRestService extends AbstractVerticle {
// instantiated by Spring
private PostgresService pgService;
@Value("${rest.endpoint.port:8080}")
private int restEndpointPort;
@Override
public void start(Future<Void> futureStartResult) {
HttpServer server = vertx.createHttpServer();
Router router = Router.router(vertx);
//enable reading of the request body for all routes
router.route().handler(BodyHandler.create());
router.route(HttpMethod.GET, "/allDefinitions")
.handler(this::handleGetAllDefinitions);
server.requestHandler(router)
.listen(restEndpointPort,
result -> {
if (result.succeeded()) {
futureStartResult.complete();
} else {
futureStartResult.fail(result.cause());
}
}
);
}
private void handleGetAllDefinitions( RoutingContext routingContext) {
HttpServerResponse response = routingContext.response();
Collection<AlertDefinition> allDefinitions = null;
try {
allDefinitions = pgService.getAllDefinitions();
} catch (Exception e) {
response.setStatusCode(500).end(e.getMessage());
return; // the response is already ended; do not fall through to the 200 branch
}
response.putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(allDefinitions));
}
}
Spring config:
<bean id="alertsRestService" class="com.my.AlertsRestService"
p:pgService-ref="postgresService"
p:restEndpointPort="${rest.endpoint.port}"
/>
<bean id="mainApp" class="com.my.MainApp"
p:alertsRestService-ref="alertsRestService"
/>
Now the question is: how to properly handle the (blocking) call to my postgresService, which may take longer time if there are many items to get/return ?
After researching and looking at some examples, I see a few ways to do it, but I don't fully understand differences between them:
Option 1. convert my AlertsRestService into a Worker Verticle and use the worker thread pool:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions().setWorker(true);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
What confuses me here is this statement from the Vert.x docs: "Worker verticle instances are never executed concurrently by Vert.x by more than one thread, but can [be] executed by different threads at different times"
Does it mean that all HTTP requests to my alertsRestService are going to be, effectively, throttled to be executed sequentially, by one thread at a time? That's not what I would like: this service is purely stateless and should be able to handle concurrent requests just fine ....
So, maybe I need to look at the next option:
Option 2. convert my service to be a multi-threaded Worker Verticle, by doing something similar to the example in the docs:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions()
.setWorker(true)
.setInstances(5) // matches the worker pool size below
.setWorkerPoolName("the-specific-pool")
.setWorkerPoolSize(5);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
So, in this example - what exactly will be happening? As I understand, ".setInstances(5)" directive means that 5 instances of my 'alertsRestService' will be created. I configured this service as a Spring bean, with its dependencies wired in by the Spring framework. However, in this case, it seems to me the 5 instances are not going to be created by Spring, but rather by Vert.x - is that true? and how could I change that to use Spring instead?
Option 3. use the 'blockingHandler' for routing. The only change in the code would be in the AlertsRestService.start() method in how I define a handler for the router:
boolean ordered = false;
router.route(HttpMethod.GET, "/allDefinitions")
.blockingHandler(this::handleGetAllDefinitions, ordered);
As I understand, setting the 'ordered' parameter to false means that the handler can be called concurrently. Does it mean this option is equivalent to Option #2 with multi-threaded Worker Verticles?
What is the difference? that the async multi-threaded execution pertains to the one specific HTTP request only (the one for the /allDefinitions path) as opposed to the whole AlertsRestService Verticle?
Option 4. and the last option I found is to use the 'executeBlocking()' directive explicitly to run only the enclosed code in worker threads. I could not find many examples of how to do this with HTTP request handling, so below is my attempt - maybe incorrect. The difference here is only in the implementation of the handler method, handleGetAllAlertDefinitions() - but it is rather involved... :
private void handleGetAllAlertDefinitions(RoutingContext routingContext) {
vertx.executeBlocking(
fut -> { fut.complete( sendAsyncRequestToDB(routingContext)); },
false,
res -> { handleAsyncResponse(res, routingContext); }
);
}
public Collection<AlertDefinition> sendAsyncRequestToDB(RoutingContext routingContext) {
Collection<AlertDefinition> allAlertDefinitions = new LinkedList<>();
try {
allAlertDefinitions = alertDefinitionsDao.getAllAlertDefinitions();
} catch (Exception e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
return allAlertDefinitions;
}
private void handleAsyncResponse(AsyncResult<Object> asyncResult, RoutingContext routingContext){
if(asyncResult.succeeded()){
try {
routingContext.response().putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(asyncResult.result()));
} catch(EncodeException e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
} else {
routingContext.response().setStatusCode(500)
.end(asyncResult.cause().getMessage());
}
}
How is this different from the other options? And does Option 4 provide concurrent execution of the handler or single-threaded like in Option 1?
Finally, coming back to the original question: what is the most appropriate Option for handling longer-running operations when handling REST requests?
Sorry for such a long post.... :)
Thank you!
That's a big question, and I'm not sure I'll be able to address it fully. But let's try:
In Option #1 what it actually means is that you shouldn't use ThreadLocal in your worker verticles, if you use more than one worker of the same type. Using only one worker means that your requests will be serialised.
Option #2 is simply incorrect. You cannot use setInstances with an instance of a class, only with its name. You're correct, though, that if you choose to use the name of the class, Vert.x will instantiate the verticles.
Option #3 is less concurrent than using Workers, and shouldn't be used.
Option #4 executeBlocking is basically doing Option #3, and is also quite bad.

Run task in background using deferredResult in Spring without frozen browser as client

I have implemented a simple REST service with which I'd like to test DeferredResult from Spring. I am getting the output in this order:
TEST
TEST 1
TEST AFTER DEFERRED RESULT
I'd like to know why in the browser (client) I need to wait those 8 seconds. Isn't DeferredResult supposed to be non-blocking and run the task in the background? If not, how do I create a REST service that is non-blocking and runs tasks in the background without using Java 9 and reactive streams?
@RestController("/")
public class Controller {
@GetMapping
public DeferredResult<Person> test() {
System.out.println("TEST");
DeferredResult<Person> result = new DeferredResult<>();
CompletableFuture.supplyAsync(this::test1)
.whenCompleteAsync((res, throwable) -> {
System.out.println("TEST AFTER DEFERRED RESULT");
result.setResult(res);
});
System.out.println("TEST 1");
return result;
}
private Person test1() {
try {
Thread.sleep(8000);
} catch (InterruptedException e) {
e.printStackTrace();
}
return new Person("michal", 20);
}
}
class Person implements Serializable {
private String name;
private int age;
}
DeferredResult is a holder for a WebRequest that allows the serving thread to be released to serve another incoming HTTP request instead of waiting for the current one's result. Once setResult or setErrorResult is invoked, Spring releases the stored WebRequest and your client receives the response.
The DeferredResult holder is a Spring Framework abstraction for non-blocking I/O threading.
The DeferredResult abstraction has nothing to do with background tasks. Using it without any threading abstraction results in plain same-thread execution. Your test1 method runs in the background because of the CompletableFuture.supplyAsync invocation, which hands the execution to the common pool.
The result is returned after 8 seconds because the callback passed to whenCompleteAsync is called only after the test1 method returns.
You cannot receive the result immediately when your "service call logic" takes 8 seconds, even though you are performing it in the background. If you want to release the HTTP request, just return an appropriate object that is already available (it could contain a UUID, for example, to fetch the created person later), or nothing, from the controller method. There is a specific HTTP response code, 202 Accepted, which means the server side is still processing the request. You can then try to GET your created user after N seconds.
The second approach, if you need to notify the client side (though I would not recommend it if this is the only reason), is to use WebSockets to push a message to the client.
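The first approach (answer immediately with an identifier, let the client fetch the result later) can be sketched with JDK classes only. This is a minimal sketch under assumptions: AcceptedJobSketch, submit and poll are hypothetical names, the HTTP layer is left out, and the in-memory map stands in for whatever storage a real service would use. A controller would return the UUID with 202 Accepted from submit and serve poll from a separate GET endpoint.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration of "accept now, fetch later"; no Spring types involved.
class AcceptedJobSketch {

    private final Map<UUID, CompletableFuture<String>> jobs = new ConcurrentHashMap<>();

    // Starts the slow task on the common pool and returns an id right away.
    UUID submit(Supplier<String> slowTask) {
        UUID id = UUID.randomUUID();
        jobs.put(id, CompletableFuture.supplyAsync(slowTask));
        return id;
    }

    // Empty while the task is unknown or still running; the result once it has finished.
    // A task that failed makes join() throw a CompletionException.
    Optional<String> poll(UUID id) {
        CompletableFuture<String> job = jobs.get(id);
        if (job == null || !job.isDone()) {
            return Optional.empty();
        }
        return Optional.ofNullable(job.join());
    }

    public static void main(String[] args) throws InterruptedException {
        AcceptedJobSketch service = new AcceptedJobSketch();
        UUID id = service.submit(() -> {
            try {
                Thread.sleep(500); // stands in for the slow service call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "person created";
        });
        System.out.println("accepted, poll with id " + id);
        while (!service.poll(id).isPresent()) {
            System.out.println("still processing");
            Thread.sleep(200);
        }
        System.out.println("result: " + service.poll(id).get());
    }
}
```

The client never waits on an open request here; it pays for that with a second round trip.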
