How to convert a vert.x ReactiveReadStream<Document> to ReactiveWriteStream<Buffer> - spring

I have a straightforward use case: make a REST call, query Mongo, and then return an arbitrarily large stream of data back to the client, all with reactive-streams-style back-pressure management.
This was quite easy to achieve using Spring WebFlux and Reactor. I am now trying to achieve the same goal using vert.x, as a comparison of ease of implementation.
Having found the vert.x mongo client to be lacking any support for managing back pressure, I am now attempting to use the WebFlux mongo client and then pump the data back through the vert.x HttpResponse, as shown in the following code:
public class MyMongoVerticle extends AbstractVerticle {

    ReactiveMongoOperations operations;

    public void start() throws Exception {
        final Router router = Router.router(vertx);
        router.route().handler(BodyHandler.create());
        router.get("/myUrl").handler(ctx -> {
            // WebFlux mongo operations returns a Reactive Streams compatible entity
            Flux<Document> mongoStream = operations.findAll(Document.class, "myCollection");
            ReactiveReadStream rrs = ReactiveReadStream.readStream();
            // rrs is a Reactive Streams subscriber
            mongoStream.subscribe(rrs);
            // Pump pumps the rrs (ReactiveReadStream) to the HttpServerResponse (ReactiveWriteStream)
            Pump pump = Pump.pump(rrs, ctx.response());
            pump.start();
        });
        vertx.createHttpServer().requestHandler(router::accept).listen(8777);
    }
}
The issue I have encountered is that HttpServerResponse implements ReactiveWriteStream<Buffer>, so it expects Buffers rather than a stream of Documents. The result is a ClassCastException.
The question I have is: how can I convert this stream of Documents into a ReactiveWriteStream<Buffer>? There may be a better way to do this, so I'm open to other suggestions on how to achieve it.

Pump won't work for you, as it doesn't currently support transformations. You'll have to implement the pump yourself. Luckily, this shouldn't be too hard:
Flux<Document> mongoStream = operations.findAll(Document.class, "myCollection");
ReactiveReadStream<Document> rrs = ReactiveReadStream.readStream();
mongoStream.subscribe(rrs);
HttpServerResponse outStream = ctx.response();
// Changes start here
rrs.handler(d -> {
    // write the current document first, so no element is dropped
    outStream.write(d.toJson());
    // if the write queue is full, pause the read stream
    // and resume it once the response has drained
    if (outStream.writeQueueFull()) {
        rrs.pause();
        outStream.drainHandler(v -> rrs.resume());
    }
}).endHandler(v -> {
    outStream.end();
});
Note that I wouldn't expect this to be more efficient than a "native" WebFlux implementation.
Also, the JSON in this example will be mangled, as I don't wrap the documents in a proper JSON array.
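For completeness, one way to wrap it: a minimal sketch along the same lines, assuming rrs and outStream are set up as above (the first flag and the comma bookkeeping are my additions; AtomicBoolean is java.util.concurrent.atomic.AtomicBoolean):
AtomicBoolean first = new AtomicBoolean(true);
outStream.write("[");
rrs.handler(d -> {
    // prefix every element except the first with a comma
    String prefix = first.getAndSet(false) ? "" : ",";
    outStream.write(prefix + d.toJson());
    if (outStream.writeQueueFull()) {
        rrs.pause();
        outStream.drainHandler(v -> rrs.resume());
    }
}).endHandler(v -> {
    outStream.write("]");
    outStream.end();
});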

Related

How to combine sink.asFlux() with Server-Sent Events (SSE) using Spring WebFlux?

I am using Spring Boot 2.7.8 with WebFlux.
I have a sink in my class like this:
private final Sinks.Many<TaskEvent> sink = Sinks.many()
        .multicast()
        .onBackpressureBuffer();
It can be subscribed to like this:
public Flux<List<TaskEvent>> subscribeToTaskUpdates() {
    return sink.asFlux()
            .buffer(Duration.ofSeconds(1))
            .share();
}
The @Controller uses it like this to push the updates as Server-Sent Events (SSE) to the browser:
@GetMapping("/transferdatestatuses/updates")
public Flux<ServerSentEvent<TransferDateStatusesUpdateEvent>> subscribeToTransferDataStatusUpdates() {
    return monitoringSseBroker.subscribeToTaskUpdates()
            .map(taskEventList -> ServerSentEvent.<TransferDateStatusesUpdateEvent>builder()
                    .data(TransferDateStatusesUpdateEvent.of(taskEventList))
                    .build());
}
This works fine at first, but if I navigate away in my (Thymeleaf) web application to a page that has no connection with the SSE URL and then go back, the browser can no longer connect.
After some investigation, I found out that the problem is that the removal of the subscriber closes the Flux, and a new subscriber cannot connect anymore.
I have found 3 ways to fix it, but I don't understand the internals well enough to decide which one is the best solution and whether there are any things I need to consider when deciding what to use.
Solution 1
Disable autoCancel on the sink by using the overload of onBackpressureBuffer that allows setting this parameter:
private final Sinks.Many<TaskEvent> sink = Sinks.many()
        .multicast()
        .onBackpressureBuffer(Queues.SMALL_BUFFER_SIZE, false);
Solution 2
Use replay(0).autoConnect() instead of share():
public Flux<List<TaskEvent>> subscribeToTaskUpdates() {
    return sink.asFlux()
            .buffer(Duration.ofSeconds(1))
            .replay(0).autoConnect();
}
Solution 3
Use publish().autoConnect() instead of share():
public Flux<List<TaskEvent>> subscribeToTaskUpdates() {
    return sink.asFlux()
            .buffer(Duration.ofSeconds(1))
            .publish().autoConnect();
}
Which of the solutions are advisable to make sure a browser can disconnect and connect again later without problems?
I'm not quite sure if it is the root of your problem, but I didn't have that issue when using a keepAlive Flux.
val keepAlive = Flux.interval(Duration.ofSeconds(10)).map {
    ServerSentEvent.builder<Image>()
        .event(":keepalive")
        .build()
}
return Flux.merge(
    keepAlive,
    imageUpdateFlux
)
Here is the whole file: Github

Can Reactive Kafka Receiver work with non-reactive Elasticsearch client?

Below is sample code that uses reactor-kafka and reads data from a topic (with retry logic) whose records are published via a non-reactive producer. Inside my doOnNext() consumer I am using the non-reactive Elasticsearch client, which indexes the record in the index. There are a few questions I am still unclear about:
I know that consumers and producers are independent, decoupled systems, but is it recommended for the producer to be reactive as well when its consumers are reactive?
If I am using something that is non-reactive, in this case the Elasticsearch client org.elasticsearch.client.RestClient, does the "reactiveness" of the code still work? If it does or does not, how do I test it? (By "reactiveness" I mean the non-blocking IO part, i.e. if I spawn three reactive consumers and one is latent for some reason, its thread should be unblocked and used for the other reactive consumers.)
In general the question is: if I wrap some API with reactive clients, should that API be reactive as well?
public Disposable consumeRecords() {
    long maxAttempts = 3, duration = 10;
    RetryBackoffSpec retrySpec = Retry.backoff(maxAttempts, Duration.ofSeconds(duration)).transientErrors(true);
    Consumer<ReceiverRecord<K, V>> doOnNextConsumer = x -> {
        // use non-reactive Elasticsearch client and index record x
    };
    return KafkaReceiver.create(receiverOptions)
            .receive()
            .doOnNext(record -> {
                try {
                    // calling the non-reactive consumer
                    doOnNextConsumer.accept(record);
                } catch (Exception e) {
                    throw new ReceiverRecordException(record, e);
                }
                record.receiverOffset().acknowledge();
            })
            .doOnError(t -> log.error("Error occurred: ", t))
            .retryWhen(retrySpec)
            .onErrorContinue((e, record) -> {
                ReceiverRecordException receiverRecordException = (ReceiverRecordException) e;
                log.error("Retries exhausted for: " + receiverRecordException);
                receiverRecordException.getRecord().receiverOffset().acknowledge();
            })
            .repeat()
            .subscribe();
}
Got some understanding around it.
The reactive KafkaReceiver will internally call some API; if that API is blocking, then even though the KafkaReceiver is "reactive", the non-blocking IO will not work and the receiver thread will be blocked, because you are calling a blocking / non-reactive API.
You can test this out by creating a simple server (which blocks calls for some time / sleeps) and calling that server from this receiver.
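If you do need to keep the non-reactive Elasticsearch client, one common mitigation (my sketch, not from the answer above) is to isolate the blocking call on a scheduler meant for blocking work. Here indexRecord(...) is a placeholder for the blocking RestClient call, and receiverOptions is assumed to be configured as in the question:
public Disposable consumeRecordsOffloaded() {
    return KafkaReceiver.create(receiverOptions)
            .receive()
            // wrap the blocking call in a Mono and shift it off the receiver thread
            .flatMap(record -> Mono.fromCallable(() -> {
                        indexRecord(record); // blocking Elasticsearch I/O happens here
                        return record;
                    })
                    .subscribeOn(Schedulers.boundedElastic()))
            .doOnNext(record -> record.receiverOffset().acknowledge())
            .subscribe();
}
This keeps the receiver thread free; note that flatMap may process records concurrently and out of order, so think about how that interacts with offset acknowledgement.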

Spring WebFlux + Kotlin Response Handling

I'm having some trouble wrapping my head around a supposedly simple RESTful WS response handling scenario when using Spring WebFlux in combination with Kotlin coroutines. Suppose we have a simple WS method in our REST controller that is supposed to return a possibly huge number (millions) of response "things":
@GetMapping
suspend fun findAllThings(): Flow<Thing> {
    // Reactive DB query, return a flow of things
}
This works as one would expect: the result is streamed to the client as long as a streaming media type (e.g. "application/x-ndjson") is used. In more complex service calls that also accounts for the possibility of errors/warnings I would like to return a response object of the following form:
class Response<T> {
    val errors: Flow<String>
    val things: Flow<T>
}
The idea here is that a response is either successful (returning an empty errors Flow and a Flow of things) or failed (errors contained in the corresponding Flow while the things Flow is empty). In blocking programming this is quite a common response idiom. My question now is: how can I adapt this idiom to the reactive approach in Kotlin/Spring WebFlux?
I know it's possible to just return the Response as described (or Mono<Response> for Java users), but this somewhat defeats the purpose of being reactive, as the entire Mono has to exist in memory at serialization time. Is there any way to solve this? The only possible solution I can think of right now is a custom Spring Encoder that is smart enough to stream both errors and things (whichever is present).
How about returning Success/Error per Thing?
class Result<T> private constructor(val result: T?, val error: String?) {
    constructor(data: T) : this(data, null)
    constructor(error: String) : this(null, error)
    val isError = error != null
}

@GetMapping
suspend fun findAllThings(): Flow<Result<Thing>> {
    // Reactive DB query, return a flow of results
}
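For comparison, a minimal Java/Reactor sketch of the same per-element idiom; findThings() and the public two-argument Result constructor are my placeholders, not part of the answer. The point is that a failure gets folded into the stream as a regular element, so the response stays streamable:
public Flux<Result<Thing>> findAllThings() {
    return findThings()                                   // Flux<Thing>, reactive DB query
            .map(thing -> new Result<Thing>(thing, null)) // wrap each element as a success
            .onErrorResume(e ->                           // fold the failure into the stream
                    Flux.just(new Result<Thing>(null, e.getMessage())));
}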

Idiomatic way of verifying a reactive request before actually persisting to the database

I have an endpoint that accepts as well as returns a reactive type. What I'm trying to achieve is to verify that the complete reactive request (which is actually an array of resources) is valid before persisting the changes to the database (read: full update of a resource). The question is not so much concerned with how to actually verify the request, but with how to chain the steps together, using which of Spring's reactive handler methods (map, flatMap and the like), in the desired order, which is basically:
verify correctness of the request (the resource is properly annotated with JSR-303 annotations)
clear the current resources in case of a valid request
persist the new resources in the database after clearing it
Let's assume the following scenario:
val service: ResourceService

@PostMapping("/resource/")
fun replaceResources(@Valid @RequestBody resources: Flux<RessourceDto>): Flux<RessourceDto> {
    var deleteWrapper = Mono.fromCallable {
        service.deleteAllRessources()
    }
    deleteWrapper = deleteWrapper.subscribeOn(Schedulers.elastic())
    return deleteWrapper.thenMany<RessourceDto> {
        resources
            .map(mapper::map) // map to model object
            .flatMap(service::createResource)
            .map(mapper::map) // map to dto object
            .subscribeOn(Schedulers.parallel())
    }
}
// alternative try
@PostMapping("/resourceAlternative/")
override fun replaceResourcesAlternative2(@RequestBody resources: Flux<ResourceDto>): Flux<ResourceDto> {
    return service.deleteAllResources()
        .thenMany<ResourceDto> {
            resources
                .map(mapper::map)
                .flatMap(service::createResource)
                .map(mapper::map)
        }
}
What's the idiomatic way of doing this in a reactive fashion?
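One thing to watch out for in the snippets above (my observation, not from the thread): thenMany expects a Publisher, so the trailing lambda passed to thenMany<RessourceDto> { ... } is SAM-converted into a Publisher that never signals its subscriber, meaning nothing is emitted. A minimal Java sketch of the intended chain, assuming deleteAllResources() returns a Mono and mapper/service mirror the question's code:
public Flux<ResourceDto> replaceResources(Flux<ResourceDto> resources) {
    return service.deleteAllResources()          // Mono<Void>: clear first
            .thenMany(Flux.defer(() -> resources // pass a Publisher, not a lambda
                    .map(mapper::map)            // map to model object
                    .flatMap(service::createResource)
                    .map(mapper::map)));         // map back to dto object
}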

Spring Web-Flux: How to return a Flux to a web client on request?

We are working with Spring Boot 2.0.0.BUILD-SNAPSHOT and Spring WebFlux 5.0.0, and currently we can't transfer a Flux to a client on request.
Currently I am creating the flux from an iterator:
public Flux<ItemIgnite> getAllFlux() {
    Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
    return Flux.create(flux -> {
        while (iterator.hasNext()) {
            flux.next(iterator.next().getValue());
        }
    });
}
And on request I am simply doing:
#RequestMapping(value="/all", method=RequestMethod.GET, produces="application/json")
public Flux<ItemIgnite> getAllFlux() {
return this.provider.getAllFlux();
}
When I now locally call localhost:8080/all, I get a 503 status code after 10 seconds. Also, when I request /all as a client using the WebClient:
public Flux<ItemIgnite> getAllProducts() {
    WebClient webClient = WebClient.create("http://localhost:8080");
    Flux<ItemIgnite> f = webClient.get().uri("/all").accept(MediaType.ALL).exchange()
            .flatMapMany(cr -> cr.bodyToFlux(ItemIgnite.class));
    f.subscribe(System.out::println);
    return f;
}
Nothing happens. No data is transferred.
When I do the following instead:
public Flux<List<ItemIgnite>> getAllFluxMono() {
    return Flux.just(this.getAllList());
}
and
#RequestMapping(value="/allMono", method=RequestMethod.GET, produces="application/json")
public Flux<List<ItemIgnite>> getAllFluxMono() {
return this.provider.getAllFluxMono();
}
It works. I guess that's because all the data has already finished loading and is simply transferred to the client as it would be without using a Flux.
What do I have to change to get the Flux to stream the data to the web client that requests it?
EDIT
I have data inside an ignite cache. So my getAllIterator is loading the data from the ignite cache:
public Iterator<Cache.Entry<String, ItemIgnite>> getAllIterator() {
    return this.igniteCache.iterator();
}
EDIT 2
adding flux.complete() like @Simon Baslé suggested:
public Flux<ItemIgnite> getAllFlux() {
    Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
    return Flux.create(flux -> {
        while (iterator.hasNext()) {
            flux.next(iterator.next().getValue());
        }
        flux.complete(); // see here
    });
}
solves the 503 problem in the browser, but it does not solve the problem with the WebClient. There is still no data transferred.
EDIT 3
using publishOn with Schedulers.parallel():
public Flux<ItemIgnite> getAllFlux() {
    Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
    return Flux.<ItemIgnite>create(flux -> {
        while (iterator.hasNext()) {
            flux.next(iterator.next().getValue());
        }
        flux.complete();
    }).publishOn(Schedulers.parallel());
}
Does not change the result.
Here is what the WebClient receives:
value :[Item ID: null, Product Name: null, Product Group: null]
complete
So it seems like it receives one item (out of over 35,000) with null values, and then completes.
One thing that jumps out is that you never call flux.complete() in your create.
But there's actually a factory operator that is tailored to transform an Iterable into a Flux, so you could just do Flux.fromIterable(this).
Edit: in case your Iterator is hiding complexity like a DB request (or any blocking I/O), be advised this spells trouble: anything blocking in a reactive chain, if not isolated on a dedicated execution context using publishOn, has the potential to block not only the entire chain but other reactive processes as well (as threads can and will be used by multiple reactive processes).
Neither create nor fromIterable do anything in particular to protect from blocking sources. I think you are facing that kind of issue, judging from the hang you get with the WebClient.
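Putting both suggestions together, a minimal sketch (my combination, assuming IgniteCache is Iterable over its entries, as the question's iterator suggests; Schedulers.boundedElastic() assumes a recent Reactor, older versions used elastic(), and subscribeOn is used so the iteration itself runs off the event loop, a variant of the publishOn isolation mentioned above):
public Flux<ItemIgnite> getAllFlux() {
    return Flux.fromIterable(this.igniteCache)         // Iterable<Cache.Entry<String, ItemIgnite>>
            .map(Cache.Entry::getValue)                // unwrap each cache entry
            .subscribeOn(Schedulers.boundedElastic()); // keep potentially blocking iteration off event loops
}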
The problem was my ItemIgnite object which I transfer. Flux seems not to be able to handle it, because if I change my original code to the following:
public Flux<String> getAllFlux() {
    Iterator<Cache.Entry<String, ItemIgnite>> iterator = this.getAllIterator();
    return Flux.create(flux -> {
        while (iterator.hasNext()) {
            flux.next(iterator.next().getValue().toString());
        }
    });
}
Everything works fine, without publishOn and without flux.complete(). Maybe someone has an idea why this is working.
