Spring Boot Webflux/Netty - Detect closed connection

Spring Boot Webflux/Netty - Detect closed connection - spring

I've been working with spring-boot 2.0.0.RC1 using the webflux starter (spring-boot-starter-webflux). I created a simple controller that returns a infinite flux. I would like that the Publisher only does its work if there is a client (Subscriber). Let's say I have a controller like this one:
#RestController
public class Demo {
#GetMapping(value = "/")
public Flux<String> getEvents(){
return Flux.create((FluxSink<String> sink) -> {
while(!sink.isCancelled()){
// TODO e.g. fetch data from somewhere
sink.next("DATA");
}
sink.complete();
}).doFinally(signal -> System.out.println("END"));
}
}
Now, when I try to run that code and access the endpoint http://localhost:8080/ with Chrome, then I can see the data. However, once I close the browser the while-loop continues since no cancel event has been fired. How can I terminate/cancel the streaming as soon as I close the browser?
From this answer I quote that:
Currently with HTTP, the exact backpressure information is not
transmitted over the network, since the HTTP protocol doesn't support
this. This can change if we use a different wire protocol.
I assume that, since backpressure is not supported by the HTTP protocol, it means that no cancel request will be made either.
Investigating a little bit further, by analyzing the network traffic, showed that the browser sends a TCP FIN as soon as I close the browser. Is there a way to configure Netty (or something else) so that a half-closed connection will trigger a cancel event on the publisher, making the while-loop stop?
Or do I have to write my own adapter similar to org.springframework.http.server.reactive.ServletHttpHandlerAdapter where I implement my own Subscriber?
Thanks for any help.
EDIT:
An IOException will be raised on the attempt to write data to the socket if there is no client. As you can see in the stack trace.
But that's not good enough, since it might take a while before the next chunk of data will be ready to send and therefore it takes the same amount of time to detect the gone client. As pointed out in Brian Clozel's answer it is a known issue in Reactor Netty. I tried to use Tomcat instead by adding the dependency to the POM.xml. Like this:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-tomcat</artifactId>
</dependency>
Although it replaces Netty and uses Tomcat instead, it does not seem reactive due to the fact that the browser does not show any data. However, there is no warning/info/exception in the console. Is spring-boot-starter-webflux as of this version (2.0.0.RC1) supposed to work together with Tomcat?

Since this is a known issue (see Brian Clozel's answer), I ended up using one Flux to fetch my real data and having another one in order to implement some sort of ping/heartbeat mechanism. As a result, I merge both together with Flux.merge().
Here you can see a simplified version of my solution:
#RestController
public class Demo {
public interface Notification{}
public static class MyData implements Notification{
…
public boolean isEmpty(){…}
}
#GetMapping(value = "/", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<? extends Notification>> getNotificationStream() {
return Flux.merge(getEventMessageStream(), getHeartbeatStream());
}
private Flux<ServerSentEvent<Notification>> getHeartbeatStream() {
return Flux.interval(Duration.ofSeconds(2))
.map(i -> ServerSentEvent.<Notification>builder().event("ping").build())
.doFinally(signalType ->System.out.println("END"));
}
private Flux<ServerSentEvent<MyData>> getEventMessageStream() {
return Flux.interval(Duration.ofSeconds(30))
.map(i -> {
// TODO e.g. fetch data from somewhere,
// if there is no data return an empty object
return data;
})
.filter(data -> !data.isEmpty())
.map(data -> ServerSentEvent
.builder(data)
.event("message").build());
}
}
I wrap everything up as ServerSentEvent<? extends Notification>. Notification is just a marker interface. I use the event field from the ServerSentEvent class in order to separate between data and ping events. Since the heartbeat Flux sends events constantly and in short intervals, the time it takes to detect that the client is gone is at most the length of that interval. Remember, I need that because it might take a while before I get some real data that can be sent and, as a result, it might also take a while before it detects that the client is gone. Like this, it will detect that the client is gone as soon as it can’t sent the ping (or possibly the message event).
One last note on the marker interface, which I called Notification. This is not really necessary, but it gives some type safety. Without that, we could write Flux<ServerSentEvent<?>> instead of Flux<ServerSentEvent<? extends Notification>> as return type for the getNotificationStream() method. Or also possible, make getHeartbeatStream() return Flux<ServerSentEvent<MyData>>. However, like this it would allow that any object could be sent, which I don’t want. As a consequence, I added the interface.

I'm not sure why this behaves like this, but I suspect it is because of the choice of generation operator. I think using the following would work:
return Flux.interval(Duration.ofMillis(500))
.map(input -> {
return "DATA";
});
According to Reactor's reference documentation, you're probably hitting the key difference between generate and push (I believe a quite similar approach using generate would probably work as well).
My comment was referring to the backpressure information (how many elements a Subscriber is willing to accept), but the success/error information is communicated over the network.
Depending on your choice of web server (Reactor Netty, Tomcat, Jetty, etc), closing the client connection might result in:
a cancel signal being received on the server side (I think this is supported by Netty)
an error signal being received by the server when it's trying to write on a connection that's been closed (I believe the Servlet spec does not provide that that callback and we're missing the cancel information).
In short: you don't need to do anything special, it should be supported already, but your Flux implementation might be the actual problem here.
Update: this is a known issue in Reactor Netty

Related

coordinating multiple outgoing requests in a reactive manner

this is more of a best practice question.
in my current system (monolith), a single incoming http api request might need to gather similarly structured data from to several backend sources, aggregate it and only then return the data to the client in the reponse of the API.
in the current implementation I simply use a threadpool to send all requests to the backend sources in parallel and a countdown latch of sorts to know all requests returned.
i am trying to figure out the best practice for transforming the described above using reactice stacks like vert.x/quarkus. i want to keep the reactiveness of the service that accepts this api call, calls multiple (similar) backend source via http, aggregates the data.
I can roughly guess I can use things like rest-easy reactive for the incoming request and maybe MP HTTP client for the backend requests (not sure its its reactive) but I am not sure what can replace my thread pool to execute things in parallel and whats the best way to aggregate the data that returns.
I assume that using a http reactive client I can invoke all the backend sources in a loop and because its reactive it will 'feel' like parralel work. and maybe the returned data should be aggragated via the stream API (to join streams of data)? but TBH I am not sure.
I know its a long long question but some pointers would be great.
thanks!

You can drop the thread pool, you don't need it to invoke your backend services in parallel.
Yes, the MP RestClient is reactive. Let's say you have this service which invokes a backend to get a comic villain:
#RegisterRestClient(configKey = "villain-service")
public interface VillainService {
#GET
#Path("/")
#NonBlocking
#CircuitBreaker
Uni<Villain> getVillain();
}
And a similar one for heroes, HeroService. You can inject them in your endpoint class, retrieve a villain and a hero, and then compute the fight:
#Path("/api")
public class Api {
#RestClient
VillainService villains;
#RestClient
HeroService heroes;
#Inject
FightService fights;
#GET
public Uni<Fight> fight() {
Uni<Villain> villain = villains.getVillain();
Uni<Hero> hero = heroes.getRandomHero();
return Uni.combine().all().unis(hero, villain).asTuple()
.chain(tuple -> {
Hero h = tuple.getItem1();
Villain v = tuple.getItem2();
return fights.computeResult(h, v);
});
}
}

Start processing Flux response from server before completion: is it possible?

I have 2 Spring-Boot-Reactive apps, one server and one client; the client calls the server like so:
Flux<Thing> things = thingsApi.listThings(5);
And I want to have this as a list for later use:
// "extractContent" operation takes 1.5s per "thing"
List<String> thingsContent = things.map(ThingConverter::extractContent)
.collect(Collectors.toList())
.block()
On the server side, the endpoint definition looks like this:
#Override
public Mono<ResponseEntity<Flux<Thing>>> listThings(
#NotNull #Valid #RequestParam(value = "nbThings") Integer nbThings,
ServerWebExchange exchange
) {
// "getThings" operation takes 1.5s per "thing"
Flux<Thing> things = thingsService.getThings(nbThings);
return Mono.just(new ResponseEntity<>(things, HttpStatus.OK));
}
The signature comes from the Open-API generated code (Spring-Boot server, reactive mode).
What I observe: the client jumps to things.map immediately but only starts processing the Flux after the server has finished sending all the "things".
What I would like: the server should send the "things" as they are generated so that the client can start processing them as they arrive, effectively halving the processing time.
Is there a way to achieve this? I've found many tutorials online for the server part, but none with a java client. I've heard of server-sent events, but can my goal be achieved using a "classic" Open-API endpoint definition that returns a Flux?
The problem seemed too complex to fit a minimal viable example in the question body; full code available for reference on Github.
EDIT: redirect link to main branch after merge of the proposed solution

I've got it running by changing 2 points:
First: I've changed the content type of the response of your /things endpoint, to:
content:
text/event-stream
Don't forget to change also the default response, else the client will expect the type application/json and will wait for the whole response.
Second point: I've changed the return of ThingsService.getThings to this.getThingsFromExistingStream (the method you comment out)
I pushed my changes to a new branch fix-flux-response on your Github, so you can test them directly.

How does ktor websocket flow api works?

I'm using ktor for server side development with websockets.
Documentations shows us this example of using incoming channel:
for (frame in incoming.mapNotNull { it as? Frame.Text }) {
// some
}
But mapNotNull is marked as deprecated in favor of Flow. How should I use this API and what problems could be there? For example, the Flow is a cold stream. It means that the producer function will be called on each collect. How does it work in context of websocket. Will it be reopened on second collect call, or maybe old messages will be delivered once after the next collect? How can I collect N messages, then stop collecting, then collect again?
Thanks in advance :)

How should I use this API and what problems could be there?
What I am using and what I have seen in one of the examples somewhere in the docs is the consumeAsFlow() method called on ReceiveChannel. Here is the entire snippet:
webSocket("/websocket") { //this: DefaultWebSocketServerSession
incoming
.consumeAsFlow()
.map { receive(it) }
.collect()
}
Haven't seen major issues with this approach. One thing you should be aware of (but that goes for the non-flow approach as well) is that if you throw inside your flow, then it will break the WebSocket connection, which is usually not something you'd like to do. It might be worth considering wrapping the entire thing in a try-catch.
Will it be reopened on second collect call, or maybe old messages will be delivered once after the next collect?
You open the websocket before you even start consuming the messages from the flow. You can see that inside webSocket() {} you are in the context of DefaultWebSocketServerSession. This is your connection management. Inside your flow you are simply receiving messages one by one as they arrive (after the connection has been established). If the connection breaks, then you're out of the flow. It needs to be re-established before you can process your messages. This establishing bit is done by the Route.webSocket() method. I do recommend taking a look at its Javadoc.
If you wish to add some clean up after the connection is closed you can add a finally block like so:
webSocket("/chat") {
try {
incoming
.consumeAsFlow()
.map { receive(it, client) }
.collect()
} finally {
// cleanup
}
}
In short: collect is called once per received message. If there is no connection (or it was broken) then collect won't be called.
How can I collect N messages, then stop collecting, then collect again?
What is the use case for this? I don't think you should be doing this with any flow. You can of course take(n) items from a flow, but you won't be able to take any more from it again.

Timeout on replyChannel when wireTap is used

We are using wireTap to take timestamps at different parts of the flow. When introduced to the newest flow, it started causing a timeout in the replyChannel. From what I understand from the documentation, wireTap does intercept the message and sends it to secondary channel, while not affecting the main flow - so it looks like the perfect thing to use to take snapshots of said timestamps. Are we using wrong component for the job, or is there something wrong with the configuration? And if so, how would you recommend to register such information?
The exception:
o.s.integration.core.MessagingTemplate : Failed to receive message from channel 'org.springframework.messaging.core.GenericMessagingTemplate$TemporaryReplyChannel#21845b0d' within timeout: 1000
The code:
#Bean
public MarshallingWebServiceInboundGateway inboundGateway(Jaxb2Marshaller jaxb2Marshaller,
DefaultSoapHeaderMapper defaultSoapHeaderMapper) {
final MarshallingWebServiceInboundGateway inboundGateway =
new MarshallingWebServiceInboundGateway(jaxb2Marshaller);
inboundGateway.setRequestChannelName(INPUT_CHANNEL_NAME);
inboundGateway.setHeaderMapper(defaultSoapHeaderMapper);
return inboundGateway;
}
#Bean
public IntegrationFlow querySynchronous() {
return IntegrationFlows.from(INPUT_CHANNEL_NAME)
.enrichHeaders(...)
.wireTap(performanceTimestampRegistrator.registerTimestampFlow(SYNC_REQUEST_RECEIVED_TIMESTAMP_NAME))
.handle(outboundGateway)
.wireTap(performanceTimestampRegistrator.registerTimestampFlow(SYNC_RESPONSE_RECEIVED_TIMESTAMP_NAME))
//.transform( m -> m) // for tests - REMOVE
.get();
}
And the timestamp flow:
public IntegrationFlow registerTimestampFlow(String asyncRequestReceivedTimestampName) {
return channel -> channel.handle(
m -> MetadataStoreConfig.registerFlowTimestamp(m, metadataStore, asyncRequestReceivedTimestampName));
}
The notable thing here is that if I uncomment the no-operation transformer, everything suddenly works fine, but it doesn't sound right and I would like to avoid such workarounds.
Another thing is that the other, very similar flow works correctly, without any workarounds. Notable difference being it puts message in kafka using kafka adapter, instead of calling some web service with outbound gateway. It still generates response to handle (with generateResponseFlow()), so it should behave the same way. Here is the flow, which works fine:
#Bean
public MarshallingWebServiceInboundGateway workingInboundGateway(Jaxb2Marshaller jaxb2Marshaller,
DefaultSoapHeaderMapper defaultSoapHeaderMapper, #Qualifier("errorChannel") MessageChannel errorChannel) {
MarshallingWebServiceInboundGateway aeoNotificationInboundGateway =
new MarshallingWebServiceInboundGateway(jaxb2Marshaller);
aeoNotificationInboundGateway.setRequestChannelName(WORKING_INPUT_CHANNEL_NAME);
aeoNotificationInboundGateway.setHeaderMapper(defaultSoapHeaderMapper);
aeoNotificationInboundGateway.setErrorChannel(errorChannel);
return aeoNotificationInboundGateway;
}
#Bean
public IntegrationFlow workingEnqueue() {
return IntegrationFlows.from(WORKING_INPUT_CHANNEL_NAME)
.enrichHeaders(...)
.wireTap(performanceTimestampRegistrator
.registerTimestampFlow(ASYNC_REQUEST_RECEIVED_TIMESTAMP_NAME))
.filter(...)
.filter(...)
.publishSubscribeChannel(channel -> channel
.subscribe(sendToKafkaFlow())
.subscribe(generateResponseFlow()))
.wireTap(performanceTimestampRegistrator
.registerTimestampFlow(ASYNC_REQUEST_ENQUEUED_TIMESTAMP_NAME))
.get();
}
Then, there is no problem with wireTap being the last component and response is correctly received on replyChannel in time, without any workarounds.

The behavior is expected.
When the wireTap() (or log()) is used in the end of flow, there is no reply by default.
Since we can't assume what logic you try to include into the flow definition, therefore we do our best with the default behavior - the flow becomes a one-way, send-and-forget one: some people really asked to make it non replyable after log() ...
To make it still reply to the caller you need to add a bridge() in the end of flow.
See more in docs: https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-log
It works with your much complex scenario because one of the subscriber for your publishSubscribeChannel is that generateResponseFlow() with the reply. Honestly you need to be careful with request-reply behavior and such a publishSubscribeChannel configuration. The replyChannel can accept only one reply and if you would expect a reply from several subscribers, you would be surprised how the behavior is strange.
The wireTap in this your configuration is not a subscriber, it is an interceptor injected into that publishSubscribeChannel. So, your assumption about similarity is misleading. There is the end of the flow after that wiretap, but since one of the subscribers is replying, you get an expected behavior. Let's take a look into the publishSubscribeChannel as a parallel electrical circuit where all the connections get an electricity independently of others. And they perform they job not affecting all others. Anyway this is different story.
To conclude: to reply from the flow after wireTap(), you need to specify a bridge() and reply message will be routed properly into the replyChannel from the caller.

SignalR client misses some events at a regular interval

I have a straightforward SignalR setup: OWIN-hosted .NET server and JavaScript client (both # v2.1.1). The client uses SignalR to synchronize its copy of an ordered event stream maintained in an Rx ReplaySubject on the server. When a client connects, it provides a startAfter query parameter that is used to initialize an IObserver against the ReplaySubject, and this observer then sends each event in the observed sequence to the client. Each event has a sequence number, and the client can tell, based on the event sequence number, if any event is missing in the sequence. (Which would be a serious problem in this application.)
The problem is that the client regularly receives only portions of the event sequence. In fact, there is a regular pattern to this. For every 250 events there is a large gap. So for example, each test shows that the first gap was from somewhere between 70 and 80 to 250. Why always 250? And from there on, the "skip-to" point is always in intervals of 250; e.g., a gap from 263 to 500, then one from 511 to 750, etc.. I have to assume that this is some kind of default buffer size.
Also, the first time a client connects to the server it always receives the entire sequence just fine. It's the subsequent connections that exhibit the regular skipping problem. So it seems like it's a server-side problem, and not a client problem at all.
I then added some checks to the server to ensure that the IObserver for each client is seeing all of the events in the correct order. It is. So it seems almost certain that the problem is on the SignalR server side and has nothing to do with Rx.
And finally, I checked to see if the dropped messages were perhaps just being delivered out of order (which I could live with, although I assumed SignalR provides an ordered-delivery guarantee). They are not - the messages just disappear into a void.
If it helps, I'm currently running locally, with IIS Express on Win 8.1 x64 and testing on IE Developer Channel as well as Chrome 36. The connection is using WebSockets. I couldn't find any reference to 250 as a special quantity in either the SignalR source (client or server) or the Rx.Net source.
Any suggestions on troubleshooting? I'd love to find a stable solution before I start building a complicated workaround.
Here's the relevant server-side code:
public class AllEventsReplaySource
{
private readonly IHubConnectionContext<dynamic> clients;
private readonly ReplaySubject<dynamic> allEvents;
private AllEventsReplaySource(IHubConnectionContext<dynamic> clients)
{
this.clients = clients;
this.allEvents = new ReplaySubject<dynamic>();
// (Not shown: code that generates the input to the ReplaySubject.)
}
public void SubscribeClient(string connectionId, int startAfter)
{
this.allEvents.Skip(startAfter).Subscribe(e =>
{
// (Not shown: code that verifies no skips are occurring at this point for a client.)
clients.Client(connectionId).notifyEvent(e);
});
}
private readonly static Lazy<AllEventsReplaySource> instance =
new Lazy<AllEventsReplaySource>(() => new AllEventsReplaySource(
GlobalHost.ConnectionManager.GetHubContext<AllEventsReplayHub>().Clients));
public static AllEventsReplaySource Instance
{
get { return instance.Value; }
}
}
[HubName("allEventsReplayHub")]
public class AllEventsReplayHub : Hub
{
private readonly AllEventsReplaySource source;
public AllEventsReplayHub()
: this(AllEventsReplaySource.Instance)
{ }
public AllEventsReplayHub(AllEventsReplaySource source)
{
this.source = source;
}
public override Task OnConnected()
{
var previousSequenceNumber = Int32.Parse(Context.QueryString["startAfter"]);
var connectionId = this.Context.ConnectionId;
AllEventsReplaySource.Instance.SubscribeClient(connectionId, previousSequenceNumber);
return base.OnConnected();
}
}

The issue you are experiencing seems consistent with a message buffer overflow. When SignalR releases messages from its buffer, it does so in 250 message fragments by default.
SignalR will buffer at least the last 1000 messages sent to a given connectionId. This means that when you send the 1251st message, the first 250 get dereferenced by the buffer. This explains why when a client first connects to the server, it receives the entire sequence of messages. You have to send at least 1251 messages to a given client before the buffer will drop fragments. Again, this is all assuming default settings.
While you could increase the DefaultMessageBufferSize, that probably will not fix your root problem. It seems that you are trying to send messages faster than the server can send them to the client. If you do that continuously, you will run out of buffer space no matter the size.
It's more common to reduce the DefaultMessageBufferSize rather than increase it, since the buffers can consume a lot of memory, especially if you are sending a lot of large unique messages to many different clients.
Your best bet to avoid overrunning the buffer is to have the client send an ACK at least every 1000 messages. Given this, it might be possible to avoid sending over 1000 unACKed messages thereby avoiding this problem altogether.
By the way, you can take a look at SignalR's message buffer implementation yourself if you feel so inclined. Note that the capacity constructor argument is the DefaultMessageBufferSize.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio