Speeding up Message Hub Kafka Java Console sample - message-hub

I've been working with the Message Hub sample code found at this link: https://github.com/ibm-messaging/message-hub-samples
In particular, I've been trying to increase the throughput of the producer with the Kafka Java console example. I noticed the documentation in this snippet of code:
// Synchronously wait for a response from Message Hub / Kafka on every message produced.
// For high throughput the future should be handled asynchronously.
RecordMetadata recordMetadata = future.get(5000, TimeUnit.MILLISECONDS);
producedMessages++;
I've already turned off the thread sleep found later in the code which also helped increase the throughput, but I was hoping I could get some help on implementing the future asynchronously in this block. Thanks in advance!

you have two basic options for handling the outcome of a produce request asynchronously
1) use the overloaded send with a completion callback argument, which will be invoked asynchronously:
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback);
if using the callback you may ignore the future.
2) pass the Future to some other thread you have created, and have it inspect the future for completion, while leaving the thread that calls send free to carry on.

Related

Project reactor - react to timeout happened downstream

Project Reactor has a variety of timeout() operators.
The very basic implementation raises TimeoutException in case no item arrives within the given Duration. The exception is propagated downstream , and to upstream it sends cancel signal.
Basically my question is: is it possible to somehow react (and do something) specifically to timeout that happened downstream, not just to cancelation that sent after timeout happened?
My question is based on the requirements of my real business case and also I'm wondering if there is a straight solution.
I'll simplify my code for better understanding what I want to achieve.
Let's say I have the following reactive pipeline:
Flux.fromIterable(List.of(firstClient, secondClient))
.concatMap(Client::callApi) // making API calls sequentially
.collectList() // collecting results of API calls for further processing
.timeout(Duration.ofMillis(3000)) // the entire process should not take more than duration specified
.subscribe();
I have multiple clients for making API calls. The business requirement is to call them sequantilly, so I call them with concatMap(). Then I should collect all the results and the entire process should not take more than some Duration
The Client interface:
interface Client {
Mono<Result> callApi();
}
And the implementations:
Client firstClient = () ->
Mono.delay(Duration.ofMillis(2000L)) // simulating delay of first api call
.map(__ -> new Result())
// !!! Pseudo-operator just to demonstrate what I want to achieve
.doOnTimeoutDownstream(() ->
log.info("First API call canceled due to downstream timeout!")
);
Client secondClient = () ->
Mono.delay(Duration.ofMillis(1500L)) // simulating delay of second api call
.map(__ -> new Result())
// !!! Pseudo-operator just to demonstrate what I want to achieve
.doOnTimeoutDownstream(() ->
log.info("Second API call canceled due to downstream timeout!")
);
So, if I have not received and collected all the results during the amount of time specified, I need to know which API call was actually canceled due to downstream timeout and have some callback for this "event".
I know I could put doOnCancel() callback to every client call (instead of pseudo-operator I demonstrated) and it would work, but this callback reacts to cancelation, which may happen due to any error.
Of course, with proper exception handling (onErrorResume(), for example) it would work as I expect, however, I'm interesting if there is some straight way to somehow react specifically to timeout in this case.

FireStore save document asynchronously in Java Spring Application

Per Java Example, a document can be written as
ApiFuture<WriteResult> future = db.collection("cities").document("LA").set(docData);
// ...
// future.get() blocks on response
System.out.println("Update time : " + future.get().getUpdateTime());
but if there is a big document and I do not want to wait(block) on it to finish but want let it finish it in background, I tried using
future.get(2, TimeUnit.MILLISECONDS).getUpdateTime()
Will it guarantee that the document be written, sometime I am getting following error
java.util.concurrent.TimeoutException: Waited 2 nanoseconds for
TransformFuture#aaac[status=PENDING, info=[inputFuture=
[com.google.api.core.ApiFutureToListenableFuture#qqqqa5b9],
function=[com.google.api.core.ApiFutures$ApiFunctionToGuavaFunction#3244n86]]] at
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:508)
at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get
(FluentFuture.java:93)
So does the application needs to wait for the write ( set) operation) to finish or Firestore will take care of it? App is running in google cloud Run.
Since you have an asynchronous operation, you need to ensure you only send a response to the caller once that asynchronous operation has completed.
See the documentation on avoiding background activities if CPU is allocated only during request processing, specifically the part about Spring Boot in there.
Failing to follow this guidance may result in your code getting terminated by Cloud Run before it completes, or you getting charged for code that is longer than is necessary.

Project reactor processors v3.X

We are trying to migrate from 2.X to 3.X.
https://github.com/reactor/reactor-core/issues/375
We have used the EventBus as event manager in our application(Low latency FX system) and it works very well for us.
After the change we decided to take every module and create his own processor to handle event.
1. Does this use seems to be correct from your point of view? Because lack of document at the current stage and after reviewing everything we could we don't really know what to do here
2. We have tried to use Flux in order to perform action every X interval
For example: Market is arriving 1000 for 1 second but we want to process an update only 4 time in a second. After upgrading we are using:
Processor with buffer and sending to another method.
In this method we have Flux that get list and try to work in parallel in order to complete his task.
We had 2 major problems:
1. Sometimes we received Null event which we cannot find that our system is sending to i suppose maybe we are miss using the processor
//Definition of processor
ReplayProcessor<Event> classAEventProcessor = ReplayProcessor.create();
//Event handler subscribing
public void onMyEventX(Consumer<Event> consumer) {
Flux<Event> handler = classAEventProcessor .filter(event -> event.getType().equals(EVENT_X));
handler.subscribe(consumer);
}
in the example above the event in the handler sometimes get null.. Once he does the stream stop working until we are restating server(Because only on restart we are doing creating processor)
2.We have tried to us parallel but sometimes some of the message were disappeared so maybe we are misusing the framework
//On constructor
tickProcessor.buffer(1024, Duration.of(250, ChronoUnit.MILLIS)).subscribe(markets ->
handleMarkets(markets));
//Handler
Flux.fromIterable(getListToProcess())
.parallel()
.runOn(Schedulers.parallel())
.doOnNext(entryMap -> {
DoBlockingWork(entryMap);
})
.sequential()
.subscribe();
The intention of this is that the processor will wakeup every 250ms and invoke the handler. The handler will work work with Flux parallel in order to make better and faster processing.
*In case that DoBlockingWork takes more than 250ms i couldn't understand what will be the behavior
UPDATE:
The EventBus was wrapped by us and every event subscribed throw the wrapped event manager.
Now we have tried to create event processor for every module but it works very slow. We have used TopicProcessor with ThreadExecutor and still very slow.. EventBus did the same work in high speed
Anyone has any idea? BTW when i tried to use DirectProcessor it seems to work much better that the TopicProcessor
Reactor 3 is built around the concept that you should avoid blocking as much as you can, so in your second snippet DoBlockingWork doesn't look good.
How are the events generated? Do you maybe have an listener-based asynchronous API to get them? If so, you could try using Flux.create.
For your use case of "we have 1000 events in 1 second, but only want to process 4", I'd chain a sample operator. For instance, sample(Duration.ofMillis(250)) will divide each second into 4 windows, from which it will only emit the last element.
The reference guide is being written, as well as a page where you can find links to external articles and learning material.There's a preview of the WIP reference guide here and the learning resources page here.

Not receiving Apache Camel Event Notifications under the smallest load

I have extended EventNotiferSupport, and set the isEnable() to respond True for all events. I have a notify() that logs what events I receive and the corresponding Exchange ID for the event.
I have added my ExchangeMessageNotifier with this.context.getManagementStrategy().addEventNotifier(this.exchangeMessageNotifier);
I run my program under basically no load, sending 1 message at a time 1 second delay between messages into Camel to send out. Everything works the way I expect. I receive my events everything looks good.
I decrease the delay between messages to 0 milliseconds, and I find that 1 out of approximately 20 messages I fail to receive one of the Events, (Often the Completed event).
Add a second thread sending at the same rate and I don't get any events for any messages.
What am I missing? I've done searches and I don't find anything that I need to do differently. Is there something I am missing?
I am using Apache Camel 2.16.3, and moved to 2.18.1 still see the same behavior.
Well found my own answer. Part of the fun of inheriting code without any informaiton.
In your implementation of the EventNotifierSupport you need to override the doStart() method and configure the EventNotifierSupport for what events you wish to receive.
protected void doStart() throws Exception {
// filter out unwanted events
setIgnoreCamelContextEvents(true);
setIgnoreServiceEvents(true);
setIgnoreRouteEvents(true);
setIgnoreExchangeCreatedEvent(true);
setIgnoreExchangeCompletedEvent(false);
setIgnoreExchangeFailedEvents(true);
setIgnoreExchangeRedeliveryEvents(true);
setIgnoreExchangeSentEvents(false);
}
This is in addition to doing the following:
public boolean isEnabled(EventObject event) {
return true;
}
Which enables you to determine if you want a particular event, out of the selected groups you had set in the doStart().
Once these changes were in I was receiving consistent events.

How to use multiple sessions per connection in a multi-threaded application?

Suppose I have one connection c and many session objects s1, s2 .. sn, each working in different threads t1, t2 ... tn.
c
|
-------------------------------------------------
| | | |
(t1,s1) (t2,s2) (t3,s3) ...... (tn,sn)
Now suppose one of the thread t3 wants to send a message to a particular queue q3 and then listen to the reply asynchronously. So it does the following:
1: c.stop();
2: auto producer = s3.createProducer(s3.createQueue(q3));
3: auto text = s3.createTextMessage(message);
4: auto replyQueue = s3.createTemporaryQueue();
5: text.setJMSReplyTo(replyQueue);
6: producer.send(text);
7: auto consumer = s3.createConsumer(replyQueue);
8: consumer.setMessageListener(myListener);
9: c.start();
The reason why I called c.stop() in the beginning and then c.start() in the end, because I'm not sure if any of the other threads has called start on the connection (making all the sessions asynchronous — is that right?) and as per the documentation:
"If synchronous calls, such as creation of a consumer or producer, must be made on an asynchronous session, the Connection.Stop must be called. A session can be resumed by calling the Connection.Start method to start delivery of messages."
So calling stop in the beginning of the steps and then start in the end seems reasonable and thus the code seems correct (at least to me). However, when I thought about it more, I think the code is buggy, as it doesn't make sure no other threads call start before t3 finishes all the steps.
So my questions are:
Do I need to use mutex to ensure it? Or the XMS handles it automatically (which means my reasoning is wrong)?
How to design my application so that I dont have to call stop and start everytime I want to send a messages and listen reply asynchronously?
As per the quoted text above, I cannot call createProducer() and createConsumer() if the connection is in asynchronous mode. What are other methods I cannot call? The documentation doesn't categorise the methods in this way:
Also, the documentation doesn't say clearly what makes a session asynchronous. It says this:
"A session is not made asynchronous by assigning a message listener to a consumer. A session becomes asynchronous only when the Connection.Start method is called."
I see two problems here:
Calling c.start() makes all sessions asynchronous, not just one.
If I call c.start() but doesn't assign any message listener to a consumer, are the session(s) still asynchronous?
It seems I've lots of questions, so it'd be great if anyone could provide me with links to the parts or sections of the documentation which explains XMS objects with such minute details.
This says,
"According to the specification, calling stop(), close() on a Connection, setMessageListener() on a Session etc. must wait till all message processing finishes, that is till all onMessage() calls which have already been entered exit. So if anyone attempts to do that operation inside onMessage() there will be a deadlock by design."
But I'm not sure if that information is authentic, as I didn't find this info on IBM documentation.
I prefer the KIS rule. Why don't you use 1 connection per thread? Hence, the code would not have to worry about conflicts between threads.

Resources