I am trying to use MassTransit batching to process multiple messages at once and reduce the number of individual database queries (reads and writes).
If an exception occurs while processing one or more of the messages, I expect to fault only those messages and still be able to process the rest.
This is a common scenario in my use case. What I am trying to establish is a way to perform batch processing that caters for poisoned messages.
For example, if I have a batch size of 10 messages, 10 in the queue and 1 persistently fails, I still need a means of ensuring the other 9 can be processed successfully. It is fine if all 10 need to be returned to the queue and a subset re-consumed, but the poisoned message needs to be eliminated somehow. Does this requirement discount the use of batching?
I have tried the following, but it did not solve my use case:
Catching the exception and raising NotifyFaulted for that specific message.
Modifying the Sample-Twitch application to throw an exception, as shown below, based on https://github.com/MassTransit/Sample-Twitch/blob/master/src/Sample.Components/BatchConsumers/RoutingSlipBatchEventConsumer.cs
public Task Consume(ConsumeContext<Batch<RoutingSlipCompleted>> context)
{
    if (_logger.IsEnabled(LogLevel.Information))
    {
        _logger.Log(LogLevel.Information, "Routing Slips Completed: {TrackingNumbers}",
            string.Join(", ", context.Message.Select(x => x.Message.TrackingNumber)));
    }

    for (int i = 0; i < context.Message.Length; i++)
    {
        try
        {
            // Simulate a business failure on every second message
            if (i % 2 != 0)
                throw new System.Exception("business error - message failed");
        }
        catch (System.Exception ex)
        {
            // Attempt to fault only the message that failed
            context.Message[i].NotifyFaulted(TimeSpan.Zero, "batch routing slip faulted", ex);
        }
    }

    return Task.CompletedTask;
}
I have dug into a few more threads that look similar to this issue, for reference:
Masstransit error handling for batch consumer
If you want to use batch, and have a message in that batch that cannot be processed, you should catch the exception and do something else with the poison message. You could write it someplace else, publish some type of event, or whatever else. But MassTransit does not allow you to partially complete/fault messages of a batch.
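The "catch the exception and do something else with the poison message" advice can be sketched independently of MassTransit. The following is a minimal Java illustration, not MassTransit code: `BatchSplitter`, `Result`, and `process` are hypothetical names, and the handler stands in for whatever per-message work the consumer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of MassTransit: runs a handler over each item
// of a batch and sets aside the items that threw instead of failing the batch.
final class BatchSplitter {

    /** Outcome of one pass over a batch. */
    static final class Result<T> {
        final List<T> processed = new ArrayList<>();
        final List<T> poisoned = new ArrayList<>();
    }

    static <T> Result<T> process(List<T> batch, Consumer<T> handler) {
        Result<T> result = new Result<>();
        for (T item : batch) {
            try {
                handler.accept(item);
                result.processed.add(item);
            } catch (RuntimeException ex) {
                // Do not rethrow: rethrowing would fault the whole batch.
                result.poisoned.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> batch = List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Result<Integer> result = process(batch, n -> {
            if (n == 7) {
                throw new IllegalStateException("business error");
            }
        });
        System.out.println("processed=" + result.processed);
        System.out.println("poisoned=" + result.poisoned);
    }
}
```

In this sketch the caller decides what to do with `poisoned` afterwards, for example writing those items elsewhere or publishing an event, while the batch as a whole still completes.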
API: Exception - reactor-code
The example works as follows:
It subscribes to the incoming data, which comes from Rabbit, and processes it. Processing can take a relatively long time, and the result is sent to another Rabbit queue.
Because of bulk processing, I use a buffer of 10 elements. If fewer than 10 elements arrive, a timeout on the buffer releases what has been collected for processing.
Problem: if processing or the Rabbit publisher is slow, bufferTimeout does not receive a "request" from downstream. When the timeout runs out, bufferTimeout tries to emit anyway, and I get the following exception: "Could not emit buffer due to lack of requests"
Since I need all the data, I cannot use onBackpressureDrop or onBackpressureLatest. Plain onBackPressure is not suitable either, because it does not forward the received request count (it uses request(unbounded), not request(n)).
Example Kotlin code:
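import java.time.Duration
import java.util.function.Function
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import reactor.core.publisher.Flux
import reactor.core.publisher.Mono

@Configuration
class SpringCloudStreamRabbitProcessor {
    @Bean
    fun rabbitFunc() = Function<Flux<Int>, Flux<Int>> {
        it.bufferTimeout(10, Duration.ofMinutes(1))
            .concatMap { intList ->
                // process
                Mono.just(intList)
            }
            .flatMapIterable { intList ->
                intList
            }
    }
}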
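For comparison only, the "10 elements or a timeout, whichever comes first" rule can be sketched outside Reactor with a pull-based `BlockingQueue`, where the consumer asks for the next batch only when it is ready, so nothing is emitted without demand. This is plain Java and not a fix for the Reactor pipeline above; `PullBatcher` and `nextBatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical pull-based batcher; it does not use Reactor.
final class PullBatcher {

    /** Collects up to maxSize items, waiting at most timeoutMillis in total. */
    static <T> List<T> nextBatch(BlockingQueue<T> source, int maxSize, long timeoutMillis) {
        List<T> batch = new ArrayList<>(maxSize);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (batch.size() < maxSize) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // timeout reached: release what has been collected
                }
                T item = source.poll(remaining, TimeUnit.NANOSECONDS);
                if (item == null) {
                    break; // nothing arrived before the deadline
                }
                batch.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the flag, return the partial batch
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 13; i++) {
            queue.add(i);
        }
        System.out.println(nextBatch(queue, 10, 200)); // a full batch of 10
        System.out.println(nextBatch(queue, 10, 200)); // the remaining 3, released by the timeout
    }
}
```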
Instead of using a logger or database server I'd like to append information to one file from possibly many verticle instances.
There are versions of methods for writing asynchronously to a file.
Can I assume that Vert.x handles the synchronisation between the writes, so that they don't interfere with each other when using the versions of the methods marked as "async"?
There seems to be a rule that one can rely on Vert.x to provide all isolation between concurrent processing out of the box. But is that also true for file writes?
Could you please include a code snippet in the answer that shows how to open and write to one file from many verticle instances with the finest possible granularity, e.g. for logging requests?
I wouldn't recommend writing to a single file with many different "writers". Regarding concurrent logging I would stick to the Single Writer principle.
Create a Verticle that subscribes to the Event Bus and listens for messages to be logged. Let's call this Verticle Logger; it listens on the address system.logger.
EventBus eb = vertx.eventBus();
eb.consumer("system.logger", message -> {
// write to file
});
Verticles that want to log something need to send a message to the Logger Verticle:
eventBus.send("system.logger", "foobar");
Appending to an existing file works something like this (untested):
vertx.fileSystem().open("file.log", new OpenOptions(), result -> {
    if (result.succeeded()) {
        Buffer buff = Buffer.buffer(message); // message from the consumer
        AsyncFile file = result.result();
        // NOTE: `i` is not defined in this snippet. It stands for the number of
        // writes so far; the offset is only correct if every message has the same length.
        file.write(buff, buff.length() * i, ar -> {
            if (ar.succeeded()) {
                System.out.println("done");
            } else {
                System.err.println("write failed: " + ar.cause());
            }
        });
    } else {
        System.err.println("open file failed " + result.cause());
    }
});
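The Single Writer principle itself does not depend on Vert.x. As a rough illustration in plain Java, many threads can hand lines to one dedicated writer thread, which is the only code that ever touches the file. `SingleWriterLog` is a hypothetical class, and a single-thread executor plays the role that the Logger Verticle and the event bus play above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical single-writer log: callers enqueue lines, one thread appends them.
final class SingleWriterLog implements AutoCloseable {
    private final Path file;
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    SingleWriterLog(Path file) {
        this.file = file;
    }

    /** Safe to call from any thread; the write itself runs on the writer thread. */
    void log(String line) {
        writer.execute(() -> {
            try {
                Files.writeString(file, line + System.lineSeparator(), StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Stops accepting new lines and waits for queued writes to finish. */
    @Override
    public void close() {
        writer.shutdown();
        try {
            writer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("single-writer", ".log");
        SingleWriterLog log = new SingleWriterLog(file);
        Thread a = new Thread(() -> log.log("from thread a"));
        Thread b = new Thread(() -> log.log("from thread b"));
        a.start();
        b.start();
        a.join();
        b.join();
        log.close();
        System.out.println(Files.readAllLines(file));
    }
}
```

Because every append goes through the same thread, lines from different callers cannot interleave within a line; the order between callers is simply the order in which their requests reach the executor.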
I've written a continuous JMS message receiver:
Here, I'm using CLIENT_ACKNOWLEDGE because I don't want this thread to acknowledge the messages.
(...)
connection.start();
session = connection.createQueueSession(true, Session.CLIENT_ACKNOWLEDGE);
queue = session.createQueue(QueueId);
receiver = session.createReceiver(queue);
while (true) {
    message = receiver.receive(1000);
    if (message != null) {
        // NB: I can only pass Strings to the other thread
        sendMessageToOtherThread(message.getText(), message.getJMSMessageID());
    }
    // TODO Implement criteria to exit the loop here
}
In another thread, I'll do something as follows (after successful processing) :
This is in a distinct JMS Connection executed simultaneously.
public void AcknowledgeMessage(String messageId) {
    if (this.first) {
        this.connection.start();
        this.session = this.connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
        this.queue = this.session.createQueue(this.QueueId);
    }
    QueueReceiver receiver = this.session.createReceiver(this.queue, "JMSMessageID='" + messageId + "'");
    Message AckMessage = receiver.receive(2000);
    receiver.close();
}
It appears that the message is not found (AckMessage is null after the timeout) even though it does exist in the queue.
I suspect the message is being blocked by the continuous input thread. Indeed, when I run AcknowledgeMessage() on its own, it works fine.
Is there a cleaner way to retrieve a single message based on its QueueId and messageId?
Also, I feel there could be a risk of a memory leak in the continuous reader if it has to keep the messages or IDs in memory for a long time. Is that concern justified?
If I use a QueueBrowser to avoid impacting the acknowledge thread, it looks like I cannot have this continuous input feed. Is that right?
More context : I'm using ActiveMQ and the 2 threads are 2 custom "Steps" of a Pentaho Kettle transformation.
NB : Code samples are simplified to focus on the issue.
Well, you can't read that message twice, since you have already read it in the first thread.
ActiveMQ will not delete the message because you have not acknowledged it, but it won't be visible to other consumers until you drop the JMS connection (I'm not sure whether ActiveMQ also has a long timeout for this).
So you will have to use the original message and do: message.acknowledge();.
Note, however, that sessions are not thread safe, so be careful if you do this in two different threads.
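One way to reconcile "acknowledge the original message" with "sessions are not thread safe" is to keep the original messages on the receiving thread and let the other thread hand back only the message ID. The sketch below is an assumption-laden illustration: `Ackable` is a hypothetical stand-in for `javax.jms.Message` so the code runs without a broker, and `ReceiverLoop`, `SimpleMessage`, `requestAck`, `onReceived`, and `drainAcks` are invented names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for javax.jms.Message, so the sketch needs no broker.
interface Ackable {
    String id();
    void acknowledge();
}

// Trivial Ackable used for the demonstration.
final class SimpleMessage implements Ackable {
    private final String id;
    volatile boolean acked = false;

    SimpleMessage(String id) { this.id = id; }
    public String id() { return id; }
    public void acknowledge() { acked = true; }
}

final class ReceiverLoop {
    // Touched only by the receiver thread, which owns the session.
    private final Map<String, Ackable> pending = new HashMap<>();
    // Thread-safe hand-off from the processing thread back to the receiver thread.
    private final BlockingQueue<String> ackRequests = new LinkedBlockingQueue<>();

    /** Called by the processing thread once a message has been handled. */
    void requestAck(String messageId) { ackRequests.add(messageId); }

    /** Called by the receiver thread right after receive() returns a message. */
    void onReceived(Ackable message) { pending.put(message.id(), message); }

    /** Called by the receiver thread between receive() calls; returns the number acknowledged. */
    int drainAcks() {
        int acked = 0;
        String id;
        while ((id = ackRequests.poll()) != null) {
            Ackable message = pending.remove(id); // removal keeps the map from growing forever
            if (message != null) {
                message.acknowledge();
                acked++;
            }
        }
        return acked;
    }

    int pendingCount() { return pending.size(); }

    public static void main(String[] args) throws InterruptedException {
        ReceiverLoop loop = new ReceiverLoop();
        SimpleMessage message = new SimpleMessage("ID:1");
        loop.onReceived(message);
        Thread worker = new Thread(() -> loop.requestAck("ID:1"));
        worker.start();
        worker.join();
        System.out.println("acked=" + loop.drainAcks() + ", pending=" + loop.pendingCount());
    }
}
```

Two JMS details are worth checking against this design. With `CLIENT_ACKNOWLEDGE`, calling `acknowledge()` on one message acknowledges every message the session has consumed so far, not just that one. And a session created with `transacted` set to `true`, as in the question's receiver, ignores the acknowledge mode and relies on `commit()` instead.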
We have a situation where we set up a component to run batch jobs remotely using Spring Batch. We send a JMS message with the job XML path, name, parameters, etc., and the calling batch client waits for a response from the server.
The server reads the queue and calls the appropriate method to run the job and return the result, which our messaging framework does by:
this.jmsTemplate.send(queueName, messageCreator);
this.LOGGER.debug("Message sent to '" + queueName + "'");
try {
final Destination replyTo = messageCreator.getReplyTo();
final String correlationId = messageCreator.getMessageId();
this.LOGGER.debug("Waiting for the response '" + correlationId + "' back on '" + replyTo + "' ...");
final BytesMessage message = (BytesMessage) this.jmsTemplate.receiveSelected(replyTo, "JMSCorrelationID='"
+ correlationId + "'");
this.LOGGER.debug("Response received");
Ideally, we want to be able to call our runJobSync method twice and have two jobs run simultaneously. We have a unit test that does something similar, without jobs. I realize this code isn't great, but here it is:
final List<String> result = Collections.synchronizedList(new ArrayList<String>());
Thread thread1 = new Thread(new Runnable() {
    @Override
    public void run() {
        client.pingWithDelaySync(1000);
        result.add(Thread.currentThread().getName());
    }
}, "thread1");
Thread thread2 = new Thread(new Runnable() {
    @Override
    public void run() {
        client.pingWithDelaySync(500);
        result.add(Thread.currentThread().getName());
    }
}, "thread2");
thread1.start();
Thread.sleep(250);
thread2.start();
thread1.join();
thread2.join();
Assert.assertEquals("both thread finished", 2, result.size());
Assert.assertEquals("thread2 finished first", "thread2", result.get(0));
Assert.assertEquals("thread1 finished second", "thread1", result.get(1));
When we run that test, thread 2 completes first since it only has a 500 millisecond wait, while thread 1 waits for 1 second:
Thread.sleep(delayInMs);
return result;
That works great.
When we run two remote jobs in the wild, one which takes about 50 seconds to complete and one which is designed to fail immediately and return, this does not happen.
Start the 50 second job, then immediately start the instant fail job. The client prints that we sent a message requesting that the job run, the server prints that it received the 50 second request, but waits until that 50 second job is completed before handling the second message at all, even though we use the ThreadPoolExecutor.
We are running transactional with Auto acknowledge.
Doing some remote debugging, the Consumer from AbstractPollingMessageListenerContainer shows no unhandled messages (so consumer.receive() obviously just returns null over and over). The web GUI for the AMQ broker shows 2 enqueues, 1 dequeue, 1 dispatched, and 1 in the dispatched queue. This suggests to me that something is preventing AMQ from letting the consumer "have" the second message. (Prefetch is 1000, by the way.)
This shows as the only consumer for the particular queue.
A few other developers and I have poked around for the last few days and are getting pretty much nowhere. Any suggestions on what we have misconfigured (if this is expected behavior) or what might be broken here?
Does the method that is being remotely called matter at all? Currently the job handler method uses an executor to run the job in a different thread and does a future.get() (the extra thread is for reasons related to logging).
Any help is greatly appreciated
I'm not sure I follow completely, but off the top of my head, you should try the following:
Set concurrentConsumers/maxConcurrentConsumers to a value greater than the default (1) on the MessageListenerContainer.
Set the prefetch to 0 to better promote balancing of messages between consumers, etc.
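Why the consumer count matters can be shown with a small simulation. This is plain Java, not Spring or ActiveMQ configuration: pool threads stand in for concurrent consumers, and `ConsumerConcurrencyDemo` and `completionOrder` are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simulation only: each pool thread plays the role of one message consumer.
final class ConsumerConcurrencyDemo {

    /** Submits a slow job, then a fast one, and returns the order in which they finished. */
    static List<String> completionOrder(int consumers) {
        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        List<String> finished = new CopyOnWriteArrayList<>();
        pool.execute(() -> {
            sleep(300); // stands in for the long-running job
            finished.add("slow");
        });
        pool.execute(() -> finished.add("fast")); // stands in for the job that returns at once
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // With one consumer the fast job waits behind the slow one.
        System.out.println("1 consumer:  " + completionOrder(1));
        // With two consumers the fast job no longer has to wait.
        System.out.println("2 consumers: " + completionOrder(2));
    }
}
```

The single-consumer case mirrors the behavior described in the question: the second message is not handled until the listener thread is free again, regardless of any executor used inside the handler, because the handler blocks on `future.get()`.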
The run method of my worker role is:
public override void Run()
{
    Message msg = null;
    while (true)
    {
        msg = queue.GetMessage();
        if (msg != null && msg.DequeueCount == 1)
        {
            //delete message
            ...
            //execute operations
            ...
        }
        else if (msg != null && msg.DequeueCount > 1)
        {
            //delete message
            ...
        }
        else
        {
            int randomTime = ...
            Thread.Sleep(randomTime);
        }
    }
}
For performance tests, I would like each message to be processed by only one worker (I am not considering worker failures).
But according to my tests, two workers can pick up the same message and both read a DequeueCount equal to 1. Is that possible?
Is there a way to let only one worker read a message, in a "mutex" fashion?
How is your "getAMessage(queue)" method defined? If you call PeekMessage(), the message stays visible to all workers. If you call GetMessage(), only the worker that gets it first receives it, but only for the duration of the visibility timeout, either the one you specify or the default (30 sec.). You have to delete the message before the visibility timeout expires.
Check out the Queue Service API for more information. I am sure that there is something wrong in your code. I use queues, and they behave as documented in both dev storage and production storage. You may want to explicitly set a higher visibility timeout when you call GetMessage, and make sure you do not sleep longer than the visibility timeout.
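The visibility timeout behavior described above can be modeled in a few lines. The following is an in-memory Java sketch, not the Azure SDK: `VisibilityQueue`, its methods, and the logical clock passed as `now` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// In-memory model of a queue with a visibility timeout; NOT the Azure SDK.
final class VisibilityQueue {

    static final class Message {
        final String body;
        int dequeueCount = 0;
        long invisibleUntil = 0; // logical time at which the message becomes visible again

        Message(String body) { this.body = body; }
    }

    private final List<Message> messages = new ArrayList<>();
    private final long visibilityTimeout;

    VisibilityQueue(long visibilityTimeout) { this.visibilityTimeout = visibilityTimeout; }

    synchronized void add(String body) { messages.add(new Message(body)); }

    /** Returns the first visible message and hides it until now + visibilityTimeout. */
    synchronized Optional<Message> getMessage(long now) {
        for (Message m : messages) {
            if (m.invisibleUntil <= now) {
                m.dequeueCount++;
                m.invisibleUntil = now + visibilityTimeout;
                return Optional.of(m);
            }
        }
        return Optional.empty();
    }

    /** Removes the message for good; must happen before the timeout expires. */
    synchronized void delete(Message m) { messages.remove(m); }

    public static void main(String[] args) {
        VisibilityQueue queue = new VisibilityQueue(30);
        queue.add("job-1");
        queue.getMessage(0); // worker A takes the message and never deletes it
        System.out.println(queue.getMessage(10).isPresent()); // false: still hidden from worker B
        Message again = queue.getMessage(31).get(); // timeout elapsed, worker B gets it
        System.out.println(again.dequeueCount); // 2
    }
}
```

In this model a second worker can only receive the same message after the timeout has expired without a delete, and it then sees a dequeue count of 2, which is why deleting promptly and not sleeping past the timeout matter.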