Spring Integration message processing partitioned by header information

Spring Integration message processing partitioned by header information - spring

I want to be able to process messages with Spring Integration in parallel. The messages come from multiple devices and we need to process messages from the same device in sequential order but the devices can be processed in multiple threads. There can be thousands of devices so I'm trying to figure out how to assign processor based on mod of the device ID using Spring Integration's semantics as much as possible. What approach should I be looking at?

It's difficult to generalize without knowing other requirements (transaction semantics etc) but probably the simplest approach would be a router sending messages to a number of QueueChannels using some kind of hash algorithm on the device id (so all messages for a particular device go to the same channel).
Then, have a single-threaded poller pulling messages from each queue.
EDIT: (response to comment)
Again, difficult to generalize, but...
See AbstractMessageRouter.determineTargetChannels() - a router actually returns a physical channel object (actually a list, but in most cases a list of 1). So, yes, you can create the QueueChannels programmatically and have the router return the appropriate one, based on the message.
Assuming you want all the messages to then be handled by the same downstream flow, you would also need to create a <bridge/> for each queue channel to bridge it to the input channel of the next component in the flow.
create a QueueChannel
create a BridgeHandler (set the outputChannel to the input channel of the next component)
create a PollingConsumer (constructor takes the channel and handler; set trigger etc)
start() the consumer.
All of this can be done in your custom router initialization and implement determineTargetChannels() to select the queue.
Depending on the processing time for your events, I would generally recommend running the downstream flow on the poller thread rather than setting a taskExecutor to avoid issues with the next poll trying to schedule another task before this one's done. You might need to increase the default taskScheduler's pool size.

Related

Is it possible to define a single saga which will process many messages

My team is considering if we can use mass transit as a primary solution for sagas in RabbitMq (vs NServiceBus). I admit that our experience which solution like masstransit and nserviceBus are minimal and we have started to introduce messaging into our system. So I sorry if my question will be simple or even stupid.
However, when I reviewed the mass transit documentation I noticed that I am not sure if that is possible to solve one of our cases.
The case looks like:
One of our components will produce up to 100 messages which will be "sent" to queue. These messages are a result of a single operation in a system. All of the messages will have the same Correlated Id and our internal publication id (same too).
1) is it possible to define a single instance saga (by correlated id) which will wait until it receives all messages from a queue and then process them as a single batch?
2) otherwise, is there any solution to ensure all of the sent messages was processed? (Consistency batch?) I assume that correlated Id will serve as a way to found an existing saga instance (singleton). In the ideal case, I would like to complete an instance of a saga When the system will process every message which belongs to a single group (to one publication)
I look at CompositeEvent too but I do not sure if I could use it to "ensure" that every message was processed and then I would let to complete saga for specific correlated Id.
Can you explain how could it be achieved? And into what mechanism I should look at in order to correlated id a lot of messages with the same id to the single saga and then complete if all of msg will be consumed?
Thank you in advance for any response

What you describe is how correlation by id works. It is like that out of the box.
So, in short - when you configure correlation for your messages correctly, all messages with the same correlation id will be handled by the same saga instance.
Concerning the second question - unless you publish a separate event that would inform the saga about how messages it should expect, how would it know that? You can definitely schedule a long timeout, attempting and assuming that within the timeout all the messages will be received by the saga, but it's not reliable.
Composite events won't help here since they are for messages with different types to be handled as one when all of them arrive and it doesn't count for the number of messages of each type. It just waits for one message of each type.

The ability to receive a series of messages and then operate on them in a batch is a common case, so much so that there is a sample showing how to do just that:
Batch Sample
Each saga instance has a unique correlation identifier, and as long as those messages can be correlated to that single instance, MassTransit will manage the concurrency (either optimistic or pessimistic, and depending upon the saga storage engine).
I'd suggest reviewing the state machine in the sample, and seeing how that compares to your scenario.

bind destinations dynamically for producers and consumers (Spring)

I'm trying to send and receive messages to channels/topics whose destination names are in a database, so they can be added/modified/deleted at runtime, but I'm surprised I have found little on the web. I'm using Spring Cloud Streams to allow to change the underlying broker.
To send messages to dynamically bound destinations I'm going with BinderAwareChannelResolver.resolveDestination(target).send(message), but I haven't found something that works like it to receive messages.
My questions are:
1. Is there something similar?
2. how can the message be processed periodically as #StreamListener does?
3. And not as important, but can you create a subscriber automatically in case there is none?
Thanks for any help!

This is a bit out of scope of the original design of the framework. But I would further question your architecture. . . If you truly desire to subscribe to unlimited amount of destinations I wonder why? What is the underlying business requirement?
Keep in mind that even if we were to do it somehow that would require creation of a message listener container dynamically for each new destination which would raise more questions, such as, how long would such container have to live since eventually you would run out of resources.
If, however, you simply asking about possibility of mapping multiple destinations to a single channel so all messages go to the same message handler (e.g., StreamListener), then you can simply use input destination property and define multiple destination delimited by comas.

Strategy for passing same payload between messages when optional outbound gateways fail

I have a workflow whose message payload (MasterObj) is being enriched several times. During the 2nd enrichment () an UnknownHostException was thrown by an outbound gateway. My error channel on the enricher is called but the message the error-channel receives is an exception, and the failed msg in that exception is no longer my MasterObj (original payload) but it is now the object gotten from request-payload-expression on the enricher.
The enricher calls an outbound-gateway and business-wise this is optional. I just want to continue my workflow with the payload that I've been enriching. The docs say that the error-channel on the enricher can be used to provide an alternate object (to what the enricher's request-channel would return) but even when I return an object from the enricher's error-channel, it still takes me to the workflow's overall error channel.
How do I trap errors from enricher's + outbound-gateways, and continue processing my workflow with the same payload I've been working on?
Is trying to maintain a single payload object for the entire workflow the right strategy? I need to be able to access it whenever I need.
I was thinking of using a bean scoped to the session where I store the payload but that seems to defeat the purpose of SI, no?
Thanks.

Well, if you worry about your MasterObj in the error-channel flow, don't use that request-payload-expression and let the original payload go to the enricher's sub-flow.
You always can use in that flow a simple <transformer expression="">.
On the other hand, you're right: it isn't good strategy to support single object through the flow. You carry messages via channel and it isn't good to be tied on each step. The Spring Integration purpose is to be able to switch from different MessageChannel types at any time with small effort for their producers and consumers. Also you can switch to the distributed mode when consumers and producers are on different machines.
If you still need to enrich the same object several times, consider to write some custom Java code. You can use a #MessagingGateway on the matter to still have a Spring Integration gain.
And right, scope is not good for integration flow, because you can simply switch there to a different channel type and lose a ThreadLocal context.

Spring integration priority channel with round robin consumer

I am trying to implement a kind of Priority Channel with spring integration but I am blocked and didn't find a solution on the web.
I would like to read multiples channel (6) alternatively with a service activator. Each channel is for a priority level (CRITICAL, HIGHEST, HIGH, NORMAL, LOW, LOWEST). Message come from RabbitMQ and are distributed on correct channel with a Router.
The problem is that I would like to create a Service Activator who read alternatively in the channels using a round robin based on time.
For example, CRITICAL should have a 5 secondes running time, and then the service switch to HIGHEST for 3 seconds, and then to HIGH for 1 second, ...
Is it possible to do it properly with spring integration ?
Maybe I don't use the correct component to do it ?
Regards

The Priority Channel pattern works a bit different way.
It is a queue with sort support. When a new message arrives to the queue it is sorted to the proper place according to its priority. That absolutely does matter how your consumer of this channel works. The priority happens only in the channel. The consumer just polls messages from that queue like they are ordered for it: the CRITICAL, than HIGHEST, if CRITICAL aren't present and so on.
On the other hand, if you distribute messages by priority do different channels, why just don't have separate Service Activators for each of those channels? And each priority will be read by its own process.
There is no such a solution based on the "time to run". It just doesn't seem with a good fit for messaging architecture. Although you might can implement via scheduled task cancel() or Quartz to "perform task until...".
UPDATE
Regarding time control, I think you can come up with the solution which in the infinite loop really start()s different service activators and stop()s them after an appropriate scheduled time. All those service activators should listen to different queue channels.

Different message types (XMLs) on one TIBCO queue?

I am trying to implement an application(Java) which will subscribe to different message types (XMLs) from other different applications via TIBCO EMS. Each of these message types will have a specific purpose. I am of the opinion that I should have multiple queues with multiple subscribers in my application, however, the TIBCO guy is adamant that there should be only one queue where all of these messages will be published and I will have one subscriber and the subscriber then should have logic to different tasks based on the XML received.
Which approach is better? One with multiple queues and subscribers OR the one queue and one subscriber? Please let me know reasons for the choice.
Thanks!
-Naveen

In general, if the same application is reading all the messages, it is much cleaner for that application to have a single input queue instead of multiple input queues. With multiple then the application will need to have logic to know which order to process the queues and so on. With one input queue, the messaging system can deal with the order of the messages - whether FIFO or by priority etc, and the application can just read the next message and process it.

Use unique message header for each type of xml while sending the message. And use message selectors / filters while receiving the same, so that it can be routed / delegated to the respective handler based on the header value. This way, you will be able to handle different type of xml messages by single queue as well.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio