Can I convert an input message into multiple messages, i.e. replicate the input message and add variable content based on each element of a JSON array in the message?
If the replication and content addition succeeds for an element of the JSON array, that message should go to a success topic. However, if the processing fails for any array element, the message should go to a failure topic for retry.
Is this possible to achieve with Kafka streams?
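For reference, a minimal Kafka Streams sketch of that fan-out and routing, assuming string payloads and default serdes; the topic names and the extractArrayElements/enrich helpers are placeholders rather than a definitive implementation:

import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.KStream;

public class FanOutSketch {

    record Result(boolean ok, String payload) {}

    static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");

        // One input record becomes one output record per JSON array element.
        KStream<String, Result> expanded = input.flatMapValues(value -> {
            List<Result> out = new ArrayList<>();
            for (String element : extractArrayElements(value)) {
                try {
                    out.add(new Result(true, enrich(value, element)));
                } catch (Exception e) {
                    out.add(new Result(false, value)); // keep the original message for retry
                }
            }
            return out;
        });

        // Route successes and failures to separate topics.
        expanded.split()
                .branch((key, result) -> result.ok(),
                        Branched.withConsumer(s -> s.mapValues(Result::payload).to("success-topic")))
                .defaultBranch(
                        Branched.withConsumer(s -> s.mapValues(Result::payload).to("failure-topic")));

        return builder;
    }

    // Hypothetical helpers: JSON parsing and enrichment are application-specific.
    static List<String> extractArrayElements(String json) { return List.of(); }
    static String enrich(String original, String element) { return original + element; }
}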
I'm creating a load test where I try to send three types of JSON messages through an ActiveMQ topic to a server. After sending the first message I get 3 responses; after sending the second I get 2 responses, according to the business logic.
One iteration sequentially:
publish message1
consume 3 responses as a result of successful processing message1
publish message2
consume 2 responses as a result of successful processing message2
etcetera
I need to run 50 parallel iterations without mixing up messages from different iterations. How can I do that?
I tried a JMS selector, but it can only filter messages by headers, and I don't have any header that is specific to each response.
Can I filter messages by, for example, a UUID? And how could that be implemented? I tried to find the needed info on the Internet, but without results.
I would be very thankful for any advice and help with this!
Yes, messages can be filtered by either a header (from the fixed set of JMS header names) or a property (a custom key-value pair).
JMSCorrelationID may be a good bet here. You can publish all messages for a given producer (or iteration) with the same JMSCorrelationID and then check the consumer counts that way.
i.e. for producer1 set: JMSCorrelationID = 'producer-1'
for producer2 set: JMSCorrelationID = 'producer-2'
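A rough JMS sketch of that approach; the topic names are placeholders, and it assumes the server copies the request's JMSCorrelationID onto its responses (otherwise the selector has nothing to match):

import javax.jms.Connection;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

public class CorrelatedIterationSketch {

    static void runIteration(Connection connection, String iterationId) throws JMSException {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic requests = session.createTopic("load.test.requests");
        Topic responses = session.createTopic("load.test.responses");

        // Stamp every request of this iteration with the same correlation id.
        MessageProducer producer = session.createProducer(requests);
        TextMessage request = session.createTextMessage("{\"type\":\"message1\"}");
        request.setJMSCorrelationID(iterationId);
        producer.send(request);

        // Consume only the responses that carry this iteration's correlation id.
        MessageConsumer consumer =
                session.createConsumer(responses, "JMSCorrelationID = '" + iterationId + "'");
        for (int i = 0; i < 3; i++) {                  // 3 expected responses for message1
            Message response = consumer.receive(5000); // wait up to 5 s per response
            // ... verify the response
        }
    }
}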
I'm sending messages to my message queue like this:
messages.forEach(message ->
        sources.output().send(MessageBuilder.withPayload(message).build()));
Those messages come from an external source and there could be thousands of them.
I've seen the Splitter, but it requires an input channel and an output channel, whereas my messages are going into the queue for the first time: I'm only producing messages, not consuming them. I'm also not sure how the Aggregator would work here, or whether it would be too complex for such a simple scenario.
So basically I'd like to be able to send those messages in batches, rather than one by one.
How could that be accomplished?
For something simple you can collect the data (messages or just payloads) into a List, create a single Message with that List as the payload, and send it.
For a more configurable approach you can also use the Spring Integration Aggregator.
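A minimal sketch of the first suggestion, assuming the sources binding from the question and String payloads; the class and method names are just illustrative:

import java.util.List;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.support.MessageBuilder;

public class BatchSenderSketch {

    // Instead of one send() per element, send the whole collection as a single message.
    static void sendAsBatch(MessageChannel output, List<String> messages) {
        output.send(MessageBuilder.withPayload(messages).build());
    }
}

You would then call sendAsBatch(sources.output(), messages) in place of the forEach loop above.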
I would like to process multiple messages at a time, e.g. get 10 messages from the channel at once and write them to a log file in one go.
Given that scenario, can I write a service activator which gets messages in a predefined batch, i.e. 5 or 10 messages, and processes them together? If this is not possible, how can I achieve this with Spring Integration?
That is exactly what you can get with the Aggregator. You can collect several messages into a group using a simple release expression like size() == 10. When the group is complete, the DefaultAggregatingMessageGroupProcessor emits a single message whose payload is the list of payloads of the messages in the group. You can then send that result to a service-activator to handle the whole batch at once.
UPDATE
Something like this:
.aggregate(aggregator -> aggregator
        .correlationStrategy(message -> 1)
        .releaseStrategy(group -> group.size() == 10)
        .outputProcessor(g -> new GenericMessage<Collection<Message<?>>>(g.getMessages()))
        .expireGroupsUponCompletion(true))
So, we correlate messages (group or buffer them) by the static key 1.
The group (or buffer) size is 10; when we reach it, we emit a single message containing all the messages from the group. After emitting the result we remove the group from the store, so a new group can be formed for the next sequence of messages.
It depends on what is creating the messages in the first place; if a message-driven channel adapter, the concurrency in that adapter is the key.
For other message sources, you can use an ExecutorChannel as the input channel to the service activator, with an executor with a pool size of 10.
Depending on what is sending messages, you need to be careful about losing messages in the event of a server failure.
It's difficult to provide a general answer without more information about your application.
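A hedged Java-config sketch of the ExecutorChannel suggestion; the channel and handler names are placeholders:

import java.util.concurrent.Executors;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.channel.ExecutorChannel;
import org.springframework.integration.config.EnableIntegration;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;

@Configuration
@EnableIntegration
public class ConcurrentHandlingSketch {

    // Channel backed by a fixed pool of 10 threads, so up to 10 messages
    // are handed to the service activator concurrently.
    @Bean
    public MessageChannel inputChannel() {
        return new ExecutorChannel(Executors.newFixedThreadPool(10));
    }

    @ServiceActivator(inputChannel = "inputChannel")
    public void handle(Message<?> message) {
        // process a single message; up to 10 of these run in parallel
    }
}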
Collector node issue: I am currently using a Collector node to group messages (XMLs). My requirement is to collect messages until the last message is received (reading from a file input).
Control terminal: I'm sending a control message to stop collection and propagate to the next node, but this doesn't work: the node still waits for the timeout/quantity condition to be satisfied.
MY QUESTION: What condition can I use to collect messages until the last message is received?
Add a separate input terminal on the Collector node that is used to complete a collection. Once you send a message to the second terminal, the collection is complete and propagated.
The Control terminal can be used to signal the Collector node when complete collections are propagated, not to determine when a collection is complete.
A collection is complete when either the set number of messages are received or the timeout is exhausted for all input terminals.
So if you don't know in advance how many messages you want to include in a collection, you have 3 options:
Set the message quantity to 0 and set an appropriate timeout for the input terminals.
This way the node will include in the collection all messages received between the first message and the timeout.
Set a large number as the message quantity and use collection expiry.
With collection expiry, incomplete collections can be propagated to the expiry terminal, but this works essentially the same way as the previous method.
Develop your own collector flow.
You can develop a flow that combines messages using MQ Input, MQ Get and MQ Output nodes, keeping the intermediate combined messages in MQ queues. Use this flow to combine your inputs and send the complete message on to the input queue of your processing flow.
My topology looks like this: [topology diagram: Data_Enrichment_Persistence_Topology]
So basically the problem I am trying to solve here is that every time any issue occurs in the Stop or Load service bolts and a tuple fails, it is replayed and the spout re-emits it. This makes the Cassandra bolt reprocess the tuple and rewrite the data.
I cannot leave the tuples in the Load and Stop bolts unanchored, as I need them to be replayed in case of any failure. However, I only want the upper workflow to be replayed.
I am using a KafkaSpout to emit data (it is emitting on the "default" stream). I'm not sure how to duplicate the streams at the KafkaSpout's emit level.
If I can duplicate the streams, a replay on either of the two will only re-emit the message on that particular stream at the spout level, leaving the other stream untouched, right?
TIA!
You need to use two output streams in your Spout -- one for each downstream path. Furthermore, you emit each tuple to both streams (using different message-ids).
Thus, if one fails, you can replay that tuple on just that stream.
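A rough sketch of such a spout against the Storm 2.x API; the stream, field and record names are placeholders, and the real source (e.g. Kafka) is stubbed out:

import java.util.Map;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class DualStreamSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;

    @Override
    public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        String record = fetchNextRecord(); // placeholder for the real source (e.g. Kafka)
        if (record == null) {
            return;
        }
        // Same payload, two streams, two message ids: each branch acks/fails independently,
        // so a failure downstream replays only the tuple on that stream.
        collector.emit("enrichment-stream", new Values(record), record + "-enrichment");
        collector.emit("persistence-stream", new Values(record), record + "-persistence");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream("enrichment-stream", new Fields("value"));
        declarer.declareStream("persistence-stream", new Fields("value"));
    }

    private String fetchNextRecord() {
        return null; // placeholder
    }
}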