Claim Check Out when using multiple threads in Spring Integration

I have a huge XML document that comes in as the input payload to my Spring Integration flow. To retain the payload I am using a claim-check-in transformer instead of a header enricher, backed by an in-memory message store.
Later on in my SI flow, a splitter splits the payload and hands the pieces off to multiple threads, and each thread invokes a different channel based on one of the payload's attributes; I am using a router to achieve this. Each flow (each thread) uses a claim-check-out transformer to retrieve the initial payload and then uses it to build the required response. Each thread produces a response and I don't have to aggregate them, so multiple responses come out of my flow and are then dropped onto a queue.
I cannot remove the message during check-out because the other threads will also try to check out the same message. What is the best way to remove the message from the message store?
Sample configuration
<int:chain input-channel="myInputChannel" output-channel="myOutputchannel">
    <int:claim-check-in />
    <int:header-enricher>
        <int:header name="myClaimCheckID" expression="payload"/>
    </int:header-enricher>
</int:chain>
All the other components in the flow are invoked before the splitter:
<int:splitter input-channel="mySplitterChannel" output-channel="myRouterChannel" expression="mySplitExpression">
</int:splitter>
<int:router input-channel="myRouterChannel" expression="routerExpression"
            resolution-required="true">
    <int:mapping value="A" channel="aChannel" />
    <int:mapping value="B" channel="bChannel" />
    <int:mapping value="C" channel="cChannel" />
</int:router>
Each of these channels has a claim-check-out transformer to retrieve the initial payload. So how do I make sure the message is removed from the store after all the threads have finished processing?

When you know you are done with the message you can simply invoke the message store's remove() method. You could use a service activator with
... expression="#store.remove(headers['myClaimCheckID'])" ...
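For example, a minimal sketch along those lines (the bean and channel names here are assumptions, and the store bean is reached with SpEL bean-reference syntax):
<bean id="messageStore"
      class="org.springframework.integration.store.SimpleMessageStore"/>

<int:claim-check-in input-channel="checkInChannel"
                    output-channel="afterCheckInChannel"
                    message-store="messageStore"/>

<!-- invoked when you know all branches are done with the original payload;
     remove() returns the deleted Message, which is discarded to nullChannel -->
<int:service-activator input-channel="cleanupChannel"
                       output-channel="nullChannel"
                       expression="@messageStore.remove(headers['myClaimCheckID'])"/>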
However, if you are using an in-memory message store there is really no point in using the claim check pattern.
If you simply promote the payload to a header, it will use no more memory than putting it in a store.
Even if it ends up in multiple messages on multiple threads, it makes no difference since they'll all be pointing to the same object on the heap.
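For example, a header enricher along these lines (the header name originalPayload is arbitrary) keeps the original payload available to every downstream thread without a store:
<int:header-enricher input-channel="myInputChannel" output-channel="myOutputchannel">
    <int:header name="originalPayload" expression="payload"/>
</int:header-enricher>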

Related

Spring AMQP and concurrency

I have a “listener-container” defined like this:
<listener-container concurrency="1" connection-factory="connectionFactory" prefetch="10"
message-converter="jsonMessageConverter"
error-handler="clientErrorHandler"
mismatched-queues-fatal="true"
xmlns="http://www.springframework.org/schema/rabbit">
<listener ref="clientHandler" method="handleMessage" queue-names="#{marketDataBroadcastQueue.name}" />
</listener-container>
I want to process the messages in sequential order, so I need to set concurrency to 1.
But the bean “clientHandler” has more than one “handleMessage” method (with different Java classes as parameters). I can see in the application logs that messages are not processed one by one; several messages are processed in parallel. Can this be due to having multiple methods with the same name that process those messages?
Thanks!

Using barrier to wait for integration flow to complete

I have an integration flow where some of the steps are async and some are sync. I want to use the barrier to block the main thread until all async tasks have completed. Based on the documentation, there are two ways to use the barrier:
1. Send a second trigger message to the input channel of the barrier.
2. Invoke the barrier's trigger() method manually.
In my use case a message comes into the flow and then goes through several components until it reaches the completed channel. I want the main thread to be blocked until the original message reaches the completed channel, so it seems appropriate to use option #2 and invoke the barrier's trigger method after reaching the completed state. This doesn't seem to work. Here is a simplified version of my flow:
<int:gateway
service-interface="...BarrierGateway"
id="barrierGateway" default-request-channel="input">
</int:gateway>
<int:channel id="input">
<int:dispatcher task-executor="executor" />
</int:channel>
<int:service-activator input-channel="input" output-channel="completed">
<bean class="...BarrierSA" />
</int:service-activator>
<int:channel id="completed" />
<int:service-activator input-channel="completed"
ref="barrier1.handler" method="trigger" />
<int:barrier id="barrier1" input-channel="input" timeout="10000" />
I am sending a message to the gateway, which passes it to the input channel; that channel uses a dispatcher, so a new thread is started to carry the message forward. At this point I want to block the main thread while the Executor-1 thread goes through the flow. The rest of the flow is simple: my service-activator sleeps for 3 seconds before returning the message to simulate a delay. Once the message is received on the completed channel, the service-activator should invoke the barrier's trigger method, and only at that point should the main thread be released. Instead, the main thread is released right after the dispatcher starts a new thread. I have tried specifying a constant correlation id ('abc') but that didn't help.
I see you're caught in a trap.
The <int:barrier> suspends only the thread that carries the message into it. Looking at your config, that is the same input channel backed by an executor. The purpose of an ExecutorChannel is to shift the message to a different thread, not to suspend the caller's thread.
On top of that, there is another mistake around that input channel: you declare two subscribers on it, but only one of them will be called for any given message because of the round-robin load-balancing strategy.
To fix this, introduce one more top-level channel, a <publish-subscribe-channel>; that kind of channel really can have two subscribers. One of them should be a <bridge> to your input ExecutorChannel, and the other the desired <barrier>. Only then can the barrier suspend (block, in your terms) the main thread coming from the <gateway>, as sketched below.
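A minimal sketch of that rework (the channel id inputPubSub is an assumption; the gateway's default-request-channel would now point at it):
<int:publish-subscribe-channel id="inputPubSub" />

<!-- subscriber 1: bridge into the existing executor-backed input channel -->
<int:bridge input-channel="inputPubSub" output-channel="input" />

<!-- subscriber 2: the barrier suspends the calling (main) thread until trigger() is invoked -->
<int:barrier id="barrier1" input-channel="inputPubSub" timeout="10000" />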
On the other hand, a simpler solution would be to not use the <barrier> at all. The <gateway> is able to block the caller's thread and wait for the reply; of course, that only works when the gateway method isn't void.
One more point about your config: if you don't wait for a reply in the gateway, the <barrier> will fail with
throw new DestinationResolutionException("no output-channel or replyChannel header available");
So consider using something as an output-channel there as well.
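For illustration, a sketch of that simpler, barrier-free alternative (the finalHandler bean and the reply timeout are assumptions): the gateway method returns a value, and the last endpoint omits output-channel so its result travels back to the gateway via the replyChannel header.
<int:gateway id="barrierGateway"
             service-interface="...BarrierGateway"
             default-request-channel="input"
             default-reply-timeout="10000" />

<!-- no output-channel: the reply is routed back to the gateway's caller -->
<int:service-activator input-channel="completed" ref="finalHandler" />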

Spring Integration Auditing

First want to say that Spring Integration is some great stuff. Hats off to the team for such a solid framework.
Here is my current challenge. My goal is to handle the cross-cutting concern of auditing within a message flow: for example, storing the current SI Message in flight, its message ID, all payloads contained within the Message, and context-specific "attributes" that belong to the Message such as "orderId", "customerId", "partId", etc.
You can imagine a synchronous flow that may be running for 6 months where reports might need to be run to determine availability of that particular service (e.g. failed transactions versus successful, transactions with particular "attribute" values, transactions in a certain status, failover occurrences, etc).
If I have the following flow:
gateway->channel1->object-to-json-transformer->channel2->outbound-gateway
The gateway has a single method that takes an argument of type MyRequest and returns MyResponse. When the flow starts, I can wire-tap channel1 and route all data on that channel to an audit channel, for example auditChannel:
<int:channel id="auditChannel"/>
<int-jdbc:outbound-channel-adapter data-source="auditDataSource" channel="auditChannel"
query="insert into MESSAGE (PAYLOAD,CREATED_DATE) values (:payload, :createdDate)"
sql-parameter-source-factory="messageSpelSource"/>
<bean id="messageSpelSource"
class="org.springframework.integration.jdbc.ExpressionEvaluatingSqlParameterSourceFactory">
<property name="parameterExpressions">
<map>
<entry key="payload" value="payload.toString()"/>
<entry key="createdDate" value="new java.util.Date()"/>
</map>
</property>
</bean>
The above subflow (from channel1 through auditChannel) does not result in a Message object for the payload Map entry; instead, the type is MyRequest. This makes sense, since I would not want to marshal a Message instance outbound, but it still leaves me without access to the Message envelope for auditing purposes.
If my intention is to provide a generic auditing facility that persists on demand to a common integration database schema (e.g. to a MESSAGE (message_id, correlation_id, payload, timestamp) table and MESSAGE_ATTRIBUTE (attribute_id, message_id, name, value) table), how can I ensure that I always have access to the core Message instance whenever I wire-tap a channel within the flow?
This use case is something that I have had to deal with many years ago with a custom integration framework so I know it is a valid concern.
I hope my request is not too far-fetched. Perhaps there is a simple way to handle this and I am just not seeing it.
It's not entirely clear what you consider to be the problem; you can add more parameters, such as...
<entry key="timestamp" value="headers['timestamp']"/>
...what am I missing in your question?
The whole message is available using "#this".
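For instance, a sketch of an extended parameter-source factory along those lines (the parameter keys here simply mirror the MESSAGE table columns from the question and are assumptions):
<bean id="messageSpelSource"
      class="org.springframework.integration.jdbc.ExpressionEvaluatingSqlParameterSourceFactory">
    <property name="parameterExpressions">
        <map>
            <!-- the framework-assigned message id header -->
            <entry key="messageId" value="headers['id'].toString()"/>
            <entry key="payload" value="payload.toString()"/>
            <entry key="timestamp" value="headers['timestamp']"/>
            <!-- the whole Message envelope -->
            <entry key="message" value="#this.toString()"/>
        </map>
    </property>
</bean>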

Any concept of a global variable in streams in spring xd?

Scenario: A stream definition in spring xd has the following structure:
jms | filter | transform | hdfs
In the filter module, I fire a query to a database to verify if the current message is applicable for further processing.
When the condition is met, the message passes on to the transform module.
In the transform module, I would like to have access to the query results from the filter module.
Currently, I end up having to fire a query once more inside the transform to access the same result set.
Is there any form of a global variable that can apply during the lifetime of a message passing from source to sink across different modules? This could help reduce the latency of reading from the database.
If this isn't possible, what would be a recommended alternative?
You would typically use a transformer (or a header-enricher) for this, to set a message header with the query result; use that header in the filter, and the header will be available to downstream modules, including your transformer.
<int:chain input-channel="input" output-channel="output">
<int:header-enricher..../>
<int:filter ... />
</int:chain>
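A sketch of what that chain could look like (the lookupService bean, its query method, and the header name are assumptions for illustration):
<int:chain input-channel="input" output-channel="output">
    <int:header-enricher>
        <!-- run the database query once and stash the result in a header -->
        <int:header name="queryResult" expression="@lookupService.query(payload)"/>
    </int:header-enricher>
    <!-- filter on the stored result; downstream modules see the same header -->
    <int:filter expression="headers['queryResult'] != null"/>
</int:chain>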
This (passing arbitrary headers) currently only works (out of the box) with the rabbit (and local) transport, or if direct binding is enabled.
When using the redis transport, you have to configure the bus to add your header to those it passes.

Spring Integration - Concurrent Service Activators

I have a queue channel, and a service activator with a poller which reads from that queue. I'd like to have configuration to say "I want 50 threads to poll that queue, and each time you poll and get a message back, on this thread, invoke the service the service-activator points to."
The service has no @Async annotations, but is stateless and safe to run in a concurrent fashion.
Will the below do that? Are there other preferred ways of achieving this?
<int:channel id="titles">
<int:queue/>
</int:channel>
<int:service-activator output-channel="resolvedIds" ref="searchService" method="searchOnTitle" input-channel="titles">
<int:poller fixed-delay="100" time-unit="MILLISECONDS" task-executor="taskExecutor"></int:poller>
</int:service-activator>
<task:executor id="taskExecutor" pool-size="50" keep-alive="120" />
Yes, I think it does what you want. Once you introduce a QueueChannel the interaction becomes async, so you don't need @Async. If you don't explicitly set up a poller it will use the default poller.
What you have outlined is the best way to achieve it. You might also consider putting a limit on the queue size, so that a lag in keeping up with the producer doesn't lead to an out-of-memory issue. If a capacity is specified, the send calls on the channel will block once the queue is full, acting as a throttle; for example:
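A sketch only (the capacity value is arbitrary):
<int:channel id="titles">
    <int:queue capacity="500"/>
</int:channel>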
The configuration you have will work as you expect. The only issue is that once you start creating executors and pollers for each endpoint, it becomes difficult to figure out the optimal configuration for the entire application. It is OK to do this kind of optimization for a few specific steps, but not for all the endpoints (nothing in your question suggests that you are doing that; I just thought I would raise it anyway).
