Spring Integration - Message Driven Channel Adapter Configuration To Reduce Consuming Interval

I have a queue that listens for messages coming from a service, defined like this:
<bean id="inboundQueue" class="com.rabbitmq.jms.admin.RMQDestination">
<constructor-arg value="myQueue"/>
<constructor-arg value="true"/>
<constructor-arg value="false"/>
</bean>
Then I have a message-driven channel adapter like this. I can't use an inbound-channel-adapter because my system has no tolerance for delay; message forwarding and timeliness are critical.
<int-jms:message-driven-channel-adapter id="inboundAdaptor"
        auto-startup="true"
        connection-factory="jmsConnectionFactory"
        destination="inboundQueue"
        channel="requestChannel"
        error-channel="errorHandlerChannel"
        receive-timeout="1000" />
When I started the project and checked the RabbitMQ console, I saw that Get (empty) was around 10/s. Increasing or decreasing the receive-timeout property has no effect on it.
When I stopped the project, Get (empty) dropped to 0, so I understood that the Get (empty) value is caused entirely by my message-driven-channel-adapter, and I guess it reflects the adapter's consuming (polling) interval.
What can I do to reduce this time?
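Not from the original thread, but since the destination is RabbitMQ, one hedged alternative worth sketching is to bypass the JMS layer and use the push-based Spring Integration AMQP inbound adapter, which consumes via basic.consume instead of the basic.get polling that shows up as Get (empty). A minimal sketch, assuming the int-amqp namespace is available and a Spring AMQP connection factory bean exists (rabbitConnectionFactory is an assumed name):
<!-- Sketch only: push-based consumption, so no Get (empty) polling in the console. -->
<!-- rabbitConnectionFactory is an assumed Spring AMQP CachingConnectionFactory bean. -->
<int-amqp:inbound-channel-adapter id="amqpInboundAdapter"
        queue-names="myQueue"
        connection-factory="rabbitConnectionFactory"
        channel="requestChannel"
        error-channel="errorHandlerChannel" />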

Related

Spring Integration AWS s3-inbound-streaming-channel-adapter stream from multiple s3 buckets

I am using XML-based Spring Integration and an s3-inbound-streaming-channel-adapter to stream from a single S3 bucket.
We now have a requirement to stream from two S3 buckets.
So is it possible for s3-inbound-streaming-channel-adapter to stream from multiple buckets?
Or would I need to create a separate s3-inbound-streaming-channel-adapter for each S3 bucket?
This is my current setup for a single S3 bucket, and it does work.
<int-aws:s3-inbound-streaming-channel-adapter
        channel="s3Channel"
        session-factory="s3SessionFactory"
        filter="acceptOnceFilter"
        remote-directory-expression="'bucket-1'">
    <int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
Thanks in advance.
UPDATE:
I ended up having two s3-inbound-streaming-channel-adapters, as suggested by Artem Bilan below.
However, for each inbound adapter I had to declare separate instances of acceptOnceFilter and metadataStore.
This is because when I had only one instance of acceptOnceFilter and metadataStore, shared by the two inbound adapters, some weird looping started happening.
e.g. when file_1.csv arrived on bucket-1 and got processed, and the same file_1.csv was then put on bucket-2, the looping started. I don't know why! So I ended up creating an acceptOnceFilter and a metadataStore for each inbound adapter.
<!-- ===================================================== -->
<!-- Region 1 s3-inbound-streaming-channel-adapter setting -->
<!-- ===================================================== -->
<bean id="metadataStore" class="org.springframework.integration.metadata.SimpleMetadataStore"/>
<bean id="acceptOnceFilter"
class="org.springframework.integration.aws.support.filters.S3PersistentAcceptOnceFileListFilter">
<constructor-arg index="0" ref="metadataStore"/>
<constructor-arg index="1" value="streaming"/>
</bean>
<int-aws:s3-inbound-streaming-channel-adapter id="s3Region1"
channel="s3Channel"
session-factory="s3SessionFactory"
filter="acceptOnceFilter"
remote-directory-expression="'${s3.bucketOne.name}'">
<int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
<int:channel id="s3Channel">
<int:queue capacity="50"/>
</int:channel>
<!-- ===================================================== -->
<!-- Region 2 s3-inbound-streaming-channel-adapter setting -->
<!-- ===================================================== -->
<bean id="metadataStoreRegion2" class="org.springframework.integration.metadata.SimpleMetadataStore"/>
<bean id="acceptOnceFilterRegion2"
class="org.springframework.integration.aws.support.filters.S3PersistentAcceptOnceFileListFilter">
<constructor-arg index="0" ref="metadataStoreRegion2"/>
<constructor-arg index="1" value="streaming"/>
</bean>
<int-aws:s3-inbound-streaming-channel-adapter id="s3Region2"
channel="s3ChannelRegion2"
session-factory="s3SessionFactoryRegion2"
filter="acceptOnceFilterRegion2"
remote-directory-expression="'${s3.bucketTwo.name}'">
<int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
<int:channel id="s3ChannelRegion2">
<int:queue capacity="50"/>
</int:channel>
That's correct; the current implementation supports only a single remote directory to poll periodically. We are working right now to formalize such a solution as an out-of-the-box feature. A similar request has been reported for the (S)FTP support, especially when the target directory is not known in advance at configuration time.
If it is not a big deal for you to configure a separate channel adapter for each directory, that would be great. You can always send the messages from all of them to the same channel for processing.
Otherwise you can consider looping over the list of buckets via:
<xsd:attribute name="remote-directory-expression" type="xsd:string">
<xsd:annotation>
<xsd:documentation>
Specify a SpEL expression which will be used to evaluate the directory
path to where the files will be transferred
(e.g., "headers.['remote_dir'] + '/myTransfers'" for outbound endpoints)
There is no root object (message) for inbound endpoints
(e.g., "#someBean.fetchDirectory");
</xsd:documentation>
</xsd:annotation>
</xsd:attribute>
in some bean.
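For example, a rough sketch of that idea, where bucketRotator is a hypothetical bean whose nextBucket() method returns a different bucket name on each poll (the bean, its class and its method are assumptions, not part of the original answer); it mirrors the "#someBean.fetchDirectory" hint from the schema documentation above:
<!-- Hypothetical: bucketRotator cycles through the configured bucket names on each poll. -->
<bean id="bucketRotator" class="com.example.BucketRotator"/>
<int-aws:s3-inbound-streaming-channel-adapter id="s3MultiBucket"
        channel="s3Channel"
        session-factory="s3SessionFactory"
        filter="acceptOnceFilter"
        remote-directory-expression="@bucketRotator.nextBucket()">
    <int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>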

(Spring Batch) Pollable channel with replies contains ChunkResponses even though the job completed successfully

I have the following chunk writer configuration for getting the replies from Spring Batch remote chunking:
<bean id="chunkWriter" class="org.springframework.batch.integration.chunk.ChunkMessageChannelItemWriter" scope="step">
<property name="messagingOperations" ref="messagingGateway" />
<property name="replyChannel" ref="masterChunkReplies" />
<property name="throttleLimit" value="5" />
<property name="maxWaitTimeouts" value="30000" />
</bean>
<bean id="messagingGateway" class="org.springframework.integration.core.MessagingTemplate">
<property name="defaultChannel" ref="masterChunkRequests" />
<property name="receiveTimeout" value="2000" />
</bean>
<!-- Remote Chunking Replies From Slave -->
<jms:inbound-channel-adapter id="masterJMSReplies"
destination="remoteChunkingRepliesQueue"
connection-factory="remoteChunkingConnectionFactory"
channel="masterChunkReplies">
<int:poller fixed-delay="10" />
</jms:inbound-channel-adapter>
<int:channel id="masterChunkReplies">
<int:queue />
<int:interceptors>
<int:wire-tap channel="loggingChannel"/>
</int:interceptors>
</int:channel>
My remotely chunked step is running perfectly: all data are processed with very good performance and all steps end in the COMPLETED state. But the problem is that the masterChunkReplies queue channel still contains ChunkResponses after the end of the job. The documentation doesn't say anything about this; is that a normal state?
The problem is that I then can't run a new job, because it crashes at:
Message contained wrong job instance id ["
+ jobInstanceId + "] should have been [" + localState.getJobId() + "]."
There is a simple workaround, cleaning the masterChunkReplies queue channel at the start of the job, but I'm not sure whether it is correct...
Can you please clarify this?
Gary, I found the root cause.
On the slaves, if I change the following chunk-consuming JMS adapter:
<jms:message-driven-channel-adapter id="slaveRequests"
connection-factory="remoteChunkingConnectionFactory"
destination="remoteChunkingRequestsQueue"
channel="chunkRequests"
concurrent-consumers="10"
max-concurrent-consumers="50"
acknowledge="transacted"
receive-timeout="5000"
idle-task-execution-limit="10"
idle-consumer-limit="5"
/>
to this:
<jms:inbound-channel-adapter id="jmsRequests" connection-factory="remoteChunkingConnectionFactory"
destination="remoteChunkingRequestsQueue"
channel="chunkRequests"
acknowledge="transacted"
receive-timeout="5000"
>
<int:poller fixed-delay="100"/>
</jms:inbound-channel-adapter>
then it works; the masterChunkReplies queue is consumed completely at the end of the job. However, any attempt to consume chunkRequests on the slaves in parallel doesn't work: the masterChunkReplies queue then contains unconsumed ChunkResponses, so starting new jobs ends in
Message contained wrong job instance id ["
+ jobInstanceId + "] should have been [" + localState.getJobId() + "]."
Gary, does it mean that slaves cannot consume ChunkRequests in parallel?
Gary, after a few days of struggling I finally made it work, even with parallel ChunkRequests consumption on the slaves and with an empty masterChunkReplies pollable channel at the end of the job. Changes:
On the master, I changed the polled inbound channel adapter consuming ChunkResponses (taken straight from the GitHub examples) for a message-driven adapter with the same level of multithreading as the slaves use for consuming ChunkRequests, because I had a feeling that the master was consuming ChunkResponses slowly, which is why there were leftover ChunkResponses at the end of the job.
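For reference, a sketch of what such a master-side message-driven adapter could look like, mirroring the slave configuration and the reply destination already shown in this thread (the exact adapter config is not given in the original post, so treat this as an assumption):
<!-- Sketch: message-driven, multithreaded consumption of ChunkResponses on the master. -->
<jms:message-driven-channel-adapter id="masterJMSReplies"
        connection-factory="remoteChunkingConnectionFactory"
        destination="remoteChunkingRepliesQueue"
        channel="masterChunkReplies"
        concurrent-consumers="10"
        max-concurrent-consumers="50"
        acknowledge="transacted"
        receive-timeout="5000" />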
I had also misconfigured the remotely chunked step... my fault.
I haven't tested it on more than one node yet, but now I think it works as it should.
Thank you very much for the help.
regards
Tomas

Prevent duplicates across restarts in spring integration

I have to poll a directory and write entries to an RDBMS.
I wired up a Redis metadata store for the duplicates check. I see that the framework updates the Redis store with entries for all files in the folder [~140 files] well before the RDBMS entries get written. At the time of application termination the RDBMS has logged only 90 files, and on application restart no more files are picked up from the folder.
Properties: msgs.per.poll=10, polling.interval=2000
How can I ensure entries to Redis are made after writing to the DB, so that both are in sync and I don't miss any files?
<task:executor id="executor" pool-size="5" />
<int-file:inbound-channel-adapter channel="filesIn" directory="${input.Dir}" scanner="dirScanner" filter="compositeFileFilter" prevent-duplicates="true">
<int:poller fixed-delay="${polling.interval}" max-messages-per-poll="${msgs.per.poll}" task-executor="executor">
</int:poller>
</int-file:inbound-channel-adapter>
<int:channel id="filesIn" />
<bean id="dirScanner" class="org.springframework.integration.file.RecursiveLeafOnlyDirectoryScanner" />
<bean id="compositeFileFilter" class="org.springframework.integration.file.filters.CompositeFileListFilter">
<constructor-arg ref="persistentFilter" />
</bean>
<bean id="persistentFilter" class="org.springframework.integration.file.filters.FileSystemPersistentAcceptOnceFileListFilter">
<constructor-arg ref="metadataStore" />
</bean>
<bean name="metadataStore" class="org.springframework.integration.redis.metadata.RedisMetadataStore">
<constructor-arg name="connectionFactory" ref="redisConnectionFactory"/>
</bean>
<bean id="redisConnectionFactory" class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory" p:hostName="localhost" p:port="6379" />
<int-jdbc:outbound-channel-adapter channel="filesIn" data-source="dataSource" query="insert into files values (:path,:name,:size,:crDT,:mdDT,:id)"
sql-parameter-source-factory="spelSource">
</int-jdbc:outbound-channel-adapter>
....
Artem is correct; you might as well extend the RedisMetadataStore and, at initialization time, flush the entries that are not in your database. That way you could use Redis and stay in sync with the DB, but it does couple things a little.
How can I ensure entries to redis are made after writing to db
It isn't possible, because FileSystemPersistentAcceptOnceFileListFilter works before any message is sent, and only once, when FileReadingMessageSource.toBeReceived is empty. Of course, it tries to refetch the files on the next application restart, but it can't, because your RedisMetadataStore already contains entries for those files.
I think in your case we don't have any choice other than to use some custom JdbcFileListFilter based on your files table. Fortunately, your logic ends up with a file entry there anyway.
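A rough sketch of how such a custom filter might be wired in place of the Redis-backed one; com.example.JdbcFileListFilter is a hypothetical class (not an existing framework class) that would accept a file only if it is not yet present in the files table:
<!-- Hypothetical custom filter backed by the same files table the JDBC adapter writes to. -->
<bean id="jdbcFileFilter" class="com.example.JdbcFileListFilter">
    <constructor-arg ref="dataSource"/>
</bean>
<!-- ...and reference it from the composite filter instead of persistentFilter: -->
<bean id="compositeFileFilter" class="org.springframework.integration.file.filters.CompositeFileListFilter">
    <constructor-arg ref="jdbcFileFilter"/>
</bean>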

Purpose of taskExecutor property in Spring's DefaultMessageListenerContainer

Spring's DefaultMessageListenerContainer (DMLC) has concurrentConsumers and taskExecutor properties. The taskExecutor bean can be given a corePoolSize property. What, then, is the difference between specifying concurrentConsumers and corePoolSize? When the concurrentConsumers property is defined, it means that Spring will create the specified number of consumers/message listeners to process messages. Where does corePoolSize come into the picture?
Code snippet
<bean id="myMessageListener"
class="org.springframework.jms.listener.DefaultMessageListenerContainer">
<property name="connectionFactory" ref="connectionFactory" />
<property name="destination" ref="myQueue" />
<property name="messageListener" ref="myListener" />
<property name="cacheLevelName" value="CACHE_CONSUMER"/>
<property name="maxConcurrentConsumers" value="10"/>
<property name="concurrentConsumers" value="3"/>
<property name="taskExecutor" ref="myTaskExecutor"/>
</bean>
<bean id="myTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor" >
<property name="corePoolSize" value="100"/>
<property name="maxPoolSize" value="100"/>
<property name="keepAliveSeconds" value="30"/>
<property name="threadNamePrefix" value="myTaskExecutor"/>
</bean>
As of version 4.3.6, the taskExecutor runs instances of AsyncMessageListenerInvoker, which are responsible for message processing. corePoolSize is the number of physical threads in the defined pool, while concurrentConsumers is the number of tasks submitted to that pool. I guess this abstraction was designed for more flexible control.
The Purpose of TaskExecutor Property
Set the Spring TaskExecutor to use for running the listener threads.
Default is a SimpleAsyncTaskExecutor, starting up a number of new threads, according to the specified number of concurrent consumers.
Specify an alternative TaskExecutor for integration with an existing thread pool.
The above is from the Spring official documentation.
When you specify an alternative task executor, the listener threads will use that executor instead of the default SimpleAsyncTaskExecutor.
This can be easily illustrated by defining two JMS listeners with the same containerFactory: when you specify the concurrency, the taskExecutor's corePoolSize and maxPoolSize have to support it.
If you set the concurrency to 5-20 and you have two listeners, then you should set the corePoolSize to more than 10 and the maxPoolSize to more than 40, so that the listeners can get threads according to their concurrency limits.
In this case, if you set the maxPoolSize to less than 10, the listener containers will not even be able to keep their 10 minimum consumers running, and Spring will log the warning below as well:
The number of scheduled consumers has dropped below concurrent consumers limit, probably due to tasks having been rejected. Check your thread pool configuration! Automatic recovery to be triggered by remaining consumers.
Basically, the listener threads behave according to the taskExecutor property.
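To make the pool-sizing arithmetic above concrete, here is a small sketch (bean names, destinations and listeners are made up for the illustration): two containers, each with concurrentConsumers=5 and maxConcurrentConsumers=20, share one executor sized to the sum of the minimums (5+5=10) and the sum of the maximums (20+20=40); the answer above suggests going somewhat higher than these minimums.
<!-- Illustration only: two containers with concurrency 5-20 sharing one pool sized 10/40. -->
<bean id="sharedTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="10"/>
    <property name="maxPoolSize" value="40"/>
</bean>
<bean id="listenerContainerA" class="org.springframework.jms.listener.DefaultMessageListenerContainer">
    <property name="connectionFactory" ref="connectionFactory"/>
    <property name="destination" ref="queueA"/>
    <property name="messageListener" ref="listenerA"/>
    <property name="concurrentConsumers" value="5"/>
    <property name="maxConcurrentConsumers" value="20"/>
    <property name="taskExecutor" ref="sharedTaskExecutor"/>
</bean>
<bean id="listenerContainerB" class="org.springframework.jms.listener.DefaultMessageListenerContainer">
    <property name="connectionFactory" ref="connectionFactory"/>
    <property name="destination" ref="queueB"/>
    <property name="messageListener" ref="listenerB"/>
    <property name="concurrentConsumers" value="5"/>
    <property name="maxConcurrentConsumers" value="20"/>
    <property name="taskExecutor" ref="sharedTaskExecutor"/>
</bean>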

How is ordering preserved in ActiveMQ?

I have set up an application to listen to an ActiveMQ topic. Here's the way I have configured it:
<jms:listener-container connection-factory="jmsFactory"
        container-type="default" destination-type="durableTopic"
        client-id="CMY-LISTENER" acknowledge="transacted">
    <jms:listener destination="CMY.UPDATES"
            ref="continuingStudiesCourseUpdateListener" subscription="CMY-LISTENER" />
</jms:listener-container>
<bean id="jmsFactoryDelegate" class="org.apache.activemq.ActiveMQConnectionFactory">
<property name="brokerURL" value="${jmsFactory.brokerURL}" />
<property name="redeliveryPolicy">
<bean class="org.apache.activemq.RedeliveryPolicy">
<property name="maximumRedeliveries" value="10" />
<property name="initialRedeliveryDelay" value="60000" />
<property name="redeliveryDelay" value="60000" />
<property name="useExponentialBackOff" value="true" />
<property name="backOffMultiplier" value="2" />
</bean>
</property>
</bean>
The problem I have is this:
I put 10 messages into the topic.
If the first message is read, and the application fails to process the task, it rolls back the message.
1 minute later, it retries to read the first message and process it. It fails and rolls back.
2 minutes later, it retries, and rolls back.
4 minutes later... etc
It gets stuck on the first message and the next 9 messages don't get read until the first one is dealt with.
Is this the way a topic is supposed to work? Is there a way that I can have my 9 other messages read while the first one is waiting to be re-tried?
It's working just like it's supposed to; that's the nature of transactional message processing. You can't process the other messages until the first one completes or is discarded based on the rules in your redelivery policy.
You might want to read through the JMS tutorial here:
