I have a network of brokers in a Complete Graph topology with 3 nodes on different servers: A, B and C. Every broker has a producer attached and, for testing purposes, there is only one non-broker consumer, attached to broker C. Since I'm using the Complete Graph topology, every broker also has a broker-consumer for each of the other nodes.
The problem is: A receives a few messages. I expect it to forward those messages to broker C, which has a "real" consumer attached. This is not happening; broker A stores those messages until a "real" consumer connects to it directly.
What's wrong with my configuration (or understanding)?
I'm using ActiveMQ 5.9.0.
Here's my activemq.xml for broker A. It's the same for B and C, only changing names:
<beans
xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker-A" dataDirectory="${activemq.data}">
<destinationPolicy>
<policyMap>
<policyEntries>
<policyEntry topic="tokio.>">
<subscriptionRecoveryPolicy>
<noSubscriptionRecoveryPolicy/>
</subscriptionRecoveryPolicy>
<pendingMessageLimitStrategy>
<constantPendingMessageLimitStrategy limit="1000"/>
</pendingMessageLimitStrategy>
</policyEntry>
</policyEntries>
</policyMap>
</destinationPolicy>
<managementContext>
<managementContext createConnector="true"/>
</managementContext>
<persistenceAdapter>
<kahaDB directory="${activemq.data}/kahadb"/>
</persistenceAdapter>
<systemUsage>
<systemUsage>
<memoryUsage>
<memoryUsage percentOfJvmHeap="70" />
</memoryUsage>
<storeUsage>
<storeUsage limit="40 gb"/>
</storeUsage>
<tempUsage>
<tempUsage limit="10 gb"/>
</tempUsage>
</systemUsage>
</systemUsage>
<networkConnectors>
<networkConnector name="linkTo-broker-B"
uri="static:(tcp://SRVMSG01:61616)"
duplex="true"
/>
<networkConnector name="linkTo-broker-C"
uri="static:(tcp://SRVMSG03:61616)"
duplex="true"
/>
</networkConnectors>
<transportConnectors>
<transportConnector uri="tcp://localhost:0" discoveryUri="multicast://default"/>
<transportConnector name="nio" uri="nio://0.0.0.0:61616" />
</transportConnectors>
</broker>
</beans>
By default, networkTTL is 1 (see the documentation), so when a producer on B publishes a message and it takes the path to A (which it will do 50% of the time in your configuration, because the broker is set up to round-robin between consumers; more on that in a second), it's not allowed to be forwarded on to C. You could fix the problem by increasing networkTTL, but the better solution is to set decreaseNetworkConsumerPriority=true (documented at the same link) to ensure that messages always go as directly as possible to the consumer for which they're destined.
Note, however, that if your consumers move around the mesh, this can strand messages, both because the networkTTL value won't allow additional forwards and because messages aren't allowed to be resent to a broker through which they've already passed. You can address those issues by setting networkTTL to a larger value (like 20, to be completely safe) and by applying the replayWhenNoConsumers=true policy setting described in the "Stuck Messages" section of that same documentation page. Neither of those settings is strictly necessary as long as you're sure your consumers will never move to another broker, or you're OK with losing a few messages when that happens.
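As a rough sketch, that change could look like this on broker A's network connectors (host names copied from your configuration; the networkTTL value here is only an illustration and can be raised further if consumers may move around the mesh):
<networkConnectors>
    <!-- decreaseNetworkConsumerPriority makes local ("real") consumers preferred
         over network consumers, so messages take the most direct route to C -->
    <networkConnector name="linkTo-broker-B"
                      uri="static:(tcp://SRVMSG01:61616)"
                      duplex="true"
                      decreaseNetworkConsumerPriority="true"
                      networkTTL="2"/>
    <networkConnector name="linkTo-broker-C"
                      uri="static:(tcp://SRVMSG03:61616)"
                      duplex="true"
                      decreaseNetworkConsumerPriority="true"
                      networkTTL="2"/>
</networkConnectors>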
Related
I have a queue that listens for messages coming from a service, declared like this:
<bean id="inboundQueue" class="com.rabbitmq.jms.admin.RMQDestination">
<constructor-arg value="myQueue"/>
<constructor-arg value="true"/>
<constructor-arg value="false"/>
</bean>
Then I have a message-driven channel adapter like this. I can't use an inbound-channel-adapter because my system has no tolerance for delay; message forwarding and timeliness are critical.
<int-jms:message-driven-channel-adapter id="inboundAdaptor"
auto-startup="true"
connection-factory="jmsConnectionFactory"
destination="inboundQueue"
channel="requestChannel"
error-channel="errorHandlerChannel"
receive-timeout="1000" />
When I started the project and checked the RabbitMQ console, I saw that Get (empty) was around 10/s. Increasing or decreasing the receive-timeout property has no effect on it.
When I stopped the project, Get (empty) dropped to 0, so I understand that the Get (empty) value is entirely caused by my message-driven-channel-adapter, and I guess it reflects the adapter's polling interval.
What can I do to reduce this time?
I am using XML-based Spring Integration and the s3-inbound-streaming-channel-adapter to stream from a single S3 bucket.
We now have a requirement to stream from two s3 buckets.
So is it possible for s3-inbound-streaming-channel-adapter to stream from multiple buckets?
Or would I need to create a separate s3-inbound-streaming-channel-adapter for each s3 bucket?
This is my current set up for a single s3 bucket and it does work.
<int-aws:s3-inbound-streaming-channel-adapter
channel="s3Channel"
session-factory="s3SessionFactory"
filter="acceptOnceFilter"
remote-directory-expression="'bucket-1'">
<int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
Thanks in advance.
UPDATE:
I ended up having two s3-inbound-streaming-channel-adapters, as suggested by Artem Bilan below.
However, for each inbound adapter I had to declare separate instances of acceptOnceFilter and metadataStore.
This is because when I had only one instance of acceptOnceFilter and metadataStore shared between the two inbound adapters, some odd looping behaviour started.
E.g. when file_1.csv arrived on bucket-1 and was processed, and then the same file_1.csv was put on bucket-2, the looping started. I don't know why, so I ended up creating an acceptOnceFilter and metadataStore per inbound adapter.
<!-- ===================================================== -->
<!-- Region 1 s3-inbound-streaming-channel-adapter setting -->
<!-- ===================================================== -->
<bean id="metadataStore" class="org.springframework.integration.metadata.SimpleMetadataStore"/>
<bean id="acceptOnceFilter"
class="org.springframework.integration.aws.support.filters.S3PersistentAcceptOnceFileListFilter">
<constructor-arg index="0" ref="metadataStore"/>
<constructor-arg index="1" value="streaming"/>
</bean>
<int-aws:s3-inbound-streaming-channel-adapter id="s3Region1"
channel="s3Channel"
session-factory="s3SessionFactory"
filter="acceptOnceFilter"
remote-directory-expression="'${s3.bucketOne.name}'">
<int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
<int:channel id="s3Channel">
<int:queue capacity="50"/>
</int:channel>
<!-- ===================================================== -->
<!-- Region 2 s3-inbound-streaming-channel-adapter setting -->
<!-- ===================================================== -->
<bean id="metadataStoreRegion2" class="org.springframework.integration.metadata.SimpleMetadataStore"/>
<bean id="acceptOnceFilterRegion2"
class="org.springframework.integration.aws.support.filters.S3PersistentAcceptOnceFileListFilter">
<constructor-arg index="0" ref="metadataStoreRegion2"/>
<constructor-arg index="1" value="streaming"/>
</bean>
<int-aws:s3-inbound-streaming-channel-adapter id="s3Region2"
channel="s3ChannelRegion2"
session-factory="s3SessionFactoryRegion2"
filter="acceptOnceFilterRegion2"
remote-directory-expression="'${s3.bucketTwo.name}'">
<int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>
<int:channel id="s3ChannelRegion2">
<int:queue capacity="50"/>
</int:channel>
That's correct, the current implementation supports only a single remote directory to poll periodically. We are working at this very moment to formalize such a solution as an out-of-the-box feature. A similar request has been reported for the (S)FTP support, especially when the target directory is not known in advance at configuration time.
If it is not a big deal for you to configure a separate channel adapter for each directory, that would be great. You can always send the messages from all of them to the same channel for processing.
Otherwise, you can consider looping over the list of buckets via the remote-directory-expression:
<xsd:attribute name="remote-directory-expression" type="xsd:string">
<xsd:annotation>
<xsd:documentation>
Specify a SpEL expression which will be used to evaluate the directory
path to where the files will be transferred
(e.g., "headers.['remote_dir'] + '/myTransfers'" for outbound endpoints)
There is no root object (message) for inbound endpoints
(e.g., "#someBean.fetchDirectory");
</xsd:documentation>
</xsd:annotation>
</xsd:attribute>
delegating the evaluation to some bean of yours.
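For example, a minimal sketch of that idea (the BucketRotator class and its nextBucket() method are hypothetical helpers, shown only for illustration): the SpEL expression calls a bean that returns a different bucket name on each poll.
<!-- hypothetical helper bean that cycles through the configured bucket names -->
<bean id="bucketRotator" class="com.example.BucketRotator">
    <constructor-arg value="bucket-1,bucket-2"/>
</bean>

<int-aws:s3-inbound-streaming-channel-adapter id="s3MultiBucket"
        channel="s3Channel"
        session-factory="s3SessionFactory"
        filter="acceptOnceFilter"
        remote-directory-expression="@bucketRotator.nextBucket()">
    <int:poller fixed-rate="1000"/>
</int-aws:s3-inbound-streaming-channel-adapter>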
I'm new to Spring Integration and I'm experimenting with the various components in a small project.
In the task at hand, I need to process a text file and store its contents to database. The file holds lines that can be grouped together, so it will be natural to divide each file into several independent messages.
This is the whole process (please see the config at the end):
1. do an initial file analysis (done by transformers.outcomeTransf);
2. store some data to database (i.e. file name, file date, etc.) (done by: ?);
3. split the file contents into several distinct messages (done by splitters.outcomeSplit);
4. further analyze each message (done by transformers.SingleoutcomeToMap);
5. store single message data to database, referencing the data stored at step 1 (done by the stored-proc-outbound-channel-adapter).
The database holds just two tables:
T1 for file metadata (file name, file date, file source, ...);
T2 for file content details, rows here reference rows in T1.
I'm missing the component for step 2. As I understand it, an outbound channel adapter "swallows" the message it handles, so that no other endpoint can receive it.
I thought about putting a publish-subscribe channel (without a TaskExecutor) after step 1, with a JDBC outbound adapter as the first subscriber and the splitter from step 3 as the second one: each subscribed handler should then receive a copy of the message, but it's not clear to me whether the splitter's processing would wait until the outbound adapter has finished.
Is this the right approach to the task? And what if the transformer at step 4 is called asynchronously? Each split message is self-contained, which would call for concurrency.
Spring configuration:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:int-file="http://www.springframework.org/schema/integration/file"
xmlns:int-jdbc="http://www.springframework.org/schema/integration/jdbc"
xmlns:beans="http://www.springframework.org/schema/beans"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/integration
http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/file
http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
http://www.springframework.org/schema/integration/jdbc
http://www.springframework.org/schema/integration/jdbc/spring-integration-jdbc.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<!-- input-files-from-folder -->
<int-file:inbound-channel-adapter id="outcomeIn"
directory="file:/in-outcome">
<int:poller id="poller" fixed-delay="2500" />
</int-file:inbound-channel-adapter>
<int:transformer input-channel="outcomeIn" output-channel="outcomesChannel" method="transform">
<beans:bean class="transformers.outcomeTransf" />
</int:transformer>
<!-- save source to db! -->
<int:splitter input-channel="outcomesChannel" output-channel="singleoutcomeChannel" method="splitMessage">
<beans:bean class="splitters.outcomeSplit" />
</int:splitter>
<int:transformer input-channel="singleoutcomeChannel" output-channel="jdbcChannel" method="transform">
<beans:bean class="transformers.SingleoutcomeToMap" />
</int:transformer>
<int-jdbc:stored-proc-outbound-channel-adapter
data-source="dataSource" channel="jdbcChannel" stored-procedure-name="insert_outcome"
ignore-column-meta-data="true">
<int-jdbc:sql-parameter-definitions ... />
<int-jdbc:parameter ... />
</int-jdbc:stored-proc-outbound-channel-adapter>
<bean id="dataSource" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close" >
<property name="driverClassName" value="org.postgresql.Driver"/>
<property ... />
</bean>
</beans>
You are thinking the right way. When you have a PublishSubscribeChannel without an Executor, each subscriber waits for the previous one to finish its work. Therefore your splitter is not going to be called until everything is done on the DB side. Moreover, by default, when the first subscriber fails to handle a message (a lost DB connection?), the others won't be called at all.
Another way to achieve similar behavior can be configured with a <request-handler-advice-chain> and the ExpressionEvaluatingRequestHandlerAdvice: https://docs.spring.io/spring-integration/docs/5.0.4.RELEASE/reference/html/messaging-endpoints-chapter.html#expression-advice
All the concurrency and multi-threading in the splitter's downstream flow is unrelated to the DB logic: no parallelism is going to happen until the DB has performed its request properly.
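A minimal sketch of that publish-subscribe wiring (the JDBC adapter's query, table columns and header names are assumptions, kept deliberately simple): without a task-executor the subscribers run sequentially by their order values, so the splitter only receives the message after the JDBC insert has returned.
<int:publish-subscribe-channel id="outcomesChannel"/>

<!-- subscriber 1: store the file metadata (step 2) -->
<int-jdbc:outbound-channel-adapter channel="outcomesChannel" order="1"
        data-source="dataSource"
        query="insert into T1 (file_name, file_date) values (:headers[fileName], :headers[fileDate])"/>

<!-- subscriber 2: split the content only after the metadata insert has completed -->
<int:splitter input-channel="outcomesChannel" output-channel="singleoutcomeChannel"
        order="2" method="splitMessage">
    <beans:bean class="splitters.outcomeSplit"/>
</int:splitter>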
I have the following chunk writer configuration for getting the replies from Spring Batch remote chunking:
<bean id="chunkWriter" class="org.springframework.batch.integration.chunk.ChunkMessageChannelItemWriter" scope="step">
<property name="messagingOperations" ref="messagingGateway" />
<property name="replyChannel" ref="masterChunkReplies" />
<property name="throttleLimit" value="5" />
<property name="maxWaitTimeouts" value="30000" />
</bean>
<bean id="messagingGateway" class="org.springframework.integration.core.MessagingTemplate">
<property name="defaultChannel" ref="masterChunkRequests" />
<property name="receiveTimeout" value="2000" />
</bean>
<!-- Remote Chunking Replies From Slave -->
<jms:inbound-channel-adapter id="masterJMSReplies"
destination="remoteChunkingRepliesQueue"
connection-factory="remoteChunkingConnectionFactory"
channel="masterChunkReplies">
<int:poller fixed-delay="10" />
</jms:inbound-channel-adapter>
<int:channel id="masterChunkReplies">
<int:queue />
<int:interceptors>
<int:wire-tap channel="loggingChannel"/>
</int:interceptors>
</int:channel>
My remotely chunked step runs perfectly: all data is processed with very good performance and all steps end in the COMPLETED state. But the problem is that the masterChunkReplies queue channel still contains ChunkResponses after the end of the job. The documentation doesn't say anything about it; is that a normal state?
The problem is that I can't run a new job then, because it crashes with:
Message contained wrong job instance id [" + jobInstanceId + "] should have been [" + localState.getJobId() + "]."
There is a simple workaround, cleaning the masterChunkReplies queue channel at the start of the job, but I'm not sure that is correct...
Can you please clarify this?
Gary, I found the root cause.
At the slaves, if I change the following chunk-consumer JMS adapter:
<jms:message-driven-channel-adapter id="slaveRequests"
connection-factory="remoteChunkingConnectionFactory"
destination="remoteChunkingRequestsQueue"
channel="chunkRequests"
concurrent-consumers="10"
max-concurrent-consumers="50"
acknowledge="transacted"
receive-timeout="5000"
idle-task-execution-limit="10"
idle-consumer-limit="5"
/>
to this polled inbound adapter:
<jms:inbound-channel-adapter id="jmsRequests" connection-factory="remoteChunkingConnectionFactory"
destination="remoteChunkingRequestsQueue"
channel="chunkRequests"
acknowledge="transacted"
receive-timeout="5000"
>
<int:poller fixed-delay="100"/>
</jms:inbound-channel-adapter>
then it works: the masterChunkReplies queue is consumed completely at the end of the job. However, any attempt to consume chunkRequests at the slaves in parallel doesn't work; the masterChunkReplies queue then contains unconsumed ChunkResponses, so starting new jobs ends in
Message contained wrong job instance id [" + jobInstanceId + "] should have been [" + localState.getJobId() + "]."
Gary, does that mean the slaves cannot consume ChunkRequests in parallel?
Gary, after a few days of struggling I finally made it work, even with parallel ChunkRequests consumption at the slaves and with an empty masterChunkReplies pollable channel at the end of the job. Changes:
At the master, I replaced the polled inbound channel adapter consuming ChunkResponses (taken straight from the GitHub examples) with a message-driven adapter using the same level of multi-threading with which the slaves consume ChunkRequests. I had a feeling the master was consuming ChunkResponses too slowly, and that's why there were ChunkResponses left over at the end of the job.
I had also misconfigured the remotely chunked step. My fault.
I haven't tested it on more than one node yet, but I now think it works as it should.
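For reference, a minimal sketch of that master-side change (the adapter id and consumer counts are assumptions, mirroring the slave configuration shown above):
<jms:message-driven-channel-adapter id="masterJMSReplies"
    connection-factory="remoteChunkingConnectionFactory"
    destination="remoteChunkingRepliesQueue"
    channel="masterChunkReplies"
    concurrent-consumers="10"
    max-concurrent-consumers="50"
    acknowledge="transacted"/>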
Thank you very much for help.
regards
Tomas
Is there a way to limit the size of a queue in ActiveMQ? For example, I have four queues: Q1, Q2, Q3, Q4, and I want Q3 to block producers once it holds 200 MB of messages, until those messages are consumed, while Q1, Q2 and Q4 keep functioning normally.
You can do it, but you have to do it in steps.
There are different types of memory involved, like normal "RAM" memory and disk space in the persistent store, and you have to configure them separately, since when the "RAM" memory runs out, messages are swapped out to the store and fetched back from it (depending a bit on the configuration).
However, hopefully you have a system-wide limit, like this:
<systemUsage>
<systemUsage>
<memoryUsage>
<memoryUsage percentOfJvmHeap="70" />
</memoryUsage>
<storeUsage>
<storeUsage limit="100 gb"/>
</storeUsage>
<tempUsage>
<tempUsage limit="50 gb"/>
</tempUsage>
</systemUsage>
</systemUsage>
Given these entries as a starting point, you can apply per-destination policies that limit certain queues. Those limits are set as a percentage of the system-wide usage, so you need to do some calculations.
Use storeUsageHighWaterMark and/or cursorMemoryHighWaterMark depending on the effect you want. Note that the store is not used for non-persistent messages.
For a basic memory limit, you can also use the memoryLimit setting on the destination policy; it is a per-destination slice of the memoryUsage system limit.
<policyEntry queue="ANOTHER.>" producerFlowControl="true" memoryLimit="12 kb">
Note that this does not hard-limit the total queue size to 12 kb; it caps the memory the destination may use, and with producerFlowControl="true" producers are blocked once that limit is reached.
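Putting it together for the original question, a rough sketch (the 200 mb figure comes from the question; the high-water-mark percentages are assumptions you would derive from your own system limits):
<destinationPolicy>
    <policyMap>
        <policyEntries>
            <!-- only Q3 gets the tighter limits; Q1, Q2 and Q4 fall back to the
                 system-wide defaults -->
            <policyEntry queue="Q3" producerFlowControl="true"
                         memoryLimit="200 mb"
                         cursorMemoryHighWaterMark="30"
                         storeUsageHighWaterMark="20"/>
        </policyEntries>
    </policyMap>
</destinationPolicy>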