Webapp hangs when ActiveMQ broker is running - Spring

I've got a strange problem with my Spring webapp (running on a local Jetty) which connects to a locally running ActiveMQ broker for JMS functionality.
As soon as I start the broker, the application becomes incredibly slow: e.g. the startup of the ApplicationContext with an active broker takes forever (> 10 minutes; I have not yet waited long enough for it to complete). If I start the broker after the webapp (i.e. after the ApplicationContext was loaded), the app runs, but very slowly (requests which usually take < 1 s take > 30 s). All operations take longer, even the ones without JMS involved. When I run the application without an ActiveMQ broker, everything runs smoothly (except the JMS-related stuff, of course ;-) )
Here's what I tried so far:
- updated ActiveMQ to version 5.10.1
- used a standalone ActiveMQ instead of the Maven plugin
- moved the broker from a separate JVM (via the ActiveMQ Maven plugin, connection via JNDI lookup in the Jetty config) into the same JVM (started via Spring config, without JNDI)
- changed the ActiveMQ transport from tcp to vm
- tried several ActiveMQ settings (alwaysSyncSend, alwaysSessionAsync, producerWindowSize)
- used CachingConnectionFactory and PooledConnectionFactory
When analyzing a thread dump (jstack) I see many ActiveMQ threads sleeping on a monitor, which looks like this:
"ActiveMQ VMTransport: vm://localhost#0-3" daemon prio=6 tid=0x000000000b1a3000 nid=0x1840 waiting on condition [0x00000000177df000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000f786d670> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:955)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917)
at java.lang.Thread.run(Thread.java:662)
Any help is greatly appreciated!

I found the cause of the issue and was able to fix it:
we were passing a transaction manager to the AbstractMessageListenerContainer. While in production an XA transaction manager is in use, in the local Jetty environment only a JpaTransactionManager is used. Apparently JMS was waiting forever for an XA transaction to be committed, which never happens in the local environment.
By overriding the bean definition of the AbstractMessageListenerContainer for the local environment, not setting a transaction manager but using sessionTransacted="true" instead, everything works fine.
I got the idea that it might be related to transaction handling from enabling the ActiveMQ logging. With this I saw that something was wrong with the transaction (transactionContext.getTransactionId() returned null).
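For illustration, a minimal sketch of such a local-profile override in Java config; the bean names, queue name and listener are placeholders, not the original configuration:

import javax.jms.ConnectionFactory;
import javax.jms.MessageListener;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.jms.listener.DefaultMessageListenerContainer;

@Configuration
@Profile("local")
public class LocalJmsConfig {

    @Bean
    public DefaultMessageListenerContainer exampleListenerContainer(ConnectionFactory connectionFactory,
                                                                    MessageListener exampleListener) {
        DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
        container.setConnectionFactory(connectionFactory);
        container.setDestinationName("example.queue");   // placeholder destination
        container.setMessageListener(exampleListener);
        // Local JMS transactions instead of an external (XA/JPA) transaction manager,
        // so the session commits per message and never waits on an XA coordinator.
        container.setSessionTransacted(true);
        return container;
    }
}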

Related

Tracing memory leak in Spring Azure qPID JMS code

I'm trying to trace and identify the root cause of a memory leak in our very small and simple Spring Boot application.
It uses the following:
- Spring Boot 2.2.4
- azure-servicebus-jms-spring-boot-starter 2.2.1
- MSSQL
Function:
The app only listens to an Azure ServiceBus queue, stores data and sends data to another destination.
It is a small app, so it starts easily with 64 MB of memory, although I give it up to 256 MB via the Xmx option. An important note is that the queue is consumed using Spring's default transacted mode with a dedicated JmsTransactionManager, which is actually an inner TM of a ChainedTransactionManager, along with a DB TM and an additional outbound JMS TM. Both JMS ConnectionFactory objects are created as CachingConnectionFactory (sketched below).
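For illustration, a hedged sketch of how that part of the setup typically looks; the URIs and bean names are assumptions, not the original configuration:

import javax.jms.ConnectionFactory;
import org.apache.qpid.jms.JmsConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jms.connection.CachingConnectionFactory;

@Configuration
public class ConnectionFactorySketch {

    @Bean
    public ConnectionFactory inboundConnectionFactory() {
        // Inbound (Service Bus) factory wrapped in Spring's CachingConnectionFactory.
        return new CachingConnectionFactory(
                new JmsConnectionFactory("amqps://<namespace>.servicebus.windows.net")); // placeholder URI
    }

    @Bean
    public ConnectionFactory outboundConnectionFactory() {
        // Outbound factory, also wrapped in CachingConnectionFactory.
        return new CachingConnectionFactory(
                new JmsConnectionFactory("amqps://<other-destination>")); // placeholder URI
    }
}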
Behavior:
Once the app is started it seems OK. There is no traffic, and I can see in the log that it is opening and closing transactions when checking the queue (jms:message-driven-channel-adapter).
However, after some time, still with no traffic and not a single message consumed, the memory starts climbing, as monitored via JVVM.
There is an error thrown:
--2020-04-24 11:17:01.443 - WARN 39892 --- [er.container-10] o.s.j.l.DefaultMessageListenerContainer : Setup of JMS message listener invoker failed for destination 'MY QUEUE NAME HERE' - trying to recover. Cause: Heuristic completion: outcome state is rolled back; nested exception is org.springframework.transaction.TransactionSystemException: Could not commit JMS transaction; nested exception is javax.jms.IllegalStateException: The Session was closed due to an unrecoverable error.
... and after several minutes it reaches the maximum of the heap, and from that moment on it fails with an OutOfMemoryError in the thread opening JMS connections.
--2020-04-24 11:20:04.564 - WARN 39892 --- [windows.net:-1]] i.n.u.concurrent.AbstractEventExecutor : A task raised an exception. Task: org.apache.qpid.jms.provider.amqp.AmqpProvider$$Lambda$871/0x000000080199f840#1ed8f2b9
-
java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
at org.apache.qpid.proton.engine.impl.ByteBufferUtils.newWriteableBuffer(ByteBufferUtils.java:99)
at org.apache.qpid.proton.engine.impl.TransportOutputAdaptor.init_buffers(TransportOutputAdaptor.java:108)
at org.apache.qpid.proton.engine.impl.TransportOutputAdaptor.pending(TransportOutputAdaptor.java:56)
at org.apache.qpid.proton.engine.impl.SaslImpl$SwitchingSaslTransportWrapper.pending(SaslImpl.java:842)
at org.apache.qpid.proton.engine.impl.HandshakeSniffingTransportWrapper.pending(HandshakeSniffingTransportWrapper.java:138)
at org.apache.qpid.proton.engine.impl.TransportImpl.pending(TransportImpl.java:1577)
at org.apache.qpid.proton.engine.impl.TransportImpl.getOutputBuffer(TransportImpl.java:1526)
at org.apache.qpid.jms.provider.amqp.AmqpProvider.pumpToProtonTransport(AmqpProvider.java:994)
at org.apache.qpid.jms.provider.amqp.AmqpProvider.pumpToProtonTransport(AmqpProvider.java:985)
at org.apache.qpid.jms.provider.amqp.AmqpProvider.lambda$close$3(AmqpProvider.java:351)
at org.apache.qpid.jms.provider.amqp.AmqpProvider$$Lambda$871/0x000000080199f840.run(Unknown Source)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:835)
HeapDumps:
I took a couple of heap snapshots during this whole process and looked at what gets increased.
I can see a suspicious amount of ConcurrentHashMap/String/byte[] objects.
Does anyone have a clue/hint as to what could be wrong in this setup and these libs: Spring Boot, Apache Qpid (used under the hood of the Azure JMS dependency), etc.? Many thanks.
Update #1
I have clear evidence that the problem is either in Spring or in the Azure Service Bus starter library, not necessarily in the Qpid client used underneath. I would guess the library has the bug rather than Spring, but that is just my guess. This is what the failing setup looks like:
1. There are two JMS destinations and one DB, each having its own transaction manager.
2. There is a ChainedTransactionManager wrapping the above three TMs.
3. A Spring Integration app connects to the Azure ServiceBus queue via jms:message-driven-channel-adapter, with the transaction manager from point 2 set on this component.
4. Start the app; no traffic on the queue is needed. After 10 minutes the app will crash with an OutOfMemoryError. Within those 10 minutes I watch the log at debug level, and the only thing happening is the opening and closing of transactions using the ChainedTransactionManager. Also, as written in the comments, another important condition is the third JMS transaction manager: with 2 TMs it works and is stable, with 3 it will crash. A sketch of that chain follows.
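For illustration only, a hedged sketch of what such a three-TM chain might look like in Java config; all bean names and injected resources are assumptions, not the original code:

import javax.jms.ConnectionFactory;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.transaction.ChainedTransactionManager;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;
import org.springframework.jms.connection.JmsTransactionManager;

@Configuration
public class ChainedTxSketch {

    @Bean
    public JmsTransactionManager inboundJmsTxManager(ConnectionFactory inboundConnectionFactory) {
        return new JmsTransactionManager(inboundConnectionFactory);   // TM for the Service Bus queue
    }

    @Bean
    public JmsTransactionManager outboundJmsTxManager(ConnectionFactory outboundConnectionFactory) {
        return new JmsTransactionManager(outboundConnectionFactory);  // TM for the outbound destination
    }

    @Bean
    public DataSourceTransactionManager dbTxManager(DataSource dataSource) {
        return new DataSourceTransactionManager(dataSource);          // TM for the MSSQL DataSource
    }

    @Bean
    public ChainedTransactionManager chainedTxManager(JmsTransactionManager inboundJmsTxManager,
                                                      DataSourceTransactionManager dbTxManager,
                                                      JmsTransactionManager outboundJmsTxManager) {
        // The chained TM starts transactions in the given order and commits them in reverse order.
        return new ChainedTransactionManager(inboundJmsTxManager, dbTxManager, outboundJmsTxManager);
    }
}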
Additional research and the steps taken identified the most likely root cause: Spring's CachingConnectionFactory class. Once I removed it and used only the native types, the problem went away and the memory consumption profile is very different and healthy.
I have to say I created the CachingConnectionFactory using the standard constructor and did not further configure its behavior. However, in my experience these Spring defaults clearly lead to a memory leak.
In the past I had a memory leak with ActiveMQ which had to be resolved by using CachingConnectionFactory, and now I have a memory leak with Azure ServiceBus when using CachingConnectionFactory ... strange :) In both cases I see these as bugs, because memory management should be correct regardless of whether caching is involved or not.
Marking this as my answer.
Tested case: the problem occurs when receiving and sending messages, each with its own TM, and both JMS connection factories are of type CachingConnectionFactory. In the end I also tested the app with an inbound connection factory of type CachingConnectionFactory and the outbound one just the native type ... no memory leak either.
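For illustration, a minimal sketch of the workaround described above: the native Qpid JMS factory used directly instead of being wrapped in CachingConnectionFactory (the URI is a placeholder):

import javax.jms.ConnectionFactory;
import org.apache.qpid.jms.JmsConnectionFactory;

public class NativeConnectionFactorySketch {

    // Before: new CachingConnectionFactory(new JmsConnectionFactory(...));
    // After: the native Qpid JMS factory is used directly, which removed the leak in this tested case.
    static ConnectionFactory outboundConnectionFactory() {
        return new JmsConnectionFactory("amqps://<namespace>.servicebus.windows.net"); // placeholder URI
    }
}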

Spring Boot application won't quit on 'eb deploy' since I added embedded ActiveMQ

I have a Spring Boot application hosted on AWS Elastic Beanstalk. Ever since I included an embedded ActiveMQ, the application won't quit on redeploy: I am getting an error about port 5000 already being in use when it tries to start the newly deployed jar.
The only workaround I found is to recreate the environment after each redeploy, which means a long downtime.
I am suspecting a timing problem with the shutdown hook.
When I Ctrl-C the application locally, it quits after a delay of several seconds, with some exceptions:
javax.jms.JMSException: peer (vm://embedded#1) stopped.
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:54) ~[activemq-client-5.15.10.jar:5.15.10]
...
Caused by: org.apache.activemq.transport.TransportDisposedIOException: peer (vm://embedded#1) stopped.
My brokerUrl is set to vm://embedded?broker.persistent=false,useShutdownHook=false, although jConsole shows Broker/Embedded/Attributes/Persistent as true.
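As an aside, one hedged way to make those broker flags explicit is to define the embedded broker programmatically instead of relying on URI options; this is only a sketch under that assumption, and the bean and broker names are placeholders, not the original setup:

import org.apache.activemq.broker.BrokerService;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class EmbeddedBrokerConfig {

    @Bean(initMethod = "start", destroyMethod = "stop")
    public BrokerService embeddedBroker() throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("embedded");
        broker.setPersistent(false);       // equivalent to broker.persistent=false
        broker.setUseShutdownHook(false);  // equivalent to useShutdownHook=false
        broker.setUseJmx(true);            // keep JMX so the flags can be verified in jConsole
        return broker;
    }
}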
Any hints?

Managing JMS Message Containers on Application Startup and Shutdown

Currently, we have four JMS listener containers that are started during application start. They all connect through Apache ZooKeeper and are manually started. This becomes problematic when a connection to ZooKeeper cannot be established: the (Wicket) application cannot start, even though the JMS listeners do not need to be active for the application to be usable. They simply need to listen to messages in the background and save them; a cron job will process them in batches.
Goals:
1. Allow the application to start and not be prevented by the message containers not being able to connect.
2. After the application starts, start the message listeners.
3. If the connection to one or any of the message listeners goes down, it should attempt to reconnect automatically.
4. On application shutdown (such as Tomcat being shut down), the application should stop the message listeners and the cron job that processes the saved messages.
5. Make all of this testable (as in, be able to write integration tests for this setup).
Current Setup:
- Spring Boot 1.5.6
- Apache ZooKeeper 3.4.6
- Apache ActiveMQ 5.7
- Wicket 7.7.0
Work done so far:
- Defined a class that implements ApplicationListener<ApplicationReadyEvent>.
- Set the autoStartup property of each DefaultMessageListenerContainer to false and started each container in onApplicationEvent in a separate thread (see the sketch below).
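For illustration, a minimal sketch of that approach; the container wiring and the executor are assumptions, not the original code:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.jms.listener.DefaultMessageListenerContainer;
import org.springframework.stereotype.Component;

// Starts all listener containers only once the application is fully up; the containers
// themselves are defined elsewhere with setAutoStartup(false).
@Component
public class JmsContainerStarter implements ApplicationListener<ApplicationReadyEvent> {

    private final List<DefaultMessageListenerContainer> containers;
    private final ExecutorService executor = Executors.newCachedThreadPool();

    public JmsContainerStarter(List<DefaultMessageListenerContainer> containers) {
        this.containers = containers;
    }

    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        // One thread per container, so a blocking connect (e.g. ZooKeeper being down)
        // delays neither the other containers nor the application startup.
        for (DefaultMessageListenerContainer container : containers) {
            executor.submit(container::start);
        }
    }
}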
Questions:
Is it necessary to start each message container in its own thread? This seems to be overkill, but the way the "start" process works is that the DefaultMessageListenerContainer is built for that listener and then it is started. There is a UI component that a user can use to start/stop the message listeners if need be, and if these are started sequentially in one thread, then the latter three message containers could be null if the first one has yet to connect on startup.
How do I accomplish goals 4 and 5?
Of course, any comments on whether I am on the right track would be helpful.
If you do not start them in a custom thread then the whole application cannot be fully started. It is not just Wicket; the Servlet container won't change the application state from STARTING to STARTED due to the blocking request to ZooKeeper.
Another option is to use a non-blocking request to ZooKeeper, but this is done by the JMS client (ActiveMQ), so you need to check whether this is supported in their docs (both ActiveMQ and ZooKeeper). I haven't used those in several years, so I cannot help you more.

Database failure shuts down ActiveMQ windows service using JDBC persistence

I have an ActiveMQ broker running as a Windows service. It is using jdbcPersistenceAdapter with an Oracle data source and Oracle's Universal Connection Pooling (UCP).
When the database is down (due to network problems or scheduled maintenance), the ActiveMQ Windows service shuts down completely. This, of course, makes the broker unavailable even after the database is restored.
I have tried connection validation in UCP, DBCP with connection validation, and even a MySQL data source, without any success. The service shuts down within 30 seconds of the database failure (I believe this is because the default cleanupInterval is 30 seconds).
Is there a way to prevent the Windows service from shutting down and make it wait for database availability?
Any help is greatly appreciated.
Here is my current configuration from activemq.xml:
<persistenceAdapter>
    <jdbcPersistenceAdapter dataSource="#oracle-ds"/>
</persistenceAdapter>
<bean id="oracle-ds" class="oracle.ucp.jdbc.PoolDataSourceFactory"
      factory-method="getPoolDataSource"
      p:URL="jdbc:oracle:thin:@localhost:1521:amq"
      p:connectionFactoryClassName="oracle.jdbc.pool.OracleDataSource"
      p:validateConnectionOnBorrow="true"
      p:user="appuser" p:password="userspassword" />
Generally, you should use JDBC master/slave to support failover to another broker when the database becomes unavailable;
see http://activemq.apache.org/jdbc-master-slave.html
That said, there is a known issue with JDBC master/slave failover that was fixed in 5.6.0;
see https://issues.apache.org/jira/browse/AMQ-1958

Spring JMS: Creating multiple connection to a queue

To process a large number of messages coming to a queue I need a guarantee of at least one JMS connection being available at any time. I am using Spring, and Spring only allows multiple sessions on a single connection. In case that one and only connection fails, the application will come to a standstill until Spring reconnects to the JMS bridge.
So how can I create more than one connection to a queue in Spring, and how can I do connection pooling here?
The answer to this depends on whether you are using Spring inside a J2EE container (JBoss etc.) or in a standalone application.
Standalone - you'll find pooling connections to be a problem. Spring's SingleConnectionFactory can be set up to renew the connection on an exception, guaranteeing that at some point a connection will come back online and start processing the queue again, but you'll still have the problem of waiting for that single connection to renew. Plus, depending on which messaging implementation you're dealing with and how it does load balancing, you may find yourself stuck with a connection to a single node in a cluster.
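For illustration, a minimal sketch of that standalone option, assuming an ActiveMQ client; the broker URL and names are placeholders:

import javax.jms.ConnectionFactory;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.springframework.jms.connection.SingleConnectionFactory;

public class SingleConnectionSketch {

    public static ConnectionFactory connectionFactory() {
        ConnectionFactory target = new ActiveMQConnectionFactory("tcp://localhost:61616"); // placeholder URL
        SingleConnectionFactory scf = new SingleConnectionFactory(target);
        // Recreate the underlying connection when a JMSException is reported,
        // so processing eventually resumes after a broker outage.
        scf.setReconnectOnException(true);
        return scf;
    }
}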
If you are running in a container, you can rely on the container's connection factory, which will be much more robust. JBoss Messaging in the container, for instance, will fail over seamlessly to other nodes and handles pooling under the covers. But if you're working in the container, it's usually easier to bail on JmsTemplate (which kind of sucks) and use whatever the container provides.
