Infinispan embedded cache jmx statistic numberOfEntries always -1 - spring-boot

My JMX JConsole always shows numberOfEntries=-1. How can I make it reflect the right number?
Detail:
I'm working with Infinispan 14.0.3.Final with the following simple configuration. The JConsole statistics show statisticEnabled=Unavailable and numberOfEntries=-1, even though the hits value increases when I put entries into the cache.
<infinispan>
    <cache-container name="default" statistics="true">
        <jmx enabled="true" />
        <replicated-cache name="invoices" mode="SYNC"
                          statistics="true" statistics-available="true">
        </replicated-cache>
    </cache-container>
</infinispan>
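For reference, this is roughly the programmatic equivalent I would expect the XML above to map to (a sketch against the Infinispan 14 embedded API, not my actual application code; the cache name and the sample put are only illustrative):

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfiguration;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class CacheStatsDemo {
    public static void main(String[] args) {
        // Container-level statistics plus JMX exposure, mirroring
        // <cache-container statistics="true"> and <jmx enabled="true"/>
        GlobalConfigurationBuilder gcb = GlobalConfigurationBuilder.defaultClusteredBuilder();
        gcb.cacheContainer().statistics(true);
        gcb.jmx().enable();
        GlobalConfiguration global = gcb.build();

        // Cache-level statistics, mirroring statistics="true" on the replicated cache
        ConfigurationBuilder cb = new ConfigurationBuilder();
        cb.clustering().cacheMode(CacheMode.REPL_SYNC);
        cb.statistics().enable();
        Configuration invoices = cb.build();

        try (DefaultCacheManager manager = new DefaultCacheManager(global)) {
            manager.defineConfiguration("invoices", invoices);
            manager.getCache("invoices").put("k", "v");
        }
    }
}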

Related

ActiveMQ not distributing messages between brokers

I have a network of brokers on a Complete Graph topology with 3 nodes at different servers: A, B and C. Every broker has a producer attached and, for testing purposes, only one non-broker consumer on broker C. As I'm using the Complete Graph topology every broker also has a broker-consumer for each of the other nodes.
The problem is: A receives a few messages. I expect it to forward those messages to broker C, which has a "real" consumer attached. This is not happening; broker A stores those messages until a "real" consumer connects to it.
What's wrong with my configuration (or understanding)?
I'm using ActiveMQ 5.9.0.
Here's my activemq.xml for broker A. It's the same for B and C, only changing names:
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
    http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">

    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker-A" dataDirectory="${activemq.data}">

        <destinationPolicy>
            <policyMap>
                <policyEntries>
                    <policyEntry topic="tokio.>">
                        <subscriptionRecoveryPolicy>
                            <noSubscriptionRecoveryPolicy/>
                        </subscriptionRecoveryPolicy>
                        <pendingMessageLimitStrategy>
                            <constantPendingMessageLimitStrategy limit="1000"/>
                        </pendingMessageLimitStrategy>
                    </policyEntry>
                </policyEntries>
            </policyMap>
        </destinationPolicy>

        <managementContext>
            <managementContext createConnector="true"/>
        </managementContext>

        <persistenceAdapter>
            <kahaDB directory="${activemq.data}/kahadb"/>
        </persistenceAdapter>

        <systemUsage>
            <systemUsage>
                <memoryUsage>
                    <memoryUsage percentOfJvmHeap="70" />
                </memoryUsage>
                <storeUsage>
                    <storeUsage limit="40 gb"/>
                </storeUsage>
                <tempUsage>
                    <tempUsage limit="10 gb"/>
                </tempUsage>
            </systemUsage>
        </systemUsage>

        <networkConnectors>
            <networkConnector name="linkTo-broker-B"
                              uri="static:(tcp://SRVMSG01:61616)"
                              duplex="true"/>
            <networkConnector name="linkTo-broker-C"
                              uri="static:(tcp://SRVMSG03:61616)"
                              duplex="true"/>
        </networkConnectors>

        <transportConnectors>
            <transportConnector uri="tcp://localhost:0" discoveryUri="multicast://default"/>
            <transportConnector name="nio" uri="nio://0.0.0.0:61616" />
        </transportConnectors>

    </broker>
</beans>
By default, networkTTL is 1 (see documentation), so when a producer on B publishes a message, if it takes the path to A (which it will do 50% of the time in your configuration because you've got the broker set up to round-robin between consumers, more on that in a second), it's not allowed to be forwarded to C. You could fix the problem by increasing the value of networkTTL, but the better solution is to set decreaseNetworkConsumerPriority=true (see documentation at same link as above) to ensure that messages always go as directly as possible to the consumer to which they're destined.
Note, however, that if your consumers move around the mesh, this can strand messages both because the networkTTL value won't allow additional forwards and because messages aren't allowed to be resent to a broker through which they've already passed. You can address those by setting networkTTL to a larger value (like 20, to be completely safe) and by applying the replayWhenNoConsumers=true policy setting described in the "Stuck Messages" section of that same documentation page. Neither of those settings is strictly necessary, as long as you're sure your consumers will never move to another broker or you're OK losing a few messages when it does happen.
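If you configure brokers in code rather than XML, the same two settings would be applied roughly like this (a sketch using the embedded BrokerService API; the broker, host and connector names are taken from your XML, and the TTL value of 3 is only an example):

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.network.NetworkConnector;

public class BrokerA {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("broker-A");
        broker.addConnector("nio://0.0.0.0:61616");

        // Link to broker C; apply the same settings to the linkTo-broker-B connector.
        NetworkConnector toC = broker.addNetworkConnector("static:(tcp://SRVMSG03:61616)");
        toC.setName("linkTo-broker-C");
        toC.setDuplex(true);
        // Prefer locally attached consumers over consumers reached through other brokers.
        toC.setDecreaseNetworkConsumerPriority(true);
        // Allow a message to cross more than one broker-to-broker hop if consumers move around.
        toC.setNetworkTTL(3);

        broker.start();
        broker.waitUntilStopped();
    }
}

In the XML configuration these are simply the decreaseNetworkConsumerPriority and networkTTL attributes on the networkConnector element.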

JBoss Cache JGroups Membership GMS TCPPING

I am using JBoss Cache (tried versions 3.2.7 and 3.1.0) to have a replicated Map for caching data between application servers. In the past I did some checks to see if it works out, and it did. My test environment was always 2 nodes in the same network segment.
Since IT departments sometimes have a problem with UDP, we use TCP (TCPPING for discovery).
Now a customer reported problems with our nodes losing their sync and not replicating data.
They have 4 nodes in 2 subnets (2 and 2). They say that when they only use 2 nodes in either subnet it works. When they start the third, the problems begin.
The logfiles state a lot of "merge" problems, indicating a partitioning problem.
So I did my own tests in my company. My setup is my laptop running Windows and two virtual machines with Ubuntu. The virtual machines use bridged networking interfaces. DHCP is used and our IT department provides my 3 nodes with different subnet IPs. My host laptop is in a different net than the virtual machines. TCP communication between the nodes works. There should be no firewall involved.
So much for my setup.
I wrote a little program that just initializes JBoss Cache, gets the cache (a Map), changes values in the map at an interval, and afterwards shows the content of the whole map. Pretty simple, 2 classes involved.
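In outline the test program looks roughly like this (a simplified sketch, not the exact code; it assumes the configuration below is on the classpath as jbosscache.xml):

import org.jboss.cache.Cache;
import org.jboss.cache.DefaultCacheFactory;
import org.jboss.cache.Fqn;
import org.jboss.cache.Node;

public class ReplicationTest {
    public static void main(String[] args) throws Exception {
        // Reads the XML configuration shown below from the classpath and starts the cache.
        Cache<String, String> cache =
                new DefaultCacheFactory<String, String>().createCache("jbosscache.xml", true);

        Fqn fqn = Fqn.fromString("/test/map");
        String me = System.getProperty("jgroups.bind_addr", "local");

        while (true) {
            // Change a value, then dump the content of the whole replicated map.
            cache.put(fqn, me, String.valueOf(System.currentTimeMillis()));
            Node<String, String> node = cache.getRoot().getChild(fqn);
            System.out.println(node != null ? node.getData() : "no data yet");
            Thread.sleep(5000);
        }
    }
}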
My JBoss Cache setup is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="urn:jboss:jbosscache-core:config:3.1">
    <clustering mode="replication" clusterName="${jgroups.clustername:DEFAULT}">
        <stateRetrieval timeout="20000" fetchInMemoryState="true" />
        <sync replTimeout="20000" />
        <jgroupsConfig>
            <TCP start_port="${jgroups.tcpping.start_port:7800}" loopback="true" recv_buf_size="20000000"
                 send_buf_size="640000" discard_incompatible_packets="true"
                 max_bundle_size="64000" max_bundle_timeout="30"
                 use_incoming_packet_handler="true" enable_bundling="false"
                 use_send_queues="false" sock_conn_timeout="3000"
                 skip_suspected_members="true" use_concurrent_stack="true"
                 thread_pool.enabled="true" thread_pool.min_threads="1"
                 thread_pool.max_threads="25" thread_pool.keep_alive_time="5000"
                 thread_pool.queue_enabled="false" thread_pool.queue_max_size="100"
                 thread_pool.rejection_policy="run" oob_thread_pool.enabled="true"
                 oob_thread_pool.min_threads="1" oob_thread_pool.max_threads="8"
                 oob_thread_pool.keep_alive_time="5000"
                 oob_thread_pool.queue_enabled="false"
                 oob_thread_pool.queue_max_size="100"
                 oob_thread_pool.rejection_policy="run" />
            <TCPPING timeout="3000"
                     initial_hosts="${jgroups.tcpping.initial_hosts:localhost[7800],localhost[7801]}"
                     port_range="3" num_initial_members="3" />
            <MERGE2 max_interval="100000" min_interval="20000" />
            <MERGE3 max_interval="100000" min_interval="20000" />
            <FD_SOCK />
            <FD timeout="10000" max_tries="5" shun="true" />
            <VERIFY_SUSPECT timeout="1500" />
            <BARRIER />
            <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
                           retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="true" />
            <UNICAST timeout="300,600,1200" />
            <pbcast.STABLE stability_delay="1000"
                           desired_avg_gossip="50000" max_bytes="400000" />
            <VIEW_SYNC avg_send_interval="60000" />
            <pbcast.GMS print_local_addr="true" join_timeout="6000"
                        shun="true" view_bundling="true" />
            <FC max_credits="2000000" min_threshold="0.10" />
            <FRAG2 frag_size="60000" />
            <pbcast.STREAMING_STATE_TRANSFER />
        </jgroupsConfig>
    </clustering>
</jbosscache>
When starting my test nodes I provide them with the following system properties:
-Djgroups.bind_addr=NODE1
-Djgroups.tcpping.initial_hosts=NODE1[7900],NODE2[7900],NODE3[7900]
-Djgroups.tcpping.start_port=7900
From the logging messages I can see the GMS message that the NODE address is indeed as specified NODE-X[7900] for all nodes.
NODE1-3 are given as IP addresses. Those IP addresses can be reached from the other nodes.
NODE1,2 are in the same subnet
NODE3 is in a different subnet
I did an incredible amount of tests, changing the config, the JBossCache version, the combinations of nodes running etc.
Sometimes it works, sometimes it doesn't.
One factor that seems to influence whether the members find each other is the initial hosts info: depending on the order of the hosts, whether there are additional hosts beyond the 3 given, and whether the node's own IP is left out of the list, the setup works or does not. It also depends on the order in which the nodes are started.
I am sure it has to do with the JGroups group membership. Maybe parameters need to be added to make it more robust.
I would really appreciate some hints on what to try in order to get the nodes talking to each other reliably.
In addition to trying to figure out the problems with JBoss Cache (JGroups), I did the same test using Hazelcast (TCP) instead. It works PERFECTLY without any problem, so the basic networking should work for these nodes.
I am considering switching to Hazelcast, but this would require redeployments in several of our customers' IT departments, and I would like to avoid that.
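For comparison, the Hazelcast test used a plain TCP join configured roughly like this (a sketch against the Hazelcast 3.x API, not the exact test code; the member names are placeholders for the node IPs):

import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class HazelcastTcpTest {
    public static void main(String[] args) {
        Config config = new Config();
        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false); // no UDP multicast, TCP discovery only
        join.getTcpIpConfig().setEnabled(true)
                .addMember("NODE1").addMember("NODE2").addMember("NODE3");

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        Map<String, String> map = instance.getMap("test");
        map.put(System.getProperty("node.name", "local"), "alive");
        System.out.println(map);
    }
}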

Spring batch admin remote partition steps running maximum 8 threads even though concurrency is 10?

I am using spring batch remote partitioning for batch process. I am launching jobs using spring batch admin.
I have set the inbound gateway consumer concurrency to 10, but the maximum number of partitions running in parallel is 8.
I want to increase the consumer concurrency to 15 later on.
Below is my configuration,
<task:executor id="taskExecutor" pool-size="50" />

<rabbit:template id="computeAmqpTemplate"
    connection-factory="rabbitConnectionFactory" routing-key="computeQueue"
    reply-timeout="${compute.partition.timeout}">
</rabbit:template>

<int:channel id="computeOutboundChannel">
    <int:dispatcher task-executor="taskExecutor" />
</int:channel>

<int:channel id="computeInboundStagingChannel" />

<amqp:outbound-gateway request-channel="computeOutboundChannel"
    reply-channel="computeInboundStagingChannel" amqp-template="computeAmqpTemplate"
    mapped-request-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS"
    mapped-reply-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS" />

<beans:bean id="computeMessagingTemplate"
    class="org.springframework.integration.core.MessagingTemplate"
    p:defaultChannel-ref="computeOutboundChannel"
    p:receiveTimeout="${compute.partition.timeout}" />

<beans:bean id="computePartitionHandler"
    class="org.springframework.batch.integration.partition.MessageChannelPartitionHandler"
    p:stepName="computeStep" p:gridSize="${compute.grid.size}"
    p:messagingOperations-ref="computeMessagingTemplate" />

<int:aggregator ref="computePartitionHandler"
    send-partial-result-on-expiry="true" send-timeout="${compute.step.timeout}"
    input-channel="computeInboundStagingChannel" />

<amqp:inbound-gateway concurrent-consumers="${compute.consumer.concurrency}"
    request-channel="computeInboundChannel"
    reply-channel="computeOutboundStagingChannel" queue-names="computeQueue"
    connection-factory="rabbitConnectionFactory"
    mapped-request-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS"
    mapped-reply-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS" />

<int:channel id="computeInboundChannel" />

<int:service-activator ref="stepExecutionRequestHandler"
    input-channel="computeInboundChannel" output-channel="computeOutboundStagingChannel" />

<int:channel id="computeOutboundStagingChannel" />

<beans:bean id="computePartitioner"
    class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
    p:resources="file:${spring.tmp.batch.dir}/#{jobParameters[batch_id]}/shares_rics/shares_rics_*.txt"
    scope="step" />

<beans:bean id="computeFileItemReader"
    class="org.springframework.batch.item.file.FlatFileItemReader"
    p:resource="#{stepExecutionContext[fileName]}" p:lineMapper-ref="stLineMapper"
    scope="step" />

<beans:bean id="computeItemWriter"
    class="com.st.batch.foundation.writers.ComputeItemWriter"
    p:symfony-ref="symfonyStepScoped" p:timeout="${compute.item.timeout}"
    p:batchId="#{jobParameters[batch_id]}" scope="step" />

<step id="computeStep">
    <tasklet transaction-manager="transactionManager">
        <chunk reader="computeFileItemReader" writer="computeItemWriter"
               commit-interval="${compute.commit.interval}" />
    </tasklet>
</step>

<flow id="computeFlow">
    <step id="computeStep.master">
        <partition partitioner="computePartitioner"
                   handler="computePartitionHandler" />
    </step>
</flow>

<job id="computeJob" restartable="true">
    <flow id="computeJob.computeFlow" parent="computeFlow" />
</job>
compute.grid.size = 112
compute.consumer.concurrency = 10
Input files are split into 112 equal parts = compute.grid.size = total number of partitions.
Number of servers = 4.
There are 2 problems,
i) Even though I have set the concurrency to 10, the maximum number of threads running is 8.
ii) Some servers are slower (other processes run on them) and some are faster, so I want to make sure step executions are distributed fairly, i.e. if the faster servers are done with their executions, the remaining executions in the queue should go to them. Work should not be distributed in round-robin fashion.
I know that in RabbitMQ there is a prefetch count setting and an ack mode to distribute work fairly. For Spring Integration, the prefetch count is 1 and the ack mode is AUTO by default. But still, some servers keep running more partitions even though other servers have been done for a long time. Ideally no server should sit idle.
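For clarity, my understanding is that the <amqp:inbound-gateway> above boils down to listener-container settings roughly like the following (a sketch with Spring AMQP's SimpleMessageListenerContainer; the wiring is simplified and the values are the ones from my properties):

import org.springframework.amqp.core.AcknowledgeMode;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

public class ComputeListenerContainerFactory {
    // Sketch of the consumer-side settings the inbound gateway uses internally.
    public static SimpleMessageListenerContainer create(ConnectionFactory connectionFactory) {
        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(connectionFactory);
        container.setQueueNames("computeQueue");
        container.setConcurrentConsumers(10);               // compute.consumer.concurrency
        container.setPrefetchCount(1);                      // at most one unacked partition per consumer
        container.setAcknowledgeMode(AcknowledgeMode.AUTO); // default ack mode
        return container;
    }
}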
Update:
One more thing I have now observed: some steps which run in parallel using a split (not distributed using remote partitioning) also run a maximum of 8 in parallel. It looks like a thread pool limit issue, but as you can see the taskExecutor has its pool-size set to 50.
Is there anything in spring-batch/spring-batch-admin which limits the number of concurrently running steps?
2nd Update:
And, if there are 8 or more threads running in parallel processing items, Spring Batch Admin doesn't load; it just hangs. If I reduce the concurrency, Spring Batch Admin loads. I even tested it with the concurrency set to 4 on one server and 8 on another: Spring Batch Admin doesn't load if I use the URL of the server where 8 threads are running, but it works on the server where 4 threads are running.
Spring Batch Admin has the jobLauncher configuration below:
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>

<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT" />
The pool size is 6 there; does it have anything to do with the above problem?
Or is there anything in Tomcat 7 which restricts the number of running threads to 8?
Are you using a database for the JobRepository?
During the execution, the batch framework persists step executions, and the number of connections to the JobRepository database can interfere with parallel step executions.
A concurrency of 8 makes me think you might be using BasicDataSource, whose default maximum pool size is 8? If so, switch to something like DriverManagerDataSource and see.
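If it is BasicDataSource, one option besides switching is simply to give the JobRepository a bigger pool than the number of concurrently executing steps. A sketch with commons-dbcp 1.x (driver, URL and credentials are placeholders):

import javax.sql.DataSource;
import org.apache.commons.dbcp.BasicDataSource;

public class JobRepositoryDataSource {
    // Give the JobRepository more connections than the number of concurrently
    // executing steps, otherwise step executions queue up waiting for the pool.
    public static DataSource create() {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("com.mysql.jdbc.Driver"); // placeholder driver
        ds.setUrl("jdbc:mysql://localhost/batchdb");    // placeholder URL
        ds.setUsername("batch");
        ds.setPassword("secret");
        ds.setMaxActive(20); // BasicDataSource's default is 8
        ds.setMaxIdle(20);
        return ds;
    }
}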
Confused - you said "I have set the concurrency to 10" but then show compute.consumer.concurrency = 8. So it is working as configured. It is impossible to have only 8 consumer threads if the property is set to 10.
From Rabbit's perspective, all consumers are equal - if there are 10 consumers on a slow box and 10 consumers on a fast box, and you only have 10 partitions, it is possible that all 10 partitions will end up on the slow box.
RabbitMQ does not distribute work across servers, it distributes the work across consumers only.
You might get better distribution by reducing the concurrency. You should also set the concurrency lower on the slower boxes.

Spring annotation-driven ehcache not evicting using timeToLiveSeconds

Sorry in advance for the long post; however, I wanted to be precise.
I have a Spring MVC 3.1 web application and have just built ehcache into it, which I am using to cache all my lists of values (drop-downs) built from the database.
Here is an example of my settings...
<!-- Cache -->
<dependency>
    <groupId>net.sf.ehcache</groupId>
    <artifactId>ehcache-core</artifactId>
    <version>2.5.0</version>
</dependency>
...
<ehcache>
    <diskStore path="java.io.tmpdir"/>
    <cache
        name="lovStore"
        maxElementsInMemory="512"
        eternal="false"
        timeToIdleSeconds="60"
        timeToLiveSeconds="60"
        overflowToDisk="false"
        memoryStoreEvictionPolicy="LRU"/>
</ehcache>
...
<cache:annotation-driven />
<bean id="cacheManager"
      class="org.springframework.cache.ehcache.EhCacheCacheManager"
      p:cache-manager-ref="ehcache"/>
<bean id="ehcache"
      class="org.springframework.cache.ehcache.EhCacheManagerFactoryBean"
      p:config-location="classpath:ehcache.xml"/>
...
@Cacheable(value = "lovStore", key = "#code")
public Map<String, String> getLov(String code) throws ReportingManagerException {
    return MockLOVHelper.getInstance().getLov(code);
}
A lot of the tutorials on the net talk about evicting the cache using a @CacheEvict method; however, this does not suit me. I think I am better off using the timeToLiveSeconds option to evict the cache.
When I look in the logs the cache is definitely working, but the eviction of the cache is not. I've read some articles on the net about how timeToLiveSeconds doesn't truly evict the cache, and others like this one http://code.google.com/p/ehcache-spring-annotations/wiki/EvictExpiredElements which say there are special settings you have to create to get the cache to evict.
Can someone please help me understand whether my cache should be evicting, and also how I can evict, because I was not able to understand how to implement what is mentioned in the article?
Here are what my logs look like. But there are no signs of eviction...
2014-01-20 13:32:41,791 DEBUG [AnnotationCacheOperationSource] - Adding cacheable method 'getLov' with attribute: [CacheableOperation[public java.util.Map com.myer.reporting.dao.mock.MockLovStoreDaoImpl.getLov(java.lang.String) throws com.myer.reporting.exception.ReportingManagerException] caches=[lovStore] | condition='' | key='#code']
thanks
Ehcache does not evict elements until there is a need to do so. That's reasonable, since there is no need to waste CPU resources on an operation (evicting elements) that wouldn't make much difference.
If someone tries to get an element from the cache and the element's time to live/idle has expired, then Ehcache evicts that element and returns null. If the maximum number of elements or the memory limit for the cache/pool is reached, then Ehcache evicts expired elements based on the eviction policy.
And if I understand correctly, it is not guaranteed that all expired elements will be evicted, since only sample elements are selected for eviction (from the documentation):
@param sampledElements this should be a random subset of the population
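A small sketch of that behaviour against the plain net.sf.ehcache 2.x API (the cache name matches your configuration; the sleep is only there to step past timeToLiveSeconds=60):

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class TtlDemo {
    public static void main(String[] args) throws InterruptedException {
        CacheManager manager = CacheManager.create(); // picks up ehcache.xml from the classpath
        Cache lovStore = manager.getCache("lovStore");

        lovStore.put(new Element("COUNTRY", "some cached value"));
        System.out.println("fresh:   " + lovStore.get("COUNTRY")); // the element is returned

        Thread.sleep(61000); // wait past timeToLiveSeconds=60

        // The expired element is only removed lazily, when it is touched:
        // get() notices it has expired, evicts it and returns null.
        System.out.println("expired: " + lovStore.get("COUNTRY")); // null
        manager.shutdown();
    }
}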

Infinispan : Responses contains more than 1 not equal elements

I am using Infinispan as an L2 cache and I have two application nodes. The L2 caches in the two apps are replicated. The two apps are not identical.
One of my apps fills the database using web services, while the other app runs a GUI for the database.
Both apps read from and write to the database extensively. After running the apps I saw the following error. I do not know what causes it.
I wonder why:
- my cache instances are not properly replicating each change to the other
- the L2 cache got two responses
- the L2 responses are not equal
ERROR org.infinispan.interceptors.InvocationContextInterceptor - ISPN000136: Execution error
2013-05-29 06:32:32 ERROR - Exception while processing event, reason: org.infinispan.loaders.CacheLoaderException: Responses contains more than 1 element and these elements are not equal, so can't decide which one to use:
[SuccessfulResponse{responseValue=TransientCacheValue{maxIdle=100000, lastUsed=1369809152081} TransientCacheValue {value=MarshalledValue{instance=, serialized=ByteArray{size=1911, array=0x0301fe0409000000..}, cachedHashCode=1816114786}#57991642}} ,
SuccessfulResponse{responseValue=TransientCacheValue{maxIdle=100000, lastUsed=1369809152116} TransientCacheValue {value=MarshalledValue{instance=, serialized=ByteArray{size=1911, array=0x0301fe0409000000..}, cachedHashCode=1816114786}#6cdaa731}} ]
My Infinispan configuration is
<globalJmxStatistics enabled="true" jmxDomain="org.infinispan" allowDuplicateDomains="true"/>
<transport
    transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport"
    clusterName="infinispan-hibernate-cluster"
    distributedSyncTimeout="50000"
    strictPeerToPeer="false">
    <properties>
        <property name="configurationFile" value="jgroups.xml"/>
    </properties>
</transport>
</global>

<default>
</default>

<namedCache name="my-cache-entity">
    <clustering mode="replication">
        <stateRetrieval fetchInMemoryState="false" timeout="60000"/>
        <sync replTimeout="20000"/>
    </clustering>
    <locking isolationLevel="READ_COMMITTED" concurrencyLevel="1000"
             lockAcquisitionTimeout="15000" useLockStriping="false"/>
    <eviction maxEntries="10000" strategy="LRU"/>
    <expiration maxIdle="100000" wakeUpInterval="5000"/>
    <lazyDeserialization enabled="true"/>
    <!--<transaction useSynchronization="true"
        transactionMode="TRANSACTIONAL" autoCommit="false"
        lockingMode="OPTIMISTIC"/>-->
    <loaders passivation="false" shared="false" preload="false">
        <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader"
                fetchPersistentState="false"
                ignoreModifications="false" purgeOnStartup="false">
            <properties>
                <property name="remoteCallTimeout" value="20000"/>
            </properties>
        </loader>
    </loaders>
</namedCache>
Replicated entity caches should be configured with state retrieval, as already indicated in the default Infinispan configuration file, and you have already done so. ClusterCacheLoader should only be used in special situations (for query caching). Why not just use the default Infinispan configuration provided? In fact, if you don't configure a config file, it will use the default one.
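For instance, with the Hibernate/Infinispan integration of that era the relevant part is simply not pointing Hibernate at a custom Infinispan file, so the bundled defaults are used. A sketch of the properties (adjust the region factory class to your Hibernate version):

import java.util.Properties;

public class SecondLevelCacheSettings {
    // Note there is deliberately no hibernate.cache.infinispan.cfg entry,
    // so Hibernate falls back to Infinispan's bundled default configuration.
    public static Properties infinispanDefaults() {
        Properties props = new Properties();
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");
        props.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.infinispan.InfinispanRegionFactory");
        return props;
    }
}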
