Cannot acquire lock exception in Lettuce - Spring

We have recently moved from Jedis to Lettuce in our production services. However, we have hit a roadblock while creating Redis distributed locks.
We are using a non-clustered AWS ElastiCache setup with one master and 2 read replicas.
Configs:
Spring Boot: 2.2.5
spring-boot-starter-data-redis: 2.2.5
spring-data-redis: 2.2.5
spring-integration-redis: 5.2.4
Redis: 5.0.6
@Bean
public LettuceConnectionFactory redisConnectionFactory() {
    GenericObjectPoolConfig poolingConfig = new GenericObjectPoolConfig();
    // Pool sizing: total connections from maxConnections, idle bounds from the idle settings
    poolingConfig.setMaxTotal(Integer.valueOf(maxConnections));
    poolingConfig.setMaxIdle(Integer.valueOf(maxIdleConnections));
    poolingConfig.setMinIdle(Integer.valueOf(minIdleConnections));
    poolingConfig.setMaxWaitMillis(-1);
    final SocketOptions socketOptions = SocketOptions.builder().connectTimeout(Duration.ofSeconds(10)).build();
    final ClientOptions clientOptions = ClientOptions.builder().socketOptions(socketOptions).build();
    LettucePoolingClientConfiguration clientOption = LettucePoolingClientConfiguration.builder()
            .poolConfig(poolingConfig)
            .readFrom(ReadFrom.REPLICA_PREFERRED)
            .commandTimeout(Duration.ofMillis(Long.valueOf(commandTimeout)))
            .clientOptions(clientOptions)
            .useSsl()
            .build();
    RedisStaticMasterReplicaConfiguration redisStaticMasterReplicaConfiguration = new RedisStaticMasterReplicaConfiguration(
            primaryEndPoint, Integer.valueOf(port));
    redisStaticMasterReplicaConfiguration.addNode(readerEndPoint, Integer.valueOf(port));
    redisStaticMasterReplicaConfiguration.setPassword(password);
    /*
     * LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
     *     .useSsl()
     *     .readFrom(new ReadFrom() {
     *         @Override
     *         public List<RedisNodeDescription> select(Nodes nodes) {
     *             List<RedisNodeDescription> allNodes = nodes.getNodes();
     *             int ind = Math.abs(index.incrementAndGet() % allNodes.size());
     *             RedisNodeDescription selected = allNodes.get(ind);
     *             // logger.info("Selected random node {} with uri {}", ind, selected.getUri());
     *             List<RedisNodeDescription> remaining = IntStream.range(0, allNodes.size())
     *                 .filter(i -> i != ind)
     *                 .mapToObj(allNodes::get)
     *                 .collect(Collectors.toList());
     *             return Stream.concat(Stream.of(selected), remaining.stream())
     *                 .collect(Collectors.toList());
     *         }
     *     })
     *     .build();
     */
    return new LettuceConnectionFactory(redisStaticMasterReplicaConfiguration, clientOption);
}
@Bean
public StringRedisTemplate stringRedisTemplate() {
    return new StringRedisTemplate(redisConnectionFactory());
}
LOCKING SERVICE
@Service
public class RedisLockService {

    private static final Logger LOGGER = LoggerFactory.getLogger(RedisLockService.class);

    @Autowired
    RedisConnectionFactory redisConnectionFactory;

    public Lock obtainLock(String registryKey, String redisKey, Long lockExpiry) {
        try {
            RedisLockRegistry registry = new RedisLockRegistry(redisConnectionFactory, registryKey, lockExpiry);
            Lock lock = registry.obtain(redisKey);
            if (!lock.tryLock()) {
                LOGGER.info("Lock already made");
                return null;
            }
            return lock;
        } catch (Exception e) {
            LOGGER.warn("Unable to acquire lock: ", e);
            return null;
        }
    }

    public void unLock(Lock lock) {
        if (lock != null) {
            lock.unlock();
        }
    }
}
The error we are getting while trying to call the obtainLock function:
.RedisSystemException: Error in execution; nested exception is io.lettuce.core.RedisCommandExecutionException: ERR Error running script (call to f_8426c8df41c64d8177dce3ecbbe9146ef3759cd2): #user_script:6: #user_script: 6: -READONLY You can't write against a read only replica.
at org.springframework.integration.redis.util.RedisLockRegistry$RedisLock.rethrowAsLockException(RedisLockRegistry.java:224)
at org.springframework.integration.redis.util.RedisLockRegistry$RedisLock.tryLock(RedisLockRegistry.java:276)
at org.springframework.integration.redis.util.RedisLockRegistry$RedisLock.tryLock(Re

You need to connect to the master (read/write) node of Redis. RedisLockRegistry acquires the lock with a Lua script that writes, and with ReadFrom.REPLICA_PREFERRED that script can be routed to a read-only replica, which rejects it with -READONLY, exactly as the stack trace shows.
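A minimal sketch of that fix, assuming the same endpoint, password, and timeout fields as the configuration above (the "locks" registry key and 30-second expiry are placeholder values): keep REPLICA_PREFERRED for the general-purpose factory, and give the lock registry its own connection factory pinned to the master so the lock script always runs against the writable node.
@Bean
public LettuceConnectionFactory lockConnectionFactory() {
    // ReadFrom.MASTER routes every command, including the lock's EVAL, to the master
    LettucePoolingClientConfiguration lockClientConfig = LettucePoolingClientConfiguration.builder()
            .poolConfig(new GenericObjectPoolConfig())
            .readFrom(ReadFrom.MASTER)
            .commandTimeout(Duration.ofMillis(Long.valueOf(commandTimeout)))
            .useSsl()
            .build();
    RedisStaticMasterReplicaConfiguration masterConfig = new RedisStaticMasterReplicaConfiguration(
            primaryEndPoint, Integer.valueOf(port));
    masterConfig.setPassword(password);
    return new LettuceConnectionFactory(masterConfig, lockClientConfig);
}

@Bean
public RedisLockRegistry redisLockRegistry(LettuceConnectionFactory lockConnectionFactory) {
    // Registry key and expiry are illustrative values
    return new RedisLockRegistry(lockConnectionFactory, "locks", 30000L);
}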

Related

Caused by io.lettuce.core.RedisCommandExecutionException: MOVED 15596 XX.X.XXX.XX:6379 Java Spring Boot

We have a Spring Boot application which is deployed to AWS Lambda.
Code
public AbstractRedisClient getClient(String host, String port) {
    LOG.info("redis-uri " + "redis://" + host + ":" + port);
    return RedisClient.create("redis://" + host + ":" + port);
}

/**
 * Returns the Redis client using the Lettuce-Redis-Client
 *
 * @return RedisClient
 */
public RedisClient getConnection(String host, String port) {
    LOG.info("redis-Host " + host);
    LOG.info("redis-Port " + port);
    RedisClient redisClient = (RedisClient) getClient(host, port);
    redisClient.setDefaultTimeout(Duration.ofSeconds(10));
    return redisClient;
}
private RedisCommands<String, String> getRedisCommands() {
    StatefulRedisConnection<String, String> statefulConnection = openConnection();
    if (statefulConnection != null) {
        return statefulConnection.sync();
    }
    return null;
}

public StatefulRedisConnection<String, String> openConnection() {
    if (connection != null && connection.isOpen()) {
        return connection;
    }
    String redisPort = "6379";
    String redisHost = environment.getProperty("REDIS_HOST");
    //String redisPort = environment.getProperty("REDIS_PORT");
    LOG.info("Host: {}", redisHost);
    LOG.info("Port: {}", redisPort);
    UnifiedReservationRedisConfig lettuceRedisConfig = new UnifiedReservationRedisConfig();
    String redisUri = "redis://" + redisHost + ":" + redisPort;
    redisClient = lettuceRedisConfig.getConnection(redisHost, redisPort);
    ConnectionFuture<StatefulRedisConnection<String, String>> future = redisClient
            .connectAsync(StringCodec.UTF8, RedisURI.create(redisUri));
    try {
        connection = future.get();
    } catch (InterruptedException | ExecutionException exception) {
        LOG.info(exception.getMessage());
        closeConnectionsAsync();
        connection = null;
        Thread.currentThread().interrupt();
    }
    return connection;
}

private void closeConnectionsAsync() {
    LOG.info("Close redis connection");
    if (connection != null && connection.isOpen()) {
        connection.closeAsync();
    }
    if (redisClient != null) {
        redisClient.shutdownAsync();
    }
}
The issue does not happen all the time, but we frequently get this error: Caused by: io.lettuce.core.RedisCommandExecutionException: MOVED 15596 XX.X.XXX.XX:6379. Can anyone help solve this issue?
As far as I know, you are port-forwarding to your Redis cluster/instance, but the actual Redis ports like 6379 are not accessible from your application's network.
When Redis's Java client gets access to Redis, it then tries to connect to the actual ports like 6379.
Try using RedisClusterClient instead of RedisClient. The unhandled MOVED response indicates that you are using the non-cluster-aware client against Redis deployed in cluster mode; in a MOVED reply, 15596 is the hash slot that moved, not a port.
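A minimal sketch of the cluster-aware client, reusing the host and port parameters from the question's getClient method; the cluster client discovers the topology itself and follows MOVED/ASK redirects instead of surfacing them as errors.
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;
import io.lettuce.core.cluster.api.sync.RedisAdvancedClusterCommands;

public StatefulRedisClusterConnection<String, String> openClusterConnection(String host, String port) {
    RedisClusterClient clusterClient = RedisClusterClient.create("redis://" + host + ":" + port);
    // connect() returns a connection whose commands route each key to the node that owns its slot
    return clusterClient.connect();
}

// Usage: the sync() commands work like RedisCommands but are cluster-aware
RedisAdvancedClusterCommands<String, String> commands = openClusterConnection(host, port).sync();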

Intermittent SocketTimeoutException with elasticsearch-rest-client-7.2.0

I am using RestHighLevelClient version 7.2 to connect to an ElasticSearch cluster version 7.2. My cluster has 3 master nodes and 2 data nodes. Data node config: 2 cores and 8 GB of memory. I have used the below code in my Spring Boot project to create the RestHighLevelClient instance.
@Bean(destroyMethod = "close")
@Qualifier("readClient")
public RestHighLevelClient readClient() {
    final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
    credentialsProvider.setCredentials(AuthScope.ANY,
            new UsernamePasswordCredentials(elasticUser, elasticPass));
    RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, elasticPort))
            .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                    .setDefaultCredentialsProvider(credentialsProvider)
                    .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build()));
    builder.setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
            .setConnectTimeout(30000)
            .setSocketTimeout(60000));
    RestHighLevelClient restClient = new RestHighLevelClient(builder);
    return restClient;
}
RestHighLevelClient is a singleton bean. Intermittently I am getting SocketTimeoutException with both GET and PUT requests. The index size is around 50 MB. I have tried increasing the socket timeout value, but I still receive the same error. Am I missing some configuration? Any help would be appreciated.
I found the issue and just wanted to share so that it can help others.
I was using a load balancer to connect to the ElasticSearch cluster.
As you can see from my RestClientBuilder code, I was using only the load balancer host and port. Although I have multiple master nodes, RestClient was still not retrying my request in case of a connection timeout.
RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, elasticPort))
        .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                .setDefaultCredentialsProvider(credentialsProvider)
                .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build()));
According to the RestClient code, if we use a single host it won't retry in case of any connection issue.
So I changed my code as below and it started working.
RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, 9200), new HttpHost(elasticHost, 9201))
        .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider));
For the complete RestClient code please refer to https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java
Retry code block in RestClient
private Response performRequest(final NodeTuple<Iterator<Node>> nodeTuple,
                                final InternalRequest request,
                                Exception previousException) throws IOException {
    RequestContext context = request.createContextForNextAttempt(nodeTuple.nodes.next(), nodeTuple.authCache);
    HttpResponse httpResponse;
    try {
        httpResponse = client.execute(context.requestProducer, context.asyncResponseConsumer, context.context, null).get();
    } catch (Exception e) {
        RequestLogger.logFailedRequest(logger, request.httpRequest, context.node, e);
        onFailure(context.node);
        Exception cause = extractAndWrapCause(e);
        addSuppressedException(previousException, cause);
        if (nodeTuple.nodes.hasNext()) {
            return performRequest(nodeTuple, request, cause);
        }
        if (cause instanceof IOException) {
            throw (IOException) cause;
        }
        if (cause instanceof RuntimeException) {
            throw (RuntimeException) cause;
        }
        throw new IllegalStateException("unexpected exception type: must be either RuntimeException or IOException", cause);
    }
    ResponseOrResponseException responseOrResponseException = convertResponse(request, context.node, httpResponse);
    if (responseOrResponseException.responseException == null) {
        return responseOrResponseException.response;
    }
    addSuppressedException(previousException, responseOrResponseException.responseException);
    if (nodeTuple.nodes.hasNext()) {
        return performRequest(nodeTuple, request, responseOrResponseException.responseException);
    }
    throw responseOrResponseException.responseException;
}
I'm facing the same issue, and seeing this I realized that the retry is happening on my side too, once per host (I have 3 hosts and the exception happens in 3 threads). I wanted to post this since you might face the same issue, or someone else might come to this post because of the same SocketTimeoutException.
Searching the official docs: the RestHighLevelClient uses the RestClient under the hood, and the RestClient uses CloseableHttpAsyncClient, which has a connection pool. ElasticSearch specifies that you should close the connection once you are done (the definition of "done" is ambiguous in a long-running application), but in general I have found that you should close it when the application is shutting down, rather than when you have finished querying.
The official Apache documentation has an example of handling the connection pool, which I'm trying to follow; I'll try to replicate the scenario and will post if it fixes my issue. The code can be found here:
https://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/examples/org/apache/http/examples/nio/client/AsyncClientEvictExpiredConnections.java
This is what I have so far:
@Bean(name = "RestHighLevelClientWithCredentials", destroyMethod = "close")
public RestHighLevelClient elasticsearchClient(ElasticSearchClientConfiguration elasticSearchClientConfiguration,
                                               RestClientBuilder.HttpClientConfigCallback httpClientConfigCallback) {
    return new RestHighLevelClient(
            RestClient
                    .builder(getElasticSearchHosts(elasticSearchClientConfiguration))
                    .setHttpClientConfigCallback(httpClientConfigCallback)
    );
}

@Bean
@RefreshScope
public RestClientBuilder.HttpClientConfigCallback getHttpClientConfigCallback(
        PoolingNHttpClientConnectionManager poolingNHttpClientConnectionManager,
        CredentialsProvider credentialsProvider
) {
    return httpAsyncClientBuilder -> {
        httpAsyncClientBuilder.setSSLHostnameVerifier(NoopHostnameVerifier.INSTANCE);
        httpAsyncClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
        httpAsyncClientBuilder.setConnectionManager(poolingNHttpClientConnectionManager);
        return httpAsyncClientBuilder;
    };
}
public class ElasticSearchClientManager {

    private ElasticSearchClientManager.IdleConnectionEvictor idleConnectionEvictor;

    /**
     * Custom client connection manager to create a connection watcher
     *
     * @param elasticSearchClientConfiguration elasticSearchClientConfiguration
     * @return PoolingNHttpClientConnectionManager
     */
    @Bean
    @RefreshScope
    public PoolingNHttpClientConnectionManager getPoolingNHttpClientConnectionManager(
            ElasticSearchClientConfiguration elasticSearchClientConfiguration
    ) {
        try {
            SSLIOSessionStrategy sslSessionStrategy = new SSLIOSessionStrategy(getTrustAllSSLContext());
            Registry<SchemeIOSessionStrategy> sessionStrategyRegistry = RegistryBuilder.<SchemeIOSessionStrategy>create()
                    .register("http", NoopIOSessionStrategy.INSTANCE)
                    .register("https", sslSessionStrategy)
                    .build();
            ConnectingIOReactor ioReactor = new DefaultConnectingIOReactor();
            PoolingNHttpClientConnectionManager poolingNHttpClientConnectionManager =
                    new PoolingNHttpClientConnectionManager(ioReactor, sessionStrategyRegistry);
            idleConnectionEvictor = new ElasticSearchClientManager.IdleConnectionEvictor(poolingNHttpClientConnectionManager,
                    elasticSearchClientConfiguration);
            idleConnectionEvictor.start();
            return poolingNHttpClientConnectionManager;
        } catch (IOReactorException e) {
            // Preserve the underlying IOReactorException as the cause for diagnosability
            throw new RuntimeException("Failed to create a watcher for the connection pool", e);
        }
    }

    private SSLContext getTrustAllSSLContext() {
        try {
            return new SSLContextBuilder()
                    .loadTrustMaterial(null, (x509Certificates, string) -> true)
                    .build();
        } catch (Exception e) {
            throw new RuntimeException("Failed to create SSL Context with open certificate", e);
        }
    }

    public IdleConnectionEvictor.State state() {
        return idleConnectionEvictor.evictorState;
    }

    @PreDestroy
    private void finishManager() {
        idleConnectionEvictor.shutdown();
    }

    public static class IdleConnectionEvictor extends Thread {

        private final NHttpClientConnectionManager nhttpClientConnectionManager;
        private final ElasticSearchClientConfiguration elasticSearchClientConfiguration;

        @Getter
        private State evictorState;
        private volatile boolean shutdown;

        public IdleConnectionEvictor(NHttpClientConnectionManager nhttpClientConnectionManager,
                                     ElasticSearchClientConfiguration elasticSearchClientConfiguration) {
            super();
            this.nhttpClientConnectionManager = nhttpClientConnectionManager;
            this.elasticSearchClientConfiguration = elasticSearchClientConfiguration;
        }

        @Override
        public void run() {
            try {
                while (!shutdown) {
                    synchronized (this) {
                        wait(elasticSearchClientConfiguration.getExpiredConnectionsCheckTime());
                        // Close expired connections
                        nhttpClientConnectionManager.closeExpiredConnections();
                        // Optionally, close connections that have been idle
                        // longer than the configured limit
                        nhttpClientConnectionManager.closeIdleConnections(elasticSearchClientConfiguration.getMaxTimeIdleConnections(),
                                TimeUnit.SECONDS);
                        this.evictorState = State.RUNNING;
                    }
                }
            } catch (InterruptedException ex) {
                this.evictorState = State.NOT_RUNNING;
            }
        }

        private void shutdown() {
            shutdown = true;
            synchronized (this) {
                notifyAll();
            }
        }

        public enum State {
            RUNNING,
            NOT_RUNNING
        }
    }
}

Spring AMQP, CorrelationId and GZipPostProcessor: UnsupportedEncodingException

I have a project with Spring AMQP (1.7.12.RELEASE).
If I put a value in the correlationId field (getMessageProperties().setCorrelationId) and I use GZipPostProcessor, the following error always occurs:
"org.springframework.amqp.AmqpUnsupportedEncodingException: java.io.UnsupportedEncodingException: gzip"
To solve it, the following code seems to work:
DefaultMessagePropertiesConverter messageConverter = new DefaultMessagePropertiesConverter();
messageConverter.setCorrelationIdAsString(DefaultMessagePropertiesConverter.CorrelationIdPolicy.STRING);
template.setMessagePropertiesConverter(messageConverter);
but I do not know what implications using it will have in production with clients that do not use Spring AMQP (I set this field when the incoming message has it).
I enclose a complete code example:
@Configuration
public class SimpleProducerGZIP {

    static final String queueName = "spring-boot";

    @Bean
    public CachingConnectionFactory connectionFactory() {
        com.rabbitmq.client.ConnectionFactory factory = new com.rabbitmq.client.ConnectionFactory();
        factory.setHost("localhost");
        factory.setAutomaticRecoveryEnabled(false);
        CachingConnectionFactory connectionFactory = new CachingConnectionFactory(factory);
        return connectionFactory;
    }

    @Bean
    public AmqpAdmin amqpAdmin() {
        RabbitAdmin rabbitAdmin = new RabbitAdmin(connectionFactory());
        rabbitAdmin.setAutoStartup(true);
        return rabbitAdmin;
    }

    @Bean
    Queue queue() {
        Queue qr = new Queue(queueName, false);
        qr.setAdminsThatShouldDeclare(amqpAdmin());
        return qr;
    }

    @Bean
    public RabbitTemplate rabbitTemplate() {
        RabbitTemplate template = new RabbitTemplate(connectionFactory());
        template.setEncoding("gzip");
        template.setBeforePublishPostProcessors(new GZipPostProcessor());
        // TODO :
        DefaultMessagePropertiesConverter messageConverter = new DefaultMessagePropertiesConverter();
        messageConverter.setCorrelationIdAsString(DefaultMessagePropertiesConverter.CorrelationIdPolicy.STRING);
        template.setMessagePropertiesConverter(messageConverter);
        return template;
    }

    public static void main(String[] args) {
        @SuppressWarnings("resource")
        ApplicationContext context = new AnnotationConfigApplicationContext(SimpleProducerGZIP.class);
        RabbitTemplate _rabbitTemplate = context.getBean(RabbitTemplate.class);
        int contador = 0;
        try {
            while (true) {
                contador = contador + 1;
                int _nContador = contador;
                System.out.println("\nStart send : " + _nContador);
                Object _o = new String(("New Message : " + contador));
                try {
                    _rabbitTemplate.convertAndSend(queueName, _o,
                            new MessagePostProcessor() {
                                @SuppressWarnings("deprecation")
                                @Override
                                public Message postProcessMessage(Message msg) throws AmqpException {
                                    if (_nContador % 2 == 0) {
                                        System.out.println("\t--- msg.getMessageProperties().setCorrelationId ");
                                        msg.getMessageProperties().setCorrelationId("NewCorrelation".getBytes(StandardCharsets.UTF_8));
                                    }
                                    return msg;
                                }
                            }
                    );
                    System.out.println("\tOK");
                } catch (Exception e) {
                    System.err.println("\t\tError sending : " + contador + " - " + e.getMessage());
                }
                System.out.println("End send : " + contador);
                Thread.sleep(500);
            }
        } catch (Exception e) {
            System.err.println("Exception : " + e.getMessage());
        }
    }
}
The question is: if I change the configuration of the RabbitTemplate so that the error does not happen, can it have implications for clients that use Spring AMQP or other alternatives?
--- EDIT (28/03/2019)
This is the complete stack trace with the code:
org.springframework.amqp.AmqpUnsupportedEncodingException: java.io.UnsupportedEncodingException: gzip
at org.springframework.amqp.rabbit.support.DefaultMessagePropertiesConverter.fromMessageProperties(DefaultMessagePropertiesConverter.java:211)
at org.springframework.amqp.rabbit.core.RabbitTemplate.doSend(RabbitTemplate.java:1531)
at org.springframework.amqp.rabbit.core.RabbitTemplate$3.doInRabbit(RabbitTemplate.java:716)
at org.springframework.amqp.rabbit.core.RabbitTemplate.doExecute(RabbitTemplate.java:1455)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1411)
at org.springframework.amqp.rabbit.core.RabbitTemplate.send(RabbitTemplate.java:712)
at org.springframework.amqp.rabbit.core.RabbitTemplate.convertAndSend(RabbitTemplate.java:813)
at org.springframework.amqp.rabbit.core.RabbitTemplate.convertAndSend(RabbitTemplate.java:791)
at es.jab.example.SimpleProducerGZIP.main(SimpleProducerGZIP.java:79)
Caused by: java.io.UnsupportedEncodingException: gzip
at java.lang.StringCoding.decode(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.springframework.amqp.rabbit.support.DefaultMessagePropertiesConverter.fromMessageProperties(DefaultMessagePropertiesConverter.java:208)
... 8 more
I'd be interested to see the complete stack trace for more information about the problem.
This code was part of a transition from a byte[] correlation id to a String. It was needed to avoid a byte[]/String/byte[] conversion.
When the policy is STRING, you should use the correlationIdString property instead of correlationId; otherwise the correlation id won't be mapped onto outbound messages (we don't look at correlationId in that case). For inbound messages, the policy controls which property is populated.
In 2.0 and later, correlationId is a String instead of a byte[], so this setting is no longer needed.
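As a minimal sketch of that point (assuming Spring AMQP 1.6+, where MessageProperties exposes correlationIdString), the post-processor from the question would set the String property instead:
_rabbitTemplate.convertAndSend(queueName, _o, msg -> {
    // With CorrelationIdPolicy.STRING, this is the property that is mapped
    // onto the outbound message; the byte[] correlationId is ignored
    msg.getMessageProperties().setCorrelationIdString("NewCorrelation");
    return msg;
});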
EDIT
Now I see the stack trace, this...
template.setEncoding("gzip");
...is wrong.
/**
 * The encoding to use when inter-converting between byte arrays and Strings in message properties.
 *
 * @param encoding the encoding to set
 */
public void setEncoding(String encoding) {
    this.encoding = encoding;
}
There is no such Charset as gzip. This property has nothing to do with the message content; it is simply used when converting byte[] to/from String in message properties. It is UTF-8 by default.
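A minimal sketch of the corrected bean from the question: drop setEncoding("gzip") and let GZipPostProcessor handle the content encoding, leaving the template's Charset-based encoding at its UTF-8 default.
@Bean
public RabbitTemplate rabbitTemplate() {
    RabbitTemplate template = new RabbitTemplate(connectionFactory());
    // No setEncoding("gzip") here: the post-processor compresses the body
    // and sets the content-encoding message property itself
    template.setBeforePublishPostProcessors(new GZipPostProcessor());
    DefaultMessagePropertiesConverter messageConverter = new DefaultMessagePropertiesConverter();
    messageConverter.setCorrelationIdAsString(DefaultMessagePropertiesConverter.CorrelationIdPolicy.STRING);
    template.setMessagePropertiesConverter(messageConverter);
    return template;
}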

Hibernate session closed for no apparent reason

I've been struggling with this for days and I really don't know what's happening here.
I have a few processes that are triggered by a user and run on a separate thread, doing some operations and updating an entry in the DB about their progress, which the user can retrieve.
This has all been working fine until recently, when suddenly, sometimes, seemingly uncorrelated with anything, these processes fail on their first attempt to lazy-load an entity. They fail with one of a few different errors, all of which eventually stem from the Hibernate session being closed somehow:
org.hibernate.SessionException: Session is closed. The read-only/modifiable setting is only accessible when the proxy is associated with an open session.
-
org.hibernate.SessionException: Session is closed!
-
org.hibernate.LazyInitializationException: could not initialize proxy - the owning Session was closed
-
org.hibernate.exception.GenericJDBCException: Could not read entity state from ResultSet : EntityKey[com.rdthree.plenty.domain.crops.CropType#1]
-
java.lang.NullPointerException
at com.mysql.jdbc.ResultSetImpl.checkColumnBounds(ResultSetImpl.java:766)
I'm using Spring @Transactional to manage my transactions.
Here's my config:
@Bean
public javax.sql.DataSource dataSource() {
    // update TTL so that the datasource will pick up DB failover - new IP
    java.security.Security.setProperty("networkaddress.cache.ttl", "30");
    HikariDataSource ds = new HikariDataSource();
    String jdbcUrl = "jdbc:mysql://" + plentyConfig.getHostname() + ":" + plentyConfig.getDbport() + "/"
            + plentyConfig.getDbname() + "?useSSL=false";
    ds.setJdbcUrl(jdbcUrl);
    ds.setUsername(plentyConfig.getUsername());
    ds.setPassword(plentyConfig.getPassword());
    ds.setConnectionTimeout(120000);
    ds.setMaximumPoolSize(20);
    return ds;
}

@Bean
public JpaTransactionManager transactionManager(EntityManagerFactory entityManagerFactory) {
    JpaTransactionManager manager = new JpaTransactionManager();
    manager.setEntityManagerFactory(entityManagerFactory);
    return manager;
}

@Bean
@Autowired
public LocalContainerEntityManagerFactoryBean entityManagerFactory(javax.sql.DataSource dataSource) {
    LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();
    factory.setDataSource(dataSource);
    factory.setPackagesToScan("com.rdthree.plenty.domain");
    Properties properties = new Properties();
    properties.put("org.hibernate.flushMode", "ALWAYS");
    properties.put("hibernate.cache.use_second_level_cache", "true");
    properties.put("hibernate.cache.use_query_cache", "true");
    properties.put("hibernate.cache.region.factory_class",
            "com.rdthree.plenty.config.PlentyInfinispanRegionFactory");
    properties.put("hibernate.cache.infinispan.statistics", "true");
    properties.put("hibernate.cache.infinispan.query", "distributed-query");
    properties.put("hibernate.enable_lazy_load_no_trans", "true");
    if (plentyConfig.getProfile().equals(PlentyConfig.UNIT_TEST)
            || plentyConfig.getProfile().equals(PlentyConfig.PRODUCTION_INIT)) {
        properties.put("hibernate.cache.infinispan.cfg", "infinispan-local.xml");
    } else {
        properties.put("hibernate.cache.infinispan.cfg", "infinispan.xml");
    }
    factory.setJpaProperties(properties);
    HibernateJpaVendorAdapter adapter = new HibernateJpaVendorAdapter();
    adapter.setShowSql(false);
    adapter.setDatabasePlatform("org.hibernate.dialect.MySQLDialect");
    factory.setJpaVendorAdapter(adapter);
    return factory;
}
The way it works is that the thread fired by the user iterates over a collection of plans and applies each of them; the process that applies a plan also updates the progress entity in the DB.
The whole thread bean is marked as transactional:
@Component
@Scope("prototype")
@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.READ_COMMITTED)
public class TemplatePlanApplicationThreadBean extends AbstractPlentyThread implements TemplatePlanApplicationThread {
    ...
    @Override
    public void run() {
        startProcessing();
        try {
            logger.trace(
                    "---Starting plan manifestation for " + fieldCropReplaceDatePlantationDates.size() + " fields---");
            List<Plan> plans = new ArrayList<>();
            for (FieldCropReplaceDatePlantationDates obj : fieldCropReplaceDatePlantationDates) {
                for (TemplatePlan templatePlan : obj.getTemplatePlans()) {
                    try {
                        plans.add(planService.findActivePlanAndManifestTemplatePlan(templatePlan, organization,
                                obj.getPlantationDate(), obj.getReplacementStartDate(), obj.getFieldCrop(),
                                autoSchedule, schedulerRequestArguments, planApplicationProgress, false));
                    } catch (ActivityException e) {
                        throw new IllegalArgumentException(e);
                    }
                }
                Plan plan = plans.get(plans.size() - 1);
                plan = planService.getEntityById(plan.getId());
                if (plan != null) {
                    planService.setUnscheduledPlanAsSelected(plan);
                }
                plans.clear();
            }
            if (planApplicationProgressService.getEntityById(planApplicationProgress.getId()) != null) {
                planApplicationProgressService.deleteEntity(planApplicationProgress.getId());
            }
        } catch (Exception e) {
            logger.error(PlentyUtils.extrapolateStackTrace(e));
            connector.createIssue("RdThreeLLC", "plenty-web",
                    new GitHubIssueRequest("Template plan application failure",
                            "```\n" + PlentyUtils.extrapolateStackTrace(e) + "\n```", 1, new ArrayList<>(),
                            Lists.newArrayList("plentytickets")));
            planApplicationProgress.setFailed(true);
            planApplicationProgressService.saveEntity(planApplicationProgress);
        } finally {
            endProcessing();
        }
    }
Here is the method called by the thread:
@Override
@Transactional(propagation = Propagation.REQUIRES_NEW)
public synchronized Plan findActivePlanAndManifestTemplatePlan(TemplatePlan templatePlan,
        ServiceProviderOrganization organization, Date plantationDate, Date replacementDate, FieldCrop fieldCrop,
        boolean autoSchedule, SchedulerRequestArguments schedulerRequestArguments, Progress planProgress,
        boolean commit) throws ActivityException {
    Plan oldPlan = getLatestByFieldCrop(fieldCrop);
    List<Activity> activitiesToRemove = oldPlan != null
            ? findActivitiesToRemoveAfterDate(oldPlan, replacementDate != null ? replacementDate : new Date())
            : new ArrayList<>();
    List<PlanExpense> planExpensesToRemove = oldPlan != null
            ? findPlanExpensesToRemoveAfterDate(oldPlan, replacementDate != null ? replacementDate : new Date())
            : new ArrayList<>();
    Date oldPlanPlantationDate = oldPlan != null ? inferPlanDates(oldPlan).getPlantationDate() : null;
    if (oldPlan != null) {
        if (commit) {
            oldPlan.setReplaced(true);
        }
        buildPlanProfitProjectionForPlanAndField(oldPlan, Sets.newHashSet(activitiesToRemove),
                Sets.newHashSet(planExpensesToRemove));
    }
    if (commit) {
        for (Activity activity : activitiesToRemove) {
            activityService.deleteEntity(activity.getId());
        }
        for (PlanExpense planExpense : planExpensesToRemove) {
            planExpenseService.deleteEntity(planExpense.getId());
        }
    }
    oldPlan = oldPlan != null ? getEntityById(oldPlan.getId()) : null;
    Plan plan = manifestTemplatePlan(templatePlan, oldPlan, organization,
            plantationDate != null ? plantationDate : oldPlanPlantationDate, replacementDate, fieldCrop,
            autoSchedule, schedulerRequestArguments, planProgress, commit);
    if (!commit) {
        setPlanAllocationsUnscheduled(plan);
    }
    return plan;
}
The thing that kills me is that the error happens only sometimes, so I can't really debug it and I can't correlate it with anything.
Any ideas about what would cause the session to close?
I tried turning off all other threads, so basically this is the only one running; it didn't help.
Thanks

CloseableHttpClient.execute freezes once every few weeks despite timeouts

We have a Groovy singleton that uses PoolingHttpClientConnectionManager (httpclient 4.3.6) with a pool size of 200 to handle very high concurrent connections to a search service and process the XML response.
Despite having specified timeouts, it freezes about once a month but runs perfectly fine the rest of the time.
The Groovy singleton is below. The method retrieveInputFromURL seems to block on client.execute(get);
@Singleton(strict=false)
class StreamManagerUtil {

    // Instantiate once and cache for lifetime of Singleton class
    private static PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();
    private static CloseableHttpClient client;
    private static final IdleConnectionMonitorThread staleMonitor = new IdleConnectionMonitorThread(connManager);
    private int warningLimit;
    private int readTimeout;
    private int connectionTimeout;
    private int connectionFetchTimeout;
    private int poolSize;
    private int routeSize;

    PropertyManager propertyManager = PropertyManagerFactory.getInstance().getPropertyManager("sebe.properties")

    StreamManagerUtil() {
        // Initialize all instance variables in singleton from properties file
        readTimeout = 6
        connectionTimeout = 6
        connectionFetchTimeout = 6
        // Pooling
        poolSize = 200
        routeSize = 50
        // Connection pool size and number of routes to cache
        connManager.setMaxTotal(poolSize);
        connManager.setDefaultMaxPerRoute(routeSize);
        // ConnectTimeout : time to establish connection with GSA
        // ConnectionRequestTimeout : time to get connection from pool
        // SocketTimeout : waiting for packets from GSA
        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(connectionTimeout * 1000)
                .setConnectionRequestTimeout(connectionFetchTimeout * 1000)
                .setSocketTimeout(readTimeout * 1000).build();
        // Keep alive for 5 seconds if server does not have keep alive header
        ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
            @Override
            public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
                HeaderElementIterator it = new BasicHeaderElementIterator(
                        response.headerIterator(HTTP.CONN_KEEP_ALIVE));
                while (it.hasNext()) {
                    HeaderElement he = it.nextElement();
                    String param = he.getName();
                    String value = he.getValue();
                    if (value != null && param.equalsIgnoreCase("timeout")) {
                        return Long.parseLong(value) * 1000;
                    }
                }
                return 5 * 1000;
            }
        };
        // Close all connections older than 5 seconds. Run as separate thread.
        staleMonitor.start();
        staleMonitor.join(1000);
        client = HttpClients.custom().setDefaultRequestConfig(config).setKeepAliveStrategy(myStrategy).setConnectionManager(connManager).build();
    }
    private retrieveInputFromURL(String categoryUrl, String xForwFor, boolean isXml) throws Exception {
        URL url = new URL(categoryUrl)
        GPathResult searchResponse = null
        InputStream inputStream = null
        HttpResponse response
        HttpGet get
        try {
            long startTime = System.nanoTime();
            get = new HttpGet(categoryUrl);
            // Set the X-Forwarded-For header before executing the request so it is actually sent
            if (xForwFor != null) {
                get.setHeader("X-Forwarded-For", xForwFor)
            }
            response = client.execute(get);
            int resCode = response.getStatusLine().getStatusCode();
            if (resCode == HttpStatus.SC_OK) {
                if (isXml) {
                    return extractXmlString(response)
                } else {
                    StringBuffer buffer = buildStringFromResponse(response)
                    return buffer.toString();
                }
            }
        }
        catch (Exception e) {
            throw e;
        }
        finally {
            // Release connection back to pool
            if (response != null) {
                EntityUtils.consume(response.getEntity());
            }
        }
    }
    private extractXmlString(HttpResponse response) {
        InputStream inputStream = response.getEntity().getContent()
        XmlSlurper slurper = new XmlSlurper()
        slurper.setFeature("http://xml.org/sax/features/validation", false)
        slurper.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
        slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false)
        slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
        return slurper.parse(inputStream)
    }

    private StringBuffer buildStringFromResponse(HttpResponse response) {
        StringBuffer buffer = new StringBuffer();
        BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
        String line = "";
        while ((line = rd.readLine()) != null) {
            buffer.append(line);
            System.out.println(line);
        }
        return buffer
    }
    public class IdleConnectionMonitorThread extends Thread {
        private final HttpClientConnectionManager connMgr;
        private volatile boolean shutdown;

        public IdleConnectionMonitorThread(PoolingHttpClientConnectionManager connMgr) {
            super();
            this.connMgr = connMgr;
        }

        @Override
        public void run() {
            try {
                while (!shutdown) {
                    synchronized (this) {
                        wait(5000);
                        connMgr.closeExpiredConnections();
                        connMgr.closeIdleConnections(10, TimeUnit.SECONDS);
                    }
                }
            } catch (InterruptedException ex) {
                // Ignore
            }
        }

        public void shutdown() {
            shutdown = true;
            synchronized (this) {
                notifyAll();
            }
        }
    }
}
I also found this in the log, leading me to believe it happened while waiting for response data:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
Findings thus far:
We are using Java 1.8u25. There is an open issue on a similar scenario: https://bugs.openjdk.java.net/browse/JDK-8075484
HttpClient had a similar report, https://issues.apache.org/jira/browse/HTTPCLIENT-1589, but that was fixed in the 4.3.6 version we are using.
Questions:
Can this be a synchronisation issue? From my understanding, even though the singleton is accessed by multiple threads, the only shared data is the cached CloseableHttpClient.
Is there anything else fundamentally wrong with this code or approach that may be causing this behaviour?
I do not see anything obviously wrong with your code. I would strongly recommend setting the SO_TIMEOUT parameter on the connection manager, though, to make sure it applies to all new sockets at creation time, not at the time of request execution.
It would also help to know what exactly 'freezing' means. Are worker threads getting blocked waiting to acquire connections from the pool, or waiting for response data?
Please also note that worker threads can appear 'frozen' if the server keeps sending bits of chunk-coded data. As usual, a wire / context log of the client session would help a lot:
http://hc.apache.org/httpcomponents-client-4.3.x/logging.html
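A minimal sketch of the SO_TIMEOUT suggestion above, assuming HttpClient 4.3.x: a default SocketConfig on the pooling connection manager applies the timeout when each socket is created, rather than only at request execution.
import org.apache.http.config.SocketConfig;

// In the StreamManagerUtil constructor, before building the client:
SocketConfig socketConfig = SocketConfig.custom()
        .setSoTimeout(readTimeout * 1000) // readTimeout field from the singleton above
        .build();
connManager.setDefaultSocketConfig(socketConfig);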
