Redis java.net.SocketTimeoutException: Read timed out when setting a value during high memory utilization on a Redis cluster - caching

I am using the Jedis client for Redis in my Spring service.
Below is code similar to what I use for setting a value into a hashed key.
Jedis jedis = null;
try {
    jedis = redisJedisPool.getResource();                      // (1)
    jedis.hset(key, "data", dataValue);                        // (2)
    for (Entry<String, String> entry : hmap.entrySet()) {
        if (someCondition) {                                   // placeholder for the real check
            jedis.hset(key, entry.getKey(), entry.getValue()); // (3)
        }
    }
    jedis.expire(key, ttl);                                    // (4)
} catch (Exception e) {
    // note: only the message is logged here; passing 'e' as well would preserve the stack trace
    logger.error("Error for key {}, Reason: {}", key, e.getMessage());
} finally {
    RedisConnectionManager.closeJedisResource(jedis);
}
There was heavy load on this service, and consequently high memory utilization was observed on the Redis cluster.
I faced a lot of SocketTimeoutExceptions from the above function (all exceptions were caught in the catch block).
Exception:
Error for key: {key}, Reason: java.net.SocketTimeoutException: Read timed out
Main Issue: After this intermittent issue I see a lot of keys with a TTL of -1 (infinite expiry). Almost all of these keys were logged in the above catch block.
I need the community's thoughts on what the possible issue could be here.
Action I took for verification: I checked the data for a few of these keys (with TTL -1) and saw that the data that was to be set at (3) (inside the for loop) was not fully written. But I also came across a couple of keys with a TTL of -1 where not all of the data was set, yet those keys were never logged in the above catch block. And this is the only function in my service that writes data to the cache.
After this I am unable to confirm the above hypothesis.
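One pattern worth considering, if the suspicion is that the read timeout strikes between the hset calls at (3) and the expire at (4): batch all of the writes and the expire into a single Jedis pipeline (redis.clients.jedis.Pipeline), so the TTL command is not a separate round trip that can be lost on its own. Below is a minimal sketch under that assumption, reusing key, hmap, dataValue, and ttl from above; pipelining narrows the window but cannot fully rule out the server applying the writes while the client times out reading the replies.
Jedis jedis = null;
try {
    jedis = redisJedisPool.getResource();
    // Queue all commands client-side; nothing is read back until sync()
    Pipeline p = jedis.pipelined();
    p.hset(key, "data", dataValue);
    for (Entry<String, String> entry : hmap.entrySet()) {
        if (someCondition) {                 // same placeholder condition as above
            p.hset(key, entry.getKey(), entry.getValue());
        }
    }
    p.expire(key, ttl);
    p.sync();                                // flush all queued commands and read every reply at once
} catch (Exception e) {
    logger.error("Error for key {}, Reason: {}", key, e.getMessage(), e);
} finally {
    RedisConnectionManager.closeJedisResource(jedis);
}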

Related

How to print details status when RestTemplate request failure?

I use RestTemplate to call a URL such as http://example/jsonObject about 4000 times a minute. Most of the time it is fine, but sometimes RestTemplate throws 60 or more of the exceptions below in a minute:
restTemplate error org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://example/jsonObject": Read timed out; nested exception is java.net.SocketTimeoutException: Read timed out
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:751)
I use code:
try {
    CouponV2ResultVO couponV2ResultVO = restTemplate.getForObject("http://example/jsonObject", CouponV2ResultVO.class);
    long expense = System.currentTimeMillis() - startMs;
    log.info("takes {} ms", expense);
    return couponV2ResultVO;
} catch (Exception e) {
    long expense = System.currentTimeMillis() - startMs;
    log.error("{} takes {} ms, error: {}", couponReqFullUri, expense, ExceptionUtils.getFullStackTrace(e));
    throw e;
}
I want to print more complete details, such as the TCP communication for the whole request when RestTemplate fails, or any other diagnostic information, instead of just a timeout exception. Is it possible?
You could use log.error(e) or e.printStackTrace() instead of just throw e when you want more details about an exception.
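For instance, with an SLF4J-style logger (which the question's log object appears to be), passing the Throwable as the last argument prints the full stack trace; a small sketch reusing the question's couponReqFullUri and startMs:
} catch (Exception e) {
    long expense = System.currentTimeMillis() - startMs;
    // SLF4J treats a trailing Throwable specially and appends the full stack trace
    log.error("GET {} failed after {} ms", couponReqFullUri, expense, e);
    throw e;
}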

How to check how many total Redis connections a Redis server can give to clients?

We are using a Redis cache via the Spring Data Redis module. We set maxActiveConnections to 10 in the application configuration, but sometimes in my applications I am seeing the errors below:
Exception occurred while querying cache : org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
Is it because the Redis server has no more connections to give to my applications, or is there some other reason? Can anyone please advise?
Note: there are 15 applications using the same Redis server to store data, i.e. 15 applications need connections from this single Redis server. For now we set maxActiveConnections to 10 for each of the 15 applications.
To check how many clients are connected to Redis, you can use redis-cli and run the INFO command, more specifically its Clients section:
192.168.8.176:8023> info Clients
# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
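To see the server-side limit itself (how many connections Redis will hand out in total), you can also query the maxclients setting; the value shown below is illustrative (10000 is the Redis default):
192.168.8.176:8023> CONFIG GET maxclients
1) "maxclients"
2) "10000"
For comparison, 15 applications with maxActiveConnections of 10 each amounts to at most 150 server connections, so an exhausted client-side pool is a more likely culprit than the server running out.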
From the Jedis source code, it seems the exception can happen for one of two reasons:
an exhausted pool (no connection was available to borrow), or
a failure in the pool's activateObject() or validateObject() while preparing a connection.
Here is the code snippet of Jedis's getResource method:
public T getResource() {
    try {
        return internalPool.borrowObject();
    } catch (NoSuchElementException nse) {
        if (null == nse.getCause()) { // The exception was caused by an exhausted pool
            throw new JedisExhaustedPoolException(
                "Could not get a resource since the pool is exhausted", nse);
        }
        // Otherwise, the exception was caused by the implemented activateObject() or validateObject()
        throw new JedisException("Could not get a resource from the pool", nse);
    } catch (Exception e) {
        throw new JedisConnectionException("Could not get a resource from the pool", e);
    }
}
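If the pool is indeed being exhausted on the client side, the usual lever is the pool configuration. A hedged sketch using JedisPoolConfig, where the numbers are placeholders to tune and "redis-host" is a stand-in for your actual server:
JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMaxTotal(50);         // maximum connections this application may hold (the "maxActive" knob)
poolConfig.setMaxIdle(10);          // idle connections kept ready in the pool
poolConfig.setMaxWaitMillis(2000);  // how long getResource() blocks before throwing when the pool is empty
JedisPool pool = new JedisPool(poolConfig, "redis-host", 6379); // "redis-host" is a placeholder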

HBase - Connection Reset by peer Exception

I am trying to use HBase to build some real-time APIs. My use case is to support ~10000 concurrent requests per second, so I am trying to do connection pooling to achieve multi-threaded access. I followed this documentation to create the connection: https://hbase.apache.org/1.1/apidocs/org/apache/hadoop/hbase/client/ConnectionFactory.html
But I keep getting this error when I make concurrent requests to my API:
WARN [http-nio-34000-exec-93-SendThread(d-3zjyk02.target.com:2181)]
19 Apr 2017 04:48:13:872 (ClientCnxn.java:1102) - Session 0x0 for
server d-3zjyk02.target.com/10.66.241.30:2181, unexpected error,
closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
Here is how I am creating the connection:
// Connection to the cluster. A single connection shared by all application threads
private Connection connection = null;

public Connection getHBaseConnection() throws Exception {
    if (connection == null) {
        try {
            Configuration configuration = HBaseConfiguration.create();
            configuration.addResource("core-site.xml");
            configuration.addResource("hbase-site.xml");
            configuration.addResource("hdfs-site.xml");
            connection = ConnectionFactory.createConnection(configuration);
        } catch (Exception ex) {
            LOG.error("Exception in creating the HBase connection object: " + ex.getMessage());
            throw new Exception("Exception in creating the HBase connection: " + ex.getMessage());
        }
    }
    return connection;
}
And here is how I use the getHBaseConnection method for some scan operations:
try {
    connection = getHBaseConnection();
    afterConnectionStartTime = System.currentTimeMillis();
    LOG.info("[" + (System.currentTimeMillis() - startTime) + "]ms" + " ...TIME TAKEN to get the HBase connection object");
    if (connection != null) {
        table = connection.getTable(TableName.valueOf(TABLE_NAME));
        Scan scan = new Scan(Bytes.toBytes(rowKeyStartDate), Bytes.toBytes(rowKeyEndDate));
        scan.addColumn(COLUMN_FAMILY, ITEM);
    }
This code works fine for any number of sequential requests, but when I do concurrent requests, I keep getting this error.
Some of the observations from my research on this issue:
1) This error is related to ZooKeeper closing the socket after a certain number of requests (which I assume happens when it exceeds the max client connections (40) set in my zoo.cfg file). But what I don't understand is why the concurrent requests are going to ZooKeeper in the first place. The first request should open the connection object, and all subsequent requests should use that pre-existing connection to talk directly to the region servers.
2) I am assuming this is the right way to do connection pooling (at least per the official HBase doc). If not, what's the right way to do it?
3) I don't want to increase the max client connections in the ZooKeeper cfg file, though it might be a temporary hack that would do the job.
Any help / suggestions are much appreciated.
Thanks!
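One thing worth checking in the connection code above (a hypothesis, not a confirmed diagnosis): the lazy getter is not synchronized, so many threads arriving at once can each observe connection == null and each create their own Connection, and every Connection opens its own ZooKeeper session. That alone could explain concurrent requests hammering ZooKeeper while sequential requests behave. A minimal sketch of a thread-safe variant, reusing the fields from the question:
// Sketch: synchronize the lazy initialization so at most one Connection (and
// therefore one ZooKeeper session) is created, however many threads race on startup.
public synchronized Connection getHBaseConnection() throws Exception {
    if (connection == null) {
        Configuration configuration = HBaseConfiguration.create();
        configuration.addResource("core-site.xml");
        configuration.addResource("hbase-site.xml");
        configuration.addResource("hdfs-site.xml");
        connection = ConnectionFactory.createConnection(configuration);
    }
    return connection;
}
With a single shared Connection in place, fetching a Table per request (as the scan code already does) is the intended usage pattern.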

Elasticsearch client does not fetch result when a single client node goes down

We have a very standard elasticsearch setup with 3 master nodes, 6 data nodes and 3 client nodes. Here is our connection code for connecting to Elasticsearch clients from our Java application.
Settings settings = Settings.settingsBuilder()
        .put("cluster.name", configuration.getString("clusterName"))
        .put("client.transport.sniff", false)
        .put("client.transport.ping_timeout", "5s")
        .build();
TransportClient client = TransportClient.builder().settings(settings).build();
for (String hostname : (Collection<String>) configuration.get("hostnames")) {
    try {
        client = client.addTransportAddresses(
                new InetSocketTransportAddress(InetAddress.getByName(hostname), 9300));
        break;
    } catch (UnknownHostException e) {
        e.printStackTrace();
    }
}
We currently have three different hosts in the hostnames list. But any time a single client from this list goes down, the Elasticsearch transport client stops responding. I have gone through the transport client documentation on the Elasticsearch site and have also looked through their GitHub issues; according to those, whenever a node goes down Elasticsearch should simply remove it from the list of nodes and continue working with the other nodes, but in our case things just break down. Does anyone have any idea what the problem might be?
We are using Elasticsearch 2.4.3 right now.
It looks like you are breaking out of the loop after a single node has been added, so the client only ever knows about one host. Try removing the break statement:
for (String hostname : (Collection<String>) configuration.get("hostnames")) {
    try {
        client = client.addTransportAddresses(
                new InetSocketTransportAddress(InetAddress.getByName(hostname), 9300));
    } catch (UnknownHostException e) {
        e.printStackTrace();
    }
}
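With the break removed, every reachable host gets registered, so losing one still leaves the others. A related knob, if you would rather let the client discover nodes on its own, is sniffing; a hedged sketch against the 2.x TransportClient settings used in the question (clusterName stands in for the configured value):
Settings settings = Settings.settingsBuilder()
        .put("cluster.name", clusterName)            // same cluster name as in the question's config
        .put("client.transport.sniff", true)         // sample the cluster state and add the other nodes automatically
        .put("client.transport.ping_timeout", "5s")
        .build();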

SFTP error : com.jcraft.jsch.JSchException: invalid server's version string

I have the below code to SFTP to a location
public static void putFile(String username, String host, String password, String remotefile, String localfile) {
    JSch jsch = new JSch();
    Session session = null;
    try {
        session = jsch.getSession(username, host, 22);
        session.setConfig("StrictHostKeyChecking", "no");
        session.setPassword(password);
        session.connect();
        Channel channel = session.openChannel("sftp");
        channel.connect();
        ChannelSftp sftpChannel = (ChannelSftp) channel;
        sftpChannel.put(localfile, remotefile);
        sftpChannel.exit();
    } catch (JSchException e) {
        e.printStackTrace();
    } catch (SftpException e) {
        e.printStackTrace();
    } finally {
        if (session != null) {
            session.disconnect(); // close the session even when an exception is thrown
        }
    }
}
I am able to SFTP the document from my local machine using the above code. However, when I try to SFTP to the same location from a different environment, I get the following error:
com.jcraft.jsch.JSchException: invalid server's version string at
com.jcraft.jsch.Session.connect(Session.java:253)
Note: I am using the jsch-0.1.31.jar file.
On printing out session.getClientVersion() I get "SSH-2.0-JSCH-0.1.31".
I tried upgrading the jar to jsch-0.1.51.jar; then session.getClientVersion() returns "SSH-1.5-JSCH-0.1.51" and I get the following error:
com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset at com.jcraft.jsch.Session.connect(Session.java:558)
Can you please help me understand what parameters I should be looking into, and why the code works from my local machine (uploading to the same SFTP location) but not from the other environment?
As noted by @Kenster, the exception is about the server's version string, not the client's. The "invalid server's version string" exception is thrown by the following code in Session.connect:
if (i == buf.buffer.length ||
    i < 7 ||                                       // SSH-1.99 or SSH-2.0
    (buf.buffer[4] == '1' && buf.buffer[6] != '9') // SSH-1.5
) {
    throw new JSchException("invalid server's version string");
}
First, I would try to connect with some client that logs the version string and see for yourself. For example with WinSCP, search its log for a pattern like:
. 2014-09-03 17:01:20.596 Server version: SSH-2.0-OpenSSH_5.3
(I'm the author of WinSCP)
Though possibly it's not about the version string at all. I would rather trust the error raised by the new version, the Connection reset. The old version may simply fail to detect that the connection was aborted prematurely and try to validate some random or incomplete data.
The Connection reset may indicate a wide variety of different errors:
Server refusing a connection from the other location
Some firewall or proxy not allowing the connection to pass through
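To narrow down which of these is happening, it may help to enable JSch's own logging and watch the handshake from the client side; a small sketch against the com.jcraft.jsch.Logger interface:
// Sketch: route JSch's internal log to stderr so the version-string exchange
// and the point of disconnection become visible.
JSch.setLogger(new com.jcraft.jsch.Logger() {
    public boolean isEnabled(int level) { return true; }
    public void log(int level, String message) {
        System.err.println("JSch[" + level + "]: " + message);
    }
});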
