We have a groovy singleton that uses PoolingHttpClientConnectionManager(httpclient:4.3.6) with a pool size of 200 to handle very high concurrent connections to a search service and processes the xml response.
Despite having specified timeouts, it freezes about once a month but runs perfectly fine the rest of the time.
The groovy singleton below. The method retrieveInputFromURL seems to block on client.execute(get);
#Singleton(strict=false)
class StreamManagerUtil {
// Instantiate once and cache for lifetime of Signleton class
private static PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();
private static CloseableHttpClient client;
private static final IdleConnectionMonitorThread staleMonitor = new IdleConnectionMonitorThread(connManager);
private int warningLimit;
private int readTimeout;
private int connectionTimeout;
private int connectionFetchTimeout;
private int poolSize;
private int routeSize;
PropertyManager propertyManager = PropertyManagerFactory.getInstance().getPropertyManager("sebe.properties")
StreamManagerUtil() {
// Initialize all instance variables in singleton from properties file
readTimeout = 6
connectionTimeout = 6
connectionFetchTimeout =6
// Pooling
poolSize = 200
routeSize = 50
// Connection pool size and number of routes to cache
connManager.setMaxTotal(poolSize);
connManager.setDefaultMaxPerRoute(routeSize);
// ConnectTimeout : time to establish connection with GSA
// ConnectionRequestTimeout : time to get connection from pool
// SocketTimeout : waiting for packets form GSA
RequestConfig config = RequestConfig.custom()
.setConnectTimeout(connectionTimeout * 1000)
.setConnectionRequestTimeout(connectionFetchTimeout * 1000)
.setSocketTimeout(readTimeout * 1000).build();
// Keep alive for 5 seconds if server does not have keep alive header
ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
#Override
public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
HeaderElementIterator it = new BasicHeaderElementIterator
(response.headerIterator(HTTP.CONN_KEEP_ALIVE));
while (it.hasNext()) {
HeaderElement he = it.nextElement();
String param = he.getName();
String value = he.getValue();
if (value != null && param.equalsIgnoreCase
("timeout")) {
return Long.parseLong(value) * 1000;
}
}
return 5 * 1000;
}
};
// Close all connection older than 5 seconds. Run as separate thread.
staleMonitor.start();
staleMonitor.join(1000);
client = HttpClients.custom().setDefaultRequestConfig(config).setKeepAliveStrategy(myStrategy).setConnectionManager(connManager).build();
}
private retrieveInputFromURL (String categoryUrl, String xForwFor, boolean isXml) throws Exception {
URL url = new URL( categoryUrl );
GPathResult searchResponse = null
InputStream inputStream = null
HttpResponse response;
HttpGet get;
try {
long startTime = System.nanoTime();
get = new HttpGet(categoryUrl);
response = client.execute(get);
int resCode = response.getStatusLine().getStatusCode();
if (xForwFor != null) {
get.setHeader("X-Forwarded-For", xForwFor)
}
if (resCode == HttpStatus.SC_OK) {
if (isXml) {
extractXmlString(response)
} else {
StringBuffer buffer = buildStringFromResponse(response)
return buffer.toString();
}
}
}
catch (Exception e)
{
throw e;
}
finally {
// Release connection back to pool
if (response != null) {
EntityUtils.consume(response.getEntity());
}
}
}
private extractXmlString(HttpResponse response) {
InputStream inputStream = response.getEntity().getContent()
XmlSlurper slurper = new XmlSlurper()
slurper.setFeature("http://xml.org/sax/features/validation", false)
slurper.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false)
slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
return slurper.parse(inputStream)
}
private StringBuffer buildStringFromResponse(HttpResponse response) {
StringBuffer buffer= new StringBuffer();
BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
String line = "";
while ((line = rd.readLine()) != null) {
buffer.append(line);
System.out.println(line);
}
return buffer
}
public class IdleConnectionMonitorThread extends Thread {
private final HttpClientConnectionManager connMgr;
private volatile boolean shutdown;
public IdleConnectionMonitorThread
(PoolingHttpClientConnectionManager connMgr) {
super();
this.connMgr = connMgr;
}
#Override
public void run() {
try {
while (!shutdown) {
synchronized (this) {
wait(5000);
connMgr.closeExpiredConnections();
connMgr.closeIdleConnections(10, TimeUnit.SECONDS);
}
}
} catch (InterruptedException ex) {
// Ignore
}
}
public void shutdown() {
shutdown = true;
synchronized (this) {
notifyAll();
}
}
}
I also found found this in the log leading me to believe it happened on waiting for response data
java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
Findings thus far:
We are using java 1.8u25. There is an open issue on a similar scenario
https://bugs.openjdk.java.net/browse/JDK-8075484
HttpClient had a similar report https://issues.apache.org/jira/browse/HTTPCLIENT-1589 but this was fixed in
the 4.3.6 version we are using
Questions
Can this be a synchronisation issue? From my understanding even though the singleton is accessed by multiple threads, the only shared data is the cached CloseableHttpClient
Is there anything else fundamentally wrong with this code,approach that may be causing this behaviour?
I do not see anything obviously wrong with your code. I would strongly recommend setting SO_TIMEOUT parameter on the connection manager, though, to make sure it applies to all new socket at the creation time, not at the time of request execution.
I would also help to know what exactly 'freezing' means. Are worker threads getting blocked waiting to acquire connections from the pool or waiting for response data?
Please also note that worker threads can appear 'frozen' if the server keeps on sending bits of chunk coded data. As usual a wire / context log of the client session would help a lot
http://hc.apache.org/httpcomponents-client-4.3.x/logging.html
Related
I've set up a spring-reactive server and an use a while loop inside of a Scheduled job to make fire and forget requests from a httpClient (backend just returns the same simple string via tomcat) it is configured as follows:
public ConnectionProvider getConnectionProvider(){
int maxConnections = 20;
ConnectionProvider connProvider = ConnectionProvider
.builder("webclient-conn-pool")
.maxConnections(maxConnections)
.maxIdleTime(Duration.of(20, ChronoUnit.SECONDS))
.maxLifeTime(Duration.of(1000, ChronoUnit.SECONDS))
.pendingAcquireMaxCount(2000)
.pendingAcquireTimeout(Duration.ofMillis(20000))
.build();
return connProvider;
}
public HttpClient getHttpClient(){
return HttpClient
.create(getConnectionProvider());
//.secure(sslContextSpec -> sslContextSpec.sslContext(webClientSslHelper.getSslContext()))
/*.tcpConfiguration(tcpClient -> {
LoopResources loop = LoopResources.create("webclient-event-loop",
selectorThreadCount, workerThreadCount, Boolean.TRUE);
return tcpClient
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 10000)
.option(ChannelOption.TCP_NODELAY, true);*/
}
I've also created a job that uses #Autowired kafkaTemplate and makes a while loop that sends a short text string
For http / tomcat calls I am getting around 65 requests a second
For kafka I am maxing out around 50000 requests with a half second lag
application.props
spring.kafka.bootstrap-servers=PLAINTEXT://localhost:9092,PLAINTEXT://localhost:9093
host.name=localhost
Job to make calls
#Async
#Scheduled(fixedDelay = 15000)
public void scheduleTaskUsingCronExpression() {
generateCalls();
}
private void generateCalls() {
try{
int i = 0;
System.out.println("start");
long startTime = System.currentTimeMillis();
while(i <= 5000){
//Thread.sleep(5);
String message = "Test Message sadg sad-";
kafkaTemplate.send(TOPIC, message + i);
i++;
}
long endTime = System.currentTimeMillis();
System.out.println((endTime - startTime));
System.out.println("done");
}
catch(Exception e){
e.printStackTrace();
}
System.out.println("RUNNING");
}
Kakfa partition config
#Bean
public KafkaAdmin kafkaAdmin() {
//String bootstrapAddress = "localhost:29092";
String bootstrapAddress = "localhost:9092";
Map<String, Object> configs = new HashMap<>();
configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
return new KafkaAdmin(configs);
}
#Bean
public NewTopic testTopic() {
return new NewTopic("test-topic", 6, (short) 1);
}
Kafka consumer consuming messge
#KafkaListener(topics = "test-topic", groupId = "one", concurrency = "6" )
public void listenGroupFoo(String message) {
if(message.indexOf("-0") != -1){
startTime = new Date().getTime();
System.out.println("Starting Message in group foo: " + message);
}
else if(message.indexOf("-100000") != -1){
endTime = new Date().getTime();
System.out.println("Received Message in group foo: " + message);
System.out.println(endTime - startTime);
}
}
For hardware I have a 10900k with 64gb ram
5ghz clock speed
970 Evo single nvme disk
10 core 20 thread
All requests are from the same local server to the same local server
I can add more local servers for testing if needed
Is there a better way to organize / optimize the code to make a massive number of requests?
Theories:
Multiple Threads?
Changing configurations of servers such as tomcat configs (receiving or sending side)?
Not use the kafkaTemplate that is autowired or creating multiple?
Modify Hardware to have multiple disks?
Improve server receiving http to allow more connections?
Not use a job to make the requests?
Anything else anyone can think of to help?
We have a spring-boot application which is deployed to lambda in AWS.
Code
public AbstractRedisClient getClient(String host, String port) {
LOG.info("redis-uri" + "redis://"+host+":"+port);
return RedisClient.create("redis://"+host+":"+port);
}
/**
* Returns the Redis connection using the Lettuce-Redis-Client
*
* #return RedisClient
*/
public RedisClient getConnection(String host, String port) {
LOG.info("redis-Host " + host);
LOG.info("redis-Port " + port);
RedisClient redisClient = (RedisClient) getClient(host, port);
redisClient.setDefaultTimeout(Duration.ofSeconds(10));
return redisClient;
}
private RedisCommands<String, String> getRedisCommands() {
StatefulRedisConnection<String, String> statefulConnection = openConnection();
if(statefulConnection != null)
return statefulConnection.sync();
else
return null;
}
public StatefulRedisConnection<String, String> openConnection() {
if(connection != null && connection.isOpen()) {
return connection;
}
String redisPort = "6379";
String redisHost = environment.getProperty("REDIS_HOST");
//String redisPort = environment.getProperty("REDIS_PORT");
LOG.info("Host: {}", redisHost);
LOG.info("Port: {}", redisPort);
UnifiedReservationRedisConfig lettuceRedisConfig = new UnifiedReservationRedisConfig();
String redisUri = "redis://"+redisHost+":"+redisPort;
redisClient = lettuceRedisConfig.getConnection(redisHost, redisPort);
ConnectionFuture<StatefulRedisConnection<String, String>> future = redisClient
.connectAsync(StringCodec.UTF8, RedisURI.create(redisUri));
try {
connection = future.get();
} catch(InterruptedException | ExecutionException exception) {
LOG.info(exception.getMessage());
closeConnectionsAsync();
connection = null;
Thread.currentThread().interrupt();
}
return connection;
}
private void closeConnectionsAsync() {
LOG.info("Close redis connection");
if(connection != null && connection.isOpen()) {
connection.closeAsync();
}
if(redisClient != null) {
redisClient.shutdownAsync();
}
}
The issue was happening all time, But frequently geting this issue like Caused by io.lettuce.core.rediscommandexecutionexception: moved 15596 XX.X.XXX.XX:6379, Any one can help to solve this issue
As per my knowledge, you are doing port-forwarding to your redis cluster/instance using port 15596 but the actual redis ports like 6379 are not accessible from your application's network.
When redis's java client get the access to redis then tries to connect to actual ports like 6379.
Try using RedisClusterClient, instead of RedisClient. The unhandled MOVED response indicates that you are trying to use the non-cluster-aware client with Redis deployed in cluster mode.
I am using RestHighLevelClient version 7.2 to connect to the ElasticSearch cluster version 7.2. My cluster has 3 Master nodes and 2 data nodes. Data node memory config: 2 core and 8 GB. I have used to below code in my spring boot project to create RestHighLevelClient instance.
#Bean(destroyMethod = "close")
#Qualifier("readClient")
public RestHighLevelClient readClient(){
final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY,
new UsernamePasswordCredentials(elasticUser, elasticPass));
RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, elasticPort))
.setHttpClientConfigCallback(httpClientBuilder ->httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider).setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build()));
builder.setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder.setConnectTimeout(30000).setSocketTimeout(60000)
);
RestHighLevelClient restClient = new RestHighLevelClient(builder);
return restClient;
}
RestHighLevelClient is a singleton bean. Intermittently I am getting SocketTimeoutException with both GET and PUT request. The index size is around 50 MB. I have tried increasing the socket timeout value, but still, I receive the same error. Am I missing some configuration? Any help would be appreciated.
I got the issue just wanted to share so that it can help others.
I was using Load Balancer to connect to the ElasticSerach Cluster.
As you can see from my RestClientBuilder code that I was using only the loadbalancer host and port. Although I have multiple master node, still RestClient was not retrying my request in case of connection timeout.
RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, elasticPort))
.setHttpClientConfigCallback(httpClientBuilder ->httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider).setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build()));
According to the RestClient code if we use a single host then it won't retry in case of any connection issue.
So I changed my code as below and it started working.
RestClientBuilder builder = RestClient.builder(new HttpHost(elasticHost, 9200),new HttpHost(elasticHost, 9201))).setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider));
For complete RestClient code please refer https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java
Retry code block in RestClient
private Response performRequest(final NodeTuple<Iterator<Node>> nodeTuple,
final InternalRequest request,
Exception previousException) throws IOException {
RequestContext context = request.createContextForNextAttempt(nodeTuple.nodes.next(), nodeTuple.authCache);
HttpResponse httpResponse;
try {
httpResponse = client.execute(context.requestProducer, context.asyncResponseConsumer, context.context, null).get();
} catch(Exception e) {
RequestLogger.logFailedRequest(logger, request.httpRequest, context.node, e);
onFailure(context.node);
Exception cause = extractAndWrapCause(e);
addSuppressedException(previousException, cause);
if (nodeTuple.nodes.hasNext()) {
return performRequest(nodeTuple, request, cause);
}
if (cause instanceof IOException) {
throw (IOException) cause;
}
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
}
throw new IllegalStateException("unexpected exception type: must be either RuntimeException or IOException", cause);
}
ResponseOrResponseException responseOrResponseException = convertResponse(request, context.node, httpResponse);
if (responseOrResponseException.responseException == null) {
return responseOrResponseException.response;
}
addSuppressedException(previousException, responseOrResponseException.responseException);
if (nodeTuple.nodes.hasNext()) {
return performRequest(nodeTuple, request, responseOrResponseException.responseException);
}
throw responseOrResponseException.responseException;
}
I'm facing the same issue, and seeing this I realized that the retry is happening on my side too in each host (I have 3 host and the exception happens in 3 threads). I wanted to post it since you might face the same issue or someone else might come to this post because of the same SocketConnection Exception.
Searching the official docs, the HighLevelRestClient uses under the hood the RestClient, and the RestClient uses CloseableHttpAsyncClient which have a connection pool. ElasticSearch specifies that you should close the connection once that you are done, (which sounds ambiguous the definition of "done" in an application), but in general in internet I have found that you should close it when the application is closing or ending, rather than when you finished querying.
Now on the official documentation of apache they have an example to handle the connection pool, which i'm trying to follow, I'll try to replicate the scenario and will post if that fixes my issue, the code can be found here:
https://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/examples/org/apache/http/examples/nio/client/AsyncClientEvictExpiredConnections.java
This is what i have so far:
#Bean(name = "RestHighLevelClientWithCredentials", destroyMethod = "close")
public RestHighLevelClient elasticsearchClient(ElasticSearchClientConfiguration elasticSearchClientConfiguration,
RestClientBuilder.HttpClientConfigCallback httpClientConfigCallback) {
return new RestHighLevelClient(
RestClient
.builder(getElasticSearchHosts(elasticSearchClientConfiguration))
.setHttpClientConfigCallback(httpClientConfigCallback)
);
}
#Bean
#RefreshScope
public RestClientBuilder.HttpClientConfigCallback getHttpClientConfigCallback(
PoolingNHttpClientConnectionManager poolingNHttpClientConnectionManager,
CredentialsProvider credentialsProvider
) {
return httpAsyncClientBuilder -> {
httpAsyncClientBuilder.setSSLHostnameVerifier(NoopHostnameVerifier.INSTANCE);
httpAsyncClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
httpAsyncClientBuilder.setConnectionManager(poolingNHttpClientConnectionManager);
return httpAsyncClientBuilder;
};
}
public class ElasticSearchClientManager {
private ElasticSearchClientManager.IdleConnectionEvictor idleConnectionEvictor;
/**
* Custom client connection manager to create a connection watcher
*
* #param elasticSearchClientConfiguration elasticSearchClientConfiguration
* #return PoolingNHttpClientConnectionManager
*/
#Bean
#RefreshScope
public PoolingNHttpClientConnectionManager getPoolingNHttpClientConnectionManager(
ElasticSearchClientConfiguration elasticSearchClientConfiguration
) {
try {
SSLIOSessionStrategy sslSessionStrategy = new SSLIOSessionStrategy(getTrustAllSSLContext());
Registry<SchemeIOSessionStrategy> sessionStrategyRegistry = RegistryBuilder.<SchemeIOSessionStrategy>create()
.register("http", NoopIOSessionStrategy.INSTANCE)
.register("https", sslSessionStrategy)
.build();
ConnectingIOReactor ioReactor = new DefaultConnectingIOReactor();
PoolingNHttpClientConnectionManager poolingNHttpClientConnectionManager =
new PoolingNHttpClientConnectionManager(ioReactor, sessionStrategyRegistry);
idleConnectionEvictor = new ElasticSearchClientManager.IdleConnectionEvictor(poolingNHttpClientConnectionManager,
elasticSearchClientConfiguration);
idleConnectionEvictor.start();
return poolingNHttpClientConnectionManager;
} catch (IOReactorException e) {
throw new RuntimeException("Failed to create a watcher for the connection pool");
}
}
private SSLContext getTrustAllSSLContext() {
try {
return new SSLContextBuilder()
.loadTrustMaterial(null, (x509Certificates, string) -> true)
.build();
} catch (Exception e) {
throw new RuntimeException("Failed to create SSL Context with open certificate", e);
}
}
public IdleConnectionEvictor.State state() {
return idleConnectionEvictor.evictorState;
}
#PreDestroy
private void finishManager() {
idleConnectionEvictor.shutdown();
}
public static class IdleConnectionEvictor extends Thread {
private final NHttpClientConnectionManager nhttpClientConnectionManager;
private final ElasticSearchClientConfiguration elasticSearchClientConfiguration;
#Getter
private State evictorState;
private volatile boolean shutdown;
public IdleConnectionEvictor(NHttpClientConnectionManager nhttpClientConnectionManager,
ElasticSearchClientConfiguration elasticSearchClientConfiguration) {
super();
this.nhttpClientConnectionManager = nhttpClientConnectionManager;
this.elasticSearchClientConfiguration = elasticSearchClientConfiguration;
}
#Override
public void run() {
try {
while (!shutdown) {
synchronized (this) {
wait(elasticSearchClientConfiguration.getExpiredConnectionsCheckTime());
// Close expired connections
nhttpClientConnectionManager.closeExpiredConnections();
// Optionally, close connections
// that have been idle longer than 5 sec
nhttpClientConnectionManager.closeIdleConnections(elasticSearchClientConfiguration.getMaxTimeIdleConnections(),
TimeUnit.SECONDS);
this.evictorState = State.RUNNING;
}
}
} catch (InterruptedException ex) {
this.evictorState = State.NOT_RUNNING;
}
}
private void shutdown() {
shutdown = true;
synchronized (this) {
notifyAll();
}
}
public enum State {
RUNNING,
NOT_RUNNING
}
}
}
We have a standalone java application using third-party tool to manage connection pooling, which worked for us in v6_client + v6_server setup for a long time.
Now we are trying to migrate from v6 to v9 (yes, we are pretty late to the party.....), and found v9_client connection to v6_server connection is constantly interrupted, meaning:
Socket created by XAQueueConnectionFactory#createXAConnection() is always closed immediately, and the created XAConnection seems to be unaware of it.
Due to socket close mentioned above, XASession created from the XAConnection.createXASession() always creates a new socket and close the socket after XASession.close().
We went throught the complete list of tunables for v9_client (XAQCF
column in https://www.ibm.com/support/knowledgecenter/SSFKSJ_9.0.0/com.ibm.mq.ref.dev.doc/q111800_.html) and only spot two potential new configs we haven't used in v6_client, SHARECONVALLOWED and PROVIDERVERSION. Unfortunately neither helps us out..... Specifically:
We tried setShareConvAllowed(WMQConstants.WMQ_SHARE_CONV_ALLOWED_[YES/NO]) Considering there's no SHARECNV property in v6_server side, not a surprise.
We tried "Migration/Restricted/Normal Mode" by setProviderVersion("[6/7/8") ([7/8] throws exceptions, as expected....).
Just wondering if anybody else had similar experience and could share some insights. We tried v9_server+v9_client and haven't seen any similar problem, so that could also be our eventual solution.....
Btw, the WMQ is hosted on linux (RedHat), and we only use products of MQXAQueueConnectionFactory on client side (ibm mq class for jms).
Thanks.
Additional details/updates.
[update-1]
--[playgrond setup]
v9_client jars:
javax.jms-api-2.0.jar
com.ibm.mq.allclient(-9.0.0.[1/5]).jar
v6_client jars:
in addition to v9_client jars, introduced the following jars in eclipse classpath
com.ibm.dhbcore-1.0.jar
com.ibm.mq.pcf-6.0.3.jar
com.ibm.mqjms-6.0.2.2.jar
com.ibm.mq-6.0.2.2.jar
com.ibm.mqetclient-6.0.2.2.jar
connector.jar
jta-1.1.jar
Testing code - single thread:
import javax.jms.*;
import com.ibm.mq.jms.*;
import com.ibm.msg.client.wmq.WMQConstants;
public class MQSeries_simpleAuditQ {
private static String queueManager = "QM.RCTQ.ALL.01";
private static String host = "localhost";
private static int port = 26005;
public static void main(String[] args) throws JMSException {
MQXAQueueConnectionFactory queueFactory= new MQXAQueueConnectionFactory();
System.out.println("\n\n\n*******************\nqueueFactory implementation version: " +
queueFactory.getClass().getPackage().getImplementationVersion() + "*****************\n\n\n");
queueFactory.setHostName(host);
queueFactory.setPort(port);
queueFactory.setQueueManager(queueManager);
queueFactory.setTransportType(WMQConstants.WMQ_CM_CLIENT);
if (queueFactory.getClass().getPackage().getImplementationVersion().split("\\.")[0].equals("9")) {
queueFactory.setProviderVersion("6");
//queueFactory.setShareConvAllowed(WMQConstants.WMQ_SHARE_CONV_ALLOWED_YES);
}
XASession xaSession;
javax.jms.QueueConnection xaQueueConnection;
try {
// Obtain a QueueConnection
System.out.println("Creating Connection...");
xaQueueConnection = (QueueConnection)queueFactory.createXAConnection(" ", "");
xaQueueConnection.start();
for (int counter=0; counter<2; counter++) {
try {
xaSession = ((XAConnection)xaQueueConnection).createXASession();
xaSession.close();
} catch (Exception ex) {
System.out.println(ex);
}
}
System.out.println("Closing connection.... ");
xaQueueConnection.close();
} catch (Exception e) {
System.out.println("Error processing " + e.getMessage());
}
}
}
--[observations]
v6_client only created and close a single socket, while v9_client (both 9.0.0.[1/5]):
socket created and closed right after xaQueueConnection = (QueueConnection)queueFactory.createXAConnection(" ", "");
in the inner for loop, socket created right after xaSession = ((XAConnection)xaQueueConnection).createXASession();, and closed after xaSession.close();
Naively i was expecting socket remains open until xaQueueConnection.close().
[update-2]
Using queueFactory.setProviderVersion("9"); and queueFactory.setShareConvAllowed(WMQConstants.WMQ_SHARE_CONV_ALLOWED_YES); for v9_server+v9_client, we don't see the same constant socket close issue in v6_server+v9_client, which is a good news.
[update-3] MCAUSER on attribute for all SVRCONN channel on v6_server. Same on v9_server (which doesn't have the same socket close problem when connected with the same v9_client).
display channel (SYSTEM.ADMIN.SVRCONN)
MCAUSER(mqm)
display channel (SYSTEM.AUTO.SVRCONN)
MCAUSER( )
display channel (SYSTEM.DEF.SVRCONN)
MCAUSER( )
[update-4]
I tried setting MCAUSER() to mqm, then using both (blank) and mqm from client side, both can create connections, but still seeing the same unexpected socket close using v9_client+v6_user. After updating MCAUSER(), i always added refresh security, and restart the qmgr.
I also tried setting system variable to blank in eclipse before creating the connection using blank user, didn't help either.
[update-5]
Limiting our discussion to v9_client+v9_server. The async testing code below generates a ton of xasession create/close request, using limited number of existing connections. Using SHARECNV(1) we would also end up with unaffordable high TIME_WAIT count, but using larger than 1 SHARECNV (eg. 10) might introduce extra performance penalty......
Async testing code
import java.util.ArrayList;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import javax.jms.*;
import com.ibm.mq.jms.*;
import com.ibm.msg.client.wmq.WMQConstants;
public class MQSeries_simpleAuditQ_Async_v9 {
private static String queueManager = "QM.ALPQ.ALL.01";
private static int port = 26010;
private static String host = "localhost";
private static int connCount = 20;
private static int amp = 100;
private static ExecutorService amplifier = Executors.newFixedThreadPool(amp);
public static void main(String[] args) throws JMSException {
MQXAQueueConnectionFactory queueFactory= new MQXAQueueConnectionFactory();
System.out.println("\n\n\n*******************\nqueueFactory implementation version: " +
queueFactory.getClass().getPackage().getImplementationVersion() + "*****************\n\n\n");
queueFactory.setTransportType(WMQConstants.WMQ_CM_CLIENT);
if (queueFactory.getClass().getPackage().getImplementationVersion().split("\\.")[0].equals("9")) {
queueFactory.setProviderVersion("9");
queueFactory.setShareConvAllowed(WMQConstants.WMQ_SHARE_CONV_ALLOWED_YES);
}
queueFactory.setHostName(host);
queueFactory.setPort(port);
queueFactory.setQueueManager(queueManager);
//queueFactory.setChannel("");
ArrayList<QueueConnection> xaQueueConnections = new ArrayList<QueueConnection>();
try {
// Obtain a QueueConnection
System.out.println("Creating Connection...");
//System.setProperty("user.name", "mqm");
//System.out.println("system username: " + System.getProperty("user.name"));
for (int ct=0; ct<connCount; ct++) {
// xaQueueConnection instance of MQXAQueueConnection
QueueConnection xaQueueConnection = (QueueConnection)queueFactory.createXAConnection(" ", "");
xaQueueConnection.start();
xaQueueConnections.add(xaQueueConnection);
}
ArrayList<Double> totalElapsedTimeRecord = new ArrayList<Double>();
ArrayList<FutureTask<Double>> taskBuffer = new ArrayList<FutureTask<Double>>();
for (int loop=0; loop <= 10; loop++) {
try {
for (int i=0; i<amp; i++) {
int idx = (int)(Math.random()*((connCount)));
System.out.println("Using connection: " + idx);
FutureTask<Double> xaSessionPoker = new FutureTask<Double>(new XASessionPoker(xaQueueConnections.get(idx)));
amplifier.submit(xaSessionPoker);
taskBuffer.add(xaSessionPoker);
}
System.out.println("loop " + loop + " completed");
} catch (Exception ex) {
System.out.println(ex);
}
}
for (FutureTask<Double> xaSessionPoker : taskBuffer) {
totalElapsedTimeRecord.add(xaSessionPoker.get());
}
System.out.println("Average xaSession poking time: " + calcAverage(totalElapsedTimeRecord));
System.out.println("Closing connections.... ");
for (QueueConnection xaQueueConnection : xaQueueConnections) {
xaQueueConnection.close();
}
} catch (Exception e) {
System.out.println("Error processing " + e.getMessage());
}
amplifier.shutdown();
}
private static double calcAverage(ArrayList<Double> myArr) {
double sum = 0;
for (Double val : myArr) {
sum += val;
}
return sum/myArr.size();
}
// create and close session through QueueConnection object passed in.
private static class XASessionPoker implements Callable<Double> {
// conn instance of MQXAQueueConnection. ref. QueueProviderService
private QueueConnection conn;
XASessionPoker(QueueConnection conn) {
this.conn = conn;
}
#Override
public Double call() throws Exception {
XASession xaSession;
double elapsed = 0;
try {
final long start = System.currentTimeMillis();
// ref. DualSessionWrapper
// xaSession instance of MQXAQueueSession
xaSession = ((XAConnection) conn).createXASession();
xaSession.close();
final long end = System.currentTimeMillis();
elapsed = (end - start) / 1000.0;
} catch (Exception e) {
// TODO Auto-generated catch block
System.out.println(e);
}
return elapsed;
}
}
}
We found the root cause is a combination of no more session pooling + bitronix TM doesn't offer session pooling across TX. Specifically (in our case), bitronix manages JmsPooledConnection pooling, but everytime a (xa)session is used (under JmsPooledConnection), a new socket is created (createXASession()) and closed (xaSession.close()).
One solution, is to wrap the jms connection with (xa)session pool, similar to what's been done in https://github.com/messaginghub/pooled-jms/tree/master/pooled-jms/src/main/java/org/messaginghub/pooled/jms
http://bjansen.github.io/java/2018/03/04/high-performance-mq-jms.html also suggests Spring CachingConnectionFactory works well, which sounds like a speical case of the first solution.
I'm finding that when multiple threads request a connection each from a single, shared instance of a ComboPooledDataSource, there is the occasion of if returning a connection from the pool that's already in-use. Is there a configuration setting or other means to make sure that connections currently checked-out aren't checked-out again?
package stress;
import java.beans.PropertyVetoException;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.Set;
import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.mchange.v2.c3p0.DataSources;
public class StressTestDriver
{
private static final String _host = "";
private static final String _port = "3306";
private static final String _database = "";
private static final String _user = "";
private static final String _pass = "";
public static void main(String[] args)
{
new StressTestDriver();
}
StressTestDriver()
{
ComboPooledDataSource cpds = new ComboPooledDataSource();
try
{
cpds.setDriverClass( "com.mysql.jdbc.Driver" );
String connectionString = "jdbc:mysql://" + _host + ":" + _port + "/"
+ _database;
cpds.setJdbcUrl( connectionString );
cpds.setMaxPoolSize( 15 );
cpds.setMaxIdleTime( 100 );
cpds.setAcquireRetryAttempts( 1 );
cpds.setNumHelperThreads( 3 );
cpds.setUser( _user );
cpds.setPassword( _pass );
}
catch( PropertyVetoException e )
{
e.printStackTrace();
return;
}
write("BEGIN");
try
{
for(int i=0; i<100000; ++i)
doConnection( cpds );
}
catch( Exception ex )
{
ex.printStackTrace();
}
finally
{
try
{
DataSources.destroy( cpds );
}
catch( SQLException e )
{
e.printStackTrace();
}
}
write("END");
}
void doConnection( final ComboPooledDataSource cpds )
{
Thread[] threads = new Thread[ 10 ];
final Set<String> set = new HashSet<String>(threads.length);
Runnable runnable = new Runnable()
{
public void run()
{
Connection conn = null;
try
{
conn = cpds.getConnection();
synchronized(set)
{
String toString = conn.toString();
if( set.contains( toString ) )
write("In-use connection: " + toString);
else
set.add( toString );
}
conn.close();
}
catch( Exception e )
{
e.printStackTrace();
return;
}
}
};
for(int i=0; i<threads.length; ++i)
{
threads[i] = new Thread( runnable );
threads[i].start();
}
for(int i=0; i<threads.length; ++i)
{
try
{
threads[i].join();
}
catch( InterruptedException e )
{
e.printStackTrace();
}
}
}
private static void write(String msg)
{
String threadName = Thread.currentThread().getName();
System.err.println(threadName + ": " + msg);
}
}
I've run you stress test in my environment. It terminated without output.
That said, it strikes me as a bit unreliable to use toString() output as a token for identity. The hashcodes encoded in Object.toString() are not guaranteed unique, and you're generating lots. You might just be seeing collisions.
In particular, the scenario you are concerned about would represent a surprising and perplexing sort of bug. You should be seeing instances of com.mchange.v2.c3p0.impl.NewProxyConnection. Those instances are not checked in and out of the pool — they are single use, disposable proxies that wrap the underlying database Connection. They are constructed with new when you call getConnection(), remain associated with a PooledConnection object while you work, then dereferenced by the c3p0 library when you call close(), to be garbage collected after client code dereferences them. If somehow getConnection() was being called on the same underlying PooledConnection, that would be a c3p0 bug, and c3p0 would emit a warning to that effect. You are not seeing anything like...
c3p0 -- Uh oh... getConnection() was called on a PooledConnection when
it had already provided a client with a Connection that has not yet
been closed. This probably indicates a bug in the connection pool!!!
are you?
I'd verify that you are not just seeing collisions of NewProxyConnection.toString(). Use a map rather than a set, hold a reference to close()ed Connection instances as values, when you see a collision of String keys, check with == whether the new Connection is the same instance as the value in your map. I'd be astonished if you find that they are the same instance. (I wish I could reproduce the problem to verify this on my own.)
I added code to get the connection ID/##SPID for Oracle and MS SQL Server, and whenever the toString value was the same, the connection ID always differed, so c3p0 is not the culprit:
94843 - pool-1-thread-12 - : onCheckOut called with: com.microsoft.sqlserver.jdbc.SQLServerConnection#5e65d9 - H: 6186457 - P: : 57
94843 - pool-1-thread-13 - : onCheckOut called with: com.microsoft.sqlserver.jdbc.SQLServerConnection#170a25e - H: 24158814 - P: : 55
94843 - pool-1-thread-1 - : onCheckOut called with: com.microsoft.sqlserver.jdbc.SQLServerConnection#5e65d9 - H: 6186457 - P: : 54
19172 - pool-1-thread-11 - : onCheckOut called with: oracle.jdbc.driver.T4CConnection#2475ca - H: 2389450 - P: : 6564
19172 - pool-1-thread-31 - : onCheckOut called with: oracle.jdbc.driver.T4CConnection#2475ca - H: 2389450 - P: : 6448
I'll have to look into revising the existing code. Thanks for your help Steve! You've pointed me in the right direction.