Getting RabbitMQ HeartBeat Exception - amazon-ec2

I am facing an issue with RabbitMQ: it terminates the connection with the exception below.
{"message":"com.rabbitmq.client.AlreadyClosedException: connection is already closed due to connection error; cause: com.rabbitmq.client.MissedHeartbeatException: Heartbeat missing with heartbeat = 60 seconds"}
I am using the following setup:
RabbitMQ runs in Docker on AWS EC2 instance 1 (t2.small).
The services facing this issue run on AWS EC2 instance 2 (t2.small).
I set the following configuration for the RabbitMQ client:
factory.setAutomaticRecoveryEnabled(true);
factory.setNetworkRecoveryInterval(1000);
factory.setRequestedHeartbeat(60);
I am not able to understand what could cause this type of error or how to resolve it.
I also found the logs below from the service that was trying to connect to RabbitMQ:
{"log":"Caught an exception during connection recovery!\n","stream":"stderr","time":"2018-03-22T00:00:00.632851865Z"}
{"log":"java.net.NoRouteToHostException: No route to host\n","stream":"stderr","time":"2018-03-22T00:00:00.633374123Z"}
{"log":"\u0009at java.net.PlainSocketImpl.socketConnect(Native Method)\n","stream":"stderr","time":"2018-03-22T00:00:00.633666158Z"}
{"log":"\u0009at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)\n","stream":"stderr","time":"2018-03-22T00:00:00.633935828Z"}
{"log":"\u0009at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)\n","stream":"stderr","time":"2018-03-22T00:00:00.634170787Z"}
{"log":"\u0009at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)\n","stream":"stderr","time":"2018-03-22T00:00:00.63440824Z"}
{"log":"\u0009at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)\n","stream":"stderr","time":"2018-03-22T00:00:00.634625637Z"}
{"log":"\u0009at java.net.Socket.connect(Socket.java:589)\n","stream":"stderr","time":"2018-03-22T00:00:00.635038038Z"}
{"log":"\u0009at com.rabbitmq.client.impl.FrameHandlerFactory.create(FrameHandlerFactory.java:32)\n","stream":"stderr","time":"2018-03-22T00:00:00.635172903Z"}
{"log":"\u0009at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:34)\n","stream":"stderr","time":"2018-03-22T00:00:00.635369445Z"}
{"log":"\u0009at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverConnection(AutorecoveringConnection.java:435)\n","stream":"stderr","time":"2018-03-22T00:00:00.635639932Z"}
{"log":"\u0009at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.beginAutomaticRecovery(AutorecoveringConnection.java:407)\n","stream":"stderr","time":"2018-03-22T00:00:00.63584649Z"}
{"log":"\u0009at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.access$000(AutorecoveringConnection.java:53)\n","stream":"stderr","time":"2018-03-22T00:00:00.636051142Z"}
{"log":"\u0009at com.rabbitmq.client.impl.recovery.AutorecoveringConnection$1.shutdownCompleted(AutorecoveringConnection.java:352)\n","stream":"stderr","time":"2018-03-22T00:00:00.636233667Z"}
{"log":"\u0009at com.rabbitmq.client.impl.ShutdownNotifierComponent.notifyListeners(ShutdownNotifierComponent.java:75)\n","stream":"stderr","time":"2018-03-22T00:00:00.636899252Z"}
{"log":"\u0009at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:576)\n","stream":"stderr","time":"2018-03-22T00:00:00.637183801Z"}

From what I could find searching around, it seems the connection could have been closed by the AWS load balancers.
Could you try modifying your configuration as follows:
factory.setRequestedHeartbeat(30);
And see if it resolves your issue?
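The suggestion above rests on timing. RabbitMQ's heartbeat documentation describes heartbeat frames as being sent at roughly half the negotiated timeout, so a connection only looks idle to an intermediary when that gap exceeds the intermediary's idle timeout. Below is a minimal Java sketch of that arithmetic. The class and method names are hypothetical, and the 20-second idle timeout is an assumed value for illustration, not something taken from the question; check the real setting of whatever sits between the client and the broker.

```java
import java.time.Duration;

final class HeartbeatCheck {
    // Approximate gap between heartbeat frames: half the requested heartbeat timeout.
    static Duration frameInterval(int heartbeatSeconds) {
        return Duration.ofSeconds(heartbeatSeconds).dividedBy(2);
    }

    // True when heartbeat frames are sent more often than the intermediary's idle timeout.
    static boolean staysActive(int heartbeatSeconds, Duration idleTimeout) {
        return frameInterval(heartbeatSeconds).compareTo(idleTimeout) < 0;
    }

    public static void main(String[] args) {
        // ASSUMPTION: hypothetical idle timeout of a load balancer or NAT device.
        Duration assumedIdleTimeout = Duration.ofSeconds(20);
        for (int heartbeat : new int[] {60, 30}) {
            System.out.printf("heartbeat=%ds, frame gap=%ds, stays active=%b%n",
                    heartbeat,
                    frameInterval(heartbeat).getSeconds(),
                    staysActive(heartbeat, assumedIdleTimeout));
        }
    }
}
```

If nothing such as a load balancer or NAT device sits between the two instances, this check does not apply, and the NoRouteToHostException in the recovery log would point toward a network reachability problem instead.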

Related

Graylog connecting to existing elasticsearch shows 503

Graylog v4.0.7
Elasticsearch v7.7
MongoDB v4.4
We are setting up Graylog in our Kubernetes cluster. Graylog can connect to the MongoDB server in another cluster through a load balancer. When we connect Graylog to Elasticsearch, which is in a different cluster (also through a load balancer, e.g. https://myelastic.sample.com:443), the application logs show a 503. But when we curl an Elasticsearch API, it returns 200.
This only occurs in the Graylog pod.
Caused by: org.graylog.shaded.elasticsearch7.org.elasticsearch.client.ResponseException: method [GET], host [https://myelastic.sample.com:443], URI [/_alias/graylog_deflector?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=true], status line [HTTP/1.1 503 Service Unavailable]
upstream connect error or disconnect/reset before headers. reset reason: connection failure
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.performRequest(RestClient.java:272) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.performRequest(RestClient.java:246) ~[?:?]

Connection refused while connection : MQJE001: Completion Code '2', Reason '2538'

I'm trying to access a queue.
def mqProps = new Hashtable<String, Object>()
mqProps.put(MQConstants.CHANNEL_PROPERTY, 'CHANNEL')
mqProps.put(MQConstants.PORT_PROPERTY, PORT)
mqProps.put(MQConstants.HOST_NAME_PROPERTY, 'HOST')
mqProps.put(MQConstants.USER_ID_PROPERTY, 'myuser') // is it the correct property for the user?
mqProps.put(MQConstants.PASSWORD_PROPERTY, 'mypassword') // is it the correct property for the password?
def qMgr = new MQQueueManager('QM', mqProps)
However, I'm facing the following error:
javax.script.ScriptException: com.ibm.mq.MQException: MQJE001: Completion Code '2', Reason '2538'
...
Caused by: com.ibm.mq.MQException: MQJE001: Completion Code '2', Reason '2538'.
...
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;AMQ9204: Connection to host 'HOST(PORT)' rejected.
...
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;AMQ9204: Connection to host 'HOST/address:PORT' rejected.
...
Caused by: java.net.ConnectException: Connection timed out: connect
...
The error happened on the line:
def qMgr = new MQQueueManager('QM', mqProps)
Can you please explain the reason for this issue? Thank you.
Reason Code 2538 is MQRC_HOST_NOT_AVAILABLE.
You can quickly discover this by using the mqrc command line tool that comes with IBM MQ. Type:
mqrc 2538
and you will be told:
2538 0x000009ea MQRC_HOST_NOT_AVAILABLE
Alternatively you can look it up in the IBM MQ Knowledge Center.
Reading the explanation in Knowledge Center will show you a number of common possibilities for the problem.
The listener has not been started on the remote system.
The connection name in the client channel definition is incorrect.
The network is currently unavailable.
A firewall is blocking the port or protocol-specific traffic.
Perhaps the most common error is that the listener running at the queue manager is not using the same port number that you have put in the client application connection details.
You haven't shown us in your question any details about the listener running on the queue manager, so we will have to leave that for you to check yourself.
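The causes listed above tend to surface as different socket errors, so a plain TCP probe from the client machine can narrow them down before touching the MQ configuration. The sketch below uses only the Java standard library. The PortProbe class, its Result names, and the host and port in main are hypothetical placeholders. As a rule of thumb (an assumption, not an IBM MQ guarantee), a refused connection suggests that nothing is listening on that port, while a timeout suggests a firewall silently dropping packets or an unreachable host.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

final class PortProbe {
    enum Result { OPEN, REFUSED, TIMED_OUT, NO_ROUTE, UNKNOWN_HOST, OTHER }

    // Attempts a TCP connect and maps the outcome to a coarse category.
    static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Result.OPEN;
        } catch (SocketTimeoutException e) {
            // Our own timeout expired before the handshake completed.
            return Result.TIMED_OUT;
        } catch (NoRouteToHostException e) {
            return Result.NO_ROUTE;
        } catch (ConnectException e) {
            // The OS may report its own connect timeout as a ConnectException.
            String message = e.getMessage();
            boolean timedOut = message != null && message.toLowerCase().contains("timed out");
            return timedOut ? Result.TIMED_OUT : Result.REFUSED;
        } catch (UnknownHostException e) {
            return Result.UNKNOWN_HOST;
        } catch (IOException e) {
            return Result.OTHER;
        }
    }

    public static void main(String[] args) {
        // Placeholder target: replace with the queue manager's host and listener port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1414;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 5000));
    }
}
```

The stack trace in the question ends in "Connection timed out", which this probe would report as TIMED_OUT rather than REFUSED.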

Connection timeout expired while connecting to impala with impala JDBC Driver

I am using Impala 2.12.0-cdh5.16.1 and connecting to it with impala_jdbc_2.6.4.1005. Normally it runs very well, but when I run distcp (which consumes cluster network I/O and HDFS I/O), the Java program may throw errors.
2019/02/28 12:54:26 531873 ERROR run.QihooStatusTask(run:88) - [Cloudera][ImpalaJDBCDriver](700100) Connection timeout expired. Details: java.net.ConnectException: Connection timed out.
java.sql.SQLException: [Cloudera][ImpalaJDBCDriver](700100) Connection timeout expired. Details: java.net.ConnectException: Connection timed out.
at com.cloudera.impala.hivecommon.core.HiveJDBCCommonConnection.handleException(Unknown Source)
at com.cloudera.impala.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.impala.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.impala.jdbc.common.AbstractDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
The full error Message is in the picture:
I tried adding DriverManager.setLoginTimeout(120) to the program, but the error still occurs.
I think it may happen because the cluster network I/O is very high, and there may be a parameter that raises the timeout so that the error disappears.
Any suggestions? Thanks.

What will cause zookeeper Client session timed out

I deployed a long-running Storm topology. After running for several hours, the whole topology went down. I checked the worker logs and found the entries below. As they say, the ZooKeeper client session timed out, which caused a reconnection. I suspect this is related to my broken topology. Now I am trying to find out what can cause client timeouts.
2016-02-29T10:34:12.386+0800 o.a.s.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 23789ms for sessionid 0x252f862028c0083, closing socket connection and attempting reconnect
2016-02-29T10:34:12.986+0800 o.a.s.c.f.s.ConnectionStateManager [INFO] State change: SUSPENDED
2016-02-29T10:34:13.059+0800 b.s.cluster [WARN] Received event :disconnected::none: with disconnected Zookeeper.
2016-02-29T10:34:13.197+0800 o.a.s.z.ClientCnxn [INFO] Opening socket connection to server zk-3.cloud.mos/172.16.13.147:2181. Will not attempt to authenticate using SASL (unknown error)
2016-02-29T10:34:13.241+0800 o.a.s.z.ClientCnxn [WARN] Session 0x252f862028c0083 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_31]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) ~[na:1.8.0_31]
at org.apache.storm.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-0.9.6.jar:0.9.6]
at org.apache.storm.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[storm-core-0.9.6.jar:0.9.6]
Your client can no longer talk to the ZooKeeper server. The first thing that happened was there was no answer to the heartbeats within the negotiated session timeout:
2016-02-29T10:34:12.386+0800 o.a.s.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 23789ms for sessionid 0x252f862028c0083, closing socket connection and attempting reconnect
Then when it tried to reconnect, it got a connection refused:
2016-02-29T10:34:13.241+0800 o.a.s.z.ClientCnxn [WARN] Session 0x252f862028c0083 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
This means either your ZooKeeper server:
Is not reachable (network connection down)
Is dead (so nothing is listening on the socket)
Is GCing itself to death and cannot communicate (although that might have issued a connection timeout error, I'm not sure)
To tell more you will need to check the ZooKeeper server logs on your (Hadoop?) cluster.
It worked for me after increasing the connection timeout in server.properties:
zookeeper.connection.timeout.ms=60000
One way this can happen is if you start ZooKeeper, then interrupt it in the terminal, then try to start Kafka.
To use Kafka, you really should use 3 terminal windows (or 3 PuTTY sessions if you are SSHing into your instance from Windows):
First Session for Zookeeper server.
Second Session for Kafka server.
Third Session for running Kafka commands to do things like create topics.
I started Kafka in cluster mode with 3 ZooKeeper servers and 3 Kafka servers. All ZooKeeper servers started successfully, but each Kafka server disconnected during startup, stating "fatal error during Kafka server startup. prepare to shutdown (kafka.server.kafkaserver)". While investigating, I found that the Kafka server got disconnected every time after 18 seconds (which is the zookeeper.connection.timeout.ms = 18000 default value), so I increased that value and the issue was resolved.
Always use 2181 as the port number for the ZooKeeper connection unless you have configured ZooKeeper to use a different port.

Javascript Adapter throwing java.net.SocketException: Connection reset

I am trying to make an HTTPS call to a backend server that returns JSON data. I can get the data by making the HTTPS call from a browser, but when I make the same call using the JavaScript adapter I get the output below.
I followed the IBM Knowledge Center instructions to add the certificate to the default MobileFirst keystore. I am not sure why I am getting this error.
[ERROR ] FWLSE0099E: An error occurred while invoking procedure [project kmf]login/HttpRequest
FWLSE0100E: parameters: [project kmf]
Http request failed: java.net.SocketException: Connection reset
FWLSE0101E: Caused by: [project kmf]java.net.SocketException: Connection reset
java.lang.RuntimeException: Http request failed: java.net.SocketException: Connection reset
at com.worklight.adapters.http.HTTPConnectionManager.execute(HTTPConnectionManager.java:271)
at com.worklight.adapters.http.HttpClientContext.doExecute(HttpClientContext.java:201)
at com.worklight.adapters.http.HttpClientContext.execute(HttpClientContext.java:185)
From the comments, by Vivin:
"Connection reset" means the backend server has reset the connection. You should consult your network team or backend team and verify why this is occurring. Firewalls, network issues, and backend server connection issues are all possibilities. MobileFirst Server is only reporting the issue as it found it.
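To illustrate why the caller is only the messenger here, the following self-contained Java sketch provokes the same exception on the loopback interface. The ConnectionResetDemo class is hypothetical and has nothing to do with MobileFirst itself. It assumes that closing a socket with SO_LINGER set to zero makes the operating system send a TCP reset, which is the usual behavior but is ultimately platform-dependent.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

final class ConnectionResetDemo {
    // Returns the message of the SocketException the client sees,
    // or null if the read finished without one.
    static String provokeReset() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Linger time 0: close() aborts the connection instead of closing it gracefully.
            accepted.setSoLinger(true, 0);
            accepted.close();
            try {
                client.getInputStream().read();
                return null;
            } catch (SocketException e) {
                // The client did nothing wrong; it merely observes the peer's reset.
                return e.getMessage();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Client saw: " + provokeReset());
    }
}
```

The client side of this demo is entirely passive, which mirrors the comment above: the fix has to come from whatever reset the connection (the backend or a device in between), not from the code that reports it.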
