HBase RegionServer: error telling master we are up - hadoop

I am getting the following errors in the logs of the slave RegionServer.
The problem seems to be at
regionserver.HRegionServer: reportForDuty to
master=localhost,60000,1397430611631 with port=60020
The master is set as localhost, but it should actually point to the master node.
Even after going through the docs, I am unable to figure out how the slave determines who the master is.
The complete log is:
2014-04-14 04:49:35,939 INFO [regionserver60020] regionserver.HRegionServer: CompactionChecker runs every 10sec
2014-04-14 04:49:35,950 INFO [regionserver60020] regionserver.HRegionServer: reportForDuty to master=localhost,60000,1397430611631 with port=60020, startcode=1397431174733
2014-04-14 04:49:36,083 WARN [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1671)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:5402)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2013)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:846)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:866)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1536)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
... 5 more

The RegionServer looks up the active master's address in ZooKeeper; if the master host resolves its own hostname to 127.0.0.1, it registers itself as localhost, and that is what the slaves see. Check your /etc/hosts file: if there is an entry like
127.0.0.1 localhost yourhost
change it to
127.0.0.1 localhost
192.168.1.1 yourhost
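To verify the fix, a minimal sketch of checks (assuming "master" is your master's hostname and 60000 is the master RPC port shown in the log):
# on the master: its own hostname should no longer resolve to 127.0.0.1
getent hosts master
# on the slave: the master RPC port should now be reachable
nc -zv master 60000
After correcting /etc/hosts, restart the HMaster first so it re-registers its address in ZooKeeper, then restart the RegionServers.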

Related

Ambari on HDP cluster + ambari-metrics-collector service does not start

We have an issue with the ambari-metrics-collector service (HDP cluster version 2.6.4 with 8 nodes).
The ambari-metrics-collector service either can't start at all, or starts for a few seconds and then fails.
Details about the metrics collector version:
rpm -qa | grep metrics
ambari-metrics-grafana-2.6.1.0-143.x86_64
ambari-metrics-monitor-2.6.1.0-143.x86_64
ambari-metrics-collector-2.5.0.3-7.x86_64
ambari-metrics-hadoop-sink-2.6.1.0-143.x86_64
All machines are RHEL 7.2.
We performed the following steps to try to resolve the problem:
1. Restart the metrics-collector service
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ stop'
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ start'
or
ambari-metrics-collector stop
ambari-metrics-collector start
2. Restart ambari-metrics-monitor on all nodes
ambari-metrics-monitor stop
ambari-metrics-monitor start
3. Clean the folder /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/
mv /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/zookeeper_0 /tmp/bck/zookeeper/
then restart the metrics-collector service.
4. Tune the metrics-collector parameters according to https://docs.cloudera.com/HDPDocuments/Ambari-2.2.1.0/bk_ambari_reference_guide/content/_ams_general_guidelines.html
We updated the following parameters in Ambari:
metrics_collector_heap_size=1024
hbase_regionserver_heapsize=1024
hbase_master_heapsize=512
hbase_master_xmn_size=128
Status for now: steps 1-4 didn't help.
From the logs we can see the following:
log file - ambari-metrics-collector.log
2020-06-25 09:06:14,474 WARN org.apache.zookeeper.ClientCnxn: Session 0x172eab71f310002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2020-06-25 09:06:14,575 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=master02.sys671.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server
log file - hbase-ams-master-master02.sys671.com.log
2020-06-25 09:38:18,799 WARN [RS:0;master02:51842-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0004 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2020-06-25 09:38:20,437 INFO [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server master02.sys671.com/23.2.35.171:61181. Will not attempt to authenticate using SASL (unknown error)
2020-06-25 09:38:20,438 WARN [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
We also see that the port (timeline.metrics.service.webapp.address) is not listening:
netstat -tulpn | grep 6188
Any advice on how to continue from this point?
We would appreciate any help with this problem.
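A quick check of whether the embedded AMS ZooKeeper (port 61181 in the logs above) is listening at all would be a reasonable starting point (a sketch using the hostname and port from the logs):
# is the embedded AMS ZooKeeper listening on the collector host?
netstat -tulpn | grep 61181
# a healthy ZooKeeper answers "imok" to the ruok four-letter command
echo ruok | nc master02.sys671.com 61181
If nothing is listening on 61181, the collector's embedded HBase/ZooKeeper never came up, which would also explain why 6188 is not listening.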

NiFi - "Connection refused" in GUI after installation

I'm installing a three-node non-secure NiFi cluster (1.5) with embedded ZooKeeper.
The installation has completed, and the cluster startup and election have finished without any obvious issues.
However, when I hit the GUI I see the following:
javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
In nifi-apps.log there is:
2018-03-07 14:41:37,857 WARN [Replicate Request Thread-7] o.a.n.c.c.h.r.ThreadPoolRequestReplicator Failed to replicate request GET /nifi-api/flow/current-user to test-nifi01:8080 due to javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
2018-03-07 14:41:37,858 WARN [Replicate Request Thread-7] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:284)
at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:278)
at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$0(JerseyInvocation.java:753)
....
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
....
in nifi-user.log:
2018-03-07 15:18:49,475 INFO [NiFi Web Server-21] org.apache.nifi.web.filter.RequestLogger Attempting request for (<no user found>) POST http://test-nifi01:8080/nifi-api/access/kerberos (source ip: 172.100.1.11)
2018-03-07 15:18:49,476 INFO [NiFi Web Server-21] o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException: Access tokens are only issued over HTTPS.. Returning Conflict response.
2018-03-07 15:18:49,572 INFO [NiFi Web Server-17] org.apache.nifi.web.filter.RequestLogger Attempting request for (<no user found>) POST http://test-nifi01:8080/nifi-api/access/oidc/exchange (source ip: 172.100.1.11)
2018-03-07 15:18:49,573 INFO [NiFi Web Server-17] o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException: User authentication/authorization is only supported when running over HTTPS.. Returning Conflict response.
2018-03-07 15:18:49,672 INFO [NiFi Web Server-23] org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous) GET http://test-nifi01:8080/nifi-api/flow/current-user (source ip: 172.100.1.11)
I get the same results regardless of which host I connect to.
I'm not sure what I need to be checking here.
EDIT: Updated with more detail.
Node 1: https://pastebin.com/5nMzWfXw
# web properties #
nifi.web.war.directory=./lib
nifi.web.http.host=test-nifi01
nifi.web.http.port=8080
nifi.web.http.network.interface.default=eth0
nifi.web.https.host=
nifi.web.https.port=
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=test-nifi01
nifi.cluster.node.protocol.port=11003
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
Node 2: https://pastebin.com/n5GMy1nA
# web properties #
nifi.web.war.directory=./lib
nifi.web.http.host=test-nifi02
nifi.web.http.port=8080
nifi.web.http.network.interface.default=eth0
nifi.web.https.host=
nifi.web.https.port=
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=test-nifi02
nifi.cluster.node.protocol.port=11003
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
Node 3: https://pastebin.com/EqztLfnD
# web properties #
nifi.web.war.directory=./lib
nifi.web.http.host=test-nifi03
nifi.web.http.port=8080
nifi.web.http.network.interface.default=eth0
nifi.web.https.host=
nifi.web.https.port=
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=test-nifi03
nifi.cluster.node.protocol.port=11003
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
Anyone have any thoughts on what I should be looking at?
EDIT 2:
It looks like each node is refusing to connect to itself. E.g. on node 2 we have
2018-03-07 20:31:08,099 WARN [Replicate Request Thread-4] o.a.n.c.c.h.r.ThreadPoolRequestReplicator Failed to replicate request GET /nifi-api/flow/current-user to test-nifi02:8080 due to javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
and a curl also fails:
[taas@test-nifi02 logs]$ curl http://test-nifi02:8080
curl: (7) Failed connect to test-nifi02:8080; Connection refused
Curls to the other nodes are fine. The same behavior occurs on each node, e.g. node1 can't curl to node1 and node3 can't curl to node3.
Make sure you have DNS enabled for the network/subnet. If you don't have the luxury of DNS, try updating the /etc/hosts files with the exact IP addresses and hostnames, e.g.:
192.168.0.1 test-nifi01
192.168.0.2 test-nifi02
in all of the NiFi nodes, then try connecting again. Also make sure you can connect from one node to another over SSH.
In a NiFi cluster, a request comes in to one node, and then that node replicates the request to all other nodes.
The issue shown in nifi-app.log indicates that the connection was refused when replicating the request to the current node, test-nifi01:8080.
I assume you have hosts like test-nifi01, test-nifi02, and test-nifi03, so each of these nodes would need to be able to resolve the other hostnames. You can start by issuing a ping from one node to the other two, and do that on each node.
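Given that EDIT 2 shows each node refusing connections to itself, it is also worth confirming that the HTTP listener is actually bound on the address the node's own hostname resolves to. A minimal sketch, run on each node with its own hostname (test-nifi02 shown as an example):
# what does the node's own hostname resolve to?
getent hosts test-nifi02
# is Jetty listening on port 8080, and on which address?
ss -tlnp | grep 8080
# compare connecting via loopback vs. the hostname
curl -v http://127.0.0.1:8080/nifi/
curl -v http://test-nifi02:8080/nifi/
If the listener turns out to be bound only to the eth0 address (per nifi.web.http.network.interface.default) while the hostname resolves locally to an address on a different interface, that would explain why each node can reach its peers but not itself.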

Storm - Supervisors launched but not connecting to Nimbus

I have a Storm cluster with 1 Nimbus, 4 Supervisors and 2 ZooKeeper nodes. My storm.yaml is as follows:
storm.zookeeper.servers:
- "storage14"
- "storage15"
nimbus.seeds: ["storage01"]
#storm.local.hostname: "storage05"
supervisor.supervisors:
- "storage02"
- "storage03"
- "storage04"
- "storage05"
storm.local.dir: "/tmp/storm"
worker.childopts: "-Xmx%HEAP-MEM%m -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"
This storm.yaml file is used by both Nimbus and Supervisors. When Nimbus is started I have the storm.local.hostname commented out as is shown above.
However, when starting Supervisors on respective nodes, I uncomment the storm.local.hostname and set it to the hostname of the node on which the supervisor is being launched. For instance if I was launching the supervisor on storage05, the storm.yaml file would have the following additional config param:
storm.local.hostname: "storage05"
The problem is that even though Nimbus is launched successfully and I can see it on the Storm UI, some supervisors do not seem to be able to connect to Nimbus. For instance, of the 4 nodes I start supervisors on, the Storm UI often shows only 2 of them connected. However, if I ssh into these nodes and run jps, I can see that the supervisor process is running on ALL of these nodes.
The supervisors that do end up connecting are not always the same ones, so it is definitely not a problem with those specific nodes.
Another thing to note is that if I try to execute a topology on whichever nodes did connect, it does not get registered by the cluster and I cannot see that topology on the UI either.
What do you think might be causing this erratic behavior?
UPDATE:
Tail end of nimbus.log has the following lines
2017-01-25 00:04:25.216 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.317 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage15/192.168.140.195:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.317 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.686 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage15/192.168.140.195:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.686 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.787 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage14/192.168.140.194:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.787 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
Your UPDATE (the nimbus log) indicates that Nimbus cannot connect to the ZooKeeper cluster. Please check that the ZooKeeper cluster (storage14/storage15) is accessible from storage01: not just that the nodes are reachable, but that the ZooKeeper port itself responds, e.g. "telnet storage14 2181" (and likewise for storage15).
Once the ZK connectivity issue is resolved, please try starting the supervisors again.
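For example, a quick way to confirm from storage01 that ZooKeeper is actually serving on those hosts (a sketch; 2181 is the default ZooKeeper client port seen in the nimbus log):
telnet storage14 2181
telnet storage15 2181
# ZooKeeper's ruok four-letter command should answer "imok" on a healthy server
echo ruok | nc storage14 2181
echo ruok | nc storage15 2181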

com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect

I have a 2-node Cassandra cluster with IP:Port aa.aaa.a.aaa:9043 and xx.xxx.x.xxx:9043. When I try to connect using the following config:
PoolingOptions poolingOptions = new PoolingOptions();
poolingOptions.setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
.setMaxConnectionsPerHost(HostDistance.LOCAL, 4)
.setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
.setMaxConnectionsPerHost(HostDistance.REMOTE, 4)
.setMaxRequestsPerConnection(HostDistance.LOCAL, 200)
.setMaxRequestsPerConnection(HostDistance.REMOTE, 200);
cluster = Cluster.builder()
.addContactPointsWithPorts(socketAddressList)
.withPoolingOptions(poolingOptions)
.withRetryPolicy(DefaultRetryPolicy.INSTANCE)
.withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy())).build();
Session session = cluster.connect(cassandraDB);
I am getting the following exception:
16/01/14 09:52:45 INFO core.NettyUtil: Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
16/01/14 09:52:46 WARN core.Cluster: You listed /xx.xxx.x.xxx:9043 in your contact points, but it could not be reached at startup
16/01/14 09:52:47 INFO policies.DCAwareRoundRobinPolicy: Using data-center name 'name' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
ent.Futures$CombinedFuture setExceptionAndMaybeLog
SEVERE: input future failed.
com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156)
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /xx.xxx.x.xxx:9042
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
... 6 more
16/01/14 09:52:47 ERROR core.Session: Error creating pool to /xx.xxx.x.xxx:9042
com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156)
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /xx.xxx.x.xxx:9042
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
My question is:
Why is it trying to connect on port 9042, when I am not using this port anywhere in my code or config file?
Cassandra version: 2.2.1
Why is it trying to connect on port 9042, when I am not using this port anywhere in my code or config file?
9042 is the default port for the CQL binary protocol.
Can you show us the content of the socketAddressList variable that you passed to the cluster builder?
Is there any reason you're using port 9043 instead of the default 9042?
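One way to confirm which port the cluster is really serving CQL on (a sketch; the cassandra.yaml path is the usual package-install location and may differ in your setup):
# which port is the node configured to serve the CQL native protocol on?
grep native_transport_port /etc/cassandra/cassandra.yaml
# is that port reachable from the client machine?
nc -zv xx.xxx.x.xxx 9043
nc -zv xx.xxx.x.xxx 9042
# cqlsh also confirms which port accepts CQL connections
cqlsh xx.xxx.x.xxx 9043
Note also that the ports passed via addContactPointsWithPorts apply only to the initial contact points; nodes discovered afterwards are typically contacted on the driver's configured port, which defaults to 9042 unless set explicitly.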

MapReduce returns error when accessing datanode on master machine

I set up a Hadoop 2.4.0 cluster with three machines. One master machine is deployed with namenode, resource manager, datanode and node manager. The other two worker machines are deployed with datanode and node manager. When I run a Hive query, the job fails and the error is:
2014-06-11 13:40:13,364 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From master/127.0.0.1 to
master:43607 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
at org.apache.hadoop.ipc.Client.call(Client.java:1381)
... 4 more
If I disable the datanode on the master machine, everything works well. I'm wondering whether it's allowed to deploy a datanode on the master machine. Thank you kindly for your help in advance.
BTW, my /etc/hosts files on the three machines are identical:
127.0.0.1 localhost
10.1.154.231 master
10.1.153.220 slave1
10.1.153.133 slave2
Please set up passwordless ssh on your master to itself.
You can achieve this by:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
Make sure the permissions are correct:
chmod 0600 ~/.ssh/authorized_keys2
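If the key pair does not exist yet, a minimal sketch of the full setup (assuming an RSA key in the default location):
# generate a key pair without a passphrase if one does not exist yet
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# authorize it for the local account and fix permissions
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
chmod 0600 ~/.ssh/authorized_keys2
# verify that the master can ssh to itself without a password prompt
ssh localhost hostname
ssh master hostname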
In this case you may check whether the namenode started correctly on the master by looking at the logs in yourhadoopfolder/logs/hadoop-[hadoop-user]-namenode-master.log.
This is often caused by HDFS not having been formatted. Run
hadoop namenode -format
Of course, you will then need to load your data into the cluster again.
