HBase master stops with "Connection Refused" error - hadoop

This is happening in pseudo-distributed as well as distributed mode.
When I try to start HBase, all three services - master, regionserver and quorumpeer - start initially. However, within a minute the master stops. In the logs, this is the trace:
2013-05-06 20:10:25,525 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 0 time(s).
2013-05-06 20:10:26,528 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 1 time(s).
2013-05-06 20:10:27,530 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 2 time(s).
2013-05-06 20:10:28,533 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 3 time(s).
2013-05-06 20:10:29,535 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 4 time(s).
2013-05-06 20:10:30,538 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 5 time(s).
2013-05-06 20:10:31,540 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 6 time(s).
2013-05-06 20:10:32,543 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 7 time(s).
2013-05-06 20:10:33,544 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 8 time(s).
2013-05-06 20:10:34,547 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 9 time(s).
2013-05-06 20:10:34,550 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call to <master/master_ip>:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
at org.apache.hadoop.ipc.Client.call(Client.java:1155)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at $Proxy9.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:132)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:259)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:220)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1611)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:68)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1645)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1627)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:363)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:86)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:368)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:301)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
at org.apache.hadoop.ipc.Client.call(Client.java:1121)
... 18 more
Steps I have taken to fix this, without any success:
- downgraded from distributed mode to pseudo-distributed mode. Same issue.
- tried standalone mode - no luck.
- used the same user (hadoop) for both Hadoop and HBase, and set up passwordless SSH for hadoop. Same problem.
- edited the /etc/hosts file and changed localhost/servername as well as 127.0.0.1 to the actual IP address, referencing SO and other sources. Still the same issue.
- rebooted the server
Here are the conf files.
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://<master>:9000/hbase</value>
<description>The directory shared by regionservers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value><master></value>
</property>
<property>
<name>hbase.master</name>
<value><master>:60000</value>
<description>The host and port that the HBase master runs at.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>The replication count for HLog and HFile storage. Should not be greater than HDFS datanode count.</description>
</property>
</configuration>
/etc/hosts file
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
What am I doing wrong here?
Hadoop Version - Hadoop 0.20.2-cdh3u5
HBase Version - Version 0.90.6-cdh3u5

Looking at your configuration file, I assume that you are using the actual hostname in your config files. If that is the case, add the hostname along with the IP of the machine to the /etc/hosts file. Also make sure it matches the hostname in your Hadoop core-site.xml. Proper name resolution is vital for HBase to function properly.
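As a sketch of what to verify (the hostname and IP below are placeholders, not taken from the question):
# The machine's hostname should resolve to its real IP, not to 127.0.0.1.
$ hostname -f
hbase-master.example.com
$ getent hosts hbase-master.example.com
192.168.0.10    hbase-master.example.com
# The same hostname should appear in both fs.default.name and hbase.rootdir:
$ grep -A1 fs.default.name $HADOOP_HOME/conf/core-site.xml
$ grep -A1 hbase.rootdir $HBASE_HOME/conf/hbase-site.xml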
If you still face a problem, please follow the steps mentioned here carefully. I have tried to explain the procedure in detail, and hopefully you'll be able to make it run if you follow all the steps.
HTH

I believe you're trying to use pseudo-distributed mode. I was getting the same error until I fixed three things:
1. The local /etc/hosts file:
$ cat /etc/hosts
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
172.20.x.x my.hostname.com
2. Instead of pointing to the hostname, point to localhost in hbase-env.sh.
3. Correct the classpath:
A. Ensure Hadoop is on the classpath (via hbase-env.sh):
export JAVA_HOME=/your/path/to/java/home
export HADOOP_HOME=/your/path/to/hadoop/home
export HBASE_HOME=/your/path/to/hbase/home
export HBASE_CLASSPATH=$HBASE_HOME/conf:$HADOOP_HOME/conf
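A quick way to confirm those exports actually took effect (a sketch, assuming the hbase launcher is on your PATH):
# Both conf directories should show up when HBase prints its classpath.
$ hbase classpath | tr ':' '\n' | grep '/conf'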
B. When running my program, I edited the following bash script from HBase: The Definitive Guide (bin/run.sh)
$ grep -v '#' bin/run.sh
bin=`dirname "$0"`
bin=`cd "$bin">/dev/null; pwd`
echo "usage: $(basename $0) <example-name>"
exit 1;
fi
MVN="mvn"
if [ "$MAVEN_HOME" != "" ]; then
MVN=${MAVEN_HOME}/bin/mvn
fi
CLASSPATH="${HBASE_CONF_DIR}"
if [ -d "${bin}/../target/classes" ]; then
CLASSPATH=${CLASSPATH}:${bin}/../target/classes
fi
cpfile="${bin}/../target/cached_classpath.txt"
if [ ! -f "${cpfile}" ]; then
${MVN} -f "${bin}/../pom.xml" dependency:build-classpath -Dmdep.outputFile="${cpfile}" &> /dev/null
fi
CLASSPATH=`hbase classpath`:${CLASSPATH}:`cat "${cpfile}"`
JAVA_HOME=/your/path/to/java/home
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx512m
echo "Classpath is $CLASSPATH"
"$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" "$#"
It's worth noting that I am using a Mac. I believe these instructions will work for Linux too.

Related

Hadoop datanode -> namenode communication issue

I have a Vagrant machine running a local Hadoop installation. Hadoop was working fine until today. Today Vagrant's insecure SSH key stopped working so I had to replace it. Now Hadoop is not working. In the logs I see:
17/09/18 09:35:41 INFO ipc.Client: Retrying connect to server: mymachine/192.168.33.10:8020. Already tried 0 time(s); maxRetries=45
17/09/18 09:36:01 INFO ipc.Client: Retrying connect to server: mymachine/192.168.33.10:8020. Already tried 1 time(s); maxRetries=45
17/09/18 09:36:21 INFO ipc.Client: Retrying connect to server: mymachine/192.168.33.10:8020. Already tried 2 time(s); maxRetries=45
17/09/18 09:36:41 INFO ipc.Client: Retrying connect to server: mymachine/192.168.33.10:8020. Already tried 3 time(s); maxRetries=45
The claim here is that it's a datanode -> namenode communication issue. core-site.xml contains:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mymachine:8020</value>
</property>
</configuration>
This is correct. Running getent hosts mymachine yields 192.168.33.10, which means the hostname resolves correctly. I tried sudo netstat -antp | grep 8020 and got:
tcp 0 1 10.0.2.15:42002 192.168.33.10:8020 SYN_SENT 2630/java
tcp 0 1 10.0.2.15:42004 192.168.33.10:8020 SYN_SENT 2772/java
tcp 0 1 10.0.2.15:41998 192.168.33.10:8020 SYN_SENT 3312/java
So it appears that the port is also ok. However, when I do curl http://mymachine:8020 I get no reply. I checked on an identical machine, and the correct reply should be "It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon."
Any ideas?
A few things to check, in my opinion:
1. Check if you can ssh to localhost without a password.
2. Check the authority (user permissions) you start Hadoop with.
3. It should be 127.0.0.1:8020 if you are running a local Hadoop on your machine, because Hadoop may keep running correctly even while the network is disconnected... (A quick way to check points 1 and 3 is sketched below.)
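A sketch with plain diagnostic commands (not from the original answer):
# 1. Passwordless ssh to localhost should succeed without prompting.
$ ssh -o BatchMode=yes localhost 'echo ssh ok'
# 3. Check which address the NameNode is actually listening on for port 8020.
#    If it is 127.0.0.1:8020 or the NAT address instead of 192.168.33.10:8020,
#    clients resolving "mymachine" will get connection refused.
$ sudo netstat -tlnp | grep 8020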

hadoop "ipc.Client: Retrying connect to server" error

There are a lot of ideas on how to resolve this Hadoop error:
15/04/17 10:59:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s).
However, I tried them all and still see the error! Here are my configurations.
1) core-site.xml
$ cat ../../apache/hadoop-1.0.2/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
2) mapred-site.xml
$ cat ../../apache/hadoop-1.0.2/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
</property>
</configuration>
3) iptables for the port
# cat /etc/sysconfig/iptables
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 5901 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 2049 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 54310 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 54311 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
# /etc/init.d/iptables restart
iptables: Flushing firewall rules: [ OK ]
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]
$ netstat -an | grep 54310
$ netstat -an | grep 54311
tcp 0 0 ::ffff:127.0.0.1:54311 :::* LISTEN
tcp 238 0 ::ffff:127.0.0.1:54311 ::ffff:127.0.0.1:44216 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:44216 ::ffff:127.0.0.1:54311 ESTABLISHED
4) starting hadoop
$ $HADOOP_HOME/bin/start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/mahmood/bigdatabench/apache/hadoop-1.0.2/libexec/../logs/hadoop-mahmood-namenode-tiger.out
mahmood@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting datanode, logging to /home/mahmood/bigdatabench/apache/hadoop-1.0.2/libexec/../logs/hadoop-mahmood-datanode-tiger.out
mahmood@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: secondarynamenode running as process 7583. Stop it first.
jobtracker running as process 7792. Stop it first.
mahmood@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: tasktracker running as process 8019. Stop it first.
5) check java processes
$ jps
10292 Jps
8019 TaskTracker
7792 JobTracker
7583 SecondaryNameNode
6) Still I get that error
$ hadoop fs -ls hdfs://localhost:54310/
Warning: $HADOOP_HOME is deprecated.
15/04/17 10:59:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s).
15/04/17 10:59:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
15/04/17 10:59:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
15/04/17 11:00:00 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
15/04/17 11:00:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
15/04/17 11:00:02 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
15/04/17 11:00:03 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s).
15/04/17 11:00:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s).
15/04/17 11:00:05 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s).
15/04/17 11:00:06 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
UPDATE:
While I thought that I had formatted the filesystem, it turns out that the hdfs format command had aborted and I didn't notice. The reason was that I answered the prompt "Re-format filesystem in /home/mahmood/bigdatabench/apache/hadoop-1.0.2/folders/name ? (Y or N)" with y. However, the correct answer is to press Y (capital letter!).
So the correct steps are
1- stop-all.sh
2- hadoop namenode -format
3- start-all.sh
The ipc error is gone :)
How can I fix that?
I think you are running this command from a client machine. In your config files you need to give the IP address/hostname instead of localhost to avoid this type of problem. Put hostnames in your config files and try again.
This error has been handled now.
Please check the list of configurations below, which should be applied on the master and the datanodes.
Master:
1) Disable IPv6.
2) Disable the firewall.
3) Use the physical IP in hdfs-site.xml and the mapred XML file instead of localhost.
4) In the /etc/hosts file, comment out all entries except the master's and datanodes' physical IP addresses. All entries starting with 127.* and all IPv6 entries should be commented out (an illustrative /etc/hosts is sketched below).
Datanode:
1) In the /etc/hosts file, comment out all entries except the master's and datanodes' physical IP addresses. All entries starting with 127.* and all IPv6 entries should be commented out.
2) Use the physical IP in hdfs-site.xml and the mapred XML file instead of localhost.
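For illustration only (the addresses below are invented, not taken from the answer), the resulting /etc/hosts on both the master and the datanodes might then look like:
# 127.0.0.1   localhost       <- commented out, as described above
# ::1         localhost       <- IPv6 entry commented out
192.168.1.10    master
192.168.1.11    datanode1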

hadoop - Connection refused on namenode

I've searched the web and Stack Overflow for a long time, but it was not useful.
I have installed Hadoop YARN 2.2.0 in a 2-node cluster setup, but something goes wrong.
When I start the Hadoop daemons using start-dfs.sh and start-yarn.sh on the master node, they successfully run on the master and the slave (my master's hostname is RM and my slave's hostname is slv). They can ssh to each other successfully. But when I want to run a job, this error appears:
14/01/02 04:22:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/02 04:22:56 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/QuasiMonteCarlo_1388665371850_813553673/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
and in the datanode log I see:
2014-01-02 04:40:31,616 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: RM/192.168.1.101:9000
2014-01-02 04:40:37,618 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 0 time(s)$
2014-01-02 04:40:38,619 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 1 time(s)$
2014-01-02 04:40:39,620 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 2 time(s)$
2014-01-02 04:40:40,621 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 3 time(s)
I checked port 9000 on the master node and the output is this:
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 10227/java
I guess the problem is caused by the fact that on the slave node, when I run
telnet RM 9000
it says
Trying 192.168.1.101...
telnet: Unable to connect to remote host: Connection refused
however
telnet RM
the output is :
Trying 192.168.1.101...
Connected to RM.
Escape character is '^]'.
Ubuntu 12.04.2 LTS
RM login:
For additional information, my /etc/hosts on the master and the slave is as below:
127.0.0.1 RM|slv localhost
192.168.1.101 RM
192.168.1.103 slv
Can anybody suggest a solution?
Any help is really appreciated.
Thanks.
I think the problem is that your master is listening on 127.0.0.1:9000, so the datanode can't connect because the master is not listening at 192.168.1.101:9000 (theoretically, a good place to listen is 0.0.0.0:9000 since it avoids these problems, but it seems this configuration is not accepted).
Maybe it will be fixed by modifying your /etc/hosts and deleting the first line, or first just try with:
127.0.0.1 localhost
192.168.1.101 RM
192.168.1.103 slv
-- edit: read the comments below
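If you try that, a quick way to confirm it worked (a sketch reusing the hostnames from the question):
# On the master (RM), after restarting HDFS, the NameNode should bind to the routable address:
$ sudo netstat -tlnp | grep 9000
# expect something like: tcp 0 0 192.168.1.101:9000 0.0.0.0:* LISTEN <pid>/java
# From the slave (slv), the RPC port should now accept connections:
$ telnet RM 9000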
I had the same problem. I changed
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
in core-site.xml to
<property>
<name>fs.defaultFS</name>
<value>hdfs://ip-address:8020</value>
</property>
and it worked
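After a change like that, it may help to confirm which address clients will actually use, then restart the NameNode (a sketch; hdfs getconf is available in Hadoop 2.x):
# Print the effective fs.defaultFS as the client sees it.
$ hdfs getconf -confKey fs.defaultFS
# should print hdfs://<your-ip>:8020 after the edit and a NameNode restart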
I encountered the same issue. After running jps, we could see that the namenode and datanodes were all up, but could not see any active node on the web page. Then I found I had put 127.0.0.1 master in /etc/hosts. After removing it, the slaves could telnet master 9000.
My /etc/hosts looks like:
127.0.0.1 localhost
192.168.139.129 slave1
192.168.139.130 slave2
192.168.139.128 master

Problems in setting hadoop on mac os x 10.8

I have set some things up on my Mac to install Hadoop, but there is an error message like this:
13/02/18 04:05:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
13/02/18 04:05:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
13/02/18 04:05:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
13/02/18 04:05:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
13/02/18 04:05:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
13/02/18 04:05:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
13/02/18 04:05:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
13/02/18 04:05:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
13/02/18 04:06:00 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
13/02/18 04:06:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
java.lang.RuntimeException: java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:265)
at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
at org.apache.hadoop.ipc.Client.call(Client.java:1075)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
... 17 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
... 31 more
Then I enter jps to check the services; the result is:
20635 Jps
20466 TaskTracker
20189 DataNode
20291 SecondaryNameNode
I don't know how to deal with this error. Could someone give me an answer?
Thanks a lot!
Actually, none of the processes that Hadoop should run are running, because of an IP misconfiguration. I am not familiar with Mac OS, but on Linux and Windows we need to put Hadoop's connection entries in the hosts file (/etc/hosts), and I am quite sure the same applies to the Mac. The point is that you need to put your Hadoop entry in that file against the actual IP of your machine, not the loopback address. For example:
hadoop-machine 127.0.0.1 --> (placing the loopback IP here is wrong, because Hadoop will try to connect to this IP).
Remove this 127.0.0.1 and place the actual IP of your machine in front of this entry. You can find the IP of your Mac easily. Here are some questions which are not directly related to Hadoop, but I guess they will be helpful for you: Question 1, Question 2, Question 3
This might help a bit, but first you have to remove the earlier installation using the command
rm -rf /usr/local/Cellar /usr/local/.git && brew cleanup
Then you may begin with a fresh installation of Hadoop on your Mac.
Step 1: Install Homebrew
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Step 2: Install Hadoop
$ brew install hadoop
Assume that brew installs Hadoop 1.2.1
Step 3: Configure Hadoop
$ cd /usr/local/Cellar/hadoop/1.2.1/libexec
Add the following line to conf/hadoop-env.sh:
export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
Add the following lines to conf/core-site.xml inside the configuration tags:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
Add the following lines to conf/hdfs-site.xml inside the configuration tags:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Add the following lines to conf/mapred-site.xml inside the configuration tags:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
Step 4: Enable SSH to localhost
Go to System Preferences > Sharing.
Make sure “Remote Login” is checked.
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Step 5: Format Hadoop filesystem
$ bin/hadoop namenode -format
Step 6: Start Hadoop
$ bin/start-all.sh
Make sure that all Hadoop processes are running:
$ jps
Run a Hadoop example:
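For instance, the pi estimator that ships with Hadoop can serve as a smoke test (the jar path below assumes the Homebrew layout for Hadoop 1.2.1; adjust it to your installation):
$ hadoop jar /usr/local/Cellar/hadoop/1.2.1/libexec/hadoop-examples-1.2.1.jar pi 10 100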
One more thing! “brew update” will update the Hadoop binaries to the most recent version (1.2.1 at present).
I had this same error: ConnectException: Connection refused in all the secondarynamenode log files.
But I also found this in the namenode's log file:
2015-10-25 16:35:15,720 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /private/tmp/hadoop-admin/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
I therefore did a
hadoop namenode -format
and the problem went away. Thus, the error message was just a symptom of the namenode not having started successfully.
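If you hit the same retry loop, it is worth grepping the NameNode log directly for the real cause (a sketch; the log directory depends on how you installed Hadoop):
$ grep -A2 'Failed to start namenode' $HADOOP_HOME/logs/hadoop-*-namenode-*.log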

unable to check nodes on hadoop [Connection refused]

If I type http://localhost:50070 or http://localhost:9000 to see the nodes, my browser shows me nothing. I think it can't connect to the server.
I tested my Hadoop with this command:
hadoop jar hadoop-*test*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
but it didn't work either; it just tries to connect to the server. This is the output:
12/06/06 17:25:24 INFO mapred.FileInputFormat: nrFiles = 10
12/06/06 17:25:24 INFO mapred.FileInputFormat: fileSize (MB) = 1000
12/06/06 17:25:24 INFO mapred.FileInputFormat: bufferSize = 1000000
12/06/06 17:25:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
12/06/06 17:25:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
12/06/06 17:25:27 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
12/06/06 17:25:28 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
12/06/06 17:25:29 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
12/06/06 17:25:30 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
12/06/06 17:25:31 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
12/06/06 17:25:32 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
12/06/06 17:25:33 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
12/06/06 17:25:34 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
I changed some files like this:
in conf/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
in conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
in conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
Thanks for your attention. If I run this command:
cat /etc/hosts
I see:
127.0.0.1 localhost
127.0.1.1 ubuntu.ubuntu-domain ubuntu
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
and if i run this one:
ps axww | grep hadoop
I see this result:
2170 pts/0 S+ 0:00 grep --color=auto hadoop
but no effect. Do you have any idea how I can solve my problem?
There are a few things that you need to take care of before starting the Hadoop services.
Check what this returns:
hostname --fqdn
In your case this should be localhost.
Also comment out IPv6 in /etc/hosts.
Did you format the namenode before starting HDFS?
hadoop namenode -format
How did you install Hadoop? The location of the log files depends on that. Usually it is "/var/log/hadoop/" if you have used Cloudera's distribution.
If you are a complete newbie, I suggest installing Hadoop using Cloudera SCM, which is quite easy. I have posted my approach to installing Hadoop with Cloudera's distribution.
Also:
Make sure the DFS location has write permission. It usually sits at /usr/local/hadoop_store/hdfs (see the sketch below).
That is a common reason.
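A sketch of how you might check and fix that (the path follows the answer; the owner 'hduser:hadoop' is only an example, so substitute the user that actually runs your daemons):
# Check ownership and permissions of the DFS directories.
$ ls -ld /usr/local/hadoop_store/hdfs
# If the Hadoop user cannot write there, hand the tree over to it.
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store/hdfs
$ sudo chmod -R 755 /usr/local/hadoop_store/hdfs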
I got the same problem, and this solved it:
The problem lies with the permissions given to the folders.
chmod 755 or greater for the folders
/home/username/hadoop/*
Another possibility is that the namenode is not running.
You can remove the HDFS files:
rm -rf /tmp/hadoop*
Reformat the HDFS
bin/hadoop namenode -format
And restart hadoop services
bin/start-all.sh (Hadoop 1.x)
or
sbin/start-all.sh (Hadoop 2.x)
Also edit your /etc/hosts file and change 127.0.1.1 to 127.0.0.1. Proper DNS resolution is very important for Hadoop, and a bit tricky too. Also add the following property to your core-site.xml file:
<property>
<name>hadoop.tmp.dir</name>
<value>/path_to_temp_directory</value>
</property>
The default location for this property is the /tmp directory, which gets emptied after each system restart, so you lose all your info at each restart. Also add these properties to your hdfs-site.xml file:
<property>
<name>dfs.name.dir</name>
<value>/path_to_name_directory</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/path_to_data_directory</value>
</property>
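Those directories must exist and be writable by the user that runs the daemons before you format the namenode; a minimal sketch, reusing the placeholder paths from the properties above:
# Create the directories named in hadoop.tmp.dir, dfs.name.dir and dfs.data.dir.
$ mkdir -p /path_to_temp_directory /path_to_name_directory /path_to_data_directory
# Make them writable by the user that starts the daemons (here: the current user).
$ sudo chown -R $USER /path_to_temp_directory /path_to_name_directory /path_to_data_directory
# Then re-format HDFS and restart the services.
$ bin/hadoop namenode -format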
I am assuming that this is your first installation of Hadoop.
At the beginning, please check whether your daemons are working. To do that, use (in a terminal):
jps
If only Jps appears, that means all daemons are down. Please check the log files, especially the namenode's. The log folder is probably somewhere around /usr/lib/hadoop/logs.
If you have permission problems, use this guide during the installation.
Good installation guide
I am guessing with these explanations, but these are the most common problems.
Hi. Edit your conf/core-site.xml and change localhost to 0.0.0.0. Use the conf below. That should work.
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
</configuration>
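After restarting the daemons, you can verify that the NameNode is now bound to all interfaces instead of only the loopback (a sketch):
$ netstat -tln | grep 9000
# expect: tcp 0 0 0.0.0.0:9000 0.0.0.0:* LISTEN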
