I've searched web and stackoverflow for a long time but it was not useful.
I have installed hadoop yarn 2.2.0 in 2 node cluster setup. but something goes wrong.
when I start hadoop daemons using start-dfs.sh and start-yarn.sh on master node, they successfully run in master and slave (my master's hostname is RM and my slave's hostname is slv). they can ssh each other successfully. but when I want to run a job, this error appears:
14/01/02 04:22:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/02 04:22:56 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/QuasiMonteCarlo_1388665371850_813553673/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
and in datanode log this log exists:
2014-01-02 04:40:31,616 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: RM/192.168.1.101:9000
2014-01-02 04:40:37,618 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 0 time(s)$
2014-01-02 04:40:38,619 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 1 time(s)$
2014-01-02 04:40:39,620 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 2 time(s)$
2014-01-02 04:40:40,621 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: RM/192.168.1.101:9000. Already tried 3 time(s)
I checked the 9000 port on the master node and the output is this:
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 10227/java
I guess the problem is caused by the reason that in the slave node when I
telnet RM 9000
it says
Trying 192.168.1.101...
telnet: Unable to connect to remote host: Connection refused
however
telnet RM
the output is :
Trying 192.168.1.101...
Connected to RM.
Escape character is '^]'.
Ubuntu 12.04.2 LTS
RM login:
for additional information my /etc/hosts on master and slave is as below:
127.0.0.1 RM|slv localhost
192.168.1.101 RM
192.168.1.103 slv
can anybody suggest me a solution?
any help is really appreciated.
thanks
I think the problem is that your master is listening on 127.0.0.1:9000, so datanode can't connect because it is not listening at 192.168.1.101:9000 (theoretically, a good place to listen is 0.0.0.0:9000 since avoids this problems, but seems this configuration is not accepted).
Maybe will fix modifying your /etc/hosts deleting the first line, or try first just with:
127.0.0.1 localhost
192.168.1.101 RM
192.168.1.103 slv
-- edit: read comments bellow
I had the same problem, I changed
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
in core-site.xml to
<property>
<name>fs.defaultFS</name>
<value>hdfs://ip-address:8020</value>
</property>
and it worked
I encountered the same issue. After ran jps, we can see all the namenode and datanode are running up. but cannot see the active node in the web page. And I found I put 127.0.0.1 master in /etc/hosts. After removed it. the slaves can telnet master 9000.
My /etc/hosts looks like:
127.0.0.1 localhost
192.168.139.129 slave1
192.168.139.130 slave2
192.168.139.128 master
Related
Set up a Hadoop cluster of 3 nodes. One of them got both NameNode and DataNode roles while other two are just DataNodes.
I started all nodes and services but in summary it shows only one of DataNodes's status is live. Status of other nodes are not even showing.
My question is what is the difference between being started and being live? And why other nodes don't have a status at all?
I guess the issue is datanodes can't talk to namenode. So as Azwaw pointed out, I checked /etc/hosts file. It was like this:
127.0.0.1 nnode.domain nnode localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.1.4.212 nnode.domain nnode
192.1.5.124 dnode02.domain dnode02
192.1.5.125 dnode01.domain dnode01
I changed first line to:
127.0.0.1 localhost.localdomain localhost localhost4 localhost4.localdomain4
Now I can establish connection with nnode.domain:50070, however errors at datanode side changed. Here log piece from datanode:
2015-05-15 10:08:21,721 ERROR datanode.DataNode (DataXceiver.java:run(253)) - dnode01.domain:50010:DataXceiver error processing unknown operation src: /127.0.0.1:49000 dst: /127.0.0.1:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:212)
at java.lang.Thread.run(Thread.java:745)
2015-05-15 10:08:23,670 INFO datanode.DataNode (BPServiceActor.java:register(782)) - Block pool BP-2116866246-127.0.0.1-1431441630609 (Datanode Uuid null) service to nnode.domain/192.1.4.212:8020 beginning handshake with NN
2015-05-15 10:08:23,674 ERROR datanode.DataNode (BPServiceActor.java:run(840)) - Initialization failed for Block pool BP-2116866246-127.0.0.1-1431441630609 (Datanode Uuid null) service to nnode.domain/192.1.4.212:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=192.1.4.1, hostname=192.1.4.1): DatanodeRegistration(0.0.0.0, datanodeUuid=7f1be518-1255-4a6a-b31c-22be5dc47673, infoPort=50075, ipcPort=8010, storageInfo=lv=-56;cid=CID-51d1dfd0-9376-44a7-b581-c14eec95fd74;nsid=450599258;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5282)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1082)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
This is odd, there is no host with 192.1.4.1 IP address. Why would datanodes try to connect 192.1.4.1?
Unresolved datanode registration: hostname cannot be resolved (ip=192.1.4.1, hostname=192.1.4.1)
"Datanodes 3/3 started" means 3 datanodes process running
Datanodes status "1 live / 0 Dead / 0 Decommissioning" means your namenode is able to communicate with one node.
It seems to be a network problem (make sure HDFS ports are open on your firewall). The alive Datanode is probably on same machine as your Namenode.
Moving NameNode to same network with DataNodes solved the problem.
DataNodes are in 192.1.5.* network.
NameNode was in 192.1.4.* network.
After moving NameNode to 192.1.5.* did the trick for my case.
I set up a Hadoop 2.4.0 cluster with three machines. One master machine is deployed with namenode, resource manager, datanode and node manager. The other two worker machines are deployed with datanode and node manager. When I run Hive query, the work fails and the error is
2014-06-11 13:40:13,364 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From master/127.0.0.1 to
master:43607 failed on connection exception: java.net.ConnectException: Connection >refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:5>7)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImp>l.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
at org.apache.hadoop.ipc.Client.call(Client.java:1381)
... 4 more
if I disable the datanode on master machine, everything works well. I'm wondering if it's allowed to deployed datanode on the master machine. Thank you for your kindly help in advance.
BTW, my /etc/hosts on the three machines are the same:
127.0.0.1 localhost
10.1.154.231 master
10.1.153.220 slave1
10.1.153.133 slave2
Please set up passwordless ssh on your master to itself.
You can achieve this by
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys2
Make sure the permissions are correct
chmod 0600 ~/.ssh/authorized_keys2
In this case you may check if the namenode is started correctly on the master by checking logs at your yourhadoopfolder/logs/hadoop-[hadoop-user]-namenode-master.log
It is often caused by the hdfs is not formatted before. Run
hadoop namenode -format
Of course you will need to put your data to the cluster again.
This is happening in pseudo-distributed as well as distributed mode.
When I try to start HBase, initially all the 3 services - master, region and quorumpeer start. However within a minute, the master stops. In the logs, this is the trace -
2013-05-06 20:10:25,525 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 0 time(s).
2013-05-06 20:10:26,528 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 1 time(s).
2013-05-06 20:10:27,530 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 2 time(s).
2013-05-06 20:10:28,533 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 3 time(s).
2013-05-06 20:10:29,535 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 4 time(s).
2013-05-06 20:10:30,538 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 5 time(s).
2013-05-06 20:10:31,540 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 6 time(s).
2013-05-06 20:10:32,543 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 7 time(s).
2013-05-06 20:10:33,544 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 8 time(s).
2013-05-06 20:10:34,547 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 9 time(s).
2013-05-06 20:10:34,550 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call to <master/master_ip>:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
at org.apache.hadoop.ipc.Client.call(Client.java:1155)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at $Proxy9.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:132)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:259)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:220)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1611)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:68)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1645)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1627)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:363)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:86)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:368)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:301)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
at org.apache.hadoop.ipc.Client.call(Client.java:1121)
... 18 more
Steps I have taken to fix this without any success
- downgraded from distributed mode to pseudo-distributed mode. Same issue.
- tried standalone mode- no luck
- used same user (hadoop) for both hadoop and hbase. Setup passwordless ssh for hadoop. - same problem.
- edited /etc/hosts file and changed localhost/servername as well as 127.0.0.1 to actual IP address referencing SO and different sources. Still same issue.
- rebooted the server
Here are the conf files.
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://<master>:9000/hbase</value>
<description>The directory shared by regionservers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value><master></value>
</property>
<property>
<name>hbase.master</name>
<value><master>:60000</value>
<description>The host and port that the HBase master runs at.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>The replication count for HLog and HFile storage. Should not be greater than HDFS datanode count.</description>
</property>
</configuration>
/etc/hosts file
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
.
What am I doing wrong here?
Hadoop Version - Hadoop 0.20.2-cdh3u5
HBase Version - Version 0.90.6-cdh3u5
By looking at you configuration file, I assume that you are using the actual hostname in your config files. Add the hostname along with the IP of the machine into the /etc/hosts file if that is the case. Also make sure it matches with the hostname in your Hadoop's core-site.xml. Proper name resolution is vital for a proper HBase functioning.
If you still face any problem please follow the steps mentioned here properly. I have tried to explain the procedure in detail and hopefully you'll be able to make it run if you follow all the steps carefully.
HTH
I believe you're trying to use pseudo-distributed mode. I was getting the same error until I fixed 3 things:
local /etc/hosts file
$ cat /etc/hosts
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
172.20.x.x my.hostname.com
instead of pointing to hostname, point to localhost in hbase-env.sh
Correct my classpath
A. Ensure Hadoop is in classpath (via hbase-env.sh)
export JAVA_HOME=your path to java home
export HADOOP_HOME=your path to hadoop home
export HBASE_HOME=your path to hbase home
export HBASE_CLASSPATH=your path to hbase home/conf:your path to hadoop home/conf
B. When running my program, I edited the following bash script from HBase: The Definitive Guide (bin/run.sh)
$ grep -v # bin/run.sh
bin=`dirname "$0"`
bin=`cd "$bin">/dev/null; pwd`
echo "usage: $(basename $0) <example-name>"
exit 1;
fi
MVN="mvn"
if [ "$MAVEN_HOME" != "" ]; then
MVN=${MAVEN_HOME}/bin/mvn
fi
CLASSPATH="${HBASE_CONF_DIR}"
if [ -d "${bin}/../target/classes" ]; then
CLASSPATH=${CLASSPATH}:${bin}/../target/classes
fi
cpfile="${bin}/../target/cached_classpath.txt"
if [ ! -f "${cpfile}" ]; then
${MVN} -f "${bin}/../pom.xml" dependency:build-classpath -Dmdep.outputFile="${cpfile}" &> /dev/null
fi
CLASSPATH=`hbase classpath`:${CLASSPATH}:`cat "${cpfile}"`
JAVA_HOME=your path to java home
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx512m
echo "Classpath is $CLASSPATH"
"$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" "$#"
It's worth noting I am using teh Mac. I believe these instructions will work for teh Linux too.
I have set something up on my mac for installing hadoop. But there is an error message like this:
13/02/18 04:05:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
13/02/18 04:05:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
13/02/18 04:05:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
13/02/18 04:05:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
13/02/18 04:05:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
13/02/18 04:05:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
13/02/18 04:05:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
13/02/18 04:05:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
13/02/18 04:06:00 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
13/02/18 04:06:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
java.lang.RuntimeException: java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:265)
at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
at org.apache.hadoop.ipc.Client.call(Client.java:1075)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
... 17 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
... 31 more
then I enter jps to check the service, the result is :
20635 Jps
20466 TaskTracker
20189 DataNode
20291 SecondaryNameNode
I don't know how to deal with this error. could someone give me an answer?
Thx a Lot!!!
Actually all processes that (hadoop should run) are not running because of misconfiguration of IP.I am not familiar with Mac OS but on linux and windows we need to put hadoop enteries for connection in hosts files(etc/hosts) and I am damn sure that it should be for Mac.Now the point is You need to put your hadoop entry in that file as a local mnachine like 127.0.0.1Actually you need to put it against actual IP of your machine For example
hadoop-machine 127.0.0.1 -->(placing loop back IP is wrong here because hadoop will try to connect with this IP).
remove this 127.0.0.1 and place the actual IP of your machine infront of this entry.You can find ip of your mac machine eaisly.here are some questions which are not directly related to hadoop but I guess they would be helpful for you.Question 1 , Question 2, Question 3
This might help a bit. But first U gotta remove the earlier installation using the command
rm -rf /usr/local/Cellar /usr/local/.git && brew cleanup
Then U may begin with a fresh installation of Hadoop on U're Mac.
Step 1: Install Homebrew
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Step 2: Install Hadoop
$ brew install hadoop
Assume that brew installs Hadoop 1.2.1
Step 3: Configure Hadoop
$ cd /usr/local/Cellar/hadoop/1.2.1/libexec
Add the following line to conf/hadoop-env.sh:
export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
Add the following lines to conf/core-site.xml inside the configuration tags:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
Add the following lines to conf/hdfs-site.xml inside the configuration tags:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Add the following lines to conf/mapred-site.xml inside the configuration tags:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
Go to System Preferences > Sharing.
Make sure “Remote Login” is checked.
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Step 4: Enable SSH to localhost
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Step 5: Format Hadoop filesystem
$ bin/hadoop namenode -format
Step 6: Start Hadoop
$ bin/start-all.sh
Make sure that all Hadoop processes are running:
$ jps
Run a Hadoop example:
One more thing ! “brew update” will update the hadoop binaries to recent version (1.2.1 at present).
I had this same error: ConnectException: Connection refused in all the secondarynamenode log files.
But I also found this in the namenode's log file:
2015-10-25 16:35:15,720 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /private/tmp/hadoop-admin/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
I therefore did a
hadoop namenode -format
and the problem has gone away. Thus, the error message was just about the fact that the namenode had not started successfully.
I tried installing hadoop in two nodes. Both the nodes are up and running. The namenode runs on Ubuntu 10.10 and Datanode on Fedora 13. While copying the file from local file system to hdfs I encountered the following errors.
The terminal showed:
12/04/12 02:19:15 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.OException: Bad connect ack with firstBadLink as 10.211.87.162:9200
12/04/12 02:19:15 INFO hdfs.DFSClient: Abandoning block blk_-1069539184735421145_1014
The log file in namenode showed:
2012-10-16 16:17:56,723 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.6.2.26:50010, storageID=DS-880164535-10.18.13.10-50010-1349721715148, infoPort=50075, ipcPort=50020):DataXceiver
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:282)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:662)
Datanodes available are indicated as 2. I've disabled the firewall and selinux.
The following changes have also been made in the hdfs-site.xml
dfs.socket.timeout -> 360000
dfs.datanode.socket.write.timeout -> 3600000
dfs.datanode.max.xcievers -> 1048576
Both the nodes run sun-java6-jdk, The datanode contains Openjdk but the path settings have been made for sun java.
Yet the same error persists.
What might be the solution.
That's because your firewall is on.
try
sudo /etc/init.d/iptables stop
If you are on Ubuntu, do
sudo ufw disable
this should solve the issue.
The exception log mentioned tha the failure reason is No route to host.
Try ping 10.6.2.26 to test your network connection.