Datanode is not showing up on hitting jps command - hadoop

I am new to Hadoop. I have set up a multi-node cluster, but when I run the jps command on the master node it shows only the NameNode, not a DataNode. When I open 'Master:50070' in a browser it shows no live nodes, so I am unable to copy data from my local system into HDFS. The copy fails with this error:
hduser@oodles-Latitude-3540:~$ hadoop fs -copyFromLocal /home/oodles/input/test /tmp
15/06/28 16:27:56 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/test._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
After starting the Hadoop cluster with start-dfs.sh, the NameNode started successfully but the DataNode didn't. The DataNode log shows this:
ToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-06-28 04:01:53,496 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Master/192.168.0.126:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-06-28 04:01:54,498 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Master/192.168.0.126:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-06-28 04:01:55,499 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Master/192.168.0.126:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-06-28 04:01:56,500 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Master/192.168.0.126:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
I googled but could not find a solution for this.
When I run jps on the slave node, it shows only the DataNode.
One more thing: when I open 'Master:50070' in the browser and browse the file system,
it shows me this error:
HTTP ERROR 500
Problem accessing /nn_browsedfscontent.jsp. Reason:
Can't browse the DFS since there are no live nodes available to redirect to.
Caused by:
java.io.IOException: Can't browse the DFS since there are no live nodes available to redirect to.
at org.apache.hadoop.hdfs.server.namenode.NamenodeJspHelper.redirectToRandomDataNode(NamenodeJspHelper.java:666)
at org.apache.hadoop.hdfs.server.namenode.nn_005fbrowsedfscontent_jsp._jspService(nn_005fbrowsedfscontent_jsp.java:70)
My Hadoop cluster configuration is as follows:
1) /etc/hosts file on master
2) /etc/hosts file on slave
I have edited the masters and slaves files in the Hadoop configuration folder: I added Master to the masters file and Slave1 to the slaves file.
Can anybody help me solve these problems?
The DataNode logs are shown in two pictures.

Have you configured SSH? Try logging in to the other node over ssh to check the connection.
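The DataNode log in the question shows repeated attempts to reach Master/192.168.0.126:9000, so besides ssh it may be worth checking whether that port accepts TCP connections from the slave at all. A minimal sketch in Python, assuming Python 3 is available on the slave; can_connect is a hypothetical helper name, not part of Hadoop:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False
```

Run on the slave, can_connect("Master", 9000) uses the host and port from the log above. If it returns False while the NameNode is running, likely suspects are the /etc/hosts entries on the master (the NameNode may be bound to a loopback address) or a firewall between the nodes.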

Related

Error when copying the file into HDFS

The Hadoop cluster started normally, and jps shows the DataNodes and TaskTrackers running correctly.
When I copy a file into HDFS, this is the error message I get:
hduser@nn:~$ hadoop fs -put gettysburg.txt /user/hduser/getty/gettysburg.txt
Warning: $HADOOP_HOME is deprecated.
14/08/24 21:12:50 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:51 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:52 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:53 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:54 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:55 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:56 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:57 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:58 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/08/24 21:12:59 INFO ipc.Client: Retrying connect to server: nn/10.10.1.1:54310. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Bad connection to FS. command aborted. exception: Call to nn/10.10.1.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
hduser@nn:~$
I am able to ssh from the NN to the DNs and vice versa, and between the DNs.
I have changed /etc/hosts on the NN and all DNs as below.
#127.0.0.1 localhost loghost localhost.project1.ch-geni-net.emulab.net
#10.10.1.1 NN-Lan NN-0 NN
#10.10.1.2 DN1-Lan DN1-0 DN1
#10.10.1.3 DN2-Lan DN2-0 DN2
#10.10.1.5 DN4-Lan DN4-0 DN4
#10.10.1.4 DN3-Lan DN3-0 DN3
10.10.1.1 nn
10.10.1.2 dn1
10.10.1.3 dn2
10.10.1.4 dn3
10.10.1.5 dn4
My mapred-site.xml looks like this:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://nn:54310</value>
<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHE$
</property>
</configuration>
I configured /usr/local/hadoop/conf/masters:
hduser@nn:/usr/local/hadoop/conf$ vi masters
#localhost
nn
hduser@dn1:~$ jps
9975 DataNode
10186 Jps
10070 TaskTracker
hduser@dn1:~$
hduser@nn:~$ jps
5979 JobTracker
5891 SecondaryNameNode
6159 Jps
hduser@nn:~$
What is the problem?
Check the fs.default.name property in your core-site.xml file. The value should be hdfs://NN:port.
Check the following:
core-site.xml: the HDFS URL it specifies should have the form hdfs://ip:port
Format the NameNode
Check whether safe mode is on
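Note that the jps output from nn in the question lists JobTracker and SecondaryNameNode but no NameNode, which would explain the refused connection on port 54310. A small Python sketch for spotting this kind of gap; missing_daemons and the expected list are illustrative assumptions, and the sample text is copied from the question:

```python
def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class name>"; skip anything else.
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return sorted(set(expected) - running)


NN_JPS = """5979 JobTracker
5891 SecondaryNameNode
6159 Jps"""

print(missing_daemons(NN_JPS, ["NameNode", "SecondaryNameNode", "JobTracker"]))
# prints ['NameNode']
```

If NameNode is reported missing, the NameNode log on that machine is the next place to look for the reason it did not start.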

What if the ResourceManager is down?

In the newest version of Hadoop MapReduce (called 'YARN'), the JobTracker (which existed in previous versions) has been replaced by the ResourceManager (called 'RM') and the ApplicationMaster.
The official document about the YARN architecture does not say how many RMs there are in a cluster, and the architecture diagram it gives shows only one RM per cluster.
So what happens if the only RM goes down? If there are several RMs, how do they work together?
Hope someone can explain it to me.
Thanks.
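There is one active ResourceManager per cluster, not one per rack. Since Hadoop 2.4 you can configure ResourceManager high availability, in which a standby RM takes over if the active one fails.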
If you try to submit a job while the ResourceManager is down, Hadoop will keep trying to connect to the ResourceManager, because it needs it to execute the job.
Here is an example of the logs when the RM is down and you try to submit a job:
14/06/06 09:39:54 INFO ipc.Client: Retrying connect to server: hadoop01.sii.fr/10.6.6.211:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/06/06 09:39:55 INFO ipc.Client: Retrying connect to server: hadoop01.sii.fr/10.6.6.211:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/06/06 09:39:56 INFO ipc.Client: Retrying connect to server: hadoop01.sii.fr/10.6.6.211:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Once the RM is back, the job is submitted correctly.
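The retry policy named in these log lines, RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS), describes a client that retries a limited number of times with a constant pause between attempts. The Python sketch below only illustrates that behaviour; it is not Hadoop's implementation, and retry_fixed_sleep and its parameters are hypothetical names:

```python
import time


def retry_fixed_sleep(action, max_retries=10, sleep_seconds=1.0, sleep=time.sleep):
    """Call action() until it succeeds or max_retries retries are used up."""
    attempt = 0
    while True:
        try:
            return action()
        except ConnectionError:
            if attempt >= max_retries:
                # Retries exhausted: surface the last failure to the caller.
                raise
            attempt += 1
            # Fixed pause, the same length before every retry.
            sleep(sleep_seconds)
```

In this sketch the action runs once and is then retried at most max_retries times, so a server that comes back within the retry window is picked up without the caller noticing.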

Hadoop-1.2.1 in Solaris 11.1 VM: Call to name-node failed on connection exception

Hi, I am following the guide linked below to install Hadoop in Oracle Solaris Zones on VirtualBox.
Oracle Solaris Zones Hadoop Setup
I was able to follow it successfully up to step 10. When I tried to check the report, I got this error:
hadoop@name-node:~$ hadoop dfsadmin -report
14/05/17 16:45:12 INFO ipc.Client: Retrying connect to server: name-node/192.168.1.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/05/17 16:45:13 INFO ipc.Client: Retrying connect to server: name-node/192.168.1.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
....
14/05/17 16:45:21 INFO ipc.Client: Retrying connect to server: name-node/192.168.1.1:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
report: Call to name-node/192.168.1.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
hadoop@name-node:~$
Can someone kindly suggest a resolution?
Also, netstat shows this:
name-node.8021 . 0 0 128000 0 LISTEN
*.50030 . 0 0 128000 0 LISTEN
How do I configure dfsadmin to use port 8021 instead?
Step by step to configure Hadoop cluster on Oracle Solaris 11.1 using zones --- http://hashprompt.blogspot.com/2014/05/multi-node-hadoop-cluster-on-oracle.html
This is probably too old a question and you might have already solved it, but just in case anyone else is wondering:
In core-site.xml, make the following changes:
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.1.1:8021/</value>
</property>
This configures the NameNode server port.

Hadoop - Pseudo-Distributed Operation

I am trying to copy a file, quangle.txt, from my local system to Hadoop using the command below:
testuser@ubuntu:~/Downloads/hadoop/bin$ ./hadoop fs -copyFromLocal Desktop/quangle.txt hdfs://localhost/testuser/quangle.txt
13/11/28 06:35:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:51 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/11/28 06:35:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
copyFromLocal: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
I tried to ping 127.0.0.1 and got a response. Please advise.
Just add the correct port after localhost in the file path:
hdfs://localhost:9000/testuser/quangle.txt
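The log in the question shows the client trying localhost/127.0.0.1:8020, which suggests that a URI without an explicit port falls back to a default port (8020 here). If the NameNode listens elsewhere, as this answer assumes with 9000, the port has to be spelled out. A short Python sketch of how the two URIs differ; hdfs_authority is a hypothetical helper, not a Hadoop API:

```python
from urllib.parse import urlsplit


def hdfs_authority(uri):
    """Split an hdfs:// URI into (host, port); port is None when omitted."""
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError("not an hdfs URI: " + uri)
    return parts.hostname, parts.port


print(hdfs_authority("hdfs://localhost/testuser/quangle.txt"))
# prints ('localhost', None)
print(hdfs_authority("hdfs://localhost:9000/testuser/quangle.txt"))
# prints ('localhost', 9000)
```

Whichever port is used has to match the one configured for the NameNode in core-site.xml.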
It looks like your NameNode isn't running. Try running the jps command and see whether NameNode is listed among the running services (or you might have to run ps axww | grep NameNode if the NameNode was started by or under a different user).
Does sudo netstat -atnp | grep 8020 yield any results?
If the NameNode is refusing to start, copy your NameNode logs into your original question (or post a new question, after first searching for the error to see whether someone else has had this problem).
Try running jps to see the currently running Java processes.
Are all Hadoop processes running, especially the NameNode?
If yes, you should get this output (with different process ids):
10015 JobTracker
9670 TaskTracker
9485 DataNode
10380 Jps
9574 SecondaryNameNode
9843 NameNode
I think you can use hadoop fs -put ~/Desktop/quangle.txt /testuser. After copying, you can check the result with hadoop fs -ls /testuser.
Create the directories in HDFS first (for example with the command hadoop fs -mkdir testuser) and then try again; it worked for me that way.
Maybe there is something wrong with your settings for pseudo-distributed mode.
It should be configured in this order:
1) Fill in the configuration files: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml.
2) Configure SSH.
3) Format the HDFS filesystem.
4) Start and stop the daemons.

Unable to add a datanode to Hadoop

I got all my settings right and I am able to run Hadoop (1.1.2) on a single node. However, after making the changes to the relevant files (/etc/hosts, *-site.xml), I am not able to add a DataNode to the cluster, and I keep getting the following error on the slave.
Does anybody know how to rectify this?
2013-05-13 15:36:10,135 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-05-13 15:36:11,137 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-05-13 15:36:12,140 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Check the value of fs.default.name in your core-site.xml conf file (on each node in your cluster). This needs to be the network name of the NameNode, and I suspect you have it set to hdfs://localhost:54310.
Failing that, check for any mention of localhost in your Hadoop configuration files on all nodes in your cluster:
grep localhost $HADOOP_HOME/conf/*.xml
Try replacing localhost with the NameNode's IP address or network name.
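The grep above matches any line that mentions localhost, including descriptions and comments. To see only the property values that point at the local machine, a configuration file can be parsed instead. A minimal Python sketch; localhost_properties is a hypothetical helper, and SAMPLE is an invented file that mirrors the value the answer suspects:

```python
import xml.etree.ElementTree as ET


def localhost_properties(xml_text):
    """Return (name, value) pairs whose value mentions localhost or 127.0.0.1."""
    root = ET.fromstring(xml_text)
    hits = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if "localhost" in value or "127.0.0.1" in value:
            hits.append((name, value))
    return hits


# Hypothetical core-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
</configuration>"""

print(localhost_properties(SAMPLE))
# prints [('fs.default.name', 'hdfs://localhost:54310')]
```

Any property reported this way on a slave is a candidate for replacement with the NameNode's network name or IP address.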

Resources