access file in datanode from namenode hadoop - hadoop

I have configured the namenode on hadoopmaster and the datanode on hadoopslave. When I run start-dfs.sh on hadoopmaster, the namenode on hadoopmaster and the datanode on hadoopslave both start. But when I execute the command hdfs dfs -ls / on hadoopmaster, I can't see the files that I put into HDFS from hadoopslave.
Note: I put a file from hadoopslave using hdfs dfs -put /sample.txt /
Correct me if I'm wrong!
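A quick sanity check (not from the original post): both machines only see the same files if they point at the same NameNode URI. Assuming a standard Hadoop 2.x layout under $HADOOP_HOME (adjust the path for your install), the following should print an identical value on hadoopmaster and hadoopslave:
# Run on both hadoopmaster and hadoopslave; fs.defaultFS (or fs.default.name) must match,
# otherwise the two machines are putting to and listing different filesystems.
grep -A 1 'fs.default' "$HADOOP_HOME/etc/hadoop/core-site.xml"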

Related

What is the preferred solution for corrupted namenode metadata

We have an HDP cluster, version 2.6.5.
The cluster includes two managed NameNodes (one active and one standby)
and 65 datanode machines.
We have a problem with the standby NameNode: it does not start, and in the NameNode logs we can see the following:
2021-01-01 15:19:43,269 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:693)
From Ambari we can see that the standby is down.
For now the active NameNode is up but the standby NameNode is down, and the root cause of this issue is that the NameNode metadata is damaged/corrupted.
So we have two solutions - A or B:
A)
Run the following recovery command on the standby NameNode:
su
hadoop namenode -recover
B)
1. Put the Active NN in safemode:
su hdfs
hdfs dfsadmin -safemode enter
2. Do a saveNamespace operation on the Active NN:
su hdfs
hdfs dfsadmin -saveNamespace
3. Leave safemode:
su hdfs
hdfs dfsadmin -safemode leave
4. Log in to the Standby NN.
5. Run the command below on the Standby NameNode to pick up the latest fsimage saved in the steps above:
su hdfs
hdfs namenode -bootstrapStandby -force
What is the preferred solution for our problem?
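Whichever option is chosen, it can help to first confirm the edit-log gap on disk. A minimal sketch, run on the standby NameNode, assuming dfs.namenode.name.dir is /hadoop/hdfs/namenode (a hypothetical path; substitute your own):
# NN_DIR is a placeholder for your dfs.namenode.name.dir plus /current
NN_DIR=/hadoop/hdfs/namenode/current
cat "$NN_DIR/seen_txid"                      # last transaction ID this NameNode expects to reach
ls "$NN_DIR" | grep '^edits_' | tail -n 5    # edit-log segments actually present on disk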

Connection refused even when all daemons are working on hadoop

I am working on Hadoop 2.x. When I run jps, it shows all the daemons running correctly.
[root@localhost Flume]# jps
3521 NodeManager
3058 DataNode
3252 SecondaryNameNode
4501 Jps
3419 ResourceManager
2957 NameNode
But when I run,
hadoop dfs -ls /
It says,
ls: Call From localhost.localdomain/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Please help me with it.
The command you gave, "hadoop dfs -ls /", is not correct. Change dfs to fs. This command can be specified in two ways:
hadoop fs -ls / /* this is the old style and is deprecated */
hdfs dfs -ls / /* this is the new style */
Note the difference between the two commands: the second part of the first command should be fs, not dfs.
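If the corrected command still fails with Connection refused, a quick extra check (not part of the original answer) is to confirm that the address the client dials actually has a NameNode listening on it:
hdfs getconf -confKey fs.defaultFS      # the URI the client will connect to
netstat -tlnp | grep 9000               # is anything listening on the NameNode port (9000 here)?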

Hadoop Multinode cluster. Data node not working properly

I'm deploying Hadoop as a multi-node cluster (distributed mode), but each datanode has a different cluster ID.
On slave1,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-2ecca585-6672-476e-9931-4cfef9946c3b
On slave2,
java.io.IOException: Incompatible clusterIDs in /home/pushuser1/hadoop/tmp/dfs/data: namenode clusterID = CID-c72a7d30-ec64-4e4f-9a80-e6f9b6b1d78c; datanode clusterID = CID-e24b0548-2d8d-4aa4-9b8c-a336193c006e
I followed this link as well: Datanode not starts correctly, but I don't know which cluster ID I should pick. If I pick either one, the datanode starts on that machine but not on the other. Also, when I format the namenode using the basic command (hadoop namenode -format), the datanodes on each slave node start, but then the namenode on the master machine doesn't start.
The ClusterIDs of the datanodes and the namenode should match; only then can the datanodes communicate with the namenode. If you format the namenode, a new ClusterID is assigned to the namenode, and the ClusterIDs stored on the datanodes will no longer match.
You can locate a VERSION file in /home/pushuser1/hadoop/tmp/dfs/data/current/ (the datanode directory) as well as in the namenode directory (/home/pushuser1/hadoop/tmp/dfs/name/current/, based on the value you specified for dfs.namenode.name.dir); it contains the ClusterID.
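To see the mismatch directly, the ClusterID can be read straight out of those VERSION files (paths taken from the errors above):
grep clusterID /home/pushuser1/hadoop/tmp/dfs/name/current/VERSION    # on the namenode
grep clusterID /home/pushuser1/hadoop/tmp/dfs/data/current/VERSION    # on each datanode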
If you are ready to format your HDFS namenode, stop all HDFS services and clear out all files inside the following directories:
rm -rf /home/pushuser1/hadoop/tmp/dfs/data/* (Need to execute on all data nodes)
rm -rf /home/pushuser1/hadoop/tmp/dfs/name/*
and format HDFS again (hadoop namenode -format).

Hadoop -mkdir: Could not create the Java Virtual Machine

I have configured Hadoop 1.0.4 and started the following without any issue:
1. $ start-dfs.sh : works fine
2. $ start-mapred.sh : works fine
3. $ jps (output is below)
Output:
rahul@rahul-Inspiron-N4010:/usr/local/hadoop-1.0.4/bin$ jps
6964 DataNode
7147 SecondaryNameNode
6808 NameNode
7836 Jps
7254 JobTracker
7418 TaskTracker
But I am facing an issue while issuing the command
rahul@rahul-Inspiron-N4010:/usr/local/hadoop-1.0.4/bin$ hadoop -mkdir /user
I get the following error:
Unrecognized option: -mkdir
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
I applied the patch HDFS-1943.patch, but it was not useful.
Should be: hdfs dfs -mkdir /user
Consult the documentation at http://hadoop.apache.org/docs/r1.0.4/file_system_shell.html
The fs option is missing:
hadoop fs -mkdir /user
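A quick way to confirm the fix (not from the original answers) is to list the root afterwards:
hadoop fs -mkdir /user
hadoop fs -ls /        # /user should now appear in the listing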

hadoop file system lists my own root directory

I ran into a very weird situation when trying to install single-node Hadoop YARN 2.2.0 on my Mac. I followed the tutorial at this link: http://raseshmori.wordpress.com/2012/09/23/install-hadoop-2-0-1-yarn-nextgen/.
When I start Hadoop and run jps to check the status, it shows the following (which I think means everything is normal):
5552 Jps
7162 ResourceManager
7512 Jps
7243 NodeManager
6962 DataNode
7060 SecondaryNameNode
6881 NameNode
However, after entering
hadoop fs -ls /
the files listed are the files in my own root directory, not the Hadoop file system root. There must be some error in my Hadoop setup that mixes my own filesystem with HDFS. Could anyone give me a hint about it?
Use the following command for accessing HDFS
hadoop fs -ls hdfs://localhost:9000/
Or
Populate ${HADOOP_CONF_DIR}/core-site.xml as follows. If you do so, you will be able to access HDFS even without specifying the hdfs:// URI.
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Add the following line at the beginning of the file $HOME/yarn/hadoop-2.0.1-alpha/libexec/hadoop-config.sh
export HADOOP_CONF_DIR=$HOME/yarn/hadoop-2.0.1-alpha/etc/hadoop
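After restarting the daemons, a quick check (not part of the original answer) confirms that the client now resolves HDFS rather than the local filesystem:
hdfs getconf -confKey fs.default.name   # should print hdfs://localhost:9000
hadoop fs -ls /                         # should now list the HDFS root, not your local /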
