Namenode keeps shutting down after start-dfs.sh - hadoop

The NameNode for a fully distributed Hadoop cluster on Ubuntu will not stay up. It starts and then shuts down with the error below. I tried a few things, but nothing works. The NameNode log is below. Any help is appreciated.
Directory /usr/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
2019-03-25 01:34:44,354 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at
I have already tried reformatting the NameNode. jps shows the other daemons running, but no NameNode:
7952 SecondaryNameNode
7714 DataNode
23346 NodeManager
10555 Jps
23167 ResourceManager

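The error says the storage directory /usr/hdfs/namenode does not exist or is not accessible, so recheck the paths and environment set in ~/.bashrc, core-site.xml, and hdfs-site.xml.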
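Since the log complains that the storage directory "does not exist or is not accessible", it may help to test the directory directly before touching the configuration. The following is only a sketch: check_storage_dir is a hypothetical helper (not part of Hadoop), and it assumes you run it as the same user that starts the NameNode.

```shell
# Report why a NameNode storage directory might be rejected at startup.
# Returns 0 only if the path exists, is a directory, and is readable,
# writable, and searchable by the user running this check.
check_storage_dir() {
  dir=$1
  if [ ! -e "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir"
    return 1
  fi
  if [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible to $(id -un): $dir"
    return 1
  fi
  echo "ok: $dir"
}
```

For the path in the question this would be called as check_storage_dir /usr/hdfs/namenode. If it reports a missing or inaccessible directory, creating it or fixing its ownership for the Hadoop user is the likely next step before formatting again.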

Related

Unable to initialize NameNode, DataNode, JobTracker, TaskTracker on CentOS

When I run the command
for service in /etc/init.d/hadoop*
do
sudo $service stop
done
it stops all the services.
When I run
for service in /etc/init.d/hadoop-hdfs-*
do
sudo $service stop
done
it stops all the services.
Sometimes the DataNode starts and sometimes the NameNode,
e.g.:
21270 NameNode
21422 Jps
21374 SecondaryNameNode
2624 HMaster
or
11070 DataNode
11422 Jps
11554 SecondaryNameNode
2554 HMaster
The same thing happens with the JobTracker and TaskTracker.
I tried formatting the NameNode, but it didn't help.
I also tried changing the port for localhost in
core-site.xml from 8020 to 50020,
and in mapred-site.xml from 8021 to 50020.
This time jps shows the NameNode, DataNode, JobTracker, and TaskTracker,
but when I open localhost:50070 and localhost:50030 in the browser,
they refer to 8020 instead of 50020.
Why is this happening?
Please help.
Run the following script from a terminal to stop the running Hadoop daemons:
$HADOOP_INSTALL/hadoop/bin/stop-all.sh
Then run the following script to start the Hadoop daemons:
$HADOOP_INSTALL/hadoop/bin/start-all.sh

Hadoop name node not starting

I am trying to run Hadoop as the root user. I ran the NameNode format command, hadoop namenode -format, and then tried to start the Hadoop daemons, but the NameNode does not start. Running hadoop namenode -importCheckpoint gives the following error:
14/09/15 01:25:55 INFO common.Storage: Storage directory /home/umaima/cloudera_namedir is not formatted.
14/09/15 01:25:55 INFO common.Storage: Formatting ...
14/09/15 01:25:55 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:336)
at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:531)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:375)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:372)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:335)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:467)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1330)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1339)
14/09/15 01:25:55 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at umaima-Lenovo-IdeaPad-S510p/127.0.1.1
************************************************************/
I am stuck on this. Any help is highly appreciated. Thanks in advance.
Before formatting the NameNode, delete the tmp folder (it contains the datanode and namenode directories), and then format the NameNode.
After that, start the Hadoop services.

DataNode does not start correctly

I am trying to install Hadoop 2.2.0 in pseudo-distributed mode. When I try to start the DataNode service, it shows the following error. Can anyone please tell me how to resolve this?
2014-03-11 08:48:15,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:9000 starting to offer service
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627@prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
One approach: copy the DataNode's clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9)
and run the following command from the HADOOP_HOME/bin directory:
./hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
This formats the NameNode with the DataNode's clusterID.
Do the following (note that the rm removes all existing HDFS data under that directory):
bin/stop-all.sh
rm -Rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*
bin/hadoop namenode -format
I had the same problem until I found an answer on this web site.
Whenever you get the error below when trying to start a DN on a slave machine:
java.io.IOException: Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID= ****; datanode clusterID = ****
it is because, after you set up your cluster, you for whatever reason decided to reformat your NN. Your DNs on the slaves still hold a reference to the old NN.
To resolve this, simply delete and recreate the data folder on that machine in the local Linux FS, namely /home/hadoop/dfs/data.
Restarting the DN daemon on that machine will recreate the contents of data/ and resolve the problem.
Follow these simple steps:
Clear Hadoop's data directory.
Format the NameNode again.
Start the cluster.
After this, your cluster will start normally, provided you have no other configuration issue.
The DataNode dies because its clusterID is incompatible with the NameNode's. To fix this problem, delete the directory /tmp/hadoop-[user]/hdfs/data and restart Hadoop.
rm -r /tmp/hadoop-[user]/hdfs/data
I had a similar issue in my pseudo-distributed environment. I stopped the cluster first, then copied the clusterID from the NameNode's VERSION file into the DataNode's VERSION file. After restarting the cluster, everything was fine.
My data paths are /usr/local/hadoop/hadoop_store/hdfs/datanode and /usr/local/hadoop/hadoop_store/hdfs/namenode.
FYI: the VERSION file is under /usr/local/hadoop/hadoop_store/hdfs/datanode/current/; likewise for the NameNode.
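Before editing anything, it can be useful to confirm that the two VERSION files really disagree. The sketch below assumes each VERSION file contains a line of the form clusterID=..., as in the examples elsewhere in this thread; read_cluster_id and same_cluster_id are hypothetical helper names, not Hadoop commands.

```shell
# Print the clusterID recorded in a Hadoop storage VERSION file.
# $1 = path to the VERSION file.
read_cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Print both IDs and succeed only if they are equal and non-empty.
# $1 = NameNode VERSION file, $2 = DataNode VERSION file.
same_cluster_id() {
  nn_id=$(read_cluster_id "$1")
  dn_id=$(read_cluster_id "$2")
  echo "namenode clusterID = $nn_id"
  echo "datanode clusterID = $dn_id"
  [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

With the paths from the answer above, the call would be same_cluster_id /usr/local/hadoop/hadoop_store/hdfs/namenode/current/VERSION /usr/local/hadoop/hadoop_store/hdfs/datanode/current/VERSION. A non-zero exit status means the IDs differ (or one could not be read).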
Here, the DataNode stops immediately because the clusterIDs of the DataNode and NameNode are different. So you have to format the NameNode with the DataNode's clusterID.
Copy the DataNode clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9) and run the following command from your home directory. You can go to your home directory by typing cd in your terminal.
From your home directory, run:
hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Delete the NameNode and DataNode directories specified in core-site.xml.
After that, create new directories and restart DFS and YARN.
I also had a similar issue.
I deleted the namenode and datanode folders from all the nodes and reran:
$HADOOP_HOME/bin> hdfs namenode -format -force
$HADOOP_HOME/sbin> ./start-dfs.sh
$HADOOP_HOME/sbin> ./start-yarn.sh
To check the health report from the command line (which I would recommend):
$HADOOP_HOME/bin> hdfs dfsadmin -report
After that, all the nodes worked correctly.
I had the same issue with Hadoop 2.7.7.
I removed the namenode/current and datanode/current directories on the NameNode and all the DataNodes,
removed the files under /tmp/hadoop-ubuntu/*,
then formatted the NameNode and DataNode
and restarted all the nodes.
Things work fine now.
Steps:
Stop all nodes/managers, then run the steps below.
rm -rf /tmp/hadoop-ubuntu/* (all nodes)
rm -r /usr/local/hadoop/data/hdfs/namenode/current (namenode: check hdfs-site.xml for path)
rm -r /usr/local/hadoop/data/hdfs/datanode/current (datanode:check hdfs-site.xml for path)
hdfs namenode -format (on namenode)
hdfs datanode -format (on namenode)
Reboot the NameNode and DataNodes.
There have been different solutions to this problem, but I tested another easy one and it worked like a charm:
If you get the same error, you just need to replace the clusterID in the DataNode's VERSION file with the clusterID of the NameNode.
In your case, here is what to change on the DataNode side:
namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Back up the current VERSION file: cp /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION.BK
Then open it with vim /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION and change
clusterID=CID-8bf63244-0510-4db6-a949-8f74b50f2be9
to
clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10
Restart the datanode and it should work.
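The backup-and-edit steps above can also be done without opening an editor. This is a rough sketch, not an official procedure: set_cluster_id is a hypothetical helper, it assumes the DataNode is stopped, and it assumes the new ID contains no "/" or "&" characters (true for the CID-... values shown here).

```shell
# Replace the clusterID line in a VERSION file, keeping a VERSION.BK copy.
# $1 = path to the DataNode VERSION file, $2 = clusterID to write.
set_cluster_id() {
  version_file=$1
  new_id=$2
  # Keep the original next to the file, as in the manual steps.
  cp "$version_file" "$version_file.BK" || return 1
  # Rewrite from the backup so all other lines are preserved unchanged.
  sed "s/^clusterID=.*/clusterID=$new_id/" "$version_file.BK" > "$version_file"
}
```

For the case in the question, that would be set_cluster_id /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION CID-fb61aa70-4b15-470e-a1d0-12653e357a10, followed by restarting the DataNode.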

Data-Node Does Not Start

I have trouble starting my Hadoop data-node. I did all the research that I could and none of the methods were helpful in solving my issue. Here's my terminal console output when I try to start it using
hadoop datanode -start
This is what happens:
root@Itanium:~/Desktop/hadoop# hadoop datanode -start
Warning: $HADOOP_HOME is deprecated.
13/09/29 22:11:42 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = Itanium/127.0.1.1
STARTUP_MSG: args = [-start]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_25
************************************************************/
Usage: java DataNode
[-rollback]
13/09/29 22:11:42 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at Itanium/127.0.1.1
************************************************************/
root@Itanium:~/Desktop/hadoop# jps
31438 SecondaryNameNode
32013 Jps
31818 TaskTracker
1146 Bootstrap
31565 JobTracker
30930 NameNode
root@Itanium:~/Desktop/hadoop#
As we can see, the DataNode attempts to start but then shuts down. Until now I had been having trouble with the NameNode starting up, which I used to fix by starting it manually using
start-dfs.sh
Now the problem is with the DataNode. I would really appreciate your help in resolving this issue.
One more generic question: why is Hadoop displaying such inconsistent behavior? I am sure I did not change any of the *-site.xml settings.
Use this command: hadoop datanode -rollback
I had a similar issue as well. Following the comment posted by Anup ("seems to be an issue with namespaceIDs not matching"), I was able to find a reference that showed me how to solve it:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#caveats
I looked at the log files on the slave nodes where the DataNodes did not start. Both had the following exception:
2014-11-05 10:26:14,289 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /scratch/hdfs/data/srinivasand: namenode namespaceID = 1296690356; datanode namespaceID = 1228298945
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Fixing this exception solved the issue.
The fix is to either
a) delete the dfs data directory and reformat using namenode -format, or
b) update the VERSION file so that the two namespace IDs match.
I was able to use option b), and the DataNodes started successfully after that.
The bug report for this issue is recorded at: https://issues.apache.org/jira/browse/HDFS-107
I once had the same issue. It turned out that port 50010 was occupied by another application. Stop that application and restart Hadoop.
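One way to check whether something is already listening on port 50010 is to filter the output of ss -ltn. The helper below is a sketch under assumptions: port_is_listening is a made-up name, and it expects ss -ltn-style input on stdin, where the fourth column is the local address and port.

```shell
# Read "ss -ltn"-style output on stdin and succeed if any socket is
# listening on the given local port. $1 = port number.
port_is_listening() {
  awk -v suffix=":$1" '
    # Column 4 is "Local Address:Port"; compare its tail literally.
    length($4) >= length(suffix) &&
      substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit !found }
  '
}
```

Typical use would be ss -ltn | port_is_listening 50010 && echo "port 50010 is already in use". Running ss -ltnp (usually as root) should additionally show which process holds the port.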

hadoop namenode not getting detected

I am trying to configure a pseudo-distributed, single-node Hadoop cluster on my Ubuntu machine. However, when I run the following command
bin/start-all.sh
it starts all the daemons, but jps does not list the NameNode, and the NameNode log
shows the following message:
2013-04-26 11:59:09,927 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: Incomplete HDFS URI, no host: hdfs://vikasXXX.XX.XX.XX:X000
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:85)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.fs.Trash.<init>(Trash.java:62)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startTrashEmptier(NameNode.java:314)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:310)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2013-04-26 11:59:09,929 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vikas/XXX.XX.XX.XX
************************************************************/
What could be the reason?
Thanks
I assume that you masked the URL before sharing it:
hdfs://vikasXXX.XX.XX.XX:X000
I think it is not recognizing your machine by name. Try using localhost and check whether it works:
hdfs://localhost:8020
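To see exactly which URI the NameNode is being handed, it can help to print the value from core-site.xml. A minimal sketch, assuming Hadoop 1.x, where the property is named fs.default.name (Hadoop 2 and later use fs.defaultFS), and assuming each <name> and <value> element sits on a single line with <name> before <value>. get_property is a hypothetical helper, not a Hadoop tool.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = path to the XML file, $2 = property name.
get_property() {
  awk -v want="$2" '
    # Remember the most recent <name>...</name> seen.
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)
      sub(/[ \t]*<\/name>.*/, "", name)
    }
    # Print the <value> that follows the wanted name, then stop.
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>[ \t]*/, "", value)
      sub(/[ \t]*<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}
```

For example, get_property conf/core-site.xml fs.default.name (the conf/ path is an assumption about your layout) should print something like hdfs://localhost:8020 if the setting suggested above is in place.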

Resources