SafeModeException : Name node is in safe mode - hadoop

I tried copying files from my local disk to HDFS. At first it gave a SafeModeException. While searching for a solution I read that the problem does not appear if one executes the same command again, so I tried again and it didn't give the exception.
hduser@saket:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg/ /user/hduser/gutenberg
copyFromLocal: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/hduser/gutenberg. Name node is in safe mode.
hduser@saket:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg/ /user/hduser/gutenberg
Why is this happening? Should I keep safe mode off by using this command?
hadoop dfs -safemode leave

The NameNode stays in safe mode until the configured percentage of blocks has been reported online by the DataNodes. This threshold is controlled by the dfs.namenode.safemode.threshold-pct parameter in hdfs-site.xml.
For small or development clusters with very few blocks, it makes sense to set this parameter lower than its default value of 0.999f. Otherwise a single missing block can cause the system to hang in safe mode.
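As a sketch, lowering the threshold could look like this in hdfs-site.xml (the value 0.90f is only an illustrative choice; values of 0 or less disable the wait entirely):
<configuration>
  <property>
    <name>dfs.namenode.safemode.threshold-pct</name>
    <value>0.90f</value>
  </property>
</configuration>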

Go to the Hadoop bin directory (on my system /usr/local/hadoop/bin/):
cd /usr/local/hadoop/bin/
Check that there is a file named hadoop:
hadoopuser@arul-PC:/usr/local/hadoop/bin$ ls
The output will be:
hadoop hadoop-daemons.sh start-all.sh start-jobhistoryserver.sh stop-balancer.sh stop-mapred.sh
hadoop-config.sh rcc start-balancer.sh start-mapred.sh stop-dfs.sh task-controller
hadoop-daemon.sh slaves.sh start-dfs.sh stop-all.sh stop-jobhistoryserver.sh
Then turn off safe mode with the command ./hadoop dfsadmin -safemode leave:
hadoopuser@arul-PC:/usr/local/hadoop/bin$ ./hadoop dfsadmin -safemode leave
You will get the response:
Safe mode is OFF
Note: I created the Hadoop user with the name hadoopuser.

Related

Cannot create directory in hdfs NameNode is in safe mode

I upgraded to the latest version of Cloudera. Now I am trying to create a directory in HDFS:
hadoop fs -mkdir data
I am getting the following error:
Cannot Create /user/cloudera/data Name Node is in SafeMode.
How can I do this?
When you start Hadoop, it stays in safe mode for a certain amount of time. You can either wait until that period ends (you can see the remaining time decreasing on the NameNode web UI), or you can turn safe mode off with:
hadoop dfsadmin -safemode leave
The above command turns off Hadoop's safe mode.
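For scripted setups, two other standard dfsadmin sub-commands are handy instead of forcing leave:
hdfs dfsadmin -safemode get    # print whether safe mode is ON or OFF
hdfs dfsadmin -safemode wait   # block until the NameNode leaves safe mode on its own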
In addition to Ramesh Maharjan's answer: by default, the Cloudera machine (Cloudera Quick Start #5.12) doesn't let you turn safe mode off directly; you need to run the command as the hdfs user via the -u option, as shown below:
sudo -u hdfs hdfs dfsadmin -safemode leave
In my case, I was running the hive command to enter the Hive shell immediately after starting Hadoop with start-all.sh. Retrying the hive command after waiting 10-20 seconds worked.
You might need the full path to the hdfs command:
/usr/local/hadoop/bin/hdfs dfsadmin -safemode leave

Can't get past the error: "mkdir: Cannot create directory /user/hadoop. Name node is in safe mode." [duplicate]

root# bin/hadoop fs -mkdir t
mkdir: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/root/t. Name node is in safe mode.
I am not able to create anything in HDFS.
I ran
root# bin/hadoop fs -safemode leave
but it shows
safemode: Unknown command
What is the problem?
Solution: http://unmeshasreeveni.blogspot.com/2014/04/name-node-is-in-safe-mode-how-to-leave.html?m=1
To force the NameNode to leave safe mode, run the following command:
bin/hadoop dfsadmin -safemode leave
You are getting the Unknown command error because -safemode isn't a sub-command of hadoop fs; it belongs to hadoop dfsadmin.
Also, after the above command, I would suggest running hadoop fsck once so that any inconsistencies that crept into HDFS can be sorted out.
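As a sketch of such a check (the / path and the flags below are standard fsck options; point it at the directories you care about):
hdfs fsck / -files -blocks -locations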
Update:
Use the hdfs command instead of the hadoop command on newer distributions; the hadoop command is being deprecated:
hdfs dfsadmin -safemode leave
hadoop dfsadmin has been deprecated, and so has the hadoop fs command; all HDFS-related tasks have been moved to the separate hdfs command.
Try this; it will work:
sudo -u hdfs hdfs dfsadmin -safemode leave
That command did not work for me, but the following did:
hdfs dfsadmin -safemode leave
I used the hdfs command instead of the hadoop command.
Also check out the link http://ask.gopivotal.com/hc/en-us/articles/200933026-HDFS-goes-into-readonly-mode-and-errors-out-with-Name-node-is-in-safe-mode-
Safe mode on means HDFS is in read-only mode; safe mode off means HDFS is readable and writable.
In Hadoop 2.6.0, you can check the status of the NameNode with the commands below.
To check the NameNode safe mode status:
$ hdfs dfsadmin -safemode get
To enter safe mode:
$ hdfs dfsadmin -safemode enter
To leave safe mode:
$ hdfs dfsadmin -safemode leave
If you use Hadoop version 2.6.1 or above, the command works but complains that it is deprecated. I actually could not use hadoop dfsadmin -safemode leave because I was running Hadoop in a Docker container and that command mysteriously fails when run in the container, so this is what I did instead. I checked the documentation and found dfs.safemode.threshold.pct, which says:
Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Values less than or equal to 0 mean not to wait for any particular percentage of blocks before exiting safemode. Values greater than 1 will make safe mode permanent.
So I changed hdfs-site.xml to the following (in older Hadoop versions, apparently you need to do this in hdfs-default.xml):
<configuration>
  <property>
    <name>dfs.safemode.threshold.pct</name>
    <value>0</value>
  </property>
</configuration>
Try this:
sudo -u hdfs hdfs dfsadmin -safemode leave
Check the status of safe mode:
sudo -u hdfs hdfs dfsadmin -safemode get
If it is still in safe mode, one possible reason is that there is not enough space on your node. You can check your node's disk usage with:
df -h
If the root partition is full, delete files or add space to the root partition and retry the first step.
The NameNode enters safe mode when there is a shortage of memory. As a result HDFS becomes read-only, meaning you cannot create any additional directories or files in HDFS. To come out of safe mode, the following command is used:
hadoop dfsadmin -safemode leave
If you are using Cloudera Manager:
go to Actions >> Leave Safemode
But this doesn't always solve the problem. The complete solution lies in freeing up some memory. Use the following command to check your memory usage:
free -m
If you are using Cloudera, you can also check whether HDFS is showing signs of bad health. It is probably showing a memory issue related to the NameNode. Allocate more memory through the options available. I am not sure which commands to use for this if you are not using Cloudera Manager, but there must be a way. Hope it helps! :)
Run the command below using the HDFS OS user to disable safe mode:
sudo -u hdfs hadoop dfsadmin -safemode leave
Use the command below to turn off safe mode:
$> hdfs dfsadmin -safemode leave

NameNode Does Not Start with start-all.sh

The NameNode does not start with start-all.sh after stop-all.sh. If I run hadoop namenode -format and then hadoop-daemon.sh start namenode, everything is OK, but my data in HDFS is lost.
I do not want data loss, so the hadoop namenode -format command is not an acceptable path to a solution. How can I start the NameNode with start-all.sh?
Thanks
First of all, stop-all.sh and start-all.sh are deprecated. Use start-dfs.sh and start-yarn.sh instead of start-all.sh, and likewise for stop-all.sh (the script itself already says so).
Secondly, hadoop namenode -format formats your HDFS and should therefore be used only once, at the time of installation.
Hadoop by default sets the hadoop.tmp.dir property to a directory under /tmp, where the files are deleted after every restart. Set the hadoop.tmp.dir property in $HADOOP_HOME/conf/core-site.xml to some place where the files are not usually deleted. Run hadoop namenode -format (actually it is hdfs namenode -format; the former is also deprecated) one last time and start the daemons.
PS: If you can post the log file or the terminal screenshot of the error, it will be easier to help you.
I had written hadoop.temp.dir:
"temp" should be "tmp", i.e. hadoop.tmp.dir.
The only mistake was the extra "e".
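For reference, a minimal core-site.xml sketch for this property (the path /home/hduser/hadoop_tmp is only an illustrative choice; use any directory that survives reboots and is writable by the Hadoop user):
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/hadoop_tmp</value>
  </property>
</configuration>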

Datanode process not running in Hadoop

I set up and configured a multi-node Hadoop cluster using this tutorial.
When I type in the start-all.sh command, it shows all the processes initializing properly as follows:
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-namenode-jawwadtest1.out
jawwadtest1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest1.out
jawwadtest2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest2.out
jawwadtest1: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-secondarynamenode-jawwadtest1.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-jobtracker-jawwadtest1.out
jawwadtest1: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest1.out
jawwadtest2: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest2.out
However, when I type the jps command, I get the following output:
31057 NameNode
4001 RunJar
6182 RunJar
31328 SecondaryNameNode
31411 JobTracker
32119 Jps
31560 TaskTracker
As you can see, there's no datanode process running. I tried configuring a single-node cluster but got the same problem. Would anyone have any idea what could be going wrong here? Are there any configuration files not mentioned in the tutorial, or that I may have overlooked? I am new to Hadoop and a bit lost; any help would be greatly appreciated.
EDIT:
hadoop-root-datanode-jawwadtest1.log:
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/$
************************************************************/
2012-08-09 23:07:30,717 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loa$
2012-08-09 23:07:30,734 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:30,735 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:30,736 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:31,018 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:31,024 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:32,366 INFO org.apache.hadoop.ipc.Client: Retrying connect to $
2012-08-09 23:07:37,949 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: $
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(Data$
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransition$
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNo$
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java$
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNod$
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode($
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataN$
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.$
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1$
2012-08-09 23:07:37,951 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: S$
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at jawwadtest1/198.101.220.90
************************************************************/
You need to do something like this:
bin/stop-all.sh (or stop-dfs.sh and stop-yarn.sh in the 2.x series)
rm -Rf /app/tmp/hadoop-your-username/*
bin/hadoop namenode -format (or hdfs namenode -format in the 2.x series)
The solution was taken from:
http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/. Basically it consists of restarting from scratch, so make sure you won't lose data by formatting HDFS.
I ran into the same issue. I had created an HDFS folder '/home/username/hdfs' with the sub-directories name, data, and tmp, which were referenced in the config XML files under hadoop/conf.
When I started Hadoop and ran jps, I couldn't find the DataNode, so I tried to start it manually with bin/hadoop datanode. From the error message I then realized that it had a permissions issue accessing dfs.data.dir=/home/username/hdfs/data/, which was referenced in one of the Hadoop config files. All I had to do was stop Hadoop, delete the contents of the /home/username/hdfs/tmp/* directory, run chmod -R 755 /home/username/hdfs/, and start Hadoop again. I could find the DataNode!
I faced a similar issue while running the DataNode. The following steps were useful (a consolidated sketch follows this list).
In the [hadoop_directory]/sbin directory, use ./stop-all.sh to stop all the running services.
Remove the tmp directory using rm -r [hadoop_directory]/tmp (the path configured in [hadoop_directory]/etc/hadoop/core-site.xml).
sudo mkdir [hadoop_directory]/tmp (make a new tmp directory)
Go to the */hadoop_store/hdfs directory where you have created namenode and datanode as sub-directories (the paths configured in [hadoop_directory]/etc/hadoop/hdfs-site.xml). Use
rm -r namenode
rm -r datanode
In */hadoop_store/hdfs directory use
sudo mkdir namenode
sudo mkdir datanode
In case of permission issue, use
chmod -R 755 namenode
chmod -R 755 datanode
In [hadoop_directory]/bin use
hadoop namenode -format (To format your namenode)
In [hadoop_directory]/sbin directory use ./start-all.sh or ./start-dfs.sh to start the services.
Use jps to check the services running.
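A consolidated sketch of the steps above, assuming [hadoop_directory] is /usr/local/hadoop and the HDFS store lives under /usr/local/hadoop_store/hdfs (both paths are assumptions; substitute your own):
/usr/local/hadoop/sbin/stop-all.sh
sudo rm -r /usr/local/hadoop/tmp && sudo mkdir /usr/local/hadoop/tmp
cd /usr/local/hadoop_store/hdfs
sudo rm -r namenode datanode
sudo mkdir namenode datanode
sudo chmod -R 755 namenode datanode
hadoop namenode -format
/usr/local/hadoop/sbin/start-dfs.sh
jps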
Delete the datanode directory under your Hadoop folder, then rerun start-all.sh.
I was having the same problem running a single-node pseudo-distributed instance. I couldn't figure out how to solve it, but a quick workaround is to manually start a DataNode with
hadoop-x.x.x/bin/hadoop datanode
Follow these steps and your DataNode will start again.
Stop dfs.
Open hdfs-site.xml.
Remove the data.dir and name.dir properties from hdfs-site.xml and format the namenode again.
Then remove the hadoopdata directory, add the data.dir and name.dir properties back in hdfs-site.xml, and format the namenode again.
Then start dfs again.
You need to follow 3 steps.
(1) Go to the logs and check the most recent log (in hadoop-2.6.0/logs/hadoop-user-datanode-ubuntu.log).
If the error is something like:
java.io.IOException: Incompatible clusterIDs in /home/kutty/work/hadoop2data/dfs/data: namenode clusterID = CID-c41df580-e197-4db6-a02a-a62b71463089; datanode clusterID = CID-a5f4ba24-3a56-4125-9137-fa77c5bb07b1
i.e. the namenode clusterID and datanode clusterID are not identical.
(2) Copy the namenode clusterID, which is CID-c41df580-e197-4db6-a02a-a62b71463089 in the error above.
(3) Replace the datanode clusterID with the namenode clusterID in hadoopdata/dfs/data/current/VERSION:
clusterID=CID-c41df580-e197-4db6-a02a-a62b71463089
Restart Hadoop. The DataNode will run.
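A command-line sketch of step (3) (the data directory path matches the example above; adjust it to your own dfs.datanode.data.dir, and note the file name is VERSION in upper case):
# show the current datanode clusterID
grep clusterID hadoopdata/dfs/data/current/VERSION
# overwrite it with the namenode clusterID taken from the log message
sed -i 's/^clusterID=.*/clusterID=CID-c41df580-e197-4db6-a02a-a62b71463089/' hadoopdata/dfs/data/current/VERSION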
Stop all the services: ./stop-all.sh
Delete the HDFS tmp directory on all the master and slave nodes. Don't forget the slaves.
Format the namenode (hadoop namenode -format).
Now start the services on the namenode.
./bin/start-all.sh
This is what made the difference for me in getting the DataNode service to start.
Stop the dfs and yarn first.
Remove the datanode and namenode directories as specified in the core-site.xml file.
Re-create the directories.
Then restart dfs and yarn as follows:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
Hope this works fine.
Delete the files under $hadoop_User/dfsdata and $hadoop_User/tmpdata, then run:
hdfs namenode -format
Finally run:
start-all.sh
Your problem should then be solved.
Please check whether the tmp directory property points to a valid directory in core-site.xml:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/data/tmp</value>
</property>
If the directory is misconfigured, the DataNode process will not start properly.
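As a sketch, also make sure the directory exists and is owned by the Hadoop user (the hduser:hadoop user and group here are illustrative assumptions; use whatever account runs your daemons):
sudo mkdir -p /home/hduser/data/tmp
sudo chown -R hduser:hadoop /home/hduser/data/tmp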
Run the commands below in order:
stop-all.sh (stop all the Hadoop processes)
rm -r /usr/local/hadoop/tmp/ (your Hadoop tmp directory, as configured in hadoop/conf/core-site.xml)
sudo mkdir /usr/local/hadoop/tmp (make the same directory again)
hadoop namenode -format (format your namenode)
start-all.sh (start all the Hadoop processes)
jps (show the running processes)
Step 1: stop-all.sh
Step 2: go to this path:
cd /usr/local/hadoop/bin
Step 3: run this command:
hadoop datanode
Now the DataNode works.
Check whether the hadoop.tmp.dir property in the core-site.xml is correctly set.
If you set it, navigate to that directory and remove or empty it.
If you didn't set it, navigate to its default folder /tmp/hadoop-${user.name} and likewise remove or empty it.
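For example, on a default setup this could look like the following (the ${USER} expansion assumes the daemons run as your login user; otherwise substitute that account's name):
ls /tmp/hadoop-${USER}
rm -rf /tmp/hadoop-${USER}/*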
On macOS (pseudo-distributed mode):
Open a terminal.
Stop dfs: sbin/stop-all.sh
cd /tmp
rm -rf hadoop*
Navigate to the Hadoop directory and format HDFS: bin/hdfs namenode -format
sbin/start-dfs.sh
Error in datanode.log file
$ more /usr/local/hadoop/logs/hadoop-hduser-datanode-ubuntu.log
Shows:
java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop_tmp/hdfs/datanode: namenode clusterID = CID-e4c3fed0-c2ce-4d8b-8bf3-c6388689eb82; datanode clusterID = CID-2fcfefc7-c931-4cda-8f89-1a67346a9b7c
Solution: stop your cluster, issue the command below, and then start your cluster again.
sudo rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*
I found details of the issue in the log file, like below:
"Invalid directory in dfs.data.dir: Incorrect permission for /home/hdfs/dnman1, expected: rwxr-xr-x, while actual: rwxrwxr-x"
From there I identified that the datanode directory permission was 777 on my folder. I corrected it to 755 and it started working.
Instead of deleting everything under the "hadoop tmp dir", you can set another one. For example, if your core-site.xml has this property:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/data/tmp</value>
</property>
You can change this to:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/data/tmp2</value>
</property>
Then scp core-site.xml to each node (a sketch of this copy step follows), run "hadoop namenode -format", and restart Hadoop.
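A sketch of distributing the file, assuming the worker hostnames are listed in conf/slaves and passwordless SSH is set up (both are assumptions about your cluster layout, as are the paths):
for host in $(cat /usr/local/hadoop/conf/slaves); do
  scp /usr/local/hadoop/conf/core-site.xml "$host":/usr/local/hadoop/conf/
done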
This is for newer versions of Hadoop (I am running 2.4.0).
In this case, stop the cluster: sbin/stop-all.sh
Then go to /etc/hadoop for the config files.
In the file hdfs-site.xml,
look for the directory paths corresponding to
dfs.namenode.name.dir
dfs.datanode.data.dir
Delete both the directories recursively (rm -r).
Now format the namenode via bin/hadoop namenode -format
And finally sbin/start-all.sh
Hope this helps.
You need to check these two files:
/app/hadoop/tmp/dfs/data/current/VERSION and /app/hadoop/tmp/dfs/name/current/VERSION
Specifically, compare the namespaceID of the name node and the data node in those two files.
Your DataNode will run if and only if the DataNode's namespaceID is the same as the NameNode's namespaceID.
If they differ, copy the NameNode's namespaceID into your DataNode's namespaceID using vi or gedit, save, and rerun the daemons; it will work perfectly.
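A quick way to compare the two values (the /app/hadoop/tmp paths follow the example above; adjust them to your own dfs.name.dir and dfs.data.dir locations):
grep namespaceID /app/hadoop/tmp/dfs/name/current/VERSION
grep namespaceID /app/hadoop/tmp/dfs/data/current/VERSION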
If clearing the tmp directory does not work, then try this:
First stop all the entities like the namenode, datanode, etc. (you will have some script or command to do that).
Clear the tmp directory.
Go to /var/cache/hadoop-hdfs/hdfs/dfs/ and delete all the contents of the directory manually.
Now format your namenode again.
Start all the entities, then use the jps command to confirm that the datanode has started.
Now run whichever application you have
Hope this helps.
I configured hadoop.tmp.dir in conf/core-site.xml
I configured dfs.data.dir in conf/hdfs-site.xml
I configured dfs.name.dir in conf/hdfs-site.xml
Deleted everything under "/tmp/hadoop-/" directory
Changed file permissions from 777 to 755 for directory listed under dfs.data.dir
And the data node started working.
Even after removing and remaking the directories, the datanode wasn't starting.
So I started it manually using bin/hadoop datanode.
It did not seem to reach any conclusion, so I opened another terminal as the same user, ran jps, and it showed me the running datanode process.
It works, but I just have to keep that unfinished terminal open on the side.
Follow these steps and your datanode will start again.
1) Stop dfs.
2) Open hdfs-site.xml.
3) Remove the data.dir and name.dir properties from hdfs-site.xml and format the namenode again.
4) Then start dfs again.
I got the same error. I tried to start and stop dfs several times and cleared all the directories mentioned in the previous answers, but nothing helped.
The issue was resolved only after rebooting the OS and configuring Hadoop from scratch (configuring Hadoop from scratch without rebooting didn't work).
Once, when I was not able to find the data node using jps, I deleted the current folder in the Hadoop install directory (/opt/hadoop-2.7.0/hadoop_data/dfs/data) and restarted Hadoop using start-all.sh and jps.
This time I could find the data node, and the current folder was created again.
Try this:
stop-all.sh
vi hdfs-site.xml
Change the value given for the property dfs.data.dir.
Format the namenode.
start-all.sh
I applied a mix of configurations, and it worked for me.
First >>
Stop all Hadoop services using
${HADOOP_HOME}/sbin/stop-all.sh
Second >>
Check mapred-site.xml, which is located at ${HADOOP_HOME}/etc/hadoop/mapred-site.xml, and change localhost to master.
Third >>
Remove the temporary folder created by Hadoop:
rm -rf //path//to//your//hadoop//temp//folder
Fourth >>
Add recursive permissions on the temp folder:
sudo chmod -R 777 //path//to//your//hadoop//temp//folder
Fifth >>
Now start all the services again, and first check that every service, including the datanode, is running.
mv /usr/local/hadoop_store/hdfs/datanode /usr/local/hadoop_store/hdfs/datanode.backup
mkdir /usr/local/hadoop_store/hdfs/datanode
hadoop datanode OR start-all.sh
jps

Hadoop dfs -ls returns list of files in my hadoop/ dir

I've set up a single-node Hadoop configuration running via Cygwin under Win7. After starting Hadoop by bin/start-all.sh I run bin/hadoop dfs -ls, which returns a list of the files in my hadoop directory. Then I run bin/hadoop datanode -format and bin/hadoop namenode -format, but -ls still returns the contents of my hadoop directory. As far as I understand, it should return nothing (an empty folder). What am I doing wrong?
Did you edit core-site.xml and mapred-site.xml under the conf folder?
It seems like your Hadoop cluster is in local mode.
I know this question is quite old, but the directory structure in Hadoop has changed a bit (version 2.5).
Jeroen's answer in its current form would be:
hdfs dfs -ls hdfs://localhost:9000/users/smalldata
Also, just for information: use of start-all.sh and stop-all.sh has been deprecated; instead, use start-dfs.sh and start-yarn.sh.
I had the same problem and solved it by explicitly specifying the URL to the NameNode.
To list all directories in the root of your HDFS space, do the following:
./bin/hadoop dfs -ls hdfs://<ip-of-your-server>:9000/
The documentation says something about a default HDFS point in the configuration, but I cannot find it. If someone knows what they mean, please enlighten us.
This is where I got the info: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#Overview
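For what it's worth, the default HDFS endpoint is normally set via fs.default.name (fs.defaultFS in newer releases) in core-site.xml; a minimal sketch, with the host and port matching the example above:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<ip-of-your-server>:9000</value>
  </property>
</configuration>
With that set, ./bin/hadoop dfs -ls / goes to HDFS without spelling out the full URL.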
Or you could just do:
Run stop-all.sh.
Remove the dfs data and name directories.
hadoop namenode -format
Run start-all.sh

Resources