Hadoop's NameNode and DataNode services did not run in single-node mode

I installed Hadoop 2.7.2 on Ubuntu 16.04 in single-node mode, but neither the NameNode nor the DataNode service runs after starting Hadoop.
hduser@saber-Studio-1435:/usr/local/hadoop$ start-all.sh
This script is Deprecated.
Instead use start-dfs.sh and start-yarn.sh
16/06/20 15:34:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-saber-Studio-1435.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-saber-Studio-1435.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 7214. Stop it first.
16/06/20 15:35:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
resourcemanager running as process 7374. Stop it first.
localhost: nodemanager running as process 7502. Stop it first.
Status:
hduser@saber-Studio-1435:/usr/local/hadoop$ jps
8747 Jps
7502 NodeManager
7374 ResourceManager
7214 SecondaryNameNode

First stop Hadoop. From $HADOOP_HOME, run:
./sbin/stop-all.sh
Then format the NameNode (this erases the existing HDFS metadata):
./bin/hdfs namenode -format
The older form ./bin/hadoop namenode -format still works in 2.x but is deprecated. The DataNode has no -format option, so there is nothing to format on that side.
Then start again using ./sbin/start-all.sh
Then run jps on the command line. If the NameNode and DataNode still do not appear, remove the directories created for HDFS and recreate them using mkdir -p.
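The jps check can be scripted if you want a quick yes/no answer. This is only a minimal sketch: missing_daemons is a made-up helper name, and the list of expected daemons assumes a single-node setup running both HDFS and YARN.

```shell
# Sketch: report which expected Hadoop daemons are absent from `jps` output.
# Reads `jps` output ("<pid> <ClassName>" per line) on stdin.
# missing_daemons is a hypothetical helper name, not part of Hadoop.
missing_daemons() (
  # Assumption: a single-node setup that runs both HDFS and YARN.
  expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"
  # Keep only the second column (the daemon class name).
  running=$(awk '{print $2}')
  missing=""
  for d in $expected; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$running" | grep -qx "$d"; then
      missing="$missing $d"
    fi
  done
  # Strip the leading space; prints an empty line when nothing is missing.
  echo "${missing# }"
)
```

Run it as jps | missing_daemons. With the jps output from the question it would print NameNode DataNode.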

Related

How to install Hadoop on M1 Mac

I followed several tutorials, and every time I start Hadoop I get this output:
feiyechen@FEIYEdeMac-mini ~ % start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as feiyechen in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
localhost: datanode is running as process 55832. Stop it first and ensure /tmp/hadoop-feiyechen-datanode.pid file is empty before retry.
Starting secondary namenodes [FEIYEdeMac-mini.local]
FEIYEdeMac-mini.local: secondarynamenode is running as process 55966. Stop it first and ensure /tmp/hadoop-feiyechen-secondarynamenode.pid file is empty before retry.
2022-01-28 20:35:24,311 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
feiyechen@FEIYEdeMac-mini ~ % jps
55832 DataNode
57838 Jps
55966 SecondaryNameNode
57247 NameNode
The tutorial shows the list of processes I should get after running jps.
I only have 4 items: DataNode, Jps, SecondaryNameNode, NameNode. Does that mean I failed?
It means you have a running HDFS installation, but not YARN.
You should be able to run start-yarn.sh separately if you want the ResourceManager and NodeManager.
Otherwise, there are log files created for both the YARN processes that would include information about why they are failing.
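To pull the relevant lines out of those logs, something like the following may help. It is a sketch under assumptions: the log directory is whatever you pass in (often $HADOOP_HOME/logs, but check your setup), the file names are assumed to contain "resourcemanager" or "nodemanager", and yarn_log_errors is a made-up helper name.

```shell
# Sketch: print recent ERROR/FATAL lines from the YARN daemon logs.
# Assumption: logs live in the directory passed as $1 and their file names
# contain "resourcemanager" or "nodemanager".
# yarn_log_errors is a hypothetical helper name, not a Hadoop command.
yarn_log_errors() (
  log_dir=$1
  for f in "$log_dir"/*resourcemanager*.log "$log_dir"/*nodemanager*.log; do
    # Skip glob patterns that matched nothing.
    [ -f "$f" ] || continue
    echo "== $f"
    # Show at most the last 5 matching lines of each file.
    grep -E 'ERROR|FATAL' "$f" | tail -n 5
  done
)
```

For example: yarn_log_errors "$HADOOP_HOME/logs".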

Spark-shell --master yarn stuck

I installed Hadoop and Spark via Homebrew
$ brew list --versions | grep spark
apache-spark 2.2.0
$ brew list --versions | grep hadoop
hadoop 2.8.1 2.8.2 hdfs
where Hadoop 2.8.2 is what I am using.
I followed this post to configure Hadoop, and this post to configure spark.yarn.archive as:
spark.yarn.archive hdfs://localhost:9000/user/panc25/spark-jars.zip
The following are my Hadoop/Spark-related environment settings in my .bash_profile:
# ---------------------
# Hadoop
# ---------------------
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.8.2
export YARN_CONF_DIR=$HADOOP_HOME/libexec/etc/hadoop/
alias hadoop-start="$HADOOP_HOME/sbin/start-dfs.sh;$HADOOP_HOME/sbin/start-yarn.sh"
alias hadoop-stop="$HADOOP_HOME/sbin/stop-yarn.sh;$HADOOP_HOME/sbin/stop-dfs.sh"
# ---------------------
# Apache Spark
# ---------------------
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.2.0/libexec
export PATH=$SPARK_HOME/../bin:$SPARK_HOME/sbin:$PATH
I can successfully start Hadoop (HDFS + YARN):
$ hadoop-start
17/11/12 17:08:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-namenode-mbp13mid2017.local.out
localhost: starting datanode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-datanode-mbp13mid2017.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-secondarynamenode-mbp13mid2017.local.out
17/11/12 17:08:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-resourcemanager-mbp13mid2017.local.out
localhost: starting nodemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-nodemanager-mbp13mid2017.local.out
$ jps
92723 NameNode
93188 Jps
93051 ResourceManager
93149 NodeManager
92814 DataNode
92926 SecondaryNameNode
However, when I start spark-shell --master yarn it seems to freeze and I don't know what is going on:
What is wrong?
BTW, I could visit the SparkUI http://localhost:4040/, but all pages are blank.
I experienced a similar issue, and it was caused by the fact that I had forgotten to append /conf to the HADOOP_CONF_DIR environment variable (/etc/hadoop/conf).
In my case I was running the Cloudera distribution of Spark 2.1 and had specified HADOOP_CONF_DIR=/etc/hadoop/conf/:/etc/hive/conf/. For some reason it was getting stuck, so I changed it to HADOOP_CONF_DIR=/etc/hadoop/conf/ and it worked. I am still looking for the root cause.
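Both answers come down to HADOOP_CONF_DIR (or YARN_CONF_DIR) pointing at one directory that actually holds the client-side configuration files. A rough sanity check, assuming core-site.xml and yarn-site.xml are the files that matter and treating a colon-separated value as suspect because of the report above; check_conf_dir is a made-up helper name:

```shell
# Sketch: verify that a path looks like a usable Hadoop client config dir.
# Assumption: it should be a single directory (no colon-separated list)
# containing core-site.xml and yarn-site.xml.
# check_conf_dir is a hypothetical helper name, not part of Hadoop or Spark.
check_conf_dir() (
  dir=$1
  case $dir in
    *:*) echo "not a single directory: $dir"; exit 1 ;;
  esac
  for f in core-site.xml yarn-site.xml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing $dir/$f"
      exit 1
    fi
  done
  echo "ok: $dir"
)
```

For the setup in the question you could run check_conf_dir "$YARN_CONF_DIR" before launching spark-shell --master yarn.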

Hadoop v2.7 pseudo distributed installation NativeCodeLoader error

I tried to install Hadoop on my system. At first I was getting permission errors, which I resolved with chmod and chown, but now a new error appears whenever I run start-dfs.sh:
kishan@RoCk ~ $ start-dfs.sh
17/04/08 12:22:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
localhost: starting namenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-kishan-namenode-RoCk.out
localhost: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-kishan-datanode-RoCk.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-kishan-secondarynamenode-RoCk.out
17/04/08 12:22:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
kishan@RoCk ~ $ jps
10303 Jps
It's only a warning, but none of the nodes are running.
UPDATE:
Namenode ERROR log:
2017-04-09 21:32:40,002 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/kishan/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:327)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
2017-04-09 21:32:40,003 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/kishan/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible
Your Namenode needs a format.
hdfs namenode -format
If the Datanode data directories were already created, manually remove them before re-starting the cluster.
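Before formatting, it can be worth checking which part of "does not exist or is not accessible" applies. A minimal sketch, assuming a formatted NameNode directory contains a current/VERSION file; check_storage_dir is a made-up helper name, and the directory to pass is the one named in your own error message:

```shell
# Sketch: classify the state of a NameNode storage directory.
# Assumption: a formatted directory contains current/VERSION.
# check_storage_dir is a hypothetical helper name, not part of Hadoop.
check_storage_dir() (
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "missing"
  elif [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
    # The user running the NameNode needs read, write and search access.
    echo "not accessible"
  elif [ ! -f "$dir/current/VERSION" ]; then
    echo "not formatted"
  else
    echo "formatted"
  fi
)
```

Run it as the same user that starts the daemons, for example check_storage_dir /home/kishan/hdfs/namenode for the log above.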

Hadoop 2.6.2: start-dfs.sh doesn't start JobTracker and TaskTracker

I installed a single-node Hadoop and am now starting the cluster with the start-dfs.sh command.
But JobTracker and TaskTracker do not appear in the jps output, so it seems they are not starting.
Do you see why? I'm installing version 2.6.2.
After executing start-dfs.sh, this appears:
[hadoopadmin@hadoop ~]$ start-dfs.sh
16/03/23 12:17:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-hadoopadmin-namenode-hadoop.out
localhost: starting datanode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-hadoopadmin-datanode-hadoop.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.6.2/logs/hadoop-hadoopadmin-secondarynamenode-hadoop.out
16/03/23 12:17:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoopadmin@hadoop ~]$ jps
2881 DataNode
2758 NameNode
3142 Jps
3039 SecondaryNameNode
[hadoopadmin@hadoop ~]$
In Hadoop 2.x there is no JobTracker or TaskTracker anymore; YARN's ResourceManager and NodeManager take their place. Here you started only the HDFS services, not the YARN services. To start the YARN services, run start-yarn.sh; only then will the YARN daemons start.
If you want to start all services at once, run start-all.sh (not a good practice; the script is deprecated).
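As a quick reference, the mapping from daemon to start script can be written down like this. It is just an illustrative sketch for Hadoop 2.x, and start_script_for is a made-up helper name:

```shell
# Sketch: which start script launches a given daemon in Hadoop 2.x.
# start_script_for is a hypothetical helper name, not part of Hadoop.
start_script_for() (
  case $1 in
    NameNode|DataNode|SecondaryNameNode) echo "start-dfs.sh" ;;
    ResourceManager|NodeManager)         echo "start-yarn.sh" ;;
    JobTracker|TaskTracker)              echo "none (replaced by YARN in Hadoop 2.x)" ;;
    *)                                   echo "unknown"; exit 1 ;;
  esac
)
```

For example, start_script_for NodeManager prints start-yarn.sh.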

Hadoop Datanode is not starting

I have installed Hadoop on my Ubuntu system and started it. Here are the details:
krish@krish-VirtualBox:~$ start-dfs.sh
14/10/20 13:16:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-krish-namenode-krish-VirtualBox.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-krish-datanode-krish-VirtualBox.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-krish-secondarynamenode-krish-VirtualBox.out
14/10/20 13:16:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
krish@krish-VirtualBox:~$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-krish-resourcemanager-krish-VirtualBox.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-krish-nodemanager-krish-VirtualBox.out
krish@krish-VirtualBox:~$ jps
3065 NodeManager
2800 SecondaryNameNode
2941 ResourceManager
3307 Jps
2497 NameNode
krish@krish-VirtualBox:~$
I just want to know whether everything is working correctly. I do not see DataNode in the list.
Stop the cluster.
If you have explicitly defined a tmp directory location in core-site.xml, remove all files under that directory.
If you have explicitly defined DataNode and NameNode directories in hdfs-site.xml, delete all the files under those directories.
If you have not defined anything in core-site.xml or hdfs-site.xml, remove all the files under /tmp/hadoop-<name of your Hadoop user>.
Format the NameNode, then start the cluster again.
It should work.
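A DataNode that fails to start after the NameNode has been reformatted often holds a clusterID that no longer matches the NameNode's, which would explain why clearing the directories helps. That is an assumption about the likely cause here; the output in the question does not confirm it. A minimal sketch for comparing the two, assuming each storage directory has a current/VERSION file with a clusterID= line (cluster_id and same_cluster are made-up helper names):

```shell
# Sketch: compare the clusterID recorded by the NameNode and the DataNode.
# Assumption: each storage directory has a current/VERSION file containing
# a line of the form clusterID=<value>.
# cluster_id and same_cluster are hypothetical helper names.
cluster_id() (
  # Print the value after "clusterID="; prints nothing if the file is absent.
  sed -n 's/^clusterID=//p' "$1/current/VERSION" 2>/dev/null
)

same_cluster() (
  nn=$(cluster_id "$1")   # NameNode storage directory
  dn=$(cluster_id "$2")   # DataNode storage directory
  if [ -n "$nn" ] && [ "$nn" = "$dn" ]; then
    echo "match: $nn"
  else
    echo "mismatch: namenode=$nn datanode=$dn"
    exit 1
  fi
)
```

Call it with your own directories, i.e. the NameNode and DataNode paths from hdfs-site.xml or the ones under the default tmp location. The DataNode log is still the authoritative place to see why it exited.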

Resources