Why is Homebrew Hadoop 2.3 not working on OSX Mavericks?

I am running into the following issues after using Homebrew to install Hadoop. I followed the guide here:
http://glebche.appspot.com/static/hadoop-ecosystem/hadoop-hive-tutorial.html
I set the following environment variables in my .bashrc:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home
export HADOOP_INSTALL=/usr/local/Cellar/hadoop/2.3.0
export HADOOP_HOME=$HADOOP_INSTALL
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
After running hadoop namenode -format, I attempt to run start-dfs.sh and get the following:
14/05/05 21:19:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: set hadoop variables
localhost: starting namenode, logging to /usr/local/Cellar/hadoop/2.3.0/libexec/logs/mynotebook.local.out
localhost: Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
localhost: set hadoop variables
localhost: starting datanode, logging to /usr/local/Cellar/hadoop/2.3.0/libexec/logs/mynotebook.local.out
localhost: Error: Could not find or load main class org.apache.hadoop.hdfs.server.datanode.DataNode
Starting secondary namenodes [0.0.0.0]
0.0.0.0: set hadoop variables
0.0.0.0: secondarynamenode running as process 12747. Stop it first.
14/05/05 21:19:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
How do I get around this issue?

Based on the first line of the second message,
"14/05/05 21:19:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"
I suppose that you're running Hadoop on a 64-bit operating system. Hadoop's native library is built for a 32-bit system by default; I had the same issue and the same message. What you have to do is rebuild Hadoop from source on your system.
I suggest you use the guide below; it's for version 2.2, but it's fine for version 2.3 too:
http://csrdu.org/nauman/2014/01/23/geting-started-with-hadoop-2-2-0-building/
Or the official guide:
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/NativeLibraries.html#Build
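For reference, a minimal sketch of that rebuild, assuming the prerequisites from the official guide (Maven, CMake, protobuf 2.5, zlib) are installed; the download URL and the final copy destination are illustrative, not from the original answer:

# Fetch and unpack the matching source release (URL shown for illustration)
curl -LO https://archive.apache.org/dist/hadoop/common/hadoop-2.3.0/hadoop-2.3.0-src.tar.gz
tar xzf hadoop-2.3.0-src.tar.gz
cd hadoop-2.3.0-src

# Build the distribution together with the native libraries
mvn package -Pdist,native -DskipTests -Dtar

# The rebuilt libraries land under hadoop-dist/target/hadoop-2.3.0/lib/native;
# copying them into the Homebrew tree is one way to pick them up (path assumed)
cp -R hadoop-dist/target/hadoop-2.3.0/lib/native /usr/local/Cellar/hadoop/2.3.0/libexec/lib/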

Related

Spark-shell --master yarn stuck

I installed Hadoop and Spark via Homebrew
$ brew list --versions | grep spark
apache-spark 2.2.0
$ brew list --versions | grep hadoop
hadoop 2.8.1 2.8.2 hdfs
where Hadoop 2.8.2 is what I am using.
I followed this post to configure Hadoop, and also followed this post to configure spark.yarn.archive as:
spark.yarn.archive hdfs://localhost:9000/user/panc25/spark-jars.zip
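For context, that archive bundles Spark's jars on HDFS so YARN containers don't re-upload them on every submit. A sketch of building and uploading it, assuming the HDFS path from the setting above (the zip recipe is the commonly used one, not taken from the post):

cd $SPARK_HOME
# Store the jars uncompressed (-0) at the archive root (-j strips the jars/ prefix)
zip -0 -j spark-jars.zip jars/*
# Upload to the location referenced by spark.yarn.archive
hdfs dfs -mkdir -p /user/panc25
hdfs dfs -put spark-jars.zip /user/panc25/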
The following are my Hadoop/Spark related environment settings in my .bash_profile:
# ---------------------
# Hadoop
# ---------------------
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.8.2
export YARN_CONF_DIR=$HADOOP_HOME/libexec/etc/hadoop/
alias hadoop-start="$HADOOP_HOME/sbin/start-dfs.sh;$HADOOP_HOME/sbin/start-yarn.sh"
alias hadoop-stop="$HADOOP_HOME/sbin/stop-yarn.sh;$HADOOP_HOME/sbin/stop-dfs.sh"
# ---------------------
# Apache Spark
# ---------------------
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.2.0/libexec
export PATH=$SPARK_HOME/../bin:$SPARK_HOME/sbin:$PATH
I can successfully start Hadoop (hdfs + yarn):
$ hadoop-start
17/11/12 17:08:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-namenode-mbp13mid2017.local.out
localhost: starting datanode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-datanode-mbp13mid2017.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-secondarynamenode-mbp13mid2017.local.out
17/11/12 17:08:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-resourcemanager-mbp13mid2017.local.out
localhost: starting nodemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-nodemanager-mbp13mid2017.local.out
$ jps
92723 NameNode
93188 Jps
93051 ResourceManager
93149 NodeManager
92814 DataNode
92926 SecondaryNameNode
However, when I start spark-shell --master yarn, it seems to freeze and I don't know what is going on.
What is wrong?
BTW, I can visit the Spark UI at http://localhost:4040/, but all pages are blank.
I experienced a similar issue; it was caused by the fact that I forgot to append /conf to the HADOOP_CONF_DIR env variable (/etc/hadoop/conf).
In my case I was running the Spark 2.1 Cloudera distribution and specified HADOOP_CONF_DIR=/etc/hadoop/conf/:/etc/hive/conf/. For some reason it was getting stuck, so I modified it to HADOOP_CONF_DIR=/etc/hadoop/conf/ and it worked. Still looking for the root cause!
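In shell form, a minimal sketch of that fix (the single-directory value is the one that worked above):

# A colon-separated conf path can apparently hang the launch; point at one directory only
export HADOOP_CONF_DIR=/etc/hadoop/conf/
spark-shell --master yarn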

Hadoop2.7.3: Cannot see DataNode/ResourceManager process after starting hdfs and yarn

I'm using a Mac, and my Java version is:
$java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
followed this link: https://dtflaneur.wordpress.com/2015/10/02/installing-hadoop-on-mac-osx-el-capitan/
I first ran brew install hadoop, configured the SSH connection and the XML files as required, and then ran:
start-dfs.sh
start-yarn.sh
The screen output is like this:
$start-dfs.sh
17/05/06 09:58:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: namenode running as process 74213. Stop it first.
localhost: starting datanode, logging to /usr/local/Cellar/hadoop/2.7.3/libexec/logs/hadoop-x-datanode-xdeMacBook-Pro.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 74417. Stop it first.
17/05/06 09:58:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Then using jps I cannot see a "DataNode" process. I suppose DataNode is the hdfs module and ResourceManager is the yarn module:
$jps
74417 SecondaryNameNode
75120 Jps
74213 NameNode
74539 ResourceManager
74637 NodeManager
I can list hdfs files:
$hdfs dfs -ls /
17/05/06 09:58:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - x supergroup 0 2017-05-05 23:50 /user
But running the pi example throws an exception:
$hadoop jar /usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 5
Number of Maps = 2
Samples per Map = 5
17/05/06 10:19:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/06 10:19:49 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/x/QuasiMonteCarlo_1494037188550_135794067/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
I wonder if I missed any configuration. How can I make sure the daemons run successfully, and how can I check or troubleshoot the possible failure causes?
Thanks.
I am in the learning phase too. This error comes when there is no datanode available to read/write.
You can check the NameNode web UI at http://localhost:50070 to see whether any datanode is running.
For troubleshooting, you can check the logs generated under the Hadoop installation directory. If you can share those logs, I can try to help.
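A short command-line sketch of the same check (the log path follows the Homebrew layout quoted in the question):

# Ask the namenode how many live datanodes it sees
hdfs dfsadmin -report
# If none are live, the failure reason is usually in the datanode log
tail -n 50 /usr/local/Cellar/hadoop/2.7.3/libexec/logs/hadoop-*-datanode-*.log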

Unable to start namenode in Hadoop

I am installing Hadoop 2.7.3 on my Ubuntu 16.04 system. I am getting the following error while trying to execute start-dfs.sh. I have checked all the configuration files.
node#hellbot:~$ start-dfs.sh
17/01/28 20:46:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-node-namenode-hellbot.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-node-datanode-hellbot.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-node-secondarynamenode-hellbot.out
17/01/28 20:46:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Thanks in advance

I'm getting an error while running start-dfs.sh

I'm getting an error while running start-dfs.sh:
start-dfs.sh
16/10/02 23:10:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-root-namenode-Web.out
localhost: nice: /home/hadoop/hadoop/bin/hdfs: No such file or directory
localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-Web.out
localhost: nice: /home/hadoop/hadoop/bin/hdfs: No such file or directory
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-root-secondarynamenode-Web.out
0.0.0.0: nice: /home/hadoop/hadoop/bin/hdfs: No such file or directory
16/10/02 23:11:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Looks like you are missing the Hadoop home env var:
export HADOOP_HOME=/opt/hadoop
Then check whether anything works:
hadoop version
Issues like yours usually come down to a missing env var.
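A slightly fuller sketch of that shell setup (the /opt/hadoop path comes from the log output above; placing it in ~/.bashrc is an assumption):

# ~/.bashrc
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Sanity checks
hadoop version
which hdfs   # should resolve to /opt/hadoop/bin/hdfs, not /home/hadoop/hadoop/bin/hdfs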

Starting hadoop Daemons issues

I have installed Hadoop 2.6.0 on my Ubuntu 12.04 system. When I start/stop the DFS daemons with start-dfs.sh and stop-dfs.sh, they show the error below. Please help me to overcome this issue:
no namenode to stop
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
16/05/04 10:40:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Thanks,
Judging by the errors mentioned, it looks like the namenode server was not started. Can you please share your cluster details?
Meanwhile, you can check this guide on how to set up a Hadoop cluster and compare it with your setup: http://bigdatahandler.com/hadoop-hdfs/hadoop-multi-node-cluster-setup/
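In the meantime, a quick sketch of how to confirm the namenode state (the log path assumes the stock layout under $HADOOP_HOME; adjust to your install):

# Is a NameNode JVM running at all?
jps | grep -i namenode
# If not, the startup failure reason will be in the namenode log
tail -n 50 $HADOOP_HOME/logs/hadoop-*-namenode-*.log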
