Hadoop Home Path for Datanode - hadoop

I'm configuring a 3 node hadoop cluster in EC2.
For Namenode and Jobtracker:
export HADOOP_HOME=/usr/local/hadoop # Masternode
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"
For the Datanode, I attached additional EBS storage, which is mounted on /vol:
export HADOOP_HOME=/vol/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"
When I tried to run start-dfs.sh on the Namenode, I got the following error:
data01.hdp-dev.XYZ.com: bash: line 0: cd: /usr/local/hadoop: No such file or directory
data01.hdp-dev.XYZ.com: bash: /usr/local/hadoop/sbin/hadoop-daemon.sh: No such file or directory
Notice that the Namenode is trying to invoke Hadoop on the Datanode from the wrong directory.
Any advice would help.
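For reference, the failing step is the remote invocation that start-dfs.sh performs over SSH. Roughly (a sketch reconstructed from the error output above, not the exact command line the helper scripts build):

# The Namenode runs this on each host listed in its slaves file, using its
# own HADOOP_HOME (/usr/local/hadoop) rather than the Datanode's /vol/hadoop:
ssh data01.hdp-dev.XYZ.com \
  'cd /usr/local/hadoop && /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode'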

Related

HADOOP: Failed to run hdfs namenode -format

When I run hdfs namenode -format, I get the following error:
hadoop/bin/hdfs: line 10: /opt/bmr/hadoop/bin/hdfs.distro: No such file or directory
And in the hdfs file, I found the contents as follows:
#!/bin/bash
export HADOOP_HOME=${HADOOP_HOME:-/opt/bmr/hadoop}
export HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-/opt/bmr/hadoop}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-/opt/bmr/hadoop}
export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
export HADOOP_OPTS="${HADOOP_OPTS}"
export YARN_OPTS="${YARN_OPTS}"
exec /opt/bmr/hadoop/bin/hdfs.distro "$@"
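A quick way to see what is going on (a sketch, using the paths hard-coded in the wrapper above) is to check whether the wrapped binary actually exists under the exported HADOOP_HOME:

# Does the file the wrapper exec's actually exist?
ls -l /opt/bmr/hadoop/bin/hdfs.distro
# If not, list what the installation really ships under bin/ and point
# HADOOP_HOME (or the wrapper) at the directory that contains it:
ls -l "${HADOOP_HOME:-/opt/bmr/hadoop}/bin/"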

unable to start namenode after formatting in centos7

I am unable to start the namenode in hdp 2.3.4 on CentOS 7 after running the format command. I am getting the error below: Error: Cannot find configuration directory: start
Below is the bashrc file:
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
export JAVA_HOME=$PATH/jdk1.7.0_71
export HADOOP_INSTALL=$PATH/hadoop-2.3.4
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
Below is the command I am executing to start namenode:
/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
The error
Error: Cannot find configuration directory:
is thrown because the variable $HADOOP_CONF_DIR used in the command is not set in the environment, so --config receives no actual configuration path and the next word, start, is treated as the configuration directory.
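A quick check (a sketch) makes this visible before running the daemon script:

# If this prints an empty line, the daemon command above effectively
# expands to "--config start", which produces the error shown.
echo "$HADOOP_CONF_DIR"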
After fixing the environment variable assignments, the .bashrc should look like this (assuming the installation is through tarballs):
export JAVA_HOME=/<absolute_path_where_jdk_is_extracted>/jdk1.7.0_71
export HADOOP_INSTALL=/<absolute_path_where_hdp_is_extracted>/hadoop-2.3.4
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin:$JAVA_HOME/bin
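After reloading the shell, the daemon command from the question should then resolve a real configuration directory (a sketch, assuming the tarball layout above):

source ~/.bashrc
echo "$HADOOP_CONF_DIR"    # should now print .../hadoop-2.3.4/etc/hadoop
$HADOOP_INSTALL/sbin/hadoop-daemon.sh --config "$HADOOP_CONF_DIR" start namenode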
Update your .bashrc with the parameters below:
export JAVA_HOME=<location of the JDK, e.g. /usr/java/jdk1.x.x>
export HADOOP_HOME=<location of the Hadoop install (user defined)>
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export PATH
Note: the Hadoop install location should be set as HADOOP_HOME; it will also be reflected in hadoop-env.sh.
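As a concrete illustration (a sketch; the JDK path below is an assumption, adjust it to your install), hadoop-env.sh is sourced by every Hadoop daemon at start-up, so it is common to set JAVA_HOME there explicitly as well:

# $HADOOP_HOME/etc/hadoop/hadoop-env.sh
# JAVA_HOME is set here explicitly because the daemons read this file
# when they start. The path below is an assumed example.
export JAVA_HOME=/usr/java/jdk1.7.0_71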

Hadoop fs shell commands not working

I'm not able to run hadoop fs shell commands from the CLI, but I can browse HDFS through the web UI, and other hadoop commands work fine (for example, hadoop version). Below is the error I'm getting. Please help.
$ hadoop fs -ls /
-ls: For input string: "13.1067728"
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
Use
hdfs dfs -ls ...
Try this if the above doesn't work. Add these two lines to your .bashrc:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Export the Hadoop home according to your system in .bashrc:
#Hadoop variables
export JAVA_HOME=/usr/jdk1.8.0_11
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/home/kishore/BigData/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
and execute this command in the terminal:
source ~/.bashrc
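To confirm the changes took effect (a sketch; output will vary with your install):

hadoop checknative -a    # verifies the native library path is being picked up
hdfs dfs -ls /           # the form suggested in the first answer above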

Can hadoop slaves exist on different install directory?

I have a 3 node hadoop cluster with one namenode and two datanodes.
The namenode resides in the /opt/hadoop/ directory and the datanodes reside in the /mnt/hadoop/ directory.
The .bashrc of the namenode contains:
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
export HADOOP_INSTALL=/opt/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
and the .bashrc of the datanodes contains:
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
export HADOOP_INSTALL=/mnt/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
However, when I start the cluster from the namenode, the datanodes report:
/opt/hadoop/sbin/hadoop-daemon.sh: No such file or directory .
It seems to me that the slaves are referring to /opt/hadoop/ instead of /mnt/hadoop. Why is this?
Should the slaves reside in the same path as the namenode?
Thanks.
If you use the start-dfs.sh utility to start all HDFS services (namenode and datanodes), then you have to maintain the same directory structure (Hadoop artifacts and configuration files) on all nodes. (start-all.sh is deprecated now; it internally invokes the start-dfs.sh and start-yarn.sh utilities to start the HDFS and YARN services respectively.)
If you are not maintaining the same directory structure, you have to execute the following command on every slave node to start its datanode:
$HADOOP_INSTALL/sbin/hadoop-daemon.sh start datanode
To start the YARN slave services, use the following command on every slave node:
$HADOOP_INSTALL/sbin/yarn-daemon.sh start nodemanager
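For example, from the namenode this can be scripted over SSH (a sketch; slave1 and slave2 are placeholder hostnames, and /mnt/hadoop is the datanode install path from the question):

for host in slave1 slave2; do
  # each slave runs the daemons from its own install directory
  ssh "$host" '/mnt/hadoop/sbin/hadoop-daemon.sh start datanode'
  ssh "$host" '/mnt/hadoop/sbin/yarn-daemon.sh start nodemanager'
done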

Can not create hadoop cluster

I am following http://alanxelsys.com/hadoop-v2-single-node-installation-on-centos-6-5 to install hadoop
on my cluster
I have installed Hadoop, with its scripts under the /usr/local/hadoop/sbin directory, and when I try executing the bash script start-all.sh, the system gives the error below:
start-all.sh: command not found
What I have tried:
1. Setting up SSH again
2. Rechecking the Java path
The variables I have set are:
export JAVA_HOME=/usr/java/latest
export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
The start-all.sh script says "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh", so you should use start-dfs.sh and start-yarn.sh instead.
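For instance (a sketch, assuming the install lives under /usr/local/hadoop as in the question), put the sbin directory on PATH so the scripts are found, then call the two scripts directly:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
start-dfs.sh     # starts the HDFS daemons
start-yarn.sh    # starts the YARN daemons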
