Running the FPG algorithm of Mahout on Hadoop in cluster mode - hadoop

I installed mahout-0.7 and hadoop-1.2.1 on Linux (CentOS). Hadoop is configured as a multi-node cluster.
I created a user named hadoop and installed Mahout and Hadoop under /home/hadoop/opt/.
I set MAHOUT_HOME, HADOOP_HOME, MAHOUT_LOCAL, etc. in the .bashrc file of the hadoop user:
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# User specific aliases and functions
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71/jre
export HADOOP_HOME=/home/hadoop/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CONF_DIR=/opt/hadoop/conf
export MAHOUT_HOME=/home/hadoop/opt/mahout
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
export PATH=$PATH:$MAHOUT_HOME/bin
I want to run Mahout on the Hadoop file system (HDFS). When I run the following command, I get an error.
command:
hadoop#master mahout$ bin/mahout fpg -i /home/hadoop/output.dat -o patterns -method mapreduce -k 50 -s 2
error:
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
Error occurred during initialization of VM
Could not reserve enough space for object heap
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Please help me. I tried but could not fix the error.

It seems that there are some conflicts in your configuration and usage.
At first glance, you can check the following:
To make sure that you've set the Mahout variables correctly, use this command:
echo $MAHOUT_LOCAL
This should not return an empty string when you intend to run Mahout locally.
Also, HADOOP_CONF_DIR should be set to $HADOOP_HOME/conf.
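A quick sanity check in the hadoop user's shell (a hedged sketch; the expected values follow from the advice above and from the error output in the question):
echo $HADOOP_CONF_DIR   # should point at $HADOOP_HOME/conf, not a non-existent /opt/hadoop/conf
which hadoop            # must resolve to $HADOOP_HOME/bin/hadoop; otherwise Mahout falls back to "running locally"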
Here's a list of popular environment variables for Hadoop:
#HADOOP VARIABLES START
export JAVA_HOME=/path/to/jdk1.8.0/ #your jdk path
export HADOOP_HOME=/usr/local/hadoop #your hadoop path
export HADOOP_INSTALL=/usr/local/hadoop #your hadoop path
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_CLASSPATH=/home/hduser/lib/* #third-party libraries to be loaded with Hadoop
#HADOOP VARIABLES END
You are also getting a heap error, so you should increase your heap size so that the JVM is able to initialize.
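A minimal sketch of where the heap can be raised (the values are illustrative assumptions; adjust them to the RAM available on your machines):
# JVM heap for the Mahout driver, read by bin/mahout (in MB)
export MAHOUT_HEAPSIZE=2048
# JVM heap for the Hadoop daemons, set in hadoop-env.sh (in MB)
export HADOOP_HEAPSIZE=1024
# Heap for MapReduce child tasks, set in mapred-site.xml on Hadoop 1.x:
#   <property>
#     <name>mapred.child.java.opts</name>
#     <value>-Xmx1024m</value>
#   </property>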
You could also make the error easier to diagnose by adding more information about your cluster:
How many machines are you using?
What are the hardware specs of these machines?
Which distribution and version of Hadoop are you using?

Related

When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment

I am trying to run Spark using yarn and I am running into this error:
Exception in thread "main" java.lang.Exception: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
I am not sure where the "environment" is (what specific file?). I tried using:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
in my .bash_profile, but this doesn't seem to help.
While running Spark on YARN, you need to add the following line to spark-env.sh:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
Note: check that $HADOOP_HOME/etc/hadoop is the correct path in your environment, and that spark-env.sh also exports HADOOP_HOME.
For the Windows environment, open the file load-spark-env.cmd in the Spark bin folder and add the following line:
set HADOOP_CONF_DIR=%HADOOP_HOME%\etc\hadoop
Just an update to the answer by Shubhangi:
cd $SPARK_HOME/bin
sudo nano load-spark-env.sh
Add the lines below, then save and exit:
export SPARK_LOCAL_IP="127.0.0.1"
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export YARN_CONF_DIR="$HADOOP_HOME/etc/hadoop"
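Once the variables are in place, a quick way to verify them is to submit the bundled SparkPi example to YARN. This is a hedged sketch: the location of the examples jar differs between Spark versions, so treat the path as an assumption and adjust it to your installation.
# if HADOOP_CONF_DIR/YARN_CONF_DIR are now picked up, the
# "must be set in the environment" exception should no longer appear
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  "$SPARK_HOME"/examples/jars/spark-examples_*.jar 10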

Sqoop started, but running the command shows "sqoop: command not found"

I have been learning Sqoop for a few days and have successfully installed and configured it with Hadoop.
hadoop_usr#sawai-Lenovo-G580:/usr/local/sqoop/bin$ sqoop2-server start
Setting conf dir: /usr/local/sqoop/bin/conf
Sqoop home directory: /usr/local/sqoop
The Sqoop server is already started.
hadoop_usr#sawai-Lenovo-G580:/usr/local/sqoop/bin$ sqoop
sqoop: command not found
The Sqoop server is already running, but when I try to fire the sqoop command I get the "command not found" error message, even though the Sqoop home is already on my PATH:
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
export SQOOP_HOME=/usr/local/sqoop
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASSPATH=$SQOOP_HOME/server/lib
export PATH=$PATH:$SQOOP_HOME/bin:$SQOOP_CONF:$SQOOP_CLASSPATH
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/hadoop/sbin:/usr/local/hadoop/bin:/usr/local/sqoop/bin::/usr/local/sqoop/server/lib
Please help me to resolve this issue.
Thanks in advance.
A "command not found" error in most cases happens because the PATH is not set for that command.
Kindly set the paths for Sqoop, which you have already done:
export PATH=$PATH:$SQOOP_HOME/bin:$SQOOP_CONF:$SQOOP_CLASSPATH
Source the file where you have set $PATH, or restart your terminal.
Put the commands below in your .bashrc file:
export SQOOP_HOME=/home/pj/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
and reload .bashrc:
source .bashrc
If the issue still persists, restart your terminal.
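After reloading .bashrc, a short sanity check like the one below (a sketch assuming the layout from the question) shows whether the shell can actually find an executable named sqoop:
echo $SQOOP_HOME                        # should print /usr/local/sqoop
ls $SQOOP_HOME/bin                      # check whether an executable called "sqoop" is actually in there
which sqoop                             # empty output means PATH still does not resolve it
echo $PATH | tr ':' '\n' | grep sqoop   # confirm the Sqoop bin directory is on PATH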

Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode - tried all solutions, still the error persists

I am following a tutorial to install Hadoop. It is explained with Hadoop 1.x, but I am using hadoop-2.6.0.
I successfully completed all the steps up to executing the following command:
bin/hadoop namenode -format
I am getting the following error when I execute the above command.
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
My hadoop-env.sh file
# The java implementation to use.
export JAVA_HOME="C:/Program Files/Java/jdk1.8.0_74"
# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_PREFIX="/home/582092/hadoop-2.6.0"
export HADOOP_HOME="/home/582092/hadoop-2.6.0"
export HADOOP_COMMON_HOME=$HADOOP_HOME
#export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_PREFIX/bin
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.6.0.jar
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
core-site.xml
(core-site.xml was shown as an image)
hdfs-site.xml
<property>
  <name>dfs.data.dir</name>
  <value>/home/582092/hadoop-dir/datadir</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/home/582092/hadoop-dir/namedir</value>
</property>
Kindly help me in fixing this issue.
One cause behind this problem might be a user-defined HDFS_DIR environment variable. This is picked up by scripts such as the following lines in libexec/hadoop-functions.sh:
HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"}
...
if [[ -z "${HADOOP_HDFS_HOME}" ]] &&
[[ -d "${HADOOP_HOME}/${HDFS_DIR}" ]]; then
export HADOOP_HDFS_HOME="${HADOOP_HOME}"
fi
The solution is to avoid defining an environment variable HDFS_DIR.
The recommendations in the comments on the question are correct: use the hadoop classpath command to identify whether the hadoop-hdfs-*.jar files are present in the classpath or not. They were missing in my case.
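A quick way to run that check (a minimal sketch; the exact jar version will differ per installation):
# expand the classpath and look for the HDFS jars that contain the NameNode class
hadoop classpath | tr ':' '\n' | grep hdfs
# if no hadoop-hdfs-*.jar shows up, the HDFS share directory is not being picked up,
# e.g. because HDFS_DIR or HADOOP_HDFS_HOME points at the wrong place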

Hadoop installation issue on Fedora 24

I have been following this tutorial http://www.tecmint.com/install-configure-apache-hadoop-centos-7/ to set up Hadoop on a virtual machine. However, when I try to start Hadoop I get the following error:
start-dfs.sh
Java HotSpot(TM) Client VM warning: You have loaded library /opt/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'. 16/11/09 08:20:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Starting namenodes on [master.hadoop.lan]
My Java JDK directory is JAVA_HOME=/usr/local/jdk1.8.0_111.
Below is my configuration in the .bash_profile file:
## JAVA env variables
export JAVA_HOME=/usr/local/jdk1.8.0_111
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
## HADOOP env variables
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
My Java home path in the hadoop-env.sh file is:
export JAVA_HOME=/usr/local/jdk1.8.0_111
Am I missing a configuration step?
It's just a warning.
Type jps and check whether the NameNode and DataNode are running.
If you want to eliminate the warning, replace
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
with
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_OPTS="$HADOOP_OPTS -XX:-PrintWarnings -Djava.net.preferIPv4Stack=true"
and then execute bash (or open a new terminal) so the change takes effect.
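For the jps check, a hedged sketch of what a healthy single-node HDFS start looks like; the process names are what matter, the PIDs below are placeholders:
$ jps
4211 NameNode
4378 DataNode
4590 SecondaryNameNode
4822 Jps
# if NameNode or DataNode is missing, check the matching log file under $HADOOP_HOME/logs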

sqoop hadoop-mapreduce does not exist

I ran an import command in Sqoop and I am facing the issue below. Can someone help me with this?
Error: /usr/local/sqoop-1.4.5.bin__hadoop-2.0.4-alpha/bin/../../hadoop-mapreduce does not exist!
Please set $HADOOP_MAPRED_HOME to the root of your Hadoop MapReduce installation.
My .bashrc:
export JAVA_HOME=$(/usr/libexec/java_home)
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.6.0/libexec
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME
export HADOOP_PID_DIR=$HADOOP_HOME/pids
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_HOME_WARN_SUPPRESS=true
export HADOOP_PREFIX=$HADOOP_HOME
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:/usr/local/sqoop/bin:$PATH
Your Sqoop installation in the PATH and the one in the error message do not match.
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:/usr/local/sqoop/bin:$PATH
Error: /usr/local/sqoop-1.4.5.bin__hadoop-2.0.4-alpha/bin/../../hadoop-mapreduce does not exist! Please set $HADOOP_MAPRED_HOME to the root of your Hadoop MapReduce installation.
1. Your PATH says Sqoop is located at /usr/local/sqoop, but your error points to /usr/local/sqoop-1.4.5.... Set the correct locations for the Sqoop and Hadoop homes.
2. Export HADOOP_MAPRED_HOME in sqoop.sh (found in $SQOOP_HOME/bin). Now execute the sqoop import command again.
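A minimal sketch of point 2, assuming the Homebrew layout shown in the .bashrc above (verify that the directory actually exists on your machine before relying on it):
# Sqoop's launcher looks for the MapReduce jars under $HADOOP_MAPRED_HOME
export HADOOP_MAPRED_HOME=/usr/local/Cellar/hadoop/2.6.0/libexec/share/hadoop/mapreduce
# put the same export in your .bashrc or near the top of the Sqoop launcher script mentioned above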
