How to find the default location of the Hadoop home path - hadoop

I have installed Cloudera Hadoop 2.0.0 CDH4 and did not set any home path during the install. It is working fine now, but when I run the JPS command it shows only the jps process.
So I tried to start Hadoop, but I am unable to find the location where it is actually installed. Can anyone please help me find the default location?
Is there any command to find the exact location of Hadoop on my system?
Please help me with this issue.
Thanks,
Anbu k

Yes, you could try:
find / -name hadoop
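If the hadoop launcher happens to be on your PATH already, a couple of quicker checks (these are standard shell tools, not CDH-specific):
which hadoop                      # prints the launcher script, e.g. /usr/bin/hadoop
readlink -f "$(which hadoop)"     # resolves symlinks to the real install location
hadoop classpath                  # lists the jar and conf directories the install uses
On CDH package installs the home directory is typically /usr/lib/hadoop.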

Related

Hadoop setup and configuration

C:\hadoop-2.3.0\bin>hadoop
The system cannot find the path specified.
Error: JAVA_HOME is incorrectly set.
Please update C:\hadoop-2.3.0\conf\hadoop-env.c
Usage: hadoop [--config confdir] COMMAND
I am facing the above error in my Hadoop configuration. Can anyone please help resolve the issue?
If this is for learning purposes, you will find plenty of blog posts on setting up Hadoop on Windows.
If your primary objective is to learn Hadoop, then I suggest you download VMware Player and set up Hadoop on Ubuntu, or download a CDH version from the Cloudera website to start your learning.
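If you do want to fix the Windows setup itself, the error usually means JAVA_HOME points at a missing path or one containing spaces. A minimal sketch, assuming the JDK lives at C:\Java\jdk1.7.0_79 and the env file is hadoop-env.cmd (adjust both to your layout):
rem hadoop-env.cmd -- the JDK path below is an assumption, change it to your install
set JAVA_HOME=C:\Java\jdk1.7.0_79
rem avoid paths with spaces such as "C:\Program Files"; use the 8.3 short name if you must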

Spark installed but no command 'hdfs' or 'hadoop' found

I am a new PySpark user.
I just downloaded and installed a Spark cluster ("spark-2.0.2-bin-hadoop2.7.tgz").
After installation I wanted to access the file system (upload local files to the cluster). But when I type hadoop or hdfs on the command line it says "no command found".
Do I need to install Hadoop/HDFS separately (I thought it was built into Spark)?
Thanks in advance.
You have to install Hadoop first to access HDFS.
Follow this: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Choose the latest version of Hadoop from the Apache site.
Once you are done with the Hadoop setup, download Spark from http://d3kbcqa49mib13.cloudfront.net/spark-2.0.2-bin-hadoop2.7.tgz and extract the files. Set JAVA_HOME and HADOOP_HOME in spark-env.sh.
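For example, in conf/spark-env.sh something like this works (the exact paths are assumptions; point them at wherever you installed Java and Hadoop):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop   # lets Spark pick up your HDFS configuration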
You don't have hdfs or hadoop on your PATH, which is why you are getting the "no command found" message.
If you run \yourpath\hadoop-2.7.1\bin\hdfs dfs -ls / it should work and show the root content.
But you can add your hadoop/bin (hdfs, hadoop, ...) commands to your PATH with something like this:
export PATH=$PATH:$HADOOP_HOME/bin
where HADOOP_HOME is an environment variable pointing to your Hadoop installation folder (downloading and installing Hadoop is required first).
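To make that persistent, a minimal sketch for ~/.bashrc (the install path is an assumption):
export HADOOP_HOME=/usr/local/hadoop              # wherever you extracted Hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then reload the shell configuration and verify:
source ~/.bashrc
hdfs dfs -ls /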

Hadoop namenode format not working

I've been trying to install Hadoop 2.7.0 on Ubuntu, but when I enter the hadoop namenode -format command I get the following message:
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
I've triple-checked all the configuration files but I can't seem to find where the problem is.
I followed this tutorial: http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
Can anyone please tell me why this is not working?
You have to add hadoop-hdfs-2.7.0.jar to your hadoop classpath. Just add these lines in $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export HADOOP_HOME=/path/to/hadoop
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.7.0.jar
Now stop all Hadoop processes and try to format the namenode again. Post the error if you get one.
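One way to confirm the jar actually ended up on the classpath before reformatting (the paths assume the snippet above was applied):
hadoop classpath | tr ':' '\n' | grep hadoop-hdfs
$HADOOP_HOME/sbin/stop-dfs.sh             # stop the HDFS daemons
$HADOOP_HOME/bin/hdfs namenode -format    # the older 'hadoop namenode -format' form also works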

sudo jps not locating MapReduce jobtracker

I am running CDH5 on Ubuntu. I have installed everything I need, but when I type sudo jps, the jobtracker is not displayed. Here's my configuration in mapred-site.xml:
mapred.job.tracker.http.address: localhost{50030|50020}
Can someone please explain why this is happening? How can it be fixed?
What is your Hadoop version? If it is 0.20.2+ then there is no need to configure the jobtracker, as the separate jobtracker functionality has been removed. You can find it at localhost:8088.
Remove the configuration and restart the node.
If it is an older version then try to manually start it using:
$ hadoop jobtracker
If it does not start, post the error log here.
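On CDH5 specifically, a quick way to check is to look for the YARN daemons instead; the service name below is an assumption based on the standard CDH5 packages:
sudo jps | grep -iE 'resourcemanager|nodemanager'
sudo service hadoop-yarn-resourcemanager status
The ResourceManager web UI at localhost:8088 replaces the old jobtracker page.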

HADOOP_HOME and hadoop streaming

Hi, I am trying to run Hadoop on a server that has Hadoop installed, but I have no idea which directory Hadoop resides in. The server was configured by the server admin.
In order to load Hadoop I use the use command from the dotkit package.
There may be several solutions, but I want to know where the Hadoop package is installed, how to set up the $HADOOP_HOME variable, and how to properly run a Hadoop streaming job, such as $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/mapred/contrib/streaming/hadoop-streaming.jar, aka http://wiki.apache.org/hadoop/HadoopStreaming.
Thanks! Any help would be greatly appreciated!
If you're using a cloudera distribution then it's most probably in /usr/lib/hadoop, otherwise it could be anywhere (at the discretion of your system admin).
There are some tricks you can use to try and locate it (a combined sketch follows this list):
locate hadoop-env.sh (assuming that locate has been installed and updatedb has been run recently)
If the machine you're running this on is running a hadoop service (such as data node, job tracker, task tracker, name node), then you can perform a process list and grep for the hadoop command: ps axww | grep hadoop
Failing the above two, look for the hadoop root directory in some common locations such as: /usr/lib, /usr/local, /opt
Failing all this, and assuming your current user has the permissions: find / -name hadoop-env.sh
If you installed with RPM then it's most probably in /etc/hadoop.
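The lookups above, combined into a quick sketch (assumes locate/updatedb are available and your user can read the directories):
locate hadoop-env.sh
ps axww | grep '[h]adoop'        # the [h] keeps grep itself out of the results
ls -d /usr/lib/hadoop* /usr/local/hadoop* /opt/hadoop* 2>/dev/null
find / -name hadoop-env.sh 2>/dev/null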
Why don't you try:
echo $HADOOP_HOME
Obviously, the above environment variable has to be set before you can run Hadoop executables from anywhere on the box.
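Once HADOOP_HOME is known, a streaming invocation looks roughly like this; the jar location varies between distributions and mapper.py/reducer.py are placeholder scripts:
$HADOOP_HOME/bin/hadoop jar \
  $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
  -input /user/me/input \
  -output /user/me/output \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py -file reducer.py
The -file options ship the local scripts to the cluster, so they must be executable and readable from the directory you launch the job in.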
