HDP: unable to start Phoenix sqlline.py - hadoop

I am using Sandbox HDP 2.2
I did a yum install phoenix (version is 4.2)
But when I run these:
./sqlline.py localhost:2181
./sqlline.py localhost
./sqlline.py sandbox.hortonworks.com:2181
./sqlline.py sandbox.hortonworks.com
I got the error:
15/07/03 08:26:31 ERROR client.ConnectionManager$HConnectionImplementation:
The node /hbase is not in ZooKeeper. It should have been written by the master.
Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch
with the one configured in the master.
I tried to run:
./sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure
But it "hangs" - after 20 minutes still no response
I have this in my /etc/hbase/conf/hbase-site.xml:
<property>
<name>hbase.zookeeper.quorum</name>
<value>sandbox.hortonworks.com</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-unsecure</value>
</property>

You have to create symlinks, in the directory where sqlline.py lives, to the two .xml files provided by HBase/Hadoop.
$ pwd
/usr/hdp/2.2.8.0-3150/phoenix/bin
$ ll | grep xml
lrwxrwxrwx 1 root root 29 Dec 16 13:34 core-site.xml -> /etc/hbase/conf/core-site.xml
lrwxrwxrwx 1 root root 30 Dec 16 13:34 hbase-site.xml -> /etc/hbase/conf/hbase-site.xml
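If those symlinks are missing, creating them by hand should look roughly like this (paths taken from the listing above; adjust for your HDP version):
cd /usr/hdp/2.2.8.0-3150/phoenix/bin
ln -s /etc/hbase/conf/core-site.xml core-site.xml
ln -s /etc/hbase/conf/hbase-site.xml hbase-site.xml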
With those in place, and with $JAVA_HOME set and java on your $PATH, you can now run sqlline.py:
$ ./sqlline.py localhost:2181/hbase-unsecure

You need to specify the root znode "/hbase-unsecure" in the connection string because by default Phoenix tries to connect to /hbase. Try this:
./sqlline.py localhost:2181:/hbase-unsecure
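If you are not sure which parent znode your HBase registered, one way to check (assuming the hbase client is on your PATH) is to list the ZooKeeper root and look for /hbase-unsecure:
hbase zkcli ls /
# expect output along the lines of: [hbase-unsecure, zookeeper]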

Related

Hadoop Edge HDFS points to local FS

I have set up my Hadoop cluster with 1 NameNode and 2 DataNodes and everything works perfectly :)
Now I want to add a Hadoop Edge node (aka Hadoop Gateway). I followed the instructions here and finally ran:
hadoop fs -ls /
But unfortunately, where I expected to see my HDFS content, I see my local FS instead:
Found 22 items
-rw-r--r-- 1 root root 0 2017-03-30 16:44 /autorelabel
dr-xr-xr-x - root root 20480 2017-03-30 16:49 /bin
...
drwxr-xr-x - root root 20480 2016-07-08 17:31 /home
I think my core-site.xml is configured as needed, with this specific property:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopnodemaster1:8020/</value>
</property>
hadoopmaster1 is my namenode and is reachable.
I don't understand why I see my local FS and not my HDFS. Thank you :)
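One quick check, assuming the hadoop client on the edge node is picking up your core-site.xml, is to ask it which default filesystem it resolves:
hdfs getconf -confKey fs.defaultFS
# should print hdfs://hadoopnodemaster1:8020/ rather than file:///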

I tried to start up HBase

I tried to run start-hbase.sh, but...
dream@dream-VirtualBox:/usr/local/hbase/bin$ cat ~/.bashrc | tail -n 2
export PATH=$PATH:/usr/local/hadoop/sbin/:/usr/local/hadoop/bin/:/usr/local/hbase/bin/:/usr/local/mahout/bin/
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
dream@dream-VirtualBox:/usr/local/hbase/bin$ source ~/.bashrc
dream@dream-VirtualBox:/usr/local/hbase/bin$ sh -x ./bin/start-hbase.sh
...(skip)...
./start-hbase.sh: 53: [: unexpected operator
+ /usr/local/hbase/bin/hbase-daemons.sh --config /usr/local/hbase/bin/../conf start zookeeper
Error: Could not find or load main class .usr.lib.jvm.java-7-oracle..bin.java
+ /usr/local/hbase/bin/hbase-daemon.sh --config /usr/local/hbase/bin/../conf start master
starting master, logging to /usr/local/hbase/bin/../logs/hbase-dream-master-dream-VirtualBox.out
Error: Could not find or load main class .usr.lib.jvm.java-7-oracle..bin.java
+ /usr/local/hbase/bin/hbase-daemons.sh --config /usr/local/hbase/bin/../conf --hosts /usr/local/hbase/bin/../conf/regionservers start regionserver
starting regionserver, logging to /usr/local/hbase/bin/../logs/hbase-dream-1-regionserver-dream-VirtualBox.out
Error: Could not find or load main class .usr.lib.jvm.java-7-oracle..bin.java
+ /usr/local/hbase/bin/hbase-daemons.sh --config /usr/local/hbase/bin/../conf --hosts /usr/local/hbase/bin/../conf/backup-masters start master-backup
I observed that start-hbase.sh tries to run /usr/local/hbase/bin/hbase org.apache.hadoop.hbase.zookeeper.ZKServerTool, and that this fails.
I am not sure why hbase always throws this exception.
dream@dream-VirtualBox:/usr/local/hbase$ /usr/local/hbase/bin/hbase org.apache.hadoop.hbase.zookeeper.ZKServerTool
Error: Could not find or load main class .usr.lib.jvm.java-7-oracle..bin.java
dream@dream-VirtualBox:/usr/local/hbase$ ./bin/hbase shell
Error: Could not find or load main class .usr.lib.jvm.java-7-oracle..bin.java
But when I tried sudo, it seemed to work:
dream@dream-VirtualBox:/usr/local/hbase$ sudo ./bin/start-hbase.sh
starting master, logging to /usr/local/hbase/bin/../logs/hbase-root-master-dream-VirtualBox.out
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
dream@dream-VirtualBox:/usr/local/hbase$ jps
2869 NameNode
3540 NodeManager
3403 ResourceManager
3237 SecondaryNameNode
3031 DataNode
5666 Jps
dream@dream-VirtualBox:/usr/local/hbase$ sudo jps
5053 HQuorumPeer
2869 NameNode
3540 NodeManager
5857 Jps
3403 ResourceManager
3237 SecondaryNameNode
3031 DataNode
dream@dream-VirtualBox:/usr/local/hbase$ sudo ./bin/hbase shell
2015-08-10 15:41:04,136 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.1, rd0a115a7267f54e01c72c603ec53e91ec418292f, Tue Jun 23 14:44:07 PDT 2015
hbase(main):001:0>
My environment
Linux dream-VirtualBox 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Java-7-oracle#1.7.0_80
HBase-1.1.1
My HBase setting
conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///usr/local/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/hbase/zookeeper</value>
</property>
</configuration>
~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export PATH=$PATH:/usr/local/hadoop/sbin/:/usr/local/hadoop/bin/:/usr/local/hbase/bin/
Would you please give me any help?
Thanks.
First, I am not sure why the exec in /bin/hbase ended up behaving so oddly.
/bin/hbase:
exec "$JAVA" -Dproc_$COMMAND -XX:OnOutOfMemoryError="kill -9 %p" $HEAP_SETTINGS $HBASE_OPTS $CLASS "$@"
Which expanded to:
exec /usr/lib/jvm/java-7-oracle/bin/java -DXXXXXX /usr/lib/jvm/java-7-oracle//bin/java -Xmx1000m -DXXXXXX
I think I needed to delete the extra /usr/lib/jvm/java-7-oracle//bin/java.
I looked at lines 217-229 in the /bin/hbase script:
217 #If avail, add Hadoop to the CLASSPATH and to the JAVA_LIBRARY_PATH
218 # Allow this functionality to be disabled
219 if [ "$HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP" != "true" ] ; then
220 HADOOP_IN_PATH=$(PATH="${HADOOP_HOME:-${HADOOP_PREFIX}}/bin:$PATH" which hadoop 2>/dev/null)
221 if [ -f ${HADOOP_IN_PATH} ]; then
222 HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$CLASSPATH" ${HADOOP_IN_PATH} \
223 org.apache.hadoop.hbase.util.GetJavaProperty java.library.path 2>/dev/null)
224 if [ -n "$HADOOP_JAVA_LIBRARY_PATH" ]; then
225 JAVA_LIBRARY_PATH=$(append_path "${JAVA_LIBRARY_PATH}" "$HADOOP_JAVA_LIBRARY_PATH")
226 fi
227 CLASSPATH=$(append_path "${CLASSPATH}" `${HADOOP_IN_PATH} classpath 2>/dev/null`)
228 fi
229 fi
That block does extra work when hadoop is found on the PATH.
That explains why my user (dream) couldn't run /bin/hbase but root was fine.
So I removed the Hadoop directories from my PATH, and it seems to work.
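For reference, a sketch of what the trimmed PATH line in ~/.bashrc might look like after dropping the Hadoop directories (adjust to your own layout):
export PATH=$PATH:/usr/local/hbase/bin/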
dream@dream-VirtualBox:/usr/local/hbase/bin$ ./start-hbase.sh
starting master, logging to /usr/local/hbase/bin/../logs/hbase-dream-master-dream-VirtualBox.out
dream@dream-VirtualBox:/usr/local/hbase/bin$ jps
22956 Jps
2869 NameNode
3540 NodeManager
3403 ResourceManager
3237 SecondaryNameNode
22722 HMaster
3031 DataNode
dream@dream-VirtualBox:/usr/local/hbase/bin$ ./hbase shell
2015-08-10 23:33:44,016 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.1, rd0a115a7267f54e01c72c603ec53e91ec418292f, Tue Jun 23 14:44:07 PDT 2015
hbase(main):001:0>
Add JAVA_HOME in hbase-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
Add the given properties in hbase-site.xml:
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<!-- create the zookeeperDir directory with permission 755 -->
<value>/home/kishore/zookeeperDir</value>
</property>
Make sure your ZooKeeper runs on port 2181.
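A quick way to verify that ZooKeeper is actually answering on 2181 (assuming nc is installed) is the ruok four-letter command:
echo ruok | nc localhost 2181
# a healthy ZooKeeper replies: imok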
You ran the sh interpreter with the command
sh -x ./bin/start-hbase.sh
Use instead
./bin/start-hbase.sh
as you did in
sudo ./bin/start-hbase.sh
This automatically selects the script's interpreter, which may be different, as the first line of start-hbase.sh says:
#!/usr/bin/env bash
The difference between these two ways is explained here: https://askubuntu.com/questions/22910/what-is-the-difference-between-and-sh-to-run-a-script
This solved the problem I had with
bin/start-hbase.sh: 51: [: unexpected operator
I am using hbase-1.1.2 so the line may have changed.
The issue is with an already running ZK service.
The error message/logs in the screenshot you attached clearly state the problem:
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
I also faced the same problem, but when I stopped the ZK service everything worked well. jps then started listing the HMaster service.
I used Java 8 and HBase 2.2.0.
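If you hit the same port clash, it can help to first find out what is already holding 2181 before stopping it (the commands below assume lsof is installed and that the conflicting process is a standalone ZooKeeper on your PATH):
sudo lsof -i :2181   # see which process owns the port
zkServer.sh stop     # stop it if it is a standalone ZooKeeper you started earlier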

Hadoop - java.net.ConnectException: Connection refused

I want to connect to HDFS (on localhost) and I get this error:
Call From despubuntu-ThinkPad-E420/127.0.1.1 to localhost:54310 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I followed all the steps in other posts, but I could not solve my problem. I use Hadoop 2.7 and these are my configurations:
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/despubuntu/hadoop/name/data</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I ran /usr/local/hadoop/bin/hdfs namenode -format and
/usr/local/hadoop/sbin/start-all.sh
But when I type "jps" the result is:
10650 Jps
4162 Main
5255 NailgunRunner
20831 Launcher
I need help...
Make sure that DFS, which your core-site.xml sets to port 54310, is actually started. You can check with the jps command. You can start it with sbin/start-dfs.sh.
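You can also confirm that the NameNode RPC port from your core-site.xml is actually listening, for example:
jps | grep -E 'NameNode|DataNode'
netstat -tlnp | grep 54310   # or: ss -tlnp | grep 54310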
I guess that you didn't set up your Hadoop cluster correctly; please follow these steps:
Step 1: begin by setting up .bashrc:
vi $HOME/.bashrc
Put the following lines at the end of the file (change the Hadoop home to yours):
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
Step 2: edit hadoop-env.sh as follows:
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
Step 3: now create a directory and set the required ownership and permissions:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp
Step 4: edit core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
Step 5: edit mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
Step 6: edit hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Finally, format your HDFS (you need to do this the first time you set up a Hadoop cluster):
$ /usr/local/hadoop/bin/hadoop namenode -format
Hope this will help you.
I got the same issue. You should see the NameNode, DataNode, ResourceManager and NodeManager daemons running when you type jps. So just run start-all.sh; then all the daemons start running and you can access HDFS.
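For example (start-all.sh is deprecated in Hadoop 2.x but still present; the path follows the asker's install):
/usr/local/hadoop/sbin/start-all.sh   # or: start-dfs.sh followed by start-yarn.sh
jps   # NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager should now appear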
First check whether the Java processes are running by typing the jps command on the command line. When you run jps, the following processes must be running:
DataNode
NameNode
SecondaryNameNode
If these processes are not running, first start the NameNode using the following command:
start-dfs.sh
This worked for me and removed the error you stated.
I was getting a similar error. Upon checking I found that my namenode service was in the stopped state.
Check the status of the namenode: sudo status hadoop-hdfs-namenode
If it is not in the started/running state,
start the namenode service: sudo start hadoop-hdfs-namenode
Do keep in mind that it takes time before the namenode service becomes fully functional after a restart. It reads all the HDFS edits into memory. You can check the progress of this in /var/log/hadoop-hdfs/ using the command tail -f /var/log/hadoop-hdfs/{Latest log file}

Hadoop / Yarn (v0.23.3) Pseudo-Distributed Mode setup :: No job node

I just set up Hadoop/Yarn 2.x (specifically, v0.23.3) in Pseudo-Distributed mode.
I followed the instructions of a few blogs & websites which, more or less, provide the
same prescription for setting it up. I also followed the 3rd edition of O'Reilly's
Hadoop book (which ironically was the least helpful).
THE PROBLEM:
After running "start-dfs.sh" and then "start-yarn.sh", while all of the daemons
do start (as indicated by jps(1)), the Resource Manager web portal
(Here: http://localhost:8088/cluster/nodes) indicates 0 (zero) job-nodes in the
cluster. So while submitting the example/test Hadoop job indeed does get
scheduled, it pends forever because, I assume, the configuration doesn't see a
node to run it on.
Below are the steps I performed, including resultant configuration files.
Hopefully the community can help me out... (And thank you in advance.)
THE CONFIGURATION:
The following environment variables are set in both my and hadoop's UNIX account profiles: ~/.profile:
export HADOOP_HOME=/home/myself/APPS.d/APACHE_HADOOP.d/latest
# Note: /home/myself/APPS.d/APACHE_HADOOP.d/latest -> hadoop-0.23.3
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_INSTALL=${HADOOP_HOME}
export HADOOP_CLASSPATH=${HADOOP_HOME}/lib
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop/conf
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop/conf
export JAVA_HOME=/usr/lib/jvm/jre
hadoop$ java -version
java version "1.7.0_06-icedtea<br>
OpenJDK Runtime Environment (fedora-2.3.1.fc17.2-x86_64)<br>
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)<br>
# Although the above shows OpenJDK, the same problem happens with Sun's JRE/JDK.
The NAMENODE & DATANODE directories, also specified in etc/hadoop/conf/hdfs-site.xml:
/home/myself/APPS.d/APACHE_HADOOP.d/latest/YARN_DATA.d/HDFS.d/DATANODE.d/
/home/myself/APPS.d/APACHE_HADOOP.d/latest/YARN_DATA.d/HDFS.d/NAMENODE.d/
Next, the various XML configuration files (again, YARN/MRv2/v0.23.3 here):
hadoop$ pwd; ls -l
/home/myself/APPS.d/APACHE_HADOOP.d/latest/etc/hadoop/conf
lrwxrwxrwx 1 hadoop hadoop 16 Sep 20 13:14 core-site.xml -> ../core-site.xml
lrwxrwxrwx 1 hadoop hadoop 16 Sep 20 13:14 hdfs-site.xml -> ../hdfs-site.xml
lrwxrwxrwx 1 hadoop hadoop 18 Sep 20 13:14 httpfs-site.xml -> ../httpfs-site.xml
lrwxrwxrwx 1 hadoop hadoop 18 Sep 20 13:14 mapred-site.xml -> ../mapred-site.xml
-rw-rw-r-- 1 hadoop hadoop 10 Sep 20 15:36 slaves
lrwxrwxrwx 1 hadoop hadoop 16 Sep 20 13:14 yarn-site.xml -> ../yarn-site.xml
core-site.xml
<?xml version="1.0"?>
<!-- core-site.xml -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<!-- mapred-site.xml -->
<configuration>
<!-- Same problem whether this (legacy) stanza is included or not. -->
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
hdfs-site.xml
<!-- hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/myself/APPS.d/APACHE_HADOOP.d/YARN_DATA.d/HDFS.d/NAMENODE.d</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/myself/APPS.d/APACHE_HADOOP.d/YARN_DATA.d/HDFS.d/DATANODE.d</value>
</property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<!-- yarn-site.xml -->
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/home/myself/APPS.d/APACHE_HADOOP.d/YARN_DATA.d/TEMP.d</value>
</property>
</configuration>
etc/hadoop/conf/slaves
localhost
# Community/friends, is this entry correct/needed for my pseudo-dist mode?
Miscellaneous wrap-up notes:
(1) As you may have gleaned from above, all files/directories are owned
by the 'hadoop' UNIX user; there is a matching hadoop:hadoop UNIX user
and group.
(2) The following command was run after the NAMENODE & DATANODE directories
(listed above) were created (and whose paths were entered into
hdfs-site.xml):
hadoop$ hadoop namenode -format
(3) Next, I ran "start-dfs.sh", then "start-yarn.sh".
Here is jps(1) output:
hadoop@e6510$ jps
21979 DataNode
22253 ResourceManager
22384 NodeManager
22156 SecondaryNameNode
21829 NameNode
22742 Jps
Thank you!
After much toil on this problem without success (and trust me, I tried it all), I installed
hadoop using a different solution. Whereas above I downloaded a gzip/tar ball
of the hadoop distribution (again v0.23.3) from one of the download mirrors, this
time I used the Cloudera CDH distribution of RPM packages, which I installed via
their YUM repos. In hopes that this will help someone, here are the detailed steps.
Step-1:
For Hadoop 0.20.x (MapReduce version 1):
# rpm -Uvh http://archive.cloudera.com/redhat/6/x86_64/cdh/cdh3-repository-1.0-1.noarch.rpm
# rpm --import http://archive.cloudera.com/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
# yum install hadoop-0.20-conf-pseudo
-or-
For Hadoop 0.23.x (MapReduce version 2):
# rpm -Uvh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.noarch.rpm
# rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
# yum install hadoop-conf-pseudo
In both cases above, installing that "pseudo" package (which stands for "pseudo-distributed
Hadoop" mode) will conveniently trigger the installation of all the other necessary packages you'll need (via dependency resolution).
Step-2:
Install Sun/Oracle's Java JRE (if you haven't already done so). You can
install it via the RPM that they provide, or the gzip/tar ball portable
version. It doesn't matter which as long as you set and export the "JAVA_HOME"
environment appropriately, and ensure ${JAVA_HOME}/bin/java is in your path.
# echo $JAVA_HOME; which java
/home/myself/APPS.d/JAVA-JRE.d/jdk1.7.0_07
/home/myself/APPS.d/JAVA-JRE.d/jdk1.7.0_07/bin/java
Note: I actually create a symlink called "latest" and point/re-point it to the JAVA
version specific directory whenever I update the JAVA. I was explicit above for
the reader's understanding.
Step-3: Format hdfs as the "hdfs" Unix user (created during "yum install" above).
# sudo su hdfs -c "hadoop namenode -format"
Step-4:
Manually start the hadoop daemons.
for file in /etc/init.d/hadoop*
do
  "${file}" start
done
Step-5:
Check to see if things are working. The following is for MapReduce v1
(It's not that much different for MapReduce v2 at this superficial level).
root# jps
23104 DataNode
23469 TaskTracker
23361 SecondaryNameNode
23187 JobTracker
23267 NameNode
24754 Jps
# Do the next commands as yourself (not as "root").
myself$ hadoop fs -mkdir /foo
myself$ hadoop fs -rmr /foo
myself$ hadoop jar /usr/lib/hadoop-0.20/hadoop-0.20.2-cdh3u5-examples.jar pi 2 100000
I hope this helped!
Noel,
I followed the steps in this tutorial http://www.thecloudavenue.com/search?q=0.23 the other day and managed to set up a small cluster of 3 CentOS 6.3 machines.

Adding Data Node to hadoop cluster

When I start hadoopnode1 using start-all.sh, it successfully starts the services on the master and slave (see the jps command output for the slave). But when I try to see the live nodes in the admin screen, the slave node doesn't show up. The hadoop fs -ls / command runs perfectly from the master, but from the slave it shows an error message:
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ hadoop fs -ls /
12/05/28 01:14:20 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 0 time(s).
12/05/28 01:14:21 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 1 time(s).
12/05/28 01:14:22 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 2 time(s).
12/05/28 01:14:23 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 3 time(s).
.
.
.
12/05/28 01:14:29 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 10 time(s).
It looks like the slave (hadoopnode2) is not able to find/connect to the master node (hadoopnode1).
Please point out what I am missing.
Here are the settings from the master and slave nodes -
P.S. - Master and slave are running the same version of Linux and Hadoop, and SSH is working perfectly,
because I can start the slave from the master node.
Also, the settings for core-site.xml, hdfs-site.xml and mapred-site.xml are the same on the master (hadoopnode1) and the slave (hadoopnode2).
OS - Ubuntu 10
Hadoop Version -
hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
-- Master (hadoopnode1)
hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode1 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux
hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ jps
9923 Jps
7555 NameNode
8133 TaskTracker
7897 SecondaryNameNode
7728 DataNode
7971 JobTracker
masters -> hadoopnode1
slaves -> hadoopnode1
hadoopnode2
--Slave (hadoopnode2)
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode2 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ jps
1959 DataNode
2631 Jps
2108 TaskTracker
masters - hadoopnode1
core-site.xml
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/tmp/hadoop/hadoop-${user.name}</value>
<description>A base for other temp directories</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopnode1:8020</value>
<description>The name of the default file system</description>
</property>
</configuration>
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoopnode1:8021</value>
<description>The host and port that the MapReduce job tracker runs at.If "local", then jobs are run in process as a single map</description>
</property>
</configuration>
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication</description>
</property>
</configuration>
Check your services with sudo jps.
If the expected daemons are not displayed, here is what you need to do:
1. Restart Hadoop
2. Go to /app/hadoop/tmp/dfs/name/current
3. Open VERSION (i.e. by vim VERSION)
4. Record the namespaceID
5. Go to /app/hadoop/tmp/dfs/data/current
6. Open VERSION (i.e. by vim VERSION)
7. Replace the namespaceID with the namespaceID you recorded in step 4.
This should work. Best of luck.
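A quick way to compare the two IDs before and after editing (paths as in the steps above):
grep namespaceID /app/hadoop/tmp/dfs/name/current/VERSION
grep namespaceID /app/hadoop/tmp/dfs/data/current/VERSION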
In the web GUI you can see the number of nodes your cluster has. If you see fewer than you expected, make sure that the /etc/hosts file on the master contains only these hosts (for a 2-node cluster):
192.168.0.1 master
192.168.0.2 slave
If you see any 127.0.1.1 entries, comment them out, because Hadoop will see them first as the host(s).
Check the namenode and datanode logs (they should be in $HADOOP_HOME/logs/). The most likely issue is that the namenode and datanode IDs don't match. Delete the hadoop.tmp.dir from all nodes and format the namenode ($HADOOP_HOME/bin/hadoop namenode -format) again, then try again.
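A sketch of those two steps, using the hadoop.tmp.dir from the asker's core-site.xml (this wipes the HDFS data, so only do it on a cluster you can afford to re-populate):
# on every node
rm -rf /var/tmp/hadoop/hadoop-*
# then on the master
$HADOOP_HOME/bin/hadoop namenode -format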
I think the problem is on slave 2: slave 2 should listen on the same port, 8020, instead of listening on 8021.
Add the new node's hostname to the slaves file and start the data node & task tracker on the new node.
Indeed there are two errors in your case.
Can't connect to the hadoop master node from the slave:
That's a network problem. Test it: curl 192.168.1.120:8020
Normal response: curl: (52) Empty reply from server
In my case, I got a "host not found" error, so just take a look at your firewall settings.
Data node down:
That's a hadoop problem. Raze2dust's method is good. Here is another way, if you see an Incompatible namespaceIDs error in your log:
stop hadoop, edit the value of namespaceID in /current/VERSION to match the value of the current namenode, then start hadoop.
You can always check available datanodes using: hadoop fsck /
