Hadoop master cannot start slave with different $HADOOP_HOME - hadoop

On the master, $HADOOP_HOME is /home/a/hadoop; on the slave it is /home/b/hadoop.
On the master, when I run start-all.sh, the master name node starts successfully, but it fails to start the slave's data node with the following message:
b#192.068.0.2: bash: line 0: cd: /home/b/hadoop/libexec/..: No such file or directory
b#192.068.0.2: bash: /home/b/hadoop/bin/hadoop-daemon.sh: No such file or directory
Any idea how to specify the slave's $HADOOP_HOME in the master's configuration?

I don't know of a way to configure different home directories for the various slaves from the master, but the Hadoop FAQ says that the Hadoop framework does not require ssh and that the DataNode and TaskTracker daemons can be started manually on each node.
I would suggest writing your own startup scripts that take into account the specific environments of your nodes. However, make sure to include all the slaves in the master's slaves file. This seems to be necessary; the heartbeats alone are not enough for the master to add slaves.
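As a starting point, here is a minimal dry-run sketch of starting the slave daemons by hand with each node's own path (the hostname, user, and paths below are assumptions for illustration, not values from your cluster):

```shell
# Start the daemons on a slave using that node's own HADOOP_HOME.
# Printed as a dry run; drop the echo to actually run the commands over ssh.
start_slave_daemons() {
  host="$1"; remote_home="$2"
  echo ssh "$host" "$remote_home/bin/hadoop-daemon.sh start datanode"
  echo ssh "$host" "$remote_home/bin/hadoop-daemon.sh start tasktracker"
}

start_slave_daemons b@192.168.0.2 /home/b/hadoop
```

Keeping a per-host table of HADOOP_HOME values in a script like this sidesteps the master's assumption that every node shares its own installation path.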

Related

Cannot connect slave1:8088 in hadoop 2.7.2

I am new to Hadoop and have installed Hadoop 2.7.2 on two machines, master and slave1, following this tutorial. It was not mentioned in the tutorial, but I have also edited the JAVA_HOME and HADOOP_CONF_DIR variables in hadoop-env.sh. In the end I have Hadoop installed on both machines. On master, NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager are running, and on slave1, DataNode and NodeManager are running.
I am able to go to master:8088 in the browser, but when I go to http://master:8088/cluster/nodes, only the master node is listed there. I am not able to go to isci17:8088, and it is not a live node. Why could that be?
Port 8088 is the ResourceManager web UI port, so if the ResourceManager is running on the master you probably won't have it on the slave.
You should also be able to reach the NameNode web UI on port 50070 (e.g. http://master:50070/) to see status, and the MapReduce JobHistory Server web UI at http://hostname:19888/.
If you have access to a terminal session, you can run the following command on each server as a root/sudo user to see which ports are listening:
sudo lsof -i tcp | grep -i LISTEN
You can also run Hadoop CLI commands that will give you information. To check Hadoop's ports in a terminal session:
hdfs portmap
Other health checks on the command line:
hdfs classpath
hdfs getconf -namenodes
hdfs dfsadmin -report -live
hdfs dfsadmin -report -dead
hdfs dfsadmin -printTopology
Depending on whether the hadoop CLI command works out of the box, you might have to find the executable and run ./hdfs. Also, depending on your distro/version, you might have to replace the command hdfs with hadoop.
If you want to see your cluster's configuration, check your /etc/hadoop/conf folder along with /etc/hadoop/hive. You will find about 5-10 *-site.xml files. These configuration files contain your cluster's configuration, including the hostnames and ports.
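As a quick way to see which hostnames and ports those files configure, something like this sketch works (the default conf directory path is an assumption; adjust it for your distro):

```shell
# List every host:port value configured in the *-site.xml files.
list_hadoop_ports() {
  conf_dir="${1:-/etc/hadoop/conf}"
  grep -ho '<value>[^<]*:[0-9]*</value>' "$conf_dir"/*-site.xml 2>/dev/null \
    | sed 's/<[^>]*>//g' | sort -u
}

list_hadoop_ports
```

Comparing that output against the ports actually listening (from the lsof command above) quickly shows which daemons failed to bind.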

Hadoop datanode services is not starting in the slaves in hadoop

I am trying to configure a hadoop-1.0.3 multi-node cluster with one master and two slaves on my laptop using VMware Workstation.
When I run start-all.sh from the master, all daemon processes run on the master node (namenode, datanode, tasktracker, jobtracker, secondarynamenode), but the datanode and tasktracker do not start on the slave nodes. Passwordless ssh is enabled, and I can ssh to both master and slaves from my master node without a password.
Please help me resolve this.
Stop the cluster.
If you have specifically defined a tmp directory location in core-site.xml, then remove all files under that directory.
If you have specifically defined data node and name node directories in hdfs-site.xml, then delete all the files under those directories.
If you have not defined anything in core-site.xml or hdfs-site.xml, then please remove all the files under /tmp/hadoop-<your-hadoop-username>.
Format the namenode.
It should work!
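The steps above can be sketched as a short dry-run script (the directories are assumptions; substitute the ones from your core-site.xml / hdfs-site.xml, and drop the echo to actually run it, keeping in mind that the format step destroys all HDFS data):

```shell
# Dry run of the reset sequence: stop the cluster, wipe tmp/data dirs, reformat.
hadoop_reset_dry_run() {
  hadoop_home="${HADOOP_HOME:-/home/hduser/hadoop}"
  echo "$hadoop_home/bin/stop-all.sh"
  echo "rm -rf /app/hadoop/tmp/*"    # hadoop.tmp.dir, if you set one in core-site.xml
  echo "$hadoop_home/bin/hadoop namenode -format"
}

hadoop_reset_dry_run
```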

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student, interested in Hadoop and started to explore it recently.
I tried adding an additional DataNode in the pseudo-distributed mode but failed.
I am following the Yahoo developer tutorial and so the version of Hadoop I am using is hadoop-0.18.0
I tried to start it up using two methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2 with all the configuration files in that directory, but when the script was run it gave the error:
Usage: Java DataNode [-rollback]
Any idea what the error means? The log files are created, but the DataNode did not start.
Method 2 (link)
Basically I duplicated the conf folder to a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this is the first DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop start/stop scripts use /tmp as the default directory for storing the PIDs of already started daemons. In your situation, when you start the second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already started.
The plain solution is to set the HADOOP_PID_DIR environment variable to something other than /tmp. Also, do not forget to update all the network port numbers in conf2.
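A minimal sketch of the plain solution (the directory below and the commented restart command are assumptions to adapt to your setup):

```shell
# Give the second datanode its own PID directory so the startup script
# does not mistake the first daemon's PID file for its own.
export HADOOP_PID_DIR="$HOME/hadoop-pids-dn2"
mkdir -p "$HADOOP_PID_DIR"
# Then start it against the second config (the ports in conf2 must differ
# from the first datanode's), e.g.:
#   $HADOOP_HOME/bin/hadoop-daemon.sh --config "$HADOOP_HOME/conf2" start datanode
```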
The smart solution is to start a second VM with a Hadoop environment and join the two in a single cluster. That is the way Hadoop is intended to be used.

Path on various slave nodes

I have installed hadoop on 3 nodes: 1 master and 2 slave nodes.
The master node and one of the slave nodes have the same hadoop path, i.e. /home/hduser/hadoop,
but on the other slave node it is different, i.e. /usr/hadoop.
So when running ./start-all.sh from the master, the namenode and jobtracker started, and the datanode started on the slave that has the same hadoop path as the master node, but the other slave node gives an error like:
ngs-dell: bash: line 0: cd: /home/hduser/hadoop/libexec/..: No such file or directory
which means it is searching on the same path as the master, but that node has a different path.
Please tell me how to solve this issue.
And one more doubt: is it compulsory that all hadoop nodes (master & slave) have the same username? In my case it is hduser, and if I change it on one node of the hadoop cluster it gives me an error.
I think you may not have changed the hadoop.tmp.dir setting in core-site.xml on the slave node.
You can check the answer in this post.
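For reference, a hadoop.tmp.dir entry in the slave's core-site.xml would look something like this (the path is an assumption for illustration; it must exist on that node and be writable by the hadoop user):

```xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/hadoop/tmp</value>
</property>
```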

DataNode doesn't start in one of the slaves

I am trying to configure Hadoop with 5 slaves. After I run start-dfs.sh on the master, there is one slave node on which the DataNode doesn't run. I tried looking for some difference in the configuration files on that node, but I didn't find anything.
There WAS a difference in the configuration files! In core-site.xml, the hadoop.tmp.dir variable was set to an invalid directory, so it couldn't be created when the DataNode was started. Lesson learned: look in the logs. (Thanks Chris)
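A quick way to act on that lesson on the failing slave is to grep the datanode log for the failure (the log directory and filename pattern below are assumptions matching typical Hadoop layouts):

```shell
# Show recent errors/exceptions from the datanode log on this node.
check_dn_log() {
  log_dir="${1:-$HADOOP_HOME/logs}"
  grep -iE 'error|exception' "$log_dir"/hadoop-*-datanode-*.log 2>/dev/null | tail -n 20
}

check_dn_log
```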
