DataNode doesn't start on one of the slaves - Hadoop

I am trying to configure Hadoop with 5 slaves. After I run start-dfs.sh on the master, exactly one slave node fails to run the DataNode. I tried looking for some difference in the configuration files on that node, but I didn't find anything.

There WAS a difference in the configuration files! In core-site.xml the hadoop.tmp.dir property was set to an invalid directory, so it couldn't be created when the DataNode started. Lesson learned: look in the logs. (Thanks, Chris)
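For reference, a minimal sketch of what the corrected entry could look like; the /home/hduser/hadoop-tmp path and the fs.default.name URI are placeholder assumptions, not the actual values from my cluster:
mkdir -p /home/hduser/hadoop-tmp      # must be creatable and writable by the hadoop user
cat > $HADOOP_HOME/conf/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/hadoop-tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
</configuration>
EOF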

Related

Hadoop: swap DataNode & NameNode without losing any HDFS data

I have a cluster of 5 machines:
1 big NameNode
4 standard DataNodes
I want to swap my current NameNode with a DataNode without losing the data stored in HDFS, so my cluster would become:
1 standard NameNode
3 standard DataNodes
1 big DataNode
Does anyone know a simple way to do that?
Thank you very much.
Decommission the datanode on the machine where the namenode will be moved.
Stop the cluster.
Create a tar of dfs.name.dir from the current namenode.
Copy all Hadoop config files from the current NN to the target NN.
Replace the name/IP of the target namenode by modifying core-site.xml.
Restore the tarball of dfs.name.dir. Make sure the full path is the same.
Now start the cluster, bringing up the new namenode and one fewer datanode.
Verify that everything is working perfectly.
Add the old namenode back as a datanode by configuring it as one.
I would suggest uninstalling and then reinstalling Hadoop on both nodes so that the previous configuration does not cause any problems.
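Roughly, steps 3-6 could look like the following shell sketch; the hostname new-nn and the /data/dfs/name path are placeholder assumptions, and the restore path must match dfs.name.dir exactly:
# On the current namenode, after decommissioning and stopping the cluster:
tar czf name-dir.tar.gz -C /data/dfs/name .
scp name-dir.tar.gz $HADOOP_HOME/conf/* new-nn:/tmp/
# On the target namenode: restore the metadata at the identical full path,
# then edit core-site.xml so fs.default.name points at new-nn instead.
mkdir -p /data/dfs/name
tar xzf /tmp/name-dir.tar.gz -C /data/dfs/name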

Hadoop datanode service is not starting on the slaves in Hadoop

I am trying to configure a hadoop-1.0.3 multi-node cluster with one master and two slaves on my laptop using VMware Workstation.
When I ran start-all.sh from the master, all daemon processes were running on the master node (namenode, datanode, tasktracker, jobtracker, secondarynamenode), but the DataNode and TaskTracker are not starting on the slave nodes. Passwordless SSH is enabled and I can ssh to both master and slaves from my master node without a password.
Please help me resolve this.
Stop the cluster.
If you have specifically defined a tmp directory location in core-site.xml, then remove all files under that directory.
If you have specifically defined datanode and namenode directories in hdfs-site.xml, then delete all the files under those directories.
If you have not defined anything in core-site.xml or hdfs-site.xml, then please remove all the files under /tmp/hadoop-*nameofyourhadoopuser.
Format the namenode.
It should work!
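A minimal sketch of that cleanup, assuming nothing was configured (the default /tmp/hadoop-* case); if you set explicit directories, clear those instead, and note that formatting erases all HDFS metadata:
stop-all.sh
rm -rf /tmp/hadoop-${USER}/*     # default storage location when nothing is configured
hadoop namenode -format          # destructive: re-creates the namenode's storage
start-all.sh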

CDH4: Add a new node to an existing cluster

I have successfully created a Hadoop cluster with CDH4 on Ubuntu, with one master (master) and one slave (slave1). Now I want to add one more node. For this I just cloned slave1 to create slave2 and updated the hosts and SSH configuration accordingly. Then I updated the conf/slaves file with all datanode DNS names on all nodes and restarted everything. But it's not detecting the new datanode; it only shows the old one, slave1, not slave2. Can anyone please help me with this?
I have used cdh4-repository_1.0_all.deb
@user2009755, you need to create the masters and slaves files only on the master. And in the configuration files in $HADOOP_HOME/etc/hadoop, make the necessary changes to the URI pointing to the master node. NOTE: Try formatting the namenode and deleting the tmp files (usually /tmp/*), but if you changed the location in core-site.xml, clear that directory on all nodes and start all the daemons. It worked for me.
There are many possible reasons:
Have you changed the dfs.replication value to 3 in conf/hdfs-site.xml?
Check on the master with the command hduser@master:~$ ssh slave. It should show the slave's terminal; if not, then execute this command: hduser@master:~$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slave
For a full walkthrough, see this link:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
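A hedged sketch of those checks; the hostnames come from the question, and the slaves-file path assumes the classic conf/ layout (a packaged CDH4 install may keep its config under /etc/hadoop/conf instead):
grep -x slave2 $HADOOP_HOME/conf/slaves || echo slave2 >> $HADOOP_HOME/conf/slaves
ssh slave2 hostname                                   # should print "slave2" with no password prompt
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slave2    # only needed if the test above fails
hadoop dfsadmin -report                               # after restarting: slave2 should appear as a live datanode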

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student, interested in Hadoop and started to explore it recently.
I tried adding an additional DataNode in the pseudo-distributed mode but failed.
I am following the Yahoo! developer tutorial, so the version of Hadoop I am using is hadoop-0.18.0.
I tried to start it up using two methods I found online:
Method 1 (link)
I have a problem with this line:
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2 with all the configuration files in that directory, but when the script is run it gives the error:
Usage: Java DataNode [-rollback]
Any idea what the error means? The log files are created but the DataNode did not start.
Method 2 (link)
Basically I duplicated the conf folder to a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this is the first DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start an additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop's start/stop scripts use /tmp as the default directory for storing the PIDs of already-started daemons. In your situation, when you start the second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already running.
The plain solution is to set the HADOOP_PID_DIR environment variable to something other than /tmp. Also, do not forget to update all the network port numbers in conf2.
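A hedged sketch of the plain solution; the pid directory is an arbitrary choice, and the port property names in the comment are the ones I believe hadoop-0.18 uses, so treat them as assumptions:
# In conf2/hadoop-site.xml, give the second datanode its own dfs.data.dir and
# distinct ports (dfs.datanode.address, dfs.datanode.http.address, dfs.datanode.ipc.address).
export HADOOP_PID_DIR=$HADOOP_HOME/pids2    # keeps its pid file out of /tmp
$HADOOP_HOME/bin/hadoop-daemon.sh --config $HADOOP_HOME/conf2 start datanode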
The smart solution is to start a second VM with a Hadoop environment and join the two in a single cluster. That's the way Hadoop is intended to be used.

Hadoop master cannot start slave with different $HADOOP_HOME

On the master, $HADOOP_HOME is /home/a/hadoop; on the slave, $HADOOP_HOME is /home/b/hadoop.
On the master, when I run start-all.sh, the master namenode starts successfully, but it fails to start the slave's datanode with the following message:
b@192.068.0.2: bash: line 0: cd: /home/b/hadoop/libexec/..: No such file or directory
b@192.068.0.2: bash: /home/b/hadoop/bin/hadoop-daemon.sh: No such file or directory
Any idea how to specify $HADOOP_HOME for the slave in the master's configuration?
I don't know of a way to configure different home directories for the various slaves from the master, but the Hadoop FAQ says that the Hadoop framework does not require SSH and that the DataNode and TaskTracker daemons can be started manually on each node.
I would suggest writing your own scripts to start things, taking into account the specific environments of your nodes. However, make sure to include all the slaves in the master's slaves file. It seems this is necessary; the heartbeats alone are not enough for the master to add slaves.
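For example, a minimal sketch of the manual, SSH-free startup the FAQ describes, run on each slave so that each node's own $HADOOP_HOME applies:
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode      # uses this node's local config
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker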
