I have 2 hadoop pseudo(standalone) servers, which I created for testing purposes.
Now I want to club the two servers into one cluster and make this a Master-Slave configuration.
Is there anyway to achieve this?
Thanks in advance.
Step 1
Determine which pseudo you want to be master. Once decided add master machine name in file
$hadoop_home/conf/master
Step 2
If u want other pseudo as well as master machine to act as DataNodes then add these machine names in file
$hadoop_home/conf/slaves
Step 3
Make SSH password less connection between master and slave. And change
/etc/hosts
file if necessary if they are getting connected.
Step 4
Now prepare the hadoop MultiNode Cluster to start
Format namenode
$hadoop_home/bin/hadoop namenode -format
Start the cluster
$hadoop_home/bin/start-all.sh
sure you can, just follow Running Hadoop on Ubuntu Linux (Multi-Node Cluster)
Related
I am trying to set up a multi-node Hadoop cluster between 2 windows devices. I am using Hadoop 3.3.1. how can I achieve that, please
(note: I was able to successfully create in Single-Node)
of course, I did these steps :
My network will be as follows:
master-IP : 192.168.81.144
slave-IP : 192.168.81.145
add below lines in Master-mode and SlaveNode C:\Windows\System32\drivers\etc\hosts
192.168.81.144 hadoopMaster
192.168.81.145 hadoopSlave
in master, ping salve-IP worked
in slave, ping master-IP worked
add below lines in the workers file
hadoopMaster
hadoopSlave
3)format the HDFS file system
hdfs namenode -format
run below script on master node
start-dfs
but the slave node not working
Can anyone please tell how to create a hadoop 2 node cluster ?i have 2 Virual machines both have hadoop single node setup installed properly,please tell how to form 2 node cluster from that ,and mention proper network setup for that.Thanks in Advance.i am using hadoop -1.0.3 version.
Hi can you do one thing for network configuration of vms.
Make sure of the network connect of virtual machine setting of hostonly.
And add two hostname and ips in /etc/hosts file.
add two hostnames in $HADOOP_HOME/conf/slaves file.
Make sure of secondary namenode hostname in $HADOOP_HOME/conf/masters file.
Then make sure same configuration on two machine.
hadoop namenode -format
start-all.sh
Best of luck.
Is it necessary to have 2 linux machine for configuring two node hadoop cluster(ie 1 master 1 slave).Can it be done on single machine
Yes, you can run in local mode or pseudo-distributed mode, you don't necessarily need an actual cluster of machines.
See: http://hadoop.apache.org/docs/stable/single_node_setup.html
Pseudo-Distributed single node cluster implementation
I am using window 7 with CYGWIN and installed hadoop-1.0.3 successfully. I still start services job tracker,task tracker and namenode on port (localhost:50030,localhost:50060 and localhost:50070).I have completed single node implementation.
Now I want to implement Pseudo-Distributed multiple node cluster . I don't understand how to divide in master and slave system through network ips?
For your ssh problem just follow the link of single node cluster :
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
and yes, you need to specify the ip's of master and slave in conf file
for that you can refer this url :
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
I hope this helps.
Try to create number of VM you want to add in your cluster.Make sure that those VM are having the same hadoop version .
Figure out the IPs of each VM.
you will find files named master and slaves in $HADOOP_HOME/conf mention the IP of VM to conf/master file which you want to treat as master and and do the same with conf/slaves
with slave nodes IP.
Make sure these nodes are having Passwordless-ssh connection.
Format your namenode and then run start-all.sh.
Thanks,
In master, the $HADOOP_HOME is /home/a/hadoop, the slave's $HADOOP_HOME is /home/b/hadoop
In master, when I try to using start-all.sh, then the master name node start successfuly, but fails to start slave's data node with following message:
b#192.068.0.2: bash: line 0: cd: /home/b/hadoop/libexec/..: No such file or directory
b#192.068.0.2: bash: /home/b/hadoop/bin/hadoop-daemon.sh: No such file or directory
any idea on how to specify the $HADOOP_HOME for slave in master configuration?
I don't know of a way to configure different home directories for the various slaves from the master, but the Hadoop FAQ says that the Hadoop framework does not require ssh and that the DataNode and TaskTracker daemons can be started manually on each node.
I would suggest writing you own scripts to start things that take into account the specific environments of your nodes. However, make sure to include all the slaves in the master's slave file. It seems that this is necessary and that the heart beats are not enough for a master to add slaves.