Hadoop 3.2.1 Multinode Cluster Nodemanager is not running

I have Hadoop 3.2.1 installed on Ubuntu 16.04 LTS, and my cluster has 18 datanodes and 1 master.
After running:
$ start-dfs.sh
$ start-yarn.sh
$ jps
On master I get the following:
ResourceManager
NameNode
SecondaryNameNode
jps
And on datanodes:
DataNode
jps
All the nodes seem to be live:
[Screenshot: NameNode Overview web page]
But when I open the Cluster overview, none of my datanodes seem to be active:
[Screenshot: Cluster Overview]
My configurations files:
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-3.2.1/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-3.2.1/data/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-3.2.1/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
The namenode and datanode directories exist on every host (master and datanodes).
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services </name>
<value> mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
I have also set JAVA_HOME in hadoop-env.sh, and all the other variables are in the .bashrc file (on every host as well).
I have modified the /etc/hosts file to include all the hosts with their IPs and hostnames, and I have also modified the workers file to include the IPs of all the datanodes.
The first time I formatted the NameNode, the directories in hdfs-site.xml were wrong (I had the datanode dir twice), so HDFS made its own directories under /tmp/hdfs/ (if I remember correctly). I fixed this by formatting the NameNode again with the correct directories.
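A side note on the files above: fs.default.name, dfs.name.dir and dfs.data.dir are deprecated aliases in Hadoop 2 and later; the current names are fs.defaultFS, dfs.namenode.name.dir and dfs.datanode.data.dir. The sketch below restates the same settings under the current names, reusing the host name and paths from the question. It is not claimed to fix the missing NodeManager; it only avoids the deprecation warnings and makes the intent unambiguous.

```xml
<!-- core-site.xml (sketch): same NameNode URI, current property name -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-3.2.1/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
</configuration>
```

```xml
<!-- hdfs-site.xml (sketch): same directories, current property names -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hadoop-3.2.1/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hadoop-3.2.1/data/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```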

Related

hadoop's start-dfs not creating datanode on the slave

I am trying to set up a Hadoop cluster over two nodes. Running start-dfs.sh on my master node opens a window that closes shortly afterwards. When I execute start-dfs, the log shows the namenode launching correctly, but the datanode does not start and logs the following:
Problem binding to [slave-VM1:9005] java.net.BindException: Cannot assign requested address: bind; For more details see: http://wiki.apache.org/hadoop/BindException
I have set
ssh-keygen -t rsa -P ''
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(I also set up the administrators_authorized_keys file with the right public key; ssh user@remotemachine works and gives access to the slave.)
Here's my full Hadoop configuration set on both master and slave machines (Windows):
hdfs-site.xml :
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/data/namenode</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>slaveVM1:50475</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
core-site.xml :
<configuration>
<property>
<name>dfs.datanode.http.address</name>
<value>slaveVM1:9005</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://masterVM2:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/hadoopTmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://masterVM2:8020</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>masterVM2:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
PS: I am an administrator on both machines, and I set HADOOP_CONF_DIR to C:\Hadoop\hadoop-3.2.2\etc\hadoop.
I also added the slave's IP to the slaves file in HADOOP_CONF_DIR.
PS: if I remove the following property:
<property>
<name>dfs.datanode.https.address</name>
<value>slave:50475</value>
</property>
from hdfs-site.xml, then both the datanode and the namenode launch on the master node.
hosts :
*.*.*.* slaveVM1
*.*.*.* masterVM2
The masked values (*.*.*.*) are the IPs of the respective machines; all other entries are commented out.

This exception:
BindException: Cannot assign requested address: bind;
usually appears when the port is in use. That may mean the application was already started, was started previously and didn't shut down properly, or another application is using that port. Try rebooting (a heavy-handed but reasonably effective way of clearing ports).
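Another possibility worth checking: "Cannot assign requested address" is also what a bind reports when the configured host name does not resolve to an address on the machine that starts the daemon. That would fit the question's observation that removing the slave address lets the datanode start on the master. A minimal sketch, assuming the DataNode web endpoints do not need to be pinned to one host: bind them to the wildcard address and keep both properties in hdfs-site.xml rather than core-site.xml. The ports shown are the Hadoop 3 defaults and are an assumption about what the cluster should use, not something taken from the question.

```xml
<!-- hdfs-site.xml (sketch): DataNode web endpoints bound to all local interfaces -->
<configuration>
  <property>
    <name>dfs.datanode.http.address</name>
    <!-- 0.0.0.0 is valid on every node, so the same file can be shared by master and slave -->
    <value>0.0.0.0:9864</value>
  </property>
  <property>
    <name>dfs.datanode.https.address</name>
    <value>0.0.0.0:9865</value>
  </property>
</configuration>
```

Because the wildcard address exists on every machine, the same file can be copied to both nodes unchanged.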

Hadoop: datanode not starting on slave

I have two VMs set up with Ubuntu 12.04. I am trying to set up a multinode Hadoop cluster, but after executing hadoop/sbin/start-dfs.sh I see the following processes on my master:
20612 DataNode
20404 NameNode
20889 SecondaryNameNode
21372 Jps
However, nothing is running on the slave. Also, when I run hdfs dfsadmin -report, I only see:
Live datanodes (1):
Name: 10.222.208.221:9866 (master)
Hostname: master
I checked the logs; start-dfs.sh does not even try to start a datanode on my slave.
I am using the following configuration:
#/etc/hosts
127.0.0.1 localhost
10.222.208.221 master
10.222.208.68 slave-1
I changed the hostname in /etc/hostname on the respective systems.
I am also able to ping slave-1 from the master and vice versa.
/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
#hadoop/etc/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop/data/namenode</value>
<description>NameNode directory</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>
I have also added master and slave-1 to /hadoop/etc/master and /hadoop/etc/slaves on both my master and slave systems.
I have also tried cleaning data/* and then running hdfs namenode -format before start-dfs.sh, but the problem persists.
Also, the network adapter setting is marked as Bridged adapter.
Is there any possible reason why the datanode is not starting on the slave?

Can't claim to have the answer, but I found this: "start-all.sh" and "start-dfs.sh" from master node do not start the slave node services?
I renamed my slaves file to workers and everything clicked into place.

It seems you are using hadoop-2.x.x or above, so try this configuration. Note that the masters file (hadoop-2.x.x/etc/hadoop/masters) is not available by default from hadoop-2.x.x onwards.
hadoop-2.x.x/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
~/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop/data/namenode</value>
<description>NameNode directory</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
~/etc/hadoop/mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
~/etc/hadoop/yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
~/etc/hadoop/slaves
slave-1
Copy all of the configuration files above from the master to the slave, replacing the ones under hadoop-2.x.x/etc/hadoop/.

nodemanager is not starting while upgrading to hadoop 2 from hadoop classic

I have a cluster with one master and one worker. I am upgrading from Hadoop classic to YARN. The resourcemanager and historyserver started successfully, but the nodemanager does not start and gives this error:
java.lang.NumberFormatException: For input string: "${nodemanager.resource.memory-mb}"
I have kept the same yarn-site.xml.template on both servers.
I have replaced ${nodemanager.resource.memory-mb} with 8192.
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>__RM_IP__</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>${nodemanager.resource.memory-mb}</value>
</property>
</configuration>
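The exception text suggests the NodeManager read the literal string ${nodemanager.resource.memory-mb}: Hadoop leaves a ${...} reference unexpanded when no property or Java system property of that name is defined, and the string then fails integer parsing. A sketch of the property with a literal value, using 8192 from the question. It assumes the file the NodeManager actually loads is yarn-site.xml in its configuration directory, not the .template file, so the replacement has to end up there on the worker as well.

```xml
<!-- yarn-site.xml on the NodeManager host (sketch): literal value, no placeholder -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <!-- must be a plain integer number of megabytes -->
    <value>8192</value>
  </property>
</configuration>
```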

Name node is not displaying when I hit JPS

Contents of hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
jps does not show the NameNode:
20471 NodeManager
20515 Jps
20036 DataNode
20362 ResourceManager

Cannot create directory /home/hadoop/hadoopinfra/hdfs/namenode/current

I get the error
Cannot create directory /home/hadoop/hadoopinfra/hdfs/namenode/current
while trying to install Hadoop on my local Mac.
What could be the reason for this? Just for reference, I'm putting my xml files down below:
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/namenode </value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopinfra/hdfs/datanode </value>
</property>
</configuration>
core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
I think my problem lies in my hdfs-site.xml file, but I'm not sure how to pinpoint/change it.
I'm using this tutorial, and "hadoop" in the file path is replaced by my username.

Possible error: misconfiguration of the hdfs-site.xml file
This happened to me when I was following a setup tutorial. The contents of my hdfs-site.xml were:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/nameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/dataNode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Only then did I realize that the text hadoop in the above file corresponds to the user name, which in my case had to be replaced with hduser. Once both occurrences of hadoop were replaced with hduser, the hdfs namenode -format command worked fine.
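With that substitution applied, the file would look roughly like the sketch below. The name hduser comes from this answer, so substitute your own user name; the directories are assumed to exist and to be writable by that user.

```xml
<!-- hdfs-site.xml (sketch): paths under the home directory of the user running Hadoop -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hduser/data/nameNode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hduser/data/dataNode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```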

I had this problem too and it was a permission problem. I just did:
sudo chmod 777 /home/hadoop/hadoopinfra/hdfs/namenode/
and it works!

In the step where you need to verify the hadoop installation, instead of 'hdfs namenode -format' use '/usr/local/hadoop/bin/hdfs namenode -format'
Found this answer from:
hadoop java.io.IOException: while running namenode -format

If you are using plain Apache Hadoop rather than a third-party distribution, add the current user to the hadoop group and retry formatting the namenode.
sudo usermod -a -G hadoop <current-username>
If you are using a third-party Hadoop distribution such as Cloudera, Hortonworks or MapR, switch to the root user, then switch to the hdfs user; formatting the namenode should then succeed.
$ sudo -i
$ su - hdfs
$ hdfs namenode -format

Try the Hadoop command with sudo
