Hadoop: datanode not starting on slave

Hadoop: datanode not starting on slave - hadoop

I have two VMs setup with Ubuntu 12.04. I am trying to setup Hadoop multinode, but after executing hadoop/sbin/start-dfs.shI see following process on my master:
20612 DataNode
20404 NameNode
20889 SecondaryNameNode
21372 Jps
However, there is nothing in the slave. Also when I do hdfs dfsadmin -report, I only see:
Live datanodes (1):
Name: 10.222.208.221:9866 (master)
Hostname: master
I checked logs, my start-dfs.sh does not even try to start datanode on my slave.
I am using following configuration:
#/etc/hosts
127.0.0.1 localhost
10.222.208.221 master
10.222.208.68 slave-1
changed hostanme in /etc/hostname in respective systems
Also, I am able to ping slave-1 from master system and vice-versa using ping.
/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
#hadoop/etc/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop/data/namenode</value>
<description>NameNode directory</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>
I have also added master and slave-1 in /hadoop/etc/master and /hadoop/etc/slaveson both my master and slave system.
I have also tried cleaning data/* and then hdfs namenode -format before start-dfs.sh, still the problem persists.
Also, I have Network adapter setting marked as Bridged adapter.
Any possible reason datanode not starting on slave?

Can't claim to have the answer, but I found this "start-all.sh" and "start-dfs.sh" from master node do not start the slave node services?
Changed my slaves file to workers file and everything clicked in.

It seems you are using hadoop-2.x.x or above, so, try this configuration. And by default masters file( hadoop-2.x.x/etc/hadoop/masters) won't available on hadoop-2.x.x onwards.
hadoop-2.x.x/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
~/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop/data/namenode</value>
<description>NameNode directory</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
~/etc/hadoop/mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
~/etc/hadoop/yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
~/etc/hadoop/slaves
slave-1
copy all the above configured file from master and replace it on slave on this path hadoop-2.x.x/etc/hadoop/.

Related

No Disk Space allocated to HDFS filesystem

I am trying to set up Hadoop on my local machine. However, when I'm running wordcount based on map reduce example (I did hdfs namenode -format earlier) :
This is maybe hard to read but I end up with a "Job failed with state FAILED due to:
Application failed 2 times
due to AM Container exited with
exitCode: -1000 Failing this attempt.Diagnostics: No space available in any of the local directories."
I don't understand why I have such error. This is how my applications & attempt look like :
I followed several tutorials, ending up with these parameters :
mapred-site.xml :
configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml :
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>
%HADOOP_HOME%\etc\hadoop,
%HADOOP_HOME%\share\hadoop\common\*,
%HADOOP_HOME%\share\hadoop\common\lib\*,
%HADOOP_HOME%\share\hadoop\hdfs\*,
%HADOOP_HOME%\share\hadoop\hdfs\lib\*,
%HADOOP_HOME%\share\hadoop\mapreduce\*,
%HADOOP_HOME%\share\hadoop\mapreduce\lib\*,
%HADOOP_HOME%\share\hadoop\yarn\*,
%HADOOP_HOME%\share\hadoop\yarn\lib\*
</value>
</property>
</configuration>
core-site.xml :
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml :
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/hadoop-3.3.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///C:/hadoop-3.3.0/data/datanode</value>
</property>
</configuration>
Can you help me with this please ? I already tried what is mentionned in these questions :
Hadoop errorcode -1000, No space available in any of the local directories, except for the namenode emptying cache part.
Hadoop Windows setup. Error while running WordCountJob: "No space available in any of the local directories"
what do you think ?
Thank you !

hadoop's start-dfs not creating datanode on the slave

I am trying to set a Hadoop cluster over two nodes. start-dfs.sh on my master node is opening a window and shortly after the window closes, and when i execute start-dfs it logs namenode is correctly launched, but datanode is not and logs the following :
Problem binding to [slave-VM1:9005] java.net.BindException: Cannot assign requested address: bind; For more details see: http://wiki.apache.org/hadoop/BindException
I have set
ssh-keygen -t rsa -P ''
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(and also set adminstrators_authorized_keys file with the right public key) (also ssh user#remotemachine is working and gives access to the slave)
Here's my full Hadoop configuration set on both master and slave machines (Windows):
hdfs-site.xml :
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/data/namenode</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>slaveVM1:50475</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
core-site.xml :
<configuration>
<property>
<name>dfs.datanode.http.address</name>
<value>slaveVM1:9005</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://masterVM2:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/C:/Hadoop/hadoop-3.2.2/hadoopTmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://masterVM2:8020</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>masterVM2:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
PS : i am adminstrator on both machines, and i set HADOOP_CONF_DIR C:\Hadoop\hadoop-3.2.2\etc\hadoop
I also set the slave IP in hadoop_conf_dir slaves file.
PS : if i remove the code :
<property>
<name>dfs.datanode.https.address</name>
<value>slave:50475</value>
</property>
from hdfs-site.xml
Then both datanote and namenode launch on the master node.
hosts :
*.*.*.* slaveVM1
*.*.*.* masterVM2
... are the IPs of the respective machines, all other entries are commented out

This usually happens
BindException: Cannot assign requested address: bind;
when the port in use. Meaning maybe it's the application was already started, or was started previously and didn't shut down properly or another applicaiton is using that port. Try rebooting, (as a heavy handed but reasonably effective way of clearing ports).

Hadoop 3.2.1 Multinode Cluster Nodemanager is not running

I have Hadoop 3.2.1 installed on Ubuntu 16.04lts and my cluster has 18 datanodes and 1 master.
After running:
$ start-dfs.sh
$ start-yarn.sh
$ jps
On master I get the following:
ResourceManager
NameNode
SecondaryNameNodecode
jps
And on datanodes:
DataNode
jps
All the nodes seems to be live:
NameNode Overview Web Page
But when I reach the Cluster overview, none of my datanodes seems to be active:
Cluster Overview
My configurations files:
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-3.2.1/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-3.2.1/data/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-3.2.1/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
The namenode and datanode directories exists on every host (master and datanodes)
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services </name>
<value> mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
Also I have configured hadoop-env.sh for JAVA_HOME Path and all the other variables are in .bashrc file (also in every host).
I have modified the /etc/hosts file to include all the hosts with their IPs and hostnames and finally I have also modified the workers file to include all the IPs of the datanodes.
The first time I have formatted the NameNode, the directories for the hdfs-site.xml was wrong (I had the datanode dir twice), so hdfs make its own directories under /tmp/hdfs/ (if I remember correctly). But I fixed this with formating again the NameNode with the corect directories.

Hadoop jobtracker's tracking url cannot access

I have configured my hadoop system in wsl and run the wordcount example. But when I want to see the history of the job, I found the tracking url cannot access.
The job is working well, the jobhistory is running as well.
The history tracking url is my wsl hostname:8088/proxy/application_1585482453915_0002/.
You can see the url above.
But I can still access to localhost:19888/jobhistory to see my jobhistory.
How is this problem occurs? Is it a problem of configuration?
My hadoop version is 2.7.1.
My core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
　　　　 <name>fs.defaultFS</name>
　　　　 <value>hdfs://localhost:9000</value>
</property>
My hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop/tmp/dfs/data</value>
</property>
My mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
</property>
My yarn-site.xml
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
My /etc/hosts
127.0.0.1 localhost
127.0.1.1 DESKTOP-U1EOV4J.localdomain DESKTOP-U1EOV4J

The JobHistoryServer daemon is running in localhost (127.0.0.1), whereas the tracking URL is constructed with the hostname, thus redirecting to DESKTOP-U1EOV4J.localdomain (127.0.1.1).
For a Pseudo distributed cluster, it is safer to leave the host of JobHistoryServer to be 0.0.0.0.
Update the job history server properties in mapred-site.xml
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
</property>
and restart the JobHistoryServer.

Setting up Hadoop Cluster - NameNode smoke test "Unable to connect"

I am having trouble connection to my NameNode server from another server on the cluster. the namenode starts fine and i can get to the namenode dashboard browsing to http://localhost:50070, but trying to browse to http://hadoop-cluster-1:50070 or even using the IP address doesn't work. I am able to ping hadoop-cluster-1 and the IP address. I am also able to traceroute the port and host all from the server that i am getting an "Unable to connect" in Firefox. See below for values files.
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-cluster-1.com:8020</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>8192</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-cluster-1:8020</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-cluster-1:8020</value>
<description>The port where the NameNode runs the HDFS protocol.
Combined with the NameNode's hostname to build its address.
</description>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>hadoop-cluster-1:8020</value>
<description>
RPC address that handles all clients requests. In the case of HA/Federation where multiple namenodes exist, the name ser
vice id is added to the name e.g. dfs.namenode.rpc-address.ns1 dfs.namenode.rpc-address.EXAMPLENAMESERVICE The value of this propert
y will take the form of nn-host1:rpc-port.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/nn1,file:///data/nn2</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>131072</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/data1</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Hadoop: datanode not starting on slave - hadoop

Can't claim to have the answer, but I found this "start-all.sh" and "start-dfs.sh" from master node do not start the slave node services? Changed my slaves file to workers file and everything clicked in.

Related

No Disk Space allocated to HDFS filesystem

hadoop's start-dfs not creating datanode on the slave

Hadoop 3.2.1 Multinode Cluster Nodemanager is not running

Hadoop jobtracker's tracking url cannot access

Setting up Hadoop Cluster - NameNode smoke test "Unable to connect"

Categories

Resources