How to get datanode timeout? - hadoop

I have a 3 node hadoop setup, with replication factor as 2.
When one of my datanode dies, namenode waits for 10 mins before removing it from live nodes. Till then my hdfs writes fail saying bad ack from node.
Is there a way to set a smaller timeout( like 1 min) so that the node where datanode dies is discarded immediately ?

Setting up the following in your hdfs-site.xml will give you 1-minute timeout.
<property>
<name>heartbeat.recheck.interval</name>
<value>15</value>
<description>Determines datanode heartbeat interval in seconds</description>
</property>
If above doesn't work - try the following (seems to be version-dependent):
<property>
<name>dfs.heartbeat.recheck.interval</name>
<value>15</value>
<description>Determines datanode heartbeat interval in seconds.</description>
</property>
Timeout equals to 2 * heartbeat.recheck.interval + 10 * heartbeat.interval. Default for heartbeat.interval is 3 seconds.

In the version of Hadoop that we use, dfs.heartbeat.recheck.interval should be specified in milliseconds (check the code/doc of your version of Hadoop, to validate that).

I've managed to make this work. I'm using Hadoop version 0.2.2.
Here's what I added to my hdfs-site.xml:
<property>
<name>dfs.heartbeat.interval</name>
<value>2</value>
<description>Determines datanode heartbeat interval in seconds.</description>
</property>
<property>
<name>dfs.heartbeat.recheck.interval</name>
<value>1</value>
<description>Determines when machines are marked dead</description>
</property>
This parameters can differ for other versions of Hadoop. Here's how to check that you're using the right parameters: Once you set them, start your master, and check the configuration at :
http://your_master_machine:19888/conf
If you don't find "dfs.heartbeat.interval" and/or "dfs.heartbeat.recheck.interval" in there, that means you should try using their version without the "dfs." prefix:
"heartbeat.interval" and "heartbeat.recheck.interval"
Finally, to check that the dead datanode is no longer used after the desired amount of time, kill a datanode, then check repeatedly the console at:
http://your_master_machine:50070
For me, with the configuration shown here, I can see that a dead datanode is removed after about 20 seconds.

Related

HADOOP YARN - Application is added to the scheduler and is not yet activated. Skipping AM assignment as cluster resource is empty

I am evaluating YARN for a project. I am trying to get the simple distributed shell example to work. I have gotten the application to the SUBMITTED phase, but it never starts. This is the information reported from this line:
ApplicationReport report = yarnClient.getApplicationReport(appId);
Application is added to the scheduler and is not yet activated. Skipping AM assignment as cluster resource is empty. Details : AM Partition = DEFAULT_PARTITION; AM Resource Request = memory:1024, vCores:1; Queue Resource Limit for AM = memory:0, vCores:0; User AM Resource Limit of the queue = memory:0, vCores:0; Queue AM Resource Usage = memory:128, vCores:1;
The solutions for other developers seems to have to increase yarn.scheduler.capacity.maximum-am-resource-percent in the yarn-site.xml file from its default value of .1. I have tried values of .2 and .5 but it does not seem to help.
Looks like you did not configure the RAM allocated to Yarn in a proper way. This can be a pin in the ..... if you try to infer/adapt from tutorials according to your own installation. I would strongly recommend that you use tools such as this one:
wget http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
tar zxvf hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
rm hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
mv hdp_manual_install_rpm_helper_files-2.6.0.3.8/ hdp_conf_files
python hdp_conf_files/scripts/yarn-utils.py -c 4 -m 8 -d 1 false
-c number of cores you have for each node
-m amount of memory you have for each node (Giga)
-d number of disk you have for each node
-bool "True" if HBase is installed; "False" if not
This should give you something like:
Using cores=4 memory=8GB disks=1 hbase=True
Profile: cores=4 memory=5120MB reserved=3GB usableMem=5GB disks=1
Num Container=3
Container Ram=1536MB
Used Ram=4GB
Unused Ram=3GB
yarn.scheduler.minimum-allocation-mb=1536
yarn.scheduler.maximum-allocation-mb=4608
yarn.nodemanager.resource.memory-mb=4608
mapreduce.map.memory.mb=1536
mapreduce.map.java.opts=-Xmx1228m
mapreduce.reduce.memory.mb=3072
mapreduce.reduce.java.opts=-Xmx2457m
yarn.app.mapreduce.am.resource.mb=3072
yarn.app.mapreduce.am.command-opts=-Xmx2457m
mapreduce.task.io.sort.mb=614
Edit your yarn-site.xml and mapred-site.xml accordingly.
nano ~/hadoop/etc/hadoop/yarn-site.xml
nano ~/hadoop/etc/hadoop/mapred-site.xml
Moreover, you should have this in your yarn-site.xml
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>name_of_your_master_node</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
and this in your mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Then, upload your conf files to each node using scp (If you uploaded you ssh keys to each one)
for node in node1 node2 node3; do scp ~/hadoop/etc/hadoop/* $node:/home/hadoop/hadoop/etc/hadoop/; done
And then, restart yarn
stop-yarn.sh
start-yarn.sh
and check that you can see your nodes:
hadoop#master-node:~$ yarn node -list
18/06/01 12:51:33 INFO client.RMProxy: Connecting to ResourceManager at master-node/192.168.0.37:8032
Total Nodes:3
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
node3:34683 RUNNING node3:8042 0
node2:36467 RUNNING node2:8042 0
node1:38317 RUNNING node1:8042 0
This might fix the issue (good luck) (additional info)
Add below properties to yarn-site.xml and restart dfs and yarn
<property>
<name>yarn.scheduler.capacity.root.support.user-limit-factor</name>
<value>2</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
<value>0.0</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>100.0</value>
</property>
I got the same error and tried to solve it hard. I realized the resource manager had no resource to allocate the application master (AM) of the MapReduce application.
I navigated on browser http://localhost:8088/cluster/nodes/unhealthy and examined unhealthy nodes (in my case there was only one) -> health report. I saw the warning about that some log directories filled up. I cleaned those directories then my node became healthy and the application state switched to RUNNING from ACCEPTED. Actually, as a default, if the node disk fills up more than %90, YARN behaves like that. Someway you have to clean space and make available space lower than %90.
My exact health report was:
1/1 local-dirs usable space is below configured utilization percentage/no more usable space [ /tmp/hadoop-train/nm-local-dir : used space above threshold of 90.0% ] ;
1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /opt/manual/hadoop/logs/userlogs : used space above threshold of 90.0% ]

httpfs error Operation category READ is not supported in state standby

I am working on hadoop apache 2.7.1 and I have a cluster that consists of 3 nodes
nn1
nn2
dn1
nn1 is the dfs.default.name, so it is the master name node.
I have installed httpfs and started it of course after restarting all the services. When nn1 is active and nn2 is standby I can send this request
http://nn1:14000/webhdfs/v1/aloosh/oula.txt?op=open&user.name=root
from my browser and a dialog of open or save for this file appears, but when I kill the name node running on nn1 and start it again as normal then because of high availability nn1 becomes standby and nn2 becomes active.
So here httpfs should work, even if nn1 becomes stand by, but sending the same request now
http://nn1:14000/webhdfs/v1/aloosh/oula.txt?op=open&user.name=root
gives me the error
{"RemoteException":{"message":"Operation category READ is not supported in state standby","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}
Shouldn't httpfs overcome nn1 standby status and bring the file? Is that because of a wrong configuration, or is there any other reason?
My core-site.xml is
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
It looks like HttpFs is not High Availability aware yet. This could be due to the missing configurations required for the Clients to connect with the current Active Namenode.
Ensure the fs.defaultFS property in core-site.xml is configured with the correct nameservice ID.
If you have the below in hdfs-site.xml
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
then in core-site.xml, it should be
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
Also configure the name of the Java class which will be used by the DFS Client to determine which NameNode is the currently Active and is serving client requests.
Add this property to hdfs-site.xml
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
Restart the Namenodes and HttpFs after adding the properties in all nodes.

what would happen if nodes in hadoop change their IP address?

my hadoop clusters do not work fine because of the network conditions.What if i change the entire network,like another router,thus change the IP addresses? could the clusters still work by updating some configurations? or i must torn it down and rebuilt everything?
Thanks in advance
It works once you change the ip addresses into the configuration, why did not you use the DNS?
Ok, it was not a good answer, let me apologize and give a better answer.
If you need to change configuration on a running cluster you can decommission and commission the data nodes.
Switch off the data node is not a good idea.
Data Node Decomissioning
The fist step is tell to yarn you are going to remove some nodes, then you have to say the same to node manager.
I don't know if your system is configured for decommissioning, if it so you have the key yarn.resourcemanager.nodes.exclude-path into the yarn-site.xml and dfs.hosts.exclude into hdfs-site.xml
hdfs-site.xml
<property>
<name>dfs.hosts.exclude</name>
<value>$YOUR_PATH/dfs.exclude</value>
<final>true</final>
</property>
yarn-site.xml
<property>
<name>dfs.hosts.exclude</name>
<value>$YOUR_PATH/dfs.exclude</value>
<final>true</final>
</property>
Open the file $YOUR_PATH/dfs.exclude and add hostnames / ip addresses of node you need to stop.
execute
yarn rmadmin -refreshNodes
hdfs dfsadmin -refreshNodes
Check if the data nodes are in decommission checking the web interface.
Data Node Comissioning
Works in the same way of the Decommissioning
yarn-site.xml
<property>
<name>yarn.resourcemanager.nodes.include-path</name>
<value>$YOUR_PATH/dfs.include</value>
<final>true</final>
</property>
hdfs-site.xml
<property>
<name>dfs.hosts</name>
<value>$YOUR_PATH/dfs.include</value>
<final>true</final>
</property>
Open the file $YOUR_PATH/dfs.include and add hostnames / ip addresses of node you need to add.
yarn rmadmin -refreshNodes
hdfs dfsadmin -refreshNodes
wait some time
hdfs dfsadmin -report
Now the hosts you added are into the list.
If your configurations are missing the above keys you need to halt/restart the node manager and yarn after adding them.
Using these procedure you can halt data nodes in a safe way.

Should I have to run history server in all nodes to get job history in Hadoop Cluster WebUI

I am facing one issue in Hadoop cluster. I have a Hadoop cluster with 5 datanodes and one edge/gateway node.
My issue is that I had to start the history server in each of those nodes (1 namenode and 5 datanodes) to get any job history from hadoop webUI for any submitted job.
I have added mapreduce.jobhistory.address and mapreduce.jobhistory.webapp.address in mapred-site.xml
But it's not working properly I guess.
If I start the history server in name node or any other node only , Hadoop Cluster Web-UI is unable to show me the job history and ends up with some error.
My Mapred-site XML
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoopmaster:8021</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoopmaster:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoopmaster:19888</value>
</property>
</configuration>
For the time being as a workaround I start the history server in each node (namenode and all data node) manually. But think this is not right way.
Now I have 5 data node only so its still feasible to start history server in each and every node manually , but if case of multiple nodes(say 100/200) it will not be feasible any more to start history server in every node. There should be some standard solution for this issue...
Please help me out if anyone knows how to resolve this issue.
Thanks in advanceā€¦.
Finally I am able to solve the issue.
Actually in case of mapreduce.jobhistory.address , it will history server is running in one node only (jps).
It's working properly now...

Discrepancy in task run per map with the tasks configured in mapred-site.xml

I stopped all the agents running in my pseudo distributed mode by giving the following command.
stop-all.sh
Then I changed the configuration file of "mapred-site.xml" to 1 Map Task
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>1</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>1</value>
</property>
</configuration>
you see I have set 1 MapTask snd 1 ReduceTask to run.
Then I started back all the agents
start-all.sh
and run the map-reduce program but still I see 2 tasks instead of 1 as configured in mapred-site.xml.
The screen shot of the tasks are shown below,
Why is such discrepancy occurring, Please guide me through
thanks
Okay so this property mapred.tasktracker.map.tasks.maximum tells the maximum number of tasks (mapper tasks) which can be run by a task tracker at a time. Basically you are restricting each node running task tracker to run one mapper at a time.
If you have 10 nodes then you should be able to run 10 mappers in parallel.
However if your job requires 2 mappers (which is totally based on size of input data & block size unless you extend inputformat) and you have only one node then the map tasks would be executed sequentially on that node.
Hope this is clearer now.

Resources