How does the AM select the node for each reduce task? - hadoop

I am running two Word Count jobs in the same cluster (I run Hadoop 2.6.5 locally in a multi-node setup), where my code runs the two jobs one after the other.
Both jobs share the same mapper, reducer, etc., but each one has a different Partitioner.
Why is there a different allocation of reduce tasks to nodes for the second job? I identify the reduce-task node by the node's IP (Java getting my IP address).
I know that the keys will go to different reduce tasks, but I want their destination nodes to stay unchanged.
For example, I have five different keys and four reduce tasks.
The allocation for Job 1 is:
partition_1 ->NODE_1
partition_2 ->NODE_1
partition_3 ->NODE_2
partition_4 ->NODE_3
The allocation for Job 2 is:
partition_1 ->NODE_2
partition_2 ->NODE_3
partition_3 ->NODE_1
partition_4 ->NODE_3

In Hadoop there is no locality for reducers, so YARN selects nodes for reducers based on available resources. There is no way to force Hadoop to run each reducer on the same node across two jobs.
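To see why this is: a partitioner only fixes the key-to-partition-number mapping; which node executes a given partition is decided by YARN at scheduling time. Here is a plain-Java sketch (not the Hadoop API itself) mirroring the arithmetic of Hadoop's default HashPartitioner; note that nothing in it refers to a node, so the partition index is stable across jobs even when the node assignment is not:

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner: mask off the
    // sign bit, then take the remainder modulo the number of reduce tasks.
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Hypothetical keys, matching the "five keys, four reduce tasks" example.
        List<String> keys = Arrays.asList("alpha", "beta", "gamma", "delta", "epsilon");
        int numReduceTasks = 4;
        for (String k : keys) {
            int p = getPartition(k, numReduceTasks);
            // The partition index is deterministic per key across jobs,
            // but the node that runs partition p is chosen by YARN per job.
            System.out.println(k + " -> partition_" + (p + 1));
        }
    }
}
```

So the two jobs send each key to the same partition number; only the partition-to-node placement changes.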

Related

How does the Hadoop framework decide which node runs a map task?

As per my understanding, files stored in HDFS are divided into blocks, and each block is replicated to multiple nodes, 3 by default. How does the Hadoop framework choose the node to run a map task, out of all the nodes on which a particular block is replicated?
As far as I know, there will be as many map tasks as there are blocks.
See manual here.
Usually, the framework chooses a node close to the input block, to reduce network bandwidth for the map task.
That's all I know.
In MapReduce 1 it depends on how many map tasks are already running on the datanode that hosts a replica, because the number of map slots per node is fixed in MR1. In MR2 there are no fixed slots, so it depends on the number of tasks already running on that node.

How does Hadoop decide how many nodes will perform the Map and Reduce tasks?

I'm new to Hadoop and I'm trying to understand it. I'm talking about Hadoop 2. When I have an input file that I want to run a MapReduce job on, I set the split size in the MapReduce program, so it will create as many map tasks as there are splits, right?
The resource manager knows where the files are and will send the tasks to the nodes that have the data, but who decides how many nodes will do the tasks? After the maps are done there is the shuffle; which node will do a reduce task is decided by the partitioner, which computes a hash, right? How many nodes will do reduce tasks? Will nodes that have done maps also do reduce tasks?
Thank you.
TL;DR: If I have a cluster and I run a MapReduce job, how does Hadoop decide how many nodes will run map tasks, and which nodes will run the reduce tasks?
How Many Maps?
The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files.
The right level of parallelism for maps seems to be around 10-100 maps per node, although it has been set up to 300 maps for very CPU-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute.
If you have 10TB of input data and a block size of 128MB, you’ll end up with 81,920 maps, unless Configuration.set(MRJobConfig.NUM_MAPS, int) (which only provides a hint to the framework) is used to set it even higher.
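As a quick sanity check of that arithmetic (a plain-Java sketch, not the Hadoop split calculator itself, assuming one split per block, which is the default when the split size equals the block size):

```java
public class MapCountSketch {
    // Number of input splits = ceil(totalInputBytes / splitBytes).
    static long numSplits(long totalBytes, long splitBytes) {
        return (totalBytes + splitBytes - 1) / splitBytes;
    }

    public static void main(String[] args) {
        long tenTb = 10L * 1024 * 1024 * 1024 * 1024; // 10 TB
        long blockSize = 128L * 1024 * 1024;          // 128 MB
        System.out.println(numSplits(tenTb, blockSize)); // prints 81920
    }
}
```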
How Many Reduces?
The right number of reduces seems to be 0.95 or 1.75 multiplied by ( < no. of nodes > * < no. of maximum containers per node > ).
With 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. With 1.75 the faster nodes will finish their first round of reduces and launch a second wave of reduces doing a much better job of load balancing.
Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.
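The two factors can be sketched numerically (plain Java; the node and container counts below are made-up example values, not defaults):

```java
public class ReduceCountSketch {
    // Heuristic from the Hadoop tutorial:
    // reduces = factor * (number of nodes) * (max containers per node).
    static int reduces(double factor, int nodes, int maxContainersPerNode) {
        return (int) (factor * nodes * maxContainersPerNode);
    }

    public static void main(String[] args) {
        int nodes = 10, containersPerNode = 8; // hypothetical cluster
        // 0.95: all reduces launch immediately, in a single wave.
        System.out.println(reduces(0.95, nodes, containersPerNode)); // 76
        // 1.75: faster nodes run a second wave, improving load balancing.
        System.out.println(reduces(1.75, nodes, containersPerNode)); // 140
    }
}
```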
Reducer NONE
It is legal to set the number of reduce tasks to zero if no reduction is desired.
Which nodes for Reduce tasks?
You can configure the number of mappers and reducers per node via configuration parameters such as mapreduce.tasktracker.reduce.tasks.maximum.
If you set this parameter to zero, that node won't be considered for reduce tasks. Otherwise, all nodes in the cluster are eligible for reduce tasks.
Source: Map Reduce Tutorial from Apache.
Note: For a given job, you can set mapreduce.job.maps & mapreduce.job.reduces, but they may not be effective. We should leave it to the MapReduce framework to decide the number of map and reduce tasks.
EDIT:
How to decide which Reducer node?
Assume that you have equal reduce slots available on two nodes N1 and N2, and the current load on N1 > N2; then the reduce task will be assigned to N2. If both load and the number of slots are the same, whichever node sends the first heartbeat to the JobTracker gets the task. This is the code block for reduce assignment: http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-320/org/apache/hadoop/mapred/JobQueueTaskScheduler.java#207
how does Hadoop decide how many nodes will do map tasks
By default, the number of mappers will be the same as the number of splits (blocks) of the input to the MapReduce job.
Now, about the nodes: in Hadoop 2, each node runs its own NodeManager (NM). The job of the NM is to manage the application containers assigned to it by the ResourceManager (RM). So basically, each task runs in an individual container. To run the mapper tasks, the ApplicationMaster negotiates containers from the ResourceManager. Once the containers are allocated, the NodeManager launches the tasks and monitors them.
which nodes will do the reduce tasks?
Again, the reduce tasks also run in containers. The ApplicationMaster (one per application/job) negotiates containers from the RM and launches the reducer tasks. Mostly they run on different nodes than the mapper nodes.
The default number of reducers for any job is 1. The number of reducers can be set in the job configuration.

Hadoop Map/Reduce Job distribution

I have 4 nodes and I am running a MapReduce sample project to see if the job is being distributed across all 4 nodes. I ran the project multiple times and noticed that the map tasks are being split among all 4 nodes, but the reduce task is only being done by one node. Is this how it is supposed to be, or is the reduce task supposed to be split among all 4 nodes as well?
Thank you
Distribution of mappers depends on which block of data a mapper will operate on. By default, the framework tries to assign the task to a node that has the block of data stored locally. This prevents network transfer of the data.
For reducers, it again depends on the number of reducers your job requires. If your job uses only one reducer, it may be assigned to any of the nodes.
Speculative execution also impacts this. If it is on, multiple instances of a map task or reduce task can start on different nodes, and the job tracker, based on % completion, decides which one goes through; the other instances are killed.
Let us say you have a 224 MB file. When you add that file to HDFS, based on the default block size of 64 MB, the file is split into 4 blocks [blk1=64M, blk2=64M, blk3=64M, blk4=32M]. Let us assume blk1 is on node1 (represented as blk1::node1), blk2::node2, blk3::node3, blk4::node4. Now when you run the MR job, the maps need to access the input file, so the MR framework creates 4 mappers, one executed on each node. Now come the reducers: as Venkat said, it depends on the number of reducers configured for your job. The reducers can be configured using the Hadoop org.apache.hadoop.mapreduce.Job setNumReduceTasks(int tasks) API.
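The block layout above can be sketched as follows (plain Java; the 224 MB file and 64 MB block size are the example's own values):

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitSketch {
    // Split a file of totalMb megabytes into HDFS-style blocks of blockMb
    // each; the last block holds whatever remains.
    static List<Integer> blocks(int totalMb, int blockMb) {
        List<Integer> result = new ArrayList<>();
        for (int remaining = totalMb; remaining > 0; remaining -= blockMb) {
            result.add(Math.min(remaining, blockMb));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(blocks(224, 64)); // [64, 64, 64, 32]
    }
}
```

Four blocks means four mappers, one per block, each preferring the node that stores its block.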

Can reducers and mappers be on the same data node?

I have started reading about Big Data and Hadoop, so this question may sound very stupid to you.
This is what I know.
Each mapper processes a small amount of data and produces an intermediate output.
After this, we have the step of shuffle and sort.
Now, Shuffle = Moving intermediate output over to respective Reducers each dealing with a particular key/keys.
So, can one Data Node have both the Mapper and Reducer code running on it, or do we have different DNs for each?
Terminology: Datanodes are for HDFS (storage). Mappers and Reducers (compute) run on nodes that have the TaskTracker daemon on them.
The number of mappers and reducers per tasktracker are controlled by the configs:
mapred.tasktracker.map.tasks.maximum
and
mapred.tasktracker.reduce.tasks.maximum
Subject to other limits in other configs, theoretically, as long as the tasktracker doesn't have the maximum number of map or reduce tasks, it may get assigned more map or reduce tasks by the jobtracker. Typically the jobtracker will try to assign tasks to reduce the amount of data movement.
So, yes, you can have mappers and reducers running on the same node at the same time.
You can have both mappers and reducers running on the same node. As an example, consider a single-node Hadoop cluster. In a single-node Hadoop cluster, the entire HDFS storage (DataNode, NameNode), the JobTracker, and the TaskTracker all run on the same node.
In this case both the mappers and reducers run on the same node.

What happens if the number of reducers is more than the number of datanodes

If I have 3 datanodes and I set the number of reduce tasks to 4, what happens in this case? Will the fourth one stand by until one of the datanodes finishes its reduce task? Or will two of them run on the same datanode at the same time?
Adding to Chaos's answer: if you have set the number of reduce tasks to a number greater than the reduce slots available throughout the cluster, the remaining reduce tasks will run whenever previously occupied reduce slots free up.
Reduce tasks do not depend on datanodes; they depend on the number of slots assigned to a particular node. The TaskTracker is responsible for running tasks in these slots on any node in the cluster. You can have more than 1 slot per node, so you can have more than 1 reduce task running per node.
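The scheduling above reduces to simple arithmetic (a plain-Java sketch; 4 reduce tasks and 3 slots are the question's own numbers):

```java
public class ReduceWaveSketch {
    // With more reduce tasks than cluster-wide reduce slots, the surplus
    // tasks wait for a slot to free up, so the work runs in
    // ceil(tasks / slots) waves.
    static int waves(int reduceTasks, int reduceSlots) {
        return (reduceTasks + reduceSlots - 1) / reduceSlots;
    }

    public static void main(String[] args) {
        // 4 reduce tasks, 3 slots: three run first, the fourth waits.
        System.out.println(waves(4, 3)); // 2
        // With 2 slots per node on 3 nodes (6 slots), all 4 run at once.
        System.out.println(waves(4, 6)); // 1
    }
}
```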
