number of reducers for 1 task in MapReduce - hadoop

In a typical MapReduce setup (like Hadoop), how many reducers are used for one task, for example, counting words? My understanding of Google's MapReduce is that only 1 reducer is involved. Is that correct?
For example, word count will divide the input into N chunks, and N map tasks will run, each producing a (word, count) list. My question is: once the map phase is done, will there be only ONE reducer instance running to compute the result, or will there be reducers running in parallel?

The simple answer is that the number of reducers does not have to be 1 and yes, reducers can run in parallel. As I mentioned above, this number is user defined or derived.
To keep things in context I will refer to Hadoop in this case so you have an idea of how things work. If you are using the streaming API in Hadoop (0.20.2), you have to define explicitly how many reducers you would like to run, since by default only 1 reduce task is launched. You do so by passing -D mapred.reduce.tasks=<number of reducers> to the job. The Java API will try to derive the number of reducers you need, but again, you can set it explicitly too. In both cases, there is a hard cap on the number of reducers you can run per node, and that is set in your mapred-site.xml configuration file using mapred.tasktracker.reduce.tasks.maximum.
On a more conceptual note, you can look at the post on the Hadoop wiki that talks about choosing the number of map and reduce tasks.

In the case of the simple word count example it would make sense to use only one reducer.
If you want the result of the computation to be a single number, you have to use one reducer (2 or more reducers would give you 2 or more output files).
If this reducer takes a long time to complete, you can think of chaining multiple reducers, where the reducers in the next phase sum the results from the previous reducers.
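The chaining idea above can be sketched outside Hadoop in a few lines of plain Java (the class and method names here are my own illustration, and the sample numbers are made up): first-phase reducers each produce a partial sum, and a single second-phase reducer adds the partials into the one final number.

```java
import java.util.Arrays;

// Two-stage reduction sketch: several first-round "reducers" sum their
// own partitions in parallel; one second-round reducer combines them.
public class TwoStageSum {
    // First stage: each reducer sums the values in its partition.
    static long partialSum(long[] partition) {
        return Arrays.stream(partition).sum();
    }

    // Second stage: a single reducer adds up the partial sums,
    // yielding the one output file/number the questioner wants.
    static long finalSum(long[] partials) {
        return Arrays.stream(partials).sum();
    }

    public static void main(String[] args) {
        long p1 = partialSum(new long[]{1, 2, 3}); // reducer 1 of phase 1
        long p2 = partialSum(new long[]{4, 5});    // reducer 2 of phase 1
        System.out.println(finalSum(new long[]{p1, p2})); // prints 15
    }
}
```

In a real job the first phase would be many reducers writing part files, and the second phase a one-reducer job reading those files.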

This depends entirely on the situation. In some cases, you don't have any reducers at all; everything can be done map-side. In other cases, you cannot avoid having one reducer, but generally this comes in a 2nd or 3rd map/reduce job that condenses earlier results. Generally, however, you want to have a lot of reducers, or else you are losing a lot of the power of MapReduce! In word count, for example, the result of your mappers will be (word, count) pairs. These pairs are then partitioned based on the word, such that each reducer receives all occurrences of the same words and can give you the final sum. Each reducer then outputs its result. If you wanted to, you could then shoot off another M/R job that took all of these files and concatenated them; that job would only have one reducer.

The default value is 1.
If you are considering Hive or Pig, then the number of reducers depends on the query (group by, sum, and so on).
In the case of your own MapReduce code, it can be defined with setNumReduceTasks on the job/conf:
job.setNumReduceTasks(3);
Most of the time this is done when you override getPartition(), i.e. when you are using a custom partitioner:
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

class CustomPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks) {
        if (numReduceTasks == 0)
            return 0;
        // Illustrative conditions -- replace with your own routing logic;
        // returning 2 assumes the job was configured with 3 reducers as above.
        if (key.toString().startsWith("a"))
            return 0;
        else if (key.toString().startsWith("b"))
            return 1;
        else
            return 2;
    }
}
One thing you will notice is that the number of reducers equals the number of part files in the output.
Let me know if you have doubts.

The reducers run in parallel. Whatever number of reducers you have set for your job, whether by changing the mapred-site.xml config file, by passing it on the command line when running the job, or by setting it in the program, is the number of reducers that will run in parallel. It is not necessary to keep it at 1; by default its value is 1.

Related

number of mapper and reducer tasks in MapReduce

If I set the number of reduce tasks to something like 100 and, when I run the job, the number of reduce tasks exceeds that (as per my understanding, the number of reduce tasks depends on the key-values we get from the mapper; suppose I set (1,abc) and (2,bcd) as key-value pairs in the mapper, then the number of reduce tasks will be 2), how will MapReduce handle it?
as per my understanding the number of reduce tasks depends on the key-value we get from the mapper
Your understanding seems to be wrong. The number of reduce tasks does not depend on the key-value pairs we get from the mapper.
In a MapReduce job the number of reducers is configurable on a per job basis and is set in the driver class.
For example, if we need 2 reducers for our job, then we need to set it in the driver class of our MapReduce job as below:
job.setNumReduceTasks(2);
In Hadoop: The Definitive Guide, Tom White states that
setting the reducer count is something of an art rather than a science.
So we have to decide how many reducers we need for our job. For your example, if you have the intermediate mapper output (1,abc) and (2,bcd) and you have not set the number of reducers in the driver class, then MapReduce runs only 1 reducer by default: both key-value pairs will be processed by a single reducer and you will get a single output file in the specified output directory.
The default number of reducers in MapReduce is 1, irrespective of the number of (key,value) pairs.
If you set the number of reducers for a MapReduce job, then the number of reducers will not exceed the defined value, irrespective of the number of distinct (key,value) pairs.
Once the mapper tasks are complete, the output is processed by the partitioner, which divides the data among the reducers. The default partitioner for Hadoop is HashPartitioner, which partitions the data based on the hash value of the keys. It has a method called getPartition, which takes key.hashCode() & Integer.MAX_VALUE and finds the modulus using the number of reduce tasks.
So the number of reducers will never exceed what you have defined in the driver class.
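The HashPartitioner arithmetic described above is easy to check in plain Java (this is a standalone sketch of the same formula, not Hadoop's own class): the sign-bit mask keeps the hash non-negative, and the modulus guarantees the result is always a valid reducer index, which is why the reducer count can never be exceeded.

```java
// Standalone sketch of the default HashPartitioner formula:
// (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
public class HashPartitionDemo {
    static int getPartition(String key, int numReduceTasks) {
        // The & Integer.MAX_VALUE clears the sign bit so that the
        // modulus never produces a negative partition number.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // The same key always maps to the same reducer, and the
        // result always lies in [0, numReduceTasks).
        System.out.println(getPartition("abc", 2));
        System.out.println(getPartition("bcd", 2));
    }
}
```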

Hadoop-2.4.1 custom partitioner to balance reducers

As we know, during the shuffle phase of Hadoop, each reducer reads data from all of the mappers' output (intermediate data).
Now, we also know that by default hash partitioning is used for the reducers.
My question is: how do we implement an algorithm that is, e.g., locality-aware?
In short, you should not do it.
First, you have no control over where the mappers and reducers are executed on the cluster, so even when the complete output of a single mapper goes to a single reducer, there is a high probability that they will be on different hosts and the data will be transferred through the network.
Second, to make a reducer process the whole output of a mapper, you first have to make the mapper process the right part of the information, which means you have to preprocess the data by partitioning it and then run a single mapper and a single reducer for each partition. This preprocessing itself would take many resources, so it is mostly pointless.
And finally, why do you need it? The core concept of map-reduce is manipulating key-value pairs, and a reducer should in general aggregate the list of values emitted by the mappers for the same keys. That is why hash partitioning is used: to distribute N keys between K reducers. Using a different type of partitioner is a really rare case. If you need data locality, you might prefer to work with an MPP database rather than Hadoop, for example.
If you really need a custom partitioner, here's an example of how it can be implemented: http://hadooptutorial.wikispaces.com/Custom+partitioner. Nothing special: just return the reducer number based on the key and value passed in and the number of reducers. Using the hash code of the host name modulo (%) the number of reducers will make the whole output of a single mapper go to a single reducer. You might also use the process PID % the number of reducers. But before doing this, you have to check whether you really need this behavior.
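As a rough standalone sketch of the host-name trick mentioned above (the host name is hard-coded here for illustration, since a real mapper would look up its own host; the class and method names are mine):

```java
// Route records by host name so everything a given host produces
// lands on one reducer -- the locality-style trick from the answer.
public class HostPartitionDemo {
    static int partitionForHost(String hostname, int numReducers) {
        // Same masked-hash-modulo arithmetic as HashPartitioner,
        // applied to the host name instead of the record key.
        return (hostname.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int p1 = partitionForHost("worker-01", 4); // example host name
        int p2 = partitionForHost("worker-01", 4);
        System.out.println(p1 == p2); // same host -> same reducer: true
    }
}
```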

Creating more partitions than reducers

When developing locally on my single machine, I believe the default number of reducers is 6. In a particular MR step, I actually divide the data into n partitions, where n can be greater than 6. From what I have observed, it looks like only 6 of those partitions actually get processed, because I only see output from 6 specific partitions. A few questions:
(a) Do I need to set the number of reducers to be greater than the number of partitions? If so, can I do this before/during/after running the Mapper?
(b) Why is it that the other partitions are not queued up? Is there a way to wait for a reducer to finish processing one partition before working on another partition such that all partitions can be processed regardless of whether the actual number of reducers is less than the number of partitions?
(a) No. You can have any number of reducers based on your needs. Partitioning just decides which set of key/value pairs goes to which reducer; it doesn't decide how many reducers will be spawned. But if there is a situation where you want to set the number of reducers as per your requirement, you can do that through the Job:
job.setNumReduceTasks(2);
(b) This is actually what happens. Based on the availability of slots, a set of reducers is launched, which process all the input fed to them. If all of those reducers have finished and some data is still left unprocessed, a second batch of reducers will start and finish the rest of the data. All of your data will eventually get processed, irrespective of the number of partitions and reducers.
Please make sure your partition logic is correct.
P.S. : Why do you believe the default number of reducers is 6?
You can also specify the number of reducers when you submit the job to Hadoop:
$hadoop jar myjarfile mymainclass -Dmapreduce.job.reduces=n myinput myoutputdir
For more options and some details see:
Hadoop Number of Reducers Configuration Options Priority

how to start sort and reduce in hadoop before shuffle completes for all mappers?

I understand from When do reduce tasks start in Hadoop that the reduce task in Hadoop contains three steps: shuffle, sort and reduce, where the sort (and after it the reduce) can only start once all the mappers are done. Is there a way to start the sort and reduce every time a mapper finishes?
For example, let's say we have only one job with mappers mapperA and mapperB and 2 reducers. What I want to do is:
mapperA finishes
shuffle copies the appropriate partitions of mapperA's output to, say, reducers 1 and 2
reducers 1 and 2 start sorting and reducing and generate some intermediate output
now mapperB finishes
shuffle copies the appropriate partitions of mapperB's output to reducers 1 and 2
sort and reduce on reducers 1 and 2 start again, and each reducer merges the new output with its old one
Is this possible? Thanks
You can't with the current implementation. However, people have "hacked" the Hadoop code to do what you want to do.
In the MapReduce model, you need to wait for all mappers to finish, since the keys need to be grouped and sorted; plus, you may have some speculative mappers running and you do not yet know which of the duplicate mappers will finish first.
However, as the "Breaking the MapReduce Stage Barrier" paper indicates, for some applications it may make sense not to wait for all of the mappers' output. If you want to implement this sort of behavior (most likely for research purposes), then you should take a look at the org.apache.hadoop.mapred.ReduceTask.ReduceCopier class, which implements ShuffleConsumerPlugin.
EDIT: Finally, as @teo points out in this related SO question, the ReduceCopier.fetchOutputs() method is the one that holds the reduce task from running until all map outputs are copied (through the while loop in line 2026 of Hadoop release 1.0.4).
You can configure this using the slowstart property, which denotes the fraction of your mappers that need to have finished before the copy to the reducers starts. Its default is low (0.05, i.e. 5%, in recent Hadoop versions), but you can override it to 0 if you want:
`mapreduce.job.reduce.slowstart.completedmaps`
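What the slowstart fraction means can be shown with a little arithmetic (a standalone sketch; the method name is mine, and 0.05 is used only as an example value, so check mapred-default.xml for your distribution's actual default):

```java
// With N map tasks and slowstart fraction f, reducers begin fetching
// map output once roughly ceil(f * N) maps have completed.
public class SlowstartDemo {
    static int mapsBeforeReduceStarts(int totalMaps, double slowstart) {
        return (int) Math.ceil(slowstart * totalMaps);
    }

    public static void main(String[] args) {
        System.out.println(mapsBeforeReduceStarts(100, 0.05)); // prints 5
        System.out.println(mapsBeforeReduceStarts(100, 0.0));  // prints 0: copy starts immediately
    }
}
```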
Starting the sort process before all mappers finish is something of a Hadoop anti-pattern (if I may put it that way!), in that the reducers cannot know that there is no more data to receive until all mappers finish. You, the invoker, may know that, based on your definition of keys, partitioner, etc., but the reducers don't.

In Hadoop, is there a way to see the key/value pairs sent to a reducer for a running task?

For one of my hadoop jobs, the amount of data fed into my reducer tasks is extremely unbalanced. For instance, if I have 10 reducer tasks, the input size to 9 of them will be in the 50KB range and the last will be close to 200GB. I suspect that my mappers are generating a large number of values for a single key but I don't know what that key is. It's a legacy job and I don't have access to the source code anymore. Is there a way to see the key/value pairs, either as output from the mapper or input to the reducer, while the job is running?
Try adding this to your CLI job run: -D mapred.reduce.tasks=0
This should set the number of reducers to 0, which in effect will have the mappers dump their output directly to HDFS. However, there may be some code that overrides the number of reducers regardless, so this might not work.
If it does work, it will show you the output of the mapper.
You can always count the total number of values for each of your keys with a separate, simple MapReduce job.
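That diagnostic job can be sketched in plain Java as a local simulation (not Hadoop code; the sample keys are made up): emit (key, 1) for every record and sum per key, and the skewed key shows up as the one with the enormous count.

```java
import java.util.HashMap;
import java.util.Map;

// Local simulation of a "count values per key" MapReduce job, used to
// find the key responsible for the oversized reducer input.
public class KeyCountDemo {
    static Map<String, Long> countPerKey(String[] keys) {
        Map<String, Long> counts = new HashMap<>();
        for (String k : keys) {
            counts.merge(k, 1L, Long::sum); // reduce step: sum the 1s
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "a"}; // "a" plays the skewed key
        System.out.println(countPerKey(keys)); // prints {a=3, b=1}
    }
}
```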
