Hadoop, Launched reduce tasks = 1 - hadoop

Hadoop is running on a cluster of 8 nodes. The submitted job produces several key-value objects as mapper output with different keys (manually checked), so I except to have several launched reducers to manage the data in the nodes.
I don't know why, as the log report, the number of launched reduce tasks is always 1. Since there are tens different keys I expect to have at least as many reducers as the number of nodes, i.e. 8 (which is also the number of slaves).
This is the log when job ends
13/05/25 04:02:31 INFO mapred.JobClient: Job complete: job_201305242051_0051
13/05/25 04:02:31 INFO mapred.JobClient: Counters: 30
13/05/25 04:02:31 INFO mapred.JobClient: Job Counters
13/05/25 04:02:31 INFO mapred.JobClient: Launched reduce tasks=1
13/05/25 04:02:31 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=21415994
13/05/25 04:02:31 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/05/25 04:02:31 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/05/25 04:02:31 INFO mapred.JobClient: Rack-local map tasks=7
13/05/25 04:02:31 INFO mapred.JobClient: Launched map tasks=33
13/05/25 04:02:31 INFO mapred.JobClient: Data-local map tasks=26
13/05/25 04:02:31 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=5486645
13/05/25 04:02:31 INFO mapred.JobClient: File Output Format Counters
13/05/25 04:02:31 INFO mapred.JobClient: Bytes Written=2798
13/05/25 04:02:31 INFO mapred.JobClient: FileSystemCounters
13/05/25 04:02:31 INFO mapred.JobClient: FILE_BYTES_READ=2299685944
13/05/25 04:02:31 INFO mapred.JobClient: HDFS_BYTES_READ=2170126861
13/05/25 04:02:31 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2879025663
13/05/25 04:02:31 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=2798
13/05/25 04:02:31 INFO mapred.JobClient: File Input Format Counters
13/05/25 04:02:31 INFO mapred.JobClient: Bytes Read=2170123000
Other (useful?) information:
for each node I have 1 core assigned to the job
I manually checked that the job is effectively running on 8 nodes.
There is no parameter set by me for setting the reducers tasks fixed to one
Hadoop version: 1.1.2
So, do you have any idea of why the reducer number is 1? and not more?
Thanks

You should:
firstly checkout whether your cluster support more than 1 reducer
Specify the reduce members you want to run
checkout the supported reducer count
The most convienient way to checkout it out is using the jobtracker webUI: http://localhost:50030/machines.jsp?type=active ( you may need to remove localhost with the hostname that the jobtracker is running. It will show all the active TaskTrackers in your cluster, and how many reducers each TaskTracker could run concurrently.
Specify the reducer number
There are three ways for you:
Specify the reducer number in your code
Like zsxwing have showed out, you should specify the reducer number by calls setNumReduceTasks() method of JobConf. And give the reduce number as the parameter.
Specify the reducer number in your command line
you could also pass the reducer number in command line like the following:
bin/hadoop jar hadoop-examples-1.0.4.jar terasort -Dmapred.reduce.tasks=2 teragen teragen_out.
The above command line will start 2 reducers.
Specify the reducer number in your conf/mapred-site.xml
You can also add a new property in your mapred-site.xml like this:
<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
</property>

You need to set the reducer number by yourself (default is 1) no matter how many keys outputed by mappers. You can use job.setNumReduceTasks(5) to set the reduce tasks 5.

Related

Hadoop producing no output?

I've recently started learning how to use the Hadoop system, and decided it's time to try writing some code. Before that, I wanted to try running the examples seen in the Getting Started page. However, it does not seem to produce any visible results.
I'm currently using Hadoop version 3.3.1 using a single-node setup,
and using jdk 11.0.11. I am running this on Windows 10 (due to current development requirements).
I've used the following command on cmd:
hadoop jar %hadoop_home%/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input /output 'dfs[a-z.]+'
The output to the command:
C:\Windows\system32>hadoop jar %hadoop_home%/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input /output 'dfs[a-z.]+'
2021-12-15 00:33:10,486 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2021-12-15 00:33:10,800 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/E/.staging/job_1639519343908_0005
2021-12-15 00:33:11,029 INFO input.FileInputFormat: Total input files to process : 10
2021-12-15 00:33:11,108 INFO mapreduce.JobSubmitter: number of splits:10
2021-12-15 00:33:11,281 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1639519343908_0005
2021-12-15 00:33:11,281 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-12-15 00:33:11,442 INFO conf.Configuration: resource-types.xml not found
2021-12-15 00:33:11,443 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-12-15 00:33:11,497 INFO impl.YarnClientImpl: Submitted application application_1639519343908_0005
2021-12-15 00:33:11,527 INFO mapreduce.Job: The url to track the job: http://DESKTOP-S15C716:8088/proxy/application_1639519343908_0005/
2021-12-15 00:33:11,528 INFO mapreduce.Job: Running job: job_1639519343908_0005
2021-12-15 00:33:19,611 INFO mapreduce.Job: Job job_1639519343908_0005 running in uber mode : false
2021-12-15 00:33:19,615 INFO mapreduce.Job: map 0% reduce 0%
2021-12-15 00:33:31,178 INFO mapreduce.Job: map 50% reduce 0%
2021-12-15 00:33:32,263 INFO mapreduce.Job: map 60% reduce 0%
2021-12-15 00:33:39,624 INFO mapreduce.Job: map 90% reduce 0%
2021-12-15 00:33:40,632 INFO mapreduce.Job: map 100% reduce 0%
2021-12-15 00:33:41,636 INFO mapreduce.Job: map 100% reduce 100%
2021-12-15 00:33:41,648 INFO mapreduce.Job: Job job_1639519343908_0005 completed successfully
2021-12-15 00:33:41,760 INFO mapreduce.Job: Counters: 51
File System Counters
FILE: Number of bytes read=6
FILE: Number of bytes written=3021766
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=31877
HDFS: Number of bytes written=86
HDFS: Number of read operations=35
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Killed map tasks=1
Launched map tasks=10
Launched reduce tasks=1
Data-local map tasks=10
Total time spent by all maps in occupied slots (ms)=89653
Total time spent by all reduces in occupied slots (ms)=8222
Total time spent by all map tasks (ms)=89653
Total time spent by all reduce tasks (ms)=8222
Total vcore-milliseconds taken by all map tasks=89653
Total vcore-milliseconds taken by all reduce tasks=8222
Total megabyte-milliseconds taken by all map tasks=91804672
Total megabyte-milliseconds taken by all reduce tasks=8419328
Map-Reduce Framework
Map input records=819
Map output records=0
Map output bytes=0
Map output materialized bytes=60
Input split bytes=1139
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=60
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=90
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=2952790016
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=30738
File Output Format Counters
Bytes Written=86
2021-12-15 00:33:41,790 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2021-12-15 00:33:41,814 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/E/.staging/job_1639519343908_0006
2021-12-15 00:33:41,855 INFO input.FileInputFormat: Total input files to process : 1
2021-12-15 00:33:41,913 INFO mapreduce.JobSubmitter: number of splits:1
2021-12-15 00:33:41,950 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1639519343908_0006
2021-12-15 00:33:41,950 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-12-15 00:33:42,179 INFO impl.YarnClientImpl: Submitted application application_1639519343908_0006
2021-12-15 00:33:42,190 INFO mapreduce.Job: The url to track the job: http://DESKTOP-S15C716:8088/proxy/application_1639519343908_0006/
2021-12-15 00:33:42,191 INFO mapreduce.Job: Running job: job_1639519343908_0006
2021-12-15 00:33:55,301 INFO mapreduce.Job: Job job_1639519343908_0006 running in uber mode : false
2021-12-15 00:33:55,302 INFO mapreduce.Job: map 0% reduce 0%
2021-12-15 00:34:00,336 INFO mapreduce.Job: map 100% reduce 0%
2021-12-15 00:34:06,366 INFO mapreduce.Job: map 100% reduce 100%
2021-12-15 00:34:07,375 INFO mapreduce.Job: Job job_1639519343908_0006 completed successfully
2021-12-15 00:34:07,404 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=6
FILE: Number of bytes written=548197
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=212
HDFS: Number of bytes written=0
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3232
Total time spent by all reduces in occupied slots (ms)=3610
Total time spent by all map tasks (ms)=3232
Total time spent by all reduce tasks (ms)=3610
Total vcore-milliseconds taken by all map tasks=3232
Total vcore-milliseconds taken by all reduce tasks=3610
Total megabyte-milliseconds taken by all map tasks=3309568
Total megabyte-milliseconds taken by all reduce tasks=3696640
Map-Reduce Framework
Map input records=0
Map output records=0
Map output bytes=0
Map output materialized bytes=6
Input split bytes=126
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=6
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=13
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=536870912
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=86
File Output Format Counters
Bytes Written=0
Yet when viewing the contents of the now-made 'output' folder,
I receive the following result:
hdfs dfs -ls /output
Found 2 items
-rw-r--r-- 1 E supergroup 0 2021-12-15 00:34 /output/_SUCCESS
-rw-r--r-- 1 E supergroup 0 2021-12-15 00:34 /output/part-r-00000
I.e. there's no data written to those files!
May anyone please assist me?
If you have no data in your HDFS input folder that matches the grep pattern 'dfs[a-z.]+', then the output will be empty
From the linked docs (which are for Unix, not Windows), make sure this command completed
bin/hdfs dfs -put %HADOOP_HOME%/etc/hadoop/*.xml input
And you can grep dfs $HADOOP_HOME/etc/hadoop/*.xml (at least on Unix) as well, locally, to verify there should be data output

Why there is no reducer when running 1TB teragen?

I am running a terasort benchmark for hadoop using the following command:
jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen -Dmapreduce.job.maps=100 1t random-data
and got the following logs printed for 100 map tasks:
18/03/27 13:06:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/27 13:06:04 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
18/03/27 13:06:05 INFO terasort.TeraSort: Generating -727379968 using 100
18/03/27 13:06:05 INFO mapreduce.JobSubmitter: number of splits:100
18/03/27 13:06:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1522131782827_0001
18/03/27 13:06:06 INFO impl.YarnClientImpl: Submitted application application_1522131782827_0001
18/03/27 13:06:06 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1522131782827_0001/
18/03/27 13:06:06 INFO mapreduce.Job: Running job: job_1522131782827_0001
18/03/27 13:06:16 INFO mapreduce.Job: Job job_1522131782827_0001 running in uber mode : false
18/03/27 13:06:16 INFO mapreduce.Job: map 0% reduce 0%
18/03/27 13:06:29 INFO mapreduce.Job: map 2% reduce 0%
18/03/27 13:06:31 INFO mapreduce.Job: map 3% reduce 0%
18/03/27 13:06:32 INFO mapreduce.Job: map 5% reduce 0%
....
18/03/27 13:09:27 INFO mapreduce.Job: map 100% reduce 0%
and here is the final counters as printed on console:
18/03/27 13:09:29 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=10660990
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=8594
HDFS: Number of bytes written=0
HDFS: Number of read operations=400
HDFS: Number of large read operations=0
HDFS: Number of write operations=200
Job Counters
Launched map tasks=100
Other local map tasks=100
Total time spent by all maps in occupied slots (ms)=983560
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=983560
Total vcore-milliseconds taken by all map tasks=983560
Total megabyte-milliseconds taken by all map tasks=1007165440
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=8594
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=9746
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=11220811776
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
and here is the output on job schedular:
Please suggest why there is no reduce task?
Your run command says that you're running teragen and not terasort. teragen simply generates data that you can then use for terasort, and so no reducers are needed.
To run terasort over the data that you've just generated, run:
hadoop jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort random-data terasort-output
You should then see reducers.
No reduce tasks run when executing teragen. Here is the documentation:
TeraGen will run map tasks to generate the data and will not run any reduce tasks. The default number of map task is defined by the "mapreduce.job.maps=2" param. It's the only purpose here is to generate the 1TB of random data in the following format " 10 bytes key | 2 bytes break | 32 bytes acsii/hex | 4 bytes break | 48 bytes filler | 4 bytes break | \r\n".

Mapreduce api giving wrong mapper count

I am trying to get the number of mappers in a mapreduce program by using below piece of code. I get the value for mapreduce.job.maps as 2 but the program actually launches 6 mappers as there are 6 small files. Anyone getting similar issue?
Code
job.getConfiguration().get("mapreduce.job.maps")
Log:
num of mappers : 2
...
17/05/13 06:56:47 INFO input.FileInputFormat: Total input paths to process : 6
17/05/13 06:56:47 INFO mapreduce.JobSubmitter: number of splits:6
...
17/05/13 06:56:48 INFO mapreduce.Job: Running job: job_1494588725898_0047
17/05/13 06:56:59 INFO mapreduce.Job: Job job_1494588725898_0047 running in uber mode : false
17/05/13 06:56:59 INFO mapreduce.Job: map 0% reduce 0%
...
17/05/13 06:57:39 INFO mapreduce.Job: map 100% reduce 100%
17/05/13 06:57:40 INFO mapreduce.Job: Job job_1494588725898_0047 completed successfully
17/05/13 06:57:40 INFO mapreduce.Job: Counters: 49
File System Counters
...
Job Counters
Launched map tasks=6
Launched reduce tasks=2
This is not an issue, but the actual behaviour of MapReduce.
The value you get for mapreduce.job.maps property is its default value, 2. The number of mapper tasks will always be determined from the File Input Splits, which is 6 in this scenario. And to get the actual number of map tasks launched for a job, you have to wait till the job is completed.

hadoop test examples to validate the installation

I have successfully configured Hadoop 2.4 on my Ubuntu 14.04 using this tutorial.
http://dogdogfish.com/2014/04/26/installing-hadoop-2-4-on-ubuntu-14-04/
Now after completing installtion how can I perform test on it?
How and where can I get the test data or jar files?
You have some example jars in your hadoop installation directory.
Simplest thing you can do is run the teragen example(or wordcount).
It is the first step in perform terasort.
Steps:
Go to the hadoop installation directory.
Run "hadoop jar hadoop-examples-0.20.2-cdh3u0.jar" to see all the jars you can run.
Go to home/[user] directory and create a file "example.txt" with the following data
"This is a file to test Hadoop Installation example
For the sake of the experiment, consider it to be 1TB"
While you are in that directory, run "hadoop dfs -put examples.txt /" this uploads the file onto your HDFS
Run "hadoop dfs -ls /" to check it is on there
Go to your Hadoop installation directory and run "hadoop jar hadoop-examples-0.20.2-cdh3u0.jar teragen 1000 /user/teragendata" - 1000 is the size data is to be broken into and the other param is the output directory.
On successful execution, you will see something like the text at the bottom.
Now to see how your MR job was run, in your browser open JobTracker and see the completed jobs. "localhost50030/jobtracker.jsp"
cloudera#cloudera-vm:/usr/lib/hadoop$ hadoop jar hadoop-examples-0.20.2-cdh3u0.jar teragen 600 /user/teragendata
Generating 600 using 2 maps with step of 300
14/07/24 09:02:44 INFO mapred.JobClient: Running job: job_201407230030_0008
14/07/24 09:02:45 INFO mapred.JobClient: map 0% reduce 0%
14/07/24 09:02:57 INFO mapred.JobClient: map 100% reduce 0%
14/07/24 09:03:00 INFO mapred.JobClient: Job complete: job_201407230030_0008
14/07/24 09:03:00 INFO mapred.JobClient: Counters: 13
14/07/24 09:03:00 INFO mapred.JobClient: Job Counters
14/07/24 09:03:00 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=22008
14/07/24 09:03:00 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/24 09:03:00 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/07/24 09:03:00 INFO mapred.JobClient: Launched map tasks=2
14/07/24 09:03:00 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
14/07/24 09:03:00 INFO mapred.JobClient: FileSystemCounters
14/07/24 09:03:00 INFO mapred.JobClient: HDFS_BYTES_READ=164
14/07/24 09:03:00 INFO mapred.JobClient: FILE_BYTES_WRITTEN=105150
14/07/24 09:03:00 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=60000
14/07/24 09:03:00 INFO mapred.JobClient: Map-Reduce Framework
14/07/24 09:03:00 INFO mapred.JobClient: Map input records=600
14/07/24 09:03:00 INFO mapred.JobClient: Spilled Records=0
14/07/24 09:03:00 INFO mapred.JobClient: Map input bytes=600
14/07/24 09:03:00 INFO mapred.JobClient: Map output records=600
14/07/24 09:03:00 INFO mapred.JobClient: SPLIT_RAW_BYTES=164

Too many fetch failures: Hadoop on cluster (x2)

I have been using Hadoop for the last week or so (trying to get to grips with it), and although I have been able to set up a multinode cluster (2 machines: 1 laptop and a small desktop) and retrieve results, I always seem to encounter "Too many fetch failures" when I run a hadoop job.
An example output (on a trivial wordcount example) is:
hadoop#ap200:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount sita sita-output3X
11/05/20 15:02:05 INFO input.FileInputFormat: Total input paths to process : 7
11/05/20 15:02:05 INFO mapred.JobClient: Running job: job_201105201500_0001
11/05/20 15:02:06 INFO mapred.JobClient: map 0% reduce 0%
11/05/20 15:02:23 INFO mapred.JobClient: map 28% reduce 0%
11/05/20 15:02:26 INFO mapred.JobClient: map 42% reduce 0%
11/05/20 15:02:29 INFO mapred.JobClient: map 57% reduce 0%
11/05/20 15:02:32 INFO mapred.JobClient: map 100% reduce 0%
11/05/20 15:02:41 INFO mapred.JobClient: map 100% reduce 9%
11/05/20 15:02:49 INFO mapred.JobClient: Task Id : attempt_201105201500_0001_m_000003_0, Status : FAILED
Too many fetch-failures
11/05/20 15:02:53 INFO mapred.JobClient: map 85% reduce 9%
11/05/20 15:02:57 INFO mapred.JobClient: map 100% reduce 9%
11/05/20 15:03:10 INFO mapred.JobClient: Task Id : attempt_201105201500_0001_m_000002_0, Status : FAILED
Too many fetch-failures
11/05/20 15:03:14 INFO mapred.JobClient: map 85% reduce 9%
11/05/20 15:03:17 INFO mapred.JobClient: map 100% reduce 9%
11/05/20 15:03:25 INFO mapred.JobClient: Task Id : attempt_201105201500_0001_m_000006_0, Status : FAILED
Too many fetch-failures
11/05/20 15:03:29 INFO mapred.JobClient: map 85% reduce 9%
11/05/20 15:03:32 INFO mapred.JobClient: map 100% reduce 9%
11/05/20 15:03:35 INFO mapred.JobClient: map 100% reduce 28%
11/05/20 15:03:41 INFO mapred.JobClient: map 100% reduce 100%
11/05/20 15:03:46 INFO mapred.JobClient: Job complete: job_201105201500_0001
11/05/20 15:03:46 INFO mapred.JobClient: Counters: 25
11/05/20 15:03:46 INFO mapred.JobClient: Job Counters
11/05/20 15:03:46 INFO mapred.JobClient: Launched reduce tasks=1
11/05/20 15:03:46 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=72909
11/05/20 15:03:46 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
11/05/20 15:03:46 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
11/05/20 15:03:46 INFO mapred.JobClient: Launched map tasks=10
11/05/20 15:03:46 INFO mapred.JobClient: Data-local map tasks=10
11/05/20 15:03:46 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=76116
11/05/20 15:03:46 INFO mapred.JobClient: File Output Format Counters
11/05/20 15:03:46 INFO mapred.JobClient: Bytes Written=1412473
11/05/20 15:03:46 INFO mapred.JobClient: FileSystemCounters
11/05/20 15:03:46 INFO mapred.JobClient: FILE_BYTES_READ=4462381
11/05/20 15:03:46 INFO mapred.JobClient: HDFS_BYTES_READ=6950740
11/05/20 15:03:46 INFO mapred.JobClient: FILE_BYTES_WRITTEN=7546513
11/05/20 15:03:46 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1412473
11/05/20 15:03:46 INFO mapred.JobClient: File Input Format Counters
11/05/20 15:03:46 INFO mapred.JobClient: Bytes Read=6949956
11/05/20 15:03:46 INFO mapred.JobClient: Map-Reduce Framework
11/05/20 15:03:46 INFO mapred.JobClient: Reduce input groups=128510
11/05/20 15:03:46 INFO mapred.JobClient: Map output materialized bytes=2914947
11/05/20 15:03:46 INFO mapred.JobClient: Combine output records=201001
11/05/20 15:03:46 INFO mapred.JobClient: Map input records=137146
11/05/20 15:03:46 INFO mapred.JobClient: Reduce shuffle bytes=2914947
11/05/20 15:03:46 INFO mapred.JobClient: Reduce output records=128510
11/05/20 15:03:46 INFO mapred.JobClient: Spilled Records=507835
11/05/20 15:03:46 INFO mapred.JobClient: Map output bytes=11435785
11/05/20 15:03:46 INFO mapred.JobClient: Combine input records=1174986
11/05/20 15:03:46 INFO mapred.JobClient: Map output records=1174986
11/05/20 15:03:46 INFO mapred.JobClient: SPLIT_RAW_BYTES=784
11/05/20 15:03:46 INFO mapred.JobClient: Reduce input records=201001
I did a google on the problem, and the people at apache seem to suggest it could be anything from a networking problem (or something to do with /etc/hosts files) or could be a corrupt disk on the slave nodes.
Just to add: I do see 2 "live nodes" on namenode Admin panel (localhost:50070/dfshealth) and under Map/reduce Admin, I see 2 nodes aswell.
Any clues as to how I can avoid these errors?
Thanks in advance.
Edit:1:
The tasktracker log is on: http://pastebin.com/XMkNBJTh
The datanode log is on: http://pastebin.com/ttjR7AYZ
Many thanks.
Modify datanode node/etc/hosts file.
Each line is divided into three parts. The first part is the network IP address, the second part is the host name or domain name, the third part is the host alias detailed steps are as follows:
First check the host name:
cat / proc / sys / kernel / hostname
You will see a HOSTNAME attribute. Change the value of the IP behind on OK and then exit.
Use the command:
hostname ***. ***. ***. ***
Asterisk is replaced by the corresponding IP.
Modify the the hosts configuration similarly, as follows:
127.0.0.1 localhost.localdomain localhost
:: 1 localhost6.localdomain6 localhost6
10.200.187.77 10.200.187.77 hadoop-datanode
If the IP address is configured and successfully modified, or show host name there is a problem, continue to modify the hosts file.
Following solution will definitely work
1.Remove or comment line with Ip 127.0.0.1 and 127.0.1.1
2.use host name not alias for referring node in host file and Master/slave file present in hadoop directory
-->in Host file 172.21.3.67 master-ubuntu
-->in master/slave file master-ubuntu
3. see for NameSpaceId of namenode = NameSpaceId of Datanode
I had the same problem: "Too many fetch failures" and very slow Hadoop performance (the simple wordcount example took more than 20 minutes to run on a 2-node cluster of powerful servers). I also got "WARN mapred.JobClient: Error reading task outputConnection refused" errors.
The problem was fixed, when I followed the instruction by Thomas Jungblut: I removed my master node from the slaves configuration file. After this, the errors disappeared and the wordcount example took only 1 minute.

Resources