Cloudera Hadoop MapReduce job: GC overhead limit exceeded error

I am running a canopy clustering job (using Mahout) on Cloudera CDH4. The content to be clustered has about 1M records (each record is less than 1 KB in size). The whole Hadoop environment (including all the nodes) runs in a single VM with 4 GB of memory. CDH4 was installed with its default settings. I got the exception shown in the log below when running the job.
Judging by the exception, the job client seems to need a larger JVM heap. However, Cloudera Manager has quite a few configuration options for JVM heap size. I changed "Client Java Heap Size in Bytes" from 256 MiB to 512 MiB, but it made no difference.
Any hints or tips on setting these heap size options?
13/07/03 17:12:45 INFO input.FileInputFormat: Total input paths to process : 1
13/07/03 17:12:46 INFO mapred.JobClient: Running job: job_201307031710_0001
13/07/03 17:12:47 INFO mapred.JobClient: map 0% reduce 0%
13/07/03 17:13:06 INFO mapred.JobClient: map 1% reduce 0%
13/07/03 17:13:27 INFO mapred.JobClient: map 2% reduce 0%
13/07/03 17:14:01 INFO mapred.JobClient: map 3% reduce 0%
13/07/03 17:14:50 INFO mapred.JobClient: map 4% reduce 0%
13/07/03 17:15:50 INFO mapred.JobClient: map 5% reduce 0%
13/07/03 17:17:06 INFO mapred.JobClient: map 6% reduce 0%
13/07/03 17:18:44 INFO mapred.JobClient: map 7% reduce 0%
13/07/03 17:20:24 INFO mapred.JobClient: map 8% reduce 0%
13/07/03 17:22:20 INFO mapred.JobClient: map 9% reduce 0%
13/07/03 17:25:00 INFO mapred.JobClient: map 10% reduce 0%
13/07/03 17:28:08 INFO mapred.JobClient: map 11% reduce 0%
13/07/03 17:31:46 INFO mapred.JobClient: map 12% reduce 0%
13/07/03 17:35:57 INFO mapred.JobClient: map 13% reduce 0%
13/07/03 17:40:52 INFO mapred.JobClient: map 14% reduce 0%
13/07/03 17:46:55 INFO mapred.JobClient: map 15% reduce 0%
13/07/03 17:55:02 INFO mapred.JobClient: map 16% reduce 0%
13/07/03 18:08:42 INFO mapred.JobClient: map 17% reduce 0%
13/07/03 18:59:11 INFO mapred.JobClient: map 8% reduce 0%
13/07/03 18:59:13 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_0, Status : FAILED
Error: GC overhead limit exceeded
13/07/03 18:59:23 INFO mapred.JobClient: map 9% reduce 0%
13/07/03 19:00:09 INFO mapred.JobClient: map 10% reduce 0%
13/07/03 19:01:49 INFO mapred.JobClient: map 11% reduce 0%
13/07/03 19:04:25 INFO mapred.JobClient: map 12% reduce 0%
13/07/03 19:07:48 INFO mapred.JobClient: map 13% reduce 0%
13/07/03 19:12:48 INFO mapred.JobClient: map 14% reduce 0%
13/07/03 19:19:46 INFO mapred.JobClient: map 15% reduce 0%
13/07/03 19:29:05 INFO mapred.JobClient: map 16% reduce 0%
13/07/03 19:43:43 INFO mapred.JobClient: map 17% reduce 0%
13/07/03 20:49:36 INFO mapred.JobClient: map 8% reduce 0%
13/07/03 20:49:38 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_1, Status : FAILED
Error: GC overhead limit exceeded
13/07/03 20:49:48 INFO mapred.JobClient: map 9% reduce 0%
13/07/03 20:50:31 INFO mapred.JobClient: map 10% reduce 0%
13/07/03 20:52:08 INFO mapred.JobClient: map 11% reduce 0%
13/07/03 20:54:38 INFO mapred.JobClient: map 12% reduce 0%
13/07/03 20:58:01 INFO mapred.JobClient: map 13% reduce 0%
13/07/03 21:03:01 INFO mapred.JobClient: map 14% reduce 0%
13/07/03 21:10:10 INFO mapred.JobClient: map 15% reduce 0%
13/07/03 21:19:54 INFO mapred.JobClient: map 16% reduce 0%
13/07/03 21:31:35 INFO mapred.JobClient: map 8% reduce 0%
13/07/03 21:31:37 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_0, Status : FAILED
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250)
Caused by: java.io.IOException: Task process exit with nonzero status of 65.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237)
13/07/03 21:32:09 INFO mapred.JobClient: map 9% reduce 0%
13/07/03 21:33:31 INFO mapred.JobClient: map 10% reduce 0%
13/07/03 21:35:42 INFO mapred.JobClient: map 11% reduce 0%
13/07/03 21:38:41 INFO mapred.JobClient: map 12% reduce 0%
13/07/03 21:42:27 INFO mapred.JobClient: map 13% reduce 0%
13/07/03 21:48:20 INFO mapred.JobClient: map 14% reduce 0%
13/07/03 21:56:12 INFO mapred.JobClient: map 15% reduce 0%
13/07/03 22:07:20 INFO mapred.JobClient: map 16% reduce 0%
13/07/03 22:26:36 INFO mapred.JobClient: map 17% reduce 0%
13/07/03 23:35:30 INFO mapred.JobClient: map 8% reduce 0%
13/07/03 23:35:32 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_1, Status : FAILED
Error: GC overhead limit exceeded
13/07/03 23:35:42 INFO mapred.JobClient: map 9% reduce 0%
13/07/03 23:36:16 INFO mapred.JobClient: map 10% reduce 0%
13/07/03 23:38:01 INFO mapred.JobClient: map 11% reduce 0%
13/07/03 23:40:47 INFO mapred.JobClient: map 12% reduce 0%
13/07/03 23:44:44 INFO mapred.JobClient: map 13% reduce 0%
13/07/03 23:50:42 INFO mapred.JobClient: map 14% reduce 0%
13/07/03 23:58:58 INFO mapred.JobClient: map 15% reduce 0%
13/07/04 00:10:22 INFO mapred.JobClient: map 16% reduce 0%
13/07/04 00:21:38 INFO mapred.JobClient: map 7% reduce 0%
13/07/04 00:21:40 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000001_2, Status : FAILED
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250)
Caused by: java.io.IOException: Task process exit with nonzero status of 65.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237)
13/07/04 00:21:50 INFO mapred.JobClient: map 8% reduce 0%
13/07/04 00:22:27 INFO mapred.JobClient: map 9% reduce 0%
13/07/04 00:23:52 INFO mapred.JobClient: map 10% reduce 0%
13/07/04 00:26:00 INFO mapred.JobClient: map 11% reduce 0%
13/07/04 00:28:47 INFO mapred.JobClient: map 12% reduce 0%
13/07/04 00:32:17 INFO mapred.JobClient: map 13% reduce 0%
13/07/04 00:37:34 INFO mapred.JobClient: map 14% reduce 0%
13/07/04 00:44:30 INFO mapred.JobClient: map 15% reduce 0%
13/07/04 00:54:28 INFO mapred.JobClient: map 16% reduce 0%
13/07/04 01:16:30 INFO mapred.JobClient: map 17% reduce 0%
13/07/04 01:32:05 INFO mapred.JobClient: map 8% reduce 0%
13/07/04 01:32:08 INFO mapred.JobClient: Task Id : attempt_201307031710_0001_m_000000_2, Status : FAILED
Error: GC overhead limit exceeded
13/07/04 01:32:21 INFO mapred.JobClient: map 9% reduce 0%
13/07/04 01:33:26 INFO mapred.JobClient: map 10% reduce 0%
13/07/04 01:35:37 INFO mapred.JobClient: map 11% reduce 0%
13/07/04 01:38:48 INFO mapred.JobClient: map 12% reduce 0%
13/07/04 01:43:06 INFO mapred.JobClient: map 13% reduce 0%
13/07/04 01:49:58 INFO mapred.JobClient: map 14% reduce 0%
13/07/04 01:59:07 INFO mapred.JobClient: map 15% reduce 0%
13/07/04 02:12:00 INFO mapred.JobClient: map 16% reduce 0%
13/07/04 02:37:56 INFO mapred.JobClient: map 17% reduce 0%
13/07/04 03:31:55 INFO mapred.JobClient: map 8% reduce 0%
13/07/04 03:32:00 INFO mapred.JobClient: Job complete: job_201307031710_0001
13/07/04 03:32:00 INFO mapred.JobClient: Counters: 7
13/07/04 03:32:00 INFO mapred.JobClient: Job Counters
13/07/04 03:32:00 INFO mapred.JobClient: Failed map tasks=1
13/07/04 03:32:00 INFO mapred.JobClient: Launched map tasks=8
13/07/04 03:32:00 INFO mapred.JobClient: Data-local map tasks=8
13/07/04 03:32:00 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=11443502
13/07/04 03:32:00 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
13/07/04 03:32:00 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/07/04 03:32:00 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
Exception in thread "main" java.lang.RuntimeException: java.lang.InterruptedException: Canopy Job failed processing vector

Mahout jobs are very memory intensive. I don't know whether the mappers or the reducers are the culprits but, either way, you will have to tell Hadoop to give them more RAM. "GC overhead limit exceeded" is effectively another way of saying "out of memory": the JVM was spending nearly all of its time in garbage collection while reclaiming almost no heap, so it gave up.
How you set this is indeed a little complex, because there are several properties and they changed in Hadoop 2. CDH4 can run either the Hadoop 1 or the Hadoop 2 flavor of MapReduce (MRv1 or YARN). Which one are you using?
If I had to guess: set mapred.child.java.opts to -Xmx1g (that is the MRv1 name; under MRv2 the corresponding properties are mapreduce.map.java.opts and mapreduce.reduce.java.opts). But the right answer really depends on your version and your data.
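The failures in the log are map task attempts (for example attempt_201307031710_0001_m_000001_0), so the heap that matters is that of the task JVMs, not the client's. One way to confirm that a setting such as -Xmx1g actually reaches a JVM is to print the limit from inside it. Below is a minimal sketch in plain Java; HeapCheck is a hypothetical name, and in a real job the same Runtime call would presumably go into the mapper's setup method so that the value shows up in the task log.

```java
public class HeapCheck {
    // Maximum heap the current JVM will attempt to use, in MiB.
    // Runtime.maxMemory() reflects the effective -Xmx of this process.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. "java -Xmx1g HeapCheck" to see the setting take effect.
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

If the printed value does not change after editing a heap option, that option is most likely applied to a different process than the one running the tasks.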

You need to change the memory settings for Hadoop: the memory currently allocated is not enough for the job you are running. Try increasing the heap size and run the job again. With excessive memory use, the OS might also be killing the task processes, which would cause the job to fail.

Related

Map function does not finish after reaching 100%

I have a weird issue: when I run my map task, it keeps running after reaching 100%.
Here is the console log I got:
15/07/22 00:50:12 INFO mapred.JobClient: map 0% reduce 0%
15/07/22 00:50:17 INFO mapred.LocalJobRunner:
15/07/22 00:50:18 INFO mapred.JobClient: map 3% reduce 0%
15/07/22 00:50:20 INFO mapred.LocalJobRunner:
15/07/22 00:50:21 INFO mapred.JobClient: map 7% reduce 0%
15/07/22 00:50:23 INFO mapred.LocalJobRunner:
15/07/22 00:50:26 INFO mapred.LocalJobRunner:
15/07/22 00:50:27 INFO mapred.JobClient: map 9% reduce 0%
15/07/22 00:50:29 INFO mapred.LocalJobRunner:
15/07/22 00:50:30 INFO mapred.JobClient: map 13% reduce 0%
15/07/22 00:50:32 INFO mapred.LocalJobRunner:
15/07/22 00:50:33 INFO mapred.JobClient: map 15% reduce 0%
15/07/22 00:50:35 INFO mapred.LocalJobRunner:
15/07/22 00:50:38 INFO mapred.LocalJobRunner:
15/07/22 00:50:39 INFO mapred.JobClient: map 17% reduce 0%
15/07/22 00:50:41 INFO mapred.LocalJobRunner:
15/07/22 00:50:42 INFO mapred.JobClient: map 18% reduce 0%
15/07/22 00:50:44 INFO mapred.LocalJobRunner:
15/07/22 00:50:45 INFO mapred.JobClient: map 20% reduce 0%
15/07/22 00:50:47 INFO mapred.LocalJobRunner:
15/07/22 00:50:48 INFO mapred.JobClient: map 22% reduce 0%
15/07/22 00:50:50 INFO mapred.LocalJobRunner:
15/07/22 00:50:51 INFO mapred.JobClient: map 24% reduce 0%
15/07/22 00:50:53 INFO mapred.LocalJobRunner:
15/07/22 00:50:54 INFO mapred.JobClient: map 26% reduce 0%
15/07/22 00:50:56 INFO mapred.LocalJobRunner:
15/07/22 00:50:57 INFO mapred.JobClient: map 27% reduce 0%
15/07/22 00:50:59 INFO mapred.LocalJobRunner:
15/07/22 00:51:00 INFO mapred.JobClient: map 30% reduce 0%
15/07/22 00:51:02 INFO mapred.LocalJobRunner:
15/07/22 00:51:03 INFO mapred.JobClient: map 32% reduce 0%
15/07/22 00:51:05 INFO mapred.LocalJobRunner:
15/07/22 00:51:06 INFO mapred.JobClient: map 34% reduce 0%
15/07/22 00:51:09 INFO mapred.LocalJobRunner:
15/07/22 00:51:12 INFO mapred.LocalJobRunner:
15/07/22 00:51:12 INFO mapred.JobClient: map 36% reduce 0%
15/07/22 00:51:15 INFO mapred.LocalJobRunner:
15/07/22 00:51:15 INFO mapred.JobClient: map 38% reduce 0%
15/07/22 00:51:18 INFO mapred.LocalJobRunner:
15/07/22 00:51:18 INFO mapred.JobClient: map 39% reduce 0%
15/07/22 00:51:21 INFO mapred.LocalJobRunner:
15/07/22 00:51:21 INFO mapred.JobClient: map 43% reduce 0%
15/07/22 00:51:24 INFO mapred.LocalJobRunner:
15/07/22 00:51:24 INFO mapred.JobClient: map 45% reduce 0%
15/07/22 00:51:27 INFO mapred.LocalJobRunner:
15/07/22 00:51:27 INFO mapred.JobClient: map 46% reduce 0%
15/07/22 00:51:30 INFO mapred.LocalJobRunner:
15/07/22 00:51:30 INFO mapred.JobClient: map 48% reduce 0%
15/07/22 00:51:33 INFO mapred.LocalJobRunner:
15/07/22 00:51:33 INFO mapred.JobClient: map 51% reduce 0%
15/07/22 00:51:36 INFO mapred.LocalJobRunner:
15/07/22 00:51:36 INFO mapred.JobClient: map 53% reduce 0%
15/07/22 00:51:39 INFO mapred.LocalJobRunner:
15/07/22 00:51:39 INFO mapred.JobClient: map 55% reduce 0%
15/07/22 00:51:42 INFO mapred.LocalJobRunner:
15/07/22 00:51:42 INFO mapred.JobClient: map 57% reduce 0%
15/07/22 00:51:45 INFO mapred.LocalJobRunner:
15/07/22 00:51:45 INFO mapred.JobClient: map 59% reduce 0%
15/07/22 00:51:48 INFO mapred.LocalJobRunner:
15/07/22 00:51:49 INFO mapred.JobClient: map 60% reduce 0%
15/07/22 00:51:51 INFO mapred.LocalJobRunner:
15/07/22 00:51:51 INFO mapred.JobClient: map 62% reduce 0%
15/07/22 00:51:54 INFO mapred.LocalJobRunner:
15/07/22 00:51:54 INFO mapred.JobClient: map 64% reduce 0%
15/07/22 00:51:57 INFO mapred.LocalJobRunner:
15/07/22 00:51:57 INFO mapred.JobClient: map 66% reduce 0%
15/07/22 00:52:00 INFO mapred.LocalJobRunner:
15/07/22 00:52:00 INFO mapred.JobClient: map 68% reduce 0%
15/07/22 00:52:03 INFO mapred.LocalJobRunner:
15/07/22 00:52:06 INFO mapred.LocalJobRunner:
15/07/22 00:52:06 INFO mapred.JobClient: map 69% reduce 0%
15/07/22 00:52:09 INFO mapred.LocalJobRunner:
15/07/22 00:52:12 INFO mapred.LocalJobRunner:
15/07/22 00:52:21 INFO mapred.LocalJobRunner:
15/07/22 00:52:22 INFO mapred.JobClient: map 71% reduce 0%
15/07/22 00:52:24 INFO mapred.LocalJobRunner:
15/07/22 00:52:30 INFO mapred.LocalJobRunner:
15/07/22 00:52:31 INFO mapred.JobClient: map 73% reduce 0%
15/07/22 00:52:36 INFO mapred.LocalJobRunner:
15/07/22 00:52:37 INFO mapred.JobClient: map 79% reduce 0%
15/07/22 00:52:40 INFO mapred.LocalJobRunner:
15/07/22 00:52:40 INFO mapred.JobClient: map 81% reduce 0%
15/07/22 00:52:43 INFO mapred.LocalJobRunner:
15/07/22 00:52:46 INFO mapred.LocalJobRunner:
15/07/22 00:52:46 INFO mapred.JobClient: map 82% reduce 0%
15/07/22 00:52:50 INFO mapred.LocalJobRunner:
15/07/22 00:52:51 INFO mapred.JobClient: map 84% reduce 0%
15/07/22 00:52:53 INFO mapred.LocalJobRunner:
15/07/22 00:52:59 INFO mapred.LocalJobRunner:
15/07/22 00:53:00 INFO mapred.JobClient: map 87% reduce 0%
15/07/22 00:53:03 INFO mapred.LocalJobRunner:
15/07/22 00:53:09 INFO mapred.LocalJobRunner:
15/07/22 00:53:10 INFO mapred.JobClient: map 88% reduce 0%
15/07/22 00:53:12 INFO mapred.LocalJobRunner:
15/07/22 00:53:14 INFO mapred.JobClient: map 90% reduce 0%
15/07/22 00:53:15 INFO mapred.LocalJobRunner:
15/07/22 00:53:16 INFO mapred.JobClient: map 92% reduce 0%
15/07/22 00:53:18 INFO mapred.LocalJobRunner:
15/07/22 00:53:25 INFO mapred.LocalJobRunner:
15/07/22 00:53:25 INFO mapred.JobClient: map 94% reduce 0%
15/07/22 00:53:31 INFO mapred.LocalJobRunner:
15/07/22 00:53:31 INFO mapred.JobClient: map 100% reduce 0%
15/07/22 00:53:33 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
15/07/22 00:53:34 INFO mapred.LocalJobRunner:
15/07/22 00:53:34 INFO mapred.LocalJobRunner:
15/07/22 00:53:34 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
15/07/22 00:53:35 INFO mapreduce.TableOutputFormat: Created table instance for test1
15/07/22 00:53:35 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#4af2be62
15/07/22 00:53:40 INFO mapred.LocalJobRunner:
15/07/22 00:53:40 INFO mapred.JobClient: map 51% reduce 0%
15/07/22 00:53:43 INFO mapred.LocalJobRunner:
15/07/22 00:53:43 INFO mapred.JobClient: map 52% reduce 0%
15/07/22 00:53:46 INFO mapred.LocalJobRunner:
15/07/22 00:53:46 INFO mapred.JobClient: map 53% reduce 0%
15/07/22 00:53:49 INFO mapred.LocalJobRunner:
15/07/22 00:53:49 INFO mapred.JobClient: map 55% reduce 0%
15/07/22 00:53:52 INFO mapred.LocalJobRunner:
15/07/22 00:53:52 INFO mapred.JobClient: map 56% reduce 0%
15/07/22 00:53:55 INFO mapred.LocalJobRunner:
15/07/22 00:53:55 INFO mapred.JobClient: map 58% reduce 0%
15/07/22 00:53:58 INFO mapred.LocalJobRunner:
15/07/22 00:53:58 INFO mapred.JobClient: map 59% reduce 0%
15/07/22 00:54:01 INFO mapred.LocalJobRunner:
15/07/22 00:54:01 INFO mapred.JobClient: map 60% reduce 0%
15/07/22 00:54:04 INFO mapred.LocalJobRunner:
15/07/22 00:54:10 INFO mapred.LocalJobRunner:
15/07/22 00:54:10 INFO mapred.JobClient: map 62% reduce 0%
15/07/22 00:54:13 INFO mapred.LocalJobRunner:
15/07/22 00:54:13 INFO mapred.JobClient: map 63% reduce 0%
15/07/22 00:54:16 INFO mapred.LocalJobRunner:
15/07/22 00:54:17 INFO mapred.JobClient: map 64% reduce 0%
15/07/22 00:54:19 INFO mapred.LocalJobRunner:
15/07/22 00:54:20 INFO mapred.JobClient: map 66% reduce 0%
15/07/22 00:54:22 INFO mapred.LocalJobRunner:
15/07/22 00:54:23 INFO mapred.JobClient: map 67% reduce 0%
15/07/22 00:54:25 INFO mapred.LocalJobRunner:
15/07/22 00:54:26 INFO mapred.JobClient: map 68% reduce 0%
15/07/22 00:54:28 INFO mapred.LocalJobRunner:
15/07/22 00:54:29 INFO mapred.JobClient: map 69% reduce 0%
15/07/22 00:54:31 INFO mapred.LocalJobRunner:
15/07/22 00:54:32 INFO mapred.JobClient: map 71% reduce 0%
15/07/22 00:54:34 INFO mapred.LocalJobRunner:
15/07/22 00:54:35 INFO mapred.JobClient: map 72% reduce 0%
15/07/22 00:54:37 INFO mapred.LocalJobRunner:
15/07/22 00:54:38 INFO mapred.JobClient: map 73% reduce 0%
15/07/22 00:54:40 INFO mapred.LocalJobRunner:
15/07/22 00:54:41 INFO mapred.JobClient: map 75% reduce 0%
15/07/22 00:54:43 INFO mapred.LocalJobRunner:
15/07/22 00:54:44 INFO mapred.JobClient: map 76% reduce 0%
15/07/22 00:54:46 INFO mapred.LocalJobRunner:
15/07/22 00:54:47 INFO mapred.JobClient: map 77% reduce 0%
15/07/22 00:54:49 INFO mapred.LocalJobRunner:
15/07/22 00:54:50 INFO mapred.JobClient: map 79% reduce 0%
15/07/22 00:54:52 INFO mapred.LocalJobRunner:
15/07/22 00:54:53 INFO mapred.JobClient: map 80% reduce 0%
15/07/22 00:54:55 INFO mapred.LocalJobRunner:
15/07/22 00:54:56 INFO mapred.JobClient: map 82% reduce 0%
15/07/22 00:54:58 INFO mapred.LocalJobRunner:
15/07/22 00:54:59 INFO mapred.JobClient: map 84% reduce 0%
15/07/22 00:55:01 INFO mapred.LocalJobRunner:
15/07/22 00:55:04 INFO mapred.LocalJobRunner:
15/07/22 00:55:05 INFO mapred.JobClient: map 85% reduce 0%
15/07/22 00:55:07 INFO mapred.LocalJobRunner:
15/07/22 00:55:10 INFO mapred.LocalJobRunner:
15/07/22 00:55:11 INFO mapred.JobClient: map 87% reduce 0%
15/07/22 00:55:13 INFO mapred.LocalJobRunner:
15/07/22 00:55:14 INFO mapred.JobClient: map 89% reduce 0%
15/07/22 00:55:16 INFO mapred.LocalJobRunner:
15/07/22 00:55:17 INFO mapred.JobClient: map 90% reduce 0%
15/07/22 00:55:21 INFO mapred.LocalJobRunner:
15/07/22 00:55:24 INFO mapred.LocalJobRunner:
15/07/22 00:55:25 INFO mapred.JobClient: map 91% reduce 0%
15/07/22 00:55:28 INFO mapred.LocalJobRunner:
15/07/22 00:55:29 INFO mapred.JobClient: map 92% reduce 0%
15/07/22 00:55:32 INFO mapred.LocalJobRunner:
15/07/22 00:55:32 INFO mapred.JobClient: map 93% reduce 0%
15/07/22 00:55:35 INFO mapred.LocalJobRunner:
15/07/22 00:55:35 INFO mapred.JobClient: map 94% reduce 0%
15/07/22 00:55:38 INFO mapred.LocalJobRunner:
15/07/22 00:55:39 INFO mapred.JobClient: map 95% reduce 0%
15/07/22 00:55:41 INFO mapred.LocalJobRunner:
15/07/22 00:55:42 INFO mapred.JobClient: map 97% reduce 0%
15/07/22 00:55:44 INFO mapred.LocalJobRunner:
15/07/22 00:55:45 INFO mapred.JobClient: map 98% reduce 0%
15/07/22 00:55:47 INFO mapred.LocalJobRunner:
15/07/22 00:55:48 INFO mapred.JobClient: map 99% reduce 0%
15/07/22 00:55:50 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
15/07/22 00:55:50 INFO mapred.LocalJobRunner:
15/07/22 00:55:50 INFO mapred.LocalJobRunner:
15/07/22 00:55:50 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
15/07/22 00:55:51 INFO mapreduce.TableOutputFormat: Created table instance for test1
15/07/22 00:55:51 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#6491a41a
15/07/22 00:55:51 INFO mapred.JobClient: map 100% reduce 0%
15/07/22 00:55:56 INFO mapred.LocalJobRunner:
15/07/22 00:55:57 INFO mapred.JobClient: map 68% reduce 0%
15/07/22 00:55:59 INFO mapred.LocalJobRunner:
15/07/22 00:56:00 INFO mapred.JobClient: map 69% reduce 0%
15/07/22 00:56:02 INFO mapred.LocalJobRunner:
15/07/22 00:56:03 INFO mapred.JobClient: map 70% reduce 0%
15/07/22 00:56:05 INFO mapred.LocalJobRunner:
15/07/22 00:56:06 INFO mapred.JobClient: map 71% reduce 0%
15/07/22 00:56:08 INFO mapred.LocalJobRunner:
15/07/22 00:56:09 INFO mapred.JobClient: map 72% reduce 0%
15/07/22 00:56:11 INFO mapred.LocalJobRunner:
15/07/22 00:56:14 INFO mapred.LocalJobRunner:
15/07/22 00:56:15 INFO mapred.JobClient: map 73% reduce 0%
15/07/22 00:56:17 INFO mapred.LocalJobRunner:
15/07/22 00:56:18 INFO mapred.JobClient: map 74% reduce 0%
15/07/22 00:56:20 INFO mapred.LocalJobRunner:
15/07/22 00:56:21 INFO mapred.JobClient: map 75% reduce 0%
15/07/22 00:56:23 INFO mapred.LocalJobRunner:
15/07/22 00:56:26 INFO mapred.LocalJobRunner:
15/07/22 00:56:27 INFO mapred.JobClient: map 77% reduce 0%
15/07/22 00:56:30 INFO mapred.LocalJobRunner:
15/07/22 00:56:36 INFO mapred.LocalJobRunner:
15/07/22 00:56:36 WARN mapred.FileOutputCommitter: Output path is null in cleanup
15/07/22 00:56:36 INFO mapred.JobClient: Job complete: job_local_0001
15/07/22 00:56:36 INFO mapred.JobClient: Counters: 12
15/07/22 00:56:36 INFO mapred.JobClient: Map-Reduce Framework
15/07/22 00:56:36 INFO mapred.JobClient: Spilled Records=0
15/07/22 00:56:36 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
15/07/22 00:56:36 INFO mapred.JobClient: Map input records=129
15/07/22 00:56:36 INFO mapred.JobClient: SPLIT_RAW_BYTES=351
15/07/22 00:56:36 INFO mapred.JobClient: Map output records=128
15/07/22 00:56:36 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
15/07/22 00:56:36 INFO mapred.JobClient: CPU time spent (ms)=0
15/07/22 00:56:36 INFO mapred.JobClient: Total committed heap usage (bytes)=183218176
15/07/22 00:56:36 INFO mapred.JobClient: File Input Format Counters
15/07/22 00:56:36 INFO mapred.JobClient: Bytes Read=79622144
15/07/22 00:56:36 INFO mapred.JobClient: FileSystemCounters
15/07/22 00:56:36 INFO mapred.JobClient: FILE_BYTES_WRITTEN=36253707
15/07/22 00:56:36 INFO mapred.JobClient: FILE_BYTES_READ=220612775
15/07/22 00:56:36 INFO mapred.JobClient: File Output Format Counters
15/07/22 00:56:36 INFO mapred.JobClient: Bytes Written=0
In the configuration, I set the number of reducers to 0.
Does anyone know what is happening with my map task?
Update: here is my job configuration:
Path inputPath = new Path(inputPathName);
Job job = new Job(conf, NAME + "_" + tableName);
job.setJarByClass(Importer.class);
FileInputFormat.setInputPaths(job, inputPath);
job.setInputFormatClass(SequenceFileInputFormat.class);
// job.setInputFormatClass(TextInputFormat.class);
job.setMapperClass(Importer.class);
// No reducers. Just write straight to table. Call initTableReducerJob
// because it sets up the TableOutputFormat.
TableMapReduceUtil.initTableReducerJob(tableName, null, job);
job.setNumReduceTasks(0);

hadoop yarn single node performance tuning

I have a Hadoop 2.5.2 single-node installation on my Ubuntu VM, which has 4 cores at 3 GHz each and 4 GB of memory. This VM is not for production, only for demos and learning.
Then I wrote a very simple MapReduce application in Python and used it to process 49 XML files. All of these files are small, a few hundred lines each, so I expected the job to finish quickly. To my big surprise, it took more than 20 minutes (the output of the job is correct). Below are the output metrics:
14/12/15 19:37:55 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/12/15 19:37:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/12/15 19:38:03 INFO mapred.FileInputFormat: Total input paths to process : 49
14/12/15 19:38:06 INFO mapreduce.JobSubmitter: number of splits:49
14/12/15 19:38:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1418368500264_0005
14/12/15 19:38:10 INFO impl.YarnClientImpl: Submitted application application_1418368500264_0005
14/12/15 19:38:10 INFO mapreduce.Job: Running job: job_1418368500264_0005
14/12/15 19:38:59 INFO mapreduce.Job: Job job_1418368500264_0005 running in uber mode : false
14/12/15 19:38:59 INFO mapreduce.Job: map 0% reduce 0%
14/12/15 19:39:42 INFO mapreduce.Job: map 2% reduce 0%
14/12/15 19:40:05 INFO mapreduce.Job: map 4% reduce 0%
14/12/15 19:40:28 INFO mapreduce.Job: map 6% reduce 0%
14/12/15 19:40:49 INFO mapreduce.Job: map 8% reduce 0%
14/12/15 19:41:10 INFO mapreduce.Job: map 10% reduce 0%
14/12/15 19:41:29 INFO mapreduce.Job: map 12% reduce 0%
14/12/15 19:41:50 INFO mapreduce.Job: map 14% reduce 0%
14/12/15 19:42:08 INFO mapreduce.Job: map 16% reduce 0%
14/12/15 19:42:28 INFO mapreduce.Job: map 18% reduce 0%
14/12/15 19:42:49 INFO mapreduce.Job: map 20% reduce 0%
14/12/15 19:43:08 INFO mapreduce.Job: map 22% reduce 0%
14/12/15 19:43:28 INFO mapreduce.Job: map 24% reduce 0%
14/12/15 19:43:48 INFO mapreduce.Job: map 27% reduce 0%
14/12/15 19:44:09 INFO mapreduce.Job: map 29% reduce 0%
14/12/15 19:44:29 INFO mapreduce.Job: map 31% reduce 0%
14/12/15 19:44:49 INFO mapreduce.Job: map 33% reduce 0%
14/12/15 19:45:09 INFO mapreduce.Job: map 35% reduce 0%
14/12/15 19:45:28 INFO mapreduce.Job: map 37% reduce 0%
14/12/15 19:45:49 INFO mapreduce.Job: map 39% reduce 0%
14/12/15 19:46:09 INFO mapreduce.Job: map 41% reduce 0%
14/12/15 19:46:29 INFO mapreduce.Job: map 43% reduce 0%
14/12/15 19:46:49 INFO mapreduce.Job: map 45% reduce 0%
14/12/15 19:47:09 INFO mapreduce.Job: map 47% reduce 0%
14/12/15 19:47:29 INFO mapreduce.Job: map 49% reduce 0%
14/12/15 19:47:49 INFO mapreduce.Job: map 51% reduce 0%
14/12/15 19:48:08 INFO mapreduce.Job: map 53% reduce 0%
14/12/15 19:48:28 INFO mapreduce.Job: map 55% reduce 0%
14/12/15 19:48:48 INFO mapreduce.Job: map 57% reduce 0%
14/12/15 19:49:09 INFO mapreduce.Job: map 59% reduce 0%
14/12/15 19:49:29 INFO mapreduce.Job: map 61% reduce 0%
14/12/15 19:49:55 INFO mapreduce.Job: map 63% reduce 0%
14/12/15 19:50:23 INFO mapreduce.Job: map 65% reduce 0%
14/12/15 19:50:53 INFO mapreduce.Job: map 67% reduce 0%
14/12/15 19:51:22 INFO mapreduce.Job: map 69% reduce 0%
14/12/15 19:51:50 INFO mapreduce.Job: map 71% reduce 0%
14/12/15 19:52:18 INFO mapreduce.Job: map 73% reduce 0%
14/12/15 19:52:48 INFO mapreduce.Job: map 76% reduce 0%
14/12/15 19:53:18 INFO mapreduce.Job: map 78% reduce 0%
14/12/15 19:53:48 INFO mapreduce.Job: map 80% reduce 0%
14/12/15 19:54:18 INFO mapreduce.Job: map 82% reduce 0%
14/12/15 19:54:48 INFO mapreduce.Job: map 84% reduce 0%
14/12/15 19:55:19 INFO mapreduce.Job: map 86% reduce 0%
14/12/15 19:55:48 INFO mapreduce.Job: map 88% reduce 0%
14/12/15 19:56:16 INFO mapreduce.Job: map 90% reduce 0%
14/12/15 19:56:44 INFO mapreduce.Job: map 92% reduce 0%
14/12/15 19:57:14 INFO mapreduce.Job: map 94% reduce 0%
14/12/15 19:57:45 INFO mapreduce.Job: map 96% reduce 0%
14/12/15 19:58:15 INFO mapreduce.Job: map 98% reduce 0%
14/12/15 19:58:46 INFO mapreduce.Job: map 100% reduce 0%
14/12/15 19:59:20 INFO mapreduce.Job: map 100% reduce 100%
14/12/15 19:59:28 INFO mapreduce.Job: Job job_1418368500264_0005 completed successfully
14/12/15 19:59:30 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=17856
FILE: Number of bytes written=5086434
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=499030
HDFS: Number of bytes written=10049
HDFS: Number of read operations=150
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=49
Launched reduce tasks=1
Data-local map tasks=49
Total time spent by all maps in occupied slots (ms)=8854232
Total time spent by all reduces in occupied slots (ms)=284672
Total time spent by all map tasks (ms)=1106779
Total time spent by all reduce tasks (ms)=35584
Total vcore-seconds taken by all map tasks=1106779
Total vcore-seconds taken by all reduce tasks=35584
Total megabyte-seconds taken by all map tasks=1133341696
Total megabyte-seconds taken by all reduce tasks=36438016
Map-Reduce Framework
Map input records=9352
Map output records=296
Map output bytes=17258
Map output materialized bytes=18144
Input split bytes=6772
Combine input records=0
Combine output records=0
Reduce input groups=53
Reduce shuffle bytes=18144
Reduce input records=296
Reduce output records=52
Spilled Records=592
Shuffled Maps =49
Failed Shuffles=0
Merged Map outputs=49
GC time elapsed (ms)=33590
CPU time spent (ms)=191390
Physical memory (bytes) snapshot=13738057728
Virtual memory (bytes) snapshot=66425016320
Total committed heap usage (bytes)=10799808512
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=492258
File Output Format Counters
Bytes Written=10049
14/12/15 19:59:30 INFO streaming.StreamJob: Output directory: /data_output/sb50projs_1_output
As a newbie to Hadoop, faced with this unreasonably poor performance, I have several questions:
How do I configure Hadoop/YARN/MapReduce to make the whole environment more convenient for trial usage?
I understand Hadoop is designed for huge data sets and big files. But for a trial environment, where my files are small and my data is very limited, which default configuration items should I change? I have changed "dfs.blocksize" in hdfs-site.xml to a smaller value to match my small files, but it did not seem to improve much. I know there are some JVM configuration items in yarn-site.xml and mapred-site.xml, but I am not sure how to adjust them.
How do I read the Hadoop logs?
Under the logs folder, there are separate log files for the nodemanager, resourcemanager, namenode and datanode. I tried to read these files to understand how the 20 minutes were spent during the process, but it's not easy for a newbie like me. So I wonder whether there is any tool or UI that could help me analyze the logs.
Basic performance tuning tools
I have googled around for this question and got a bunch of names such as Ganglia, Nagios, Vaidya and Ambari. I want to know which tool is best for analyzing an issue like "why did it take 20 minutes to do such a simple job?".
Large number of Hadoop processes
Even when no job is running, I found around 100 Hadoop processes on my VM (I am using htop, with the results sorted by memory). Is this normal for Hadoop? Or is something in my environment configured incorrectly?
You don't have to change anything.
The default configuration is meant for a small environment. You may change it as the environment grows. There are a lot of parameters, and fine tuning takes a lot of time.
But I admit your configuration is smaller than the usual test setups.
The logs you have to read aren't the service logs but the job logs. Find them in /var/log/hadoop-yarn/containers/
If you want a better view of your MapReduce job, use the web interface at http://127.0.0.1:8088/. You will see your job's progress in real time.
IMO, basic tuning = use the Hadoop web interfaces. Plenty are available natively.
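The counters printed at the end of the job in the question already give a rough first answer to "where did the 20 minutes go", before any dedicated tool is needed. The sketch below is plain Java arithmetic on three of those numbers. The class and method names are made up for illustration, and the wall-clock figure of 1187 s is read off the first and last map progress lines of the log (19:38:59 to 19:58:46).

```java
import java.util.Locale;

public class CounterMath {
    // Average duration of one task, in seconds.
    static double avgTaskSeconds(long totalTaskMs, int tasks) {
        return totalTaskMs / 1000.0 / tasks;
    }

    // Task-seconds divided by elapsed seconds: roughly how many tasks ran at once.
    static double effectiveConcurrency(long totalTaskMs, long wallClockSeconds) {
        return (totalTaskMs / 1000.0) / wallClockSeconds;
    }

    // Ratio of the megabyte counter to the vcore counter:
    // memory per container, assuming one vcore per task.
    static long mbPerVcore(long megabyteCounter, long vcoreCounter) {
        return megabyteCounter / vcoreCounter;
    }

    public static void main(String[] args) {
        long totalMapMs = 1106779L;     // "Total time spent by all map tasks (ms)"
        int mapTasks = 49;              // "Launched map tasks"
        long wallClockSeconds = 1187L;  // 19:38:59 -> 19:58:46 in the job log

        System.out.println(String.format(Locale.ROOT, "%.1f s per map task",
                avgTaskSeconds(totalMapMs, mapTasks)));                 // 22.6 s per map task
        System.out.println(String.format(Locale.ROOT, "%.2f map tasks running on average",
                effectiveConcurrency(totalMapMs, wallClockSeconds)));   // 0.93 map tasks running on average
        System.out.println(mbPerVcore(1133341696L, 1106779L)
                + " MB per map container");                             // 1024 MB per map container
    }
}
```

An effective concurrency just under 1 suggests that the 49 map tasks ran essentially one after another, at roughly 22 seconds each. For inputs of a few hundred lines, that time is probably dominated by container and JVM startup rather than by useful work.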
I think you have found your problem. This can be normal, or not.
But in short, YARN launches MapReduce tasks so as to use all the available memory:
Available memory is set in your yarn-site.xml: yarn.nodemanager.resource.memory-mb (8 GiB by default).
Memory for a task is defined in mapred-site.xml, or in the job itself, by the property mapreduce.map.memory.mb (1024 MiB by default in Apache Hadoop 2.5.2).
So:
Change the available memory for your nodemanager (to 3 GiB, in order to leave 1 GiB for the system).
Change the memory available to the Hadoop services (-Xmx in hadoop-env.sh and yarn-env.sh) so that the system plus all the Hadoop services (namenode, datanode, resourcemanager, nodemanager) stay under 1 GiB.
Change the memory for your map tasks (512 MiB?). The smaller it is, the more tasks can run at the same time.
Change yarn.scheduler.minimum-allocation-mb to 512 in yarn-site.xml to allow mappers with less than 1 GiB of memory.
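To see what the settings above buy, the arithmetic can be sketched in a few lines of Java. This is a simplification resting on assumptions not stated in the answer: the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb (the behaviour of the default Capacity Scheduler, to my understanding), the MapReduce ApplicationMaster occupies one container (yarn.app.mapreduce.am.resource.mb, 1536 MiB by default; the 512 used for the tuned case is a hypothetical value), and memory is the only limit on concurrency. ContainerBudget and its methods are illustrative names.

```java
public class ContainerBudget {
    // Round a memory request up to the next multiple of the minimum allocation.
    static int normalize(int requestMb, int minAllocMb) {
        return ((requestMb + minAllocMb - 1) / minAllocMb) * minAllocMb;
    }

    // Number of task containers that fit on the node next to one ApplicationMaster.
    static int concurrentTasks(int nodeMb, int amMb, int taskMb, int minAllocMb) {
        int am = normalize(amMb, minAllocMb);
        int task = normalize(taskMb, minAllocMb);
        return Math.max(0, (nodeMb - am) / task);
    }

    public static void main(String[] args) {
        // Defaults: 8192 MiB node, 1536 MiB AM (rounded to 2048), 1024 MiB maps.
        System.out.println(concurrentTasks(8192, 1536, 1024, 1024)); // 6
        // Tuned as suggested: 3072 MiB node, 512 MiB maps, 512 MiB minimum
        // allocation, and an assumed 512 MiB ApplicationMaster.
        System.out.println(concurrentTasks(3072, 512, 512, 512));    // 5
    }
}
```

With the defaults, YARN is willing to hand out 8 GiB on a VM that only has 4 GiB, which is the mismatch the first change is meant to remove. The tuned values keep a similar number of concurrent map tasks while staying within the physical memory.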
I hope this will help you.

hadoop map task restart when reduce complete 80%

I'm running Hadoop 2.2.0 on a cluster of about 50 nodes; my job has 64 map tasks and 20 reduce tasks. The maps complete in about 30 minutes, then all the reduce tasks run, but I found a strange log like this:
14/10/08 21:41:04 INFO mapreduce.Job: map 37% reduce 10%
14/10/08 21:41:05 INFO mapreduce.Job: map 38% reduce 11%
14/10/08 21:41:07 INFO mapreduce.Job: map 39% reduce 12%
14/10/08 21:41:08 INFO mapreduce.Job: map 40% reduce 12%
14/10/08 21:41:09 INFO mapreduce.Job: map 41% reduce 12%
14/10/08 21:41:10 INFO mapreduce.Job: map 42% reduce 12%
14/10/08 21:41:12 INFO mapreduce.Job: map 44% reduce 12%
14/10/08 21:41:14 INFO mapreduce.Job: map 45% reduce 12%
14/10/08 21:41:16 INFO mapreduce.Job: map 46% reduce 13%
14/10/08 21:41:18 INFO mapreduce.Job: map 48% reduce 13%
14/10/08 21:41:21 INFO mapreduce.Job: map 49% reduce 13%
14/10/08 21:41:22 INFO mapreduce.Job: map 50% reduce 13%
14/10/08 21:41:25 INFO mapreduce.Job: map 51% reduce 14%
14/10/08 21:41:29 INFO mapreduce.Job: map 52% reduce 15%
14/10/08 21:41:34 INFO mapreduce.Job: map 52% reduce 16%
14/10/08 21:41:38 INFO mapreduce.Job: map 53% reduce 17%
14/10/08 21:41:53 INFO mapreduce.Job: map 53% reduce 18%
14/10/08 21:42:09 INFO mapreduce.Job: map 54% reduce 18%
14/10/08 21:42:18 INFO mapreduce.Job: map 55% reduce 18%
14/10/08 21:42:37 INFO mapreduce.Job: map 56% reduce 18%
14/10/08 21:42:49 INFO mapreduce.Job: map 56% reduce 19%
14/10/08 21:42:53 INFO mapreduce.Job: map 57% reduce 19%
14/10/08 21:42:57 INFO mapreduce.Job: map 58% reduce 19%
14/10/08 21:42:59 INFO mapreduce.Job: map 59% reduce 19%
14/10/08 21:43:00 INFO mapreduce.Job: map 60% reduce 19%
14/10/08 21:43:05 INFO mapreduce.Job: map 61% reduce 19%
14/10/08 21:43:06 INFO mapreduce.Job: map 62% reduce 19%
14/10/08 21:43:09 INFO mapreduce.Job: map 62% reduce 20%
14/10/08 21:43:13 INFO mapreduce.Job: map 63% reduce 20%
14/10/08 21:43:16 INFO mapreduce.Job: map 63% reduce 21%
14/10/08 21:43:19 INFO mapreduce.Job: map 64% reduce 21%
14/10/08 21:43:36 INFO mapreduce.Job: map 65% reduce 21%
14/10/08 21:43:42 INFO mapreduce.Job: map 66% reduce 21%
14/10/08 21:43:44 INFO mapreduce.Job: map 67% reduce 22%
14/10/08 21:44:00 INFO mapreduce.Job: map 68% reduce 22%
14/10/08 21:44:09 INFO mapreduce.Job: map 70% reduce 22%
14/10/08 21:44:12 INFO mapreduce.Job: map 70% reduce 23%
14/10/08 21:44:18 INFO mapreduce.Job: map 71% reduce 23%
14/10/08 21:44:24 INFO mapreduce.Job: map 72% reduce 23%
14/10/08 21:44:28 INFO mapreduce.Job: map 72% reduce 24%
14/10/08 21:44:35 INFO mapreduce.Job: map 73% reduce 24%
14/10/08 21:44:44 INFO mapreduce.Job: map 74% reduce 24%
14/10/08 21:44:50 INFO mapreduce.Job: map 75% reduce 24%
14/10/08 21:44:51 INFO mapreduce.Job: map 76% reduce 24%
14/10/08 21:44:54 INFO mapreduce.Job: map 77% reduce 25%
14/10/08 21:45:00 INFO mapreduce.Job: map 78% reduce 25%
14/10/08 21:45:07 INFO mapreduce.Job: map 78% reduce 26%
14/10/08 21:45:15 INFO mapreduce.Job: map 79% reduce 26%
14/10/08 21:45:17 INFO mapreduce.Job: map 80% reduce 26%
14/10/08 21:45:20 INFO mapreduce.Job: map 81% reduce 26%
14/10/08 21:45:24 INFO mapreduce.Job: map 82% reduce 26%
14/10/08 21:45:27 INFO mapreduce.Job: map 83% reduce 26%
14/10/08 21:45:29 INFO mapreduce.Job: map 83% reduce 27%
14/10/08 21:45:34 INFO mapreduce.Job: map 85% reduce 27%
14/10/08 21:45:37 INFO mapreduce.Job: map 86% reduce 27%
14/10/08 21:45:42 INFO mapreduce.Job: map 86% reduce 28%
14/10/08 21:45:43 INFO mapreduce.Job: map 88% reduce 28%
14/10/08 21:45:45 INFO mapreduce.Job: map 89% reduce 28%
14/10/08 21:45:47 INFO mapreduce.Job: map 89% reduce 29%
14/10/08 21:45:56 INFO mapreduce.Job: map 89% reduce 30%
14/10/08 21:45:58 INFO mapreduce.Job: map 90% reduce 30%
14/10/08 21:46:00 INFO mapreduce.Job: map 91% reduce 30%
14/10/08 21:46:06 INFO mapreduce.Job: map 92% reduce 30%
14/10/08 21:46:10 INFO mapreduce.Job: map 92% reduce 31%
14/10/08 21:46:12 INFO mapreduce.Job: map 93% reduce 31%
14/10/08 21:46:24 INFO mapreduce.Job: map 94% reduce 31%
14/10/08 21:48:02 INFO mapreduce.Job: map 95% reduce 31%
14/10/08 21:48:12 INFO mapreduce.Job: map 95% reduce 32%
14/10/08 21:48:13 INFO mapreduce.Job: map 96% reduce 32%
14/10/08 21:48:22 INFO mapreduce.Job: map 97% reduce 32%
14/10/08 21:49:00 INFO mapreduce.Job: map 98% reduce 32%
14/10/08 21:49:09 INFO mapreduce.Job: map 98% reduce 33%
14/10/08 21:49:19 INFO mapreduce.Job: map 99% reduce 33%
14/10/08 21:49:22 INFO mapreduce.Job: map 100% reduce 33%
14/10/08 21:49:27 INFO mapreduce.Job: map 100% reduce 34%
14/10/08 21:49:28 INFO mapreduce.Job: map 100% reduce 37%
14/10/08 21:49:29 INFO mapreduce.Job: map 100% reduce 40%
14/10/08 21:49:30 INFO mapreduce.Job: map 100% reduce 42%
14/10/08 21:49:32 INFO mapreduce.Job: map 100% reduce 44%
14/10/08 21:49:33 INFO mapreduce.Job: map 100% reduce 46%
14/10/08 21:49:34 INFO mapreduce.Job: map 100% reduce 48%
14/10/08 21:49:35 INFO mapreduce.Job: map 100% reduce 49%
14/10/08 21:49:36 INFO mapreduce.Job: map 100% reduce 54%
14/10/08 21:49:37 INFO mapreduce.Job: map 100% reduce 55%
14/10/08 21:49:39 INFO mapreduce.Job: map 100% reduce 58%
14/10/08 21:49:40 INFO mapreduce.Job: map 100% reduce 59%
14/10/08 21:49:41 INFO mapreduce.Job: map 100% reduce 60%
14/10/08 21:49:42 INFO mapreduce.Job: map 100% reduce 61%
14/10/08 21:49:44 INFO mapreduce.Job: map 100% reduce 62%
14/10/08 21:49:46 INFO mapreduce.Job: map 100% reduce 63%
14/10/08 21:49:48 INFO mapreduce.Job: map 100% reduce 64%
14/10/08 21:49:50 INFO mapreduce.Job: map 100% reduce 65%
14/10/08 21:49:56 INFO mapreduce.Job: map 100% reduce 66%
14/10/08 21:50:03 INFO mapreduce.Job: map 100% reduce 67%
14/10/08 21:52:11 INFO mapreduce.Job: map 100% reduce 68%
14/10/08 21:56:12 INFO mapreduce.Job: map 100% reduce 69%
14/10/08 22:00:35 INFO mapreduce.Job: map 100% reduce 71%
14/10/08 22:00:40 INFO mapreduce.Job: map 100% reduce 72%
14/10/08 22:00:55 INFO mapreduce.Job: map 100% reduce 74%
14/10/08 22:01:09 INFO mapreduce.Job: map 100% reduce 75%
14/10/08 22:01:14 INFO mapreduce.Job: map 100% reduce 77%
14/10/08 22:02:23 INFO mapreduce.Job: map 100% reduce 78%
14/10/08 22:02:37 INFO mapreduce.Job: map 100% reduce 80%
14/10/08 22:03:32 INFO mapreduce.Job: map 100% reduce 82%
14/10/08 22:04:14 INFO mapreduce.Job: map 100% reduce 83%
14/10/08 22:04:43 INFO mapreduce.Job: map 100% reduce 85%
14/10/08 22:08:11 INFO mapreduce.Job: map 100% reduce 86%
14/10/08 22:08:45 INFO mapreduce.Job: map 100% reduce 88%
14/10/08 22:09:53 INFO mapreduce.Job: map 100% reduce 89%
14/10/08 22:11:21 INFO mapreduce.Job: map 100% reduce 91%
14/10/08 22:11:44 INFO mapreduce.Job: map 100% reduce 92%
14/10/08 22:13:23 INFO mapreduce.Job: map 100% reduce 94%
14/10/08 22:13:25 INFO mapreduce.Job: map 100% reduce 95%
14/10/08 22:15:21 INFO mapreduce.Job: map 100% reduce 97%
14/10/08 22:18:14 INFO mapreduce.Job: map 97% reduce 98%
14/10/08 22:18:15 INFO mapreduce.Job: map 97% reduce 100%
14/10/08 22:19:14 INFO mapreduce.Job: map 97% reduce 95%
14/10/08 22:19:15 INFO mapreduce.Job: map 97% reduce 90%
14/10/08 22:19:27 INFO mapreduce.Job: map 97% reduce 91%
14/10/08 22:19:35 INFO mapreduce.Job: map 97% reduce 92%
14/10/08 22:20:36 INFO mapreduce.Job: map 97% reduce 93%
14/10/08 22:34:55 INFO mapreduce.Job: map 94% reduce 93%
14/10/08 22:35:20 INFO mapreduce.Job: map 95% reduce 93%
14/10/08 22:40:46 INFO mapreduce.Job: map 96% reduce 93%
14/10/08 22:40:55 INFO mapreduce.Job: map 97% reduce 93%
14/10/08 22:40:59 INFO mapreduce.Job: map 97% reduce 94%
14/10/08 22:41:12 INFO mapreduce.Job: map 97% reduce 95%
14/10/08 22:51:43 INFO mapreduce.Job: map 98% reduce 95%
14/10/08 22:52:52 INFO mapreduce.Job: map 99% reduce 95%
14/10/08 22:53:01 INFO mapreduce.Job: map 100% reduce 95%
14/10/08 22:53:03 INFO mapreduce.Job: map 100% reduce 96%
14/10/08 22:53:28 INFO mapreduce.Job: map 100% reduce 97%
14/10/08 22:57:19 INFO mapreduce.Job: map 100% reduce 98%
14/10/08 23:09:55 INFO mapreduce.Job: map 100% reduce 100%
14/10/08 23:10:00 INFO mapreduce.Job: Job job_1412772687841_0019 completed successfully
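This can happen if one of your nodes dies in the middle of the job.
For example, a TaskTracker is running a task that is 50% complete and reports this to the JobTracker through its heartbeat. If that node then dies, the progress is lost and the task has to be restarted, which can reduce the overall completion percentage. Map output is stored on the local disk of the node that produced it, so even maps that had already finished on the lost node must be re-run if reducers still need their output. That is why the map percentage can fall back from 100%.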
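As a toy illustration of the effect, assume the reported phase percentage is simply the mean of the per-task progress values. That is a simplified model, not Hadoop's actual bookkeeping, and the choice of which tasks are lost is hypothetical.

```python
def job_progress(task_progress):
    """Overall phase progress as the mean of per-task progress (0.0 to 1.0)."""
    return sum(task_progress) / len(task_progress)

# 64 map tasks, all finished
maps = [1.0] * 64
print(round(job_progress(maps) * 100))  # 100

# Hypothetical: a node holding the output of two finished maps is lost,
# so both tasks are reset and re-run from scratch
maps[10] = 0.0
maps[11] = 0.0
print(round(job_progress(maps) * 100))  # 97
```

Under this model, losing two of 64 finished maps gives 62/64, which rounds to 97%. The numbers are only meant to show the mechanism, not to establish what happened in the job above.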

Why is the MapReduce progress report NOT monotonically increasing?

I submitted a MapReduce job to Hadoop and watched the progress report on screen. The progress report should be monotonically increasing (for example 0%, 10%, 25%, 60%, 78%, 95% and 100%) for both map tasks and reduce tasks. In fact, the reported progress was not monotonically increasing:
14/01/21 11:05:37 INFO mapred.JobClient: Running job: job_201401201555_0036
14/01/21 11:05:38 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:07 INFO mapred.JobClient: map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:19 INFO mapred.JobClient: map 9% reduce 0%
14/01/21 11:06:22 INFO mapred.JobClient: map 22% reduce 0%
14/01/21 11:06:25 INFO mapred.JobClient: map 31% reduce 0%
14/01/21 11:06:28 INFO mapred.JobClient: map 39% reduce 0%
14/01/21 11:06:29 INFO mapred.JobClient: map 53% reduce 0%
14/01/21 11:06:30 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:32 INFO mapred.JobClient: map 50% reduce 0%
14/01/21 11:06:33 INFO mapred.JobClient: map 55% reduce 0%
14/01/21 11:06:34 INFO mapred.JobClient: map 43% reduce 0%
14/01/21 11:06:35 INFO mapred.JobClient: map 48% reduce 0%
14/01/21 11:06:36 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:38 INFO mapred.JobClient: map 30% reduce 0%
14/01/21 11:06:40 INFO mapred.JobClient: map 40% reduce 0%
14/01/21 11:06:41 INFO mapred.JobClient: map 49% reduce 0%
14/01/21 11:06:43 INFO mapred.JobClient: map 57% reduce 0%
14/01/21 11:06:44 INFO mapred.JobClient: map 70% reduce 0%
14/01/21 11:06:46 INFO mapred.JobClient: map 73% reduce 0%
14/01/21 11:06:47 INFO mapred.JobClient: map 82% reduce 0%
14/01/21 11:06:48 INFO mapred.JobClient: map 93% reduce 0%
14/01/21 11:06:50 INFO mapred.JobClient: map 94% reduce 0%
14/01/21 11:06:52 INFO mapred.JobClient: map 95% reduce 0%
14/01/21 11:06:53 INFO mapred.JobClient: map 96% reduce 0%
14/01/21 11:06:56 INFO mapred.JobClient: map 98% reduce 0%
14/01/21 11:06:59 INFO mapred.JobClient: map 99% reduce 0%
14/01/21 11:07:00 INFO mapred.JobClient: map 100% reduce 0%
14/01/21 11:07:19 INFO mapred.JobClient: map 100% reduce 4%
14/01/21 11:07:22 INFO mapred.JobClient: map 100% reduce 8%
14/01/21 11:07:25 INFO mapred.JobClient: map 100% reduce 66%
14/01/21 11:07:29 INFO mapred.JobClient: map 100% reduce 67%
14/01/21 11:07:32 INFO mapred.JobClient: map 100% reduce 68%
14/01/21 11:07:35 INFO mapred.JobClient: map 100% reduce 69%
14/01/21 11:07:41 INFO mapred.JobClient: map 100% reduce 70%
14/01/21 11:07:47 INFO mapred.JobClient: map 100% reduce 71%
14/01/21 11:07:53 INFO mapred.JobClient: map 100% reduce 72%
14/01/21 11:07:59 INFO mapred.JobClient: map 100% reduce 73%
14/01/21 11:08:02 INFO mapred.JobClient: map 100% reduce 100%
14/01/21 11:08:03 INFO mapred.JobClient: Job complete: job_201401201555_0036
The progress is given as the percentage of input splits that have already been processed. So why is the progress report not monotonically increasing?
Check the logs of the tasktrackers and the jobtracker. Are there any failures in the map phase? If a machine fails to perform a task, or the master can no longer reach it, the task is run again from scratch on another machine, so the progress is not monotonically increasing.
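Before digging through the daemon logs, it can help to pin down exactly when the console output went backwards. The following is a minimal sketch that scans JobClient-style lines for drops. The function name find_regressions is made up here, and the regular expression assumes the "map N% reduce M%" format shown in the question.

```python
import re

# Matches the "map N% reduce M%" part of a JobClient progress line
PROGRESS = re.compile(r"map\s+(\d+)%\s+reduce\s+(\d+)%")

def find_regressions(lines):
    """Return (line, phase, previous, current) wherever a percentage goes down."""
    regressions = []
    prev = None
    for line in lines:
        m = PROGRESS.search(line)
        if not m:
            continue
        cur = (int(m.group(1)), int(m.group(2)))
        if prev is not None:
            for phase, before, after in zip(("map", "reduce"), prev, cur):
                if after < before:
                    regressions.append((line.strip(), phase, before, after))
        prev = cur
    return regressions

log = """\
14/01/21 11:06:07 INFO mapred.JobClient: map 11% reduce 0%
14/01/21 11:06:10 INFO mapred.JobClient: map 0% reduce 0%
14/01/21 11:06:19 INFO mapred.JobClient: map 9% reduce 0%
""".splitlines()

for line, phase, before, after in find_regressions(log):
    print(f"{phase} dropped from {before}% to {after}%: {line}")
```

The timestamps of the flagged lines are a reasonable starting point for searching the tasktracker and jobtracker logs for failed or killed attempts.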

Hadoop mapreduce running very slowly

I am using a Hadoop 1.1.2 cluster with 4 datanodes and 1 namenode, installed as VMs on XenServer. I had a 1 GB text file and tried to run wordcount on it. The map phase took 2 hours and the reducer just hangs. A plain Perl script finished the job in 10 minutes. It looks like something is missing in my setup.
Even small files of a few KB take rather long.
hadoop#master ~]$ hadoop jar /usr/share/hadoop/hadoop-examples-1.1.2.jar wordcount huge out
13/05/29 10:45:09 INFO input.FileInputFormat: Total input paths to process : 1
13/05/29 10:45:09 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/05/29 10:45:09 WARN snappy.LoadSnappy: Snappy native library not loaded
13/05/29 10:45:11 INFO mapred.JobClient: Running job: job_201305290801_0002
13/05/29 10:45:12 INFO mapred.JobClient: map 0% reduce 0%
13/05/29 10:57:14 INFO mapred.JobClient: map 2% reduce 0%
13/05/29 10:58:01 INFO mapred.JobClient: map 3% reduce 0%
13/05/29 10:58:53 INFO mapred.JobClient: map 4% reduce 0%
13/05/29 10:58:54 INFO mapred.JobClient: map 5% reduce 0%
13/05/29 10:59:33 INFO mapred.JobClient: map 6% reduce 0%
13/05/29 11:01:52 INFO mapred.JobClient: map 7% reduce 0%
13/05/29 11:03:02 INFO mapred.JobClient: map 8% reduce 0%
13/05/29 11:03:20 INFO mapred.JobClient: Task Id : attempt_201305290801_0002_m_000002_0, Status : FAILED
Task attempt_201305290801_0002_m_000002_0 failed to report status for 604 seconds. Killing!
13/05/29 11:03:28 INFO mapred.JobClient: Task Id : attempt_201305290801_0002_m_000003_0, Status : FAILED
Task attempt_201305290801_0002_m_000003_0 failed to report status for 604 seconds. Killing!
13/05/29 11:03:29 INFO mapred.JobClient: map 9% reduce 0%
13/05/29 11:04:07 INFO mapred.JobClient: map 10% reduce 0%
13/05/29 11:05:13 INFO mapred.JobClient: map 11% reduce 0%
13/05/29 11:06:34 INFO mapred.JobClient: map 12% reduce 0%
13/05/29 11:06:59 INFO mapred.JobClient: map 13% reduce 0%
13/05/29 11:08:14 INFO mapred.JobClient: map 14% reduce 0%
13/05/29 11:08:39 INFO mapred.JobClient: map 15% reduce 0%
13/05/29 11:09:35 INFO mapred.JobClient: map 16% reduce 0%
13/05/29 11:10:03 INFO mapred.JobClient: map 17% reduce 0%
13/05/29 11:10:55 INFO mapred.JobClient: map 18% reduce 0%
13/05/29 11:11:47 INFO mapred.JobClient: map 19% reduce 0%
13/05/29 11:14:05 INFO mapred.JobClient: map 20% reduce 0%
13/05/29 11:15:22 INFO mapred.JobClient: map 21% reduce 0%
13/05/29 11:15:49 INFO mapred.JobClient: map 22% reduce 0%
13/05/29 11:17:09 INFO mapred.JobClient: map 23% reduce 0%
13/05/29 11:18:06 INFO mapred.JobClient: map 24% reduce 0%
13/05/29 11:18:29 INFO mapred.JobClient: map 25% reduce 0%
13/05/29 11:18:53 INFO mapred.JobClient: map 26% reduce 0%
13/05/29 11:20:05 INFO mapred.JobClient: map 27% reduce 0%
13/05/29 11:21:09 INFO mapred.JobClient: map 28% reduce 0%
13/05/29 11:21:45 INFO mapred.JobClient: map 29% reduce 0%
13/05/29 11:22:14 INFO mapred.JobClient: map 30% reduce 0%
13/05/29 11:22:31 INFO mapred.JobClient: map 31% reduce 0%
13/05/29 11:22:32 INFO mapred.JobClient: map 32% reduce 0%
13/05/29 11:23:01 INFO mapred.JobClient: map 33% reduce 0%
13/05/29 11:23:41 INFO mapred.JobClient: map 34% reduce 0%
13/05/29 11:24:29 INFO mapred.JobClient: map 35% reduce 0%
13/05/29 11:25:16 INFO mapred.JobClient: map 36% reduce 0%
13/05/29 11:25:58 INFO mapred.JobClient: map 37% reduce 0%
13/05/29 11:27:09 INFO mapred.JobClient: map 38% reduce 0%
13/05/29 11:27:55 INFO mapred.JobClient: map 39% reduce 0%
13/05/29 11:28:33 INFO mapred.JobClient: map 40% reduce 0%
13/05/29 11:29:50 INFO mapred.JobClient: map 41% reduce 0%
13/05/29 11:30:29 INFO mapred.JobClient: map 42% reduce 0%
13/05/29 11:31:37 INFO mapred.JobClient: map 43% reduce 0%
13/05/29 11:32:10 INFO mapred.JobClient: map 44% reduce 0%
13/05/29 11:32:34 INFO mapred.JobClient: map 45% reduce 0%
13/05/29 11:34:08 INFO mapred.JobClient: map 46% reduce 0%
13/05/29 11:36:01 INFO mapred.JobClient: map 47% reduce 0%
13/05/29 11:36:57 INFO mapred.JobClient: map 48% reduce 0%
13/05/29 11:37:53 INFO mapred.JobClient: map 49% reduce 0%
13/05/29 11:39:50 INFO mapred.JobClient: map 50% reduce 0%
13/05/29 11:42:17 INFO mapred.JobClient: map 51% reduce 0%
13/05/29 11:43:26 INFO mapred.JobClient: map 52% reduce 0%
13/05/29 11:47:55 INFO mapred.JobClient: map 53% reduce 0%
13/05/29 11:48:25 INFO mapred.JobClient: map 54% reduce 0%
13/05/29 11:49:28 INFO mapred.JobClient: map 54% reduce 2%
13/05/29 11:49:31 INFO mapred.JobClient: map 54% reduce 4%
13/05/29 11:50:03 INFO mapred.JobClient: map 55% reduce 4%
13/05/29 11:50:49 INFO mapred.JobClient: map 56% reduce 4%
13/05/29 11:50:54 INFO mapred.JobClient: map 58% reduce 4%
13/05/29 11:51:21 INFO mapred.JobClient: map 59% reduce 4%
13/05/29 11:51:46 INFO mapred.JobClient: Task Id : attempt_201305290801_0002_m_000002_1, Status : FAILED
Task attempt_201305290801_0002_m_000002_1 failed to report status for 685 seconds. Killing!
13/05/29 11:52:09 INFO mapred.JobClient: map 61% reduce 4%
13/05/29 11:52:27 INFO mapred.JobClient: map 62% reduce 4%
13/05/29 11:52:53 INFO mapred.JobClient: map 63% reduce 4%
13/05/29 11:53:36 INFO mapred.JobClient: map 64% reduce 4%
13/05/29 11:53:57 INFO mapred.JobClient: map 65% reduce 4%
13/05/29 11:54:41 INFO mapred.JobClient: map 66% reduce 4%
13/05/29 11:55:51 INFO mapred.JobClient: map 67% reduce 4%
13/05/29 11:57:00 INFO mapred.JobClient: map 68% reduce 4%
13/05/29 11:57:04 INFO mapred.JobClient: map 69% reduce 4%
13/05/29 11:57:11 INFO mapred.JobClient: map 70% reduce 4%
13/05/29 11:57:41 INFO mapred.JobClient: map 71% reduce 4%
13/05/29 11:58:13 INFO mapred.JobClient: map 72% reduce 4%
13/05/29 11:58:45 INFO mapred.JobClient: map 73% reduce 4%
13/05/29 11:59:05 INFO mapred.JobClient: map 74% reduce 4%
13/05/29 11:59:08 INFO mapred.JobClient: map 74% reduce 6%
13/05/29 11:59:42 INFO mapred.JobClient: map 75% reduce 6%
13/05/29 11:59:52 INFO mapred.JobClient: map 76% reduce 6%
13/05/29 12:00:33 INFO mapred.JobClient: map 77% reduce 6%
13/05/29 12:00:53 INFO mapred.JobClient: map 78% reduce 6%
13/05/29 12:01:06 INFO mapred.JobClient: map 79% reduce 6%
13/05/29 12:01:51 INFO mapred.JobClient: map 80% reduce 6%
13/05/29 12:02:29 INFO mapred.JobClient: map 81% reduce 6%
13/05/29 12:02:39 INFO mapred.JobClient: map 82% reduce 6%
13/05/29 12:02:56 INFO mapred.JobClient: map 83% reduce 6%
13/05/29 12:03:36 INFO mapred.JobClient: map 84% reduce 6%
13/05/29 12:04:05 INFO mapred.JobClient: map 85% reduce 6%
13/05/29 12:04:59 INFO mapred.JobClient: map 86% reduce 6%
13/05/29 12:05:47 INFO mapred.JobClient: map 87% reduce 6%
13/05/29 12:07:04 INFO mapred.JobClient: map 88% reduce 6%
13/05/29 12:08:00 INFO mapred.JobClient: map 89% reduce 6%
13/05/29 12:08:32 INFO mapred.JobClient: map 90% reduce 6%
13/05/29 12:09:41 INFO mapred.JobClient: map 91% reduce 6%
13/05/29 12:10:04 INFO mapred.JobClient: map 92% reduce 6%
13/05/29 12:10:17 INFO mapred.JobClient: map 93% reduce 6%
13/05/29 12:10:45 INFO mapred.JobClient: map 94% reduce 6%
13/05/29 12:10:49 INFO mapred.JobClient: map 95% reduce 6%
13/05/29 12:11:00 INFO mapred.JobClient: map 96% reduce 6%
13/05/29 12:11:03 INFO mapred.JobClient: map 97% reduce 6%
13/05/29 12:11:12 INFO mapred.JobClient: map 98% reduce 6%
13/05/29 12:11:17 INFO mapred.JobClient: map 99% reduce 6%
13/05/29 12:12:02 INFO mapred.JobClient: map 100% reduce 6%
^C[hadoop#master ~]$
From the limited information you gave (console output), it looks like the cluster isn't healthy.
13/05/29 11:03:20 INFO mapred.JobClient: Task Id : attempt_201305290801_0002_m_000002_0, Status : FAILED
Task attempt_201305290801_0002_m_000002_0 failed to report status for 604 seconds. Killing!
Tasks were attempted on some node that did not report back to the JobTracker within 10 minutes (the default task timeout, mapred.task.timeout, is 600000 ms), so they were killed and rescheduled. You should dig into the logs to identify which node(s) are failing their assigned tasks.
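As a first step, the console output can at least tell you which tasks keep timing out. Here is a small sketch that counts timeout kills per task. The function name timed_out_tasks is made up for this example, and the pattern assumes the "failed to report status" message format shown above. The console output does not name the node, so mapping attempts to hosts still requires the JobTracker web UI or the daemon logs.

```python
import re
from collections import Counter

# Captures the task id (attempt id minus the trailing attempt number)
TIMEOUT = re.compile(
    r"Task (attempt_\d+_\d+_[mr]_\d+)_(\d+) failed to report status for (\d+) seconds"
)

def timed_out_tasks(lines):
    """Count how many attempts of each task were killed for not reporting status."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

log = [
    "Task attempt_201305290801_0002_m_000002_0 failed to report status for 604 seconds. Killing!",
    "Task attempt_201305290801_0002_m_000003_0 failed to report status for 604 seconds. Killing!",
    "Task attempt_201305290801_0002_m_000002_1 failed to report status for 685 seconds. Killing!",
]

for task, kills in timed_out_tasks(log).most_common():
    print(task, kills)
```

A task that times out on every attempt points at its input split or the task code, while timeouts spread across many tasks point more towards slow or unhealthy nodes.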

Resources