Shuffle, merger and fetcher errors when processing large files in Hadoop

I am running a word-count-like MapReduce job that processes 200 files of 1 GB each. The job runs on a Hadoop cluster of 4 datanodes (2 CPUs each) with 8 GB of memory and about 200 GB of disk space. I have tried various configuration options, but every time the job fails with either InMemory Shuffle, OnDisk Shuffle, InMemory Merger, OnDisk Merger, or Fetcher errors.
The size of the mapper output is comparable to the size of the input files. Therefore, in order to minimise the mapper output size, I am using BZip2 compression for the MapReduce output. However, even with compressed map output I still get errors in the reduce phase. I use 4 reducers. I have tried various configurations of the Hadoop cluster:
The standard configuration of the cluster was:
Default virtual memory for a job's map-task 3328 Mb
Default virtual memory for a job's reduce-task 6656 Mb
Map-side sort buffer memory 205 Mb
Mapreduce Log Dir Prefix /var/log/hadoop-mapreduce
Mapreduce PID Dir Prefix /var/run/hadoop-mapreduce
yarn.app.mapreduce.am.resource.mb 6656
mapreduce.admin.map.child.java.opts -Djava.net.preferIPv4Stack=TRUE -Dhadoop.metrics.log.level=WARN
mapreduce.admin.reduce.child.java.opts -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
mapreduce.admin.user.env LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native/`$JAVA_HOME/bin/java -d32 -version &> /dev/null;if [ $? -eq 0 ]; then echo Linux-i386-32; else echo Linux-amd64-64;fi`
mapreduce.am.max-attempts 2
mapreduce.application.classpath $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
mapreduce.cluster.administrators hadoop
mapreduce.framework.name yarn
mapreduce.job.reduce.slowstart.completedmaps 0.05
mapreduce.jobhistory.address ip-XXXX.compute.internal:10020
mapreduce.jobhistory.done-dir /mr-history/done
mapreduce.jobhistory.intermediate-done-dir /mr-history/tmp
mapreduce.jobhistory.webapp.address ip-XXXX.compute.internal:19888
mapreduce.map.java.opts -Xmx2662m
mapreduce.map.log.level INFO
mapreduce.map.output.compress true
mapreduce.map.sort.spill.percent 0.7
mapreduce.map.speculative false
mapreduce.output.fileoutputformat.compress true
mapreduce.output.fileoutputformat.compress.type BLOCK
mapreduce.reduce.input.buffer.percent 0.0
mapreduce.reduce.java.opts -Xmx5325m
mapreduce.reduce.log.level INFO
mapreduce.reduce.shuffle.input.buffer.percent 0.7
mapreduce.reduce.shuffle.merge.percent 0.66
mapreduce.reduce.shuffle.parallelcopies 30
mapreduce.reduce.speculative false
mapreduce.shuffle.port 13562
mapreduce.task.io.sort.factor 100
mapreduce.task.timeout 300000
yarn.app.mapreduce.am.admin-command-opts -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
yarn.app.mapreduce.am.command-opts -Xmx5325m
yarn.app.mapreduce.am.log.level INFO
yarn.app.mapreduce.am.staging-dir /user
mapreduce.map.maxattempts 4
mapreduce.reduce.maxattempts 4
This configuration gave me the following error:
14/05/16 20:20:05 INFO mapreduce.Job: map 20% reduce 3%
14/05/16 20:27:13 INFO mapreduce.Job: map 20% reduce 0%
14/05/16 20:27:13 INFO mapreduce.Job: Task Id : attempt_1399989158376_0049_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in InMemoryMerger - Thread to merge in-memory shuffled map-outputs
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1399989158376_0049_r_000000_0/map_2038.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:398)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.mapred.YarnOutputFiles.getInputFileForWrite(YarnOutputFiles.java:213)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:450)
at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94)
Then I tried changing various options, hoping to reduce the load during the shuffle phase, but I got the same error.
mapreduce.reduce.shuffle.parallelcopies 5
mapreduce.task.io.sort.factor 10
or
mapreduce.reduce.shuffle.parallelcopies 10
mapreduce.task.io.sort.factor 20
Then I realised that the tmp directories on my datanodes did not exist, and therefore all the merging and shuffling was happening in memory. So I manually added them on each datanode.
I kept the initial configuration but increased the delay before the reducers start, in order to limit the load on the datanodes.
mapreduce.job.reduce.slowstart.completedmaps 0.7
I've also tried increasing the io.sort.mb:
mapreduce.task.io.sort.mb from 205 to 512.
However now I get the following onDisk error:
14/05/26 12:17:08 INFO mapreduce.Job: map 62% reduce 21%
14/05/26 12:20:13 INFO mapreduce.Job: Task Id : attempt_1400958508328_0021_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in OnDiskMerger - Thread to merge on-disk map-outputs
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for hadoop/yarn/local/usercache/eoc21/appcache/application_1400958508328_0021/output/attempt_1400958508328_0021_r_000000_0/map_590.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:398)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$OnDiskMerger.merge(MergeManagerImpl.java:536)
at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94)
The reducer dropped down to 0% and when it got back to 17% I got the following error:
14/05/26 12:32:03 INFO mapreduce.Job: Task Id : attempt_1400958508328_0021_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#22
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1400958508328_0021_r_000000_1/map_1015.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:398)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.mapred.YarnOutputFiles.getInputFileForWrite(YarnOutputFiles.java:213)
at org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput.<init>(OnDiskMapOutput.java:61)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:257)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:411)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
I read around and it seems that "Could not find any valid local directory for output/attempt_1400958508328_0021_r_000000_1/map_1015.out" is related to not having enough space on the node for the spill. However, I checked the datanode and it seems that there is enough space:
Filesystem Size Used Avail Use% Mounted on
/dev/xvde1 40G 22G 18G 56% /
none 3.6G 0 3.6G 0% /dev/shm
/dev/xvdj 1008G 758G 199G 80% /hadoop/hdfs/data
So I am not sure what to try next. Is the cluster too small for processing such jobs? Do I need more space on the datanodes? Is there a way to find an optimum configuration for the job on Hadoop? Any suggestion is highly appreciated!

It could be one of four things I know of, the most likely being the point you made in your question about disk space, or a similar problem, inodes:
Files being deleted by another process (unlikely, unless you remember doing this yourself)
Disk error (unlikely)
Not enough disk space
Not enough inodes (run df -i)
Even if you run df -h and df -i before and after the job, you don't know how much space is being consumed and cleaned up during the job. So while your job is running, I suggest watching these numbers, logging them to a file, graphing them, etc. E.g.
watch "df -h && df -i"
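The same idea can be sketched in Java if you would rather log the free space from a small program than from a terminal. This is a minimal sketch, not part of Hadoop: the class name LocalDirSpace is made up for illustration, and the directories to watch are an assumption (pass your own NodeManager local directories as arguments). It only reports bytes, not inodes, so df -i is still needed for the inode check.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name), not a Hadoop class.
final class LocalDirSpace {

    /** Returns the usable bytes the JVM reports for the partition of each path. */
    static Map<String, Long> usableBytes(String... paths) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String path : paths) {
            // File.getUsableSpace() returns 0 when the path does not name a partition,
            // e.g. when the directory does not exist.
            result.put(path, new File(path).getUsableSpace());
        }
        return result;
    }

    /** True if every watched directory still has at least requiredBytes available. */
    static boolean allHaveAtLeast(Map<String, Long> usable, long requiredBytes) {
        return usable.values().stream().allMatch(bytes -> bytes >= requiredBytes);
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: the directories to watch are given on the command line;
        // "." is only a placeholder default.
        String[] dirs = args.length > 0 ? args : new String[] {"."};
        for (int sample = 0; sample < 3; sample++) { // raise the count for a real job
            System.out.println(System.currentTimeMillis() + " " + usableBytes(dirs));
            Thread.sleep(1000);
        }
    }
}
```

Redirecting the output to a file while the job runs gives a timeline that can be compared with the timestamps of the failed reduce attempts.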

You need to specify some temp directories to store the intermediate map and reduce output.
Maybe you have not specified any temp directories, so Hadoop could not find a valid directory to store the intermediate data.
You can do this by editing mapred-site.xml:
<property>
<name>mapred.local.dir</name>
<value>/temp1,/temp2,/temp3</value>
</property>
Comma-separated list of paths on the local filesystem where temporary MapReduce data is written. Multiple paths help spread disk i/o.
Once these temp directories are specified, Hadoop stores the intermediate map and reduce output in one of them, choosing the directory in one of the following ways:
random: In this case, the intermediate data for reduce tasks is stored at a data location chosen at random.
max: In this case, the intermediate data for reduce tasks is stored at a data location with the most available space.
roundrobin: In this case, the mappers and reducers pick disks through round-robin scheduling for storing intermediate data at the job level within the number of local disks. The job ID is used to create unique sub directories on the local disks to store the intermediate data for each job.
You can set this property in mapred-site.xml, for example:
<property>
<name>mapreduce.job.local.dir.locator</name>
<value>max</value>
</property>
By default, Hadoop uses roundrobin.
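To make the three policies concrete, here is a minimal sketch of how each one might pick a directory. It is an illustration of the descriptions above, not Hadoop's implementation: the Dir type, the method names and the free-space numbers are all made up, and the round-robin counter stands in for whatever per-job state the real locator keeps.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only (hypothetical names); not Hadoop code.
final class DirLocatorSketch {

    /** A candidate local directory and its free space in bytes. */
    static final class Dir {
        final String path;
        final long freeBytes;

        Dir(String path, long freeBytes) {
            this.path = path;
            this.freeBytes = freeBytes;
        }
    }

    /** "random": any directory, chosen uniformly. */
    static Dir pickRandom(List<Dir> dirs, Random rng) {
        return dirs.get(rng.nextInt(dirs.size()));
    }

    /** "max": the directory with the most available space. */
    static Dir pickMax(List<Dir> dirs) {
        Dir best = dirs.get(0);
        for (Dir d : dirs) {
            if (d.freeBytes > best.freeBytes) {
                best = d;
            }
        }
        return best;
    }

    /** "roundrobin": cycle through the directories in order. */
    static Dir pickRoundRobin(List<Dir> dirs, AtomicInteger counter) {
        return dirs.get(Math.floorMod(counter.getAndIncrement(), dirs.size()));
    }

    public static void main(String[] args) {
        // Free-space values are invented for the example.
        List<Dir> dirs = Arrays.asList(
                new Dir("/temp1", 10L), new Dir("/temp2", 30L), new Dir("/temp3", 20L));
        AtomicInteger counter = new AtomicInteger();
        System.out.println("max:        " + pickMax(dirs).path);
        System.out.println("roundrobin: " + pickRoundRobin(dirs, counter).path
                + ", " + pickRoundRobin(dirs, counter).path);
        System.out.println("random:     " + pickRandom(dirs, new Random()).path);
    }
}
```

The trade-off the sketch shows: max protects against a single disk filling up, while roundrobin spreads I/O evenly even when the disks have different amounts of free space.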

The property is "mapreduce.cluster.local.dir" (old, deprecated name: mapred.local.dir), specified in mapred-site.xml.

Related

How to tune the Hadoop MapReduce parameters on Amazon EMR?

My MR job ended at map 100% reduce 35% with lots of error messages similar to: running beyond physical memory limits. Current usage: 3.0 GB of 3 GB physical memory used; 3.7 GB of 15 GB virtual memory used. Killing container.
My input *.bz2 file is about 4 GB; uncompressed, it is about 38 GB. The job ran for about one hour with one master and two slave nodes on Amazon EMR.
My questions are:
- Why did this job use so much memory?
- Why did this job take about one hour? Usually, running a 40 GB wordcount job on a small 4-node cluster takes about 10 minutes.
- How can I tune the MR parameters to solve this problem?
- Which Amazon EC2 instance types are a good fit for this problem?
Please refer to the following log:
- Physical memory (bytes) snapshot=43327889408 => 43.3GB
- Virtual memory (bytes) snapshot=108950675456 => 108.95GB
- Total committed heap usage (bytes)=34940649472 => 34.94GB
My proposed solutions are as follows, but I'm not sure whether they are correct:
- use a larger Amazon EC2 instance with at least 8 GB of memory
- tune the MR parameters using the following code
Version 1:
Configuration conf = new Configuration();
// Set properties before Job.getInstance(conf): the Job works on its own copy of conf.
// don't kill the container if the physical memory exceeds "mapreduce.reduce.memory.mb" or "mapreduce.map.memory.mb"
// (these two are NodeManager settings, normally read from yarn-site.xml, so a per-job value is unlikely to take effect)
conf.setBoolean("yarn.nodemanager.pmem-check-enabled", false);
conf.setBoolean("yarn.nodemanager.vmem-check-enabled", false);
Job job = Job.getInstance(conf, "jobtest1");
Version 2:
Configuration conf = new Configuration();
//conf.set("mapreduce.input.fileinputformat.split.minsize","3073741824");
// Container sizes in MB, with the JVM heap kept below them (6144 / 8192 = 75%).
conf.set("mapreduce.map.memory.mb", "8192");
conf.set("mapreduce.map.java.opts", "-Xmx6144m");
conf.set("mapreduce.reduce.memory.mb", "8192");
conf.set("mapreduce.reduce.java.opts", "-Xmx6144m");
// Create the Job last so that it picks up the settings above.
Job job = Job.getInstance(conf, "jobtest2");
Log:
15/11/08 11:37:27 INFO mapreduce.Job: map 100% reduce 35%
15/11/08 11:37:27 INFO mapreduce.Job: Task Id : attempt_1446749367313_0006_r_000006_2, Status : FAILED
Container [pid=24745,containerID=container_1446749367313_0006_01_003145] is running beyond physical memory limits. Current usage: 3.0 GB of 3 GB physical memory used; 3.7 GB of 15 GB virtual memory used. Killing container.
Dump of the process-tree for container_1446749367313_0006_01_003145 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 24745 24743 24745 24745 (bash) 0 0 9658368 291 /bin/bash -c /usr/lib/jvm/java-openjdk/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2304m -Djava.io.tmpdir=/mnt1/yarn/usercache/ec2-user/appcache/application_1446749367313_0006/container_1446749367313_0006_01_003145/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1446749367313_0006/container_1446749367313_0006_01_003145 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild **.***.***.*** 32846 attempt_1446749367313_0006_r_000006_2 3145 1>/var/log/hadoop-yarn/containers/application_1446749367313_0006/container_1446749367313_0006_01_003145/stdout 2>/var/log/hadoop-yarn/containers/application_1446749367313_0006/container_1446749367313_0006_01_003145/stderr
|- 24749 24745 24745 24745 (java) 14124 1281 3910426624 789477 /usr/lib/jvm/java-openjdk/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2304m -Djava.io.tmpdir=/mnt1/yarn/usercache/ec2-user/appcache/application_1446749367313_0006/container_1446749367313_0006_01_003145/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1446749367313_0006/container_1446749367313_0006_01_003145 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild **.***.***.*** 32846 attempt_1446749367313_0006_r_000006_2 3145
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
15/11/08 11:37:28 INFO mapreduce.Job: map 100% reduce 25%
15/11/08 11:37:30 INFO mapreduce.Job: map 100% reduce 26%
15/11/08 11:37:37 INFO mapreduce.Job: map 100% reduce 27%
15/11/08 11:37:42 INFO mapreduce.Job: map 100% reduce 28%
15/11/08 11:37:53 INFO mapreduce.Job: map 100% reduce 29%
15/11/08 11:37:57 INFO mapreduce.Job: map 100% reduce 34%
15/11/08 11:38:02 INFO mapreduce.Job: map 100% reduce 35%
15/11/08 11:38:13 INFO mapreduce.Job: map 100% reduce 36%
15/11/08 11:38:22 INFO mapreduce.Job: map 100% reduce 37%
15/11/08 11:38:35 INFO mapreduce.Job: map 100% reduce 42%
15/11/08 11:38:36 INFO mapreduce.Job: map 100% reduce 100%
15/11/08 11:38:36 INFO mapreduce.Job: Job job_1446749367313_0006 failed with state FAILED due to: Task failed task_1446749367313_0006_r_000001
Job failed as tasks failed. failedMaps:0 failedReduces:1
15/11/08 11:38:36 INFO mapreduce.Job: Counters: 43
File System Counters
FILE: Number of bytes read=11806418671
FILE: Number of bytes written=22240791936
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16874
HDFS: Number of bytes written=0
HDFS: Number of read operations=59
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
S3: Number of bytes read=3942336319
S3: Number of bytes written=0
S3: Number of read operations=0
S3: Number of large read operations=0
S3: Number of write operations=0
Job Counters
Failed reduce tasks=22
Killed reduce tasks=5
Launched map tasks=59
Launched reduce tasks=27
Data-local map tasks=59
Total time spent by all maps in occupied slots (ms)=114327828
Total time spent by all reduces in occupied slots (ms)=131855700
Total time spent by all map tasks (ms)=19054638
Total time spent by all reduce tasks (ms)=10987975
Total vcore-seconds taken by all map tasks=19054638
Total vcore-seconds taken by all reduce tasks=10987975
Total megabyte-seconds taken by all map tasks=27438678720
Total megabyte-seconds taken by all reduce tasks=31645368000
Map-Reduce Framework
Map input records=728795619
Map output records=728795618
Map output bytes=50859151614
Map output materialized bytes=10506705085
Input split bytes=16874
Combine input records=0
Spilled Records=1457591236
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=150143
CPU time spent (ms)=14360870
Physical memory (bytes) snapshot=43327889408
Virtual memory (bytes) snapshot=108950675456
Total committed heap usage (bytes)=34940649472
File Input Format Counters
Bytes Read=0
I am not sure about Amazon EMR, so here are a few points to consider regarding MapReduce in general:
bzip2 is slower, although it compresses better than gzip. bzip2's decompression speed is faster than its compression speed, but it is still slower than the other formats. So at a high level, that already sets your job apart from the 40 GB word count that ran in ten minutes (assuming that job did not use compression). The next question is how much slower.
However, your job is still failing after one hour; please confirm this. Only once the job runs successfully can we think about performance, so let's first look at why it is failing.
You were getting a memory error. Based on the error, a container failed during the reduce phase (the map phase completed 100%). Most likely not even one reducer succeeded. Even though 35% might trick you into thinking that some reducers ran, reduce progress also counts the copy and sort work that happens before the first reduce call. One way to confirm is to see whether any reducer output file was generated.
Once you have confirmed that none of the reducers ran, you can increase the memory for the containers as per your Version 2.
Your Version 1 will help you see whether only a specific container is causing the issue, while allowing the job to complete.
Your input file size should determine the number of reducers. The standard is 1 reducer per 1 GB unless you are compressing the mapper output data. So in this case the ideal number would have been at least 38. Try passing the command line option -D mapred.reduce.tasks=40 and see if there is any change.
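The two numbers discussed above can be worked out with a few lines of arithmetic. The sketch below is only an illustration (the class and method names are made up): it keeps the heap at 75% of the container, which is the ratio in the question's Version 2 (6144 / 8192) and in the failing container's log (-Xmx2304m in a 3 GB container), and it applies the one-reducer-per-GB rule of thumb to the 38 GB of uncompressed input.

```java
// Illustrative arithmetic only (hypothetical names); not part of Hadoop.
final class ContainerSizing {

    private static final long GB = 1024L * 1024L * 1024L;

    /** JVM heap in MB as a fraction of the container size, rounded down. */
    static int heapMb(int containerMb, double heapFraction) {
        return (int) Math.floor(containerMb * heapFraction);
    }

    /** Reducer count under the "one reducer per GB of input" rule of thumb. */
    static int reducersFor(long inputBytes) {
        long reducers = (inputBytes + GB - 1) / GB; // ceiling division
        return (int) Math.max(1L, reducers);
    }

    public static void main(String[] args) {
        // 0.75 is the ratio observed in the question, not an official constant.
        System.out.println("mapreduce.reduce.memory.mb=8192 -> -Xmx"
                + heapMb(8192, 0.75) + "m");
        System.out.println("container of 3072 MB        -> -Xmx"
                + heapMb(3072, 0.75) + "m");
        System.out.println("reducers for 38 GB of input -> " + reducersFor(38L * GB));
    }
}
```

The gap between heap and container size leaves room for non-heap JVM memory; if the container is still killed at its physical limit after raising both values, the reducer's own data structures are the next thing to inspect.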

Hadoop YARN reducer/shuffle stuck

I was migrating from Hadoop 1 to Hadoop 2 YARN. The source code was recompiled using the MRv2 jars and had no compatibility issues. When I ran the job under YARN, the map phase worked fine and went to 100%, but the reduce phase got stuck at about 6-7%. It does not look like a performance issue: I checked CPU usage, and while reduce was stuck there seemed to be no computation going on, because the CPU was almost 100% idle. The job runs successfully on Hadoop 1.2.1.
I checked the log messages from the resourcemanager and found that once the maps finished, no more containers were allocated, so no reduce task was running on any container. What caused this situation?
I'm wondering if it is related to the yarn.nodemanager.aux-services property setting. According to the official tutorial (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html), this property has to be set to mapreduce_shuffle, which indicates that MR will still use the default shuffle method instead of other shuffle plugins (http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html). I tried leaving this property unset, but Hadoop wouldn't let me.
Here's the log from userlogs/applicationfolder/containerfolder/syslog when reduce is about to reach 7%. After that the log stopped updating and reduce stopped as well.
2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1416988910568_0001_m_002988_0 decomp: 129587 len: 129591 to MEMORY
2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 129587 bytes from map-output for attempt_1416988910568_0001_m_002988_0
2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 129587, inMemoryMapOutputs.size() -> 2993, commitMemory -> 342319024, usedMemory ->342448611
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1416988910568_0001_m_002989_0 decomp: 128525 len: 128529 to MEMORY
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 128525 bytes from map-output for attempt_1416988910568_0001_m_002989_0
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 128525, inMemoryMapOutputs.size() -> 2994, commitMemory -> 342448611, usedMemory ->342577136
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: datanode03:13562 freed by fetcher#1 in 13ms
Is this a common issue when migrating from Hadoop 1 to 2? Did the strategy for running map-shuffle-sort-reduce change in Hadoop 2? What caused this problem? Thanks so much. Any comments will help!
Major environment setup:
Hadoop version: 2.5.2
6-node cluster with 8-core CPU, 15 GB memory on each node
Related properties settings:
yarn.scheduler.maximum-allocation-mb: 14336
yarn.scheduler.minimum-allocation-mb: 2500
yarn.nodemanager.resource.memory-mb: 14336
yarn.nodemanager.aux-services: mapreduce_shuffle
mapreduce.task.io.sort.factor: 100
mapreduce.task.io.sort.mb: 1024
I finally solved the problem after googling around, and found out that I had already posted this question three months ago.
It was caused by data skew.
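For readers who have not met data skew before, the sketch below shows the effect with plain Java collections. It assigns keys to reducers with a hash-modulo formula of the kind a default hash partitioner uses; the class name, the keys and the counts are invented for the example and have nothing to do with the job above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only (hypothetical names and data); not Hadoop code.
final class SkewSketch {

    /** Hash-modulo partitioning: every record with the same key goes to the same reducer. */
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Counts how many records each reducer would receive. */
    static long[] recordsPerReducer(List<String> keys, int numReducers) {
        long[] counts = new long[numReducers];
        for (String key : keys) {
            counts[partition(key, numReducers)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 90; i++) {
            keys.add("hot"); // one key dominates the data set
        }
        for (int i = 0; i < 10; i++) {
            keys.add("key" + i);
        }
        // One reducer receives at least the 90 "hot" records; the rest share 10.
        System.out.println(Arrays.toString(recordsPerReducer(keys, 4)));
    }
}
```

With a distribution like this, most reducers finish quickly and sit idle while a single one does nearly all the work, which can look like a stuck job with an idle cluster.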

Map reduce job getting stuck at map 0% reduce 0%

I am running the famous wordcount example. I have a local and a prod Hadoop setup. The same example works in prod, but it does not work locally. Can someone tell me what I should look for?
The job is getting stuck. The task logs are:
~/tmp$ hadoop jar wordcount.jar WordCount /testhistory /outputtest/test
Warning: $HADOOP_HOME is deprecated.
13/08/29 16:12:34 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/29 16:12:35 INFO input.FileInputFormat: Total input paths to process : 3
13/08/29 16:12:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/08/29 16:12:35 WARN snappy.LoadSnappy: Snappy native library not loaded
13/08/29 16:12:35 INFO mapred.JobClient: Running job: job_201308291153_0015
13/08/29 16:12:36 INFO mapred.JobClient: map 0% reduce 0%
Locally, Hadoop is running in pseudo-distributed mode. All 3 processes (namenode, datanode, jobtracker) are running. Let me know if any extra information is required.
The tasktracker seems to be missing.
Try:
hadoop tasktracker &
In Hadoop 2.x this problem could be related to memory issues; see the question "MapReduce in Hadoop 2.2.0 not working".
I had the same problem and this page helped me:
http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide/
Basically, I solved my problem using the following 3 steps. The fact is that I had to configure much more memory than I really have.
1) yarn-site.xml
yarn.resourcemanager.hostname = hostname_of_the_master
yarn.nodemanager.resource.memory-mb = 4000
yarn.nodemanager.resource.cpu-vcores = 2
yarn.scheduler.minimum-allocation-mb = 4000
2) mapred-site.xml
yarn.app.mapreduce.am.resource.mb = 4000
yarn.app.mapreduce.am.command-opts = -Xmx3768m
mapreduce.map.cpu.vcores = 2
mapreduce.reduce.cpu.vcores = 2
3) Send these files across all nodes
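To see how the memory values in steps 1 and 2 interact, here is a small arithmetic sketch. It rests on an assumption that should be checked against your scheduler's documentation: that a container request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. The class and method names are made up for illustration.

```java
// Illustrative arithmetic only (hypothetical names); not part of YARN.
final class YarnContainerMath {

    /**
     * Rounds a memory request up to the next multiple of the minimum allocation.
     * ASSUMPTION: this mirrors how the scheduler normalizes requests.
     */
    static int normalizeMb(int requestMb, int minAllocMb) {
        int steps = Math.max(1, (requestMb + minAllocMb - 1) / minAllocMb);
        return steps * minAllocMb;
    }

    /** Whole containers of the given size that fit in one NodeManager's memory. */
    static int containersPerNode(int nodeMb, int containerMb) {
        return nodeMb / containerMb;
    }

    public static void main(String[] args) {
        int nodeMb = 4000;     // yarn.nodemanager.resource.memory-mb
        int minAllocMb = 4000; // yarn.scheduler.minimum-allocation-mb
        int amMb = normalizeMb(4000, minAllocMb); // yarn.app.mapreduce.am.resource.mb
        System.out.println("AM container MB:     " + amMb);
        System.out.println("Containers per node: " + containersPerNode(nodeMb, amMb));
    }
}
```

Under that assumption, the values above give exactly one container per node, so the application master occupies a whole node and the map and reduce tasks need other nodes (or a larger yarn.nodemanager.resource.memory-mb) to run.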
Apart from the missing tasktracker (hadoop tasktracker &) and the other issues mentioned, please check your code and make sure that there is no infinite loop or other bug. Maybe the problem is in your own code!
If this problem occurs when using Hive queries, check whether you are joining two very big tables without leveraging partitions. Not using partitions may lead to long-running full table scans, and hence a job stuck at map 0% reduce 0%.

Hadoop. Restart Map

After 36 hours of working Hadoop 1.0.3 said:
INFO mapred.JobClient: map 42% reduce 0%
mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1.
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
and stopped.
Is it possible to restart Hadoop jobs not from the very beginning (map 0% reduce 0%) ?
There doesn't seem to be a good way to restart a failed job. A couple of things to keep in mind:
It looks like your mapred config has mapreduce.map.maxattempts=1; the default is typically 4:
mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1.
You would typically want to understand why it failed (it is not clear from your post whether you identified the issue).
It may have failed for a bogus reason, in which case you can handle that exception in your MapReduce program by providing failure traps. You can implement that same concept using the Hadoop API.
Check out this answer here: https://stackoverflow.com/a/9742235/1515370

EOFException thrown by a Hadoop pipes program

First of all, I am new to Hadoop.
I have a small Hadoop pipes program that throws java.io.EOFException. The program takes a small text file as input and uses hadoop.pipes.java.recordreader and hadoop.pipes.java.recordwriter.
The input is very simple, like:
1 262144 42.8084 15.9157 4.1324 0.06 0.1
However, Hadoop throws an EOFException, and I can't see the reason. Below is the stack trace:
10/12/08 23:04:04 INFO mapred.JobClient: Running job: job_201012081252_0016
10/12/08 23:04:05 INFO mapred.JobClient: map 0% reduce 0%
10/12/08 23:04:16 INFO mapred.JobClient: Task Id : attempt_201012081252_0016_m_000000_0, Status : FAILED
java.io.IOException: pipe child exception
at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151)
at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:267)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:114)
BTW, I ran this on a fully-distributed mode (a cluster with 3 work nodes).
Any help is appreciated! Thanks
Lessons learned: by all means, try to make sure there is no bug in your own program.
This stack trace is usually indicative of running out of available file descriptors on your worker machines. This is exceedingly common, sparsely documented, and precisely why I have two related questions on the subject.
If you have root access on all of the machines, you should consider raising the file descriptor limit for your Hadoop user. The system-wide ceiling is set in /etc/sysctl.conf:
(Add) fs.file-max = 4096
Or, for the per-process limit of the shell that starts Hadoop, issuing:
ulimit -Sn 4096
ulimit -Hn 4096
Ad infinitum. General information for raising this limit is available here.
However, from the perspective of long-term planning, this strategy is somewhat spurious. If you happen to discover more information on the problem, perhaps you can help me help you help us all? [Thank you, GLaDOS. -Ed]
(Edit: See commentary that follows.)

Resources