Hadoop error in shuffle in fetcher: Exceeded MAX_FAILED_UNIQUE_FETCHES - hadoop

I am new to hadoop. I have a kerberos security enabled hadoop cluster (master and 1 slave) set up on a virtual box. I am trying to run a job from the hadoop examples 'pi'. The job terminates with the error Exceeded MAX_FAILED_UNIQUE_FETCHES. I tried searching for this error but the solutions given on the internet do not seem to be working for me. Perhaps I am missing something obvious. I even tried removing the slave from the etc/hadoop/slaves file to see if the job can run only on the master but that fails as well with the same error. Below is the log. I am running this on 64-bit Ubuntu 14.04 virtual box. Any help appreciated.
montauk#montauk-vmaster:/usr/local/hadoop$ sudo -u yarn bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar pi 2 10
Number of Maps = 2
Samples per Map = 10
OpenJDK 64-Bit Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/06/05 12:04:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
14/06/05 12:04:49 INFO client.RMProxy: Connecting to ResourceManager at /192.168.0.29:8040
14/06/05 12:04:50 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 17 for yarn on 192.168.0.29:54310
14/06/05 12:04:50 INFO security.TokenCache: Got dt for hdfs://192.168.0.29:54310; Kind: HDFS_DELEGATION_TOKEN, Service: 192.168.0.29:54310, Ident: (HDFS_DELEGATION_TOKEN token 17 for yarn)
14/06/05 12:04:50 INFO input.FileInputFormat: Total input paths to process : 2
14/06/05 12:04:51 INFO mapreduce.JobSubmitter: number of splits:2
14/06/05 12:04:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1401975262053_0007
14/06/05 12:04:51 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 192.168.0.29:54310, Ident: (HDFS_DELEGATION_TOKEN token 17 for yarn)
14/06/05 12:04:53 INFO impl.YarnClientImpl: Submitted application application_1401975262053_0007
14/06/05 12:04:53 INFO mapreduce.Job: The url to track the job: http://montauk-vmaster:8088/proxy/application_1401975262053_0007/
14/06/05 12:04:53 INFO mapreduce.Job: Running job: job_1401975262053_0007
14/06/05 12:05:29 INFO mapreduce.Job: Job job_1401975262053_0007 running in uber mode : false
14/06/05 12:05:29 INFO mapreduce.Job: map 0% reduce 0%
14/06/05 12:06:04 INFO mapreduce.Job: map 50% reduce 0%
14/06/05 12:06:06 INFO mapreduce.Job: map 100% reduce 0%
14/06/05 12:06:34 INFO mapreduce.Job: map 100% reduce 100%
14/06/05 12:06:34 INFO mapreduce.Job: Task Id : attempt_1401975262053_0007_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#4
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

I came across the same problem as yours when I install cdh5.1.0 with kerberos security using tarball,solutions found by google are insufficient memory,but I don't think it's my situation since my input is very small (52K).
After digging several days,I found root cause in this link.
To sum up solutions in that link can be:
add following property in yarn-site.xml even it's default in yarn-default.xml
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
remove property yarn.nodemanager.local-dirs and use default value /tmp.Then exec following commands:
mkdir -p /tmp/hadoop-yarn/nm-local-dir
chown yarn:yarn /tmp/hadoop-yarn/nm-local-dir
The problem can be concluded:
After setting yarn.nodemanager.local-dirs property, the property yarn.nodemanager.aux-services.mapreduce_shuffle.class in yarn-default.xml doesn't work.
The root cause I haven't found also.

I had the same issue.I had mapreduce job without reducer.Then I solved it using job.setNumReduceTasks(0);

change below property in yarn-site.xml and create the directory.
yarn.nodemanager.local-dirs
/tmp
mkdir -p /tmp/hadoop-yarn/nm-local-dir
chown yarn:yarn /tmp/hadoop-yarn/nm-local-dir
tune the resources properety in mapred-site.xml
mapreduce.reduce.shuffle.input.buffer.percent=0.50
mapreduce.reduce.shuffle.memory.limit.percent=0.2
mapreduce.reduce.shuffle.parallelcopies=4
Restart resourcemanager and nodemanager on their respective nodes.

Related

Hadoop Installation in Windows 7

I am working on hadoop installation in Windows 7.
Tried to untar the tarfiles from apache site but it was unsuccessful.
I have searched in internet and found below link.
http://toodey.com/2015/08/10/hadoop-installation-on-windows-without-cygwin-in-10-mints/
I was able to install. But when i was trying to execute the examples i was encountered with below errors.
Command executed :
C:\Users\hadoop\hadoop-2.7.1\bin\hadoop.cmd jar C:\Users\hadoop\hadoop-2.7.1\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.1.jar wordcount /hadoop/input /hadoop/output
Error :
C:\Users\hadoop\hadoop-2.7.1>C:\Users\hadoop\hadoop-2.7.1\bin\hadoop.cmd jar C:\Users\hadoop\hadoop-2.7.1\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.1.jar wordcount /hadoop/input /hadoop/output
16/11/14 17:05:28 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:0000
16/11/14 17:05:30 INFO input.FileInputFormat: Total input paths to process : 3
16/11/14 17:05:30 INFO mapreduce.JobSubmitter: number of splits:3
16/11/14 17:05:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1479122512555_0003
16/11/14 17:05:31 INFO impl.YarnClientImpl: Submitted application application_1479122512555_0003
16/11/14 17:05:32 INFO mapreduce.Job: The url to track the job: http://MachineName:8088/proxy/application_1479122512555_0003/
16/11/14 17:05:32 INFO mapreduce.Job: Running job: job_1479122512555_0003
16/11/14 17:05:36 INFO mapreduce.Job: Job job_1479122512555_0003 running in uber mode : false
16/11/14 17:05:36 INFO mapreduce.Job: map 0% reduce 0%
16/11/14 17:05:36 INFO mapreduce.Job: Job job_1479122512555_0003 failed with state FAILED due to: Application application_1479122512555_0003 failed 2 times due to AM Container for appattempt_1479122512555_0003_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://MachineName:8088/cluster/app/application_1479122512555_0003Then, click on links to logs of each attempt.
Diagnostics: null
Failing this attempt. Failing the application.
16/11/14 17:05:36 INFO mapreduce.Job: Counters: 0
Thanks in advance...
Actually this is due to the permission issues on some of the yarn local directories. So here is the solution
Identify the yarn directories specified using the parameter yarn.nodemanager.local-dirs from /etc/gphd/hadoop/conf/yarn-site.xml on yarn node manager.
Delete the files / folders under usercache from all the directories listed in yarn-site.xml and all the node managers.
e.g
rmdir path/to/yarn/nm/usercache/*

hadoop Input path does not exist

I am trying to get hadoop set up on my laptop. I have followed a few tutorials on setting up hadoop.
I ran this command:
bin/hdfs dfs -mkdir /user/<username>
If I run it again it says already exists.
I try to run the test jar file with this command:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar grep input output 'dfs[a-z.]+'
and receive this exception
16/01/22 15:11:06 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/<username>/.staging/job_1453492366595_0006
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/<username>/grep-temp-891167560
I did not realize that I receive this before this error:
16/01/22 15:51:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/22 15:51:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/01/22 15:51:51 INFO input.FileInputFormat: Total input paths to process : 33
16/01/22 15:51:52 INFO mapreduce.JobSubmitter: number of splits:33
16/01/22 15:51:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453492366595_0009
16/01/22 15:51:52 INFO impl.YarnClientImpl: Submitted application application_1453492366595_0009
16/01/22 15:51:52 INFO mapreduce.Job: The url to track the job: http://Marys-MacBook-Pro.local:8088/proxy/application_1453492366595_0009/
16/01/22 15:51:52 INFO mapreduce.Job: Running job: job_1453492366595_0009
16/01/22 15:51:56 INFO mapreduce.Job: Job job_1453492366595_0009 running in uber mode : false
16/01/22 15:51:56 INFO mapreduce.Job: map 0% reduce 0%
16/01/22 15:51:56 INFO mapreduce.Job: Job job_1453492366595_0009 failed with state FAILED due to: Application application_1453492366595_0009 failed 2 times due to AM Container for appattempt_1453492366595_0009_000002 exited with exitCode: 127
For more detailed output, check application tracking page:http://Marys-MacBook-Pro.local:8088/cluster/app/application_1453492366595_0009Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1453492366595_0009_02_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
There is a stack trace the follows this.
I am on a Mac PC.
I use Hadoop 2.7.2, and While following the Official Docs, I also encountered this problem at first.
The reason was that I forgot to follow "Prepare to Start the Hadoop Cluster" chapter.
I solved it by setting JAVA_HOME in etc/hadoop/hadoop-env.sh.
For me, it's because using wrong version JDK with hadoop. I used hadoop 2.6.5. At first, I started hadoop using oracle JDK 1.8.0_131, ran example jar and error occurred. After I used JDK 1.7.0_80, the example works like a charm.
There is a page about HadoopJavaVersions.

MapReduce job is stuck on a multi node Hadoop-2.7.1 cluster

I have successfully run Hadoop 2.7.1 on a multi node cluster (1 namenode and 4 datanodes). But, when I run MapReduce job (WordCount example from Hadoop website), it always stuck at this point.
[~#~ hadoop-2.7.1]$ bin/hadoop jar WordCount.jar WordCount /user/inputdata/ /user/outputdata
15/09/30 17:54:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/30 17:54:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/09/30 17:54:58 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/09/30 17:54:59 INFO input.FileInputFormat: Total input paths to process : 1
15/09/30 17:55:00 INFO mapreduce.JobSubmitter: number of splits:1
15/09/30 17:55:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1443606819488_0002
15/09/30 17:55:00 INFO impl.YarnClientImpl: Submitted application application_1443606819488_0002
15/09/30 17:55:00 INFO mapreduce.Job: The url to track the job: http://~~~~:8088/proxy/application_1443606819488_0002/
15/09/30 17:55:00 INFO mapreduce.Job: Running job: job_1443606819488_0002
Do I have to specify a memory for yarn?
NOTE: DataNode hardwares are really old (Each has 1GB RAM).
Appreciate your help.
Thank you.
The data nodes memory (1gb) is really very scarce to prepare atleast 1 container to run mapper/reducer/am in it.
You could try lowering the below container memory allocation values in yarn-site.xml with very lower values to get the container created on them.
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
Also try to reduce the below properties values in your job configration,
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts
mapreduce.reduce.java.opts

Why hadoop yarn mapreduce job not working and stop on running job?

I have a mapreduce job and I ran it with YARN mode. But why my mapreduce job stop and not continue while running job step? It's like this :
15/04/04 17:18:21 INFO impl.YarnClientImpl: Submitted application application_1428142358448_0002
15/04/04 17:18:21 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1428142358448_0002/
15/04/04 17:18:21 INFO mapreduce.Job: Running job: job_1428142358448_0002
And that's stop here. Is because lack of memory? After start-all.sh and all daemon have started, I have about 300-350 MB memory. I need your suggest all, why this happened?
Thanks all..
No, this isn't because of out of memory, else the logs would have clearly mentioned that. The job seems to be in running state and has got stuckup somewhere, you can probably go and check on the application master for more details about the job.
I'm sorry but you mean this thing ?
15/04/05 14:11:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/05 14:11:29 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.163:8050
15/04/05 14:11:30 INFO input.FileInputFormat: Total input paths to process : 1
15/04/05 14:11:31 INFO mapreduce.JobSubmitter: number of splits:1
15/04/05 14:11:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428216622742_0003
15/04/05 14:11:31 INFO impl.YarnClientImpl: Submitted application application_1428216622742_0003
15/04/05 14:11:31 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1428216622742_0003/
15/04/05 14:11:31 INFO mapreduce.Job: Running job: job_1428216622742_0003
or something else? on my master node port 8088 there are only tables....

Why Mapreduce with YARN stuck on CDH 5.3?

Mapreduce with YARN fail to move ahead of 0% map and 0% reduce. I am using Cloudera CDH on google compute high memory instance(13 GM RAM). 8 GB free ram is available on the machine. Can you please help me to fix it?
sunny#hadoop-m:~$ hadoop jar /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/jars/hadoop-mapreduce-examples-2.5.0-cdh5.3.0.jar grep input output 'dfs[a-z.]+'
14/12/24 00:13:53 INFO client.RMProxy: Connecting to ResourceManager at hadoop-m.c.sunny-hadoop-trial.internal/10.240.253.233:8032
14/12/24 00:13:53 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/12/24 00:13:54 INFO input.FileInputFormat: Total input paths to process : 5
14/12/24 00:13:54 INFO mapreduce.JobSubmitter: number of splits:5
14/12/24 00:13:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1419360146634_0001
14/12/24 00:13:54 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/12/24 00:13:54 INFO impl.YarnClientImpl: Submitted application application_1419360146634_0001
14/12/24 00:13:55 INFO mapreduce.Job: The url to track the job: http://hadoop-m.c.sunny-hadoop-trial.internal:8088/proxy/application_1419360146634_0001/
14/12/24 00:13:55 INFO mapreduce.Job: Running job: job_1419360146634_0001
Resource Manager Output
Some more info about job
yarn-site.xml: http://pastebin.mozilla.org/8113782
mapred-site.xml: http://pastebin.mozilla.org/8113813
Server 's IP got changed because of DHCP service. Client configuration for HDFS and YARN became stale. I needed to update client configuration, I did it with Cloudera manager and now cluster is running fine.

Resources