Unable to open iterator - hadoop

I am trying to run a Pig Latin script on Hadoop, and I am getting the error "Unable to open iterator for alias result".
I don't think this is a script issue: I tested the script on a small test file and it worked.
I have the feeling it is the reduce function, because the script runs for more than 2 hours and then fails with: Successfully read Input, failed to produce result!
I am also getting this warning:
2015-01-09 09:29:13,320 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: server:port. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2015-01-09 09:29:14,321 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: server:port. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2015-01-09 09:29:15,322 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: server:port. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2015-01-09 09:29:15,429 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-01-09 09:29:15,847 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
2015-01-09 09:29:15,854 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066:Unable to open iterator for alias result
What does that mean? What does FinalApplicationStatus=SUCCEEDED mean?
Thanks very much
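ERROR 1066 is the message Grunt prints when the job behind the alias did not produce a result, so on its own it does not say what went wrong; in the related question below, for instance, it comes after an ERROR 2118 that names the real cause. As a rough sketch, assuming the Pig console output was saved to a file (the function name `list_pig_errors` and the file name are made up for illustration), the distinct error codes and the line where each first appears can be listed like this:

```shell
# list_pig_errors FILE
# Print each distinct "ERROR <code>" found in a saved Pig log, prefixed with
# the line number of its first occurrence, in order of appearance.
# Hypothetical helper: the name is not part of Pig or Hadoop.
list_pig_errors() {
  # grep -n adds the line number, -o keeps only the matched text,
  # awk keeps the first line seen for each error code.
  grep -noE 'ERROR [0-9]+' "$1" | awk -F: '!seen[$2]++'
}

# Example call (pig_console.log is an assumed file name):
# list_pig_errors pig_console.log
```

The earliest code in the list is usually the place to start reading the full stack trace.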

Related

How to solve problem of DUMP not working in Apache Pig

I have Apache Pig version 0.17.0 (r1797386) and Hadoop 2.9.2 on Ubuntu 18.04.
When I start Pig in mapreduce mode, it prints the following messages:
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
21/11/12 09:47:37 WARN pig.Main: Cannot write to log file: /home/hadoop/hadoop-2.9.2/Pig/pigprogs/pig_1636690657754.log
2021-11-12 09:47:37,755 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-11-12 09:47:37,785 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2021-11-12 09:47:37,997 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2021-11-12 09:47:37,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://127.0.0.1:9000
2021-11-12 09:47:38,390 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-51d35c23-16a2-42eb-9868-d9aa4a7aea0f
2021-11-12 09:47:38,390 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt>
I ran a simple Pig script:
grunt>A = LOAD '/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt' USING PigStorage(',') as (a1:int,a2:int,a3:int);
grunt>DUMP A;
When I run DUMP A; it gives the following error messages:
2021-11-12 09:52:07,615 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2021-11-12 09:52:07,628 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2021-11-12 09:52:07,629 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-11-12 09:52:07,629 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2021-11-12 09:52:07,630 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2021-11-12 09:52:07,632 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2021-11-12 09:52:07,632 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2021-11-12 09:52:07,643 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2021-11-12 09:52:07,645 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:07,648 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2021-11-12 09:52:07,648 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2021-11-12 09:52:07,649 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2021-11-12 09:52:08,226 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-849926297/tmp-1952067843/pig-0.17.0-core-h2.jar
2021-11-12 09:52:08,381 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-849926297/tmp764505864/automaton-1.11-8.jar
2021-11-12 09:52:08,951 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-849926297/tmp1481980209/antlr-runtime-3.4.jar
2021-11-12 09:52:09,089 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-849926297/tmp789184813/joda-time-2.9.3.jar
2021-11-12 09:52:09,092 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2021-11-12 09:52:09,145 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2021-11-12 09:52:09,160 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:09,266 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2021-11-12 09:52:09,310 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-11-12 09:52:09,317 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1636690618976_0002
2021-11-12 09:52:09,331 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:DefaultJobName got an error while submitting
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:314)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:331)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:202)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:329)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:393)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
... 18 more
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1636690618976_0002
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4],A[-1,-1] C: R:
2021-11-12 09:52:09,661 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-11-12 09:52:14,674 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-11-12 09:52:14,674 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1636690618976_0002 has failed! Stop running all dependent jobs
2021-11-12 09:52:14,675 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-11-12 09:52:14,682 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:14,694 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:15,695 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:16,696 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:17,698 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:18,699 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:19,702 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:20,703 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:21,704 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:22,705 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:23,707 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:24,708 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:24,823 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:25,825 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:26,827 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:27,828 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:28,830 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:29,831 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:30,833 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:31,834 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:32,836 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:33,837 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:34,839 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:34,956 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:35,958 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:36,960 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:37,961 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:38,963 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:39,965 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:40,966 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:41,968 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:42,970 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:43,972 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:44,973 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:45,080 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-11-12 09:52:45,080 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-11-12 09:52:45,081 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.9.2 0.17.0 hadoop 2021-11-12 09:52:07 2021-11-12 09:52:45 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1636690618976_0002 A MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:314)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:331)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:202)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:329)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:393)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
... 18 more
hdfs://127.0.0.1:9000/tmp/temp-849926297/tmp-249405812,
Input(s):
Failed to read data from "/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt"
Output(s):
Failed to produce result in "hdfs://127.0.0.1:9000/tmp/temp-849926297/tmp-249405812"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1636690618976_0002
2021-11-12 09:52:45,082 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-11-12 09:52:45,087 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A. Backend error : java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-11-12 09:52:45,087 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2021-11-12 09:52:45,087 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A. Backend error : java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.pig.PigServer.openIterator(PigServer.java:1010)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:782)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:564)
at org.apache.pig.Main.main(Main.java:175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:841)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:473)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
at org.apache.pig.PigServer.storeEx(PigServer.java:1119)
at org.apache.pig.PigServer.store(PigServer.java:1082)
at org.apache.pig.PigServer.openIterator(PigServer.java:995)
... 13 more
Caused by: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:344)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:429)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:804)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:214)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.getTaskReports(MRJobStats.java:528)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:823)
... 20 more
Caused by: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy15.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:325)
... 25 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:412)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1568)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
... 35 more
grunt>
I think you should look into myfile1.txt, as it seems to be missing or to have incorrect permissions. According to the log, the job looks for it at this HDFS path:
hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
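One way to check this is sketched below, assuming the `hdfs` command is on the PATH of the machine where Pig runs. The function name `check_input` is made up for illustration, and the upload commands in the comments assume the file exists on the local disk under the same path, which the log does not confirm.

```shell
# Path copied from the ERROR 2118 message; in mapreduce mode Pig resolved it
# against HDFS (hdfs://127.0.0.1:9000), not against the local disk.
INPUT=/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt

# check_input PATH (hypothetical helper): report whether PATH exists in HDFS.
check_input() {
  # "hdfs dfs -test -e" exits with status 0 when the path exists.
  if hdfs dfs -test -e "$1"; then
    echo "present in HDFS: $1"
    hdfs dfs -ls "$1"   # also shows owner and permissions
  else
    echo "missing in HDFS: $1"
  fi
}

if command -v hdfs >/dev/null 2>&1; then
  check_input "$INPUT"
else
  echo "hdfs command not found; run this where the Hadoop client is installed"
fi

# If the file is missing in HDFS but present locally (an assumption),
# it could be uploaded with:
#   hdfs dfs -mkdir -p "$(dirname "$INPUT")"
#   hdfs dfs -put "$INPUT" "$INPUT"
```

If the file is only meant to be read from the local disk, starting Pig with `pig -x local` is another option. The repeated retries to localhost:10020 in the log look like a separate issue: 10020 is the default port of the MapReduce JobHistory server, which on Hadoop 2.x is normally started with `mr-jobhistory-daemon.sh start historyserver`.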

Trying to run statements in Pig, getting an error

When I read a file on HDFS using Pig in mapreduce mode and run dump b, it starts the MapReduce job, and after the job completes it keeps repeating. Please tell me what the problem is. (I have set the file permissions to 777 and the /tmp permissions in HDFS to 777.)
[root@master conf]# pig -x mapreduce
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2017-04-19 23:05:59,615 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0 (r1746530) compiled Jun 01 2016, 23:10:49
2017-04-19 23:05:59,615 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/hadoop/pig/conf/pig_1492623359614.log
2017-04-19 23:05:59,652 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2017-04-19 23:06:01,031 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost/
2017-04-19 23:06:02,136 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
2017-04-19 23:06:02,205 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-3df7c96f-9eac-4874-aab9-9ca7726fe860
2017-04-19 23:06:02,205 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt> a= load '/temp' AS (name:chararray, age:int, salary:int);
grunt> b= foreach a generate (name, salary);
grunt> dump b;
2017-04-19 23:06:22,093 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2017-04-19 23:06:22,190 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2017-04-19 23:06:22,267 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2017-04-19 23:06:22,309 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for a: $1
2017-04-19 23:06:22,456 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2017-04-19 23:06:22,564 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2017-04-19 23:06:22,589 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2017-04-19 23:06:22,589 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2017-04-19 23:06:22,724 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:23,128 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2017-04-19 23:06:23,152 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2017-04-19 23:06:23,154 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2017-04-19 23:06:23,820 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp2091099620/tmp-1166978625/pig-0.16.0-core-h2.jar
2017-04-19 23:06:23,951 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp2091099620/tmp-1829507825/automaton-1.11-8.jar
2017-04-19 23:06:24,026 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp2091099620/tmp-1436552250/antlr-runtime-3.4.jar
2017-04-19 23:06:24,119 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp2091099620/tmp-1393102603/joda-time-2.9.3.jar
2017-04-19 23:06:24,132 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2017-04-19 23:06:24,279 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2017-04-19 23:06:24,302 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:24,920 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2017-04-19 23:06:24,952 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2017-04-19 23:06:24,995 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2017-04-19 23:06:24,995 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2017-04-19 23:06:25,056 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2017-04-19 23:06:25,375 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2017-04-19 23:06:25,889 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1492621692528_0002
2017-04-19 23:06:26,195 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2017-04-19 23:06:26,411 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1492621692528_0002
2017-04-19 23:06:26,537 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master:8088/proxy/application_1492621692528_0002/
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1492621692528_0002
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a,b
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,3],b[-1,-1] C: R:
2017-04-19 23:06:26,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2017-04-19 23:06:26,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1492621692528_0002]
2017-04-19 23:06:48,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2017-04-19 23:06:48,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1492621692528_0002]
2017-04-19 23:06:51,639 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:51,705 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-04-19 23:06:52,983 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:53,985 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:54,989 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:55,993 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:56,994 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:57,995 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:58,999 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:07:00,001 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:07:01,005 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
[2]+ Stopped pig -x mapreduce
Start the JobHistoryServer:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
When run in mapreduce mode, Pig expects the JobHistoryServer to be available.
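The repeated retries against 0.0.0.0:10020 in the log above are consistent with the client using the default value of mapreduce.jobhistory.address and finding nothing listening there. On a single-node setup, starting the daemon is usually enough. If the history server runs on a different machine, a mapred-site.xml fragment along these lines may help. This is only a sketch: the host name historyserver.example.com is a placeholder assumed for illustration, not something taken from the question.

```xml
<configuration>
  <!-- RPC address that clients contact after a job finishes.
       Default is 0.0.0.0:10020. The host name below is hypothetical. -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>historyserver.example.com:10020</value>
  </property>
  <!-- Web UI of the JobHistoryServer. Default is 0.0.0.0:19888. -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>historyserver.example.com:19888</value>
  </property>
</configuration>
```

On Hadoop 3.x the mr-jobhistory-daemon.sh script is deprecated, and the equivalent command is mapred --daemon start historyserver.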

Yarn Container wrong hostname when contacting ResourceManager

I'm trying to write a simple query in Hive (just an INSERT) but I'm having issues with how MapReduce jobs are being provisioned. Containers are getting allocated correctly, but my jobs never run.
It seems that they're contacting the ResourceManager incorrectly. I have verified (via jps) that my ResourceManager is indeed running, and that it is running on hostname hadoop1.personal, which every server has an entry for in /etc/hosts. The issue looks like this:
2016-09-27 09:41:55,223 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-09-27 09:41:55,224 INFO [Socket Reader #1 for port 45744] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 45744
2016-09-27 09:41:55,230 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-09-27 09:41:55,230 INFO [IPC Server listener on 45744] org.apache.hadoop.ipc.Server: IPC Server listener on 45744: starting
2016-09-27 09:41:55,299 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2016-09-27 09:41:55,300 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2016-09-27 09:41:55,300 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2016-09-27 09:41:55,375 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-09-27 09:41:56,414 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-27 09:41:57,415 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-27 09:41:58,415 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-27 09:41:59,416 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-27 09:42:00,417 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
And of course it does go on for some time before eventually dying.
Now, I know that my configurations are getting picked up in some sense. Earlier in the logs, the containers say 2016-09-27 09:41:52,783 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://hadoop1.personal:8020] which is the correct NameNode to be using.
Additionally, if I go to the NodeManager configuration (i.e. http://hadoop2.personal:8042/conf) then I can see that:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop1.personal</value>
  <source>yarn-site.xml</source>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>${yarn.resourcemanager.hostname}:8030</value>
  <source>yarn-default.xml</source>
</property>
So the NodeManager appears to know exactly where the ResourceManager is.
This seems incredibly strange to me: the NodeManager and ResourceManager are talking to each other just fine, but the containers are contacting the wrong scheduler address. How do I control the address the containers contact for scheduling?
As a sidenote, I have tested this both with and without IPv6 enabled as recommended in this answer. No effect.

MapReduce client retrying to connect after job completion

I'm running a simple Pig script on Hadoop 2.6.0-cdh5.7.0.
After the job completes successfully, I get the messages shown in the log below.
It seems like the workers are trying to communicate with each other (with a maximum of 3 retries), but I'm not sure why, or where this behavior is configured.
Does anyone know how to solve this issue?
Output(s):
Successfully stored 46933 records (12822705 bytes) in: "/profile/main_output_merged"
Counters:
Total records written : 46933
Total bytes written : 12822705
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1469941650260_0002 -> job_1469941650260_0011,
job_1469941650260_0003 -> job_1469941650260_0011,
job_1469941650260_0001 -> job_1469941650260_0005,job_1469941650260_0006,
job_1469941650260_0005 -> job_1469941650260_0006,
job_1469941650260_0006 -> job_1469941650260_0007,
job_1469941650260_0007 -> job_1469941650260_0008,job_1469941650260_0009,
job_1469941650260_0004 -> job_1469941650260_0008,
job_1469941650260_0008 -> job_1469941650260_0010,
job_1469941650260_0010 -> job_1469941650260_0011,
job_1469941650260_0009 -> job_1469941650260_0011,
job_1469941650260_0011
2016-07-31 05:28:54,418 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p1.c.project.internal/10.240.0.22:38762. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:55,419 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p1.c.project.internal/10.240.0.22:38762. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:56,420 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p1.c.project.internal/10.240.0.22:38762. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:56,527 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:28:57,626 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:35325. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:58,628 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:35325. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:59,629 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:35325. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:28:59,732 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:00,833 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:45573. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:01,834 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:45573. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:02,835 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:45573. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:02,939 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:04,051 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:36934. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:05,052 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:36934. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:06,053 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:36934. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:06,157 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:07,244 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:43862. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:08,245 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:43862. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:09,246 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker2.c.project.internal/10.240.0.24:43862. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:09,350 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:10,643 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:38481. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:11,644 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:38481. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:12,645 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:38481. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:12,749 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:13,832 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:34431. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:14,833 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:34431. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:15,834 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker-p2.c.project.internal/10.240.0.17:34431. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:15,937 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:17,045 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker1.c.project.internal/10.240.0.27:38757. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:18,046 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker1.c.project.internal/10.240.0.27:38757. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:19,047 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker1.c.project.internal/10.240.0.27:38757. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:19,149 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:20,230 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:37952. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:21,231 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:37952. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:22,232 [MainThread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: cdh-worker3.c.project.internal/10.240.0.25:37952. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-07-31 05:29:22,335 [MainThread] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-31 05:29:22,417 [MainThread] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist

When I try to import data from MySQL to HDFS, I get the following error:
org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
Here is the terminal output I got:
student@ubuntu:~$ sqoop import --connect jdbc:mysql://localhost:3306/p \
> --username root \
> --password student \
> --table p \
> -m \
> 1;
Warning: /home/student/Applications/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/student/Applications/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/student/Applications/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/student/Applications/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
15/10/23 04:23:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
15/10/23 04:23:49 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/10/23 04:23:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/10/23 04:23:52 INFO tool.CodeGenTool: Beginning code generation
15/10/23 04:24:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `p` AS t LIMIT 1
15/10/23 04:24:01 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `p` AS t LIMIT 1
15/10/23 04:24:01 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/student/Applications/hadoop
Note: /tmp/sqoop-student/compile/d0a3526dcf308f25f4333c8558068bb8/p.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/10/23 04:25:03 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-student/compile/d0a3526dcf308f25f4333c8558068bb8/p.jar
15/10/23 04:25:04 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/10/23 04:25:04 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/10/23 04:25:04 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/10/23 04:25:04 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/10/23 04:25:05 INFO mapreduce.ImportJobBase: Beginning import of p
15/10/23 04:25:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/23 04:25:31 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/10/23 04:26:02 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/10/23 04:26:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/10/23 04:26:40 INFO db.DBInputFormat: Using read commited transaction isolation
15/10/23 04:26:41 INFO mapreduce.JobSubmitter: number of splits:1
15/10/23 04:26:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1445598425022_0001
15/10/23 04:26:53 INFO impl.YarnClientImpl: Submitted application application_1445598425022_0001
15/10/23 04:26:54 INFO mapreduce.Job: The url to track the job: http://ubuntu:8088/proxy/application_1445598425022_0001/
15/10/23 04:26:54 INFO mapreduce.Job: Running job: job_1445598425022_0001
15/10/23 04:28:24 INFO mapreduce.Job: Job job_1445598425022_0001 running in uber mode : false
15/10/23 04:28:25 INFO mapreduce.Job: map 0% reduce 0%
15/10/23 04:28:41 INFO mapreduce.Job: Task Id : attempt_1445598425022_0001_m_000000_0, Status : FAILED
Container launch failed for container_1445598425022_0001_01_000002 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/23 04:28:43 INFO mapreduce.Job: Task Id : attempt_1445598425022_0001_m_000000_1, Status : FAILED
Container launch failed for container_1445598425022_0001_01_000003 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/23 04:28:43 INFO mapreduce.Job: Task Id : attempt_1445598425022_0001_m_000000_2, Status : FAILED
Container launch failed for container_1445598425022_0001_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/23 04:28:47 INFO mapreduce.Job: map 100% reduce 0%
15/10/23 04:28:56 INFO mapreduce.Job: Job job_1445598425022_0001 failed with state FAILED due to: Task failed task_1445598425022_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
15/10/23 04:29:01 INFO mapreduce.Job: Counters: 3
Job Counters
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=0
Total time spent by all reduces in occupied slots (ms)=0
15/10/23 04:29:02 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
15/10/23 04:29:13 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:14 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:15 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:16 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:17 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:18 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:19 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:20 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:21 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:22 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/10/23 04:29:23 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
How can I overcome this?
I ran into the same problem.
To fix it, check that your yarn-site.xml contains the following property:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
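Many setup guides also declare the handler class for the shuffle service explicitly. A slightly fuller sketch of the same yarn-site.xml follows, on the assumption that you want both properties spelled out. The class shown is the stock MapReduce shuffle handler, which Hadoop 2.x already uses as the default for this key, so the second property is optional.

```xml
<configuration>
  <!-- Registers the auxiliary service that the failing containers ask for. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Optional: the implementation class backing mapreduce_shuffle. -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```

Auxiliary services are loaded when the NodeManager starts, so the change only takes effect after the NodeManager on each node has been restarted with the updated file.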

Resources