I have Apache Pig 0.17.0 (r1797386) and Hadoop 2.9.2 on Ubuntu 18.04, and I executed the code below.
When I run Pig in MapReduce mode, it prints the following messages:
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
21/11/12 09:47:37 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
21/11/12 09:47:37 WARN pig.Main: Cannot write to log file: /home/hadoop/hadoop-2.9.2/Pig/pigprogs/pig_1636690657754.log
2021-11-12 09:47:37,755 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-11-12 09:47:37,785 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2021-11-12 09:47:37,997 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2021-11-12 09:47:37,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://127.0.0.1:9000
2021-11-12 09:47:38,390 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-51d35c23-16a2-42eb-9868-d9aa4a7aea0f
2021-11-12 09:47:38,390 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt>
I then run a simple Pig script:
grunt>A = LOAD '/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt' USING PigStorage(',') as (a1:int,a2:int,a3:int);
grunt>DUMP A;
**When I run DUMP A; it gives the following error messages:**
2021-11-12 09:52:07,615 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2021-11-12 09:52:07,628 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2021-11-12 09:52:07,629 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-11-12 09:52:07,629 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2021-11-12 09:52:07,630 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2021-11-12 09:52:07,632 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2021-11-12 09:52:07,632 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2021-11-12 09:52:07,643 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2021-11-12 09:52:07,645 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:07,648 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2021-11-12 09:52:07,648 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2021-11-12 09:52:07,649 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2021-11-12 09:52:08,226 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-849926297/tmp-1952067843/pig-0.17.0-core-h2.jar
2021-11-12 09:52:08,381 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-849926297/tmp764505864/automaton-1.11-8.jar
2021-11-12 09:52:08,951 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-849926297/tmp1481980209/antlr-runtime-3.4.jar
2021-11-12 09:52:09,089 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hadoop/hadoop-2.9.2/Pig/pig-0.17.0/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-849926297/tmp789184813/joda-time-2.9.3.jar
2021-11-12 09:52:09,092 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2021-11-12 09:52:09,094 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2021-11-12 09:52:09,145 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2021-11-12 09:52:09,160 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:09,266 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2021-11-12 09:52:09,310 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-11-12 09:52:09,317 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1636690618976_0002
2021-11-12 09:52:09,331 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:DefaultJobName got an error while submitting
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:314)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:331)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:202)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:329)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:393)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
... 18 more
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1636690618976_0002
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2021-11-12 09:52:09,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4],A[-1,-1] C: R:
2021-11-12 09:52:09,661 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-11-12 09:52:14,674 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-11-12 09:52:14,674 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1636690618976_0002 has failed! Stop running all dependent jobs
2021-11-12 09:52:14,675 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-11-12 09:52:14,682 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /127.0.0.1:8050
2021-11-12 09:52:14,694 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:15,695 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:16,696 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:17,698 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:18,699 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:19,702 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:20,703 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:21,704 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:22,705 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:23,707 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:24,708 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:24,823 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:25,825 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:26,827 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:27,828 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:28,830 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:29,831 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:30,833 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:31,834 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:32,836 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:33,837 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:34,839 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:34,956 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1636690618976_0002. Redirecting to job history server.
2021-11-12 09:52:35,958 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:36,960 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:37,961 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:38,963 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:39,965 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:40,966 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:41,968 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:42,970 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:43,972 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:44,973 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-11-12 09:52:45,080 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-11-12 09:52:45,080 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-11-12 09:52:45,081 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.9.2 0.17.0 hadoop 2021-11-12 09:52:07 2021-11-12 09:52:45 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1636690618976_0002 A MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:314)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:331)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:202)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:329)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:393)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
... 18 more
hdfs://127.0.0.1:9000/tmp/temp-849926297/tmp-249405812,
Input(s):
Failed to read data from "/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt"
Output(s):
Failed to produce result in "hdfs://127.0.0.1:9000/tmp/temp-849926297/tmp-249405812"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1636690618976_0002
2021-11-12 09:52:45,082 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-11-12 09:52:45,087 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A. Backend error : java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-11-12 09:52:45,087 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2021-11-12 09:52:45,087 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A. Backend error : java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.pig.PigServer.openIterator(PigServer.java:1010)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:782)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:564)
at org.apache.pig.Main.main(Main.java:175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:841)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:473)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
at org.apache.pig.PigServer.storeEx(PigServer.java:1119)
at org.apache.pig.PigServer.store(PigServer.java:1082)
at org.apache.pig.PigServer.openIterator(PigServer.java:995)
... 13 more
Caused by: java.io.IOException: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:344)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:429)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:804)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:214)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.getTaskReports(MRJobStats.java:528)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:823)
... 20 more
Caused by: java.net.ConnectException: Call From sudip-lenovo/10.14.14.198 to localhost:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy15.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:325)
... 25 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:412)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1568)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
... 35 more
grunt>
I think you should look into myfile1.txt, as it seems to be missing or to have incorrect permissions. Note that in MapReduce mode Pig resolves the LOAD path against HDFS, not the local filesystem, so it is this HDFS path that must exist:
hdfs://127.0.0.1:9000/home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt
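If the file only exists on your local disk, you can check for it and stage it into HDFS first; a minimal sketch using the same paths as your LOAD statement:

hdfs dfs -ls /home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt    # does the path exist in HDFS?
hdfs dfs -mkdir -p /home/hadoop/hadoop-2.9.2/Pig/pigprogs
hdfs dfs -put /home/hadoop/hadoop-2.9.2/Pig/pigprogs/myfile1.txt /home/hadoop/hadoop-2.9.2/Pig/pigprogs/

After the -put, the same LOAD and DUMP should find the input.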
[cloudera@quickstart ~]$ sqoop import -connect jdbc:mysql://localhost/test -username root -P -table transactions -m 1
When executing the above command, I get the following exception.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/02/10 02:06:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.12.0
Enter password:
18/02/10 02:06:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/02/10 02:06:22 INFO tool.CodeGenTool: Beginning code generation
18/02/10 02:06:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `transactions` AS t LIMIT 1
18/02/10 02:06:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `transactions` AS t LIMIT 1
18/02/10 02:06:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/6c95208490848382aa6375de4e4f81bf/transactions.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/02/10 02:06:27 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/6c95208490848382aa6375de4e4f81bf/transactions.jar
18/02/10 02:06:27 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/02/10 02:06:27 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/02/10 02:06:27 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/02/10 02:06:27 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/02/10 02:06:27 INFO mapreduce.ImportJobBase: Beginning import of transactions
18/02/10 02:06:27 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/02/10 02:06:31 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/02/10 02:06:31 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
18/02/10 02:06:33 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:34 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:35 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:36 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:37 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:38 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:39 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:40 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:41 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:42 INFO ipc.Client: Retrying connect to server: quickstart.cloudera/10.0.2.15:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/02/10 02:06:42 WARN ipc.Client: Failed to connect to server: quickstart.cloudera/10.0.2.15:8032: retries get failed due to exceeded maximum allowed retries number: 10
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1555)
at org.apache.hadoop.ipc.Client.call(Client.java:1478)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy20.getNewApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy21.getNewApplication(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:206)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:214)
at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:187)
at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:262)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:157)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:203)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:176)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:273)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Port 8032 is the YARN ResourceManager. Sqoop is trying to launch a MapReduce job, which it cannot do while the ResourceManager is down.
You'll need to start it from Cloudera Manager, or from the terminal if you are comfortable with that.
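On the Cloudera quickstart VM, for example, something along these lines should bring YARN up (the service names are my assumption for a CDH 5 package install; adjust to your setup):

sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
sudo service hadoop-yarn-resourcemanager status    # verify it is up and listening on 8032

Once the ResourceManager answers on quickstart.cloudera:8032, rerun the sqoop import.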
I configured an Apache Hadoop cluster with 1 NameNode and 2 DataNodes in VMware Workstation. The NameNode is working fine and passwordless SSH login is set up too, but when I try to start a DataNode I get the following error.
The DataNode logs on both machines show retry errors for the NameNode, even though pinging and connecting to the NameNode works without any problem.
Below is the DataNode log:
2015-11-14 19:54:22,622 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = dn2.hcluster.com/192.168.155.133
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.8.0_65
************************************************************/
2015-11-14 19:54:23,447 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-11-14 19:54:23,485 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2015-11-14 19:54:23,486 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-11-14 19:54:23,486 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-11-14 19:54:23,876 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2015-11-14 19:54:25,720 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:27,723 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:28,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:29,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:30,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:31,753 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:32,755 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:33,758 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:34,762 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:35,764 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1.hcluster.com/192.168.155.131:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-11-14 19:54:35,922 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to nn1.hcluster.com/192.168.155.131:9000 failed on local exception: java.net.NoRouteToHostException: No route to host
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1150)
at org.apache.hadoop.ipc.Client.call(Client.java:1118)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy4.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:414)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:453)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:335)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:300)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:457)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
at org.apache.hadoop.ipc.Client.call(Client.java:1093)
... 16 more
2015-11-14 19:54:35,952 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at dn2.hcluster.com/192.168.155.133
************************************************************/
From DataNodes 1 and 2, the NameNode and its web GUI are reachable, and all 3 machines are able to communicate with each other via ping and passwordless SSH too. Please help.
core-site.xml on the NameNode:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://nn01.hcluster.com:9000</value>
</property>
</configuration>
Make sure your NameNode is running fine. Otherwise, check the machine IP and hostname in the /etc/hosts file on every node.
Make sure that the hostname "nn01.hcluster.com" is mapped there.
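For reference, a minimal sketch of what /etc/hosts on every node could look like (the NameNode and dn2 addresses are taken from your log; the dn1 line is a hypothetical placeholder, substitute your real address):

192.168.155.131  nn01.hcluster.com  nn1.hcluster.com
192.168.155.132  dn1.hcluster.com   # hypothetical IP for DataNode 1
192.168.155.133  dn2.hcluster.com

Note that your log shows the DataNode resolving nn1.hcluster.com while core-site.xml uses nn01.hcluster.com, so either map both names to the NameNode's IP or make the two configs consistent.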
I am trying to run my MR job on YARN. There is this error in one of the userlogs on node3:
2014-10-10 00:57:16,965 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-10-10 00:57:16,965 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1412895371072_0001, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier#69d5af30)
2014-10-10 00:57:17,330 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-10-10 00:57:18,547 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: node03/127.0.1.1:44874. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-10-10 00:57:19,548 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: node03/127.0.1.1:44874. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...
2014-10-10 00:57:27,558 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: node03/127.0.1.1:44874. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-10-10 00:57:27,562 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From node03/127.0.1.1 to node03:44874 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1415)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
at org.apache.hadoop.ipc.Client.call(Client.java:1382)
... 4 more
2014-10-10 00:57:27,564 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2014-10-10 00:57:27,566 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
I have the same config on all nodes. I can't find port 44874 specified anywhere. What is this error actually telling me?
If by "half-random" you mean completely random and by "can't really be configured" you mean undocumented and completely f'ed when used in a hardened environmment - you are correct.
The problem is that map-reduce jobs are using dynamic ports. Horton of course doesn't document why the random port is being created.
The answer thus far: disable firewall or allow high ranges (32768-65535) to on each data node. I am still looking for why this situation occurs.
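If you go the high-range route, a minimal iptables sketch (assuming iptables rather than firewalld; 192.168.0.0/24 is a placeholder for your cluster subnet):

sudo iptables -A INPUT -p tcp -s 192.168.0.0/24 --dport 32768:65535 -j ACCEPT    # allow ephemeral ports from cluster hosts
sudo service iptables save    # persisting rules varies by distro

Restricting the rule to the cluster subnet keeps the hardened posture while still letting the dynamic task ports through.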
Whenever I see a problem with Hadoop's ports, I google the port number and see if it is a default port for something. In your case it doesn't seem to be.
As far as I can tell, Hadoop uses these kinds of half-random ports internally for some things, and they can't really be configured. When there is a problem with such a port, for me it has always been an indication of some other (detectable) problem.
I suggest you go through all your logs again to find other problems. Also check the NameNode status (web interface) and make sure all connections are working.
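A couple of quick checks along those lines, for example (assuming a stock Hadoop 2.x install with the usual daemons):

jps                     # on each node: the master should list NameNode/ResourceManager, workers DataNode/NodeManager
hdfs dfsadmin -report   # from the master: confirms the DataNodes have registered and are live

If either of those looks wrong, the half-random-port failure is usually just a symptom of it.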
I followed the step-by-step Hadoop setup guide at http://hadoop.apache.org/docs/stable/single_node_setup.html
I set up Hadoop on Red Hat with the root account.
Everything was OK, but when I run ./start-all.sh and watch the logs, the JobTracker log looks like this:
2013-06-11 08:31:00,194 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:31:01,195 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:31:01,196 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root cause:java.net.ConnectException: Call to localhost/127.0.0.1:8088 failed on connection exception: java.net.ConnectException: Connection refused
2013-06-11 08:31:01,196 INFO org.apache.hadoop.mapred.JobTracker: Problem connecting to HDFS Namenode... re-trying
java.net.ConnectException: Call to localhost/127.0.0.1:8088 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
at org.apache.hadoop.ipc.Client.call(Client.java:1112)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy7.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:135)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:276)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:241)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:1908)
at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:1906)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.JobTracker.initializeFilesystem(JobTracker.java:1906)
at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2324)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4792)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:453)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
at org.apache.hadoop.ipc.Client.call(Client.java:1087)
... 20 more
The DataNode log looks like this:
2013-06-11 08:30:25,051 INFO org.apache.hadoop.ipc.RPC: Server at localhost/127.0.0.1:8088 not available yet, Zzzzz...
2013-06-11 08:30:27,052 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:28,053 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:29,054 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:30,055 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:31,056 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:32,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:33,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:34,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:35,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:36,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8088. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-11 08:30:36,061 INFO org.apache.hadoop.ipc.RPC: Server at localhost/127.0.0.1:8088 not available yet, Zzzzz...
My configuration XML (core-site.xml, mapred-site.xml, and hdfs-site.xml, respectively):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://127.0.0.1:8088</value>
</property>
</configuration>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>127.0.0.1:8089</value>
</property>
</configuration>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
My hosts file:
192.168.1.211 kkapps # Added by NetworkManager
127.0.0.1 localhost
::1 localhost
Thanks for your investigation!
Go to Run and open services.msc.
Find the CYGWIN sshd service. Right-click it, go to Properties and then to the Log On tab.
Ensure that "This account" is checked, with the username ./cyg_server (it can vary depending on what you entered during the ssh setup).
Now put the password in the fields and start the sshd service.
Now try running the code.
This should work!!!
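Alternatively, from an elevated Windows command prompt you can start the service directly (assuming it was registered under the usual name sshd; the display name is CYGWIN sshd):

net start sshd

If it fails to start, the Log On credentials above are the first thing to recheck.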