I used following 3 statments to read a data which was present in hdfs and then dump the data while using pig in mapreduce mode it gives me following huge error please can somebody expalin it to me or provide solution please
grunt> a= load '/temp' AS (name:chararray, age:int, salary:int);
grunt> b= foreach a generate (name, salary);
grunt> dump b;
2017-04-19 20:47:00,463 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2017-04-19 20:47:00,544 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2017-04-19 20:47:00,630 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2017-04-19 20:47:00,704 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for a: $1
2017-04-19 20:47:00,805 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2017-04-19 20:47:00,902 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2017-04-19 20:47:00,985 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2017-04-19 20:47:00,985 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2017-04-19 20:47:01,123 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 20:47:01,577 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2017-04-19 20:47:01,581 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2017-04-19 20:47:01,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2017-04-19 20:47:02,145 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp1691370991/tmp-657769855/pig-0.16.0-core-h2.jar
2017-04-19 20:47:02,268 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1691370991/tmp-1451176406/automaton-1.11-8.jar
2017-04-19 20:47:02,345 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1691370991/tmp2084869503/antlr-runtime-3.4.jar
2017-04-19 20:47:02,425 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1691370991/tmp-658468570/joda-time-2.9.3.jar
2017-04-19 20:47:02,438 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2017-04-19 20:47:02,455 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2017-04-19 20:47:02,455 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2017-04-19 20:47:02,455 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2017-04-19 20:47:02,555 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2017-04-19 20:47:02,619 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 20:47:02,697 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:DefaultJobName got an error while submitting
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/root/.staging":dead:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3855)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1011)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2110)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:144)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/root/.staging":dead:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3855)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1011)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
... 22 more
2017-04-19 20:47:03,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2017-04-19 20:47:08,186 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2017-04-19 20:47:08,187 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2017-04-19 20:47:08,187 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2017-04-19 20:47:08,203 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2017-04-19 20:47:08,204 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.2 0.16.0 root 2017-04-19 20:47:01 2017-04-19 20:47:08 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
N/A a,b MAP_ONLY Message: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/root/.staging":dead:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3855)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1011)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2110)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:144)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/root/.staging":dead:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3855)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1011)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
... 22 more
hdfs://localhost/tmp/temp1691370991/tmp-1122865224,
Input(s):
Failed to read data from "/temp"
Output(s):
Failed to produce result in "hdfs://localhost/tmp/temp1691370991/tmp-1122865224"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
null
2017-04-19 20:47:08,223 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2017-04-19 20:47:08,233 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias b
Details at logfile: /opt/hadoop/pig/conf/pig_1492615006487.log
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/root/.staging":dead:supergroup:drwx------
Provide permissions to the following directory Its on HDfs
Related
I have loaded JSON data to my HDFS, I created the table with required columns in MySQL database as follows.
How to create table with row formatter for accepting JSON?
My HDFS data
{
"Employees" : [
{
"userId":"rirani",
"jobTitleName":"Developer",
"firstName":"Romin",
"lastName":"Irani",
"preferredFullName":"Romin Irani",
"employeeCode":"E1",
"region":"CA",
"phoneNumber":"408-1234567",
"emailAddress":"romin.k.irani#gmail.com"
},
{
"userId":"nirani",
"jobTitleName":"Developer",
"firstName":"Neil",
"lastName":"Irani",
"preferredFullName":"Neil Irani",
"employeeCode":"E2",
"region":"CA",
"phoneNumber":"408-1111111",
"emailAddress":"neilrirani#gmail.com"
},
{
"userId":"thanks",
"jobTitleName":"Program Directory",
"firstName":"Tom",
"lastName":"Hanks",
"preferredFullName":"Tom Hanks",
"employeeCode":"E3",
"region":"CA",
"phoneNumber":"408-2222222",
"emailAddress":"tomhanks#gmail.com"
}
]
}
My SQL table structure
mysql> create table employee(userid int,jobTitleName varchar(20),firstName varchar(20),lastName varchar(20),preferrredFullName varchar(20),employeeCode varchar(20),region varchar(20),phoneNumber varchar(20), emailAddress varchar(20),modifiedDate timestamp DEFAULT CURRENT_TIMESTAMP);
mysql> desc employee;
+--------------------+-------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+-------------+------+-----+-------------------+-------+
| userid | int(11) | YES | | NULL | |
| jobTitleName | varchar(20) | YES | | NULL | |
| firstName | varchar(20) | YES | | NULL | |
| lastName | varchar(20) | YES | | NULL | |
| preferrredFullName | varchar(20) | YES | | NULL | |
| employeeCode | varchar(20) | YES | | NULL | |
| region | varchar(20) | YES | | NULL | |
| phoneNumber | varchar(20) | YES | | NULL | |
| emailAddress | varchar(20) | YES | | NULL | |
| modifiedDate | timestamp | NO | | CURRENT_TIMESTAMP | |
+--------------------+-------------+------+-----+-------------------+-------+
10 rows in set (0.00 sec)
I am trying to load data from my HDFS to MySQL for the above table using sqoop export as follows
sqoop export --connect jdbc:mysql://localhost/emp_scheme --username root --password adithyan --table employee --export-dir /user/adithyan/filesystem/employee.txt
it has end up with exception as follows
17/02/18 19:35:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/02/18 19:35:35 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/02/18 19:35:35 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/02/18 19:35:35 INFO tool.CodeGenTool: Beginning code generation
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/adithyan/hadoop_dir/hadoop-1.2.1
Note: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/02/18 19:35:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.jar
17/02/18 19:35:41 INFO mapreduce.ExportJobBase: Beginning export of employee
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
17/02/18 19:35:45 WARN snappy.LoadSnappy: Snappy native library not loaded
17/02/18 19:35:46 INFO mapred.JobClient: Running job: job_201702181051_0002
17/02/18 19:35:47 INFO mapred.JobClient: map 0% reduce 0%
17/02/18 19:36:17 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:18 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:29 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:29 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_1, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:42 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:42 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_2, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
Can somebody help me on this?
you may have to look at multiple options..
JSON_SET/REPLACE/INSERT - these options may not be directly supported by sqoop yet.
Another options is to pre-process data using pig then stage the data in HDFS before sqooping to RDBMS.
I have a map-only hadoop job, that throws several IO exceptions during it's work:
1) java.io.IOException: Write end dead
2) java.io.IOException: Pipe closed
It manages to finish it's work, but there exceptions make me worry. Is there anything I'm doing wrong?
Practically the same job is working daily on another dataset which is 20 times smaller, and no exceptions are thrown. Jobs are run by Google dataproc.
The config file I'm using:
#!/bin/bash
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
-D mapreduce.output.fileoutputformat.compress=true \
-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
-D mapreduce.job.reduces=0 \
-D mapreduce.input.fileinputformat.split.maxsize=1500000000 \
-D mapreduce.map.failures.maxpercent=1 \
-D mapreduce.fileoutputcommitter.algorithm.version=2 \
-D mapreduce.task.timeout=900000 \
-D mapreduce.map.memory.mb=2048 \
-file mymapper.py \
-input gs://input_folder/* \
-output gs://output_folder/$1 \
-mapper mymapper.py \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-inputformat org.apache.hadoop.mapred.lib.CombineTextInputFormat
Here is an error log:
17/03/15 09:53:30 INFO mapreduce.Job: Running job:
job_1489571529338_0001
17/03/15 09:53:37 INFO mapreduce.Job: Job job_1489571529338_0001 running in uber mode : false
17/03/15 09:53:37 INFO mapreduce.Job: map 0% reduce 0%
17/03/15 09:56:58 INFO mapreduce.Job: map 1% reduce 0%
17/03/15 10:00:16 INFO mapreduce.Job: Task Id : attempt_1489571529338_0001_m_000744_0, Status : FAILED
Error: java.io.IOException: java.io.IOException: Write end dead
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.write(AbstractGoogleAsyncWriteChannel.java:256)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.write(CacheSupplementedGoogleCloudStorage.java:58)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:101)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.close(GoogleHadoopOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:109)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:108)
at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:844)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Suppressed: java.io.IOException: java.io.IOException: Write end dead
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.close(CacheSupplementedGoogleCloudStorage.java:68)
at java.nio.channels.Channels$1.close(Channels.java:178)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
... 14 more
Caused by: java.io.IOException: Write end dead
at java.io.PipedInputStream.read(PipedInputStream.java:310)
at java.io.PipedInputStream.read(PipedInputStream.java:377)
at com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:358)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[CIRCULAR REFERENCE:java.io.IOException: Write end dead]
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
17/03/15 10:01:06 INFO mapreduce.Job: map 2% reduce 0%
17/03/15 10:02:46 INFO mapreduce.Job: Task Id : attempt_1489571529338_0001_m_001089_0, Status : FAILED
Error: java.io.IOException: Pipe closed
at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.io.PipedInputStream.receive(PipedInputStream.java:226)
at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:458)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.write(AbstractGoogleAsyncWriteChannel.java:259)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.write(CacheSupplementedGoogleCloudStorage.java:58)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:101)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.close(GoogleHadoopOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:109)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:108)
at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:844)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Suppressed: java.io.IOException: java.io.IOException: Write end dead
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.close(CacheSupplementedGoogleCloudStorage.java:68)
at java.nio.channels.Channels$1.close(Channels.java:178)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
... 14 more
Caused by: java.io.IOException: Write end dead
at java.io.PipedInputStream.read(PipedInputStream.java:310)
at java.io.PipedInputStream.read(PipedInputStream.java:377)
at com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:358)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
17/03/15 10:03:35 INFO mapreduce.Job: Task Id : attempt_1489571529338_0001_m_001217_0, Status : FAILED
Error: java.io.IOException: Pipe closed
at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.io.PipedInputStream.receive(PipedInputStream.java:226)
at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:458)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.write(AbstractGoogleAsyncWriteChannel.java:259)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.write(CacheSupplementedGoogleCloudStorage.java:58)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:101)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.close(GoogleHadoopOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:109)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:108)
at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:844)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Suppressed: java.io.IOException: java.io.IOException: Write end dead
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.close(CacheSupplementedGoogleCloudStorage.java:68)
at java.nio.channels.Channels$1.close(Channels.java:178)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
... 14 more
Caused by: java.io.IOException: Write end dead
at java.io.PipedInputStream.read(PipedInputStream.java:310)
at java.io.PipedInputStream.read(PipedInputStream.java:377)
at com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:358)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
17/03/15 10:04:51 INFO mapreduce.Job: map 3% reduce 0%
17/03/15 10:08:34 INFO mapreduce.Job: map 4% reduce 0%
17/03/15 10:12:12 INFO mapreduce.Job: map 5% reduce 0%
UPD.
Now it comes with Backend Error:
Error: java.io.IOException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
"code" : 500,
"errors" : [ {
"domain" : "global",
"message" : "Backend Error",
"reason" : "backendError"
} ],
"message" : "Backend Error"
}
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWrit
eChannel.java:432)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage$WritableByteChannelImpl.close(CacheSupplementedGoogleCloud
Storage.java:68)
at java.nio.channels.Channels$1.close(Channels.java:178)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.close(GoogleHadoopOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:109)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:108)
at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:844)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
"code" : 500,
"errors" : [ {
"domain" : "global",
"message" : "Backend Error",
"reason" : "backendError"
} ],
"message" : "Backend Error"
}
Usually Write end dead means a writer thread failed to close() the output stream before exiting, but if it's happening in something in the underlying framework rather than any kind of manually created write channel, it's likely the result of some transient failure which caused a single task to fail for other reasons, and then the Write end dead message is just another symptom of the failure.
In your case, the 410 Gone error is a known transient failure mode of GCS which is not recoverable in the same stream (recoverable errors are automatically retried silently under the hood). But that's just a single failed task, and Hadoop ensures that failed tasks will get retried end-to-end for the job, and only if the same task fails too many times will the overall job fail.
So in general, it means as long as your overall job completes successfully, then all your data was processed correctly; single-task failures can just be treated as warnings.
I am trying to deploy Hadoop cluster via Pivotal distribution.
For the same, I am following link mentioned below
http://pivotalhd.docs.pivotal.io/doc/2100/webhelp/topics/ManuallyInstallingandUsingPivotalHD21Stack.html
Deployment Configuration:
1) phd1.xyz.com - NameNode, ResourceManager
2) phd2.xyz.com - DataNode, NodeManager
I have above mentioned services UP and Running and also able to access the HDFS file system but not able to execute jobs on cluster
Above provided link doesn't mention if the job has to be executed via root or hdfs user, so I tried both the ways
Error when job is executed via root user
hadoop jar/usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0-gphd-3.1.0.0.jar
pi 2 200
The following error occurring:
> Number of Maps = 2
> Samples per Map = 200
> org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE,
> inode="/user":hdfs:supergroup:drwxr-xr-x
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5389)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5371)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5345)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3583)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3553)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3525)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:745)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:63031)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
Error when job is executed via hdfs user
sudo -u hdfs hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0-gphd-3.1.0.0.jar pi 2 200
the following error ocurring:
> Number of Maps = 2
> Samples per Map = 200
> Wrote input for Map #0
> Wrote input for Map #1
> Starting Job
> 15/01/01 20:48:20 INFO client.RMProxy: Connecting to ResourceManager at phd1.xyz.com/10.44.189.6:8050
> 15/01/01 20:48:21 INFO input.FileInputFormat: Total input paths to process : 2
> 15/01/01 20:48:21 INFO mapreduce.JobSubmitter: number of splits:2
> 15/01/01 20:48:21 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use
> mapreduce.map.speculative
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use
> mapreduce.job.output.value.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
> mapreduce.reduce.speculative
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use
> mapreduce.job.map.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use
> mapreduce.job.reduce.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use
> mapreduce.job.inputformat.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use
> mapreduce.output.fileoutputformat.outputdir
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use
> mapreduce.job.outputformat.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use
> mapreduce.job.output.key.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use
> mapreduce.job.working.dir
> 15/01/01 20:48:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1420122968684_0002
> 15/01/01 20:48:22 INFO impl.YarnClientImpl: Submitted application application_1420122968684_0002 to ResourceManager at
> phd1.xyz.com/10.44.189.6:8050
> 15/01/01 20:48:22 INFO mapreduce.Job: The url to track the job: http://phd1.persistent.co.in:8088/proxy/application_1420122968684_0002/
> 15/01/01 20:48:22 INFO mapreduce.Job: Running job: job_1420122968684_0002
> 15/01/01 20:48:26 INFO mapreduce.Job: Job job_1420122968684_0002 running in uber mode : false
> 15/01/01 20:48:26 INFO mapreduce.Job: map 0% reduce 0%
> 15/01/01 20:48:26 INFO mapreduce.Job: Job job_1420122968684_0002 failed with state FAILED due to: Application
> application_1420122968684_0002 failed 2 times due to AM Container for
> appattempt_1420122968684_0002_000002 exited with exitCode: 1 due to:
> Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
> .Failing this attempt.. Failing the application.
> 15/01/01 20:48:26 INFO mapreduce.Job: Counters: 0
> Job Finished in 5.973 seconds
> java.io.FileNotFoundException: File does not exist: hdfs://phd1.xyz.com:8020/user/hdfs/QuasiMonteCarlo_1420125497811_11863122/out/reduce-out
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1112)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1749)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
> at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
> at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Please let me know how can I resolve this error.
Thanks
I am trying to copy data from one HDFS directory to another using distcp:
Source hadoop version:
hadoop version
Hadoop 2.0.0-cdh4.3.1
Destination hadoop version:
hadoop version
Hadoop 2.0.0-cdh4.4.0
Command I am using is:
hadoop distcp hftp://foo.test.net:50070/new_data/raw/new_logs/utc_date=2014-09-01/utc_hour=22 hdfs://localhost.localdomain/user/cloudera/new_logs
Error message I am getting is:
java.io.IOException: Copied: 0 Skipped: 0 Failed: 13
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
==
Logs from "Task Logs":
2014-09-02 13:50:30,947 INFO org.apache.hadoop.tools.DistCp: FAIL utc_hour=22/000012_0 : java.io.IOException: HTTP_OK expected, received 500
at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:376)
at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:424)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
2014-09-02 13:50:31,118 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-09-02 13:50:31,131 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:java.io.IOException: Copied: 0 Skipped: 0 Failed: 13
2014-09-02 13:50:31,131 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Copied: 0 Skipped: 0 Failed: 13
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
2014-09-02 13:50:31,145 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
=
Any help?
Thanks,
Rio
I am getting sqoop error while loading data from Hive to MySQL
Error message is:
java.lang.NumberFormatException: For input string
==
hive > CREATE EXTERNAL TABLE IF NOT EXISTS test (
id int,
name string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION
'/user/cloudera/test';
==
vi test:
1 a
2 b
==
hadoop fs -put test /user/cloudera
==
mysql> CREATE TABLE `foo` (`id` int(11) , `name` varchar(30) )
==
sqoop export --connect jdbc:mysql://localhost/test --table foo -m 1 --export-dir /user/cloudera/test
==
log:
14/05/13 07:18:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/05/13 07:18:52 INFO tool.CodeGenTool: Beginning code generation
14/05/13 07:18:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
14/05/13 07:18:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
14/05/13 07:18:53 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
14/05/13 07:18:53 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/e6582e332bf9e0eedfb641f14d866599/foo.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/05/13 07:18:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/e6582e332bf9e0eedfb641f14d866599/foo.jar
14/05/13 07:18:56 INFO mapreduce.ExportJobBase: Beginning export of foo
14/05/13 07:18:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/05/13 07:19:00 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 07:19:00 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 07:19:00 INFO mapred.JobClient: Running job: job_201405081447_0046
14/05/13 07:19:01 INFO mapred.JobClient: map 0% reduce 0%
14/05/13 07:19:14 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
14/05/13 07:19:20 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
14/05/13 07:19:28 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
==
Any help?
Thank you!
The location into which you placed the file does not appear to be correct. For a table "test" you should put a file underneath a directory test. But your command
hadoop fs -put test /user/cloudera
creates a file called test.
You would likely find more success as follows:
hadoop fs -mkdir /user/cloudera/test
hadoop dfs -put test /user/cloudera/test