sqoop export failing due to timeout

I am not able to export data with Sqoop to an AS/400 server.
Importing data works without problems.
I am using the following command:
sqoop export --driver com.ibm.as400.access.AS400JDBCDriver --connect jdbc:as400://178.xxx.3.21:23/MELLET1/TEXT4 --username xxxxxx --password xxxxx007 --table TEXT3 --export-dir /as400/1GBTBL5/part-m-00000 -m 1
The job fails with a timeout:
15/05/10 17:42:06 INFO input.FileInputFormat: Total input paths to process : 1
15/05/10 17:42:06 INFO input.FileInputFormat: Total input paths to process : 1
15/05/10 17:42:06 INFO mapreduce.JobSubmitter: number of splits:1
15/05/10 17:42:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1431267418859_0014
15/05/10 17:42:07 INFO impl.YarnClientImpl: Submitted application application_1431267418859_0014
15/05/10 17:42:07 INFO mapreduce.Job: Running job: job_1431267418859_0014
15/05/10 17:42:18 INFO mapreduce.Job: Job job_1431267418859_0014 running in uber mode : false
15/05/10 17:42:18 INFO mapreduce.Job: map 0% reduce 0%
15/05/10 17:42:37 INFO mapreduce.Job: map 100% reduce 0%
15/05/10 17:47:47 INFO mapreduce.Job: Task Id : attempt_1431267418859_0014_m_000000_0, Status : FAILED
AttemptID:attempt_1431267418859_0014_m_000000_0 Timed out after 300 secs
15/05/10 17:47:48 INFO mapreduce.Job: map 0% reduce 0%
15/05/10 17:48:07 INFO mapreduce.Job: map 100% reduce 0%
15/05/10 17:53:16 INFO mapreduce.Job: Task Id : attempt_1431267418859_0014_m_000000_1, Status : FAILED
AttemptID:attempt_1431267418859_0014_m_000000_1 Timed out after 300 secs
15/05/10 17:53:17 INFO mapreduce.Job: map 0% reduce 0%
15/05/10 17:53:40 INFO mapreduce.Job: map 100% reduce 0%
15/05/10 17:58:46 INFO mapreduce.Job: Task Id : attempt_1431267418859_0014_m_000000_2, Status : FAILED
AttemptID:attempt_1431267418859_0014_m_000000_2 Timed out after 300 secs

The command below is for MySQL; you can adapt it to your database in the same way:
$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar --export-dir /results/bar_data

Have you tried batch mode?
Set -Dsqoop.export.records.per.statement and pass --batch.
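
As a rough illustration of the suggestion above, the sketch below assembles a batch-mode export command in Python. The connection string, table, and directory are the placeholder values from the MySQL example; the value 100 for records per statement and the helper name build_export_command are assumptions chosen for illustration, not tuned recommendations. Sqoop expects generic -D options directly after the tool name, before tool-specific arguments such as --connect, which is why the property comes first.

```python
import shlex


def build_export_command(connect, table, export_dir,
                         records_per_statement=100, mappers=1):
    """Return a Sqoop export command line as a list of arguments.

    records_per_statement=100 is an arbitrary illustrative value.
    """
    return [
        "sqoop", "export",
        # Generic Hadoop/Sqoop properties go right after the tool name.
        f"-Dsqoop.export.records.per.statement={records_per_statement}",
        "--connect", connect,
        "--table", table,
        "--export-dir", export_dir,
        # Ask the JDBC driver to execute the statements in batch mode.
        "--batch",
        "-m", str(mappers),
    ]


# Placeholder values taken from the MySQL example in the answer above.
command = build_export_command(
    "jdbc:mysql://db.example.com/foo", "bar", "/results/bar_data"
)
print(shlex.join(command))
```

On a machine where Sqoop is installed, the returned list could be handed to subprocess.run; here it is only printed.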

Related

Tez - DAGAppMaster - java.lang.IllegalArgumentException: Invalid ContainerId

I am trying to launch a MapReduce job, but I get an error when executing jobs from the shell or from Hive:
hive> select count(*) from employee ;
Query ID = mapr_20171107135114_a574713d-7d69-45e1-aa73-d4de07a3059b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapreduce.job.reduces=
Starting Job = job_1510052734193_0005, Tracking URL = http://hdpsrvpre2.intranet.darty.fr:8088/proxy/application_1510052734193_0005/
Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job -kill job_1510052734193_0005
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-11-07 13:51:25,951 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1510052734193_0005 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: MAPRFS Read: 0 MAPRFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
In the ResourceManager logs, this is what I find:
> 2017-11-07 13:51:25,269 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1510052734193_0005_000002 State change from LAUNCHED to FINAL_SAVING
> 2017-11-07 13:51:25,269 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for attempt: appattempt_1510052734193_0005_000002 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1510052734193_0005/appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1510052734193_0005_000002 State change from FINAL_SAVING to FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1510052734193_0005 with final state: FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1510052734193_0005 State change from ACCEPTED to FINAL_SAVING
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1510052734193_0005
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1510052734193_0005_000002 is done. finalState=FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for app: application_1510052734193_0005 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1510052734193_0005/application_1510052734193_0005
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1510052734193_0005 requests cleared
> 2017-11-07 13:51:25,296 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1510052734193_0005 failed 2 times due to AM Container for appattempt_1510052734193_0005_000002 exited with exitCode: 1
> For more detailed output, check application tracking page:http://hdpsrvpre2.intranet.darty.fr:8088/cluster/app/application_1510052734193_0005Then, click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_e10_1510052734193_0005_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
> at org.apache.hadoop.util.Shell.run(Shell.java:456)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:304)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:354)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:87)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
> at java.lang.Thread.run(Thread.java:748)
> Shell output: main : command provided 1
> main : user is mapr
> main : requested yarn user is mapr
>
> Container exited with a non-zero exit code 1 Failing this attempt. Failing the application.
Also, in the job's syslog I find:
2017-11-07 12:09:46,419 FATAL [main] app.DAGAppMaster: Error starting DAGAppMaster
java.lang.IllegalArgumentException: Invalid ContainerId: container_e10_1510052734193_0001_01_000001
at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:1794)
Caused by: java.lang.NumberFormatException: For input string: "e10"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
... 1 more
It seems that Tez is causing the issue. Is there any way to solve it?
Thank you !

I think the execution environment contains several versions of Hadoop and their respective JAR files.
Verify the environment, make sure only the required version is in use, and remove references to other versions from your environment variables.
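
The stack trace in the question fits this diagnosis: the ID container_e10_1510052734193_0001_01_000001 carries an epoch segment (e10), and the parser in the trace fails with a NumberFormatException on exactly that segment, which is the behaviour one would expect from a YARN library that does not know the epoch form. The sketch below is a simplified Python model of the two parsing behaviours, not Hadoop's actual code; both function names are hypothetical.

```python
def parse_container_id_legacy(container_id):
    """Model of a parser that expects
    container_<clusterTs>_<appId>_<attemptId>_<containerSeq>."""
    parts = container_id.split("_")
    if parts[0] != "container":
        raise ValueError(f"Invalid ContainerId: {container_id}")
    # Every following segment is assumed to be numeric, so "e10" fails here.
    cluster_ts, app_id, attempt_id, seq = (int(p) for p in parts[1:5])
    return {"epoch": 0, "cluster_timestamp": cluster_ts,
            "app_id": app_id, "attempt_id": attempt_id, "container_seq": seq}


def parse_container_id_epoch_aware(container_id):
    """Model of a parser that also accepts an optional e<epoch> segment."""
    parts = container_id.split("_")
    if parts[0] != "container":
        raise ValueError(f"Invalid ContainerId: {container_id}")
    rest = parts[1:]
    epoch = 0
    if rest and rest[0].startswith("e"):
        epoch = int(rest[0][1:])  # "e10" -> 10
        rest = rest[1:]
    if len(rest) != 4:
        raise ValueError(f"Invalid ContainerId: {container_id}")
    cluster_ts, app_id, attempt_id, seq = (int(p) for p in rest)
    return {"epoch": epoch, "cluster_timestamp": cluster_ts,
            "app_id": app_id, "attempt_id": attempt_id, "container_seq": seq}


if __name__ == "__main__":
    cid = "container_e10_1510052734193_0001_01_000001"
    try:
        parse_container_id_legacy(cid)
    except ValueError as err:
        print("legacy parser failed:", err)
    print("epoch-aware parser:", parse_container_id_epoch_aware(cid))
```

If the model matches what is happening, the thing to look for is an older set of hadoop-yarn JARs reaching the Tez application master's classpath alongside the cluster's own.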

error while executing pig script?

p.pig contains the following code:
salaries= load 'salaries' using PigStorage(',') As (gender, age,salary,zip);
salaries= load 'salaries' using PigStorage(',') As (gender:chararray,age:int,salary:double,zip:long);
salaries=load 'salaries' using PigStorage(',') as (gender:chararray,details:bag{b(age:int,salary:double,zip:long)});
highsal= filter salaries by salary > 75000;
dump highsal
salbyage= group salaries by age;
describe salbyage;
salbyage= group salaries All;
salgrp= group salaries by $3;
A= foreach salaries generate age,salary;
describe A;
salaries= load 'salaries.txt' using PigStorage(',') as (gender:chararray,age:int,salary:double,zip:int);
vivek#ubuntu:~/Applications/Hadoop_program/pip$ pig -x mapreduce p.pig
15/09/24 03:16:32 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/09/24 03:16:32 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
15/09/24 03:16:32 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2015-09-24 03:16:32,990 [main] INFO org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
2015-09-24 03:16:32,991 [main] INFO org.apache.pig.Main - Logging error messages to: /home/vivek/Applications/Hadoop_program/pip/pig_1443089792987.log
2015-09-24 03:16:38,966 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-24 03:16:41,232 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/vivek/.pigbootup not found
2015-09-24 03:16:42,869 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-09-24 03:16:42,870 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-09-24 03:16:42,870 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000
2015-09-24 03:16:45,436 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <PATH> "salaries=load "" at line 7, column 1.
Was expecting one of:
<EOF>
"cat" ...
"clear" ...
"fs" ...
"sh" ...
"cd" ...
"cp" ...
"copyFromLocal" ...
"copyToLocal" ...
"dump" ...
"\\d" ...
"describe" ...
"\\de" ...
"aliases" ...
"explain" ...
"\\e" ...
"help" ...
"history" ...
"kill" ...
"ls" ...
"mv" ...
"mkdir" ...
"pwd" ...
"quit" ...
"\\q" ...
"register" ...
"rm" ...
"rmf" ...
"set" ...
"illustrate" ...
"\\i" ...
"run" ...
"exec" ...
"scriptDone" ...
"" ...
"" ...
<EOL> ...
";" ...
Details at logfile: /home/vivek/Applications/Hadoop_program/pip/pig_1443089792987.log
2015-09-24 03:16:45,554 [main] INFO org.apache.pig.Main - Pig script completed in 13 seconds and 48 milliseconds (13048 ms)
vivek#ubuntu:~/Applications/Hadoop_program/pip$
Initially, p.pig consisted of the code given above.
I started Pig in MapReduce mode.
While executing the code above, it hit the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <PATH> "salaries=load "" at line 7, column 1.
Please help me resolve this error.

You have not put any space between the alias name and the command.
Pig expects at least one space before or after the '=' operator.
Change this line:
salaries=load 'salaries' using PigStorage(',') as (gender:chararray,details:bag{b(age:int,salary:double,zip:long)});
to:
salaries = load 'salaries' using PigStorage(',') as (gender:chararray,details:bag{b(age:int,salary:double,zip:long)});
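
A quick way to spot such lines before running a script is a small check like the one below. It is a sketch in Python, not part of Pig; the function name and the regular expression are assumptions, and the pattern only flags an alias glued to '=' with no whitespace on either side.

```python
import re

# An alias immediately followed by '=' and a non-space character,
# e.g. "salaries=load". "salaries= load" and "salaries = load" do not match.
GLUED_ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)=(?=\S)")


def find_glued_assignments(script_text):
    """Return (line_number, line) pairs for statements like alias=load."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        if GLUED_ASSIGNMENT.match(line):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "salaries= load 'salaries' using PigStorage(',');\n"
        "salaries=load 'salaries' using PigStorage(',');\n"
    )
    for lineno, line in find_glued_assignments(sample):
        print(f"line {lineno}: add a space around '=': {line}")
```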

Hadoop Cluster Deployment Using Pivotal

I am trying to deploy a Hadoop cluster using the Pivotal distribution.
I am following the guide linked below:
http://pivotalhd.docs.pivotal.io/doc/2100/webhelp/topics/ManuallyInstallingandUsingPivotalHD21Stack.html
Deployment Configuration:
1) phd1.xyz.com - NameNode, ResourceManager
2) phd2.xyz.com - DataNode, NodeManager
The services listed above are up and running, and I can access the HDFS file system, but I cannot execute jobs on the cluster.
The guide linked above does not say whether jobs should be run as the root user or the hdfs user, so I tried both.
Error when the job is executed as the root user:
hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0-gphd-3.1.0.0.jar pi 2 200
The following error occurs:
> Number of Maps = 2
> Samples per Map = 200
> org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5389)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5371)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5345)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3583)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3553)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3525)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:745)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:63031)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
Error when the job is executed as the hdfs user:
sudo -u hdfs hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0-gphd-3.1.0.0.jar pi 2 200
The following error occurs:
> Number of Maps = 2
> Samples per Map = 200
> Wrote input for Map #0
> Wrote input for Map #1
> Starting Job
> 15/01/01 20:48:20 INFO client.RMProxy: Connecting to ResourceManager at phd1.xyz.com/10.44.189.6:8050
> 15/01/01 20:48:21 INFO input.FileInputFormat: Total input paths to process : 2
> 15/01/01 20:48:21 INFO mapreduce.JobSubmitter: number of splits:2
> 15/01/01 20:48:21 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
> 15/01/01 20:48:21 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
> 15/01/01 20:48:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1420122968684_0002
> 15/01/01 20:48:22 INFO impl.YarnClientImpl: Submitted application application_1420122968684_0002 to ResourceManager at phd1.xyz.com/10.44.189.6:8050
> 15/01/01 20:48:22 INFO mapreduce.Job: The url to track the job: http://phd1.persistent.co.in:8088/proxy/application_1420122968684_0002/
> 15/01/01 20:48:22 INFO mapreduce.Job: Running job: job_1420122968684_0002
> 15/01/01 20:48:26 INFO mapreduce.Job: Job job_1420122968684_0002 running in uber mode : false
> 15/01/01 20:48:26 INFO mapreduce.Job: map 0% reduce 0%
> 15/01/01 20:48:26 INFO mapreduce.Job: Job job_1420122968684_0002 failed with state FAILED due to: Application application_1420122968684_0002 failed 2 times due to AM Container for appattempt_1420122968684_0002_000002 exited with exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
> .Failing this attempt.. Failing the application.
> 15/01/01 20:48:26 INFO mapreduce.Job: Counters: 0
> Job Finished in 5.973 seconds
> java.io.FileNotFoundException: File does not exist: hdfs://phd1.xyz.com:8020/user/hdfs/QuasiMonteCarlo_1420125497811_11863122/out/reduce-out
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1112)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1749)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
> at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
> at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Please let me know how I can resolve this error.
Thanks

Sqoop error while loading data from Hive to MySQL

I am getting a Sqoop error while loading data from Hive to MySQL.
The error message is:
java.lang.NumberFormatException: For input string
==
hive > CREATE EXTERNAL TABLE IF NOT EXISTS test (
id int,
name string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION
'/user/cloudera/test';
==
vi test:
1 a
2 b
==
hadoop fs -put test /user/cloudera
==
mysql> CREATE TABLE `foo` (`id` int(11) , `name` varchar(30) )
==
sqoop export --connect jdbc:mysql://localhost/test --table foo -m 1 --export-dir /user/cloudera/test
==
log:
14/05/13 07:18:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/05/13 07:18:52 INFO tool.CodeGenTool: Beginning code generation
14/05/13 07:18:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
14/05/13 07:18:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
14/05/13 07:18:53 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
14/05/13 07:18:53 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/e6582e332bf9e0eedfb641f14d866599/foo.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/05/13 07:18:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/e6582e332bf9e0eedfb641f14d866599/foo.jar
14/05/13 07:18:56 INFO mapreduce.ExportJobBase: Beginning export of foo
14/05/13 07:18:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/05/13 07:19:00 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 07:19:00 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 07:19:00 INFO mapred.JobClient: Running job: job_201405081447_0046
14/05/13 07:19:01 INFO mapred.JobClient: map 0% reduce 0%
14/05/13 07:19:14 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
14/05/13 07:19:20 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
14/05/13 07:19:28 INFO mapred.JobClient: Task Id : attempt_201405081447_0046_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
==
Any help?
Thank you!

The location where you placed the file does not appear to be correct. For a table "test" located at /user/cloudera/test, the data file should sit inside a directory named test. But your command
hadoop fs -put test /user/cloudera
creates a file called test instead.
You would likely have more success with the following:
hadoop fs -mkdir /user/cloudera/test
hadoop fs -put test /user/cloudera/test
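
Separately from the directory layout, the NumberFormatException in the log shows the whole line "1 a" reaching Integer.parseInt, which is what one would expect when the export splits rows on a different delimiter than the file uses. Sqoop export assumes comma-separated fields unless told otherwise (for example with --input-fields-terminated-by), while the Hive table in the question is declared as tab-delimited. The sketch below is a simplified Python model of that mismatch, assuming the file really is tab-separated; parse_row is a hypothetical helper, not Sqoop code.

```python
def parse_row(line, delimiter):
    """Split one text row and convert the first field to an integer id."""
    fields = line.rstrip("\n").split(delimiter)
    row_id = int(fields[0])  # raises ValueError if the split left "1\ta" intact
    name = fields[1] if len(fields) > 1 else None
    return row_id, name


if __name__ == "__main__":
    line = "1\ta\n"  # assumed tab-separated, as the Hive table declares

    # Splitting on the tab yields two fields, and the id parses cleanly.
    print(parse_row(line, "\t"))

    # Splitting on a comma leaves the whole line as one field, so the
    # integer conversion fails, mirroring the error in the task log.
    try:
        parse_row(line, ",")
    except ValueError as err:
        print("comma delimiter failed:", err)
```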

Hadoop in windows : file not found exception

I'm using Hadoop on Windows and I've configured everything (installed Cygwin, set up passwordless SSH, etc.).
I've compiled the WordCount program into WC.jar and tried to run it. It runs perfectly in standalone mode, but in fully distributed mode it throws a FileNotFoundException.
Please look at the logs and tell me what is wrong.
I've started DFS and MapReduce on MACH1 (that's my master).
$ bin/hadoop jar WC.jar WordCount words result
10/07/24 16:57:38 INFO input.FileInputFormat: Total input paths to process : 2
10/07/24 16:57:39 INFO mapred.JobClient: Running job: job_201007241657_0001
10/07/24 16:57:40 INFO mapred.JobClient: map 0% reduce 0%
10/07/24 16:57:50 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000003_0, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-328510/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000003_0/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:57:55 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_r_000002_0, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-328510/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_r_000002_0/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:07 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000003_1, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000003_1/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:14 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000003_2, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000003_2/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:26 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000002_0, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000002_0/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:34 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_r_000001_0, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_r_000001_0/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:41 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000002_1, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-328510/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000002_1/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:47 INFO mapred.JobClient: Task Id : attempt_201007241657_0001_m_000002_2, Status : FAILED
java.io.FileNotFoundException: File C:/tmp/hadoop-328510/mapred/local/taskTracker/jobcache/job_201007241657_0001/attempt_201007241657_0001_m_000002_2/work/tmp does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:519)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
10/07/24 16:58:53 INFO mapred.JobClient: Job complete: job_201007241657_0001
10/07/24 16:58:53 INFO mapred.JobClient: Counters: 0
328510#01HW179531 /usr/local/hadoop-0.20.2
$
Thanks.

I think I might have seen this exception before but I don't have access to my old logs to confirm it. I solved my FileNotFoundException by reformatting the namenode. You might want to check the namenode logs for "inconsistent state" to confirm the cause before reformatting.
