I am trying to execute the Sqoop import below.
sqoop import --connect 'jdbc:sqlserver://server-IP;database=db_name' --username xxx --password xxx --table xxx --hive-import --hive-table amit_hive --target-dir /user/hive/amitesh123 -m 1
I have to import a DB table directly into the desired location. As far as my understanding goes, the Sqoop command-line syntax above is written correctly, but on executing it I get the following error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=xxx, access=EXECUTE, inode="/user/hive/amitesh123":hive:hdfs:drwx
Somebody told me that we also have to mention the Hive database name in the Sqoop command. Is that true? If yes, can someone tell me which parameter to use? As far as I know, we only need --table to bring the table from the DB into a Hive table. Please suggest.
To test further, I created a new folder and gave it 777 rights, but I still get the same error. I have now added the Hive db.table name with --hive-table, so the new Sqoop import is as follows:
sqoop import --connect 'jdbc:sqlserver://server-IP;database=db_name' --username xxx --password xxx --table xxx --hive-import --hive-table amitesh_db.amit_hive --target-dir /amitesh012345/amitesh -m 1
However, the permission denied error is still there:
INFO mapreduce.Job: Job job_1486315054135_2834 failed with state FAILED due to: Job setup failed : org.apache.hadoop.security.AccessControlException: Permission denied: user=xxx, access=WRITE, inode="/amitesh012345/amitesh/_temporary/1":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1704)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1687)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3890)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3002)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2970)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1061)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1877)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:254)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=at732615, access=WRITE, inode="/amitesh012345/amitesh/_temporary/1":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1704)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1687)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3890)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:55
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3000)
... 13 more
17/03/14 05:23:38 INFO mapreduce.Job: Counters: 2
Job Counters
Total time spent by all maps in occupied slots (ms)=0
Total time spent by all reduces in occupied slots (ms)=0
17/03/14 05:23:38 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
17/03/14 05:23:38 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 24.4698 seconds (0 bytes/sec)
17/03/14 05:23:38 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
17/03/14 05:23:38 INFO mapreduce.ImportJobBase: Retrieved 0 records.
17/03/14 05:23:38 ERROR tool.ImportTool: Error during import: Import job failed!
Here is the second full stack trace:
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/03/14 05:38:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6_IBM_27
17/03/14 05:38:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/14 05:38:02 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/03/14 05:38:02 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/03/14 05:38:02 INFO manager.SqlManager: Using default fetchSize of 1000
17/03/14 05:38:02 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:path_to/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:path_to/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/03/14 05:38:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM [T_VND] AS t WHERE 1=0
17/03/14 05:38:03 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is path_to/hadoop
Note: path_to/T_VND.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/03/14 05:38:04 INFO orm.CompilationManager: Writing jar file: path_to/T_VND.jar
17/03/14 05:38:04 INFO mapreduce.ImportJobBase: Beginning import of T_VND
17/03/14 05:38:05 INFO impl.TimelineClientImpl: Timeline service address: http://xxxxxx/
17/03/14 05:38:05 INFO client.RMProxy: Connecting to ResourceManager at xxxxxx/server-IP:port
17/03/14 05:38:06 INFO db.DBInputFormat: Using read commited transaction isolation
17/03/14 05:38:07 INFO mapreduce.JobSubmitter: number of splits:1
17/03/14 05:38:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486315054135_2836
17/03/14 05:38:07 INFO impl.YarnClientImpl: Submitted application application_1486315054135_2836
17/03/14 05:38:07 INFO mapreduce.Job: The url to track the job: http://xxxxxx/server-IP:port/proxy/application_1486315054135_2836/
17/03/14 05:38:07 INFO mapreduce.Job: Running job: job_1486315054135_2836
17/03/14 05:38:13 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
17/03/14 05:38:13 INFO mapreduce.Job: Job job_1486315054135_2836 running in uber mode : false
17/03/14 05:38:13 INFO mapreduce.Job: map 0% reduce 100%
17/03/14 05:38:13 INFO mapreduce.Job: Job job_1486315054135_2836 failed with state FAILED due to:
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: the jobtracker with:
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.active = true
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: to take effect.
17/03/14 05:38:13 ERROR tool.ImportTool: Error during import: Import job failed!
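For reference, this is roughly how the ownership of the target directory can be checked and handed over to the importing user (just a sketch; the chown has to run as the HDFS superuser, and xxx stands for the importing user):
hdfs dfs -ls /user/hive | grep amitesh123          # shows owner, group and mode of the directory
sudo -u hdfs hdfs dfs -chown -R xxx:hdfs /user/hive/amitesh123
Alternatively, pointing --target-dir at a directory under the importing user's own HDFS home (for example /user/xxx/amitesh123) avoids the ownership problem entirely.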
Related
I need to transfer HDFS data from one cluster to another. The distcp command seemed helpful for this case, but it was not. On both clusters the NameNode is privately interconnected with its DataNodes, so I have two proxy machines to reach the NameNodes publicly. For example, I mapped the NameNode's 8070 port to port 20000 in HAProxy. Now I can ping both clusters' NameNodes, so I went for the distcp option. The MapReduce job for the data transfer starts executing, but it never completes.
[hdfs#ip-20-0-42-252 ~]$ hadoop distcp hdfs://YY.YY.YY.YY:20000/user/ce_prasith/filter.txt hdfs://xx.xx.xx.xx:20000/user/gl_qauser
18/10/09 10:12:15 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs:/user/ce_prasith/filter.txt], targetPath=hdfs://xx.xx.xx.xx:20000/user/gl_qauser, targetPathExists=true, filtersFile='null'}
18/10/09 10:12:16 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; dirCnt = 0
18/10/09 10:12:16 INFO tools.SimpleCopyListing: Build file listing completed.
18/10/09 10:12:16 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
18/10/09 10:12:16 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
18/10/09 10:12:16 INFO tools.DistCp: Number of paths in the copy list: 1
18/10/09 10:12:16 INFO tools.DistCp: Number of paths in the copy list: 1
18/10/09 10:12:16 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm97
18/10/09 10:12:16 INFO mapreduce.JobSubmitter: number of splits:1
18/10/09 10:12:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1539063069030_0003
18/10/09 10:12:16 INFO impl.YarnClientImpl: Submitted application application_1539063069030_0003
18/10/09 10:12:17 INFO mapreduce.Job: The url to track the job: http://ip-20-0-21-94.ec2.internal:8088/proxy/application_1539063069030_0003/
18/10/09 10:12:17 INFO tools.DistCp: DistCp job-id: job_1539063069030_0003
18/10/09 10:12:17 INFO mapreduce.Job: Running job: job_1539063069030_0003
18/10/09 10:12:22 INFO mapreduce.Job: Job job_1539063069030_0003 running in uber mode : true
18/10/09 10:12:22 INFO mapreduce.Job: map 0% reduce 0%
18/10/09 10:13:22 INFO mapreduce.Job: map 100% reduce 0%
For your information, I have captured a few logs from the job:
2018-10-09 12:01:42,715 WARN [CommitterEvent Processor #2] org.apache.hadoop.tools.mapred.CopyCommitter: Unable to cleanup temp files
org.apache.hadoop.net.ConnectTimeoutException: Call From ip-YY.YY.YY.YY.ec2.internal/YY.YY.YY.YY to ec2-xx.xx.xx.xx.ap-south-1.compute.amazonaws.com:20000 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=ec2-xx.xx.xx.xx.ap-south-1.compute.amazonaws.com/xx.xx.xx.xx:20000]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:750)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy10.getListing(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:573)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.getListing(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2101)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2084)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:731)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:110)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:796)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:792)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:792)
at org.apache.hadoop.fs.Globber.listStatus(Globber.java:76)
at org.apache.hadoop.fs.Globber.doGlob(Globber.java:237)
at org.apache.hadoop.fs.Globber.glob(Globber.java:151)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1714)
at org.apache.hadoop.tools.mapred.CopyCommitter.deleteAttemptTempFiles(CopyCommitter.java:145)
at org.apache.hadoop.tools.mapred.CopyCommitter.cleanupTempFiles(CopyCommitter.java:131)
at org.apache.hadoop.tools.mapred.CopyCommitter.abortJob(CopyCommitter.java:118)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobAbort(CommitterEventHandler.java:298)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=ec2-xx.xx.xx.xx.ap-south-1.compute.amazonaws.com/xx.xx.xx.xx:20000]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 31 more
2018-10-09 12:01:42,716 INFO [CommitterEvent Processor #2] org.apache.hadoop.tools.mapred.CopyCommitter: Cleaning up temporary work folder: /user/hdfs/.staging/_distcp1087004350
I am stuck here. Does somebody have an idea how to overcome this?
All nodes in the source cluster should be able to see all nodes in the destination cluster.
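For example, a quick reachability check run from a worker node of the source cluster, not just from the edge node (a sketch, reusing the proxied port from the command above):
nc -vz -w 5 xx.xx.xx.xx 20000        # can this worker reach the remote NameNode proxy?
hadoop fs -ls hdfs://xx.xx.xx.xx:20000/user/gl_qauser
Also keep in mind that distcp map tasks need to reach the remote DataNodes on their data-transfer port, not only the NameNode RPC port, so proxying the NameNode alone is usually not enough.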
I am trying to import data from MySQL into Hive using Sqoop.
MySQL
use sample;
create table forhive( id int auto_increment,
firstname varchar(36),
lastname varchar(36),
primary key(id)
);
insert into forhive(firstname, lastname) values("sample","singh");
select * from forhive;
1 abhay agrawal
2 vijay sharma
3 sample singh
This is the Sqoop command I'm using (version 1.4.7)
sqoop import --connect jdbc:mysql://********:3306/sample \
--table forhive --split-by id --columns id,firstname,lastname \
--target-dir /home/programmeur_v/forhive \
--hive-import --create-hive-table --hive-table sqp.forhive --username vaibhav -P
This is the error log I'm getting:
18/08/02 19:19:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
18/08/02 19:19:55 INFO tool.BaseSqoopTool: Using Hive-specific
delimiters for output. You can override
18/08/02 19:19:55 INFO tool.BaseSqoopTool: delimiters with
--fields-terminated-by, etc.
18/08/02 19:19:55 INFO manager.MySQLManager: Preparing to use a MySQL
streaming resultset.
18/08/02 19:19:55 INFO tool.CodeGenTool: Beginning code generation
18/08/02 19:19:56 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM forhive AS t LIMIT 1
18/08/02 19:19:56 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM forhive AS t LIMIT 1
18/08/02 19:19:56 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
/home/programmeur_v/softwares/hadoop-2.9.1
Note:
/tmp/sqoop-programmeur_v/compile/e8ffa12496a2e421f80e1fa16e025d28/forhive.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/08/02 19:19:58 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-programmeur_v/compile/e8ffa12496a2e421f80e1fa16e025d28/forhive.jar
18/08/02 19:19:58 WARN manager.MySQLManager: It looks like you are
importing from mysql.
18/08/02 19:19:58 WARN manager.MySQLManager: This transfer can be
faster! Use the --direct
18/08/02 19:19:58 WARN manager.MySQLManager: option to exercise a
MySQL-specific fast path.
18/08/02 19:19:58 INFO manager.MySQLManager: Setting zero DATETIME
behavior to convertToNull (mysql)
18/08/02 19:19:58 INFO mapreduce.ImportJobBase: Beginning import of
forhive
18/08/02 19:19:58 INFO Configuration.deprecation: mapred.jar is
deprecated. Instead, use mapreduce.job.jar
18/08/02 19:19:59 INFO Configuration.deprecation: mapred.map.tasks is
deprecated. Instead, use mapreduce.job.maps
18/08/02 19:19:59 INFO client.RMProxy: Connecting to ResourceManager
at /0.0.0.0:8032
18/08/02 19:20:02 INFO db.DBInputFormat: Using read commited
transaction isolation
18/08/02 19:20:02 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
SELECT MIN(id), MAX(id) FROM forhive
18/08/02 19:20:02 INFO db.IntegerSplitter: Split size: 0; Num splits:
4 from: 1 to: 3
18/08/02 19:20:02 INFO mapreduce.JobSubmitter: number of splits:3
18/08/02 19:20:02 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/08/02 19:20:02 INFO mapreduce.JobSubmitter: Submitting tokens for
job: job_1533231535061_0006
18/08/02 19:20:03 INFO impl.YarnClientImpl: Submitted application
application_1533231535061_0006
18/08/02 19:20:03 INFO mapreduce.Job: The url to track the job:
http://instance-1:8088/proxy/application_1533231535061_0006/
18/08/02 19:20:03 INFO mapreduce.Job: Running job:
job_1533231535061_0006
18/08/02 19:20:11 INFO mapreduce.Job: Job job_1533231535061_0006
running in uber mode : false
18/08/02 19:20:11 INFO mapreduce.Job: map 0% reduce 0%
18/08/02 19:20:21 INFO mapreduce.Job: map 33% reduce 0%
18/08/02 19:20:24 INFO mapreduce.Job: map 100% reduce 0%
18/08/02 19:20:25 INFO mapreduce.Job: Job job_1533231535061_0006
completed successfully
18/08/02 19:20:25 INFO mapreduce.Job: Counters: 31
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=622830
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=295
HDFS: Number of bytes written=48
HDFS: Number of read operations=12
HDFS: Number of large read operations=0
HDFS: Number of write operations=6
Job Counters
Killed map tasks=1
Launched map tasks=3
Other local map tasks=3
Total time spent by all maps in occupied slots (ms)=27404
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=27404
Total vcore-milliseconds taken by all map tasks=27404
Total megabyte-milliseconds taken by all map tasks=28061696
Map-Reduce Framework
Map input records=3
Map output records=3
Input split bytes=295
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=671
CPU time spent (ms)=4210
Physical memory (bytes) snapshot=616452096
Virtual memory (bytes) snapshot=5963145216
Total committed heap usage (bytes)=350224384
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=48
18/08/02 19:20:25 INFO mapreduce.ImportJobBase: Transferred 48 bytes
in 25.828 seconds (1.8584 bytes/sec)
18/08/02 19:20:25 INFO mapreduce.ImportJobBase: Retrieved 3 records.
18/08/02 19:20:25 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat
import job data to Listeners for table forhive
18/08/02 19:20:25 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM forhive AS t LIMIT 1
18/08/02 19:20:25 INFO hive.HiveImport: Loading uploaded data into
Hive
18/08/02 19:20:25 ERROR hive.HiveConfig: Could not load
org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set
correctly.
18/08/02 19:20:25 ERROR tool.ImportTool: Import failed:
java.io.IOException: java.lang.ClassNotFoundException:
org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
... 12 more
After googling the same error, I also added HIVE_CONF_DIR to my .bashrc:
export HIVE_HOME=/home/programmeur_v/softwares/apache-hive-1.2.2-bin
export HIVE_CONF_DIR=/home/programmeur_v/softwares/apache-hive-1.2.2-bin/conf
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin:$HIVE_CONF_DIR
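A quick check (just a sketch, assuming the paths above): whether the Hive classes are actually visible on the classpath that Hadoop, and therefore Sqoop's MapReduce job, sees:
hadoop classpath | tr ':' '\n' | grep -i hive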
All my Hadoop services are also up and running.
6976 NameNode
7286 SecondaryNameNode
7559 NodeManager
7448 ResourceManager
8522 DataNode
14587 Jps
I'm just unable to figure out what mistake I'm making here. Please guide!
Download the file hive-common-0.10.0.jar (a quick web search will find it) and place it in the sqoop/lib folder. This solution worked for me.
Go to the $HIVE_HOME/lib directory:
cd $HIVE_HOME/lib
Then copy hive-common-x.x.x.jar into $SQOOP_HOME/lib:
cp hive-common-x.x.x.jar $SQOOP_HOME/lib
You need to download the file hive-common-0.10.0.jar and copy it to $SQOOP_HOME/lib folder.
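A quick way to confirm the jar actually landed there (assuming $SQOOP_HOME is set):
ls "$SQOOP_HOME"/lib | grep -i hive-common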
Edit your .bash_profile, then add HADOOP_CLASSPATH:
vim ~/.bash_profile
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
source ~/.bash_profile
I got the same issue when I tried to import data from MySQL to Hive with the following command:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --username root --password z*****3 --table users -m 1 --hive-home /opt/hive --hive-import --hive-overwrite
Finally, these environment variables made it work perfectly.
export HIVE_HOME=/opt/hive
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
export HIVE_CONF_DIR=$HIVE_HOME/conf
I have an error while offloading an Oracle table to HDFS. Here is the command:
sqoop import -Dmapreduce.job.queuename=root.username \
--connect jdbc:oracle:thin:@//someExadataHostname/dbInstance \
--username user \
--password welcome1 \
--table TB_RECHARGE_DIM_APPLICATION \
--target-dir /data/in/sqm/dev/unprocessed/sqoop/oracle_db_exa_test \
--delete-target-dir \
--m 1
It throws this error:
Warning: /opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/10 14:27:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.1
18/01/10 14:27:24 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/10 14:27:24 INFO teradata.TeradataManagerFactory: Loaded connector factory for 'Cloudera Connector Powered by Teradata' on version 1.5c5
18/01/10 14:27:25 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/01/10 14:27:25 INFO manager.SqlManager: Using default fetchSize of 1000
18/01/10 14:27:25 INFO tool.CodeGenTool: Beginning code generation
18/01/10 14:27:29 INFO manager.OracleManager: Time zone has been set to GMT
18/01/10 14:27:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM TB_RECHARGE_DIM_APPLICATION t WHERE 1=0
18/01/10 14:27:29 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/s/compile/926451c21b6a6623f9763b96c7afa503/TB_RECHARGE_DIM_APPLICATION.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/10 14:27:31 INFO orm.CompilationManager: Writing jar file: /tmp/compile/926451c21b6a6623f9763b96c7afa503/TB_RECHARGE_DIM_APPLICATION.jar
18/01/10 14:27:32 INFO tool.ImportTool: Destination directory /data/in/sqm/dev/unprocessed/sqoop/oracle_db_exa_test deleted.
18/01/10 14:27:32 INFO manager.OracleManager: Time zone has been set to GMT
18/01/10 14:27:34 INFO manager.OracleManager: Time zone has been set to GMT
18/01/10 14:27:34 INFO mapreduce.ImportJobBase: Beginning import of TB_RECHARGE_DIM_APPLICATION
18/01/10 14:27:34 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/10 14:27:34 INFO manager.OracleManager: Time zone has been set to GMT
18/01/10 14:27:34 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/10 14:27:34 INFO hdfs.DFSClient: Created token for username: HDFS_DELEGATION_TOKEN owner=username#company.CO.ID, renewer=yarn, realUser=, issueDate=1515569254366, maxDate=1516174054366, sequenceNumber=29920785, masterKeyId=849 on ha-hdfs:nameservice1
18/01/10 14:27:34 INFO security.TokenCache: Got dt for hdfs://nameservice1; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for username: HDFS_DELEGATION_TOKEN owner=username#company.CO.ID, renewer=yarn, realUser=, issueDate=1515569254366, maxDate=1516174054366, sequenceNumber=29920785, masterKeyId=849)
18/01/10 14:28:10 WARN hdfs.DFSClient: Slow waitForAckedSeqno took 33367ms (threshold=30000ms). File being written: /user/username/.staging/job_1508590044386_4156415/libjars/commons-lang3-3.4.jar, block: BP-673686138-10.54.0.2-1453972538527:blk_3947617000_2874005894, Write pipeline datanodes: [DatanodeInfoWithStorage[10.54.1.110:50010,DS-bfb333fb-f63f-4c85-b60f-3ce0889fe16d,DISK], DatanodeInfoWithStorage[10.54.0.187:50010,DS-5c692f55-614c-4d33-9e83-0758d2d54555,DISK], DatanodeInfoWithStorage[10.54.0.183:50010,DS-8530593e-b498-455e-9aaa-b1a12c8ec3b2,DISK]]
18/01/10 14:28:13 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/10 14:28:14 INFO mapreduce.JobSubmitter: number of splits:1
18/01/10 14:28:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1508590044386_4156415
18/01/10 14:28:14 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for username: HDFS_DELEGATION_TOKEN owner=username#company.CO.ID, renewer=yarn, realUser=, issueDate=1515569254366, maxDate=1516174054366, sequenceNumber=29920785, masterKeyId=849)
18/01/10 14:28:15 INFO impl.YarnClientImpl: Submitted application application_1508590044386_4156415
18/01/10 14:28:15 INFO mapreduce.Job: The url to track the job: https://host:8090/proxy/application_1508590044386_4156415/
18/01/10 14:28:15 INFO mapreduce.Job: Running job: job_1508590044386_4156415
18/01/10 14:28:28 INFO mapreduce.Job: Job job_1508590044386_4156415 running in uber mode : false
18/01/10 14:28:28 INFO mapreduce.Job: map 0% reduce 0%
18/01/10 14:29:38 INFO mapreduce.Job: Task Id : attempt_1508590044386_4156415_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:170)
at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:161)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.RuntimeException: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:223)
at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:168)
... 10 more
Caused by: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:673)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:715)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:385)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:30)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:564)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302)
at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:216)
... 11 more
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:445)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:464)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:594)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:229)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1360)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:486)
... 19 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:162)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:411)
... 24 more
I am able to list the tables using
sqoop list-tables --connect jdbc:oracle:thin:@//someExadataHost/dbInstance --username user --password pass
I don't know why the network can't establish the connection when the job itself launches successfully (that means Sqoop is able to connect and sees that the Oracle table exists, right?). The map task just never finishes.
Any idea about this? Thank you.
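One way to narrow this down (a sketch, assuming the default Oracle listener port 1521): test from a YARN worker node, where the map task actually runs, whether the Oracle host is reachable at all, since sqoop list-tables only exercises the edge node:
nc -vz -w 5 someExadataHostname 1521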
There might be many reasons you are facing this issue:
The listener is not configured properly.
The listener process (service) is not running. Restart it with the lsnrctl start command, or on Windows by starting the listener service.
Also check the hostname in Oracle Net Manager and in the listener; both should be the same. Path: Local -> Service Naming -> orcl.
Hope this helps!!
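For example (a sketch, run on the database host):
lsnrctl status      # shows whether the listener is up and which services it knows about
lsnrctl start       # starts it if it is down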
Add --driver oracle.jdbc.driver.OracleDriver to the command line.
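For example, something like the command above with the driver class spelled out (as far as I know, specifying --driver makes Sqoop fall back to its generic JDBC manager):
sqoop import -Dmapreduce.job.queuename=root.username \
--connect jdbc:oracle:thin:@//someExadataHostname/dbInstance \
--username user \
--password welcome1 \
--table TB_RECHARGE_DIM_APPLICATION \
--target-dir /data/in/sqm/dev/unprocessed/sqoop/oracle_db_exa_test \
--delete-target-dir \
--driver oracle.jdbc.driver.OracleDriver \
--m 1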
I'm getting the following error when I try to import-all-tables via sqoop:
sqoop import-all-tables -m 12 --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username=retail_dba --password=cloudera --warehouse-dir=/r/cloudera/sqoop_import
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/04/23 15:29:27 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.0
17/04/23 15:29:27 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/04/23 15:29:27 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/04/23 15:29:27 INFO tool.CodeGenTool: Beginning code generation
17/04/23 15:29:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
17/04/23 15:29:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
17/04/23 15:29:27 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/e8e72a2e112fced2b0f3251b5666473d/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/04/23 15:29:30 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/e8e72a2e112fced2b0f3251b5666473d/categories.jar
17/04/23 15:29:30 WARN manager.MySQLManager: It looks like you are importing from mysql.
17/04/23 15:29:30 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
17/04/23 15:29:30 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
17/04/23 15:29:30 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
17/04/23 15:29:30 INFO mapreduce.ImportJobBase: Beginning import of categories
17/04/23 15:29:31 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/04/23 15:29:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/04/23 15:29:32 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/192.168.40.134:8032
17/04/23 15:29:37 INFO db.DBInputFormat: Using read commited transaction isolation
17/04/23 15:29:37 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
17/04/23 15:29:37 INFO db.IntegerSplitter: Split size: 4; Num splits: 12 from: 1 to: 58
17/04/23 15:29:38 INFO mapreduce.JobSubmitter: number of splits:12
17/04/23 15:29:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1492945339848_0010
17/04/23 15:29:39 INFO impl.YarnClientImpl: Submitted application application_1492945339848_0010
17/04/23 15:29:39 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1492945339848_0010/
17/04/23 15:29:39 INFO mapreduce.Job: Running job: job_1492945339848_0010
17/04/23 15:29:52 INFO mapreduce.Job: Job job_1492945339848_0010 running in uber mode : false
17/04/23 15:29:52 INFO mapreduce.Job: map 0% reduce 0%
17/04/23 15:29:52 INFO mapreduce.Job: Job job_1492945339848_0010 failed with state FAILED due to: Application application_1492945339848_0010 failed 2 times due to AM Container for appattempt_1492945339848_0010_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1492945339848_0010/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1492945339848_0010_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:578)
at org.apache.hadoop.util.Shell.run(Shell.java:481)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:763)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
17/04/23 15:29:52 INFO mapreduce.Job: Counters: 0
17/04/23 15:29:52 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
17/04/23 15:29:52 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 19.6175 seconds (0 bytes/sec)
17/04/23 15:29:52 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
17/04/23 15:29:52 INFO mapreduce.ImportJobBase: Retrieved 0 records.
17/04/23 15:29:52 ERROR tool.ImportAllTablesTool: Error during import: Import job failed!
It looks like the ApplicationMasters are getting killed repeatedly, meaning they are not getting as much memory as they would like. If you are just trying out Sqoop on the Cloudera virtual machine, don't use -m 12: that tries to spawn 12 parallel map tasks, which your (single) machine may not be able to handle. Leave that setting off altogether, or try --direct instead. Also, what is going on with --warehouse-dir=/r/cloudera/sqoop_import? Is /r/ a typo? Should it be /user/?
Try this instead:
sqoop import-all-tables \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--warehouse-dir=/user/cloudera/sqoop_import \
--username=retail_dba \
--direct \
--password=cloudera
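If it still fails, the ApplicationMaster container logs usually show the real cause; they can be pulled with something like this (using the application id from your output):
yarn logs -applicationId application_1492945339848_0010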
Try loading one table first instead of import-all-tables. Also restrict the number of mappers when using import-all-tables; 12 mappers is hampering the memory on the VM.
sqoop import-all-tables \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--warehouse-dir=/user/cloudera/sqoop_import \
--username=retail_dba \
--password=cloudera \
-m 2
I have set up a single-node Hadoop cluster and configured it to work with Apache Hive. When I imported a MySQL table using the following command (with Sqoop),
sqoop import --connect jdbc:mysql://localhost/dwhadoop --table orders --username root --password 123456 --hive-import
it runs successfully, with some exceptions thrown. After that, when I do
hive> show tables;
it does not list the orders table.
If I run the import command again, it tells me that the orders directory already exists.
Please help me find the solution.
EDIT:
I haven't created any tables prior to the import. Do I have to create a table orders in Hive before running the import? If I import another table, Customers, it gives me the following exception:
[root@localhost root-647263876]# sqoop import --connect jdbc:mysql://localhost/dwhadoop --table Customers --username root --password 123456 --hive-import
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: $HADOOP_HOME is deprecated.
12/08/05 07:30:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/08/05 07:30:25 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
12/08/05 07:30:25 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
12/08/05 07:30:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/08/05 07:30:26 INFO tool.CodeGenTool: Beginning code generation
12/08/05 07:30:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `Customers` AS t LIMIT 1
12/08/05 07:30:26 INFO orm.CompilationManager: HADOOP_HOME is /home/enigma/hadoop/libexec/..
Note: /tmp/sqoop-root/compile/e48d4803894ee63079f7194792d624ed/Customers.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/08/05 07:30:28 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e48d4803894ee63079f7194792d624ed/Customers.jar
12/08/05 07:30:28 WARN manager.MySQLManager: It looks like you are importing from mysql.
12/08/05 07:30:28 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
12/08/05 07:30:28 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
12/08/05 07:30:28 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
12/08/05 07:30:28 INFO mapreduce.ImportJobBase: Beginning import of Customers
12/08/05 07:30:28 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/08/05 07:30:29 INFO mapred.JobClient: Running job: job_local_0001
12/08/05 07:30:29 INFO util.ProcessTree: setsid exited with exit code 0
12/08/05 07:30:29 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin#11f41fd
12/08/05 07:30:29 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
12/08/05 07:30:30 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/08/05 07:30:30 INFO mapred.LocalJobRunner:
12/08/05 07:30:30 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now
12/08/05 07:30:30 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_m_000000_0' to Customers
12/08/05 07:30:30 INFO mapred.JobClient: map 0% reduce 0%
12/08/05 07:30:32 INFO mapred.LocalJobRunner:
12/08/05 07:30:32 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/08/05 07:30:33 INFO mapred.JobClient: map 100% reduce 0%
12/08/05 07:30:33 INFO mapred.JobClient: Job complete: job_local_0001
12/08/05 07:30:33 INFO mapred.JobClient: Counters: 13
12/08/05 07:30:33 INFO mapred.JobClient: File Output Format Counters
12/08/05 07:30:33 INFO mapred.JobClient: Bytes Written=45
12/08/05 07:30:33 INFO mapred.JobClient: File Input Format Counters
12/08/05 07:30:33 INFO mapred.JobClient: Bytes Read=0
12/08/05 07:30:33 INFO mapred.JobClient: FileSystemCounters
12/08/05 07:30:33 INFO mapred.JobClient: FILE_BYTES_READ=3205
12/08/05 07:30:33 INFO mapred.JobClient: FILE_BYTES_WRITTEN=52579
12/08/05 07:30:33 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=45
12/08/05 07:30:33 INFO mapred.JobClient: Map-Reduce Framework
12/08/05 07:30:33 INFO mapred.JobClient: Map input records=3
12/08/05 07:30:33 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
12/08/05 07:30:33 INFO mapred.JobClient: Spilled Records=0
12/08/05 07:30:33 INFO mapred.JobClient: Total committed heap usage (bytes)=21643264
12/08/05 07:30:33 INFO mapred.JobClient: CPU time spent (ms)=0
12/08/05 07:30:33 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
12/08/05 07:30:33 INFO mapred.JobClient: SPLIT_RAW_BYTES=87
12/08/05 07:30:33 INFO mapred.JobClient: Map output records=3
12/08/05 07:30:33 INFO mapreduce.ImportJobBase: Transferred 45 bytes in 5.359 seconds (8.3971 bytes/sec)
12/08/05 07:30:33 INFO mapreduce.ImportJobBase: Retrieved 3 records.
12/08/05 07:30:33 INFO hive.HiveImport: Loading uploaded data into Hive
12/08/05 07:30:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `Customers` AS t LIMIT 1
12/08/05 07:30:33 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot run program "hive": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
at java.lang.Runtime.exec(Runtime.java:615)
at java.lang.Runtime.exec(Runtime.java:526)
at org.apache.sqoop.util.Executor.exec(Executor.java:76)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:344)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:297)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:393)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:454)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021)
... 15 more
But then, if I run the import again, it says:
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: $HADOOP_HOME is deprecated.
12/08/05 07:33:48 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/08/05 07:33:48 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
12/08/05 07:33:48 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
12/08/05 07:33:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/08/05 07:33:48 INFO tool.CodeGenTool: Beginning code generation
12/08/05 07:33:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `Customers` AS t LIMIT 1
12/08/05 07:33:49 INFO orm.CompilationManager: HADOOP_HOME is /home/enigma/hadoop/libexec/..
Note: /tmp/sqoop-root/compile/9855cf7de9cf54c59095fb4bfd65a369/Customers.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/08/05 07:33:50 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-root/compile/9855cf7de9cf54c59095fb4bfd65a369/Customers.java to /app/hadoop/tmp/mapred/staging/root-647263876/./Customers.java
java.io.IOException: Destination '/app/hadoop/tmp/mapred/staging/root-647263876/./Customers.java' already exists
at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
at org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:227)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:368)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:454)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
12/08/05 07:33:50 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/9855cf7de9cf54c59095fb4bfd65a369/Customers.jar
12/08/05 07:33:51 WARN manager.MySQLManager: It looks like you are importing from mysql.
12/08/05 07:33:51 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
12/08/05 07:33:51 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
12/08/05 07:33:51 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
12/08/05 07:33:51 INFO mapreduce.ImportJobBase: Beginning import of Customers
12/08/05 07:33:51 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/08/05 07:33:52 INFO mapred.JobClient: Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/root-195281052/.staging/job_local_0001
12/08/05 07:33:52 ERROR security.UserGroupInformation: PriviledgedActionException as:root cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory Customers already exists
12/08/05 07:33:52 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory Customers already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:889)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:119)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:179)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:97)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:381)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:454)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
The main thing to note is that your original import fails because Sqoop tries to invoke hive but it is not on your PATH. Fix that problem before continuing.
Then find and remove the existing Customers output directory (in this run it was created by the local job runner, so look on the local filesystem rather than in HDFS) and try again; see the sketch below.
From what I've seen, errors of the form "Customers.java already exists" are not fatal.
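Roughly (a sketch; adjust the Hive path to wherever the hive launcher script actually lives on your box):
export PATH=$PATH:/usr/lib/hive/bin      # hypothetical location of the hive executable
rm -rf ./Customers                       # if the job ran through the local job runner
hadoop fs -rmr Customers                 # or, if the output actually landed in HDFS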
You have to include --create-hive-table in your command if you want Sqoop to create the table in Hive and load the data into it.
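For example, something like the same import command as above with the flag added:
sqoop import --connect jdbc:mysql://localhost/dwhadoop --table orders --username root --password 123456 --hive-import --create-hive-table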
When you import data into Hive, Sqoop tries to create a temporary HDFS staging directory in order to load the data into the table at the end. It is better to make sure that directory doesn't already exist.
It looks like Sqoop's working directory doesn't have enough privileges to make filesystem changes. Make sure the user running the import is the owner of the related files.