Sqoop Reading 0 records while doing a full load - sqoop2

PROBLEM DESCRIPTION:
I was trying to sqoop data, but Sqoop returned zero records without any error.
When I retrieve records with a small limit the query gets the data, but once I proceed with a greater limit it does not fetch any records.
The query passed to Sqoop is shown below:
select usr.id,usr.login,usr.auto_login,usr.password,usr.password_salt,usr.member,usr.first_name,usr.middle_name,usr.last_name,usr.user_type,usr.locale,usr.lastactivity_date,usr.lastpwdupdate,usr.generatedpwd,usr.registration_date,usr.email,usr.email_status,usr.receive_email,usr.last_emailed,usr.gender,usr.date_of_birth,usr.securitystatus,usr.description,usr.realm_id,usr.password_kdf,dcspp_order.last_modified_date, 20151223080640 FROM <TABLE_NAME> usr JOIN atgprdcore.dcspp_order ON (usr.id = dcspp_order.profile_id ) WHERE $CONDITIONS
Generated SQOOP Command: sqoop job -Dmapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom" -libjars /<COMP>/stage/da_data/DataAqusition_ATG/dm-sqoop-1.0.0/lib/tdgssconfig.jar,/<COMP>/stage/da_data/DataAqusition_ATG/dm-sqoop-1.0.0/lib/ojdbc6.jar,/<COMP>/stage/da_data/DataAqusition_ATG/dm-sqoop-1.0.0/lib/nzjdbc3.jar,/<COMP>/stage/da_data/DataAqusition_ATG/dm-sqoop-1.0.0/lib/terajdbc4.jar -Dfile.encoding=UTF-8 -Dmapreduce.job.queuename=long_running -Dmapreduce.job.name=sample-job-name --create Sqoop_Utility1253423780 -- import --connect jdbc:oracle:thin:@10.202.201.15:9101:KOHLDBSA1 --username XXXXXX --password-file /tmp/sqoop-nzhdusr/27c6d6d50fccdc67342374a4f560d1d6-asdfg.txt --fetch-size 100 --query 'select usr.id,usr.login,usr.auto_login,usr.password,usr.password_salt,usr.member,usr.first_name,usr.middle_name,usr.last_name,usr.user_type,usr.locale,usr.lastactivity_date,usr.lastpwdupdate,usr.generatedpwd,usr.registration_date,usr.email,usr.email_status,usr.receive_email,usr.last_emailed,usr.gender,usr.date_of_birth,usr.securitystatus,usr.description,usr.realm_id,usr.password_kdf,dcspp_order.last_modified_date, 20151223080640 FROM <database>.<tablename> usr JOIN atgprdcore.dcspp_order ON (usr.id = dcspp_order.profile_id ) WHERE $CONDITIONS' --hive-drop-import-delims --null-string "" --target-dir /tmp/sqoop-nzhdusr/dps_user --num-mappers 1 --fields-terminated-by "|"
[INFO] running sqoop
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/12/23 08:07:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4.2.1.10.0-881
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/12/23 08:07:15 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4.2.1.10.0-881
15/12/23 08:07:18 INFO tool.CodeGenTool: Beginning code generation
15/12/23 08:07:19 INFO manager.OracleManager: Time zone has been set to GMT
15/12/23 08:07:19 INFO manager.SqlManager: Executing SQL statement: select usr.id,usr.login,usr.auto_login,usr.password,usr.password_salt,usr.member,usr.first_name,usr.middle_name,usr.last_name,usr.user_type,usr.locale,usr.lastactivity_date,usr.lastpwdupdate,usr.generatedpwd,usr.registration_date,usr.email,usr.email_status,usr.receive_email,usr.last_emailed,usr.gender,usr.date_of_birth,usr.securitystatus,usr.description,usr.realm_id,usr.password_kdf,dcspp_order.last_modified_date, 20151223080640 FROM <database>.<tablename> tab1 JOIN atgprdcore.dcspp_order ON (usr.id = dcspp_order.profile_id ) WHERE (1 = 0)
15/12/23 08:07:19 INFO manager.SqlManager: Executing SQL statement: select usr.id,usr.login,usr.auto_login,usr.password,usr.password_salt,usr.member,usr.first_name,usr.middle_name,usr.last_name,usr.user_type,usr.locale,usr.lastactivity_date,usr.lastpwdupdate,usr.generatedpwd,usr.registration_date,usr.email,usr.email_status,usr.receive_email,usr.last_emailed,usr.gender,usr.date_of_birth,usr.securitystatus,usr.description,usr.realm_id,usr.password_kdf,dcspp_order.last_modified_date, 20151223080640 FROM <database>.<tablename> tab2 JOIN atgprdcore.dcspp_order ON (usr.id = dcspp_order.profile_id ) WHERE (1 = 0)
15/12/23 08:07:19 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-nzhdusr/compile/ed8d5029fc473715d385a2c0b7e002c4/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/12/23 08:07:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-nzhdusr/compile/ed8d5029fc473715d385a2c0b7e002c4/QueryResult.jar
15/12/23 08:07:21 INFO mapreduce.ImportJobBase: Beginning query import.
15/12/23 08:07:21 INFO client.RMProxy: Connecting to ResourceManager at nhga0002.tst.<COMP>.com/10.200.0.3:8050
15/12/23 08:07:21 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 47174 for nzhdusr on ha-hdfs:<URL>
15/12/23 08:07:21 INFO security.TokenCache: Got dt for hdfs://<URL>; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:<URL>, Ident: (HDFS_DELEGATION_TOKEN token 47174 for nzhdusr)
15/12/23 08:07:23 INFO db.DBInputFormat: Using read commited transaction isolation
15/12/23 08:07:24 INFO mapreduce.JobSubmitter: number of splits:1
15/12/23 08:07:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1444949527622_18165
15/12/23 08:07:24 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:<URL>, Ident: (HDFS_DELEGATION_TOKEN token 47174 for nzhdusr)
15/12/23 08:07:24 INFO impl.YarnClientImpl: Submitted application application_1444949527622_18165
15/12/23 08:07:25 INFO mapreduce.Job: The url to track the job: https://nhga0002.tst.<COMP>.com:8090/proxy/application_1444949527622_18165/
15/12/23 08:07:25 INFO mapreduce.Job: Running job: job_1444949527622_18165
15/12/23 08:07:35 INFO mapreduce.Job: Job job_1444949527622_18165 running in uber mode : false
15/12/23 08:07:35 INFO mapreduce.Job: map 0% reduce 0%
15/12/23 08:24:57 INFO mapreduce.Job: map 100% reduce 0%
15/12/23 08:24:57 INFO mapreduce.Job: Job job_1444949527622_18165 completed successfully
15/12/23 08:24:57 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=117614
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=1039640
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=1039640
Total vcore-seconds taken by all map tasks=1039640
Total megabyte-seconds taken by all map tasks=6919843840
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=119
CPU time spent (ms)=7760
Physical memory (bytes) snapshot=315817984
Virtual memory (bytes) snapshot=6523957248
Total committed heap usage (bytes)=1114112000
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
15/12/23 08:24:57 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 1,055.9463 seconds (0 bytes/sec)
15/12/23 08:24:57 INFO mapreduce.ImportJobBase: Retrieved 0 records.
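Since the mapper completes cleanly but Map input records=0, a useful first step is to run the same join as a plain count through sqoop eval, which executes the statement over the same JDBC connection without launching a MapReduce job. A minimal sketch, assuming the connection settings from the generated command above:

sqoop eval \
  --connect jdbc:oracle:thin:@10.202.201.15:9101:KOHLDBSA1 \
  --username XXXXXX \
  --password-file /tmp/sqoop-nzhdusr/27c6d6d50fccdc67342374a4f560d1d6-asdfg.txt \
  --query "SELECT COUNT(*) FROM <database>.<tablename> usr JOIN atgprdcore.dcspp_order ON (usr.id = dcspp_order.profile_id)"

If the count is zero here as well, the problem is on the database side (uncommitted data, an over-restrictive join, or the wrong schema) rather than in the Sqoop job itself.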

Related

sqoop imported data but with empty part-m-00000 files?

When importing data from an Oracle database to HDFS using Apache Sqoop, the import completes but the files are empty.
sqoop import --connect jdbc:oracle:thin:@192.168.0.15:1521:XE --username system --password system --table EMP -m 1 --target-dir /user/sinha
After running, it creates a part-m-00000 file without any data. The console output while running the query:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/03/05 09:43:57 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.12.0
18/03/05 09:43:57 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/03/05 09:44:00 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/03/05 09:44:58 INFO mapreduce.JobSubmitter: number of splits:1
18/03/05 09:45:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1520229051986_0016
18/03/05 09:45:03 INFO impl.YarnClientImpl: Submitted application application_1520229051986_0016
18/03/05 09:45:03 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1520229051986_0016/
18/03/05 09:45:03 INFO mapreduce.Job: Running job: job_1520229051986_0016
18/03/05 09:45:54 INFO mapreduce.Job: Job job_1520229051986_0016 running in uber mode : false
18/03/05 09:45:54 INFO mapreduce.Job: map 0% reduce 0%
18/03/05 09:46:35 INFO mapreduce.Job: map 100% reduce 0%
18/03/05 09:46:36 INFO mapreduce.Job: Job job_1520229051986_0016 completed successfully
18/03/05 09:46:36 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=151209
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=37383
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=37383
Total vcore-milliseconds taken by all map tasks=37383
Total megabyte-milliseconds taken by all map tasks=38280192
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=546
CPU time spent (ms)=5110
Physical memory (bytes) snapshot=143175680
Virtual memory (bytes) snapshot=1509150720
Total committed heap usage (bytes)=74973184
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
18/03/05 09:46:36 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 108.9264 seconds (0 bytes/sec)
18/03/05 09:46:36 INFO mapreduce.ImportJobBase: Retrieved 0 records
I don't know what the problem is.
Even when I check with the "eval" command, it displays only the column names of the table.
Looking at the logs, your source table doesn't have any records at all. Do a SELECT on your Oracle table to validate. Add some records to your Oracle table and try the Sqoop operation again; you should then be able to fetch the data.
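A quick way to confirm this from the Sqoop side is an eval count, sketched with the connection settings from the question (the password is inline only because the question's command uses it; -P is safer):

sqoop eval --connect jdbc:oracle:thin:@192.168.0.15:1521:XE --username system --password system --query "SELECT COUNT(*) FROM EMP"

A count of 0 here matches the Map input records=0 counter above and confirms the table is empty as seen by that JDBC user.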

import-all-tables hanging during hive-import

[cloudera@quickstart ~]$ sqoop import-all-tables --connect="jdbc:mysql://serverip:3306/dbname" \
--username=xxx --password=yyy -m 1 --hive-import --hive-overwrite \
--create-hive-table --hive-database sqoopimport --hive-home /user/hive/warehouse
It runs through the map-reduce phase, but during the Hive import it hangs at the step below. Am I making a mistake?
Logging initialized using configuration in jar:file:/usr/jars/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
Full log:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/08/13 21:20:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.0
16/08/13 21:20:37 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/08/13 21:20:37 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/08/13 21:20:37 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/08/13 21:20:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/08/13 21:20:39 INFO tool.CodeGenTool: Beginning code generation
16/08/13 21:20:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/08/13 21:20:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/08/13 21:20:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/af1ef62229d717800a3cd16b42f53dc3/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/08/13 21:20:43 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/af1ef62229d717800a3cd16b42f53dc3/categories.jar
16/08/13 21:20:43 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/08/13 21:20:43 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/08/13 21:20:43 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/08/13 21:20:43 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/08/13 21:20:43 INFO mapreduce.ImportJobBase: Beginning import of categories
16/08/13 21:20:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/08/13 21:20:45 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/08/13 21:20:46 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
16/08/13 21:20:49 INFO db.DBInputFormat: Using read commited transaction isolation
16/08/13 21:20:49 INFO mapreduce.JobSubmitter: number of splits:1
16/08/13 21:20:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470828230815_0030
16/08/13 21:20:50 INFO impl.YarnClientImpl: Submitted application application_1470828230815_0030
16/08/13 21:20:50 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1470828230815_0030/
16/08/13 21:20:50 INFO mapreduce.Job: Running job: job_1470828230815_0030
16/08/13 21:21:07 INFO mapreduce.Job: Job job_1470828230815_0030 running in uber mode : false
16/08/13 21:21:07 INFO mapreduce.Job: map 0% reduce 0%
16/08/13 21:21:18 INFO mapreduce.Job: map 100% reduce 0%
16/08/13 21:21:18 INFO mapreduce.Job: Job job_1470828230815_0030 completed successfully
16/08/13 21:21:18 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=139452
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=1029
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=1008512
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7879
Total vcore-seconds taken by all map tasks=7879
Total megabyte-seconds taken by all map tasks=1008512
Map-Reduce Framework
Map input records=58
Map output records=58
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=78
CPU time spent (ms)=950
Physical memory (bytes) snapshot=143511552
Virtual memory (bytes) snapshot=726700032
Total committed heap usage (bytes)=48234496
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=1029
16/08/13 21:21:18 INFO mapreduce.ImportJobBase: Transferred 1.0049 KB in 32.6366 seconds (31.5291 bytes/sec)
16/08/13 21:21:18 INFO mapreduce.ImportJobBase: Retrieved 58 records.
16/08/13 21:21:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/08/13 21:21:18 INFO hive.HiveImport: Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/usr/jars/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
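When the import stalls right after "Logging initialized", a quick sanity check is whether the Hive CLI can start and reach the metastore on its own, independent of Sqoop. A hedged diagnostic, assuming the quickstart VM's default Hive setup:

# If this also hangs, Hive itself is stuck (for example, another open Hive CLI
# session holding the embedded Derby metastore lock on single-user sandboxes),
# and the Sqoop command is only a victim:
hive -e 'SHOW DATABASES;'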

No tables are created inside Hive but data is created inside HDFS

I am new to HDFS, and I am trying to import data from my Oracle 12c database. I have a table EMP; it needs to be imported into HDFS as well as into Hive tables.
My data is getting created inside HDFS (a folder 'EMP' gets created under '/user/hdfs'), but when I open the Hive query editor and type 'show tables' I don't see any tables there. I need the tables to be created inside Hive as well.
I am running the following commands.
1. Since I am running Sqoop as the root user:
usermod -a -G supergroup hardik
2.
export SQOOP_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop
export HIVE_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hive
export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ojdbc7.jar:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hive/lib/*
export HADOOP_USER_NAME=hdfs
3.
export PATH=$PATH:$HIVE_HOME/bin
Now I am running the Sqoop import command, and I get the following on the console:
4.
sqoop import --connect jdbc:oracle:thin:@bigdatadev2:1521/orcl --username BDD1 --password oracle123 --table EMP --hive-import -m 1 --create-hive-table --hive-table EMP
[root@bigdatadev1 ~]# sqoop import --connect jdbc:oracle:thin:@bigdatadev2:1521/orcl --username BDD1 --password oracle123 --table EMP --hive-import -m 1 --create-hive-table --hive-table EMP
Warning: /opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/04/07 22:15:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.1
16/04/07 22:15:23 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/07 22:15:23 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/07 22:15:23 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/07 22:15:23 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
16/04/07 22:15:23 INFO manager.SqlManager: Using default fetchSize of 1000
16/04/07 22:15:23 INFO tool.CodeGenTool: Beginning code generation
16/04/07 22:15:24 INFO manager.OracleManager: Time zone has been set to GMT
16/04/07 22:15:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM EMP t WHERE 1=0
16/04/07 22:15:24 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/fcb6484db042a7b4295d911956145a4e/EMP.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/07 22:15:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/fcb6484db042a7b4295d911956145a4e/EMP.jar
16/04/07 22:15:25 INFO manager.OracleManager: Time zone has been set to GMT
16/04/07 22:15:25 INFO manager.OracleManager: Time zone has been set to GMT
16/04/07 22:15:25 INFO mapreduce.ImportJobBase: Beginning import of EMP
16/04/07 22:15:25 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/04/07 22:15:25 INFO manager.OracleManager: Time zone has been set to GMT
16/04/07 22:15:26 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/04/07 22:15:26 INFO client.RMProxy: Connecting to ResourceManager at bigdata/10.103.25.39:8032
16/04/07 22:15:30 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/07 22:15:30 INFO mapreduce.JobSubmitter: number of splits:1
16/04/07 22:15:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460040138373_0007
16/04/07 22:15:31 INFO impl.YarnClientImpl: Submitted application application_1460040138373_0007
16/04/07 22:15:31 INFO mapreduce.Job: The url to track the job: http://bigdata:8088/proxy/application_1460040138373_0007/
16/04/07 22:15:31 INFO mapreduce.Job: Running job: job_1460040138373_0007
16/04/07 22:15:37 INFO mapreduce.Job: Job job_1460040138373_0007 running in uber mode : false
16/04/07 22:15:37 INFO mapreduce.Job: map 0% reduce 0%
16/04/07 22:15:43 INFO mapreduce.Job: Task Id : attempt_1460040138373_0007_m_000000_0, Status : FAILED
Error: EMP : Unsupported major.minor version 52.0
16/04/07 22:15:56 INFO mapreduce.Job: Task Id : attempt_1460040138373_0007_m_000000_1, Status : FAILED
Error: EMP : Unsupported major.minor version 52.0
16/04/07 22:16:03 INFO mapreduce.Job: map 100% reduce 0%
16/04/07 22:16:04 INFO mapreduce.Job: Job job_1460040138373_0007 completed successfully
16/04/07 22:16:04 INFO mapreduce.Job: Counters: 31
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=137942
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=12
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Failed map tasks=2
Launched map tasks=3
Other local map tasks=3
Total time spent by all maps in occupied slots (ms)=20742
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=20742
Total vcore-seconds taken by all map tasks=20742
Total megabyte-seconds taken by all map tasks=10619904
Map-Reduce Framework
Map input records=3
Map output records=3
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=53
CPU time spent (ms)=2090
Physical memory (bytes) snapshot=207478784
Virtual memory (bytes) snapshot=2169630720
Total committed heap usage (bytes)=134217728
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=12
16/04/07 22:16:04 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 38.6207 seconds (0.3107 bytes/sec)
16/04/07 22:16:04 INFO mapreduce.ImportJobBase: Retrieved 3 records.
16/04/07 22:16:05 INFO manager.OracleManager: Time zone has been set to GMT
16/04/07 22:16:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM EMP t WHERE 1=0
16/04/07 22:16:05 INFO hive.HiveImport: Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-common-1.1.0-cdh5.5.1.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Table EMP already exists)
I have tried all variations of the sqoop import command, but none have succeeded. I am even more confused today. Please help, and please do not mark this as a duplicate.
From your logs, I found two errors:
Error: EMP : Unsupported major.minor version 52.0
Unsupported major.minor version 52.0 occurs when you try to run a class compiled with the Java 1.8 compiler on a lower JRE version, e.g. JRE 1.7 or JRE 1.6.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Table EMP already exists)
Your job worked up to the point of loading the data into HDFS. You must be trying the same command again without deleting the /user/hdfs/EMP directory; that's why you got this error.
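Two checks follow from those errors; a hedged sketch using the table and directory names from the question:

javac -version                        # JDK that Sqoop uses to compile the generated EMP class
java -version                         # the JRE on the cluster nodes must be at least as new
hive -e 'DROP TABLE IF EXISTS EMP;'   # remove the Hive table left by the earlier run
hadoop fs -rm -r /user/hdfs/EMP       # remove the leftover HDFS directory before retrying

The version comparison matters on the nodes where the map tasks execute: "Unsupported major.minor version 52.0" means the class was built with Java 8 but ran on an older JRE.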

Can't finish MR when using Sqoop to transfer data from HDFS to MySQL

While transferring data from HDFS to MySQL, a MapReduce job gets spawned, but it gets stuck and does not complete.
sqoop export --connect jdbc:mysql://crxy2:3306/test --username root --password 19911130 --table info --export-dir sqoop_export
I see the following in the logs:
Warning: /software/sqoop-1.4.6.alpha/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /software/sqoop-1.4.6.alpha/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /software/sqoop-1.4.6.alpha/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /software/sqoop-1.4.6.alpha/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
15/12/02 01:17:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
15/12/02 01:17:37 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/12/02 01:17:37 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/12/02 01:17:37 INFO tool.CodeGenTool: Beginning code generation
15/12/02 01:17:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `info` AS t LIMIT 1
15/12/02 01:17:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `info` AS t LIMIT 1
15/12/02 01:17:38 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /software/hadoop-2.6.0
Note: /tmp/sqoop-root/compile/344126e97612def1e3976c1978c2e75e/info.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/12/02 01:17:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/344126e97612def1e3976c1978c2e75e/info.jar
15/12/02 01:17:42 INFO mapreduce.ExportJobBase: Beginning export of info
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/software/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/software/hbase-0.98.8-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/12/02 01:17:43 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/12/02 01:17:45 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/12/02 01:17:45 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
15/12/02 01:17:45 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/12/02 01:17:46 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/02 01:17:50 INFO input.FileInputFormat: Total input paths to process : 1
15/12/02 01:17:50 INFO input.FileInputFormat: Total input paths to process : 1
15/12/02 01:17:50 INFO mapreduce.JobSubmitter: number of splits:4
15/12/02 01:17:50 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
15/12/02 01:17:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449047829255_0001
15/12/02 01:17:51 INFO impl.YarnClientImpl: Submitted application application_1449047829255_0001
15/12/02 01:17:52 INFO mapreduce.Job: The url to track the job: http://crxy2:8088/proxy/application_1449047829255_0001/
15/12/02 01:17:52 INFO mapreduce.Job: Running job: job_1449047829255_0001
15/12/02 01:18:12 INFO mapreduce.Job: Job job_1449047829255_0001 running in uber mode : false
15/12/02 01:18:12 INFO mapreduce.Job: map 0% reduce 0%
15/12/02 01:19:10 INFO mapreduce.Job: map 75% reduce 0%
15/12/02 01:19:12 INFO mapreduce.Job: map 100% reduce 0%
15/12/02 01:29:41 INFO mapreduce.Job: Task Id : attempt_1449047829255_0001_m_000001_0, Status : FAILED
AttemptID:attempt_1449047829255_0001_m_000001_0 Timed out after 600 secs
15/12/02 01:29:42 INFO mapreduce.Job: map 75% reduce 0%
15/12/02 01:29:58 INFO mapreduce.Job: map 100% reduce 0%
15/12/02 01:40:11 INFO mapreduce.Job: Task Id : attempt_1449047829255_0001_m_000001_1, Status : FAILED
AttemptID:attempt_1449047829255_0001_m_000001_1 Timed out after 600 secs
15/12/02 01:40:12 INFO mapreduce.Job: map 75% reduce 0%
15/12/02 01:40:28 INFO mapreduce.Job: map 100% reduce 0%
15/12/02 01:50:41 INFO mapreduce.Job: Task Id : attempt_1449047829255_0001_m_000001_2, Status : FAILED
AttemptID:attempt_1449047829255_0001_m_000001_2 Timed out after 600 secs
15/12/02 01:50:42 INFO mapreduce.Job: map 75% reduce 0%
15/12/02 01:51:00 INFO mapreduce.Job: map 100% reduce 0%
15/12/02 02:01:13 INFO mapreduce.Job: Job job_1449047829255_0001 failed with state FAILED due to: Task failed task_1449047829255_0001_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0
15/12/02 02:01:13 INFO mapreduce.Job: Counters: 32
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=370395
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=556
HDFS: Number of bytes written=0
HDFS: Number of read operations=15
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed map tasks=4
Launched map tasks=7
Other local map tasks=3
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=2732612
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=2732612
Total vcore-seconds taken by all map tasks=2732612
Total megabyte-seconds taken by all map tasks=2798194688
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=504
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=759
CPU time spent (ms)=5170
Physical memory (bytes) snapshot=245080064
Virtual memory (bytes) snapshot=2529026048
Total committed heap usage (bytes)=46792704
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
15/12/02 02:01:13 INFO mapreduce.ExportJobBase: Transferred 556 bytes in 2,607.4894 seconds (0.2132 bytes/sec)
15/12/02 02:01:13 INFO mapreduce.ExportJobBase: Exported 0 records.
15/12/02 02:01:13 ERROR tool.ExportTool: Error during export: Export job failed!
2015-12-02 08:01:15,791 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESS APPID=application_1449047829255_0002
2015-12-02 08:01:15,793 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1449047829255_0002_000001 is done. finalState=FINISHED
2015-12-02 08:01:15,793 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1449047829255_0002 requests cleared
2015-12-02 08:01:15,794 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1449047829255_0002 user: root queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-12-02 08:01:15,794 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1449047829255_0002 user: root leaf-queue of parent: root #applications: 0
2015-12-02 08:01:15,794 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1449047829255_0002,name=info.jar,user=root,queue=default,state=FINISHED,trackingUrl=http://crxy2:8088/proxy/application_1449047829255_0002/jobhistory/job/job_1449047829255_0002,appMasterHost=crxy2,startTime=1449069503787,finishTime=1449072069229,finalStatus=FAILED
2015-12-02 08:01:15,796 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1449047829255_0002_000001
2015-12-02 08:01:15,873 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-12-02 08:01:15,873 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-12-02 08:01:16,879 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
The questioner was looking at the wrong logs. They were able to troubleshoot the issue by going through the failed task attempt logs, as suggested in the comments.
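For reference, the aggregated logs for the failed attempts can be pulled by application ID; a minimal sketch using the ID from the output above (YARN log aggregation must be enabled on the cluster):

yarn logs -applicationId application_1449047829255_0001 | less

The failed attempts (attempt_1449047829255_0001_m_000001_*) typically contain the real error behind the "Timed out after 600 secs" message, such as a malformed input row or a stalled JDBC connection.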

Oracle Sqoop Retrieves 0 Records

I have a table in Oracle XE 11g
SQL> create table bloblkup (
2 id NUMBER PRIMARY KEY,
3 name varchar(28) NOT NULL,
4 fdata BLOB
5 );
Table created.
SQL> desc bloblkup
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
ID                                        NOT NULL NUMBER
NAME                                      NOT NULL VARCHAR2(28)
FDATA                                              BLOB
populated with
SQL> select * from bloblkup;
ID NAME
---------- ----------------------------
FDATA
--------------------------------------------------------------------------------
1 photo.jpg
032135435135
From the cluster I attempt to Sqoop this table into HDFS
sqoop import --connect jdbc:oracle:thin:@Rhea:1521:xe --username SYSTEM --password oracle --table bloblkup --columns 'name' -m 1
It executes to completion every time, but reports:
15/03/24 09:14:39 INFO mapreduce.ImportJobBase: Retrieved 0 records.
I can retrieve databases and tables.
I am logging in as SYSTEM, which created and owns the table.
I have also found that I can query tables such as all_tables and return rows, just not tables that I have created through sqlplus.
I added the -m 1 flag after the first attempt failed because Sqoop was unable to locate a primary key for the table. I then added a primary key to the table using ALTER TABLE, with no change.
Thoughts?
console output:
[root@sandbox ~]# sqoop import --connect jdbc:oracle:thin:@Rhea:1521:xe --username SYSTEM --password oracle --table bloblkup --columns 'name' -m 1
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/03/24 09:14:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4.2.1.1.0-385
15/03/24 09:14:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/03/24 09:14:02 INFO manager.SqlManager: Using default fetchSize of 1000
15/03/24 09:14:02 INFO tool.CodeGenTool: Beginning code generation
15/03/24 09:14:04 INFO manager.OracleManager: Time zone has been set to GMT
15/03/24 09:14:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM bloblkup t WHERE 1=0
15/03/24 09:14:05 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/ce267f99c7e1b14da474c2c395368b67/bloblkup.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/03/24 09:14:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/ce267f99c7e1b14da474c2c395368b67/bloblkup.jar
15/03/24 09:14:08 INFO manager.OracleManager: Time zone has been set to GMT
15/03/24 09:14:08 INFO mapreduce.ImportJobBase: Beginning import of bloblkup
15/03/24 09:14:09 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/03/24 09:14:10 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/03/24 09:14:10 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.1.91:8050
15/03/24 09:14:12 INFO db.DBInputFormat: Using read commited transaction isolation
15/03/24 09:14:13 INFO mapreduce.JobSubmitter: number of splits:1
15/03/24 09:14:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1427151026592_0037
15/03/24 09:14:14 INFO impl.YarnClientImpl: Submitted application application_1427151026592_0037
15/03/24 09:14:14 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1427151026592_0037/
15/03/24 09:14:14 INFO mapreduce.Job: Running job: job_1427151026592_0037
15/03/24 09:14:27 INFO mapreduce.Job: Job job_1427151026592_0037 running in uber mode : false
15/03/24 09:14:27 INFO mapreduce.Job: map 0% reduce 0%
15/03/24 09:14:38 INFO mapreduce.Job: map 100% reduce 0%
15/03/24 09:14:39 INFO mapreduce.Job: Job job_1427151026592_0037 completed successfully
15/03/24 09:14:39 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=107031
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=8553
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=8553
Total vcore-seconds taken by all map tasks=8553
Total megabyte-seconds taken by all map tasks=2138250
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=76
CPU time spent (ms)=2170
Physical memory (bytes) snapshot=145907712
Virtual memory (bytes) snapshot=897458176
Total committed heap usage (bytes)=75497472
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
15/03/24 09:14:39 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 28.8478 seconds (0 bytes/sec)
15/03/24 09:14:39 INFO mapreduce.ImportJobBase: Retrieved 0 records.
Update the Oracle driver version: ojdbc6.jar appears to work but can be flaky with JDK 1.7, so use ojdbc7.jar.
Also, you must COMMIT database changes in SQL*Plus for them to persist; uncommitted rows are invisible to Sqoop's read-committed JDBC session.
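That matches the symptom here exactly: a row inserted in SQL*Plus but never committed is visible to the inserting session yet invisible to Sqoop. A hedged illustration from the shell, with the connection details from the question:

sqlplus SYSTEM/oracle@Rhea:1521/xe <<'EOF'
INSERT INTO bloblkup (id, name) VALUES (2, 'photo2.jpg');
COMMIT;
EXIT
EOF

Without the COMMIT, a subsequent sqoop import still reports Retrieved 0 records even though the same SQL*Plus session can see the row.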
Check whether the JDBC driver is under $SQOOP_HOME/lib; if not, copy the ojdbc6.jar file to the
/usr/lib/sqoop/lib/ directory.
Provide more details from the console.
If everything is fine, then add --target-dir to see the output in that specific directory:
/usr/bin/sqoop import --connect jdbc:oracle:thin:system/system@<IP address>:1521:xe --username <username> -P --table <database name>.<table name> --columns "<column names>" --target-dir <target directory path> -m 1
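A small usage note: after the import completes, the written files can be inspected directly (the path is whatever was passed to --target-dir):

hadoop fs -ls <target directory path>
hadoop fs -cat '<target directory path>/part-m-00000' | head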
