Sqoop Imported Failed: Cannot convert SQL type 2005 when trying to import Oracle table - oracle

I get the following error when trying to import a table from an Oracle database as a parquet file.
ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005
This question has already been raised here, but the proposed solution does not help me.
I am trying to import a table from the command line using the following command, with the parameters in <> filled in with their corresponding values:
sqoop import --connect jdbc:oracle:thin:@<host>:<port>/<service> --username <user> --password <password> --hive-import --query 'SELECT * FROM <DB>.<table> WHERE $CONDITIONS' --split-by <ID> --hive-database <HIVE_DB> --hive-table <HIVE_TABLE> --incremental append --check-column <ID> --map-column-hive <ID>=integer --compression-codec=snappy --target-dir=/user/hive/<FOLDER> --as-parquetfile --last-value 0 -m 1
Does anyone know how to solve this? I am not an expert on the Oracle database being sqooped, but the error seems to be due to the presence of CLOB data types.
I am running this command on CDH 5.8 with Sqoop 1.4.6.
Running the job without --as-parquetfile results in a sqoop job that seems to get stuck at map 0% reduce 0%.

Use --map-column-java to map the CLOB column to a Java String.
For example, if you have a CLOB column C1, use:
--map-column-java C1=String
Check the Sqoop docs for more details.
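For example, applied to the command from the question, a minimal sketch would look like this (CLOB_COL is only a placeholder for whichever column Oracle reports as a CLOB; everything else is as in the question):
sqoop import --connect jdbc:oracle:thin:@<host>:<port>/<service> --username <user> --password <password> --hive-import --query 'SELECT * FROM <DB>.<table> WHERE $CONDITIONS' --split-by <ID> --hive-database <HIVE_DB> --hive-table <HIVE_TABLE> --incremental append --check-column <ID> --map-column-hive <ID>=integer --map-column-java CLOB_COL=String --compression-codec=snappy --target-dir=/user/hive/<FOLDER> --as-parquetfile --last-value 0 -m 1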

Related

Getting Protocol violation error while sqoop import from oracle to hive

I am trying to import data from Oracle to Hive through the sqoop import command, but I am getting a java.sql.SQLException: protocol violation error. I checked and found one text column with length 4000.
When I removed that column and ran the sqoop command, it worked, so it is only because of that column that I am getting the protocol violation error.
Is this because of the length or something else?
Can someone help me solve this? Below is the sqoop command I am using:
sqoop import --connect jdbc:oracle:thin:@:port/servicename --username --password --query "select * from table_name where $CONDITIONS" --hive-drop-import-delims --target-dir /user/test --map-column-java -m 1
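Following the --map-column-java answer above, a hedged sketch of the same command with the long text column explicitly mapped to a Java String would be (LONG_TEXT_COL is a placeholder for the actual 4000-character column; whether this resolves the protocol violation depends on that column's real type):
sqoop import --connect jdbc:oracle:thin:@:port/servicename --username --password --query "select * from table_name where $CONDITIONS" --hive-drop-import-delims --map-column-java LONG_TEXT_COL=String --target-dir /user/test -m 1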

sqoop incremental job is failing due to org.kitesdk.data.DatasetOperationException

I am trying to import data from Oracle to a Hive table using a sqoop incremental job with the Parquet file format, but the job is failing with the error below:
Error: org.kitesdk.data.DatasetOperationException: Failed to append
{"CLG_ID": "5",.....19/03/27 00:37:06 INFO mapreduce.Job: Task Id :
attempt_15088_130_m_000_2, Status : FAILED
Command to create the saved job:
sqoop job -Dhadoop.security.credential.provider.path=jceks://xxxxx
--create job1 -- import --connect "jdbc:oracle:thinxxxxxx" --verbose --username user1 --password-alias alisas --query "select CLG_ID,.... from CLG_TBL where \$CONDITIONS" --as-parquetfile --incremental
append --check-column CLG_TS --target-dir /hdfs/clg_data/ -m 1
Command to execute the saved job:
sqoop job -Dhadoop.security.credential.provider.path=jceks:/xxxxx
--exec job1 -- --connect "jdbc:oracle:xxx"
--username user1 --password-alias alisas --query "select CLG_ID,.... from CLG_TBL where \$CONDITIONS" --target-dir /hdfs/clg_data/ -m 1
--hive-import --hive-database clg_db --hive-table clg_table --as-parquetfile
This error is a known issue. We faced the same problem a couple of weeks ago and found this. Here is the link.
Description of the problem or behavior
In HDP 3, managed Hive tables must be transactional (hive.strict.managed.tables=true). Transactional tables with Parquet format are not supported by Hive. Hive imports with --as-parquetfile must use external tables by specifying --external-table-dir.
Associated error message
Table db.table failed strict managed table checks due to the
following reason: Table is marked as a managed table but is not
transactional.
Workaround
When using --hive-import with --as-parquetfile, users must also provide --external-table-dir with a fully qualified location of the table:
sqoop import ... --hive-import
--as-parquetfile
--external-table-dir hdfs:///path/to/table
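Following that workaround, a sketch of the saved job with the Hive options from the exec command folded in and --external-table-dir added (the hdfs:///hdfs/clg_data/ location is an assumption based on the --target-dir in the question; adjust to your environment):
sqoop job -Dhadoop.security.credential.provider.path=jceks://xxxxx --create job1 -- import --connect "jdbc:oracle:thinxxxxxx" --verbose --username user1 --password-alias alisas --query "select CLG_ID,.... from CLG_TBL where \$CONDITIONS" --as-parquetfile --incremental append --check-column CLG_TS --target-dir /hdfs/clg_data/ --hive-import --hive-database clg_db --hive-table clg_table --external-table-dir hdfs:///hdfs/clg_data/ -m 1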

Sqoop import job error org.kitesdk.data.ValidationException for Oracle

Sqoop import job for Oracle 11g fails with the error:
ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
81fdfb8245ab4898a719d4dda39e23f9_C46010.HISTCONTACT is not
alphanumeric (plus '_')
Here's the complete command:
$ sqoop job --create ingest_amsp_histcontact -- import --connect "jdbc:oracle:thin:@<IP>:<PORT>/<SID>" --username "c46010" -P --table C46010.HISTCONTACT --check-column ITEM_SEQ --target-dir /tmp/junk/amsp.histcontact -as-parquetfile -m 1 --incremental append
$ sqoop job --exec ingest_amsp_histcontact
It's an incremental import with Parquet format. Surprisingly, it works pretty well if I use another format like --as-textfile.
This is a similar issue to Sqoop job fails with KiteSDK validation error for Oracle import, but I've already used ojdbc6, and switching to ojdbc7 doesn't work either.
Sqoop version: 1.4.7
Oracle version: 11g
Thanks,
Yusata
I know it is kind of late, but I faced the same problem and solved it by omitting the parquet file option.
Try running the job without
-as-parquetfile
There's a workaround: omitting the "." character in the --table parameter works for me, so instead of --table <schema>.<table_name>, I use --table <table_name>. But this doesn't work if you import a table from another schema in Oracle.
The problem is the "." in the --target-dir option. Workaround: change the target dir to "/tmp/junk/amsp_histcontact". When the sqoop job finishes, rename the HDFS target dir to "/tmp/junk/amsp.histcontact".
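A sketch of that rename-based workaround, keeping the rest of the original job unchanged (hdfs dfs -mv is the standard HDFS shell rename):
$ sqoop job --create ingest_amsp_histcontact -- import --connect "jdbc:oracle:thin:@<IP>:<PORT>/<SID>" --username "c46010" -P --table C46010.HISTCONTACT --check-column ITEM_SEQ --target-dir /tmp/junk/amsp_histcontact -as-parquetfile -m 1 --incremental append
$ sqoop job --exec ingest_amsp_histcontact
$ hdfs dfs -mv /tmp/junk/amsp_histcontact /tmp/junk/amsp.histcontact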

Error with sqoop import from mysql to hbase

I started learning sqoop recently with the Cloudera CDH5 VM.
I created a MySQL table from a CSV file with the columns baseid, date, cars, kms.
Database used: mysql
Table created: uberdata
In the HBase shell, I created a table named myuberdatatable with the column family uber_details.
I checked with the scan command and saw an empty table with 0 rows.
To transfer the data from MySQL to HBase:
sqoop import jdbc:mysql://localhost/mysql --username root --password cloudera
--table uberdata --hbase-table myuberdatatable --column-family trip_details
--hbase-row-key base -m 1
I am getting the following error:
Syntax error, unexpected tIdentifier
with a mark showing before jdbc.
It could be a small error, but I tried to find a solution on Stack Overflow.
Can anyone help fix this? Thanks in advance.
Yes, it is a syntax error. You have missed the --connect option in the sqoop import statement.
Please use this format (tested):
sqoop import --connect jdbc:mysql://localhost/emp --username root --password cloudera --table employee --hbase-table empdump --column-family emp_id --hbase-row-key id -m 1
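Applied to the original command (keeping the asker's table and row key as written, and using the uber_details column family that was actually created in the HBase shell), a sketch would be:
sqoop import --connect jdbc:mysql://localhost/mysql --username root --password cloudera --table uberdata --hbase-table myuberdatatable --column-family uber_details --hbase-row-key base -m 1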

Oraoop disabled for Sqoop import

I'm using the Hortonworks HDP Sandbox, and I’ve installed Oraoop per the instructions, but whenever I run a Sqoop import I get the message “oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.”. I’m not sure what else I need to do for it to pick it up. I have verified that the oraoop driver is in my sqoop lib directory. The imports do work, but they are just using the oracle driver, and I would like to play around with some of the features that you get with Oraoop.
This is the command I'm running:
sqoop-import --connect jdbc:oracle:thin:@<ip>:1521/sid --username myUser -P --query "select * from mytable where \$CONDITIONS" -split-by sequence_id -as-sequencefile --target-dir /user/hue/data/deactivatedsponsor
If the '--query' argument is specified in place of the '--table' argument, the Oraoop connector is not used.
The following is mentioned in the Sqoop documentation:
Data Connector for Oracle and Hadoop accepts responsibility for those Sqoop Jobs with the following attributes:
Oracle-related
Table-Based - Jobs where the table argument is used and the specified object is a table.
The following command should use the Oraoop connector. I have included the "--direct" option as well, which indicates to Sqoop that Oraoop should be used.
sqoop-import --connect jdbc:oracle:thin:@<ip>:1521/sid --direct --username myUser -P --table mytable -split-by sequence_id -as-sequencefile --target-dir /user/hue/data/deactivatedsponsor --columns <columns list> --where <where condition if needed>
The Oraoop connector cannot process the --query option; when you use --query, Sqoop falls back to its generic Oracle connector.
So instead of using --query, use --table for the import.
Hope this helps!
