How to use Sqoop to import Oracle CLOB data into Avro files on HDFS - oracle

I am getting a strange error when sqooping data from an Oracle DB to HDFS.
Sqoop is not able to import CLOB data into Avro files on Hadoop.
This is the sqoop import error:
ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005
Do we need to add any extra arguments to the sqoop import statement for it to correctly import CLOB data into Avro files?

Update: Found the solution. We need to add --map-column-java for the CLOB columns.
For example: if the column name is clob, we pass --map-column-java clob=String for Sqoop to import the CLOB column.
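
For context, a minimal sketch of a full import command with this mapping; the connection string, credentials, table name, and column name below are placeholders, not from the original post:

    # Map the CLOB column to a Java String so Sqoop can write it into the Avro file
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott \
      --password-file /user/scott/.sqoop-password \
      --table MY_TABLE \
      --map-column-java CLOB_COL=String \
      --as-avrodatafile \
      --target-dir /data/my_table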

Related

How to retain datatypes when importing data from Oracle to HDFS using sqoop?

We are using Sqoop to import data from Oracle to HDFS.
In HDFS we are creating an Avro file.
The issue we are facing is that dates are being converted to long and all other datatypes are being converted to string.
Is there any way to preserve the datatypes when importing data using Sqoop?
Thanks

Sqoop mapping all datatypes as string

I'm importing a table from Oracle to an S3 directory using Amazon EMR. The files are being imported as Avro, and Sqoop exports the avsc file with all columns as String.
Does anyone know what to do so that Sqoop maps the correct datatypes?
Use --map-column-java to map to the appropriate data type. For Hive you can use --map-column-hive.
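
As a hedged illustration of the first flag (the connection string, bucket, table, and column names are hypothetical, not from the question):

    # --map-column-java overrides the Java (and therefore Avro) type per column;
    # --map-column-hive would similarly override types when Sqoop creates a Hive table.
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott \
      --password-file /user/scott/.sqoop-password \
      --table MY_TABLE \
      --as-avrodatafile \
      --target-dir s3://my-bucket/sqoop/my_table \
      --map-column-java AMOUNT=Double,IS_ACTIVE=Boolean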

Hive throws error after sqooping data

I want to import data from a database to HDFS in Parquet format and then populate a Hive table. I can't use sqoop import --hive-import because Sqoop moves the data from the --target-dir to the Hive metastore dir.
So I am obliged to create the Hive schema with sqoop create-hive-table, convert the Hive table to Parquet with SET FILEFORMAT parquet, change the location of the Hive table to point to the suitable files in HDFS, and finally import the data into the table using sqoop import --as-parquetfile (a sketch of this workflow appears below).
I am facing a problem in Hive: I cannot preview the data of my table because of this error:
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.LongWritable
1) How can I solve this problem?
2) Is there a better solution for this use case?
What is your Hive version? If it is 1.0.0, this is a bug. Please follow this link.
This bug is fixed in Hive 1.2.0.
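
For reference, a hedged sketch of the workflow described in the question above; the connection string, database, table, and directory names are placeholders:

    # 1) Generate the Hive table definition from the Oracle table
    sqoop create-hive-table \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott --password-file /user/scott/.sqoop-password \
      --table MY_TABLE --hive-table my_db.my_table

    # 2) Switch the table to Parquet and point it at the import directory
    hive -e "ALTER TABLE my_db.my_table SET FILEFORMAT PARQUET"
    hive -e "ALTER TABLE my_db.my_table SET LOCATION 'hdfs:///data/my_table'"

    # 3) Import the data as Parquet into that directory
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott --password-file /user/scott/.sqoop-password \
      --table MY_TABLE \
      --as-parquetfile \
      --target-dir /data/my_table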

sqoop not import datatype varchar2 from oracle 9i to hadoop

I have a table in Oracle 9i and I want to send the data to HDFS.
I am trying to do it with Sqoop, but the VARCHAR2 columns are not imported. I mean
that this data isn't arriving in the HDFS file, therefore I can't see it in my
Hive table.
Could anybody help me, please?
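
For context, a minimal sketch of the kind of import being attempted (the connection string, table, and column names are hypothetical); explicitly listing the VARCHAR2 columns with --columns is one way to confirm they are actually being selected:

    sqoop import \
      --connect jdbc:oracle:thin:@dbhost:1521:ORASID \
      --username scott --password-file /user/scott/.sqoop-password \
      --table MY_TABLE \
      --columns "ID,NAME_VC2,DESCRIPTION_VC2" \
      --target-dir /data/my_table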

Sqoop not loading CLOB type data into hive table properly

I am trying to use a Sqoop job to import data from Oracle, and one of the columns in the Oracle table is of data type CLOB and contains newline characters.
In this case, the option --hive-drop-import-delims is not working: the Hive table doesn't read the \n characters properly.
Please suggest how I can import the CLOB data into the target directory with all characters parsed properly.
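
One hedged sketch, assuming the fix from the first question also applies here: map the CLOB column to String with --map-column-java so that --hive-drop-import-delims can act on it (the column, table, and connection details are placeholders):

    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott --password-file /user/scott/.sqoop-password \
      --table MY_TABLE \
      --map-column-java NOTES_CLOB=String \
      --hive-drop-import-delims \
      --hive-import --hive-table my_db.my_table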
