ERROR tool.BaseSqoopTool - sqoop

I am trying to load data into a Hive table from Teradata using Sqoop.
I am using CDH 4.3. I am getting the following error:
ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-import
Can anyone tell me what the problem is?
This is my script:
sqoop import --hive-overwrite --hive-drop-import-delims --warehouse-dir "/warehouse" --hive-table aster_sq \
--connect jdbc:teradata://xxxxx/DATABASE=xxxx \
--table aster2 --username xxxx --password xxxxx --hive-import \
--fields-terminated-by ',' --lines-terminated-by '\n'

Based on the exception, I would say that you are using the "Cloudera Connector Powered by Teradata", which sadly does not currently support Hive imports, hence the exception about the unsupported parameter --hive-import. You can easily work around the issue by using the connector to import the data into HDFS as-is and loading it into Hive yourself using the LOAD DATA command. Another workaround is to use the older "Cloudera Connector for Teradata", which does support Hive imports. This should be fixed in upcoming releases.
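For concreteness, a minimal sketch of that two-step workaround, reusing the connect string and table from the question (the HDFS path /warehouse/aster2, and the assumption that aster_sq already exists as a comma-delimited Hive table, are mine):
sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxx \
--username xxxx --password xxxxx --table aster2 \
--target-dir /warehouse/aster2 \
--fields-terminated-by ',' --lines-terminated-by '\n'
hive -e "LOAD DATA INPATH '/warehouse/aster2' OVERWRITE INTO TABLE aster_sq;"
Since this bypasses Sqoop's Hive integration, the target table has to be created up front with a layout matching the delimiters used in the import.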

The --hive-import is apparently supported for this command. Did you make sure to install the Teradata jar files?
http://blog.cloudera.com/blog/2012/01/cloudera-connector-for-teradata-1-0-0/

Related

Sqoop Imported Failed: Cannot convert SQL type 2005 when trying to import Oracle table

I get the following error when trying to import a table from an Oracle database as a parquet file.
ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005
This question has already been raised here, but the proposed solution does not help me.
I am trying to import a table from the command line using the following command, with the parameters in <> filled in with their corresponding values:
sqoop import --connect jdbc:oracle:thin:@<host>:<port>/<service> --username <user> --password <password> --hive-import --query 'SELECT * FROM <DB>.<table> WHERE $CONDITIONS' --split-by <ID> --hive-database <HIVE_DB> --hive-table <HIVE_TABLE> --incremental append --check-column <ID> --map-column-hive <ID>=integer --compression-codec=snappy --target-dir=/user/hive/<FOLDER> --as-parquetfile --last-value 0 -m 1
Does anyone know how to solve this? I am not an expert on the Oracle database being sqooped, but the issue seems to be due to the presence of CLOB data types.
I am running this command on CDH 5.8 with Sqoop 1.4.6.
Running the job without --as-parquetfile results in a sqoop job that seems to get stuck at map 0% reduce 0%.
Use --map-column-java to map the CLOB data type to a Java String.
For example, if you have a CLOB column C1, use:
--map-column-java C1=String
Check the docs for more details.
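For instance, folded into the import command from the question, it might look like this (a sketch; C1 stands in for the actual CLOB column name):
sqoop import --connect jdbc:oracle:thin:@<host>:<port>/<service> --username <user> --password <password> \
--hive-import --query 'SELECT * FROM <DB>.<table> WHERE $CONDITIONS' --split-by <ID> \
--map-column-java C1=String \
--hive-database <HIVE_DB> --hive-table <HIVE_TABLE> -m 1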

Sqoop job fails with KiteSDK validation error for Oracle import

I am attempting to run a Sqoop job to load from an Oracle DB into Parquet format on a Hadoop cluster. The job is incremental.
Sqoop version is 1.4.6. Oracle version is 12c. Hadoop version is 2.6.0 (distro is Cloudera 5.5.1).
The Sqoop command is (this creates the job, and executes it):
$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--as-parquetfile \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH> \
--incremental append \
--check-column <TABLEPRIMARYKEY>
$ sqoop job --exec myJob
Error on execute:
16/02/05 11:25:30 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
05112528000000918_2088_<USERNAME>.<TABLENAME>
is not alphanumeric (plus '_')
at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
at org.kitesdk.data.Datasets.create(Datasets.java:239)
at org.kitesdk.data.Datasets.create(Datasets.java:307)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:80)
at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:444)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Troubleshooting Steps:
0) HDFS is stable, other Sqoop jobs are functional, Oracle source DB is up and the connection has been tested.
1) I tried creating a synonym in Oracle so that I could simply specify the --table option as:
--table TABLENAME (without the username)
This gave me an error that the table name was not correct. The --table option needs the full USERNAME.TABLENAME.
Error:
16/02/05 12:04:46 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table <TABLENAME>. Please ensure that your table name is correct.
2) I made sure that this is a Parquet issue: I removed the --as-parquetfile option and the job was successful.
3) I wondered if this is somehow caused by the incremental options. I removed the --incremental append & --check-column options and the job was successful. This confuses me.
4) I tried the job with MySQL and it was successful.
Has anyone run into something similar? Is there a way (or is it even advisable) to disable the Kite validation? It seems that the dataset is being created with dots ("."), which Kite SDK then complains about - but this is an assumption on my part, as I am not too familiar with Kite SDK.
Thanks in advance,
Jose
Resolved. There seems to be a known issue with JDBC connectivity to Oracle 12c. Using the OJDBC6 driver (instead of OJDBC7) did the trick. FYI - the OJDBC jar is installed in /usr/share/java/ and a symbolic link is created in /installpath.../lib/sqoop/lib/
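For reference, a sketch of that layout (the Sqoop lib path is abbreviated in the original, so <SQOOP_HOME> is a placeholder):
cp ojdbc6.jar /usr/share/java/
ln -s /usr/share/java/ojdbc6.jar <SQOOP_HOME>/lib/ojdbc6.jar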
As reported by user @Remya Senan, breaking the parameter
--hive-table my_hive_db_name.my_hive_table_name
into the separate parameters
--hive-database my_hive_db_name
--hive-table my_hive_table_name
did the trick for me.
My environment was
Sqoop v1.4.7
Hive 2.3.3
Tip: I was on emr-5.19.0
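In context, the fix looks something like this (a sketch based on a generic import; the connection details are placeholders):
sqoop import --connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> --username <USERNAME> -P \
--table <TABLENAME> --hive-import \
--hive-database my_hive_db_name --hive-table my_hive_table_name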
I also got this error when I was Sqoop-importing all tables as Parquet files on CDH 5.8. Looking at the error message, I felt this implementation does not support directories with "-" in their name. Based on this understanding, I removed the "-" from the directory name and re-ran the sqoop import command, and everything worked fine. Hope this helps!

Oraoop disabled for Sqoop import

I'm using the Hortonworks HDP Sandbox, and I've installed Oraoop per the instructions, but whenever I run a Sqoop import I get the message "oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled." I'm not sure what else I need to do for it to pick it up. I have verified that the Oraoop driver is in my Sqoop lib directory. The imports do work, but they are just using the Oracle driver, and I would like to play around with some of the features that you get with Oraoop.
This is the command I'm running:
sqoop-import --connect jdbc:oracle:thin:@<ip>:1521/sid --username myUser -P --query "select * from mytable where \$CONDITIONS" -split-by sequence_id -as-sequencefile --target-dir /user/hue/data/deactivatedsponsor
If the --query argument is specified in place of the --table argument, the Oraoop connector is not used.
The following is mentioned in the Sqoop documentation:
Data Connector for Oracle and Hadoop accepts responsibility for those Sqoop Jobs with the following attributes:
Oracle-related
Table-Based - Jobs where the table argument is used and the specified object is a table.
The following command should use the Oraoop connector. I have included the --direct option as well, which indicates to Sqoop that Oraoop should be used.
sqoop-import --connect jdbc:oracle:thin:@<ip>:1521/sid --direct --username myUser -P --table mytable --split-by sequence_id --as-sequencefile --target-dir /user/hue/data/deactivatedsponsor --columns <columns list> --where <where condition if needed>
The Oraoop connector cannot process the --query option; when you use --query, Sqoop automatically falls back to the standard connector.
So instead of using --query, use --table for the import.
Hope this helps!!
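For example, the --query from the question can usually be rewritten in --table form so that Oraoop takes over (a sketch; add --where only if the original query filtered rows):
sqoop-import --connect jdbc:oracle:thin:@<ip>:1521/sid --direct --username myUser -P \
--table mytable --split-by sequence_id --as-sequencefile \
--target-dir /user/hue/data/deactivatedsponsor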

sqoop import issue with mysql

I have a Hadoop HA setup based on CDH5. I tried to import tables from MySQL using Sqoop, but the import failed with the following error.
15/03/20 12:47:53 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@33573e93 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@33573e93 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
I used the following command:
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/haddata --username root --password password --table authors --hive-import
My MySQL server version is 5.1.73-3, and I have used versions 5.1.34 and 5.1.17 of mysql-connector-java.
My Sqoop version is 1.4.5-cdh5.3.2.
Please let me know any suggestions or comments.
Try including the option --driver com.mysql.jdbc.Driver in the import command.
Try the modified command below, which should suit your purpose:
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/haddata --driver com.mysql.jdbc.Driver --username root --password password --table authors --hive-import
Include the driver argument --driver com.mysql.jdbc.Driver in the sqoop command.
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/<db name> --username **** --password **** --table <table name> --hive-import --driver com.mysql.jdbc.Driver
The --driver parameter forces Sqoop to use the latest mysql-connector-java.jar installed for MySQL on the Sqoop machine.
Try mysql-connector-java-5.1.31.jar; it is compatible with Sqoop 1.4.5.
The mysql-connector-java-5.1.17.jar driver does not work with Sqoop 1.4.5.
See:
https://issues.apache.org/jira/browse/SQOOP-1400
If you have com.mysql.jdbc_5.1.5.jar or any version of com.mysql.jdbc_5.X.X.jar in your $HADOOP_HOME/bin folder, remove it and execute your Sqoop query.
Including the option --driver com.mysql.jdbc.Driver in the import command worked for me.
Sqoop does not ship with third-party JDBC drivers. You must download them separately and save them to the /var/lib/sqoop/ directory on the server.
Note:
The JDBC drivers need to be installed only on the machine where Sqoop runs. You do not need to install them on all hosts in your Hadoop cluster.
You can download the driver from here: https://dev.mysql.com/downloads/connector/j/5.1.html
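A sketch of the install step, assuming the downloaded jar is named mysql-connector-java-5.1.37-bin.jar and sits in the current directory:
cp mysql-connector-java-5.1.37-bin.jar /var/lib/sqoop/
chmod 644 /var/lib/sqoop/mysql-connector-java-5.1.37-bin.jar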
Try the exact command below.
sqoop import --connect "jdbc:mysql://localhost:3306/books" \
--username=root --password=root --table authors --as-textfile --target-dir=/datasqoop/authors_db --columns "id, name, email" --split-by id --driver com.mysql.jdbc.Driver
This will resolve your issues.
Find the jar locations that are being used in Sqoop; in my case, it points to the link /usr/share/java/mysql-connector-java.jar.
When I check that link, it points to mysql-connector-java-5.1.17.jar:
/usr/share/java/mysql-connector-java.jar -> mysql-connector-java-5.1.17.jar
As 5.1.17 has this issue, try 5.1.37 or higher.
unlink /usr/share/java/mysql-connector-java.jar
ln -s /usr/share/java/mysql-connector-java-5.1.37.jar /usr/share/java/mysql-connector-java.jar
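A quick check that the link now points at the newer driver (assuming the paths above):
ls -l /usr/share/java/mysql-connector-java.jar
# should show: mysql-connector-java.jar -> mysql-connector-java-5.1.37.jar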

Sqoop : import data from Oracle

I am trying to use Sqoop to import data from an Oracle DB.
I have placed the Oracle JDBC Driver (ojdbc6.jar) into SQOOP_HOME/lib.
My JDK is version 1.6.
Here is my command:
sqoop import --hive-import --connect jdbc:oracle:thin:@<ip_server>:1521/db --table ENTITE --username username --password password
But when I launch the command, I get this error:
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.oracleDriver
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.oracleDriver
I don't understand why Sqoop can't connect to my db server.
Thanks for your help
If you're using Sqoop 1.4.2 (assuming so, based on ojdbc6.jar above), then see Kathleen's comments about --driver usage here, as it shouldn't be required:
https://issues.apache.org/jira/browse/SQOOP-457
With Sqoop 1.4.2 and ojdbc6.jar dropped into my sqoop/lib, this string works with HDP 1.3 and MapR 2.0:
sqoop import --connect "jdbc:oracle:thin:#(description=(address=(protocol=tcp)(host=myhost)(port=1521))(connect_data=(service_name=myservice)))" \
--username USER --table SCHEMA.TABLE_NAME --hive-import --hive-table SCHEMA.TABLE_NAME \
--num-mappers 1 --verbose -P \
If you have access to MySQL and/or SQL Server, etc., test those too and make sure your lib directory is getting picked up. SQL Server is/was supposed to be supported in Sqoop 1.4, but the docs and attempts to use it proved otherwise:
http://www.microsoft.com/en-us/download/confirmation.aspx?id=11774 - here is what you want for SQL Server testing.
cheers.
You need to add the Oracle JDBC driver to Sqoop's lib directory.
You have to download the Oracle connector jar file and copy it to the lib folder of Sqoop.
The jar file can be downloaded from http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
Copy this jar file to your Sqoop lib folder (/usr/lib/sqoop/lib) and run the sqoop command.
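A sketch of that copy step plus a quick connectivity check, assuming ojdbc6.jar was downloaded to the current directory:
cp ojdbc6.jar /usr/lib/sqoop/lib/
sqoop list-tables --connect jdbc:oracle:thin:@<ip_server>:1521/db --username username -P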
Check your Sqoop classpath (by adding an echo to the sqoop script) and make sure your driver is on it. I faced the same problem and resolved it that way.
Look at the error message: Could not load db driver class: oracle.jdbc.oracleDriver
You need to type oracle.jdbc.OracleDriver with a capital "O", since Java is case sensitive.
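If you do pass the driver class explicitly, the corrected spelling would look like this (a sketch based on the command in the question):
sqoop import --hive-import --driver oracle.jdbc.OracleDriver \
--connect jdbc:oracle:thin:@<ip_server>:1521/db --table ENTITE --username username --password password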
The error says that Sqoop can't load the Oracle driver class because there is no ojdbc driver jar file on its path. First, you have to add the ojdbc driver jar to the lib folder of your Sqoop home. You can download it here:
http://www.java2s.com/Code/Jar/o/Downloadojdbc6jar.htm
The Oracle ojdbc6.jar needs to be copied to the sqoop/lib directory to make it work.
You can state the Oracle driver you use like so:
sqoop import --hive-import --driver oracle.jdbc.driver.OracleDriver --connect jdbc:oracle:thin:@<ip_server>:1521/db --table ENTITE --username username --password password
sqoop import --connect "jdbc:oracle:thin:#(description=(address=(protocol=tcp)(host=hostip)(port=1521))(connect_data=(service_name=servicename)))" --username user --password pwd --table schema.tablename --hive-import --num-mappers 1 --verbose -P
