Sqoop Import Error "Could not load db driver class" with Amazon EMR Service - hadoop

I have created an EMR cluster with Hadoop, Sqoop, and Spark. I am trying a Sqoop import but I am getting the error "Could not load db driver class: com.mysql.jdbc.Driver". My question is: in which location do we put the MySQL driver?
I have tried putting the JAR at these paths:
1. /etc/sqoop/conf/
2. /etc/sqoop/lib/ (after creating the lib folder)
sqoop import --connect jdbc:mysql://--.--.--.--:3306/xyz --table
sample_submission --target-dir /home/sqoop7 --username x --password y -m 1;

JARs for Sqoop go under the lib directory in SQOOP_HOME.
Full path: /usr/lib/sqoop/lib/
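A minimal sketch of that step on the EMR master node, assuming the MySQL Connector/J JAR has already been downloaded to the home directory (the 5.1.46 version number is an assumption; use whichever Connector/J you downloaded):
sudo cp ~/mysql-connector-java-5.1.46-bin.jar /usr/lib/sqoop/lib/
After copying the JAR, re-run the sqoop import command above.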

Related

Want to copy Oracle data to Hadoop

I am following the tutorial at the link mentioned below:
java-spark-tutorial
I have loaded data into Oracle. Now I need to import it into Hadoop. I am new to Hadoop, but I am familiar with Ambari. Can anyone please suggest how we can load data from Oracle into Hadoop using the Ambari tool?
You can import rows from Oracle into Hadoop using Sqoop. A typical command would be:
sqoop import --connect jdbc:oracle:thin:<username>/<password>@<IP address>:1521:<db name> --username <username> -P --table <database name>.<table name> --columns "<column names>" --target-dir <target directory path in hdfs> -m 1
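For example, with assumed placeholder values (host 10.0.0.5, SID orcl, schema HR, and table EMPLOYEES are all hypothetical), that template would be filled in as:
sqoop import --connect jdbc:oracle:thin:@10.0.0.5:1521:orcl --username hr -P --table HR.EMPLOYEES --columns "EMPLOYEE_ID,FIRST_NAME,SALARY" --target-dir /user/hadoop/employees -m 1
Here the password is supplied interactively via -P rather than embedded in the connect string.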

How can we import all tables in an RDBMS into a custom Hive database?

I want to "import-all-tables" using Sqoop from MySQL into a custom Hive database (not the Hive default database).
Steps tried:
Created a custom database in Hive under "/user/hive/warehouse/Custom.db"
Assigned all permissions on this directory, so there will be no issues with Sqoop writing into it.
Used the command below with the "--hive-database" option on the CDH 5.7 VM:
sqoop import-all-tables
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db"
--username retail_dba
--password cloudera
--hive-database "/user/hive/warehouse/sqoop_import_retail.db"
Tables were created in the Hive default database only, not in the custom DB in this case ("sqoop_import_retail.db").
Otherwise it tries to create tables in the previous HDFS directories (/user/cloudera/categories) and errors out stating that the table already exists:
16/08/30 00:07:14 WARN security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/categories already exists
16/08/30 00:07:14 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/categories already exists
[cloudera@quickstart etc]$
How do I address these issues?
1. Creating tables in the custom Hive DB
2. Flushing previous directory references with Sqoop
You did not mention --hive-import in your command, so in your case it imports to HDFS under /user/cloudera/.
You are executing the query again; that is why you get the exception
Output directory hdfs://quickstart.cloudera:8020/user/cloudera/categories already exists
Modify the import command:
sqoop import-all-tables --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --hive-database custom --hive-import
It will fetch all the tables from the MySQL retail_db database and create the corresponding tables in the custom database in Hive.
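For the leftover directories from the earlier run (the second issue), a minimal sketch, assuming the paths from the error message above, is to remove them in HDFS before re-running:
hdfs dfs -rm -r /user/cloudera/categories
Repeat for any other table directory the previous attempt created; for single-table imports, the --delete-target-dir option of sqoop import can remove the target directory automatically.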

sqoop import is showing an error

I'm using Hadoop 2.5.1 and Sqoop 1.4.6.
I am using sqoop import to import a table from a MySQL database to be used with Hadoop. It is showing the following error.
Sqoop Command
sqoop import --connect jdbc:mysql://localhost/<dbname> --username hadoopsqoop --password hadoop@123 --table tablename -m 1
Exception
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.fs.FSOutputSummer
Is there any way to figure out the issue?
I figured out the issue. I set HADOOP_HOME correctly and that solved my problem.
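A minimal sketch of that fix, assuming Hadoop 2.5.1 is installed under /usr/local/hadoop (the path is an assumption; use your own installation directory):
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
With HADOOP_HOME pointing at the right installation, Sqoop picks up the matching Hadoop classes instead of a mismatched version.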
How can you import without mentioning where to store the file? Try this:
sqoop import --connect jdbc:mysql://localhost/dbname --username hadoopsqoop --password hadoop@123 --table tablename --target-dir 'hdfspath' -m 1

sqoop import issue with mysql

I have a Hadoop HA setup based on CDH 5. I have tried to import tables from MySQL using Sqoop, and it failed with the following error.
15/03/20 12:47:53 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@33573e93 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@33573e93 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
I have used the below command:
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/haddata --username root --password password --table authors --hive-import
My MySQL server version is 5.1.73-3, and I have used versions 5.1.34 and 5.1.17 of mysql-connector-java.
The Sqoop version is 1.4.5-cdh5.3.2.
Please let me know any suggestions/comments.
Try including the option --driver com.mysql.jdbc.Driver in the import command.
Try using the below modified command, which should suit your purpose:
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/haddata --driver com.mysql.jdbc.Driver --username root --password password --table authors --hive-import
follow this link
Include the driver argument --driver com.mysql.jdbc.Driver in sqoop command.
sqoop import --connect jdbc:mysql://<mysql hostname>:3306/<db name> --username **** --password **** --table <table name> --hive-import --driver com.mysql.jdbc.Driver
The --driver parameter forces Sqoop to use the latest mysql-connector-java.jar installed for MySQL on the Sqoop machine.
Try mysql-connector-java-5.1.31.jar; it is compatible with Sqoop 1.4.5.
The mysql-connector-java-5.1.17.jar driver does not work with Sqoop 1.4.5.
Refer to:
https://issues.apache.org/jira/browse/SQOOP-1400
If you have com.mysql.jdbc_5.1.5.jar or any version of the com.mysql.jdbc_5.X.X.jar file in the $HADOOP_HOME/bin folder, then remove it and execute your Sqoop query.
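As a quick sketch of that cleanup (the filename pattern comes from the answer above; the exact version on your machine may differ):
ls $HADOOP_HOME/bin/com.mysql.jdbc_5.*.jar
rm $HADOOP_HOME/bin/com.mysql.jdbc_5.*.jar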
Including the option --driver com.mysql.jdbc.Driver in the import command worked for me.
Sqoop does not ship with third-party JDBC drivers. You must download them separately and save them to the /var/lib/sqoop/ directory on the server.
Note:
The JDBC drivers need to be installed only on the machine where Sqoop runs. You do not need to install them on all hosts in your Hadoop cluster.
You can download the driver from here: https://dev.mysql.com/downloads/connector/j/5.1.html
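A minimal sketch of that install, assuming the Connector/J archive was downloaded from the page above into the current directory (the 5.1.46 version number is an assumption):
tar -xzf mysql-connector-java-5.1.46.tar.gz
sudo cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /var/lib/sqoop/
sudo chmod 644 /var/lib/sqoop/mysql-connector-java-5.1.46-bin.jar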
Try the exact command as below.
sqoop import --connect "jdbc:mysql://localhost:3306/books"
--username=root --password=root --table authors --as-textfile --target-dir=/datasqoop/authors_db --columns "id, name, email" --split-by id --driver com.mysql.jdbc.Driver
This will resolve your issues.
Find the jar locations that are being used by Sqoop; in my case, it is pointing to the link /usr/share/java/mysql-connector-java.jar.
When I check the link /usr/share/java/mysql-connector-java.jar, it points to mysql-connector-java-5.1.17.jar:
/usr/share/java/mysql-connector-java.jar -> mysql-connector-java-5.1.17.jar
As 5.1.17 has this issue, try 5.1.37 or higher:
unlink /usr/share/java/mysql-connector-java.jar
ln -s /usr/share/java/mysql-connector-java-5.1.37.jar /usr/share/java/mysql-connector-java.jar
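To confirm the new link resolves to the intended driver (this readlink check is an addition, not part of the original answer):
readlink /usr/share/java/mysql-connector-java.jar
# should print /usr/share/java/mysql-connector-java-5.1.37.jar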

Sqoop : import data from Oracle

I am trying to use Sqoop to import data from an Oracle DB.
I have placed the Oracle JDBC driver (ojdbc6.jar) into SQOOP_HOME/lib.
My JDK version is 1.6.
Here is my command:
sqoop import --hive-import --connect jdbc:oracle:thin@<ip_server>:1521/db --table ENTITE --username username --password password
But when I launch the command, I get this error:
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.oracleDriver
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.oracleDriver
I don't understand why Sqoop can't connect to my db server.
Thanks for your help
If you're using Sqoop 1.4.2 (assumed based on the ojdbc6.jar above), then see the comments about --driver usage from Kathleen here, as it shouldn't be required:
https://issues.apache.org/jira/browse/SQOOP-457
With Sqoop 1.4.2 and ojdbc6.jar dropped into my sqoop/lib, this string works with HDP 1.3 and MapR 2.0:
sqoop import --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=myhost)(port=1521))(connect_data=(service_name=myservice)))" \
--username USER --table SCHEMA.TABLE_NAME --hive-import --hive-table SCHEMA.TABLE_NAME \
--num-mappers 1 --verbose -P
If you have access to MySQL and/or SQL Server, etc., test those too and make sure your lib directory is getting picked up. SQL Server is/was supposed to be supported in Sqoop 1.4, but the docs and attempting to use it proved otherwise:
http://www.microsoft.com/en-us/download/confirmation.aspx?id=11774 - here is what you want for sql server testing.
cheers.
You need to add the Oracle JDBC driver to the Sqoop lib directory.
You have to download the Oracle connector jar file and copy that jar file to the lib folder of Sqoop.
The jar file can be downloaded from http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
Copy this jar file to your Sqoop lib folder (/usr/lib/sqoop/lib) and run the sqoop command.
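A minimal sketch of that copy step, assuming ojdbc6.jar was downloaded to the current directory and Sqoop is installed under /usr/lib/sqoop:
sudo cp ojdbc6.jar /usr/lib/sqoop/lib/
sudo chmod 644 /usr/lib/sqoop/lib/ojdbc6.jar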
Check your Sqoop classpath by adding an echo statement and make sure your driver is on the classpath. I faced the same problem and resolved it that way.
Look at the error message: Could not load db driver class: oracle.jdbc.oracleDriver
You need to type oracle.jdbc.OracleDriver with a capital "O", since Java is case sensitive.
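As a quick check (the jar path is an assumption; adjust it to wherever your driver lives), you can list the driver classes inside the jar to confirm the exact casing:
jar tf /usr/lib/sqoop/lib/ojdbc6.jar | grep -i oracledriver
The output should include oracle/jdbc/OracleDriver.class, which is the spelling to use.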
The error says that Sqoop can't load the Oracle driver class because there is no ojdbc driver jar file in its path. First, you have to add the ojdbc driver jar to the lib folder of your Sqoop home. You can download it here:
http://www.java2s.com/Code/Jar/o/Downloadojdbc6jar.htm
Oracle's ojdbc6.jar needs to be copied to the sqoop/lib directory to make it work.
You can state the Oracle driver you use like so:
sqoop import --hive-import --driver oracle.jdbc.driver.OracleDriver --connect jdbc:oracle:thin@<ip_server>:1521/db --table ENTITE --username username --password password
sqoop import --connect "jdbc:oracle:thin:#(description=(address=(protocol=tcp)(host=hostip)(port=1521))(connect_data=(service_name=servicename)))" --username user --password pwd --table schema.tablename --hive-import --num-mappers 1 --verbose -P
