Sqoop + Cloudera Manager: JDBC driver not found

I am trying to set up the JDBC driver for Sqoop in Cloudera Manager.
Here is some background on my setup:
1) I have a 5-machine Hadoop cluster running CDH 4.5 on Ubuntu
2) I installed Sqoop through Cloudera Manager
I have already downloaded the latest MySQL Connector/J jar and copied it to the following locations:
sudo cp /home/clouderasudo/jbdcDriver/mysql-connector-java-5.1.29-bin.jar /usr/lib/sqoop/lib/
sudo cp /home/clouderasudo/jbdcDriver/mysql-connector-java-5.1.29-bin.jar /usr/lib/oozie/lib/
But I still get the error below when I try to set up a new job in Sqoop with com.mysql.jdbc.Driver as the JDBC driver class:
Can't load specified driver
Any help appreciated.

You may need to copy the database driver file into the directory that contains the Sqoop libraries. In my case it is /opt/cloudera/parcels/CDH/lib/sqoop/lib.
Savio
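The Sqoop lib directory differs between package and parcel installs, so it can help to check which candidate directories exist on a host and whether the connector jar is actually in them. The following is only a minimal sketch, assuming the two paths mentioned in this thread; the function name check_driver_dirs is hypothetical, and on a multi-node cluster it would have to be run on every host that runs Sqoop.

```shell
#!/bin/sh
# check_driver_dirs (hypothetical helper): for each directory given as an
# argument, report whether it exists and whether it holds a MySQL connector jar.
check_driver_dirs() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            continue
        fi
        # ls exits non-zero when the glob matches nothing.
        if ls "$dir"/mysql-connector-java-*.jar >/dev/null 2>&1; then
            echo "driver found: $dir"
        else
            echo "no driver: $dir"
        fi
    done
}

# Candidate locations taken from the question and the answer above.
check_driver_dirs /usr/lib/sqoop/lib /opt/cloudera/parcels/CDH/lib/sqoop/lib
```

A directory reported as "missing" is not used by that install at all, so copying the jar there has no effect.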

For me, the driver had to go into this HDFS path:
/user/oozie/share/lib/lib_20140909154837/sqoop
Then restart Oozie in Cloudera Manager.

Related

How to access remote hive and hadoop file system using sqoop import command with kerberos authentication?

I am using Sqoop 1.4.5 and Hadoop 3.3.4. My requirement is to connect to a remote Hive and a remote Hadoop file system with Kerberos authentication, without changing the configuration files.
Is it possible to do this without amending the Hadoop and Sqoop configuration files? If yes, which parameters need to be changed in the configuration files?

HDP 2.2 Sandbox Could not find SQOOP directory

I was following this tutorial:
http://hortonworks.com/hadoop-tutorial/import-microsoft-sql-server-hortonworks-sandbox-using-sqoop/
I am unable to find /usr/lib/sqoop/lib.
I can see Sqoop running in the sandbox, but I cannot find the folder to drop the drivers into.
Where else could I place the JDBC driver? Also, where is the installation directory for Sqoop?
It's in /usr/hdp/2.2.0.0-2041/sqoop/lib
Run the command below as root on the sandbox to locate Sqoop:
find / -name "sqoop"
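Since the version number in the HDP path changes from release to release, searching for the directory is more reliable than hard-coding it. A rough sketch, assuming the lib directory always ends in sqoop/lib; the helper name find_sqoop_lib is made up for illustration.

```shell
#!/bin/sh
# find_sqoop_lib (hypothetical helper): print every directory under the given
# root whose path ends in sqoop/lib. Errors from unreadable directories are
# discarded and not treated as a failure, so only matches are printed.
find_sqoop_lib() {
    find "$1" -type d -path '*/sqoop/lib' 2>/dev/null || true
}

# Searching /usr is an assumption that fits the HDP layout mentioned above;
# pass / instead to search the whole filesystem (slower).
find_sqoop_lib /usr
```

The JDBC driver jar would then be copied into the directory that this prints.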

Hadoop issue with Sqoop installation

I have Hadoop (pseudo-distributed mode), Hive, Sqoop and MySQL installed on my local machine.
But when I try to run Sqoop, it gives me the following error:
Error: /usr/lib/hadoop does not exist!
Please set $HADOOP_COMMON_HOME to the root of your Hadoop installation.
I then filled in the sqoop-env-template.sh file with all the information. Below is a snapshot of the sqoop-env-template.sh file.
Even after providing the Hadoop and Hive paths, I get the same error.
I've installed:
Hadoop 1.0.3 in /home/hduser/hadoop
Hive 0.11.0 in /home/hduser/hive
Sqoop 1.4.4 in /home/hduser/sqoop
and the MySQL connector jar, version 5.1.29
Could anybody please shed some light on what is going wrong?
sqoop-env-template.sh is only a template: the configure script never sources it. To have your custom settings loaded, copy it to $SQOOP_HOME/conf/sqoop-env.sh and edit the copy.
Note: here is the relevant excerpt from bin/configure-sqoop for version 1.4.4:
SQOOP_CONF_DIR=${SQOOP_CONF_DIR:-${SQOOP_HOME}/conf}
if [ -f "${SQOOP_CONF_DIR}/sqoop-env.sh" ]; then
    . "${SQOOP_CONF_DIR}/sqoop-env.sh"
fi
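Putting the answer into practice could look roughly like the sketch below. It assumes the install paths from the question and that, for a single-directory Hadoop 1.x install, HADOOP_MAPRED_HOME points at the same root as HADOOP_COMMON_HOME. The function name make_sqoop_env is hypothetical. Note that it overwrites any existing sqoop-env.sh.

```shell
#!/bin/sh
# Sketch: turn the shipped template into a sqoop-env.sh that configure-sqoop
# will source. SQOOP_HOME defaults to the path from the question.
SQOOP_HOME="${SQOOP_HOME:-/home/hduser/sqoop}"
conf_dir="${SQOOP_CONF_DIR:-$SQOOP_HOME/conf}"

# make_sqoop_env (hypothetical helper)
# $1 = conf directory, $2 = Hadoop root, $3 = Hive root
make_sqoop_env() {
    # Start from the template; this replaces an existing sqoop-env.sh.
    cp "$1/sqoop-env-template.sh" "$1/sqoop-env.sh" || return 1
    # Append the exports instead of editing the commented-out lines.
    {
        echo "export HADOOP_COMMON_HOME=$2"
        echo "export HADOOP_MAPRED_HOME=$2"
        echo "export HIVE_HOME=$3"
    } >> "$1/sqoop-env.sh"
}

# Only act when the template is actually present on this machine.
if [ -f "$conf_dir/sqoop-env-template.sh" ]; then
    make_sqoop_env "$conf_dir" /home/hduser/hadoop /home/hduser/hive
fi
```

After this, running sqoop again should pick up the exported paths, because configure-sqoop sources sqoop-env.sh from the conf directory.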

Oozie + Sqoop: JDBC Driver Jar Location

I have a 6-node Cloudera-based Hadoop cluster and I'm trying to connect to an Oracle database from a Sqoop action in Oozie.
I have copied my ojdbc6.jar into the Sqoop lib location (which for me happens to be /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib/) on all the nodes and have verified that I can run a simple 'sqoop eval' from all 6 nodes.
Now when I run the same command using Oozie's Sqoop action, I get "Could not load db driver class: oracle.jdbc.OracleDriver".
I have read this article about using shared libs, and it makes sense to me when we're talking about dependencies specific to my task/action/workflow. But I see a JDBC driver installation as an extension to Sqoop, so I think it belongs in the Sqoop installation lib.
Now the question is: while Sqoop sees the ojdbc6 jar I have put into its lib folder, how come my Oozie workflow doesn't see it?
Is this expected, or am I missing something?
As an aside, what do you guys think is the appropriate location for a JDBC driver jar?
Thanks in advance!
The JDBC driver jar (and any jars it depends on) should go in your Oozie sharelib folder on HDFS. I'm running Hortonworks Data Platform 1.2 instead of Cloudera 4.2, so the details may vary, but my JDBC driver is located in /user/oozie/share/lib/sqoop. This should allow you to run Sqoop with the JDBC driver via Oozie.
It is not necessary to put the JDBC driver jar in the Sqoop lib on the data nodes. In my setup I can't even run a simple sqoop eval from the command line on my data nodes. I understand the logic for why you thought this would work. The reason the JDBC driver jar needs to be on HDFS is so that all the data nodes have access to it. Your solution should accomplish the same goal. I'm not familiar enough with the inner workings of Oozie to say why using the sharelib works but your solution does not.
In CDH5, you should put the jar in '/user/oozie/share/lib/lib_${timestamp}/sqoop', and after that you must update the sharelib or restart Oozie.
To update the sharelib:
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
If you are using CDH 5, the JDBC driver jar (and any jars it depends on) should go in the '/user/oozie/share/lib/lib_timestamp/sqoop' folder on HDFS.
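Because the CDH5 sharelib directory name contains a timestamp, the target path has to be looked up before uploading the jar. The sketch below shows one way to pick the newest lib_<timestamp> directory from a listing. The helper name latest_sharelib and the second timestamp in the demo are invented for illustration, and the cluster commands in the trailing comments are untested assumptions about how the pieces would fit together.

```shell
#!/bin/sh
# latest_sharelib (hypothetical helper): read paths on stdin, one per line, and
# print the lib_<timestamp> directory with the newest timestamp. Lines that do
# not end in lib_<digits> are ignored. The timestamps are fixed-width digits,
# so a plain text sort puts the newest one last.
latest_sharelib() {
    grep '/lib_[0-9][0-9]*$' | sort | tail -n 1
}

# Demo with a made-up listing; the second timestamp is invented.
printf '%s\n' \
    /user/oozie/share/lib/lib_20140909154837 \
    /user/oozie/share/lib/lib_20150101120000 | latest_sharelib

# On a cluster the input would come from HDFS, roughly like this (not run here;
# jar name and Oozie URL are taken from this page and may differ on your system):
#   target="$(hadoop fs -ls /user/oozie/share/lib | awk '{print $NF}' | latest_sharelib)/sqoop"
#   hadoop fs -put mysql-connector-java-5.1.29-bin.jar "$target/"
#   oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
```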
I was facing the same issue: Oozie was not able to find the MySQL jar. I am using Cloudera 4.4, where even the oozie admin -oozie http://localhost:11000/oozie -sharelibupdate command does not work.
To resolve the issue I followed the steps below:
Create a user in Hue with hdfs and grant it admin privileges.
Using the Hue UI, upload the jar into the /user/oozie/share/lib/sqoop HDFS path.
Or you can use the command below:
hadoop fs -put /var/lib/sqoop2/mysql-connector-java.jar /user/oozie/share/lib/sqoop
Once the jar is in place, run the oozie command.

Error in sqoop import query

Scenario:
I am trying to import data from MS SQL Server into HDFS, but I am getting the following errors:
Errors:
hadoop@ubuntu:~/sqoop-1.1.0$ bin/sqoop import --connect 'jdbc:sqlserver://localhost;username=abcd;password=12345;database=HadoopTest' --table PersonInfo
11/12/09 18:08:15 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
at com.cloudera.sqoop.shims.ShimLoader.loadShim(ShimLoader.java:190)
at com.cloudera.sqoop.shims.ShimLoader.getHadoopShim(ShimLoader.java:109)
at com.cloudera.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:173)
at com.cloudera.sqoop.tool.ImportTool.init(ImportTool.java:81)
at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:411)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:170)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:196)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:205)
Question:
I have configured Sqoop successfully, so what could be the problem? I also tried connecting to the database by IP address, but the problem is the same.
How can I get rid of this error? Please suggest a solution.
Thanks.
Sqoop is now an incubator project in Apache. There is no reason Sqoop should only run with CDH and not Apache Hadoop.
The Sqoop documentation says Sqoop is compatible with Apache Hadoop 0.21 and Cloudera's Distribution of Hadoop version 3. So I think using the correct version of Apache Hadoop will also solve the problem.
SQOOP-82 is more than a year old, and there have been changes since then.
FYI, Sqoop was made part of the Hadoop 0.21 branch and was removed from Hadoop after it moved to the Apache Incubator.
Please check this issue:
Sqoop does not run with Apache Hadoop 0.20.2. The only supported platform is CDH 3 beta 2. It requires features of MapReduce not available in the Apache 0.20.2 release of Hadoop. You should upgrade to CDH 3 beta 2 if you want to run Sqoop 1.0.0.
In your sqoop import command you are missing the driver value; specify it with --driver.
Maybe this will help.
I think you should try this; it may solve your problem:
Add the port number of the database server. For MySQL, check your my.cnf file (/etc/mysql/my.cnf) for the port number.
Try this command with the port number and schema:
sqoop import --connect jdbc:mysql://localhost:3306/mydb --username root --password password --table emp -m 1
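Applied to the SQL Server import from the question, the two suggestions above (an explicit --driver and an explicit port) might be combined as in the sketch below. It only prints the command and runs nothing. The helper name sqlserver_import_cmd is hypothetical, port 1433 is SQL Server's default and an assumption about this particular server, and -P (prompt for the password) is used in place of a password on the command line. This covers the connection arguments only; it does not address the Hadoop shim error, which the earlier answers attribute to the Hadoop version.

```shell
#!/bin/sh
# sqlserver_import_cmd (hypothetical helper): print a sqoop import command for
# SQL Server with an explicit port and --driver. Nothing is executed.
# $1 = host, $2 = port, $3 = database, $4 = user, $5 = table
sqlserver_import_cmd() {
    printf '%s\n' "sqoop import --connect 'jdbc:sqlserver://$1:$2;databaseName=$3' --driver com.microsoft.sqlserver.jdbc.SQLServerDriver --username $4 -P --table $5 -m 1"
}

# Host, database, user and table come from the question; 1433 is assumed.
sqlserver_import_cmd localhost 1433 HadoopTest abcd PersonInfo
```

The Microsoft JDBC driver jar still has to be on Sqoop's classpath for the printed command to work.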
