I'm trying to use Sqoop in Hue, but I get this error:
Sqoop error: Could not get connectors.
and no Sqoop wizard appears on the page.
However, I can import data from Oracle using the sqoop shell (not sqoop2).
My questions are:
Is there anything else to configure besides putting the Oracle JDBC driver in /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/sqoop2/client-lib/?
Which directories does the sqoop2 user need permissions on (other than /var/lib/sqoop)?
Note: I still have no clue after reading this post.
Env: Hue 3.9 / Sqoop2 / CDH 5.5.0 / CM 5.5.0 / LDAP, Kerberos & Sentry installed


Talend 8.0 DBConnection to Hive (Cloudera)

I'm trying to create a connection to access Hive from Talend.
My Hive is running in cloudera-quickstart (using VirtualBox),
and in VirtualBox, my ifconfig output is:
My parameters in the Talend DB connection metadata are:
DBType : Hive
Distribution : Cloudera
Version : Cloudera CDH6.1.1
Hive Model : Standalone
Hive Server Version : Hive Server2 --jdbc:hive2://
string of connection : jdbc:hive2://
login : cloudera
password : cloudera
server :
Database: default
Additional JDBC Settings : Empty
Right now the result is: Connection failure, must change database setting.
Can anyone help me fix this issue?
Or does anyone know of a resource or tutorial for connecting Talend to Cloudera Hadoop?
I'm using Talend 8.0, Windows 10, and VirtualBox 6 + Cloudera 5.12.0.
Thank you for any advice.

Sqoop eval throws "Could not load jar into JVM" error when checking the connection

I tried to run a Sqoop eval script through the AWS EMR CLI for a Teradata connection, but got this error:
Error loading ManagerFactory information from file /usr/lib/sqoop/conf/managers.d/td_connector.txt: Could not load jar $SQOOP_HOME/lib/teradata-connector-1.6.5.jar into JVM. (Could not find class org.apache.sqoop.teradata.TeradataConnManager.)
Steps I have followed:
Logged in over SSH to an EMR cluster (version emr-6.2.0) configured with Hadoop 3 and Sqoop 1.4.7.
Downloaded the Teradata Hadoop connector 3.x from Teradata Downloads.
Moved the Teradata Hadoop connector to $SQOOP_HOME/lib and installed it.
Created the text file td_connect in /usr/lib/sqoop/conf/managers.d/ containing the text org.apache.sqoop.teradata.TeradataConnManager=$SQOOP_HOME/lib/teradata-connector-1.6.5.jar
Ran the script:
sqoop eval --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://host/database= --username username --password password --query 'select top 5 * from table'
Could you please help me identify the issue?
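The error above says the class could not be found in the jar named in the managers.d entry, so one quick check is to look inside that jar for the class. The following is only a minimal sketch in Python: `jar_contains_class` is a hypothetical helper, and the fallback `/usr/lib/sqoop` for `$SQOOP_HOME` is an assumption based on the path in the error message.

```python
import os
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) holds an entry for class_name."""
    # Java stores the class a.b.C as the archive entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Assumption: fall back to /usr/lib/sqoop when SQOOP_HOME is not set.
    sqoop_home = os.environ.get("SQOOP_HOME", "/usr/lib/sqoop")
    jar = os.path.join(sqoop_home, "lib", "teradata-connector-1.6.5.jar")
    cls = "org.apache.sqoop.teradata.TeradataConnManager"
    if not os.path.isfile(jar):
        print("jar not found:", jar)
    else:
        print(cls, "present:", jar_contains_class(jar, cls))
```

If the check reports that the class is absent, the jar and the class name in the managers.d entry do not match. Since the error message prints the path with `$SQOOP_HOME` unexpanded, it may also be worth trying an absolute path in that entry.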

Use Sqoop as a different user?

I have two users, "ashish" and "ashuser". Hadoop is installed under "ashuser",
but I mistakenly installed Sqoop under the "ashish" user.
So whenever I try to check the Sqoop version as "ashuser", I get a "Command Not Found" error.
I tried giving "ashuser" ownership of the Sqoop folder.
Is there any way I can use Sqoop as "ashuser"?

Connecting to HiveServer2 from pyspark

I am stuck on how to use pyspark to fetch data from a Hive server using JDBC.
I am trying to connect to HiveServer2 running on my local machine from pyspark using JDBC. All components (HDFS, pyspark, HiveServer2) are on the same machine.
Following is the code I am using to connect:
connProps = {"username": 'hive', "password": '', "driver": "org.apache.hive.jdbc.HiveDriver"}
spark.read.jdbc('jdbc:hive2://', table='pokes', properties=connProps)
dataframe_mysql = spark.read.format("jdbc").option("url", "jdbc:hive://localhost:10000/default").option("driver", "org.apache.hive.jdbc.HiveDriver").option("dbtable", "pokes").option("user", "hive").option("password", "").load()
Both methods above give me the same error:
org.apache.spark.sql.AnalysisException: java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate
javax.jdo.JDOFatalDataStoreException: Unable to open a test connection
to the given database. JDBC url =
jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
Terminating connection pool (set lazyInit to true if you expect to
start your database after your app).
ERROR XSDB6: Another instance of Derby may have already booted the database /home///jupyter-notebooks/metastore_db
metastore_db is located in the same directory where my Jupyter notebooks are created, but hive-site.xml specifies a different metastore location.
I have already checked other questions about the same error, which say that another spark-shell or similar process is running, but that is not the case here. Even if I try the following command when HiveServer2 and HDFS are down, I get the same error:
spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive")
I am able to connect to Hive from a Java program using JDBC. Am I missing something here? Please help. Thanks in advance.
Spark should not use JDBC to connect to Hive.
It reads from the metastore directly and skips HiveServer2.
However, "Another instance of Derby may have already booted the database" means that you're running Spark from another session, such as another Jupyter kernel that's still running. Try setting a different metastore location, or set up a remote Hive metastore backed by a local MySQL or Postgres database and edit $SPARK_HOME/conf/hive-site.xml with that information.
From SparkSQL - Hive tables:
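from os.path import abspath
from pyspark.sql import SparkSession

# warehouse_location points to the default location for managed databases and tables
warehouse_location = abspath('spark-warehouse')
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()
# spark is an existing SparkSession
spark.sql("CREATE TABLE...")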
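For the Derby error discussed above, when no other Spark session appears to be running, one further thing that may be worth checking is whether lock files were left behind in the metastore_db directory by a session that did not shut down cleanly. Derby keeps `db.lck` (and on some JVMs `dbex.lck`) in the database directory while it is booted. This is a minimal sketch only: `find_derby_locks` is a hypothetical helper, and the `metastore_db` path in the current directory is an assumption.

```python
from pathlib import Path

# Lock files Derby keeps inside a database directory while it is booted.
DERBY_LOCK_NAMES = ("db.lck", "dbex.lck")


def find_derby_locks(metastore_dir):
    """Return the Derby lock files present in metastore_dir, sorted by name."""
    root = Path(metastore_dir)
    candidates = (root / name for name in DERBY_LOCK_NAMES)
    return sorted(p for p in candidates if p.is_file())


if __name__ == "__main__":
    # Assumption: metastore_db sits in the current working directory.
    for lock in find_derby_locks("metastore_db"):
        print("lock file present:", lock)
```

A lock file that is present while no Spark or Derby process is running is a hint, not proof, of a stale lock. Remove such files only after confirming that nothing is using the database.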

Error in sqoop import query

I am trying to import data from MS SQL Server into HDFS, but I am getting the following errors:
hadoop@ubuntu:~/sqoop-1.1.0$ bin/sqoop import --connect 'jdbc:sqlserver://localhost;username=abcd;password=12345;database=HadoopTest' --table PersonInfo
11/12/09 18:08:15 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
at com.cloudera.sqoop.shims.ShimLoader.loadShim(
at com.cloudera.sqoop.shims.ShimLoader.getHadoopShim(
at com.cloudera.sqoop.tool.BaseSqoopTool.init(
at com.cloudera.sqoop.tool.ImportTool.init(
at com.cloudera.sqoop.Sqoop.runSqoop(
at com.cloudera.sqoop.Sqoop.runTool(
at com.cloudera.sqoop.Sqoop.main(
I have configured Sqoop successfully, so what could be the problem? I also tried connecting to the database by entering the IP address, but the problem is the same.
How can I fix these errors? Please suggest a solution.
Sqoop is now an incubator project in Apache. There is no reason Sqoop should only run with CDH and not Apache Hadoop.
The Sqoop documentation says Sqoop is compatible with Apache Hadoop 0.21 and Cloudera's Distribution of Hadoop version 3. So I think using the correct version of Apache Hadoop will also solve the problem.
SQOOP-82 is more than a year old, and there have been changes since then.
FYI, Sqoop was made part of the Hadoop 0.21 branch and was removed from Hadoop after it moved to the Apache Incubator.
Please check this issue:
Sqoop does not run with Apache Hadoop 0.20.2. The only supported platform is CDH 3 beta 2. It requires features of MapReduce not available in the Apache 0.20.2 release of Hadoop. You should upgrade to CDH 3 beta 2 if you want to run Sqoop 1.0.0.
In your sqoop import command, you are missing the driver class, which is set using --driver.
Maybe this will help.
I think you should try this; it may solve your problem:
Add the port number of the database server to the connection string. For MySQL, you can check the port number in your my.cnf file (/etc/mysql/my.cnf).
Try this command with the port number and schema:
sqoop import --connect jdbc:mysql://localhost:3306/mydb --username root --password password --table emp -m 1
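The two suggestions above (an explicit port and an explicit --driver) can be combined for the SQL Server case from the question. The following is only a sketch of how the command line might be assembled, not a confirmed fix for the shim error. `build_sqoop_import` is a hypothetical helper, port 1433 is assumed because it is SQL Server's default, and the driver class is the one shipped in Microsoft's JDBC driver, which must be on Sqoop's classpath.

```python
import shlex


def build_sqoop_import(host, port, database, table, user, password):
    """Assemble a `sqoop import` argument list for SQL Server (illustrative helper)."""
    # The port follows the host; the database is passed as a URL property.
    url = f"jdbc:sqlserver://{host}:{port};databaseName={database}"
    return [
        "sqoop", "import",
        "--connect", url,
        "--driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "--username", user,
        "--password", password,
        "--table", table,
        "-m", "1",
    ]


if __name__ == "__main__":
    # Values taken from the question; 1433 is an assumed default port.
    args = build_sqoop_import("localhost", 1433, "HadoopTest", "PersonInfo", "abcd", "12345")
    # shlex.join quotes the URL, whose ';' would otherwise end the shell command
    print(shlex.join(args))
```

Building the arguments as a list and quoting them at the end avoids the shell splitting the JDBC URL at the semicolon, which is why the command in the question wraps the URL in single quotes.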
