DataStax Enterprise Sqoop demo, got exceptions - sqoop

I am trying to run the Sqoop demo from DataStax Enterprise 4.8. I set up an Analytics cluster of 4 nodes, set up MySQL on another node, and populated the data as in the demo example. I followed all the steps of the demo, and everything seemed to work fine until the point where I actually ran the Sqoop data migration command. All databases are created correctly and the cluster is running fine (I can see it with nodetool status and with OpsCenter), but when I run the sqoop command, I get an exception:
host# /bin/dse sqoop --options-file /usr/share/dse/demos/sqoop/import.options
/usr/share/dse/bin/dse.in.sh: line 4: /bin/dse-client-tool: No such file or directory
Unable to start sqoop: jobtracker not found
The import.options file:
cql-import
--table
npa_nxx
--cassandra-keyspace
npa_nxx
--cassandra-table
npa_nxx_data
--cassandra-column-mapping
npa:npa,nxx:nxx,latitude:lat,longitude:lon,state:state,city:city
--connect
jdbc:mysql://10.xxx.xxx.xxx/npa_nxx_demo
--username
root
--password
xxxxx
--cassandra-host
10.xxx.xxx.xxx,10.xxx.xxx.xxx
Does anyone have an idea why this error occurs? I reinstalled DSE and still got the same error... Thanks.

I found the reason: you need to create a symlink to dse-client-tool in the /bin directory:
# ln -s /usr/share/dse/bin/dse-client-tool /bin/dse-client-tool
Then it works. I am not sure why the link was not created during the installation...
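For completeness, a quick way to confirm the fix is to check that the link resolves and then re-run the same demo command from the question:
ls -l /bin/dse-client-tool
/bin/dse sqoop --options-file /usr/share/dse/demos/sqoop/import.options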

Start DSE as an Analytics node. For a package installation, edit /etc/default/dse and set HADOOP_ENABLED=1, then start the DSE service; for a tarball installation, start Cassandra with the Hadoop tracker enabled:
bin/dse cassandra -t
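A minimal sketch of the package-install variant (this assumes the stock /etc/default/dse, which ships with HADOOP_ENABLED=0, and a service-based install):
# enable the Analytics (Hadoop) workload, then restart the DSE service
sudo sed -i 's/^HADOOP_ENABLED=0/HADOOP_ENABLED=1/' /etc/default/dse
sudo service dse restart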

Related

Sqoop command not found when running through Oozie

When I run the Sqoop script from the CLI, it runs fine without any issue. But when I run it using Oozie, it fails with "Sqoop command not found". It seems Sqoop is not installed on the other data nodes. So, to run a Sqoop script using Oozie, must Sqoop be installed on all data nodes, or is there an alternative? Currently we have one master and 2 data nodes.
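One alternative worth trying, rather than installing Sqoop on every data node, is to let Oozie ship the Sqoop sharelib with the launcher job, so the action's jars travel with it. A minimal job.properties sketch (the workflow path here is hypothetical):
# make Oozie's system sharelib (which includes the Sqoop action's jars) available to the job
oozie.use.system.libpath=true
oozie.wf.application.path=hdfs://master:8020/user/hadoop/sqoop-wf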

Stratio Sqoop complains of missing hadoop library

I have pulled stratio sqoop docker containers as documented here:
https://stratio.atlassian.net/wiki/display/SQOOP0X2/Example+mysql+to+kafka
but when I start the process of creating a link between MySQL and Kafka, the step
create link -c generic-jdbc-connector
complains of a missing Hadoop library.
Is there some other prerequisite for this?
Thx
It turned out I had missed the step of connecting the sqoop client to the sqoop server, and got confused by the error message.
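For anyone else who hits this: the missing step is pointing the Sqoop2 shell at the server before creating links, along these lines (the host name sqoop-server is an assumption; use whatever address the Stratio server container is reachable at):
sqoop:000> set server --host sqoop-server --port 12000 --webapp sqoop
sqoop:000> show version --all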

Sqoop import cannot locate needed JDBC file

OK, someone has already asked this question once, but it seems that didn't help, so here is my question.
I've got Hadoop 2.5.1 installed on my CentOS 7 machine. It's set up to run in pseudo-distributed mode. I ran a few MapReduce sample jobs, so I assume all the configuration is fine.
I've downloaded Sqoop 1.4.5, installed a MySQL database (MariaDB), and created the needed table.
Now I'm running the following command:
bin/sqoop export --connect jdbc:mysql://localhost/sqoopdb \
--table sqooptable --export-dir /user/dennis \
--fields-terminated-by '\t' --username root --password ***
It returns the following error message:
14/11/12 06:11:54 ERROR tool.ExportTool: Encountered IOException
running export job: java.io.FileNotFoundException: File does not
exist:
hdfs://localhost:9000/home/dennis/Sqoop/lib/mysql-connector-java-5.1.34-bin.jar
The file mentioned in the error does exist in the local file system; moreover, I've given it chmod 777, just so that everyone is able to access it.
Does anyone have any ideas, please?
The way I understand it, Sqoop looks for the mentioned file somewhere in HDFS, whereas it is located in the local file system.
I've made it work. It is definitely the worst solution possible, but no one offered me anything better: I created the same folder structure in HDFS and copied the bloody JAR there. Now you can judge me :) The same thing is written up on my blog.
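Concretely, the workaround amounts to mirroring the local path into HDFS, roughly like this (using the exact path from the error message above):
hdfs dfs -mkdir -p /home/dennis/Sqoop/lib
hdfs dfs -put /home/dennis/Sqoop/lib/mysql-connector-java-5.1.34-bin.jar /home/dennis/Sqoop/lib/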

Issue while submitting Jobs in Sqoop2 Client API

I am using Hadoop 2.2.0 on a two-node cluster; Hadoop is configured correctly and working fine. Now I am trying to install Sqoop 2 (sqoop-1.99.3-bin-hadoop200) on it, and when I try to access the Sqoop 2 web UI (at localhost:12000) I get the following:
Apache Sqoop ROOT
And when I try to access cloudera.com:12000/sqoop/version, I get the following:
HTTP Status 404 -
And when I use this in the sqoop client:
[stratapps@cloudera2 ~]$ sqoop.sh client
Sqoop home directory: /usr/local/sqoop2
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000> set server --host cloudera.com --port 12000 --webapp sqoop
Server is set successfully
sqoop:000> show version --all
client version:
Sqoop 1.99.3 revision 2404393160301df16a94716a3034e31b03e27b0b
Compiled by mengweid on Fri Oct 18 14:15:53 EDT 2013
Exception has occurred during processing command
Exception: com.sun.jersey.api.client.UniformInterfaceException Message: GET http://cloudera.com:12000/sqoop/version returned a response status of 404 Not Found
My catalina.properties file's common.loader entry looks like:
common.loader=
${catalina.base}/lib,
${catalina.base}/lib/*.jar,
${catalina.home}/lib,
${catalina.home}/lib/*.jar,
${catalina.home}/../lib/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/common/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/common/lib/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/hdfs/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/tools/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/tools/lib/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/yarn/*.jar,
/usr/local/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/*.jar
My sqoop.properties file's org.apache.sqoop.submission.engine.mapreduce.configuration.directory entry looks like:
# Hadoop configuration directory
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/hadoop-2.2.0/etc/hadoop
Please share your input; I googled a lot on this but didn't find any solution yet.
Thank you,
Malleshwar
Why are you setting the hostname of the sqoop server to cloudera.com? Use your machine's host address, where the sqoop server is running. If you are using a single machine, change cloudera.com to localhost.
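In this setup that would look like the following (assuming the server really is listening on port 12000 of the same machine):
sqoop:000> set server --host localhost --port 12000 --webapp sqoop
sqoop:000> show version --all
If the server is reachable, show version --all should report a server version alongside the client version instead of the 404.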

Error in sqoop import query

Scenario:
I am trying to import data from MS SQL Server to HDFS, but I am getting errors:
Errors:
hadoop#ubuntu:~/sqoop-1.1.0$ bin/sqoop import --connect 'jdbc:sqlserver://localhost;username=abcd;password=12345;database=HadoopTest' --table PersonInfo
11/12/09 18:08:15 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.1
at com.cloudera.sqoop.shims.ShimLoader.loadShim(ShimLoader.java:190)
at com.cloudera.sqoop.shims.ShimLoader.getHadoopShim(ShimLoader.java:109)
at com.cloudera.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:173)
at com.cloudera.sqoop.tool.ImportTool.init(ImportTool.java:81)
at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:411)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:170)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:196)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:205)
Question:
I have configured Sqoop successfully, so what could be the problem? I also tried connecting to the database by entering its IP address, but I get the same problem.
How can I resolve this error? Please suggest a solution.
Thanks.
Sqoop is now an Apache Incubator project. There is no reason Sqoop should only run with CDH and not Apache Hadoop.
The Sqoop documentation says Sqoop is compatible with Apache Hadoop 0.21 and Cloudera's Distribution of Hadoop version 3. So I think using the correct version of Apache Hadoop will also solve the problem.
SQOOP-82 is more than a year old, and there have been changes since then.
FYI, Sqoop was made part of the Hadoop 0.21 branch and was removed from Hadoop after it moved to the Apache Incubator.
Please check this issue:
Sqoop does not run with Apache Hadoop 0.20.2. The only supported platform is CDH 3 beta 2. It requires features of MapReduce not available in the Apache 0.20.2 release of Hadoop. You should upgrade to CDH 3 beta 2 if you want to run Sqoop 1.0.0.
In your sqoop import command you are missing the driver value; specify it with --driver. Maybe this will help.
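A sketch of the same import with the driver made explicit (the class name is the standard Microsoft JDBC driver, and port 1433 is SQL Server's default; both are assumptions about this setup):
bin/sqoop import --connect 'jdbc:sqlserver://localhost:1433;username=abcd;password=12345;database=HadoopTest' \
    --driver com.microsoft.sqlserver.jdbc.SQLServerDriver --table PersonInfo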
I think you should try this; it may solve your problem:
Add the port number of the SQL server. For the port number, check your my.cnf file (/etc/mysql/my.cnf).
Try this command with the port number and schema:
sqoop import --connect jdbc:mysql://localhost:3306/mydb --username root --password password --table emp -m 1
