sqoop to transfer data to HDFS from Teradata - hadoop

I am using Sqoop to transfer data from Teradata to HDFS.
I get the error below:
-bash-4.1$ sqoop import --connection-manager com.cloudera.sqoop.manager.DefaultManagerFactory --driver com.teradata.jdbc.TeraDriver \
--connect jdbc:teradata://dwsoat.dws.company.co.uk/DATABASE=TS_72258_BASELDB \
--username userid -P --table ADDRESS --num-mappers 3 \
--target-dir /user/nathalok/ADDRESS
Warning: /apps/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/10/29 14:00:14 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.1.3
14/10/29 14:00:14 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/10/29 14:00:14 ERROR sqoop.ConnFactory: Sqoop wasn't able to create connnection manager properly. Some of the connectors supports explicit --driver and some do not. Please try to either specify --driver or leave it out.
14/10/29 14:00:14 ERROR tool.BaseSqoopTool: Got error creating database manager: java.io.IOException: java.lang.NoSuchMethodException: com.cloudera.sqoop.manager.DefaultManagerFactory.<init>(java.lang.String, com.cloudera.sqoop.SqoopOptions)
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:165)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:243)
at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:84)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:494)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Caused by: java.lang.NoSuchMethodException: com.cloudera.sqoop.manager.DefaultManagerFactory.<init>(java.lang.String, com.cloudera.sqoop.SqoopOptions)
at java.lang.Class.getConstructor0(Class.java:2810)
at java.lang.Class.getDeclaredConstructor(Class.java:2053)
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:151)
... 9 more
-bash-4.1$
Any help will be appreciated.

To get Teradata working with Sqoop on a Cloudera distribution, do the following:
1. Install the Teradata JDBC jars in /var/lib/sqoop. For me these were terajdbc4.jar and tdgssconfig.jar.
2. Install either the Cloudera Connector Powered by Teradata or the Cloudera Connector for Teradata somewhere on your filesystem (I prefer /var/lib/sqoop).
3. In /etc/sqoop/conf/managers.d/, create a file (of any name) containing the line com.cloudera.connector.teradata.TeradataManagerFactory=<location of connector jar>. For example, my file /etc/sqoop/conf/managers.d/teradata contains com.cloudera.connector.teradata.TeradataManagerFactory=/var/lib/sqoop/sqoop-connector-teradata-1.2c5.jar.
There are other ways to install the Teradata connector as well; for example, it may be easier to use Cloudera Manager.
If you're still having trouble, try reaching out to the Sqoop mailing list.
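The registration step (the managers.d file) can be sketched as a small shell helper. This is only a sketch and not part of the original answer: the function name register_teradata_connector is hypothetical, and the paths in the usage comment are the ones quoted in the answer, which may differ on your system.

```shell
# Hypothetical helper: register the Teradata connector with Sqoop by writing
# a one-line file into the managers.d directory.
#   $1 = managers.d directory (for example /etc/sqoop/conf/managers.d)
#   $2 = full path to the connector jar
register_teradata_connector() {
    managers_dir="$1"
    connector_jar="$2"
    mkdir -p "$managers_dir" || return 1
    # Format: <manager factory class>=<location of connector jar>
    printf '%s=%s\n' \
        'com.cloudera.connector.teradata.TeradataManagerFactory' \
        "$connector_jar" > "$managers_dir/teradata"
}

# Usage with the paths from the answer (needs write access to /etc/sqoop/conf):
# register_teradata_connector /etc/sqoop/conf/managers.d \
#     /var/lib/sqoop/sqoop-connector-teradata-1.2c5.jar
```

Once the connector is registered, my understanding is that the import is normally run without --connection-manager and --driver, so that Sqoop can pick the Teradata manager from the jdbc:teradata:// connect string. Treat that as an assumption to verify against the connector's documentation.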

Related

Sqoop's import-all-table is not working

Hi, I am trying to import all tables from all schemas of an Oracle DB into HDFS.
This is my script:
sqoop-import-all-tables -Dmapreduce.job.user.classpath.first=true -Dhadoop.security.credential.provider.path=jceks://x.jceks --connect jdbc:oracle:thin:@x.x.x.x:1521/yyyy --username xxxx --password xxxx --warehouse-dir /data-warehouse/xxxx --as-avrodatafile --compression-codec snappy --autoreset-to-one-mapper
When I run this script, I get no error, but no job is started either.
Output:
Warning: /usr/hdp/2.6.2.0-205/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
find: failed to restore initial working directory: Permission denied
18/08/11 08:32:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.2.0-205
18/08/11 08:32:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/08/11 08:32:51 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/08/11 08:32:51 INFO manager.SqlManager: Using default fetchSize of 1000
18/08/11 08:32:53 INFO manager.OracleManager: Time zone has been set to IST
It seems that the user configured in Sqoop does not have enough privileges to query and export the data from Oracle. Check that you can connect to the Oracle database and run a query from the command line as that user.
Regards !!!

error while using sqoop for data transfer to hdfs

I used Sqoop to transfer data between HDFS and Oracle as shown below:
hadoop@jiogis-cluster-jiogis-master-001:~$ sqoop import --connect jdbc:oracle:gis-scan.ril.com/SAT --username=r4g_viewer --password=viewer_123 --table=R4G_OSP.ENODEB --hive-import --hive-table=ENODEB --target-dir=user/hive/warehouse/proddb/JioCenterBoundary -- direct
When I run this command, I get the error shown below:
Warning: /volumes/disk1/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /volumes/disk1/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /volumes/disk1/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/05/09 11:11:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/05/09 11:11:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/05/09 11:11:19 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/05/09 11:11:19 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/05/09 11:11:19 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
16/05/09 11:11:19 ERROR tool.BaseSqoopTool: Got error creating database manager: java.io.IOException: No manager for connect string: jdbc:oracle:gis-scan.ril.com/SAT
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:191)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:256)
at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:89)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:593)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Your JDBC connection string does not look correct. Can you try it in this format:
--connect jdbc:oracle:thin:@//hostname:port/servicename
In your case, this is probably:
--connect jdbc:oracle:thin:@//gis-scan.ril.com:1521/SAT
You may want to double-check that the port number is correct, as the SCAN listener may not be on the default port 1521.
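To avoid typos in the connect string, the URL can be assembled from its parts. The helper below is a minimal sketch; the name oracle_thin_url is hypothetical, and the port 1521 in the example is the default that the answer suggests verifying.

```shell
# Hypothetical helper: build an Oracle thin JDBC URL that addresses a
# service name (jdbc:oracle:thin:@//host:port/service).
#   $1 = host, $2 = listener port, $3 = service name
oracle_thin_url() {
    if [ "$#" -ne 3 ]; then
        echo "usage: oracle_thin_url HOST PORT SERVICE" >&2
        return 1
    fi
    printf 'jdbc:oracle:thin:@//%s:%s/%s\n' "$1" "$2" "$3"
}

# Host and service name from the question; the port is an assumption.
oracle_thin_url gis-scan.ril.com 1521 SAT
# prints: jdbc:oracle:thin:@//gis-scan.ril.com:1521/SAT
```

The result can then be passed to Sqoop as --connect "$(oracle_thin_url gis-scan.ril.com 1521 SAT)".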

sqoop - connect to oracle and import data to HDFS in IBM BigInsights

I want to connect to my database (Oracle 10g) and import data to HDFS.
I am using the IBM BigInsights platform.
But when I run the command below, I get an exception:
sqoop import --connect jdbc:oracle:thin://<IP>:1521/DB--username xxx --password xxx--table t /lib/sqoop/sqoopout
Got exception running Sqoop:
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
at org.apache.sqoop.manager.OracleManager.makeConnection(OracleManager.java:286)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:752)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:775)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I also copied ojdbc6_g.jar into sqoop/lib.
Please help me solve this problem so that I can import data to HDFS.
Which version of BigInsights are you using? Have you placed the Oracle JDBC jar on all the nodes? Sqoop internally launches a map job whose tasks run on the data nodes, so the driver must be available there as well.
To import data from an Oracle database with Sqoop, first download the ojdbc jar and put it into the Sqoop lib folder. The jar can be downloaded from:
https://mvnrepository.com/artifact/ojdbc/ojdbc/14
https://mvnrepository.com/artifact/com.oracle/ojdbc14/10.2.0.2.0
The Sqoop command for importing data from Oracle is then:
sqoop import --connect jdbc:oracle:thin:@127.0.0.1:1521:XE --username ***** --password ****** --table table_name --columns "COL1, COL2, COL3, COL4, COL5" --target-dir /xyz/zyx -m 1
Pay attention to the --connect argument. The connection string used here has the format:
jdbc:oracle:thin:@ip_address:port_number:SID
The second format that is allowed is:
jdbc:oracle:thin:@ip_address:port_number/service_name
Hope this helps.
P.S. If you are unable to add the ojdbc jar to Sqoop's lib folder, you can also append the path of the jar file to the $HADOOP_CLASSPATH variable:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/root/shared_folder/ojdbc6.jar
P.P.S. chmod the ojdbc jar to 777 before execution.
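The HADOOP_CLASSPATH approach from the P.S. can be wrapped so that it fails early when the jar is missing or unreadable and does not add the same entry twice. This is a sketch under assumptions: the function name add_to_hadoop_classpath is hypothetical, and the jar path in the usage comment is the one from the answer.

```shell
# Hypothetical helper: append a JDBC driver jar to HADOOP_CLASSPATH.
#   $1 = path to the jar (for example an ojdbc jar)
add_to_hadoop_classpath() {
    jar="$1"
    # Fail early if the jar is missing or not readable by the current user.
    if [ ! -r "$jar" ]; then
        echo "driver jar not readable: $jar" >&2
        return 1
    fi
    # Append only if the jar is not already on the classpath.
    case ":${HADOOP_CLASSPATH:-}:" in
        *":$jar:"*) ;;
        *) HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$jar" ;;
    esac
    export HADOOP_CLASSPATH
}

# Usage with the path from the answer (run in the shell that starts Sqoop):
# add_to_hadoop_classpath /root/shared_folder/ojdbc6.jar
```

This only affects the shell session in which it is run, so it has to be executed before the sqoop command in the same session or script.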

Sqoop import not working in Hadoop 2.x

I installed Hadoop-2.0.3 and Sqoop-1.4.4 and run Hadoop in pseudo-distributed mode. When I try to import a table from an RDBMS into HDFS with the command below
master@hadoop:~/apps/sqoop-1.4.4$ bin/sqoop import --connect jdbc:mysql://localhost:3306/hadoop --username root --password root --table employees
I get the following error:
14/02/10 05:20:32 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
Can you please provide a solution for this?

SQOOP Not able to import table

I am running the command below in Sqoop:
sqoop import --connect jdbc:mysql://localhost/hadoopguide --table widgets
My Sqoop version: Sqoop 1.4.4.2.0.6.1-101
Hadoop version: Hadoop 2.2.0.2.0.6.0-101
Both are taken from the Hortonworks distribution. All the paths, such as HADOOP_HOME, HCAT_HOME, and SQOOP_HOME, are set properly. I can get the list of databases and the list of tables from the MySQL database by running the list-databases and list-tables commands in Sqoop. I can even get data with --query 'select * from widgets', but when I use the --table option I get the error below.
14/02/06 14:02:17 WARN mapred.LocalJobRunner: job_local177721176_0001
java.lang.Exception: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getInputClass(DBConfiguration.java:394)
at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.createDBRecordReader(DataDrivenDBInputFormat.java:233)
at org.apache.sqoop.mapreduce.db.DBInputFormat.createRecordReader(DBInputFormat.java:236)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
... 13 more
Specify --bindir to control where the compiled code and .jar file are placed.
Without this argument, Sqoop places the generated Java source file in your current working directory and the compiled .class and .jar files in /tmp/sqoop-<username>/compile.
Use the --bindir option and point it to your current working directory:
sqoop import --bindir ./ --connect jdbc:mysql://localhost/hadoopguide --table widgets
The problem was resolved after I copied the .class file from /tmp/sqoop-hduser/compile/ to /home/hduser/ in HDFS and also to the current working directory from which I run Sqoop.
For importing a specific table into HDFS, run:
sqoop import --connect jdbc:mysql://localhost/databasename --username root --password *** --table tablename --bindir /usr/lib/sqoop/lib/ --driver com.mysql.jdbc.Driver --target-dir /directory-name
Make sure that /usr/lib/sqoop/* and /usr/local/hadoop/* are owned by the same user; otherwise you will get an error like "Permission denied".
PS: Make sure that you have installed the MySQL Java connector before you run the command. I installed Hadoop version 2.7.3 and connector 5.0.8.
Another fix for ClassNotFoundException is to tell Hadoop to use the user classpath first (-Dmapreduce.job.user.classpath.first=true). This can be set on the command line or in an options file. The top of an import options file would be:
#Options file for Sqoop import
import
-Dmapreduce.job.user.classpath.first=true
This fixed the ClassNotFoundException for me when trying to import data with --as-avrodatafile.
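A complete options file built on that fragment might look like the sketch below. The connect string and table name are taken from the question and stand in for your own values; the file name and location are assumptions chosen for illustration.

```shell
# Write an options file for the import. In an options file, each option and
# each value goes on its own line, and lines starting with # are comments.
options_file="${TMPDIR:-/tmp}/sqoop-import-widgets.txt"
cat > "$options_file" <<'EOF'
# Options file for Sqoop import
import
-Dmapreduce.job.user.classpath.first=true
--connect
jdbc:mysql://localhost/hadoopguide
--table
widgets
--as-avrodatafile
EOF

# The import would then be started with (not executed here):
# sqoop --options-file "$options_file"
```

Keeping the -D property directly after the tool name mirrors the fragment in the answer, since generic Hadoop arguments have to come before the tool-specific ones.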
