Sqoop Job Exception - hadoop

I am getting the following exception when I try to list Sqoop jobs.
I'm not able to create Sqoop jobs because of this exception:
root#ubuntu:/usr/lib/sqoop/conf# sqoop job --list
16/04/11 01:51:44 ERROR tool.JobTool: I/O error performing job operation:
java.io.IOException: Exception creating SQL connection
    at com.cloudera.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:220)
    at com.cloudera.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:113)
    at com.cloudera.sqoop.tool.JobTool.run(JobTool.java:279)
    at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
    at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
    at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
    at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
Caused by: java.sql.SQLException: General error: java.lang.ClassFormatError: Truncated class file
    at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
    at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source)
    at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
    at org.hsqldb.jdbcDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:582)
    at java.sql.DriverManager.getConnection(DriverManager.java:185)
    at com.cloudera.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:180)
    ... 8 more
Sqoop Version: 1.3.0-cdh3u5
Please help. The commands used are below:
sqoop job --list
sqoop job --create sqoopjob21 -- import --connect jdbc:mysql://localhost/mysql1 --table emp --target-dir /importjob21 ;

This may be because Sqoop cannot find the HSQLDB database it uses to store job information. Check whether you have a "metastore.db.script" file in the Sqoop installation directory; if not, create it.
Create a file named "metastore.db.script" and put the following lines in it:
CREATE SCHEMA PUBLIC AUTHORIZATION DBA
CREATE MEMORY TABLE SQOOP_ROOT(VERSION INTEGER,PROPNAME VARCHAR(128) NOT NULL,PROPVAL VARCHAR(256),CONSTRAINT SQOOP_ROOT_UNQ UNIQUE(VERSION,PROPNAME))
CREATE MEMORY TABLE SQOOP_SESSIONS(JOB_NAME VARCHAR(64) NOT NULL,PROPNAME VARCHAR(128) NOT NULL,PROPVAL VARCHAR(1024),PROPCLASS VARCHAR(32) NOT NULL,CONSTRAINT SQOOP_SESSIONS_UNQ UNIQUE(JOB_NAME,PROPNAME,PROPCLASS))
CREATE USER SA PASSWORD ""
GRANT DBA TO SA
SET WRITE_DELAY 10
SET SCHEMA PUBLIC
INSERT INTO SQOOP_ROOT VALUES(NULL,'sqoop.hsqldb.job.storage.version','0')
INSERT INTO SQOOP_ROOT VALUES(0,'sqoop.hsqldb.job.info.table','SQOOP_SESSIONS')
Now create a "metastore.db.properties" file and put these lines in it:
#HSQL Database Engine 1.8.0.10
#Fri Aug 04 14:07:10 IST 2017
hsqldb.script_format=0
runtime.gc_interval=0
sql.enforce_strict_size=false
hsqldb.cache_size_scale=8
readonly=false
hsqldb.nio_data_file=true
hsqldb.cache_scale=14
version=1.8.0
hsqldb.default_table_type=memory
hsqldb.cache_file_scale=1
hsqldb.log_size=200
modified=no
hsqldb.cache_version=1.7.0
hsqldb.original_version=1.8.0
hsqldb.compatible_version=1.8.0
Now create a directory named ".sqoop" in your home directory if it does not already exist, put these two files there, and run your job again.
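As a sketch, assuming the two files have been written to the current directory with the contents shown above, staging them could look like this (the ~/.sqoop location is the default metastore directory this answer relies on; SQOOP_HOME_DIR is just a variable introduced here so the target can be overridden):

```shell
# Stage the two HSQLDB metastore files into ~/.sqoop.
# SQOOP_HOME_DIR can be overridden for testing; it defaults to ~/.sqoop.
SQOOP_HOME_DIR="${SQOOP_HOME_DIR:-$HOME/.sqoop}"
mkdir -p "$SQOOP_HOME_DIR"
for f in metastore.db.script metastore.db.properties; do
    # Copy only the files that actually exist in the current directory.
    if [ -f "$f" ]; then
        cp "$f" "$SQOOP_HOME_DIR/"
    fi
done
ls "$SQOOP_HOME_DIR"
```

After staging, re-run `sqoop job --list` to confirm the metastore opens cleanly.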

Related

JNI error while creating external table using gphdfs protocol : greenplum

1) Completed "One-time HDFS Protocol Installation" using link - http://gpdb.docs.pivotal.io/4360/admin_guide/load/topics/g-one-time-hdfs-protocol-installation.html#topic20
2) copied the 'csv' file on hdfs system at path - data/etl/ext01
3) created external table using following command
create external table orgData(orghk varchar(200),eff_datetime timestamp, source varchar(20), handle_id varchar(200), created_by_d varchar(100), created_datetime timestamp)
location ('gphdfs://<hostname>:8020/data/etl/ext01/part-r-00000-3eae416a-d0ff-4562-a762-d53469d42cd2.csv')
Format 'CSV' (DELIMITER ',')
However, after executing the command select * from orgData, I get the following error:
ERROR: external table gphdfs protocol command ended with error. (seg1 slice1 <hostname2>:40000 pid=4977)
Error: A JNI error has occurred, please check your installation and try again
Detail:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/input/FileInputFormat
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at sun.launcher.LauncherHelper.valid
Command: 'gphdfs://<hostname>:8040/data/etl/ext01/part-r-00000-3eae416a-d0ff-4562-a762-d53469d42cd2.csv'
External table orgdata, file gphdfs://<hostname>:8040/data/etl/ext01/part-r-00000-3eae416a-d0ff-4562-a762-d53469d42cd2.csv
Am I missing something?
Can you verify you set JAVA_HOME and HADOOP_HOME on ALL segments, then restarted the cluster?
gpssh -f clusterHostfile -e 'egrep "(JAVA_HOME|HADOOP_HOME)" ~/.bashrc | wc -l'
You should see the number 2 from each host in the cluster.
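As a sketch, the lines that check greps for would look like this in each segment host's ~/.bashrc. The paths below are assumptions for illustration only; point them at your actual JDK and Hadoop installation directories:

```shell
# Example ~/.bashrc entries on every Greenplum segment host.
# Paths are placeholders -- substitute your real install locations.
export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/usr/lib/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
```

Both exports must be present on every host, and the cluster must be restarted afterward for gphdfs to pick them up.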

sqoop - connect to oracle and import data to HDFS in IBM BigInsights

I want to connect to my database (Oracle 10g) and import data to HDFS.
I am using the IBM BigInsights platform.
But when I use the command below:
sqoop import --connect jdbc:oracle:thin://<IP>:1521/DB --username xxx --password xxx --table t /lib/sqoop/sqoopout
Got exception running Sqoop:
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
    at org.apache.sqoop.manager.OracleManager.makeConnection(OracleManager.java:286)
    at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:752)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:775)
    at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
    at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
    at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
    at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833)
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I also copied ojdbc6_g.jar into sqoop/lib.
Please help me solve the problem so that I can import data to HDFS.
What version of BigInsights are you using? Have you loaded the Oracle JDBC jar on all the nodes? Sqoop internally launches a map job that runs on the datanodes.
To Sqoop data from an Oracle database, you first need to download the OJDBC jar and put it into the Sqoop lib folder. Links for downloading the OJDBC jar:
https://mvnrepository.com/artifact/ojdbc/ojdbc/14
https://mvnrepository.com/artifact/com.oracle/ojdbc14/10.2.0.2.0
Apart from that, the Sqoop command for importing data through OJDBC is:
sqoop import --connect jdbc:oracle:thin:@127.0.0.1:1521:XE --username ***** --password ****** --table table_name --columns "COL1, COL2, COL3, COL4, COL5" --target-dir /xyz/zyx -m 1
Pay attention to the --connect argument; the connection string used has the format:
jdbc:oracle:thin:@ip_address:port_number:SID
The second format that is allowed is:
jdbc:oracle:thin:@ip_address:port_number/service_name
Hope this helps.
P.S. - If you are unable to add the OJDBC jar to Sqoop's lib, you can instead append the path of the jar file to the $HADOOP_CLASSPATH variable:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/root/shared_folder/ojdbc6.jar
P.P.S - chmod the ojdbc jar to 777 before execution.

Can sqoop export blob type from HDFS to Mysql?

Can Sqoop export blob type from HDFS to Mysql?
I have a table with a blob-type column. I can import it to HDFS, but when I export it back, it raises this exception:
Caused by: java.io.IOException: Could not buffer record
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:218)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 6 more
Caused by: java.lang.CloneNotSupportedException: com.cloudera.sqoop.lib.BlobRef
at java.lang.Object.clone(Native Method)
at org.apache.sqoop.lib.LobRef.clone(LobRef.java:109)
at TblPlugin.clone(TblPlugin.java:270)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:213)
This is a major bug in Sqoop and it is still in open status.
For more details see Not able to export blob datatype from HDFS to MySQL.
Hope this gives information about your issue.

Issue : Exporting table from hadoop to mysql

This is my sqoop script for exporting table from hadoop to mysql:
export
## Database details
--connect
jdbc:mysql://mktgcituspoc1.cisco.com:3306/poc
--username
pocuser
--password
pocuser
## Table to export to
--table
mktg_site_pub
--export-dir
##/app/MarketingIT/warehouse/mktg_mbd.db/performance_tst
/app/dev/MarketingIt/warehouse/hddvmktg/mktg_mbd.db/mktg_site_pub
--input-fields-terminated-by
'|'
--input-null-string
'\\N'
--input-null-non-string
'\\N'
-m
I'm getting the following error after running the above Sqoop script:
15/02/19 02:16:38 INFO mapred.JobClient: Task Id : attempt_201502172305_2648_m_000016_1, Status : FAILED on node hdnprd-c01-r01-05
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:680)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:346)
at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1117)
at org.apache.hadoop.mapred.Child.main(Child.java:271)
Caused by: java.lang.IllegalArgumentException
at java.sql.Date.valueOf(Date.java:140)
at mktg_site_pub.__loadFromFields(mktg_site_pub.java:3622)
at mktg_site_pub.parse(mktg_site_pub.java:3549)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Can anyone please help figure out what the exact issue is?
It is highly likely that you have a date-related field in MySQL that you are trying to export a non-date value into.
Check which fields in the MySQL table have that type and then check that the corresponding values in your Hadoop data set can be converted to a Java Date using Date.valueOf().
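As a minimal sketch of why the export mapper fails: java.sql.Date.valueOf only accepts strings in the yyyy-[m]m-[d]d escape format, and anything else (a d/m/yyyy layout, an empty field, a literal null marker) throws the IllegalArgumentException seen in the stack trace. The class and helper names below are hypothetical, introduced just for this illustration:

```java
import java.sql.Date;

// Demonstrates the strictness of java.sql.Date.valueOf, which is what
// the generated __loadFromFields code calls for DATE columns.
public class DateCheck {
    // Returns true when valueOf can parse the field, false when it
    // would throw IllegalArgumentException (i.e. the export would fail).
    static boolean isExportableDate(String field) {
        try {
            Date.valueOf(field);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isExportableDate("2015-02-19")); // parses
        System.out.println(isExportableDate("19/02/2015")); // throws
        System.out.println(isExportableDate("\\N"));        // throws
    }
}
```

If a field like the literal null marker "\N" reaches a date column, consider Sqoop's --input-null-string / --input-null-non-string options so it is treated as NULL rather than parsed as a date.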

SQOOP Not able to import table

I am running the below command in Sqoop:
sqoop import --connect jdbc:mysql://localhost/hadoopguide --table widgets
my version of sqoop : Sqoop 1.4.4.2.0.6.1-101
Hadoop -- Hadoop 2.2.0.2.0.6.0-101
Both are taken from the Hortonworks distribution. All the paths like HADOOP_HOME, HCAT_HOME, and SQOOP_HOME are set properly. I can get the list of databases and the list of tables from the MySQL database by running the list-databases and list-tables commands in Sqoop. I can even get data with --query 'select * from widgets'; but when I use the --table option, I get the error below.
14/02/06 14:02:17 WARN mapred.LocalJobRunner: job_local177721176_0001
java.lang.Exception: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getInputClass(DBConfiguration.java:394)
at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.createDBRecordReader(DataDrivenDBInputFormat.java:233)
at org.apache.sqoop.mapreduce.db.DBInputFormat.createRecordReader(DBInputFormat.java:236)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: Class widgets not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
... 13 more
Specify --bindir, which controls where the compiled code and .jar file are placed.
Without this argument, Sqoop places the generated Java source file in your current working directory and the compiled .class and .jar files in /tmp/sqoop-<username>/compile.
Use the --bindir option and point it to your current working directory:
sqoop import --bindir ./ --connect jdbc:mysql://localhost/hadoopguide --table widgets
The problem was resolved after I copied the .class file from /tmp/sqoop-hduser/compile/ to HDFS /home/hduser/ and also to the current working directory from which I run Sqoop.
To import a specific table into HDFS, run:
sqoop import --connect jdbc:mysql://localhost/databasename --username root --password *** --table tablename --bindir /usr/lib/sqoop/lib/ --driver com.mysql.jdbc.Driver --target-dir /directory-name
Make sure that /usr/lib/sqoop/* and /usr/local/hadoop/* are owned by the same user; otherwise you will get an error like "Permission denied".
PS: Make sure you have installed the MySQL Java connector before you run the command. I installed Hadoop version 2.7.3 and connector 5.0.8.
Another fix for the ClassNotFoundException is to tell Hadoop to use the user classpath first (-Dmapreduce.job.user.classpath.first=true). This can be set on the command line or in an options file. The top of an import options file would be:
#Options file for Sqoop import
import
-Dmapreduce.job.user.classpath.first=true
This fixed the ClassNotFoundException for me when trying to import data with --as-avrodatafile.