I have created a table "A" in MySQL
and a database "hiveankit" in Hive.
When I try to import table A into the target database with the following command:
[training@localhost ~]$ sqoop import --connect jdbc:mysql://localhost/march2015 --username root --table A -m 1 --target-dir hiveankit;
This is the result:
16/07/02 08:53:19 INFO mapreduce.ImportJobBase: Retrieved 15 records.
[training@localhost ~]$ hive;
Hive history file=/tmp/training/hive_job_log_training_201607020853_1580004608.txt
hive> show databases;
OK
default
hiveankit
Time taken: 3.029 seconds
hive> use hiveankit;
OK
Time taken: 0.044 seconds
hive> select * from A;
FAILED: Error in semantic analysis: Line 1:14 Table not found A
Why am I getting this error?
Am I missing a step?
To have Sqoop create the Hive table automatically during the import, the command needs the "--hive-import", "--create-hive-table" and "--hive-table" options. I have modified your command by adding them (see below). Without these options, Sqoop only copies the data to HDFS and no Hive table is created.
sqoop-import --connect jdbc:mysql://localhost/march2015 --username root --table A --hive-table ${hive_db_name}.A --create-hive-table --hive-import -m 1 --target-dir hiveankit;
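The same command can be assembled as a dry run before touching the cluster. This is only a sketch: the helper name print_hive_import and its three arguments are made up for illustration, it prints the command instead of running it, and it leaves out --target-dir for brevity. The flags themselves are the ones named in the answer.

```shell
# Hypothetical helper: prints (does not run) a Sqoop import that also
# creates the Hive table.
# $1 = JDBC URL, $2 = source table, $3 = Hive database
print_hive_import() {
  echo "sqoop import --connect $1 --username root --table $2" \
       "--hive-import --create-hive-table --hive-table $3.$2 -m 1"
}

print_hive_import jdbc:mysql://localhost/march2015 A hiveankit
```

Once the printed command looks right, it can be pasted into the shell, and "show tables;" inside "use hiveankit;" should then list the table.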
I am new to Sqoop and Hive. Please help me understand the following.
The row counts of the MySQL and Hive tables are different:
MySQL has 51 rows (the table has a primary key and no duplicates) and Hive has 38 rows, on the very first run.
sqoop job --create mmod -- import \
  --connect "jdbc:mysql://cxln2.c.thelab-240901.internal:3306/retail_db" \
  --username sqoopuser --password-file /tmp/.mysql-pass.txt \
  --table mod --compression-codec org.apache.hadoop.io.compress.BZip2Codec \
  --hive-import --hive-database encry --hive-table mod2 --hive-overwrite \
  --check-column last_update_date --incremental lastmodified --merge-key id --last-value 0 \
  --target-dir /user/user_name/append1sqopp
It does not create the target directory in the given location; instead it creates it in the warehouse location.
I am trying to schedule a Sqoop incremental job, but I am making a mistake somewhere.
Command: the command above
2.1 New rows are added with the same date
2.2 A few rows are deleted and updated
Output:
No new updates on the given table.
It does not update the last value in the Sqoop job.
How do I choose the merge-key column in Sqoop?
How do I use a WHERE condition in Sqoop?
--query "select * from reason where id>20 AND $CONDITIONS"
What is the use of $CONDITIONS, and do we need to set the variable in Linux?
Is it possible to track rejected rows in a Sqoop job?
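On the $CONDITIONS question: it is a placeholder that Sqoop itself replaces with a range predicate for each mapper, not a Linux variable you export. The practical catch is shell quoting, because inside double quotes the shell expands $CONDITIONS before Sqoop ever sees it. The sketch below only demonstrates that quoting behaviour and runs no Sqoop command; the variable names are illustrative.

```shell
# Assumption: the shell has no CONDITIONS variable of its own; an empty
# value stands in for that here.
CONDITIONS=

# Double quotes: the shell substitutes the (empty) variable, so the token is lost.
double_quoted="select * from reason where id>20 AND $CONDITIONS"
# Single quotes: the literal token survives and reaches Sqoop.
single_quoted='select * from reason where id>20 AND $CONDITIONS'
# Double quotes with a backslash: same result as single quotes.
escaped="select * from reason where id>20 AND \$CONDITIONS"

echo "double:  $double_quoted"
echo "single:  $single_quoted"
echo "escaped: $escaped"
```

So the --query string should be written either in single quotes or with \$CONDITIONS inside double quotes.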
Env: CDH
Tool: Sqoop
Version: Sqoop 1.4.6-cdh5.8.0
Objective: Import table from MySQL database
Create hive table with a subset of source data (e.g order_status = 'CLOSED')
Reimport more data in the same directory using order_status not in ('CLOSED')
Results:
1. Objective 1 complete using the command
sqoop import --connect jdbc:mysql://xxx:000/xxxx_db \
--username=xxxx_dba --P \
--warehouse-dir=/user/hive/warehouse/hex.db/ \
-m 1 \
--table orders --compression-codec=snappy \
--hive-import --as-textfile --create-hive-table \
--hive-table closed_orders \
--hive-overwrite \
--where "order_status='CLOSED'" \
--compress \
--columns "order_id, order_customer_id, order_status"
Creates the directory /user/hive/warehouse/hex.db/closed_orders with a data file and a hive table with "CLOSED" Orders.
I am trying to reimport more data, this time with order_status not in ('CLOSED').
-- This time I am not creating a Hive table, just importing the rows with order_status != 'CLOSED' into a different directory (open_orders).
Issue: It creates the directory /user/hive/warehouse/hex.db/open_orders/orders/.
2.a How can I import the file into the directory /user/hive/warehouse/hex.db/open_orders?
2.b How can I import the subset of data with order_status != 'CLOSED', i.e. the open orders, into the same directory created in step 1, i.e. /user/hive/warehouse/hex.db/closed_orders?
Command used for step 2:
sqoop import --connect jdbc:mysql://xxxx:0000/retail_db \
--username=xxxx_dba --P --warehouse-dir=/user/hive/warehouse/hex.db/ -m 1 \
--table orders --compression-codec=snappy --hive-import \
--as-textfile --hive-table open_orders \
--where "order_status not in ('CLOSED')" \
--compress --columns "order_id, order_customer_id, order_status"
2.3 Error with the --append option, where I am trying to import the open orders into the directory created in step 1, /user/hive/warehouse/hex.db/closed_orders:
17/04/15 14:24:22 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/04/15 14:24:22 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
Append mode for hive imports is not yet supported. Please remove the parameter --append-mode
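When you do step 2, i.e. reimporting the data, use
--target-dir /user/hive/warehouse/hex.db/open_orders
instead of
--warehouse-dir
--warehouse-dir names a parent directory under which Sqoop creates a subdirectory named after the table, whereas --target-dir names the exact output directory.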
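To make the difference between the two options concrete, here is a small sketch that mirrors how the HDFS output directory is derived. The helper import_dir is hypothetical and only models the import step; with --hive-import, Hive afterwards loads the files into the table's own location.

```shell
# Hypothetical helper mirroring how Sqoop picks the HDFS output directory.
# $1 = option name (--target-dir or --warehouse-dir)
# $2 = directory given to that option
# $3 = source table name
import_dir() {
  dir=${2%/}                        # drop one trailing slash, if any
  case $1 in
    --target-dir)    echo "$dir" ;;       # exact directory
    --warehouse-dir) echo "$dir/$3" ;;    # parent directory plus table name
    *) echo "unknown option: $1" >&2; return 1 ;;
  esac
}

import_dir --warehouse-dir /user/hive/warehouse/hex.db/open_orders orders
import_dir --target-dir    /user/hive/warehouse/hex.db/open_orders orders
```

The first call prints /user/hive/warehouse/hex.db/open_orders/orders, which is the extra "orders" level reported in the question; the second prints the directory exactly as given.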
I am trying to import a table from an RDBMS into Hive using Sqoop on a Hadoop cluster, and I am getting the following error. Can you please suggest a solution?
bin/sqoop-import --connect jdbc:mysql://localhost:3306/hadoop -username root -password root --table salaries --hive-table salaries --create-hive-table --hive-import --hive-home /home/techgene/hive-0.11.0 -m 1 --target-dir /user/hive/warehouse
Exception:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
14/06/02 14:30:19 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:364)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:314)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:226)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
Whenever you use Sqoop with the Hive import option, Sqoop connects directly to the source database, reads the metadata (the schema) of the corresponding table, and passes that schema to Hive, so there is no need to create the table structure in Hive yourself.
Example:
sudo sqoop import-all-tables --connect jdbc:mysql://10.0.0.57/movielens --username root --password root --hive-import
This imports all the tables from the movielens database in MySQL.

sqoop import \
--connect jdbc:mysql://10.0.0.57/movielens \
--username root \
--password hadoop \
--table cities \
--hive-import
This imports just one table, called cities.
By default, Sqoop stores the imported data on HDFS in the default directory, i.e. /user/sqoop/tablename/part-m files.
With the Hive import option, the tables are loaded directly into the default warehouse directory, i.e.
/user/hive/warehouse/tablename
Command: sudo -u hdfs hadoop fs -ls -R /user/
This recursively lists all the files under /user.
Now go to Hive and type show databases. If there is only the default database, then type show tables.
Remember that OK is standard system output and is not part of the command's result.
hive> show databases;
OK
default
Time taken: 0.172 seconds
hive> show tables;
OK
genre
log_apache
movie
moviegenre
movierating
occupation
user
Time taken: 0.111 seconds
Check the syntax and eliminate extra spaces.
$ sqoop-import --connect "jdbc:mysql://localhost:3306/hadoop;database=< db_name >"
-username root
-password root
--table salaries
--hive-import
--target-dir /user/hive/warehouse
There is no need to specify --hive-table < table_name > if you are using the same name as in MySQL.
I am trying to append data to an already existing table in Hive. Using the following command, I first imported the table from MS SQL Server into Hive.
Sqoop command:
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase" --table "my_table" --where "Batch_Id > 100" --username myuser --password mypassword --hive-import
Now I want to append the data where "Batch_Id < 100" to the same existing table in Hive.
I am using the following command:
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase" --table "my_table" --where "Batch_Id < 100" --username myuser --password mypassword --append --hive-table my_table
This command runs successfully and updates the data on HDFS, but when I connect to the Hive shell and query the table, the appended records are not visible.
Sqoop updated the data on HDFS under "/user/hduser/my_table", but the data under "/user/hive/warehouse/batch_dim" is not updated.
How can I resolve this issue?
Regards,
Bhagwant Bhobe
Try using
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase"
--table "my_table" --where "Batch_Id < 100"
--username myuser --password mypassword
--hive-import --hive-table my_table
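When you are using --hive-import, do NOT use the --append parameter. Unless --hive-overwrite is given, a repeated --hive-import run loads the new rows into the existing Hive table.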
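The rule can be turned into a small pre-flight check for scripted imports. This is a sketch under assumptions: check_hive_append is a hypothetical helper, it only inspects the argument list, and it never calls Sqoop.

```shell
# Hypothetical pre-flight check: refuse argument lists that combine
# --append with --hive-import.
check_hive_append() {
  has_append=0
  has_hive=0
  for arg in "$@"; do
    case $arg in
      --append)      has_append=1 ;;
      --hive-import) has_hive=1 ;;
    esac
  done
  if [ "$has_append" -eq 1 ] && [ "$has_hive" -eq 1 ]; then
    echo "remove --append: it cannot be combined with --hive-import" >&2
    return 1
  fi
  return 0
}

check_hive_append import --table my_table --hive-import --hive-table my_table \
  && echo "ok to run"
```

Note that the check passes for --append with --hive-table alone, which is the combination from the question: that import succeeds but only writes to HDFS, because --hive-import is missing.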
The Sqoop command you're using (import without --hive-import) only ingests records into HDFS. You need to add the --hive-import flag to import records into Hive.
See http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_importing_data_into_hive for more details and for additional import configuration options (you may want to change the document reference to your version of Sqoop, of course).
After installing Hadoop and Hive (CDH version), I execute
./sqoop import -connect jdbc:mysql://10.164.11.204/server -username root -password password -table user -hive-import --hive-home /opt/hive/
Everything goes fine, but when I enter the Hive command line and execute show tables, there is nothing.
When I run ./hadoop fs -ls, I can see that /user/(username)/user exists.
Any help is appreciated.
---EDIT-----------
./sqoop import -connect jdbc:mysql://10.164.11.204/server -username root -password password -table user -hive-import --target-dir /user/hive/warehouse
The import fails due to:
11/07/02 00:40:00 INFO hive.HiveImport: FAILED: Error in semantic analysis: line 2:17 Invalid Path 'hdfs://hadoop1:9000/user/ubuntu/user': No files matching path hdfs://hadoop1:9000/user/ubuntu/user
11/07/02 00:40:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 10
at com.cloudera.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:326)
at com.cloudera.sqoop.hive.HiveImport.executeScript(HiveImport.java:276)
at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:218)
at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:362)
at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:423)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
Check your hive-site.xml for the value of the property
javax.jdo.option.ConnectionURL. If you do not define this explicitly,
the default value uses a relative path for the Hive metastore
(jdbc:derby:;databaseName=metastore_db;create=true), so the metastore
location depends on the directory you launch the process from.
This would explain why you cannot see the table via show tables.
Define this property in your hive-site.xml using an absolute path.
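A quick way to check this is to pull the URL out of hive-site.xml and test whether the Derby databaseName is an absolute path. The sketch below is illustrative only: metastore_url and is_relative_derby are hypothetical helpers, the XML handling is a simple sed match rather than a real parser, and the sample file stands in for your actual hive-site.xml.

```shell
# Print the value of javax.jdo.option.ConnectionURL from the given file.
metastore_url() {
  sed -n '/javax.jdo.option.ConnectionURL/,/<\/value>/p' "$1" |
    sed -n 's:.*<value>\(.*\)</value>.*:\1:p' | head -n 1
}

# Succeed (status 0) only for an embedded Derby URL with a relative path.
is_relative_derby() {
  case $1 in
    jdbc:derby:*databaseName=/*) return 1 ;;  # absolute path
    jdbc:derby:*)                return 0 ;;  # relative path
    *)                           return 1 ;;  # not embedded Derby
  esac
}

# Sample file using the default value quoted above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
</configuration>
EOF

url=$(metastore_url "$sample")
if is_relative_derby "$url"; then
  echo "relative Derby path: $url"
fi
rm -f "$sample"
```

If the check reports a relative path, Sqoop and the Hive shell can each end up with their own metastore_db directory when started from different working directories.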
There is no need to create the table in Hive. Refer to the command below:
sqoop import --connect jdbc:mysql://xxxx.com/Database name --username root --password admin --table tablename (mysql table) --direct -m 1 --hive-import --create-hive-table --hive-table table name --target-dir '/user/hive/warehouse/Tablename(which u want create in hive)' --fields-terminated-by '\t'
In my case Hive stores data in /user/hive/warehouse directory in HDFS. This is where Sqoop should put it.
So I guess you have to add:
--target-dir /user/hive/warehouse
which is the default location for Hive tables (it might be different in your case).
You might also want to create this table in Hive:
sqoop create-hive-table --connect jdbc:mysql://host/database --table tableName --username user --password password
In my case it creates the table in the Hive default database; you can give it a try.
sqoop import --connect jdbc:mysql://xxxx.com/Database name --username root --password admin --table NAME --hive-import --warehouse-dir DIR --create-hive-table --hive-table NAME -m 1
Hive tables will be created by the Sqoop import process. Please make sure that /user/hive/warehouse exists in your HDFS. You can browse HDFS at http://localhost:50070/dfshealth.jsp (the "Browse the File System" option).
Also include the HDFS location in --target-dir, i.e. hdfs://:9000/user/hive/warehouse, in the sqoop import command.
First of all, create the table definition in Hive with exactly the same field names and types as in MySQL.
Then perform the import operation.
For Hive import:
sqoop import --verbose --fields-terminated-by ',' --connect jdbc:mysql://localhost/test --table tablename --hive-import --warehouse-dir /user/hive/warehouse --fields-terminated-by ',' --split-by id --hive-table tablename
'id' can be the primary key of your existing table
'localhost' can be your local IP
'test' is the database
the 'warehouse' directory is in HDFS
I think all you need is to specify the Hive table where the data should go.
Add "--hive-table database.tablename" to the sqoop command and remove --hive-home /opt/hive/. I think that should resolve the problem.