How to use a specified Hive database when using Sqoop import - hadoop

sqoop import --connect jdbc:mysql://remote-ip/db --username xxx --password xxx --table tb --hive-import
The above command imports table tb into the 'default' Hive database.
Can I use another database instead?

Off the top of my head, I recall that you can specify --hive-table foo.tb,
where foo is your Hive database and tb is your Hive table.
So in your case it would be:
sqoop import --connect jdbc:mysql://remote-ip/db --username xxx --password xxx --table tb --hive-import --hive-table foo.tb
As a footnote, here is the original jira issue https://issues.apache.org/jira/browse/SQOOP-322

Hive database using Sqoop import:
sqoop import --connect jdbc:mysql://localhost/arun --table account --username root --password root -m 1 --hive-import --hive-database company --create-hive-table --hive-table account --target-dir /tmp/customer/ac

You can specify the database name as a part of the --hive-table parameter, e.g. "--hive-table foo.tb".
A request to add a dedicated parameter for the database is being tracked as SQOOP-912.
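The two ways of selecting the Hive database mentioned in these answers can be compared side by side. The sketch below only prints the commands instead of running them. The helper name hive_import_cmd is hypothetical, the connection values are the placeholders from the question, and --hive-database is assumed to be available in your Sqoop version (it appears in one of the answers above).

```shell
#!/bin/sh
# Dry-run sketch: print a Sqoop Hive import command for a given Hive database.
# hive_import_cmd is a hypothetical helper, not part of Sqoop.
hive_import_cmd() {
  style=$1   # "qualified" -> --hive-table db.table; "separate" -> --hive-database db
  db=$2
  table=$3
  base="sqoop import --connect jdbc:mysql://remote-ip/db --username xxx --password xxx --table $table --hive-import"
  case $style in
    qualified) printf '%s\n' "$base --hive-table $db.$table" ;;
    separate)  printf '%s\n' "$base --hive-database $db --hive-table $table" ;;
    *) echo "unknown style: $style" >&2; return 1 ;;
  esac
}

hive_import_cmd qualified foo tb
hive_import_cmd separate foo tb
```

Printing the command first makes it easy to check which form your Sqoop version accepts before touching any data.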

Related

hive-import and hive-overwrite with sqoop import all

sqoop import-all-tables --connect jdbc:mysql://localhost/SomeDB --username root --hive-database test --hive-import;
The above command works, but it duplicates the values in the destination tables. I used the command below to overwrite the data instead.
sqoop import-all-tables --connect jdbc:mysql://localhost/SomeDB --username root --hive-import --hive-database Test --hive-overwrite
This replaced all the values in the tables with nulls. Removing --hive-import does not help either. What am I doing wrong here?
The following should solve the problem:
sqoop import-all-tables \
--connect jdbc:mysql://localhost/SomeDB \
--username root \
--hive-import \
--warehouse-dir /user/hive/warehouse/Test \
--hive-database Test \
--hive-overwrite

Sqoop export from Hcatalog to MySQL with different col names assign

My Hive table has the columns id, name
and my MySQL table has the columns number, id, name.
I want to map id (from Hive) to number (in MySQL), and name (from Hive) to id (in MySQL).
I use the command:
sqoop export --hcatalog-database <my_db> --hcatalog-table <my_table> --columns "number,id" \
--connect jdbc:mysql://db...:3306/test \
--username <my_user> --password <my_passwd> --table <my_mysql_table>
However, it didn't work.
A similar scenario works fine in another case [1]: the requirement can be met by locating the Hive table's files on HDFS and using the following command.
sqoop export --export-dir /[hdfs_path] --columns "number,id" \
--connect jdbc:mysql://db...:3306/test \
--username <my_user> --password <my_passwd> --table <my_mysql_table>
Is there a solution that handles my scenario via HCatalog?
Reference:
[1] Sqoop export from hive to oracle with different col names, number of columns and order of columns
I haven't used the HCatalog part of Sqoop, but according to the manual, the following script should do the job:
sqoop export --hcatalog-database <my_db> --hcatalog-table <my_table> --map-column-hive "number,id" \
--connect jdbc:mysql://db...:3306/test \
--username <my_user> --password <my_passwd> --table <my_mysql_table>
When the --map-column-hive option is used together with the --hcatalog options, it applies to HCatalog instead of Hive.
Hope that this works for you.

import data from vertica to hive

I am trying to load data from Vertica into Hive using Sqoop.
I can see that it creates a file and a table in Hive, but when I try to select the data from the Hive table or read the file, I cannot see the data. The select shows an error (there is no delimiter between the columns in the file).
This is my code:
sqoop import -m -1 --driver com.vertica.jdbc.Driver --connect "jdbc:vertica://serverName:5443/DBName" --username "user" --password "pass" --query 'select id, name from contacts limit 10' --target-dir "folder/contacts" --hive-import --create-hive-table --hive-table db.contacts
Use these arguments and choose delimiters for your data:
--fields-terminated-by
--lines-terminated-by
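As a sketch of that advice, the two options can be assembled like this and appended to the import command from the question. The comma and newline chosen here are assumptions; pick characters that do not occur in your data. The helper name delimiter_args is hypothetical, and the script only prints the options rather than running Sqoop.

```shell
#!/bin/sh
# Dry-run sketch: build explicit delimiter options for a Sqoop import.
# The delimiter values below are assumptions for illustration.
FIELD_DELIM=','
LINE_DELIM='\n'

# delimiter_args is a hypothetical helper, not part of Sqoop.
delimiter_args() {
  # Wrap each value in single quotes so the shell passes it to Sqoop unchanged.
  printf "%s '%s' %s '%s'\n" \
    --fields-terminated-by "$1" \
    --lines-terminated-by "$2"
}

delimiter_args "$FIELD_DELIM" "$LINE_DELIM"
```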

Importing vertica data to sqoop

I am ingesting Vertica data with Sqoop 1 on a MapR cluster. I use the following command:
sqoop import -m 1 --driver com.vertica.jdbc.Driver --connect "jdbc:vertica://*******:5433/db_name" --password "password" --username "username" --table "schemaName.tableName" --columns "id" --target-dir "/t" --verbose
This command gives me the following error:
Caused by: com.vertica.util.ServerException: [Vertica][VJDBC](4856) ERROR: Syntax error at or near "."
I read https://groups.google.com/a/cloudera.org/forum/#!msg/cdh-user/xIBwvc_eOp0/TvhANQfvcv4J for more information on this, but it wasn't very helpful because the answers there apply to Sqoop2.
When I run this command:
sqoop import -m 1 --driver com.vertica.jdbc.Driver --connect "jdbc:vertica://*******:5433/db_name" --password "password" --username "username" --table "tableName" --columns "id" --target-dir "/t" --verbose
It gives an error: Relation "tableName" doesn't exist.
I have added the required vertica-jdk jars to the Sqoop library too.
How can I specify the schema name in Sqoop for Vertica?
You can specify the schema name to use in the connection string like this:
--connect "jdbc:vertica://*******:5433/db_name?searchpath=myschema"
I changed the statement to use --query, and schema.table works fine there. The statement is:
sqoop import -m 1 --driver com.vertica.jdbc.Driver --connect "jdbc:vertica://*****:5433/dbName" --password "*****" --username "******" --target-dir "/tmp/cdsdj" --verbose --query 'SELECT t.col1 FROM schema.tableName t where $CONDITIONS'
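Both suggestions above can be sketched in a few lines of shell. The helper name vertica_url and the host, port, database, and schema values are hypothetical placeholders, and the searchpath parameter is taken from the answer above rather than verified against a particular driver version. Note the single quotes around the query: they keep $CONDITIONS literal so that Sqoop, not the shell, substitutes it.

```shell
#!/bin/sh
# vertica_url is a hypothetical helper; all argument values are placeholders.
# It builds a JDBC URL that sets the schema through the searchpath parameter.
vertica_url() {
  printf 'jdbc:vertica://%s:%s/%s?searchpath=%s\n' "$1" "$2" "$3" "$4"
}

# Single quotes prevent the shell from expanding $CONDITIONS.
QUERY='SELECT t.col1 FROM schema.tableName t WHERE $CONDITIONS'

vertica_url dbhost 5433 db_name myschema
printf '%s\n' "$QUERY"
```

If the query were wrapped in double quotes instead, the shell would expand $CONDITIONS to an empty string before Sqoop ever saw it.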

Appending Data to hive Table using Sqoop

I am trying to append data to an existing table in Hive. Using the following command, I first imported the table from MS SQL Server into Hive.
Sqoop command:
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase" --table "my_table" --where "Batch_Id > 100" --username myuser --password mypassword --hive-import
Now I want to append the rows where "Batch_Id < 100" to the same existing table in Hive.
I am using the following command:
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase" --table "my_table" --where "Batch_Id < 100" --username myuser --password mypassword --append --hive-table my_table
This command runs successfully and updates the HDFS data, but when you connect to the Hive shell and query the table, the appended records are not visible.
Sqoop updated the data on HDFS under "/user/hduser/my_table", but the data under "/user/hive/warehouse/batch_dim" is not updated.
How can I resolve this issue?
Regards,
Bhagwant Bhobe
Try using:
sqoop import --connect "jdbc:sqlserver://XXX.XX.XX.XX;databaseName=mydatabase" \
--table "my_table" --where "Batch_Id < 100" \
--username myuser --password mypassword \
--hive-import --hive-table my_table
When you are using --hive-import, do NOT use the --append parameter.
The Sqoop command you're using (import without --hive-import) only ingests records into HDFS. You need to add the --hive-import flag to import records into Hive.
See http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_importing_data_into_hive for more details and for additional import configuration options (you may want to change the document reference to your version of Sqoop, of course).
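The advice above (do not combine --append with --hive-import) can be turned into a small pre-flight check in a wrapper script. This is only a sketch: check_hive_append is a hypothetical helper, not a Sqoop feature, and it inspects the argument list without running Sqoop.

```shell
#!/bin/sh
# check_hive_append is a hypothetical pre-flight helper, not part of Sqoop.
# It returns 1 when both --hive-import and --append appear in the arguments.
check_hive_append() {
  has_hive=0
  has_append=0
  for arg in "$@"; do
    case $arg in
      --hive-import) has_hive=1 ;;
      --append)      has_append=1 ;;
    esac
  done
  if [ "$has_hive" -eq 1 ] && [ "$has_append" -eq 1 ]; then
    echo "remove --append when using --hive-import" >&2
    return 1
  fi
  return 0
}

# The argument list from the suggested command passes the check.
check_hive_append --table my_table --hive-import --hive-table my_table && echo "ok"
```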

Resources