I am new to Hadoop and am trying to run the following Sqoop command:
sqoop import --connect jdbc:mysql://localhost:3306/vaibhav --table employees --username root --password-file ${user.home}/.password.txt --target-dir /data/sqoop/eg4/ -m 2
but it gives me the following error:
bash: ${user.home}/.password.txt: bad substitution
I tried it the way it is given in the docs, but nothing changed; I get the same error every time.
A step-by-step guide would be appreciated. Thanks
Sqoop expects the password file at an HDFS location. Try copying the file to a location on HDFS and specify that path, as sketched below. Also check the read permissions of the file; read permission should be given to the home-directory user.
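A minimal sketch of that approach, assuming the file already exists locally as ~/.password.txt and that /user/<your_user>/ is your HDFS home directory (both paths are illustrative):
# copy the password file to HDFS and restrict it to the owner
hadoop fs -put ~/.password.txt /user/<your_user>/.password.txt
hadoop fs -chmod 400 /user/<your_user>/.password.txt
# reference the HDFS path instead of ${user.home}
sqoop import --connect jdbc:mysql://localhost:3306/vaibhav --table employees --username root --password-file /user/<your_user>/.password.txt --target-dir /data/sqoop/eg4/ -m 2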
So I'm trying to import-all-tables into the Hive db, i.e. /user/hive/warehouse/... on HDFS, using the command below:
sqoop import-all-tables --connect "jdbc:sqlserver://<servername>;database=<dbname>" \
--username "<username>" \
--password "<password>" \
--warehouse-dir "/user/hive/warehouse/" \
--hive-import \
-m 1
In the test database I have 3 tables. When MapReduce runs, the output is success,
i.e. the MapReduce job is 100% complete, but the files are not found in the Hive db.
It’s basically getting overwritten by the last table; try removing the forward slash at the end of the directory path. For testing I would suggest not using the warehouse directory; use something like ‘/tmp/sqoop/allTables’, as in the adjusted command below.
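A possible adjustment of the command from the question along those lines (server, database, and credentials remain the question's placeholders):
sqoop import-all-tables --connect "jdbc:sqlserver://<servername>;database=<dbname>" \
--username "<username>" \
--password "<password>" \
--warehouse-dir "/tmp/sqoop/allTables" \
--hive-import \
-m 1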
There is another way:
1. Create a Hive database pointing to a location, say "targetLocation" (sketched after this answer).
2. Create the HCatalog table in your Sqoop import using the previously created database.
3. Use the --target-dir import option to point to that targetLocation.
You don't need to define the warehouse directory; just define the Hive database and it will automatically figure out the working directory.
sqoop import-all-tables --connect "jdbc:sqlserver://xxx.xxx.x.xxx:xxxx;databaseName=master" --username xxxxxx --password xxxxxxx --hive-import --create-hive-table --hive-database test -m 1
It will just run like a rocket.
Hope it works for you.
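For step 1 above, a minimal sketch of creating a Hive database that points to a specific location (the database name and path are only examples):
hive -e "CREATE DATABASE IF NOT EXISTS test LOCATION '/data/targetLocation';"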
I'm trying to import a table from Oracle to Hive using Sqoop. I used the following command:
sqoop-import --connect jdbc:<connection> --table test1 --username test --password test --hive-table hive_test --create-hive-table --hive-import -m 1
But this gives me the error
Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory <hdfs path> already exists
I read in many online forums that I should delete the directory and run the command again.
I did exactly that, but I still keep getting the error.
You need to understand how Sqoop Hive import works:
1. Import data to HDFS <some-dir>
2. Create Hive table <some-table> IF NOT EXISTS
3. LOAD DATA INPATH '<some-dir>' INTO TABLE <some-table>
You are getting the error at step 1:
Output directory <hdfs path> already exists
Delete this <hdfs path> and proceed.
Better way:
There is no need to delete this manually every time. Use --delete-target-dir in the command. It will:
Delete the import target directory if it exists
P.S. There is no need to use --create-hive-table with --hive-import; --hive-import creates the table for you by default. An adjusted version of your command is sketched below.
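Applied to the command from the question, that would look roughly like this (the connection placeholder is kept from the question):
sqoop-import --connect jdbc:<connection> --table test1 --username test --password test --hive-table hive_test --hive-import --delete-target-dir -m 1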
Hive stores its table data in the Hive warehouse on HDFS, with the table name as a directory, usually under the path below:
/user/hive/warehouse/
You need to delete the table-name directory:
hadoop fs -rmr /user/hive/warehouse/hive_test
sqoop import --connect "jdbc:mysql:" --username sqoopuser --password-file HDFS directory
is working
sqoop import --connect "jdbc:mysql:" --username sqoopuser --password-file Local FS Directory
is not working. It throws a 'file does not exist' error.
Sqoop Documentation says:
Secure way of supplying password to the database. You should save the password in a file on the users home directory with 400 permissions and specify the path to that file using the --password-file argument, and is the preferred method of entering credentials. Sqoop will then read the password from the file and pass it to the MapReduce cluster using secure means without exposing the password in the job configuration. The file containing the password can either be on the Local FS or HDFS.
I'm really not sure how Sqoop decides whether the path is on HDFS or the local FS.
Say your password is stored in /home/${user}/password.file (local FS).
Instead of using
--password-file /home/${user}/password.file
use
--password-file file:///home/${user}/password.file
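A minimal sketch of the whole setup, keeping the abbreviated connection string from the question (the password and the ${user} path are placeholders; echo -n avoids writing a trailing newline, which Sqoop would otherwise read as part of the password):
# create the local password file with 400 permissions
echo -n "yourpassword" > /home/${user}/password.file
chmod 400 /home/${user}/password.file
# point Sqoop at it with an explicit file:// scheme
sqoop import --connect "jdbc:mysql:" --username sqoopuser --password-file file:///home/${user}/password.file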
I want to store the password in a file and later use it in the sqoop command.
According to the Sqoop documentation, the --password-file option allows us to supply the password from a file, so I am storing it in a pwd file containing only the text abc, and running the command below.
sqoop import --connect jdbc:mysql://localhost:3306/db --username bhavesh --password-file /pwd --table t1 --target-dir '/erp/test'
assuming the pwd file is stored on HDFS.
As a result I am getting the following error:
java.sql.SQLException: Access denied for user 'bhavesh'@'localhost' (using password: YES)
When I perform the same operation using the -P option, it works fine for me.
For a saved sqoop job, I was getting the same error.
I stored the password in the metastore and that worked for me.
This is done by changing the following configuration property in sqoop-site.xml, which is usually stored at /etc/sqoop/conf/sqoop-site.xml:
<property>
<name>sqoop.metastore.client.record.password</name>
<value>true</value>
<description>If true, allow saved passwords in the metastore.
</description>
</property>
After making this change, create the sqoop job; by running the following command you will be able to see that the password is stored:
sqoop job --show [job_name]
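For reference, a minimal sketch of creating and running such a saved job (the job name, connection string, and table are made-up examples):
# -P prompts for the password; with sqoop.metastore.client.record.password=true it is kept in the metastore
sqoop job --create import_t1 -- import --connect jdbc:mysql://localhost:3306/db --username bhavesh -P --table t1 --target-dir /erp/test
sqoop job --exec import_t1
sqoop job --show import_t1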
You can store the credential on HDFS.
Create a credential using this command:
hadoop credential create mysql.password -provider jceks://hdfs/user/<your_hadoop_username>/mysqlpwd.jceks
When executed on the client machine it will ask you to provide a password; enter the MySQL password that you previously provided with the -P option of the sqoop command.
sqoop import --connect jdbc:mysql://localhost:3306/db --username bhavesh --password-alias mysql.password --table t1 --target-dir /erp/test
And run this modified command, in which I have replaced
--password-file
with
--password-alias
The file on HDFS contains the password in an encrypted format that cannot be recovered as plain text.
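One assumption worth stating: for --password-alias to resolve the alias, Sqoop must be told where the credential store lives, either in core-site.xml or on the command line via the hadoop.security.credential.provider.path property, roughly like this:
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/<your_hadoop_username>/mysqlpwd.jceks --connect jdbc:mysql://localhost:3306/db --username bhavesh --password-alias mysql.password --table t1 --target-dir /erp/test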
In Sqoop for Hadoop you can use a parameters file for connection string information.
--connection-param-file filename Optional properties file that provides connection parameters
What is the format of that file?
Say for example I have:
jdbc:oracle:thin:@//myhost:1521/mydb
How should that be in a parameters file?
If you want to provide your database connection string and credentials, create a file with those details and use --options-file in your Sqoop command.
Create a file database.props with the following details:
import
--connect
jdbc:mysql://localhost:3306/test_db
--username
root
--password
password
Then your sqoop import command will look like:
sqoop --options-file database.props \
--table test_table \
--target-dir /user/test_data
And regarding --connection-param-file, I hope this link will be helpful for understanding its usage.
It should be the same as on the command line.
Example
import
--connect
jdbc:oracle:thin:@//myhost:1521/mydb
--username
foo
Below is a sample command connecting to a MySQL server:
sqoop list-databases --connect jdbc:mysql://192.168.256.156/test --username root --password root
It will give you the list of databases available on your MySQL server.