I want to execute a Sqoop (1.4.5) command from a shell script.
shell:
sqoop_cmd="sqoop import --connect jdbc:mysql://xx.x.xxx.xxx:3306/test --username test --password datagateway --query 'select t.name from table_name t where date(hrc.gmt_modified) = date_sub(curdate(),interval 1 day) AND $CONDITIONS' --target-dir /output -m 1 --append"
result=$sqoop_cmd 2>&1 | grep -c "successfully"
error:
WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
But when I changed --query to --table and removed the ' AND $CONDITIONS' part and tried again, the Sqoop command succeeded. I think the problem is related to the '$', but I tried '\$CONDITIONS' and "'$CONDITIONS'" and neither worked.
Please help me, thank you so much!
The warning
WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
appears because you are entering your database password in the command itself with --password datagateway. If you use -P instead, you won't get that warning and you will be prompted for the password after you start executing the command.
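For example, a command along these lines (connection details copied from the question) prompts for the password when it starts instead of exposing it on the command line:
sqoop import \
  --connect jdbc:mysql://xx.x.xxx.xxx:3306/test \
  --username test \
  -P \
  --table table_name \
  --target-dir /output \
  -m 1 --append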
When you use the --query parameter you have to use $CONDITIONS, and you did. But you forgot to add the --split-by parameter, which is required:
From the Sqoop user guide:
"Your query must include the token $CONDITIONS which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by."
Hope it helps a bit.
Pawel
I want to call an Oracle stored procedure from Sqoop but I'm getting an error. I have to call a stored procedure function and need to pass parameters to it.
$: sqoop import --connect jdbc:oracle:thin:@localhost:1512/db --username userA --password password --call Oracle_Schema.pkg_table_maintenance.sf_drop_index('TBL_A_%','Group_id')
-bash: syntax error near unexpected token `('
$: sqoop import --connect jdbc:oracle:thin:@localhost:1512/db --username userA --password password --call "Oracle_Schema.pkg_table_maintenance.sf_drop_index('TBL_A_%','Group_id')"
Warning: /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p1246.1021/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/11/27 10:31:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.7
17/11/27 10:31:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/11/27 10:31:32 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
17/11/27 10:31:32 ERROR tool.BaseSqoopTool: Unrecognized argument: --call
17/11/27 10:31:32 ERROR tool.BaseSqoopTool: Unrecognized argument: Oracle_Schema.pkg_table_maintenance.sf_drop_index('TBL_A_%','Group_id')
Can someone please help!
Sqoop allows you to call a SQL procedure only when you are exporting, not when you are importing.
If your stored procedure does some kind of select that you want to import through sqoop-import, that will not work; sqoop import doesn't have that flexibility.
But if your stored procedure is doing some kind of clean-up operation, you can use the sqoop eval utility as shown below:
sqoop eval --connect jdbc:oracle:thin:@localhost:1512/db --username userA --password password --query "EXECUTE Oracle_Schema.pkg_table_maintenance.sf_drop_index('TBL_A_%','Group_id')"
--query "will execute the query as if you are running in oracle database, you can use the same SQL syntax that you will use from oracle client application/command line"
You can execute a stored procedure in Sqoop by using the SQL*Plus way of executing a stored procedure: BEGIN STORED_PROCEDURE; END;
sqoop eval -Dmapred.job.queue.name=root.test.test-mis \
--connect jdbc:oracle:thin:@SERVER.NAME:PORT:INSTANCE --password **** \
--username MYSCHEMA --query "BEGIN MYSCHEMA.TEST_STORED_PROCEDURE_NAME; END;"
Details:
Sqoop 1.4.6
I tried a password file located in HDFS and on the local FS. I also tried declaring the password file as --password-file file:///user/username/password.file.
sqoop import \
--connect 'jdbc:sqlserver://servername;database=databasename' \
--username userxxx \
--password-file password.file \
--table 'sales' \
--fields-terminated-by '|' \
--null-string '\\N' \
--null-non-string '\\N' \
--hive-import
When running a sqoop import I am getting authentication failures from SQL Server unless I put the username and password in the connection string. If I try to use -P or --password-file, the authentication fails.
--password-file
As per the docs:
Secure way of supplying password to the database. You should save the password in a file on the users home directory with 400 permissions and specify the path to that file using the --password-file argument, and is the preferred method of entering credentials. Sqoop will then read the password from the file and pass it to the MapReduce cluster using secure means without exposing the password in the job configuration. The file containing the password can either be on the Local FS or HDFS.
Check the file's permissions; it should be 400, and the file should be in the user's home directory.
Sqoop expects the --password-file path to be in HDFS; if you are using the local FS, add the file:// prefix.
Example: --password-file file:///user/dev/database.password
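A minimal sketch of creating such a file (the password and paths below are placeholders): the Sqoop user guide warns that the entire file content is used as the password, including any trailing newline, so echo -n is the safe way to write it.
# placeholders only: write the password without a trailing newline and lock down permissions
echo -n "yourpassword" > database.password
chmod 400 database.password
# optionally copy it to HDFS and point Sqoop at the HDFS path
hdfs dfs -put database.password /user/dev/database.password
sqoop import --connect 'jdbc:sqlserver://servername;database=databasename' \
  --username userxxx \
  --password-file /user/dev/database.password \
  --table sales --hive-import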
-P option
It will read the password from the console; with this option you have to type the password while the command executes.
--password
Simply add your password on the command line.
There is also a better way of doing it: using expect scripting.
Create a script file, say "script.sh":
#!/usr/bin/expect -f
set password "<your password here>"   ;# or read it from a file
spawn sqoop <args>                    ;# define the sqoop command here (it must prompt for a password, e.g. with -P)
expect "*?Enter password:*"           ;# wait for the database password prompt
send -- "$password\r"
set timeout -1                        ;# wait indefinitely for the command to finish
expect eof
In case you are facing an error that the spawn command does not exist, it means that expect is not installed.
For CentOS 7:
yum install expect -y
Then run the file "script.sh".
If the expect script fails to load, try:
> /usr/bin/expect script.sh
Thanks
A Sqoop job always prompts for a password in the CLI. To avoid this, it is said that the property sqoop.metastore.client.record.password should be set to true, but everywhere it says I need to change this value in sqoop-site.xml. Is there any way I can set this value for one job alone? I tried to create a job like the one below and Sqoop fails to create it:
sqoop job --create TEST -D sqoop.metastore.client.record.password=true -- import \
--connect jdbc:netezza://xx.xxx.xx.xxx/database \
--username username \
--password password \
--table tablename \
--split-by key \
--hcatalog-database hivedatabase \
--hcatalog-table hivetable \
--hcatalog-storage-stanza 'STORED as ORC TBLPROPERTIES('orc.compress'='NONE')' \
-m 100
Error:
Warning: /usr/iop/4.1.0.0/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/06/17 07:10:08 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6_IBM_20
16/06/17 07:10:08 ERROR tool.BaseSqoopTool: Error parsing arguments for job:
16/06/17 07:10:08 ERROR tool.BaseSqoopTool: Unrecognized argument: -D
16/06/17 07:10:08 ERROR tool.BaseSqoopTool: Unrecognized argument: sqoop.metastore.client.record.password=true
Can anyone please help me with this? I need to run the job without being prompted for a password in the CLI.
You can save your password in a file and specify the path to this file with the parameter --password-file.
--password-file 'Set path for a file containing the authentication password'
Sqoop will then read the password from the file and pass it to the MapReduce cluster using secure means without exposing the password in the job configuration.
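For instance, the job from the question could be created along these lines (the password-file path is just a placeholder); because the job stores the file path rather than the password itself, it should not prompt when it is later executed:
sqoop job --create TEST \
  -- import \
  --connect jdbc:netezza://xx.xxx.xx.xxx/database \
  --username username \
  --password-file /user/username/netezza.password \
  --table tablename \
  --split-by key \
  --hcatalog-database hivedatabase \
  --hcatalog-table hivetable \
  --hcatalog-storage-stanza "STORED AS ORC TBLPROPERTIES('orc.compress'='NONE')" \
  -m 100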
In Sqoop for Hadoop you can use a parameters file for connection string information.
--connection-param-file filename Optional properties file that provides connection parameters
What is the format of that file?
Say for example I have:
jdbc:oracle:thin:@//myhost:1521/mydb
How should that be in a parameters file?
If you want to give your database connection string and credentials, create a file with those details and use --options-file in your sqoop command.
Create a file database.props with the following details:
import
--connect
jdbc:mysql://localhost:5432/test_db
--username
root
--password
password
Then your sqoop import command will look like this:
sqoop --options-file database.props \
--table test_table \
--target-dir /user/test_data
And related to --connection-param-file, I hope this link will be helpful for understanding its usage.
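For what it's worth, my understanding is that --connection-param-file expects a standard Java properties file whose key=value entries are handed to the JDBC driver when the connection is opened, so the URL itself still goes on --connect. A sketch, with illustrative Oracle driver properties and a placeholder table name:
# oracle-connection.properties (the property names below are only examples)
oracle.jdbc.timezoneAsRegion=false
defaultRowPrefetch=50
Usage:
sqoop import --connect jdbc:oracle:thin:@//myhost:1521/mydb \
  --username foo -P \
  --table MYTABLE \
  --connection-param-file oracle-connection.properties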
It should be the same as in the command.
Example
import
--connect
jdbc:oracle:thin:@//myhost:1521/mydb
--username
foo
Below is a sample command for connecting to a MySQL server:
sqoop list-databases --connect jdbc:mysql://192.168.256.156/test --username root --password root
It will give you the list of databases available on your MySQL server.
I would like to know if I will be able to execute a procedure and get the results with a Sqoop import command. I am not able to find any such scenario on the web. Please help.
I have tried something like this and it worked:
sqoop import --connect "jdbc:sqlserver://localhost;database=FADA" --username [name] --password [pdw] --query "print case when $CONDITIONS then 'yep' else 'yip' end exec dbo.ps" --target-dir /DIR/Psimport -m 1
https://issues.apache.org/jira/browse/SQOOP-769
It seems like Sqoop does not support it. Can you please let me know if there are any other tools that will help me to extract data from SQL Server to HDFS?
Have you tried the --query option in sqoop? The documentation for this option is here: http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_free_form_query_imports
Sqoop export has a stored procedure parameter but you have to also provide a table that will be evaluated with the stored procedure.
If you want to "execute Stored_procedure" in oracle from sqoop you need to use eval and in the query use the SQL*plus execute command:
'BEGIN STORED_PROCEDURE; END;'
example:
sqoop eval -Dmapred.job.queue.name=root.test.test-mis --connect jdbc:oracle:thin:@SERVER.NAME:PORT:INSTANCE --password **** --username MYSCHEMA --query "BEGIN MYSCHEMA.TEST_STORED_PROCEDURE_NAME; END;"