Sqoop fails with password-file argument - hadoop

I have a Sqoop script which ingests data from SAP HANA into Hive. The script runs fine when I pass the password on the command line as "--password Password$$", but to secure the password I put it in a file called sap.password and used the argument "--password-file /dev/configs/sap.password". Now the script fails with an exception.
Below are my Sqoop script and the exception that occurred:
sqoop import \
--connect jdbc:sap://hostname?currentschema=SCHEMA_REF \
--driver com.sap.db.jdbc.Driver \
--username SERVICE_ACCOUNT \
--password-file /dev/configs/sap.password \
--table TABLE1 \
--hive-import \
--hive-overwrite \
--hive-database cdc_stg \
--hive-table HIVE_TABLE1 \
--as-parquetfile \
--m 1
The exception I get is (I'm sure the credentials are correct):
19/11/14 05:47:08 ERROR manager.SqlManager: Error executing statement:
com.sap.db.jdbc.exceptions.jdbc40.SQLInvalidAuthorizationSpecException: [10]: authentication failed
com.sap.db.jdbc.exceptions.jdbc40.SQLInvalidAuthorizationSpecException: [10]: authentication failed
at com.sap.db.jdbc.exceptions.jdbc40.SQLInvalidAuthorizationSpecException.createException(SQLInvalidAuthorizationSpecException.java:40)
at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.createException(SQLExceptionSapDB.java:290)
at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.generateDatabaseException(SQLExceptionSapDB.java:174)
at com.sap.db.jdbc.packet.ReplyPacket.buildExceptionChain(ReplyPacket.java:100)
at com.sap.db.jdbc.ConnectionSapDB.execute(ConnectionSapDB.java:1141)
at com.sap.db.jdbc.ConnectionSapDB.execute(ConnectionSapDB.java:888)
at com.sap.db.util.security.AbstractAuthenticationManager.connect(AbstractAuthenticationManager.java:43)
at com.sap.db.jdbc.ConnectionSapDB.openSession(ConnectionSapDB.java:586)
at com.sap.db.jdbc.ConnectionSapDB.doConnect(ConnectionSapDB.java:436)
at com.sap.db.jdbc.ConnectionSapDB.<init>(ConnectionSapDB.java:195)
at com.sap.db.jdbc.ConnectionSapDBFinalize.<init>(ConnectionSapDBFinalize.java:13)
at com.sap.db.jdbc.Driver.connect(Driver.java:255)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:903)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:762)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:785)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:288)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:259)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:245)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:333)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1879)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1672)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:515)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:633)
at org.apache.sqoop.Sqoop.run(Sqoop.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:182)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:233)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:242)
at org.apache.sqoop.Sqoop.main(Sqoop.java:251)
19/11/14 05:47:08 ERROR tool.ImportTool: Import failed: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1678)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:515)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:633)
at org.apache.sqoop.Sqoop.run(Sqoop.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:182)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:233)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:242)
at org.apache.sqoop.Sqoop.main(Sqoop.java:251)

I suspect the password file was created with a trailing newline, since --password works fine and the only change is the switch to a password file.
Please re-create the password file, following the warning from the Sqoop docs quoted below.
Reference: Sqoop User Guide
Sqoop will read the entire content of the password file and use it as a password. This will include any trailing white space characters such as newline characters that are added by default by most of the text editors. You need to make sure that your password file contains only characters that belong to your password. On the command line, you can use command echo with switch -n to store password without any trailing white space characters.
For example, to store the password secret:
echo -n "secret" > password.file
Also, instead of sqoop import, try list-databases, list-tables, or eval to test the connection with the password file.
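For example, a minimal sketch of rebuilding the file and testing the connection, assuming /dev/configs/sap.password is an HDFS path and reusing the connection details from the question (the literal password is only a placeholder):

# single quotes stop the shell from expanding $$ while writing the file,
# and -n suppresses the trailing newline
echo -n 'Password$$' > sap.password
hdfs dfs -put -f sap.password /dev/configs/sap.password

# cheap connectivity test before running the full import
sqoop eval \
  --connect jdbc:sap://hostname?currentschema=SCHEMA_REF \
  --driver com.sap.db.jdbc.Driver \
  --username SERVICE_ACCOUNT \
  --password-file /dev/configs/sap.password \
  --query "SELECT 1 FROM DUMMY"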

Please also check the password file permissions. From the Sqoop docs:
You should save the password in a file on the users home directory with 400 permissions
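For instance, a sketch of locking the file down, covering both the local file system and HDFS (paths taken from the question above):

# if the password file is on the local file system
chmod 400 sap.password

# if it lives on HDFS
hdfs dfs -chmod 400 /dev/configs/sap.password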

Related

Is there a way to execute a free-form query from a file in Sqoop?

I have executed a Sqoop command similar to the one shown below. I would like to keep the free-form query in a file and have the Sqoop command read it from there, since my real queries are quite complex and large.
Is there a way to keep the query in a file and have the Sqoop command refer to the free-form query inside the file, the way the --password-file case works? Thanks in advance.
sqoop import --connect "jdbc:mysql://<localhost>:port" --username "admin" --password-file "<passwordfile>" --query "select * from employee" --split-by employee_id --target-dir "<target directory>" --incremental append --check-column employee_id --last-value 0 --fields-terminated-by "|"
Command-line options that are not convenient to put on the command line can be read with Sqoop's --options-file argument, so you can read the query from an options file. With an options file, the Sqoop command should look similar to this:
sqoop import --connect $connect_string --username $username --password $pwd --options-file /home/user/sqoop_poc/query.txt --target-dir $target_dir --m 1
The entry in the options file should look like this:
--query
select * from TEST_OPTION where ID <= 10 AND $CONDITIONS
More details on options files are available in the Sqoop User Guide.

Using database name along with service name while importing from Oracle using Sqoop

When importing from Oracle using Sqoop, I have already specified the service name in the connection string jdbc:oracle:thin:@servername/servicename. I am unable to add the database name to the connection string, and specifying it in the --table parameter as databasename.tablename gives the error below.
Import failed: There is no column found in the target table
databasename.tablename. Please ensure that your table name is correct.
Is there any way to do this, or is using the --query parameter the only option?
The correct working command with Oracle:
sqoop import --connect "jdbc:oracle:thin:@//host:port/service_name" --query "select column_name from oracle_schema_name.table where $CONDITIONS" --username $USER_NAME --password $PASSWORD --target-dir $TABLE_DIRECTORY_NAME

Special characters are not correct after sqooping data from Teradata into Hive

I'm trying to sqoop a Teradata table into Hive using the "sqoop import" command below.
sqoop tdimport \
-Dtdch.output.hdfs.avro.schema.file=/tmp/data/country.avsc \
--connect jdbc:teradata://tdserver/database=SALES --username tduser \
--password tdpw --as-avrodatafile --target-dir /tmp/data/country_avro \
--table COUNTRY --split-by SALESCOUNTRYCODE --num-mappers 1
The Teradata table contains special characters in some columns. After sqooping into Hive, the special characters do not come through correctly.
Is there any way to preserve the special characters when firing the sqoop import command?
Do we need to use UTF-8 to resolve this issue?
Any suggestions on this issue are appreciated.

How to protect password and username in Sqoop?

I want to hide the password that I am using to import data from my RDBMS into the Hadoop cluster. I am using --options-file to keep my password and username in a text file, but it's not protected.
Can I apply some kind of encryption to that file for better protection?
A secure way of supplying the password to the database:
Save the password in a file in the user's home directory with 400 permissions and specify the path to that file using the --password-file argument; this is the preferred method of entering credentials. Sqoop will then read the password from the file and pass it to the MapReduce cluster using secure means, without exposing the password in the job configuration. The file containing the password can be either on the local FS or on HDFS. For example:
$ sqoop import --connect jdbc:mysql://database.example.com/employees \
--username venkatesh --password-file ${user.home}/.password
Check the Sqoop docs for more details.
Also, you can use the -P option to read the password from the console.
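As an illustration, a sketch using -P with the same connection string as above (the table name is just a placeholder); Sqoop prompts for the password interactively instead of reading it from the command line or a file:

sqoop import --connect jdbc:mysql://database.example.com/employees \
  --username venkatesh --table employees -P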
It seems that this question has been addressed previously here, and it is also described on this Hortonworks page. The approach basically consists of creating an .enc file; you also need to configure several parameters, such as the passphrase used to decrypt it.
sqoop import \
-Dorg.apache.sqoop.credentials.loader.class=org.apache.sqoop.util.password.CryptoFileLoader \
-Dorg.apache.sqoop.credentials.loader.crypto.passphrase=sqoop2 \
--connect jdbc:mysql://example.com/sqoop \
--username sqoop \
--password-file file:///tmp/pass.enc \
--table tbl
Here are the parameters that can be configured (again following the reference):
org.apache.sqoop.credentials.loader.class – the credentials loader.
org.apache.sqoop.credentials.loader.crypto.alg – the algorithm used to decrypt the file (default is AES/ECB/PKCS5Padding).
org.apache.sqoop.credentials.loader.crypto.salt – the salt used to derive a key with the passphrase (default is SALT).
org.apache.sqoop.credentials.loader.crypto.iterations – number of PBKDF2 iterations (default is 10000).
org.apache.sqoop.credentials.loader.crypto.salt.key.len – derived key length (default is 128).
org.apache.sqoop.credentials.loader.crypto.passphrase – passphrase used to derive the key.
Alternatively, you can follow the Sqoop documentation and create a password alias that gets retrieved by an implementation of the CredentialProviderPasswordLoader class. You can see the whole class here.
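A rough sketch of that alias-based approach, assuming Sqoop 1.4.5+ and a JCEKS keystore on HDFS (the alias name and keystore path below are illustrative only):

# store the password under an alias in a Hadoop credential provider
# (the command prompts for the password value to store)
hadoop credential create mydb.password.alias \
  -provider jceks://hdfs/user/sqoop/passwords.jceks

# reference the alias instead of a password or password file
sqoop import \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/user/sqoop/passwords.jceks \
  --connect jdbc:mysql://example.com/sqoop \
  --username sqoop \
  --password-alias mydb.password.alias \
  --table tbl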

sqoop import completes but hive show tables can't see the table

After installing Hadoop and Hive (CDH version), I execute
./sqoop import -connect jdbc:mysql://10.164.11.204/server -username root -password password -table user -hive-import --hive-home /opt/hive/
Everything goes fine, but when I enter the Hive command line and execute show tables, there is nothing.
When I use ./hadoop fs -ls, I can see /user/(username)/user exists.
Any help is appreciated.
---EDIT-----------
/sqoop import -connect jdbc:mysql://10.164.11.204/server -username root -password password -table user -hive-import --target-dir /user/hive/warehouse
The import fails due to:
11/07/02 00:40:00 INFO hive.HiveImport: FAILED: Error in semantic analysis: line 2:17 Invalid Path 'hdfs://hadoop1:9000/user/ubuntu/user': No files matching path hdfs://hadoop1:9000/user/ubuntu/user
11/07/02 00:40:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 10
at com.cloudera.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:326)
at com.cloudera.sqoop.hive.HiveImport.executeScript(HiveImport.java:276)
at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:218)
at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:362)
at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:423)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
Check your hive-site.xml for the value of the property javax.jdo.option.ConnectionURL. If you do not define this explicitly, the default value will use a relative path for creation of the Hive metastore (jdbc:derby:;databaseName=metastore_db;create=true), which will be different depending upon where you launch the process from. This would explain why you cannot see the table via show tables.
Define this property value in your hive-site.xml using an absolute path.
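For example, a sketch of what that could look like in hive-site.xml, with the absolute path being only a placeholder for your environment:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/youruser/metastore_db;create=true</value>
</property>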
There is no need to create the table in Hive first; refer to the query below:
sqoop import --connect jdbc:mysql://xxxx.com/databasename --username root --password admin --table tablename (MySQL table) --direct -m 1 --hive-import --create-hive-table --hive-table tablename --target-dir '/user/hive/warehouse/tablename (which you want to create in Hive)' --fields-terminated-by '\t'
In my case Hive stores data in the /user/hive/warehouse directory in HDFS. This is where Sqoop should put it.
So I guess you have to add:
--target-dir /user/hive/warehouse
which is the default location for Hive tables (it might be different in your case).
You might also want to create this table in Hive:
sqoop create-hive-table --connect jdbc:mysql://host/database --table tableName --username user --password password
In my case it creates the table in Hive's default database; you can give it a try.
sqoop import --connect jdbc:mysql://xxxx.com/databasename --username root --password admin --table NAME --hive-import --warehouse-dir DIR --create-hive-table --hive-table NAME -m 1
Hive tables will be created by the Sqoop import process. Please make sure /user/hive/warehouse exists in your HDFS. You can browse HDFS via http://localhost:50070/dfshealth.jsp (the Browse the File System option).
Also include the full HDFS location in --target-dir, i.e. hdfs://:9000/user/hive/warehouse, in the sqoop import command.
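As a quick sketch of that check from the shell (assuming a default setup; create the directory first if it is missing):

hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -ls /user/hive/warehouse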
First of all, create the table definition in Hive with exactly the same field names and types as in MySQL.
Then perform the import operation.
For Hive import:
sqoop import --verbose --fields-terminated-by ',' --connect jdbc:mysql://localhost/test --table tablename --hive-import --warehouse-dir /user/hive/warehouse --fields-terminated-by ',' --split-by id --hive-table tablename
'id' can be the primary key of your existing table
'localhost' can be your local IP
'test' is the database
the 'warehouse' directory is in HDFS
I think all you need is to specify the Hive table where the data should go.
Add "--hive-table database.tablename" to the sqoop command and remove --hive-home /opt/hive/. I think that should resolve the problem.
