Lowercase table name on HANA with sqoop - sqoop

We are trying to import a SAP HANA table via Sqoop, and the table name is in lowercase.
We pass the table name in lowercase, but HANA converts it to uppercase and reports that the table was not found:
sqoop import --connect "jdbc:sap://hostname:port/?currentschema=test" --driver com.sap.db.jdbc.Driver --username test --password test -table "lower_case" --target-dir=/tmp/aa1 -m 1
The error we are getting:
[259]: invalid table name: Could not find table/view LOWER_CASE in schema test: line 1 col 17 (at pos 16)
Any suggestions, please?

Please put the schema name and table name in uppercase to avoid this error, and for convenience also specify the driver explicitly with the parameter --driver com.sap.db.jdbc.Driver
More details on:
https://blogs.sap.com/2015/08/14/importing-hana-tables-from-an-external-system-into-hdfs-using-apache-sqoop/
https://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html

If the table name really is in lowercase in the system, then you need to pass the name to HANA in quotation marks.
In your example command line you already tried that, but the shell most probably removes the quotation marks before handing the parameter values to the program.
To avoid that, try escaping the quotation marks in the command-line parameters.
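A minimal sketch of the quoting behaviour described above, using printf as a stand-in for any program that receives the argument. Whether Sqoop then forwards the quoted identifier to HANA unchanged is an assumption that has not been verified here.

```shell
# Without escaping, the shell strips the quotes before the program sees them.
plain=$(printf '%s' "lower_case")

# Two equivalent ways to keep literal double quotes in the argument.
escaped=$(printf '%s' "\"lower_case\"")
wrapped=$(printf '%s' '"lower_case"')

printf 'plain:   %s\nescaped: %s\nwrapped: %s\n' "$plain" "$escaped" "$wrapped"

# Assumed (untested) form of the option in the import command:
#   --table '"lower_case"'
```

Only the second and third forms deliver the double quotes to the program, which is what a case-sensitive identifier in HANA needs.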

Related

What is the relevance of -m 1?

I am executing the Sqoop command below:
sqoop import --connect 'jdbc:sqlserver://10.xxx.xxx.xx:1435;database=RRAM_Temp' --username DRRM_DATALOADER --password ****** --table T_VND --hive-import --hive-table amitesh_db.amit_hive_test --as-textfile --target-dir amitesh_test_hive -m 1
I have two questions:
1) What is the relevance of -m 1? As far as I know, it is the number of mappers that I am assigning to the Sqoop job. If that is true, why does the job start throwing the error below the moment I assign -m 2?
ERROR tool.ImportTool: Error during import: No primary key could be found for table xxx. Please specify one with --split-by or perform a sequential import with '-m 1'
Now I have to revise my understanding: it seems to have something to do with the database primary key. Can somebody explain the logic behind this?
2) I have told the above Sqoop command to save the file in text format. But when I go to the location suggested by the execution, I find tbl_name.jar. Why? If --as-textfile is the wrong syntax, what is the right one? Or is there another location where I can find the file?
1) To set -m or --num-mappers to a value greater than 1, the table must either have a PRIMARY KEY or the Sqoop command must be given a --split-by column. The "Controlling Parallelism" section of the Sqoop User Guide explains the logic behind this.
2) The file format of the data imported into the Hive table amit_hive_test will be plain text (--as-textfile). Because this is a --hive-import, the data is first imported into the --target-dir and then loaded (LOAD DATA INPATH) into the Hive table. The resulting data will be inside the table's LOCATION and not in --target-dir.
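To make the first point concrete: Sqoop needs a column whose minimum and maximum it can query, so that it can hand each mapper its own slice of the range. The sketch below only approximates that division for an integer column. The function name split_ranges is hypothetical and this is not Sqoop's actual splitter code.

```shell
# Approximate how an integer key range is divided among mappers.
# Usage: split_ranges MIN MAX MAPPERS
# Prints one "lo hi" pair per mapper; each mapper would read lo <= key < hi.
split_ranges() {
  min=$1
  max=$2
  n=$3
  span=$(( max - min + 1 ))
  i=0
  while [ "$i" -lt "$n" ]; do
    lo=$(( min + span * i / n ))
    hi=$(( min + span * (i + 1) / n ))
    echo "$lo $hi"
    i=$(( i + 1 ))
  done
}

# A key running from 0 to 1000, divided among 4 mappers.
split_ranges 0 1000 4
```

Without a primary key or a --split-by column there is no range to divide, which is why Sqoop falls back to asking for a sequential import with -m 1.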

Sqoop import converting TINYINT to BOOLEAN

I am attempting to import a MySQL table of NFL play results into HDFS using Sqoop. I issued the following command to achieve this:
sqoop import \
--connect jdbc:mysql://127.0.0.1:3306/nfl \
--username <username> -P \
--table play
Unfortunately, there are columns of type TINYINT, which are being converted to booleans upon import. For instance, there is a 'quarter' column for which quarter of the game the play occurred in. The value in this column is converted to 'true' if the play occurred in the first quarter and 'false' otherwise.
In fact, I did a sqoop import-all-tables, importing the entire NFL database I have, and it behaves like this uniformly.
Is there a way around this, or perhaps some argument for import or import-all-tables that prevents this from happening?
Add tinyInt1isBit=false to your JDBC connection URL, something like:
jdbc:mysql://127.0.0.1:3306/nfl?tinyInt1isBit=false
Another solution would be to explicitly override the column mapping for the datatype TINYINT(1) column. For example, if the column name is foo, then pass the following option to Sqoop during import: --map-column-hive foo=tinyint. In the case of non-Hive imports to HDFS, use --map-column-java foo=integer.
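One practical detail when adding the URL parameter: the characters ? and & are special to the shell, so the URL is best quoted, and the separator depends on whether the URL already carries parameters. A small sketch follows; the helper name add_jdbc_param is hypothetical and not part of Sqoop.

```shell
# Append a key=value parameter to a JDBC URL, using '?' for the first
# parameter and '&' for any further ones.
add_jdbc_param() {
  case "$1" in
    *\?*) printf '%s&%s\n' "$1" "$2" ;;
    *)    printf '%s?%s\n' "$1" "$2" ;;
  esac
}

url=$(add_jdbc_param 'jdbc:mysql://127.0.0.1:3306/nfl' 'tinyInt1isBit=false')
echo "$url"

# The quoted result can then be passed to the import, for example:
#   sqoop import --connect "$url" --username <username> -P --table play
```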

Sqoop export update only specified columns

As far as I know, we can update a database using the "--update-key" argument, which updates the whole record for that key. We can either insert or update with "--update-mode allowinsert" or "--update-mode updateonly".
For example, I have a file that consists of a primary key and one column's values, which I have to update in a table that has other columns too. My question is: can we update that particular column without updating the other columns in the table? We must specify all the columns for the --update-key argument, right? Is there any solution or workaround for this?
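Yes.
By using the "--update-key" and "--columns" arguments.
Example (the <...> placeholders stand for your own column names):
$ sqoop export --connect jdbc:mysql://localhost/TGL --username root --password root --table staging --export-dir /sqoop/DB1_Result -m 1 --update-key <key_column> --columns <key_column>,<column_to_update> --input-fields-terminated-by ","
Note: the column specified in --update-key must also be listed in the --columns argument.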

Sqoop Hive table import, Table dataType doesn't match with database

I am using Sqoop to import data from Oracle to Hive. It works fine, but it creates the table in Hive with only two data types, String and Double. I want to use Timestamp as the data type for some columns.
How can I do it?
bin/sqoop import --table TEST_TABLE --connect jdbc:oracle:thin:@HOST:PORT:orcl --username USER1 --password password --hive-import --hive-home /user/lib/Hive/
In addition to the other answers, it also helps to observe when the error occurs.
In my case, two types of columns caused errors: json and binary.
For the json column, the error came while a Java class was executing, at the very beginning of the import process:
/04/19 09:37:58 ERROR orm.ClassWriter: Cannot resolve SQL type
For the binary column, the error was thrown while importing into the Hive tables (after the data had been imported and put into HDFS files):
16/04/19 09:51:22 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column featured_binary
To get rid of these two errors, I had to provide the following options:
--map-column-java column1_json=String,column2_json=String,featured_binary=String --map-column-hive column1_json=STRING,column2_json=STRING,featured_binary=STRING
In summary, we may have to provide --map-column-java or --map-column-hive, depending on where the failure occurs.
You can use the parameter --map-column-hive to override default mapping. This parameter expects a comma-separated list of key-value pairs separated by = to specify which column should be matched to which type in Hive.
sqoop import \
...
--hive-import \
--map-column-hive id=STRING,price=DECIMAL
A new feature was added with SQOOP-2103/Sqoop 1.4.5 that lets you specify the decimal precision with the --map-column-hive parameter. Example:
--map-column-hive 'TESTDOLLAR_AMT=DECIMAL(20%2C2)'
This syntax would define the field as a DECIMAL(20,2). The %2C is used as a comma and the parameter needs to be in single quotes if submitting from the bash shell.
I tried using Decimal with no modification and I got a Decimal(10,0) as a default.
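As a small illustration of the encoding described above, the following sketch builds such a mapping string, with the comma written as %2C. The helper name hive_decimal is hypothetical and not part of Sqoop.

```shell
# Build a --map-column-hive entry for a DECIMAL column.
# Usage: hive_decimal COLUMN PRECISION SCALE
# '%%' in the printf format emits a literal '%', giving the %2C escape.
hive_decimal() {
  printf '%s=DECIMAL(%s%%2C%s)\n' "$1" "$2" "$3"
}

mapping=$(hive_decimal TESTDOLLAR_AMT 20 2)
echo "$mapping"

# Passed in quotes so the shell leaves the parentheses alone:
#   --map-column-hive "$mapping"
```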

How to use sqoop to export the default hive delimited output?

I have a hive query:
insert overwrite directory '/x'
select ...
Then I try to export the data with Sqoop:
sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x --input-fields-terminated-by 0x01 --lines-terminated-by '\n'
But this seems to fail to parse the fields according to the delimiter.
What am I missing?
I suspect the --input-fields-terminated-by 0x01 part doesn't work as expected.
I do not want to create additional tables in hive that contains the query results.
stack trace:
2013-09-24 05:39:21,705 ERROR org.apache.sqoop.mapreduce.TextExportMapper: Exception:
java.lang.NumberFormatException: For input string: "9-2"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
...
The output as viewed in vi:
16-09-2013 23^A1182^A-1^APub_X^A21782^AIT^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0
16-09-2013 23^A1182^A6975^ASoMo Audience Corp^A2336143^AUS^A1^A1^A0^A0^A0^A0.2^A0.0^A0.0
16-09-2013 23^A1183^A-1^APub_UK, Inc.^A1564001^AGB^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0
17-09-2013 00^A1120^A-1^APub_US^A911^A--^A181^A0^A0^A0^A0^A0.0^A0.0^A0.0
I've found the correct solution for that special character in bash
#!/bin/bash
# ... your script
hive_char=$( printf "\x01" )
sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x --input-fields-terminated-by ${hive_char} --lines-terminated-by '\n'
The problem was getting the separator recognized correctly (nothing to do with types or schema), and hive_char achieves that.
Another way to enter this special character on a Linux command line is to type Ctrl+V followed by Ctrl+A.
Using
--input-fields-terminated-by '\001' --lines-terminated-by '\n'
as flags in the sqoop export command seems to do the trick for me.
So, in your example, the full command would be:
sqoop export --connect jdbc:mysql://mysqlm/site --username site --password site --table x_data --export-dir /x --input-fields-terminated-by '\001' --lines-terminated-by '\n'
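To check outside Sqoop that the octal escape \001 really denotes the Ctrl-A byte Hive writes by default, a quick shell sketch can split a shortened copy of the first sample row above. The variable names are illustrative only.

```shell
# printf expands the octal escape \001 into the single Ctrl-A byte.
delim=$(printf '\001')

# Confirm the byte value in hex.
hex=$(printf '%s' "$delim" | od -An -tx1 | tr -d ' \n')

# First four fields of the first sample row, joined with the real delimiter.
line="16-09-2013 23${delim}1182${delim}-1${delim}Pub_X"

# Splitting on Ctrl-A recovers the intended fields.
second=$(printf '%s\n' "$line" | cut -d "$delim" -f 2)
fourth=$(printf '%s\n' "$line" | cut -d "$delim" -f 4)

printf 'hex=%s second=%s fourth=%s\n' "$hex" "$second" "$fourth"
```

With the right delimiter the first field stays whole as "16-09-2013 23", so a fragment such as "9-2" never reaches an integer column.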
I think it's a data type mismatch with your RDBMS schema.
Try to find the column that holds the "9-2" value and check its data type in the RDBMS schema.
If it is int or numeric, Sqoop will try to parse the value before inserting it, and "9-2" is not a numeric value.
Let me know if this doesn't work.
It seems Sqoop is taking '0' as the delimiter.
You are getting the error because the first column in your MySQL table could be a varchar and the second column a number.
Splitting the line on '0' gives:
16- 0 9-2 0 13 23^A1182^A-1^APub_X^A21782^AIT^A1^A0^A0^A0^A0^A0.0^A0.0^A0.0
The first column parsed by Sqoop is: 16-
and the second column is: 9-2
So it's better to specify the delimiter in quotes ('0x01'),
or
(this is always easier and gives better control) use a Hive create table command such as:
create table tablename row format delimited fields terminated by '\t' as select ... and specify '\t' as the delimiter in your sqoop command.
