SQL Loader incompatible length - oracle

This is my control file
FIELDS (
dummy1 filler terminated by "cid=",
address enclosed by "<address>" and "</address>"
...
The address column in the table is varchar(10).
If the address in the file is over 10 characters then SQL*Loader cannot load it.
How can I load the address truncated to 10 characters?

The documentation has a section on applying SQL operators to fields.
A wide variety of SQL operators can be applied to field data with the SQL string. This string can contain any combination of SQL expressions that are recognized by the Oracle database as valid for the VALUES clause of an INSERT statement. In general, any SQL function that returns a single value that is compatible with the target column's datatype can be used.
In this case you can use the substr() function on the value from the file:
...
dummy1 filler terminated by "cid=",
address enclosed by "<address>" and "</address>" "substr(:address, 1, 10)"
...
The quoted "substr(:address, 1, 10)" passes the initial value from the file through the function before inserting the resulting 10 character (maximum) value, however long the original value in the file was. Note the colon before the name in that function call.
If your file is XML then you might be better off loading it as an external table and then using the built-in XML query tools to extract the data you want, rather than trying to parse it through delimited field definitions.

Related

Load string data that does not have quotes to Hive

I'm trying to load some test data into a simple Hive table. The data is comma separated, but the individual elements are not enclosed in double quotes, and I'm getting an error because of this. How do I tell Hive not to expect varchar fields to be enclosed in quotes? Manually adding quotes to the varchar fields is not an option, since the input file has thousands of records. A sample query and data are below.
create table mydatabase.flights(FlightDate varchar(10),Airline int,FlightNum int,Origin varchar(4),Destination varchar(4),Departure varchar(4),DepDelay double,Arrival varchar(4),ArrivalDelay double,Airtime double,Distance double) row format delimited;
insert into mydatabase.flights(FlightDate,Airline,FlightNum,Origin,Destination,Departure,DepDelay,Arrival,ArrivalDelay,Airtime,Distance)
values(2014-04-01,19805,1,JFK,LAX,0854,-6.00,1217,2.00,355.00,2475.00);
The insert query above gives me an error message. It works fine if I enclose the varchar fields in quotes.
Error while compiling statement: FAILED: ParseException line 11:11 mismatched input '-' expecting ) near '2014' in value row constructor
I'm loading data using the following query
load data inpath '/user/alpsusa/hive/flights.csv' overwrite into table mydatabase.flights;
After the load, I see only the first field populated. All the others are NULL.
Sample data
2014-04-01,19805,1,JFK,LAX,0854,-6.00,1217,2.00,355.00,2475.00
2014-04-01,19805,2,LAX,JFK,0944,14.00,1736,-29.00,269.00,2475.00
2014-04-01,19805,3,JFK,LAX,1224,-6.00,1614,39.00,371.00,2475.00
2014-04-01,19805,4,LAX,JFK,1240,25.00,2028,-27.00,264.00,2475.00
2014-04-01,19805,5,DFW,HNL,1300,-5.00,1650,15.00,510.00,3784.00
Below is the output of DESCRIBE FORMATTED

sqlldr WHEN clause

I am trying to code a sqlldr.ctl file WHEN Clause to limit the records imported to those matching a portion of the current Schema's name.
The code I have (which does NOT work) is:
LOAD DATA
TRUNCATE INTO TABLE TMP_PRIM_ACCTS
when REGION_NUM = substr(user,-3,3)
Fields terminated by "|" Optionally enclosed by '"'
Trailing NULLCOLS
( PORTFOLIO_ACCT,
PRIMARY_ACCT_ID NULLIF (PRIMARY_ASSET_ID="NULL"),
REGION_NUM NULLIF (PARTITION_NUM="NULL")
)
sqlldr returns:
SQL*Loader-350: Syntax error at line 3.
Expecting quoted string or hex identifier, found "substr".
when PARTITION_NUM = substr(user,-3,3)
I cannot put single quotes around "user", because that turns it into the literal string "user". Can anyone explain how I can reference the "active" User in this WHEN Clause?
Thank you!
Can you try something like this? (I can't test with SQL*Loader right now, but this is the syntax I have used for changing values):
when REGION_NUM = "substr(:user,-3,3)"
It doesn't look like you can. The documentation only shows fixed values:
If you try to use an expression in the when clause (or in nullif; I thought I'd see whether you could cause a rejection based on a null PK value), you just see the literal value in the log:
Table TMP_PRIM_ACCTS, loaded when REGION_NUM = 0X73756273747228757365722c2d332c3329(character 'substr(user,-3,3)')
which is more or less what you were referring to when you said you couldn't quote user, but you would have to quote the whole thing anyway. Using :user doesn't work either: the colon is treated as just another character, and SQL*Loader doesn't try to find a column called user.
The simplest approach may be to pre-process the data file and remove any rows which don't match the pattern (e.g. via a regex). That would actually be slightly easier if you used an external table instead of SQL*Loader.
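As a rough illustration of the pre-processing idea, here is a minimal Python sketch. It uses the csv module rather than a regex, and it assumes the region is the third pipe-delimited field, as in the control file from the question. The function name filter_rows and the sample rows are made up for the example.

```python
import csv
import io

def filter_rows(lines, region, region_index=2):
    """Yield only the rows whose region field equals `region`.

    Assumes pipe-delimited data with optional double-quote enclosures,
    matching the FIELDS clause in the question's control file.
    """
    reader = csv.reader(lines, delimiter="|", quotechar='"')
    for row in reader:
        if len(row) > region_index and row[region_index] == region:
            yield row

# Hypothetical sample data: account, primary account id, region.
sample = io.StringIO('A1|100|001\nA2|"200"|002\nA3|300|001\n')
kept = list(filter_rows(sample, "001"))
print(kept)
```

The filtered rows would then be written to a new file, and that file would be passed to sqlldr in place of the original.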
Alternatively, generate your control file and embed the correct literal value based on the user you'll connect as.
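A sketch of generating the control file might look like the following. The template is copied from the question with the WHEN clause switched to a quoted literal. The function name render_control_file and the user name LOADER_042 are assumptions for illustration only.

```python
# Control file from the question, with a placeholder for the literal region.
CONTROL_TEMPLATE = """LOAD DATA
TRUNCATE INTO TABLE TMP_PRIM_ACCTS
WHEN REGION_NUM = '{region}'
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
( PORTFOLIO_ACCT,
  PRIMARY_ACCT_ID NULLIF (PRIMARY_ASSET_ID="NULL"),
  REGION_NUM NULLIF (PARTITION_NUM="NULL")
)
"""

def render_control_file(username):
    """Embed the last three characters of the user name as a literal."""
    if len(username) < 3:
        raise ValueError("user name is shorter than three characters")
    region = username[-3:]  # same intent as substr(user, -3, 3)
    if not region.isalnum():
        # Keep quotes and other special characters out of the quoted literal.
        raise ValueError("unexpected characters in region: " + repr(region))
    return CONTROL_TEMPLATE.format(region=region)

control_text = render_control_file("LOADER_042")  # hypothetical user name
print(control_text)
```

The generated text would be written to a .ctl file before sqlldr is invoked with the matching credentials.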

Skip hyphen in hive

I executed a query in the Hive CLI that should create an external table.
"create EXTERNAL TABLE IF NOT EXISTS hassan( code int, area_name string,
male_60_64 STRUCT,
male_above_65 STRUCT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"
It works fine, but if I use "-" instead of "_" I get an error.
"create EXTERNAL TABLE IF NOT EXISTS hassan ( code int, area_name string, male-60-64 STRUCT< c1 : string, x-user : string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"
Any help would be greatly appreciated.
The answer by Addy already provided an example of how to use a hyphen in a column name. Here is an addition that describes how this works in different versions of Hive, according to the documentation:
In Hive 0.12 and earlier, only
alphanumeric and underscore characters are allowed in table and
column names.
In Hive 0.13 and later, column names can contain any
Unicode character (see HIVE-6013). Any column name that is specified
within backticks (`) is treated literally. Within a backtick string,
use double backticks (``) to represent a backtick character. Backtick
quotation also enables the use of reserved keywords for table and
column identifiers.
To revert to pre-0.13.0 behavior and restrict
column names to alphanumeric and underscore characters, set the
configuration property hive.support.quoted.identifiers to none. In
this configuration, backticked names are interpreted as regular
expressions. For details, see Supporting Quoted Identifiers in Column
Names.
In addition to that, you can also find the syntax for STRUCT there, which should help you with the error that you mentioned in the comments:
struct_type : STRUCT < col_name : data_type [COMMENT col_comment],
...>
Update:
Note that hyphens in complex types (so inside structs) do not appear to be supported.
Try Quoted Identifiers
create table hassan( code int, `area_name` string, `male-60-64` STRUCT, `male-above-65` STRUCT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
Reference:
https://issues.apache.org/jira/secure/attachment/12618321/QuotedIdentifier.html
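If the DDL is generated by a script, the backtick rule quoted above (wrap the name in backticks and double any backtick inside it) can be applied with a small helper. This is only a sketch in Python. The function name quote_hive_identifier is made up, and it does not work around the limitation on hyphens inside struct field names.

```python
def quote_hive_identifier(name):
    """Wrap a column name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"

columns = ["code", "area_name", "male-60-64"]
quoted = [quote_hive_identifier(c) for c in columns]
print(", ".join(quoted))
```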

How to call ora_hash function inside control file in sql loader?

I'm trying to call a function (ORA_HASH) inside sqlldr, but I can't get it to work.
Data File
abc.txt
AKY,90035,"G","DP",20150121,"",0,,,,,,"","E8BD4346-A174-468B-ABC2-1586B81A8267",1,17934,5099627512855,"TEST of CLOROM","",14.00,"",14.00,17934,5099627512855,"TEST of CLOROM",14.00,"ONE TO BE T ONE",344,0,"98027f93-4f1a-44b2-b609-7ffbb041a375",,,AKY8035,"Taken Test","L-20 Shiv Lok"
AKY,8035,"D","DP",20150121,"",0,,,,,,"","E8BD4346-A174-468B-ABC2-1586B81A8267",2,17162,5099627885843,"CEN TESt","",15.00,"",250.00,17162,5099627885843,"CEN TESt",15.00,"ONE TDAILY",3659,0,"09615cc8-77c9-4781-b51f-d44ec85bbe54",,,LLY8035,"Taken Test","L-20 Shiv Lok"
Control file
cnt_file.ctl
load data
into table Table_XYZ
fields terminated by "," optionally enclosed by '"'
F1,F2,F3,F4,F5,F6,F7,F8,F9,F10,F11,F12,F13,F14,F15,F16,F17,F18,F19,F20,F21,F22,F23,F24,F25,F26,F27,F28,F29,F30,F31 ORA_HASH(CONCAT(F2,F5,F6,F9,F10,F12,F13,F14,F15,F16,F17,F19,F21,F22)),F32 ORA_HASH(CONCAT(f23,H24,F7,F8,F3)),F33,F34,F35
sqlldr "xxxxx/yyyyy" control=cnt_file.ctl data=abc.txt
Whenever I run sqlldr from a Linux box I get the error below:
SQL*Loader-350: Syntax error at line 4.
Expecting "," or ")", found "ORA_HASH".
F29,F30,F31,KEY_CLMNS_HASH ORA_HASH(CONCAT( F2,F5
^
Any ideas?
You might consider using a virtual column on the table to which you are loading the data.
For columns which are deterministically based on other column values in the same row, that usually ends up being a simpler solution than anything involving SQL*Loader.
You're doing a few things wrong. The immediate error is because the Oracle function call has to be enclosed in double quotes:
...,F31 "ORA_HASH(CONCAT(F2,F5,F6,...))",...
The second issue is that the concat function only takes two arguments, so you would either have to nest (lots of) concat calls, or more readably use the concatenation operator instead:
...,F31 "ORA_HASH(F2||F5||F6||...)",...
And finally you need to prefix the field names inside your function call with a colon:
...,F31 "ORA_HASH(:F2||:F5||:F6||...)",...
This is explained in the documentation:
The following requirements and restrictions apply when you are using SQL strings:
...
The SQL string must be enclosed in double quotation marks.
And
To refer to fields in the record, precede the field name with a colon (:). Field values from the current record are substituted. A field name preceded by a colon (:) in a SQL string is also referred to as a bind variable. Note that bind variables enclosed in single quotation marks are treated as text literals, not as bind variables.
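With this many fields, typing the expression by hand is error-prone, so one option is to build the SQL string with a short script. The Python sketch below applies the three rules above: double quotes around the whole string, the || operator instead of concat, and a colon before each field name. The helper name build_hash_sql is hypothetical, and the field list is shortened for the example.

```python
def build_hash_sql(field_names):
    """Return a double-quoted SQL string for a SQL*Loader field entry."""
    # Prefix each field with a colon so it is treated as a bind variable.
    bound = "||".join(":" + name for name in field_names)
    return '"ORA_HASH(' + bound + ')"'

hash_fields = ["F2", "F5", "F6"]  # shortened list for illustration
field_entry = "F31 " + build_hash_sql(hash_fields)
print(field_entry)
```

The printed text can be pasted into the field list of the control file in place of the unquoted ORA_HASH call.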

convert newline character to html tag in oracle sql 11g

I am reading a file which has many descriptive fields of length 4000 bytes.
While reading this file as a table I could not put the conditions into the source qualifier because it does not have enough space to put SQL in it.
Hence I decided to replace the newline characters with spaces in an expression transformation.
I tried the following code:
REPLACECHR(0,SUPSPD_COMMUNICATION_PLAN, CHR(10)||CHR(11)||CHR(12),'<br/>')
I am getting < in the output.
How do I do this?
