Multibyte character error in Oracle while trying to load block elements

We have a flat file that we are trying to load into an Oracle 19c table using SQL*Loader, but it fails with "Multibyte character error" for one of the CHAR(2) fields. We know it's a junk value, but we still have to load it into the database. The database character set is AL32UTF8.
The value we are trying to load is a block element: U+2592 ▒ MEDIUM SHADE.
We tried specifying UTF-8 in the SQL*Loader control file but are still facing the same issue. Any advice on how to proceed?
Command:
sqlldr $connection parfile=SCHEMA.TABLE.par
Error:
Record 1: Rejected - Error on table SCHEMA.TABLE, column COL2.
Multibyte character error.
Column info: COL2 CHAR(2) NOT NULL ENABLE
Parameter file contents:
data=filename.dat
control=SCHEMA.TABLE.ctl
log=SCHEMA.TABLE.log
bad=SCHEMA.TABLE.bad
Control file contents:
OPTIONS (BINDSIZE=20000000,READSIZE=10485760,ROWS=10000,DIRECT=FALSE,ERRORS=50)
LOAD DATA
CHARACTERSET 'AL32UTF8'
DISCARDMAX 100
REPLACE PRESERVE BLANKS INTO TABLE SCHEMA.TABLE
TRAILING NULLCOLS
(
COL1 POSITION(1:15) "NVL(:COL1,' ')",
COL2 POSITION(16:17) "NVL(:COL2,' ')"
)
File characterset:
filename.dat: text/plain; charset=utf-8
OS: GNU/Linux

From the documentation:
The start and end arguments to the POSITION parameter are interpreted in bytes, even if character-length semantics are in use in a data file.
So POSITION(16:17) is the 16th and 17th bytes of the line, not the 16th (and, in your example, only) character. The U+2592 character is three bytes in UTF-8, 0xE2 0x96 0x92 (e29692), and you're only looking at the first two of those bytes; on their own they don't represent a valid character.
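If you want to verify that byte sequence yourself, a quick check from SQL (a sketch: UNISTR builds the character from its code point, TO_CHAR converts it to the database character set, and DUMP with format 1016 shows the bytes in hex):
-- In an AL32UTF8 database this reports Len=3 and the bytes e2,96,92
SELECT LENGTHB(TO_CHAR(UNISTR('\2592'))) AS byte_len,
       DUMP(TO_CHAR(UNISTR('\2592')), 1016) AS bytes
FROM dual;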
You can change from using POSITION to using CHAR with the fixed length of each field, and specify LENGTH SEMANTICS CHARACTER:
OPTIONS (BINDSIZE=20000000,READSIZE=10485760,ROWS=10000,DIRECT=FALSE,ERRORS=50)
LOAD DATA
CHARACTERSET 'AL32UTF8'
LENGTH SEMANTICS CHARACTER
DISCARDMAX 100
REPLACE PRESERVE BLANKS INTO TABLE SCHEMA.TABLE
TRAILING NULLCOLS
(
COL1 CHAR(15) "NVL(:COL1,' ')",
COL2 CHAR(2) "NVL(:COL2,' ')"
)
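Once the load succeeds, you can sanity-check what actually landed in the column with DUMP (a quick verification sketch, reusing the table and column names from the question):
-- U+2592 stored in an AL32UTF8 database should show as e2,96,92
SELECT COL2, DUMP(COL2, 1016) AS stored_bytes
FROM SCHEMA.TABLE
WHERE ROWNUM <= 10;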

Related

Oracle 12c - SQL Loader Invalid month Error

I'm getting an "ORA-01843: not a valid month" error while loading data from a tab-delimited text file with SQL*Loader 11.1.0.6.0 on Oracle 12c.
control file:
options (skip=1)
load data
truncate
into table test_table
fields terminated by '\t' TRAILING NULLCOLS
(
type,
rvw_date "case when :rvw_date = 'NULL' then null WHEN REGEXP_LIKE(:rvw_date, '\d{4}/\d{2}/\d{2}') THEN to_date(:rvw_date,'yyyy/mm/dd') else to_date(:rvw_date,'mm-dd-yy') end"
)
Data:
type rvw_date
Phone 2014/01/29
Phone 2014/02/13
Field NULL
Phone 01/26/15
Field 02/25/12
Schema:
create table test_table
(
type varchar2(20),
rvw_date date
)
The SQL*Loader control file is interpreting a backslash as an escape character, it seems. The SQL operator section of the documentation shows double-quotes being escaped too; it isn't obvious that it would apply to anything else, but it kind of makes sense that a single backslash would always be assumed to be escaping something.
Your regular expression pattern needs to double-escape the \d to make the pattern work in the control file as it does in plain SQL. At the moment the pattern is not matched, so all the values go to the else branch, which applies the wrong format mask for the yyyy/mm/dd values.
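To convince yourself that the single-backslash pattern is fine in plain SQL, and that it's only the control-file parsing that consumes one level of escaping, a quick check against dual:
-- In plain SQL, \d works unescaped; in the control file it must be \\d
SELECT CASE WHEN REGEXP_LIKE('2014/01/29', '\d{4}/\d{2}/\d{2}')
            THEN 'match' ELSE 'no match' END AS result
FROM dual;
-- returns: match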
This works:
options (skip=1)
load data
truncate
into table test_table
fields terminated by '\t' TRAILING NULLCOLS
(
type,
rvw_date "case when :rvw_date = 'NULL' then null when REGEXP_LIKE(:rvw_date,'\\d{4}/\\d{2}/\\d{2}') then to_date(:rvw_date,'yyyy/mm/dd') else to_date(:rvw_date,'mm/dd/rr') end"
)
With your original data that creates rows:
alter session set nls_date_format = 'YYYY-MM-DD';
select * from test_table;
TYPE                 RVW_DATE
-------------------- ----------
Phone                2014-01-29
Phone                2014-02-13
Field
Phone                2015-01-26
Field                2012-02-25

loading data in table using SQL Loader

I'm loading data into my table through SQL*Loader.
The data loads successfully, but I'm getting a garbage (repetitive) value in a particular column for all rows.
After inserting:
the column TERM_AGREEMENT gets the value '806158336' for every record.
My CSV file contains at most 3-digit values for that column, but I'm forced to set my column definition to NUMBER(10).
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER
)
create table LOAN_BALANCE_MASTER_INT
(
ACCOUNT_NO NUMBER(30),
CUSTOMER_NAME VARCHAR2(70),
LIMIT NUMBER(30),
PRODUCT_DESC VARCHAR2(30),
SUBPRODUCT_CODE NUMBER,
ARREARS_INT NUMBER(20,2),
IRREGULARITY NUMBER(20,2),
PRINCIPLE_IRREGULARITY NUMBER(20,2),
TERM_AGREEMENT NUMBER(10)
)
INTEGER is a binary datatype. If you're importing a CSV file, the numbers are presumably stored as plain text, so you should use INTEGER EXTERNAL. The EXTERNAL clause specifies character data that represents a number.
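With the question's field list, that would look like this (only the last field changes):
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER EXTERNAL
)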
Edit:
The issue seems to be the termination character of the file. You should be able to solve this issue by editing the INFILE line this way:
INFILE '/ipoapplication/utl_file/LBR_HE_Mar16.csv' "STR X'5E204D'"
Where '5E204D' is the hexadecimal for '^ M'. To get the hexadecimal value you can use the following query:
SELECT utl_raw.cast_to_raw ('^ M') AS hexadecimal FROM dual;
Hope this helps.
I actually solved this issue on my own.
Firstly, thanks to @Gary_W and @Alessandro for their input. I really appreciate your help, guys; I learned some new things in the process.
Here's the new fragment that worked, giving me the correct data for the last column:
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER TERMINATED BY WHITESPACE
)
'Terminated by whitespace': I went through some SQL*Loader threads and used TERMINATED BY WHITESPACE on the last column of the ctl file. It worked; this time I didn't even have to use INTEGER, EXTERNAL, or an expression for conversion.
Just one thing: can you let me know what could possibly be causing the issue? What was in that column of my CSV file, and how did adding this clause solve the problem?
Thanks.

Oracle Sql Loader "ORA-01722: invalid number" when loading CSV file with Windows line endings

I am using the Oracle SQL*Loader utility from a Linux shell to load CSV data into an Oracle DB.
But I have noticed that if the source CSV file's line endings are '\r\n' (Windows format), sqlldr fails to load data for the last column.
For example, if the last column is of FLOAT type (defined in the ctl file as FLOAT EXTERNAL), sqlldr fails with "ORA-01722: invalid number":
Sqlldr ctl file:
OPTIONS(silent=(HEADER))
load data
replace
into table fp_basic_bd
fields terminated by "|" optionally enclosed by '"'
TRAILING NULLCOLS
(
FS_PERM_SEC_ID CHAR(20),
"DATE" DATE "YYYY-MM-DD",
ADJDATE DATE "YYYY-MM-DD",
CURRENCY CHAR(3),
P_PRICE FLOAT EXTERNAL,
P_PRICE_OPEN FLOAT EXTERNAL,
P_PRICE_HIGH FLOAT EXTERNAL,
P_PRICE_LOW FLOAT EXTERNAL,
P_VOLUME FLOAT EXTERNAL
)
sqlldr execution command:
sqlldr -userid XXX -data ./test.data -log ./test.log -bad ./test.errors -control test.ctl -errors 3 -skip_unusable_indexes -skip_index_maintenance
sqlldr error log:
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FS_PERM_SEC_ID                 FIRST         20   |  O(") CHARACTER
"DATE"                         NEXT           *   |  O(") DATE YYYY-MM-DD
ADJDATE                        NEXT           *   |  O(") DATE YYYY-MM-DD
CURRENCY                       NEXT           3   |  O(") CHARACTER
P_PRICE                        NEXT           *   |  O(") CHARACTER
P_PRICE_OPEN                   NEXT           *   |  O(") CHARACTER
P_PRICE_HIGH                   NEXT           *   |  O(") CHARACTER
P_PRICE_LOW                    NEXT           *   |  O(") CHARACTER
P_VOLUME                       NEXT           *   |  O(") CHARACTER
value used for ROWS parameter changed from 300000 to 65534
Record 1: Rejected - Error on table FP_BASIC_BD, column P_VOLUME.
ORA-01722: invalid number
Record 2: Rejected - Error on table FP_BASIC_BD, column P_VOLUME.
ORA-01722: invalid number
When I replaced the Windows line endings with Unix ones, all the errors went away and all the data loaded correctly.
My question is: how can I specify the line terminator character in the sqlldr control file while still keeping the source file name in the shell command?
I've seen some examples of how to do that with the stream record format (http://docs.oracle.com/cd/E11882_01/server.112/e16536/ldr_control_file.htm#SUTIL1087), but those examples are not applicable in my case, as I need to keep the name of the data file in the shell command, not inside the ctl file.
I recently encountered the same issue while loading data into my table via a CSV file.
My control file looked like this:
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER EXTERNAL
)
And as you mentioned, I kept getting the same 'invalid number' error.
It turns out this usually occurs:
- when your column datatype is NUMBER but the data coming from your CSV file is a string, so the Oracle loader fails to convert the string to a number;
- when your field in the CSV file is terminated by extra delimiters, say spaces, tabs, etc.
This is how I altered my ctl file:
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER Terminated by Whitespace
)
Try using the stream record format and specifying the terminator string. From the docs:
On UNIX-based platforms, if no terminator_string is specified, SQL*Loader defaults to the line feed character, \n.
The terminator string should allow you to specify a combination of characters.
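For example, with Windows line endings the INFILE clause could use the hex form of CR+LF, which avoids any escape ambiguity. A sketch based on the control file above; whether the processing-options string is still honoured when the file name is overridden by the command-line DATA parameter is worth testing on your version:
load data
infile 'test.data' "str X'0D0A'"
replace
into table fp_basic_bd
-- (field list unchanged from the original control file)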

SQL loader position

New to SQL loader and am a bit confused about the POSITION.
Let's use the following sample data as reference:
Munising  49862 MI
Shingleton49884 MI
Seney     49883 MI
And here is the load statement:
LOAD DATA
INFILE 'zipcodes.dat'
REPLACE INTO TABLE zipcodes (
city_name POSITION(1) CHAR(10),
zip_code POSITION(*) CHAR(5),
state_abbr POSITION(*+1) CHAR(2)
)
In the load statement, the city_name POSITION is 1. How does SQLLDR know where it ends? Is CHAR(10) the trick here? Counting the two spaces behind 'Munising', it has 10 characters.
Also why is zip_code assigned with CHAR even though it contains nothing but numbers?
Thank You
Yes, when the end position is not specified, it is derived from the datatype. This documentation explains the POSITION clause.
city_name POSITION(1) CHAR(10)
Here the starting position of the data field is 1. The ending position is not specified, but it is derived from the datatype: 10.
zip_code POSITION(*) CHAR(5)
Here * specifies that the data field immediately follows the previous field and should be 5 bytes long.
state_abbr POSITION(*+1) CHAR(2)
Here +1 specifies an offset from the previous field. SQL*Loader skips 1 byte and reads the next 2 bytes, as derived from the CHAR(2) datatype.
As to why zip_code is CHAR: a zip code is simply treated as a fixed-length string. You are not going to do any arithmetic operations on it, so CHAR is appropriate.
Also, have a look at the SQL*Loader datatypes. In the control file you are telling SQL*Loader how to interpret the data, and that can differ from the table structure. In this example you could also specify INTEGER EXTERNAL for the zip code.
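For comparison, here is the same load written with explicit start:end positions, equivalent to the lengths derived above (city in bytes 1-10, zip in 11-15, a skipped byte at 16, state in 17-18):
LOAD DATA
INFILE 'zipcodes.dat'
REPLACE INTO TABLE zipcodes (
city_name POSITION(1:10) CHAR,
zip_code POSITION(11:15) CHAR,
state_abbr POSITION(17:18) CHAR
)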
You need three text files and one batch file to load the data.
Suppose your file location is 'D:\loaddata' and the input file is 'D:\loaddata\abc.CSV':
1. D:\loaddata\abc.bad -- empty
2. D:\loaddata\abc.log -- empty
3. D:\loaddata\abc.ctl -- write the code below:
OPTIONS ( SKIP=1, DIRECT=TRUE, ERRORS=10000000, ROWS=5000000)
load data
infile 'D:\loaddata\abc.CSV'
TRUNCATE
into table Your_table
(
a_column POSITION (1:7) char,
b_column POSITION (8:10) char,
c_column POSITION (11:12) char,
d_column POSITION (13:13) char,
f_column POSITION (14:20) char
)
D:\loaddata\abc.bat -- for execution:
sqlldr db_user/db_password@your_tns control=D:\loaddata\abc.ctl log=D:\loaddata\abc.log
After double-clicking the 'D:\loaddata\abc.bat' file, your data will be loaded into the desired Oracle table. If anything goes wrong, check your 'D:\loaddata\abc.bad' and 'D:\loaddata\abc.log' files.

How to really skip the processing of a column?

In order to load data (from a CSV file) into an Oracle database, I use SQL*Loader.
In the table that receives these data, there is a varchar2(500) column, called COMMENTS.
For various reasons, I want to ignore this information from the CSV file.
Thus, I wrote this control file:
Options (BindSize=10000000,Readsize=10000000,Rows=5000,Errors=100)
Load Data
Infile 'XXX.txt'
Append into table T_XXX
Fields Terminated By ';'
TRAILING NULLCOLS
(
...
COMMENTS FILLER,
...
)
This code seems to work correctly, as the COMMENTS field in the database is always set to null.
However, if my CSV file contains a record where the corresponding COMMENTS field exceeds the 500-character limit, I get an error from SQL*Loader:
Record 2: Rejected - Error on table T_XXX, column COMMENTS.
Field in data file exceeds maximum length
Is there a way to really exclude the processing of my COMMENTS fields?
I can't reproduce your problem. I'm using Oracle 10.2.0.3.0 with SQL*Loader 10.2.0.1.
Here is my test case:
SQL> CREATE TABLE test_sqlldr (
2 ID NUMBER,
3 comments VARCHAR2(20),
4 id2 NUMBER
5 );
Table created
Control file:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments filler,
id2
)
data file:
1;aaa;2
3;abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz;4
5;bbb;6
I'm using the command sqlldr userid=xxx/yyy@zzz control=test.ctl and I'm getting all the rows without errors:
SQL> select * from test_sqlldr;
        ID COMMENTS                    ID2
---------- -------------------- ----------
         1                                2
         3                                4
         5                                6
You may try another approach; I get the same desired result with the following control file:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments "substr(:comments,1,0)",
id2
)
Update following Romaintaz's comment: I looked into it again and managed to get the same error as you when the size of the column exceeded 255 characters. This is because the default datatype of SQL*Loader is char(255). If you have a column with more data you will have to specify the length. The following control file solved the problem for a column with 300 characters:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments filler char(4000),
id2
)
Hope this helps,
--
Vincent
Just to suggest a tiny improvement, you might try something like:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments char(4000) "substr(:comments, 1, 200)",
id2
)
Now you'll grab the first 200 characters (or whatever number you specify in its place) of all comments, unless some of your input records have values for the comments field that exceed 4000 characters, in which case they'll be rejected by the loader with the 'exceeds max length' error noted earlier. But assuming that's rare or not the case, all the records will load, with some of the comments truncated to 200 characters.
If you go over char(4000) you'll get a SQL*Loader error; there's a limit to how far you can push the beast.
