I have an external table which reads from a CSV file and is failing on certain rows.
External table definition:
E_ID NUMBER
A_IND VARCHAR2 (3 Byte)
B_IND VARCHAR2 (3 Byte)
E_DATE DATE
E_AMT NUMBER
F_DATE DATE
D_E_DATE DATE
I see the following info from a log file generated when I select * from the external table.
KUP-05004: Warning: Intra source concurrency disabled because parallel select was not requested.
Field Definitions for table EXTERNAL_TABLE_XTL
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Rows with all null fields are accepted
Fields in Data Source:
E_ID CHAR (255)
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
A_IND CHAR (255)
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
B_IND CHAR (255)
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
E_DATE CHAR (10)
Date datatype DATE, date mask MM/DD/YYYY
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
E_AMT CHAR (255)
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
F_DATE CHAR (10)
Date datatype DATE, date mask MM/DD/YYYY
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
D_E_DATE CHAR (10)
Date datatype DATE, date mask MM/DD/YYYY
Terminated by ","
Enclosed by """ and """
Trim whitespace same as SQL Loader
KUP-04021: field formatting error for field D_E_DATE
KUP-04026: field too long for datatype
KUP-04101: record 56 rejected in file /home/TEST.csv
KUP-04021: field formatting error for field D_E_DATE
KUP-04026: field too long for datatype
KUP-04101: record 61 rejected in file /home/TEST.csv
KUP-04021: field formatting error for field D_E_DATE
KUP-04026: field too long for datatype
KUP-04101: record 70 rejected in file /home/TEST.csv
The file was transferred to the server via FileZilla. From reading other posts, I thought the cause might be that the file was transferred in binary mode (it was originally on the Auto setting) and that some non-printing characters had come in. So I tried transferring it with the ASCII setting, but that did not work. Then I tried deleting one of the lines that caused an error and retyping it manually. That did not work either.
Failed sample data:
5560000,N,Y,,24950,10/12/2011,10/27/2011
5550001,Y,Y,11/26/2013,73813,11/18/2013,11/29/2013
5560002,Y,Y,11/6/2015,22041.28,11/6/2015,11/18/2015
5560003,Y,Y,10/10/2012,2768.66,10/10/2012,10/24/2012
5560004,N,Y,,29750,9/30/2013,10/15/2013
5560005,Y,Y,10/8/2015,76474.84,10/8/2015,10/21/2015
5560006,N,Y,,63879.28,11/16/2011,11/30/2011
5560007,N,Y,,100000,11/14/2013,11/21/2013
Successful sample data:
5560008,Y,N,11/1/2010,,,
5550009,Y,N,,,,
5550010,N,N,,,,
5550011,N,N,,,,
5560012,Y,Y,2/12/2016,50000,2/12/2016,2/23/2016
5560013,Y,N,7/22/2011,,,
My first assumption is that, for some reason, double-digit months are not being accepted for the field D_E_DATE. Please note that this works in the dev environment but not in production, and both are the same database version.
The following is working fine for me.
Table Definition:
CREATE TABLE my_data (
E_ID NUMBER,
A_IND VARCHAR2 (3 Byte),
B_IND VARCHAR2 (3 Byte),
E_DATE DATE,
E_AMT NUMBER,
F_DATE DATE,
D_E_DATE DATE
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY MY_DIR
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ','
MISSING FIELD VALUES ARE NULL
(
E_ID,
A_IND,
B_IND,
E_DATE date 'MM/DD/YYYY',
E_AMT,
F_DATE date 'MM/DD/YYYY',
D_E_DATE date 'MM/DD/YYYY'
)
)
LOCATION ('data.txt')
);
Sample Data:
[oracle#ora12c Desktop]$ cat data.txt
5560000,N,Y,,24950,10/12/2011,10/27/2011
5550001,Y,Y,11/26/2013,73813,11/18/2013,11/29/2013
5560002,Y,Y,11/6/2015,22041.28,11/6/2015,11/18/2015
5560003,Y,Y,10/10/2012,2768.66,10/10/2012,10/24/2012
5560004,N,Y,,29750,9/30/2013,10/15/2013
5560005,Y,Y,10/8/2015,76474.84,10/8/2015,10/21/2015
5560006,N,Y,,63879.28,11/16/2011,11/30/2011
5560007,N,Y,,100000,11/14/2013,11/21/2013
Output:
SQL> select * from my_data;
      E_ID A_I B_I E_DATE         E_AMT F_DATE    D_E_DATE
---------- --- --- --------- ---------- --------- ---------
   5560000 N   Y                  24950 12-OCT-11 27-OCT-11
   5550001 Y   Y   26-NOV-13      73813 18-NOV-13 29-NOV-13
   5560002 Y   Y   06-NOV-15   22041.28 06-NOV-15 18-NOV-15
   5560003 Y   Y   10-OCT-12    2768.66 10-OCT-12 24-OCT-12
   5560004 N   Y                  29750 30-SEP-13 15-OCT-13
   5560005 Y   Y   08-OCT-15   76474.84 08-OCT-15 21-OCT-15
   5560006 N   Y               63879.28 16-NOV-11 30-NOV-11
   5560007 N   Y                 100000 14-NOV-13 21-NOV-13
8 rows selected.
The answer to this question was found in the following thread:
Oracle external table date field - works in one DB and not in another
Transferring the same file from the dev server to the prod server seems to have resolved the issue. Weird. I wish I understood exactly why this issue occurred and how to resolve it properly.
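One possible explanation, which the thread does not confirm, is a trailing carriage return. Every rejected row ends in a 10-character date, so an extra \r would make D_E_DATE 11 characters long, past the CHAR (10) shown in the log, while a 9-character date such as 2/23/2016 would still fit. A minimal Python sketch of that check, assuming the file has Windows line endings and records are split on \n only (the helper name last_field_lengths is hypothetical):

```python
def last_field_lengths(raw: bytes, limit: int = 10):
    """Return (record_number, last_field, too_long) for each record split on LF only."""
    results = []
    for number, record in enumerate(raw.split(b"\n"), start=1):
        if not record:
            continue  # skip the empty piece after the final newline
        last = record.split(b",")[-1]
        results.append((number, last, len(last) > limit))
    return results

# Two rows from the question, written here with assumed CRLF line endings.
sample = (
    b"5560000,N,Y,,24950,10/12/2011,10/27/2011\r\n"
    b"5560012,Y,Y,2/12/2016,50000,2/12/2016,2/23/2016\r\n"
)
report = last_field_lengths(sample)
for number, last, too_long in report:
    print(number, last, "too long" if too_long else "fits")
```

If the real file shows the same pattern, the last field of the rejected records carries the extra byte, which would match the "field too long for datatype" message.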
I am getting Oracle 'ORA-01861: literal does not match format string' while loading date strings of simple (YYYY-MM-DD) format into Oracle 11g:
My table DDL is:
CREATE TABLE fp_basic_dividends (
fs_perm_sec_id VARCHAR(20) NOT NULL,
"DATE" DATE NOT NULL,
currency CHAR(3) NOT NULL,
adjdate DATE NOT NULL,
p_divs_pd FLOAT(53) NOT NULL,
p_divs_paydatec DATE NULL,
p_divs_recdatec DATE NULL,
p_divs_s_spinoff CHAR(1) NOT NULL,
p_divs_s_pd FLOAT(53) NULL,
PRIMARY KEY (fs_perm_sec_id, "DATE"));
My sqlldr ctl file is:
load data
append
into table fp_basic_dividends
fields terminated by "|" optionally enclosed by '"'
TRAILING NULLCOLS
(
FS_PERM_SEC_ID CHAR(20),
"DATE" DATE "YYYY-MM-DD",
CURRENCY CHAR(3),
ADJDATE DATE "YYYY-MM-DD",
P_DIVS_PD FLOAT,
P_DIVS_PAYDATEC DATE "YYYY-MM-DD",
P_DIVS_RECDATEC DATE "YYYY-MM-DD",
P_DIVS_S_SPINOFF,
P_DIVS_S_PD FLOAT
)
Example data is:
"XXXXRR-S-US"|1997-09-30|"UAH"|1997-09-30|.0126400003|1997-10-01|1997-09-29|"0"|
Result log file content is:
Table FP_BASIC_DIVIDENDS, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FS_PERM_SEC_ID FIRST 20 | O(") CHARACTER
"DATE" NEXT * | O(") DATE YYYY-MM-DD
CURRENCY NEXT 3 | O(") CHARACTER
ADJDATE NEXT * | O(") DATE YYYY-MM-DD
P_DIVS_PD NEXT 4 FLOAT
P_DIVS_PAYDATEC NEXT * | O(") DATE YYYY-MM-DD
P_DIVS_RECDATEC NEXT * | O(") DATE YYYY-MM-DD
P_DIVS_S_SPINOFF NEXT * | O(") CHARACTER
P_DIVS_S_PD NEXT 4 FLOAT
Record 1: Rejected - Error on table FP_BASIC_DIVIDENDS, column P_DIVS_PAYDATEC.
ORA-01861: literal does not match format string
What am I doing wrong? Any help is greatly appreciated
I think the problem is with P_DIVS_PD FLOAT.
It looks like what you have in the datafile is really character data, not a fixed-length binary representation.
To get SQL*Loader to convert from character representation, I think the datatype has to be qualified as EXTERNAL, e.g.:
P_DIVS_PD FLOAT EXTERNAL,
Here's what I think is happening: SQL*Loader is picking up exactly four bytes for the P_DIVS_PD FLOAT field,
'.012'
It's not seeing those bytes as characters, and it's not seeing them as the value 1.2E-02. It's treating those four bytes as the internal binary representation of a FLOAT (a sign bit, a certain number of bits for the exponent, and a certain number of bits for the mantissa).
Then, for the next field, it starts at the next position, picks up '6400003' (up to the next field delimiter), and tries to convert that to a DATE.
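The behaviour described above can be imitated in a few lines of Python. This is only a sketch of the reasoning, not of SQL*Loader's actual code: it assumes a 4-byte little-endian IEEE 754 float, and the function name is hypothetical.

```python
import struct

def read_binary_float_then_field(record: str, start: int):
    """Read 4 raw bytes as a binary float, then the next '|'-terminated field."""
    raw = record[start:start + 4].encode("ascii")
    as_float = struct.unpack("<f", raw)[0]  # text bytes misread as a binary float
    next_field = record[start + 4:].split("|", 1)[0]
    return raw, as_float, next_field

# The sample record from the question.
line = '"XXXXRR-S-US"|1997-09-30|"UAH"|1997-09-30|.0126400003|1997-10-01|1997-09-29|"0"|'
start = line.index(".0126400003")
raw, as_float, next_field = read_binary_float_then_field(line, start)
print(raw, as_float, next_field)
```

The leftover text handed to the next field is 6400003, which cannot match the YYYY-MM-DD mask. That is consistent with ORA-01861 being reported on P_DIVS_PAYDATEC rather than on the FLOAT column itself.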
I am using the Oracle SQL*Loader utility from a Linux shell to load CSV data into an Oracle DB.
But I have noticed that if the source CSV file's line endings are '\r\n' (Windows format), sqlldr fails to load data for the last column.
For example, if the last column is of FLOAT type (defined in the ctl file as 'FLOAT EXTERNAL'), sqlldr fails with 'ORA-01722: invalid number':
Sqlldr ctl file:
OPTIONS(silent=(HEADER))
load data
replace
into table fp_basic_bd
fields terminated by "|" optionally enclosed by '"'
TRAILING NULLCOLS
(
FS_PERM_SEC_ID CHAR(20),
"DATE" DATE "YYYY-MM-DD",
ADJDATE DATE "YYYY-MM-DD",
CURRENCY CHAR(3),
P_PRICE FLOAT EXTERNAL,
P_PRICE_OPEN FLOAT EXTERNAL,
P_PRICE_HIGH FLOAT EXTERNAL,
P_PRICE_LOW FLOAT EXTERNAL,
P_VOLUME FLOAT EXTERNAL
)
sqlldr execution command:
sqlldr -userid XXX -data ./test.data -log ./test.log -bad ./test.errors -control test.ctl -errors 3 -skip_unusable_indexes -skip_index_maintenance
sqlldr error log:
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FS_PERM_SEC_ID FIRST 20 | O(") CHARACTER
"DATE" NEXT * | O(") DATE YYYY-MM-DD
ADJDATE NEXT * | O(") DATE YYYY-MM-DD
CURRENCY NEXT 3 | O(") CHARACTER
P_PRICE NEXT * | O(") CHARACTER
P_PRICE_OPEN NEXT * | O(") CHARACTER
P_PRICE_HIGH NEXT * | O(") CHARACTER
P_PRICE_LOW NEXT * | O(") CHARACTER
P_VOLUME NEXT * | O(") CHARACTER
value used for ROWS parameter changed from 300000 to 65534
Record 1: Rejected - Error on table FP_BASIC_BD, column P_VOLUME.
ORA-01722: invalid number
Record 2: Rejected - Error on table FP_BASIC_BD, column P_VOLUME.
ORA-01722: invalid number
When I replaced the Windows line endings with Unix ones, all the errors went away and all the data loaded correctly.
My question is: how can I specify the line terminator character in the sqlldr control file while still keeping the source file name in the shell command?
I've seen some examples of how to do that with the stream record format at http://docs.oracle.com/cd/E11882_01/server.112/e16536/ldr_control_file.htm#SUTIL1087,
but these examples are not applicable in my case, as I need to keep the name of the data file in the shell command and not inside the ctl file.
I recently encountered the same issue while loading data into my table via csv file.
My control file looked like this:
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER EXTERNAL
)
And as you mentioned, I kept getting the same 'invalid number' error.
It turns out this usually occurs:
- when the column datatype is NUMBER but the value coming from the CSV file is a string that SQL*Loader cannot convert to a number.
- when the field in the CSV file is followed by extra whitespace characters, such as spaces or tabs.
This is how I altered my ctl file:
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
TERM_AGREEMENT INTEGER Terminated by Whitespace
)
Try using the stream record format and specifying the terminator string. From the docs:
On UNIX-based platforms, if no terminator_string is specified, SQL*Loader defaults to the line feed character, \n.
The terminator string should allow you to specify a combination of characters.
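The question notes that converting the file to Unix line endings made the load succeed. If changing the control file is not an option, the file can be normalized before sqlldr is invoked. A minimal Python sketch of that step follows; the function names, file paths, and sample rows are made up for illustration.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows CRLF line endings with Unix LF; lone LF bytes are untouched."""
    return data.replace(b"\r\n", b"\n")

def convert_file(src_path: str, dst_path: str) -> None:
    """Write an LF-only copy of src_path to dst_path (both paths are hypothetical)."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(crlf_to_lf(src.read()))

# Made-up pipe-delimited rows with Windows line endings.
windows_rows = b"A|2016-01-04|10.5\r\nB|2016-01-05|11.25\r\n"
unix_rows = crlf_to_lf(windows_rows)
print(unix_rows)
```

The converted copy can then be named on the sqlldr command line exactly as the original file was, so the control file stays free of any data file name.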
I have a SQL Loader Control file,
LOAD DATA
INFILE 'test.txt'
INTO TABLE TEST replace
fields terminated "|" optionally enclosed by '"' TRAILING NULLCOLS
( DOCUMENTID INTEGER(10),
CUSTID INTEGER(10),
USERID INTEGER(10),
FILENAME VARCHAR(255),
LABEL VARCHAR(50),
DESCRIPTION VARCHAR(2000),
POSTDATE DATE "YYYY-MM-DD HH24:MI:SS" NULLIF POSTDATE="",
USERFILENAME VARCHAR(50),
STORAGEPATH VARCHAR(255)
)
and it's giving me an error when I run SQL Loader on it,
Record 1: Rejected - Error on table TEST, column FILENAME.
Variable length field exceeds maximum length.
Here's that row. The length of that column is way under 255:
1|5001572|2|/Storage/Test/5001572/test.pdf|test.pdf||2005-01-13 11:47:49||
And here's an oddity I noticed within the log file
Column Name | Position | Len | Term | Encl | Datatype
FILENAME | NEXT | 257 | | | VARCHAR
I define the length as 255 in both my table and control file. Yet the log spits it out as 257? I've tried knocking down the length in the control file to 253, so it appears as 255 in the log file, but the same issue.
Any help? This has bugged me for two days now.
Thanks.
Don't define your data fields as VARCHAR and INTEGER. Use CHAR. Most of the time, when loading data from a text file, you want to use CHAR, or perhaps DATE, although even that is converted from a text form. Most of the time you don't even need a length specifier. The default length for a CHAR field is 255. Your control file should look something like:
LOAD DATA
INFILE "test.txt"
INTO TABLE TEST replace
fields terminated "|" optionally enclosed by '"' TRAILING NULLCOLS
(
DOCUMENTID,
CUSTID,
USERID ,
FILENAME,
LABEL,
DESCRIPTION CHAR(2000),
POSTDATE DATE "YYYY-MM-DD HH24:MI:SS" NULLIF POSTDATE=BLANKS,
USERFILENAME,
STORAGEPATH
)
+1 for DCookie, but to expand on that: it's important to distinguish between data types as specified in a table and data types in a SQL*Loader control file because, confusingly, they mean rather different things.
Start with a look at the documentation, and note that when loading regular text files you need to use the "portable" data types.
Varchar is a "non-portable" type, in which:
... consists of a binary length subfield followed by a character string of the specified length
So as DCookie says, CHAR is the thing to go for, and INTEGER EXTERNAL is a very commonly used SQL*Loader data type which you'd probably want to specify for DOCUMENTID etc.
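To make the "binary length subfield" idea concrete, here is a rough Python sketch of a length-prefixed field. It assumes a 2-byte little-endian length, which would be consistent with the log showing 257 for a declared length of 255, but that layout is an assumption, as is the function name. Which bytes SQL*Loader actually read for FILENAME is not known from the thread.

```python
import struct

def parse_varchar(buf: bytes, max_len: int) -> bytes:
    """Parse a length-prefixed field: a 2-byte length, then that many bytes."""
    (length,) = struct.unpack_from("<H", buf, 0)
    if length > max_len:
        raise ValueError(f"length {length} exceeds maximum {max_len}")
    return buf[2:2 + length]

# A properly encoded binary field: length 8, followed by 8 bytes of text.
good = struct.pack("<H", 8) + b"test.pdf"
parsed = parse_varchar(good, 255)
print(parsed)

# Plain text has no length prefix, so its first two characters get
# misread as a (large) length.
message = ""
try:
    parse_varchar(b"/Storage/Test/5001572/test.pdf", 255)
except ValueError as err:
    message = str(err)
print(message)
```

Under these assumptions, a short piece of text can trigger an "exceeds maximum length" style error even though the string itself is far below 255 characters.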
I'm trying to load some data using sql loader. Here is the top of my control/data file:
LOAD DATA
INFILE *
APPEND INTO TABLE economic_indicators
FIELDS TERMINATED BY ','
(ASOF_DATE DATE 'DD-MON-YY',
VALUE FLOAT EXTERNAL,
STATE,
SERIES_ID INTEGER EXTERNAL,
CREATE_DATE DATE 'DD-MON-YYYY')
BEGINDATA
01-Jan-79,AL,67.39940538,1,23-Jun-2009
... lots of other data lines.
The problem is that sql loader won't recognize the data types I'm specifying. This is the log file:
Table ECONOMIC_INDICATORS, loaded from every logical record.
Insert option in effect for this table: APPEND
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ASOF_DATE FIRST * , DATE DD-MON-YY
VALUE NEXT * , CHARACTER
STATE NEXT * , CHARACTER
SERIES_ID NEXT * , CHARACTER
CREATE_DATE NEXT * , DATE DD-MON-YYYY
value used for ROWS parameter changed from 10000 to 198
Record 1: Rejected - Error on table ECONOMIC_INDICATORS, column VALUE.
ORA-01722: invalid number
... lots of similar errors, expected if trying to insert char data into a numeric column.
I've tried no datatype spec, all other numeric specs, and always the same issue. Any ideas?
Also, any ideas on why it's changing the Rows parameter?
From your example, SQL*Loader will try to evaluate the string "AL" as a number, which results in the error message you gave. The sample data has something that looks like a decimal number in the third position, not the second as specified in the column list.
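The mismatch is easy to see by pairing the column list with the sample row. This small Python sketch uses the field order and data line from the question. The swapped order at the end is one possible correction, assuming STATE really is the second value in every row.

```python
def is_number(text: str) -> bool:
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

row = "01-Jan-79,AL,67.39940538,1,23-Jun-2009".split(",")

# Field order as written in the control file.
as_written = dict(zip(["ASOF_DATE", "VALUE", "STATE", "SERIES_ID", "CREATE_DATE"], row))
# Field order that matches the data (assumed fix: swap VALUE and STATE).
swapped = dict(zip(["ASOF_DATE", "STATE", "VALUE", "SERIES_ID", "CREATE_DATE"], row))

print(as_written["VALUE"], is_number(as_written["VALUE"]))
print(swapped["VALUE"], is_number(swapped["VALUE"]))
```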