This is the control file that I am trying to load with SQL*Loader. However, only one record is loaded, and I cannot get the TRL line (the last line of the data file) into the LTD column. I need to load the value "TRL 02 0001 56778 34 999 111" into the LTD column. I would appreciate your help with this.
Sample Data:
HDR
12|45|3|SUE|US
TRL 02 0001 56778 34 999 111
Control File:
OPTIONS (SKIP=1)
LOAD DATA
INFILE '*.TXT'
BADFILE 'A.bad'
INTO TABLE A
REPLACE
WHEN (1:3) != 'TRL'
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
ID FILLER ,
LTD ,
CAGE ,
SUPP FILLER ,
CODE ,
NAME ,
DBA_NAME FILLER ,
CNTRY_CODE ,
STATUS CONSTANT "U",
RECORD_ID "S.nextval"
)
INTO TABLE A
REPLACE
WHEN (1:3) = 'TRL'
(
LTD CHAR(300),
STATUS CONSTANT "U",
RECORD_ID "S.nextval"
);
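When a control file has multiple INTO TABLE clauses and the data is delimited, you have to use POSITION in the second clause (otherwise it won't work): scanning continues from where the previous clause stopped, and POSITION(1) resets it to the start of the record. Also, the second INTO TABLE clause lacks TRAILING NULLCOLS.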
Therefore, for a sample target table and a sequence:
SQL> CREATE TABLE a
2 (
3 id NUMBER,
4 ltd VARCHAR2 (30),
5 cage VARCHAR2 (5),
6 supp NUMBER,
7 code VARCHAR2 (5),
8 name VARCHAR2 (5),
9 dba_name NUMBER,
10 cntry_code VARCHAR2 (5),
11 status VARCHAR2 (1),
12 record_id NUMBER
13 );
Table created.
SQL> CREATE SEQUENCE s;
Sequence created.
SQL>
the control file looks like this:
OPTIONS (SKIP=1)
LOAD DATA
INFILE * --> modified this (as I have sample data in the control file)
BADFILE 'A.bad'
INTO TABLE A
REPLACE
WHEN (1:3) != 'TRL'
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
ID FILLER ,
LTD ,
CAGE ,
SUPP FILLER ,
CODE ,
NAME ,
DBA_NAME FILLER ,
CNTRY_CODE ,
STATUS CONSTANT "U",
RECORD_ID "S.nextval"
)
INTO TABLE A
REPLACE
WHEN (1:3) = 'TRL'
TRAILING NULLCOLS --> added this
(
LTD POSITION(1) CHAR(300), --> added POSITION
STATUS CONSTANT "U",
RECORD_ID "S.nextval"
)
begindata
HDR
12|45|3|SUE|US
TRL 02 0001 56778 34 999 111
Testing:
SQL> $sqlldr scott/tiger@orcl control=test41.ctl log=test41.log
SQL*Loader: Release 18.0.0.0.0 - Production on Fri Jun 10 08:37:05 2022
Version 18.5.0.0.0
Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.
Path used: Conventional
Commit point reached - logical record count 1
Commit point reached - logical record count 2
Table A:
1 Row successfully loaded.
Table A:
1 Row successfully loaded.
Check the log file:
test41.log
for more information about the load.
SQL> select * from a;
        ID LTD                            CAGE        SUPP CODE  NAME    DBA_NAME CNTRY S  RECORD_ID
---------- ------------------------------ ----- ---------- ----- ----- ---------- ----- - ----------
           45                             3                US                           U          1
           TRL 02 0001 56778 34 999 111                                                 U          2
SQL>
Looks OK to me.
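As an additional sanity check, a query along these lines would show how many rows each INTO TABLE clause contributed. It is only a sketch and assumes, as in the sample data, that trailer rows are the only ones whose LTD value starts with TRL:

```sql
-- Count detail rows and trailer rows loaded into table A.
-- Assumption: only trailer records have an LTD value beginning with 'TRL'.
SELECT CASE WHEN ltd LIKE 'TRL%' THEN 'trailer' ELSE 'detail' END AS record_type,
       COUNT(*) AS row_count
FROM   a
GROUP  BY CASE WHEN ltd LIKE 'TRL%' THEN 'trailer' ELSE 'detail' END;
```

With the sample data above, this should report one detail row and one trailer row.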
Related
I have the following code. The issue is that we expect three columns in the file, but sometimes the other team sends us four. Instead of failing, the load then takes the first three columns. When the file has fewer than three columns, the load fails, which is expected. What logic do I need to add so that the load also fails when an extra column is present in the file?
CREATE TABLE TESTING_DUMP (
"FIELD_1" NUMBER,
"FIELD_2" VARCHAR2(5),
"FIELD_3" VARCHAR2(5)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY MY_DIR
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE "MY_DIR":"TEST.bad"
LOGFILE "MY_DIR":"TEST.log"
READSIZE 1048576
FIELDS TERMINATED BY "|" LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
"LOAD" CHAR(1),
"FIELD_1" CHAR(5),
"FIELD_2" INTEGER EXTERNAL(5),
"FIELD_3" CHAR(5)
)
)
location
(
'Test.xls'
)
)REJECT LIMIT 0;
File Test.xls has the sample content below. The second line is correct. The load should fail because of the first line, but it does not.
|11111|22222|33333|AAAAA
|22222|33333|44444|
I wouldn't know how to do that in a single step, so I'll suggest a workaround; see if it helps.
This is the target table, which is supposed to contain only valid rows at the end:
SQL> create table ext_target
2 (col1 number,
3 col2 varchar2(5),
4 col3 varchar2(5));
Table created.
External table contains only one column which will contain the whole row (i.e. no separate columns):
SQL> create table ext_dump
2 (col varchar2(100))
3 organization external (
4 type oracle_loader
5 default directory ext_dir
6 access parameters (
7 records delimited by newline
8 fields terminated by ','
9 missing field values are null
10 (
11 col char(100) )
12 )
13 location ('test.txt')
14 )
15 reject limit unlimited;
Table created.
This is the whole file contents:
|11111|22222|33333|AAAAA
|22222|33333|44444|
|55555|66666|
External table contains the whole file (nothing is rejected):
SQL> select * from ext_dump;
COL
--------------------------------------------------------------------------------
|11111|22222|33333|AAAAA
|22222|33333|44444|
|55555|66666|
Insert only valid rows into the target table. So far there are two conditions: there must be no fourth "column" value, and there must be exactly 4 | separators:
SQL> insert into ext_target (col1, col2, col3)
2 select regexp_substr(col, '\w+', 1, 1),
3 regexp_substr(col, '\w+', 1, 2),
4 regexp_substr(col, '\w+', 1, 3)
5 from ext_dump
6 where regexp_substr(col, '\w+', 1, 4) is null
7 and regexp_count(col, '\|') = 4;
1 row created.
The only valid row:
SQL> select * from ext_target;
COL1 COL2 COL3
---------- ----- -----
22222 33333 44444
SQL>
Now, you can adjust the where clause any way you want; what I posted is just an example.
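As one possible adjustment, the two conditions could be folded into a single pattern that describes the whole line. This is only a sketch against the same ext_dump and ext_target tables; it assumes every valid value consists solely of word characters (letters, digits, underscore) and that lines carry no trailing carriage return:

```sql
-- Accept only lines shaped exactly like |value|value|value|
INSERT INTO ext_target (col1, col2, col3)
SELECT regexp_substr(col, '\w+', 1, 1),
       regexp_substr(col, '\w+', 1, 2),
       regexp_substr(col, '\w+', 1, 3)
FROM   ext_dump
WHERE  regexp_like(col, '^\|\w+\|\w+\|\w+\|$');
```

Of the three sample lines, only |22222|33333|44444| matches that pattern; the line with a fourth value and the line with too few separators are both skipped.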
How about adding a fourth field definition that maps to a datatype sure to fail, like a date column with a funky format sure not to be seen? The "MISSING FIELD VALUES ARE NULL" should render it NULL when not present, and the datatype conversion should error when it is present.
We are migrating a database from Oracle 11g to 19c and are facing an issue with an external table. The old and new databases have exactly the same table definition and point to the same file (the databases run on different hosts but point to the same qtree). The old database can query the file without errors, but the new one rejects all rows with:
KUP-04023: field start is after end of record
Tables have below config:
CREATE TABLE TEST
(
AA VARCHAR2 (40 BYTE),
BB VARCHAR2 (2 BYTE),
CC VARCHAR2 (3 BYTE),
DD VARCHAR2 (12 BYTE)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIRECTORY
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
BADFILE TEST_DIRECTORY : 'TEST.bad'
LOGFILE TEST_DIRECTORY : 'TEST.log'
FIELDS
TERMINATED BY '\t' LTRIM REJECT ROWS WITH ALL NULL FIELDS
(AA,
BB,
CC,
DD))
LOCATION (TEST_DIRECTORY:'TEST.dat'))
REJECT LIMIT UNLIMITED;
Test data (replace ^I with a tab character):
NAME1^I0^I ^IUK
NAME2^I0^I ^IUS
When I remove LTRIM, all data is read on the new database (but we need to keep LTRIM because the input files contain unnecessary spaces). I noticed that one field has a value of a single space, and that appears to cause the issue, but why only on the new database? Any ideas what the reason is or how to fix it easily?
The NLS database/session parameters are the same on both databases, but maybe there is some global parameter that could cause this issue?
Manually updated test data that works on both databases (the whitespace in the third column is replaced with X):
NAME1^I0^IX^IUK
NAME2^I0^IX^IUS
DEMO:
Below table created on 11g and 19c:
CREATE TABLE TEST
(
AA VARCHAR2 (40 BYTE),
BB VARCHAR2 (2 BYTE),
CC VARCHAR2 (3 BYTE),
DD VARCHAR2 (12 BYTE)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIRECTORY
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
BADFILE TEST_DIRECTORY : 'TEST.bad'
LOGFILE TEST_DIRECTORY : 'TEST.log'
FIELDS
TERMINATED BY '\t' LTRIM
REJECT ROWS WITH ALL NULL FIELDS
(AA,
BB,
CC ,
DD))
LOCATION (TEST_DIRECTORY:'TEST.dat'))
REJECT LIMIT UNLIMITED;
Both tables sourcing same file TEST.dat (data delimited by tabulator which is shown as 2 characters ^I):
$ cat -A TEST.dat
NAME1^I0^I ^IUK$
NAME2^I0^I ^IUS$
Querying on 11g:
SQL> SELECT * FROM TEST;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
SQL> SELECT dump(CC) FROM TEST;
DUMP(CC)
--------------------------------------------------------------------------------
NULL
NULL
Querying on 19c:
SQL> SELECT * FROM TEST;
no rows selected
TEST.log shows after running query on 19c:
Bad File: TEST.bad
Field Definitions for table TEST
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Reject rows with all null fields
Fields in Data Source:
AA CHAR (255)
Terminated by " "
Trim whitespace from left
BB CHAR (255)
Terminated by " "
Trim whitespace from left
CC CHAR (255)
Terminated by " "
Trim whitespace from left
DD CHAR (255)
Terminated by " "
Trim whitespace from left
KUP-04021: field formatting error for field DD
KUP-04023: field start is after end of record
KUP-04101: record 1 rejected in file /home/fff/TEST.dat
KUP-04021: field formatting error for field DD
KUP-04023: field start is after end of record
KUP-04101: record 2 rejected in file /home/fff/TEST.dat
Then I recreated the tables on both databases, this time without LTRIM:
CREATE TABLE TEST
(
AA VARCHAR2 (40 BYTE),
BB VARCHAR2 (2 BYTE),
CC VARCHAR2 (3 BYTE),
DD VARCHAR2 (12 BYTE)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEST_DIRECTORY
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
BADFILE TEST_DIRECTORY : 'TEST.bad'
LOGFILE TEST_DIRECTORY : 'TEST.log'
FIELDS
TERMINATED BY '\t'
REJECT ROWS WITH ALL NULL FIELDS
(AA,
BB,
CC ,
DD))
LOCATION (TEST_DIRECTORY:'TEST.dat'))
REJECT LIMIT UNLIMITED;
Querying on new table in 11g:
SQL> SELECT * FROM TEST;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
SQL> SELECT dump(CC) FROM TEST;
DUMP(CC)
--------------------------------------------------------------------------------
Typ=1 Len=1: 32
Typ=1 Len=1: 32
Querying on new table in 19c:
SQL> SELECT * FROM TEST;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
SQL> SELECT dump(CC) FROM TEST;
DUMP(CC)
--------------------------------------------------------------------------------
Typ=1 Len=1: 32
Typ=1 Len=1: 32
Let me try to reproduce your issue on my own environment
Using Oracle 19c on Red Hat Linux 7.2
SQL> select version from v$instance ;
VERSION
-----------------
19.0.0.0.0
Demo
Update: delimiter is tab
Content of the file
$ cat -A TEST.dat
NAME1^I0^I ^IUK$
NAME2^I0^I ^IUS$
External Table
SQL> drop table TEST_EXTERNAL_TABLE ;
Table dropped.
SQL> CREATE TABLE TEST_EXTERNAL_TABLE
2 (
3 AA VARCHAR2 (40 BYTE),
4 BB VARCHAR2 (2 BYTE),
5 CC VARCHAR2 (3 BYTE),
6 DD VARCHAR2 (12 BYTE)
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY DIR_TEST
12 ACCESS PARAMETERS (
13 RECORDS DELIMITED BY NEWLINE
14 BADFILE DIR_TEST : 'TEST.bad'
15 LOGFILE DIR_TEST : 'TEST.log'
16 FIELDS TERMINATED BY '\t' NOTRIM
17 REJECT ROWS WITH ALL NULL FIELDS
18 (AA,
19 BB,
20 CC,
21 DD))
22* LOCATION (DIR_TEST:'TEST.dat'))
SQL> /
Table created.
SQL> select * from TEST_EXTERNAL_TABLE ;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
SQL> select dump(cc) from TEST_EXTERNAL_TABLE ;
DUMP(CC)
--------------------------------------------------------------------------------
Typ=1 Len=1: 32
Typ=1 Len=1: 32
In my case I am able to load, but the blank spaces remain in the field, which is the expected behaviour of NOTRIM vs LDRTRIM.
LDRTRIM is used to provide compatibility with SQL*Loader trim
features. It is the same as NOTRIM except in the following cases:
If the field is not a delimited field, then spaces will be trimmed
from the right. If the field is a delimited field with OPTIONALLY
ENCLOSED BY specified, and the optional enclosures are missing for a
particular instance, then spaces will be trimmed from the left.
Doing the same with LDRTRIM
SQL> drop table TEST_eXTERNAL_TABLE;
Table dropped.
SQL> l
1 CREATE TABLE TEST_EXTERNAL_TABLE
2 (
3 AA VARCHAR2 (40 BYTE),
4 BB VARCHAR2 (2 BYTE),
5 CC VARCHAR2 (3 BYTE),
6 DD VARCHAR2 (12 BYTE)
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY DIR_TEST
12 ACCESS PARAMETERS (
13 RECORDS DELIMITED BY NEWLINE
14 BADFILE DIR_TEST : 'TEST.bad'
15 LOGFILE DIR_TEST : 'TEST.log'
16 FIELDS TERMINATED BY '\t' LDRTRIM
17 REJECT ROWS WITH ALL NULL FIELDS
18 (AA,
19 BB,
20 CC,
21 DD))
22* LOCATION (DIR_TEST:'TEST.dat'))
SQL> /
Table created.
SQL> select * from TEST_EXTERNAL_TABLE ;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
SQL> select dump(cc) from TEST_EXTERNAL_TABLE ;
DUMP(CC)
--------------------------------------------------------------------------------
Typ=1 Len=1: 32
Typ=1 Len=1: 32
SQL>
With LTRIM it does not work, because the field is otherwise empty and its whitespace is treated as being on the right-hand side. That is the default behaviour; at least since 12c, this is how it works and how it should work.
SQL> drop table TEST_EXTERNAL_TABLE ;
Table dropped.
SQL> CREATE TABLE TEST_EXTERNAL_TABLE
2 (
3 AA VARCHAR2 (40 BYTE),
4 BB VARCHAR2 (2 BYTE),
5 CC VARCHAR2 (3 BYTE),
6 DD VARCHAR2 (12 BYTE)
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY DIR_TEST
12 ACCESS PARAMETERS (
13 RECORDS DELIMITED BY NEWLINE
14 BADFILE DIR_TEST : 'TEST.bad'
15 LOGFILE DIR_TEST : 'TEST.log'
16 FIELDS TERMINATED BY '\t' LTRIM
17 REJECT ROWS WITH ALL NULL FIELDS
18 (AA,
19 BB,
20 CC,
21 DD))
22 LOCATION (DIR_TEST:'TEST.dat'))
23 REJECT LIMIT UNLIMITED;
Table created.
SQL> select * from TEST_EXTERNAL_TABLE ;
no rows selected
With RTRIM it works as expected, because the whitespace in the field is trimmed from right to left.
SQL> drop table TEST_EXTERNAL_TABLE ;
Table dropped.
SQL> CREATE TABLE TEST_EXTERNAL_TABLE
2 (
3 AA VARCHAR2 (40 BYTE),
4 BB VARCHAR2 (2 BYTE),
5 CC VARCHAR2 (3 BYTE),
6 DD VARCHAR2 (12 BYTE)
7 )
8 ORGANIZATION EXTERNAL
9 (
10 TYPE ORACLE_LOADER
11 DEFAULT DIRECTORY DIR_TEST
12 ACCESS PARAMETERS (
13 RECORDS DELIMITED BY NEWLINE
14 BADFILE DIR_TEST : 'TEST.bad'
15 LOGFILE DIR_TEST : 'TEST.log'
16 FIELDS TERMINATED BY '\t' RTRIM
17 REJECT ROWS WITH ALL NULL FIELDS
18 (AA,
19 BB,
20 CC,
21 DD))
22 LOCATION (DIR_TEST:'TEST.dat'))
23 REJECT LIMIT UNLIMITED;
Table created.
SQL> select * from TEST_EXTERNAL_TABLE ;
AA BB CC DD
---------------------------------------- -- --- ------------
NAME1 0 UK
NAME2 0 US
My advice: use LDRTRIM or, even better, avoid whitespace altogether if that is an option. Regarding your test on 11g, that is quite an old version and the behaviour is probably the consequence of a bug, although I could not find a reported one that explains it.
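If leading spaces still have to be removed, one workaround (a sketch, not something tested in the demo above) is to leave trimming out of the access parameters and apply it in SQL instead. The view name test_external_v is made up for illustration, and the sketch assumes TEST_EXTERNAL_TABLE is defined with NOTRIM or LDRTRIM as in the earlier examples:

```sql
-- Hypothetical view: trims leading blanks in SQL rather than in the access driver.
-- In Oracle, LTRIM of a blank-only string yields NULL, so CC comes back as NULL
-- for the sample rows instead of the record being rejected.
CREATE OR REPLACE VIEW test_external_v AS
SELECT LTRIM(aa) AS aa,
       LTRIM(bb) AS bb,
       LTRIM(cc) AS cc,
       LTRIM(dd) AS dd
FROM   test_external_table;
```

One caveat with this approach: because the external table now receives untrimmed values, its columns may need to be declared wide enough to hold the leading spaces, otherwise rows could be rejected for exceeding the column length.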
It's not LTRIM, it's LDRTRIM.
SQL> create table et
2 ( c1 varchar2(16),
3 c2 varchar2(8),
4 c3 varchar2(8),
5 c4 varchar2(8),
6 c5 varchar2(8),
7 c6 varchar2(8),
8 c7 varchar2(8)
9 )
10 ORGANIZATION EXTERNAL
11 ( TYPE ORACLE_LOADER
12 DEFAULT DIRECTORY temp
13 ACCESS PARAMETERS
14 ( RECORDS DELIMITED BY NEWLINE
15 BADFILE temp: 'TEST_FILE.bad'
16 LOGFILE temp: 'TEST_FILE.log'
17 FIELDS TERMINATED BY X'20A7' LTRIM
18 REJECT ROWS WITH ALL NULL FIELDS
19 (
20 c1,c2,c3,c4,c5,c6,c7
21 ) )
22 LOCATION (temp:'TEST_FILE.dat')
23 )
24 REJECT LIMIT UNLIMITED;
Table created.
SQL>
SQL> select * from et;
C1 C2 C3 C4 C5 C6 C7
---------------- -------- -------- -------- -------- -------- --------
31234569999999 0 A X 0 Z GGGG
SQL>
SQL> drop table et;
Table dropped.
SQL>
SQL> create table et
2 ( c1 varchar2(16),
3 c2 varchar2(8),
4 c3 varchar2(8),
5 c4 varchar2(8),
6 c5 varchar2(8),
7 c6 varchar2(8),
8 c7 varchar2(8)
9 )
10 ORGANIZATION EXTERNAL
11 ( TYPE ORACLE_LOADER
12 DEFAULT DIRECTORY temp
13 ACCESS PARAMETERS
14 ( RECORDS DELIMITED BY NEWLINE
15 BADFILE temp: 'TEST_FILE.bad'
16 LOGFILE temp: 'TEST_FILE.log'
17 FIELDS TERMINATED BY X'20A7' LDRTRIM
18 REJECT ROWS WITH ALL NULL FIELDS
19 (
20 c1,c2,c3,c4,c5,c6,c7
21 ) )
22 LOCATION (temp:'TEST_FILE.dat')
23 )
24 REJECT LIMIT UNLIMITED;
Table created.
SQL>
SQL> select * from et;
C1 C2 C3 C4 C5 C6 C7
---------------- -------- -------- -------- -------- -------- --------
31234569999999 0 A X 0 GGGG
31234569999999 0 A X 0 Z GGGG
load data
infile 'c:\oracle_toad\sql_loader\v1_data.txt'
replace into table v1 fields terminated by ','
( a integer external, b char, c char )
1,2,"da,ta1"
2,4,"dat,a2"
2,4,"da,ta2"
The double quotes are not supposed to be inserted as part of the data; they are just there for reference.
I intentionally put a "," inside each of the values.
I am hoping to load rows like 1, 2, "da,ta1". Is there a way to include the separator "," within the data?
Here's an example:
Test table:
SQL> create table test (col1 number, col2 varchar2(20), col3 varchar2(20));
Table created.
Control file:
load data
infile *
replace
into table test
fields terminated by ',' optionally enclosed by '"'
trailing nullcols
(
col1,
col2,
col3
)
begindata
1,2,"da,ta1"
2,4,"dat,a2"
2,4,"da,ta2"
Loading session & the result:
SQL> $sqlldr scott/tiger control=test04.ctl log=test04.log
SQL*Loader: Release 11.2.0.2.0 - Production on Mon Aug 27 14:11:26 2018
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 2
Commit point reached - logical record count 3
SQL> select * From test;
COL1 COL2 COL3
---------- -------------------- --------------------
1 2 da,ta1
2 4 dat,a2
2 4 da,ta2
SQL>
I have an empty temp table and want to load data from a flat file into it. In the flat file, column col3 contains "X", but in the table I want to store "abc" instead. Is it possible to remove the value "X" during the load, or to replace "X" with "abc"?
SQL*Loader lets you apply SQL operators to fields, so you can manipulate the value from the file.
Let's say you have a simple table like:
create table your_table(col1 number, col2 number, col3 varchar2(3));
and a data file like:
1,42,xyz
2,42,
3,42,X
then you could make your control file replace an 'X' value in col3 with the fixed value 'abc' using a case expression:
load data
replace
into table your_table
fields terminated by ',' optionally enclosed by '"'
trailing nullcols
(
col1,
col2,
col3 "CASE WHEN :COL3 = 'X' THEN 'abc' ELSE :COL3 END"
)
Loading that data file with that control file inserts three rows:
select * from your_table;
COL1 COL2 COL
---------- ---------- ---
1 42 xyz
2 42
3 42 abc
The 'X' has been replaced, the other values are retained.
If you want to 'remove' the value, rather than replacing it, you could do the same thing but with null as the fixed value:
col3 "CASE WHEN :COL3 = 'X' THEN NULL ELSE :COL3 END"
or you could use nullif or defaultif:
col3 nullif(col3 = 'X')
DECODE, right?
SQL> create table test (id number, col3 varchar2(20));
Table created.
SQL> $type test25.ctl
load data
infile *
replace into table test
fields terminated by ',' trailing nullcols
(
id,
col3 "decode(:col3, 'x', 'abc', :col3)"
)
begindata
1,xxx
2,yyy
3,x
4,123
SQL>
SQL> $sqlldr scott/tiger@orcl control=test25.ctl log=test25.log
SQL*Loader: Release 11.2.0.2.0 - Production on Thu Mar 29 12:57:56 2018
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 3
Commit point reached - logical record count 4
SQL> select * From test order by id;
ID COL3
---------- --------------------
1 xxx
2 yyy
3 abc
4 123
SQL>
Could anyone please explain the two statements below regarding Oracle external table performance with the ORACLE_LOADER access driver?
Fixed-length records are processed faster than records terminated by
a string.
Fixed-length fields are processed faster than delimited fields.
An explanation with code would help me understand the concept in depth. Here are the two syntaxes:
Fixed field length
create table ext_table_fixed (
field_1 char(4),
field_2 char(30)
)
organization external (
type oracle_loader
default directory ext_dir
access parameters (
records delimited by newline
fields (
field_1 position(1: 4) char( 4),
field_2 position(5:30) char(30)
)
)
location ('file')
)
reject limit unlimited;
Comma delimited
create table ext_table_csv (
i Number,
n Varchar2(20),
m Varchar2(20)
)
organization external (
type oracle_loader
default directory ext_dir
access parameters (
records delimited by newline
fields terminated by ','
missing field values are null
)
location ('file.csv')
)
reject limit unlimited;
Simplified, conceptual, non-database-specific explanation:
When the maximum possible record length is known in advance, the end of the record/the beginning of the next record can be found in constant time. This is because that location is computable using simple addition, very much analogous to array indexing. Imagine that I'm using ints as pointers to records, and that the record size is an integer constant defined somewhere. Then, to get from the current record location to the next:
int current_record = /* whatever */;
int next_record = current_record + FIXED_RECORD_SIZE;
That's it!
Alternatively, when using string-terminated (or otherwise delimited) records and fields, you could imagine that the next field/record is found by a linear-time scan, which has to look at every character until the delimiter is found. As before,
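char DELIMITER = ','; // or whatever
int current_record = /* whatever */;
int next_record = current_record;
// Scan forward one character at a time until the delimiter is found.
while (character_at_location(next_record) != DELIMITER) {
    next_record++;
}
next_record++; // step past the delimiter to the start of the next record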
This might be a simplified or naïve version of the real-world implementation, but the general idea still stands: you can't easily do the same operation in constant time, and even if it were constant time, it's unlikely to be as fast as performing a single add operation.
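For the ORACLE_LOADER case specifically, note that a table using RECORDS DELIMITED BY NEWLINE with POSITION fields has fixed-length fields but still string-terminated records. A sketch of a definition where both are fixed-length might look like the following. The table name, file name and sizes are assumptions: field_2 is taken to occupy bytes 5 to 34, and each record is taken to be 34 data bytes plus a one-byte newline, which RECORDS FIXED counts as part of the record.

```sql
-- Hypothetical variant of ext_table_fixed with fixed-length records as well as
-- fixed-length fields. RECORDS FIXED 35 means every record is exactly 35 bytes,
-- including the trailing newline, so no scan for a record terminator is needed.
CREATE TABLE ext_table_fixed_rec (
  field_1 CHAR(4),
  field_2 CHAR(30)
)
ORGANIZATION EXTERNAL (
  TYPE oracle_loader
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS (
    RECORDS FIXED 35
    FIELDS (
      field_1 POSITION(1:4)  CHAR(4),
      field_2 POSITION(5:34) CHAR(30)
    )
  )
  LOCATION ('file')
)
REJECT LIMIT UNLIMITED;
```

This only works if every line in the file really has the same byte length; files with Windows line endings would need the record size increased by one byte.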
I checked this, and in my case performance deteriorated! I have a 1 GB CSV file with integer values, each 10 characters long with padding, with fields separated by "," and records separated by "\n". I have the following scripts (I also tried setting a fixed record size and removing ltrim, but it didn't help).
SQL> CREATE TABLE ints_ext (id0 NUMBER(10),
2 id1 NUMBER(10),
3 id2 NUMBER(10),
4 id3 NUMBER(10),
5 id4 NUMBER(10),
6 id5 NUMBER(10),
7 id6 NUMBER(10),
8 id7 NUMBER(10),
9 id8 NUMBER(10),
10 id9 NUMBER(10))
11 ORGANIZATION EXTERNAL (
12 TYPE oracle_loader
13 DEFAULT DIRECTORY tpch_dir
14 ACCESS PARAMETERS (
15 RECORDS DELIMITED BY NEWLINE
16 BADFILE 'bad_%a_%p.bad'
17 LOGFILE 'log_%a_%p.log'
18 FIELDS TERMINATED BY ','
19 MISSING FIELD VALUES ARE NULL)
20 LOCATION ('data1_1.csv'))
21 parallel 1
22 REJECT LIMIT 0
23 NOMONITORING;
SQL> select count(*) from ints_ext;
COUNT(*)
----------
9761289
Elapsed: 00:00:43.68
SQL> select /*+ parallel(1) tracing(STRIP,1) */ * from ints_ext;
no rows selected
Elapsed: 00:00:43.78
SQL> CREATE TABLE ints_ext (id0 NUMBER(10),
2 id1 NUMBER(10),
3 id2 NUMBER(10),
4 id3 NUMBER(10),
5 id4 NUMBER(10),
6 id5 NUMBER(10),
7 id6 NUMBER(10),
8 id7 NUMBER(10),
9 id8 NUMBER(10),
10 id9 NUMBER(10))
11 ORGANIZATION EXTERNAL (
12 TYPE oracle_loader
13 DEFAULT DIRECTORY tpch_dir
14 ACCESS PARAMETERS (
15 RECORDS DELIMITED BY NEWLINE
16 BADFILE 'bad_%a_%p.bad'
17 LOGFILE 'log_%a_%p.log'
18 FIELDS ltrim (
19 id0 position(1:10) char(10),
20 id1 position(12:21) char(10),
21 id2 position(23:32) char(10),
22 id3 position(34:43) char(10),
23 id4 position(45:54) char(10),
24 id5 position(56:65) char(10),
25 id6 position(67:76) char(10),
26 id7 position(78:87) char(10),
27 id8 position(89:98) char(10),
28 id9 position(100:109) char(10)
29 ))
30 LOCATION ('data1_1.csv'))
31 parallel 1
32 REJECT LIMIT 0
33 NOMONITORING;
SQL> select count(*) from ints_ext;
COUNT(*)
----------
9761289
Elapsed: 00:00:50.38
SQL>
select /*+ parallel(1) tracing(STRIP,1) */ * from ints_ext;
no rows selected
Elapsed: 00:00:45.26