In order to load data (from a CSV file) into an Oracle database, I use SQL*Loader.
In the table that receives these data, there is a varchar2(500) column, called COMMENTS.
For some reasons, I want to ignore this information from the CSV file.
Thus, I wrote this control file:
Options (BindSize=10000000,Readsize=10000000,Rows=5000,Errors=100)
Load Data
Infile 'XXX.txt'
Append into table T_XXX
Fields Terminated By ';'
TRAILING NULLCOLS
(
...
COMMENTS FILLER,
...
)
This code seems to work correctly, as the COMMENTS field in database is always set to null.
However, if in my CSV file I have a record where the corresponding COMMENTS field exceeds the 500 characters limit, I get an error from SQL*Loader:
Record 2: Rejected - Error on table T_XXX, column COMMENTS.
Field in data file exceeds maximum length
Is there a way to really exclude the processing of my COMMENTS fields?
I can't reproduce your problem. I'm using Oracle 10.2.0.3.0 with SQL*Loader 10.2.0.1.
Here is my test case:
SQL> CREATE TABLE test_sqlldr (
2 ID NUMBER,
3 comments VARCHAR2(20),
4 id2 NUMBER
5 );
Table created
Control file:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments filler,
id2
)
data file:
1;aaa;2
3;abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz;4
5;bbb;6
I'm using the command sqlldr userid=xxx/yyy#zzz control=test.ctl and I'm getting all the rows without errors:
SQL> select * from test_sqlldr;
ID COMMENTS ID2
---------- -------------------- ----------
1 2
3 4
5 6
You may try another approach, I'm getting the same desired result with the following control file:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments "substr(:comments,1,0)",
id2
)
Update following Romaintaz's comment: I looked into it again and managed to get the same error as you when the size of the column exceeded 255 characters. This is because the default datatype of SQL*Loader is char(255). If you have a column with more data you will have to specify the length. The following control file solved the problem for a column with 300 characters:
LOAD DATA
INFILE test.data
INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'
TRAILING NULLCOLS
( id,
comments filler char(4000),
id2
)
Hope this Helps,
--
Vincent
Just to suggest a tiny improvement, you might try something like:
LOAD DATA
IN FILE test.data INTO TABLE test_sqlldr
APPEND
FIELDS TERMINATED BY ';'TRAILING NULLCOLS
(
id,
comments char(4000) "substr(:comments, 1, 200)",
id2)
Now you'll grab the first 200 characters (or any number you specify in it's place) of all comments - unless some of your input records have values for the comments field that exceed 4000 characters, in which they'll be rejected by loader with the 'exceeds max length' error noted earlier. But assuming that's rare or not the case, all the records will load with some of the comments truncated to 200 chars.
If you go over char(4000) you'll get a SQL Loader error - there's a limit to how far you can push the beast.
Related
I'm trying to load data from a Datafile in different tables, I read a lot about field declaration and delimitation(Position(n:n), terminated by ). The point is than I'm not sure how to do what I need to do. Let me explain this with an example.
I have two tables (person, phone):
person_table( person_id_pk, person_name) - phone_table(person_id_pk, phone)
I have a datafile with:
$ datafile.txt
1,jack pierson,+13526985442
2,Katherine McLaren,+15264586548
My point is, when I'm declaring my ConfigFile.ctl, how do I specify than the field number 3 (phone field) should be insert or append into "phone_table", and the others two fields (person_id, person_name) should be insert or append into "person_table"
Considering than the fields are not fixed length, my reference is the field position. (Field datafile position)
I was thinking to try something like
$configfile.ctl
LOAD DATA
INFILE datafile.txt
APPEND
INTO TABLE person_table
(
person_id_pk POSITION (*) INTEGER EXTERNAL TERMINATED BY "," ,
person_name POSITION(*+1) CHAR(30) TERMINATED BY ","
)
INTO TABLE phone_table
(
person_id_fk POSITION (*) INTEGER EXTERNAL TERMINATED BY ","
phone ------> Right here is my point, how can I specify to SQL Loader than here
should be the field number 3 from datafile
)
I hope you guys get my point. it is a HUGE issue for me, because i'm dealing with CSV files which contains 60, 80, even 100 fields (columns based on Excel File). And every fields or group of fields could be in different tables.
I really appreciate the guide and help you could grant me. I'm probably wrong about my example and controlfile declarations, I haven't implemented anything yet. So I'm open to every suggest you could give me.
Your control file should look like this. The second "INTO TABLE" Uses POSITION(1) to move the logical "pointer" back to the start of the current line so it can be read again. then the name is skipped by defining it as a FILLER.
LOAD DATA
INFILE datafile.txt
APPEND
INTO TABLE person_table
FIELDS TERMINATED BY "," TRAILING NULLCOLS
(
person_id_pk INTEGER EXTERNAL,
person_name CHAR(30)
)
INTO TABLE phone_table
FIELDS TERMINATED BY "," TRAILING NULLCOLS
(
person_id_fk POSITION(1) INTEGER EXTERNAL,
x_name FILLER,
phone CHAR(12)
)
I'm loading data into my table through SQL Loader
data loading is successful but i''m getting garbage(repetitive) value in a particular column for all rows
After inserting :
column TERM_AGREEMENT is getting value '806158336' for every record
My csv file contains atmost 3 digit data for that column,but i'm forced to set my column definition to Number(10).
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
**TERM_AGREEMENT INTEGER**
)
create table LOAN_BALANCE_MASTER_INT
(
ACCOUNT_NO NUMBER(30),
CUSTOMER_NAME VARCHAR2(70),
LIMIT NUMBER(30),
PRODUCT_DESC VARCHAR2(30),
SUBPRODUCT_CODE NUMBER,
ARREARS_INT NUMBER(20,2),
IRREGULARITY NUMBER(20,2),
PRINCIPLE_IRREGULARITY NUMBER(20,2),
**TERM_AGREEMENT NUMBER(10)**
)
INTEGER is for binary data type. If you're importing a csv file, I suppose the numbers are stored as plain text, so you should use INTEGER EXTERNAL. The EXTERNAL clause specifies character data that represents a number.
Edit:
The issue seems to be the termination character of the file. You should be able to solve this issue by editing the INFILE line this way:
INFILE'/ipoapplication/utl_file/LBR_HE_Mar16.csv' "STR X'5E204D'"
Where '5E204D' is the hexadecimal for '^ M'. To get the hexadecimal value you can use the following query:
SELECT utl_raw.cast_to_raw ('^ M') AS hexadecimal FROM dual;
Hope this helps.
I actually solved this issue on my own.
Firstly, thanks to #Gary_W AND #Alessandro for their inputs.Really appreciate your help guys,learned some new things in the process.
Here's the new fragment which worked and i got the correct data for the last column
LOAD DATA
infile '/ipoapplication/utl_file/LBR_HE_Mar16.csv'
REPLACE
INTO TABLE LOAN_BALANCE_MASTER_INT
fields terminated by ',' optionally enclosed by '"'
(
ACCOUNT_NO,
CUSTOMER_NAME,
LIMIT,
REGION,
**TERM_AGREEMENT INTEGER Terminated by Whitspace**
)
'Terminated by whitespace' - I went through some threads of SQL Loader and i used 'terminated by whitespace' in the last column of his ctl file. it worked ,this time i didn't even had to use 'INTEGER' or 'EXTERNAL' or EXPRESSION '..' for conversion.
Just one thing, now can you guys let me now what could possibly be creating issue ?what was there in my csv file in that column and how by adding this thing solved the issue ?
Thanks.
As mentioned in the title, i wish to have a control file to handle this case. The scenario is i have to insert record into different table. For example, WHEN (1:3) is HEA, it need to Append into table header. WHEN (1:3) is DTL it need replace into table Detail. is that possible to do this?
I have a situation where data from one file goes to three tables depending on the first field in the file. The WHEN clause looks at the first field and takes action based on that. Notice that when a 'WHEN' is met, the first field is then skipped by declaring it a filler. To answer your question, I believe you can put the APPEND or REPLACE after the INTO TABLE clause. Give it a try and let us know.
OPTIONS (DIRECT=TRUE)
UNRECOVERABLE
LOAD DATA
APPEND
INTO TABLE TABLE_A
WHEN (01) = 'CLM'
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
( rec_skip filler POSITION(1)
,CLM_CLAIM_ID CHAR NULLIF(CLM_CLAIM_ID=BLANKS)
...
)
INTO TABLE TABLE_B
WHEN (01) = 'SLN'
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
( rec_skip filler POSITION(1)
,SL_CLAIM_ID CHAR NULLIF(SL_CLAIM_ID=BLANKS)
...
)
INTO TABLE TABLE_C
WHEN (01) = 'COB'
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
( rec_skip filler POSITION(1)
,COB_CLAIM
...
)
More info: http://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_control_file.htm#i1005657
I have CSV file. The data looks like this :
PRICE_a
123
PRICE_b
500
PRICE_c
1000
PRICE_d
506
My XYZ Table is :
CREATE TABLE XYZ (
DESCRIPTION_1 VARCHAR2(25),
VALUE NUMBER
)
Do csv as above can be imported to the oracle?
How do I create a control.ctl file?
Here's how to do it without having to do any pre-processing. Use the CONCATENATE 2 clause to tell SQL-Loader to join every 2 lines together. This builds logical records but you have no separator between the 2 fields. No problem, but first understand how the data file is read and processed. SQL-Loader will read the data file a record at a time, and try to map each field in order from left to right to the fields as listed in the control file. See the control file below. Since the concatenated record it read matches with TEMP from the control file, and TEMP does not match a column in the table, it will not try to insert it. Instead, since it is defined as a BOUNDFILLER, that means don't try to do anything with it but save it for future use. There are no more data file fields to try to match, but the control file next lists a field name that matches a column name, DESCRIPTION_1, so it will apply the expression and insert it.
The expression says to apply the regexp_substr function to the saved string :TEMP (which we know is the entire record from the file) and return the substring of that record consisting of zero or more non-numeric characters from the start of the string where followed by zero or more numeric characters until the end of the string, and insert that into the DESCRIPTION_1 column.
The same is then done for the VALUE column, only returning the numeric part at the end of the string, skipping the non-numeric at the beginning of the string.
load data
infile 'xyz.dat'
CONCATENATE 2
into table XYZ
truncate
TRAILING NULLCOLS
(
TEMP BOUNDFILLER CHAR(30),
DESCRIPTION_1 EXPRESSION "REGEXP_SUBSTR(:TEMP, '^([^0-9]*)[0-9]*$', 1, 1, NULL, 1)",
VALUE EXPRESSION "REGEXP_SUBSTR(:TEMP, '^[^0-9]*([0-9]*)$', 1, 1, NULL, 1)"
)
Bada-boom, bada-bing:
SQL> select *
from XYZ
/
DESCRIPTION_1 VALUE
------------------------- ----------
PRICE_a 123
PRICE_b 500
PRICE_c 1000
PRICE_d 506
SQL>
Note that this is pretty dependent on the data following your example, and you should do some analysis of the data to make sure the regular expressions will work before putting this into production. Some tweaking will be required if the descriptions could contain numbers. If you can get the data to be properly formatted with a separator in a true CSV format, that would be much better.
When creating Oracle external tables, how should I phrase the reject rows clause to ensure that any field which exceeds its column width rejects and goes in the BADFILE?
This is my current design and I don't want records greater than 20 characters. I do want them to go BADFILE instead. Yet, they still appear when I select * from foobar
DROP TABLE FOOBAR CASCADE CONSTRAINTS;
CREATE TABLE FOOBAR
(
FOO_MAX20 VARCHAR2(20 CHAR)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY FOOBAR
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
BADFILE 'foobar_bad_rec.txt'
DISCARDFILE 'foobar_discard_rec.txt'
LOGFILE 'foobar_logfile.txt'
FIELDS
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
FOO_MAX20 POSITION(1:20)
)
)
LOCATION (foobar:'foobar.txt') )
REJECT LIMIT UNLIMITED
PARALLEL ( DEGREE DEFAULT INSTANCES DEFAULT )
NOMONITORING;
Here is my external file foobar.txt
1234567
1234567890123456
126464843750476074218751012345678901234567890
7135009765625
048669433593
7
527
You can't do this with the reject rows clause, as it only accepts one form.
You have a variable-length (delimited) record, but a fixed-length field. Everything after the last position you specify, which is 20 in this case, is seen as filler that you want to ignore. That isn't an error condition; you might have rubbish at the end that isn't relevant to your table. There is nothing that says chars 21-45 in your third record shouldn't be there - just that you aren't interested in them.
It would be nice if you could discard them with the load when clause, but you don't seem to be able to compare , say, (21:21) to null or an empty string - the former isn't recognised and the latter causes an internal error, which isn't good.
You can make the longer records be sent to the bad file by forcing an SQL error when it tries to put a longer parsed value from the file into the field, by changing:
FOO_MAX20 POSITION(1:20)
to
FOO_MAX20 POSITION(1:21)
Values that are up to 20 characters are still loaded:
select * from foobar;
FOO_MAX20
--------------------
1234567
1234567890123456
7135009765625
048669433593
7
527
6 rows selected
but for anything longer than 20 characters it'll try to put 21 chars in to the database's 20-char field, which gets this in the log:
error processing column FOO_MAX20 in row 3 for datafile /path/to/dir/foobar.txt
ORA-12899: value too large for column FOO_MAX20 (actual: 21, maximum: 20)
And the bad file gets that record:
126464843750476074218751012345678901234567890
Have a CHECK CONSTRAINT on the column to not allow any value exceeding the `LENGTH'.