Location and filename of most recent BADFILE when using external tables - oracle

Is there a way to determine the location/filename of the latest BADFILE?
When I select from an external table with BADFILE 'mytable_%a_%p.bad', how do I find out what specific values %a and %p were replaced with?
Or am I stuck with using mytable.bad, which I can query reliably, and hoping that there will be no race conditions?

As the documentation states:
%p is replaced by the process ID of the current process. For example,
if the process ID of the access driver is 12345, then exttab_%p.log
becomes exttab_12345.log.
%a is replaced by the agent number of the current process. The agent
number is the unique number assigned to each parallel process
accessing the external table. This number is padded to the left with
zeros to fill three characters. For example, if the third parallel
agent is creating a file and bad_data_%a.bad was specified as the file
name, then the agent would create a file named bad_data_003.bad.
If %p or %a is not used to create unique file names for output files
and an external table is being accessed in parallel, then output files
may be corrupted or agents may be unable to write to the files.
Having said that, you must remember the purpose of the badfile in the first place.
The BADFILE clause names the file to which records are written when
they cannot be loaded because of errors. For example, a record would
be written to the bad file if a field in the data file could not be
converted to the data type of a column in the external table. The
purpose of the bad file is to have one file where all rejected data
can be examined and fixed so that it can be loaded. If you do not
intend to fix the data, then you can use the NOBADFILE option to
prevent creation of a bad file, even if there are bad records.
So the idea (either for SQL*Loader or for external tables with the ORACLE_LOADER access driver) is to have a file to store those records, the bad records, not to trace anything about them.
Normally you have external tables associated with text files that you receive on a daily/weekly/monthly basis. You store in the badfile those records that can't be read/loaded according to your table specification.
You then use the LOGFILE to find out what has happened. Those files are generated in the database directory where the external table is created, and you will have one for each time a badfile needs to be generated.
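As a minimal sketch of how this fits together (the directory object EXT_DIR, the table name and the file names are made-up examples, not from the question):

CREATE TABLE mytable_ext (
  id   NUMBER,
  name VARCHAR2(100)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    BADFILE ext_dir:'mytable_%a_%p.bad'   -- %a/%p substituted at run time
    LOGFILE ext_dir:'mytable.log'         -- fixed name, easy to find
    FIELDS TERMINATED BY ','
    ( id, name )
  )
  LOCATION ('mytable.csv')
)
REJECT LIMIT UNLIMITED;

With a fixed LOGFILE name like this, every SELECT that rejects rows appends to mytable.log, and the log normally records the name of the bad file that was actually written (with %a and %p already resolved), so inspecting the most recent log entries is the practical way to find the latest bad file.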

Related

SQL LOADER Control File without fields

I'm working on a task to load a database table from a flat file. My database table has 60 columns.
Now, in the SQL*Loader control file, is it mandatory to mention all 60 fields?
Is there a way to tell SQL*Loader that all 60 columns should be treated as required, without mentioning the fields in the control file?
Oracle 12c (and higher versions) offers express mode.
In a few words (quoting the documentation):
The SQL*Loader TABLE parameter triggers express mode. The value of the TABLE parameter is the name of the table that SQL*Loader will load. If TABLE is the only parameter specified, then SQL*Loader will do the following:
Looks for a data file in the current directory with the same name as the table being loaded that has an extension of ".dat". The upper/lower case used in the name of the data file is the same as the case for the table name specified in the TABLE parameter
Assumes the order of the fields in the data file matches the order of the columns in the table
Assumes the fields are terminated by commas, but there is no enclosure character
(...) order of the fields in the data file matches the order of the columns in the table. The following SQL*Loader command will load the table from the data file.
sqlldr userid=scott table=emp
Notice that no control file is used. After executing the SQL*Loader command, a SELECT from the table will return (...)
I guess that's what you're after.
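For instance, a minimal sketch (EMP_STAGE and emp_stage.dat are made-up names; the file must sit in the current directory, be comma-separated, and match the column order of the table):

$ cat emp_stage.dat
100,Thomas,Sales
200,Jason,Technology

$ sqlldr userid=scott table=emp_stage

Express mode also writes a log file that, among other things, contains a generated control file you can copy and adjust if the defaults do not quite fit.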

SQL Loader in Oracle

I am inserting data from a CSV file into an Oracle table using SQL*Loader and it is working fine.
LOAD DATA
INFILE DataOut.txt
BADFILE dataFile.bad
APPEND INTO TABLE ASP_Net_C_SHARP_Articles
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(ID,Name,Category)
The above settings are being used to do that, but I do not want to specify any of the column names, e.g. (ID,Name,Category).
Is this possible, and if so, can anybody tell me how?
In SQL*Loader you need to specify the column names. If you still want to omit the column names from the control file, then I would suggest you use SQL to "discover" the names of the columns, dynamically generate the control file, and wrap it in a shell script to make it more automated.
Meanwhile, you can consider external tables, which use the SQL*Loader engine, so you will still have to perform some dynamic generation for your input file as suggested above. But you can create a script that scans the input file and dynamically generates the CREATE TABLE ... ORGANIZATION EXTERNAL command for you. Then the data becomes available as if it were a table in your database.
You can also partially skip columns if that would help you, by using FILLER. BOUNDFILLER (available with Oracle 9i and above) can be used if the skipped column's value will be needed again later.
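A rough sketch of what that looks like in a control file (the extra field and column names are made up for illustration):

LOAD DATA
INFILE 'DataOut.txt'
APPEND INTO TABLE ASP_Net_C_SHARP_Articles
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
  ID,
  ignored_col FILLER,                             -- read from the file, never loaded
  Name,
  raw_date    BOUNDFILLER,                        -- not loaded itself, but reusable below
  Category,
  created_at  "TO_DATE(:raw_date, 'YYYY-MM-DD')"  -- hypothetical column using the bound filler
)

FILLER simply skips a field in the data file, while BOUNDFILLER keeps its value available as a bind variable (:raw_date here) for expressions on other columns.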

How to rollback or not commit with sql loader [duplicate]

If while loading this file
$ cat employee.txt
100,Thomas,Sales,5000
200,Jason,Technology,5500
300,Mayla,Technology,7000
400,Nisha,Marketing,9500
500,Randy,Technology,6000
501,Ritu,Accounting,5400
using the control file (say) sqlldr-add-new.ctl, I come to know that all the records are faulty, so I want the previously loaded records in that table (those loaded yesterday) to be retained if today's load has any errors. How do I handle this exception?
This is my sample ctl file
$ cat sqlldr-add-new.ctl
load data
infile '/home/ramesh/employee.txt'
into table employee
fields terminated by ","
( id, name, dept, salary )
You can't roll back from SQL*Loader; it commits automatically. This is mentioned in the errors parameter description:
On a single-table load, SQL*Loader terminates the load when errors exceed this error limit. Any data inserted up to that point, however, is committed.
And there's a section on interrupted loads.
You could attempt to load the data to a staging table, and if it is successful move the data into the real table (with delete/insert into .. select .., or with a partition swap if you have a large amount of data). Or you could use an external table and do the same thing, but you'd need a way to determine if the table had any discarded or rejected records.
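A rough sketch of that staging approach (table names and the publish step are illustrative, not from the question):

-- 1) point the control file at EMPLOYEE_STG instead of EMPLOYEE and run SQL*Loader
-- 2) only if the load reported no rejected/discarded rows, publish the data:
DELETE FROM employee;

INSERT INTO employee (id, name, dept, salary)
SELECT id, name, dept, salary
FROM   employee_stg;

COMMIT;
-- if the load failed, simply skip the publish step: yesterday's rows in EMPLOYEE are untouched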
Try with ERRORS=0.
You can find the full explanation here:
http://docs.oracle.com/cd/F49540_01/DOC/server.815/a67792/ch06.htm
ERRORS (errors to allow)
ERRORS specifies the maximum number of insert errors to allow. If the number of errors exceeds the value of ERRORS parameter, SQL*Loader terminates the load. The default is 50. To permit no errors at all, set ERRORS=0. To specify that all errors be allowed, use a very high number.
On a single-table load, SQL*Loader terminates the load when errors exceed this error limit. Any data inserted up to that point, however, is committed.
SQL*Loader maintains the consistency of records across all tables. Therefore, multi-table loads do not terminate immediately if errors exceed the error limit. When SQL*Loader encounters the maximum number of errors for a multi-table load, it continues to load rows to ensure that valid rows previously loaded into tables are loaded into all tables and/or rejected rows are filtered out of all tables.
In all cases, SQL*Loader writes erroneous records to the bad file.
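For example, with the control file from the question (the credentials are a placeholder):

sqlldr userid=scott/tiger control=sqlldr-add-new.ctl errors=0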

updating data in external table

Let's assume the following scenario:
I have several users that will prepare .csv files (not being aware of each other so concurrency is possible).
The .csv file will always be in same format.
The data in the .csv file will contain a list of ids together with some other columns like update_date.
Based on that data I will create a procedure that will update data in the real DB table.
The idea is to use external tables, to simplify things as much as possible for the .csv creators: they will just put files in a folder and stuff will be done for them; the rest is my job.
The questions are:
Can I have several files as the source for one external table, or do I need one external table for each file? (What I mean is: whenever there is a new call to load data from a csv, it should be added to the existing external table, so not all files are loaded at once.)
Can I update records/fields in an external table?
An external table basically allows you to query the data stored in the external file(s). So from this point of view you can't issue an UPDATE on it.
You can
1) add new files in the directory and ALTER the table
ALTER TABLE my_ex LOCATION ('file1.csv','file2.csv');
2) you can of course modify the existing files as well. There is no database state for the external table; each SELECT reads the data from the files, so you will always see the "updated" contents.
** UPDATE **
An attempt to modify (e.g. UPDATE) leads to ORA-30657 operation not supported on external organized table.
To be able to maintain state in the database, the data must first be copied into a regular table (CTAS: CREATE TABLE ... AS SELECT from the external table).
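For example, reusing the table name from the ALTER example above (the copy's name is made up; id and update_date stand in for the question's columns):

CREATE TABLE my_ex_copy AS
SELECT * FROM my_ex;        -- materialize the external data in a regular heap table

UPDATE my_ex_copy           -- updates are now allowed on the copy
SET    update_date = SYSDATE
WHERE  id = 42;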

Oracle SQL save file name from LOCATION as a column in external table

I have several input files being read into an external table in Oracle. I want to run some queries across the content from all the files; however, there are some queries where I would like to filter the data based on the input file it came from. Is there a way to access the name of the source file in a SELECT statement against an external table, or somehow create a column in the external table that includes the location source?
Here is an example:
CREATE TABLE MY_TABLE (
first_name CHAR(100 BYTE),
last_name CHAR(100 BYTE)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TMP
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
SKIP 1
badfile 'my_table.bad'
discardfile 'my_table.dsc'
LOGFILE 'my_table.log'
FIELDS terminated BY 0x'09' optionally enclosed BY '"' LRTRIM missing field VALUES are NULL
(
first_name char(100),
last_name
)
)
LOCATION ( TMP:'file1.txt','file2.txt')
)
REJECT LIMIT 100;
select distinct last_name
from MY_TABLE
where location like 'file2.txt' -- This is the part I don't know how to code
Any suggestions?
There is always the option to add the file name to the input file itself as an additional column. Ideally, I would like to avoid this workaround.
The ALL_EXTERNAL_LOCATIONS data dictionary view contains information about external table locations. Also DBA_* and USER_* versions.
Edit: (It would help if I read the question thoroughly.)
You don't just want to read the location for the external table, you want to know which row came from which file. Basically, you need to:
Create a shell script that prepends the file location to the file contents and writes them to stdout.
Add the PREPROCESSOR directive to your external table definition to execute the script.
Alter the external table definition to include a column to show the filename appended in the first step.
Here is an asktom article explaining it in detail.
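A rough sketch of those three steps (the script name and the added column are made up; a preprocessor script should call utilities by their full path and read the data file whose path is passed as $1):

#!/bin/sh
# add_fname.sh: prepend the data file name to every record, tab-separated
/usr/bin/awk -v f="$1" 'BEGIN { OFS = "\t" } { print f, $0 }' "$1"

Then the table gets an extra leading column and the PREPROCESSOR directive:

CREATE TABLE MY_TABLE (
src_file   VARCHAR2(400),
first_name CHAR(100 BYTE),
last_name  CHAR(100 BYTE)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TMP
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
SKIP 1
PREPROCESSOR TMP:'add_fname.sh'
FIELDS TERMINATED BY 0x'09' OPTIONALLY ENCLOSED BY '"' LRTRIM MISSING FIELD VALUES ARE NULL
( src_file CHAR(400), first_name CHAR(100), last_name CHAR(100) )
)
LOCATION ( TMP:'file1.txt','file2.txt')
)
REJECT LIMIT 100;

select distinct last_name
from MY_TABLE
where src_file like '%file2.txt';

The script must be executable and reside in a directory object the database can execute from; the exact setup depends on your environment.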
