Oracle SQLLDR - Load Record with Invalid Date, Replace Invalid Date with Null

There are records in my source text file with invalid date values. The invalid date values are inconsistent in format due to manual entry. I still want to load all of these records, but I want to replace the invalid date value with a null.
Please let me know if/how this is possible via SQLLDR control file commands. I want to avoid creating any custom functions. Something simple that handles errors/exceptions generically, and that actually works (unlike the attempt below), would be ideal:
DATE "MM/DD/YYYY" NULLIF (FROM_DOS=EXCEPTION)
Thanks!

As far as I can tell, that can't be done in a single pass. I'd suggest trying a relatively simple two-pass approach (a control-file sketch follows these steps):
load the original data "as is"
rows with invalid dates won't be loaded, but will end in the .BAD file
then modify the control file:
source will now be the .BAD file
load NULL into the date column (FILLER might help)
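For illustration, the second-pass control file could look something like this (the file, table, and field names are hypothetical; the key point is declaring the date field as FILLER so the target column is left NULL):
LOAD DATA
INFILE 'orders.bad'
APPEND INTO TABLE orders
FIELDS TERMINATED BY ','
(order_id,
 customer,
 order_date FILLER CHAR  -- read and discard the unparseable date; column stays NULL
)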
Alternatively, you could use the source file as an external table and write (PL/)SQL against it to load the data into the target table. That lets you code whatever you want, but as you said you don't want to create a custom function (one that would decide whether the input is or is not a valid DATE value), I presume you'd rather skip that option.
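That said, if your database is 12.2 or higher, the external-table route doesn't actually require a custom function, because TO_DATE accepts a DEFAULT ... ON CONVERSION ERROR clause. A minimal sketch, assuming a hypothetical external table ext_orders with the date arriving as text in order_date_txt:
INSERT INTO orders (order_id, order_date)
SELECT order_id,
       -- invalid date text becomes NULL instead of raising an error
       TO_DATE(order_date_txt DEFAULT NULL ON CONVERSION ERROR, 'MM/DD/YYYY')
FROM   ext_orders;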

Related

How to resolve Informatica "Not a valid month" error?

When running a PowerCenter session that uses an Oracle database view as a source, the session fails with the following error:
ORA-01843: Not a valid month.
In SQL Developer the same query runs without any issues.
Column: Report_date, data type: DATE.
In the mapping, the parameter variable is defined as a String.
The report date is passed dynamically using a param file and a control table.
Select * from ABC
Where report_date=to_char(31-MAR-21,'DD-MON-RR')
--Error: not valid month
Could you please advise?
You need to put single quotes around the Informatica mapping parameter. Like this -
First calculate Report_Date and put the value into the Informatica param file in the correct format. You can use the control table too, but then you need to generate the param file from it.
The param file should look like:
[folder.workflow.session]
$$Report_Date='21-Oct-2021'
Then in the mapping you can call it in the Source Qualifier as:
Select * from ABC Where report_date=to_date('$$Report_Date','DD-MON-YYYY')
The single quotes ensure the value is passed as a string, and to_date (not to_char) converts that string for comparison against the DATE column.
Now, if you don't want to use a param file, you can use the control table as a new source and join it with the SQL above to get the desired result. But the first approach is faster.

Import CSV File into Oracle using SQL Developer

I am trying to import data from a CSV file into an Oracle GroupSpace table using the SQL Developer tool. I am getting errors for the Date column. My Date column has dates in the format below.
5/6/2016
4/11/2018
11/6/2017...
I get an error that the Date column has invalid or null date formats.
Any pointers on what date format to use when importing the Date column would be greatly appreciated.
Thank you so much!
JH
If you aren't sure that dates are valid (for example, nothing prevents you from entering 5/55/2016 into a CSV file, and that certainly isn't a valid DATE value), you can create a staging table whose columns are of VARCHAR2 datatype - it accepts everything, even garbage like 5/55/2016.
Then, after you load data, write some SQL to find errors, fix them, and then move data into the target table.
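A minimal sketch of that approach (all names are hypothetical; VALIDATE_CONVERSION needs Oracle 12.2 or later):
-- staging table: every column is VARCHAR2, so nothing gets rejected on import
CREATE TABLE stg_groupspace (
  id_txt   VARCHAR2(20),
  date_txt VARCHAR2(20)
);
-- after the import, list the rows whose date text will not convert
SELECT *
FROM   stg_groupspace
WHERE  VALIDATE_CONVERSION(date_txt AS DATE, 'MM/DD/YYYY') = 0;
-- once those are fixed, move the data into the target table
INSERT INTO groupspace (id, report_date)
SELECT TO_NUMBER(id_txt), TO_DATE(date_txt, 'MM/DD/YYYY')
FROM   stg_groupspace;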
Check the CSV data in a text editor and look for which part represents the month (the month value will be in the range 1..12). If you are using US dates then use MM/DD/YYYY, otherwise you should probably use DD/MM/YYYY as the date format. If the data has a mixture of both, then you must separate those files and use a different format for each, or you are likely to get invalid date values in your database.
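To see why picking the right mask matters: the same string parses to two different dates depending on the mask, as this quick check shows.
SELECT TO_DATE('5/6/2016', 'MM/DD/YYYY') AS us_style,  -- May 6, 2016
       TO_DATE('5/6/2016', 'DD/MM/YYYY') AS eu_style   -- June 5, 2016
FROM   dual;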
SQL Developer can help you.
You can try the date format masks in the drop-down. If we can guess it, we'll default to one. For some reason your data...fools us, but you can type your own.
If you get something that 'works' the warnings go away.
If you get it wrong, we'll let you know before you even get to the next step.
You can find all the date format masks in the Oracle documentation.

External tables: how to make sure I don't load the same file/data

I want to use an external table to load a CSV file as it's very convenient, but the problem is: how do I make sure I don't load the same file twice in a row? I can't validate the loaded data, because it can legitimately be the same information as before; I need a way to make sure the user doesn't load the same file as, for example, 2 hours ago.
I thought about uploading the file with a different name each time and issuing an ALTER TABLE command to change the file name in the external table's definition, but that sounds kind of risky.
I also thought about marking each row in the file with a sequence to help differentiate files, but I doubt the client would accept that, as they would need to do it manually (the file is exported from somewhere).
Is there any better way to make sure I don't load the same file into the external table, other than changing the file's name and executing an ALTER on the table?
Thank you
When you bring the data from the external table into your database, you can use the MERGE command instead of INSERT; then you don't have to worry about duplicate data.
See the blog post about The Oracle Merge Command:
What's more, we can wrap up the whole transformation process into this one Oracle MERGE command, referencing the external table and the table function in the one command as the source for the merged Oracle data.
alter session enable parallel dml;
merge /*+ parallel(contract_dim,10) append */
into contract_dim d
using table(trx.go(
        cursor(select /*+ parallel(contracts_file,10) full(contracts_file) */ *
               from contracts_file))) f
on (d.contract_id = f.contract_id)
when matched then update set
     d.descr             = f.descr,  -- "desc" is reserved in Oracle, so descr is used here
     d.init_val_loc_curr = f.init_val_loc_curr,
     d.init_val_adj_amt  = f.init_val_adj_amt
when not matched then insert
     values (f.contract_id,
             f.descr,
             f.init_val_loc_curr,
             f.init_val_adj_amt);
So there we have it - our complex ETL function all contained within a single Oracle MERGE statement. No separate SQL*Loader phase, no staging tables, and all piped through and loaded in parallel.
I can only think of a solution somewhat like this:
Have a timestamp encoded in the data file name (like YYYYMMDDHHMISS-file.csv, where YYYYMMDDHHMISS is the timestamp).
Create a table that stores those timestamps.
Create a shell script that:
extracts the timestamp from the data file name
calls a SQL script with the timestamp as a parameter, which returns 0 if that timestamp does not exist and <>0 if it already exists; in the latter case, exit the script with the error: File YYYYMMDDHHMISS-file.csv already loaded (see the sketch after this list)
copies YYYYMMDDHHMISS-file.csv to input-file.csv
runs the SQL*Loader script that loads input-file.csv
on success: runs a second SQL script with the timestamp as a parameter that inserts a record into the database to mark the file as loaded, and moves the original file to a backup folder
on failure: reports the failure of the load script
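A minimal sketch of the two SQL scripts the shell script would call (table and column names are made up; &1 is the SQL*Plus positional parameter carrying the timestamp):
-- one-time setup: bookkeeping table of loaded files
CREATE TABLE loaded_files (
  file_ts   VARCHAR2(14) PRIMARY KEY,  -- the YYYYMMDDHHMISS part of the file name
  loaded_at DATE DEFAULT SYSDATE
);
-- check script: 0 means "not loaded yet", anything else means "already loaded"
SELECT COUNT(*) FROM loaded_files WHERE file_ts = '&1';
-- success script: record the file so the next run refuses it
INSERT INTO loaded_files (file_ts) VALUES ('&1');
COMMIT;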

How to insert file name and modified time using a batch/shell script and SQL*Loader

I have a requirement to insert bulk data into an Oracle database from a CSV file. The table's columns match the CSV file's header, with the exception of three additional fields in the database:
A Primary Key field (for which a simple SEQUENCE.NEXTVAL is called)
A field for the name of the CSV file
A field for the last modified date+time of the file
The following Stack Overflow question addresses the extra-column issue, but there the solution is easy because it uses Oracle's SYSDATE, which is available internally. I need to pass a parameter in from a batch/shell script.
Insert actual date time in a row with SQL*loader
Can PARFILE help here somehow?
My other alternative would be to do the whole task in two steps by writing a small piece of Java code:
Use SQL*Loader for the bulk upload, leaving out the filename and modified time
Then run a separate UPDATE statement to populate the newly created rows
But I'm looking for something that will get the job done in one shot. Any advice?
I'm afraid it's not possible with sqlldr alone; there is no built-in facility for this.
You'd need some sort of script or program to dynamically create a .ctl file for each load.
Here is a bash script to help you get started:
#!/bin/bash -xv
# Usage: ./load.sh <data file> <staging table>
readonly MY_FILENAME=$1
readonly DB_BUF_TABLE=$2
# Last-modified time of the data file (GNU date; on BSD/macOS use stat instead)
readonly MY_MODTIME=$(date -r "$MY_FILENAME" '+%Y-%m-%d %H:%M:%S')
# Column names (filename, modified_time, col_foo, col_bar) are examples only
readonly SQLLDR_CTL="LOAD DATA
CHARACTERSET UTF8
APPEND INTO TABLE $DB_BUF_TABLE
FIELDS TERMINATED BY ';'(
filename CONSTANT '$MY_FILENAME',
modified_time EXPRESSION \"TO_DATE('$MY_MODTIME','YYYY-MM-DD HH24:MI:SS')\",
col_foo,
col_bar
)"
echo "$SQLLDR_CTL" > loader.ctl
sqlldr control=loader.ctl parfile=loader.par data="$MY_FILENAME"
sqlldrReturnValue=$?
You'd need some locking with this, or separate paths for concurrent loads, to be sure each sqlldr run starts with the proper .ctl file.

SQL*Loader - How can I ignore certain rows with a specific character

If i have a CSV file that is in the following format
"fd!","sdf","dsfds","dsfd"
"fd!","asdf","dsfds","dsfd"
"fd","sdf","rdsfds","dsfd"
"fdd!","sdf","dsfds","fdsfd"
"fd!","sdf","dsfds","dsfd"
"fd","sdf","tdsfds","dsfd"
"fd!","sdf","dsfds","dsfd"
Is it possible to exclude any row where the first column has an exclamation mark at the end of the string?
I.e., it should only load the following rows:
"fd","sdf","rdsfds","dsfd"
"fd","sdf","tdsfds","dsfd"
Thanks
According to the Loading Records Based on a Condition section of the SQL*Loader Control File Reference (11g):
"You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record."
So you'd need something like this:
LOAD DATA ... INSERT INTO TABLE mytable WHEN mycol1 NOT LIKE '%!'
(mycol1.. ,mycol2 ..)
But the LIKE operator is not available in the WHEN clause! You only have = and != there.
Maybe you could try an External Table instead.
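A minimal sketch of that idea (the external table ext_csv and its column names are hypothetical):
INSERT INTO mytable (mycol1, mycol2, mycol3, mycol4)
SELECT col1, col2, col3, col4
FROM   ext_csv
WHERE  col1 NOT LIKE '%!';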
I'd stick a constraint on the table and just let those rows be rejected; alternatively, load everything and delete them after the load, or use a Unix grep -v to clear them out of the file beforehand.
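For instance, a check constraint along these lines would make SQL*Loader send the offending rows to the .BAD file (the constraint and column names are made up):
ALTER TABLE mytable
  ADD CONSTRAINT mycol1_no_bang CHECK (mycol1 NOT LIKE '%!');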
