Batch insert: is there a way to just skip to the next record when a constraint is violated? - oracle

I am using MyBatis to perform a massive batch insert on an Oracle DB.
My process is very simple: I am taking records from a list of files and inserting them into a specific table after performing some checks on the data.
- Each file contains an average of 180,000 records, and I can have more than one file.
- Some records can be present in more than one file.
- A record is identical to another one only if EVERY column matches; in other words, I cannot simply perform a check on a specific field. I have defined a constraint in my DB which makes sure this condition is satisfied.
To put it simply, I want to just ignore the exception Oracle gives me when that constraint is violated.
Record is not present? --> insert
Record is already present? --> go ahead
Is this possible with MyBatis? Or can I accomplish something at the DB level?
I have control of both the application server and the DB, so please tell me what's the most efficient way to accomplish this task (even though I'd like to avoid being too DB-dependent...).
Of course, I'd like to avoid performing a SELECT before each insertion; given the number of records I am dealing with, it would ruin my application's performance.

Use the IGNORE_ROW_ON_DUPKEY_INDEX hint:
insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(table_name index_name) */
into table_name
select * ...
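For instance, a minimal sketch with hypothetical names, assuming a table MY_TABLE whose unique index MY_TABLE_UK covers all columns (the hint is available in Oracle 11g and later):
INSERT /*+ IGNORE_ROW_ON_DUPKEY_INDEX(my_table my_table_uk) */
INTO my_table (col1, col2, col3)
SELECT col1, col2, col3
FROM staging_table;
Rows that would violate the index are silently skipped instead of raising ORA-00001.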

I'm not sure about JDBC, but at least in OCI it is possible. With batch operations you pass vectors as bind variables, and you also get back vector(s) of returned IDs and a vector of error codes.
You can also use MERGE on the database server side together with custom collection types. Something like:
merge into t
using ( select * from TABLE(:var) v)
on ( v.id = t.id )
when not matched then insert ...
Here :var is a bind variable of the SQL type TABLE OF <recordname>.
The TABLE() operator casts the bind variable from a collection into a table that can be queried.
Another option is to use the DML error logging clause:
DBMS_ERRLOG.create_error_log (dml_table_name => 't');
insert into t(...) values(...) log errors reject limit unlimited;
Then, after the load, you will have to truncate the error logging table err$_t.
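A fuller sketch of that flow (staging_t and the column names are hypothetical):
-- one-off setup: creates the ERR$_T logging table
EXEC DBMS_ERRLOG.create_error_log(dml_table_name => 'T');

INSERT INTO t (id, payload)
SELECT id, payload FROM staging_t
LOG ERRORS INTO err$_t ('batch-1') REJECT LIMIT UNLIMITED;

-- inspect what was rejected, then clean up for the next run
SELECT ora_err_number$, ora_err_mesg$, ora_err_tag$ FROM err$_t;
TRUNCATE TABLE err$_t;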
Another option would be to use external tables.
Any of these solutions looks like quite a lot of work compared to using sqlldr.

Ignore errors with an error logging table:
insert
into table_name
select *
from selected_table
LOG ERRORS INTO SANJI.ERROR_LOG('some comment' )
REJECT LIMIT UNLIMITED;
and the error table schema is:
CREATE GLOBAL TEMPORARY TABLE SANJI.ERROR_LOG (
ora_err_number$ number,
ora_err_mesg$ varchar2(2000),
ora_err_rowid$ rowid,
ora_err_optyp$ varchar2(2),
ora_err_tag$ varchar2(2000),
n1 varchar2(128)
)
ON COMMIT PRESERVE ROWS;

Related

Process SQL result set entirely

I need to work with a SQL result set in order to do some processing for each column (medians, standard deviations, several control statements included).
The SQL is dynamic, so I don't know the number of columns or rows.
First I tried to use temporary tables, views, etc. to store the results, but I could not overcome Oracle's 30-character limit on column names when using the SQL below:
create table (or view or global temporary table) as select * from (
SELECT
DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE,
SUM(DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI_CHZ + DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI) <-- exceeds the 30 character limit
FROM DMTTBF_MAT_MATURATO_BILL_POS
WHERE DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE >= '201301'
GROUP BY DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE
)
My second choice was to use some PL/SQL types to store the entire table information, so I could address it like in other programming languages (e.g. a matrix result[i][j]), but I could not find anything similar.
The third variant, using files for reading and writing, I have not tried yet; I'm still hoping for a more elegant PL/SQL solution.
It's possible that I have the wrong approach here so any advice is more than welcome.
UPDATE: Modifying the input SQL is not an option. The program has to accept any select statement.
Note that you can alias both tables and fields. Using a table alias keeps references to it from producing walls of text in the query. Using one for a field gives it a new name in the output.
SELECT A.LONG_FIELD_NAME_HERE AS SHORTNAME
FROM REALLY_LONG_TABLE_NAME_HERE A
The auto-naming adds _1, _2, etc. to differentiate the same column name coming from different table references. This often pushes a field that is already borderline over the limit. Giving the fields names yourself bypasses this.
You can also put the alias in dynamic SQL:
sqlstr := 'create table (or view or global temporary table) as select * from (
SELECT
DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE,
SUM(DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI_CHZ + DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI) AS "'||SUBSTR('SUM(DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI_CHZ +DMTTBF_MAT_MATURATO_BILL_POS.MAT_N_NUM_EVENTI)', 1, 30)||'"
FROM DMTTBF_MAT_MATURATO_BILL_POS
WHERE DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE >= ''201301''
GROUP BY DMTTBF_MAT_MATURATO_BILL_POS.MAT_V_COD_ANNOMESE
)';
Note that the generated alias contains characters that are illegal in an unquoted identifier, so it must be wrapped in double quotes (and kept within 30 characters, which the SUBSTR does).
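The alias trick keeps the 30-character limit at bay, but if the goal is to walk a completely dynamic result set in PL/SQL (the matrix-style access result[i][j] asked about), the built-in DBMS_SQL package can describe and fetch arbitrary columns. A minimal sketch, assuming everything is fetched as VARCHAR2 and some_table stands in for the dynamic query:
DECLARE
  l_cursor  INTEGER := DBMS_SQL.OPEN_CURSOR;
  l_col_cnt INTEGER;
  l_desc    DBMS_SQL.DESC_TAB;
  l_value   VARCHAR2(4000);
  l_status  INTEGER;
BEGIN
  DBMS_SQL.PARSE(l_cursor, 'SELECT * FROM some_table', DBMS_SQL.NATIVE);
  DBMS_SQL.DESCRIBE_COLUMNS(l_cursor, l_col_cnt, l_desc);
  -- define every column as VARCHAR2 so no prior knowledge of types is needed
  FOR i IN 1 .. l_col_cnt LOOP
    DBMS_SQL.DEFINE_COLUMN(l_cursor, i, l_value, 4000);
  END LOOP;
  l_status := DBMS_SQL.EXECUTE(l_cursor);
  WHILE DBMS_SQL.FETCH_ROWS(l_cursor) > 0 LOOP
    FOR i IN 1 .. l_col_cnt LOOP
      DBMS_SQL.COLUMN_VALUE(l_cursor, i, l_value);
      -- l_desc(i).col_name / l_value is the cell [row][i];
      -- accumulate medians, standard deviations, etc. here
    END LOOP;
  END LOOP;
  DBMS_SQL.CLOSE_CURSOR(l_cursor);
END;
/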

How to insert while avoiding unique constraints with oracle

We have a process that aggregates some data and inserts the results into another table that we use for efficient querying. The problem we're facing is that we now have multiple aggregators running at roughly the same time.
We use the original record's id as the primary key in this new table - a unique constraint. However, if two aggregation processes are running at the same time, one of them will error with a unique constraint violation.
Is there a way to specify some kind of locking mechanism which will make the second writer wait until the first is finished? Alternatively, is there a way to tell Oracle to ignore that specific row and continue with the rest?
Unfortunately it's not practical to reduce the aggregation to a single process, as the subsequent procedures rely on an up-to-date version of the data being available, and those procedures do need to scale out.
Edit:
The following is my [redacted] query:
INSERT INTO
agg_table
SELECT
h.id, h.col, h.col2
FROM history h
JOIN call c
ON c.callid = h.callid
WHERE
h.id > (SELECT coalesce(max(id),0) FROM agg_table)
It is possible to run an INSERT statement with an error logging clause. The example from the Oracle docs is as follows:
INSERT INTO dw_empl
SELECT employee_id, first_name, last_name, hire_date, salary, department_id
FROM employees
WHERE hire_date > sysdate - 7
LOG ERRORS INTO err_empl ('daily_load') REJECT LIMIT 25
Alternatively, you could try using a MERGE statement. You would be merging into the summary table with a select from the detail table. If a match is not found, you INSERT, and if it is found, you UPDATE. I believe this solution will handle your concurrency issues, but you would need to test it.
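A hedged sketch of that MERGE, reusing the tables from the question's query (the column list is illustrative):
MERGE INTO agg_table a
USING (
  SELECT h.id, h.col, h.col2
  FROM history h
  JOIN call c ON c.callid = h.callid
) src
ON (a.id = src.id)
WHEN MATCHED THEN
  UPDATE SET a.col = src.col, a.col2 = src.col2
WHEN NOT MATCHED THEN
  INSERT (id, col, col2) VALUES (src.id, src.col, src.col2);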
Have a look at the FOR UPDATE clause. If you correctly write the SELECT statement with the FOR UPDATE clause within a transaction before your update/insert statements, you will be able to "lock" the required records.
Serialising the inserts is probably the best way, as there's no method that will get you round the problem of the multiple inserts being unable to see what each one is doing.
DBMS_Lock is probably the appropriate serialisation mechanism.
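A minimal sketch of that serialisation with DBMS_LOCK (the lock name 'AGG_TABLE_LOAD' is just an illustrative choice):
DECLARE
  l_handle VARCHAR2(128);
  l_result INTEGER;
BEGIN
  -- map our own lock name to a lock handle
  DBMS_LOCK.ALLOCATE_UNIQUE('AGG_TABLE_LOAD', l_handle);
  -- take an exclusive lock; it is released automatically on COMMIT
  l_result := DBMS_LOCK.REQUEST(l_handle, DBMS_LOCK.X_MODE,
                                release_on_commit => TRUE);
  IF l_result IN (0, 4) THEN  -- 0 = success, 4 = we already own the lock
    -- run the aggregation INSERT here, then COMMIT
    NULL;
  END IF;
END;
/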

select * through dblink

I have some trouble when trying to update a table by looping over a cursor which selects from a source table through a dblink.
I have two databases, DB1 and DB2.
They are two different database instances.
I am using the following statement in DB1:
CURSOR TestCursor IS
SELECT a.*, 'A' TEST_COL_A, 'B' TEST_COL_B
FROM rpt.SOURCE@DB2 a;
BEGIN
FOR C1 IN TestCursor LOOP
INSERT INTO RPT.TARGET
(
/* COMPANY_NAME and CUST_ID are selected from the SOURCE table on DB2 */
COMPANY_NAME, CUST_ID, TEST_COL_A, TEST_COL_B
)
VALUES
(
C1.COMPANY_NAME, C1.CUST_ID, C1.TEST_COL_A, C1.TEST_COL_B
);
END LOOP;
/* Some code... */
END;
Everything works fine until I add a column NEW_COL to the SOURCE table on DB2.
The inserted data then gets the wrong values.
The value of TEST_COL_A should, as I expect, be 'A'.
However, it contains the value of NEW_COL, the column I added to the SOURCE table, and the value of TEST_COL_B contains 'A'.
Has anyone encountered the same issue?
It seems like Oracle caches the table's columns when it compiles.
Is there any way to add a column to the source table without recompiling?
According to this:
Oracle Database does not manage dependencies among remote schema objects other than local-procedure-to-remote-procedure dependencies.
For example, assume that a local view is created and defined by a query that references a remote table. Also assume that a local procedure includes a SQL statement that references the same remote table. Later, the definition of the table is altered.
Therefore, the local view and procedure are never invalidated, even if the view or procedure is used after the table is altered, and even if the view or procedure now returns errors when used. In this case, the view or procedure must be altered manually so that errors are not returned. In such cases, lack of dependency management is preferable to unnecessary recompilations of dependent objects.
In this case you aren't quite seeing errors, but the cause is the same. You also wouldn't have a problem if you used explicit column names instead of *, which is usually safer anyway. If you're using * you can't avoid recompiling (unless, I suppose, the * is the last item in the select list, in which case any extra columns on the end wouldn't cause a problem - as long as their names didn't clash).
I recommend that you use a single set-processing INSERT statement in DB1, rather than a row-at-a-time cursor FOR loop, for example:
INSERT INTO RPT.TARGET (COMPANY_NAME, CUST_ID, TEST_COL_A, TEST_COL_B)
SELECT COMPANY_NAME, CUST_ID, 'A' TEST_COL_A, 'B' TEST_COL_B
FROM rpt.SOURCE@DB2;
Rationale:
- Set processing will almost always outperform row-at-a-time processing (which is really slow-at-a-time processing).
- The set-processing insert is a scalable solution. If the application needs to scale to tens of thousands of rows or millions of rows, the row-at-a-time solution will not scale.
- Also, using the SELECT * construct is dangerous, for the reason you encountered (and other similar reasons).

Return REF CURSOR to procedure generated data

I need to write a sproc which performs some INSERTs on a table and compiles a list of "statuses" for each row based on how well the INSERT went. Each row is inserted within a loop, and the loop iterates over a cursor that supplies values for the INSERT statement. What I need to return is a result set which looks like this:
FIELDS_FROM_ROW_BEING_INSERTED.., STATUS VARCHAR2
The STATUS is determined by how the INSERT went. For instance, if the INSERT caused a DUP_VAL_ON_INDEX exception, indicating there was a duplicate row, I'd set the STATUS to "Dupe". If all went well, I'd set it to "SUCCESS" and proceed to the next row.
By the end of it all, I'd have a result set of N rows, where N is the number of INSERT statements performed, and each row contains some identifying info for the row being inserted, along with the "STATUS" of the insertion.
Since there is no table in my DB to store the values I'd like to pass back to the user, I'm wondering how I can return the info. A temporary table? It seems that in Oracle temporary tables are "global"; I'm not sure I'd want a global table. Are there any temporary tables that get dropped after a session is done?
If you are using Oracle 10gR2 or later then you should check out DML error logging. This basically does what you want to achieve; that is, it allows us to execute all the DML in a batch process by recording any errors and pressing on with the statements.
The principle is that we create an error log table for each table we need to work with, using the built-in PL/SQL package DBMS_ERRLOG. There is a simple extension to the DML syntax to log messages to the error log table. This approach doesn't create any more objects than your proposal, and has the merit of using some standard Oracle functionality.
When working with bulk processing (that is, when using the FORALL syntax) we can trap exceptions using the built-in SQL%BULK_EXCEPTIONS collection. It is possible to combine bulk exceptions with DML error logging, but that may create problems in 11g.
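A minimal sketch of that FORALL pattern (target_table and the sample data are hypothetical):
DECLARE
  TYPE t_ids IS TABLE OF NUMBER;
  l_ids      t_ids := t_ids(1, 2, 2, 3);    -- the repeated 2 will violate a unique key
  dml_errors EXCEPTION;
  PRAGMA EXCEPTION_INIT(dml_errors, -24381); -- ORA-24381: error(s) in array DML
BEGIN
  FORALL i IN 1 .. l_ids.COUNT SAVE EXCEPTIONS
    INSERT INTO target_table (id) VALUES (l_ids(i));
EXCEPTION
  WHEN dml_errors THEN
    FOR j IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE(
        'Row '      || SQL%BULK_EXCEPTIONS(j).ERROR_INDEX ||
        ' failed: ' || SQLERRM(-SQL%BULK_EXCEPTIONS(j).ERROR_CODE));
    END LOOP;
END;
/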
"Global" in the case of temporary tables just means they are permanent, it's the data which is temporary.
I would define a record type that matches your cursor, plus the status field. Then define a table of that type.
TYPE t_record IS RECORD
(
field_1,
...
field_n,
status VARCHAR2(30)
);
TYPE t_table IS TABLE OF t_record;
FUNCTION insert_records
(
p_rows_to_insert IN SYS_REFCURSOR
)
RETURN t_table;
Even better would be to also define the inputs as a table type instead of a cursor.
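A self-contained sketch of the cursor-input version; the fields, target_table, and the DUP_VAL_ON_INDEX handling are illustrative assumptions, tying back to the question's "Dupe"/"SUCCESS" statuses:
CREATE OR REPLACE PACKAGE insert_pkg AS
  TYPE t_record IS RECORD (
    id     NUMBER,
    name   VARCHAR2(100),
    status VARCHAR2(30)
  );
  TYPE t_table IS TABLE OF t_record;
  FUNCTION insert_records (p_rows_to_insert IN SYS_REFCURSOR) RETURN t_table;
END insert_pkg;
/
CREATE OR REPLACE PACKAGE BODY insert_pkg AS
  FUNCTION insert_records (p_rows_to_insert IN SYS_REFCURSOR) RETURN t_table
  IS
    l_result t_table := t_table();
    l_row    t_record;
  BEGIN
    LOOP
      FETCH p_rows_to_insert INTO l_row.id, l_row.name;
      EXIT WHEN p_rows_to_insert%NOTFOUND;
      BEGIN
        INSERT INTO target_table (id, name) VALUES (l_row.id, l_row.name);
        l_row.status := 'SUCCESS';
      EXCEPTION
        WHEN DUP_VAL_ON_INDEX THEN
          l_row.status := 'Dupe';  -- duplicate row, per the question
      END;
      l_result.EXTEND;
      l_result(l_result.COUNT) := l_row;
    END LOOP;
    RETURN l_result;
  END insert_records;
END insert_pkg;
/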

ORA-30926: unable to get a stable set of rows in the source tables

I am getting
ORA-30926: unable to get a stable set of rows in the source tables
in the following query:
MERGE INTO table_1 a
USING
(SELECT a.ROWID row_id, 'Y'
FROM table_1 a ,table_2 b ,table_3 c
WHERE a.mbr = c.mbr
AND b.head = c.head
AND b.type_of_action <> '6') src
ON ( a.ROWID = src.row_id )
WHEN MATCHED THEN UPDATE SET in_correct = 'Y';
I've checked table_1, and it has data; I've also run the inner query (src) on its own, and it returns data too.
Why does this error occur, and how can it be resolved?
This is usually caused by duplicates in the query specified in the USING clause. This probably means that TABLE_1 is a parent table and the same ROWID is returned several times.
You could quickly solve the problem by using a DISTINCT in your query (in fact, since 'Y' is a constant value you don't even need to put it in the query).
Assuming your query is correct (don't know your tables) you could do something like this:
MERGE INTO table_1 a
USING
(SELECT distinct a.ROWID row_id
FROM table_1 a ,table_2 b ,table_3 c
WHERE a.mbr = c.mbr
AND b.head = c.head
AND b.type_of_action <> '6') src
ON ( a.ROWID = src.row_id )
WHEN MATCHED THEN UPDATE SET in_correct = 'Y';
You're probably trying to update the same row of the target table multiple times, and that does not work (it could cause ambiguity). I just encountered the very same problem in a MERGE statement I developed. Make sure your update does not touch the same record more than once in the execution of the MERGE.
A further clarification to the use of DISTINCT to resolve error ORA-30926 in the general case:
You need to ensure that the set of data specified by the USING() clause has no duplicate values of the join columns, i.e. the columns in the ON() clause.
In the OP's example, where the USING clause only selects a key, it was sufficient to add DISTINCT to it. However, in the general case the USING clause may select a combination of key columns to match on and attribute columns to be used in the UPDATE ... SET clause. In that case, adding DISTINCT to the USING clause can still leave different update rows for the same key, and you will still get the ORA-30926 error.
This is an elaboration of DCookie's answer and point 3.1 in Tagar's answer, which from my experience may not be immediately obvious.
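An illustration of why DISTINCT does not help in the general case (tables and data are hypothetical):
-- USING (SELECT DISTINCT id, val FROM src) still returns:
--   id = 1, val = 'X'
--   id = 1, val = 'Y'
-- so the target row with id = 1 would be updated twice -> ORA-30926.
-- The fix is to make the key itself unique, e.g.:
MERGE INTO tgt t
USING (
  SELECT id, MAX(val) AS val  -- pick one deterministic value per key
  FROM src
  GROUP BY id
) s
ON (t.id = s.id)
WHEN MATCHED THEN UPDATE SET t.val = s.val;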
How to Troubleshoot ORA-30926 Errors? (Doc ID 471956.1)
1) Identify the failing statement
alter session set events '30926 trace name errorstack level 3';
or
alter system set events '30926 trace name errorstack off';
and watch for .trc files in UDUMP when it occurs.
2) Having found the SQL statement, check if it is correct (perhaps using explain plan or tkprof to check the query execution plan) and analyze or compute statistics on the tables concerned if this has not recently been done. Rebuilding (or dropping/recreating) indexes may help too.
3.1) Is the SQL statement a MERGE?
evaluate the data returned by the USING clause to ensure that there are no duplicate values in the join. Modify the merge statement to include a deterministic where clause
3.2) Is this an UPDATE statement via a view?
If so, try populating the view result into a table and try updating the table directly.
3.3) Is there a trigger on the table? Try disabling it to see if it still fails.
3.4) Does the statement contain a non-mergeable view in an 'IN-Subquery'? This can result in duplicate rows being returned if the query has a "FOR UPDATE" clause. See Bug 2681037
3.5) Does the table have unused columns? Dropping these may prevent the error.
4) If modifying the SQL does not cure the error, the issue may be with the table, especially if there are chained rows.
4.1) Run the 'ANALYZE TABLE VALIDATE STRUCTURE CASCADE' statement on all tables used in the SQL to see if there are any corruptions in the table or its indexes.
4.2) Check for, and eliminate, any CHAINED or migrated ROWS on the table. There are ways to minimize this, such as the correct setting of PCTFREE.
Use Note 122020.1 - Row Chaining and Migration
4.3) If the table is additionally Index Organized, see:
Note 102932.1 - Monitoring Chained Rows on IOTs
Had this error today on a 12c database, and none of the existing answers fit (no duplicates, no non-deterministic expressions in the WHERE clause). My case was related to the other possible cause of the error, according to Oracle's message text (emphasis below):
ORA-30926: unable to get a stable set of rows in the source tables
Cause: A stable set of rows could not be got because of *large dml activity* or a non-deterministic where clause.
The merge was part of a larger batch, and was executed on a live database with many concurrent users. There was no need to change the statement. I just committed the transaction before the merge, then ran the merge separately, and committed again. So the solution was found in the suggested action of the message:
Action: Remove any non-deterministic where clauses and reissue the dml.
SQL Error: ORA-30926: unable to get a stable set of rows in the source tables
30926. 00000 - "unable to get a stable set of rows in the source tables"
*Cause: A stable set of rows could not be got because of large dml activity or a non-deterministic where clause.
*Action: Remove any non-deterministic where clauses and reissue the dml.
This error occurred for me because of duplicate records (16K of them).
I tried it with UNIQUE and it worked, but when I tried the MERGE again without UNIQUE the same problem occurred.
The second time it was due to the commit: if no commit is done after the MERGE, the same error is shown.
Without UNIQUE, the query will work if a commit is given after each MERGE operation.
I was not able to resolve this after several hours. Eventually I just did a select with the two tables joined, created an extract, and created individual SQL update statements for the 500 rows in the table. Ugly, but it beats spending hours trying to get a query to work.
As someone explained earlier, your MERGE statement is probably trying to update the same row more than once, and that does not work (it could cause ambiguity).
Here is one simple example: a MERGE that tries to mark some products as found when they match the given search patterns:
CREATE TABLE patterns(search_pattern VARCHAR2(20));
INSERT INTO patterns(search_pattern) VALUES('Basic%');
INSERT INTO patterns(search_pattern) VALUES('%thing');
CREATE TABLE products (id NUMBER,name VARCHAR2(20),found NUMBER);
INSERT INTO products(id,name,found) VALUES(1,'Basic instinct',0);
INSERT INTO products(id,name,found) VALUES(2,'Basic thing',0);
INSERT INTO products(id,name,found) VALUES(3,'Super thing',0);
INSERT INTO products(id,name,found) VALUES(4,'Hyper instinct',0);
MERGE INTO products p USING
(
SELECT search_pattern FROM patterns
) o
ON (p.name LIKE o.search_pattern)
WHEN MATCHED THEN UPDATE SET p.found=1;
SELECT * FROM products;
If the patterns table contains the Basic% and Super% patterns, then the MERGE works and the first three products will be updated (found). But if the patterns table contains the Basic% and %thing search patterns, then the MERGE does NOT work, because it would try to update the second product twice, and this causes the problem. MERGE does not work if some records are to be updated more than once. You might ask: why not just update twice!?
Here the first update and the second update write the same value, but only by accident. Now look at this scenario:
CREATE TABLE patterns(code CHAR(1),search_pattern VARCHAR2(20));
INSERT INTO patterns(code,search_pattern) VALUES('B','Basic%');
INSERT INTO patterns(code,search_pattern) VALUES('T','%thing');
CREATE TABLE products (id NUMBER,name VARCHAR2(20),found CHAR(1));
INSERT INTO products(id,name,found) VALUES(1,'Basic instinct',NULL);
INSERT INTO products(id,name,found) VALUES(2,'Basic thing',NULL);
INSERT INTO products(id,name,found) VALUES(3,'Super thing',NULL);
INSERT INTO products(id,name,found) VALUES(4,'Hyper instinct',NULL);
MERGE INTO products p USING
(
SELECT code,search_pattern FROM patterns
) s
ON (p.name LIKE s.search_pattern)
WHEN MATCHED THEN UPDATE SET p.found=s.code;
SELECT * FROM products;
Now the first product's name matches the Basic% pattern, and it will be updated with code B. But the second product matches both patterns and cannot be updated with both codes B and T at the same time (ambiguity)!
That's why the DB engine complains. Don't blame it! It knows what it is doing! ;-)
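For completeness, one hedged way to make that last example deterministic is to collapse the patterns to a single code per product before merging (picking the alphabetically smallest code with MIN is an arbitrary assumption, not part of the original example):
MERGE INTO products p USING
(
  SELECT pr.id, MIN(pa.code) AS code  -- exactly one row per product
  FROM products pr
  JOIN patterns pa ON pr.name LIKE pa.search_pattern
  GROUP BY pr.id
) s
ON (p.id = s.id)
WHEN MATCHED THEN UPDATE SET p.found = s.code;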
