Given the following trigger:
CREATE OR REPLACE TRIGGER TR_MY_TRG_NAME
AFTER UPDATE OF COL_A, COL_B, COL_C ON T_MY_TABLE_Y
FOR EACH ROW
BEGIN
UPDATE T_MY_TABLE_X X
SET
X.COL_A = :NEW.COL_A,
X.COL_B = :NEW.COL_B,
X.COL_C = :NEW.COL_C
WHERE
X.ID = :NEW.ID;
END;
... and given 2 million existing records in T_MY_TABLE_Y.
Problem:
If my app changes all 2 million records (e.g. COL_A), it runs for 2-3 minutes without the trigger, but took 40 minutes with the trigger.
Question:
Are there any alternative approaches that I could try?
An alternative is to update T_MY_TABLE_X in a single statement, without forcing a trigger to fire for each of 2 million rows and (probably) perform context switching.
So: as you update T_MY_TABLE_Y, reuse the same UPDATE for T_MY_TABLE_X (with some modifications, if necessary).
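As a rough sketch of that idea, the sync could be done once with a MERGE instead of 2 million trigger firings. This assumes the bulk change runs in a window where the trigger can be disabled, that ID is unique in both tables, and that COL_A is a character column (the new value below is only a placeholder):

```sql
-- Sketch only: assumes no other session modifies T_MY_TABLE_Y while the trigger is disabled.
-- ALTER TRIGGER is DDL, so it commits implicitly.
ALTER TRIGGER TR_MY_TRG_NAME DISABLE;

-- The application's own bulk change (placeholder value, hypothetical).
UPDATE T_MY_TABLE_Y
   SET COL_A = 'new value';

-- Propagate all three columns to T_MY_TABLE_X in one set-based statement.
MERGE INTO T_MY_TABLE_X X
USING T_MY_TABLE_Y Y
   ON (X.ID = Y.ID)
 WHEN MATCHED THEN
   UPDATE SET X.COL_A = Y.COL_A,
              X.COL_B = Y.COL_B,
              X.COL_C = Y.COL_C;

COMMIT;

ALTER TRIGGER TR_MY_TRG_NAME ENABLE;
```

If the trigger cannot be disabled, the same MERGE does not help, because the row-level trigger would still fire for every row of the first UPDATE.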
I think you could reduce the DML load by splitting the update into three parts, one per column, so that each column is updated only when it has changed:
CREATE OR REPLACE TRIGGER TR_MY_TRG_NAME
AFTER UPDATE OF COL_A, COL_B, COL_C ON T_MY_TABLE_Y
FOR EACH ROW
BEGIN
IF :NEW.COL_A != :OLD.COL_A THEN
UPDATE T_MY_TABLE_X SET COL_A = :NEW.COL_A WHERE ID = :NEW.ID;
END IF;
IF :NEW.COL_B != :OLD.COL_B THEN
UPDATE T_MY_TABLE_X SET COL_B = :NEW.COL_B WHERE ID = :NEW.ID;
END IF;
IF :NEW.COL_C != :OLD.COL_C THEN
UPDATE T_MY_TABLE_X SET COL_C = :NEW.COL_C WHERE ID = :NEW.ID;
END IF;
END;
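This way, an UPDATE is issued only for the columns whose values actually changed. Note that != evaluates to NULL when either side is NULL, so a change to or from NULL is not propagated unless you handle it explicitly.
Moreover, make sure T_MY_TABLE_X.ID has an index on it, preferably a unique index.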
I don't have the time to write the code out, but an outline approach I can suggest trying is to create a package holding an array of records, each with ID, COL_A, COL_B and COL_C.
The BEFORE statement trigger should instantiate and initialise the package and the array within, e.g.:
the_package.pr_init
The ROW LEVEL trigger should simply write the :NEW.ID, :NEW.COL_A, :NEW.COL_B, :NEW.COL_C to the array within the package, e.g:
the_package.pr_save ( :NEW.ID, :NEW.COL_A, :NEW.COL_B, :NEW.COL_C )
The AFTER statement trigger should then issue a bulk UPDATE driven off the array within the package, and clear out the array, e.g.:
the_package.pr_do_update.
The benefit of this approach is you only execute ONE additional UPDATE statement regardless of how many rows.
The solution is also contained within a single package, albeit spread across 3 triggers, though the trigger code itself will be much simplified.
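The outline above could be sketched roughly as follows. The procedure names come from the outline; the statement-level trigger names (TR_MY_TRG_BS, TR_MY_TRG_AS) and the choice of one collection per column are my own assumptions, not part of the original suggestion:

```sql
CREATE OR REPLACE PACKAGE the_package AS
  PROCEDURE pr_init;
  PROCEDURE pr_save (p_id    IN T_MY_TABLE_X.ID%TYPE,
                     p_col_a IN T_MY_TABLE_X.COL_A%TYPE,
                     p_col_b IN T_MY_TABLE_X.COL_B%TYPE,
                     p_col_c IN T_MY_TABLE_X.COL_C%TYPE);
  PROCEDURE pr_do_update;
END the_package;
/

CREATE OR REPLACE PACKAGE BODY the_package AS
  -- One collection per column (an assumption) keeps the FORALL simple.
  TYPE t_id_tab IS TABLE OF T_MY_TABLE_X.ID%TYPE    INDEX BY PLS_INTEGER;
  TYPE t_a_tab  IS TABLE OF T_MY_TABLE_X.COL_A%TYPE INDEX BY PLS_INTEGER;
  TYPE t_b_tab  IS TABLE OF T_MY_TABLE_X.COL_B%TYPE INDEX BY PLS_INTEGER;
  TYPE t_c_tab  IS TABLE OF T_MY_TABLE_X.COL_C%TYPE INDEX BY PLS_INTEGER;
  g_ids t_id_tab;
  g_a   t_a_tab;
  g_b   t_b_tab;
  g_c   t_c_tab;

  -- Empty the arrays before each triggering statement.
  PROCEDURE pr_init IS
  BEGIN
    g_ids.DELETE;
    g_a.DELETE;
    g_b.DELETE;
    g_c.DELETE;
  END pr_init;

  -- Remember one changed row; no SQL is executed here.
  PROCEDURE pr_save (p_id    IN T_MY_TABLE_X.ID%TYPE,
                     p_col_a IN T_MY_TABLE_X.COL_A%TYPE,
                     p_col_b IN T_MY_TABLE_X.COL_B%TYPE,
                     p_col_c IN T_MY_TABLE_X.COL_C%TYPE) IS
    l_idx PLS_INTEGER := g_ids.COUNT + 1;
  BEGIN
    g_ids(l_idx) := p_id;
    g_a(l_idx)   := p_col_a;
    g_b(l_idx)   := p_col_b;
    g_c(l_idx)   := p_col_c;
  END pr_save;

  -- Send all saved rows to the SQL engine in one bulk-bound UPDATE.
  PROCEDURE pr_do_update IS
  BEGIN
    FORALL i IN 1 .. g_ids.COUNT
      UPDATE T_MY_TABLE_X
         SET COL_A = g_a(i),
             COL_B = g_b(i),
             COL_C = g_c(i)
       WHERE ID = g_ids(i);
    pr_init;
  END pr_do_update;
END the_package;
/

CREATE OR REPLACE TRIGGER TR_MY_TRG_BS
BEFORE UPDATE OF COL_A, COL_B, COL_C ON T_MY_TABLE_Y
BEGIN
  the_package.pr_init;
END;
/

CREATE OR REPLACE TRIGGER TR_MY_TRG_NAME
AFTER UPDATE OF COL_A, COL_B, COL_C ON T_MY_TABLE_Y
FOR EACH ROW
BEGIN
  the_package.pr_save(:NEW.ID, :NEW.COL_A, :NEW.COL_B, :NEW.COL_C);
END;
/

CREATE OR REPLACE TRIGGER TR_MY_TRG_AS
AFTER UPDATE OF COL_A, COL_B, COL_C ON T_MY_TABLE_Y
BEGIN
  the_package.pr_do_update;
END;
/
```

Keep in mind that for a 2 million row update all saved values sit in session memory until the AFTER statement trigger runs, so this is worth measuring before relying on it. From 11g onward, a compound trigger can hold the same state without a separate package.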
The UPSERT operation either updates or inserts a row in a table, depending if the table already has a row that matches the data:
if table t has a row with key X:
update t set mystuff... where mykey=X
else
insert into t mystuff...
Since Oracle doesn't have a specific UPSERT statement, what's the best way to do this?
The MERGE statement merges data between two tables. Using DUAL allows us to use this command. Note that this is not protected against concurrent access.
create or replace
procedure ups(xa number)
as
begin
merge into mergetest m using dual on (a = xa)
when not matched then insert (a,b) values (xa,1)
when matched then update set b = b+1;
end ups;
/
drop table mergetest;
create table mergetest(a number, b number);
call ups(10);
call ups(10);
call ups(20);
select * from mergetest;
A B
---------------------- ----------------------
10 2
20 1
The DUAL example above, which is in PL/SQL, was great because I wanted to do something similar, but client side. So here is the SQL I used to send a similar statement directly from some C# (string values go in single quotes; double quotes are for identifiers):
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"='smith' , "name"='john'
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,'smith', 'john' )
However, from a C# perspective this proved to be slower than doing the update, checking whether the rows affected was 0, and doing the insert if it was.
An alternative to MERGE (the "old fashioned way"):
begin
insert into t (mykey, mystuff)
values ('X', 123);
exception
when dup_val_on_index then
update t
set mystuff = 123
where mykey = 'X';
end;
Another alternative without the exception check:
UPDATE tablename
SET val1 = in_val1,
val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%rowcount = 0 )
THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
Insert if the row does not exist, then update:
INSERT INTO mytable (id1, t1)
SELECT 11, 'x1' FROM DUAL
WHERE NOT EXISTS (SELECT id1 FROM mytable WHERE id1 = 11);
UPDATE mytable SET t1 = 'x1' WHERE id1 = 11;
None of the answers given so far is safe in the face of concurrent accesses, as pointed out in Tim Sylvester's comment, and will raise exceptions in case of races. To fix that, the insert/update combo must be wrapped in some kind of loop statement, so that in case of an exception the whole thing is retried.
As an example, here's how Grommit's code can be wrapped in a loop to make it safe when run concurrently:
PROCEDURE MyProc (
...
) IS
BEGIN
LOOP
BEGIN
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"='smith' , "name"='john'
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,'smith', 'john' );
EXIT; -- success? -> exit loop
EXCEPTION
WHEN NO_DATA_FOUND THEN -- the entry was concurrently deleted
NULL; -- exception? -> no op, i.e. continue looping
WHEN DUP_VAL_ON_INDEX THEN -- an entry was concurrently inserted
NULL; -- exception? -> no op, i.e. continue looping
END;
END LOOP;
END;
N.B. In transaction mode SERIALIZABLE, which I don't recommend btw, you might run into "ORA-08177: can't serialize access for this transaction" exceptions instead.
I like Grommit's answer, except that it requires each value to be written twice. I found a solution where each value appears only once: http://forums.devshed.com/showpost.php?p=1182653&postcount=2
MERGE INTO KBS.NUFUS_MUHTARLIK B
USING (
SELECT '028-01' CILT, '25' SAYFA, '6' KUTUK, '46603404838' MERNIS_NO
FROM DUAL
) E
ON (B.MERNIS_NO = E.MERNIS_NO)
WHEN MATCHED THEN
UPDATE SET B.CILT = E.CILT, B.SAYFA = E.SAYFA, B.KUTUK = E.KUTUK
WHEN NOT MATCHED THEN
INSERT ( CILT, SAYFA, KUTUK, MERNIS_NO)
VALUES (E.CILT, E.SAYFA, E.KUTUK, E.MERNIS_NO);
I've been using the first code sample for years. Notice notfound rather than count.
UPDATE tablename SET val1 = in_val1, val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%notfound ) THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
The code below is the possibly new and improved code
MERGE INTO tablename USING dual ON ( val3 = in_val3 )
WHEN MATCHED THEN UPDATE SET val1 = in_val1, val2 = in_val2
WHEN NOT MATCHED THEN INSERT
VALUES (in_val1, in_val2, in_val3)
In the first example the update does an index lookup. It has to, in order to update the right row. Oracle opens an implicit cursor, and we use it to wrap a corresponding insert so we know that the insert will only happen when the key does not exist. But the insert is an independent command and it has to do a second lookup. I don't know the inner workings of the merge command but since the command is a single unit, Oracle could execute the correct insert or update with a single index lookup.
I think merge is better when you do have some processing to be done that means taking data from some tables and updating a table, possibly inserting or deleting rows. But for the single row case, you may consider the first case since the syntax is more common.
A note regarding the two solutions that suggest:
1) Insert, if exception then update,
or
2) Update, if sql%rowcount = 0 then insert
The question of whether to insert or update first is also application dependent. Are you expecting more inserts or more updates? The one that is most likely to succeed should go first.
If you pick the wrong one you will get a bunch of unnecessary index reads. Not a huge deal but still something to consider.
Try this,
insert into b_building_property (
select
'AREA_IN_COMMON_USE_DOUBLE','Area in Common Use','DOUBLE', null, 9000, 9
from dual
)
minus
(
select * from b_building_property where id = 9
)
;
From http://www.praetoriate.com/oracle_tips_upserts.htm:
"In Oracle9i, an UPSERT can accomplish this task in a single statement:"
INSERT
FIRST WHEN
credit_limit >=100000
THEN INTO
rich_customers
VALUES(cust_id,cust_credit_limit)
INTO customers
ELSE
INTO customers SELECT * FROM new_customers;
I am trying to insert data by means of a trigger. Before inserting the data into the table, I want to delete the existing data that matches a condition.
The INSERT should run only once that condition is true, otherwise not. I implemented it like this:
create or replace TRIGGER APP_WFM.TRG_INS_NE_SF_SITE_INSTANCE
BEFORE INSERT OR UPDATE ON SF_NE_DETAILS
FOR EACH ROW
BEGIN
IF
DELETE FROM NE_SITE_INSTANCE
WHERE build_by IN ('RCOM','RJIL','IP1','IP1 COLO')
AND validation_status IS NULL
AND wfm_hoto_flag IS NULL;
THEN
INSERT INTO NE_SITE_INSTANCE (
rf_siteid,
sap_id,
sitename,
DSUPPLIERBATTERYBANK,
DUTILITYINSTBATTERYBANK,
DENDDGDATE,
DSTARTDGDATE,
DEMFEND,
DEMFSTART,
DQUALITYDATE,
DRFE1OFFERED,
DUTILITYINSTODC,
DSUPPLIERSMPS,
DUTILITYINSTSMPS,
DLTOWERPOLEMATERIAL,
DLNOOFPLATFORM,
DLNOOFPOLES,
DLNOOFSECTORS,
DLBATREDIFIER,
DLNOOFREDINSMPS,
FUBATCOMMREPORT,
FUDGCOMREPORT,
FUSMPSCOMREPORT,
DLSHELTER,
CANDIDATEID,
CIRCLE,
CITY_NAME,
LATITUDE,
LONGITUDE,
DLMATERIALSUPPLIER,
DLBATMODELNAME,
DBATRECTIFIERSRNO1,
DBATRECTIFIERSRNO2,
DBATRECTIFIERSRNO3,
DBATRECTIFIERSRNO4,
DBATRECTIFIERSRNO5,
DBATRECTIFIERSRNO6,
DCBMS,
DDGALTERNATEMAKE,
DGALTERNATESRNO,
DCRANKMAKE,
DCRANKSRNO,
DDCENGINEMAKE,
DDGENGINESRNO,
DLDGMODELNAME,
DLEARTHING,
DLRP1MODELNAME,
DCENERGYMETEROWNSRNO,
DLMODELFDPFDMS1_COUNT,
DLMODELFDPFDMS1,
DFDPMAKE,
DLMODELFDPFDMS_COUNT,
DLMODELFDPFDMS,
DODCMAKE,
DLDOCMODELNAME,
ODCSERIALNO,
DLGO,
DSHELTERMAKE,
DSHELTERSRNO,
DCONTROLLERADDRESS,
DSMPSMAKE,
DLMODELSMPS,
DPAUIPADDRESS,
DSMPSRECTIFIERSRNO1,
DSMPSRECTIFIERSRNO2,
DSMPSRECTIFIERSRNO3,
DSMPSRECTIFIERSRNO4,
DSMPSRECTIFIERSRNO5,
DSMPSRECTIFIERSRNO6,
DSMPSSRNO,
DLLTOWERTYPE,
DSUPPLIERFDMS,
DRFE1DECLARED,
SITE_TYPE,
DSMPSRECTIFIERSRNO7,
DSMPSRECTIFIERSRNO8,
DSMPSRECTIFIERSRNO9,
DSMPSRECTIFIERSRNO10,
DSMPSRECTIFIERSRNO11,
DSMPSRECTIFIERSRNO12,
ALARM_GATEWAY_MAKE,
ALARM_GATEWAY_MODEL_NAME,
ALRM_GTWAY_INTS_DT,
ALRM_GTWAY_SR_NO,
ALRM_GTWAY_COMM_DT
) VALUES (
:NEW.rf_siteid,
:NEW.sap_id,
:NEW.site_name,
:NEW.DSUPPLIERBATTERYBANK,
:NEW.DUTILITYINSTBATTERYBANK,
:NEW.DENDDGDATE,
:NEW.DSTARTDGDATE,
:NEW.DEMFEND,
:NEW.DEMFSTART,
:NEW.DQUALITYDATE,
:NEW.DRFE1OFFERED,
:NEW.DUTILITYINSTODC,
:NEW.DSUPPLIERSMPS,
:NEW.DUTILITYINSTSMPS,
:NEW.DLTOWERPOLEMATERIAL,
:NEW.DLNOOFPLATFORM,
:NEW.DLNOOFPOLES,
:NEW.DLNOOFSECTORS,
:NEW.DLBATREDIFIER,
:NEW.DLNOOFREDINSMPS,
:NEW.FUBATCOMMREPORT,
:NEW.FUDGCOMREPORT,
:NEW.FUSMPSCOMREPORT,
:NEW.DLSHELTER,
:NEW.CANDIDATEID,
:NEW.CIRCLE,
:NEW.CITY_NAME,
:NEW.LATITUDE,
:NEW.LONGITUDE,
:NEW.DLMATERIALSUPPLIER,
:NEW.DLBATMODELNAME,
:NEW.DBATRECTIFIERSRNO1,
:NEW.DBATRECTIFIERSRNO2,
:NEW.DBATRECTIFIERSRNO3,
:NEW.DBATRECTIFIERSRNO4,
:NEW.DBATRECTIFIERSRNO5,
:NEW.DBATRECTIFIERSRNO6,
:NEW.DCBMS,
:NEW.DDGALTERNATEMAKE,
:NEW.DGALTERNATESRNO,
:NEW.DCRANKMAKE,
:NEW.DCRANKSRNO,
:NEW.DDCENGINEMAKE,
:NEW.DDGENGINESRNO,
:NEW.DLDGMODELNAME,
:NEW.DLEARTHING,
:NEW.DLRP1MODELNAME,
:NEW.DCENERGYMETEROWNSRNO,
:NEW.DLMODELFDPFDMS1_COUNT,
:NEW.DLMODELFDPFDMS1,
:NEW.DFDPMAKE,
:NEW.DLMODELFDPFDMS_COUNT,
:NEW.DLMODELFDPFDMS,
:NEW.DODCMAKE,
:NEW.DLDOCMODELNAME,
:NEW.ODCSERIALNO,
:NEW.DLGO,
:NEW.DSHELTERMAKE,
:NEW.DSHELTERSRNO,
:NEW.DCONTROLLERADDRESS,
:NEW.DSMPSMAKE,
:NEW.DLMODELSMPS,
:NEW.DPAUIPADDRESS,
:NEW.DSMPSRECTIFIERSRNO1,
:NEW.DSMPSRECTIFIERSRNO2,
:NEW.DSMPSRECTIFIERSRNO3,
:NEW.DSMPSRECTIFIERSRNO4,
:NEW.DSMPSRECTIFIERSRNO5,
:NEW.DSMPSRECTIFIERSRNO6,
:NEW.DSMPSSRNO,
:NEW.DLLTOWERTYPE,
:NEW.DSUPPLIERFDMS,
:NEW.DRFE1DECLARED,
:NEW.SITE_TYPE,
:NEW.DSMPSRECTIFIERSRNO7,
:NEW.DSMPSRECTIFIERSRNO8,
:NEW.DSMPSRECTIFIERSRNO9,
:NEW.DSMPSRECTIFIERSRNO10,
:NEW.DSMPSRECTIFIERSRNO11,
:NEW.DSMPSRECTIFIERSRNO12,
:NEW.ALARM_GATEWAY_MAKE,
:NEW.ALARM_GATEWAY_MODEL_NAME,
:NEW.ALRM_GTWAY_INTS_DT,
:NEW.ALRM_GTWAY_SR_NO,
:NEW.ALRM_GTWAY_COMM_DT
);
END;
But it gives this error:
Error(8,1): PLS-00103: Encountered the symbol "DELETE" when expecting one of the following: ( - + case mod new not null continue avg count current exists max min prior sql stddev sum variance execute forall merge time timestamp interval date pipe <an alternat
If the goal is to check whether the delete removed rows before running the insert, you'd want something like
create or replace TRIGGER APP_WFM.TRG_INS_NE_SF_SITE_INSTANCE
BEFORE INSERT OR UPDATE ON SF_NE_DETAILS
FOR EACH ROW
BEGIN
DELETE ...
IF( sql%rowcount > 0 )
THEN
INSERT ...
END IF;
END;
It's not obvious to me that this makes a whole lot of sense, though. If you run a SQL statement that inserts 1000 rows into SF_NE_DETAILS, you'd run the delete 1000 times (which seems inefficient) and insert only a single row into NE_SITE_INSTANCE. That seems unlikely to be what you really want. My guess is that you really want a statement-level trigger that does the delete
create or replace TRIGGER APP_WFM.TRG_INS_NE_SF_SITE_INSTANCE
BEFORE INSERT OR UPDATE ON SF_NE_DETAILS
BEGIN
DELETE ...
END;
and then the row-level trigger that just does the insert. Then in the case that you run an insert that inserts 1000 rows, you'd run the delete once and the insert 1000 times.
Of course, this assumes that there is a good reason that you need to have two tables that seem to consist of mostly duplicate information. That would make some sense if ne_site_instance was a history table but that doesn't appear to be what's going on here. Are you sure that you don't really want a view/ materialized view/ something else?
I'm trying to update a table based on another one's information:
Source_Table (Table 1) columns:
TABLE_ROW_ID (Based on trigger-sequence when insert)
REP_ID
SOFT_ASSIGNMENT
Description (Table 2) columns:
REP_ID
NEW_SOFT_ASSIGNMENT
This is my loop statement:
SELECT count(table_row_id) INTO V_ROWS_APPROVED FROM Source_Table;
FOR i IN 1..V_ROWS_APPROVED LOOP
SELECT REQUESTED_SOFT_MAPPING INTO V_SOFT FROM Source_Table WHERE ROW_ID = i;
SELECT REP_ID INTO V_REP_ID FROM Source_Table WHERE ROW_ID = i;
UPDATE Description_Table D
SET D.NEW_SOFT_ASSIGNMENT = V_SOFT
WHERE D.REP_ID = V_REP_ID;
END LOOP;
END;
The ending result of this loop is a beautiful ''504 Gateway Time-out''.
I know the issue is on the Update query but there's no other way (I can think about) of doing it.
Can someone give me a hand please?
Thanks
Unless your row_id values are contiguous - i.e. count(row_id) == max(row_id) - then this will get a no-data-found. Sequences aren't gapless, so this seems fairly likely. We have no way of telling if that is happening and somehow that is leaving your connection hanging until it times out, or if it's just taking a long time because you're doing a lot of individual queries and updates over a large data set. (And you may be squashing any errors that do occur, though you haven't shown that.)
You don't need to query and update in a loop though, or even use PL/SQL; you can apply all the values in the source table to the description table with a single update or merge:
merge into description_table d
using source_table s
on (s.rep_id = d.rep_id)
when matched then
update set d.new_soft_assignment = s.requested_soft_mapping;
db<>fiddle with some dummy data, including a non-contiguous row_id to show that erroring.
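For completeness, the "single update" variant mentioned above might look like the sketch below. It assumes each rep_id appears at most once in source_table; with several matching source rows the subquery would fail with ORA-01427, and the merge would similarly need a de-duplicated source:

```sql
-- Sketch only: same effect as the merge, written as a correlated update.
UPDATE description_table d
   SET d.new_soft_assignment = (SELECT s.requested_soft_mapping
                                  FROM source_table s
                                 WHERE s.rep_id = d.rep_id)
 -- Restrict to rows that have a match, so unmatched rows are not set to NULL.
 WHERE EXISTS (SELECT 1
                 FROM source_table s
                WHERE s.rep_id = d.rep_id);
```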
My statement:
SELECT ROW_ID FROM DATA_T WHERE CITY_ID=2000 AND IS_FREE=0 AND ROWNUM = 1
is used to retrieve the first row for a db table that has many entries with CITY_ID equal to 2000.
The ROW_ID that is returned is then used in an UPDATE statement in order to use this row and set IS_FREE=1.
That worked very well until two threads called the SELECT statement and both got the same ROW_ID, obviously. That is my problem in a few words.
I am using ORACLE DB (12.x)
How do I resolve the problem? Can I use FOR UPDATE in this case?
I want every "client" to somehow get a different row, or at least lock one of them.
Something like this
function get_row_id return number
as
cursor cur_upd is
SELECT ROW_ID FROM TB WHERE CITY_ID=2000 AND IS_FREE=0 AND ROWNUM = 1
FOR UPDATE SKIP LOCKED;
begin
for get_cur_upd in cur_upd
loop
update TB
set IS_FREE = 1
where ROW_ID = get_cur_upd.ROW_ID;
commit work;
return get_cur_upd.ROW_ID;
end loop;
return null;
end;
Whether to commit after the update depends on your logic.
You can also return row_id without the update and commit, and do them later outside the function.
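One thing to watch with the function above: as far as I know, ROWNUM = 1 is applied before locked rows are skipped, so when the single candidate row is already locked by another session the cursor may return nothing even though other free rows exist. A possible variant, sketched under the assumption that ROW_ID is a numeric column of TB, drops the ROWNUM filter and fetches just one row from the cursor:

```sql
CREATE OR REPLACE FUNCTION get_row_id RETURN NUMBER
AS
  -- No ROWNUM filter: locked rows are skipped as the cursor is fetched.
  CURSOR cur_upd IS
    SELECT ROW_ID
      FROM TB
     WHERE CITY_ID = 2000
       AND IS_FREE = 0
       FOR UPDATE SKIP LOCKED;
  l_row_id TB.ROW_ID%TYPE;
BEGIN
  OPEN cur_upd;
  FETCH cur_upd INTO l_row_id;   -- first row that could be locked, if any
  IF cur_upd%FOUND THEN
    UPDATE TB
       SET IS_FREE = 1
     WHERE ROW_ID = l_row_id;
  END IF;
  CLOSE cur_upd;
  RETURN l_row_id;               -- NULL when no free row was available
END;
/
```

This version leaves the commit to the caller, so the row stays locked until the calling transaction ends.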
I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in where clause.
What is a duplicate?
If 2 or more records have same values in columns id,cid,type,trefid,ordrefid,amount and paydt then there are duplicates.
The DELETEs remove about 1 million records.
Can they be rewritten in any way to make them quicker?
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
HASH JOIN 1296491
Access Predicates
AND
A.ID=ITEM_1
A.CID=ITEM_2
ITEM_3=NVL(TYPE,'-99999')
ITEM_4=NVL(TREFID,'-99999')
ITEM_5=NVL(ORDREFID,'-99999')
ITEM_6=NVL(AMOUNT,(-99999))
ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Filter Predicates
LOADDT<MAX(LOADDT)
TABLE ACCESS TABLE1 FULL 267904
VIEW VW_SQ_1 690385
SORT GROUP BY 690385
TABLE ACCESS TABLE1 FULL 267904
How large is the table? If the deleted rows make up no more than about 12% of it, an index may be worth considering.
Could you partition the table somehow, for example week by week, and then scan only the current week?
Maybe this would be more efficient. When you use an aggregate function, Oracle must walk through all relevant rows (in your case a full scan), but with EXISTS it stops when the first occurrence is found. (And of course the query would be much faster with a single function-based index, needed because of the NVL calls, on all columns in the WHERE clause.)
DELETE FROM TABLE1 A
WHERE exists (
SELECT 1
FROM TABLE1 B
WHERE
a.loaddt < b.loaddt and
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
Although some may disagree, I am a proponent of running large, long running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-) Also, not sure why you need to join table1 to itself to identify duplicates (and I'd be curious if you ever run into snapshot too old issues with your current approach). You also shouldn't need multiple delete statements, all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id,cid,type,trefid,ordrefid,amount,paydt, count(1) as cnt
from table1
group by id,cid,type,trefid,ordrefid,amount,paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
-- make sure my_dups_mv is refreshed first
cursor dup_cur is
select * from my_dups_mv;
type duprec_t is record(row_id rowid);
duprec duprec_t;
type duptab_t is table of duprec_t index by pls_integer;
duptab duptab_t;
l_ctr pls_integer := 0;
l_dupcnt pls_integer := 0;
begin
for rec in dup_cur
loop
l_ctr := l_ctr + 1;
-- assuming needed indexes exist
select rowid
bulk collect into duptab
from table1
where id = rec.id
and cid = rec.cid
and type = rec.type
and trefid = rec.trefid
and ordrefid = rec.ordrefid
and amount = rec.amount
and paydt = rec.paydt
-- order by whatever makes sense to make the "keeper" float to top
order by loaddt desc
;
for i in 2 .. duptab.count
loop
l_dupcnt := l_dupcnt + 1;
delete from table1 where rowid = duptab(i).row_id;
end loop;
if (mod(l_ctr, 10000) = 0) then
-- log to log table here (calling autonomous procedure you'll need to implement)
insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
commit;
end if;
end loop;
commit;
end;
Check your log table for progress status.
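If a single set-based statement is preferred over the procedural loop, one option is an analytic ROW_NUMBER, sketched below. It assumes loaddt is never NULL (with DESC ordering Oracle sorts NULLs first by default) and that keeping the newest row, with the lowest rowid as tie-breaker, is the desired rule. PARTITION BY groups NULLs together, so the NVL wrappers are not needed here:

```sql
-- Sketch only: removes every row except the "keeper" in each duplicate group.
DELETE FROM table1
 WHERE rowid IN (
         SELECT rid
           FROM (SELECT rowid AS rid,
                        ROW_NUMBER() OVER (
                          PARTITION BY id, cid, type, trefid, ordrefid, amount, paydt
                          ORDER BY loaddt DESC, rowid
                        ) AS rn
                   FROM table1)
          WHERE rn > 1   -- rn = 1 is the row that is kept
       );
```

This covers both of the original DELETE statements in one pass, but it still has to sort the whole table, so the memory remarks elsewhere in this thread apply to it as well.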
1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
This assumes you have Enterprise Edition, a sane server configuration, and are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, that only use a GROUP BY, looks nicer and may run faster. But it will also probably run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to temporary tablespace. And it may also reduce read IO if you use some partitioning strategy like jakub.petr suggested.
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)