optimizing a dup delete statement Oracle

optimizing a dup delete statement Oracle - oracle

I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in where clause.
What is a duplicate?
If 2 or more records have same values in columns id,cid,type,trefid,ordrefid,amount and paydt then there are duplicates.
The DELETEs delete about 1 million record.
Can they be re-written in any way to make it quicker.
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
HASH JOIN 1296491
Access Predicates
AND
A.ID=ITEM_1
A.CID=ITEM_2
ITEM_3=NVL(TYPE,'-99999')
ITEM_4=NVL(TREFID,'-99999')
ITEM_5=NVL(ORDREFID,'-99999')
ITEM_6=NVL(AMOUNT,(-99999))
ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Filter Predicates
LOADDT<MAX(LOADDT)
TABLE ACCESS TABLE1 FULL 267904
VIEW VW_SQ_1 690385
SORT GROUP BY 690385
TABLE ACCESS TABLE1 FULL 267904

How large is the table? If count of deleted rows is up to 12% then you may think about index.
Could you somehow partition your table - like week by week and then scan only actual week?
Maybe this could be more effecient. When you're using aggregate function, then oracle must walk through all relevant rows (in your case fullscan), but when you use exists it stops when the first occurence is found. (and of course the query would be much faster, when there was one function-based(because of NVL) index on all columns in where clause)
DELETE FROM TABLE1 A
WHERE exists (
SELECT 1
FROM TABLE1 B
WHERE
A.loaddt != b.loaddt
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);

Although some may disagree, I am a proponent of running large, long running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-) Also, not sure why you need to join table1 to itself to identify duplicates (and I'd be curious if you ever run into snapshot too old issues with your current approach). You also shouldn't need multiple delete statements, all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id,cid,type,trefid,ordrefid,amount,paydt, count(1) as cnt
from table1
group by id,cid,type,trefid,ordrefid,amount,paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
-- make sure my_dups_mv is refreshed first
cursor dup_cur is
select * from my_dups_mv;
type duprec_t is record(row_id rowid);
duprec duprec_t;
type duptab_t is table of duprec_t index by pls_integer;
duptab duptab_t;
l_ctr pls_integer := 0;
l_dupcnt pls_integer := 0;
begin
for rec in dup_cur
loop
l_ctr := l_ctr + 1;
-- assuming needed indexes exist
select rowid
bulk collect into duptab
from table1
where id = rec.id
and cid = rec.cid
and type = rec.type
and trefid = rec.trefid
and ordrefid = rec.ordrefid
and amount = rec.amount
and paydt = rec.paydt
-- order by whatever makes sense to make the "keeper" float to top
order by loaddt desc
;
for i in 2 .. duptab.count
loop
l_dupcnt := l_dupcnt + 1;
delete from table1 where rowid = duptab(i).row_id;
end loop;
if (mod(l_ctr, 10000) = 0) then
-- log to log table here (calling autonomous procedure you'll need to implement)
insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
commit;
end if;
end loop;
commit;
end;
Check your log table for progress status.

1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
Assuming you have Enterprise Edition, a sane server configuration, and you are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, that only use a GROUP BY, looks nicer and may run faster. But it will also probably run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to temporary tablespace. And it may also reduce read IO if you use some partitioning strategy like jakub.petr suggested.
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)

Related

Oracle equivalent query for this postgress query - CONFLICT [duplicate]

The UPSERT operation either updates or inserts a row in a table, depending if the table already has a row that matches the data:
if table t has a row exists that has key X:
update t set mystuff... where mykey=X
else
insert into t mystuff...
Since Oracle doesn't have a specific UPSERT statement, what's the best way to do this?

The MERGE statement merges data between two tables. Using DUAL
allows us to use this command. Note that this is not protected against concurrent access.
create or replace
procedure ups(xa number)
as
begin
merge into mergetest m using dual on (a = xa)
when not matched then insert (a,b) values (xa,1)
when matched then update set b = b+1;
end ups;
/
drop table mergetest;
create table mergetest(a number, b number);
call ups(10);
call ups(10);
call ups(20);
select * from mergetest;
A B
---------------------- ----------------------
10 2
20 1

The dual example above which is in PL/SQL was great becuase I wanted to do something similar, but I wanted it client side...so here is the SQL I used to send a similar statement direct from some C#
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" )
However from a C# perspective this provide to be slower than doing the update and seeing if the rows affected was 0 and doing the insert if it was.

An alternative to MERGE (the "old fashioned way"):
begin
insert into t (mykey, mystuff)
values ('X', 123);
exception
when dup_val_on_index then
update t
set mystuff = 123
where mykey = 'X';
end;

Another alternative without the exception check:
UPDATE tablename
SET val1 = in_val1,
val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%rowcount = 0 )
THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;

insert if not exists
update:
INSERT INTO mytable (id1, t1)
SELECT 11, 'x1' FROM DUAL
WHERE NOT EXISTS (SELECT id1 FROM mytble WHERE id1 = 11);
UPDATE mytable SET t1 = 'x1' WHERE id1 = 11;

None of the answers given so far is safe in the face of concurrent accesses, as pointed out in Tim Sylvester's comment, and will raise exceptions in case of races. To fix that, the insert/update combo must be wrapped in some kind of loop statement, so that in case of an exception the whole thing is retried.
As an example, here's how Grommit's code can be wrapped in a loop to make it safe when run concurrently:
PROCEDURE MyProc (
...
) IS
BEGIN
LOOP
BEGIN
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" );
EXIT; -- success? -> exit loop
EXCEPTION
WHEN NO_DATA_FOUND THEN -- the entry was concurrently deleted
NULL; -- exception? -> no op, i.e. continue looping
WHEN DUP_VAL_ON_INDEX THEN -- an entry was concurrently inserted
NULL; -- exception? -> no op, i.e. continue looping
END;
END LOOP;
END;
N.B. In transaction mode SERIALIZABLE, which I don't recommend btw, you might run into
ORA-08177: can't serialize access for this transaction exceptions instead.

I'd like Grommit answer, except it require dupe values. I found solution where it may appear once: http://forums.devshed.com/showpost.php?p=1182653&postcount=2
MERGE INTO KBS.NUFUS_MUHTARLIK B
USING (
SELECT '028-01' CILT, '25' SAYFA, '6' KUTUK, '46603404838' MERNIS_NO
FROM DUAL
) E
ON (B.MERNIS_NO = E.MERNIS_NO)
WHEN MATCHED THEN
UPDATE SET B.CILT = E.CILT, B.SAYFA = E.SAYFA, B.KUTUK = E.KUTUK
WHEN NOT MATCHED THEN
INSERT ( CILT, SAYFA, KUTUK, MERNIS_NO)
VALUES (E.CILT, E.SAYFA, E.KUTUK, E.MERNIS_NO);

I've been using the first code sample for years. Notice notfound rather than count.
UPDATE tablename SET val1 = in_val1, val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%notfound ) THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
The code below is the possibly new and improved code
MERGE INTO tablename USING dual ON ( val3 = in_val3 )
WHEN MATCHED THEN UPDATE SET val1 = in_val1, val2 = in_val2
WHEN NOT MATCHED THEN INSERT
VALUES (in_val1, in_val2, in_val3)
In the first example the update does an index lookup. It has to, in order to update the right row. Oracle opens an implicit cursor, and we use it to wrap a corresponding insert so we know that the insert will only happen when the key does not exist. But the insert is an independent command and it has to do a second lookup. I don't know the inner workings of the merge command but since the command is a single unit, Oracle could execute the correct insert or update with a single index lookup.
I think merge is better when you do have some processing to be done that means taking data from some tables and updating a table, possibly inserting or deleting rows. But for the single row case, you may consider the first case since the syntax is more common.

A note regarding the two solutions that suggest:
1) Insert, if exception then update,
or
2) Update, if sql%rowcount = 0 then insert
The question of whether to insert or update first is also application dependent. Are you expecting more inserts or more updates? The one that is most likely to succeed should go first.
If you pick the wrong one you will get a bunch of unnecessary index reads. Not a huge deal but still something to consider.

Try this,
insert into b_building_property (
select
'AREA_IN_COMMON_USE_DOUBLE','Area in Common Use','DOUBLE', null, 9000, 9
from dual
)
minus
(
select * from b_building_property where id = 9
)
;

From http://www.praetoriate.com/oracle_tips_upserts.htm:
"In Oracle9i, an UPSERT can accomplish this task in a single statement:"
INSERT
FIRST WHEN
credit_limit >=100000
THEN INTO
rich_customers
VALUES(cust_id,cust_credit_limit)
INTO customers
ELSE
INTO customers SELECT * FROM new_customers;

Oracle PL/SQL Update statement looping forever - 504 Gateway Time-out

I'm trying to update a table based on another one's information:
Source_Table (Table 1) columns:
TABLE_ROW_ID (Based on trigger-sequence when insert)
REP_ID
SOFT_ASSIGNMENT
Description (Table 2) columns:
REP_ID
NEW_SOFT_ASSIGNMENT
This is my loop statement:
SELECT count(table_row_id) INTO V_ROWS_APPROVED FROM Source_Table;
FOR i IN 1..V_ROWS_APPROVED LOOP
SELECT REQUESTED_SOFT_MAPPING INTO V_SOFT FROM Source_Table WHERE ROW_ID = i;
SELECT REP_ID INTO V_REP_ID FROM Source_Table WHERE ROW_ID = i;
UPDATE Description_Table D
SET D.NEW_SOFT_ASSIGNMENT = V_SOFT
WHERE D.REP_ID = V_REP_ID;
END LOOP;
END;
The ending result of this loop is a beautiful ''504 Gateway Time-out''.
I know the issue is on the Update query but there's no other way (I can think about) of doing it.
Can someone give me a hand please?
Thanks

Unless your row_id values are contiguous - i.e. count(row_id) == max(row_id) - then this will get a no-data-found. Sequences aren't gapless, so this seems fairly likely. We have no way of telling if that is happening and somehow that is leaving your connection hanging until it times out, or if it's just taking a long time because you're doing a lot of individual queries and updates over a large data set. (And you may be squashing any errors that do occur, though you haven't shown that.)
You don't need to query and update in a loop though, or even use PL/SQL; you can apply all the values in the source table to the description table with a single update or merge:
merge into description_table d
using source_table s
on (s.rep_id = d.rep_id)
when matched then
update set d.new_soft_assignment = s.requested_soft_mapping;
db<>fiddle with some dummy data, including a non-contiguous row_id to show that erroring.

UPDATE on INSERT duplicate primary key in Oracle?

I have a simple INSERT query where I need to use UPDATE instead when the primary key is a duplicate. In MySQL this seems easier, in Oracle it seems I need to use MERGE.
All examples I could find of MERGE had some sort of "source" and "target" tables, in my case, the source and target is the same table. I was not able to make sense of the examples to create my own query.
Is MERGE the only way or maybe there's a better solution?
INSERT INTO movie_ratings
VALUES (1, 3, 5)
It's basically this and the primary key is the first 2 values, so an update would be like this:
UPDATE movie_ratings
SET rating = 8
WHERE mid = 1 AND aid = 3
I thought of using a trigger that would automatically execute the UPDATE statement when the INSERT was called but only if the primary key is a duplicate. Is there any problem doing it this way? I need some help with triggers though as I'm having some difficulty trying to understand them and doing my own.

MERGE is the 'do INSERT or UPDATE as appropriate' statement in Standard SQL, and probably therefore in Oracle SQL too.
Yes, you need a 'table' to merge from, but you can almost certainly create that table on the fly:
MERGE INTO Movie_Ratings M
USING (SELECT 1 AS mid, 3 AS aid, 8 AS rating FROM dual) N
ON (M.mid = N.mid AND M.aid = N.aid)
WHEN MATCHED THEN UPDATE SET M.rating = N.rating
WHEN NOT MATCHED THEN INSERT( mid, aid, rating)
VALUES(N.mid, N.aid, N.rating);
(Syntax not verified.)

A typical way of doing this is
performing the INSERT and catch a DUP_VAL_ON_INDEX and then perform an UPDATE instead
performing the UPDATE first and if SQL%Rows = 0 perform an INSERT
You can't write a trigger on a table that does another operation on the same table. That's causing an Oracle error (mutating tables).

I'm a T-SQL guy but a trigger in this case is not a good solution. Most triggers are not good solutions. In T-SQL, I would simply perform an IF EXISTS (SELECT * FROM dbo.Table WHERE ...) but in Oracle, you have to select the count...
DECLARE
cnt NUMBER;
BEGIN
SELECT COUNT(*)
INTO cnt
FROM mytable
WHERE id = 12345;
IF( cnt = 0 )
THEN
...
ELSE
...
END IF;
END;
It would appear that MERGE is what you need in this case:
MERGE INTO movie_ratings mr
USING (
SELECT rating, mid, aid
WHERE mid = 1 AND aid = 3) mri
ON (mr.movie_ratings_id = mri.movie_ratings_id)
WHEN MATCHED THEN
UPDATE SET mr.rating = 8 WHERE mr.mid = 1 AND mr.aid = 3
WHEN NOT MATCHED THEN
INSERT (mr.rating, mr.mid, mr.aid)
VALUES (1, 3, 8)
Like I said, I'm a T-SQL guy but the basic idea here is to "join" the movie_rating table against itself. If there's no performance hit on using the "if exists" example, I'd use it for readability.

ways to avoid global temp tables in oracle

We just converted our sql server stored procedures to oracle procedures. Sql Server SP's were highly dependent on session tables (INSERT INTO #table1...) these tables got converted as global temporary tables in oracle. We ended up with aroun 500 GTT's for our 400 SP's
Now we are finding out that working with GTT's in oracle is considered a last option because of performance and other issues.
what other alternatives are there? Collections? Cursors?
Our typical use of GTT's is like so:
Insert into GTT
INSERT INTO some_gtt_1
(column_a,
column_b,
column_c)
(SELECT someA,
someB,
someC
FROM TABLE_A
WHERE condition_1 = 'YN756'
AND type_cd = 'P'
AND TO_NUMBER(TO_CHAR(m_date, 'MM')) = '12'
AND (lname LIKE (v_LnameUpper || '%') OR
lname LIKE (v_searchLnameLower || '%'))
AND (e_flag = 'Y' OR
it_flag = 'Y' OR
fit_flag = 'Y'));
Update the GTT
UPDATE some_gtt_1 a
SET column_a = (SELECT b.data_a FROM some_table_b b
WHERE a.column_b = b.data_b AND a.column_c = 'C')
WHERE column_a IS NULL OR column_a = ' ';
and later on get the data out of the GTT. These are just sample queries, in actuality the queries are really complext with lot of joins and subqueries.
I have a three part question:
Can someone show how to transform
the above sample queries to
collections and/or cursors?
Since
with GTT's you can work natively
with SQL...why go away from the
GTTs? are they really that bad.
What should be the guidelines on
When to use and When to avoid GTT's

Let's answer the second question first:
"why go away from the GTTs? are they
really that bad."
A couple of days ago I was knocking up a proof of concept which loaded a largish XML file (~18MB) into an XMLType. Because I didn't want to store the XMLType permanently I tried loading it into a PL/SQL variable (session memory) and a temporary table. Loading it into a temporary table took five times as long as loading it into an XMLType variable (5 seconds compared to 1 second). The difference is because temporary tables are not memory structures: they are written to disk (specifically your nominated temporary tablespace).
If you want to cache a lot of data then storing it in memory will stress the PGA, which is not good if you have lots of sessions. So it's a trade-off between RAM and time.
To the first question:
"Can someone show how to transform the
above sample queries to collections
and/or cursors?"
The queries you post can be merged into a single statement:
SELECT case when a.column_a IS NULL OR a.column_a = ' '
then b.data_a
else column_a end AS someA,
a.someB,
a.someC
FROM TABLE_A a
left outer join TABLE_B b
on ( a.column_b = b.data_b AND a.column_c = 'C' )
WHERE condition_1 = 'YN756'
AND type_cd = 'P'
AND TO_NUMBER(TO_CHAR(m_date, 'MM')) = '12'
AND (lname LIKE (v_LnameUpper || '%') OR
lname LIKE (v_searchLnameLower || '%'))
AND (e_flag = 'Y' OR
it_flag = 'Y' OR
fit_flag = 'Y'));
(I have simply transposed your logic but that case() statement could be replaced with a neater nvl2(trim(a.column_a), a.column_a, b.data_a) ).
I know you say your queries are more complicated but your first port of call should be to consider rewriting them. I know how seductive it is to break a gnarly query into lots of baby SQLs stitched together with PL/SQL but pure SQL is way more efficient.
To use a collection it is best to define the types in SQL, because it gives us the flexibility to use them in SQL statements as well as PL/SQL.
create or replace type tab_a_row as object
(col_a number
, col_b varchar2(23)
, col_c date);
/
create or replace type tab_a_nt as table of tab_a_row;
/
Here's a sample function, which returns a result set:
create or replace function get_table_a
(p_arg in number)
return sys_refcursor
is
tab_a_recs tab_a_nt;
rv sys_refcursor;
begin
select tab_a_row(col_a, col_b, col_c)
bulk collect into tab_a_recs
from table_a
where col_a = p_arg;
for i in tab_a_recs.first()..tab_a_recs.last()
loop
if tab_a_recs(i).col_b is null
then
tab_a_recs(i).col_b := 'something';
end if;
end loop;
open rv for select * from table(tab_a_recs);
return rv;
end;
/
And here it is in action:
SQL> select * from table_a
2 /
COL_A COL_B COL_C
---------- ----------------------- ---------
1 whatever 13-JUN-10
1 12-JUN-10
SQL> var rc refcursor
SQL> exec :rc := get_table_a(1)
PL/SQL procedure successfully completed.
SQL> print rc
COL_A COL_B COL_C
---------- ----------------------- ---------
1 whatever 13-JUN-10
1 something 12-JUN-10
SQL>
In the function it is necessary to instantiate the type with the columns, in order to avoid the ORA-00947 exception. This is not necessary when populating a PL/SQL table type:
SQL> create or replace procedure pop_table_a
2 (p_arg in number)
3 is
4 type table_a_nt is table of table_a%rowtype;
5 tab_a_recs table_a_nt;
6 begin
7 select *
8 bulk collect into tab_a_recs
9 from table_a
10 where col_a = p_arg;
11 end;
12 /
Procedure created.
SQL>
Finally, guidelines
"What should be the guidelines on When
to use and When to avoid GTT's"
Global temp tables are very good when we need share cached data between different program units in the same session. For instance if we have a generic report structure generated by a single function feeding off a GTT which is populated by one of several procedures. (Although even that could also be implemented with dynamic ref cursors ...)
Global temporary tables are also good if we have a lot of intermediate processing which is just too complicated to be solved with a single SQL query. Especially if that processing must be applied to subsets of the retrieved rows.
But in general the presumption should be that we don't need to use a temporary table. So
Do it in SQL unless it is too hard it which case ...
... Do it in PL/SQL variables (usually collections) unless it takes too much memory it which case ...
... Do it with a Global Temporary Table

Generally I'd use a PL/SQL collection for storing small volumes of data (maybe a thousand rows). If the data volumes were much larger, I'd use a GTT so that they don't overload the process memory.
So I might select a few hundred rows from the database into a PL/SQL collection, then loop through them to do some calculation/delete a few or whatever, then insert that collection into another table.
If I was dealing with hundreds of thousands of rows, I would try to push as much of the 'heavy lifting' processing into large SQL statements. That may or may not require GTT.
You can use SQL level collection objects as something that translates quite easily between SQL and PL/SQL
create type typ_car is object (make varchar2(10), model varchar2(20), year number(4));
/
create type typ_coll_car is table of typ_car;
/
select * from table (typ_coll_car(typ_car('a','b',1999), typ_car('A','Z',2000)));
MAKE MODEL YEAR
---------- -------------------- ---------------
a b 1,999.00
A Z 2,000.00
declare
v_car1 typ_car := typ_car('a','b',1999);
v_car2 typ_car := typ_car('A','Z',2000);
t_car typ_coll_car := typ_coll_car();
begin
t_car := typ_coll_car(v_car1, v_car2);
FOR i in (SELECT * from table(t_car)) LOOP
dbms_output.put_line(i.year);
END LOOP;
end;
/

Improving the performance of keeping the three recent records of each account query

I've a table in an Oracle (10g XE) database, and I'm going to clean it up and only keep the three recent records of each account. Here is what I'm doing right now:
CREATE TABLE ACCOUNT_TRANSACTION_TMP NOLOGGING AS SELECT * FROM ACCOUNT_TRANSACTION WHERE 1=2;
DECLARE
CURSOR mbsacc_cur (account_id_var account_transaction.account_id%TYPE) IS
SELECT * FROM account_transaction WHERE account_id = account_id_var ORDER BY transaction_time DESC;
account_transaction_rec account_transaction%ROWTYPE;
BEGIN
FOR i IN (SELECT DISTINCT(account_id) FROM account_transaction) LOOP
OPEN mbsacc_cur(i.account_id);
LOOP
FETCH mbsacc_cur INTO account_transaction_rec;
EXIT WHEN mbsacc_cur%NOTFOUND OR mbsacc_cur%ROWCOUNT > 3;
INSERT /*+ append */ INTO account_transaction_tmp VALUES account_transaction_rec;
END LOOP;
CLOSE mbsacc_cur;
END LOOP;
END;
/
And then I'll drop the old table, rename this new one to old one and add constraints.
But the problem is the above code runs forever (~3-4 hours) for about 1 million record which approximately half of them should be removed.
Is there any way to improve the performance of this?

Instead of creating an empty table and populating in an RBAR fashion create a table with the rows you want....
CREATE TABLE ACCOUNT_TRANSACTION_TMP NOLOGGING AS
SELECT account_id, col1, col2, col3, transaction_time from
( select at.*
, row_number()
over (partition by at.account_id
order by at.transaction_time desc) as to_keep
FROM ACCOUNT_TRANSACTION at)
where to_keep <= 3
/
Then skip straight to the renaming part of your plan.

You can do that with analytics (although I am not at all well versed in it myself). Take a look at this question, which seems to address a situation similar to yours:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1212501913138

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

optimizing a dup delete statement Oracle - oracle

Related

Oracle equivalent query for this postgress query - CONFLICT [duplicate]

Oracle PL/SQL Update statement looping forever - 504 Gateway Time-out

UPDATE on INSERT duplicate primary key in Oracle?

ways to avoid global temp tables in oracle

Improving the performance of keeping the three recent records of each account query

Categories

Resources