Best practices to purge millions of rows in Oracle

I need to delete 1 billion rows that are more than 10 years old from a table tblmail; for that I have created the procedure below.
I am doing the deletion in batches.
CREATE OR REPLACE PROCEDURE PURGE_Data AS
    batch_size              INTEGER := 1000;
    pvc_procedure_name      CONSTANT VARCHAR2(50) := 'Purge_data';
    pvc_info_message_num    CONSTANT NUMBER := 1;
    pvc_error_message_type  CONSTANT VARCHAR2(5) := 'ERROR';
    v_message               schema_mc.db_msg_log.message%TYPE;
    v_msg_num               schema_mc.db_msg_log.msg_num%TYPE;
    /*
    Purpose: Provide stored procedures to be used to purge unwanted archives.
    */
BEGIN
    DELETE FROM tblmail
     WHERE createdate_dts < (SYSDATE - INTERVAL '10' YEAR)
       AND ROWNUM <= batch_size;
    COMMIT;
EXCEPTION
    WHEN OTHERS THEN
        ROLLBACK;
        v_msg_num := SQLCODE;
        v_message := 'Error deleting from tblmail table';
        INSERT INTO error_log
            (date, num, type, source, mail)
        VALUES
            (SYSTIMESTAMP, v_msg_num, pvc_error_message_type, pvc_procedure_name, v_message);
        COMMIT;
END;
Do I need to use bulk collect and delete? What is the best way to do this?

As always in computing it depends. Provided that you have an index on createdate_dts your procedure should work, but how do you know when to stop calling it? I tend to use a loop:
loop
    delete /*+ first_rows */ from tblmail
     where createdate_dts < (SYSDATE - INTERVAL '10' YEAR)
       and ROWNUM <= batch_size;
    v_rows := SQL%ROWCOUNT;
    commit;
    exit when v_rows = 0;
end loop;
You could also return the number of deleted records if you want to keep the loop outside of the procedure. Without an index on createdate_dts it may be cheaper to collect the primary keys of the rows to delete in one pass first and then loop over them, deleting batch_size records per commit with BULK COLLECT or something similar. However, when possible it is always nice to use a simple solution! You may want to experiment a bit in order to find the best batch size. A complete version of the procedure using the loop could look something like the sketch below.
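For illustration, here is a minimal sketch of the whole thing with the loop folded into the procedure; the parameter names are invented, the error logging from your original version is left out for brevity, and the table and column names are the ones from the question:
CREATE OR REPLACE PROCEDURE purge_data (
    p_batch_size   IN  INTEGER DEFAULT 1000,   -- hypothetical parameter
    p_rows_deleted OUT NUMBER                  -- hypothetical parameter
) AS
    v_rows NUMBER;
BEGIN
    p_rows_deleted := 0;
    LOOP
        -- delete one batch of rows older than 10 years
        DELETE /*+ first_rows */ FROM tblmail
         WHERE createdate_dts < (SYSDATE - INTERVAL '10' YEAR)
           AND ROWNUM <= p_batch_size;
        v_rows := SQL%ROWCOUNT;
        p_rows_deleted := p_rows_deleted + v_rows;
        COMMIT;
        -- stop once a batch deletes nothing
        EXIT WHEN v_rows = 0;
    END LOOP;
END;
/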

Related

PLSQL Loop through query finds all the records

I'm facing a problem with a stored procedure. I'm trying to find out whether the transaction number in the temporary table is already in the final table: if not, it should insert the record; if it is in the final table, the record should go to a log_error table. Here's my SP:
BEGIN
DECLARE
date temporary_table.transfer_date%TYPE;
auth temporary_table.auth_code%TYPE;
transac_num temporary_table.transaction_number%TYPE;
card temporary_table.card_number%TYPE;
amount temporary_table.amount%TYPE;
num_trx_search NUMBER;
counter NUMBER;
sid1 NUMBER;
sid2 NUMBER;
loopcounter NUMBER;
BEGIN
cod_error := 0;
warning := 'execution';
OPEN vocursor FOR
SELECT transfer_date,
auth_code,
transaction_number,
card_number,
amount
FROM temporary_table order by id;
prfcursor := vocursor;
OPEN ntxcursor FOR
SELECT transaction_number FROM final_table order by id;
trxcursor := ntxcursor;
LOOP
FETCH prfcursor INTO date, auth, transac_num, card, amount;
EXIT WHEN prfcursor%NOTFOUND;
FETCH trxcursor INTO num_trx_search;
dbms_output.Put_line('NumTrx: ' || num_trx);
begin
-- i need to check if the transaction number from the temporary table is already in the
--final table
FOR loopcounter IN (Select id from final_table where transaction_number = transac_num)
LOOP
DBMS_OUTPUT.PUT_LINE(loopcounter.sid);
END LOOP;
dbms_output.Put_line('num_trx_search: ' || num_trx_search);
dbms_output.Put_line('counter: ' || counter);
exception
WHEN NO_DATA_FOUND THEN
dbms_output.Put_line('No Data found');
end;
EXIT WHEN trxcursor%NOTFOUND;
--just for testing and debuging
counter := 1;
IF(counter > 0) THEN
--inserts into log error table
ELSE
--inserts into final table
END IF;
END LOOP;
dbms_output.Put_line( 'end loop' );
CLOSE trxcursor;
CLOSE prfcursor;
dbms_output.Put_line( 'end cursor' );
END;
The thing is, it's returning all the results; for each record in the temporary table it should get just one row when the transaction number matches.
NumTrx is the transaction number in the temporary table.
I'm a noob in plsql, thanks
You could achieve the same thing by trying to insert the records from the temporary table into the final table and use the LOG ERRORS INTO clause to push those records that are already in final into another table.
INSERT INTO final_table final
SELECT transfer_date,
       auth_code,
       transaction_number,
       card_number,
       amount
  FROM temporary_table
   LOG ERRORS INTO ERR$_final_table REJECT LIMIT UNLIMITED;
The query above assumes that final_table and temporary_table have the same structure; if they are different you will need to adjust the query slightly. It also relies on final_table having a primary key or unique constraint on transaction_number, so that a duplicate raises an error and is routed to the error table rather than being inserted. Generally you should try to do as much of what you want in a single SQL statement rather than writing lots of procedural code to achieve the same thing. It is usually quicker and in this case appears to be much simpler.
For the set up of the ERR$ table I suggest you look at the Oracle docs under DML ERROR LOGGING.
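For reference, the error table can be created once beforehand with the DBMS_ERRLOG package; a minimal sketch, using the table names from the answer above:
BEGIN
    -- creates ERR$_FINAL_TABLE with the columns the LOG ERRORS clause needs
    DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'FINAL_TABLE');
END;
/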
If you do wish to do row-by-row (slow) processing then I would suggest using implicit cursor FOR loops instead, simply for readability; a rough sketch follows. Also, I don't think the ORDER BY on each cursor is going to do anything except slow your code down.
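A rough sketch of that cursor-FOR-loop version, assuming the column names from the question and a log_error table with the same columns as final_table (both assumptions, adjust to your schema):
BEGIN
    FOR r IN (SELECT transfer_date, auth_code, transaction_number, card_number, amount
                FROM temporary_table) LOOP
        DECLARE
            v_exists NUMBER;
        BEGIN
            -- is this transaction number already in the final table?
            SELECT COUNT(*) INTO v_exists
              FROM final_table
             WHERE transaction_number = r.transaction_number;

            IF v_exists > 0 THEN
                -- already present: log it instead of inserting
                INSERT INTO log_error (transfer_date, auth_code, transaction_number, card_number, amount)
                VALUES (r.transfer_date, r.auth_code, r.transaction_number, r.card_number, r.amount);
            ELSE
                INSERT INTO final_table (transfer_date, auth_code, transaction_number, card_number, amount)
                VALUES (r.transfer_date, r.auth_code, r.transaction_number, r.card_number, r.amount);
            END IF;
        END;
    END LOOP;
END;
/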

Bulk inserting in Oracle PL/SQL

I have around 5 million records which need to be copied from a table in one schema to a table in another schema (in the same database). I have prepared a script but it gives me the error below.
ORA-06502: PL/SQL: numeric or value error: Bulk bind: Error in define
Following is my script
DECLARE
TYPE tA IS TABLE OF varchar2(10) INDEX BY PLS_INTEGER;
TYPE tB IS TABLE OF SchemaA.TableA.band%TYPE INDEX BY PLS_INTEGER;
TYPE tD IS TABLE OF SchemaA.TableA.start_date%TYPE INDEX BY PLS_INTEGER;
TYPE tE IS TABLE OF SchemaA.TableA.end_date%TYPE INDEX BY PLS_INTEGER;
rA tA;
rB tB;
rD tD;
rE tE;
f number :=0;
BEGIN
SELECT col1||col2||col3 as main_col, band, effective_start_date as start_date, effective_end_date as end_date
BULK COLLECT INTO rA, rB, rD, rE
FROM schemab.tableb;
FORALL i IN rA.FIRST..rE.LAST
insert into SchemaA.TableA(main_col, BAND, user_type, START_DATE, END_DATE, roll_no)
values(rA(i), rB(i), 'C', rD(i), rE(i), 71);
f:=f+1;
if (f=10000) then
commit;
end if;
end;
Could you please help me in finding where the error lies?
Why not a simple
insert into SchemaA.TableA (main_col, BAND, user_type, START_DATE, END_DATE, roll_no)
SELECT col1||col2||col3 as main_col, band, 'C', effective_start_date, effective_end_date, 71
FROM schemab.tableb;
This
f:=f+1;
if (f=10000) then
commit;
end if;
does not make any sense. f becomes 1 - that's it. f=10000 will never be true, thus you don't make a COMMIT.
The following script worked for me and I was able to load around 5 million records within 15 minutes.
ALTER SESSION ENABLE PARALLEL DML
/
DECLARE
cursor c_p1 is
SELECT col1||col2||col3 as main_col, band, effective_start_date as start_date, effective_end_date as end_date
FROM schemab.tableb;
TYPE TY_P1_FULL is table of c_p1%rowtype
index by pls_integer;
v_P1_FULL TY_P1_FULL;
v_seq_num number;
BEGIN
open c_p1;
loop
fetch c_p1 BULK COLLECT INTO v_P1_FULL LIMIT 10000;
exit when v_P1_FULL.count = 0;
FOR i IN 1..v_P1_FULL.COUNT loop
INSERT /*+ APPEND */ INTO schemaA.tableA VALUES (v_P1_FULL(i));
end loop;
commit;
end loop;
close c_P1;
dbms_output.put_line('Load completed');
end;
-- Disable parallel mode for this session
ALTER SESSION DISABLE PARALLEL DML
/
ORA-06502: PL/SQL: numeric or value error: Bulk bind: Error in define
You get that error because you have a literal in the VALUES clause of the INSERT. The FORALL expects everything to be bound to an array.
Your program has a massive problem, literally. You have no LIMIT on the BULK COLLECT clause, so that's going to try to load all five million records from TableB into your collections. That will blow your session's memory limit.
The point of using BULK COLLECT and FORALL is to bite off chunks of a bigger data set and process it in batches. For that you need a loop. The loop has no FOR condition: instead test whether the fetch returned anything and exit when the array has zero records.
DECLARE
    TYPE recA IS RECORD (
        main_col   SchemaA.TableA.main_col%TYPE
      , band       SchemaA.TableA.band%TYPE
      , user_type  SchemaA.TableA.user_type%TYPE
      , start_date DATE
      , end_date   DATE
      , roll_no    NUMBER );
    TYPE recsA IS TABLE OF recA;
    nt_a recsA;
    f NUMBER := 0;
    CURSOR cur_b IS
        SELECT col1||col2||col3 AS main_col,
               band,
               'C' AS user_type,
               effective_start_date AS start_date,
               effective_end_date AS end_date,
               71 AS roll_no
          FROM schemab.tableb;
BEGIN
    OPEN cur_b;
    LOOP
        FETCH cur_b BULK COLLECT INTO nt_a LIMIT 1000;
        EXIT WHEN nt_a.COUNT() = 0;
        FORALL i IN 1..nt_a.COUNT
            INSERT INTO SchemaA.TableA (main_col, band, user_type, start_date, end_date, roll_no)
            VALUES nt_a(i);
        f := f + SQL%ROWCOUNT;
        IF f >= 10000 THEN
            COMMIT;
            f := 0;
        END IF;
    END LOOP;
    COMMIT;
    CLOSE cur_b;
END;
Please note that issuing commits inside a loop is contraindicated. You lay yourself open to runtime errors such as ORA-01002 and ORA-01555, and if your program does crash half-way through you will have great difficulty in resuming it cleanly. By all means persist if you have problems with UNDO tablespace, but the correct answer is to get the DBA to enlarge the UNDO tablespace, not to weaken your code.
"i am using bulk insert because it gives better performance"
It is true that BULK COLLECT and FORALL ... INSERT is more performant than a CURSOR FOR loop with row-by-row single inserts. It is not more efficient than a pure SQL INSERT INTO ... SELECT. The value of the construct is that it allows us to manipulate the contents of the array before inserting it. This is handy if we have complex business rules which can only be applied programmatically.
Please try after changing the first two lines of your code as below:
DECLARE
TYPE tA IS TABLE OF SchemaA.TableA.main_col%TYPE INDEX BY PLS_INTEGER;
...
...
This is most likely a data type/length mismatch: in the declaration section you declared tA as VARCHAR2(10) instead of inheriting its type from the table, and col1||col2||col3 does not fit in 10 characters.
Also, as mentioned, the f logic for the commit will not do the magic for you. Better to use LIMIT with BULK COLLECT.

Oracle PL/SQL how do you output how many inserts have been made in a FORALL statement

What's the best way of getting and outputting how many rows have been inserted in the FORALL statement I have below. I've seen the SQL%BULK_ROWCOUNT but I'm not sure how that would work in the below statement.
is it
DBMS_OUTPUT.('rows inserted '||SQL%BULK_ROWCOUNT||'');
Does the above need to go in another FORALL statement? For the code below how would I achieve this?
DECLARE
TYPE t_arc_act_plus_trigger1 IS TABLE OF arc_act_plus_triggers1%ROWTYPE;
v_arc_act_plus_triggers1 t_arc_act_plus_trigger1;
CURSOR c_arc_act_plus_triggers1 IS
SELECT /*+ PARALLEL */ apt.*
FROM act_plus_triggers1 apt
WHERE NOT EXISTS
(SELECT 1
FROM act_plus_triggers_copy1 aptc
WHERE aptc.surr_id = apt.surr_id)
AND apt.status IN ('EXT', 'EXP');
BEGIN
OPEN c_arc_act_plus_triggers1;
LOOP
FETCH c_arc_act_plus_triggers1 BULK COLLECT INTO v_arc_act_plus_triggers1 LIMIT 10000; -- limit to 10k to avoid out of memory
FORALL i IN 1..v_arc_act_plus_triggers1.COUNT
INSERT /*+ APPEND_VALUES */ INTO arc_act_plus_triggers1 values v_arc_act_plus_triggers1(i);
Com0932.get_parameter ('ACT_ARCHIVE_TRIGGER_STOP_YN',l_STOP_PROGRAM_YN);
IF l_STOP_PROGRAM_YN = 'Y' THEN
p_location('insert_into_arc_act_plus - STOP_PROGRAM_YN flag = '||l_STOP_PROGRAM_YN||' so ROLLBACK');
ROLLBACK;
EXIT;
END IF;
-- **************************************************
-- Output how many records have been inserted here???
-- **************************************************
-- commit after every 10000 records into arc_act_plus_triggers1
COMMIT;
EXIT WHEN c_arc_act_plus_triggers1%NOTFOUND;
END LOOP;
CLOSE c_arc_act_plus_triggers1;
END;
I haven't checked as I have nothing to test against so please forgive any 'missing semi-colon type errors' and I'm afraid I'm not in a position to performance check this.
Your code seems to select which rows to insert into the archive table based on their non-existence in the archive. Therefore simply use an INSERT based on a SELECT limited by a suitable ROWNUM value. Once you commit, the next time round the loop it won't try getting already-archived rows, as you just committed them.
I think this should be as quick, if not quicker, than bulkifying the inserts, with the advantage that it's simpler - Occam's Razor and all that.
DECLARE
    l_commit_count NUMBER := 10000;
    l_rows_copied  NUMBER := 0;
    l_batch_rows   NUMBER;
BEGIN
    DBMS_OUTPUT.PUT_LINE('Started at '||TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS'));
    LOOP
        INSERT /*+ APPEND */
          INTO arc_act_plus_triggers1
        SELECT /*+ PARALLEL */ apt.*
          FROM act_plus_triggers1 apt
         WHERE NOT EXISTS
               (SELECT 1
                  FROM arc_act_plus_triggers1 arc
                 WHERE arc.surr_id = apt.surr_id)
           AND apt.status IN ('EXT', 'EXP')
           AND ROWNUM < l_commit_count;
        l_batch_rows := SQL%ROWCOUNT;   -- capture before the COMMIT resets it
        COMMIT;
        l_rows_copied := l_rows_copied + l_batch_rows;
        EXIT WHEN l_batch_rows < 1;
    END LOOP;
    DBMS_OUTPUT.PUT_LINE('Finished at '||TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS'));
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(l_rows_copied)||' rows copied to the archive table');
END;
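For completeness, if you do keep the FORALL: SQL%BULK_ROWCOUNT is a collection indexed by the FORALL iteration, so it cannot be concatenated straight into a string as in the question; SQL%ROWCOUNT gives the total number of rows inserted, provided you read it right after the FORALL and before anything else that might execute SQL. A minimal sketch of what could go in the commented block in the question's loop (for a single-row VALUES insert each per-iteration count is simply 1):
-- total rows inserted by the FORALL just executed
DBMS_OUTPUT.PUT_LINE('rows inserted this batch: '||SQL%ROWCOUNT);
-- per-iteration counts, if ever needed (always 1 for a single-row insert)
FOR i IN 1..v_arc_act_plus_triggers1.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE('iteration '||i||': '||SQL%BULK_ROWCOUNT(i)||' row(s)');
END LOOP;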

ORACLE PL/SQL: The lost last chunk

Please help me to solve this ORACLE PL/SQL problem.
I have the following code:
BEGIN
DECLARE
P_COMMIT_STEP NUMBER := 10000; -- Commit every 10000 record copied
V_QUERY VARCHAR2 (4000) := NULL;
MY_CURSOR SYS_REFCURSOR;
TYPE FETCH_ARRAY IS TABLE OF MY_TABLE_BACKUP%ROWTYPE;
S_ARRAY FETCH_ARRAY;
BEGIN
V_QUERY := 'SELECT * FROM MY_TABLE_BACKUP';
OPEN MY_CURSOR FOR V_QUERY;
LOOP
FETCH MY_CURSOR
BULK COLLECT INTO S_ARRAY
LIMIT P_COMMIT_STEP;
FORALL I IN 1 .. S_ARRAY.COUNT
INSERT INTO MY_TABLE_BIS /*+ APPEND */
VALUES S_ARRAY (I);
COMMIT;
EXIT WHEN MY_CURSOR%NOTFOUND;
END LOOP;
CLOSE MY_CURSOR;
COMMIT;
END;
END;
Since the commit step is 10000, the copy only works for multiples of 10000 records.
So, if the original table has 1000010 records, only 1000000 records will be copied.
Where is the error?
The code seems correct, in my opinion.
Thank you very much for considering my request.
As noted in this article, you shouldn't rely on %NOTFOUND with bulk collect and forall. Check how many rows were fetched:
LOOP
FETCH MY_CURSOR
BULK COLLECT INTO S_ARRAY
LIMIT P_COMMIT_STEP;
FORALL I IN 1 .. S_ARRAY.COUNT
INSERT INTO MY_TABLE_BIS /*+ APPEND */
VALUES S_ARRAY (I);
COMMIT;
EXIT WHEN S_ARRAY.COUNT < P_COMMIT_STEP;
END LOOP;
Your first hundred iterations will get 10000 rows, so the count will equal your limit for all of those, and it will continue. The next one will get only 10 rows, so it will exit after the forall.
It should still work with the %NOTFOUND check at the end of the loop - the article is really talking about it being an issue if you use it to exit before processing the partial batch - and the documentation shows that pattern; but in some circumstances it seems not to. Having said that, I can't reproduce the issue from your code in 11.2.0.3.
Incidentally, committing inside a loop is generally a bad idea unless you've made the block restartable.
Hi, a small modification to your FORALL loop:
FORALL I IN S_ARRAY.first .. S_ARRAY.last
INSERT INTO MY_TABLE_BIS /*+ APPEND */
VALUES S_ARRAY (I);
I think it will work for you.

Given two scripts for deleting records in a table (6 million rows), which script will be better and why?

I have a table with 6 million records and I am running an archival script to delete around 5 million of them.
My script1 does the deletion, but the DBA said script1 is doing far more buffer gets and
recommended the approach in script2. I am confused about how script2 is better than script1.
So please review the scripts and advise which is the better approach and why.
script1 :
PROCEDURE archival_charging_txn(p_no_hrs IN NUMBER,
p_error_code OUT NUMBER,
p_error_msg OUT VARCHAR2) IS
v_sysdate DATE := SYSDATE - p_no_hrs / 24;
TYPE t_txn_id IS TABLE OF scg_charging_txn.txn_id%TYPE INDEX BY PLS_INTEGER;
v_txn_id t_txn_id;
CURSOR c IS
SELECT txn_id FROM scg_charging_txn WHERE req_time < v_sysdate; /* non unique index */
BEGIN
OPEN c;
LOOP
FETCH c BULK COLLECT
INTO v_txn_id LIMIT 10000;
IF v_txn_id.COUNT > 0 THEN
FORALL i IN v_txn_id.FIRST .. v_txn_id.LAST
DELETE FROM scg_charging_txn WHERE txn_id = v_txn_id(i); /* Primary key based */
END IF;
COMMIT;
EXIT WHEN c%NOTFOUND;
END LOOP;
CLOSE c;
COMMIT;
p_error_code := 0;
EXCEPTION
WHEN OTHERS THEN
p_error_code := 1;
p_error_msg := substr(SQLERRM, 1, 200);
END archival_charging_txn;
script 2:
PROCEDURE archival_charging_txn_W(p_no_hrs IN NUMBER,
p_error_code OUT NUMBER,
p_error_msg OUT VARCHAR2) IS
BEGIN
DELETE FROM scg_charging_txn WHERE req_time < SYSDATE - p_no_hrs / 24;
COMMIT;
p_error_code := 0;
EXCEPTION
WHEN OTHERS THEN
p_error_code := 1;
p_error_msg := substr(SQLERRM, 1, 200);
END archival_charging_txn_W;
The 1st script reads the table entries using the cursor and then deletes them row by row by primary key, hence the extra buffer gets, while the 2nd script just deletes the table entries in a single statement.
I would prefer the 2nd script. If you run into trouble with locking the table for too long, put the DELETE in a loop that repeats until SQL%NOTFOUND, with DELETE ... AND ROWNUM <= 10000; COMMIT; inside, as sketched below.
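A minimal sketch of that batched variant, using the predicate from script 2 (the batch size of 10000 is only an example):
-- inside archival_charging_txn_W, replacing the single DELETE
LOOP
    DELETE FROM scg_charging_txn
     WHERE req_time < SYSDATE - p_no_hrs / 24
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;   -- test before the COMMIT resets the attribute
    COMMIT;
END LOOP;
COMMIT;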
If you're deleting 5 million of 6 million rows, performance will suck.
You're better off doing CTAS (CREATE TABLE ... AS SELECT), i.e. copying the rows you want to keep into a new table.
Something like:
create table new_scg_charging_txn nologging as select * from scg_charging_txn WHERE req_time >= SYSDATE - p_no_hrs / 24;
If downtime is unacceptable, you may be able to do something similar, but wrap it with DBMS_REDEFINITION.
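For completeness, a sketch of the rest of the CTAS swap, assuming a short outage is acceptable and that you recreate indexes, constraints, grants and triggers on the new table afterwards (the 240-hour value is only a placeholder for p_no_hrs):
-- keep only the rows you want
CREATE TABLE new_scg_charging_txn NOLOGGING AS
SELECT * FROM scg_charging_txn WHERE req_time >= SYSDATE - 240 / 24;

-- swap the tables
ALTER TABLE scg_charging_txn RENAME TO old_scg_charging_txn;
ALTER TABLE new_scg_charging_txn RENAME TO scg_charging_txn;

-- after recreating indexes/constraints and verifying the data, drop the old table
DROP TABLE old_scg_charging_txn PURGE;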
