I have this PL/SQL script. I tested it on a test table with around 300 rows and it works perfectly fine. However, when I run it against the actual table, which has around 1M rows, it doesn't complete. I would like your suggestions on how I can optimise my script. I am new to PL/SQL, so any ideas or suggestions are a great help. :)
DECLARE
c_BGROUP PP_TRANCHE_RBS.BGROUP%TYPE := 'RBS';
l_start NUMBER;
/* Check for all entries where pt03d = pt04d+1. */
CURSOR c_pp_tranche IS
SELECT
refno,
pt04d,
seqno
FROM PP_TRANCHE_RBS a
WHERE a.BGROUP = c_BGROUP
AND a.pt03d = (SELECT (pt04d + 1)
FROM PP_TRANCHE_RBS
WHERE bgroup = a.bgroup
AND refno = a.refno
and seqno = a.seqno)
;
TYPE c_refno IS TABLE OF PP_TRANCHE_RBS.REFNO%TYPE;
TYPE c_pt04d IS TABLE OF PP_TRANCHE_RBS.PT04D%TYPE;
TYPE c_seqno IS TABLE OF PP_TRANCHE_RBS.SEQNO%TYPE;
t_refno c_refno;
t_pt04d c_pt04d;
t_seqno c_seqno;
BEGIN
DBMS_OUTPUT.put_line('Updating rows... ');
l_start := DBMS_UTILITY.get_time;
OPEN c_pp_tranche;
LOOP
FETCH c_pp_tranche BULK COLLECT INTO t_refno, t_pt04d, t_seqno LIMIT 10000; -- break the data into chunks of 10000 rows
EXIT WHEN t_refno.COUNT() = 0; -- collection method: exit when the last fetch returned no rows
FORALL i IN t_refno.FIRST .. t_refno.LAST
/* Update pt03d = pt04d */
UPDATE PP_TRANCHE_RBS
SET pt03d = t_pt04d(i)
WHERE
bgroup = c_BGROUP
AND refno = t_refno(i)
AND seqno = t_seqno(i)
;
-- Process contents of collection here.
DBMS_OUTPUT.put_line(t_refno.count || ' rows were updated');
END LOOP;
DBMS_OUTPUT.put_line('Bulk Updates Time: ' || (DBMS_UTILITY.get_time - l_start));
CLOSE c_pp_tranche;
END;
/
exit;
Equivalent pure SQL statement:
UPDATE PP_TRANCHE_RBS
SET pt03d = pt04d
WHERE bgroup = 'RBS'
and pt03d = pt04d + 1;
This will probably run faster than your procedural version. PL/SQL bulk processing is faster than row-by-row but it's usually slower than a single set-based operation. So save it for those times when you have complicated transformation logic which can only be handled procedurally.
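To compare the two approaches, the single statement can be wrapped in a small timing block. This is a minimal sketch that reuses the table, columns, and timing call from the question; it assumes `SET SERVEROUTPUT ON` and that one commit at the end is acceptable for your undo space.

```sql
DECLARE
  l_start PLS_INTEGER;
BEGIN
  l_start := DBMS_UTILITY.get_time;  -- elapsed time is measured in hundredths of a second

  -- One set-based statement instead of the fetch/FORALL loop.
  UPDATE PP_TRANCHE_RBS
     SET pt03d = pt04d
   WHERE bgroup = 'RBS'
     AND pt03d = pt04d + 1;

  -- SQL%ROWCOUNT must be read before COMMIT.
  DBMS_OUTPUT.put_line(SQL%ROWCOUNT || ' rows updated in '
                       || (DBMS_UTILITY.get_time - l_start) / 100 || ' s');
  COMMIT;
END;
/
```

If this is still slow, the execution plan of the UPDATE is the next thing to look at, since the procedural wrapper is no longer a factor.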
SQL%ROWCOUNT returns the count considered for the run (10), not the exact number of records updated. I expect SQL%ROWCOUNT to give the actual number of records updated. Please suggest how I can achieve this.
Code which triggers dynamic SQL
FORALL indx IN 1 .. l_account_data.COUNT --assume 10 as count
SAVE EXCEPTIONS
EXECUTE IMMEDIATE dynamic_sql_query USING l_account_data (indx);
DBMS_OUTPUT.put_line ('Successful UPDATE of '|| TO_CHAR (SQL%ROWCOUNT) || ' record');
COMMIT;
dynamic_sql_query
BEGIN
SELECT clmn_x, clmn_y
BULK COLLECT INTO l_subscr_data
FROM table_x e, table_y c
WHERE c.ref_id = :account_no AND e.account_no = c.account_no;
FORALL indx IN 1 .. l_subscr_data.COUNT
UPDATE table_z ciem --this update will update multiple records for each account
SET ciem.ext_id = ciem.sub_no || ROWID
WHERE ciem.sub_no = l_subscr_data (indx).clmn_x
AND ciem.subscr_no_resets = l_subscr_data (indx).clmn_y
AND ciem.status IN (1,2);
END;
Your outer execute immediate call isn't aware of what is happening inside the dynamic SQL; it doesn't know what it's doing, or how many rows it may or may not have affected.
To get an accurate count you would need to modify your dynamic statement to add something like:
FOR indx IN 1 .. l_subscr_data.COUNT LOOP
:total_count := :total_count + coalesce(SQL%BULK_ROWCOUNT(indx), 0);
END LOOP;
and change your outer call to (a) pass an extra IN OUT bind variable to track the total count, and (b) use a FOR LOOP rather than FORALL, because with FORALL the bind variable only seems to retain the value after the first dynamic call (not sure if that's documented, or a bug). So something like:
...
l_total_count number := 0;
...
FOR indx IN 1 .. l_account_data.COUNT LOOP
EXECUTE IMMEDIATE dynamic_sql_query
USING l_account_data (indx), in out l_total_count;
END LOOP;
DBMS_OUTPUT.put_line ('Successful UPDATE of '|| TO_CHAR (l_total_count) || ' record');
db<>fiddle demo with made-up data.
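For reference, here is how `SQL%BULK_ROWCOUNT` behaves after a plain (static) FORALL. This is a minimal sketch, not the code from the question: the table `demo_emp(dept_id NUMBER, salary NUMBER)` and the department ids are hypothetical and exist only for illustration.

```sql
DECLARE
  TYPE id_tab IS TABLE OF NUMBER;
  l_dept_ids id_tab := id_tab(10, 20, 30);  -- hypothetical ids
  l_total    PLS_INTEGER := 0;
BEGIN
  FORALL i IN 1 .. l_dept_ids.COUNT
    UPDATE demo_emp                         -- hypothetical table
       SET salary = salary * 1.1
     WHERE dept_id = l_dept_ids(i);

  -- SQL%BULK_ROWCOUNT(i) holds the rows affected by the i-th execution.
  FOR i IN 1 .. l_dept_ids.COUNT LOOP
    DBMS_OUTPUT.put_line('dept ' || l_dept_ids(i) || ': '
                         || SQL%BULK_ROWCOUNT(i) || ' rows');
    l_total := l_total + SQL%BULK_ROWCOUNT(i);
  END LOOP;

  DBMS_OUTPUT.put_line('total: ' || l_total || ' rows');
  COMMIT;
END;
/
```

Both attributes describe only the most recent SQL statement run from the current block, which is why the counts from a nested dynamic block have to be passed back out through a bind variable.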
I tinkered together the following PL/SQL BULK-COLLECT which works astonishingly fast for updates on huge tables (>50.000.000). The only problem is, that it does not perform the updates of the remaining < 5000 rows per table. 5000 is the given limit for the FETCH instruction:
DECLARE
-- source table cursor (only columns to be updated)
CURSOR base_table_cur IS
select a.rowid, TARGET_COLUMN from TARGET_TABLE a
where TARGET_COLUMN is null;
TYPE base_type IS
TABLE OF base_table_cur%rowtype INDEX BY PLS_INTEGER;
base_tab base_type;
-- new data
CURSOR new_data_cur IS
select a.rowid,
coalesce(b.SOURCE_COLUMN, 'FILL_VALUE'||a.JOIN_COLUMN) TARGET_COLUMN from TARGET_TABLE a
left outer join SOURCE_TABLE b
on a.JOIN_COLUMN=b.JOIN_COLUMN
where a.TARGET_COLUMN is null;
TYPE new_data_type IS TABLE OF new_data_cur%rowtype INDEX BY PLS_INTEGER;
new_data_tab new_data_type;
TYPE row_id_type IS TABLE OF ROWID INDEX BY PLS_INTEGER;
row_id_tab row_id_type;
TYPE rt_update_cols IS RECORD (
TARGET_COLUMN TARGET_TABLE.TARGET_COLUMN%TYPE
);
TYPE update_cols_type IS
TABLE OF rt_update_cols INDEX BY PLS_INTEGER;
update_cols_tab update_cols_type;
dml_errors EXCEPTION;
PRAGMA exception_init ( dml_errors,-24381 );
BEGIN
OPEN base_table_cur;
OPEN new_data_cur;
LOOP
FETCH base_table_cur BULK COLLECT INTO base_tab LIMIT 5000;
IF base_table_cur%notfound THEN
DBMS_OUTPUT.PUT_LINE('Nothing to update. Exiting.');
EXIT;
END IF;
FETCH new_data_cur BULK COLLECT INTO new_data_tab LIMIT 5000;
FOR i IN base_tab.first..base_tab.last LOOP
row_id_tab(i) := new_data_tab(i).rowid;
update_cols_tab(i).TARGET_COLUMN := new_data_tab(i).TARGET_COLUMN;
END LOOP;
FORALL i IN base_tab.first..base_tab.last SAVE EXCEPTIONS
UPDATE (SELECT TARGET_COLUMN FROM TARGET_TABLE)
SET row = update_cols_tab(i)
WHERE ROWID = row_id_tab(i);
COMMIT;
EXIT WHEN base_tab.count < 5000; -- changing to 1 didn't help!
END LOOP;
COMMIT;
CLOSE base_table_cur;
CLOSE new_data_cur;
EXCEPTION
WHEN dml_errors THEN
FOR i IN 1..SQL%bulk_exceptions.count LOOP
dbms_output.put_line('Some error occurred');
END LOOP;
END;
Where is my mistake? It looks correct to me though.
The problem is this line:
IF base_table_cur%notfound THEN
The cursor meets %NOTFOUND when the number of records found is less than the LIMIT value. So if the last fetch is not exactly 5000 those records won't be processed.
It's a common gotcha for people using BULK COLLECT ... LIMIT for the first time. The solution is to change the exit condition to
EXIT when base_tab.count() = 0;
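The behaviour is easy to reproduce without any real table. The sketch below uses a made-up 12-row cursor built from DUAL and a LIMIT of 5; the names `demo_cur` and `batch` are illustrative only. It assumes `SET SERVEROUTPUT ON`.

```sql
DECLARE
  -- Hypothetical 12-row source generated from DUAL; no real table is needed.
  CURSOR demo_cur IS
    SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 12;
  TYPE num_tab IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
  batch num_tab;
BEGIN
  OPEN demo_cur;
  LOOP
    FETCH demo_cur BULK COLLECT INTO batch LIMIT 5;
    EXIT WHEN batch.COUNT = 0;  -- safe exit test: stop only when nothing was fetched

    -- Report the batch size next to the cursor attribute.
    DBMS_OUTPUT.put_line('fetched ' || batch.COUNT || ' rows, %NOTFOUND is '
      || CASE WHEN demo_cur%NOTFOUND THEN 'TRUE' ELSE 'FALSE' END);
  END LOOP;
  CLOSE demo_cur;
END;
/
```

The batches come back as 5, 5 and 2 rows, and %NOTFOUND is already TRUE on the 2-row batch. An exit test on %NOTFOUND placed before the processing step would therefore skip those last 2 rows.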
"I need to ensure, that the base_table_cur is not empty and exit if it is. I'l get an error if it is empty"
The new_data_cur cursor includes the table which is selected in base_table_cur cursor. So I don't think you need the two loops. You need a simple test to see whether the first cursor returns something, then just loop round the second cursor.
I'm not entirely clear on your logic, so I have changed as little as possible to demonstrate the sort of structure I think you need. However, the UPDATE statement looks a little odd, so you may still run into issues.
OPEN base_table_cur;
FETCH base_table_cur BULK COLLECT INTO base_tab LIMIT 1;
if base_tab.count = 0 then
DBMS_OUTPUT.PUT_LINE('Nothing to update. Exiting.');
else
OPEN new_data_cur;
LOOP
FETCH new_data_cur BULK COLLECT INTO new_data_tab LIMIT 5000;
exit when new_data_tab.count() = 0;
FOR i IN 1..new_data_tab.count LOOP -- iterate over the batch just fetched, not base_tab (which now holds at most one row)
row_id_tab(i) := new_data_tab(i).rowid;
update_cols_tab(i).TARGET_COLUMN := new_data_tab(i).TARGET_COLUMN;
END LOOP;
FORALL i IN 1..new_data_tab.count SAVE EXCEPTIONS
UPDATE (SELECT TARGET_COLUMN FROM TARGET_TABLE)
SET row = update_cols_tab(i)
WHERE ROWID = row_id_tab(i);
END LOOP;
CLOSE new_data_cur;
end if;
COMMIT;
CLOSE base_table_cur;
I've got this PL/SQL procedure which runs for about 4-6 minutes:
DECLARE
i NUMBER := 0;
begin
for x in (select anumber
, position
, character
from sdc_positions_cip
where kind = 'Name')
loop
update sdc_compare_person dcip
set dcip.GESNAM_D = substr(dcip.GESNAM_D, 1, x.position - 1) || x.character ||
substr(dcip.GESNAM_D, x.position + 1, length(dcip.GESNAM_D) - x.position)
where dcip.sourcekey = x.anumber;
i := i + 1;
IF i > 100 THEN COMMIT;
i := 0;
END IF;
end loop;
commit;
end;
/
I've placed an index on dcip.sourcekey and x.anumber.
The tablespace that it's using is 10GB.
Is there a way to make this procedure (much) faster?
Your performance bottleneck is the loop. It forces your code to switch between the PL/SQL engine and the SQL engine for every single UPDATE statement.
To eliminate these context switches, you could probably use an UPDATE statement containing a subselect, but I prefer MERGE, for example like this:
merge into sdc_compare_person dcip
using (
select anumber, position, character
from sdc_positions_cip
where kind = 'Name'
) x
on (dcip.sourcekey = x.anumber)
when matched then update set
dcip.GESNAM_D = substr(dcip.GESNAM_D, 1, x.position - 1) ||
x.character ||
substr(dcip.GESNAM_D, x.position + 1, length(dcip.GESNAM_D) - x.position);
Another option would be to use BULK COLLECT INTO and FORALL to perform bulk selects and bulk updates. Given the limited complexity of your procedure, I strongly recommend using a single statement like mine.
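For completeness, the BULK COLLECT / FORALL alternative could look roughly like the sketch below. It reuses the tables and columns from the question; the batch size of 1000 and the names `pos_cur`, `pos_tab` and `l_pos` are arbitrary choices of mine. It assumes Oracle 11g or later, because earlier releases reject record fields inside FORALL with PLS-00436.

```sql
DECLARE
  CURSOR pos_cur IS
    SELECT anumber, position, character
      FROM sdc_positions_cip
     WHERE kind = 'Name';
  TYPE pos_tab IS TABLE OF pos_cur%ROWTYPE;
  l_pos pos_tab;
BEGIN
  OPEN pos_cur;
  LOOP
    FETCH pos_cur BULK COLLECT INTO l_pos LIMIT 1000;  -- arbitrary batch size
    EXIT WHEN l_pos.COUNT = 0;

    -- One context switch per batch instead of one per row.
    FORALL i IN 1 .. l_pos.COUNT
      UPDATE sdc_compare_person dcip
         SET dcip.GESNAM_D = substr(dcip.GESNAM_D, 1, l_pos(i).position - 1)
                             || l_pos(i).character
                             || substr(dcip.GESNAM_D, l_pos(i).position + 1,
                                       length(dcip.GESNAM_D) - l_pos(i).position)
       WHERE dcip.sourcekey = l_pos(i).anumber;
  END LOOP;
  CLOSE pos_cur;
  COMMIT;
END;
/
```

This keeps the per-row semantics of the original loop while cutting the number of context switches, but it is still more code than the single MERGE.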
You can also try this version:
update
(select dcip.GESNAM_D, x.position, x.character, dcip.sourcekey, x.anumber
from sdc_compare_person dcip
join sdc_positions_cip x on dcip.sourcekey = x.anumber
where x.kind = 'Name')
set GESNAM_D = substr(GESNAM_D, 1, position - 1) || character || substr(GESNAM_D, position + 1, length(GESNAM_D) - position);
What's the best way of getting and outputting how many rows have been inserted in the FORALL statement I have below. I've seen the SQL%BULK_ROWCOUNT but I'm not sure how that would work in the below statement.
is it
DBMS_OUTPUT.('rows inserted '||SQL%BULK_ROWCOUNT||'');
Does the above need to go in another FORALL statement? For the code below how would I achieve this?
DECLARE
TYPE t_arc_act_plus_trigger1 IS TABLE OF arc_act_plus_triggers1%ROWTYPE;
v_arc_act_plus_triggers1 t_arc_act_plus_trigger1;
CURSOR c_arc_act_plus_triggers1 IS
SELECT /*+ PARALLEL */ apt.*
FROM act_plus_triggers1 apt
WHERE NOT EXISTS
(SELECT 1
FROM act_plus_triggers_copy1 aptc
WHERE aptc.surr_id = apt.surr_id)
AND apt.status IN ('EXT', 'EXP');
BEGIN
OPEN c_arc_act_plus_triggers1;
LOOP
FETCH c_arc_act_plus_triggers1 BULK COLLECT INTO v_arc_act_plus_triggers1 LIMIT 10000; -- limit to 10k to avoid out of memory
FORALL i IN 1..v_arc_act_plus_triggers1.COUNT
INSERT /*+ APPEND_VALUES */ INTO arc_act_plus_triggers1 values v_arc_act_plus_triggers1(i);
Com0932.get_parameter ('ACT_ARCHIVE_TRIGGER_STOP_YN',l_STOP_PROGRAM_YN);
IF l_STOP_PROGRAM_YN = 'Y' THEN
p_location('insert_into_arc_act_plus - STOP_PROGRAM_YN flag = '||l_STOP_PROGRAM_YN||' so ROLLBACK');
ROLLBACK;
EXIT;
END IF;
-- **************************************************
-- Output how many records have been inserted here???
-- **************************************************
-- commit after every 10000 records into arc_act_plus_triggers1
COMMIT;
EXIT WHEN c_arc_act_plus_triggers1%NOTFOUND;
END LOOP;
CLOSE c_arc_act_plus_triggers1;
END;
I haven't checked as I have nothing to test against so please forgive any 'missing semi-colon type errors' and I'm afraid I'm not in a position to performance check this.
Your code seems to select which rows to insert into the archive table based on their non-existence in the archive. Therefore, simply use an INSERT based on a SELECT limited by a suitable ROWNUM value. Once you commit, the next pass round the loop won't try to fetch already archived rows, because you have just committed them.
I think this should be as quick as, if not quicker than, bulkifying the inserts, with the advantage that it's simpler (Occam's Razor and all that).
DECLARE
l_commit_count NUMBER := 10000;
l_rows_copied NUMBER := 0;
l_batch_rows NUMBER;
BEGIN
DBMS_OUTPUT.PUT_LINE('Started at '||TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS'));
LOOP
-- This relies on the NOT EXISTS subquery checking the table being loaded;
-- otherwise the same rows are selected again on every pass.
INSERT /*+ APPEND */
INTO arc_act_plus_triggers1
SELECT /*+ PARALLEL */ apt.*
FROM act_plus_triggers1 apt
WHERE NOT EXISTS
(SELECT 1
FROM act_plus_triggers_copy1 aptc
WHERE aptc.surr_id = apt.surr_id)
AND apt.status IN ('EXT', 'EXP')
AND rownum <= l_commit_count;
l_batch_rows := SQL%ROWCOUNT; -- capture before COMMIT, which resets SQL%ROWCOUNT
COMMIT;
l_rows_copied := l_rows_copied + l_batch_rows;
EXIT WHEN l_batch_rows < 1;
END LOOP;
DBMS_OUTPUT.PUT_LINE('Finished at '||TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS'));
DBMS_OUTPUT.PUT_LINE(TO_CHAR(l_rows_copied)||' rows copied to the archive table');
END;
create or replace procedure Proc_1(P_IN_TABLE_NAME VARCHAR2)
AS
CURSOR T_FACT
IS
SELECT T_ID,T_VER,D_T_ID
from O_T_FACT
where T_ID is not null
and T_VER is not null;
TYPE call_tab IS TABLE OF O_T_FACT%rowtype;
BEGIN
IF P_IN_TABLE_NAME ='G_FACT' THEN
OPEN T_FACT;
LOOP
EXIT WHEN T_FACT%NOTFOUND ;
FETCH T_FACT BULK COLLECT INTO call_data_rec LIMIT no_of_rec;
EXIT WHEN call_data_rec.count = 0;
FOR j IN 1..call_data_rec.COUNT
loop
UPDATE G_FACT GL set
GL.T_ID = call_data_rec(j).T_ID,
GL.T_VER =call_data_rec(j).T_VER,
GL.TRANS_FLAG='Y'
WHERE GL.G_T_ID = call_data_rec(j).D_T_ID
AND GL.T_ID IS NULL
AND GL.T_VER IS NULL;
rec_count := rec_count + 1;
if mod(rec_count,10000) = 0 then
commit;
end if;
end loop;
end loop;
CLOSE T_FACT;
END IF;
End;
This particular procedure is taking long time, is there any other way to write this? Can this be done in a single update statement?
As suggested below, I have tried FORALL, but it gives this error:
PLS-00436: implementation restriction: cannot reference fields of BULK In-BIND table of records
New Code with For all
create or replace procedure Proc_update_T_ID(P_IN_TABLE_NAME VARCHAR2)
AS
no_of_rec number := 1000;
CURSOR T_and_V_FACT
IS
SELECT O_T_FACT.T_ID, O_T_FACT.T_VER, O_T_FACT.Downstream_T_ID, G_FACT.rowid row_id,
From O_T_FACT, G_FACT
WHERE O_T_FACT.T_ID IS NOT NULL AND G_FACT.G_T_ID = O_T_FACT.Downstream_T_ID
AND T_VER is not null
AND G_FACT.T_VER IS NULL;
TYPE call_tab IS TABLE OF T_and_V_FACT%rowtype index by binary_integer;
call_data_rec call_tab;
BEGIN
IF P_IN_TABLE_NAME ='G_FACT' THEN
IF T_and_V_FACT%ISOPEN THEN
CLOSE T_and_V_FACT;
END IF;
open T_and_V_FACT;
LOOP
FETCH T_and_V_FACT BULK COLLECT
INTO call_data_rec LIMIT no_of_rec;
FORALL j IN call_data_rec.FIRST .. call_data_rec.LAST
UPDATE G_FACT GL set
GL.T_ID = call_data_rec(j).T_ID,
GL.T_VER =call_data_rec(j).T_VER,
GL.TRANS_FLAG='Y'
WHERE GL.rowid = call_data_rec(j).row_id;
COMMIT;
call_data_rec.DELETE;
EXIT WHEN T_and_V_FACT%NOTFOUND;
END LOOP;
CLOSE T_and_V_FACT;
End if;
END Proc_1;
It looks to me like you could rewrite this into a single statement like so:
update G_FACT GL
set (GL.T_ID, GL.T_VER, GL.TRANS_FLAG) =
(select T_ID,T_VER, 'Y'
from O_T_FACT F
where F.T_ID is not null
and F.T_VER is not null
and GL.G_T_ID = F.D_T_ID)
where exists (select null
from O_T_FACT F
where F.T_ID is not null
and F.T_VER is not null
and GL.G_T_ID = F.D_T_ID)
and GL.T_ID is null
and GL.T_VER is null;
If this doesn't work, then you should be able to get significant gains by converting your for loop into a forall statement:
FORALL j IN 1..call_data_rec.COUNT
UPDATE G_FACT GL set
GL.T_ID = call_data_rec(j).T_ID,
GL.T_VER =call_data_rec(j).T_VER,
GL.TRANS_FLAG='Y'
WHERE GL.G_T_ID = call_data_rec(j).D_T_ID
AND GL.T_ID IS NULL
AND GL.T_VER IS NULL;
Also, rethink whether you need that commit in the loop. Including this will:
Slow your processing down
Increase the chance of hitting ORA-1555s
Possibly leave your data in an inconsistent state
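If the FORALL version fails with PLS-00436 (as reported in the question), one common workaround on older releases is to fetch each column into its own scalar collection, so that FORALL never references a record field. A minimal sketch under that assumption, reusing the tables and columns from the question; the collection names and the batch size of 1000 are mine:

```sql
DECLARE
  CURSOR T_FACT IS
    SELECT T_ID, T_VER, D_T_ID
      FROM O_T_FACT
     WHERE T_ID IS NOT NULL
       AND T_VER IS NOT NULL;
  -- One scalar collection per column instead of a table of records.
  TYPE t_id_tab   IS TABLE OF O_T_FACT.T_ID%TYPE;
  TYPE t_ver_tab  IS TABLE OF O_T_FACT.T_VER%TYPE;
  TYPE d_t_id_tab IS TABLE OF O_T_FACT.D_T_ID%TYPE;
  l_t_id   t_id_tab;
  l_t_ver  t_ver_tab;
  l_d_t_id d_t_id_tab;
BEGIN
  OPEN T_FACT;
  LOOP
    FETCH T_FACT BULK COLLECT INTO l_t_id, l_t_ver, l_d_t_id LIMIT 1000;
    EXIT WHEN l_t_id.COUNT = 0;

    FORALL j IN 1 .. l_t_id.COUNT
      UPDATE G_FACT GL
         SET GL.T_ID       = l_t_id(j),
             GL.T_VER      = l_t_ver(j),
             GL.TRANS_FLAG = 'Y'
       WHERE GL.G_T_ID = l_d_t_id(j)
         AND GL.T_ID IS NULL
         AND GL.T_VER IS NULL;
  END LOOP;
  CLOSE T_FACT;
  COMMIT;  -- single commit at the end, in line with the advice above
END;
/
```

The exit test uses the collection count rather than %NOTFOUND, so a final batch smaller than the limit is still processed.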
I think you should change your CURSOR and issue the UPDATE outside of a row-by-row loop.
I also prefer using ROWID in update statements, LIMIT to control the fetch size of cursors, and FORALL for maximum performance and better memory management.
We need to define a new type:
CREATE TYPE my_rec AS OBJECT
( T_ID NUMBER
, T_VER number
, row_id UROWID
);
Then using Proc_1 may serve:
create or replace procedure Proc_1(P_IN_TABLE_NAME VARCHAR2)
AS
no_of_rec number := 1000;
CURSOR T_and_V_FACT
IS
SELECT O_T_FACT.T_ID, O_T_FACT.T_VER, O_T_FACT.D_T_ID, G_FACT.rowid row_id
From O_T_FACT, G_FACT
WHERE O_T_FACT.T_ID IS NOT NULL AND G_FACT.G_T_ID = O_T_FACT.D_T_ID
AND O_T_FACT.T_VER is not null
-- AND G_FACT.T_ID IS NULL -- not required
AND G_FACT.T_VER IS NULL;
TYPE call_tab IS TABLE OF T_and_V_FACT%rowtype index by binary_integer;
call_data_rec call_tab;
BEGIN
IF P_IN_TABLE_NAME ='G_FACT' THEN
IF T_and_V_FACT%ISOPEN THEN
CLOSE T_and_V_FACT;
END IF;
open T_and_V_FACT;
LOOP
FETCH T_and_V_FACT BULK COLLECT
INTO call_data_rec LIMIT no_of_rec;
FORALL j IN call_data_rec.FIRST .. call_data_rec.LAST
UPDATE G_FACT GL set
GL.T_ID = TREAT(call_data_rec(j) AS my_rec).T_ID,
GL.T_VER =TREAT(call_data_rec(j) AS my_rec).T_VER,
GL.TRANS_FLAG='Y'
WHERE GL.rowid = TREAT(call_data_rec(j) AS my_rec).row_id;
COMMIT;
call_data_rec.DELETE;
EXIT WHEN T_and_V_FACT%NOTFOUND;
END LOOP;
CLOSE T_and_V_FACT;
End if;
END Proc_1;
I have edited some parts based on @Ben's comment.
I also made some changes based on versions 9i and 10g.
Restriction of using TREAT has been removed in 11g