I have a table table_name with PK (id1 VARCHAR2(10), id2 NUMBER, id3 NUMBER).
The table only grows by about 1 million records each year.
The problem is that in procedure A, I must DELETE one row from table_name if it exists.
Procedure A is run very frequently by many users. The number of actual deletes (deleting an existing row) is very small compared to the number of times procedure A will try to delete a row.
So what is better performance-wise: just delete the row (solution 1), or check if it exists and then delete it (solution 2)?
What about row locks for each solution? Will DELETE lock the row in procedure A's transaction if the row does not exist?
Solution 1
DELETE table_name
WHERE id1 = p_id1
AND id2 = p_id2
AND id3 = p_id3;
Solution 2
SELECT COUNT(*) INTO l_check_exists
FROM table_name
WHERE id1 = p_id1
AND id2 = p_id2
AND id3 = p_id3;
IF l_check_exists <> 0
THEN
DELETE table_name
WHERE id1 = p_id1
AND id2 = p_id2
AND id3 = p_id3;
END IF;
Just delete the row. To delete it, Oracle first has to find it, which is essentially the same as performing a select statement (but without returning data). If Oracle doesn't find it, it won't delete it of course. There is no scenario in which checking first yourself via a SELECT statement will be quicker.
Also, if a record does not exist, it cannot be locked either, obviously.
Oracle cannot lock a row if the row does not exist. There is simply nothing to lock. :)
In the worst case the second solution takes twice as long; but if the try-to-success ratio is as small as you say, I'd guess performance would be roughly equal in both cases.
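If the caller needs to know whether anything was actually removed, a minimal sketch (using the same p_id1/p_id2/p_id3 parameters as above) is to delete unconditionally and inspect SQL%ROWCOUNT afterwards:
DELETE FROM table_name
 WHERE id1 = p_id1
   AND id2 = p_id2
   AND id3 = p_id3;

IF SQL%ROWCOUNT = 0 THEN
  NULL; -- no row matched: nothing was deleted and no row lock was ever taken
END IF;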
I am trying to use the following statement for the delete process, and it has to delete around 23,566,424 rows, but Oracle takes almost 3 hours to complete it. We have already created an index on SCHEDULE_DATE_KEY, but the process is still very slow. Can someone advise on how to make deletes faster in Oracle?
DELETE
FROM
EDWSOURCE.SCHEDULE_DAY_F
WHERE
SCHEDULE_DATE_KEY >
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
I don't think any index will help here; Oracle will probably decide the best approach is a full table scan to delete 20M rows out of 300M. It is deleting at a rate of over 2,000 rows per second, which isn't bad. In fact, any additional indexes will slow it down, as the corresponding index entry has to be deleted for every row as well.
A quicker approach could be to create a new table of the rows you want to keep, something like:
create table EDWSOURCE.SCHEDULE_DAY_F_KEEP
as
select * from EDWSOURCE.SCHEDULE_DAY_F
where SCHEDULE_DATE_KEY <=
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
Then recreate any constraints and indexes to use the new table.
Finally drop the old table and rename the new one.
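The final swap could look something like this (a sketch only; the _OLD name is my own placeholder):
ALTER TABLE EDWSOURCE.SCHEDULE_DAY_F RENAME TO SCHEDULE_DAY_F_OLD;
ALTER TABLE EDWSOURCE.SCHEDULE_DAY_F_KEEP RENAME TO SCHEDULE_DAY_F;
-- once the new table has been verified:
DROP TABLE EDWSOURCE.SCHEDULE_DAY_F_OLD PURGE;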
You can try testing a filtered table move. This has an ONLINE clause, so you can do it while the application is still running.
Note that in 12.2 and later the indexes will remain valid. In earlier versions you will need to rebuild the indexes, as they will become invalid. Good luck!
Move a Table
Create and populate a new test table.
DROP TABLE t1 PURGE;
CREATE TABLE t1 AS
SELECT level AS id,
'Description for ' || level AS description
FROM dual
CONNECT BY level <= 100;
COMMIT;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS MIN_ID MAX_ID
---------- ---------- ----------
100 1 100
SQL>
Move the table, filtering out rows with an ID value greater than 50.
ALTER TABLE t1 MOVE ONLINE
INCLUDING ROWS WHERE id <= 50;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS MIN_ID MAX_ID
---------- ---------- ----------
50 1 50
SQL>
The rows with an ID value between 51 and 100 have been removed.
As mentioned above, it may be best to PARTITION the table and drop a partition every N days as part of a daily task.
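Purely as an illustration, assuming the table were range-partitioned by SCHEDULE_DATE_KEY (the partition name here is hypothetical):
ALTER TABLE EDWSOURCE.SCHEDULE_DAY_F
  DROP PARTITION schedule_day_p_2020_01
  UPDATE GLOBAL INDEXES;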
Welcome, Oracle pros!
In an Oracle 12 database (an upgrade is already scheduled ;-)) we have a setup where different tables update a common base table via "after update" triggers, like the following:
Search_Flat
ID
Field_A
Field_B
Field_C
Now table1 contains n columns, of which, let's say, 2 are relevant for the Search_Flat table. As an update of table1 may affect only columns not relevant for Search_Flat, we want to add checks to the trigger. Our first approach is the following:
CREATE OR REPLACE TRIGGER tr_tbl_1_au_search
AFTER UPDATE OF
field_a,
field_b
ON schemauser.table1
FOR EACH ROW
BEGIN
IF :new.field_a <> :old.field_a THEN
UPDATE schemauser.search_flat SET field_a = :new.field_a WHERE id = :new.ID;
END IF;
IF :new.field_b <> :old.field_b THEN
UPDATE schemauser.search_flat SET field_b = :new.field_b WHERE id = :new.ID;
END IF;
END;
Alternatively, we could also set up the trigger like the following:
CREATE OR REPLACE TRIGGER tr_tbl_1_au_search
AFTER UPDATE OF
field_a,
field_b
ON schemauser.table1
FOR EACH ROW
BEGIN
IF :new.field_a <> :old.field_a OR :new.field_b <> :old.field_b THEN
UPDATE schemauser.search_flat
SET field_a = :new.field_a,
field_b = :new.field_b
WHERE id = :new.ID;
END IF;
END;
The question now is about the setup of the triggers themselves. Which approach is better with respect to:
locking time of search_flat rows
overall performance of affected components (i.e., table_1, trigger and search_flat)
In production we are talking about 4 tables with 10 fields each considered in the triggers, and we have independent app servers accessing the shared database, updating the 4 tables simultaneously. From time to time we detect the following error, which is the reason we want to optimize the triggers:
ORA-02049: timeout: distributed transaction waiting for lock
Side note: this setup was chosen instead of a view or materialized view for performance reasons, as the base table is used in a GUI with the requirement to be instantly updated, and the number of records in the 4 feeding tables is too high for updating a materialized view on every change.
I'm looking forward to the discussion and your thoughts.
As I understand your post, you have 4 live tables (called "table1", "table2", etc.) that you want to search on, but querying from them is too slow, so you want to maintain a single, flattened table to search on instead and have triggers to keep that flattened table always up-to-date.
You want to know which of two trigger approaches is better.
I think the answer is "neither", since both are prone to deadlocks. Imagine this scenario
User 1 -
UPDATE table1
SET field_a = 500
WHERE <condition affecting 200 distinct IDs>
User 2 at about the same time -
UPDATE table1
SET field_b = 700
WHERE <condition affecting 200 distinct IDs>
Triggers start processing. You cannot control the order in which the rows are updated. Maybe it goes like this:
User 1's trigger, time index 100 ->
UPDATE search_flat SET field_a = 500 WHERE id = 90;
User 2's trigger, time index 101 ->
UPDATE search_flat SET field_b = 700 WHERE id = 91;
User 1's trigger, time index 102 ->
UPDATE search_flat SET field_a = 500 WHERE id = 91; (waits on user 2's session)
User 2's trigger, time index 103 ->
UPDATE search_flat SET field_b = 700 WHERE id = 90; (deadlock error)
User 2's original update fails and rolls back.
You have multiple concurrent processes all updating the same set of rows in search_flat with no control over the processing order. That is a recipe for deadlocks.
If you want to do this safely, you should use neither of the FOR EACH ROW trigger approaches you outlined. Rather, make a compound trigger to do this.
Here's some sample code to illustrate the idea. Be sure to read the comments.
-- Aside: consider setting this at the system level if on 12.2 or later
-- alter system set temp_undo_enabled=false;
CREATE GLOBAL TEMPORARY TABLE table1_updates_gtt (
id NUMBER,
field_a VARCHAR2(80),
field_b VARCHAR2(80)
) ON COMMIT DELETE ROWS;
CREATE GLOBAL TEMPORARY TABLE table2_updates_gtt (
id NUMBER,
field_a VARCHAR2(80)
) ON COMMIT DELETE ROWS;
-- .. so on for table3 and 4.
CREATE OR REPLACE TRIGGER table1_search_maint_trg
FOR INSERT OR UPDATE OR DELETE ON table1 -- with similar compound triggers for table2, 3, 4.
COMPOUND TRIGGER
AFTER EACH ROW IS
BEGIN
-- Update the table-1 specific GTT with the changes.
CASE WHEN INSERTING OR UPDATING THEN
-- Assumes ID is an immutable primary key.
INSERT INTO table1_updates_gtt (id, field_a, field_b) VALUES (:new.id, :new.field_a, :new.field_b);
WHEN DELETING THEN
INSERT INTO table1_updates_gtt (id, field_a, field_b) VALUES (:old.id, null, null); -- or figure out what you want to do about deletes.
END CASE;
END AFTER EACH ROW;
AFTER STATEMENT IS
BEGIN
-- Write the data from the GTT to the search_flat table.
-- NOTE: The ORDER BY in the next line is what saves us from deadlocks.
FOR r IN ( SELECT id, field_a, field_b FROM table1_updates_gtt ORDER BY id ) LOOP
-- TODO: replace with BULK processing for better performance, if DMLs can affect a lot of rows
UPDATE search_flat sf
SET sf.field_a = r.field_a,
sf.field_b = r.field_b
WHERE sf.id = r.id
AND ( sf.field_a <> r.field_a
OR (sf.field_a IS NULL AND r.field_a IS NOT NULL)
OR (sf.field_a IS NOT NULL AND r.field_a IS NULL)
OR sf.field_b <> r.field_b
OR (sf.field_b IS NULL AND r.field_b IS NOT NULL)
OR (sf.field_b IS NOT NULL AND r.field_b IS NULL)
);
END LOOP;
END AFTER STATEMENT;
END table1_search_maint_trg;
Also, as numerous commenters have pointed out, it's probably better to use a materialized view for this. If you are on 12.2 or later, real-time materialized views (aka "ENABLE ON QUERY COMPUTATION") offer a lot of promise for this sort of thing. No COMMIT overhead to your application and real-time search results. It's just that search time degrades slightly if there are a lot of recent updates to the underlying tables.
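A rough sketch of that option, with the table and column names assumed from this question (check the fast-refresh requirements for your real tables before relying on this):
CREATE MATERIALIZED VIEW LOG ON table1
  WITH PRIMARY KEY, ROWID (field_a, field_b)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW search_flat_mv
  REFRESH FAST ON DEMAND
  ENABLE ON QUERY COMPUTATION
AS
SELECT id, field_a, field_b
  FROM table1;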
I have a requirement to insert a row number in a table based on the value already present in the table. For example, the max ROW_NBR record currently in the table is something like this:
+----------+----------+------------+---------+
| FST_NAME | LST_NAME | STATE_CODE | ROW_NBR |
+----------+----------+------------+---------+
| John | Doe | 13 | 123 |
+----------+----------+------------+---------+
Now I need to insert more records with given FST_NAME and LST_NAME values. ROW_NBR needs to be generated while inserting the data into the table, with values auto-incrementing from 123.
I can't use a sequence, as my loading process is not the only process that inserts data into this table. And I can't use a cursor either, because with the high volume of data the TEMP space fills up quickly. I'm inserting data as given below:
insert into final_table
( fst_name,lst_name,state_code)
(select * from staging_table
where state_code=13);
Any ideas how to implement this?
It sounds like other processes are finding the current maximum row_nbr value and incrementing it as they do single-row inserts in a cursor loop.
You could do something functionally similar, either finding the maximum in advance and incrementing it (if you're already running this in a PL/SQL block):
insert into final_table (fst_name, lst_name, state_code, row_nbr)
select st.*, variable_holding_maximum + rownum
from staging_table st
where st.state_code=13;
or by querying the table as part of the query, which doesn't need PL/SQL:
insert into final_table (fst_name, lst_name, state_code, row_nbr)
select st.*, (select max(row_nbr) from final_table) + rownum
from staging_table st
where st.state_code=13;
db<>fiddle
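For completeness, the PL/SQL wrapper for the first option might look like this sketch (variable_holding_maximum is the placeholder used above; the staging columns are listed explicitly, as recommended below):
declare
  variable_holding_maximum final_table.row_nbr%TYPE;
begin
  select nvl(max(row_nbr), 0)
    into variable_holding_maximum
    from final_table;

  insert into final_table (fst_name, lst_name, state_code, row_nbr)
  select st.fst_name, st.lst_name, st.state_code,
         variable_holding_maximum + rownum
    from staging_table st
   where st.state_code = 13;
end;
/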
But this isn't a good solution because it doesn't prevent clashes between different processes and sessions trying to insert at the same time; then again, neither would the cursor-loop approach, unless it catches unique-constraint errors and re-attempts with a new value, perhaps.
It would be better to use a sequence. (An identity/auto-increment column would be another option, but you said you can't change the table structure, and you need to let the other processes continue to work without modification.) You can still do that with a sequence-and-trigger approach, having the trigger always set the row_nbr value from the sequence, regardless of whether the insert statement supplied a value.
If you create a sequence that starts from the current maximum, with something like:
create sequence final_seq start with <current max + 1>
or without manually finding it:
declare
start_with pls_integer;
begin
select nvl(max(row_nbr), 0) + 1 into start_with from final_table;
execute immediate 'create sequence final_seq start with ' || start_with;
end;
/
then your trigger could just be:
create trigger final_trig
before insert on final_table
for each row
begin
:new.row_nbr := final_seq.nextval;
end;
/
Then your insert ... select statement doesn't need to supply or even think about the row_nbr value, so you can leave it as you have it now (except I'd avoid select * even in that construct, and list the staging table columns explicitly); and any existing inserts that do supply the row_nbr don't need to be modified and the value they supply will just be overwritten from the sequence.
db<>fiddle showing inserts with and without row_nbr specified.
I have an application that uses the Oracle MERGE INTO... DML statement to update table A to correspond with some of the changes in another table B (table A is a summary of selected parts of table B along with some other info). In a typical merge operation, 5-6 rows (out of 10's of thousands) might be inserted in table B and 2-3 rows updated.
It turns out that the application is to be deployed in an environment that has a security policy on the target tables. The MERGE INTO... statement can't be used with these tables (ORA-28132: Merge into syntax does not support security policies)
So we have to change the MERGE INTO... logic to use regular inserts and updates instead. Is this a problem anyone else has run into? Is there a best-practice pattern for converting the WHEN MATCHED/WHEN NOT MATCHED logic in the merge statement into INSERT and UPDATE statements? The merge is within a stored procedure, so it's fine for the solution to use PL/SQL in addition to the DML if that is required.
Another way to do this (other than MERGE) is to use two SQL statements: one for the insert and one for the update. The "WHEN MATCHED" and "WHEN NOT MATCHED" logic can be handled using joins or an "IN" clause.
If you take the approach below, it is better to run the update first (since it only runs for the matching records) and then insert the non-matching records. The resulting data set is the same either way; this order just updates fewer records.
Also, like the MERGE, this update statement updates the NAME column even if the names in source and target already match. If you don't want that, add that condition to the WHERE clause as well.
create table src_table(
id number primary key,
name varchar2(20) not null
);
create table tgt_table(
id number primary key,
name varchar2(20) not null
);
insert into src_table values (1, 'abc');
insert into src_table values (2, 'def');
insert into src_table values (3, 'ghi');
insert into tgt_table values (1, 'abc');
insert into tgt_table values (2,'xyz');
SQL> select * from Src_Table;
ID NAME
---------- --------------------
1 abc
2 def
3 ghi
SQL> select * from Tgt_Table;
ID NAME
---------- --------------------
2 xyz
1 abc
Update tgt_Table tgt
set Tgt.Name =
(select Src.Name
from Src_Table Src
where Src.id = Tgt.id
)
where exists
(select 1
from Src_Table Src
where Src.id = Tgt.id
); -- without this WHERE EXISTS, target rows with no match would be updated to NULL
2 rows updated. -- Notice that ID 1 is updated even though its value did not change
select * from Tgt_Table;
ID NAME
----- --------------------
2 def
1 abc
insert into tgt_Table
select src.*
from Src_Table src,
tgt_Table tgt
where src.id = tgt.id(+)
and tgt.id is null;
1 row created.
SQL> select * from tgt_Table;
ID NAME
---------- --------------------
2 def
1 abc
3 ghi
commit;
There could be better ways to do this, but this one seems simple and SQL-oriented. And if the data set is large, a PL/SQL solution won't be as performant.
There are at least two options I can think of, aside from digging into the security policy (which I don't know much about).
Process the records to merge row by row: attempt the update, and if it matches no rows, insert (or vice versa, depending on whether you expect most records to need updating or inserting, i.e. optimize for the most common case to reduce the number of SQL statements fired), e.g.:
begin
  for r in (select id, name from source_table) loop -- id/name are assumed columns
    update table_to_be_merged t
       set t.name = r.name
     where t.id = r.id;

    if sql%rowcount = 0 then -- no row matched, so we need to insert
      insert into table_to_be_merged (id, name)
      values (r.id, r.name);
    end if;
  end loop;
end;
/
Another option may be to bulk collect the records you want to merge into an array and then attempt to bulk insert them, catching the primary key exceptions (a FORALL insert with SAVE EXCEPTIONS collects the rows that fail to insert so you can process them afterwards).
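A sketch of that idea using FORALL ... SAVE EXCEPTIONS; the id/name columns are assumptions for illustration, and table_to_be_merged is taken to have exactly those two columns:
DECLARE
  TYPE t_rows IS TABLE OF table_to_be_merged%ROWTYPE;
  l_rows      t_rows;
  bulk_errors EXCEPTION;
  PRAGMA EXCEPTION_INIT(bulk_errors, -24381); -- ORA-24381: error(s) in array DML
BEGIN
  SELECT id, name BULK COLLECT INTO l_rows FROM source_table;

  BEGIN
    FORALL i IN 1 .. l_rows.COUNT SAVE EXCEPTIONS
      INSERT INTO table_to_be_merged VALUES l_rows(i);
  EXCEPTION
    WHEN bulk_errors THEN
      -- The failing indexes are in SQL%BULK_EXCEPTIONS; update those rows instead.
      FOR j IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
        UPDATE table_to_be_merged t
           SET t.name = l_rows(SQL%BULK_EXCEPTIONS(j).ERROR_INDEX).name
         WHERE t.id = l_rows(SQL%BULK_EXCEPTIONS(j).ERROR_INDEX).id;
      END LOOP;
  END;
END;
/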
Logically a merge statement has to check for the presence of each record behind the scenes anyway, and I think it is processed quite similarly to the code I posted above. However, MERGE will always be more efficient than coding it in PL/SQL, as it is only one SQL call instead of many.
Question 1
Can anyone tell me if there is any difference between following 2 update statements:
UPDATE TABA SET COL1 = '123', COL2 = '456' WHERE TABA.PK = 1
UPDATE TABA SET COL1 = '123' WHERE TABA.PK = 1
where the original value of COL2 = '456'
how does this affect the UNDO?
Question 2
What about if I update a record in table TABA using a %ROWTYPE record, like the following snippet?
How's the performance, and how does it affect the UNDO?
declare
  SampleRT TABA%rowtype;
begin
  SELECT * INTO SampleRT FROM TABA WHERE PK = 1;
  SampleRT.COL2 := '111';
  UPDATE TABA SET ROW = SampleRT WHERE PK = SampleRT.PK;
end;
thanks
Is your question 1 asking whether UNDO (and REDO) is generated when you're running an UPDATE against a row but not actually changing the value?
Something like?
update taba set col2='456' where col2='456';
If this is the question, then the answer is that UNDO (and REDO) is generated even if you're updating a column to the same value.
(An exception is when you're updating a NULL column to NULL - this doesn't generate any REDO).
For Question 1:
The outcome of the two UPDATEs for rows in your table where PK=1 and COL2='456' is identical. (That is, each such row will have its COL1 value set to '123'.)
Note: there may be rows in your table with PK=1 and COL2 <> '456'. The outcome of the two statements for these rows will be different. Both statements will alter COL1, but only the first will alter the value in COL2; the second will leave it unchanged.
For question 1:
There can be a difference as triggers can fire depending on which columns are updated. Even if you are updating column_a to the same value, the trigger will fire.
The UNDO shouldn't be different, because if you expand or shrink the length of a variable-length column (e.g. VARCHAR2 or NUMBER), all the rest of the bytes of the record need to be shuffled along too.
If the columns don't change size, then you MAY get a benefit from not specifying the column. You can probably test it by querying v$transaction to look at the undo generated.
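For instance, something like this, run from the session performing the update while its transaction is still open, shows the undo blocks and records used:
SELECT t.used_ublk, t.used_urec
  FROM v$transaction t
  JOIN v$session s
    ON s.taddr = t.addr
 WHERE s.sid = SYS_CONTEXT('USERENV', 'SID');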
For question 2:
I'd be more concerned about memory (especially if you are bulk collecting SELECT *) and triggers firing than about UNDO.
If you don't need SELECT *, specify the columns (e.g. as follows):
declare
  cursor c_1 is select pk, col1, col2 from taba;
  SampleRT c_1%rowtype;
begin
  SELECT pk, col1, col2 INTO SampleRT FROM TABA WHERE PK = 1;
  SampleRT.COL2 := '111';
  UPDATE (select pk, col1, col2 from taba)
     SET ROW = SampleRT
   WHERE PK = SampleRT.PK;
end;