How to delete records in parallel from Oracle table - oracle

We maintain an audit of our application in the table 'application_audit'. I am trying to write a stored procedure to delete records from this table that we no longer need.
So far, I have written the stored procedure below, but I found that it takes a lot of time when the number of rows to be deleted is more than 100k.
Can you please help me implement parallel sessions OR optimize the delete query in the stored procedure below to speed up the execution?
In production, this table will have at least 5 million rows at any given point in time, and from what I can see, if we execute this stored procedure every day there will be at least 100k records to delete. In the query below, COMPONENT_NAME='REQUESTPURGE' means that the purge has already happened for that particular request number and there is no request data present in our active database instance for that request number, so all records in 'application_audit' with that request number become eligible for deletion.
Stored procedure:
create or replace PROCEDURE APPLICATION_AUDIT_PURGE_RECORD
IS
  purgewait number := 30;
BEGIN
  DBMS_OUTPUT.PUT_LINE('Application audit purge started with purge wait value as '||purgewait||' days');
  delete from application_audit
  where id in (select id
               from application_audit
               where request_number in (select request_number
                                        from application_audit
                                        where COMPONENT_NAME = 'REQUESTPURGE'
                                          and trunc(timestamp) < trunc(sysdate - purgewait)));
END APPLICATION_AUDIT_PURGE_RECORD;
Table:
CREATE TABLE "APPLICATION_AUDIT" (
"ID" NUMBER GENERATED ALWAYS AS IDENTITY NOT NULL,
"MESSAGE_TYPE" VARCHAR2(64 CHAR),
"COMPONENT_NAME" VARCHAR2(64 CHAR),
"USERNAME" VARCHAR2(32 CHAR),
"TIMESTAMP" TIMESTAMP (6) WITH TIME ZONE NOT NULL,
"REQUEST_NUMBER" VARCHAR2(64 CHAR),
"MODULE_NAME" VARCHAR2(256 CHAR),
"PROCESS_NAME" VARCHAR2(256 CHAR),
"VERSION" VARCHAR2(64 CHAR),
"TASK" VARCHAR2(64 CHAR),
"ERROR_CODE" VARCHAR2(256 CHAR),
"ERROR_MESSAGE" VARCHAR2(4000 CHAR),
"MESSAGE" VARCHAR2(4000 CHAR)
)
Edit1:
Changing the delete statements in the stored procedure and adding indexes reduced the execution time significantly.
Updated delete statements in stored procedure:
DELETE FROM APPLICATION_AUDIT
WHERE REQUEST_NUMBER IN (SELECT APPLICATION_AUDIT.REQUEST_NUMBER
                         FROM APPLICATION_AUDIT
                         WHERE APPLICATION_AUDIT.REQUEST_NUMBER != 'null'
                           AND APPLICATION_AUDIT.MESSAGE_TYPE = 'INFO'
                           AND APPLICATION_AUDIT.COMPONENT_NAME = 'REQUESTPURGE'
                           AND APPLICATION_AUDIT.TASK = 'DeleteRequest'
                           AND TRUNC(APPLICATION_AUDIT.TIMESTAMP) < TRUNC(SYSDATE - v_reqnumpurgewait));

DELETE FROM APPLICATION_AUDIT
WHERE REQUEST_NUMBER = 'null'
  AND TRUNC(APPLICATION_AUDIT.TIMESTAMP) < TRUNC(SYSDATE - v_purgewait);
Index creation queries:
CREATE INDEX APPLICATION_AUDIT_IDX1 ON APPLICATION_AUDIT (COMPONENT_NAME, TIMESTAMP, (NVL(REQUEST_NUMBER,'null')));
CREATE INDEX APPLICATION_AUDIT_IDX2 ON APPLICATION_AUDIT (NVL(REQUEST_NUMBER,'null'));

I see it suffices to find one row with component_name = 'REQUESTPURGE' to delete all rows with the same request number. This means that the component_name alone doesn't tell us whether to delete a row or not. Otherwise I'd have suggested using table partitions here.
As is, all I can think of is providing appropriate indexes. First of all, though, your query can be simplified to:
delete from application_audit
where request_number in
(
select request_number
from application_audit
where component_name = 'REQUESTPURGE'
and timestamp < trunc(sysdate - purgewait)
);
The indexes I suggest for this statement:
create index idx1 on application_audit (component_name, timestamp, request_number);
create index idx2 on application_audit (request_number);

Deleting 5 million records shouldn't be that time consuming.
Having said that, you can try adding a parallel hint to the DELETE statement.
First, enable parallel DML:
ALTER SESSION ENABLE PARALLEL DML;
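A sketch of what the hinted delete might look like, assuming a parallel degree of 8 and the 30-day purge wait from the question (both values are illustrative):
DELETE /*+ PARALLEL(aa, 8) */
FROM application_audit aa
WHERE aa.request_number IN (SELECT request_number
                            FROM application_audit
                            WHERE component_name = 'REQUESTPURGE'
                              AND timestamp < TRUNC(SYSDATE - 30));
COMMIT; -- after parallel DML, the session must commit before it can query the table again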
If that's not helping, you could look into:
Disabling indexes on the table
But, of course, any queries needing and using those indexes will be slower while your delete runs. So you're just trading one slow statement for (lots of) others. And you'll have to rebuild them afterwards which will take (possibly a looooooong) time.
You can look into chunking the delete by SQL or by rowid, as sketched below.
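For rowid chunking, DBMS_PARALLEL_EXECUTE can run the delete over independent rowid ranges. A minimal sketch, again assuming the 30-day cutoff; the task name, chunk size and parallel level are arbitrary choices, and running the task requires the CREATE JOB privilege:
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'purge_audit');

  -- carve the table into rowid ranges of roughly 10000 rows each
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'purge_audit',
    table_owner => USER,
    table_name  => 'APPLICATION_AUDIT',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- run the delete once per chunk, four scheduler jobs at a time
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'purge_audit',
    sql_stmt       => 'DELETE FROM application_audit
                        WHERE rowid BETWEEN :start_id AND :end_id
                          AND request_number IN (SELECT request_number
                                                   FROM application_audit
                                                  WHERE component_name = ''REQUESTPURGE''
                                                    AND timestamp < TRUNC(SYSDATE - 30))',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);

  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'purge_audit');
END;
/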
If none of these help enough, you may need to look into more radical solutions.
Such as saving the data you want to keep in a temporary table. Then dropping the current table and renaming the temporary one. e.g.:
create table tmp as select ...data you want to keep... from old_tab;
drop table old_tab;
rename tmp to old_tab;
-- re-create grants, indexes etc. that were on the original table
...
But you need an outage to do this.
I would suggest tracking down where the bottleneck is occurring first, with an explain plan or a trace, as it sounds like you have an underlying problem if 5 million deletes are taking a long time.

I think your DELETE query can be simplified to:
DELETE FROM application_audit
WHERE COMPONENT_NAME = 'REQUESTPURGE'
AND TRUNC(timestamp) < TRUNC(SYSDATE - purgewait);
You can try having an index on the COMPONENT_NAME column as well.
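For example (the index name is illustrative):
CREATE INDEX application_audit_idx3 ON application_audit (component_name);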

Related

avoiding duplicates oracle apex

I am trying to avoid duplicates in a table called 'incomingrequest'; my table looks like this:
CREATE TABLE "REGISTRY"."INCOMINGREQUEST"
( "ID" NUMBER(30,0),
"FILENUMBER" VARCHAR2(30 BYTE),
"REQUESTEDFILE" VARCHAR2(300 BYTE),
"REQUESTEDDEPARTMENT" VARCHAR2(30 BYTE),
"REQUESTDATE" DATE,
"STATUS" VARCHAR2(30 BYTE),
"URGENCY" VARCHAR2(30 BYTE),
"VOLUME" NUMBER(30,0),
"SUB" NUMBER(30,0),
"REGISTRYID" NUMBER(30,0),
"TEMPORARY" VARCHAR2(30 BYTE)
)
and the table data is as follows:
filenumber  Filename  requester        status     REQUESTEDDEPARTMENT
1/11/2      Payments  JOSHUA MITCHELL  PENDING    DAY CARE
1/11/2      Payments  JOSHUA MITCHELL  Delivered  DAY CARE
1/11/2      Payments  JOSHUA MITCHELL  PENDING    DAY CARE
1/11/2      Payments  RAWLE MUSGRAVE   PENDING    COMCORP
NB: I only included the important fields above for this scenario (the other fields in the table have data).
What I want to achieve is: when the app_user, which in this case is the department (daycare), makes the same request while the previous request is pending (status), I want an error to occur. So the 3rd record/request should not have happened.
The trigger I am trying is:
create or replace trigger "INCOMINGREQUEST_T1"
BEFORE insert or update or delete on "INCOMINGREQUEST"
for each row
DECLARE
  counter INTEGER;
BEGIN
  SELECT * INTO counter
  FROM (SELECT COUNT(rownum)
        FROM INCOMINGREQUEST
        WHERE requesteddepartment = V('APP_USER')
          AND status = 'PENDING');
  IF counter = 1 THEN
    RAISE_APPLICATION_ERROR(-20012, 'Duplicated value');
  END IF;
END;
but I am getting an error:
REGISTRY.INCOMINGREQUEST is mutating, trigger/function may not see it ORA-06512: at "REGISTRY.INCOMINGREQUEST_T1", line 3 ORA-04088: error during execution of trigger 'REGISTRY.INCOMINGREQUEST_T1'
You can easily achieve the desired behavior using a conditional UNIQUE index, as follows:
CREATE UNIQUE INDEX INCOMINGREQUEST_IDX ON
  INCOMINGREQUEST ( CASE WHEN STATUS = 'PENDING'
                         THEN FILENUMBER
                    END );
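If the rule is per department, as the sample data suggests, the same idea extends to two columns; this variant is an assumption on my part, not part of the original answer:
CREATE UNIQUE INDEX incomingrequest_idx2 ON incomingrequest
  ( CASE WHEN status = 'PENDING' THEN filenumber END,            -- key only for pending rows
    CASE WHEN status = 'PENDING' THEN requesteddepartment END );
Rows whose status is not 'PENDING' produce an all-NULL key, which Oracle does not index, so only duplicate pending requests raise ORA-00001.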
Cheers!!
You could use a procedure to stop duplicates, and pass in the parameters you need to insert into the table.
The issue with using a Trigger to find the current status is that you cannot query information from a table you are inserting/updating/deleting from inside the trigger as the data is "Mutating".
To run the procedure use:
BEGIN
stack_prc('DAY CARE', 'PENDING');
END;
Procedure
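A minimal sketch of what stack_prc might look like; the parameter list, column names and error number are assumptions, since the original code was not shown:
create or replace procedure stack_prc (
  p_department in varchar2,
  p_status     in varchar2
) as
  l_pending integer;
begin
  -- querying the table here is fine: unlike a row trigger, a procedure does not mutate it
  select count(*)
    into l_pending
    from incomingrequest
   where requesteddepartment = p_department
     and status = 'PENDING';

  if l_pending > 0 then
    raise_application_error(-20012, 'Duplicated value');
  end if;

  insert into incomingrequest (requesteddepartment, status)
  values (p_department, p_status);
end stack_prc;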

Performance of date time concatenation into timestamp

Oracle 12C, non partitioned, no ASM.
This is the background. I have a table with multiple columns, 3 of them being -
TRAN_DATE DATE
TRAN_TIME TIMESTAMP(6)
FINAL_DATETIME NOT NULL TIMESTAMP(6)
The table has around 70 million records. What I want to do is concatenate the tran_date and the tran_time field and update the final_datetime field with that output, for all 70 million records.
This is the query I have -
update MYSCHEMA.MYTAB
set FINAL_DATETIME = to_timestamp(
      to_char(tran_date, 'YYYY-MM-DD') || ' ' || to_char(tran_time, 'HH24:MI:SS.FF6'),
      'YYYY-MM-DD HH24:MI:SS.FF6');
Eg:
At present (for one record)
TRAN_DATE=01-DEC-16
TRAN_TIME=01-JAN-70 12.35.58.000000 AM /*I need only the time part from this*/
FINAL_DATETIME=22-OCT-18 04.37.18.870000 PM
Post the query - the FINAL_DATETIME needs to be
01-DEC-16 12.35.58.000000 AM
The to_timestamp does require two character strings, and I fear this will slow down the update a lot. Any suggestions?
What more can I do to increase performance? No one else will be using the table at this point, so I do have the option to:
Drop indices
Turn off logging
and anything more anyone can suggest.
Any help is appreciated.
I would prefer the CTAS method, and your job would be simpler if you didn't have indexes, triggers and constraints on your table.
Create a new table for the column to be modified.
CREATE TABLE mytab_new
NOLOGGING
AS
SELECT /*+ FULL(mytab) PARALLEL(mytab, 10) */ --hint to speed up the select.
CAST(tran_date AS TIMESTAMP) + ( tran_time - trunc(tran_time) ) AS final_datetime
FROM mytab;
I have included only one (the final) column in your new table because storing the other two in the new table is a waste of resources. You may include other columns in the select, apart from the two now-redundant ones.
Read logging/nologging to know about NOLOGGING option in the select.
Next step is to rebuild indexes, triggers and constraints for the new table mytab_new using the definition from mytab for other columns if they exist.
Then rename the tables
rename mytab to mytab_bkp;
rename mytab_new to mytab;
You may drop the table mytab_bkp after validating the new table or later when you feel you no longer need it.
Demo: a quick sanity check of the expression against the sample row from the question (the literals below are reconstructed from the post):
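SELECT CAST(DATE '2016-12-01' AS TIMESTAMP)
       + (TIMESTAMP '1970-01-01 00:35:58' - DATE '1970-01-01') AS final_datetime
FROM dual;
-- expected: 01-DEC-16 12.35.58.000000 AM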

Amend Stored Procedure to Ignore Duplicate Records

I need to make the below amendment to this stored procedure
create or replace PROCEDURE "USP_IMPORT_FOBTPP_DATA"
AS
BEGIN
INSERT INTO FINIMP.FOBT_PARTPAYMENT
SELECT
PART_PAYMENT_ID,
ISSUING_SHOP,
TILL_NUMBER,
SLIP_NUMBER,
FOBT_NUMBER,
WHO_PAID,
WHEN_PAID,
AMOUNT_LEFT_TO_PAY,
FOBT_VALUE,
STATUS
FROM IMPORTDB.CLN_FOBTPP;
COMMIT;
END;
The amendment should skip any records that would result in a primary key violation, so that the data load process does not break.
Source Table
CREATE TABLE "FINIMP"."FOBT_PARTPAYMENT"
( "PART_PAYMENT_ID" NUMBER(*,0),
"ISSUING_SHOP" CHAR(4 BYTE),
"TILL_NUMBER" NUMBER(3,0),
"SLIP_NUMBER" NUMBER(*,0),
"FOBT_NUMBER" VARCHAR2(30 BYTE),
"WHO_PAID" CHAR(20 BYTE),
"WHEN_PAID" DATE,
"AMOUNT_LEFT_TO_PAY" NUMBER(19,4),
"FOBT_VALUE" NUMBER(19,4),
"STATUS" CHAR(2 BYTE)
);
ALTER TABLE "FINIMP"."FOBT_PARTPAYMENT" ADD CONSTRAINT "PK_FOBT_PP" PRIMARY KEY ("PART_PAYMENT_ID", "ISSUING_SHOP", "WHEN_PAID")
I am new to PL/SQL, how can I do this?
There are a number of ways to accomplish this, and the best method depends on your environment/requirements. Is the CLN_FOBTPP table considerably large? Is the USP_IMPORT_FOBTPP_DATA procedure called frequently, and does it need to meet certain performance criteria? These are all things you should consider.
One way to do this would be to start with the query that you use.
create or replace PROCEDURE "USP_IMPORT_FOBTPP_DATA"
AS
BEGIN
INSERT INTO FINIMP.FOBT_PARTPAYMENT
SELECT ...
FROM IMPORTDB.CLN_FOBTPP;
This will return all of the rows of data from IMPORTDB.CLN_FOBTPP and insert them into FINIMP.FOBT_PARTPAYMENT. Instead, you could control for this by doing:
INSERT INTO FINIMP.FOBT_PARTPAYMENT
SELECT ...
FROM IMPORTDB.CLN_FOBTPP
WHERE PART_PAYMENT_ID NOT IN (SELECT PART_PAYMENT_ID FROM FINIMP.FOBT_PARTPAYMENT)
This would go through the FOBT_PARTPAYMENT table and check to see if a row's PART_PAYMENT_ID existed in the table before doing the insert. However, this can be prohibitively expensive if the table is large or if you have performance requirements.
Another way would be to create a temp table for each time the procedure is called, store the values in that temp table, and then add the new rows after validating the data. This would look something like:
create global temporary table temp_USP_table (
  "PART_PAYMENT_ID" NUMBER(*,0),
  "ISSUING_SHOP" CHAR(4 BYTE),
  ...
) on commit delete rows;
create or replace PROCEDURE "USP_IMPORT_FOBTPP_DATA"
AS
BEGIN
INSERT INTO temp_USP_table
SELECT ...
FROM IMPORTDB.CLN_FOBTPP;
From there, you can do a number of things. You could use the same procedure to add the new rows from the temp table into the FINIMP.FOBT_PARTPAYMENT table:
delete from temp_USP_table
where PART_PAYMENT_ID in (select PART_PAYMENT_ID from FINIMP.FOBT_PARTPAYMENT);
insert into FINIMP.FOBT_PARTPAYMENT select * from temp_USP_table;
Or you could create a new procedure to load the new data from the temp_USP_table into the FINIMP.FOBT_PARTPAYMENT table, in case you'd like to do something additional to the new data before it's added to the table. Since you reference a data load, I would recommend going the temporary table route because it should allow you to load the data without issue. Once the data is loaded, you can worry about adding it to the proper table(s).
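One more alternative, not covered in the original answer: since 11g Oracle has the IGNORE_ROW_ON_DUPKEY_INDEX hint, which makes the insert silently skip rows that would violate the named unique index:
INSERT /*+ IGNORE_ROW_ON_DUPKEY_INDEX(FOBT_PARTPAYMENT, PK_FOBT_PP) */
INTO FINIMP.FOBT_PARTPAYMENT
SELECT PART_PAYMENT_ID, ISSUING_SHOP, TILL_NUMBER, SLIP_NUMBER, FOBT_NUMBER,
       WHO_PAID, WHEN_PAID, AMOUNT_LEFT_TO_PAY, FOBT_VALUE, STATUS
FROM IMPORTDB.CLN_FOBTPP;
Be aware that this error-suppressing hint checks the index row by row, so it is slower than a plain set-based insert.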

WITH Clause performance issue in Oracle 11g

Table myfirst3 has 4 columns and 1.2 million records.
Table mtl_object_genealogy has over 10 million records.
Running the code below takes a very long time. How can I tune this code using WITH options?
WITH level1 as (
SELECT mln_parent.lot_number,
mln_parent.inventory_item_id,
gen.lot_num, --fg_lot
gen.segment1,
gen.rcv_date
FROM mtl_lot_numbers mln_parent,
(SELECT MOG1.parent_object_id,
p.segment1,
p.lot_num,
p.rcv_date
FROM mtl_object_genealogy MOG1 ,
myfirst3 p
START WITH MOG1.object_id = p.gen_object_id
AND (MOG1.end_date_active IS NULL OR MOG1.end_date_active > SYSDATE)
CONNECT BY nocycle PRIOR MOG1.parent_object_id = MOG1.object_id
AND (MOG1.end_date_active IS NULL OR MOG1.end_date_active > SYSDATE)
UNION all
SELECT p1.gen_object_id,
p1.segment1,
p1.lot_num,
p1.rcv_date
FROM myfirst3 p1 ) gen
WHERE mln_parent.gen_object_id = gen.parent_object_id )
select /*+ NO_CPU_COSTING */ *
from level1;
Execution plan (attached as an image; not reproduced here).
CREATE TABLE APPS.MYFIRST3
(
TO_ORGANIZATION_ID NUMBER,
LOT_NUM VARCHAR2(80 BYTE),
ITEM_ID NUMBER,
FROM_ORGANIZATION_ID NUMBER,
GEN_OBJECT_ID NUMBER,
SEGMENT1 VARCHAR2(40 BYTE),
RCV_DATE DATE
);
CREATE TABLE INV.MTL_OBJECT_GENEALOGY
(
OBJECT_ID NUMBER NOT NULL,
OBJECT_TYPE NUMBER NOT NULL,
PARENT_OBJECT_ID NUMBER NOT NULL,
START_DATE_ACTIVE DATE NOT NULL,
END_DATE_ACTIVE DATE,
GENEALOGY_ORIGIN NUMBER,
ORIGIN_TXN_ID NUMBER,
GENEALOGY_TYPE NUMBER
);
CREATE INDEX INV.MTL_OBJECT_GENEALOGY_N1 ON INV.MTL_OBJECT_GENEALOGY(OBJECT_ID);
CREATE INDEX INV.MTL_OBJECT_GENEALOGY_N2 ON INV.MTL_OBJECT_GENEALOGY(PARENT_OBJECT_ID);
Your explain plan shows some very big numbers. The optimizer reckons the final result set will be about 3,227,000,000,000 rows. Just returning that many rows will take some time.
All table accesses are Full Table Scans. As you have big tables that will eat time too.
As for improvements, it's pretty hard for us to understand the logic of your query. This is your data model, your business rules, your data. You haven't explained anything, so all we can do is guess.
Why are you using the WITH clause? You only use the level1 result set once, so just have a regular FROM clause.
Why are you using UNION ALL? That operation just duplicates the records retrieved from myfirst3 (all those values are already included as rows where MOG1.object_id = p.gen_object_id).
The MERGE JOIN CARTESIAN operation is interesting. Oracle uses it to implement transitive closure. It is an expensive operation but that's because treewalking a hierarchy is an expensive thing to do. It is unfortunate for you that you are generating all the parent-child relationships for a table with 27 million records. That's bad.
The full table scans aren't the problem. There are no filters on myfirst3, so obviously the database has to get all the records. If there is one parent for each myfirst3 record, that's 10% of the contents of mtl_object_genealogy, so a full table scan would be efficient; but you're rolling up the entire hierarchy, so it's like you're looking at a much greater chunk of the table.
Your indexes are irrelevant in the face of such numbers. What might help is a composite index on mtl_object_genealogy(OBJECT_ID, PARENT_OBJECT_ID, END_DATE_ACTIVE).
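For example (the index name is illustrative):
CREATE INDEX mtl_object_genealogy_n3
  ON inv.mtl_object_genealogy (object_id, parent_object_id, end_date_active);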
You want all the levels of PARENT_OBJECT_ID for the records in myfirst3. If you run this query often and mtl_object_genealogy is a slowly changing table you should consider materializing the transitive closure into a table which just has records for all the permutations of leaf records and parents.
To sum up:
Ditch the WITH clause
Drop the UNION ALL
Tune the tree-walk with a composite index (or materializing it)
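Putting the first two points together, the statement might collapse to something like this sketch (untested against the real data model):
SELECT mln_parent.lot_number,
       mln_parent.inventory_item_id,
       gen.lot_num, --fg_lot
       gen.segment1,
       gen.rcv_date
FROM mtl_lot_numbers mln_parent,
     (SELECT MOG1.parent_object_id,
             p.segment1,
             p.lot_num,
             p.rcv_date
      FROM mtl_object_genealogy MOG1,
           myfirst3 p
      START WITH MOG1.object_id = p.gen_object_id
        AND (MOG1.end_date_active IS NULL OR MOG1.end_date_active > SYSDATE)
      CONNECT BY NOCYCLE PRIOR MOG1.parent_object_id = MOG1.object_id
        AND (MOG1.end_date_active IS NULL OR MOG1.end_date_active > SYSDATE)) gen
WHERE mln_parent.gen_object_id = gen.parent_object_id;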

How to use SQL trigger to record the affected column's row number

I want to have an 'updateinfo' table in order to record every update/insert/delete operation on another table.
In Oracle I've written this:
CREATE TABLE updateinfo (
  rnumber NUMBER(10),
  tablename VARCHAR2(100 BYTE),
  action VARCHAR2(100 BYTE),
  UPDATE_DATE date
)
DROP TRIGGER TRI_TABLE;
CREATE OR REPLACE TRIGGER TRI_TABLE
AFTER DELETE OR INSERT OR UPDATE
ON demo
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
BEGIN
if inserting then
insert into updateinfo(rnumber,tablename,action,update_date ) values(rownum,'demo', 'insert',sysdate);
elsif updating then
insert into updateinfo(rnumber,tablename,action,update_date ) values(rownum,'demo', 'update',sysdate);
elsif deleting then
insert into updateinfo(rnumber,tablename,action,update_date ) values(rownum,'demo', 'delete',sysdate);
end if;
-- EXCEPTION
-- WHEN OTHERS THEN
-- Consider logging the error and then re-raise
-- RAISE;
END TRI_TABLE;
but when checking updateinfo, the rnumber column is always zero.
Is there any way to retrieve the correct row number?
The only option is to use the primary key column of your "demo" table.
ROWNUM is not what you are looking for, read the explanation.
ROWID looks like a solution, but in fact it isn't, because it shouldn't be stored for later use.
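A sketch of the trigger rewritten to log the primary key instead; this assumes demo has a numeric primary key column named id (adjust to the real column name):
CREATE OR REPLACE TRIGGER tri_table
AFTER DELETE OR INSERT OR UPDATE ON demo
FOR EACH ROW
BEGIN
  IF inserting THEN
    INSERT INTO updateinfo (rnumber, tablename, action, update_date)
    VALUES (:new.id, 'demo', 'insert', SYSDATE);
  ELSIF updating THEN
    INSERT INTO updateinfo (rnumber, tablename, action, update_date)
    VALUES (:new.id, 'demo', 'update', SYSDATE);
  ELSIF deleting THEN
    -- :new is empty on delete; :old identifies the removed row
    INSERT INTO updateinfo (rnumber, tablename, action, update_date)
    VALUES (:old.id, 'demo', 'delete', SYSDATE);
  END IF;
END tri_table;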
ROWNUM is not what you think it is. ROWNUM is a counter that has only a meaning within the context of one execution of a statement (i.e. the first resulting row always has rownum=1 etc.). I guess you are looking for ROWID, which identifies a row.
