When are records committed when inserting from another table? - Oracle

I have a SQL query like this:
insert into dedupunclear
select mdnnumber,poivalue
from deduporiginal a
where exists (
select 1
from deduporiginal
where mdnnumber=a.mdnnumber and rowid<a.rowid)
or mdnnumber is null;
There are 500K records in my deduporiginal table. I have put this query inside a function, but it takes around 3 hours to commit the records to the dedupunclear table.
Is there any alternative to resolve the performance issue?
When does this query commit records: at some interval, or after getting all results from the select query?

This is how I did it the other day:
delete from staging_table a
where rowid >
(select min(rowid) from staging_table b
where nvl(a.id, 'x') = nvl(b.id, 'x'));
Instead of an insert into a dedupe table, I just deleted the rows directly from the staging table. For a table with 1 million rows, this query worked pretty well. I was worried that the nvl function would kill the index, but it worked well enough.
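For the original INSERT, one option worth testing is to replace the correlated self-join with an analytic function, so that deduporiginal is read only once. This is a minimal sketch, assuming dedupunclear has exactly the two columns mdnnumber and poivalue (the question does not show its definition); rn is a hypothetical alias.

```sql
-- Sketch only: assumes dedupunclear(mdnnumber, poivalue).
insert into dedupunclear (mdnnumber, poivalue)
select mdnnumber, poivalue
from (
  select mdnnumber,
         poivalue,
         -- number the rows within each mdnnumber group in rowid order
         row_number() over (partition by mdnnumber order by rowid) as rn
  from deduporiginal
)
where rn > 1              -- every row after the first one per mdnnumber
   or mdnnumber is null;  -- plus all rows with no mdnnumber

commit;  -- the inserted rows are not committed until this point
```

As a single statement, the INSERT does not commit rows at intervals; the rows become permanent only when the session issues a COMMIT after the statement has finished.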

Related

Delete logic is taking a very long time to process in Oracle

I am trying to use the following statement for a delete process that has to delete around 23566424 rows, but Oracle takes almost 3 hours to complete it. We have already created an index on "SCHEDULE_DATE_KEY", but the process is still very slow. Can someone advise on how to make deletes faster in Oracle?
DELETE
FROM
EDWSOURCE.SCHEDULE_DAY_F
WHERE
SCHEDULE_DATE_KEY >
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
I don't think any index will help here; Oracle will probably decide that the best approach is a full table scan to delete 20M rows from 300M. It is deleting at a rate of over 2000 rows per second, which isn't bad. In fact, any additional indexes will slow it down, as each deleted row's entry has to be removed from the index as well.
A quicker approach could be to create a new table of the rows you want to keep, something like:
create table EDWSOURCE.SCHEDULE_DAY_F_KEEP
as
select * from EDWSOURCE.SCHEDULE_DAY_F
where SCHEDULE_DATE_KEY <=
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
Then recreate any constraints and indexes to use the new table.
Finally drop the old table and rename the new one.
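The last two steps could look roughly like the following. This is a sketch only: the index name is hypothetical, and the real table may need more indexes and constraints than shown here.

```sql
-- Sketch only: SCHEDULE_DAY_F_KEEP_IX1 is a hypothetical index name.
create index EDWSOURCE.SCHEDULE_DAY_F_KEEP_IX1
  on EDWSOURCE.SCHEDULE_DAY_F_KEEP (SCHEDULE_DATE_KEY);

-- Remove the original table once the copy has been verified
drop table EDWSOURCE.SCHEDULE_DAY_F;

-- RENAME TO takes an unqualified name; the table stays in its schema
alter table EDWSOURCE.SCHEDULE_DAY_F_KEEP rename to SCHEDULE_DAY_F;
```

Grants and triggers defined on the old table are dropped with it, so they would need to be recreated on the renamed table as well.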
You can try testing a filtered table move. It has an ONLINE clause, so you can do this while the application is still running.
Note that in 12.2 and later the indexes will remain valid. In earlier versions you will need to rebuild the indexes, as they will become invalid. Good luck.
Move a Table
Create and populate a new test table.
DROP TABLE t1 PURGE;
CREATE TABLE t1 AS
SELECT level AS id,
'Description for ' || level AS description
FROM dual
CONNECT BY level <= 100;
COMMIT;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS     MIN_ID     MAX_ID
---------- ---------- ----------
       100          1        100
Move the table, filtering out rows with an ID value greater than 50.
ALTER TABLE t1 MOVE ONLINE
INCLUDING ROWS WHERE id <= 50;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS     MIN_ID     MAX_ID
---------- ---------- ----------
        50          1         50
The rows with an ID value between 51 and 100 have been removed.
As mentioned above, it may be best to partition the table and drop a partition every N days as part of a daily task.
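A rough sketch of the partitioning idea follows. The table, column types, partition names, and bounds are all hypothetical; it assumes the date key is a number in YYYYMMDD form, which the question does not confirm.

```sql
-- Sketch only: hypothetical table, partition names, and bounds.
create table schedule_day_demo (
  schedule_date_key number not null,
  payload           varchar2(100)
)
partition by range (schedule_date_key) (
  partition p_202401 values less than (20240201),
  partition p_202402 values less than (20240301),
  partition p_max    values less than (maxvalue)
);

-- Dropping a partition is a DDL operation rather than a row-by-row delete
alter table schedule_day_demo
  drop partition p_202402 update global indexes;
```

Whether this fits depends on the rows to be removed lining up with partition boundaries.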

Compare differences before insert into oracle table

Could you please tell me how to compare the differences between a table and my select query, and insert the results into a separate table? My plan is to create one base table (named RESULT) using a select statement and populate it with the current result set. Then the next day I would like to create a procedure which will compare the same select with the RESULT table and insert the differences into another table called DIFFERENCES.
Any ideas?
Thanks!
You can create the RESULT_TABLE using CTAS as follows:
CREATE TABLE RESULT_TABLE
AS SELECT ... -- YOUR QUERY
Then you can use the following procedure which calculates the difference between your query and data from RESULT_TABLE:
CREATE OR REPLACE PROCEDURE FIND_DIFF
AS
BEGIN
INSERT INTO DIFFERENCES
--data present in the query but not in RESULT_TABLE
(SELECT ... -- YOUR QUERY
MINUS
SELECT * FROM RESULT_TABLE)
UNION
--data present in the RESULT_TABLE but not in the query
(SELECT * FROM RESULT_TABLE
MINUS
SELECT ... );-- YOUR QUERY
END;
/
I have used UNION to combine two MINUS queries run in opposite directions, so that rows deleted since RESULT_TABLE was created are also inserted into the DIFFERENCES table. If this is not required, remove the query before or after the UNION as appropriate.
-- Create a table with results from the query, and ID as primary key
create table result_t as
select id, col_1, col_2, col_3
from <some-query>;
-- Create a table with new rows, deleted rows or updated rows
create table differences_t as
select id
-- Old values
,b.col_1 as old_col_1
,b.col_2 as old_col_2
,b.col_3 as old_col_3
-- New values
,a.col_1 as new_col_1
,a.col_2 as new_col_2
,a.col_3 as new_col_3
-- Execute the query once again
from <some-query> a
-- Outer join to also detect new/deleted rows
full join result_t b using(id)
-- Null aware comparison
where decode(a.col_1, b.col_1, 1, 0) = 0
or decode(a.col_2, b.col_2, 1, 0) = 0
or decode(a.col_3, b.col_3, 1, 0) = 0;
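The reason DECODE is used for the comparison instead of `<>` is that DECODE treats two NULLs as a match, whereas a comparison with `=` or `<>` involving NULL is never true. A small illustration against dual, with hypothetical column aliases:

```sql
select decode(null, null, 1, 0) as both_null,   -- 1: two NULLs count as equal
       decode('A',  null, 1, 0) as one_null,    -- 0: value vs NULL differs
       decode('A',  'A',  1, 0) as same_value,  -- 1
       decode('A',  'B',  1, 0) as different    -- 0
from dual;
```

So `decode(a.col_1, b.col_1, 1, 0) = 0` is true exactly when the old and new values differ, including the case where only one of them is NULL.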

Delete data based on the count & timestamp using PL/SQL

I'm new to PL/SQL programming and come from a DBA background. I have a requirement to delete data from both a main table and a reference table. We need to delete about 30M rows, so we are reducing the data per "State_ID" value using the logic below.
The following conditions need to be considered:
1. As per the sample data given below (main table), sort the data by timestamp in descending order, keep the first 2 rows for each "State_id", and delete the rest from both tables based on the "state_id" column.
2. select state_id,count(*) from maintable group by state_id order by timestamp desc Having count(*)>2;
So if state_id=1 has 5 rows, then 3 rows have to be deleted, leaving the first 2 rows for state_id=1, and the same is repeated for the other state_id values.
The matching data should also be deleted from the reference table.
Could someone please help me with this issue? Thanks.
[Image: main table sample data]
You should be able to do each table delete as a single SQL command. Anything else would essentially force row-by-row processing, which is the last thing you want for that much data. Something like this:
delete from main_table m
where m.row_id not in (
select row_id
from (
select row_id,
row_number() over (partition by state_id
order by time_stamp desc) id_row_number
from main_table)
-- the analytic result has to be filtered in an outer query block
where id_row_number < 3)
or
delete from main_table m
where m.row_id in (
select row_id
from (
select row_id,
row_number() over (partition by state_id
order by time_stamp desc) id_row_number
from main_table)
where id_row_number > 2)
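Before running either delete, it may help to check how many rows each state_id would lose. A minimal sketch, assuming the same main_table and state_id names used in the answer; rows_to_delete is a hypothetical alias.

```sql
select state_id,
       count(*) - 2 as rows_to_delete  -- everything beyond the 2 newest rows
from main_table
group by state_id
having count(*) > 2
order by state_id;
```

The sum of rows_to_delete should match the row count reported by the delete statement.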

How to copy all constraints and data from one schema to another in Oracle

I am using Toad for Oracle 12c. I need to copy a table and its data (40M rows) from one schema to another (prod to test). However, there is a unique key (not the PK for this table) on the record_Id column, which holds values like 3.000*******19E15. About 2M rows appear to have the same number (I believe because the numbers are very large), although they are unique in prod. When I try to copy, it violates the unique key on that column. I am using Toad's "export data to another schema" function to copy the data.
When I execute these queries in prod:
select count(*) from table_name
OR
select count(distinct(record_id)) from table_name
both queries return exactly the same number.
I don't have DBA permissions. How do I copy all the data without violating the table's unique key?
Thanks in advance!
You can use an upsert (MERGE in Oracle) for a conditional INSERT or UPDATE, or you may write a small procedure for this.
You may also consider using NOT EXISTS, but your data is large and it might not be resource efficient.
insert into prod_tab
select * from other_tab t1 where NOT exists (
select 1 from prod_tab t2 where t1.id = t2.id
);
In Oracle you can use a MERGE query for that.
The following query proceeds as follows for each data row :
if the source record_id does not yet exist in the target table, a new record is inserted
else, the existing record is updated with source values
For the sake of the example, I assumed that there are two other columns in the table : column1 and column2.
MERGE INTO target_table t1
-- the alias t2 belongs to the whole subquery, not to the table inside it
USING (SELECT * FROM source_table) t2
ON (t1.record_id = t2.record_id)
WHEN MATCHED THEN UPDATE SET
t1.column1 = t2.column1,
t1.column2 = t2.column2
WHEN NOT MATCHED THEN INSERT
(record_id, column1, column2) VALUES (t2.record_id, t2.column1, t2.column2);

Oracle | delete duplicate records

I have identified some duplicates in my table:
-- DUPLICATES: ----
select PPLP_NAME,
START_TIME,
END_TIME,
count(*)
from PPLP_LOAD_GENSTAT
group by PPLP_NAME,
START_TIME,
END_TIME
having count(*) > 1
-- DUPLICATES: ----
How is it possible to delete them?
Even if you don't have a primary key, each record has a unique rowid associated with it.
The query below uses a correlated subquery on the columns that define a duplicate and deletes every record except the one with the maximum rowid in each group, so exactly one row per group is kept.
DELETE FROM PPLP_LOAD_GENSTAT plg_outer
WHERE ROWID NOT IN(
select MAX(ROWID)
from PPLP_LOAD_GENSTAT plg_inner
WHERE plg_outer.pplp_name = plg_inner.pplp_name
AND plg_outer.start_time= plg_inner.start_time
AND plg_outer.end_time = plg_inner.end_time
);
I'd suggest something easier:
CREATE table NewTable as
SELECT DISTINCT pplp_name,start_time,end_time
FROM YourTable
Then drop your table and rename the new table. Note that the new table contains only the three selected columns.
If you really want to delete records, you can find a few examples of how here.

Resources