Materialized view creation is fast but refresh takes hours - Oracle

I am using a materialized view, and I can't set it to fast refresh because some of the tables are from a remote database which does not have materialized view logs.
When I create the materialized view, it takes only 20-30 seconds. However, when I try to refresh it, it takes more than 2-3 hours, and the total number of records is only around 460,000.
Does anyone have any clue how this could happen?
Thanks
The code looks like the following:
create materialized view MY_MV1
refresh force on demand
start with to_date('20-02-2013 22:00:00', 'dd-mm-yyyy hh24:mi:ss') next trunc(sysdate)+1+22/24
as
( SELECT Nvl(Cr.Sol_Chng_Num, ' ') AS Change_Request_Nbr,
         Nvl(Sr.Sr_Num, ' ')       AS Service_Request_Nbr,
         Nvl(Sr.w_Org_Id, 0)       AS Org_Id,
         Fcr.rowid                 AS Fcr_Rowid,
         Cr.rowid                  AS Cr_Rowid,
         Bsr.rowid                 AS Bsr_Rowid,
         Sr.rowid                  AS Sr_Rowid,
         SYSDATE                   AS Load_Dt
  FROM Dwadmin.f_S_Change@DateWarehouse.World Fcr
  INNER JOIN Dwadmin.d_S_Change@DateWarehouse.World Cr
          ON Fcr.w_Sol_Chng_Id = Cr.w_Sol_Chng_Id
  INNER JOIN Dwadmin.b_S_Change_Obl@DateWarehouse.World Bsr
          ON Fcr.w_Sol_Chng_Id = Bsr.w_Sol_Chng_Id
  INNER JOIN Dwadmin.d_S_Rec@DateWarehouse.World Sr
          ON Sr.w_Srv_Rec_Id = Bsr.w_Srv_Rec_Id
  WHERE Sr.Sr_Num <> 'NS'
);
I have tried dbms_mview.refresh('MY_MATVIEW', 'C', atomic_refresh=>false),
but it still took 141 minutes to run, vs. 159 minutes without atomic_refresh=>false.

I would personally NOT use the scheduler built into the mat view CREATE statement (start with ... next clause).
The main reason (for me) is that you cannot declare the refresh non-ATOMIC this way (at least I haven't found the syntax for this at CREATE time). Depending on your refresh requirements and size, this can save A LOT of time.
I would use dbms_mview.refresh('MY_MATVIEW', 'C', atomic_refresh=>false). This would:
Truncate MY_MATVIEW snapshot table
Insert append into MY_MATVIEW table
If you use the next clause in the create statement, it will set up an atomic refresh, meaning it will:
Delete * from MY_MATVIEW
Insert into MY_MATVIEW
Commit
This will be slower (sometimes much slower), but others can still query MY_MATVIEW while the refresh is occurring. So, it depends on your situation and needs.
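If you still want the refresh on a schedule, one way (a sketch, not the only way; the job name and calendar string here are placeholders I've made up) is to have DBMS_SCHEDULER call the non-atomic refresh for you:
BEGIN
  DBMS_SCHEDULER.create_job(
    job_name        => 'MY_MV1_REFRESH_JOB',   -- placeholder name
    job_type        => 'PLSQL_BLOCK',
    job_action      => q'[BEGIN DBMS_MVIEW.REFRESH('MY_MV1', method => 'C', atomic_refresh => FALSE); END;]',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=22',  -- roughly matches the 22:00 next clause in the question
    enabled         => TRUE);
END;
/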

You can test it. I ran this manually and it works for me, friend :)
BEGIN
  DBMS_REFRESH.make(
    name                 => 'DB_NAME.MINUTE_REFRESH',
    list                 => '',
    next_date            => SYSDATE,
    interval             => '/*1:Mins*/ SYSDATE + 1/(60*24)',
    implicit_destroy     => FALSE,
    lax                  => FALSE,
    job                  => 0,
    rollback_seg         => NULL,
    push_deferred_rpc    => TRUE,
    refresh_after_errors => TRUE,
    purge_option         => NULL,
    parallelism          => NULL,
    heap_size            => NULL);
END;
/
BEGIN
  DBMS_REFRESH.add(
    name => 'DB_NAME.MINUTE_REFRESH',
    list => 'DB_NAME.MV_NAME',
    lax  => TRUE);
END;
/
And then you can destroy it with this:
BEGIN
  DBMS_REFRESH.destroy(name => 'DB_NAME.MINUTE_REFRESH');
END;
/
You can also create a materialized view log:
CREATE MATERIALIZED VIEW LOG ON DB_NAME.TABLE_NAME
TABLESPACE users
WITH PRIMARY KEY
INCLUDING NEW VALUES;
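Once logs like this exist on every master table the view selects from, the view can use fast (incremental) refresh. A minimal sketch of switching over, assuming all logs are in place (which, as the question notes, is not possible here for the remote tables):
ALTER MATERIALIZED VIEW DB_NAME.MV_NAME REFRESH FAST;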
I hope it can help you. :)

If it only takes 20-30 seconds to create, why not just drop and recreate the materialized view instead of refreshing it?
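A sketch of that approach (the source table below is a stand-in for the full SELECT from the question; note the view briefly does not exist between the drop and the create, so any concurrent query against it during that window will fail):
DROP MATERIALIZED VIEW MY_MV1;
CREATE MATERIALIZED VIEW MY_MV1 AS
  SELECT * FROM some_source_table;  -- stand-in: reuse the full query from the original CREATE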

I am guessing:
Create Table doesn't need to write into the transaction log, as it is a new table. atomic_refresh => false means there is a truncate on the delete side (bypassing logging), but you then still have the INSERT side to deal with, which likely means you get a lot of transaction logging.
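If the INSERT-side logging is indeed the bottleneck, one thing worth testing (assuming you can tolerate reduced recoverability of the MV segment until the next backup) is marking the materialized view NOLOGGING, so that the direct-path insert done by the non-atomic refresh generates minimal redo:
ALTER MATERIALIZED VIEW MY_MV1 NOLOGGING;
-- then refresh non-atomically: truncate + direct-path insert
EXEC DBMS_MVIEW.REFRESH('MY_MV1', method => 'C', atomic_refresh => FALSE);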

Related

create a trigger to update values

I have these two tables:
Project (projID, TotArticles)
Article (prodID, ArticleID)
How can I create a trigger that increments the total number of articles by 1 every time someone publishes an article?
CREATE TRIGGER Art_Up
AFTER INSERT ON Article
FOR EACH ROW
UPDATE Project
SET TotArticle = TotArticle + 1
WHERE paperID = NEW.PaperID;
However, it gives me this error: PLS-00103: Encountered the symbol ";"
You mixed up some names: once you write projID, once prodID, and in your trigger it is paperID. Also, the trigger has no begin ... end;. And you did not handle adding articles whose projID does not exist in the project table. You could check for that first, or use SQL%ROWCOUNT after the update and, if it is 0, do an insert. Simpler is to use merge:
create or replace trigger art_up after insert on article for each row
begin
  merge into project
  using (select :new.projid projid from dual) src
  on (project.projid = src.projid)
  when matched then update set totarticles = totarticles + 1
  when not matched then insert (projID, TotArticles) values (:new.projid, 1);
end;
/
It works, I tested some basic inserts, but it is not recommended at all, because:
it's a bad idea to keep logic in triggers,
triggers can be dropped or disabled, and then this information may be misleading or false,
we are slowing down insert operations,
this trigger does not handle deletes, where you should decrement the total number of articles (see the sketch after the view below).
Instead of a trigger, use a simple view:
create or replace view vw_project as select projID, count(1) total from article group by projid;
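For completeness, if you do keep the trigger approach despite the caveats above, deletes would need a matching trigger that decrements the counter. A minimal sketch, assuming the same column names as above:
create or replace trigger art_down after delete on article for each row
begin
  update project
  set totarticles = totarticles - 1
  where projid = :old.projid;
end;
/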

Oracle dba_snapshot_refresh_times use in PL/SQL

Essentially I have an MVIEW that doesn't need to be refreshed every time I run my report, so as a test I'm just checking the minute interval: if it hasn't been refreshed in the last 20 minutes, I refresh it. If I run this script by itself, it works properly:
declare
  LastRefresh float;
  myvar varchar2(10) := 'NO';
BEGIN
  SELECT ROUND(ABS((LAST_REFRESH - SYSDATE)*24*60),0)
    INTO LastRefresh
    FROM dba_snapshot_refresh_times
   WHERE owner = 'ME'
     AND NAME = 'MATV_ADDRESS';
  IF (LastRefresh > 20) THEN
    myvar := 'YES';
    --// THIS IS WHERE I WOULD REFRESH THE MVIEW
  END IF;
  dbms_output.put_line(myvar);
END;
/
The problem I'm having is that when I try to run this within an Oracle procedure, I get "Table or View does not exist" on dba_snapshot_refresh_times. I tried EXECUTE IMMEDIATE to invoke the query but can't seem to find a way to select into a variable with this method. Any clean alternative ways I could do this?
I did a bit of research and could potentially just use some other form of "flag", where I'd populate a temp table in order to check the value, but I figured I'd ask about a cleaner way / solution.
It appears that you're programming something that Oracle offers itself; why wouldn't you set the materialized view to automatically refresh every 20 minutes? You'd use:
alter materialized view matv_address
refresh
next sysdate + 20 / (60 * 24);
As for your problem (if I understood it correctly): DBA_ views are visible to users with DBA privileges, and it seems the user you're connected to doesn't have those privileges. Note also that privileges granted through roles are disabled inside definer's-rights stored procedures, which is why a query can work in an anonymous block yet raise "table or view does not exist" inside a procedure; a direct grant on the view avoids that. Alternatively, you could use USER_SNAPSHOT_REFRESH_TIMES (for materialized views belonging to you) or ALL_SNAPSHOT_REFRESH_TIMES (for materialized views you have access to but which belong to other users).
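If you do need the DBA_ view inside a stored procedure, a minimal sketch of the direct grant, run as a suitably privileged user (the grantee name "me" is a stand-in):
GRANT SELECT ON sys.dba_snapshot_refresh_times TO me;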
I found my answer in this thread: Materialized Views - Identifying the last refresh. I simply make use of an alternative view (all_mviews) to obtain the last refresh date, and this works within the Oracle procedure:
SELECT ROUND(ABS((LAST_REFRESH_DATE - SYSDATE)*24*60),0)
  INTO LastRefresh
  FROM all_mviews
 WHERE owner = 'ME'
   AND mview_name = 'MATV_ADDRESS';
IF (LastRefresh > 20) THEN
  DBMS_SNAPSHOT.REFRESH(
    LIST                 => 'MATV_ADDRESS',
    PUSH_DEFERRED_RPC    => TRUE,
    REFRESH_AFTER_ERRORS => FALSE,
    PURGE_OPTION         => 1,
    PARALLELISM          => 1,
    HEAP_SIZE            => 1,
    ATOMIC_REFRESH       => FALSE,
    NESTED               => FALSE);
END IF;

Reading Oracle logs gives duplicate records

I am facing an issue when reading the Oracle logs through a time interval.
Issue:
In Oracle, while data is being inserted through some external application, if I use LogMiner to read the Oracle logs, it gives me duplicate records.
For example, suppose there is a time interval t1, t2, t3.
Data is inserted from t1 to t3.
Meanwhile, if I use LogMiner to read the data from t1 to t2 and then from t2 to t3, there are some records which come back in both intervals.
One observation: the records which show up as duplicates are at the end of the first interval and at the beginning of the second interval.
Queries which I am using:
begin
  dbms_logmnr.start_logmnr(
    STARTTIME => t1,
    ENDTIME   => t2,
    OPTIONS   => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.CONTINUOUS_MINE + DBMS_LOGMNR.COMMITTED_DATA_ONLY);
end;
select sql_redo from V$LOGMNR_CONTENTS
WHERE OPERATION IN ('INSERT','UPDATE','DELETE') and table_name = xyz
begin
  dbms_logmnr.start_logmnr(
    STARTTIME => t2,
    ENDTIME   => t3,
    OPTIONS   => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.CONTINUOUS_MINE + DBMS_LOGMNR.COMMITTED_DATA_ONLY);
end;
select sql_redo from V$LOGMNR_CONTENTS
WHERE OPERATION IN ('INSERT','UPDATE','DELETE') and table_name = xyz
The date format I am using to start the miner: DD-MON-YYYY HH24:MI:SS
Note: data is committed as soon as it is inserted.
According to the Oracle documentation, LogMiner treats the start time as greater than or equal to, and the end time as less than or equal to; LogMiner is designed this way. Both windows therefore include the boundary time t2, so records committed exactly at t2 appear in both intervals.
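One workaround, assuming the redo timestamps are reliable enough for your use: keep the start_logmnr calls as they are, but make each read window half-open by filtering on the TIMESTAMP column of V$LOGMNR_CONTENTS (or, more robustly, track the SCN column and resume each run from the last seen SCN + 1):
select sql_redo
  from V$LOGMNR_CONTENTS
 where OPERATION in ('INSERT','UPDATE','DELETE')
   and table_name = 'XYZ'        -- placeholder table name
   and timestamp >= :t1
   and timestamp <  :t2;         -- half-open: rows at exactly t2 belong to the next window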

h2 index corruption? embedded database loaded with runscript has "invisible" rows

Using h2 in embedded mode, I am restoring an in-memory database from a script backup that was previously generated by h2 using the SCRIPT command.
I use this URL:
jdbc:h2:mem:main
I am doing it like this:
FileReader script = new FileReader("db.sql");
RunScript.execute(conn,script);
which, according to the doc, should be similar to this SQL:
RUNSCRIPT FROM 'db.sql'
And inside my app they do perform the same. But if I run the load using the web console (via h2.bat), I get a different result.
Following the load of this data in my app, there are rows that I know are loaded but are not accessible to me via a query. These queries demonstrate it:
select count(*) from MY_TABLE yields 96576
select count(*) from MY_TABLE where ID <> 3238396 yields 96575
select count(*) from MY_TABLE where ID = 3238396 yields 0
Loading the web console and using the same RUNSCRIPT command and file to load yields a database where I can find the row with that ID.
My first inclination was that I was dealing with some sort of locking issue. I have tried the following (with no change in results):
manually issuing a conn.commit() after the RunScript.execute()
appending ;LOCK_MODE=3 and the ;LOCK_MODE=0 to my URL
Any pointers in the right direction on how I can identify what is going on? I ended up inserting:
Server.createWebServer("-trace","-webPort","9083").start()
so that I could run these queries through the web console to sanity-check what was coming back through JDBC. The problem happens consistently in my app and consistently doesn't happen via the web console. So there must be something else at work.
The table schema is not exotic. This is the SQL column from
select * from INFORMATION_SCHEMA.TABLES where TABLE_NAME='MY_TABLE'
CREATE MEMORY TABLE PUBLIC.MY_TABLE(
ID INTEGER SELECTIVITY 100,
P_ID INTEGER SELECTIVITY 4,
TYPE VARCHAR(10) SELECTIVITY 1,
P_ORDER DECIMAL(8, 0) SELECTIVITY 11,
E_GROUP INTEGER SELECTIVITY 1,
P_USAGE VARCHAR(16) SELECTIVITY 1
)
Any push in the right direction would be really appreciated.
EDIT
So it seems that the database is corrupted in some way just after running the RunScript command to load it. While trying to debug, I tried executing the following:
delete from MY_TABLE where ID <> 3238396
And I ended up with:
Row not found when trying to delete from index "PUBLIC.MY_TABLE_IX1: 95326", SQL statement:
delete from MY_TABLE where ID <> 3238396 [90112-178] 90112/90112 (Help)
I then tried dropping and recreating all my indexes from within the context, but it had no effect on the overall problem.
Help!
EDIT 2
More information: the problem occurs due to the creation of an index. (I believe I have found a bug in h2 and am working on creating a minimal case that reproduces it.) The simple code below will reproduce the problem, if you have the right set of data.
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.h2.tools.RunScript;

public class Repro
{
    public static void main(String[] args)
    {
        try
        {
            final String DB_H2URL = "jdbc:h2:mem:main;LOCK_MODE=3";
            Class.forName("org.h2.Driver");
            Connection c = DriverManager.getConnection(DB_H2URL, "sa", "");
            FileReader script = new FileReader("db.sql");
            RunScript.execute(c, script);
            script.close();
            Statement st = c.createStatement();
            ResultSet rs = st.executeQuery("select count(*) from MY_TABLE where P_ID = 3238396");
            rs.next();
            if (rs.getLong(1) == 0)
                System.err.println("It happened");
            else
                System.err.println("It didn't happen");
        }
        catch (Throwable e)
        {
            e.printStackTrace();
        }
    }
}
I have reduced the db.sql script to about 5000 rows and it still happens. When I went down to 2500 rows, it stopped happening. If I remove the last line of db.sql (which is the index creation), the problem also stops happening. The last line is this:
CREATE INDEX PUBLIC.MY_TABLE_IX1 ON PUBLIC.MY_TABLE(P_ID);
But the data is an important player in this. It still appears to only ever be the one row, and the index somehow makes it inaccessible.
EDIT 3
I have identified a minimal data example that reproduces the problem. I stripped the table schema down to a single column, and I found that the values in that column don't seem to matter, just the number of rows. Here are the contents (obvious stuff snipped) of my db.sql generated via the SCRIPT command:
;
CREATE USER IF NOT EXISTS SA SALT '8eed806dbbd1ea59' HASH '6d55cf715c56f4ca392aca7389da216a97ae8c9785de5d071b49de5436b0c003' ADMIN;
CREATE MEMORY TABLE PUBLIC.MY_TABLE(
P_ID INTEGER SELECTIVITY 100
);
-- 5132 +/- SELECT COUNT(*) FROM PUBLIC.MY_TABLE;
INSERT INTO PUBLIC.MY_TABLE(P_ID) VALUES
(1),
(2),
(3),
... snipped you obviously have breaks in the bulk insert here ...
(5143),
(3238396);
CREATE INDEX PUBLIC.MY_TABLE_IX1 ON PUBLIC.MY_TABLE(P_ID);
But that will recreate the problem. [Note that my numbering skips a number every time there was a bulk-insert break, so there really are 5132 rows even though you see (5143); select count(*) from MY_TABLE yields 5132.] Also, I seem to be able to recreate the problem in the web console directly now by doing:
drop table MY_TABLE
runscript from 'db.sql'
select count(*) from MY_TABLE where P_ID = 3238396
You have recreated the problem if you get 0 back from the select when you know you have a row in there.
Oddly enough, I seem to be able to do
select * from MY_TABLE order by P_ID desc
and I can see the row at this point. But going directly for the row:
select * from MY_TABLE where P_ID = 3238396
yields nothing.
I just realized that I should note that I am using h2-1.4.178.jar
The h2 folks have apparently already resolved this:
https://code.google.com/p/h2database/issues/detail?id=566
I just need to either get the code from version control or wait for the next release build. Thanks, Thomas.

Optimizing a dup delete statement in Oracle

I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in the where clause.
What is a duplicate?
If 2 or more records have the same values in the columns id, cid, type, trefid, ordrefid, amount, and paydt, then they are duplicates.
The DELETEs delete about 1 million records.
Can they be rewritten in any way to make them quicker?
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
  HASH JOIN 1296491
    Access Predicates
      AND
        A.ID=ITEM_1
        A.CID=ITEM_2
        ITEM_3=NVL(TYPE,'-99999')
        ITEM_4=NVL(TREFID,'-99999')
        ITEM_5=NVL(ORDREFID,'-99999')
        ITEM_6=NVL(AMOUNT,(-99999))
        ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
    Filter Predicates
      LOADDT<MAX(LOADDT)
    TABLE ACCESS TABLE1 FULL 267904
    VIEW VW_SQ_1 690385
      SORT GROUP BY 690385
        TABLE ACCESS TABLE1 FULL 267904
How large is the table? If the count of deleted rows is up to 12%, then you may think about an index.
Could you somehow partition your table, like week by week, and then scan only the current week?
Maybe this could be more efficient. When you're using an aggregate function, Oracle must walk through all relevant rows (in your case, a full scan), but when you use exists it stops as soon as the first occurrence is found. (And of course the query would be much faster if there were one function-based index, because of the NVLs, on all the columns in the where clause; see the sketch after the query below.)
DELETE FROM TABLE1 A
WHERE exists (
  SELECT 1
  FROM TABLE1 B
  WHERE
  a.loaddt < b.loaddt and
  a.id=b.id and
  a.cid=b.cid and
  NVL(a.type,'-99999') = NVL(b.type,'-99999') and
  NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
  NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
  NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
  NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
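A sketch of the function-based index idea mentioned above, mirroring the query's NVL expressions (the index name is made up; Oracle has to see the same expressions in the query to use the index, and it may reject NLS-dependent expressions such as TO_DATE in an index definition, in which case a DATE literal in both the index and the query is the safer spelling):
CREATE INDEX table1_dedup_fbi ON TABLE1 (
  id,
  cid,
  NVL(type,'-99999'),
  NVL(trefid,'-99999'),
  NVL(ordrefid,'-99999'),
  NVL(amount,-99999),            -- assuming amount is numeric
  NVL(paydt, DATE '9999-12-31')  -- deterministic date literal
);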
Although some may disagree, I am a proponent of running large, long running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-) Also, not sure why you need to join table1 to itself to identify duplicates (and I'd be curious if you ever run into snapshot too old issues with your current approach). You also shouldn't need multiple delete statements, all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id, cid, type, trefid, ordrefid, amount, paydt, count(1) as cnt
from table1
group by id, cid, type, trefid, ordrefid, amount, paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
  -- make sure my_dups_mv is refreshed first
  cursor dup_cur is
    select * from my_dups_mv;
  type duprec_t is record(row_id rowid);
  duprec duprec_t;
  type duptab_t is table of duprec_t index by pls_integer;
  duptab duptab_t;
  l_ctr    pls_integer := 0;
  l_dupcnt pls_integer := 0;
begin
  for rec in dup_cur
  loop
    l_ctr := l_ctr + 1;
    -- assuming needed indexes exist
    select rowid
    bulk collect into duptab
    from table1
    where id       = rec.id
      and cid      = rec.cid
      and type     = rec.type
      and trefid   = rec.trefid
      and ordrefid = rec.ordrefid
      and amount   = rec.amount
      and paydt    = rec.paydt
    -- order by whatever makes sense to make the "keeper" float to top
    order by loaddt desc;
    for i in 2 .. duptab.count
    loop
      l_dupcnt := l_dupcnt + 1;
      delete from table1 where rowid = duptab(i).row_id;
    end loop;
    if (mod(l_ctr, 10000) = 0) then
      -- log to log table here (calling autonomous procedure you'll need to implement)
      insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
      commit;
    end if;
  end loop;
  commit;
end;
/
Check your log table for progress status.
1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
This assumes you have Enterprise Edition, a sane server configuration, and that you are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, which only uses a GROUP BY, looks nicer and may run faster. But it will also probably run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to the temporary tablespace, and it may also reduce read IO if you use a partitioning strategy like jakub.petr suggested; see the sketch below.
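One reading of that suggestion, as a sketch applied to the first delete from the question (the 14-day window assumes the load runs at least weekly; widen it if unsure). Restricting the inner scan to recently loaded rows means only duplicates whose newer copy arrived inside the window get removed:
DELETE FROM TABLE1 A WHERE loaddt < (
  SELECT max(loaddt) FROM TABLE1 B
  WHERE
  a.id=b.id and
  a.cid=b.cid and
  NVL(a.type,'-99999') = NVL(b.type,'-99999') and
  NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
  NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
  NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
  NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD')) and
  b.loaddt >= sysdate - 14   -- only the inner table needs the recent rows
);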
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)
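If you don't know the SQL_ID, one way to look it up while the delete is running (or shortly afterwards, before it ages out of the shared pool):
select sql_id, sql_text
  from v$sql
 where sql_text like 'DELETE FROM TABLE1%';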
