Is it possible to create a materialized view where it is only incremental?
I would like the old data already inserted not to be updated, only new insertions should be included in the view.
If possible, how could I do it?
Is there any documentation or any place I can use as a guide?
If you want to see every row at the time of insert in an MV the short answer is:
You can't.
A materialized view stores the result of a query as it exists now, not some time in the past. So after updating a row you can either see its current values or exclude it from the MV.
If you want to preserve/view the state of data at insert you have a few options:
Make the table insert-only (possibly storing change history within the table)
Capture the data when its added (e.g. via triggers) into another table
Use Flashback Data Archive to store change history and view it with Flashback Query
Which of these is most appropriate depends on why you need view the data at insert.
Technically it is possible to view data at the time of insert - up to a point.
With Flashback Version Query you can see changes to the table over time. So you can do something like this:
create table t (
c1 int, c2 int,
insert_date timestamp,
update_date timestamp
);
exec dbms_session.sleep(10);
insert into t values ( 1, 1, systimestamp, systimestamp );
insert into t values ( 2, 2, systimestamp, systimestamp );
commit;
create materialized view mv
as
select t.*
from t
versions between scn minvalue
and maxvalue
where versions_operation = 'I';
exec dbms_session.sleep(10);
update t
set c2 = 9999,
update_date = systimestamp
where c1 = 2;
insert into t values ( 3, 3, systimestamp, systimestamp );
commit;
exec dbms_mview.refresh ( 'mv' );
select * from t;
C1 C2 INSERT_DATE UPDATE_DATE
1 1 26-JUL-2021 13.49.51.666954000 26-JUL-2021 13.49.51.666954000
2 9999 26-JUL-2021 13.49.51.712259000 26-JUL-2021 13.50.08.421872000
3 3 26-JUL-2021 13.50.08.462300000 26-JUL-2021 13.50.08.462300000
select *
from mv;
C1 C2 INSERT_DATE UPDATE_DATE
3 3 26-JUL-2021 13.50.08.462300000 26-JUL-2021 13.50.08.462300000
2 2 26-JUL-2021 13.49.51.712259000 26-JUL-2021 13.49.51.712259000
1 1 26-JUL-2021 13.49.51.666954000 26-JUL-2021 13.49.51.666954000
This uses undo to reconstruct history. Eventually older changes will drop off and you're back to only seeing the current state. By default you only get 15 minutes worth of changes!
If you want to store history for long period of time, Flashback Data Archive is the way to go.
Related
We need to change the structure of our DB by re-partitioning a parent table and adding partitions to the child tables. The partitions will be based on range of a date-time field (which is the time of insertion).
The goal is to implement the changes with as little downtime as possible. So we have though on doing it on 2 phases: do the main load while the application is running and load the delta after the application has been shutdown.
I would like to know if the following steps are correct for the plan I mentioned, focusing on the part of re-partitioning the parent table. I have based on this example from Oracle: https://docs.oracle.com/database/121/SUTIL/GUID-56D8448A-0106-4B8C-85E0-11CFED8C71B1.htm
1. create a directory for the export/import of dumps
-- user: SYS
CREATE OR REPLACE DIRECTORY dumps AS '/data/tmp';
GRANT READ, WRITE ON DIRECTORY dumps TO APP_USR;
2. export the table while the application is running
# nohup expdp APP_USR/pwd parfile=export.par &
directory=dumps
dumpfile=parent_table.dmp
parallel=8
logfile=export.log
tables=parent_table
query=parent_table:"WHERE VERSION_TSP <= sysdate"
3. create a duplicate of the table with the new partitioning
CREATE TABLE "PARENT_TABLE_REPARTITIONED"
(
-- same fields as original
CONSTRAINT PK_TABLE_ PRIMARY KEY (foo, boo) -- with different name than original
)
PARALLEL 8
PARTITION BY RANGE (VERSION_TSP)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
PARTITION PARENT_TABLE_PARTITION_ -- with different name than original (should I?)
VALUES LESS THAN (TO_DATE('2007-07-01', 'YYYY-MM-DD')) COMPRESS FOR OLTP
)
ENABLE ROW MOVEMENT
;
-- with new local indexes (named differently)
CREATE INDEX IDX_VERSION_TSP_ ON PARENT_TABLE_REPARTITIONED (VERSION_TSP) LOCAL;
4. populate the re-partitioned table while the application is running
# nohup impdp APP_USR/pwd parfile=import.par &
directory=dumps
dumpfile=parent_table.dmp
parallel=8
logfile=import.log
remap_table=PARENT_TABLE:PARENT_TABLE_REPARTITIONED
table_exists_action=APPEND
5. Application shutdown (and backup the DB)
6. populate the delta
-- execute with nohup
declare
startDate DATE;
begin
select max(version_tsp) into startDate from PARENT_TABLE_REPARTITIONED;
insert into PARENT_TABLE_REPARTITIONED select /*+ parallel(8) */ s.* from PARENT_TABLE s where s.VERSION_TSP > startDate;
commit;
end;
7. switch tables
drop table parent_table cascade constraints;
rename parent_table_repartitioned to parent_table;
8. deploy the new version of the application
Is ok to export ALL from the table or should I export DATA_ONLY?
But most importantly, are those steps a valid approach?
Are you on 12.2 or higher? It could be as easy as:
SQL> create table t
2 partition by range ( created )
3 (
4 partition p1 values less than ( date '2020-01-01' ),
5 partition p2 values less than ( date '2021-01-01' ),
6 partition p3 values less than ( date '2022-01-01' ),
7 partition p4 values less than ( date '2023-01-01' )
8 )
9 as select * from dba_objects
10 where created is not null
11 and last_ddl_time is not null;
Table created.
SQL>
SQL> create index t_ix on t ( owner ) local;
Index created.
SQL>
SQL>
SQL> alter table t modify
2 partition by range ( last_ddl_time )
3 (
4 partition p1 values less than ( date '2020-01-01' ),
5 partition p2 values less than ( date '2021-01-01' ),
6 partition p3 values less than ( date '2022-01-01' ),
7 partition p4 values less than ( date '2023-01-01' )
8 ) online ;
Table altered.
I am trying to use the following statement for the Delete process and it has to delete around 23566424 Rows, but oracle takes almost 3 hours to complete the process and we have already created an index on " SCHEDULE_DATE_KEY" but still, the process is very slow.Can someone advise on how to make Deletes faster in oracle
DELETE
FROM
EDWSOURCE.SCHEDULE_DAY_F
WHERE
SCHEDULE_DATE_KEY >
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
I don't think any index will help here, probably Oracle will decide the best approach is a full table scan to delete 20M rows from 300M. It is deleting at a rate of over 2000 rows per second, which isn't bad. In fact any additional indexes will slow it down as it has to delete the row entry from the index as well.
A quicker approach could be to create a new table of the rows you want to keep, something like:
create table EDWSOURCE.SCHEDULE_DAY_F_KEEP
as
select * from EDWSOURCE.SCHEDULE_DAY_F
where SCHEDULE_DATE_KEY <=
(
SELECT
LAST_PAYROLL_DATE_KEY
FROM
EDWSOURCE.LAST_PAYROLL_DATE
WHERE
CURRENT_FLAG = 'Y'
);
Then recreate any constraints and indexes to use the new table.
Finally drop the old table and rename the new one.
You can try testing a filtered table move. This has an online clause. So you can do this while the application is still running.
Note 12.2 and later the indexes will remain valid. In earlier versions you will need to rebuild the indexes as they will become invalid. Good Luck
Move a Table
Create and populate a new test table.
DROP TABLE t1 PURGE;
CREATE TABLE t1 AS
SELECT level AS id,
'Description for ' || level AS description
FROM dual
CONNECT BY level <= 100;
COMMIT;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS MIN_ID MAX_ID
---------- ---------- ----------
100 1 100
SQL>
Move the table, filtering out rows with an ID value greater than 50.
ALTER TABLE t1 MOVE ONLINE
INCLUDING ROWS WHERE id <= 50;
Check the contents of the table.
SELECT COUNT(*) AS total_rows,
MIN(id) AS min_id,
MAX(id) AS max_id
FROM t1;
TOTAL_ROWS MIN_ID MAX_ID
---------- ---------- ----------
50 1 50
SQL>
The rows with an ID value between 51 and 100 have been removed.
As mentioned above if maybe best to PARTITION the table abs drop a PARTITION every N number of days as part of a daily task.
Is it possible in an Oracle 12/19 database to get a list of tables and its records which have changed/added/deleted within the last X month? The problem here is, that the tables do not have any kind of timestamp per record. Is this possible?
You can use Flashback Query to find the changes to a table in period of time/range of SCNs:
create table t (
c1 int
);
exec dbms_lock.sleep ( 2 );
insert into t values ( 1 );
commit;
select current_scn from v$database;
CURRENT_SCN
19279246
exec dbms_lock.sleep ( 2 );
update t set c1 = 2;
commit;
delete t;
exec dbms_lock.sleep ( 2 );
commit;
select t.*, versions_operation, versions_startscn, versions_endscn
from t
versions between scn 19279246 and maxvalue
order by versions_startscn nulls first;
C1 VERSIONS_OPERATION VERSIONS_STARTSCN VERSIONS_ENDSCN
1 <null> <null> 19279250
2 U 19279250 19279253
2 D 19279253 <null>
The default window for this is small (900s). So it's highly unlikely you'll be able to go back and view changes from weeks or even hours ago. Enable Flashback Data Archive to define how long you want to keep the history for.
I need to create materialized view test without data then I will create a script to insert data into this materialized view for the first time. After this I will run materialized view refresh to refresh the view every night.
As I am not expert in materialized views can anyone help me here.
At present I have script to create materialized view which is running for 2 hours for 20 million rows.
create materialize view
If I understand the question correctly, you want to break up the MV creation into separate steps:
Create an empty table / materialized view.
Populate it.
Schedule a nightly refresh process.
For this you can use the on prebuilt table clause to change a normal table into a materialized view.
Demo source table:
create table demo_source (id, name) as
select 1, 'Red' from dual union all
select 2, 'Yellow' from dual union all
select 3, 'Orange' from dual union all
select 4, 'Blue' from dual;
New table which is going to be our MV (you could also populate it with the create table as select, or you could create it using explicit column names, datatypes, constraints, partitioning etc like any normal table):
create table demo_mv as
select * from demo_source s
where 1 = 2;
Populate it using a separate insert step:
insert into demo_mv
select * from demo_source;
Now we convert it from a regular table into an MV:
create materialized view demo_mv on prebuilt table
as
select * from demo_source;
Now DEMO_MV is a materialized view.
If I were you, I'd create the materialized view "as is" (i.e. no restrictions you mentioned).
Anyway: the simplest option is to include the false condition in the WHERE clause which creates the object without data, such as
SQL> create materialized view mv_dept as
2 select * from dept
3 where 1 = 2; --> this
Materialized view created.
SQL> select * from mv_dept;
no rows selected
SQL> desc mv_dept;
Name Null? Type
----------------------------- -------- --------------------
DEPTNO NOT NULL NUMBER(2)
DNAME VARCHAR2(14)
LOC VARCHAR2(13)
SQL>
I know this question was asked specifically about Oracle, but I got here looking for the same question about Postgres.
Luckily, Postgres has a 'WITH NO DATA' clause at the end of the materialized view statement that just creates the view but does not populate data into it. It can still be refreshed on-demand the same way after that.
https://www.postgresql.org/docs/current/sql-creatematerializedview.html
I have the same problem. At deploy time I don't want that the refresh takes to0 much time. So here I think is a better solution.
drop materialized view test_mv;
create materialized view test_mv
as select * from all_objects
where 1 = ( select count(*) from user_tables where table_name = 'TEST_MV' ) ;
select * From test_mv
=> null
exec DBMS_MVIEW.REFRESH('TEST_MV', method => 'C', atomic_refresh => FALSE, out_of_place => false , PARALLELISM => 4);
select * From test_mv
=> result is now of all objects
I have a data mart mastered from our OLTP Oracle database using basic Materialized Views with on demand fast refresh capability. Refresh is working fine. What I am interested in adding are some statistics about the refresh of each Materialized View, such as the number of inserts, updates, and deletes that were applied to the master table since the last refresh like that data I can find in user_tab_modifications. Is this possible for Materialized Views?
Prior to doing the refresh, you could query the materialized view log to see what sort of change vectors it stores. Those will be the change vectors that need to be applied to the materialized view during the refresh process (assuming that there is just one materialized view that depends on this materialized view log).
For example, if I create my table, my materialized view log, and my materialized view.
SQL> create table foo( col1 number primary key);
Table created.
SQL> create materialized view log on foo;
Materialized view log created.
SQL> ed
Wrote file afiedt.buf
1 create materialized view mv_foo
2 refresh fast on demand
3 as
4 select *
5* from foo
SQL> /
Materialized view created.
SQL> insert into foo values( 1 );
1 row created.
SQL> insert into foo values( 2 );
1 row created.
SQL> commit;
Commit complete.
Now, I refresh the materialized view and verify that the table and the materialized view are in sync
SQL> exec dbms_mview.refresh( 'MV_FOO' );
PL/SQL procedure successfully completed.
SQL> select * from user_tab_modifications where table_name = 'MV_FOO';
no rows selected
SQL> select * from foo;
COL1
----------
1
2
SQL> select * from mv_foo;
COL1
----------
1
2
Since the two objects are in sync, the materialized view log is empty (the materialized view log will be named MLOG$_<<table name>>
SQL> select * from mlog$_foo;
no rows selected
Now, if I insert a new row into the table, I'll see a row in the materialized view log with a DMLTYPE$$ of I indicating an INSERT
SQL> insert into foo values( 3 );
1 row created.
SQL> select * from mlog$_foo;
COL1 SNAPTIME$ D O
---------- --------- - -
CHANGE_VECTOR$$
--------------------------------------------------------------------------------
XID$$
----------
3 01-JAN-00 I N
FE
2.2519E+15
So you could do something like this to get the number of pending inserts, updates, and deletes.
SELECT SUM( CASE WHEN dmltype$$ = 'I' THEN 1 ELSE 0 END ) num_pending_inserts,
SUM( CASE WHEN dmltype$$ = 'U' THEN 1 ELSE 0 END ) num_pending_updates,
SUM( CASE WHEN dmltype$$ = 'D' THEN 1 ELSE 0 END ) num_pending_deletes
FROM mlog$_foo
Once you refresh the materialized view log, however, this information is gone.
On the other hand, USER_TAB_MODIFICATIONS should track the approximate number of changes that have been made to the materialized view since the last time that statistics were gathered on it just as it would track the information for a table. You'll almost certainly need to call DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO to force the data to be made visible if you want to capture the data before and after the refresh of the materialized view.
SELECT inserts, updates, deletes
INTO l_starting_inserts,
l_starting_updates,
l_starting_deletes
FROM user_tab_modifications
WHERE table_name = 'MV_FOO';
dbms_mview.refresh( 'MV_FOO' );
dbms_stats.flush_database_monitoring_info;
SELECT inserts, updates, deletes
INTO l_ending_inserts,
l_ending_updates,
l_ending_deletes
FROM user_tab_modifications
WHERE table_name = 'MV_FOO';
l_incremental_inserts := l_ending_inserts - l_starting_inserts;
l_incremental_updates := l_ending_updates - l_starting_updates;
l_incremental_deletes := l_ending_deletes - l_starting_deletes;