I'm having trouble with the content of a materialized view in Oracle 19c (version 19.15). I've managed to distill the issues into a reproducible test with this script:
create table b(
tsn varchar2(16) not null primary key,
fid varchar2(256) not null
);
create table bs(
tsn varchar2(16) not null constraint bet_stakes_fk references b,
leg number(1) not null,
amount number(10) not null,
primary key (tsn, leg) using index compress 1
);
create materialized view log on b
with primary key, rowid, sequence, commit scn (fid)
including new values;
create materialized view log on bs
with primary key, rowid, sequence, commit scn (amount)
including new values;
create materialized view bsd_mv
refresh fast start with (sysdate - 1) next (sysdate + 1/14400)
as
select fid, leg, sum(amount), count(*)
from bs inner join b using (tsn)
group by fid, leg;
insert into b values ('a', 'o');
insert into bs values ('a', 1, 10);
commit;
delete from bs where tsn = 'a';
delete from b where tsn = 'a';
insert into b values ('a', 'o');
insert into bs values ('a', 1, 5);
commit;
Wait 10 seconds or so before selecting:
select * from bsd_mv;
The result will vary somewhat with different runs of the script, but usually the result will be
| Fid | Leg | Sum | Count |
| --- | --- | --- | ----- |
| o | 1 | 15 | 3 |
... where I would expect it to be...
| Fid | Leg | Sum | Count |
| --- | --- | --- | ----- |
| o | 1 | 5 | 1 |
If I run the query the view is based on, I always get the expected result.
Am I missing something in the setup, or do I have the wrong expectations, or have I triggered a bug in Oracle?
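A useful sanity check for the "am I missing something in the setup" part is DBMS_MVIEW.EXPLAIN_MVIEW, which reports whether each refresh capability applies and, if not, why. A sketch, assuming mv_capabilities_table has already been created with $ORACLE_HOME/rdbms/admin/utlxmv.sql:

```sql
-- Populate mv_capabilities_table with the view's refresh capabilities.
begin
  dbms_mview.explain_mview('BSD_MV');
end;
/
-- Each row says whether a capability (e.g. REFRESH_FAST_AFTER_ONETAB_DML)
-- is possible; msgtxt explains any restriction that blocks it.
select capability_name, possible, msgtxt
from   mv_capabilities_table
where  capability_name like 'REFRESH_FAST%';
```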
It took months with Oracle support, but eventually this was accepted as a bug...
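Until a fix is available, one hedged workaround (assuming a complete refresh is affordable at your data volumes) is to rebuild the view from its defining query rather than rely on the fast-refresh deltas:

```sql
-- 'C' requests a complete refresh: the MV is recomputed from its
-- defining query, so any corrupt fast-refresh results are discarded.
begin
  dbms_mview.refresh(list => 'BSD_MV', method => 'C');
end;
/
```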
Say you have a table of car owners (car_owners):
car_id
person_id
registration_date
...
And each time someone buys a car, a record is inserted into this table.
Now I would like to create a materialized view that only holds the newest registration for each vehicle: when a record is inserted, the materialized view updates the row for that vehicle (if it exists) with the new record from the base table.
The materialized view should hold only one record per car.
I tried something like this:
create materialized view newest_owner
build immediate
refresh force on commit
as
select *
from car_owners c
where c.registration_date = (
select max(cc.registration_date)
from car_owners cc
where cc.car_id = c.car_id
);
It seems that materialized views do not like sub-selects.
Do you have any tips on how to do this, or how to achieve it another way?
I have another solution for now: use triggers to update a separate table that holds the newest values. But I was hoping that materialized views could do the trick.
Thanks.
For this, you may have to use nested materialized views:
create table car_owners
(pk_col number primary key
,car_id number
,person_id number
,registration_date date
);
truncate table car_owners;
insert into car_owners
select rownum
,trunc(dbms_random.value(1,1000)) car_id
,mod(rownum,100000) person_id
,(sysdate-dbms_random.value(1,3000)) registration_date
from dual
connect by rownum <= 1000000;
commit;
exec dbms_stats.gather_table_stats(null,'car_owners')
create materialized view log on car_owners with sequence, rowid
(car_id, registration_date) including new values;
create materialized view latest_registration
refresh fast on commit enable query rewrite
as
select c.car_id
,max(c.registration_date) max_registration_date
from car_owners c
group by c.car_id
/
create materialized view log on latest_registration with sequence, rowid
(car_id, max_registration_date) including new values;
create materialized view newest_owner
refresh fast on commit enable query rewrite
as
select c.rowid row_id,cl.rowid cl_rowid, c.pk_col, c.car_id, c.person_id, c.registration_date
from car_owners c
join latest_registration cl
on c.registration_date = cl.max_registration_date
and c.car_id = cl.car_id
/
select * from newest_owner where car_id = 25;
ROW_ID CL_ROWID PK_COL CAR_ID PERSON_ID REGISTRAT
------------------ ------------------ ---------- ---------- ---------- ---------
AAAUreAAMAAD/IxABS AAAUriAAMAAD+TNACE 644158 25 44158 09-APR-22
insert into car_owners values (1000001, 25,-1,sysdate);
commit;
select * from newest_owner where car_id = 25;
ROW_ID CL_ROWID PK_COL CAR_ID PERSON_ID REGISTRAT
------------------ ------------------ ---------- ---------- ---------- ---------
AAAUreAAMAAD/pLAB1 AAAUriAAMAAD+TNACE 1000001 25 -1 22-APR-22
explain plan for
select c.rowid row_id,cl.rowid cl_rowid, c.pk_col, c.car_id, c.person_id, c.registration_date
from car_owners c
join latest_registration cl
on c.registration_date = cl.max_registration_date
and c.car_id = cl.car_id
/
select * from dbms_xplan.display();
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 999 | 41958 | 3 (0)| 00:00:01 |
| 1 | MAT_VIEW REWRITE ACCESS FULL| NEWEST_OWNER | 999 | 41958 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Have a read of the docs:
https://docs.oracle.com/en/database/oracle/oracle-database/21/dwhsg/basic-materialized-views.html#GUID-E087FDD0-B08C-4878-BBA9-DE56A705835E
https://docs.oracle.com/en/database/oracle/oracle-database/21/dwhsg/basic-materialized-views.html#GUID-179C8C8A-585B-49E6-8970-09396DB53DE3
There are some restrictions that can slow down your refreshes (e.g. deletes).
I would like to analyze table usage on Vertica to check the following:
the tables that are hit most by queries
tables that are getting more write queries
tables that are getting more read queries.
So I am asking for help with a SQL query, or if anyone has any documents, please point me in the right direction. Thank you.
Here, I create a function QTYPE() that classifies a request of type 'QUERY' as either a SELECT, an INSERT, or a MODIFY (meaning DELETE, UPDATE, or MERGE). These are grouped together because, in Vertica, UPDATE and MERGE are actually DELETEs followed by INSERTs.
I use two regular expressions of a certain complexity: the first finds [schema.]tablename after a JOIN or FROM keyword; the second finds [schema.]tablename after an UPDATE, INSERT INTO, MERGE INTO, or DELETE FROM keyword. Then, I join back to the tables system table to a) keep only the tables that really exist and b) add the schema name if it is missing.
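To illustrate the first (look-behind) pattern, here is what it should extract from a sample statement; the table names are made up, and Vertica allows a FROM-less SELECT for this kind of test:

```sql
-- Occurrence 1 is the name after FROM, occurrence 2 the name after JOIN.
-- The match includes the leading whitespace, hence the LTRIM.
SELECT LTRIM(REGEXP_SUBSTR(
         'select * from s1.orders o join customers c on c.id = o.cust_id',
         '(?<=(from|join))\s+(\w+\.)?\w+\b', 1, 1, 'i')) AS tb1
     , LTRIM(REGEXP_SUBSTR(
         'select * from s1.orders o join customers c on c.id = o.cust_id',
         '(?<=(from|join))\s+(\w+\.)?\w+\b', 1, 2, 'i')) AS tb2;
-- expected: tb1 = 's1.orders', tb2 = 'customers'
```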
The final report would be:
qtype | tbname | tx_count
--------+------------------------------------------------------+----------
INSERT | dbadmin.nrm_cpustats_rate | 74
INSERT | dbadmin.v_poll_item | 39
INSERT | dbadmin.child | 32
INSERT | dbadmin.tbid | 32
INSERT | dbadmin.etl_group_membership | 12
INSERT | dbadmin.sensor_oco | 11
INSERT | webanalytics.webtraffic_part | 10
INSERT | webanalytics.webtraffic_new_design_platform_datadate | 9
MODIFY | cp.foo | 2
MODIFY | public.foo | 2
MODIFY | taboola_tests.foo | 2
SELECT | dbadmin.flext | 112
SELECT | dbadmin.children | 112
SELECT | dbadmin.ffoo | 112
SELECT | dbadmin.demovals | 112
SELECT | dbadmin.allbut4 | 112
SELECT | dbadmin.allcols | 112
SELECT | dbadmin.allbut1 | 112
SELECT | dbadmin.flx | 112
Here's the function definition, the CREATE TABLE statement that collects the statistics you're looking for, and finally the query producing the 'hit parade' of the most-touched tables ...
Mind you, it might become a long runner if there is a lot of history in your query_requests table ...
CREATE OR REPLACE FUNCTION qtype(sql VARCHAR(64000))
RETURN VARCHAR(8) AS BEGIN
RETURN
CASE UPPER(REGEXP_SUBSTR(sql,'\w+')::VARCHAR(16))
WHEN 'SELECT' THEN 'SELECT'
WHEN 'WITH' THEN 'SELECT'
WHEN 'AT' THEN 'SELECT'
WHEN 'INSERT' THEN 'INSERT'
WHEN 'DELETE' THEN 'MODIFY'
WHEN 'UPDATE' THEN 'MODIFY'
WHEN 'MERGE' THEN 'MODIFY'
ELSE UPPER(REGEXP_SUBSTR(sql,'\w+')::VARCHAR(16))
END
;
END;
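A quick smoke test of the function (again using Vertica's FROM-less SELECT):

```sql
-- First keyword decides the classification:
SELECT qtype('UPDATE t SET x = 1')                    AS u  -- 'MODIFY'
     , qtype('with x as (select 1) select * from x')  AS w  -- 'SELECT'
     , qtype('INSERT INTO t VALUES (1)')              AS i; -- 'INSERT'
```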
DROP TABLE IF EXISTS table_op_stats;
CREATE TABLE table_op_stats AS
WITH
-- need 1000 integers - up to ~400 source tables found in 1 select
i(i) AS (
SELECT MICROSECOND(tm)
FROM (
SELECT TIMESTAMPADD(MICROSECOND, 1,'2000-01-01'::TIMESTAMP)
UNION ALL SELECT TIMESTAMPADD(MICROSECOND,1000,'2000-01-01'::TIMESTAMP)
) l(ts)
TIMESERIES tm AS '1 MICROSECOND' OVER(ORDER BY ts)
)
,
tblist AS (
-- selects can affect several types, found by JOIN or FROM keyword before
-- hence look_behind regular expression
SELECT
QTYPE(request) AS qtype
, transaction_id
, statement_id
, i
, LTRIM(REGEXP_SUBSTR(request,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,i,'i')) as tbname
FROM query_requests CROSS JOIN i
WHERE request_type='QUERY'
AND success
AND LTRIM(REGEXP_SUBSTR(request,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,i,'i')) <> ''
UNION ALL
-- insert/delete/update/merge queries only affect one table each
SELECT
QTYPE(request) AS qtype
, transaction_id
, statement_id
, 1 AS i
, LTRIM(REGEXP_SUBSTR(request,'(insert\s+.*into\s+|update\s+.*|merge\s+.*into|delete\s+.*from)\s*((\w+\.)?\w+)\b',1,1,'i',2)) as tbname
FROM query_requests
WHERE request_type='QUERY'
AND success
AND QTYPE(request) <> 'SELECT'
)
,
-- join back to the "tables" system table - removes queries from correlation names, and adds schema name if needed
real_tables AS (
SELECT
qtype
, transaction_id
, statement_id
, i
, CASE WHEN SPLIT_PART(tbname,'.',2)=''
THEN table_schema||'.'||tbname
ELSE tbname
END AS tbname
FROM tblist
JOIN tables ON CASE WHEN SPLIT_PART(tbname,'.',2)=''
THEN tbname=table_name
ELSE SPLIT_PART(tbname,'.',1)=table_schema AND SPLIT_PART(tbname,'.',2)=table_name
END
)
SELECT
qtype
, transaction_id
, statement_id
, i
, tbname
FROM real_tables;
-- Time: First fetch (0 rows): 42483.769 ms. All rows formatted: 42484.324 ms
-- the query at the end:
WITH grp AS (
SELECT
qtype
, tbname
, COUNT(*) AS tx_count
FROM table_op_stats
GROUP BY 1,2
)
SELECT
*
FROM grp
LIMIT 8 OVER(
PARTITION BY qtype
ORDER BY tx_count DESC
);
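The `LIMIT 8 OVER (...)` at the end is Vertica's top-k-per-partition shorthand; the same report can be written portably with ROW_NUMBER(), in case you need it on another database:

```sql
-- Portable top-8 per qtype, equivalent to Vertica's LIMIT ... OVER
with grp as (
  select qtype, tbname, count(*) as tx_count
  from table_op_stats
  group by qtype, tbname
)
select qtype, tbname, tx_count
from (
  select grp.*
       , row_number() over (partition by qtype order by tx_count desc) as rn
  from grp
) ranked
where rn <= 8;
```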
Suppose I have two tables (tblA and tblB) and want to switch the second column of each table (tblA.Grade and tblB.Grade) as shown:
+------------------------------------+
| table a            table b         |
+------------------------------------+
| name   grade       name   grade    |
| a      60          f      70       |
| b      45          g      80       |
| c      30          h      90       |
+------------------------------------+
Now, I would like to switch the grade column from table a to table b, and the grade column from table b to table a. The result should look like this:
+------------------------------------+
| table a            table b         |
+------------------------------------+
| name   grade       name   grade    |
| a      70          f      60       |
| b      80          g      45       |
| c      90          h      30       |
+------------------------------------+
I have created the tables, loaded them into collections using BULK COLLECT, and used the following code to complete the transformation:
insert into tblA values('a',60);
insert into tblA values('b',45);
insert into tblA values('c',30);
insert into tblb values('f',70);
insert into tblb values('g',80);
insert into tblb values('h',90);
DECLARE
TYPE tbla_type IS TABLE OF tbla%ROWTYPE;
l_tbla tbla_type;
TYPE tblb_type IS TABLE OF tblb%ROWTYPE;
l_tblb tblb_type;
BEGIN
-- All rows at once...
SELECT *
BULK COLLECT INTO l_tbla
FROM tbla;
SELECT *
BULK COLLECT INTO l_tblb
FROM tblb;
DBMS_OUTPUT.put_line (l_tblb.COUNT);
FOR indx IN 1 .. l_tbla.COUNT
LOOP
DBMS_OUTPUT.put_line (l_tbla(indx).name);
update tbla set grade = l_tblb(indx).grade
where l_tbla(indx).name = tbla.name;
update tblb set grade = l_tbla(indx).grade
where l_tblb(indx).name = tblb.name;
END LOOP;
END;
So, although I did the task, I am wondering: is there a simpler solution that I have not thought of?
Note that there is no such thing as a first or second record in databases, as there is no guarantee that the first record entered will be the first one returned. So there should always be an ORDER BY to decide first/second etc.
So, assuming you want the records ordered by name, and then to swap the grade of the smallest name in the first table with the grade of the smallest name in the second table, and so on:
Now, assuming you fix the ordering issue in your existing code, and if it is working, I believe it would be faster than the approach below. Something like:
Create a temp table holding the names and grades, ordered by name.
The reason for using a temp table is mostly that, if I later want to correct or revert the data, I can use the same temp table to reverse the merge.
create table tmp1 as
with ta as
(select t.* ,
row_number() over (order by name) as rnk
from tblA t)
,tb as
(select t.* ,
row_number() over (order by name) as rnk
from tblb t)
select ta.name as ta_name,ta.grade as ta_grade,
tb.name as tb_name,tb.grade as tb_grade
from ta inner join tb
on ta.rnk=tb.rnk
Output of tmp1
+---------+----------+---------+----------+
| TA_NAME | TA_GRADE | TB_NAME | TB_GRADE |
+---------+----------+---------+----------+
| a | 60 | f | 70 |
| b | 45 | g | 80 |
| c | 30 | h | 90 |
+---------+----------+---------+----------+
Then use merge to swap value from tmp1.
merge into tbla t1
using tmp1 t
on (t1.name=t.ta_name)
when matched then update
set t1.grade=t.tb_grade;
merge into tblb t1
using tmp1 t
on (t1.name=t.tb_name)
when matched then update
set t1.grade=t.ta_grade;
If you are satisfied with the result, drop the temp table later:
drop table tmp1;
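Since tmp1 keeps both original grade columns, the revert mentioned earlier is just the same two merges with the source columns swapped back:

```sql
-- Undo the swap: restore each table's original grades from tmp1
merge into tbla t1
using tmp1 t
on (t1.name = t.ta_name)
when matched then update
set t1.grade = t.ta_grade;

merge into tblb t1
using tmp1 t
on (t1.name = t.tb_name)
when matched then update
set t1.grade = t.tb_grade;
```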
I have a sample of the table and problem I am trying to solve in Oracle.
CREATE TABLE mytable (
id_field number
,status_code number
,desc1 varchar2(15)
);
INSERT INTO mytable VALUES (1,240,'desc1');
INSERT INTO mytable VALUES (2,242,'desc1');
INSERT INTO mytable VALUES (3,241,'desc1');
INSERT INTO mytable VALUES (4,244,'desc1');
INSERT INTO mytable VALUES (5,240,'desc2');
INSERT INTO mytable VALUES (6,242,'desc2');
INSERT INTO mytable VALUES (7,245,'desc2');
INSERT INTO mytable VALUES (8,246,'desc2');
INSERT INTO mytable VALUES (9,246,'desc1');
INSERT INTO mytable VALUES (10,242,'desc1');
commit;
SELECT *
FROM mytable
WHERE status_code IN CASE WHEN desc1 = 'desc1' THEN (240,242)
WHEN desc1 = 'desc2' THEN (240,245)
END
Basically I need to select a subset of status codes for each condition.
I could solve this with separate statements, but the actual table I am doing this on has multiple descriptions, and that would result in around 20 unioned queries.
Any way to do this in one statement like I have attempted?
I believe that a CASE expression can only return one value (corresponding to one column in the result set). However, you can achieve this in your WHERE clause without using a CASE expression:
WHERE (desc1 = 'desc1' AND status_code IN (240,242)) OR
(desc1 = 'desc2' AND status_code IN (240,245))
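With ~20 descriptions, the OR chain gets long; another option is to keep the allowed (desc1, status_code) pairs in an inline view (or a real lookup table) and join. A sketch using the sample data's codes:

```sql
-- Inline mapping of allowed description/status pairs; with many
-- descriptions this could live in a real lookup table instead.
with allowed as (
  select 'desc1' as desc1, 240 as status_code from dual union all
  select 'desc1', 242 from dual union all
  select 'desc2', 240 from dual union all
  select 'desc2', 245 from dual
)
select m.*
from mytable m
join allowed a
  on  a.desc1       = m.desc1
  and a.status_code = m.status_code;
```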
I like Tim's answer better, but at least in Postgres you can do this. I couldn't try it on Oracle.
SQL Fiddle DEMO
SELECT *
FROM mytable
WHERE CASE WHEN desc1 = 'desc1' THEN status_code IN (240,242)
WHEN desc1 = 'desc2' THEN status_code IN (240,245)
END
ORDER BY desc1
OUTPUT
| id_field | status_code | desc1 |
|----------|-------------|-------|
| 1 | 240 | desc1 |
| 2 | 242 | desc1 |
| 10 | 242 | desc1 |
| 5 | 240 | desc2 |
| 7 | 245 | desc2 |
I have the following use case:
A table stores the changed as well as the original data from a person. My query is designed to get only one row for each person: The changed data if there is some, else the original data.
I populated the table with 100k rows of data and 2k rows of changed data. When using a primary key on my table, the query runs in less than half a second. If I put an index on the table instead of a primary key, the query runs really slowly. So I'll use the primary key, no doubt about that.
My question is: Why is the PK approach so much faster than the one with an index?
Code here:
drop table up_data cascade constraints purge;
/
create table up_data(
pk integer,
hp_nr integer,
up_nr integer,
ps_flag varchar2(1),
ps_name varchar2(100)
-- comment this out and uncomment the index below.
, constraint pk_up_data primary key (pk,up_nr)
);
/
-- insert some data
insert into up_data
select rownum, 1, 0, 'A', 'tester_' || to_char(rownum)
from dual
connect by rownum < 100000;
/
-- insert some changed data
-- change ps_flag = 'B' and mark it with a change number in up_nr
insert into up_data
select rownum, 1, 1, 'B', 'tester_' || to_char(rownum)
from dual
connect by rownum < 2000;
/
-- alternative(?) to the primary key
-- CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/
The select statement looks like this:
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
The statement might be a target for optimization, but for the moment it will stay like this.
When you create a primary key constraint, Oracle also creates an index to support it at the same time. A primary key index has a couple of important differences from a basic index, namely:
All the values in it are guaranteed to be unique
There are no NULLs in the indexed columns (the columns forming the PK)
These reasons are the key to the performance differences you see. Using your setup, I get the following query plans:
--fast version with PK
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
-----------------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| PK_UP_DATA | 103K|
| 4 | INDEX UNIQUE SCAN | PK_UP_DATA | 1 |
-----------------------------------------------------
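As an aside, before dropping the constraint you can confirm which index backs it:

```sql
-- The index created to support the constraint shows up in user_indexes;
-- by default it takes the constraint's name and is UNIQUE.
select index_name, uniqueness
from   user_indexes
where  table_name = 'UP_DATA';
```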
alter table up_data drop constraint pk_up_data;
CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/
--slow version with normal index
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX FAST FULL SCAN| IDX_UP_DATA | 1870 |
------------------------------------------------------
The big difference is that the fast version employs an INDEX UNIQUE SCAN, rather than an INDEX FAST FULL SCAN, for the second access of the table data.
From the Oracle docs (emphasis mine):
In contrast to an index range scan, an index unique scan must have
either 0 or 1 rowid associated with an index key. The database
performs a unique scan when a predicate references all of the columns
in a UNIQUE index key using an equality operator. An index unique scan
stops processing as soon as it finds the first record because no
second record is possible.
This optimization to stop processing proves to be a significant factor in this example. The fast version of your query:
Full scans ~103,000 index entries
For each of these, finds at most one matching row in the PK index and stops processing that index further
The slow version:
Full scans ~103,000 index entries
For each of these, performs another scan of the ~103,000 index entries to see whether there are any matches.
So to compare the work done:
With the PK, we have one fast full scan, then 103,000 lookups of a single index value
With the normal index, we have one fast full scan, then 103,000 scans of 103,000 index entries: several orders of magnitude more work!
In this example, both the uniqueness of the primary key and the not null-ness of the index values are necessary to get the performance benefit:
-- create index as unique - we still get two fast full scans
drop index idx_up_data;
create unique index idx_up_data ON up_data(pk, up_nr);
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX FAST FULL SCAN| IDX_UP_DATA | 1870 |
------------------------------------------------------
-- now the columns are not null, we see the index unique scan
alter table up_data modify (pk not null, up_nr not null);
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX UNIQUE SCAN | IDX_UP_DATA | 1 |
------------------------------------------------------