Are materialized views virtual tables or real tables with real data? - oracle

When materialized views are created in Oracle, do they store indices or do they store actual table values?
I am asking this as creating index on table and using views on that table and using materialized views (created with refresh complete start with (sysdate) next (sysdate+1) with rowid as) on unindexed table gives similar performance.
Where as I would expect materialized views to be far more faster.
Update
I slightly modified the content/title. My current concern after discussion is if materialized views are actual real tables or virtual tables with some optimization.

Materialized views create a copy of the data. To all intents and purposes they are actual tables. In fact we can create a materialized view from an existing table using the PREBUILT clause. The only difference is how the data is mastered - a materialized view doesn't own its data, a table does.
As to your performance conundrum:
When you say "on unindexed table" do you literally mean one table? If so, we wouldn't expect any difference in the time to query a view, a materialized view or the actual data: they all execute a full table scan on the same volume of data.
Consider the case where views have expecting select * from <table> where <condition>.
We would a SELECT against a materialised view built on that query to execute quicker than the same SELECT against the actual table, provided the WHERE clause restricts the data to a significantly smaller subset of the original data. Simply because a full table scan over a small table (materialised view) takes less time than a full table scan over a big table. Same applies if the materialised view's projection has fewer columns than the base table.
Indexing is a different matter. Unless the query selects a very small subset of the data it's not going to be more efficient than a full table scan and a filter.
To sum up: the only universal tuning heuristic is: it takes less time to do less work. Beyond that it is impossible to generalise. We can't discuss some vague "consider the case where views have select * from <table> where <condition>." It's all about the specifics.

Fundamentally, a materialized view is just a table with an associated query to populate it.
Given static data, one would generally expect the performance of a SELECT * from the materialized view (with no WHERE clause) to be at least as fast as running the query that underlies the materialized view, regardless of indexing.
If we add a WHERE clause to a SELECT * against the mview, however, that query could perform significantly slower than running the query that underlies the mview with the same WHERE clause. That's because the tables referenced in the query underlying the mview could have indexes to support the conditions in the WHERE clause, where as the mview might not have such indexes.

Related

Simple Join Query Execution Time v/s Materialized View execution Time

I am looking for performance improvement of SQL (ORACLE)
Based on few examples I tried to compare execution time between simple join between two tables v/s same query with MatrializedView.
Both execution time is almost same.
TableA Join TableB
V/s
CREATE MATERIALIZED VIEW emp_mv
BUILD IMMEDIATE
REFRESH FORCE
ON DEMAND
AS (QUERY TableA Join TableB)
both sqls are running for 7m for 1000 records.
total we have 14k Records in Table A and 50 recoreds in Table B , final output with 14K records
Is there anything which I am missing regarding performance of query execution?
Why would you expect the same query to act differently?
Materialized view's benefit might come later, when you actually start using data it contains because you already pre-processed it and prepared for future use. You could use the same query (with the join) over and over again and it'll take more or less the same time (disregard caching). But, if you store that query's data into a materialized view (and properly index it), data retrieval might/should be faster.
That's kind of "opposite" of creating an ordinary view which doesn't contain any data - it is just a stored query and it retrieves data every time you select from it, performing the same join all over again.
Materialized view contains data, just as if it were a table. It helps a lot if data is stored in tables you access over database links - that might be, and usually is, slow. But, if you create a materialized view (during night/off hours), you have data available to you much faster. It won't help much if data in tables change frequently because you'll have to refresh the MV frequently as well (usually ON COMMIT), but - if tables are really large, you have a complex query, then refreshing might also take some (a lot of?) time.

Materialized View on Snowflake

I am migrating Oracle objects to Snowflake. I have materialized view in Oracle that fetches data from multiple tables but in Snowflake a materialized view can be created on single table only. Can I use Oracle materialized view script and use it as a simple view to load into a temporary table and then use this temporary table to create a materialized view on top of it?
Can I use Oracle materialized view script and use it as a simple view to load into a temporary table and then use this temporary table to create a materialized view on top of it?
No, this won't work. A materialized view in Snowflake cannot be based on another view. But don't despair, just because you needed a materialised view in Oracle does not mean that you will need one in Snowflake ! On the contrary, it is typical in scenarios where a materialized view was needed in traditional RDBMS, that no special handling is required in Snowflake due to it's superior performance. Snowflake recommends the following considerations when deciding to use a materialized or regular view:
Create a materialized view when all of the following are true:
The query results from the view don’t change often. This almost always means that the underlying/base table for the view doesn’t change often, or at least that the subset of base table rows used in the materialized view don’t change often.
The results of the view are used often (typically significantly more often than the query results change).
The query consumes a lot of resources. Typically, this means that the query consumes a lot of processing time or credits, but it could also mean that the query consumes a lot of storage space for intermediate results.
Create a regular view when any of the following are true:
The results of the view change often.
The results are not used often (relative to the rate at which the results change).
The query is not resource intensive so it is not costly to re-run it.

ClickHouse: How to delete on *AggregatingMergeTree tables from a materialized view

Having a structure where there is a base table, then a materialized view base_mv that aggregates sending the result TO an AggregatedMergeTree table base_agg_by_id. Then we have a view over this final table. base_unique. Similarly as in this blog post](https://www.altinity.com/blog/clickhouse-continues-to-crush-time-series).
However, if I delete from base, I would expect the base_mv would trigger the mutation and then act on it, and reflected on the base_agg_by_id, but it doesn't.
Is this the expected behaviour? How to DELETE in such a schema?
I've seen here that in MVs that keep data you can act on .inner tables. However in this case, since the table is from an AggregatedMergeTree and its fields are defined as functions (e.g. AggregateFunction(argMax, String, DateTime) ), I cannot apply a deletion via a value such as ALTER base_agg_by_id DELETE WHERE field = 'myval'.
Note. For the record, we have these tables in a replicated environment using Replicated* engine: base_d, base_agg_by_id_d, base_unique_d
Mutations are not propagated to materialized views.
The reason is very simple: it not possible in common case. And even in cases when it is theoretically possible it can be very expensive operation.
For example, let's say you're deleting one record from the table which references some userid. And your materialized view contains uniqState( userid ). Data structures used for calculating uniqState don't support 'remove' operation; but even if they would - the is no way to decide if that userid should be removed or not without rereading whole data for the partition again because that userid could be seen in other records too.
So in general case, you need to refill the whole partition for your AggregatedMergeTree.
I.e. something like (daily partitioning case):
ALTER amt_table DROP PARTITION '2019-03-01';
-- use same select as in your materialized view
INSERT INTO amt_table SELECT ... WHERE date = '2019-03-01';

Materialized view in oracle with Fast Refresh instead of complete dosn't work

I have created a materialized view with Refresh complete like that and it works well:
CREATE MATERIALIZED VIEW VM4
Build immediate
refresh complete on commit
AS
select C.codecomp,
count(c.numpolice) as NbContrat,
SUM(c.montant) as MontantGlobal
from contrat C
group by c.codecomp;
Now I would like to create a similar view but with Refresh Fast but it dosn't work and it shows me this error:
error
Knowing that I have already created the LOG of the Contrat table ilke that:
CREATE MATERIALIZED VIEW LOG ON contrat with rowid ;
Check documentation:
General Restrictions on Fast Refresh
The defining query of the materialized view is restricted as follows:
The materialized view must not contain references to non-repeating expressions like SYSDATE and ROWNUM.
The materialized view must not contain references to RAW or LONG RAW data types.
It cannot contain a SELECT list subquery.
It cannot contain analytic functions (for example, RANK) in the SELECT clause.
It cannot reference a table on which an XMLIndex index is defined.
It cannot contain a MODEL clause.
It cannot contain a HAVING clause with a subquery.
It cannot contain nested queries that have ANY, ALL, or NOT EXISTS.
It cannot contain a [START WITH …] CONNECT BY clause.
It cannot contain multiple detail tables at different sites.
ON COMMIT materialized views cannot have remote detail tables.
Nested materialized views must have a join or aggregate.
Materialized join views and materialized aggregate views with a GROUP BY clause cannot select from an index-organized table.
Restrictions on Fast Refresh on Materialized Views with Aggregates
Defining queries for materialized views with aggregates or joins have
the following restrictions on fast refresh:
All restrictions from "General Restrictions on Fast Refresh".
Fast refresh is supported for both ON COMMIT and ON DEMAND
materialized views, however the following restrictions apply:
All tables in the materialized view must have materialized view logs, and the materialized view logs must:
Contain all columns from the table referenced in the materialized view.
Specify with ROWID and INCLUDING NEW VALUES.
Specify the SEQUENCE clause if the table is expected to have a mix of inserts/direct-loads, deletes, and updates.
Only SUM, COUNT, AVG, STDDEV, VARIANCE, MIN and MAX are supported for fast refresh.
COUNT(*) must be specified.
Aggregate functions must occur only as the outermost part of the expression. That is, aggregates such as AVG(AVG(x)) or AVG(x)+ AVG(x)
are not allowed.
For each aggregate such as AVG(expr), the corresponding COUNT(expr) must be present. Oracle recommends that SUM(expr) be
specified. See Requirements for Using Materialized Views with
Aggregates for further details.
If VARIANCE(expr) or STDDEV(expr) is specified, COUNT(expr) and SUM(expr) must be specified. Oracle recommends that SUM(expr *expr) be
specified. See Requirements for Using Materialized Views with
Aggregates for further details.
The SELECT column in the defining query cannot be a complex expression with columns from multiple base tables. A possible
workaround to this is to use a nested materialized view.
The SELECT list must contain all GROUP BY columns.
The materialized view is not based on one or more remote tables.
If you use a CHAR data type in the filter columns of a materialized view log, the character sets of the master site and the
materialized view must be the same.
If the materialized view has one of the following, then fast refresh is supported only on conventional DML inserts and direct
loads.
Materialized views with MIN or MAX aggregates
Materialized views which have SUM(expr) but no COUNT(expr)
Materialized views without COUNT(*)
Such a materialized view is called an insert-only materialized view.
A materialized view with MAX or MIN is fast refreshable after delete or mixed DML statements if it does not have a WHERE clause.
The max/min fast refresh after delete or mixed DML does not have the same behavior as the insert-only case. It deletes and recomputes
the max/min values for the affected groups. You need to be aware of
its performance impact.
Materialized views with named views or subqueries in the FROM clause can be fast refreshed provided the views can be completely
merged. For information on which views will merge, see Oracle Database
SQL Tuning Guide.
If there are no outer joins, you may have arbitrary selections and joins in the WHERE clause.
Materialized aggregate views with outer joins are fast refreshable after conventional DML and direct loads, provided only the
outer table has been modified. Also, unique constraints must exist on
the join columns of the inner join table. If there are outer joins,
all the joins must be connected by ANDs and must use the equality (=)
operator.
For materialized views with CUBE, ROLLUP, grouping sets, or concatenation of them, the following restrictions apply:
The SELECT list should contain grouping distinguisher that can either be a GROUPING_ID function on all GROUP BY expressions or
GROUPING functions one for each GROUP BY expression. For example, if
the GROUP BY clause of the materialized view is "GROUP BY CUBE(a, b)",
then the SELECT list should contain either "GROUPING_ID(a, b)" or
"GROUPING(a) AND GROUPING(b)" for the materialized view to be fast
refreshable.
GROUP BY should not result in any duplicate groupings. For example, "GROUP BY a, ROLLUP(a, b)" is not fast refreshable because it
results in duplicate groupings "(a), (a, b), AND (a)".
I think you missed
COUNT(*) must be specified.
For each aggregate such as AVG(expr), the corresponding COUNT(expr) must be present.
Specify with ROWID and INCLUDING NEW VALUES.
Specify the SEQUENCE clause if the table is expected to have a mix of inserts/direct-loads, deletes, and updates.

Creating Materialized view in oracle taking forever

I have following query which has select query that returns data in 5sec. But when I add create materialized view command infront it takes ever for the query to create materialized view.
When you create a materialized view, you actually create a copy of the data that Oracle takes care to keep synchronized (and it makes those views somewhat like indexes). If your view operates over a big amount of data or over data from other servers, it's natural that the creating this view can take time.
From docs.oracle.com:
A materialized view is a replica of a target master from a single
point in time.
Just for "yuks", try
create table temp_tab nologging as select ...
I've seen cases where MV creation is long for some reason, probably logging.
Also, query development tools sometimes begin returning the data to the screen right away, but if you "paged" to the last row, you would find out how long it really takes to get all the data.
You should profile the select statement with explain plan and understand the table cardinality, indexes, waits states when running, ... in order to see if the query needs tuning.

Resources