Hive - Incremental update materialized view - hadoop

Suppose I have a transactional table t1 containing:
1,abc,4.5
2,xyz,3.7
And I create a materialized view on it:
> create materialized view t1_mv as select * from t1;
Then I insert a new row into the table:
> insert into t1 values (3,"lmn",4.7);
Now, when I want to update the view, I have to execute the following query:
> ALTER MATERIALIZED VIEW t1_mv REBUILD;
In the query above, the rebuild operation triggers a full scan of table t1 and rewrites the whole materialized view.
As per the Hive documentation: "Hive supports incremental view maintenance, i.e., only refresh data that was affected by the changes in the original source tables. Incremental view maintenance will decrease the rebuild step execution time. In addition, it will preserve LLAP cache for existing data in the materialized view."
However, the documentation does not describe the exact process for incrementally updating a materialized view.
My questions are:
How do I incrementally update a materialized view?
What is the role of the LLAP cache in the process of an incremental update?
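A minimal sketch of the scenario in HiveQL, going by the materialized views page of the Hive documentation for Hive 3.x. As far as that page describes it, there is no separate "incremental" statement: the same REBUILD command is used, and Hive attempts an incremental rebuild on its own when the source tables are transactional and only INSERT operations happened since the last rebuild, falling back to a full rebuild otherwise (for example after UPDATE or DELETE). The column names and table properties below are assumptions, since the question does not show the DDL of t1.

```sql
-- Assumed DDL: the question only says t1 is transactional.
CREATE TABLE t1 (id INT, name STRING, score DOUBLE)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO t1 VALUES (1, 'abc', 4.5), (2, 'xyz', 3.7);

CREATE MATERIALIZED VIEW t1_mv AS SELECT * FROM t1;

-- Only INSERTs after the view was built, so an incremental rebuild is possible.
INSERT INTO t1 VALUES (3, 'lmn', 4.7);

-- Inspect the plan to see whether Hive reads only the new rows.
EXPLAIN ALTER MATERIALIZED VIEW t1_mv REBUILD;

-- Same statement as before; Hive decides between incremental and full rebuild.
ALTER MATERIALIZED VIEW t1_mv REBUILD;
```

On the LLAP question, the quoted sentence suggests that an incremental rebuild only adds data for the new rows instead of rewriting the whole view, so data already cached by LLAP for the existing part of the view stays usable. Treat that reading as an interpretation of the documentation, not a verified description of the internals.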

Related

SQL Error: ORA-32361

When I create a real-time materialized view in Oracle Database 12c Release 2, I get SQL Error: ORA-32361. What does it mean?
My view is simple, without any aggregation.
CREATE MATERIALIZED VIEW example_t_mv
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE
ENABLE ON QUERY COMPUTATION
AS
SELECT et.some_value_1, et.some_value_2
FROM example_t et WHERE et.some_value_2 < 10;
SQL Error: ORA-32361: cannot ENABLE ON QUERY COMPUTATION for the materialized view
Create a materialized view log on the base table first, then try to create the MV again:
CREATE MATERIALIZED VIEW LOG ON example_t
WITH ROWID, SEQUENCE(some_value_1, some_value_2)
INCLUDING NEW VALUES;

Does dropping a materialized view delete all indexes and data used to speed up queries?

I created a materialized view, which significantly sped up the query
select * from table
After I dropped the materialized view, the query still returns records as fast as it did with the materialized view.
I used the drop materialized view viewname query to drop the MV. Do I need to do anything else to return the table to its previous state?

Oracle materialized view computational cost

Is the computational cost of updating a stored procedure materialized view, in Oracle, based on the query execution or the result set? More specifically, does Oracle store the results of the query in such a way that contributes significantly to the time required to refresh the view?
Of course, queries that take very long to execute, as well as extremely large or small result sets, make this impossible to answer in general.
The question is more about how the view actually stores the result set (in memory, on disk), so I can think about how frequently to rebuild materialized views.
A materialized view is basically a table combined with an algorithm to update it.
01:37:23 HR#sandbox> create materialized view mv_dual as select dummy from dual;
Materialized view created.
Elapsed: 00:00:00.52
01:37:56 HR#sandbox> select object_name, object_type from user_objects where object_name = 'MV_DUAL';
OBJECT_NAME     OBJECT_TYPE
--------------- -------------------
MV_DUAL         TABLE
MV_DUAL         MATERIALIZED VIEW
Elapsed: 00:00:00.01
You can also create materialized views on prebuilt tables.
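A short sketch of the prebuilt-table variant, assuming a hypothetical object name mv_dual_prebuilt. The table has to exist first and must have the same name as the materialized view that is registered on top of it.

```sql
-- Hypothetical name; the table and the materialized view share it.
CREATE TABLE mv_dual_prebuilt AS SELECT dummy FROM dual;

-- Register the existing table as the container of the materialized view.
CREATE MATERIALIZED VIEW mv_dual_prebuilt
ON PREBUILT TABLE
REFRESH COMPLETE ON DEMAND
AS SELECT dummy FROM dual;
```

With this form, dropping the materialized view later should leave the underlying table in place, which is one reason to choose it.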
As for refresh, there are two options: fast refresh and complete refresh.
A complete refresh just re-executes the MV query, while a fast refresh performs incremental updates.
http://docs.oracle.com/cd/E16338_01/server.112/e10706/repmview.htm#i29858
There are two types of mviews:
Complete refresh mview: the entire mview is rebuilt on every refresh, similar to a delete followed by an insert (note: if you specify atomic_refresh => FALSE or have a version < 9, it will be a truncate followed by an insert append).
Fast refresh mview: Oracle creates a table (the materialized view log) to store incremental changes. When refreshing, the changes stored in that side table are applied to the mview.
A fast refresh is faster at refresh time but slows down DML operations on the base table.
When you choose your refresh strategy, consider how many changes are applied to the base table and how often you need to refresh the mview.
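The two refresh types can be sketched as follows. The table orders and the view orders_mv are hypothetical names used only for illustration; DBMS_MVIEW.REFRESH and its method parameter ('F' for fast, 'C' for complete) are the standard Oracle API.

```sql
-- Hypothetical base table with a primary key.
CREATE TABLE orders (
  order_id NUMBER PRIMARY KEY,
  status   VARCHAR2(20)
);

-- The "side table" that records changes to the base table.
CREATE MATERIALIZED VIEW LOG ON orders WITH PRIMARY KEY;

CREATE MATERIALIZED VIEW orders_mv
REFRESH FAST ON DEMAND
AS SELECT order_id, status FROM orders;

BEGIN
  -- Fast refresh: apply only the changes recorded in the log.
  DBMS_MVIEW.REFRESH('ORDERS_MV', method => 'F');
  -- Complete refresh: re-execute the defining query.
  DBMS_MVIEW.REFRESH('ORDERS_MV', method => 'C');
END;
/
```

Timing both calls against a realistic change volume is probably the most direct way to decide how often each kind of refresh is affordable.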

Query too complex for a simple join [duplicate]

So I'm pretty sure Oracle supports this, so I have no idea what I'm doing wrong. This code works:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.* FROM TPM_PROJECTVERSION V;
If I add in a JOIN, it breaks:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.* FROM TPM_PROJECTVERSION V
INNER JOIN TPM_PROJECT P ON P.PROJECTID = V.PROJECTID;
Now I get the error:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view
I've created materialized view logs on both TPM_PROJECT and TPM_PROJECTVERSION. TPM_PROJECT has a primary key of PROJECTID and TPM_PROJECTVERSION has a compound primary key of (PROJECTID,VERSIONID). What's the trick to this? I've been digging through Oracle manuals to no avail. Thanks!
To start with, from the Oracle Database Data Warehousing Guide:
Restrictions on Fast Refresh on Materialized Views with Joins Only
...
Rowids of all the tables in the FROM list must appear in the SELECT
list of the query.
This means that your statement will need to look something like this:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.*, V.ROWID as V_ROWID, P.ROWID as P_ROWID
FROM TPM_PROJECTVERSION V,
TPM_PROJECT P
WHERE P.PROJECTID = V.PROJECTID;
Another key aspect to note is that your materialized view logs must be created WITH ROWID.
Below is a functional test scenario:
CREATE TABLE foo(foo NUMBER, CONSTRAINT foo_pk PRIMARY KEY(foo));
CREATE MATERIALIZED VIEW LOG ON foo WITH ROWID;
CREATE TABLE bar(foo NUMBER, bar NUMBER, CONSTRAINT bar_pk PRIMARY KEY(foo, bar));
CREATE MATERIALIZED VIEW LOG ON bar WITH ROWID;
CREATE MATERIALIZED VIEW foo_bar
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT foo.foo,
       bar.bar,
       foo.ROWID AS foo_rowid,
       bar.ROWID AS bar_rowid
FROM foo, bar
WHERE foo.foo = bar.foo;
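To check that the ON COMMIT refresh in the test scenario above actually fires, a few rows can be inserted and committed. The values here are arbitrary sample data, not part of the original answer.

```sql
-- Arbitrary sample rows for the foo and bar tables defined above.
INSERT INTO foo VALUES (1);
INSERT INTO bar VALUES (1, 10);

-- The materialized view is refreshed as part of the commit.
COMMIT;

-- The joined row should now be visible without any manual refresh.
SELECT foo, bar FROM foo_bar;
```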
Have you tried it without the ANSI join?
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.* FROM TPM_PROJECTVERSION V,TPM_PROJECT P
WHERE P.PROJECTID = V.PROJECTID;
You will get an error on REFRESH FAST if you do not create materialized view logs for the master table(s) that the query refers to. If you are not familiar with materialized views or are using them for the first time, a better way is to use Oracle SQL Developer and set the options graphically; the errors it reports are also easier to make sense of.
The key checks for FAST REFRESH include the following:
1) An Oracle materialized view log must be present for each base table.
2) The RowIDs of all the base tables must appear in the SELECT list of the MVIEW query definition.
3) If there are outer joins, unique constraints must be placed on the join columns of the inner table.
No. 3 is easy to miss and worth highlighting here.
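Rather than checking these conditions by hand, Oracle can report which fast refresh capabilities a query has and why others are missing. The sketch below uses DBMS_MVIEW.EXPLAIN_MVIEW with the join query from the question; the statement id 'mv_test_check' is an arbitrary label, and the first line is a SQL*Plus command that assumes access to the utlxmv.sql script shipped with the database.

```sql
-- One-time setup (SQL*Plus): creates MV_CAPABILITIES_TABLE in the current schema.
@?/rdbms/admin/utlxmv.sql

BEGIN
  -- Analyze the defining query; an existing MV name can be passed instead.
  DBMS_MVIEW.EXPLAIN_MVIEW(
    mv      => 'SELECT V.*, P.* FROM TPM_PROJECTVERSION V, TPM_PROJECT P '
            || 'WHERE P.PROJECTID = V.PROJECTID',
    stmt_id => 'mv_test_check');
END;
/

-- POSSIBLE is Y or N; MSGTXT explains why a capability is not available.
SELECT capability_name, possible, related_text, msgtxt
FROM mv_capabilities_table
WHERE statement_id = 'mv_test_check'
  AND capability_name LIKE 'REFRESH_FAST%'
ORDER BY seq;
```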
Use this code:
CREATE MATERIALIZED VIEW MV_ptbl_Category2
BUILD IMMEDIATE
REFRESH FORCE
ON COMMIT
AS
SELECT *
FROM ptbl_Category2;
Note: MV_ptbl_Category2 is the materialized view name and ptbl_Category2 is the table name.

Oracle - What happens when refreshing a 'REFRESH FORCE ON DEMAND' view with DBMS_MVIEW.REFRESH

I have the following materialized view -
CREATE MATERIALIZED VIEW TESTRESULT
ON PREBUILT TABLE WITH REDUCED PRECISION
REFRESH FORCE ON DEMAND
WITH PRIMARY KEY
AS
SELECT...
FROM...
WHERE...
This materialized view has no backing MATERIALIZED VIEW LOG. As seen in the clause above, this MV has "ON DEMAND" specified, and according to the Oracle documentation,
"[ON DEMAND] indicate[s] that the materialized
view will be refreshed on demand by
calling one of the three DBMS_MVIEW
refresh procedures."
When I call DBMS_MVIEW.REFRESH('TESTRESULT'), what is occurring? Is it manually checking each record to see if it has been updated?
Oracle Version: 10g
By default (and this default changes in different versions of Oracle), that will do a full, atomic refresh on the materialized view. That means that the data in the materialized view will be deleted, the underlying query will be re-executed, and the results will be loaded into the materialized view. You can make the refresh more efficient by passing in a value of FALSE for the ATOMIC_REFRESH parameter, i.e.
dbms_mview.refresh( 'TESTRESULT', atomic_refresh => false );
That will cause the materialized view to be truncated, the query re-executed, and the results inserted into the materialized view via a direct path insert. That will be more efficient than an atomic refresh but the materialized view will be empty during the refresh.
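To confirm which kind of refresh actually ran, one option is to query the USER_MVIEWS data dictionary view after the call. This is a sketch that assumes the materialized view lives in the current schema; with REFRESH FORCE and no materialized view log, the expectation is that it reports a complete refresh.

```sql
BEGIN
  -- No method given: the MV's own setting (FORCE) decides.
  DBMS_MVIEW.REFRESH('TESTRESULT');
END;
/

-- LAST_REFRESH_TYPE shows what the most recent refresh did (e.g. COMPLETE or FAST).
SELECT mview_name, refresh_method, refresh_mode,
       last_refresh_type, last_refresh_date
FROM user_mviews
WHERE mview_name = 'TESTRESULT';
```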
