How to monitor OPTIMIZE process in Clickhouse? - clickhouse

Adding a materialized column
ALTER TABLE actions ADD COLUMN `total` UInt64 MATERIALIZED price1 + price2
Requires running OPTIMIZE on table
OPTIMIZE TABLE actions FINAL
OPTIMIZE is a slow process, how can I monitor this process?

select * from system.merges where table = 'actions'

Related

Adding a sequence to a large Oracle table

I have queries that take an existing large table and build tables off of them for reporting. The problem is that the source tables are 60-80MM+ records and it takes a long time to recreate. I'd like to be able to identify which records are new so I can build just add the new records to the reporting tables.
To me, the best way to identify this is to have an identity column. Is there any significant cost to creating this and adding it to the table?
Separately, is it possible to create a materialized view that takes data from one of these tables but add a sequence as part of the materialized view? That is, something like
create materialized view some_materialized_view as
select somesequence.nextval, source_table.*
from source_table?
You can add a sequence based column to your table, but as Gary suggests I wouldn't do that.
The task you are about to solve is so common that other solutions have been already implemented.
The first built-in option that comes to mind is the system change number SCN, a kind of Oracle internal clock. By default, tables are set up to record the SCN of the whole (usually 8K) block, containing usually many rows, but you can set a table to keep a record of the SCN that changed every row. Then you can track the columns that are new or change and have not been copied to your reporting tables.
CREATE TABLE t (c1 NUMBER) ROWDEPENDENCIES;
INSERT INTO t VALUES (1);
COMMIT;
SELECT c1, ora_rowscn FROM t;
Secondly, I would think of adding a date column. With 60-80 mio rows I wouldn't do this with ALTER TABLE xxx ADD (d DATE DEFAULT SYSDATE), but with rename, create as select, drop:
CREATE TABLE t AS SELECT * FROM all_objects;
RENAME t TO told;
CREATE TABLE t AS SELECT sysdate AS d, told.* FROM told;
ALTER TABLE t MODIFY d DATE DEFAULT SYSDATE;
DROP TABLE told;
Thirdly, I would read up on materialized views. I never had the chance to use this a work, but in theory, you should be able to set up a materialized view log on your 80 m table that records changes and updates dependent materialized views.
And forthly, I'd look into partitioning your large table on the (newly introduced) date column, so that identifying the new rows will become faster. That sadly depends on your version and Oracle license, though.

How to Fast Refresh on a Materialized View with Joins and Aggregates?

Suppose I have two tables job and batch:
CREATE TABLE batch
(
batch_id NUMBER(20) PRIMARY KEY,
batch_type NUMBER(20),
[some other values] ...
);
CREATE TABLE job
(
job_id NUMBER(20) PRIMARY KEY,
job_batch_id NUMBER(20),
job_usr_id NUMBER(20),
job_date DATE,
[some other values] ...
CONSTRAINT fk_job_batch
FOREIGN KEY (job_batch_id) REFERENCES batch(batch_id),
CONSTRAINT fk_job_usr
FOREIGN KEY (job_usr_id) REFERENCES client(usr_id)
);
And suppose they each contain a considerable amount of data (many millions of rows). What I want to do is create a Materialized View to reflect, for each usr_id, what the first and last jobs were run for a particular type of batch. For example:
CREATE MATERIALIZED VIEW client_first_last_job
(usr_id, first_job_date, last_job_date)
AS
(
SELECT
job_usr_id AS usr_id,
MIN(job_date) AS first_job_date,
MAX(job_date) AS last_job_date
FROM job, batch
WHERE job_batch_id=batch_id
AND batch_type IN (1,3,5,9)
GROUP BY job_usr_id
);
This is all well and good, but because there are so many records it takes a very long time to build this materialized view (far longer than is acceptable to deal with each time it needs to refresh). My immediate thought would be to use Materialized View Logs to have incremental updates. These are easy enough to create. But when I try to build the MV to use REFRESH FAST ON DEMAND, that gets me a ORA-12015: cannot create a fast refresh materialized view from a complex query error, which from some Googling I am guessing is due to the coexistence of a join and aggregate functions.
Is there another way to do this? Note that de-normalization or other alterations to the parent tables are not feasible.
You can nest your mviews which you can read about from the docs:
CREATE MATERIALIZED VIEW joinmview
(usr_id, job_date)
REFRESH FORCE ON DEMAND
AS
(
SELECT
job_usr_id AS usr_id,
job_date
FROM job, batch
WHERE job_batch_id=batch_id
AND batch_type IN (1,3,5,9)
);
CREATE MATERIALIZED VIEW LOG ON JOINMVIEW
WITH ROWID (usr_id, JOB_DATE) including new values;
CREATE MATERIALIZED VIEW client_first_last_job
(usr_id, first_job_date, last_job_date)
REFRESH FORCE ON DEMAND
AS
(
SELECT
usr_id,
MIN(job_date) AS first_job_date,
MAX(job_date) AS last_job_date
FROM joinmview
GROUP BY usr_id
);
Verify that both mviews can fast refresh:
exec dbms_mview.refresh('JOINMVIEW', 'C');
exec dbms_mview.refresh('JOINMVIEW', 'F');
exec dbms_mview.refresh('CLIENT_FIRST_LAST_JOB', 'C');
exec dbms_mview.refresh('CLIENT_FIRST_LAST_JOB', 'F');
You can put both mviews into the same refresh group (docs), just be sure to add them in the order of their dependency. In other words, in this example, add JOINMVIEW before you add CLIENT_FIRST_LAST_JOB to the refresh group.

Scheduling a job in oracle to drop and create table

I am trying to DROP and CREATE a table using a scheduled job. How to do it please?
DROP TABLE REGISTRATION;
CREATE TABLE REGISTRATION AS
SELECT *
FROM REGISTRATION_MV#D3PROD_KMDW
perhaps you are looking for a materialized view instead of the table?
create materialized view registration as
select *
from registration_mv#d3prod_kmdw
and then refresh it in job
dbms_mview.refresh('REGISTRATION');
This is much better than dropping and creating the table because PL/SQL objects will become invalid after dropping the table. With mview it will be silent and without harm.
You should use DBMS_SCHEDULER to schedule jobs in Oracle.
Refer to the documentation for how to create a sample scheduler job
Don't drop and recreate the table. If you do this on a regular basis you will end up with a huge number of items in your recycle bin, which can lead to problems. Also as above you can invalidate pl/sql when you drop objects, and can break your app if they fail to recompile.
Instead you should truncate the table to delete all rows, and then repopulate it:
truncate table registration;
insert into registration (select * from registration_mv#D3PROD_KMDW);

Query too complex for a simple join [duplicate]

So I'm pretty sure Oracle supports this, so I have no idea what I'm doing wrong. This code works:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.* FROM TPM_PROJECTVERSION V;
If I add in a JOIN, it breaks:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.* FROM TPM_PROJECTVERSION V
INNER JOIN TPM_PROJECT P ON P.PROJECTID = V.PROJECTID
Now I get the error:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view
I've created materialized view logs on both TPM_PROJECT and TPM_PROJECTVERSION. TPM_PROJECT has a primary key of PROJECTID and TPM_PROJECTVERSION has a compound primary key of (PROJECTID,VERSIONID). What's the trick to this? I've been digging through Oracle manuals to no avail. Thanks!
To start with, from the Oracle Database Data Warehousing Guide:
Restrictions on Fast Refresh on Materialized Views with Joins Only
...
Rowids of all the tables in the FROM list must appear in the SELECT
list of the query.
This means that your statement will need to look something like this:
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.*, V.ROWID as V_ROWID, P.ROWID as P_ROWID
FROM TPM_PROJECTVERSION V,
TPM_PROJECT P
WHERE P.PROJECTID = V.PROJECTID
Another key aspect to note is that your materialized view logs must be created as with rowid.
Below is a functional test scenario:
CREATE TABLE foo(foo NUMBER, CONSTRAINT foo_pk PRIMARY KEY(foo));
CREATE MATERIALIZED VIEW LOG ON foo WITH ROWID;
CREATE TABLE bar(foo NUMBER, bar NUMBER, CONSTRAINT bar_pk PRIMARY KEY(foo, bar));
CREATE MATERIALIZED VIEW LOG ON bar WITH ROWID;
CREATE MATERIALIZED VIEW foo_bar
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT AS SELECT foo.foo,
bar.bar,
foo.ROWID AS foo_rowid,
bar.ROWID AS bar_rowid
FROM foo, bar
WHERE foo.foo = bar.foo;
Have you tried it without the ANSI join ?
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.* FROM TPM_PROJECTVERSION V,TPM_PROJECT P
WHERE P.PROJECTID = V.PROJECTID
You will get the error on REFRESH_FAST, if you do not create materialized view logs for the master table(s) the query is referring to. If anyone is not familiar with materialized views or using it for the first time, the better way is to use oracle sqldeveloper and graphically put in the options, and the errors also provide much better sense.
The key checks for FAST REFRESH includes the following:
1) An Oracle materialized view log must be present for each base table.
2) The RowIDs of all the base tables must appear in the SELECT list of the MVIEW query definition.
3) If there are outer joins, unique constraints must be placed on the join columns of the inner table.
No 3 is easy to miss and worth highlighting here
USE THIS CODE
CREATE MATERIALIZED VIEW MV_ptbl_Category2
BUILD IMMEDIATE
REFRESH FORCE
ON COMMIT
AS
SELECT *
FROM ptbl_Category2;
Note- MV_ptbl_Category2 is the Materialized view name
Ptbl is the table name.

Oracle materialized view question

I have a table that holds information about different events, for example
CREATE TABLE events (
id int not null primary key,
event_date date, ...
)
I realized that 90% of all queries access only today events; the older rows are stored for history and eventually moved to an archive table.
However, events table is still large, and I wonder if I can improve the performance by creating a materialized view that has something like WHERE event_date = trunc(sysdate) and maybe index on event_date ? Is it allowed at all?
Thanks
yes this is allowed see "primary key materialized view":
Primary key materialized views may contain a subquery so that you can
create a subset of rows at the remote materialized view site
and "complex materialized view"
If you refresh rarely and want faster query performance, then use
Method A (complex materialized view).
If you refresh regularly and can sacrifice query performance, then use Method B (simple materialized view).
at http://download.oracle.com/docs/cd/B10500_01/server.920/a96567/repmview.htm
In your example chances are good IMHO that this is not a "complex materialized view":
CREATE MATERIALIZED VIEW events_today REFRESH FAST AS
SELECT * FROM EVENT WHERE event_date = trunc(sysdate);
Just try it and see if Oracle accepts it with the REFRESH FAST clause.
EDIT - another option:
Depending on your DB Edition (Enterprise + Partitioning) and Version (11gR2) you could use a new Oracle feature called INTERVAL partitioning to define "daily partitions" within the existing table. This way most of your queries get alot faster without effectively duplicating the data - see http://www.oracle.com/technetwork/database/options/partitioning/twp-partitioning-11gr2-2009-09-130569.pdf

Resources