I have a materialized view:
CREATE MATERIALIZED VIEW reporting_device_raw_data
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (device_id, ts)
TTL ts + INTERVAL 3 MONTH
AS SELECT
device_id,
ts,
value
FROM reporting_device_raw_data_null;
I tried to:
ALTER TABLE reporting_device_raw_data MODIFY TTL ts + INTERVAL 12 MONTH;
But got an error:
DB::Exception: Alter of type 'MODIFY TTL' is not supported by storage MaterializedView.
What are possible workarounds?
Check SHOW CREATE DATABASE .... to find out which database engine is in use.
Ordinary database:
ALTER TABLE ".inner.reporting_device_raw_data" MODIFY TTL ts + INTERVAL 12 MONTH;
Atomic database:
select uuid from system.tables where name = 'reporting_device_raw_data';
ALTER TABLE ".inner_id.{uuid from prev. select}" MODIFY TTL ts + INTERVAL 12 MONTH;
Related
Environment: ClickHouse version 22.3.3.44; database engine: Atomic
I have a raw table and an MV, with a schema like this:
CREATE TABLE IF NOT EXISTS test.Income_Raw on cluster '{cluster}' (
Id Int64,
DateNum Date,
Cnt Int64,
LoadTime DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(LoadTime)
ORDER BY (Id, DateNum);
CREATE MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY
AS SELECT
DateNum,
Id,
argMaxState(Cnt, LoadTime) AS Cnt,
maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum;
Now I want to add a column named 'Price' to the raw table and the MV,
so I ran the SQL below step by step:
// first I alter the raw table
1. alter table test.Income_Raw on cluster '{cluster}' add column Price Int32
// the SQL below is what I ran to alter the MV
2. detach table test.Income_MV on cluster '{cluster}'
3. alter table test.`.inner_id.{uuid}` on cluster '{cluster}' add column Price Int32
// step 4: basically I just replaced 'create' with 'attach' and added 'Price' to the select query
4. attach MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY
AS SELECT
DateNum,
Id,
Price,
argMaxState(Cnt, LoadTime) AS Cnt,
maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum, Price;
But at step 4, I got an error like this:
Code: 80. DB::Exception: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE Income_MV;
2. CREATE TABLE Income_MV <table definition>;
3. ATTACH TABLE Income_MV FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE Income_MV UUID '<uuid>' <table definition>;. (INCORRECT_QUERY) (version 22.3.3.44 (official build))
The SQL I ran above follows the references below:
https://kb.altinity.com/altinity-kb-schema-design/materialized-views/
Clickhouse altering materialized view's select
So my question is: how do I modify an MV's select query, and which step did I get wrong?
I figured out that you just need to:
Preparation: use an explicit target table (TO table) for the MV instead of the inner table.
1. alter the MV's target table
2. drop the MV
3. re-create the MV with the new query, as in the sketch below
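A minimal sketch of that flow, reusing the schema from the question (the explicit target table test.Income_Agg and its AggregateFunction column types are my assumptions, not part of the original setup):
-- explicit target table that you own, instead of the MV's hidden inner table
CREATE TABLE test.Income_Agg on cluster '{cluster}' (
    Id Int64,
    DateNum Date,
    Cnt AggregateFunction(argMax, Int64, DateTime),
    latest_loadtime AggregateFunction(max, DateTime)
) ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(DateNum)
ORDER BY (Id, DateNum)
TTL DateNum + INTERVAL 100 DAY;

-- the MV only holds the query and writes into the target table
CREATE MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
TO test.Income_Agg
AS SELECT DateNum, Id,
    argMaxState(Cnt, LoadTime) AS Cnt,
    maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum;

-- later, to add a column: alter the target table, then drop and re-create the MV
ALTER TABLE test.Income_Agg on cluster '{cluster}' ADD COLUMN Price Int32;
DROP TABLE test.Income_MV on cluster '{cluster}';
CREATE MATERIALIZED VIEW test.Income_MV on cluster '{cluster}'
TO test.Income_Agg
AS SELECT DateNum, Id, Price,
    argMaxState(Cnt, LoadTime) AS Cnt,
    maxState(LoadTime) AS latest_loadtime
FROM test.Income_Raw
GROUP BY Id, DateNum, Price;
Since the data lives in test.Income_Agg, dropping and re-creating the MV only affects how new inserts are transformed; existing rows are untouched.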
Is it possible to create a materialized view that is incremental only?
I would like the old data already inserted not to be updated; only new insertions should be included in the view.
If possible, how could I do it?
Is there any documentation or any place I can use as a guide?
If you want an MV to show every row as it was at the time of insert, the short answer is:
You can't.
A materialized view stores the result of a query as it exists now, not some time in the past. So after updating a row you can either see its current values or exclude it from the MV.
If you want to preserve/view the state of data at insert you have a few options:
Make the table insert-only (possibly storing change history within the table)
Capture the data when it's added (e.g. via triggers) into another table
Use Flashback Data Archive to store change history and view it with Flashback Query
Which of these is most appropriate depends on why you need to view the data as it was at insert.
Technically it is possible to view data at the time of insert - up to a point.
With Flashback Version Query you can see changes to the table over time. So you can do something like this:
create table t (
c1 int, c2 int,
insert_date timestamp,
update_date timestamp
);
exec dbms_session.sleep(10);
insert into t values ( 1, 1, systimestamp, systimestamp );
insert into t values ( 2, 2, systimestamp, systimestamp );
commit;
create materialized view mv
as
select t.*
from t
versions between scn minvalue
and maxvalue
where versions_operation = 'I';
exec dbms_session.sleep(10);
update t
set c2 = 9999,
update_date = systimestamp
where c1 = 2;
insert into t values ( 3, 3, systimestamp, systimestamp );
commit;
exec dbms_mview.refresh ( 'mv' );
select * from t;
C1 C2 INSERT_DATE UPDATE_DATE
1 1 26-JUL-2021 13.49.51.666954000 26-JUL-2021 13.49.51.666954000
2 9999 26-JUL-2021 13.49.51.712259000 26-JUL-2021 13.50.08.421872000
3 3 26-JUL-2021 13.50.08.462300000 26-JUL-2021 13.50.08.462300000
select *
from mv;
C1 C2 INSERT_DATE UPDATE_DATE
3 3 26-JUL-2021 13.50.08.462300000 26-JUL-2021 13.50.08.462300000
2 2 26-JUL-2021 13.49.51.712259000 26-JUL-2021 13.49.51.712259000
1 1 26-JUL-2021 13.49.51.666954000 26-JUL-2021 13.49.51.666954000
This uses undo to reconstruct history. Eventually older changes will drop off and you're back to only seeing the current state. By default you only get 15 minutes' worth of changes!
If you want to store history for long period of time, Flashback Data Archive is the way to go.
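A minimal sketch of that route (the archive name, tablespace, and retention period here are assumptions):
create flashback archive fda_one_year
  tablespace users
  retention 1 year;

alter table t flashback archive fda_one_year;

-- versions queries against t can now reach back up to a year, not just the undo retention window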
Is it possible to define the TTL for a table in ClickHouse so that it references another table? Let's say I have a chat application, and in my database I have two tables: chats and chat_messages. Chats have start and stop time information, and I want old chats to be deleted entirely along with their messages when they expire, i.e. based on the chat's stop_time. I tried to create those tables in the following way:
db43af298bb9 :) CREATE TABLE chats (id Int64, start_time DateTime, stop_time DateTime) ENGINE = MergeTree() ORDER BY (start_time, id) TTL stop_time + INTERVAL 1 MONTH;
CREATE TABLE chats
(
`id` Int64,
`start_time` DateTime,
`stop_time` DateTime
)
ENGINE = MergeTree()
ORDER BY (start_time, id)
TTL stop_time + toIntervalMonth(1)
Ok.
0 rows in set. Elapsed: 0.014 sec.
db43af298bb9 :) CREATE TABLE chat_messages (id Int64, text String, chat_id Int64) ENGINE = MergeTree() ORDER BY id TTL (SELECT stop_time from chats where chats.id = chat_id) + INTERVAL 1 MONTH;
CREATE TABLE chat_messages
(
`id` Int64,
`text` String,
`chat_id` Int64
)
ENGINE = MergeTree()
ORDER BY id
TTL
(
SELECT stop_time
FROM chats
WHERE chats.id = chat_id
) + toIntervalMonth(1)
Received exception from server (version 19.16.10):
Code: 47. DB::Exception: Received from localhost:9000. DB::Exception: Missing columns: 'chat_id' while processing query: 'SELECT stop_time FROM chats WHERE id = chat_id', required columns: 'id' 'chat_id' 'stop_time', source columns: 'stop_time' 'id' 'start_time'.
0 rows in set. Elapsed: 0.017 sec.
The TTL definition for the second table fails because it tries to find the 'chat_id' column in the 'chats' table instead of the source 'chat_messages' table. Is what I'm trying to achieve even possible, or am I forced to use the ALTER ... DELETE mechanism instead?
In Oracle SQL Developer, I have a table with a column of DATE type. When I insert a new row into this table and enter a value in this column, it automatically suggests the current date with the current time.
I would like it to suggest the current date, but with a 00:00:00 time. Is there some setting or parameter I can set in SQL Developer to get this result?
We can't insert 00 as the hour ... with the 12-hour HH format the hour value has to be between 1 and 12.
We can use the query below to insert 00:00:00, but the value will be displayed as 12:00:00:
INSERT INTO TABLE (DATE_COL) VALUES
( TO_DATE('11/16/2017 00:00:00', 'MM/DD/YYYY HH24:MI:SS') );
It seems to me that your DATE column is set with a DEFAULT of SYSDATE. This means, for any INSERT operations which do not specify a value in your DATE column, the current date and time will populate for that row. However, if INSERT operations do specify a value in your DATE column, then the specified date value will supersede the DEFAULT of SYSDATE.
If an application is controlling INSERT operations on that table, then one solution is to ensure the application utilizes the TRUNC() function to obtain your desired results. For example:
INSERT INTO tbl_target
(
col_date,
col_value
)
VALUES
(
TRUNC(SYSDATE, 'DDD'),
5000
)
;
However, if there are multiple applications or interfaces where users could be inserting new rows into the table, (e.g. using Microsoft Access or users running INSERT statements via SQL Developer) and you can't force all of those interfaces to utilize the TRUNC() function on that column during insertion, then you need to look into other options.
If you can ensure via applications that INSERT operations will not actually reference the DATE, then you can simply ALTER the table so that the DATE column will have a DEFAULT of TRUNC(SYSDATE). A CHECK CONSTRAINT can be added for further integrity:
ALTER TABLE tbl_target
MODIFY
(
col_date DATE DEFAULT TRUNC(SYSDATE, 'DDD') NOT NULL
)
ADD
(
CONSTRAINT tbl_target_CHK_dt CHECK(col_date = TRUNC(col_date, 'DDD'))
)
;
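For example, with that DEFAULT in place, an insert that omits the date column stores today's date at midnight (tbl_target and its columns are taken from the example above):
INSERT INTO tbl_target (col_value) VALUES (5000);
SELECT col_date FROM tbl_target;
-- col_date contains today's date with a 00:00:00 time component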
However, if users still have the freedom to specify the DATE when inserting new rows, you will want to use a TRIGGER:
CREATE OR REPLACE TRIGGER tbl_target_biu_row
BEFORE INSERT OR UPDATE OF col_value
ON tbl_target
FOR EACH ROW
BEGIN
:NEW.col_date := TRUNC(SYSDATE, 'DDD');
END tbl_target_biu_row
;
This takes care of the need to manage the application code of all external INSERT operations on the table. Keep in mind that the above trigger also modifies the DATE column if a user updates the specified value column.
Is it possible to create a partition like 01 from a date like '2017-01-02', where 01 is the month?
I have daily sales records and I need to run queries like select * from sales where month = '01', so it would be better if I could partition my daily sales by month. But my data has dates in the format 2017-01-01, and doing
create table tl (columns ......) partitioned by (date <datatype>) will create partitions on a daily basis, which is the last thing I want.
I need to create partitions dynamically.
CAUTION: you need to escape the date column (with backticks around the column name) in the create statement, because date is a data type in Hive.
You can create partitions dynamically
by setting the parameter below in your query:
set hive.exec.dynamic.partition.mode=nonstrict;
Along with that, you need to select only the month part from the source table:
insert into table sales partition(date) select columns..., SUBSTR(date, 6, 2) from source_table
This insert statement will create partitions like:
show partitions sales
date=01
date=02
date=03
date=04
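Putting it together, a more complete sketch of the same approach (the raw_sales source table and its columns are made-up names, and the partition column is called month rather than date for clarity; only the substr month extraction and the dynamic-partition settings come from the answer above):
-- target table partitioned by month (backticks guard against keyword clashes)
create table sales (
  id int,
  amount double,
  sale_date string
)
partitioned by (`month` string);

-- enable dynamic partitioning for this session
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- the dynamic partition column must come last in the select list
insert into table sales partition(`month`)
select id, amount, sale_date, substr(sale_date, 6, 2) as `month`
from raw_sales;

-- queries can then prune by month
select * from sales where `month` = '01';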