ClickHouse: Cannot alter settings, because table engine doesn't support settings changes

If I create the table like this:
CREATE TABLE default.graphite
(
`Path` String,
`Value` Float64,
`Time` UInt32,
`Date` Date,
`Timestamp` UInt32
)
ENGINE = GraphiteMergeTree('graphite_rollup')
PARTITION BY toYYYYMM(Date)
ORDER BY (Path, Time)
SETTINGS index_granularity = 8192
I can run: ALTER TABLE graphite MODIFY SETTING storage_policy = '???';
But if my table was created like this:
CREATE TABLE graphite
(
`Path` String,
`Value` Float64,
`Time` UInt32,
`Date` Date,
`Timestamp` UInt32
)
ENGINE = GraphiteMergeTree(Date, (Path, Time), 8192, 'graphite_rollup')
I can't.
I receive the error: DB::Exception: Cannot alter settings, because table engine doesn't support settings changes.
How can I change the setting?

Related

Materialized view works for a few days and then stops

I have these three tables (I cleaned them up):
CREATE TABLE Record (
`visitId` String,
`visitorId` String,
`pageUrl` LowCardinality(String),
`createdAtDay` Date DEFAULT now()
) ENGINE = MergeTree PARTITION BY toYYYYMM(createdAtDay) PRIMARY KEY (
visitorId,
visitId,
pageUrl,
createdAtDay
)
ORDER BY
(visitorId, visitId, pageUrl)
CREATE MATERIALIZED VIEW DurationPerPage (
`visits` Int64 CODEC(DoubleDelta, LZ4),
`pageUrl` LowCardinality(String),
`visitors` Int64 CODEC(DoubleDelta, LZ4),
`duration` Int64 CODEC(DoubleDelta, LZ4),
`createdAtDay` Date,
) ENGINE = SummingMergeTree((visits, visitors, duration))
ORDER BY
(createdAtDay, pageUrl) AS
SELECT
countDistinct(visitId) AS visits,
cutQueryStringAndFragment(pageUrl) AS pageUrl,
countDistinct(visitorId) AS visitors,
sum(e.value) AS duration,
createdAtDay
FROM
Record AS r
LEFT JOIN Events AS e ON (r.visitId = e.visitId)
AND (e.eventType = 6)
WHERE
pageType LIKE '%single%'
GROUP BY
(createdAtDay, pageUrl);
CREATE TABLE Events (
`visitId` String,
`visitorId` String,
`value` Int64 CODEC(DoubleDelta, LZ4),
`eventType` Int16 CODEC(DoubleDelta, LZ4)
) ENGINE = MergeTree PARTITION BY (toYYYYMM(createdAtDay), eventType) PRIMARY KEY (visitId, eventType, createdAtDay)
ORDER BY
(visitId, eventType, createdAtDay)
As you can see, I'm using both the Record and Events tables to feed my materialized view. It works well for a few days, then it stops and starts saving weird data (mostly zeros in the duration field), and I then have to delete and recreate it.
Is there a known bug related to this, or is something wrong with the view?

AWS Randomize data for large tables

I have a table in Redshift with over 1.8 billion rows, and I am trying to randomize that data.
Here are the table's columns:
id bigint,
customer_internal_id bigint,
customer_id VARCHAR(256) Not NULL,
customer_name VARCHAR(256) Not NULL,
customer_type_id bigint,
start_date date,
end_date date,
request_id bigint,
entered VARCHAR(256) not NULL,
superseded VARCHAR(256) not NULL,
customer_latitude double precision,
customer_longitude double precision,
zip_internal_id bigint
How can I achieve this? I tried to look for options, but there is not enough documentation available.
Here is the expected output.
I have some code written for PostgreSQL:
with result as (
select id, customer_id, customer_name,
lead(customer_id) over w as first_1,
lag(customer_name) over w as first_2
from master.customer_temp_df
window w as (order by random())
)
update master.customer_temp_df
set customer_id = coalesce(first_1, first_2),customer_name = coalesce(first_2, first_1)
from result
where master.customer_temp_df.id = result.id;
But this doesn't work in Redshift, and I am looking for something equivalent.
The final goal is to randomize the entire table.

Clickhouse GraphiteMergeTree Table migrate from deprecated format_version

I tried 2 ways:
Edit metadata file
CREATE TABLE graphite.data_test
(
Path String,
Value Float64,
Time UInt32,
Date Date,
Timestamp UInt32
)
ENGINE = GraphiteMergeTree(Date, (Path, Time), 8192, 'graphite_rollup')
alter table graphite.data_test attach partition 202208 from graphite.data;
detach table graphite.data_test;
vi /var/lib/clickhouse/metadata/graphite/data_test.sql
ATTACH TABLE data_test
(
Path String,
Value Float64,
Time UInt32,
Date Date,
Timestamp UInt32
)
ENGINE = GraphiteMergeTree('graphite_rollup')
PARTITION BY toYYYYMM(Date)
ORDER BY (Path, Time)
SETTINGS index_granularity = 8192;
attach table graphite.data_test;
ERROR: MergeTree data format version on disk doesn't support custom partitioning.
Copy partitions
CREATE TABLE data_test
(
Path String,
Value Float64,
Time UInt32,
Date Date,
Timestamp UInt32
)
ENGINE = GraphiteMergeTree('graphite_rollup')
PARTITION BY toYYYYMM(Date)
ORDER BY (Path, Time)
SETTINGS index_granularity = 8192;
alter table graphite.data_test attach partition 202208 from graphite.data;
ERROR: Tables have different format_version.
Can you tell me if there is any workaround, a way to convert the deprecated table to the new format?
It is possible only with a console app, and only for parts (not whole tables); you need to build this app yourself.
The app has different names depending on the version: convert-parts-from-old-format / convert-month-partitioned-parts
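If converting the parts in place is not a hard requirement, a different technique is to copy the rows with INSERT ... SELECT instead of attaching partitions. This is only a sketch, and it assumes that graphite.data_test was already created with the new syntax (as in the "Copy partitions" attempt above) and that there is enough disk space for a second copy of the data. It rewrites the data rather than converting the existing parts.

```sql
-- Copy one month at a time from the old-syntax table into the new-syntax table.
-- Assumption: graphite.data_test exists with PARTITION BY toYYYYMM(Date).
INSERT INTO graphite.data_test
SELECT Path, Value, Time, Date, Timestamp
FROM graphite.data
WHERE toYYYYMM(Date) = 202208;

-- Rough sanity check. The counts can differ once background merges
-- apply the graphite_rollup rules to either table.
SELECT count() FROM graphite.data WHERE toYYYYMM(Date) = 202208;
SELECT count() FROM graphite.data_test WHERE toYYYYMM(Date) = 202208;
```

Copying month by month keeps each insert bounded, and a failed month can be dropped from the new table and retried.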

clickhouse create table Exception: Aggregate function minState(origin_user) is found in wrong place in query

CREATE TABLE user_dwd.user_tag_bitmap_local
(
`tag` String,
`tag_item` String,
`p_day` Date,
`origin_user` UInt64,
`users` AggregateFunction(min, UInt64) MATERIALIZED minState(origin_user)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMMDD(p_day)
ORDER BY (tag, tag_item)
SETTINGS index_granularity = 8192;
When I run this SQL to create the table, I get this error:
[2021-10-17 12:05:28] Code: 184, e.displayText() = DB::Exception: Aggregate function minState(origin_user) is found in wrong place in query: While processing minState(origin_user) AS users_tmp_alter9508717652815860223: default expression and column type are incompatible. (version 21.8.4.51 (official build))
How can I solve this error?
minState is an aggregate function; you cannot use it like this (it is meant for queries with a GROUP BY section).
To solve it, you can use MATERIALIZED initializeAggregation... or MATERIALIZED arrayReduce(minState...
But actually you don't need the second column.
You are looking for SimpleAggregateFunction:
https://clickhouse.com/docs/en/sql-reference/data-types/simpleaggregatefunction/
CREATE TABLE user_dwd.user_tag_bitmap_local
(
`tag` String,
`tag_item` String,
`p_day` Date,
`origin_user` SimpleAggregateFunction(min, UInt64) ---<<<-----
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMMDD(p_day)
ORDER BY (tag, tag_item)
SETTINGS index_granularity = 8192;
https://clickhouse.com/docs/en/sql-reference/functions/other-functions/#initializeaggregation
CREATE TABLE user_tag_bitmap_local
(
`tag` String,
`tag_item` String,
`p_day` Date,
`origin_user` UInt64,
`users` AggregateFunction(min, UInt64) MATERIALIZED initializeAggregation('minState', origin_user)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMMDD(p_day)
ORDER BY (tag, tag_item)
SETTINGS index_granularity = 8192
https://clickhouse.com/docs/en/sql-reference/functions/array-functions/#arrayreduce
CREATE TABLE user_tag_bitmap_local
(
`tag` String,
`tag_item` String,
`p_day` Date,
`origin_user` UInt64,
`users` AggregateFunction(min, UInt64) MATERIALIZED arrayReduce('minState', [origin_user])
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMMDD(p_day)
ORDER BY (tag, tag_item)
SETTINGS index_granularity = 8192
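For illustration, here is a sketch of how the tables above might be written to and read from. The sample rows are hypothetical. Because `users` is a MATERIALIZED column, it is filled from `origin_user` on insert and is not listed in the INSERT. An AggregateFunction column holds intermediate states, so it is read with the -Merge combinator.

```sql
-- Hypothetical sample rows for either AggregateFunction variant.
INSERT INTO user_tag_bitmap_local (tag, tag_item, p_day, origin_user) VALUES
    ('gender', 'male', '2021-10-17', 42),
    ('gender', 'male', '2021-10-17', 7);

-- AggregateFunction(min, UInt64): combine the stored states with minMerge.
SELECT tag, tag_item, minMerge(users) AS min_user
FROM user_tag_bitmap_local
GROUP BY tag, tag_item;

-- SimpleAggregateFunction(min, UInt64) variant: the column stores plain values,
-- so the ordinary min() is used. Rows may not be merged yet, hence the GROUP BY.
SELECT tag, tag_item, min(origin_user) AS min_user
FROM user_dwd.user_tag_bitmap_local
GROUP BY tag, tag_item;
```

In both cases the GROUP BY is still needed at query time, because AggregatingMergeTree collapses rows with the same sorting key only during background merges.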

ClickHouse frequently dies and comes back up automatically

ClickHouse, running in K8s as a StatefulSet, dies frequently and comes back up automatically. The following call trace appears in the ClickHouse error log:
https://pastebin.com/m1N54vHy
ClickHouse startup log:
https://pastebin.com/EwVKpP12
Can anyone please help to understand what's wrong with the setup?
SHOW CREATE TABLE graphite_index
Query id: ed03eae3-c9d1-4f42-a061-d790460ce3ef
CREATE TABLE default.graphite_index
(
`Date` Date,
`Level` UInt32,
`Path` String,
`Version` UInt32,
`updated` DateTime DEFAULT now(),
`status` Enum8('SIMPLE' = 0, 'BAN' = 1, 'APPROVED' = 2, 'HIDDEN' = 3, 'AUTO_HIDDEN' = 4)
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/single/default.graphite_index', 'clickhouse1-0', updated)
PARTITION BY toYYYYMM(Date)
ORDER BY Path
SETTINGS index_granularity = 1024
SHOW CREATE TABLE graphite
Query id: 95f64ef9-3154-4c04-9d0f-11d706469429
CREATE TABLE default.graphite
(
`Path` String CODEC(ZSTD(2)),
`Value` Float64 CODEC(Delta(8), ZSTD(2)),
`Time` UInt32 CODEC(Delta(4), ZSTD(2)),
`Date` Date CODEC(Delta(2), ZSTD(2)),
`Timestamp` UInt32 CODEC(Delta(4), ZSTD(2))
)
ENGINE = Distributed('graphitereplica', '', 'graphite', xxHash64(Path))
