ClickHouse - Materialized view is not updating for Postgres source table

My requirement is to have a ClickHouse materialized view based on a Postgres table. A Postgres connection has been created in ClickHouse and the table data is visible, but the materialized view does not reflect inserted or updated data.
Postgres table:
CREATE TABLE prod_data (
    id int4 NOT NULL,
    prod_id int4 NOT NULL,
    prod_type int4 NOT NULL,
    prod_cost float4 NULL,
    sell_cost float4 NULL,
    created_by varchar(100) NOT NULL,
    created_date timestamp(6) NOT NULL,
    updated_by varchar(100) NULL,
    updated_date timestamp(6) NULL
);
Materialized view in ClickHouse:
CREATE MATERIALIZED VIEW testing.mv_prod_data_summerge
(
    prod_id Int32,
    prod_type Int32,
    prod_cost Nullable(Float32),
    sell_cost Nullable(Float32)
)
ENGINE = SummingMergeTree
ORDER BY (prod_id, prod_type)
POPULATE
AS
SELECT
    prod_id,
    prod_type,
    sum(prod_cost) AS prod_cost,
    sum(sell_cost) AS sell_cost
FROM PG_mi_datasource.test_prod_data
GROUP BY prod_id, prod_type;
After creating the materialized view, changes made in the base table are not reflected in it. When the underlying SELECT is executed directly, the changes are visible. The aggregate functions sum and sumState exhibit the same behavior.
Kindly suggest what needs to be done to have the changes reflected in the materialized view.

A materialized view is an insert trigger. Materialized views work only if you insert data into ClickHouse tables.
https://den-crane.github.io/Everything_you_should_know_about_materialized_views_commented.pdf
You may use MaterializedPostgreSQL
https://clickhouse.com/docs/en/integrations/postgresql/postgres-with-clickhouse-database-engine/#1-in-postgresql
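For example, a minimal sketch of the MaterializedPostgreSQL database engine (the host, database name and credentials below are placeholders; the engine is experimental in many ClickHouse versions and requires wal_level = logical on the Postgres side):
-- The engine is experimental on many ClickHouse versions and must be enabled first.
SET allow_experimental_database_materialized_postgresql = 1;
-- Placeholder connection details; replace with your own.
CREATE DATABASE pg_mirror
ENGINE = MaterializedPostgreSQL('postgres-host:5432', 'source_db', 'pg_user', 'pg_password');
-- Postgres tables are then replicated continuously and can be queried directly:
SELECT * FROM pg_mirror.prod_data;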

Related

How to select only those rows which are greater than a modified time using Spring Data JPA

For example, I have created a table:
CREATE DATABASE es_db;
USE es_db;
DROP TABLE IF EXISTS es_table;
CREATE TABLE es_table (
    id BIGINT(20) UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY unique_id (id),
    client_name VARCHAR(32) NOT NULL,
    modification_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    insertion_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
Now assume I have to select the rows whose modification time is greater than a time I give as input.
Consider this query, for example:
SELECT *, UNIX_TIMESTAMP(modification_time) AS unix_ts_in_secs FROM es_table WHERE (UNIX_TIMESTAMP(modification_time) > :sql_last_modifiedvalue AND modification_time < NOW()) ORDER BY modification_time ASC
Is there a way to translate this to a native query? I can achieve the same with JdbcTemplate, but would like to know if this is possible with a native query.

Optimize an inner join between two multi-million-row tables

I'm new to Postgres and even newer to understanding how EXPLAIN works. The query below is typical; I just replace the date:
explain
select account_id,
security_id,
market_value_date,
sum(market_value) market_value
from market_value_history mvh
inner join holding_cust hc on hc.id = mvh.owning_object_id
where
hc.account_id = 24766
and market_value_date = '2015-07-02'
and mvh.created_by = 'HoldingLoad'
group by account_id, security_id, market_value_date
order by security_id, market_value_date;
Attached is a screenshot of the EXPLAIN output.
The holding_cust table has 2 million rows and the market_value_history table has 163 million rows.
Below are the table definitions and indexes for market_value_history and holding_cust:
I'd appreciate any advice you may be able to give me on tuning this query.
CREATE TABLE public.market_value_history
(
    id integer NOT NULL DEFAULT nextval('market_value_id_seq'::regclass),
    market_value numeric(18,6) NOT NULL,
    market_value_date date,
    holding_type character varying(25) NOT NULL,
    owning_object_type character varying(25) NOT NULL,
    owning_object_id integer NOT NULL,
    created_by character varying(50) NOT NULL,
    created_dt timestamp without time zone NOT NULL,
    last_modified_dt timestamp without time zone NOT NULL,
    CONSTRAINT market_value_history_pkey PRIMARY KEY (id)
)
WITH (
    OIDS=FALSE
);
ALTER TABLE public.market_value_history
OWNER TO postgres;
-- Index: public.ix_market_value_history_id
-- DROP INDEX public.ix_market_value_history_id;
CREATE INDEX ix_market_value_history_id
ON public.market_value_history
USING btree
(owning_object_type COLLATE pg_catalog."default", owning_object_id);
-- Index: public.ix_market_value_history_object_type_date
-- DROP INDEX public.ix_market_value_history_object_type_date;
CREATE UNIQUE INDEX ix_market_value_history_object_type_date
ON public.market_value_history
USING btree
(owning_object_type COLLATE pg_catalog."default", owning_object_id, holding_type COLLATE pg_catalog."default", market_value_date);
CREATE TABLE public.holding_cust
(
    id integer NOT NULL DEFAULT nextval('holding_cust_id_seq'::regclass),
    account_id integer NOT NULL,
    security_id integer NOT NULL,
    subaccount_type integer,
    trade_date date,
    purchase_date date,
    quantity numeric(18,6),
    net_cost numeric(18,2),
    adjusted_net_cost numeric(18,2),
    open_date date,
    close_date date,
    created_by character varying(50) NOT NULL,
    created_dt timestamp without time zone NOT NULL,
    last_modified_dt timestamp without time zone NOT NULL,
    CONSTRAINT holding_cust_pkey PRIMARY KEY (id)
)
WITH (
    OIDS=FALSE
);
ALTER TABLE public.holding_cust
OWNER TO postgres;
-- Index: public.ix_holding_cust_account_id
-- DROP INDEX public.ix_holding_cust_account_id;
CREATE INDEX ix_holding_cust_account_id
ON public.holding_cust
USING btree
(account_id);
-- Index: public.ix_holding_cust_acctid_secid_asofdt
-- DROP INDEX public.ix_holding_cust_acctid_secid_asofdt;
CREATE INDEX ix_holding_cust_acctid_secid_asofdt
ON public.holding_cust
USING btree
(account_id, security_id, trade_date DESC);
-- Index: public.ix_holding_cust_security_id
-- DROP INDEX public.ix_holding_cust_security_id;
CREATE INDEX ix_holding_cust_security_id
ON public.holding_cust
USING btree
(security_id);
-- Index: public.ix_holding_cust_trade_date
-- DROP INDEX public.ix_holding_cust_trade_date;
CREATE INDEX ix_holding_cust_trade_date
ON public.holding_cust
USING btree
(trade_date);
Two things:
As Dmitry pointed out, you should look at creating an index on the market_value_date field. It's possible that after that you get a completely different query plan, which may or may not bring up other bottlenecks, but it should certainly remove this seq scan.
Minor (since I doubt it affects performance), but secondly, if you aren't enforcing field length by design, you may want to change the created_by field to TEXT. As can be seen in the plan, it is casting all created_by values to TEXT for this query.
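A sketch of the suggested index (the exact column list is an assumption based on the WHERE clause above; verify the new plan before keeping it):
-- Hypothetical index targeting the query's predicates;
-- market_value_date leads since it is the filter currently forcing the seq scan.
CREATE INDEX ix_market_value_history_date_created_by
    ON public.market_value_history (market_value_date, created_by, owning_object_id);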

Materialized view data doesn't update

I want to create a materialized view with fast refresh. The view aggregates values from a single table:
CREATE TABLE N_INSP_DTSEDIF_PLANTAS (
    IMPORTACION_ID NUMBER(*,0) NOT NULL,
    ID NUMBER(10) NOT NULL,
    INSPECCION_ID NUMBER(10) NOT NULL,
    NOMBRE_PLANTA VARCHAR2(255 CHAR),
    NUM_VIVIENDAS NUMBER(10),
    SUP_CONSTRUIDA_VIVIENDAS DECIMAL(10,4),
    -- Plus some other columns I don't need
    CONSTRAINT N_INSP_DTSEDIF_PLANTAS_P PRIMARY KEY (
        IMPORTACION_ID,
        ID
    ) ENABLE,
    CONSTRAINT N_INSP_DTSEDIF_PLANTAS_F FOREIGN KEY (IMPORTACION_ID)
        REFERENCES IMPORTACION (IMPORTACION_ID)
        ON DELETE CASCADE
        ENABLE
);
CREATE INDEX N_INSP_DTSEDIF_PLANTAS_X ON N_INSP_DTSEDIF_PLANTAS (IMPORTACION_ID);
CREATE SEQUENCE N_INSP_DTSEDIF_PLANTAS_S
INCREMENT BY 1
START WITH 1
MINVALUE 1
CACHE 20;
CREATE OR REPLACE TRIGGER N_INSP_DTSEDIF_PLANTAS_T
BEFORE INSERT
ON N_INSP_DTSEDIF_PLANTAS
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
BEGIN
    IF :NEW.ID IS NULL THEN
        SELECT N_INSP_DTSEDIF_PLANTAS_S.NEXTVAL INTO :NEW.ID FROM DUAL;
    END IF;
END N_INSP_DTSEDIF_PLANTAS_T;
/
ALTER TRIGGER N_INSP_DTSEDIF_PLANTAS_T ENABLE;
I've composed this through trial and error:
CREATE MATERIALIZED VIEW LOG ON N_INSP_DTSEDIF_PLANTAS
WITH ROWID, SEQUENCE (IMPORTACION_ID, INSPECCION_ID, NUM_VIVIENDAS, SUP_CONSTRUIDA_VIVIENDAS)
INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW V_PLANTAS
REFRESH FAST
AS
SELECT IMPORTACION_ID, INSPECCION_ID,
SUM(NUM_VIVIENDAS) AS NUM_VIVIENDAS, SUM(SUP_CONSTRUIDA_VIVIENDAS) AS SUP_CONSTRUIDA_VIVIENDAS
FROM N_INSP_DTSEDIF_PLANTAS
GROUP BY IMPORTACION_ID, INSPECCION_ID;
The objects are created without errors and SELECT * FROM V_PLANTAS returns data. However, the view is stalled: new rows added to N_INSP_DTSEDIF_PLANTAS don't show up in V_PLANTAS.
What did I misunderstand from the documentation?
In the mess of random changes that followed panic and despair, I had inadvertently dropped the ON COMMIT clause:
CREATE MATERIALIZED VIEW V_PLANTAS
REFRESH FAST ON COMMIT
AS
-- ...
The log itself was also invalid for fast refresh because I had omitted the PRIMARY KEY clause. It should be:
CREATE MATERIALIZED VIEW LOG ON N_INSP_DTSEDIF_PLANTAS
WITH ROWID, PRIMARY KEY, SEQUENCE (INSPECCION_ID, NUM_VIVIENDAS, SUP_CONSTRUIDA_VIVIENDAS)
INCLUDING NEW VALUES;
(That said, it's worth noting that materialized views are not just a simple results cache but a fairly large and complex feature that requires careful planning and maintenance. In many situations it is easier to just optimize the underlying query.)

Getting wrong results from full-text search in Postgres

This is my table structure:
CREATE TABLE semantified_content_key_word
(
    id bigint NOT NULL,
    semantified_content_id bigint,
    key_word_text text,
    content_date timestamp without time zone NOT NULL,
    context_id bigint NOT NULL,
    CONSTRAINT pk_sckw_id PRIMARY KEY (id)
)
WITH (
    OIDS=FALSE
);
ALTER TABLE semantified_content_key_word
OWNER TO postgres;
-- Index: idx_sckw_kwt
-- DROP INDEX idx_sckw_kwt;
CREATE INDEX idx_sckw_kwt
ON semantified_content_key_word
USING gin
(to_tsvector('english'::regconfig, key_word_text) );
-- Index: idx_sckw_sc_id
-- DROP INDEX idx_sckw_sc_id;
CREATE INDEX idx_sckw_sc_id
ON semantified_content_key_word
USING btree
(semantified_content_id );
Following is the data:
INSERT INTO semantified_content_key_word (id, semantified_content_id, key_word_text, content_date, context_id)
VALUES (7347, 7347, ', agreementnumber customer servicecreditdate the guarantor taken exhausted prior being pursuant avoidance doubt shall remain liable case non incomplete', '2014-11-21 00:00:00', 111);
INSERT INTO semantified_content_key_word (id, semantified_content_id, key_word_text, content_date, context_id)
VALUES (7356, 7356, ', ; agreementnumber agreementperiod aircraftmodel commencementdate customer enginemodel enginequantity enginetype foddeductibleamount llpminimumbuild servicecreditdate steppedpopularrate takeoffderate termdate turnaroundtime ion ls initiated manager otherwise) inform whether proposed qualified view lnltlated confirm satisfies criteria out article instruct programme accordingly determined meet pursuant paragraph a) treated subject only g) below b)', '2014-11-21 00:00:00', 111);
INSERT INTO semantified_content_key_word (id, semantified_content_id, key_word_text, content_date, context_id)
VALUES (7441, 7441, ', activationdate agreementnumber enginemodel enginetype llpminimumbuild servicecreditdate steppedpopularrate turnaroundtime leap-1a as united continental customer 1/ neutral qec configuration engines shop maintenance: each engine ', '2014-11-17 00:00:00', 111);
SELECT sckw.*
FROM semantified_content_key_word sckw
WHERE TO_TSVECTOR(sckw.key_word_text) @@ TO_TSQUERY('exhausted');
This is the query which I am running. The keyword "exhausted" is present in only one of the rows, but I am getting all 3 rows.
How do I avoid the rows where the keyword is not present?
Thanks,
Prasanna
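A likely fix (an assumption on my part, since no accepted answer appears here) is to pass the same 'english' configuration that the GIN index uses, so the query matches the indexed expression and each row is tested consistently:
-- Matches the expression in idx_sckw_kwt, so the index is usable
-- and the match is evaluated with the same text search configuration.
SELECT sckw.*
FROM semantified_content_key_word sckw
WHERE to_tsvector('english', sckw.key_word_text) @@ to_tsquery('english', 'exhausted');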

Updating a view - error ORA-01779

I've used the CREATE VIEW command to create a view (obviously) that joins multiple tables. The CREATE VIEW command works perfectly, but when I try to update the view RentalInfoOct, I receive the error "ORA-01779: cannot modify a column which maps to a non key-preserved table".
CREATE VIEW RentalInfoOct
(branch_no, branch_name, customer_no, customer_name, item_no, rental_date)
AS
SELECT i.branchNo, b.branchName, r.customerNo, c.customerName, i.itemNo, r.dateFrom
FROM item i
INNER JOIN rental r
ON i.itemNo = r.itemNo
INNER JOIN branch b
ON i.branchNo = b.branchNo
INNER JOIN customer c
ON r.customerNo = c.customerNo
WHERE r.dateFrom
BETWEEN to_date('10-01-2009','MM-DD-YYYY')
AND to_date('10-31-2009','MM-DD-YYYY')
My update command:
UPDATE RentalInfoOct
SET item_no = '3'
WHERE customer_name = 'April Alister'
AND branch_name = 'Kingsway'
AND rental_date = '10/28/2009'
I'm not sure if this will help in solving the problem, but here are my CREATE TABLE commands:
CREATE TABLE Branch
(
    branchNo SMALLINT NOT NULL,
    branchName VARCHAR(20) NOT NULL,
    branchAddress VARCHAR(40) NOT NULL,
    PRIMARY KEY (BranchNo)
);
-- Item Table Definition
CREATE TABLE Item
(
    branchNo SMALLINT NOT NULL,
    itemNo SMALLINT NOT NULL,
    itemSize VARCHAR(8) NOT NULL,
    price DECIMAL(6,2) NOT NULL,
    PRIMARY KEY (ItemNo, BranchNo),
    FOREIGN KEY (BranchNo) REFERENCES Branch ON DELETE CASCADE,
    CONSTRAINT VALIDAMT CHECK (price > 0)
);
-- Customer Table Definition
CREATE TABLE Customer
(
    customerNo SMALLINT NOT NULL,
    customerName VARCHAR(15) NOT NULL,
    customerAddress VARCHAR(40) NOT NULL,
    customerTel VARCHAR(10),
    PRIMARY KEY (CustomerNo)
);
-- Rental Table Definition
CREATE TABLE Rental
(
    branchNo SMALLINT NOT NULL,
    customerNo SMALLINT NOT NULL,
    dateFrom DATE NOT NULL,
    dateTo DATE,
    itemNo SMALLINT NOT NULL,
    PRIMARY KEY (BranchNo, CustomerNo, dateFrom),
    FOREIGN KEY (BranchNo) REFERENCES Branch(BranchNo) ON DELETE CASCADE,
    FOREIGN KEY (CustomerNo) REFERENCES Customer(CustomerNo) ON DELETE CASCADE,
    CONSTRAINT CORRECTDATES CHECK (dateTo > dateFrom OR dateTo IS NULL)
);
See: Oracle: multiple table updates => ORA-01779: cannot modify a column which maps to a non key-preserved table
You're attempting to update a view with joins, but the join conditions are not based on a uniqueness constraint, which creates the possibility of multiple view rows being produced from a single row in one table.
It seems like you need a unique key - foreign key relationship between the columns your join conditions are based on.
EDIT: I just saw your edit. Changing r.branchNo = b.branchNo to i.branchNo = b.branchNo should go a long way. Not sure how well r.customerNo = c.customerNo will work out.
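A common workaround (a sketch, not part of the original answer) is to update the base table directly instead of the join view, resolving the customer and branch through subqueries. This assumes 'April Alister' and 'Kingsway' each resolve to a single row:
-- Hypothetical rewrite against the base Rental table.
UPDATE Rental r
SET r.itemNo = 3
WHERE r.customerNo = (SELECT c.customerNo FROM Customer c
                      WHERE c.customerName = 'April Alister')
  AND r.branchNo = (SELECT b.branchNo FROM Branch b
                    WHERE b.branchName = 'Kingsway')
  AND r.dateFrom = TO_DATE('10/28/2009', 'MM/DD/YYYY');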
