ClickHouse uniqTheta* variations - clickhouse

According to the Druid documentation, the DS_THETA function can work on sketches.
How could the Druid query below be rewritten in ClickHouse using the uniqTheta* variations?
SELECT THETA_SKETCH_ESTIMATE(
THETA_SKETCH_INTERSECT(
DS_THETA(theta_uid) FILTER(WHERE "show" = 'Bridgerton' AND "episode" = 'S1E1'),
DS_THETA(theta_uid) FILTER(WHERE "show" = 'Bridgerton' AND "episode" = 'S1E2')
)
) AS users
FROM ts_tutorial
Here's my attempt to solve it using a materialized view (MV):
CREATE TABLE ts_tutorial
(
date_id Date,
uid String,
show String,
episode String
)
ENGINE = MergeTree()
ORDER BY (date_id, show, episode, uid);
CREATE TABLE tutorial_tbl
(
date_id Date,
show String,
episode String,
theta_uid AggregateFunction(uniqTheta, String)
)
ENGINE = AggregatingMergeTree()
ORDER BY (date_id, show, episode);
CREATE MATERIALIZED VIEW IF NOT EXISTS tutorial_mv TO tutorial_tbl
AS
SELECT
date_id,
show,
episode,
uniqThetaState(uid) as theta_uid
FROM ts_tutorial
GROUP BY date_id, show, episode
;
INSERT INTO ts_tutorial VALUES ('2022-05-19','alice','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-19','alice','Game of Thrones','S1E2');
INSERT INTO ts_tutorial VALUES ('2022-05-19','alice','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-19','bob','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-20','alice','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-20','carol','Bridgerton','S1E2');
INSERT INTO ts_tutorial VALUES ('2022-05-20','dan','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-21','alice','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-21','carol','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-21','erin','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-21','alice','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-22','bob','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-22','bob','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-22','carol','Bridgerton','S1E2');
INSERT INTO ts_tutorial VALUES ('2022-05-22','bob','Bridgerton','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-22','erin','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-22','erin','Bridgerton','S1E2');
INSERT INTO ts_tutorial VALUES ('2022-05-23','erin','Game of Thrones','S1E1');
INSERT INTO ts_tutorial VALUES ('2022-05-23','alice','Game of Thrones','S1E1');
And to answer: How many users watched both episodes of Bridgerton?
SELECT finalizeAggregation(
uniqThetaIntersect(
uniqThetaStateIf(theta_uid, show = 'Bridgerton' AND episode = 'S1E1'),
uniqThetaStateIf(theta_uid, show = 'Bridgerton' AND episode = 'S1E2')
)
) AS users
FROM tutorial_mv
The above query returns 0 instead of the expected 1 (user carol), so it doesn't seem to work.

Something like this, reading the raw uid column (ts_tutorial has no theta_uid column in the schema above):
SELECT finalizeAggregation(
uniqThetaIntersect(
uniqThetaStateIf(uid, "show" = 'Bridgerton' AND "episode" = 'S1E1'),
uniqThetaStateIf(uid, "show" = 'Bridgerton' AND "episode" = 'S1E2')
)
) AS users
FROM ts_tutorial
See https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators/#-if and https://clickhouse.com/docs/en/sql-reference/functions/uniqtheta-functions/
for details.
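To query the pre-aggregated table instead of the raw rows, the stored sketches presumably need to be merged rather than aggregated again. Applying uniqThetaStateIf to a column that already holds states would treat each state as an opaque value, which may explain the 0 in the question. A minimal sketch, assuming the tutorial_tbl table from the question and assuming the -MergeState and -If combinators chain on uniqTheta the way they do on other aggregate functions:

```sql
-- Assumption: theta_uid is AggregateFunction(uniqTheta, String), as in tutorial_tbl.
-- uniqThetaMergeStateIf merges the stored sketches of the matching rows
-- and returns a sketch (not a number), which uniqThetaIntersect accepts.
SELECT finalizeAggregation(
    uniqThetaIntersect(
        uniqThetaMergeStateIf(theta_uid, show = 'Bridgerton' AND episode = 'S1E1'),
        uniqThetaMergeStateIf(theta_uid, show = 'Bridgerton' AND episode = 'S1E2')
    )
) AS users
FROM tutorial_tbl;
```

With the sample rows, only carol appears under both Bridgerton episodes, so the estimate should match the count the question expects.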

Related

How to insert all rows in a new table that have certain initials (pl/sql)

When a row contains the initial(s) 'H', I want to insert that row into a new table. How can I accomplish this? This is the cmp_employer_employees table:
As long as the new table has the same structure as cmp_employer_employees, you can add the records where the initials are "H" with an insert like this:
INSERT INTO new_table
SELECT *
FROM cmp_employer_employees
WHERE initials = 'H';
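If "contains" is meant literally, so that values such as 'H.J.' should also match, a pattern match could be used instead of equality. A minimal sketch, assuming new_table does not exist yet and that initials is a character column (both are assumptions, since the table definition is not shown):

```sql
-- Assumption: new_table is created as an empty copy of the source structure.
CREATE TABLE new_table AS
SELECT *
FROM cmp_employer_employees
WHERE 1 = 0;

-- Copy every row whose initials contain the letter H anywhere.
INSERT INTO new_table
SELECT *
FROM cmp_employer_employees
WHERE initials LIKE '%H%';
```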

NVL2 and not exists in Oracle

I am pulling some records in table A from X table.
Now I want to select the records that are not in table A but are in table B. At the same time, I don't want to select records that are in both tables.
Moreover, if a column in table A is null but the same column in the matching record in table B has a value, I want to take that record too.
Is it possible to do something like this in one statement?
This is a simple set-based operation, and relational databases are very good at that:
CREATE TABLE a (id NUMBER);
CREATE TABLE b (id NUMBER);
INSERT INTO a VALUES (1);
INSERT INTO a VALUES (3);
INSERT INTO b VALUES (2);
INSERT INTO b VALUES (3);
SELECT id FROM b
MINUS
SELECT id FROM a;
ID
2
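Since the question title mentions NOT EXISTS, the same result can be written that way as well. The second query below is only a sketch of the "null in A, value in B" requirement, and it assumes a hypothetical column val in both tables, which the example tables above do not have:

```sql
-- Rows of b that have no matching id in a (same rows as the MINUS query,
-- except that MINUS also removes duplicates).
SELECT b.id
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.id = b.id);

-- Hypothetical extension: val is an assumed extra column on both tables.
SELECT b.id, b.val
FROM b
LEFT JOIN a ON a.id = b.id
WHERE a.id IS NULL                             -- row is missing from a
   OR (a.val IS NULL AND b.val IS NOT NULL);   -- a has a gap that b fills
```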

Order By ERROR , not showing

I am trying to sort data using ORDER BY together with two substitution variables, but it is not working. The problem is described below.
Table: users
CREATE TABLE users
(
user_id VARCHAR(5) ,
user_name VARCHAR(30),
CONSTRAINT pk_users PRIMARY KEY(user_id)
)
/
INSERT INTO users
VALUES ('U01','User1')
/
INSERT INTO users
VALUES ('U02','User2')
/
Table: staffaccount
CREATE TABLE staffaccount
(
staffaccount_id VARCHAR(5) ,
user_id VARCHAR(5) ,
CONSTRAINT pk_staffaccount PRIMARY KEY(staffaccount_id),
CONSTRAINT fk_staffaccount1 FOREIGN KEY (user_id) REFERENCES users(user_id)
)
/
INSERT INTO staffaccount
VALUES ('STF01','U01')
/
INSERT INTO staffaccount
VALUES ('STF02','U02')
/
Table: location
CREATE TABLE location
(
location_id VARCHAR(5),
location_name VARCHAR(25) NOT NULL,
CONSTRAINT pk_location PRIMARY KEY(location_id)
)
/
INSERT INTO location
VALUES ('LOC01','Staff Toilet')
/
INSERT INTO location
VALUES ('LOC02','Staff Office')
/
INSERT INTO location
VALUES ('LOC03','Staff Meeting Room')
/
INSERT INTO location
VALUES ('LOC04','Staff Hall')
/
Table: bookingstaff
CREATE TABLE bookingstaff
(
staffaccount_id VARCHAR(5),
location_id VARCHAR(5),
timebooked TIMESTAMP,
usages VARCHAR(25)
)
/
INSERT INTO bookingstaff
VALUES ('STF01','LOC01',TIMESTAMP'2018-01-01 10:00:00','Pee')
/
INSERT INTO bookingstaff
VALUES ('STF02','LOC02',TIMESTAMP'2018-01-02 10:00:00','Writing')
/
INSERT INTO bookingstaff
VALUES ('STF01','LOC03',TIMESTAMP'2018-01-05 10:00:00','Meeting')
/
INSERT INTO bookingstaff
VALUES ('STF02','LOC04',TIMESTAMP'2018-01-12 10:00:00','Dancing')
/
INSERT INTO bookingstaff
VALUES ('STF01','LOC02',TIMESTAMP'2018-02-01 10:00:00','Writing')
/
INSERT INTO bookingstaff
VALUES ('STF02','LOC03',TIMESTAMP'2018-02-02 10:00:00','Meeting')
/
INSERT INTO bookingstaff
VALUES ('STF01','LOC02',TIMESTAMP'2018-02-15 10:00:00','Writing')
/
INSERT INTO bookingstaff
VALUES ('STF02','LOC04',TIMESTAMP'2018-03-01 10:00:00','Dancing')
/
INSERT INTO bookingstaff
VALUES ('STF01','LOC03',TIMESTAMP'2018-03-02 10:00:00','Meeting')
/
Those are all my table definitions. I am trying to use substitution variables
to display the data. The code is below:
SELECT u.user_name,l.location_name,b.usages,to_char(cast(b.timebooked as date),'DD-MM-YYYY')as "DATE"
FROM staffaccount s
JOIN bookingstaff b
ON b.staffaccount_id = s.staffaccount_id
LEFT OUTER JOIN users u
ON u.user_id= s.user_id
LEFT OUTER JOIN location l
ON l.location_id= b.location_id
WHERE l.location_name LIKE '%Staff%'
AND timebooked
BETWEEN date&
AND date&
With this code, here is the result (first substitution
variable: '2017-01-01', second substitution variable: '2018-02-10'):
http://prntscr.com/iztftx < Result displayed with myOra
But when I tried to add ORDER BY usages, it showed an error.
Error:
It never allows me to enter the second substitution variable.
The syntax should be &var, not var&:
WHERE l.location_name LIKE '%Staff%'
AND timebooked BETWEEN &date AND &date
ORDER BY usages
And for better readability, use distinct variable names:
WHERE l.location_name LIKE '%Staff%'
AND timebooked BETWEEN &date1 AND &date2
ORDER BY usages
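Put together with the original query, the corrected statement might look like the sketch below. It assumes SQL*Plus-style substitution and that the values are typed without quotes (for example 2017-01-01), because the quotes and the conversion are written into the statement. The names date1 and date2 are just the illustrative names from above:

```sql
SELECT u.user_name,
       l.location_name,
       b.usages,
       TO_CHAR(b.timebooked, 'DD-MM-YYYY') AS "DATE"
FROM staffaccount s
JOIN bookingstaff b
  ON b.staffaccount_id = s.staffaccount_id
LEFT OUTER JOIN users u
  ON u.user_id = s.user_id
LEFT OUTER JOIN location l
  ON l.location_id = b.location_id
WHERE l.location_name LIKE '%Staff%'
  -- Assumption: both values are entered in YYYY-MM-DD form, without quotes.
  AND b.timebooked BETWEEN TO_TIMESTAMP('&date1', 'YYYY-MM-DD')
                       AND TO_TIMESTAMP('&date2', 'YYYY-MM-DD')
ORDER BY b.usages;
```

Converting explicitly avoids relying on the session's default date format when the entered text is compared with the TIMESTAMP column.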

Oracle SQL - Creating trigger that accesses multiple tables

I have three tables:
table_family(
id CHAR(4) PRIMARY KEY,
name VARCHAR(30)
)
table_child(
id CHAR(4) PRIMARY KEY,
name VARCHAR(30),
family_parents_id CHAR(4) REFERENCES table_family(id)
)
table_babysit(
family_babysitter_id CHAR(4) REFERENCES table_family(id),
child_babysittee_id CHAR(4) REFERENCES table_child(id),
hours INTEGER
)
I'm trying to create a BEFORE INSERT trigger on table_babysit that prevents a family from babysitting its own children. So if, in table_babysit, the family_babysitter_id matches the child's family_parents_id, that would be illegal.
CREATE TRIGGER check_illegal_babysit
BEFORE INSERT
ON table_babysit
FOR EACH ROW
BEGIN
JOIN table_family ON table_family.id = family_babysitter_id
JOIN table_child ON table_child.id = child_babysittee_id
IF (table_family.id = table_child.family_parents_id) THEN
RAISE_APPLICATION_ERROR(-20000,'Family cannot babysit their own children');
END IF;
END;
I'm new to writing triggers and I can't seem to JOIN multiple tables in a trigger. What would be the proper way to create this trigger?
Whether it's in a trigger, a statement in a stored procedure, or a standalone query, a JOIN must follow the form
Select ... [into ...] from table1 JOIN table2 on join_condition ...
where a trigger or other procedural statement requires the INTO clause.
In this case you can use a CTE to build table1 from the :new values. There is a complication, however: the condition needed for the join is exactly the condition the trigger is trying to prevent. It can still be made to work by reversing the logic and selecting what you do not want:
-- define trigger (with join)
create or replace trigger check_illegal_babysit
before insert
on table_babysit
for each row
declare
x varchar2(1);
begin
with s as
(select :new.family_babysitter_id sitter
, :new.child_babysittee_id sittee
from dual
)
select null
into x
from s
left outer join table_family on(table_family.id = sitter)
left outer join table_child on(table_child.id = sittee)
where table_family.id = table_child.family_parents_id;
raise_application_error(-20000,'Family cannot babysit their own children');
exception
when no_data_found then null;
end;
--- Create test Family and Child rows
insert into table_family (id, name) values('Fam1','Family1');
insert into table_family (id, name) values('Fam2','Family2');
insert into table_family (id, name) values('Fam3','Family3');
insert into table_child( id,name,family_parents_id) values('c1f1', 'Child1 of Family1', 'Fam1');
insert into table_child( id,name,family_parents_id) values('c2f1', 'Child2 of Family1', 'Fam1');
insert into table_child( id,name,family_parents_id) values('c3f1', 'Child3 of Family1', 'Fam1');
insert into table_child( id,name,family_parents_id) values('c1f2', 'Child1 of Family2', 'Fam2');
insert into table_child( id,name,family_parents_id) values('c2f2', 'Child2 of Family2', 'Fam2');
-- Insert into babysit table to test trigger
insert into table_babysit(family_babysitter_id, child_babysittee_id) values( 'Fam2', 'c1f1') ; -- valid
insert into table_babysit(family_babysitter_id, child_babysittee_id) values( 'Fam3', 'c2f1') ; -- valid
insert into table_babysit(family_babysitter_id, child_babysittee_id) values( 'Fam1', 'c3f1') ; -- invalid
I'm sure there are other joins that accomplish what you want; I just cannot think of one at the moment. But perhaps the easiest approach to understand is to use two simple, straightforward selects. So maybe try:
create or replace trigger check_illegal_babysit
before insert
on table_babysit
for each row
declare
family_id_l table_family.id%type;
parents_id_l table_child.family_parents_id%type;
begin
select table_family.id
into family_id_l
from table_family
where id = :new.family_babysitter_id;
select family_parents_id
into parents_id_l
from table_child
where id = :new.child_babysittee_id;
if (family_id_l = parents_id_l) then
raise_application_error(-20000,'Family cannot babysit their own children');
end if;
end;
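Because family_babysitter_id already is the family id, the check can also be written as a single count against table_child, with no join and no NO_DATA_FOUND handling. This is a sketch of an alternative, not either answerer's version, and the variable name own_child_count is made up for illustration:

```sql
create or replace trigger check_illegal_babysit
before insert
on table_babysit
for each row
declare
    -- Rows where the child being babysat belongs to the babysitting family.
    own_child_count pls_integer;
begin
    -- count(*) always returns exactly one row, so no exception handler is needed.
    select count(*)
    into own_child_count
    from table_child
    where id = :new.child_babysittee_id
      and family_parents_id = :new.family_babysitter_id;

    if own_child_count > 0 then
        raise_application_error(-20000, 'Family cannot babysit their own children');
    end if;
end;
/
```

The three test inserts above should behave the same way with this version: the first two succeed and the third raises ORA-20000.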

Oracle Inserting or Updating a row through a procedure

I have a table
CREATE TABLE STUDENT
(
ID INTEGER PRIMARY KEY,
FIRSTNAME VARCHAR2(1024 CHAR),
LASTNAME VARCHAR2(1024 CHAR),
MODIFIEDDATE DATE DEFAULT sysdate
)
I am inserting a row of data
insert into STUDENT (ID, FIRSTNAME, LASTNAME, MODIFIEDDATE) values (1,'Scott', 'Tiger', sysdate);
When I have to insert a record of data, I need to write a procedure or function which does the following:
if there is no record for the same id insert the row.
if there is a record for the same id and data matches then do nothing.
if there is a record for the same id but data does not match then update the data.
I am new to Oracle. From the Java side, it is possible to select the record by id and then update it, but that would take two database calls. To avoid that, I am trying to update the table using a procedure. If the same can be done in a single database call, please mention it.
For a single SQL statement solution, you can try to use the MERGE statement, as described in this answer https://stackoverflow.com/a/237328/176569
e.g.
create or replace procedure insert_or_update_student(
p_id number, p_firstname varchar2, p_lastname varchar2
) as
begin
-- dual supplies the single source row; the parameters carry the data
merge into student st using dual on (st.id = p_id)
when not matched then insert (id, firstname, lastname)
values (p_id, p_firstname, p_lastname)
when matched then update set
st.firstname = p_firstname, st.lastname = p_lastname, st.modifieddate = sysdate;
end insert_or_update_student;
Instead of a procedure, try using MERGE in Oracle.
If a row matches, it updates the table; if no matching row is found, it inserts the values:
MERGE INTO bonuses b
USING (
SELECT employee_id, salary, dept_no
FROM employee
WHERE dept_no =20) e
ON (b.employee_id = e.employee_id)
WHEN MATCHED THEN
UPDATE SET b.bonus = e.salary * 0.1
DELETE WHERE (e.salary < 40000)
WHEN NOT MATCHED THEN
INSERT (b.employee_id, b.bonus)
VALUES (e.employee_id, e.salary * 0.05)
WHERE (e.salary > 40000)
Try this
To solve the second task ("if there is a record for the same id and data matches then do nothing"): starting with 10g, the update and insert sections of the MERGE statement accept an additional WHERE clause.
We can use it to add checks for data changes:
when matched then update
set student.last_name = query.last_name
where student.last_name <> query.last_name
This updates only the matched rows whose data has actually changed.
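Applied to the STUDENT table from the question, the three cases (insert, do nothing, update) could be combined in one statement roughly as follows. This is a sketch: the procedure name upsert_student_if_changed and the alias src are made up for illustration, and DECODE is used for the comparison because it treats two NULLs as equal, which a plain <> does not:

```sql
create or replace procedure upsert_student_if_changed(
    p_id        in student.id%type,
    p_firstname in student.firstname%type,
    p_lastname  in student.lastname%type
) as
begin
    merge into student st
    using (select p_id as id,
                  p_firstname as firstname,
                  p_lastname as lastname
           from dual) src
    on (st.id = src.id)
    when matched then update
        set st.firstname = src.firstname,
            st.lastname = src.lastname,
            st.modifieddate = sysdate
        -- Update only if at least one name differs; matching data is left alone.
        where decode(st.firstname, src.firstname, 0, 1) = 1
           or decode(st.lastname, src.lastname, 0, 1) = 1
    when not matched then insert (id, firstname, lastname)
        values (src.id, src.firstname, src.lastname);
end upsert_student_if_changed;
/
```

From Java this would be a single call to the procedure, and MODIFIEDDATE changes only when a name actually changes.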
