How can ClickHouse do a local join and local insert on distributed tables?


I have 3 distributed tables: t_user_info_all, t_user_event_all, and t_user_flat_all. They are all on the same cluster and use the same sharding key, user_id.
I want to insert the join result of t_user_event_all and t_user_info_all into t_user_flat_all, with SQL like this:
insert into t_user_flat_all select * from t_user_info_all t1 left join t_user_event_all t2 on t1.user_id = t2.user_id
With the setting distributed_product_mode = 'local', the join runs in local mode, but the insert still goes through the distributed table.
I found that with the setting parallel_distributed_insert_select = 2, SELECT and INSERT are executed on each shard from/to the underlying table of the Distributed engine. But it only works for queries like INSERT INTO distributed_table_a SELECT ... FROM distributed_table_b; the select query cannot have where conditions or joins.
Alternatively, I can run the local insert on each shard:
insert into t_user_flat_local select * from t_user_info_local t1 left join t_user_event_local t2 on t1.user_id = t2.user_id
but that makes things more complex.

Related

Oracle performance issues when using a subquery in an "In" operator

I have two queries that look almost the same, but Oracle shows very different performance for them.
Query A
Create Table T1 as Select * from FinalView1 where CustomerID in ('A0000001','A000002')
Query B
Create Table T1 as Select * from FinalView1 where CustomerID in (select distinct CustomerID from CriteriaTable)
The CriteriaTable has 800 rows, but they all belong to Customer ID 'A0000001' and 'A000002'.
This means the subquery "select distinct CustomerID from CriteriaTable" also returns only the same two elements ('A0000001','A000002') that were manually entered in Query A.
The following is the query under FinalView1:
create or replace view FinalView1_20200716 as
select
Customer_ID,
<Some columns>
from
Table1_20200716 T1
INNER join Table2_20200716 T2 on
T1.Invoice_number = T2.Invoice_number
and
T1.line_id = T2.line_id
left join Table3_20200716 T3 on
T3.id = T1.Customer_ID
left join Table4_20200716 T4 on
T4.Shipping_ID = T1.Shipping_ID
left join Table5_20200716 Table5 on
Table5.Invoice_ID = T1.Invoice_ID
left join Table6_20200716 T6 on
T6.Shipping_ID = T4.Shipping_ID
left join First_Order first on
first.Shipping_ID = T1.Shipping_ID
;
Table1_20200716, Table2_20200716, Table3_20200716, Table4_20200716, Table5_20200716, and Table6_20200716 are views over the corresponding tables, which use the temporal validity feature. For example,
the query under Table1_20200716:
Create or replace view Table1_20200716 as
select
*
from Table1 as of period for <period_name> to_date('20200716','yyyymmdd')
However, the table "First_Order" is just a normal table.
The following is the performance of both queries (according to the explain plan):
Query A:
Cardinality: 102
Cost : 204
Total Runtime: 5 secs max
Query B:
Cardinality:27921981
Cost: 14846
Total Runtime:20 mins until user cancelled
All tables are indexed on the columns used to join against other tables in FinalView1. According to the explain plan, all of these indexes were used, except on the First_Order table.
Query A used a unique index on the First_Order table, while Query B performed a full scan.
For Query B, I was expecting Oracle to first run the subquery and feed its result into the "in" operator before executing the main query, so it should have only a minor impact on performance.
Thanks in advance!

As mentioned in my comment 2 days ago, someone actually posted the solution and then removed it, although the answer actually works. After waiting for 2 days, I decided to post that solution.
That solution suggested that the performance was slowed down by the "in" operator and that I replace it with an inner join:
Create Table T1 as
Select
FV.*
from
FinalView1 FV
inner join (
select distinct
CustomerID
from
CriteriaTable
) CT on CT.customerid = FV.customerID;
The result from the explain plan was worse than before:
Cardinality: 28364465 (from 27921981)
Cost: 15060 (from 14846)
However, it only takes 17 secs, which is very good!

Oracle query select data in multi tables

I have 2 tables.
This is table A
(invoice, D/O, cost, ...) and
table B
(D/O, GRN, Qty).
Now, how can I write a query that shows table A together with GRN and Qty?

You need a LEFT OUTER JOIN to retrieve all the records from table A with matched records from table B.
Guessing at the join criteria because your question doesn't say what they are:
select a.*
, b.grn
, b.grn_line
, b.qty_grn
from a
left outer join b
on a.do = b.do
and a.do_line = b.do_line
and a.invoice_line = b.grn_line

Can I set up an SSRS report where users input parameters to a table?

I have an Oracle query that uses a created table as part of the code. Every time I need to run a report, I delete the current data and import the new data I receive. This is one column of IDs. I need to create a report in SSRS in which the user can input this data into said table as a parameter. I have designed a simple report where they can enter some of the IDs into a parameter, but there may be times when they need to enter a few thousand IDs, and the report already runs long. Here is what the SSRS code currently says:
select distinct n.id, n.notes
from notes n
join (
select max(seq_num) as seqnum, id from notes group by id) maxresults
on n.id = maxresults.ID
where n.seq_num = maxresults.seqnum
and n.id in (#MyParam)
Is there a way to have MyParam insert data into a table called My_ID that I would then join, as in Join My_Id id on n.id = id.id?
I do not have permissions to create functions or procedures in the database.
Thank you

You may try the trick with the MATERIALIZE hint, which normally forces Oracle to create a temporary table:
WITH cte1 AS
( SELECT /*+ MATERIALIZE */ 1 as id FROM DUAL
UNION ALL
SELECT 2 FROM DUAL
)
SELECT a.*
FROM table1 a
INNER JOIN cte1 b ON b.id = a.id

Update SQL Query with Join on 2 tables

I have a requirement to read from 2 tables. Once the rows are read, I have to update the flag on both tables.
My SQL query
SELECT t1.KUNNR,t1.SETT_KEY,t1.QUART_START,t1.QUART_END,t2.PAY_METH,t2.MAT_NDC,t2.AMOUNT
FROM TSAP_REBATE_MEDI t1
INNER JOIN TSAP_REBATE_LINE t2 ON t1.KUNNR=t2.KUNNR AND t1.SETT_KEY=t2.SETT_KEY
WHERE t1.PROCESSING_STATUS = 'N' AND t2.PROCESSING_STATUS = 'N'
This is working fine. Now I need an update query for the same rows, where PROCESSING_STATUS is set to 'P' on both tables.

You cannot update two tables in a single statement. Run two separate UPDATE statements of the following nature:
UPDATE t1
SET COLUMN = VALUE
FROM TSAP_REBATE_MEDI t1
INNER JOIN TSAP_REBATE_LINE t2
ON t1.KUNNR=t2.KUNNR
AND t1.SETT_KEY=t2.SETT_KEY
WHERE t1.PROCESSING_STATUS = 'N'
AND t2.PROCESSING_STATUS = 'N'
/* Add any other conditions */
However, if you want them both to be updated (or neither one), wrap both updates in a BEGIN TRANSACTION ... COMMIT block.
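One detail to watch when running the two updates one after the other: once the first UPDATE has set PROCESSING_STATUS to 'P' on one table, a second UPDATE that still filters on both tables being 'N' no longer matches anything. A minimal sketch of one way around this, assuming SQL Server (the dialect the UPDATE ... FROM and BEGIN TRANSACTION syntax above suggests), is to capture the matching keys first. The temporary table name #to_process is made up for illustration.

```sql
BEGIN TRANSACTION;

-- Capture the key pairs that are unprocessed on BOTH tables
-- before either status flag is changed.
-- #to_process is a hypothetical temp table name.
SELECT DISTINCT t1.KUNNR, t1.SETT_KEY
INTO #to_process
FROM TSAP_REBATE_MEDI t1
INNER JOIN TSAP_REBATE_LINE t2
    ON t1.KUNNR = t2.KUNNR
   AND t1.SETT_KEY = t2.SETT_KEY
WHERE t1.PROCESSING_STATUS = 'N'
  AND t2.PROCESSING_STATUS = 'N';

-- Flag the header rows for the captured keys.
UPDATE t1
SET PROCESSING_STATUS = 'P'
FROM TSAP_REBATE_MEDI t1
INNER JOIN #to_process k
    ON k.KUNNR = t1.KUNNR
   AND k.SETT_KEY = t1.SETT_KEY
WHERE t1.PROCESSING_STATUS = 'N';

-- Flag the line rows for the same keys; this no longer depends
-- on the status of TSAP_REBATE_MEDI, which was just changed.
UPDATE t2
SET PROCESSING_STATUS = 'P'
FROM TSAP_REBATE_LINE t2
INNER JOIN #to_process k
    ON k.KUNNR = t2.KUNNR
   AND k.SETT_KEY = t2.SETT_KEY
WHERE t2.PROCESSING_STATUS = 'N';

COMMIT TRANSACTION;

DROP TABLE #to_process;
```

Both updates join against the captured keys rather than against each other, so the order of the two UPDATE statements does not matter.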

Rownum in the join condition

Recently I fixed a bug: there was a rownum in the join condition.
Something like this: left join t1 on t1.id=t2.id and rownum<2. So it was supposed to return only one row regardless of the “left join”.
When I looked further into this, I realized that I don’t understand how Oracle evaluates rownum in the "left join" condition.
Let’s create two sample tables: master and detail.
create table MASTER
(
ID NUMBER not null,
NAME VARCHAR2(100)
)
;
alter table MASTER
add constraint PK_MASTER primary key (ID);
prompt Creating DETAIL...
create table DETAIL
(
ID NUMBER not null,
REF_MASTER_ID NUMBER,
NAME VARCHAR2(100)
)
;
alter table DETAIL
add constraint PK_DETAIL primary key (ID);
alter table DETAIL
add constraint FK_DETAIL_MASTER foreign key (REF_MASTER_ID)
references MASTER (ID);
prompt Disabling foreign key constraints for DETAIL...
alter table DETAIL disable constraint FK_DETAIL_MASTER;
prompt Loading MASTER...
insert into MASTER (ID, NAME)
values (1, 'First');
insert into MASTER (ID, NAME)
values (2, 'Second');
commit;
prompt 2 records loaded
prompt Loading DETAIL...
insert into DETAIL (ID, REF_MASTER_ID, NAME)
values (1, 1, 'REF_FIRST1');
insert into DETAIL (ID, REF_MASTER_ID, NAME)
values (2, 1, 'REF_FIRST2');
insert into DETAIL (ID, REF_MASTER_ID, NAME)
values (3, 1, 'REF_FIRST3');
commit;
prompt 3 records loaded
prompt Enabling foreign key constraints for DETAIL...
alter table DETAIL enable constraint FK_DETAIL_MASTER;
set feedback on
set define on
prompt Done.
Then we have this query :
select * from master t
left join detail d on d.ref_master_id=t.id
The result set is predictable: we have all the rows from the master table and 3 rows from the detail table that matched this condition d.ref_master_id=t.id.
Result Set
Then I added “rownum=1” to the join condition and the result was the same
select * from master t
left join detail d on d.ref_master_id=t.id and rownum=1
The most interesting thing is that I set “rownum<-666” and got the same result again!
select * from master t
left join detail d on d.ref_master_id=t.id and rownum<-666
Judging by the result set, this condition was evaluated as “True” for the 3 rows in the detail table. But if I use “inner join”, everything works as expected.
select * from master t
join detail d on d.ref_master_id=t.id and rownum<-666
This query doesn’t return any rows, because I can't imagine rownum being less than -666 :-)
Moreover, if I use the Oracle syntax for outer joins, “(+)”, everything works as expected too.
select * from master m ,detail t
where m.id=t.ref_master_id(+) and rownum<-666
This query doesn’t return any rows either.
Can anyone tell me what I misunderstand about outer joins and rownum?

ROWNUM is a pseudo-attribute of result sets, not of base tables. ROWNUM is defined after rows are selected, but before they're sorted by an ORDER BY clause.
edit: I was mistaken in my previous writeup of ROWNUM, so here's new information:
You can use ROWNUM in a limited way in the WHERE clause, for testing if it's less than a positive integer only. See ROWNUM Pseudocolumn for more details.
SELECT ... WHERE ROWNUM < 10
It's not clear what value ROWNUM has in the context of a JOIN clause, so the results may be undefined. There seems to be some special-case handling of expressions with ROWNUM, for instance WHERE ROWNUM > 10 always returns false. I don't know how ROWNUM<-666 works in your JOIN clause, but it's not meaningful so I would not recommend using it.
In any case, this doesn't help you to fetch the first detail row for each given master row.
To solve this you can use analytic functions and PARTITION, and combine it with Common Table Expressions so you can access the row-number column in a further WHERE condition.
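WITH numbered_cte AS (
-- Oracle does not accept a bare * next to other select-list items,
-- and the CTE needs unique column names, so list and alias them.
SELECT t.id AS master_id, t.name AS master_name,
d.id AS detail_id, d.name AS detail_name,
-- d.something is a placeholder for the column that defines "first"
ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY d.something) AS rn
FROM master t LEFT OUTER JOIN detail d ON d.ref_master_id = t.id
)
SELECT *
FROM numbered_cte
WHERE rn = 1;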

If you want to limit the number of rows taken from the join result, change the select statement like this (rownum<3 keeps the first two rows; use rownum<=3 for three):
select *
from (select *
from master t left join detail d on d.ref_master_id=t.id)
where rownum<3;
You will get the required output. Take care to define column names unambiguously when using *: both tables have id and name columns, so the inner select * produces duplicate names.
Here is a complete version with the columns listed and aliased explicitly:
select *
from (select t.id as master_id, t.name as master_name, d.id as detail_id, d.ref_master_id, d.name as detail_name
from master t left join detail d on d.ref_master_id=t.id)
where rownum<3;

A ROWNUM filter doesn't make any sense in a join, but it isn't being rejected as invalid.
The explain plan will either include the ROWNUM filter or exclude it. If it includes it, it will apply the filter to the detail table after applying the other join condition(s). So if you put in ROWNUM=100 (which will never be satisfied) all the detail rows are excluded and then the outer join kicks in.
If you put in ROWNUM=1 it seems to drop the filter.
And if you query
with
a as (select rownum a_val from dual connect by level < 10),
b as (select rownum*2 b_val from dual connect by level < 10)
select * from a left join b on a_val < b_val and rownum in (1,3);
you get something totally weird.
It probably should be rejected as an error, so expect nonsensical things to happen.
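To see for yourself whether the ROWNUM filter ends up in the plan or is dropped, you can look at the explain plan for the query from the question. This is only a sketch using the standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY against the master and detail tables defined in the question. The output differs between Oracle versions, so no expected plan is shown here.

```sql
-- Store the plan for the outer join with ROWNUM in the ON clause.
EXPLAIN PLAN FOR
select * from master t
left join detail d on d.ref_master_id = t.id and rownum = 1;

-- Display the stored plan; check the "Predicate Information"
-- section for a filter that mentions ROWNUM.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Repeating this with rownum = 100 or rownum < -666 in place of rownum = 1 shows which variants keep the filter.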
