CREATE TABLE AS takes too long in ClickHouse

I am trying to create a table in ClickHouse (v22.4.3) from the result of a query. The query returns only 1 record and takes around 12 seconds to execute. When I put this query in a CREATE TABLE AS statement like the following one, the table takes around 200 seconds to be created.
create table mydb.test_table engine = MergeTree() order by a
as (
with
cte1 as (..),
cte2 as (..),
...
final_cte as ()
select a
from final_cte
)
Does anyone know why this CREATE TABLE AS statement takes that much time even though the SELECT statement is quick?
P.S. I have around 10 CTEs before the final SELECT statement. I don't know whether this plays any role and makes the creation of the table slower.

Related

How to get all values from a LowCardinality(String) column in a more optimal way?

I am new to ClickHouse and amazed by its performance.
One thing bothers me: I have a table that uses the LowCardinality(String) type.
The DDL looks like this:
create table test.table1(code LowCardinality(String),xxx,xxx,xxx ) engine=MergeTree xxx
I want to get all distinct values from the table.
Here is my select query:
select distinct code from test.table1;
But the amount of data is quite large, about 14 billion rows. The query may take 40s to complete, and it processes all rows in the table,
so I wonder whether there is a more direct way to get the values from a LowCardinality column.
Try this query:
select code /*, count() */
from test.table1
group by code

Hive: Creating a table within a case statement

I need to update a table once in a period, however periods are either 4 or 5 weeks long.
In order to schedule a SQL query to run this process, I have been attempting to use the create table statement as a result of a successful when clause.
Is this possible? I am not getting any joy and cannot find anything that states one way or the other.
e.g.
select
case when 1=1 then
(create table db.case_test_1 as
select 'Y' as test)
end as test

How to get the select statement that was used to create a table in Oracle

I created a table in Oracle like this:
CREATE TABLE suppliers AS (SELECT * FROM companies WHERE id > 1000);
I would like to know the complete select statement which was used to create this table.
I have already tried get_ddl but it is not giving the select statement. Can you please let me know how to get the select statement?
If you're lucky one of these statements will show the DDL used to generate the table:
select *
from gv$sql
where lower(sql_fulltext) like '%create table suppliers%';
select *
from dba_hist_sqltext
where lower(sql_text) like '%create table%';
I used the word lucky because GV$SQL will usually only have results for a few hours or days, until the data is purged from the shared pool. DBA_HIST_SQLTEXT will only help if you have AWR enabled, the statement was run in the last X days that AWR is configured to hold data (the default is 8), the statement was run after the last snapshot collection (by default it happens every hour), and the statement ran long enough for AWR to think it's worth saving.
And for each table Oracle does not always store the full SQL. For security reasons, DDL statements are often truncated in the data dictionary. Don't be surprised if the text suddenly cuts off after the first N characters.
And depending on how the SQL was called, the case and whitespace may differ. Use lower and lots of wildcards to increase the chance of finding the statement.
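As a rough illustration of the "lots of wildcards" advice, the search pattern can put a wildcard between every keyword so that extra spaces, line breaks, or a schema prefix do not prevent a match. This is only a sketch: it assumes the table is still named suppliers, and it needs the same access to GV$SQL as the queries above.

```sql
-- Sketch: wildcards between keywords tolerate varying whitespace and a schema prefix.
-- lower() makes the match case-insensitive.
select sql_id, sql_fulltext
from gv$sql
where lower(sql_fulltext) like '%create%table%suppliers%';
```

A pattern this loose can also match unrelated statements that merely mention those words in that order, so the results still need to be checked by eye.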
Try this:
select distinct table_name
from
all_tab_columns where column_name in
(
select column_name from
all_tab_columns
where table_name ='SUPPLIERS'
)
This lists the tables that share column names with SUPPLIERS, which can point to the table it was created from.

Can not improve bulk delete

I am using Java with mybatis.
I have a query like this, and I need to execute it for 2000 values of key_b. That means I need to run the SQL 2000 times, which is rather slow.
DELETE FROM my_table
WHERE key_a = xxx
AND key_b = yyy
I then came up with another solution: this time I send 1000 values in the IN clause for key_b, which means I execute only two queries. I expected this one to be faster at least, but it seems to be even slower than the one above. Here is the SQL.
DELETE FROM my_table
WHERE key_a = xxxx
AND key_b IN (y1, y2, ... y1000)
For more information, key_b is the primary key, and key_a is a foreign key and has an index.
Another thing: I tried reusing the session and committing only after all the statements were executed, but it didn't improve much.
You can use a temp table for this.
I mean a table that has an id column.
You can then insert your values into that table like this:
insert into temp_table
select 1 from dual -- your ids
union all
select 2 from dual
union all
select 3 from dual
union all
......
After you fill temp_table, you can run just this:
DELETE FROM my_table
WHERE key_a = xxxx
AND key_b IN
(
select id from temp_table
);
I recommend sticking with the 1st approach: calling a prepared DELETE statement in a Java loop over the id collection. Of course, use ExecutorType REUSE or BATCH, so that the statement is prepared once and run for every record.
Furthermore, I discourage trying to bind thousands of parameters.
Anyway, I fear this is the best you can do, since the DELETE operation will check integrity constraints, and probably update the index, for every record. That part is not "bulked".

how to save a query result in a temporary table within a procedure

I'm quite new to Oracle, so I apologize in advance for the simple question.
I have a procedure in which I run a query, and I want to save the query result for further use. Specifically, I want to run a for loop which will go through my selection row by row and copy some of the values into another table. The purpose is to populate a child table (a weak entity) starting from a parent table.
For this purpose, let's imagine I have a query:
select *
from tab
where ...
Now I want to save the selection with a local scope, and therefore with a lifespan confined to the procedure itself (basically like a local variable in a C function). How can I achieve such a result?
Basically, I have a class schedule table composed like this:
Schedule
--------------------------------------------------------
subject_code | subject_name | class_starting_date | starting hour | ending hour | day_of_week
So I made a query to get all the subjects scheduled for the current academic year, and I need to use the function next_day on each row of the result set to populate a table of the actual classes scheduled for the next week.
My thought was:
I get the classes that need to be scheduled for the next week with a query, save the result somewhere, and then, through a for loop using next_day (because I need the actual date on which the class takes place), populate the "class_occurence" table. I'm not sure that this is the correct way of thinking; there could be something that performs this job without saving the result first, maybe a cursor, who knows...
Global temporary tables are a nice solution. As long as you know the structure of the data to be inserted (how many columns and what datatypes), you can insert into the global temp table. Data can only be seen by the session that does the inserts. Whether the rows are deleted or preserved at commit is controlled by the ON COMMIT option.
CREATE GLOBAL TEMPORARY TABLE my_temp_table (
column1 NUMBER,
column2 NUMBER
) ON COMMIT DELETE ROWS;
This has worked great for me where I need to have data aggregated but only for a short period of time.
Edit: the data is local and temporary, but the temp table itself is always there.
If you want to have the table in memory in the procedure, that is another solution, but a somewhat more sophisticated one.
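For the cursor idea raised in the question, a minimal PL/SQL sketch might look like the following. It is not a tested solution: the class_occurence columns (subject_code, class_date) are hypothetical, and day_of_week is assumed to hold a day name that NEXT_DAY accepts in the session's date language (for example 'MONDAY'). The filter for the current academic year is left out.

```sql
-- Sketch only: the class_occurence column names are assumed, not taken from the question.
BEGIN
  -- A cursor FOR loop opens, fetches from, and closes the cursor implicitly;
  -- the record r exists only inside the loop, so nothing has to be saved first.
  FOR r IN (SELECT subject_code, day_of_week
              FROM schedule)  -- add the filter for the current academic year here
  LOOP
    -- NEXT_DAY returns the first date after the given date that falls on that weekday.
    INSERT INTO class_occurence (subject_code, class_date)
    VALUES (r.subject_code, NEXT_DAY(TRUNC(SYSDATE), r.day_of_week));
  END LOOP;
END;
/
```

If no per-row logic is needed beyond NEXT_DAY, the same effect may be achievable with a single INSERT ... SELECT, which avoids the loop altogether.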
