Limit select count subquery work in 21.4.5.46 version but can not work in 21.10.2.15 - clickhouse

Limit select count subquery work in 21.4.5.46 version but can not work in 21.10.2.15
Sql is
select * from mytable order by sid limit (select toInt64(count(cid)*0.01) from mytable);
The sql can work very well in in 21.4.5.46 version but can not work in 21.10.2.15.
Exception is : [1002] ClickHouse exception, message: Code: 440. DB::Exception: Illegal type Nullable(Int32) of LIMIT expression, must be numeric type. (INVALID_LIMIT_EXPRESSION) (version 21.10.2.15 (official build))
How to reproduce
1 create table sql:
create table mytable(cid String,create_time String,sid String)engine = MergeTree PARTITION BY sid ORDER BY cid SETTINGS index_granularity = 8192;
2 execute sql
select * from mytable order by sid limit (select toInt64(count(cid)*0.01) from mytable);

ClickHouse release v21.9, 2021-09-09
Backward Incompatible Change
Now, scalar subquery always returns Nullable result if it's type can be Nullable. It is needed because in case of empty subquery it's result should be Null. Previously, it was possible to get error about incompatible types (type deduction does not execute scalar subquery, and it could use not-nullable type). Scalar subquery with empty result which can't be converted to Nullable (like Array or Tuple) now throws error. Fixes #25411. #26423 (Nikolai Kochetov).
Now you should use
SELECT *
FROM mytable
ORDER BY sid ASC
LIMIT assumeNotNull((
SELECT toUInt64(count(cid) * 0.01)
FROM mytable
))
Query id: e3ab56af-96e4-4e01-812d-39af945d7878
Ok.
0 rows in set. Elapsed: 0.004 sec.

Related

MIN function behavior changed on Oracle databases after SAS Upgrade to 9.4M7

I have a program that has been working for years. Today, we upgraded from SAS 9.4M3 to 9.4M7.
proc setinit
Current version: 9.04.01M7P080520
Since then, I am not able to get the same results as before the upgrade.
Please note that I am querying on Oracle databases directly.
Trying to replicate the issue with a minimal, reproducible SAS table example, I found that the issue disappear when querying on a SAS table instead of on Oracle databases.
Let's say I have the following dataset:
data have;
infile datalines delimiter="|";
input name :$8. id $1. value :$8. t1 :$10.;
datalines;
Joe|A|TLO
Joe|B|IKSK
Joe|C|Yes
;
Using the temporary table:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have
group by name;
quit;
Results:
name A
Joe TLO
However, when running the very same query on the oracle database directly I get a missing value instead:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have_oracle
group by name;
quit;
name A
Joe
As per documentation, the min() function is behaving properly when used on the SAS table
The MIN function returns a missing value (.) only if all arguments are missing.
I believe this happens when Oracle don't understand the function that SAS is passing it - the min functions in SAS and Oracle are very different and the equivalent in SAS would be LEAST().
So my guess is that the upgrade messed up how is translates the SAS min function to Oracle, but it remains a guess. Does anyone ran into this type of behavior?
EDIT: #Richard's comment
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
proc sql;
create table want as
select t1.name,
min(case when id = 'A' then value else "" end) as A length 8
from oracle_db.names t1 inner join oracle_db.ids t2 on (t1.tid = t2.tid)
group by t1.name;
ORACLE_26: Prepared: on connection 0
SELECT * FROM NAMES
ORACLE_27: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='NAMES' AND
UIC.TABLE_NAME='NAMES' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_28: Executed: on connection 1
SELECT statement ORACLE_27
ORACLE_29: Prepared: on connection 0
SELECT * FROM IDS
ORACLE_30: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='IDS' AND
UIC.TABLE_NAME='IDS' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_31: Executed: on connection 1
SELECT statement ORACLE_30
ORACLE_32: Prepared: on connection 0
select t1."NAME", MIN(case when t2."ID" = 'A' then t1."VALUE" else ' ' end) as A from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID" group by t1."NAME"
ORACLE_33: Executed: on connection 0
SELECT statement ORACLE_32
ACCESS ENGINE: SQL statement was passed to the DBMS for fetching data.
NOTE: Table WORK.SELECTED_ATTR created, with 1 row and 2 columns.
! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.34 seconds
cpu time 0.09 seconds
Use the SASTRACE= system option to log SQL statements sent to the DBMS.
options SASTRACE=',,,d';
will provide the most detailed logging.
From the prepared statement you can see why you are getting a blank from the Oracle query.
select
t1."NAME"
, MIN ( case
when t2."ID" = 'A' then t1."VALUE"
else ' '
end
) as A
from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID"
group by
t1."NAME"
The SQL MIN () aggregate function will exclude null values from consideration.
In SAS SQL, a blank value is also interpreted as null.
In SAS your SQL query returns the min non-null value TLO
In Oracle transformed query, the SAS blank '' is transformed to ' ' a single blank character, which is not-null, and thus ' ' < 'TLO' and you get the blank result.
The actual MIN you want to force in Oracle is min(case when id = "A" then value else null end) which #Tom has shown is possible by omitting the else clause.
The only way to see the actual difference is to run the query with trace in the prior SAS version, or if lucky, see the explanation in the (ignored by many) "What's New" documents.
Why are you using ' ' or '' as the ELSE value? Perhaps Oracle is treating a string with blanks in it differently than a null string.
Why not use null in the ELSE clause?
or just leave off the ELSE clause and let it default to null?
libname mylib oracle .... ;
proc sql;
create table want as
select name
, min(case when id = "A" then value else null end) as A length 8
from mylib.have_oracle
group by name
;
quit;
Also try running the Oracle code yourself, instead of using implicit pass thru.
proc sql;
connect to oracle ..... ;
create table want as
select * from connection to oracle
(
select name,
min(case when id = "A" then value else null end) as A length 8
from have_oracle
group by name
)
;
quit;
When I try to reproduce this in Oracle I get the result you are looking for so I suspect it has something to do with SAS (which I'm not familiar with).
with t as (
select 'Joe' name, 'A' id, 'TLO' value from dual union all
select 'Joe' name, 'B' id, 'IKSK' value from dual union all
select 'Joe' name, 'C' id, 'Yes' value from dual
)
select name
, min(case when id = 'A' then value else '' end) as a
from t
group by name;
NAME A
---- ----
Joe TLO
Unrelated, if you are only interested in id = 'A' then a better query would be:
select name
, min(value) as a
from t
where id = 'A'
group by name;

Is there any replacement for ROWNUM in Oracle?

I have JPA Native queries to an Oracle database. The only way I know to limit results is using 'rownum' in Oracle, but for some reason, query parser of a jar driver I have to use does not recognize it.
Caused by: java.sql.SQLException: An exception occurred when executing the following query: "/* dynamic native SQL query */ SELECT * from SFDC_ACCOUNT A where SBSC_TYP = ? and rownum <= ?". Cause: Invalid column name 'rownum'. On line 1, column 90. [parser-2900650]
com.compositesw.cdms.services.parser.ParserException: Invalid column name 'rownum'. On line 1, column 90. [parser-2900650]
How can I get rid of that?
ANSI Standard would be something like the following
SELECT *
FROM (
SELECT
T.*,
ROW_NUMBER() OVER (PARTITION BY T.COLUMN ORDER BY T.COLUMN) ROWNUM_REPLACE
FROM TABLE T
)
WHERE
1=1
AND ROWNUM_REPLACE < 100
or you could also use the following:
SELECT * FROM TABLE T
ORDER BY T.COLUMN
OFFSET 0 ROWS
FETCH NEXT 100 ROWS ONLY;

How to set range for limit clause in hive

How to set range for limit clause in hive , I have tried the below query but failed with syntax error . Can someone please help
select * from table limit 1000,2000;
You can use Row_Number window function and set the range limit.
Below Query will result only the first 20 records from the table
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid > 0 and rowid <=20;
Using Between operator to specify range
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid between 0 and 20;
To fetch rows from 20 to 40 then increase the lower/upper bound values
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid > 20 and rowid <=40;
The LIMIT clause is used to set a ceiling on the number of rows in the result set. You are getting a syntax error because of an incorrect usage of this HQL clause.
The query could be written as the following to return no more than 2000 rows:
SELECT * FROM table LIMIT 2000;
You could also write it like so to return no more than 1000 rows:
SELECT * FROM table LIMIT 1000;
However you cannot combine both into the same argument for LIMIT. The LIMIT argument must evaluate to a constant value.
I will try and expand on this information a bit to try and help solve your problem. If you are attempting to "paginate" your results the following may be of use.
FIRST I would recommend against leaning on HQL for pagination, in most situations that would be more efficiently implemented on the application logic side (query large result set, cache what you need, paginate with application logic). If you have no choice but to pull out ranges of rows you can get the desired effect through a combination of the LIMIT, ORDER BY, and OFFSET clauses.
LIMIT : This will limit your result set to a maximum number of row
ORDER BY: This will sort/order your result set based on one or more columns
OFFSET: This will start your result set at a certain row after the logical first entry in the table.
You may combine these three clauses to effectively query "pages" of your table. For example the following three queries show how to get the first 3 blocks of data from a table where each block contains 1000 rows and the target table's 'column1' is used to determine logical order.
SELECT title as "Page 1", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 0;
SELECT title as "Page 2", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 1000;
SELECT title as "Page 3", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 2000;
Each query declares 'column1' as the sorting value with ORDER BY. The queries will return no more than 1000 rows due to the LIMIT clause. Each result set will start at a different row due to the OFFSET being incremented by the "page size" for each query.
I am not sure what you are trying to achieve, but ...
That will return the 1001 and the 2001 record in the query results set only if you are using hive a hive version greater than 2.0.0
hive --version
(https://issues.apache.org/jira/browse/HIVE-11531)
Limit in Hive gives 'n' number of records randomly. It's not to print a range of records.
You may use order by in conjunction with limit to get what you want

Oracle Database Link + Inner Join + To_Number Query

DatabaseA - TableA - FieldA VARCHAR2
DatabaseB - TableB - FieldB NUMBER [dblink created]
SELECT *
FROM TableB#dblink b
INNER JOIN TableA a
ON b.FieldB = a.FieldA
There are 2 complications.
1. FieldA is VARCHAR2 but FieldB is NUMBER.
2. FieldA contains - and FieldB contains 0.
More info about the fields
FieldA: VARCHAR2(15), NOT NULL
Sample values
-
123
No non-numeric values, except for -
FieldB: NUMBER(5,0)
Sample values
0
123
No non-numeric values
What I'm trying to do is to ignore the rows if FieldA='-' OR FieldB=0, otherwise compare FieldA to FieldB.
SELECT *
FROM TableB#dblink b
JOIN TableA a
ON to_char(b.FieldB) = a.FieldA
I get the following error:
SQL Error: 17410, SQLState: 08000
No more data to read from socket.
NULLs will never match with equals, so your join already takes care of that.
You would get an implicit type conversion of (probably) the NUMBER to VARCHAR, so that should also be taken care of.
Having said that, I am a big proponent of not relying on implicit datatype conversions. So I would write my query as
SELECT *
FROM TableB#dblink b
JOIN TableA a
ON to_char(b.FieldB) = a.FieldA
If that is not giving the results you want, perhaps posting examples of the data in each table and the results you desire would be helpful.

SQL Tuning for IN Clause

The below Teradata query is taking around 18 seconds to complete.
The highlighted values passed in IN clause is from another Oracle database so I am not able to implement a join with that table.
SELECT distinct sec.SerialNum esn, ef.EngineFamilyCd family, em.EngineModelCd model,
es.EngineSeriesCd series, sac.AircraftTailNum tailNumRef, sec.EnginePositionNum enginePosition,
o1.OrganizationId ownerOrgId, o2.OrganizationId operatorOrgId,
sec.EngineInstallationDttm installedDate, sec.EngineRemovalDttm removalDate,
sec.HardwareConfigNm hardwareConfig, sec.EngineControlNm engineControl,
sec.ApplicationSelectorNm appSelector, sec.EngineMonitorInd engineMonitorInd,
sec.EngineThrustRatingFctr engineThrustRating, sec.StatusDesc engineStatus, sec.n1modifiernum n1modifier
FROM DB_MASTER_BV.SZEngineCurrent sec,
DB_MASTER_BV.EngineSeries es,
DB_MASTER_BV.EngineModel em,
DB_Master_BV.EngineFamily ef,
DB_MASTER_BV.SZAircraftCurrent sac,
DB_MASTER_BV.Organization o1,
DB_MASTER_BV.Organization o2
WHERE sec.EngineSeriesCd = es.EngineSeriesCd
and es.EngineModelCd = em.EngineModelCd
and em.EngineFamilyCd = ef.EngineFamilyCd
and sec.MasterAircraftId = sac.MasterAircraftId
and o1.MasterOrganizationId = sec.OwnerMasterOrganizationId
and o2.MasterOrganizationId = sec.OperatorMasterOrganizationId
AND (sec.SerialNum in('733276','193283','690168','741471','876374','873383','193386','906397','804314','900116','785670','900399','724321','193488','811373','779917','193699','994688',
'779410','575169','A59299','900206','193297','575484','896359','367230','810105','876485','906385','876484','707149','811222','706801','193596','731949','697881',
'889697','804626','575194','707159','706129','900230','900231','706834','811352','900229','785748','193460','888221','906272','906266','906264','906263','994356',
'194431','731966','892417','811341','577413','741572','575564','889262','706956','876157','900257','900153','706958','706957','960436','892429','892427','900354',
'697138','645655','193352','994337','707189','697833','959190','900246','811317','577437','193643','697976','890692','193229','965579','900137','900135','894897',
'697723','193363','193367','785505','907077','959184','811311','706526','577302','706529','994332','702792','706663','779834','731931','960127','193371','876183',
'741563','193235','803843','577320','994318','907087','741460','907086','959170','994462','900464','193626','877503','643711','811202','811201','704585','193504',
'193500','875246','704876','725834','699783','699780','802380','900304','706885','906191','577773','959152','872574','811435','697388','699381','892485','577698',
'907035','811445','907039','894999','894857','894595','697273','894597','959139','577894','874898','706959','900424','193337','577697','907011','875696','699555',
'699554','575629','906149','906150','193452','962968','811264','811266','962970','875395','699543','575638','906153','857962','896247','858349','779746','906161',
'906928','802857','779640','193424','550309','424520','550305','575608','872517','906169','892196','811386','811385','906173','907220','959234','876666','959231',
'876662','893785','875914','802649','550218','550315','906111','741984','550319','906405','906501','550118','643371','785254','550116','550117','802946','906629',
'907145','550325','550324','906837','550320','906838','702591','550220','550227','906415','690289','906517','704416','731431','550125','959201','906413','994176',
'550333','550140','550337','891651','550141','550338','906746','907269','550132','550137','550138','892914','550342','906123','550153','550345','950923','906129',
'873188','906850','906953','690270','890713','645352','893127','697590','874826','424439','893126','907110','550144','856305','690269','892824','550256','550257',
'906867','907186','960852','720754','960851','906866','888607','805573','811530','960756','872352','550266','550267','550264','811518','888896','906730','994958',
'892247','960970','875186','906987','424124','550232','A59303','702660','875885','811609','888626','424219','906897','994981','731502','697496','695345','962996',
'894371','907153','805541','907154','424337','906613','906615','900512','906610','956141','994611','804582','994718','888648','575219','888756','896973','424395',
'872117','A59227','697616','731380','697614','900161','690410','994213','956155','956154','779492','994231','702876','577248','994727','193818','890879','722243',
'906499','577354','888560','645121','896972','960823','804279','900175','888853','193724','550285','550282','906469','994803','906466','888299','877141','890984',
'695688','994533','888327','A59348','A59346','994410','733116','550296','550290','550292','906478','731763','725658','896408','645145','994751','731654','740358',
'906441','550158','193849','906543','906448','994262','575824','424186','906345','643663','888305','906243','906244','702963','906453','906452','956119','906451',
'956116','950489','550166','906454','367457','896764','575833','994268','906252','994127','733236','906258','956123','550178','994777','956126','956127','956128',
'906786','906788','906687','643290','994631','956225','994632','888574','906365','804228','731599','643682','550182','804369','994784','550186','550183','888826',
'575127','906439','890482','906438','906691','890472','994509','193147','575718','804215','575276','994793','897257'))
and END(sec.EngineValidPd) is until_changed
order by esn
Also if there is more than 1000 records, I am implementing the IN clause as follows
AND (sec.SerialNum in( first 999 recods) OR sec.SerialNum in( next 999 recods)… OR sec.SerialNum in( remaining recods))
Please suggest solution which would be faster than the above query and which will not cause issue with more than 1000 records in IN clause
What is your Teradata release?
In TD14 there's a built-in table function to split a string of values, you can simply pass all values within a single string:
AND sec.SerialNum IN
(
SELECT token
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1, '733276,193283,690168,741471,876374', ',')
RETURNS (outkey INTEGER,
tokennum INTEGER,
token VARCHAR(20) CHARACTER SET UNICODE)
) AS d
)

Resources