Oracle query: subselect with ORDER BY

I normally use MS SQL Server and am a total rookie with Oracle.
I get an Oracle driver error when I use an ORDER BY in my subquery.
Example (my real statement is much more complex, but I doubt that matters to my problem; I can post it if needed):
SELECT col1
, col2
, (SELECT colsub FROM subtbl WHERE idsub = tbl.id AND ROWNUM=1 ORDER BY coldate) col3
FROM tbl
With such a construct I get an ODBC driver error: ORA-00907 (reported in German; the English message is "missing right parenthesis").
If I remove the ORDER BY coldate, everything works fine. I couldn't find any reason why, so what am I doing wrong?

Writing the ROWNUM and the ORDER BY this way doesn't make sense anyway: the ORDER BY is evaluated after the WHERE clause, so it has no effect here, and WHERE ROWNUM = 1 simply picks an arbitrary row rather than the one with the earliest coldate.
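A quick way to see this, using the question's own table names: the first query below picks an arbitrary row and only then sorts it, while the second orders first and then takes the top row.
-- ROWNUM is assigned before the ORDER BY is applied: returns an arbitrary coldate
SELECT coldate FROM subtbl WHERE ROWNUM = 1 ORDER BY coldate;

-- order first in an inline view, then filter: returns the earliest coldate
SELECT coldate FROM (SELECT coldate FROM subtbl ORDER BY coldate) WHERE ROWNUM = 1;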
It also gets a little more complicated because it is hard to join a sub-query back to the main query once it is nested more than one level deep: a correlated reference such as tbl.id can only reach one level up, so the query below won't necessarily work either; you can't join between tbl and subtbl this way.
SELECT
  col1,
  col2,
  (
    select colsub
    from (
      SELECT colsub
      FROM subtbl
      WHERE idsub = tbl.id
      order by coldate
    )
    where rownum = 1
  ) as col3
FROM tbl
So you'll need something like the FIRST/LAST aggregate (KEEP (DENSE_RANK FIRST)) instead, as shown in the example below:
SELECT
  col1,
  col2,
  (SELECT max(colsub) keep (dense_rank first order by coldate) as colsub
     FROM subtbl
    WHERE idsub = tbl.id
    group by idsub
  ) col3
FROM tbl
The KEEP (DENSE_RANK FIRST) syntax is more complicated than it needs to be, but it will get the job done.
Another option would be the ROW_NUMBER analytic function, which also solves the problem:
SELECT
  col1,
  col2,
  (select colsub
     from (
       SELECT
         idsub,
         row_number() over (partition by idsub order by coldate) as rown,
         colsub
       FROM subtbl
     ) a
    WHERE a.idsub = tbl.id
      and a.rown = 1
  ) col3
FROM tbl

What you are doing wrong is clear: you are using an ORDER BY in a sub-query, where it generally serves no purpose.
You are also applying that ORDER BY to a sub-query that always returns one row (because of ROWNUM = 1), which makes even less sense.
If you want the query result to be sorted, use an ORDER BY at the highest level.
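For example, a minimal sketch of the original query with the sort moved to the outermost level (the sort key col1 is just an illustration):
SELECT col1
     , col2
     , (SELECT colsub FROM subtbl WHERE idsub = tbl.id AND ROWNUM = 1) col3
FROM tbl
ORDER BY col1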

Try:
select
  col1,
  col2,
  colsub
from (
  select
    t.col1,
    t.col2,
    st.colsub,
    st.coldate,
    max(st.coldate) over (partition by st.idsub) max_coldate
  from
    tbl t,
    subtbl st
  where
    st.idsub = t.id
)
where
  coldate = max_coldate

Related

Oracle: SELECT unique on multiple columns

How can I select just one row dynamically, since the objective is to get uniqueness even across multiple columns?
select distinct
coalesce(least(ColA, ColB),cola,colb) A1, greatest(ColA, ColB) B1
from T
The best solution is to use UNION
select colA from your_table
union
select colB from your_table;
Update:
If you want to find the duplicates, then use EXISTS as follows:
SELECT COLA, COLB FROM YOUR_TABLE T1
WHERE EXISTS (SELECT 1 FROM YOUR_TABLE T2
WHERE T2.COLA = T1.COLB OR T2.COLB = T1.COLA)
If I correctly understand the words "objective is to get the uniqueness even on multiple columns", the number of columns may vary and the table can contain 2, 3 or more columns.
In this case you have several options. For example, you can unpivot the values, sort them, pivot back and take the unique rows; the exact code depends on the Oracle version.
A second option is listagg(), but its result length is limited and you need a separator that does not appear in the values.
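As an illustration of those two ideas combined, here is a sketch. It assumes the 3-column VARCHAR2 table test used below; the '|' separator and the handling of all-NULL rows are exactly the limitations just mentioned.
with base as (
  select rownum as rn, col1, col2, col3
  from test
), unpivoted as (
  select rn, val
  from base
  unpivot include nulls (val for src in (col1, col2, col3))
), keys as (
  -- listagg ignores NULLs, so (A, NULL, B) and (B, A, NULL) get the same key
  select rn, listagg(val, '|') within group (order by val) as sort_key
  from unpivoted
  group by rn
)
select b.col1, b.col2, b.col3
from base b
join keys k on k.rn = b.rn
where not exists (
  select 1
  from keys k2
  where k2.sort_key = k.sort_key   -- rows that are entirely NULL would need extra handling
    and k2.rn < k.rn
);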
Another option is to compare the data as collections. Here I used sys.dbms_debug_vc2coll, which is a simple table of varchars; MULTISET EXCEPT does the main job:
with t as (
  select rownum rn, col1, col2, col3,
         sys.dbms_debug_vc2coll(col1, col2, col3) as coll
  from test
)
select col1, col2, col3
from t a
where not exists (
  select 1
  from t b
  where b.rn < a.rn
    and a.coll multiset except b.coll is empty
)
There is a dbfiddle demonstrating this with a 3-column table, NULLs and different test cases.

Reduce overload on pl/sql

I have a requirement to do matching on a few attributes, one by one, and I'm looking to avoid multiple SELECT statements. Below is the example.
Table1
Col1 | Price | Brand  | Size
-----|-------|--------|------
A    | 10$   | BRAND1 | SIZE1
B    | 10$   | BRAND1 | SIZE1
C    | 30$   | BRAND2 | SIZE2
D    | 40$   | BRAND2 | SIZE4

Table2
Col1 | Col2 | Col3
-----|------|-----
B    | XYZ  | PQR
C    | ZZZ  | YYY

Table3
Col1 | Col2 | Col3 | LIKECOL1 | Price | Brand  | Size
-----|------|------|----------|-------|--------|------
B    | XYZ  | PQR  | A        | 10$   | BRAND1 | SIZE1
C    | ZZZ  | YYY  | D        | NULL  | BRAND2 | NULL
In table3, I need to insert data from table2 after checking the conditions below:
Find a match for the table2 record on Price, Brand and Size.
If no match is found, try just Brand and Size.
If there is still no match, try Brand only.
In the example above, the first record in table2 found a match on all 3 attributes, so it was inserted into table3; for the second record, record 'D' matches, but only on 'Brand'.
All I can think of is writing 3 different INSERT statements, like the ones below, in an Oracle PL/SQL block.
insert into table3
select from tab2
where all 3 attributes are matching;
insert into table3
select from tab2
where brand and price are matching
and not exists in table3 (not exists is to avoid
inserting the same record which was already
inserted with all 3 attributes matched);
insert into table3
select from tab2
where Brand is matching and not exists in table3;
Can anyone please suggest a better way to achieve this, avoiding selecting from table2 multiple times?
This is a case for OUTER APPLY.
OUTER APPLY is a type of lateral join that allows you to join onto dynamic views that refer to tables appearing earlier in your FROM clause. With that ability, you can define a dynamic view that finds all the matches, sorts them by the pecking order you've specified, and then use FETCH FIRST 1 ROW ONLY to include only the first one in the results.
Using OUTER APPLY means that if there is no match, you will still get the table2 record, just with all the match columns NULL. If you don't want that, you can change OUTER APPLY to CROSS APPLY.
Here is a working example (with step by step comments), shamelessly stealing the table creation scripts from Michael Piankov's answer:
create table Table1 (Col1, Price, Brand, size1) as
  select 'A', '10', 'BRAND1', 'SIZE1' from dual union all
  select 'B', '10', 'BRAND1', 'SIZE1' from dual union all
  select 'C', '30', 'BRAND2', 'SIZE2' from dual union all
  select 'D', '40', 'BRAND2', 'SIZE4' from dual;

create table Table2 (Col1, Col2, Col3) as
  select 'B', 'XYZ', 'PQR' from dual union all
  select 'C', 'ZZZ', 'YYY' from dual;
-- INSERT INTO table3
SELECT t2.col1, t2.col2, t2.col3,
       t1.col1 likecol1,
       decode(t1.price, t1_template.price, t1_template.price, null) price,
       decode(t1.brand, t1_template.brand, t1_template.brand, null) brand,
       decode(t1.size1, t1_template.size1, t1_template.size1, null) size1
FROM
  -- Start with table2
  table2 t2
  -- Get the row from table1 matching on col1... this is our search template
  inner join table1 t1_template
          on t1_template.col1 = t2.col1
  -- Get the best match from table1 for our search
  -- template, excluding the search template itself
  outer apply (
    SELECT *
    FROM table1 t1
    WHERE 1 = 1
      -- Exclude the search template itself
      and t1.col1 != t2.col1
      -- All matches include BRAND
      and t1.brand = t1_template.brand
    -- Order by match strength based on price and size
    order by case when t1.price = t1_template.price and t1.size1 = t1_template.size1 THEN 1
                  when t1.size1 = t1_template.size1 THEN 2
                  else 3 END
    -- Only get the best match for each row in T2
    FETCH FIRST 1 ROW ONLY
  ) t1;
Unfortunately it is not clear what you mean by "match". What do you expect if there is more than one match?
Should only the first match be used, or should all available pairs be generated?
Regarding your question of how to avoid multiple inserts, there is more than one way:
You could use a multitable insert (INSERT FIRST) with conditions; a syntax sketch follows below.
You could join table1 to itself to get all the pairs and filter the results in the WHERE clause.
You could use analytic functions.
I suppose there are other ways as well. But why would you want to avoid 3 simple inserts? They are easy to read and maintain.
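For the first option, here is only a syntax sketch of a conditional multitable insert. It assumes table3 exists with the columns from the question, and the source query is deliberately simplified: it would still need to pick a single best candidate per table2 row (for example with the ranking logic in the OUTER APPLY answer above) rather than producing one row per brand match.
insert first
  when price_match = 1 and size_match = 1 then
    into table3 (col1, col2, col3, likecol1, price, brand, size1)
    values (col1, col2, col3, likecol1, price, brand, size1)
  when size_match = 1 then
    into table3 (col1, col2, col3, likecol1, brand, size1)
    values (col1, col2, col3, likecol1, brand, size1)
  else
    into table3 (col1, col2, col3, likecol1, brand)
    values (col1, col2, col3, likecol1, brand)
  -- the WHEN conditions are checked in order and only the first matching branch fires
  select t2.col1, t2.col2, t2.col3,
         m.col1 as likecol1,
         m.price, m.brand, m.size1,
         case when m.price = tmpl.price then 1 else 0 end as price_match,
         case when m.size1 = tmpl.size1 then 1 else 0 end as size_match
  from table2 t2
  join table1 tmpl on tmpl.col1 = t2.col1
  join table1 m    on m.brand = tmpl.brand
                  and m.col1 <> tmpl.col1;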
Here is an example using analytic functions:
create table Table1 (Col1, Price, Brand, size1) as
  select 'A', '10', 'BRAND1', 'SIZE1' from dual union all
  select 'B', '10', 'BRAND1', 'SIZE1' from dual union all
  select 'C', '30', 'BRAND2', 'SIZE2' from dual union all
  select 'D', '40', 'BRAND2', 'SIZE4' from dual;

create table Table2 (Col1, Col2, Col3) as
  select 'B', 'XYZ', 'PQR' from dual union all
  select 'C', 'ZZZ', 'YYY' from dual;
with s as (
  select Col1, Price, Brand, size1,
         count(*) over (partition by Price, Brand, size1) as match3,
         count(*) over (partition by Price, Brand)        as match2,
         count(*) over (partition by Brand)               as match1,
         lead(Col1) over (partition by Price, Brand, size1 order by Col1) as like3,
         lead(Col1) over (partition by Price, Brand        order by Col1) as like2,
         lead(Col1) over (partition by Brand               order by Col1) as like1,
         lag(Col1)  over (partition by Price, Brand, size1 order by Col1) as like_desc3,
         lag(Col1)  over (partition by Price, Brand        order by Col1) as like_desc2,
         lag(Col1)  over (partition by Brand               order by Col1) as like_desc1
  from Table1 t
)
select t.Col1, t.Col2, t.Col3,
       coalesce(s.like3, s.like_desc3, s.like2, s.like_desc2, s.like1, s.like_desc1) as like_col,
       case when match3 > 1 then size1 end as size1,
       case when match1 > 1 then Brand end as Brand,
       case when match2 > 1 then Price end as Price
from table2 t
left join s on s.Col1 = t.Col1
COL1  COL2  COL3  LIKE_COL  SIZE1  BRAND   PRICE
B     XYZ   PQR   A         SIZE1  BRAND1  10
C     ZZZ   YYY   D         -      BRAND2  -

How to optimize this SELECT with a subquery (Oracle)

Here is my query,
SELECT ID As Col1,
(
SELECT VID FROM TABLE2 t
WHERE (a.ID=t.ID or a.ID=t.ID2)
AND t.STARTDTE =
(
SELECT MAX(tt.STARTDTE)
FROM TABLE2 tt
WHERE (a.ID=tt.ID or a.ID=tt.ID2) AND tt.STARTDTE < SYSDATE
)
) As Col2
FROM TABLE1 a
Table1 has 48850 records and Table2 has 15944098 records.
I have separate indexes on TABLE2: on ID; on ID & STARTDTE; on STARTDTE; and on ID, ID2 & STARTDTE.
The query is still too slow. How can this be improved? Please help.
I'm guessing that the OR in the inner queries is messing with the optimizer's ability to use indexes. I also wouldn't recommend a solution that scans all of TABLE2, given its size.
This is why, in this case, I would suggest using a function that retrieves the information you are looking for efficiently (2 index scans per call):
CREATE OR REPLACE FUNCTION getvid(p_id table1.id%TYPE)
  RETURN table2.vid%TYPE IS
  l_result table2.vid%TYPE;
BEGIN
  SELECT vid
    INTO l_result
    FROM (SELECT vid, startdte
            FROM (SELECT vid, startdte
                    FROM table2 t
                   WHERE t.id = p_id
                     AND t.startdte < SYSDATE
                   ORDER BY t.startdte DESC)
           WHERE rownum = 1
          UNION ALL
          SELECT vid, startdte
            FROM (SELECT vid, startdte
                    FROM table2 t
                   WHERE t.id2 = p_id
                     AND t.startdte < SYSDATE
                   ORDER BY t.startdte DESC)
           WHERE rownum = 1
           ORDER BY startdte DESC)
   WHERE rownum = 1;
  RETURN l_result;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    -- mirror the original scalar subquery, which returns NULL when there is no match
    RETURN NULL;
END;
Your SQL would become:
SELECT ID As Col1,
getvid(a.id) vid
FROM TABLE1 a
Make sure you have indexes on both table2(id, startdte DESC) and table2(id2, startdte DESC). The order of the index is very important.
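For example (the index names here are just placeholders):
CREATE INDEX table2_id_startdte_ix  ON table2 (id,  startdte DESC);
CREATE INDEX table2_id2_startdte_ix ON table2 (id2, startdte DESC);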
Possibly try the following, though untested.
WITH max_times AS
(SELECT a.ID, MAX(t.STARTDTE) AS Startdte
FROM TABLE1 a, TABLE2 t
WHERE (a.ID=t.ID OR a.ID=t.ID2)
AND t.STARTDTE < SYSDATE
GROUP BY a.ID)
SELECT b.ID As Col1, tt.VID
FROM TABLE1 b
LEFT OUTER JOIN max_times mt
ON (b.ID = mt.ID)
LEFT OUTER JOIN TABLE2 tt
ON ((mt.ID=tt.ID OR mt.ID=tt.ID2)
AND mt.startdte = tt.startdte)
You can look at analytic functions to avoid having to hit the second table twice. Something like this might work:
SELECT id AS col1, vid
FROM (
  SELECT t1.id, t2.vid,
         RANK() OVER (PARTITION BY t1.id
                      ORDER BY CASE WHEN t2.startdte < TRUNC(SYSDATE)
                                    THEN t2.startdte ELSE null END
                               NULLS LAST) AS rn
  FROM table1 t1
  JOIN table2 t2 ON t1.id IN (t2.id, t2.id2)
)
WHERE rn = 1;
The inner select gets the id and vid values from the two tables with a simple join on id or id2. The RANK function calculates a ranking for each matching row in the second table based on startdte. It's complicated a bit by your date filter: I've used a CASE to effectively ignore any dates today or later by changing the evaluated value to NULL, which means the ORDER BY in the OVER clause needs NULLS LAST so those rows sort after everything else.
I'd suggest you run the inner select on its own first - maybe with just a couple of id values for brevity - to see what it's doing and what ranks are being allocated.
The outer query is then just picking the top-ranked result for each id.
You may still get duplicates though; if table2 has more than one row for an id with the same startdte they'll get the same rank, but then you may have had that situation before. You may need to add more fields to the order by to break ties in a way that makes sense to you.
But this is largely speculation without being able to see where your existing query is actually slow.
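One way to see where the time is going is to look at the execution plan of the original query (a standard technique, shown here only as a suggestion):
EXPLAIN PLAN FOR
SELECT ID As Col1,
       (SELECT VID FROM TABLE2 t
         WHERE (a.ID = t.ID or a.ID = t.ID2)
           AND t.STARTDTE =
               (SELECT MAX(tt.STARTDTE)
                  FROM TABLE2 tt
                 WHERE (a.ID = tt.ID or a.ID = tt.ID2)
                   AND tt.STARTDTE < SYSDATE)
       ) As Col2
FROM TABLE1 a;

SELECT * FROM TABLE(dbms_xplan.display);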

Replace self-join with analytic functions

How do I go about replacing the following self join using analytics:
SELECT
  t1.col1 col1,
  t1.col2 col2,
  SUM(  extract(hour   FROM (t1.times_stamp - t2.times_stamp)) * 3600
      + extract(minute FROM (t1.times_stamp - t2.times_stamp)) * 60
      + extract(second FROM (t1.times_stamp - t2.times_stamp)) ) div,
  COUNT(*) tot_count
FROM tab1 t1,
     tab1 t2
WHERE t2.col1 = t1.col1
  AND t2.col2 = t1.col2
  AND t2.col3 = t1.sequence_num
  AND t2.times_stamp < t1.times_stamp
  AND t2.col4 = 3
  AND t1.col4 = 4
  AND t2.col5 NOT IN (103, 123)
  AND t1.col5 != 549
GROUP BY t1.col1, t1.col2
I'm pretty sure you won't be able to replace the self-join with analytics, because you are using inter-row operations (t1.times_stamp - t2.times_stamp). Analytics can only access the values of the current row and the values of aggregate functions over a subset of rows (the windowing clause).
See Tom Kyte's article on analytic functions for further analysis of their limitations.
It almost looks like you could eliminate the self-join on t2 and replace
t1.times_stamp - t2.times_stamp
with something like
t1.times_stamp - lag(t1.times_stamp) over (partition by col1, col2 order by times_stamp)
but the different filters on col4 and col5 for t1 and t2 are what prevent you from doing this.
Analytic functions are applied after the where / group by on the main query, so you'd need to have a single filter on t1 in order to use lag/lead to specify following or preceding rows in a sequence.
Also, you'd need to push the sum/group by to an outer query to aggregate after the analytic function:
select col1, col2, sum(times_stamp_diff)
from (
  select col1, col2,
         times_stamp - lag(times_stamp) over (partition by col1, col2 order by times_stamp) as times_stamp_diff
  from tab1
  where ...   -- a single set of filters on tab1
)
group by col1, col2

Returning multiple columns using CASE in a SELECT statement in Oracle

I have a scenario where I need to retrieve values from different sub-queries based on a condition in a main SELECT statement. I was trying to use CASE, but the problem is that CASE does not support multiple columns. Is there any workaround for this, or any other way to achieve it?
My scenario in a simplified query:
select col1,col2,
case when col3='E01089001' then
(select 1,3 from dual)
else
(select 2,4 from dual)
end
from Table1
where col1='A0529';
Here's another way of writing it which may address concerns about accessing the second table more times than necessary.
select col1, col2,
       case when col3='E01089001' then 1 else 2 end,
       case when col3='E01089001' then 3 else 4 end
from Table1, dual
where col1='A0529';
Your example uses an obviously simplified subquery, so this version looks kind of silly as-is; there's no reason to join with DUAL at all. But in your real query you presumably have a subquery like SELECT a, b FROM otherTable WHERE someCondition. So you would want to use the actual column names instead of numeric literals and the actual table name instead of dual, and move the subquery conditions into the final WHERE clause.
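For illustration only, a sketch under those assumptions (otherTable, its columns a and b, and the filter condition are hypothetical names, not from the question):
select t.col1,
       t.col2,
       case when t.col3 = 'E01089001' then o.a else 2 end as c3,
       case when t.col3 = 'E01089001' then o.b else 4 end as c4
from Table1 t,
     otherTable o                        -- hypothetical table standing in for dual
where t.col1 = 'A0529'
  and o.some_column = 'SOME_VALUE';      -- the former sub-query's WHERE condition, moved here;
                                         -- it must restrict otherTable to a single row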
A quick and dirty solution.
select dummy,substr(c,1,instr(c,',')-1) c1,substr(c,instr(c,',')+1) c2
from (
select dummy,
case when dummy='X' then
(select 1||','||3 from dual)
end c
from (select * from dual)
)
If each case only allows one column, then you probably need two cases:
select col1,col2,
case when col3='E01089001' then
(select 1 from dual)
else
(select 2 from dual)
end,
case when col3='E01089001' then
(select 3 from dual)
else
(select 4 from dual)
end
from Table1
where col1='A0529';
I hope I don't need to tell you that this sort of stuff doesn't scale very well when the database tables become large.
Case does support multiple columns in the conditional check
CASE WHEN A=X AND B=Y THEN ... END
What you are trying to do in your example is return a table (2 columns) into a resultset that expects one column: col1, col2, (col3,col4).
You need to return them separately: col1, col2, col3, col4
select
col1,
col2,
case when col3='E01089001' then (select 1 from dual) else (select 3 from dual) end,
case when col3='E01089001' then (select 2 from dual) else (select 4 from dual) end
from Table1 where col1='A0529';
The best approach for me is to use REGEXP_REPLACE: have a single string returned from the CASE expression, and in the outer query block tokenize that string back into separate fields.
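A sketch of that idea using REGEXP_REPLACE with back-references (the comma separator and the column aliases are just choices for this example):
select dummy,
       regexp_replace(c, '^(.*),(.*)$', '\1') as c1,   -- first token
       regexp_replace(c, '^(.*),(.*)$', '\2') as c2    -- second token
from (
  select dummy,
         case when dummy = 'X' then '1' || ',' || '3'
              else '2' || ',' || '4'
         end as c
  from dual
);
REGEXP_SUBSTR with an occurrence argument would work similarly for splitting the string.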
