Paginated query optimization
It would be a real help if you could share your insight.
I have a query with multiple joins and filter criteria; the result is sorted, and finally only 100 records are retrieved. Which will be more efficient?
Option 1:
select * from
( SELECT INTR.col1 AS ID
FROM INTR, TR, J
WHERE
INTR.col1 > ?
AND ........
AND ........
AND ........
ORDER BY INTR.col1
)
where rownum <= 100;
Option 2:
SELECT INTR.col1 AS ID
FROM INTR, TR, J
WHERE
INTR.col1 > ?
AND ........
AND ........
AND ........
AND rownum <= 100;
Option 2 would be the better choice if we could get rid of the sorting and return 100 records as soon as we have them. How can I confirm this? Please help.
Option 2 is more performant, for the obvious reason that you do not need to sort.
The drawback (and often the reason option 1 is required) is that option 2 produces unstable results, i.e. you may get a different 100 rows each time, even if your data are unchanged.
BTW, Oracle can optimize the first option so that it does not need to sort the whole cursor result; it only has to find the top N rows and return them sorted.
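To confirm which plan you are getting, check the execution plan: when Oracle applies this top-N optimization, the plan shows a SORT ORDER BY STOPKEY step instead of a full SORT ORDER BY. A minimal sketch using option 1 (the :b1 bind stands in for the ? placeholder; the elided predicates are left out):

EXPLAIN PLAN FOR
SELECT *
FROM ( SELECT INTR.col1 AS ID
         FROM INTR, TR, J
        WHERE INTR.col1 > :b1
          -- ... remaining join and filter predicates as in option 1 ...
        ORDER BY INTR.col1 )
WHERE rownum <= 100;

-- look for SORT ORDER BY STOPKEY in the output: it means Oracle keeps
-- only the top 100 rows during the sort instead of sorting everything
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);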
Related
I have the following query, which is not sorting the table the way I want:
SELECT * FROM tbl
ORDER BY
BAN,
BEN,
bill_seq_no DESC,
CASE
WHEN Ebene='BAN - Open Debts' THEN 1
WHEN Ebene='BEN - Open Debts' THEN 2
END,
Rufnummer
;
It should sort the table first by BAN, then by BEN. On the third level, the row with Ebene='BEN - Open Debts' has bill_seq_no = NULL, which is why that row sorts to the bottom.
I want it at the top.
How can I do that?
Got it! It's
SELECT * FROM adam_tmp.AAM711119__result
ORDER BY
BAN,
BEN,
CASE
WHEN Ebene LIKE '%BEN - Open Debts%' THEN 1
ELSE 2
END,
bill_seq_no DESC,
Rufnummer
;
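If the goal is simply to move the NULL bill_seq_no rows to the top, an explicit NULLS FIRST on that sort key may be an alternative worth trying; this is a hedged sketch, untested against this data, and unlike the CASE it moves every NULL to the top regardless of Ebene:

SELECT * FROM adam_tmp.AAM711119__result
ORDER BY
BAN,
BEN,
bill_seq_no DESC NULLS FIRST,
Rufnummer
;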
In an algorithm, the user passes a query, for instance:
SELECT o_orderdate, o_orderpriority FROM h_orders WHERE rownum <= 5
The query returns the following:
1996-01-02 5-LOW
1996-12-01 1-URGENT
1993-10-14 5-LOW
1995-10-11 5-LOW
1994-07-30 5-LOW
The algorithm needs the count for the selected attributes (o_orderdate and o_orderpriority in the above example), and therefore it rewrites the query to:
SELECT o_orderdate, count(o_orderdate) FROM
(SELECT o_orderdate, o_orderpriority FROM h_orders WHERE rownum <= 5)
GROUP BY o_orderdate
This query returns the following:
1992-01-01 5
However the intended result is:
1996-12-01 1
1995-10-11 1
1994-07-30 1
1996-01-02 1
1993-10-14 1
Any idea how I could rewrite the parsing stage or how the user could pass a syntactically different query to receive the above results?
The rows returned by the inner query are essentially non-deterministic, as they depend on the order in which the optimiser identifies rows as part of the required data set. A change in execution plan due to modified predicates might change the order in which the rows come back, and new rows added to the table can also change which rows are included.
If you always want n rows, then either use distinct(o_orderdate) in the inner query, which will render the GROUP BY unnecessary,
or add another outer select with rownum to get n of the grouped rows, like this:
select o_orderdate, counter from
(
SELECT o_orderdate, count(o_orderdate) as counter FROM
(SELECT o_orderdate, o_orderpriority FROM h_orders)
GROUP BY o_orderdate
)
WHERE rownum <= 5
Although the results will most likely be useless, as they will be non-deterministic (as mentioned by David Aldridge).
As your outer query makes no use of "o_orderpriority", why not just get rid of the subquery and simply query like this:
SELECT o_orderdate, count(o_orderdate) AS order_count
FROM h_orders
WHERE rownum <= 5
GROUP BY o_orderdate
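Note that WHERE rownum <= 5 is applied before the GROUP BY, so this still counts within an arbitrary set of 5 rows, just like the original. To make the set of rows reproducible, give the inner query a deterministic ORDER BY before rownum cuts it off. A sketch, assuming h_orders has a unique key column o_orderkey (that column name is an assumption, borrowed from the TPC-H orders table this example resembles):

SELECT o_orderdate, count(o_orderdate) AS order_count
FROM (SELECT o_orderdate
        FROM h_orders
       ORDER BY o_orderkey) -- assumed unique key; pins down which 5 rows are taken
WHERE rownum <= 5
GROUP BY o_orderdate;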
I have two tables:
Vehicles
make | model | modification
Audi | A 5   | A 5 2010 Sportsback 2.8
Audi | A 5   | A 5 2012 Quattro L
Audi | A 5   | A 5 Cabriolet
and
matchingModel
make | model | modContain | modEnd | finalModel
Audi | A 5   | Sportback  |        | A5 Sportback
Audi | A 5   |            | L      | A5 L
Audi | A 5   |            |        | A5
My task is to get only the best-fitting finalModel by finding matches (as can be seen in the select below).
First I tried to join the tables:
(SELECT
matchingModel.finalModel
FROM vehicles
LEFT OUTER JOIN matchingModel ON
matchingModel.TEXT1 = vehicles.make
AND vehicles.model = nvl(matchingModel.model,vehicles.model)
AND vehicles.modification LIKE decode(matchingModel.modContain, NULL, vehicles.modification, '%'||matchingModel.modContain||'%')
AND vehicles.modification LIKE decode(matchingModel.modEnd, NULL, vehicles.modification, '%'||' '||matchingModel.modEnd)
)
AS bestMatch
but that did not work: once Sportsback was matched as sportsback, the row was later overwritten with the plain A5, because that one matches too.
So next I made this work simply by "nvl-ing" all the possible options: nvl( (select where make and model fit and modContain is in the middle of modification), nvl( (select where make and model fit and modification ends with modEnd and modEnd is not empty), (select where make and model fit, and so on) ) ) AS bestMatch.
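For concreteness, here is a minimal sketch of what I understand that nested-NVL approach to look like, written with scalar subqueries and the column names from the tables above (untested):

SELECT v.make, v.model, v.modification,
       -- take the modContain match first, then the modEnd match,
       -- then the plain make/model match
       nvl( (SELECT m.finalModel
               FROM matchingModel m
              WHERE m.make = v.make
                AND m.model = v.model
                AND m.modContain IS NOT NULL
                AND v.modification LIKE '%'||m.modContain||'%'),
            nvl( (SELECT m.finalModel
                    FROM matchingModel m
                   WHERE m.make = v.make
                     AND m.model = v.model
                     AND m.modEnd IS NOT NULL
                     AND v.modification LIKE '%'||m.modEnd),
                 (SELECT m.finalModel
                    FROM matchingModel m
                   WHERE m.make = v.make
                     AND m.model = v.model
                     AND m.modContain IS NULL
                     AND m.modEnd IS NULL) ) ) AS bestMatch
  FROM vehicles v;

Each scalar subquery must return at most one row for this to work, and each one is evaluated per vehicle row.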
This works, but it is very slow (and both tables have more than 500k records).
This is just a part of a very large select, so it is difficult to rewrite it in the normal way.
Anyway, the question is: are there any best practices for getting the best match, only once, fast, in Oracle? The problems I have run into are performance, values matching twice, and the WHERE clause not working because I cannot know whether modContain or modEnd is empty.
Thank you in advance.
Sorry for my English.
It is not quite there yet, but I worked out an example that you can continue to work out for yourself: SQL Fiddle Demo
select * from (
  select
    case when v.modification like '%'||m.modContain||'%' then 2
         when m.modcontain is null then 1
         else 0 end m1,
    case when v.modification like '%'||m.modend then 2
         when m.modend is null then 1
         else 0 end m2,
    m.make mmake, m.model mmodel, modcontain, modend, finalmodel,
    v.make vmake, v.model vmodel, modification
  from vehicles v, matchingmodel m
  where v.make = m.make
    and soundex(v.model) = soundex(m.model)
) x
order by m1+m2 desc
So the sub-query adds together the matches, and the highest score should be your best match. I also used SOUNDEX, which may help you because Sportback and Sportsback are not quite the same; it is also what let me treat A5 and A 5 as the same. To make it fast you will have to work a lot on good indexes and watch the explain plan, especially since you have 500k records. That is not an easy undertaking.
As for the idea of writing a procedure (which is a good idea): untested, it might look like this:
create or replace function vehicle_matching(i_vehicles vehicles%rowtype,
                                            i_matchingmodel matchingmodel%rowtype)
  return number
is
  l_return number;
begin
  if i_vehicles.modification like '%'||i_matchingmodel.modContain||'%' then
    l_return := 3;
  elsif soundex(i_vehicles.modification) like '%'||soundex(i_matchingmodel.modContain)||'%' then
    l_return := 2;
  ...
  end if;
  if i_vehicles.modification like '%'||i_matchingmodel.modend then
    l_return := l_return + 1; -- there is no i++ in PL/SQL
  elsif
  ...
  end if;
  return l_return;
end vehicle_matching;
I also wondered whether it would be more efficient to work with INSTR and SUBSTR rather than with % wildcards, but I do not really think that is the case.
You may consider something like this:
write a query that returns 1 on any partial match,
then write another query that returns another 1 on another partial match, etc.
Repeat this for all possible columns that count towards your 'similarity'.
In the end, the row with the highest sum (or count) of 1's will be the closest match; see the sketch below.
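A hedged sketch of that idea, combined with keeping only the single best match per vehicle via ROW_NUMBER() (column names are the ones used above, and I assume make/model/modification identify a vehicle row; untested):

select make, model, modification, finalModel
from (
  select v.make, v.model, v.modification, m.finalModel,
         row_number() over (
           partition by v.make, v.model, v.modification
           order by
             -- sum of 1s: each partial match contributes one point
               case when m.modContain is not null
                     and v.modification like '%'||m.modContain||'%' then 1 else 0 end
             + case when m.modEnd is not null
                     and v.modification like '%'||m.modEnd then 1 else 0 end
           desc
         ) rn
  from vehicles v
  join matchingModel m
    on m.make = v.make
   and soundex(m.model) = soundex(v.model)
)
where rn = 1;

Because ROW_NUMBER() assigns exactly one rn = 1 per partition, each vehicle comes back once, which addresses the "values fit twice" problem.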
I need to select rows randomly from an Oracle DB.
Ex: Assume a table with 100 rows; how can I randomly return 20 of those records out of the entire 100 rows?
SELECT *
FROM (
SELECT *
FROM table
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum < 21;
SAMPLE() is not guaranteed to give you exactly 20 rows, but might be suitable (and may perform significantly better than a full query + sort-by-random for large tables):
SELECT *
FROM table SAMPLE(20);
Note: the 20 here is an approximate percentage, not the number of rows desired. In this case, since you have 100 rows, to get approximately 20 rows you ask for a 20% sample.
SELECT * FROM table SAMPLE(10) WHERE ROWNUM <= 20;
This is more efficient, as it doesn't need to sort the table.
SELECT column FROM
( SELECT column, dbms_random.value FROM table ORDER BY 2 )
where rownum <= 20;
In summary, two approaches have been introduced:
1) using an ORDER BY DBMS_RANDOM.VALUE clause
2) using the SAMPLE([%]) clause
The first way has the advantage of correctness: you will never fail to get a result if one actually exists, while with the second way you may get no result even though rows satisfying the query condition exist, since information is reduced during sampling.
The second way has the advantage of efficiency: you will get results faster and put a lighter load on your database.
I was given a warning from a DBA that my query using the first way puts a heavy load on the database.
You can choose one of the two ways according to your needs!
In the case of huge tables, the standard way of sorting by dbms_random.value is not effective, because you need to scan the whole table, and dbms_random.value is a pretty slow function that requires context switches. For such cases, there are 3 additional methods:
1: Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
i.e. get 1% of all blocks, then sort them randomly and return just 1 row.
2: if you have an index/primary key on a column with a normal distribution of values, you can get the min and max values, generate a random value in that range, and fetch the first row with a value greater than or equal to that randomly generated value.
Example:
--big table with 1 mln rows with primary key on ID with normal distribution:
Create table s1(id primary key,padding) as
select level, rpad('x',100,'x')
from dual
connect by level<=1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
3: get random table block, generate rowid and get row from the table by this rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select/*+ rule */ file#,block#,objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);
To randomly select 20 rows, I think you'd be better off selecting all of them randomly ordered and taking the first 20 of that set.
Something like:
Select *
from (select *
from table
order by dbms_random.value) -- you can also use DBMS_RANDOM.RANDOM
where rownum < 21;
Best used for small tables to avoid selecting large chunks of data only to discard most of it.
Here's how to pick a random sample out of each group:
SELECT GROUPING_COLUMN,
MIN (COLUMN_NAME) KEEP (DENSE_RANK FIRST ORDER BY DBMS_RANDOM.VALUE)
AS RANDOM_SAMPLE
FROM TABLE_NAME
GROUP BY GROUPING_COLUMN
ORDER BY GROUPING_COLUMN;
I'm not sure how efficient it is, but if you have a lot of categories and sub-categories, this seems to do the job nicely.
-- Q. How do you find a random 50% of the records in a table?
When we want a percent-wise random subset of the data:
SELECT *
FROM (
SELECT *
FROM table_name
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum <= (select count(*) from table_name) * 50/100;
I have written a cursor in a PL/SQL block. This block takes a lot of time if there are many records.
How can I write this without a cursor, or is there any alternative approach that will reduce the time?
Is there any way to perform the insert into one table and the delete from another table as a single query?
DECLARE
  MDLCursor SYS_REFCURSOR;
  -- these variables were not declared in the original fragment;
  -- the %TYPE anchors below are assumptions based on how they are used
  v_CallType_ID      sysmdl_calltypes.call_type_id%TYPE; -- assumed to be populated elsewhere
  v_mdldest_id       DialCodes.dest_id%TYPE;
  v_mdldigits        DialCodes.digits%TYPE;
  v_mdlEffectiveDate DialCodes.Effectivedate%TYPE;
  v_mdlExpDate       DialCodes.expirydate%TYPE;
BEGIN
open MDLCursor for
select dc.dest_id, dc.digits, dc.Effectivedate, dc.expirydate
from DialCodes dc
INNER JOIN MDL d
ON dc.Dest_ID = d.Dest_ID
AND d.PriceEntity = 1
join sysmdl_calltypes s
on s.call_type_id = v_CallType_ID
and s.dest_id = dc.Dest_ID
and s.call_type_id not in
(select calltype_id from ignore_calltype_for_routing)
order by length(dc.digits) desc, dc.digits desc;
loop
    fetch MDLCursor
      into v_mdldest_id, v_mdldigits, v_mdlEffectiveDate, v_mdlExpDate;
    exit when MDLCursor%NOTFOUND; -- exit right after the fetch, or the last row is processed twice
insert into tt_pendingcost_temp
(Dest_ID,
Digits,
CCASDigits,
Destination,
tariff_id,
NewCost,
Effectivedate,
ExpiryDate,
previous,
Currency)
select v_mdldest_id,
Digits,
v_mdldigits,
Destination,
tariff_id,
NewCost,
Effectivedate,
ExpiryDate,
previous,
Currency
FROM tt_PendingCost
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
if SQL%ROWCOUNT > 0 then
delete FROM tt_PendingCost
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
end if;
end loop;
close MDLCursor;
END;
I don't have your tables and your data so I can only guess at a couple of things that would be slowing you down.
Firstly, the query used in your cursor has an ORDER BY clause in it. If this query returns a lot of rows, Oracle has to fetch them all and sort them all before it can return the first row. If this query typically returns a lot of results, and you don't particularly need it to return sorted results, you may find your PL/SQL block speeds up a bit if you drop the ORDER BY. That way, you can start getting results out of the cursor without needing to fetch all the results, store them somewhere and sort them first.
Secondly, the following is the WHERE clause used in your INSERT INTO ... SELECT ... and DELETE FROM ... statements:
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
I don't see how Oracle can make effective use of indexes with any of these conditions. It would therefore have to do a full table scan each time.
The last two conditions seem reasonable, and there doesn't seem to be a lot that can be done with them. I'd like to focus on the first two conditions, as I think there's more scope for improvement there.
The second of the four conditions is
instr(Digits, v_MDLDigits) = 1
This condition holds if and only if Digits starts with the contents of v_MDLDigits. A better way of writing this would be
Digits LIKE v_MDLDigits || '%'
The advantage of using LIKE in this situation instead of INSTR is that Oracle can make use of indexes when using LIKE. If you have an index on the Digits column, Oracle will be able to use it with this query. Oracle would then be able to focus in on those rows that start with the digits in v_MDLDigits instead of doing a full table scan.
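As a sketch (the index name is made up for illustration):

-- a plain B-tree index on Digits supports an index range scan
-- for prefix searches such as Digits LIKE :prefix || '%'
CREATE INDEX tt_pendingcost_digits_ix ON tt_PendingCost (Digits);

SELECT *
  FROM tt_PendingCost
 WHERE Digits LIKE :v_MDLDigits || '%';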
The first of the four conditions is:
substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
If v_MDLDigits has length at least 2, and all entries in the Digits columns also have length at least 2, then this condition is redundant since it is implied by the previous one we looked at.
I'm not sure why you would have a condition like this. The only reason I can think of for having it is if you have a function-based index on substr(Digits, 1, 2). If not, I would be tempted to remove this substr condition altogether.
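For reference, such a function-based index would look something like this (index name made up); it matches the substr predicate exactly:

CREATE INDEX tt_pendingcost_digits2_ix
  ON tt_PendingCost (SUBSTR(Digits, 1, 2));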
I don't think the cursor is what is making this procedure run slowly, and there's no single statement I know of that can insert into one table and delete from another. To speed this procedure up, I think you just need to tune the queries a bit.