I'm learning MS SQL Server 2008 R2 so please excuse my ignorance.
This query takes 3 sec and I would like to do it in less than 1 sec.
The query is only for testing purposes; in reality I would join on different fields.
select * from
(
select row_number() over(order by t1.id) as n, t1.id as id1, t2.id as id2, t3.id as id3, t4.id as id4, t5.id as id5
from dbo.Context t1
inner join dbo.Context t2 on t1.id = t2.test
inner join dbo.Context t3 on t2.id = t3.test
inner join dbo.Context t4 on t3.id = t4.test
inner join dbo.Context t5 on t4.id = t5.test
) as t
where t.n between 950000 and 950009;
I'm afraid this will be worse by the time I have several billion records in this table.
Do I need to enable multi-threading from configuration or something?
There is no real way to optimize the paging portion of such a query, the part that is
t.n between 950000 and 950009
Which is really
{ ROW_NUMBER } between 950000 and 950009
Without fully materializing the INNER JOINs, there is no way to accurately row-number the result. This is unlike a single table with Row_Number - the Query Optimizer can sometimes just count the index keys and go to a direct range.
The only thing you can do is ensure that the JOIN conditions are all fully indexed and have the indexes INCLUDE the columns that will be selected (so they become COVERING INDEXes). There is no point showing specifics since those are not your real columns.
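Purely as an illustration against the test query from the question (the real columns will differ), a covering index could look roughly like this. It assumes dbo.Context has the id and test columns used in the joins; the index name IX_Context_test is made up.

```sql
-- Assumption: dbo.Context(id, test) as in the test query; the index name is illustrative.
-- The join predicate "t1.id = t2.test" can seek on test, and the SELECT list only needs id,
-- so id is carried in the index leaf level via INCLUDE and no lookup into the table is needed.
CREATE NONCLUSTERED INDEX IX_Context_test
    ON dbo.Context (test)
    INCLUDE (id);
```

If id is already the clustered index key, it is stored in every nonclustered index anyway, so the INCLUDE is redundant there but harmless.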
Do I need to enable multi-threading from configuration or something?
By default, parallelism is [already] turned on so such a query will very likely gather the data in multiple streams.
I'd suggest creating the inner query as an indexed view and then running your paging off of that. Since an indexed view actually has a real index on it, the same optimization tricks that work on tables can be used.
See here for more information on indexed views including the limitations.
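A minimal sketch of that idea, under some assumptions: indexed views cannot contain self-joins or ROW_NUMBER(), so the test query from the question cannot be used as-is. The example therefore joins two hypothetical tables, dbo.Parent(id) and dbo.Child(id, parent_id), and applies the row numbering outside the view. All object names here are invented for illustration.

```sql
-- Hypothetical tables: dbo.Parent(id) and dbo.Child(id, parent_id), each with id as primary key.
-- Indexed views require SCHEMABINDING and two-part object names.
CREATE VIEW dbo.ParentChild
WITH SCHEMABINDING
AS
SELECT p.id AS parent_id, c.id AS child_id
FROM dbo.Parent AS p
INNER JOIN dbo.Child AS c ON c.parent_id = p.id;
GO

-- The first index on a view must be unique and clustered; creating it materializes the join.
CREATE UNIQUE CLUSTERED INDEX IX_ParentChild
    ON dbo.ParentChild (parent_id, child_id);
GO

-- Page over the materialized rows. NOEXPAND asks the optimizer to read the view's index
-- directly instead of expanding the view back into its base tables.
SELECT t.parent_id, t.child_id
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY v.parent_id, v.child_id) AS n,
           v.parent_id, v.child_id
    FROM dbo.ParentChild AS v WITH (NOEXPAND)
) AS t
WHERE t.n BETWEEN 950000 AND 950009;
```

Because the ORDER BY matches the clustered index key, the row numbering can follow the index order rather than sorting the joined result. Whether that brings the query under 1 second would have to be measured.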
Related
I'm building a search query right now, using an Oracle 19c database.
sample query:
select * from t1
inner join t2 on t2.id = t1.id and t2.name <> 'AKA'
left outer join t3 Buyer on Buyer.id = t1.id and Buyer.userId is null
left outer join t3 Seller on Seller.id = t1.id and Seller.userId = t1.userId
where rownum < 500;
This query runs super slow without indexes. It took about 14s to return the result.
I checked the explain plan and added some indexes to the database. The query time went down to 3s! But it's still not enough. I need it to respond even faster, in around 1s.
I rechecked the autotrace. There are still two places that use a full scan even though I gave them an index.
The first is
INDEX t2.name
- Filter Predicates
- t2.NAME<>'AKA'
Is there a way to index a <> predicate in Oracle? I tried building the index in multiple ways. For example,
create index t2_name_idx on t2 (case when name <> 'AKA' then name end); did not work. The simple way, create index t2_name_idx on t2 (name); did not work either.
The second is
HASH JOIN RIGHT OUTER 890639 556473 400400 2906406
Access Predicates
AND
T3.ID=T1.ID
T3.USERID=T1.USERID
TABLE ACCESS
ADDRESSES BY INDEX ROWID BATCHED 1225069 314696 384117 596570
INDEX
T3_UERID FULL SCAN 1225069 4708 4678 121216
Filter Predicates
T3.USERID IS NOT NULL
The above means that when doing left outer join t3 Seller on Seller.id = t1.id and Seller.userId = t1.userId, it first checks that t3.userid is not null. The index T3_USERID is just a simple non-unique index on column T3.userid.
Any help or hint will be appreciated.
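One thing that may be worth trying for the first case, offered as a sketch rather than a confirmed fix: Oracle generally considers a function-based index only when the query contains the indexed expression itself. The example below reuses the index definition from the question and rewrites the filter to use the same CASE expression. The selected columns are illustrative.

```sql
-- Same function-based index as in the question.
create index t2_name_idx on t2 (case when name <> 'AKA' then name end);

-- The predicate repeats the identical CASE expression so it can be matched to the index.
-- "is not null" is true exactly for the rows where name <> 'AKA'
-- (rows with a NULL name are excluded in both forms).
select t2.id, t2.name
from t2
where case when t2.name <> 'AKA' then t2.name end is not null;
```

Whether the optimizer then prefers the index still depends on statistics and on how many rows have a name other than 'AKA'. If most rows qualify, a full scan may remain the cheaper plan.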
I have two queries that look almost the same, but Oracle gives very different performance.
Query A
Create Table T1 as Select * from FinalView1 where CustomerID in ('A0000001','A000002')
Query B
Create Table T1 as Select * from FinalView1 where CustomerID in (select distinct CustomerID from CriteriaTable)
The CriteriaTable has 800 rows, but all of them belong to Customer IDs 'A0000001' and 'A000002'.
This means the subquery "select distinct CustomerID from CriteriaTable" returns only the same two elements ('A0000001','A000002') that were entered manually in Query A.
The following is the query behind FinalView1:
create or replace view FinalView1_20200716 as
select
Customer_ID,
<Some columns>
from
Table1_20200716 T1
INNER join Table2_20200716 T2 on
T1.Invoice_number = T2.Invoice_number
and
T1.line_id = T2.line_id
left join Table3_20200716 T3 on
T3.id = T1.Customer_ID
left join Table4_20200716 T4 on
T4.Shipping_ID = T1.Shipping_ID
left join Table5_20200716 Table5 on
Table5.Invoice_ID = T1.Invoice_ID
left join Table6_20200716 T6 on
T6.Shipping_ID = T4.Shipping_ID
left join First_Order first on
first.Shipping_ID = T1.Shipping_ID
;
Table1_20200716, Table2_20200716, Table3_20200716, Table4_20200716, Table5_20200716 and Table6_20200716 are views over the corresponding tables, using the temporal validity feature. For example,
the query behind Table1_20200716 is:
Create or replace view Table1_20200716 as
select
*
from Table1 as for period of to_date('20200716','yyyymmdd')
However, table "First_Order" is just a normal table.
Following is the performance for both queries (According to explain plan):
Query A:
Cardinality: 102
Cost : 204
Total Runtime: 5 secs max
Query B:
Cardinality:27921981
Cost: 14846
Total Runtime:20 mins until user cancelled
All tables are indexed on the columns used to join against other tables in FinalView1. According to the explain plan, all of these indexes were used except on the First_Order table.
Query A used the unique index on the First_Order table, while Query B performed a full scan.
For Query B, I was expecting Oracle to first run the subquery and feed its result into the IN operator before executing the main query, so it should have only a minor impact on performance.
Thanks in advance!
As mentioned in my comment 2 days ago, someone posted a solution and then removed it, even though the answer actually works. After waiting 2 days, I decided to post that solution myself.
The solution suggested that performance was being slowed down by the "in" operator and that I should replace it with an inner join:
Create Table T1 as
Select
FV.*
from
FinalView1 FV
inner join (
select distinct
CustomerID
from
CriteriaTable
) CT on CT.customerid = FV.customerID;
The result from the explain plan was worse than before:
Cardinality:28364465 (from 27921981)
Cost: 15060 (from 14846)
However, it only takes 17 secs, which is very good!
I have the query below.
How can I use a full outer join for table T4 to get all records?
WHERE
(DB.T4.AUTH_REV_NO=DB.T2.AUTH_REV_NO
AND DB.T4.AUTH_NO=DB.T2.AUTH_NO)
AND (DB.T2.AUTH_CURR_IN='Y' )
AND (DB.T3.AUTH_NO=DB.T2.AUTH_NO)
AND (DB.T3.AUTH_REV_NO=DB.T2.AUTH_REV_NO )
AND (DB.T6.FNC_ID=DB.T4.FNC_ID)
AND (DB.T7.FNC_SEG_ID=DB.T6.FNC_SEG_ID)
AND (DB.T1.SCT_ID(+)=DB.T7.SCT_ID
AND DB.T1.FNC_SEG_ID(+)=DB.T7.FNC_SEG_ID)
AND (DB.T8.NDE_ID=DB.T12.NDE_ID)
AND (DB.T7.FNC_SEG_ID=DB.T8.FNC_SEG_ID)
AND (DB.T7.SCT_ID=DB.T8.SCT_ID)
AND ((DB.T12.NDE_ID=DB.T6.NDE_STRT_ID)
OR (DB.T12.NDE_ID=DB.T6.NDE_END_ID))
AND (DB.T5.FNC_ID(+)=DB.T4.FNC_ID)
AND (T13_A4.REF_ID(+)=DB.T5.REF_TONE_TYP_ID)
AND (fne.FNC_SEG_ID=DB.T8.FNC_SEG_ID)
AND (fne.NDE_ID=DB.T8.NDE_ID)
AND (fne.SCT_ID=DB.T8.SCT_ID)
AND fnode.NDE_ID=DB.T6.NDE_STRT_ID
AND tnode.NDE_ID=DB.T6.NDE_END_ID
AND (DB.T4.REF_FNC_TYP_ID=T13_A1.REF_ID)
AND (ne_port.NDE_EQP_ID=fne.NDE_EQP_ID)
AND (ne_port.NDE_EQP_PRN_ID=ne_card.NDE_EQP_ID)
AND (ne_card.NDE_EQP_PRN_ID=ne_shelf.NDE_EQP_ID)
AND (ne_shelf.NDE_EQP_PRN_ID=ne_rack.NDE_EQP_ID)
AND (eq.EQP_ID=ne_card.EQP_ID)
AND (eq.REF_EQP_CLS_ID=T13_A2.REF_ID)
AND (DB.T3.REF_AUTH_STS_ID=T13_A3.REF_ID)
AND (DB.T3.AUTH_STS_ID
IN (SELECT MAX(DB.T3.AUTH_STS_ID) FROM DB.T3
WHERE (DB.T3.AUTH_NO,DB.T3.AUTH_REV_NO)
IN
(SELECT
DB.T3.AUTH_NO,
MAX(DB.T3.AUTH_REV_NO)
FROM
DB.T3
GROUP BY
DB.T3.AUTH_NO)
GROUP BY
DB.T3.AUTH_NO))
How can I use a full outer join for table T4 and for column FNC_TONE_LVL_QT to get all records?
Please help.
You posted a whole lot of "joins". I'm not going to rewrite it for you, but I'd suggest you switch to the more recent explicit JOIN syntax, which makes things somewhat simpler and easier to understand because it separates joins from conditions. Moreover, it allows you to outer join the same table to more than one other table, which is impossible with Oracle's old (+) outer join operator.
Something like this
select ...
from table_1 a left join table_2 b on a.id = b.id
full outer join table_3 c on c.id = a.id
...
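As a slightly more concrete sketch using two of the (+) conditions from the question: the original FROM clause was not posted, so the aliases and the selected columns below are assumptions made only for illustration.

```sql
-- Old style:  DB.T1.SCT_ID(+) = DB.T7.SCT_ID AND DB.T1.FNC_SEG_ID(+) = DB.T7.FNC_SEG_ID
-- Explicit form: keep every T7 row, with T1 columns NULL where there is no match.
SELECT t7.FNC_SEG_ID, t7.SCT_ID, t1.SCT_ID AS T1_SCT_ID
FROM DB.T7 t7
LEFT JOIN DB.T1 t1
       ON t1.SCT_ID = t7.SCT_ID
      AND t1.FNC_SEG_ID = t7.FNC_SEG_ID;

-- Old style:  DB.T5.FNC_ID(+) = DB.T4.FNC_ID keeps unmatched rows from T4 only.
-- FULL OUTER JOIN keeps unmatched rows from both T4 and T5.
SELECT t4.FNC_ID AS T4_FNC_ID, t5.FNC_ID AS T5_FNC_ID
FROM DB.T4 t4
FULL OUTER JOIN DB.T5 t5
       ON t5.FNC_ID = t4.FNC_ID;
```

Keep in mind that a WHERE condition comparing a column from the outer-joined side to a value discards the NULL-extended rows again, so such filters usually belong in the ON clause.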
Query 1
select student.identifier,
id_tab.reporter_name,
non_id_tab.reporter_name
from student_table student
inner join id_table id_tab on (student.is_NEW = 'Y'
and student.reporter_id = id_tab.reporter_id
and id_tab.name in('name1','name2'))
inner join id_table non_id_tab on (student.non_reporter_id = non_id_tab.reporter_id)
Query 2
select student.identifier,
id_tab.reporter_name,non_id_tab.reporter_name
from student_table student,
id_table id_tab,
id_table non_id_tab
where student.is_NEW = 'Y'
and student.reporter_id = id_tab.reporter_id
and id_tab.name in('name1','name2')
and student.non_reporter_id = non_id_tab.reporter_id
Since these two queries produce exactly the same output, I am assuming they are equivalent (please correct me if I am wrong).
I was wondering whether either of them is more efficient than the other.
Can anyone help me here please?
I would rewrite it as follows, using the ON only for JOIN conditions and moving the filters to a WHERE condition:
...
from student_table student
inner join id_table id_tab on ( student.reporter_id = id_tab.reporter_id )
inner join id_table non_id_tab on (student.non_reporter_id = non_id_tab.reporter_id)
where student.is_NEW = 'Y'
and id_tab.name in('name1','name2')
This should give a more readable query; however, no matter how you write it (the ANSI join is highly preferable), you should check the explain plans to understand how the query will be executed.
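Assuming the database is Oracle, one way to check the plan is sketched below. It uses the rewritten query with the select list from the question. Running the same two statements for the comma-join version lets you compare the plans side by side.

```sql
-- Store the plan for the ANSI JOIN form in the plan table (the query itself is not executed).
EXPLAIN PLAN FOR
select student.identifier, id_tab.reporter_name, non_id_tab.reporter_name
from student_table student
inner join id_table id_tab on (student.reporter_id = id_tab.reporter_id)
inner join id_table non_id_tab on (student.non_reporter_id = non_id_tab.reporter_id)
where student.is_NEW = 'Y'
and id_tab.name in ('name1','name2');

-- Display the plan that was just stored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```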
In terms of performance, there should be no difference.
Execution Plans created by the Oracle optimizer do not differ.
In terms of readability, joining tables inside the WHERE clause is an old style (SQL-89).
From SQL-92 onward, it is recommended to use the JOIN syntax.
I'm using PostgreSQL 9.3 and have the following tables (simplified to only show the relevant fields):
SITES:
id
name
...
DEVICES:
id
site_id
mac_address UNIQUE
...
Given the mac_address of a particular device, I want to get the details of the associated site. I have the following two queries:
Using LEFT JOIN:
SELECT s.* FROM sites s
LEFT JOIN devices d ON s.id = d.site_id
WHERE d.mac_address = '00:00:00:00:00:00';
Using SUBQUERY:
SELECT s.* FROM sites s
WHERE s.id IN (SELECT d.site_id FROM devices d WHERE d.mac_address = '00:00:00:00:00:00');
Which of the two queries would have the best performance over an infinitely growing database? I have always leaned towards the LEFT JOIN option, but would be interested to know how the performance of both rates on a large data set.
It generally won't make any difference, because they should result in the same query plan. At least, an EXISTS subquery will; IN isn't always as intelligently optimised.
For the subquery, rather than using IN (...) you should generally prefer EXISTS (...).
SELECT s.*
FROM sites s
WHERE EXISTS (
SELECT 1
FROM devices d
WHERE d.mac_address = '00:00:00:00:00:00'
AND d.site_id = s.id
);
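To see whether the two forms really end up with the same plan on your data, you could compare them with EXPLAIN ANALYZE. This is a sketch using the tables and sample MAC address from the question. Note that EXPLAIN ANALYZE actually executes the statements.

```sql
-- Plan and actual timings for the JOIN form.
-- The WHERE condition on d.mac_address rejects NULL-extended rows,
-- so this LEFT JOIN behaves like an inner join.
EXPLAIN ANALYZE
SELECT s.* FROM sites s
LEFT JOIN devices d ON s.id = d.site_id
WHERE d.mac_address = '00:00:00:00:00:00';

-- Plan and actual timings for the EXISTS form.
EXPLAIN ANALYZE
SELECT s.*
FROM sites s
WHERE EXISTS (
    SELECT 1
    FROM devices d
    WHERE d.mac_address = '00:00:00:00:00:00'
    AND d.site_id = s.id
);
```

Since mac_address is UNIQUE, both plans would be expected to start from the unique index on devices and then fetch the matching site, but the output is the thing to trust.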