I have this self-join that is very slow on an Oracle DB. I have put indexes on all fields concerned. Does anybody have advice on how to increase performance?
select count(tNew.idtariffa) CONT
from tariffe tAtt
join tariffe tNew on tAtt.idtariffa = tNew.idtariffa
where (tAtt.stato_attivo = 't')
and (tNew.stato_attivo = 'f')
and (tAtt.validity_date < tNew.validity_date)
and (tAtt.dataimport < tNew.dataimport)
and (tNew.validity_date < to_date('2017-6-26','YYYY-MM-DD'))
Try the PUSH_PRED hint:
select /*+ NO_MERGE(tNew) PUSH_PRED(tNew) */
count(tNew.idtariffa) CONT
from tariffe tAtt
join tariffe tNew on tAtt.idtariffa = tNew.idtariffa
where (tAtt.stato_attivo = 't')
and (tNew.stato_attivo = 'f')
and (tAtt.validity_date < tNew.validity_date)
and (tAtt.dataimport < tNew.dataimport)
and (tNew.validity_date < to_date('2017-6-26','YYYY-MM-DD'))
An EXISTS version is worth a try (note it counts each tNew row once, even when several tAtt rows match):
select count(1) cont
from tariffe n
where stato_attivo = 'f'
and validity_date < date '2017-06-26'
and exists ( select null
from tariffe
where idtariffa = n.idtariffa
and stato_attivo = 't'
and validity_date < n.validity_date
and dataimport < n.dataimport )
Performance tuning without details like data volumes, data skew, index definitions, explain plan, etc. is just guessing.
So here are some more guesses :)
Your driving table should be tariffe tNew, as that's the one you use to cap the result set:
tNew.validity_date < to_date('2017-6-26','YYYY-MM-DD')
Now, unless tNew.stato_attivo = 'f' is extremely selective, you're going to be retrieving a large chunk of all the rows in the table (depending on how far back the records go), so a Full Table Scan would be the most efficient way of grabbing those records.
The join on tariffe tAtt is problematic because idtariffa is not a unique column. So the join is a set of tNew records against a set of tAtt records. These will be filtered in memory using the criteria in the WHERE clause.
" I have put indexes on all fields concerned"
Single column indexes won't help here. You might get some joy from a compound index on all the pertinent columns:
tariffe (stato_attivo , validity_date, idtariffa, dataimport)
This would be worth building if you run this query very often.
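If you want to try it, the DDL would be something along these lines (a sketch; the index name is invented):
create index tariffe_perf_ix on tariffe (stato_attivo, validity_date, idtariffa, dataimport);
Since every column the query touches is in the index, Oracle could then resolve both sides of the self-join from the index alone, without visiting the table.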
Any other guesses? Subquery factoring, to hit the main table just once: doing a single Full Table Scan would speed things up if tariffe has a lot of columns.
with cte as (
select stato_attivo , validity_date, idtariffa, dataimport
from tariffe
where validity_date < to_date('2017-6-26','YYYY-MM-DD')
)
select count(tNew.idtariffa) CONT
from cte tNew
join cte tAtt on tAtt.idtariffa = tNew.idtariffa
where (tAtt.stato_attivo = 't')
and (tNew.stato_attivo = 'f')
and (tAtt.validity_date < tNew.validity_date)
and (tAtt.dataimport < tNew.dataimport)
Hello kind people of the internet.
I am racking my brain trying to figure out why the optimiser isn't using my index for my query on Amazon Aurora. The query is dynamically created from a report users have built through an application's UI, so I can't change the query per se.
The query uses these qualifiers
WHERE
table_in_question.deleted = 0
ORDER BY
table_in_question.date_modified DESC,
table_in_question.id DESC
I have an index, "my_index", which indexes these specific fields of table_in_question (deleted, date_modified, id), but MySQL doesn't use it.
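It was created along these lines (a sketch reconstructed from that field list; the actual DDL may differ):
CREATE INDEX my_index ON table_in_question (deleted, date_modified, id);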
The query takes approx 1200 ms to run. If I add FORCE INDEX (my_index) it takes about 120 ms, roughly 10x faster - but unless I use FORCE INDEX, it doesn't use the index.
Around 1 million rows are returned according to EXPLAIN, so I don't think the index is being skipped because of a low row count.
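In other words, the fast variant just adds the hint after the table name, along these lines (a trimmed-down sketch, not the full query below):
SELECT t.id, t.date_modified
FROM table_in_question t FORCE INDEX (my_index)
WHERE t.deleted = 0
ORDER BY t.date_modified DESC, t.id DESC;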
The full query is
SELECT
case when some_table.id IS NOT NULL then some_table.id else "" end my_favorite,
table_in_question.date_entered,
table_in_question.name,
table_in_question.description,
table_in_question.pr_is_read,
table_in_question.pr_is_approved,
table_in_question.parent_type,
table_in_question.parent_id,
table_in_question.id,
table_in_question.date_modified,
table_in_question.assigned_user_id,
table_in_question.created_by
FROM
table_in_question
INNER JOIN (
SELECT
tst.team_user_is_member_of
FROM
team_sets_teams tst
INNER JOIN team_memberships team_membershipstable_in_question ON (
team_membershipstable_in_question.team_id = tst.team_id
)
AND (team_membershipstable_in_question.user_id = 'UUID')
AND (team_membershipstable_in_question.deleted = 0)
GROUP BY
tst.team_user_is_member_of
) table_in_question_tf ON table_in_question_tf.team_user_is_member_of = table_in_question.team_user_is_member_of
LEFT JOIN systemfavourites sf_table_in_question ON (sf_table_in_question.module = 'table_in_question')
AND (sf_table_in_question.record_id = table_in_question.id)
AND (sf_table_in_question.assigned_user_id = 'UUID')
AND (sf_table_in_question.deleted = '0')
INNER JOIN opportunities jt1_table_in_question ON (table_in_question.opportunity_id = jt1_table_in_question.id)
AND (jt1_table_in_question.deleted = 0)
LEFT JOIN another_table jt1_table_in_question_cstm ON jt1_table_in_question_cstm.id_c = jt1_table_in_question.id
LEFT JOIN systemfavourites table_in_question_favorite ON (table_in_question.id = table_in_question_favorite.record_id)
AND (table_in_question_favorite.deleted = '0')
AND (table_in_question_favorite.module = 'table_in_question')
AND (table_in_question_favorite.created_by = 'UUID')
LEFT JOIN users some_table ON (
some_table.id = table_in_question_favorite.modified_user_id
)
AND (some_table.deleted = 0)
WHERE
table_in_question.deleted = 0
ORDER BY
table_in_question.date_modified DESC,
table_in_question.id DESC
;
EXPLAIN shows this
id | select_type | table             | partitions | type | possible_keys                 | key | key_len | ref | rows   | filtered | Extra
1  | PRIMARY     | table_in_question |            | ALL  | idx_table_in_question_tmst_id |     |         |     | 968234 | 10.0     | Using where; Using temporary; Using filesort
Can anyone explain how I can make an index that it will actually use by default?
Thanks.
I have a statement that is used quite often but performs pretty poorly.
So I want to optimize it as well as possible. The statement consists of various parts and unions.
One part that is pretty costly is the following.
select *
from rc050 F
LEFT OUTER JOIN
(SELECT fi_nr,
fklz,
rvc_status,
lfdnr,
txt
FROM rc0531 temp
WHERE lfdnr =
(SELECT MAX(lfdnr) FROM rc0531 WHERE fi_nr = temp.fi_nr AND fklz = temp.fklz
)
) SUB_TABLE1
ON F.fklz = SUB_TABLE1.fklz
AND F.fi_nr = SUB_TABLE1.fi_nr
LEFT OUTER JOIN
(SELECT fi_nr,
fklz,
rvc_status,
lfdnr,
txt
FROM rc0532 temp
WHERE lfdnr =
(SELECT MAX(lfdnr) FROM rc0532 WHERE fi_nr = temp.fi_nr AND fklz = temp.fklz
)
) SUB_TABLE2
ON F.fklz = SUB_TABLE2.fklz
AND F.fi_nr = SUB_TABLE2.fi_nr
LEFT OUTER JOIN
(SELECT fi_nr,
fklz,
rvc_status,
lfdnr,
txt
FROM rc05311 temp
WHERE lfdnr =
(SELECT MAX(lfdnr)
FROM rc05311
WHERE fi_nr = temp.fi_nr
AND fklz = temp.fklz
)
) SUB_TABLE11
ON F.fklz = SUB_TABLE11.fklz
AND F.fi_nr = SUB_TABLE11.fi_nr
where F.fklz != ' '
This is a part where I have to load, from these rc0531 ... rc05311 tables,
what their latest entry is. There are 11 of these tables, so this is broken down.
As you can see, I'm currently joining each table via a subquery, and since I need only the latest entry I need an additional subquery to get the max(lfdnr).
This all works fine so far, but I want to know if there is a more efficient way to perform this.
In my main select I need to be able to address each column from each of these tables.
Do you guys have any suggestions?
This currently runs in 1.3 sec on 13k rows; to get a decent boost it should get down to 0.1 sec or so. Again, this is only one problem in a bigger statement full of inefficient constructs.
Strictly speaking, you can't optimize the text of a SQL query. Oracle takes your query and derives an execution plan that it uses to produce the information you've asked for. You might completely rewrite the query and, if it leads to the same execution plan, you'll get exactly the same performance. To tune a query, you need to look at the execution plan.
You may well benefit from an index on each of these tables, on the (fi_nr, fklz) columns or possibly even (fi_nr, fklz, lfdnr), depending on how selective that gets. There's no way to tell in advance from this information; you'll just have to try.
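For example (a sketch; the index names are invented, and you would repeat this for each of the 11 tables):
CREATE INDEX rc0531_fi_fklz_lfdnr ON rc0531 (fi_nr, fklz, lfdnr);
CREATE INDEX rc0532_fi_fklz_lfdnr ON rc0532 (fi_nr, fklz, lfdnr);
-- ...and so on, up to rc05311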
You should also remove the select * and select only the columns you want. If it's possible to get just the needed information from an index, Oracle will not need to retrieve the actual table row.
Replace the left joins from:
LEFT OUTER JOIN (
SELECT fi_nr,fklz,rvc_status,lfdnr,txt FROM rc0531 temp
WHERE lfdnr = (SELECT MAX(lfdnr) FROM rc0531 WHERE fi_nr = temp.fi_nr AND fklz = temp.fklz)
) SUB_TABLE1
ON F.fklz = SUB_TABLE1.fklz
AND F.fi_nr = SUB_TABLE1.fi_nr
to this:
LEFT OUTER JOIN (
    SELECT t.fi_nr, t.fklz, t.rvc_status, t.lfdnr, t.txt
    FROM rc0531 t
    INNER JOIN (SELECT fi_nr, fklz, MAX(lfdnr) lfdnr
                FROM rc0531
                GROUP BY fi_nr, fklz) x
       ON x.fi_nr = t.fi_nr AND x.fklz = t.fklz AND x.lfdnr = t.lfdnr
) SUB_TABLE1
ON F.fklz = SUB_TABLE1.fklz
AND F.fi_nr = SUB_TABLE1.fi_nr
Let me know if it gets down to 0.1 sec; I think it will.
I'm trying to execute a left join where multiple conditions must be met with the inclusion of pulling in the MAX sequence number that meets those conditions.
The left join is on the unique identifier in both tables. Table acaps_history has several rows for each app_id. I need to pull in only one row, with the highest seq_number and an activity_code of 'XU'. If the code 'XU' doesn't exist for the given app_id, then the case statement below should return 'N' for that row. The code I have currently just isn't working, returning the error "a column may not be outer-joined to a subquery":
create table orig_play3 as
(select
x.*,
case when xa.activity_code in 'XU' then 'Y' else 'N' end as cpo_flag
from
dfs_tab_orig_play_x x
left join cf.acaps_history xa on
x.APP_ID = xa.FOC_APPL_ID
and xa.activity_code in 'XU'
and xa.seq_number = (select max(seq_number) from cf.acaps_history where FOC_APPL_ID=x.app_id)
)
Given your error, it seems that the issue is the last part of your query:
and xa.seq_number = (select max(seq_number) from cf.acaps_history where FOC_APPL_ID=x.app_id)
This is still operating in the context of the ON clause, so the sub-query to find the max sequence number is the issue.
You should be able to avoid this by moving that sub-query out of the ON clause (keeping an IS NULL branch in the new WHERE clause so unmatched rows from the left join still come through):
LEFT JOIN (
SELECT FOC_APPL_ID, activity_code, seq_number
FROM cf.acaps_history
WHERE activity_code in 'XU'
) xa
ON x.APP_ID = xa.FOC_APPL_ID
WHERE xa.seq_number IS NULL
   OR xa.seq_number = (select max(ah.seq_number) from cf.acaps_history ah where ah.FOC_APPL_ID = x.app_id and ah.activity_code in 'XU')
This may be the most inefficient way to execute this query, but it worked... It took like 3 minutes to run (table size is over 600K rows), but again, it returned the results I needed:
create table test as (
select x.*,
case when xb.activity_code in 'XU' then 'Y' else 'N' end as cpo_flag
from dfs_tab_orig_play_x x
left join
(select
xa.FOC_APPL_ID, xa.activity_code, xa.seq_number
from dfs_tab_orig_play_x x, cf.acaps_history xa
where x.app_id = xa.FOC_APPL_ID (+)
and xa.seq_number = (select max(seq_number) from cf.acaps_history where
x.app_id=FOC_APPL_ID(+) and activity_code in 'XU')) xb
on x.app_id = xb.FOC_APPL_ID (+)
)
If you are on 12c, I like OUTER APPLY for this sort of thing, because it lets you sort the rows for each app_id descending by seq_number and then just pick the highest one.
SELECT
x.*,
CASE
WHEN xa.activity_code IN 'XU' THEN 'Y'
ELSE 'N'
END
AS cpo_flag
FROM
dfs_tab_orig_play_x x
OUTER APPLY ( SELECT *
FROM cf.acaps_history xa
WHERE xa.foc_appl_id = x.app_id
AND xa.activity_code = 'XU'
ORDER BY xa.seq_number DESC
FETCH FIRST 1 ROW ONLY ) xa
Note: this logic is a little different from what you posted. In this version, it will join to the acaps_history row having the highest seq_number from among 'XU' records for the given app_id. Your version was joining to the row having the highest seq_number for the given app_id, whether that row was an 'XU' row or not. I am assuming (with little reason) that that was a bug on your part. But, if it wasn't, my version won't work as given.
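If it wasn't a bug and you really do want the highest seq_number overall, a sketch of the same approach with the 'XU' filter removed from the APPLY should reproduce your original logic; the CASE expression then reports whether that latest row happens to be an 'XU' one:
SELECT x.*,
       CASE WHEN xa.activity_code IN 'XU' THEN 'Y' ELSE 'N' END AS cpo_flag
FROM dfs_tab_orig_play_x x
OUTER APPLY ( SELECT *
              FROM cf.acaps_history xa
              WHERE xa.foc_appl_id = x.app_id
              ORDER BY xa.seq_number DESC
              FETCH FIRST 1 ROW ONLY ) xa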
This is on 11g. I have an INSERT ALL statement which uses a SELECT to build up the values to insert. The select has some subqueries that check that the record doesn't already exist. The problem is that the insert is taking over 30 minutes, even when there are zero rows to insert. The select statement on its own runs instantly, so the problem seems to be when it is used in conjunction with the INSERT ALL. I rewrote the statement to use MERGE but it was just as bad.
The table has no triggers. There is a primary key index, and a unique constraint on two of the columns, but nothing else that looks like it might be causing an issue. It currently has about 15000 rows, so definitely not big.
Does anyone have a suggestion for what might be causing this, or how to go about debugging it?
Here's the INSERT ALL statement.
insert all
into template_letter_merge_fields (merge_field_id, letter_type_id,table_name,field_name,pretty_name, tcl_proc)
values (template_let_mrg_fld_sequence.nextval,letter_type_id,table_name,field_name, pretty_name, tcl_proc)
select lt.letter_type_id,
i.object_type as table_name,
i.interface_key as field_name,
i.pretty_name as pretty_name,
case
when w.widget = 'dynamic_select' then
'dbi::'||i.interface_key||'::get_name'
when w.widget = 'category_tree' and
i.interface_key not like '%_name' and
i.interface_key not like '%_desc' then
'dbi::'||i.interface_key||'::get_name'
else
'dbi::'||i.interface_key||'::get_value'
end as tcl_proc
from template_letter_types lt,
dbi_interfaces i
left outer join acs_attributes aa on (aa.object_type||'_'||aa.attribute_name = i.interface_key
and decode(aa.object_type,'person','party','aims_organisation','party',aa.object_type) = i.object_type)
left outer join flexbase_attributes fa on fa.acs_attribute_id = aa.attribute_id
left outer join flexbase_widgets w on w.widget_name = fa.widget_name
where i.object_type IN (select linked_object_type
from template_letter_object_map lom
where lom.interface_object_type = lt.interface_object_type
union select lt.interface_object_type from dual
union select 'template_letter' from dual)
and lt.interface_object_type = lt.interface_object_type
and not exists (select 1
from template_letter_merge_fields m
where m.sql_code is null
and m.field_name = i.interface_key
and m.letter_type_id = lt.letter_type_id)
and not exists (select 1
from template_letter_merge_fields m2
where m2.pretty_name = i.pretty_name
and m2.letter_type_id = lt.letter_type_id)
I'm faced with a weird problem now. The query itself is huge, so I'm not going to post it here (I could post it, however, in case someone needs to see it). Now I have a table, TABLE1, with a CHAR(1) column, COL1. This table column is queried as part of my query. When I filter the recordset for this column I say:
WHERE TAB1.COL1=1
This way the query runs and returns a very big resultset. I've recently updated one of the subqueries to speed up the query. But after this, when I write WHERE TAB1.COL1=1 it does not return anything, while if I change it to WHERE TAB1.COL1='1' it gives me the records I need. Notice the WHERE clause with quotes and without them. So to make it more clear: before updating one of the sub-queries I did not have to put quotes to check against the COL1 value, but after updating I have to. What feature of Oracle is it that I'm not aware of?
EDIT: I'm posting the two versions of the query in case someone might find it useful.
Version 1:
SELECT p.ssn,
pss.pin,
pd.doc_number,
p.surname,
p.name,
p.patronymic,
to_number(p.sex, '9') as sex,
citiz_c.short_name citizenship,
p.birth_place,
p.birth_day as birth_date,
coun_c.short_name as country,
di.name as leg_city,
trim( pa.settlement
|| ' '
|| pa.street) AS leg_street,
pd.issue_date,
pd.issuing_body,
irs.irn,
irs.tpn,
irs.reg_office,
to_number(irs.insurer_type, '9') as insurer_type,
TO_CHAR(sa.REG_CODE)
||CONVERT_INT_TO_DOUBLE_LETTER(TO_NUMBER(SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 2, 3)))
||SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 5, 4) CONVERTED_SSN_DOSSIER_NR,
fa.snr
FROM
(SELECT pss_t.pin,
pss_t.ssn
FROM EHDIS_INSURANCE.pin_ssn_status pss_t
WHERE pss_t.difference_status < 5
) pss
INNER JOIN SSPF_CENTRE.file_archive fa
ON fa.ssn = pss.ssn
INNER JOIN SSPF_CENTRE.persons p
ON p.ssn = fa.ssn
INNER JOIN
(SELECT pd_2.ssn,
pd_2.type,
pd_2.series,
pd_2.doc_number,
pd_2.issue_date,
pd_2.issuing_body
FROM
--The changed subquery starts here
(SELECT ssn,
MIN(type) AS type
FROM SSPF_CENTRE.person_documents
GROUP BY ssn
) pd_1
INNER JOIN SSPF_CENTRE.person_documents pd_2
ON pd_2.type = pd_1.type
AND pd_2.ssn = pd_1.ssn
) pd
--The changed subquery ends here
ON pd.ssn = p.ssn
INNER JOIN SSPF_CENTRE.ssn_archive sa
ON p.ssn = sa.ssn
INNER JOIN SSPF_CENTRE.person_addresses pa
ON p.ssn = pa.ssn
INNER JOIN
(SELECT i_t.irn,
irs_t.ssn,
i_t.tpn,
i_t.reg_office,
(
CASE i_t.insurer_type
WHEN '4'
THEN '1'
ELSE i_t.insurer_type
END) AS insurer_type
FROM sspf_centre.irn_registered_ssn irs_t
INNER JOIN SSPF_CENTRE.insurers i_t
ON i_t.irn = irs_t.new_irn
OR i_t.old_irn = irs_t.old_irn
WHERE irs_t.is_registration IS NOT NULL
AND i_t.is_real IS NOT NULL
) irs ON irs.ssn = p.ssn
LEFT OUTER JOIN SSPF_CENTRE.districts di
ON di.code = pa.city
LEFT OUTER JOIN SSPF_CENTRE.countries citiz_c
ON p.citizenship = citiz_c.numeric_code
LEFT OUTER JOIN SSPF_CENTRE.countries coun_c
ON pa.country_code = coun_c.numeric_code
WHERE pa.address_flag = '1'--Here's the column value with quotes
AND fa.form_type = 'Q3';
And Version 2:
SELECT p.ssn,
pss.pin,
pd.doc_number,
p.surname,
p.name,
p.patronymic,
to_number(p.sex, '9') as sex,
citiz_c.short_name citizenship,
p.birth_place,
p.birth_day as birth_date,
coun_c.short_name as country,
di.name as leg_city,
trim( pa.settlement
|| ' '
|| pa.street) AS leg_street,
pd.issue_date,
pd.issuing_body,
irs.irn,
irs.tpn,
irs.reg_office,
to_number(irs.insurer_type, '9') as insurer_type,
TO_CHAR(sa.REG_CODE)
||CONVERT_INT_TO_DOUBLE_LETTER(TO_NUMBER(SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 2, 3)))
||SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 5, 4) CONVERTED_SSN_DOSSIER_NR,
fa.snr
FROM
(SELECT pss_t.pin,
pss_t.ssn
FROM EHDIS_INSURANCE.pin_ssn_status pss_t
WHERE pss_t.difference_status < 5
) pss
INNER JOIN SSPF_CENTRE.file_archive fa
ON fa.ssn = pss.ssn
INNER JOIN SSPF_CENTRE.persons p
ON p.ssn = fa.ssn
INNER JOIN
--The changed subquery starts here
(SELECT ssn,
type,
series,
doc_number,
issue_date,
issuing_body
FROM
(SELECT ssn,
type,
series,
doc_number,
issue_date,
issuing_body,
ROW_NUMBER() OVER (partition BY ssn order by type) rn
FROM SSPF_CENTRE.person_documents
)
WHERE rn = 1
) pd --
--The changed subquery ends here
ON pd.ssn = p.ssn
INNER JOIN SSPF_CENTRE.ssn_archive sa
ON p.ssn = sa.ssn
INNER JOIN SSPF_CENTRE.person_addresses pa
ON p.ssn = pa.ssn
INNER JOIN
(SELECT i_t.irn,
irs_t.ssn,
i_t.tpn,
i_t.reg_office,
(
CASE i_t.insurer_type
WHEN '4'
THEN '1'
ELSE i_t.insurer_type
END) AS insurer_type
FROM sspf_centre.irn_registered_ssn irs_t
INNER JOIN SSPF_CENTRE.insurers i_t
ON i_t.irn = irs_t.new_irn
OR i_t.old_irn = irs_t.old_irn
WHERE irs_t.is_registration IS NOT NULL
AND i_t.is_real IS NOT NULL
) irs ON irs.ssn = p.ssn
LEFT OUTER JOIN SSPF_CENTRE.districts di
ON di.code = pa.city
LEFT OUTER JOIN SSPF_CENTRE.countries citiz_c
ON p.citizenship = citiz_c.numeric_code
LEFT OUTER JOIN SSPF_CENTRE.countries coun_c
ON pa.country_code = coun_c.numeric_code
WHERE pa.address_flag = 1--Here's the column value without quotes
AND fa.form_type = 'Q3';
I've put separating comments around the changed subqueries and the WHERE clause in both queries. Both versions of the subqueries return the same result; one of them is just slower, which is why I decided to update it.
With the most simplistic example I can't reproduce your problem on 11.2.0.3.0 or 11.2.0.1.0.
SQL> create table tmp_test ( a char(1) );
Table created.
SQL> insert into tmp_test values ('1');
1 row created.
SQL> select *
2 from tmp_test
3 where a = 1;
A
-
1
If I then insert a non-numeric value into the table I can confirm Chris' comment "that Oracle will rewrite tab1.col1 = 1 to to_number(tab1.col1) = 1", which implies that you only have numeric characters in the column.
SQL> insert into tmp_test values ('a');
1 row created.
SQL> select *
2 from tmp_test
3 where a = 1;
ERROR:
ORA-01722: invalid number
no rows selected
If you're interested in tracking this down, you should gradually reduce the complexity of the query until you have found a minimal, reproducible example. Oracle can pre-compute a conversion to be used in a join; as your query is complex, that seems like a possible explanation of what's happening.
Oracle explicitly recommends against relying on implicit conversion, so it's wiser not to use it at all, as you're finding out. For a start, there's no guarantee that your indexes will be used correctly.
Oracle recommends that you specify explicit conversions, rather than rely on implicit or automatic conversions, for these reasons:
SQL statements are easier to understand when you use explicit data type conversion functions.
Implicit data type conversion can have a negative impact on performance, especially if the data type of a column value is converted to that of a constant rather than the other way around.
Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, implicit conversion from a datetime value to a VARCHAR2 value may return an unexpected year depending on the value of the NLS_DATE_FORMAT parameter.
Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.
If you do only have numeric characters in the column, I would highly recommend changing it to a NUMBER(1) column, and I would always recommend explicit conversion to avoid a lot of pain in the longer run.
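Until you can change the column, make the comparison explicit so the behaviour stays visible, for example:
-- compare string to string, no conversion at all:
WHERE tab1.col1 = '1'
-- or state the conversion yourself instead of leaving it to Oracle:
WHERE TO_NUMBER(tab1.col1) = 1
Note that the TO_NUMBER version still raises ORA-01722 if a non-numeric character ever sneaks into the column, which is one more argument for NUMBER(1).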
It's hard to tell without the actual query. What I would expect is that TAB1.COL1 is in some way different before and after the refactoring.
Candidate differences are NUMBER vs. CHAR(1) vs. CHAR(n>1) vs. VARCHAR2.
It is easy to introduce differences like this with subqueries where you join two tables which have different types in the join column and you return different columns in your subquery.
To hunt that issue down, you might want to check the exact datatypes used in your query. I'm not sure of the best way to do that offhand, but one idea would be to put the query in a view and use SQL*Plus DESC on it.
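A minimal sketch of that idea (all names are invented):
create table t_demo ( col1 char(1) );
create or replace view v_demo as
  select col1, to_number(col1) as col1_num from t_demo;
-- then, in SQL*Plus:
--   desc v_demo
DESC lists each output column with the datatype Oracle derived for it (here COL1 CHAR(1) and COL1_NUM NUMBER), so a surprising type in the real query becomes visible.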