Alternative Query to Implement Minus Query logic

Alternative Query to Implement Minus Query logic - oracle

we are using the below-mentioned minus query logic to find out the non-existing record between the 2 tables, is there an alternative logic that can be used via SQL to achieve the same this is causing performance issues and running for a very long time.
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM EDWHRSTG.PS_JOB_FULL_S
WHERE EMPLID = '09762931'
MINUS
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM SUODS.PS_JOB_S
WHERE EMPLID = '09762931'

You can try using an OUTER JOIN:
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM EDWHRSTG.PS_JOB_FULL_S a
LEFT OUTER JOIN (SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM SUODS.PS_JOB_S
WHERE EMPLID = '09762931') b
ON b.EMPLID = a.EMPL_ID AND
b.EMPL_RCD = a.EMPL_RCD AND
b.EFFDT = a.EFFDT AND
b.HR_STATUS = a.HR_STATUS AND
b.EMPL_STATUS = a.EMPL_STATUS
WHERE b.EMPLID IS NULL AND
b.EMPL_RCD IS NULL AND
b.EFFDT IS NULL AND
b.HR_STATUS IS NULL AND
b.EMPL_STATUS IS NULL
However, I doubt this will perform any better. Your best option is to add an index on the five fields in play here (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS) to both tables, or in other words
CREATE INDEX EDWHRSTG.PS_JOB_FULL_S_1
ON EDWHRSTG.PS_JOB_FULL_S (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS);
and
CREATE INDEX SUODS.PS_JOB_S_1
ON SUODS.PS_JOB_S (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS);

You need to identify where time is being spent, in order to determine root cause if the performance problem; otherwise it’s just a guess.
An Active SQL Monitor report is your diagnostic tool of choice

Related

How to determine if an Index is required for my Oracle query

i would like to know if an index is required or would help to run the below query? i dont have any idea how can i analyze this question.
if some one can help please thanks
WITH C(A0_ID, A1_ID, A1_Col0)
AS (
SELECT
Table_1.ID AS A0_ID,
Table_2.ID AS A1_ID,
Table_2.Col0 AS A1_Col0
FROM Table_1 ,Table_2
WHERE Table_2.ID = Table_1.ID
AND Table_1.col1 = ?
AND BITAND(Table_1.col2, ?) <> ?
AND Table_2.col3 IN (?,?,?)
), T(A0_ID, A1_ID, A1_Col0) AS (
SELECT
A0_ID,
A1_ID,
A1_Col0
FROM C
WHERE A1_ID = ?
UNION ALL
SELECT
C.A0_ID,
C.A1_ID,
C.A1_Col0
from C
INNER JOIN T P ON C.A1_Col0 = P.A1_ID
) SELECT A0_ID, A1_ID, A1_Col0 FROM T

The main query selects from T with no post-processing (filtering, aggregation, sorting, etc.), so it doesn't require optimization.
T is a recursive CTE based on the subquery C. Therefore, T doesn't need optimization (unless you materialized it, but that's a different story).
Now, C can be optimized:
I would consider Table_1 as the driving table since it has an equality in the filtering criteria. It also, uses ID to join against Table_2. Therefore a good index for it is:
create index ix1 on Table_1 (col1, ID);
Then, to access Table_2 you'll need to get through ID that should be the main index column. You may add col3 to the index to somewhat improve the performance of the query; only a benchmark will tell if this is a wise idea. The index could look like:
create index ix2 on Table_2 (ID, col3); -- col3 is optional here
I would recommend you create these indexes and compare the performance that each option produces.

Does Oracle implicit conversion depend on joined tables or views

I've faced with a weird problem now. The query itself is huge so I'm not going to post it here (I could post however in case someone needs to see). Now I have a table ,TABLE1, with a CHAR(1) column, COL1. This table column is queried as part of my query. When I filter the recordset for this column I say:
WHERE TAB1.COL1=1
This way the query runs and returns a very big resultset. I've recently updated one of the subqueries to speed up the query. But after this when I write WHERE TAB1.COL1=1 it does not return anything, but if I change it to WHERE TAB1.COL1='1' it gives me the records I need. Notice the WHERE clause with quotes and w/o them. So to make it more clear, before updating one of the sub-queries I did not have to put quotes to check against COL1 value, but after updating I have to. What feature of Oracle is it that I'm not aware of?
EDIT: I'm posting the tw versions of the query in case someone might find it useful
Version 1:
SELECT p.ssn,
pss.pin,
pd.doc_number,
p.surname,
p.name,
p.patronymic,
to_number(p.sex, '9') as sex,
citiz_c.short_name citizenship,
p.birth_place,
p.birth_day as birth_date,
coun_c.short_name as country,
di.name as leg_city,
trim( pa.settlement
|| ' '
|| pa.street) AS leg_street,
pd.issue_date,
pd.issuing_body,
irs.irn,
irs.tpn,
irs.reg_office,
to_number(irs.insurer_type, '9') as insurer_type,
TO_CHAR(sa.REG_CODE)
||CONVERT_INT_TO_DOUBLE_LETTER(TO_NUMBER(SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 2, 3)))
||SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 5, 4) CONVERTED_SSN_DOSSIER_NR,
fa.snr
FROM
(SELECT pss_t.pin,
pss_t.ssn
FROM EHDIS_INSURANCE.pin_ssn_status pss_t
WHERE pss_t.difference_status < 5
) pss
INNER JOIN SSPF_CENTRE.file_archive fa
ON fa.ssn = pss.ssn
INNER JOIN SSPF_CENTRE.persons p
ON p.ssn = fa.ssn
INNER JOIN
(SELECT pd_2.ssn,
pd_2.type,
pd_2.series,
pd_2.doc_number,
pd_2.issue_date,
pd_2.issuing_body
FROM
--The changed subquery starts here
(SELECT ssn,
MIN(type) AS type
FROM SSPF_CENTRE.person_documents
GROUP BY ssn
) pd_1
INNER JOIN SSPF_CENTRE.person_documents pd_2
ON pd_2.type = pd_1.type
AND pd_2.ssn = pd_1.ssn
) pd
--The changed subquery ends here
ON pd.ssn = p.ssn
INNER JOIN SSPF_CENTRE.ssn_archive sa
ON p.ssn = sa.ssn
INNER JOIN SSPF_CENTRE.person_addresses pa
ON p.ssn = pa.ssn
INNER JOIN
(SELECT i_t.irn,
irs_t.ssn,
i_t.tpn,
i_t.reg_office,
(
CASE i_t.insurer_type
WHEN '4'
THEN '1'
ELSE i_t.insurer_type
END) AS insurer_type
FROM sspf_centre.irn_registered_ssn irs_t
INNER JOIN SSPF_CENTRE.insurers i_t
ON i_t.irn = irs_t.new_irn
OR i_t.old_irn = irs_t.old_irn
WHERE irs_t.is_registration IS NOT NULL
AND i_t.is_real IS NOT NULL
) irs ON irs.ssn = p.ssn
LEFT OUTER JOIN SSPF_CENTRE.districts di
ON di.code = pa.city
LEFT OUTER JOIN SSPF_CENTRE.countries citiz_c
ON p.citizenship = citiz_c.numeric_code
LEFT OUTER JOIN SSPF_CENTRE.countries coun_c
ON pa.country_code = coun_c.numeric_code
WHERE pa.address_flag = '1'--Here's the column value with quotes
AND fa.form_type = 'Q3';
And Version 2:
SELECT p.ssn,
pss.pin,
pd.doc_number,
p.surname,
p.name,
p.patronymic,
to_number(p.sex, '9') as sex,
citiz_c.short_name citizenship,
p.birth_place,
p.birth_day as birth_date,
coun_c.short_name as country,
di.name as leg_city,
trim( pa.settlement
|| ' '
|| pa.street) AS leg_street,
pd.issue_date,
pd.issuing_body,
irs.irn,
irs.tpn,
irs.reg_office,
to_number(irs.insurer_type, '9') as insurer_type,
TO_CHAR(sa.REG_CODE)
||CONVERT_INT_TO_DOUBLE_LETTER(TO_NUMBER(SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 2, 3)))
||SUBSTR(TO_CHAR(sa.DOSSIER_NR, '0999999'), 5, 4) CONVERTED_SSN_DOSSIER_NR,
fa.snr
FROM
(SELECT pss_t.pin,
pss_t.ssn
FROM EHDIS_INSURANCE.pin_ssn_status pss_t
WHERE pss_t.difference_status < 5
) pss
INNER JOIN SSPF_CENTRE.file_archive fa
ON fa.ssn = pss.ssn
INNER JOIN SSPF_CENTRE.persons p
ON p.ssn = fa.ssn
INNER JOIN
--The changed subquery starts here
(SELECT ssn,
type,
series,
doc_number,
issue_date,
issuing_body
FROM
(SELECT ssn,
type,
series,
doc_number,
issue_date,
issuing_body,
ROW_NUMBER() OVER (partition BY ssn order by type) rn
FROM SSPF_CENTRE.person_documents
)
WHERE rn = 1
) pd --
--The changed subquery ends here
ON pd.ssn = p.ssn
INNER JOIN SSPF_CENTRE.ssn_archive sa
ON p.ssn = sa.ssn
INNER JOIN SSPF_CENTRE.person_addresses pa
ON p.ssn = pa.ssn
INNER JOIN
(SELECT i_t.irn,
irs_t.ssn,
i_t.tpn,
i_t.reg_office,
(
CASE i_t.insurer_type
WHEN '4'
THEN '1'
ELSE i_t.insurer_type
END) AS insurer_type
FROM sspf_centre.irn_registered_ssn irs_t
INNER JOIN SSPF_CENTRE.insurers i_t
ON i_t.irn = irs_t.new_irn
OR i_t.old_irn = irs_t.old_irn
WHERE irs_t.is_registration IS NOT NULL
AND i_t.is_real IS NOT NULL
) irs ON irs.ssn = p.ssn
LEFT OUTER JOIN SSPF_CENTRE.districts di
ON di.code = pa.city
LEFT OUTER JOIN SSPF_CENTRE.countries citiz_c
ON p.citizenship = citiz_c.numeric_code
LEFT OUTER JOIN SSPF_CENTRE.countries coun_c
ON pa.country_code = coun_c.numeric_code
WHERE pa.address_flag = 1--Here's the column value without quotes
AND fa.form_type = 'Q3';
I've put separating comments for the changed subqueries and the WHERE clause in both queries. Both versions of the subqueries return the same result, one of them is just slower, which is why I decided to update it.

With the most simplistic example I can't reproduce your problem on 11.2.0.3.0 or 11.2.0.1.0.
SQL> create table tmp_test ( a char(1) );
Table created.
SQL> insert into tmp_test values ('1');
1 row created.
SQL> select *
2 from tmp_test
3 where a = 1;
A
-
1
If I then insert a non-numeric value into the table I can confirm Chris' comment "that Oracle will rewrite tab1.col1 = 1 to to_number(tab1.col1) = 1", which implies that you only have numeric characters in the column.
SQL> insert into tmp_test values ('a');
1 row created.
SQL> select *
2 from tmp_test
3 where a = 1;
ERROR:
ORA-01722: invalid number
no rows selected
If you're interested in tracking this down you should gradually reduce the complexity of the query until you have found a minimal, reproducible, example. Oracle can pre-compute a conversion to be used in a JOIN, which as your query is complex seems like a possible explanation of what's happening.
Oracle explicitly recommends against using implicit conversion so it's wiser not to use it at all; as you're finding out. For a start there's no guarantees that your indexes will be used correctly.
Oracle recommends that you specify explicit conversions, rather than rely on implicit or automatic conversions, for these reasons:
SQL statements are easier to understand when you use explicit data type conversion functions.
Implicit data type conversion can have a negative impact on performance, especially if the data type of a column value is converted to that of a constant rather than the other way around.
Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, implicit conversion from a datetime value to a VARCHAR2 value may return an unexpected year depending on the value of the NLS_DATE_FORMAT
parameter.
Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.
If you do only have numeric characters in the column I would highly recommend changing this to a NUMBER(1) column and I would always recommend explicit conversion to avoid a lot of pain in the longer run.

It's hard to tell without the actual query. What I would expect is that TAB1.COL1 is in some way different before and after the refactoring.
Candidates differences are Number vs. CHAR(1) vs. CHAR(x>1) vs VARCHAR2
It is easy to introduce differences like this with subqueries where you join two tables which have different types in the join column and you return different columns in your subquery.
To hunt that issue down you might want to check the exact datatypes of your query. Not sure how to do that right now .. but an idea would be to put it in a view and use sqlplus desc on it.

NOT IN query... odd results

I need a list of users in one database that are not listed as the new_user_id in another. There are 112,815 matching users in both databases; user_id is the key in all queries tables.
Query #1 works, and gives me 111,327 users who are NOT referenced as a new_user_Id. But it requires querying the same data twice.
-- 111,327 GSU users are NOT listed as a CSS new user
-- 1,488 GSU users ARE listed as a new user in CSS
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (select cud.new_user_id
from css.user_desc cud
where cud.new_user_id is not null);
Query #2 would be perfect... and I'm actually surprised that it's syntactically accepted. But it gives me a result that makes no sense.
-- This gives me 1,505 users... I've checked, and they are not
-- referenced as new_user_ids in CSS, but I don't know why the ones
-- that were excluded were excluded.
--
-- Where are the missing 109,822, and whatexcluded them?
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (cudsubq.new_user_id);
What exactly is the where clause in the second query doing, and why is it excluding 109,822 records from the results?
Note The above query is a simplification of what I'm really after. There are other/better ways to do the above queries... they're just representative of the part of the query that's giving me problems.

Read this: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
For what I understand, your cudsubq.new_user_id can be NULL even though both tables are joined by user_id, so, you won't get results using the NOT IN operator when the subset contains NULL values . Consider the example in the article:
select * from dual where dummy not in ( NULL )
This returns no records. Try using the NOT EXISTS operator or just another kind of join. Here is a good source: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
And what you need is the fourth example:
SELECT COUNT(descr.user_id)
FROM
user_profile prof
LEFT OUTER JOIN user_desc descr
ON prof.user_id = descr.user_id
WHERE descr.new_user_id IS NULL
OR descr.new_user_id != prof.user_id

Second query is semantically different. In this case
where gup.user_id not in (cudsubq.new_user_id)
cudsubq.new_user_id is treated as expression (doc: IN condition), not as a subquery, thus the whole clause is basically equivalent to
where gup.user_id != cudsubq.new_user_id
So, in your first query, you're literally asking "show me all users in GUP, who also have entries in CSS and their GUP.ID is not matching ANY NOT NULL NEW_ID in CSS ".
However, the second query is "show me all users in GUP, who also have entries in CSS and their GUP.ID is not equal to their RESPECTIVE NULLABLE (no is not null clause, remember?) CSS.NEW_ID value".
And any (not) in (or equality/inequality) checks with nulls don't actually work.
12:07:54 SYSTEM#oars_sandbox> select * from dual where 1 not in (null, 2, 3, 4);
no rows selected
Elapsed: 00:00:00.00
This is where you lose your rows. I would probably rewrite your second query's where clause as
where cudsubq.new_user_id is null, assuming that non-matching users have null new_user_id.

Your second select compares gup.user_id with cud.new_user_id on current joining record. You can rewrite the query to get the same result
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id != cud.new_user_id or cud.new_user_id is null;
You mentioned you compare list of user in one database with a list of users in another. So you need to query data twice and you don't query the same data. Maybe you can use "minus" operator to avoid using "in"
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id from css.user_desc cud
minus
select cud.new_user_id from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id;

You want new_user_id's from table gup that don't match any new_user_id on table cud, right? It sounds like a job for a left join:
SELECT count(gup.user_id)
FROM gsu.user_profile gup LEFT JOIN css.user_desc cud
ON gup.user_id = cud.new_user_id
WHERE cud.new_user_id is NULL
The join keeps all rows of gup, matching them with a new_user_id if possible. The WHERE condition keeps only the rows that have no matching row in cud.
(Apologies if you know this already and you're only interested in the behavior of the not in query)

Reference parent query column in subquery (Oracle)

How can I reference a column outside of a subquery using Oracle? I specifically need to use it in the WHERE statement of the subquery.
Basically I have this:
SELECT Item.ItemNo, Item.Group
FROM Item
LEFT OUTER JOIN (SELECT Attribute.Group, COUNT(1) CT
FROM Attribute
WHERE Attribute.ItemNo=12345) A ON A.Group = Item.Group
WHERE Item.ItemNo=12345
I'd like to change WHERE Attribute.ItemNo=12345 to WHERE Attribute.ItemNo=Item.ItemNo in the subquery, but I can't figure out if this is possible. I keep getting "ORA-00904: 'Item'.'ItemNo': Invalid Identifier"
EDIT:
Ok, this is why I need this kind of structure:
I want to be able to get a count of the "Error" records (where the item is missing a value) and the "OK" records (where the item has a value).
The way I have set it up in the fiddle returns the correct data. I think I might just end up filling in the value in each of the subqueries, since this would probably be the easiest way. Sorry if my data structures are a little convoluted. I can explain if need be.
My tables are:
create table itemcountry(
itemno number,
country nchar(3),
imgroup varchar2(10),
imtariff varchar2(20),
exgroup varchar2(10),
extariff varchar2(20) );
create table itemattribute(
attributeid varchar2(10),
tariffgroup varchar2(10),
tariffno varchar2(10) );
create table icav(
itemno number,
attributeid varchar2(10),
value varchar2(10) );
and my query so far is:
select itemno, country, imgroup, imtariff, im.error "imerror", im.ok "imok", exgroup, extariff, ex.error "exerror", ex.ok "exok"
from itemcountry
left outer join (select sum(case when icav.itemno is null then 1 else 0 end) error, sum(case when icav.itemno is not null then 1 else 0 end) ok, tariffgroup, tariffno
from itemattribute ia
left outer join icav on ia.attributeid=icav.attributeid
where (icav.itemno=12345 or icav.itemno is null)
group by tariffgroup, tariffno) im on im.tariffgroup=imgroup and imtariff=im.tariffno
left outer join (select sum(case when icav.itemno is null then 1 else 0 end) error, sum(case when icav.itemno is not null then 1 else 0 end) ok, tariffgroup, tariffno
from itemattribute ia
left outer join icav on ia.attributeid=icav.attributeid
where (icav.itemno=12345 or icav.itemno is null)
group by tariffgroup, tariffno) ex on ex.tariffgroup=exgroup and extariff=ex.tariffno
where itemno=12345;
It's also set up in a SQL Fiddle.

You can do it in a sub-query but not in a join. In your case I don't see any need to. You can put it in the join condition.
select i.itemno, i.group
from item i
left outer join ( select group, itemno
from attribute b
group by group itemno ) a
on a.group = i.group
and i.itemno = a.itemno
where i.itemno = 12345
The optimizer is built to deal with this sort of situation so utilise it!
I've changed the count(1) to a group by as you need to group by all columns that aren't aggregated.
I'm assuming that your actual query is more complicated than this as with the columns you're selecting this is probably equivilent to
select itemno, group
from item
where itemno = 12345
You could also write your sub-query with an analytic function instead. Something like count(*) over ( partition by group).
As an aside using a keyword as a column name, in this case group is A Bad Idea TM. It can cause a lot of confusion. As you can see from the code above you have a lot of groups in there.
So, based on your SQL-Fiddle, which I've added to the question I think you're looking for something like the following, which doesn't look much better. I suspect, given time, I could make it simpler. On another side note explicitly lower casing queries is never worth the hassle it causes. I've followed your naming convention though.
with sub_query as (
select count(*) - count(icav.itemno) as error
, count(icav.itemno) as ok
, min(itemno) over () as itemno
, tariffgroup
, tariffno
from itemattribute ia
left outer join icav
on ia.attributeid = icav.attributeid
group by icav.itemno
, tariffgroup
, tariffno
)
select ic.itemno, ic.country, ic.imgroup, ic.imtariff
, sum(im.error) as "imerror", sum(im.ok) as "imok"
, ic.exgroup, ic.extariff
, sum(ex.error) as "exerror", sum(ex.ok) as "exok"
from itemcountry ic
left outer join sub_query im
on ic.imgroup = im.tariffgroup
and ic.imtariff = im.tariffno
and ic.itemno = im.itemno
left outer join sub_query ex
on ic.exgroup = ex.tariffgroup
and ic.extariff = ex.tariffno
and ic.itemno = ex.itemno
where ic.itemno = 12345
group by ic.itemno, ic.country
, ic.imgroup, ic.imtariff
, ic.exgroup, ic.extariff
;

You can put WHERE attribute.itemno=item.itemno inside the subquery. You are going to filter the data anyway, filtering the data inside the subquery is usually faster too.

Efficient Alternative to Outer Join

The RIGHT JOIN on this query causes a TABLE ACCESS FULL on lims.operator. A regular join runs quickly, but of course, the samples 'WHERE authorised_by IS NULL' do not show up.
Is there a more efficient alternative to a RIGHT JOIN in this case?
SELECT full_name
FROM (SELECT operator_id AS authorised_by, full_name
FROM lims.operator)
RIGHT JOIN (SELECT sample_id, authorised_by
FROM lims.sample
WHERE sample_template_id = 200)
USING (authorised_by)
NOTE: All columns shown (except full_name) are indexed and the primary key of some table.

Since you're doing an outer join, it could easily be that it actually is more efficient to do a full table scan rather than use the index.
If you are convinced the index should be used, force it with a hint:
SELECT /*+ INDEX (lims.operator operator_index_name)*/ ...
then see what happens...

No need to nest queries. Try this:
select s.full_name
from lims.operator o, lims.sample s
where o.operator_id = s.authorised_by(+)
and s.sample_template_id = 200

I didn't write sql for oracle since a while, but i would write the query like this:
SELECT lims.operator.full_name
FROM lims.operator
RIGHT JOIN lims.sample
on lims.operator.operator_id = lims.sample.authorized_by
and sample_template_id = 200
Does this still perform that bad?

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio