NOT IN query... odd results - oracle

I need a list of users in one database that are not listed as the new_user_id in another. There are 112,815 matching users in both databases; user_id is the key in all queries tables.
Query #1 works, and gives me 111,327 users who are NOT referenced as a new_user_Id. But it requires querying the same data twice.
-- 111,327 GSU users are NOT listed as a CSS new user
-- 1,488 GSU users ARE listed as a new user in CSS
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (select cud.new_user_id
from css.user_desc cud
where cud.new_user_id is not null);
Query #2 would be perfect... and I'm actually surprised that it's syntactically accepted. But it gives me a result that makes no sense.
-- This gives me 1,505 users... I've checked, and they are not
-- referenced as new_user_ids in CSS, but I don't know why the ones
-- that were excluded were excluded.
--
-- Where are the missing 109,822, and whatexcluded them?
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (cudsubq.new_user_id);
What exactly is the where clause in the second query doing, and why is it excluding 109,822 records from the results?
Note The above query is a simplification of what I'm really after. There are other/better ways to do the above queries... they're just representative of the part of the query that's giving me problems.

Read this: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
For what I understand, your cudsubq.new_user_id can be NULL even though both tables are joined by user_id, so, you won't get results using the NOT IN operator when the subset contains NULL values . Consider the example in the article:
select * from dual where dummy not in ( NULL )
This returns no records. Try using the NOT EXISTS operator or just another kind of join. Here is a good source: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
And what you need is the fourth example:
SELECT COUNT(descr.user_id)
FROM
user_profile prof
LEFT OUTER JOIN user_desc descr
ON prof.user_id = descr.user_id
WHERE descr.new_user_id IS NULL
OR descr.new_user_id != prof.user_id

Second query is semantically different. In this case
where gup.user_id not in (cudsubq.new_user_id)
cudsubq.new_user_id is treated as expression (doc: IN condition), not as a subquery, thus the whole clause is basically equivalent to
where gup.user_id != cudsubq.new_user_id
So, in your first query, you're literally asking "show me all users in GUP, who also have entries in CSS and their GUP.ID is not matching ANY NOT NULL NEW_ID in CSS ".
However, the second query is "show me all users in GUP, who also have entries in CSS and their GUP.ID is not equal to their RESPECTIVE NULLABLE (no is not null clause, remember?) CSS.NEW_ID value".
And any (not) in (or equality/inequality) checks with nulls don't actually work.
12:07:54 SYSTEM#oars_sandbox> select * from dual where 1 not in (null, 2, 3, 4);
no rows selected
Elapsed: 00:00:00.00
This is where you lose your rows. I would probably rewrite your second query's where clause as
where cudsubq.new_user_id is null, assuming that non-matching users have null new_user_id.

Your second select compares gup.user_id with cud.new_user_id on current joining record. You can rewrite the query to get the same result
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id != cud.new_user_id or cud.new_user_id is null;
You mentioned you compare list of user in one database with a list of users in another. So you need to query data twice and you don't query the same data. Maybe you can use "minus" operator to avoid using "in"
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id from css.user_desc cud
minus
select cud.new_user_id from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id;

You want new_user_id's from table gup that don't match any new_user_id on table cud, right? It sounds like a job for a left join:
SELECT count(gup.user_id)
FROM gsu.user_profile gup LEFT JOIN css.user_desc cud
ON gup.user_id = cud.new_user_id
WHERE cud.new_user_id is NULL
The join keeps all rows of gup, matching them with a new_user_id if possible. The WHERE condition keeps only the rows that have no matching row in cud.
(Apologies if you know this already and you're only interested in the behavior of the not in query)

Related

Alternative Query to Implement Minus Query logic

we are using the below-mentioned minus query logic to find out the non-existing record between the 2 tables, is there an alternative logic that can be used via SQL to achieve the same this is causing performance issues and running for a very long time.
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM EDWHRSTG.PS_JOB_FULL_S
WHERE EMPLID = '09762931'
MINUS
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM SUODS.PS_JOB_S
WHERE EMPLID = '09762931'
You can try using an OUTER JOIN:
SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM EDWHRSTG.PS_JOB_FULL_S a
LEFT OUTER JOIN (SELECT EMPLID,EMPL_RCD,EFFDT,HR_STATUS,EMPL_STATUS
FROM SUODS.PS_JOB_S
WHERE EMPLID = '09762931') b
ON b.EMPLID = a.EMPL_ID AND
b.EMPL_RCD = a.EMPL_RCD AND
b.EFFDT = a.EFFDT AND
b.HR_STATUS = a.HR_STATUS AND
b.EMPL_STATUS = a.EMPL_STATUS
WHERE b.EMPLID IS NULL AND
b.EMPL_RCD IS NULL AND
b.EFFDT IS NULL AND
b.HR_STATUS IS NULL AND
b.EMPL_STATUS IS NULL
However, I doubt this will perform any better. Your best option is to add an index on the five fields in play here (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS) to both tables, or in other words
CREATE INDEX EDWHRSTG.PS_JOB_FULL_S_1
ON EDWHRSTG.PS_JOB_FULL_S (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS);
and
CREATE INDEX SUODS.PS_JOB_S_1
ON SUODS.PS_JOB_S (EMPL_ID, EMPL_RCD, EFFDT, HR_STATUS, EMPL_STATUS);
You need to identify where time is being spent, in order to determine root cause if the performance problem; otherwise it’s just a guess.
An Active SQL Monitor report is your diagnostic tool of choice

How to retrieve data from 3 tables using sub query oracle SQL

I want to retrieve users name and there responsibility_key where there end_date is null and i want to convert it to (sysdate+1) using nvl but i am only able to retrieve the responsibility_key not the name please help.
The error in the image says "column ambiguously defined". Take a close look. Your last END_DATE could refer to either the u alias or the table from the subquery. Change it to match the rest of your subquery (FIND_USER_GROUPS_DIRECT.END_DATE)
EDIT
Your query is
select u.USER_NAME, d.responsibility_key from FND_USER u,FND_RESPONSIBILITY_VL d
where responsibility_id in(
select responsibility_id from
FND_USER_RESP_GROUPS_DIRECT WHERE END_USER_RESP_GROUPS_DIRECT.END_DATE=nvl(END_DATE,sysdate+1)) and
u.END_DATE=nvl(END_DATE,SYSDATE + 1)
;
The query isn't formatted, which makes it hard to read.
Not all columns are qualified with table name (or aliases), as mentioned in the comments.
The query currently uses an implicit join.
The query is impossible to understand without seeing the table definitions (desc [table_name]).
For points 1 and 2, a properly formatted query will look something like
select u.user_name, d.responsibility_key
from
fnd_user u,
fnd_responsibility_vl d
where
d.responsibility_id in (
select urgd.responsibility_id
from
fnd_user_resp_groups_direct urgd
where
urgd.end_date = nvl(u.end_date, sysdate+1)
) and
u.end_date = nvl(urgd.end_date, sysdate + 1)
;
This makes it easier to read and in addition to this, you can see that without table definitions I guessed (see point 4) as to which tables the end_date column belongs in your query. If I had to guess, so does Oracle. That means you have an ambiguity problem. To fix it, take a close look at the end_date column as it appears in your original query and where you do not prefix it with anything, you need to prefix it with the appropriate alias (after you have aliased all your tables).
For point 3, you can write your query more clearly with an explicit join and by using aliases for all columns. As for the explicit join I have no idea what your tables look like but one possibility is something like
select u.user_name, d.responsibility_key
from fnd_user u
join fnd_responsibility_vl d
on u.id = d.user_id
where
d.responsibility_id in (
select responsibility_id
from fnd_user_resp_groups_direct urgd
where
urgd.end_date = nvl(u.end_date, sysdate+1)
) and
u.end_date = nvl(urgd.end_date, sysdate+1)
;
If you follow these points you will get to the root of the error.

Update statement with joins in Oracle

I need to update one column in table A with the result of a multiplication of one field from table A with one field from table B.
It would be pretty simple to do this in T-SQL, but I can't write the correct syntax in Oracle.
What I've tried:
UPDATE TABLE_A
SET TABLE_A.COLUMN_TO_UPDATE =
(select TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID)
WHERE TABLE_A.MONTH_ID IN (201601, 201602, 201603);
But I keep getting errors. Could anybody help me, please?
I generally prefer to use the below format for such cases since this will ensure there's no update performed if there's no data in the table(query extracted temp table) whereas in the above solution provided by Brian Leach will update the new value as null if there's no record present in the 2nd table but exists in the first table.
UPDATE
(
select TABLE_A.COLUMN_TO_UPDATE
, TABLE_A.PRODUCT_ID
, TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE as value
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID
AND TABLE_A.MONTH_ID IN (201601, 201602, 201603)
) DATA
SET DATA.COLUMN_TO_UPDATE = DATA.value;
This solution can cause key preserved value issues which shouldn't be an issue here since i expect a single row in both the tables for one product(ID).
More on Key Preserved table concept in inner join can be found here
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:548422757486
#Jayesh Mulwani raiesed a valid point, this will set the value to null if there is no matching record. This may or may not be the desired result. If it isn't, and no change is desirect, you can change the select statement to:
coalesce((SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id AND table_a.sales_channel_id = table_b.sales_channel_id),1)
If this is the desired outcome, Jayesh's solution will be more efficient as it will only update matching records.
UPDATE table_a
SET table_a.column_to_update = table_a.column_with_some_value
* (SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id
AND table_a.sales_channel_id = table_b.sales_channel_id)
WHERE table_a.month_id IN (201601, 201602, 201603);

Insert Statement Returns ORA-01427 Error While Trying To Insert From Multiple Tables

I have this table F_Flight which I am trying to insert into from 3 different tables. The first, fourth and fifth columns are from the same, and the second and third columns from different tables. When I execute the code, I get a "single-row subquery returns more than one row" error.
insert when 1 = 1 then into F_Flight (planeid, groupid, dateid, flightduration, kmsflown) values
(planeid, (select b.groupid from BridgeTable b where exists (select p.p1id from pilotkeylookup p where b.pilotid = p.p1id)),
(select dd.id from D_Date dd where exists (select p.launchtime from PilotKeyLookup p where dd."Date" = p.launchtime)),
flightduration, kmsflown) select * from PilotKeyLookup p;
Your subqueries get multiple rows back, which is what the error message says. There is no correlation between the various bits of data and subqueries you're trying to insert into a single row.
This can be done as a much simpler insert...select with joins, something like:
insert into f_flight (planeid, groupid, dateid, flightduration, kmsflown)
select pkl.planeid, bt.groupid, dd.id, pkl.flightduration, pkl.kmsflown
from pilotkeylookup pkl
join bridgetable bt on bt.pilotid = pkl.p1id
join d_date dd on dd."Date" = pkl.launchtime;
This joins the main PilotKeyLookup table to the other two on the keys you used in your subqueries.
Storing an ID value instead of an actual date is unusual, and if launchtime has a time component - which seems likely from the name - and your d_date entries are just dates (i.e. all with time at midnight) then you won't find matches; you might need to do:
join d_date dd on dd."Date" = trunc(pkl.launchtime);
It also seems like this could be a view, as you're storing duplicate data - everything in f_flight could, obviously, be found from the other tables.

Efficient Alternative to Outer Join

The RIGHT JOIN on this query causes a TABLE ACCESS FULL on lims.operator. A regular join runs quickly, but of course, the samples 'WHERE authorised_by IS NULL' do not show up.
Is there a more efficient alternative to a RIGHT JOIN in this case?
SELECT full_name
FROM (SELECT operator_id AS authorised_by, full_name
FROM lims.operator)
RIGHT JOIN (SELECT sample_id, authorised_by
FROM lims.sample
WHERE sample_template_id = 200)
USING (authorised_by)
NOTE: All columns shown (except full_name) are indexed and the primary key of some table.
Since you're doing an outer join, it could easily be that it actually is more efficient to do a full table scan rather than use the index.
If you are convinced the index should be used, force it with a hint:
SELECT /*+ INDEX (lims.operator operator_index_name)*/ ...
then see what happens...
No need to nest queries. Try this:
select s.full_name
from lims.operator o, lims.sample s
where o.operator_id = s.authorised_by(+)
and s.sample_template_id = 200
I didn't write sql for oracle since a while, but i would write the query like this:
SELECT lims.operator.full_name
FROM lims.operator
RIGHT JOIN lims.sample
on lims.operator.operator_id = lims.sample.authorized_by
and sample_template_id = 200
Does this still perform that bad?

Resources