LEFT OUTER JOIN vs NOT EXISTS syntax with multiple tables

I am doing some performance troubleshooting on my SSRS native instance, and I have what I hope is a simple syntax issue. I am comparing execution plans for LEFT OUTER JOIN and NOT EXISTS. I know the difference between the two and hope that NOT EXISTS may be my solution; however, I have one problem. Here is my query:
SELECT [Facility]
,[CategoryDesc]
,[SubCategoryDesc]
,[ItemKey]
,[ItemDesc]
,[HeadCount]
,[Group]
,[Group Name]
,[CustomerKey]
,[Customer]
,[InvoiceNo]
,[InvoiceDate]
,[OrderNo]
,[OrderDate]
,[FiscalYear]
,[Quarter]
,[WeekNo]
,[SalesmanID]
,[Salesman]
,[ReasonCodeKey]
,[Weight]
,[Box]
,[Value]
,[OrderStatus]
,[PONumber]
,[SubCategoryKey]
,[DispatchCenterOrderKey]
,[PromotionFlag]
,[CategoryKey]
,b.UserID
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a
LEFT OUTER JOIN [FinancialData].[dbo].[DimSalesRepUserIDMap] b on b.SalesRepID = a.SalesmanID
I am hoping to use this instead:
SELECT [Facility]
,[CategoryDesc]
,[SubCategoryDesc]
,[ItemKey]
,[ItemDesc]
,[HeadCount]
,[Group]
,[Group Name]
,[CustomerKey]
,[Customer]
,[InvoiceNo]
,[InvoiceDate]
,[OrderNo]
,[OrderDate]
,[FiscalYear]
,[Quarter]
,[WeekNo]
,[SalesmanID]
,[Salesman]
,[ReasonCodeKey]
,[Weight]
,[Box]
,[Value]
,[OrderStatus]
,[PONumber]
,[SubCategoryKey]
,[DispatchCenterOrderKey]
,[PromotionFlag]
,[CategoryKey]
,b.UserID
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a
WHERE NOT EXISTS (SELECT 1 FROM FinancialData.dbo.DimSalesRepUserIDMap b WHERE b.SalesRepID = a.SalesmanID)
The problem is that the very last column, "b.UserID", relies on the LEFT OUTER JOIN for its alias. When I use the last query, I get "The multi-part identifier "b.UserID" could not be bound." Obviously this is because I have removed the reference to this table. If I include it this way, it takes far too long and does not return what I am expecting:
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a, FinancialData.dbo.DimSalesRepUserIDMap b
WHERE NOT EXISTS (SELECT 1 FROM FinancialData.dbo.DimSalesRepUserIDMap b WHERE b.SalesRepID = a.SalesmanID)
So the question is how do I format this so that I am optimizing the performance with NOT EXISTS or EXISTS, while also referencing multiple columns from other tables?

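Left join is the best option at this juncture, because NOT EXISTS only filters rows from a and cannot return b.UserID. The last query will produce results but will have a performance impact, since a comma-separated FROM list without a join condition is effectively a cross join. Btw, did you try an explicit CROSS JOIN?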
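To make the difference concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server. The two tables are tiny hypothetical versions of the ones in the question (only InvoiceNo, SalesmanID, SalesRepID and UserID are kept, and the sample rows are invented), so it illustrates the semantics only and says nothing about the real execution plans.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FactSalesHistoryDetail (InvoiceNo INTEGER, SalesmanID INTEGER);
    CREATE TABLE DimSalesRepUserIDMap (SalesRepID INTEGER, UserID TEXT);
    INSERT INTO FactSalesHistoryDetail VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO DimSalesRepUserIDMap VALUES (10, 'alice'), (20, 'bob');
""")

# LEFT OUTER JOIN: every fact row comes back, with UserID or NULL.
left_join_rows = conn.execute("""
    SELECT a.InvoiceNo, b.UserID
    FROM FactSalesHistoryDetail a
    LEFT OUTER JOIN DimSalesRepUserIDMap b ON b.SalesRepID = a.SalesmanID
    ORDER BY a.InvoiceNo
""").fetchall()

# NOT EXISTS: only fact rows WITHOUT a match come back, and the subquery's
# alias b is not visible in the outer SELECT list.
not_exists_rows = conn.execute("""
    SELECT a.InvoiceNo
    FROM FactSalesHistoryDetail a
    WHERE NOT EXISTS (SELECT 1 FROM DimSalesRepUserIDMap b
                      WHERE b.SalesRepID = a.SalesmanID)
    ORDER BY a.InvoiceNo
""").fetchall()

# Comma join plus NOT EXISTS: each unmatched fact row is paired with
# every row of the map table (a cross join).
comma_join_rows = conn.execute("""
    SELECT a.InvoiceNo, b.UserID
    FROM FactSalesHistoryDetail a, DimSalesRepUserIDMap b
    WHERE NOT EXISTS (SELECT 1 FROM DimSalesRepUserIDMap b2
                      WHERE b2.SalesRepID = a.SalesmanID)
    ORDER BY a.InvoiceNo, b.UserID
""").fetchall()

print(left_join_rows)   # [(1, 'alice'), (2, 'bob'), (3, None)]
print(not_exists_rows)  # [(3,)]
print(comma_join_rows)  # [(3, 'alice'), (3, 'bob')]
```

The two forms answer different questions: the join returns all sales rows with their UserID where one exists, while NOT EXISTS returns only the sales rows that have no mapping, so one cannot simply replace the other.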

Related

Maximo work order total labor costs and total material costs

I'm working with my DBA to try to figure out a way to roll up all costs associated with a work order. Since any work order can have multiple child work orders (through multiple "generations") as well as related work orders (through the RELATEDRECORDS table), I need to be able to get the total of the ACTLABORCOST and ACTMATERIALCOST fields for all child and related work orders (as well as each of their child and related work orders). I've worked through a hierarchical query (using CONNECT BY PRIOR) to get all the children, grandchildren, etc., but I'm stuck on the related work orders. Since every work order can have a related work order with its own children and related work orders, I need an Oracle function that drills down through the children and the related work orders and their children and related work orders. Since I would think that this is something fairly common, I'm hoping that someone has done this and can share what they've done.
Another option would be a recursive query, as suggested by Francisco Sitja. Since my Oracle didn't allow 2 UNION ALLs, I had to join to the WOANCESTOR table in both child queries instead of dedicating a UNION ALL to the WO hierarchy. I was then able to use the one permitted UNION ALL for the RELATEDRECORD hierarchy. And it seems to run pretty quickly.
with mywos (wonum, parent, taskid, worktype, description, origrecordid, woclass, siteid) as (
-- normal WO hierarchy
select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
from woancestor a
join workorder wo
on a.wonum = wo.wonum
and a.siteid = wo.siteid
where a.ancestor = 'MY-STARTING-WONUM'
union all
-- WO hierarchy associated via RELATEDRECORD
select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
from mywos
join relatedrecord rr
on mywos.woclass = rr.class
and mywos.siteid = rr.siteid
and mywos.wonum = rr.recordkey
-- prevent cycle / going back up the hierarchy
and rr.relatetype not in ('ORIGINATOR')
join woancestor a
on rr.relatedrecsiteid = a.siteid
and rr.relatedreckey = a.ancestor
join workorder wo
on a.siteid = wo.siteid
and a.wonum = wo.wonum
)
select * from mywos
;
Have you considered the WOGRANDTOTAL object? Its description in MAXOBJECT is "Non-Persistent table to display WO grandtotals". There is a dialog in the Work Order Tracking application that you can get to from the Select Action / More Actions menu. Since you mentioned it repeatedly, I should note that WOGRANDTOTAL values do not include joins across RELATEDRECORDS to other work order hierarchies.
You can also save yourself the complication of CONNECT BY PRIOR by joining to WOANCESTOR, which is effectively a dump from a CONNECT BY PRIOR query. (There are other %ANCESTOR tables for other hierarchies.)
I think a recursive automation script would be the best way to do what you want, if you need the results in Maximo. If you need the total cost outside of Maximo, maybe a recursive function would work.
We finally figured out how to pull this off.
WITH WO(WONUM,
PARENT) AS
((SELECT X.WONUM,
X.PARENT
FROM (SELECT R.RECORDKEY WONUM,
R.RELATEDRECKEY PARENT
FROM MAXIMO.RELATEDRECORD R
WHERE R.RELATEDRECKEY = '382418'
UNION ALL
SELECT W.WONUM,
W.PARENT
FROM MAXIMO.WORKORDER W
START WITH W.PARENT = '382418'
CONNECT BY PRIOR W.WONUM = W.PARENT) X)
UNION ALL
SELECT W.WONUM, W.PARENT FROM MAXIMO.WORKORDER W, WO WHERE W.WONUM = WO.PARENT)
SELECT DISTINCT WONUM FROM WO;
This returns a list of all of the child and related work orders for a given work order.
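As an illustration of the rollup the question asks for, here is a rough sketch in Python with sqlite3, not the Oracle query above. The tables are simplified, hypothetical stand-ins for WORKORDER and RELATEDRECORD (no SITEID, CLASS or RELATETYPE columns), related links are followed from recordkey to relatedreckey, and the sample data is invented. The recursive step uses UNION rather than UNION ALL so that rows already visited are discarded and a mutual link cannot loop forever in SQLite; Oracle's recursive WITH has its own syntax and cycle handling, so treat this only as a model of the logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for WORKORDER and RELATEDRECORD.
    CREATE TABLE workorder (wonum TEXT PRIMARY KEY, parent TEXT,
                            actlaborcost INTEGER, actmaterialcost INTEGER);
    CREATE TABLE relatedrecord (recordkey TEXT, relatedreckey TEXT);
    INSERT INTO workorder VALUES
        ('100', NULL,  10,  1),
        ('101', '100', 20,  2),
        ('102', '101', 30,  3),
        ('200', NULL,  40,  4),
        ('201', '200', 50,  5),
        ('900', NULL, 999, 999);  -- unrelated, must not be counted
    -- 101 and 200 are related to each other (a cycle).
    INSERT INTO relatedrecord VALUES ('101', '200'), ('200', '101');
""")

# edges: one row per parent->child link and per related-record link.
# tree: everything reachable from the starting work order.
REACHABLE = """
WITH RECURSIVE
edges(src, dst) AS (
    SELECT parent, wonum FROM workorder WHERE parent IS NOT NULL
    UNION ALL
    SELECT recordkey, relatedreckey FROM relatedrecord
),
tree(wonum) AS (
    SELECT ?
    UNION
    SELECT e.dst FROM edges e JOIN tree t ON e.src = t.wonum
)
"""

def rollup(conn, start):
    """Return (reachable work orders, total labor cost, total material cost)."""
    wonums = [row[0] for row in conn.execute(
        REACHABLE + "SELECT wonum FROM tree ORDER BY wonum", (start,))]
    labor, material = conn.execute(
        REACHABLE + "SELECT SUM(w.actlaborcost), SUM(w.actmaterialcost) "
                    "FROM workorder w JOIN tree t ON t.wonum = w.wonum",
        (start,)).fetchone()
    return wonums, labor, material

wonums, labor, material = rollup(conn, "100")
print(wonums, labor, material)  # ['100', '101', '102', '200', '201'] 150 15
```

Here the starting work order is included in the totals; dropping it would only require filtering it out of the final SELECT.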

How do I improve this Stored Procedure?

I have a question:
Assume an assembly line where a bike goes through some tests, after which the devices send the test information to our database (in Oracle). I created this stored procedure; it works correctly for what I want, which is:
It gets a list of the first test (per type of test) that a bike has gone through. For instance, if a bike had 2 tests of the same type, it only shows the first one, AND it shows it only when that first test is between the dates specified by the user. I also look back 2 months, because a bike cannot spend more than 2 months (I'm probably overestimating) on the assembly line. If the user searches a 2-day range, for instance, and I only look between those days, I could miss a test performed on a bike 3 or 4 days earlier, and it gets worse if they search between hours.
As I said before, the SP works just fine, but I'm wondering if there's a way to optimize it.
Also consider that the table has around 7 million records by the end of the year, so I cannot query the whole year because it could get ugly.
Here's the main part of the stored procedure:
SELECT pid AS "bike_id",
TYPE AS "type",
stationnr AS "stationnr",
testtime AS "testtime",
rel2.releasenr AS "releasenr",
placedesc AS description,
tv.recordtime AS "recordtime",
To_char(tv.testtime, 'YYYY.MM.DD') AS "dategroup",
testcounts AS "testcounts",
tv.result AS "result",
progressive AS "PROGRESIVO"
FROM (SELECT l_bike_id AS pid,
l_testcounts AS testcounts,
To_char(l_testtime, 'yyyy-MM-dd hh24:mi:ss') AS testtimes,
testtime,
pl.code AS place,
t2.recordtime,
t2.releaseid,
t2.testresid,
t2.stationnr,
t2.result,
v.TYPE,
v.progressive,
v.prs,
pl.description AS placeDesc
FROM (SELECT v.bike_id AS l_bike_id,
v.TYPE AS l_type,
Min(t.testtime) AS l_testtime,
Count(t.testtime) AS l_testcounts
FROM result_test t
inner join bikes v
ON v.bike_id = t.pid
inner join result_release rel
ON t.releaseid = rel.releaseid
inner join resultconfig.places p
ON p.place = t.place
WHERE t.testtime >= Add_months(Trunc(p_startdate), -2)
GROUP BY v.bike_id,
v.TYPE,
p.code)p_bikelist
inner join result_test t2
ON p_bikelist.l_bike_id = t2.pid
AND p_bikelist.l_testtime = t2.testtime
inner join resultconfig.places pl
ON pl.place = t2.place
inner join bikes v
ON v.bike_id = t2.pid
inner join result_release rel2
ON t2.releaseid = rel2.releaseid
ORDER BY t2.pid)tv
inner join result_release rel2
ON tv.releaseid = rel2.releaseid
WHERE tv.testtime BETWEEN p_startdate AND p_enddate
ORDER BY testtime;
Thank you for answering!!
I'm struggling a bit to understand the business requirement from the English description you give. The wording suggests that this procedure is intended to work per bike but I don't see any obvious bike_id parameters being supplied, instead, you appear to be returning the earliest result for all bikes tested between given dates. Is that the aim? If it is designed to be run per bike, then ensure bike id gets passed in and used early :)
There is some confusion about your data types. You convert testtime in result_test (presumably a DATE or TIMESTAMP column ) into a string in the p_bikelist subquery but then compare back to the original value in the tv subquery. You further use (presumably typed parameters) p_startdate and p_enddate to filter results. I strongly suspect the conversion in p_bikelist to be unnecessary, and possibly a cause for index avoidance.
Finally, I don't get the add_months logic. By all means, extend the window back in time to get tests that finished within the window but started up to 2 months before the start date, but as written you will exclude the earlier starts anyway because of the condition on tv.testtime. Most likely you'd be better off fudging the startdate earlier in the stored procedure with code like
l_assumedstart := add_months(p_startdate, -2);
and then using l_assumedstart in the query itself.
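The interplay between the widened scan and the user's window can be sketched as follows, using Python's sqlite3 module rather than Oracle. The table is a hypothetical, cut-down result_test (pid, place, testtime stored as ISO text), the sample rows are invented, and SQLite's datetime(..., '-2 months') stands in for add_months; the point is only to show why the inner scan starts earlier than the window while the outer filter still applies the window to the first test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified result_test; timestamps are ISO strings.
    CREATE TABLE result_test (pid INTEGER, place TEXT, testtime TEXT);
    INSERT INTO result_test VALUES
        (1, 'A', '2023-05-30 08:00:00'),  -- bike 1: first test before the window
        (1, 'A', '2023-06-02 09:00:00'),
        (2, 'A', '2023-06-01 10:00:00'),  -- bike 2: first test inside the window
        (2, 'A', '2023-06-02 11:00:00'),
        (3, 'B', '2023-06-05 12:00:00');  -- bike 3: after the window
""")

# Inner query: first test per bike and place, scanning from :scan_from.
# Outer query: keep it only if that first test falls inside the window.
FIRST_TESTS = """
    SELECT pid, place, first_time, testcounts
    FROM (SELECT pid, place,
                 MIN(testtime) AS first_time,
                 COUNT(*)      AS testcounts
          FROM result_test
          WHERE testtime >= :scan_from
          GROUP BY pid, place)
    WHERE first_time BETWEEN :start AND :end
    ORDER BY first_time
"""

window = {"start": "2023-06-01 00:00:00", "end": "2023-06-03 23:59:59"}

# Equivalent in spirit to l_assumedstart := add_months(p_startdate, -2).
assumed_start = conn.execute(
    "SELECT datetime(?, '-2 months')", (window["start"],)).fetchone()[0]

# Scanning only the window wrongly reports bike 1's second test as its first.
naive = conn.execute(
    FIRST_TESTS, {**window, "scan_from": window["start"]}).fetchall()

# Scanning from the assumed start finds bike 1's real first test (May 30),
# which is then excluded because it lies outside the window.
with_lookback = conn.execute(
    FIRST_TESTS, {**window, "scan_from": assumed_start}).fetchall()

print(naive)
print(with_lookback)
```

With this data the naive scan returns bikes 2 and 1, while the lookback scan returns only bike 2, which matches the behaviour described in the question.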

Oracle 9i Sub query

Hi, can anyone help me with the logic for forming this query?
SELECT C.CPPID, c.CPP_AMT_MANUAL
FROM CPP_PRCNT CC,CPP_VIEW c
WHERE
CC.CPPYR IN (
SELECT C.YEAR FROM CPP_VIEW_VIEW C WHERE UPPER(C.CPPNO) = UPPER('123')
AND C.CPP_CODE ='CPP000000000053'
and TO_CHAR(c.CPP_DATE,'YYYY/Mon')='2012/Nov'
)
AND UPPER(C.CPPNO) = UPPER('123')
AND C.CPP_CODE ='CPP000000000053'
and TO_CHAR(c.CPP_DATE,'YYYY/Mon') = '2012/Nov';
Please correct me if I have structured the query wrongly in terms of performance and standards. Thanks in advance.
If you have indexes or partitioned tables, I would apply functions to the variables rather than to the columns, so that indexes can be used and partitions pruned.
I also use ANSI-92 SQL syntax. You don't specify (at least not directly) a join condition between cpp_prcnt and cpp_view, so it is actually a Cartesian product (cross join):
SELECT C.CPPID, c.CPP_AMT_MANUAL
FROM CPP_PRCNT CC
CROSS JOIN CPP_VIEW c
WHERE
CC.CPPYR IN (
SELECT C.YEAR
FROM CPP_VIEW_VIEW C
WHERE C.CPPNO = '123'
AND C.CPP_CODE ='CPP000000000053'
AND trunc(c.CPP_DATE,'MM')=to_date('2012/Nov','YYYY/Mon')
)
AND C.CPPNO = '123'
AND C.CPP_CODE ='CPP000000000053'
AND trunc(c.CPP_DATE,'MM')=to_date('2012/Nov','YYYY/Mon')
If you show us the definition of cpp_view_view (it seems to be a view over cpp_view), the definition of CPP_VIEW (if simple) and what you're trying to achieve, I bet there are more things to be improved or fixed.
There are a couple of things you could improve:
if possible, get rid of the UPPER() in the comparison - this will render any indices useless. If that's not possible, consider a function-based index on UPPER(CPPNO)
do not convert your DATE column to a string to compare it with a string - do it the other way round (i.e. convert your string to a date => only one conversion needed instead of one per table row, use of indices possible)
play around with EXISTS instead of IN, as suggested by Dileep - might be faster
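The advice about not wrapping the date column in a function can be sketched as below. This uses Python's sqlite3 module with a hypothetical one-column stand-in for CPP_VIEW, dates stored as ISO text, and invented rows; the index name idx_cpp_date is also made up. It shows that a half-open range on the bare column selects the same November 2012 rows as the function-based comparison. In Oracle the corresponding predicate would presumably look like c.CPP_DATE >= DATE '2012-11-01' AND c.CPP_DATE < DATE '2012-12-01'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for CPP_VIEW with an index on the date column.
    CREATE TABLE cpp_view (cppid INTEGER, cpp_date TEXT);
    CREATE INDEX idx_cpp_date ON cpp_view (cpp_date);
    INSERT INTO cpp_view VALUES
        (1, '2012-10-31 23:59:59'),
        (2, '2012-11-01 00:00:00'),
        (3, '2012-11-30 23:59:59'),
        (4, '2012-12-01 00:00:00');
""")

# Function applied to the column: evaluated once per row, and an ordinary
# index on cpp_date generally cannot be used for the lookup.
BY_FUNCTION = """
    SELECT cppid FROM cpp_view
    WHERE strftime('%Y/%m', cpp_date) = '2012/11'
"""

# Bare column compared against constants: a half-open range covering the month.
BY_RANGE = """
    SELECT cppid FROM cpp_view
    WHERE cpp_date >= '2012-11-01' AND cpp_date < '2012-12-01'
"""

rows_by_function = sorted(conn.execute(BY_FUNCTION).fetchall())
rows_by_range = sorted(conn.execute(BY_RANGE).fetchall())
print(rows_by_function, rows_by_range)  # [(2,), (3,)] [(2,), (3,)]

# The plan text differs between SQLite versions, so it is only printed here.
for label, sql in (("function", BY_FUNCTION), ("range", BY_RANGE)):
    plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    print(label, plan)
```

On typical SQLite builds the first plan reports a scan of the table and the second a search using the index, but the exact wording and the optimizer's choice depend on the version and the data, so check the plan on your own system.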

Oracle Indexes on Left Outer Joins

So I'm having some issues with proper / any use of indexes in Oracle 11gR2 and I'm trying to get a better understanding of how my explain plan ties back to my query so that I can apply indexing properly. When running the following query:
SELECT JLOG1.JLOG_KEY,
JLOG1.SRC_CD,
JLOG1.JRNL_AMT,
CASD.CONT_NO,
SUM (NVL (VJLOG.TDTL_AMT, 0)) TDTL_SUM
FROM GL_Journal_Logs JLOG1,
GL_JLOG_Details VJLOG,
CASE_DATA CASD
WHERE VJLOG.JLOG_KEY(+) = JLOG1.JLOG_KEY
AND CASD.CASE_KEY(+) = JLOG1.CASE_KEY
AND JLOG1.JRNL_CD = '0'
AND JLOG1.SRC_CD = '2'
AND JLOG1.ACCT_IF_CD = '0'
GROUP BY JLOG1.JLOG_KEY, JLOG1.SRC_CD,JLOG1.JRNL_AMT, CASD.CONT_NO
HAVING JLOG1.JRNL_AMT <> SUM (NVL (VJLOG.TDTL_AMT, 0));
I'm getting the following explain details:
I can understand that the indexes on my join "keys" (JLOG_KEY or CASE_KEY) wouldn't necessarily apply seeing as it's an outer join (or should they?), however when creating indexes on JLOG1 (JRNL_CD, SRC_CD, ACCT_IF_CD), technically would these take effect given my "where" clause?
Should I create any indexes at all given the circumstances or is there a better way of doing this?
Depending on the cardinality of the columns in your predicates, an appropriate index might be used on the GL_JOURNAL_LOGS table, avoiding a full table scan. A covering index may avoid accessing the data pages at all:
ON GL_JOURNAL_LOGS (JRNL_CD,SRC_CD,ACCT_IF_CD,JLOG_KEY,CASE_KEY,JRNL_AMT)
(You probably want the column with the most selective predicate first in that index)
Also, your query may be able to make effective use of indexes
ON GL_JLOG_DETAILS (JLOG_KEY, TDTL_AMT)
and
ON CASE_DATA (CASE_KEY, CONT_NO)
Also, be sure that the statistics on the tables and indexes are up-to-date.
Also, that (+) notation for an OUTER JOIN may be limiting the optimizer.
Oracle now supports the ANSI style joins, which may allow the optimizer more latitude in coming up with an execution plan, e.g.
FROM GL_Journal_Logs JLOG1
LEFT
JOIN GL_JLOG_Details VJLOG ON VJLOG.JLOG_KEY = JLOG1.JLOG_KEY
LEFT
JOIN CASE_DATA CASD ON CASD.CASE_KEY = JLOG1.CASE_KEY
WHERE JLOG1.JRNL_CD = '0'
AND JLOG1.SRC_CD = '2'
AND JLOG1.ACCT_IF_CD = '0'

how to change/simplify joins in oracle

I have a join in an Oracle query which looks like:
FROM eiv.table1 a,
eiv.table2 b
WHERE a.state_cd =
b.state_code(+)
What does the (+) towards the end mean?
With this query I have noticed that I sometimes get empty values when records do not match between the tables.
Is this a left outer join or a right one? How can this be simplified?
SELECT *
FROM eiv.table1 a
LEFT JOIN
eiv.table2 b
ON b.state_code = a.state_cd
Before 9i, Oracle did not support ANSI join syntax, and (+) clause was used instead.
It means that it's a left outer join... Details always come from a, and only come from b when the condition is met...
FROM eiv.table1 a,
eiv.table2 b
WHERE a.state_cd =
b.state_code(+)
is equivalent to
FROM eiv.table1 a LEFT JOIN eiv.table2 b ON (a.state_cd = b.state_code)
You might want to give some thought to using the same column name for the state code on both tables, but it may be a little late for that...
http://www.adp-gmbh.ch/ora/sql/outer_join.html
"This might be what one want or it might not. Assuming that we want to return all numbers, even if the german translation is missing, we need an outer join. An outer join uses a (+) on the side of the operator (which in this case happens to be the equality operator) where we want to have nulls returned if no value matches:
select l.v "English", r.v "German"
from r, l
where l.i = r.i (+) and r.l(+) = 'de';
And this returns a row for each english word, even if there is no german translation:
English              German
-------------------- --------------------
one
two                  zwei
three                drei
four
five"
