how to change/simplify joins in oracle - oracle

I have a join in a oracle query which looks like:
FROM eiv.table1
eiv.table2 b
WHERE a.state_cd =
b.state_code(+)
what does the (+) towards the end mean?
With this query I have noticed that sometimes I am getting an empty space when records do not match in tables.
Is this a left outer join or right? How can this be simplified.

SELECT *
FROM eiv.table1 a
LEFT JOIN
eiv.table2 b
ON b.state_code = a.state_cs
Before 9i, Oracle did not support ANSI join syntax, and (+) clause was used instead.

It means that it's a left outer join... Details always come from a, and only come from b when the condition is met...
FROM eiv.table1
eiv.table2 b
WHERE a.state_cd =
b.state_code(+)
=
from evi.table1 a left join eiv.tableb b on (a.state_cd = b.state_code)
You might want to give some thought to using the same column name for the state code on both tables, but it may be a little late for that...

http://www.adp-gmbh.ch/ora/sql/outer_join.html
"This might be what one want or it
might not. Assuming that we want to
return all numbers, even if the german
translation is missing, we need an
outer join. An outer join uses a (+)
on the side of the operator (which in
this case happens to be the equality
operator) where we want to have nulls
returned if no value matches:
select l.v "English", r.v
"German" from r,l where l.i
= r.i (+) and r.l(+) = 'de';
And this returns a row for each
english word, even if there is no
german translation:
English German
-------------------- -------------------- one two zwei three drei four
five"

Related

How do I improve this Stored Procedure?

I have a question:
Assuming an assembly line where a bike goes through some tests, and then the devices send the information regarding the test to
our database (in oracle). I created this stored procedure; it works correctly for what I want, which is:
It gets a list of the first test (per type of test) that a bike has gone through. For instance, if a bike had 2 tests of the same type, it only
shows the first one, AND it shows it only when that first test is between the dates specified by the user. Also I look from 2 months back
because a bike cannot spend more than 2 months (I'm probably overestimating) at the assembly line, but if the user searches 2 days for instance, and I only look in between those days, I could let outside of my results a test made over a bike 3 days ago or maybe 4, and it get's worst if they search between hours.
As I said before, the sp works just fine, but I'm wondering if there's a way to optimize it.
Also consider that the table has around 7 millions of records by the end of the year, so I cannot query the whole year because it could get ugly.
Here's the main part of the stored procedure:
SELECT pid AS "bike_id",
TYPE AS "type",
stationnr AS "stationnr",
testtime AS "testtime",
rel2.releasenr AS "releasenr",
placedesc AS description,
tv.recordtime AS "recordtime",
To_char(tv.testtime, 'YYYY.MM.DD') AS "dategroup",
testcounts AS "testcounts",
tv.result AS "result",
progressive AS "PROGRESIVO"
FROM (SELECT l_bike_id AS pid,
l_testcounts AS testcounts,
To_char(l_testtime, 'yyyy-MM-dd hh24:mi:ss') AS testtimes,
testtime,
pl.code AS place,
t2.recordtime,
t2.releaseid,
t2.testresid,
t2.stationnr,
t2.result,
v.TYPE,
v.progressive,
v.prs,
pl.description AS placeDesc
FROM (SELECT v.bike_id AS l_bike_id,
v.TYPE AS l_type,
Min(t.testtime) AS l_testtime,
Count(t.testtime) AS l_testcounts
FROM result_test t
inner join bikes v
ON v.bike_id = t.pid
inner join result_release rel
ON t.releaseid = rel.releaseid
inner join resultconfig.places p
ON p.place = t.place
WHERE t.testtime >= Add_months(Trunc(p_startdate), -2)
GROUP BY v.bike_id,
v.TYPE,
p.code)p_bikelist
inner join result_test t2
ON p_bikelist.l_bike_id = t2.pid
AND p_bikelist.l_testtime = t2.testtime
inner join resultconfig.places pl
ON pl.place = t2.place
inner join bikes v
ON v.bike_id = t2.pid
inner join result_release rel2
ON t2.releaseid = rel2.releaseid
ORDER BY t2.pid)tv
inner join result_release rel2
ON tv.releaseid = rel2.releaseid
WHERE tv.testtime BETWEEN p_startdate AND p_enddate
ORDER BY testtime;
Thank you for answering!!
I'm struggling a bit to understand the business requirement from the English description you give. The wording suggests that this procedure is intended to work per bike but I don't see any obvious bike_id parameters being supplied, instead, you appear to be returning the earliest result for all bikes tested between given dates. Is that the aim? If it is designed to be run per bike, then ensure bike id gets passed in and used early :)
There is some confusion about your data types. You convert testtime in result_test (presumably a DATE or TIMESTAMP column ) into a string in the p_bikelist subquery but then compare back to the original value in the tv subquery. You further use (presumably typed parameters) p_startdate and p_enddate to filter results. I strongly suspect the conversion in p_bikelist to be unnecessary, and possibly a cause for index avoidance.
Finally, I don't get the add_months logic. By all means, extend the window back in time to get tests that finished within the window but started up to 2 months before the start date, but as written you will exclude the earlier starts anyway because of the condition on tv.testtime. Most likely you'd be better off fudging the startdate earlier in the stored procedure with code like
l_assumedstart := add_months(p_startdate, -2);
and then using l_assumedstart in the query itself.

LEFT OUTER JOIN vs NOT EXISTS syntax with multiple tables

I am doing some performance troubleshooting on my SSRS native instance. I have what I hope is a simple syntax issue. I am troubleshooting execution plans when using LEFT OUTER JOIN and NOT EXISTS. I know the difference between the two and hope that maybe NOT EXISTS is my solution, however I have one problem. Here is my query.
SELECT [Facility]
,[CategoryDesc]
,[SubCategoryDesc]
,[ItemKey]
,[ItemDesc]
,[HeadCount]
,[Group]
,[Group Name]
,[CustomerKey]
,[Customer]
,[InvoiceNo]
,[InvoiceDate]
,[OrderNo]
,[OrderDate]
,[FiscalYear]
,[Quarter]
,[WeekNo]
,[SalesmanID]
,[Salesman]
,[ReasonCodeKey]
,[Weight]
,[Box]
,[Value]
,[OrderStatus]
,[PONumber]
,[SubCategoryKey]
,[DispatchCenterOrderKey]
,[PromotionFlag]
,[CategoryKey]
,b.UserID
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a
LEFT OUTER JOIN [FinancialData].[dbo].[DimSalesRepUserIDMap] b on b.SalesRepID = a.SalesmanID
I am hoping to use this instead:
SELECT [Facility]
,[CategoryDesc]
,[SubCategoryDesc]
,[ItemKey]
,[ItemDesc]
,[HeadCount]
,[Group]
,[Group Name]
,[CustomerKey]
,[Customer]
,[InvoiceNo]
,[InvoiceDate]
,[OrderNo]
,[OrderDate]
,[FiscalYear]
,[Quarter]
,[WeekNo]
,[SalesmanID]
,[Salesman]
,[ReasonCodeKey]
,[Weight]
,[Box]
,[Value]
,[OrderStatus]
,[PONumber]
,[SubCategoryKey]
,[DispatchCenterOrderKey]
,[PromotionFlag]
,[CategoryKey]
,b.UserID
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a
WHERE NOT EXISTS (SELECT 1 FROM FinancialData.dbo.DimSalesRepUserIDMap b WHERE b.SalesRepID = a.SalesmanID)
The problem is that the very last column "b.UserID" uses the LEFT OUTER JOIN to get it's alias. When use the last query, I get "the multi-part identifier "b.UserID" could not be bound. Obviously this is because I have removed the call to this table. If I include it this way... it takes far far too long and not what I am expecting to receive.
FROM [FinancialData].[dbo].[FactSalesHistoryDetail] a, FinancialData.dbo.DimSalesRepUserIDMap b
WHERE NOT EXISTS (SELECT 1 FROM FinancialData.dbo.DimSalesRepUserIDMap b WHERE b.SalesRepID = a.SalesmanID)
So the question is how do I format this so that I am optimizing the performance with NOT EXISTS or EXISTS, while also referencing multiple columns from other tables?
Left join is the best option at this juncture. The last query will produce results but will have performance impact. Btw, did you try cross join?

Oracle 9i Sub query

Hi Can any one help me out of this query forming logic
SELECT C.CPPID, c.CPP_AMT_MANUAL
FROM CPP_PRCNT CC,CPP_VIEW c
WHERE
CC.CPPYR IN (
SELECT C.YEAR FROM CPP_VIEW_VIEW C WHERE UPPER(C.CPPNO) = UPPER('123')
AND C.CPP_CODE ='CPP000000000053'
and TO_CHAR(c.CPP_DATE,'YYYY/Mon')='2012/Nov'
)
AND UPPER(C.CPPNO) = UPPER('123')
AND C.CPP_CODE ='CPP000000000053'
and TO_CHAR(c.CPP_DATE,'YYYY/Mon') = '2012/Nov';
Please Correct me if i formed wrong query structure, in terms of query Performance and Standards. Thanks in Advance
If you have some indexes or partitioned tables I would not use functions on columns but on variables, to be able to use indexes/select partitions.
Also I use ANSI 92 SQL syntax. You don't specify(or not directly) a join contition between cpp_prcnt and cpp_view so it is actually a cartesian product(cross join)
SELECT C.CPPID, c.CPP_AMT_MANUAL
FROM CPP_PRCNT CC
CROSS JOIN CPP_VIEW c
WHERE
CC.CPPYR IN (
SELECT C.YEAR
FROM CPP_VIEW_VIEW C
WHERE C.CPPNO = '123'
AND C.CPP_CODE ='CPP000000000053'
AND trunc(c.CPP_DATE,'MM')=to_date('2012/Nov','YYYY/Mon')
)
AND C.CPPNO = '123'
AND C.CPP_CODE ='CPP000000000053'
AND trunc(c.CPP_DATE,'MM')=to_date('2012/Nov','YYYY/Mon')
If you show us the definition of cpp_view_view(seems to be a view over cpp_view), the definition(if simple) of CPP_VIEW and what you're trying to achieve, I bet there are more things to be improved/fixed.
There are a couple of things you could improve:
if possible, get rid of the UPPER() in the comparison - this will render any indices useless. If that's not possible, consider a function-based index on UPPER(CPPNO)
do not convert your DATE column to a string to compare it with a string - do it the other way round (i.e. convert your string to a date => only one conversion needed instead of one per table row, use of indices possible)
play around with EXISTS instead of IN, as suggested by Dileep - might be faster

Oracle Indexes on Left Outer Joins

So I'm having some issues with proper / any use of indexes in Oracle 11Gr2 and I'm trying to get a better understanding of how my explain plan ties back to my query so that I can apply indexing properly. When running the following query:
SELECT JLOG1.JLOG_KEY,
JLOG1.SRC_CD,
JLOG1.JRNL_AMT,
CASD.CONT_NO,
SUM (NVL (VJLOG.TDTL_AMT, 0)) TDTL_SUM
FROM GL_Journal_Logs JLOG1,
GL_JLOG_Details VJLOG,
CASE_DATA CASD
WHERE VJLOG.JLOG_KEY(+) = JLOG1.JLOG_KEY
AND CASD.CASE_KEY(+) = JLOG1.CASE_KEY
AND JLOG1.JRNL_CD = '0'
AND JLOG1.SRC_CD = '2'
AND JLOG1.ACCT_IF_CD = '0'
GROUP BY JLOG1.JLOG_KEY, JLOG1.SRC_CD,JLOG1.JRNL_AMT, CASD.CONT_NO
HAVING JLOG1.JRNL_AMT <> SUM (NVL (VJLOG.TDTL_AMT, 0));
I'm getting the following explain details:
I can understand that the indexes on my join "keys" (JLOG_KEY or CASE_KEY) wouldn't necessarily apply seeing as it's an outer join (or should they?), however when creating indexes on JLOG1 (JRNL_CD, SRC_CD, ACCT_IF_CD), technically would these take effect given my "where" clause?
Should I create any indexes at all given the circumstances or is there a better way of doing this?
Depending on the cardinality of the columns in your predicates, an appropriate index might be used on the GL_JLOG_DETAILS table, avoiding a full table scan. A covering index may avoid accessing the data pages at all:
ON GL_JOURNAL_LOGS (JRNL_CD,SRC_CD,ACCT_IF_CD,JLOG_KEY,CASE_KEY,JRNL_AMT)
(You probably want the column with the most selective predicate first in that index)
Also, your query may be able to make effective use of indexes
ON GL_JLOG_DETAILS (JLOG_KEY, TDTL_AMT)
and
ON CASE_DATA (CASE_KEY, CONT_NO)
Also, be sure that the statistics on the tables and indexes are up-to-date.
Also, that (+) notation for an OUTER JOIN may be limiting the optimizer.
Oracle now supports the ANSI style joins, which may allow the optimizer more latitude in coming up with an execution plan, e.g.
FROM GL_Journal_Logs JLOG1
LEFT
JOIN GL_JLOG_Details VJLOG ON VJLOG.JLOG_KEY = JLOG1.JLOG_KEY
LEFT
JOIN CASE_DATA CASD ON CASD.CASE_KEY = JLOG1.CASE_KEY
WHERE JLOG1.JRNL_CD = '0'
AND JLOG1.SRC_CD = '2'
AND JLOG1.ACCT_IF_CD = '0'

SQL to Relational Algebra

How do I go about writing the relational algebra for this SQL query?
Select patient.name,
patient.ward,
medicine.name,
prescription.quantity,
prescription.frequency
From patient, medicine, prescription
Where prescription.frequency = "3perday"
AND prescription.end-date="08-06-2010"
AND canceled = "Y"
Relations...
prescription
prescription-ref
patient-ref
medicine-ref
quantity
frequency
end-date
cancelled (Y/N))
medicine
medicine-ref
name
patient
Patient-ref
name
ward
I will just point you out the operators you should use
Projection (π)
π(a1,...,an): The result is defined as the set that is obtained when all tuples in R are restricted to the set {a1,...,an}.
For example π(name) on your patient table would be the same as SELECT name FROM patient
Selection (σ)
σ(condition): Selects all those tuples in R for which condition holds.
For example σ(frequency = "1perweek") on your prescription table would be the same as SELECT * FROM prescription WHERE frequency = "1perweek"
Cross product(X)
R X S: The result is the cross product between R and S.
For example patient X prescription would be SELECT * FROM patient,prescription
You can combine these operands to solve your exercise. Try posting your attempt if you have any issues.
Note: I did not include the natural join as there are no joins. The cross product should be enough for this exercise.
An example would be something like the following. This is only if you accidentally left out the joins between patient, medicine, and prescription. If not, you will be looking for cross product (which seems like a bad idea in this case...) as mentioned by Lombo. I gave example joins that may fit your tables marked as "???". If you could include the layout of your tables that would be helpful.
I also assume that canceled comes from prescription since it is not prefixed.
Edit: If you need it in standard RA form, it's pretty easy to get from a diagram.
alt text http://img532.imageshack.us/img532/8589/diagram1b.jpg

Resources