Is there any difference between these two statements:
-- Statement 1:
SELECT *
FROM Table1 t1
LEFT OUTER JOIN TABLE2 t2 on t1.id = t2.id
and
-- Statement 2:
SELECT *
FROM Table1 t1
LEFT OUTER JOIN (SELECT id, a, b, c FROM Table2) t2 on t1.id = t2.id
I'm not an expert, but Statement 2 just looks like poorly written SQL, and like it would take much longer to run. I'm trying to optimize a code block that has many joins like the second one. Are the two technically the same, so that I can just replace each with the standard join in Statement 1?
Thanks!
P.S. This is Oracle, and I'm working with hundreds of millions of rows.
P.P.S. I'm doing my own detective work to figure out whether they are the same and how the timings differ, but I was hoping an expert could explain what the technical difference is, if there is one.
They are not necessarily the same query. Since the subquery has no filter criteria, it depends on whether the subquery's select list includes all the columns of TABLE2 under their original names. If it does, the two statements are the same query and the subquery is unnecessary. (By "subquery" I mean the parenthesized SELECT statement after the JOIN keyword.)
The first one uses TABLE2 with all of its columns, and all of those columns will be available in the result set wherever the join criteria are met.
In the second one, however, the table you JOIN to is not TABLE2 itself but a derived table containing only the columns named in the subquery's SELECT list, namely id, a, b, and c. It still has all the rows of TABLE2, since no WHERE clause in the subquery filters them.
You will get the same number of rows, with only the selected columns from TABLE2 participating.
The second one is not necessarily poorly written. You could have a criterion that must be met before you JOIN to TABLE2.
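To check the "same rows, fewer columns" point on a small scale, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption, since the question is about Oracle), and the table contents and the extra column d are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, name TEXT);
    -- Column d is hypothetical: it exists in Table2 but is not in the subquery.
    CREATE TABLE Table2 (id INTEGER, a TEXT, b TEXT, c TEXT, d TEXT);
    INSERT INTO Table1 VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO Table2 VALUES (1, 'a1', 'b1', 'c1', 'd1'),
                              (1, 'a2', 'b2', 'c2', 'd2'),
                              (2, 'a3', 'b3', 'c3', 'd3');
""")

# Statement 1: join directly to Table2.
stmt1 = conn.execute(
    "SELECT * FROM Table1 t1 "
    "LEFT OUTER JOIN Table2 t2 ON t1.id = t2.id")
rows1 = stmt1.fetchall()
cols1 = [d[0] for d in stmt1.description]

# Statement 2: join to a subquery that selects only some columns.
stmt2 = conn.execute(
    "SELECT * FROM Table1 t1 "
    "LEFT OUTER JOIN (SELECT id, a, b, c FROM Table2) t2 ON t1.id = t2.id")
rows2 = stmt2.fetchall()
cols2 = [d[0] for d in stmt2.description]

print(len(rows1), len(rows2))   # same number of rows: 4 4
print(len(cols1), len(cols2))   # one column fewer in Statement 2: 7 6
```

This only shows that the results line up on toy data. It says nothing about timing on hundreds of millions of rows; for that you would compare the execution plans in Oracle itself.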
Related
I don't understand why the two queries below return different counts. Case 1 returns more rows, while Case 2 returns fewer. If the WHERE clause is placed outside, fewer records are fetched.
Case 1
SELECT COUNT(1)
FROM (
SELECT *
FROM (SELECT * FROM TABLE1 WHERE COL1 = 123) A
LEFT JOIN TABLE2 B ON B.COL2=A.COL4
LEFT JOIN TABLE3 C ON C.COL3=B.COL2
)
Case 2
SELECT COUNT(1)
FROM (
SELECT *
FROM (SELECT * FROM TABLE1 ) A
LEFT JOIN TABLE2 B ON B.COL2=A.COL4
LEFT JOIN TABLE3 C ON C.COL3=B.COL2
)
WHERE COL1 = 123
Theoretical explanation:
Consider a left outer join of tables A and B. A condition (filter) on table B has different effects if it is in the join condition (ON clause) vs. in the WHERE clause. EDIT: The filter on B being in the ON condition is equivalent to replacing B with a subquery where the filter is applied first (similar to the OP's example).
If it's in the ON clause, then the rows in table B are filtered for that condition, and then the left join is performed. Then the result of the query will include rows from A (with NULL for the B side) whenever there are no rows in B that satisfy the filter and match the row in A on the join condition.
On the other hand, if the filter on B comes later in the execution, in a WHERE clause, then the left join is performed first. Only then is the WHERE clause applied. The WHERE clause is very likely (depending on the conditions on B) to reject all the rows from A that didn't have a matching row in B - because for such rows, all the values from B are NULL.
In your case, assuming COL1 only exists in table B, then the condition COL1=123 in a WHERE clause will effectively cause the left join to produce the same result as an inner join: any rows from A that didn't have a match in B will come from the left join with COL1 as NULL, so they will fail the filter condition. When you put COL1=123 in the ON clause, that check is done BEFORE the "outer join" operation.
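The ON versus WHERE difference is easy to see on a tiny data set. The sketch below again uses Python's sqlite3 module as a stand-in for Oracle (an assumption), with made-up tables a and b where col1 exists only in b.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (a_id INTEGER, col1 INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1, 123), (2, 999);
""")

# Filter in the ON clause: b is filtered first, every row of a survives.
on_rows = conn.execute(
    "SELECT a.id, b.col1 FROM a "
    "LEFT JOIN b ON b.a_id = a.id AND b.col1 = 123 "
    "ORDER BY a.id").fetchall()

# Filter in the WHERE clause: applied after the join, NULL-padded rows fail it.
where_rows = conn.execute(
    "SELECT a.id, b.col1 FROM a "
    "LEFT JOIN b ON b.a_id = a.id "
    "WHERE b.col1 = 123 "
    "ORDER BY a.id").fetchall()

# Filter inside a subquery on b: same effect as putting it in the ON clause.
subq_rows = conn.execute(
    "SELECT a.id, b.col1 FROM a "
    "LEFT JOIN (SELECT * FROM b WHERE col1 = 123) b ON b.a_id = a.id "
    "ORDER BY a.id").fetchall()

print(on_rows)     # [(1, 123), (2, None), (3, None)]
print(where_rows)  # [(1, 123)]
print(subq_rows == on_rows)  # True
```

With the filter in WHERE, the left join effectively behaves like an inner join, which is why that form returns fewer rows.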
My requirement is to get a report from a complex query using an IF statement.
If a flag = 0, I must run one set of SELECT statements; if the flag = 1, I must run another set of SELECT statements against a different table.
Is there any way I can achieve this in a query rather than writing a function or stored procedure?
Eg:
In SQL I do this
if flag = 0
select var1, vari2 from table1
else
select var1, vari2, var3, vari4 from table2
Is this possible?
There is no if in SQL - there is the case expression, but it is not quite the same thing.
If you have two tables, t1 and t2, and flag is in a scalar table t3 ("scalar" means exactly one column, flag, and with exactly one row, with the value either 0 or 1), you can do what you want but only if t1 and t2 have the same number of columns, with the same data types (and, although not required by syntax, this would only make sense if the columns in t1 and t2 have the same business meaning). Or, at least, if you plan to select only some columns from t1 or from t2, the columns you want to select from either table should be equal in number, have the same data type, and preferably the same business meaning.
For example: t1 and t2 may be employee tables, perhaps for two companies that just merged. If they both include first_name, last_name, date_of_birth and you just want to select these three columns from either t1 or t2 based on the flag value (even if t1 has other columns, not present in t2), you can do it. Same if t1 or t2 or both is not a single table, but the result of a more complicated query. The principle is the same.
The way you can do it is with a UNION ALL, like this:
select t1.col1, t1.col2, ...
from t1 cross join t3
where t3.flag = 0
UNION ALL
select t2.col1, t2.col2, ...
from t2 cross join t3
where t3.flag = 1
;
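Here is a small runnable sketch of the UNION ALL approach. It uses Python's sqlite3 module in place of Oracle (an assumption), and the employee columns and sample names are made up, following the merged-companies example above.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (first_name TEXT, last_name TEXT);
    CREATE TABLE t2 (first_name TEXT, last_name TEXT);
    CREATE TABLE t3 (flag INTEGER);   -- the one-row, one-column "scalar" table
    INSERT INTO t1 VALUES ('Ada', 'Lovelace');
    INSERT INTO t2 VALUES ('Alan', 'Turing'), ('Grace', 'Hopper');
    INSERT INTO t3 VALUES (0);
""")

# Only the branch whose flag condition is true contributes rows.
QUERY = """
    SELECT t1.first_name, t1.last_name FROM t1 CROSS JOIN t3 WHERE t3.flag = 0
    UNION ALL
    SELECT t2.first_name, t2.last_name FROM t2 CROSS JOIN t3 WHERE t3.flag = 1
"""

rows_flag0 = conn.execute(QUERY).fetchall()   # flag is 0: rows come from t1

conn.execute("UPDATE t3 SET flag = 1")
rows_flag1 = conn.execute(QUERY).fetchall()   # flag is 1: rows come from t2

print(rows_flag0)
print(sorted(rows_flag1))
```

The same query text serves both cases; only the value in t3 changes which table the rows come from.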
Which of the two queries below is better in terms of performance? The difference is that the first query applies DISTINCT directly, while the second wraps the first query as an inner query (so the records are already filtered before the DISTINCT).
(This is Oracle.)
select distinct t1.f1, t2.f2
from t1, t2
where ...
select distinct f1, f2
from (
select *
from t1, t2
where ...
)
If the subquery expresses the same logic, then they are the same. The subquery will be eliminated by a transformation in the optimiser.
UPDATE TABLE1 T1 SET T1.CENTERNAME=
(SELECT AC.CENTERNAME
FROM TABLE2 T2 INNER JOIN TABLET3 AN ON T2.CENTERID = T3.LOCATIONID
INNER JOIN TABLE1 T1 ON T3.LOG_ID = T1.LOGID
WHERE TRUNC(T1.ROW_DATE)='25-MAR-2014');
This gives the error 'ORA-01427: single-row subquery returns more than one row'.
The error message
ORA-01427: single-row subquery returns more than one row
means, er, the sub-query returns more than one row. That is, this part of your statement ...
(SELECT AC.CENTERNAME
FROM TABLE2 T2 INNER JOIN TABLET3 AN ON T2.CENTERID = T3.LOCATIONID
INNER JOIN TABLE1 T1 ON T3.LOG_ID = T1.LOGID
WHERE TRUNC(T1.ROW_DATE)='25-MAR-2014')
returns more than one row. The error occurs because the SET part of the UPDATE depends on the equality operator - SET T1.CENTERNAME= - so it can take only one value.
Without more details about your data structure it is hard to be certain but I suspect what you really want is something like this
UPDATE TABLE1 T1
SET T1.CENTERNAME= (SELECT T2.CENTERNAME
FROM TABLE2 T2
INNER JOIN TABLE3 T3
ON T2.CENTERID = T3.LOCATIONID
WHERE T3.LOG_ID = T1.LOGID )
WHERE TRUNC(T1.ROW_DATE)='25-MAR-2014'
/
(I've tidied up your redaction to make the aliases consistent.)
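A small sketch of the correlated form, run through Python's sqlite3 module as a stand-in for Oracle (an assumption). The sample rows are made up. SQLite stores the dates as text here, so the TRUNC(...) comparison is replaced by a plain string comparison, and the update target is referenced by its table name rather than an alias.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (logid INTEGER, row_date TEXT, centername TEXT);
    CREATE TABLE table2 (centerid INTEGER, centername TEXT);
    CREATE TABLE table3 (log_id INTEGER, locationid INTEGER);
    INSERT INTO table1 VALUES (1, '2014-03-25', NULL),
                              (2, '2014-03-25', NULL),
                              (3, '2014-03-26', NULL);
    INSERT INTO table2 VALUES (10, 'North'), (20, 'South');
    INSERT INTO table3 VALUES (1, 10), (2, 20), (3, 10);
""")

# The subquery is correlated on table1.logid, so it is evaluated once per
# updated row and returns a single value for that row.
conn.execute("""
    UPDATE table1
       SET centername = (SELECT t2.centername
                           FROM table2 t2
                           JOIN table3 t3 ON t2.centerid = t3.locationid
                          WHERE t3.log_id = table1.logid)
     WHERE row_date = '2014-03-25'
""")

updated = conn.execute(
    "SELECT logid, centername FROM table1 ORDER BY logid").fetchall()
print(updated)  # [(1, 'North'), (2, 'South'), (3, None)]
```

Note that this only works if each LOGID maps to exactly one center. If the data allows several, the correlated subquery would still return more than one row for that LOGID and Oracle would raise ORA-01427 again.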
Say you've got the following query on 9i:
SELECT /*+ USE_HASH(t2 t3) */
* FROM
table1 t1 -- this has lots of rows
LEFT JOIN table2 t2 ON t1.col1 = t2.col1
AND t1.col2 = t2.col2
LEFT JOIN table3 t3 ON t1.col1 = t3.col1
AND t1.col2 = t3.col2
Due to 9i not having RIGHT OUTER HASH JOIN, it needs to hash table1 for both joins. Does it re-hash table1 between joining t2 and t3 (even though it's using the same join columns), or does it keep the same hash information for both joins?
It would need to rehash since the second hash would be table3 against the join of table1/table2 rather than against table1. Or vice versa.
For example, say TABLE1 had 100 rows, table2 had 50 and table3 had 10.
Joining table1 to table2 may give 500 rows. It then joins that result set to table3 to give (perhaps) 700 rows.
It won't do a join of table1 to table2, then a join of table1 to table3, then a join of those two intermediate results.
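To make the "it has to rehash" point concrete, here is a toy model in Python. It is only a sketch of the general build-and-probe idea, not Oracle's actual implementation; the function name, the row layout, and the sample data are all made up for illustration.

```python
from collections import defaultdict

def left_outer_hash_join(left, right, cols, right_name):
    """Toy left outer join that hashes the LEFT (preserved) input
    and probes that hash table with the right input."""
    # Build phase: hash every left row on the join columns.
    table = defaultdict(list)
    for row in left:
        table[tuple(row[c] for c in cols)].append({"row": row, "matched": False})
    built = len(left)  # how many rows this build had to hash

    out = []
    # Probe phase: look up each right row in the hash table.
    for r in right:
        for entry in table.get(tuple(r[c] for c in cols), []):
            entry["matched"] = True
            out.append({**entry["row"], right_name: r["val"]})
    # Left rows that never matched are kept, padded with None.
    for entries in table.values():
        for entry in entries:
            if not entry["matched"]:
                out.append({**entry["row"], right_name: None})
    return out, built

table1 = [{"col1": 1, "col2": 1}, {"col1": 2, "col2": 2}]
table2 = [{"col1": 1, "col2": 1, "val": "a"},
          {"col1": 1, "col2": 1, "val": "b"}]
table3 = [{"col1": 2, "col2": 2, "val": "z"}]

# First join hashes table1 itself.
t1_t2, built_first = left_outer_hash_join(table1, table2, ("col1", "col2"), "t2_val")
# Second join hashes the RESULT of the first join, not table1 again.
result, built_second = left_outer_hash_join(t1_t2, table3, ("col1", "col2"), "t3_val")

print(built_first, built_second)  # 2 3
print(len(result))                # 3
```

The second build works on three rows because the first join duplicated one table1 row, so the hash built for table1 could not simply be reused even though the join columns are the same.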
Look at the plan, it'll tell you the answer.
An example might be something like (I've just made this up):
SELECT
  HASH JOIN
    HASH JOIN
      TABLE FULL SCAN table1
      TABLE FULL SCAN table2
    TABLE FULL SCAN table3
This sample plan scans table1, hashing its contents as it goes; then scans table2, hashing the results of that join into a second hash; and finally scans table3.
There are other plans it could choose from.
If table1 is the biggest table, and the optimizer knows this (due to stats), it probably won't drive from it though.