Why does this query result in a MERGE JOIN CARTESIAN in Oracle? - oracle

Here is my query:
select count(*)
from email_prod_junc j
inner join trckd_prod t5 on j.trckd_prod_sk = t5.trckd_prod_sk
inner join prod_brnd b on t5.prod_brnd_sk = b.prod_brnd_sk
inner join email e on j.email_sk = e.email_sk
inner join dm_geography_sales_pos_uniq u on (u.emp_sk = e.emp_sk and u.prod_brnd_sk = b.prod_brnd_sk)
The explain plan says:
Cartesian Join between DM_GEOGRAPHY_SALES_POS_UNIQ and EMAIL_PROD_JUNC.
I don't understand why because there is a join condition for each table.

I solved this by adding the ORDERED hint:
select /*+ ordered */
I got the information from here
If you specify the tables in the order you want them joined and use this hint, Oracle won't spend time trying to figure out the optimal join order, it will just join them as they are ordered in the FROM clause.

Without knowing your indexes and the full plan, it's hard to say why this is happening exactly. My best guess is that EMAIL_PROD_JUNC and DM_GEOGRAPHY_SALES_POS_UNIQ are relatively small and that there's an index on TRCKD_PROD(trckd_prod_sk, prod_brnd_sk). If that's the case, then the optimizer may have decided that the Cartesian on the two smaller tables is less expensive than filtering TRCKD_PROD twice.

I would speculate that it happens because of the on (x and y) condition of the last inner join. Oracle probably doesn't know how to optimize the multi-statement condition, so it does a full join, then filters the result by the condition after the fact. I'm not really familiar with Oracle's explain plan, so I can't say that with authority
Edit
If you wanted to test this hypothesis, you could try changing the query to:
inner join dm_geography_sales_pos_uniq u on u.emp_sk = e.emp_sk
where u.prod_brnd_sk = b.prod_brnd_sk
and see if that eliminates the full join from the plan

Related

Consecutive JOIN and aliases: order of execution

I am trying to use FULLTEXT search as a preliminary filter before fetching data from another table. Consecutive JOINs follow to further refine the query and to mix-and-match rows (in reality there are up to 6 JOINs of the main table).
The first "filter" returns the IDs of the rows that are useful, so after joining I have a subset to continue with. My issue is performance, however, and my lack of understanding of how the SQL query is executed in SQLite.
SELECT *
FROM mytbl AS t1
JOIN
(SELECT someid
FROM myftstbl
WHERE
myftstbl MATCH 'MATCHME') AS prior
ON
t1.someid = prior.someid
AND t1.othercol = 'somevalue'
JOIN mytbl AS t2
ON
t2.someid = prior.someid
/* Or is this faster? t2.someid = t1.someid */
My thought process for the query above is that first, we retrieve the matched IDs from the myftstbl table and use those to JOIN on the main table t1 to get a sub-selection. Then we again JOIN a duplicate of the main table as t2. The part that I am unsure of is which approach would be faster: using the IDs from the matches, or from t2?
In other words: when I refer to t1.someid inside the second JOIN, does that contain only the someids after the first JOIN (so only those at the intersection of prior and those for which t1.othercol = 'somevalue) OR does it contain all the original someids of the whole original table?
You can assume that all columns are indexed. In fact, when I use one or the other approach, I find with EXPLAIN QUERY PLAN that different indices are being used for each query. So there must be a difference between the two.
The query should be simplified to
SELECT *
FROM mytbl AS t1
JOIN myftstbl USING (someid) -- or ON t1.someid = myftstbl.someid
JOIN mytbl AS t2 USING (someid) -- or ON t1.someid = t2.someid
WHERE myftstbl.{???} MATCH 'MATCHME' -- replace {???} with correct column name
AND t1.othercol = 'somevalue'
PS. The query logic is not clear for me, so it is saved as-is.

Technical and syntax doubts about joins

i'm having a technical and syntax problem with JOINS in ORACLE.
If i have 7 tables, listed below:
FROM
QT_QTS.PLA_ORDEM_PRODUCAO pla,
qt_qts.res_tubo_austenitizacao aust,
qt_qts.res_tubo_revenimento1 res_rev1,
qt_qts.res_tubo_revenimento2 res_rev2,
limsprod.SAMPLE sp,
limsprod.test t,
limsprod.result r
I need to get ALL the data in the "limsprod.result r" table linked with similar corresponding data inside the qt_qts.res_tubo_austenitizacao aust, qt_qts.res_tubo_revenimento1 res_rev1 and qt_qts.res_tubo_revenimento2 res_rev2 tables.
How can I do this join using Oracle Database? I tried a left join, but it did not work.
It is impossible to answer that question. We have nothing but list of some tables. I'm not sure I'd even want to do that instead of you.
However, here's a suggestion: start with one table:
select * from limsprod.result r;
It'll return all rows. Then join it to another table:
select *
from limsprod.result r join qt_qts.res_tubo_austenitizacao aust on aust.id = r.id
and see what happens - did you get all rows you want? If not, should you add another JOIN condition? Perhaps an outer join? Don't move on to the third table until you sort that out. Once you're satisfied with the result, add another table:
select *
from limsprod.result r join qt_qts.res_tubo_austenitizacao aust on aust.id = r.id
join qt_qts.res_tubo_revenimento1 res_rev1 on res_rev1.idrr = aust.idrr
Repeat what's being said previously.

How to effeciently select data from two tables?

I have two tables: A, B.
A has prisoner_id and prisoner_name columns.
B has all other info about prisoners included prisoner_name column.
First I select all of the data that I need from B:
WITH prisoner_datas AS
(SELECT prisoner_name, ... FROM B WHERE ...)
Then I want to know all of the id of my prisoner_datas. To do this I need to combine information by prisoner_name column, because it's common for both tables
I did the following
SELECT A.prisoner_id, prisoner_datas.prisoner_name, prisoner_datas. ...,
FROM A, prisoner_datas
WHERE A.prisoner_name = prisoner_datas.prisoner_name
But it works very slow. How can I improve performance?
Add an index on the prisoner_name join column in the B table. Then the following join should have some performance improvement:
SELECT
A.prisoner_id,
B.prisoner_name,
B.prisoner_datas.id -- and other columns if needed
FROM A
INNER JOIN B
ON A.prisoner_name = B.prisoner_name
Note here that I used an explicit join syntax here. It isn't required, and the query plan might not change, but it makes the query easier to read. I don't think the CTE will change much, but the lack of an index on the join column should be important here.

Oracle: Which one of the queries is more efficient?

Query 1
select student.identifier,
id_tab.reporter_name,
non_id_tab.reporter_name
from student_table student
inner join id_table id_tab on (student.is_NEW = 'Y'
and student.reporter_id = id_tab.reporter_id
and id_tab.name in('name1','name2'))
inner join id_table non_id_tab on (student.non_reporter_id = non_id_tab.reporter_id)
Query 2
select student.identifier,
id_tab.reporter_name,non_id_tab.reporter_name
from student_table student,
id_table id_tab,
id_table non_id_tab
where student.is_NEW = 'Y'
and student.reporter_id = id_tab.reporter_id
and id_tab.name in('name1','name2')
and student.non_reporter_id = non_id_tab.reporter_id
Since these two queries produce exactly same output,I am assuming they are syntactically same(please correct me if I am wrong).
I was wondering whether either of them is more efficient that the other.
Can anyone help me here please?
I would rewrite it as follows, using the ON only for JOIN conditions and moving the filters to a WHERE condition:
...
from student_table student
inner join id_table id_tab on ( student.reporter_id = id_tab.reporter_id )
inner join id_table non_id_tab on (student.non_reporter_id = non_id_tab.reporter_id)
where student.is_NEW = 'Y'
and id_tab.name in('name1','name2')
This should give a more readable query; however, no matter how you write it (the ANSI join is highly preferrable), you should check the explain plans to understand how the query will be executed.
In terms of performance, there should be no difference.
Execution Plans created by the Oracle optimizer do not differ.
In terms of readability, joining tables inside the WHERE clause is an old style (SQL89).
From SQL92 and higher, it is recommended to use the JOIN syntax.

Entity Framework "Joins" resulting in returning entire table from SQL

We are writing entity lambda expression query like this. But when we checked in profile. There were almost all the tables which were used in join returning entire table to the .net linq queries.
We have few transaction tables which has thousands of records. which is causing performance issue.
Please let us know if we can avoid table returning entire rows to .net
var result = (from f in f
join a in this.Context.a on f.primeryKey equals a.primeryKey
join d in this.Context.d on f.secondid equals d.secondid
join t in this.Context.t on d.thirdId equals t.thirdId
where t.isfoo && pfIds.Contains(a.fourthId.HasValue ? a.fourthId.Value : -1)
select f).Distinct().ToList();
Well, no real answer, for that I don't have enough info, but a few remarks to improve your query.
First remark: Don't do Contains and HasValue, because Linq won't SQL-ize these operations. I'm also not quite sure about the this.Context. stuff.
Second: NULL won't join in smart joins.
Third: Instead of selecting f, you'd typically select only a few fields of f that you need.
You'll need to rewrite your query. EF really needs to get all lines to utilize operator ? in order to evaluate value in a.fourthId column. I believe that
var result = (from f in f
join a in this.Context.a on f.primeryKey equals a.primeryKey
join d in this.Context.d on f.secondid equals d.secondid
join t in this.Context.t on d.thirdId equals t.thirdId
where t.isfoo && pfIds.Contains(a.fourthId)
select f).Distinct().ToList();
would meet your needs without necessary overhead, that evaluation seems to be superfluous.

Resources