Oracle nested CONNECT BY clauses causing poor performance - oracle

The query below is taking about a minute. I believe the poor performance is caused by the two "IN (SELECT..." clauses. I have a table of terms where one may be connected to another via the term_relationship table. These relationships can be recursive, e.g. dog is a type of mammal, mammal is a type of animal. This recursion could be any depth but probably not more than ~10 levels. I am trying to select all terms which have a (potentially recursive) relationship with type A and have a (potentially recursive) relationship with type B. I think that replacing the two "IN (SELECT..." clauses with restrictions on the outer query would improve performance but cannot figure out how to do this using the CONNECT BY clauses. Can anyone help with this?
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id = 123
CONNECT BY NOCYCLE PRIOR term_id = related_term_id)
AND term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id = 456
CONNECT BY NOCYCLE PRIOR term_id = related_term_id)

Instead of running the same CONNECT BY query twice with only the start value differing, how about running it once and giving both start values to a single instance of the subquery? That change returns every term_id related to either of your starting values, but you want only the term_ids related to both. To get that, group the results by term_id and keep only the groups with a count of at least two:
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id in (123, 456)
CONNECT BY NOCYCLE PRIOR term_id = related_term_id
group by term_id having count(*) >= 2)
Edit
With the above code I made an assumption about your data that may not be correct. I assumed a tree-like structure where you start with nodes on a branch and travel up towards the root, as in diagram A. If your data looks like diagram B, however, the above query fails when you start at nodes 7 and 9: node 7 has two paths back to node 1, so the query returns node 1 twice and misidentifies it as a common node.
A)   -(1)-            B)   -(1)-
    /  |  \    (8)        /  |  \    (8)
  (2)  |  (3)   |       (2)  |  (3)   |
   |  (4)  |   (9)       |  (4)  |   (9)
  (5)     (6)           (5)     (6)
   |                      \     /
  (7)                      -(7)-
The query below corrects for this. For starting nodes 7 and 9 it correctly finds no nodes in common, while for starting nodes 7 and 4 it identifies node 1 as a common node:
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id in (123, 456)
CONNECT BY NOCYCLE PRIOR term_id = related_term_id
group by term_id
having count(distinct connect_by_root related_term_id) >= 2)
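To see why counting distinct roots matters, here is a small self-contained sketch that loads the edges of diagram B into an inline view and compares the two counts. The edge list is my reading of the diagram (term_id as the parent, related_term_id as the child), and the aliases path_count and start_count are made up for illustration.

```sql
-- Edges of diagram B as (parent, child) pairs; an assumed reading of the diagram.
with term_relationship as (
  select 1 as term_id, 2 as related_term_id from dual union all
  select 1, 3 from dual union all
  select 1, 4 from dual union all
  select 2, 5 from dual union all
  select 3, 6 from dual union all
  select 5, 7 from dual union all
  select 6, 7 from dual union all
  select 8, 9 from dual
)
select term_id,
       count(*) as path_count,                                          -- rows reached, one per path
       count(distinct connect_by_root related_term_id) as start_count   -- distinct starting nodes
from term_relationship
start with related_term_id in (7, 9)
connect by nocycle prior term_id = related_term_id
group by term_id
order by term_id;
```

Starting from 7 and 9, node 1 should come back with a path_count of 2 (reached once via 5 and once via 6) but a start_count of 1, so only a filter on the second count excludes it.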

Related

Extract a sub-tree from a hierarchy tree based on a leaf in Oracle

I have a table users representing a hierarchical tree like this:
Column     Type          Comment
---------  ------------  --------------------------------------------------------
user_id    integer       sequence
user_type  integer       1 for group of users, 2 for normal user
group_id   integer       Reference to a user in the same table with user_type = 1
user_name  varchar(xxx)
The group_id column references another user_id so that groups and users are stored in the same table.
The master group_id is 0.
Like this:
user_id  user_type  group_id  user_name
-------  ---------  --------  -----------
0        1          null      'All users'
5        2          0         'USER1'
6        2          0         'USER2'
11       1          0         'SUBGROUP1'
12       1          11        'SUBGROUP2'
13       2          12        'USER3'
20       1          0         'SUBGROUP3'
21       2          20        'USER4'
Notice that:
There can be gaps in user_id.
A group can contain nothing or any number of groups or users.
I have already managed to retrieve the full tree, properly indented and sorted, by using Oracle's CONNECT BY clause.
This is not my question here.
My question is:
Given a user_id, how can I walk up the tree to the master group 'All Users'
and output the full path from the leaf to the master group?
Example 1: I run the query for USER1 and I want the following output:
All Users
- USER1
Example 2: I run the same query for USER3 and I want the following output:
All Users
- SUBGROUP1
-- SUBGROUP2
--- USER3
I hope someone can help me with this.
For information, here is the query that retrieves the full tree, so you can see how I use CONNECT BY and START WITH.
I'm sure this query is close to the one I want, but my attempts never produce the result I want.
select
lpad('-', (level - 1) * 2, ' ') || u.user_name as padded_name,
u.user_id,
u.group_id,
u.user_type,
level
from users u
connect by prior u.user_id = u.group_id
start with u.user_id = 0
order siblings by upper(u.user_name);
You could use connect by to walk in the opposite direction. Then the level will of course be opposite too. So to get the results in the right order and indentation, chain another query based on these results that will use row_number() to determine the indentation:
with base as (
select
u.user_name,
u.user_id,
u.group_id,
u.user_type,
level as lvl
from users u
connect by prior u.group_id = u.user_id
start with u.user_id = 13
)
select
lpad('-', (row_number() over (order by lvl desc) - 1) * 2, ' ') || base.user_name
as padded_name,
user_id,
group_id,
user_type
from base
order by lvl desc;

Nearest neighbor and distance between points and lines

In Oracle Spatial I have two tables (AVALREGULACAO and ATROCOADUTOR) representing points and lines, respectively.
The structure of both tables is as follows:
AVALREGULACAO (295 point records)
IPID [number(10)]
GEOMETRY [MDSYS.SDO_GEOMETRY]
ATROCOADUTOR (12536 line records)
IPID [number(10)]
GEOMETRY [MDSYS.SDO_GEOMETRY]
I need to find the nearest ATROCOADUTOR neighbor of each AVALREGULACAO and calculate the distance between them:
AVALREGULACAO_IPID | ATROCOADUTOR_IPID | DISTANCE
I’ve tried 2 options:
1
SELECT /*+ ORDERED */ A.IPID, B.IPID, MIN(SDO_GEOM.SDO_DISTANCE(sdo_cs.make_2d(A.GEOMETRY), sdo_cs.make_2d(B.GEOMETRY), 0.005)) as DISTANCE
FROM AVALREGULACAO A, ATROCOADUTOR B
GROUP BY A.IPID, B.IPID;
It takes quite a long time to compute, because it generates a huge output of 295 x 12536 = 3 698 120 combinations (a Cartesian product). Furthermore, the CSV output file cannot accommodate all these records (1 048 576 row limit).
I only need 295 records corresponding to the 295 AVALREGULACAO.
2
I’ve also tried/adapted another query with the nearest neighbor (nn) operator
PROMPT IPID, nearest_IPID, distance
select /*+ ORDERED USE_NL(s,s2)*/
s.IPID,
s2.IPID as nearest_IPID,
TO_CHAR(REPLACE(mdsys.sdo_geom.sdo_distance(sdo_cs.make_2d(s.GEOMETRY),sdo_cs.make_2d(s2.GEOMETRY),0.05), ',','.')) as distance
from AVALREGULACAO s,
ATROCOADUTOR s2
where s2.IPID in (select IPID
from AVALREGULACAO s3
where sdo_nn(s3.GEOMETRY,s.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
and s3.IPID <> s.IPID
and rownum < 2)
order by 1,2;
This query takes forever - I need to shut down the process before it ends.
I guess I'm missing the point on how to optimize/filter the desired results.
Any tips on how to efficiently solve this would be much appreciated.
Thanks in advance,
Pedro
PS:
@Boneist, thanks a lot for the input.
Unfortunately I got an error after applying your query (I'm still trying to work out the semantics/syntax of the new commands KEEP and dense_rank):
SELECT a.ipid a_ipid,
MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance
FROM avalregulacao a
INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP BY a.ipid;
Error
Error starting at line : 1 in command -
SELECT a.ipid a_ipid,
MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance
FROM avalregulacao a
INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP BY a.ipid
Error at Command Line : 2 Column : 45
Error report -
SQL Error: ORA-29907: foram encontradas etiquetas em duplicado em invocações primárias
29907. 00000 - "found duplicate labels in primary invocations"
*Cause: There are multiple primary invocations of operators with
the same number as the label.
*Action: Use distinct labels in primary invocations.
I think you're probably after something like:
SELECT a.ipid a_ipid,
MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance
FROM avalregulacao a
INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP BY a.ipid;
This joins both tables on the nearest neighbour function, which should reduce the number of rows being returned.
The MIN(b.ipid) KEEP (dense_rank first order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) simply returns the lowest b.ipid value for the lowest difference.
(I think this query will work as is, but I can't test it. You might have to do the join and have sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) as a column in a subquery and then do the group by in the outer query.)
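The ORA-29907 reported in the question's PS is consistent with SDO_NN being invoked twice with the same label (the trailing 1), once in the join and once in the KEEP clause. A possible way around it, sketched here under assumptions I cannot verify from the post, is to call SDO_NN once with sdo_num_res=1 and read the distance through the ancillary SDO_NN_DISTANCE operator. This assumes a spatial index exists on atrocoadutor.geometry; the hints and column aliases are my own choices, not something the thread confirms.

```sql
-- Sketch only: assumes a spatial index on atrocoadutor.geometry.
select /*+ LEADING(a) USE_NL(a b) */
       a.ipid             as avalregulacao_ipid,
       b.ipid             as atrocoadutor_ipid,
       sdo_nn_distance(1) as distance   -- distance from the SDO_NN call labelled 1
from   avalregulacao a,
       atrocoadutor  b
where  sdo_nn(b.geometry, a.geometry, 'sdo_num_res=1', 1) = 'TRUE'
order  by a.ipid;
```

If this works as intended it returns one row per AVALREGULACAO point, i.e. the 295 rows the question asks for, without building the Cartesian product. Whether the geometries first need sdo_cs.make_2d, as in the original queries, depends on how the data and index were defined.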

oracle connect by multiple parents

I am facing an issue using connect by.
I have a query through which I retrieve a few columns including these three:
ID
ParentID
ObjectID
Now for the same ID and parentID, there are multiple objects associated e.g.
ID ParentID ObjectID
1 0 112
1 0 113
2 0 111
2 0 112
3 1 111
4 1 112
I am trying to use CONNECT BY but I'm unable to get the result in a proper hierarchy. I need it the way it is shown below: take an ID-ParentID combination, display all rows with that ID-ParentID, and then all the children of this ID, i.e. the rows whose ParentID = ID.
ID ParentID ObjectID
1 0 112
1 0 113
3 1 111
4 1 112
2 0 111
2 0 112
select ID,parent_id, object_id from table start with parent_id=0
connect by prior id=parent_id order by id,parent_id
The above query does not produce the hierarchy that I need.
Well, your problem appears to be that you are using a non-normalized table design. If a given ID always has the same ParentID, that relationship shouldn't be indicated separately in all these rows.
A better design would be to have a single table showing the parent child relationships, with ID as a primary key, and a second table showing the mappings of ID to ObjectID, where I presume both columns together would comprise the primary key. Then you would apply your hierarchical query against the first table, and join the results of that to the other table to get the relevant objects for each row.
You can emulate this with your current table structure ...
with parent_child as (select distinct id, parent_id from table),
tree as (select id, parent_id from parent_child
start with parent_id = 0
connect by prior id = parent_id )
select id, table.parent_id, table.object_id
from tree join table using (id)
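One caveat with the emulation above is that a join does not promise to keep the hierarchical order. A minimal sketch of one way to pin it down is to number the tree rows before joining and sort on that number. It uses the table name test from the script further down; the names ordered_tree and seq are made up for illustration.

```sql
with parent_child as (
  select distinct id, parent_id from test
),
tree as (
  select id, parent_id
  from parent_child
  start with parent_id = 0
  connect by prior id = parent_id
  order siblings by id            -- deterministic order among siblings
),
ordered_tree as (
  -- rownum numbers the rows in the order the ordered subquery returns them
  select id, parent_id, rownum as seq from tree
)
select o.id, o.parent_id, t.object_id
from ordered_tree o
join test t on t.id = o.id
order by o.seq, t.object_id;
```

With the sample rows from the question this should list ID 1 with its two objects, then its children 3 and 4, then ID 2.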
Here's a script that runs. Not ideal but will work -
select * from (select distinct test.id,
parent_id,
object_id,
connect_by_root test.id root
from test
start with test.parent_id = 0
connect by prior test.id = parent_id)
order by root,id
First of all, thanks to everyone who tried to help me.
In the end I changed my approach, because applying the hierarchical CONNECT BY clause to an inner query with multiple joins was not working for me.
I took the following approach:
Get the hierarchical data from the first table, i.e. the table with ID-ParentID, by querying table1 with CONNECT BY. This gives the IDs in the proper sequence.
Join the retrieved list of IDs.
Pass those IDs as a comma-separated string in the IN clause of a select query against the second table, the one with ID-ObjectID.
select * from table2 where ID in (above Joined string of ID) order by
instr('above Joined string of ID',ID);
ORDER BY INSTR did the magic. It orders the result by the position of each ID in the IN clause string, and that string is built by the hierarchical query, so the rows come out in hierarchical sequence.
Again, thanks all for the help!
Note: the above approach has one constraint. The IDs are passed as a comma-separated list in the IN clause, and Oracle limits an IN list to 1000 expressions (ORA-01795).
But I am sure the data in the first table will never be large enough to cross that limit, hence I chose the above approach.

oracle hierarchical query nocycle and connect by root

Can somebody explain the use of NOCYCLE and CONNECT_BY_ROOT in hierarchical queries in Oracle? Also, when we don't use START WITH, in what order do we get the rows? I mean, when we don't use START WITH we get a lot more rows. Can anybody explain NOCYCLE and CONNECT_BY_ROOT (how is it different from START WITH?) using the simple emp table? Thanks for the help.
If your data has a loop in it (A -> B -> A -> B ...), Oracle will throw an exception, ORA-01436: CONNECT BY loop in user data, if you run a hierarchical query. NOCYCLE instructs Oracle to return rows even if such a loop exists.
CONNECT_BY_ROOT gives you access to the root element, even several layers down in the query. Using the HR schema:
select level, employee_id, last_name, manager_id ,
connect_by_root employee_id as root_id
from employees
connect by prior employee_id = manager_id
start with employee_id = 100
     LEVEL EMPLOYEE_ID LAST_NAME                 MANAGER_ID    ROOT_ID
---------- ----------- ------------------------- ---------- ----------
         1         100 King                                        100
         2         101 Kochhar                          100        100
         3         108 Greenberg                        101        100
         4         109 Faviet                           108        100
...
Here, you see I started with employee 100 and started finding his employees. The CONNECT_BY_ROOT operator gives me access to King's employee_id even four levels down. I was very confused at first by this operator, thinking it meant "connect by the root element" or something. Think of it more like "the root of the CONNECT BY clause."
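On the START WITH part of the question: when START WITH is omitted, Oracle treats every row of the table as a root, which is why so many more rows come back. The sketch below runs against the same HR employees table and uses CONNECT_BY_ROOT to show which root each row belongs to; the root_id alias and the final ORDER BY are only there for readability.

```sql
-- No START WITH: every employee becomes the root of its own subtree.
select connect_by_root employee_id as root_id,
       level,
       employee_id,
       last_name
from employees
connect by prior employee_id = manager_id
order by root_id, level, employee_id;
```

Each employee appears once as a level 1 root and again inside the subtree of every manager above them. START WITH filters which rows may act as roots, while CONNECT_BY_ROOT only reports the root a given row was reached from.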
Here is how NOCYCLE is used in a query.
Suppose we have a simple table with columns r1 and r2 and these values:
first row: r1 = a, r2 = b
second row: r1 = b, r2 = a
Here a refers to b and b refers back to a.
Hence there is a loop, and if we write a hierarchical query such as
select r1 from table_name
start with r1='a'
connect by prior r2=r1;
we get the CONNECT BY loop error (ORA-01436).
Use NOCYCLE to let Oracle return results even if a loop exists:
select r1 from table_name
start with r1='a'
connect by nocycle prior r2=r1;
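To see where the loop actually is, NOCYCLE can be combined with the CONNECT_BY_ISCYCLE pseudocolumn, which is only allowed when NOCYCLE is specified. A minimal sketch against the same two-row table (the is_cycle alias is just illustrative):

```sql
select r1,
       r2,
       level,
       connect_by_iscycle as is_cycle  -- 1 when the row has a child that is also its ancestor
from table_name
start with r1 = 'a'
connect by nocycle prior r2 = r1;
```

With the two rows described above, the row for b should be flagged, because its child would be a again.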

PL SQL concatenate 2 resultsets

I need to get the result of concatenating the result sets of 2 similar queries. For some reason I had to split the original query in 2, each with its own ORDER BY clause. It should be something like this (an oversimplification of the original queries):
Query1: Select name, age from person where age=10
Resultset1:
Person1, 10
Person3, 10
Query2: Select name, age from person where age=20
Resultset2:
Person2, 20
Person6, 20
The expected result:
Person1, 10
Person3, 10
Person2, 20
Person6, 20
I cannot simply use Query1 UNION Query2.
Below are the 2 original queries:
(#1)
select cp.CP_ID, cpi.CI_DESCRIPCION, cp.CP_CODIGOJERARQUIZADO, cp.CP_ESGASTO as gasto, cp.CP_CONCEPTOPADRE, LEVEL
from TGCCP_ConceptoPagoIng cp
left join tgcci_ConceptoPagoIngIdioma cpi on cpi.CI_IDCONCEPTOPAGOING = cp.CP_ID and cpi.CI_IDIDIOMA = 1
start with ((CP_CONCEPTOPADRE is null) and (cp.CP_ESGASTO = 1))
connect by prior cp.CP_ID = cp.CP_CONCEPTOPADRE
order siblings by CP_CODIGOJERARQUIZADO
(#2)
select cp.CP_ID, cpi.CI_DESCRIPCION, cp.CP_CODIGOJERARQUIZADO, cp.CP_ESGASTO as gasto, cp.CP_CONCEPTOPADRE, LEVEL
from TGCCP_ConceptoPagoIng cp
left join tgcci_ConceptoPagoIngIdioma cpi on cpi.CI_IDCONCEPTOPAGOING = cp.CP_ID and cpi.CI_IDIDIOMA = 1
start with ((CP_CONCEPTOPADRE is null) and (cp.CP_ESGASTO = 2))
connect by prior cp.CP_ID = cp.CP_CONCEPTOPADRE
order siblings by CP_CODIGOJERARQUIZADO
I think you want a
select * from ( first query )
UNION ALL
select * from ( second query )
Where first query and second query are the queries from above, so you are turning them into subqueries, thus preserving the order by clauses.
OK, well, I'm not fully certain why you need it this way, but if Oracle won't allow you to do a UNION, or it screws up the ordering when you do, I would try creating a pipelined table function.
An example here
Basically, you'd create a procedure that ran both queries, first one, then the other, putting the results of each into the returned dataset.
It looks like you are looking for a MULTISET UNION, which can only be used from version 10 upwards.
Regards,
Rob.
You could combine your queries as subqueries and do a single order by on the outer query:
select * from (
<query 1 with its order by>
UNION ALL
<query 2 with its order by>
)
order by column1, column2;
Alternatively, you can implement in PL/SQL the equivalent of a sort merge join with two cursors, but that's unnecessarily complicated.
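A UNION ALL of two ordered subqueries often comes back in the expected order, but without an outer ORDER BY that order is not something to rely on. One hedged way to make it explicit, sketched here on the simplified person example, is to tag each branch and carry its row position along. The names branch and seq are made up, and ordering the inner queries by name is an assumption about what the real ORDER BY clauses do.

```sql
select name, age
from (
  -- first query: rownum records the position within its own ordering
  select q1.name, q1.age, 1 as branch, rownum as seq
  from (select name, age from person where age = 10 order by name) q1
  union all
  -- second query, tagged so it sorts after the first
  select q2.name, q2.age, 2 as branch, rownum as seq
  from (select name, age from person where age = 20 order by name) q2
)
order by branch, seq;
```

The same wrapping could be applied to the two hierarchical queries, since ROWNUM taken over an ordered inline view keeps the ORDER SIBLINGS BY sequence of each one.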
This solution works perfectly:
select * from ( first query )
UNION ALL
select * from ( second query )
I appreciate everyone who has taken the time to answer.
regards.
For your example:
Select name, age from person where age in (10,20)
or
Select name, age from person where age = 10 or age = 20
However I'm guessing this is not what you need :)
