oracle hierarchical query nocycle and connect by root - oracle

Can somebody explain use of nocycle and connect by root clauses in hierarchical queries in oracle, also when we dont use 'start with' what is the order we get the rows, i mean when we don't use 'start with' we get lot many rows, can anybody explain nocycle and connect by root(how is different than start with?) using simple emp table, Thanks for the help

If your data has a loop in it (A -> B -> A -> B ...), Oracle will throw an exception, ORA-01436: CONNECT BY loop in user data if you do a hierarchical query. NOCYCLE instructs Oracle to return rows even if such a loop exists.
CONNECT_BY_ROOT gives you access to the root element, even several layers down in the query. Using the HR schema:
select level, employee_id, last_name, manager_id ,
connect_by_root employee_id as root_id
from employees
connect by prior employee_id = manager_id
start with employee_id = 100
LEVEL EMPLOYEE_ID LAST_NAME MANAGER_ID ROOT_ID
---------- ----------- ------------------------- ---------- ----------
1 100 King 100
2 101 Kochhar 100 100
3 108 Greenberg 101 100
4 109 Faviet 108 100
...
Here, you see I started with employee 100 and started finding his employees. The CONNECT_BY_ROOT operator gives me access to King's employee_id even four levels down. I was very confused at first by this operator, thinking it meant "connect by the root element" or something. Think of it more like "the root of the CONNECT BY clause."

Here is about nocycle use in query.
Suppose we have a simple table
with r1 and r2 column names and the values for
first row r1=a,r2=b
and second row r1=b,r2=a
Now we know a refers to b and b refers back to a .
Hence there is a loop and if we write a hierarchical query as
select r1 from table_name
start with r1='a'
connect by prior r2=r1;
we get connect by loop error
Hence use nocycle to allow oracle to give results even if loop exists.
Hence the query
select r1 from table_name
start with r1='a'
connect by nocycle prior r2=r1;

Related

How to write correct left Join of two tables?

I want to join two tables, first table primary key data type is number, and second table primary key data type is VARCHAR2(30 BYTE). How to join both tables.
I tried this code but second tables all values are null. why is that?
SELECT a.act_phone_no,a.act_actdevice,a.bi_account_id, a.packag_start_date, c.identification_number,
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice
first table
act_phone_no bi_account_id
23434 45345
34245 43556
Second table
msisdn_voice identification_number
23434 321113
34245 6547657
It seems that you didn't tell us everything. Query works, if correctly written, on such a sample data:
SQL> with
2 -- Sample data
3 activations_pop (act_phone_no, bi_account_id) as
4 (select 23434, 45345 from dual union all
5 select 34245, 43556 from dual
6 ),
7 customer (msisdn_voice, identification_number) as
8 (select '23434', 321113 from dual union all
9 select '34245', 6547657 from dual
10 )
11 -- query works OK
12 select a.act_phone_no,
13 a.bi_account_id,
14 c.identification_number
15 from activations_pop a join customer c on to_char(a.act_phone_no) = c.msisdn_voice;
ACT_PHONE_NO BI_ACCOUNT_ID IDENTIFICATION_NUMBER
------------ ------------- ---------------------
23434 45345 321113
34245 43556 6547657
SQL>
What could be wrong? Who knows. If you got some result but columns from the CUSTOMER table are empty (NULL?), then they really might be NULL, or you didn't manage to join rows on those columns (left/right padding with spaces?). Does joining on e.g.
on to_char(a.act_phone_no) = trim(c.msisdn_voice)
or
on a.act_phone_no = to_number(c.msisdn_voice)
help?
Consider posting proper test case (CREATE TABLE and INSERT INTO statements).
You are using Oracle ?
Please check the below demo
SELECT a.act_phone_no, a.bi_account_id, c.identification_number
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice;
SQLFiddle

Translate hierarchical Oracle query to DB2 query

I work primarily with SAS and Oracle and am still new to DB2. Im faced with needing a hierarchical query to separate a clob into chunks that can be pulled into sas. SAS has a limit of 32K for character variables so I cant just pull the dataset in normally.
I found an old stackoverflow question about the best way to pull a clob into a sas data set but it is written in Oracle.
Import blob through SAS from ORACLE DB
Since I am new to DB2 and the syntax for this type of join seems very different I was hoping to find someone that could help convert it and explain the syntax. I find the Oracle syntax to be much easier to understand. I'm not sure in DB2 if you would use a CTE recursion like this https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_xmprecursivecte.html or if you would use hierarchical queries like this https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/sqlp/rbafyrecursivequeries.htm
Here is the Oracle query.
SELECT
id
, level as chunk_id
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') as clob_chunk
FROM (
SELECT id, clob_value
FROM schema.table
WHERE id = 1
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
order by id, chunk_id;
The table has two fields the id and the clob_value and would look like this.
ID CLOB_VALUE
1 really large clob
2 medium clob
3 another large clob
The thought is I would want this result. I would only ever be doing this one row at a time where id= which ever row I am processing.
ID CHUNK_ID CLOB
1 1 clob_chunk1of3
1 2 clob_chunk2of3
1 3 clob_chunk3of3
Thanks for any time spent reading and helping.
Here is a solution that should work in DB2 with few changes (but please be advised that I don't know DB2 at all; I am just using Oracle features that are in the SQL Standard, so they should be implemented identically - or almost so - in DB2).
Below I create a table with your sample data; then I show how to chunk it into substrings of length at most 8 characters. Although the strings are short, I defined the column as CLOB and I am using CLOB tools; this should work on much larger CLOBs.
You can make both the chunk size and the id into bind parameters, if needed. In my demo below I hardcoded the chunk size and I show the result for all IDs in the table. In case the CLOB is NULL, I do return one chunk (which is NULL, of course).
Note that touching CLOBs in a query is very expensive; so most of the work is done without touching the CLOBs. I only work on them as little as possible.
PREP WORK
drop table tbl purge; -- If needed
create table tbl (id number, clob_value clob);
insert into tbl (id, clob_value)
select 1, 'really large clob' from dual union all
select 2, 'medium clob' from dual union all
select 3, 'another large clob' from dual union all
select 4, null from dual -- added to check handling
;
commit;
QUERY
with
prep(id, len) as (
select id, dbms_lob.getlength(clob_value)
from tbl
)
, rec(id, len, ord, pos) as (
select id, len, 1, 1
from prep
union all
select id, len, ord + 1, pos + 8
from rec
where len >= pos + 8
)
select id, ord, dbms_lob.substr(clob_value, 8, pos)
from tbl inner join rec using (id)
order by id, ord
;
ID ORD CHUNK
---- ---- --------
1 1 really l
1 2 arge clo
1 3 b
2 1 medium c
2 2 lob
3 1 another
3 2 large cl
3 3 ob
4 1
Another option is to enable the Oracle compatibility in Db2 and just issue the hierarchical query.
This GitHub repository has background information on SQL recursion in DB2, including the Oracle-style syntax and a side by side example (both work against the Db2 sample database):
-- both queries are against the SAMPLE database
-- and should return the same result
SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
START WITH DEPTNO = 'A00'
CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT;
WITH tdep(level, deptname, deptno) as (
SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
FROM department
WHERE DEPTNO = 'A00'
UNION ALL
SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL * 4) || '/' || d.DEPTNAME
AS VARCHAR(40)) AS DEPTNAME, d.deptno
FROM DEPARTMENT d, tdep t
WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep;

Oracle nested CONNECT BY clauses causing poor performance

The query below is taking about a minute. I believe the poor performance is caused by the two "IN (SELECT..." clauses. I have a table of terms where one may be connected to another via the term_relationship table. These relationships can be recursive, e.g. dog is a type of mammal, mammal is a type of animal. This recursion could be any depth but probably not more than ~10 levels. I am trying to select all terms which have a (potentially recursive) relationship with type A and have a (potentially recursive) relationship with type B. I think that replacing the two "IN (SELECT..." clauses with restrictions on the outer query would improve performance but cannot figure out how to do this using the CONNECT BY clauses. Can anyone help with this?
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id = 123
CONNECT BY NOCYCLE PRIOR term_id = related_term_id)
AND term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id = 456
CONNECT BY NOCYCLE PRIOR term_id = related_term_id)
Instead of doing the same CONNECT BY query twice with only differing start values how about you do it once by providing both start values to one instance of the subquery. This change will get you all the term_ids related to either of your starting values, however, you want only those term_ids related to both of your starting values. To get that, you then need to group the results by by term_id and limit to those results having a count of more than one:
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id in (123, 456)
CONNECT BY NOCYCLE PRIOR term_id = related_term_id
group by term_id having count(*) >= 2)
Edit
With the above code I made an assumption about your data that may not be correct. I assumed a tree like structure where you were starting with nodes on a branch and traveling up towards the root like in diagram A, however, if your data looks like diagram B, then the above query will fail if you start at nodes 7 and 9 as node 7 has two paths back to node 1, and the above query would return node 1 twice, thereby misidentifying it as a common node.
A) -(1)- B) -(1)-
/ | \ (8) / | \ (8)
(2) | (3) | (2) | (3) |
| (4) | (9) | (4) | (9)
(5) (6) (5) (6)
| \ /
(7) -(7)-
The query below corrects for this and will correctly identify that for starting nodes 7 and 9 no nodes are in common, however, with starting nodes 7 and 4 node 1 is identified as a common node:
SELECT term_name
FROM term
WHERE term_id IN
(SELECT term_id
FROM term_relationship
START WITH related_term_id in (123, 456)
CONNECT BY NOCYCLE PRIOR term_id = related_term_id
group by term_id
having count(distinct connect_by_root related_term_id) >= 2)

Oracle sql query returning all of the values instead limiting the values

full disclosure this is part of a homework question but I have tried 6 different versions and I am stuck.
I am trying for find 1 manager every time the query runs. I.e I put the department id in and 1 name pops out. currently, I get all the names, multiple times. I have tried a nesting with an '=' not nesting, union, intersection, etc. I can get the manager id with a basic query, I just can't get the name. the current version looks like this:
select e.ename
from .emp e
where d.managerid in (select unique d.managerid
from works w, .dept d, emp e1
where d.did=1 and e1.eid=w.eid and d.did=w.did );
I realize its probably a really basic mistake that I am not seeing - any ideas?
Its not clear what do you mean get 1 menager any time. are it should be different menagers any time or the same?
Lest go throw your query:
you select all empolyes from table emp where manager_id in next query dataset
You get all managers for dep=1. The rest of tables and conditions are not influent on result dataset.
I theing did is primary key for table dept, If so your query may be rewritten to
select e.ename
from emp e
where d.managerid in (select unique d.managerid
from dept d
where d.did=1);
but this query return to you all emploees and not manager from dept=1
and if you need a manager. you should get emploee who is a manager. If eid is primary key of employee, and managerid is id from employee table you need something like:
select e.ename
from emp e
where e1.eid in (select unique d.managerid
from dept d
where d.did=1);

Oracle Self-Join on multiple possible column matches - CONNECT BY?

I have a query requirement from ----. Trying to solve it with CONNECT BY, but can't seem to get the results I need.
Table (simplified):
create table CSS.USER_DESC (
USER_ID VARCHAR2(30) not null,
NEW_USER_ID VARCHAR2(30),
GLOBAL_HR_ID CHAR(8)
)
-- USER_ID is the primary key
-- NEW_USER_ID is a self-referencing key
-- GLOBAL_HR_ID is an ID field from another system
There are two sources of user data (datafeeds)... I have to watch for mistakes in either of them when updating information.
Scenarios:
A user is given a new User ID... The old record is set accordingly and deactivated (typically a rename for contractors who become fulltime)
A user leaves and returns sometime later. HR fails to send us the old user ID so we can connect the accounts.
The system screwed up and didn't set the new User ID on the old record.
The data can be bad in a hundred other ways
I need to know the following are the same user, and I can't rely on name or other fields... they differ among matching records:
ROOTUSER NUMROOTS NODELEVEL ISLEAF USER_ID NEW_USER_ID GLOBAL_HR_ID USERTYPE LAST_NAME FIRST_NAME
-----------------------------------------------------------------------------------------------------------------------------
EX0T1100 2 1 0 EX0T1100 EX000005 CONTRACTOR VON DER HAAVEN VERONICA
EX0T1100 2 2 1 EX000005 00126121 EMPLOYEE HAAVEN, VON DER VERONICA
GL110456 1 1 1 GL110456 00126121 EMPLOYEE VONDERHAAVEN VERONICA
EXOT1100 and EX000005 are connected properly by the NEW_USER_ID field. The rename occurred before there were global HR IDs, so EX0T1100 doesn't have one. EX000005 was given a new user ID, 'GL110456', and the two are only connected by having the same global HR ID.
Cleaning up the data isn't an option.
The query so far:
select connect_by_root cud.user_id RootUser,
count(connect_by_root cud.user_id) over (partition by connect_by_root cud.user_id) NumRoots,
level NodeLevel, connect_by_isleaf IsLeaf, --connect_by_iscycle IsCycle,
cud.user_id, cud.new_user_id, cud.global_hr_id,
cud.user_type_code UserType, ccud.last_name, cud.first_name
from css.user_desc cud
where cud.user_id in ('EX000005','EX0T1100','GL110456')
-- Using this so I don't get sub-users in my list of root users...
-- It complicates the matches with GLOBAL_HR_ID, however
start with cud.user_id not in (select cudsub.new_user_id
from css.user_desc cudsub
where cudsub.new_user_id is not null)
connect by nocycle (prior new_user_id = user_id);
I've tried various CONNECT BY clauses, but none of them are quite right:
-- As a multiple CONNECT BY
connect by nocycle (prior global_hr_id = global_hr_id)
connect by nocycle (prior new_user_id = user_id)
-- As a compound CONNECT BY
connect by nocycle ((prior new_user_id = user_id)
or (prior global_hr_id = global_hr_id
and user_id != prior user_Id))
UNIONing two CONNECT BY queries doesn't work... I don't get the leveling.
Here is what I would like to see... I'm okay with a resultset that I have to distinct and use as a subquery. I'm also okay with any of the three user IDs in the ROOTUSER column... I just need to know they're the same users.
ROOTUSER NUMROOTS NODELEVEL ISLEAF USER_ID NEW_USER_ID GLOBAL_HR_ID USERTYPE LAST_NAME FIRST_NAME
-----------------------------------------------------------------------------------------------------------------------------
EX0T1100 3 1 0 EX0T1100 EX000005 CONTRACTOR VON DER HAAVEN VERONICA
EX0T1100 3 2 1 EX000005 00126121 EMPLOYEE HAAVEN, VON DER VERONICA
EX0T1100 3 (2 or 3) 1 GL110456 00126121 EMPLOYEE VONDERHAAVEN VERONICA
Ideas?
Update
Nicholas, your code looks very much like the right track... at the moment, the lead(user_id) over (partition by global_hr_id) gets false hits when the global_hr_id is null. For example:
USER_ID NEW_USER_ID CHAINNEWUSER GLOBAL_HR_ID LAST_NAME FIRST_NAME
FP004468 FP004469 AARON TIMOTHY
FP004469 FOONG KOK WAH
I've often wanted to treat nulls as separate records in a partition, but I've never found a way to make ignore nulls work. This did what I wanted:
decode(global_hr_id,null,null,lead(cud.user_id ignore nulls) over (partition by global_hr_id order by user_id)
... but there's got to be a better way. I haven't been able to get the query to finish yet on the full-blown user data (about 40,000 users). Both global_hr_id and new_user_id are indexed.
Update
The query returns after about 750 seconds... long, but manageable. It returns 93k records, because I don't have a good way of filtering level 2 hits out of the root - you have start with global_hr_id is null, but unfortunately, that isn't always the case. I'll have to think some more about how to filter those out.
I've tried adding more complex start with clauses before, but I find that separately, they run < 1 second... together, they take 90 minutes >.<
Thanks again for you help... plodding away at this.
You have provided sample of data for only one user. Would be better to have a little bit more. Anyway, lets look at something like this.
SQL> with user_desc(USER_ID, NEW_USER_ID, GLOBAL_HR_ID)as(
2 select 'EX0T1100', 'EX000005', null from dual union all
3 select 'EX000005', null, 00126121 from dual union all
4 select 'GL110456', null, 00126121 from dual
5 )
6 select connect_by_root(user_id) rootuser
7 , count(connect_by_root(user_id)) over(partition by connect_by_root(user_id)) numroot
8 , level nodlevel
9 , connect_by_isleaf
10 , user_id
11 , new_user_id
12 , global_hr_id
13 from (select user_id
14 , coalesce(new_user_id, usr) new_user_id1
15 , new_user_id
16 , global_hr_id
17 from ( select user_id
18 , new_user_id
19 , global_hr_id
20 , decode(global_hr_id,null,null,lead(user_id) over (partition by global_hr_id order by user_id)) usr
21 from user_desc
22 )
23 )
24 start with global_hr_id is null
25 connect by prior new_user_id1 = user_id
26 ;
Result:
ROOTUSER NUMROOT NODLEVEL CONNECT_BY_ISLEAF USER_ID NEW_USER_ID GLOBAL_HR_ID
-------- ---------- ---------- ----------------- -------- ----------- ------------
EX0T1100 3 1 0 EX0T1100 EX000005
EX0T1100 3 2 0 EX000005 126121
EX0T1100 3 3 1 GL110456 126121

Resources