How to Choose a specific value from a table and to avoid duplicates? - algorithm

I have two tables:
MainTable
id AccountNum status
1 11001 active
2 11002 active
3 11003 active
4 11004 active
AddTable
id date description
1 01.2020 ACCOUNT.SET
1 02.2020 ACCOUNT.CHANGE
1 03.2020 ACCOUNT.REMOVE
2 04.2020 ACCOUNT.SET
2 05.2020 ACCOUNT.CHANGE
3 08.2020 ACCOUNT.SET
4 05.2020 ACCOUNT.SET
4 09.2020 ACCOUNT.REMOVE
I need to get a such result:
EffectiveFrom is date when Account was set,
EffectiveTo is date when Account was removed
id AccountNum EffectiveFrom EffectiveTo
1 11001 01.2020 03.2020
2 11002 04.2020 null
3 11003 08.2020 null
4 11004 05.2020 09.2020
The problem is that after joining on AddTable I get the duplicates, but I need just one row on every Id and only dates where the description in ACCOUNT.SET,ACCOUNT.REMOVE.

Are you looking for left join?
select m.id as id,
m.AccountNum as AccountNum,
a.date as EffectiveFrom,
b.date as EffectiveTo
from MainTable m left join
AddTable a on (a.id = m.id and a.description = 'ACCOUNT.SET') left join
AddTable b on (b.id = m.id and b.description = 'ACCOUNT.REMOVE')
order by m.AccountNum

Use a PIVOT and a LEFT OUTER JOIN:
SELECT m.id,
a.EffectiveFrom,
a.EffectiveTo
FROM MainTable m
LEFT OUTER JOIN
(
SELECT *
FROM AddTable
PIVOT( MAX( dt ) FOR description IN (
'ACCOUNT.SET' AS EffectiveFrom,
'ACCOUNT.REMOVE' AS EffectiveTo
) )
) a
ON ( a.id = m.id )
ORDER BY m.id
So for your test data:
CREATE TABLE MainTable ( id, AccountNum, status ) AS
SELECT 1, 11001, 'active' FROM DUAL UNION ALL
SELECT 2, 11002, 'active' FROM DUAL UNION ALL
SELECT 3, 11003, 'active' FROM DUAL UNION ALL
SELECT 4, 11004, 'active' FROM DUAL;
CREATE TABLE AddTable ( id, dt, description ) AS
SELECT 1, DATE '2020-01-01', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-02', 'ACCOUNT.CHANGE' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-03', 'ACCOUNT.REMOVE' FROM DUAL UNION ALL
SELECT 2, DATE '2020-01-04', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 2, DATE '2020-01-05', 'ACCOUNT.CHANGE' FROM DUAL UNION ALL
SELECT 3, DATE '2020-01-08', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 4, DATE '2020-01-05', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 4, DATE '2020-01-09', 'ACCOUNT.REMOVE' FROM DUAL;
This outputs:
ID | EFFECTIVEFROM | EFFECTIVETO
-: | :------------ | :----------
1 | 01-JAN-20 | 03-JAN-20
2 | 04-JAN-20 | null
3 | 08-JAN-20 | null
4 | 05-JAN-20 | 09-JAN-20
db<>fiddle here

Related

split data in a single column into multiple columns in oracle

my_table :
Name
Value
item_1
AB
item_2
2
item_3
B1
item_1
CD
item_1
EF
item_2
3
item_3
B2
item_4
ZZ
required output:
item_1
item_2
item_3
item_4
AB
2
B1
ZZ
CD
3
B2
NULL
EF
NULL
NULL
NULL
SQL query :
with item_1 as (select value from my_table where name = 'item_1'),
item_2 as (select value from my_table where name = 'item_2'),
item_3 as (select value from my_table where name = 'item_3'),
item_4 as (select value from my_table where name = 'item_4')
select item_1.value, item_2.value,item_3.value, item_4.value from item_1 cross join item_2 cross join item_3 cross join item_4;
If I am using pivot along with MAX aggregate function, the query will display only max values of the corresponding items instead of displaying all the values.
Is there any way to split a single column into multiple columns(using where condition as mentioned in the above query) without cross join.
Use ROW_NUMBER and then PIVOT:
SELECT item_1,
item_2,
item_3,
item_4
FROM (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY ROWNUM) AS rn
FROM table_name t
)
PIVOT (
MAX(value) FOR name IN (
'item_1' AS item_1,
'item_2' AS item_2,
'item_3' AS item_3,
'item_4' AS item_4
)
)
Which, for the sample data:
CREATE TABLE table_name (Name, Value) AS
SELECT 'item_1', 'AB' FROM DUAL UNION ALL
SELECT 'item_2', '2' FROM DUAL UNION ALL
SELECT 'item_3', 'B1' FROM DUAL UNION ALL
SELECT 'item_1', 'CD' FROM DUAL UNION ALL
SELECT 'item_1', 'EF' FROM DUAL UNION ALL
SELECT 'item_2', '3' FROM DUAL UNION ALL
SELECT 'item_3', 'B2' FROM DUAL UNION ALL
SELECT 'item_4', 'ZZ' FROM DUAL;
Outputs:
ITEM_1
ITEM_2
ITEM_3
ITEM_4
AB
2
B1
ZZ
CD
3
B2
null
EF
null
null
null
db<>fiddle here
How about this?
DF column is calculated by row_number analytic function which partitions by each name (and sorts by value). It is ignored from the final column list, but its role is crucial in the GROUP BY clause.
SQL> with test (name, value) as
2 (select 'item_1', 'AB' from dual union all
3 select 'item_2', '2' from dual union all
4 select 'item_3', 'B1' from dual union all
5 select 'item_1', 'CD' from dual union all
6 select 'item_1', 'EF' from dual union all
7 select 'item_2', '3' from dual union all
8 select 'item_3', 'B2' from dual union all
9 select 'item_4', 'ZZ' from dual
10 ),
11 temp as
12 (select name, value,
13 row_number() over (partition by name order by value) df
14 from test
15 )
16 select
17 max(case when name = 'item_1' then value end) item_1,
18 max(case when name = 'item_2' then value end) item_2,
19 max(case when name = 'item_3' then value end) item_3,
20 max(case when name = 'item_4' then value end) item_4
21 from temp
22 group by df;
ITEM_1 ITEM_2 ITEM_3 ITEM_4
------ ------ ------ ------
AB 2 B1 ZZ
CD 3 B2
EF
SQL>

Oracle query to keep looking until value is not 0 anymore

I am using Oracle 11.
I have 2 tables
TblA with columns id, entity_id and effective_date.
TblADetail with columns id and value.
If Value = 0 for the effective date, I want to keep looking for the next effective date until I found value <> 0 anymore.
The below query only look for value on 3/10/21.
If value = 0, I want to look for value on 3/11/21. If that's not 0, I want to stop.
But, if that's 0, I want to look for value on 3/12/21. If that's not 0, I want to stop.
But, if that's 0, I want to keep looking until value is not 0.
How can I do that ?
SELECT SUM(pd.VALUE)
FROM TblA p,TblADetail pd
WHERE p.id = pd.id
AND p.effective_date = to_date('03/10/2021','MM/DD/YYYY')
AND TRIM (p.entity_id) = 123
Sample data:
TblA
id entity_id effective_date
1 123 3/10/21
2 123 3/11/21
3 123 3/12/21
TblADetail
id value
1 -136
1 136
2 2000
3 3000
In the above data, for entity_id 123, starting from effective_date 3/10/21, I would like to to return value 2000 (from TblADetail) effective_date 3/11/21.
So, starting from a certain date, I want the results from the minimum date that has non-zero values.
Thank you.
You can do what you need to do by grouping the sum on the effective date, and using the MIN analytic function to find the earliest date. Once you've done that, you simply need to select the date that matches the earliest date.
E.g.:
with tbla as (select 1 id, ' 123' entity_id, to_date('10/03/2021', 'dd/mm/yyyy') effective_date from dual union all
select 2 id, ' 123' entity_id, to_date('11/03/2021', 'dd/mm/yyyy') effective_date from dual union all
select 3 id, ' 123' entity_id, to_date('12/03/2021', 'dd/mm/yyyy') effective_date from dual),
tbla_detail as (select 1 id, -136 value from dual union all
select 1 id, 136 value from dual union all
select 2 id, 2000 value from dual union all
select 3 id, 3000 value from dual),
results as (select a.effective_date,
sum(ad.value) sum_value,
min(case when sum(ad.value) != 0 then a.effective_date end) over () min_effective_date
from tbla a
inner join tbla_detail ad on a.id = ad.id
where a.effective_date >= to_date('10/03/2021', 'dd/mm/yyyy')
and trim(a.entity_id) = '123'
group by a.effective_date)
select sum_value
from results
where effective_date = min_effective_date;
SUM_VALUE
----------
2000
Straightforward; read comments within code. Sample data in lines #1 - 13, query begins at line #14.
SQL> with
2 -- sample data
3 tbla (id, entity_id, effective_date) as
4 (select 1, 123, date '2021-03-10' from dual union all
5 select 2, 123, date '2021-03-11' from dual union all
6 select 3, 123, date '2021-03-12' from dual
7 ),
8 tblb (id, value) as
9 (select 1, -136 from dual union all
10 select 1, 136 from dual union all
11 select 2, 2000 from dual union all
12 select 3, 3000 from dual
13 ),
14 tblb_temp as
15 -- simple grouping per ID
16 (select id, sum(value) value
17 from tblb
18 group by id
19 )
20 -- return TBLA values whose ID equals TBLB_TEMP's minimum ID
21 -- whose value isn't zero
22 select a.id, a.entity_id, a.effective_date
23 from tbla a
24 where a.id = (select min(b.id)
25 from tblb_temp b
26 where b.value > 0
27 );
ID ENTITY_ID EFFECTIVE_
---------- ---------- ----------
2 123 03/11/2021
SQL>

Query to join Oracle tables

I have a 'MASTER' table as shown below:
Entity Cat1 Cat2
A Mary;Steve Jacob
B Alex;John Sally;Andrew
Another table 'PERSON' has associations of person's name (this could be InfoID as well) with emails.
Name Email InfoID
Mary maryD#gmail.com mryD
Steve steveR#gmail.com stvR
Jacob jacobB#gmail.com jacbb
Sally sallyD#gmail.com sallD
Alex AlexT#gmail.com alexT
John JohnP#gmail.com johP
Andrew AndrewV#gmail.com andV
I want to join the person table with master such as:
Entity Cat1 EmailCat1 Cat2 EmailCat2
A Mary;Steve maryD#gmail.com;steveR#gmail.com Jacob jacobB#gmail.com
B Alex;John AlexT#gmail.com;JohnP#gmail.com Sally;Andrew sallyD#gmail.com;AndrewV#gmail.com
any insights on how to go about it?
Honestly, your master table design needs to be normalized. But, in the meantime you could try this query below :
with
needed_rows_for_cat1_tab (lvl) as (
select level from dual
connect by level <= (select max(regexp_count(Cat1, ';')) from Your_bad_master_tab) + 1
)
, needed_rows_for_cat2_tab (lvl) as (
select level from dual
connect by level <= (select max(regexp_count(Cat2, ';')) from Your_bad_master_tab) + 1
)
, split_cat1_val_tab as (
select Entity, Cat1
, substr(Cat1||';'
, lag(pos, 1, 0)over(partition by Entity order by lvl) + 1
, pos - lag(pos, 1, 0)over(partition by Entity order by lvl) - 1
) val
, lvl
, pos
, 1 cat
from (
select Entity, Cat1, instr(Cat1||';', ';', 1, r1.lvl)pos, r1.lvl
from Your_bad_master_tab c1
join needed_rows_for_cat1_tab r1 on r1.lvl <= regexp_count(Cat1, ';') + 1
)
)
, split_cat2_val_tab as (
select Entity, Cat2
, substr(Cat2||';'
, lag(pos, 1, 0)over(partition by Entity order by lvl) + 1
, pos - lag(pos, 1, 0)over(partition by Entity order by lvl) - 1
) val
, lvl
, pos
, 2 cat
from (
select Entity, Cat2, instr(Cat2||';', ';', 1, r2.lvl)pos, r2.lvl
from Your_bad_master_tab c1
join needed_rows_for_cat2_tab r2 on r2.lvl <= regexp_count(Cat2, ';') + 1
)
)
select ENTITY
, max(decode(cat, 1, CAT1, null)) CAT1
, listagg(decode(cat, 1, EMAIL, null), ';')within group (order by lvl) EmailCat1
, max(decode(cat, 2, CAT1, null)) CAT2
, listagg(decode(cat, 2, EMAIL, null), ';')within group (order by lvl) EmailCat2
from (
select c.*, p.Email
from split_cat1_val_tab c join Your_person_tab p on (c.val = p.name)
union all
select c.*, p.Email
from split_cat2_val_tab c join Your_person_tab p on (c.val = p.name)
)
group by ENTITY
;
Here are your sample data
--drop table Your_bad_master_tab purge;
create table Your_bad_master_tab (Entity, Cat1, Cat2) as
select 'A', 'Mary;Steve', 'Jacob' from dual union all
select 'B', 'Alex;John', 'Sally;Andrew' from dual
;
--drop table Your_person_tab purge;
create table Your_person_tab (Name, Email, InfoID) as
select 'Mary', 'maryD#gmail.com' ,'mryD' from dual union all
select 'Steve', 'steveR#gmail.com' ,'stvR' from dual union all
select 'Jacob', 'jacobB#gmail.com' ,'jacbb' from dual union all
select 'Sally', 'sallyD#gmail.com' ,'sallD' from dual union all
select 'Alex', 'AlexT#gmail.com' ,'alexT' from dual union all
select 'John', 'JohnP#gmail.com' ,'johP' from dual union all
select 'Andrew', 'AndrewV#gmail.com' ,'andV' from dual
;
DB<>fiddle

oracle- JOIN 2 tables with 2 ID's in common

table "team1" :
id country
1 India
2 Pakistan
3 srilanka
4 England
table "team2" :
id name name2
1 2 4
2 1 3
i have to combine two tables
another table when retrieve the data that time in place of 2 , 4 Pakistan,England
It is about the self join of team1 table (lines #15 and 16):
SQL> with
2 team1 (id, country) as
3 (select 1, 'India' from dual union all
4 select 2, 'Pakistan' from dual union all
5 select 3, 'Sri Lanka' from dual union all
6 select 4, 'England' from dual
7 ),
8 team2 (id, name, name2) as
9 (select 1, 2, 4 from dual union all
10 select 2, 1, 3 from dual
11 )
12 select b.id,
13 t1.country,
14 t2.country
15 from team2 b join team1 t1 on t1.id = b.name
16 join team1 t2 on t2.id = b.name2
17 order by b.id;
ID COUNTRY COUNTRY
---------- --------- ---------
1 Pakistan England
2 India Sri Lanka
SQL>
Just showing another way of writing the same query with aggregate functions and grouping.
with
team1 (id, country) as
(select 1, 'India' from dual union all
select 2, 'Pakistan' from dual union all
select 3, 'Sri Lanka' from dual union all
select 4, 'England' from dual
),
team2 (id, name, name2) as
(select 1, 2, 4 from dual union all
select 2, 1, 3 from dual
)
SELECT
T2.ID,
MAX(CASE
WHEN T2.NAME = T1.ID THEN T1.COUNTRY
END) AS TEAM1,
MAX(CASE
WHEN T2.NAME2 = T1.ID THEN T1.COUNTRY
END) AS TEAM2
FROM
TEAM2 T2
JOIN TEAM1 T1 ON T1.ID IN (
T2.NAME,
T2.NAME2
)
GROUP BY
T2.ID
ORDER BY
T2.ID;
Output:
ID TEAM1 TEAM2
---------- --------- ---------
1 Pakistan England
2 India Sri Lanka
Cheers!!

calculate the average time difference between each stage

How to calculate the average time difference between each stage.
The challenge with the actual data set is not every id will go through all stages.. some will skip stages and the date is not continuous for all Id's like below.
id date status
1 1/1/18 requirement
1 1/8/18 analysis
1 ? design
1 1/30/18 closed
2 2/1/18 requirement
2 2/18/18 closed
3 1/2/18 requirement
3 1/29/18 analysis
3 ? accepted
3 2/5/18 closed
?--we have missing dates as well
Expected output
id date status time_spent
1 1/1/18 requirement 0
1 1/8/18 analysis 7
1 ? design
1 1/30/18 closed 22
2 2/1/18 requirement 0
2 2/18/18 closed 17
3 1/2/18 requirement 0
3 1/29/18 analysis 27
3 ? accepted
3 2/5/18 closed 24
status avg(timespent)
requirement 0
analysis 17
design
closed 21
You can use windowing functions LAG (or LEAD) to get the data of the previous (or next) status for each id. That will let you compute the time elapsed in each stage. Then, compute the average time elapsed for each stage.
Here is an example of how to do that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, dte-nvl(lag(dte ignore nulls) over ( partition by id order by dte ), dte) elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+-------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+-------------------------------------------+
| requirement | 0 |
| analysis | 17 |
| design | |
| accepted | |
| closed | 15.33333333333333333333333333333333333333 |
+-------------+-------------------------------------------+
As I said in my comment, that logic computes the elapsed days as the time to the given status from the prior status. Since, "requirement" has no prior status, this logic will always show zero days spent in requirements. It would probably be better to compute the time from the given status to the next status. For "closed", there would be no next status. You could just leave that blank or use SYSDATE as the data of the next status. Here is an example of that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, nvl(lead(dte ignore nulls) over ( partition by id order by dte ), trunc(sysdate))-dte elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+------------------------------------------+
| requirement | 17 |
| analysis | 14.5 |
| design | |
| accepted | |
| closed | 361.666666666666666666666666666666666667 |
+-------------+------------------------------------------+
I agree with #MatthewMcPeak. Your requirements seem a bit odd: you spend zero days of requirement stage but spend an average of 21 days on closed? Fnord.
This solution treats the presented date as the start date of the stage and calculates the difference between it and the start_date of the next phase.
with cte as (
select status
, lead(dd ignore nulls) over (partition by id order by dd) - dd as dt_diff
from your_table)
select status, avg(dt_diff) as avg_ela
from cte
group by status
/
If you wish to include all stages for each d and estimate the time spent in each (using linear interpolation) then you can create a sub-query with all the statuses and use a PARTITION OUTER JOIN to join them and then use LAG and LEAD to find the date range the status is in and interpolate between:
Oracle Setup:
CREATE TABLE data ( d, dt, status ) AS
SELECT 1, TO_DATE( '1/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/8/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/30/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/18/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/2/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/29/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '2/5/18', 'MM/DD/YY' ), 'closed' FROM DUAL;
Query:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT d,
dt,
status,
( next_dt - recent_dt ) / (next_id - recent_id ) AS estimated_duration
FROM ranges;
Output:
D | DT | STATUS | ESTIMATED_DURATION
-: | :-------- | :---------- | ---------------------------------------:
1 | 01-JAN-18 | requirement | 7
1 | 08-JAN-18 | analysis | 7.33333333333333333333333333333333333333
1 | null | design | 7.33333333333333333333333333333333333333
1 | null | accepted | 7.33333333333333333333333333333333333333
1 | 30-JAN-18 | closed | 0
2 | 01-FEB-18 | requirement | 4.25
2 | null | analysis | 4.25
2 | null | design | 4.25
2 | null | accepted | 4.25
2 | 18-FEB-18 | closed | 0
3 | 02-JAN-18 | requirement | 27
3 | 29-JAN-18 | analysis | 2.33333333333333333333333333333333333333
3 | null | design | 2.33333333333333333333333333333333333333
3 | null | accepted | 2.33333333333333333333333333333333333333
3 | 05-FEB-18 | closed | 0
Query 2:
Then of you can easily change that to take the average for each status:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT status,
AVG( ( next_dt - recent_dt ) / (next_id - recent_id ) ) AS estimated_duration
FROM ranges
GROUP BY status, id
ORDER BY id;
Results:
STATUS | ESTIMATED_DURATION
:---------- | ---------------------------------------:
requirement | 12.75
analysis | 4.63888888888888888888888888888888888889
design | 4.63888888888888888888888888888888888889
accepted | 4.63888888888888888888888888888888888889
closed | 0
db<>fiddle here

Resources