ROW_NUMBER over PARTITION BY restart row counter between breaks [closed] - oracle

I have a list of activities ordered by user, date and time of activity, and ID. I want to number the rows within each group defined by those same fields. The following code gets most of the way there. However, when the same ID is repeated at a later time, I need the row number count to restart instead of continuing from the previous iteration.
Here's my code:
ROW_NUMBER() OVER (PARTITION BY USER_ID, foc_id ORDER BY USER_ID, to_char(activity_date, 'MM/DD/YYYY HH24:MI:SS'), foc_id) seq_nbr
In the image below, we see that FOC_ID "A240" had activity around 2:20PM. Then FOC_ID "B410" had activity around 3:19PM, lastly the user returned to "A240" for additional activity around 3:20. Because there was activity between the first and second sequence of events of "A240," I need the row number (seq_nbr) to restart instead of continuing from the previous activity.

You can use MATCH_RECOGNIZE:
SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, mno ORDER BY activity_date ) AS seq_num
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY user_id
  ORDER BY activity_date
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN ( same_foc_id* last_row )
  DEFINE
    same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
)
or, multiple ROW_NUMBERs:
SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id, grp ORDER BY activity_date ) AS seq_num
FROM   (
  SELECT user_id,
         activity_date,
         foc_id,
         ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date )
           - ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS grp
  FROM   table_name
)
ORDER BY user_id, activity_date
Which, for the sample data:
CREATE TABLE table_name ( user_id, activity_date, foc_id ) AS
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:20:34' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:39' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:44' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:58' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:20:11' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:16' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:33' HOUR TO SECOND, 'A240' FROM DUAL;
Both output:
USER_ID | ACTIVITY_DATE | FOC_ID | SEQ_NUM
:------ | :------------------ | :----- | ------:
UVAC3 | 2020-11-04 14:20:34 | A240 | 1
UVAC3 | 2020-11-04 14:21:23 | A240 | 2
UVAC3 | 2020-11-04 14:21:23 | A240 | 3
UVAC3 | 2020-11-04 14:21:23 | A240 | 4
UVAC3 | 2020-11-04 15:19:39 | B410 | 1
UVAC3 | 2020-11-04 15:19:44 | B410 | 2
UVAC3 | 2020-11-04 15:19:58 | B410 | 3
UVAC3 | 2020-11-04 15:20:11 | B410 | 4
UVAC3 | 2020-11-04 15:22:16 | A240 | 1
UVAC3 | 2020-11-04 15:22:33 | A240 | 2
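To see why the difference of two row numbers identifies each unbroken run, the same logic can be traced outside the database. The following is only a Python sketch of the idea, not Oracle code; the function name `restartable_seq` and the tuple layout are made up for illustration, and the input is assumed to be sorted by user and activity time already.

```python
from collections import defaultdict

# Sample rows: (user_id, activity_time, foc_id), already sorted by user and time.
rows = [
    ("UVAC3", "14:20:34", "A240"), ("UVAC3", "14:21:23", "A240"),
    ("UVAC3", "14:21:23", "A240"), ("UVAC3", "14:21:23", "A240"),
    ("UVAC3", "15:19:39", "B410"), ("UVAC3", "15:19:44", "B410"),
    ("UVAC3", "15:19:58", "B410"), ("UVAC3", "15:20:11", "B410"),
    ("UVAC3", "15:22:16", "A240"), ("UVAC3", "15:22:33", "A240"),
]

def restartable_seq(sorted_rows):
    """Number rows so the counter restarts whenever foc_id changes for a user."""
    per_user = defaultdict(int)      # ROW_NUMBER() OVER (PARTITION BY user_id ...)
    per_user_foc = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY user_id, foc_id ...)
    per_group = defaultdict(int)     # ROW_NUMBER() OVER (PARTITION BY user_id, foc_id, grp ...)
    out = []
    for user_id, ts, foc_id in sorted_rows:
        per_user[user_id] += 1
        per_user_foc[(user_id, foc_id)] += 1
        # The difference stays constant inside one unbroken run of the same foc_id
        # and changes once other rows have appeared in between.
        grp = per_user[user_id] - per_user_foc[(user_id, foc_id)]
        per_group[(user_id, foc_id, grp)] += 1
        out.append((user_id, ts, foc_id, per_group[(user_id, foc_id, grp)]))
    return out

seq_numbers = [row[3] for row in restartable_seq(rows)]
print(seq_numbers)
```

For the sample data the difference is 0 for the first A240 run and 4 for the second, so the two runs land in different partitions and are numbered separately.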

Related

Separating Overlapping Date Ranges in Oracle

I have data with overlapping data ranges. Example below
Customer_ID | FAC_NUM | Start_Date | End_Date | New_Monies
:---------- | :------ | :---------- | :---------- | ---------:
12345 | ABC1234 | 26/NOV/2014 | 26/MAY/2015 | 100000
12345 | ABC1234 | 12/DEC/2014 | 12/JUN/2015 | 200000
12345 | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000
12345 | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000
I want to convert this table into data with non overlapping ranges such that for each overlapping period, the New_Monies column is summed together and shown as a new row. For the example above, I want the output to be as follows
Customer_ID | FAC_NUM | Start_Date | End_Date | New_Monies
:---------- | :------ | :---------- | :---------- | ---------:
12345 | ABC1234 | 26/NOV/2014 | 11/DEC/2014 | 100000
12345 | ABC1234 | 12/DEC/2014 | 26/MAY/2015 | 300000
12345 | ABC1234 | 27/MAY/2015 | 12/JUN/2015 | 200000
12345 | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000
12345 | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000
Row 2 above being the overlapping period of 12 Dec 2014 to 26 May 2015 showing the total New_Monies as 300000 (100000+200000)
What would be the best way to do this in Oracle?
Thanks in advance for your support.
Regards,
Ani
with
prep (customer_id, fac_num, dt, amount) as (
select t.customer_id, t.fac_num,
case h.col when 's' then t.start_date else t.end_date + 1 end as dt,
case h.col when 's' then t.new_monies else - t.new_monies end as amount
from sample_data t
cross join
(select 's' as col from dual union all select 'e' from dual) h
)
, cumul_sums (customer_id, fac_num, dt, amount) as (
select distinct
customer_id, fac_num, dt,
sum(amount) over (partition by customer_id, fac_num order by dt)
from prep
)
, with_intervals (customer_id, fac_num, start_date, end_date, amount) as (
select customer_id, fac_num, dt,
lead(dt) over (partition by customer_id, fac_num order by dt) - 1,
amount
from cumul_sums
)
select customer_id, fac_num, start_date, end_date, amount
from with_intervals
where end_date is not null
order by customer_id, fac_num, start_date
;
The prep subquery unpivots the inputs, while at the same time changing the "end date" to the "start date" of the following interval and assigning a positive amount to the "start date" and the negative of the same amount to the following "start date". cumul_sums computes the cumulative sums; note that if two or more intervals begin on the same date (so the same date from prep appears multiple times for a customer and fac_num), the analytic sum will include the amounts from ALL the rows up to that date - the default windowing clause is range between...... After the cumulative sums are computed, this subquery also de-duplicates the output rows (to handle precisely that complication, of multiple intervals starting on the same date). with_intervals recovers the "start date" - "end date" intervals, and the final step simply removes the last interval ("to infinity") which would have an "amount" of zero.
EDIT This solution answers the OP's original question. After posting the solution, the OP changed the question. The solution can be changed easily to address the new formulation. I'm not going to chase shadows though; the solution will remain as is.
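The prep / cumul_sums / with_intervals steps described above can be traced by hand with a small Python sketch. This is only an illustration of the same "+amount at start, -amount the day after the end" idea for a single customer and facility; the function name `split_overlaps` and the tuple layout are assumptions, not part of the answer.

```python
from collections import defaultdict
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Sample intervals for one customer/facility: (start_date, end_date, new_monies),
# with inclusive end dates as in the question.
sample = [
    (date(2014, 11, 26), date(2015, 5, 26), 100000),
    (date(2014, 12, 12), date(2015, 6, 12), 200000),
    (date(2015, 6, 15), date(2015, 12, 15), 500000),
    (date(2015, 12, 20), date(2016, 6, 20), 600000),
]

def split_overlaps(intervals):
    """Return non-overlapping (start, end, amount) pieces."""
    delta = defaultdict(int)
    for start, end, amount in intervals:
        delta[start] += amount            # like prep: +amount on the start date
        delta[end + ONE_DAY] -= amount    # like prep: -amount on the day after the end
    points = sorted(delta)
    pieces = []
    running = 0
    # Pair each boundary date with the next one; the last boundary has no
    # successor, which plays the role of "where end_date is not null".
    for current, following in zip(points, points[1:]):
        running += delta[current]         # like cumul_sums: running total up to this date
        pieces.append((current, following - ONE_DAY, running))
    return pieces

pieces = split_overlaps(sample)
# Gaps between intervals show up as pieces with amount 0; drop them if unwanted.
nonzero = [p for p in pieces if p[2] != 0]
for start, end, amount in nonzero:
    print(start, end, amount)
```

Summing the deltas per date before the running total is what handles several intervals starting on the same day. Note that the sketch produces zero-amount pieces for the gaps (13 to 14 June 2015 and 16 to 19 December 2015 here); the SQL as written appears to return such rows as well, so an extra filter on the amount may be needed to match the desired output exactly.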
Here is a way to do this.
with all_data
  as (select Customer_ID,FAC_NUM,start_date as dt,new_monies as calc_monies
        from t
      union all
      select Customer_ID,FAC_NUM,end_date as dt,new_monies*-1 as calc_monies
        from t
     )
select x.customer_id
      ,x.fac_num
      ,x.start_date
      ,case when row_number() over(order by end_date desc)=1 then
              x.end_date + 1
            else x.end_date
       end as new_end_date
      ,x.new_monies -- needed for the NEW_MONIES column shown in the output
  from (
        select t.customer_id
              ,t.fac_num
              ,t.dt as start_date
              ,lead(dt) over(order by dt)-1 as end_date
              ,sum(calc_monies) over(order by dt) as new_monies
          from all_data t
       )x
 where end_date is not null
 order by 3
db fiddle link
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=856c9ac0954e45429994f4ac45699e6f
+-------------+---------+------------+--------------+------------+
| CUSTOMER_ID | FAC_NUM | START_DATE | NEW_END_DATE | NEW_MONIES |
+-------------+---------+------------+--------------+------------+
| 12345 | ABC1234 | 26-NOV-14 | 11-DEC-14 | 100000 |
| 12345 | ABC1234 | 12-DEC-14 | 25-MAY-15 | 300000 |
| 12345 | ABC1234 | 26-MAY-15 | 12-JUN-15 | 200000 |
+-------------+---------+------------+--------------+------------+

Get month wise yearly report in oracle

I have an employee table with the following fields: employee name, wages date, and wages. I want to sum the records month-wise. Here is the table data. I'm using Oracle Database 11g.
And here is the output I want.
You can use ROLLUP in your GROUP BY
WITH t(NAME, dt, w) AS (
SELECT 'adam', DATE '2020-01-01', 200 FROM dual UNION ALL
SELECT 'adam', DATE '2020-02-01', 200 FROM dual UNION ALL
SELECT 'adam', DATE '2020-03-01', 200 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-01-01', 100 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-02-01', 200 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-03-01', 151 FROM dual
)
SELECT NAME, NVL(TO_CHAR(dt, 'fmMon'), 'total') AS mon, SUM(w) AS sum_w
FROM t
GROUP BY NAME, ROLLUP(TO_CHAR(dt, 'fmMon'));
+-----------------+
|NAME |MON |SUM_W|
+-----------------+
|adam |Feb |200 |
|adam |Jan |200 |
|adam |Mar |200 |
|adam |total|600 |
|jhone|Feb |200 |
|jhone|Jan |100 |
|jhone|Mar |151 |
|jhone|total|451 |
+-----------------+
If you need to transpose your result, you can PIVOT it:
WITH t(NAME, dt, w) AS (
SELECT 'adam', DATE '2020-01-01', 200 FROM dual UNION ALL
SELECT 'adam', DATE '2020-02-01', 200 FROM dual UNION ALL
SELECT 'adam', DATE '2020-03-01', 200 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-01-01', 100 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-02-01', 200 FROM dual UNION ALL
SELECT 'jhone', DATE '2020-03-01', 151 FROM dual
)
SELECT *
FROM (
SELECT NAME, NVL(TO_CHAR(dt, 'fmMon'), 'total') AS mon, SUM(w) AS sum_w
FROM t
GROUP BY NAME, ROLLUP(TO_CHAR(dt, 'fmMon'))
)
PIVOT (
SUM(sum_w)
FOR mon IN ('Jan','Feb','Mar','total')
);
+-------------------------------+
|NAME |'Jan'|'Feb'|'Mar'|'total'|
+-------------------------------+
|adam |200 |200 |200 |600 |
|jhone|100 |200 |151 |451 |
+-------------------------------+
You can use conditional aggregation as follows:
-- wages_date is an assumed column name; DATE is a reserved word in Oracle
-- and cannot be used as an unquoted column name.
SELECT name,
       SUM(CASE WHEN TO_CHAR(wages_date, 'mon') = 'jan' THEN wages END) AS jan,
       SUM(CASE WHEN TO_CHAR(wages_date, 'mon') = 'feb' THEN wages END) AS feb,
       ...
       SUM(wages) AS total
FROM   yourTable
GROUP BY name;
Add a WHERE condition to restrict the data to a single year.

Oracle - how to update a unique row based on MAX effective date which is part of the unique index

Oracle - Say you have a table that has a unique key on name, ssn and effective date. The effective date makes it unique. What is the best way to update a current indicator to show inactive for the rows with dates less than the max effective date? I can't really wrap my head around it since there are multiple rows with the same name and ssn combinations. I haven't been able to find this scenario on here for Oracle and I'm having developer's block. Thanks.
"All name/ssn having a max effective date earlier than this time yesterday:"
SELECT name, ssn
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
Oracle supports a multi-column IN, so:
UPDATE t
SET current_indicator = 'inactive'
WHERE (name,ssn,eff_date) IN (
SELECT name, ssn, max(eff_date)
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
)
Use a MERGE statement using an analytic function to identify the rows to update and then merge on the ROWID pseudo-column so that Oracle can efficiently identify the rows to update (without having to perform an expensive self-join by comparing the values):
MERGE INTO table_name dst
USING (
SELECT rid,
max_eff_date
FROM (
SELECT ROWID AS rid,
effective_date,
status,
MAX( effective_date ) OVER ( PARTITION BY name, ssn ) AS max_eff_date
FROM table_name
)
WHERE ( effective_date < max_eff_date AND status <> 'inactive' )
OR ( effective_date = max_eff_date AND status <> 'active' )
) src
ON ( dst.ROWID = src.rid )
WHEN MATCHED THEN
UPDATE
SET status = CASE
WHEN src.max_eff_date = dst.effective_date
THEN 'active'
ELSE 'inactive'
END;
So, for some sample data:
CREATE TABLE table_name ( name, ssn, effective_date, status ) AS
SELECT 'aaa', 1, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-03', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-01', 'active' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-03', 'active' FROM DUAL;
The query only updates the 3 rows that need changing and:
SELECT *
FROM table_name;
Outputs:
NAME | SSN | EFFECTIVE_DATE | STATUS
:--- | --: | :------------- | :-------
aaa | 1 | 01-JAN-20 | inactive
aaa | 1 | 02-JAN-20 | inactive
aaa | 1 | 03-JAN-20 | active
bbb | 2 | 01-JAN-20 | inactive
bbb | 2 | 02-JAN-20 | active
bbb | 3 | 01-JAN-20 | inactive
bbb | 3 | 03-JAN-20 | active
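As a cross-check of which rows the MERGE touches, the selection rule can be written out in a few lines of Python. This is only a sketch of the logic (latest effective date per name/ssn becomes 'active', everything earlier becomes 'inactive', rows that are already correct are skipped); the function name `status_changes` is made up for illustration.

```python
from datetime import date

# Same sample data as the CREATE TABLE above: (name, ssn, effective_date, status).
rows = [
    ("aaa", 1, date(2020, 1, 1), "inactive"),
    ("aaa", 1, date(2020, 1, 2), "inactive"),
    ("aaa", 1, date(2020, 1, 3), "inactive"),
    ("bbb", 2, date(2020, 1, 1), "active"),
    ("bbb", 2, date(2020, 1, 2), "inactive"),
    ("bbb", 3, date(2020, 1, 1), "inactive"),
    ("bbb", 3, date(2020, 1, 3), "active"),
]

def status_changes(table_rows):
    """Return (name, ssn, effective_date, new_status) for rows whose status is wrong."""
    # Equivalent of MAX(effective_date) OVER (PARTITION BY name, ssn).
    latest = {}
    for name, ssn, eff, _status in table_rows:
        key = (name, ssn)
        if key not in latest or eff > latest[key]:
            latest[key] = eff
    changes = []
    for name, ssn, eff, status in table_rows:
        wanted = "active" if eff == latest[(name, ssn)] else "inactive"
        if status != wanted:  # only rows that actually need an update
            changes.append((name, ssn, eff, wanted))
    return changes

changes = status_changes(rows)
for change in changes:
    print(change)
```

Filtering out rows that already carry the right status is what keeps the update down to three rows for this data.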

calculate the average time difference between each stage

How to calculate the average time difference between each stage.
The challenge with the actual data set is that not every id goes through all stages. Some skip stages, and the dates are not continuous for all ids, as shown below.
id date status
1 1/1/18 requirement
1 1/8/18 analysis
1 ? design
1 1/30/18 closed
2 2/1/18 requirement
2 2/18/18 closed
3 1/2/18 requirement
3 1/29/18 analysis
3 ? accepted
3 2/5/18 closed
?--we have missing dates as well
Expected output
id date status time_spent
1 1/1/18 requirement 0
1 1/8/18 analysis 7
1 ? design
1 1/30/18 closed 22
2 2/1/18 requirement 0
2 2/18/18 closed 17
3 1/2/18 requirement 0
3 1/29/18 analysis 27
3 ? accepted
3 2/5/18 closed 24
status avg(timespent)
requirement 0
analysis 17
design
closed 21
You can use windowing functions LAG (or LEAD) to get the data of the previous (or next) status for each id. That will let you compute the time elapsed in each stage. Then, compute the average time elapsed for each stage.
Here is an example of how to do that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, dte-nvl(lag(dte ignore nulls) over ( partition by id order by dte ), dte) elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+-------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+-------------------------------------------+
| requirement | 0 |
| analysis | 17 |
| design | |
| accepted | |
| closed | 15.33333333333333333333333333333333333333 |
+-------------+-------------------------------------------+
As I said in my comment, that logic computes the elapsed days as the time to the given status from the prior status. Since "requirement" has no prior status, this logic will always show zero days spent in requirements. It would probably be better to compute the time from the given status to the next status. For "closed", there would be no next status. You could just leave that blank or use SYSDATE as the date of the next status. Here is an example of that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, nvl(lead(dte ignore nulls) over ( partition by id order by dte ), trunc(sysdate))-dte elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+------------------------------------------+
| requirement | 17 |
| analysis | 14.5 |
| design | |
| accepted | |
| closed | 361.666666666666666666666666666666666667 |
+-------------+------------------------------------------+
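The effect of LAG ... IGNORE NULLS in the first query can be checked with a short Python sketch. It is an illustration only: it measures each dated stage against the previous dated stage of the same id and simply leaves undated stages out of the result (the SQL returns them with a NULL average instead). The function name `avg_days_since_previous` is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Same sample data as input_data above: (id, date or None, status).
events = [
    (1, date(2018, 1, 1), "requirement"), (1, date(2018, 1, 8), "analysis"),
    (1, None, "design"), (1, date(2018, 1, 30), "closed"),
    (2, date(2018, 2, 1), "requirement"), (2, date(2018, 2, 18), "closed"),
    (3, date(2018, 1, 2), "requirement"), (3, date(2018, 1, 29), "analysis"),
    (3, None, "accepted"), (3, date(2018, 2, 5), "closed"),
]

def avg_days_since_previous(all_events):
    """Average days from the previous dated stage to each stage, per status."""
    by_id = defaultdict(list)
    for item_id, dte, status in all_events:
        if dte is not None:  # skipping undated rows mimics IGNORE NULLS
            by_id[item_id].append((dte, status))
    elapsed = defaultdict(list)
    for stages in by_id.values():
        previous = None
        for dte, status in sorted(stages, key=lambda stage: stage[0]):
            # The first stage of an id has no predecessor, so it counts as 0 days,
            # matching the NVL(..., dte) fallback in the query.
            elapsed[status].append(0 if previous is None else (dte - previous).days)
            previous = dte
    return {status: sum(days) / len(days) for status, days in elapsed.items()}

averages = avg_days_since_previous(events)
print(averages)
```

For "closed" the individual gaps are 22, 17 and 7 days, which gives the 15.33 shown in the first result table.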
I agree with @MatthewMcPeak. Your requirements seem a bit odd: you spend zero days in the requirement stage but an average of 21 days on closed? Fnord.
This solution treats the presented date as the start date of the stage and calculates the difference between it and the start_date of the next phase.
with cte as (
select status
, lead(dd ignore nulls) over (partition by id order by dd) - dd as dt_diff
from your_table)
select status, avg(dt_diff) as avg_ela
from cte
group by status
/
If you wish to include all stages for each d and estimate the time spent in each (using linear interpolation) then you can create a sub-query with all the statuses and use a PARTITION OUTER JOIN to join them and then use LAG and LEAD to find the date range the status is in and interpolate between:
Oracle Setup:
CREATE TABLE data ( d, dt, status ) AS
SELECT 1, TO_DATE( '1/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/8/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/30/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/18/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/2/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/29/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '2/5/18', 'MM/DD/YY' ), 'closed' FROM DUAL;
Query:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT d,
dt,
status,
( next_dt - recent_dt ) / (next_id - recent_id ) AS estimated_duration
FROM ranges;
Output:
D | DT | STATUS | ESTIMATED_DURATION
-: | :-------- | :---------- | ---------------------------------------:
1 | 01-JAN-18 | requirement | 7
1 | 08-JAN-18 | analysis | 7.33333333333333333333333333333333333333
1 | null | design | 7.33333333333333333333333333333333333333
1 | null | accepted | 7.33333333333333333333333333333333333333
1 | 30-JAN-18 | closed | 0
2 | 01-FEB-18 | requirement | 4.25
2 | null | analysis | 4.25
2 | null | design | 4.25
2 | null | accepted | 4.25
2 | 18-FEB-18 | closed | 0
3 | 02-JAN-18 | requirement | 27
3 | 29-JAN-18 | analysis | 2.33333333333333333333333333333333333333
3 | null | design | 2.33333333333333333333333333333333333333
3 | null | accepted | 2.33333333333333333333333333333333333333
3 | 05-FEB-18 | closed | 0
Query 2:
Then you can easily change that to take the average for each status:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT status,
AVG( ( next_dt - recent_dt ) / (next_id - recent_id ) ) AS estimated_duration
FROM ranges
GROUP BY status, id
ORDER BY id;
Results:
STATUS | ESTIMATED_DURATION
:---------- | ---------------------------------------:
requirement | 12.75
analysis | 4.63888888888888888888888888888888888889
design | 4.63888888888888888888888888888888888889
accepted | 4.63888888888888888888888888888888888889
closed | 0

Want to ROUND the Data according to DAY difference

Query :
select
TO_CHAR((to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS')+ (level-1)),'DD-MM-YYYY'),
TO_CHAR(to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS') + level,'DD-MM-YYYY') ,
to_number(regexp_substr(IP_PLAN_CONSUMPTION, '^\d+'))/(TO_DATE(IP_END_DATE, 'DD-MM-YYYY HH24:MI:SS') - TO_DATE(IP_START_DATE, 'DD-MM-YYYY HH24:MI:SS')) || regexp_substr(IP_PLAN_CONSUMPTION, '[A-Z]') as IP_PLAN_CONSUMPTION
FROM
dual
CONNECT BY
level <= to_date(IP_END_DATE,'DD-MM-YYYY HH24:MI:SS')-to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS')+1;
-> Data in Query :
select
TO_CHAR((to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS')+ (level-1)),'DD-MM-YYYY'),
TO_CHAR(to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS') + level,'DD-MM-YYYY'),
to_number(regexp_substr('4000 T', '^\d+'))/(TO_DATE('18-07-2018 00:00', 'DD-MM-YYYY HH24:MI:SS') - TO_DATE('16-07-2018 11:02', 'DD-MM-YYYY HH24:MI:SS')) || regexp_substr('4000 T', '[A-Z]') as IP_PLAN_CONSUMPTION
FROM
dual
CONNECT BY
level <= to_date('18-07-2018 00:00','DD-MM-YYYY HH24:MI:SS')-to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS')+1;
Output will be:
But it should be 2000 T.
Note: if the start date is 16-07-2018 00:00 and the end date is 19-07-2018 00:00, then the day difference is 3 days. With a consumption of 4000 T, the inserted consumption should be 1333.333333333333 T ~ 1334 T for each date.
If you are storing dates, you should store them in your table as the DATE data type (and not as strings).
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE your_table( id, ip_start_date, ip_end_date, ip_plan_consumption ) AS
SELECT 1,
DATE '2018-07-16' + INTERVAL '11:02' HOUR TO MINUTE,
DATE '2018-07-18' + INTERVAL '00:00' HOUR TO MINUTE,
'4000 T'
FROM DUAL
UNION ALL
SELECT 2,
DATE '2018-07-16' + INTERVAL '11:02' HOUR TO MINUTE,
DATE '2018-07-16' + INTERVAL '23:08' HOUR TO MINUTE,
'3000 T'
FROM DUAL
UNION ALL
SELECT 3,
DATE '2018-07-10' + INTERVAL '00:00' HOUR TO MINUTE,
DATE '2018-07-13' + INTERVAL '23:59' HOUR TO MINUTE,
'15000 U'
FROM DUAL
;
Query 1:
WITH data ( id, start_dt, end_dt, consumption, units ) AS (
SELECT id,
TRUNC( IP_START_DATE ),
GREATEST( TRUNC( IP_START_DATE ) + 1, TRUNC( IP_END_DATE ) ),
TO_NUMBER( REGEXP_SUBSTR( IP_PLAN_CONSUMPTION, '^\d+' ) ),
REGEXP_SUBSTR( IP_PLAN_CONSUMPTION, '\S+$' )
FROM your_table
)
SELECT id,
t.column_value AS start_dt,
t.column_value + 1 AS end_dt,
consumption / ( end_dt - start_dt ) || units AS IP_PLAN_CONSUMPTION
FROM data d
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT d.start_dt + LEVEL - 1
FROM DUAL
CONNECT BY d.start_dt + LEVEL - 1 < d.end_dt
)
AS SYS.ODCIDATELIST
)
) t
Results:
| ID | START_DT | END_DT | IP_PLAN_CONSUMPTION |
|----|----------------------|----------------------|---------------------|
| 1 | 2018-07-16T00:00:00Z | 2018-07-17T00:00:00Z | 2000T |
| 1 | 2018-07-17T00:00:00Z | 2018-07-18T00:00:00Z | 2000T |
| 2 | 2018-07-16T00:00:00Z | 2018-07-17T00:00:00Z | 3000T |
| 3 | 2018-07-10T00:00:00Z | 2018-07-11T00:00:00Z | 5000U |
| 3 | 2018-07-11T00:00:00Z | 2018-07-12T00:00:00Z | 5000U |
| 3 | 2018-07-12T00:00:00Z | 2018-07-13T00:00:00Z | 5000U |
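The day-splitting rule used by the query (truncate both timestamps to whole days, force at least one day, divide the consumption evenly) can be sketched in Python as follows. The optional rounding up to a whole number reflects the "1333.33 ~ 1334" wish in the question and is an assumption about the rounding the asker wants; the function name `daily_plan` is made up for illustration.

```python
import math
from datetime import datetime, timedelta

def daily_plan(start, end, consumption, round_up=False):
    """Split consumption evenly over the whole days between start and end."""
    first_day = start.date()                                   # TRUNC(ip_start_date)
    # GREATEST(TRUNC(start) + 1, TRUNC(end)): always cover at least one day.
    last_day = max(end.date(), first_day + timedelta(days=1))
    days = (last_day - first_day).days
    share = consumption / days
    if round_up:
        share = math.ceil(share)  # assumed rounding rule: 1333.33 becomes 1334
    return [
        (first_day + timedelta(days=i), first_day + timedelta(days=i + 1), share)
        for i in range(days)
    ]

# Row 1 of the sample table: 16-07-2018 11:02 to 18-07-2018 00:00, 4000 T.
plan = daily_plan(datetime(2018, 7, 16, 11, 2), datetime(2018, 7, 18, 0, 0), 4000)
for start_day, end_day, share in plan:
    print(start_day, end_day, share)
```

Calling it with `round_up=True` for 16-07-2018 00:00 to 19-07-2018 00:00 and 4000 gives three days of 1334 each, which then sums to slightly more than the original 4000.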

Resources