Want to ROUND the Data according to DAY difference - oracle

Query :
select
TO_CHAR((to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS')+ (level-1)),'DD-MM-YYYY'),
TO_CHAR(to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS') + level,'DD-MM-YYYY') ,
to_number(regexp_substr(IP_PLAN_CONSUMPTION, '^\d+'))/(TO_DATE(IP_END_DATE, 'DD-MM-YYYY HH24:MI:SS') - TO_DATE(IP_START_DATE, 'DD-MM-YYYY HH24:MI:SS')) || regexp_substr(IP_PLAN_CONSUMPTION, '[A-Z]') as IP_PLAN_CONSUMPTION
FROM
dual
CONNECT BY
level <= to_date(IP_END_DATE,'DD-MM-YYYY HH24:MI:SS')-to_date(IP_START_DATE,'DD-MM-YYYY HH24:MI:SS')+1;
-> Data in Query :
select
TO_CHAR((to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS')+ (level-1)),'DD-MM-YYYY'),
TO_CHAR(to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS') + level,'DD-MM-YYYY'),
to_number(regexp_substr('4000 T', '^\d+'))/(TO_DATE('18-07-2018 00:00', 'DD-MM-YYYY HH24:MI:SS') - TO_DATE('16-07-2018 11:02', 'DD-MM-YYYY HH24:MI:SS')) || regexp_substr('4000 T', '[A-Z]') as IP_PLAN_CONSUMPTION
FROM
dual
CONNECT BY
level <= to_date('18-07-2018 00:00','DD-MM-YYYY HH24:MI:SS')-to_date('16-07-2018 11:02','DD-MM-YYYY HH24:MI:SS')+1;
Output will be:
But it should be 2000 T.
Note: If the start date is 16-07-2018 00:00 and the end date is 19-07-2018 00:00, the day difference is 3 days. With a consumption of 4000 T, the inserted consumption should be 1333.333333333333 T ~ 1334 T for each date.

If you are storing dates, you should store them in your table as the DATE data type (and not as strings).
Oracle 11g R2 Schema Setup:
CREATE TABLE your_table( id, ip_start_date, ip_end_date, ip_plan_consumption ) AS
SELECT 1,
DATE '2018-07-16' + INTERVAL '11:02' HOUR TO MINUTE,
DATE '2018-07-18' + INTERVAL '00:00' HOUR TO MINUTE,
'4000 T'
FROM DUAL
UNION ALL
SELECT 2,
DATE '2018-07-16' + INTERVAL '11:02' HOUR TO MINUTE,
DATE '2018-07-16' + INTERVAL '23:08' HOUR TO MINUTE,
'3000 T'
FROM DUAL
UNION ALL
SELECT 3,
DATE '2018-07-10' + INTERVAL '00:00' HOUR TO MINUTE,
DATE '2018-07-13' + INTERVAL '23:59' HOUR TO MINUTE,
'15000 U'
FROM DUAL
;
Query 1:
WITH data ( id, start_dt, end_dt, consumption, units ) AS (
SELECT id,
TRUNC( IP_START_DATE ),
GREATEST( TRUNC( IP_START_DATE ) + 1, TRUNC( IP_END_DATE ) ),
TO_NUMBER( REGEXP_SUBSTR( IP_PLAN_CONSUMPTION, '^\d+' ) ),
REGEXP_SUBSTR( IP_PLAN_CONSUMPTION, '\S+$' )
FROM your_table
)
SELECT id,
t.column_value AS start_dt,
t.column_value + 1 AS end_dt,
consumption / ( end_dt - start_dt ) || units AS IP_PLAN_CONSUMPTION
FROM data d
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT d.start_dt + LEVEL - 1
FROM DUAL
CONNECT BY d.start_dt + LEVEL - 1 < d.end_dt
)
AS SYS.ODCIDATELIST
)
) t
Results:
| ID | START_DT | END_DT | IP_PLAN_CONSUMPTION |
|----|----------------------|----------------------|---------------------|
| 1 | 2018-07-16T00:00:00Z | 2018-07-17T00:00:00Z | 2000T |
| 1 | 2018-07-17T00:00:00Z | 2018-07-18T00:00:00Z | 2000T |
| 2 | 2018-07-16T00:00:00Z | 2018-07-17T00:00:00Z | 3000T |
| 3 | 2018-07-10T00:00:00Z | 2018-07-11T00:00:00Z | 5000U |
| 3 | 2018-07-11T00:00:00Z | 2018-07-12T00:00:00Z | 5000U |
| 3 | 2018-07-12T00:00:00Z | 2018-07-13T00:00:00Z | 5000U |
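The arithmetic behind Query 1 can be cross-checked outside the database. The following is a minimal Python sketch and not part of the original answer: `split_consumption` is a hypothetical helper that truncates both timestamps to midnight, forces a span of at least one day (as the GREATEST call does above), and divides the amount evenly. The `round_up` flag is an assumption added to cover the question's request that 1333.33 T become 1334 T; in Oracle the analogous step would probably be wrapping the division in CEIL.

```python
from datetime import datetime, timedelta
from math import ceil

def split_consumption(start, end, amount, round_up=False):
    """Split amount evenly over the whole days between start and end (hypothetical helper)."""
    # Truncate both timestamps to midnight, like TRUNC(date) in Oracle.
    first = start.replace(hour=0, minute=0, second=0, microsecond=0)
    last = end.replace(hour=0, minute=0, second=0, microsecond=0)
    # Guarantee at least one day, mirroring GREATEST(TRUNC(start) + 1, TRUNC(end)).
    last = max(first + timedelta(days=1), last)
    days = (last - first).days
    share = amount / days
    if round_up:
        share = ceil(share)  # assumption: the asker wants each share rounded up
    return [
        (first + timedelta(days=i), first + timedelta(days=i + 1), share)
        for i in range(days)
    ]

rows = split_consumption(datetime(2018, 7, 16, 11, 2), datetime(2018, 7, 18), 4000)
for day_start, day_end, share in rows:
    print(day_start.date(), day_end.date(), share)
```

Calling the helper with `round_up=True` for 16-07-2018 00:00 to 19-07-2018 00:00 gives three rows of 1334 each, which is the rounding the question describes.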

Related

ROW_NUMBER over PARTITION BY restart row counter between breaks [closed]

I have a list of activities that is currently ordered by user, date and time of activity, and ID. I want to generate numbers for each group set by those same fields. Using the following code, I achieve considerable accuracy. However, there's a problem when the same ID is repeated at a later time and I need the row number count to restart instead of continuing from the previous iteration.
Here's my code:
ROW_NUMBER() OVER (PARTITION BY USER_ID, foc_id ORDER BY USER_ID, to_char(activity_date, 'MM/DD/YYYY HH24:MI:SS'), foc_id) seq_nbr
In the image below, we see that FOC_ID "A240" had activity around 2:20PM. Then FOC_ID "B410" had activity around 3:19PM, lastly the user returned to "A240" for additional activity around 3:20. Because there was activity between the first and second sequence of events of "A240," I need the row number (seq_nbr) to restart instead of continuing from the previous activity.
You can use MATCH_RECOGNIZE:
SELECT user_id,
activity_date,
foc_id,
ROW_NUMBER() OVER ( PARTITION BY user_id, mno ORDER BY activity_date ) AS seq_num
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY user_id
ORDER BY activity_date
MEASURES
MATCH_NUMBER() AS mno
ALL ROWS PER MATCH
PATTERN ( same_foc_id* last_row )
DEFINE
same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
)
or, multiple ROW_NUMBERs:
SELECT user_id,
activity_date,
foc_id,
ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id, grp ORDER BY activity_date ) AS seq_num
FROM (
SELECT user_id,
activity_date,
foc_id,
ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date )
- ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS grp
FROM table_name
)
ORDER BY user_id, activity_date
Which, for the sample data:
CREATE TABLE table_name ( user_id, activity_date, foc_id ) AS
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:20:34' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:39' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:44' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:58' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:20:11' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:16' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:33' HOUR TO SECOND, 'A240' FROM DUAL;
Both output:
USER_ID | ACTIVITY_DATE | FOC_ID | SEQ_NUM
:------ | :------------------ | :----- | ------:
UVAC3 | 2020-11-04 14:20:34 | A240 | 1
UVAC3 | 2020-11-04 14:21:23 | A240 | 2
UVAC3 | 2020-11-04 14:21:23 | A240 | 3
UVAC3 | 2020-11-04 14:21:23 | A240 | 4
UVAC3 | 2020-11-04 15:19:39 | B410 | 1
UVAC3 | 2020-11-04 15:19:44 | B410 | 2
UVAC3 | 2020-11-04 15:19:58 | B410 | 3
UVAC3 | 2020-11-04 15:20:11 | B410 | 4
UVAC3 | 2020-11-04 15:22:16 | A240 | 1
UVAC3 | 2020-11-04 15:22:33 | A240 | 2
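To see why the difference of two ROW_NUMBERs identifies each consecutive run, the second query can be mirrored in a few lines of Python. This is only an illustrative sketch: `restartable_row_numbers` is a hypothetical name, and the input is assumed to be already sorted by user and activity date, which the SQL version gets from its ORDER BY clauses.

```python
from collections import defaultdict

def restartable_row_numbers(rows):
    """rows: list of (user_id, activity_date, foc_id), pre-sorted by user and date."""
    overall = defaultdict(int)   # ROW_NUMBER() OVER (PARTITION BY user_id ...)
    per_foc = defaultdict(int)   # ROW_NUMBER() OVER (PARTITION BY user_id, foc_id ...)
    in_group = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY user_id, foc_id, grp ...)
    result = []
    for user_id, activity_date, foc_id in rows:
        overall[user_id] += 1
        per_foc[(user_id, foc_id)] += 1
        # The difference stays constant inside one run and changes once another
        # foc_id has interrupted it.
        grp = overall[user_id] - per_foc[(user_id, foc_id)]
        in_group[(user_id, foc_id, grp)] += 1
        result.append((user_id, activity_date, foc_id, in_group[(user_id, foc_id, grp)]))
    return result

sample = [
    ("UVAC3", "2020-11-04 14:20:34", "A240"),
    ("UVAC3", "2020-11-04 14:21:23", "A240"),
    ("UVAC3", "2020-11-04 14:21:23", "A240"),
    ("UVAC3", "2020-11-04 14:21:23", "A240"),
    ("UVAC3", "2020-11-04 15:19:39", "B410"),
    ("UVAC3", "2020-11-04 15:19:44", "B410"),
    ("UVAC3", "2020-11-04 15:19:58", "B410"),
    ("UVAC3", "2020-11-04 15:20:11", "B410"),
    ("UVAC3", "2020-11-04 15:22:16", "A240"),
    ("UVAC3", "2020-11-04 15:22:33", "A240"),
]
seq = [row[3] for row in restartable_row_numbers(sample)]
print(seq)
```

For the first A240 run the difference is 0; for the second it is 4, because four B410 rows were counted in between, so the counter starts again at 1.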

Retrieve multiple columns grouped by date interval

I want to retrieve multiple columns containing the sum of weight data from a table over a whole month. What I need help with is grouping the result into two rows: the sum for days 1-15 of the month on the first row and the sum for days 16-31 on the second.
Select TO_CHAR(sysdate) dummy
(SELECT(SUM(B.SCALE_WEIGHT) FROM TRACKING.DATALOG_TAB B WHERE B.MATERIALID= 1
AND B.SCALE_EVENTDATE BETWEEN TO_DATE(TRUNC(TO_DATE('2020-10-1', 'YYYY-MM-DD'),'MONTH')) AND TO_DATE(TRUNC(TO_DATE('2020-11-1', 'YYYY-MM-DD'), 'MONTH')+16)) as MTRL1,
(SELECT(SUM(B.SCALE_WEIGHT) FROM TRACKING.DATALOG_TAB B WHERE B.MATERIALID= 2
AND B.SCALE_EVENTDATE BETWEEN TO_DATE(TRUNC(TO_DATE('2020-10-1', 'YYYY-MM-DD'),'MONTH')) AND TO_DATE(TRUNC(TO_DATE('2020-11-1', 'YYYY-MM-DD'), 'MONTH')+16)) as MTRL2
FROM DUAL
GROUP BY(somthing like this - 1-15 and 16-31);
UPDATE
the result should look like this
To me, it looks like this:
select
sum(case when b.materialid = 1 and
to_number(to_char(b.scale_eventdate, 'dd')) between 1 and 15 then
b.scale_weight
end) mtrl1,
--
sum(case when b.materialid = 2 and
to_number(to_char(b.scale_eventdate, 'dd')) between 16 and 31 then
b.scale_weight
end) mtrl2
from datalog_tab b
where to_char(b.scale_eventdate, 'yyyymm') = '202010'
In other words, check whether day of scale_eventdate column belongs to 1st or 2nd half of the month and sum scale_weight accordingly.
If you have the sample data:
CREATE TABLE tracking.datalog_tab ( materialid, scale_eventdate, scale_weight ) AS
SELECT 1, DATE '2020-10-01', 1 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-15', 2 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-16', 3 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-31', 4 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-01', -1 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-15', -2 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-16', -3 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-31', -4 FROM DUAL;
You can use:
SELECT MATERIALID,
CASE
WHEN EXTRACT( DAY FROM SCALE_EVENTDATE ) <= 15
THEN ' 1-15'
ELSE '16-31'
END AS day_range,
SUM(SCALE_WEIGHT)
FROM TRACKING.DATALOG_TAB
WHERE MATERIALID IN ( 1, 2 )
AND SCALE_EVENTDATE >= DATE '2020-10-01'
AND SCALE_EVENTDATE < DATE '2020-11-01'
GROUP BY
MATERIALID,
CASE
WHEN EXTRACT( DAY FROM SCALE_EVENTDATE ) <= 15
THEN ' 1-15'
ELSE '16-31'
END;
Which outputs:
MATERIALID | DAY_RANGE | SUM(SCALE_WEIGHT)
---------: | :-------- | ----------------:
1 | 1-15 | 3
2 | 1-15 | -3
1 | 16-31 | 7
2 | 16-31 | -7
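The bucketing rule in the CASE expression is simple enough to verify by hand. As a rough cross-check, here is a small Python sketch that applies the same rule to the sample rows above; `half_month_totals` is a hypothetical name and the code only mirrors the GROUP BY logic, not the date filter in the WHERE clause.

```python
from collections import defaultdict
from datetime import date

def half_month_totals(rows):
    """rows: iterable of (material_id, event_date, weight)."""
    totals = defaultdict(int)
    for material_id, event_date, weight in rows:
        # Same rule as the CASE expression: day of month <= 15 goes to the first half.
        day_range = " 1-15" if event_date.day <= 15 else "16-31"
        totals[(material_id, day_range)] += weight
    return dict(totals)

sample = [
    (1, date(2020, 10, 1), 1),
    (1, date(2020, 10, 15), 2),
    (1, date(2020, 10, 16), 3),
    (1, date(2020, 10, 31), 4),
    (2, date(2020, 10, 1), -1),
    (2, date(2020, 10, 15), -2),
    (2, date(2020, 10, 16), -3),
    (2, date(2020, 10, 31), -4),
]
totals = half_month_totals(sample)
for key in sorted(totals, key=lambda k: (k[1], k[0])):
    print(key, totals[key])
```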
Or, if you want them as columns then PIVOT:
SELECT *
FROM (
SELECT MATERIALID,
CASE
WHEN EXTRACT( DAY FROM SCALE_EVENTDATE ) <= 15
THEN ' 1-15'
ELSE '16-31'
END AS day_range,
SCALE_WEIGHT
FROM TRACKING.DATALOG_TAB
WHERE MATERIALID IN ( 1, 2 )
AND SCALE_EVENTDATE >= DATE '2020-10-01'
AND SCALE_EVENTDATE < DATE '2020-11-01'
)
PIVOT (
SUM( scale_weight ) FOR ( materialid, day_range ) IN (
( 1, ' 1-15' ) AS mtrl1_01_15,
( 1, '16-31' ) AS mtrl1_16_31,
( 2, ' 1-15' ) AS mtrl2_01_15,
( 2, '16-31' ) AS mtrl2_16_31
)
);
Which outputs:
MTRL1_01_15 | MTRL1_16_31 | MTRL2_01_15 | MTRL2_16_31
----------: | ----------: | ----------: | ----------:
3 | 7 | -3 | -7
Update
SELECT *
FROM (
SELECT MATERIALID,
CASE
WHEN EXTRACT( DAY FROM SCALE_EVENTDATE ) <= 15
THEN ' 1-15 '
ELSE '16-31 '
END
|| TO_CHAR( scale_eventdate, 'Mon' ) AS date_range,
SCALE_WEIGHT
FROM /*TRACKING.*/DATALOG_TAB
WHERE MATERIALID IN ( 1, 2, 3 )
AND SCALE_EVENTDATE >= DATE '2020-10-01'
AND SCALE_EVENTDATE < DATE '2020-11-01'
)
PIVOT (
SUM( scale_weight ) FOR materialid IN (
1 AS sum_mtrl1_weight,
2 AS sum_mtrl2_weight,
3 AS sum_mtrl3_weight
)
);
Which, for the sample data:
CREATE TABLE /*TRACKING.*/datalog_tab ( materialid, scale_eventdate, scale_weight ) AS
SELECT 1, DATE '2020-10-01', 25 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-15', 75 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-16', 125 FROM DUAL UNION ALL
SELECT 1, DATE '2020-10-31', 375 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-01', 90 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-15', 110 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-16', 90 FROM DUAL UNION ALL
SELECT 2, DATE '2020-10-31', 125 FROM DUAL UNION ALL
SELECT 3, DATE '2020-10-01', 120 FROM DUAL UNION ALL
SELECT 3, DATE '2020-10-16', 120 FROM DUAL UNION ALL
SELECT 3, DATE '2020-10-31', 240 FROM DUAL;
Outputs:
DATE_RANGE | SUM_MTRL1_WEIGHT | SUM_MTRL2_WEIGHT | SUM_MTRL3_WEIGHT
:--------- | ---------------: | ---------------: | ---------------:
1-15 Oct | 100 | 200 | 120
16-31 Oct | 500 | 215 | 360

How to split record in multiple records from start/end date record

I'm trying to split a record into multiple records based on its start and end dates in Oracle.
I have data like this
MachineID | start date | end date | running time |
WC01 | 2019/09/05 07:00 | 2019/09/07 09:00 | 26:00 |
and I want to split the record into one row per day, where each day runs from 08:00 to 08:00
MachineID | running date | running time |
WC01 | 2019/09/05 | 1:00 |
WC01 | 2019/09/06 | 24:00 |
WC01 | 2019/09/07 | 1:00 |
Thank you for your help!
We can handle this with the help of a calendar table that contains all the dates you expect to appear in your data set, with a separate record for each minute:
WITH dates AS (
SELECT TIMESTAMP '2019-09-05 00:00:00' + NUMTODSINTERVAL(rownum, 'MINUTE') AS dt
FROM dual
CONNECT BY level <= 5000
)
SELECT
m.MachineID,
TRUNC(d.dt) AS running_date,
COUNT(t.MachineID) / 60 AS running_hours
FROM dates d
CROSS JOIN (SELECT DISTINCT MachineID FROM yourTable) m
LEFT JOIN yourTable t
ON d.dt >= t.start_date AND d.dt < t.end_date
WHERE
TO_CHAR(d.dt, 'HH24') >= '08' AND TO_CHAR(d.dt, 'HH24') < '21'
GROUP BY
m.MachineID,
TRUNC(d.dt)
ORDER BY
TRUNC(d.dt);
You can try below query:
SELECT
MACHINEID,
RUNNING_DATE,
DECODE(RUNNING_DATE, TRUNC(START_DATE), CASE
WHEN DIFF_START < 0 THEN 0
WHEN DIFF_START > 12 THEN 12
ELSE DIFF_START
END, TRUNC(END_DATE), CASE
WHEN DIFF_END < 0 THEN 0
WHEN DIFF_END > 12 THEN 12
ELSE DIFF_END
END, 24) AS RUNNING_HOURS
FROM
(
SELECT
MACHINEID,
RUNNING_DATE,
ROUND(24 *((TRUNC(START_DATE + LVL - 1) + 8 / 24) - START_DATE)) AS DIFF_START,
ROUND(24 *(END_DATE -(TRUNC(START_DATE + LVL - 1) + 8 / 24))) AS DIFF_END,
START_DATE,
END_DATE
FROM
(
SELECT
DISTINCT MACHINEID,
LEVEL AS LVL,
START_DATE,
END_DATE,
TRUNC(START_DATE + LEVEL - 1) AS RUNNING_DATE
FROM
YOURTABLE
CONNECT BY
LEVEL <= TRUNC(END_DATE) - TRUNC(START_DATE) + 1
)
);
Change the logic wherever it is not meeting your requirement. I have created the query taking sample data and expected output into consideration.
Cheers!!

Convert time from a format to an int in Oracle

I can't seem to figure this out. I have some rows with time in the format 00:00:00 (hh:mm:ss) and I need to calculate the total time it takes for a task.
I am unable to sum this data. Can someone advise on a way to convert this to a format I can sum, or a method to calculate the total time for the task?
Thanks for any assistance. This is in an Oracle DB.
Convert your time string to a date and subtract the equivalent date at midnight to give you a number representing a fraction of a day. You can then sum this number and convert it to an interval:
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT '01:23:45' FROM DUAL UNION ALL
SELECT '12:34:56' FROM DUAL UNION ALL
SELECT '23:45:00' FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM( TO_DATE( value, 'HH24:MI:SS' ) - TO_DATE( '00:00:00', 'HH24:MI:SS' ) ),
'DAY'
) AS total_time_taken
FROM test_data;
Output:
| TOTAL_TIME_TAKEN |
| :---------------------------- |
| +000000001 13:43:41.000000000 |
Update including durations longer than 23:59:59.
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT '1:23:45' FROM DUAL UNION ALL
SELECT '12:34:56' FROM DUAL UNION ALL
SELECT '23:45:00' FROM DUAL UNION ALL
SELECT '48:00:00' FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM(
DATE '1970-01-01'
+ NUMTODSINTERVAL( SUBSTR( value, 1, HM - 1 ), 'HOUR' )
+ NUMTODSINTERVAL( SUBSTR( value, HM + 1, MS - HM - 1 ), 'MINUTE' )
+ NUMTODSINTERVAL( SUBSTR( value, MS + 1 ), 'SECOND' )
- DATE '1970-01-01'
),
'DAY'
) AS total_time
FROM (
SELECT value,
INSTR( value, ':', 1, 1 ) AS HM,
INSTR( value, ':', 1, 2 ) AS MS
FROM test_data
);
Output:
| TOTAL_TIME |
| :---------------------------- |
| +000000003 13:43:41.000000000 |
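The total above can be confirmed independently of Oracle. The sketch below is an illustration only: `parse_duration` is a hypothetical helper that assumes every value has exactly three colon-separated numeric parts, and it accepts hour counts above 23 in the same way the INSTR/SUBSTR query does.

```python
from datetime import timedelta

def parse_duration(text):
    """Parse 'H:MM:SS' (hours may exceed 23) into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

values = ["1:23:45", "12:34:56", "23:45:00", "48:00:00"]
# sum() needs a timedelta start value, otherwise it would start from the integer 0.
total = sum((parse_duration(v) for v in values), timedelta())
print(total)
```

The four values add up to 85:43:41, that is 3 days 13:43:41, which agrees with the interval returned by the query.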
Even better would be if you changed your table to hold the durations as intervals rather than as strings then everything becomes much simpler:
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT INTERVAL '1:23:45' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '12:34:56' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '23:45:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '48:00:00' HOUR TO SECOND FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM( DATE '1970-01-01' + value - DATE '1970-01-01' ),
'DAY'
) AS total_time
FROM test_data;
Output:
| TOTAL_TIME |
| :---------------------------- |
| +000000003 13:43:41.000000000 |

calculate the average time difference between each stage

How to calculate the average time difference between each stage.
The challenge with the actual data set is that not every id goes through all stages: some skip stages, and the dates are not complete for all ids, as shown below.
id date status
1 1/1/18 requirement
1 1/8/18 analysis
1 ? design
1 1/30/18 closed
2 2/1/18 requirement
2 2/18/18 closed
3 1/2/18 requirement
3 1/29/18 analysis
3 ? accepted
3 2/5/18 closed
?--we have missing dates as well
Expected output
id date status time_spent
1 1/1/18 requirement 0
1 1/8/18 analysis 7
1 ? design
1 1/30/18 closed 22
2 2/1/18 requirement 0
2 2/18/18 closed 17
3 1/2/18 requirement 0
3 1/29/18 analysis 27
3 ? accepted
3 2/5/18 closed 24
status avg(timespent)
requirement 0
analysis 17
design
closed 21
You can use windowing functions LAG (or LEAD) to get the data of the previous (or next) status for each id. That will let you compute the time elapsed in each stage. Then, compute the average time elapsed for each stage.
Here is an example of how to do that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, dte-nvl(lag(dte ignore nulls) over ( partition by id order by dte ), dte) elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+-------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+-------------------------------------------+
| requirement | 0 |
| analysis | 17 |
| design | |
| accepted | |
| closed | 15.33333333333333333333333333333333333333 |
+-------------+-------------------------------------------+
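The averages in this table can be reproduced with a short Python sketch of the same LAG logic. It is an approximation under stated assumptions rather than a translation of the query: `average_elapsed_by_status` is a hypothetical name, rows without a date produce no elapsed value (as in the SQL, where the subtraction yields NULL), and the first dated row of each id counts as 0 days.

```python
from collections import defaultdict
from datetime import date

def average_elapsed_by_status(rows):
    """rows: iterable of (id, date or None, status). Returns {status: average days or None}."""
    by_id = defaultdict(list)
    for row_id, dte, status in rows:
        by_id[row_id].append((dte, status))

    elapsed = defaultdict(list)
    for items in by_id.values():
        previous = None
        # Dated rows in date order; each is compared with the previous dated row.
        for dte, status in sorted((d, s) for d, s in items if d is not None):
            elapsed[status].append(0 if previous is None else (dte - previous).days)
            previous = dte
        # Undated rows add no value but should still show up in the result.
        for dte, status in items:
            if dte is None:
                elapsed.setdefault(status, [])
    return {s: (sum(v) / len(v) if v else None) for s, v in elapsed.items()}

sample = [
    (1, date(2018, 1, 1), "requirement"),
    (1, date(2018, 1, 8), "analysis"),
    (1, None, "design"),
    (1, date(2018, 1, 30), "closed"),
    (2, date(2018, 2, 1), "requirement"),
    (2, date(2018, 2, 18), "closed"),
    (3, date(2018, 1, 2), "requirement"),
    (3, date(2018, 1, 29), "analysis"),
    (3, None, "accepted"),
    (3, date(2018, 2, 5), "closed"),
]
averages = average_elapsed_by_status(sample)
for status, value in averages.items():
    print(status, value)
```

Note that "closed" averages 22, 17 and 7 days here, because for id 3 the previous dated row is the analysis row of 1/29/18; the question's expected value of 24 for that row does not follow from this rule.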
As I said in my comment, that logic computes the elapsed days as the time to the given status from the prior status. Since "requirement" has no prior status, this logic will always show zero days spent in requirements. It would probably be better to compute the time from the given status to the next status. For "closed", there would be no next status. You could just leave that blank or use SYSDATE as the date of the next status. Here is an example of that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, nvl(lead(dte ignore nulls) over ( partition by id order by dte ), trunc(sysdate))-dte elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+------------------------------------------+
| requirement | 17 |
| analysis | 14.5 |
| design | |
| accepted | |
| closed | 361.666666666666666666666666666666666667 |
+-------------+------------------------------------------+
I agree with @MatthewMcPeak. Your requirements seem a bit odd: you spend zero days in the requirement stage but an average of 21 days in closed?
This solution treats the presented date as the start date of the stage and calculates the difference between it and the start_date of the next phase.
with cte as (
select status
, lead(dd ignore nulls) over (partition by id order by dd) - dd as dt_diff
from your_table)
select status, avg(dt_diff) as avg_ela
from cte
group by status
/
If you wish to include all stages for each d and estimate the time spent in each (using linear interpolation) then you can create a sub-query with all the statuses and use a PARTITION OUTER JOIN to join them and then use LAG and LEAD to find the date range the status is in and interpolate between:
Oracle Setup:
CREATE TABLE data ( d, dt, status ) AS
SELECT 1, TO_DATE( '1/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/8/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/30/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/18/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/2/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/29/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '2/5/18', 'MM/DD/YY' ), 'closed' FROM DUAL;
Query:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT d,
dt,
status,
( next_dt - recent_dt ) / (next_id - recent_id ) AS estimated_duration
FROM ranges;
Output:
D | DT | STATUS | ESTIMATED_DURATION
-: | :-------- | :---------- | ---------------------------------------:
1 | 01-JAN-18 | requirement | 7
1 | 08-JAN-18 | analysis | 7.33333333333333333333333333333333333333
1 | null | design | 7.33333333333333333333333333333333333333
1 | null | accepted | 7.33333333333333333333333333333333333333
1 | 30-JAN-18 | closed | 0
2 | 01-FEB-18 | requirement | 4.25
2 | null | analysis | 4.25
2 | null | design | 4.25
2 | null | accepted | 4.25
2 | 18-FEB-18 | closed | 0
3 | 02-JAN-18 | requirement | 27
3 | 29-JAN-18 | analysis | 2.33333333333333333333333333333333333333
3 | null | design | 2.33333333333333333333333333333333333333
3 | null | accepted | 2.33333333333333333333333333333333333333
3 | 05-FEB-18 | closed | 0
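The interpolation rule is easier to follow for a single id. The Python sketch below is a simplified illustration and not the answer's method verbatim: `estimated_durations` and `STAGES` are hypothetical names, the stage order is assumed to be the one listed in the `statuses` sub-query, and the time between two dated stages is spread evenly over the stage positions between them.

```python
from datetime import date

# Assumed stage order, taken from the statuses sub-query above.
STAGES = ["requirement", "analysis", "design", "accepted", "closed"]

def estimated_durations(known):
    """known: {status: date} for one id; stages without a date are interpolated."""
    dated = [(i, known[s]) for i, s in enumerate(STAGES, start=1) if known.get(s) is not None]
    result = {}
    for i, status in enumerate(STAGES, start=1):
        # Most recent dated stage at or before this position (recent_id, recent_dt).
        recent = max((p for p in dated if p[0] <= i), default=None)
        # First dated stage after this position (next_id, next_dt).
        nxt = min((p for p in dated if p[0] > i), default=None)
        if recent is None:
            result[status] = None
        elif nxt is None:
            # No later date: a dated final stage gets 0, an undated one stays unknown.
            result[status] = 0.0 if recent[0] == i else None
        else:
            result[status] = (nxt[1] - recent[1]).days / (nxt[0] - recent[0])
    return result

id1 = estimated_durations({
    "requirement": date(2018, 1, 1),
    "analysis": date(2018, 1, 8),
    "closed": date(2018, 1, 30),
})
for status, days in id1.items():
    print(status, days)
```

For id 1 the 22 days between analysis and closed are divided over three stage positions, which gives the 7.33 shown for analysis, design and accepted.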
Query 2:
Then you can easily change that to take the average for each status:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT status,
AVG( ( next_dt - recent_dt ) / (next_id - recent_id ) ) AS estimated_duration
FROM ranges
GROUP BY status, id
ORDER BY id;
Results:
STATUS | ESTIMATED_DURATION
:---------- | ---------------------------------------:
requirement | 12.75
analysis | 4.63888888888888888888888888888888888889
design | 4.63888888888888888888888888888888888889
accepted | 4.63888888888888888888888888888888888889
closed | 0
