I have 2 rows with 2 periods of time that intersect. For example:
---------------------------------------------
| START_DATE | END_DATE |
---------------------------------------------
| 01/01/2018 08:00:00 | 01/01/2018 09:30:00 |
| 01/01/2018 08:30:00 | 01/01/2018 10:00:00 |
---------------------------------------------
There are 30 minutes where both periods intersect, and I want to avoid that. I would like to merge both rows into a single row, taking the earlier start date and the later end date:
---------------------------------------------
| START_DATE | END_DATE |
---------------------------------------------
| 01/01/2018 08:00:00 | 01/01/2018 10:00:00 |
---------------------------------------------
Do you have any idea how I can get this result with a SQL statement?
For two rows, just use greatest() and least(). The problem arises when you have many rows that may overlap in different ways. You can:
add row numbers to each row,
assign groups for overlapping periods using recursive query,
group data using this value and find min and max dates in each group.
with
  r(rn, start_date, end_date) as (
    select row_number() over(order by start_date), start_date, end_date from t ),
  c(rn, start_date, end_date, grp) as (
    select rn, start_date, end_date, 1 from r where rn = 1
    union all
    select r.rn,
           case when r.start_date <= c.end_date and c.start_date <= r.end_date
                then least(r.start_date, c.start_date) else r.start_date end,
           case when r.start_date <= c.end_date and c.start_date <= r.end_date
                then greatest(r.end_date, c.end_date) else r.end_date end,
           case when r.start_date <= c.end_date and c.start_date <= r.end_date
                then grp else grp + 1 end
    from c join r on r.rn = c.rn + 1)
select min(start_date), max(end_date) from c group by grp
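For the simple two-row case mentioned at the start of this answer, a minimal sketch with least() and greatest() might look like this. The inline t CTE only reproduces the two sample rows from the question, and the self-join condition is one assumed way of pairing the rows; it has not been run against the original data.

```sql
-- Sketch: merge exactly two overlapping rows with least()/greatest().
-- The CTE t stands in for the real table and holds the question's sample rows.
with t (start_date, end_date) as (
  select to_date('01/01/2018 08:00:00', 'DD/MM/YYYY HH24:MI:SS'),
         to_date('01/01/2018 09:30:00', 'DD/MM/YYYY HH24:MI:SS') from dual
  union all
  select to_date('01/01/2018 08:30:00', 'DD/MM/YYYY HH24:MI:SS'),
         to_date('01/01/2018 10:00:00', 'DD/MM/YYYY HH24:MI:SS') from dual
)
select least(a.start_date, b.start_date) as start_date,
       greatest(a.end_date, b.end_date)  as end_date
from t a
join t b
  on a.start_date <  b.start_date   -- pair the earlier-starting row with the later one
 and b.start_date <= a.end_date;    -- keep the pair only if the periods overlap
```

With the two sample rows this should return a single row running from 08:00:00 to 10:00:00. It does not generalize to chains of three or more overlapping rows, which is what the recursive query is for.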
If all you have is a set of date ranges, with no other correlating or constraining criteria, and you want to reduce that to a set of non-overlapping ranges, you can do that with a recursive query like this one:
with recur(start_date, end_date) as (
  select * from yourdata yd
  where not exists (select 1 from yourdata cyd
                    where yd.start_Date between cyd.start_date and cyd.end_date
                      and (yd.start_date <> cyd.start_date or yd.end_date <> cyd.end_date))
  union all
  select r.start_date
       , yd.end_date
  from recur r
  join yourdata yd
    on r.start_date < yd.start_date
   and yd.start_date <= r.end_date
   and r.end_date < yd.end_date
)
select start_date, max(end_date) end_Date from recur group by start_Date;
In this query the anchor (the part before the union all) selects all records whose start date is not contained in any other range.
The recursive part (the part after the union all) then selects ranges that extend the current range. In both halves the original start date is returned, while in the recursive part the new, extended end date is returned. This results in a set of overlapping ranges with a common start date.
Finally, the output query returns the start date and the max end date, grouped by start date.
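To see the anchor and the recursive part at work, here is a self-contained sketch of the same query. The yourdata CTE is an assumption added for illustration: it holds the two overlapping rows from the question plus one invented non-overlapping row (11:00 to 12:00).

```sql
-- Sketch: the recursive merge run against inline sample data.
with yourdata (start_date, end_date) as (
  select to_date('2018-01-01 08:00', 'YYYY-MM-DD HH24:MI'),
         to_date('2018-01-01 09:30', 'YYYY-MM-DD HH24:MI') from dual
  union all
  select to_date('2018-01-01 08:30', 'YYYY-MM-DD HH24:MI'),
         to_date('2018-01-01 10:00', 'YYYY-MM-DD HH24:MI') from dual
  union all
  -- hypothetical extra row that overlaps nothing
  select to_date('2018-01-01 11:00', 'YYYY-MM-DD HH24:MI'),
         to_date('2018-01-01 12:00', 'YYYY-MM-DD HH24:MI') from dual
),
recur (start_date, end_date) as (
  -- anchor: ranges whose start is not inside any other range
  select yd.start_date, yd.end_date
  from yourdata yd
  where not exists (select 1 from yourdata cyd
                    where yd.start_date between cyd.start_date and cyd.end_date
                      and (yd.start_date <> cyd.start_date or yd.end_date <> cyd.end_date))
  union all
  -- recursive part: keep the start, take the later end of an overlapping range
  select r.start_date, yd.end_date
  from recur r
  join yourdata yd
    on r.start_date < yd.start_date
   and yd.start_date <= r.end_date
   and r.end_date < yd.end_date
)
select start_date, max(end_date) as end_date
from recur
group by start_date
order by start_date;
```

Tracing it by hand, the anchor picks the 08:00 and 11:00 rows, the recursive part extends the 08:00 range to 10:00, and the final grouping should leave two rows: 08:00 to 10:00 and 11:00 to 12:00.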
I have the following recursive CTE which splits each element coming from base per month:
with
  base (id, start_date, end_date) as (
    select 1, date '2022-01-15', date '2022-03-15' from dual
    union
    select 2, date '2022-09-15', date '2022-12-31' from dual
    union
    select 3, date '2023-09-15', date '2023-09-25' from dual
  ),
  split (id, start_date, end_date) as (
    select base.id, base.start_date, least(last_day(base.start_date), base.end_date)
    from base
    union all
    select base.id, split.end_date + 1, least(last_day(split.end_date + 1), base.end_date)
    from base join split on base.id = split.id and split.end_date < base.end_date
  )
select * from split order by id, start_date, end_date;
It works on Oracle and gives the following result:
| id | start_date | end_date   |
|----|------------|------------|
| 1  | 2022-01-15 | 2022-01-31 |
| 1  | 2022-02-01 | 2022-02-28 |
| 1  | 2022-03-01 | 2022-03-15 |
| 2  | 2022-09-15 | 2022-09-30 |
| 2  | 2022-10-01 | 2022-10-31 |
| 2  | 2022-11-01 | 2022-11-30 |
| 2  | 2022-12-01 | 2022-12-31 |
| 3  | 2023-09-15 | 2023-09-25 |
The two following stop conditions work correctly:
... from base join split on base.id = split.id and split.end_date < base.end_date
... from base, split where base.id = split.id and split.end_date < base.end_date
The following one fails with the message ORA-32044: cycle detected while executing recursive WITH query:
... from base join split on base.id = split.id where split.end_date < base.end_date
I fail to understand how the last one is different from the two others.
It looks like a bug as all your queries should result in identical explain plans.
However, you can rewrite the recursive sub-query without the join (and using a SEARCH clause so you may not have to re-order the query later):
WITH split (id, start_date, month_end, end_date) AS (
SELECT id,
start_date,
LEAST(
ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' SECOND,
end_date
),
end_date
FROM base
UNION ALL
SELECT id,
month_end + INTERVAL '1' SECOND,
LEAST(
ADD_MONTHS(month_end, 1),
end_date
),
end_date
FROM split
WHERE month_end < end_date
) SEARCH DEPTH FIRST BY id, start_date SET order_id
SELECT id,
start_date,
month_end AS end_date
FROM split;
Note: if you want to just use values at midnight rather than the entire month then use INTERVAL '1' DAY rather than 1 second.
Which, for the sample data:
CREATE TABLE base (id, start_date, end_date) as
select 1, date '2022-01-15', date '2022-04-15' from dual union all
select 2, date '2022-09-15', date '2022-12-31' from dual union all
select 3, date '2023-09-15', date '2023-09-25' from dual;
Outputs:
| ID | START_DATE           | END_DATE             |
|----|----------------------|----------------------|
| 1  | 2022-01-15T00:00:00Z | 2022-01-31T23:59:59Z |
| 1  | 2022-02-01T00:00:00Z | 2022-02-28T23:59:59Z |
| 1  | 2022-03-01T00:00:00Z | 2022-03-31T23:59:59Z |
| 1  | 2022-04-01T00:00:00Z | 2022-04-15T00:00:00Z |
| 2  | 2022-09-15T00:00:00Z | 2022-09-30T23:59:59Z |
| 2  | 2022-10-01T00:00:00Z | 2022-10-31T23:59:59Z |
| 2  | 2022-11-01T00:00:00Z | 2022-11-30T23:59:59Z |
| 2  | 2022-12-01T00:00:00Z | 2022-12-31T00:00:00Z |
| 3  | 2023-09-15T00:00:00Z | 2023-09-25T00:00:00Z |
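The note about using INTERVAL '1' DAY instead of one second can be sketched as follows. This is an untested variant of the join-free query; the inline base CTE is taken from the question's sample data so that the block is self-contained.

```sql
-- Sketch: the same join-free recursion, but working on whole days (midnight values).
WITH base (id, start_date, end_date) AS (
  SELECT 1, DATE '2022-01-15', DATE '2022-03-15' FROM dual UNION ALL
  SELECT 2, DATE '2022-09-15', DATE '2022-12-31' FROM dual UNION ALL
  SELECT 3, DATE '2023-09-15', DATE '2023-09-25' FROM dual
),
split (id, start_date, month_end, end_date) AS (
  -- anchor: first piece ends on the last day of the start month, or earlier
  SELECT id,
         start_date,
         LEAST(ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' DAY, end_date),
         end_date
  FROM   base
  UNION ALL
  -- next piece starts the day after the previous month end;
  -- ADD_MONTHS on a month-end date returns the next month's last day
  SELECT id,
         month_end + INTERVAL '1' DAY,
         LEAST(ADD_MONTHS(month_end, 1), end_date),
         end_date
  FROM   split
  WHERE  month_end < end_date
)
SELECT id,
       start_date,
       month_end AS end_date
FROM   split
ORDER BY id, start_date;
```

For these three base rows the pieces should match the eight rows listed in the question (for example 2022-01-15 to 2022-01-31, then 2022-02-01 to 2022-02-28, and so on).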
It's because WHERE and ON conditions are not evaluated at the same level:
when the condition is in the ON clause it limits the rows taking part in the JOIN, whereas in the WHERE clause it filters the results after the JOIN has been applied, and since a recursive CTE sees all rows selected up to now...
I have data with overlapping date ranges. Example below:
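| Customer_ID | FAC_NUM | Start_Date  | End_Date    | New_Monies |
|-------------|---------|-------------|-------------|------------|
| 12345       | ABC1234 | 26/NOV/2014 | 26/MAY/2015 | 100000     |
| 12345       | ABC1234 | 12/DEC/2014 | 12/JUN/2015 | 200000     |
| 12345       | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000     |
| 12345       | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000     |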
I want to convert this table into data with non-overlapping ranges such that, for each overlapping period, the New_Monies column is summed and shown as a new row. For the example above, I want the output to be as follows:
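| Customer_ID | FAC_NUM | Start_Date  | End_Date    | New_Monies |
|-------------|---------|-------------|-------------|------------|
| 12345       | ABC1234 | 26/NOV/2014 | 11/DEC/2014 | 100000     |
| 12345       | ABC1234 | 12/DEC/2014 | 26/MAY/2015 | 300000     |
| 12345       | ABC1234 | 27/MAY/2015 | 12/JUN/2015 | 200000     |
| 12345       | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000     |
| 12345       | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000     |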
Row 2 above is the overlapping period of 12 Dec 2014 to 26 May 2015, showing the total New_Monies as 300000 (100000 + 200000).
What would be the best way to do this in Oracle?
Thanks in advance for your support.
Regards,
Ani
with
prep (customer_id, fac_num, dt, amount) as (
select t.customer_id, t.fac_num,
case h.col when 's' then t.start_date else t.end_date + 1 end as dt,
case h.col when 's' then t.new_monies else - t.new_monies end as amount
from sample_data t
cross join
(select 's' as col from dual union all select 'e' from dual) h
)
, cumul_sums (customer_id, fac_num, dt, amount) as (
select distinct
customer_id, fac_num, dt,
sum(amount) over (partition by customer_id, fac_num order by dt)
from prep
)
, with_intervals (customer_id, fac_num, start_date, end_date, amount) as (
select customer_id, fac_num, dt,
lead(dt) over (partition by customer_id, fac_num order by dt) - 1,
amount
from cumul_sums
)
select customer_id, fac_num, start_date, end_date, amount
from with_intervals
where end_date is not null
order by customer_id, fac_num, start_date
;
The prep subquery unpivots the inputs, while at the same time changing the "end date" to the "start date" of the following interval and assigning a positive amount to the "start date" and the negative of the same amount to the following "start date".
cumul_sums computes the cumulative sums; note that if two or more intervals begin on the same date (so the same date from prep appears multiple times for a customer and fac_num), the analytic sum will include the amounts from ALL the rows up to that date - the default windowing clause is range between...... After the cumulative sums are computed, this subquery also de-duplicates the output rows (to handle precisely that complication, of multiple intervals starting on the same date).
with_intervals recovers the "start date" - "end date" intervals, and the final step simply removes the last interval ("to infinity"), which would have an "amount" of zero.
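To try the query, the question's four rows can be loaded into a table named sample_data, the name the query above expects. This setup is only a sketch: the column types are inferred from the literals and are an assumption, not something the original post specifies.

```sql
-- Sketch: sample rows from the question, loaded into the table the query reads from.
-- Column types are inferred from the literals (assumed, not from the original post).
create table sample_data (customer_id, fac_num, start_date, end_date, new_monies) as
  select 12345, 'ABC1234', date '2014-11-26', date '2015-05-26', 100000 from dual union all
  select 12345, 'ABC1234', date '2014-12-12', date '2015-06-12', 200000 from dual union all
  select 12345, 'ABC1234', date '2015-06-15', date '2015-12-15', 500000 from dual union all
  select 12345, 'ABC1234', date '2015-12-20', date '2016-06-20', 600000 from dual;
```

Tracing the query by hand against these rows, the gaps between the ranges (13 to 14 June 2015 and 16 to 19 December 2015) should also come back as rows with an amount of 0. If those are not wanted, adding a condition such as `and amount <> 0` to the final WHERE clause would be one way to drop them.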
EDIT This solution answers the OP's original question. After posting the solution, the OP changed the question. The solution can be changed easily to address the new formulation. I'm not going to chase shadows though; the solution will remain as is.
Here is a way to do this.
with all_data
as (select Customer_ID,FAC_NUM,start_date as dt,new_monies as calc_monies
from t
union all
select Customer_ID,FAC_NUM,end_date as dt,new_monies*-1 as calc_monies
from t
)
select x.customer_id
,x.fac_num
,x.start_date
,case when row_number() over(order by end_date desc)=1 then
x.end_date + 1
else x.end_date
end as new_end_date
,x.new_monies
from (
select t.customer_id
,t.fac_num
,t.dt as start_date
,lead(dt) over(order by dt)-1 as end_date
,sum(calc_monies) over(order by dt) as new_monies
from all_data t
)x
where end_date is not null
order by 3
db fiddle link
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=856c9ac0954e45429994f4ac45699e6f
+-------------+---------+------------+--------------+------------+
| CUSTOMER_ID | FAC_NUM | START_DATE | NEW_END_DATE | NEW_MONIES |
+-------------+---------+------------+--------------+------------+
| 12345 | ABC1234 | 26-NOV-14 | 11-DEC-14 | 100000 |
| 12345 | ABC1234 | 12-DEC-14 | 25-MAY-15 | 300000 |
| 12345 | ABC1234 | 26-MAY-15 | 12-JUN-15 | 200000 |
+-------------+---------+------------+--------------+------------+
I'm trying to split a record into multiple records based on its start/end dates in Oracle.
I have data like this:
MachineID | start date | end date | running time |
WC01 | 2019/09/05 07:00 | 2019/09/07 09:00 | 26:00 |
and I want to split the record into one row per day, where each day runs from 08:00 to 08:00:
MachineID | running date | running time |
WC01 | 2019/09/05 | 1:00 |
WC01 | 2019/09/06 | 24:00 |
WC01 | 2019/09/07 | 1:00 |
Thank you for your help!
We can handle this with the help of a calendar table that covers all the dates you expect to appear in your data set, with a separate record for each minute:
WITH dates AS (
SELECT TIMESTAMP '2019-09-05 00:00:00' + NUMTODSINTERVAL(rownum, 'MINUTE') AS dt
FROM dual
CONNECT BY level <= 5000
)
SELECT
m.MachineID,
TRUNC(d.dt) AS running_date,
COUNT(t.MachineID) / 60 AS running_hours
FROM dates d
CROSS JOIN (SELECT DISTINCT MachineID FROM yourTable) m
LEFT JOIN yourTable t
ON d.dt >= t.start_date AND d.dt < t.end_date
WHERE
TO_CHAR(d.dt, 'HH24') >= '08' AND TO_CHAR(d.dt, 'HH24') < '21'
GROUP BY
m.MachineID,
TRUNC(d.dt)
ORDER BY
TRUNC(d.dt);
You can try the query below:
SELECT
MACHINEID,
RUNNING_DATE,
DECODE(RUNNING_DATE, TRUNC(START_DATE), CASE
WHEN DIFF_START < 0 THEN 0
WHEN DIFF_START > 12 THEN 12
ELSE DIFF_START
END, TRUNC(END_DATE), CASE
WHEN DIFF_END < 0 THEN 0
WHEN DIFF_END > 12 THEN 12
ELSE DIFF_END
END, 24) AS RUNNING_HOURS
FROM
(
SELECT
MACHINEID,
RUNNING_DATE,
ROUND(24 *((TRUNC(START_DATE + LVL - 1) + 8 / 24) - START_DATE)) AS DIFF_START,
ROUND(24 *(END_DATE -(TRUNC(START_DATE + LVL - 1) + 8 / 24))) AS DIFF_END,
START_DATE,
END_DATE
FROM
(
SELECT
DISTINCT MACHINEID,
LEVEL AS LVL,
START_DATE,
END_DATE,
TRUNC(START_DATE + LEVEL - 1) AS RUNNING_DATE
FROM
YOURTABLE
CONNECT BY
LEVEL <= TRUNC(END_DATE) - TRUNC(START_DATE) + 1
)
);
Change the logic wherever it does not meet your requirements. I wrote the query with your sample data and expected output in mind.
Cheers!!
We have a set of values which we use to populate a bar chart. For this application, we always need 5 years of data, that is, 5 rows, even if the values are NULL.
See this query. Assume that the DATE column goes 2017, 2016, 2015 and so on. Even though we may have no data for 2014 and 2013, I still need to return a row for 2014 and a row for 2013, with NULL in the other column.
SELECT period_date, actual_eps
FROM (SELECT LAST_DAY(TO_DATE(TO_CHAR(period_date),'YYYYMM')) period_date, actual_eps
FROM period_data
WHERE ticker = 'ADRO'
AND period_type = 'A'
AND actual_eps IS NOT NULL
ORDER BY period_date DESC NULLS LAST)
WHERE rownum <= 5;
So, it should return the rows it has, up to 5, and NULL rows for the years it does not have, so that there are always 5 rows.
Thanks in advance
Try using a Common Table Expression/Subquery Factoring to generate rows for each year value. Use a RIGHT JOIN to generate NULLs for any missing rows.
Normally I would use a LEFT JOIN. But in this case I think it reads better this way.
Use NVL to substitute the year for NULL period_date values.
with years as
(
select to_char(sysdate, 'YYYY') as year from dual
UNION ALL
select to_char(add_months(sysdate,-12), 'YYYY') as year from dual
UNION ALL
select to_char(add_months(sysdate,-24), 'YYYY') as year from dual
UNION ALL
select to_char(add_months(sysdate,-36), 'YYYY') as year from dual
UNION ALL
select to_char(add_months(sysdate,-48), 'YYYY') as year from dual
)
SELECT
NVL(TO_CHAR(LAST_DAY(pd.period_date),'YYYYMM'),y.year) as period_date,
pd.actual_eps
FROM period_data pd
RIGHT JOIN years y ON y.year = to_char(pd.period_date,'YYYY')
AND pd.ticker = 'ADRO'
AND pd.period_type = 'A'
AND pd.actual_eps IS NOT NULL
WHERE rownum <= 5
ORDER BY period_date desc, actual_eps nulls last;
Output:
| PERIOD_DATE | ACTUAL_EPS |
|-------------|------------|
| 201902 | foo |
| 201802 | foo |
| 201702 | foo |
| 2016 | (null) |
| 2015 | (null) |
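As a side note, the five UNION ALL branches of the years CTE could probably be collapsed with CONNECT BY LEVEL, the row-generator idiom. The sketch below is a swapped-in alternative, not the query used for the output above, and only shows the year generation.

```sql
-- Sketch: generate the current year and the four years before it in one branch.
with years as (
  -- level runs 1..5, so the offsets are 0, -12, -24, -36 and -48 months
  select to_char(add_months(sysdate, -12 * (level - 1)), 'YYYY') as year
  from dual
  connect by level <= 5
)
select year
from years
order by year desc;
```

The rest of the statement (the RIGHT JOIN against period_data and the NVL) would stay as it is; only the definition of years changes.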
Imagine this scenario (YYYY/MM/DD):
Start date: 2015/01/01 End date: 2015/08/10
Start date: 2014/10/03 End date: 2015/07/06
Start date: 2015/09/30 End date: 2016/04/28
Using PL/SQL can I calculate the distinct days between these overlapping dates?
Edit: My table has 2 DATE columns, Start_Date and End_Date. The result I'm expecting is 515 days ((2015/08/10 - 2014/10/03) + (2016/04/28 -2015/09/30))
You can also do this with pure SQL (no need for PL/SQL):
with
  minmax as (select min(start_date) min_dt, max(end_date) max_dt from myTable ),
  dates as (
    SELECT min_dt + rownum-1 dt1
    FROM minmax CONNECT BY ROWNUM <= (max_dt - min_dt +1)
  )
select count(*) from dates
where exists(
  select 1 from MyTable T2
  where dates.dt1 between T2.start_date and T2.end_date )
NOTE: this is just an idea, written from memory and not tested. Adapt the generated dates as needed (start date and number of days).
Hope it helps.
EDIT: Using actual table dates
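A self-contained sketch of this idea, run against the three ranges from the question, might look like the following. Two things are assumptions rather than part of the original answer: the inline mytable CTE standing in for the real table, and the half-open comparison (start day included, end day excluded), chosen so that each range contributes the same number of days as end_date - start_date.

```sql
-- Sketch: count distinct calendar days covered by any range.
with
  mytable (start_date, end_date) as (
    select date '2015-01-01', date '2015-08-10' from dual union all
    select date '2014-10-03', date '2015-07-06' from dual union all
    select date '2015-09-30', date '2016-04-28' from dual
  ),
  minmax as (
    select min(start_date) min_dt, max(end_date) max_dt from mytable
  ),
  dates as (
    -- one row per calendar day from the earliest start to the latest end
    select min_dt + level - 1 as dt1
    from minmax
    connect by level <= max_dt - min_dt + 1
  )
select count(*) as number_of_days
from dates
where exists (
  select 1
  from mytable t2
  where dates.dt1 >= t2.start_date
    and dates.dt1 <  t2.end_date   -- end day excluded, as in a date subtraction
);
```

Counted this way the three sample ranges should give 522 days (311 for the merged first two ranges plus 211 for the third). With BETWEEN, both end days are counted as well, which adds one day per merged range.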
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE DATES ( start_date, end_date ) AS
SELECT DATE '2015-01-01', DATE '2015-08-10' FROM DUAL
UNION ALL SELECT DATE '2014-10-03', DATE '2015-07-06' FROM DUAL
UNION ALL SELECT DATE '2015-09-30', DATE '2016-04-28' FROM DUAL
Query 1:
SELECT COUNT( DISTINCT COLUMN_VALUE ) AS number_of_days
FROM DATES d,
TABLE(
CAST(
MULTISET(
SELECT d.START_DATE + LEVEL - 1
FROM DUAL
CONNECT BY d.START_DATE + LEVEL - 1 < d.END_DATE
)
AS SYS.ODCIDATELIST
)
)
ORDER BY 1
Results:
| NUMBER_OF_DAYS |
|----------------|
| 522 |
Query 2 - Check:
SELECT DATE '2015-08-10' - DATE '2014-10-03'
+ DATE '2016-04-28' - DATE '2015-09-30'
FROM DUAL
Results:
| DATE'2015-08-10'-DATE'2014-10-03'+DATE'2016-04-28'-DATE'2015-09-30' |
|---------------------------------------------------------------------|
| 522 |