Calculate average values in Oracle

I want to calculate average values in Oracle tables
CREATE TABLE AGENT_HISTORY(
  EVENT_ID INTEGER NOT NULL,
  AGENTID INTEGER NOT NULL,
  EVENT_DATE DATE NOT NULL
)
/
CREATE TABLE CPU_HISTORY(
  CPU_HISTORY_ID INTEGER NOT NULL,
  EVENT_ID INTEGER NOT NULL,
  CPU_NAME VARCHAR2(50) NOT NULL,
  CPU_VALUE NUMBER NOT NULL
)
/
I use this SQL query:
----- FOR 24 HOURS CPU
CURSOR LAST_24_CPU_CURSOR IS
--SELECT EVENT_DATE, CPU FROM AGENT_HISTORY WHERE NAME = NAMEIN AND EVENT_DATE >= SYSDATE-(60*24)/1440;
SELECT START_DATE, NVL(AVG(CH.CPU_VALUE),0)
FROM (SELECT START_DATE - (LVL+1)/24 START_DATE, START_DATE - LVL/24 END_DATE
FROM (SELECT SYSDATE START_DATE, LEVEL LVL FROM DUAL CONNECT BY LEVEL <= 24))
LEFT JOIN AGENT_HISTORY AH ON EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH ON AH.EVENT_ID = CH.EVENT_ID
JOIN AGENT AG ON AH.AGENTID = AG.ID
WHERE AG.NAME = NAMEIN
GROUP BY START_DATE
ORDER BY 1;
This query returns only one average value. I would like to modify it so that it returns 24 values, one hourly average for each of the last 24 hours. Can you help me modify the query?

I guess your input contains data only for one of the given intervals; since you're using an INNER JOIN with AGENT, which in turn is filtered by AGENT_HISTORY, you're effectively converting all your LEFT JOINs into inner joins.
I suggest you use a CROSS JOIN between AGENT and the timeslots instead:
with agent_history(event_date, agentid, event_id) as (
  select timestamp '2015-11-18 09:00:07', 1, 1001 from dual
),
agent(id, name) as (
  select 1, 'myAgent' from dual
),
cpu_history(event_id, cpu_value) as (
  select 1001, 75.2 from dual
),
time_slots(start_date, end_date) as (
  SELECT START_DATE - (LVL + 1) / 24 START_DATE,
         START_DATE - LVL / 24       END_DATE
  FROM   (SELECT SYSDATE START_DATE,
                 LEVEL   LVL
          FROM   DUAL
          CONNECT BY LEVEL <= 24)
)
SELECT START_DATE,
       NVL(AVG(CH.CPU_VALUE), 0)
FROM   time_slots ts
CROSS JOIN AGENT AG
LEFT JOIN AGENT_HISTORY AH
       ON AH.AGENTID = AG.ID
      AND EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH
       ON AH.EVENT_ID = CH.EVENT_ID
WHERE  AG.NAME = 'myAgent'
GROUP BY START_DATE
ORDER BY 1;
This ensures you get the full 24 rows (one for each timeslot).

Change start_date to to_char(start_date, 'hh24:mi') in both the SELECT and GROUP BY clauses.
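For example, applied to the final SELECT of the query above (a sketch; the WITH clause stays as it is, and note that ORDER BY 1 now sorts the hh24:mi labels alphabetically rather than chronologically across midnight):
-- Sketch: same joins as above, but labelling each slot with its hour and minute.
SELECT TO_CHAR(start_date, 'hh24:mi') AS slot_start,
       NVL(AVG(ch.cpu_value), 0)      AS avg_cpu
FROM   time_slots ts
CROSS JOIN agent ag
LEFT JOIN agent_history ah
       ON ah.agentid = ag.id
      AND ah.event_date BETWEEN ts.start_date AND ts.end_date
LEFT JOIN cpu_history ch
       ON ah.event_id = ch.event_id
WHERE  ag.name = 'myAgent'
GROUP BY TO_CHAR(start_date, 'hh24:mi')
ORDER BY 1;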

Related

Stop condition for recursive CTE on Oracle (ORA-32044)

I have the following recursive CTE which splits each element coming from base per month:
with
base (id, start_date, end_date) as (
select 1, date '2022-01-15', date '2022-03-15' from dual
union
select 2, date '2022-09-15', date '2022-12-31' from dual
union
select 3, date '2023-09-15', date '2023-09-25' from dual
),
split (id, start_date, end_date) as (
select base.id, base.start_date, least(last_day(base.start_date), base.end_date) from base
union all
select base.id, split.end_date + 1, least(last_day(split.end_date + 1), base.end_date) from base join split on base.id = split.id and split.end_date < base.end_date
)
select * from split order by id, start_date, end_date;
It works on Oracle and gives the following result:
id | start_date | end_date
:- | :--------- | :---------
1  | 2022-01-15 | 2022-01-31
1  | 2022-02-01 | 2022-02-28
1  | 2022-03-01 | 2022-03-15
2  | 2022-09-15 | 2022-09-30
2  | 2022-10-01 | 2022-10-31
2  | 2022-11-01 | 2022-11-30
2  | 2022-12-01 | 2022-12-31
3  | 2023-09-15 | 2023-09-25
The two following stop conditions work correctly:
... from base join split on base.id = split.id and split.end_date < base.end_date
... from base, split where base.id = split.id and split.end_date < base.end_date
The following one fails with the message ORA-32044: cycle detected while executing recursive WITH query:
... from base join split on base.id = split.id where split.end_date < base.end_date
I fail to understand how the last one is different from the two others.
It looks like a bug as all your queries should result in identical explain plans.
However, you can rewrite the recursive sub-query without the join (and using a SEARCH clause so you may not have to re-order the query later):
WITH split (id, start_date, month_end, end_date) AS (
SELECT id,
start_date,
LEAST(
ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' SECOND,
end_date
),
end_date
FROM base
UNION ALL
SELECT id,
month_end + INTERVAL '1' SECOND,
LEAST(
ADD_MONTHS(month_end, 1),
end_date
),
end_date
FROM split
WHERE month_end < end_date
) SEARCH DEPTH FIRST BY id, start_date SET order_id
SELECT id,
start_date,
month_end AS end_date
FROM split;
Note: if you just want values at midnight (whole days) rather than the last second of each month, use INTERVAL '1' DAY rather than INTERVAL '1' SECOND.
Which, for the sample data:
CREATE TABLE base (id, start_date, end_date) as
select 1, date '2022-01-15', date '2022-04-15' from dual union all
select 2, date '2022-09-15', date '2022-12-31' from dual union all
select 3, date '2023-09-15', date '2023-09-25' from dual;
Outputs:
ID | START_DATE           | END_DATE
:- | :------------------- | :-------------------
1  | 2022-01-15T00:00:00Z | 2022-01-31T23:59:59Z
1  | 2022-02-01T00:00:00Z | 2022-02-28T23:59:59Z
1  | 2022-03-01T00:00:00Z | 2022-03-31T23:59:59Z
1  | 2022-04-01T00:00:00Z | 2022-04-15T00:00:00Z
2  | 2022-09-15T00:00:00Z | 2022-09-30T23:59:59Z
2  | 2022-10-01T00:00:00Z | 2022-10-31T23:59:59Z
2  | 2022-11-01T00:00:00Z | 2022-11-30T23:59:59Z
2  | 2022-12-01T00:00:00Z | 2022-12-31T00:00:00Z
3  | 2023-09-15T00:00:00Z | 2023-09-25T00:00:00Z
fiddle
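For reference, a sketch of the whole-day variant mentioned in the note above (the same recursion, just stepping by INTERVAL '1' DAY so the boundaries fall on whole days):
-- Sketch: day-granularity version of the query above; month_end becomes the
-- last day of the month instead of its last second.
WITH split (id, start_date, month_end, end_date) AS (
  SELECT id,
         start_date,
         LEAST(ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' DAY, end_date),
         end_date
  FROM   base
  UNION ALL
  SELECT id,
         month_end + INTERVAL '1' DAY,
         LEAST(ADD_MONTHS(month_end, 1), end_date),
         end_date
  FROM   split
  WHERE  month_end < end_date
) SEARCH DEPTH FIRST BY id, start_date SET order_id
SELECT id,
       start_date,
       month_end AS end_date
FROM   split;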
It's because the WHERE and ON conditions are not evaluated at the same level:
when the condition is in the ON clause it limits the rows involved in the JOIN, whereas when it's in the WHERE clause it filters the results after the JOIN has been applied, and since a recursive CTE sees all rows selected so far...

Oracle PLSQL generating absences

I have some PL/SQL code that generates a list of dates from a range, which seems to be working fine.
As part of a larger project I want to generate a procedure that will create a list of absences for each employee.
My first step is to use the MINUS operator to remove all the holidays that fall into the range of dates. Is there an easy way of doing this instead of comparing each holiday one at a time (there may be several in the table) against the generated dates?
If possible, I would prefer breaking all these tasks into small procedures or functions for easy debugging and legibility.
If there is an easier way to do this I am open to all suggestions. Thanks in advance for your help, expertise and patience.
ALTER SESSION SET
NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';
create table holidays(
holiday_date DATE,
holiday_name VARCHAR2(20)
);
INSERT into holidays
(holiday_date,
holiday_name)
VALUES
(
TO_DATE('2021/07/21 00:00:00', 'yyyy/mm/dd hh24:mi:ss'), 'July 21 2021');
CREATE OR REPLACE PROCEDURE generate_dates
(
p_start_date IN DATE,
p_end_date IN DATE
)
AS
l_day DATE := p_start_date;
BEGIN
WHILE l_day <= p_end_date
LOOP
DBMS_OUTPUT.PUT_LINE(l_day);
l_day := l_day + 1;
END LOOP;
END generate_dates;
/
EXEC generate_dates(TRUNC(SYSDATE),TRUNC(SYSDATE+10));
Create table employees(
employee_id NUMBER(6),
first_name VARCHAR2(20),
last_name VARCHAR2(20),
card_num VARCHAR2(10),
work_days VARCHAR2(7)
);
ALTER TABLE employees
ADD ( CONSTRAINT employees_pk
PRIMARY KEY (employee_id));
INSERT INTO employees
(
EMPLOYEE_ID,
first_name,
last_name,
card_num,
work_days
)
WITH names AS (
SELECT 1, 'Jane', 'Doe', 'F123456', 'NYYYYYN' FROM dual UNION ALL
SELECT 2, 'Madison', 'Smith', 'R33432','NYYYYYN'
FROM dual UNION ALL
SELECT 3, 'Justin', 'Case', 'C765341','NYYYYYN'
FROM dual UNION ALL
SELECT 4, 'Mike', 'Jones', 'D564311','NYYYYYN' FROM dual
) SELECT * FROM names;
-- check to see if working for that day. Byte=Y for Yes
SELECT SUBSTR( work_days, to_char(TRUNC(SYSDATE), 'D'),1) Work_Day
FROM employees;
create table timeoff(
seq_num integer GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
employee_id NUMBER(6),
timeoff_date DATE,
timeoff_type VARCHAR2(1),
constraint timeoff_chk check (timeoff_date=trunc(timeoff_date, 'dd')),
constraint timeoff_pk primary key (employee_id, timeoff_date)
);
INSERT INTO timeoff (EMPLOYEE_ID,TIMEOFF_DATE,TIMEOFF_TYPE
)
WITH dts AS (
SELECT 1, to_date('20210726 00:00:00','YYYYMMDD HH24:MI:SS'),'V' FROM dual UNION ALL
SELECT 2, to_date('20210726 00:00:00','YYYYMMDD HH24:MI:SS'),'V' FROM dual UNION ALL
SELECT 2, to_date('20210727 00:00:00','YYYYMMDD HH24:MI:SS'),'V' FROM dual )
SELECT * FROM dts;
CREATE TABLE emp_attendance(
seq_num integer GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
employee_id NUMBER(6),
start_date DATE,
end_date DATE,
week_number NUMBER(2),
create_date DATE DEFAULT SYSDATE
);
create table absences(
seq_num integer GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
employee_id NUMBER(6),
absent_date DATE,
constraint absence_chk check (absent_date=trunc(absent_date, 'dd')),
constraint absence_pk primary key (employee_id, absent_date)
);
INSERT INTO emp_attendance ( EMPLOYEE_ID, START_DATE,END_DATE,WEEK_NUMBER)
WITH dts AS (
SELECT 1, to_date('20210728 13:10:00','YYYYMMDD HH24:MI:SS'),
to_date('20210728 23:15:00','YYYYMMDD HH24:MI:SS'), 30 FROM dual UNION ALL
SELECT 2, to_date('20210728 12:10:10','YYYYMMDD HH24:MI:SS'),
to_date('20210728 20:15:01','YYYYMMDD HH24:MI:SS'), 30 FROM dual)
SELECT * FROM dts;
CREATE OR REPLACE TYPE obj_date IS OBJECT (
date_val DATE
);
/
CREATE OR REPLACE TYPE nt_date IS TABLE OF obj_date;
/
CREATE OR REPLACE FUNCTION generate_dates(
p_from IN DATE
,p_to IN DATE)
RETURN nt_date PIPELINED
IS
-- normalize inputs to be as-of midnight
v_from DATE :=
TRUNC(NVL(p_from, SYSDATE));
v_to DATE := TRUNC(NVL(p_to, SYSDATE));
BEGIN
LOOP
EXIT WHEN v_from > v_to;
PIPE ROW (obj_date(v_from));
v_from := v_from + 1; -- next. calendar day
END LOOP;
RETURN;
END generate_dates;
/
INSERT INTO absences
(employee_id, absent_date)
SELECT e.employee_id,
c.date_val
FROM employees e
INNER JOIN table(generate_dates(date '2021-07-20', DATE '2021-07-31')) c
PARTITION BY ( e.employee_id )
ON (SUBSTR(e.work_days,
TRUNC(c.date_val) -
TRUNC(c.date_val, 'IW') + 1, 1) = 'Y')
WHERE NOT EXISTS (
SELECT 1
FROM holidays h
WHERE c.date_val = h.holiday_date
)
AND NOT EXISTS(
SELECT 1
FROM timeoff t
WHERE e.employee_id = t.employee_id
AND t.timeoff_date = c.date_val
)
AND NOT EXISTS(
SELECT 1
FROM emp_attendance ea
WHERE e.employee_id = ea.employee_id
AND TRUNC(ea.start_date) = c.date_val
)
ORDER BY
e.employee_id,
c.date_val
;
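For the MINUS step the question mentions, a minimal sketch (assuming the generate_dates pipelined function and the holidays table defined above) could be:
-- Sketch: strip all holidays from the generated calendar in one set operation,
-- instead of checking each holiday individually.
SELECT c.date_val
FROM   TABLE(generate_dates(DATE '2021-07-20', DATE '2021-07-31')) c
MINUS
SELECT h.holiday_date
FROM   holidays h;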
Don't use lots of procedures and/or functions; just use a single query:
SELECT e.employee_id,
c.day
FROM employees e
INNER JOIN (
WITH calendar ( start_date, end_date ) AS (
SELECT DATE '2021-07-01', DATE '2021-07-30' FROM DUAL
UNION ALL
SELECT start_date + 1, end_date
FROM calendar
WHERE start_date + 1 <= end_date
)
SELECT start_date AS day
FROM calendar
) c
PARTITION BY ( e.employee_id )
ON (SUBSTR(e.work_days, TRUNC(c.day) - TRUNC(c.day, 'IW') + 1, 1) = 'Y')
WHERE NOT EXISTS (
SELECT 1
FROM holidays h
WHERE c.day = h.holiday_date
)
AND NOT EXISTS(
SELECT 1
FROM timeoff t
WHERE e.employee_id = t.employee_id
AND t.timeoff_date = c.day
)
ORDER BY
e.employee_id,
c.day;
Notes:
This assumes that your work_days column aligns with the ISO week; if it does not then you will need to adjust the substring.
Do not use TO_CHAR(date_value, 'D') as users will get different results depending on their NLS_TERRITORY session setting.
db<>fiddle here
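A quick sanity check of that ISO-week arithmetic (a sketch; 2021-07-19 is a Monday): it returns 1 for Monday through 7 for Sunday regardless of NLS settings:
SELECT TO_CHAR(d, 'DY', 'NLS_DATE_LANGUAGE=ENGLISH') AS day_name,
       TRUNC(d) - TRUNC(d, 'IW') + 1                 AS iso_day_number
FROM  (SELECT DATE '2021-07-19' + LEVEL - 1 AS d   -- one week starting on a Monday
       FROM   dual
       CONNECT BY LEVEL <= 7);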

SQL query to filter out the overlapping dates

Version | start_date | end_date
:------ | :--------- | :---------
1       | 2005-11-23 | 2005-11-23
2       | 2005-11-23 | 2005-11-23
3       | 2005-11-23 | 2008-10-23
4       | 2008-10-23 | 2010-05-18
5       | 2011-05-13 | 2012-05-19
In the above table, instead of keeping versions 1, 2, 3, 4 we can keep version 1 starting from '2005-11-23' to '2010-05-18', since all these versions overlap, and keep version 5 as it is.
Output needed:
Version | start_date | end_date
:------ | :--------- | :---------
1       | 2005-11-23 | 2010-05-18
5       | 2011-05-13 | 2012-05-19
How can we frame a SQL query for this scenario? Hive or PostgreSQL.
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version") OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date,
CASE
WHEN md.end_date IS NULL THEN s.end_date
ELSE md.end_date
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version";
Version | start_date               | end_date
:------ | :----------------------- | :-----------------------
1       | 2005-11-23T00:00:00.000Z | 2010-05-18T00:00:00.000Z
5       | 2011-05-13T00:00:00.000Z | 2012-05-19T00:00:00.000Z
View on DB Fiddle
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version") OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version";
Version | start_date | end_date
:------ | :--------- | :---------
1       | 2005-11-23 | 2010-05-18
5       | 2011-05-13 | 2012-05-19
View on DB Fiddle
Update 1
The LAG/LEAD functions are now assigned default values.
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date,1,null) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version",1,3) OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version - 1 = md."Version"
ORDER BY
s."Version";
Version | start_date | end_date
:------ | :--------- | :---------
1       | 2005-11-23 | 2012-05-19
View on DB Fiddle
With original dataset
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date,1,null) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version",1,3) OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version"
ORDER BY
s."Version";
Version | start_date | end_date
:------ | :--------- | :---------
1       | 2005-11-23 | 2010-05-18
5       | 2011-05-13 | 2005-11-23
View on DB Fiddle
The safest way to handle this -- assuming that you can create a stable sort on the rows (which version provides) -- is to use a cumulative maximum instead of lag().
select min(version), min(start_date), max(end_date)
from (select t.*,
sum(case when prev_max_end_date >= start_date then 0 else 1 end) over
(order by start_date, version) as grp
from (select t.*,
max(end_date) over (order by start_date, version
rows between unbounded preceding and 1 preceding
) as prev_max_end_date
from t
) t
) t
group by grp;
This should work in any (reasonable) database. Here is a db<>fiddle that happens to use Postgres.
The issue with lag()/lead() approaches is that the overlap with earlier rows may not be on the "previous" row. For instance, consider this diagram (where lower case means start and upper case means end):
---a----b--B----c--C----d--D--e---A--E--
E overlaps with A. However, by any reasonable definition of "previous", A is not the previous row for E.
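Adapted to the my_dates table above, a sketch in Postgres syntax (using MAX(end_date) so each group ends at its latest end date):
SELECT MIN("Version")  AS "Version",
       MIN(start_date) AS start_date,
       MAX(end_date)   AS end_date
FROM (
    -- start a new group whenever the current row does not overlap anything seen so far
    SELECT t.*,
           SUM(CASE WHEN prev_max_end_date >= start_date THEN 0 ELSE 1 END)
               OVER (ORDER BY start_date, "Version") AS grp
    FROM (
        -- cumulative maximum of all earlier end dates
        SELECT md.*,
               MAX(end_date) OVER (ORDER BY start_date, "Version"
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                   AS prev_max_end_date
        FROM my_dates md
    ) t
) t
GROUP BY grp
ORDER BY 1;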

group orders based on crossing date ranges

I need to group orders together only when their date ranges cross.
scenario A.
order 1, 1.3.2020-30.6.2020
order 2, 1.5.2020-31.8.2020
order 3, 31.7.2020-31.10.2020
order 4, 31.7.2020-31.12.2020
so the output should be
order 1, order 2
order 2, order 3, order 4
Orders 1, 3 and 4 are not grouped together because order 1's range doesn't cross order 3's or order 4's at all.
scenario B.
same as above plus another order
order 5, 1.1.2020-31.12.2020
so output will be
order 1, order 2, order 5
order 2, order 3, order 4, order 5
I tried a self join to check which start date falls in which range.
So in the range of order 1 falls only the start date of order 2 -> we have one group.
Then in the range of order 2 fall both start dates of orders 3 and 4 -> we have a second group.
But then the start date of order 4 falls in the range of order 3 and vice versa -> that would give another two groups, but they are invalid because order 2 crosses their date ranges as well and should be included too; and because that produces three duplicates we should display the group just once, as in the desired output. So this approach fails.
Thanks
The result of the MATCH_RECOGNIZE solution is incorrect because order 5 should be in both groups.
I used some analytic functions to solve this:
-- create table
Create table cross_dates (order_id number, start_date date , end_date date);
-- insert dates
insert into cross_dates values( 1, to_date('01.03.2020', 'dd.mm.yyyy'), to_date('30.06.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 2, to_date('01.05.2020', 'dd.mm.yyyy'), to_date( '31.08.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 3, to_date('31.07.2020', 'dd.mm.yyyy'), to_date( '31.08.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 4, to_date('31.07.2020', 'dd.mm.yyyy'), to_date( '31.10.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 5, to_date('01.01.2020', 'dd.mm.yyyy'), to_date( '31.12.2020', 'dd.mm.yyyy'));
-- SQL
select 'Order '|| min_order_id ||': ', listagg( order_id, ',') within group (order by order_id) list
from (
select distinct min_order_id, order_id from (
with dates (cur_date, end_date, order_id, start_date) as (
select start_date, end_date, order_id, start_date
from cross_Dates
union all
select cur_date + 1, end_date, order_id,start_date
from dates
where cur_date < end_date )
select d.order_id,
min(d.order_id) over(partition by greatest(d.start_date, cd.start_date)) min_order_id
from dates d, cross_Dates cd
where d.cur_date between cd.start_date and cd.end_date ))
group by min_order_id
having count(*) > 1;
Result:
Order 1: 1,2,5
Order 2: 2,3,4,5
-- add new column and update old records
alter table cross_dates add (item varchar2(1));
update cross_dates set item = 'A';
-- insert new records B
insert into cross_dates values( 1, to_date('01.01.2020', 'dd.mm.yyyy'), to_date( '30.06.2020', 'dd.mm.yyyy'), 'B');
insert into cross_dates values( 1, to_date('01.07.2020', 'dd.mm.yyyy'), to_date( '31.12.2020', 'dd.mm.yyyy'), 'B');
My assumptions:
A and B are separate orders, not going into the same groups even when their ranges cross.
Order 1 B has two records as continuations; in my understanding it counts as one order: order 1 B 01.01.2020 - 31.12.2020.
If my assumptions are correct, the SQL could look like this:
select distinct min_order_id, order_id, item from (
with dates (cur_date, end_date, order_id, start_date, item) as (
select start_date, end_date, order_id, start_date, item
from cross_Dates
union all
select cur_date + 1, end_date, order_id,start_date, item
from dates
where cur_date < end_date )
select d.order_id, d.item,
min(d.order_id) over(partition by greatest(d.start_date, cd.start_date),d.item) min_order_id
from dates d, cross_Dates cd
where d.cur_date between cd.start_date and cd.end_date and d.item = cd.item )
order by item, min_order_id;
Result:
MIN_ORDER_ID ORDER_ID I
1 1 A
1 2 A
1 5 A
2 2 A
2 3 A
2 4 A
2 5 A
5 5 A
1 1 B
If my assumptions are not OK, please let me know what the result should look like in this case. :)
You can use MATCH_RECOGNIZE to find groups where the next value's start date is before, or equal to, the end date of all the previous values in the group. Then you can aggregate and exclude groups that would be entirely contained in another group:
WITH groups ( id, ids, start_date, end_date ) AS (
SELECT id,
LISTAGG( grp_id, ',' ) WITHIN GROUP ( ORDER BY start_date ),
MIN( start_date ),
MIN( end_date )
FROM (
SELECT t.id,
x.id AS grp_id,
x.start_date,
x.end_date
FROM table_name t
INNER JOIN table_name x
ON (
x.start_date >= t.start_date
AND x.start_date <= t.end_date
)
)
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY start_date
MEASURES
MATCH_NUMBER() AS mno
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW GROUPED_ROWS* )
DEFINE GROUPED_ROWS AS (
GROUPED_ROWS.start_date <= MIN( end_date )
)
)
WHERE mno = 1
GROUP BY id
)
SELECT id,
ids
FROM groups g
WHERE NOT EXISTS (
SELECT 1
FROM groups x
WHERE g.ID <> x.ID
AND x.start_date <= g.start_date
AND g.end_date <= x.end_date
)
Which for the sample data:
CREATE TABLE table_name ( id, start_date, end_date ) AS
SELECT 'order 1', DATE '2020-03-01', DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 'order 2', DATE '2020-05-01', DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 'order 3', DATE '2020-07-31', DATE '2020-10-31' FROM DUAL UNION ALL
SELECT 'order 4', DATE '2020-07-31', DATE '2020-12-31' FROM DUAL;
Outputs:
ID | IDS
:------ | :----------------------
order 2 | order 2,order 3,order 4
order 1 | order 1,order 2
If you then:
INSERT INTO table_name ( id, start_date, end_date )
VALUES ( 'order 5', DATE '2020-01-01', DATE '2020-12-31' );
The output would be:
ID | IDS
:------ | :----------------------
order 2 | order 2,order 3,order 4
order 5 | order 5,order 1,order 2
db<>fiddle here

Oracle adding a subquery in a CTE

I have the following setup, which works fine and generates output as expected.
I'm trying to add the locations subquery into the CTE so my output will have a random location_id for each row.
The subquery is straightforward and should work, but I am getting syntax errors when I try to place it into the 'data' CTE. I was hoping someone could help me out.
CREATE TABLE employees(
employee_id NUMBER(6),
emp_name VARCHAR2(30)
);
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(1, 'John Doe');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(2, 'Jane Smith');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(3, 'Mike Jones');
CREATE TABLE locations AS
SELECT level AS location_id,
'Door ' || level AS location_name
FROM dual
CONNECT BY level <=
with rws as (
select level rn from dual connect by level <= 5 ),
data as ( select e.*,round (dbms_random.value(1,5)
) n from employees e)
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
-- trying to make this work
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
You need to alias the subquery column expression, rather than trying to assign it to a [variable] name. So instead of this:
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
you would do this:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
(
select location_id
from locations
order by dbms_random.value()
fetch first 1 row only
) as location_id,
round (dbms_random.value(1,5)) as n
from employees e
)
db<>fiddle
But yes, you'll get the same location_id for each row, which probably isn't what you want.
There are probably better ways to avoid it (or to approach whatever you're actually trying to achieve) but one option is to force the subquery to be correlated by adding something like:
where location_id != -1 * e.employee_id
db<>fiddle
although that might be expensive. It's probably worth asking a new question about that specific aspect.
I am getting the same location_id for every employee_id, which I don't want either.
The subquery is in the wrong place then; move it to the main query, and correlate against both ID and n:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
round (dbms_random.value(1,5)) as n
from employees e
)
select d.employee_id,
d.emp_name,
(
select location_id
from locations
where location_id != -1 * d.employee_id * d.n
order by dbms_random.value()
fetch first 1 row only
) as location_id,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws r
join data d on r.rn <= d.n
order by d.employee_id;
db<>fiddle
Or move the location part to a new CTE, I suppose, with its own row number; and join that on one of your other generated values.
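A rough sketch of that last idea, under the assumption that the locations table from the question exists: number a shuffled copy of locations in its own CTE, then join it on a value derived from the generated rn and n:
with rws as (
  select level rn from dual connect by level <= 5
),
data as (
  select e.*, round(dbms_random.value(1, 5)) n
  from employees e
),
shuffled_locations as (
  -- shuffle the locations once and number them so they can be joined on
  select l.location_id,
         row_number() over (order by dbms_random.value()) as pick,
         count(*) over () as loc_count
  from locations l
)
select d.employee_id,
       d.emp_name,
       sl.location_id,
       trunc(sysdate) + dbms_random.value(0, 5) as random_date
from rws r
join data d on r.rn <= d.n
join shuffled_locations sl on sl.pick = mod(r.rn + d.n, sl.loc_count) + 1
order by d.employee_id;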
