group orders based on crossing date ranges - oracle

I need to group orders together only when their date ranges cross.
scenario A.
order 1, 1.3.2020-30.6.2020
order 2, 1.5.2020-31.8.2020
order 3, 31.7.2020-31.10.2020
order 4, 31.7.2020-31.12.2020
so the output should be
order 1, order 2
order 2, order 3, order 4
Orders 1, 3 and 4 are not grouped together because their ranges don't cross at all.
scenario B.
same as above plus another order
order 5, 1.1.2020-31.12.2020
so the output will be
order 1, order 2, order 5
order 2, order 3, order 4, order 5
I tried a self join to check which start dates fall in each order's range.
In the range of order 1 falls only the start date of order 2, so we have one group.
In the range of order 2 fall the start dates of both order 3 and order 4, so we have a second group.
But then the start date of order 4 falls in the range of order 3 and vice versa, which would produce another two groups. Those groups are invalid, because order 2 crosses their date ranges as well and should be included too, and since that would create duplicates of a group that should be displayed only once (as in the desired output), this approach fails.
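Roughly, that self-join attempt looked like this (a minimal sketch, assuming the cross_dates table created in the answer below; LISTAGG is only used here to show the resulting groups):
-- Sketch of the self-join attempt: attach to each order o1 every order o2
-- whose start date falls inside o1's range.
select o1.order_id as group_id,
       listagg(o2.order_id, ',') within group (order by o2.order_id) as members
from   cross_dates o1
join   cross_dates o2
       on o2.start_date between o1.start_date and o1.end_date
group  by o1.order_id;
-- For the sample data this also returns groups anchored on orders 3 and 4,
-- which are only subsets of the group anchored on order 2.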
Thanks

The result of the MATCH_RECOGNIZE solution is incorrect because order 5 should be in both groups.
I used some analytic functions to solve this:
-- create table
Create table cross_dates (order_id number, start_date date , end_date date);
-- insert dates
insert into cross_dates values( 1, to_date('01.03.2020', 'dd.mm.yyyy'), to_date('30.06.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 2, to_date('01.05.2020', 'dd.mm.yyyy'), to_date( '31.08.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 3, to_date('31.07.2020', 'dd.mm.yyyy'), to_date( '31.08.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 4, to_date('31.07.2020', 'dd.mm.yyyy'), to_date( '31.10.2020', 'dd.mm.yyyy'));
insert into cross_dates values( 5, to_date('01.01.2020', 'dd.mm.yyyy'), to_date( '31.12.2020', 'dd.mm.yyyy'));
-- SQL
select 'Order '|| min_order_id ||': ',
       listagg(order_id, ',') within group (order by order_id) list
from (
  select distinct min_order_id, order_id
  from (
    with dates (cur_date, end_date, order_id, start_date) as (
      select start_date, end_date, order_id, start_date
      from   cross_dates
      union all
      select cur_date + 1, end_date, order_id, start_date
      from   dates
      where  cur_date < end_date
    )
    select d.order_id,
           min(d.order_id) over (partition by greatest(d.start_date, cd.start_date)) min_order_id
    from   dates d, cross_dates cd
    where  d.cur_date between cd.start_date and cd.end_date
  )
)
group by min_order_id
having count(*) > 1;
Result:
Order 1: 1,2,5
Order 2: 2,3,4,5
-- add new column and update old records
alter table cross_dates add (item varchar2(1));
update cross_dates set item = 'A';
-- insert new records B
insert into cross_dates values( 1, to_date('01.01.2020', 'dd.mm.yyyy'), to_date( '30.06.2020', 'dd.mm.yyyy'), 'B');
insert into cross_dates values( 1, to_date('01.07.2020', 'dd.mm.yyyy'), to_date( '31.12.2020', 'dd.mm.yyyy'), 'B');
My assumptions:
A and B are separate orders and do not go into the same groups even when their ranges cross.
Order 1 B has two records that are continuations of each other; in my understanding that counts as one order: order 1 B, 01.01.2020 - 31.12.2020.
If my assumptions are correct, the SQL could look like this:
select distinct min_order_id, order_id, item
from (
  with dates (cur_date, end_date, order_id, start_date, item) as (
    select start_date, end_date, order_id, start_date, item
    from   cross_dates
    union all
    select cur_date + 1, end_date, order_id, start_date, item
    from   dates
    where  cur_date < end_date
  )
  select d.order_id, d.item,
         min(d.order_id) over (partition by greatest(d.start_date, cd.start_date), d.item) min_order_id
  from   dates d, cross_dates cd
  where  d.cur_date between cd.start_date and cd.end_date
  and    d.item = cd.item
)
order by item, min_order_id;
Result:
MIN_ORDER_ID ORDER_ID I
1 1 A
1 2 A
1 5 A
2 2 A
2 3 A
2 4 A
2 5 A
5 5 A
1 1 B
If my assumptions are not correct, please tell me what the result should look like in this case.
:)

You can use MATCH_RECOGNIZE to find groups where the next value's start date is before, or equal to, the end date of all the previous values in the group. Then you can aggregate and exclude groups that would be entirely contained in another group:
WITH groups ( id, ids, start_date, end_date ) AS (
SELECT id,
LISTAGG( grp_id, ',' ) WITHIN GROUP ( ORDER BY start_date ),
MIN( start_date ),
MIN( end_date )
FROM (
SELECT t.id,
x.id AS grp_id,
x.start_date,
x.end_date
FROM table_name t
INNER JOIN table_name x
ON (
x.start_date >= t.start_date
AND x.start_date <= t.end_date
)
)
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY start_date
MEASURES
MATCH_NUMBER() AS mno
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW GROUPED_ROWS* )
DEFINE GROUPED_ROWS AS (
GROUPED_ROWS.start_date <= MIN( end_date )
)
)
WHERE mno = 1
GROUP BY id
)
SELECT id,
ids
FROM groups g
WHERE NOT EXISTS (
SELECT 1
FROM groups x
WHERE g.ID <> x.ID
AND x.start_date <= g.start_date
AND g.end_date <= x.end_date
)
Which, for the sample data:
CREATE TABLE table_name ( id, start_date, end_date ) AS
SELECT 'order 1', DATE '2020-03-01', DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 'order 2', DATE '2020-05-01', DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 'order 3', DATE '2020-07-31', DATE '2020-10-31' FROM DUAL UNION ALL
SELECT 'order 4', DATE '2020-07-31', DATE '2020-12-31' FROM DUAL;
Outputs:
ID | IDS
:------ | :----------------------
order 2 | order 2,order 3,order 4
order 1 | order 1,order 2
If you then:
INSERT INTO table_name ( id, start_date, end_date )
VALUES ( 'order 5', DATE '2020-01-01', DATE '2020-12-31' );
The output would be:
ID | IDS
:------ | :----------------------
order 2 | order 2,order 3,order 4
order 5 | order 5,order 1,order 2
db<>fiddle here

Related

Stop condition for recursive CTE on Oracle (ORA-32044)

I have the following recursive CTE which splits each element coming from base per month:
with
base (id, start_date, end_date) as (
select 1, date '2022-01-15', date '2022-03-15' from dual
union
select 2, date '2022-09-15', date '2022-12-31' from dual
union
select 3, date '2023-09-15', date '2023-09-25' from dual
),
split (id, start_date, end_date) as (
select base.id, base.start_date, least(last_day(base.start_date), base.end_date) from base
union all
select base.id, split.end_date + 1, least(last_day(split.end_date + 1), base.end_date) from base join split on base.id = split.id and split.end_date < base.end_date
)
select * from split order by id, start_date, end_date;
It works on Oracle and gives the following result:
id | start_date | end_date
-: | :--------- | :---------
1 | 2022-01-15 | 2022-01-31
1 | 2022-02-01 | 2022-02-28
1 | 2022-03-01 | 2022-03-15
2 | 2022-09-15 | 2022-09-30
2 | 2022-10-01 | 2022-10-31
2 | 2022-11-01 | 2022-11-30
2 | 2022-12-01 | 2022-12-31
3 | 2023-09-15 | 2023-09-25
The two following stop conditions work correctly:
... from base join split on base.id = split.id and split.end_date < base.end_date
... from base, split where base.id = split.id and split.end_date < base.end_date
The following one fails with the message ORA-32044: cycle detected while executing recursive WITH query:
... from base join split on base.id = split.id where split.end_date < base.end_date
I fail to understand how the last one is different from the two others.
It looks like a bug as all your queries should result in identical explain plans.
However, you can rewrite the recursive sub-query without the join (and using a SEARCH clause so you may not have to re-order the query later):
WITH split (id, start_date, month_end, end_date) AS (
SELECT id,
start_date,
LEAST(
ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' SECOND,
end_date
),
end_date
FROM base
UNION ALL
SELECT id,
month_end + INTERVAL '1' SECOND,
LEAST(
ADD_MONTHS(month_end, 1),
end_date
),
end_date
FROM split
WHERE month_end < end_date
) SEARCH DEPTH FIRST BY id, start_date SET order_id
SELECT id,
start_date,
month_end AS end_date
FROM split;
Note: if you want to just use values at midnight rather than the entire month then use INTERVAL '1' DAY rather than 1 second.
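For instance, that substitution would look something like this (an untested sketch of the same query; only the interval unit changes):
-- Day-granularity variant: month boundaries at midnight instead of 23:59:59.
WITH split (id, start_date, month_end, end_date) AS (
  SELECT id,
         start_date,
         LEAST(
           ADD_MONTHS(TRUNC(start_date, 'MM'), 1) - INTERVAL '1' DAY,
           end_date
         ),
         end_date
  FROM   base
  UNION ALL
  SELECT id,
         month_end + INTERVAL '1' DAY,
         LEAST(ADD_MONTHS(month_end, 1), end_date),
         end_date
  FROM   split
  WHERE  month_end < end_date
) SEARCH DEPTH FIRST BY id, start_date SET order_id
SELECT id, start_date, month_end AS end_date
FROM   split;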
Which, for the sample data:
CREATE TABLE base (id, start_date, end_date) as
select 1, date '2022-01-15', date '2022-04-15' from dual union all
select 2, date '2022-09-15', date '2022-12-31' from dual union all
select 3, date '2023-09-15', date '2023-09-25' from dual;
Outputs:
ID | START_DATE | END_DATE
-: | :------------------- | :-------------------
1 | 2022-01-15T00:00:00Z | 2022-01-31T23:59:59Z
1 | 2022-02-01T00:00:00Z | 2022-02-28T23:59:59Z
1 | 2022-03-01T00:00:00Z | 2022-03-31T23:59:59Z
1 | 2022-04-01T00:00:00Z | 2022-04-15T00:00:00Z
2 | 2022-09-15T00:00:00Z | 2022-09-30T23:59:59Z
2 | 2022-10-01T00:00:00Z | 2022-10-31T23:59:59Z
2 | 2022-11-01T00:00:00Z | 2022-11-30T23:59:59Z
2 | 2022-12-01T00:00:00Z | 2022-12-31T00:00:00Z
3 | 2023-09-15T00:00:00Z | 2023-09-25T00:00:00Z
fiddle
It's because the WHERE and ON conditions are not evaluated at the same level:
when the condition is in the ON clause it limits the rows involved in the JOIN, whereas when it is in the WHERE clause it filters the results after the JOIN has been applied, and since a recursive CTE sees all the rows selected so far...

ORACLE - How to use LAG to display strings from all previous rows into current row

I have data like below:
group | seq | activity
:---- | --: | :-------
A | 1 | scan
A | 2 | visit
A | 3 | pay
B | 1 | drink
B | 2 | rest
I expect to have 1 new column "hist" like below:
group | seq | activity | hist
:---- | --: | :------- | :----------
A | 1 | scan | NULL
A | 2 | visit | scan
A | 3 | pay | scan, visit
B | 1 | drink | NULL
B | 2 | rest | drink
I was trying to solve this with the LAG function, but LAG only returns a single previous row instead of all the previous ones.
Truly appreciate any help!
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq + 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq + 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP | SEQ | ACTIVITY | HIST
:---- | --: | :------- | :---------
A | 1 | scan | null
A | 2 | visit | scan
A | 3 | pay | scan,visit
B | 1 | drink | null
B | 2 | rest | drink
db<>fiddle here
To aggregate strings in Oracle we use the LISTAGG function.
In general, you need a windowing_clause to specify a sliding window for an analytic function to calculate a running total.
But unfortunately LISTAGG doesn't support it.
To simulate this behaviour you may use the model_clause of the SELECT statement. Below is an example with explanation.
select
group_
, activity
, seq
, hist
from t
model
/*Where to restart calculation*/
partition by (group_)
/*Add consecutive numbers to reference "previous" row per group.
May use "seq" column if its values are consecutive*/
dimension by (
row_number() over(
partition by group_
order by seq asc
) as rn
)
measures (
/*Other columns to return*/
activity
, cast(null as varchar2(1000)) as hist
, seq
)
rules update (
/*Apply this rule sequentially*/
hist[any] order by rn asc =
/*Previous concatenated result*/
hist[cv()-1]
/*Plus a comma for the third and subsequent rows*/
|| presentv(activity[cv()-2], ',', '')
/*Plus the previous row's value*/
|| activity[cv()-1]
)
GROUP_ | ACTIVITY | SEQ | HIST
:----- | :------- | --: | :---------
A | scan | 1 | null
A | visit | 2 | scan
A | pay | 3 | scan,visit
B | drink | 1 | null
B | rest | 2 | drink
db<>fiddle here
A few more variants (without subqueries):
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
DBFIddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=9b477a2089d3beac62579d2b7103377a
Full test case with output:
with table_name ("GROUP", seq, activity) AS (
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL
)
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
GROUP SEQ ACTIV HIST1 HIST2
------ ---------- ----- ------------------------------ ------------------------------
A 1 scan
A 2 visit scan, scan
A 3 pay scan,visit, scan,visit
B 1 drink
B 2 rest drink, drink

Sql query to filter out the overlapping dates

Version | start_date | end_date
------: | :--------- | :---------
1 | 2005-11-23 | 2005-11-23
2 | 2005-11-23 | 2005-11-23
3 | 2005-11-23 | 2008-10-23
4 | 2008-10-23 | 2010-05-18
5 | 2011-05-13 | 2012-05-19
In the above table, instead of keeping versions 1, 2, 3 and 4 we can keep just version 1, running from '2005-11-23' to '2010-05-18', since all of these versions overlap; version 5 is kept as it is.
Output needed:
Version | start_date | end_date
------: | :--------- | :---------
1 | 2005-11-23 | 2010-05-18
5 | 2011-05-13 | 2012-05-19
How can we frame a SQL query for this scenario?
Hive or PostgreSQL.
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version") OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date,
CASE
WHEN md.end_date IS NULL THEN s.end_date
ELSE md.end_date
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version";
Version | start_date | end_date
------: | :----------------------- | :-----------------------
1 | 2005-11-23T00:00:00.000Z | 2010-05-18T00:00:00.000Z
5 | 2011-05-13T00:00:00.000Z | 2012-05-19T00:00:00.000Z
View on DB Fiddle
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version") OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version";
Version | start_date | end_date
------: | :--------- | :---------
1 | 2005-11-23 | 2010-05-18
5 | 2011-05-13 | 2012-05-19
View on DB Fiddle
Update 1
Lag/Lead functions now assigned default values
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date,1,null) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version",1,3) OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version"
ORDER BY
s."Version";
Version | start_date | end_date
------: | :--------- | :---------
1 | 2005-11-23 | 2012-05-19
View on DB Fiddle
With original dataset
Schema (PostgreSQL v13)
CREATE TABLE my_dates (
"Version" INTEGER,
start_date date,
end_date date
);
INSERT INTO my_dates
("Version",start_date, end_date)
VALUES
('1', '2005-11-23', '2005-11-23'),
('2', '2005-11-23', '2005-11-23'),
('3', '2005-11-23', '2008-10-23'),
('4', '2008-10-23', '2010-05-18'),
('5', '2011-05-13', '2012-05-19');
Query #1
with my_overlaps AS (
select
*,
LAG(end_date,1,null) OVER (ORDER BY "Version") >= start_date as overlap
from my_dates
),
selected AS (
SELECT
"Version",
start_date,
end_date ,
LEAD("Version",1,3) OVER (ORDER BY "Version") AS next_version
FROM
my_overlaps
where overlap=false or
overlap is null
)
select
s."Version",
s.start_date::text,
CASE
WHEN md.end_date IS NULL THEN s.end_date::text
ELSE md.end_date::text
END as end_date
FROM
selected s
LEFT JOIN
my_dates md on s.next_version -1 = md."Version"
ORDER BY
s."Version";
Version | start_date | end_date
------: | :--------- | :---------
1 | 2005-11-23 | 2010-05-18
5 | 2011-05-13 | 2005-11-23
View on DB Fiddle
The safest way to handle this (assuming that you can create a stable sort on the rows, which version provides) uses a cumulative maximum instead of lag().
select min(version), min(start_date), min(end_date)
from (select t.*,
sum(case when prev_max_end_date >= start_date then 0 else 1 end) over
(order by start_date, version) as grp
from (select t.*,
max(end_date) over (order by start_date, version
rows between unbounded preceding and 1 preceding
) as prev_max_end_date
from t
) t
) t
group by grp;
This should work in any (reasonable) database. Here is a db<>fiddle that happens to use Postgres.
The issue with lag()/lead() approaches is that the overlap with earlier rows may not be on the "previous" row. For instance, consider this diagram (where lower case means start and upper case means end):
---a----b--B----c--C----d--D--e---A--E--
E overlaps with A. However, by any reasonable definition of "previous", A is not the previous row for E.
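To make that concrete, here is a hypothetical data set (values invented to match the diagram, not taken from the question) where a LAG-based check splits the group while the cumulative-maximum query keeps it together:
-- Hypothetical rows matching the diagram: "Version" 1 (a..A) spans almost the
-- whole year, so "Version" 5 (e..E) overlaps it even though "Version" 4 (d..D)
-- is the previous row by start date.
INSERT INTO my_dates ("Version", start_date, end_date) VALUES
('1', '2020-01-01', '2020-11-01'),
('2', '2020-02-01', '2020-03-01'),
('3', '2020-04-01', '2020-05-01'),
('4', '2020-06-01', '2020-07-01'),
('5', '2020-08-01', '2020-12-01');
-- LAG(end_date) compares row 5 only with row 4 (2020-07-01 < 2020-08-01) and
-- starts a new group; the cumulative MAX(end_date) still sees 2020-11-01 and
-- keeps row 5 in the group that started with row 1.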

Oracle adding a subquery in a CTE

I have the following setup, which works fine and generates output as expected.
I'm trying to add the locations subquery into the CTE so my output will have a random location_id for each row.
The subquery is straightforward and should work, but I am getting syntax errors when I try to place it into the 'data' CTE. I was hoping someone could help me out.
CREATE TABLE employees(
employee_id NUMBER(6),
emp_name VARCHAR2(30)
);
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(1, 'John Doe');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(2, 'Jane Smith');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(3, 'Mike Jones');
CREATE TABLE locations AS
SELECT level AS location_id,
'Door ' || level AS location_name
FROM dual
CONNECT BY level <=
with rws as (
select level rn from dual connect by level <= 5 ),
data as ( select e.*,round (dbms_random.value(1,5)
) n from employees e)
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
-- trying to make this work
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
You need to alias the subquery column expression, rather than trying to assign it to a [variable] name. So instead of this:
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
you would do this:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
(
select location_id
from locations
order by dbms_random.value()
fetch first 1 row only
) as location_id,
round (dbms_random.value(1,5)) as n
from employees e
)
db<>fiddle
But yes, you'll get the same location_id for each row, which probably isn't what you want.
There are probably better ways to avoid it (or to approach whatever you're actually trying to achieve) but one option is to force the subquery to be correlated by adding something like:
where location_id != -1 * e.employee_id
db<>fiddle
although that might be expensive. It's probably worth asking a new question about that specific aspect.
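Put together, that first suggestion would look something like this (just a sketch; the extra predicate never filters anything for positive IDs, it only forces the subquery to be re-evaluated per employee):
-- Sketch: the random-location subquery stays in the data CTE, but is
-- correlated on e.employee_id so it is not cached across rows.
with rws as (
  select level rn from dual connect by level <= 5
),
data as (
  select e.*,
         ( select location_id
           from   locations
           where  location_id != -1 * e.employee_id  -- always true; only forces correlation
           order  by dbms_random.value()
           fetch  first 1 row only
         ) as location_id,
         round(dbms_random.value(1, 5)) as n
  from   employees e
)
select employee_id,
       emp_name,
       location_id,
       trunc(sysdate) + dbms_random.value(0, 5) as random_date
from   rws
join   data d on rn <= n
order  by employee_id;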
I am getting the same location_id for every employee_id, which I don't want either.
The subquery is in the wrong place then; move it to the main query, and correlate against both ID and n:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
round (dbms_random.value(1,5)) as n
from employees e
)
select d.employee_id,
d.emp_name,
(
select location_id
from locations
where location_id != -1 * d.employee_id * d.n
order by dbms_random.value()
fetch first 1 row only
) as location_id,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws r
join data d on r.rn <= d.n
order by d.employee_id;
db<>fiddle
Or move the location part to a new CTE, I suppose, with its own row number; and join that on one of your other generated values.
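That last idea might look roughly like this (an untested sketch; it assumes the locations table has at least as many rows as rws generates, and joins on the generated rn):
-- Sketch: number the locations in a random order and join on the generated rn,
-- so each produced row picks a different pseudo-random location.
with rws as (
  select level rn from dual connect by level <= 5
),
data as (
  select e.*, round(dbms_random.value(1, 5)) as n
  from   employees e
),
locs as (
  select location_id,
         row_number() over (order by dbms_random.value()) as loc_rn
  from   locations
)
select d.employee_id,
       d.emp_name,
       l.location_id,
       trunc(sysdate) + dbms_random.value(0, 5) as random_date
from   rws r
join   data d on r.rn <= d.n
join   locs l on l.loc_rn = r.rn
order  by d.employee_id;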

Calculate average values in Oracle

I want to calculate average values in Oracle tables
CREATE TABLE AGENT_HISTORY(
EVENT_ID INTEGER NOT NULL,
AGENTID INTEGER NOT NULL,
EVENT_DATE DATE NOT NULL
)
/
CREATE TABLE CPU_HISTORY(
CPU_HISTORY_ID INTEGER NOT NULL,
EVENT_ID INTEGER NOT NULL,
CPU_NAME VARCHAR2(50 ) NOT NULL,
CPU_VALUE NUMBER NOT NULL
)
/
I use this SQL query:
----- FOR 24 HOURS CPU
CURSOR LAST_24_CPU_CURSOR IS
--SELECT EVENT_DATE, CPU FROM AGENT_HISTORY WHERE NAME = NAMEIN AND EVENT_DATE >= SYSDATE-(60*24)/1440;
SELECT START_DATE, NVL(AVG(CH.CPU_VALUE),0)
FROM (SELECT START_DATE - (LVL+1)/24 START_DATE, START_DATE - LVL/24 END_DATE
FROM (SELECT SYSDATE START_DATE, LEVEL LVL FROM DUAL CONNECT BY LEVEL <= 24))
LEFT JOIN AGENT_HISTORY AH ON EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH ON AH.EVENT_ID = CH.EVENT_ID
JOIN AGENT AG ON AH.AGENTID = AG.ID
WHERE AG.NAME = NAMEIN
GROUP BY START_DATE
ORDER BY 1;
This query returns only one average value. I would like to modify it to return 24 values, one hourly average for each of the last 24 hours. Can you help me modify the query?
I guess your input contains data for only one of the given intervals; since you're using an INNER JOIN to AGENT, which in turn is filtered through AGENT_HISTORY, you're effectively converting all your LEFT JOINs into inner joins.
I suggest you use a CROSS JOIN between AGENT and the timeslots instead:
with agent_history(event_date, agentid, event_id) as (
select timestamp '2015-11-18 09:00:07', 1, 1001 from dual
),
agent(id, name) as (
select 1, 'myAgent' from dual
),
cpu_history(event_id, cpu_value) as (
select 1001, 75.2 from dual
),
time_slots(start_date, end_date) as (
SELECT START_DATE - (LVL + 1) / 24 START_DATE,
START_DATE - LVL / 24 END_DATE
FROM (SELECT SYSDATE START_DATE,
LEVEL LVL
FROM DUAL
CONNECT BY LEVEL <= 24)
)
SELECT START_DATE,
NVL(AVG(CH.CPU_VALUE),
0)
FROM time_slots ts
CROSS JOIN AGENT AG
LEFT JOIN AGENT_HISTORY AH
ON AH.AGENTID = AG.ID
AND EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH
ON AH.EVENT_ID = CH.EVENT_ID
WHERE AG.NAME = 'myAgent'
GROUP BY START_DATE
ORDER BY 1;
This ensures you get the full 24 rows (one for each timeslot).
Change start_date to to_char(start_date, 'hh24:mi') in both the select and group by clauses.
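Applied to the cross-join query above, that change would look something like this (sketch only; the select list and GROUP BY are the only parts that change):
-- Uses the time_slots CTE and sample tables from the answer above.
SELECT TO_CHAR(start_date, 'hh24:mi') AS slot_start,
       NVL(AVG(ch.cpu_value), 0)      AS avg_cpu
FROM   time_slots ts
CROSS JOIN agent ag
LEFT JOIN agent_history ah
       ON ah.agentid = ag.id
      AND event_date BETWEEN start_date AND end_date
LEFT JOIN cpu_history ch
       ON ah.event_id = ch.event_id
WHERE  ag.name = 'myAgent'
GROUP  BY TO_CHAR(start_date, 'hh24:mi')
ORDER  BY 1;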
