Aggregate date ranges with gaps in Oracle

I need to aggregate date ranges for each id, merging ranges that are separated by gaps of at most 2 days. Any help would be much appreciated.
create table tt ( id int, startdate date, stopdate date);
Insert into TT values (1, DATE '2010-05-24', DATE '2010-05-29');
Insert into TT values (1, DATE '2010-05-30', DATE '2010-06-22');
Insert into TT values (10, DATE '2012-06-26', DATE '2012-06-28');
Insert into TT values (10, DATE '2012-06-29', DATE '2012-06-30');
Insert into TT values (10, DATE '2012-07-01', DATE '2012-07-30');
Insert into TT values (10, DATE '2012-08-03', DATE '2012-12-30');
insert into TT values (90, DATE '2002-03-08', DATE '2002-03-16');
insert into TT values (90, DATE '2002-01-31', DATE '2002-02-15');
insert into TT values (90, DATE '2002-02-15', DATE '2002-02-28');
insert into TT values (90, DATE '2002-01-31', DATE '2004-02-15');
insert into TT values (90, DATE '2004-02-15', DATE '2004-04-15');
insert into TT values (90, DATE '2002-03-01', DATE '2002-03-07');
The expected output would be:
1 24/05/2010 22/06/2010
10 26/06/2012 30/07/2012
10 03/08/2012 30/12/2012
90 31/01/2002 15/04/2004

If you're on 12c, you can use one of my favourite SQL features: pattern matching (match_recognize).
For this you need to define a pattern variable that checks whether the start date of the current row is within two days of the stop date of the previous row:
startdate <= prev ( stopdate ) + 2
The pattern you're searching for is any row, followed by zero or more rows that meet this criterion.
So you have an "always true" strt variable, followed by * (regular expression zero-or-more quantifier) occurrences of the within2 variable:
( strt within2* )
I'm guessing you also need to split the ranges up by ID. So I've added a partition by for this.
Put it all together and you get:
select *
from tt match_recognize (
partition by id
order by startdate, stopdate
measures
first ( startdate ) startdate,
last ( stopdate ) stopdate
pattern ( strt within2* )
define
within2 as startdate <= prev ( stopdate ) + 2
);
ID STARTDATE STOPDATE
1 24/05/2010 22/06/2010
10 26/06/2012 30/07/2012
10 03/08/2012 30/12/2012
If you want to know more about this, you can find several match_recognize examples here.
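To make the merging rule concrete outside the database, here is a minimal Python sketch of the same idea, using the sample rows from the question. It is an illustration, not a translation of the query: the names merge_ranges and MAX_GAP are made up for this example, and the sketch compares each start date with the latest stop date seen so far in the current group rather than with the previous row's stop date only. That difference matters for a range nested inside a longer one, as in the id 90 sample data.

```python
from datetime import date, timedelta
from itertools import groupby

MAX_GAP = timedelta(days=2)  # assumption: same 2-day tolerance as the within2 variable


def merge_ranges(rows, max_gap=MAX_GAP):
    """Merge (id, start, stop) rows per id, bridging gaps of up to max_gap."""
    merged = []
    ordered = sorted(rows)  # sorts by id, then start date, then stop date
    for key, group in groupby(ordered, key=lambda row: row[0]):
        current_start = current_stop = None
        for _, start, stop in group:
            if current_start is not None and start <= current_stop + max_gap:
                # Still within the tolerated gap: extend the current range.
                current_stop = max(current_stop, stop)
            else:
                # Gap too large (or first row): close the range and start a new one.
                if current_start is not None:
                    merged.append((key, current_start, current_stop))
                current_start, current_stop = start, stop
        merged.append((key, current_start, current_stop))
    return merged


rows = [
    (1, date(2010, 5, 24), date(2010, 5, 29)),
    (1, date(2010, 5, 30), date(2010, 6, 22)),
    (10, date(2012, 6, 26), date(2012, 6, 28)),
    (10, date(2012, 6, 29), date(2012, 6, 30)),
    (10, date(2012, 7, 1), date(2012, 7, 30)),
    (10, date(2012, 8, 3), date(2012, 12, 30)),
    (90, date(2002, 3, 8), date(2002, 3, 16)),
    (90, date(2002, 1, 31), date(2002, 2, 15)),
    (90, date(2002, 2, 15), date(2002, 2, 28)),
    (90, date(2002, 1, 31), date(2004, 2, 15)),
    (90, date(2004, 2, 15), date(2004, 4, 15)),
    (90, date(2002, 3, 1), date(2002, 3, 7)),
]

for key, start, stop in merge_ranges(rows):
    print(f"{key} {start:%d/%m/%Y} {stop:%d/%m/%Y}")
```

Sorting by id, start date and stop date plays the role of the partition by and order by clauses; the if branch corresponds to a row matching within2.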

Related

exclude part of the select not to consider date where clause

I have a select (water readings, previous water reading, other columns) with a WHERE clause based on the water reading date. However, the previous water reading must not be restricted by that WHERE clause: I want to get the previous meter reading regardless of the date range.
I looked at UNION, but the problem is that I would have to use the same clause.
SELECT
WATERREADINGS.name,
WATERREADINGS.date,
LAG( WATERREADINGS.meter_reading,1,NULL) OVER(
PARTITION BY WATERREADINGS.meter_id,WATERREADINGS.register_id
ORDER BY WATERREADINGS.meter_id DESC,WATERREADINGS.register_id
DESC,WATERREADINGS.readingdate ASC,WATERREADINGS.created ASC
) AS prev_water_reading
FROM WATERREADINGS
WHERE waterreadings.waterreadingdate BETWEEN '24-JUN-19' AND
'24-AUG-19' and isactive = 'Y'
The prev_water_reading value must not be restricted by the date BETWEEN '24-JUN-19' AND '24-AUG-19' predicate but the rest of the sql should be.
You can do this by first finding the previous meter readings for all rows and then filtering those results on the date, e.g.:
WITH meter_readings AS (SELECT waterreadings.name,
waterreadings."date" dt,
lag(waterreadings.meter_reading, 1, NULL) OVER (PARTITION BY waterreadings.meter_id, waterreadings.register_id
ORDER BY waterreadings.readingdate ASC, waterreadings.created ASC)
AS prev_water_reading
FROM waterreadings
WHERE isactive = 'Y')
-- the meter_readings subquery above gets all rows and finds their previous meter reading.
-- the main query below then applies the date restriction to the rows from the meter_readings subquery.
SELECT name,
dt,
prev_water_reading
FROM meter_readings
WHERE dt BETWEEN to_date('24/06/2019', 'dd/mm/yyyy') AND to_date('24/08/2019', 'dd/mm/yyyy');
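The order of the two steps is what matters here. As a rough illustration in Python (the rows and the helper names with_previous and in_range are hypothetical, invented for this sketch), computing the previous reading first and filtering afterwards keeps readings from before the window available as "previous" values, while filtering first loses them:

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (meter_id, register_id, reading_date, meter_reading)
readings = [
    (1, 1, date(2019, 5, 20), 100),
    (1, 1, date(2019, 6, 25), 130),
    (1, 1, date(2019, 7, 25), 155),
    (2, 1, date(2019, 6, 1), 40),
    (2, 1, date(2019, 8, 1), 70),
]


def with_previous(rows):
    """Append the previous reading per (meter_id, register_id), like LAG ... OVER."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1, 2))
    for _, group in groupby(ordered, key=itemgetter(0, 1)):
        prev = None  # the first row of each partition has no previous reading
        for meter_id, register_id, reading_date, value in group:
            out.append((meter_id, register_id, reading_date, value, prev))
            prev = value
    return out


def in_range(rows, start, stop):
    """Keep rows whose reading date lies in [start, stop], like BETWEEN."""
    return [row for row in rows if start <= row[2] <= stop]


start, stop = date(2019, 6, 24), date(2019, 8, 24)

lag_then_filter = in_range(with_previous(readings), start, stop)
filter_then_lag = with_previous(in_range(readings, start, stop))

print(lag_then_filter[0])  # previous reading comes from 20 May, outside the window
print(filter_then_lag[0])  # previous reading is None because the May row was filtered out
```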
Perform the LAG in an inner query that is not filtered by dates and then filter by the dates in the outer query:
SELECT name,
"date",
prev_water_reading
FROM (
SELECT name,
"date",
LAG( meter_reading,1,NULL) OVER(
PARTITION BY meter_id, register_id
ORDER BY meter_id DESC, register_id DESC, readingdate ASC, created ASC
) AS prev_water_reading,
waterreadingdate -- kept so the outer query can filter on it
FROM WATERREADINGS
WHERE isactive = 'Y'
)
WHERE waterreadingdate BETWEEN DATE '2019-06-24' AND DATE '2019-08-24'
You should also not use strings for dates: they require an implicit cast that depends on the NLS_DATE_FORMAT session parameter, which any user can change in their own session. Use a date literal such as DATE '2019-06-24' or an explicit conversion such as TO_DATE( '24-JUN-19', 'DD-MON-RR' ) instead.
You also do not need to prefix every column with the table name when there is only a single table; it clutters the code and makes it harder to read. DATE is a keyword, so either wrap it in double quotes to use it as a column name (which makes the column name case sensitive) or, better, use a different name for your column.
I've added a subquery with previous result without filter and then joined it with the main table with filters:
SELECT
WATERREADINGS.name,
WATERREADINGS."date",
w_lag.prev_water_reading
FROM
WATERREADINGS,
(SELECT name, "date", LAG( WATERREADINGS.meter_reading,1,NULL) OVER(
PARTITION BY WATERREADINGS.meter_id,WATERREADINGS.register_id
ORDER BY WATERREADINGS.meter_id DESC,WATERREADINGS.register_id
DESC,WATERREADINGS.readingdate ASC,WATERREADINGS.created ASC
) AS prev_water_reading
FROM WATERREADINGS) w_lag
WHERE waterreadings.waterreadingdate BETWEEN DATE '2019-06-24' AND DATE '2019-08-24' and waterreadings.isactive = 'Y'
and WATERREADINGS.name = w_lag.name
and WATERREADINGS."date" = w_lag."date"

Oracle query with different dates

I have to write this query, and it is a bit complex. I am hoping someone can help, as I've received much help from here before.
Say I have a customer's stock portfolio, with a list of company tickers and the date each ticker was purchased. My list looks something like this:
CYSL 1/16/2017
MCIG 4/1/2016
MSRT 9/13/2016
NTFU 1/16/2017
QNTM 10/30/2014
SIGWX 6/28/2014
TRMCX 6/25/2014
TRT2 4/19/2016
Now, in order for me to compute some YTD performance, I need to apply the following logic:
If the purchase date > 01/01/2017, I'll use the closing price of the ticker when it was purchased.
If the purchase date < 01/01/2017, I'll use the closing price of the ticker on <= 12/31/2016.
There are 2 tables involved: 1) Portfolio Table 2) Price History
I've gotten this far:
SELECT ticker, MIN(transaction_date) KEEP (DENSE_RANK FIRST ORDER BY transaction_date) transaction_date
FROM customer_portfolios
WHERE portfolio_id = 954118
GROUP BY ticker;
This gives me the list above. Now, I am lost on how to join this with the logic above, to get the proper date, and go after the proper price.
I hope I am explaining this correctly.
Any help would be great, and I can explain more if that will help you help me.
Thank you.
Use GREATEST to get the date at either 2016-12-31 or the later transaction date and then just join to the price history table:
SELECT cp.ticker,
cp.transaction_date,
h.close_price
FROM (
SELECT ticker,
GREATEST(
DATE '2016-12-31',
MIN(transaction_date) KEEP (
DENSE_RANK FIRST
ORDER BY transaction_date
)
) AS transaction_date
FROM customer_portfolios
WHERE portfolio_id = 954118
GROUP BY ticker
) cp
INNER JOIN price_history h
ON ( cp.ticker = h.ticker
AND cp.transaction_date BETWEEN h.start_date AND h.end_date )
or if the price history has a row per day (rather than the assumed range in the query above) then replace the last line with:
AND cp.transaction_date = h.price_date )
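As a plain-Python illustration of what the inner query computes (a sketch only; effective_dates and CUTOFF are names invented here), take the earliest purchase per ticker and then clamp it so it is never earlier than 2016-12-31, which is what GREATEST does:

```python
from datetime import date

CUTOFF = date(2016, 12, 31)  # last day of the previous year, as in the GREATEST call


def effective_dates(transactions, cutoff=CUTOFF):
    """Return {ticker: pricing date} from (ticker, transaction_date) pairs."""
    first_purchase = {}
    for ticker, tx_date in transactions:
        # Equivalent of MIN(transaction_date) ... GROUP BY ticker
        if ticker not in first_purchase or tx_date < first_purchase[ticker]:
            first_purchase[ticker] = tx_date
    # Equivalent of GREATEST(cutoff, first purchase date)
    return {ticker: max(cutoff, d) for ticker, d in first_purchase.items()}


# The list from the question (month/day/year in the original).
transactions = [
    ("CYSL", date(2017, 1, 16)),
    ("MCIG", date(2016, 4, 1)),
    ("MSRT", date(2016, 9, 13)),
    ("NTFU", date(2017, 1, 16)),
    ("QNTM", date(2014, 10, 30)),
    ("SIGWX", date(2014, 6, 28)),
    ("TRMCX", date(2014, 6, 25)),
    ("TRT2", date(2016, 4, 19)),
]

for ticker, pricing_date in effective_dates(transactions).items():
    print(ticker, pricing_date)
```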
Perhaps, I'm missing something here but without any real sample data, this is the best I can come up with. Below are the table ddl's, inserts and query and an image of the results.
CREATE TABLE PRICE_HISTORY
( PRICE_DATE DATE,
TICKER VARCHAR2(20 BYTE),
OPEN_PRICE NUMBER,
CLOSE_PRICE NUMBER
);
CREATE TABLE PORTFOLIO_TABLE
( TICKER VARCHAR2(20 BYTE),
TICKER_DATE VARCHAR2(20 BYTE),
CUSTOMER VARCHAR2(20 BYTE)
);
Insert into PORTFOLIO_TABLE (TICKER,TICKER_DATE,CUSTOMER) values ('CYSL','1/16/2018','1');
Insert into PORTFOLIO_TABLE (TICKER,TICKER_DATE,CUSTOMER) values ('MCIG','04/1/2016','2');
Insert into PORTFOLIO_TABLE (TICKER,TICKER_DATE,CUSTOMER) values ('MSRT','09/13/2016','3');
Insert into PORTFOLIO_TABLE (TICKER,TICKER_DATE,CUSTOMER) values ('NTFU','01/16/2017','4');
Insert into PRICE_HISTORY (PRICE_DATE,TICKER,OPEN_PRICE,CLOSE_PRICE) values (to_date('27-MAR-2017 20:27:12','DD-MON-RRRR HH24:MI:SS'),'CYSL',1,2);
Insert into PRICE_HISTORY (PRICE_DATE,TICKER,OPEN_PRICE,CLOSE_PRICE) values (to_date('16-JUN-1997 20:27:33','DD-MON-RRRR HH24:MI:SS'),'MCIG',1,2);
Insert into PRICE_HISTORY (PRICE_DATE,TICKER,OPEN_PRICE,CLOSE_PRICE) values (to_date('31-MAY-2011 20:27:45','DD-MON-RRRR HH24:MI:SS'),'MSRT',5,8);
Insert into PRICE_HISTORY (PRICE_DATE,TICKER,OPEN_PRICE,CLOSE_PRICE) values (to_date('25-JAN-2021 20:27:55','DD-MON-RRRR HH24:MI:SS'),'NTFU',7,6);
WITH portfolio AS
( SELECT TICKER , TICKER_DATE, CUSTOMER FROM PORTFOLIO_TABLE
)
SELECT
CASE
WHEN PRICE_DATE > DATE '2017-01-01'
THEN CLOSE_PRICE
ELSE OPEN_PRICE
END AS AMOUNT,
PRICE_DATE,
P.TICKER,
OPEN_PRICE,
CLOSE_PRICE
FROM PRICE_HISTORY H,
PORTFOLIO P
WHERE H.TICKER = P.TICKER;
The ultimate goal of this query is to get a SUM of the customer's portfolio, based on his purchases.
This part is working perfectly!
SELECT m_ticker,
GREATEST(
DATE '2016-12-31',
MIN(transaction_date) KEEP (
DENSE_RANK FIRST
ORDER BY transaction_date
)
) AS transaction_date
FROM customer_portfolio_history
WHERE portfolio_id = 954118
GROUP BY m_ticker;
Gives me the data I need:
CYSL 1/16/2017
MCIG 12/31/2016
MSRT 12/31/2016
NTFU 1/16/2017
QNTM 12/31/2016
SIGWX 12/31/2016
TRMCX 12/31/2016
TRT2 12/31/2016
Now it gets trickier. With the results above, I need to go into PRICE_HISTORY and find the price on that date or, failing that, on the closest earlier date.
So, if there is no entry for 12/31/2016 (maybe the market was closed), then try 12/30, 12/29, etc. Same for 1/16/2017: if there is no entry, then try 1/15, then 1/14, and so on.
After that, I can take the total number of shares from the PORTFOLIO table for that customer / portfolio / ticker and multiply it by the price I retrieve for the day found, and that gives the value.
Pretty crazy, I know.
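The "on that date, else the closest earlier date" lookup described above can be sketched in Python with a binary search over a sorted price history. This is only an illustration of the lookup rule; the function name price_on_or_before, the sample prices and the share count are all made up for the example.

```python
from bisect import bisect_right
from datetime import date


def price_on_or_before(history, target):
    """Return the (price_date, close_price) entry on target or the closest earlier date.

    history must be a list of (price_date, close_price) sorted by price_date.
    Returns None when no entry exists on or before target.
    """
    dates = [price_date for price_date, _ in history]
    position = bisect_right(dates, target)  # number of entries dated <= target
    if position == 0:
        return None
    return history[position - 1]


# Hypothetical closing prices for one ticker; no entry on 31 Dec 2016 (market closed).
history = [
    (date(2016, 12, 28), 10.0),
    (date(2016, 12, 29), 10.5),
    (date(2016, 12, 30), 11.0),
    (date(2017, 1, 3), 11.2),
]

found = price_on_or_before(history, date(2016, 12, 31))
shares = 100  # hypothetical number of shares held for this ticker
if found is not None:
    price_date, close_price = found
    print(price_date, close_price, shares * close_price)
```

The position value of the holding is then the share count multiplied by the price found, as described in the question.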

Loop through date range

How can I loop an Oracle query through a range of dates? I have to put the variable in 4 places. My query starts with WITH ... AS, so I can't use the "Oracle SQL Loop through Date Range" solution.
I also can't create a temporary table.
Here is my attempt:
WITH d
AS (
SELECT DATE'2015-06-22' + LEVEL - 1 AS current_d
FROM dual
CONNECT BY DATE'2015-06-22' + LEVEL - 1 < DATE'2015-10-04'
),
OrderReserve
AS (
SELECT cvwarehouseid
,lproductid
,SUM(lqty) lqty
FROM ABBICS.iOrdPrdQtyDate
GROUP BY cvwarehouseid
,lproductid
)
SELECT
...
WHERE IORDREFILL.DNCONFIRMEDDELDATE < CAST(TO_CHAR(d.current_d , 'YYYYMMDD') AS NUMBER(38))
...
If I understand you correctly, you assume that you can only use one named subquery per query. That is not true: a WITH clause can define several of them, so you can extend the existing WITH clause with another subquery that generates the dates to loop through:
with OrderReserve as (
SELECT cvwarehouseid
,lproductid
,SUM(lqty) lqty
FROM ABBICS.iOrdPrdQtyDate
GROUP BY cvwarehouseid
,lproductid
), date_range as (
select sysdate+level as current_d
from dual
connect by level <= 30
)
select *
from OrderReserve, date_range
... -- expand with date_range as you see fit
;
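For comparison, the row generator in the question (DATE'2015-06-22' + LEVEL - 1 with an exclusive upper bound of DATE'2015-10-04') and the YYYYMMDD number conversion can be mirrored in a few lines of Python. This is a sketch to show which dates the generator yields; date_range and as_yyyymmdd are names invented for the example.

```python
from datetime import date, timedelta


def date_range(start, stop_exclusive):
    """Yield every date from start up to, but not including, stop_exclusive."""
    for offset in range((stop_exclusive - start).days):
        yield start + timedelta(days=offset)


def as_yyyymmdd(d):
    """Convert a date to a number such as 20150622, like TO_CHAR(..., 'YYYYMMDD') cast to NUMBER."""
    return int(d.strftime("%Y%m%d"))


days = list(date_range(date(2015, 6, 22), date(2015, 10, 4)))

print(len(days), days[0], days[-1])
print(as_yyyymmdd(days[0]))
```

Each generated date then takes the place of d.current_d in the comparison against DNCONFIRMEDDELDATE.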

Aggregate only new rows from source table

I have a source table with a timestamp column (YYYY.MM.DD HH24:MI:SS) and a target table with rows aggregated on a daily basis (date column: YYYY.MM.DD).
My problem is: how do I bring new data from the source into the target and aggregate it?
I tried:
select
a.Sales,
trunc(a.timestamp,'DD') as TIMESTAMP,
count(1) as COUNT
from
tbl_Source a
where trunc(a.timestamp,'DD') > nvl((select MAX(b.TIME_TO_DAY)from tbl_target b), to_date('01.01.1975 00:00:00','dd.mm.yyyy hh24:mi:ss'))
group by a.sales,
trunc(a.Timestamp,'DD')
The problem with that is: when I have a row with timestamp '2013.11.15 00:01:32' and the max day in the target is the 14th of November, it will only aggregate the 15th. If I used >= instead of >, some rows would get loaded twice.
It looks like you are looking for a merge statement: If the day is already present in tbl_target then update the count else insert the record.
merge into tbl_target dest
using
(
select sales, trunc(timestamp) as theday , count(*) as sales_count
from tbl_Source
where trunc(timestamp) >= ( select nvl(max(time_to_day),to_date('01.01.1975','dd.mm.yyyy')) from tbl_target )
group by sales, trunc(timestamp)
) src
on (src.theday = dest.time_to_day)
when matched then update set
dest.sales_count = src.sales_count
when not matched then
insert (time_to_day, sales_count)
values (src.theday, src.sales_count)
;
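The reason the merge can safely use >= is that a day already present in the target is overwritten with a fresh count instead of being added a second time. A small Python sketch of that upsert behaviour follows; it is an illustration under simplifying assumptions (refresh_target is an invented name, the target is modelled as a dict keyed by day, and the sales column is left out).

```python
from collections import Counter
from datetime import date, datetime


def refresh_target(target, source_rows):
    """Recount every day from the last loaded day onwards and upsert it into target.

    target: dict mapping day -> count (stands in for tbl_target)
    source_rows: list of datetimes (stands in for tbl_source.timestamp)
    """
    last_loaded = max(target) if target else date(1975, 1, 1)
    # ">=" on purpose: the last loaded day is counted again in full.
    recount = Counter(ts.date() for ts in source_rows if ts.date() >= last_loaded)
    # Matched days are overwritten, unmatched days are inserted.
    target.update(recount)
    return target


target = {date(2013, 11, 14): 2}  # the 14th was loaded before its last row arrived
source_rows = [
    datetime(2013, 11, 14, 9, 0, 0),
    datetime(2013, 11, 14, 17, 30, 0),
    datetime(2013, 11, 14, 23, 59, 0),  # arrived after the previous load
    datetime(2013, 11, 15, 0, 1, 32),
    datetime(2013, 11, 15, 8, 15, 0),
]

refresh_target(target, source_rows)
print(target)
```

Running the refresh a second time with the same source rows leaves the target unchanged, which is the property the asker was missing with a plain insert.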
As far as I understand your question, you need to get everything added since the last reload of the target table.
The problem is that you need the time of that reload, but it is lost when the timestamp is truncated to a day.
If my guess is correct, you cannot do anything except store the reload date in an additional column, because there is no way to recover it from the data presented here.
About your query:
count(*) and count(1) perform the same (proved many times, at least in versions 10 and 11), so there is no reason to write count(1), which looks really ugly
prefer coalesce to nvl: coalesce stops evaluating at the first non-null argument, whereas nvl always evaluates both
I would write your query like that:
with t as (select max(b.time_to_day) mx from tbl_target b)
select a.sales,trunc(a.timestamp,'dd') as timestamp,count(*) as count
from tbl_source a,t
where trunc(a.timestamp,'dd') > t.mx or t.mx is null
group by a.sales,trunc(a.timestamp,'dd')
Does this fit your needs:
WHERE trunc(a.timestamp,'DD') > nvl((select MAX(b.TIME_TO_DAY) + 1 - 1/(24*60*60) from tbl_target b), to_date('01.01.1975 00:00:00','dd.mm.yyyy hh24:mi:ss'))
i.e. instead of 2013-11-15 00:00:00, compare to 2013-11-15 23:59:59
Update:
This one?
WHERE trunc(a.timestamp,'DD') BETWEEN nvl((select MAX(b.TIME_TO_DAY) from ...) AND nvl((select MAX(b.TIME_TO_DAY) + 1 - 1/(24*60*60) from ...)

modifying query resultset to include all dates in a range

I have a query that I run on a table TXN_DEC(id, resourceid, usersid, date, eventdesc). It returns the distinct count of users for a given date range and resourceid, grouped by date and eventdesc (each resource can have 4 to 5 eventdesc values).
If there is no distinct user count on a date in the range for an eventdesc, that date row is missing from the result set.
I need all date rows in my result set or collection, so that if there is no count for a date/eventdesc combination, its value is set to 0 but the date still exists in the collection.
How do I go about getting such a collection?
I know getting the final dataset entirely from the query result would be too complicated,
but I can use collections in Groovy to modify and populate my map/list to get the data in the required format.
Something similar to the following, if
input date range = 5th Feb to 3 March 2011
DataMap = [dateval: '02/05/2011' eventdesc: 'Read' dist_ucnt: 23,
dateval: '02/06/2011' eventdesc: 'Read' dist_ucnt: 23,
dateval: '02/07/2011' eventdesc: 'Read' dist_ucnt: 0, -> this row was not present in query resultset, but row exists in the map with value 0
....and so on till 3 march 2011 and then whole range repeated for each eventdesc
]
If you want all dates (including those with no entries in your TXN_DEC table) for a given range, you could use Oracle to generate your date range and then use an outer join to your existing query. Then you would just need to fill in null values. Something like:
select
d.dateInRange as dateval,
'Read' as eventdesc,
nvl(td.dist_ucnt, 0) as dist_ucnt
from (
select
to_date('05-FEB-2011','dd-mon-yyyy') + rownum - 1 as dateInRange
from all_objects
where rownum <= to_date('03-MAR-2011','dd-mon-yyyy') - to_date('05-FEB-2011','dd-mon-yyyy') + 1
) d
left join (
select
date,
count(distinct usersid) as dist_ucnt
from
txn_dec
where eventDesc = 'Read'
group by date
) td on td.date = d.dateInRange
That's my purely Oracle solution since I'm not a Groovy guy (well, actually, I am a pretty groovy guy...)
EDIT: Here's the same version wrapped in a stored procedure. It should be easy to call if you know the API....
create or replace procedure getDateRange (
p_begin_date IN DATE,
p_end_date IN DATE,
p_event IN txn_dec.eventDesc%TYPE,
p_recordset OUT SYS_REFCURSOR)
AS
BEGIN
OPEN p_recordset FOR
select
d.dateInRange as dateval,
p_event as eventdesc,
nvl(td.dist_ucnt, 0) as dist_ucnt
from (
select
p_begin_date + rownum - 1 as dateInRange
from all_objects
where rownum <= p_end_date - p_begin_date + 1
) d
left join (
select
date,
count(distinct usersid) as dist_ucnt
from
txn_dec
where eventDesc = p_event
group by date
) td on td.date = d.dateInRange;
END getDateRange;
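If the zero-filling is done on the client side instead, as the question suggests with Groovy collections, the same idea is a loop over every date in the range with a default of 0 for missing combinations. A minimal sketch in Python rather than Groovy, with invented names (fill_missing, counts) and made-up counts taken from the example in the question:

```python
from datetime import date, timedelta


def fill_missing(counts, start, stop, events):
    """Build one row per (event, day) in [start, stop], using 0 where the query returned nothing.

    counts: dict mapping (day, eventdesc) -> distinct user count, as returned by the query
    """
    rows = []
    for event in events:
        day = start
        while day <= stop:
            rows.append({
                "dateval": day,
                "eventdesc": event,
                "dist_ucnt": counts.get((day, event), 0),  # 0 for absent combinations
            })
            day += timedelta(days=1)
    return rows


# Hypothetical query result: only two days have a count for 'Read'.
counts = {
    (date(2011, 2, 5), "Read"): 23,
    (date(2011, 2, 6), "Read"): 23,
}

rows = fill_missing(counts, date(2011, 2, 5), date(2011, 3, 3), ["Read"])
for row in rows[:3]:
    print(row)
```

Passing several eventdesc values in the events list repeats the whole date range once per event, which matches the layout sketched in the question.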
