How to fill out grouped data by day? - oracle

From the events table, I get the event count grouped by day. For this I use:
SELECT event_date, COUNT(event_id) event_count FROM events
WHERE event_date >= TRUNC(SYSDATE-1, 'DD')
GROUP BY event_date
ORDER BY event_date
My problem is that this returns only the days on which there are events:
2017-04-03 , 4
2017-04-05 , 2
but I need all consecutive days from yesterday through the next 30 days, filled out with my grouped event data, like this:
2017-03-31 ,
2017-04-01 ,
2017-04-02 ,
2017-04-03 , 4
2017-04-04 ,
2017-04-05 , 2
2017-04-06 ,
... next 30 days (with event counts where events exist in my table)
...
How can I do that? Thank you for any help.

So you can generate the next 30 days starting from yesterday (see the first subquery) and then simply left join it to your existing query.
Try this:
select all_days.days, certain_days.event_count
from (
  select trunc(sysdate + level - 2, 'DD') as days
  from dual
  connect by level <= 30
) all_days
left join (
  select event_date, count(event_id) as event_count
  from events
  where event_date >= trunc(sysdate - 1, 'DD')
  group by event_date
) certain_days
  on all_days.days = certain_days.event_date
order by all_days.days
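If event_date carries a time component, the equality join above will not match, and days without events show NULL rather than 0. A possible variant (just a sketch, same table and columns as above) truncates the date inside the subquery and defaults the count:
select all_days.days, nvl(certain_days.event_count, 0) as event_count
from (
  select trunc(sysdate + level - 2, 'DD') as days
  from dual
  connect by level <= 30
) all_days
left join (
  select trunc(event_date, 'DD') as event_day, count(event_id) as event_count
  from events
  where event_date >= trunc(sysdate - 1, 'DD')
  group by trunc(event_date, 'DD')
) certain_days
  on all_days.days = certain_days.event_day
order by all_days.days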

Related

Efficiently get array of all previous dates per id per date limited to past 6 months in BigQuery

I have a very big table 'DATES_EVENTS' (20 T) that looks like this:
ID DATE
1 '2022-04-01'
1 '2022-03-02'
1 '2022-03-01'
2 '2022-05-01'
3 '2021-12-01'
3 '2021-11-11'
3 '2020-11-11'
3 '2020-10-01'
For each row, I want to get all past dates (per user), limited to up to 6 months back.
My desired table:
ID DATE DATE_list
1 '2022-04-01' ['2022-04-01','2022-03-02','2022-03-01']
1 '2022-03-02' ['2022-03-02','2022-03-01']
1 '2022-03-01' ['2022-03-01']
2 '2022-05-01' ['2022-05-01']
3 '2021-12-01' ['2021-12-01','2021-11-11']
3 '2021-11-11' ['2021-11-11']
3 '2020-11-11' ['2020-11-11','2020-10-01']
3 '2020-10-01' ['2020-10-01']
I have a solution for all dates, without the limit:
SELECT
ID, DATE, ARRAY_AGG(DATE) OVER (PARTITION BY ID ORDER BY DATE) as DATE_list
FROM
DATES_EVENTS
But I don't have an efficient solution for the 6-month limit:
SELECT
distinct A.ID, A.DATE, ARRAY_AGG(B.DATE) OVER (PARTITION BY B.ID ORDER BY B.DATE) as DATE_list
FROM
DATES_EVENTS A
INNER JOIN
DATES_EVENTS B
ON
A.ID=B.ID
AND B.DATE BETWEEN DATE_SUB(A.DATE, INTERVAL 180 DAY) AND A.DATE
** roughly a solution
Anyone know of a good and efficient way to do what I need?
Consider the approach below:
select id, date, array(
select day
from t.date_list day
where day <= date
order by day desc
) as date_list
from (
select *, array_agg(date) over win as date_list
from dates_events
window win as (
partition by id
order by extract(year from date) * 12 + extract(month from date)
range between 5 preceding and current row
)
) t
If applied to the sample data in your question, the output matches the desired table above.
If (as I noted in your question) 180 days is an acceptable substitute for 6 months, you can use the simpler version below:
select *, array_agg(date) over win as date_list
from dates_events
window win as (
partition by id
order by unix_date(date) desc
range between current row and 179 following
)
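As a quick way to try either query, the sample rows from the question can be inlined as a CTE (a hypothetical test scaffold; replace it with your real dates_events table):
with dates_events as (
  select 1 as id, date '2022-04-01' as date union all
  select 1, date '2022-03-02' union all
  select 1, date '2022-03-01' union all
  select 2, date '2022-05-01' union all
  select 3, date '2021-12-01' union all
  select 3, date '2021-11-11' union all
  select 3, date '2020-11-11' union all
  select 3, date '2020-10-01'
)
select *, array_agg(date) over win as date_list
from dates_events
window win as (
  partition by id
  order by unix_date(date) desc
  range between current row and 179 following
)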

How to group by month including all months?

I group my table by months
SELECT TO_CHAR (created, 'YYYY-MM') AS operation, COUNT (id)
FROM user_info
WHERE created IS NOT NULL
GROUP BY ROLLUP (TO_CHAR (created, 'YYYY-MM'))
2015-04 1
2015-06 10
2015-08 22
2015-09 8
2015-10 13
2015-12 5
2016-01 25
2016-02 37
2016-03 24
2016-04 1
2016-05 1
2016-06 2
2016-08 2
2016-09 7
2016-10 103
2016-11 5
2016-12 2
2017-04 14
2017-05 2
284
But the records don't cover all the months.
I would like the output to include all the months, with the missing ones displayed in the output with a default value:
2017-01 ...
2017-02 ...
2017-03 ZERO
2017-04 ZERO
2017-05 ...
Oracle has a good array of date manipulation functions. The two pertinent ones for this problem are
MONTHS_BETWEEN() which calculates the number of months between two dates
ADD_MONTHS() which increments a date by the given number of months
We can combine these functions to generate a table of all the months spanned by your table's records. Then we use an outer join to conditionally join records from USER_INFO to that calendar. When no records match, count(id) will be zero.
with cte as (
  select max(trunc(created, 'MM')) as max_dt
       , min(trunc(created, 'MM')) as min_dt
  from user_info
)
, cal as (
  select add_months(min_dt, level - 1) as mth
  from cte
  connect by level <= months_between(max_dt, min_dt) + 1
)
select to_char(cal.mth, 'YYYY-MM') as operation
     , count(id)
from cal
left outer join user_info
  on trunc(user_info.created, 'mm') = cal.mth
group by rollup (cal.mth)
order by 1
/
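If you want the literal default shown in the question (e.g. ZERO) rather than 0 for the missing months, you can wrap the count in a CASE. A sketch built on the same calendar, dropping the rollup total for simplicity:
with cte as (
  select min(trunc(created, 'MM')) as min_dt
       , max(trunc(created, 'MM')) as max_dt
  from user_info
)
, cal as (
  select add_months(min_dt, level - 1) as mth
  from cte
  connect by level <= months_between(max_dt, min_dt) + 1
)
select to_char(cal.mth, 'YYYY-MM') as operation
     , case when count(user_info.id) = 0 then 'ZERO'
            else to_char(count(user_info.id))
       end as cnt
from cal
left outer join user_info
  on trunc(user_info.created, 'mm') = cal.mth
group by cal.mth
order by 1
/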

oracle sql, counting working hours

I have been trying to find something related but couldn't.
I need to produce an availability percentage for something. I have a table of events that are happening, which I managed to count by the day they happen, but I'm having trouble counting the total number of working hours in a quarter or a year, when each day of the week has a different weight.
Basically my question is: can I do it without building a table with all the dates in that month/year?
An example of the data:
ID DATE duration Environment
1 23/10/15 25 a
2 15/01/15 50 b
3 01/01/15 43 c
8 05/06/14 7 b
A calculated field or just a general query to get the information would work for me.
Sorry, I don't really understand the question, but if you want to generate dates, CONNECT BY LEVEL is an easy way to do it (you could also use the MODEL clause or a recursive WITH). I did it here for just 10 days, but you get the idea. I put in your dates as t1, generated a list of dates (t), and then did a left outer join to put them together.
WITH t AS (
  SELECT to_date('01-01-2015', 'mm-dd-yyyy') + level - 1 AS dt,
         NULL AS duration
  FROM dual
  CONNECT BY level <= 10
),
t1 AS (
  SELECT to_date('10/01/15', 'dd-mm-yy') AS dt, 50 AS duration FROM dual
  UNION ALL
  SELECT to_date('01/01/15', 'dd-mm-yy'), 43 FROM dual
  UNION ALL
  SELECT to_date('06/01/15', 'dd-mm-yy'), 43 FROM dual
)
SELECT t.dt,
       NVL(NVL(t1.duration, t.duration), 0) AS duration
FROM t,
     t1
WHERE t.dt = t1.dt(+)
ORDER BY dt
results
DT Duration
01-JAN-15 43
02-JAN-15 0
03-JAN-15 0
04-JAN-15 0
05-JAN-15 0
06-JAN-15 43
07-JAN-15 0
08-JAN-15 0
09-JAN-15 0
10-JAN-15 50
This was my intention, and the full answer is below.
WITH t AS (
  SELECT to_date('01-01-2015', 'mm-dd-yyyy') + level - 1 AS dt
  FROM dual
  CONNECT BY level <= 365
),
t1 AS (
  SELECT dt,
         CASE
           WHEN to_char(t.dt, 'DY') IN ('MON', 'TUE', 'WED', 'THU', 'FRI') THEN 14*60
           WHEN to_char(t.dt, 'DY') IN ('SAT') THEN 8*60
           WHEN to_char(t.dt, 'DY') IN ('SUN') THEN 10*60
           ELSE 0
         END AS duration,
         to_char(t.dt, 'Q') AS quarter
  FROM t
)
SELECT to_char(t1.dt, 'yyyy'), to_char(t1.dt, 'Q'), SUM(t1.duration)
FROM t1
GROUP BY to_char(t1.dt, 'yyyy'), to_char(t1.dt, 'Q');
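To turn these working minutes into the availability percentage asked about, one option is to aggregate the event durations per day and divide by the working minutes per quarter. A rough sketch, assuming a table named events with hypothetical columns event_date and duration (downtime in minutes):
WITH cal AS (
  SELECT to_date('01-01-2015', 'mm-dd-yyyy') + level - 1 AS dt
  FROM dual
  CONNECT BY level <= 365
),
working AS (
  SELECT dt,
         CASE
           WHEN to_char(dt, 'DY') IN ('MON', 'TUE', 'WED', 'THU', 'FRI') THEN 14*60
           WHEN to_char(dt, 'DY') IN ('SAT') THEN 8*60
           WHEN to_char(dt, 'DY') IN ('SUN') THEN 10*60
           ELSE 0
         END AS work_minutes
  FROM cal
),
ev AS (
  -- total downtime minutes per day (event_date/duration are assumed names)
  SELECT trunc(event_date) AS dt, SUM(duration) AS downtime
  FROM events
  GROUP BY trunc(event_date)
)
SELECT to_char(w.dt, 'yyyy') AS yr,
       to_char(w.dt, 'Q') AS quarter,
       SUM(w.work_minutes) AS work_minutes,
       NVL(SUM(e.downtime), 0) AS downtime_minutes,
       ROUND(100 * (1 - NVL(SUM(e.downtime), 0) / SUM(w.work_minutes)), 2) AS availability_pct
FROM working w
LEFT JOIN ev e
  ON e.dt = w.dt
GROUP BY to_char(w.dt, 'yyyy'), to_char(w.dt, 'Q')
ORDER BY 1, 2;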

Finding a count of rows in an arbitrary date range using Oracle

The question I need to answer is this: "What is the maximum number of page requests we have ever received in a 60-minute period?"
I have a table that looks similar to this:
date_page_requested date;
page varchar(80);
I'm looking for the MAX count of rows in any 60 minute timeslice.
I thought analytic functions might get me there but so far I'm drawing a blank.
I would love a pointer in the right direction.
You have some options in the other answers that will work; here is one that uses Oracle's "Windowing Functions with Logical Offset" feature instead of joins or correlated subqueries.
First the test table:
create table t pctfree 0 nologging as
select date '2011-09-15' + level / (24 * 4) as date_page_requested
from dual
connect by level <= (24 * 4);

Table created.

insert into t values (to_date('2011-09-15 11:11:11', 'YYYY-MM-DD HH24:MI:SS'));

1 row created.

commit;

Commit complete.
T now contains a row every quarter hour for a day, with one additional row at 11:11:11 AM. The query proceeds in three steps. Step 1 is, for every row, to get the number of rows that fall within the hour after that row's time:
with x as (
  select date_page_requested
       , count(*) over (order by date_page_requested
                        range between current row
                        and interval '1' hour following) as hour_count
  from t)
Then assign the ordering by hour_count:
, y as (
  select date_page_requested
       , hour_count
       , row_number() over (order by hour_count desc, date_page_requested asc) as rn
  from x)
And finally select the earliest row that has the greatest number of following rows.
select to_char(date_page_requested, 'YYYY-MM-DD HH24:MI:SS')
     , hour_count
from y
where rn = 1
If multiple 60 minute windows tie in hour count, the above will only give you the first window.
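Putting the three fragments together, the whole thing runs as a single statement against the test table t above:
with x as (
  select date_page_requested
       , count(*) over (order by date_page_requested
                        range between current row
                        and interval '1' hour following) as hour_count
  from t
)
, y as (
  select date_page_requested
       , hour_count
       , row_number() over (order by hour_count desc, date_page_requested asc) as rn
  from x
)
select to_char(date_page_requested, 'YYYY-MM-DD HH24:MI:SS') as window_start
     , hour_count
from y
where rn = 1;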
This should give you what you need; the first row returned should have the hour with the highest number of pages.
select number_of_pages
     , hour_requested
from (select to_char(date_page_requested, 'dd/mm/yyyy hh24') as hour_requested
           , count(*) as number_of_pages
      from pages
      group by to_char(date_page_requested, 'dd/mm/yyyy hh24')) p
order by number_of_pages desc
How about something like this?
SELECT TOP 1
ranges.date_start,
COUNT(data.page) AS Tally
FROM (SELECT DISTINCT
date_page_requested AS date_start,
DATEADD(HOUR,1,date_page_requested) AS date_end
FROM #Table) ranges
JOIN #Table data
ON data.date_page_requested >= ranges.date_start
AND data.date_page_requested < ranges.date_end
GROUP BY ranges.date_start
ORDER BY Tally DESC
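Since the question is about Oracle and the snippet above uses SQL Server constructs (TOP, DATEADD, #Table), a rough transliteration of the same self-join idea into Oracle syntax might look like this (a sketch; page_requests stands in for your table):
select *
from (
  select ranges.date_start,
         count(data.page) as tally
  from (select distinct
               date_page_requested as date_start,
               date_page_requested + interval '1' hour as date_end
        from page_requests) ranges
  join page_requests data
    on data.date_page_requested >= ranges.date_start
   and data.date_page_requested <  ranges.date_end
  group by ranges.date_start
  order by tally desc
)
where rownum = 1;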
For PostgreSQL, I'd first probably write something like this for a "window" aligned on the minute. You don't need OLAP windowing functions for this.
select w.ts,
date_trunc('minute', w.ts) as hour_start,
date_trunc('minute', w.ts) + interval '1' hour as hour_end,
(select count(*)
from weblog
where ts between date_trunc('minute', w.ts) and
(date_trunc('minute', w.ts) + interval '1' hour) ) as num_pages
from weblog w
group by ts, hour_start, hour_end
order by num_pages desc
Oracle also has a trunc() function, but I'm not sure of the format. I'll either look it up in a minute, or leave to see a friend's burlesque show.
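For reference, the Oracle equivalent of that truncation is trunc(ts, 'MI'), so a rough Oracle transliteration of the sketch above (keeping the hypothetical weblog/ts names) could be:
select distinct hour_start, hour_end, num_pages
from (
  select trunc(w.ts, 'MI') as hour_start,
         trunc(w.ts, 'MI') + interval '1' hour as hour_end,
         (select count(*)
          from weblog
          where ts >= trunc(w.ts, 'MI')
            and ts <  trunc(w.ts, 'MI') + interval '1' hour) as num_pages
  from weblog w
)
order by num_pages desc;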
WITH ranges AS
( SELECT
date_page_requested AS StartDate,
date_page_requested + (1/24) AS EndDate,
ROW_NUMBER() OVER (ORDER BY date_page_requested) AS RowNo
FROM
#Table
)
SELECT
a.StartDate AS StartDate,
MAX(b.RowNo) - a.RowNo + 1 AS Tally
FROM
ranges a
JOIN
ranges b
ON a.StartDate <= b.StartDate
AND b.StartDate < a.EndDate
GROUP BY a.StartDate
, a.RowNo
ORDER BY Tally DESC
or:
WITH ranges AS
( SELECT
date_page_requested AS StartDate,
date_page_requested + (1/24) AS EndDate,
ROW_NUMBER() OVER (ORDER BY date_page_requested) AS RowNo
FROM
#Table
)
SELECT
a.StartDate AS StartDate,
( SELECT MIN(b.RowNo) - a.RowNo
FROM ranges b
WHERE b.StartDate > a.EndDate
) AS Tally
FROM
ranges a
ORDER BY Tally DESC

How to “group by” over a DATETIME range?

I'm trying to build a datetime-range-based transactions report for a business that can be open across two days, depending on how shifts are managed.
The user can select a datetime range (monthly, daily, weekly, free-form...); the query I implemented gets the startDateTime and the endDateTime and returns the transaction totals grouped by day.
For example:
DateTime Total Sales
---------------------------
10/15/2010 $2,300.38
10/16/2010 $1,780.00
10/17/2010 $4,200.22
10/20/2010 $900.66
My problem is that if the business's shift is set, for example, from 05:00 AM to 02:00 AM of the next day, all the transactions done from midnight to 02:00 AM are grouped into the next day... and so on... so the totals are wrong.
When a business has a shift like this, it wants a report based on that shift, but without patching the code (I'm using Java calling native Oracle queries) I'm unable to produce the requested report.
I'm wondering if there is some smart way to group these transactions by a datetime range using nothing more than Oracle.
Here is the query for a one-month range:
SELECT Q1.dateFormat, NVL(Q1.sales, 0)
FROM (
SELECT to_date(to_char(tx.datetimeGMT +1/24 , 'mm-dd-yyyy'), 'mm-dd-yyyy') AS dateFormat
, NVL(SUM(tx.amount),0) AS sales
FROM Transaction tx
WHERE tx.datetimeGMT > to_date('20100801 08:59:59', 'yyyymmdd hh24:mi:ss') +1/24
AND tx.datetimeGMT < to_date('20100901 09:00:00', 'yyyymmdd hh24:mi:ss') + 1/24
GROUP BY to_date(to_char(tx.datetimeGMT +1/24 , 'mm-dd-yyyy'), 'mm-dd-yyyy')
) Q1
ORDER BY 1 DESC
Thank you all for your answers; by looking at them I could write the query I was searching for:
SELECT CASE
WHEN EXTRACT(HOUR FROM TX.DATETIME) >= 5 THEN TO_CHAR(TX.DATETIME,'DD-MM-YYYY')
WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 0 AND 2 THEN TO_CHAR(TX.DATETIME-1,'DD-MM-YYYY')
WHEN EXTRACT(hour from tx.datetime) between 2 and 5 THEN to_char(TX.DATETIME-1,'DD-MM-YYYY')
END AS age,
NVL(SUM(tx.amount),0) AS sales
FROM TRANSACTION TX
WHERE tx.datetime > to_date('20100801 08:59:59', 'yyyymmdd hh24:mi:ss')
AND TX.DATETIME < TO_DATE('20100901 09:00:00', 'yyyymmdd hh24:mi:ss')
GROUP BY CASE
WHEN EXTRACT(HOUR FROM TX.DATETIME) >= 5 THEN TO_CHAR(TX.DATETIME,'DD-MM-YYYY')
WHEN EXTRACT(HOUR FROM TX.DATETIME) BETWEEN 0 AND 2 THEN TO_CHAR(TX.DATETIME-1,'DD-MM-YYYY')
WHEN EXTRACT(hour from tx.datetime) between 2 and 5 THEN to_char(TX.DATETIME-1,'DD-MM-YYYY')
END
ORDER BY 1
To group by a date range, you have to turn that range into a column value in a subquery and group by it in the outer query. In this case the column value holding the range will be of VARCHAR type.
If the first shift of the day starts at 08:00, and the last shift of that same day ends at 07:59 the next day, you can use something like this to group the transactions by the shift date.
select trunc(trans_date - interval '8' hour) as shift_date
,sum(amount)
from transactions
group
by trunc(trans_date - interval '8' hour)
order
by shift_date desc;
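For the 05:00-to-02:00 shift described in the question, the same idea would look roughly like this (a sketch, reusing the table and column names from the question's query):
select trunc(tx.datetimeGMT - interval '5' hour) as shift_date
     , sum(tx.amount) as sales
from Transaction tx
group by trunc(tx.datetimeGMT - interval '5' hour)
order by shift_date desc;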
You can try this approach (just out of my head, not even sure if it runs):
select
  normalized_trans_date as trans_date,
  trans_shift,
  aggregates(whatever)
from (
  select
    -- we want to group by the normalized transaction date,
    -- not by the real transaction date
    normalized_trans_date,
    -- get the shift to group by
    case
      when trans_date between trunc(normalized_trans_date) + shift_1_start_offset and
                              trunc(normalized_trans_date) + shift_1_end_offset then
        1
      when trans_date between trunc(normalized_trans_date) + shift_2_start_offset and
                              trunc(normalized_trans_date) + shift_2_end_offset then
        2
      ...
      when trans_date between trunc(normalized_trans_date) + shift_N_start_offset and
                              trunc(normalized_trans_date) + shift_N_end_offset then
        N
    end as trans_shift,
    whatever
  from (
    select
      -- get a normalized transaction date: if the date is before the 1st shift,
      -- it belongs to the day before
      case
        when trans_date - trunc(trans_date) < shift_1_start_offset then
          trans_date - 1
        else
          trans_date
      end as normalized_trans_date,
      t.*
    from
      transactions t
  )
)
group by normalized_trans_date, trans_shift
Ronnis's solution with trunc(trans_date - interval '8' hour) helped me in a similar query.
I did a backup report and had to summarize output_bytes from RC_BACKUP_SET_DETAILS. The backup task runs for more than 8 hours, and there are several RC_BACKUP_SET_DETAILS rows for one job, which starts at night and ends the next day.
select trunc(start_time - interval '12' hour) "Start Date",
to_char(sum(output_bytes)/(1024*1024*1024),'999,990.0') "Output GB"
from rc_backup_set_details
where db_key = 173916 and backup_type = 'I' and incremental_level = 0
group by trunc(start_time - interval '12' hour)
order by 1 asc;
