I have a scenario in which I have to aggregate data over a dynamic 24-hour period.
For example: if a user selects the FROM date as Jan 05 2016 8:00 AM and the TO date as Jan 10 2016 2:00 AM, the data in the output should be aggregated from Jan 05 2016 8:00 AM to Jan 06 2016 7:59 AM as one day (Jan 05 2016), and so on:
Jan 5 2016 - Jan 5 2016 8:00 AM to Jan 6 2016 7:59 AM
Jan 6 2016 - Jan 6 2016 8:00 AM to Jan 7 2016 7:59 AM
Jan 7 2016 - Jan 7 2016 8:00 AM to Jan 8 2016 7:59 AM
Jan 8 2016 - Jan 8 2016 8:00 AM to Jan 9 2016 7:59 AM
Jan 9 2016 - Jan 9 2016 8:00 AM to Jan 10 2016 2:00 AM
To achieve this, I subtracted 8 hours from the date column in the fact table and joined it to the Date Dimension. The query looks like this:
SELECT D.DAY_FMT,SUM(F.MEASURE) from FACT F
INNER JOIN DATES D ON
to_number(to_char((F.DATESTIME - 0.3333333),'YYYYMMDD')) = D.DATEID
WHERE F.DATESTIME between to_timestamp ('05-Jan-16 08.00.00.000000000 AM')
and to_timestamp ('10-Jan-16 02.00.00.000000000 AM')
GROUP BY D.DAY_FMT
Note 1: If the From Time is 06:00 AM then we would be subtracting 0.25 (days) instead of 0.3333333 (days)
Note 2: The Fact table has billions of rows.
Is there any way to improve the performance of the above query?
In Oracle the date and the time are stored together. You don't need to join on equality, and you don't need to wrap the date within any functions. (And why timestamps?) Having all the computations (if any are even needed) on the "right hand side" of conditions means the computations are done just once, the same for every row, instead of separately for each row.
select f.day_fmt, sum(f.measure) as some_col_name
from fact f
where f.datestime >= to_date('05-Jan-16 08:00:00 AM', 'dd-Mon-yy hh:mi:ss AM')
and f.datestime < to_date('10-Jan-16 02:00:00 AM', 'dd-Mon-yy hh:mi:ss AM')
group by f.day_fmt;
Edit: Based on further clarification from OP - suppose the data is in table "fact" - with columns day_fmt, measure, and datestime. The assignment is to aggregate (sum) measure, grouped by day_fmt and also grouped by 24-hour intervals, starting from a date-time chosen by the user and ending with a date-time chosen by the user. Solution below.
with user_input (sd, ed) as (
select to_date('05-Jan-16 08:00:00 AM', 'dd-Mon-yy hh:mi:ss AM'),
to_date('10-Jan-16 02:00:00 AM', 'dd-Mon-yy hh:mi:ss AM') from dual
),
prep (dt) as (
select (select sd from user_input) + level - 1 from dual
connect by level < (select ed - sd from user_input) + 1
union
select ed from user_input
),
dates (from_datetime, to_datetime) as (
select dt, lead(dt) over (order by dt) from prep
)
select f.day_fmt, d.from_datetime, d.to_datetime, sum(f.measure) as some_column_name
from fact f inner join dates d
on f.datestime >= d.from_datetime and f.datestime < d.to_datetime
where d.to_datetime is not null
group by f.day_fmt, d.from_datetime, d.to_datetime
order by f.day_fmt, d.from_datetime;
By not using function calls wrapped around f.datestime, you can take advantage of an index defined on this column of the "fact" table (an index you already have or one you can create now, to help speed up your queries).
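To sanity-check the windowing logic outside the database, here is a minimal Python sketch (not Oracle SQL) of the same idea: each row is assigned to the 24-hour window, anchored at the user's start time, that contains it, and the upper bound is exclusive. The sample rows and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def bucket_start(ts, window_start):
    """Return the start of the 24-hour window (anchored at window_start) containing ts."""
    whole_days = (ts - window_start) // timedelta(days=1)  # floor division gives an int
    return window_start + timedelta(days=whole_days)

def aggregate(rows, window_start, window_end):
    """Sum measures per 24-hour window for rows with window_start <= ts < window_end."""
    totals = defaultdict(int)
    for ts, measure in rows:
        if window_start <= ts < window_end:
            totals[bucket_start(ts, window_start)] += measure
    return dict(totals)

# Hypothetical sample rows: (datestime, measure)
rows = [
    (datetime(2016, 1, 5, 8, 0), 10),    # first window (Jan 5)
    (datetime(2016, 1, 6, 7, 59), 5),    # still the Jan 5 window
    (datetime(2016, 1, 6, 8, 0), 7),     # Jan 6 window
    (datetime(2016, 1, 10, 1, 59), 3),   # last, shorter window (Jan 9)
    (datetime(2016, 1, 10, 2, 0), 99),   # excluded: the upper bound is exclusive
]
start = datetime(2016, 1, 5, 8, 0)
end = datetime(2016, 1, 10, 2, 0)
totals = aggregate(rows, start, end)
for window, total in sorted(totals.items()):
    print(window.date(), total)
```

The window label is the date on which the window starts, which matches the "Jan 5 2016 = Jan 5 8:00 AM to Jan 6 7:59 AM" convention in the question.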
Related
I have a table of employees with their birth dates stored in a string column.
I cannot modify the table, so I created a view that exposes the birth date as a real date (using TO_DATE).
Now I would like to list the employees whose birthday fell in the last 15 days and those whose birthday falls in the next 15 days.
So the comparison is based only on the day and the month.
I can already get, for example, all employees born in April with EXTRACT, but, as you have probably guessed, when I run the query on 25 April I would also like the upcoming birthdays in May.
How can I get that (Oracle 12c)?
Thank you
Using the hiredate column in table scott.emp for testing:
select empno, ename, hiredate
from scott.emp
where add_months(trunc(hiredate),
12 * round(months_between(sysdate, hiredate) / 12))
between trunc(sysdate) - 15 and trunc(sysdate) + 15
;
EMPNO ENAME HIREDATE
---------- ---------- ----------
7566 JONES 04/02/1981
7698 BLAKE 05/01/1981
7788 SCOTT 04/19/1987
This will produce the wrong result in the following situation: if someone's birthday is Feb. 28 in a non-leap year, their birthday in a leap year (as calculated by the ADD_MONTHS function in the query) will be considered to be Feb. 29. So they will be excluded if the query runs on, say, Feb. 13, 2024 (even though they should be included), and they will be included if it runs on March 15, 2024 (even though they should be excluded). If you can live with this (those people will be recognized in the wrong window once every four years), then this may be all you need. Otherwise that situation will require further tweaking.
For people born on Feb. 29 (in a leap year, obviously), their birthday in a non-leap-year is considered to be Feb. 28. With this convention, the query will always work correctly for them. Whether this convention is appropriate in your locale, only your business users can tell you. (Local laws and regulations may matter, too - depending on what you are using this for.)
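The leap-year behaviour described above can be checked with a small Python sketch. It assumes the documented ADD_MONTHS rule (a last-day-of-month input, or a day that does not exist in the target month, maps to the last day of the target month). The helper names are hypothetical, and the "nearest anniversary" search is a simplification of the ROUND(MONTHS_BETWEEN(...)) step in the query, not a port of it.

```python
import calendar
from datetime import date, timedelta

def add_months(d, months):
    """Mimic Oracle ADD_MONTHS, including its last-day-of-month rule."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    was_last_day = d.day == calendar.monthrange(d.year, d.month)[1]
    day = last_day if (was_last_day or d.day > last_day) else d.day
    return date(year, month, day)

def in_window(birth, today, days=15):
    """True if a shifted anniversary falls within +/- days of today (inclusive)."""
    window = timedelta(days=days)
    base = today.year - birth.year
    for years in (base - 1, base, base + 1):
        if abs(add_months(birth, 12 * years) - today) <= window:
            return True
    return False

# Born on Feb. 28 of a non-leap year: the 2024 anniversary is shifted to Feb. 29.
print(add_months(date(1981, 2, 28), 12 * 43))          # 2024-02-29
print(in_window(date(1981, 2, 28), date(2024, 2, 13)))  # False, although Feb. 28 is 15 days away

# Born on Feb. 29: the anniversary in a non-leap year is Feb. 28.
print(add_months(date(2020, 2, 29), 12))                # 2021-02-28
```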
You can use ddd format model:
DDD - Day of year (1-366).
For example:
with v(dt) as (
select date'2020-01-01'+level-1 from dual connect by date'2020-01-01'+level-1<date'2021-01-01'
)
select *
from v
where
not abs(
to_number(to_char(date'&dt','ddd'))
-to_number(to_char(dt ,'ddd'))
) between 15 and 350;
Enter value for dt: 2022-01-03
DT
-------------------
2020-01-01 00:00:00
2020-01-02 00:00:00
2020-01-03 00:00:00
2020-01-04 00:00:00
2020-01-05 00:00:00
2020-01-06 00:00:00
2020-01-07 00:00:00
2020-01-08 00:00:00
2020-01-09 00:00:00
2020-01-10 00:00:00
2020-01-11 00:00:00
2020-01-12 00:00:00
2020-01-13 00:00:00
2020-01-14 00:00:00
2020-01-15 00:00:00
2020-01-16 00:00:00
2020-01-17 00:00:00
2020-12-19 00:00:00
2020-12-20 00:00:00
2020-12-21 00:00:00
2020-12-22 00:00:00
2020-12-23 00:00:00
2020-12-24 00:00:00
2020-12-25 00:00:00
2020-12-26 00:00:00
2020-12-27 00:00:00
2020-12-28 00:00:00
2020-12-29 00:00:00
2020-12-30 00:00:00
2020-12-31 00:00:00
30 rows selected.
NB: This example doesn't analyze leap years.
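For comparison, the same day-of-year predicate can be sketched in Python (a hedged illustration, not Oracle code; the function names are hypothetical). It reproduces the 30 rows listed above for the reference date 2022-01-03 and, like the query, ignores leap-year shifts in the day number.

```python
from datetime import date, timedelta

def day_of_year(d):
    """Integer equivalent of Oracle's TO_CHAR(d, 'DDD') (1-366)."""
    return d.timetuple().tm_yday

def near_by_day_of_year(d, reference):
    """Same predicate as the query: NOT (|DDD difference| BETWEEN 15 AND 350)."""
    diff = abs(day_of_year(reference) - day_of_year(d))
    return not (15 <= diff <= 350)

reference = date(2022, 1, 3)
all_days_2020 = [date(2020, 1, 1) + timedelta(days=i) for i in range(366)]
matches = [d for d in all_days_2020 if near_by_day_of_year(d, reference)]

print(len(matches), matches[0], matches[-1])
```

Note that the result is not symmetric around the reference date: it reaches 14 days forward (Jan 17) but 15 days back (Dec 19), because the wrap-around test uses a fixed bound of 350.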
Similar to mathguy's answer, but translating the current date back to the birth year (rather than translating the birth year forwards):
SELECT *
FROM employees
WHERE birth_date BETWEEN ADD_MONTHS(
TRUNC(SYSDATE),
ROUND(MONTHS_BETWEEN(birth_date, SYSDATE)/12)*12
) - INTERVAL '15' DAY
AND ADD_MONTHS(
TRUNC(SYSDATE),
ROUND(MONTHS_BETWEEN(birth_date, SYSDATE)/12)*12
) + INTERVAL '15' DAY;
Then, for the sample data:
CREATE TABLE employees (name, birth_date) AS
SELECT 'Alice', DATE '2020-02-28' FROM DUAL UNION ALL
SELECT 'Betty', DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 'Carol', DATE '2021-02-28' FROM DUAL UNION ALL
SELECT 'Debra', DATE '2022-04-28' FROM DUAL UNION ALL
SELECT 'Emily', DATE '2021-03-30' FROM DUAL UNION ALL
SELECT 'Fiona', DATE '2021-03-31' FROM DUAL;
If today's date is 2022-04-16 then the output is:
NAME  BIRTH_DATE
----- ----------
Debra 28-APR-22
If today's date is 2022-03-15 then the output is:
NAME  BIRTH_DATE
----- ----------
Betty 29-FEB-20
Carol 28-FEB-21
Emily 30-MAR-21
And would get values from 28th February - 30th March in a non-leap-year and from 29th February - 30th March in a leap year.
I have inherited an Oracle database and, being used to MySQL, I am struggling to get the data I need.
I am trying to get records from TTDINV700732 and TTCCOM001732 where the max(date) in TTDINV700732 is GTEQ one year ago and where there are records in the joined table TTDINV150732 where the date is GTEQ today.
I get the error
[99999][30484] ORA-30484: missing window specification for this function
Here is my SQL
SELECT
first_value(trim("TTDINV700732"."T$ITEM")) AS "item",
first_value("TTDINV700732"."T$CWAR") AS "whse",
max("TTDINV700732"."T$TRDT") AS "date",
first_value("TTCCOM001732"."T$NAMB") AS "business"
FROM "DB"."TTDINV700732" "TTDINV700732"
LEFT OUTER JOIN "DB"."TTIITM001732" "TTIITM001732" ON "TTDINV700732"."T$ITEM"="TTIITM001732"."T$ITEM"
LEFT OUTER JOIN "DB"."TTCCOM001732" "TTCCOM001732" ON "TTIITM001732"."T$CPLB"="TTCCOM001732"."T$EMNO"
LEFT OUTER JOIN "DB"."TTDINV150732" "TTDINV150732" ON "TTDINV150732"."T$ITEM"="TTDINV700732"."T$ITEM"
where "TTDINV700732"."T$TRDT" <= to_date('12 Oct 2016', 'DD MON YYYY')
and "TTDINV700732"."T$QUAN" < 0
and "TTDINV150732"."T$DATE" >= to_date('12 Oct 2017','DD MON YYYY')
group by "TTDINV700732"."T$ITEM", "TTDINV700732"."T$CWAR"
where the max(date) in TTDINV700732 is GTEQ one year ago (from today)
If you filter that table for <= to_date('12 Oct 2016', 'DD MON YYYY'), then no maximum can be greater than the date you have specified. Is that intentional? Or do you need a subquery to get the MAX() dates and then use HAVING MAX(T$TRDT) <= to_date('12 Oct 2016', 'DD MON YYYY')?
...joined table TTDINV150732 where the date is GTEQ today
Should this be equal to today, or greater than or equal to today?
Would something like this work?
SELECT
t700732.t$item AS item
, t700732.t$cwar AS whse
, t700732.mx_date AS mx_date
, ttccom001732.t$namb AS business
FROM (
SELECT ttdinv700732.t$item , ttdinv700732.t$cwar, MAX(ttdinv700732.t$trdt) mx_date
FROM db.ttdinv700732
WHERE ttdinv700732.t$quan < 0
GROUP BY ttdinv700732.t$item , ttdinv700732.t$cwar
HAVING MAX(ttdinv700732.t$trdt) <= To_date('12 Oct 2016', 'DD MON YYYY')
) t700732
INNER JOIN db.ttdinv150732 ON t700732.t$item = ttdinv150732.t$item
LEFT OUTER JOIN db.ttiitm001732 ON t700732.t$item = ttiitm001732.t$item
LEFT OUTER JOIN db.ttccom001732 ON ttiitm001732.t$cplb = ttccom001732.t$emno
where ttdinv150732.t$date >= To_date('12 Oct 2017', 'DD MON YYYY')
You may find it easier to solve it like it was MySQL instead of attempting features of Oracle that you are not familiar with yet.
I'm a beginner at Oracle SQL and I have two questions.
First, I want to find the number of days between two events (only when both are 'yes'). These two dates are currently varchars(!).
Pseudocode:
When request is yes and sales is yes, subtract sales_date from request_date.
Data looks like this:
Id request request_date sales sales_date
1 yes 2 feb14 yes 3 feb 14
2 yes 3 feb 14 no 3 feb 14
3 no 4 feb 14 no 5 feb 14
4 no 4 feb 14 yes 6 feb 14
And ideally I want this to be the result:
Id request request_date sales sales_date days_between_request_sales
1 yes 2 feb14 yes 3 feb 14 1
My second question is that if I have all these results, then how can I get the average of all the dates?
You can try using:
select trunc(to_date(sales_date,'dd-mm-yy') - to_date(request_date, 'dd-mm-yy')) as days
from <yourtable>
where sales = 'yes'
and request = 'yes'
demo:
select trunc(to_date('3 feb 14','dd-mm-yy') - to_date('1 feb 14', 'dd-mm-yy')) as days
from dual
Output:
DAYS
----------
2
Average:
select avg(trunc(to_date(sales_date,'dd-mm-yy') - to_date(request_date, 'dd-mm-yy'))) as Average
from <yourtable>
where sales = 'yes'
and request = 'yes'
I don't understand how the previous version worked: to_date(sales_date,'dd-mm-yyyy') does not match your data, whose format is dd mon yy. What Aleksej suggested, to_date(sales_date,'dd mon rr'), is correct. He might have missed one condition: request = 'yes'.
Say you have a table like this:
create table yourTable(Id, request, request_date, sales, sales_date) as (
select 1 ,'yes', '2 feb 14', 'yes' , '3 feb 14' from dual union all
select 2 ,'yes', '3 feb 14', 'no' , '3 feb 14' from dual union all
select 3 ,'no' , '4 feb 14', 'no' , '5 feb 14' from dual union all
select 4 ,'no' , '4 feb 14', 'yes' , '6 feb 14' from dual
)
Assuming that your strings represent dates always in the format you showed, you can use:
select Id, request, request_date, sales, sales_date,
to_date(sales_date, 'dd mon rr') - to_date(request_date, 'dd mon rr') as days_between_request_sales
from yourTable
where sales = 'yes'
and request = 'yes'
To compute the average of the resulting numbers of days, you can simply use AVG:
select avg (to_date(sales_date, 'dd mon rr') - to_date(request_date, 'dd mon rr') ) as average
from yourTable
where sales = 'yes'
and request = 'yes'
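The same calculation can be sketched in Python to check the expected numbers. This is a hedged illustration, not Oracle code: it assumes English month abbreviations, and Python's %y pivots two-digit years differently from Oracle's RR (00-68 map to 20xx), which makes no difference for the sample data. The rows are the four rows from the question.

```python
from datetime import datetime

def parse_date(text):
    """Parse strings such as '3 feb 14' (day, abbreviated month, two-digit year)."""
    return datetime.strptime(text, "%d %b %y").date()

# (id, request, request_date, sales, sales_date), as shown in the question
rows = [
    (1, "yes", "2 feb 14", "yes", "3 feb 14"),
    (2, "yes", "3 feb 14", "no", "3 feb 14"),
    (3, "no", "4 feb 14", "no", "5 feb 14"),
    (4, "no", "4 feb 14", "yes", "6 feb 14"),
]

# Keep only rows where both flags are 'yes', then subtract the two dates.
days_between = {
    row_id: (parse_date(sales_date) - parse_date(request_date)).days
    for row_id, request, request_date, sales, sales_date in rows
    if request == "yes" and sales == "yes"
}
average = sum(days_between.values()) / len(days_between)

print(days_between)  # only id 1 qualifies
print(average)
```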
I need to do my reporting on a week-by-week basis, but my week numbers should start from the 1st day of the month.
Here is my sample data:
report_date Vol
01 nov 2014 23
03 nov 2014 34
16 nov 2014 56
30 nov 2014 44
Desired output
Week no Vol
1 57
2 56
3 0
4 44
Hope it's clear.
Thanks
Since your desired output includes "zero" rows as well, and assuming you'd like this report to work across multiple months as well:
WITH sample_data AS
(SELECT DATE '2014-11-01' AS report_date, 23 AS vol FROM DUAL
UNION ALL SELECT DATE '2014-11-03', 34 FROM DUAL
UNION ALL SELECT DATE '2014-11-16', 56 FROM DUAL
UNION ALL SELECT DATE '2014-11-30', 44 FROM DUAL)
,weeks AS
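(SELECT report_month
,TO_CHAR(LEVEL) AS week_no
FROM (SELECT DISTINCT
TRUNC(report_date,'MM') AS report_month
FROM sample_data)
CONNECT BY LEVEL <= TO_NUMBER(TO_CHAR(LAST_DAY(report_month),'W'))
AND PRIOR report_month = report_month -- generate the weeks separately for each month
AND PRIOR SYS_GUID() IS NOT NULL)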
SELECT TO_CHAR(weeks.report_month,'Month') AS "Month"
,weeks.week_no AS "Week no"
,NVL(sum(sample_data.vol),0) AS "Vol"
FROM weeks
LEFT JOIN sample_data
ON weeks.report_month = TRUNC(report_date,'MM')
AND weeks.week_no = to_char(report_date,'W')
GROUP BY weeks.report_month, weeks.week_no ORDER BY 1,2;
We determine the number of weeks in each month of the source data by using the LAST_DAY function, and we do a hierarchical query (CONNECT BY LEVEL <= n) to generate one row for each week in each month.
The expected output should be:
Month Week no Vol
======== ======= ===
November 1 57
November 2 0
November 3 56
November 4 0
November 5 44
select to_char(report_date, 'W'), sum(vol)
from your_table
group by to_char(report_date, 'W');
W Week of month (1-5) where week 1 starts on the first day of the
month and ends on the seventh.
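The 'W' rule quoted above (days 1-7 are week 1, days 8-14 week 2, and so on) is easy to verify outside the database. The following Python sketch, with hypothetical function names, applies it to the sample data from the question and fills weeks without data with 0.

```python
import calendar
from collections import defaultdict
from datetime import date

def week_of_month(d):
    """Same rule as Oracle's 'W' format: days 1-7 are week 1, 8-14 week 2, and so on."""
    return (d.day - 1) // 7 + 1

def weekly_totals(rows):
    """Sum vol per (year, month, week), filling weeks without data with 0."""
    totals = defaultdict(int)
    months = set()
    for report_date, vol in rows:
        totals[(report_date.year, report_date.month, week_of_month(report_date))] += vol
        months.add((report_date.year, report_date.month))
    result = {}
    for year, month in sorted(months):
        last_day = date(year, month, calendar.monthrange(year, month)[1])
        for week in range(1, week_of_month(last_day) + 1):
            result[(year, month, week)] = totals.get((year, month, week), 0)
    return result

# Sample data from the question
sample = [
    (date(2014, 11, 1), 23),
    (date(2014, 11, 3), 34),
    (date(2014, 11, 16), 56),
    (date(2014, 11, 30), 44),
]
result = weekly_totals(sample)
for (year, month, week), vol in result.items():
    print(year, month, week, vol)
```

Under this rule 16 Nov falls in week 3 and 30 Nov in week 5, so November 2014 has five weeks rather than the four shown in the question's desired output.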
Can anybody help me solve this?
Value Date
1000 01-jan-12
............
1000 01-apr-13
My aim is to calculate the sum of the marks secured over the period from one year before the current month and year up to the current month and year (APRIL-13).
Your example data shows more than one year back from Apr 2013.
Assuming you want to go back to Apr 2012:
-- "date" stands for your date column; DATE itself is a reserved word in Oracle
select sum(value)
from your_tab
where date >= add_months(trunc(sysdate, 'mm'), -12) -- from 1st apr 2012
and date < trunc(sysdate, 'mm'); -- anytime up to the end of mar 2013
If you want to go back to Jan of the prior year, then:
select sum(value)
from your_tab
where date >= trunc(add_months(trunc(sysdate, 'mm'), -12), 'yy') -- from 1st jan 2012
and date < add_months(trunc(sysdate, 'mm'), 1); -- anytime up to the end of apr 2013
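The two sets of date boundaries can be double-checked with a short Python sketch. This is an illustration of the arithmetic only, with hypothetical helper names, and it assumes a "today" somewhere in April 2013 as in the question.

```python
from datetime import date

def month_start(d):
    """Equivalent of TRUNC(d, 'MM'): first day of d's month."""
    return d.replace(day=1)

def shift_month_start(d, months):
    """Shift a first-of-month date by a number of months (like ADD_MONTHS on day 1)."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    return date(year, month0 + 1, 1)

def previous_twelve_months(today):
    """Half-open range [start, end): the 12 full months before the current month."""
    end = month_start(today)
    return shift_month_start(end, -12), end

def from_january_of_prior_year(today):
    """Half-open range [start, end): 1 Jan of the prior year through the current month."""
    this_month = month_start(today)
    start = date(shift_month_start(this_month, -12).year, 1, 1)  # like TRUNC(..., 'YY')
    return start, shift_month_start(this_month, 1)

today = date(2013, 4, 15)  # assumed current date
print(previous_twelve_months(today))      # 1 Apr 2012 up to, not including, 1 Apr 2013
print(from_january_of_prior_year(today))  # 1 Jan 2012 up to, not including, 1 May 2013
```

Using a half-open range (>= start, < end) avoids having to work out the last day or last second of a month.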