Imagine a table with the following rows:

order  date
1      2021-01-01 00:00:00
2      2021-01-01 01:00:00
3      2021-01-01 02:00:00
4      2021-01-01 03:00:00
5      2021-01-02 00:00:00
6      2021-01-03 00:00:00
7      2021-01-04 00:00:00
8      2021-01-04 01:00:00
9      2021-01-04 02:00:00
10     2021-01-06 00:00:00
I am using a cursor to loop through a database result one by one:
cursor = conn.parse("SELECT * FROM table ORDER BY date ASC")
while result = cursor.fetch_hash()
# Do some processing on the result
# ...
end
I want to store the date of the last processed row in a variable to be used later in a different part of the code. My solution is this:
last_date = ""
cursor = conn.parse("SELECT * FROM table ORDER BY date ASC")
while result = cursor.fetch_hash()
  last_date = result["date"]
  # Do some processing on the result
  # ...
end
On each iteration last_date is updated, and once the loop ends it holds 2021-01-06 00:00:00, which is what I want.
I'm curious: is there a better or more elegant way of doing this? I can't think of an alternative.
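If the only thing needed later is the latest date itself (rather than a by-product of the row-by-row processing), one alternative is to ask the database for it in a separate statement instead of tracking it in the loop. A minimal sketch, reusing the placeholder table and column names from above:

SELECT MAX(date) AS last_date FROM table

Whether that is worth an extra query depends on whether the rows have to be processed one by one anyway; if they do, keeping the variable updated inside the loop is perfectly reasonable.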
Related
I have a table with employees and their birth dates, stored in a column as a string.
I cannot modify the table, so I created a view to get their birth date as a real date (with TO_DATE).
Now, I would like to get the list of the employees who had their birthday in the last 15 days and the employees who will have their birthday in the next 15 days.
So, just based on the day and the month.
I can successfully get, for example, all employees born in April with EXTRACT, but, as I'm sure you've already understood, if I run the query on 25 April I'd also like the upcoming birthdays in May.
How could I get that (Oracle 12c)?
Thank you 🙂
Using the hiredate column in table scott.emp for testing:
select empno, ename, hiredate
from scott.emp
where add_months(trunc(hiredate),
12 * round(months_between(sysdate, hiredate) / 12))
between trunc(sysdate) - 15 and trunc(sysdate) + 15
;
EMPNO ENAME HIREDATE
---------- ---------- ----------
7566 JONES 04/02/1981
7698 BLAKE 05/01/1981
7788 SCOTT 04/19/1987
This will produce the wrong result in the following situation: if someone's birthday is Feb. 28 in a non-leap year, their birthday in a leap year (calculated with the ADD_MONTHS function in the query) will be considered to be Feb. 29. So, they will be excluded if running the query on, say, Feb. 13 2024 (even though they should be included), and they will be included if running the query on March 14 (even though they should be excluded). If you can live with this - those people will be recognized in the wrong window, once every four years - then this may be all you need. Otherwise that situation will require further tweaking.
For people born on Feb. 29 (in a leap year, obviously), their birthday in a non-leap-year is considered to be Feb. 28. With this convention, the query will always work correctly for them. Whether this convention is appropriate in your locale, only your business users can tell you. (Local laws and regulations may matter, too - depending on what you are using this for.)
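Adapted to the original birthday question, the same predicate could be pointed at the view instead of scott.emp. A sketch only (it assumes a hypothetical view named employees_v that exposes the converted value as a DATE column birth_date, and it is subject to the same leap-year caveats described above):

select *
from   employees_v
where  add_months(trunc(birth_date),
                  12 * round(months_between(sysdate, birth_date) / 12))
       between trunc(sysdate) - 15 and trunc(sysdate) + 15;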
You can use the DDD format model:
DDD - Day of year (1-366).
For example:
SQL> with v(dt) as (
2 select date'2020-01-01'+level-1 from dual connect by date'2020-01-01'+level-1<date'2021-01-01'
3 )
4 select *
5 from v
6 where
7 not abs(
8 to_number(to_char(date'&dt','ddd'))
9 -to_number(to_char(dt ,'ddd'))
10 ) between 15 and 350;
Enter value for dt: 2022-01-03
DT
-------------------
2020-01-01 00:00:00
2020-01-02 00:00:00
2020-01-03 00:00:00
2020-01-04 00:00:00
2020-01-05 00:00:00
2020-01-06 00:00:00
2020-01-07 00:00:00
2020-01-08 00:00:00
2020-01-09 00:00:00
2020-01-10 00:00:00
2020-01-11 00:00:00
2020-01-12 00:00:00
2020-01-13 00:00:00
2020-01-14 00:00:00
2020-01-15 00:00:00
2020-01-16 00:00:00
2020-01-17 00:00:00
2020-12-19 00:00:00
2020-12-20 00:00:00
2020-12-21 00:00:00
2020-12-22 00:00:00
2020-12-23 00:00:00
2020-12-24 00:00:00
2020-12-25 00:00:00
2020-12-26 00:00:00
2020-12-27 00:00:00
2020-12-28 00:00:00
2020-12-29 00:00:00
2020-12-30 00:00:00
2020-12-31 00:00:00
30 rows selected.
NB: This example doesn't analyze leap years.
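For the original question, the same check could be pointed at the birthday view. A sketch, assuming a hypothetical view employees_v with a DATE column birth_date, and ignoring leap years as noted above:

select *
from   employees_v
where  not abs(  to_number(to_char(sysdate,    'ddd'))
               - to_number(to_char(birth_date, 'ddd'))
              ) between 15 and 350;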
Similar to mathguy's answer, but translating the current date back to the birth year (rather than translating the birth year forwards):
SELECT *
FROM employees
WHERE birth_date BETWEEN ADD_MONTHS(
TRUNC(SYSDATE),
ROUND(MONTHS_BETWEEN(birth_date, SYSDATE)/12)*12
) - INTERVAL '15' DAY
AND ADD_MONTHS(
TRUNC(SYSDATE),
ROUND(MONTHS_BETWEEN(birth_date, SYSDATE)/12)*12
) + INTERVAL '15' DAY;
Then, for the sample data:
CREATE TABLE employees (name, birth_date) AS
SELECT 'Alice', DATE '2020-02-28' FROM DUAL UNION ALL
SELECT 'Betty', DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 'Carol', DATE '2021-02-28' FROM DUAL UNION ALL
SELECT 'Debra', DATE '2022-04-28' FROM DUAL UNION ALL
SELECT 'Emily', DATE '2021-03-30' FROM DUAL UNION ALL
SELECT 'Fiona', DATE '2021-03-31' FROM DUAL;
If today's date is 2022-04-16 then the output is:
NAME   BIRTH_DATE
Debra  28-APR-22
If today's date is 2022-03-15 then the output is:
NAME   BIRTH_DATE
Betty  29-FEB-20
Carol  28-FEB-21
Emily  30-MAR-21
So the query would match dates from 28th February to 30th March in a non-leap year and from 29th February to 30th March in a leap year.
db<>fiddle here
I have a table similar to the one below. My goal is to remove, for each group, the days on which the status moves to either 'Cancelled' or 'Failed', while retaining the days that contain only other status changes.
Group  Status              Date
A      Pending             2021-01-01 08:00:00
A      Cancelled           2021-01-01 13:00:00
A      Pending             2021-01-02 08:00:00
A      Failed              2021-01-02 13:00:00
A      Pending             2021-01-03 08:00:00
A      Pending Settlement  2021-01-03 13:00:00
A      Pending             2021-01-04 08:00:00
A      Settled             2021-01-04 13:00:00
B      Pending             2021-01-01 08:00:00
B      Cancelled           2021-01-01 13:00:00
B      Pending             2021-01-02 08:00:00
B      Failed              2021-01-02 13:00:00
B      Pending             2021-01-03 08:00:00
B      Pending Settlement  2021-01-03 13:00:00
B      Pending             2021-01-04 08:00:00
B      Settled             2021-01-04 13:00:00
My first attempt was something like:
select TBL.GROUP, TBL.STATUS, TBL.DATE
from TABLE TBL
, (
select GROUP, STATUS, DATE
from TABLE
where STATUS in ('Cancelled','Failed')
) FLAG
where (TBL.GROUP <> FLAG.GROUP and TBL.DATE <> FLAG.DATE)
;
My expected output is shown below. EDIT: the query above also seems to take exceptionally long (>10 minutes), even when date filters are applied:
Group  Status              Date
A      Pending             2021-01-03 08:00:00
A      Pending Settlement  2021-01-03 13:00:00
A      Pending             2021-01-04 08:00:00
A      Settled             2021-01-04 13:00:00
B      Pending             2021-01-03 08:00:00
B      Pending Settlement  2021-01-03 13:00:00
B      Pending             2021-01-04 08:00:00
B      Settled             2021-01-04 13:00:00
You may use the last_value() window function to get the last status within each group and day, and then apply your filter against it.
SELECT "GROUP",
"STATUS",
"DATE"
FROM (SELECT "GROUP",
"STATUS",
"DATE",
last_value("STATUS") OVER (PARTITION BY "GROUP",
trunc("DATE")
ORDER BY "DATE" ASC) lv
FROM "TABLE") x
WHERE lv NOT IN ('Cancelled',
'Failed');
Edit:
To filter out days where the status was 'Cancelled' or 'Failed' at any time during the day, you can use, for example, the windowed version of count() with a CASE expression that yields a non-NULL value when the status is 'Cancelled' or 'Failed', and NULL (the default) otherwise.
SELECT "GROUP",
"STATUS",
"DATE"
FROM (SELECT "GROUP",
"STATUS",
"DATE",
count(CASE
WHEN "STATUS" IN ('Cancelled',
'Failed') THEN
0
END) OVER (PARTITION BY "GROUP",
trunc("DATE")) c
FROM "TABLE") x
WHERE c = 0;
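If you want a small playground for the two queries above, a hypothetical setup could look like this (quoted identifiers because GROUP, DATE and TABLE are reserved words; only a few of the sample rows are reproduced):

CREATE TABLE "TABLE" ("GROUP" VARCHAR2(1), "STATUS" VARCHAR2(20), "DATE" DATE);

INSERT INTO "TABLE" VALUES ('A', 'Pending',            TO_DATE('2021-01-01 08:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO "TABLE" VALUES ('A', 'Cancelled',          TO_DATE('2021-01-01 13:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO "TABLE" VALUES ('A', 'Pending',            TO_DATE('2021-01-03 08:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO "TABLE" VALUES ('A', 'Pending Settlement', TO_DATE('2021-01-03 13:00:00', 'YYYY-MM-DD HH24:MI:SS'));
COMMIT;

With those four rows, both queries should return only the two 2021-01-03 rows, since 2021-01-01 ends in 'Cancelled'.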
I have a table BUSINESS_DAYS which contains all the business dates.
I have another table with payment information and due dates (DUE_DATE).
I want my query to return the next business day IF the DUE_DATE is not a business day.
SELECT SQ1.DUE_DATE, SQ2.DATE FROM
(SELECT * FROM
PAYMENTS
ORDER BY
DUE_DATE) SQ1,
(SELECT MIN(DATE) DATE FROM BUSINESS_DAYS WHERE SQ1.DUE_DATE <= DATE GROUP BY DATE) SQ2
Can anyone shed some light?
The way I see it, the code you posted doesn't do what you want anyway (otherwise, you wouldn't be asking a question at all). Therefore, I'd suggest another approach:
Altering the session first (you don't have to do this; my database speaks Croatian, so I'm switching to English, and I'm also setting the date format to display the day name):
SQL> alter session set nls_date_language = 'english';
Session altered.
SQL> alter session set nls_date_format = 'dd.mm.yyyy, dy';
Session altered.
Two CTEs contain the sample data:
business_days: as commented, only this year's July, weekends excluded (they aren't business days), no holidays
payments: two rows, one whose due date is a working day and another whose due date isn't
The sample data ends at line #15; the query you might be interested in begins at line #16. Its CASE expression checks whether due_date is one of the weekend days; if not, the due date to be returned is exactly what it already is. Otherwise, another SELECT statement returns the first (MIN) business day greater than due_date.
SQL> with
2 business_days (datum) as
3 -- for simplicity, only all dates in this year's July,
4 -- weekends excluded (as they aren't business days), no holidays
5 (select date '2021-07-01' + level - 1
6 from dual
7 where to_char(date '2021-07-01' + level - 1, 'dy')
8 not in ('sat', 'sun')
9 connect by level <= 31
10 ),
11 payments (id, due_date) as
12 (select 1, date '2021-07-14' from dual -- Wednesday, business day
13 union all
14 select 2, date '2021-07-25' from dual -- Sunday, non-business day
15 )
16 select p.id,
17 p.due_date current_due_date,
18 --
19 case when to_char(p.due_date, 'dy') not in ('sat', 'sun') then
20 p.due_date
21 else (select min(b.datum)
22 from business_days b
23 where b.datum > p.due_date
24 )
25 end new_due_date
26 from payments p
27 order by id;
ID CURRENT_DUE_DAT NEW_DUE_DATE
---------- --------------- ---------------
1 14.07.2021, wed 14.07.2021, wed --> Wednesday remains "as is"
2 25.07.2021, sun 26.07.2021, mon --> Sunday switched to Monday
SQL>
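A possible simplification: if due dates that are business days also appear in BUSINESS_DAYS (which seems likely if it really contains every business date), the CASE expression is not strictly needed, because the smallest business day greater than or equal to the due date is the due date itself. A sketch under that assumption, using the same column names as above:

select p.id,
       p.due_date,
       (select min(b.datum)
          from business_days b
         where b.datum >= p.due_date) as new_due_date
  from payments p
 order by p.id;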
I am trying to replace DB2 with Oracle DB.
In DB2, there is a WEEK function, which returns the week number within the year.
For example:
SELECT week('2021-01-04') FROM sysibm.sysdummy1
Then I get a return value of 2 (as the DB2 WEEK function regards Sunday as the first day of the week).
However, in Oracle, if I make a similar query, then I get a different value.
SELECT to_char(to_date('2021-01-04', 'YYYY-MM-DD'), 'WW') FROM dual;
I get the value of 1, not 2, as Oracle regards January 1st as the start of the first week.
Is there any workaround or a different function to replace the DB2 WEEK function?
Oracle offers two week-of-year format models. As you have discovered, to_char(dt, 'WW') gives a number in which week 1 starts on the 1st of January and a new week starts every seven days after that. There is also to_char(dt, 'IW'), which gives the ISO week number, running Monday to Sunday; in that scheme 1 January 2021 falls in week 53 (of 2020) and 2021-01-04 is the first day of week 1 of 2021.
Demo on db<>fiddle
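For illustration, a quick query that shows the difference for the dates mentioned above (the column aliases are arbitrary):

SELECT TO_CHAR(DATE '2021-01-01', 'WW') AS ww_jan_1,  -- '01' (week 1 starts on 1st January)
       TO_CHAR(DATE '2021-01-01', 'IW') AS iw_jan_1,  -- '53' (ISO week 53 of 2020)
       TO_CHAR(DATE '2021-01-04', 'WW') AS ww_jan_4,  -- '01' (still within the first seven days)
       TO_CHAR(DATE '2021-01-04', 'IW') AS iw_jan_4   -- '01' (first day of ISO week 1 of 2021)
FROM   dual;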
There is no built-in function in Oracle that numbers weeks based on a chosen first day of the week. You could write your own PL/SQL function to do this.
Incidentally, the first day of the week in Oracle is determined by our NLS parameters. If we run with (say) American settings Sunday is day 1; if we run with British settings then Monday is day 1.
If you want to count the number of weeks where the first day of the week is Sunday then:
SELECT FLOOR(
         (DATE '2021-01-04' - NEXT_DAY( TRUNC( DATE '2021-01-04', 'YY' ) - INTERVAL '7' DAY, 'SUNDAY'))
         / 7
       ) + 1
       AS week
FROM   DUAL;
Which outputs 2.
If you run it with multiple input days:
WITH your_input ( value ) AS (
SELECT DATE '2021-01-01' + LEVEL - 1
FROM DUAL
CONNECT BY LEVEL <= 32
)
SELECT value,
FLOOR(
(value - NEXT_DAY( TRUNC( value, 'YY' ) - INTERVAL '7' DAY, 'SUNDAY'))
/ 7
) + 1
AS week
FROM your_input;
Then you get the output (with the NLS_DATE_FORMAT as YYYY-MM-DD (DY)):
VALUE             WEEK
2021-01-01 (FRI)  1
2021-01-02 (SAT)  1
2021-01-03 (SUN)  2
2021-01-04 (MON)  2
2021-01-05 (TUE)  2
2021-01-06 (WED)  2
2021-01-07 (THU)  2
2021-01-08 (FRI)  2
2021-01-09 (SAT)  2
2021-01-10 (SUN)  3
2021-01-11 (MON)  3
2021-01-12 (TUE)  3
2021-01-13 (WED)  3
2021-01-14 (THU)  3
2021-01-15 (FRI)  3
2021-01-16 (SAT)  3
2021-01-17 (SUN)  4
2021-01-18 (MON)  4
2021-01-19 (TUE)  4
2021-01-20 (WED)  4
2021-01-21 (THU)  4
2021-01-22 (FRI)  4
2021-01-23 (SAT)  4
2021-01-24 (SUN)  5
2021-01-25 (MON)  5
2021-01-26 (TUE)  5
2021-01-27 (WED)  5
2021-01-28 (THU)  5
2021-01-29 (FRI)  5
2021-01-30 (SAT)  5
2021-01-31 (SUN)  6
2021-02-01 (MON)  6
db<>fiddle here
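If this calculation is needed in many places, it could also be wrapped in a small PL/SQL function. A sketch only: the function name is made up, and NEXT_DAY( ..., 'SUNDAY' ) assumes an English NLS_DATE_LANGUAGE:

CREATE OR REPLACE FUNCTION week_starting_sunday (p_date IN DATE)
  RETURN NUMBER
IS
BEGIN
  -- Week 1 starts on the Sunday on or before 1st January of p_date's year
  RETURN FLOOR(
           ( TRUNC(p_date)
             - NEXT_DAY( TRUNC(p_date, 'YY') - INTERVAL '7' DAY, 'SUNDAY' ) )
           / 7
         ) + 1;
END week_starting_sunday;
/

SELECT week_starting_sunday(DATE '2021-01-04') AS week FROM dual;  -- 2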
I have a scenario where in I have to aggregate data for a dynamic 24 hour period.
For example: if a user selects the FROM date as Jan 05 2016 8:00 AM and the TO date as Jan 10 2016 2:00 AM, data in the output should be aggregated from Jan 05 2016 8:00 AM to Jan 06 2016 7:59 AM as one day (Jan 05 2016), and so on:
Jan 5 2016 - Jan 5 2016 8:00 AM to Jan 6 2016 7:59 AM
Jan 6 2016 - Jan 6 2016 8:00 AM to Jan 7 2016 7:59 AM
Jan 7 2016 - Jan 7 2016 8:00 AM to Jan 8 2016 7:59 AM
Jan 8 2016 - Jan 8 2016 8:00 AM to Jan 9 2016 7:59 AM
Jan 9 2016 - Jan 9 2016 8:00 AM to Jan 10 2016 2:00 AM
To achieve this, I subtracted 8 hours from the date column in the fact table and joined it to the Date Dimension. The query looks like this:
SELECT D.DAY_FMT,SUM(F.MEASURE) from FACT F
INNER JOIN DATES D ON
to_number(to_char((F.DATESTIME - 0.3333333),'YYYYMMDD')) = D.DATEID
WHERE F.DATESTIME between to_timestamp ('05-Jan-16 08.00.00.000000000 AM')
and to_timestamp ('10-Jan-16 02.00.00.000000000 AM')
GROUP BY D.DAY_FMT
Note 1: If the From Time is 06:00 AM then we would be subtracting 0.25 (days) instead of 0.3333333 (days)
Note 2: The Fact table has billions of rows.
Is there any way to improve the performance of the above query?
In Oracle the date and the time are stored together in a DATE. You don't need an equality join against function-wrapped values, and you don't need to wrap the date column in any functions at all. (And why timestamps?) Having all the computations (if any are even needed) on the right-hand side of the conditions means they are done just once, the same for every row, instead of once per row.
select f.day_fmt, sum(f.measure) as some_col_name
from fact f
where f.datestime >= to_date('05-Jan-16 08:00:00 AM', 'dd-Mon-yy hh:mi:ss AM')
  and f.datestime <  to_date('10-Jan-16 02:00:00 AM', 'dd-Mon-yy hh:mi:ss AM')
group by f.day_fmt;
Edit: Based on further clarification from OP - suppose the data is in table "fact" - with columns day_fmt, measure, and datestime. The assignment is to aggregate (sum) measure, grouped by day_fmt and also grouped by 24-hour intervals, starting from a date-time chosen by the user and ending with a date-time chosen by the user. Solution below.
with user_input (sd, ed) as (
  select to_date('05-Jan-16 08:00:00 AM', 'dd-Mon-yy hh:mi:ss AM'),
         to_date('10-Jan-16 02:00:00 AM', 'dd-Mon-yy hh:mi:ss AM')
  from dual
),
prep (dt) as (
  -- one row per 24-hour boundary, starting at the user's start date-time,
  -- plus the user's end date-time as the final boundary
  select sd + level - 1
  from user_input
  connect by level < ed - sd + 1
  union
  select ed from user_input
),
dates (from_datetime, to_datetime) as (
  select dt, lead(dt) over (order by dt) from prep
)
select f.day_fmt, d.from_datetime, d.to_datetime, sum(f.measure) as some_column_name
from fact f inner join dates d
  on f.datestime >= d.from_datetime and f.datestime < d.to_datetime
where d.to_datetime is not null
group by f.day_fmt, d.from_datetime, d.to_datetime
order by f.day_fmt, d.from_datetime;
By not using function calls wrapped around f.datestime, you can take advantage of an index defined on this column of the "fact" table (an index you already have or one you can create now, to help speed up your queries).
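For example, if such an index does not exist yet (the index name here is just a placeholder):

CREATE INDEX fact_datestime_ix ON fact (datestime);

A plain (non-function-based) index like this can be used by the range conditions above, precisely because the column is not wrapped in any function.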