self join with max value - oracle

I have a table with 500k transactions. I want to fetch the last balance for a particular date, so I have written a query like the one below.
SELECT curr_balance
FROM transaction_details
WHERE acct_num = '10'
AND is_deleted = 'N'
AND ( value_date, srl_num ) IN(
SELECT MAX( value_date ), MAX( srl_num )
FROM transaction_details
WHERE TO_DATE( value_date, 'dd/mm/yyyy' )
<= TO_DATE( ADD_MONTHS( '05-APR-2012', 1 ), 'dd/mm/yyyy' )
AND acct_num = '10'
AND is_deleted = 'N'
AND ver_status = 'Y' )
AND ver_status = 'Y'
This has to be executed 12 times, incrementing the month each time, to find the last balance for each month. But the query has a high CPU cost, and running it 12 times takes a very long time. How can I rewrite the query to get the results faster? Could it be broken into two parts in PL/SQL to improve performance?

Try:
select * from(
SELECT value_date, srl_num, curr_balance,
-- the analytic function belongs in the select list, not after the WHERE clause
row_number() over (partition by trunc(value_date - interval '5' day,'MM')
order by srl_num desc
) as rnk
FROM transaction_details
WHERE acct_num = '10'
AND is_deleted = 'N'
AND ver_status = 'Y'
)
where rnk = 1;
You'll get a report with the balance at the last srl_num of each month in your table.
The benefit is that your approach scans the table 24 times for a 12-month report, while this approach scans the table once.
The analytic function numbers the records within each month (the partition by clause), ordering the rows of the month by srl_num descending, so rnk = 1 is the last transaction of the month.
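The partition-and-rank idea can be tried outside Oracle. The following is only a sketch in Python's built-in sqlite3 module, assuming a SQLite build with window functions (3.25 or later). The sample rows are invented, and SQLite's strftime with a '-5 days' modifier stands in for Oracle's trunc(value_date - interval '5' day, 'MM').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction_details ("
    " acct_num TEXT, value_date TEXT, srl_num INTEGER, curr_balance REAL)"
)
# Invented sample rows; dates are ISO strings so SQLite date functions work.
rows = [
    ("10", "2012-03-10", 1, 100.0),
    ("10", "2012-04-05", 2, 150.0),  # the 5th still falls in the March bucket
    ("10", "2012-04-06", 3, 175.0),  # the 6th opens the April bucket
    ("10", "2012-04-20", 4, 90.0),
]
conn.executemany("INSERT INTO transaction_details VALUES (?, ?, ?, ?)", rows)

query = """
SELECT bucket, srl_num, curr_balance
FROM (
    SELECT strftime('%Y-%m', value_date, '-5 days') AS bucket,
           srl_num,
           curr_balance,
           ROW_NUMBER() OVER (
               PARTITION BY strftime('%Y-%m', value_date, '-5 days')
               ORDER BY srl_num DESC
           ) AS rnk
    FROM transaction_details
    WHERE acct_num = '10'
)
WHERE rnk = 1
ORDER BY bucket
"""
# One pass over the table yields the last balance of every monthly bucket.
last_balances = conn.execute(query).fetchall()
print(last_balances)
```

Each bucket keeps only the row with the highest srl_num, which is what the rnk = 1 filter does in the Oracle query.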

You don't have to query your table twice. Try using analytic functions:
SELECT t.curr_balance
-- , any other column you want as long it is in the subselect.
FROM (
SELECT
trans.curr_balance
, trans.value_date
-- any other column you want
, trans.srl_num
-- OVER () takes the maximum across all filtered rows; partitioning by
-- value_date and srl_num would make every row its own maximum
, MAX(trans.srl_num) OVER() max_srl_num
, MAX(trans.value_date) OVER() max_date
FROM transaction_details trans
WHERE TO_DATE( value_date, 'dd/mm/yyyy' ) <= TO_DATE( ADD_MONTHS( '01-APR-2012', 1 ), 'dd/mm/yyyy' )
AND acct_num = '10'
AND is_deleted = 'N'
AND ver_status = 'Y'
) t
WHERE t.max_date = t.value_date
AND t.max_srl_num = t.srl_num
A couple of thoughts:
Why do you have TO_DATE( value_date...? Isn't the column's data type DATE? Wrapping it in a function might prevent the use of an index on that column, if you have one.
Note that (this is a wild guess) if the highest srl_num does not belong to the latest date, you will get incorrect results and might not get any rows back.

Related

(Oracle 11g DB) Calculate Number of business days between current time and a date while excluding holidays in a view

I have this working SQL script that takes a date and returns the age from the given date to the current time, excluding the dates defined in a table called exclude_dates:
SELECT
COUNT(*)
FROM
(
SELECT
ROWNUM rnum
FROM
all_objects
WHERE
ROWNUM <= CAST(current_timestamp AS DATE) - to_date('&2') + 1
)
WHERE
to_char(to_date('&2') + rnum - 1, 'DY') NOT IN ( 'SAT', 'SUN' )
AND NOT EXISTS (
SELECT
NULL
FROM
exclude_dates
WHERE
no_work = trunc(to_date('&2') + rnum - 1)
);
I have a table called TICKETS that contains columns named ID and UPDATED_AT. I want to create a view that uses the above script to return ID and AGE, where AGE is the output of the script above.
Your code has a few weaknesses:
There is no need for CAST(current_timestamp AS DATE).
If you need the current DATE then simply use TRUNC(SYSDATE)
You don't need to select from all_objects. Better use hierarchical query
SELECT LEVEL as rnum FROM dual CONNECT BY LEVEL <= ...
Using to_date('&2') without a format is usually bad. Either your input value is a string, then you should include the format, e.g. to_date('&2', 'YYYY-MM-DD') or your input value is a DATE, then simply use &2 - never use TO_DATE() on a value which is already a DATE!
Final query could be this one - assuming input parameter is a DATE value:
WITH t AS (
SELECT LEVEL as d
FROM dual
CONNECT BY LEVEL <= TRUNC(SYSDATE) - the_day)
SELECT COUNT(*) AS buisiness_days
FROM t
WHERE TO_CHAR(the_day + d - 1, 'DY', 'NLS_DATE_LANGUAGE = american') NOT IN ('SAT', 'SUN')
AND NOT EXISTS (
SELECT 'x'
FROM exclude_dates
WHERE no_work = TRUNC(the_day + d - 1)
)
However, for me it is not clear how you want to provide this as a view! You would need to create a separate view for each input date, or at least create a new view every day.
I would suggest creating a function:
CREATE OR REPLACE FUNCTION buisiness_days(the_date IN DATE) RETURN INTEGER AS
ret INTEGER;
BEGIN
WITH t AS (
SELECT LEVEL as d
FROM dual
CONNECT BY LEVEL <= TRUNC(SYSDATE) - the_date)
SELECT COUNT(*) AS buisiness_days
INTO ret
FROM t
WHERE TO_CHAR(the_date + d - 1, 'DY', 'NLS_DATE_LANGUAGE = american') NOT IN ('SAT', 'SUN')
AND NOT EXISTS (
SELECT 'x'
FROM exclude_dates
WHERE no_work = TRUNC(the_date + d - 1)
);
RETURN ret;
END;
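The counting rule in the function above can be checked outside the database. This is a sketch in Python rather than PL/SQL, and the name business_days and the sample holiday set are assumptions made for illustration. It mirrors the query's range, which starts at the given date and stops before the end date.

```python
from datetime import date, timedelta


def business_days(start, end, holidays):
    """Count days in [start, end) that are neither weekend days nor holidays."""
    count = 0
    day = start
    while day < end:
        # weekday(): Monday is 0, Saturday is 5, Sunday is 6
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count


# Sample stand-in for the exclude_dates table (illustrative values).
exclude_dates = {date(2021, 11, 25), date(2021, 11, 29)}

# November 2021 has 22 weekdays; two of them are in exclude_dates.
november = business_days(date(2021, 11, 1), date(2021, 12, 1), exclude_dates)
print(november)  # 20
```

Unlike TO_CHAR(..., 'DY'), weekday() does not depend on a session language setting, which is the same problem the NLS_DATE_LANGUAGE argument solves in the SQL.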
The pipelined function below returns the list of dates in the range you provide, so the dates don't have to be stored in a table.
CREATE OR REPLACE TYPE nt_date IS TABLE OF DATE;
/
CREATE OR REPLACE FUNCTION generate_dates_pipelined(
p_from IN DATE,
p_to IN DATE
)
RETURN nt_date PIPELINED DETERMINISTIC
IS
v_start DATE := TRUNC(LEAST(p_from, p_to));
v_end DATE := TRUNC(GREATEST(p_from, p_to));
BEGIN
LOOP
PIPE ROW (v_start);
EXIT WHEN v_start >= v_end;
v_start := v_start + INTERVAL '1' DAY;
END LOOP;
RETURN;
END generate_dates_pipelined;
/
To exclude holidays you need to know what dates they fall on so there needs to be a holiday table.
create table holidays(
holiday_date DATE not null,
holiday_name VARCHAR2(20),
constraint holidays_pk primary key (holiday_date),
constraint is_midnight check ( holiday_date = trunc ( holiday_date ) )
);
INSERT into holidays (HOLIDAY_DATE,HOLIDAY_NAME)
WITH dts as (
select to_date('25-NOV-2021 00:00:00','DD-MON-YYYY HH24:MI:SS'), 'Thanksgiving 2021' from dual union all
select to_date('29-NOV-2021 00:00:00','DD-MON-YYYY HH24:MI:SS'), 'Hanukkah 2021' from dual
)
SELECT * from dts;
This query will provide the count of days between the range, number of working days and number of holidays in the range.
SELECT COUNT (*) AS total_days
, COUNT ( CASE
WHEN h.holiday_date IS NULL
AND d.column_value - TRUNC (d.column_value, 'IW') < 5
THEN 'Business Day'
END
) AS business_days
, COUNT (h.holiday_date) AS holidays
FROM generate_dates_pipelined (DATE '2021-11-01', DATE '2021-11-30') d
LEFT JOIN holidays h ON h.holiday_date = d.column_value;
This query will provide a list of dates excluding sat, sun and holidays that fall between the range.
SELECT
COLUMN_VALUE
FROM
TABLE(generate_dates_pipelined(DATE '2021-11-01',
DATE '2021-11-30')) c
where
to_char(COLUMN_VALUE, 'DY') NOT IN ('SAT', 'SUN')
AND NOT EXISTS (
SELECT 1
FROM holidays h
WHERE c.COLUMN_VALUE = h.holiday_date
);
You don't need a function or a row generator; you can calculate the number of business days directly:
CREATE VIEW business_day_ages (ID, AGE) AS
SELECT id,
( TRUNC( SYSDATE, 'IW' ) - TRUNC( updated_at, 'IW' ) ) * 5 / 7
-- Number of full weeks.
+ LEAST( SYSDATE - TRUNC( SYSDATE, 'IW' ), 5 )
-- Add part weeks at the end.
- LEAST( updated_at - TRUNC( updated_at, 'IW' ), 5 )
-- Subtract part weeks at the start.
- COALESCE(
( SELECT SUM(
LEAST(no_work + INTERVAL '1' DAY, SYSDATE)
- GREATEST(no_work, updated_at)
)
FROM exclude_dates
WHERE no_work BETWEEN TRUNC(updated_at) AND SYSDATE
),
0
)
-- Subtract the holiday days.
FROM tickets;
Or, if you are not calculating using part days then you can simplify it to:
CREATE OR REPLACE VIEW business_day_ages (ID, AGE) AS
SELECT id,
( TRUNC( SYSDATE, 'IW' ) - TRUNC( updated_at, 'IW' ) ) * 5 / 7
-- Number of full weeks.
+ LEAST( TRUNC(SYSDATE) - TRUNC( SYSDATE, 'IW' ), 5 )
-- Add part weeks at the end.
- LEAST( updated_at - TRUNC( updated_at, 'IW' ), 5 )
-- Subtract part weeks at the start.
- COALESCE(
( SELECT COUNT(*)
FROM exclude_dates
WHERE no_work BETWEEN TRUNC(updated_at) AND TRUNC(SYSDATE)
),
0
)
-- Subtract the holiday days.
FROM tickets;
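The week arithmetic in these views can be sanity-checked against a day-by-day count. The sketch below uses Python rather than SQL and ignores holidays. monday_of plays the role of TRUNC(d, 'IW'), and all function names are made up for this illustration.

```python
from datetime import date, timedelta


def monday_of(d):
    """Equivalent of TRUNC(d, 'IW'): the Monday of d's ISO week."""
    return d - timedelta(days=d.weekday())


def weekdays_formula(start, end):
    """Weekdays in [start, end) using whole-week arithmetic, no loop."""
    full_weeks = (monday_of(end) - monday_of(start)).days // 7
    return (
        full_weeks * 5
        + min((end - monday_of(end)).days, 5)      # part week at the end
        - min((start - monday_of(start)).days, 5)  # part week at the start
    )


def weekdays_brute_force(start, end):
    """Weekdays in [start, end) by checking every day."""
    return sum(
        1
        for n in range((end - start).days)
        if (start + timedelta(days=n)).weekday() < 5
    )


print(weekdays_formula(date(2021, 11, 1), date(2021, 12, 1)))  # 22
```

The formula works because the number of weekdays between a week's Monday and any day in that week is the day offset capped at 5.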

How to convert this code from oracle to redshift?

I am trying to implement the same thing in Redshift and am finding it a little difficult. Since Redshift is built on top of the PostgreSQL engine, a PostgreSQL solution would be really helpful. Basically, the code gets the counts for the previous two months as columns. If there is no count for the exact previous month, it gives 0.
This is my code:
with abc(dateval,cnt) as(
select 201908, 100 from dual union
select 201907, 200 from dual union
select 201906, 300 from dual union
select 201904, 600 from dual)
select dateval, cnt,
last_value(cnt) over (order by dateval
range between interval '1' month preceding
and interval '1' month preceding ) m1,
last_value(cnt) over (order by dateval
range between interval '2' month preceding
and interval '2' month preceding ) m2
from (select to_date(dateval, 'yyyymm') dateval, cnt from abc)
I get an error in the OVER clause. I tried cast('1 month' as interval), but it still fails. Can someone please help me with this window function?
This is how I would do it. In Redshift there's no easy way to generate sequences, so I select row_number() from an arbitrary table to create a sequence:
with abc(dateval,cnt) as(
select 201908, 100 union
select 201907, 200 union
select 201906, 300 union
select 201904, 600),
cal(date) as (
select
add_months(
'20190101'::date,
row_number() over () - 1
) as date
from <an arbitrary table to generate a sequence of rows> limit 10
),
with_lag as (
select
dateval,
cnt,
lag(cnt, 1) over (order by date) as m1,
lag(cnt, 2) over (order by date) as m2
from abc right join cal on to_date(dateval, 'YYYYMM') = date
)
select * from with_lag
where dateval is not null
order by dateval
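To make the intended result concrete, here is a small Python sketch of the lookup the question describes: for each month, take the count from exactly one and two months earlier, or 0 when that month has no row. The helper previous_month is hypothetical, and the data is the sample from the question.

```python
def previous_month(yyyymm, months_back):
    """Step back a number of months from a yyyymm integer such as 201908."""
    year, month = divmod(yyyymm, 100)
    index = year * 12 + (month - 1) - months_back
    return (index // 12) * 100 + (index % 12) + 1


counts = {201908: 100, 201907: 200, 201906: 300, 201904: 600}

# (dateval, cnt, m1, m2) with 0 for months that are missing from the data
report = [
    (
        m,
        counts[m],
        counts.get(previous_month(m, 1), 0),
        counts.get(previous_month(m, 2), 0),
    )
    for m in sorted(counts)
]
for row in report:
    print(row)
```

Note that lag() over the calendar join returns NULL rather than 0 for a missing month, so the SQL would need coalesce(..., 0) around each lag to match this output.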

Coalesce statement to handle multiple values and NULLS?

I am trying to figure out how to create an SQL query that will check for (:FROM_DATE) and (:TO_DATE) parameters and if NULL to put the past month dates in for the two values, and if not NULL to accept whatever values are entered in the parameters.
For example:
if the user enters (01-JAN-17) as FROM_DATE, and (31-JAN-17) as TO_DATE, I want the query to not automatically pass any values for the TO_DATE and FROM_DATE.
if the user does not enter any values for TO_DATE and FROM_DATE, or NULL values are passed in, I want the query to automatically use the past month's values (i.e., if the query is run on July 1st 2017, the FROM_DATE would be 01-JUN-17 and the TO_DATE would be 30-JUN-17).
I was hinted to use a coalesce statement to handle multiple values and NULLS (i.e., AND ( (coalesce(null, :P_ORG) is null) or (ORG.ORGANIZATION_ID in :P_ORG)))???
Any help would be greatly appreciated.
Something like:
SELECT *
FROM your_table
WHERE your_date_column BETWEEN TO_DATE( :from_date, 'DD-MON-YYYY' )
AND TO_DATE( :to_date, 'DD-MON-YYYY' )
OR ( ( :from_date IS NULL OR :to_date IS NULL )
AND your_date_column BETWEEN ADD_MONTHS( TRUNC( SYSDATE, 'MM' ), -1 )
AND TRUNC( SYSDATE, 'MM' ) - 1
);
If either (or both) :from_date or :to_date is NULL then the dates will be compared to the previous month.
If your table has dates where the time component is not always set to midnight then you will need to use:
SELECT *
FROM your_table
WHERE your_date_column BETWEEN TO_DATE( :from_date, 'DD-MON-YYYY' )
AND TO_DATE( :to_date, 'DD-MON-YYYY' )
OR ( ( :from_date IS NULL OR :to_date IS NULL )
AND your_date_column >= ADD_MONTHS( TRUNC( SYSDATE, 'MM' ), -1 )
AND your_date_column < TRUNC( SYSDATE, 'MM' )
);
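The defaulting rule used by these queries can be written out as a small function. This is a sketch in Python, not Oracle. The name resolve_range is invented, and it follows the behaviour stated for the query above, where a NULL in either parameter switches both bounds to the previous month.

```python
from datetime import date, timedelta


def resolve_range(from_date, to_date, today):
    """Return the (from, to) dates to filter on; None means 'not supplied'."""
    first_of_this_month = today.replace(day=1)                  # TRUNC(SYSDATE, 'MM')
    last_of_previous = first_of_this_month - timedelta(days=1)  # ... - 1
    first_of_previous = last_of_previous.replace(day=1)         # ADD_MONTHS(..., -1)
    if from_date is None or to_date is None:
        return first_of_previous, last_of_previous
    return from_date, to_date


print(resolve_range(None, None, date(2017, 7, 1)))
# (datetime.date(2017, 6, 1), datetime.date(2017, 6, 30))
```

Run on 1 July 2017 with no parameters, it yields 1 June to 30 June 2017, matching the example in the question.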
Proof of concept: consider the following query, where we have dates and values, and we want to sum the values for the dates that fall between :from_date and :to_date. If either of them is null, the query will use the first day of the prior month for from_date and the last day of the prior month for to_date. Note that this will cause problems if one date is given an actual value and the other is left null - you didn't explain how you would want that handled. But that's a different issue.
I use SQL Developer, and I don't know how to pass in dates with it, so I pass in strings and convert them to dates.
with
test_data ( dt, val ) as (
select date '2017-05-29', 200 from dual union all
select date '2017-06-13', 150 from dual union all
select date '2017-06-18', 500 from dual
)
select sum(val) as sum_val
from test_data
where dt between coalesce(to_date(:from_date, 'yyyy-mm-dd'),
add_months(trunc(sysdate, 'mm'), -1))
and coalesce(to_date(:to_date , 'yyyy-mm-dd'), trunc(sysdate, 'mm') - 1)
;
Yes, you can use COALESCE (or Oracle's NVL). When a parameter is null, replace it with the default date.
select *
from mytable
where mydate >= coalesce(:from_date, trunc(sysdate - interval '1' month, 'month'))
and mydate <= coalesce(:to_date, last_day(sysdate - interval '1' month));

Calculate average values in Oracle

I want to calculate average values in Oracle tables
CREATE TABLE AGENT_HISTORY(
EVENT_ID INTEGER NOT NULL,
AGENTID INTEGER NOT NULL,
EVENT_DATE DATE NOT NULL
)
/
CREATE TABLE CPU_HISTORY(
CPU_HISTORY_ID INTEGER NOT NULL,
EVENT_ID INTEGER NOT NULL,
CPU_NAME VARCHAR2(50 ) NOT NULL,
CPU_VALUE NUMBER NOT NULL
)
/
I use this SQL query:
----- FOR 24 HOURS CPU
CURSOR LAST_24_CPU_CURSOR IS
--SELECT EVENT_DATE, CPU FROM AGENT_HISTORY WHERE NAME = NAMEIN AND EVENT_DATE >= SYSDATE-(60*24)/1440;
SELECT START_DATE, NVL(AVG(CH.CPU_VALUE),0)
FROM (SELECT START_DATE - (LVL+1)/24 START_DATE, START_DATE - LVL/24 END_DATE
FROM (SELECT SYSDATE START_DATE, LEVEL LVL FROM DUAL CONNECT BY LEVEL <= 24))
LEFT JOIN AGENT_HISTORY AH ON EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH ON AH.EVENT_ID = CH.EVENT_ID
JOIN AGENT AG ON AH.AGENTID = AG.ID
WHERE AG.NAME = NAMEIN
GROUP BY START_DATE
ORDER BY 1;
This query prints only one average value. I would like to modify it to print 24 values, one average for each hour. Can you help me modify the query?
I guess your input contains data only for one of the given intervals; since you're using an INNER JOIN with AGENT which in turn is filtered by AGENT_HISTORY, you're effectively converting all your LEFT JOINs to inner ones.
I suggest you use a CROSS JOIN between AGENT and the timeslots instead:
with agent_history(event_date, agentid, event_id) as (
select timestamp '2015-11-18 09:00:07', 1, 1001 from dual
),
agent(id, name) as (
select 1, 'myAgent' from dual
),
cpu_history(event_id, cpu_value) as (
select 1001, 75.2 from dual
),
time_slots(start_date, end_date) as (
SELECT START_DATE - (LVL + 1) / 24 START_DATE,
START_DATE - LVL / 24 END_DATE
FROM (SELECT SYSDATE START_DATE,
LEVEL LVL
FROM DUAL
CONNECT BY LEVEL <= 24)
)
SELECT START_DATE,
NVL(AVG(CH.CPU_VALUE),
0)
FROM time_slots ts
CROSS JOIN AGENT AG
LEFT JOIN AGENT_HISTORY AH
ON AH.AGENTID = AG.ID
AND EVENT_DATE BETWEEN START_DATE AND END_DATE
LEFT JOIN CPU_HISTORY CH
ON AH.EVENT_ID = CH.EVENT_ID
WHERE AG.NAME = 'myAgent'
GROUP BY START_DATE
ORDER BY 1;
This ensures you get the full 24 rows (one for each timeslot).
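The reason this yields 24 rows is that the time slots are generated first and the events are attached to them afterwards. Here is a sketch of the same bucketing in Python. The function name hourly_averages is made up, and the slot boundaries copy the query's START_DATE - (LVL + 1) / 24 and START_DATE - LVL / 24 expressions.

```python
from datetime import datetime, timedelta


def hourly_averages(now, events, hours=24):
    """events is a list of (event_time, cpu_value) pairs.

    Returns one (slot_start, average) pair per slot; empty slots get 0,
    like NVL(AVG(...), 0) over a LEFT JOIN.
    """
    result = []
    for lvl in range(1, hours + 1):
        start = now - timedelta(hours=lvl + 1)
        end = now - timedelta(hours=lvl)
        # BETWEEN is inclusive at both ends, so the same is done here.
        values = [v for t, v in events if start <= t <= end]
        average = sum(values) / len(values) if values else 0
        result.append((start, average))
    return sorted(result)


now = datetime(2015, 11, 18, 12, 0, 0)
events = [(datetime(2015, 11, 18, 9, 0, 7), 75.2)]
slots = hourly_averages(now, events)
print(len(slots))  # 24
```

Because both bounds are inclusive, an event that lands exactly on a slot boundary is counted in two adjacent slots, in the SQL as well as in this sketch.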
Change start_date to to_char(start_date, 'hh24:mi') both in select and group by clauses.

Oracle select most recent date record

I am trying to find the most recent record based on a date field. When I set latest = 1 in the WHERE clause, I get an error. Please help if possible. DATE is the field I'm sorting by. I have tried both latest = 1 and latest = '1'.
SELECT
STAFF_ID,
SITE_ID,
PAY_LEVEL,
ROW_NUMBER() OVER (PARTITION BY STAFF_ID ORDER BY DATE DESC) latest
FROM OWNER.TABLE
WHERE END_ENROLLMENT_DATE is null
AND latest = 1
You can't use aliases from the select list inside the WHERE clause (because of the order of evaluation of a SELECT statement).
You also cannot use an OVER clause inside the WHERE clause: "You can specify analytic functions with this clause in the select list or ORDER BY clause." (citation from docs.oracle.com) Compute the value in a subquery and filter in the outer query instead:
select *
from (select
staff_id, site_id, pay_level, date,
max(date) over (partition by staff_id) max_date
from owner.table
where end_enrollment_date is null
)
where date = max_date
Assuming staff_id + date form a unique key, this is another method:
SELECT STAFF_ID, SITE_ID, PAY_LEVEL
FROM TABLE t
WHERE END_ENROLLMENT_DATE is null
AND DATE = (SELECT MAX(DATE)
FROM TABLE
WHERE staff_id = t.staff_id
AND DATE <= SYSDATE)
I think I'd try with MAX, something like this:
SELECT staff_id, max( date ) from owner.table group by staff_id
then link in your other columns:
select t.staff_id, t.site_id, t.pay_level, m.latest
from owner.table t,
( SELECT staff_id, max( date ) latest from owner.table group by staff_id ) m
where m.staff_id = t.staff_id
and m.latest = t.date
select *
from (select
staff_id, site_id, pay_level, date,
rank() over (partition by staff_id order by date desc) r
from owner.table
where end_enrollment_date is null
)
where r = 1
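The rank() and row_number() variants differ only when two rows share the latest date. The sketch below shows that difference using Python's sqlite3 module, assuming a SQLite build with window functions (3.25 or later). The enrollment table and its rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment (staff_id INTEGER, site_id INTEGER, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        (1, 10, "2020-01-01"),
        (1, 11, "2021-06-01"),
        (1, 12, "2021-06-01"),  # same latest date as the row above
        (2, 20, "2019-03-15"),
    ],
)


def latest_rows(function_name):
    """Return the rows ranked first per staff_id by the given window function."""
    sql = f"""
        SELECT staff_id, site_id
        FROM (
            SELECT staff_id, site_id,
                   {function_name}() OVER (
                       PARTITION BY staff_id ORDER BY start_date DESC
                   ) AS r
            FROM enrollment
        )
        WHERE r = 1
        ORDER BY staff_id, site_id
    """
    return conn.execute(sql).fetchall()


by_rank = latest_rows("RANK")              # keeps both tied rows for staff 1
by_row_number = latest_rows("ROW_NUMBER")  # keeps exactly one row per staff
print(by_rank)
print(len(by_row_number))
```

With RANK every tied row comes back. With ROW_NUMBER one of the tied rows is kept, and which one is not defined unless the ORDER BY includes a tie-breaker column.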

Resources