calculate the average time difference between each stage - oracle

How to calculate the average time difference between each stage.
The challenge with the actual data set is not every id will go through all stages.. some will skip stages and the date is not continuous for all Id's like below.
id date status
1 1/1/18 requirement
1 1/8/18 analysis
1 ? design
1 1/30/18 closed
2 2/1/18 requirement
2 2/18/18 closed
3 1/2/18 requirement
3 1/29/18 analysis
3 ? accepted
3 2/5/18 closed
?--we have missing dates as well
Expected output
id date status time_spent
1 1/1/18 requirement 0
1 1/8/18 analysis 7
1 ? design
1 1/30/18 closed 22
2 2/1/18 requirement 0
2 2/18/18 closed 17
3 1/2/18 requirement 0
3 1/29/18 analysis 27
3 ? accepted
3 2/5/18 closed 24
status avg(timespent)
requirement 0
analysis 17
design
closed 21

You can use windowing functions LAG (or LEAD) to get the data of the previous (or next) status for each id. That will let you compute the time elapsed in each stage. Then, compute the average time elapsed for each stage.
Here is an example of how to do that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, dte-nvl(lag(dte ignore nulls) over ( partition by id order by dte ), dte) elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+-------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+-------------------------------------------+
| requirement | 0 |
| analysis | 17 |
| design | |
| accepted | |
| closed | 15.33333333333333333333333333333333333333 |
+-------------+-------------------------------------------+
As I said in my comment, that logic computes the elapsed days as the time to the given status from the prior status. Since, "requirement" has no prior status, this logic will always show zero days spent in requirements. It would probably be better to compute the time from the given status to the next status. For "closed", there would be no next status. You could just leave that blank or use SYSDATE as the data of the next status. Here is an example of that:
with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, nvl(lead(dte ignore nulls) over ( partition by id order by dte ), trunc(sysdate))-dte elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);
+-------------+------------------------------------------+
| STATUS | AVG(ELAPSED) |
+-------------+------------------------------------------+
| requirement | 17 |
| analysis | 14.5 |
| design | |
| accepted | |
| closed | 361.666666666666666666666666666666666667 |
+-------------+------------------------------------------+

I agree with #MatthewMcPeak. Your requirements seem a bit odd: you spend zero days of requirement stage but spend an average of 21 days on closed? Fnord.
This solution treats the presented date as the start date of the stage and calculates the difference between it and the start_date of the next phase.
with cte as (
select status
, lead(dd ignore nulls) over (partition by id order by dd) - dd as dt_diff
from your_table)
select status, avg(dt_diff) as avg_ela
from cte
group by status
/

If you wish to include all stages for each d and estimate the time spent in each (using linear interpolation) then you can create a sub-query with all the statuses and use a PARTITION OUTER JOIN to join them and then use LAG and LEAD to find the date range the status is in and interpolate between:
Oracle Setup:
CREATE TABLE data ( d, dt, status ) AS
SELECT 1, TO_DATE( '1/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/8/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/30/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/1/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/18/18', 'MM/DD/YY' ), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/2/18', 'MM/DD/YY' ), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/29/18', 'MM/DD/YY' ), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '2/5/18', 'MM/DD/YY' ), 'closed' FROM DUAL;
Query:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT d,
dt,
status,
( next_dt - recent_dt ) / (next_id - recent_id ) AS estimated_duration
FROM ranges;
Output:
D | DT | STATUS | ESTIMATED_DURATION
-: | :-------- | :---------- | ---------------------------------------:
1 | 01-JAN-18 | requirement | 7
1 | 08-JAN-18 | analysis | 7.33333333333333333333333333333333333333
1 | null | design | 7.33333333333333333333333333333333333333
1 | null | accepted | 7.33333333333333333333333333333333333333
1 | 30-JAN-18 | closed | 0
2 | 01-FEB-18 | requirement | 4.25
2 | null | analysis | 4.25
2 | null | design | 4.25
2 | null | accepted | 4.25
2 | 18-FEB-18 | closed | 0
3 | 02-JAN-18 | requirement | 27
3 | 29-JAN-18 | analysis | 2.33333333333333333333333333333333333333
3 | null | design | 2.33333333333333333333333333333333333333
3 | null | accepted | 2.33333333333333333333333333333333333333
3 | 05-FEB-18 | closed | 0
Query 2:
Then of you can easily change that to take the average for each status:
WITH statuses ( status, id ) AS (
SELECT 'requirement', 1 FROM DUAL UNION ALL
SELECT 'analysis', 2 FROM DUAL UNION ALL
SELECT 'design', 3 FROM DUAL UNION ALL
SELECT 'accepted', 4 FROM DUAL UNION ALL
SELECT 'closed', 5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
SELECT d.d,
d.dt,
s.status,
s.id,
NVL(
d.dt,
LAG( d.dt, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
NVL2(
d.dt,
s.id,
LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
),
LEAD( d.dt, 1, d.dt )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
FROM data d
PARTITION BY ( d )
RIGHT OUTER JOIN statuses s
ON ( d.status = s.status )
)
SELECT status,
AVG( ( next_dt - recent_dt ) / (next_id - recent_id ) ) AS estimated_duration
FROM ranges
GROUP BY status, id
ORDER BY id;
Results:
STATUS | ESTIMATED_DURATION
:---------- | ---------------------------------------:
requirement | 12.75
analysis | 4.63888888888888888888888888888888888889
design | 4.63888888888888888888888888888888888889
accepted | 4.63888888888888888888888888888888888889
closed | 0
db<>fiddle here

Related

ORACLE - How to use LAG to display strings from all previous rows into current row

I have data like below:
group
seq
activity
A
1
scan
A
2
visit
A
3
pay
B
1
drink
B
2
rest
I expect to have 1 new column "hist" like below:
group
seq
activity
hist
A
1
scan
NULL
A
2
visit
scan
A
3
pay
scan, visit
B
1
drink
NULL
B
2
rest
drink
I was trying to solve with LAG function, but LAG only returns one row from previous instead of multiple.
Truly appreciate any help!
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq + 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq + 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP
SEQ
ACTIVITY
HIST
A
1
scan
null
A
2
visit
scan
A
3
pay
scan,visit
B
1
drink
null
B
2
rest
drink
db<>fiddle here
To aggregate strings in Oracle we use LISAGG function.
In general, you need a windowing_clause to specify a sliding window for analytic function to calculate running total.
But unfortunately LISTAGG doesn't support it.
To simulate this behaviour you may use model_clause of the select statement. Below is an example with explanation.
select
group_
, activity
, seq
, hist
from t
model
/*Where to restart calculation*/
partition by (group_)
/*Add consecutive numbers to reference "previous" row per group.
May use "seq" column if its values are consecutive*/
dimension by (
row_number() over(
partition by group_
order by seq asc
) as rn
)
measures (
/*Other columnns to return*/
activity
, cast(null as varchar2(1000)) as hist
, seq
)
rules update (
/*Apply this rule sequentially*/
hist[any] order by rn asc =
/*Previous concatenated result*/
hist[cv()-1]
/*Plus comma for the third row and tne next rows*/
|| presentv(activity[cv()-2], ',', '') /**/
/*lus previous row's value*/
|| activity[cv()-1]
)
GROUP_ | ACTIVITY | SEQ | HIST
:----- | :------- | --: | :---------
A | scan | 1 | null
A | visit | 2 | scan
A | pay | 3 | scan,visit
B | drink | 1 | null
B | rest | 2 | drink
db<>fiddle here
Few more variants (without subqueries):
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
DBFIddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=9b477a2089d3beac62579d2b7103377a
Full test case with output:
with table_name ("GROUP", seq, activity) AS (
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL
)
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
GROUP SEQ ACTIV HIST1 HIST2
------ ---------- ----- ------------------------------ ------------------------------
A 1 scan
A 2 visit scan, scan
A 3 pay scan,visit, scan,visit
B 1 drink
B 2 rest drink, drink

ORACLE Recursive query

I'm trying to build a recursive query and I'm facing a problem.
please find below my dataset
WITH table1 ( ID, Code, Label ) as(
SELECT 123, 'C1', 'LABEL_1' from dual UNION ALL
SELECT 1, 'C2', 'LABEL_2' from dual UNION ALL
SELECT 30, 'C3', 'LABEL_3' from dual UNION ALL
SELECT 44, 'C4', 'LABEL_4' from dual UNION ALL
SELECT 5, 'C5', 'LABEL_5' from dual
),
table2 ( ID, id_table1, code_child, label_child ) as (
SELECT 1, 123, 'C1_1','LABEL_1_1' from dual UNION ALL
SELECT 2, 123, 'C1_2','LABEL_1_2' from dual UNION ALL
SELECT 3, 123, 'C1_3','LABEL_1_3' from dual UNION ALL
SELECT 4, 123, 'C1_4','LABEL_1_4' from dual UNION ALL
SELECT 6, 30, 'C3_1','LABEL_3_1' from dual UNION ALL
SELECT 7, 30, 'C3_2','LABEL_3_2' from dual UNION ALL
SELECT 8, 30, 'C3_3','LABEL_3_3' from dual UNION ALL
SELECT 9, 30, 'C3_4','LABEL_3_4' from dual UNION ALL
SELECT 10, 5, 'C5_1','LABEL_5_1' from dual
),
hierarchy as (
Select
a.id, code, label, CODE_CHILD,id_table1
from table1 a
left join table2 b on b.id_table1 = a.ID
)
,recursive (base, id, code, label, CODE_CHILD,id_table1) as (
SELECT
id as base,
id,
code,
label,
CODE_CHILD,
id_table1
FROM hierarchy
UNION ALL
SELECT
previous_level.base,
current_level.id,
current_level.code,
current_level.label,
current_level.CODE_CHILD,
current_level.id_table1
FROM recursive previous_level,
hierarchy current_level
WHERE 1=1
and current_level.id = previous_level.id_table1
)
SELECT * FROM recursive order by base;
And i'm getting this error :
32044. 00000 - "cycle detected while executing recursive WITH query"
*Cause: A recursive WITH clause query produced a cycle and was stopped
in order to avoid an infinite loop.
*Action: Rewrite the recursive WITH query to stop the recursion or use
the CYCLE clause.
Where i'm wrong ?
I need to merge these two tables into one.
here's what I'd like to get as a result.
id code label id_parent
1 C1 LABEL_1
2 C2 LABEL_2
3 C3 LABEL_3
4 C4 LABEL_4
5 C5 LABEL_5
6 C1_1 LABEL_1_1 1
7 C1_2 LABEL_1_2 1
8 C1_3 LABEL_1_3 1
9 C1_4 LABEL_1_4 1
10 C3_1 LABEL_3_1 3
11 C3_2 LABEL_3_2 3
12 C3_3 LABEL_3_3 3
13 C3_4 LABEL_3_4 3
14 C5_1 LABEL_5_1 5
Thank you
Not sure why you want a recursive query? It appears that you could just use UNION ALL and join the two tables:
WITH table1 ( ID, Code, Label ) as(
SELECT 1, 'C1', 'LABEL_1' from dual UNION ALL
SELECT 2, 'C2', 'LABEL_2' from dual UNION ALL
SELECT 3, 'C3', 'LABEL_3' from dual UNION ALL
SELECT 4, 'C4', 'LABEL_4' from dual UNION ALL
SELECT 5, 'C5', 'LABEL_5' from dual
),
table2 ( ID, id_table1, code_child, label_child ) as (
SELECT 1, 1, 'C1_1','LABEL_1_1' from dual UNION ALL
SELECT 2, 1, 'C1_2','LABEL_1_2' from dual UNION ALL
SELECT 3, 1, 'C1_3','LABEL_1_3' from dual UNION ALL
SELECT 4, 1, 'C1_4','LABEL_1_4' from dual UNION ALL
SELECT 6, 3, 'C3_1','LABEL_3_1' from dual UNION ALL
SELECT 7, 3, 'C3_2','LABEL_3_2' from dual UNION ALL
SELECT 8, 3, 'C3_3','LABEL_3_3' from dual UNION ALL
SELECT 9, 3, 'C3_4','LABEL_3_4' from dual UNION ALL
SELECT 10, 5, 'C5_1','LABEL_5_1' from dual
)
SELECT ROW_NUMBER() OVER ( ORDER BY table_no, code ) AS id,
code,
label,
id_parent
FROM (
SELECT code,
label,
1 AS table_no,
NULL AS id_parent
FROM table1
UNION ALL
SELECT code_child,
label_child,
2 AS table_no,
id_table1
FROM table2
)
order by table_no, code;
Which outputs:
ID | CODE | LABEL | ID_PARENT
-: | :--- | :-------- | --------:
1 | C1 | LABEL_1 | null
2 | C2 | LABEL_2 | null
3 | C3 | LABEL_3 | null
4 | C4 | LABEL_4 | null
5 | C5 | LABEL_5 | null
6 | C1_1 | LABEL_1_1 | 1
7 | C1_2 | LABEL_1_2 | 1
8 | C1_3 | LABEL_1_3 | 1
9 | C1_4 | LABEL_1_4 | 1
10 | C3_1 | LABEL_3_1 | 3
11 | C3_2 | LABEL_3_2 | 3
12 | C3_3 | LABEL_3_3 | 3
13 | C3_4 | LABEL_3_4 | 3
14 | C5_1 | LABEL_5_1 | 5
db<>fiddle here
A recursive WITH clause query produced a cycle and was stopped in order to avoid an infinite loop.
This issue is coming due to bad data in the DB. There are some records which are causing circular relationship among them which is causing infinite loops.
For example: P is parent of C and C is again parent of P.
You can fetch the above output simple using UNION ALL and join of the tables.

Oracle - how to update a unique row based on MAX effective date which is part of the unique index

Oracle - Say you have a table that has a unique key on name, ssn and effective date. The effective date makes it unique. What is the best way to update a current indicator to show inactive for the rows with dates less than the max effective date? I can't really wrap my head around it since there are multiple rows with the same name and ssn combinations. I haven't been able to find this scenario on here for Oracle and I'm having developer's block. Thanks.
"All name/ssn having a max effective date earlier than this time yesterday:"
SELECT name, ssn
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
Oracle supports multi column in, so
UPDATE t
SET current_indicator = 'inactive'
WHERE (name,ssn,eff_date) IN (
SELECT name, ssn, max(eff_date)
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
)
Use a MERGE statement using an analytic function to identify the rows to update and then merge on the ROWID pseudo-column so that Oracle can efficiently identify the rows to update (without having to perform an expensive self-join by comparing the values):
MERGE INTO table_name dst
USING (
SELECT rid,
max_eff_date
FROM (
SELECT ROWID AS rid,
effective_date,
status,
MAX( effective_date ) OVER ( PARTITION BY name, ssn ) AS max_eff_date
FROM table_name
)
WHERE ( effective_date < max_eff_date AND status <> 'inactive' )
OR ( effective_date = max_eff_date AND status <> 'active' )
) src
ON ( dst.ROWID = src.rid )
WHEN MATCHED THEN
UPDATE
SET status = CASE
WHEN src.max_eff_date = dst.effective_date
THEN 'active'
ELSE 'inactive'
END;
So, for some sample data:
CREATE TABLE table_name ( name, ssn, effective_date, status ) AS
SELECT 'aaa', 1, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-03', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-01', 'active' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-03', 'active' FROM DUAL;
The query only updates the 3 rows that need changing and:
SELECT *
FROM table_name;
Outputs:
NAME | SSN | EFFECTIVE_DATE | STATUS
:--- | --: | :------------- | :-------
aaa | 1 | 01-JAN-20 | inactive
aaa | 1 | 02-JAN-20 | inactive
aaa | 1 | 03-JAN-20 | active
bbb | 2 | 01-JAN-20 | inactive
bbb | 2 | 02-JAN-20 | active
bbb | 3 | 01-JAN-20 | inactive
bbb | 3 | 03-JAN-20 | active
db<>fiddle here

How to Choose a specific value from a table and to avoid duplicates?

I have two tables:
MainTable
id AccountNum status
1 11001 active
2 11002 active
3 11003 active
4 11004 active
AddTable
id date description
1 01.2020 ACCOUNT.SET
1 02.2020 ACCOUNT.CHANGE
1 03.2020 ACCOUNT.REMOVE
2 04.2020 ACCOUNT.SET
2 05.2020 ACCOUNT.CHANGE
3 08.2020 ACCOUNT.SET
4 05.2020 ACCOUNT.SET
4 09.2020 ACCOUNT.REMOVE
I need to get a such result:
EffectiveFrom is date when Account was set,
EffectiveTo is date when Account was removed
id AccountNum EffectiveFrom EffectiveTo
1 11001 01.2020 03.2020
2 11002 04.2020 null
3 11003 08.2020 null
4 11004 05.2020 09.2020
The problem is that after joining on AddTable I get the duplicates, but I need just one row on every Id and only dates where the description in ACCOUNT.SET,ACCOUNT.REMOVE.
Are you looking for left join?
select m.id as id,
m.AccountNum as AccountNum,
a.date as EffectiveFrom,
b.date as EffectiveTo
from MainTable m left join
AddTable a on (a.id = m.id and a.description = 'ACCOUNT.SET') left join
AddTable b on (b.id = m.id and b.description = 'ACCOUNT.REMOVE')
order by m.AccountNum
Use a PIVOT and a LEFT OUTER JOIN:
SELECT m.id,
a.EffectiveFrom,
a.EffectiveTo
FROM MainTable m
LEFT OUTER JOIN
(
SELECT *
FROM AddTable
PIVOT( MAX( dt ) FOR description IN (
'ACCOUNT.SET' AS EffectiveFrom,
'ACCOUNT.REMOVE' AS EffectiveTo
) )
) a
ON ( a.id = m.id )
ORDER BY m.id
So for your test data:
CREATE TABLE MainTable ( id, AccountNum, status ) AS
SELECT 1, 11001, 'active' FROM DUAL UNION ALL
SELECT 2, 11002, 'active' FROM DUAL UNION ALL
SELECT 3, 11003, 'active' FROM DUAL UNION ALL
SELECT 4, 11004, 'active' FROM DUAL;
CREATE TABLE AddTable ( id, dt, description ) AS
SELECT 1, DATE '2020-01-01', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-02', 'ACCOUNT.CHANGE' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-03', 'ACCOUNT.REMOVE' FROM DUAL UNION ALL
SELECT 2, DATE '2020-01-04', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 2, DATE '2020-01-05', 'ACCOUNT.CHANGE' FROM DUAL UNION ALL
SELECT 3, DATE '2020-01-08', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 4, DATE '2020-01-05', 'ACCOUNT.SET' FROM DUAL UNION ALL
SELECT 4, DATE '2020-01-09', 'ACCOUNT.REMOVE' FROM DUAL;
This outputs:
ID | EFFECTIVEFROM | EFFECTIVETO
-: | :------------ | :----------
1 | 01-JAN-20 | 03-JAN-20
2 | 04-JAN-20 | null
3 | 08-JAN-20 | null
4 | 05-JAN-20 | 09-JAN-20
db<>fiddle here

find nearest row of different type in oracle

My table looks like
__ Key type timeStamp flag
1 ) 1 B 2015-06-28 22:19:26 Y
2 ) 1 B 2015-06-28 22:20:22 Y
3 ) 1 C 2015-06-28 22:22:06 N
4 ) 1 A 2015-06-28 22:25:11 N
5 ) 1 B 2015-06-28 22:29:44 Y
6 ) 1 A 2015-06-28 22:33:33 N
7 ) 1 B 2015-06-28 22:35:21 N
8 ) 1 B 2015-06-28 22:39:34 Y
9 ) 1 B 2015-06-28 22:43:53 N
10) 1 A 2015-06-28 22:45:53 N
I need to find out all the types of A whose flag='N' with respect to which there exist type B whose timestampOF(B)<timestampOF(A) and Flag(B)='Y' and key(A)=key(B).
note: If there exist two B previous than A than take the B with max timestamp.(ROW[8,9,10] 9 is taken instead of 8)
OUTPUT
__ Key type timeStamp flag
4 ) 1 A 2015-06-28 22:25:11 N
6 ) 1 A 2015-06-28 22:33:33 N
My approach
SELECT *
FROM tab TAB_OUT
WHERE TAB_OUT.TYPE='A'
AND TAB_OUT.FLAG='N'
AND EXISTS(
SELECT *
FROM tab TAB_IN
WHERE TAB_IN.KEY = TAB_OUT.KEY
AND TAB_IN.TYPE='B'
AND TAB_OUT.FLAG='Y'
AND TAB_IN.timestamp<TAB_OUT.timestamp
AND TAB_IN.timestamp = (SELECT MAX(timestamp) from
tab where timestamp< `TAB_OUT.timestamp`)
);
But in this i can not use TAB_OUT.timestamp in third level query. Is there any alternative solution to solve this problem.
In my query note: part is not satisfied as my query as it skips no. 9) and satisfy condition with no. 8)
A solution that only requires a single table scan:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( Key, type, timeStamp, flag ) AS
SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:19:26' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:20:22' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'C', CAST( TIMESTAMP '2015-06-28 22:22:06' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:25:11' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:29:44' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:33:33' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:35:21' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:39:34' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:43:53' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:45:53' AS DATE ), 'N' FROM DUAL
Query 1:
SELECT Key,
type,
timeStamp,
flag
FROM (
SELECT Key,
type,
timeStamp,
flag,
LAG( CASE WHEN type = 'B' THEN flag END ) IGNORE NULLS OVER ( PARTITION BY Key ORDER BY timeStamp ) AS prev_b_flag
FROM table_name t
WHERE type IN ( 'A', 'B' )
)
WHERE type = 'A'
AND flag = 'N'
AND prev_b_flag = 'Y'
Results:
| KEY | TYPE | TIMESTAMP | FLAG |
|-----|------|------------------------|------|
| 1 | A | June, 28 2015 22:25:11 | N |
| 1 | A | June, 28 2015 22:33:33 | N |
SELECT
*
FROM
tab A
WHERE
flag = 'N' AND type = 'A'
AND EXISTS (
SELECT
NULL
FROM
tab B
WHERE
type = 'B'
AND A.timestamp > timestamp AND A.Key = Key
GROUP BY
Key
HAVING
MAX(flag) KEEP (DENSE_RANK LAST ORDER BY timestamp) = 'Y'
);
There is no need to make correlated query to select flag from the the last record. Using aggregate KEEP clause is more efficient way. In this case it sort the groups by timestamp and keeps only the last value for the aggregation (last timestamp you wanted), so there comes only single record to the MAX function and we just take the FLAG value from it.
Here is simple example:
WITH sample (value1, value2) AS (
SELECT 1, 'Y' FROM DUAL UNION ALL
SELECT 2, 'X' FROM DUAL
)
SELECT
MIN(value2) KEEP (DENSE_RANK LAST ORDER BY value1) value2
FROM
sample
This returns value2 from the record with highest value1.

Resources