Oracle Ranking query

Need help to achieve the result below.
Source data: (generated by the query below)
Expected output: (not shown)
Query to generate the source data:
SELECT '43443' AS MSISDN,'Turkey' AS LOC,TRUNC(SYSDATE) AS DATA_DAY FROM DUAL
UNION
SELECT '43443' AS MSISDN,'Turkey' AS LOC,TRUNC(SYSDATE-1) AS DATA_DAY FROM DUAL
UNION
SELECT '43443' AS MSISDN,'India' AS LOC,TRUNC(SYSDATE-2) AS DATA_DAY FROM DUAL
UNION
SELECT '43443' AS MSISDN,'Eng' AS LOC,TRUNC(SYSDATE-3) AS DATA_DAY FROM DUAL
UNION
SELECT '43446' AS MSISDN,'Eng' AS LOC,TRUNC(SYSDATE-4) AS DATA_DAY FROM DUAL
UNION
SELECT '43446' AS MSISDN,'India' AS LOC,TRUNC(SYSDATE-5) AS DATA_DAY FROM DUAL;

Related

Duplicated rows numbering

I need to number the rows so that rows with the same ID get the same number. For example, see the sample data below.
Oracle database. Any ideas?
Use the DENSE_RANK analytic function:
SELECT DENSE_RANK() OVER (ORDER BY id) AS row_number,
id
FROM your_table
Which, for the sample data:
CREATE TABLE your_table ( id ) AS
SELECT 86325 FROM DUAL UNION ALL
SELECT 86325 FROM DUAL UNION ALL
SELECT 86326 FROM DUAL UNION ALL
SELECT 86326 FROM DUAL UNION ALL
SELECT 86352 FROM DUAL UNION ALL
SELECT 86353 FROM DUAL UNION ALL
SELECT 86354 FROM DUAL UNION ALL
SELECT 86354 FROM DUAL;
Outputs:
ROW_NUMBER ID
---------- ----------
1 86325
1 86325
2 86326
2 86326
3 86352
4 86353
5 86354
5 86354
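As a side note (not part of the original answer), here is a minimal sketch contrasting ROW_NUMBER() with DENSE_RANK() on a cut-down version of the same data, showing why DENSE_RANK() is the one that gives identical ids the same number:
-- Hedged comparison sketch: ROW_NUMBER() keeps counting through duplicates,
-- while DENSE_RANK() assigns equal ids the same value with no gaps.
with your_table ( id ) as (
  select 86325 from dual union all
  select 86325 from dual union all
  select 86326 from dual
)
select id,
       row_number() over (order by id) as rn,    -- 1, 2, 3
       dense_rank() over (order by id) as drnk   -- 1, 1, 2
from   your_table;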

REGEXP_SUBSTR not able to process only current row

(SELECT LISTAGG(EVENT_DESC, ',') WITHIN GROUP (ORDER BY EVENT_DESC) FROM EVENT_REF WHERE EVENT_ID IN
( SELECT REGEXP_SUBSTR(AFTER_VALUE,'[^,]+', 1, level) FROM DUAL
CONNECT BY REGEXP_SUBSTR(AFTER_VALUE, '[^,]+', 1, level) IS NOT NULL
)
)
The table from which I am fetching AFTER_VALUE holds comma-separated integer values, like this:
AFTER_VALUE   Expected output
-----------   ----------------------------
1             Event1
1,2           Event1,Event2
1,12,2,5      Event1,Event12,Event2,Event5
15,13         Event15,Event13
These are IDs in the EVENT_REF table which have a description. I am basically trying to present, for example, 1,2 as Event1,Event2 and send that back from the query. There are multiple events, so using REPLACE would be very tedious.
When using the above query I get the error "ORA-01722: invalid number" whenever there is more than one value in the AFTER_VALUE column. For example, if there is only one id the query works, but for values like 1,2 or 1,13 it throws the invalid number error.
PS: The event names are not Event1, Event2 etc.; I have just put those for reference.
You don't even need regular expressions for this task. The standard string function REPLACE can do the same thing, and faster. You only need an extra 'Event' at the beginning of the string, since that one doesn't "replace" anything.
Like this (note that you don't need the WITH clause; I included it only for quick testing):
with
event_ref (after_value) as (
select '1' from dual union all
select '1,2' from dual union all
select '1,12,2,5' from dual union all
select '15,13' from dual
)
select after_value,
'Event' || replace(after_value, ',', ',Event') as desired_output
from event_ref
;
AFTER_VALUE DESIRED_OUTPUT
----------- -----------------------------
1 Event1
1,2 Event1,Event2
1,12,2,5 Event1,Event12,Event2,Event5
15,13 Event15,Event13
Ah, OK, it looks like you have other characters in your comma-separated list, so you can use this query:
with EVENT_REF(EVENT_ID,EVENT_DESC) as (
select 1, 'Desc 1' from dual union all
select 2, 'Desc 2' from dual union all
select 3, 'Desc 3' from dual union all
select 4, 'Desc 4' from dual union all
select 5, 'Desc 5' from dual union all
select 12, 'Desc12' from dual union all
select 13, 'Desc13' from dual union all
select 15, 'Desc15' from dual
)
select
  (SELECT LISTAGG(EVENT_DESC, ',')
          WITHIN GROUP (ORDER BY EVENT_DESC)
   FROM   EVENT_REF
   WHERE  EVENT_ID IN
          ( SELECT to_number(REGEXP_SUBSTR(AFTER_VALUE, '\d+', 1, level))
            FROM   DUAL
            CONNECT BY level <= REGEXP_COUNT(AFTER_VALUE, '\d+')
          )
  )
from (
  select '1'        AFTER_VALUE from dual union all
  select '1,2'      AFTER_VALUE from dual union all
  select '1,12,2,5' AFTER_VALUE from dual union all
  select '15,13'    AFTER_VALUE from dual
);
PS: And do not forget that TO_NUMBER now supports DEFAULT ... ON CONVERSION ERROR: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/TO_NUMBER.html
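For instance, a minimal sketch of that clause (Oracle 12.2+); a token that is not a valid number yields the fallback value instead of raising an error:
-- Hedged sketch: DEFAULT ... ON CONVERSION ERROR returns the fallback value
-- instead of raising ORA-01722 for a non-numeric token.
select to_number('12'  default null on conversion error) as good_token,  -- 12
       to_number('1x2' default null on conversion error) as bad_token    -- NULL
from   dual;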
There is no need to split and concatenate substrings, just use regexp_replace:
with EVENT_REF (AFTER_VALUE) as (
select '1' from dual union all
select '1,2' from dual union all
select '1,12,2,5' from dual union all
select '15,13' from dual
)
select regexp_replace(AFTER_VALUE,'(\d+)','Event\1') from EVENT_REF;
REGEXP_REPLACE(AFTER_VALUE,'(\D+)','EVENT\1')
-----------------------------------------------
Event1
Event1,Event2
Event1,Event12,Event2,Event5
Event15,Event13

Oracle: Is it possible to filter out duplicates only when they appear in succession?

Thank you in advance for your help.
I have a table that holds itinerary information for drivers. There will be times when the itinerary has the same stop, but several days apart. I'd like to be able to query the table and filter out any record where the address is the same AND the dates are consecutive.
Is this possible?
Thanks again,
josh
with tst as(
select timestamp '2020-08-01 00:00:00' dt, '123 street' loc from dual
union all
select timestamp '2020-08-01 00:00:00', '89 street' from dual
union all
select timestamp '2020-08-02 00:00:00', '456 airport' from dual
union all
select timestamp '2020-08-04 00:00:00', '456 airport' from dual
union all
select timestamp '2020-08-05 00:00:00', '67 street' from dual
union all
select timestamp '2020-08-06 00:00:00', '89 street' from dual
union all
select timestamp '2020-08-07 00:00:00', '123 street' from dual
)
select dt, loc
from (
select dt, loc, nvl(lag(loc) over(order by dt), 'FIRST_ROW') prev_loc
from tst
) where loc <> prev_loc;
Another approach would be to use the Tabibitosan method, which assigns consecutive rows a group number and then counts the number of rows per group (described on the AskTom website).
with test_data as(
select date'2020-08-01' dt, '123 street' loc from dual
union all
select date '2020-08-01', '89 street' from dual
union all
select date '2020-08-02', '456 airport' from dual
union all
select date '2020-08-04', '456 airport' from dual
union all
select date '2020-08-05', '67 street' from dual
union all
select date '2020-08-06', '89 street' from dual
union all
select date '2020-08-07', '123 street' from dual
)
select max(dt),loc
from
(
select t.*
,row_number() over (order by dt) -
row_number() over (partition by loc order by dt) grp
from test_data t
)
group by grp,loc
having count(*) > 1;
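To see why the difference of the two ROW_NUMBER() calls identifies a run, here is a minimal illustrative sketch (not from the original answer) that exposes the grp column on a few of the sample rows:
-- The difference stays constant within a run of identical loc values and
-- changes once the run is broken, so equal grp marks one consecutive group.
with test_data as (
  select date '2020-08-02' dt, '456 airport' loc from dual union all
  select date '2020-08-04', '456 airport' from dual union all
  select date '2020-08-05', '67 street'   from dual
)
select t.*,
       row_number() over (order by dt)                       rn_all,
       row_number() over (partition by loc order by dt)      rn_loc,
       row_number() over (order by dt)
         - row_number() over (partition by loc order by dt)  grp
from   test_data t
order  by dt;
-- 2020-08-02  456 airport  1  1  0
-- 2020-08-04  456 airport  2  2  0   <- same grp (0): consecutive duplicate
-- 2020-08-05  67 street    3  1  2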
Another approach uses MATCH_RECOGNIZE, available from 12c onwards. The pattern equal{1,} means the 'equal' condition repeated one or more times.
with test_data as(
select date'2020-08-01' dt, '123 street' loc from dual
union all
select date '2020-08-01', '89 street' from dual
union all
select date '2020-08-02', '456 airport' from dual
union all
select date '2020-08-04', '456 airport' from dual
union all
select date '2020-08-05', '67 street' from dual
union all
select date '2020-08-06', '89 street' from dual
union all
select date '2020-08-07', '123 street' from dual
)
select *
from test_data
match_recognize (
order by dt
all rows per match
pattern (equal{1,})
define
equal as loc = prev(loc)
);

Determine start of data's most recent uninterrupted 'streak' by date

I have a dataset that looks something like:
asset_id,date_logged
1234,2018-02-01
1234,2018-02-02
1234,2018-02-03
1234,2018-02-04
1234,2018-02-05
1234,2018-02-06
1234,2018-02-07
1234,2018-02-08
1234,2018-02-09
1234,2018-02-10
9876,2018-02-01
9876,2018-02-02
9876,2018-02-03
9876,2018-02-07
9876,2018-02-08
9876,2018-02-09
9876,2018-02-10
For the purpose of this exercise, imagine today's date is 2018-02-10 (10 Feb 2018). For all the asset_ids in the table, I am trying to identify the start of the most recent unbroken streak for date_logged.
For asset_id = 1234, this would be 2018-02-01. The asset_id was logged all 10 days in an unbroken streak. For asset_id = 9876, this would be 2018-02-07. Because the asset_id was not logged on 2018-02-04, 2018-02-05, and 2018-02-06, the most recent unbroken streak starts on 2018-02-07.
So, my result set would hopefully look something like:
asset_id,Number_of_days_in_most_recent_logging_streak
1234,10
9876,4
Or, alternatively:
asset_id,Date_Begin_Most_Recent_Streak
1234,2018-02-01
9876,2018-02-07
I haven't been able to work out anything that gets me close. My best effort so far is to get the number of days between the first log date and today, and the number of days the asset_id appears in the dataset, and compare these to identify cases where the streak starts later than the first day the asset appears. For my real dataset this isn't particularly problematic, but it's an ugly solution and I would like to understand a better way of getting to the outcome.
Perhaps something like this. Break the query after each inline view in the WITH clause and SELECT * FROM the most recent inline view, to see what each step does.
with
inputs ( asset_id, date_logged ) as (
select 1234, to_date('2018-02-01', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-02', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-03', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-04', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-05', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-06', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-07', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-08', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-09', 'yyyy-mm-dd') from dual union all
select 1234, to_date('2018-02-10', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-01', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-02', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-03', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-07', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-08', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-09', 'yyyy-mm-dd') from dual union all
select 9876, to_date('2018-02-10', 'yyyy-mm-dd') from dual
),
prep ( asset_id, date_logged, grp ) as (
select asset_id, date_logged,
date_logged - row_number()
over (partition by asset_id order by date_logged)
from inputs
),
agg ( asset_id, date_logged, cnt ) as (
select asset_id, min(date_logged), count(*)
from prep
group by asset_id, grp
)
select asset_id, max(date_logged) as date_start_recent_streak,
max(cnt) keep (dense_rank last order by date_logged) as cnt
from agg
group by asset_id
order by asset_id -- If needed
;
ASSET_ID DATE_START_RECENT_STREAK CNT
---------- ------------------------ ----------
1234 2018-02-01 10
9876 2018-02-07 4
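As an illustration of the "break the query after each inline view" advice above, here is a minimal sketch (my own, on a cut-down subset of the 9876 rows) of what the prep step produces; date_logged minus its row number stays constant inside a streak and jumps after a gap:
-- Hedged sketch: inspect the intermediate "prep" step by selecting from it
-- directly. date_logged minus its row_number() is constant within a streak.
with
  inputs ( asset_id, date_logged ) as (
    select 9876, date '2018-02-07' from dual union all
    select 9876, date '2018-02-08' from dual union all
    select 9876, date '2018-02-10' from dual
  ),
  prep ( asset_id, date_logged, grp ) as (
    select asset_id, date_logged,
           date_logged - row_number()
                         over (partition by asset_id order by date_logged)
    from   inputs
  )
select * from prep;
-- 9876  2018-02-07  2018-02-06   <- streak 1
-- 9876  2018-02-08  2018-02-06   <- streak 1 (same grp)
-- 9876  2018-02-10  2018-02-07   <- streak 2 (grp changes after the gap)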
You can try this:
with test (asset_id, date_logged) as
(select 1234, date '2018-02-01' from dual union all
select 1234, date '2018-02-02' from dual union all
select 1234, date '2018-02-03' from dual union all
select 1234, date '2018-02-04' from dual union all
select 1234, date '2018-02-05' from dual union all
select 1234, date '2018-02-06' from dual union all
select 1234, date '2018-02-07' from dual union all
select 1234, date '2018-02-08' from dual union all
select 1234, date '2018-02-09' from dual union all
select 1234, date '2018-02-10' from dual union all
select 9876, date '2018-02-01' from dual union all
select 9876, date '2018-02-02' from dual union all
select 9876, date '2018-02-03' from dual union all
select 9876, date '2018-02-07' from dual union all
select 9876, date '2018-02-08' from dual union all
select 9876, date '2018-02-09' from dual union all
select 9876, date '2018-02-10' from dual union all
select 9876, date '2018-02-11' from dual union all
select 9876, date '2018-02-12' from dual
)
SELECT asset_id, MIN(date_logged), COUNT(1)
FROM (SELECT asset_id, date_logged,
MAX(date_logged) OVER (PARTITION BY asset_id)+1 max_date_logged_plus_one,
DENSE_RANK() OVER (PARTITION BY asset_id ORDER BY date_logged desc) rown
FROM test
ORDER BY asset_id, date_logged desc)
WHERE max_date_logged_plus_one - date_logged = rown
GROUP BY asset_id;
ASSET_ID MIN(DATE_LOGGED) COUNT(1)
---------- ---------------- ----------
1234 01-FEB-18 10
9876 07-FEB-18 6
If this row is commented out:
select 9876, date '2018-02-10' from dual union all
then the output is:
ASSET_ID MIN(DATE_LOGGED) COUNT(1)
---------- ---------------- ----------
1234 01-FEB-18 10
9876 11-FEB-18 2
Would this make any sense?
with test (asset_id, date_logged) as
  (select 1234, date '2018-02-01' from dual union all
   select 1234, date '2018-02-02' from dual union all
   select 1234, date '2018-02-03' from dual union all
   select 1234, date '2018-02-04' from dual union all
   select 1234, date '2018-02-05' from dual union all
   select 1234, date '2018-02-06' from dual union all
   select 1234, date '2018-02-07' from dual union all
   select 1234, date '2018-02-08' from dual union all
   select 1234, date '2018-02-09' from dual union all
   select 1234, date '2018-02-10' from dual union all
   select 9876, date '2018-02-01' from dual union all
   select 9876, date '2018-02-02' from dual union all
   select 9876, date '2018-02-03' from dual union all
   select 9876, date '2018-02-07' from dual union all
   select 9876, date '2018-02-08' from dual union all
   select 9876, date '2018-02-09' from dual union all
   select 9876, date '2018-02-10' from dual
  ),
inter as
  -- difference between DATE_LOGGED and its previous DATE_LOGGED
  (select asset_id,
          date_logged,
          date_logged - lag(date_logged) over (partition by asset_id order by date_logged) diff
   from   test
  )
select i.asset_id, min(i.date_logged) date_logged
from   inter i
where  nvl(i.diff, 1) = (select max(i1.diff)
                         from   inter i1
                         where  i1.asset_id = i.asset_id
                        )
group by i.asset_id
order by i.asset_id;

  ASSET_ID DATE_LOGGE
---------- ----------
      1234 2018-02-01
      9876 2018-02-07

Grouping and aggregation based on a specific condition

My result set from a query looks like this
trackingnumber type price
------------------------------------------
12799467 AVRM 674.0536
12799467 AVRM 860.7415
12799467 PRICESTD 200.00
12799468 PRICESTD 590.85
12799469 PRICESTD 800
12799470 PRICESTD 640
12799471 PRICESTD 160
12799472 PRICESTD 2080
12799473 PRICESTD 354.3779
I want to group this by trackingnumber and, in cases where the grouped result set has more than one row, return the SUM of all the prices whose type is AVRM; otherwise return the individual price as it is. If the count is greater than one and none of the rows has type AVRM, then its total price would be null.
The expected result would be this
trackingnumber Total Price
-----------------------------------------
12799467 1534.7951 --sum of price excluding 200
12799468 590.85
12799469 800
12799470 640
12799471 160
12799472 2080
12799473 354.3779
I couldn't think of a way to get this done except grouping by trackingnumber and checking the type with a CASE expression in the SELECT list, but I believe that would not work since we do not group by type.
I'm not sure if this can be achieved using a single query.
Yes, it can be done in a single query.
with sample_data ( tracking_Number, "TYPE", price ) as
(
SELECT 12799467,'AVRM',674.0536 FROM DUAL UNION ALL
SELECT 12799467,'AVRM',860.7415 FROM DUAL UNION ALL
SELECT 12799467,'PRICESTD',200.00 FROM DUAL UNION ALL
SELECT 12799468,'PRICESTD',590.85 FROM DUAL UNION ALL
SELECT 12799469,'PRICESTD',800 FROM DUAL UNION ALL
SELECT 12799470,'PRICESTD',640 FROM DUAL UNION ALL
SELECT 12799471,'PRICESTD',160 FROM DUAL UNION ALL
SELECT 12799472,'PRICESTD',2080 FROM DUAL UNION ALL
SELECT 12799473,'PRICESTD',354.3779 FROM DUAL )
SELECT tracking_number,
       case when count(*) > 1
            then sum(decode("TYPE", 'AVRM', price, null))
            else sum(price)
       end price
from   sample_data
group  by tracking_number
order  by tracking_number;
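Regarding the concern that type is not in the GROUP BY: the CASE here operates on whole-group aggregates, so type never needs to be grouped on. A minimal sketch of the same conditional-aggregation idea written with CASE instead of DECODE (my own illustration, on a subset of the data):
with sample_data ( tracking_number, "TYPE", price ) as (
  select 12799467, 'AVRM',     674.0536 from dual union all
  select 12799467, 'AVRM',     860.7415 from dual union all
  select 12799467, 'PRICESTD', 200.00   from dual union all
  select 12799468, 'PRICESTD', 590.85   from dual
)
select tracking_number,
       case when count(*) > 1
            then sum(case when "TYPE" = 'AVRM' then price end)  -- AVRM rows only
            else sum(price)                                     -- single row as-is
       end as total_price
from   sample_data
group  by tracking_number
order  by tracking_number;
-- 12799467  1534.7951
-- 12799468  590.85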
For example, the next solution. I added a condition to exclude rows whose type is not 'AVRM' whenever rows with 'AVRM' exist for the same tracking number.
with s (trackingnumber ,type ,price)
as (
select 12799467,'AVRM',674.0536 from dual union all
select 12799467 ,'AVRM', 860.7415 from dual union all
select 12799467 ,'PRICESTD', 200.00 from dual union all
select 12799468 ,'PRICESTD', 590.85 from dual union all
select 12799469 ,'PRICESTD', 800 from dual union all
select 12799470 ,'PRICESTD', 640 from dual union all
select 12799471 ,'PRICESTD', 160 from dual union all
select 12799472 ,'PRICESTD', 2080 from dual union all
select 12799473 ,'PRICESTD', 354.3779 from dual )
select trackingnumber,
       sum(price)
from  (select s.*, rownum as rn
       from   s
       where  not exists (select null
                          from   s subs
                          where  s.trackingnumber = subs.trackingnumber
                          and    s.type != 'AVRM'
                          and    subs.type = 'AVRM')
      )
group by trackingnumber,
         case when type = 'AVRM' then 0 else rn end;
