11g Oracle aggregate SQL query - oracle

Can you please help me in getting a query for this scenario. In below case it should return me single row of A=13 because 13,14 in column A has most occurrences and value of B (30) is greater for 13. We are interested in maximum occurrences of A and in case of tie B should be considered as tie breaker.
A B
13 30
13 12
14 10
14 25
15 5
In below case where there are single occurrence of A (all tied) it should return 14 having maximum value of 40 for B.
A B
13 30
14 40
15 5
Use case - we get calls from corporate customers. We are interested in knowing during what hours of day when most calls come and in case of tie - which of the busiest hours has longest call.
Further question
There is further questions on this. I want to use either of two solutions - '11g or lower' from #GurV or 'dense_rank' from #mathguy in bigger query below how can I do it.
SELECT dv.id , u.email , dv.email_subject AS headline , dv.start_date , dv.closing_date, b.name AS business_name, ls.call_cost, dv.currency,
SUM(lsc.duration) AS duration, COUNT(lsc.id) AS call_count, ROUND(AVG(lsc.duration), 2) AS avg_duration
-- max(extract(HOUR from started )) keep (dense_rank last order by count(duration), max(duration)) as most_popular_hour
FROM deal_voucher dv
JOIN lead_source ls ON dv.id = ls.deal_id
JOIN lead_source_call lsc ON ls.PHONE_SID = lsc.phone_number_id
JOIN business b ON dv.business_id = b.id
JOIN users u ON b.id = u.business_id
AND TRUNC(dv.closing_date) = to_date('13-01-2017', 'dd-mm-yyyy')
AND lsc.status = 'completed' and lsc.duration >= 30
GROUP BY dv.id , u.email , dv.email_subject , dv.start_date , dv.closing_date, b.name, ls.call_cost, dv.currency
--, extract(HOUR from started )

Try this if 12c+
select a
from t
group by a
order by count(*) desc, max(b) desc
fetch first 1 row only;
If 11g or lower:
select * from (
select a
from t
group by a
order by count(*) desc, max(b) desc
) where rownum = 1;
Note that if there is equal count and equal max value for two or more values of A, then any one of them will be fetched.

Here is a query that will work in older versions (no fetch clause) and does not require a subquery. It uses the first/last function. In case of ties by both "count by A" and "value of max(B)" it selects only the row with the largest value of A. You can change that to min(A), or even to sum(A) (although that probably doesn't make sense in your problem) or LISTAGG(A, ',') WITHIN GROUP (ORDER BY A) to get a comma-delimited list of the A's that are tied for first place, but that requires 11.2 (I believe).
select max(a) keep (dense_rank last order by count(b), max(b)) as a
, max(max(b)) keep (dense_rank last order by count(b)) as b
from inputs
group by a
;

Related

Referancing value from select column in where clause : Oracle

My tables are as below
MS_ISM_ISSUE
ISSUE_ID ISSUE_DUE_DATE ISSUE_SOURCE_TYPE
I1 25-11-2018 1
I2 25-12-2018 1
I3 27-03-2019 2
MS_ISM_SOURCE_SETUP
SOURCE_ID MODULE_NAME
1 IT-Compliance
2 Risk Assessment
I have written following query.
with rs as
(select
count(ISSUE_ID) as ISSUE_COUNT, src.MODULE_NAME,
case
when ISSUE_DUE_DATE<sysdate then 'Overdue'
when ISSUE_DUE_DATE between sysdate and sysdate + 90 then 'Within 3 months'
when ISSUE_DUE_DATE>sysdate+90 then 'Beyond 90 days'
end as date_range
from MS_ISM_ISSUE issue, MS_ISM_SOURCE_SETUP src
where issue.Issue_source_type = src.source_id
group by src.MODULE_NAME, case
when ISSUE_DUE_DATE<sysdate then 'Overdue'
when ISSUE_DUE_DATE between sysdate and sysdate + 90 then 'Within 3 months'
when ISSUE_DUE_DATE>sysdate+90 then 'Beyond 90 days'
end)
select ISSUE_COUNT,MODULE_NAME, DATE_RANGE,
(select count(ISSUE_COUNT) from rs where rs.MODULE_NAME=MODULE_NAME) as total from rs;
The output of the code is as below.
ISSUE_COUNT MODULE_NAME DATE_RANGE Total
1 IT-Compliance Overdue 3
1 IT-Compliance Within 3 months 3
1 Risk Assessment Beyond 90 days 3
The result is correct till 3rd column. In 4th column what I want is, total of Issue count for given module name. Hence in above case Total column will have value as 2 for first and second row (since there are 2 Issues for IT-Compliance) and value 1 for the third row (since one issue is present for Risk Assessment).
Essentially, I want to achieve is to replace current row's MODULE_NAME in last where clause. How do I achieve this using query?
OK, this condition
where rs.MODULE_NAME=MODULE_NAME
is essentially the same as if you wrote
where MODULE_NAME = MODULE_NAME
which is simply always true (if there are no nulls in module_name).
Try using different table alias for inner query and outer query, e.g.
select count(ISSUE_COUNT) from rs rs2 where rs2.MODULE_NAME=rs.MODULE_NAME
You can also try to use analytic function here, something like
select ISSUE_COUNT,
MODULE_NAME,
DATE_RANGE,
COUNT(ISSUE_COUNT) OVER (PARTITION BY RS.MODULE_NAME) AS TOTAL
from rs
instead of your subquery

Oracle Select Query on Same Table (self join)

It seems to simple, but not getting desired results
I have a table with there data
Team_id, Player_id, Player_name Game_cd
1 100 abc 24
1 1000 xyz 24
1 588 ert 24
1 500 you 24
2 600 ops 24
2 700 dps 24
2 900 lmv 24
2 200 hmv 24
I have to write a query to get a result like this
Home_team home_plr_id home_player away_team away_plr_id away_player
1 100 abc 2 600 ops
1 1000 xyz 2 900 lmv
The query I wrote
select f1.Team_id as home_team,
f1.player_id as home_plr_id,
f1.player_Name as home_player,
f2.Team_id as away_team,
f2.player_id as away_plr_id,
f2.player_Name as home_player
from game f1, game f2
where
f1.team_id<> f2.team_id and
f1.game_cd = f2.game_cd
Alternative to #Radagast81's self-join is pivot, available in your Oracle version:
select home_plr_id, home_plr_name, away_plr_id, away_plr_name
from (select game.*,
row_number() over (partition by team_id order by player_id) rn
from game)
pivot (max(player_id) plr_id, max(player_name) plr_name
for team_id in (1 home, 2 away))
SQL Fiddle
Players have to be numbered somehow (here by ID), it can be done by name, null or even random. This numbering is needed only to put them in same rows. Pivot works also if numbers of players in teams differs.
It is not clear how you want to pair a home player with an away player. But provided that you don't care about that, the following might be what you are looking for:
WITH game_p AS (SELECT team_id, player_id, player_name, game_cd
, ROW_NUMBER() over (PARTITION BY team_id, game_cd ORDER BY player_id) pos
, dense_rank() over (PARTITION BY game_cd ORDER BY team_id) team_pos
FROM game)
SELECT NVL(f1.game_cd, f2.game_cd) AS game_cd
, f1.Team_id as home_team
, f1.player_id as home_plr_id
, f1.player_Name as home_player
, f2.Team_id as away_team
, f2.player_id as away_plr_id
, f2.player_Name as away_player
FROM (SELECT * FROM game_p WHERE team_pos = 1) f1
FULL JOIN (SELECT * FROM game_p WHERE team_pos = 2) f2
ON f1.game_cd = f2.game_cd
AND f1.pos = f2.pos
The new column POS gives any player of each team a position to pair them with the other team.
The new column TEAM_POS is to get the team_id mapped to the values 1 and 2, as the team_id's can differ per game.
Finally do a FULL JOIN to get the final list. If the number of players are allways the same for both teams you can do a normal join instead...

Add indicator to top and bottom 10%

I'm trying to capture the average of FIRST_CONTACT_CAL_DAYS but what I would like to do is create an indicator for the top and bottom 10% of values so I can exclude those (outliers) from my average calculation.
Not sure how to go about do this, any thoughts?
SELECT DISTINCT
TO_CHAR(A.FIRST_ASSGN_DT,'DAY') AS DAY_NUMBER,
A.FIRST_ASSGN_DT,
A.FIRST_CONTACT_DT,
TO_CHAR(A.FIRST_CONTACT_DT,'DAY') AS DAY_NUMBER2,
A.FIRST_CONTACT_DT AS FIRST_PHONE_CONTACT,
A.ID,
ABS(TO_DATE(A.FIRST_CONTACT_DT, 'DD/MM/YYYY') - TO_DATE(A.FIRST_ASSGN_DT, 'DD/MM/YYYY')) AS FIRST_CONTACT_CAL_DAYS,
FROM HIST A
LEFT JOIN CONTACTS D ON A.ID = D.ID
WHERE 1=1
You may be looking for something like this. Please adapt to your situation.
I assume you may have more than one "group" or "partition" and you need to compute the average for each group separately, after throwing out the outliers in each partition. (An alternative, which can be easily accommodated by adapting the query below, is to throw out the outliers at the global level, and only then to group and take the average for each group.)
If you don't have any groups, and everything is one big pile of data, it's even easier - you don't need GROUP BY and PARTITION BY.
Then: the function NTILE assigns a bucket number, in this example between 1 and 10, to each row, based on where they fall (first decile, i.e. first 10%, next decile, ... all the way to the last decile). I do this in a subquery. Then in the outer query just filter out the first and last bucket before you group by and you compute the average.
For testing purposes I create three groups with 10,000 random numbers each in a WITH clause - no need to spend any time on that portion of the code, since it is not part of the solution (the SQL code to solve your problem) - it's just a dirty trick to create test data on the fly.
with
inputs ( grp, val ) as (
select ceil(level/10000), dbms_random.value(0, 150)
from dual
connect by level <= 30000
)
select grp, avg(val) as avg_val
from (
select grp, val, ntile(10) over (partition by grp order by val) as bkt
from inputs
)
where bkt between 2 and 9
group by grp
;
GRP AVG_VAL
--- -----------------------
1 75.021614866547043734458
2 74.286117923344418598032
3 75.437412573353736953791

recursive cte working very slow

I want to Group the rows based on certain columns, i.e. if data is same in these columns in continuous rows, then assign same Group Number to them, and if its changed, assign new one. This become complex as the same data in the columns could appear later in some other rows, so they have to be given another Group Number as they are not in continuous rows with previous group.
I used cte for this purpose and it is giving correct output also, but is so slow that iterating over 75k+ rows takes about 15 minutes. The code I used is:
WITH
cte AS (SELECT ROW_NUMBER () OVER (ORDER BY Patient_ID, Opnamenummer, SPECIALISMEN, Opnametype, OntslagDatumTijd) AS RowNumber,
Opnamenummer, Patient_ID, AfdelingsCode, Opnamedatum, Opnamedatumtijd, Ontslagdatum, Ontslagdatumtijd, IsSpoedopname, OpnameType, IsNuOpgenomen, SpecialismeCode, Specialismen
FROM t_opnames)
SELECT * INTO #ttt FROM cte;
WITH cte2 AS (SELECT TOP 1 RowNumber,
1 AS GroupNumber,
Opnamenummer, Patient_ID, AfdelingsCode, Opnamedatum, Opnamedatumtijd, Ontslagdatum, Ontslagdatumtijd, IsSpoedopname, OpnameType, IsNuOpgenomen, SpecialismeCode, Specialismen
FROM #ttt
ORDER BY RowNumber
UNION ALL
SELECT c1.RowNumber,
CASE
WHEN c2.Afdelingscode <> c1.Afdelingscode
OR c2.Patient_ID <> c1.Patient_ID
OR c2.Opnametype <> c1.Opnametype
THEN c2.GroupNumber + 1
ELSE c2.GroupNumber
END AS GroupNumber,
c1.Opnamenummer,c1.Patient_ID,c1.AfdelingsCode,c1.Opnamedatum,c1.Opnamedatumtijd,c1.Ontslagdatum,c1.Ontslagdatumtijd,c1.IsSpoedopname,c1.OpnameType,c1.IsNuOpgenomen, SpecialismeCode, Specialismen
FROM cte2 c2
JOIN #ttt c1 ON c1.RowNumber = c2.RowNumber + 1
)
SELECT *
FROM cte2
OPTION (MAXRECURSION 0) ;
DROP TABLE #ttt
I tried to improve performance by putting output of cte in a temp table. That increased the performance, but still its too slow. So, how can I increase the performance of this code to run it under 10 seconds for 75k+ records? The output before cancelling the query is: Screenshot. As visible from the image, data is same in columns Afdelingscode,Patient_ID and Opnametype in RowNumber 3,5 and 6, but they have different GroupNumber because of concurrency of the rows.
Without data its not that easy to test but i would try first to not use temporary table and just use both cte from start to end, ie;
;WITH
cte AS (...),
cte2 AS (...)
select * from cte2
OPTION (MAXRECURSION 0);
Without knowing indices etc... for instance, you do a lot of ordering in the first cte. Is this supported by indices (or one multicolumn index) or not?
Without the data i don't have the option to play with it but looking at this:
CASE
WHEN c2.Afdelingscode <> c1.Afdelingscode
OR c2.Patient_ID <> c1.Patient_ID
OR c2.Opnametype <> c1.Opnametype
THEN c2.GroupNumber + 1
ELSE c2.GroupNumber
i would try to take a look at partition by statement in row_number
So try to run this:
WITH
cte AS (
SELECT ROW_NUMBER () OVER (PARTITION BY Afdelingscode , Patient_ID ,Opnametype ORDER BY Patient_ID, Opnamenummer, SPECIALISMEN, Opnametype, OntslagDatumTijd ) AS RowNumber,
Opnamenummer, Patient_ID, AfdelingsCode, Opnamedatum, Opnamedatumtijd, Ontslagdatum, Ontslagdatumtijd, IsSpoedopname, OpnameType, IsNuOpgenomen
FROM t_opnames)

PL SQL concatenate 2 resultsets

I need to get the result of concatenating 2 similar querys' resulsets. For some reason had to split the original query in 2, both with their corresponding order by clause. Should be something like (this is an oversimplification of the original queries)
Query1: Select name, age from person where age=10
Resultset1:
Person1, 10
Person3, 10
Query2: Select name, age from person where age=20
Resultset1:
Person2, 20
Person6, 20
The expected result:
Person1, 10
Person3, 10
Person2, 20
Person6, 20
I can not simply use Query1 UNION Query2.
Below the 2 original querys:
(#1)
select cp.CP_ID, cpi.CI_DESCRIPCION, cp.CP_CODIGOJERARQUIZADO, cp.CP_ESGASTO as gasto, cp.CP_CONCEPTOPADRE, LEVEL
from TGCCP_ConceptoPagoIng cp
left join tgcci_ConceptoPagoIngIdioma cpi on cpi.CI_IDCONCEPTOPAGOING = cp.CP_ID and cpi.CI_IDIDIOMA = 1
start with ((CP_CONCEPTOPADRE is null) and (**cp.CP_ESGASTO = 1**))
connect by prior cp.CP_ID = cp.CP_CONCEPTOPADRE
order siblings by CP_CODIGOJERARQUIZADO
(#2)
select cp.CP_ID, cpi.CI_DESCRIPCION, cp.CP_CODIGOJERARQUIZADO, cp.CP_ESGASTO as gasto, cp.CP_CONCEPTOPADRE, LEVEL
from TGCCP_ConceptoPagoIng cp
left join tgcci_ConceptoPagoIngIdioma cpi on cpi.CI_IDCONCEPTOPAGOING = cp.CP_ID and cpi.CI_IDIDIOMA = 1
start with ((CP_CONCEPTOPADRE is null) and (**cp.CP_ESGASTO = 2**))
connect by prior cp.CP_ID = cp.CP_CONCEPTOPADRE
order siblings by CP_CODIGOJERARQUIZADO
I think you want a
select * from ( first query )
UNION ALL
select * from ( second query )
Where first query and second query are the queries from above, so you are turning them into subqueries, thus preserving the order by clauses.
OK, well, I'm not fully certain why you need it this way, but if Oracle won't allow you to do a UNION, or it screws up the ordering when you do, I would try creating a pipelined table function.
An example here
Basically, you'd create a procedure that ran both queries, first one, then the other, putting the results of each into the returned dataset.
It looks like you are looking for a MULTISET UNION. Which can only be used from version 10 upwards.
Regards,
Rob.
You could combine your queries as subqueries and do a single order by on the outer query:
select * from (
<query 1 with its order by>
UNION ALL
<query 2 with its order by>
)
order by column1, column2;
Alternatively, you can implement in PL/SQL the equivalent of a sort merge join with two cursors, but that's unnecessarily complicated.
this solution works perfectly:
select * from ( first query )
UNION ALL
select * from ( second query )
I appreciate everyone that have taken the time to answer.
regards.
For your example:
Select name, age from person where age in (10,20)
or
Select name, age from person where age = 10 or age = 20
However I'm guessing this is not what you need :)

Resources