Row count by group

Row count by group - oracle

I am attempting to write the following query to get a row count by group.
select
a.employee, a.cov_option,
count(a.cov_option) over (partition by a.cov_option order by a.employee) as row_num
from wilson.benefit a
inner join wilson.bncategory b
ON a.plan_type = b.plan_type and a.plan_option = b.plan_option
inner join wilson.bncovopt c
ON a.company = c.company and a.plan_code = c.plan_code and a.cov_option = c.coverage_opt
where
a.plan_type = 'HL' and
to_char(a.stop_date, 'yyyy-mm-dd') = '1700-01-01'
order by a.employee, a.cov_option
The result set returned is:
employee | cov_option |row_num
-------------|--------------|--------------
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 2
1420 | 3 | 4
1420 | 3 | 4
1537 | 2 | 2
1537 | 2 | 2
The result set I am attempting to return is:
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 1
1420 | 3 | 2
1420 | 3 | 2
1537 | 2 | 1
1537 | 2 | 1

What you seem to want is dense_rank() rather than count(). Indeed, "count" means simply determining how many rows are in each group, it is not "counting" in the way we learn as children (first, second, third). That kind of counting is called "ranking".
dense_rank() over (partition by a.employee order by a.cov_option) as row_num
should do what you need.
There is also rank() - the difference is that if two rows are tied for first, with dense_rank() the third row gets rank 2; with simple rank() it gets rank 3 (rank 2 is "used up" by the first two rows).

Related

Count Max number of day continious overdraff

Date | Account | Amount | Count max number of day continuous < 0 |
1 | 1001 | 100 | 0 |
2 | 1001 | -100 | 1 |
3 | 1001 | -100 | 2 |
4 | 1001 | 100 | 2 |
5 | 1001 | -100 | 2 |
6 | 1001 | -100 | 2 |
7 | 1001 | -100 | 3 |
8 | 1001 | -100 | 4 |
9 | 1001 | 100 | 4 |
I have sample data. I want have column "Count max number of day continuous < 0". How i can select it in database oracle

In order to find continuous periods you can use Tabibitosian method. Then use analytical count and finally max:
select date_, account, amount,
max(cnt) over (partition by account order by date_) max_overdraft_period
from (
select date_, account, amount,
count(case when amount <= 0 then 1 end)
over (partition by account, grp order by date_) cnt
from (
select date_, account, amount,
date_ - sum(case when amount <= 0 then 1 else 0 end)
over (partition by account order by date_) grp
from t ))
demo
I assumed that dates are continuous, if not then use row numbering at first.

How to reuse sql with subquery factoring by database view

What is the best practice to convert the following sql statement using a subquery (with data as clause) to use it in a database view.
AFAIK the with data as clause is not supported in database views (Edited: Oracle supports Common Table Expressions), but in my case the subquery factoring offers advantage for performance. If I create a database view using Common Table Expression, than this advantage is lost.
Please have a look at my example:
Description of query
a_table
Millions of entries, by the select statement a few thousand are selected.
anchor_table
For each entry in a_table exists a corresponding entry in anchor_table. By this table is determined at runtime exactly one row as anchor. See example below.
horizon_table
For each selection exactly one entry is determined at runtime (all entries of a selection of a_table have the same horizon_id)
Please notice: This is a strongly simplified sql that works fine so far.
In reality more than 20 tables are joined together to get the results of data.
The where clause is much more complex.
Further columns of horizon_table and anchor_table are required to prepare my where condition and result list in the subquery, i.e. moving these tables to the main query is no solution.
with data as (
select
a_table.id,
a_table.descr,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(
order by a_table.a_position_field) as position
from a_table
join anchor_table on (anchor_table.id = a_table.anchor_id)
join horizon_table on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between 1 and 10000
)
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1)
example of with data as select:
id descr offset anchor position
1 bla 3 0 1
2 blab 3 0 2
5 dfkdj 3 0 3
4 dld 3 0 4
6 oeroe 3 1 5
3 blab 3 0 6
9 dfkdj 3 0 7
14 dld 3 0 8
54 oeroe 3 0 9
...
result of select * from data
id descr offset anchor position
2 blab 3 0 2
5 dfkdj 3 0 3
4 dld 3 0 4
6 oeroe 3 1 5
3 blab 3 0 6
9 dfkdj 3 0 7
14 dld 3 0 8
I.E. the result is the anchor row and the tree rows above and below.
How can I achieve the same within a database view?
My attempt failed as I expected by performance issues:
Create a view data of with data as select above
Use this view as above
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1)
Thank you for any advice :-)
Amendment
If I create a view as recommended in first comment, than I get the same performance issue. Oracle does not use the subquery to restrict the results.
Here are the execution plans of my production queries (please click at the images)
a) SQL
b) View
Here are the execution plans of my test cases
-- Create Testdata table with ~ 1,000,000 entries
insert into a_table
(id, descr, a_position_field, anchor_id, horizon_id, a_value)
select level, 'data' || level, mod(level, 10), level, 1, level
from dual
connect by level <= 999999;
insert into anchor_table
(id, a_date)
select level, trunc(sysdate) - 500000 + level
from dual
connect by level <= 999999;
insert into horizon_table (id, offset) values (1, 50);
commit;
-- Create view
create or replace view testdata_vw as
with data as
(select a_table.id,
a_table.descr,
a_table.a_value,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(order by a_table.a_position_field) as position
from a_table
join anchor_table
on (anchor_table.id = a_table.anchor_id)
join horizon_table
on (horizon_table.id = a_table.horizon_id))
select *
from data d
where d.position between
(select d1.position - d.offset from data d1 where d1.anchor = 1) and
(select d2.position + d.offset from data d2 where d2.anchor = 1);
-- Explain plan of subquery factoring select statement
explain plan for
with data as
(select a_table.id,
a_table.descr,
a_value,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(order by a_table.a_position_field) as position
from a_table
join anchor_table
on (anchor_table.id = a_table.anchor_id)
join horizon_table
on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between 500000 - 500 and 500000 + 500)
select *
from data d
where d.position between
(select d1.position - d.offset from data d1 where d1.anchor = 1) and
(select d2.position + d.offset from data d2 where d2.anchor = 1);
select plan_table_output
from table(dbms_xplan.display('plan_table', null, null));
/*
Note: Size of SYS_TEMP_0FD9D6628_284C5768 ~ 1000 rows
Plan hash value: 1145408420
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 62 | 1791 (2)| 00:00:31 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6628_284C5768 | | | | |
| 3 | WINDOW SORT | | 57 | 6840 | 1785 (2)| 00:00:31 |
|* 4 | HASH JOIN | | 57 | 6840 | 1784 (2)| 00:00:31 |
|* 5 | TABLE ACCESS FULL | A_TABLE | 57 | 4104 | 1193 (2)| 00:00:21 |
| 6 | MERGE JOIN CARTESIAN | | 1189K| 54M| 586 (2)| 00:00:10 |
| 7 | TABLE ACCESS FULL | HORIZON_TABLE | 1 | 26 | 3 (0)| 00:00:01 |
| 8 | BUFFER SORT | | 1189K| 24M| 583 (2)| 00:00:10 |
| 9 | TABLE ACCESS FULL | ANCHOR_TABLE | 1189K| 24M| 583 (2)| 00:00:10 |
|* 10 | FILTER | | | | | |
| 11 | VIEW | | 57 | 3534 | 2 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
|* 13 | VIEW | | 57 | 912 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
|* 15 | VIEW | | 57 | 912 | 2 (0)| 00:00:01 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("HORIZON_TABLE"."ID"="A_TABLE"."HORIZON_ID" AND
"ANCHOR_TABLE"."ID"="A_TABLE"."ANCHOR_ID")
5 - filter("A_TABLE"."A_VALUE">=499500 AND "A_TABLE"."A_VALUE"<=500500)
10 - filter("D"."POSITION">= (SELECT "D1"."POSITION"-:B1 FROM (SELECT + CACHE_TEMP_TABLE
("T1") "C0" "ID","C1" "DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D6628_284C5768" "T1") "D1" WHERE "D1"."ANCHOR"=1) AND "D"."POSITION"<=
(SELECT "D2"."POSITION"+:B2 FROM (SELECT + CACHE_TEMP_TABLE ("T1") "C0" "ID","C1"
"DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D6628_284C5768" "T1") "D2" WHERE "D2"."ANCHOR"=1))
13 - filter("D1"."ANCHOR"=1)
15 - filter("D2"."ANCHOR"=1)
Note
-----
- dynamic sampling used for this statement (level=4)
*/
-- Explain plan of database view
explain plan for
select *
from testdata_vw
where a_value between 500000 - 500 and 500000 + 500;
select plan_table_output
from table(dbms_xplan.display('plan_table', null, null));
/*
Note: Size of SYS_TEMP_0FD9D662A_284C5768 ~ 1000000 rows
Plan hash value: 1422141561
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2973 | 180K| | 50324 (1)| 00:14:16 |
| 1 | VIEW | TESTDATA_VW | 2973 | 180K| | 50324 (1)| 00:14:16 |
| 2 | TEMP TABLE TRANSFORMATION | | | | | | |
| 3 | LOAD AS SELECT | SYS_TEMP_0FD9D662A_284C5768 | | | | | |
| 4 | WINDOW SORT | | 1189K| 136M| 147M| 37032 (1)| 00:10:30 |
|* 5 | HASH JOIN | | 1189K| 136M| | 6868 (1)| 00:01:57 |
| 6 | TABLE ACCESS FULL | HORIZON_TABLE | 1 | 26 | | 3 (0)| 00:00:01 |
|* 7 | HASH JOIN | | 1189K| 106M| 38M| 6860 (1)| 00:01:57 |
| 8 | TABLE ACCESS FULL | ANCHOR_TABLE | 1189K| 24M| | 583 (2)| 00:00:10 |
| 9 | TABLE ACCESS FULL | A_TABLE | 1209K| 83M| | 1191 (2)| 00:00:21 |
|* 10 | FILTER | | | | | | |
|* 11 | VIEW | | 1189K| 70M| | 4431 (1)| 00:01:16 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
|* 13 | VIEW | | 1189K| 18M| | 4431 (1)| 00:01:16 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
|* 15 | VIEW | | 1189K| 18M| | 4431 (1)| 00:01:16 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("HORIZON_TABLE"."ID"="A_TABLE"."HORIZON_ID")
7 - access("ANCHOR_TABLE"."ID"="A_TABLE"."ANCHOR_ID")
10 - filter("D"."POSITION">= (SELECT "D1"."POSITION"-:B1 FROM (SELECT + CACHE_TEMP_TABLE ("T1")
"C0" "ID","C1" "DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D662A_284C5768" "T1") "D1" WHERE "D1"."ANCHOR"=1) AND "D"."POSITION"<= (SELECT
"D2"."POSITION"+:B2 FROM (SELECT + CACHE_TEMP_TABLE ("T1") "C0" "ID","C1" "DESCR","C2"
"A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM "SYS"."SYS_TEMP_0FD9D662A_284C5768" "T1") "D2"
WHERE "D2"."ANCHOR"=1))
11 - filter("A_VALUE">=499500 AND "A_VALUE"<=500500)
13 - filter("D1"."ANCHOR"=1)
15 - filter("D2"."ANCHOR"=1)
Note
-----
- dynamic sampling used for this statement (level=4)
*/
sqlfiddle
explain plan of sql http://www.sqlfiddle.com/#!4/6a7022/3
explain plan of view http://www.sqlfiddle.com/#!4/6a7022/2

You need to write a view definition which returns all possible selectable ranges of a_value as two columns, start_a_value and end_a_value, along with all records which fall into each start/end range. In other words, the correct view definition should logically describe a |n^3| result set given n rows in a_table.
Then query that view as:
SELECT * FROM testdata_vw WHERE START_A_VALUE = 4950 AND END_A_VALUE = 5050;
Also, your multiple references to "data" are unnecessary; same logic can be delivered with an additional analytic function.
Final view def:
CREATE OR REPLACE VIEW testdata_vw AS
SELECT *
FROM
(
SELECT T.*,
MAX(CASE WHEN ANCHOR=1 THEN POSITION END)
OVER (PARTITION BY START_A_VALUE, END_A_VALUE) ANCHOR_POS
FROM
(
SELECT S.A_VALUE START_A_VALUE,
E.A_VALUE END_A_VALUE,
B.ID ID,
B.DESCR DESCR,
HORIZON_TABLE.OFFSET OFFSET,
CASE
WHEN ANCHOR_TABLE.A_DATE = TRUNC(SYSDATE)
THEN 1
ELSE 0
END ANCHOR,
ROW_NUMBER()
OVER(PARTITION BY S.A_VALUE, E.A_VALUE
ORDER BY B.A_POSITION_FIELD) POSITION
FROM
A_TABLE S
JOIN A_TABLE E
ON S.A_VALUE<E.A_VALUE
JOIN A_TABLE B
ON B.A_VALUE BETWEEN S.A_VALUE AND E.A_VALUE
JOIN ANCHOR_TABLE
ON ANCHOR_TABLE.ID = B.ANCHOR_ID
JOIN HORIZON_TABLE
ON HORIZON_TABLE.ID = B.HORIZON_ID
) T
) T
WHERE POSITION BETWEEN ANCHOR_POS - OFFSET AND ANCHOR_POS+OFFSET;
EDIT: SQL Fiddle with expected execution plan
I'm seeing the same (sensible) plan here that I saw in my database; if you're getting something different, please send fiddle link.
Use index lookup to find 1 row in "S" A_TABLE (A_VALUE = 4950)
Use index lookup to find 1 row in "E" A_TABLE (A_VALUE = 5050)
Nested Loop join #1 and #2 (1 x 1 join, still 1 row)
FTS 1 row from HORIZON table
Cartesian join #1 and #2 (1 x 1, okay to use Cartesian).
Use index lookup to find ~100 rows in "B" A_TABLE with values between 4950 and 5050.
Cartesian join #5 and #6 (1 x 102, okay to use Cartesian).
FTS ANCHOR_TABLE with hash join to #7.
Window-sort for analytic functions

You have a predicate outside the view and you want to be applied in the view.
For this, you can use push_pred hint:
select /*+PUSH_PRED(v)*/
*
from
testdata_vw v
where
a_value between 5000 - 50 and 5000 + 50;
SQLFIDDLE
EDIT: Now I've seen that you use the data subquery three times. For the first occurrence it makes sense to push the predicate, but for d1 and d2 it doesn't. It's another query.
What would I do is to use two context variables, set them according my needs and write the query:
SYS_CONTEXT('my_context_name', 'var5000');
create or replace view testdata_vw as
with data as (
select
a_table.id,
a_table.descr,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(
order by a_table.a_position_field) as position
from a_table
join anchor_table on (anchor_table.id = a_table.anchor_id)
join horizon_table on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between SYS_CONTEXT('my_context_name', 'var5000') - SYS_CONTEXT('my_context_name', 'var50') and SYS_CONTEXT('my_context_name', 'var5000') + SYS_CONTEXT('my_context_name', 'var50')
)
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1) ;
to use it:
dbms_session.set_context ('my_context_name', 'var5000', 5000);
dbms_session.set_context ('my_context_name', 'var50', 50);
select * from testdata_vw;
UPDATE: Instead of context variables(which can be used across sessions) you can use package variables as you commented.

max() issue in Oracle SQL [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Closed 8 years ago.
I'm using Oracle SQL and i need some help with max() function.
I have the following table:
ID | Type | Price | Quantity
1 | A | 10 | 2
2 | B | 5 | 5
3 | C | 10 | 3
4 | A | 8 | 7
5 | A | 6 | 9
6 | A | 7 | 5
7 | B | 15 | 3
8 | A | 20 | 4
9 | A | 3 | 7
10 | B | 11 | 8
I need to aggregate the table by Type column. For each group of Type (A, B, C), i need to select the price and the quantity of max(id).
I this case:
ID | Type | Price | Quantity
9 | A | 3 | 7
10 | B | 11 | 8
3 | C | 10 | 3
Any Suggestion?

max won't help you with this. You can use the row_number partitioning function.
select id, type, price, quantity
from
(
select yourtable.*,
row_number() over (partition by type order by id desc) rn
from yourtable
) v
where rn = 1

Something like this:
Select t.* From
(Select Max(ID) As ID From Table
Group By Type) tmp
Join Table t On t.ID = tmp.ID

Oracle select two (or more) adjacent rows having the same value for a given column

How do I do the following in Oracle:
I have a (simplified) table:
+-----+-----+-----+
| a | b | ... |
+-----+-----+-----+
| 1 | 7 | ... |
| 2 | 5 | ... |
| 1 | 7 | ... |
+-----+-----+-----+
Where a functions as a unique identifier for a person, and b is the field I am interested in matching across rows. How do I construct a query that basically says "give me the person-ID's where the person has multiple b values (i.e., duplicates)"?
So far I have tried:
SELECT a FROM mytable GROUP BY a HAVING COUNT(DISTINCT b) > 1;
This feels close except it just gives me the user IDs where the user has multiple unique b's, which I suspect is coming from the DISTINCT part, but I'm not sure how to change the query to achieve what I want.

Try
group by a,b having count(b) > 1
Yours would count 7,5,7 as 2 (one 7, one 5). This one one will count total Bs in any grouping, so you'll get 1,7 - > 2 and 1,5 -> 1

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE mytable ( a, b ) AS
SELECT LEVEL, LEVEL FROM DUAL CONNECT BY LEVEL <= 2000
UNION ALL
SELECT LEVEL *2, LEVEL * 2 FROM DUAL CONNECT BY LEVEL <= 1000;
Query 1:
WITH data AS (
SELECT a
FROM mytable
GROUP BY a
HAVING COUNT(b) > COUNT( DISTINCT b )
ORDER BY a
),
numbered AS (
SELECT a,
ROWNUM AS rn
FROM data
)
SELECT a
FROM numbered
WHERE rn <= 20
Results:
| A |
|----|
| 2 |
| 4 |
| 6 |
| 8 |
| 10 |
| 12 |
| 14 |
| 16 |
| 18 |
| 20 |
| 22 |
| 24 |
| 26 |
| 28 |
| 30 |
| 32 |
| 34 |
| 36 |
| 38 |
| 40 |

SQL bring back highest sum of rows

I'm looking to calculate the highest basket in my set of data but I can't get my head around how I should do it.
I have data like:
OrderID | CustomerID | BasketID | ProductID | Price
1 | 1 | 1 | 221 | 10
2 | 1 | 1 | 431 | 123
3 | 1 | 2 | 761 | 44
4 | 2 | 3 | 12 | 54
5 | 2 | 3 | 102 | 78
6 | 3 | 4 | 111 | 98
7 | 3 | 4 | 41 | 45
8 | 3 | 5 | 65 | 66
9 | 4 | 6 | 32 | 47
10 | 4 | 6 | 118 | 544
Sorry if it seems quite messy.
But I can easily get the SUM with an obvious
SELECT SUM([Price]), BasketID, CustomerID FROM table
GROUP BY BasketID, CustomerID
But how can I filter the list for only the highest priced Basket ID for that CustomerID
Thanks

You can use a CTE (Common Table Expression) with the ROW_NUMBER function:
;WITH HighestPricePerCustomerAndBasket AS
(
SELECT
ID, UserID, ClassID, SchoolID, Created,
ROW_NUMBER() OVER(PARTITION BY BasketID,CustomerID ORDER BY Price DESC) AS 'RowNum'
FROM dbo.YourTable
)
SELECT
[Price], BasketID, CustomerID
FROM HighestPricePerCustomerAndBasket
WHERE RowNum = 1
This CTE "partitions" your data by BasketID,CustomerID, and for each partition, the ROW_NUMBER function hands out sequential numbers, starting at 1 and ordered by Price DESC - so the first row (highest price) gets RowNum = 1 (for each BasketID,CustomerID "partition") which is what I select from the CTE in the SELECT statement after it.

SELECT *
FROM (SELECT *,
DENSE_RANK() OVER (PARTITION BY CustomerID ORDER BY BasketTotal DESC) AS RNK
FROM (SELECT Sum(Price) AS BasketTotal,
BasketID,
CustomerID
FROM Order a
GROUP BY BasketID,
CustomerID
) a
) b
WHERE RNK = 1
I managed to conjure something up that worked.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Row count by group - oracle

Related

Count Max number of day continious overdraff

How to reuse sql with subquery factoring by database view

max() issue in Oracle SQL [duplicate]

Oracle select two (or more) adjacent rows having the same value for a given column

SQL bring back highest sum of rows

Categories

Resources