Select N Row in Oracle - oracle

Suppose we have the following data:
Key Value Desired Rank
--- ----- ------------
P1 0.6 2
P1 0.6 2
P1 0.6 2
P2 0.8 1
P2 0.8 1
P3 0.6 3
P3 0.6 3
I want to select the distinct keys, ordered by Value DESC, and display them in a grid that supports pagination.
I don’t know how to generate the rank shown in the Desired Rank column, which I need in order to paginate correctly over the data set.
When I tried to use: DENSE_RANK() OVER(ORDER BY value), the result was
Key Value DENSE_RANK() OVER(ORDER BY value)
--- ----- ------------
P1 0.6 2
P1 0.6 2
P1 0.6 2
P2 0.8 1
P2 0.8 1
P3 0.6 2
P3 0.6 2
When I try to select the first two keys “rank between 1 and 2” I receive back 3 keys. And this ruins the required pagination mechanism.
Any ideas?
Thanks

If you want the distinct keys and values, why not use distinct?
select distinct
t.Key,
t.Value
from
YourTable t
order by
t.Value desc
Do you actually need the rank?
If you do, you still could
select distinct
t.Key,
t.Value,
dense_rank() over (order by t.Value desc, t.Key) as Rank
from
YourTable t
order by
t.Value desc
This would work without the distinct as well.

'When I try to select the first two keys “rank between 1 and 2” I receive back 3 keys.'
That is because you are ordering just by VALUE, so all KEYS with the same value are assigned the same rank. So you need to include the KEY in the ordering clause. Like this:
DENSE_RANK() OVER (ORDER BY value DESC, key ASC)
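To make the pagination concrete, here is a minimal sketch that ranks the distinct keys and then filters on the rank. The inline sample_data block and the rnk alias are assumptions added for illustration; only the Key/Value data comes from the question.

```sql
-- sample_data reproduces the rows from the question (illustrative only).
WITH sample_data AS (
  SELECT 'P1' AS key, 0.6 AS value FROM DUAL UNION ALL
  SELECT 'P1', 0.6 FROM DUAL UNION ALL
  SELECT 'P1', 0.6 FROM DUAL UNION ALL
  SELECT 'P2', 0.8 FROM DUAL UNION ALL
  SELECT 'P2', 0.8 FROM DUAL UNION ALL
  SELECT 'P3', 0.6 FROM DUAL UNION ALL
  SELECT 'P3', 0.6 FROM DUAL
)
SELECT key, value, rnk
FROM (
  -- Ordering by value first and key second gives every key its own rank,
  -- while duplicate rows of the same key still share that rank.
  SELECT DISTINCT key, value,
         DENSE_RANK() OVER (ORDER BY value DESC, key ASC) AS rnk
  FROM sample_data
)
WHERE rnk BETWEEN 1 AND 2   -- first page, two keys per page
ORDER BY rnk;
```

With the sample rows this page contains P2 (rank 1) and P1 (rank 2); P3 has rank 3 and lands on the next page.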

Related

Why when using grouping_id with rollup the total will be level 3 instead of 2?

Is there any logic behind why the total grouping, when using rollup, is "lvl" 3?
example:
col1 col2
---- ----
id1  1
id1  2
id2  1
id2  2
id3  1
id3  2
With the cube method this is understandable:
level 0 is the "basic values" (my own term), such as "col1-id1", "col2-id1", "col3-id1", etc.;
level 1 is the subtotal row of each basic value, i.e. the subtotal of id1(1) + id1(2) --> id1(3), for example;
level 2 is the total of each combination, i.e. the subtotal of the 1's and the subtotal of the 2's; in this example the 1-subtotal is 3 and the 2-subtotal is 6;
and level 3 is the grand total of them all, in this example 9.
My explanation might not make any sense :) .. sorry.
Is there any reason for skipping from lvl 0/1 straight to 3, or is that just the way it is?
Summary: The grand total is in the grouping set whose id is the maximum possible value, which is 2^(number of columns in the grouping set) - 1. Therefore, with 2 columns being cubed, the grand total is in 2^2 - 1 = 3 and, with 3 columns being cubed, the grand total is in 2^3 - 1 = 7.
Given the query:
SELECT LISTAGG(col1, ',') WITHIN GROUP (ORDER BY col1) AS col1,
LISTAGG(col2, ',') WITHIN GROUP (ORDER BY col2) AS col2,
GROUPING_ID(col1, col2) AS grp
FROM table_name
GROUP BY CUBE(col1, col2)
ORDER BY grp, col1, col2
Which, for the sample data:
CREATE TABLE table_name (col1, col2) AS
SELECT 'id1', 1 FROM DUAL UNION ALL
SELECT 'id1', 2 FROM DUAL UNION ALL
SELECT 'id2', 1 FROM DUAL UNION ALL
SELECT 'id2', 2 FROM DUAL UNION ALL
SELECT 'id3', 1 FROM DUAL UNION ALL
SELECT 'id3', 2 FROM DUAL;
Outputs:
COL1                    COL2        GRP
----------------------- ----------- ---
id1                     1           0
id1                     2           0
id2                     1           0
id2                     2           0
id3                     1           0
id3                     2           0
id1,id1                 1,2         1
id2,id2                 1,2         1
id3,id3                 1,2         1
id1,id2,id3             1,1,1       2
id1,id2,id3             2,2,2       2
id1,id1,id2,id2,id3,id3 1,1,1,2,2,2 3
You can see what each value of GROUPING_ID(col1, col2) means:
0: The un-grouped value.
1: The value grouped by the first column in the grouping set (and there is one value for the first column and N values for the second column).
2: The value grouped by the second column in the grouping set (and there are M values for the first column and one value for the second column).
3: The value grouped by both columns in the grouping set (and there are M values for the first column and N values for the second column, giving N*M total values); this gives you the grand total.
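The ids above are a bit vector built from the individual GROUPING() flags, with the first argument as the most significant bit. As a small sketch against the same table_name sample data (the g1, g2 and computed_id aliases are added here for illustration), the two can be shown side by side:

```sql
SELECT col1,
       col2,
       GROUPING(col1) AS g1,   -- 1 when col1 is aggregated away in this row
       GROUPING(col2) AS g2,   -- 1 when col2 is aggregated away in this row
       2 * GROUPING(col1) + GROUPING(col2) AS computed_id,
       GROUPING_ID(col1, col2) AS grp
FROM table_name
GROUP BY CUBE(col1, col2)
ORDER BY grp, col1, col2;
```

In every row computed_id equals grp. The grand total row has both flags set, so its id is binary 11, which is 3.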
If you had the sample data with 3 columns:
CREATE TABLE table2 (col1, col2, col3) AS
SELECT t1.COLUMN_VALUE,
t2.COLUMN_VALUE,
t3.COLUMN_VALUE
FROM TABLE(SYS.ODCIVARCHAR2LIST('id1', 'id2', 'id3')) t1
CROSS JOIN TABLE(SYS.ODCINUMBERLIST(1, 2)) t2
CROSS JOIN TABLE(SYS.ODCINUMBERLIST(3, 4)) t3;
Then using CUBE across 3 columns:
SELECT LISTAGG(col1, ',') WITHIN GROUP (ORDER BY col1) AS col1,
LISTAGG(col2, ',') WITHIN GROUP (ORDER BY col2) AS col2,
LISTAGG(col3, ',') WITHIN GROUP (ORDER BY col3) AS col3,
GROUPING_ID(col1, col2, col3) AS grp
FROM table2
GROUP BY CUBE(col1, col2, col3)
ORDER BY grp, col1, col2, col3
Outputs:
COL1                                            COL2                    COL3                    GRP
----------------------------------------------- ----------------------- ----------------------- ---
id1                                             1                       3                       0
id1                                             1                       4                       0
id1                                             2                       3                       0
id1                                             2                       4                       0
id2                                             1                       3                       0
id2                                             1                       4                       0
id2                                             2                       3                       0
id2                                             2                       4                       0
id3                                             1                       3                       0
id3                                             1                       4                       0
id3                                             2                       3                       0
id3                                             2                       4                       0
id1,id1                                         1,1                     3,4                     1
id1,id1                                         2,2                     3,4                     1
id2,id2                                         1,1                     3,4                     1
id2,id2                                         2,2                     3,4                     1
id3,id3                                         1,1                     3,4                     1
id3,id3                                         2,2                     3,4                     1
id1,id1                                         1,2                     3,3                     2
id1,id1                                         1,2                     4,4                     2
id2,id2                                         1,2                     3,3                     2
id2,id2                                         1,2                     4,4                     2
id3,id3                                         1,2                     3,3                     2
id3,id3                                         1,2                     4,4                     2
id1,id1,id1,id1                                 1,1,2,2                 3,3,4,4                 3
id2,id2,id2,id2                                 1,1,2,2                 3,3,4,4                 3
id3,id3,id3,id3                                 1,1,2,2                 3,3,4,4                 3
id1,id2,id3                                     1,1,1                   3,3,3                   4
id1,id2,id3                                     1,1,1                   4,4,4                   4
id1,id2,id3                                     2,2,2                   3,3,3                   4
id1,id2,id3                                     2,2,2                   4,4,4                   4
id1,id1,id2,id2,id3,id3                         1,1,1,1,1,1             3,3,3,4,4,4             5
id1,id1,id2,id2,id3,id3                         2,2,2,2,2,2             3,3,3,4,4,4             5
id1,id1,id2,id2,id3,id3                         1,1,1,2,2,2             3,3,3,3,3,3             6
id1,id1,id2,id2,id3,id3                         1,1,1,2,2,2             4,4,4,4,4,4             6
id1,id1,id1,id1,id2,id2,id2,id2,id3,id3,id3,id3 1,1,1,1,1,1,2,2,2,2,2,2 3,3,3,3,3,3,4,4,4,4,4,4 7
This generates 2^3 = 8 levels (from 0 to 7), since those are all the possible combinations of grouping 3 columns, and the grand total is in level 7; compare that with the 2^2 = 4 levels (0 to 3) when you are cubing 2 columns, where the grand total is in level 3.
Update
What I don't understand is why roll up is skipping level 2 straight to 3?
From the SELECT documentation:
ROLLUP
The ROLLUP operation in the simple_grouping_clause groups the selected rows based on the values of the first n, n-1, n-2, ... 0 expressions in the GROUP BY specification, and returns a single row of summary for each group.
[...]
CUBE
The CUBE operation in the simple_grouping_clause groups the selected rows based on the values of all possible combinations of expressions in the specification. It returns a single row of summary information for each group.
CUBE generates all possible grouping sets; ROLLUP generates groups of the first 1 column, then the first 2 columns, 3 columns, ..., up to n columns, which is the same as CUBE when the grouping sets are restricted to the ids 2^0 - 1, 2^1 - 1, 2^2 - 1, ..., 2^n - 1 (or more simply 0, 1, 3, 7, ..., 2^n - 1).
This means that ROLLUP will skip the grouping set with id 2 as that is grouping only by the 2nd column and that is not "one of the first n, n-1, n-2, ... 0 expressions" in the GROUP BY specification.
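As a quick check, the following sketch runs ROLLUP against the same table_name sample data; the cnt alias is added here for illustration.

```sql
SELECT col1,
       col2,
       COUNT(*) AS cnt,
       GROUPING_ID(col1, col2) AS grp
FROM table_name
GROUP BY ROLLUP(col1, col2)
-- ROLLUP(col1, col2) is shorthand for:
--   GROUP BY GROUPING SETS ((col1, col2), (col1), ())
ORDER BY grp, col1, col2;
```

The result contains six rows with grp = 0, three rows with grp = 1 (one per col1 value) and one grand-total row with grp = 3. No row has grp = 2, because the grouping set (col2) is not among those that ROLLUP generates.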

Is it possible to add distinct to part of a sum clause in Oracle?

I have a pretty lengthy SQL query which I'm going to run on Oracle via hibernate. It consists of two nested selects. In the first select statement, a number of sums are calculated, but in one of them I want to filter the results using unique ids.
SELECT ...
SUM(NVL(CASE WHEN SECOND_STATUS= 50 OR SECOND_STATUS IS NULL THEN RECEIVE_AMOUNT END, 0) +
NVL(CASE WHEN FIRST_STATUS = 1010 THEN AMOUNT END, 0) +
NVL(CASE WHEN FIRST_STATUS = 1030 THEN AMOUNT END, 0) -
NVL(CASE WHEN FIRST_STATUS = 1010 AND (SECOND_STATUS= 50 OR SECOND_STATUS IS NULL) THEN RECEIVE_AMOUNT END, 0)) TOTAL, ...
And at the end:
... FROM (SELECT s.*, p.* FROM FIRST_TABLE s
JOIN SECOND_TABLE p ON s.ID = p.FIRST_ID
In one of the lines that start with NVL (second line actually), I want to add a distinct clause that sums the amounts only if first table ids are unique. But I don't know if this is possible or not. If yes, how would it be?
Assume the following setup:
select * from first;
ID AMOUNT
---------- ----------
1 10
2 20
select * from second;
SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ----------
1 1 100
2 1 100
3 2 100
After the join, the total of both amounts comes out too high because the amount from the first table is duplicated.
select *
from first
join second on first.id = second.first_id;
ID AMOUNT SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ---------- ---------- ----------
1 10 1 1 100
1 10 2 1 100
2 20 3 2 100
You must add a row_number that identifies the first occurrence of each parent row, keep AMOUNT only on that first row, and reset it to NULL on the duplicated rows.
select ID,
case when row_number() over (partition by id order by second_id) = 1 then AMOUNT end as AMOUNT,
SECOND_ID, FIRST_ID, AMOUNT2
from first
join second on first.id = second.first_id;
ID AMOUNT SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ---------- ---------- ----------
1 10 1 1 100
1 2 1 100
2 20 3 2 100
Now you can safely sum in a separate subquery
with tab as (
select ID,
case when row_number() over (partition by id order by second_id) = 1 then AMOUNT end as AMOUNT,
SECOND_ID, FIRST_ID, AMOUNT2
from first
join second on first.id = second.first_id
)
select id, sum(nvl(amount,0) + nvl(amount2,0))
from tab
group by id
;
ID SUM(NVL(AMOUNT,0)+NVL(AMOUNT2,0))
---------- ---------------------------------
1 210
2 120
Note also that this is an answer to your question as asked. Some will argue that in your situation you should first aggregate and then join. This would also resolve your problem, possibly more elegantly.
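For comparison, a minimal sketch of the aggregate-then-join alternative on the same first and second tables; the f and s aliases and the amount2_total and total column names are made up for this example.

```sql
select f.id,
       nvl(f.amount, 0) + nvl(s.amount2_total, 0) as total
from first f
join (
       -- Collapse the child rows to one row per parent before joining,
       -- so the parent AMOUNT is never duplicated.
       select first_id, sum(amount2) as amount2_total
       from second
       group by first_id
     ) s
  on f.id = s.first_id
order by f.id;
```

On the sample data this returns the same totals as the row_number approach: 210 for ID 1 and 120 for ID 2.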

Oracle: prioritizing results based on column’s value

I have a data-set in which there are duplicate IDs in the first column. I'm hoping to obtain a single row of data for each ID based on the second column's value. The data looks like so:
ID Info_Source Prior?
A 1 Y
A 3 N
A 2 Y
B 1 N
B 1 N
B 2 Y
C 2 N
C 3 Y
C 1 N
Specifically the criteria would call for prioritizing based on the second column's value (3 highest priority; then 1; and lastly 2): if the 'Info_Source' column has a value of 3, return that row; if there is no 3 in the second column for a given ID, look for a 1 and if found return that row; and finally if there is no 3 or 1 associated with the ID, search for 2 and return that row for the ID.
The desired results would be a single row for each ID, and the resulting data would be:
ID Info_Source Prior?
A 3 N
B 1 N
C 3 Y
row_number() over() usually solves these needs nicely and efficiently e.g.
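-- your_table is a placeholder for the real table name.
-- PRIOR is a reserved word in Oracle, so the third column is called Prior_Flag here.
select ID, Info_Source, Prior_Flag
from (
select ID, Info_Source, Prior_Flag
, row_number() over(partition by id order by Info_source DESC) as rn
from your_table
)
where rn = 1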
For prioritizing the second column's value (3 ; then 1, then 2) use a case expression to alter the raw value into an order that you need.
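-- your_table and Prior_Flag are placeholder names (PRIOR itself is reserved in Oracle).
select ID, Info_Source, Prior_Flag
from (
select ID, Info_Source, Prior_Flag
, row_number() over(partition by id
order by case when Info_source = 3 then 3
when Info_source = 1 then 2
else 1 end DESC) as rn
from your_table
)
where rn = 1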
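A self-contained sketch of the same idea, with the question's rows inlined so it can be run as is. The sample_data name and the prior_flag column name are assumptions made for this example; the CASE here maps the highest priority to the lowest number, so the ordering is ascending.

```sql
with sample_data as (
  select 'A' as id, 1 as info_source, 'Y' as prior_flag from dual union all
  select 'A', 3, 'N' from dual union all
  select 'A', 2, 'Y' from dual union all
  select 'B', 1, 'N' from dual union all
  select 'B', 1, 'N' from dual union all
  select 'B', 2, 'Y' from dual union all
  select 'C', 2, 'N' from dual union all
  select 'C', 3, 'Y' from dual union all
  select 'C', 1, 'N' from dual
)
select id, info_source, prior_flag
from (
  select id, info_source, prior_flag,
         row_number() over (
           partition by id
           -- priority: 3 first, then 1, then anything else (here: 2)
           order by case info_source when 3 then 1 when 1 then 2 else 3 end
         ) as rn
  from sample_data
)
where rn = 1
order by id;
```

This returns A 3 N, B 1 N and C 3 Y, matching the desired results. For B the two candidate rows are identical, so it does not matter which one row_number() picks.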

Is there an algorithm that can divide a number into three parts and have their totals match the original number?

For example, consider the following:
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.33 - 3rd divided by 3
99.99 - Is the sum of the 3 division outcomes
But I want it to match the original 100.00.
One way I saw it could be done is to take the original number minus the first two divisions, and use the result as the third number. If I now add up those 3 numbers, I get my original number.
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.34 - 3rd number
100.00 - Which gives me my original number correctly. (33.33+33.33+33.34 = 100.00)
Is there a formula for this either in Oracle PL/SQL or a function or something that could be implemented?
Thanks in advance!
This version takes precision as a parameter as well:
with q as (select 100 as val, 3 as parts, 2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts
no v
=== =====
1 33.33
2 33.33
3 33.34
For example, if you want to split the value among the number of days in the current month, you can do this:
with q as (select 100 as val
,extract(day from last_day(sysdate)) as parts
,2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts;
1 3.33
2 3.33
3 3.33
4 3.33
...
27 3.33
28 3.33
29 3.33
30 3.43
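Putting the whole remainder on the last part makes that part noticeably larger (3.43 against 3.33 above). A different technique, not the one used in this answer, is to hand the remainder out one smallest unit at a time, so that no two parts differ by more than one unit. The sketch below assumes val >= 0; the n subquery and the part_no alias are names made up for this example.

```sql
with q as (select 100 as val, 30 as parts, 2 as prec from dual),
     n as (select level as part_no, val, parts, prec
           from q
           connect by level <= parts)
select part_no,
       trunc(val / parts, prec)
       + case
           -- number of smallest units left over after giving every part
           -- the truncated share; the first that many parts get one extra unit
           when part_no <= round((val - trunc(val / parts, prec) * parts)
                                 * power(10, prec))
           then power(10, -prec)
           else 0
         end as v
from n
order by part_no;
```

With these inputs the truncated share is 3.33 and 10 units of 0.01 are left over, so parts 1 to 10 get 3.34 and parts 11 to 30 get 3.33, which again adds up to 100.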
To apportion the value amongst each month, weighted by the number of days in each month, you could do this instead (change the level <= 3 to change the number of months it is calculated for):
with q as (
select add_months(date '2013-07-01', rownum-1) the_month
,extract(day from last_day(add_months(date '2013-07-01', rownum-1)))
as days_in_month
,100 as val
,2 as prec
from dual
connect by level <= 3)
,q2 as (
select the_month, val, prec
,round(val * days_in_month
/ sum(days_in_month) over (), prec)
as apportioned
,row_number() over (order by the_month desc)
as reverse_rn
from q)
select the_month
,case when reverse_rn = 1
then val - sum(apportioned) over (order by the_month
rows between unbounded preceding and 1 preceding)
else apportioned
end as portion
from q2;
01/JUL/13 33.7
01/AUG/13 33.7
01/SEP/13 32.6
Use rational numbers. You could store the numbers as fractions rather than simple values. That's the only way to assure that the quantity is truly split in 3, and that it adds up to the original number. Sure you can do something hacky with rounding and remainders, as long as you don't care that the portions are not exactly split in 3.
The "algorithm" is simply that
100/3 + 100/3 + 100/3 == 300/3 == 100
Store both the numerator and the denominator in separate fields, then add the numerators. You can always convert to floating point when you display the values.
The Oracle docs even have a nice example of how to implement it:
CREATE TYPE rational_type AS OBJECT
( numerator INTEGER,
denominator INTEGER,
MAP MEMBER FUNCTION rat_to_real RETURN REAL,
MEMBER PROCEDURE normalize,
MEMBER FUNCTION plus (x rational_type)
RETURN rational_type);
Here is a parameterized SQL version
SELECT COUNT (*), grp
FROM (WITH input AS (SELECT 100 p_number, 3 p_buckets FROM DUAL),
data
AS ( SELECT LEVEL id, (p_number / p_buckets) group_size
FROM input
CONNECT BY LEVEL <= p_number)
SELECT id, CEIL (ROW_NUMBER () OVER (ORDER BY id) / group_size) grp
FROM data)
GROUP BY grp
output:
COUNT(*) GRP
33 1
33 2
34 3
If you edit the input parameters (p_number and p_buckets) the SQL essentially distributes p_number as evenly as possible among the # of buckets requested (p_buckets).
I solved this problem yesterday by subtracting 2 of the 3 parts from the starting number, e.g. 100 - 33.33 - 33.33 = 33.34, so the parts still sum to 100.

Interpolation between two values in a single query

I want to calculate a value by interpolating the value between two nearest neighbours.
I have a subquery that returns the values of the neighbours and their relative distance, in the form of two columns with two elements.
Let's say:
(select ... as value, ... as distance
from [get some neighbours by distance] limit 2) as sub
How can I calculate the value of the point by linear interpolation? Is it possible to do that in a single query?
Example: My point has the neighbour A with value 10 at distance 1, and the neighbour B with value 20 at distance 4. The function should return a value 10 * 4 + 20 * 1 / 5 = 12 for my point.
I tried the obvious approach
select sum(value * (sum(distance)-distance)) / sum(distance)
which will fail because you cannot work with group clauses inside group clauses. Using another subquery returning the sum is not possible either, because then I cannot forward the individual values at the same time.
This is an ugly hack (based on an abused CTE ;). The crux of it is that
value1 * distance2 + value2 * distance1
Can, by dividing by distance1*distance2, be rewritten to
value1/distance1 + value2/distance2
So, the products (or divisions) can stay inside their rows. After the summation, multiplying by (distance1*distance2) rescales the result to the desired output. Generalisation to more than two neighbors is left as an exercise to the reader. YMMV
DROP TABLE tmp.points;
CREATE TABLE tmp.points
( pname VARCHAR NOT NULL PRIMARY KEY
, distance INTEGER NOT NULL
, value INTEGER
);
INSERT INTO tmp.points(pname, distance, value) VALUES
( 'A' , 1, 10 )
, ( 'B' , 4, 20 )
, ( 'C' , 10 , 1)
, ( 'D' , 11 , 2)
;
WITH RECURSIVE twin AS (
select 1::INTEGER AS zrank
, p0.pname AS zname
, p0.distance AS dist
, p0.value AS val
, p0.distance* p0.value AS prod
, p0.value::float / p0.distance AS frac
FROM tmp.points p0
WHERE NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance < p0.distance)
UNION
select 1+twin.zrank AS zrank
, p1.pname AS zname
, p1.distance AS dist
, p1.value AS val
, p1.distance* p1.value AS prod
, p1.value::float / p1.distance AS frac
FROM tmp.points p1, twin
WHERE p1.distance > twin.dist
AND NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance > twin.dist
AND px.distance < p1.distance
)
)
-- SELECT * from twin ;
SELECT min(zname) AS name1, max(zname) AS name2
, MIN(dist) * max(dist) *SUM(frac) / SUM(dist) AS score
FROM twin
WHERE zrank <=2
;
The result:
CREATE TABLE
INSERT 0 4
name1 | name2 | score
-------+-------+-------
A | B | 12
Update: this one is a bit cleaner ... ties are still not handled (need a window function or a LIMIT 1 clause in the outer query for that)
WITH RECURSIVE twin AS (
select 1::INTEGER AS zrank
, p0.pname AS name1
, p0.pname AS name2
, p0.distance AS dist
FROM tmp.points p0
WHERE NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance < p0.distance)
UNION
select 1+twin.zrank AS zrank
, twin.name1 AS name1
, p1.pname AS name2
, p1.distance AS dist
FROM tmp.points p1, twin
WHERE p1.distance > twin.dist
AND NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance > twin.dist
AND px.distance < p1.distance
)
)
SELECT twin.name1, twin.name2
, (p1.distance * p2.value + p2.distance * p1.value) / (p1.distance+p2.distance) AS score
FROM twin
JOIN tmp.points p1 ON (p1.pname = twin.name1)
JOIN tmp.points p2 ON (p2.pname = twin.name2)
WHERE twin.zrank =2
;
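The attempt in the question fails because aggregates cannot be nested. One way around that, sketched here in PostgreSQL under the assumption that the subquery returns exactly two neighbours, is to attach the total distance to every row with a window function first and aggregate afterwards. The inline VALUES list stands in for the real neighbour subquery, and the total and result names are made up for this example.

```sql
WITH sub (value, distance) AS (
    -- stand-in for: [get some neighbours by distance] limit 2
    VALUES (10, 1), (20, 4)
)
SELECT sum(value * (total - distance))::float / max(total) AS result
FROM (
    -- sum(distance) OVER () repeats the total distance on every row,
    -- so "total - distance" is the distance of the other neighbour
    SELECT value, distance, sum(distance) OVER () AS total
    FROM sub
) s;
```

For the example neighbours this evaluates to (10 * 4 + 20 * 1) / 5 = 12.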
If you actually want the point in between, there is a built-in way of doing that (but not an aggregate function):
SELECT center(box(x.mypoint,y.mypoint))
FROM ([get some neighbours by distance] order by value limit 1) x
,([get some neighbours by distance] order by value offset 1 limit 1) y;
If you want the mean distance:
SELECT avg(x.distance)
FROM ([get some neighbours by distance] order by value limit 2) as x
See geometrical function and aggregate functions in the manual.
Edit:
For the added example, the query could look like this:
SELECT (x.value * 4 + y.value) / 5 AS result
FROM ([get some neighbours by distance] order by value limit 1) x
,([get some neighbours by distance] order by value offset 1 limit 1) y;
I added missing () to get the result you expect!
Or, my last stab at it:
SELECT y.x, y.x[1], (y.x[1] * 4 + y.x[2]) / 5 AS result
FROM (
SELECT ARRAY(
SELECT value FROM tbl WHERE [some condition] ORDER BY value LIMIT 2
) x
) y
It would be so much easier, if you provided the full query and the table definitions.
