I want to calculate a value by interpolating the value between two nearest neighbours.
I have a subquery that returns the values of the neighbours and their relative distance, in the form of two columns with two elements.
Let's say:
(select ... as value, ... as distance
from [get some neighbours by distance] limit 2) as sub
How can I calculate the value of the point by linear interpolation? Is it possible to do that in a single query?
Example: My point has the neighbour A with value 10 at distance 1, and the neighbour B with value 20 at distance 4. The function should return a value 10 * 4 + 20 * 1 / 5 = 12 for my point.
I tried the obvious approach
select sum(value * (sum(distance)-distance)) / sum(distance)
which will fail because aggregate functions cannot be nested. Using another subquery returning the sum is not possible either, because then I cannot forward the individual values at the same time.
This is an ugly hack (based on an abused CTE ;). The crux of it is that
value1 * distance2 + value2 * distance1
can, by dividing by distance1*distance2, be rewritten to
value1/distance1 + value2/distance2
So, the products (or divisions) can stay inside their rows. After the summation, multiplying by (distance1*distance2) rescales the result to the desired output. Generalisation to more than two neighbours is left as an exercise to the reader. YMMV.
DROP TABLE tmp.points;
CREATE TABLE tmp.points
( pname VARCHAR NOT NULL PRIMARY KEY
, distance INTEGER NOT NULL
, value INTEGER
);
INSERT INTO tmp.points(pname, distance, value) VALUES
( 'A' , 1, 10 )
, ( 'B' , 4, 20 )
, ( 'C' , 10 , 1)
, ( 'D' , 11 , 2)
;
WITH RECURSIVE twin AS (
select 1::INTEGER AS zrank
, p0.pname AS zname
, p0.distance AS dist
, p0.value AS val
, p0.distance* p0.value AS prod
, p0.value::float / p0.distance AS frac
FROM tmp.points p0
WHERE NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance < p0.distance)
UNION
select 1+twin.zrank AS zrank
, p1.pname AS zname
, p1.distance AS dist
, p1.value AS val
, p1.distance* p1.value AS prod
, p1.value::float / p1.distance AS frac
FROM tmp.points p1, twin
WHERE p1.distance > twin.dist
AND NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance > twin.dist
AND px.distance < p1.distance
)
)
-- SELECT * from twin ;
SELECT min(zname) AS name1, max(zname) AS name2
, MIN(dist) * max(dist) *SUM(frac) / SUM(dist) AS score
FROM twin
WHERE zrank <=2
;
The result:
CREATE TABLE
INSERT 0 4
name1 | name2 | score
-------+-------+-------
A | B | 12
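As a quick sanity check of the rescaling trick, here is a minimal Python sketch (not part of the original answer; the function name interpolate is made up for illustration). It sums the per-row fractions value/distance and then rescales by the product of the two distances divided by their sum, which is what the final SELECT above does with MIN, MAX and SUM.

```python
from fractions import Fraction

def interpolate(neighbours):
    """Interpolate between exactly two (value, distance) pairs with integer inputs."""
    (v1, d1), (v2, d2) = neighbours
    # Each term only needs its own row: value / distance.
    frac_sum = Fraction(v1, d1) + Fraction(v2, d2)
    # Multiplying by d1 * d2 undoes the division; dividing by d1 + d2 normalises.
    return frac_sum * d1 * d2 / (d1 + d2)

print(interpolate([(10, 1), (20, 4)]))  # 12
```

Exact fractions are used here only to sidestep floating-point noise; the SQL version casts to float instead.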
Update: this one is a bit cleaner ... ties are still not handled (need a window function or a LIMIT 1 clause in the outer query for that)
WITH RECURSIVE twin AS (
select 1::INTEGER AS zrank
, p0.pname AS name1
, p0.pname AS name2
, p0.distance AS dist
FROM tmp.points p0
WHERE NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance < p0.distance)
UNION
select 1+twin.zrank AS zrank
, twin.name1 AS name1
, p1.pname AS name2
, p1.distance AS dist
FROM tmp.points p1, twin
WHERE p1.distance > twin.dist
AND NOT EXISTS ( SELECT * FROM tmp.points px
WHERE px.distance > twin.dist
AND px.distance < p1.distance
)
)
SELECT twin.name1, twin.name2
, (p1.distance * p2.value + p2.distance * p1.value) / (p1.distance+p2.distance) AS score
FROM twin
JOIN tmp.points p1 ON (p1.pname = twin.name1)
JOIN tmp.points p2 ON (p2.pname = twin.name2)
WHERE twin.zrank =2
;
If you actually want the point in between, there is a built-in way of doing that (but not an aggregate function):
SELECT center(box(x.mypoint,y.mypoint))
FROM ([get some neighbours by distance] order by value limit 1) x
,([get some neighbours by distance] order by value offset 1 limit 1) y;
If you want the mean distance:
SELECT avg(x.distance)
FROM ([get some neighbours by distance] order by value limit 2) as x
See geometric functions and aggregate functions in the manual.
Edit:
For the added example, the query could look like this:
SELECT (x.value * 4 + y.value) / 5 AS result
FROM ([get some neighbours by distance] order by value limit 1) x
,([get some neighbours by distance] order by value offset 1 limit 1) y;
I added missing () to get the result you expect!
Or, my last stab at it:
SELECT y.x, y.x[1], (y.x[1] * 4 + y.x[2]) / 5 AS result
FROM (
SELECT ARRAY(
SELECT value FROM tbl WHERE [some condition] ORDER BY value LIMIT 2
) x
) y
It would be so much easier if you provided the full query and the table definitions.
I have a data-set in which there are duplicate IDs in the first column. I'm hoping to obtain a single row of data for each ID based on the second column's value. The data looks like so:
ID Info_Source Prior?
A 1 Y
A 3 N
A 2 Y
B 1 N
B 1 N
B 2 Y
C 2 N
C 3 Y
C 1 N
Specifically the criteria would call for prioritizing based on the second column's value (3 highest priority; then 1; and lastly 2): if the 'Info_Source' column has a value of 3, return that row; if there is no 3 in the second column for a given ID, look for a 1 and if found return that row; and finally if there is no 3 or 1 associated with the ID, search for 2 and return that row for the ID.
The desired results would be a single row for each ID, and the resulting data would be:
ID Info_Source Prior?
A 3 N
B 1 N
C 3 Y
row_number() over() usually solves these needs nicely and efficiently e.g.
select ID, Info_Source, Prior
from (
select ID, Info_Source, Prior
, row_number() over(partition by id order by Info_source DESC) as rn
from your_table -- placeholder: substitute the real table name
) t
where rn = 1
To prioritize the second column's value (3, then 1, then 2), use a case expression to map the raw value onto the order that you need.
select ID, Info_Source, Prior
from (
select ID, Info_Source, Prior
, row_number() over(partition by id
order by case when Info_source = 3 then 3
when Info_source = 1 then 2
else 1 end DESC) as rn
from your_table -- placeholder: substitute the real table name
) t
where rn = 1
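To make the ranking logic concrete, here is a small Python sketch of the same idea outside the database (an illustration only; PRIORITY and pick_one_per_id are made-up names). It maps each Info_Source value to a rank, as the case expression does, and keeps the best-ranked row per ID.

```python
# Rank of each Info_Source value: 3 wins, then 1, then 2.
PRIORITY = {3: 3, 1: 2, 2: 1}

rows = [
    ("A", 1, "Y"), ("A", 3, "N"), ("A", 2, "Y"),
    ("B", 1, "N"), ("B", 1, "N"), ("B", 2, "Y"),
    ("C", 2, "N"), ("C", 3, "Y"), ("C", 1, "N"),
]

def pick_one_per_id(rows):
    best = {}
    for row in rows:
        row_id, source, _prior = row
        current = best.get(row_id)
        # Keep the first row seen with the highest rank for this ID.
        if current is None or PRIORITY[source] > PRIORITY[current[1]]:
            best[row_id] = row
    return [best[key] for key in sorted(best)]

for row in pick_one_per_id(rows):
    print(row)
```

With the sample data from the question this yields the rows for A/3/N, B/1/N and C/3/Y. Note that when two rows tie on rank (as for B), which one survives is arbitrary unless a tie-breaker is added to the ordering.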
For example, take the following into consideration.
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.33 - 3rd divided by 3
99.99 - Is the sum of the 3 division outcomes
But I want it to match the original 100.00.
One way that I saw it could be done was by taking the original number minus the first two divisions, with the result being my third number. Now if I add those 3 numbers I get my original number.
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.34 - 3rd number
100.00 - Which gives me my original number correctly. (33.33+33.33+33.34 = 100.00)
Is there a formula for this either in Oracle PL/SQL or a function or something that could be implemented?
Thanks in advance!
This version takes precision as a parameter as well:
with q as (select 100 as val, 3 as parts, 2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts
no v
=== =====
1 33.33
2 33.33
3 33.34
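The same "last part absorbs the remainder" logic can be sketched in Python, which may help when checking the query's output by hand. This is an illustration under assumptions: split_amount is a made-up name, and ROUND_HALF_UP is chosen to mimic Oracle's ROUND for positive values.

```python
from decimal import Decimal, ROUND_HALF_UP

def split_amount(val, parts, prec=2):
    quantum = Decimal(1).scaleb(-prec)  # e.g. 0.01 for prec=2
    share = (Decimal(val) / parts).quantize(quantum, rounding=ROUND_HALF_UP)
    # The last part absorbs whatever the rounding left over.
    last = Decimal(val) - share * (parts - 1)
    return [share] * (parts - 1) + [last]

print([str(p) for p in split_amount(100, 3)])  # ['33.33', '33.33', '33.34']
```

The parts always add up to the original value, at the cost of the last part differing slightly from the others.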
For example, if you want to split the value among the number of days in the current month, you can do this:
with q as (select 100 as val
,extract(day from last_day(sysdate)) as parts
,2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts;
1 3.33
2 3.33
3 3.33
4 3.33
...
27 3.33
28 3.33
29 3.33
30 3.43
To apportion the value amongst each month, weighted by the number of days in each month, you could do this instead (change the level <= 3 to change the number of months it is calculated for):
with q as (
select add_months(date '2013-07-01', rownum-1) the_month
,extract(day from last_day(add_months(date '2013-07-01', rownum-1)))
as days_in_month
,100 as val
,2 as prec
from dual
connect by level <= 3)
,q2 as (
select the_month, val, prec
,round(val * days_in_month
/ sum(days_in_month) over (), prec)
as apportioned
,row_number() over (order by the_month desc)
as reverse_rn
from q)
select the_month
,case when reverse_rn = 1
then val - sum(apportioned) over (order by the_month
rows between unbounded preceding and 1 preceding)
else apportioned
end as portion
from q2;
01/JUL/13 33.7
01/AUG/13 33.7
01/SEP/13 32.6
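For comparison, here is a rough Python sketch of the weighted version (apportion_by_days is a hypothetical helper, and it assumes the months do not roll over into the next year). Each month gets its rounded share of the value in proportion to its number of days, and the last month is replaced by the remainder.

```python
import calendar
from decimal import Decimal, ROUND_HALF_UP

def apportion_by_days(val, year, first_month, n_months, prec=2):
    quantum = Decimal(1).scaleb(-prec)
    # Days in each consecutive month (assumes no year rollover).
    days = [calendar.monthrange(year, first_month + i)[1] for i in range(n_months)]
    total_days = sum(days)
    portions = [
        (Decimal(val) * d / total_days).quantize(quantum, rounding=ROUND_HALF_UP)
        for d in days
    ]
    # Replace the last portion with the remainder so the total is exact.
    portions[-1] = Decimal(val) - sum(portions[:-1])
    return portions

print([str(p) for p in apportion_by_days(100, 2013, 7, 3)])  # ['33.70', '33.70', '32.60']
```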
Use rational numbers. You could store the numbers as fractions rather than simple values. That's the only way to assure that the quantity is truly split in 3, and that it adds up to the original number. Sure you can do something hacky with rounding and remainders, as long as you don't care that the portions are not exactly split in 3.
The "algorithm" is simply that
100/3 + 100/3 + 100/3 == 300/3 == 100
Store both the numerator and the denominator in separate fields, then add the numerators. You can always convert to floating point when you display the values.
The Oracle docs even have a nice example of how to implement it:
CREATE TYPE rational_type AS OBJECT
( numerator INTEGER,
denominator INTEGER,
MAP MEMBER FUNCTION rat_to_real RETURN REAL,
MEMBER PROCEDURE normalize,
MEMBER FUNCTION plus (x rational_type)
RETURN rational_type);
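To see the idea without defining an Oracle object type, here is a tiny Python illustration using the standard library's fractions module (only a sketch of the concept, not a replacement for the type above).

```python
from fractions import Fraction

part = Fraction(100, 3)      # numerator and denominator are kept separately
total = part + part + part   # exact arithmetic, no rounding error

print(part)          # 100/3
print(total == 100)  # True
print(float(part))   # convert to floating point only for display
```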
Here is a parameterized SQL version
SELECT COUNT (*), grp
FROM (WITH input AS (SELECT 100 p_number, 3 p_buckets FROM DUAL),
data
AS ( SELECT LEVEL id, (p_number / p_buckets) group_size
FROM input
CONNECT BY LEVEL <= p_number)
SELECT id, CEIL (ROW_NUMBER () OVER (ORDER BY id) / group_size) grp
FROM data)
GROUP BY grp
output:
COUNT(*) GRP
33 1
33 2
34 3
If you edit the input parameters (p_number and p_buckets) the SQL essentially distributes p_number as evenly as possible among the # of buckets requested (p_buckets).
I solved this problem yesterday by subtracting 2 of the 3 parts from the starting number, e.g. 100 - 33.33 - 33.33 = 33.34, and the result of summing it up is still 100.
I have made a matrix report in Oracle Report Builder like this
And here is the query that the report is based on:
SELECT A.p_date,
L.sup_name,
Decode(A.perc_typ, 1, 'Buff',
2, 'Cow') PERC_TYPE,
A.sup_rate RATE,
Decode(A.perc_typ,
1, Round(( Nvl(A.fat_perc, 0) * Nvl(A.gross_vol, 0) ) / 6, 5),
2, Round(
( Nvl(A.fat_perc, 0) + (
( Nvl(A.fat_perc, 0) * 0.22 ) + (
Nvl(A.lr_perc, 0) * 0.25 ) + 0.72 ) ) *
Nvl(A.gross_vol, 0) / 13, 5)) VOL
FROM mlk_purchase A,
supplier L
WHERE A.sup_cod = L.sup_cod
AND A.p_date <= Trunc(SYSDATE)
AND a.p_date >= Trunc(SYSDATE) - 7
ORDER BY 1
The problem is that empty cells are shown where the query returns no data. I want to show zeros instead of empty space. Is there any way to do this in Oracle Report Builder?
There are at least two solutions.
Solution 1 -- In Oracle Reports, create a boilerplate text object that displays the zero, and arrange this object so that it displays behind the matrix field. This way, the boilerplate is hidden when the field is displayed, but is revealed when the field is not displayed. This solution is described in the documentation.
Solution 2 -- Rewrite your query to return rows with zero values for combinations of your row and column fields that have no data. For example, you might find all the possible combinations of the matrix row and column fields (supplier and date in this case), outer join your data to the combinations, and use NVL to convert null values to zeroes. It might look something like this:
SELECT
L.P_DATE,
L.SUP_NAME,
DECODE(A.PERC_TYP, 1, 'Buff', 2, 'Cow') PERC_TYPE,
A.SUP_RATE RATE,
NVL
(
DECODE
(
A.PERC_TYP,
1,
ROUND
(
(NVL(A.FAT_PERC, 0) * NVL(A.GROSS_VOL, 0)) / 6,
5
),
2,
ROUND
(
(NVL(A.FAT_PERC, 0) +
(
(NVL(A.FAT_PERC, 0) * 0.22) +
(NVL(A.LR_PERC, 0) * 0.25) + 0.72)
) * NVL(A.GROSS_VOL, 0) / 13,
5
)
),
0
) VOL
FROM
MLK_PURCHASE A,
(
SELECT
L1.SUP_COD,
L1.SUP_NAME,
L2.P_DATE
FROM
(
SELECT DISTINCT
SUPPLIER.SUP_COD,
SUPPLIER.SUP_NAME
FROM
SUPPLIER
) L1,
(
SELECT DISTINCT
MLK_PURCHASE.P_DATE
FROM
MLK_PURCHASE
WHERE
MLK_PURCHASE.P_DATE <= TRUNC(SYSDATE)
AND
MLK_PURCHASE.P_DATE >= TRUNC(SYSDATE) - 7
) L2
) L
WHERE
A.SUP_COD (+) = L.SUP_COD
AND
A.P_DATE (+) = L.P_DATE
ORDER BY
1
A more efficient (and simpler) way to rewrite the query to the same effect might be to use a partitioned outer join between MLK_PURCHASE and SUPPLIER that partitions by SUP_COD, but I don't know to what extent your version of Oracle Reports supports this syntax.
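The idea behind Solution 2 (build every row/column combination, then default the missing ones to zero) can be sketched in a few lines of Python. This is only an illustration of the logic; fill_matrix and the sample data are made up.

```python
from itertools import product

def fill_matrix(rows, suppliers, dates):
    """rows: iterable of (date, supplier, vol); returns a value for every combination."""
    vols = {(d, s): v for d, s, v in rows}
    # Every (date, supplier) pair gets a cell; pairs without data default to 0,
    # which is the role NVL(..., 0) plays after the outer join.
    return {(d, s): vols.get((d, s), 0) for d, s in product(dates, suppliers)}

cells = fill_matrix(
    rows=[("2013-07-01", "Alpha", 12.5), ("2013-07-02", "Beta", 7.0)],
    suppliers=["Alpha", "Beta"],
    dates=["2013-07-01", "2013-07-02"],
)
for key in sorted(cells):
    print(key, cells[key])
```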
I had the following query:
SELECT nvl(sum(adjust1),0)
FROM (
SELECT
ManyOperationsOnFieldX adjust1,
a, b, c, d, e
FROM (
SELECT
a, b, c, d, e,
SubStr(balance, INSTR(balance, '[&&2~', 1, 1)) X
FROM
table
WHERE
a >= To_Date('&&1','YYYYMMDD')
AND a < To_Date('&&1','YYYYMMDD')+1
)
)
WHERE
b LIKE ...
AND e IS NULL
AND adjust1>0
AND (b NOT IN ('...','...','...'))
OR (b = '...' AND c <> NULL)
I tried to change it to this:
SELECT nvl(sum(adjust1),0)
FROM (
SELECT
ManyOperationsOnFieldX adjust1
FROM (
SELECT
SubStr(balance, INSTR(balance, '[&&2~', 1, 1)) X
FROM
table
WHERE
a >= To_Date('&&1','YYYYMMDD')
AND a < To_Date('&&1','YYYYMMDD')+1
AND b LIKE '..'
AND e IS NULL
AND (b NOT IN ('..','..','..'))
OR (b='..' AND c <> NULL)
)
)
WHERE
adjust1>0
My intention was to have all the filtering in the innermost query, and to pass to the outer ones only the field X, which is the one I have to operate on a lot. However, the first (original) query takes a couple of seconds to execute, while the second one won't even finish. I waited for almost 20 minutes and still didn't get the answer.
Is there an obvious reason for this to happen that I might be overlooking?
These are the plans for each of them:
SELECT STATEMENT optimizer=all_rows (cost = 973 Card = 1 bytes = 288)
SORT (aggregate)
PARTITION RANGE (single) (cost=973 Card = 3 bytes = 864)
TABLE ACCESS (full) OF "table" #3 TABLE Optimizer = analyzed(cost=973 Card = 3 bytes=564)
SELECT STATEMENT optimizer=all_rows (cost = 750.354 Card = 1 bytes = 288)
SORT (aggregate)
PARTITION RANGE (ALL) (cost=759.354 Card = 64.339 bytes = 18.529.632)
TABLE ACCESS (full) OF "table" #3 TABLE Optimizer = analyzed(cost=750.354 Card = 64.339 bytes=18.529.632)
Your two queries are not identical.
The logical operator AND is evaluated before the operator OR:
SQL> WITH data AS
2 (SELECT rownum id
3 FROM dual
4 CONNECT BY level <= 10)
5 SELECT *
6 FROM data
7 WHERE id = 2
8 AND id = 3
9 OR id = 5;
ID
----------
5
So your first query means: Give me the big SUM over this partition when the data is this way.
Your second query means: give me the big SUM over (this partition when the data is this way) or (when the data is this other way [no partition elimination hence big full scan])
Be careful when mixing the logical operators AND and OR. My advice would be to use brackets so as to avoid any confusion.
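The same precedence rule holds in many languages, so it is easy to try out. Here is a small Python analogue of the SQL demo above (Python's and also binds more tightly than or; the variable names are just for illustration).

```python
ids = range(1, 11)

# Parsed as (i == 2 and i == 3) or i == 5, just like the SQL WHERE clause.
without_parens = [i for i in ids if i == 2 and i == 3 or i == 5]

# Explicit brackets change the meaning: i == 2 and (i == 3 or i == 5).
with_parens = [i for i in ids if i == 2 and (i == 3 or i == 5)]

print(without_parens)  # [5]
print(with_parens)     # []
```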
It is all about your OR... Try this:
SELECT nvl(sum(adjust1),0)
FROM (
SELECT
ManyOperationsOnFieldX adjust1
FROM (
SELECT
SubStr(balance, INSTR(balance, '[&&2~', 1, 1)) X
FROM
table
WHERE
a >= To_Date('&&1','YYYYMMDD')
AND a < To_Date('&&1','YYYYMMDD')+1
AND (
b LIKE '..'
AND e IS NULL
AND (b NOT IN ('..','..','..'))
OR (b='..' AND c <> NULL)
)
)
)
WHERE
adjust1>0
Because you have the OR inline with the rest of your AND conditions with no parentheses, the 2nd version isn't limiting the data checked to just the rows that fall in the date filter. For more info, see the documentation of Condition Precedence
I have a hobby project which is about creating a tree to store identification numbers. Each node stores a single digit, that is, a node can be 0 1 2 3 4 5 6 7 8 9.
After I have created the tree, I want to compose a list from it. But I could not find an algorithm to achieve my goal.
What I want :
"recompose tree" will return list of numbers. For below tree it should be
[ 2, 21, 243, 245, 246, 78, 789 ]
           Root
          /    \
        2*      7
       /  \      \
     1*    4      8*
          / \ \     \
        3*  5* 6*    9*
my data type : data ID x = ID ( ( x, Mark ), [ ID x ] )
data Mark = Marked | Unmarked
EDIT:
For convenience, * shows that a node is marked.
The digits are actually stored as chars: not 1, but '1'.
Do you have any advice on how I can do that? (An algorithm would be preferred.)
What about
recompose :: Num a => ID a -> [a]
recompose = go 0
where
go acc (ID ((n, Marked), ids)) =
let acc' = 10 * acc + n
in acc' : concatMap (go acc') ids
go acc (ID ((n, Unmarked), ids)) =
let acc' = 10 * acc + n
in concatMap (go acc') ids
?
That is, we traverse the tree while accumulating a value for the path from the root to a node. At every node we update the accumulator by multiplying the value for the path by 10 and adding the value for the node to it. The traversal produces a list of all accumulator values for marked nodes: at a marked node we add the accumulator value to the list, while for unmarked nodes we just propagate the list that we have collected for the children of the node.
How do we compute the list for the children of a node? We recursively apply the traversal function (go) to all children by mapping it over the list of children. That gives us a list of lists that we concatenate to obtain a single list. (concatMap f xs is just concat (map f xs), i.e. concatMap f is concat . map f.)
In attribute-grammar terminology: the accumulator serves as an inherited attribute, while the returned list is a synthesised attribute.
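For readers who do not speak Haskell, here is a rough Python rendering of the same traversal. It is a sketch under assumptions: nodes are plain (digit, marked, subtrees) tuples rather than the ID type from the question, digits are ints rather than chars, and the root is represented simply as the list of its children.

```python
def recompose(children, acc=0):
    """children: list of (digit, marked, subtrees) tuples."""
    out = []
    for digit, marked, subtrees in children:
        acc2 = 10 * acc + digit  # accumulator for the path down to this node
        if marked:
            out.append(acc2)     # marked nodes contribute their path value
        out.extend(recompose(subtrees, acc2))
    return out

# The tree from the question; True means the node is marked (*).
tree = [
    (2, True, [
        (1, True, []),
        (4, False, [(3, True, []), (5, True, []), (6, True, [])]),
    ]),
    (7, False, [(8, True, [(9, True, [])])]),
]

print(recompose(tree))  # [2, 21, 243, 245, 246, 78, 789]
```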
As a refinement, we can introduce an auxiliary function isMarked,
isMarked :: Mark -> Bool
isMarked Marked = True
isMarked Unmarked = False
so that we can concisely write
recompose :: Num a => ID a -> [a]
recompose = go 0
where
go acc (ID ((n, m), ids)) =
let acc' = 10 * acc + n
prefix = if isMarked m then (acc' :) else id
in prefix (concatMap (go acc') ids)
BTW: this can even be done in sql:
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path='tmp';
CREATE TABLE the_tree
( id CHAR(1) NOT NULL PRIMARY KEY
, parent_id CHAR(1) REFERENCES the_tree(id)
);
INSERT INTO the_tree(id,parent_id) VALUES
( '#', NULL)
,( '2', '#')
,( '7', '#')
,( '1', '2')
,( '4', '2')
,( '3', '4')
,( '5', '4')
,( '6', '4')
,( '8', '7')
,( '9', '8')
;
WITH RECURSIVE sub AS (
SELECT id, parent_id, ''::text AS path
FROM the_tree t0
WHERE id = '#'
UNION
SELECT t1.id, t1.parent_id, sub.path || t1.id::text
FROM the_tree t1
JOIN sub ON sub.id = t1.parent_id
)
SELECT sub.id,sub.path
FROM sub
ORDER BY path
;
RESULT: (postgresql)
NOTICE: drop cascades to table tmp.the_tree
DROP SCHEMA
CREATE SCHEMA
SET
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "the_tree_pkey" for table "the_tree"
CREATE TABLE
INSERT 0 10
id | path
----+------
# |
2 | 2
1 | 21
4 | 24
3 | 243
5 | 245
6 | 246
7 | 7
8 | 78
9 | 789
(10 rows)