I have a table ABC with the following columns
AO    AOM
----  ----
100   200
200   300
300   400
600   500
500   300
800   900
800   1000
900   1000
1200  1300
100   1200
1500  1600
100   1600
Looking at the data, 100 is a root element and has the following leaf elements: 200, 300, 400, 500, 600, 1200, 1300.
I need my output to look like the table below: a list of all the elements with the corresponding root element.
ELEMENT  ROOT
-------  ----
100      100
200      100
300      100
400      100
500      100
600      100
1200     100
1300     100
800      800
900      800
1000     800
1500     100
1600     1600
I tried using the query below as a starting point, but I was not sure how to get the expected result:
select *
from ABC
start with ao=100
connect by ao = prior aom;
It is easier to reverse the question: treat each row as a starting point and navigate up through its ancestors until you reach a leaf of that upward traversal (a row whose ao never appears as an aom, i.e. a true root):
SELECT DISTINCT element, root
FROM (
  SELECT CONNECT_BY_ROOT aom AS aom,
         CONNECT_BY_ROOT ao AS ao,
         ao AS root
  FROM   abc
  WHERE  CONNECT_BY_ISLEAF = 1
  CONNECT BY PRIOR ao = aom
)
UNPIVOT (
  element FOR type IN (ao, aom)
);
Which, for the sample data:
CREATE TABLE abc (AO, AOM) AS
SELECT 100, 200 FROM DUAL UNION ALL
SELECT 200, 300 FROM DUAL UNION ALL
SELECT 300, 400 FROM DUAL UNION ALL
SELECT 600, 500 FROM DUAL UNION ALL
SELECT 500, 300 FROM DUAL UNION ALL
SELECT 800, 900 FROM DUAL UNION ALL
SELECT 800, 1000 FROM DUAL UNION ALL
SELECT 900, 1000 FROM DUAL UNION ALL
SELECT 1200, 1300 FROM DUAL UNION ALL
SELECT 100, 1200 FROM DUAL;
Outputs:
ELEMENT  ROOT
-------  ----
100      100
200      100
300      100
500      600
300      600
400      100
400      600
600      600
800      800
900      800
1000     800
1200     100
1300     100
Note: 600 is a root because you have the rows in your data set as 600, 500 and 500, 300 and not 500, 600 and 300, 500. If the ao and aom values in those rows were reversed then the root would be 100 for all those rows.
You can start from the roots and work down to the descendants, but it's less efficient as you need a sub-query to find the roots:
SELECT DISTINCT
       element,
       root
FROM (
  SELECT CONNECT_BY_ROOT ao AS root,
         ao,
         aom
  FROM   abc
  START WITH ao NOT IN (SELECT aom FROM abc)
  CONNECT BY PRIOR aom = ao
)
UNPIVOT (
  element FOR type IN (ao, aom)
)
The output is identical.
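As a cross-check outside the database, the same "walk up to the ancestors" idea can be sketched in Python. This is only an illustration under stated assumptions: the rows are the answer's sample data, ao is taken to be the parent and aom the child, the data is assumed to contain no cycles, and the names ROWS and element_roots are made up for this sketch.

```python
from collections import defaultdict

# (ao, aom) rows from the answer's sample table: ao is the parent, aom the child.
ROWS = [
    (100, 200), (200, 300), (300, 400), (600, 500), (500, 300),
    (800, 900), (800, 1000), (900, 1000), (1200, 1300), (100, 1200),
]

def element_roots(rows):
    """Return sorted (element, root) pairs; assumes the data has no cycles."""
    parents = defaultdict(set)
    elements = set()
    for ao, aom in rows:
        parents[aom].add(ao)
        elements.update((ao, aom))

    def roots_of(node):
        ups = parents.get(node)
        if not ups:
            # No parent row exists, so this node is itself a root.
            return {node}
        found = set()
        for parent in ups:
            found |= roots_of(parent)
        return found

    return sorted((e, r) for e in elements for r in roots_of(e))

pairs = element_roots(ROWS)
for element, root in pairs:
    print(element, root)
```

An element with two parent chains (300 and 400 here) is reported once per root, which is the same reason the SQL output lists those elements under both 100 and 600.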
I have two matrices with the same dimensions, "value" and "mask":
value=matrix( 100 200 100,200 100 200,100 200 100);
//output
col1 col2 col3
--- --- ---
100 200 100
200 100 200
100 200 100
mask=matrix( 1 2 1,1 1 2,1 2 1);
//output
col1 col2 col3
-- -- --
1 1 1
2 1 2
1 2 1
Each column of the "mask" matrix has two groups, "1" and "2".
I want to obtain the average value of each group in the corresponding column of the "value" matrix.
Expected result:
col1 col2 col3
100 150 100
200 200 200
If you use DolphinDB V2.00 or higher, you can use the script below:
value=matrix( 100 200 100,200 100 200,100 200 100);
mask=matrix( 1 2 1,1 1 2,1 2 1);
each(def(v, m):v[groups(m, 'table').index].rowAvg(), value, mask);
If not, you can run the following script:
value=matrix(100 200 100,200 100 200,100 200 100)
mask=matrix(1 2 1,1 1 2,1 2 1)
each(def(v, m)->groupby(avg, v, m).avg,value,mask)
Then, you can get your expected result:
col1 col2 col3
100 150 100
200 200 200
Note that DolphinDB stores matrices in column-major order, so the each function applies the given function to each column.
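To make the per-column grouping explicit, here is a minimal Python sketch of the same computation. It is not DolphinDB code: the matrices are written out as lists of columns, and the names value_cols, mask_cols and group_means are assumptions made for this illustration.

```python
from collections import defaultdict

# Columns of the two matrices, in the column-major order DolphinDB uses.
value_cols = [[100, 200, 100], [200, 100, 200], [100, 200, 100]]
mask_cols = [[1, 2, 1], [1, 1, 2], [1, 2, 1]]

def group_means(values, groups):
    """Average the values that share a group label; results are ordered by label."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    return [sum(buckets[g]) / len(buckets[g]) for g in sorted(buckets)]

# One list of group means per column.
result_cols = [group_means(v, m) for v, m in zip(value_cols, mask_cols)]

# Transpose so that each printed row corresponds to one group label.
for row in zip(*result_cols):
    print(*row)
```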
We have this data: a table R(A, ...) with attribute A. R has 1000 rows and A has 500 distinct values.
The histogram data is displayed as bucket -> end_point_value:
1 -> 800
2 -> 900
3 -> 1000
4 -> 1200
5 -> 1500
6 -> 2000
7 -> 2300
8 -> 2400
9 -> 2550
10 -> 2590
The question is: does this histogram confirm or deny the hypothesis of a uniform distribution?
I think I can neither confirm nor deny it. What do you think?
First you must ask which histogram type is defined on the column.
Oracle provides four different histogram types, and if you want to make a claim about a uniform distribution, a frequency histogram must be defined.
A frequency histogram has one bucket for each distinct value (the value is stored in ENDPOINT_VALUE and the cumulative frequency is stored in ENDPOINT_NUMBER).
So if your histogram has only 10 buckets (as you show in the data) while the column has 500 distinct values, it is not a frequency histogram, and you can say nothing about the distribution.
Example of a Uniform Distribution
create table r as
select
1 + trunc((rownum-1)/2) A
from dual connect by level <= 1000;
select count(*), count(distinct a), min(a), max(a) from r;
COUNT(*) COUNT(DISTINCTA) MIN(A) MAX(A)
---------- ---------------- ---------- ----------
1000 500 1 500
Create FREQUENCY Histogram with 500 Buckets
exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'R', method_opt=>'for all columns size 500');
select NUM_BUCKETS, HISTOGRAM from user_tab_columns where table_name = 'R';
NUM_BUCKETS HISTOGRAM
----------- ---------------
500 FREQUENCY
select ENDPOINT_VALUE, ENDPOINT_NUMBER from user_histograms where table_name = 'R' order by ENDPOINT_VALUE;
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- ---------------
             1               2
             2               4
             3               6
             4               8
...
           498             996
           499             998
           500            1000
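Because ENDPOINT_NUMBER is cumulative, the per-value frequency is the difference between consecutive rows. The following Python sketch shows that arithmetic; it assumes the endpoint numbers have already been fetched in ENDPOINT_VALUE order, treats "uniform" as "every value has exactly the same count", and uses made-up helper names and a made-up skewed list for contrast.

```python
def bucket_frequencies(endpoint_numbers):
    """Turn cumulative ENDPOINT_NUMBER values into per-value row counts."""
    freqs = []
    previous = 0
    for cumulative in endpoint_numbers:
        freqs.append(cumulative - previous)
        previous = cumulative
    return freqs

def looks_uniform(endpoint_numbers):
    """True when every distinct value has the same number of rows."""
    return len(set(bucket_frequencies(endpoint_numbers))) == 1

# Cumulative counts matching the example table: 500 values, 2 rows each.
uniform = list(range(2, 1001, 2))
print(looks_uniform(uniform))

# A hypothetical skewed column for contrast (made-up numbers).
skewed = [1, 2, 10, 11]
print(bucket_frequencies(skewed))
print(looks_uniform(skewed))
```

On real data an exact equality test is probably too strict, so a tolerance or a statistical test would be a more realistic choice.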
Each row's Total value should become the OP value of the next row, except for the first row
My Table Data
Id  WarehouseId  ProductId  OP  RE  IS  Total
--  -----------  ---------  --  --  --  -----
1   100          10000      10  0   0   10
2   100          10000      0   0   5   5
4   100          10000      0   15  0   15
5   101          10001      15  0   0   15
6   101          10001      0   0   5   5
8   101          10001      0   15  0   15
9   101          10002      25  0   0   25
10  101          10002      0   0   10  10
11  101          10002      0   15  0   15
I want to show the result below, where (OP + RE) - IS = Total:
Id  WarehouseId  ProductId  OP  RE  IS  Total
--  -----------  ---------  --  --  --  -----
1   100          10000      10  0   0   10
2   100          10000      10  0   5   5
4   100          10000      5   15  0   20
5   101          10001      15  0   0   15
6   101          10001      15  0   5   10
8   101          10001      10  15  0   25
9   101          10002      25  0   0   25
10  101          10002      25  0   10  15
11  101          10002      15  15  0   30
You can use recursive subquery factoring for that purpose, as shown below. You need to rank your data because there are gaps between the IDs. IS is a reserved word and is not allowed as an unquoted column name; enclose it in double quotes ("IS").
with t1 ( ID, WAREHOUSEID, PRODUCTID, OP, RE, "IS", RNB, rnb_per_wh_prod) as (
select ID, WAREHOUSEID, PRODUCTID, OP, RE, "IS"
, row_number()over(order by WAREHOUSEID, PRODUCTID, ID) rnb
, row_number()over(partition by WAREHOUSEID, PRODUCTID order by ID) rnb_per_wh_prod
from Your_table t
), cte (ID, WAREHOUSEID, PRODUCTID, OP, RE, "IS", Total, RNB) as (
select ID, WAREHOUSEID, PRODUCTID, OP, RE, "IS", OP + RE - "IS", RNB
from t1
where rnb_per_wh_prod = 1
union all
select t1.ID
, t1.WAREHOUSEID
, t1.PRODUCTID
, case when t1.op = 0 then c.OP + c.RE - c."IS" else t1.OP end as OP
, t1.RE
, t1."IS"
, case when t1.op = 0 then c.OP + c.RE - c."IS" else t1.OP end + t1.RE - t1."IS" total
, t1.RNB
from cte c
join t1 on (t1.RNB = c.RNB + 1
and c.WAREHOUSEID = t1.WAREHOUSEID
and c.PRODUCTID = t1.PRODUCTID)
)
select ID, WAREHOUSEID, PRODUCTID, OP, RE, "IS", TOTAL
from cte
order by id
;
If I understand the problem correctly, "every 3 row" is a coincidence in your data; in fact, the computation must be done separately for each product in each warehouse, no matter how many rows there are for each distinct combination. And the "id" column is in reality some sort of timestamp - ordering is by "id".
If so, you can do it all in a single query using analytic sum(). Instead of creating a table for testing, I included all the sample data in a WITH clause at the top of the query itself; you can remove the WITH clause and use your actual table and column names. I also changed the column name IS to IS_: since IS is a reserved keyword, it can't be used as an unquoted column name.
Note also that I am ignoring your existing total column completely (I didn't even include it in the sample data); I assume it doesn't exist in your real-life data, and instead it is part of your attempt at a solution. You don't need it - not in the way you have it in your question.
with
sample_data (id, warehouseid, productid, op, re, is_) as (
select 1, 100, 10000, 10, 0, 0 from dual union all
select 2, 100, 10000, 0, 0, 5 from dual union all
select 4, 100, 10000, 0, 15, 0 from dual union all
select 5, 101, 10001, 15, 0, 0 from dual union all
select 6, 101, 10001, 0, 0, 5 from dual union all
select 8, 101, 10001, 0, 15, 0 from dual union all
select 9, 101, 10002, 25, 0, 0 from dual union all
select 10, 101, 10002, 0, 0, 10 from dual union all
select 11, 101, 10002, 0, 15, 0 from dual
)
select id, warehouseid, productid,
nvl(sum(op + re - is_) over (partition by warehouseid, productid
order by id rows between unbounded preceding and 1 preceding),
op) as op,
re, is_,
sum(op + re - is_) over
(partition by warehouseid, productid order by id) as total
from sample_data
;
ID WAREHOUSEID PRODUCTID OP RE IS_ TOTAL
------ ----------- ---------- ---------- ---------- ---------- ----------
1 100 10000 10 0 0 10
2 100 10000 10 0 5 5
4 100 10000 5 15 0 20
5 101 10001 15 0 0 15
6 101 10001 15 0 5 10
8 101 10001 10 15 0 25
9 101 10002 25 0 0 25
10 101 10002 25 0 10 15
11 101 10002 15 15 0 30
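The carry-forward logic of the analytic query can also be traced step by step in Python. This is a sketch under assumptions, not a replacement for the SQL: it uses the same sample rows, orders them by warehouse, product and id, and the names ROWS and carry_forward (and the re_ / is_ variable names) are invented for the illustration.

```python
ROWS = [
    # (id, warehouseid, productid, op, re, is_)
    (1, 100, 10000, 10, 0, 0),
    (2, 100, 10000, 0, 0, 5),
    (4, 100, 10000, 0, 15, 0),
    (5, 101, 10001, 15, 0, 0),
    (6, 101, 10001, 0, 0, 5),
    (8, 101, 10001, 0, 15, 0),
    (9, 101, 10002, 25, 0, 0),
    (10, 101, 10002, 0, 0, 10),
    (11, 101, 10002, 0, 15, 0),
]

def carry_forward(rows):
    """Mirror the analytic query: a running sum of op + re - is_ per group."""
    running = {}  # (warehouseid, productid) -> total of the previous row
    out = []
    for id_, wh, prod, op, re_, is_ in sorted(rows, key=lambda r: (r[1], r[2], r[0])):
        key = (wh, prod)
        previous = running.get(key)
        # First row of a group keeps its own op; later rows open with the prior total.
        opening = op if previous is None else previous
        total = (previous or 0) + op + re_ - is_
        running[key] = total
        out.append((id_, wh, prod, opening, re_, is_, total))
    return out

result = carry_forward(ROWS)
for row in result:
    print(*row)
```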
Consider the table below.
Id balance
1 100
2 500
3 4000
I need the output in the format below.
Id balance begin_bal end_bal
1 100 0 100
2 500 100 600
3 4000 600 4600
A little bit of analytics, as you presumed:
with test (id, balance) as
  (select 1, 100 from dual union all
   select 2, 500 from dual union all
   select 3, 4000 from dual
  ),
temp as
  (select id, balance, sum(balance) over (order by id) rsum
   from test
  )
select id,
       balance,
       nvl(lag(rsum) over (order by id), 0) begin_bal,
       rsum end_bal
from temp
order by id;

        ID    BALANCE  BEGIN_BAL    END_BAL
---------- ---------- ---------- ----------
         1        100          0        100
         2        500        100        600
         3       4000        600       4600
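For comparison, the running sum plus "previous running sum" pattern is short to express in Python as well. This sketch assumes the rows are already ordered by id and uses illustrative variable names that are not part of the original answer.

```python
from itertools import accumulate

# (id, balance) rows, assumed to be ordered by id already.
balances = [(1, 100), (2, 500), (3, 4000)]

# Running total of the balances, the equivalent of sum(balance) over (order by id).
end_bals = list(accumulate(b for _, b in balances))

# The previous row's running total, with 0 for the first row (the lag + nvl step).
begin_bals = [0] + end_bals[:-1]

report = [
    (id_, bal, begin, end)
    for (id_, bal), begin, end in zip(balances, begin_bals, end_bals)
]
for row in report:
    print(*row)
```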
First off, I'm a total Oracle noob although I'm very familiar with SQL. I have a single cost column. I need to calculate the total cost, the percentage of the total cost, and then a running sum of the percentages. I'm having trouble with the running sum of percentages because the only way I can think to do this uses nested SUM functions, which isn't allowed.
Here's what works:
SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per
FROM my_table
ORDER BY cost DESC
Here's what I'm trying to do that doesn't work:
SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per,
SUM(cost/SUM(cost) OVER()) OVER(cost) AS per_sum
FROM my_table
ORDER BY cost DESC
Am I just going about it wrong, or is what I'm trying to do just not possible? By the way I'm using Oracle 10g. Thanks in advance for any help.
You don't need the order by inside that inline view, especially since the outer select is doing an order by the other way around. Also, cost / SUM(cost) OVER () equals RATIO_TO_REPORT(cost) OVER ().
An example:
SQL> create table my_table(cost)
2 as
3 select 10 from dual union all
4 select 20 from dual union all
5 select 5 from dual union all
6 select 50 from dual union all
7 select 60 from dual union all
8 select 40 from dual union all
9 select 15 from dual
10 /
Table created.
Your initial query:
SQL> SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per
2 FROM my_table
3 ORDER BY cost DESC
4 /
COST TOTAL PER
---------- ---------- ----------
60 200 .3
50 200 .25
40 200 .2
20 200 .1
15 200 .075
10 200 .05
5 200 .025
7 rows selected.
Quassnoi's query contains a typo:
SQL> SELECT cost, total, per, SUM(running) OVER (ORDER BY cost)
2 FROM (
3 SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per
4 FROM my_table
5 ORDER BY
6 cost DESC
7 )
8 /
SELECT cost, total, per, SUM(running) OVER (ORDER BY cost)
*
ERROR at line 1:
ORA-00904: "RUNNING": invalid identifier
If I correct that typo, it gives the right results, but wrongly sorted (I guess):
SQL> SELECT cost, total, per, SUM(per) OVER (ORDER BY cost)
2 FROM (
3 SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per
4 FROM my_table
5 ORDER BY
6 cost DESC
7 )
8 /
COST TOTAL PER SUM(PER)OVER(ORDERBYCOST)
---------- ---------- ---------- -------------------------
5 200 .025 .025
10 200 .05 .075
15 200 .075 .15
20 200 .1 .25
40 200 .2 .45
50 200 .25 .7
60 200 .3 1
7 rows selected.
I think this is the one you are looking for:
SQL> select cost
2 , total
3 , per
4 , sum(per) over (order by cost desc)
5 from ( select cost
6 , sum(cost) over () total
7 , ratio_to_report(cost) over () per
8 from my_table
9 )
10 order by cost desc
11 /
COST TOTAL PER SUM(PER)OVER(ORDERBYCOSTDESC)
---------- ---------- ---------- -----------------------------
60 200 .3 .3
50 200 .25 .55
40 200 .2 .75
20 200 .1 .85
15 200 .075 .925
10 200 .05 .975
5 200 .025 1
7 rows selected.
Regards,
Rob.
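The arithmetic behind the running percentage can be checked by hand with a few lines of Python. This is a sketch only: it uses the sample costs from the example table, exact fractions to avoid floating-point drift, and it assumes there are no tied cost values (with ties, an analytic SUM ordered by cost would give tied rows the same cumulative value under its default window, which this loop does not reproduce).

```python
from fractions import Fraction
from itertools import accumulate

# Sample costs from the example table.
costs = [10, 20, 5, 50, 60, 40, 15]

total = sum(costs)
ordered = sorted(costs, reverse=True)

# Share of the total per row, the counterpart of ratio_to_report(cost) over ().
shares = [Fraction(c, total) for c in ordered]

# Running sum of the shares in descending cost order.
running = list(accumulate(shares))

for cost, share, cumulative in zip(ordered, shares, running):
    print(cost, total, float(share), float(cumulative))
```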
SELECT cost, total, per, SUM(per) OVER (ORDER BY cost)
FROM (
SELECT cost, SUM(cost) OVER() AS total, cost / SUM(cost) OVER() AS per
FROM my_table
)
ORDER BY
cost DESC