Is there another solution to calculate data filtered by date without a "with" clause?

I need to calculate, per name, the sum of value over the rows where date < first_date + 3 days (first_date being the earliest date for that name) in BigQuery.
Example of data:
+-------+------------+----------+-------+
| Name  | date       | order_id | value |
+-------+------------+----------+-------+
| JONES | 2019-01-03 | 11       | 10    |
| JONES | 2019-01-05 | 12       | 5     |
| JONES | 2019-06-03 | 13       | 3     |
| JONES | 2019-07-03 | 14       | 20    |
| John  | 2019-07-23 | 15       | 10    |
+-------+------------+----------+-------+
My solution is:
WITH data AS (
SELECT "JONES" name, DATE("2019-01-03") date_time, 11 order_id, 10 value
UNION ALL
SELECT "JONES", DATE("2019-01-05"), 12, 5
UNION ALL
SELECT "JONES", DATE("2019-06-03"), 13, 3
UNION ALL
SELECT "JONES", DATE("2019-07-03"), 14, 20
UNION ALL
SELECT "John", DATE("2019-07-23"), 15, 10
),
data2 AS (
SELECT *, MIN(date_time) OVER (PARTITION BY name) min_date
FROM data
)
SELECT name,
  ARRAY_AGG(STRUCT(order_id AS f_id, date_time AS f_date) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
  SUM(CASE WHEN date_time < DATE_ADD(min_date, INTERVAL 3 DAY) THEN value END) AS total_value_day3,
  SUM(value) AS total
FROM data2
GROUP BY name
Output:
+-------+------+------------+------------------+-------+
| name  | f_id | f_date     | total_value_day3 | total |
+-------+------+------------+------------------+-------+
| JONES | 11   | 2019-01-03 | 15               | 38    |
| John  | 15   | 2019-07-23 | 10               | 10    |
+-------+------+------------+------------------+-------+
So my question: can I do the same calculation in a more efficient way?
Or is this solution OK for large datasets?

The following gets the same results without using window functions or array aggregations, so BQ has to do less ordering/partitioning. For this small example, my query takes longer to run, but there is less byte shuffling. If you run this against a much larger dataset, I think mine will be more efficient.
WITH data AS (
SELECT "JONES" name, DATE("2019-01-03") date_time, "11" order_id, 10 value UNION ALL
SELECT "JONES", DATE("2019-01-05"), "12", 5 UNION ALL
SELECT "JONES", DATE("2019-06-03"), "13", 3 UNION ALL
SELECT "JONES", DATE("2019-07-03"), "14", 20 UNION ALL
SELECT "John", DATE("2019-07-23"), "15", 10
),
aggs as (
select name, min(date_time) as first_order_date, min(order_id) as first_order_id, sum(value) as total
from data
group by 1
)
select
name,
first_order_id as f_id,
first_order_date as f_date,
sum(value) as total_value_day3,
total
from aggs
inner join data using(name)
where date_time < date_add(first_order_date, interval 3 day) -- <= perhaps
group by 1,2,3,5
Note, this makes an assumption that order_id is sequential (i.e. order_id 11 always occurs before order_id 12) in the same way that dates are sequential.
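If that assumption does not hold, one possible variant (just a sketch on the same sample data, not benchmarked) is to look up the order_id that belongs to the earliest date per name instead of taking MIN(order_id):
WITH data AS (
  SELECT "JONES" name, DATE("2019-01-03") date_time, "11" order_id, 10 value UNION ALL
  SELECT "JONES", DATE("2019-01-05"), "12", 5 UNION ALL
  SELECT "JONES", DATE("2019-06-03"), "13", 3 UNION ALL
  SELECT "JONES", DATE("2019-07-03"), "14", 20 UNION ALL
  SELECT "John", DATE("2019-07-23"), "15", 10
),
aggs AS (
  SELECT name, MIN(date_time) AS first_order_date, SUM(value) AS total
  FROM data
  GROUP BY name
),
firsts AS (
  -- pick up the order_id that belongs to the earliest date per name
  -- (assumes at most one order per name on that first date)
  SELECT a.name, a.first_order_date, a.total, d.order_id AS first_order_id
  FROM aggs a
  JOIN data d
    ON d.name = a.name AND d.date_time = a.first_order_date
)
SELECT name,
  first_order_id AS f_id,
  first_order_date AS f_date,
  SUM(value) AS total_value_day3,
  total
FROM firsts
JOIN data USING (name)
WHERE date_time < DATE_ADD(first_order_date, INTERVAL 3 DAY) -- <= perhaps, as above
GROUP BY 1, 2, 3, 5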

how to loop through each row of every group (while doing "group by") in Oracle table

I have a table like this:
I want to group the table by the "customer_id" column and calculate a "Day-day[0]" column, where "Day" is the day field of each row in the group and "day[0]" is the day of the first row in the group. At the same time, I have to calculate the total risk, as follows:
This is the table after grouping:
This is the total risk formula:
In fact, I have to loop through each row of every group to calculate the total risk.
My sample table is like this:
CREATE TABLE risk_test
(id          VARCHAR2(32) NOT NULL PRIMARY KEY,
 customer_id VARCHAR2(40 BYTE),
 risk        NUMBER,
 day         VARCHAR2(50 BYTE));
insert into risk_test values(1,102,15,1);
insert into risk_test values(2,102,16,1);
insert into risk_test values(3,104,11,1);
insert into risk_test values(4,102,17,2);
insert into risk_test values(5,102,10,2);
insert into risk_test values(6,102,13,3);
insert into risk_test values(7,104,14,2);
insert into risk_test values(8,104,13,2);
insert into risk_test values(9,104,17,1);
insert into risk_test values(10,104,16,2);
The sample answer is like this:
Would you please guide me on how I can do this in an Oracle database?
Any help is really appreciated.
Using the sample data that was provided, I believe this query should calculate the risks properly:
Query
SELECT o.*,
ROUND (
SUM (day_minus_day0 * risk) OVER (PARTITION BY customer_id)
/ SUM (day_minus_day0) OVER (PARTITION BY customer_id),
5) AS total_risk
FROM (SELECT rt.*, (rt.day - MIN (rt.day) OVER (PARTITION BY customer_id)) + 1 AS day_minus_day0
FROM risk_test rt) o
ORDER BY customer_id, TO_NUMBER (day), TO_NUMBER (id);
Result
ID CUSTOMER_ID RISK DAY DAY_MINUS_DAY0 TOTAL_RISK
_____ ______________ _______ ______ _________________ _____________
1 102 15 1 1 13.77778
2 102 16 1 1 13.77778
4 102 17 2 2 13.77778
5 102 10 2 2 13.77778
6 102 13 3 3 13.77778
3 104 11 1 1 14.25
9 104 17 1 1 14.25
7 104 14 2 2 14.25
8 104 13 2 2 14.25
10 104 16 2 2 14.25
Your total risk calculation just looks like a weighted average to me. That is, the average risk of the rows for each customer, weighted according to the day offset (day - day[0]), so that risks on later days count for more.
To compute that, you can use a common table expression to first compute the day-weighted risk for each row. Then the weighted average is just the sum of the weighted risks divided by the sum of the weights.
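Written out, what the query below computes for each customer is (using the question's day[0] notation for the first day in the group):
total_risk = SUM( (day - day[0] + 1) * risk ) / SUM( day - day[0] + 1 )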
The query below illustrates the approach, with comments.
-- This first WITH clause is just sample data. In your database you would
-- get rid of this and replace all references to "input" with your actual
-- table name
with input ( customer_id, risk, day ) AS (
SELECT 1053, 100, 1 FROM DUAL UNION ALL
SELECT 1053, 100, 1 FROM DUAL UNION ALL
SELECT 1053, 100, 2 FROM DUAL UNION ALL
SELECT 1053, 100, 2 FROM DUAL UNION ALL
SELECT 1053, 100, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 1 FROM DUAL UNION ALL
SELECT 1054, 200, 1 FROM DUAL UNION ALL
SELECT 1054, 200, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 4 FROM DUAL
),
-- This CTE computes the day offset for each row and multiplies by the risk to
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
weighted_input AS (
SELECT i.customer_id,
i.risk,
i.day,
i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
FROM input i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM weighted_input wi;
+-------------+------+-----+------------+-------------------+------------+
| CUSTOMER_ID | RISK | DAY | DAY_OFFSET | DAY_WEIGHTED_RISK | TOTAL_RISK |
+-------------+------+-----+------------+-------------------+------------+
| 1053 | 100 | 1 | 1 | 100 | 100 |
| 1053 | 100 | 1 | 1 | 100 | 100 |
| 1053 | 100 | 2 | 2 | 200 | 100 |
| 1053 | 100 | 2 | 2 | 200 | 100 |
| 1053 | 100 | 3 | 3 | 300 | 100 |
| 1054 | 200 | 1 | 1 | 200 | 200 |
| 1054 | 200 | 1 | 1 | 200 | 200 |
| 1054 | 200 | 3 | 3 | 600 | 200 |
| 1054 | 200 | 3 | 3 | 600 | 200 |
| 1054 | 200 | 4 | 4 | 800 | 200 |
+-------------+------+-----+------------+-------------------+------------+
For your database, having the actual table and not needing the input CTE, it would be:
WITH weighted_input AS (
-- This CTE computes the day offset for each row and multiplies by the risk to
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
SELECT i.customer_id,
i.risk,
i.day,
i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
FROM my_table i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM weighted_input wi;
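If you only want one row per customer with the total risk (rather than repeating it on every detail row), the same weighted average collapses into a plain GROUP BY. A sketch along the same lines, again assuming your table is called my_table:
SELECT customer_id,
       ROUND( SUM((day_n - min_day + 1) * risk)
            / SUM(day_n - min_day + 1), 5) AS total_risk
FROM (SELECT t.customer_id,
             t.risk,
             TO_NUMBER(t.day) AS day_n,   -- day is stored as VARCHAR2 in the question's table
             MIN(TO_NUMBER(t.day)) OVER (PARTITION BY t.customer_id) AS min_day
      FROM my_table t)
GROUP BY customer_id;
Against the risk_test sample data this returns one row per customer with the same total_risk values as above (13.77778 and 14.25).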

display avg of the last column on the last row in SQL query

Hello, I am having trouble trying to figure out this particular question, using Oracle SQL Developer.
I am trying to write a query so that it will display exactly like the table/picture below.
The last row of the query should display the word AVERAGE: and show the average (of all values in the sixth column) of the percentage above the minimum selling price for all the sales made, and all the remaining columns should display "--------".
Code | ProductName | Title | ShopID | SalePrice | %SoldAbove Min.SellPrice
1    | Martin      | Robot | 1      | $49000    | 15%
2    |             |       |        |           |
3    |             |       |        |           |
4    |             |       |        |           |
---  | ------      | ----  | ----   | AVERAGE:  | 16.5%
Below is the last row of the output I am looking for, but I have no clue how to produce the
"--------" in the remaining columns, let alone the word AVERAGE: and the average of all the values in the sixth column of the last row.
In summary, the last row of the output should show, in the sixth column, the average of the percentage
sold above the minimum selling price for all the sales.
Use ROLLUP:
SELECT DECODE( GROUPING( Code ), 1, '----', code ) AS code,
DECODE( GROUPING( Code ), 1, '----', MAX(col1) ) AS Col1,
DECODE( GROUPING( Code ), 1, '----', MAX(col2) ) AS Col2,
DECODE( GROUPING( Code ), 1, 'Average:', MAX(col3) ) AS Col3,
AVG( value )
FROM table_name
GROUP BY ROLLUP(Code);
Which, for the sample data:
CREATE TABLE table_name ( code, col1, col2, col3, value ) AS
SELECT 1, 'AAA', 1, 'AA1', 15.0 FROM DUAL UNION ALL
SELECT 2, 'BBB', 2, 'BB2', 17.5 FROM DUAL UNION ALL
SELECT 3, 'CCC', 3, 'CC3', 20.0 FROM DUAL;
Outputs:
CODE | COL1 | COL2 | COL3 | AVG(VALUE)
:--- | :--- | :--- | :------- | ---------:
1 | AAA | 1 | AA1 | 15
2 | BBB | 2 | BB2 | 17.5
3 | CCC | 3 | CC3 | 20
---- | ---- | ---- | Average: | 17.5
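Mapped onto the column headings from the question (a sketch only: the real table and column names are not shown, so sales and pct_above_min are assumed names, and it assumes one sale per Code), the same pattern would look something like:
SELECT DECODE( GROUPING( Code ), 1, '----', TO_CHAR(Code) )               AS Code,
       DECODE( GROUPING( Code ), 1, '------', MAX(ProductName) )          AS ProductName,
       DECODE( GROUPING( Code ), 1, '----', MAX(Title) )                  AS Title,
       DECODE( GROUPING( Code ), 1, '----', TO_CHAR(MAX(ShopID)) )        AS ShopID,
       DECODE( GROUPING( Code ), 1, 'AVERAGE:', TO_CHAR(MAX(SalePrice)) ) AS SalePrice,
       AVG( pct_above_min )                                               AS pct_above_min
FROM sales            -- assumed table name
GROUP BY ROLLUP(Code);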
This is a job for UNION ALL. A good way to get this sort of result:
SELECT COALESCE(whatever1, '------') whatever1,
       COALESCE(whatever2, '------') whatever2,
       COALESCE(whatever3, '------') whatever3,
       whatever4
FROM (
    SELECT whatever1, whatever2, whatever3, whatever4 FROM whatever
    UNION ALL
    SELECT NULL, NULL, NULL, AVG(whatever4) FROM whatever
) r
ORDER BY r.whatever1 NULLS LAST
The COALESCE function puts your ----- characters into the output.
You can also investigate GROUP BY ROLLUP, as in the other answer. It may do what you want.

Selecting only distinct record from table in oracle

I have a table with the following records:
ID | NN | MBL | IC | OTHER
---+-----+------+----+------
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff
5 |123 | | | tr // duplicate NN of ID 1
6 | | 544 | | op // duplicate MBL of ID 2
7 | | | 124| ii // duplicate for IC ID 4
When querying with SELECT, I need just the records with a single entry, skipping any second occurrence:
select
ID, NN, MBL, IC, OTHER
from
TABLE1 // this should return only one entry of any NN, MBL and IC
How do I get this? I cannot use DISTINCT across multiple columns, and I also need the ID and OTHER columns to appear in the select query.
I am expecting a result like this:
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff
You can use the analytic function ROW_NUMBER() to calculate a rank over each of the columns you care about and keep only the rows where every rank is 1. The CASE ... IS NULL checks make sure that rows with a NULL in a column are not treated as duplicates of each other.
Here is an example:
WITH testdata AS (
SELECT 1 AS ID, 123 AS NN, NULL AS MBL, NULL AS IC, 'ac' AS OTHER FROM DUAL UNION ALL
SELECT 2, NULL, 544 , NULL, 'dc' FROM DUAL UNION ALL
SELECT 3, NULL, NULL, 524 , 'df' FROM DUAL UNION ALL
SELECT 4, 527, NULL, 124, 'ff' FROM DUAL UNION ALL
SELECT 5, 123, NULL, NULL, 'tr' FROM DUAL UNION ALL
SELECT 6, NULL, 544, NULL, 'op' FROM DUAL UNION ALL
SELECT 7, NULL, NULL , 124, 'ii' FROM DUAL
)
SELECT *
FROM(SELECT ID,
NN,
CASE WHEN NN IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY NN ORDER BY ID) END AS NN_RANG,
MBL,
CASE WHEN MBL IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY MBL ORDER BY ID) END AS MBL_RANG,
IC,
CASE WHEN IC IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY IC ORDER BY ID) END AS IC_RANG,
OTHER
FROM testdata
)
WHERE NN_RANG = 1
AND MBL_RANG = 1
AND IC_RANG = 1
;
With the sample data above, this returns only IDs 1, 2, 3 and 4, matching the expected result. Hope it helps.

ORACLE Query Get Last ID Using MIN Based On Quantity Consumed By ID

I have Incoming Stock transaction data using Oracle:
ID  | DESCRIPTION | PART_NO | QUANTITY | DATEADDED
TR5 | FG          | P0025   | 5        | 06-SEP-2017 08:20:33 <-- just now added
TR4 | Test        | TEST1   | 8        | 05-SEP-2017 15:11:15
TR3 | FG          | GSDFGSG | 10       | 31-AUG-2017 16:26:04
TR2 | FG          | GSDFGSG | 2        | 31-AUG-2017 16:05:39
TR1 | FG          | GSDFGSG | 2        | 30-AUG-2017 16:30:16
And now I'm grouping that data to be:
TR_ID | PART_NO | TOTAL
TR1 | GSDFGSG | 14
TR4 | TEST1 | 8
TR5 | P0025 | 5 <-- just now added
Query Code:
SELECT MIN(TRANSACTION_EQUIPMENTID) as TR_ID,
PART_NO,
SUM(T.QUANTITY) AS TOTAL
FROM WA_II_TBL_TR_EQUIPMENT T
GROUP BY T.PART_NO
As you can see from that data and the query, I show TR_ID using MIN to get the ID of the first transaction.
And now I have Outgoing transaction data:
Assume I try to take a quantity of 8:
ID_FK | QUANTITY
TR1 | 8
And now I want to get the last ID once that quantity of 8 has been consumed:
ID | DESCRIPTION | PART_NO | QUANTITY
TR3| FG | GSDFGSG | 10 <-- CONSUMED 4+2+2, TOTAL 8
TR2| FG | GSDFGSG | 2 <-- CONSUMED 2+2, TOTAL 4
TR1| FG | GSDFGSG | 2 <-- CONSUMED 2
As you can see above, TR1 and TR2 have been fully consumed. Now I want the query
SELECT MIN(TRANSACTION_EQUIPMENTID) as TR_ID,
PART_NO,
SUM(T.QUANTITY) AS TOTAL
FROM WA_II_TBL_TR_EQUIPMENT T
GROUP BY T.PART_NO
to return TR3 as the last ID, because TR1 and TR2 have been consumed.
How can I do that in a query?
Take the minimum id where the running sum is greater than or equal to 8. Use the analytic SUM():
select min(id) id
from (select t.*,
sum(quantity) over (partition by part_no order by id) sq
from t
where part_no = 'GSDFGSG'
)
where sq >= 8
Test data, output:
create table t(ID varchar2(3), DESCRIPTION varchar2(5),
PART_NO varchar2(8), QUANTITY number(5), DATEADDED date);
insert into t values ('TR4', 'Test', 'TEST1', 8, timestamp '2017-09-05 15:11:15');
insert into t values ('TR3', 'FG', 'GSDFGSG', 10, timestamp '2017-08-31 16:26:04');
insert into t values ('TR2', 'FG', 'GSDFGSG', 2, timestamp '2017-08-31 16:05:39');
insert into t values ('TR1', 'FG', 'GSDFGSG', 2, timestamp '2017-08-30 16:30:16');
insert into t values ('TR5', 'FG', 'GSDFGSG', 3, timestamp '2017-08-31 17:00:00');
Edit:
Add part_no and total columns and group by clause:
select min(id) id, part_no, min(sq) total
from (select t.*,
sum(quantity) over (partition by part_no order by id) sq
from t
where part_no = 'GSDFGSG'
)
where sq >= 8
group by part_no
ID PART_NO TOTAL
--- -------- ----------
TR3 GSDFGSG 14
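If you do not want to hard-code the consumed quantity of 8, it can be taken from the outgoing transactions instead. A sketch (the outgoing table name out_t and its columns id_fk, quantity are assumed from the question's example; parts with no outgoing rows are simply dropped by the inner join):
with consumed as (
  -- total quantity consumed per part, resolved through the incoming ids
  select i.part_no, sum(o.quantity) as consumed_qty
  from out_t o
  join t i on i.id = o.id_fk
  group by i.part_no
),
running as (
  select t.*,
         sum(quantity) over (partition by part_no order by id) sq
  from t
)
select r.part_no, min(r.id) as id, min(r.sq) as total
from running r
join consumed c on c.part_no = r.part_no
where r.sq >= c.consumed_qty
group by r.part_no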

Oracle sum group by date range

I have the following table
+-----------+-------+-------+
| Date | Type | Value |
+-----------+-------+-------+
| 1/1/2013 | A | 1 |
| 1/2/2013 | A | 3 |
| 1/3/2013 | A | 5 |
| 1/4/2013 | A | 6 |
| 1/6/2013 | A | 8 |
| 1/7/2013 | A | 1 |
| 1/8/2013 | A | 2 |
+-----------+-------+-------+
I want to sum the value over a 3-day window ending on a certain day (the day itself and the two days before it), so I used this query.
i.e. sel_date = 1/3/2013.
select type, sum(value)
from table_name
where date <= seldate
and date > seldate - 3
group by type
Now the problem is that I want to output a table for a given date range, computing the 3-day sum for each date in that range.
i.e. sel_date range 1/3/2013 - 1/8/2013
+-----------+-------+------------+
| Date | Type | Sum(Value) |
+-----------+-------+------------+
| 1/3/2013 | A | 9 | // 5 + 3 + 1
| 1/4/2013 | A | 14 | // 6 + 5 + 3
| 1/5/2013 | A | 11 | // 0 + 6 + 5
| 1/6/2013 | A | 14 | // 8 + 0 + 6
| 1/7/2013 | A | 9 | // 1 + 8 + 0
| 1/8/2013 | A | 11 | // 2 + 1 + 8
+-----------+-------+------------+
Is there a way to do this in a single query? I tried reading up on partitioning but it is leading me nowhere.
Use RANGE BETWEEN in the windowing clause:
select dt, type, value,
sum(value) over (order by dt range between 2 preceding and current row) as sv
from t
Test data and output:
create table t (dt date, type varchar2(1), value number(5));
insert into t values (date '2013-01-01', 'A', 1);
insert into t values (date '2013-01-02', 'A', 3);
insert into t values (date '2013-01-03', 'A', 5);
insert into t values (date '2013-01-04', 'A', 6);
insert into t values (date '2013-01-05', 'A', 8);
insert into t values (date '2013-01-06', 'A', 1);
insert into t values (date '2013-01-07', 'A', 2);
insert into t values (date '2013-01-12', 'A', 2);
DT TYPE VALUE SV
----------- ---- ------ ----------
2013-01-01 A 1 1
2013-01-02 A 3 4
2013-01-03 A 5 9
2013-01-04 A 6 14
2013-01-05 A 8 19
2013-01-06 A 1 15
2013-01-07 A 2 11
2013-01-12 A 2 2
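To limit the output to the requested range (for example 1/3/2013 through 1/8/2013) while still letting the days just before the range feed into the sums, compute the windowed sum first and apply the filter in an outer query; a sketch against the same table t:
select dt, type, value, sv
from (
  select dt, type, value,
         sum(value) over (order by dt range between 2 preceding and current row) as sv
  from t
)
where dt between date '2013-01-03' and date '2013-01-08'
order by dt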
You can try with something like this:
with test(Date_, Type, Value ) as
(
select to_date('01/01/2013', 'mm/dd/yyyy'), 'A', 1 from dual union all
select to_date('01/02/2013', 'mm/dd/yyyy'), 'A', 3 from dual union all
select to_date('01/03/2013', 'mm/dd/yyyy'), 'A', 5 from dual union all
select to_date('01/04/2013', 'mm/dd/yyyy'), 'A', 6 from dual union all
select to_date('01/05/2013', 'mm/dd/yyyy'), 'A', 8 from dual union all
select to_date('01/06/2013', 'mm/dd/yyyy'), 'A', 1 from dual union all
select to_date('01/07/2013', 'mm/dd/yyyy'), 'A', 2 from dual
)
select *
from (
select date_, type,
value + nvl(lag(value, 1) over (partition by type order by date_), 0)
+ nvl(lag(value, 2) over (partition by type order by date_), 0) as value
from test
)
where date_ between to_date('01/03/2013', 'mm/dd/yyyy') and to_date('01/07/2013', 'mm/dd/yyyy')
This sums, for each row, the row's own value plus the values of the two rows that precede it when ordered by date; the external query is simply used to apply the date filter, because applying it in the internal query would lead to a wrong sum.
LAG is used to read values from the rows that precede the current row by 1 or 2 positions.
You can use this:
select date1, type,
       (select sum(t1.value)
        from table_name t1
        where t1.date1 between (t2.date1 - 2) and t2.date1) as sumvalue
from table_name t2
where date1 between startDate and endDate
-- ROWS counts physical rows, so a missing date is skipped rather than counted as 0
select t.date,
       sum(t.value) over (order by t.date rows between 2 preceding and current row) as pre_3row_sum
from table_name t
