SQL help to count number of locations for each item/branch - oracle

I'm a SQL rookie, and am having trouble wrapping my head around how to do the following. I have a table that contains item information by branch. Within a branch an item can be in multiple locations. The data I need to extract needs to include a column that provides the total number of locations (count) the item is associated with for a given branch.
Output would look something like this:
I'm guessing this needs a subquery, but to be honest I'm not sure how to get started, or in which order things are done (subquery with GROUP BY first, then join, etc.).
In purely logical terms:
SELECT
a.Branch,
a.Item,
a.Loc,
COUNT(a.Branch||a.Item) AS 'LocCount'
FROM BranchInventoryFile a
GROUP BY a.Branch,a.Item

You can tackle this with Oracle's COUNT analytic function. Be sure to read up on window (analytic) functions and PARTITION BY, as they unlock quite a bit of functionality in SQL.
SQL:
SELECT
a.BRANCH,
a.ITEM,
a.LOC,
COUNT(a.ITEM) OVER (PARTITION BY a.BRANCH, a.ITEM) AS LOC_COUNT
FROM
BRANCH a;
Result:
| BRANCH | ITEM | LOC | LOC_COUNT |
|--------|------|------|-----------|
| 100 | A | 1111 | 2 |
| 100 | A | 1112 | 2 |
| 200 | A | 2111 | 1 |
| 200 | B | 1212 | 2 |
| 200 | B | 1212 | 2 |
| 300 | A | 1222 | 1 |
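One caveat with the query above: COUNT(a.ITEM) counts rows, so a location that appears twice for the same branch and item (as 1212 does for branch 200, item B in the result) is counted twice. If distinct locations are wanted, a minimal sketch along these lines may help. It assumes the same BRANCH table and columns as the answer, and relies on Oracle allowing DISTINCT in an analytic function as long as the OVER clause has no ORDER BY:

```sql
-- Sketch: count distinct locations per branch/item while keeping every row.
-- Assumes the BRANCH table (BRANCH, ITEM, LOC) used in the answer above.
SELECT
  a.BRANCH,
  a.ITEM,
  a.LOC,
  COUNT(DISTINCT a.LOC) OVER (PARTITION BY a.BRANCH, a.ITEM) AS LOC_COUNT
FROM
  BRANCH a;
```

With the sample rows shown in the result, branch 200 / item B would then report 1 rather than 2.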

"total number of locations (count) the item is associated with for a given branch"
The way you described it, you should remove location from the query:
SQL> with branchinventoryfile (branch, item, location) as
2 (select 100, 'A', 1111 from dual union all
3 select 100, 'A', 1112 from dual union all
4 select 200, 'A', 2111 from dual
5 )
6 select branch,
7 item,
8 count(distinct location) cnt
9 from BranchInventoryFile
10 group by branch, item;
BRANCH I CNT
---------- - ----------
100 A 2
200 A 1
SQL>
If you leave location in the SELECT list, you have to group by it (and get the wrong result):
6 select branch,
7 item,
8 location,
9 count(distinct location) cnt
10 from BranchInventoryFile
11 group by branch, item, location;
BRANCH I LOCATION CNT
---------- - ---------- ----------
100 A 1111 1
200 A 2111 1
100 A 1112 1
SQL>
Or include the locations, but aggregate them, e.g.:
6 select branch,
7 item,
8 listagg(location, ', ') within group (order by null) loc,
9 count(distinct location) cnt
10 from BranchInventoryFile
11 group by branch, item;
BRANCH I LOC CNT
---------- - -------------------- ----------
100 A 1111, 1112 2
200 A 2111 1
SQL>

Related

why adding order by in the query changes the aggregate value?

Following vertica example from https://www.vertica.com/docs/11.0.x/HTML/Content/Authoring/AnalyzingData/SQLAnalytics/AnalyticFunctionsVersusAggregateFunctions.htm?tocpath=Analyzing%20Data%7CSQL%20Analytics%7C_____2
CREATE TABLE employees(emp_no INT, dept_no INT);
INSERT INTO employees VALUES(1, 10);
INSERT INTO employees VALUES(2, 30);
INSERT INTO employees VALUES(3, 30);
INSERT INTO employees VALUES(4, 10);
INSERT INTO employees VALUES(5, 30);
INSERT INTO employees VALUES(6, 20);
INSERT INTO employees VALUES(7, 20);
INSERT INTO employees VALUES(8, 20);
INSERT INTO employees VALUES(9, 20);
INSERT INTO employees VALUES(10, 20);
INSERT INTO employees VALUES(11, 20);
COMMIT;
If I run this query without ORDER BY, I get the same count value for all rows:
dbadmin#b006bc38a718(*)=>
select
emp_no
, dept_no
, count(*) over (partition by dept_no) as emp_count
from employees;
emp_no | dept_no | emp_count
--------+---------+-----------
6 | 20 | 6
7 | 20 | 6
8 | 20 | 6
9 | 20 | 6
10 | 20 | 6
11 | 20 | 6
1 | 10 | 2
4 | 10 | 2
2 | 30 | 3
3 | 30 | 3
5 | 30 | 3
(11 rows)
But if I add ORDER BY, I get an incremental value:
dbadmin#b006bc38a718(*)=>
select
emp_no
, dept_no
, count(*) over (partition by dept_no order by emp_no) as emp_count
from employees;
emp_no | dept_no | emp_count
--------+---------+-----------
2 | 30 | 1
3 | 30 | 2
5 | 30 | 3
1 | 10 | 1
4 | 10 | 2
6 | 20 | 1
7 | 20 | 2
8 | 20 | 3
9 | 20 | 4
10 | 20 | 5
11 | 20 | 6
(11 rows)
Time: First fetch (11 rows): 85.075 ms. All rows formatted: 85.139 ms
What is the effect of ORDER BY? Why do I get an incremental value?
If the window clause contains only PARTITION BY, the function returns the total count for the partition, the same value for every row of the partition.
If the window clause contains both PARTITION BY and ORDER BY, it returns the running count within the partition: following the ORDER BY expression, how many rows have been counted so far within the partition.
That's exactly how window functions work. They give you a whole world of possibilities ...
That happens because, when the OVER clause has an ORDER BY but no explicit frame, Vertica applies a default frame clause, defined as:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
So to get the result you want, add the frame clause below after the ORDER BY in the OVER() clause:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
This behaviour is documented as:
If the OVER clause omits specifying a window frame, the function creates a default window that extends from the current row to the first row in the current partition.
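To see both behaviours side by side, a sketch like the following can be run against the employees table created in the question (columns emp_no and dept_no, as in the CREATE TABLE statement). The column aliases running_count and total_count are illustrative names only:

```sql
SELECT
  emp_no,
  dept_no,
  -- ORDER BY with no frame: the default frame ends at the current row (running count)
  COUNT(*) OVER (PARTITION BY dept_no ORDER BY emp_no) AS running_count,
  -- explicit frame spanning the whole partition: the same total on every row
  COUNT(*) OVER (PARTITION BY dept_no ORDER BY emp_no
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total_count
FROM employees
ORDER BY dept_no, emp_no;
```

Because the default frame is RANGE-based, rows that tie on the ORDER BY value would share the same running count; emp_no is unique here, so that does not arise.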

how to loop through each row of every group (while doing "group by") in Oracle table

I have a table like this:
I want to group the table by the "customer_id" column and calculate a "Day-day[0]" column, where "Day" is the Day field of each row in the group and "day[0]" is the day of the first row in the group. At the same time, I have to calculate the total risk, as follows:
This is the table after grouping by:
This is total risk formula:
In fact, I have to loop through each row of every group to calculate total risk.
My sample table is like this:
CREATE TABLE risk_test
(id VARCHAR2(32) NOT NULL PRIMARY KEY,
customer_id VARCHAR2(40 BYTE),
risk NUMBER,
day VARCHAR2(50 BYTE));
insert into risk_test values(1,102,15,1);
insert into risk_test values(2,102,16,1);
insert into risk_test values(3,104,11,1);
insert into risk_test values(4,102,17,2);
insert into risk_test values(5,102,10,2);
insert into risk_test values(6,102,13,3);
insert into risk_test values(7,104,14,2);
insert into risk_test values(8,104,13,2);
insert into risk_test values(9,104,17,1);
insert into risk_test values(10,104,16,2);
The sample answer is like this:
Would you please guide me how I can do this scenario in Oracle database?
Any help is really appreciated.
Using the sample data that was provided, I believe this query should calculate the risks properly:
Query
SELECT o.*,
ROUND (
SUM (day_minus_day0 * risk) OVER (PARTITION BY customer_id)
/ SUM (day_minus_day0) OVER (PARTITION BY customer_id),
5) AS total_risk
FROM (SELECT rt.*, (rt.day - MIN (rt.day) OVER (PARTITION BY customer_id)) + 1 AS day_minus_day0
FROM risk_test rt) o
ORDER BY customer_id, TO_NUMBER (day), TO_NUMBER (id);
Result
ID CUSTOMER_ID RISK DAY DAY_MINUS_DAY0 TOTAL_RISK
_____ ______________ _______ ______ _________________ _____________
1 102 15 1 1 13.77778
2 102 16 1 1 13.77778
4 102 17 2 2 13.77778
5 102 10 2 2 13.77778
6 102 13 3 3 13.77778
3 104 11 1 1 14.25
9 104 17 1 1 14.25
7 104 14 2 2 14.25
8 104 13 2 2 14.25
10 104 16 2 2 14.25
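As a sanity check, the figures can be reproduced by hand from the sample inserts. For customer 102 the weights (DAY_MINUS_DAY0) are 1, 1, 2, 2, 3 and the risks are 15, 16, 17, 10, 13; for customer 104 the weights are 1, 1, 2, 2, 2 and the risks are 11, 17, 14, 13, 16. A throwaway query against DUAL, shown here only as an illustration, performs the same weighted-average arithmetic:

```sql
-- Weighted average = sum(weight * risk) / sum(weight), per customer
SELECT ROUND((1*15 + 1*16 + 2*17 + 2*10 + 3*13) / (1 + 1 + 2 + 2 + 3), 5) AS total_risk_102,
       ROUND((1*11 + 1*17 + 2*14 + 2*13 + 2*16) / (1 + 1 + 2 + 2 + 2), 5) AS total_risk_104
FROM dual;
```

This works out to 124 / 9 and 114 / 8, matching the 13.77778 and 14.25 in the result above.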
Your total risk calculation just looks like a weighted average to me. That is, the average risk of the rows for each customer, weighted according to the day offset (day-day[0]), so that risks in later days count for more.
To compute that, you need a common table expression to first compute the day-weighted risk for each row. Then you can compute the weighted average by dividing the sum of the day-weighted risks by the sum of the day offsets.
The query below illustrates the approach, with comments.
-- This first WITH clause is just sample data. In your database you would
-- get rid of this and replace all references to "input" with your actual
-- table name
with input ( customer_id, risk, day ) AS (
SELECT 1053, 100, 1 FROM DUAL UNION ALL
SELECT 1053, 100, 1 FROM DUAL UNION ALL
SELECT 1053, 100, 2 FROM DUAL UNION ALL
SELECT 1053, 100, 2 FROM DUAL UNION ALL
SELECT 1053, 100, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 1 FROM DUAL UNION ALL
SELECT 1054, 200, 1 FROM DUAL UNION ALL
SELECT 1054, 200, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 3 FROM DUAL UNION ALL
SELECT 1054, 200, 4 FROM DUAL
),
-- This CTE computes the day offset for each row and multiplies by the risk to
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
weighted_input AS (
SELECT i.customer_id,
i.risk,
i.day,
i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
FROM input i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM weighted_input wi;
+-------------+------+-----+------------+-------------------+------------+
| CUSTOMER_ID | RISK | DAY | DAY_OFFSET | DAY_WEIGHTED_RISK | TOTAL_RISK |
+-------------+------+-----+------------+-------------------+------------+
| 1053 | 100 | 1 | 1 | 100 | 100 |
| 1053 | 100 | 1 | 1 | 100 | 100 |
| 1053 | 100 | 2 | 2 | 200 | 100 |
| 1053 | 100 | 2 | 2 | 200 | 100 |
| 1053 | 100 | 3 | 3 | 300 | 100 |
| 1054 | 200 | 1 | 1 | 200 | 200 |
| 1054 | 200 | 1 | 1 | 200 | 200 |
| 1054 | 200 | 3 | 3 | 600 | 200 |
| 1054 | 200 | 3 | 3 | 600 | 200 |
| 1054 | 200 | 4 | 4 | 800 | 200 |
+-------------+------+-----+------------+-------------------+------------+
For your database, having the actual table and not needing the input CTE, it would be:
WITH weighted_input AS (
-- This CTE computes the day offset for each row and multiplies by the risk to
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
SELECT i.customer_id,
i.risk,
i.day,
i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
FROM my_table i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM weighted_input wi;

use LAG with expression in oracle

I have a column (status) in a table that contains numbers; the values are 1, 2 or 4.
I would like, in a SQL query, to add a calculated column (bitStatus) that stores the bitwise OR of the status column of the current row and the bitStatus value of the previous row.
Like so:
| id | status| bitStatus|
|----|-------|----------|
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 4 | 7 |
| 4 | 1 | 7 |
So what I did was use the LAG function in Oracle, but I couldn't figure out how to do it, since I want to create only one calculated column, bitStatus.
My query is like:
select id, status,
BITOR(LAG(bitStatus) OVER (ORDER BY 1), status) AS bitStatus
But as you know, I can't use LAG(bitStatus) when calculating bitStatus.
So how can I produce the desired table?
Thanks in advance.
Would this help?
lines #1 - 6 represent sample data
the TEMP CTE is here to fetch LAG status value (to improve readability)
the final select does the BITOR operation as bitor(a, b) = a - bitand(a, b) + b
SQL> with test (id, status) as
2 (select 1, 1 from dual union all
3 select 2, 2 from dual union all
4 select 3, 1 from dual union all
5 select 4, 4 from dual
6 ),
7 temp as
8 (select id, status,
9 lag(status) over (order by id) lag_status
10 from test
11 )
12 select id,
13 status,
14 status - bitand(status, nvl(lag_status, status)) + nvl(lag_status, status) as bitstatus
15 from temp
16 order by id;
ID STATUS BITSTATUS
---------- ---------- ----------
1 1 1
2 2 3
3 1 3
4 4 5
SQL>
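Note that the query above ORs each row only with the row immediately before it. If what is wanted is a running OR over all earlier rows, as the desired output in the question suggests, one possible sketch is to track each bit separately with an analytic MAX. This assumes status only ever holds the bits 1, 2 and 4, as stated in the question; the CTE name test and its rows mirror the question's sample table:

```sql
WITH test (id, status) AS
  (SELECT 1, 1 FROM dual UNION ALL
   SELECT 2, 2 FROM dual UNION ALL
   SELECT 3, 4 FROM dual UNION ALL
   SELECT 4, 1 FROM dual
  )
SELECT id,
       status,
       -- each term is the bit's value once it has been seen in any row so far, else 0
       MAX(BITAND(status, 1)) OVER (ORDER BY id)
     + MAX(BITAND(status, 2)) OVER (ORDER BY id)
     + MAX(BITAND(status, 4)) OVER (ORDER BY id) AS bitstatus
FROM test
ORDER BY id;
```

For the four sample rows this gives 1, 3, 7, 7. Supporting further bits would mean adding one MAX(BITAND(...)) term per bit.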

Search data from table column using regex in oracle sql

I need to search a table by a person's phone number. The phone numbers are stored in different formats, but each one contains exactly 9 digits once the separators are ignored. How can I find a number when I search with a plain form? For example, searching for 998732387 should return:
2 | 99 873 23 87 | Kike
When I enter 971234573 then result should look like below:
3 | 97 123-45-73 | Cris
mytable
-----------------------------------------
id | phone | name
----------------------------------------
1 | 991234567 | Michael
2 | 99 873 23 87 | Kike
3 | 97 123-45-73 | Cris
Please Help me. Any help is appreciated.
One way is to remove all non-digits:
select *
from mytable
where regexp_replace(phone, '[^[:digit:]]', '') = '971234573';
Or, if your database doesn't support regular expressions (hard to believe), translate does the job:
SQL> with test (id, phone, name) as
2 -- your sample data
3 (select 1, '991234567', 'Michael' from dual union all
4 select 2, '99 873 23 87', 'Kike' from dual union all
5 select 3, '97 123-45-73', 'Cris' from dual
6 ),
7 only_digits as
8 -- remove non-digits from the PHONE column (pre-regex version)
9 (select id, phone, name,
10 translate(phone, 'a' || translate(phone, 'x0123456789x', 'x'), 'a') digit
11 from test
12 )
13 select id, phone, name
14 from only_digits
15 where digit = '998732387';
ID PHONE NAME
---------- ------------ -------
2 99 873 23 87 Kike
SQL>
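The nested TRANSLATE on line 10 is terse, so here is how it might be unpacked for a single value. The inner call strips the digits and leaves only the separator characters; the outer call then removes exactly those characters from the original string. The literal and the column aliases below are for illustration only:

```sql
SELECT
  -- inner step: delete the digits, keeping only the non-digit characters (' --')
  TRANSLATE('97 123-45-73', 'x0123456789', 'x') AS separators,
  -- outer step: delete those separator characters from the original value;
  -- the leading 'a' is a dummy so that neither argument is ever empty (NULL in Oracle)
  TRANSLATE('97 123-45-73',
            'a' || TRANSLATE('97 123-45-73', 'x0123456789', 'x'),
            'a') AS digits_only
FROM dual;
```

Here digits_only comes out as 971234573, which is the value compared against the search term in the WHERE clause.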

Oracle "partition" a table at each new value

I have an Oracle table I need to "partition" (I use the term loosely): I just need to detect groups and would like to display the group through a SELECT. Here's an example that might serve as sample data (the four columns):
ID | Ref | Rank | Partition_group (only available for the 1st member)
1 | 1 | 1 | 1_A
2 | 1 | 2 | (null)
3 | 1 | 3 | 1_B
4 | 2 | 1 | (null)
5 | 2 | 2 | 2_A
...
It is sorted (the sort key would be the 'Ref' and a creation date). What I would need here, is to extract three groups:
IDs 1 and 2
ID 3
ID 5
What happens with ID 4 is not really important: it may be in its own group, or with the ID 5.
Two IDs should be in the same group if they have the same 'Ref' and if there hasn't been any 'Partition_group' change. In other words, at each change of 'Ref' or (logical or) 'Partition_group', I need to detect a new group. For instance, we could return something like this:
ID | Ref | Rank | Partition_group | Group
1 | 1 | 1 | 1_A | 1_A
2 | 1 | 2 | (null) | 1_A
3 | 1 | 3 | 1_B | 1_B
4 | 2 | 1 | (null) | (null) (or 2_A)
5 | 2 | 2 | 2_A | 2_A
...
I thought about writing a function or something, but it appears I don't have the rights to do so (yeah...) so I have to use plain Oracle SQL (11g).
I've been looking at CONNECT BY and OVER (analytical functions) but they don't seem to do the trick.
Has anyone faced such a problem? How would you solve it?
Thanks in advance.
Assuming the input data is the first four columns, how about something like:
with sample_data as (select 1 id, 1 ref, 1 rank, '1_A' ptn_group from dual union all
select 2 id, 1 ref, 2 rank, null ptn_group from dual union all
select 3 id, 1 ref, 3 rank, '1_B' ptn_group from dual union all
select 4 id, 2 ref, 1 rank, null ptn_group from dual union all
select 5 id, 2 ref, 2 rank, '2_A' ptn_group from dual)
select id,
ref,
rank,
ptn_group,
last_value(ptn_group ignore nulls) over (partition by ref order by rank, id) grp1,
case when last_value(ptn_group ignore nulls) over (partition by ref order by rank, id) is null then
first_value(ptn_group ignore nulls) over (partition by ref order by rank, id rows between current row and unbounded following)
else last_value(ptn_group ignore nulls) over (partition by ref order by rank, id)
end grp2
from sample_data;
ID REF RANK PTN_GROUP GRP1 GRP2
---------- ---------- ---------- --------- ---- ----
1 1 1 1_A 1_A 1_A
2 1 2 1_A 1_A
3 1 3 1_B 1_B 1_B
4 2 1 2_A
5 2 2 2_A 2_A 2_A
I've given you two options to generate the group (grp1 and grp2), depending on how you want to deal with leading rows within a ref whose ptn_group is null: leave them null, or pick up the first non-null value that follows in the same ref.
