putting data into categories (buckets) based on a value - oracle

How can I assign a category based on a value?
For example, I have a table with values from 1 to 200. How do I assign a category to each record, such as 1-5, 6-10, 11-15, etc.?
I can do it with the query below, but that seems like a bad solution.
Sorry, this is probably very basic, but I don't know what it's called, and googling "buckets" (as it's called in our company) didn't bring up any results.
Thank you
SELECT DISTINCT CountOfSA,
CASE
WHEN CountOfSA BETWEEN 1 AND 5 THEN
'1-5'
WHEN CountOfSA BETWEEN 6 AND 10 THEN
'6-10'
WHEN CountOfSA BETWEEN 11 AND 15 THEN
'11-15'
WHEN CountOfSA BETWEEN 16 AND 20 THEN
'16-20'
WHEN CountOfSA BETWEEN 21 AND 25 THEN
'21-25'
WHEN CountOfSA BETWEEN 26 AND 30 THEN
'26-30'
END
AS diff
FROM NR_CF_212

Have a look at the WIDTH_BUCKET function.
It divides a range into equal-sized intervals and assigns a bucket number to each interval.
with x as (
select CountOfSA,
-- the upper bound is exclusive, so 201 (not 200) gives 40 buckets of width 5 covering 1-200
width_bucket(CountOfSA, 1, 201, 40) bucket_
from NR_CF_212
)
select CountOfSA,
cast(1 + (bucket_ - 1)*5 as varchar2(4)) ||
'-' ||
cast( bucket_*5 as varchar2(4)) diff
from x
order by CountOfSA;
Demo here.
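Oracle documents WIDTH_BUCKET's upper bound as exclusive: a value equal to or above it goes into an overflow bucket numbered one past the last. The sketch below mimics that rule in Python for integer inputs, to show what the choice of bound does to the value 200. The function names and the `label` helper are illustrative assumptions, not part of any Oracle API.

```python
def width_bucket(value, low, high, num_buckets):
    """Mimic WIDTH_BUCKET for an ascending range [low, high) and integer inputs."""
    if value < low:
        return 0                    # underflow bucket
    if value >= high:
        return num_buckets + 1      # overflow bucket: the upper bound is exclusive
    # Integer arithmetic keeps the boundary cases exact.
    return (value - low) * num_buckets // (high - low) + 1


def label(bucket, width=5):
    """Turn a bucket number into a '1-5' style label (hypothetical helper)."""
    return f"{1 + (bucket - 1) * width}-{bucket * width}"


# With an upper bound of 201 every value from 1 to 200 gets one of 40 buckets.
for v in (1, 5, 6, 196, 200):
    print(v, label(width_bucket(v, 1, 201, 40)))

# With an upper bound of 200 the value 200 lands in the overflow bucket.
print(200, width_bucket(200, 1, 200, 40))
```

Under these assumptions, an upper bound of 201 makes each bucket exactly 5 wide, while an upper bound of 200 sends the value 200 to bucket 41.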

I would put the range values and descriptions into a separate table, especially if you plan to use them in future queries, views, etc. Plus, it's easier to change the ranges or descriptions as needed. For example:
create table sales_ranges
(
low_val number not null,
high_val number not null,
range_desc varchar2(100) not null
)
cache;
insert into sales_ranges values (0,1000,'$0-$1k');
insert into sales_ranges values (1001,10000,'$1k-$10k');
insert into sales_ranges values (10001,100000,'$10k-$100k');
insert into sales_ranges values (100001,1000000,'$100k-$1mm');
insert into sales_ranges values (1000001,10000000,'$1mm-$10mm');
insert into sales_ranges values (10000001,100000000,'$10mm-$100mm');
commit;
create table sales
(
id number,
total_sales number
);
insert into sales(id, total_sales)
-- some random values for testing
select level, trunc(dbms_random.value(1,10000000))
from dual
connect by level <= 100;
commit;
select id, total_sales, range_desc
from sales s
left outer join sales_ranges sr
on (s.total_sales between sr.low_val and sr.high_val)
order by s.id
;
Output (just first 3 rows):
ID TOTAL_SALES RANGE_DESC
1 5122380 $1mm-$10mm
2 347726 $100k-$1mm
3 6564700 $1mm-$10mm

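I guess you could concatenate some calculated values to build the bucket name dynamically. The query below relies on integer division (the sample output is from PostgreSQL); in Oracle, wrap the division in TRUNC as in the update further down: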
select countofsa
, ((countofsa - 1)/5) * 5 + 1
, ((countofsa - 1)/5 + 1) * 5
, ((countofsa - 1)/5) * 5 + 1 || '-' || ((countofsa - 1)/5 + 1) * 5 AS diff
from nr_cf_212
Some output:
countofsa | ?column? | ?column? | diff
-----------+----------+----------+-------
1 | 1 | 5 | 1-5
2 | 1 | 5 | 1-5
3 | 1 | 5 | 1-5
4 | 1 | 5 | 1-5
5 | 1 | 5 | 1-5
6 | 6 | 10 | 6-10
7 | 6 | 10 | 6-10
8 | 6 | 10 | 6-10
9 | 6 | 10 | 6-10
10 | 6 | 10 | 6-10
11 | 11 | 15 | 11-15
(11 rows)
UPDATE from comments, Oracle example, dynamically computing range:
create table nr_cf_212(countofsa number);
insert into nr_cf_212 values(1);
insert into nr_cf_212 values(2);
insert into nr_cf_212 values(3);
insert into nr_cf_212 values(4);
insert into nr_cf_212 values(5);
insert into nr_cf_212 values(6);
insert into nr_cf_212 values(7);
insert into nr_cf_212 values(9);
insert into nr_cf_212 values(10);
insert into nr_cf_212 values(11);
select countofsa
, TRUNC((countofsa - 1)/5) * 5 + 1
, (TRUNC((countofsa - 1)/5) + 1) * 5
, TRUNC((countofsa - 1)/5) * 5 + 1 || '-' || (TRUNC((countofsa - 1)/5) + 1) * 5 AS diff
from nr_cf_212;
| COUNTOFSA | TRUNC((COUNTOFSA-1)/5)*5+1 | (TRUNC((COUNTOFSA-1)/5)+1)*5 | DIFF |
|-----------|----------------------------|------------------------------|-------|
| 1 | 1 | 5 | 1-5 |
| 2 | 1 | 5 | 1-5 |
| 3 | 1 | 5 | 1-5 |
| 4 | 1 | 5 | 1-5 |
| 5 | 1 | 5 | 1-5 |
| 6 | 6 | 10 | 6-10 |
| 7 | 6 | 10 | 6-10 |
| 9 | 6 | 10 | 6-10 |
| 10 | 6 | 10 | 6-10 |
| 11 | 11 | 15 | 11-15 |
I tried it with sqlfiddle (http://sqlfiddle.com/#!4/b922e/4).
I broke it into parts to show the "from" column, the "to" column and then the range. If you divide your number by 5 and look at the quotient and the remainder, you will see a pattern:
1/5 = 0 remainder 1
2/5 = 0 remainder 2
3/5 = 0 remainder 3
4/5 = 0 remainder 4
5/5 = 1 remainder 0
6/5 = 1 remainder 1
7/5 = 1 remainder 2
8/5 = 1 remainder 3
9/5 = 1 remainder 4
10/5 = 2 remainder 0
11/5 = 2 remainder 1
The range for the number runs "from" 5 times the quotient "to" 5 times the quotient plus 5, almost. Actually, everything is offset by 1. So take your number, subtract 1, then do the division: the range starts at 5 times that quotient plus 1 and ends at 5 times that quotient plus 5.
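The offset-by-one arithmetic can be checked with a few lines of Python. This is only a sketch of the same calculation as the TRUNC expression above; `bucket_label` and its `width` parameter are names made up for illustration.

```python
def bucket_label(n, width=5):
    """Return the 'from-to' label of the bucket that the positive integer n falls into."""
    quotient = (n - 1) // width          # same role as TRUNC((n - 1) / 5) in the query
    low = quotient * width + 1           # the "from" column
    high = (quotient + 1) * width        # the "to" column
    return f"{low}-{high}"


for n in (1, 5, 6, 10, 11, 200):
    print(n, bucket_label(n))
```

Changing `width` gives buckets of another size without touching the rest of the calculation.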

Related

order by not returning running count

In this question - why adding order by in the query changes the aggregate value? - I was told that "If the window clause contains both PARTITION BY and ORDER BY, it returns the running count within the partition. So, using the ORDER BY expression, how many rows have been counted so far within the partition."
Referring to this example - https://www.vertica.com/docs/11.0.x/HTML/Content/Authoring/AnalyzingData/SQLAnalytics/ReportingAggregates.htm?tocpath=Analyzing%20Data%7CSQL%20Analytics%7CWindow%20Framing%7C_____3
Why does the cumulative count show 4 (the last value of the count) for all rows with sal=109?
=> SELECT deptno, sal, empno, COUNT(sal) OVER (
-> PARTITION BY deptno ORDER BY sal
-> ) AS COUNT
-> FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 1
10 | 104 | 4 | 2
------------------------------
20 | 100 | 11 | 1
20 | 109 | 7 | 4<-
20 | 109 | 6 | 4<-
20 | 109 | 8 | 4<-
20 | 110 | 10 | 6<-
20 | 110 | 9 | 6<-
------------------------------
30 | 102 | 2 | 1
30 | 103 | 3 | 2
30 | 105 | 5 | 3

You order by sal, which is 109 for 3 rows within deptno 20. As far as the ordering criterion is concerned, those 3 rows are peers and enter the window at the same time. After the 1 row with sal 100, adding all 3 takes the count straight to 4. So you get 4 for all 3.
You need distinct ordering values to get distinct running count results.
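In standard SQL, a window with ORDER BY and no explicit frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which is why tied rows are counted together. The Python sketch below imitates that default next to a ROWS-style frame for the deptno 20 salaries. The function names are illustrative, and the sketch assumes a single partition with no NULL values.

```python
def running_count_range(values):
    """Default RANGE frame: each row counts every row whose value is <= its own."""
    ordered = sorted(values)
    return [sum(1 for other in ordered if other <= v) for v in ordered]


def running_count_rows(values):
    """ROWS frame: each row counts only the rows physically up to and including itself."""
    ordered = sorted(values)
    return list(range(1, len(ordered) + 1))


dept20 = [100, 109, 109, 109, 110, 110]
print(running_count_range(dept20))   # ties share one count
print(running_count_rows(dept20))    # every row gets its own count
```

With a ROWS frame the counts are distinct, but the order among tied rows is arbitrary unless a tie-breaking column is added to the ORDER BY.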

How to calculate difference between 2 entries based on dates?

I have an Oracle DB View like:
DATE | PRODUCT_NUMBER | PRODUCT_COUNT | PRODUCT_FACTOR
2018-01-01 | 1 | 10 | 3
2018-03-15 | 1 | 8 | 3
2019-02-11 | 1 | 11 | 3
2019-08-01 | 1 | 5 | 3
2019-08-01 | 2 | 20 | 5
2019-08-02 | 2 | 15 | 5
2019-06-01 | 2 | 5 | 5
2020-07-01 | 2 | 30 | 5
2018-07-07 | 3 | 100 | 2
Where,
DATE is the date
NUMBER is a unique Product Number
COUNT is the number of items from the Product Number in the storage facility
FACTOR is the number of products that fit into a storage rack
I now need to know how much it changed since the last update for every Product Number.
Since the first entry has no past date to compare to, change is undefined and something like NULL, NONE, 0 or so. Doesn't matter as long as I can filter those out later.
Some products only have 1 entry, those should be ignored (nothing to calculate difference on).
End result should be:
DATE | PRODUCT_NUMBER | PRODUCT_COUNT | PRODUCT_FACTOR | PRODUCT_CHANGE | CHANGE_FACTOR
2018-01-01 | 1 | 10 | 3 | NULL | NULL
2018-03-15 | 1 | 8 | 3 | 2 # 10-8 | 6 # 2*3
2019-02-11 | 1 | 11 | 3 | -3 # 8-11 | -9 # 3*-3
2019-08-01 | 1 | 5 | 3 | 6 # 11-5 | 18 # 6*3
2019-08-01 | 2 | 20 | 5 | -15 # 5-20 | -75 # -15*5
2019-08-02 | 2 | 15 | 5 | 5 # 20-15 | 25 # 5*5
2019-06-01 | 2 | 5 | 5 | NULL | NULL
2020-07-01 | 2 | 30 | 5 | -15 # 15-30 | -75 # -15*5
How can I achieve this within Oracle SQL?

The expected result is a bit unclear:
Why are the values 15 and 5 compared for product_number 2? 2019-06-01 is earlier than 2019-08-01 and should be the first row.
Why is change_factor on the first row 3 for product 1 but null for product 2?
Why is change_factor for 2019-02-11 calculated as 11 * 0 instead of 0 * 3?
Assuming all of these are typos (I changed 2019-06-01 to 2019-09-01), you can use something like the query below:
select dt, product_number, product_count, product_factor, product_change, product_change*product_factor change_factor
from (
select "DATE" dt, product_number, product_count, product_factor,
greatest(lag(product_count) over(partition by product_number order by "DATE") - product_count, 0) product_change
from test_tab t1
where (select count(1) from test_tab t2 where t1.product_number = t2.product_number and rownum < 3) > 1
)
fiddle
See also LAG documentation
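To make the "previous row minus current row" logic concrete, here is a small Python sketch of what LAG does per product. It is only an illustration under stated assumptions: it keeps negative changes, as in the question's expected table for product 1, whereas the query above clamps them to 0 with greatest. The function name and tuple layout are invented for the example.

```python
from collections import defaultdict


def changes_per_product(rows):
    """rows: (date, product_number, product_count, product_factor) tuples."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row[1]].append(row)

    result = []
    for product in sorted(by_product):
        # ISO-formatted date strings sort chronologically.
        history = sorted(by_product[product], key=lambda r: r[0])
        if len(history) < 2:
            continue                     # products with a single entry are ignored
        previous = None
        for date, number, count, factor in history:
            change = None if previous is None else previous - count
            change_factor = None if change is None else change * factor
            result.append((date, number, count, factor, change, change_factor))
            previous = count
    return result


sample = [
    ("2018-01-01", 1, 10, 3),
    ("2018-03-15", 1, 8, 3),
    ("2019-02-11", 1, 11, 3),
    ("2019-08-01", 1, 5, 3),
    ("2018-07-07", 3, 100, 2),
]
for line in changes_per_product(sample):
    print(line)
```

The first row of each product has no predecessor, so its change stays None, which corresponds to the NULL that LAG returns.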

Row count by group

I am attempting to write the following query to get a row count by group.
select
a.employee, a.cov_option,
count(a.cov_option) over (partition by a.cov_option order by a.employee) as row_num
from wilson.benefit a
inner join wilson.bncategory b
ON a.plan_type = b.plan_type and a.plan_option = b.plan_option
inner join wilson.bncovopt c
ON a.company = c.company and a.plan_code = c.plan_code and a.cov_option = c.coverage_opt
where
a.plan_type = 'HL' and
to_char(a.stop_date, 'yyyy-mm-dd') = '1700-01-01'
order by a.employee, a.cov_option
The result set returned is:
employee | cov_option |row_num
-------------|--------------|--------------
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 2
1420 | 3 | 4
1420 | 3 | 4
1537 | 2 | 2
1537 | 2 | 2
The result set I am attempting to return is:
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 1
1420 | 3 | 2
1420 | 3 | 2
1537 | 2 | 1
1537 | 2 | 1

What you seem to want is dense_rank() rather than count(). Indeed, "count" simply means determining how many rows are in each group; it is not "counting" in the way we learn as children (first, second, third). That kind of counting is called "ranking".
dense_rank() over (partition by a.employee order by a.cov_option) as row_num
should do what you need.
There is also rank() - the difference is that if two rows are tied for first, with dense_rank() the third row gets rank 2; with simple rank() it gets rank 3 (rank 2 is "used up" by the first two rows).
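The difference between the two ranking functions is easy to see on a tiny list. The Python functions below are a sketch that imitates RANK and DENSE_RANK over one partition; they are illustrative helpers, not database calls.

```python
def rank(values):
    """Rank = 1 + number of rows that sort strictly before the current row."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in ordered]


def dense_rank(values):
    """Dense rank = position of the value among the distinct values, so no gaps."""
    distinct = sorted(set(values))
    return [distinct.index(v) + 1 for v in sorted(values)]


tied_first = [1, 1, 3]               # two rows tied for first place
print(rank(tied_first))              # the third row skips a rank
print(dense_rank(tied_first))        # the third row gets the next rank

employee_429 = [1, 3, 3]             # cov_option values from the question
print(dense_rank(employee_429))
```

For employee 429 the dense ranks are 1, 2, 2, which matches the result set the question asks for.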

max() issue in Oracle SQL [duplicate]

I'm using Oracle SQL and I need some help with the max() function.
I have the following table:
ID | Type | Price | Quantity
1 | A | 10 | 2
2 | B | 5 | 5
3 | C | 10 | 3
4 | A | 8 | 7
5 | A | 6 | 9
6 | A | 7 | 5
7 | B | 15 | 3
8 | A | 20 | 4
9 | A | 3 | 7
10 | B | 11 | 8
I need to aggregate the table by the Type column. For each Type group (A, B, C), I need to select the price and the quantity of the row with max(id).
In this case:
ID | Type | Price | Quantity
9 | A | 3 | 7
10 | B | 11 | 8
3 | C | 10 | 3
Any suggestions?

max() on its own won't help you here, because it returns only the largest id and not the rest of that row. You can use the row_number() analytic function instead:
select id, type, price, quantity
from
(
select yourtable.*,
row_number() over (partition by type order by id desc) rn
from yourtable
) v
where rn = 1

Something like this:
Select t.* From
(Select Max(ID) As ID From yourtable
Group By Type) tmp
Join yourtable t On t.ID = tmp.ID
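Both answers boil down to "for each Type, keep the row with the largest ID". The Python sketch below does the same on the question's sample data, as a way to check the expected result; the function name and the tuple layout are assumptions made for the example.

```python
rows = [
    (1, "A", 10, 2), (2, "B", 5, 5), (3, "C", 10, 3), (4, "A", 8, 7),
    (5, "A", 6, 9), (6, "A", 7, 5), (7, "B", 15, 3), (8, "A", 20, 4),
    (9, "A", 3, 7), (10, "B", 11, 8),
]


def latest_per_type(rows):
    """Keep, for every type, the (id, type, price, quantity) row with the highest id."""
    best = {}
    for row in rows:
        row_id, row_type = row[0], row[1]
        if row_type not in best or row_id > best[row_type][0]:
            best[row_type] = row
    return [best[t] for t in sorted(best)]


for row in latest_per_type(rows):
    print(row)
```

The result has one row per type, with the price and quantity taken from the highest-id row rather than aggregated separately.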

Oracle query to list all parents before children

I have a table which has child#, parent# like the following :
child# | parent#
------------------
10 | NULL
20 | NULL
2 | 1
1 | 10
50 | 10
6 | 5
5 | 2
There is no ordering of numbers, i.e. 1 can be parent of 10 and 10 can be parent of 20.
I want an ORACLE SQL query which lists all parents first, followed by their children.
I want a temporary table like following:
child# | parent#
----------------
10 | NULL
20 | NULL
1 | 10
2 | 1
50 | 10
5 | 2
I want to traverse this temporary table and process each row, so I need to make sure each parent is listed before its children.

select level, t2.child, t2.parent
from your_table t2
start with t2.parent is null
connect by prior t2.child = t2.parent
order by level
OUTPUT:
LEVEL CHILD PARENT
1 10 (null)
1 20 (null)
2 1 10
2 50 10
3 2 1
4 5 2
5 6 5
Link to fiddle
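Ordering by LEVEL amounts to a breadth-first walk of the tree starting from the rows whose parent is NULL. A minimal Python sketch of that walk on the question's data is shown below; `parents_first` is a made-up name, and the sketch assumes the data contains no cycles.

```python
from collections import defaultdict, deque


def parents_first(pairs):
    """pairs: (child, parent) tuples, parent is None for roots.

    Returns (level, child, parent) tuples with every parent before its children.
    """
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)

    ordered = []
    queue = deque((1, child, None) for child in children[None])
    while queue:
        level, node, parent = queue.popleft()
        ordered.append((level, node, parent))
        for child in children[node]:
            queue.append((level + 1, child, node))
    return ordered


pairs = [(10, None), (20, None), (2, 1), (1, 10), (50, 10), (6, 5), (5, 2)]
for row in parents_first(pairs):
    print(row)
```

Because a node is queued only after its parent has been emitted, the output can be processed top to bottom without ever meeting a child before its parent.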
