Oracle 19.3 on Windows.
Trying to figure out how to show a running total.
Inner table shows unique sku_id and its qty sold.
I am trying to show a running total, but in some instances, when I sold the same number of items, running total is not changing.
What am I doing wrong?
select sku_id, sku_qty, sum(sku_qty) over (ORDER BY sku_qty desc) running_total
from (
-- April sales
select
sku_id
,sum(item_qty) sku_qty
from ecomm_order_item
where extract(YEAR from created_date) = 2021
and extract(MONTH from created_date) = 4
group by sku_id
order by sku_qty desc
)
order by sku_qty desc
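You are ordering by sku_qty, and when an analytic function has an ORDER BY the default window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Under RANGE, two rows with the same sku_qty are peers: they share the same position in the ordering, so the SUM adds both of them at once and each tied row shows the same running total.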
If you want them to be counted separately then you need to give them a unique ordering.
For example, you could add sku_id to the ORDER BY clause:
select sku_id,
sku_qty,
sum(sku_qty) over (ORDER BY sku_qty desc, sku_id) running_total
from ...
Or you could use ROWNUM:
select sku_id,
sku_qty,
sum(sku_qty) over (ORDER BY sku_qty desc, ROWNUM) running_total
from ...
Or anything else unique.
Or, you could change the windowing clause from the default RANGE window to use ROWS:
select sku_id,
sku_qty,
sum(sku_qty) over (
ORDER BY sku_qty desc
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) running_total
from ...
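To see the difference between the default RANGE window and a ROWS window on tied quantities, here is a small runnable sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite 3.25 or later, whose default window frame treats ties the same way), and the sku_sales table with its four rows is made up for illustration.

```python
import sqlite3

# SQLite stands in for Oracle here; sku_sales and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sku_sales (sku_id INTEGER, sku_qty INTEGER)")
conn.executemany(
    "INSERT INTO sku_sales VALUES (?, ?)",
    [(1, 10), (2, 5), (3, 5), (4, 2)],  # SKUs 2 and 3 sold the same quantity
)

query = """
SELECT sku_id,
       sku_qty,
       -- default frame: RANGE ... CURRENT ROW, tied rows are summed together
       SUM(sku_qty) OVER (ORDER BY sku_qty DESC) AS range_total,
       -- ROWS frame plus a unique tiebreaker: one row is added at a time
       SUM(sku_qty) OVER (ORDER BY sku_qty DESC, sku_id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_total
FROM sku_sales
ORDER BY sku_qty DESC, sku_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The two rows with sku_qty = 5 share one running total under the default window, while the ROWS version advances on every row. The sku_id tiebreaker is only there to make the order of the tied rows deterministic.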
I have an employee table as below. As you can see, the second-highest salary is 200.
If the second-highest salary is missing, there will be only one row, as shown last. In that case the query should fetch only 100.
I have written the query below, but it is not working. Please help! Thanks
select salary "SecondHighestSalary" from(
(select id,salary,rank() over(order by salary desc) rnk
from employee2)
)a
where (rnk) in coalesce(2,1)
I have also tried the following, but it fetches 2 rows and I need only 1.
It sounds like you'd want something like
with ranked_emp as (
select e.*,
rank() over (order by e.salary desc) rnk,
count(*) over () cnt
from employee2 e
)
select salary "SecondHighestSalary"
from ranked_emp
where (rnk = 2 and cnt > 1)
or (rnk = 1 and cnt = 1)
Note that I'm still using rank since you're using that in your approach and you don't specify how you want to handle ties. If there are two employees with the same top salary, rank will assign both a rnk of 1 and no employee would have a rnk of 2 so the query wouldn't return any data. dense_rank would ensure that there was at least one employee with a rnk of 2 if there were employees with at least 2 different salaries. If there are two employees with the same top salary, row_number would arbitrarily assign one the rnk of 2. The query I posted isn't trying to handle those duplicate situations because you haven't outlined exactly how you'd want it to behave in each instance.
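The tie behaviour described above can be checked with a short runnable sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite 3.25 or later), and the four employee2 rows, with two employees sharing the top salary, are invented for illustration.

```python
import sqlite3

# SQLite stands in for Oracle; the employee2 rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee2 (id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee2 VALUES (?, ?)",
    [(1, 300), (2, 300), (3, 200), (4, 100)],  # ids 1 and 2 tie for the top salary
)

query = """
SELECT id,
       salary,
       RANK()       OVER (ORDER BY salary DESC)     AS rnk,
       DENSE_RANK() OVER (ORDER BY salary DESC)     AS dense_rnk,
       -- id is added as a tiebreaker only so this demo is deterministic;
       -- without it the database picks the order of tied rows arbitrarily
       ROW_NUMBER() OVER (ORDER BY salary DESC, id) AS row_num
FROM employee2
ORDER BY salary DESC, id
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

With this data rank skips the value 2 entirely, dense_rank gives 2 to the salary of 200, and row_number gives 2 to one of the two top earners.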
If you are in Oracle 12.2 or higher, you can try:
select distinct
nvl
(
nth_value(salary, 2) from first
over(order by salary desc
range between unbounded preceding and unbounded
following),
salary
) second_max_salary
from employee2
I have a table called Orders. I want to get the maximum number of orders per hour for each day, starting from the following query:
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE -2)
group by trunc(created,'HH') ORDER BY counts DESC
This returns a count for every hour. The result looks good, but now I want only the row with the maximum count for each day, e.g.
for 12/23/2019 max number of counts is 90 for "12/23/2019 4:00:00 PM",
for 12/22/2019 max number of counts is 25 for "12/22/2019 3:00:00 PM"
required dataset
1 12/23/2019 4:00:00 PM 90
2 12/24/2019 12:00:00 PM 76
3 12/22/2019 1:00:00 PM 25
This could be the solution and in my opinion is the most trivial.
Use the WITH clause to make a sub query then search for the greatest value in the data set on a specific date.
WITH ORD AS (
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE-2)
group by trunc(created,'HH')
)
SELECT *
FROM ORD ord
WHERE NOT EXISTS (
SELECT 'X'
FROM ORD ord1
WHERE trunc(ord1.dated) = trunc(ord.dated) AND ord1.Counts > ord.Counts
)
Use the ROW_NUMBER analytic function over your original query and keep only the rows numbered 1.
You need to partition by the day, i.e. TRUNC(dated), to get the correct result:
with ord1 as (
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE -2)
group by trunc(created,'HH')
),
ord2 as (
select dated, Counts,
row_number() over (partition by trunc(dated) order by Counts desc) as rn
from ord1)
select dated, Counts
from ord2
where rn = 1
The advantage of using ROW_NUMBER is that it handles ties predictably, i.e. cases where several hours in a day share the same maximum count. The query returns only one row per day, and you can control which one by adding a second expression to the ORDER BY, e.g. to show the first or the last such hour.
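Here is a small runnable sketch of that tie handling. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite 3.25 or later). To keep date handling out of the way, it starts from an already aggregated, hypothetical hourly_counts table with made-up rows, in which two hours of 2019-12-23 tie at 90.

```python
import sqlite3

# SQLite stands in for Oracle; hourly_counts and its rows are hypothetical
# and play the role of the ord1 subquery (one row per day and hour).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly_counts (day TEXT, hour INTEGER, counts INTEGER)")
conn.executemany(
    "INSERT INTO hourly_counts VALUES (?, ?, ?)",
    [
        ("2019-12-23", 16, 90),
        ("2019-12-23", 10, 90),  # ties with 16:00
        ("2019-12-23", 9, 40),
        ("2019-12-22", 15, 25),
        ("2019-12-22", 13, 12),
    ],
)

query = """
SELECT day, hour, counts
FROM (
    SELECT day, hour, counts,
           -- highest count first; among ties the earliest hour wins
           ROW_NUMBER() OVER (PARTITION BY day ORDER BY counts DESC, hour) AS rn
    FROM hourly_counts
)
WHERE rn = 1
ORDER BY day
"""
busiest = conn.execute(query).fetchall()
for row in busiest:
    print(row)
```

Changing the tiebreaker to `hour DESC` would return 16:00 instead of 10:00 for 2019-12-23, still with exactly one row per day.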
You can use the analytic function ROW_NUMBER as follows to get the desired result:
SELECT DATED, COUNTS
FROM (
SELECT
TRUNC(CREATED, 'HH') AS DATED,
COUNT(*) AS COUNTS,
ROW_NUMBER() OVER(
PARTITION BY TRUNC(CREATED)
ORDER BY COUNT(*) DESC NULLS LAST
) AS RN
FROM ORDERS
WHERE CREATED > TRUNC(SYSDATE - 2)
GROUP BY TRUNC(CREATED, 'HH'), TRUNC(CREATED)
)
WHERE RN = 1
Cheers!!
I am trying to select multiple rows of data into one row through multiple columns which will change dynamically.
This is in an Oracle database. I want to count the repeated work done by each LEAD_TECHNISIAN_ID within a period. If the difference between the previous job's delivery date and the new job's received date is 15 days or less, the LEAD_TECHNISIAN_ID has one repeated job.
SELECT *
FROM (WITH CTE AS (
SELECT ROW_NUMBER () OVER (ORDER BY ID) AS RW,
RECEIVED_DATE,
DELIVERY_DATE,
SERVICE_NO,
LEAD_TECHNISIAN_ID,
ID,
SERVICE_CENTER
FROM ( SELECT cc.SERVICE_CENTER,
CC.ID,
CC.BARCODE,
TRUNC (cc.CREATED_DATE) RECEIVED_DATE,
TRUNC (CC.DELIVERY_DATE) DELIVERY_DATE,
cc.SERVICE_NO,
CC.LEAD_TECHNISIAN_ID
FROM customer_complains cc
WHERE cc.BARCODE IN (SELECT BARCODE
FROM (SELECT BARCODE,
COUNT (BARCODE)
FROM customer_complains c
WHERE c.BARCODE <> 'UNDEFINE'
AND C.BARCODE = NVL ('351950102757821', BARCODE)
AND c.SEGMENT3 = NVL ('',c.SEGMENT3)
AND c.SEGMENT3 IN (SELECT SEGMENT3
FROM ITEM_MST
WHERE PRODUCT_GROUP = NVL ('',PRODUCT_GROUP))
GROUP BY c.BARCODE
HAVING COUNT (c.BARCODE) >1))
ORDER BY ID DESC)
ORDER BY ID DESC)
SELECT a.id,
a.DELIVERY_DATE,
a.RECEIVED_DATE,
b.RECEIVED_DATE PRE_RCV,
b.DELIVERY_DATE PRE_DEL,
(a.RECEIVED_DATE - b.DELIVERY_DATE) AS DIFF,
a.SERVICE_NO,
a.LEAD_TECHNISIAN_ID,
b.LEAD_TECHNISIAN_ID PRE_TECH --, a.DELIVERY_DATE
FROM CTE a
LEFT JOIN CTE b ON a.RW = b.RW + 1
)
WHERE DIFF <= 15
Here is the output for a specific barcode. But when I run it for all the barcodes in my customer_complains table, the query returns irrelevant output.
Currently your code numbers the rows 1, 2, 3, 4... across the whole result, irrespective of LEAD_TECHNISIAN_ID, and then joins on RW, so a row can be paired with a row that belongs to a different technician.
RW must restart at 1 for each LEAD_TECHNISIAN_ID.
Change the calculation of RW as follows, and add a.LEAD_TECHNISIAN_ID = b.LEAD_TECHNISIAN_ID to the join condition so that only rows of the same technician are paired:
ROW_NUMBER () OVER (PARTITION BY LEAD_TECHNISIAN_ID ORDER BY ID) AS RW
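A minimal runnable sketch of the per-technician numbering follows. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite 3.25 or later). The jobs table, its rows, and the integer day columns (used instead of real dates so the subtraction stays simple) are all hypothetical, and the sketch assumes the self-join also matches on the technician so that rows of different technicians are never paired.

```python
import sqlite3

# SQLite stands in for Oracle; jobs and its rows are hypothetical.
# received_day / delivery_day are plain day numbers instead of DATE values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER, tech_id TEXT, received_day INTEGER, delivery_day INTEGER)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        (1, "T1", 0, 5),
        (2, "T2", 3, 8),
        (3, "T1", 12, 20),  # received 7 days after T1's previous delivery
        (4, "T2", 40, 45),
        (5, "T1", 50, 55),
    ],
)

query = """
WITH cte AS (
    SELECT id, tech_id, received_day, delivery_day,
           -- numbering restarts at 1 for every technician
           ROW_NUMBER() OVER (PARTITION BY tech_id ORDER BY id) AS rw
    FROM jobs
)
SELECT a.id, a.tech_id, a.received_day - b.delivery_day AS diff
FROM cte a
JOIN cte b
  ON a.tech_id = b.tech_id   -- pair rows of the same technician only
 AND a.rw = b.rw + 1         -- with the technician's previous job
WHERE a.received_day - b.delivery_day <= 15
ORDER BY a.id
"""
repeats = conn.execute(query).fetchall()
for row in repeats:
    print(row)
```

Only job 3 qualifies here: it is the second job of T1 and arrived 7 days after T1's first job was delivered.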
Cheers!!
We have a requirement to find the top N regions by their price sum and then the top N customers within each of those regions.
Sample Data.
REGION_NAME,CUSTOMER_NAME,PRICE
RG1,Customer1,100
RG1,Customer2,200
RG1,Customer3,100
RG2,Customer4,100
RG2,Customer5,200
RG2,Customer6,400
RG3,Customer7,100
RG3,Customer8,200
RG3,Customer9,500
RG3,Customer9,200
Assume we want the top 2 regions and the top 2 customers within each region, ranked by summed price:
Region_name,Region_sum,Customer_name,Customer_price (Sum)
RG3,1000,Customer9,700 (Sum of customer price)
RG3,1000,Customer8,200
RG2,700,Customer6,400
RG2,700,customer5,200
How can we write a Hive query for this? We cannot work out how to express it in Hive. Would we have to write MapReduce or Pig instead?
You can do this in Hive using analytic functions and a self-join:
select regions_ranked.region_name, regions_ranked.region_sum, customers_ranked.customer_name, customers_ranked.customer_sum from
(
select region_name, customer_name, customer_sum, rank() over (partition by region_name order by customer_sum desc) as customer_rank from (
select region_name, customer_name, sum(price) as customer_sum
from foo group by region_name, customer_name
) customers_sum
) customers_ranked
join
(
select region_name, region_sum, rank() over (order by region_sum desc) as region_rank from (
select region_name, sum(price) as region_sum
from foo group by region_name
) regions_sum
) regions_ranked
on customers_ranked.region_name = regions_ranked.region_name
where region_rank <= 2 and customer_rank <= 2;
This gives the exact output that you were looking for, although out of order. You can tack on an "order by" clause at the very end if you want that.
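As a quick check against the sample data from the question, here is a runnable sketch that loads those ten rows and runs the same query shape with an ORDER BY added at the end. It uses Python's sqlite3 module as a stand-in for Hive (an assumption: SQLite 3.25 or later, which accepts the same rank() syntax), with the table named foo as in the query above.

```python
import sqlite3

# SQLite stands in for Hive; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (region_name TEXT, customer_name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [
        ("RG1", "Customer1", 100), ("RG1", "Customer2", 200), ("RG1", "Customer3", 100),
        ("RG2", "Customer4", 100), ("RG2", "Customer5", 200), ("RG2", "Customer6", 400),
        ("RG3", "Customer7", 100), ("RG3", "Customer8", 200), ("RG3", "Customer9", 500),
        ("RG3", "Customer9", 200),
    ],
)

query = """
SELECT r.region_name, r.region_sum, c.customer_name, c.customer_sum
FROM (
    -- rank customers inside each region by their summed price
    SELECT region_name, customer_name, customer_sum,
           RANK() OVER (PARTITION BY region_name ORDER BY customer_sum DESC) AS customer_rank
    FROM (SELECT region_name, customer_name, SUM(price) AS customer_sum
          FROM foo GROUP BY region_name, customer_name) customers_sum
) c
JOIN (
    -- rank regions by their summed price
    SELECT region_name, region_sum,
           RANK() OVER (ORDER BY region_sum DESC) AS region_rank
    FROM (SELECT region_name, SUM(price) AS region_sum
          FROM foo GROUP BY region_name) regions_sum
) r
  ON c.region_name = r.region_name
WHERE r.region_rank <= 2 AND c.customer_rank <= 2
ORDER BY r.region_sum DESC, c.customer_sum DESC
"""
top = conn.execute(query).fetchall()
for row in top:
    print(row)
```

Because rank() is used, a tie on the cut-off position would return more than two rows for that region; row_number() would cap it at exactly two.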
Table A
ID EmpNo Grade
--------------------
1 100 HIGH
2 105 LOW
3 100 MEDIUM
4 100 LOW
5 105 LOW
Query:
select *
from A
where EMPNO = 100
and rownum <= 2
order by ID desc
I tried this query to retrieve the max and max-1 rows. I need to compare the grade of max and max-1; if they are equal I need to set a flag to 'Y', otherwise 'N', without using a cursor. I also don't want to scan the table twice.
Please help me.
ROWNUM is applied before ORDER BY, so you need to nest the query like this:
select * from
(select * from A where EMPNO =100 order by ID desc)
where rownum<=2
That only performs one table scan (or it may use an index on EMPNO).
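Alternatively, a single pass with analytic functions can compute the flag directly:
select *
from (
select id, empno, grade
-- compare each row's grade with the previous row in descending ID order;
-- on the rn = 2 row (max-1) that previous row is the max row
, case
when lag(grade) over (order by id desc) = grade
then 'Y'
else 'N'
end
as flag
, row_number() over (order by id desc) as rn
from A
where empno = 100
)
where rn <= 2
;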
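For a runnable check of the max versus max-1 comparison on the sample Table A, here is a sketch using Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite 3.25 or later). It uses LEAD rather than LAG so that the flag lands on the newest row and a single row can be returned; the helper name latest_flag is made up for this example.

```python
import sqlite3

# SQLite stands in for Oracle; the rows are the sample Table A from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, empno INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, 100, "HIGH"), (2, 105, "LOW"), (3, 100, "MEDIUM"), (4, 100, "LOW"), (5, 105, "LOW")],
)

query = """
SELECT id, grade, flag
FROM (
    SELECT id,
           grade,
           -- in descending ID order, LEAD looks at the next-older row (max-1)
           CASE WHEN LEAD(grade) OVER (ORDER BY id DESC) = grade
                THEN 'Y' ELSE 'N' END AS flag,
           ROW_NUMBER() OVER (ORDER BY id DESC) AS rn
    FROM A
    WHERE empno = ?
)
WHERE rn = 1
"""

def latest_flag(empno):
    """Return (id, grade, flag) for the newest row of one employee (hypothetical helper)."""
    return conn.execute(query, (empno,)).fetchone()

print(latest_flag(100))
print(latest_flag(105))
```

Employee 100 goes from MEDIUM to LOW, so the flag is 'N'; employee 105 has LOW twice, so the flag is 'Y'. An employee with a single row gets 'N', because the comparison with a missing previous grade is NULL.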