Group and take average based on count - oracle

I have a table with three columns. I need to take the average of one column (duration), grouped by the repeated values of another column (phase).
id phase duration
19743 D17 1.66
19743 D06 2.25
19743 C17 2.3
19743 D06 4.44
In the data above, D06 has two entries. Instead of two rows, I need a single row with the average for D06, alongside the other phases.
The final output should be like this.
id phase duration
19743 D17 1.66
19743 D06 3.35
19743 C17 2.3

You can use this query:
with table1 as (
select 19743 as id, 'D17' as phase, 1.66 as duration from dual union all
select 19743 as id, 'D06' as phase, 2.25 as duration from dual union all
select 19743 as id, 'C17' as phase, 2.3 as duration from dual union all
select 19743 as id, 'D06' as phase, 4.44 as duration from dual
)
select id, phase, round(avg(duration) ,2)
from table1
group by id, phase
ID PHA ROUND(AVG(DURATION),2)
---------- --- ----------------------
19743 D06 3,35
19743 D17 1,66
19743 C17 2,3
Thank you

I found an easy way to achieve this using a subquery. I am posting it here in case it is useful to someone else.
SELECT ID,PHASE,AVG(DURATION)
FROM (SELECT ID, DURATION, PHASE
FROM TR_TRANSACTION
WHERE ID =19743
)GROUP BY ID,PHASE;
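To make the grouping concrete, here is a minimal Python sketch of what GROUP BY id, phase with AVG(duration) computes on the sample rows. The function name average_by_phase is made up for illustration, and Decimal is used on the assumption that exact decimal arithmetic is wanted, so that 3.345 rounds to 3.35 as in the query output above.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# Sample rows from the question: (id, phase, duration).
# Decimal keeps the arithmetic exact; binary floats could round 3.345 either way.
rows = [
    (19743, "D17", Decimal("1.66")),
    (19743, "D06", Decimal("2.25")),
    (19743, "C17", Decimal("2.3")),
    (19743, "D06", Decimal("4.44")),
]

def average_by_phase(rows):
    """Hypothetical helper: average duration per (id, phase), rounded to 2 places."""
    groups = defaultdict(list)
    for id_, phase, duration in rows:
        groups[(id_, phase)].append(duration)  # one bucket per GROUP BY key
    return {
        key: (sum(vals) / len(vals)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        for key, vals in groups.items()
    }

result = average_by_phase(rows)
for (id_, phase), avg in result.items():
    print(id_, phase, avg)
```

The two D06 rows fall into the same bucket, so they come back as a single averaged row, while D17 and C17 each keep their own value.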

Related

Filtering a dataset based on condition SQL Oracle

I have the following input and the expected output I am looking for. Basically, I would like to filter request_id only when the type = crossborder and then show the units. I think I would need to use some MIN and MAX, but I am not sure how to use them.
Input
request_id type unit_count
A11 local 10
A11 crossborder 5
B11 local 15
C11 crossborder 25
Output
request_id type unit_count
C11 crossborder 25
I think what you can use here is ranking. You can rank the rows by any column and order by its value so that the highest value comes first; that row gets rank 1, and all you have to do afterwards is filter for rank 1.
This little demo will show what I am talking about:
with data as (
select 1 as id, 'cross' as type, 27 as unit from dual union
select 1 as id, 'cross' as type, 23 as unit from dual union
select 1 as id, 'cross' as type, 2 as unit from dual union
select 3 as id, 'cross' as type, 25 as unit from dual union
select 2 as id, 'cross' as type, 23 as unit from dual union
select 5 as id, 'cross' as type, 2 as unit from dual)
select id, type,unit, rank() over ( order by unit desc) ranking from data where type='cross'
As I understand the question, you don't need MAX or MIN, simply a count of rows per request_id:
SELECT request_id, type, unit_count
FROM (SELECT request_id, type, unit_count,
COUNT(*) OVER (PARTITION BY request_id) CNT
FROM YOUR_TABLE)
WHERE CNT = 1
AND type = 'crossborder';
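As a rough illustration of what COUNT(*) OVER (PARTITION BY request_id) combined with the outer filter does, here is a small Python sketch on the input from the question. The function name only_crossborder is hypothetical.

```python
from collections import Counter

# Input rows from the question: (request_id, type, unit_count).
rows = [
    ("A11", "local", 10),
    ("A11", "crossborder", 5),
    ("B11", "local", 15),
    ("C11", "crossborder", 25),
]

def only_crossborder(rows):
    """Hypothetical helper: keep rows whose request_id appears exactly once
    and whose type is 'crossborder'."""
    # Equivalent of COUNT(*) OVER (PARTITION BY request_id)
    counts = Counter(request_id for request_id, _, _ in rows)
    return [r for r in rows if counts[r[0]] == 1 and r[1] == "crossborder"]

filtered = only_crossborder(rows)
print(filtered)
```

A11 is dropped because it has two rows, and B11 is dropped because its single row is local, which leaves only C11.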

alternate of intersect set operator in oracle

I have a table emp1 in which I am interested only in the employees who joined with a salary less than 2000 and whose salary is greater than 2000 now. This is the case for only one person, Ward, as shown below. I prepared the answer with INTERSECT, but I wanted to know if there is a more efficient way of doing it. Please let me know; that will be of great help to me.
(select empno,deptno
from emp1
where sal<2000
group by empno,hiredate,deptno
)
intersect
(select empno,deptno
from emp1
where sal>2000
group by empno,hiredate,deptno
)
Thanks
First, here's how you can get the specific employees who satisfy your conditions (as modified in a comment): Earliest salary < 2000, current (most recent) salary > 2500. Note that in my sample data employee 1008 started at 1300 and had salary > 2500 at some point, but his current salary is < 2500 so he is not selected.
The query is as efficient as possible: it performs a standard aggregation and nothing else. The conditions are in the having clause. The first/last aggregate function, even though it is exceptionally useful, is ignored by a vast majority of programmers - for no good reason.
with
sal_hist (empno, sal_date, sal) as (
select 1003, date '2000-01-01', 2300 from dual union all
select 1003, date '2008-01-01', 2600 from dual union all
select 1008, date '2002-03-20', 1300 from dual union all
select 1008, date '2005-01-31', 2600 from dual union all
select 1008, date '2013-11-01', 2400 from dual union all
select 2025, date '2008-03-01', 1900 from dual union all
select 2025, date '2015-04-01', 2550 from dual
)
select empno
from sal_hist
group by empno
having min(sal) keep (dense_rank first order by sal_date) < 2000
and min(sal) keep (dense_rank last order by sal_date) > 2500
;
EMPNO
----------
2025
To get the count of such employees, wrap the above query within an outer query, with select count(*) as my_count from ( <above query> ).
For extra credit, try to understand why the following query also works. It's more compact (and possibly faster, even though not by much), but a bit harder to understand - and especially, to understand why I need min(empno) rather than simply empno or * within the count() call.
select count(min(empno)) as my_count
from sal_hist
group by empno
having min(sal) keep (dense_rank first order by sal_date) < 2000
and min(sal) keep (dense_rank last order by sal_date) > 2500
;
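For readers unfamiliar with KEEP (DENSE_RANK FIRST/LAST), the following Python sketch mirrors what the HAVING clause checks for each employee. It assumes one salary row per employee per date (the SQL version also copes with ties on sal_date by applying MIN), and the function name qualifying_employees is made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Same sample data as the sal_hist CTE above: (empno, sal_date, sal).
sal_hist = [
    (1003, date(2000, 1, 1), 2300),
    (1003, date(2008, 1, 1), 2600),
    (1008, date(2002, 3, 20), 1300),
    (1008, date(2005, 1, 31), 2600),
    (1008, date(2013, 11, 1), 2400),
    (2025, date(2008, 3, 1), 1900),
    (2025, date(2015, 4, 1), 2550),
]

def qualifying_employees(sal_hist, start_below=2000, current_above=2500):
    """Hypothetical helper: employees whose earliest salary is below
    start_below and whose most recent salary is above current_above."""
    history = defaultdict(list)
    for empno, sal_date, sal in sal_hist:
        history[empno].append((sal_date, sal))
    selected = []
    for empno, entries in history.items():
        entries.sort()               # order by sal_date
        first_sal = entries[0][1]    # salary on the earliest date
        last_sal = entries[-1][1]    # salary on the latest date
        if first_sal < start_below and last_sal > current_above:
            selected.append(empno)
    return sorted(selected)

selected = qualifying_employees(sal_hist)
print(selected)
```

Only the first and last rows of each employee's history matter here, which is why 1008 is excluded even though it had a salary above 2500 at some point.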

Grouping based on value by date range

Hi, I have daily data as below:
daytime value
01.01.2017 20000
02.01.2017 20000
03.01.2017 20000
04.01.2017 35000
05.01.2017 35000
06.01.2017 40000
07.01.2017 40000
08.01.2017 50000
How can I get it in a date range format such as the one below?
FromDate ToDate Value
01.01.2017 03.01.2017 20000
04.01.2017 05.01.2017 35000
06.01.2017 07.01.2017 40000
08.01.2017 08.01.2017 50000
Thanks!
Tabibitosan handles this very easily:
WITH your_table AS (SELECT to_date('01/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('02/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('03/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual UNION ALL
SELECT to_date('04/01/2017', 'dd/mm/yyyy') daytime, 35000 VALUE FROM dual UNION ALL
SELECT to_date('05/01/2017', 'dd/mm/yyyy') daytime, 35000 VALUE FROM dual UNION ALL
SELECT to_date('06/01/2017', 'dd/mm/yyyy') daytime, 40000 VALUE FROM dual UNION ALL
SELECT to_date('07/01/2017', 'dd/mm/yyyy') daytime, 40000 VALUE FROM dual UNION ALL
SELECT to_date('08/01/2017', 'dd/mm/yyyy') daytime, 50000 VALUE FROM dual UNION ALL
SELECT to_date('09/01/2017', 'dd/mm/yyyy') daytime, 20000 VALUE FROM dual)
-- end of mimicking your table with data in it. See SQL below:
SELECT MIN(daytime) fromdate,
MAX(daytime) todate,
VALUE
FROM (SELECT daytime,
VALUE,
row_number() OVER (ORDER BY daytime) - row_number() OVER (PARTITION BY VALUE ORDER BY daytime) grp
FROM your_table)
GROUP BY grp,
VALUE
ORDER BY MIN(daytime);
FROMDATE TODATE VALUE
---------- ---------- ----------
01/01/2017 03/01/2017 20000
04/01/2017 05/01/2017 35000
06/01/2017 07/01/2017 40000
08/01/2017 08/01/2017 50000
09/01/2017 09/01/2017 20000
What this does is compare the row number for all the rows ordered by date, and then the row number for all the rows for each value ordered by date. If the value rows are consecutive in the main set of data, then the difference between the two sets of data remains the same, so you can then group by that. If there is a gap, then the difference increases.
In your example above, the first three rows for value = 20000 happen to be the first three rows of the whole set, so the difference will be 0. However the fourth value = 20000 row is the 9th row in the whole set, so the difference is now 5. You can easily see that the value of 20000 falls into two groups, and as such, you can find the min/max daytime for each group separately by including that difference calculation in the group by clause.
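The row-number difference described above can be traced step by step in a short Python sketch. It reuses the nine sample rows from the query; the function name date_ranges is hypothetical, and the sketch only mirrors the logic rather than the way the database executes it.

```python
from collections import defaultdict
from datetime import date

# Same nine rows as the your_table CTE above: (daytime, value).
your_table = [
    (date(2017, 1, day), value)
    for day, value in [
        (1, 20000), (2, 20000), (3, 20000), (4, 35000), (5, 35000),
        (6, 40000), (7, 40000), (8, 50000), (9, 20000),
    ]
]

def date_ranges(rows):
    """Hypothetical helper: group consecutive rows with the same value
    using the difference of two row numbers."""
    rows = sorted(rows)                # ORDER BY daytime
    per_value_rn = defaultdict(int)    # row_number() per value
    groups = {}
    for overall_rn, (daytime, value) in enumerate(rows, start=1):
        per_value_rn[value] += 1
        grp = overall_rn - per_value_rn[value]  # constant within a consecutive run
        key = (grp, value)                      # GROUP BY grp, value
        start, _ = groups.get(key, (daytime, daytime))
        groups[key] = (start, daytime)          # MIN(daytime), MAX(daytime)
    return sorted((start, end, value) for (_, value), (start, end) in groups.items())

ranges = date_ranges(your_table)
for start, end, value in ranges:
    print(start, end, value)
```

Note that the 40000 rows and the last 20000 row both end up with a difference of 5, which is why the value has to be part of the grouping key as well.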
N.B. This does assume that the dates in your data are consecutive or that if there are missing dates that you assume the value stays the same for the missing dates. If you do have missing days and you want the values across a gap to show in different groups, you'd need to outer join to a subquery that contains the missing dates. In that case, I think GurV's answer (with the additional clause in the case statement that I mentioned in the comments) would be the best one to use, as that would avoid the need to outer join to a list of consecutive dates.
If I understand correctly, you want to group the value only if they are same for consecutive dates.
You can use window functions to generate groups based on value and increasing date order and then find the required aggregates.
with your_table(daytime ,value) as (
select to_date('13.02.2017','dd.mm.yyyy'),25000 from dual union all
select to_date('14.02.2017','dd.mm.yyyy'),20000 from dual union all
select to_date('15.01.2017','dd.mm.yyyy'),90000 from dual union all
select to_date('16.01.2017','dd.mm.yyyy'),90000 from dual union all
select to_date('17.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('18.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('19.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('20.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('21.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('22.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('23.01.2017','dd.mm.yyyy'),95800 from dual union all
select to_date('24.01.2017','dd.mm.yyyy'),90000 from dual union all
select to_date('25.01.2017','dd.mm.yyyy'),90000 from dual union all
select to_date('26.01.2017','dd.mm.yyyy'),90000 from dual
)
select
min(daytime) fromdate,
max(daytime) todate,
value
from (
select
t.*,
sum(x) over (order by daytime) grp
from (
select
t.*,
case when value = lag(value) over (order by daytime)
then 0 else 1 end x
from your_table t
) t
) t group by grp, value
order by fromdate;
Produces:
FROMDATE TODATE VALUE
15-JAN-17 16-JAN-17 90000
17-JAN-17 23-JAN-17 95800
24-JAN-17 26-JAN-17 90000
13-FEB-17 13-FEB-17 25000
14-FEB-17 14-FEB-17 20000
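The same flag-and-running-sum idea can be sketched in Python to show how the groups form. The data is the sample from the query above, and consecutive_ranges is a made-up name for illustration.

```python
from datetime import date
from itertools import accumulate

# Same rows as the your_table CTE above: (daytime, value).
your_table = (
    [(date(2017, 2, 13), 25000), (date(2017, 2, 14), 20000)]
    + [(date(2017, 1, d), 90000) for d in (15, 16)]
    + [(date(2017, 1, d), 95800) for d in range(17, 24)]
    + [(date(2017, 1, d), 90000) for d in range(24, 27)]
)

def consecutive_ranges(rows):
    """Hypothetical helper: start a new group whenever the value changes."""
    rows = sorted(rows)  # ORDER BY daytime
    values = [value for _, value in rows]
    # x = 0 when the value equals the previous row's value (LAG), otherwise 1
    flags = [
        0 if i > 0 and values[i] == values[i - 1] else 1
        for i in range(len(rows))
    ]
    grps = accumulate(flags)  # SUM(x) OVER (ORDER BY daytime)
    ranges = {}
    for (daytime, value), grp in zip(rows, grps):
        start, _, _ = ranges.get(grp, (daytime, daytime, value))
        ranges[grp] = (start, daytime, value)  # MIN(daytime), MAX(daytime), value
    return [ranges[grp] for grp in sorted(ranges)]

result = consecutive_ranges(your_table)
for start, end, value in result:
    print(start, end, value)
```

Because the running sum only increases when the value changes, the two separate runs of 90000 receive different group numbers.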

Hive: unable to fetch column that is not present in GROUP BY

I have a table in Hive called purchase_data that has a list of all the purchases made.
I need to query this table and find the cust_id, product_id and price of the most expensive product purchased by a customer.
The data in purchase_data table looks like:
cust_id product_id price purchase_data
--------------------------------------------------------
aiman_sarosh apple_iphone5s 55000 01-01-2014
aiman_sarosh apple_iphone6s 65000 01-01-2017
jeff_12 apple_iphone6s 65000 01-01-2017
jeff_12 dell_vostro 70000 01-01-2017
missy_el lenovo_thinkpad 70000 01-02-2017
I have written the code below, but it is not fetching the right rows.
Some rows are getting repeated:
select master.cust_id, master.product_id, master.price
from
(
select cust_id, product_id, price
from purchase_data
) as master
join
(
select cust_id, max(price) as price
from purchase_data
group by cust_id
) as max_amt_purchase
on max_amt_purchase.price = master.price;
output:
aiman_sarosh apple_iphone6s 65000.0
jeff_12 apple_iphone6s 65000.0
jeff_12 dell_vostro 70000.0
jeff_12 dell_vostro 70000.0
missy_el lenovo_thinkpad 70000.0
missy_el lenovo_thinkpad 70000.0
Time taken: 21.666 seconds, Fetched: 6 row(s)
Is there something wrong with the code ?
Use row_number():
select pd.*
from (select pd.*,
row_number() over (partition by cust_id order by price desc) as seqnum
from purchase_data pd
) pd
where seqnum = 1;
This returns one row per cust_id, even if there are ties. If you want multiple rows when there are ties, then use rank() or dense_rank() instead of row_number().
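The difference between keeping one row and keeping ties can be illustrated with a small Python sketch on the sample purchases. The function top_purchase and its keep_ties flag are hypothetical, and the extra hp_pavilion row is invented purely to create a tie.

```python
# Sample rows from the question: (cust_id, product_id, price).
purchases = [
    ("aiman_sarosh", "apple_iphone5s", 55000),
    ("aiman_sarosh", "apple_iphone6s", 65000),
    ("jeff_12", "apple_iphone6s", 65000),
    ("jeff_12", "dell_vostro", 70000),
    ("missy_el", "lenovo_thinkpad", 70000),
]

def top_purchase(rows, keep_ties=False):
    """Hypothetical helper: most expensive purchase per customer.
    keep_ties=False behaves like row_number() = 1, keep_ties=True like rank() = 1."""
    by_customer = {}
    for row in rows:
        by_customer.setdefault(row[0], []).append(row)   # PARTITION BY cust_id
    result = []
    for cust_id in sorted(by_customer):
        ranked = sorted(by_customer[cust_id], key=lambda r: r[2], reverse=True)
        top_price = ranked[0][2]
        if keep_ties:
            result.extend(r for r in ranked if r[2] == top_price)
        else:
            result.append(ranked[0])
    return result

best = top_purchase(purchases)
for row in best:
    print(row)

# An invented extra row that ties with jeff_12's most expensive purchase.
with_tie = purchases + [("jeff_12", "hp_pavilion", 70000)]
print(len(top_purchase(with_tie)), len(top_purchase(with_tie, keep_ties=True)))
```

Partitioning by customer is what avoids the repeated rows in the original query, which matched on price alone.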
I changed the code, and it's working now:
select master.cust_id, master.product_id, master.price
from
purchase_data as master,
(
select cust_id, max(price) as price
from purchase_data
group by cust_id
) as max_price
where master.cust_id=max_price.cust_id and master.price=max_price.price;
output:
aiman_sarosh apple_iphone6s 65000.0
missy_el lenovo_thinkpad 70000.0
jeff_12 dell_vostro 70000.0
Time taken: 55.788 seconds, Fetched: 3 row(s)

Oracle Query for getting CURRENT CTC (Salary) of Each Employee

I want the current CTC of each employee. The following is the design of my table:
Ecode Implemented Date Salary
7654323 2010-05-20 350000
7654322 2010-05-17 250000
7654321 2003-04-01 350000
7654321 2004-04-01 450000
7654321 2005-04-01 750000
7654321 2007-04-01 650000
I want an Oracle query for the following output:
Ecode Salary
7654321 650000
7654322 250000
7654323 350000
thanks in advance
See also
Oracle Query for getting MAximum CTC (Salary) of Each Employee
If you want to keep the last salary for each ecode sorted by implemented_date:
WITH data AS (
SELECT 7654323 Ecode, '2010-05-20' Implemented_Date, 350000 Salary
FROM DUAL
UNION ALL SELECT 7654322, '2010-05-17', 250000 FROM DUAL
UNION ALL SELECT 7654321, '2003-04-01', 350000 FROM DUAL
UNION ALL SELECT 7654321, '2004-04-01', 450000 FROM DUAL
UNION ALL SELECT 7654321, '2005-04-01', 750000 FROM DUAL
UNION ALL SELECT 7654321, '2007-04-01', 650000 FROM DUAL
)
SELECT ecode,
MAX(salary)
KEEP (dense_rank FIRST ORDER BY Implemented_Date DESC) sal
FROM DATA
GROUP BY ecode;
ECODE SAL
---------- ----------
7654321 650000
7654322 250000
7654323 350000
SELECT *
FROM salary s
INNER JOIN
(SELECT ecode, MAX(implemented_date) as implemented_date
FROM salary GROUP BY ecode) curr
ON curr.ecode = s.ecode AND curr.implemented_date = s.implemented_date
I'd use analytic functions for that. You want to select the first value of salary for each ecode, ordered by implementeddate descending so that the latest row comes first.
select
distinct
first_value(ecode) OVER (PARTITION BY ecode ORDER BY IMPLEMENTEDDATE DESC NULLS LAST) Ecode,
first_value(implementeddate) OVER (PARTITION BY ecode ORDER BY IMPLEMENTEDDATE DESC NULLS LAST) ImplementedDate,
first_value(salary) OVER (PARTITION BY ecode ORDER BY IMPLEMENTEDDATE DESC NULLS LAST) Salary
from
tbl_Salary;
The "DISTINCT" collapses the duplicate rows that would otherwise be returned: without it, the query returns the same row once for each of the four source rows of Ecode=7654321.
The result is:
ECODE IMPLEMENTEDDATE SALARY
----- --------------- ------
7654321 01/04/2007 650000
7654322 17/05/2010 250000
7654323 20/05/2010 350000
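All three answers pick, for each ecode, the salary on the latest implemented date. A minimal Python sketch of that selection follows, using the sample rows from the question; the function name current_salary is made up for illustration, and the dates are kept as ISO-formatted strings, which compare correctly as text.

```python
# Sample rows from the question: (ecode, implemented_date, salary).
rows = [
    (7654323, "2010-05-20", 350000),
    (7654322, "2010-05-17", 250000),
    (7654321, "2003-04-01", 350000),
    (7654321, "2004-04-01", 450000),
    (7654321, "2005-04-01", 750000),
    (7654321, "2007-04-01", 650000),
]

def current_salary(rows):
    """Hypothetical helper: salary on the most recent implemented_date per ecode."""
    latest = {}
    for ecode, implemented_date, salary in rows:
        # Keep the row with the greatest date seen so far for this ecode
        if ecode not in latest or implemented_date > latest[ecode][0]:
            latest[ecode] = (implemented_date, salary)
    return {ecode: salary for ecode, (_, salary) in sorted(latest.items())}

current = current_salary(rows)
for ecode, salary in current.items():
    print(ecode, salary)
```

For 7654321 this yields 650000 rather than the maximum of 750000, because the choice is driven by the date and not by the salary.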
