Problem with MINUS and sub queries with ORDER BYs - oracle

select salary
from (
(select salary
from employees
where rownum<=10
order by salary desc)
minus
(select salary
from employees
where rownum<=4
order by salary desc)
);

You cannot use ORDER BY there.
Try this instead:
select salary from (
select salary, row_number() over ( order by salary desc ) rn
from employees )
where rn between 5 and 10;
On Oracle 12c or later, you can also do this:
select salary from employees
order by salary desc
offset 4 rows fetch next 6 rows only;

You've got several issues in what you've written. The immediate problem is that you'll get an error from having an order by in the first branch of your minus, but just removing that won't help you much.
You're making a (fairly common) mistake with ordering and rownum; looking just at the first subquery you have:
select salary
from employees
where rownum<=10
order by salary desc
The rownum filter will be applied before the order-by, so what this will actually produce is 10 indeterminate rows from the table, which are then ordered. If I run that I get:
SALARY
----------
24000
13000
12000
10000
8300
6500
6000
4400
2600
2600
but you'll see different values, even from the same sample schema. If you look at the whole table you'll see higher values than those; and even running the second query will show something isn't as you expect - for me that gets:
SALARY
----------
13000
4400
2600
2600
which are not the first four rows from the previous query. (Again, you'll see different results, but hopefully the same effect; if not, look at the whole table ordered by salary.)
You need to order the whole table - in a subquery - and then filter:
select salary
from (
select salary
from employees
order by salary desc
)
where rownum<=10
which gives a much more sensible - and consistent - result. You can then minus the two queries:
select salary
from (
select salary
from employees
order by salary desc
)
where rownum<=10
minus
select salary
from (
select salary
from employees
order by salary desc
)
where rownum<=4
order by salary desc;
SALARY
----------
13500
13000
12000
11500
You may be expecting to see six values there, but there are three employees with a salary of 12000, and minus eliminates duplicates so that is only reported once. @Matthew's approach (or @Jeff's!) will give you all six, including duplicates, if that is what you want. It also stops you having to hit the table multiple times.
A further problem is with ties - if the 4th highest was the same as the 5th highest, what would you expect to happen? Using minus would exclude that value; @Matthew's approach would preserve it.
You need to define what you actually want to get - the 5th to 10th highest salary values? The salaries of the 5th to 10th highest-paid people (a subtle but important difference)? Do you really only want the numbers, or who those employees are - in which case how you deal with ties is even more important? Etc. Once you know what you actually need to find you can decide the best way to get that result.
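One way to see how ties would affect each approach is to put Oracle's three ranking functions side by side. This is only a sketch against the same employees table used above; the column aliases rn, rnk and drnk are my own.

```sql
-- Compare how each ranking function numbers rows that share a salary
select salary,
       row_number() over (order by salary desc) as rn,   -- unique position, ties broken arbitrarily
       rank()       over (order by salary desc) as rnk,  -- ties share a rank, gaps follow
       dense_rank() over (order by salary desc) as drnk  -- ties share a rank, no gaps
from employees
order by salary desc;
```

Filtering on rn answers "the 5th to 10th highest-paid people", while filtering on drnk answers "the 5th to 10th highest salary values", so the choice depends on which of those questions you are asking.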

It doesn't make sense to order rows in two sets that are subsequently operated upon, because sets don't have order. If you need a solution that can execute on older versions and you want to return the bottom 6 ranked out of the top 10 ranked, then this will work. If you can use newer features, you may want to, because they may well execute more efficiently.
After making the obvious changes that escaped me in my haste...
select salary
from (
select rownum rn, salary
from (
select salary
from employees
order by salary desc
)
)
where rn between 5 and 10

Related

alternate of intersect set operator in oracle

I have a table emp1 in which I am interested only in the employees who joined with a salary less than 2000 and whose salary is greater than 2000 now. This is the case with only one person, Ward, as shown below. I prepared the answer with intersect but wanted to know if there is a more efficient way of doing it. Please let me know; that would be a great help to me.
(select empno,deptno
from emp1
where sal<2000
group by empno,hiredate,deptno
)
intersect
(select empno,deptno
from emp1
where sal>2000
group by empno,hiredate,deptno
)
Thanks
First, here's how you can get the specific employees who satisfy your conditions (as modified in a comment): Earliest salary < 2000, current (most recent) salary > 2500. Note that in my sample data employee 1008 started at 1300 and had salary > 2500 at some point, but his current salary is < 2500 so he is not selected.
The query is as efficient as possible: it performs a standard aggregation and nothing else. The conditions are in the having clause. The first/last aggregate function, even though it is exceptionally useful, is ignored by a vast majority of programmers - for no good reason.
with
sal_hist (empno, sal_date, sal) as (
select 1003, date '2000-01-01', 2300 from dual union all
select 1003, date '2008-01-01', 2600 from dual union all
select 1008, date '2002-03-20', 1300 from dual union all
select 1008, date '2005-01-31', 2600 from dual union all
select 1008, date '2013-11-01', 2400 from dual union all
select 2025, date '2008-03-01', 1900 from dual union all
select 2025, date '2015-04-01', 2550 from dual
)
select empno
from sal_hist
group by empno
having min(sal) keep (dense_rank first order by sal_date) < 2000
and min(sal) keep (dense_rank last order by sal_date) > 2500
;
EMPNO
----------
2025
To get the count of such employees, wrap the above query within an outer query, with select count(*) as my_count from ( <above query> ).
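Spelled out in full, the wrapped version might look like the sketch below. It simply reuses the sal_hist sample data and the having conditions from the query above; nothing else is assumed.

```sql
with
  sal_hist (empno, sal_date, sal) as (
    select 1003, date '2000-01-01', 2300 from dual union all
    select 1003, date '2008-01-01', 2600 from dual union all
    select 1008, date '2002-03-20', 1300 from dual union all
    select 1008, date '2005-01-31', 2600 from dual union all
    select 1008, date '2013-11-01', 2400 from dual union all
    select 2025, date '2008-03-01', 1900 from dual union all
    select 2025, date '2015-04-01', 2550 from dual
  )
-- outer query counts the employees returned by the aggregation
select count(*) as my_count
from (
  select empno
  from sal_hist
  group by empno
  having min(sal) keep (dense_rank first order by sal_date) < 2000
     and min(sal) keep (dense_rank last  order by sal_date) > 2500
);
```

With this sample data only employee 2025 qualifies, so my_count is 1.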
For extra credit, try to understand why the following query also works. It's more compact (and possibly faster, even though not by much), but a bit harder to understand - and especially, to understand why I need min(empno) rather than simply empno or * within the count() call.
select count(min(empno)) as my_count
from sal_hist
group by empno
having min(sal) keep (dense_rank first order by sal_date) < 2000
and min(sal) keep (dense_rank last order by sal_date) > 2500
;

In case second highest value is missing then use the highest value of salary from employee table

I have an employee table as below. As you can see, the second highest salary is 200.
In case the second highest salary is missing, there will be only one row, as shown at the end. In this case the query should fetch only 100.
I have written the query below, but it is not working. Please help! Thanks
select salary "SecondHighestSalary" from(
(select id,salary,rank() over(order by salary desc) rnk
from employee2)
)a
where (rnk) in coalesce(2,1)
I have also tried the following, but it fetches 2 rows and I need only 1.
It sounds like you'd want something like
with ranked_emp as (
select e.*,
rank() over (order by e.salary desc) rnk,
count(*) over () cnt
from employee2 e
)
select salary "SecondHighestSalary"
from ranked_emp
where (rnk = 2 and cnt > 1)
or (rnk = 1 and cnt = 1)
Note that I'm still using rank since you're using that in your approach and you don't specify how you want to handle ties. If there are two employees with the same top salary, rank will assign both a rnk of 1 and no employee would have a rnk of 2 so the query wouldn't return any data. dense_rank would ensure that there was at least one employee with a rnk of 2 if there were employees with at least 2 different salaries. If there are two employees with the same top salary, row_number would arbitrarily assign one the rnk of 2. The query I posted isn't trying to handle those duplicate situations because you haven't outlined exactly how you'd want it to behave in each instance.
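If ties should share a rank and what you want is "the second highest distinct salary, else the highest", one possible sketch is to use dense_rank and take the lowest salary among the top two ranks. This is an alternative to the query above, not a restatement of it; it assumes the employee2 table and salary column from the question, and it skips null salaries because Oracle sorts nulls first in a descending order.

```sql
-- min() over ranks 1 and 2 is the second highest distinct salary when one exists,
-- and falls back to the highest salary when every row has the same salary
select min(salary) as "SecondHighestSalary"
from (
  select salary,
         dense_rank() over (order by salary desc) as rnk
  from employee2
  where salary is not null
)
where rnk <= 2;
```

Because this is an aggregate without group by, it always returns exactly one row, which is null if employee2 has no rows with a salary.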
If you are in Oracle 12.2 or higher, you can try:
select distinct id,
nvl
(
nth_value(salary, 2) from first
over(partition by id
order by salary desc
range between unbounded preceding and unbounded
following),
salary
) second_max_salary
from employee2

use RANK or DENSE_RANK along with aggregate function

I have a table with the following data:
SCORE  ROW_ID  NAME
0.4    1011    ABC
0.95   1011    DEF
0.4    501     GHI
0.95   501     XYZ
At any point in time, I only need a single row of data with the maximum score; if there is more than 1 record, take the one with the minimum row_id.
Is it possible to achieve by using RANK or DENSE_RANK function? How about partition by?
MAX(score) keep(dense_rank first order by row_id)
You are looking for max score, one row, so use row_number():
select score, row_id, name
from (select t.*, row_number() over (order by score desc, row_id) rn from t)
where rn = 1
You can use rank and dense_rank in your example, but they can return more than one row, for instance when you add row (0.95, 501, 'PQR') to your data.
keep dense_rank is typically used when the value you want to return is different from the value you order by, for instance if we look for the salary of the employee who has worked the longest:
max(salary) keep (dense_rank first order by sysdate - hiredate desc)
max in this case means that if two or more employees have worked the longest, for exactly the same number of days, then we take the highest salary.
max(salary)
keep (dense_rank first order by sysdate - hiredate desc)
over (partition by deptno)
This is the same as above, but the salary of the longest-serving employee is shown for each department separately. You can even use an empty over() to show the salary of the longest-serving employee in a separate column alongside other data like name, salary, hire_date.
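For comparison, the same keep (dense_rank first ...) idea can be applied to the table from the question. This is a sketch only: the table is called t as in the row_number() query above, and the with clause just recreates the four sample rows so the statement can be run on its own.

```sql
with t (score, row_id, name) as (
  select 0.4,  1011, 'ABC' from dual union all
  select 0.95, 1011, 'DEF' from dual union all
  select 0.4,  501,  'GHI' from dual union all
  select 0.95, 501,  'XYZ' from dual
)
select max(score) as score,
       -- smallest row_id among the rows that have the highest score
       min(row_id) keep (dense_rank first order by score desc) as row_id,
       -- name from the row with the highest score and, within that, the lowest row_id
       min(name)   keep (dense_rank first order by score desc, row_id) as name
from t;
```

For the sample rows this returns a single row: 0.95, 501, XYZ. Each extra column needs its own keep clause, which is why row_number() tends to be simpler when you want the whole row.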
You don't need to use dense_rank. This would help:
SELECT * FROM (
SELECT
SCORE,
ROW_ID,
NAME
FROM T
ORDER BY SCORE DESC, ROW_ID
)
WHERE ROWNUM = 1;

Okay so I am trying to select a rownum into a variable, but I need it to give me only one value, so 2 if it's the second one in the list

select rownum into v_rownum
from waitlist
where p_callnum=callnum
order by sysdate;
I tried doing this, but it gives too many values.
And if I do p_snum=snum, it keeps returning 1. I need it to return 2 if it's #2 on the waitlist.
select rn into v_rownum
from (select callnum,
row_number() over (order by sysdate) rn
from waitlist)
where p_snum=snum;
Almost got it to work. Running into issues in the first select. I believe I might have to use v_count instead. Also Ordering by Sysdate even if a second apart will order it correctly.
SNU CALLNUM TIME
--- ---------- ---------
101 10125 11-DEC-18
103 10125 11-DEC-18
BTW time is = date which I entered people into waitlist using sysdate. So I suppose ordering by time could work.
create table waitlist(
snum varchar2(3),
callnum number(8),
time date,
constraint fk_waitlist_snum foreign key(snum) references students(snum),
constraint fk_waitlist_callnum foreign key(callnum) references schclasses(callnum),
primary key(snum,callnum)
);
is the waitlist table.
I used Scott's DEPT table to create your WAITLIST; department numbers represent CALLNUM column:
select * from waitlist;

   CALLNUM WAITER
---------- --------------------
        10 ACCOUNTING
        20 RESEARCH
        30 SALES
        40 OPERATIONS
How do you fetch the data you need?
Using an analytic function (ROW_NUMBER) that orders rows by CALLNUM, you'll know each row's position.
That query is then used as an inline view for the main query, which returns the position in the waitlist for any CALLNUM.
Here's how:
select rn
from (select callnum,
             row_number() over (order by callnum) rn
      from waitlist
     )
where callnum = 30;

        RN
----------
         3
rownum in Oracle is a generated pseudocolumn; it does not refer to any specific row, it is just the position of the row in the result set.
A select into can only return one row (hence the too many rows error), so once you filter down to a single row its rownum will always be 1.
Without more details about your table structure and how you uniquely identify records, it is hard to assist you further with a solution.
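Putting the pieces from this thread together, one possible shape for the lookup is sketched below. It assumes the waitlist table as posted (snum, callnum, time), numbers the rows for one class by the stored time column rather than by sysdate, and uses snum as a tie-breaker. The literal values given to p_snum and p_callnum are placeholders; in the real procedure they would be parameters.

```sql
declare
  -- hypothetical inputs for illustration only
  p_snum    waitlist.snum%type    := '103';
  p_callnum waitlist.callnum%type := 10125;
  v_rownum  pls_integer;
begin
  -- number every student waiting for this class first, then pick out one student
  select rn
  into   v_rownum
  from  (select snum,
                row_number() over (order by time, snum) as rn
         from   waitlist
         where  callnum = p_callnum)
  where  snum = p_snum;

  dbms_output.put_line('Position on waitlist: ' || v_rownum);
exception
  when no_data_found then
    dbms_output.put_line('Student is not on the waitlist for this class');
end;
/
```

The numbering has to happen in the inline view, before the filter on snum; filtering first would leave a single row, which is always numbered 1. Since (snum, callnum) is the primary key, the outer query cannot return more than one row.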

Can anybody explain how this query works?

This is a SQL query to find the Nth highest salary of employees:
SELECT *
FROM emp t
WHERE 1 = (SELECT COUNT(DISTINCT sal)
FROM emp t2
WHERE t2.sal > t.sal)
I don't know how it returns the result. If you put 1 in the WHERE clause, it will return the second highest, and for 2 the 3rd highest salary, and so on.
Please explain the query as I am unsure.
Let me start by saying a better way to write the query is:
select e.*
from (select e.*, dense_rank() over (order by sal desc) as seqnum
from emp e
) e
where seqnum = 2;
What is your query doing? Go step-by-step:
The outer query is doing a comparison for every row in emp.
The comparison counts the number of distinct salaries that are larger than the salary in the row.
The row is kept if there is exactly 1 distinct salary that is larger.
In other words, this is keeping all rows that have the second largest salary. dense_rank() is a much saner way to write the query (and it has better performance too).
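To make the pattern explicit for any N, here is a sketch of both forms using a bind variable. The name :n is my own; the emp table and sal column are the ones from the question.

```sql
-- Correlated-subquery form: a row has the Nth highest salary
-- when exactly N - 1 distinct salaries are larger than its own
select *
from emp t
where :n - 1 = (select count(distinct t2.sal)
                from emp t2
                where t2.sal > t.sal);

-- dense_rank form: number the distinct salaries from highest to lowest
select e.*
from (select e.*, dense_rank() over (order by sal desc) as seqnum
      from emp e
     ) e
where seqnum = :n;
```

The second query also returns the extra seqnum column. The two should pick the same rows as long as sal is never null; with null salaries they can differ, because a descending sort in Oracle puts nulls first and so gives them rank 1.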

Resources