Oracle ROWNUM pseudocolumn - oracle

I have a complex query with GROUP BY and ORDER BY clauses, and I need a sorted row number (1, 2, ..., n-1, n) returned with every row. Using ROWNUM (whose value is assigned to a row after it passes the predicate phase of the query but before the query does any sorting or aggregation) gives me a non-sorted list (4...567...123...45...). I cannot use the application to count and assign numbers to each row.

Is there a reason that you can't just do
SELECT rownum, a.*
FROM (<<your complex query including GROUP BY and ORDER BY>>) a

You could do it as a subquery, so have:
select q.*, rownum from (select... group by etc..) q
That would probably work... don't know if there is anything better than that.

Can you use an in-line query? ie
SELECT cols, ROWNUM
FROM (your query)

Assuming that your query is already ordered in the manner you desire and you just want a number to indicate what row in the order it is:
SELECT ROWNUM AS RowOrderNumber, Col1, Col2,Col3...
FROM (
[Your Original Query Here]
)
and replace "Colx" with the names of the columns in your query.

I also sometimes do something like:
SELECT * FROM
(SELECT X,Y FROM MY_TABLE WHERE Z=16 ORDER BY MY_DATE DESC)
WHERE ROWNUM=1

If you want to use ROWNUM to do anything more than limit the total number of rows returned in a query (e.g. AND ROWNUM < 10) you'll need to alias ROWNUM:
select *
from (select rownum rn, a.* from
(<sorted query>) a)
where rn between 500 and 1000
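To see why the numbering has to happen outside the sorted query, here is a minimal Python sketch (not Oracle itself; the sample rows and variable names are made up for illustration). Numbering before the sort mimics ROWNUM in the same query block as ORDER BY, while numbering after the sort mimics the inline-view approach from the answers above.

```python
# Hypothetical rows as they leave the WHERE phase, in no particular order.
rows = [("carol", 45), ("alice", 4), ("bob", 123)]

# Number first, sort afterwards: like ROWNUM next to ORDER BY in one query block.
numbered_then_sorted = sorted(
    [(n, name, value) for n, (name, value) in enumerate(rows, start=1)],
    key=lambda r: r[2],
)

# Sort first (the inline view), number afterwards (the outer SELECT).
sorted_then_numbered = [
    (n, name, value)
    for n, (name, value) in enumerate(sorted(rows, key=lambda r: r[1]), start=1)
]

print([r[0] for r in numbered_then_sorted])  # numbers come out shuffled
print([r[0] for r in sorted_then_numbered])  # numbers follow the sort order
```

Only the second list carries numbers that follow the sort order, which is what wrapping the sorted query in an outer SELECT achieves.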

Related

Select distinct on specific columns but select other columns also in hive

I have multiple columns in a table in hive having around 80 columns. I need to apply the distinct clause on some of the columns and get the first values from the other columns also. Below is the representation of what I am trying to achieve.
select distinct(col1,col2,col3),col5,col6,col7
from abc where col1 = 'something';
All the columns mentioned above are text columns. So I cannot apply group by and aggregate functions.
You can use row_number function to solve the problem.
create table temp as
select *, row_number() over (partition by col1,col2,col3) as rn
from abc
where col1 = 'something';
select *
from temp
where rn=1
You can also sort the table while partitioning.
row_number() over (partition by col1,col2,col3 order by col4 asc) as rn
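As a rough illustration of what the rn=1 filter does, here is a small Python sketch (not Hive; the function name first_per_group and the sample data are assumptions made up for this example). It keeps one row per (col1, col2, col3) key, choosing the row with the smallest col4, which is what row_number() ordered by col4 followed by rn=1 is meant to return.

```python
def first_per_group(rows, key_cols, order_col):
    """Keep, for each key, the row with the smallest order_col value."""
    best = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        # On ties the first row seen is kept; row_number() makes no such promise.
        if key not in best or row[order_col] < best[key][order_col]:
            best[key] = row
    return list(best.values())


# Hypothetical sample data standing in for table abc.
abc = [
    {"col1": "something", "col2": "x", "col3": "y", "col4": 2, "col5": "later"},
    {"col1": "something", "col2": "x", "col3": "y", "col4": 1, "col5": "first"},
    {"col1": "something", "col2": "p", "col3": "q", "col4": 7, "col5": "only"},
]

result = first_per_group(abc, ["col1", "col2", "col3"], "col4")
```

Each distinct key appears once in result, and the other columns come from the row that sorted first within its group.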
DISTINCT is the most overused and least understood keyword in SQL. It's the last thing that is executed over your entire result set and removes duplicates using ALL columns in your select. You can GROUP BY string columns; in fact, that is the answer here:
SELECT col1,col2,col3,COLLECT_SET(col4),COLLECT_SET(col5),COLLECT_SET(col6)
FROM abc WHERE col1 = 'something'
GROUP BY col1,col2,col3;
Now that I re-read your question though, I'm not really sure what you are after. You might have to join the table to an aggregate of itself.

How to use Oracle hints or other optimization to fix function in where clause performance issue?

This is slow:
select col_x from table_a where col_y in (select col_y from table_b) and fn(col_x)=0;
But I know that this will return 4 rows fast, and that I can run fn() on 4 values fast.
So I do some testing, and I see that this is fast:
select fn(col_x) from table_a where col_y in (select col_y from table_b);
When using the fn() in the where clause, Oracle is running it on every row in table_a. How can I make it so Oracle first uses the col_y filter, and only runs the function on the matched rows?
For example, conceptually, I thought this would work:
with taba as (
select fn(col_x) x from table_a where col_y in (select col_y from table_b)
)
select * from taba where x=0;
because I thought Oracle would run the with clause first, but Oracle is "optimizing" this query and making this run exactly the same as the first query above where fn(col_x)=0 is in the where clause.
I would like this to run just as a query and not in a pl/sql block. It seems like there should be a way to give oracle a hint, or do some other trick, but I can't figure it out. BTW, table is indexed on col_y and it is being used as an access predicate. Stats are up to date.
There are two ways you could go around it,
1) add 'AND rownum >=0' in the subquery to force materialization.
OR
2) use a Case statement inside the query to force the execution priority (maybe)
This works, but if anyone has a better answer, please share:
select col_x
from table_a
where col_y in (select col_y from table_b)
and (select 1 from dual where fn(col_x)=0) = 1;
Kind of kludgy, but works. Takes a query running in 60+ seconds down to .1 seconds.
You could try the HAVING clause in your query. This clause is not executed until the base query is completed, and then the HAVING clause is run on the resulting rows. It's typically used with aggregate functions, but could be useful in your case.
select col_x
from table_a
where col_y in (select col_y from table_b)
having fn(col_x)=0;
A HAVING clause restricts the results of a GROUP BY in a
SelectExpression. The HAVING clause is applied to each group of the
grouped table, much as a WHERE clause is applied to a select list. If
there is no GROUP BY clause, the HAVING clause is applied to the
entire result as a single group. The SELECT clause cannot refer
directly to any column that does not have a GROUP BY clause. It can,
however, refer to constants, aggregates, and special registers.
http://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqlj14854.html
1) Why don't you try joining table_a and table_b on col_y?
select a.col_x from table_a a,table_b b
where a.col_y = b.col_y
and fn(col_x) = 0
2) NO_PUSH_PRED -
select /*+ NO_PUSH_PRED(v) */ col_x from (
select col_x from table_a where col_y in (select col_y from table_b)
) v
where fn(col_x) =0
3) Exists and PUSH_SUBQ.
select col_x from table_a a
where exists( select /*+ PUSH_SUBQ */ 1 from table_b b where a.col_y = b.col_y )
and fn(col_x) = 0;

how to display identical values in a single column using EXCEL or ORACLE

Hello, I need a formula in column 'C' that adds up the amounts in column B based on the ID in column A. If there are several amounts for the same ID, it should add them up and show the total in column 'C' as a single row.
The output can be obtained from an Oracle SQL query or an Excel formula. Your help would be appreciated.
You can get the same output from Oracle itself, using analytic functions as shown below.
SUM() OVER(PARTITION BY ... ) -> this computes the total for each partition (here, each ID)
WITH MYTABLE(ID,AMT) AS
(SELECT '2UF2', '500' FROM DUAL
UNION ALL
SELECT '2TC6', '300' FROM DUAL
UNION ALL
SELECT '2TC6', '200' FROM DUAL
UNION ALL
SELECT '2TC6', '800' FROM DUAL
)
SELECT ID,
AMT,
CASE ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NULL)
WHEN 1
THEN SUM(AMT) OVER(PARTITION BY ID ORDER BY NULL)
END AS FORMULA
FROM MYTABLE
ORDER BY ID, FORMULA NULLS LAST;
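For readers who want to check the expected shape of the output, here is a small Python sketch of the same idea (not Oracle; the amounts are written as numbers rather than the quoted strings used in the query, and the variable names are made up). It computes the total per ID and shows it only on the first row of each ID, leaving the other rows empty.

```python
from collections import defaultdict

# Same sample data as the WITH clause, with numeric amounts.
rows = [("2UF2", 500), ("2TC6", 300), ("2TC6", 200), ("2TC6", 800)]

# Total per ID, the equivalent of SUM(AMT) OVER (PARTITION BY ID).
totals = defaultdict(int)
for id_, amt in rows:
    totals[id_] += amt

# Show the total on the first row of each ID only, like the CASE on ROW_NUMBER().
seen = set()
result = []
for id_, amt in sorted(rows, key=lambda r: r[0]):
    formula = totals[id_] if id_ not in seen else None
    seen.add(id_)
    result.append((id_, amt, formula))

for line in result:
    print(line)
```

Which row of an ID counts as the first one is arbitrary in the SQL version because of ORDER BY NULL; in this sketch it is simply the first one in input order.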
You can use rollup in oracle
Select id, amt, sum(amt)
From table
Group by rollup (id, amt)
For more details see below link
https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets
In SQL you need an aggregation function, in this case sum, and a group by clause. The generic query should look like the following:
Select sum(b) from table group by a
I hope this helps.

select * from (select first_name, last_name from employees)

I understand the meaning of this statement, but I don't understand why we need it.
This is equivalent to
select first_Name, last_name from employees
I see this type of statement in many examples. Can you please explain when we need this? Do we use this type of statement in practice?
Can you please explain when we need this?
These are called Derived Tables.
A "derived table" is essentially a statement-local temporary table
created by means of a subquery in the FROM clause of a SQL SELECT
statement. It exists only in memory and behaves like a standard view
or table.
In SQL, subqueries can only see values from parent queries one level deep.
Do we use this type of statement in practice?
The most common use of it is the classic row-limiting query using ROWNUM.
Row-Limiting query:
SELECT *
FROM (SELECT *
FROM emp
ORDER BY sal DESC)
WHERE ROWNUM <= 5;
Pagination query:
SELECT eno
FROM (SELECT e.empno eno,
ROWNUM rn
FROM (SELECT empno
FROM emp
ORDER BY sal DESC) e)
WHERE rn <= 5;
This kind of statement is useless, you're right, but there are many occasions when you need a subselect because you can't do everything in one statement. Off the top of my head, one example is combining aggregate functions, such as getting the min, max and avg of a sum:
select min(t.summed), max(t.summed), avg(t.summed)
from (select type, sum(value) as summed from table1 group by type) t
This is just off the top of my head, but I have encountered many occasions where subselects in the FROM clause were necessary. Once the statements are complex enough, you'll see it.
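If you want to try the aggregate-of-an-aggregate query without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (SQLite is not Oracle, and the sample rows in table1 are made up). The inner query produces one sum per type, and the outer query aggregates over those sums, which cannot be expressed in a single SELECT level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (type TEXT, value INTEGER)")
# Hypothetical sample data: sums per type are a=30, b=5, c=40.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", 10), ("a", 20), ("b", 5), ("c", 40)],
)

# The derived table t holds one row per type; the outer query aggregates it.
row = conn.execute(
    """
    SELECT MIN(t.summed), MAX(t.summed), AVG(t.summed)
    FROM (SELECT type, SUM(value) AS summed FROM table1 GROUP BY type) t
    """
).fetchone()

print(row)
conn.close()
```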

Best practice for pagination in Oracle?

Problem: I need to write stored procedure(s) that will return a result set of a single page of rows and the total number of rows.
Solution A: I create two stored procedures, one that returns a results set of a single page and another that returns a scalar -- total rows. The Explain Plan says the first sproc has a cost of 9 and the second has a cost of 3.
SELECT *
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY D.ID DESC ) AS RowNum, ...
) AS PageResult
WHERE RowNum >= #from
AND RowNum < #to
ORDER BY RowNum
SELECT COUNT(*)
FROM ...
Solution B: I put everything in a single sproc, by adding the same TotalRows number to every row in the result set. This solution feels hackish, but has a cost of 9 and only one sproc, so I'm inclined to use this solution.
SELECT *
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY D.ID DESC ) RowNum, COUNT(*) OVER () TotalRows,
WHERE RowNum >= from
AND RowNum < to
ORDER BY RowNum;
Is there a best-practice for pagination in Oracle? Which of the aforementioned solutions is most used in practice? Is any of them considered just plain wrong? Note that my DB is and will stay relatively small (less than 10GB).
I'm using Oracle 11g and the latest ODP.NET with VS2010 SP1 and Entity Framework 4.4. I need the final solution to work within the EF 4.4. I'm sure there are probably better methods out there for pagination in general, but I need them working with EF.
If you're already using analytics (ROW_NUMBER() OVER ...) then adding another analytic function on the same partitioning will add a negligible cost to the query.
On the other hand, there are many other ways to do pagination, one of them using rownum:
SELECT *
FROM (SELECT A.*, rownum rn
FROM (SELECT *
FROM your_table
ORDER BY col) A
WHERE rownum <= :Y)
WHERE rn >= :X
This method will be superior if you have an appropriate index on the ordering column. In this case, it might be more efficient to use two queries (one for the total number of rows, one for the result).
Both methods are appropriate but in general if you want both the number of rows and a pagination set then using analytics is more efficient because you only query the rows once.
In Oracle 12c you can use the OFFSET and FETCH clauses for pagination.
Example -
Suppose you have Table tab from which data needs to be fetched on the basis of DATE datatype column dt in descending order using pagination.
page_size:=5
select * from tab
order by dt desc
OFFSET (nvl(page_no,1)-1)*page_size ROWS FETCH NEXT page_size ROWS ONLY;
Explanation:
page_no=1
page_size=5
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY - Fetch 1st 5 rows only
page_no=2
page_size=5
OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY - Fetch next 5 rows
and so on.
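The offset arithmetic in the explanation above can be sketched in Python as follows (an illustration only; offset_fetch and fetch_page are hypothetical helper names, and a plain list stands in for the ordered result set). A missing page number is treated as page 1, which is an assumption of this sketch.

```python
def offset_fetch(page_no, page_size):
    """Return (offset, fetch) for a 1-based page number; None means page 1."""
    page = 1 if page_no is None else page_no
    return (page - 1) * page_size, page_size


# Stand-in for the rows of tab, already ordered by dt desc.
rows = list(range(1, 13))


def fetch_page(page_no, page_size):
    """Slice the ordered rows the way OFFSET ... FETCH NEXT ... would."""
    offset, fetch = offset_fetch(page_no, page_size)
    return rows[offset:offset + fetch]


print(offset_fetch(1, 5))
print(offset_fetch(2, 5))
print(fetch_page(3, 5))
```

The last page may come back shorter than page_size, just as FETCH NEXT returns fewer rows when the result set runs out.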
Reference pages -
https://dba-presents.com/index.php/databases/oracle/31-new-pagination-method-in-oracle-12c-offset-fetch
https://oracle-base.com/articles/12c/row-limiting-clause-for-top-n-queries-12cr1#paging
This may help:
SELECT * FROM
( SELECT deptno, ename, sal, ROW_NUMBER() OVER (ORDER BY ename) Row_Num FROM emp)
WHERE Row_Num BETWEEN 5 and 10;
A clean way to organize your SQL code could be through the WITH clause.
The reduced version further down also returns the total number of results and the total page count.
For example
WITH SELECTION AS (
SELECT FIELDA, FIELDB, FIELDC FROM TABLE),
NUMBERED AS (
SELECT
ROW_NUMBER() OVER (ORDER BY FIELDA) RN,
SELECTION.*
FROM SELECTION)
SELECT
(SELECT COUNT(*) FROM NUMBERED) TOTAL_ROWS,
NUMBERED.*
FROM NUMBERED
WHERE
RN BETWEEN ((:page_size*:page_number)-:page_size+1) AND (:page_size*:page_number)
This code gives you a paged resultset with two more fields:
TOTAL_ROWS with the total rows of your full SELECTION
RN the row number of the record
It requires two parameters, :page_size and :page_number, to slice your SELECTION
Reduced Version
Here SELECTION already includes the ROW_NUMBER() field
WITH SELECTION AS (
SELECT
ROW_NUMBER() OVER (ORDER BY FIELDA) RN,
FIELDA,
FIELDB,
FIELDC
FROM TABLE)
SELECT
:page_number PAGE_NUMBER,
CEIL((SELECT COUNT(*) FROM SELECTION ) / :page_size) TOTAL_PAGES,
:page_size PAGE_SIZE,
(SELECT COUNT(*) FROM SELECTION ) TOTAL_ROWS,
SELECTION.*
FROM SELECTION
WHERE
RN BETWEEN ((:page_size*:page_number)-:page_size+1) AND (:page_size*:page_number)
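To sanity-check the BETWEEN bounds and the TOTAL_PAGES expression, here is the same arithmetic as a short Python sketch (page_bounds and total_pages are hypothetical names used only for this illustration).

```python
import math


def page_bounds(page_number, page_size):
    """First and last RN of a 1-based page, matching the BETWEEN expression."""
    first = page_size * page_number - page_size + 1
    last = page_size * page_number
    return first, last


def total_pages(total_rows, page_size):
    """Number of pages needed, matching CEIL(COUNT(*) / :page_size)."""
    return math.ceil(total_rows / page_size)


print(page_bounds(1, 20))
print(page_bounds(3, 20))
print(total_pages(101, 20))
```

Because RN starts at 1 and BETWEEN is inclusive on both ends, each page covers exactly page_size consecutive row numbers.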
Try this:
select * from ( select * from "table" order by "column" desc ) where ROWNUM > 0 and ROWNUM <= 5;
I faced a similar issue. I tried all of the above solutions and none of them performed well enough. I have a table with millions of records and need to display them on screen in pages of 20. I did the following to solve the issue:
1) Add a new column ROW_NUMBER to the table.
2) Make the column the primary key or add a unique index on it.
3) Use the population program (in my case, Informatica) to populate the column with rownum.
4) Fetch records from the table using a BETWEEN condition (SELECT * FROM TABLE WHERE ROW_NUMBER BETWEEN LOWER_RANGE AND UPPER_RANGE).
This method is effective if we need to do an unconditional pagination fetch on a huge table.
Sorry, this one works with sorting:
SELECT * FROM (SELECT ROWNUM rnum,a.* FROM (SELECT * FROM "tabla" order by "column" asc) a) WHERE rnum BETWEEN :firstrange AND :lastrange;

Resources