Select N arbitrary rows from a very large table in Oracle - oracle

I have a very large table MY_TABLE (100 million rows). I wish to select a sample of 5, say, records from this table.
What I can think of is getting 5 arbitrary primary keys as follows, this uses fast full scan as the explain plan shows:
select MY_PRIMARY_KEY_COLUMN from (select MY_PRIMARY_KEY_COLUMN, rownum as rn from MY_TABLE) where rn <=5
and then getting the records corresponding to these primary keys.
However this is still very very slow..
Can it be done more efficiently?

As it looks, I got confused. As the commenters noticed, there should have been no problem with the query
select * from MY_TABLE where rownum <=5
but I somehow started to look at
select MY_PRIMARY_KEY_COLUMN from (select MY_PRIMARY_KEY_COLUMN, rownum as rn from MY_TABLE) where rn <=5
which indeed runs very slowly..
Sorry for wasting everyone's time, the select * from MY_TABLE where rownum <=5 works perfectly.

Related

Sub Query Issues

Recently our DB server changed and after that sub query started giving performance issue.
Example :
select * from table1 a where col1 =
(select max(col1) from table1 b where a.p1=b.p1)
This pattern is available at many places so NOT looking to change query but any database level changes should be fine. Looking for which DB parameters can cause performance Issue.
The optimizer should be able to rewrite your query in the following, more efficient form (but it probably can't do it if your query is much more complicated than that, involving many joins, etc. - assuming your example is much simplified just to illustrate the problem):
select * from table1 where (p1, col1) in
(select p1, max(col1) from table1 group by p1);
If the optimizer doesn't (or can't, for some reason) rewrite the query this way, then it will obviously be slow, since it must read the table repeatedly - once for each row in the table.
Another common way to get the same result (but experimentation shows that it is often slower, even though it only reads the base table once, vs. twice with the solution above) uses analytic functions:
select * -- or just the columns from table1, without rnk
from (
select t.*,
dense_rank() over (partition by p1
order by col1 desc nulls last) as rnk
from table1 t
)
where rnk = 1
;

select * from (select first_name, last_name from employees)

I do understand the meaning of this statement but I don't understand why do we need this?
This is equivalent to
select first_Name, last_name from employees
I can see this type of statements in many examples. Can you please explain when we need this? In practical do we use this type of statements?
Can you please explain when we need this?
These are called Derived Tables.
A "derived table" is essentially a statement-local temporary table
created by means of a subquery in the FROM clause of a SQL SELECT
statement. It exists only in memory and behaves like a standard view
or table.
In SQL, subqueries can only see values from parent queries one level deep.
In practical do we use this type of statements?
The most common use of it is the classic row-limiting query using ROWNUM.
Row-Limiting query:
SELECT *
FROM (SELECT *
FROM emp
ORDER BY sal DESC)
WHERE ROWNUM <= 5;
Pagination query:
SELECT eno
FROM (SELECT e.empno eno,
e.ROWNUM rn
FROM (SELECT empno
FROM emp
ORDER BY sal DESC) e)
WHERE rn <= 5;
This kind of statement is useless, you're right, but there are many occasions when you need a subselect because you can't do everything in one statement. Of the top of my head I'd be thinking about for instance, combining aggregate functions, get the min, max and avg of a sum
select min(t.summed), max(t.summed), avg(t.summed)
from (select type, sum(value) as summed from table1 group by type) t
this is just from the top of my head, but I did encounter many occasions where subselects in the from clause were necessary. Once the statements are complex enough you'll see it.

Best practice for pagination in Oracle?

Problem: I need write stored procedure(s) that will return result set of a single page of rows and the number of total rows.
Solution A: I create two stored procedures, one that returns a results set of a single page and another that returns a scalar -- total rows. The Explain Plan says the first sproc has a cost of 9 and the second has a cost of 3.
SELECT *
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY D.ID DESC ) AS RowNum, ...
) AS PageResult
WHERE RowNum >= #from
AND RowNum < #to
ORDER BY RowNum
SELECT COUNT(*)
FROM ...
Solution B: I put everything in a single sproc, by adding the same TotalRows number to every row in the result set. This solution feel hackish, but has a cost of 9 and only one sproc, so I'm inclined to use this solution.
SELECT *
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY D.ID DESC ) RowNum, COUNT(*) OVER () TotalRows,
WHERE RowNum >= from
AND RowNum < to
ORDER BY RowNum;
Is there a best-practice for pagination in Oracle? Which of the aforementioned solutions is most used in practice? Is any of them considered just plain wrong? Note that my DB is and will stay relatively small (less than 10GB).
I'm using Oracle 11g and the latest ODP.NET with VS2010 SP1 and Entity Framework 4.4. I need the final solution to work within the EF 4.4. I'm sure there are probably better methods out there for pagination in general, but I need them working with EF.
If you're already using analytics (ROW_NUMBER() OVER ...) then adding another analytic function on the same partitioning will add a negligible cost to the query.
On the other hand, there are many other ways to do pagination, one of them using rownum:
SELECT *
FROM (SELECT A.*, rownum rn
FROM (SELECT *
FROM your_table
ORDER BY col) A
WHERE rownum <= :Y)
WHERE rn >= :X
This method will be superior if you have an appropriate index on the ordering column. In this case, it might be more efficient to use two queries (one for the total number of rows, one for the result).
Both methods are appropriate but in general if you want both the number of rows and a pagination set then using analytics is more efficient because you only query the rows once.
In Oracle 12C you can use limit LIMIT and OFFSET for the pagination.
Example -
Suppose you have Table tab from which data needs to be fetched on the basis of DATE datatype column dt in descending order using pagination.
page_size:=5
select * from tab
order by dt desc
OFFSET nvl(page_no-1,1)*page_size ROWS FETCH NEXT page_size ROWS ONLY;
Explanation:
page_no=1
page_size=5
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY - Fetch 1st 5 rows only
page_no=2
page_size=5
OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY - Fetch next 5 rows
and so on.
Refrence Pages -
https://dba-presents.com/index.php/databases/oracle/31-new-pagination-method-in-oracle-12c-offset-fetch
https://oracle-base.com/articles/12c/row-limiting-clause-for-top-n-queries-12cr1#paging
This may help:
SELECT * FROM
( SELECT deptno, ename, sal, ROW_NUMBER() OVER (ORDER BY ename) Row_Num FROM emp)
WHERE Row_Num BETWEEN 5 and 10;
A clean way to organize your SQL code could be trough WITH statement.
The reduced version implements also total number of results and total pages count.
For example
WITH SELECTION AS (
SELECT FIELDA, FIELDB, FIELDC FROM TABLE),
NUMBERED AS (
SELECT
ROW_NUMBER() OVER (ORDER BY FIELDA) RN,
SELECTION.*
FROM SELECTION)
SELECT
(SELECT COUNT(*) FROM NUMBERED) TOTAL_ROWS,
NUMBERED.*
FROM NUMBERED
WHERE
RN BETWEEN ((:page_size*:page_number)-:page_size+1) AND (:page_size*:page_number)
This code gives you a paged resultset with two more fields:
TOTAL_ROWS with the total rows of your full SELECTION
RN the row number of the record
It requires 2 parameter: :page_size and :page_number to slice your SELECTION
Reduced Version
Selection implements already ROW_NUMBER() field
WITH SELECTION AS (
SELECT
ROW_NUMBER() OVER (ORDER BY FIELDA) RN,
FIELDA,
FIELDB,
FIELDC
FROM TABLE)
SELECT
:page_number PAGE_NUMBER,
CEIL((SELECT COUNT(*) FROM SELECTION ) / :page_size) TOTAL_PAGES,
:page_size PAGE_SIZE,
(SELECT COUNT(*) FROM SELECTION ) TOTAL_ROWS,
SELECTION.*
FROM SELECTION
WHERE
RN BETWEEN ((:page_size*:page_number)-:page_size+1) AND (:page_size*:page_number)
Try this:
select * from ( select * from "table" order by "column" desc ) where ROWNUM > 0 and ROWNUM <= 5;
I also faced a similar issue. I tried all the above solutions and none gave me a better performance. I have a table with millions of records and I need to display them on screen in pages of 20. I have done the below to solve the issue.
Add a new column ROW_NUMBER in the table.
Make the column as primary key or add a unique index on it.
Use the population program (in my case, Informatica), to populate the column with rownum.
Fetch Records from the table using between statement. (SELECT * FROM TABLE WHERE ROW_NUMBER BETWEEN LOWER_RANGE AND UPPER_RANGE).
This method is effective if we need to do an unconditional pagination fetch on a huge table.
Sorry, this one works with sorting:
SELECT * FROM (SELECT ROWNUM rnum,a.* FROM (SELECT * FROM "tabla" order by "column" asc) a) WHERE rnum BETWEEN "firstrange" AND "lastrange";

select rows from third row to n th row in oracle [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Paging with Oracle
I try to select data starting from 11 row. and i used
select e_name from copy where rownum>10;
this will not display's anything..
please help me to select 11th row to 15th row in my table
You cannot use rownum like that, you need to wrap everything into a derived table:
select *
from (
select *,
rownum as rn
form your_table
order by some_column
)
where rn between 11 and 15
You should use an order by in the inner query because otherwise you will not get consistent results over time. Rows in a relational table do not have any ordering so the database is free to return the rows in any order it feels approriate.
Please read the manual for more details. The reason your query isn't working is documented there with examples.
http://docs.oracle.com/cd/E11882_01/server.112/e26088/pseudocolumns009.htm#i1006297
You have to use like
select e_name
from (select e_name,rownum rno from copy)
where rno > 10 and rno < 16
Sample Example
you could use analytic function row_number() as well. Please consider http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions137.htm
First write the query which will select all the rows like this:-
select ename from employee
Now add a rownum column to your query, rownum will help you to identify the number of row in your query result
select rownum r,ename from employee
Now make your query as a sub query and apply the range on 'r' (rownum)
select * from (selecr rownum r, ename from employee) subq where subq.r between 11 and 15

Divide my Oracle table into 5 parts randomly

I want to separete my Oracle table into 5 parts, these parts records will be selected randomly from the original table. Parts can contain the same results, it is not a problem.
How can I do that?
You could use ORDER BY dbms_random.value and then work out the number of total records and divide by 5 and use this to limit the number of row returned:
SELECT * FROM
( SELECT * FROM mytable
ORDER BY dbms_random.value
)
WHERE rownum <= (SELECT count(*)/5 from mytable)

Resources