Oracle query to obtain batches of rows - oracle

So here is my problem: I need to get batches of rows (select statements) for a migration to another database (other then oracle).
Suggested solution: I take batches of rows (using rowid maybe?) example:
batch1: 0-10000,
batch2: 10000 - 20000,
batchn: 10000(n) - 10000(n+1)
So what should my query be?
batch1: select * from table_name where rownum >= 0 and rownum < 10000,
batch2: select * from table_name where rownum >= 10000 and rownum < 20000,
batch n: select * from table_name where rownum >= 10000*n and rownum < 10000*(n+1)
This does not work, (only the first select will work).
PS, I am pulling this data from a nodejs app, and thus I am sending in these batch queries in a for loop.

To illustrate my comment:
-- Between rows --
SELECT * FROM
( SELECT deptno, ename, sal, ROW_NUMBER() OVER (ORDER BY ename) Row_Num
FROM scott.emp
)
WHERE Row_Num BETWEEN 5 and 10
/
You may replace between operator with <= and >= if necessary.
Here's what I see in output:
DEPTNO ENAME SAL ROW_NUM
20 FORD 3000 5
30 JAMES 950 6
20 JONES 2975 7
10 KING 5000 8
30 MARTIN 1250 9
10 MILLER 1300 10

Using rownum is not a great idea, because there's no guarantee that the same rows will be assigned the same rownum values in different queries.
If the table has any combination of columns that uniquely identify a row, it is better to generate a ranking based on that and use that ranking to identify batches of rows. For example:
SELECT * FROM (
SELECT table.*, RANK() OVER (ORDER BY column1, column2) as my_rank
FROM table
)
WHERE my_rank >= 10000 AND my_rank < 20000
This will work with any range, and will be reproducible as long as the values in the columns used do not change and uniquely identify a row. (Actually, I think this would be usable even if they do not uniquely identify a row, as long as they work to break the rows into small enough batches.)
The downside is that MY_RANK will be included in the output. You can avoid that by explicitly listing the columns you do want to select; or it may be easier to filter it out when you are loading the data into the other database.

If you want to preserve the rowids, use the following SQL. This SQL took 4 minutes, 20 seconds to run against a 218 million row table on a 2 CPU server with 18 GB devoted to the DB.
CREATE TABLE rowids
AS
WITH
aset
AS
(SELECT ROWID AS row_id, row_number () OVER (ORDER BY ROWID) r
FROM amiadm.big_table)
SELECT *
FROM aset
WHERE MOD (r, 10000) = 0;
After creating this table, loop through it with the following:
BEGIN
FOR recs
IN ( SELECT row_id
, LAG (row_id) OVER (ORDER BY row_id) prev_row_id
, LEAD (row_id) OVER (ORDER BY row_id) next_row_id
FROM rowids
ORDER BY row_id)
LOOP
IF prev_row_id IS NULL
THEN
SELECT *
FROM big_table
WHERE ROWID <= recs.row_id;
ELSIF next_row_id IS NULL
THEN
SELECT *
FROM big_table
WHERE ROWID > row_id;
ELSE
SELECT *
FROM big_table
WHERE ROWID > prev_row_id
AND ROWID <= row_id;
END IF;
END LOOP;
END;

Related

Query to count distinct values in Oracle db CLOB column

I would like to query an Oracle DB table for the number of rows containing each distinct value in a CLOB column.
This returns all rows containing a value:
select * from mytable where dbms_lob.instr(mycol,'value') > 0;
Using DBMS_LOB, this returns the number of rows containing that value:
select count(*) from mytable where dbms_lob.instr(mycol,'value') > 0;
But is it possible to query for the number of times (rows in which) each distinct value appears?
Depending on what that column really contains, see whether TO_CHAR helps.
SQL> create table mytable (mycol clob);
Table created.
SQL> insert into mytable
2 select 'Query to count distinct values' from dual union all
3 select 'I have no idea which values are popular' from dual;
2 rows created.
SQL> select count(*), to_char(mycol) toc
2 from mytable
3 where dbms_lob.instr(mycol,'value') > 0
4 group by to_char(mycol);
COUNT(*) TOC
---------- ----------------------------------------
1 Query to count distinct values
1 I have no idea which values are popular
SQL>
If your CLOB values are more than 4000 bytes (and if not, why are they CLOBs?) then it's not perfect - collisions are possible, if unlikely - but you could hash the CLOB values.
If you want to count the number of distinct values:
select count(distinct dbms_crypto.hash(src=>mycol, typ=>2))
from mytable
where dbms_lob.instr(mycol,'value') > 0;
If you want to count how many times each distinct value appears:
select mycol, cnt
from (
select mycol,
count(*) over (partition by dbms_crypto.hash(src=>mycol, typ=>2)) as cnt,
row_number() over (partition by dbms_crypto.hash(src=>mycol, typ=>2) order by null) as rn
from mytable
where dbms_lob.instr(mycol,'value') > 0
)
where rn = 1;
Both are likely to be fairly expensive and slow with a lot of data.
(typ=>2 gives the numeric value for dbms_crypto.hash_md5, as you can't refer to the package constant in a SQL call, at least up to 12cR1...)
Rather more crudely, but possibly significantly quicker, you could base the count on the just the first 4000 characters - which may or may not be plausible for your actual data:
select count(distinct dbms_lob.substr(mycol, 4000, 1))
from mytable
where dbms_lob.instr(mycol,'value') > 0;
select dbms_lob.substr(mycol, 4000, 1), count(*)
from mytable
where dbms_lob.instr(mycol,'value') > 0
group by dbms_lob.substr(mycol, 4000, 1);
Standard Oracle functions do not support distinction of CLOB values. But, if you have access to DBMS_CRYPTO.HASH function, you can compare CLOB hashes instead, and thus, get the desired output:
select myCol, h.num from
myTable t join
(select min(rowid) rid, count(rowid) num
from myTable
where dbms_lob.instr(mycol,'value') > 0
group by DBMS_CRYPTO.HASH(myCol, 3)) h
on t.rowid = h.rid;
Also, note, that there's a very little possibility of hash collision. But if that's ok with you, you can use this approach.

Row num is not working for duplicate columns while pagination

Code is written below, multiple nested query:
select *
from
(select employee_id,employee_id
from employees) a
where rownum <= 5
)
where rnum >= 10
If i give duplicate columns in the select its giving "column ambiguously defined" error.
employee_id,employee_id, rownum rnum
Firstly, your query is incorrect since you have ambiguously defined the columns. It will throw ORA-00918: column ambiguously defined.
You must use proper ALIAS to avoid the error. For example,
SELECT departments.department_id AS "dept_id",
employees.department_id AS "emp_Dept_id"
FROM...
Secondly, it is not at all a pagination query. Since you are alwyas going to pick random rows as there is no ORDER BY clause. You are not ordering the rows.
where rownum <= 5
where rnum >= 10
At last, how on earth could you try to fetch the rows beyond 10 when you have fetched ONLY 5 rows in the inner query? It will ALWAYS return zero rows.
The correct way of paging through data is:
SQL> SELECT empno
2 FROM (SELECT empno, rownum AS rnum
3 FROM (SELECT empno
4 FROM emp
5 ORDER BY sal)
6 WHERE rownum <= 8)
7 WHERE rnum >= 5;
EMPNO
----------
7654
7934
7844
7499
SQL>
when you give a.* it means you are trying to refer two columns with same name in a table which is not permitted.The column names in a table are unique
So select employee_id,employee_id from employees is not a problem
but
select a.* from (select employee_id,employee_id from employees)a is a problem
the sqlfiddle here
Also if you want records from 10 to 15 in your query then use like the below
select * from
(select a.*,Rownum rnum from
(select employee_id as emp_id1,employee_id as emp_id2
from employees order by 1)
a where rownum <= 15 ) where rnum >= 10
EDIT1:- If duplicate column is required use like the below
with emp1 as (select employee_id from employees),
emp2 as (select * from
(select a.*,rownum rnum from emp1 a order by 1)
where rownum <=15)
select b.*,c.*
from emp1 b,emp2 c
where b.employee_id=c.employee_id
and c.rnum >=10

Alternate of sql server TOP in oracle

How to get top 3 records in oracle pl sql?i am new to oracle,earlier i have used sql server.
My requirement is to get distinct top 3 records of Column X.
Try this to retrieve the Top N records from a query, you can use the following syntax::-
SELECT *
FROM (your ordered query) alias_name
WHERE rownum <= Rows_to_return
Example:-
SELECT *
FROM (select * from suppliers ORDER BY supplier_name) suppliers2
WHERE rownum <= 3
This may help you
SELECT ename, sal
FROM ( SELECT ename, sal, RANK() OVER (ORDER BY sal DESC) sal_rank
FROM emp )
WHERE sal_rank <= 3;

Issue in SQL query full scanning twice?

Table A
ID EmpNo Grade
--------------------
1 100 HIGH
2 105 LOW
3 100 MEDIUM
4 100 LOW
5 105 LOW
Query:
select *
from A
where EMPNO = 100
and rownum <= 2
order by ID desc
I tried this query to retrieve max and max-1 value; I need to compare the grade from max and max-1, if equals I need to set the flag as 'Y' or 'N' without using a cursor. Also I don't want to scan the entire record twice.
Please help me.
ROWNUM is applied before ORDER BY, so you need to nest the query like this:
select * from
(select * from A where EMPNO =100 order by ID desc)
where rownum<=2
That only performs one table scan (or it may use an index on EMPNO).
select *
from (
select id, emp_no, grade
, case
when lag(grade) over (order by emp_no desc) = grade
then 'Y'
else 'N'
end
as flag
, dense_rank() over( order by emp_no desc) as rank
from t
)
where rank <=2
;

How can I return multiple identical rows based on a quantity field in the row itself?

I'm using oracle to output line items in from a shopping app. Each item has a quantity field that may be greater than 1 and if it is, I'd like to return that row N times.
Here's what I'm talking about for a table
product_id, quanity
1, 3,
2, 5
And I'm looking a query that would return
1,3
1,3
1,3
2,5
2,5
2,5
2,5
2,5
Is this possible? I saw this answer for SQL Server 2005 and I'm looking for almost the exact thing in oracle. Building a dedicated numbers table is unfortunately not an option.
I've used 15 as a maximum for the example, but you should set it to 9999 or whatever the maximum quantity you will support.
create table t (product_id number, quantity number);
insert into t values (1,3);
insert into t values (2,5);
select t.*
from t
join (select rownum rn from dual connect by level < 15) a
on a.rn <= t.quantity
order by 1;
First create sample data:
create table my_table (product_id number , quantity number);
insert into my_table(product_id, quantity) values(1,3);
insert into my_table(product_id, quantity) values(2,5);
And now run this SQL:
SELECT product_id, quantity
FROM my_table tproducts
,( SELECT LEVEL AS lvl
FROM dual
CONNECT BY LEVEL <= (SELECT MAX(quantity) FROM my_table)) tbl_sub
WHERE tbl_sub.lvl BETWEEN 1 AND tproducts.quantity
ORDER BY product_id, lvl;
PRODUCT_ID QUANTITY
---------- ----------
1 3
1 3
1 3
2 5
2 5
2 5
2 5
2 5
This question is propably same as this: how to calc ranges in oracle
Update solution, for Oracle 9i:
You can use pipelined_function() like this:
CREATE TYPE SampleType AS OBJECT
(
product_id number,
quantity varchar2(2000)
)
/
CREATE TYPE SampleTypeSet AS TABLE OF SampleType
/
CREATE OR REPLACE FUNCTION GET_DATA RETURN SampleTypeSet
PIPELINED
IS
l_one_row SampleType := SampleType(NULL, NULL);
BEGIN
FOR cur_data IN (SELECT product_id, quantity FROM my_table ORDER BY product_id) LOOP
FOR i IN 1..cur_data.quantity LOOP
l_one_row.product_id := cur_data.product_id;
l_one_row.quantity := cur_data.quantity;
PIPE ROW(l_one_row);
END LOOP;
END LOOP;
RETURN;
END GET_DATA;
/
Now you can do this:
SELECT * FROM TABLE(GET_DATA());
Or this:
CREATE OR REPLACE VIEW VIEW_ALL_DATA AS SELECT * FROM TABLE(GET_DATA());
SELECT * FROM VIEW_ALL_DATA;
Both with same results.
(Based on my article pipelined function)

Resources