Oracle DELETE busting TEMP space

Oracle DELETE busting TEMP space - oracle

I have a table with 17 billion rows. I want to delete some of those, present in another table.
I tried a delete statement, parallelized, that did not complete because the temp space wasn't enough.
delete /* + PARALLEL(a, 32) */
from a
where (a.key1, a.key2) in
(select /*+ PARALLEL(b, 16) */
key1,
key2
from b);
Then I tried create table as select that also failed because of the same reason.
create table a_temp parallel 32 nologging as
select /* + PARALLEL (a, 32) */
key1,
key2,
rest_of_data
from a
where (a.key1, a.key2) not in
(select key1, key2 from b);
A regular (without PARALLEL) delete was taking more than one day so I had to terminate it.
Is there a way to free temp space as it is not needed anymore, during delete execution?
Is there another way I could do this?
EDIT:
B has 173 million records, and almost 16 billion records have to be deleted (almost the whole table). There are no indexes on the table.
EDIT2:
The explain plan for the create table is as follows:
CREATE TABLE STATEMENT, GOAL = ALL_ROWS 6749420 177523935 10828960035
PX COORDINATOR
PX SEND QC (RANDOM) SYS :TQ10001 6740915 177523935 10828960035
LOAD AS SELECT (HYBRID TSM/HWMB) USER A_TEMP
OPTIMIZER STATISTICS GATHERING 6740915 177523935 10828960035
MERGE JOIN ANTI NA 6740915 177523935 10828960035
SORT JOIN 6700114 17752393472 745600525824
PX BLOCK ITERATOR 45592 17752393472 745600525824
TABLE ACCESS FULL USER A 45592 17752393472 745600525824
SORT UNIQUE 40802 173584361 3298102859
PX RECEIVE 5365 173584361 3298102859
PX SEND BROADCAST SYS :TQ10000 5365 173584361 3298102859
PX BLOCK ITERATOR 5365 173584361 3298102859
TABLE ACCESS FULL USER B 5365 173584361 3298102859
Thanks in advance

I made it work, using a different solution.
I created manually the a_temp table and did an insert with an APPEND PARALLEL hint. The temp space wasn't exceeded and the inserts performed perfectly.
Here is the code:
create table a_temp(..);
insert /* + APPEND PARALLEL(a_temp, 32) */
into a_temp(...)
select /* + PARALLEL(a, 32) */
(...)
from a
where not exists
(select /* + PARALLEL(b, 16) */
'1'
from b
where a.key1 = b.key1
and a.key2 = b.key2)

To solve this issue in the past, I have deleted in batches of ~1M at a time. After a lot of digging for a cleaner solution, a DBA insisted that I take this approach.
This was my workflow:
I used Python and the cx_Oracle module to read in the PK values for the to-be-deleted records, iteratively plugged them into an executemany call as bind variables, and committed after every iteration.
If you want to stick with a Parallel Execution Approach:
Remember to use ALTER SESSION ENABLE PARALLEL DML so that your merge or delete is executed in parallel too. Check out this great blog post that walks you through this:
https://dioncho.wordpress.com/2010/12/10/interpreting-parallel-merge-statement/

Related

Delta Detection in Oracle table with 2 billion records using composite key

Initial Load on Day 1
id
key
fkid
1
0
100
1
1
200
2
0
300
Load on Day 2
id
key
fkid
1
0
100
1
1
200
2
0
300
3
1
400
4
0
500
Need to find delta records
Load on Day 2
id
key
address
3
1
400
4
0
500
Problem Statement
Need to find delta records in minimum time with following facts
1: I have to process around 2 billion records initially from a table as mentioned below
2: Also need to find delta with minimal time so that I can process it quickly
Questions :
1: Will it be a time consuming process to identify delta especially during production downtime ?
2: How long should it take to identify delta with 3 numeric columns in a table out of which
id & key forms a composite key.
Solution tried :
1: Use full join and extract delta with case nvl condition but looks to be costly.
nvl(node1.id, node2.id) id,
nvl(node1.key, node2.key) key,
nvl(node1.fkid, node2.fkid) fkid
FROM
TABLE_DAY_1 node1
FULL JOIN TABLE_DAY_2 node2 ON node2.id = node1.id
WHERE
node2.id IS NULL
OR node1.id IS NULL;```

You need two separate statements to handle this, one to detect new & changed rows, a separate one to detect deleted rows.
While it is cumberson to write, the fastest comparison is field-by-field, so:
SELECT /*+ parallel(8) full(node1) full(node2) USE_HASH(node1 node) */ *
FROM table_day_1 node1,
table_day_2 node2
WHERE node1.id = node2.id(+)
AND (node2.id IS NULL -- new rows
OR node1.col1 <> node2.col2 -- changed val on non-nullable col
OR NVL(node1.col3,' ') <> NVL(node2.col3,' ') -- changed val on nullable string
OR NVL(node1.col4,-1) <> NVL(node2.col4,-1) -- changed val on nullable numeric, etc..
)
Then for deleted rows:
SELECT /*+ parallel(8) full(node1) full(node2) USE_HASH(node1 node) */ node2.id
FROM table_day_1 node1,
table_day_2 node2
WHERE node1.id(+) = node2.id
AND node1.id IS NULL -- deleted rows
You will want to make sure Oracle does a full table scan. If you have lots of CPUs and parallel query is enabled on your database, make sure the query uses parallel query (hence the hint). And you want a hash join between them. Work with your DBA to ensure you have enough temporary space to pull this off, and enough PGA to at least handle this with a single pass workarea rather than multipass.

How to Improve Cross Join Performance in Hive TEZ?

I have a hive table with 5 billion records. I want each of these 5 billion records to be joined with a hardcoded 52 records.
For achieving this I am doing a cross join like
select *
from table1 join table 2
ON 1 = 1;
This is taking 5 hours to run with the highest possible memory parameters.
Is there any other short or easier way to achieve this in less time ?

Turn on map-join:
set hive.auto.convert.join=true;
select *
from table1 cross join table2;
The table is small (52 records) and should fit into memory. Map-join operator will load small table into the distributed cache and each reducer container will use it to process data in memory, much faster than common-join.

Your query is slow because a cross-join(Cartesian product) is processed by ONE single reducer. The cure is to enforce higher parallelism. One way is to turn the query into an inner-join, so as to utilize map-side join optimization.
with t1 as (
selct col1, col2,..., 0 as k from table1
)
,t2 as (
selct col3, col4,..., 0 as k from table2
)
selct
*
from t1 join t2
on t1.k = t2.k
Now each table (CTE) has a fake column called k with identical value 0. So it works just like a cross-join while only a map-side join operation takes place.

SAS join (or insert) little table to big table

I have little problem. I have big table and few little table where little tables including part of fields from big table. How I can insert (or union) tables on the basis of if field is the same - set data, if little table not have field from big - set null/0 in big table.
Example:
data temp1;
infile DATALINES dsd missover;
input a b c d e f g;
CARDS;
1, 2, 3, 4,5,6
2, 3, , 5
3, 3
4,,3,2,3,
;
run;
data temp2;
infile DATALINES dsd missover;
input a c e g;
CARDS;
5, 2, 3, 4
6, 3, , 5
7, 3
;
run;
Is there an elegant method where if I insert temp2 to temp1 - missing fields in temp2 will set value of null in temp1?
Thank you for help!

That is exactly what SAS does by default.
data want ;
set have1 have2;
run;
It will match the variables by name and any variables that do not exist (in either source) will have missing values.
For better performance when appending a small table to a large table you should use PROC APPEND instead of a data step to avoid having to make new copy of the large dataset. That is more like an "insert". The FORCE option will allow the dataset to be different. But since the new data is being added to the old dataset any extra variables that appear only in HAVE2 will just be ignored and their values will be lost.
proc append base=have1 data=have2 force ;
run;
If you really did have to generate an actual INSERT statement (perhaps you are actually trying to generate SQL code to run in a foreign database) you might want to compare the metadata of the two datasets and find the common variables.
proc contents data=have1 out=cont1 noprint; run;
proc contents data=have2 out=cont2 noprint; run;
proc sql noprint;
select a.name into :varlist separated by ','
from cont2 a
inner join cont1 b
on upcase(a.name) = upcase(b.name)
;
...
insert into have1 (&varlist) select &varlist from have2 ;

It is not very clear to me what operation you intend to do but some initial thoughts are:
To compare columns between two datasets (and check whether a value exists in one of them) it is good practice to use an outer join. You can do joins via MERGE clause in a datastep, or more elegantly use PROC SQL.
However, using either approach you will have to specify which two rows in temp1 and temp2 shall be compared - you are typically joining on a column that is available in both tables.
To help us resolve your issue, could you possibly provide the correct output for your desired operation, if you perform it on temp1 and temp2? This would show what options you've explored and what needs to be fixed there.

you should try proc append.that will be more efficient because you will not reading your big table again and again unlike in
/*reads temp1 which is big table and temp2*/
data temp3;
set temp1 temp2;
run;
/* this does pretty much same as above code but will not read your big table
and will be efficient*/
proc append base=temp1 data=temp2 force;
run;
more on proc append in documentation http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#n19kwc3onglzh2n1l2k4e39edv3x.htm

optimizing a dup delete statement Oracle

I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in where clause.
What is a duplicate?
If 2 or more records have same values in columns id,cid,type,trefid,ordrefid,amount and paydt then there are duplicates.
The DELETEs delete about 1 million record.
Can they be re-written in any way to make it quicker.
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
HASH JOIN 1296491
Access Predicates
AND
A.ID=ITEM_1
A.CID=ITEM_2
ITEM_3=NVL(TYPE,'-99999')
ITEM_4=NVL(TREFID,'-99999')
ITEM_5=NVL(ORDREFID,'-99999')
ITEM_6=NVL(AMOUNT,(-99999))
ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Filter Predicates
LOADDT<MAX(LOADDT)
TABLE ACCESS TABLE1 FULL 267904
VIEW VW_SQ_1 690385
SORT GROUP BY 690385
TABLE ACCESS TABLE1 FULL 267904

How large is the table? If count of deleted rows is up to 12% then you may think about index.
Could you somehow partition your table - like week by week and then scan only actual week?
Maybe this could be more effecient. When you're using aggregate function, then oracle must walk through all relevant rows (in your case fullscan), but when you use exists it stops when the first occurence is found. (and of course the query would be much faster, when there was one function-based(because of NVL) index on all columns in where clause)
DELETE FROM TABLE1 A
WHERE exists (
SELECT 1
FROM TABLE1 B
WHERE
A.loaddt != b.loaddt
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);

Although some may disagree, I am a proponent of running large, long running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-) Also, not sure why you need to join table1 to itself to identify duplicates (and I'd be curious if you ever run into snapshot too old issues with your current approach). You also shouldn't need multiple delete statements, all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id,cid,type,trefid,ordrefid,amount,paydt, count(1) as cnt
from table1
group by id,cid,type,trefid,ordrefid,amount,paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
-- make sure my_dups_mv is refreshed first
cursor dup_cur is
select * from my_dups_mv;
type duprec_t is record(row_id rowid);
duprec duprec_t;
type duptab_t is table of duprec_t index by pls_integer;
duptab duptab_t;
l_ctr pls_integer := 0;
l_dupcnt pls_integer := 0;
begin
for rec in dup_cur
loop
l_ctr := l_ctr + 1;
-- assuming needed indexes exist
select rowid
bulk collect into duptab
from table1
where id = rec.id
and cid = rec.cid
and type = rec.type
and trefid = rec.trefid
and ordrefid = rec.ordrefid
and amount = rec.amount
and paydt = rec.paydt
-- order by whatever makes sense to make the "keeper" float to top
order by loaddt desc
;
for i in 2 .. duptab.count
loop
l_dupcnt := l_dupcnt + 1;
delete from table1 where rowid = duptab(i).row_id;
end loop;
if (mod(l_ctr, 10000) = 0) then
-- log to log table here (calling autonomous procedure you'll need to implement)
insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
commit;
end if;
end loop;
commit;
end;
Check your log table for progress status.

1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
Assuming you have Enterprise Edition, a sane server configuration, and you are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, that only use a GROUP BY, looks nicer and may run faster. But it will also probably run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to temporary tablespace. And it may also reduce read IO if you use some partitioning strategy like jakub.petr suggested.
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)

How to otimize select from several tables with millions of rows

Have the following tables (Oracle 10g):
catalog (
id NUMBER PRIMARY KEY,
name VARCHAR2(255),
owner NUMBER,
root NUMBER REFERENCES catalog(id)
...
)
university (
id NUMBER PRIMARY KEY,
...
)
securitygroup (
id NUMBER PRIMARY KEY
...
)
catalog_securitygroup (
catalog REFERENCES catalog(id),
securitygroup REFERENCES securitygroup(id)
)
catalog_university (
catalog REFERENCES catalog(id),
university REFERENCES university(id)
)
Catalog: 500 000 rows, catalog_university: 500 000, catalog_securitygroup: 1 500 000.
I need to select any 50 rows from catalog with specified root ordered by name for current university and current securitygroup. There is a query:
SELECT ccc.* FROM (
SELECT cc.*, ROWNUM AS n FROM (
SELECT c.id, c.name, c.owner
FROM catalog c, catalog_securitygroup cs, catalog_university cu
WHERE c.root = 100
AND cs.catalog = c.id
AND cs.securitygroup = 200
AND cu.catalog = c.id
AND cu.university = 300
ORDER BY name
) cc
) ccc WHERE ccc.n > 0 AND ccc.n <= 50;
Where 100 - some catalog, 200 - some securitygroup, 300 - some university. This query return 50 rows from ~ 170 000 in 3 minutes.
But next query return this rows in 2 sec:
SELECT ccc.* FROM (
SELECT cc.*, ROWNUM AS n FROM (
SELECT c.id, c.name, c.owner
FROM catalog c
WHERE c.root = 100
ORDER BY name
) cc
) ccc WHERE ccc.n > 0 AND ccc.n <= 50;
I build next indexes: (catalog.id, catalog.name, catalog.owner), (catalog_securitygroup.catalog, catalog_securitygroup.index), (catalog_university.catalog, catalog_university.university).
Plan for first query (using PLSQL Developer):
http://habreffect.ru/66c/f25faa5f8/plan2.jpg
Plan for second query:
http://habreffect.ru/f91/86e780cc7/plan1.jpg
What are the ways to optimize the query I have?

The indexes that can be useful and should be considered deal with
WHERE c.root = 100
AND cs.catalog = c.id
AND cs.securitygroup = 200
AND cu.catalog = c.id
AND cu.university = 300
So the following fields can be interesting for indexes
c: id, root
cs: catalog, securitygroup
cu: catalog, university
So, try creating
(catalog_securitygroup.catalog, catalog_securitygroup.securitygroup)
and
(catalog_university.catalog, catalog_university.university)
EDIT:
I missed the ORDER BY - these fields should also be considered, so
(catalog.name, catalog.id)
might be beneficial (or some other composite index that could be used for sorting and the conditions - possibly (catalog.root, catalog.name, catalog.id))
EDIT2
Although another question is accepted I'll provide some more food for thought.
I have created some test data and run some benchmarks.
The test cases are minimal in terms of record width (in catalog_securitygroup and catalog_university the primary keys are (catalog, securitygroup) and (catalog, university)). Here is the number of records per table:
test=# SELECT (SELECT COUNT(*) FROM catalog), (SELECT COUNT(*) FROM catalog_securitygroup), (SELECT COUNT(*) FROM catalog_university);
?column? | ?column? | ?column?
----------+----------+----------
500000 | 1497501 | 500000
(1 row)
Database is postgres 8.4, default ubuntu install, hardware i5, 4GRAM
First I rewrote the query to
SELECT c.id, c.name, c.owner
FROM catalog c, catalog_securitygroup cs, catalog_university cu
WHERE c.root < 50
AND cs.catalog = c.id
AND cu.catalog = c.id
AND cs.securitygroup < 200
AND cu.university < 200
ORDER BY c.name
LIMIT 50 OFFSET 100
note: the conditions are turned into less then to maintain comparable number of intermediate rows (the above query would return 198,801 rows without the LIMIT clause)
If run as above, without any extra indexes (save for PKs and foreign keys) it runs in 556 ms on a cold database (this is actually indication that I oversimplified the sample data somehow - I would be happier if I had 2-4s here without resorting to less then operators)
This bring me to my point - any straight query that only joins and filters (certain number of tables) and returns only a certain number of the records should run under 1s on any decent database without need to use cursors or to denormalize data (one of these days I'll have to write a post on that).
Furthermore, if a query is returning only 50 rows and does simple equality joins and restrictive equality conditions it should run even much faster.
Now let's see if I add some indexes, the biggest potential in queries like this is usually the sort order, so let me try that:
CREATE INDEX test1 ON catalog (name, id);
This makes execution time on the query - 22ms on a cold database.
And that's the point - if you are trying to get only a page of data, you should only get a page of data and execution times of queries such as this on normalized data with proper indexes should take less then 100ms on decent hardware.
I hope I didn't oversimplify the case to the point of no comparison (as I stated before some simplification is present as I don't know the cardinality of relationships between catalog and the many-to-many tables).
So, the conclusion is
if I were you I would not stop tweaking indexes (and the SQL) until I get the performance of the query to go below 200ms as rule of the thumb.
only if I would find an objective explanation why it can't go below such value I would resort to denormalisation and/or cursors, etc...

First I assume that your University and SecurityGroup tables are rather small. You posted the size of the large tables but it's really the other sizes that are part of the problem
Your problem is from the fact that you can't join the smallest tables first. Your join order should be from small to large. But because your mapping tables don't include a securitygroup-to-university table, you can't join the smallest ones first. So you wind up starting with one or the other, to a big table, to another big table and then with that large intermediate result you have to go to a small table.
If you always have current_univ and current_secgrp and root as inputs you want to use them to filter as soon as possible. The only way to do that is to change your schema some. In fact, you can leave the existing tables in place if you have to but you'll be adding to the space with this suggestion.
You've normalized the data very well. That's great for speed of update... not so great for querying. We denormalize to speed querying (that's the whole reason for datawarehouses (ok that and history)). Build a single mapping table with the following columns.
Univ_id, SecGrp_ID, Root, catalog_id. Make it an index organized table of the first 3 columns as pk.
Now when you query that index with all three PK values, you'll finish that index scan with a complete list of allowable catalog Id, now it's just a single join to the cat table to get the cat item details and you're off an running.

The Oracle cost-based optimizer makes use of all the information that it has to decide what the best access paths are for the data and what the least costly methods are for getting that data. So below are some random points related to your question.
The first three tables that you've listed all have primary keys. Do the other tables (catalog_university and catalog_securitygroup) also have primary keys on them?? A primary key defines a column or set of columns that are non-null and unique and are very important in a relational database.
Oracle generally enforces a primary key by generating a unique index on the given columns. The Oracle optimizer is more likely to make use of a unique index if it available as it is more likely to be more selective.
If possible an index that contains unique values should be defined as unique (CREATE UNIQUE INDEX...) and this will provide the optimizer with more information.
The additional indexes that you have provided are no more selective than the existing indexes. For example, the index on (catalog.id, catalog.name, catalog.owner) is unique but is less useful than the existing primary key index on (catalog.id). If a query is written to select on the catalog.name column, it is possible to do and index skip scan but this starts being costly (and most not even be possible in this case).
Since you are trying to select based in the catalog.root column, it might be worth adding an index on that column. This would mean that it could quickly find the relevant rows from the catalog table. The timing for the second query could be a bit misleading. It might be taking 2 seconds to find 50 matching rows from catalog, but these could easily be the first 50 rows from the catalog table..... finding 50 that match all your conditions might take longer, and not just because you need to join to other tables to get them. I would always use create table as select without restricting on rownum when trying to performance tune. With a complex query I would generally care about how long it take to get all the rows back... and a simple select with rownum can be misleading
Everything about Oracle performance tuning is about providing the optimizer enough information and the right tools (indexes, constraints, etc) to do its job properly. For this reason it's important to get optimizer statistics using something like DBMS_STATS.GATHER_TABLE_STATS(). Indexes should have stats gathered automatically in Oracle 10g or later.
Somehow this grew into quite a long answer about the Oracle optimizer. Hopefully some of it answers your question. Here is a summary of what is said above:
Give the optimizer as much information as possible, e.g if index is unique then declare it as such.
Add indexes on your access paths
Find the correct times for queries without limiting by rowwnum. It will always be quicker to find the first 50 M&Ms in a jar than finding the first 50 red M&Ms
Gather optimizer stats
Add unique/primary keys on all tables where they exist.

The use of rownum is wrong and causes all the rows to be processed. It will process all the rows, assigned them all a row number, and then find those between 0 and 50. When you want to look for in the explain plan is COUNT STOPKEY rather than just count
The query below should be an improvement as it will only get the first 50 rows... but there is still the issue of the joins to look at too:
SELECT ccc.* FROM (
SELECT cc.*, ROWNUM AS n FROM (
SELECT c.id, c.name, c.owner
FROM catalog c
WHERE c.root = 100
ORDER BY name
) cc
where rownum <= 50
) ccc WHERE ccc.n > 0 AND ccc.n <= 50;
Also, assuming this for a web page or something similar, maybe there is a better way to handle this than just running the query again to get the data for the next page.

try to declare a cursor. I dont know oracle, but in SqlServer would look like this:
declare #result
table (
id numeric,
name varchar(255)
);
declare __dyn_select_cursor cursor LOCAL SCROLL DYNAMIC for
--Select
select distinct
c.id, c.name
From [catalog] c
inner join university u
on u.catalog = c.id
and u.university = 300
inner join catalog_securitygroup s
on s.catalog = c.id
and s.securitygroup = 200
Where
c.root = 100
Order by name
--Cursor
declare #id numeric;
declare #name varchar(255);
open __dyn_select_cursor;
fetch relative 1 from __dyn_select_cursor into #id,#name declare #maxrowscount int
set #maxrowscount = 50
while (##fetch_status = 0 and #maxrowscount <> 0)
begin
insert into #result values (#id, #name);
set #maxrowscount = #maxrowscount - 1;
fetch next from __dyn_select_cursor into #id, #name;
end
close __dyn_select_cursor;
deallocate __dyn_select_cursor;
--Select temp, final result
select
id,
name
from #result;

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio