Oracle: Return Large Dataset with Cursor in Procedure - oracle

I've seen lots of posts regarding the use of cursors in PL/SQL to return data to a calling application, but none of them touch on the issue I believe I'm having with this technique. I am fairly new to Oracle, but have extensive experience with MSSQL Server. In SQL Server, when building queries to be called by an application for returning data, I usually put the SELECT statement inside a stored proc with/without parameters, and let the stored proc execute the statement(s) and return the data automatically. I've learned that with PL/SQL, you must store the resulting dataset in a cursor and then consume the cursor.
We have a query that doesn't necessarily return huge amounts of rows (~5K - 10K rows), however the dataset is very wide as it's composed of 1400+ columns. Running the SQL query itself in SQL Developer returns results instantaneously. However, calling a procedure that opens a cursor for the same query takes 5+ minutes to finish.
CREATE OR REPLACE PROCEDURE PROCNAME(RESULTS OUT SYS_REFCURSOR)
AS
BEGIN
OPEN RESULTS FOR
<SELECT_query_with_1400+_columns>
...
END;
After doing some debugging to try to get to the root cause of the slowness, I'm leaning towards the cursor returning one row at a time very slowly. I can actually see this real-time by converting the proc code into a PL/SQL block and using DBMS_SQL.return_result(RESULTS) after the SELECT query. When running this, I can see each row show up in the Script output window in SQL Developer one at a time. If this is exactly how the cursor returns the data to the calling application, then I can definitely see how this is the bottleneck as it could take 5-10 minutes to finish returning all 5K-10K rows. If I remove columns from the SELECT query, the cursor displays all the rows much faster, so it does seem like the large amount of columns is an issue using a cursor.
Knowing that running the SQL query by itself returns instant results, how could I get this same performance out of a cursor? It doesn't seem like it's possible. Is the answer putting the embedded SQL in the application code and not using a procedure/cursor to return data in this scenario? We are using Oracle 12c in our environment.
Edit: Just want to address how I am testing performance using the regular SELECT query vs the PL/SQL block with cursor method:
SELECT (takes ~27 seconds to return ~6K rows):
SELECT <1400+_columns>
FROM <table_name>;
PL/SQL with cursor (takes ~5-10 minutes to return ~6K rows):
DECLARE RESULTS SYS_REFCURSOR;
BEGIN
OPEN RESULTS FOR
SELECT <1400+_columns>
FROM <table_name>;
DBMS_SQL.return_result(RESULTS);
END;
Some of the comments are referencing what happens in the console application once all the data is returned, but I am only speaking regarding the performance of the two methods described above within Oracle\SQL Developer. Hope this helps clarify the point I'm trying to convey.

You can run a SQL Monitor report for the two executions of the SQL; that will show you exactly where the time is being spent. I would also consider running the two approaches in separate snapshot intervals and checking into the output from an AWR Differences report and ADDM Compare Report; you'd probably be surprised at the amazing detail these comparison reports provide.
Also, even though > 255 columns in a table is a "no-no" according to Oracle as it will fragment your record across > 1 database blocks, thus increasing the IO time needed to retrieve the results, I suspect the differences in the two approaches that you are seeing is not an IO problem since in straight SQL you report fast result fetching all. Therefore, I suspect more of a memory problem. As you probably know, PL/SQL code will use the Program Global Area (PGA), so I would check the parameter pga_aggregate_target and bump it up to say 5 GB (just guessing). An ADDM report run for the interval when the code ran will tell you if the advisor recommends a change to that parameter.

Related

SDO_JOIN table join on large tables no longer returns results after upgrade from Oracle 11.2.0.1 to 11.2.0.4

I have the same database on 2 different Oracle servers, one is 11.2.0.1.0 and the other is 11.2.0.4.0.
I have the same 2 geometry tables in both databases and run the following query on both servers. When run on an 11.2.0.1.0 version of Oracle, the query runs for a few minutes and I get results, the same query when run on 11.2.0.4.0 runs for about 3 seconds and returns no results.
The BLPUs table holds 36 million points and the PD_B2 table holds a polygon. I am trying to find all the points that fall in the polygon.
Other spatial queries do return rows but it takes hours and hours whereas the table join suggested in the Oracle Spatial documentation, takes 15 minutes to return all the points.
SELECT /*+ ordered */ a.uprn
FROM TABLE(SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')) c, blpus a, PD_B2 b
WHERE c.rowid1 = a.rowid
AND c.rowid2 = b.rowid;
The spatial queryies below return SDO_ROWIDSET() when run on the 11.2.0.4 server
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')
from dual;
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC')
from dual;
On the 11.2.0.1 server they return results.
I have discovered that a much smaller table of points will work on 11.2.0.4 so it seems that there is a size limit on 11.2.0.4 when using SDO_JOIN where as 11.2.0.1 seems to cope with the large table.
Does anyone know why this is or if there is an actual limit on table size when using SDO_JOIN?
This is strange. I see no reason why SDO_JOIN will not work the same way in 11.2.0.4. At least I have not seen that sort of behavior before. It looks like a bug to me and I suggest you file a service request with Oracle Support so we can take a look. You may need to provide a dump of the tables - or at least of a small enough subset that demonstrates the problem.
That said there are a few things to check: did you apply the 11.2.0.4 patch on the same database ? I.e. nothing changed in terms of table structures or content, grants, etc ?
When you say that the query returns no rows, does it do so immediately ? Or does it perform some processing before completing without anything being returned ?
How large is the PD_B2 table ?
What happens when you do:
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')
from dual;
Does this also return nothing ?
What happens if you do
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC')
from dual;
Both should return something that looks like this:
SDO_ROWIDSET(SDO_ROWIDPAIR('AAAW3LAAGAAAADmAAu', 'AAAW3TAAGAAAAg7AAC'), SDO_ROWIDPAIR('AAAW3LAAGAAAADmABE', 'AAAW3TAAGAAAAgrAAA'),...)
You will see this if you run the query in sqlplus. [If you use a GUI (like TOAD or SQLDeveloper) then you may not see that. All those GUIs have problems dealing with objects or arrays.]
But the fact that your 11.2.0.4 tests complete very quickly probably means that you get an empty result back, maybe like this:
SDO_ROWIDSET()
and that confirms that something did not work.
Now, from what you say PD_B2 only contains one row ? If so then there is no reason whatsoever to go via SDO_JOIN. A straightforward spatial query is easier to write. SDO_JOIN only makes sense when both tables being joined contain multiple rows. Then again, if one of the tables is very small (like the PD_B2 table in your case), then it anyway falls back to a simple query.
A simple query would look like this:
SELECT a.uprn
FROM blpus a, PD_B2 b
WHERE sdo_anyinteract (a.geoloc, b.geoloc) = 'TRUE';
Does this return what you expect in 11.2.0.4 ?
You may also want to examine the cost of the query and the query plan used. In sqlplus, do this:
set timing on
set autotrace traceonly
then run the above query. The effect is that sqlplus will not display any results - but it will still fetch and format the output: it will just not print them. At the end you will get a printout of the query plan used as well as some execution statistics and the elapsed time. Please add those results to your question.
Running the query from within your application should have a similar profile: similar response time and database-side costs - assuming you do fetch all the 1.3 million rows.
BTW, what do you do with those results ? Do you show them out as a report ? Do you save them into a table for later analysis ? Surely you do not want to show them all on a map ?
To clarify: SDO_JOIN is only for matching many to many geometries. For matching one to many, the simple SDO_ANYINTERACT() is what you need. As a matter of fact, SDO_JOIN will automatically fall back to a simple SDO_ANYINTERACT when one of the sets is very much smaller than the other. I can't see how SDO_JOIN could be faster in your circumstances (i.e. when searching for all objects that match one object) since it will perform the same as an SDO_ANYINTERACT and will require extra joins to the two tables.
Going back to the original issue - that of a difference in behavior in SDO_JOIN between 11.2.0.1 and 11.2.0.4 that IMO is definitely a bug that you need to report to Oracle Support.

Same stored procedure acts differently on two/(three) different IDEs

I just created a stored procedure in MS SQL DB using TOAD.
what it does is that it accepts an ID wherein some records are associated with, then it inserts those records to a table.
next part of the stored procedure is to use the ID input to search on the table where the items got inserted and then return it as the result set to the user just to confirm that the information got inserted.
IN TOAD, it does what is expected. It inserts date and returns information using just the stored procedure.
IN Oracle SQL developer however, it does the insert and it ends at that. It seems to not execute the 2nd part of the stored procedure which is a select stmt.
I just have a feeling that this is because of the jdbc adapter. Also why I'm asking is because I'm using a reporting tool Pentaho Report Designer and it would really make it easier if I can do 2 things at the same time. Pentaho Report Designer is also using jdbc adapters, not a coincidence maybe?
But if there are other things that I can tweak I'd really appreciate it.
This is a guess, but worth considering...
There are things called "Batches", where are sets of SQL Statements that are all sent to the server at once, and executed by the server as one set of statements, within a single server-side session. Sending a set of sql statements to the server as a batch will often result in different results than if you sent them one at a time, where each statement is executed in its own session.
I haven't used Toad (or Oracle) in a while, but as I recall, it dealt with batches differently than the other ide I used. If the second statement in your set is relying on being in the same session as the first, and in one ide it is in a separate session, then this might explain what is happening.

SQL Performance - SSIS takes 2 mins to call a SProc, but SSMS takes <1 sec

I have an SSIS package that fills a SQL Server 2008 R2 DataWarehouse, and when it recreates the DW from scratch, it does several million calls to a stored procedure that does the heavy lifting in terms of calculations.
The problem is that the SSIS package takes days to run and shouldn't take that long. The key seems to be that when the SSIS package calls the SProc, it takes about 2 minutes for the SProc to return the results. But if I recreate the call manually (on the same database) it takes <1 sec to return the result, which is what I'd expect.
See this screen shot, in the top is the SQL Profiler Trace showing the call by the SSIS Package taking 130 seconds, and in the bottom is my recreation of the call, taking <1 sec.
http://screencast.com/t/ygsGcdBV
The SProc queries the database, iterates through the results with a cursor, does a lot of calculations on pairs of records, and amalgamates the numbers into 2 results which get returned.
However the timing of the manual call, suggests to me that it's not an issue with the SProc itself, or any indexing issue with the database itself, so why would the SSIS package be taking so much longer than a manual call?
Any hints appreciated.
Thanks,
I don't have a solution for you, but I do have a suggestion and a procedure that can help you get more information for a solution.
Suggestion: a stored proc that gets called millions of times should not be using a cursor. Usually a cursor can be replaced by a few statements and a temp table or two. For temp tables with more than 10k rows or so, index it.
Procedure: At key places in your stored proc put a statement
declare #timer1 datetime = GetDate()
(some code)
declare #timer2 datetime = GetDate()
(more code)
declare #timer3 datetime = GetDate()
then, at the end:
select
datediff(ss,#timer1,#timer2) as Action1,
datediff(ss,#timer2,#timer3) as Action2
At that point, you'll know which part of your query is working differently. I'm guessing you've got an "Aha!" coming.
I suspect the real problem is in your stored procedure, but I've included some basic SSIS items as well to try to fix your problem:
Ensure connection managers for OLE DB sources are all set toDelayValidation ( = True).
Ensure that ValidateExternalMetadata is set to false
DefaultBufferMaxRows and DefaultBufferSize to correspond to the table's row sizes
DROP and Recreate your destination component is SSIS
Ensure in your stored procedure that SET ANSI_NULLS ON
Ensure that the SQL in your sproc hits an index
Add the query hint OPTION (FAST 10000) - This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size
Review your stored procedure SQL Server parameter sniffing.
Slow way:
create procedure GetOrderForCustomers(#CustID varchar(20))
as
begin
select * from orders
where customerid = #CustID
end
Fast way:
create procedure GetOrderForCustomersWithoutPS(#CustID varchar(20))
as
begin
declare #LocCustID varchar(20)
set #LocCustID = #CustID
select * from orders
where customerid = #LocCustID
end
It could be quite simple. Verify the indexes in the table. It could be similiar/conflicting indexes and the solution could be to drop one of them.
With the SQL query in SSMS have a look at the Execution plan and which index object is used. Is it the same for the slow SP?
If they use different indexes try to use the fast one in SP. Example how to:
SELECT *
FROM MyTable WITH (INDEX(IndexName))
WHERE MyIndexedColumn = 0

Quick-n-dirty results: View results of Procedure OUT cursor in SQL Worksheet?

Platform: Oracle
Language: PL/SQL
Issue: Want to output a procedure OUT cursor into the SQLDeveloper SQLWosksheet.
Anyone know how to use the Oracle "Select * from Table( PipelinedFunction( Param ) ) " to check procedure code output cursors?
I am using Crsytal Reports off of an Oracle stored procedure. Crystal requires that a procedure return a cursor, which it fetchs and reads.
The procedure code I have is currently working, but I want to find the easiest way to view the effects of changes to the procedure code. I have SQLDeveloper available, and I'm doing my creation and sql testing in that. I would like to get a quick result visible in the SQL Developer Query Result window ("SQL Worksheet").
Is there a (simple) way to use a Function to read the cursor from the procedure? (and pipe that out to the Table function?)
Convoluted, I know, but I deal best when I can just see the results of code changes. If I can view the record results directly, it will speed up development of the report.
I know of the Table function and a little about pipelining in Oracle. I know a little about cursors in general and sys_refcursor. I know diddly about types and why I need them. (Isn't sys_regCursor supposed to get us away from that?)
The current procedure does an adequate but ungraceful series of queries, inserts to global temp tables (GTT), joins from GTT and original tables, more inserts, and more self-joins and then SELECTS the results into the OUT cursor. I might be able to do better relying on just cursors and such, but the current method is good enough to get results to the report.
I think I can handle SQL pretty well (for our purposes), but I am not an Oracle-specific developer... but I need help.
Anybody run across this? The whole idea was to speed my development for the procedure code, but I've spent a couple of days looking for a way to just get at the output... not what I had in mind.
Update:
I have tried some hare-brained schemes based on slivers that I've seen on the web... such as
Create or replace FUNCTION GET_BACKPLANE (
Node VARCHAR2 ) RETURN SYS_REFCURSOR
AS
RESULTS SYS_REFCURSOR;
BEGIN
Open Results for
Select Backplane(Results, Node) from Dual ;
... etc.
and
Create or replace Function GET_BACKPLANE (
NODE VARCHAR2 ) RETURN My_Table_Stru%ROWTYPE PIPELINED
AS
BEGIN ...
I don't think that Oracle is even considering letting me re-reference the output cursor from the procedure ("Results" is a sys_refcursor that holds the results of the last SELECT in the procedure). I don't know how to define it, open it, and reference it from the procedure.
I never got to the place where I could try
SELECT * FROM TABLE(GET_BACKPLANE( ... etc )
Sorry for any typos and bad Oracle Grammar... it's been a long several days.
SQL Developer allows us to use SQL*Plus commands in the Worksheet. So all you need to do is define a variable to hold the output of the ref cursor.
I may have misinterpreted the actual code you want to run but I'm assuming your actual program is a procedure Backplane(Results, Node) where results is an OUT parameter of datatype sys_refcursor and node is some input parameter.
var rc refcursor
exec Backplane(results=>:rc, Node=>42)
print rc
The output of the print statement is written to the Script Output pane.
Note that the use of SQL*Plus commands means we have to use the Run Script option F5 rather than execute statement.
Thanks for the help. In the end, I wound up brute-force-ing it...
Step by step:
Make a query, test a query,
create a global temp table from the structure,
add code to make another query off of that GTT, test the query,
create a global temp table from the structure,
etc.
In the end, I wound up running (anonymous block) scripts and checking the GTT contents at every stage.
The last part was to use the same last query from the original procedure, stuffing everything into the Cursor that crystal likes...
tomorrow, I test that.
But, I'll just force it through for the next procedure, and get it done in a day and a half instead of 2+ weeks (embarrassed).
Thanks,
Marc

Why there is the difference in Oracle PL/SQL response time within SQL Developer vs. SQL Plus?

I have PL/SQL function that returns cursor that holds 28 columns and 8100 rows. When I execute that function from SQL Plus I got the results right away and in SQL Developer I'm running script that takes looong time (about 80 seconds). The same happen from Java code. When number of columns reduced to 2 then I got response in less than 4 seconds. Can someone explain what is going on in this case?
The easiest experiment to make is changing the "SQL Array Fetch Size" in SQL Developer, which defaults to 50. If you see results from bumping it to 500, there's the answer.
Interestingly, the default for the equivalent SQLPlus parameter is only 15, but as APC said, SQLPlus has the advantage of being native.
If changing "SQL Array Fetch Size" does not do anything, the next thing to look at is JDBC settings, which SQL Developer uses and SQL*Plus does not.
In addition to the good answers before mine...
SQL*PLus sends the data straight back to the screen as soon as the first rows are returned whereas SQL Developer has to find the size of the resultset to return in advance of displaying records.
This might explain why there is a delay for SQL Developer especially if the resultset is large or takes a long time to fully return (e.g. if the execution path is complicated).

Resources