I have a query that updates a set of records based on specific criteria. I want to get columns of the result set of that update statement and pass it back in a refcursor.
I can get the result set by using RETURNING INTO, or in my case, RETURNING myrows BULK COLLECT INTO .... However, I'm not sure how to make this work with a cursor - you can't do an OPEN cursor FOR with an update statement.
I'm guessing there's a way to get the results of a RETURNING statement into my cursor. How can I do this?
Assuming that you have a SQL collection defined (rather than a PL/SQL collection), you should be able to
RETURNING my_column
BULK COLLECT INTO my_collection;
and then
OPEN p_rc
FOR SELECT *
FROM TABLE( my_collection );
Though that works, there are some caveats. If you expect the UPDATE to modify a large number of rows (or you expect many sessions to be running this code), storing all this data in a collection may consume a large amount of space in the PGA which may negatively impact performance. Reading a bunch of data into a collection just to send it all back to the SQL engine also tends to be a bit inelegant. And, as I said initially, this assumes that your collection is declared at the SQL level rather than being declared in PL/SQL.
Related
I've seen lots of posts regarding the use of cursors in PL/SQL to return data to a calling application, but none of them touch on the issue I believe I'm having with this technique. I am fairly new to Oracle, but have extensive experience with MSSQL Server. In SQL Server, when building queries to be called by an application for returning data, I usually put the SELECT statement inside a stored proc with/without parameters, and let the stored proc execute the statement(s) and return the data automatically. I've learned that with PL/SQL, you must store the resulting dataset in a cursor and then consume the cursor.
We have a query that doesn't necessarily return huge amounts of rows (~5K - 10K rows), however the dataset is very wide as it's composed of 1400+ columns. Running the SQL query itself in SQL Developer returns results instantaneously. However, calling a procedure that opens a cursor for the same query takes 5+ minutes to finish.
CREATE OR REPLACE PROCEDURE PROCNAME(RESULTS OUT SYS_REFCURSOR)
AS
BEGIN
OPEN RESULTS FOR
<SELECT_query_with_1400+_columns>
...
END;
After doing some debugging to try to get to the root cause of the slowness, I'm leaning towards the cursor returning one row at a time very slowly. I can actually see this real-time by converting the proc code into a PL/SQL block and using DBMS_SQL.return_result(RESULTS) after the SELECT query. When running this, I can see each row show up in the Script output window in SQL Developer one at a time. If this is exactly how the cursor returns the data to the calling application, then I can definitely see how this is the bottleneck as it could take 5-10 minutes to finish returning all 5K-10K rows. If I remove columns from the SELECT query, the cursor displays all the rows much faster, so it does seem like the large amount of columns is an issue using a cursor.
Knowing that running the SQL query by itself returns instant results, how could I get this same performance out of a cursor? It doesn't seem like it's possible. Is the answer putting the embedded SQL in the application code and not using a procedure/cursor to return data in this scenario? We are using Oracle 12c in our environment.
Edit: Just want to address how I am testing performance using the regular SELECT query vs the PL/SQL block with cursor method:
SELECT (takes ~27 seconds to return ~6K rows):
SELECT <1400+_columns>
FROM <table_name>;
PL/SQL with cursor (takes ~5-10 minutes to return ~6K rows):
DECLARE RESULTS SYS_REFCURSOR;
BEGIN
OPEN RESULTS FOR
SELECT <1400+_columns>
FROM <table_name>;
DBMS_SQL.return_result(RESULTS);
END;
Some of the comments are referencing what happens in the console application once all the data is returned, but I am only speaking regarding the performance of the two methods described above within Oracle\SQL Developer. Hope this helps clarify the point I'm trying to convey.
You can run a SQL Monitor report for the two executions of the SQL; that will show you exactly where the time is being spent. I would also consider running the two approaches in separate snapshot intervals and checking into the output from an AWR Differences report and ADDM Compare Report; you'd probably be surprised at the amazing detail these comparison reports provide.
Also, even though > 255 columns in a table is a "no-no" according to Oracle as it will fragment your record across > 1 database blocks, thus increasing the IO time needed to retrieve the results, I suspect the differences in the two approaches that you are seeing is not an IO problem since in straight SQL you report fast result fetching all. Therefore, I suspect more of a memory problem. As you probably know, PL/SQL code will use the Program Global Area (PGA), so I would check the parameter pga_aggregate_target and bump it up to say 5 GB (just guessing). An ADDM report run for the interval when the code ran will tell you if the advisor recommends a change to that parameter.
First, I want to make it clear that the question is not about the materialized views feature.
Suppose, I have a table function that returns a pre-defined set of columns.
When a function call is submitted as
SELECT col1, col2, col3
FROM TABLE(my_tfn(:p1))
WHERE col4 = 'X';
I can evaluate the parameter and choose what queries to execute.
I can either open one of the pre-defined cursors, or I can assemble my query dynamically.
What if instead of evaluating the parameter I want to evaluate the text of the requesting query?
For example, if my function returns 20 columns but the query is only requesting 4,
I can assign NULLs to remaining 16 clumns of the return type, and execute fewer joins.
Or I can push the filter down to my dynamic query.
Is there a way to make this happen?
More generally, is there a way to look at the requesting query before exuting the function?
There is no robust way to identify the SQL that called a PL/SQL object.
Below is a not-so-robust way to identify the calling SQL. I've used code like this before, but only in special circumstances where I knew that the PL/SQL would never run concurrently.
This seems like it should be so simple. The data dictionary tracks all sessions and running SQL. You can find the current session with sys_context('userenv', 'sid'), match that to GV$SESSION, and then get either SQL_ID and PREV_SQL_ID. But neither of those contain the calling SQL. There's even a CURRENT_SQL in SYS_CONTEXT, but it's only for fine-grained auditing.
Instead, the calling SQL must be found by a string search. Using a unique name for the PL/SQL object will help filter out unrelated statements. To prevent re-running for old statements, the SQL must be individually purged from the shared pool as soon as it is found. This could lead to race conditions so this approach will only work if it's never called concurrently.
--Create simple test type for function.
create or replace type clob_table is table of clob;
--Table function that returns the SQL that called it.
--This requires elevated privileges to run.
--To simplify the code, run this as SYS:
-- "grant execute on sys.dbms_shared_pool to your_user;"
--(If you don't want to do that, convert this to invoker's rights and use dynamic SQL.)
create or replace function my_tfn return clob_table is
v_my_type clob_table;
type string_table is table of varchar2(4000);
v_addresses string_table;
v_hash_values string_table;
begin
--Get calling SQL based on the SQL text.
select sql_fulltext, address, hash_value
bulk collect into v_my_type, v_addresses, v_hash_values
from gv$sql
--Make sure there is something unique in the query.
where sql_fulltext like '%my_tfn%'
--But don't include this query!
--(Normally creating a quine is a challenge, but in V$SQL it's more of
-- a challenge to avoid quines.)
and sql_fulltext not like '%quine%';
--Flush the SQL statements immediately, so they won't show up in next run.
for i in 1 .. v_addresses.count loop
sys.dbms_shared_pool.purge(v_addresses(i)||', '||v_hash_values(i), 'C');
end loop;
--Return the SQL statement(s).
return v_my_type;
end;
/
Now queries like these will return themselves, demonstrating that the PL/SQL code was reading the SQL that called it:
SELECT * FROM TABLE(my_tfn) where 1=1;
SELECT * FROM TABLE(my_tfn) where 2=2;
But even if you go through all this trouble - what are you going to do with the results? Parsing SQL is insanely difficult unless you can ensure that everyone always follows strict syntax rules.
First off, my background is in SQL Server. Using CTEs (Common Table Expressions) is a breeze and converting it to a stored procedure with variables doesn't require any changes to the structure of the SQL other than replacing entered values with variable names.
In Oracle PL/SQL however, it is a completely different matter. My CTEs work fine as straight SQL, but once I try to wrap them as PL/SQL I run into a host of issues. From my understanding, a SELECT now needs an INTO which will only hold the results of a single record. However, I am wanting the entire recordset of multiple values.
My apologies if I am missing the obvious here. I'm thinking that 99% of my problem is the paradigm shift I need to make.
Given the following example:
NOTE: I am greatly over simplifying the SQL here. I do know the below example can be done in a single SQL statement. The actual SQL is much more complex. It's the fundamentals I am looking for here.
WITH A as (SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 200),
B as (SELECT * FROM A WHERE EMPLOYEE_START_DATE > date '2014-02-01'),
C as (SELECT * FROM B WHERE EMPLOYEE_TYPE = 'SALARY')
SELECT 'COUNTS' as Total,
(SELECT COUNT(*) FROM A) as 'DEPT_TOTAL',
(SELECT COUNT(*) FROM B) as 'NEW_EMPLOYEES',
(SELECT COUNT(*) FROM C) as 'NEW_SALARIED'
FROM A
WHERE rowcount = 1;
Now if I want to make this into PL/SQL with variables that are passed in or predefined at the top, it's not a simple matter of declaring the variables, popping values into them, and changing my hard-coded values into variables and running it. NOTE: I do know that I can simply change the hard-coded values to variables like :Department, :StartDate, and :Type, but again, I am oversimplifying the example.
There are three issues I am facing here that I am trying to wrap my head around:
1) What would be the best way to rewrite this using PL/SQL with declared variables? The CTEs now have to go INTO something. But then I am dealing with one row at a time as opposed to the entire table. So CTE 'A' is a single row at a time, and CTE B will only see the single row as opposed to all of the data results of A, etc. I do know that I will most likely have to use CURSORS to traverse the records, which somehow seems to over complicate this.
2) The output now has to use DBMS_OUTPUT. For multiple records, I will have to use a CURSOR with FETCH (or a FOR...LOOP). Yes?
3) Is there going to a big performance issue with this vs. straight SQL in regards to speed and resources used?
Thanks in advance and again, my apologies if I am missing something really obvious here!
First, this has nothing to do with CTEs. This behavior would be the same with a simple select * from table query. The difference is that with T-SQL, the query goes into an implicit cursor which is returned to the caller. When executing the SP from Management Studio this is convenient. The result set appears in the data window as if we had executed the query directly. But this is actually non-standard behavior. Oracle has the more standard behavior which might be stated as "the result set of any query that isn't directed into a cursor must be directed to variables." When directed into variables, then the query must return only one row.
To duplicate the behavior of T-SQL, you just have to explicitly declare and return the cursor. Then the calling code fetches from the cursor the entire result set but one row at a time. You don't get the convenience of Sql Developer or PL/SQL Developer diverting the result set to the data display window, but you can't have everything.
However, as we don't generally write SPs just to be called from the IDE, it is easier to work with Oracle's explicit cursors than SQL Server's implicit ones. Just google "oracle return ref cursor to caller" to get a whole lot of good material.
Simplest way is to wrap it into an implicit for loop
begin
for i in (select object_id, object_name
from user_objects
where rownum = 1) loop
-- Do something with the resultset
dbms_output.put_line (i.object_id || ' ' || i.object_name);
end loop;
end;
Single row query without the need to predefine the variables.
A SELECT query in my stored procedure takes 3 seconds to execute when the table queried has no indexes. This is true in both when executing the query in Toad Editor and when calling the stored procedure. The Explain Plan shows that a full table scan is done.
When an index is added, the same query in Toad Editor returns results instantaneously (just a few milliseconds). The Explain Plan shows that the index is used. However, even when the index is present, the query still takes 3 seconds in the stored procedure. It looks like the query uses a full table scan when executed in stored procedure despite having an index that can speed it up. Why?
I have tried with indexes on different columns with different orders. The same results persist in all cases.
In the stored procedure, the results of the query are collected using BULK COLLECT INTO. Does this make a difference? Also, The stored procedure is inside a package.
The query is a very simple SELECT statement, like this:
SELECT MY_COL, COUNT (MY_COL)
/* this line is only in stored proc */ BULK COLLECT INTO mycollection
FROM MY_TABLE
WHERE ANOTHER_COL = '123' /* or ANOTHER_COL = filterval (which is type NUMBER) */
GROUP BY MY_COL
ORDER BY MY_COL
Without source code we can only guess...
So I suspect it's because in Toad you get just first 500 rows (500 is default buffer size in Toad) but in stored proc you fetch ALL rows into collection. So fetching probably takes most of 3 sec time. Especially if there are nested loops iny our query.
Update: It might also be implicit type conversion in where condition
I want to write pl/sql code which utilizes a Cursor and Bulk Collect to retrieve my data. My database has rows in the order of millions, and sometimes I have to query it to fetch nearly all records on client's request. I do the querying and subsequent processing in batches, so as to not congest the server and show incremental progress to the client. I have seen that digging down for later batches takes considerably more time, which is why I am trying to do it by way of cursor.
Here is what should be simple pl/sql around my main sql query:
declare
cursor device_row_cur
is
select /my_query_details/;
type l_device_rows is table of device_row_cur%rowtype;
out_entries l_device_rows := l_device_rows();
begin
open device_row_cur;
fetch device_row_cur
bulk collect into out_entries
limit 100;
close device_row_cur;
end;
I am doing batches of 100, and fetching them into out_entries. The problem is that this block compiles and executes just fine, but doesn't return the data rows it fetched. I would like it to return those rows just the way a select would. How can this be achieved? Any ideas?
An anonymous block can't return anything. You can assign values to a bind variable, including a collection type or ref cursor, inside the block. But the collection would have to be defined, as well as declared, outside the block. That is, it would have to be a type you can use in plain SQL, not something defined in PL/SQL. At the moment you're using a PL/SQL type that is defined within the block, and a variable that is declared within the block too - so it's out of scope to the client, and wouldn't be a valid type outside it either. (It also doesn't need to be initialised, but that's a minor issue).
Dpending on how it will really be consumed, one option is to use a ref cursor, and you can declare and display that through SQL*Plus or SQL Developer with the variable and print commands. For example:
variable rc sys_refcursor
begin
open :rc for ( select ... /* your cursor statement */ );
end;
/
print rc
You can do something similar from a client application, e.g. have a function returning a ref cursor or a procedure with an out parameter that is a ref cursor, and bind that from the application. Then iterate over the ref cursor as a result set. But the details depend on the language your application is using.
Another option is to have a pipelined function that returns a table type - again defined at SQL level (with create type) not in PL/SQL - which might consume fewer resources than a collection that's returned in one go.
But I'd have to question why you're doing this. You said "digging down for later batches takes considerably more time", which sounds like you're using a paging mechanism in your query, generating a row number and then picking out a range of 100 within that. If your client/application wants to get all the rows then it would be simpler to have a single query execution but fetch the result set in batches.
Unfortunately without any information about the application this is just speculation...
I studied this excellent paper on optimizing pagination:
http://www.inf.unideb.hu/~gabora/pagination/article/Gabor_Andras_pagination_article.pdf
I used technique 6 mainly. It describes how to limit query to fetch page x and onward. For added improvement, you can limit it further to fetch page x alone. If used right, it can bring a performance improvement by a factor of 1000.
Instead of returning custom table rows (which is very hard, if not impossible to interface with Java), I eneded up opening a sys_refcursor in my pl/sql which can be interfaced such as:
OracleCallableStatement stmt = (OracleCallableStatement) connection.prepareCall(sql);
stmt.registerOutParameter(someIndex, OracleTypes.CURSOR);
stmt.execute();
resultSet = stmt.getCursor(idx);