Postgres Function - SQLs in Loop really a prepared Statement? - performance

Right now I am doing a migration from Firebird 2.5 to Postgres 9.4 for my company, and I have also converted the Firebird stored procedures into Postgres functions...
Now I have noticed that performance is quite slow, but only where there are loops in which I execute further SQL statements with changing parameters.
So for example it looks like this (I simplified it to the necessary things):
CREATE OR REPLACE FUNCTION TEST
(TEST_ID BigInt) returns TABLE(NAME VARCHAR)
AS $$
declare
  _tmp bigint;
begin
  for _tmp in select id from test
  loop
    -- Shouldn't the following SQL work as a prepared statement?
    for name in select label
                from test2
                where id = _tmp
    loop
      return next;
    end loop;
  end loop;
end; $$
LANGUAGE plpgsql;
So if I compare the time it takes to execute just the SELECT inside the loop, Postgres is usually a bit faster than Firebird. But if the loop runs 100, 1000 or 10000 times, the Firebird stored procedure is much faster. When I compare the times in Postgres, it seems that if the loop runs 10 times it takes 10 times longer than for 1 row, and if it runs 1000 times it takes 1000 times longer... That should not be the case if it is really a prepared statement, right?
I have also checked other possible causes, such as raising the memory settings and leaving out the "return next" statement, because I read that it can cause performance problems too...
It also has nothing to do with the "returns table" expression; if I leave that out it takes the same time...
Nothing has worked so far...
Of course this simple example could also be solved with a single SQL statement, but the functions I migrated are much more complicated and I don't want to rewrite them completely (if possible)...
Am I missing something?

PL/pgSQL reuses prepared queries across function invocations; you only incur preparation overhead once per session. So unless you've been reconnecting between each test, the linear execution times are expected.
But it may also reuse execution plans, and sometimes this does not work to your advantage. Running your query in an EXECUTE statement can give better performance, despite the overhead of repreparing it each time.
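For example, the inner loop from the question could be rewritten with EXECUTE like this (just a sketch, based on the simplified function above):
-- The dynamic query is re-planned on every execution, so the planner
-- always sees the actual value of _tmp instead of a cached generic plan.
for name in execute 'select label from test2 where id = $1' using _tmp
loop
  return next;
end loop;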
See the PL/pgSQL documentation for more detail.

Finally got it... It was an index problem, although it doesn't completely make sense to me...
What confused me is that when I executed the SQL statements outside the function, they were faster than Firebird even with its indexes. Now, with indexes added, the standalone statements in Postgres are twice as fast as before, and the functions are also really fast now; faster than in Firebird...
Another reason I didn't consider indexes is that in Firebird, foreign keys also act as indexes. I expected the same in Postgres, but that's not the case...
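So if a query inside a function filters on a foreign-key column, the index has to be created by hand, for example (using test2.id from the simplified function above; the index name is illustrative):
-- Postgres creates indexes for primary keys and unique constraints
-- automatically, but NOT for foreign-key columns.
CREATE INDEX test2_id_idx ON test2 (id);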
I really should have considered that earlier, also because of the comments from Frank and Pavel.
Thanks to all anyway...

Related

Oracle: Return Large Dataset with Cursor in Procedure

I've seen lots of posts regarding the use of cursors in PL/SQL to return data to a calling application, but none of them touch on the issue I believe I'm having with this technique. I am fairly new to Oracle, but have extensive experience with MSSQL Server. In SQL Server, when building queries to be called by an application for returning data, I usually put the SELECT statement inside a stored proc with/without parameters, and let the stored proc execute the statement(s) and return the data automatically. I've learned that with PL/SQL, you must store the resulting dataset in a cursor and then consume the cursor.
We have a query that doesn't necessarily return huge amounts of rows (~5K - 10K rows), however the dataset is very wide as it's composed of 1400+ columns. Running the SQL query itself in SQL Developer returns results instantaneously. However, calling a procedure that opens a cursor for the same query takes 5+ minutes to finish.
CREATE OR REPLACE PROCEDURE PROCNAME(RESULTS OUT SYS_REFCURSOR)
AS
BEGIN
  OPEN RESULTS FOR
    <SELECT_query_with_1400+_columns>
    ...
END;
After doing some debugging to try to get to the root cause of the slowness, I'm leaning towards the cursor returning one row at a time very slowly. I can actually see this in real time by converting the proc code into a PL/SQL block and using DBMS_SQL.return_result(RESULTS) after the SELECT query. When running this, I can see each row show up in the Script Output window in SQL Developer one at a time. If this is exactly how the cursor returns the data to the calling application, then I can definitely see how this is the bottleneck, as it could take 5-10 minutes to finish returning all 5K-10K rows. If I remove columns from the SELECT query, the cursor displays all the rows much faster, so it does seem like the large number of columns is an issue when using a cursor.
Knowing that running the SQL query by itself returns instant results, how could I get this same performance out of a cursor? It doesn't seem like it's possible. Is the answer putting the embedded SQL in the application code and not using a procedure/cursor to return data in this scenario? We are using Oracle 12c in our environment.
Edit: Just want to address how I am testing performance using the regular SELECT query vs the PL/SQL block with cursor method:
SELECT (takes ~27 seconds to return ~6K rows):
SELECT <1400+_columns>
FROM <table_name>;
PL/SQL with cursor (takes ~5-10 minutes to return ~6K rows):
DECLARE
  RESULTS SYS_REFCURSOR;
BEGIN
  OPEN RESULTS FOR
    SELECT <1400+_columns>
    FROM <table_name>;
  DBMS_SQL.return_result(RESULTS);
END;
Some of the comments reference what happens in the console application once all the data is returned, but I am only speaking about the performance of the two methods described above within Oracle/SQL Developer. Hope this helps clarify the point I'm trying to convey.
You can run a SQL Monitor report for the two executions of the SQL; that will show you exactly where the time is being spent. I would also consider running the two approaches in separate snapshot intervals and checking the output of an AWR Differences report and an ADDM Compare report; you'd probably be surprised at the amazing detail these comparison reports provide.
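For example, a text-format SQL Monitor report can be pulled with DBMS_SQLTUNE (the sql_id below is a placeholder; look up the real one in V$SQL first):
SELECT DBMS_SQLTUNE.report_sql_monitor(
         sql_id => 'placeholder_sql_id', -- replace with the actual sql_id
         type   => 'TEXT')
FROM dual;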
Also, even though more than 255 columns in a table is a "no-no" according to Oracle, since it fragments each record across more than one database block and thus increases the IO time needed to retrieve the results, I suspect the difference between the two approaches is not an IO problem, because you report fast fetching of all rows in straight SQL. I therefore suspect a memory problem. As you probably know, PL/SQL code uses the Program Global Area (PGA), so I would check the pga_aggregate_target parameter and bump it up to, say, 5 GB (just guessing). An ADDM report run for the interval when the code ran will tell you whether the advisor recommends a change to that parameter.
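Something along these lines (the 5 GB figure is only the starting guess from above):
-- Check the current setting first (SHOW PARAMETER is a SQL*Plus /
-- SQL Developer command) ...
SHOW PARAMETER pga_aggregate_target
-- ... then raise it:
ALTER SYSTEM SET pga_aggregate_target = 5G;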

Using `SELECT` to call a function

I occasionally encounter examples where SELECT...INTO...FROM DUAL is used to call a function - e.g.:
SELECT some_function INTO a_variable FROM DUAL;
is used, instead of
a_variable := some_function;
My take on this is that it's not good practice because A) it makes it unclear that a function is being invoked, and B) it's inefficient in that it forces a transition from the PL/SQL engine to the SQL engine (perhaps less of an issue today).
Can anyone explain why this might have been done, e.g. was this necessary in early PL/SQL coding in order to invoke a function? The code I'm looking at may date from as early as Oracle 8.
Any insights appreciated.
This practice dates from before PL/SQL and Oracle 7. As already mentioned, direct assignment was possible (and of course best practice) in Oracle 7.
Before Oracle 7 there were two widely used tools that required Select ... into var from dual;
On the one hand, there was an Oracle tool called RPT, a kind of report generator. RPT could be used to create batch processes. It had two kinds of macros that could be combined to achieve what we use PL/SQL for today. My first Oracle job involved debugging PL/SQL that was generated by a program that took RPT batches and converted them automatically to PL/SQL. I threw away my only RPT handbook sometime shortly after 2000.
On the other hand, there was Oracle Forms 2.x and its Menu component. Context switching in Oracle Menu was often done with a Select ... from dual; I still remember how proud I was when I discovered that an intractable bug was caused by a total of 6 records in the table DUAL.
I am sorry to say that I cannot prove any of this, but it is the time of year to think back to the old days, and it was really fun to have the answer.

Execute immediate fills up the library cache

I have a question regarding how queries executed through EXECUTE IMMEDIATE are treated in the library cache (we use Oracle 11).
Let's say I have a function like this:
FUNCTION get_meta_map_value (
  getfield    IN VARCHAR2,
  searchfield IN VARCHAR2,
  searchvalue IN VARCHAR2
) RETURN VARCHAR2 IS
  v_outvalue VARCHAR2(32767);
  sql_stmt   VARCHAR2(2000) := 'SELECT '||getfield||' FROM field_mapping, metadata '||
    'WHERE field_mapping.metadataid = metadata.metadataid AND rownum = 1 AND '||searchfield||' = :1';
BEGIN
  EXECUTE IMMEDIATE sql_stmt INTO v_outvalue USING searchvalue;
...
The getfield and searchfield are always the same within one installation (but have different values in other installations, which is why we use dynamic SQL).
So this leaves us with a SQL statement that differs only in the searchvalue (which is passed as a bind parameter).
This function is called in a loop that executes x times, from inside another stored procedure.
The stored procedure is executed y times during the lifetime of a connection, through an ODBC connection.
And there are z connections, but all of them use the same database login.
Now let us also assume that the searchvalue changes b times during one loop.
Question 1:
When calculating how many copies of the sql will be kept in the library cache,
can we disregard the different values the searchvalue can have (b), as the value is sent as a parameter to execute immediate?
Question 2:
Will the loop cause a hard parse of the query x times (query will be created in library cache x times), or can Oracle reuse the query?
(We assume that the searchvalue is the same for all calls in this question here, for simplicity)
Question 3:
Does the y (number of times the stored procedure is called from odbc during the lifetime of one connection)
also multiply the amount of copies of the query that are kept in library cache?
Question 4:
Does the z (number of simultaneous connections with same db login)
multiply the amount of copies of the query that are kept in library cache?
Main question:
What behaviour should I expect here?
Is the behaviour configurable?
The reason for this question is that we have had this code in production for 4 years, and now one of our customers has come back to us saying "This query fills our whole SGA, and Oracle says it's your fault".
The number of different combinations of getfield and searchfield determines how many "copies" there will be. I use the word "copies" cautiously, because Oracle will treat each variation as a distinct statement. Since you are using a bind variable for searchvalue, the number of different values it takes will not add to the query count.
In short, it looks like your code is OK.
The number of connections should not increase the number of hard parses.
Ask for an AWR report to see exactly how many of these queries are in the SGA, and how many hard parses are being triggered.
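A quick way to see how many copies the library cache actually holds is to query V$SQL (a sketch; adjust the LIKE pattern to match the statement generated by the function above):
-- Each distinct getfield/searchfield combination appears as its own row;
-- the bind variable :1 keeps different searchvalues from adding more.
SELECT sql_id, parse_calls, executions, loaded_versions, sql_text
FROM   v$sql
WHERE  sql_text LIKE 'SELECT % FROM field_mapping, metadata WHERE %';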
I will disagree that the number of connections will not increase the hard parse count for the posted code, because the last I knew, dynamic SQL cannot be shared between sessions. Since the generated SQL uses a bind variable, it should produce a reusable statement within a session, but it will not be shareable between user sessions. As a general rule, dynamic SQL should be used only for infrequently executed statements. You may want to refer to the following:
Designing Applications for Performance and Scalability, An Oracle White Paper (July 2005):
https://www.oracle.com/technetwork/database/performance/designing-applications-for-performa-131870.pdf

select condition from cdef$ where rowid=:1 query elapsed time is more

In a DB trace, there is a query taking a long time. Can someone explain what it means? It seems to be a very generic Oracle query, not one involving my custom tables.
select condition from cdef$ where rowid=:1;
I found the same query in multiple places in the trace (.trc) files, and one occurrence has a huge elapsed time. So, what can be done to stop it taking so long? I am using Oracle 11g.
You're right, that is an example of Oracle's recursive SQL, the statements it runs against the data dictionary to support our application SQL. That particular statement is the query Oracle runs to get the Search Condition of a CHECK constraint. If you are inserting or updating rows in tables with check constraints you will see it a lot.
The actual statement shouldn't take long to run, so it is unlikely to be the source of a performance problem on its own. The exception is if you are running lots of insert statements with hard-coded values: Oracle runs that query every time it parses a fresh insert or update statement, and that gets expensive if you're not using bind variables.
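To illustrate the difference (a sketch with a hypothetical table t, not from the question):
-- Hard-coded literals: every statement is textually distinct, so each one
-- triggers a fresh hard parse, and with it the recursive cdef$ lookup.
INSERT INTO t (id, val) VALUES (1, 'a');
INSERT INTO t (id, val) VALUES (2, 'b');
-- Bind variables: one shared statement, parsed once and then reused.
INSERT INTO t (id, val) VALUES (:id, :val);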

Oracle 11g PL/SQL Diana Nodes Limit

I have a statement such as below, but it is padded out to do 1000 calls at a time. Anything over that throws a PLS-00123 error: program too large (Diana nodes).
begin
  sp_myprocedure(....);
  sp_myprocedure(....);
  sp_myprocedure(....);
  sp_myprocedure(....);
end;
We are moving to 11g and I was wondering if this limitation could be increased to 2000 for example.
Thanks
"I have a statement such as below but its padded out to do 1000 calls
at a time"
This is a very bad programming strategy. Writing the same thing multiple times is a code smell. Anytime we find ourselves programming by cut'n'paste and then a bit of editing, we should stop and ask ourselves, 'hmmm, is there a better way to do this?'
"The parameters are different for each stored procedure call"
Yes, but the parameters have to come from somewhere. Presumably at the moment you are hard-coding them one thousand times. Yuck.
A better solution would be to store them in a table. Then you could write a simple loop. Like this:
for prec in ( select p1, p2
              from my_parameters
              order by id -- if ordering is important
            )
loop
  sp_myprocedure(prec.p1, prec.p2);
end loop;
Because you are storing the parameters in a table you can have as many calls to that proc as you like, and you are not bound by the Diana node limit.
True you will have to move your parameter values to a table, but it is not harder to maintain data in a table than it is to maintain hardcoded values in source code.
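A minimal sketch of what such a parameter table might look like (names and types are illustrative, matching the loop above):
-- One row per intended call to sp_myprocedure.
create table my_parameters (
  id number primary key, -- used by the ORDER BY in the loop
  p1 varchar2(100),
  p2 varchar2(100)
);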
If you're just moving from 10g, then I don't believe the limit has changed, so if you're having problems now, you'll have them again in 11g. Take a look at this Ask Tom article. A general suggestion is to put your procedure in a package, or break it down into smaller blocks. If you only get the error when running the block that calls the procedure 1000 times, and not in the procedure on its own, then I suggest you try looping as APC says, as this should reduce the number of nodes.
