PostgreSQL vs Oracle: "compile-time" checking of PL/pgSQL - oracle

Executive summary: PostgreSQL is amazing, but we are facing many issues at work due to the fact that it postpones many checks on PL/pgSQL code until runtime. Is there a way to make it more like Oracle's PL/SQL in this respect?
For example...
Try executing this in any Oracle DB:
create function foo return number as
begin
select a from dual;
return a;
end;
Oracle will immediately (i.e. at compile-time!) respond with:
[Error] ORA-00904: invalid identifier
Now try the semantically equivalent thing in PostgreSQL:
CREATE OR REPLACE FUNCTION public.foo ()
RETURNS integer AS
$body$
BEGIN
select a;
return a;
END;
$body$
LANGUAGE plpgsql;
You will see it - unfortunately! - execute fine ... No error is reported.
But when you then try to call this function (i.e. at runtime) you will get:
ERROR: column "a" does not exist
LINE 1: select a
Is there a way to force PostgreSQL to perform syntax analysis and checking at function definition time - not at run-time? We have tons of legacy PL/SQL code at work, which we are porting to PostgreSQL - but the lack of compile-time checks is very painful, forcing us to do manual work - i.e. writing code to test all code paths in all functions/procedures - that was otherwise automated in Oracle.

Yes, this is a known issue.
PL/pgSQL (like any other function, except on SQL) is a “black box” for the PostgreSQL, therefore it is not really possible to detect errors except in runtime.
You can do several things:
wrap your function calling SQL queries into BEGIN / COMMIT statements in order to have better control over errors;
add EXCEPTION blocks to your code to catch and track errors. Note, though, that this will affect function performance;
use plpgsql_check extension, developed by the Pavel Stěhule, who is one of the main contributors to PL/pgSQL development. I suppose eventually this extension will make it into the core of the PostgreSQL, but it'll take some time (now we're in 9.4beta3 state);
You might also look into this related question: postgresql syntax check without running the query
And it really looks like you're in a huge need of a unit testing framework.

Plpgsql language is designed without semantics checking at compile-time. I am not sure if this feature was an intention or a side effect of old plpgsql implementation, but over time we found some advantages to it (but also disadvantages as you mentioned).
Plus :
there are less issues with dependency between functions and other database objects. It's a simple solution to cyclic dependency problem. Deployment of plpgsql functions is easier, because you don't need to respect dependency.
Some patterns with temporary tables are possible using lazy dependency checking. It's necessary, because Postgres doesn't support global temporary tables.
Example:
BEGIN
CREATE TEMP TABLE xx(a int);
INSERT INTO xx VALUES(10); -- isn't possible with compile-time dependency checks
END;
Minus:
Compile-time deep checking is not possible (identifiers checking), although it's sometimes possible.
For some bigger projects a mix of solutions should be used:
regress and unit tests - it is fundamental, because some situations cannot be checked statically - dynamic SQL for example.
plpgsql_check - it is an external but supported project used by some bigger companies and bigger plpgsql users. It can enforce a static check of SQL identifiers validity. You can enforce this check by DDL triggers.

Related

Is static sql to be preferred over dynamic sql in postgresql stored procedures?

I am not sure in case of Stored Procedures, if Postgresql treats static sql any differently from a query submitted as a quoted string.
When I create a stored procedure in PostgreSQL using static sql, there seems to be no validation of the table names and table columns or column types but when I run the procedure I get the listing of the problems if any.
open ref_cursor_variable for
select usr_name from usres_master;
-- This is a typing mistake. The table name should be users_master. But the stored procedure is created and the error is thrown only when I run the procedure.
When I run the procedure I (naturally) get some error like :
table usres_master - invalid table name
The above is a trivial version. The real procedures we use at work combine several tables and run to at least a few hundred lines. In PostgresQL stored procedure, is there no advantage to using static sql over dynamic sql i.e. something like open ref_cursor_variable for EXECUTE select_query_string_variable.
The static SQL should be preferred almost time - dynamic SQL should be used only when it is necessary
from performance reasons (dynamic SQL doesn't reuse execution plans). One shot plan can be better some times (and necessary).
can reduce lot of code
In other cases uses static SQL every time. Benefits:
readability
reuse of execution plans
it is safe against SQL injection by default
static check is available
The source of a function is just a string to Postgres. The main reason for this is the fact that Postgres (unlike other DBMS) supports many, even installable languages for functions and procedures. As the Postgres core can't possibly know the syntax of all languages, it can not validate the "inner" part of a function. To my knowledge the "language API" does not contain any "validate" method (in theory this would probably be possible though).
If you want to statically validate your PL/pgSQL functions (and procedures since Postgres 11) you could use e.g. https://github.com/okbob/plpgsql_check/

How Pragma UDF works?

Recently, I have read about UDF Pragma optimization method in Oracle Database 12.
I'm very interested in how exactly it works. I've only found very short description in the Oracle documentation.
As I understand, every Pragma in PL/SQL is some kind of compiler directive (I could be wrong here) similar to C++ Pragma Directives.
Maybe someone can explain to me in more details (or provide links :) ) how the internal mechanism of UDF Pragma works?
The internals are not made available to us, but essentially the pragma is an instruction to the compiler to focus on reducing the cost of the PL/SQL => SQL context switch overhead. If I had to hypothesise I would say that it takes advantage of a different 12c feature which allow PL/SQL directly within a CTE (common table expression), e.g.:
WITH
function myfunction(x int) return int as
begin
...
return ...
end;
select myfunction(col) from my_table
PL/SQL functions coded like the above exist solely for the duration of the SQL execution and also execute with improved performance characteristics over that of standalone functions. So I suspect the UDF pragma allows invocation of a standalone function via the same code path as the inline example above.
(But all just hypothesis on my part)

Using `SELECT` to call a function

I occasionally encounter examples where SELECT...INTO...FROM DUAL is used to call a function - e.g.:
SELECT some_function INTO a_variable FROM DUAL;
is used, instead of
a_variable := some_function;
My take on this is that it's not good practice because A) it makes it unclear that a function is being invoked, and B) it's inefficient in that it forces a transition from the PL/SQL engine to the SQL engine (perhaps less of an issue today).
Can anyone explain why this might have been done, e.g. was this necessary in early PL/SQL coding in order to invoke a function? The code I'm looking at may date from as early as Oracle 8.
Any insights appreciated.
This practice dates from before PLSQL and Oracle 7. As already mentioned assignment was possible (and of course Best Practice) in Oracle7.
Before Oracle 7 there were two widely used Tools that needed the use of Select ... into var from dual;
On the one hand there used to be an Oracle Tool called RPT, some kind of report generator. RPT could be used to create batch processes. It had two kinds of macros, that could be combined to achieve what we use PLSQL for today. My first Oracle job involved debugging PLSQL that was generated by a program that took RPT batches and converted them automatically to PLSQL. I threw away my only RPT handbook sometime shortly after 2000.
On the other hand there was Oracle Forms 2.x and its Menu component. Context switching in Oracle Menu was often done with a Select ... from dual; I still remember how proud I was when I discovered that an untractable Bug was caused by a total of 6 records in table Dual.
I am sorry to say that I can not proof any of this, but it is the time of year to think back to the old times and really fun to have the answer.

10g Package Construction - Restricting References

I am constructing a rather large Oracle 10g package full of functions etc.
The problem is some of these functions are pulling information from materilized view's or tables that other functions are creating.
Is there a way to successfully compile a package even though some of the functions cannot find the information they are looking for (But these functions will work once the views have been created?).
Attempts:
I have looked into PRAGMA RESTRICT_REFERENCES but have had no success so far. Am I even on the right track or is this not even possible?
You cannot refer using static SQL to objects that do not exist when the code is compiled. There is nothing you can do about that.
You would need to modify your code to use dynamic SQL to refer to any object that is created at runtime. You can probably use EXECUTE IMMEDIATE, i.e.
EXECUTE IMMEDIATE
'SELECT COUNT(*) FROM new_mv_name'
INTO l_cnt;
rather than
SELECT COUNT(*)
INTO l_cnt
FROM new_mv_name;
That being said, however, I would be extremely dubious about a PL/SQL implementation that involved creating any new tables and materialized views at runtime. That is almost always a mistake in Oracle. Why do you need to create new objects at runtime?

Refactoring PL/SQL triggers - extract procedures

we have application where database contains large parts of business logic in triggers, with a update subsequently firing triggers on several other tables. I want to refactor the mess and wanted to start by extracting procedures from triggers, but can't find any reliable tool to do this. Using "Extract procedure" in both SQL Developer and Toad failed to properly handle :new and :old trigger variables.
If you had similar problem with triggers, did you find a way around it?
EDIT: Ideally, only columns that are referenced by extracted code would be sent as in/out parameters, like:
Example of original code to be extracted from trigger:
.....
if :new.col1 = some_var then
:new.col1 := :old.col1
end if
.....
would become :
procedure proc(in old_col1 varchar2, in out new_col1 varchar2, some_var varchar2) is
begin
if new_col1 = some_var then
new_col1 := old_col1
end if;
end;
......
proc(:old.col1,:new.col1, some_var);
It sounds like you want to carry out transformations on PL/SQL source. To do this reliably, you need a tool that can parse PL/SQL to some kind of compiler data structure, pattern-match against that structure and make directed changes, and then regenerate the modified PL/SQL code.
The DMS Software Reengineering Toolkit is such a tool. It is parameterized by the programming language being translated; it has off-the-shelf front ends for many languages, including C, C++, C#, Java, COBOL and ... PL/SQL.
This is not exactly the answer. I have not enough reputation to edit original question, obviously.
The issue is that it is not a "refactoring" as we usually think of. Even when you'll create bunch of procedures from triggers, you'll need to make a proper framework to run them in order to achieve original functionality. I suspect that this will be a challenge as well.
As a solution proposal, I'd go with one python script, based on state machine (see http://www.ibm.com/developerworks/library/l-python-state.html for example). If you put strict definition of what should be translated and how, it will be easy to implement.

Resources