Ordered iteration in an user-defined aggregate function? - oracle

I just implemented the ODCIAggregate Interface to create a custom aggregation function. It works quite well and fast, but I would like it to do a little something more. I have a statement going like this:
SELECT SomeId, myAggregationFunction(Item) FROM
(
SELECT
Foo.SomeId,
SomeType(Foo.SomeValue, Foo.SomeOtherValue) AS Item
FROM
Foo
ORDER BY Foo.SomeOrderingValue
)
GROUP BY SomeId;
My problem is that items aren't passed to the ODCIAggregateIterate function of my implementation in the same order that my inner (ordered) SELECT returns them.
I've Googled around and didn't find any Oracle-provided way to do so. Has any of you experimented a similar problem based on that requirement?
Thanks!

Have you considered using COLLECT instead of data cartridge?
At least for string aggregation, the COLLECT method is simpler and much faster. It does make your SQL a little weirder though.
Below is an example using just simple string concatenation.
--Create a type
create or replace type sometype as object
(
someValue varchar2(100),
someOtherValue varchar2(100)
);
--Create a nested table of the type.
--This is where the performance improvement comes from - Oracle can aggregate
--the types in SQL using COLLECT, and then can process all the values at once.
--This significantly reduces the context switches between SQL and PL/SQL, which
--are usually more expensive than the actual work.
create or replace type sometypes as table of sometype;
--Process all the data (it's already been sorted before it gets here)
create or replace function myAggregationFunction(p_sometypes in sometypes)
return varchar2 is
v_result varchar2(4000);
begin
--Loop through the nested table, just concatenate everything for testing.
--Assumes a dense nested table
for i in 1 .. p_sometypes.count loop
v_result := v_result || ',' ||
p_sometypes(i).someValue || '+' || p_sometypes(i).someOtherValue;
end loop;
--Remove the first delimeter, return value
return substr(v_result, 2);
end;
/
--SQL
select someId
,myAggregationFunction
(
cast
(
--Here's where the aggregation and ordering happen
collect(sometype(SomeValue, SomeOtherValue)
order by SomeOrderingValue)
as someTypes
)
) result
from
(
--Test data: note the unordered SoemOrderingValue.
select 1 someId, 3 SomeOrderingValue, '3' SomeValue, '3' SomeOtherValue
from dual union all
select 1 someId, 1 SomeOrderingValue, '1' SomeValue, '1' SomeOtherValue
from dual union all
select 1 someId, 2 SomeOrderingValue, '2' SomeValue, '2' SomeOtherValue
from dual
) foo
group by someId;
--Here are the results, aggregated and ordered.
SOMEID RESULT
------ ------
1 1+1,2+2,3+3

Oracle is very likely be rewriting your query and getting rid of the subquery. I've never done anything like what you're doing, but could you add the NO_UNNEST hint on the inner query?
SELECT SomeId, myAggregationFunction(Item) FROM
(
SELECT /*+ NO_UNNEST */
Foo.SomeId, ...
Even then, I'm really not sure what it will do with an ORDER BY inside a subquery.

Related

Oracle equivalent query for this postgress query - CONFLICT [duplicate]

The UPSERT operation either updates or inserts a row in a table, depending if the table already has a row that matches the data:
if table t has a row exists that has key X:
update t set mystuff... where mykey=X
else
insert into t mystuff...
Since Oracle doesn't have a specific UPSERT statement, what's the best way to do this?
The MERGE statement merges data between two tables. Using DUAL
allows us to use this command. Note that this is not protected against concurrent access.
create or replace
procedure ups(xa number)
as
begin
merge into mergetest m using dual on (a = xa)
when not matched then insert (a,b) values (xa,1)
when matched then update set b = b+1;
end ups;
/
drop table mergetest;
create table mergetest(a number, b number);
call ups(10);
call ups(10);
call ups(20);
select * from mergetest;
A B
---------------------- ----------------------
10 2
20 1
The dual example above which is in PL/SQL was great becuase I wanted to do something similar, but I wanted it client side...so here is the SQL I used to send a similar statement direct from some C#
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" )
However from a C# perspective this provide to be slower than doing the update and seeing if the rows affected was 0 and doing the insert if it was.
An alternative to MERGE (the "old fashioned way"):
begin
insert into t (mykey, mystuff)
values ('X', 123);
exception
when dup_val_on_index then
update t
set mystuff = 123
where mykey = 'X';
end;
Another alternative without the exception check:
UPDATE tablename
SET val1 = in_val1,
val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%rowcount = 0 )
THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
insert if not exists
update:
INSERT INTO mytable (id1, t1)
SELECT 11, 'x1' FROM DUAL
WHERE NOT EXISTS (SELECT id1 FROM mytble WHERE id1 = 11);
UPDATE mytable SET t1 = 'x1' WHERE id1 = 11;
None of the answers given so far is safe in the face of concurrent accesses, as pointed out in Tim Sylvester's comment, and will raise exceptions in case of races. To fix that, the insert/update combo must be wrapped in some kind of loop statement, so that in case of an exception the whole thing is retried.
As an example, here's how Grommit's code can be wrapped in a loop to make it safe when run concurrently:
PROCEDURE MyProc (
...
) IS
BEGIN
LOOP
BEGIN
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" );
EXIT; -- success? -> exit loop
EXCEPTION
WHEN NO_DATA_FOUND THEN -- the entry was concurrently deleted
NULL; -- exception? -> no op, i.e. continue looping
WHEN DUP_VAL_ON_INDEX THEN -- an entry was concurrently inserted
NULL; -- exception? -> no op, i.e. continue looping
END;
END LOOP;
END;
N.B. In transaction mode SERIALIZABLE, which I don't recommend btw, you might run into
ORA-08177: can't serialize access for this transaction exceptions instead.
I'd like Grommit answer, except it require dupe values. I found solution where it may appear once: http://forums.devshed.com/showpost.php?p=1182653&postcount=2
MERGE INTO KBS.NUFUS_MUHTARLIK B
USING (
SELECT '028-01' CILT, '25' SAYFA, '6' KUTUK, '46603404838' MERNIS_NO
FROM DUAL
) E
ON (B.MERNIS_NO = E.MERNIS_NO)
WHEN MATCHED THEN
UPDATE SET B.CILT = E.CILT, B.SAYFA = E.SAYFA, B.KUTUK = E.KUTUK
WHEN NOT MATCHED THEN
INSERT ( CILT, SAYFA, KUTUK, MERNIS_NO)
VALUES (E.CILT, E.SAYFA, E.KUTUK, E.MERNIS_NO);
I've been using the first code sample for years. Notice notfound rather than count.
UPDATE tablename SET val1 = in_val1, val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%notfound ) THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
The code below is the possibly new and improved code
MERGE INTO tablename USING dual ON ( val3 = in_val3 )
WHEN MATCHED THEN UPDATE SET val1 = in_val1, val2 = in_val2
WHEN NOT MATCHED THEN INSERT
VALUES (in_val1, in_val2, in_val3)
In the first example the update does an index lookup. It has to, in order to update the right row. Oracle opens an implicit cursor, and we use it to wrap a corresponding insert so we know that the insert will only happen when the key does not exist. But the insert is an independent command and it has to do a second lookup. I don't know the inner workings of the merge command but since the command is a single unit, Oracle could execute the correct insert or update with a single index lookup.
I think merge is better when you do have some processing to be done that means taking data from some tables and updating a table, possibly inserting or deleting rows. But for the single row case, you may consider the first case since the syntax is more common.
A note regarding the two solutions that suggest:
1) Insert, if exception then update,
or
2) Update, if sql%rowcount = 0 then insert
The question of whether to insert or update first is also application dependent. Are you expecting more inserts or more updates? The one that is most likely to succeed should go first.
If you pick the wrong one you will get a bunch of unnecessary index reads. Not a huge deal but still something to consider.
Try this,
insert into b_building_property (
select
'AREA_IN_COMMON_USE_DOUBLE','Area in Common Use','DOUBLE', null, 9000, 9
from dual
)
minus
(
select * from b_building_property where id = 9
)
;
From http://www.praetoriate.com/oracle_tips_upserts.htm:
"In Oracle9i, an UPSERT can accomplish this task in a single statement:"
INSERT
FIRST WHEN
credit_limit >=100000
THEN INTO
rich_customers
VALUES(cust_id,cust_credit_limit)
INTO customers
ELSE
INTO customers SELECT * FROM new_customers;

Why is the output only the last value? Oracle loop cursor

I'm trying to output a list of the courses a professor teaches, by receiving the prof's id by parameter to my function, and showing all courses, each separated by a comma. For example, if a Professor teaches Humanities, Science and Math, I want the output to be: 'Humanities, Science, Math'. However, I'm getting just 'Math,'. It only shows the last field that it found that matched with the prof's id.
CREATE OR REPLACE FUNCTION listar_cursos(prof NUMBER) RETURN VARCHAR
IS
CURSOR C1 IS
SELECT subject.name AS name FROM subject
INNER JOIN course_semester
ON subject.id = course_semester.id_subject
WHERE course_semester.id_profesor = prof
ORDER BY subject.name;
test VARCHAR(500);
BEGIN
FOR item IN C1
LOOP
test:= item.name ||',';
END LOOP;
RETURN test;
END;
/
I am aware that listagg exists, however I do not wish to use it.
In your loop, you re-assign to the test variable, instead of appending to it. This is why, at the end of the loop, it will just hold the last value of item.name.
The assignment should instead be something like
test := test || ',' || item.name
Note also that this will leave a comma at the beginning of the string. Instead of returning test, you may want to return ltrim(test, ',').
Note that you don't need to declare a cursor explicitly. The code is easier to read (in my opinion) with an implicit cursor, as shown below. I create sample tables and data to test the function, then I show the function code and how it's used.
create table subject as
select 1 id, 'Humanities' name from dual union all
select 2 , 'Science' from dual union all
select 3 , 'Math' from dual
;
create table course_semester as
select 1 id_subject, 201801 semester, 1002 as id_profesor from dual union all
select 2 , 201702 , 1002 as id_profesor from dual union all
select 3 , 201801 , 1002 as id_profesor from dual
;
CREATE OR REPLACE FUNCTION listar_cursos(prof NUMBER) RETURN VARCHAR IS
test VARCHAR(500);
BEGIN
FOR item IN
(
SELECT subject.name AS name FROM subject
INNER JOIN course_semester
ON subject.id = course_semester.id_subject
WHERE course_semester.id_profesor = prof
ORDER BY subject.name
)
LOOP
test:= test || ',' || item.name;
END LOOP;
RETURN ltrim(test, ',');
END;
/
select listar_cursos(1002) from dual;
LISTAR_CURSOS(1002)
-----------------------
Humanities,Math,Science

Sorting by value returned by a function in oracle

I have a function that returns a value and displays a similarity between tracks, i want the returned result to be ordered by this returned value, but i cannot figure out a way on how to do it, here is what i have already tried:
CREATE OR REPLACE PROCEDURE proc_list_similar_tracks(frstTrack IN tracks.track_id%TYPE)
AS
sim number;
res tracks%rowtype;
chosenTrack tracks%rowtype;
BEGIN
select * into chosenTrack from tracks where track_id = frstTrack;
dbms_output.put_line('similarity between');
FOR res IN (select * from tracks WHERE ROWNUM <= 10)LOOP
SELECT * INTO sim FROM ( SELECT func_similarity(frstTrack, res.track_id)from dual order by sim) order by sim; //that's where i am getting the value and where i am trying to order
dbms_output.put_line( chosenTrack.track_name || '(' ||frstTrack|| ') and ' || res.track_name || '(' ||res.track_id|| ') ---->' || sim);
END LOOP;
END proc_list_similar_tracks;
/
declare
begin
proc_list_similar_tracks(437830);
end;
/
no errors are given, the list is just presented unsorted, is it not possible to order by a value that was returned by a function? if so, how do i accomplish something like this? or am i just doing something horribly wrong?
Any help will be appreciated
In the interests of (over-)optimisation I would avoid ordering by a function if I could possibly avoid it; especially one that queries other tables. If you're querying a table you should be able to add that part to your current query, which enables you to use it normally.
However, let's look at your function:
There's no point using DBMS_OUTPUT for anything but debugging unless you're going to be there looking at exactly what is output every time the function is run; you could remove these lines.
The following is used only for a DBMS_OUTPUT and is therefore an unnecessary SELECT and can be removed:
select * into chosenTrack from tracks where track_id = frstTrack;
You're selecting a random 10 rows from the table TRACKS; why?
FOR res IN (select * from tracks WHERE ROWNUM <= 10)LOOP
Your ORDER BY, order by sim, is ordering by a non-existent column as the column SIM hasn't been declared within the scope of the SELECT
Your ORDER BY is asking for the least similar as the default sort order is ascending (this may be correct but it seems wrong?)
Your function is not a function, it's a procedure (one without an OUT parameter).
Your SELECT INTO is attempting to place multiple rows into a single-row variable.
Assuming your "function" is altered to provide the maximum similarity between the parameter and a random 10 TRACK_IDs it might look as follows:
create or replace function list_similar_tracks (
frstTrack in tracks.track_id%type
) return number is
sim number;
begin
select max(func_similarity(frstTrack, track_id)) into sim
from tracks
where rownum <= 10
;
return sim;
end list_similar_tracks;
/
However, the name of the function seems to preclude that this is what you're actually attempting to do.
From your comments, your question is actually:
I have the following code; how do I print the top 10 function results? The current results are returned unsorted.
declare
sim number;
begin
for res in ( select * from tracks ) loop
select * into sim
from ( select func_similarity(var1, var2)
from dual
order by sim
)
order by sim;
end loop;
end;
/
The problem with the above is firstly that you're ordering by the variable sim, which is NULL in the first instance but changes thereafter. However, the select from DUAL is only a single row, which means you're randomly ordering by a single row. This brings us back to my point at the top - use SQL where possible.
In this case you can simply SELECT from the table TRACKS and order by the function result. To do this you need to give the column created by your function result an alias (or order by the positional argument as already described in Emmanuel's answer).
For instance:
select func_similarity(var1, var2) as function_result
from dual
Putting this together the code becomes:
begin
for res in ( select *
from ( select func_similarity(variable, track_id) as f
from tracks
order by f desc
)
where rownum <= 10 ) loop
-- do something
end loop;
end;
/
You have a query using a function, let's say something like:
select t.field1, t.field2, ..., function1(t.field1), ...
from table1 t
where ...
Oracle supports order by clause with column indexes, i.e. if the field returned by the function is the nth one in the select (here, field1 is in position 1, field2 in position 2), you just have to add:
order by n
For instance:
select t.field1, function1(t.field1) c2
from table1 t
where ...
order by 2 /* 2 being the index of the column computed by the function */

Oracle pipelined function cannot access remote table (ORA-12840) when used in a union

I have created a pipelined function which returns a table. I use this function like a dynamic view in another function, in a with clause, to mark certain records. I then use the results from this query in an aggregate query, based on various criteria. What I want to do is union all these aggregations together (as they all use the same source data, but show aggregations at different heirarchical levels).
When I produce the data for individual levels, it works fine. However, when I try to combine them, I get an ORA-12840 error: cannot access a remote table after parallel/insert direct load txn.
(I should note that my function and queries are looking at tables on a remote server, via a DB link).
Any ideas what's going on here?
Here's an idea of the code:
function getMatches(criteria in varchar2) return myTableType pipelined;
...where this function basically executes some dynamic SQL, which references remote tables, as a reference cursor and spits out the results.
Then the factored queries go something like:
with marked as (
select id from table(getMatches('OK'))
),
fullStats as (
select mainTable.id,
avg(nvl2(marked.id, 1, 0)) isMarked,
sum(mainTable.val) total
from mainTable
left join marked
on marked.id = mainTable.id
group by mainTable.id
)
The reason for the first factor is speed -- if I inline it, in the join, the query goes really slowly -- but either way, it doesn't alter the status of whatever's causing the exception.
Then, say for a complete overview, I would do:
select sum(total) grandTotal
from fullStats
...or for an overview by isMarked:
select sum(total) grandTotal
from fullStats
where isMarked = 1
These work fine individually (my pseudocode maybe wrong or overly simplistic, but you get the idea), but as soon as I union all them together, I get the ORA-12840 error :(
EDIT By request, here is an obfuscated version of my function:
function getMatches(
search in varchar2)
return idTable pipelined
as
idRegex varchar2(20) := '(05|10|20|32)\d{3}';
searchSQL varchar2(32767);
type rc is ref cursor;
cCluster rc;
rCluster idTrinity;
BAD_CLUSTER exception;
begin
if regexp_like(search, '^L\d{3}$') then
searchSQL := 'select distinct null id1, id2_link id2, id3_link id3 from anotherSchema.linkTable#my.remote.link where id2 = ''' || search || '''';
elsif regexp_like(search, '^' || idRegex || '(,' || idRegex || || ')*$') then
searchSQL := 'select distinct null id1, id2, id3 from anotherSchema.idTable#my.remote.link where id2 in (' || regexp_replace(search, '(\d{5})', '''\1''') || ')';
else
raise BAD_CLUSTER;
end if;
open cCluster for searchSQL;
loop
fetch cCluster into rCluster;
exit when cCluster%NOTFOUND;
pipe row(rCluster);
end loop;
close cCluster;
return;
exception
when BAD_CLUSTER then
raise_application_error(-20000, 'Invalid Cluster Search');
return;
when others then
raise_application_error(-20999, 'API' || sqlcode || chr(10) || sqlerrm);
return;
end getMatches;
It's very simple, designed for an API with limited access to the database, in terms of sophistication (hence passing a comma delimited string as a possible valid argument): If you supply a grouping code, it returns linked IDs (it's a composite, 3-field key); however, if you supply a custom list of codes, it just returns those instead.
I'm on Oracle 10gR2; not sure which version exactly, but I can look it up when I'm back in the office :P
To be honest no idea where the issue came from but the simplest way to solve it - create a temporary table and populate it by values from your pipelined function and use the table inside WITH clause. Surely the temp table should be created but I'm pretty sure you get serious performance shift because dynamic sampling isn't applied to pipelined functions without tricks.
p.s. the issue could be fixed by with marked as ( select /*+ INLINE / id from table(getMatches('OK'))) but surely it isn't the stuff you're looking for so my suggestion is confirmed WITH does something like 'insert /+ APPEND*/' inside it'.

Oracle: Using Pseudo column value in the same Select statement

I have a scenario in oracle where i need to be able to reuse the value of a pseudo column which was calculated previously within the same select statement something like:
select 'output1' process, process || '-Output2' from Table1
I don't want to the repeat the first columns logic again in the second column for maintenance purposes, currently it is done as
select 'output1' process, 'output1' || '-Output2' name from Table1
since i have 4 such columns which depend on the previous column output, repeating would be a maintenance nightmare
Edit: i included table name and removed dual, so that no assumptions are made about that this is not a complex process, my actual statement does have 2 to 3 joins on different tables
You could calculate the value in a sub-query:
select calculated_output process, calculated_output || '-Output2' name from
(
select 'output1' calculated_output from dual
)
You can also use the subquery factoring syntax - i think it is more readable myself:
with tmp as
(
select 'output' process from dual
)
select
process, process || '-Output2' from tmp;
very similar to the previous poster, but I prefer the method I show below, it is more readable, thusly more supportable, in my opinion
select val1 || val2 || val3 as val4, val1 from (
select 'output1' as val1, '-output2' as val2, '-output3' as val3 from dual
)

Resources