I have written a trigger on one table which deletes data from another table when a condition is met.
The trigger has PRAGMA AUTONOMOUS_TRANSACTION, and it works as intended. However, I wonder whether it can cause problems in the future, say if data is inserted by multiple users/sources at the same time. Any suggestions?
Source table t1:
--------------------------------------------
| user_id | auth_name1 | auth_name2 | data |
--------------------------------------------
| 1       | Name1      | Name2      | d1   |
| 2       | Name3      | Name4      | d2   |
| 3       | Name5      | Name1      | d3   |
--------------------------------------------
Target table t2:
-------------------------------------------
| record_id | identifier | status | data1 |
-------------------------------------------
| 100       | Broken     | 11     | Name1 |
| 101       | Reminder   | 99     | Name1 |
| 102       | Broken     | 99     | Name2 |
| 103       | Broken     | 11     | Name4 |
-------------------------------------------
Trigger code:
create or replace trigger "ca"."t$t1"
  after update of auth_name1, auth_name2 on ca.t1
  for each row
declare
  pragma autonomous_transaction;
begin
  if :new.auth_name1 is not null and :new.auth_name2 is not null then
    delete from ca.t2 ml
     where ml.identifier = 'Broken'
       and ml.data1 = regexp_substr(:new.auth_name1, '\S+$') || ' ' ||
                      regexp_substr(:new.auth_name1, '^\S+')
       and ml.status = 11;
    commit;
  end if;
end t$t1;
Using an autonomous transaction for anything other than logging that you want to be preserved when the parent transaction rolls back is almost certainly an error. This is not a good use of an autonomous transaction.
What happens, for example, if I update a row in t1 but my transaction rolls back? The t2 changes have already been made and committed, so they don't roll back. That generally means that the t2 data is now incorrect. The whole point of transactions is to ensure that a set of changes is atomic and is either completely successful or completely reverted. Allowing code to be partially successful is almost never a good idea.
I'm hard-pressed to see what using an autonomous transaction buys you here. You'll often see people using autonomous transactions to incorrectly work around mutating table errors. But the code you posted wouldn't generate a mutating table error unless there was a row-level trigger on t2 that was also trying to update t1, or some similar mechanism introducing a mutating table. If that's the case, though, using an autonomous transaction is generally even worse, because the autonomous transaction cannot see the changes being made in the parent transaction, which almost certainly causes the code to behave differently than you would like.
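For illustration, here is a minimal sketch of the same trigger with the autonomous transaction removed, so the delete simply becomes part of the triggering transaction and rolls back together with it (same logic as the code above, nothing else changed):

create or replace trigger "ca"."t$t1"
  after update of auth_name1, auth_name2 on ca.t1
  for each row
begin
  -- no PRAGMA AUTONOMOUS_TRANSACTION and no COMMIT: the delete now
  -- commits or rolls back together with the parent transaction
  if :new.auth_name1 is not null and :new.auth_name2 is not null then
    delete from ca.t2 ml
     where ml.identifier = 'Broken'
       and ml.data1 = regexp_substr(:new.auth_name1, '\S+$') || ' ' ||
                      regexp_substr(:new.auth_name1, '^\S+')
       and ml.status = 11;
  end if;
end t$t1;
/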
Related
Today I was using Oracle SQL Developer version 19.2.1.247 and removed a couple of rows the graphical way: I selected the rows and pressed the red X to remove them.
It worked, all fine. The rows were removed.
However, I took a look at the logs and found this:
DELETE FROM "DB"."PERSONS" WHERE ROWID = 'AADG' AND ORA_ROWSCN = '5196' and ( "ID" is null or "ID" is not null )
Well, I don't understand that last predicate:
"ID" is null or "ID" is not null
To my eyes, this is always going to be true. I'm curious why Oracle SQL Developer adds it. Why is it used? When would this predicate ever be false?
Probably just an anomaly, but it won't have any impact on the execution (or performance) of the statement. For example, I did:
SQL> create table t as select * from scott.emp;
Table created.
and then deleted a row in SQL Dev, yielding
DELETE FROM "T" WHERE ROWID = 'AAZRaiABAAAA5eLAAL'
AND ORA_ROWSCN = '16330866545209'
and ( "EMPNO" is null or "EMPNO" is not null )
and the optimizer came up with this:
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 12 | 1 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | TABLE ACCESS BY USER ROWID| T | 1 | 12 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ORA_ROWSCN"=16330866545209)
The redundant predicates were simply ignored.
My table had no indexes or primary keys, so it would appear that SQL Developer simply takes the first column of the table for this predicate.
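If you want to reproduce that check yourself, a simple way (reusing the demo table created above) is the standard EXPLAIN PLAN route:

EXPLAIN PLAN FOR
  DELETE FROM t
   WHERE ROWID = 'AAZRaiABAAAA5eLAAL'
     AND ORA_ROWSCN = '16330866545209'
     AND ( empno IS NULL OR empno IS NOT NULL );

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

The Predicate Information section will show that only the ORA_ROWSCN filter survives.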
I'm updating an Oracle table with a UDT (user-defined type) and then querying the V$LOGMNR_CONTENTS view. I'm seeing that for each updated row there are two records: UPDATE and INTERNAL. I need to figure out how to link them, because the UPDATE operation has a temporary value in ROW_ID and the correct value appears only in the INTERNAL operation, and I'm not sure how their SCN numbers relate. My current idea is to keep a queue of UPDATEs per DATA_OBJ# and link them to the INTERNALs FIFO. Is there something nicer I'm missing?
Script:
CREATE TYPE srulon AS OBJECT (name VARCHAR2(30),phone VARCHAR2(20) );
create table root.udt_table (myrowid rowid, myudt srulon);
BEGIN rdsadmin.rdsadmin_util.switch_logfile;END;
insert into root.udt_table values (null, srulon('small', '1234'));
commit;
BEGIN rdsadmin.rdsadmin_util.switch_logfile;END;
insert into root.udt_table values (null, srulon('small', '1234'));
update root.udt_table set myrowid=rowid, myudt = srulon('smaller', rowid);
commit;
BEGIN rdsadmin.rdsadmin_util.switch_logfile;END;
Query (after START_LOGMNR for the last log):
select scn, SEQUENCE#,operation, SQL_REDO, ROW_ID from V$LOGMNR_CONTENTS
where session# = 6366 and not operation like '%XML%'
order by scn, SEQUENCE#;
Results:
| SCN | SEQUENCE# | OPERATION | ROW_ID | SQL_REDO |
| :--- | :--- | :--- | :--- | :--- |
| 240676056 | 1 | INTERNAL | AAB1avAAAAAAwT7AAA | NULL |
| 240676056 | 1 | UPDATE | AAAAAAAAAAAAAAAAAA | update "ROOT"."UDT_TABLE" a set a."MYROWID" = 'AAB1avAAAAAAwT7AAA' where a."MYROWID" IS NULL; |
| 240676057 | 5 | INTERNAL | AAB1avAAAAAAwT7AAA | NULL |
| 240676058 | 1 | UPDATE | AAAAAAAAAAAAAAAAAA | update "ROOT"."UDT_TABLE" a set a."MYROWID" = 'AAB1avAAAAAAwT7AAB' where a."MYROWID" IS NULL; |
| 240676059 | 5 | INTERNAL | AAB1avAAAAAAwT7AAB | NULL |
| 240676069 | 1 | COMMIT | AAAAAAAAAAAAAAAAAA | commit; |
The System Change Number (SCN) is Oracle's mechanism for keeping track of database transactional activity. An SCN is a stamp that defines a committed version of the database at a particular point in time; every committed transaction gets a unique SCN. The database records all changes in terms of SCNs, so the SCN is essentially a running number for database changes.
To get the current SCN, use
DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER()
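For example, from SQL*Plus (this assumes you have execute privilege on DBMS_FLASHBACK):

SELECT dbms_flashback.get_system_change_number AS current_scn
  FROM dual;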
So there is no connection between the UPDATE and the INTERNAL operation other than the fact that the UPDATE SCN is lower than the INTERNAL SCN; there is no calculated or logical link.
The mistake was to order by scn, SEQUENCE#.
Once you remove the ORDER BY clause, each INTERNAL statement follows its corresponding UPDATE.
Credit goes to srulon.
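For reference, a sketch of the corrected query, relying on the natural redo order instead of re-sorting:

select scn, SEQUENCE#, operation, SQL_REDO, ROW_ID
  from V$LOGMNR_CONTENTS
 where session# = 6366
   and not operation like '%XML%';
-- no ORDER BY: rows arrive in redo order, so each INTERNAL row
-- directly follows its corresponding UPDATE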
I have one table that needs to be split into several other tables.
The main table is just a transit (staging) table: I dump data from an Excel file into it (from 5k to 200k rows) and, using INSERT INTO ... SELECT, split it into the correct tables (five different tables).
However, the latest dataset my client sent has records with duplicate values.
The primary key is usually ENI for my table. But even this is duplicated, because the same company can be both a customer and a service provider: they have two different registrations but use the same ENI.
What I have so far:
I found a script that uses MERGE and modified it to find rows with the same ENI and update all of them with the same main_id.
| Main_id | ENI | company_name | Type |
| :--- | :--- | :--- | :--- |
| 1 | 1864 | JOHN | C |
| 2 | 351485 | JOEL | C |
| 3 | 16546 | MICHEL | C |
| 2 | 351485 | JOEL J. | S |
| 1 | 1864 | JOHN E. E. | C |
Main_id: primary key that the main DB uses
ENI: unique company number
Type: 'C' = CUSTOMER, 'S' = SERVICE PROVIDER
In some cases the duplicates can even share the same type, just like Main_id 1.
There are several other columns...
What I need:
Insert any one of the main_ids my other script has already sorted out, and set a flag on the others marking that they were not inserted. I can't delete any data; I'll need to send this information to the customer for validation.
Or maybe I simply can't do it this way and should go back to good old Excel.
Edit: as asked below, here is an example:
| Main_id | ENI | company_name | Type | RANK |
| :--- | :--- | :--- | :--- | :--- |
| 1 | 1864 | JOHN | C | 1 |
| 2 | 351485 | JOEL | C | 1 |
| 3 | 16546 | MICHEL | C | 1 |
| 2 | 351485 | JOEL J. | S | 2 |
| 1 | 1864 | JOHN E. E. | C | 2 |
RANK: ENI 1864 appears twice, so the first occurrence found gets 1, the second 2, and so on. I tried using
RANK() OVER (PARTITION BY MAIN_ID ORDER BY ENI)
RANK() OVER (PARTITION BY company_name ORDER BY ENI)
Thanks to Tejash I was able to come up with this solution:
MERGE INTO TABLEA S
USING (SELECT ROWID AS ID,
              ROW_NUMBER() OVER (PARTITION BY eni ORDER BY eni, type) AS RANK_DUPLICATED
         FROM TABLEA
      ) T
ON (S.ROWID = T.ID)
WHEN MATCHED THEN UPDATE SET S.RANK_DUPLICATED = T.RANK_DUPLICATED;
As far as I understand your problem, you just need to identify the duplicates based on two columns. You can achieve that with an analytic function as follows:
Select t.*,
row_number() Over(partition by main_id, eni order by company_name) as rnk
From your_table t
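Building on that, here is a sketch of how the rank could drive the flagging and the inserts; note that the INSERTED_FLAG column and the TARGET_TABLE name are assumptions for illustration, not part of your schema:

-- keep exactly one row per ENI, flag the rest as not inserted
UPDATE tablea
   SET inserted_flag = 'N'          -- hypothetical flag column
 WHERE rank_duplicated > 1;

-- move only the first-ranked rows on to the real tables
INSERT INTO target_table (main_id, eni, company_name, type)
SELECT main_id, eni, company_name, type
  FROM tablea
 WHERE rank_duplicated = 1;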
I've got the following (abstract) problem in ORACLE 11g:
There are two tables called STRONG and WEAK.
The primary key of WEAK is a compound consisting of the primary key of STRONG (called pk) plus an additional column (called fk).
Thus we have a generic 1:N relationship.
However, in the real world application, there is always exactly one row within the N WEAK rows related to an entry of STRONG that has a special relationship.
This is why there is an additional 1:1 relationship realised by adding the column fk to STRONG as well.
Furthermore, it might be worth noting that both tables are huge but well indexed.
Now, I have to define a view showing rows of STRONG along with some additional columns linked by that 1:1 relationship. I tried two basic approaches:
Subselects
SELECT
(SELECT some_data1 FROM weak WHERE weak.pk = strong.pk AND weak.fk = strong.fk) some_data1,
(SELECT some_data2 FROM weak WHERE weak.pk = strong.pk AND weak.fk = strong.fk) some_data2
FROM strong
Left Outer Join
SELECT
weak.some_data1,
weak.some_data2
FROM strong
LEFT OUTER JOIN weak ON weak.pk = strong.pk AND weak.fk = strong.fk
I first thought that the left-outer-join approach has to be better, and I still think that is true as long as there is no WHERE/ORDER BY clause. However, in the real-world application, user query dialog inputs are dynamically translated into extensions of the above statements. Typically, the user knows the primary key of STRONG, resulting in queries like this:
SELECT *
FROM the_view
WHERE the_view.pk LIKE '123%' --Or even the exact key
ORDER BY the_view.pk
Using the left-outer-join approach, we encountered some very serious performance problems, even though most of these SELECTs only return a few rows. I think what happened is that the hash table did not fit into memory, resulting in way too many I/O events. Thus, we went back to the subselects.
Now, I have a few questions:
Q1
Does Oracle have to compute the entire hash table for every SELECT (with ORDER BY)?
Q2
Why is the "Subselect"-way faster? Here, it might be worth noting that these columns can appear in the WHERE-clause as well.
Q3
Does it somehow matter that joining the two tables could potentially increase the number of selected rows? If so: can we somehow tell Oracle that this can never happen from a logical perspective?
Q4
In case the left-outer-join approach is not a well-performing option: the subselect approach does seem somewhat redundant. Is there a better way?
Thanks a lot!
EDIT
Due to request, I will add an explanation plan of the actual case. However, there are a few important things here:
In the above description, I tried to simplify the actual problem. In the real world, the view is a lot more complex.
One thing I left out due to simplification is that the performance issues mainly occur when using the STRONG => WEAK-Join in a nested join (see below). The actual situation looks like this:
ZV is the name of our target view - the explanation plan below refers to that view.
Z (~3M rows) join T (~1M rows)
T joins CCPP (~1M rows)
TV is a view based on T. Come to think of it... this might be critical. The front end application sort of restricts us in the way we define these views: In ZV, we have to join TV instead of T and we can not implement that T => CCPP-join in TV, forcing us to define the join TV => CCPP as a nested join.
We only encountered the performance problems in our production environment with lots of users. Obviously, we had to get rid of these problems, so they cannot be reproduced right now. The response times of the statements below are totally fine.
The Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 717K| 73M| | 13340 (2)|
| 1 | HASH JOIN OUTER | | 717K| 73M| 66M| 13340 (2)|
| 2 | VIEW | | 687K| 59M| | 5 (0)|
| 3 | NESTED LOOPS OUTER | | 687K| 94M| | 5 (0)|
| 4 | NESTED LOOPS OUTER | | 1 | 118 | | 4 (0)|
| 5 | TABLE ACCESS BY INDEX ROWID | Z | 1 | 103 | | 3 (0)|
| 6 | INDEX UNIQUE SCAN | SYS_C00245876 | 1 | | | 2 (0)|
| 7 | INDEX UNIQUE SCAN | SYS_C00245876 | 1798K| 25M| | 1 (0)|
| 8 | VIEW PUSHED PREDICATE | TV | 687K| 17M| | 1 (0)|
| 9 | NESTED LOOPS OUTER | | 1 | 67 | | 2 (0)|
| 10 | TABLE ACCESS BY INDEX ROWID| T | 1 | 48 | | 2 (0)|
| 11 | INDEX UNIQUE SCAN | SYS_C00245609 | 1 | | | 1 (0)|
| 12 | INDEX UNIQUE SCAN | SYS_C00254613 | 1 | 19 | | 0 (0)|
| 13 | TABLE ACCESS FULL | CCPP | 5165K| 88M| | 4105 (3)|
--------------------------------------------------------------------------------------------------------------
The real question is: how many records does your query return?
Only 10 records, or 10,000 (or 10M) where you expect to see the first 10 rows quickly?
For the latter case the subquery solution indeed works better, as you need no sort and you look up the WEAK table only a small number of times.
For the former case (i.e. the number of selected rows in both tables is small) I'd expect an execution plan as follows:
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 336 | 100 (1)| 00:00:02 |
| 1 | SORT ORDER BY | | 4 | 336 | 100 (1)| 00:00:02 |
| 2 | NESTED LOOPS OUTER | | 4 | 336 | 99 (0)| 00:00:02 |
| 3 | TABLE ACCESS BY INDEX ROWID| STRONG | 4 | 168 | 94 (0)| 00:00:02 |
|* 4 | INDEX RANGE SCAN | STRONG_IDX | 997 | | 4 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| WEAK | 1 | 42 | 2 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | WEAK_IDX | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("STRONG"."PK" LIKE '1234%')
filter("STRONG"."PK" LIKE '1234%')
6 - access("WEAK"."PK"(+)="STRONG"."PK" AND "WEAK"."FK"(+)="STRONG"."FK")
filter("WEAK"."PK"(+) LIKE '1234%')
If you see a FULL TABLE SCAN on one table or the other, the optimizer's impression could be that the predicate pk LIKE '123%' will return too many records and that index access would be slower.
This could be a good or a bad guess, so you may need to check your table statistics and cardinality estimates.
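As a quick sanity check on the statistics, something along these lines (a sketch; adjust owner and table names to your schema):

-- when were the tables last analyzed, and what row counts does Oracle assume?
SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 WHERE table_name IN ('STRONG', 'WEAK');

-- refresh the statistics if they look stale
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'STRONG', cascade => TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'WEAK',   cascade => TRUE);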
Some additional information follows
Q1
If Oracle performs a HASH JOIN, the whole data source (typically the smaller one) must be read into memory, into the hash table. This is the whole table, or the part of it filtered by the WHERE/ON clause
(in your case, only the records with pk LIKE '123%').
Q2
This may only be an impression, because you see the first records quickly: the subquery is performed
only for the first few fetched rows.
To know the exact answer you must examine (or post) the execution plans.
Q3
No. Joining two tables never arbitrarily increases the number of selected rows; it returns exactly the number of rows defined by the SQL standard.
It is your responsibility to define the join on a unique/primary key to avoid duplication.
Q4
You may of course select something like some_data1 || '#' || some_data2 in a single subquery, but it is your responsibility to decide whether that is safe.
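For example, something along these lines (the '#' separator is an arbitrary choice and must not occur in the data):

SELECT s.pk,
       (SELECT w.some_data1 || '#' || w.some_data2
          FROM weak w
         WHERE w.pk = s.pk
           AND w.fk = s.fk) AS combined_data
  FROM strong s;

The outer query (or the client) then has to split combined_data again, which is exactly the safety trade-off mentioned above.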
I'm having a performance issue when deploying an app developed on 10g XE in a client's 9i server. The same query produces completely different query plans depending on the server:
SELECT DISTINCT FOO.FOO_ID AS C0,
GEE.GEE_CODE AS C1,
TO_CHAR(FOO.SOME_DATE, 'DD/MM/YYYY') AS C2,
TMP_FOO.SORT_ORDER AS SORT_ORDER_
FROM TMP_FOO
INNER JOIN FOO ON TMP_FOO.FOO_ID=FOO.FOO_ID
LEFT JOIN BAR ON FOO.FOO_ID=BAR.FOO_ID
LEFT JOIN GEE ON FOO.GEE_ID=GEE.GEE_ID
ORDER BY SORT_ORDER_;
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production:
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 67 | 10 (30)| 00:00:01 |
| 1 | SORT UNIQUE | | 1 | 67 | 9 (23)| 00:00:01 |
| 2 | NESTED LOOPS OUTER | | 1 | 67 | 8 (13)| 00:00:01 |
|* 3 | HASH JOIN OUTER | | 1 | 48 | 7 (15)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 44 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | TMP_FOO | 1 | 26 | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| FOO | 1 | 18 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | FOO_PK | 1 | | 0 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL | BAR | 1 | 4 | 3 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID | GEE | 1 | 19 | 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | GEE_PK | 1 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Oracle9i Release 9.2.0.1.0 - 64bit Production:
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 98M| 6546M| | 3382K|
| 1 | SORT UNIQUE | | 98M| 6546M| 14G| 1692K|
|* 2 | HASH JOIN OUTER | | 98M| 6546M| 137M| 2874 |
| 3 | VIEW | | 2401K| 109M| | 677 |
|* 4 | HASH JOIN OUTER | | 2401K| 169M| 40M| 677 |
| 5 | VIEW | | 587K| 34M| | 24 |
|* 6 | HASH JOIN | | 587K| 34M| | 24 |
| 7 | TABLE ACCESS FULL| TMP_FOO | 8168 | 207K| | 10 |
| 8 | TABLE ACCESS FULL| FOO | 7188 | 245K| | 9 |
| 9 | TABLE ACCESS FULL | BAR | 409 | 5317 | | 1 |
| 10 | TABLE ACCESS FULL | GEE | 4084 | 89848 | | 5 |
----------------------------------------------------------------------------
As far as I can tell, indexes exist and are correct. What are my options to make Oracle 9i use them?
Update #1: TMP_FOO is a temporary table and it has no rows in this test. FOO is a regular table with 13,035 rows in my local XE; not sure why the query plan shows 1, perhaps it's realising that an INNER JOIN against an empty table won't require a full table scan :-?
Update #2: I've spent a couple of weeks trying everything and nothing provided a real enhancement: query rewriting, optimizer hints, changes in DB design, getting rid of temp tables... Finally, I got a copy of the same 9.2.0.1.0 unpatched Oracle version the customer has (with obvious architecture difference), installed it at my site and... surprise! In my 9i, all execution plans come instantly and queries take from 1 to 10 seconds to complete.
At this point, I'm almost convinced that the customer has a serious misconfiguration issue.
It looks like either you don't have data in your 10g Express database, or your statistics are not collected properly. In either case it looks to Oracle like there aren't many rows, and therefore an index range scan is appropriate.
In your 9i database, the statistics look like they are collected properly and Oracle sees a 4-table join with lots of rows and no WHERE clause. In that case, since you haven't supplied a hint, Oracle builds an explain plan with the default ALL_ROWS optimizer behaviour: it finds the plan that is the most performant at returning all rows to the last one. In that case the HASH JOIN with full table scans is brutally efficient; it returns big sets of rows faster than an index-driven NESTED LOOPS join.
Maybe you want to use an index because you are only interested in the first few rows of the query. In that case, use the /*+ FIRST_ROWS */ hint, which tells Oracle that you care more about first-row response time than about total query time.
Maybe you want to use an index because you think it would result in a faster total query time. You can force a specific plan through hints like USE_NL and USE_HASH, but most of the time you will see that, if the statistics are up to date, the optimizer has already picked the most efficient plan.
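For example, the query from the question with the first-rows goal applied:

SELECT /*+ FIRST_ROWS */ DISTINCT
       FOO.FOO_ID AS C0,
       GEE.GEE_CODE AS C1,
       TO_CHAR(FOO.SOME_DATE, 'DD/MM/YYYY') AS C2,
       TMP_FOO.SORT_ORDER AS SORT_ORDER_
  FROM TMP_FOO
 INNER JOIN FOO ON TMP_FOO.FOO_ID = FOO.FOO_ID
  LEFT JOIN BAR ON FOO.FOO_ID = BAR.FOO_ID
  LEFT JOIN GEE ON FOO.GEE_ID = GEE.GEE_ID
 ORDER BY SORT_ORDER_;

(Note that the DISTINCT and ORDER BY still force a sort, which limits how much a first-rows goal can help here.)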
Update: I saw your update about TMP_FOO being a temporary table with no rows. The problem with temporary tables is that they have no stats, so my answer above doesn't apply perfectly to them. Since the temp table has no stats, Oracle has to make a guess (here it chooses, quite arbitrarily, 8168 rows), which results in an inefficient plan.
This would be a case where it could be appropriate to use hints. You have several options:
A mix of LEADING, USE_NL and USE_HASH hints can force a specific plan (LEADING to set the order of the joins and USE* to set the join method).
You could use the undocumented CARDINALITY hint to give additional information to the optimizer, as described in an AskTom article. While the hint is undocumented, it is arguably safe to use. Note: on 10g+ the DYNAMIC_SAMPLING hint is the documented alternative.
You can also set the statistics on the temporary table beforehand with the DBMS_STATS.SET_TABLE_STATS procedure. This last option is quite radical, since it potentially modifies the plan of all queries against this temp table.
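A sketch of that last option; the row count and block figures below are pure assumptions that you would tune to your typical load:

BEGIN
  -- tell the optimizer to expect roughly 100 rows in TMP_FOO
  DBMS_STATS.SET_TABLE_STATS(
    ownname => USER,
    tabname => 'TMP_FOO',
    numrows => 100,
    numblks => 4,
    avgrlen => 50);
END;
/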
It could be that 9i is doing it exactly right. According to the stats posted, the Oracle 9i database believes it is dealing with a statement returning 98 million rows, whereas the 10g database thinks it will return 1 row. It could be that both are correct, i.e. the amount of data in the two databases is very, very different. Or it could be that you need to gather stats in either or both databases to get a more accurate query plan.
In general it is hard to tune queries when the target version is older and a different edition. You have no chance of tuning a query without realistic volumes of data, or at least realistic statistics.
If you have a good relationship with your client, you could ask them to export their statistics using DBMS_STATS.EXPORT_SCHEMA_STATS(). You can then import the stats using the matching IMPORT_SCHEMA_STATS procedure.
Otherwise you'll have to fake the numbers yourself using the DBMS_STATS.SET_TABLE_STATS() procedure.
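A sketch of the stats transfer; the schema name APP and the staging table name STATS_EXPORT are placeholders:

-- on the client's database
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP', stattab => 'STATS_EXPORT');
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_EXPORT');
END;
/
-- ship the STATS_EXPORT table across (e.g. via exp/imp), then locally:
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_EXPORT');
END;
/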
You could add the following hints, which would "force" Oracle to use your indexes (if possible). Note that Oracle only honours the first hint comment after the SELECT keyword, so both hints must go into a single comment:
Select /*+ index(FOO FOO_PK) index(GEE GEE_PK) */
From ...
Or try the FIRST_ROWS hint to indicate you're not going to fetch all of these estimated 98 million rows. Otherwise I doubt the indexes will make a huge difference, because you have no WHERE clause, so Oracle would have to read these tables anyway.
The customer had changed a default setting in order to support a very old third-party legacy application: the static parameter OPTIMIZER_FEATURES_ENABLE had been changed from the default value in 9i (9.2.0) to 8.1.7.
I made the same change in a local copy of 9i and got the same problems: explain plans that took hours to compute, and so on.
(Knowing this, I've asked a related question at ServerFault, but I believe this solves the original question.)