using spring transaction management with select queries [duplicate] - spring

I don't use Stored procedures very often and was wondering if it made sense to wrap my select queries in a transaction.
My procedure has three simple select queries, two of which use the returned value of the first.

In a highly concurrent application it could (theoretically) happen that data you've read in the first select is modified before the other selects are executed.
If that is a situation that could occur in your application you should use a transaction to wrap your selects. Make sure you pick the correct isolation level though, not all transaction types guarantee consistent reads.
Update :
You may also find this article on concurrent update/insert solutions (aka upsert) interesting. It puts several common methods of upsert to the test to see what method actually guarantees data is not modified between a select and the next statement. The results are, well, shocking I'd say.

Transactions are usually used when you have CREATE, UPDATE or DELETE statements and you want to have the atomic behavior, that is, Either commit everything or commit nothing.
However, you could use a transaction for READ select statements to:
Make sure nobody else could update the table of interest while the bunch of your select query is executing.
Have a look at this msdn post.

Most databases run every single query in a transaction even if not specified it is implicitly wrapped. This includes select statements.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.
https://www.postgresql.org/docs/current/tutorial-transactions.html

Related

PostgreSQL vs. Oracle default transaction management

In PostgreSQL, if you encounter an error in transaction (for example when your insert statement violates unique constraint), the whole transaction is aborted, you cannot commit it and no rows are inserted:
database=# begin;
BEGIN
database=# insert into table (id, something) values ('1','whatever');
INSERT 0 1
database=# insert into table (id, something) values ('1','whatever');
ERROR: duplicate key value violates unique constraint "table_id_key"
Key (id)=(1) already exists.
database=# insert into table (id, something) values ('2','whatever');
ERROR: current transaction is aborted, commands ignored until end of transaction block
database=# rollback;
database=# select * from table;
id | something |
-----+------------+
(0 rows)
You can change that by setting ON_ERROR_ROLLBACK to "on" or "interactive", after that you can do multiple inserts ignoring errors, commit and have only successfully inserted rows in table after transaction end.
database=# \set ON_ERROR_ROLLBACK interactive
In Oracle, this is the default transaction management behaviour, which surprises me. Isn't this completely counterintuitive and dangerous?
When I start a transaction I want to be sure that all the statements were successfull. What if my multiple inserts comprise some kind of an object or data structure? I end up completely unaware of the data state in my database and should be checking it after the commit.
If one of the inserts fails I want to be sure that other inserts will be rollbacked or not even evaluated after the first error, which is exactly how it's done in PostgreSQL.
Why does Oracle have such way of transaction management as a default, and why is it considered good practice?
For example, some random guy here in comments
This is a very neat feature.
I don't understand this, though: "Normally, any error you make will
throw an exception and cause your current transaction to be marked as
aborted. This is sane and expected behavior..."
No, it's really not. Oracle doesn't work this way, nor does MySQL. I
have no experience with MSSQL or DB2 but I'll bet a dollar each they
don't work this way either. There no intuitive reason why a syntax
error, or any other error for that matter, should abort a transaction.
I can only assume there's either some limitation deep in the Postgres
guts that requires this behavior, or that it conforms to some obscure
part of the SQL standard that everyone else sensibly ignores. There's
certainly no API / UX reason why it should work this way.
We really shouldn't be too proud of any workarounds we've developed
for this pathological behavior. It's like IT Stockholm Syndrome.
Does not it violate even the definition of the transaction?
Transactions provide an "all-or-nothing" proposition, stating that
each work-unit performed in a database must either complete in its
entirety or have no effect whatsoever.
I agree with you. I think it's a mistake not to abort the whole tx. But people are used to that, so they think it's reasonable and correct. Like people who use MySQL think that the DBMS should accept 0000-00-00 as a date, or people using Oracle expect that '' IS NULL.
The idea that there's a clear distinction between a syntax error and something else is flawed.
If I write
BEGIN;
CREATE TABLE new_customers (...);
INSET INTO new_customers (...)
SELECT ... FROM customers;
DROP TABLE customers;
COMMIT;
I don't care that it's a typo resulting in a syntax error that caused me to lose my data. I care that the transaction didn't successfully execute all its statements but still committed.
It'd be technically feasible to allow soft rollback in PostgreSQL before any rows are actually written by a statement - probably before we even enter the executor. So failures in the parse and parameter binding phases could allow the tx not to be aborted. We have a statement memory context we could use to clean up.
However, once the statement starts changing rows, it's doing so on disk with the same transaction ID as the prior statements in the tx. So you can't roll it back without rolling back the whole tx. To allow statement rollback Pg needs to assign a new subtransaction ID. That costs resources. You can do it explicitly with SAVEPOINTs when you want to, and internally that's what psql is doing. In theory we could allow the server to do this implicitly for each statement to implement statement rollback, just at a performance cost. But I doubt any patch implementing this would get committed, at least not without a LOT of argument, because most of the PostgreSQL team are (IMO reasonably) not fond of "whoops, that broke but we'll continue anyway" transaction semantics.

Is it possible to know what triggers would fire given a query?

I have a database with (too) many triggers. They can cascade.
I have a query, which seems simple, and by no means I can remember the effect of all triggers. So, that simple query might actually be not simple at all and not do what I expect.
Is there a way to know what triggers would fire before running the query, or what triggers have fired after running it (not committed yet)?
I am not really interested in queries like SELECT … FROM user_triggers WHERE … because I know them already, and also because it does not tell me whether the firing conditions of the triggers will be met in my query.
Thanks
"I have a database with (too) many triggers. They can cascade."
This is just one of the reasons why many people anathematize triggers.
"Is there a way to know what triggers would fire before running the
query"
No. Let's consider something which you might find in an UPDATE trigger body:
if :new.sal > :old.sal * 1.2 then
insert into big_pay_rises values (:new.empno, :old.sal, :new.sal, sysdate);
end if;
How could we tell whether the trigger on BIG_PAY_RISES will fire? It might, it might not depending on an algorithm we cannot parse out of the DML statement.
So, the best you can hope for is a recursive search of DBA_TRIGGERS and DBA_DEPENDENCIES to identify all the triggers which might feature in your cascade. But it's going to be impossible to identify which ones will definitely fire in any given scenario.
" or what triggers have fired after running it (not committed yet)?"
As others have pointed out, logging is one option. But if you are using Oracle 11g you have another option: the PL/SQL Hierarchical Profiler. This is a non-intrusive tool which tracks all the PL/SQL program units touched by a PL/SQL call, including triggers. One of the cool features of the Hierarchical Profiler is that it includes PUs which belong in other schemas, which might be useful with cascading triggers.
So, you just need to wrap your SQL in an anonymous block and call it with the Hierarchical Profiler. Then you can filter you report to reveal only the triggers which fired. Find out more .
Is there a way to know what triggers would fire before running the query, or what triggers have fired after running it (not committed yet)?
To address this I would run the query inside an anonymous block using a PL/SQL debugger.
There is no such thing called parse through your query and give u the triggers involved in your query. It is going to be as simple as this. Just pick the table names from the query you are running and for each one just list the triggers using the following query before running the query. Isn't that simple enough?
select trigger_name
, trigger_type
, status
from dba_triggers
where owner = '&owner'
and table_name = '&table'
order by status, trigger_name

Do the time of the COMMIT and ROLLBACK affect performance?

Suppose I have a set of ID . For each ID , I will insert many records to many different tables based on the ID .Between inserting difference tables, different business checks will be called . If any checking fail , all the records that are inserted based on this ID will be ROLLBACK .This bulk insert action is done through using PL/SQL . Do the time of the COMMIT and ROLLBACK affect the performance and how does it affect ? For example , should I COMMIT after finish the process for one ID or COMMIT after finish all ID?
This is not so much of a performance decision but a process design decision. Do you want the other IDs to stay in the database when you have to roll back a faulty ID?
For obvious reasons, rollback takes longer when more rows must be rolled back. Rollback usually takes longer (sometimes much longer!) than the operations that have to be rolled back. Commit is always fast in Oracle, so it probably doesn't matter how often you commit in that regard.
Your problem description indicates you have a large set of smaller logical transactions (each new ID is a transaction). You should commit each logical transaction. The two reasons to wait to commit the entire set of transactions are:
If the entire set of transactions is in fact a transaction itself - all inserts must succeed for any rows to be committed. In that context, your smaller "transactions" aren't truly transactions.
You don't have a restart capability in your bulk load process, which in effect makes this a special case of item 1. If your bulk load process aborts, you need a way to skip successfully applied ID's.
Tom Kyte's advice is to commit each logical unit of work - the transaction.
Don't take the transaction time longer. make it short as possible as you can. Because according to your query some locks have been created. This locks may cause perfomance issues... so do it ID by ID...
There are two "forces" at work....
locking
during your open transaction, oracle puts locks on the changed rows.
whenever another transaction needs to update any of the locked rows,
it has to wait.
in the worst case, you can even build a deadlock.
synchronous write
every commit performs a synchronous write.
(there are ways to disable that, but it is usually the thing everybody wants: integrity).
that synchronous write can take (much) longer then the a regular write (that can be buffered).
Not to forget that there is usually an additional network round trip involved with an commit.
so, the one force says "commit as soon as possible (considering your integrity requirements)" the other says "commit as as less often as possible".
There are some other issues to consider as well, e.g. the maximum transaction size. every uncommited transaction needs some temporary space. the bigger the transaction gets, the more you need. You can also run into ORA-01555 "snapshot too old".
If there is any advice to give, then it is to implement a configurable "commit frequency" so that you can easily change it as needed.
One option if you need to control the individual sets but retain the ability to commit or rollback the entire transaction is to use savepoints. You can set a savepoint at the beginning of the outermost loop, then rollback to it if an error occurs. You might end up with something like this:
begin
--Initial batch logging
for r_record in cur_cursor loop
savepoint s_cursor loop;
begin
--Process rows
exception
when others then
rollback to s_cursor;
end;
end loop;
--Final batch logging
exception
when others then
rollback;
raise;
end;

triggers statement level atomicity

What does Oracle mean by "statement level atomicity"?
Let's cite a couple of chapters from the Concepts:
Statement-Level Read Consistency
Oracle always enforces statement-level read consistency. This guarantees that all the data returned by a single query comes from a single point in time--the time that the query began. Therefore, a query never sees dirty data nor any of the changes made by transactions that commit during query execution. As query execution proceeds, only data committed before the query began is visible to the query. The query does not see changes committed after statement execution begins.
Statement-Level Rollback
If at any time during execution a SQL statement causes an error, all effects of the statement are rolled back. The effect of the rollback is as if that statement had never been run. This operation is a statement-level rollback.
A SQL statement that fails causes the loss only of any work it would have performed itself. It does not cause the loss of any work that preceded it in the current transaction.
It means that any single SQL statement you run is atomic in nature - it will either succeed completely or fail completely. If your SQL statement fails, and triggers that would have run as a result of that SQL statement will fail as well.
This is a statement about the nature of transactions: a Unit Of Work either succeeds or completes in its entirety. If your transaction comprises two inserts plus an update, and the update statement fails, the database will rollback all three statements. Any recursive SQL or other SQL such as that included in triggers is included in the atomic scope of the transaction. Find out more.
Atomicity means the A in the ACID principles of transaction management.

How to find out when an Oracle table was updated the last time

Can I find out when the last INSERT, UPDATE or DELETE statement was performed on a table in an Oracle database and if so, how?
A little background: The Oracle version is 10g. I have a batch application that runs regularly, reads data from a single Oracle table and writes it into a file. I would like to skip this if the data hasn't changed since the last time the job ran.
The application is written in C++ and communicates with Oracle via OCI. It logs into Oracle with a "normal" user, so I can't use any special admin stuff.
Edit: Okay, "Special Admin Stuff" wasn't exactly a good description. What I mean is: I can't do anything besides SELECTing from tables and calling stored procedures. Changing anything about the database itself (like adding triggers), is sadly not an option if want to get it done before 2010.
I'm really late to this party but here's how I did it:
SELECT SCN_TO_TIMESTAMP(MAX(ora_rowscn)) from myTable;
It's close enough for my purposes.
Since you are on 10g, you could potentially use the ORA_ROWSCN pseudocolumn. That gives you an upper bound of the last SCN (system change number) that caused a change in the row. Since this is an increasing sequence, you could store off the maximum ORA_ROWSCN that you've seen and then look only for data with an SCN greater than that.
By default, ORA_ROWSCN is actually maintained at the block level, so a change to any row in a block will change the ORA_ROWSCN for all rows in the block. This is probably quite sufficient if the intention is to minimize the number of rows you process multiple times with no changes if we're talking about "normal" data access patterns. You can rebuild the table with ROWDEPENDENCIES which will cause the ORA_ROWSCN to be tracked at the row level, which gives you more granular information but requires a one-time effort to rebuild the table.
Another option would be to configure something like Change Data Capture (CDC) and to make your OCI application a subscriber to changes to the table, but that also requires a one-time effort to configure CDC.
Ask your DBA about auditing. He can start an audit with a simple command like :
AUDIT INSERT ON user.table
Then you can query the table USER_AUDIT_OBJECT to determine if there has been an insert on your table since the last export.
google for Oracle auditing for more info...
SELECT * FROM all_tab_modifications;
Could you run a checksum of some sort on the result and store that locally? Then when your application queries the database, you can compare its checksum and determine if you should import it?
It looks like you may be able to use the ORA_HASH function to accomplish this.
Update: Another good resource: 10g’s ORA_HASH function to determine if two Oracle tables’ data are equal
Oracle can watch tables for changes and when a change occurs can execute a callback function in PL/SQL or OCI. The callback gets an object that's a collection of tables which changed, and that has a collection of rowid which changed, and the type of action, Ins, upd, del.
So you don't even go to the table, you sit and wait to be called. You'll only go if there are changes to write.
It's called Database Change Notification. It's much simpler than CDC as Justin mentioned, but both require some fancy admin stuff. The good part is that neither of these require changes to the APPLICATION.
The caveat is that CDC is fine for high volume tables, DCN is not.
If the auditing is enabled on the server, just simply use
SELECT *
FROM ALL_TAB_MODIFICATIONS
WHERE TABLE_NAME IN ()
You would need to add a trigger on insert, update, delete that sets a value in another table to sysdate.
When you run application, it would read the value and save it somewhere so that the next time it is run it has a reference to compare.
Would you consider that "Special Admin Stuff"?
It would be better to describe what you're actually doing so you get clearer answers.
How long does the batch process take to write the file? It may be easiest to let it go ahead and then compare the file against a copy of the file from the previous run to see if they are identical.
If any one is still looking for an answer they can use Oracle Database Change Notification feature coming with Oracle 10g. It requires CHANGE NOTIFICATION system privilege. You can register listeners when to trigger a notification back to the application.
Please use the below statement
select * from all_objects ao where ao.OBJECT_TYPE = 'TABLE' and ao.OWNER = 'YOUR_SCHEMA_NAME'

Resources