Are inserts with sequence numbers nextval atomic for this number? - oracle

If I am inserting rows into a table in auto-commit mode, where one column is populated from a sequence's nextval, are these values guaranteed to become visible in the order they were inserted? I am wondering whether the following scenario, with three concurrent connections, is possible:
Connection 1 inserts foo (gets sequence number 1).
Connection 2 inserts bar (gets sequence number 2).
Connection 3 selects all rows and observes bar with sequence number 2, but not foo with sequence number 1.

Oracle sequences are thread safe and the numbers are always handed out in order. It is guaranteed that the numbers produced are unique.
But you might not see the insert of another session immediately if that session still has an open transaction. This can create a temporary gap in the sequence numbers you see from SELECTs.
Furthermore, if a transaction that has called NEXTVAL is rolled back, this causes a permanent gap in the sequence: sequences are not affected by rollbacks or commits, and an increment is always immediate and definitive.
See: CREATE SEQUENCE (Oracle Help Center)

"Auto-commit" is not a concept of the Oracle database. That is, there is no "auto-commit" mode or feature in the database -- it is only implemented in tools (like SQL*Plus).
A tool can implement "auto-commit" in different ways, but in most cases, it's probably along the lines of this:
(user's command, e.g., INSERT INTO ...)
<success response from Oracle server>
COMMIT;
In this case, the COMMIT does not get issued by the tool until there is a positive response from the server that the user's command has been executed. In a networked environment with >10ms latency, plus the vagaries of multithreading on the Oracle server itself, I would say there could be situations where session #2's automatic COMMIT gets processed on the server before session #1's and that, therefore, it is possible for session #3 to observe "bar" but not "foo".
The COMMIT timing of each session relative to the time at which session #3 starts its query is the only thing that matters. Session #3 will see whatever work session #1 and/or session #2 have committed as of the time session #3's query starts.
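For illustration, a minimal timeline of the race the question asks about, assuming a hypothetical table t with a number column n fed by a sequence seq:
-- t0  session 1:  INSERT INTO t (n) VALUES (seq.NEXTVAL);  -- gets 1 ("foo")
-- t1  session 2:  INSERT INTO t (n) VALUES (seq.NEXTVAL);  -- gets 2 ("bar")
-- t2  session 2:  COMMIT;                                  -- its auto-commit lands first
-- t3  session 3:  SELECT n FROM t;                         -- sees only 2 ("bar")
-- t4  session 1:  COMMIT;                                  -- "foo" becomes visible only now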

Related

Database read locking

I have a use case where I need to do the following things in one transaction:
start the transaction
INSERT an item into a table
SELECT all the items in the table
dump the selected items into a file (this file is versioned and another program always uses the latest version)
If all the above things succeed, commit the transaction, if not, rollback.
If two transactions begin almost simultaneously, it is possible that before the first transaction A commits what it has inserted into the table, the second transaction B has already performed the SELECT operation (step 3), whose result does not yet contain the item inserted by the first transaction (as it is not yet committed by A, and so not visible to B). In this case, when A finishes, it will have correctly dumped a file File1 containing its inserted item. Later, when B finishes, it will have dumped another file File2 containing only its own inserted item but not the one inserted by A. Since File2 is more recent, we will use File2. The problem is that File2 doesn't contain the item inserted by A even though this item is indeed in the DB.
I would like to know if it is feasible to solve this problem by locking reads (SELECT) of the table while a transaction is inserting something into it, until that transaction commits or rolls back, and if so, how this locking can be implemented in Spring with Oracle as the DB.
You need some sort of synchronization between the transactions:
start the transaction
Obtain a lock to prevent the transaction in another session from proceeding, or wait until the transaction in the other session finishes
INSERT an item into a table
SELECT ......
......
Commit and release the lock
The easiest way is to use the LOCK TABLE command, at least in SHARE mode (SHARE ROW EXCLUSIVE or EXCLUSIVE modes can also be used, but they are too restrictive for this case).
The advantage of this approach is that the lock is automatically released at commit or rollback.
The disadvantage is that this lock can interfere with other transactions in the system that update this table at the same time, which could reduce overall performance.
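A minimal sketch of that approach, assuming the items live in a hypothetical table called items fed by a sequence items_seq (the sketch uses SHARE ROW EXCLUSIVE rather than plain SHARE, because two sessions could both acquire SHARE and then deadlock on their inserts):
LOCK TABLE items IN SHARE ROW EXCLUSIVE MODE;  -- waits if another session holds it
INSERT INTO items (id, payload) VALUES (items_seq.NEXTVAL, 'new item');
SELECT id, payload FROM items;                 -- sees all committed rows plus our own
-- ... dump the result to the versioned file ...
COMMIT;                                        -- releases the table lock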
Another approach is to use the DBMS_LOCK package. This lock doesn't affect other transactions that don't explicitly use that lock. The drawback is that this package is difficult to use: the lock is not released on commit or rollback, so you must explicitly release it at the end of the transaction, and thus all exceptions must be handled carefully, otherwise a deadlock could easily occur.
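A minimal sketch of the DBMS_LOCK approach (the lock name is arbitrary and the insert/select/dump steps are placeholders; note the explicit RELEASE in the exception handler):
DECLARE
  v_handle VARCHAR2(128);
  v_result INTEGER;
BEGIN
  DBMS_LOCK.ALLOCATE_UNIQUE('my_insert_select_lock', v_handle);
  v_result := DBMS_LOCK.REQUEST(lockhandle => v_handle,
                                lockmode   => DBMS_LOCK.X_MODE,
                                timeout    => 60);
  IF v_result NOT IN (0, 4) THEN  -- 0 = success, 4 = we already own the lock
    RAISE_APPLICATION_ERROR(-20001, 'Could not obtain lock: ' || v_result);
  END IF;
  -- ... INSERT, SELECT, dump to file ...
  COMMIT;
  v_result := DBMS_LOCK.RELEASE(v_handle);
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    v_result := DBMS_LOCK.RELEASE(v_handle);  -- release even on failure, or others deadlock
    RAISE;
END;
/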
One more solution is to create a "dummy" table with a single row in it, for example:
CREATE TABLE my_special_lock_table(
  x INT
);
INSERT INTO my_special_lock_table VALUES(1);
COMMIT;
and then use SELECT x FROM my_special_lock_table FOR UPDATE
or - even easier - a simple UPDATE my_special_lock_table SET x = x in your transaction.
This will place an exclusive lock on a row in this table and synchronize only this one transaction.
A drawback is that another "dummy" table must be created.
But this solution doesn't affect the other transactions in the system, the lock is automatically released upon commit or rollback, and it is portable - it should work in all other databases, not only in Oracle.
Use Spring's REPEATABLE_READ or SERIALIZABLE isolation levels:
REPEATABLE_READ A constant indicating that dirty reads and
non-repeatable reads are prevented; phantom reads can occur. This
level prohibits a transaction from reading a row with uncommitted
changes in it, and it also prohibits the situation where one
transaction reads a row, a second transaction alters the row, and the
first transaction rereads the row, getting different values the second
time (a "non-repeatable read").
SERIALIZABLE A constant indicating that dirty reads, non-repeatable
reads and phantom reads are prevented. This level includes the
prohibitions in ISOLATION_REPEATABLE_READ and further prohibits the
situation where one transaction reads all rows that satisfy a WHERE
condition, a second transaction inserts a row that satisfies that
WHERE condition, and the first transaction rereads for the same
condition, retrieving the additional "phantom" row in the second read.
With serializable or repeatable read, the group will be protected from non-repeatable reads:
connection 1:                              connection 2:
set transaction isolation level
    repeatable read
begin transaction
select name from users where id = 1
                                           update users set name = 'Bill'
                                               where id = 1          |
select name from users where id = 1                                  |
commit transaction                                                   |
                                                                     |--> executed here
In this scenario, the update will block until the first transaction is complete.
Higher isolation levels are rarely used because they lower the number of people that can work in the database at the same time. At the highest level, serializable, a reporting query halts any update activity.
I think you need to serialize the whole transaction. While a SELECT ... FOR UPDATE could work, it does not really buy you anything, since you would be selecting all rows. You may as well just take and release a lock, using DBMS_LOCK()

PostgreSQL vs. Oracle default transaction management

In PostgreSQL, if you encounter an error in a transaction (for example when your insert statement violates a unique constraint), the whole transaction is aborted; you cannot commit it and no rows are inserted:
database=# begin;
BEGIN
database=# insert into table (id, something) values ('1','whatever');
INSERT 0 1
database=# insert into table (id, something) values ('1','whatever');
ERROR:  duplicate key value violates unique constraint "table_id_key"
DETAIL:  Key (id)=(1) already exists.
database=# insert into table (id, something) values ('2','whatever');
ERROR: current transaction is aborted, commands ignored until end of transaction block
database=# rollback;
ROLLBACK
database=# select * from table;
 id | something
----+-----------
(0 rows)
You can change that by setting ON_ERROR_ROLLBACK to "on" or "interactive"; after that you can do multiple inserts ignoring errors, commit, and have only the successfully inserted rows in the table after the transaction ends.
database=# \set ON_ERROR_ROLLBACK interactive
In Oracle, this is the default transaction management behaviour, which surprises me. Isn't this completely counterintuitive and dangerous?
When I start a transaction I want to be sure that all the statements were successful. What if my multiple inserts make up some kind of object or data structure? I would end up completely unaware of the state of the data in my database and would have to check it after the commit.
If one of the inserts fails I want to be sure that the other inserts will be rolled back or not even evaluated after the first error, which is exactly how it's done in PostgreSQL.
Why does Oracle have such way of transaction management as a default, and why is it considered good practice?
For example, some random guy here in the comments:
This is a very neat feature.
I don't understand this, though: "Normally, any error you make will
throw an exception and cause your current transaction to be marked as
aborted. This is sane and expected behavior..."
No, it's really not. Oracle doesn't work this way, nor does MySQL. I
have no experience with MSSQL or DB2 but I'll bet a dollar each they
don't work this way either. There's no intuitive reason why a syntax
error, or any other error for that matter, should abort a transaction.
I can only assume there's either some limitation deep in the Postgres
guts that requires this behavior, or that it conforms to some obscure
part of the SQL standard that everyone else sensibly ignores. There's
certainly no API / UX reason why it should work this way.
We really shouldn't be too proud of any workarounds we've developed
for this pathological behavior. It's like IT Stockholm Syndrome.
Doesn't it even violate the definition of a transaction?
Transactions provide an "all-or-nothing" proposition, stating that
each work-unit performed in a database must either complete in its
entirety or have no effect whatsoever.
I agree with you. I think it's a mistake not to abort the whole tx. But people are used to that, so they think it's reasonable and correct. Like people who use MySQL think that the DBMS should accept 0000-00-00 as a date, or people using Oracle expect that '' IS NULL.
The idea that there's a clear distinction between a syntax error and something else is flawed.
If I write
BEGIN;
CREATE TABLE new_customers (...);
INSET INTO new_customers (...)
SELECT ... FROM customers;
DROP TABLE customers;
COMMIT;
I don't care that it's a typo resulting in a syntax error that caused me to lose my data. I care that the transaction didn't successfully execute all its statements but still committed.
It'd be technically feasible to allow soft rollback in PostgreSQL before any rows are actually written by a statement - probably before we even enter the executor. So failures in the parse and parameter binding phases could allow the tx not to be aborted. We have a statement memory context we could use to clean up.
However, once the statement starts changing rows, it's doing so on disk with the same transaction ID as the prior statements in the tx. So you can't roll it back without rolling back the whole tx. To allow statement rollback Pg needs to assign a new subtransaction ID. That costs resources. You can do it explicitly with SAVEPOINTs when you want to, and internally that's what psql is doing. In theory we could allow the server to do this implicitly for each statement to implement statement rollback, just at a performance cost. But I doubt any patch implementing this would get committed, at least not without a LOT of argument, because most of the PostgreSQL team are (IMO reasonably) not fond of "whoops, that broke but we'll continue anyway" transaction semantics.
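For illustration, a minimal sketch of the explicit SAVEPOINT approach mentioned above, reusing the table from the earlier transcript (this is essentially what psql does under the hood with ON_ERROR_ROLLBACK):
BEGIN;
INSERT INTO table (id, something) VALUES ('1', 'whatever');
SAVEPOINT before_second_insert;
INSERT INTO table (id, something) VALUES ('1', 'whatever');  -- fails: duplicate key
ROLLBACK TO SAVEPOINT before_second_insert;                  -- transaction usable again
INSERT INTO table (id, something) VALUES ('2', 'whatever');
COMMIT;                                                      -- rows 1 and 2 are kept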

Detect if data in Oracle table has changed

I'm looking for a best performance solution to detect if data in an Oracle table has changed. This will be used to kick start a calculation that uses lots of data from the same tables. It would be too expensive to poll the data to track changes. The changes happen rarely.
I have analyzed the following solutions to the problem.
ORA_ROWSCN: Too slow, it will do a full table scan.
Oracle Audit: Not possible to set up in my environment.
DBMS_ALERT: Writers are not able to signal.
Then I came up with the following simple idea: add a trigger to the tables that increments a sequence on insert, update, or delete. I know the sequence increment will survive a rollback, but I can afford some false positives. My calculation service then polls the sequence's current value to detect possible changes (= a very cheap query).
CREATE OR REPLACE TRIGGER trg_change_tracker
  BEFORE INSERT OR UPDATE OR DELETE ON mytable
DECLARE
  dummy NUMBER;
BEGIN
  SELECT seq_event_seqno.NEXTVAL INTO dummy FROM dual;
END;
/
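For the polling side, the check could be as cheap as this (a sketch; it assumes the sequence is created NOCACHE, since with a cached sequence user_sequences.last_number only tracks NEXTVAL approximately):
SELECT last_number
  FROM user_sequences
 WHERE sequence_name = 'SEQ_EVENT_SEQNO';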
How does this sound? Any pitfalls?
EDIT: Yes, there is a major pitfall: the writer may hold back its commit, so the reader sees the new sequence value and queries for changes before the writer has committed.
I can suggest an addition to your trigger solution:
When your polling sees a change in the sequence, you can check for open transactions on the table of interest; if any exist, skip the current recalc.
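A sketch of such a check (it assumes SELECT privileges on v$transaction, v$locked_object, and dba_objects, and reuses the mytable name from the question):
SELECT COUNT(*) AS open_tx_on_table
  FROM v$transaction t
  JOIN v$locked_object lo
    ON  lo.xidusn  = t.xidusn
   AND lo.xidslot = t.xidslot
   AND lo.xidsqn  = t.xidsqn
  JOIN dba_objects o
    ON o.object_id = lo.object_id
 WHERE o.object_name = 'MYTABLE';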
UPD:
Also, you can save the min SCN of the open transactions, so in the case of frequent table changes you will not freeze.
UPD2:
This is a heuristic improvement, not a full solution. If you skip the recalc every time you see an open transaction, you could freeze for a long time in the case of frequent (and possibly long) DML on the table.
I mean: when the sequence has changed and the polling sees open transactions on your table, you can store min(start_scn) from v$transaction; when the polling next sees open transactions, compare the current min(start_scn) with the stored min(start_scn). If the current one is greater, there is a good chance it is time to recalc.
You can use (for a particular XTABLE):
SELECT * FROM dba_tab_modifications WHERE TABLE_NAME = 'XTABLE';
It will allow you (among other data) to see:
INSERTS    UPDATES    DELETES
 43,708      1,845          0
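One caveat: this view is only refreshed periodically, so when testing you may need to flush the monitoring info first (a sketch, assuming you have the privileges to do so):
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

SELECT inserts, updates, deletes, timestamp
  FROM dba_tab_modifications
 WHERE table_name = 'XTABLE';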
.. there are issues (bugs in some Oracle versions), but if you test thoroughly in your environment, you may find this useful.
.. as this question is pretty old, it would be nice to know how you implemented the change detection mechanism.

Oracle: difference between max(id)+1 and sequence.nextval

I am using Oracle.
What is the difference between creating an ID using max(id)+1 and using sequence.nextval? Where should each be used, and when?
Like:
insert into student (id, name) select max(id) + 1, 'abc' from student;
and
insert into student (id,name) values (SQ_STUDENT.nextval, 'abc');
SQ_STUDENT.nextval sometimes gives an error that there is a duplicate record...
Please help me with this doubt.
With the select max(id) + 1 approach, two sessions inserting simultaneously will see the same current max ID from the table, and both insert the same new ID value. The only way to use this safely is to lock the table before starting the transaction, which is painful and serialises the transactions. (And as Stijn points out, values can be reused if the highest record is deleted). Basically, never use this approach. (There may very occasionally be a compelling reason to do so, but I'm not sure I've ever seen one).
The sequence guarantees that the two sessions will get different values, and no serialisation is needed. It will perform better and be safer, easier to code and easier to maintain.
The only way you can get duplicate errors using the sequence is if records already exist in the table with IDs above the sequence value, or if something is still inserting records without using the sequence. So if you had an existing table with manually entered IDs, say 1 to 10, and you created a sequence with a default start-with value of 1, the first insert using the sequence would try to insert an ID of 1 - which already exists. After trying that 10 times the sequence would give you 11, which would work. If you then used the max-ID approach to do the next insert that would use 12, but the sequence would still be on 11 and would also give you 12 next time you called nextval.
The sequence and table are not related. The sequence is not automatically updated if a manually-generated ID value is inserted into the table, so the two approaches don't mix. (Among other things, the same sequence can be used to generate IDs for multiple tables, as mentioned in the docs).
If you're changing from a manual approach to a sequence approach, you need to make sure the sequence is created with a start-with value that is higher than all existing IDs in the table, and that everything that does an insert uses the sequence only in the future.
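A minimal sketch of that switch-over, reusing the student table and SQ_STUDENT sequence from the question (DDL from PL/SQL needs EXECUTE IMMEDIATE; run it while nothing else is inserting):
DECLARE
  v_start NUMBER;
BEGIN
  SELECT NVL(MAX(id), 0) + 1 INTO v_start FROM student;
  EXECUTE IMMEDIATE 'CREATE SEQUENCE sq_student START WITH ' || v_start;
END;
/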
Using a sequence works if you intend to have multiple users. Using a max does not.
If you do a max(id) + 1 and you allow multiple users, then multiple sessions that are both operating at the same time will regularly see the same max and, thus, will generate the same new key. Assuming you've configured your constraints correctly, that will generate an error that you'll have to handle. You'll handle it by retrying the INSERT which may fail again and again if other sessions block you before your session retries but that's a lot of extra code for every INSERT operation.
It will also serialize your code. If I insert a new row in my session and go off to lunch before I remember to commit (or my client application crashes before I can commit), every other user will be prevented from inserting a new row until I get back and commit or the DBA kills my session, forcing a reboot.
To add to the other answers, a couple of issues.
Your max(id)+1 syntax will also fail if there are no rows in the table already, so use:
Coalesce(Max(id),0) + 1
There's nothing wrong with this technique if you only have a single process that inserts into the table, as might be the case with a data warehouse load, and if max(id) is fast (which it probably is).
It also avoids the need for code to synchronise values between tables and sequences if you are moving or restoring data to a test system, for example.
You can extend this method to multirow insert by using:
Coalesce(max(id),0) + rownum
I expect that might serialise a parallel insert, though.
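For illustration, a sketch of the multirow variant, with staging_students as a hypothetical source table:
INSERT INTO student (id, name)
SELECT (SELECT COALESCE(MAX(id), 0) FROM student) + ROWNUM,
       s.name
  FROM staging_students s;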
Some techniques don't work well with these methods. They rely of course on being able to issue the select statement, so SQL*Loader might be ruled out. However SQL*Loader has support for this technique in general through the SEQUENCE parameter of the column specification: http://docs.oracle.com/cd/E11882_01/server.112/e22490/ldr_field_list.htm#i1008234
Assuming MAX(ID) is actually fast enough, wouldn't it be possible to:
First get MAX(ID)+1
Then get NEXTVAL
Compare those two and increase the sequence in case NEXTVAL is smaller than MAX(ID)+1
Use NEXTVAL in INSERT statement
In that case I would have a fully stable procedure, and manual inserts would also be allowed without worrying about updating the sequence.
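A sketch of that idea, reusing the student table and SQ_STUDENT sequence from the question (note it is still racy: a manual insert between the check and the INSERT can collide):
DECLARE
  v_max  NUMBER;
  v_next NUMBER;
BEGIN
  SELECT COALESCE(MAX(id), 0) + 1 INTO v_max FROM student;
  v_next := sq_student.NEXTVAL;
  WHILE v_next < v_max LOOP      -- burn values until the sequence catches up
    v_next := sq_student.NEXTVAL;
  END LOOP;
  INSERT INTO student (id, name) VALUES (v_next, 'abc');
END;
/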

Does the timing of COMMIT and ROLLBACK affect performance?

Suppose I have a set of IDs. For each ID, I will insert many records into many different tables based on that ID. Between inserts into the different tables, various business checks are called. If any check fails, all the records inserted for this ID will be rolled back. This bulk insert is done in PL/SQL. Does the timing of COMMIT and ROLLBACK affect performance, and how? For example, should I COMMIT after finishing the process for one ID, or COMMIT after finishing all IDs?
This is not so much of a performance decision but a process design decision. Do you want the other IDs to stay in the database when you have to roll back a faulty ID?
For obvious reasons, rollback takes longer when more rows must be rolled back. Rollback usually takes longer (sometimes much longer!) than the operations that have to be rolled back. Commit is always fast in Oracle, so it probably doesn't matter how often you commit in that regard.
Your problem description indicates you have a large set of smaller logical transactions (each new ID is a transaction). You should commit each logical transaction. The two reasons to wait to commit the entire set of transactions are:
If the entire set of transactions is in fact a transaction itself - all inserts must succeed for any rows to be committed. In that context, your smaller "transactions" aren't truly transactions.
You don't have a restart capability in your bulk load process, which in effect makes this a special case of item 1. If your bulk load process aborts, you need a way to skip successfully applied ID's.
Tom Kyte's advice is to commit each logical unit of work - the transaction.
Don't make the transaction take longer than necessary; keep it as short as you can. Your statements create locks, and those locks may cause performance issues... so do it ID by ID...
There are two "forces" at work....
locking
During your open transaction, Oracle puts locks on the changed rows.
Whenever another transaction needs to update any of the locked rows,
it has to wait.
In the worst case, you can even produce a deadlock.
synchronous write
Every commit performs a synchronous write.
(There are ways to disable that, but it is usually the thing everybody wants: integrity.)
That synchronous write can take (much) longer than a regular write (which can be buffered).
Not to forget that there is usually an additional network round trip involved with a commit.
So, one force says "commit as soon as possible (considering your integrity requirements)"; the other says "commit as seldom as possible".
There are some other issues to consider as well, e.g. the maximum transaction size. Every uncommitted transaction needs some temporary (undo) space; the bigger the transaction gets, the more you need. You can also run into ORA-01555 "snapshot too old".
If there is any advice to give, then it is to implement a configurable "commit frequency" so that you can easily change it as needed.
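A minimal sketch of such a configurable commit frequency (ids_to_process and process_one_id are hypothetical placeholders for the driving set and the per-ID inserts plus business checks):
DECLARE
  c_commit_every CONSTANT PLS_INTEGER := 100;  -- make this a configuration parameter
  v_count        PLS_INTEGER := 0;
BEGIN
  FOR r IN (SELECT id FROM ids_to_process) LOOP
    process_one_id(r.id);                      -- hypothetical: inserts + checks, raises on failure
    v_count := v_count + 1;
    IF MOD(v_count, c_commit_every) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;                                      -- commit the remainder
END;
/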
One option, if you need to control the individual sets but retain the ability to commit or roll back the entire transaction, is to use savepoints. You can set a savepoint at the beginning of the outermost loop, then roll back to it if an error occurs. You might end up with something like this:
begin
  --Initial batch logging
  for r_record in cur_cursor loop
    savepoint s_cursor;
    begin
      --Process rows
      null;
    exception
      when others then
        rollback to s_cursor;
    end;
  end loop;
  --Final batch logging
exception
  when others then
    rollback;
    raise;
end;
