I'm trying to see whether I can use database lock to deal with race conditions. For example
CREATE TABLE ORDER
(
T1_ID NUMBER PRIMARY KEY,
AMT NUMBER,
STATUS1 CHAR(1),
STATUS2 CHAR(1),
UPDATED_BY VARCHAR(25)
);
insert into order values (order_seq.nextval, 1, 'N', 'N', 'U0');
Later two users can update the order record at the same time. Requirement is that only one can proceed while the other should NOT. We can certainly use a distributed lock manager (DLM) to do this but I figure database lock may be more efficient.
User 1:
update T1 set status1='Y', updated_by='U1' where status1='N';
User 2:
update T1 set status2='Y', updated_by='U2' where status1='N';
Two users are doing these at the same time. Ideally only one should be allowed to proceed. I played using Sql Plus and also wrote a little java test program letting two threads do these simultaneously. I got the same result. Let's say User 1 got the DB row lock first. It returns 1 row updated. The second session will be blocked waiting for the row lock before the 1st session commits or rollbacks. The question is REALLY this:
Update with a where clause seems like two operations: first it will do an implicit select based on the where clause to pick the row that will be updated. Since Oracle only supports READ COMMITTED isolation level, I expect both UPDATE statements will pick the single record in the DB. As a result, I expected both UPDATE statement will eventually return "1 row updated" although one will wait till the other transaction commits. HOWEVER that's not what I saw. The second UPDATE returns "0 row updated" after the first commits. I feel that Oracle actually runs the where clause AGAIN after the first session commits, which results in "0 row updated" result.
This is strange to me. I thought I would run into the classical "lost update" phenomenon.
can somebody please explain what's going on here? Thanks very much!
Related
I have one table name as user_count_details. There are total 3 columns in this table.
msisdn=Which uniquely defines row for one specific user
user_count= Which stores the count of user.
last_Txn_id= Which stores the last transfer id of txn which user has performed.
The user_count column of this table user_count_details is gets updated with every transaction performed by the user.
But here the logic of my system is
select sum(user_count ) from user_count_details
will always gives us the 0 and it is considered as the system is in stable state and everything is fine.
Now i want to write trigger which will check first when new request to update user_count come ,will hamper the sum(user_count )=0 or not and if it hampers that msisdn details will be captured in separate update table.
Based on your last comments, check if this works. Replace the other_table_name as per your scenario.
CREATE TRIGGER trgCheck_user_sum
BEFORE INSERT
ON user_count_details FOR EACH ROW
BEGIN
IF (select sum(user_count) from user_count_details) > 0 THEN
insert into other_table_name(msisdn) values(new.msisdn)
END IF
END
We have simple case, We have a table with column emailId i.e. unique.....using oracle DB
Question#1
Multiple concurrent user can check if some email id is available or not. Like 2 user that same time check availability of: abc#test.com
session1: select emailid from user_table;
//If not present allow user to complete rest of the process & insert info
session2: select emailid from user_table;
Now both session will get that this email id (abc#test.com) is available & both try to insert, I know one of them will get error upon insertion BUT how we can make sure only 1 user get availability & other get not available upon select ??
Question#2
Also in case both sessions inserted the same value, then first will succeed, is there ways that 2nd session update that row instead of throwing error. Like we have another column for timestamp & want that 2nd session instead of throwing error simple update the timestamp column ?
As this is a rather abstract question, here are only some general guidelines:
To deal with concurrent insert in a table, you need an unique index, and be prepared in your code to deal with ORA-00001 error unique constraint violated. Never rely only on check before insert(unless you have somehow exclusive access to your table -- and even if so ... as of myself, I would add an unique index: doesn't cost much and make me sleep better)
Oracle has a MERGE statement that allows you update or insert based on a condition. This operation is sometimes called an upsert. By using that keywork you should be able to find more informationsSee Oracle: how to UPSERT (update or insert into a table?) for example.
Now for, some thoughts about you specific case (maybe):
The only way for the system to work as you suggested, would be to make some kind of reservation when you check for availability (i.e.: immediately insert the row, instead of just select). And then update the row when the user confirm. But that means: (1) you will have to somehow deal with never-confirmed reservations (2) that doesn't dispense you to have an unique index, and to deal with ORA-00001.
I am using Oracle
What is difference when we create ID using max(id)+1 and using sequance.nexval,where to use and when?
Like:
insert into student (id,name) values (select max(id)+1 from student, 'abc');
and
insert into student (id,name) values (SQ_STUDENT.nextval, 'abc');
SQ_STUDENT.nextval sometime gives error that duplicate record...
please help me on this doubt
With the select max(id) + 1 approach, two sessions inserting simultaneously will see the same current max ID from the table, and both insert the same new ID value. The only way to use this safely is to lock the table before starting the transaction, which is painful and serialises the transactions. (And as Stijn points out, values can be reused if the highest record is deleted). Basically, never use this approach. (There may very occasionally be a compelling reason to do so, but I'm not sure I've ever seen one).
The sequence guarantees that the two sessions will get different values, and no serialisation is needed. It will perform better and be safer, easier to code and easier to maintain.
The only way you can get duplicate errors using the sequence is if records already exist in the table with IDs above the sequence value, or if something is still inserting records without using the sequence. So if you had an existing table with manually entered IDs, say 1 to 10, and you created a sequence with a default start-with value of 1, the first insert using the sequence would try to insert an ID of 1 - which already exists. After trying that 10 times the sequence would give you 11, which would work. If you then used the max-ID approach to do the next insert that would use 12, but the sequence would still be on 11 and would also give you 12 next time you called nextval.
The sequence and table are not related. The sequence is not automatically updated if a manually-generated ID value is inserted into the table, so the two approaches don't mix. (Among other things, the same sequence can be used to generate IDs for multiple tables, as mentioned in the docs).
If you're changing from a manual approach to a sequence approach, you need to make sure the sequence is created with a start-with value that is higher than all existing IDs in the table, and that everything that does an insert uses the sequence only in the future.
Using a sequence works if you intend to have multiple users. Using a max does not.
If you do a max(id) + 1 and you allow multiple users, then multiple sessions that are both operating at the same time will regularly see the same max and, thus, will generate the same new key. Assuming you've configured your constraints correctly, that will generate an error that you'll have to handle. You'll handle it by retrying the INSERT which may fail again and again if other sessions block you before your session retries but that's a lot of extra code for every INSERT operation.
It will also serialize your code. If I insert a new row in my session and go off to lunch before I remember to commit (or my client application crashes before I can commit), every other user will be prevented from inserting a new row until I get back and commit or the DBA kills my session, forcing a reboot.
To add to the other answers, a couple of issues.
Your max(id)+1 syntax will also fail if there are no rows in the table already, so use:
Coalesce(Max(id),0) + 1
There's nothing wrong with this technique if you only have a single process that inserts into the table, as might be the case with a data warehouse load, and if max(id) is fast (which it probably is).
It also avoids the need for code to synchronise values between tables and sequences if you are moving restoring data to a test system, for example.
You can extend this method to multirow insert by using:
Coalesce(max(id),0) + rownum
I expect that might serialise a parallel insert, though.
Some techniques don't work well with these methods. They rely of course on being able to issue the select statement, so SQL*Loader might be ruled out. However SQL*Loader has support for this technique in general through the SEQUENCE parameter of the column specification: http://docs.oracle.com/cd/E11882_01/server.112/e22490/ldr_field_list.htm#i1008234
Assuming MAX(ID) is actually fast enough, wouldn't it be possible to:
First get MAX(ID)+1
Then get NEXTVAL
Compare those two and increase sequence in case NEXTVAL is smaller then MAX(ID)+1
Use NEXTVAL in INSERT statement
In that case I would have a fully stable procedure and manual inserts would also be allowed without worrying about updating the sequence
I have table with "varchar2" as primary key.
It has about 1 000 000 Transactions per day.
My app wakes up every 5 minute to generate text file by querying only new record.
It will remember last point and process only new records.
Do you have idea how to query with good performance?
I am able to add new column if necessary.
What do you think this process should do by?
plsql?
java?
Everyone here is really really close. However:
Scott Bailey's wrong about using a bitmap index if the table's under any sort of continuous DML load. That's exactly the wrong time to use a bitmap index.
Everyone else's answer about the PROCESSED CHAR(1) check in ('Y','N')column is right, but missing how to index it; you should use a function-based index like this:
CREATE INDEX MY_UNPROCESSED_ROWS_IDX ON MY_TABLE
(CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END);
You'd then query it using the same expression:
SELECT * FROM MY_TABLE
WHERE (CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END) = 'N';
The reason to use the function-based index is that Oracle doesn't write index entries for entirely NULL values being indexed, so the function-based index above will only contain the rows with PROCESSED_FLAG = 'N'. As you update your rows to PROCESSED_FLAG = 'Y', they'll "fall out" of the index.
Well, if you can add a new column, you could create a Processed column, which will indicate processed records, and create an index on this column for performance.
Then the query should only be for those rows that have been newly added, and not processed.
This should be easily done using sql queries.
Ah, I really hate to add another answer when the others have come so close to nailing it. But
As Ponies points out, Oracle does have a hidden column (ORA_ROWSCN - System Change Number) that can pinpoint when each row was modified. Unfortunately, the default is that it gets the information from the block instead of storing it with each row and changing that behavior will require you to rebuild a really large table. So while this answer is good for quieting the SQL Server fella, I'd not recommend it.
Astander is right there but needs a few caveats. Add a new column needs_processed CHAR(1) DEFAULT 'Y' and add a BITMAP index. For low cardinality columns ('Y'/'N') the bitmap index will be faster. Once you have the rest is pretty easy. But you've got to be careful not select the new rows, process them and mark them as processed in one step. Otherwise, rows could be inserted while you are processing that will get marked processed even though they have not been.
The easiest way would be to use pl/sql to open a cursor that selects unprocessed rows, processes them and then updates the row as processed. If you have an aversion to walking cursors, you could collect the pk's or rowids into a nested table, process them and then update using the nested table.
In MS SQL Server world where I work, we have a 'version' column of type 'timestamp' on our tables.
So, to answer #1, I would add a new column.
To answer #2, I would do it in plsql for performance.
Mark
"astander" pretty much did the work for you. You need to ALTER your table to add one more column (lets say PROCESSED)..
You can also consider creating an INDEX on the PROCESSED ( a bitmap index may be of some advantage, as the possible value can be only 'y' and 'n', but test it out ) so that when you query it will use INDEX.
Also if sure, you query only for every 5 mins, check whether you can add another column with TIMESTAMP type and partition the table with it. ( not sure, check out again ).
I would also think about writing job or some thing and write using UTL_FILE and show it front end if it can be.
If performance is really a problem and you want to create your file asynchronously, you might want to use Oracle Streams, which will actually get modification data from your redo log withou affecting performance of the main database. You may not even need a separate job, as you can configure Oracle Streams to do Asynchronous replication of the changes, through which you can trigger the file creation.
Why not create an extra table that holds two columns. The ID column and a processed flag column. Have an insert trigger on the original table place it's ID in this new table. Your logging process can than select records from this new table and mark them as processed. Finally delete the processed records from this table.
I'm pretty much in agreement with Adam's answer. But I'd want to do some serious testing compared to an alternative.
The issue I see is that you need to not only select the rows, but also do an update of those rows. While that should be pretty fast, I'd like to avoid the update. And avoid having any large transactions hanging around (see below).
The alternative would be to add CREATE_DATE date default sysdate. Index that. And then select records where create_date >= (start date/time of your previous select).
But I don't have enough data on the relative costs of setting a sysdate as default vs. setting a value of Y, updating the function based vs. date index, and doing a range select on the date vs. a specific select on a single value for the Y. You'll probably want to preserve stats or hint the query to use the index on the Y/N column, and definitely want to use a hint on a date column -- the stats on the date column will almost certainly be old.
If data are also being added to the table continuously, including during the period when your query is running, you need to watch out for transaction control. After all, you don't want to read 100,000 records that have the flag = Y, then do your update on 120,000, including the 20,000 that arrived when you query was running.
In the flag case, there are two easy ways: SET TRANSACTION before your select and commit after your update, or start by doing an update from Y to Q, then do your select for those that are Q, and then update to N. Oracle's read consistency is wonderful but needs to be handled with care.
For the date column version, if you don't mind a risk of processing a few rows more than once, just update your table that has the last processed date/time immediately before you do your select.
If there's not much information in the table, consider making it Index Organized.
What about using Materialized view logs? You have a lot of options to play with:
SQL> create table test (id_test number primary key, dummy varchar2(1000));
Table created
SQL> create materialized view log on test;
Materialized view log created
SQL> insert into test values (1, 'hello');
1 row inserted
SQL> insert into test values (2, 'bye');
1 row inserted
SQL> select * from mlog$_test;
ID_TEST SNAPTIME$$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------------
1 01/01/4000 I N FE
2 01/01/4000 I N FE
SQL> delete from mlog$_test where id_test in (1,2);
2 rows deleted
SQL> insert into test values (3, 'hello');
1 row inserted
SQL> insert into test values (4, 'bye');
1 row inserted
SQL> select * from mlog$_test;
ID_TEST SNAPTIME$$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
3 01/01/4000 I N FE
4 01/01/4000 I N FE
I think this solution should work..
What you need to do following steps
For the first run, you will have to copy all records. In first run you need to execute following query
insert into new_table(max_rowid) as (Select max(rowid) from yourtable);
Now next time when you want to get only newly inserted values, you can do it by executing follwing command
Select * from yourtable where rowid > (select max_rowid from new_table);
Once you are done with processing above query, simply truncate new_table and insert max(rowid) from yourtable
I think this should work and would be fastest solution;
I tried to put it in a sentence but it is better to give an example:
SELECT * FROM someTable WHERE id = someID;
returns no rows
...
some time passes (no inserts are done to the table and no ID updates)
...
SELECT * FROM someTable WHERE id = someID;
returns one row!
Is it possible that some DB mechanism prevents first SELECT to return row?
Oracle log has no errors.
No transactions are rolled back when two selects are executed.
You can't see uncommitted data in another session. When did the commit happen?
EDIT1: Are you the only one using this database? Or did/do you have multiple sessions?
I think in another session you or someone else has inserted this row, you do your select and you don't see this row. After that a commit happens in the other session (maybe implicit because a session is closed) and then you see this row when you select again.
I can think of other explanations, but I first want to know are you only one using this database.
With read consistency as provided by Oracle, you should not see a row appear like that. If you are running in some mode with automatic commits, so that each statement is a self-contained transaction, then read consistency is not being violated. Which program are you using to access the database? I agree with the other observations; the row should not appear if your session is not inserting it and no other session is active at the same time. I don't know of a DBMS that indulges in spontaneous data generation.
Don't you have scheduled jobs in that Oracle?