Oracle DB links and retrieving stale data - oracle

I have 2 databases, DBa and DBb, and 2 record sets, RecordsA and RecordsB. In our app you can add records from A to B. My problem is that when I add a record from A to B and then query the records again, one property on the added record is stale or incorrect.
RecordsA lives on DBa and RecordsB lives on DBb. I call a stored proc on DBa to add the record to the B side and modify a column's value; DBa performs the insert/update on DBb through a dblink. The problem is that when I do an insert/update followed by an immediate get call on DBa (which calls DBb), the modified property is incorrect: it's null, as if the insert never went through. However, if I put a breakpoint before the get call and wait about 1 second, the correct data is returned. This makes me wonder whether there are latency issues with dblinks.
This seems like an async issue but we verified no async calls are being made and everything is running on the same thread. Would this type of behavior be likely with a db link? As in, inserting/updating a record on a remote server and retrieving it right away causing some latency where the record wasn't quite updated at the time of the re-pull?

Related

Cache and update regularly complex data

Let's start with some background. I have an API endpoint that I have to query every 15 minutes and that returns complex data. Unfortunately, this endpoint does not tell me what exactly changed, so I have to compare everything against the data I have in the db and then execute the corresponding updates, inserts or deletes. This is pretty tedious...
I came up with the idea of simply removing all data from certain tables and rebuilding everything from scratch. But I also have to return this cached data to my clients, so the db might be empty during a client request because it is "refreshing/rebuilding". That can't happen, because I always have to return something.
So I came up with two ideas:
Lock the certain db tables so that the client will have to wait for the "refreshing the db"
or
CQRS https://martinfowler.com/bliki/CQRS.html
Do you have any suggestions how to solve the problem?
It sounds like you're using a relational database, so I'll try to outline a solution using database terms. The idea, however, is more general than that. In general, it's similar to Blue-Green deployment.
Have two data tables (or two databases, for that matter); one is active, and one is inactive.
When the software starts the update process, it can wipe the inactive table and write new data into it. During this process, the system keeps serving data from the active table.
Once the data update is entirely done, the system can begin to serve data from the previously inactive table. In other words, the inactive table becomes the active table, and vice versa.

What will happen when inserting a row during a long running query

I am writing some data loading code that pulls data from a large, slow table in an oracle database. I have read-only access to the data, and do not have the ability to change indexes or affect the speed of the query in any way.
My select statement takes 5 minutes to execute and returns around 300,000 rows. The system is inserting large batches of new records constantly, and I need to make sure I get every last one, so I need to save a timestamp for the last time I downloaded the data.
My question is: If my select statement is running for 5 minutes, and new rows get inserted while the select is running, will I receive the new rows or not in the query result?
My gut tells me that the answer is 'no', especially since a large portion of those 5 minutes is just the time spent on the data transfer from the database to the local environment, but I can't find any direct documentation on the scenario.
"If my select statement is running for 5 minutes, and new rows get inserted while the select is running, will I receive the new rows or not in the query result?"
No. Oracle enforces strict isolation levels and does not permit dirty reads.
The default isolation level is Read Committed. This means the result set you get after five minutes will be identical to the one you would have got if Oracle could have delivered you all the records in 0.0000001 seconds. Anything committed after your query started running will not be included in the results. That includes updates to the records as well as inserts.
Oracle does this by tracking changes to the table in the UNDO tablespace. Provided it can reconstruct the original image from that data, your query will run to completion; if for any reason the undo information is overwritten, your query will fail with the dreaded ORA-1555: Snapshot too old. That's right: Oracle would rather hurl an exception than provide us with an inconsistent result set.
Note that this consistency applies at the statement level. If we run the same query twice within the one transaction we may see two different result sets. If that is a problem (I think not in your case) we need to switch from Read Committed to Serializable isolation.
The Concepts Manual covers Concurrency and Consistency in great depth. Find out more.
So to answer your question, take the timestamp from the time you start the select. Specifically, take the max(created_ts) from the table before you kick off the query. This should protect you from the gap Alex mentions (if records are not committed the moment they are inserted, there is the potential to lose records if you base the select on comparing with the system timestamp). Although doing this means you're issuing two queries in the same transaction, which means you do need Serializable isolation after all!
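As a rough illustration of the "take max(created_ts) first" advice, here is a small sketch. It uses Python's sqlite3 module purely as a stand-in for Oracle, so it says nothing about Oracle's isolation behaviour. The events table, its integer timestamps and the load_new_rows function are assumptions made for the example; only the created_ts column name comes from the answer.

```python
import sqlite3

def load_new_rows(conn, last_seen):
    """Return (rows, new_watermark) for rows created after last_seen."""
    # Read the high-water mark BEFORE starting the long-running select.
    high_water = conn.execute(
        "SELECT MAX(created_ts) FROM events"
    ).fetchone()[0]
    if high_water is None or high_water <= last_seen:
        return [], last_seen  # nothing new since the previous load
    # Fetch only the window (last_seen, high_water]; the upper bound keeps
    # the result consistent with the watermark that gets saved.
    rows = conn.execute(
        "SELECT id, created_ts FROM events "
        "WHERE created_ts > ? AND created_ts <= ? ORDER BY id",
        (last_seen, high_water),
    ).fetchall()
    return rows, high_water
```

The caller stores the returned watermark and passes it back in as last_seen on the next run.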

Query returning single record taking too much time in a EJB-Hibernate Application along with Oracle DB

I am working on an EJB (3.0) and Hibernate (3) project with an Oracle 11g DB.
First of all, for security reasons I am unable to share my code; I am really sorry for that.
The issue is:
The application calls the DB from many different places to retrieve, persist, and merge records across a number of tables.
But one particular retrieve query (a select that fetches a single record by primary key in the where clause) takes far too long, almost 4 minutes, to get a response from the DB (the response itself is correct, a single record).
I measured this by debugging from the calling point in the application to the point where the response comes back from the DB.
So I want to know why fetching a single record takes so long when other queries return within seconds or microseconds.
I also want to know how to capture the timestamp at which the query request actually reaches the database after passing through the Hibernate layer, and what happens inside the DB during this flow.
Please share any advice or suggestions from your own experience if you have faced this kind of issue, and help me understand how to track the whole flow:
Application <-> Hibernate Layer <-> Database
Thanks in advance!!!

Parameterized trigger - concurrency concerns

My question is quite similar to this one but I need more guidance. I also read the Oracle context doc.
The current (test) trigger is :
CREATE OR REPLACE TRIGGER CHASSIS_DT_EVNT_AIUR_TRG_OLD
AFTER DELETE OR INSERT OR UPDATE OF ETA
ON CHASSITRANSPORTS
REFERENCING NEW AS New OLD AS Old
FOR EACH ROW
DECLARE
BEGIN
  -- Log the previous ETA; 'xyz' is a placeholder for the application user.
  INSERT INTO TS_CHASSIS_DATE_EVENTS
    (CHASSISNUMBER, DATETYPE, TRANSPORTLEGSORTORDER, OLDDATE,
     CREATEDBY, CREATEDDATE, UPDATEDBY, UPDATEDDATE)
  VALUES
    (:old.chassino, 'ETA', :old.sortorder, :old.eta,
     'xyz', sysdate, 'xyz', sysdate);
EXCEPTION
  WHEN OTHERS THEN
    NULL;
END;
The 'CREATEDBY' and 'UPDATEDBY' values should be the web application user who logged in and made the change that fired the trigger, so these values need to be passed in from the application.
The web application :
Is deployed in Websphere Application Server where the datasources are configured
As expected, is using db connection pooling
My question is: which of the approaches mentioned in the thread and the doc should I take to avoid concurrency issues, i.e. so that updates made by application users in multiple sessions do not interfere with each other at either the application level or the db level?
I don't think any one of the approaches in that link would apply to you, primarily due to multi-user environment and connection pooling.
Connection pooling by nature lets different application users reuse the same database session. A context (either sys_context or any other application context) stays set for the lifetime of the session, so two users served by the same pooled session can overwrite each other's values and read each other's values (concurrency issues).
I'd actually argue against doing an insert like this inside a trigger at all. It seems to me the insert exists to write every update on the main table to a log table. If that is the case, why not insert into the log table at the time you make the update to this table?
So the procedure that does UPDATE CHASSITRANSPORTS ... would also have another INSERT statement inside it that writes to the other table. If there is no procedure and it is a direct update statement from the application, then write a procedure for this.
You could say that the same update happens in multiple places. In that scenario I'd suggest creating an API for the base table CHASSITRANSPORTS that handles updates and, behind a black box, also writes to the log table. Any place where you need to update that table column, you'd use that API.
(I'm ignoring the fact that you are suppressing all errors in the trigger with WHEN OTHERS THEN NULL with the hope that this is probably just a small example)
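To make the "API in front of the base table" suggestion concrete, here is a minimal sketch. In Oracle this would more naturally be a PL/SQL procedure; the version below uses Python's sqlite3 module only as a runnable stand-in. The function names update_eta and create_schema and the simplified column types are assumptions for illustration; the table and column names follow the trigger in the question.

```python
import sqlite3

def create_schema(conn):
    # Simplified stand-ins for the two tables named in the question.
    conn.executescript("""
        CREATE TABLE chassitransports (
            chassino TEXT, sortorder INTEGER, eta TEXT);
        CREATE TABLE ts_chassis_date_events (
            chassisnumber TEXT, datetype TEXT, transportlegsortorder INTEGER,
            olddate TEXT, createdby TEXT, createddate TEXT,
            updatedby TEXT, updateddate TEXT);
    """)

def update_eta(conn, chassino, sortorder, new_eta, app_user):
    """Update ETA and write the log row in one transaction."""
    key = (chassino, sortorder)
    with conn:  # both statements commit together or roll back together
        row = conn.execute(
            "SELECT eta FROM chassitransports "
            "WHERE chassino = ? AND sortorder = ?", key
        ).fetchone()
        if row is None:
            raise LookupError(f"no transport leg {key}")
        # The application user is passed in explicitly, so nothing depends
        # on session state that a pooled connection might share.
        conn.execute(
            "INSERT INTO ts_chassis_date_events "
            "(chassisnumber, datetype, transportlegsortorder, olddate, "
            " createdby, createddate, updatedby, updateddate) "
            "VALUES (?, 'ETA', ?, ?, ?, datetime('now'), ?, datetime('now'))",
            (chassino, sortorder, row[0], app_user, app_user),
        )
        conn.execute(
            "UPDATE chassitransports SET eta = ? "
            "WHERE chassino = ? AND sortorder = ?",
            (new_eta, chassino, sortorder),
        )
```

Every caller that changes ETA goes through update_eta, so the log table stays complete without a trigger and without any per-session context.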

Referencing object's identity before submitting changes in LINQ

Is there a way of knowing the ID of the identity column of a record inserted via InsertOnSubmit beforehand, i.e. before calling the DataContext's SubmitChanges?
Imagine I'm populating some kind of hierarchy in the database, but I wouldn't want to submit changes on each recursive call of each child node (e.g. if I had Directories table and Files table and am recreating my filesystem structure in the database).
I'd like to create a Directory object, set its name and attributes, InsertOnSubmit it into the DataContext.Directories collection, and then reference Directory.ID in its child Files. Currently I need to call SubmitChanges so that the 'directory' is inserted into the database and the mapping fills in its ID column. But this creates a lot of transactions and database round trips, and I imagine that performance would be better if I did the inserts in a batch.
What I'd like to do is somehow use Directory.ID before committing changes, create all my File and Directory objects in advance, and then do one big submit that puts everything into the database. I'm also open to solving this problem via a stored procedure; I assume the performance would be even better if all operations were done directly in the database.
One way to get around this is to not use an identity column. Instead build an IdService that you can use in the code to get a new Id each time a Directory object is created.
You can implement the IdService by having a table that stores the last id used. When the service starts up have it grab that number. The service can then increment away while Directory objects are created and then update the table with the new last id used at the end of the run.
Alternatively, and a bit safer, when the service starts up have it grab the last id used and then update the last id used in the table by adding 1000 (for example). Then let it increment away. If it uses 1000 ids then have it grab the next 1000 and update the last id used table. Worst case is you waste some ids, but if you use a bigint you aren't ever going to care.
Since the Directory id is now controlled in code you can use it with child objects like Files prior to writing to the database.
Simply putting a lock around id acquisition makes this safe to use across multiple threads. I've been using this in a situation like yours. We're generating a ton of objects in memory across multiple threads and saving them in batches.
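The block-reservation variant of the IdService described above could be sketched as follows. The answer's context is .NET; this sketch uses Python only to show the logic. The class names, the reserve method and the in-memory stand-in for the "last id used" table are all assumptions for illustration.

```python
import threading

class InMemoryLastIdTable:
    """Stand-in for the table that stores the last id used."""

    def __init__(self, last_id=0):
        self.last_id = last_id

    def reserve(self, count):
        # With a real database this would be an atomic read-and-update.
        first = self.last_id + 1
        self.last_id += count
        return first

class IdService:
    def __init__(self, table, block_size=1000):
        self._table = table
        self._block_size = block_size
        self._lock = threading.Lock()
        self._next = 0
        self._limit = 0  # exclusive bound; empty block forces a reserve

    def next_id(self):
        # The lock makes id acquisition safe across threads.
        with self._lock:
            if self._next >= self._limit:
                # Block used up: reserve another one from the table.
                self._next = self._table.reserve(self._block_size)
                self._limit = self._next + self._block_size
            value = self._next
            self._next += 1
            return value
```

A Directory object can then take its id from next_id() as soon as it is created, and child File objects can reference that id before anything is written to the database. Ids left over in a block when the process stops are simply wasted, as the answer notes.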
This blog post will give you a good start on saving batches in Linq to SQL.
I'm not sure off the top of my head whether there is a way to run a straight SQL query in LINQ, but this query will return the current identity value of the specified table.
USE [database];
GO
DBCC CHECKIDENT ("schema.table", NORESEED);
GO

Resources