We have a daily batch job executing an Oracle PL/SQL function. Actually, the Quartz scheduler invokes a Java program, which makes a call to the Oracle PL/SQL function. This PL/SQL function deletes data older than 6 months from 4 tables and then commits the transaction.
This batch job was running successfully in the test environment but started failing two weeks ago, when new data was loaded into the tables (the code is supposed to go into production this week). Earlier the number of rows in each table was no more than 0.1 million, but now it is 1 million in three tables and 2.4 million in the fourth.
After running for 3 hours, we get an error in Java (written in the log file): "...Connection reset; nested exception is java.sql.SQLException: Io exception: Connection reset....". When the row counts on the tables were checked, it was clear that no records had been deleted from any of the tables.
Is it possible in an Oracle database for the PL/SQL procedure/function to be automatically terminated/killed when the connection times out and the invoking session is no longer active?
Thanks in advance,
Pradeep.
The PL/SQL won't be terminated for being inactive, because by definition it isn't inactive: it is still doing work. It just won't generate any network traffic back to your client while it runs.
It appears something at the network level is causing the connection to be terminated. This could be a listener timeout, a firewall timeout, or something else. If it's consistently after three hours then it will almost certainly be a timeout configured somewhere rather than a network glitch, which would be more random (and possibly recoverable).
When the network connection is interrupted, Oracle will notice at some point and terminate the session. That will cause the PL/SQL call to be terminated, and that will cause any work it has done to be rolled back, which may take a while.
3 hours seems a long time for your deletes, though, even for a few million records. Perhaps you're deleting inefficiently, with row-by-row deletes within your procedure. Which doesn't really help you, of course. It might be worth pointing out that your production environment might not have whatever setting is killing your connection, or might have a shorter timeout, so even reducing the runtime might not make it bullet-proof in live. You probably need to find the source of the timeout and check the equivalent in the live environment to try to pre-empt similar problems there.
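If the procedure is doing slow row-by-row deletes, one option is to delete in bounded chunks and commit after each chunk, which keeps undo small and shortens the exposure to whatever is killing the connection. This is only a minimal sketch under assumptions: a hypothetical table my_table with a created_date column and placeholder connection details, driven from JDBC for illustration (the same loop could live inside the PL/SQL procedure instead).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ChunkedDelete {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details, not the real environment.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection con = DriverManager.getConnection(url, "user", "password")) {
            con.setAutoCommit(false);
            // my_table / created_date are hypothetical; cap each pass at 10,000 rows.
            String sql = "DELETE FROM my_table "
                       + "WHERE created_date < ADD_MONTHS(SYSDATE, -6) "
                       + "AND ROWNUM <= 10000";
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                int deleted;
                do {
                    deleted = ps.executeUpdate(); // delete one chunk
                    con.commit();                 // commit per chunk to keep undo small
                } while (deleted > 0);            // repeat until no rows older than 6 months remain
            }
        }
    }
}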
Related
I am running into ORA-01555: snapshot too old errors with Oracle 9i but am not running any updates with this application at all.
The error occurs after the application has been connected for some hours without running any queries; then every query (which would otherwise be a subsecond query) comes back with an ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6$" too small.
Could this be caused by the transaction isolation being set to TRANSACTION_SERIALIZABLE? Or some other bug in the JDBC code? It could be a bug in the JDBC driver, but everything I've read about this error has led me to believe it would not occur in scenarios where no DML statements are made.
Read below a very good insight into this error from Tom Kyte. The problem in your case may come from what is called 'delayed block cleanout', a case where SELECTs generate redo. However, the root cause is almost surely improperly sized rollback segments (though Tom adds correlated causes: committing too frequently, a very large read after many updates, etc.).
snapshot too old error (Tom Kyte)
When you run a query on an Oracle database the result will be what Oracle calls a "Read consistent snapshot".
What it means is that all the data items in the result will be represented with their values as of the time the query was started.
To achieve this the DBMS looks into the rollback segments to get the original value of items which have been updated since the start of the query.
The DBMS uses the rollback segment in a circular way and will eventually wrap around - overwriting the old data.
If your query needs data that is no longer available in the rollback segment you will get "snapshot too old".
This can happen if your query is running for a long time on data being concurrently updated.
You can prevent it by either extending your rollback segments or avoid running the query concurrently with heavy updaters.
I also believe newer versions of Oracle provide better dynamic management of rollback segments than Oracle 9i does.
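Since the question also raises TRANSACTION_SERIALIZABLE: in that mode an Oracle session reads data as of the start of its transaction, so a transaction opened hours earlier and left idle can hit ORA-01555 on its next query even though it issued no DML. A minimal sketch, assuming the application can reset its connections before reuse (this connection handling is an illustration, not the poster's actual code):

import java.sql.Connection;
import java.sql.SQLException;

public class ConnectionHygiene {
    // Call this before reusing a connection that has sat idle for hours.
    public static void prepareForQuery(Connection con) throws SQLException {
        if (!con.getAutoCommit()) {
            con.commit(); // ends the transaction whose read snapshot was taken hours ago
        }
        if (con.getTransactionIsolation() == Connection.TRANSACTION_SERIALIZABLE) {
            // A serializable session reads as of the start of its transaction,
            // so very old undo is needed if that transaction is left open too long.
            con.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        }
    }
}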
I have a PL/pgSQL function which moves data from a staging table to our target table. The process executes every night. Sometimes, due to a server restart or maintenance issues, we have to run the process manually.
The problem I am facing: whenever we start the process manually after 7 AM, it takes almost 2 hours to complete (reading from the staging table and inserting into the target table). But whenever it executes as per schedule, i.e. before 7 AM, it takes 22-25 minutes on average.
What could be the issue? If required, I can share my function snippet here.
The typical reason would be general concurrent activity in the database, which competes for the same resources as your function and may cause lock contention. Check your DB log for activities starting around 7 a.m.
The Postgres Wiki on lock monitoring
A function always runs as a single transaction. Locks are acquired along the way and only released at the end of a transaction. This makes long running functions particularly vulnerable to lock contention.
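As a starting point, the waiting sessions can be listed while the slow run is in progress. A minimal sketch, assuming a placeholder JDBC connection; the query is a trimmed-down version of the kind shown on the wiki page linked above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LockCheck {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:postgresql://dbhost:5432/mydb"; // placeholder connection details
        // Sessions that are waiting on a lock they have not been granted.
        String sql =
            "SELECT a.pid, a.state, a.wait_event_type, a.query " +
            "FROM pg_locks l JOIN pg_stat_activity a ON a.pid = l.pid " +
            "WHERE NOT l.granted";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                System.out.printf("pid=%d state=%s waiting_on=%s query=%s%n",
                        rs.getLong("pid"), rs.getString("state"),
                        rs.getString("wait_event_type"), rs.getString("query"));
            }
        }
    }
}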
You may be able to optimize general performance as well as behavior towards concurrent transactions to make it run faster. Or, more radically: if at all possible, split your big function into separate parts which you call in separate transactions (see the sketch after the links below).
PostgreSQL obtain and release LOCK inside stored function
How to split huge updates:
How do I do large non-blocking updates in PostgreSQL?
There are additional things to consider when packing multiple big operations into a single function:
Execute multiple functions together without losing performance
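To illustrate the "separate transactions" idea, here is a minimal JDBC sketch under assumptions: hypothetical tables staging_t and target_t that share the same column layout plus an id key, and placeholder connection details. Each pass moves a bounded batch and commits, so locks are held only briefly and concurrent morning activity is not blocked for the whole load:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchedMove {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:postgresql://dbhost:5432/mydb"; // placeholder connection details
        try (Connection con = DriverManager.getConnection(url, "user", "password")) {
            con.setAutoCommit(false);
            // Move at most 10,000 rows per transaction: delete them from staging
            // and insert them into the target, then commit.
            String sql =
                "WITH batch AS ( " +
                "  DELETE FROM staging_t " +
                "  WHERE id IN (SELECT id FROM staging_t LIMIT 10000) " +
                "  RETURNING * ) " +
                "INSERT INTO target_t SELECT * FROM batch";
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                int moved;
                do {
                    moved = ps.executeUpdate(); // rows moved in this pass
                    con.commit();               // short transaction => short-lived locks
                } while (moved > 0);
            }
        }
    }
}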
This may have been asked numerous times, but none of the existing answers have helped me so far.
Here's some history:
QueryTimeOut: 120 secs
Database: DB2
App Server: JBoss
Framework: Struts 2
I've got one query which fetches around a million records. Yes, we need to fetch them all at once for caching purposes; sadly, we can't change the design.
Now, we've got 2 servers, Primary and DR. On the DR server, the query executes within 30 secs, so there is no timeout issue there. But on the Primary server it is timing out for some unknown reason. Sometimes it times out in rs.next() and sometimes in pstmt.executeQuery().
All DB indexes, connection pooling etc. are in place. The explain plan also shows there are no full table scans.
My Analysis:
Since the query is not the issue here, could the problem be network delay?
How can I get to the root cause of this timeout? How can I make sure there is no connection leakage? (All connections are closed properly.)
Is there any way to recover from the timeout and execute the query again with an increased timeout value, e.g. pstmt.setQueryTimeout(600)? Note that this has no effect whatsoever; I don't know why. (See the sketch below.)
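For point 3, this is roughly the retry shape I have in mind (a sketch with a placeholder query; row caching is omitted, and whether setQueryTimeout also covers fetches in rs.next() is driver-dependent):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TimedFetch {
    static void fetchAll(Connection con, int timeoutSeconds) throws SQLException {
        // big_table is a placeholder for the real query.
        try (PreparedStatement ps = con.prepareStatement("SELECT * FROM big_table")) {
            ps.setQueryTimeout(timeoutSeconds); // timeout for statement execution
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // ... load the row into the cache ...
                }
            }
        }
    }

    public static void fetchWithRetry(Connection con) throws SQLException {
        try {
            fetchAll(con, 120); // normal 120-second timeout
        } catch (SQLException e) {
            fetchAll(con, 600); // retry once with a larger timeout
        }
    }
}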
Appreciate any inputs.
Thank You!
We have automated the process of capturing baseline metrics for various queries as part of an Oracle tuning project. The automation is carried out by a QTP script which executes the procedure, which in turn runs the query a specified number of times with different input parameters. Once the execution of the stored procedure is complete, it opens OEM and saves the reports by searching for the particular SQL ID.
We are facing an issue while running stored procedures which in turn contain queries that take a long time to execute. In such cases, QTP executes the stored procedure for some time and after that it appears to have stopped. When I check OEM after a certain amount of time, QTP has terminated the execution of the stored procedure and the session seems to have timed out.
Since QTP uses ADO, do I need to set the “CommandTimeout” property of the connection to some large value when executing stored procedures that take a long time? Doesn't QTP throw any error in case of such a timeout? In our case the QTP status was still displayed as “Running”, even though nothing was happening in the backend.
We use getMetaData() on every cursor returned from the Oracle stored procedure call.
With ojdbc5 we don't see a spike in the number of metadata SQLs executed or in the average time, but with ojdbc6 we see a spike in the number of metadata SQLs executed and an increase in average SQL execution time.
Is anyone aware of this issue with ojdbc6? I wish they had made it open source.
Did anyone at least try decompiling the ojdbc6 jar?
The problem is with the way Spring's SimpleJdbcCall works: it fetches the metadata of the procedure and its arguments on every call. Even if it shouldn't cache that by default, there should at least be a setting that enables and disables metadata caching when using SimpleJdbcCall.
When using SimpleJdbcCall, beware of the metadata contention this causes. If your app makes a lot of PL/SQL procedure invocations, Oracle can run into latch contention and the overall app will slow down, because this becomes a bottleneck; servers can even crash, rendering the app non-responsive. We added a small cache by diving into the Spring code and adding a flag to enable/disable it, and it now runs far faster than ever. An alternative that avoids the metadata lookups without patching Spring is sketched below.
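One way to avoid the per-call metadata queries without patching Spring is to declare the parameters yourself and turn off procedure column metadata access, then reuse the compiled call object. A minimal sketch, assuming a hypothetical procedure my_proc with one IN and one OUT parameter (not from the original post):

import java.sql.Types;
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.jdbc.core.SqlOutParameter;
import org.springframework.jdbc.core.SqlParameter;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.simple.SimpleJdbcCall;

public class ProcCaller {
    private final SimpleJdbcCall call;

    public ProcCaller(DataSource dataSource) {
        // Declare parameters up front so SimpleJdbcCall never has to query the
        // data dictionary, and compile the call once so it can be reused.
        this.call = new SimpleJdbcCall(dataSource)
                .withProcedureName("my_proc")
                .withoutProcedureColumnMetaDataAccess() // skip metadata lookups
                .declareParameters(
                        new SqlParameter("p_in", Types.VARCHAR),
                        new SqlOutParameter("p_out", Types.NUMERIC));
        this.call.compile();
    }

    public Map<String, Object> invoke(String value) {
        return call.execute(new MapSqlParameterSource().addValue("p_in", value));
    }
}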