We have a Oracle 19C database (19.0.0.0.ru-2021-04.rur-2021-04.r1) on AWS RDS which is hosted on an 4 CPU 32 GB RAM instance. The size of the database is not big (35 GB) and the PGA Aggregate Limit is 8GB & Target is 4GB. Whenever the scheduled internal Oracle Auto Optimizer Stats Collection Job (ORA$AT_OS_OPT_SY_nnn) runs then it consumes substantially high PGA memory (approx 7GB) and sometimes this makes database unstable and AWS loses communication with the RDS instance so it restarts the database.
We thought this may be linked to existing Oracle bug 30846782 (19C+: Fast/Excessive PGA growth when using DBMS_STATS.GATHER_TABLE_STATS) but Oracle & AWS had fixed it in the current 19C version we are using. There are no application level operations that consume this much PGA and the database restart have always happened when the Auto Optimizer Stats Collection Job was running. There are couple of more databases, which are on same version, where same pattern was observed and the database was restarted by AWS. We have disabled the job now on those databases to avoid further occurrence of this issue however we want to run this job as disabling it may cause old stats being available in the database.
Any pointers on how to tackle this issue?
I found the same issue in my AWS RDS Oracle 18c and 19c instances, even though I am not in the same patch level as you.
In my case, I applied this workaround and it worked.
SQL> alter system set "_fix_control"='20424684:OFF' scope=both;
However, before applying this change, I strongly suggest that you test it on your non production environments, and if you can, try to consult with Oracle Support. Dealing with hidden parameters might lead to unexpected side effects, so apply it at your own risk.
Instead of completely abandoning automatic statistics gathering, try find any specific objects that are causing the problem. If only a small number of tables are responsible for a large amount of statistics gathering, you can manually analyze those tables or change their preferences.
First, use the below SQL to see which objects are causing the most statistics gathering. According to the test case in bug 30846782, the problem seems to be only related to the number of times DBMS_STATS is called.
select *
from dba_optstat_operations
order by start_time desc;
In addition, you may be able to find specific SQL statements or sessions that generate a lot of PGA memory with the below query. (However, if the database restarts, it's possible that AWR won't save the recorded values.)
select username, event, sql_id, pga_allocated/1024/1024/1024 pga_allocated_gb, gv$active_session_history.*
from gv$active_session_history
join dba_users on gv$active_session_history.user_id = dba_users.user_id
where pga_allocated/1024/1024/1024 >= 1
order by sample_time desc;
If the problem is only related to a small number of tables with a large number of partitions, you can manually gather the stats on just that table in a separate session. Once the stats are gathered, the table won't be analyzed again until about 10% of the data is changed.
begin
dbms_stats.gather_table_stats(user, 'PGA_STATS_TEST');
end;
/
It's not uncommon for a database to spend a long time gathering statistics, but it is uncommon for a database to constantly analyze thousands of objects. Running into this bug implies there is something unusual about your database - are you constantly dropping and creating objects, or do you have a large number of objects that have 10% of their data modified every day? You may need to add a manual gather step to a few of your processes.
Turning off the automatic statistics job entirely will eventually cause many performance problems. Even if you can't add manual gathering steps, you may still want to keep the job enabled. For example, if tables are being analyzed too frequently, you may want to increase the table preference for the "STALE_PERCENT" threshold from 10% to 20%:
begin
dbms_stats.set_table_prefs
(
ownname => user,
tabname => 'PGA_STATS_TEST',
pname => 'STALE_PERCENT',
pvalue => '20'
);
end;
/
My Oracle 11.2.0.3 FULL DATABASE Datapump Export is very slow, when i ask V$SESSION_LONGOPS
SELECT USERNAME,OPNAME,TARGET_DESC,SOFAR,TOTALWORK,MESSAGE,SYSDATE,ROUND(100*SOFAR/TOTALWORK,2)||'%' COMPLETED FROM V$SESSION_LONGOPS
where SOFAR/TOTALWORK!=1
it show me 2 records, in opname one containing the SYS_EXPORT_FULL_XX, and another "Rowid Range Scan" and the message for the last one is
Rowid Range Scan : MY_SCHEMA.BIG_TABLE: 28118329 out of 30250532 Blocks done and it takes hours and hours.
I.E : MY_SCHEMA.BIG_TABLE is a 220 GB table size having 2 CLOB colunn.
If you have CLOBs in the table it will take a long time to export because that wont parallelize. Exactly what phase are you stuck in? Could you paste the last lines from the log file or get a status from data pump?
There are some best practices that you could try out:
SecureFile LOBs can be faster than BasicFile LOBs. That is yet another reason for going to SecureFile LOBs.
You could try to increase the STREAMS_POOL_SIZE to 256 MB (at least) although I think that is not the reason.
Use PARALLEL option and set it to 2 x CPU cores. Never export statistics - it is better to either export using DBMS_STATS or regather at target database.
Regards,
Daniel
Well for 11g and 12cR1 the Streams AQ Enqueue is a common culprit for this as well. If you ALTER SYSTEM SET EVENTS 'IMMEDIATE TRACE NAME MMAN_CREATE_DEF_REQUEST LEVEL 6' this will help if the issue is the very common Streams AQ Enqueue.
I have jobs in Oracle that can run for hours doing a lot of calculations involving (but not limited to) XmlTransform. I have noticed that the PGA memory is increasing (and the performance degrading) gradually until at some point the job fails with an out of memory (PGA) message. We have applied some fixes, but they don't seem to solve the issue.
Stopping the jobs and restarting them, solves my issue, the performance is good again and the memory is low...
All the code is written in PL/SQL and SQL.
Question:
As I want to solve this as soon as possible, I was wondering how I can workaround this type of issue in Oracle.
My main thinking goes to somehow:
restarting the job after some time (possibly the most simple solution) using Advanced Queuing
restart the current session?
executing some code syncronously in another session, maybe another job.
Oracle 12.1.0.2
EDIT: As asked here's sample code with XMLTransform:
function i_Convert_Xml_To_Clob (p_Zoek_Result_Type_Id in Zoek_Result_Type.Zoek_Result_Type_Id%type,
p_Xml in xmltype,
p_Xml_Transformation in xmltype) return clob is
mResult clob;
begin
if p_Xml_Transformation is not null then
select Xmltransform (p_Xml, p_Xml_Transformation).getclobval()
into mResult
from Dual;
elsif p_Xml is not null then
mResult := p_Xml.getclobval();
else
mResult := null;
end if;
return mResult;
end i_Convert_Xml_To_Clob;
Can you or a DBA monitor temp lob usage from another session using V$TEMPORARY_LOBS. If the number of lobs is increasing then the session is not freeing them correctly and this will lead increasing PGA usage (Note this is not a leak).
The most common scenario is when processing a statement that returns one or more temporary lobs, for instance XMLTRANSMFORM().getClobVal().
It is not uncommon for (Java ?) developers to forget that a TEMP LOB is a SESSION level object and the resources assoicated with it will not be freed as a result of the client handle or reference going out of scope. Eg If you get a TEMP lob into a JAVA Clob object you can not rely on garbage collection to clean up the LOB. You must explicitly free the lob before overwriting it with the next lob or the LOB resources will be held by the server until the session ends.
Since we don't have sample code we cannot definitely state this is what is happening in your case.
I am working on an application that have lot of cursor and many of them are just defined in a package header and not used in package body, so does this unused cursor creates an overhead?
A declared cursor but unused cursor will create no overhead but an open but unused cursor might, a bit.
An open cursor is stored in the private SQL area of the PGA. This "holds information about a parsed SQL statement and other session-specific information for processing.". You can find the amount of PGA you have by querying V$PGASTAT.
It's not 100% clear from Oracle's Memory Architecture documentation whether opened but unused cursors store anything in the PGA. The section on the persistent area of the private SQL area of the PGA would insinuate that this is only created if you're binding any variables to your cursor; but, as the state of the cursor must be stored in order for the DB to know that it's open I'm assuming that some memory is used.
If a single open cursor is negatively impacting your performance I'd be horrified. This would be an indication that you've massively underestimated the size of the PGA and SGA (execution plans are stored here) that you need.
However, this strategy can backfire massively as the number of open cursors is limited by the open_cursors parameter, which you can find in V$PARAMETER. This is an absolute upper limit on the number of open cursors you can have open. If you hit this limit you'll get ORA-01000.
This means that you should not open cursors that you're not going to use.
However it's also worth noting this particular Ask Tom question/answer, though it's from 2004.
3- If the open_cursors are increased then what will be the performance impact on the db > server and the memory usage?
...
Followup April 1, 2004 - 10am UTC:
...
...3) if you are not hitting ora-1000, it will change nothing (since you are not using the cursors you currently have)
I need some pointers on how to diagnose and fix this problem. I don't know if this is a simple server setup problem or an application design problem (or both).
Once or twice every few months this Oracle XE database reports ORA-4031 errors. It doesn't point to any particular part of the sga consistently. A recent example is:
ORA-04031: unable to allocate 8208 bytes of shared memory ("large pool","unknown object","sort subheap","sort key")
When this error comes up, if the user keeps refreshing, clicking on different links, they'll generally get more of these kinds of errors at different times, then soon they'll get "404 not found" page errors.
Restarting the database usually resolves the problem for a while, then a month or so later it comes up again, but rarely at the same location in the program (i.e. it doesn't seem linked to any particular portion of code) (the above example error was raised from an Apex page which was sorting 5000+ rows from a table).
I've tried increasing sga_max_size from 140M to 256M and hope this will help things. Of course, I won't know if this has helped since I had to restart the database to change the setting :)
I'm running Oracle XE 10.2.0.1.0 on a Oracle Enterprise Linux 5 box with 512MB of RAM. The server only runs the database, Oracle Apex (v3.1.2) and Apache web server. I installed it with pretty much all default parameters and it's been running quite well for a year or so. Most issues I've been able to resolve myself by tuning the application code; it's not intensively used and isn't a business critical system.
These are some current settings I think may be relevant:
pga_aggregate_target 41,943,040
sga_max_size 268,435,456
sga_target 146,800,640
shared_pool_reserved_size 5,452,595
shared_pool_size 104,857,600
If it's any help here's the current SGA sizes:
Total System Global Area 268435456 bytes
Fixed Size 1258392 bytes
Variable Size 251661416 bytes
Database Buffers 12582912 bytes
Redo Buffers 2932736 bytes
Even though you are using ASMM, you can set a minimum size for the large pool (MMAN will not shrink it below that value).
You can also try pinning some objects and increasing SGA_TARGET.
Don't forget about fragmentation.
If you have a lot of traffic, your pools can be fragmented and even if you have several MB free, there could be no block larger than 4KB.
Check size of largest free block with a query like:
select
'0 (<140)' BUCKET, KSMCHCLS, KSMCHIDX,
10*trunc(KSMCHSIZ/10) "From",
count(*) "Count" ,
max(KSMCHSIZ) "Biggest",
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ<140
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 10*trunc(KSMCHSIZ/10)
UNION ALL
select
'1 (140-267)' BUCKET,
KSMCHCLS,
KSMCHIDX,
20*trunc(KSMCHSIZ/20) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 140 and 267
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 20*trunc(KSMCHSIZ/20)
UNION ALL
select
'2 (268-523)' BUCKET,
KSMCHCLS,
KSMCHIDX,
50*trunc(KSMCHSIZ/50) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 268 and 523
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 50*trunc(KSMCHSIZ/50)
UNION ALL
select
'3-5 (524-4107)' BUCKET,
KSMCHCLS,
KSMCHIDX,
500*trunc(KSMCHSIZ/500) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 524 and 4107
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 500*trunc(KSMCHSIZ/500)
UNION ALL
select
'6+ (4108+)' BUCKET,
KSMCHCLS,
KSMCHIDX,
1000*trunc(KSMCHSIZ/1000) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ >= 4108
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 1000*trunc(KSMCHSIZ/1000);
Code from
All of the current answers are addressing the symptom (shared memory pool exhaustion), and not the problem, which is likely not using bind variables in your sql \ JDBC queries, even when it does not seem necessary to do so. Passing queries without bind variables causes Oracle to "hard parse" the query each time, determining its plan of execution, etc.
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::p11_question_id:528893984337
Some snippets from the above link:
"Java supports bind variables, your developers must start using prepared statements and bind inputs into it. If you want your system to ultimately scale beyond say about 3 or 4 users -- you will do this right now (fix the code). It is not something to think about, it is something you MUST do. A side effect of this - your shared pool problems will pretty much disappear. That is the root cause. "
"The way the Oracle
shared pool (a very important shared memory data structure)
operates is predicated on developers using bind variables."
" Bind variables are SO MASSIVELY important -- I cannot in any way shape or form OVERSTATE their importance. "
The following are not needed as they they not fix the error:
ps -ef|grep oracle
Find the smon and kill the pid for it
SQL> startup mount
SQL> create pfile from spfile;
Restarting the database will flush your pool and that solves a effect not the problem.
Fixate your large_pool so it can not go lower then a certain point or add memory and set a higher max memory.
This is Oracle bug, memory leak in shared_pool, most likely db managing lots of partitions.
Solution: In my opinion patch not exists, check with oracle support. You can try with subpools or en(de)able AMM ...
Error
ORA-04031: unable to allocate 4064 bytes of shared memory ("shared pool","select increment$,minvalue,m...","sga heap(3,0)","kglsim heap")
Solution: by nepasoft nepal
1.-
ps -ef|grep oracle
2.- Find the smon and kill the pid for it
3.-
SQL> startup mount
ORACLE instance started.
Total System Global Area 4831838208 bytes
Fixed Size 2027320 bytes
Variable Size 4764729544 bytes
Database Buffers 50331648 bytes
Redo Buffers 14749696 bytes
Database mounted.
4.-
SQL> alter system set shared_pool_size=100M scope=spfile;
System altered.
5.-
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
6.-
SQL> startup
ORACLE instance started.
Total System Global Area 4831838208 bytes
Fixed Size 2027320 bytes
Variable Size 4764729544 bytes
Database Buffers 50331648 bytes
Redo Buffers 14749696 bytes
Database mounted.
Database opened.
7.-
SQL> create pfile from spfile;
File created.
SOLVED