How to find the cause of a sudden undo tablespace increase? - oracle

I use Oracle R12.1.3 on an Oracle 10g database (10.2.0.4) hosted on IBM AIX 5.3.
Recently I found that the space used in the undo tablespace had suddenly increased.
Please let me know what could be the cause of this sudden increase.
If there is a query to find the cause, please let me know.

The query below lists the statements that were executed in the last hour.
A higher rows_processed value means more undo generated by DML statements.
select rows_processed, sql_id, sql_text
from v$sql
where last_active_time > sysdate - 1/24
order by rows_processed desc;
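If the growth is happening right now, you can also see which sessions are currently holding the most undo. Below is a minimal sketch joining v$transaction to v$session (standard views in 10g; adjust the column list to taste):
-- Undo currently held per active transaction, largest first
select s.sid, s.serial#, s.username, s.program,
       t.used_ublk as undo_blocks,
       t.used_urec as undo_records,
       t.start_time
from v$transaction t
join v$session s on s.saddr = t.ses_addr
order by t.used_ublk desc;
For historical growth, v$undostat keeps roughly ten-minute snapshots of undo blocks consumed (UNDOBLKS), which can show when the spike started.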

Related

Auto Optimizer Stats Collection Job Causing Oracle RDS Database to Restart

We have an Oracle 19c database (19.0.0.0.ru-2021-04.rur-2021-04.r1) on AWS RDS, hosted on a 4 CPU, 32 GB RAM instance. The database is not big (35 GB), the PGA Aggregate Limit is 8 GB and the Target is 4 GB. Whenever the scheduled internal Oracle Auto Optimizer Stats Collection job (ORA$AT_OS_OPT_SY_nnn) runs, it consumes substantially high PGA memory (approx. 7 GB); sometimes this makes the database unstable, AWS loses communication with the RDS instance, and it restarts the database.
We thought this might be linked to existing Oracle bug 30846782 (19C+: Fast/Excessive PGA growth when using DBMS_STATS.GATHER_TABLE_STATS), but Oracle and AWS had fixed it in the 19c version we are using. There are no application-level operations that consume this much PGA, and the database restarts have always happened while the Auto Optimizer Stats Collection job was running. There are a couple more databases on the same version where the same pattern was observed and the database was restarted by AWS. We have disabled the job on those databases to avoid further occurrences of this issue, but we would like to keep running it, since disabling it means the database ends up relying on stale statistics.
Any pointers on how to tackle this issue?
I found the same issue on my AWS RDS Oracle 18c and 19c instances, even though I am not on the same patch level as you.
In my case, I applied this workaround and it worked.
SQL> alter system set "_fix_control"='20424684:OFF' scope=both;
However, before applying this change, I strongly suggest that you test it in your non-production environments and, if you can, consult Oracle Support. Dealing with hidden parameters might lead to unexpected side effects, so apply it at your own risk.
Instead of completely abandoning automatic statistics gathering, try to find any specific objects that are causing the problem. If only a small number of tables are responsible for a large amount of statistics gathering, you can manually analyze those tables or change their preferences.
First, use the SQL below to see which objects are causing the most statistics gathering. According to the test case in bug 30846782, the problem seems to be related only to the number of times DBMS_STATS is called.
select *
from dba_optstat_operations
order by start_time desc;
In addition, you may be able to find specific SQL statements or sessions that generate a lot of PGA memory with the below query. (However, if the database restarts, it's possible that AWR won't save the recorded values.)
select username, event, sql_id, pga_allocated/1024/1024/1024 pga_allocated_gb, gv$active_session_history.*
from gv$active_session_history
join dba_users on gv$active_session_history.user_id = dba_users.user_id
where pga_allocated/1024/1024/1024 >= 1
order by sample_time desc;
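If the instance restart wiped the in-memory ASH buffer, the samples that were already flushed to AWR may still be visible in DBA_HIST_ACTIVE_SESS_HISTORY. A similar sketch against the persisted history (keep in mind this view keeps only about one in ten samples):
select u.username, h.sql_id,
       h.pga_allocated / 1024 / 1024 / 1024 pga_allocated_gb,
       h.sample_time
from dba_hist_active_sess_history h
join dba_users u on u.user_id = h.user_id
where h.pga_allocated / 1024 / 1024 / 1024 >= 1
order by h.sample_time desc;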
If the problem is only related to a small number of tables with a large number of partitions, you can manually gather the stats on just that table in a separate session. Once the stats are gathered, the table won't be analyzed again until about 10% of the data is changed.
begin
dbms_stats.gather_table_stats(user, 'PGA_STATS_TEST');
end;
/
It's not uncommon for a database to spend a long time gathering statistics, but it is uncommon for a database to constantly analyze thousands of objects. Running into this bug implies there is something unusual about your database - are you constantly dropping and creating objects, or do you have a large number of objects that have 10% of their data modified every day? You may need to add a manual gather step to a few of your processes.
Turning off the automatic statistics job entirely will eventually cause many performance problems. Even if you can't add manual gathering steps, you may still want to keep the job enabled. For example, if tables are being analyzed too frequently, you may want to increase the table preference for the "STALE_PERCENT" threshold from 10% to 20%:
begin
dbms_stats.set_table_prefs
(
ownname => user,
tabname => 'PGA_STATS_TEST',
pname => 'STALE_PERCENT',
pvalue => '20'
);
end;
/
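If you want to confirm what a table is currently set to, DBMS_STATS.GET_PREFS returns the effective preference value. A small sketch, reusing the example table name from above:
-- Show the effective STALE_PERCENT setting for one table
select dbms_stats.get_prefs('STALE_PERCENT', user, 'PGA_STATS_TEST') as stale_percent
from dual;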

Unable to extend temp segment by 16 in tablespace PSTEMP

The query that gives me this error has been running for 6 months now and it was working fine. Today, for some reason, it gave me this error:
Error in running query because of SQL Error, Code=1652, Message=ORA-01652: unable to extend temp segment by 16 in tablespace PSTEMP (50,380).
I don't want to extend the "PSTEMP" file. The query shouldn't be the problem since, as I mentioned, it worked fine until now.
I don't know if this will help, but the query has a prompt value. If I enter a wrong value it works fine, but when I enter the value from last week, which I know should return 16 rows, I instead get the above error.
You can check your temp space with
SELECT * FROM dba_temp_free_space;
but it might not necessarily be temp despite the error message.
Check your tablespace free space with:
select a.tablespace_name,
       sum(a.tots / 1048576) Tot_Size,
       sum(a.sumb / 1048576) Tot_Free,
       round(sum(a.sumb) * 100 / sum(a.tots), 2) Pct_Free,
       sum(a.largest / 1024) Max_Free,
       sum(a.chunks) Chunks_Free
from (
       select tablespace_name, 0 tots, sum(bytes) sumb,
              max(bytes) largest, count(*) chunks
       from dba_free_space
       group by tablespace_name
       union
       select tablespace_name, sum(bytes) tots, 0, 0, 0
       from dba_data_files
       group by tablespace_name
     ) a
group by a.tablespace_name
order by pct_free;
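If it really is temp that is running out, you can also see which sessions and statements are holding temp space right now. A rough sketch using v$tempseg_usage (the 8192 below assumes an 8 KB block size; adjust for your database):
select s.sid, s.serial#, s.username, u.sql_id,
       u.tablespace, u.segtype,
       round(u.blocks * 8192 / 1024 / 1024) temp_mb
from v$tempseg_usage u
join v$session s on s.saddr = u.session_addr
order by u.blocks desc;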
Most likely, your SQL became too heavy as the underlying data grew. You can try optimizing the SQL or, if that's not an option, ask the DBAs to increase the temporary tablespace (PSTEMP).

CONNECT BY NOCYCLE PRIOR 10G Optimiser Mode

Question for today: if the RBO is enabled in 10.2.0.3 and one attempts to use a hierarchical approach, CONNECT BY PRIOR for example, does the optimiser get switched to the CBO for execution? I have a large RBO 10gR2 database (don't ask!!); I know the stats are out of date and the query runs like a dog using CONNECT BY.
In v$sqlarea the OPTIMIZER_MODE is RULE. I know that using LEFT OUTER joins will force RULE to COST.
Any thoughts?
If my memory is correct, you should be able to force the RBO with:
/*+ RULE */
as an optimizer hint.
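For example (employees here is just a hypothetical table, not something from your question):
select /*+ RULE */ employee_id, manager_id, level
from employees
start with manager_id is null
connect by prior employee_id = manager_id;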
I managed to figure out that it was not the CONNECT BY forcing the CBO; a RANK() OVER (PARTITION BY ...) in the SELECT clause was causing it!

How do I check index building status on Oracle 11?

I made a terrible mistake in a SQL index creation:
create index IDX_DATA_TABLE_CUSECO on DATA_TABLE (CUSTOMER_ID, SESSION_ID, CONTACT_ID)
tablespace IDX_TABLESPACE LOCAL ;
As you can see, I missed the keyword "ONLINE", which would have created the index without blocking the heavily used PRODUCTION table with 600m+ records. The corrected SQL is:
create index IDX_DATA_TABLE_CUSECO on DATA_TABLE (CUSTOMER_ID, SESSION_ID, CONTACT_ID)
tablespace IDX_TABLESPACE LOCAL ONLINE;
I ran it in PL/SQL Developer. When I tried to stop it, the program stopped responding and crashed.
The production system has not been working for 9 hours now and my boss wants to explode. :D
Is there any way to see how many seconds/minutes/hours Oracle 11g has left to process this index creation? Or at least to see whether Oracle is still working on this request? (PL/SQL Developer crashed.)
For haters:
I know I should have done it as mentioned here: (source)
CREATE INDEX cust_idx on customer(id) UNUSABLE LOCAL;
ALTER INDEX cust_idx REBUILD parallel 6 NOLOGGING ONLINE;
You should be able to view the progress of the operation in V$SESSION_LONGOPS
SELECT sid,
serial#,
target,
target_desc,
sofar,
totalwork,
start_time,
time_remaining,
elapsed_seconds
FROM v$session_longops
WHERE time_remaining > 0
Of course, in a production system, I probably would have killed the session hours ago rather than letting the DDL operation continue to prevent users from accessing the application.
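Once the CREATE INDEX session has finished or been killed, you can also check whether the index actually exists and is usable. A sketch using the index name from the question; because the index is LOCAL (partitioned), DBA_INDEXES.STATUS will show N/A and the per-partition status is in DBA_IND_PARTITIONS:
select index_name, status
from dba_indexes
where index_name = 'IDX_DATA_TABLE_CUSECO';
select partition_name, status
from dba_ind_partitions
where index_name = 'IDX_DATA_TABLE_CUSECO';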

Securing Oracle distributed transactions against network failures

I am synchronizing a table in a local database with data from a table on a database on the opposite side of the earth using distributed transactions.
The networks are connected through vpn over the internet.
Most of the time it works fine, but when the connection is disrupted during an active transaction, a lock prevents the job from running again.
I cannot kill the locking session. Trying to do so just returns "ORA-00031: Session marked for kill", and the session is not actually killed until I cycle the local database.
The sync job is basically
DECLARE
  CURSOR TRANS_CURSOR IS
    SELECT COL_A, COL_B, COL_C
      FROM REMOTE_MASTERTABLE@MY_LINK
     WHERE UPDATED IS NULL;
BEGIN
  FOR TRANS IN TRANS_CURSOR LOOP
    INSERT INTO LOCAL_MASTERTABLE (COL_A, COL_B, COL_C)
    VALUES (TRANS.COL_A, TRANS.COL_B, TRANS.COL_C);

    INSERT INTO LOCAL_DETAILSTABLE (COL_A, COL_D, COL_E)
    SELECT COL_A, COL_D, COL_E
      FROM REMOTE_DETAILSTABLE@MY_LINK
     WHERE COL_A = TRANS.COL_A;

    UPDATE REMOTE_MASTERTABLE@MY_LINK SET UPDATED = 1 WHERE COL_A = TRANS.COL_A;
  END LOOP;
END;
/
Any ideas to make this sync operation more tolerant to network dropouts would be greatly appreciated.
I use Oracle Standard Edition One, so no Enterprise features are available.
TIA
Søren
First off, do you really need to roll your own replication solution? Oracle provides technologies like Streams that are designed to allow you to replicate data changes from one system to another reliably without depending on the database link being always available. That also minimizes the amount of code you have to write and the amount of maintenance you have to perform.
Assuming that your application does need to be configured this way, Oracle will have to use the two-phase commit protocol to ensure that the distributed transaction happens atomically. It sounds like transactions are being left in an in-doubt state. You should be able to see information about in-doubt transactions in the DBA_2PC_PENDING view. You should then be able to manually handle that in-doubt transaction.
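For example, something along these lines (the transaction ID is purely illustrative; take the real LOCAL_TRAN_ID from the view first):
-- List in-doubt distributed transactions
select local_tran_id, state, fail_time
from dba_2pc_pending;
-- Force the transaction one way or the other, then purge its entry
rollback force '1.23.456';
begin
  dbms_transaction.purge_lost_db_entry('1.23.456');
end;
/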
You may want to use bulk processing instead of looping. Bulk DML can often give huge performance gains, and if there is a large amount of network lag then the difference may be dramatic if Oracle is retrieving one row at a time. Decreasing the run time won't fix the error, but it should help avoid it. (Although Oracle may already be doing this optimization behind the scenes.)
EDIT
Bulk processing might help, but the best solution would probably be to use only SQL statements. I did some testing and the below version ran about 20 times faster than the original. (Although it's difficult to know how closely my sample data and self-referencing database link model your real data.)
BEGIN
  INSERT INTO LOCAL_MASTERTABLE (COL_A, COL_B, COL_C)
  SELECT COL_A, COL_B, COL_C
    FROM REMOTE_MASTERTABLE@MY_LINK
   WHERE UPDATED IS NULL;

  INSERT INTO LOCAL_DETAILSTABLE (COL_A, COL_D, COL_E)
  SELECT REMOTE_DETAILSTABLE.COL_A, REMOTE_DETAILSTABLE.COL_D, REMOTE_DETAILSTABLE.COL_E
    FROM REMOTE_DETAILSTABLE@MY_LINK
   INNER JOIN (SELECT COL_A FROM REMOTE_MASTERTABLE@MY_LINK WHERE UPDATED IS NULL) TRANS
      ON REMOTE_DETAILSTABLE.COL_A = TRANS.COL_A;

  UPDATE REMOTE_MASTERTABLE@MY_LINK SET UPDATED = 1 WHERE UPDATED IS NULL;
END;
/
