I recently moved a set of my Oracle SQL queries to a package with table functions where I am returning data pipelined.
However, I started observing something unusual. V$SQL stats like buffer_gets, fetches, cpu_time, execution_time etc, have started showing a cumulative number, and they keep getting higher with every execution of the queries.
Is this usual behavior?
That's expected behavior.
There's a lots of shared resources involved. If you have an application with a dozen database connections in a pool, all doing the same sort of work, then you can have SQLs that have multiple users fetching from them at the same time (eg Connection 1 getting the Invoices for ABC while Connection 2 is getting invoices for XYZ). V$SQL (or GV$SQL in a RAC environment) will show the cumulative totals for most details (and things like USERS_EXECUTING will be current values across all sessions).
The only time they'll get reset is when the SQL ages out of the shared area, typically when it hasn't being used for some time and the resources are needed for another SQL, or rare events like a database shutdown.
Related
We have a Oracle 19C database (19.0.0.0.ru-2021-04.rur-2021-04.r1) on AWS RDS which is hosted on an 4 CPU 32 GB RAM instance. The size of the database is not big (35 GB) and the PGA Aggregate Limit is 8GB & Target is 4GB. Whenever the scheduled internal Oracle Auto Optimizer Stats Collection Job (ORA$AT_OS_OPT_SY_nnn) runs then it consumes substantially high PGA memory (approx 7GB) and sometimes this makes database unstable and AWS loses communication with the RDS instance so it restarts the database.
We thought this may be linked to existing Oracle bug 30846782 (19C+: Fast/Excessive PGA growth when using DBMS_STATS.GATHER_TABLE_STATS) but Oracle & AWS had fixed it in the current 19C version we are using. There are no application level operations that consume this much PGA and the database restart have always happened when the Auto Optimizer Stats Collection Job was running. There are couple of more databases, which are on same version, where same pattern was observed and the database was restarted by AWS. We have disabled the job now on those databases to avoid further occurrence of this issue however we want to run this job as disabling it may cause old stats being available in the database.
Any pointers on how to tackle this issue?
I found the same issue in my AWS RDS Oracle 18c and 19c instances, even though I am not in the same patch level as you.
In my case, I applied this workaround and it worked.
SQL> alter system set "_fix_control"='20424684:OFF' scope=both;
However, before applying this change, I strongly suggest that you test it on your non production environments, and if you can, try to consult with Oracle Support. Dealing with hidden parameters might lead to unexpected side effects, so apply it at your own risk.
Instead of completely abandoning automatic statistics gathering, try find any specific objects that are causing the problem. If only a small number of tables are responsible for a large amount of statistics gathering, you can manually analyze those tables or change their preferences.
First, use the below SQL to see which objects are causing the most statistics gathering. According to the test case in bug 30846782, the problem seems to be only related to the number of times DBMS_STATS is called.
select *
from dba_optstat_operations
order by start_time desc;
In addition, you may be able to find specific SQL statements or sessions that generate a lot of PGA memory with the below query. (However, if the database restarts, it's possible that AWR won't save the recorded values.)
select username, event, sql_id, pga_allocated/1024/1024/1024 pga_allocated_gb, gv$active_session_history.*
from gv$active_session_history
join dba_users on gv$active_session_history.user_id = dba_users.user_id
where pga_allocated/1024/1024/1024 >= 1
order by sample_time desc;
If the problem is only related to a small number of tables with a large number of partitions, you can manually gather the stats on just that table in a separate session. Once the stats are gathered, the table won't be analyzed again until about 10% of the data is changed.
begin
dbms_stats.gather_table_stats(user, 'PGA_STATS_TEST');
end;
/
It's not uncommon for a database to spend a long time gathering statistics, but it is uncommon for a database to constantly analyze thousands of objects. Running into this bug implies there is something unusual about your database - are you constantly dropping and creating objects, or do you have a large number of objects that have 10% of their data modified every day? You may need to add a manual gather step to a few of your processes.
Turning off the automatic statistics job entirely will eventually cause many performance problems. Even if you can't add manual gathering steps, you may still want to keep the job enabled. For example, if tables are being analyzed too frequently, you may want to increase the table preference for the "STALE_PERCENT" threshold from 10% to 20%:
begin
dbms_stats.set_table_prefs
(
ownname => user,
tabname => 'PGA_STATS_TEST',
pname => 'STALE_PERCENT',
pvalue => '20'
);
end;
/
Is optimizer_use_sql_plan_baselines and resource_manager_cpu_allocation oracle system parameter have impact on sql query performance.
We have two envt suppose A and B. On A Envt query is running fine but in Envt. B its tacking time. I have compared system parameter and found difference in values in optimizer_use_sql_plan_baselines and resource_manager_cpu_allocation .
SQL plan baselines and the resource manager certainly could have a huge impact on performance, and you should use the below two queries or confirm or deny that those parameters are related to your problem.
GV$SQL stores which SQL plan baseline is associated with each SQL statement. Compare the SQL_PLAN_BASELINE column in the below query, and if they are equal then your problem is not related to baselines:
select sql_plan_baseline, round(elapsed_time/1000000) elapsed_seconds, gv$sql.*
from gv$sql
order by elapsed_time desc;
The Active Session History (ASH) views can tell you if the resource manager is an issue. If your queries are being throttled then you will see an event
named "resmgr:cpu quantum" in the below query. (But pay attention to the counts - don't troubleshoot a wait event if it only happens a small number of times.)
select nvl(event, 'CPU') event, count(*)
from gv$active_session_history
group by event
order by count(*) desc;
Resource manager can have other potentially negative affects. If you're in a data warehouse, and using parallel queries, it's possible that resource manager has downgraded the queries on one system. If you're using parallel queries, try comparing the SQL monitoring reports from both systems:
select dbms_sqltune.report_sql_monitor(sql_id => '&YOUR_SQL_ID') from dual;
However, I have a feeling that you're using the wrong approach for your problem. There are generally two approaches to Oracle database performance - database tuning and query tuning. If you're only interested in a single query, then you should probably focus on things like the execution plan and the wait events for the operations of that specific query.
I’m a longtime MSSQL developer who finds himself back in PL/SQL for the first time since Oracle 7. I’m looking for some tuning advice re a large export stored procedure, which is sporadically and not very reproducably running slow at certain points. This happens around some static working tables which it truncates, fills and uses as part of the export. The code in outline typically looks like this:
create or replace Procedure BigMultiPurposeExport as (
-- about 2000 lines of other code
INSERT WORK_TABLE_5 SELECT WHATEVER1 FROM WHEREVER1;
INSERT WORK_TABLE_5 SELECT WHATEVER2 FROM WHEREVER2;
INSERT WORK_TABLE_5 SELECT WHATEVER3 FROM WHEREVER3;
INSERT WORK_TABLE_5 SELECT WHATEVER4 FROM WHEREVER4;
-- WORK_TABLE_5 now has 0 to ~500k rows whose content can vary drastically from run to run
-- e.g. one hourly run exports 3 whale sightings, next exports all tourist visits to Kenya this decade
-- about 1000 lines of other code
INSERT OUTPUT_TABLE_3
SELECT THIS, THAT, THE_OTHER
FROM BUSINESS_TABLE_1 BT1
INNER JOIN BUSINESS_TABLE_2 ON etc -- typical join on indexed columns
INNER JOIN BUSINESS_TABLE_3 ON etc -- typical join on indexed columns
INNER JOIN BUSINESS_TABLE_4 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_1 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_2 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_3 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_4 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_5 WT5 ON BT1.ID = WT5.BT1_ID AND WT5.RECORD_TYPE = 21
-- join above is now supported by indexes on BUSINESS_TABLE_1 (ID) and WORK_TABLE_5 (BT1_ID, RECORD_TYPE), originally wasn't
LEFT OUTER JOIN WORK_TABLE_6 ON etc -- typical join on indexed columns
LEFT OUTER JOIN WORK_TABLE_7 ON etc -- typical join on indexed columns
-- about 4000 lines of other code
)
That final insert into OUTPUT_TABLE_3 usually runs in under 10 seconds, but once in a while on certain customer servers it times out at our default 99 minutes. Then we have them take the tiemout off and run it on Friday night, and it finishes but takes 16 hours.
I narrowed the problem down to the join to WORK_TABLE_5, which had no index support, and put an index on the join terms. The next run took 4 seconds. But success has been intermittent, the customer occasionally gets some slow runs when they drastically change their export selection (i.e. drastically change the data in WORK_TABLE_5). And if we update statistics and rebuild indexes after a timed out export, it runs fine at the next attempt.
So, I am wondering about how best to handle truncating/filling static work tables with static indexes, statistics updated overnight, and a stored procedure compiled when the statistics are nothing like runtime.
I have a few general questions about things I'd like to understand better:
Is the nature of the data in the work table going to substantially effect the query plan? Does Oracle form its query plan when you compile the stored procedure? Could we get a highly inappropriate query plan if we compile the stored procedure with the table empty then use a table with 500k rows at runtime?
I expect that if this were an ad-hoc script then updating statistics on the problem table just before selecting from it would eliminate the sporadic slowdowns. But what if I were to update statistics inside the stored procedure, which is compiled with different statistics from runtime?
Anything else you'd like to add...
Thanks for any advice. I hope my MSSQL preconceptions haven't made me too far off base.
This is happening in Oracle 11g, but the code is deployed to assorted customers using Oracle 10 through 12 and I'd like to cater to all of those if possible.
-- Joel
Huge differences in table or index sizes can most definitely cause performance problems. The solution is to add statistics gathering to the procedure instead of relying on the default statistics jobs.
If you've been away from Oracle since version 7, the most important new feature is the Cost Based Optimizer. Oracle now builds query execution plans based on the optimizer statistics of tables, indexes, columns, expressions, system statistics, outlines, directives, dynamic sampling, etc. If you're a full time Oracle developer you should probably spend a day reading about optimizer statistics. Start with Managing Optimizer Statistics and DBMS_STATS in the official documentation.
Eventually the stored procedure should look like this:
--1: Insert into working tables.
insert into work_table...
--2: Gather statistics on working tables.
dbms_stats.gather_table_stats('SCHEMA_NAME', 'WORK_TABLE', ...);
--3: Use working tables.
insert into other_table select * from work_table...
There are so many statistics features it's hard to know exactly what parameters to use in that second step above. Here are some guesses about some features you might find useful:
DEGREE - One reason people avoid gathering statistics inside a process is the time is takes. You can significantly improve the run time by setting the degree. Although this also uses significantly more resources.
NO_INVALIDATE - It can be tricky to know when exactly are the statistics "set" for a query. Gathering statistics usually quickly invalidates execution plans that were based on old statistics. But not always. If you want to be 100% sure that the next query is using the latest statistics you want to set NO_INVALIDATE=>FALSE.
ESTIMATE_PERCENT In 11g and above you definitely want to use the default, which uses a faster algorithm. In 10g and below you may need to set the value to something low to make the gathering fast enough.
Although Oracle 10g and above comes with default statistics gathering jobs you cannot rely on them for a few reasons:
They are scheduled and may not run at the right time. If a process significantly changes the data then new stats are needed right away, not at 10 PM. If there are a lot of tables that need to be analyzed the job may not get to them all in one day.
Many DBAs disable the jobs. This is ridiculous and almost always a mistake. But you'll find many DBAs that disabled the job because they think they can do it better. Instead of working with the auto tasks, and settings preferences, many DBAs like to throw the whole thing out and replace it with a custom procedure that rots over time.
What am I missing here? I am trying to test identifying long running queries.
I have a test table with about 400 million rows called mytest.
I ran select * from mytest in sqlplus
In another window, I ran the script below to see my long running query
select s.username, s.sid, s.serial#, s.schemaname,
s.program, s.osuser, s.status, s.last_call_et
from v$session s
where last_call_et >= 1 – this is just for testing
My long running query does not show up in the result from the query above. If I change the criteria to be >=0, then I see my query showing the status as INACTIVE and last_call_et of 0 despite the fact that the query is still running. What can I do to see my long running queries like the select * from... above so that I can kill it?
Thanks
First, you need to understand what a query like select * from mytest is really doing under the covers because that's generally not going to be a long-running query. Oracle doesn't ever need to materialize that result set and isn't going to read all the data as the result of a single call. Instead, what goes on is a series of calls each of which cause Oracle to do a little bit of work. The conversation goes something like this.
Client: Hey Oracle, run the query for me: select * from mytest
Oracle: Sure thing (last_call_et resets to 0 to reflect that a new call started). I've generated a query plan and opened a cursor,
here's a handle (note that no work has been done yet to actually
execute the query)
Client: Cool, thanks. Using this cursor handle,
fetch me the next 50 rows (the fetch size is a client-side setting)
Oracle: Will do (last_call_et resets to 0 to reflect that a new call started). I started full scanning the table, read a couple of
blocks, and got 50 rows. Here you go.
Client: OK, I've processed
those. Using this cursor handle, fetch the next 50 rows
Repeat until all the data is fetched
At no point in this process is Oracle ever really being asked to do more than read a handful of blocks to get the 50 rows (or whatever the fetch size the client is requesting). At any point, the client could simply not request the next batch of data so Oracle doesn't need to do anything long-running. Oracle doesn't track the application think time between requests for more data-- it has no idea whether the client is a GUI that is in a tight loop fetching data or whether it is displaying a result to a human and waiting for a human to hit the "next" button. The vast majority of the time, the session is going to be INACTIVE because it's mostly waiting for the client to request the next batch of data (which it generally won't do until it had formatted the last batch of data for display and done the work to display it).
When most people talk about a long-running query, they're talking about a query that Oracle is actively processing for a relatively long time with no waits on a client to fetch the data.
You can use the below script to find long running query:
select * from
(
select
opname,
start_time,
target,
sofar,
totalwork,
units,
elapsed_seconds,
message
from
v$session_longops
order by start_time desc
)
where rownum <=1;
is there a way to know that of the say 100 DDL and DML statement which i am executing through jdbc , is something stuck.
i need to find out for progress bar that in db sql statements are executing and are not hung, so that can inform to user.
Is there a way to find this out.
There are any number of Oracle data dictionary views that you might use given the relatively vague requirements (what, exactly, is your definition of "hung", for example).
I'd probably start, though, with v$session assuming you can identify the sid and serial# of the particular session that you want to monitor (which may be aided by looking at the various columns in v$session). STATUS will tell you whether the session is actively executing a SQL statement at that particular instant. The sql_id will (generally) let you join to v$sqlarea or other views that tell you what statement is currently executing. The event column will tell you what the session is waiting on (i.e. reading from disk, waiting on CPU, waiting to acquire a lock, etc.).
The sql_id from v$session will also let you join to v$sqlstats which periodically updates with things like the amount of logical I/O a particular SQL statement has generated which would let you see that the currently active statement is doing something (whether that something is useful or whether it will terminate in our lifetime would be much more difficult).
Depending on what the code is doing, there may be one or more rows in v$session_longops that you can use to track the progress of longer-running operations-- using this effectively, though, will require that the third-party code issues the sort of long-running SQL operations that Oracle can monitor automatically (i.e. table scans of tables that have a reasonable amount of data) or that the code is instrumented to use v$session_longops to track its own progress.
Depending on what version of Oracle you're using, you might also be able to use the v$sql_monitor view to monitor SQL in real time.