Why does my cost grow when I use Oracle statistics? - oracle

I have a big query over four tables and I want to optimize it.
The weird part is that when I get the execution plan without statistics, it shows a cost of something like 1.2M. However, if I gather statistics on one of the tables involved in the query, the cost drops to 4k. But if I gather statistics on the other tables as well, the cost grows to 50k, so I am not sure what's happening.
Can anyone explain why giving the optimizer more statistics actually increases the query cost?

The Cost Based Optimiser uses as much information as you can give it in order to calculate the cost of a plan. If you update (i.e. change) the statistics it uses, then obviously that will change the calculated cost of the plan.
It's not actually the gathering of stats that causes the cost to grow - it's how those stats have changed (whether up or down) that causes the calculated cost to change.
In the absence of statistics, Oracle may use heuristics, guesswork or a quick sample of the data (depending on the settings in your instance).
Generally, the better (more accurate or representative) the statistics, the more accurate the cost calculation.
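For example (a sketch; the schema and table names are placeholders), gathering stats on one table and re-explaining the query shows you the cost the CBO now calculates:

begin
  -- placeholder owner/table; cascade => TRUE also gathers index statistics
  dbms_stats.gather_table_stats(
    ownname          => 'MY_SCHEMA',
    tabname          => 'MY_TABLE',
    cascade          => TRUE,
    estimate_percent => dbms_stats.auto_sample_size);
end;
/

explain plan for
  select count(*) from my_schema.my_table;   -- stand-in for the real four-table join

select * from table(dbms_xplan.display);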

The cost based optimizer has its challenges. There are rounding errors that can have quite an impact on the decisions it makes. This is one of the reasons that SQL plan stability (plan baselines), introduced in 11g, is so nice. Forget about 10g, if you can, or prepare for long debugging sessions.
On first use, a plan is generated based on the current statistics and executed. If the SQL is repeated, the SQL and the plan are stored in a baseline. In the maintenance window, the most expensive plans are re-evaluated and in many cases a better plan can be provided. This is possible because at runtime the optimizer is limited in the time it is given to search for a plan; in the maintenance window, a lot more time can be spent finding the best plan.
In 11g bind variable peeking is also fixed, and a single SQL statement can now have multiple plans, based on the values of the bind variables.
The query cost is based on many factors, of which IO is a very important one.
How are your tables filled, and where are the high water marks located? A table that is filled and emptied constantly can have its high water mark far beyond the data it actually holds....
There are lots of bugs in the optimizer and lots of options, controlled by hidden parameters. You could try to use them to tweak the behaviour. Upgrading to 11g might be a lot smarter, as it solves lots of performance problems for many applications.
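If you are on 11g and want to see which baselines the database has captured for your statement, a query along these lines (just a sketch against the standard dba_sql_plan_baselines view; the LIKE filter is a placeholder) is a reasonable starting point:

select sql_handle, plan_name, enabled, accepted, fixed, created
from   dba_sql_plan_baselines
where  sql_text like '%your query text here%'
order by created;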

Related

Random spikes in usage (CockroachCloud Serverless)

I recently set up a free CockroachDB Serverless cluster on CockroachCloud. It's been really great so far, but sometimes there are random spikes in Request Units even though the amount of SQL statements doesn't increase at all. Here's a screenshot of the two graphs in the cluster management page, it illustrates pretty well what I mean. I would really appreciate some help on how I could eliminate these spikes because CockroachCloud has some limits on free usage. That being said, I'm still fairly new to CockroachDB, so I might be missing something obvious.
You are likely performing enough mutations on your data to trigger automatic statistics collection as a background process. By default when 20% or more rows are modified in a table, CockroachDB will trigger a statistics refresh. The statistics are used by the optimizer to create more efficient query plans.
Your SQL Statements graph indicates that almost all your operations are inserts. That many inserts is almost certainly triggering stats collection. While you can turn off stats collection, the optimizer will then be using stale data to calculate query plans, potentially causing performance problems.
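If you would rather tune this behaviour than switch it off, the relevant knobs are cluster settings (a sketch; check the docs for your CockroachDB version before changing them):

-- Show the current automatic statistics setting
SHOW CLUSTER SETTING sql.stats.automatic_collection.enabled;

-- Raise the staleness threshold from the default 20% so refreshes fire less often
SET CLUSTER SETTING sql.stats.automatic_collection.fraction_stale_rows = 0.5;

-- Or disable automatic collection entirely (not recommended; plans will go stale)
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false;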
The occasional spikes in your Request Unit graph are above the 100 RUs per second baseline, but the rest of the time you are well below 100 RUs per second. That means you are accumulating RUs most of the time, and that (plus the initial 10 million RU allocation) should cover the bursts.
I added a FAQ entry to the Serverless docs covering this.

Dissecting a Function for speed

I wanted to know if there is a way to measure the performance of a function, in parts.
I know that you are able to measure the total time it takes to complete the function, but is there a way to measure the individual queries within a function?
Just wanted to know because I cannot find the bottleneck in my function's performance.
Most of the time when you see a major difference between the estimated and the actual execution plans, it is because your statistics have never been updated. SQL Server therefore has no idea which tables have little data, which ones are huge, and so on, and is more likely to generate bogus plans (both estimated and actual), or to miscalculate estimated plan costs. The actual plan is based on real, accurate costs, but when the plan is very far from an optimal one, this accuracy is of very little value for finding bottlenecks.
To correct this, issue the UPDATE STATISTICS statement or execute the sp_updatestats procedure.
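For example (the table name is a placeholder):

-- Refresh statistics for one table, scanning all of its rows
UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;

-- Or refresh every out-of-date statistic in the current database
EXEC sp_updatestats;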
Seeing 100% actuals for your function might well be an effect of an empty or almost empty database, regardless of whether you have up-to-date statistics or not.
When optimizing for performance, make sure that your database is populated quasi-realistically with lots of data (put twice as many records into each table as you expect in production, but do maintain the expected rough proportions). There is not much point in looking for a performance bottleneck using an empty or a disproportionately overblown database; the query plans will be different, and even if a plan happens to be the same, the bottleneck may be elsewhere than in production.

How do I correctly performance test SELECT queries with Oracle?

I would like to test two queries to find out their performance, as opposed to just looking at the execution plan. I have seen Tom Kyte do this all the time on his website as a way to gather evidence for his theories.
I believe there are many pitfalls in performance testing. For example, when I run a query in SQL Developer for the first time, it might take a fair amount of time; running that exact same query again returns instantaneously. There must be some sort of caching on the server or client going on, and I understand this is important - however I am only interested in non-cached performance.
What are the guidelines for performance testing? And how do I write a performance test which repeats the query? Do I just write an anonymous block and loop? How do I get timing information, averages, medians, standard deviations?
Oracle (and other databases) cache queries, which is where you see the behavior you describe. A "hard" parse means there's no query plan for the query, which leaves Oracle to figure out the query plan based on indexes and statistics. A "soft" parse is what happens when you run the identical query afterwards, and receive an instantaneous result, because the query plan exists & Oracle re-uses it. See the Ask Tom question about it for more details.
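If you genuinely need to measure the non-cached case, you can force hard parses and cold reads on a test instance (a sketch; don't do this on a shared or production system, and note it will not clear the OS or storage-level caches):

-- throw away cached cursors and plans, so the next execution is a hard parse
alter system flush shared_pool;

-- empty the database buffer cache, so blocks are re-read from disk (10g and later)
alter system flush buffer_cache;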
Be aware of the EXPLAIN output:
With the cost-based optimizer, execution plans can and do change as the underlying costs change. EXPLAIN PLAN output shows how Oracle runs the SQL statement when the statement was explained. This can differ from the plan during actual execution for a SQL statement, because of differences in the execution environment and explain plan environment.
Focusing on the non-cached performance gives a worst-case scenario, but given that caching will occur - non-cached benchmarks aren't realistic in everyday use.
To build off OMG Ponies' answer, tuning based on timing is something that's possible, but not realistic. You'd have to start either with a fully-cached buffer cache in every case, or a fully-empty buffer cache, and neither of those is going to be representative of reality - especially if there's no competing load.
When I'm tuning, it's generally against a live system with activity, and I focus on tuning logical I/Os, either through using the extended SQL trace (dbms_monitor.session_trace_enable / dbms_monitor.session_trace_disable) and the tkprof utility, or using SQL*Plus and set autotrace traceonly - which does all the work of the query, but throws the output away, because I'm usually not interested in watching a jillion rows scroll by.
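The trace variant looks roughly like this (the trace-file name is whatever your instance generates under user_dump_dest; format the raw trace with tkprof afterwards):

exec dbms_monitor.session_trace_enable(waits => TRUE, binds => TRUE)

-- ...run the query you are tuning here...

exec dbms_monitor.session_trace_disable

-- then at the OS prompt, something like:
--   tkprof ORCL_ora_12345.trc myquery.prf sys=no sort=prsela,exeela,fchela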
The exact mechanism usually involves bound SQL, using something like the following:
-- declare SQL*Plus bind variables (no colon in the VARIABLE command itself)
variable my_bind1 number
variable my_bind2 varchar2(30)

-- populate them from a PL/SQL block, where the colon prefix is required
begin
  :my_bind1 := 42;
  :my_bind2 := 'some meaningful string';
end;
/

set timing on
set autotrace traceonly

[godawful query with binds]

set autotrace off
Within the results, I'm looking for the plan I'd expect, a comparative value for sorts - assuming any exist - and most importantly, the number of consistent gets. That's how many blocks Oracle had to read in consistent mode to satisfy the query. I can't find the original source of the quote, but I think it's Cary Millsap of Method R.
"Tune your logical I/Os, and your physical I/Os will follow."
In performance tuning, if the only piece of data you look at is wall-clock time, you will only be getting a small part of the whole picture. You need to at least look at the execution plan, as well as IO stats, in order to work out how best to tune the query.
Also, you need to eliminate other causes of performance issues - e.g. if there is a general performance issue across many queries, it might not be the fault of just one of them - it might be an architecture problem, or significant concurrent activity on the database, or even an underlying hardware issue.
I've had similar issues to what you describe before; e.g. a certain type of query which should be very fast was taking 30 seconds to run on the first time, then would settle down to a second or two. As soon as I looked at the execution plan, however, it was obvious that it was using a full table scan, because it couldn't use the unique index that had been created. The first time the query ran, most of the data was loaded into the cache (in fact, there were two levels of cache involved - the database buffer cache, as well as a storage-level cache over the disks) so subsequent full table scans were extremely fast.
What does "correctly" mean?
Since 11g there are a few extra complications to take into account. Bind variable peeking has become a lot smarter, and SQL plan stability has a BIG influence. These two features make the database self-tuning, but they can also have unexpected effects during performance tests, for example because not all variations of the plans are known and accepted at the beginning of the tests.
This can be why a second test run, the day after the first run, suddenly runs much quicker, without any apparent changes.
Since 11g, performance testing is less important compared to writing logically correct code. For example, a Cartesian product followed by filtering out one distinct value can be functionally correct but is in most cases wrong code, because it fetches more data than is logically needed.
If the query fetches only the data that is really needed and sits in the correct control structure, let the database processes tune the code during the maintenance windows. In many cases the differences between the test environment and production are such that a comparison cannot be safely made.
Don't get me wrong, testing is still important, but mostly for the logic; compared to performance testing before 11g, there are extra steps to be taken.
For nice reading see Oracle® Database 2 Day + Performance Tuning Guide 11g Release 2 (11.2)

Oracle: Difference in execution plans between databases

I am comparing queries between my development and production databases.
They are both Oracle 9i, but almost every single query has a completely different execution plan depending on the database.
All tables/indexes are the same, but the dev database has about 1/10th the rows for each table.
On production, the execution plan it picks for most queries is different from development, and the cost is sometimes 1000x higher. Queries on production also do not seem to be using the correct indexes in some cases (full table access).
I have run dbms_utility.analyze_schema on both databases recently as well, in the hope that the CBO would figure something out.
Is there some other underlying oracle configuration that could be causing this?
I am mostly a developer, so this kind of DBA analysis is fairly confusing at first.
1) The first thing I would check is whether the database parameters are equivalent across Prod and Dev. If one of the parameters that affects the decisions of the Cost Based Optimizer is different, then all bets are off. You can see the parameters in the v$parameter view.
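Comparing them can be as simple as running something like this on both instances and diffing the output (a sketch; add any other parameters you care about to the list):

select name, value
from   v$parameter
where  name like 'optimizer%'
   or  name in ('cursor_sharing', 'db_file_multiblock_read_count')
order by name;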
2) Having up-to-date object statistics is great, but keep in mind the large difference you pointed out - Dev has 10% of the rows of Prod. This row count is factored into how the CBO decides the best way to execute a query. Given the large difference in row counts I would not expect the plans to be the same.
Depending on the circumstance, the optimizer may choose to full-table-scan a table with 20,000 rows (Dev) where it may decide an index is lower cost on the table that has 200,000 rows (Prod). (Numbers are just for demonstration; the CBO uses costing algorithms to determine what to full scan and what to index scan, not absolute values.)
3) System statistics also factor into the explain plans. This is a set of statistics that represents CPU and disk I/O characteristics. If the hardware on the two systems is different, then I would expect your system statistics to be different, and this can affect the plans. Some good discussion from Jonathan Lewis here
You can view system stats via the sys.aux_stats$ view.
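For example (requires access to the SYS-owned table, so run it with DBA privileges):

select pname, pval1
from   sys.aux_stats$
where  sname = 'SYSSTATS_MAIN'
order by pname;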
Now I'm not sure why different plans are a bad thing for you... if stats are up to date and parameters set correctly you should be getting decent performance from either system no matter what the difference in size...
but it is possible to export statistics from your Prod system and load them into your Dev system. This makes your Prod statistics available to your Dev database.
Check the Oracle documentation for the DBMS_STATS package, specifically the EXPORT_SCHEMA_STATS, EXPORT_SYSTEM_STATS, IMPORT_SCHEMA_STATS, IMPORT_SYSTEM_STATS procedures. Keep in mind you may need to disable the 10pm nightly statistics jobs on 10g/11g... or you can investigate Locking statistics after import so they are not updated by nightly jobs.
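A rough outline of the export/import, with placeholder schema and stat-table names (the staging table itself gets moved between databases with exp/imp, Data Pump or a database link):

-- On Prod: stage the schema statistics into a regular table
exec dbms_stats.create_stat_table(ownname => 'APP', stattab => 'PROD_STATS')
exec dbms_stats.export_schema_stats(ownname => 'APP', stattab => 'PROD_STATS')

-- ...copy the APP.PROD_STATS table to Dev...

-- On Dev: load the statistics and optionally lock them against the nightly job
exec dbms_stats.import_schema_stats(ownname => 'APP', stattab => 'PROD_STATS')
exec dbms_stats.lock_schema_stats(ownname => 'APP')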

Is there a major performance gain by using stored procedures?

Is it better to use a stored procedure or doing it the old way with a connection string and all that good stuff? Our system has been running slow lately and our manager wants us to try to see if we can speed things up a little and we were thinking about changing some of the old database calls over to stored procedures. Is it worth it?
The first thing to do is check that the database has all the necessary indexes set up. Analyse where your code is slow, and examine the relevant SQL statements and the indexes relating to them. See if you can rewrite the SQL statements to be more efficient. Check that you aren't re-preparing a SQL (prepared) statement on every iteration of a loop instead of preparing it once outside the loop.
Moving an SQL statement into a stored procedure isn't going to help if it is grossly inefficient in implementation. However the database will know how to best optimise the SQL and it won't need to do it repeatedly. It can also make the client side code cleaner by turning a complex SQL statement into a simple procedure call.
I would take a quick look at Stored Procedures are EVIL.
So long as your calls are consistent, the database will store the execution plan (MS SQL anyway). The strongest remaining reason for using stored procedures is easy and reliable security management.
If I were you I'd first look at adding indexes where required. Also run a profiling tool to examine what is taking so long and whether that SQL needs to be changed, e.g. adding more WHERE clauses or restricting the result set.
You should consider caching where you can.
Stored procedures will not make things faster.
However, rearranging your logic will have a huge impact. The tidy, focused transactions that you design when thinking of stored procedures are hugely beneficial.
Also, stored procedures tend to use bind variables, where other programming languages sometimes rely on building SQL statements on-the-fly. A small, fixed set of SQL statements and bind variables is fast. Dynamic SQL statements are slow.
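To illustrate in PL/SQL terms (the orders table, column and variables are made up for the example): the first statement embeds a literal, so every distinct value produces a new statement and a new hard parse; the second uses a bind variable, so one shared statement is reused.

declare
  l_cnt     number;
  p_cust_id number := 42;   -- stand-in for a value coming from the application
begin
  -- slow: literal concatenation, a different SQL text for every customer id
  execute immediate
    'select count(*) from orders where customer_id = ' || p_cust_id
    into l_cnt;

  -- fast: one SQL text, the value supplied as a bind variable
  execute immediate
    'select count(*) from orders where customer_id = :1'
    into l_cnt
    using p_cust_id;
end;
/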
An application which is "running slow lately" does not need coding changes.
Measure. Measure. Measure. "slow" doesn't mean much when it comes to performance tuning. What is slow? Which exact transaction is slow? Which table is slow? Focus.
Control all change. All. What changed? OS patch? RDBMS change? Application change? Something changed to slow things down.
Check for constraints in scale. Is a table slowing down because 80% of the data is history that you use for reporting once a year?
Stored procedures are never the solution to performance problems until you can absolutely point to a specific block of code which is provably faster as a stored procedure.
Stored procedures can really help if they avoid sending huge amounts of data and/or avoid doing round trips to the server, so they can be valuable if your application has one of these problems.
After you finish your research you will realize there are two extreme views at opposite ends of the spectrum. Historically the Java community has been against stored procs due to the availability of frameworks such as Hibernate; conversely, the .NET community has used more stored procs, and this legacy goes as far back as the VB5/6 days. Put all this information in context and stay away from the extreme opinions on either side of the coin.
Speed should not be the primary factor in deciding for or against stored procs. You can achieve comparable performance using inline SQL with Hibernate and other frameworks. Consider maintenance, and which other programs (reports, scripts) could use the same stored procs as your application. If your scenario requires multiple consumers of the same SQL code, stored procedures are a good candidate and maintenance will be easier. If that is not the case, and you decide to use inline SQL, consider externalizing it in config files to facilitate maintenance.
At the end of the day, what counts is what will make your particular scenario a success for your stakeholders.
If your server is getting noticeably slower in your busy season, it may be because of saturation rather than anything inefficient in the database. Basic queuing theory tells us that a server gets hyperbolically slower as it approaches saturation.
The basic relationship is 1/(1-X) where X is the proportion of load. This describes the average queue length or time to wait before being served. Therefore a server that is getting saturated will slow down very rapidly when the load spikes.
A server that is 25% loaded will have an average service time of 1.333K for some constant K (loosely, K is the time for the machine to perform one transaction). A server that is 50% loaded will have an average service time of 2K and a server that is 90% loaded will have an average service time of 10K. Given that the slowdowns are hyperbolic in nature, it often doesn't take a large change in overall load to produce a significant degradation in response time.
Obviously this is somewhat simplistic as the server will be processing multiple requests concurrently (there are more elaborate queuing models for this situation), but the broad principle still applies.
So, if your server is experiencing transient loads that are saturating it, you will experience patches of noticeable slow-down. Note that these slow-downs need only be in one bottlenecked area of the system to slow the whole process down. If you are only experiencing this now in a busy season there is a possibility that your server has simply hit a constraint on a resource, rather than being particularly slow or inefficient.
Note that this possibility is not antithetical to the possibility of inefficiencies in the code. You may find that the way to ease the bottleneck is to tune some of your queries.
In order to tell if the system is bottlenecked, start gathering profiling information. If you can find resources with a large number of waits, this should give you a good starting point.
The final possibility is that you need to upgrade your server. If there are no major inefficiencies in the code (this might well be the case if profiling doesn't indicate any disproportionately large bottlenecks) you may simply need bigger hardware. I have no idea what your volumes are, but don't discount the possibility that you may have outgrown your server.
Yes, stored procs are a step towards achieving good performance. The main reason is that stored procedures can be pre-compiled and their execution plan cached.
You do, however, first need to analyse where your performance bottlenecks really are, so that you approach this exercise in a structured way.
As has been suggested in one of the responses, try to analyse where the problem is using a profiler tool - e.g. do you need to create indexes...
Cheers
Like all of the above posts suggest, you first want to clean up your SQL statements and have appropriate indexes. Caching can be tricky; I can't comment unless I have more detail on what you are trying to accomplish.
But one thing about sprocs: make sure you don't let them generate dynamic SQL statements, because for one it will be pointless, and it can be subject to SQL injection attacks... this has happened in one of the projects I looked into.
I would recommend sprocs for updates mainly, and then select statements.
good luck :)
You can never say in advance. You must do it and measure the difference because in 9 out of 10 cases, the bottleneck is not where you think.
If you use a stored procedure, you don't have to transmit the data. DBs are usually slow at executing [EDIT]complex[/EDIT] stored procedures [EDIT]with loops, higher math, etc[/EDIT]. So it really depends on how much work you would need to do, how slow your network is, how fast the DB executes this particular code, etc.
