Setting workarea_size_policy to manual vs automatic - Oracle

I am working on a data warehouse system which was upgraded about a year ago to Oracle 10g (now 10.2.0.5).
The database is set up with workarea_size_policy=auto and pga_aggregate_target=1G. Most of the ETL process is written in PL/SQL and this code generally sets workarea_size_policy=manual and sets the SORT_AREA_SIZE and HASH_AREA_SIZE for particular sessions when building specific parts of the warehouse.
The values chosen for the SORT_AREA_SIZE and HASH_AREA_SIZE are different for different parts of the build. These sizes are probably based on the expected amount of data that will be processed in each area.
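Roughly, the per-session overrides issued by the ETL code look like this (the real sizes differ per build step; the values here are purely illustrative):

    BEGIN
      -- Switch this session to manual work area sizing (illustrative values only)
      EXECUTE IMMEDIATE 'ALTER SESSION SET workarea_size_policy = MANUAL';
      EXECUTE IMMEDIATE 'ALTER SESSION SET sort_area_size = 209715200';  -- ~200 MB
      EXECUTE IMMEDIATE 'ALTER SESSION SET hash_area_size = 209715200';  -- ~200 MB
    END;
    /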
The problem I am having is that this code is starting to cause a number of ORA-600 errors to occur. It is making me wonder if we should even be overriding the automatic settings at all.
The code that sets the manual settings was written many years ago by a developer who is no longer here. It was probably originally written for Oracle 8 with an amendment for Oracle 9 to set the workarea_size_policy to manual. No one really knows how the values used for HASH_AREA_SIZE and SORT_AREA_SIZE were found. They could be completely inappropriate for all I know.
After that long preamble, I've got a few questions.
How do I know when (if ever) I should be overriding the automatic settings with workarea_size_policy=manual?
How do I find appropriate values for HASH_AREA_SIZE, SORT_AREA_SIZE, etc?
How do I benchmark whether particular settings are actually providing any benefit?
I'm aware that this is a pretty broad question but help would be appreciated.

I suggest you comment out the manual settings and do a test run only with automatic (dynamic) settings, like PGA_AGGREGATE_TARGET.
Management of Sort and Hash memory areas has improved a lot since Oracle 8!
It's hard to predetermine the memory requirements of your procedures, so the best approach is to test them with representative volumes of data and see how it goes.
You can then create an AWR report covering the timeframe of the execution of the procedures. There's a section in the report named PGA Memory Advisory. That will tell you if you need more memory assigned to PGA_AGGREGATE_TARGET, based on your current data volumes.
(A sample PGA Memory Advisory section from an AWR report was shown here in the original answer.)
In that sample you can clearly see that there's no need to go over the current 103 MB assigned, and you could actually stay at 52 MB without impacting the application.
Depending on the volumes we're talking about, if you can't assign more memory, some sort or hash operations might spill to a temporary tablespace, so make sure you have a properly sized one, ideally spread across as many disks / volumes as possible (see Oracle's SAME, Stripe And Mirror Everything, configuration).
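In addition to the AWR report, you can query the advisor views directly. A minimal sketch (both views exist on 10g; I've trimmed the column lists):

    -- How would the workload have behaved at other PGA_AGGREGATE_TARGET sizes?
    SELECT pga_target_for_estimate / 1024 / 1024 AS target_mb,
           pga_target_factor,
           estd_pga_cache_hit_percentage,
           estd_overalloc_count
    FROM   v$pga_target_advice
    ORDER  BY pga_target_factor;

    -- Are sort/hash work areas running optimal, one-pass or multi-pass?
    SELECT low_optimal_size, high_optimal_size,
           optimal_executions, onepass_executions, multipasses_executions
    FROM   v$sql_workarea_histogram
    ORDER  BY low_optimal_size;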

Related

How to test tuned Oracle SQL and how to clear the system/hardware buffer?

I want to know the right way to test SQL statements before and after tuning.
The problem is that once I've executed the original SQL, the results come back too fast to make a fair comparison with the tuned SQL.
I found this question:
How to clear all cached items in Oracle
I flushed the data buffer cache and the shared pool, but it still didn't help.
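The flushes were essentially the standard commands (something like this):

    ALTER SYSTEM FLUSH BUFFER_CACHE;
    ALTER SYSTEM FLUSH SHARED_POOL;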
I think this answer from that question is closest to what I want to know:
Keep in mind that the operating system and hardware also do caching which can skew your results.
The Oracle version is 11g and the server is HP-UX 11.31.
If the server were Linux, I could have tried clearing the OS buffer cache using '/proc/sys/vm/drop_caches' (though I'm not sure that would work).
I've been searching for a solution to this for quite a long time. Has anyone else run into this kind of problem?
Thanks
If your query is such that the results are being cached in the file system, which your description suggests, then the query is probably not a "heavy hitter" overall. But if you were testing in isolation, with little other activity on the database, performance could still suffer when the SQL is run in a busier production environment.
There are several things you can do to determine which of two versions of a query is better. In fact, entire books have been written on just this topic. But to summarize:
Before you begin, ensure statistics on the tables and indexes are up to date.
See how often the SQL will be executed in the grand scheme of things. If it runs once or twice a day, and takes 2 seconds to run, don't bother trying to tune.
Do an explain plan on both and look at the estimated costs and number of steps.
Turn on tracing for both optimizer steps and execution statistics, and compare.
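A minimal sketch of the last two steps in SQL*Plus (the table, column and bind names here are just placeholders for your own queries):

    VARIABLE cust_id NUMBER
    EXEC :cust_id := 42

    -- 1. Compare estimated plans and costs
    EXPLAIN PLAN FOR
      SELECT /* candidate_1 */ * FROM orders WHERE customer_id = :cust_id;
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

    -- 2. Trace actual execution statistics and waits, then compare
    --    the resulting trace files with tkprof
    EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(waits => TRUE, binds => TRUE)
    SELECT /* candidate_1 */ * FROM orders WHERE customer_id = :cust_id;
    SELECT /* candidate_2 */ * FROM orders WHERE customer_id = :cust_id;
    EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE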

Evisions Argos/Oracle performance

I have a report in Evisions Argos that runs in 1-2 minutes on one server and for almost an hour on another. It is the exact same datablock with the exact same code, just running against a different database.
I won't display the script here; I just want to know what kinds of things our Argos Admin should check for (I don't have access to the server, I'm just a DataBlock Designer).
Oracle performance tuning is a vast topic. There are people out there who make (very good) livings out of tuning other people's queries. So you really aren't going to get much joy here.
But the general advice is actually pretty obvious: if the same query performs differently in two different environments, the cause must be some difference between them. For example:
Volumes of data (number of rows)
Distribution of data (development or test data may have different characteristics from live data)
Data structures (indexes, materialized views)
Stale statistics (a quick check for this is sketched after this list)
Hardware - amount of RAM, quality of IO, size and type of storage.
Resource contention (more users, different types of users e.g. long running reports)
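For the stale-statistics item, for example, a quick check your DBA could run in both environments might look like this (a sketch; it assumes access to the DBA_ views, and 'APP_SCHEMA' is a placeholder):

    -- Compare row counts, last analyze dates and staleness in each environment
    SELECT table_name, num_rows, last_analyzed, stale_stats
    FROM   dba_tab_statistics
    WHERE  owner = 'APP_SCHEMA'
    ORDER  BY table_name;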
There are all sorts of tools you can use to diagnose performance problems. These depend on which version of the database you have, which edition you've licensed and also whether you have any of the chargeable options.
But regardless, the first place to start is with the documentation.

ROWDEPENDENCIES Overhead in Oracle

I'm experimenting with a change capture architecture for ETL processing that is based on ora_rowscn, and have rebuilt the source tables with ROWDEPENDENCIES to isolate the SCN's to only those rows modified (as opposed to block-level tagging). I'm aware of the extra 6 bytes/row of space overhead, but it is not obvious to me what other impact this would have.
My question: What would be the extra work the RDBMS engine would do with rowdependencies enabled for commits and rollbacks? For my source tables with 100 to 500 rows/block I realize I must be writing 100-500x the number of SCN's (for our typical commits), but are there other side effects I'm missing?
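For reference, a minimal sketch of the kind of setup being described (the table and column names are made up):

    -- ROWDEPENDENCIES must be declared when the table is created/rebuilt
    CREATE TABLE src_orders (
      order_id NUMBER PRIMARY KEY,
      status   VARCHAR2(20)
    ) ROWDEPENDENCIES;

    -- Incremental extract: pick up only rows changed since the last run
    SELECT order_id, status, ora_rowscn
    FROM   src_orders
    WHERE  ora_rowscn > :last_extracted_scn;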
Oracle introduced ROWDEPENDENCIES as part of a set of changes to optimize replication. It seems unlikely that they would have gone ahead with it if it had a significant impact on performance. Certainly I haven't read of anything.
The inestimable Tom Kyte discusses using ROWDEPENDENCIES in one of his books, without any warnings or caveats (beyond mentioning the six bytes). If there were other gotchas I'm sure he would have said so.

What makes Oracle more scalable?

Oracle seems to have a reputation for being more scalable than other RDBMSes. After working with it a bit, I can say that it's more complex than other RDBMSes, but I haven't really seen anything that makes it more scalable than other RDBMSes. But then again, I haven't really worked on it in a whole lot of depth.
What features does Oracle have that are more scalable?
Oracle's RAC architecture is a big part of what makes it scalable: it can load balance across nodes, and parallel queries can be split up and pushed to other nodes for processing.
Tricks like loading blocks from another node's buffer cache instead of going to disk make performance a lot more scalable.
Also, the maintainability of RAC with rolling upgrades helps make the operation of a large system more sane.
There is also a different aspect of scalability: storage scalability. ASM makes increasing the storage capacity very straightforward. A well-designed, ASM-based solution should scale past hundreds of terabytes without needing to do anything very special.
Whether these make Oracle more scalable than other RDBMSs, I don't know. But I think I would feel less happy about trying to scale up a non-Oracle database.
Cursor sharing is (or was) a big advantage over the competition.
Basically, the same query plan is used for matching queries. An application will have a standard set of queries it issues (e.g. get the orders for this customer id). The simple way is to treat every query individually, so if you see 'SELECT * FROM ORDERS WHERE CUSTOMER_ID = :b1', you look at whether table ORDERS has an index on CUSTOMER_ID, and so on. As a result, you can spend as much work looking up metadata to get a query plan as actually retrieving the data. With simple keyed lookups, a query plan is easy. Complex queries with multiple tables joined on skewed columns are harder.
Oracle has a cache of query plans, and older/less used plans are aged out as new ones are required.
If you don't cache query plans, there's a limit to how smart you can make your optimizer as the more smarts you code into it, the bigger impact you have on each query processed. Caching queries means you only incur that overhead the first time you see the query.
The 'downside' is that for cursor sharing to be effective you need to use bind variables. Some programmers don't realise that, write code that doesn't get shared, and then complain that Oracle isn't as fast as MySQL.
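A sketch of the difference in PL/SQL (the table and variable names are made up):

    DECLARE
      p_customer_id NUMBER := 42;
      l_count       NUMBER;
    BEGIN
      -- Not shareable: every customer id produces a distinct SQL text,
      -- so each one is hard parsed and cached separately.
      EXECUTE IMMEDIATE
        'SELECT COUNT(*) FROM orders WHERE customer_id = ' || p_customer_id
        INTO l_count;

      -- Shareable: one SQL text with a bind variable; the cached plan is reused.
      EXECUTE IMMEDIATE
        'SELECT COUNT(*) FROM orders WHERE customer_id = :b1'
        INTO l_count
        USING p_customer_id;
    END;
    /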
Another advantage of Oracle is the UNDO log. As a change is made, the 'old version' of the data is written to an undo log. Other databases keep old versions of the record in the same place as the record. This requires VACUUM-style cleanup operations, or you bump into space and organisation issues. This is most relevant in databases with high update or delete activity.
Also, Oracle doesn't have a central lock registry; a lock bit is stored on each individual data record, and SELECT doesn't take a lock. In databases where SELECT does lock, multiple users reading data can lock each other out or prevent updates, which introduces scalability limits. Those databases lock a record when a SELECT is done to ensure that no one else can change that data item (so it stays consistent if the same query or transaction looks at the table again). Oracle instead uses UNDO for its read consistency model (i.e. looking up the data as it appeared at a specific point in time).
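The same undo mechanism is what makes flashback queries possible, for example (table name made up):

    -- Read the rows as they looked 15 minutes ago, reconstructed from undo
    SELECT order_id, status
    FROM   orders AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
    WHERE  customer_id = 42;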
Tom Kyte's "Expert Oracle Database Architecture" from Apress does a good job of describing Oracle's architecture, with some comparisons with other RDBMSs. Worth reading.

Is there a major performance gain by using stored procedures?

Is it better to use a stored procedure, or to do it the old way with a connection string and all that good stuff? Our system has been running slow lately and our manager wants us to see if we can speed things up a little, so we were thinking about changing some of the old database calls over to stored procedures. Is it worth it?
The first thing to do is check that the database has all the necessary indexes set up. Analyse where your code is slow, and examine the relevant SQL statements and the indexes relating to them. See if you can rewrite the SQL statements to be more efficient. Check that you aren't recompiling an SQL (prepared) statement for every iteration of a loop instead of preparing it once outside the loop.
Moving an SQL statement into a stored procedure isn't going to help if it is grossly inefficient in implementation. However, the database will know how best to optimise the SQL and won't need to do so repeatedly. It can also make the client-side code cleaner by turning a complex SQL statement into a simple procedure call.
I would take a quick look at Stored Procedures are EVIL.
So long as your calls are consistent, the database will store and reuse the execution plan (MS SQL anyway). The strongest remaining reason for using stored procedures is easy and reliable security management.
If I were you, I'd first look at adding indexes where required. Also run a profiling tool to examine what is taking so long and whether that SQL needs to be changed, e.g. by adding more WHERE clauses or restricting the result set.
You should consider caching where you can.
Stored procedures will not make things faster.
However, rearranging your logic will have a huge impact. The tidy, focused transactions that you design when thinking of stored procedures are hugely beneficial.
Also, stored procedures tend to use bind variables, where other programming languages sometimes rely on building SQL statements on-the-fly. A small, fixed set of SQL statements and bind variables is fast. Dynamic SQL statements are slow.
An application which is "running slow lately" does not need coding changes.
Measure. Measure. Measure. "slow" doesn't mean much when it comes to performance tuning. What is slow? Which exact transaction is slow? Which table is slow? Focus.
Control all change. All. What changed? OS patch? RDBMS change? Application change? Something changed to slow things down.
Check for constraints in scale. Is a table slowing down because 80% of the data is history that you use for reporting once a year?
Stored procedures are never the solution to performance problems until you can absolutely point to a specific block of code which is provably faster as a stored procedure.
Stored procedures can really help if they avoid sending huge amounts of data and/or avoid round trips to the server, so they can be valuable if your application has one of these problems.
After you finish your research you will realize there are two extreme views at opposite ends of the spectrum. Historically the Java community has been against stored procs due to the availability of frameworks such as Hibernate; conversely, the .NET community has used stored procs more, a legacy that goes back as far as the VB5/6 days. Put all this information in context and stay away from the extreme opinions on either side of the coin.
Speed should not be the primary factor in deciding for or against stored procs. You can achieve stored-procedure performance using inline SQL with Hibernate and other frameworks. Consider maintenance, and which other programs (such as reports or scripts) could use the same stored procs as your application. If your scenario requires multiple consumers of the same SQL code, stored procedures are a good candidate and maintenance will be easier. If this is not the case and you decide to use inline SQL, consider externalizing it in config files to facilitate maintenance.
At the end of the day, what counts is what will make your particular scenario a success for your stakeholders.
If your server is getting noticeably slower in your busy season, it may be because of saturation rather than anything inefficient in the database. Basic queuing theory tells us that a server gets hyperbolically slower as it approaches saturation.
The basic relationship is 1/(1-X), where X is the proportion of load (the utilisation). Loosely, this is the factor by which queue lengths and response times stretch as the load rises. Therefore a server that is getting saturated will slow down very rapidly when the load spikes.
A server that is 25% loaded will have an average response time of about 1.33K for some constant K (loosely, K is the time for the machine to perform one transaction when otherwise idle). A server that is 50% loaded will have an average response time of 2K, and a server that is 90% loaded will have an average response time of 10K. Given that the slowdown is hyperbolic, it often doesn't take a large change in overall load to produce a significant degradation in response time.
Obviously this is somewhat simplistic as the server will be processing multiple requests concurrently (there are more elaborate queuing models for this situation), but the broad principle still applies.
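For reference, this is the standard single-server (M/M/1) mean response time that the 1/(1-X) factor comes from:

    W \;=\; \frac{1}{\mu - \lambda} \;=\; \frac{1/\mu}{1-\rho} \;=\; \frac{K}{1-X},
    \qquad \text{where } \rho = X = \frac{\lambda}{\mu} \text{ and } K = \frac{1}{\mu}.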
So, if your server is experiencing transient loads that are saturating it, you will experience patches of noticeable slow-down. Note that these slow-downs need only be in one bottlenecked area of the system to slow the whole process down. If you are only experiencing this now in a busy season there is a possibility that your server has simply hit a constraint on a resource, rather than being particularly slow or inefficient.
Note that this possibility is not antithetical to the possibility of inefficiencies in the code. You may find that the way to ease the bottleneck is to tune some of your queries.
In order to tell if the system is bottlenecked, start gathering profiling information. If you can find resources with a large number of waits, this should give you a good starting point.
The final possibility is that you need to upgrade your server. If there are no major inefficiencies in the code (this might well be the case if profiling doesn't indicate any disproportionately large bottlenecks) you may simply need bigger hardware. I have no idea what your volumes are, but don't discount the possibility that you may have outgrown your server.
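If you're on Oracle, a first pass at finding the resources with the most waits might look something like this (a sketch; the view's columns vary slightly by version):

    -- Top non-idle wait events since instance startup
    SELECT *
    FROM  (SELECT event, total_waits, time_waited
           FROM   v$system_event
           WHERE  wait_class <> 'Idle'
           ORDER  BY time_waited DESC)
    WHERE rownum <= 10;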
Yes, stored procs are a step towards achieving good performance. The main reason is that stored procedures can be pre-compiled and their execution plans cached.
However, you first need to analyse where your performance bottlenecks really are, so that you approach this exercise in a structured way.
As suggested in one of the other responses, try using a profiling tool to analyse where the problem is, e.g. whether you need to create indexes.
Cheers
As all of the above posts suggest, you first want to clean up your SQL statements and have appropriate indexes. Caching can be tricky; I can't comment without more detail on what you are trying to accomplish.
But one thing about sprocs: make sure you don't let them generate dynamic SQL statements, because for one thing it defeats the purpose, and it can also be subject to SQL injection attacks. This has happened in one of the projects I looked into.
I would recommend sprocs for updates mainly, and then select statements.
good luck :)
You can never say in advance. You must do it and measure the difference because in 9 out of 10 cases, the bottleneck is not where you think.
If you use a stored procedure, you don't have to transmit the data. DBs are usually slow at executing complex stored procedures with loops, higher math, etc. So it really depends on how much work you would need to do, how slow your network is, how fast the DB executes this particular code, etc.
