How to start on query tuning? - oracle

I'm looking for advice on how to get started with Oracle tuning; I've never done it before.
I'd like to know how to do performance tuning of queries. We build applications for clients, so I don't have a production database here. I basically know which of our queries are important, and I want to make sure they will run fast.
In MSSQL it is pretty straightforward: when you enable the execution plan, it suggests additional indexes that would help the query. Is there anything similar in Oracle? Or perhaps another approach?

Performance tuning is a big subject. There are two basic types of approaches that people take: science or guesswork.
A. Guesswork (otherwise known as the shotgun approach)
1. Look at the query
2. Guess which columns need indexes that don't have them already *
3. Create the indexes and re-run the query
4. If it made no difference, go back to step 1 and repeat.
B. Science
1. Find out how fast the query "should" be.
2. Determine how fast the query actually is (under "realistic" conditions as far as possible).
3. If it's already fast enough, stop here. Otherwise:
4. Determine where most of the time is going - is it spinning or waiting?
5. If it's waiting, find out what it's waiting for, and eliminate that if possible.
6. If it's spinning, find out what it's doing that it doesn't need to do, and eliminate that if possible.
7. Go back to step 2 and repeat.
(* if there are no more indexes to create, look for something else to change - anything. e.g. change a few session or system settings, rebuild a table, etc.)
(*** in case it's not obvious, I'm not advocating approach A)
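As a rough illustration of approach B, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in so the example runs anywhere (an assumption; on Oracle you would inspect the plan with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY instead). The orders table, the idx_orders_customer index and the helper names are hypothetical. It measures the query and looks at the plan, makes one change, then measures again.

```python
import sqlite3
import time

# SQLite stands in for the real database; the table, column and index
# names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(100_000)],
)

QUERY = "SELECT id, amount FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return the optimizer's plan as one string (the last column of each row is the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def timed(sql, params, repeats=20):
    """Return the best wall-clock time over several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - start)
    return best


# Measure first, and see what the engine is actually doing.
plan_before = plan(QUERY, (42,))
time_before = timed(QUERY, (42,))

# Make one targeted change based on what the plan showed.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Measure again under the same conditions.
plan_after = plan(QUERY, (42,))
time_after = timed(QUERY, (42,))

print("before:", plan_before, f"{time_before:.6f}s")
print("after: ", plan_after, f"{time_after:.6f}s")
```

The point is the loop rather than the index: every change is justified by a plan or a measurement, and verified by another one.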

The first thing I would suggest: if you are not going to tune on the production database, import the table statistics from production into your dev database. The optimizer will then choose plans as if dev held production-like data. If you search for importing stats (the DBMS_STATS package) you will find plenty of articles. The next step is to look for queries where an index could help; a full table scan in the execution plan is generally a good place to start looking.
Do you use bind variables?
Oracle caches parsed statements in the shared pool, so use bind variables wherever you can. A statement whose text stays the same can reuse its cached cursor instead of being hard-parsed on every call, which saves parse time and lets results come back sooner.
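To make the bind-variable point concrete, here is a small Python sketch. It runs against an in-memory SQLite database as a stand-in (an assumption for runnability; the customers table and function name are hypothetical). The named :region placeholder style is also what Oracle uses for bind variables. The thing to notice is that the bound version sends one statement text for every value, while the literal version produces a new text each time.

```python
import sqlite3

# SQLite stands in for Oracle here; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO customers (id, region) VALUES (?, ?)",
    [(1, "EU"), (2, "US"), (3, "EU")],
)

# One statement text, reused for every value of :region.
SQL = "SELECT id FROM customers WHERE region = :region ORDER BY id"


def ids_in_region(region):
    """Run the shared statement with the value supplied as a bind variable."""
    rows = conn.execute(SQL, {"region": region}).fetchall()
    return [row[0] for row in rows]


eu_ids = ids_in_region("EU")
us_ids = ids_in_region("US")

# For comparison: embedding literals yields a distinct statement per value,
# which a statement cache cannot share.
regions = ("EU", "US")
literal_texts = {f"SELECT id FROM customers WHERE region = '{r}' ORDER BY id" for r in regions}
bound_texts = {SQL for _ in regions}

print(eu_ids, us_ids, len(literal_texts), len(bound_texts))
```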
Hope this helps.

Related

DB candidate as CouchDB/Schema replacement

The idea is to redesign the data structure and/or change the DB.
I have just started reviewing this project and plan to begin the optimization here.
Currently I have CouchDB with about 80 GB of document data, around 30M records.
For most of the documents, properties like id, group_id, location and type can be considered generic, but unfortunately for now they are stored under different property names across the set. There is also a lot of deep nesting.
The structure isn't strictly defined, which is why a NoSQL DB was selected long before the overall picture was clear.
Data is calculated and loaded into the DB by a separate job on a powerful cluster. This isn't done very often, so I conclude that general write/update performance isn't very important. A size decrease would be great, but isn't the most important thing either. There are only about 1-10 active customers at a time.
Read performance with various filtering, grouping, etc. is what matters most.
No heavy summary calculations need to be done at read time, though; those are already done during population.
This is a data analysis tool that displays comparison and other reports to quality engineers and data analysts, so they can browse the results and group or filter them from the web UI.
At the moment, tasks like searching a subset of document properties for a text aren't possible because of performance.
I've done some initial investigation (e.g. http://www.datastax.com/wp-content/themes/datastax-2014-08/files/NoSQL_Benchmarks_EndPoint.pdf), and Cassandra looks like a good choice among NoSQL databases.
It would also be quite interesting to try porting this data to the new PostgreSQL.
Any ideas would be highly appreciated :-)
Hello please check the following articles:
http://www.enterprisedb.com/nosql-for-enterprise
For me, PostgreSQL's json (and jsonb!) capabilities let you start schema-less and still have transactions, indexes, grouping and aggregate functions with very good performance, right from the start. When you are ready (and if needed), you can move to a schema with an internal data migration.
Also check:
https://www.compose.io/articles/is-postgresql-your-next-json-database/
Good luck

Microsoft Access equivalent of explain in MySQL

I'm working on a very large query in an inherited application. It is a large insert query that takes 4 tables with well over a million records. I know, I would also rather have this in SQL Server, but there is no infrastructure at this customer to do this :-)
This query has worked for over a year. However, the source tables keep on growing, and last week it threw the dreaded 'out of system resources' error. Bummer...!
I think it is possible to optimize this query. Working in MySQL, I would use the EXPLAIN command to see where optimization might be possible. Is there an equivalent of this in Access? I cannot seem to find it...
kind regards,
Paul
Probably Jet ShowPlan is closest to what you want. You will have to set a registry key. Then query plan information gets dumped to a text file named SHOWPLAN.OUT. You can read about the details in this article on TechRepublic: Use Microsoft Jet's ShowPlan to write more efficient queries
Also try the Performance Analyzer wizard. You can ask it to examine your query alone, or also ask it to examine the tables or other queries used by that query.
If you haven't compacted the database recently, see whether that improves performance. Compacting also updates index statistics, which allows the engine to make better decisions for the query plan.

Doesn't Read-Only make a difference for SQL Server?

I’ve been tasked with optimizing a rather nasty stored procedure in a legacy system. It’s a database dedicated to search, and a new copy is generated every day, with a lot of complex joins being de-normalized. No writes are being performed, only SELECTs, so I figured some easy improvements could be made by making the whole database read-only and changing the recovery model to “Simple”.
Much to my surprise, this didn’t help – at all! The stored procedure still takes the same amount of time to complete. In fact, I’m so surprised that I figured I did it wrong!
My questions:
Do I need to do anything other than setting “Database read-only” to “true”?
Am I wrong to expect significant performance improvement by making the database read-only?
Same for the recovery model: Shouldn’t “Simple” have some noticeable impact?
Are there other similar database-wide configurations that can improve performance in this scenario?
The stored procedure is huge, with temporary tables, 40+ tables joined in 20+ queries. But I’d like to optimize the database itself before I edit this proc.
Since no writes are performed by your SP, there is no reason to expect a noticeable performance improvement from changing the recovery model or the read-write mode.
As others mentioned, you should look into the query plan and optimize your queries.
Another hint: indexes in the database might get fragmented while the database is filled up. Since the data is not going to be modified any more, it might help to rebuild all the indexes with fillfactor 100 - this might help to get rid of fragmentation and to compact data.
Call this for each table in the database: ALTER INDEX ALL ON table_name REBUILD WITH (FILLFACTOR = 100).
Generally, I wouldn't expect much of a performance improvement from this, but it depends on the particular database.
Speaking of query optimization, there are very useful features in SQL Server 2005 and later: the execution-related and index-related Dynamic Management Views. In particular, sys.dm_exec_query_stats and the missing-index views (sys.dm_db_missing_index_details and its companions) are of interest.
These give you almost the same information as the Tuning Advisor, but based on your real-life workload, so you don't need to simulate it and feed it to the Advisor.
Have you tried using the Database Engine Tuning Advisor included in SQL Server? It will analyze your query and suggest new indexes that will improve the performance of the query. Some of them will be good, some will be bad (for example, I've seen it suggest adding every column in a table to an index, sometimes like 30 of them!), so I don't follow it blindly. Generally I'll add a few indexes and then retest, to find the suggestions that are the most important. I've used it to optimize many queries that I thought I had properly indexed, only to find I could get a lot more performance out of them.
I had a similar setup, large stored procedures with lots of large temp tables.
Our problem was that the joins with and between the temp tables were very slow.
I recommend that you look at your execution plan and try to add relevant indexes to the temp tables too if you have not already.

Is there a way to fix Oracle query in shared pool

I have a report engine that runs PreparedStatements on Oracle 11; it is a high-priority task.
What I see is that the first invocation of a query usually takes much longer than later invocations of the same query (with different parameters and returning different data).
I suppose this is due to the hard parse Oracle does on the first invocation.
I wonder, is there a way of hinting to Oracle that this is a high-priority query that will be run often and whose performance is critical, so it should remain in the shared pool no matter what?
I know that I can fix the execution plan in Oracle 11, but I don't want to fix it; I want Oracle to still be able to change it as the system changes. All I want is to avoid the hard parse.
Perhaps you should change your "I suppose..." into a "I tested and have determined..." :)
The query performance may be affected by more than just parsing; when it executes it has to fetch blocks from disk into the buffer cache - subsequent executions quite possibly are taking advantage of the blocks being found in memory and so are faster.
EDIT: to answer your immediate question - a workaround may be to have a job run periodically that parses the query but doesn't execute it. You might even be able to use this to determine whether parsing or fetching is the locus of the problem.
You can try pinning it in the shared pool using dbms_shared_pool.keep.
But I would first make sure that you actually have an aging-out problem.
Anton,
if your query is using bind variables it will be re-used. The cursor will be cached and as long as it is re-used, it will remain in the cursor cache. Make sure that it uses bind variables. This increases re-usability and scalability.
If you don't trust the rdbms you can pin it using dbms_shared_pool.keep.
See http://psoug.org/reference/dbms_shared_pool.html
You need to find your cursor in order to do so.
Normally there is another problem that should be fixed.
Ronald.
http://ronr.blogspot.com

Best strategy for retrieving large dynamically-specified tables on an ASP.NET page

Looking for a bit of advice on how to optimise one of our projects. We have an ASP.NET/C# system that retrieves data from a SQL2008 database and presents it on a DevExpress ASPxGridView. The data that's retrieved can come from one of a number of databases - all of which are slightly different and are being added and removed regularly. The user is presented with a list of live "companies", and the data is retrieved from the corresponding database.
At the moment, data is being retrieved using a standard SqlDataSource and a dynamically-created SQL SELECT statement. There are a few JOINs in the statement, as well as optional WHERE constraints, again dynamically-created depending on the database and the user's permission level.
All of this works great (honest!), apart from performance. When it comes to some databases, there are several hundreds of thousands of rows, and retrieving and paging through the data is quite slow (the databases are already properly indexed). I've therefore been looking at ways of speeding the system up, and it seems to boil down to two choices: XPO or LINQ.
LINQ seems to be the popular choice, but I'm not sure how easy it will be to implement with a system that is so dynamic in nature - would I need to create "definitions" for each database that LINQ could access? I'm also a bit unsure about creating the LINQ queries dynamically too, although looking at a few examples that part at least seems doable.
XPO, on the other hand, seems to allow me to create an XPO Data Source on the fly. However, I can't find much information on how to JOIN to other tables.
Can anyone offer any advice on which method - if any - is the best to try and retro-fit into this project? Or is the dynamic SQL model currently used fundamentally different from LINQ and XPO and best left alone?
Before you go and change the whole way that your app talks to the database, have you had a look at the following:
Run your code through a performance profiler (such as Redgate's performance profiler); the results are often surprising.
If you are constructing the SQL string on the fly, are you using .NET best practices such as String.Concat("str1", "str2") instead of "str1" + "str2"? Remember, multiple small gains add up to big gains.
Have you thought about having a summary table or database that is periodically updated (say every 15 minutes; you might need to run a service to update this data automatically), so that you are only hitting one database? New connections to databases are quite expensive.
Have you looked at the query plans for the SQL that you are running? Today, I moved a dynamically created SQL string to a sproc (only 1 param changed) and shaved 5-10 seconds off the running time (it was being called 100-10000 times depending on some conditions).
Just a warning if you do use LINQ. I have seen some developers who decided to use LINQ write more inefficient code because they did not know what they were doing (for example, pulling 36,000 records when they needed to check for 1). These things are very easily overlooked.
Just something to get you started on and hopefully there is something there that you haven't thought of.
Cheers,
Stu
As far as I understand, you are talking about the so-called server mode, in which all data manipulations are done on the DB server instead of fetching the rows to the web server and processing them there. In this mode the grid works very fast with data sources that contain hundreds of thousands of records. If you want to use this mode, you should create either the corresponding LINQ classes or XPO classes. If you decide to use LINQ-based server mode, the LINQServerModeDataSource provides the Selecting event, which can be used to set a custom IQueryable and KeyExpression. I would suggest that you use LINQ in your application. I hope this information will be helpful to you.
I guess there are two points where performance might be tweaked in this case. I'll assume that you're accessing the database directly rather than through some kind of secondary layer.
First, you don't say how you're displaying the data itself. If you're loading thousands of records into a grid, that will take time no matter how fast everything else is. Obviously the trick here is to show a subset of the data and allow the user to page, etc. If you're not doing this then that might be a good place to start.
Second, you say that the tables are properly indexed. If this is the case, and assuming that you're not loading 1,000 records into the page at once but retrieving only subsets at a time, then you should be OK.
But, if you're only doing an ExecuteQuery() against an SQL connection to get a dataset back I don't see how Linq or anything else will help you. I'd say that the problem is obviously on the DB side.
So to solve the problem with the database, you need to profile the different SELECT statements you're running against it, examine the query plans and identify the places where things are slowing down. You might want to start by using the SQL Server Profiler, but if you have a good DBA, just looking at the query plan (which you can get from Management Studio) is often enough.
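The "show a subset and let the user page" advice above can be sketched as follows. This is a minimal Python example against an in-memory SQLite database, used only as a runnable stand-in; the invoices table and the fetch_page helper are hypothetical. SQLite's LIMIT/OFFSET is assumed here for brevity; on SQL Server 2008 the same idea is usually written with ROW_NUMBER() in a subquery. Either way, only one page of rows ever leaves the database.

```python
import sqlite3

# SQLite stands in for SQL Server; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, company TEXT, total REAL)")
conn.executemany(
    "INSERT INTO invoices (company, total) VALUES (?, ?)",
    [(f"company-{i % 5}", float(i)) for i in range(1, 251)],
)


def fetch_page(page, page_size=50):
    """Return one page of rows (page numbers start at 1), ordered by primary key."""
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, company, total FROM invoices ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


first_page = fetch_page(1)
last_page = fetch_page(5)
past_the_end = fetch_page(6)

print(len(first_page), first_page[0][0], last_page[-1][0], len(past_the_end))
```

A stable ORDER BY on an indexed column matters here: without it, pages are not guaranteed to be consistent between requests.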

Resources