I'm trying to make sure I have a good understanding of the relationship between CURSOR_SHARING, bind variables, bind variable peeking and histograms, as most sources cover these topics in different sections.
OK, so here's what I've gathered so far; feel free to correct me if I got anything wrong:
CURSOR_SHARING
1. = EXACT (default)
1.1. if SQL statement uses literals: the optimizer will generate a new execution plan for every combination of literals - optimizer will not replace literals with binds. A new parent cursor is generated for every literal combination.
1.2. if SQL statement uses bind variables: the first time the statement is run, the optimizer will peek at the values of the bind variables and use those specific values to generate an execution plan - all future executions of the statement will use that same plan (even if the plan is suboptimal for other values of the bind variables).
2. = FORCE
2.1. optimizer will replace all literals with binds - and will basically use the same algorithm as scenario 1.2
3. = SIMILAR
3.1. no histogram: optimizer replaces all literals with binds -> same final effect as with 1.2 and 2.1
3.2. with histogram: optimizer replaces all literals with binds, but peeks at the bind variable EVERY time the statement is run (as opposed to just on the first run through) to see if there is a more optimal execution plan for that specific value of the bind variable (based on histogram statistics). Therefore, a new child cursor is effectively created for every distinct value of the bind variable that the optimizer encounters.
Questions:
1. From my understanding, doesn't using CURSOR_SHARING = EXACT and writing SQL statements with bind variables (1.2) lead to exactly the same outcome as setting CURSOR_SHARING = FORCE (2.1)? In both cases, the optimizer will only peek at the bind variable on the first run to generate the execution plan and then reuse that plan no matter what the values of the bind variables are on subsequent runs? If so, then why do most sources recommend using bind variables? This seems like it could have a significant impact on performance.
2. Is the histogram used in the initial bind variable peek for 1.2 and 2.1? As in, the first time that SQL statement is run and the optimizer peeks at the bind variable, does it use the histogram (if there is one) to determine whether a full-table scan or an index scan is used? "Oracle Database 11g, Performance Tuning Recipes" seems to indicate that histograms are relevant only when CURSOR_SHARING = SIMILAR, but some other sources indicate that the histogram is used with all the other CURSOR_SHARING settings as well.
3. In case 1.1, would the optimizer make use of the histogram to determine the best execution plan? Basically I just want to know when the histogram is used. Is it only when CURSOR_SHARING = SIMILAR, or for other CURSOR_SHARING settings as well?
4. Adaptive Cursor Sharing: this feature will only take place if there are bind variables (either from the user query or system-generated by literal replacement). Therefore it only takes place in 1.2, 2.1, 3.1 and 3.2? But since SIMILAR has been deprecated, does this mean that ACS only occurs in 1.2 and 2.1?
Hopefully, I'm not too far off base right now but if I made any mistakes please do correct me
Thanks!
Edited by: BYS2 on Dec 20, 2011 12:11 PM
1. The difference between (a) using FORCE and (b) using EXACT and coding bind variables yourself is that in the latter case you control when bind variables are used. So if you can see that in a particular case bind variables are hurting performance, or aren't necessary, you can change that query. With FORCE, you're stuck. The reason that bind variables are recommended for OLTP-type queries is that parsing is a highly serialized process that can become a big bottleneck. In OLTP systems you tend to see lots of queries that should always be using the same execution plan run with different values, so re-parsing them all the time is a waste. Any good source will also recommend that you consider when not to use bind variables -- for instance, if you only have a few possible values that can appear at a particular position in a query, and one or more of those values might benefit from a different execution plan, it may be better overall to use literals, since you can parse each variant once and then reuse the cached plan.
(Another benefit to using bind variables is that it leaves you less open to SQL injection.)
2 & 3. Histograms are used in general when creating execution plans for queries, and in more ways than are obvious. Yes, in the case of a standard bind variable peek with EXACT setting, the histogram is (or at least, may be) used by the optimizer in determining the execution plan. This can be a good thing or a bad thing, depending on the skew and what particular value you have for the bind. I think the point your source is making about histograms and the SIMILAR setting is that in that case, the presence of the histogram is one of the triggers that will cause a new execution plan to be created.
(I would highly recommend Jonathan Lewis's "Cost-Based Oracle Fundamentals" for all the information you could want about when and how histograms are used.)
4. I believe that Adaptive Cursor Sharing is essentially an enhanced version of the logic that was previously implemented for CURSOR_SHARING=SIMILAR. The optimizer will consider creating new plans based on bind variable peeking, in all circumstances. SIMILAR appears to still exist as an option. This post may provide some further helpful info.
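To make the parsing argument at the start of this reply a bit more concrete, here is a toy model in Python. It is only a sketch and not how Oracle's library cache actually works: the `ToyCursorCache` class, the `orders` table and the `:cid` bind name are all made up for illustration. It just shows that a cache keyed by statement text needs one "hard parse" per distinct literal, but only one in total when the value is bound.

```python
# Toy model of a shared SQL cache keyed by statement text.
# Assumption for illustration only: Oracle's real library cache is far more
# involved (child cursors, invalidation, adaptive cursor sharing, ...).


class ToyCursorCache:
    def __init__(self):
        self.plans = {}       # statement text -> placeholder "plan"
        self.hard_parses = 0  # how many times a plan had to be built

    def execute(self, sql_text):
        if sql_text not in self.plans:
            # Statement text not seen before: simulate a hard parse.
            self.hard_parses += 1
            self.plans[sql_text] = f"plan #{self.hard_parses}"
        return self.plans[sql_text]


def run_with_literals(cache, customer_ids):
    for cid in customer_ids:
        # Every distinct literal yields a distinct statement text.
        cache.execute(f"SELECT * FROM orders WHERE customer_id = {cid}")


def run_with_binds(cache, customer_ids):
    for _cid in customer_ids:
        # The value travels separately, so the statement text never changes.
        cache.execute("SELECT * FROM orders WHERE customer_id = :cid")


literal_cache = ToyCursorCache()
run_with_literals(literal_cache, range(1000))

bind_cache = ToyCursorCache()
run_with_binds(bind_cache, range(1000))

print(literal_cache.hard_parses)  # 1000
print(bind_cache.hard_parses)     # 1
```

The flip side, as noted above, is that the single shared plan in the bind case is built once and reused for every value, which is exactly where peeking and histograms come into play.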
Related
I have an Oracle bind query that is extremely slow (about 2 minutes) when it executes in my C# program but runs very quickly in SQL Developer. It has two parameters that hit the table's index:
select t.Field1, t.Field2
from theTable t
where t.key1=:key1
and t.key2=:key2
Also, if I remove the bind variables and create dynamic sql, it runs just like it does in SQL Developer.
Any suggestion?
BTW, I'm using ODP.
If you are replacing the bind variables with literal values in SQL Developer, then you're not really running the same test. Make sure you use the bind variables; if it's also slow, you're just getting bitten by a bad cached execution plan. Updating the stats on that table should resolve it.
However, if you are actually using bind variables in SQL Developer, then keep reading. The TL;DR version is that the settings ODP.NET runs under sometimes cause the optimizer to take a slightly more pessimistic approach. Start with updating the stats, but have your DBA capture the execution plan under both scenarios and compare them to confirm.
I'm reposting my answer from here: https://stackoverflow.com/a/14712992/852208
I considered flagging yours as a duplicate but your title is a little more concise since it identifies the query does run fast in sql developer. I'll welcome advice on handling in another manner.
Adding the following to your config will send odp.net tracing info to a log file:
This will probably only be helpful if you can find a large gap in time. Chances are rows are actually coming in, just at a slower pace.
Try adding "enlist=false" to your connection string. I don't consider this a solution since it effectively disables distributed transactions, but it should help you isolate the issue. You can get a little bit more information from an Oracle forums post:
From an ODP perspective, all we can really point out is that the
behavior occurs when OCI_ATR_EXTERNAL_NAME and OCI_ATR_INTERNAL_NAME
are set on the underlying OCI connection (which is what happens when
distrib tx support is enabled).
I'd guess what you're not seeing is that the execution plan is actually different between the ODP.NET call and the SQL Developer call (meaning the performance hit is actually occurring on the server). Have your DBA trace the connection and obtain execution plans from both the ODP.NET call and the call straight from SQL Developer (or with the enlist=false parameter).
If you confirm different execution plans, or if you want to take a preemptive shot in the dark, update the statistics on the related tables. In my case this corrected the issue, indicating that execution plan generation doesn't really follow different rules for the different types of connections but that the cost analysis is just slightly more pessimistic when a distributed transaction might be involved. Query hints to force an execution plan are also an option, but only as a last resort.
Finally, it could be a network issue. If your ODP.NET install is using a fresh Oracle home (which I would expect unless you did some post-install configuring), then the tnsnames.ora could be different. Host names in tnsnames.ora might not be fully qualified, creating more delays resolving the server. I'd only expect the first attempt (and not subsequent attempts) to be slow in this case, so I don't think it's the issue, but I thought it should be mentioned.
Are the parameters bound with the correct data types in C#? Do the types of the parameters :key1 and :key2 match the types of the columns key1 and key2 (for example, are the columns VARCHAR2 while the parameters are bound as numbers)? If not, the query may return the correct results but will require an implicit conversion. When Oracle applies that conversion to the column, as in to_number(key1) = :key1, it acts like a function on the column and prevents a regular index on it from being used.
Please also check the number of rows returned by the query. If the number is big, then possibly C# is fetching all the rows while the other tool fetches only the first batch. Fetching all rows may require many more disk reads in that case, which is slower. To check this, try running the following in SQL Developer:
SELECT COUNT(*) FROM (
select t.Field1, t.Field2
from theTable t
where t.key1=:key1
and t.key2=:key2
)
The query above has to visit every matching row, so it should read the maximum number of database blocks.
A nice tool in such cases is the tkprof utility, which shows the SQL execution plan; the plan may differ between the cases above (although it should not).
It is also possible that you have accidentally connected to different databases. In such cases it helps to compare the results of the queries.
Since you are raising "Bind is slow", I assume you have checked the SQL without binds and it was fast. In 99% of cases, using binds makes things better. Please check whether the query with constants runs fast. If it does, then the problem may be an implicit conversion on the key1 or key2 column caused by a data type mismatch between the column and the bound parameter.
I need to rewrite some queries generated by OBIEE.
I need to replace all the literal values with bind variables, but I couldn't find out how to use bind variables.
Can someone help me?
Thanks
First, a statement:
If you are utilizing ADF (Oracle Application Development Framework) with OBIEE, there is a setting in the ADF layer which allows us to specify how the View Criteria and the WHERE clauses used in the queries fired on the VOs should be handled.
By default the setting useBindVarsForViewCriteriaLiterals is set to "False" in adf-config.xml. If this setting is False, then ADF will generate SQLs with literals for the view criteria and this can cause contention in the shared pool area of the database.
If we change the setting in adf-config.xml to "True", ADF generates SQL with bind variables for all view criteria.
However, this setting should not be changed for BI (OBIEE), as BI does not support bind variables in its queries. If we see any queries (or their related logs) that use bind variables in OBIEE queries/reports, then it's possible that this is due to the above setting.
For BI, this setting should be left to the default value, i.e. "False"
Quick answer: It would seem that OBIEE is not capable of using bind variables in place of literals.
Now, the reasoning behind the restriction on bind variables:
In a data warehouse, instead of running say 1,000 statements
per second, they do something like take an average of 100 seconds to run a single query.
In these systems, the queries are few but big (they ask large questions). Here, the
overhead of the parse time is a tiny fraction of the overall execution time. Even if you
have thousands of users, they are not waiting behind each other to parse queries, but
rather are waiting for the queries to finish getting the answer.
In these systems, using bind variables may be counterproductive. Here, the runtimes for
the queries are lengthy: seconds, minutes, hours, or more. The goal is to get the best
query optimization plan possible to reduce the runtime, not to execute as many of OLTP,
one-tenth-second queries as possible. Since the optimizer's goal is different, the rules
change.
Sometimes using a bind variable forces the optimizer to come up with the best generic plan, which actually might be the worst plan for the specific query. In a system where the queries take considerable time to execute, bind variables remove information the optimizer could have used to come up with a superior plan. In fact, some data warehouse-specific features are defeated by using bind variables. For example, Oracle supports a star transformation feature for data warehouses that can greatly reduce the time a query takes. However, one restriction that precludes star transformation is having queries that contain bind variables.
I have a very large query. It is too big to post here (it has 6 levels of nested queries with ordering and grouping). The query has 2 parameters that are passed to it via PreparedStatement.setString(index, value). When I execute my query through SQL Developer (replacing the query parameters with actual values by hand beforehand), the query runs for about 10 seconds and returns approximately 15000 rows. But when I try to run it through a Java program using a PreparedStatement with variables, it fails with ORA-01652 (unable to extend temp segment). I have tried using a simple Statement from the Java program, and it works fine. Also, when I use a PreparedStatement without variables (no setString() calls, with the parameter values written into the SQL by hand), it works fine too.
So, I suspect that the problem is in the PreparedStatement parameters.
How does the mechanism behind those parameters work? Why does a simple Statement work fine while a prepared one fails?
You're probably running into issues with bind variable peeking.
For the same query, the best plan can be significantly different depending on the actual bind variables. In 10g, Oracle builds the execution plan based on the first set of bind variables used. 11g mostly fixed this problem with adaptive cursor sharing, a feature that creates multiple plans for different bind variables.
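Here is a small Python sketch of what "builds the plan from the first set of bind variables" means in practice. It is a toy model under stated assumptions, not Oracle's optimizer: the histogram fractions, the 5% threshold and the plan names are all invented for illustration.

```python
# Toy model of bind variable peeking: the plan is chosen from the first bind
# value seen and then reused. Histogram, threshold and plan names are made up.

HISTOGRAM = {"CA": 0.12, "TX": 0.09, "AK": 0.002}  # fraction of rows per value
INDEX_THRESHOLD = 0.05  # below this selectivity the toy optimizer uses the index


def choose_plan(value):
    """Pick a plan for one specific value, as a literal query would get."""
    selectivity = HISTOGRAM.get(value, 0.01)
    if selectivity < INDEX_THRESHOLD:
        return "INDEX RANGE SCAN"
    return "FULL TABLE SCAN"


class PeekOnceCursor:
    """One shared cursor: peeks at the bind value only on the first execution."""

    def __init__(self):
        self.plan = None

    def execute(self, bind_value):
        if self.plan is None:
            # Hard parse: peek at the first bind value and fix the plan.
            self.plan = choose_plan(bind_value)
        # Every later execution reuses the cached plan, whatever the value.
        return self.plan


cursor = PeekOnceCursor()
first = cursor.execute("AK")   # rare value, so the index plan gets cached
second = cursor.execute("CA")  # common value, but the cached plan is reused

print(first)               # INDEX RANGE SCAN
print(second)              # INDEX RANGE SCAN
print(choose_plan("CA"))   # FULL TABLE SCAN
```

If the first execution had happened to use "CA" instead, the full scan would have been cached for everyone, which is why the same statement can behave so differently from one day to the next.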
Here are some ideas for solving this problem:
Use literals: This isn't always as bad as people assume. If the good version of your query runs in 10 seconds, the overhead of hard-parsing the query will be negligible. But you may need to be careful to avoid SQL injection.
Force a hard parse: There are a few ways to force Oracle to hard-parse every query. One method is to call DBMS_STATS with NO_INVALIDATE=>FALSE on one of the tables in the query.
Disable bind-variable peeking / use hints: You can do this by removing the relevant histograms, or by using one of the parameters in the link provided by OldProgrammer. This will stabilize your plan, but will not necessarily pick the correct plan. You may also need to use hints to pick the right plan. But then you may not have the right plan for every combination of inputs.
Upgrade to 11g: This may not be an option, but this issue is another good reason to start planning an upgrade.
I hope this was not asked here before (I did search around here, and did google for an answer, but could not find an answer)
The problem is: I'm using MS Access 2010 to select records from a linked table (there are millions of records in the table). If I specify the criteria (e.g. a date) directly (for example date=#1/1/2013#), the query returns in an instant. If I use parameters (add a parameter of type date/time and provide a value of 1/1/2013 when prompted (or the date in some different format), or reference a control in a form), the query takes minutes to load.
Please let me know if you have any ideas on what could be causing this. I do feel bad about asking such a question and possibly wasting someone's time...
Here's a potential answer, I didn't know this myself and did a little digging.
If performance is important, it may be necessary to prefer dynamic SQL even where parameter queries are suitable, due to how queries are optimized. Generally, Access creates a plan for a new query upon saving. When a query contains a parameter, Access cannot know what value the parameter may contain and has to make a "good guess". Depending on which actual values are later supplied, the guess may be okay or poor, resulting in sub-optimal performance. In contrast, dynamic SQL sidesteps this because the "parameters" are hard-coded into the temporary string and thus a new plan is compiled with that value, guaranteeing an optimal execution plan. Since compiling a new plan at runtime is very fast, it can be the case that dynamic SQL will outperform parameter queries.
Source: http://www.utteraccess.com/wiki/index.php/Parameter_Query#Performance
Also, if I had to guess, in your parameter query Access is requesting the ENTIRE table from Oracle and then filtering it down locally with your WHERE clause, whereas when the criterion is written as a literal, it actually just loads those records and possibly makes use of indexes.
As far as a solution, I would build your query string in VBA then execute it. It opens you up to injection, but you can handle that. So:
Instead of using a saved parameter query object in Access, try to do something like this.
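Dim qr As String
Dim rs As DAO.Recordset
' Format the date explicitly as a US-style Access literal, whatever the regional settings are
qr = "SELECT * FROM myTable WHERE myDate = " & Format(Me.dateControl, "\#mm\/dd\/yyyy\#") & ";"
Set rs = CurrentDb.OpenRecordset(qr)
(DoCmd.RunSQL and CurrentDb.Execute only run action queries, so a SELECT needs CurrentDb.OpenRecordset(qr), as you noted in your reply.)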
This would force the engine to make an execution plan at runtime rather than having a saved potentially suboptimal plan. Let me know if this works out for you, I'd be interested to see.
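On the injection caveat mentioned above: one way to "handle that" when a value has to be concatenated into the SQL string is to parse it into the expected type first and build the literal from the parsed value, never from the raw text. The sketch below shows the idea in Python rather than VBA; the function names and the mm/dd/yyyy input format are assumptions made for the example, and `myTable` / `myDate` are simply reused from the snippet above.

```python
from datetime import datetime


def access_date_literal(text):
    """Parse user text as a date and return an Access-style #mm/dd/yyyy# literal.

    Raises ValueError if the text is not a plain date, so hostile input never
    reaches the SQL string.
    """
    parsed = datetime.strptime(text.strip(), "%m/%d/%Y")
    return parsed.strftime("#%m/%d/%Y#")


def build_query(date_text):
    # The literal is rebuilt from the parsed date, not copied from the input.
    return f"SELECT * FROM myTable WHERE myDate = {access_date_literal(date_text)};"


print(build_query("1/1/2013"))
# SELECT * FROM myTable WHERE myDate = #01/01/2013#;

try:
    build_query("1/1/2013#; DROP TABLE myTable; --")
except ValueError:
    print("rejected")  # rejected
```

The same principle applies in VBA: convert the control's value to a Date first and format the literal from that, rather than splicing the raw text into the query.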
Of course, the above reference about using parameters with Access (JET/ACE) ONLY applies to Access back ends, not ODBC ones like SQL Server or Oracle. Since you pointed out that you're using Oracle here, creating a view or using a pass-through query should resolve this performance issue. One does NOT want to use Access/JET parameters with data coming from a server-based system; it is better to just send the server SQL strings, and much better to use a pass-through query. Note that pass-through queries are read-only, so if the result set requires editing, you have to create a view and link to that view.
I am writing a stored procedure to perform a dynamic search that spans 10+ database tables. With millions of records in each table and a dynamic set of search parameters*, I am having some trouble optimizing the procedure.
Is there a "best practice" for building these kinds of queries? E.g. use strings to build a dynamic query, use a huge list of IF THEN .. ELSE statements, etc.? Can anyone provide a simple example or point me to some literature that will help? Here's some pseudocode for the stored procedure I am developing, which accepts a collection of parameters and a ref cursor.
v_query := 'SELECT .....';
v_name := ...; -- retrieve "name" parameter from collection
if v_name is not null then
  -- the concatenated value has to be quoted to form a valid string literal
  v_query := v_query || ' AND table.Name = ''' || v_name || '''';
end if;
open search_cursor for v_query;
...
*By "dynamic set of search parameters," I mean that I pass in a collection of parameters. I figured this would be easier than making the caller pass in 20 parameters if they only want to search on one.
There are problems with using the static query approach; also be very careful about using the CURSOR_SHARING=FORCE option - it can really raise hell with your system if you haven't done a coverage test to ensure that all your other queries will work the way you want.
Problems with static queries:
The (x is null or x = col) predicates tend to kill any chance of using indexes. Since the query plan is computed at the time the query is parsed the first time, the indexes you use will be based on the values for the first run of the query; later runs, which may not constrain on the same columns, will still use the same indexes.
Having one static statement with substitution variables will prevent the optimizer from making an intelligent choice about which index to use based on the data distribution. In a dynamic query (or in the first run of a query with bind variables), Oracle will see how selective your constraint is; a highly selective constraint will become a prime candidate for index use. For example, if your table had a row for every person in the U.S., STATE='Alaska' will be much more likely to use the index on STATE than STATE='California'.
Of course, in both these cases, if the dynamic columns in your WHERE clause are not indexed anyway, it doesn't matter, although I'd be surprised if that were the case in a database the size you're talking about.
Also, consider the real cost of all that hard parsing. Yes, hard parses serialize system resources, which makes them expensive, but only in the context of high volume queries. By their nature, ad-hoc queries do not get run very often. The cost you pay for all the hard parses you incur in an entire day will likely be hundreds of times less than the cost of a single query that uses the wrong indexes.
In the past, I've implemented these systems pretty much like you've done here - a base query portion, then iterating over a constraint list and adding WHERE clause predicates. I don't think it's hard for someone to maintain or understand, especially if you're talking about constraints that don't involve adding a lot of subqueries or extra tables to the FROM clause.
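As a rough illustration of the "base query plus a loop over the constraint list" approach described above, here is a minimal Python sketch that adds one predicate per supplied criterion. It uses named placeholders rather than concatenated values, and runs against an in-memory SQLite database purely as a stand-in; the `people` table, its columns and the `build_search` helper are all hypothetical, and a PL/SQL version would look different.

```python
import sqlite3

# Whitelist of searchable columns (hypothetical). Column names cannot be bound,
# so they are checked against this set instead of being taken from user input.
SEARCHABLE = {"name", "city", "status"}


def build_search(criteria):
    """Return (sql, params) covering only the criteria that were supplied."""
    sql = "SELECT id, name, city, status FROM people WHERE 1 = 1"
    params = {}
    for column, value in criteria.items():
        if value is None:
            continue  # criterion not supplied: add no predicate for it
        if column not in SEARCHABLE:
            raise ValueError(f"unsupported search column: {column}")
        sql += f" AND {column} = :{column}"
        params[column] = value
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO people (name, city, status) VALUES (?, ?, ?)",
    [("Smith", "Juneau", "A"), ("Smith", "Fresno", "I"), ("Jones", "Juneau", "A")],
)

sql, params = build_search({"name": "Smith", "city": None, "status": "A"})
rows = conn.execute(sql + " ORDER BY id", params).fetchall()

print(sql)
print(rows)  # [(1, 'Smith', 'Juneau', 'A')]
```

Because each combination of supplied criteria produces its own statement text, each combination also gets its own plan, while the values themselves stay out of the SQL string.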
One thing to consider: If this system is primarily an offline one (in other words, not constantly being updated or inserted into - populated by periodic loads of bulk data), you may want to look into using BITMAP indexes. Bitmap indexes differ from regular b-tree indexes in that multiple indexes on a single table can be used simultaneously, and bitmap indexes are much, much smaller on disk than b-trees. They work very well for applications like this - where you will have a variety of constraints that can't be defined at design time. You will only want to put bitmap indexes on columns that have relatively few distinct values - say, one value constitutes no less than 1/1000 of the table - so don't use bitmaps on unique columns.
However, the downside is that bitmap indexes will noticeably degrade the performance of inserts and updates. The best practice for bitmaps is to use them in data warehouse applications, and they are dropped prior to loads and recreated afterwards.
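To show why several bitmap indexes can be used together for one query, here is a toy Python model. It is only a sketch of the idea (one bit per row, one bitmap per distinct value) and says nothing about Oracle's actual storage format; the sample rows and the `build_bitmap_index` helper are invented for the example.

```python
# Toy bitmap index: one bit per row position, one bitmap per distinct value.
rows = [
    {"state": "AK", "status": "A"},
    {"state": "CA", "status": "A"},
    {"state": "AK", "status": "I"},
    {"state": "AK", "status": "A"},
]


def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int used as a bitmap."""
    index = {}
    for position, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << position)
    return index


state_idx = build_bitmap_index(rows, "state")
status_idx = build_bitmap_index(rows, "status")

# WHERE state = 'AK' AND status = 'A': AND the two bitmaps together.
combined = state_idx["AK"] & status_idx["A"]
matching_rows = [pos for pos in range(len(rows)) if (combined >> pos) & 1]

print(matching_rows)  # [0, 3]
```

Each extra constraint is just another cheap bitwise operation, which is what makes this kind of index attractive for searches whose constraints are not known at design time.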
Except in very particular cases, I don't think it is advisable (or even possible) to try to generate an optimized query. My advice is not to use dynamic SQL if you can avoid it: it is hard to read, hard to debug, hard to optimize and hard to maintain.
First, write a generic query that will work with any parameter sent to your procedure. According to your example, that would give something like:
SELECT * FROM table WHERE ((v_name IS NULL) OR (table.Name=v_name));
As you see, you could easily add other parameters to this query without using dynamic SQL. This query is much easier to read and debug. Ask your DBA for optimization tips.
Then, if you have a particular set of parameters that you know are often passed together, you could write a particular query for this set that you could specifically optimize. Pseudocode:
IF particular_set
THEN
/* Specific query */
ELSE
/* Generic query */
END IF;
The difficult part is to try not to have too many specific queries here, or you could fall into a maintenance hell.
We've had a similar requirement for one of our clients. They have half a dozen tables with millions of rows, and they wanted ad hoc search capability on most of the columns.
The solution was a separate package for each table, which would take the search criteria and construct the SQL to run the search. We took advantage of the old system that was being replaced, to discover what the most common types of searches the users were doing, and made sure that those searches ran the best, by tuning the queries that were being generated (supported by the strategic use of indexes). Because each package was only responsible for queries against one table, it could have specific code designed to work with that table (including the odd hint, in a few rare cases).
One question/problem that you'll need to address is, do you hard-code the criteria (e.g. WHERE SURNAME='SMITH') or use bind variables? Using bind variables reduces hard parsing, which reduces load on the database server; however it can be impractical to use bind variables when the SQL is dynamically generated. The way we ended up going was to set CURSOR_SHARING=FORCE (which has its own disadvantages) which was a reasonable compromise in our case.
Read http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6711305251199