How slow are cursors really and what would be better alternatives? - performance

I have been reading that cursors are pretty slow and that, unless you are out of options, you should avoid them. I am trying to optimize my stored procedures, and one of them uses a cursor. It is called frequently by my application, with a lot of users (around 20,000) and many rows to update, so I was thinking I should look for an alternative.
All I want to do is get a list of records and then operate on each one depending on its row values. For example, say we have:
Employee - Id,Name,BenefitId,StartDate,EndDate
Based on BenefitId I need to do a different calculation using the dates between StartDate and EndDate and then update the employee details. This is a contrived example, just to give an idea of my situation.
What are your thoughts? Are there better alternatives to cursors, such as temp tables or user-defined functions? When should you really opt for a cursor, or should we never use them at all? Thanks everyone for your help.

I once changed a stored procedure from cursors to set-based logic. Running time went from 8 hours to 22 seconds. That's the kind of difference we're talking about.
Instead of taking a different action one record at a time, make several passes over the data: update and set field1 = A where field2 is X, then update and set field1 = B where field2 is Y, and so on (see the sketch below).
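For example, here is a minimal T-SQL sketch of that multi-pass idea against the asker's contrived Employee table; the CalculatedAmount column, the BenefitId values and the formulas are all invented purely for illustration.

-- Hypothetical set-based passes; CalculatedAmount and the formulas are made up.
UPDATE Employee
SET CalculatedAmount = DATEDIFF(DAY, StartDate, EndDate) * 1.00
WHERE BenefitId = 1;    -- one pass for benefit type 1

UPDATE Employee
SET CalculatedAmount = DATEDIFF(MONTH, StartDate, EndDate) * 50.00
WHERE BenefitId = 2;    -- a second pass for benefit type 2

-- Or collapse the per-type logic into a single set-based statement:
UPDATE Employee
SET CalculatedAmount = CASE BenefitId
        WHEN 1 THEN DATEDIFF(DAY, StartDate, EndDate) * 1.00
        WHEN 2 THEN DATEDIFF(MONTH, StartDate, EndDate) * 50.00
        ELSE CalculatedAmount
    END
WHERE BenefitId IN (1, 2);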

I've changed out cursors and moved from over 24 hours of processing time to less than a minute.
To help you see how to fix your proc with set-based logic, read this:
http://wiki.lessthandot.com/index.php/Cursors_and_How_to_Avoid_Them

A cursor does row-by-row processing, or "Row By Agonizing Row" if your name is Jeff Moden.
This is just one example of how to do set-based SQL programming as opposed to RBAR, but it depends ultimately on what your cursor is doing.
Also, have a look at this on StackOverflow:
RBAR vs. Set based programming for SQL

First off, it sounds like you are trying to mix some business logic into your stored procs. That's generally something you want to avoid. A better solution would be to have a middle-tier layer which encapsulates that business logic. That way your data layer remains purely data.
To answer your original question, it really depends on what you are using the cursors for. In some cases you can use a table variable or a temp table. You have to remember to free up temp tables, though, so I would suggest using table variables whenever possible. Sometimes, though, there is just no way around using cursors. Maybe the original DBAs didn't normalize enough (or normalized too much) and you are forced to use a cursor to traverse multiple tables without any foreign key relationships.
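As a rough illustration of the table-variable approach (all of the names below are invented, not taken from the question):

-- T-SQL sketch: stage the rows once, then update with a set-based join
-- instead of walking a cursor. Table and column names are illustrative only.
DECLARE @ToProcess TABLE (Id int PRIMARY KEY, BenefitId int, StartDate date, EndDate date);

INSERT INTO @ToProcess (Id, BenefitId, StartDate, EndDate)
SELECT Id, BenefitId, StartDate, EndDate
FROM Employee
WHERE EndDate >= GETDATE();

-- A temp table (#ToProcess) would work the same way, but it lives in tempdb and
-- should be dropped explicitly when you are done with it.

UPDATE e
SET e.EndDate = DATEADD(YEAR, 1, tp.StartDate)   -- placeholder calculation
FROM Employee e
JOIN @ToProcess tp ON tp.Id = e.Id;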

Related

ORACLE - Which is better to generate a large resultset of records, View, SP, or Function

I have recently been working with an Oracle database to generate some reports. What I need is to get result sets of specific records (SELECT statements only), sometimes quite large, to be used for generating reports in an Excel file.
At first the reports were queried through views, but some of them are slow (they have some complex subqueries). I was asked to improve the performance and also to fix some field mappings. I also want to tidy things up, because when I query against a view I must explicitly reference the right column names. I want to keep the data work in the database and have the web app just pass parameters and call the right result set.
I'm new to Oracle, so which is better for this kind of task: a stored procedure or a function? And under what conditions might a view be the better choice?
Makes no difference whether you compile your SQL in a view, SP or function. It is the SQL itself that matters.
As long as you are able to meet your requirements with the views, they should be a good option. If you intend to break up your queries into multiple ones to achieve better performance, then you should go for stored procedures. If you decide to go with stored procedures, it would be advisable to create a package and bundle all of them together in it. If your problem is performance, there may not be a silver-bullet solution; you will have to work on the queries and the design themselves.
If the problem is performance due to complex SELECT query (queries), you can consider tuning the queries. Often you will find queries written 15-20 years ago, which do not use functionality and techniques that were introduced by Oracle in more recent versions (even if the organization spent the big bucks to buy the more recent versions - making it into a waste of money). Honestly, that may be too much of a task for you if you are new at Oracle; also, some slow queries may have been written by people just like you, many years ago - before they had a chance to learn a lot about Oracle and have experience with it.
Another thing, if the reports don't need to use the absolute current state of the underlying tables (for example, if "what was in the tables at the end of the business day yesterday" is acceptable), you can create a materialized view. It will not work any faster than a regular view, but it can run overnight (say), or every six hours, or whatever - so that the further reporting processing from there will not have to wait for the queries to complete. This is one of the main uses of materialized views.
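As a rough Oracle sketch of that idea (the view name, the stand-in query and the refresh interval are placeholders, not anything from the question):

-- Build once, then refresh on a schedule so reports read precomputed results.
CREATE MATERIALIZED VIEW report_summary_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE
  START WITH TRUNC(SYSDATE) + 1     -- first refresh at midnight tonight
  NEXT SYSDATE + 6/24               -- then roughly every six hours
AS
SELECT o.customer_id, SUM(o.amount) AS total_amount   -- stand-in for the slow report query
FROM orders o
GROUP BY o.customer_id;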
Good luck!

Can variables be declared and used in Epicor 10 BAQs

Is there a way to set and use variables in an Epicor Business Activity Query?
I am hoping there is something similar to the SQL functionality:
DECLARE @MyCounter int;
That I can then have access to in multiple calculated fields.
I'm currently using sub-queries to try and achieve my goal, which is sort of working, but it's not ideal and it's not at all elegant.
You can set parameters which apply to the entire query, but other than that, everything you can do in the BAQ designer is pretty much a row-level operation. Subqueries do get around this somewhat and are probably the right way to go, in truth. You can even create table functions in SQL and join those in if you want, which is pretty sweet (see the sketch below).
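For what such a table function can look like on the SQL Server side (the object and column names below are just placeholders, and wiring it into the BAQ is a separate step):

-- Inline table-valued function; a BAQ could treat its output like another table.
CREATE FUNCTION dbo.OpenOrdersByCustomer (@CustNum nvarchar(20))
RETURNS TABLE
AS
RETURN
(
    SELECT OrderNum, OrderDate, OrderAmt
    FROM dbo.OrderHed
    WHERE CustNum = @CustNum
      AND OpenOrder = 1
);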
It appears that you are trying to use an iterator for something, and that starts driving you towards Cursorville in the county of SQL, which the BAQ designer, for good reason, is really not equipped to handle. I might be off on this, so if you post a more specific example I can try to help you out.

Oracle PL/SQL: choosing the update/merge column dynamically

I have a table with data relating to several moments in time that I have to keep updated. To save space and time, however, each row in my table refers to a given day, and the hourly and quarter-hourly data for that day are scattered across several columns in that same row. When updating the data for a particular moment in time I therefore must choose the column to be updated through some programming logic in my PL/SQL procedures and functions.
Is there a way to dynamically choose the column or columns involved in an update/merge operation without having to assemble the query string anew every time? Performance is a concern and the throughput must be high, so I can't do anything that would perform poorly.
Edit: I am aware of the normalization issues. However, I would still like to know a good way of choosing the columns to be updated/merged dynamically and programmatically.
The only way to dynamically choose what column or columns to use for a DML statement is to use dynamic SQL. And the only way to use dynamic SQL is to generate a SQL statement that can then be prepared and executed. Of course, you can assemble the string in a more or less efficient manner, you can potentially parse the statement once and execute it multiple times, etc. in order to minimize the expense of using dynamic SQL. But using dynamic SQL that performs close to what you'd get with static SQL requires quite a bit more work.
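As a hedged sketch of what that can look like in PL/SQL (the readings table and its Q1_VALUE..Q4_VALUE columns are invented for illustration): the column name is concatenated into the statement, but the value and key still go in as bind variables, so only a handful of distinct statement texts ever get parsed.

DECLARE
  v_col  VARCHAR2(30);
  v_stmt VARCHAR2(200);
BEGIN
  -- pick the quarter-hour column procedurally (placeholder logic)
  v_col := 'Q' || TO_CHAR(FLOOR(EXTRACT(MINUTE FROM SYSTIMESTAMP) / 15) + 1) || '_VALUE';

  -- only the column name varies; the new value and the key are bind variables
  v_stmt := 'UPDATE readings SET ' || v_col || ' = :new_val WHERE day_key = :day_key';

  EXECUTE IMMEDIATE v_stmt USING 42, TRUNC(SYSDATE);
END;
/

With only a few possible column names, each statement text repeats quickly, which keeps the parse overhead modest.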
I'd echo Ben's point-- it doesn't appear that you are saving time by structuring your table this way. You'll likely get much better performance by normalizing the table properly. I'm not sure what space you believe you are saving but I would tend to doubt that denormalizing your table structure is going to save you much if anything in terms of space.
One way to do what is required is to create a package with all possible updates (which aren't that many, as I'll only update one field at a given time) and then choose which query to use depending on my internal logic. This would, however, lead to a big if/else or switch/case-like statement. Is there a way to achieve similar results with better performance?

Should I store reference data in my application memory, or in the database?

I am faced with the choice of where to store some reference data (essentially drop-down values) for my application. This data will not change (or if it does, I am fine with needing to restart the application), and will be frequently accessed as part of an AJAX autocomplete widget (so there may be several queries against this data by one user filling out one field).
Suppose each record looks something like this:
category
effective_date
expiration_date
field_A
field_B
field_C
field_D
The autocomplete query will need to check the input string against 4 fields in each record and discrete parameters against the category and effective/expiration dates, so if this were a SQL query, it would have a where clause that looks something like:
... WHERE category = ?
AND effective_date < ?
AND expiration_date > ?
AND (colA LIKE ? OR colB LIKE ? OR colC LIKE ?)
I feel like this might be a rather inefficient query, but I suppose I don't know enough about how databases optimize their indexes, etc. I do know that a lot of really smart people work really hard to make database engines really fast at this exact type of thing.
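(For reference, if this stayed in SQL, the usual first step would be a composite index matching the equality and range predicates, something like the placeholder below; the LIKE predicates only benefit from indexing when they are prefix searches such as 'abc%'.)

-- Placeholder names; supports the category filter and the date-range checks.
CREATE INDEX ix_reference_lookup
    ON reference_data (category, effective_date, expiration_date);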
The alternative I see is to store it in my application memory. I could have a list of these records for each category, and then iterate over each record in the category to see if the filter criteria is met. This is definitely O(n), since I need to examine every record in the category.
Has anyone faced a similar choice? Do you have any insight to offer?
EDIT: Thanks for the insight, folks. Sending the entire data set down to the client is not really an option, since the data set is so large (several MB).
Definitely cache it in memory if it's not changing during the lifetime of the application. You're right, you don't want to be going back to the database for each call, because it's completely unnecessary.
There can be debate about exactly how much to cache on the server (I tend to cache as little as possible until I really need to), but for information that will not change and will be accessed repeatedly, you should almost always cache it in the Application object.
Given the number of directions you're coming at this data from (filtering on 6 or more columns), I'm not sure how much more you'll be able to optimize the information in memory. The first thing I would try is to store it in a list in the Application object and query it using LINQ-to-objects. Or, if there is one field that is used significantly more than the others, try using a Dictionary keyed on that field instead of a list. If the performance continues to be a problem, try storing it in a DataSet and setting indexes on it (but of course you lose some code simplicity and maintainability this way).
I do not think there is a one-size-fits-all answer to your question. Depending on the data size and usage patterns the answer will vary. More than that, the answer may change over time.
This is why in my development I built an intermediate layer which allows me to change how the caching is done by changing configuration (with no code changes). Every so often we analyze various stats (cache hit ratio, etc.) and decide if we want to change the cache behavior.
BTW there is also a third layer - you can push your static data to the browser and cache it there too
Can you just hard-wire it into the program (as long as you stick to DRY)? Changing it only requires a rebuild.

How can I optimize a dynamic search query in Oracle

I am writing a stored procedure to perform a dynamic search that spans 10+ database tables. With millions of records in each table and a dynamic set of search parameters*, I am having some trouble optimizing the procedure.
Is there a "best practice" for building these kinds of queries? E.g. Use strings to build a dynamic query, use a huge list of IF THEN .. ELSE statements, etc? Can anyone provide a simple example or point me to some literature that will help? Here's some psuedocode for the stored procedure I am developing, which accepts a collection of parameters and a ref cursor.
v_query = "SELECT .....";
v_name = ... -- retrieve "name" parameter from collection
if v_name is not null then
v_query := v_query || ' AND table.Name = ' || v_name;
end if;
open search_cursor for v_query;
...
*By "dynamic set of search parameters," I mean that I pass in a collection of parameters. I figured this would be easier than making the caller pass in 20 parameters if they only want to search on one.
There are problems with using the static query approach; also be very careful about using the CURSOR_SHARING=FORCE option - it can really raise hell with your system if you haven't done a coverage test to ensure that all your other queries will work the way you want.
Problems with static queries:
The (x is null or x = col) predicates tend to kill any chance of using indexes. Since the query plan is computed the first time the query is parsed, the indexes used will be based on the values from that first run; later runs, which may not constrain on the same columns, will still use the same indexes.
Having one static statement with substitution variables will prevent the optimizer from making an intelligent choice about which index to use based on the data distribution. In a dynamic query (or in the first run of a query with bind variables), Oracle will see how selective your constraint is; a highly selective constraint will become a prime candidate for index use. For example, if your table had a row for every person in the U.S., STATE='Alaska' will be much more likely to use the index on STATE than STATE='California'.
Of course, in both these cases, if the dynamic columns in your WHERE clause are not indexed anyway, it doesn't matter, although I'd be surprised if that were the case in a database the size you're talking about.
Also, consider the real cost of all that hard parsing. Yes, hard parses serialize system resources, which makes them expensive, but only in the context of high volume queries. By their nature, ad-hoc queries do not get run very often. The cost you pay for all the hard parses you incur in an entire day will likely be hundreds of times less than the cost of a single query that uses the wrong indexes.
In the past, I've implemented these systems pretty much like you've done here - a base query portion, then iterating over a constraint list and adding WHERE clause predicates. I don't think it's hard for someone to maintain or understand, especially if you're talking about constraints that don't involve adding a lot of subqueries or extra tables to the FROM clause.
One thing to consider: If this system is primarily an offline one (in other words, not constantly being updated or inserted into - populated by periodic loads of bulk data), you may want to look into using BITMAP indexes. Bitmap indexes differ from regular b-tree indexes in that multiple indexes on a single table can be used simultaneously, and bitmap indexes are much, much smaller on disk than b-trees. They work very well for applications like this - where you will have a variety of constraints that can't be defined at design time. You will only want to put bitmap indexes on columns that have relatively few distinct values - say, one value constitutes no less than 1/1000 of the table - so don't use bitmaps on unique columns.
However, the downside is that bitmap indexes will noticeably degrade the performance of inserts and updates. The best practice for bitmaps is to use them in data warehouse applications, and they are dropped prior to loads and recreated afterwards.
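A minimal Oracle example, with placeholder table and column names:

-- Bitmap index on a low-cardinality column; in warehouse-style loads it is
-- common to drop it before the bulk load and recreate it afterwards.
CREATE BITMAP INDEX ix_orders_status ON orders (status);

-- before the periodic load:  DROP INDEX ix_orders_status;
-- after the load:            CREATE BITMAP INDEX ix_orders_status ON orders (status);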
Except in very particular cases, I don't think it is advisable (or even possible) to try to generate an optimized query. My advice is not to use dynamic SQL if you can avoid it: it is hard to read, hard to debug, hard to optimize, and hard to maintain.
First, write a generic query that will work with any parameters sent to your procedure. Based on your example, that would give something like:
SELECT * FROM table WHERE ((v_name IS NULL) OR (table.Name=v_name));
As you see, you could easily add other parameters to this query without using dynamic SQL. This query is much easier to read and debug. Ask your DBA for optimization tips.
Then, if you have a particular set of parameters that you know are often passed together, you could write a particular query for this set that you could specifically optimize. Pseudocode :
IF particular_set
THEN
/* Specific query */
ELSE
/* Generic query */
END IF;
The difficult part is to try not to have too many specific queries here, or you could fall into maintenance hell.
We've had a similar requirement for one of our clients. They have half a dozen tables with millions of rows, and they wanted ad-hoc search capability on most of the columns.
The solution was a separate package for each table, which would take the search criteria and construct the SQL to run the search. We took advantage of the old system that was being replaced, to discover what the most common types of searches the users were doing, and made sure that those searches ran the best, by tuning the queries that were being generated (supported by the strategic use of indexes). Because each package was only responsible for queries against one table, it could have specific code designed to work with that table (including the odd hint, in a few rare cases).
One question/problem that you'll need to address is, do you hard-code the criteria (e.g. WHERE SURNAME='SMITH') or use bind variables? Using bind variables reduces hard parsing, which reduces load on the database server; however it can be impractical to use bind variables when the SQL is dynamically generated. The way we ended up going was to set CURSOR_SHARING=FORCE (which has its own disadvantages) which was a reasonable compromise in our case.
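One hedged pattern for keeping bind variables even though the statement text is assembled dynamically is to reference every bind exactly once, so the USING list never changes shape. This is just a sketch with invented names (the people table is a placeholder), not the approach the client system actually used:

DECLARE
  v_query   VARCHAR2(4000);
  p_surname VARCHAR2(100) := 'SMITH';   -- would really be a procedure parameter
  rc        SYS_REFCURSOR;
BEGIN
  v_query := 'SELECT id, surname FROM people WHERE 1 = 1';

  IF p_surname IS NOT NULL THEN
    v_query := v_query || ' AND surname = :b_surname';
  ELSE
    v_query := v_query || ' AND (:b_surname IS NULL)';  -- keeps the bind count constant
  END IF;

  OPEN rc FOR v_query USING p_surname;
END;
/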
Read http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6711305251199
