Why "recursive sql" is used in "Dictionary-Managed Tablespaces" in oracle? - oracle

The Oracle documentation points out that locally managed tablespaces are better than dictionary-managed tablespaces in several respects. One is that recursive SQL is used when the database allocates free blocks in a dictionary-managed tablespace.
The fet$ table has the columns (TS#, FILE#, BLOCK#, LENGTH).
Could anyone explain why recursive SQL is used for allocation with fet$?

You seem to be interpreting 'recursive' in the normal programming sense; but it can have slightly different meanings:
drawing upon itself, referring back.
The recursive nature of stories which borrow from each other
(mathematics, not comparable) of an expression, each term of which is determined by applying a formula to preceding terms
(computing, not comparable) of a program or function that calls itself
...
If you interpret it as a recursive function (meaning 3) then it doesn't quite make sense; fet$ isn't updated repeatedly and an SQL statement doesn't re-execute itself. Here 'recursive' is used more generally (meaning 1, sort of), in the sense that the SQL you run generates another layer of SQL 'under the hood'. Not the same SQL or the same function called by itself, but 'SQL drawing upon SQL', or 'SQL referring back to SQL', if you like.
The concepts guide - which is where I think you got your question from - says:
Avoids using the data dictionary to manage extents
Recursive operations can occur in dictionary-managed tablespaces if
consuming or releasing space in an extent results in another operation
that consumes or releases space in a data dictionary table or undo
segment.
With a table in a dictionary managed tablespace (DMT), when you insert data Oracle has to run SQL statements against the dictionary tables to identify and allocate blocks. You don't normally notice that, but you can see it in trace files and other performance views. SQL statements will be run against fet$ etc. to manage the space.
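To make the 'SQL under the hood' idea concrete, here is a rough illustration of the flavour of the work involved - these are not the actual internal statements (those vary by version and are Oracle's business), just the sort of thing a trace might reveal: find a free extent in fet$, remove it from the free-extent table, record it as used.
-- illustrative only: find a free extent big enough in the right tablespace
SELECT file#, block#, length
FROM   sys.fet$
WHERE  ts# = :ts AND length >= :blocks_needed;
-- illustrative only: take that space off the free-extent table and record it
-- in the used-extent table (uet$) for the segment being extended
DELETE FROM sys.fet$ WHERE ts# = :ts AND file# = :file AND block# = :block;
INSERT INTO sys.uet$ (ts#, file#, block#, length /* plus the segment columns */)
VALUES (:ts, :file, :block, :length);
Each of those is a separate statement that Oracle has to parse and execute on your behalf, on top of your own insert.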
The 'recursive' part is that one SQL statement has to execute another (different) SQL statement; and that may in turn have to execute yet another (different again) SQL statement.
With a locally managed tablespace (LMT), block information is held in a bitmap within the tablespace itself. There is no dependence on the dictionary (for this, anyway). That extra layer of SQL is not needed, which saves time - both from the dictionary query itself and from potential concurrency delays, as multiple queries (across the database, for all tablespaces) access the dictionary at the same time. Managing that local block is much simpler and faster.
The concepts guide also says:
Note: Oracle strongly recommends the use of locally managed tablespaces with Automatic Segment Space Management.
As David says, there's no real benefit to using a dictionary-managed tablespace any more. Unless you've inherited an old database that still uses them - in which case migrating to LMT should be considered - or are just learning for the sake of it, you can pretty much forget about them; anything new should be using LMT, and references to DMTs are hopefully only of historic significance.
I wanted to demonstrate the difference by running a trace on the same insert statement against an LMT and a DMT, and showing the extra SQL statements in the trace file from the DMT version; but I can't find a DMT on any database I have access to, going back to 9i, which kind of backs up David's point I suppose. Instead I'll point you to yet more documentation:
Sometimes, to execute a SQL statement issued by a user, Oracle
Database must issue additional statements. Such statements are called
recursive calls or recursive SQL statements. For example, if you
insert a row into a table that does not have enough space to hold that
row, then Oracle Database makes recursive calls to allocate the space
dynamically. Recursive calls are also generated when data dictionary
information is not available in the data dictionary cache and must be
retrieved from disk.
You can use the tracing tools described in that document to compare for yourself, if you have access to a DMT; or you can search for examples.
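If you do want to try it yourself, tracing your own session is enough; for example (the trace file name below is made up - yours will depend on your instance settings), with tkprof's sys option keeping the recursive, SYS-issued statements visible:
-- enable tracing for your own session, run the insert, then turn it off
ALTER SESSION SET sql_trace = TRUE;
-- ... run the statement you want to examine ...
ALTER SESSION SET sql_trace = FALSE;
-- format the trace file without suppressing recursive statements
tkprof mydb_ora_12345.trc out.txt sys=yes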
You can see recursive SQL referred to elsewhere, usually in errors; the error isn't directly in the SQL you are executing, but in extra SQL Oracle issues internally in order to fulfil your request. LMTs just remove one instance where it used to be necessary, and in the process can remove a significant bottleneck.

Related

How can I utilize Oracle bind variables with Delphi's SimpleDataSet?

I have an Oracle 9 database from which my Delphi 2006 application reads data into a TSimpleDataSet using a SQL statement like this one (in reality it is more complex, of course):
select * from myschema.mytable where ID in (1, 2, 4)
My application starts up and executes this query quite often during the course of the day, each time with different values in the in clause.
My DBAs have notified me that this is creating excessive load on the database server, as the query is re-parsed on every run. They suggested using bind variables instead of building the SQL statement on the client.
I am familiar with using parameterized queries in Delphi, but from the article linked to above I get the feeling that that is not exactly what bind variables are. Also, I would need these prepared statements to work across different runs of the application.
Is there a way to prepare a statement containing an in clause once in the database and then have it executed with different parameters passed in from a TSimpleDataSet so it won't need to be reparsed every time my application is run?
My answer is not directly related to Delphi, but this problem in general. Your problem is that of the variable-sized in-list. Tom Kyte of Oracle has some recommendations which you can use. Essentially, you are creating too many unique queries, causing the database to do a bunch of hard-parsing. This will spike the CPU consumption (and DBA blood pressures) unnecessarily.
By making your query static, it can get by with a soft-parse or perhaps no parse at all! The DB can then cache the execution plan, the DBAs can deal with a more "stable" SQL, and overall performance should be improved.
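The recommendation essentially boils down to binding the whole list as one value and expanding it into rows inside the query, so the SQL text (and therefore the parsed statement) never changes. A sketch of one form of it, using a SQL collection type such as sys.odcinumberlist (or one you create yourself); whether your Delphi data access layer can bind a collection directly is a separate question - the common workaround is to bind a comma-separated string and parse it server-side with a small pipelined function:
-- one static statement, one bind variable carrying the whole list
SELECT *
FROM   myschema.mytable
WHERE  id IN (SELECT column_value
              FROM   TABLE(CAST(:id_list AS sys.odcinumberlist)));
However the list gets to the server, the key point is that the statement text is constant, so it is hard-parsed once and reused from then on.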

Using WITH(NOLOCK) to increase performance

I have seen developers using WITH(nolock) in queries; are there any disadvantages to it?
Also, what is the default mode of execution of a query? My database does not have any indexes.
Is there any other way to increase database select statement performance?
The common misconception with nolock is that it places no locks on the database whilst executing. Technically it does issue a schema-stability (Sch-S) lock, so the 'no' part of the lock relates to the data side of the query.
Most of the time that I see this, it is a premature optimization by a developer because they have heard it makes the query faster.
Unless you have instrumented proof of a benefit, and have validated that a dirty read (and potentially reading the same row twice) is acceptable, it should not be used - it definitely should not be the default approach to queries, but an exception to the rule, used only when it can be shown to be required.
There are numerous articles on this on the net. The main risk is that with NOLOCK you can read uncommitted data from the table (dirty reads). See, for example, http://msdn.microsoft.com/en-us/library/aa259216(v=sql.80).aspx or http://www.techrepublic.com/article/using-nolock-and-readpast-table-hints-in-sql-server/6185492
NOLOCK can be highly useful when you are reading old data from a frequently used table. Consider the following example,
You have a stored procedure to access data of inactive projects. You
don't want this stored procedure to lock the frequently used Projects
table while reading old data.
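In T-SQL that would look something like this (the table and column names are invented for the example):
SELECT ProjectId, ProjectName, ClosedDate
FROM   dbo.Projects WITH (NOLOCK)
WHERE  IsActive = 0;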
NOLOCK is also useful when dirty reads are not a problem and data is not frequently modified such as in the following cases,
Reading list of countries, currencies, etc... from a database to show
in the form. Here the data remains unchanged and a dirty read will
not cause a big problem as it will occur very rarely.
However, starting with SQL Server 2005, the benefit of NOLOCK is very small due to row versioning.
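The row versioning referred to here is read committed snapshot isolation, which is off by default and has to be switched on per database (the database name below is a placeholder, and the statement waits for other connections to the database to finish); once enabled, ordinary reads no longer block behind writers, without resorting to dirty reads:
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;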

Oracle xmltype extract function never deallocate/reclaim memory until session down

I'm using Oracle 9.2x to do some xmltype data manipulation.
The table is as simple as tabls(xml sys.xmltype), with about 10,000 rows stored. I use a cursor to loop over every row, then do something like
table.xml.extract('//.../text()','...').getStringVal();
I notice that the Oracle instance and the UGA/PGA keep allocating memory per execution of the xmltype.extract() function, until the machine runs out of available memory, even when dbms_session.free_unused_user_memory() is executed after each call of extract().
If the session is closed, the memory used by the Oracle instance is returned right away to what it was before the execution.
I'm wondering, how can I free/deallocate the memory allocated by the extract function in the same session?
Thanks.
PL/SQL variables and instantiated objects are held in session memory, which is why your program is hitting the PGA rather than the SGA. Without knowing some context it is difficult for us to give you specific advice. The general advice would be to consider how you could reduce the footprint of the variables in your PL/SQL.
For instance, you could include the extract() in the SQL statement rather than doing it in PL/SQL; retrieving just the data you want is always an efficient thing to do. Another possibility would be to use BULK COLLECT with the LIMIT clause to reduce the amount of data you're handling at any one point. A third approach might be to do away with the PL/SQL altogether and just use pure SQL. Pure SQL is way more efficient than switching between SQL and PL/SQL, because sets are better than RBAR. But like I said, because you haven't told us more about what you're trying to achieve we cannot tell whether your CURSOR LOOP is appropriate.
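As a rough sketch of the first two suggestions combined - the table name comes from the question, the XPath is a stand-in for the one elided there, and the LIMIT value is arbitrary - doing the extract in the query and fetching in batches looks something like this:
DECLARE
  TYPE t_vals IS TABLE OF VARCHAR2(4000);
  l_vals t_vals;
  CURSOR c IS
    SELECT t.xml.extract('/doc/item/text()').getStringVal()
    FROM   tabls t;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_vals LIMIT 500;  -- work in batches of 500 rows
    -- process l_vals here
    EXIT WHEN c%NOTFOUND;
  END LOOP;
  CLOSE c;
END;
/
That keeps only one small batch of extracted strings in PL/SQL memory at a time, instead of one XMLType method call's worth of workspace per row across the whole loop.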

ABAP select performance hints?

Are there general ABAP-specific tips related to performance of big SELECT queries?
In particular, is it possible to close once and for all the question of FOR ALL ENTRIES IN vs JOIN?
A few (more or less) ABAP-specific hints:
Avoid SELECT * where it's not needed, try to select only the fields that are required. Reason: Every value might be mapped several times during the process (DB Disk --> DB Memory --> Network --> DB Driver --> ABAP internal). It's easy to save the CPU cycles if you don't need the fields anyway. Be very careful if you SELECT * a table that contains BLOB fields like STRING, this can totally kill your DB performance because the blob contents are usually stored on different pages.
Don't SELECT ... ENDSELECT for small to medium result sets, use SELECT ... INTO TABLE instead.
Reason: SELECT ... INTO TABLE performs a single fetch and doesn't keep the cursor open while SELECT ... ENDSELECT will typically fetch a single row for every loop iteration.
This was a kind of urban myth - there is no performance degradation for using SELECT as a loop statement. However, this will keep an open cursor during the loop which can lead to unwanted (but not strictly performance-related) effects.
For large result sets, use a cursor and an internal table.
Reason: Same as above, and you'll avoid eating up too much heap space.
Don't ORDER BY, use SORT instead.
Reason: Better scalability of the application server.
Be careful with nested SELECT statements.
While they can be very handy for small 'inner result sets', they are a huge performance hog if the nested query returns a large result set.
Measure, Measure, Measure
Never assume anything if you're worried about performance. Create a representative set of test data and run tests for different implementations. Learn how to use ST05 and SAT.
There won't be a way to close your second question "once and for all". First of all, FOR ALL ENTRIES IN 'joins' a database table and an internal (memory) table, while JOIN only operates on database tables. Since the database knows nothing about the internal ABAP memory, the FOR ALL ENTRIES IN statement will be transformed into a set of WHERE clauses - just try it and use ST05 to trace this. Second, you can't add values from the second table when using FOR ALL ENTRIES IN. Third, be aware that FOR ALL ENTRIES IN always implies DISTINCT. There are a few other pitfalls - be sure to consult the online ABAP reference, they are all listed there.
If the number of records in the second table is small, both statements should be more or less equal in performance - the database optimizer should just preselect all values from the second table and use a smart joining algorithm to filter through the first table. My recommendation: Use whatever feels good, don't try to tweak your code to illegibility.
If the number of records in the second table exceeds a certain value, Bad Things [TM] happen with FOR ALL ENTRIES IN - the contents of the table are split into multiple sets, then the query is transformed (see above) and re-run for each set.
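Roughly speaking, what ST05 shows for a FOR ALL ENTRIES IN selection is the contents of the internal table pasted into the generated WHERE clause in batches - the exact shape and batch size depend on the database platform and profile parameters, and the table and field names below are just examples:
SELECT "MATNR", "MAKTX" FROM "MAKT" WHERE "MATNR" IN ( :A0 , :A1 , :A2 , :A3 , :A4 )
SELECT "MATNR", "MAKTX" FROM "MAKT" WHERE "MATNR" IN ( :A5 , :A6 , :A7 , :A8 , :A9 )
-- ...one statement per batch; the result sets are combined and de-duplicated,
-- which is where the implicit DISTINCT comes from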
Another note: The "Avoid SELECT *" statement is true in general, but I can tell you where it is false.
When you are going to take most of the fields anyway, and when you have several queries (in the same program, or in different programs that are likely to be run around the same time) which each take most of the fields, especially if each of them is missing a different set of fields.
This is because the App Server Data buffers are based on the select query signature. If you make sure to use the same query, then you can ensure that the buffer can be used instead of hitting the database again. In this case, SELECT * is better than selecting 90% of the fields, because you make it much more likely that the buffer will be used.
Also note that, as of the last version I tested, the ABAP DB layer wasn't smart enough to recognize SELECT A, B as being the same as SELECT B, A, which means you should always put the fields you take in the same order (preferably the table order) in order to make sure again that the data buffer on the application server is being well used.
I usually follow the rules stated in this pdf from SAP: "Efficient Database Programming with ABAP"
It shows a lot of tips in optimizing queries.
This question will never be completely answered.
The ABAP statements for accessing the database are interpreted several times by different components of the whole system (SAP and DB). The behavior of each component depends on the component itself, its version and its settings. The main part of the interpretation is done in the DB adapter on the SAP side.
The only viable approach for reaching maximum performance is measurement on the particular system (SAP version, DB vendor and version).
There are also quite extensive hints and tips in transaction SE30. It even allows you (depending on authorisations) to write code snippets of your own & measure it.
Unfortunately we can't close the "for all entries" vs join debate, as it is very dependent on how your landscape is set up, which database server you are using, the efficiency of your table indexes, etc.
The simplistic answer is: let the DB server do as much as possible. For the "for all entries" vs join question this means join. Except every experienced ABAP programmer knows that it's never that simple. You have to try different scenarios and measure, like vwegert said. Also remember to measure in your live system as well, as sometimes the hardware configuration or dataset is different enough to produce entirely different results in your live system than in test.
I usually follow the following conventions:
Never do a SELECT *; select only the required fields.
Never use INTO CORRESPONDING FIELDS OF TABLE; instead create local structures which have all the required fields.
In the WHERE clause, try to use as many primary key fields as possible.
If the select is made to fetch a single record and all primary key fields are included in the WHERE clause, use SELECT SINGLE; otherwise use SELECT ... UP TO 1 ROWS ... ENDSELECT.
Try to use JOIN statements to connect tables instead of using FOR ALL ENTRIES.
If FOR ALL ENTRIES cannot be avoided, ensure that the internal table is not empty and delete the duplicate entries to increase performance.
Two more points in addition to the other answers:
usually you use JOIN for two or more tables in the database and you use FOR ALL ENTRIES IN to join database tables with a table you have in memory. If you can, JOIN.
usually the IN operator is more convenient than FOR ALL ENTRIES IN. But the kernel translates IN into a long SELECT statement. The length of such a statement is limited and you get a dump when it gets too long. In this case you are forced to use FOR ALL ENTRIES IN despite the performance implications.
With in-memory database technologies, it's best if you can finish all data and calculations on the database side with JOINs and database aggregation functions like SUM.
But if you can't, at least try to avoid accessing database in LOOPs. Also avoid reading the database without using indexes, of course.

Slow Performance on Sql Express after inserting big chunks of data

We have noticed that our queries are running slower on databases that had big chunks of data added (bulk insert) when compared with databases that had the data added on a record-by-record basis, but with similar amounts of data.
We use SQL Server 2005 Express and we tried reindexing all indexes, without any better results.
Do you know of some kind of structural problem on the database that can be caused by inserting data in big chunks instead of one by one?
Thanks
One tip I've seen is to turn off Auto-create stats and Auto-update stats before doing the bulk insert:
ALTER DATABASE databasename SET AUTO_CREATE_STATISTICS OFF WITH NO_WAIT
ALTER DATABASE databasename SET AUTO_UPDATE_STATISTICS OFF WITH NO_WAIT
Afterwards, manually create statistics by one of two methods:
--generate statistics quickly using a sample of data from the table
exec sp_createstats
or
--generate statistics using a full scan of the table
exec sp_createstats @fullscan = 'fullscan'
You should probably also turn Auto-create and Auto-update stats back on when you're done.
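The corresponding statements to turn them back on simply mirror the ones above:
ALTER DATABASE databasename SET AUTO_CREATE_STATISTICS ON WITH NO_WAIT
ALTER DATABASE databasename SET AUTO_UPDATE_STATISTICS ON WITH NO_WAIT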
Another option is to check and defrag the indexes after a bulk insert. Check out Pinal Dave's blog post.
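On SQL Server 2005 the defrag itself can be done per table with ALTER INDEX (the table name here is just a placeholder):
-- lightweight, online defragmentation of all indexes on the table
ALTER INDEX ALL ON dbo.MyBigTable REORGANIZE
-- or a full rebuild, which also refreshes the index statistics
ALTER INDEX ALL ON dbo.MyBigTable REBUILD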
Probably SQL Server allocated the new disk space in many small chunks. When doing big transactions, it's better to pre-allocate plenty of space in both the data and log files.
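Pre-allocating is just an ALTER DATABASE against the data and log files; the file names and sizes below are purely illustrative (and keep the Express edition's database size limit in mind when choosing them):
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, SIZE = 2048MB, FILEGROWTH = 256MB)
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log,  SIZE = 512MB,  FILEGROWTH = 128MB)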
That's an interesting question.
I would have guessed that Express and non-Express have the same storage layout, so when you're Googling for other people with similar problems, don't restrict yourself to Googling for problems in the Express version. On the other hand though, bulk insert is a common-place operation and performance is important, so I wouldn't consider it likely that this is a previously-undetected bug.
One obvious question: which is the clustered index? Is the clustered index also the primary key? Is the primary key unassigned when you insert, and therefore initialized by the database? If so then maybe there's a difference (between the two insert methods) in the pattern or sequence of successive values assigned by the database, which affects the way in which the data is clustered, which then affects performance.
Something else: as well as indexes, people say that SQL Server uses statistics (which it created as a result of running previous queries) to optimize its execution plan. I don't know any details of that, but as well as "reindexing all indexes", check the execution plans of your queries in the two test cases to ensure that the plans are identical (and/or check the associated statistics).
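For the statistics side of that, you can inspect and refresh them explicitly in both databases and compare (the object and index names are placeholders):
-- show the statistics the optimizer has for a given index
DBCC SHOW_STATISTICS ('dbo.MyBigTable', 'PK_MyBigTable')
-- rebuild them from a full scan and re-test the query
UPDATE STATISTICS dbo.MyBigTable WITH FULLSCAN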
