It is said that we should always truncate a large table before dropping it, because it improves performance. Is that true?
IMO in general if you simply want to drop a table then DROP is appropriate. It will release space the same way as TRUNCATE would and it will have the advantage of being atomic (no query will have the opportunity to see the table "empty").
From 10g+, a dropped table won't be deleted immediately however: if there is sufficient space it will be put in the recycle bin. If you truncate a table first, no data will remain in the recycle bin. This may be why you have been told to truncate first (?).
In any case, if you want to bypass the recycle bin you could issue DROP TABLE your_table PURGE and this statement will be atomic.
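To make the difference concrete, here is a small sketch. It assumes 10g or later with the recycle bin enabled; your_table is just a placeholder name.

```sql
-- Plain DROP: the table is moved to the recycle bin and can still be restored.
DROP TABLE your_table;

-- List what is currently in the recycle bin for this schema.
SELECT object_name, original_name, type, droptime
  FROM user_recyclebin;

-- Bring the table back under its original name.
FLASHBACK TABLE your_table TO BEFORE DROP;

-- Drop it for good, bypassing the recycle bin.
DROP TABLE your_table PURGE;
```

If tables have already been dropped without PURGE, PURGE RECYCLEBIN; clears them out of the current schema's recycle bin.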
It depends entirely on whether you want to be able to roll back if something goes wrong.
Deletion of data records the deletion against the transaction logs of the database until you commit the change.
Truncation removes all the data from the table without logging each deleted row, so there can be a significant performance improvement in doing this. Just be sure you know what you are doing, as a TRUNCATE cannot be rolled back.
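A quick way to see the difference for yourself, as a sketch against a scratch table (your_table is a placeholder, do not try this on data you care about):

```sql
-- DELETE is DML: the rows can be brought back until you commit.
DELETE FROM your_table;
ROLLBACK;                      -- the deleted rows are visible again

-- TRUNCATE is DDL: it commits implicitly before and after it runs.
TRUNCATE TABLE your_table;
ROLLBACK;                      -- has no effect, the table stays empty

SELECT COUNT(*) FROM your_table;
```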
It may be a good idea in order to reset the high water mark.
Does anyone know exactly how querying past data works?
The Oracle version is 10g.
With this query I can recover some data, but sometimes this query
select *
from table as of timestamp systimestamp - 1
returns an error (ORA-01555: snapshot too old).
Is it possible to extend the time window so that this works and I can retrieve data from about 24 hours ago? Thanks!
The key issue here is the sizing of the undo segments, and the undo retention and guarantee.
The long and short of it is that you need your undo tablespace sized to hold all of the changes that can be made within the maximum period that you want to flash back over, and you'd want to set the undo retention parameter to that period. If it is really critical to your application that the undo is preserved, then set the retention guarantee on the undo tablespace.
Useful docs: http://docs.oracle.com/cd/B12037_01/server.101/b10739/undo.htm#i1008577
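As a rough illustration of the settings mentioned above: the tablespace name undotbs1 and the 24-hour figure are assumptions, so check your own undo tablespace name and sizing first. These statements need DBA privileges.

```sql
-- See the current undo settings.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('undo_management', 'undo_retention', 'undo_tablespace');

-- Ask Oracle to keep undo for 24 hours (the value is in seconds).
ALTER SYSTEM SET undo_retention = 86400;

-- Optionally guarantee that unexpired undo is never overwritten.
-- undotbs1 is an assumed name; use the tablespace reported above.
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;

-- Confirm the guarantee setting.
SELECT tablespace_name, retention
  FROM dba_tablespaces
 WHERE contents = 'UNDO';
```

Keep in mind that with RETENTION GUARANTEE, DML can fail when the undo tablespace fills up, so the tablespace has to be big enough for a full day of changes.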
Be aware that performance of flashback is rather poor for bulk data, as the required undo blocks need to be found in the tablespace. 11g has better options for high performance flashback.
What the error means is that the undo (rollback) data the query needs is no longer available, usually because the query took too long. There are other causes, such as rollback segment sizing.
How many rows are in the table? You can get an idea from this query (num_rows is only as current as the last statistics gathering):
select num_rows
from all_tables
where table_name='MYTABLE_NAME_GOES_HERE';
If there are LOTS of rows, you may need to look at adding some kind of index to support your query, because a full table scan takes too long. If not, then it is a DBA issue. Adding an index may be a DBA issue in your shop as well.
If this worked well a few days ago, and started happening lately, you probably just passed the threshold for the rollback.
I have a trigger that checks another couple of tables before allowing a row to be inserted. However between the time I check the other tables and insert the row the other tables may get updated.
How do I ensure the tables I'm checking remain in a consistent state until after the new row is inserted? I was thinking of taking locks out but everything I've read boils down to if you are not leaving locking to Oracle you're almost certainly doing it wrong.
Oracle is already doing this for you: by default, each query sees all tables as of the time that query started (or as of the start of the transaction, if you use the SERIALIZABLE isolation level). This won't stop the data from being changed under you though; your query just won't see it being changed. If you want to stop that data from being changed then you can use "SELECT FOR UPDATE" as Justin Cave suggests.
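A minimal sketch of the SELECT FOR UPDATE approach. The accounts and orders tables and their columns are made up for illustration; substitute the tables your trigger actually checks.

```sql
-- Lock the row the check depends on (hypothetical table and columns).
SELECT credit_limit
  FROM accounts
 WHERE account_id = 42
   FOR UPDATE;

-- Other sessions can still read that row, but they cannot update or
-- delete it until this transaction ends.
INSERT INTO orders (order_id, account_id, amount)
VALUES (1001, 42, 250);

COMMIT;  -- releases the row lock
```

This only locks the rows you select, so it is much less intrusive than LOCK TABLE, but it does serialize concurrent sessions that need the same rows.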
I would seriously question what you are doing though: triggers, except in the most trivial cases, almost always lead to unexpected side effects.
The SQL command TRUNCATE in Oracle is faster than DELETE FROM table; because the TRUNCATE command first drops the specified table in its entirety and then creates a new table with the same structure (please correct me if I am wrong). Since TRUNCATE is DDL, it implicitly issues a COMMIT before being executed and after execution completes. If that is the case, the table dropped by the TRUNCATE command is lost permanently, along with its entire structure in the data dictionary. In such a scenario, how is the TRUNCATE command able to first drop the table and then recreate it with the same structure?
(Note that I work for Sybase in SQL Anywhere engineering and my answer comes from my knowledge of how truncate is implemented there, but I imagine it's similar in Oracle as well.)
I don't believe the table is actually dropped and re-created; the contents are simply thrown away. This is much faster than delete from <table> because no triggers need to be executed, and rather than deleting a row at a time (both from the table and the indexes), the server can simply throw away all pages that contain rows for that table and any indexes.
I thought a truncate (amongst other things) simply reset the High Water Mark.
see: http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/statements_10007.htm#SQLRF01707
However, in
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2816964500346433991
it is clear that the data segment changes after a truncate.
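One way to observe this yourself is to compare the dictionary identifiers before and after. This is a sketch that assumes a scratch table named t that already contains rows.

```sql
-- object_id identifies the dictionary object, data_object_id the data segment.
SELECT object_id, data_object_id
  FROM user_objects
 WHERE object_name = 'T'
   AND object_type = 'TABLE';

TRUNCATE TABLE t;

-- Same query again: object_id stays the same (the table was not dropped and
-- recreated), while data_object_id normally changes.
SELECT object_id, data_object_id
  FROM user_objects
 WHERE object_name = 'T'
   AND object_type = 'TABLE';
```

So the table definition survives untouched in the data dictionary; only the segment holding the data is swapped out.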
I have tried to find examples but they are all simple, with a single where clause. Here is the situation. I have a bunch of legacy data transferred from another database. I also have the "good" tables in that same database. I need to transfer (data-conversion) data from the legacy tables to the new tables. Because this is a different set of tables, the data conversion requires complex joins to put the old data into the new tables correctly.
So, old tables old data.
New tables must have the old data but it requires lots of joins to get that old data into the new tables correctly.
Can I use direct path with lots of joins like this? INSERT SELECT (lots of joins)
Does direct path apply to tables that are already on the same database (transfer between tables)? Is it only for loading tables from say a text file?
Thank you.
The query in your SELECT can be as complex as you'd like with a direct-path insert. The direct-path refers only to the destination table. It has nothing to do with the way that data is read or processed.
If you're doing a direct-path insert, you're asking Oracle to insert the new data above the high water mark of the table so you bypass the normal code that reuses space in existing blocks for new rows to be inserted. It also has to block other inserts since you can't have the high water mark of the table change during a direct-path insert. This probably isn't a big deal if you've got a downtime window in which to do the load but it would be quite problematic if you wanted the existing tables to be available for other applications during the load.
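For example, something along these lines should be fine. All table and column names here are invented for illustration.

```sql
-- Direct-path insert fed by a multi-table join (hypothetical schema).
INSERT /*+ APPEND */ INTO new_invoices (invoice_id, customer_id, total_amount)
SELECT li.invoice_no,
       c.customer_id,
       SUM(ll.amount)
  FROM legacy_invoices  li
  JOIN legacy_customers lc ON lc.cust_code  = li.cust_code
  JOIN customers        c  ON c.legacy_code = lc.cust_code
  JOIN legacy_lines     ll ON ll.invoice_no = li.invoice_no
 GROUP BY li.invoice_no, c.customer_id;

-- Until the transaction ends, this session cannot query or modify
-- new_invoices again (ORA-12838), so commit right after the load.
COMMIT;
```

One thing to watch for: if the target table has enabled triggers or foreign key constraints, Oracle silently falls back to a conventional insert and the hint has no effect.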
No, on the contrary, it means you need to do a backup after a NOLOGGING load, not that you can't backup the database.
Allow me to elaborate a bit. Normally, when you do DML in Oracle, the before images of the changes you are making get logged in UNDO, and all the changes (including the UNDO changes) are first written to REDO. This is how Oracle manages transactions, instance recovery, and database recovery. If a transaction is aborted or rolled back, Oracle uses the information in UNDO to undo the changes your transaction made. If the instance crashes, then on instance restart, Oracle will use the information in REDO and UNDO to recover up to the last committed transaction. First, Oracle will read the REDO and roll forward, then use UNDO to roll back all the transactions that were not committed at the time of the crash. In this way, Oracle is able to recover up to the last committed transaction.
Now, when you specify an APPEND hint on an insert statement, Oracle will execute the INSERT with direct load. This means that data is loaded into brand new, never before used blocks, above the high water mark. Because the blocks being loaded are brand new, there is no "before image", so Oracle can avoid writing UNDO, which improves performance. If the database is in NOARCHIVELOG mode, then Oracle will also not write REDO. On a database in ARCHIVELOG mode, Oracle will still write REDO, unless, before you do the insert /*+ append */, you set the table to NOLOGGING (i.e. alter table tab_name nologging;). In that case, REDO logging is disabled for the table. However, this is where you could run into backup/recovery implications. If you do a NOLOGGING direct load, and then you suffer a media failure, and the datafile containing the segment with the nologging operation is restored from a backup taken before the nologging load, then the redo log will not contain the changes required to recover that segment. So, what happens? Well, when you do a NOLOGGING load, Oracle writes extent invalidation records to the redo log, instead of the actual changes. Then, if you use that redo in recovery, those data blocks will be marked logically corrupt. Any subsequent queries against that segment will get an ORA-26040 error.
So, how to avoid this? Well, you should always take a backup immediately following any NOLOGGING direct load. If you restore/recover from a backup taken after the nologging load, there is no problem, because the data will be in the data blocks in the file that was restored.
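Put together, the sequence might look like the sketch below. target_table and staging_table are placeholder names, and the last query needs access to V$DATAFILE.

```sql
-- Switch off redo generation for direct-path loads into this table.
ALTER TABLE target_table NOLOGGING;

INSERT /*+ APPEND */ INTO target_table
SELECT * FROM staging_table;
COMMIT;

-- Put the table back to normal logging once the load is done.
ALTER TABLE target_table LOGGING;

-- Datafiles that have had unrecoverable (nologging) changes, and when.
-- These are the files to back up right after the load.
SELECT name, unrecoverable_change#, unrecoverable_time
  FROM v$datafile
 WHERE unrecoverable_time IS NOT NULL;
```

If the database or tablespace is in FORCE LOGGING mode, the NOLOGGING attribute is ignored and redo is written anyway.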
Hope that's clear,
-Mark
Yes, there should not be any arbitrary limits on query complexity.
If you do
insert /*+ APPEND */ into target_table select .... from source1, source2..., sourceN where
It should work fine. Consider though, that the performance of the load will be limited by the performance of that query, so, be sure it's well-tuned, if you're expecting good performance.
Finally, consider whether setting NOLOGGING on the target table would improve performance significantly. But, also consider the backup recovery implications, if you decide to implement NOLOGGING.
Hope that helps,
-Mark
We have a mature Oracle database application (in production for over 10 years), and during that time, we have been using scripts of our own devising to remove old data that is no longer needed. They work by issuing delete statements against the appropriate tables, in a loop with frequent commits, in order to avoid overloading the system with i/o or using too much undo space.
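For readers who haven't seen this pattern, a loop of the kind described might look roughly like the following. The invoices table, the created_at column, the three-month cutoff and the batch size are all assumptions for illustration, not our actual scripts.

```sql
BEGIN
  LOOP
    -- Delete at most 10000 old rows per pass to keep undo usage small.
    DELETE FROM invoices
     WHERE created_at < ADD_MONTHS(SYSDATE, -3)
       AND ROWNUM <= 10000;

    -- Stop once a pass finds nothing left to delete.
    EXIT WHEN SQL%ROWCOUNT = 0;

    COMMIT;
  END LOOP;
  COMMIT;
END;
/
```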
They work fine, for the most part. They run daily, and it takes about an hour to remove the oldest day's worth of data from the system. The main concerns I have are the effects that all this deleting may have on tables and indexes, and the fact that even though the scripts don't overly load the system, deleting one day's worth of data in that short time does have the effect of blowing out the instance's buffer cache, resulting in subsequent queries running slightly slower for the next few hours as the cache is gradually restored.
For years we've been considering better methods. In the past, I had heard that people used partitioned tables to manage old data reaping - one month per partition, for example, and dropping the oldest partition on a monthly basis. The main drawback to this approach is that our reaping rules go beyond "remove month X". Users are allowed to specify how long data must stay in the system, based on key values (e.g., in an invoice table, account foo can be removed after 3 months, but account bar may need to remain for 2 years).
There is also the issue of referential integrity; Oracle documentation talks about using partitions for purging data mostly in the context of data warehouses, where tables tend to be hypercubes. Ours is closer to the OLTP end of things, and it is common for data in month X to have relationships to data in month Y. Creating the right partitioning keys for these tables would be ticklish at best.
As for the cache blowouts, I have read a bit about setting up dedicated buffer caches, but it seems like it's more on a per-table basis, as opposed to a per-user or per-transaction basis. To preserve the cache, I'd really like the reaping job to only keep one transaction's worth of data in the cache at any time, since there is no need to keep the data around once deleted.
Are we stuck using deletes for the foreseeable future, or are there other, more clever ways to deal with reaping?
For the most part I think that you're stuck doing deletes.
Your comments on the difficulty of using partitions in your case probably do prevent them being used effectively (different delete dates being used depending on the type of record), but is it possible that you could create a "delete date" column on the records that you could partition on? It would have the disadvantage of making updates quite expensive, as a change in the delete date might move the row to a different partition (which requires row movement to be enabled), so your update would really be implemented as a delete and insert.
It could be that even then you cannot use DDL partition operations to remove old data because of the referential integrity issues, but partitioning still might serve the purpose of physically clustering the rows to be deleted so that fewer blocks need to be modified in order to delete them, mitigating the impact on the buffer cache.
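A sketch of what the "delete date" idea could look like. The table, columns and partition boundaries are invented for illustration, and the DROP PARTITION step only works if referential integrity allows it.

```sql
CREATE TABLE invoices (
  invoice_id   NUMBER PRIMARY KEY,
  account_id   NUMBER NOT NULL,
  delete_date  DATE   NOT NULL   -- computed from the per-account retention rule
)
PARTITION BY RANGE (delete_date) (
  PARTITION p2024_01 VALUES LESS THAN (DATE '2024-02-01'),
  PARTITION p2024_02 VALUES LESS THAN (DATE '2024-03-01'),
  PARTITION pmax     VALUES LESS THAN (MAXVALUE)
)
ENABLE ROW MOVEMENT;  -- lets an update of delete_date move a row between partitions

-- Reaping one month becomes a dictionary operation instead of row-by-row deletes.
ALTER TABLE invoices DROP PARTITION p2024_01 UPDATE GLOBAL INDEXES;
```

Even if you end up running plain DELETEs against one partition instead of dropping it, the rows to be removed sit together in far fewer blocks.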
Deletes aren't that bad, provided that you rebuild your indexes. Oracle will recover the pages that no longer contain data.
However, as of 8i (and quite probably still), it would not properly recover index pages that no longer contained valid references. Worse, since the index leaves were chained, you could get into a situation where it would start walking the leaf nodes to find a row. This would cause a rather significant drop in performance: queries that would normally take seconds could take minutes. The drop was also very sudden: one day it would be fine, the next day it wouldn't.
I discovered this behavior (there was an Oracle bug for it, so other people have too) with an application that used increasing keys and regularly deleted data. Our solution was to invert portions of the key, but that's not going to help you with dates.
What if you temporarily deactivate indexes, perform the deletes and then rebuild them? Would it improve the performance of your deletes? Of course, in this case you have to make sure the scripts are correct and ensure proper delete order and referential integrity.
We have the same problem, using the same strategy.
If the situation becomes really bad (very fragmented allocation of indexes, tables, ...), we try to apply space reclamation actions.
Tables have to allow row movement (like for the flashback):
alter table TTT enable row movement;
alter table TTT shrink space;
and then rebuild all indexes.
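To check whether the shrink actually buys you anything, you can compare the segment size before and after. This is a sketch that assumes the table is called TTT and lives in a tablespace with automatic segment space management, which SHRINK SPACE requires.

```sql
-- Size before.
SELECT segment_name, blocks, bytes / 1024 / 1024 AS mb
  FROM user_segments
 WHERE segment_name = 'TTT';

ALTER TABLE ttt ENABLE ROW MOVEMENT;

-- Optional first pass: compacts the rows but leaves the high water mark alone.
ALTER TABLE ttt SHRINK SPACE COMPACT;

-- Second pass: lowers the high water mark and also shrinks dependent indexes.
ALTER TABLE ttt SHRINK SPACE CASCADE;

-- Size after.
SELECT segment_name, blocks, bytes / 1024 / 1024 AS mb
  FROM user_segments
 WHERE segment_name = 'TTT';
```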
I don't know what your maintenance windows look like. If the application has to be usable all the time, this is harder; if not, you can do some "repacking" while it is off-line. "alter table TTT move tablespace SSSS" does a lot of work cleaning up the mess, as the table is rewritten (its indexes become UNUSABLE afterwards and have to be rebuilt). You can also specify new storage parameters such as extent management, sizes, and so on; take a look in the docs.
I use a script like this to create a script for the whole database:
SET SQLPROMPT "-- "
SET ECHO OFF
SET NEWPAGE 0
SET SPACE 0
SET PAGESIZE 0
SET FEEDBACK OFF
SET HEADING OFF
SET TRIMSPOOL ON
SET TERMOUT OFF
SET VERIFY OFF
SET TAB OFF
spool doit.sql
select 'prompt Enabling row movement in '||table_name||'...'||CHR(10)||
       'alter table '||table_name||' enable row movement;'
  from user_tables
 where table_name not like '%$%'
   and table_name not like '%QTAB'
   and table_name not like 'SYS_%';
select 'prompt Setting initial ext for '||table_name||'...'||CHR(10)||
       'alter table '||table_name||' move storage (initial 1m);'
  from user_tables
 where table_name not like '%$%'
   and table_name not like '%QTAB'
   and table_name not like 'SYS_%';
select 'prompt Shrinking space for '||table_name||'...'||CHR(10)||
       'alter table '||table_name||' shrink space;'
  from user_tables
 where table_name not like '%$%'
   and table_name not like '%QTAB'
   and table_name not like 'SYS_%';
select 'prompt Rebuilding index '||index_name||'...'||CHR(10)||
       'alter index '||index_name||' rebuild;'
  from user_indexes
 where status = 'UNUSABLE';
spool off
prompt now check and then run @doit.sql
exit