I am trying to create partitions on pre-existing tables in my application. While doing same, I am facing space issues in the schema's default tablespaces.
Could someone please explain how the DBMS_REDEFINITION package works in terms of space utilization. Do it requires extra space to perform the task. If yes , why? And is it temporary ? do it release the space after completing the redefinition
Immediate Comments on this would be highly appreciated.
When you use DBMS_REDEFINITION package to redefine a table you need twice the space which is used by your table. Actually DBMS_REDEFINITION copies all data from old table to new table. You have to drop the old table manually after successfull redefition.
Related
I have created some tables in Greenplum, performing insert update and delete operation. Regularly I am also performing vacuum operation. I Found bloat in it. Found solution to remove bloat https://discuss.pivotal.io/hc/en-us/articles/206578327-What-are-the-different-option-to-remove-bloat-from-a-table
However, if I truncate the table and reinsert the data, it removes bloat. Is it good practice to truncate the data from the table?
If you are performing UPDATE and DELETE statements on a heap table (default storage) and running VACUUM regularly, you will get some bloat by design. Heap storage, which is similar to the default PostgreSQL storage mechanism, provides read consistency using Multi-Version Concurrency Control (MVCC).
When you UPDATE or DELETE a record, the old value is still in the table and is able to be read by transactions that are still inflight and started before you issued the UPDATE or DELETE command. This provides the read consistency to the table.
When you execute a VACUUM statement, the database will mark the stale rows as available to be overwritten. It doesn't shrink the files. It just marks rows so they can be overwritten. The next time you execute an INSERT or UPDATE, the stale rows are now able to be used for the new data.
So if you UPDATE or DELETE 10% of a table between running VACUUM, you will probably have about 10% bloat.
Greenplum also has Append-Optimized (AO) storage which doesn't use MVCC and uses a visibility map instead. The files are bit smaller too so you should get better performance. The stale rows are hidden with the visibility map and VACUUM won't do anything until you hit the gp_appendonly_compaction_threshold percentage. The default is 10%. When you have 10% bloat in an AO table and execute VACUUM, the table will automatically get rebuilt for you.
Append-Optimized is called "appendonly" for backwards compatibility reasons but it does allow UPDATE and DELETE. Here is an example of an AO table:
CREATE TABLE sales
(txn_id int, qty int, date date)
WITH (appendonly=true)
DISTRIBUTED BY (txn_id);
Instead of truncate it is better to use drop the table, create the table and then insert the data.
When adding a column to a table that has a default value and a constraint of not null. Is it better to run as a single statement or to break it into steps while the database is under load.
ALTER TABLE user ADD country VARCHAR2(4) DEFAULT 'GB' NOT NULL
VERSUS
ALTER TABLE user ADD country VARCHAR2(2)
UPDATE user SET country = 'GB'
COMMIT
ALTER TABLE user MODIFY country DEFAULT 'GB' NOT NULL
Performance depends on the Oracle version you use. Locks are generated anyway.
If version <= Oracle 11.1 then #1 does the same as #2. It is slow anyway.
Beginning with Oracle 11.2, Oracle introduced a great optimization for the first statement (one command doing it all). You don't need to change the command - Oracle just behaves differently. It stores the default value only in data dictionary instead of updating each physical row.
But I also have to say, that I encountered some bugs in the past related to this feature (in Oracle 11.2.0.1)
failure of traditional import if export was done with direct=Y
merge statement can throw an ORA-600 [13013] (internal oracle error)
a performance problem in queries using such tables
I think this issues are fixed in current version 11.2.0.3, so I can recommend to use this feature.
Some time ago we have evaluated possible solutions of the same problem. On our project we had to remove all indexes on table, perform altering and restore indexes back.
If your system needs to be using the table then DBMS_Redefinition is really your only choice.
I would like to know if dropping Oracle index and recreating them will pose any data issues if assuming these are done during scheduled downtime.
Recently discovered that some indexes were parked on incorrect table space, would like to correct it by dropping the index and recreating it on the correct table space.
Please kindly advise.
I don't see a problem with that, but instead of drop/create you could also use the syntax below:
alter index <INDEX_NAME> rebuild tablespace <TABLESPACE_NAME>
To address what you asked in the comment below, the alter index rebuild should be faster. The reason for that is when you drop the index and create it again, index tree will be built from the table itself. But with alter index rebuild, Oracle reads the index itself, thus resulting in a smaller amount of I/O.
I have tried to find examples but they are all simple with a single where clause. Here is the situation. I have a bunch of legacy data transferred from another database. I also have the "good" tables in that same database. I need to transfer (data-conversion) data from the legacy tables to thew tables. Because this is a different set of tables the data-conversion requires complex joins to put the old data into the new tables correctly.
So, old tables old data.
New tables must have the old data but it requires lots of joins to get that old data into the new tables correctly.
Can I use direct path with lots of joins like this? INSERT SELECT (lots of joins)
Does direct path apply to tables that are already on the same database (transfer between tables)? Is it only for loading tables from say a text file?
Thank you.
The query in your SELECT can be as complex as you'd like with a direct-path insert. The direct-path refers only to the destination table. It has nothing to do with the way that data is read or processed.
If you're doing a direct-path insert, you're asking Oracle to insert the new data above the high water mark of the table so you bypass the normal code that reuses space in existing blocks for new rows to be inserted. It also has to block other inserts since you can't have the high water mark of the table change during a direct-path insert. This probably isn't a big deal if you've got a downtime window in which to do the load but it would be quite problematic if you wanted the existing tables to be available for other applications during the load.
No, on the contrary, it means you need to do a backup after a NOLOGGING load, not that you can't backup the database.
Allow me to elaborate a bit. Normally, when you do DML in Oracle, the before images of the changes you are are making get logged in UNDO, and all the changes (including the UNDO changes) are first written to REDO. This is how Oracle manages transactions, instance recovery, and database recovery. If a transaction is aborted or rolled back, Oracle uses the information in UNDO to undo the changes your transaction made. If the instance crashes, then on instance restart, Oracle will use the information in REDO and UNDO to recover up to the last committed transaction. First, Oracle will read the REDO and roll forward, then, use UNDO to roll back all the transactions that were not committed at the time of the crash. In this way, Oracle is able to recover up to the last committed transaction.
Now, when you specify an APPEND hint on an insert statement, Oracle will execute the INSERT with direct load. This means that data is loaded into brand new, never before used blocks, from above the highwater mark. Because the blocks being loaded are brand new, there is no "before image", so, Oracle can avoid writing UNDO, which improves performance. If the database is in NOARCHIVELOG mode, then Oracle will also not write REDO. On a database in ARCHIVELOG mode, Oracle will still write REDO, unless, before you do the insert /*+ append */, you set the table to NOLOGGING, (i.e. alter table tab_name nologging;). In that case, REDO logging is disabled for the table. However, this is where you could run into backup/recovery implications. If you do a NOLOGGING direct load, and then you suffer a media failure, and the datafile containing the segment with the nologging operation is restored from a backup taken before the nologging load, then the redo log will not contain the changes required to recover that segment. So, what happens? Well, when you do a NOLOGGING load, Oracle writes extent invaldation records to the redo log, instead of the actual changes. Then, if you use that redo in recovery, those data blocks will be marked logically corrupt. Any subsequent queries against that segment will get an ORA-26040 error.
So, how to avoid this? Well, you should always take a backup imediately following any NOLOGGING direct load. If you restore/recover from a backup taken after the nologging load, there is no problem, because the data will be in the datablocks in the file that was restored.
Hope that's clear,
-Mark
Yes, there should not be any arbitrary limits on query complexity.
If you do
insert /*+ APPEND */ into target_table select .... from source1, source2..., sourceN where
It should work fine. Consider though, that the performance of the load will be limited by the performance of that query, so, be sure it's well-tuned, if you're expecting good performance.
Finally, consider whether setting NOLOGGING on the target table would improve performance significantly. But, also consider the backup recovery implications, if you decide to implement NOLOGGING.
Hope that helps,
-Mark
I am writing a data conversion in PL/SQL that processes data and loads it into a table. According to the PL/SQL Profiler, one of the slowest parts of the conversion is the actual insert into the target table. The table has a single index.
To prepare the data for load, I populate a variable using the rowtype of the table, then insert it into the table like this:
insert into mytable values r_myRow;
It seems that I could gain performance by doing the following:
Turn logging off during the insert
Insert multiple records at once
Are these methods advisable? If so, what is the syntax?
It's much better to insert a few hundred rows at a time, using PL/SQL tables and FORALL to bind into insert statement. For details on this see here.
Also be careful with how you construct the PL/SQL tables. If at all possible, prefer to instead do all your transforms directly in SQL using "INSERT INTO t1 SELECT ..." as doing row-by-row operations in PL/SQL will still be slower than SQL.
In either case, you can also use direct-path inserts by using INSERT /*+APPEND*/, which basically bypasses the DB cache and directly allocates and writes new blocks to data files. This can also reduce the amount of logging, depending on how you use it. This also has some implications, so please read the fine manual first.
Finally, if you are truncating and rebuilding the table it may be worthwhile to first drop (or mark unusable) and later rebuild indexes.
Regular insert statements are the slowest way to get data in a table and not meant for bulk inserts. The following article references a lot of different techniques for improving performance: http://www.dba-oracle.com/oracle_tips_data_load.htm
Drop the index, then insert the rows, then re-create the index.
If dropping the index doesn't speed things up enough, you need the Oracle SQL*Loader:
http://www.oracle.com/technology/products/database/utilities/htdocs/sql_loader_overview.html
Suppose you have taken eid,ename,sal,job. So create a table first as:
SQL>create table tablename(eid number, ename varchar2(20),sal number,job char(10));
Now insert data:-
SQL>insert into tablename values(&eid,'&ename',&sal,'&job');
Check this link
http://www.dba-oracle.com/t_optimize_insert_sql_performance.htm
main points to consider for your
case is to use Append hint as this
will directly append into the table
instead of using freelist. If you can afford to turn off logging than use append with nologging hint to do it
Use a bulk insert instead instead of iterating in PL/SQL
Use sqlloaded to load the data directly into the table if you are getting data from a file feed
Here are my recommendations on fast insert.
Trigger - Disable any triggers associated with a table. Enable after Inserts are complete.
Index - Drop Index and re-create it after your Inserts are complete.
Stale stats - Re-analyze table and index stats.
Index de-fragmentation - Rebuild Index if needed
Use No Logging -Insert using INSERT APPEND (Oracle only). This approach is very risky approach, no redo logs are generated therefore you can’t do a rollback - make a backup of table before you start and don't try on live tables. Check if your db has similar option
Parallel Insert: Running parallel insert will get the job faster.
Use Bulk Insert
Constraints - Not much overhead during inserts but still a good idea to check, if it is still slow after even after step 1
You can learn more on http://www.dbarepublic.com/2014/04/slow-insert.html
Maybe one of your best option is to avoid Oracle as much as possible actually.
I've been baffled by this myself, but very often a Java process can outperform many of the Oracle's utilities which either use OCI (read: SQL Plus) or will take up so much of your time to get right (read: SQL*Loader).
This doesn't prevent you to use specific hints either (like /APPEND/).
I've been pleasantly surprised each time I've turned to that kind of solution.
Cheers,
Rollo