Delete on table in oracle standard edition(no partition) gets slow with time.
Important Info: I am working on oracle standard edition so partitioning option available.
Detail:
I have one table with no constraint on it (no PK or anyother key or trigger or index or anything).
More than a million record gets inserted in this table in every 15 min using sql loader.
we need to process this 15 min record in every 15 min and at end of process delete any record older than 30 minute so that at any point of time there is more than 30-40 minute of data in the table.
Problem:
As time passes due to so frequent insertion and deletion response from the table gets slow.
Data extraction and delete from table takes more time with every passing run.
After a while even a simple select query takes too long.
We cant truncate table as data loader runs continously and we may loose data if truncate and we dont have create table access to drop and create table.
we have to process data in every 15 minute and made it available to downstream for further processing. it just keep getting slow.
Kindly help me with the aforementioned situation.
Related
Purpose: Remove duplicate records from large table.
My Process:
Create table 2 with 9 fields. No indexes. Same data_types per field as Table 1.
insert 9 fields, all records into table 2 from existing table 1
Table 1 contains 71+ Million rows and 232 columns and many duplicate records.
No joins. No Where Clause.
Table 1 contains several indexes.
8 fields are required to get unique records.
I'm trying to set up a process to de-dup large tables, using dense_rank partitioning to identify the most recently entered duplicate. Thus, those 8 required fields from Table 1 plus the auto-increment from Table 1 are loaded into Table 2.
Version: 10.5.17 MariaDB
The next steps would be:
Create new Table 3 identical to table 1 but with no indexes.
Load all data from Table 1 into Table 3, joining Table 1 to Table 2 on the auto-increment fields, where table 2.Dense_Rank field value = 1. This inserts ~17 Million unique records
Drop any existing Foreign_Keys related to Table 1
Truncate Table 1
Insert all records from Table 3 into Table 1
Nullify columns in related tables where the foreign key values in Table 1 no longer exist
re-create Foreign Keys that had been dropped.
Creating a test instance of an existing system I accomplish everything I need to once - only the first time. But If I then drop table 2 before refreshing Table 1 as outlined immediately above, re-create and try to reload, workbench shows query running until 7200 second timeout.
While the insert into Table 2 query is running, opening a second instance of Workbench and selecting count of records in table 2 after 15 minutes gives me the 71+ Million records I'm looking for, but Workbench continues running until timeout.
The query shows up in Show Processlist for those 15 minutes, but disappears around the 15 minute mark - presumably once all records are loaded.
I have tried running with timeouts set to 0 as well as 86,400 seconds, indicating no read timeout and 24 hours timeout, respectively, but query still times out at 7200.0xx seconds, or 2 hours, every time.
The exact error message I get is: Error Code: 2013. Lost connection to MySQL server during query 7200.125 sec
I have tried running the insert statement with COMMIT and without.
This is being done in a Test Instance set up for this development where I am the only user, and only a single table is in use during the insert process.
Finding one idea on line I ran the following suggested query to identify locked tables but got the error message that the table does not exist:
SELECT TRX_ID, TRX_REQUESTED_LOCK_ID, TRX_MYSQL_THREAD_ID, TRX_QUERY
FROM INNODB_TRX
and, of course, with only a single table being called by a single user in the system nothing should be locked.
As noted above I can fulfill the entire process a single time but, when trying to run up to the point of stopping just before truncating Table 1 so I can start over, I am consistently unable to succeed. Table 2 never gets released after being loaded again.
The reason it is important for me test a second iteration is that once this process is successful it will be applied to several database instances that were not just set up for the purpose of testing this process, and if this only works on a newly created database instance that has had no other processing performed it may not be dependable.
I needed to truncate and reload a table.
I learned that truncate needs stats gathering on the table as its successor process so the database gets the actual statistics, otherwise previous stats are not cleared by the truncate statement.
After doing these two operations (truncate and stats gathering on the empty table), ran the insert... but don't see new statistics in all_tab_statistics table for my table. Sample_size is still 0.
Why is that? Shouldn't have Oracle done the automatic stats gathering after the insert?
Do I need to rerun the stats or is it just fine considering the performance around this table (please note it's going to truncate and reload each time)?
Consider the following approach. It has the advantage of the table always being present.
Create an empty new table like the old one.
Load the data into the new table. This is the slowest step.
Do whatever cleanup you might need, such as refreshing the statistics.
RENAME tables to swap the new table in place. This step is fast enough so you won't notice.
I know it's a long time since I posted my question above. But recently, we again faced the similar situation and this time below steps worked towards a much better performance on a table with 800 million rows.
Take a backup of the original table.
Truncate the original table.
Gather stats on the truncated table, so that statistics show 0 in the DB. Us CASCADE=>TRUE in the command to also include indexes in the process.
Drop the indexes on the truncated table and Insert the required data from the backup table.
Recreate the indexes and gather stats again (ofcourse, with CASCADE=>TRUE; however recreation of the indexes should ideally have calculated the appropriate stats).
Drop the backup table if not needed.
I have a pretty complex query where we make use of a temporary table (this is in Oracle running on AWS RDS service).
INSERT INTO TMPTABLE (inserts about 25.000 rows in no time)
SELECT FROM X JOIN TMPTABLE (joins with temp table also in no time)
DELETE FROM TMPTABLE (takes no time in a copy of the production database, up to 10 minutes in the production database)
If I change the delete to a truncate it is as fast as in development.
So this change I will of course deploy. But I would like to understand why this occurs. AWS team has been quite helpful but they are a bit biased on AWS and like to tell me that my 3000 USD a month database server is not fast enough (I don't think so). I am not that fluent in Oracle administration but I have understood that if the redo logs are constantly filled, this can cause issues. I have increased the size quite substantially, but then again, this doesn't really add up.
This is a fairly standard issue when deleting large amounts of data. The delete operation has to modify each and every row individually. Each row gets deleted, added to a transaction log, and is given an LSN.
truncate, on the other hand, skips all that and simply deallocates the data in the table.
You'll find this behavior is consistent across various RDMS solutions. Oracle, MSSQL, PostgreSQL, and MySQL will all have the same issue.
I suggest you use an Oracle Global Temporary table. They are fast, and don't need to be explicitly deleted after the session ends.
For example:
CREATE GLOBAL TEMPORARY TABLE TMP_T
(
ID NUMBER(32)
)
ON COMMIT DELETE ROWS;
See https://docs.oracle.com/cd/B28359_01/server.111/b28310/tables003.htm#ADMIN11633
I am using Oracle9i (9.2). I have a situation where I have to populate a table daily. Daily at mid night this table will be truncated and new data will be put in. The new data population takes about 10-20 mins. The issue is that this table can't be down(locked). While the new data is being inserted, the previous days data needs to be available for a select procedure.
Edit - I am looking into the transaction levels. I just need some expert opinion.
Is this possible in Oracle?
How about using two tables. Have a "current" table that has the previous days data. Then have a new table which you can load. Then when you are ready, you can "swap" the two tables, using a series of rename operations.
For Source: OLE DB Source - Sql Command
SELECT -- The destination table Id has IDENTITY(1,1) so I didn't take it here
[GsmUserId]
,[GsmOperatorId]
,[SenderHeader]
,[SenderNo]
,[SendDate]
,[ErrorCodeId]
,[OriginalMessageId]
,[OutgoingSmsId]
,24 AS [MigrateTypeId] --This is a static value
FROM [MyDb].[migrate].[MySource] WITH (NOLOCK)
To Destination: OLE DB Destination
Takes 5 or more minutes to insert 1.000.000 data. I even unchecked Check Constraints
Then, with the same SSIS configurations I wanted to test it with another table exactly the same as the Destination table. So, I re-create the destination table (with the same constrains except the inside data) and named as dbo.MyDestination.
But it takes about 30 seconds or less to complete the SAME data with the same amount of Data.
Why is it significantly faster with the test table and not the original table? Is it because the original table already has 107.000.000 data?
Check for indexes/triggers/constraints etc. on your destination table. These may slow things down considerably.
Check OLE DB connection manager's Packet Size, set it appropriately, you can follow this article to set it to right value.
If you are familiar with of SQL Server Profiler, then use it to get more insight especially what happens when you use re-created table to insert data against original table.