MySQL Workbench shows a running query, but the query is not in the process list and the INSERT eventually times out at 7200 seconds

Purpose: Remove duplicate records from a large table.
My process:
Create Table 2 with 9 fields, no indexes, and the same data types per field as Table 1.
Insert those 9 fields, for all records, into Table 2 from the existing Table 1.
Table 1 contains 71+ million rows and 232 columns, with many duplicate records.
No joins. No WHERE clause.
Table 1 contains several indexes.
8 fields are required to identify a unique record.
I'm trying to set up a process to de-dup large tables, using DENSE_RANK() partitioning to identify the most recently entered duplicate. Thus, those 8 required fields from Table 1, plus the auto-increment from Table 1, are loaded into Table 2.
Version: MariaDB 10.5.17
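A minimal sketch of that first load, with hypothetical names (t1 and t2 for the two tables, id for the auto-increment, f1-f8 for the eight de-dup fields), assuming the computed rank is stored in Table 2 as an extra dr column, since the later join filters on it:
INSERT INTO t2 (id, f1, f2, f3, f4, f5, f6, f7, f8, dr)
SELECT id, f1, f2, f3, f4, f5, f6, f7, f8,
       DENSE_RANK() OVER (
           PARTITION BY f1, f2, f3, f4, f5, f6, f7, f8
           ORDER BY id DESC   -- rank 1 = most recently entered duplicate
       ) AS dr
FROM t1;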
The next steps would be:
Create a new Table 3 identical to Table 1 but with no indexes.
Load all data from Table 1 into Table 3, joining Table 1 to Table 2 on the auto-increment fields, where Table 2's Dense_Rank value = 1. This inserts ~17 million unique records (see the sketch after this list).
Drop any existing foreign keys that reference Table 1.
Truncate Table 1.
Insert all records from Table 3 into Table 1.
Nullify columns in related tables where the foreign-key values no longer exist in Table 1.
Re-create the foreign keys that had been dropped.
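Sketches of the Table 3 load and the later nullify step, reusing the hypothetical names above (t3 mirrors Table 1; child_table and its t1_id column stand in for any referencing table):
INSERT INTO t3
SELECT t1.*
FROM t1
JOIN t2 ON t2.id = t1.id
WHERE t2.dr = 1;

-- after Table 1 has been truncated and reloaded:
UPDATE child_table c
LEFT JOIN t1 ON t1.id = c.t1_id
SET c.t1_id = NULL
WHERE t1.id IS NULL
  AND c.t1_id IS NOT NULL;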
Creating a test instance of an existing system, I can accomplish everything I need to - but only the first time. If I then drop Table 2 before refreshing Table 1 as outlined above, re-create it, and try to reload it, Workbench shows the query running until the 7200-second timeout.
While the INSERT into Table 2 is running, opening a second instance of Workbench and selecting a count of records in Table 2 after 15 minutes gives me the 71+ million records I'm looking for, but the first Workbench instance keeps running until the timeout.
The query shows up in SHOW PROCESSLIST for those 15 minutes, but disappears around the 15-minute mark - presumably once all records are loaded.
I have tried running with the timeouts set to 0 as well as 86,400 seconds (no read timeout and a 24-hour timeout, respectively), but the query still times out at 7200.0xx seconds, or 2 hours, every time.
The exact error message I get is: Error Code: 2013. Lost connection to MySQL server during query 7200.125 sec
I have tried running the INSERT statement with COMMIT and without.
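The server-side timeout variables can be listed and compared against the observed 7200-second cutoff, which helps tell a server-side limit from a client-side (Workbench) one:
SHOW GLOBAL VARIABLES LIKE '%timeout%';
SHOW SESSION VARIABLES LIKE '%timeout%';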
This is being done in a test instance set up for this development, where I am the only user and only a single table is in use during the insert process.
Finding one idea online, I ran the following suggested query to identify locked tables, but got an error message that the table does not exist:
SELECT TRX_ID, TRX_REQUESTED_LOCK_ID, TRX_MYSQL_THREAD_ID, TRX_QUERY
FROM INNODB_TRX
and, of course, with only a single table being used by a single user in the system, nothing should be locked.
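For what it's worth, INNODB_TRX is a view in the information_schema database, so qualifying the name should avoid the "table does not exist" error:
SELECT TRX_ID, TRX_REQUESTED_LOCK_ID, TRX_MYSQL_THREAD_ID, TRX_QUERY
FROM information_schema.INNODB_TRX;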
As noted above, I can complete the entire process a single time. But when I try to run it a second time, up to the point of stopping just before truncating Table 1 so I can start over, I am consistently unable to succeed: Table 2 never gets released after being loaded again.
The reason it is important for me to test a second iteration is that, once this process is successful, it will be applied to several database instances that were not set up just for testing it. If it only works on a newly created database instance that has had no other processing performed, it may not be dependable.

Related

Oracle Materialized View has duplicate entries

I defined a materialized view for fast refresh by rowid, and from time to time I get duplicate entries for the primary key. The master table definitely has no duplicate entries. The view accesses a remote DB.
I cannot refresh via primary key, because I need an outer join in case the referenced id is null.
Most of the time it works fine, but every 1000 entries or so I get an entry twice.
When I update the duplicated record in the master, the refresh of the view "repairs" the record and I have a single record again.
We have a RAC cluster with 2 instances.
create materialized view teststatussetat
refresh force on demand with rowid
as
select
    ctssa.uuid id,
    ctssa.uuid,
    ctssa.rowid ctssa_rowid,
    ctps.rowid ctps_rowid,
    ssf.rowid ssf_rowid,
    ctssa.coretestid coretest_uuid,
    ctssa.lastupdate,
    ctssa.pstatussetat statussetat,
    ctps.code status_code,
    ssf.account statussetfrom
from
    coreteststatussetat@coredb ctssa,   -- remote tables, assuming a database link named coredb
    coretestprocessstatusrev@coredb ctps,
    coreuser@coredb ssf
where
    ssf.uuid(+) = ctssa.statussetfromid and
    ctps.uuid = ctssa.statusid;
The log files are created like this:
create materialized view log on coreteststatussetat with sequence, rowid, primary key including new values;
We have an Oracle Database 19c Enterprise Edition Release 19.0.0.0.0.
To watch what happens, I created a job which checks every 5 seconds, for one day, whether the view contains duplicates. The job found thousands of duplicate entries during the day, but most of them (not all) vanished again, so they were only temporary. The job logged the primary key and also the rowid, because I hoped to find some changing rowids; but all of the duplicated primary keys have distinct rowids.
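A minimal form of such a duplicate check, using the view and id column from the definition above:
select id, count(*)
from teststatussetat
group by id
having count(*) > 1;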
The data is created via Hibernate. But this should not make a difference. Oracle should not create duplicate entries.

Delete from table is very slow in oracle standard edition

Delete on a table in Oracle Standard Edition (no partitioning) gets slower over time.
Important info: I am working on Oracle Standard Edition, so the partitioning option is not available.
Detail:
I have one table with no constraints on it (no PK or any other key, trigger, or index).
More than a million records get inserted into this table every 15 minutes using SQL*Loader.
We need to process each 15-minute batch of records every 15 minutes and, at the end of the process, delete any record older than 30 minutes, so that at any point in time there is no more than 30-40 minutes of data in the table (a sketch of that cleanup follows).
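A hedged sketch of that rolling cleanup (the table and timestamp column names are assumptions, since the question doesn't give them):
delete from feed_table
where load_time < sysdate - 30/1440   -- rows older than 30 minutes
and rownum <= 100000;                 -- one modest batch; repeat until 0 rows deleted
Deleting in bounded batches keeps each transaction's undo small, which matters when the table is being loaded continuously.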
Problem:
As time passes, due to such frequent insertion and deletion, responses from the table get slow.
Data extraction and deletes from the table take more time with every passing run.
After a while, even a simple SELECT query takes too long.
We can't truncate the table because the loader runs continuously and we may lose data if we truncate, and we don't have CREATE TABLE access to drop and re-create the table.
We have to process the data every 15 minutes and make it available downstream for further processing, and it just keeps getting slower.
Kindly help me with the aforementioned situation.

Deleting very large table records where id not in another table

I have one table, values, that has 80 million records. Another table, values_history, has 250 million records.
I want to filter the values_history table and keep only the rows whose id is present in the values table.
delete from values_history where id not in (select id from values);
This query takes so long that I have to abort the process.
Please suggest some ideas to speed up the process.
Can I delete the records in bunches, like 1,000,000 at a time?
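One hedged way to delete in batches (Oracle-style rownum; values is a reserved word in most dialects, so values_tbl stands in for the real table name), repeated until no rows are deleted:
delete from values_history h
where not exists (select 1 from values_tbl v where v.id = h.id)
and rownum <= 1000000;
commit;
NOT EXISTS also sidesteps the NULL pitfall of NOT IN: if the subquery can return a NULL id, NOT IN matches no rows at all.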
I have extracted the required records and inserted them into a temp table; this took 2 hours. After that I dropped the table, then inserted the extracted data back into the main table. The whole process took around 4 hours, which is fine for me. I had dropped the foreign key and all other constraints before that.
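A sketch of that extract-and-swap approach (same naming caveat as above; the keep table name is hypothetical):
create table values_history_keep as
select h.*
from values_history h
where exists (select 1 from values_tbl v where v.id = h.id);
-- then truncate values_history, insert the kept rows back,
-- and re-create the dropped constraints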

HSQL simple one column update runs forever

I have a database with about 125,000 rows, each row with a primary key, a couple of int columns, and a couple of varchars.
I've added an int column and I'm trying to populate it before adding a NOT NULL constraint.
The DB is persisted in a script file. I've read somewhere that all the affected rows get loaded into memory before the actual update, which means there won't be a disk write for every row. The whole DB is about 20 MB, which would mean loading it and doing the update should be reasonably fast, right?
So: no joins, no nested queries, just a basic update.
I've tried multiple DB managers, including the one bundled with the HSQLDB jar.
update tbl1 set col1 = 1
The query never finishes executing.
It is probably running out of memory.
An easier way to do this operation is to define the column with DEFAULT 1, which does not use much memory regardless of the size of the table. You can even add the NOT NULL constraint at the same time:
ALTER TABLE T ADD COLUMN C INT DEFAULT 1 NOT NULL

ORACLE Table Loading Speed

This is a new issue that I haven't run into before.
I have a table that at one point contained over 100k records; it's an event log for a dev environment.
It took up to 10 seconds to load the table (simply clicking on it to view the data).
I removed all but 30 rows and it still takes 7 seconds to load.
I'm using Toad, and it gives me a dialog box that says "Statement Processing...".
Any ideas?
The following are some SELECT statements and how long they took:
select * from log;                     -- 21 rows in 10 sec
select * from log where id = 120000;   -- 1 row in 1 msec
select * from log where user = 35000;  -- 9 rows in 7 sec
The id is the PK; there is no index on the user field.
I also have a view, containing all of the fields, sitting on top of this table, and it runs just as slowly.
If you issue a "select * from event_log_table", then you are scanning the entire table with a full table scan. It has to scan through all allocated segments to see if there are rows in them. If your table once contained over 100K rows, then it has allocated at least enough space to hold those 100K+ rows. Please see: http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/schema.htm#sthref2100
Now if you delete rows, the space is still allocated to the table, and Oracle still has to scan all of it. It works like a high-water mark.
To reduce the high-water mark, you can issue a TRUNCATE TABLE command, which resets it. But then you'll lose ALL rows.
And there is an option to shrink the space in the table. You can read about it and its preconditions here:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_3001.htm#sthref5117
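A minimal sketch of that shrink sequence (segment shrink requires an ASSM tablespace, and row movement must be enabled first; the table name follows the example above):
alter table event_log_table enable row movement;
alter table event_log_table shrink space;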
Regards,
Rob.
I would understand this better if you had started off with a 100M-record table. But just in case, try gathering Oracle optimizer statistics. If that doesn't help, drop and re-create the indexes on that table.
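A minimal sketch of the statistics suggestion (gathering stats for the table in the current schema; the table name follows the question):
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'LOG');
end;
/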
