PROBLEM
Audit triggers are killing the performance of my bulk update query, by inserting an old row and a new row on every update.
In this trigger the insert of old rows, for some reason, takes much more time than inserting the new rows.
TABLE
The audit table has a clustered index, 3 nonclustered indexes, and roughly 35 million records.
Clustered index
GROUPID , USERID, IDENDITYCOLUMN
FACT
Old rows can be inserted anywhere in the clustered index
New rows will be inserted at the end of the clustered index
INVESTIGATION
My tests show that caching can improve the performance of this operation considerably, but I have not figured out what exactly needs to be cached.
ASSUMPTIONS
I'm assuming that the clustered index is the most relevant index for an insert operation.
I'm assuming that by querying the right pages of the clustered index beforehand, the audit performance can be greatly improved.
ACTION
I created a query to load all the audit rows related to the rows I'm going to update in the operational table.
By doing that, performance improved by a little more than 2-fold, but that was not good enough.
WHY NOT GOOD ENOUGH?
I ran another test that performed much better, but I have not figured out what exactly I needed to cache.
HOW WAS THAT TEST?
T01 - BACK AND FORTH TEST
I grabbed 1000 rows from the operational table and updated them back and forth to see whether I got astonishing cache performance from the audit table.
A) I have updated GROUPID of 1000 rows to value X (it took a while)
B) I have updated GROUPID of the same 1000 rows to value Y (it took a while)
C) I have updated GROUPID of the same 1000 rows to value X (astonishing cache performance)
D) I have updated GROUPID of the same 1000 rows to value Y (astonishing cache performance)
T02 - CHECK THE OBJECTS CACHED ON AUDIT TABLE, WHEN UPDATING OPERATIONAL TABLE
I then cleared the audit cache, index cache and data cache, and performed T01 - A) again.
It turns out that both clustered index pages and data pages were loaded, in approximately the same amounts, along with a small residual number of pages from the other indexes.
T03 - CHECK OBJECTS CACHED ON AUDIT TABLE, WHEN RUNNING MY ARTIFICIAL CACHE LOAD QUERY
I then cleared the audit cache and ran my query.
It loaded only approximately half as many pages as test T02.
WHAT LOGIC DID I APPLY TO MY ARTIFICIAL CACHE LOAD QUERY?
I assumed that if I queried all the rows in the audit table whose GROUPID and USERID exist in the 1000 rows to be updated (in the operational table), I would load into cache all the clustered index and data pages needed to get great performance from the audit trigger.
Clustered index
GROUPID , USERID, IDENDITYCOLUMN
However, this did not turn out to be true, as only half as many pages were loaded compared with test T02.
QUESTION
What can I do to get the performance of test T01 - C) or D) on the first run?
I can only think of pre-caching the data/index pages, but I'm not able to find out what exactly is missing.
If you have other suggestions for improving audit table triggers, those are also valuable.
DATABASE
SYBASE 15.7
NOTE: This is a table that belongs to a specific product solution, which means that I cannot alter it as I wish. I have some constraints.
Related
I use crontab to schedule a SQL query that reads a big table every 2 hours.
select a,b,c,d,e,f,g,h,i,j,k,many_cols from big_table format Null
Each run takes anywhere from 30 seconds to 5 minutes.
What I can see from the query_log is that when the query time is low, the MarkCacheHits value is high; when the time is high, the MarkCacheHits value is low and the MarkCacheMiss value is high.
So I'm wondering how to get as many mark cache hits as possible. (This is probably not the only big table that needs to be warmed up.)
Will mark cache be replaced by other queries and what is its limit?
Does the warm-up way of selecting specific columns really work for an aggregate query of those columns? For example, warm-up SQL is as above, and the aggregate query can be select a,sum(if(b,c,0)) from big_table group by a
My clickhouse server has been hanging occasionally recently, and I can't see any errors or exceptions at the corresponding time from the log. Could this be related to my regular warm-up query of the big table?
In reality you are placing the data into the Linux disk cache.
Will mark cache be replaced by other queries and what is its limit?
Yes, it will be replaced. The limit is 5 GB: <mark_cache_size>5368709120</mark_cache_size>
Does the warm-up way of selecting specific columns really work for an aggregate query of those columns?
Yes, because you put the files into the Linux cache.
Could this be related to my regular warm-up query of the big table?
No.
Contrary to the BigQuery documentation, we see that it DOES cache the results when selecting data from a streaming, date-partitioned table (Standard SQL).
Example:
When we perform a deterministic date scan on the streaming, date-partitioned table using:
where (_PARTITIONTIME > '2017-11-12' or _PARTITIONTIME is null)
...BigQuery caches the data for 5 to 20 minutes if we fire the same exact query within that time frame.
While in my interpretation of the documentation it states that it SHOULD NOT cache the data:
'When any of the tables referenced by the query have recently received streaming inserts (a streaming buffer is attached to the table) even if no new rows have arrived'
Important notes:
Our test query reads heartbeat events that really do arrive continuously
We actually want this caching behavior, because we do not always need the data to be current to the last second. We just want to know whether we can really depend on this behavior.
Our Questions:
What is going on here / Why does the BQ caching happen at all?
The time this data stays in the BQ cache is 'random' (between 5-20 minutes). What does this mean?
Thanks for clarifying the question. I think it's an oversight that we didn't disable caching for partitioned tables with streaming data. It should be disabled, as otherwise the query might return outdated results.
We invalidate the cache when the table is changed. Streaming into the table will cause the table to be changed. I guess that's why the cache is invalidated after 5 to 20 minutes.
I added an index to an existing table to check the performance. The table has 1.5 million records. The cost before was "58645"; once the index was created, the cost dropped to "365". To compare, I have repeatedly made the index "unusable" and then altered and rebuilt it. Until yesterday, the explain plan in Oracle showed the index being used. But today, after I made the index unusable and rebuilt it, the explain plan no longer showed an index scan, although performance remained faster than before the index existed. I dropped and re-created the index, but the issue remains: fetching is fast, yet the explain plan shows that the index is not being used and the cost is "58645". I am stuck with this.
Often when you create a new index or rebuild it from scratch, it doesn't show up in the explain plan, and sometimes it is not used for a while either. To correct the explain plan, statistics should be gathered on the index.
Use EXEC DBMS_STATS.GATHER_INDEX_STATS, or DBMS_STATS.GATHER_TABLE_STATS with the cascade option.
Blocks of data are cached in the buffer cache, which will affect your results like this:
1. Run query
2. Change index
3. Run query (buffered data from step 1 will skew the performance)
4. Flush the buffer cache
5. Run query (now you get a truer measure of how "fast" the query is)
Did you flush the buffer cache?
ALTER SYSTEM FLUSH BUFFER_CACHE;
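The effect of the steps above can be illustrated with a toy model. This is only a sketch under loud assumptions: `BufferPool`, `run_query`, and the page counts (500 pages for the scan without the index, 20 with it) are hypothetical and have nothing to do with Oracle's real implementation. It counts simulated physical reads instead of measuring time, to show why a query run right after another one looks faster than it would on a cold cache.

```python
from collections import OrderedDict


class BufferPool:
    """Toy LRU page cache that counts physical reads (cache misses)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.physical_reads = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # cache hit: refresh LRU position
            return
        self.physical_reads += 1             # cache miss: simulate a disk read
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page

    def flush(self):
        """Drop every cached page, like flushing the buffer cache."""
        self.pages.clear()


def run_query(pool, page_ids):
    """Read every page the 'query' touches; return the physical reads it caused."""
    before = pool.physical_reads
    for page_id in page_ids:
        pool.read(page_id)
    return pool.physical_reads - before


pool = BufferPool(capacity=1000)
full_scan = range(500)   # hypothetical: pages touched without the index
index_scan = range(20)   # hypothetical: pages touched with the index

cold_full = run_query(pool, full_scan)    # every page comes from "disk"
warm_index = run_query(pool, index_scan)  # pages were already cached by the first run
pool.flush()
cold_index = run_query(pool, index_scan)  # the fair figure to compare with cold_full
print(cold_full, warm_index, cold_index)  # prints: 500 0 20
```

The middle figure is the misleading one: it reflects the cache left behind by the previous run, not the index.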
The application that I am currently working on has archive logic that moves all records older than 6 months to history tables in the same schema, but in a different tablespace. This is done by a stored procedure that is executed daily.
For ex. TABLE_A (live, latest 6 months) ==> TABLE_A_H (archive, older than 6 months, up to 8 years).
So far no issues. Now the business has come up with a new requirement: the archived data should also be available for selects and updates. The updates can happen even for data that is a year old.
selects could be direct like,
select * from TABLE_A where id = 'something'
Or it could be open-ended query like,
select * from TABLE_A where created_date < 'XYZ'
Updates are usually for specific records.
These queries are exposed as REST services to the clients. There are possibilities of junk/null values (no way the application can sanitize the input).
The current snapshot of the DB is
PARENT_TABLE (10M records, 10-15K for each record)
CHILD_TABLE_ONE (28M records, less than 1K for each record)
CHILD_TABLE_TWO (25M records, less than 1K for each record)
CHILD_TABLE_THREE (46M records, less than 1K for each record)
CHILD_TABLE_FOUR (57M records, less than 1K for each record)
Memory is not a constraint - I can procure additional 2 TB of space if needed.
The problem is: how do I keep the response time low when queries access the archive tables?
What are all the aspects that I should consider when building a solution?
Solution1: For direct select/update, check if the records are available in live tables. If present, perform the operation on the live tables. If not, perform the operation on the archive tables.
For open ended queries, use UNION ???
Solution2: Use month-wise partitions and keep all 8 years of data in a single set of tables? Does Oracle handle 150+ million records in a single table efficiently for select/update?
Solution3: Use NoSQL like Couchbase? Not a feasible solution at the moment because of the infra/cost involved.
Solution4: ???
Tech Stack: Oracle 11G, J2EE Application using Spring/Hibernate (Java 1.6) hosted on JBoss.
Your response will be very much appreciated.
If I were you, I'd go with Solution 2, and ensure that you have the relevant indexes available for the types of queries you expect to be run.
Partitioning by month means that you can take advantage of partition pruning, assuming that the queries involve the column that's being partitioned on.
It also means that your existing code does not need to be amended in order to select or update the archived data.
You'll have to set up housekeeping to add new partitions though - unless you go for interval partitioning, but that has its own set of gotchas.
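To make the partition pruning idea concrete, here is a minimal conceptual sketch. It is not Oracle code: `MonthPartitionedTable` and its methods are hypothetical names, and the data is synthetic. It stores rows in one bucket per month and shows that an open-ended query like `created_date < 'XYZ'` only needs to scan the buckets that can contain matching rows.

```python
from collections import defaultdict
from datetime import date


class MonthPartitionedTable:
    """Toy table that stores rows in one bucket per (year, month)."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, row_id, created):
        self.partitions[(created.year, created.month)].append((row_id, created))

    def select_created_before(self, cutoff):
        """Return (matching rows, number of partitions actually scanned)."""
        cutoff_key = (cutoff.year, cutoff.month)
        rows, scanned = [], 0
        for key in sorted(self.partitions):
            if key > cutoff_key:
                continue  # pruned: a later month cannot satisfy created < cutoff
            scanned += 1
            rows.extend(r for r in self.partitions[key] if r[1] < cutoff)
        return rows, scanned


# Synthetic data: 8 years, one row per month, created on the 15th.
table = MonthPartitionedTable()
for year in range(2010, 2018):
    for month in range(1, 13):
        table.insert(f"{year}-{month:02d}", date(year, month, 15))

rows, scanned = table.select_created_before(date(2010, 3, 10))
print(len(rows), scanned, len(table.partitions))  # prints: 2 3 96
```

Only 3 of the 96 monthly buckets are touched here. The same sketch also shows the caveat from the answer: a query that does not filter on the partitioning column gives the loop nothing to prune on.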
I have a table with around 180 million records and 40 indexes. A nightly program loads data into this table, but due to certain business conditions we can only delete and load data into this table. The nightly program brings new records, or updates to existing records, from the source system. We have a limited window of about 6 hours to complete the extract from the source system, perform business transformations, load the data into this target table, and be ready for users to consume the data in the morning. The issue we are facing is that the delete from this table takes a lot of time, mainly due to the 40 indexes on the table (an average of 70000 deletes per hour). I did some digging on the internet and found the options below.
a) Drop or disable indexes before the delete and then rebuild them: After the delete and load, the program that loads data into the target table needs to perform quite a few updates, for which the indexes are critical. Rebuilding one index takes almost 1.5 hours due to the enormous amount of data in the table. So this approach is not feasible, given the time it takes to rebuild the indexes and the limited time we have to get the data ready for the users.
b) Use bulk delete: Currently the program deletes based on rowid and deletes records one by one as below
DELETE
FROM <table>
WHERE rowid = g_wpk_tab(ln_i);
g_wpk_tab is the collection that holds the rowids to be deleted; it is read by looping via FOR ALL, and I do an intermediate commit every 50000 row deletes.
Tom of AskTom says in the discussion below that a bulk delete and a row-by-row delete take almost the same amount of time:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5033906925164
So this won't be a feasible option either.
c) Regular delete: Tom of AskTom suggests using a regular delete, but even that takes a long time, probably due to the number of indexes on this table.
d) CTAS: This approach is out of the question because the program would need to recreate the table, create the 40 indexes, and then proceed with the updates, and as mentioned above an index takes at least 1.5 hours to create.
If you could provide me any other suggestions I would really appreciate it.
UPDATE: As of now we have decided to go with the approach suggested by https://stackoverflow.com/users/409172/jonearles to archive instead of delete. Approach is to add a flag to the table to mark the records to be deleted as DELETE and then have a post delete program run during the day to delete off the records. This will ensure that the data is available for users at the right time. Since users consume via OBIEE we are planning to set content level filter on the table to not look at the archival column so that users needn't know about what to select and what to ignore.
Parallel DML: alter session enable parallel dml; delete /*+ parallel */ ...; commit; Sometimes it's that easy.
Parallel DDL: alter index your_index rebuild nologging compress parallel; NOLOGGING reduces the amount of redo generated during the index rebuild. COMPRESS can significantly reduce the size of a non-unique index, which significantly reduces the rebuild time. PARALLEL can also make a huge difference in rebuild time if you have more than one CPU or more than one disk. If you're not already using these options, I wouldn't be surprised if using all of them together improves index rebuilds by an order of magnitude. And then 1.5 * 40 / 10 = 6 hours.
Re-evaluate your indexes: Do you really need 40 indexes? It's entirely possible, but many indexes are only created because "indexes are magic". Make sure there's a legitimate reason behind each index. This can be very difficult to do, since very few people document the reason for an index. Before you ask around, you may want to gather some information. Turn on index monitoring to see which indexes are really being used. And even if the index is used, see how it is used, perhaps through v$sql_plan. It's possible that an index is used for a specific statement but another index would have worked just as well.
Archive instead of delete: Instead of deleting, just set a flag to mark a row as archived, invalid, deleted, etc. This will avoid the immediate overhead of index maintenance. Ignore the rows temporarily and let some other job delete them later. The large downside to this is that it affects any query on the table.
Upgrading is probably out of the question, but 12c has an interesting new feature called in-database archiving. It's a more transparent way of accomplishing the same thing.
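The "archive instead of delete" pattern can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 rather than Oracle, so it only shows the shape of the approach, not its performance. The table `target`, the flag column `archived`, and the view `target_live` are hypothetical names, and the sketch assumes the flag column is not part of any existing index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target ("
    "id INTEGER PRIMARY KEY, payload TEXT, archived INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO target (id, payload) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 11)],
)

# Nightly window: flag the rows instead of deleting them.
ids_to_remove = [2, 4, 6]
conn.executemany(
    "UPDATE target SET archived = 1 WHERE id = ?",
    [(i,) for i in ids_to_remove],
)

# Readers go through a view that hides the flagged rows.
conn.execute(
    "CREATE VIEW target_live AS SELECT id, payload FROM target WHERE archived = 0"
)
visible = conn.execute("SELECT COUNT(*) FROM target_live").fetchone()[0]

# Daytime cleanup job: physically delete the flagged rows outside the load window.
purged = conn.execute("DELETE FROM target WHERE archived = 1").rowcount
remaining = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
conn.commit()
print(visible, purged, remaining)  # prints: 7 3 7
```

The view plays the role that a content-level filter would play for reporting users: they never see the flagged rows, while the expensive physical delete moves to a time when it does not compete with the load.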