Are there any benefits from partitioning an Oracle table on the same HDD? - oracle

I'm looking for a solution to improve insert time for concurrent inserts. Will I get any benefit from Oracle partitioning without providing dedicated hardware for every partition?

What is the bottleneck in your current insert process? I'm guessing from the "high concurrency" in your question that you're talking about an OLTP app where there are a large number of single-row inserts rather than a small number of many-row inserts that would be common in a data warehouse.
In an OLTP scenario, it is relatively unlikely that partitioning will decrease the time required to do a single-row insert. Assuming that you've already eliminated the obvious time wasters like triggers on the table, most of the insert overhead is likely to be index maintenance with a bit of I/O for the writes to the redo logs. Partitioning likely wouldn't reduce any of these because in an OLTP environment you generally can't load into a staging table and do a partition exchange which would reduce the index maintenance costs.

Well, like everything else, it depends.
Partitioning can reduce contention and eliminate hot blocks. For example, imagine if you will, a transaction system. If you partitioned by hash across some surrogate customer ID value, each index would be significantly smaller, and potentially less subject to contention and index root splits.
Another solution if you have concurrency problems is the use of reverse-key indexes against "one-legged" indexes - where an indexed sequence-populated column forces continual block splits. However, using reverse-key indexes prevents range scans from using the index, so beware.
It really depends on what Oracle wait events are part of your critical transaction path. What you're waiting on will generally dictate what solution is appropriate.
So it could help. It could also make the situation worse. Without more information about what's adding wait time - if anything - the internet can't help solve the problem.
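To make the reverse-key trade-off concrete, here is a minimal sketch. It assumes keys are stored as fixed-width unsigned integers, which is a simplification and not Oracle's actual NUMBER storage format; `reverse_key` is a hypothetical helper, not an Oracle API. It shows that consecutive sequence values end up far apart once their bytes are reversed (less contention on one block), and that the reversed keys no longer sort in sequence order (which is why range scans cannot use such an index).

```python
def reverse_key(value: int, width: int = 4) -> int:
    """Reverse the byte order of a fixed-width unsigned integer key.

    Assumption: a 4-byte big-endian layout stands in for the real
    on-disk key format, purely for illustration.
    """
    return int.from_bytes(value.to_bytes(width, "big")[::-1], "big")


# Ten consecutive values, as a sequence would generate them.
sequence_values = list(range(250, 260))
reversed_keys = [reverse_key(v) for v in sequence_values]

# Distance between neighbouring keys before and after reversal.
plain_gaps = [b - a for a, b in zip(sequence_values, sequence_values[1:])]
reversed_gaps = [abs(b - a) for a, b in zip(reversed_keys, reversed_keys[1:])]

# True only if the reversed keys still sort in insertion order.
order_preserved = reversed_keys == sorted(reversed_keys)

if __name__ == "__main__":
    print("smallest gap, plain keys:   ", min(plain_gaps))
    print("smallest gap, reversed keys:", min(reversed_gaps))
    print("insertion order preserved:  ", order_preserved)
```

Neighbouring plain keys differ by 1 and therefore land in the same part of the index, while the reversed keys are scattered and out of order.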

Related

What is the actual use of partitions in clickhouse?

It says partitions make it easier to drop or move data so that there is hit only on limited data. In various blogs it is suggested to use month as a partitioning key (toYYYYMM(date)). In many places it is also suggested to not have more than a couple of partitions. I am using clickhouse as a database to store time series data which do not undergo frequent deletions. What would be the advisable partitioning key for timeseries data of high volume? Does there have to be one if I do not want to perform deletes frequently?
In production I noticed that startup was very slow and I suspected that having too many partitions was the culprit. So I decided to test it out by inserting fresh time-series data into a table (which created >2300 partitions for ~20Bil rows) by selecting data from another table (so that it doesn't have an opportunity to optimize the table). Immediately afterwards I dropped the original table and tried a restart. It finished fast, in about 10s. This is the complete opposite of what I observed in production with 800GB+ of data (with many databases and tables, as opposed to my test node which had only one table).
Edit: As it was pointed out, I mixed up parts and partitions. Regarding startup time of clickhouse being affected, I'd better post another question.
This is a pretty common question, and for disclosure, I work at ClickHouse.
Partitions are particularly useful when you have timeseries data, as you noted. When determining the number of partitions, we often recommend a few guidelines:
The use of partitioning should be determined by a couple of questions as to why you're using them:
are you generally going to query only a single partition? For example, if your queries are often for results within a one day or one month period, it could make sense to partition at that period duration
are you wanting to "tier" or set a TTL on your data such that once the partition reaches an age of X (e.g., 91 days old, 7 months old), you want to do something special with it? (e.g., TTL to lower cost tier storage, backup and delete from ClickHouse, etc.)
We often recommend keeping the number of partitions below about 100. Up to 1000 partitions can work, but it is suboptimal and will have some performance impact at the filesystem level and on index/memory sizes, which can affect startup time and insert/query time.
Given these guidelines, I hope that helps with your question. It is probably most common to partition by day or month, but since ClickHouse can manage large tables quite easily, you might want to move towards fewer partitions if possible; partitioning by month is probably the most common choice.
I didn't fully understand your test results so please feel free to expand. 2300 partitions sounds like too many but might work, just with some performance implications. Reducing your number of partitions (and therefore increasing the partition size) seems like a good recommendation.
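As a rough way to check a candidate partitioning key against the guideline of roughly 100 partitions, the sketch below counts how many distinct partition values a date range would produce. The Python helpers `to_yyyymm` and `to_yyyymmdd` only mimic the results of ClickHouse's `toYYYYMM` and `toYYYYMMDD` functions; the three-year date range is an assumed example, not data from the question.

```python
from datetime import date, timedelta
from typing import Callable


def to_yyyymm(d: date) -> int:
    """Monthly partition value, e.g. 2020-03-15 -> 202003."""
    return d.year * 100 + d.month


def to_yyyymmdd(d: date) -> int:
    """Daily partition value, e.g. 2020-03-15 -> 20200315."""
    return d.year * 10000 + d.month * 100 + d.day


def count_partitions(start: date, end: date, key: Callable[[date], int]) -> int:
    """Count distinct partition values for every day in [start, end]."""
    days = (end - start).days + 1
    return len({key(start + timedelta(days=i)) for i in range(days)})


# Assumed example range: three full years of time-series data.
start, end = date(2020, 1, 1), date(2022, 12, 31)
monthly = count_partitions(start, end, to_yyyymm)
daily = count_partitions(start, end, to_yyyymmdd)

if __name__ == "__main__":
    print("monthly partitions:", monthly)
    print("daily partitions:  ", daily)
```

For this range the monthly key stays well under 100 partitions, while the daily key already exceeds 1000.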

Will Shrinking and lowering the high water mark cause issues in OLTP systems

Newbie here. We have an old Oracle 10g instance that has to be kept alive until it is replaced. The nightly jobs have been very slow, causing some issues. Every other week there is a large process that does large amounts of DML (deletes, inserts, updates). Some of these tables have 2+ million rows. I noticed that on some of the tables the HWM is higher than expected, and in Toad I ran a database advisor check that recommended shrinking some tables. I am concerned that the tables may need the space for DML operations. Will shrinking them make the process faster or slower?
We cannot add CPU due to licensing costs.
If you are accessing the tables with full scans and have a lot of empty space below the HWM, then yes, definitely reorg those (alter table move). There is no downside, only benefit. But if your slow jobs are using indexes, then the benefit will be minimal.
Don't assume that your slow jobs are due to space fragmentation. Use ASH (v$active_session_history) and SQL monitor (v$sql_plan_monitor) data or a graphical tool that utilizes this data to explore exactly what your queries are doing. Understand how to read execution plans and determine whether the correct plan is being used for your data. Tuning is unfortunately not a simple thing that can be addressed with a question on this forum.
In general, shrinking tables or rebuilding indexes should speed up reads of the table, or anything that does full table scans. It should not affect other DML operations.
When selecting or searching data, all of the empty blocks in the table and any indexes used by the query must still be read, so rebuilding them to reduce empty space and lower the high water mark will generally improve performance. This is especially true in indexes, where space lost to deleted rows is not recovered for reuse.
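The effect of the high water mark on full scans can be illustrated with a toy model. This is only a sketch: `HeapTable`, `ROWS_PER_BLOCK` and the row counts are hypothetical, and the model ignores that real Oracle blocks can reuse freed space for later inserts. It shows that a full scan keeps reading every block below the HWM after a mass delete, and reads far fewer once the rows are compacted.

```python
ROWS_PER_BLOCK = 100  # assumption: fixed number of row slots per block


class HeapTable:
    """Toy heap table: a list of blocks, each a list of row slots."""

    def __init__(self):
        self.blocks = []  # len(self.blocks) plays the role of the HWM

    def insert(self, row):
        # Append to the last block, allocating a new block when it is full.
        if not self.blocks or len(self.blocks[-1]) >= ROWS_PER_BLOCK:
            self.blocks.append([])
        self.blocks[-1].append(row)

    def delete_where(self, predicate):
        # Deleting empties the slot but never lowers the high water mark.
        for block in self.blocks:
            for i, row in enumerate(block):
                if row is not None and predicate(row):
                    block[i] = None

    def full_scan(self):
        """Return (live_rows, blocks_read); every block below the HWM is read."""
        live = [r for block in self.blocks for r in block if r is not None]
        return live, len(self.blocks)

    def shrink(self):
        # Rewrite the live rows compactly, which lowers the HWM.
        live, _ = self.full_scan()
        self.blocks = []
        for row in live:
            self.insert(row)


table = HeapTable()
for n in range(10_000):
    table.insert(n)
table.delete_where(lambda r: r % 10 != 0)  # delete 90% of the rows

rows_before, blocks_before = table.full_scan()
table.shrink()
rows_after, blocks_after = table.full_scan()

if __name__ == "__main__":
    print("blocks read before shrink:", blocks_before)
    print("blocks read after shrink: ", blocks_after)
```

The same rows come back in both scans; only the number of blocks visited changes, which is why the benefit shows up for full scans rather than for index-driven access.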

Partition index to reduce buffer busy waits?

From time to time our Oracle response times degrade significantly for a minute or two, without any extra load.
We were able to identify an insert statement that produces a lot of buffer busy waits.
From the ADDM report, we got the following hint:
Consider partitioning the INDEX "IDX1" with object
ID 4711 in a manner that will evenly distribute concurrent DML across
multiple partitions.
To be honest: I am not sure what that means. I don't know what a partitioned index is. I can only imagine that it means creating a partition with a local index.
Can you help me out here?
There is a very high frequency of reading and writing to that table. No updates or deletes are used.
Thanks,
E.
I am not sure what that means.
Oracle is telling you that there is a lot of concurrent ("at the same time") activity on a very small part of your index. This happens a lot.
Consider an indexed column TAB1_PK on table TAB1 whose values are inserted from a sequence TAB1_S. Suppose you have 5 database sessions all inserting into TAB1 at the same time.
Because TAB1_PK is indexed, and because the sequence is generating values in numeric order, what happens is that all those sessions have to read and update the same blocks of the index at the same time.
This can cause a lot of contention -- way more than you would expect, due to the way indexes work with multi-version read consistency. I mean, in some rare situations (depending on how the transaction logic is written and the number of concurrent sessions), it can really be crippling.
The (really) old way to avoid this problem was to use a reverse key index. That way, the sequential column values did not all go to the same index blocks.
However, that is a two-edged sword. On the one hand, you get less contention because you're inserting all over the index (good). On the other hand, your rows are going all over the index, meaning you cannot cache them all. You've just turned a big logical I/O problem into a physical I/O problem!
Nowadays, we have a better solution -- a GLOBAL HASH PARTITION on the index.
With a GHP, you can specify the number of hash buckets and use that to trade-off between how much contention you need to handle vs how compact you want the index updates (for better buffer caching). The more index hash partitions you use, the better your concurrency but the worse your index block buffer caching will be.
I find a number (of global hash partitions) around 16 is pretty good.
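The trade-off between the number of hash partitions and contention can be sketched with a small simulation. It assumes each index partition has a single "hot" rightmost leaf block and uses `key % num_partitions` as a stand-in for Oracle's internal hash function; both are simplifications, and `max_sessions_per_leaf` is a hypothetical helper.

```python
from collections import Counter


def max_sessions_per_leaf(num_sessions: int, num_partitions: int, start: int = 1) -> int:
    """Worst-case number of sessions competing for the same hot leaf block.

    Each session inserts the next sequence value at the same moment.
    Assumption: modulo stands in for the real hash function.
    """
    keys = range(start, start + num_sessions)
    hits = Counter(key % num_partitions for key in keys)
    return max(hits.values())


# 32 concurrent sessions against a growing number of index hash partitions.
contention = {p: max_sessions_per_leaf(32, p) for p in (1, 4, 16, 64)}

if __name__ == "__main__":
    for partitions, sessions in contention.items():
        print(f"{partitions:>2} partitions -> up to {sessions} sessions on one leaf")
```

More partitions spread the sessions over more hot blocks, at the cost of having more distinct index blocks that need to stay in the buffer cache.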

How many table partitions is too many in Postgres?

I'm partitioning a very large table that contains temporal data, and considering to what granularity I should make the partitions. The Postgres partition documentation claims that "large numbers of partitions are likely to increase query planning time considerably" and recommends that partitioning be used with "up to perhaps a hundred" partitions.
Assuming my table holds ten years of data, if I partitioned by week I would end up with over 500 partitions. Before I rule this out, I'd like to better understand what impact partition quantity has on query planning time. Has anyone benchmarked this, or does anyone have an understanding of how this works internally?
The query planner has to do a linear search of the constraint information for every partition of tables used in the query, to figure out which are actually involved--the ones that can have rows needed for the data requested. The number of query plans the planner considers grows exponentially as you join more tables. So the exact spot where that linear search adds up to enough time to be troubling really depends on query complexity. The more joins, the worse you will get hit by this. The "up to a hundred" figure came from noting that query planning time was adding up to a non-trivial amount of time even on simpler queries around that point. On web applications in particular, where latency of response time is important, that's a problem; thus the warning.
Can you support 500? Sure. But you are going to be searching every one of 500 check constraints for every query plan involving that table considered by the optimizer. If query planning time isn't a concern for you, then maybe you don't care. But most sites end up disliking the proportion of time spent on query planning with that many partitions, which is one reason why monthly partitioning is the standard for most data sets. You can easily store 10 years of data, partitioned monthly, before you start crossing over into where planning overhead starts to be noticeable.
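The linear search described above can be sketched as follows. This is a simplified model, not PostgreSQL's planner code: each partition is represented only by a half-open date range, the helper names are hypothetical, and the ten-year span and one-week query window are assumed values.

```python
from datetime import date, timedelta


def weekly_bounds(start: date, end: date):
    """Half-open [lo, hi) ranges of 7 days covering [start, end)."""
    bounds, lo = [], start
    while lo < end:
        hi = min(lo + timedelta(days=7), end)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def monthly_bounds(start: date, end: date):
    """Half-open [lo, hi) ranges of one calendar month covering [start, end)."""
    bounds, lo = [], start
    while lo < end:
        if lo.month == 12:
            next_month = date(lo.year + 1, 1, 1)
        else:
            next_month = date(lo.year, lo.month + 1, 1)
        hi = min(next_month, end)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def prune(bounds, q_lo: date, q_hi: date):
    """Check every partition's range against the query range, one by one."""
    checked, kept = 0, []
    for lo, hi in bounds:
        checked += 1  # one constraint comparison per partition
        if lo < q_hi and q_lo < hi:
            kept.append((lo, hi))
    return checked, kept


start, end = date(2010, 1, 1), date(2020, 1, 1)  # ten years of data
query = (date(2015, 3, 10), date(2015, 3, 17))   # one-week query window

weekly_checked, weekly_kept = prune(weekly_bounds(start, end), *query)
monthly_checked, monthly_kept = prune(monthly_bounds(start, end), *query)

if __name__ == "__main__":
    print("weekly:  checked", weekly_checked, "kept", len(weekly_kept))
    print("monthly: checked", monthly_checked, "kept", len(monthly_kept))
```

Ten years of data gives 522 weekly ranges versus 120 monthly ones, and every one of them is examined even though only one or two survive, so the checking cost scales with the partition count rather than with the amount of data the query returns.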
"large numbers of partitions are likely to increase query planning time considerably" and recommends that partitioning be used with "up to perhaps a hundred" partitions.
Because every extra partition will usually be tied to check constraints, and this will lead the planner to wonder which of the partitions need to be queried against. In a best case scenario, the planner identifies that you're only hitting a single partition and gets rid of the append step altogether.
In terms of rows, and as DNS and Seth have pointed out, your mileage will vary with the hardware. Generally speaking, though, there's no significant difference between querying a 1M row table and a 10M row table -- especially if your hard drives allow for fast random access and if it's clustered (see the cluster statement) using the index that you're most frequently hitting.
Each table partition takes up an inode on the file system. "Very large" is a relative term that depends on the performance characteristics of your file system of choice. If you want explicit performance benchmarks, you could probably look at various performance benchmarks of mail systems for your OS and FS of choice. Generally speaking, I wouldn't worry about it until you get into the tens of thousands to hundreds of thousands of table spaces (using dirhash on FreeBSD's UFS2 would be a win). Also note that this same limitation applies to DATABASES, TABLES or any other filesystem-backed database object in PostgreSQL.
If you don't want to trust the PostgreSQL developers who wrote the code, then I recommend that you simply try it yourself and run a few example queries with explain analyze and time them using different partition schemes. Your specific hardware and software configuration is likely to dominate any answer in any case.
I'm assuming that the row optimization cache which the query optimizer uses to determine what joins and restrictions to use is stored with each partition, so it probably needs to load and read parts of each partition to plan the query.

Oracle performance question

I'm wondering: if you have a table that contains 24 million records, how does that impact performance (does each insert/update/delete take significantly longer to go through)?
This is our Audit table, so when we make changes in other tables we log them to the Audit table. Does it also take significantly longer to carry out these updates?
The right answer is "it depends", of course...
But as far as I get, your concern is in how Audit table affects performance of queries (on other tables) when Audit table grows.
Probably you only insert into your Audit table. Insert time doesn't depend on amount of data already in table. So, no matter how big Audit table is, it should affect performance equally (given that database design isn't incredibly bad).
Of course, select or delete on the Audit table itself can take longer when the table grows.
If I read your question as "does a large Oracle table take longer for IUD operations", generally speaking the answer is no. I think the most impact on the insert/update/delete operations will be felt from the indexes present on this table (more indexes = slower performance for these operations).
However, if your auditing logic needs to look up existing rows in the audit table for procedural logic in some manner that doesn't use primary or unique keys, then there will be a performance impact with a large table.
There are many factors that come into play in regards to how fast an insert/update/delete occurs. For example, how many indexes are on the table? If a table has many indexes and you insert/update the table, it can cause the operation to take longer. How is the data stored in the physical structures of the database (i.e. the tablespaces if you're using Oracle, for example)? Are your indexes and data on separate disks, which can help speed up I/O?
Obviously, if you are writing out audit records then it can affect performance. But in a well-tuned database, it shouldn't be slowing it down enough to where you notice.
The approach I use for audit tables is to use triggers on the main tables and these triggers write out the audit records. But from a performance standpoint, it really depends on a lot of factors as to how fast the updates to your main tables will run.
I would recommend looking at the explain plan output for one of your slow updates if you are using Oracle (other DBs usually have such tools as well, google can help here). You can then see what plan the optimizer is generating and diagnose where the problems could be. You could potentially get a DBA to assist you as well to help figure out what's causing the slowness.
I'd suspect performance will be more related to contention than table size. Generally inserts happen at the 'end' of the table. Sessions inserting into that table have to take a latch on the block while they are writing records to it. During that time other sessions may have to wait for that block (which is where you may see "buffer busy waits" events).
But really you need to look at the SQLs in question and see what they are waiting on, and whether they are contributing significantly to an unacceptable level of performance. Then determine a course of action based on the specific wait events that are causing the problem.
Check out anything from Cary Millsap on performance tuning.
The impact of table size is different for INSERT, DELETE and UPDATE operations.
An INSERT statement is not impacted much by table size, because new rows are added to the next available data block. If there are indexes on the table, Oracle has to search each index for the right block before adding the new entry, and that search takes time.
DELETE and UPDATE statements are impacted by table size: the more data there is, the more time is required to find the particular rows to delete or update.
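To get a feel for how slowly the index search cost grows with table size, here is a back-of-the-envelope sketch. It assumes a B-tree index in which each block holds about 200 keys; that fanout is an illustrative guess, not a measured Oracle value, and `btree_levels` is a hypothetical helper.

```python
def btree_levels(num_keys: int, fanout: int = 200) -> int:
    """Smallest number of B-tree levels able to address num_keys entries.

    Assumption: every block holds `fanout` keys (illustrative only).
    """
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


row_counts = (1_000_000, 24_000_000, 1_000_000_000)
levels_by_rows = {rows: btree_levels(rows) for rows in row_counts}

if __name__ == "__main__":
    for rows, levels in levels_by_rows.items():
        print(f"{rows:>13,} rows -> {levels} index levels to traverse")
```

Under this assumption, going from 1 million to 24 million rows adds one level to each index lookup, and growing to a billion rows adds none, so the number of indexes tends to matter more for insert cost than the row count does.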

Resources