I'm using Oracle 11g.
A lot of the time I need to check in the stats$ and snapshot tables whether an index has been used or not.
My question is: are there any disadvantages or problems with using MONITORING USAGE for all of my indexes? Just put them all under monitoring. Maybe build a procedure that turns it on for every new index?
Thanks a lot.
Index monitoring has serious issues pre-12.2:
1) A boolean flag generally isn't enough information to make a reasonable determination about whether to keep an index or not. Was that index used once because a developer forced it with a hint, or was the index called hundreds of times?
2) The check happens at parse phase, not execute phase. (See the previous point why this is an issue).
3) There is a performance impact. While that performance impact is small, especially if you are only turning it on for a single index, that impact will be more significant if you try turning it on for all indexes, system wide.
Index Monitoring is designed to be turned on for a single index (or small group of indexes), wait some reasonable time, then turn it off and check the stats. It isn't meant to just be on all the time.
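For reference, a minimal sketch of that pre-12.2 workflow (the schema and index names are placeholders; in 11g the results show up in V$OBJECT_USAGE, which only lists indexes owned by the current user):
ALTER INDEX my_schema.my_index MONITORING USAGE;
-- ...let a representative workload run for a while...
ALTER INDEX my_schema.my_index NOMONITORING USAGE;
-- USED is only YES/NO; it says nothing about how often or how usefully the index was hit
SELECT index_name, monitoring, used, start_monitoring, end_monitoring
FROM v$object_usage
WHERE index_name = 'MY_INDEX';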
In 12.2, Index Monitoring was completely overhauled so that it is on by default for all indexes (I'm pretty sure you can't even turn it off). Oracle has largely solved the issues index monitoring had in previous versions: the performance impact is insignificant, the stats are more meaningful (an actual count of the number of times the index was used), and the stats are updated at execute phase, not parse phase.
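On 12.2 and later, that richer usage information surfaces through the DBA_INDEX_USAGE view; a quick sketch (the owner filter is a placeholder):
SELECT name, total_access_count, total_exec_count, last_used
FROM dba_index_usage
WHERE owner = 'MY_SCHEMA';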
Tim Hall has a good write-up on Index Monitoring here. Connor McDonald has an excellent YouTube video on why Index Monitoring had issues pre-12.2, and what Oracle has done to address them, here (watch from 19:15 to 27:05).
We are planning to use a context index for full text search in Oracle 12c standard edition.
The data the search will run on is JSON containing one channel post and its replies, loaded into our database from a 3rd-party tool (basically, all the chats and replies, including attributes like timestamp, user, etc., are stored in this table).
We are expecting about 50k rows of data per year and 100-150 DML operations per day. Our index is currently "SYNC ON COMMIT", so what are the recommendations for optimizing the Oracle Text index?
First, let me preface my response with a disclaimer: I am exploring using Oracle Text as part of a POC currently, and my knowledge is somewhat limited as we're still in the research phase. Additionally, our datasets are in the 10s of millions with 100k DML operations daily.
From what I've read, the Oracle docs suggest scheduling both FULL and REBUILD optimizations for indexes that incur DML, so I have currently implemented the following in our dev environment:
execute ctx_ddl.optimize_index('channel_json_ctx_idx', 'FULL'); --run daily
execute ctx_ddl.optimize_index('channel_json_ctx_idx', 'REBUILD'); --run weekly
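If it helps, here is roughly how the daily FULL pass could be scheduled with DBMS_SCHEDULER (the job name and run time are placeholders I chose, not anything prescribed by the docs):
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'OPT_CTX_IDX_FULL',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN ctx_ddl.optimize_index(''channel_json_ctx_idx'', ''FULL''); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2', -- run at 2 a.m.; pick a quiet window
    enabled         => TRUE);
END;
/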
I cannot imagine with the dataset you've identified that your index will really become that fragmented and cause performance issues. You could probably get away with less frequent optimizations than what I've mentioned.
You could even forgo scheduling the optimization and simply benchmark your performance. If you see it start to degrade, note the timespan and perhaps the count of DML operations for reference. Then run a 'FULL' and test performance again. If performance improves, create a schedule. If performance does not improve, run a 'REBUILD' and test performance once more. Assuming performance improves, you could schedule the 'REBUILD' for that time range and consider adding a more frequent 'FULL'.
Given 4-5 nodes with many IMaps holding lots of data, some of the predicate queries have started to become significantly slow. One solution to this performance issue (as I see it) could be adding indexes. However, this data is part of a sensitive system that is currently used in production.
Before adding indexes, I was wondering what the consequences of doing so on huge IMaps would be (would it lock the entire map? would it bring down the entire system? etc.). The Hazelcast documentation covers how to do it, but doesn't give any further explanation.
If you want to add the index at runtime, this is what will happen:
1) The AddIndexOperation will be executed on every partition.
2) During the execution of the AddIndexOperation, the partition will be blocked until all of the partition's data has been iterated and added to the index.
3) Queries won't be blocked in this timeframe, but get/put operations will.
I would recommend doing it in a maintenance window when you have the smallest load.
"Lots of data" is relative: just run a test in your dev environment with exactly the same amount of data to see how long adding an index will take in your environment.
I've been doing some reading on gathering table and index statistics on Oracle databases but it's left me ... confused.
For the sake of argument, let's assume Oracle 11gR2 as the RDBMS. Regarding gathering table and index statistics, when should it be done, which is the preferred way of doing it, and does Oracle really automatically gather the necessary statistics for us?
Regarding the first point: when should it be done. I've read that, as a rule of thumb, gathering table and index statistics should be done after around 10% of the table's records have been modified (inserted, updated, etc) since the last time the table was analyzed.
Regarding the second point: which is the preferred way of doing it. If we want to calculate both table and index statistics, does executing DBMS_STATS.GATHER_TABLE_STATS with default options, assuming the table is not partitioned, suffice?
Regarding the third point: does Oracle really gather the necessary statistics automatically for us? If that is the case, should I not worry about gathering table statistics at all (see points 1 and 2)?
Thanks in advance.
EDIT: Following the comment by ammoQ, I realized that the question is not clear about what the use case really is. My question is about tables that aren't "manipulated" by a user's actions, i.e. manually, but rather by procedures typically run by database jobs. Take my example, for instance: my ETL process loads several tables on a daily basis and does so in approximately 1 hour. Of that hour, about half is spent analyzing the tables themselves. Thus, the tables are analyzed daily, following insertions or updates. This seems like overkill, hence the question.
In general, you need to have statistics that are representative (not necessarily accurate) and that give you the right execution plan. By default, Oracle will run a statistics collection job, during the nightly batch window. That may be fine for some applications, but if you have a data warehouse, which presumably includes a regular data load process, then managing the stats should be part of that process. Note that I have said "managing" and not "collecting" statistics. That's just my way of saying that there are other options for statistics in addition to just gathering statistics, although gathering statistics would be where I would start.
There are also things that can be done to optimize statistics gathering, incremental statistics for example. The other thing that is very important is to use the AUTO sample size when gathering stats. Do not specify a percentage, not even 100%. The reason is that auto sample size enables a number of internal optimizations and capabilities that are disabled if you do not use it.
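For example, enabling incremental statistics on a partitioned table is a one-time preference (the schema and table names here are placeholders):
EXEC DBMS_STATS.SET_TABLE_PREFS('MY_SCHEMA', 'MY_PART_TABLE', 'INCREMENTAL', 'TRUE');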
So, taking your specific points:
1) 10% staleness is a pretty arbitrary threshold, and is just the number used by the automatic stats job.
2) dbms_stats.gather_table_stats() with default values is the preferred method (see the sketch after this list). One parameter I might change is DEGREE, to enable gathering stats in parallel.
3) In 12c, basic stats are gathered automatically on load into an empty table (or empty partition), and stats are built on indexes when the indexes are created. So, to reiterate what I said above, stats gathering should be part of your ETL process.
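As a minimal sketch of point 2 (schema, table, and DEGREE are placeholder values; ESTIMATE_PERCENT already defaults to AUTO_SAMPLE_SIZE and is spelled out here only to make the point explicit):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'MY_SCHEMA',
    tabname          => 'MY_TABLE',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, -- never hard-code a percentage
    degree           => 4,                           -- optional: gather in parallel
    cascade          => TRUE);                       -- also gather stats on the table's indexes
END;
/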
I hope that makes sense and helps.
Oracle states that 'Invisible indexes are especially useful for testing the removal of an index before dropping it or using indexes temporarily without affecting the overall application.'
I don't understand why invisibility is 'especially' useful for this. Wouldn't making an index unusable be especially useful, since an unusable index is not maintained by DML operations and therefore resembles a dropped index more closely than a merely invisible one does? I've never actually worked with this; I'm guessing that toggling an index invisible/visible is easier than toggling it unusable/usable, because you have to rebuild an index somehow when you make it usable again?
It is referring to the impact on your queries via statistics and the optimizer.
Many Oracle databases have complex schemas and user bases, as well as really large tables and indexes. Some even have very tightly controlled schema statistics. Dropping a big index can be an expensive step (time-wise).
The statistics gatherer collects statistics to populate the data dictionary, which is used by the optimizer, and index stats are one of the key inputs to the Cost Based Optimizer. "Faking" the drop causes the optimizer to act as if the index is gone, and you can then see the impact on the query plans. If you find that the drop wasn't such a good idea, you can immediately revert it. Some indexes, on the other hand, take hours to build, so you can see how valuable it is to be able to test the drop first.
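To make the difference concrete, a quick sketch (the index name is a placeholder): invisible is instantly reversible, unusable is not.
ALTER INDEX my_schema.my_index INVISIBLE; -- optimizer ignores it, but DML still maintains it
-- ...observe query plans and performance...
ALTER INDEX my_schema.my_index VISIBLE;   -- instant revert, no rebuild needed
ALTER INDEX my_schema.my_index UNUSABLE;  -- maintenance stops; reverting requires:
ALTER INDEX my_schema.my_index REBUILD;   -- potentially hours on a big index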
I have an app using an Oracle 11g database. I have a fairly large table (~50k rows) which I query thus:
SELECT omg, ponies FROM table WHERE x = 4
Field x was not indexed, I discovered. This query happens a lot, but the thing is that performance wasn't too bad. Adding an index on x made the queries approximately twice as fast, which is far less than I expected. On, say, MySQL, it would have made the query at least ten times faster. (Edit: I did test this on MySQL, and saw a huge difference there.)
I'm suspecting Oracle adds some kind of automatic index when it detects that I query a non-indexed field often. Am I correct? I can find nothing even implying this in the docs.
As has already been indicated, Oracle 11g does NOT dynamically build indexes based on prior experience. It is certainly possible, and indeed happens often, that adding an index under the right conditions will produce the order-of-magnitude improvement you mention.
But as has also already been noted, 50k (seemingly short?) rows is nothing to Oracle. The Oracle database has a great deal of intelligence that allows it to scan data very efficiently even without indexes, and every new release of the Oracle RDBMS gets better at moving large amounts of data. I would suggest that the reason Oracle was so close to its "best" timing even without the index, compared to MySQL, is that Oracle is simply a more intelligent database under the covers.
However, the Oracle RDBMS does have many features that touch upon the subject area you have opened. For example:
10g introduced a feature called AUTOMATIC SQL TUNING, exposed via a GUI known as the SQL TUNING ADVISOR. This feature analyzes queries on its own, in depth, and includes the ability to do what-if analysis of alternative query plans, including the simulation of indexes that do not actually exist. However, this would not explain any performance differences you have seen, because the feature needs to be turned on, and it does not actually build any indexes; it only makes recommendations for the DBA to create indexes, among other things.
11g includes AUTOMATIC STATISTICS GATHERING which when enabled will automatically collect statistics on database objects as it deems necessary based on activity on those objects.
Thus the Oracle RDBMS is doing what you have suggested: dynamically altering its environment on its own, based on its experience with your workload over time, in order to improve performance. Creating indexes on the fly is just not one of the things it does yet. As an aside, Oracle has hinted at this privately several times, so I figure it is in the works for some future release.
Does Oracle 11g automatically index fields frequently used for full table scans?
No.
Regarding the MySQL issue, which storage engine you use can make a difference.
"MyISAM relies on the operating system for caching reads and writes to the data rows while InnoDB does this within the engine itself"
Oracle will cache the table/data rows, so it won't need to hit the disk. Depending on the OS and hardware, there's a chance that MySQL MyISAM had to physically read the data off the disk each time.
~50K rows, depending greatly on how big each row is, could conceivably be stored in under 1000 blocks, which could be quickly read into the buffer cache by a full table scan (FTS) in under 50 multi-block reads.
Adding appropriate index(es) will allow queries on the table to scale smoothly as the data volume and/or access frequency goes up.
"Adding an index on x did make the
queries approximately twice as fast,
which is far less than I expected. On,
say, MySQL, it would've made the query
ten times faster, at the very least."
How many distinct values of X are there? Are they clustered in one part of the table or spread evenly throughout it?
Indexes are not some voodoo device: they must obey the laws of physics.
Edit
"Duplicates could appear, but as it is, there are none."
If that column has neither a unique constraint nor a unique index, the optimizer will choose an execution path on the basis that there could be duplicate values in that column. This is the value of declaring the data model as accurately as possible: it provides metadata to the optimizer. Keeping the statistics up to date is also very useful in this regard.
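For example (the table and constraint names are placeholders), declaring the uniqueness tells the optimizer that the predicate can return at most one row:
ALTER TABLE my_table ADD CONSTRAINT my_table_x_uk UNIQUE (x);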
You should have a look at the estimated execution plan for your query, before and after the index has been created. (Also, make sure that the statistics are up-to-date on your table.) That will tell you what exactly is happening and why performance is what it is.
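For example, assuming MY_TABLE stands in for the real table name:
EXPLAIN PLAN FOR SELECT omg, ponies FROM my_table WHERE x = 4;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);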
50k rows is not that big a table, so I wouldn't be surprised if performance was decent even without the index; in that case, adding the index to the equation can't really bring much improvement to query execution speed.