I would like to know whether dropping Oracle indexes and recreating them will pose any data issues, assuming this is done during scheduled downtime.
I recently discovered that some indexes were placed in the wrong tablespace, and I would like to correct this by dropping each index and recreating it in the correct tablespace.
Please kindly advise.
I don't see a problem with that, but instead of drop/create you could also use the syntax below:
alter index <INDEX_NAME> rebuild tablespace <TABLESPACE_NAME>
To address what you asked in the comment below, the ALTER INDEX ... REBUILD should be faster. The reason is that when you drop the index and create it again, the index tree is built from the table itself, whereas with ALTER INDEX ... REBUILD Oracle reads the existing index, resulting in a smaller amount of I/O.
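If several indexes ended up in the wrong place, you could generate the rebuild statements from the data dictionary. A rough sketch; WRONG_TS and RIGHT_TS are placeholders for your source and target tablespace names:
-- Generate ALTER INDEX ... REBUILD statements for every index sitting in the wrong tablespace.
select 'alter index ' || index_name || ' rebuild tablespace RIGHT_TS;'
from   user_indexes
where  tablespace_name = 'WRONG_TS';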
If we specify ONLINE in the CREATE INDEX statement, the table isn't locked during creation of the index. Without the ONLINE keyword it isn't possible to perform DML operations on the table. But is a SELECT on the table still possible in the meantime? After reading the description of the CREATE INDEX statement it still isn't clear to me.
I ask because I wonder whether it is similar to PostgreSQL or SQL Server:
In PostgreSQL writes on the table are not possible, but one can still read the table - see the CREATE INDEX doc > CONCURRENTLY parameter.
In SQL Server writes on the table are not possible, and additionally if we create a clustered index reads are also not possible - see the CREATE INDEX doc > ONLINE parameter.
Creating an index does NOT block other users from reading the table. In general, almost no Oracle DDL commands will prevent users from reading tables.
There are some DDL statements that can cause problems for readers. For example, if you TRUNCATE a table, other users who are in the middle of reading that table may get the error ORA-08103: Object No Longer Exists. But that's a very destructive change that we would expect to cause problems. I recently found a specific type of foreign key constraint that blocked reading the table, but that was likely a rare bug. I've caused a lot of production problems while adding objects, but so far I've never seen adding an index prevent users from reading the table.
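To be explicit about the syntax, an online index build looks like this; a minimal sketch, where EMPLOYEES and LAST_NAME are just placeholder names:
-- Readers are not blocked by an index build; with ONLINE, writers are not blocked either.
create index employees_last_name_idx
    on employees (last_name)
    online;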
I'm using the Data Skipping Indexes feature in ClickHouse and I got confused about its usage. If I add a data skipping index when I create the table, like this:
CREATE TABLE MyTable
(
...
INDEX index_time TimeStamp TYPE minmax GRANULARITY 1
)
ENGINE = MergeTree()
...
When I query with a TimeStamp filter condition, the index_time index works. But if I didn't add the index when creating the table and instead added it later with the Manipulations With Data Skipping Indices feature, like this:
ALTER TABLE MyTable ADD INDEX index_time TimeStamp TYPE minmax GRANULARITY 1
Then the index 'index_time' doesn't work.
My database is running in production so I can't recreate the table; I have to use the second way. Can anyone explain why it does not work, or whether I used the feature in a wrong way?
The reason your queries don't use the index after an ALTER TABLE ADD INDEX is because the index does not exist yet. (!)
Any new data will be properly indexed, which is why your index works when you put it in CREATE TABLE. ClickHouse builds the index as you load data. If you created the table, ran ALTER TABLE ADD INDEX, and only then loaded data, you would see the index work in that case too.
When the data already exists, things are different. ALTER TABLE updates the metadata for the table, but at that point all your data has already been written to parts in the table. ClickHouse does not automatically rewrite parts to implement new indexes. However, you should be able to force a rewrite that includes the index by running:
OPTIMIZE TABLE MyTable FINAL
See the Github issue https://github.com/yandex/ClickHouse/issues/6561 referenced by Ruijang for more information.
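Putting it together for an existing table, the sequence would look roughly like this, using the names from the question (the date literal is just a placeholder). Note that OPTIMIZE ... FINAL rewrites every part, which can be expensive on a big table:
-- 1. Register the skip index (metadata only; existing parts are untouched).
ALTER TABLE MyTable ADD INDEX index_time TimeStamp TYPE minmax GRANULARITY 1;

-- 2. Force a rewrite of the existing parts so they get the index as well.
OPTIMIZE TABLE MyTable FINAL;

-- 3. Queries filtering on TimeStamp can now skip granules.
SELECT count() FROM MyTable WHERE TimeStamp >= '2019-01-01 00:00:00';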
It's quite right that
OPTIMIZE TABLE my_table_name FINAL;
does rebuild the indexes set on the table. But there are scenarios in a columnar DB where you want to avoid rewriting EVERYTHING. If you just add a single index to an already existing table with lots of data, it is enough to build up just the new index, which involves two steps:
Step 1 - Define the index
Creating the INDEX itself just defines what the index should do, which is reflected in ClickHouse as metadata added to the table. So no index is actually built up yet, and nothing will be faster yet. It's also a lightweight operation, as it won't change data or build up any structures besides the table metadata.
It's important to understand that any new incoming data will be indexed on insert, but any pre-existing data is not included!
ALTER TABLE my_table_name ADD INDEX my_index(my_expression) TYPE minmax GRANULARITY 1
Note that ClickHouse can index expressions, so it could simply be the column name as in the question or a more complex expression (e.g. my_index(price * sold_items * revshare)). The index will, of course, only work on that expression.
Step 2 - Build up (materialize) the index
After the metadata has been created, the index for the existing data needs to be built up. This action is called materializing the index and needs to be triggered explicitly. The good thing is that you can do this individually for any index that was added or changed. This is a heavy operation, as it will trigger work on the database.
ALTER TABLE my_table_name MATERIALIZE INDEX my_index;
Also have a look at the ClickHouse docs for Manipulating Data Skipping Indices.
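As a side note, if only part of the data needs the index (say, a recent partition), materialization can also be limited to a single partition. A sketch with a hypothetical partition id; check the exact syntax against your ClickHouse version:
-- Materialize the new skip index for one partition only, instead of the whole table.
-- 201909 is a hypothetical partition id, assuming PARTITION BY toYYYYMM(...).
ALTER TABLE my_table_name MATERIALIZE INDEX my_index IN PARTITION 201909;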
I am using basic compression in Oracle to archive seemingly unused tables as a first step to dropping them. I used these commands:
alter table table1 compress basic;
alter table table1 move;
The move invalidates the index. Does the invalid index still take up space? The index no longer shows up in a query of the USER_SEGMENTS table.
It would be useful to know this when deciding whether I need to drop, or rebuild and compress, the index to save even more space.
ALTER TABLE ... MOVE marks the indexes built on that table UNUSABLE.
In earlier versions, UNUSABLE indexes still had their segments and the allocated space was not freed up. Starting from version 11.2, UNUSABLE index segments are automatically removed.
A non-existent segment consumes less space than a compressed index segment :)
If you want to use those indexes again, you have to rebuild them. Otherwise, just drop them.
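For example, something along these lines finds what the move left UNUSABLE and rebuilds it. A minimal sketch; TABLE1 is the table from the question and TABLE1_IDX is a hypothetical index name (COMPRESS can be added to the rebuild if index key compression is wanted):
-- See which indexes on the moved table are now UNUSABLE ...
select index_name, status
from   user_indexes
where  table_name = 'TABLE1'
and    status = 'UNUSABLE';

-- ... and rebuild the ones you still need.
alter index table1_idx rebuild;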
We are running Oracle 11g and have some partitioned tables. I am trying to write an automated process to script out the indexes on these tables. (Basically when we do bulk loads, we want to drop all the indexes beforehand and recreate them afterward.)
The problem I have is knowing how to script out the partitioned indexes. Some are created with "LOCAL STORE IN (tablespacename)" and others just with "LOCAL" (which stores index extents in the same partition as the data). In either case, dba_indexes.tablespace_name is null, and I am having a heck of a time scripting out the two different cases correctly.
I know I can simply re-run the original DDL to recreate the indexes, but multiple parts of the organization can make changes, and there would be less risk if the loader tool could be self-contained and simply rebuild whatever was there to begin with.
I can query dba_ind_subpartitions, and if the tablespace_name values for every subpartition all match, then I can assume/infer that I should STORE IN that tablespace name. But, if the table is in a small single-partition state (e.g. newly created or just after archival), then the ones created with just LOCAL also match this test, so this is also not a perfect way of telling them apart.
I can compare the names of the index subpartition tablespaces to the data table partition tablespaces, and if they match, then I can assume/infer that those should be created with just LOCAL. But, that drags a bunch of extra tables into my query and makes it really hard to read, so I am worried about maintainability going forward. Plus, it just seems like a kludge.
It seems like there should be someplace in Oracle's data dictionaries where it is simply keeping track of this, and where I can just directly look it up instead of having to do a bunch of math and rely on assumptions. But, I have done a good deal of digging and haven't yet found it. So, any help would be much appreciated.
Although an insert alone is faster without the presence of indexes, have you benchmarked a load into tables with indexes enabled and established that it is slower than disabling (more robust than dropping!) and rebuilding them?
When you direct path insert into a table with indexes, Oracle optimises the index maintenance process by creating temporary segments to hold just the data required for the index builds. This generally allows the index maintenance to scan much smaller segments than otherwise required -- the temp segments plus the existing indexes.
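For reference, a direct path load is what you get with the APPEND hint. A minimal sketch with placeholder table names (target_table, staging_table):
-- Direct path insert: index maintenance is deferred and performed from temporary
-- segments at the end of the load rather than row by row.
insert /*+ append */ into target_table
select * from staging_table;
commit;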
Well, as jonearles describes, the dbms_metadata package is the way to generate DDL for existing objects.
But, it seems to me, this is more work than is required for what you're trying to achieve. If this is all for loading data, I recommend you simply alter the indexes to be unusable, set 'skip_unusable_indexes=true', do the data load, and then rebuild the indexes.
This should achieve what you want, without having to drop and re-create the indexes.
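That sequence would look roughly like this; my_table_idx is a placeholder, skip_unusable_indexes already defaults to TRUE on recent versions, and any unique indexes still need to be handled separately:
alter index my_table_idx unusable;
alter session set skip_unusable_indexes = true;

-- ... perform the bulk load here ...

alter index my_table_idx rebuild;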
DBMS_METADATA.GET_DDL is easier than querying the data dictionary:
--Sample table and index.
create table test1(a number);
create index test1_idx on test1(a);
--Store the DDL, drop the index, then re-create it.
declare
    ddl_before clob;
begin
    ddl_before := dbms_metadata.get_ddl('INDEX', 'TEST1_IDX');
    execute immediate 'drop index test1_idx';
    --Do some processing here.
    execute immediate ddl_before;
end;
/
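If you need to cover every index on a table rather than a single named one, the same idea extends to a loop over the data dictionary. A rough sketch using the TEST1 sample table from above, ignoring error handling; DBMS_METADATA reproduces the partitioning and storage details in the captured DDL:
declare
    type ddl_tab is table of clob index by pls_integer;
    saved_ddl ddl_tab;
    i pls_integer := 0;
begin
    --Capture and drop every index on the table.
    for r in (select index_name from user_indexes where table_name = 'TEST1') loop
        i := i + 1;
        saved_ddl(i) := dbms_metadata.get_ddl('INDEX', r.index_name);
        execute immediate 'drop index ' || r.index_name;
    end loop;

    --Do the bulk load here.

    --Re-create the indexes exactly as they were.
    for j in 1 .. i loop
        execute immediate saved_ddl(j);
    end loop;
end;
/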
We use mini profiler in two ways:
On developer machines with the pop-up
In our staging/prod environments with SqlServerStorage storing to MS SQL
After a few weeks we find that writing to the profiling DB takes a long time (seconds), and is causing real issues on the site. Truncating all profiler tables resolves the issue.
Looking through the SqlServerStorage code, it appears the inserts also check that a row with that id doesn't already exist. Is this to ensure DB-agnostic code? It seems this would introduce a massive penalty as the number of rows increases.
How would I go about removing the performance penalty from the performance profiler? Is anyone else experiencing this slow down? Or is it something we are doing wrong?
Cheers for any help or advice.
Hmm, it looks like I made a huge mistake in how that MiniProfilers table was created when I forgot about the primary key being clustered by default... and the clustered index is on a GUID column, a very big no-no.
Because data is physically stored on disk in the same order as the clustered index (indeed, one could say the table is the clustered index), SQL Server has to keep every newly inserted row in that physical order. This becomes a nightmare to keep sorted when we're using essentially a random number.
The fix is to add an auto-increasing int and switch the primary key to that, just like all the other tables (why I overlooked this, I don't remember... we don't use this storage provider here on Stack Overflow or this issue would have been found long ago).
I'll update the table creation scripts and provide you with something to migrate your current table in a bit.
Edit
After looking at this again, the main MiniProfilers table could just be a heap, meaning no clustered index. All access to the rows is by that guid ID column, so no physical ordering would help.
If you don't want to recreate your MiniProfiler sql tables, you can use this script to make the primary key nonclustered:
-- first remove the clustered index from the primary key
declare @clusteredIndex varchar(50);
select @clusteredIndex = name
from sys.indexes
where type_desc = 'CLUSTERED'
and object_name(object_id) = 'MiniProfilers';
exec ('alter table MiniProfilers drop constraint ' + @clusteredIndex);
-- and then make it non-clustered
alter table MiniProfilers add constraint
PK_MiniProfilers primary key nonclustered (Id);
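To confirm the result, a quick check against sys.indexes (the same view the script uses) should now show the primary key as NONCLUSTERED and the table itself as a HEAP:
-- Verify the MiniProfilers table is now a heap with a nonclustered primary key.
select name, type_desc, is_primary_key
from sys.indexes
where object_name(object_id) = 'MiniProfilers';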
Another Edit
Alrighty, I've updated the creation scripts and added indexes for most querying - see the code here in GitHub.
I would highly recommend dropping all your existing tables and rerunning the updated script.