I have a table partitioned on a column (rcrd_expry_ts) of date type. Another job updates rcrd_expry_ts weekly. We noticed the update query takes quite a long time (1 to 1.5 min) even for a few rows, and I think the time goes into physically moving rows to a different partition. There can be a million rows eligible for the weekly rcrd_expry_ts update.
CREATE TABLE tbl_parent
(
  "parentId" NUMBER NOT NULL ENABLE,
  "RCRD_DLT_TSTP" TIMESTAMP DEFAULT TIMESTAMP '9999-01-01 00:00:00' NOT NULL,
  CONSTRAINT pk_parent PRIMARY KEY ("parentId")  -- required for the reference-partitioned child below
)
PARTITION BY RANGE ("RCRD_DLT_TSTP") INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION "P1" VALUES LESS THAN (TIMESTAMP '2010-01-01 00:00:00'));
CREATE TABLE tbl_child
(
  "foreign_id" NUMBER NOT NULL ENABLE,
  "id" NUMBER NOT NULL ENABLE,
  CONSTRAINT fk_id FOREIGN KEY ("foreign_id") REFERENCES tbl_parent ("parentId")
)
PARTITION BY REFERENCE (fk_id);
I am updating RCRD_DLT_TSTP in the parent table from another job (a simple update statement), but I noticed it takes around 1 to 1.5 min to execute, probably because partitions are created and rows are moved into them. Is there a better way to achieve this in Oracle?
The table has a reference-partitioned child, so any rows moving partition in the parent have to be cascaded to the child table too.
This means you could be moving substantially more rows than the "few rows" that change in the parent.
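If you want to size that cascade up front, you could count the child rows attached to the parents about to move. A minimal sketch, assuming the DDL from the question and using the default timestamp as an illustrative predicate for the rows being expired:
SELECT COUNT(*)
FROM   tbl_child  c
JOIN   tbl_parent p ON p."parentId" = c."foreign_id"
WHERE  p."RCRD_DLT_TSTP" = TIMESTAMP '9999-01-01 00:00:00';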
It's also worth checking whether the update could identify the rows it needs to change faster.
You can do this by getting the plan for the update statement like this:
update /*+ gather_plan_statistics */ <your update statement>;
select *
from table(dbms_xplan.display_cursor( format => 'ALLSTATS LAST' ));
This gives you the plan for the update along with its run-time stats, which will help identify whether there are any indexes you could create to improve performance.
Is there a better way to achieve this in Oracle?
This is a question that needs to be answered in the larger context. You may well be able to make this process faster by unpartitioning the table and using indexes to identify the rows to change.
But this affects all the other statements that access this table. To what extent do they benefit from partitioning? If the answer is "substantially", is it worth making this process faster at the expense of those others? What trade-offs are you willing to make here?
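For example, if you did drop the partitioning, a plain B-tree index on the expiry column would be the obvious way to find the rows to change. A sketch (the index name is made up):
CREATE INDEX tbl_parent_dlt_tstp_idx ON tbl_parent ("RCRD_DLT_TSTP");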
I have a table TRD_TEST2 in which I have created the partitioning below for performance improvement, as the table has 80 million records:
PARTITION BY RANGE(valid_to)
INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))
( PARTITION p0 VALUES LESS THAN (TO_DATE('01-01-3999', 'DD-MM-YYYY')))
ENABLE ROW MOVEMENT;
For history purposes, we have created technical key columns valid_from and valid_to in TRD_TEST2 table. Below is the Merge query which we are running daily:
MERGE INTO TRD_TEST2 e
USING TRD_TEST2_SRC h
ON (e.ld = h.ld)
WHEN MATCHED THEN
  UPDATE SET e.valid_to = sysdate,
             e.is_current = 0
  WHERE e.valid_to = TO_DATE('01.01.3999', 'DD.MM.YYYY');
I would like to know whether it makes sense to ENABLE ROW MOVEMENT in this case, given that we are updating the valid_to column with the above merge daily, and what the impact is.
I'm not sure that the question makes sense.
If you are partitioning on valid_to and updating valid_to then you must enable row movement. If you don't enable row movement, then as soon as you try to update a row in a way that would force it into a different partition, you'll get an error (ORA-14402: updating partition key column would cause a partition change).
Choosing a partition key that is expected to change is generally frowned upon. Moving a row effectively turns an update into a delete and an insert, which is significantly more expensive (how much more depends on how much of the cost is redo generation, and how much redo the two operations produce).
As was suggested in your earlier question, partitioning on valid_from would generally make more sense, since it is a static, NOT NULL column. That also saves you from having to assign fake valid_to values to current rows. If your only concern is the performance of this merge statement, rather than all the other queries against this table, partitioning on valid_to might still make sense for you.
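For illustration, a minimal sketch of the valid_from alternative. The column list is an assumption, since the full DDL isn't shown; only ld, valid_from, valid_to and is_current appear in the question:
CREATE TABLE trd_test2 (
  ld         NUMBER NOT NULL,   -- join key used by the merge
  valid_from DATE   NOT NULL,   -- static partition key: never updated
  valid_to   DATE,              -- free to change without row movement
  is_current NUMBER(1)
)
PARTITION BY RANGE (valid_from)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p0 VALUES LESS THAN (DATE '2010-01-01'));
Note that ENABLE ROW MOVEMENT is no longer needed here, because the partition key never changes.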
We have a partitioned table. It also has an overflow partition (max partition) which sort of acts as a catch-all for records that do not match the partition criteria. The idea was to create the partitions ahead of time so that records never end up in the max partition. However, for one table this was missed, so all the records ended up in that single partition.
Now most of these records are no longer used, so they can be deleted. However, our usual approach is to drop a partition when it is too old, which cannot be done in this case. Is there an easy way to handle the purge?
Maybe it's an idea to create the partitions now, move the records into them, and then drop the old partition, but that seems like it would perform very poorly. The other option was to create a temp table into which a subset of records is moved and deleted from there, but again, moving the records individually seems time-consuming. This table has around 5 million records.
Which would be the best way forward, performance-wise? We could manage a little downtime, but not much.
We use Oracle 11g.
The table creation script looks something like this:
CREATE TABLE "TRANSACTIONS"
("year" number(4,0) NOT NULL ENABLE)
PARTITION BY RANGE ("year")
(PARTITION "P_OLD" VALUES LESS THAN (2010),
PARTITION "P_2011" VALUES LESS THAN (2011),
...
PARTITION "P_MAX" VALUES LESS THAN (MAXVALUE));
There is no need to drop the partition; you can purge it:
alter table TRANSACTIONS TRUNCATE PARTITION P_MAX UPDATE INDEXES;
or if you prefer, you can also delete the rows:
delete from TRANSACTIONS PARTITION (P_MAX);
You may use INTERVAL partitioning to make this simpler in the future, so that partitions are created automatically and each can be dropped when it ages out:
CREATE TABLE TRANSACTIONS (
...
TRANSACTION_DATE TIMESTAMP(0) NOT NULL
)
PARTITION BY RANGE (TRANSACTION_DATE) INTERVAL (INTERVAL '12' MONTH)
(PARTITION P_OLD VALUES LESS THAN (TIMESTAMP '2000-01-01 00:00:00' ) )
ENABLE ROW MOVEMENT;
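With interval partitioning, your usual purge-by-drop approach then works on the automatically created partitions too. A sketch using the partition-for clause (the date is illustrative; the one original range partition, P_OLD here, cannot be dropped this way):
ALTER TABLE TRANSACTIONS
  DROP PARTITION FOR (TIMESTAMP '2001-06-15 00:00:00')
  UPDATE INDEXES;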
I'm working with an application that has a large amount of outdated data clogging up a table in my database. Ideally, I'd want to delete all entries in the table whose reference date is too old:
delete outdatedTable where referenceDate < :deletionCutoffDate
If this statement were to be run, it would take ages to complete, so I'd rather break it up into chunks with the following:
delete outdatedTable where referenceDate < :deletionCutoffDate and rownum <= 10000
In testing, this works surprisingly slowly. The following query, however, runs dramatically faster:
delete outdatedTable where rownum <= 10000
I've been reading through multiple blogs and similar questions on StackOverflow, but I haven't yet found a straightforward description of how (or whether) rownum affects the Oracle optimizer when there are other WHERE clauses in the query. In my case, it seems as if Oracle checks
referenceDate < :deletionCutoffDate
on every single row, executes a massive select over all matching rows, and only then filters out the top 10000 rows to return. Is this in fact the case? If so, is there any clever way to make Oracle stop checking the WHERE clause as soon as it has found enough matching rows?
How about a different approach without so much DML on the table? As a permanent solution for the future, you could go for table partitioning:
Create a new table with required partition(s).
Move ONLY the required rows from your existing table to the new partitioned table.
Once the new table is populated, add the required constraints and indexes.
Drop the old table.
In future, you would just need to DROP the old partitions.
CTAS (CREATE TABLE ... AS SELECT) is another way; however, if you want the new table to be partitioned, you would have to use the exchange-partition technique, as sketched below.
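A minimal sketch of that CTAS-plus-exchange combination, using the names from the question plus a hypothetical partitioned target table and partition (note that DDL cannot use bind variables, so the cutoff appears as a literal):
-- 1. Build a plain table holding only the rows worth keeping
CREATE TABLE outdated_keep AS
  SELECT * FROM outdatedTable
  WHERE  referenceDate >= DATE '2013-01-01';

-- 2. Swap it into the matching, empty partition of the new partitioned
--    table; this is a metadata operation, no rows are copied
ALTER TABLE outdatedTable_part
  EXCHANGE PARTITION p_current WITH TABLE outdated_keep;
The two tables must have identical column structure for the exchange to succeed.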
First of all, you should read about SQL statement execution plans and learn how to read them. It will help you find answers to questions like this.
Generally, one single delete is more effective than several chunked ones. Its main disadvantage is heavy use of undo tablespace.
If you wish to delete most rows of a table, a much faster way is usually this trick:
create table new_table as
  select * from old_table
  where referenceDate >= DATE '2013-01-01';  -- DDL cannot use bind variables, so a literal cutoff
drop table old_table;
rename new_table to old_table;
-- ... recreate indexes and other stuff ...
If you wish to do this more than once, partitioning is a much better way. If the table is partitioned by date, you can query current data quickly, and you can drop a partition of outdated data in milliseconds.
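The purge itself then becomes a one-line metadata operation. A sketch with a hypothetical partition name:
ALTER TABLE outdatedTable DROP PARTITION p_2012_01 UPDATE INDEXES;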
Finally, partitioning is a way to avoid "deleting outdated records" at all. Sometimes we need old data, and it's sad if we have deleted it with our own hands. With partitioning you can archive outdated partitions outside of the database, and attach them again when you need to access the old data.
This is an old request, but I'd like to show another approach (also using partitions).
Depending on what you consider old, you could create corresponding partitions (optimally exactly two: one current, one old; but you could just as well make more). Oracle cannot partition on a bare expression, so put the parity into a virtual column (11g onwards) and list-partition on that, e.g.:
CREATE TABLE mytable (
  ...
  referenceDate DATE NOT NULL,
  year_parity   NUMBER GENERATED ALWAYS AS (MOD(EXTRACT(YEAR FROM referenceDate), 2)) VIRTUAL
)
PARTITION BY LIST (year_parity)
(
  PARTITION year_odd  VALUES (1),
  PARTITION year_even VALUES (0)
);
This could as well be months (Jan, Feb, ... Dec), decades (XX0X, XX1X, ... XX9X), half years (first_half, second_half), etc. Anything circular.
Then whenever you want to get rid of old data, truncate:
ALTER TABLE mytable TRUNCATE PARTITION year_even;
delete from your_table
where PK not in
  (select PK from your_table where rownum <= ...) -- these are the records you want to keep
I have a pretty wide table that I expect to get 7 million records a month. A small portion of these records is expected to be flagged as "problem" records.
What is the best way to implement the table to locate these records in an efficient way?
I'm new to Oracle, but is a materialized view a valid option? Are there such things in Oracle as indexed views, or is this potentially really the same thing?
Most of the reporting is by month, so partitioning by month seems like an option, but a "problem" record may theoretically linger for several months. Otherwise, the reporting should be mostly for the current month. Would you expect that querying across all month partitions to locate any problem record would cause significant performance issues compared to using a single table?
Your general thoughts on where to start would be appreciated. I realize I need to read up, and I'll do that, but I wanted to get the community's thoughts first to make sure I read the right stuff.
One more thought: the primary key is a GUID varchar2(36). In order of magnitude, how much of a performance hit would you expect this to be relative to using a NUMBER data type PK? This worries me, but it is out of my control.
It depends what you mean by "flagged", but it sounds to me like you would benefit from a simple index, function based index, or an indexed virtual column.
In all cases you should be careful to ensure that all the index columns are NULL for rows that do not need to be flagged. This way your index will contain only the rows that are flagged (Oracle does not - by default - index rows in B-Tree indexes where all index column values are NULL).
Your primary key being a VARCHAR2 GUID should make no difference, at least with regard to the specific flagging of rows in this question; indexes will point to rows via Oracle's internal ROWIDs.
Indexes support partitioning, so if your data is already partitioned, your index could be set to match.
Simple column index method
If you can dictate how the flagging works, or the column already exists, then I would simply add an index to it like so:
CREATE INDEX my_table_problems_idx ON my_table (problem_flag)
/
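Assuming rows that are not flagged leave problem_flag NULL (per the note above), the index stays small and a query like the following can use it:
SELECT *
FROM my_table
WHERE problem_flag = 'Y'
/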
Function-based index method
If the data model is fixed / there is no flag column, then you can create a function-based index assuming that you have all the information you need in the target table. For example:
CREATE INDEX my_table_problems_fnidx ON my_table (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
Now if you use the same logic in your SELECT statement, you should find that it uses the index to efficiently match rows.
SELECT *
FROM my_table
WHERE CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END IS NOT NULL
/
This is a bit clunky though, and it requires you to use the same logic in queries as in the index definition. Not great. You could use a view to mask this, but you're still duplicating logic in at least two places.
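For completeness, a sketch of that view workaround (the view name is made up):
CREATE OR REPLACE VIEW my_table_problems_v AS
SELECT t.*
FROM my_table t
WHERE CASE
        WHEN amount > 100 THEN 'Y'
        ELSE NULL
      END IS NOT NULL
/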
Indexed virtual column
In my opinion, this is the best way to do it if you are computing the value dynamically (available from 11g onwards):
ALTER TABLE my_table
ADD virtual_problem_flag VARCHAR2(1) AS (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
CREATE INDEX my_table_problems_idx ON my_table (virtual_problem_flag)
/
Now you can just query the virtual column as if it were a real column, i.e.
SELECT *
FROM my_table
WHERE virtual_problem_flag = 'Y'
/
This will use the index and puts the function-based logic into a single place.
Create a new table with just the PKs of the problem rows.
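Taking that one-line suggestion literally: a sketch with hypothetical names, assuming a problem_flag column and the GUID primary key mentioned in the question. The side table must be maintained by whatever process flags the rows:
CREATE TABLE my_table_problem_keys AS
  SELECT guid_pk FROM my_table WHERE problem_flag = 'Y';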
I have a log table with a lot of information.
I would like to partition it into two parts: the first is the logs from the past month, since they are commonly viewed; the second is the logs from the rest of the year (compressed).
My problem is that all the examples of partitioning I've found are like "up until 1/1/2013", "more recent than 1/1/2013", that is, with fixed dates.
What I am looking for/expecting is a way to define a partition over the last month, so that as the day changes, logs older than 30 days are "automatically" transferred to the compressed partition.
I guess I could create another table which is completely compressed and move data into it using jobs, but I was hoping for a built-in solution.
Thank you.
I think you want interval partitions based on a date. This will automatically generate the partitions for you. For example, monthly partitions would be:
create table test_data (
created_date DATE default sysdate not null,
store_id NUMBER,
inventory_id NUMBER,
qty_sold NUMBER
)
PARTITION BY RANGE (created_date)
INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))
(
PARTITION part_01 values LESS THAN (TO_DATE('20130101','YYYYMMDD'))
)
As data is inserted, Oracle will put it into the proper partition or create one if needed. The partition names will be a bit cryptic (SYS_xxxx), but you can use the "partition for" clause to grab only the month you want. For example:
select * from test_data partition for (to_date('20130101', 'YYYYMMDD'))
It is not possible to automatically transfer data to a compressed partition. You can, however, schedule a simple job to compress last month's partition at the beginning of every month with this statement:
ALTER TABLE some_table
  MOVE PARTITION FOR (ADD_MONTHS(TRUNC(SYSDATE), -1))
  COMPRESS;
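If you want the job scheduled inside the database, a sketch with DBMS_SCHEDULER (the job name is made up; the MOVE statement is the one above):
BEGIN
  DBMS_SCHEDULER.create_job(
    job_name        => 'COMPRESS_LAST_MONTH',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN
                          EXECUTE IMMEDIATE
                            ''ALTER TABLE some_table
                                MOVE PARTITION FOR (ADD_MONTHS(TRUNC(SYSDATE), -1))
                                COMPRESS'';
                        END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=MONTHLY;BYMONTHDAY=1',  -- first day of each month
    enabled         => TRUE);
END;
/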
If you wanted to stay with only two partitions (the current month and an archive for all past transactions), you could also merge partitions with ALTER TABLE ... MERGE PARTITIONS. As far as I know, though, that would rebuild the whole archive partition each time, so I would discourage it and keep each month in its own partition.