Adding a sequence to a large Oracle table - oracle

I have queries that take an existing large table and build tables off of them for reporting. The problem is that the source tables are 60-80MM+ records and it takes a long time to recreate. I'd like to be able to identify which records are new so I can build just add the new records to the reporting tables.
To me, the best way to identify this is to have an identity column. Is there any significant cost to creating this and adding it to the table?
Separately, is it possible to create a materialized view that takes data from one of these tables but add a sequence as part of the materialized view? That is, something like
create materialized view some_materialized_view as
select somesequence.nextval, source_table.*
from source_table?

You can add a sequence based column to your table, but as Gary suggests I wouldn't do that.
The task you are about to solve is so common that other solutions have been already implemented.
The first built-in option that comes to mind is the system change number SCN, a kind of Oracle internal clock. By default, tables are set up to record the SCN of the whole (usually 8K) block, containing usually many rows, but you can set a table to keep a record of the SCN that changed every row. Then you can track the columns that are new or change and have not been copied to your reporting tables.
CREATE TABLE t (c1 NUMBER) ROWDEPENDENCIES;
INSERT INTO t VALUES (1);
COMMIT;
SELECT c1, ora_rowscn FROM t;
Secondly, I would think of adding a date column. With 60-80 mio rows I wouldn't do this with ALTER TABLE xxx ADD (d DATE DEFAULT SYSDATE), but with rename, create as select, drop:
CREATE TABLE t AS SELECT * FROM all_objects;
RENAME t TO told;
CREATE TABLE t AS SELECT sysdate AS d, told.* FROM told;
ALTER TABLE t MODIFY d DATE DEFAULT SYSDATE;
DROP TABLE told;
Thirdly, I would read up on materialized views. I never had the chance to use this a work, but in theory, you should be able to set up a materialized view log on your 80 m table that records changes and updates dependent materialized views.
And forthly, I'd look into partitioning your large table on the (newly introduced) date column, so that identifying the new rows will become faster. That sadly depends on your version and Oracle license, though.

Related

Move Range Interval partition data from one table to history table in other database

We have a primary table that is Range partitioned by date with a 1-month interval. It's also a list sub-partitioned with 4 distinct values. So essentially it is one month partition having 4 sub-partitions.
Database: Oracle 19c
I need advice on how to effectively move the partition/sub-partition data from active schema to historical schema in another database.
Also, there are about 30 tables that are referenced partitioned on the primary table for which the data needs to be moved as well. Overall I'm looking to move about 2500 subpartitions
I'm not sure if an exchange partition would be the right approach in this scenario?
TIA
You could use exchange to get the data rapidly out of your active table, but you would still then to send that table over the wire to the remote history database to load it in.
In which case, using "exchange" probably is just adding more steps to the process for little gain. (There are still potential uses here depending on how you want to handle indexing etc).
But simplest is perhaps just transferring the data over, assuming a common structure between the two tables, ie
insert /*+ APPEND */ into history_table#remote_db
select * from active_table partition ( myparname )
I can't remember if partition naming syntax is supported over a db link, but if not, then the appropriate date predicates will do the same trick, and then just follow up with:
alter table active_table truncate partition myparname;

Bulk update in Oracle12c

I have a situation like to update a column(all rows) in a table having 150 million records.
Creation of duplicate table with updates and dropping of previous table is the best way but there is no available disk space to hold the duplicate table.
So how to perform the update in less time? Partitions are there on the table.
I am using oracle 12c
The cleanest approach is NOT updating the table, but creating a new table with the new column of updated rows. For instance, let's say I needed to update a column called old_value with the max of some value, instead of updating the old_table one does:
create new_table as select foo, bar, max(old_value) from old_table;
drop table old_table;
rename new_table as old_table.
If you need even more speed, you can do this creation using a parallel query with nologging thereby generating very little redo and no undo logs. More details can be ascertained here: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:6407993912330

Oracle Create Table as Select * from Another_Table same table space

I didn't design the DB so don't judge me on this.
I have a log table that is receiving A LOT of entries. I only need to keep a day or so on this this log table. My initial thought was:
In a single transaction:
1. rename the log table
2. create the original log table from the renamed log table
3. commit the trx and life goes on
The second time this happens I drop the renamed table and do it all over again. This will run as an Oracle job once a day.
The original question:
Would anyone know if I specify a table space name in table #1 like so:
create table "my_user"."first_table" (pkid number, full_name varchar2(50)) nologging tablespace "my_custom_tablespace";
Then I do something like:
create table second_table as select * from first_table where 1=2 -- because I only want the structure
Will my second_table be in the same table_space?
Thanks in advance for your help.
If you are on Enterprise Edition with partitioning, then a simpler solution is to go with an interval partitioned table, with one partition per day. Then truncate the partitions when you don't need them.
If not, then go with two tables, a synonym to point to the 'current' one that is being inserted into, and a view that selects from a union of the two tables. The nightly job would truncate the 'old' table and switch the synonym to make it the 'new' one.

Optimizing a delete... where query with rownum

I'm working with an application that has a large amount of outdated data clogging up a table in my databank. Ideally, I'd want to delete all entries in the table whose reference date is too old:
delete outdatedTable where referenceDate < :deletionCutoffDate
If this statement were to be run, it would take ages to complete, so I'd rather break it up into chunks with the following:
delete outdatedTable where referenceData < :deletionCutoffDate and rownum <= 10000
In testing, this works suprisingly slowly. The following query, however, runs dramatically faster:
delete outdatedTable where rownum <= 10000
I've been reading through multiple blogs and similar questions on StackOverflow, but I haven't yet found a straightforward description of how/whether using rownum affects the Oracle optimizer when there are other Where clauses in the query. In my case, it seems to me as if Oracle checks
referenceData < :deletionCutoffDate
on every single row, executes a massive Select on all matching rows, and only then filters out the top 10000 rows to return. Is this in fact the case? If so, is there any clever way to make Oracle stop checking the Where clause as soon as it's found enough matching rows?
How about a different approach without so much DML on the table. As a permanent solution for future you could go for table partitioning.
Create a new table with required partition(s).
Move ONLY the required rows from your existing table to the new partitioned table.
Once the new table is populated, add the required constraints and indexes.
Drop the old table.
In future, you would just need to DROP the old partitions.
CTAS(create table as select) is another way, however, if you want to have a new table with partition, you would have to go for exchange partition concept.
First of all, you should read about SQL statement's execution plan and learn how to explain in. It will help you to find answers on such questions.
Generally, one single delete is more effective than several chunked. It's main disadvantage is extremal using of undo tablespace.
If you wish to delete most rows of table, much faster way usially a trick:
create table new_table as select * from old_table where date >= :date_limit;
drop table old_table;
rename table new_table to old_table;
... recreate indexes and other stuff ...
If you wish to do it more than once, partitioning is a much better way. If table partitioned by date, you can select actual date quickly and you can drop partion with outdated data in milliseconds.
At last, paritioning if a way to dismiss 'deleting outdated records' at all. Sometimes we need old data, and it's sad if we delete it by own hands. With paritioning you can archive outdated partitions outside of the database, but connects them when you need to access old data.
This is an old request, but I'd like to show another approach (also using partitions).
Depending on what you consider old, you could create corresponding partitions (optimally exactly two; one current, one old; but you could just as well make more), e.g.:
PARTITION BY LIST ( mod(referenceDate,2) )
(
PARTITION year_odd VALUES (1),
PARTITION year_even VALUES (0)
);
This could as well be months (Jan, Feb, ... Dec), decades (XX0X, XX1X, ... XX9X), half years (first_half, second_half), etc. Anything circular.
Then whenever you want to get rid of old data, truncate:
ALTER TABLE mytable TRUNCATE PARTITION year_even;
delete from your_table
where PK not in
(select PK from your_table where rounum<=...) -- these records you want to leave

Is there a way to query the changes made by a materialized view fast refresh in Oracle?

Say that you have two Oracle databases, DB_A and DB_B. There is a table named TAB1 in DB_A with a materialized view log, and a materialized view named SNAP_TAB1 in DB_B created with
CREATE SNAPSHOT SNAP_TAB1
REFRESH FAST
AS SELECT * FROM TAB1#DB_A;
Is there a way to query in DB_B the changes made to SNAP_TAB1 after each call to fast-refresh the materialized view ?
DBMS_SNAPSHOT.REFRESH( 'SNAP_TAB1', 'F' );
In DB_A, prior to the refresh, you can query the materialized view log table, MLOG$_TAB1, to see which rows have been changed in TAB1. I'm looking for a way to query in DB_B, after each refresh, which rows have been refreshed in SNAP_TAB1.
Thanks!
I think the lines below work with prebuilt table:
You can add a column in the table SNAP_TAB1.
For inserts You can put it on default sysdate => for every insert you'll have the timestamp of the insert.
For updates you can use a trigger. Because the column is not involved in the Materialized View, updating the column with the trigger won't be a problem.
Probaly better, with the trigger you can use an unique id to store in that column, incremented before every new refresh.(Obtaining the unique id may have different aproaches.)
Obviously, you can't track deletes with this idea.

Resources