Transaction Performance Impact of Materialized View Logs (Oracle)

I have been researching using materialized views for data aggregation and reporting purposes for a company that is largely centered around transactions (using an Oracle db). The current reporting system is dependent upon a series of views that obscure a lot of the complex data logic of the application. These views place a heavy burden on the system when they are called.
We are interested in using fast refresh for incremental updates so that some of the complex query logic is precomputed before it is used in reporting; however, there is a concern within the organization that the materialized view logs (which are required for fast refresh) will have an impact on our current transaction performance in the database. That performance is essential to our organization, so there is considerable apprehension about any change.
Here is an example of the type of materialized view log we would need to implement:
create materialized view log on transaction
with rowid, sequence (transaction_id, account_id, order_id, currency_id, price, transaction_date, payment_processor_id)
including new values;
We would not be using the "on commit" clause for refreshes but rather the "on demand" clause when creating the view, as we understand "on commit" would have a larger performance impact.
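For context, the materialized view built on top of that log would look something like the sketch below; the view name and the aggregation are placeholders, not our real reporting query.
create materialized view transaction_summary_mv
refresh fast on demand
as
select account_id,
       currency_id,
       count(*)     as txn_count,
       count(price) as price_count,  -- count(price) included so that sum(price) stays fast refreshable
       sum(price)   as total_price
from   transaction
group  by account_id, currency_id;
The refresh would then be run on our own schedule with dbms_mview.refresh('TRANSACTION_SUMMARY_MV', 'F').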
Will implementing this type of logging affect database transaction performance? I imagine that it must affect performance to some degree, since there is an additional write (to the log) performed as part of each transaction, but I cannot find any reference to this in the Oracle documentation. Any literature or advice on this subject would be greatly appreciated.
Thanks for your help!

Yes, there will be an impact. The materialized view log needs to be maintained synchronously, so the transactions will need to insert a new row into the materialized view log for every row that is modified in the base table. How great an impact depends heavily on the system. If your system is I/O bound and you've optimized it so that physically writing the changes to the base table is a significant fraction of the wait time, the impact will be much greater than if your system is CPU bound and most of your time is spent reading data or performing computations.
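You can see that extra work directly if you want to: the log is just a table (named MLOG$_<table_name> by default), and every row you modify in the base table produces a row in it as part of the same transaction. A rough sketch, assuming the log from your question exists (the column values are made up):
insert into transaction (transaction_id, account_id, order_id, currency_id,
                         price, transaction_date, payment_processor_id)
values (1, 100, 200, 1, 9.99, sysdate, 5);

-- the change record has already been written to the log by this transaction,
-- before the commit
select count(*) from mlog$_transaction;

commit;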
If you are really concerned about the performance of the OLTP system, it would make sense to offload reporting to a different database on a different server. You can replicate the data to the reporting server using Streams (or GoldenGate if you can afford the additional licensing) which will have less of an impact on the source than materialized views because the redo information can be read asynchronously (and can be read on the reporting server rather than putting that workload on the production server). You could then define materialized views on the reporting server where they won't have any impact on the OLTP server. Or you could create a logical standby database as your reporting server and create the materialized views there. Either way, moving the reporting workload off the production server and reading the redo data asynchronously will protect the performance of the production server.

Related

Commits in the absence of locks in CockroachDB

I'm trying to understand how ACID in CockroachDB works without locks, from an application programmer's point of view. Would like to use it for an accounting / ERP application.
When two users update the same database field (e.g. a general ledger account total field) at the same time what does CockroachDB do? Assuming each is updating many other non-overlapping fields at the same time as part of the respective transactions.
Will the aborted application's commit process be informed about this immediately at the time of the commit?
Do we need to take care of additional possibilities than, for example, in ACID/locking PostgreSQL when we write the database access code in our application?
Or is writing code for accessing CockroachDB for all practical purposes the same as for accessing a standard RDBMS with respect to commits and in general?
Of course, ignoring performance issues / joins, etc.
I'm trying to understand how ACID in CockroachDB works without locks, from an application programmer's point of view. Would like to use it for an accounting / ERP application.
CockroachDB does have locks, but uses different terminology. Some of the existing documentation that talks about optimistic concurrency control is currently being updated.
When two users update the same database field (e.g. a general ledger account total field) at the same time what does CockroachDB do? Assuming each is updating many other non-overlapping fields at the same time as part of the respective transactions.
One of the transactions will block waiting for the other to commit. If a deadlock between the transactions is detected, one of the two transactions involved in the deadlock will be aborted.
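A rough illustration with two concurrent sessions (the table and column names are made up for the example):
-- session 1
BEGIN;
UPDATE gl_account SET balance = balance + 100 WHERE id = 42;
-- session 1 now holds a write intent on row 42 but has not committed

-- session 2
BEGIN;
UPDATE gl_account SET balance = balance - 50 WHERE id = 42;
-- blocks here, waiting for session 1 to commit or abort

-- session 1
COMMIT;
-- session 2's UPDATE now proceeds against the committed balance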
Will the aborted application's commit process be informed about this immediately at the time of the commit?
Yes.
Do we need to take care of additional possibilities than, for example, in ACID/locking PostgreSQL when we write the database access code in our application?
Or is writing code for accessing CockroachDB for all practical purposes the same as for accessing a standard RDBMS with respect to commits and in general?
At a high level there is nothing additional for you to do. CockroachDB defaults to serializable isolation, which can result in more transaction restarts than weaker isolation levels, but comes with the advantage that the application programmer doesn't have to worry about anomalies.
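The one thing worth planning for is retrying a transaction that fails with a serialization error (SQLSTATE 40001). A minimal sketch of the savepoint-based retry protocol described in the CockroachDB docs, with placeholder statements in the middle:
BEGIN;
SAVEPOINT cockroach_restart;

UPDATE gl_account SET balance = balance + 100 WHERE id = 42;
-- ... the rest of the transaction's statements ...

RELEASE SAVEPOINT cockroach_restart;
COMMIT;

-- if any statement fails with SQLSTATE 40001, issue
--   ROLLBACK TO SAVEPOINT cockroach_restart;
-- and re-run the statements from the savepoint onward instead of committing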

Data Replication in Oracle

I have a table on the master site which only allows inserts. I want to replicate the newly inserted rows, so I need something that can track the last record replicated to the local site and replicate everything inserted after it.
I have tried an Oracle materialized view but am still confused about whether to use a fast refresh or a complete refresh. I need all the newly inserted rows replicated in one transaction.
Is there a better approach? Any help would be highly appreciated.
Thanks.
A fast refresh would copy incremental changes over the network but requires that a materialized view log be created on the master site on the source table. That adds some overhead to the inserts happening on the master table but would generally make the refresh more efficient.
A complete refresh would copy every row over the network every time the materialized view is refreshed. That is likely to be less efficient from a refresh perspective but there will be no overhead to inserts on the source table and the master site does not need to create a materialized view log.
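For reference, the fast refresh option looks roughly like the sketch below; the table, link, and MV names are illustrative, and it assumes the master table has a primary key.
-- on the master site
create materialized view log on orders with primary key;

-- on the local site
create materialized view orders_local
refresh fast on demand
as
select * from orders@master_link;
A single call to dbms_mview.refresh('ORDERS_LOCAL', 'F') then pulls across only the rows inserted since the last refresh, and by default the refresh happens in one transaction (atomic_refresh defaults to TRUE), which covers your requirement that all new rows arrive together.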
Oracle provides a host of data replication technologies-- materialized views are the oldest and probably the least efficient but are relatively trivial to set up. Streams is a newer technology that has much lower overhead but is quite a bit more complex to set up. Golden Gate is the preferred replication technology today but that has extra licensing costs.

Optimizing Materialized View

Does anyone have a good way to optimize a materialized view that draws from a view in the database and is refreshed on a monthly basis? I have used the "standard" approach shown below, but are there any other bells and whistles that could make the refresh quicker and more efficient and reduce query time?
Thanks in advance.
CREATE MATERIALIZED VIEW Table_X
REFRESH FAST
START WITH SYSDATE
NEXT SYSDATE + 31
WITH PRIMARY KEY
AS <Query>;
Refresh of a materialized view, whether fast or complete, is just as amenable to performance tuning as any other operation, and generally by just about the same methods.
A refresh is just an encapsulation of various queries against the base tables, materialized view logs, the materialized view, and system tables, and all you need is insight into the complete process. It's important to realise that everything is just SQL, which means you can add indexes, modify memory allocations, use partitioning, and apply just about every other standard tuning technique.
The best mechanisms for getting insight are Oracle's own tools, such as AWR or event tracing. I've used both, but the latter is particularly insightful and will give you precise information on where the refresh time is being spent. When you see the SQL itself by using event tracing, you can probably work out where any missing indexes etc. are. Look out for the potential to index on Sys_Op_Map_Nonnull(column_name).
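For example, a 10046 trace of a manual refresh shows the recursive SQL and where the time goes, and if that SQL turns out to join on SYS_OP_MAP_NONNULL of your grouping columns, you can index that expression. A sketch using the MV name from your question; some_column is a placeholder, and SYS_OP_MAP_NONNULL is undocumented, so test this outside production first:
alter session set events '10046 trace name context forever, level 8';
exec dbms_mview.refresh('TABLE_X', 'F');
alter session set events '10046 trace name context off';

-- if the trace shows joins on SYS_OP_MAP_NONNULL(some_column):
create index table_x_mnn_ix on Table_X (sys_op_map_nonnull(some_column));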
So, having said that the techniques are all pretty standard, here are some links with information too long or too specific to include here.
https://oraclesponge.wordpress.com/2006/04/12/a-quick-materialized-view-performance-note/
http://oraclesponge.blogspot.co.uk/2005/09/optimizing-materialized-views-part-i.html
http://oraclesponge.blogspot.co.uk/2005/09/optimizing-materialized-views-part-ii.html
https://oraclesponge.wordpress.com/2005/11/23/optimizing-materialized-views-part-iii-manual-refresh-mechanisms/
https://oraclesponge.wordpress.com/2005/12/08/optimizing-materialized-views-part-iv-introduction-to-holap-cubes/
http://oraclesponge.blogspot.co.uk/2005/12/optimizing-materialized-views-part-v.html

materialized view over multiple databases

Set-up:
There is one TRANSPORT database and 4 PRODUNIT databases. All these 5 DBs are on different machines and are Oracle databases.
Requirement:
A 'UNIFIED view' is required in the TRANSPORT db which will retrieve data from a table that is present in all 4 PRODUNIT databases. So when there is a query on the TRANSPORT database (with a where clause), the data may be present in any one of the 4 PRODUNIT databases.
The query needs to be more or less 'real time', i.e. as soon as data is inserted or updated in the table in any of the 4 PRODUNIT databases, it must be IMMEDIATELY available in the TRANSPORT db.
I searched on the net and ended up with materialized views. I have the concerns below before I proceed:
Will the 'fast refresh on commit' ensure requirement 2?
The table in the individual PRODUNIT databases will experience frequent DML. I suspect a performance impact on the TRANSPORT db - am I correct? If yes, how should I proceed?
I'm rather wondering if there is an approach better than a materialized view!
A materialized view that refreshes on commit cannot refer to a remote object so it doesn't do you a lot of good. If you could do a refresh on commit, you could maintain the data in the transport database synchronously. But you can't.
I would seriously question the wisdom of wanting to do synchronous replication in this case. If you could, then the local databases would become unusable if the transport database was down or the network connection was unavailable. You'd incur the cost of a two-phase commit on every transaction. And it would be very easy for one of the produnit databases to block transactions happening on the other databases.
In virtually every instance I've ever come across, you'd be better served with asynchronous replication that keeps the transport database synchronized to within, say, a few seconds of the produnit database. You probably want to look into GoldenGate or Streams for asynchronous replication with relatively short delays.
Whether or not you require an MV would depend on the performance of the links between your databases and the volume of data concerned.
I would start with a normal view, using DB links to select the data from the PRODUNIT databases, but you would need to test this to see what the performance is like.
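A sketch of that plain-view approach, assuming one database link per PRODUNIT database and a table called shipment (all names illustrative):
create or replace view unified_shipment as
select 'PRODUNIT1' as source_db, s.* from shipment@produnit1_link s
union all
select 'PRODUNIT2', s.* from shipment@produnit2_link s
union all
select 'PRODUNIT3', s.* from shipment@produnit3_link s
union all
select 'PRODUNIT4', s.* from shipment@produnit4_link s;
Because the view queries the remote tables at run time, the data is as current as the sources, at the cost of every query paying the network round trips.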
Given requirement 2, a refresh on commit would probably be the best approach if performance on a normal view was poor.

oracle user_constraints, user_tables etc views for production

Is it OK to use those views in production? I mean, are queries against the data dictionary intended to be run frequently, or are they designed only for occasional use with tools like SQL Navigator, SQL Developer, etc.?
It depends on your definition of "frequently", the size of those objects in your database, and why you need to query them.
In general, it's fine to query data dictionary tables on a regular basis in production-- tons of database monitoring tools, for example, will regularly query a bunch of data dictionary tables to gather performance data. At the same time, though, you can easily configure most of these tools to put a tremendous load on your database by gathering too much data too frequently so your performance monitoring tool becomes the source of performance problems. Normally, you can just dial back the amount of data getting captured and the frequency at which it is captured to get 99% of the monitoring benefit without creating a bunch of issues.
I'm not sure why any tool would frequently need to query user_tables-- since tables aren't getting created or destroyed at runtime in a proper system, there aren't too many reasons why you'd really need to query that particular view all that frequently.
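As a concrete example, something like a scheduled check for constraints that have been left disabled is a perfectly reasonable recurring dictionary query and costs very little (a sketch; run it only as often as your monitoring actually needs):
-- flag any constraints that have been disabled, e.g. after a bulk load
select table_name, constraint_name, constraint_type, status
from   user_constraints
where  status = 'DISABLED';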
