I have a very sensitive application using Oracle 12c, where I need to send a notification to another system whenever there is an insert/update on a table. The ways I know to achieve this are:
polling the table at regular intervals;
putting a trigger on the table for insert/update. In both cases I am worried about the additional load on the database;
replicating the data to another database with GoldenGate and continuously polling that copy, so that I don't have to worry about the overhead.
I am not sure about a materialized view. Can it stay lightweight if refreshed every 1-2 seconds?
Is there a lightweight programmatic alternative anyone can suggest?
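One way to keep the trigger lightweight is to have it write only a key into a small change-log table and poll that instead of the big table. Here is a minimal sketch of that pattern, with sqlite3 standing in for Oracle; the table and column names (`orders`, `order_changes`, etc.) are made up for illustration.

```python
# Sketch: a lightweight trigger logs only the key of each changed row into a
# small change-log table; the notifier polls that tiny table, not the big one.
# sqlite3 stands in for Oracle here; all names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE order_changes (
    change_seq INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id   INTEGER
);
CREATE TRIGGER orders_ins AFTER INSERT ON orders
  BEGIN INSERT INTO order_changes(order_id) VALUES (NEW.id); END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
  BEGIN INSERT INTO order_changes(order_id) VALUES (NEW.id); END;
""")

last_seen = 0

def poll_changes():
    """Return ids changed since the last poll; cheap because the log is tiny."""
    global last_seen
    rows = conn.execute(
        "SELECT change_seq, order_id FROM order_changes "
        "WHERE change_seq > ? ORDER BY change_seq",
        (last_seen,)).fetchall()
    if rows:
        last_seen = rows[-1][0]
    return [r[1] for r in rows]

conn.execute("INSERT INTO orders(id, status) VALUES (1, 'NEW')")
conn.execute("UPDATE orders SET status = 'SHIPPED' WHERE id = 1")
print(poll_changes())   # -> [1, 1] (one insert, one update of order 1)
print(poll_changes())   # -> []
```

The trigger still adds some write overhead, but it is a single insert of a key per DML, which is usually far cheaper than polling or refreshing the full table.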
Let's start with the background. I have an API endpoint that I have to query every 15 minutes, and it returns complex data. Unfortunately this endpoint does not tell me what exactly changed, so I have to compare the returned data against everything I have in the db and then execute the corresponding updates, inserts, and deletes. This is pretty tedious...
I had the idea that I could simply remove all data from certain tables and rebuild everything from scratch... But I also have to return this cached data to my clients, so there might be a situation where the db is empty during a client request because it is being "refreshed/rebuilt". And that can't happen, because I have to return something.
So my idea is to either
lock the relevant db tables so that the client has to wait for the db refresh,
or
use CQRS: https://martinfowler.com/bliki/CQRS.html
Do you have any suggestions on how to solve this problem?
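For what it's worth, the compare-and-apply step itself does not have to be tedious if both sides are keyed by id. A minimal sketch (the record shapes and names here are invented, not from any particular API):

```python
# Sketch of the "compare and apply" approach: diff the fresh API snapshot
# against the current DB rows, both keyed by record id, instead of wiping
# the tables. All names and record shapes are illustrative.
def diff(db_rows: dict, api_rows: dict):
    """Both args map record id -> record dict.
    Returns (to_add, to_update, to_delete)."""
    to_add    = {k: v for k, v in api_rows.items() if k not in db_rows}
    to_update = {k: v for k, v in api_rows.items()
                 if k in db_rows and db_rows[k] != v}
    to_delete = set(db_rows) - set(api_rows)
    return to_add, to_update, to_delete

db  = {1: {"name": "a"}, 2: {"name": "b"}}
api = {2: {"name": "B"}, 3: {"name": "c"}}
print(diff(db, api))
# -> ({3: {'name': 'c'}}, {2: {'name': 'B'}}, {1})
```

With the three sets in hand you can apply only the changed rows in one transaction, so readers never see an empty table.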
It sounds like you're using a relational database, so I'll try to outline a solution using database terms. The idea, however, is more general than that. In general, it's similar to Blue-Green deployment.
Have two data tables (or two databases, for that matter); one is active, and one is inactive.
When the software starts the update process, it can wipe the inactive table and write new data into it. During this process, the system keeps serving data from the active table.
Once the data update is entirely done, the system can begin to serve data from the previously inactive table. In other words, the inactive table becomes the active table, and vice versa.
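The two-table swap above could be sketched like this, with sqlite3 standing in for the real database and a one-row pointer table recording which copy is active. All names are illustrative.

```python
# Minimal blue/green sketch: two copies of the data plus a one-row pointer
# table that says which copy is active. Readers always hit the active copy;
# the rebuild wipes and reloads the inactive one, then flips the pointer.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items_a (id INTEGER, val TEXT);
CREATE TABLE items_b (id INTEGER, val TEXT);
CREATE TABLE active  (tbl TEXT);   -- single row: 'items_a' or 'items_b'
INSERT INTO active VALUES ('items_a');
INSERT INTO items_a VALUES (1, 'old');
""")

def active_table():
    return conn.execute("SELECT tbl FROM active").fetchone()[0]

def rebuild(new_rows):
    """Wipe the inactive table, load it, then flip the pointer."""
    inactive = "items_b" if active_table() == "items_a" else "items_a"
    conn.execute(f"DELETE FROM {inactive}")
    conn.executemany(f"INSERT INTO {inactive} VALUES (?, ?)", new_rows)
    conn.execute("UPDATE active SET tbl = ?", (inactive,))  # the swap

def read_all():
    return conn.execute(f"SELECT * FROM {active_table()}").fetchall()

print(read_all())                 # still serves the old data
rebuild([(1, "new"), (2, "new")])
print(read_all())                 # -> [(1, 'new'), (2, 'new')]
```

The key property is that the flip is a single-row update, so there is no window in which a reader sees a half-built or empty table.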
Set-up:
There is one TRANSPORT database and 4 PRODUNIT databases. All 5 are Oracle databases, each on a different machine.
Requirement:
A 'unified view' is required in the TRANSPORT db that retrieves data from a table present in all 4 PRODUNIT databases. So when the TRANSPORT database is queried (with a WHERE clause), the matching data may be in any one of the 4 PRODUNIT databases.
The query is kind of 'real time', i.e. as soon as data is inserted/updated in the table of any of the 4 PRODUNIT databases, it must be IMMEDIATELY available in the TRANSPORT db.
I searched the net and ended up with the materialized view. I have the below concerns before I proceed:
Will 'fast refresh on commit' ensure requirement 2?
The table in the individual PRODUNIT databases will experience frequent DML. I suspect a performance impact on the TRANSPORT db. Am I correct? If yes, how should I proceed?
I'm rather wondering if there is an approach better than the materialized view!
A materialized view that refreshes on commit cannot reference a remote object, so it doesn't do you much good here. If you could do a refresh on commit, you could maintain the data in the transport database synchronously. But you can't.
I would seriously question the wisdom of wanting to do synchronous replication in this case. If you could, then the local databases would become unusable if the transport database was down or the network connection was unavailable. You'd incur the cost of a two-phase commit on every transaction. And it would be very easy for one of the produnit databases to block transactions happening on the other databases.
In virtually every instance I've ever come across, you'd be better served with asynchronous replication that keeps the transport database synchronized to within, say, a few seconds of the produnit database. You probably want to look into GoldenGate or Streams for asynchronous replication with relatively short delays.
Whether or not you require an MV will depend on the performance of the links between your DBs and the volume of data concerned.
I would start with a normal view that uses DB links to select the data from the remote databases, and then test to see what the performance is like.
Given requirement 2, a refresh on commit would probably be the best approach if performance on a normal view was poor.
I am confused about Oracle Advanced Queuing. It looks like it is a way to asynchronously send database notifications to the application layer.
But looking at the details, there is a queue to be set up, along with a table, and there are explicit calls to publish messages that are afterwards pushed to the application layer.
Does this work automatically with table rows modification ?
I want, if a particular table changes (no matter who/how changed it), to receive a notification about it in the form of a binary object representing the changed row.
(Note: I know about Oracle Continuous Query Notification, CQN, but I am not satisfied with its performance; my goal is to see whether Oracle Advanced Queuing can offer a similar result with better speed.)
Thanks in advance.
I have a table of non-trivial size in a DB2 database that is updated X times a day from user input in another application. This table is also read by my web app to display some info to another set of users. I have a large number of users on my web app, and they need to do lots of fuzzy string lookups against data that is up-to-the-minute accurate. So I need a server-side cache to run my fuzzy logic on, and to keep the DB from getting hammered.
So, what's the best option? I would hate to pull the entire table every minute when the data changes so rarely. I could set up a trigger to update a timestamp in a smaller table and poll that to see if I need to refresh my cache, but that seems hacky too.
Ideally I would like to have DB2 tell my web-app when something changes, or at least provide a very lightweight mechanism to detect data level changes.
I think if your web application is running in WebSphere, setting up MQ would be a pretty good solution.
You could write triggers that use the MQ Series routines to add things to a queue, and your web app could subscribe to the queue and listen for updates.
If your web app is not in WebSphere then you could still look at this option but it might be more difficult.
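The trigger -> queue -> listener flow described above could be sketched like this, with Python's in-process `queue.Queue` standing in for MQ and a plain function standing in for the database trigger. This is purely a shape sketch; real MQ Series / WebSphere wiring looks very different.

```python
# Sketch of the trigger -> queue -> listener pattern. queue.Queue stands in
# for the message broker; on_row_changed() stands in for the DB trigger.
import queue
import threading

change_q = queue.Queue()
received = []

def on_row_changed(table, row_id):
    """What the DB trigger would do: publish a small change event."""
    change_q.put({"table": table, "row_id": row_id})

def listener():
    """What the web app would run: consume events as they arrive."""
    while True:
        msg = change_q.get()
        if msg is None:          # shutdown sentinel
            break
        received.append(msg)     # e.g. invalidate/refresh the cache here

t = threading.Thread(target=listener)
t.start()
on_row_changed("orders", 42)
change_q.put(None)
t.join()
print(received)  # -> [{'table': 'orders', 'row_id': 42}]
```

The appeal of this shape is that the web app reacts only when something actually changed, instead of polling on a timer.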
A simple solution could be to keep a timestamp (somewhere) of the latest change to the table.
The timestamp could live in a small table/view that is updated either by the application that updates the big table or by an update trigger on the big table.
The update trigger's only task would be to set this "helper" timestamp to the current timestamp.
Then the webapp only checks this timestamp.
If the timestamp is newer than the one the webapp last saw, the data is reread from the big table.
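The timestamp check could be sketched like this, with sqlite3 standing in for DB2 and invented table names; the `last_change` row is what the trigger or application would maintain.

```python
# Sketch of the single-row "last change" timestamp check: the webapp keeps
# the last timestamp it saw and only rereads the big table when it moves.
# sqlite3 stands in for DB2; all names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE big_table   (id INTEGER, data TEXT);
CREATE TABLE last_change (ts INTEGER);   -- maintained by app or trigger
INSERT INTO last_change VALUES (0);
""")

cache, cached_ts = [], -1

def refresh_if_stale():
    """Cheap check; only hits the big table when the timestamp moved."""
    global cache, cached_ts
    ts = conn.execute("SELECT ts FROM last_change").fetchone()[0]
    if ts > cached_ts:
        cache = conn.execute("SELECT * FROM big_table").fetchall()
        cached_ts = ts
        return True              # cache was rebuilt
    return False

print(refresh_if_stale())   # True  (first load)
print(refresh_if_stale())   # False (timestamp unchanged, big table untouched)
conn.execute("INSERT INTO big_table VALUES (1, 'x')")
conn.execute("UPDATE last_change SET ts = 1")
print(refresh_if_stale())   # True, cache is now [(1, 'x')]
```

Each poll is a single-row read, so the big table is only touched when data actually changed.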
A "low-tech"-solution thats fairly non intrusive to the exsisting system.
Hope this solution fits your setup.
Regards
Sigersted
Having the database push a message to your webapp is certainly doable via a variety of mechanisms (MQ Series, etc.). Similar, and easier, is to write a Java stored procedure that is kicked off by the trigger and hands the data to your cache-maintenance interface. But both of these solutions involve a lot of versioning dependencies, etc., that could be a real PITA.
Another option might be to reconsider the entire approach. Is it possible that instead of maintaining a cache on your app's side you could perform your text searching on the original table?
But my suggestion is to do as you (and the other poster) mention - and just update a timestamp in a single-row table purposed to do this, then have your web-app poll that table. Similarly you could just push the changed rows to this small table - and have your cache-maintenance program pull from this table. Either of these is very simple to implement - and should be very reliable.
Oracle has two seemingly competing technologies. CDC and DCN.
What are the strengths of each?
When would you use one and not the other?
In general, you would use DCN to notify a client application that it needs to clear/update its cache. You would use CDC for ETL processing.
DCN would generally be preferable when you have an OLTP application that needs to be notified immediately about data changes in the database. Since the goal here is to minimize the number of network round-trips and the number of database hits, you'd generally want the application to use DCN for queries that are mostly static. If a large fraction of the query results change regularly, you may be better off just refreshing the application's cache on a set frequency rather than running queries constantly to get the changed data (a DCN notification does not contain the changed data, just the ROWID of the row(s) that changed). If the application goes down, I believe DCN allows changes to be lost.
CDC would generally be preferable when you have a DSS application that needs to periodically pull over all the data that changed in a number of tables. CDC can guarantee that the subscriber has received every change to the underlying table(s), which can be important if you are trying to replicate changes to a different database. CDC allows the subscriber to pull the changes at its convenience rather than trying to notify the subscriber that there are changes, so you'd definitely want CDC if you wanted the subscriber to process new data every hour or every day rather than in near real time. (Note: DCN also has a guaranteed delivery mode, see comments below. --Mark Harrison)
CDC seems to be much more complex to set up than DCN.
I mean, to set up DCN I wrap a SELECT in a start and end DCN block and then write a procedure to be called with a collection of changes. That's it.
CDC requires publishers and subscribers and, anyway, seems like more work.