Oracle's database change notification feature sends rowids (physical row addresses) on row inserts, updates and deletes. As indicated in the oracle's documentation this feature can be used by the application to built a middle tier cache. But this seems to contradict when we have a detailed look on how row ids work.
ROWID's (physical row addresses) can change when various database operations are performed as indicated by this stackoverflow thread. In addition to this, as tom mentions in this thread clustered tables can have same rowids.
Based on the above research, it doesn't seem to be safe to use the rowid sent during the database change notification as the key in the application cache right? This also raises a question on - Should database change notification feature be used to built an application server cache? or is a recommendation made to restart all the application server clusters (to reload/refresh the cache) when the tables of the cached objects undergo any operations which result in rowid's to change? Would that be a good assumption to be made for production environments?
It seems to me to none of operations that can potentially change the ROWID is an operation that would be carried out in a productive environment while the application is running. Furthermore, I've seen a lot of productive software that uses the ROWID accross transaction (usually just for a few seconds or minutes). That software would probably fail before your cache if the ROWID changed. So creating a database cache based on change notification seems reasonable to me. Just provide a small disclaimer regarding the ROWID.
The only somewhat problematic operation is an update causing a movement to another partition. But that's something that rarely happens because it defeats the purpose of the partitioning, at least if it occurred regularly. The designer of a particular database schema will be able to tell you whether such an operation can occur and is relevant for caching. If none of the tables has ENABLE ROW MOVEMENT set, you don't even need to ask the designer.
As to duplicate ROWIDs: ROWIDs aren't unique globally, they are unique within a table. And you are given both the ROWID and the table name in the change notification. So the tuple of ROWID and table name is a perfect unique key for building a reliable cache.
Related
It's kinda real-world problem and I believe the solution exists but couldn't find one.
So We, have a Database called Transactions that contains tables such as Positions, Securities, Bogies, Accounts, Commodities and so on being updated continuously every second whenever a new transaction happens. For the time being, We have replicated master database Transaction to a new database with name TRN on which we do all the querying and updating stuff.
We want a sort of monitoring system ( like htop process viewer in Linux) for Database that dynamically lists updated rows in tables of the database at any time.
TL;DR Is there any way to get a continuous updating list of rows in any table in the database?
Currently we are working on Sybase & Oracle DBMS on Linux (Ubuntu) platform but we would like to receive generic answers that concern most of the platform as well as DBMS's(including MySQL) and any tools, utilities or scripts that can do so that It can help us in future to easily migrate to other platforms and or DBMS as well.
To list updated rows, you conceptually need either of the two things:
The updating statement's effect on the table.
A previous version of the table to compare with.
How you get them and in what form is completely up to you.
The 1st option allows you to list updates with statement granularity while the 2nd is more suitable for time-based granularity.
Some options from the top of my head:
Write to a temporary table
Add a field with transaction id/timestamp
Make clones of the table regularly
AFAICS, Oracle doesn't have built-in facilities to get the affected rows, only their count.
Not a lot of details in the question so not sure how much of this will be of use ...
'Sybase' is mentioned but nothing is said about which Sybase RDBMS product (ASE? SQLAnywhere? IQ? Advantage?)
by 'replicated master database transaction' I'm assuming this means the primary database is being replicated (as opposed to the database called 'master' in a Sybase ASE instance)
no mention is made of what products/tools are being used to 'replicate' the transactions to the 'new database' named 'TRN'
So, assuming part of your environment includes Sybase(SAP) ASE ...
MDA tables can be used to capture counters of DML operations (eg, insert/update/delete) over a given time period
MDA tables can capture some SQL text, though the volume/quality could be in doubt if a) MDA is not configured properly and/or b) the DML operations are wrapped up in prepared statements, stored procs and triggers
auditing could be enabled to capture some commands but again, volume/quality could be in doubt based on how the DML commands are executed
also keep in mind that there's a performance hit for using MDA tables and/or auditing, with the level of performance degradation based on individual config settings and the volume of DML activity
Assuming you're using the Sybase(SAP) Replication Server product, those replicated transactions sent through repserver likely have all the info you need to know which tables/rows are being affected; so you have a couple options:
route a copy of the transactions to another database where you can capture the transactions in whatever format you need [you'll need to design the database and/or any customized repserver function strings]
consider using the Sybase(SAP) Real Time Data Streaming product (yeah, additional li$ence is required) which is specifically designed for scenarios like yours, ie, pull transactions off the repserver queues and format for use in downstream systems (eg, tibco/mqs, custom apps)
I'm not aware of any 'generic' products that work, out of the box, as per your (limited) requirements. You're likely looking at some different solutions and/or customized code to cover your particular situation.
I've seen a few (literally, only a few) links and nothing in the documentation that talks about clustering with Firebird, that it can be done.
Then, I shot for the moon on this question CLUSTER command for Firebird?, but answerer told me that Firebird doesn't even have clustered indexes at all, so now I'm really confused.
Does Firebird physically order data at all? If so, can it be ordered by any key, not just primary, and can the clustering/defragging be turned on and off so that it only does it during downtime?
If not, isn't this a hit to performance since it will take the disk longer to put together disparate rows that naturally should be right next to each other?
(DB noob)
MVCC
I found out that Firebird is based upon MVCC, so old data actually isn't overwritten until a "sweep". I like that a lot!
Again, I can't find much, but it seems like a real shame that data wouldn't be defragged according to a key.
This says that database pages are defragmented but provides no further explanation.
Firebird does not cluster records. It was designed to avoid the problems that require clustering and the fragmentation problems that come with clustered indexes. Indexes and data are stored separately, on different types of pages. Each data page contains data from only one table. Records are stored in the order they were inserted, give or take concurrent inserts, which generally go on separate pages. When old records are removed, new records will be stored in their place, so new records sometimes appear on the same page as older ones.
Many tables use an artificial primary key, generally ascending, which might be a database generated sequence or a timestamp. That practice causes records to be stored in key order, but that order is by no means guaranteed. Nor is it very interesting. When the primary key is artificial, most queries that return groups of related records are done on secondary indexes. That's a performance hit for records that are clustered because look-ups on secondary indexes require traversing two indexes because the secondary index provides only the key to the primary index, which must be traversed to find the data.
On the larger issue of defragmentation and space usage, Firebird tracks the free space on pages so new records will be inserted on pages that have had records removed. If a page becomes completely empty, it will be reallocated. This space management is done as the database runs. As you know, Firebird uses Multi-Version Concurrency Control, so when a record is updated or deleted, Firebird creates a new record version, but keeps the old version around. When all transactions that were running before the change was committed have ended, the old record version no longer serves any purposes, and Firebird will remove it. In many applications, old versions are removed in the normal course of running the database. When a transaction touches a record with old versions, Firebird checks the state of the old versions and removes them if no running transaction can read them. There is a function called "Sweep" that systematically removes unneeded old record versions. Sweep can run concurrently with other database activity, though it's better to schedule it when the database load is low. So no, it's not true that nothing is removed until you run a sweep.
Best regards,
Ann Harrison
who's worked with Firebird and it's predecessors for an embarassingly long time
BTW - as the first person to answer mentioned, Firebird does leave space on pages so that the old version of a record stays on the same page as the newer version. It's not a fixed percentage of the space, but 16 bytes per record stored on the page, so pages of tables with very short records have more free space and tables that have long records have less.
On restore, database pages are created ~70% full (as I recall, unless you specify gbak's -use_all_space switch) and the restore is done one table at a time, writing pages to the end of the database file as needed. You can imagine a scenario where pages could be condensed down to much less. Hence bringing the data together and "defragging" it.
As far as controlling the physical grouping on disk or doing an online defrag -- in Firebird there is none. Remember that just because you need to access a page does not mean your disk does a read -- file system and database cache can avoid it!
as the title i am talking about, what's the best way to track data changes in oracle? i just want to know which row being updated/deleted/inserted?
at first i think about the trigger, but i need to write more triggers on each table and then record down the rowid which effected into my change table, it's not good, then i search in Google, learn new concepts about materialized view log and change data capture,
materialized view log is good for me that i can compare it to original table then i can get the different records, even the different of the fields, i think the way is the same with i create/copy new table from original (but i don't know what's different?);
change data capture component is complicate for me :), so i don't want to waste my time to research it.
anybody has the experience the best way to track data changes in oracle?
You'll want to have a look at the AUDIT statement. It gathers all auditing records in the SYS.AUD$ table.
Example:
AUDIT insert, update, delete ON t BY ACCESS
Regards,
Rob.
You might want to take a look at Golden Gate. This makes capturing changes a snap, at a price but with good performance and quick setup.
If performance is no issue, triggers and audit could be a valid solution.
If performance is an issue and Golden Gate is considered too expensive, you could also use Logminer or Change Data Capture. Given this choice, my preference would go for CDC.
As you see, there are quite a few options, near realtime and offline.
Coding a solution by hand also has a price, Golden Gate is worth investigating.
Oracle does this for you via redo logs, it depends on what you're trying to do with this info. I'm assuming your need is replication (track changes on source instance and propagate to 1 or more target instances).
If thats the case, you may consider Oracle streams (other options such as Advanced Replication, but you'll need to consider your needs):
From Oracle:
When you use Streams, replication of a
DML or DDL change typically includes
three steps:
A capture process or an application
creates one or more logical change
records (LCRs) and enqueues them into
a queue. An LCR is a message with a
specific format that describes a
database change. A capture process
reformats changes captured from the
redo log into LCRs, and applications
can construct LCRs. If the change was
a data manipulation language (DML)
operation, then each LCR encapsulates
a row change resulting from the DML
operation to a shared table at the
source database. If the change was a
data definition language (DDL)
operation, then an LCR encapsulates
the DDL change that was made to a
shared database object at a source
database.
A propagation propagates the staged
LCR to another queue, which usually
resides in a database that is separate
from the database where the LCR was
captured. An LCR can be propagated to
a number of queues before it arrives
at a destination database.
At a destination database, an apply
process consumes the change by
applying the LCR to the shared
database object. An apply process can
dequeue the LCR and apply it directly,
or an apply process can dequeue the
LCR and send it to an apply handler.
In a Streams replication environment,
an apply handler performs customized
processing of the LCR and then applies
the LCR to the shared database object.
We are using oracle CQN for change notifications for specific queries.
This is working fine for all the inserts and updates. The problem is delete, On delete the notification is sent with ROWID amongst other details. We cannot use the ROWID to lookup the row any more because it has been deleted.
Is there a way to get more data in a CQN notification regarding the deleted row ?
I'm afraid not.
My understanding is that this service is tailored to allow servers or clients to implement caches. In which case the cached table or view is supposed to be loaded in memory including the rowid, upon a notification, the cache manager having subscribed to the CQN service is supposed to invalidate the rows affected by the rowid list (or fetch it again in advanced).
Real life example. This can be useful for real time databases like Intelligent Network (i.e. to manage Prepaid Su$bscribers on a telecom network) in which callers need to be put through asap. The machine in charge of authorizing the calls (the SCP, there are several of them on the whole territory) is usually an in-memory database and the real persistent db lives on another node (the SDP at a central datacenter). The SDP with its on-disk db receives life-cycle events and balance refils events and notifies the subscribing SCPs.
You might have a different usage model.
I had this problem too, instead of deleting a row, I used a column "Active", instead of deleting a row I changed "Active" from "YES" to "NO".
When you create a Microsoft Access 2003 link to an Oracle table using Oracle's ODBC driver, you are sometimes asked to state which columns are the primary key(s).
I would like to know how to change that initial assignment, or even how to get Access/ODBC to forget the assignment. In my limited testing I wonder if the assignment isn't cached by the ODBC driver itself.
The columns I initial chose are not correct.
Update: I never did get a full answer on this one, deleting the links then restoring them didn't work. I think it's an obscure bug. I've moved on and haven't had to worry about this oddity since.
You must delete the link to the table and create a new one. When a table is linked all the connection info about the table's path, structure (including primary key), permissions, passwords and statistics are stored in the Access db. If any of those items change in the linked table, refreshing links won't automatically update it on the Access side because Access continues to use the previously stored info. You must delete or drop the linked table and recreate the link, storing the current connection information.
Don't know for sure if this next bit also applies to odbc linked tables, but I suspect it does. For Jet tables, it's a good idea to periodically delete all links and recreate them to improve query performance, because if a linked table's statistics are made on a table with few records, once that table is filled with many more records, new statistics will tell Jet's optimizer whether using indexes or a full table scan would be the better course of action when running a query.
It is not possible to delete the link and then relink?