ATTACH - Is there a price to pay? - performance

When two databases are attached, is there a performance hit compared to having a separate connection to each? Also, if I were writing data to one of the attached databases, would both databases be locked, or just the one being written to?
The reason I ask is that it just seems simpler to me to have one connection and ATTACH/DETACH each database as it becomes needed/redundant, rather than opening and closing connections to each of them all the time. My app doesn't have any threads.

Transactions are atomic over all attached databases; this requires creating a separate master journal in addition to the normal rollback journals of the individual databases.
With attached databases, table names (and PRAGMA statements) might need to be qualified with the database name.
For these reasons, ATTACH is usually used only when you actually need to access multiple databases in the same query.
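A minimal sketch of what that qualification looks like (the file and table names are illustrative, not from the question):

    ATTACH DATABASE 'archive.db' AS archive;

    -- With a second database attached, table names may need to be
    -- qualified with the database name:
    SELECT * FROM main.orders
    UNION ALL
    SELECT * FROM archive.orders;

    DETACH DATABASE archive;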

Related

Dynamically list the contents of a table in a database that continuously updates

It's kind of a real-world problem, and I believe a solution exists, but I couldn't find one.
We have a database called Transactions that contains tables such as Positions, Securities, Bogies, Accounts, Commodities, and so on, which are updated continuously, every second, whenever a new transaction happens. For the time being, we have replicated the master database Transactions to a new database named TRN, on which we do all the querying and updating.
We want a monitoring system for the database (like the htop process viewer in Linux) that dynamically lists the updated rows in the database's tables at any time.
TL;DR: Is there any way to get a continuously updating list of rows in any table in the database?
We currently work with Sybase and Oracle DBMSs on the Linux (Ubuntu) platform, but we would like generic answers that apply to most platforms and DBMSs (including MySQL), as well as any tools, utilities, or scripts that can do this, so that we can easily migrate to other platforms and/or DBMSs in the future.
To list updated rows, you conceptually need one of two things:
The updating statement's effect on the table.
A previous version of the table to compare with.
How you get them, and in what form, is completely up to you.
The first option lets you list updates with statement-level granularity, while the second is more suitable for time-based granularity.
Some options off the top of my head:
Write the changes to a temporary table
Add a field with a transaction ID/timestamp (sketched below)
Make clones of the table regularly
AFAICS, Oracle doesn't have built-in facilities to get the affected rows, only their count.
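One generic way to implement the timestamp option, sketched in Oracle syntax since Oracle is one of the platforms mentioned (the table, column, and trigger names are illustrative assumptions):

    -- Add a last-modified column and keep it current with a trigger.
    ALTER TABLE Positions ADD (last_updated TIMESTAMP);

    CREATE OR REPLACE TRIGGER positions_touch
    BEFORE INSERT OR UPDATE ON Positions
    FOR EACH ROW
    BEGIN
      :NEW.last_updated := SYSTIMESTAMP;
    END;
    /

    -- The "monitor" is then a polling query, e.g. rows changed in the
    -- last minute:
    SELECT *
      FROM Positions
     WHERE last_updated > SYSTIMESTAMP - INTERVAL '1' MINUTE;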
Not a lot of details in the question so not sure how much of this will be of use ...
'Sybase' is mentioned but nothing is said about which Sybase RDBMS product (ASE? SQLAnywhere? IQ? Advantage?)
by 'replicated master database transaction' I'm assuming this means the primary database is being replicated (as opposed to the database called 'master' in a Sybase ASE instance)
no mention is made of what products/tools are being used to 'replicate' the transactions to the 'new database' named 'TRN'
So, assuming part of your environment includes Sybase(SAP) ASE ...
MDA tables can be used to capture counters of DML operations (e.g., insert/update/delete) over a given time period; see the sketch after this list
MDA tables can capture some SQL text, though the volume/quality could be in doubt if a) MDA is not configured properly and/or b) the DML operations are wrapped up in prepared statements, stored procs and triggers
auditing could be enabled to capture some commands, but again, volume/quality could be in doubt depending on how the DML commands are executed
also keep in mind that there's a performance hit for using MDA tables and/or auditing, with the level of performance degradation depending on individual config settings and the volume of DML activity
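A hedged sketch of the MDA counter approach; the monOpenObjectActivity table and its DBName/ObjectName/RowsInserted/RowsUpdated/RowsDeleted columns are from memory of ASE 15.x, so verify the names against your version's documentation:

    -- Per-table DML counters from the ASE MDA tables (names assumed,
    -- see the caveat above):
    SELECT DBName, ObjectName,
           RowsInserted, RowsUpdated, RowsDeleted
      FROM master..monOpenObjectActivity
     WHERE DBName = 'TRN'
     ORDER BY RowsUpdated DESC

Polling this at intervals and diffing the counters gives table-level (not row-level) activity, which is what the counters point above refers to.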
Assuming you're using the Sybase(SAP) Replication Server product, those replicated transactions sent through repserver likely have all the info you need to know which tables/rows are being affected; so you have a couple options:
route a copy of the transactions to another database where you can capture the transactions in whatever format you need [you'll need to design the database and/or any customized repserver function strings]
consider using the Sybase(SAP) Real Time Data Streaming product (yeah, additional li$ence is required) which is specifically designed for scenarios like yours, ie, pull transactions off the repserver queues and format for use in downstream systems (eg, tibco/mqs, custom apps)
I'm not aware of any 'generic' products that work, out of the box, as per your (limited) requirements. You're likely looking at some different solutions and/or customized code to cover your particular situation.

Commits in the absence of locks in CockroachDB

I'm trying to understand how ACID in CockroachDB works without locks, from an application programmer's point of view. Would like to use it for an accounting / ERP application.
When two users update the same database field (e.g. a general ledger account total field) at the same time, what does CockroachDB do? Assume each is also updating many other non-overlapping fields at the same time as part of the respective transactions.
Will the aborted application's commit process be informed about this immediately at the time of the commit?
Do we need to handle more possibilities than in, for example, ACID/locking PostgreSQL when we write the database access code in our application?
Or is writing code for accessing CockroachDB for all practical purposes the same as for accessing a standard RDBMS, with respect to commits and in general?
Of course, ignoring performance issues, joins, etc.
I'm trying to understand how ACID in CockroachDB works without locks, from an application programmer's point of view. Would like to use it for an accounting / ERP application.
CockroachDB does have locks, but uses different terminology. Some of the existing documentation that talks about optimistic concurrency control is currently being updated.
When two users update the same database field (e.g. a general ledger account total field) at the same time, what does CockroachDB do? Assume each is also updating many other non-overlapping fields at the same time as part of the respective transactions.
One of the transactions will block waiting for the other to commit. If a deadlock between the transactions is detected, one of the two transactions involved in the deadlock will be aborted.
Will the aborted application's commit process be informed about this immediately at the time of the commit?
Yes.
Do we need to handle more possibilities than in, for example, ACID/locking PostgreSQL when we write the database access code in our application?
Or is writing code for accessing CockroachDB for all practical purposes the same as for accessing a standard RDBMS, with respect to commits and in general?
At a high level there is nothing additional for you to do. CockroachDB defaults to serializable isolation, which can result in more transaction restarts than weaker isolation levels, but comes with the advantage that the application programmer doesn't have to worry about anomalies.
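Restarts surface to the application through CockroachDB's documented client-side retry protocol. A minimal sketch (the table and values are illustrative, not from the question):

    BEGIN;
    SAVEPOINT cockroach_restart;

    -- illustrative ledger update
    UPDATE accounts SET total = total + 100 WHERE id = 'GL-1001';

    RELEASE SAVEPOINT cockroach_restart;
    COMMIT;

    -- If any statement fails with SQLSTATE 40001 (serialization
    -- failure), the client issues
    --   ROLLBACK TO SAVEPOINT cockroach_restart;
    -- and retries the statements, typically with exponential backoff.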

materialized view over multiple databases

Set-up:
There is one TRANSPORT database and 4 PRODUNIT databases. All 5 DBs are on different machines, and all are Oracle databases.
Requirement:
A 'UNIFIED view' is required in the TRANSPORT DB which will retrieve data from a table that is present in all 4 PRODUNIT databases. So when there is a query on the TRANSPORT database (with a WHERE clause), the data may be present in any one of the 4 PRODUNIT databases.
The query is kind of 'real time', i.e. it requires that as soon as data is inserted/updated in the table of any of the 4 PRODUNIT databases, it is IMMEDIATELY available in the TRANSPORT DB.
I searched the net and ended up with materialized views. I have the concerns below before I proceed:
Will a 'fast refresh on commit' ensure requirement 2?
The table in the individual PRODUNIT databases will experience frequent DML. I suspect a performance impact on the TRANSPORT DB; am I correct? If yes, how shall I proceed?
I'm rather wondering if there is an approach better than a materialized view!
A materialized view that refreshes on commit cannot refer to a remote object so it doesn't do you a lot of good. If you could do a refresh on commit, you could maintain the data in the transport database synchronously. But you can't.
I would seriously question the wisdom of wanting to do synchronous replication in this case. If you could, then the local databases would become unusable if the transport database was down or the network connection was unavailable. You'd incur the cost of a two-phase commit on every transaction. And it would be very easy for one of the produnit databases to block transactions happening on the other databases.
In virtually every instance I've ever come across, you'd be better served with asynchronous replication that keeps the transport database synchronized to within, say, a few seconds of the produnit database. You probably want to look into GoldenGate or Streams for asynchronous replication with relatively short delays.
Whether or not you require an MV will depend on the performance between your DBs and the volume of data concerned.
I would start with a normal view, using DB links to select the data from the other databases, but you would need to test this to see what the performance is like (see the sketch below).
Given requirement 2, a refresh on commit would probably be the best approach if performance with a normal view was poor.
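What the normal-view option might look like (the database link and table names are illustrative assumptions):

    -- One database link per PRODUNIT database, created in TRANSPORT, e.g.:
    --   CREATE DATABASE LINK produnit1
    --     CONNECT TO app IDENTIFIED BY ... USING 'PRODUNIT1';

    CREATE OR REPLACE VIEW unified_view AS
      SELECT * FROM the_table@produnit1
      UNION ALL
      SELECT * FROM the_table@produnit2
      UNION ALL
      SELECT * FROM the_table@produnit3
      UNION ALL
      SELECT * FROM the_table@produnit4;

Because a normal view is evaluated at query time, it is 'real time' by construction; the trade-off is that every query pays for the remote round trips.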

Oracle database as a single synchronization point for two separate web applications

I am considering using an Oracle database to synchronize concurrent operations from two or more web applications on separate servers. The database is the single infrastructure element in common for those applications.
There is a good chance that two or more applications will attempt to perform the same operation at the exact same moment (cron invoked). I want to use the database to let one application decide that it will be the one which will do the work, and that the others will not do it at all.
The general idea is to perform a somehow-atomic select/insert of the node's ID that is visible to all connections. Only the node whose ID matches the first inserted node ID returned by the select would do the work.
It was suggested to me that a MERGE statement could be of use here. However, after doing some research, I found a discussion which states that the MERGE statement is not designed to be called concurrently.
Another option is to lock a table. By definition, only one node will be able to take the lock and do the insert, then the select. After the lock is released, the other instances will see the inserted value and will not perform the work.
What other solutions would you consider? I frown on workarounds with random delays, or on using Oracle exceptions to notify a node that it should not do the work. I'd prefer a clean solution.
I ended up going with SELECT FOR UPDATE. It works as intended. It is important to remember to commit the transaction as soon as the needed update is made, so that the other nodes don't hang waiting on the lock.
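A minimal PL/SQL sketch of that pattern; the jobs table, its columns, and the node name are illustrative assumptions, not from the original post:

    DECLARE
      v_owner jobs.owner_node%TYPE;
    BEGIN
      -- All nodes race for this row lock; the first one in proceeds,
      -- the rest block here until it commits. (Assumes the row exists
      -- and owner_node is cleared before each scheduled run.)
      SELECT owner_node INTO v_owner
        FROM jobs
       WHERE job_name = 'nightly_run'
         FOR UPDATE;

      IF v_owner IS NULL THEN
        UPDATE jobs
           SET owner_node = 'node-1'
         WHERE job_name = 'nightly_run';
        COMMIT;  -- commit promptly so the other nodes stop blocking
        -- ... this node does the actual work ...
      ELSE
        ROLLBACK;  -- another node already claimed the run; do nothing
      END IF;
    END;
    /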

One Instance Versus Multiple Instances In Oracle

What are the advantages and disadvantages of having a single instance compared to multiple instances when multiple databases are intended to be created?
You may want to browse the Oracle concept guide, especially if you're more familiar with other DBMS.
A database is a set of files, located on disk, that store data. These files can exist independently of a database instance.
An instance is a set of memory structures that manage database files. The instance consists of a shared memory area, called the system global area (SGA), and a set of background processes. An instance can exist independently of database files.
A single instance (set of processes) can mount at most one database (set of files). If you need to access multiple databases, you will need multiple instances. More on the difference between instances and databases on askTom.
Ideally, you want only one instance per server (the server may be a logical server, i.e. a virtual server). This allows Oracle to know exactly what is going on. This implies one database per server.
If your databases are really independent, going with multiple instances/databases would make sense, since you have greater control over DB version, administration, etc.
If, however, your databases are not really independent (you frequently share data across them, or you need some common data accessible to all of them), it may be more efficient (and simpler) to go with a single consolidated database, where each original database becomes its own schema. In that case cross-schema referential integrity is easy, and you don't need to duplicate the data that has to be shared.
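A small sketch of what cross-schema referential integrity looks like in the consolidated case (schema and table names are illustrative):

    -- The owner of sales.customers lets the billing schema reference it:
    GRANT REFERENCES ON sales.customers TO billing;

    -- A foreign key across schemas is then just a qualified name:
    CREATE TABLE billing.invoices (
      invoice_id  NUMBER PRIMARY KEY,
      customer_id NUMBER NOT NULL
                  REFERENCES sales.customers (customer_id)
    );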
