I'm trying to understand whether there is any concept like this in Oracle Database.
Let's say I have two databases, Database_A & Database_B.
Database_A has schema_A; is there a way I can attach this schema to Database_B?
What I mean is: if there is a job populating TABLE_A in schema_A, I want to see a read-only view of it in Database_B. We are trying to split a big Oracle database into two smaller databases, have a vast amount of PL/SQL code, and are trying to minimize the refactoring.
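For what it's worth, the behaviour described sounds like a read-only view over a database link; a minimal sketch of that shape, run in Database_B (the link name, password, and TNS alias are hypothetical):

CREATE DATABASE LINK dblink_a
  CONNECT TO schema_a IDENTIFIED BY some_password
  USING 'DATABASE_A_TNS';

-- A read-only window onto the table the job populates in Database_A:
CREATE VIEW table_a_ro AS
  SELECT * FROM table_a@dblink_a;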
Sharding might be what you're looking for. The schemas and tables will still logically exist on all databases, but you can arrange for the data to be physically stored in specific databases. There might be a way to set up shardspaces, tablespaces, and user default tablespaces so that each schema's data is automatically stored in a specific database.
But I haven't actually used sharding. From what I've read, it seems to be designed for massive distributed OLTP systems, and it is likely complicated to administer. I'd guess this feature isn't worth the hassle unless you have petabytes of data.
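For concreteness, user-defined sharding DDL looks roughly like this (a hedged, untested sketch; the shardspace, tablespace, and table names are hypothetical):

-- A tablespace created in a shardspace lives on that shardspace's database:
CREATE TABLESPACE tbs_a IN SHARDSPACE shardspace_a;

-- The partitioning column then decides which database physically stores each row:
CREATE SHARDED TABLE table_a (
  id      NUMBER NOT NULL,
  region  VARCHAR2(10) NOT NULL,
  payload VARCHAR2(100),
  CONSTRAINT table_a_pk PRIMARY KEY (id, region)
)
PARTITION BY LIST (region) (
  PARTITION p_a VALUES ('A') TABLESPACE tbs_a
);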
It's a kind of real-world problem, and I believe a solution exists, but I couldn't find one.
We have a database called Transactions that contains tables such as Positions, Securities, Bogies, Accounts, Commodities, and so on, updated continuously whenever a new transaction happens. For the time being, we have replicated the master Transactions database to a new database named TRN, on which we do all the querying and updating.
We want a sort of monitoring system for the database (like the htop process viewer in Linux) that dynamically lists the rows updated in its tables at any time.
TL;DR Is there any way to get a continuous updating list of rows in any table in the database?
Currently we are working with Sybase & Oracle DBMSs on Linux (Ubuntu), but we would like generic answers that apply to most platforms and DBMSs (including MySQL), as well as any tools, utilities, or scripts that can do this, so that it will be easy to migrate to other platforms and/or DBMSs in the future.
To list updated rows, you conceptually need one of two things:
The updating statement's effect on the table.
A previous version of the table to compare with.
How you get them and in what form is completely up to you.
The first option lets you list updates with statement granularity, while the second is more suitable for time-based granularity.
Some options off the top of my head:
Write to a temporary table
Add a field with a transaction id/timestamp (see the sketch below)
Make clones of the table regularly
AFAICS, Oracle doesn't have built-in facilities to get the affected rows, only their count.
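As a sketch of the second option in Oracle (untested; the table and trigger names are hypothetical), stamp every inserted/updated row and let the monitor poll for recent changes:

-- Add a change timestamp and maintain it with a trigger:
ALTER TABLE positions ADD (last_modified TIMESTAMP DEFAULT SYSTIMESTAMP);

CREATE OR REPLACE TRIGGER positions_touch
BEFORE INSERT OR UPDATE ON positions
FOR EACH ROW
BEGIN
  :NEW.last_modified := SYSTIMESTAMP;
END;
/

-- The monitor then polls for rows changed in, say, the last 5 seconds:
SELECT *
FROM positions
WHERE last_modified > SYSTIMESTAMP - INTERVAL '5' SECOND;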
Not a lot of details in the question so not sure how much of this will be of use ...
'Sybase' is mentioned but nothing is said about which Sybase RDBMS product (ASE? SQLAnywhere? IQ? Advantage?)
by 'replicated master database transaction' I'm assuming this means the primary database is being replicated (as opposed to the database called 'master' in a Sybase ASE instance)
no mention is made of what products/tools are being used to 'replicate' the transactions to the 'new database' named 'TRN'
So, assuming part of your environment includes Sybase(SAP) ASE ...
MDA tables can be used to capture counters of DML operations (eg, insert/update/delete) over a given time period; see the query sketch after this list
MDA tables can capture some SQL text, though the volume/quality could be in doubt if a) MDA is not configured properly and/or b) the DML operations are wrapped up in prepared statements, stored procs and triggers
auditing could be enabled to capture some commands but again, volume/quality could be in doubt based on how the DML commands are executed
also keep in mind that there's a performance hit for using MDA tables and/or auditing, with the level of performance degradation based on individual config settings and the volume of DML activity
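As a sketch of the MDA counters approach (assumes MDA tables are configured and the login has mon_role; the column set of monOpenObjectActivity varies by ASE version, so verify against yours):

SELECT DBName, ObjectName, RowsInserted, RowsUpdated, RowsDeleted
FROM master..monOpenObjectActivity
WHERE DBName = 'TRN'
ORDER BY RowsUpdated DESC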
Assuming you're using the Sybase(SAP) Replication Server product, those replicated transactions sent through repserver likely have all the info you need to know which tables/rows are being affected; so you have a couple options:
route a copy of the transactions to another database where you can capture the transactions in whatever format you need [you'll need to design the database and/or any customized repserver function strings]
consider using the Sybase(SAP) Real Time Data Streaming product (yeah, additional li$ence is required) which is specifically designed for scenarios like yours, ie, pull transactions off the repserver queues and format for use in downstream systems (eg, tibco/mqs, custom apps)
I'm not aware of any 'generic' products that work, out of the box, as per your (limited) requirements. You're likely looking at some different solutions and/or customized code to cover your particular situation.
When two databases are attached is there a hit in performance compared to having a separate connection to each? Also, if I was writing data to one of the attached databases would both databases be locked or just the one being written to?
The reason I ask is it just seems simpler to me to have one connection that I ATTACH / DETACH each database to / from as it becomes needed / redundant rather than opening and closing connections to each of them all the time. My app doesn't have any threads.
Transactions are atomic across all attached databases; this requires creating a separate master journal in addition to the normal rollback journals of the individual databases.
With attached databases, table names (and PRAGMA statements) might need to be prefixed with the database name.
For these reasons, ATTACH is usually used only when you actually need to access multiple databases in the same query.
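A minimal sketch of the name qualification (SQLite; the file and table names are hypothetical):

ATTACH DATABASE 'other.db' AS other;

-- Cross-database join; 'main' is the database of the original connection:
SELECT m.id, o.value
FROM main.items AS m
JOIN other.details AS o ON o.item_id = m.id;

DETACH DATABASE other;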
In an organization that has two applications each with its own Oracle database instance, what are the disadvantages of consolidating the two databases into one database with two schemas?
Backups and replicating the database are bigger and slower, probably. What else?
Some background:
The two databases are the "gold source" for their respective data. Each is critical to the operation of the organization, and each is actually used by several applications, tools, and reports (but each database is principally "owned" by one application). The need to join data across the databases, to relate entities in one to entities in the other, comes up frequently. For this reason there are DB links connecting the two and some cross-database materialized views to help with performance. There is an effort underway to reduce data duplication, and these materialized views are under discussion. Some in the organization want to phase out DB links and materialized views and introduce more web services to make the data available across applications. My concern is that there are too many situations that require complex joins of data across the two databases, so services that expose the data won't perform well. Another approach for reducing DB links and materialized views is to consolidate the schemas into one database, but I want to make sure I'm not forgetting any critical disadvantages to that approach.
In a single consolidated database, you will lose some flexibility from a DBA point of view:
A database obviously can have only one version (10.2.0.5, for example), which means that upgrades and patches will affect all schemas; this may be a bad thing when multiple vendor apps have mismatched version requirements.
Similarly, some administrative tasks (restore database A to point in time t) may be more complicated with a single database.
Overall, you will have fewer administration tasks (a single backup, single patching...), but each task will be more critical since it will have a global effect.
On the development side, beware of namespace collisions: some features are global over a single database, for example:
directories,
public synonyms,
database links,
schemas.
This means that you will have some work to do if you want to consolidate two databases that have public synonyms with the same name that point to two different things.
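A hypothetical illustration of such a collision:

-- Before consolidation, each database had its own public synonym named customers;
-- after consolidation only one can exist:
CREATE PUBLIC SYNONYM customers FOR schema_a.customers;

-- The other application must switch to a private synonym (or fully qualified names):
CREATE SYNONYM app_b_user.customers FOR schema_b.customers;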
Could have something to do with licence costs - scaling up vs. scaling out.
The biggest concern I would have is that all your code will need to be rewritten, or at least reviewed, to account for the new database and schemas. This could introduce new bugs. I don't know how Oracle handles references to different databases, so I'll illustrate what I mean using SQL Server syntax. If I was joining two tables on the same server in different databases, my select would be something like this:
SELECT a.field1, b.field2
FROM database1.dbo.table1 a
JOIN database2.dbo.table2 b
ON a.myid = b.myFK
To move to your new consolidated design, you would instead write:
SELECT a.field1, b.field2
FROM schema1.table1 a
JOIN schema2.table2 b
ON a.myid = b.myFK
You will need to be especially careful of any tables that have the same name in both databases; this could cause some sneaky bugs.
Note these are not difficult changes, but all SQL hitting your database would have to be examined to see if it still works, and adjusted if not.
I'm not sure if just putting them in the same database would do it either. You might need to consolidate some tables to avoid duplication across applications. (In this case, add fields to reference the old id numbers for things people are used to looking up by id, like a person_id that may appear on old paperwork, so they can still be researched.) This is a fairly major rewrite, with all the attendant possibilities of making things worse due to new bugs.
If you go down this path, I highly recommend that you read a book on refactoring databases before you decide how to design.
It's hard to tell from just the information provided. Big in the DB world would be 100 GB or more, so two DBs would be 200 GB. If both DBs are not bigger than 100 GB, then size should not be a huge factor in the decision: replication and sync can be done on changes only, and backups should not be a big difference (again, this depends on specifics, such as whether downtime is possible or whether backups are done during non-peak times).
Other than that other factors are:
naming collisions in database objects, such as keys, foreign key names, table names, etc.; some renaming of tables, and of stored procedure names too.
Our application runs on the web, is mostly an inquiry tool, does some transactions. We host the Oracle database. The app has always had a different instance of Oracle for each customer. A customer is a company which pays us to provide our service to the company's employees, typically 10,000-25,000 employees per customer. We intend to have several hundred customers. We do a major release every few years, and migrating to that new release is challenging: we might have a team at the customer site for a couple weeks, explaining new functionality and setting up the driving data to suit that customer.
We're considering going multi-client, putting all our customers into a single shared Oracle 11g instance on a big honkin' Windows Server 2008 server -- in order to reduce costs. I'm wondering if that's advisable.
There are some advantages to having separate instances for each customer. Tell me if these are bogus, please. In rough order of decreasing importance (my guess):
Our customers MyCorp and YourCo can be migrated separately when breaking changes are made to the schema. (With multi-client, we'd be migrating 300+ customers overnight!?!)
MyCorp's data can be easily backed up and (!!!) restored, without affecting other customers.
MyCorp's data is securely separated from their competitor YourCo's data, without depending on developers to get the code right and/or DBAs getting the configuration right.
Multiple instances are lower risk, because a disaster with one customer (someone accidentally doubles everyone's salary and the error is discovered after pay day) doesn't affect other customers. A disaster that affected ALL our customers (whoops, new DBA, and suddenly every participant has the same SSN!?!) might put our company under.
Having one instance on one server presents a single point of failure, with our entire customer base out of business if a hurricane knocks the building over. Multiple instances on multiple servers permits geographic dispersion: no catastrophe will affect too large a proportion of our customers, and the unaffected servers in other regions can take on the load of the failed servers.
Performance is better because the database is smaller (10,000 vs 2,000,000 rows in ~50 tables).
If MyCorp's offices are (mostly) in just one region, then MyCorp's instance can be geographically co-located there, so network lag doesn't hurt performance. We can provide better service to global clients for the same reason.
If MyCorp wants to take their database in-house, then we can easily export their instance to hand MyCorp their data.
Load-balancing is easier because instances can be placed on different servers (this is with a web farm).
When a DEV or QA instance is needed, it's easier to clone the real instance and anonymize the data, because there's much less data.
Because they're small enough, developers can have their own instance running locally, so they can work on code while waiting at the airport and while in-flight, without fighting VPN hassles.
Q1: What are other advantages of separate instances?
We are contemplating changing the database schema and merging all of our customers into one Oracle instance, running on one hefty server.
Here are advantages of the multi-client instance approach, most important first (my WAG). Please snipe if these are bogus:
Less work for the DBAs, since they only need to maintain one instance instead of hundreds. Less DBA work translates to cheaper, our main motive for this change.
With just one instance, the DBAs can do a better job of optimizing performance. They'll have time to add appropriate indexes and review our SQL.
It will be easier for developers to debug & enhance the application, because there is only one schema and one app (there might be dozens of schema versions if there are hundreds of instances, with a different version of the app for each version of the schema). This reduces costs too. The alternative is having to start every debug session with (1) What version is this customer running and (2) Let's struggle to recreate the corresponding development environment, code and database. (We need a Virtual Machine that includes the code AND database instance for each patch and release!)
Licensing Oracle is cheaper because it's priced per server irrespective of heft (or something -- I don't know anything about the subject).
The database becomes a viable persistent store for web session data, because there is just one instance.
Some database operations are easier with one multi-client instance, like finding a participant when they're hazy about which customer they (or their spouse, maybe) works for: all the names are in one table. Reporting across customers is straightforward.
Q2: What are other advantages of having multiple clients in one instance?
Q3: Which approach do you think is better (why)? Instance per customer, or all customers in one instance?
I'm concerned that having one multi-client instance makes migration near-impossible, and that's a deal killer...
... unless there is a compromise solution like having two multi-client instances, the old and the new. In that case, we would design cross-instance solutions for finding participants, reporting, etc., so customers could go from one multi-client instance to the next without anything breaking.
Unless you are using Oracle XE (the limited, free edition) having one database per server will get very expensive very quickly, even if you're buying single core, single CPU boxes. Having several databases per server is inefficient, because each database incurs an overhead of CPU and RAM usage. Tuning is more difficult, because contention is harder to diagnose.
So, as well as being easier to administer, a single big server ought to work out cheaper than lots of discrete little servers (no guarantees, no money back!). Make sure you buy the biggest, fastest chips you can and as much RAM as you have free slots. Those are things which give you better performance without affecting your licensing costs.
Consider the Partitioning option, if you can afford it. This will address your concerns regarding backup and recovery, because each partition can have its own tablespace. So (given partitioning by client_id) it becomes possible to backup or restore an individual client's data without affecting the other clients. We can even export and import individual partitions. I'm surprised by David's observation that Partition pruning didn't work with VPD. But I haven't tried this combo, so I'll take his word for it.
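A minimal sketch of per-client partitioning (the table, client ids, and tablespace names are hypothetical):

CREATE TABLE accounts (
  client_id  NUMBER NOT NULL,
  account_id NUMBER NOT NULL,
  balance    NUMBER
)
PARTITION BY LIST (client_id) (
  PARTITION p_mycorp VALUES (101) TABLESPACE ts_mycorp,
  PARTITION p_yourco VALUES (102) TABLESPACE ts_yourco
);

-- Backup and recovery can then be scoped to one client's tablespace, and
-- individual partitions can be exported without touching other clients.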
The one thing you might lose from consolidation is the ability to support different clients on different versions of your application. However, this is not necessarily a bad thing. As you observe, maintaining several hundred customers will be a lot easier if you forgo individualised versions of the application. If you do need to offer some bespoke features - even if you just want to beta test some functionality with an individual client - then have a look at Edition-Based Redefinition in 11gR2: it is a really nifty feature. Also it is available for all Oracle licenses, not just Enterprise.
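A tiny, hedged illustration of the edition mechanism (11gR2+; the edition and procedure names are hypothetical, and the schema owner must have been enabled for editions):

CREATE EDITION release_2;
ALTER SESSION SET EDITION = release_2;

-- Recompiling the procedure now changes it only in release_2; sessions still
-- using the base edition keep running the old code:
CREATE OR REPLACE PROCEDURE calc_fees AS
BEGIN
  NULL;  -- new implementation would go here
END;
/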
When you say 'separate instances', are you talking about one instance with multiple schemas on it? Or do you really mean multiple instances running on a single machine? There is little reason to run multiple instances on a single machine, as opposed to running multiple schemas on a single instance; each schema would still have its own set of tables, indexes, etc.
Anyways, I don't have a full answer, but one thing to keep in mind is the licensing costs of Oracle, and how that can affect what the optimal solution is.
According to the Oracle store,
Oracle standard edition one is $5,800.00 / Processor (where on x86, a processor is a socket, and you can go up to 2 sockets)
Oracle standard edition is $17,500.00 / Processor (where on x86, a processor is a socket, and you can go up to 4 sockets)
Oracle enterprise edition is $47,500.00 / Processor (where on x86, a processor is 2 CORES - so you have to effectively double that price for quad core CPUs)
So if, for example, you need 8 quad core CPUs to handle 100 customers, licensing that on a single database is VASTLY more expensive than having 4 separate databases, each having 2 quad core CPUs, each running 25 customers.
8 quad core CPUs requires enterprise edition, and would have a list price of 16 x $47,500 = $760,000. 4 machines, each running standard edition one, and each with 2 quad-core CPUS, would have a list price of 8 x $5,800.00 = $46,400 - a factor of 16 difference. Now, keep in mind that no one pays list price for enterprise edition, but there is still a huge difference to consider.
If you don't have a huge need for database operations across clients, and you don't need enterprise edition features, and you need this level of CPU power (or expect to grow to need this level of CPU power), the licensing costs are going to be a huge downside of the one-instance approach.
It may be worth researching Salesforce, and the buzzword you're looking for is "multi-tenant architecture".
This makes a good read:
http://blog.dayspring-tech.com/2009/02/forcecom-multitenant-architecture-under-the-covers/
It's a good example because Salesforce do use an Oracle db under the covers.
Good question, glad to see you are considering all the alternatives. Lots of good points but I will stick to just addressing one.
I was the DBA for a hosted application and the developers decided to use Oracle Virtual Private Database feature for this.
The application was constructed with intention of customers sharing a pool of app servers for load balancing and a single database schema on the back end.
Before VPD, we had a Java class that tacked "where customer_id=?" or "and customer_id=?" onto every query right before it went to the database, so each customer would only see their own data. To implement this with VPD, upon login to the DB the app would set a variable in the application context, which the VPD policies then used to restrict the session to its own records. So yes, you have to code it up right, assign VPD policies to tables, and also trust that Oracle holds up their end of the bargain.
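To make that concrete, here is a rough sketch of the mechanism described above (the names are hypothetical; the real policies and context package were more involved):

-- 1. An application context, settable only through a trusted package:
CREATE OR REPLACE CONTEXT app_ctx USING app_ctx_pkg;

CREATE OR REPLACE PACKAGE app_ctx_pkg AS
  PROCEDURE set_customer(p_customer_id NUMBER);
END;
/
CREATE OR REPLACE PACKAGE BODY app_ctx_pkg AS
  PROCEDURE set_customer(p_customer_id NUMBER) IS
  BEGIN
    DBMS_SESSION.SET_CONTEXT('app_ctx', 'customer_id', p_customer_id);
  END;
END;
/

-- 2. A policy function returning the predicate VPD appends to each query:
CREATE OR REPLACE FUNCTION customer_policy(p_schema VARCHAR2, p_object VARCHAR2)
RETURN VARCHAR2 IS
BEGIN
  RETURN 'customer_id = SYS_CONTEXT(''app_ctx'', ''customer_id'')';
END;
/

-- 3. Attach the policy to each customer-owned table:
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'APP',
    object_name     => 'ACCOUNTS',
    policy_name     => 'customer_filter',
    function_schema => 'APP',
    policy_function => 'customer_policy',
    statement_types => 'SELECT,INSERT,UPDATE,DELETE');
END;
/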
So was it good for us? In theory it was nice to offload the SQL predicate handling to something outside our application, but in practice the advantages didn't outweigh the disadvantages.
We had dozens of clients in one database, and when we upgraded, they all had to be upgraded at the same time. We had lots of tugs-of-war with customers that didn't want to upgrade for whatever reason, or wanted to do their own QA on the new versions.
We entertained the Old instance/New instance thing for upgrades but migrating data was risky and associated downtime did not make customers happy. We did roll our own procedure that would step through tables and export data... But certainly not as easy as a quickie Export or Data Pump job.
We also had issues with VPD predicate analysis when it came to Partitioning. As with a lot of Oracle features, they may work OK on their own, but once you combine them with other features things get unpredictable. For us, partitions not related to the current customer_id weren't getting eliminated, because the predicate analysis was coming too late in the processing of the SQL statement. We worked around it by changing from static to dynamic VPD policies, but our time spent parsing shot up.
So after all that what is my take on it? I would have spent the time making sure our app made good use of bind variables and continued with the old mechanism that added customer_id to the SQL statement.
Oracle is made to handle that kind of load.
My question: what do you do when you have a thousand customers, and then say ten thousand?
Do you still keep separate instances/schema?
I doubt anyone would do that. I have worked at a place where each client had a separate database, as well as a copy at a central location.
Change management becomes a headache: you'd have to maintain very good records of which client/company is on which database revision, schema, app version, and all those things. This would become a piece of software in itself.
I'd suggest creating the software/design around a SaaS model; that'll allow easy maintenance and the same database/schema for all users.
For reliability you can still use clustering: Oracle RAC.
I've had to consider the same decision a few times. In our case we use MySQL, so there is no licensing cost associated with running each customer in a separate database.
The benefits of running each of our customers on a separate database have been great. We have a script that lets us move a customer's entire instance to any server to balance load. The script merely copies over the database, copies over any custom files, spins up the application, and sets up our routing system to send users to the new instance. The whole process takes just a few minutes.
Database changes can take a very long time on large MySQL databases. Since all our clients have their own database, we are able to keep all of our datasets small. Backups are also very fast.
Our development instances behave the same way, so this method allows us to run a variety of database schemas simultaneously as we develop and test new features. We often work with customers to have them try out a new feature before we deploy it to the rest of our instances. The one rule that we stick to (in order to avoid a few of the drawbacks you mention), is that all clients must be within one version of each other. Maintaining more than a couple versions across clients would have a huge overhead.
Facebook took the same approach when they started their company. Each school that they launched at had a separate database and they were able to set up new instances very quickly. The primary reason they finally consolidated their database was that they wanted to enable users to communicate between schools.
If not for potential cost issues I would definitely encourage you to stick with the separate database approach.