Pull from multiple databases in the same source qualifier? - informatica-powercenter

I'm wondering if it is possible to pull from more than one database in the same Source Qualifier. You can only specify a single database connection per Source Qualifier, so I'm not sure if this is possible.

Ben,
If they are both from the same database vendor and db links are set up, you can use DBLINKS in the source qualifier.
select a.col1,
b.col2
from schema1.table1 a,
schema1.table2@db2 b
where a.col3 = b.col4;
But if they are heterogeneous databases, I think the best way to implement this would be to create two different source qualifiers (or different lookups, based on your requirement and the number of columns) and use the parameter file / session to specify different connections.

Assuming that the account used to connect has equivalent rights in both databases, it's DATABASE_NAME.TABLE_NAME:
SELECT
a.id
,a.name
,a.company
,b.company_id
,b.company_name
,b.address
FROM
database1.users as a
JOIN
database2.companies as b ON a.company=b.company_id

I would implement this using database links, which allow interaction across two databases.
Although not a preferred solution for many reasons, this would achieve what you describe.
However, from an ideal-solution perspective, you should not be doing this in the first place :)
If the data is coming from two different databases, read it through two different source qualifiers and then, depending upon the functional requirements, go for a Joiner or a Lookup, etc.

If the login has at least read access and the schemas are on the same service, it is possible.
For some reason our DBAs don't allow dblinks...
One reason to use Informatica is that you can create a specific Source Qualifier (SQ) per source and then use a Joiner/Union transformation... Believe me, if you have issues in one of the data sources, fixing and troubleshooting it will be easier.
Also imagine that you leave the company and another team takes over those jobs; graphically and logically it will be easier to maintain...

Related

Can full information about an Oracle schema data-model be selected atomically?

I'm instantiating a client-side representation of an Oracle Schema data-model in custom Table/Column/Constraint/Index data structures, in C/C++ using OCI. For this, I'm selecting from:
all_tables
all_tab_comments
all_col_comments
all_cons_columns
all_constraints
etc...
And then I'm using OCI to describe all tables, for precise information about column types. This is working, but our CI testing farm is often failing inside this schema data-model introspection code, because another test is running in parallel, creating/deleting tables in the middle of this series of queries and describe calls I'm making.
My question is thus: how can I introspect this schema atomically, such that another session does not concurrently change the very schema I'm introspecting?
Would using a read-only serializable transaction around the selects and describes be enough? I.e. does MVCC apply to Oracle's data dictionaries? What would be the likelihood of "Snapshot Too Old" errors on such system dictionaries?
If full atomicity is not possible, are there steps I could take to minimize the possibility of getting inconsistent / stale info?
I was thinking maybe left-joins to reduce the number of queries, and/or replacing the OCIDescribeAny() calls with other dictionary accesses joined to other tables, to get all table/column info in a single query each?
I'd appreciate some expert input on this concurrency issue. Thanks, --DD
A typical read-write conflict. Off the top of my head I see two ways around it:
use the dbms_lock package in both the "introspection" code and the "another test";
rewrite your introspection query so that it returns one big result containing everything you need. There are multiple ways to do that:
use xmlagg and the like;
use listagg and get one big string or CLOB;
just use a bunch of UNIONs to get one result set, since a single statement is guaranteed to be read-consistent.
Hope that helps.
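For the dbms_lock route, a rough sketch of one side of it (the lock name SCHEMA_INTROSPECTION_LOCK is an invented convention, the session needs EXECUTE on dbms_lock, and the "another test" would request the same lock around its create/drop table work):
DECLARE
  l_handle VARCHAR2(128);
  l_status INTEGER;
BEGIN
  -- both sessions agree on the same (invented) lock name
  DBMS_LOCK.ALLOCATE_UNIQUE('SCHEMA_INTROSPECTION_LOCK', l_handle);
  l_status := DBMS_LOCK.REQUEST(l_handle, DBMS_LOCK.X_MODE, timeout => 60);
  IF l_status <> 0 THEN
    RAISE_APPLICATION_ERROR(-20001, 'could not acquire lock, status = ' || l_status);
  END IF;
  -- ... run the all_tables / all_constraints queries and OCIDescribeAny calls here ...
  l_status := DBMS_LOCK.RELEASE(l_handle);
END;
/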

Oracle PL/SQL data access package code generator

As many of you whose job is PL/SQL development on Oracle might have experienced in your career, it is common to create packages to handle the data access layer for a specific table. I mean, given a table 'employee', it is a widely followed practice to create a package 'da_employee' ('da' stands for 'data access') that implements routines such as ins() to insert a row into employee, del() for deleting a row, upd() for updating, lock() for locking, ..., I could go on...
The content of the package might vary based on the needs and on personal choices, but it is fair to say that once the structure of a data access package is designed for one table, the hundreds more tables I plan to create in my schema will need a package based on the same design.
At this point I could state that it is possible to auto-generate such a package using the metadata stored in the DB and a template of the package itself.
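For illustration, a rough sketch of the idea (Oracle; the EMPLOYEE table name and the use of DBMS_OUTPUT are just placeholders, and a real generator would of course be template-driven):
DECLARE
  l_tab    CONSTANT VARCHAR2(30) := 'EMPLOYEE';  -- placeholder table name
  l_params VARCHAR2(4000);
  l_cols   VARCHAR2(4000);
  l_vals   VARCHAR2(4000);
BEGIN
  -- read the table's columns from the dictionary
  FOR c IN (SELECT column_name
              FROM user_tab_columns
             WHERE table_name = l_tab
             ORDER BY column_id)
  LOOP
    l_params := l_params || 'p_' || LOWER(c.column_name) || ' IN '
                         || LOWER(l_tab) || '.' || LOWER(c.column_name) || '%TYPE, ';
    l_cols   := l_cols   || LOWER(c.column_name) || ', ';
    l_vals   := l_vals   || 'p_' || LOWER(c.column_name) || ', ';
  END LOOP;
  -- emit the generated ins() routine (trimming the trailing ", ")
  DBMS_OUTPUT.put_line('PROCEDURE ins(' || RTRIM(l_params, ', ') || ') IS');
  DBMS_OUTPUT.put_line('BEGIN');
  DBMS_OUTPUT.put_line('  INSERT INTO ' || LOWER(l_tab) || ' (' || RTRIM(l_cols, ', ') || ')');
  DBMS_OUTPUT.put_line('  VALUES (' || RTRIM(l_vals, ', ') || ');');
  DBMS_OUTPUT.put_line('END ins;');
END;
/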
I guess I'm not the first who has come to this conclusion, so I'm wondering if there are code generation solutions of this kind around, either commercial or free.
The CodeGen utility is no longer available at Toadworld. I am now looking into offering alternatives for TAPI (and, more generally, data access layer) generation on the PL/SQL Challenge site (plqlchallenge.com). Rick, I would be interested to talk to you about yours - feel free to contact me at steven@stevenfeuerstein.com.
Regarding the question of whether to use a TAPI or not: I believe it is most important to focus on the fundamental principles first and then seek out the optimal solution.
The key principle, for me, is to avoid repetition of SQL statements in my app, and consequently to make it easier to optimize, maintain and enhance those statements. For this reason, a data access layer is critical. Some of us build apps that perform DML on individual tables, and so we find TAPIs useful. Others do not and prefer "XAPIs" (transaction APIs).
These days, I write packages that contain parts of both - and generate as much of it as I can.
You could try
https://code.google.com/p/tapig/
I looked at it briefly... but it had an issue because I have table names with _ in them.

Relationship between two Datacontext

I have two databases: MasterData and ProductData.
I store the Users and Employees in MasterData and I store the Tasks in ProductData.
A Task entity has a User property. It shows which user created this Task.
If I used just one database and one DataContext, I could define a one-to-many relationship between the two entities. But I must use two databases and two DataContexts.
Is there any solution for defining a relationship between two entities that live in different databases/DataContexts?
Thanks in advance: l.
This is not a full blown answer, but it might get you to think of another solution.
Depending on the DBMS you are using, you might be able to create synonyms or updatable views (or something similar) from one database to the other. That way your DataContext can contain the synonyms/views as well as the tables.
In sql-server:
http://msdn.microsoft.com/en-us/library/ms177544.aspx
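A rough sketch of what that could look like in SQL Server (the table and column names below are placeholders, not taken from your model):
USE ProductData;
GO
-- make MasterData's Users table addressable from ProductData
CREATE SYNONYM dbo.Users FOR MasterData.dbo.Users;
GO
-- inside ProductData, Users can now be queried (and mapped in the DataContext) as if it were local
SELECT t.TaskId, u.UserName
FROM dbo.Tasks AS t
JOIN dbo.Users AS u ON u.UserId = t.CreatedByUserId;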
Well, unless I missed something, there is no way to join two entities from different contexts/databases, regardless of whether it's L2S or EF. The alternative is pulling all the possibly relevant data from the two contexts and using in-memory LINQ for the relational operations, but that certainly poses the performance problem of loading too much data.
Here's a "novel" idea: why not use a DataSet? Different table adapters can use different connection strings. It is rather archaic next to L2S/EF, but it will offer you most of the bells & whistles of relationships.
I do have one question: if you keep users and their tasks in separate DBs, how do you handle referential integrity?
A synonym is a good solution, but EF does not support synonyms yet....
http://data.uservoice.com/forums/72025-ado-net-entity-framework-ef-feature-suggestions/suggestions/1052345-support-for-multiple-databases?ref=title
Thanks again!

Disadvantages of consolidating databases?

In an organization that has two applications each with its own Oracle database instance, what are the disadvantages of consolidating the two databases into one database with two schemas?
Backups and replicating the database are bigger and slower, probably. What else?
Some background:
The two databases are the "gold source" for their respective data. Each is critical to the operation of the organization and each is actually used by several applications, tools, and reports (but each database is principally "owned" by one application). The need to join data across the databases, to relate entities in one to entities in the other, comes up frequently. For this reason there are DB links connecting the two and some cross-database materialized views to help with performance. There is an effort underway to reduce data duplication and these materialized views are under discussion. Some in the organization want to phase out DB links and materialized views and introduce more web services to make the data available across applications. My concern is that there are too many situations that require complex joins of data across the two databases, so services that expose the data won't perform. Another approach for reducing DB links and materialized views is to consolidate the schemas into one database, but I want to make sure I'm not forgetting any critical disadvantages to that approach.
In a single consolidated database, you will lose some flexibility from a DBA point of view:
A database obviously can have only one version (10.2.0.5, for example), which means that upgrades and patches will affect all schemas -- this may be a bad thing if multiple vendor apps have mismatched version requirements.
Similarly, some administrative tasks (restore database A to point in time t) may be more complicated with a single database.
Overall, you will have less administration tasks (a single backup, single patching...) but each task will be more critical since they will have a global effect.
On the development side, beware of namespace collisions: some features are global over a single database, for example:
directories,
public synonyms,
DB links,
schemas.
This means that you will have some work to do if you want to consolidate two databases that have public synonyms with the same name pointing to two different things.
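For example, a rough query to spot such collisions up front (assuming a DB link, here called other_db, from one database to the other):
SELECT a.synonym_name,
       a.table_owner || '.' || a.table_name AS target_here,
       b.table_owner || '.' || b.table_name AS target_there
  FROM dba_synonyms a
  JOIN dba_synonyms@other_db b
    ON b.synonym_name = a.synonym_name
 WHERE a.owner = 'PUBLIC'
   AND b.owner = 'PUBLIC'
   AND (a.table_owner <> b.table_owner OR a.table_name <> b.table_name);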
Could have something to do with licence costs - scaling up vs. scaling out.
The biggest concern I would have is that all your code will need to be rewritten, or at least reviewed, to account for the new database and schemas. This could introduce new bugs. I don't know how Oracle handles references to different databases, so I'll give an example of what I mean using SQL Server syntax. If I were joining two tables on the same server in different databases, my select would be something like this:
SELECT a.field1, b.field2
FROM database1.dbo.table1 a
JOIN database2.dbo.table2 b
ON a.myid = b.myFK
With your new consolidated design, you would instead write:
SELECT a.field1, b.field2
FROM schema1.table1 a
JOIN schema2.table2 b
ON a.myid = b.myFK
You will need to be especially careful of any tables that have the same name in both databases; this could cause some sneaky bugs.
Note these are not difficult changes, but all SQL hitting your database would have to be examined to see if it still works, and adjusted if not.
I'm not sure if just putting them in the same database would do it either. You might need to consolidate some tables to avoid the duplication across applications. (In this case, add fields to reference the old id numbers for things people are used to looking up by id, like a person_id that may appear on old paperwork, so they can still be researched.) This is a fairly major rewrite with all the attendant possibilities of making things worse due to new bugs.
If you go down this path, I highly recommend that you read a book on refactoring databases before you decide on a design.
It's hard to tell from just the information provided. Big in the DB world would be 100 GB or more, so two DBs would be 200 GB. If both DBs are no bigger than 100 GB, then size should not be a huge factor in the decision; replication and sync can be done on changes only, and backups should not be a big difference (again, this depends on specifics such as when backups are done, whether downtime is possible, or whether backups run during non-peak times).
Other than that, the other factors are:
naming collisions in database objects such as keys, foreign key names, table names, etc., plus some renaming of tables and stored procedure names too.

Separating Demo data in Live system

If we put aside the rights and wrongs of putting demo data into a live system for a minute (that's a whole separate discussion!), we are being asked to store some demo data in our live system so that it can be credibly demonstrated without the appearance of smoke and mirrors (we want to use the same login page, for example).
Since I'm sure this is a challenge many other people must have faced, I'd be interested to know what approaches people have devised for separating this data so that it doesn't get in the way of day-to-day operations on their systems.
As I alluded to above, I'm aware that this probably isn't best practice. :-)
Can you instead segregate the data into a new database and just redirect your connection strings (they're not hard-coded, right? right?) to point to the demo database? This way, live data isn't tainted, and your code looks identical. We actually do a three-tier deployment system this way, where we do local development, deploy to QC environments that have snapshots of the live data every few months, and then deploy to live when testing is complete.
FWIW, we're looking at using Oracle's row level security / virtual private database feature to separate the demo data from the rest.
I've often seen it on certain types of live systems.
For example, point of sale systems in a supermarket: cashiers are trained on the production point of sale terminals.
The key is to carefully identify the test or training data. I wouldn't say that there's any explicit best practice for how to model this in a database - it's going to be application specific.
You really have to carefully define the scope of what is covered by the test/training scenarios. For example, you don't want the training/test transactions to appear in production reports (but you may want to be able to create reports with this data for training/test purposes).
Completely disagree with Joe. Oracle has a tool to do this regardless of implementation. Before I read your answer I was going to say VPD... But that could have an impact on Production.
Remember, every table in a query changes from
SELECT * FROM tableA
to
SELECT * FROM (SELECT * FROM tableA WHERE Data_quality = 'PROD' <or however you do it>)
Every table that has a policy, that is...
So assuming your test data has to span EVERY table, every table will have to have a policy, and every table will be filtered before a SQL statement can begin working.
You can even hide that column from the users. You'll need to write the policy with some deftness if you do: you'll have to populate that value based on how the data is inserted, and expose the column to certain admin accounts for maintenance.
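A rough sketch of the kind of policy being described (the function name, the APP_OWNER/APP_ADMIN accounts and the Data_quality column are placeholders, not prescriptions):
CREATE OR REPLACE FUNCTION hide_demo_rows (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
  -- admin/maintenance accounts see everything; everyone else only sees PROD rows
  IF SYS_CONTEXT('USERENV', 'SESSION_USER') = 'APP_ADMIN' THEN
    RETURN NULL;  -- no predicate appended
  END IF;
  RETURN 'data_quality = ''PROD''';
END;
/
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'APP_OWNER',
    object_name     => 'TABLEA',
    policy_name     => 'HIDE_DEMO_ROWS_POLICY',
    function_schema => 'APP_OWNER',
    policy_function => 'HIDE_DEMO_ROWS',
    statement_types => 'SELECT,INSERT,UPDATE,DELETE');
END;
/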
