Test data management solution for Oracle, DB2 and SAP

I have 3 different databases (Oracle, SAP, DB2) and would like to implement data masking on the Oracle DB. Since the data flows on to SAP and DB2, how can I solve this issue? Data in Oracle is compared with DB2 and SAP; for example, if I mask first name in Oracle, the same value will not be masked in SAP and DB2. So is there a way to unmask the data and send it to the downstream systems?

Generally this task can be solved with vendor tools such as IBM Optim Data Privacy. Such tools provide capabilities for consistent masking, i.e. the same input produces the same masked output, provided equivalent algorithms and parameters are used.
By SAP you probably mean SAP HANA. That side can be a bit tricky, due to missing SQL compatibility and the lack of integration, but it is doable too with the very same tools - just a bit more work to implement.
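To illustrate the consistent-masking idea, here is a minimal sketch using a keyed hash. This is not IBM Optim's actual algorithm; the secret and the truncation length are placeholder assumptions. The point is that the same input and key always yield the same masked value, so independently masked copies in Oracle, DB2 and SAP remain comparable:

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class DeterministicMask {
        // Same secret + same input => same masked output, so every system
        // that shares this key produces consistent masked values.
        private static final byte[] KEY =
                "replace-with-shared-secret".getBytes(StandardCharsets.UTF_8);

        public static String mask(String value) throws Exception {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(KEY, "HmacSHA256"));
            byte[] digest = hmac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            // Truncated for readability; collision risk grows as you shorten it.
            return Base64.getUrlEncoder().withoutPadding()
                    .encodeToString(digest).substring(0, 12);
        }

        public static void main(String[] args) throws Exception {
            System.out.println(mask("Alice")); // identical on every system
            System.out.println(mask("Alice")); // prints the same value again
        }
    }

Note that a keyed hash is one-way: you cannot unmask, only re-derive the same pseudonym everywhere. Reversible approaches (e.g. format-preserving encryption) exist in the vendor tools if the downstream systems genuinely need the original values back.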

Related

Using Saiku-ui without saiku-server/mondrian?

Is it possible to use the saiku-ui component with a different JOLAP provider than Mondrian, or with a different server backend than the saiku-server component?
I have been looking, but I have not found an architecture description of how these pieces fit together and what interfaces they use to communicate. Can anyone point me towards an understanding of what the saiku-ui wants to speak to and what the saiku-server provides?
The reason for my interest is that I have a set of data spread across hundreds of CSV files that I would like to query with a pivot and charting tool. It looks like the standard way to use these with Saiku would be an ETL process to load them into an RDBMS. However, this would not be a simple process, because the files, their content, and the way the files relate to each other vary, so the ETL would have to do a lot of inspection of the data sources to figure it out.
Given this, it seems to me that I have three options for how to use Saiku:
1) Write a complex ETL to load into an RDBMS, and then use a standard JDBC driver to provide the data to Mondrian. A side function of the ETL would be to analyze the inputs and write the Mondrian schema file describing the cubes.
2) Write a JDBC driver to access the data natively. This driver would parse SQL and provide access to the underlying tables; essentially it would be a custom read-only DBMS written on top of the CSV files. The JDBC connection would be used by Mondrian to access the data. A side function of this custom DBMS would be to produce the Mondrian schema file.
3) Write a tool that provides a JOLAP interface to the native data (accepting discovery and MDX queries). This would bypass Mondrian entirely and interface with the UI.
I may be a bit naive here, but I consider each of the three options to be feasible. Option #1 is my least preferred because of the likelihood of the data in the RDBMS becoming out of sync with the CSV files. Option #3 is most preferred because the data are simple, so not much aggregating is required, and I suspect that MDX will be easier to parse than SQL.
So, if I could produce my own JOLAP data source, would it be possible to hook the saiku-ui tools up to it? Where would I look to find out the interface configuration details?
Many years ago, Roland Bouman created xmondrian - a set of tools with an OLAP server and web UI tools for XMLA browsing and visualisation. But that project is no longer updated and has no source code available.
I just updated the OLAP server and libraries to the latest versions.
You may get it here and build it:
https://github.com/Muritiku/xmondrian-build
You may use the web package as an example. The Mondrian server works with the saiku-ui.
IMHO,
I would not be as confident as you are, because it took Julian Hyde more than a decade to build Mondrian (MDX -> SQL) and Calcite (SQL), which together cover your last two proposals.
You might simply consider using Calcite, or even better Dremio. Dremio has a JDBC interface and can query directories of CSV files in SQL. I tested Saiku over Dremio successfully (with a schema based on two separate RDBMSs). Just be careful to set up the tables' schemas accordingly in the Mondrian v4 schema.
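For what it's worth, here is a minimal sketch of the Calcite route: its JDBC driver takes a model file that maps a directory of CSV files to tables (see the calcite-example-csv project). The model.json path and the sales table are assumptions for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CsvOverCalcite {
        public static void main(String[] args) throws Exception {
            // model.json (hypothetical here) points the CSV adapter at a
            // directory; each file becomes a queryable table.
            String url = "jdbc:calcite:model=model.json";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 // Table name depends on your model/file names and lexing rules.
                 ResultSet rs = stmt.executeQuery("SELECT count(*) FROM sales")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }

Since Mondrian itself sits on top of a JDBC connection, a setup like this avoids both the ETL and the custom-driver options from the question.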
Best regards,
Fabrice Etanchaud
Dremio

Oracle -> PostgreSQL log-based replication

(To make things clear: I do not code myself.)
I am looking for a solution that would allow replicating data between a master Oracle 11g DB and a new PostgreSQL DB. These are two different applications, but they need to exchange data in real time. There are some trigger-based approaches, but there is quite a big concern that they could affect the master DB's performance - which we can't accept.
I have also come across some log-based solutions, like HVR, but the cost is way too high for 500 MB of data to be replicated.
Has anyone of you had a similar issue and found a way to deal with it?
Any kind of tips and help will be really appreciated, as I am quite short on time.
Oracle archive logs have a different format than Postgres write-ahead logs. Despite the general similarity in concept between Oracle Streams, SQL Server log shipping, Postgres streaming replication, etc., transaction logs, redo logs, and WAL files are not interchangeable: you can't use one vendor's logs to roll forward another vendor's engine.
Moreover, you can't even replay logs across different versions of the same DB product, because the binary formats differ.
Something akin to logical replication you can get with Postgres logical decoding, Oracle GoldenGate, heterogeneous database replication, or AWS DMS. But none of the above gives you "log-based replication" between different DB vendors.
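To make "logical decoding" concrete on the Postgres side (the reverse direction, with Postgres as the source): changes can be read as decoded rows rather than binary WAL via plain SQL functions over JDBC. The slot name, database, and credentials below are placeholders, and the server must run with wal_level = logical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class LogicalDecodingPeek {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://localhost:5432/demo";
            try (Connection con = DriverManager.getConnection(url, "postgres", "secret");
                 Statement st = con.createStatement()) {
                // Create a slot using the built-in test_decoding output plugin.
                st.execute("SELECT pg_create_logical_replication_slot("
                        + "'demo_slot', 'test_decoding')");
                // ... committed changes accumulate in the slot here ...
                try (ResultSet rs = st.executeQuery(
                        "SELECT lsn, xid, data FROM "
                        + "pg_logical_slot_get_changes('demo_slot', NULL, NULL)")) {
                    while (rs.next()) {
                        // Each row is a decoded change, e.g. "table public.t: INSERT: ..."
                        System.out.println(rs.getString("data"));
                    }
                }
            }
        }
    }

This is what CDC products consume programmatically; on the Oracle side they read the redo logs instead (e.g. via LogMiner), then apply the decoded changes as ordinary SQL on the target.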
You can use a product that specializes in change data capture (CDC) based data integration. Striim, GoldenGate, and Attunity all allow you to do CDC from Oracle. Striim also allows you to do CDC from PostgreSQL and write to Oracle as well.
https://striim.com
https://attunity.com

Developer sandboxes for Oracle database

We are developing a large data migration from an Oracle DB (12c) to another system with SSIS. The developers are using a copy of the production database, but due to the complexity of the data transformation we have to do things in stages by preprocessing data into intermediate helper tables, which are then used further downstream. The problem is that all developers share the same database and screw each other up by running things simultaneously. Does Oracle DB offer anything in terms of developer sandboxing? We could build a mechanism to handle this (e.g. put a dev ID in the helper tables, then query through views that map to the dev), but I'd much rather use built-in functionality. Could I use Oracle Multitenant for this?
We ended up producing a master subset database of selected schemas/tables through some fairly elaborate PL/SQL, then made several copies of this master schema so each dev has his/her own sandbox (as Alex suggested). We could have used Oracle Data Masking and Subsetting, but it's too expensive. Another option for creating the subset database would have been to use Jailer. I should note that we didn't have a need to mask any sensitive data.
Note: I would think this is a fairly common problem, so if new tools and solutions arise, please post them here as answers.
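For anyone who does want to try the Multitenant route from the question, a sketch of cloning one pluggable database per developer is below. The connection string, user, and PDB names are hypothetical, and this assumes Oracle Managed Files (otherwise a FILE_NAME_CONVERT clause is needed) plus a source PDB that can be cloned:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ClonePdbPerDev {
        public static void main(String[] args) throws Exception {
            // Connect to the container DB as a common user with the
            // CREATE PLUGGABLE DATABASE privilege (names are placeholders).
            String url = "jdbc:oracle:thin:@//dbhost:1521/CDB1";
            try (Connection con = DriverManager.getConnection(url, "c##admin", "secret");
                 Statement st = con.createStatement()) {
                for (String dev : new String[]{"alice", "bob"}) {
                    // Clone the prepared master copy into a private sandbox PDB.
                    st.execute("CREATE PLUGGABLE DATABASE dev_" + dev
                            + " FROM master_copy");
                    st.execute("ALTER PLUGGABLE DATABASE dev_" + dev + " OPEN");
                }
            }
        }
    }

Each developer then connects to their own PDB service, so simultaneous runs no longer collide in shared helper tables.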

OCDM combined with ODI

ODI = ELT tool
OCDM = Data warehouse.
Is my understanding of the above correct? More information/explanation is welcome.
Now my question is:
Is it possible to load into OCDM's pre-existing tables via ODI, when the sources for ODI are in flat-file/XML format? If possible, how?
Any links related to the above are also welcome.
It is indeed possible. OCDM is a solution using an Oracle 11g database to store the data, so ODI can definitely load it.
Actually, OCDM comes out of the box with adapters to load the data from NCC (Oracle Communications Network Charging and Control) and BRM (Oracle Communications Billing and Revenue Management), and these adapters use ODI 11g - and optionally GoldenGate.
Each of these adapters is composed of some models and one ODI project holding interfaces and packages.
If you want to build your own integration process, it is just a standard load from flat file to Oracle or from XML to Oracle. Both of these are covered by the tutorials in the ODI 11g series in the Oracle Learning Library: https://apexapps.oracle.com/pls/apex/f?p=44785:24:0::NO:24:P24_CONTENT_ID,P24_PREV_PAGE:5185,29
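Outside of ODI, such a flat-file-to-Oracle load boils down to a plain batch insert. A minimal sketch, assuming a hypothetical staging table stg_customer and a two-column semicolon-separated file (ODI generates equivalent loading code for you, this just shows what it amounts to):

    import java.io.BufferedReader;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class FlatFileToOracle {
        public static void main(String[] args) throws Exception {
            // Connection details and table layout are placeholders.
            String url = "jdbc:oracle:thin:@//dbhost:1521/OCDMPDB";
            try (Connection con = DriverManager.getConnection(url, "ocdm_user", "secret");
                 BufferedReader in = Files.newBufferedReader(Paths.get("customers.csv"));
                 PreparedStatement ps = con.prepareStatement(
                         "INSERT INTO stg_customer (id, name) VALUES (?, ?)")) {
                con.setAutoCommit(false);
                String line;
                while ((line = in.readLine()) != null) {
                    String[] f = line.split(";"); // expected layout: id;name
                    ps.setLong(1, Long.parseLong(f[0]));
                    ps.setString(2, f[1]);
                    ps.addBatch();
                }
                ps.executeBatch(); // one round trip for the whole batch
                con.commit();
            }
        }
    }
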
It's possible, using the OCDM add-ons to load the data:
NCC (Oracle Communications Network Charging and Control),
BRM (Oracle Communications Billing and Revenue Management).
These adapters use ODI 11g and ODI 12c, and optionally GoldenGate as well.
ODI is mostly used to load historical data. For more real-time data, Oracle GoldenGate is used. However, OGG is used to load the data into staging; for the data sync from staging to presentation, ODI is still used.
Yes, it's possible. Standard ODI interface development techniques are enough to implement this.

Independence through ODBC drivers

Can an application developed with Oracle queries in the DB layer be run on a SQL Server database with the help of an ODBC driver?
Maybe, if you used only ANSI SQL statements. ODBC will happily send the text of the query to the query parser on the server, and as long as the server can parse it, it will run.
If, however, you have used anything that's specific to Oracle (and that's a long, long list), then it won't work so well.
All that ODBC provides you is abstraction from the connection details - the driver, the server name, the port numbers, etc.
So, how do you get true independence? Generally, you'll use a query generation library like Hibernate, which knows how to translate a query language of some kind (HQL) into the specifics of a particular database (PL/SQL or Transact-SQL).
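As a sketch of what that looks like with Hibernate: the dialect configured in hibernate.cfg.xml does the per-database translation, while the HQL stays identical. The Customer entity and the configuration file are assumed to exist:

    import java.util.List;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.cfg.Configuration;

    public class PortableQuery {
        public static void main(String[] args) {
            // hibernate.cfg.xml selects the dialect (Oracle vs SQL Server);
            // nothing in this code changes between the two databases.
            SessionFactory factory = new Configuration().configure().buildSessionFactory();
            try (Session session = factory.openSession()) {
                List<String> names = session
                        .createQuery("select c.name from Customer c", String.class)
                        // Translated to ROWNUM, FETCH FIRST, or TOP per dialect.
                        .setMaxResults(10)
                        .list();
                names.forEach(System.out::println);
            } finally {
                factory.close();
            }
        }
    }
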
Short answer: not reliably.
Longer answer: not through ODBC alone, but with a JDBC driver for Microsoft SQL Server it might work - if the application was developed only with ANSI-standard SQL. Usually that is not the case, and some PL/SQL code will have been used. If an equivalent piece of T-SQL can be written, then it is possible to port the application. But, to your question, this is largely immaterial to the database connection mechanism.
Addendum: object-relational mapping tools usually use dialects to generate database-independent queries. Other options include using configuration to select the correct queries at run time (if you need to support both database types).
