Converting database formats (MySQL, MS Access, MS Excel, IBM DB2) - portability

We need a good tool that we can run from a script to
automatically convert a (MySQL, MS Access, or DB2) database
to a (MySQL, MS Access, MS Excel, or DB2) database,
while preserving the data types as much as possible (text, int, decimal, time...).
Do you know of such a tool?
I am looking for a solution such as the IBM Data Movement Tool. This tool converts from any database to IBM DB2 automatically: it maps the data types, creates the tables, and imports the data, and you can run it from the command line. The only problem is that the target database can only be a DB2 database. I am looking for the same type of tool, but from any database to any database. http://www.ibm.com/developerworks/data/library/techarticle/dm-0906datamovement/

The best I can suggest is the SwissSQL products. I have used their trial versions to help me convert schemas from MS SQL to Postgres, and they were quite useful.
In theory they can also convert full schemas, data, and even stored procedures. In practice you should read what their tools cover, because if you expect 100% conversion without any human intervention you will probably be disappointed.
Similar products may exist (have you already googled for "convert DB2 to MySQL"?), but they will probably not cover exactly the mix you are interested in (for example, there may be tools specializing in converting from a range of DBs to MySQL, or from a range to MS Access, but a general-purpose tool doing what you want for every combination is probably too much to ask).
And I am pretty sure that in every case, you will have to verify the results and be ready to manually correct anything missing, truncated, or misinterpreted.

I don't think there's a single silver bullet for this problem. The best thing to do is look for specific tools to help you along. Some things can already be done without additional tools, e.g. exporting Excel to CSV and importing the CSV with MySQL's tools. Likely you'll just need to find the best tool for each specific case (with perhaps a few combined) and selectively use each depending on your desired input/output formats.
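For instance, the MySQL side of that CSV route might look like this (a rough sketch; the table, columns, and file path are made up, and the server may need the local_infile option enabled):

-- A minimal sketch (hypothetical table, columns, and path): import a CSV
-- saved from Excel into MySQL. Declare column types to match the source data.
CREATE TABLE contacts (
  id   INT PRIMARY KEY,
  name VARCHAR(100),
  born DATE
);

LOAD DATA LOCAL INFILE '/tmp/contacts.csv'
INTO TABLE contacts
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'  -- Windows line endings, typical of Excel exports
IGNORE 1 LINES;             -- skip the header row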

Related

How can we do data analysis for a DB replication project?

We are facing an issue in our project: data verification.
The project is about replication of data from Sybase to Oracle DBs.
The table structure for Table A is the same across Sybase and Oracle:
the same columns and primary key combination across all the databases.
e.g. if Sybase has Table A with columns a, b and c,
a table with the same name and the same columns will be available in the different databases.
We are done with the replication part, but we have seen some silent failures, such as data discrepancies, and we're wondering if there is any tool already available for this.
Any information on this would be helpful. Thanks.
Sybase (now SAP) has a couple of products that can be used for data comparisons and reconciliation:
rs_subcmp - an older, 32-bit tool that comes with the Sybase Replication Server product and can be used to compare data between source and target; SQL reconciliation scripts can be generated from the differences and then applied to the target to bring it in sync with the source. If your tables are more than 1GB in size you can still use rs_subcmp, but you'll need to create multiple comparison jobs (via where clauses) to work on different subsets of your tables. [I don't recall if rs_subcmp can be used for heterogeneous replication setups, e.g., ASE-Oracle.]
Data Assurance (DA) - the newer, 64-bit product, also from Sybase, which can also compare data and (re)sync the target(s) from the source (either via SQL reconciliation scripts or directly). DA is capable of handling comparisons between a handful of different RDBMS products (e.g., ASE-Oracle). I'm currently working on a project where one of the requirements is to validate (and reconcile where needed) 200+ TB of data being migrated from Oracle to HANA, and I'm using DA for the validation/reconciliation portion of the project.
As #TenG has hinted at in his answer, there's a good bit of effort involved in comparing data and generating code to reconcile the differences. Rolling your own code is doable but will entail a lot of work. If you've got the money you'll likely find 3rd party tools can get most/all of the work done for you.
If you used a 3rd party product to replicate your data from Sybase to Oracle, you may want to see if the same vendor has a comparison/validation/reconciliation tool you could use.
I've worked on a few migration projects and a key part has always been data reconciliation.
I can only talk about the approaches we took, based on the constraints around available tools, minimising downtime, and available space.
In all cases I took to writing scripts that worked on two levels - summary view and "deep dive". We couldn't find any tools readily available that did what we wanted in a timely enough manner. In fact even the migration tools we found had limitations (Data Pump, SQL*Loader, GoldenGate, etc.), and we hand-coded scripts to handle the bits that we found lacking or too slow in the standard tools.
The summary view varied from project to project. It was partly functional (do the accounting figures for transactions match?) for the users to verify, and partly technical. For smaller tables we could just write simple reports, and the diff was straightforward.
For larger tables we wrote technical reports that looked at bands of data (e.g. grouping the PK into bands), collected all the column data, and produced a checksum, generating a report for each table like the following (a query sketch follows the sample report):
PK ID Range Start    Checksum
-----------------    -----------
100000               22773377829
200000               38938938282
.
.
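A rough sketch of the kind of banding query involved (the table, columns, and band width are illustrative; ORA_HASH is just one choice of per-row hash on the Oracle side, not necessarily what we used):

-- Sketch of the banding idea (hypothetical table/columns). ORA_HASH gives a
-- per-row hash; SUM collapses each 100,000-wide PK band into one comparable
-- checksum. Any deterministic hash/checksum function would do.
SELECT TRUNC(pk_id / 100000) * 100000                         AS band_start,
       SUM(ORA_HASH(col_a || '|' || col_b || '|' || col_c))   AS checksum
FROM   table_a
GROUP  BY TRUNC(pk_id / 100000)
ORDER  BY band_start;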
Corresponding table pairs from each database were then "diff"ed against each other to highlight discrepancies. Any differences that were found could then be looked at in more detail.
The scripts were written in such a way as to allow them to run in parallel, looking at discrete bands. The band ranges were tunable as well, to get the best throughput. This obviously sped things up.
The scripts were shell scripts firing off sqlplus reports, with similar scripts for the source database.
On one project there wasn't enough disk space to do these reports, so I wrote a Java program that queried the two databases side by side, using blocking queues to fetch and compare rowsets. Being in memory meant this was super fast.
For the "deep dive" we looked at the details for key tables, or for tables that reports a checksum difference.
For the user reports, the users would specify what they wanted to see, and we wrote the reports accordingly.
On the last project, the only discrepancies found were caused by character set conversion issues (people names with accents weren't handled correctly).
On projects where the overall dataset was smaller we extracted the data to XML files and wrote a Java tool to process pairs and report differences.
The SAP/Sybase rs_subcmp tool is pretty powerful and also pretty hard to use. For details see:
https://help.sap.com/viewer/075940003f1549159206fcc89d020515/16.0.3.3/en-US/feb58db1bd1c1014b134ef4efef25563.html?q=rs_subcmp
You have to pass it key field information, but once you do that, it can retry/restart the compare streams after transient differences. Pretty fancy.
rs_subcmp expects to work with a Sybase data source. So to compare against Oracle, you'd probably have to set up one of those Sybase-to-Oracle gateway products ($$$$$).
Could you install the Oracle ODBC drivers and configure them to allow Sybase clients to access Oracle? I'm guessing not (but that's outside the range of my experience).
Note the "-h" option for rs_subcmp. The docs just say it runs a "fast comparison", but what it's actually doing is running queries using the hashbytes() function. Something like:
select keyfield1,keyfield2, hashbytes("Md5",datacol1,datacol2,datacol3)
from mytable
So this sort of query might be good for the "summary view" type of comparison discussed above (if the Oracle STANDARD_HASH() function output matches up with the Sybase hashbytes() function - again, outside my experience).
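For what it's worth, a hypothetical Oracle-side counterpart might look like the following (STANDARD_HASH is available from Oracle 12c onwards; verify the outputs actually line up before relying on it):

-- Hypothetical Oracle counterpart. Caveats: STANDARD_HASH returns RAW, it
-- takes a single expression (so the columns must be concatenated, ideally
-- with a delimiter to avoid collisions), and NULL handling may differ from
-- Sybase's hashbytes().
select keyfield1, keyfield2,
       standard_hash(datacol1 || '|' || datacol2 || '|' || datacol3, 'MD5')
from mytable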
Note: as of ASE 16, there was a bug with the hash() and hashbytes() functions when running the Md5 hash option against large varbinary columns, where they could use up all of procedure cache, potentially crashing the server (CR 811073).

Logical grouping schemas in ORACLE?

We are planning a new system for a client in ORACLE 11g. I've been mostly in the Sql Server world for several years, and am not really current on the latest ORACLE updates.
One particular feature I'm wondering if ORACLE has added in by this point is some sort of logical "container" for database objects, akin to Sql Server's SCHEMA.
Trying to use ORACLE's schemas like Sql Server winds up being a disaster for code comparisons when trying to push from dev > test > live.
Packages are sort of similar, except that you can't put tables into a package (so they really only work for logical code grouping).
The only other option I am aware of is the archaic practice of having to prefix object names with a "schema" prefix, e.g. RPT_REPORTS, RPT_PARAMETERS, RPT_LOGS, RPT_USERS, RPT_RUN_REPORT(), with the prefix RPT_ denoting that these are all the objects dealing with, say, our reporting engine. Writing a system like this feels like we never left the 8.3 file-naming age.
Is there by this point in time any cleaner, more direct way of logically grouping related objects together in ORACLE?
Oracle's logical container for database objects IS the schema. I don't know how much "cleaner" and "more direct" you can get! You are going to have to do a paradigm shift here. Don't try to think in SQL Server terms, and force a solution that looks like SQL Server on Oracle. Get familiar with what Oracle does and approach your problems from that perspective. There should be no problem pushing from dev to test to production in Oracle if you know what you're doing.
It seems you have a bit of a chip on your shoulder about Oracle when you use terms like "archaic practice". I would suggest you make friends with Oracle's very rich and powerful feature set by doing some reading, since you're apparently already committed to Oracle for this project. In particular, pick up a copy of "Effective Oracle By Design" by Tom Kyte. Once you've read that, have a look at "Expert Oracle Database Architecture" by the same author for a more in-depth look at how Oracle works. You owe it to your customer to know how to use the tool you've been handed. Who knows? You might even start to like it. Think of it as another tool in your toolchest. You're not married to SQL Server and you're not being unfaithful by using Oracle ;-)
EDIT:
In response to questions by OP:
I'm not sure why that is a logistical problem. They can be thought of as separate databases, but physically they are not. And no, you do not need a separate data file for each schema. A single datafile is often used for all schemas.
If you want a "nice, self-contained database" ala SQL Server, just create one schema to store all your objects. End of problem. You can create other users/schemas, just don't give them the ability to create objects.
There are tools to compare objects and data, such as the PL/SQL Developer compare. Typically in Oracle you want to compare schemas, not entire databases. I'm not sure why you want to have multiple schemas each with their own objects anyway. What does it buy you to do that? Keep your objects (tables, triggers, code, views, etc.) in one schema.

Any good document-oriented DB for Windows desktop besides MongoDB, etc?

I've been searching for a document-oriented DB for a Windows desktop program. MongoDB seems to be the best one so far, because it's smaller (11MB) and simpler compared to CouchDB (which is another option, but it seems more complex and the download size is almost 50MB). Unfortunately, on 32-bit Windows the database size limit in MongoDB is 2GB, and they don't intend to fix this limit anytime soon.
Do you have any recommendation? Requirements:
Open source;
Schema-less, in BSON/JSON format;
Easy to deploy to a Windows machine.
Many thanks!
I'm just curious: why would you need a non-relational database for a desktop application? I mean, these things are designed for high-availability clusters and really large amounts of data, both of which are irrelevant for desktop apps, where you usually have just one user at a time and not such a large dataset.
What I would use if I were you is an embedded database like HSQLDB or SQLite.
Now, if you want to make it schema-less for simplicity, just create your tables with only two columns, id (long) and data (varchar),
and then serialize/deserialize your objects to and from JSON yourself when accessing the data.
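For example, a minimal sketch in SQLite (the table and column names are just illustrative):

-- One "documents" table; the application serializes each object to a JSON
-- string before INSERT and deserializes it after SELECT.
CREATE TABLE documents (
  id   INTEGER PRIMARY KEY AUTOINCREMENT,
  data TEXT NOT NULL
);

INSERT INTO documents (data) VALUES ('{"name":"Alice","age":30}');

SELECT data FROM documents WHERE id = 1;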
You can see a really easy way to do the JSON stuff here:
JSON Serializer for arbitrary HashMaps in Voldemort
Note: The question on link above is Voldemort-specific, but the answer I received isn't and could be applied here as well (assuming you are using Java, if not there has to be an easy way to do so in your language, too).

Oracle versus DB2 on data validation

Most forums cite minor differences in speed, backup, etc.
It's about time someone explained how the two differ when it comes to GUI data validation.
Do these two databases always depend on Java (or other software), or do they have the ability to create a user interface that accepts only valid input? Things like: positive numbers only, age between 1 and 100 only, email must be correct. I would be scared if my software accepted 500 years old for an age.
Both offer native development tools that are roughly comparable and able to do what you ask about.
Both also offer the ability for all the main languages to interact with the RDBMS, so the means of doing the type of thing you describe are as diverse as the languages available: Java, .NET, Ruby, Python, C++, VB, etc.
However, what they don't really offer is a simple Access-style 'forms and tables' RAD tool. In simple terms, the increased flexibility and power of both Oracle and DB2 come at the price of simplicity.
Neither database DEPENDS on Java for implementing field level constraints. Data constraints can be implemented directly at the database level, and it is good practice to do so.
But you also need field level validation - users do not want to get constraint violation errors on insert.
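For illustration, a minimal sketch of database-level rules that works in both Oracle and DB2 (the table is hypothetical, and the email check is deliberately crude):

-- A sketch (hypothetical table): the database itself rejects bad values,
-- regardless of which language or GUI sits in front of it.
CREATE TABLE person (
  id     INTEGER       NOT NULL PRIMARY KEY,
  age    INTEGER       NOT NULL CHECK (age BETWEEN 1 AND 100),
  salary DECIMAL(10,2)          CHECK (salary > 0),
  email  VARCHAR(255)  NOT NULL CHECK (email LIKE '%_@_%._%')  -- crude shape check only
);

An insert with age 500 would be rejected at the database level; the GUI layer's job is to catch it first so users see a friendly message instead of a constraint-violation error.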
As for tools that generate GUI applications from the database itself - I don't see that as an Oracle vs DB2 database question; it's more Oracle APEX vs IBM's equivalent - but even within Oracle you've got Forms (deceased), JDeveloper, and APEX.

Load data into a text file from Oracle database views

I want to load the data that is generated by selecting from views in Oracle into a text file. How can I achieve this in Oracle on a UNIX box?
Please help me out, as this is already consuming a lot of time.
Your early response is highly appreciated!!
As Thomas asked, we need to know what you are doing with the "flat file". For example, if you're loading it into a spreadsheet or doing some other processing that expects a defined format, then you need to use SQL*Plus and spool to a file. If you're looking to save a table (data + table definition) for moving it to another Oracle database, then EXP/IMP is the tool to use.
We generally describe the data retrieval process as "selecting" from a table/view, not "executing" a table/view.
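A minimal sketch of the spool approach (the view name and column layout are hypothetical); save it as, say, export.sql and run it from UNIX with sqlplus -s user/password @export.sql:

-- export.sql: dump a view to a pipe-delimited flat file
SET PAGESIZE 0 FEEDBACK OFF HEADING OFF TRIMSPOOL ON LINESIZE 500
SPOOL /tmp/my_view.txt
SELECT col1 || '|' || col2 || '|' || col3 FROM my_view;
SPOOL OFF
EXIT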
If you have access to directories on the database server, and authority to create "Directory" objects in Oracle, then you have lots of options.
For example, you can use the UTL_FILE package (part of the PL/SQL built-ins) to read or write files at the operating system level.
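A minimal UTL_FILE sketch (it assumes a directory object named DATA_DIR already exists and that you have write permission on it; the view and columns are hypothetical):

DECLARE
  f UTL_FILE.FILE_TYPE;
BEGIN
  -- open report.txt for writing in the DATA_DIR directory object
  f := UTL_FILE.FOPEN('DATA_DIR', 'report.txt', 'w');
  FOR r IN (SELECT col1, col2 FROM my_view) LOOP
    UTL_FILE.PUT_LINE(f, r.col1 || ',' || r.col2);
  END LOOP;
  UTL_FILE.FCLOSE(f);
END;
/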
Or use the "external table" functionality to define objects that look like single tables to Oracle but are actually flat files at the OS level. Well documented in the Oracle docs.
Also, for one-time tasks, most of the tools for working with SQL and PL/SQL provide facilities for moving data to and from the database. In the Windows environment, Toad is good at that. So is Oracle's free SQL Developer, which runs on many platforms. You wouldn't want to use those for a process that runs every day, but they're fine for single moves. I've generally found these easier to use than SQL*Plus spooling, though spooling is a primitive version of the same functionality.
As stated by others, we need to know a bit more about what you're trying to do.
