I have huge tables in an Oracle database, approximately 1 crore+ (10 million+) rows each, and I want to migrate/copy those tables and their data into SQL Server.
Currently I am using the Import functionality of SQL Server for this, but it takes a whole day, which is far too long.
Is there a better way? Any recommended approach or steps (SSIS, or anything else) to follow for this process?
Related
Is there a way to 'defrag' a table in Azure SQL, or to understand why a table scan is so slow? I'm using an Azure SQL Database to drive an SSAS Tabular model. My source table has ~30M rows and is currently reading/processing only 1M rows/hour; with 15M rows it was processing 1M rows/minute. I'm using sp_whoisactive, but for the query that's driving the Tabular process I see IO blocking and no CPU/IO values. Other queries which don't need a table scan run fine.
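In case it matters, the only 'defrag'-style option I have found so far is an index rebuild driven by the fragmentation DMV, roughly like this (dbo.SourceTable stands in for my actual table; I'm not sure this is the right fix):

SELECT i.name AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
  FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SourceTable'), NULL, NULL, 'LIMITED') ps
  JOIN sys.indexes i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id;

ALTER INDEX ALL ON dbo.SourceTable REBUILD;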
Thanks for your help
I'm looking for a simple way to communicate between two databases; there currently exists a database link between both databases.
I want to process a job on database 1 for a batch of records (there is a batch code for each batch of records). Once the process has finished on database 1 and all the batches of records have been processed, I want database 2 to see that database 1 has processed a number of batches (batch codes), either by querying an Oracle table or an Oracle Advanced Queue which sits on either database 1 or database 2.
Database 2 will then process the batches of records that are on database 1 through a database-link view, using each batch code, and update the status of that batch to complete.
I want to be able to update the Oracle Advanced Queue or database table with each batch's number, progress status ('S' started, 'C' completed), and status date.
Table name: batch_records
Table columns: batch_no, status, status_date
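Roughly as DDL, that would be something like this (the column types are just my first guess):

CREATE TABLE batch_records
(
  batch_no    NUMBER       NOT NULL,
  status      VARCHAR2(1),   -- 'S' started, 'C' completed
  status_date DATE
);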
Questions:
Can this be done by a simple database table rather than a complex Oracle Advanced Queue?
Can a table be updated over a database link?
Are there any examples of this?
To answer your questions first:
Yes, I believe so.
Yes, it can. But if there are many rows involved, it can be pretty slow.
Probably.
A database link is the way to communicate between two databases. If those jobs run on database 1 (DB1), I'd suggest you keep the processing there, in DB1. Doing things over a database link invites problems of various kinds: it might be slow, and you can't do everything over a database link (work with LOBs, for example). One option is to schedule a job (using DBMS_SCHEDULER, or DBMS_JOB, which is quite OK for simple things) and let the procedure maintain job status in some table (that would be the "simple table" from your 1st question) in DB1, which will then be read by DB2.
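A minimal sketch of that idea, assuming the status table is the batch_records table from your question and the work happens in a procedure called process_batches (names are placeholders for your real objects):

-- DB1: the procedure does the work and maintains status in batch_records
CREATE OR REPLACE PROCEDURE process_batches AS
BEGIN
  FOR r IN (SELECT batch_no FROM batch_records WHERE status = 'S') LOOP
    -- ... process that batch of records here ...
    UPDATE batch_records
       SET status = 'C',
           status_date = SYSDATE
     WHERE batch_no = r.batch_no;
  END LOOP;
  COMMIT;
END;
/

-- DB1: run it on a schedule, e.g. daily at 06:00
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'process_batches_job',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'process_batches',
    repeat_interval => 'FREQ=DAILY; BYHOUR=6',
    enabled         => TRUE);
END;
/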
How? Read it directly, or create a materialized view which is refreshed on a schedule (e.g. every morning at 07:00), on demand (not that good an idea), or on commit (once the DB1 procedure does the job and commits its changes, the materialized view is refreshed).
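As far as I know, the ON COMMIT option only works when the master table is local, so for a materialized view built over the database link a scheduled (or on-demand) refresh is the usual choice. A minimal sketch on DB2, assuming the link is called db1_link (names are placeholders):

CREATE MATERIALIZED VIEW batch_records_mv
  REFRESH COMPLETE
  START WITH SYSDATE
  NEXT TRUNC(SYSDATE) + 1 + 7/24   -- every morning at 07:00
AS
SELECT batch_no, status, status_date
  FROM batch_records@db1_link;

-- or refresh it on demand: EXEC DBMS_MVIEW.REFRESH('BATCH_RECORDS_MV');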
If there aren't that many rows involved, I'd probably read the DB1 status table directly, and think of other options later (if necessary).
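And since you also asked whether a table can be updated over a database link and for an example: it is just ordinary DML with the @link suffix (the link name and values here are placeholders):

UPDATE batch_records@db1_link
   SET status = 'C',
       status_date = SYSDATE
 WHERE batch_no = 42;

COMMIT;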
I am new to Hibernate JPA. I am working on an Oracle-to-Postgres migration and we are not using the AWS DMS service for the data migration. We would like to move ahead with Java for copying tables which have more than 1 million records. I have a problem with the scenario below.
Table A - Oracle
Table B - Postgres
I am extracting records from Oracle using ScrollableResults. Once I have the data from Oracle, I need to look up a value in the Postgres database for each Oracle row before performing the insert into the Postgres database.
At first I thought @ColumnTransformer would help, but it does not, as I don't know how to reference the Oracle data in a ColumnTransformer expression.
So I finally went ahead with writing a normal insert query with values and a subquery for the lookup. I also set hibernate.jdbc.batch_size to 100.
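The insert is executed as a native query and looks roughly like this (the table and column names are made up, not my real schema; the bind parameters come from the Oracle row):

INSERT INTO target_table (id, name, category_id)
VALUES (?, ?, (SELECT c.id FROM category_lookup c WHERE c.code = ?));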
I executed the program this way and it took 5 minutes for 10k records, which I feel is slow.
Is there any other solution to this problem that would improve the performance?
Thanks for all your help
I found the solution. I solved it by loading the Postgres lookup table into a list object and then searching that in-memory list before performing each insert. Now the speed is good.
I have a pretty complex query where we make use of a temporary table (this is Oracle running on the AWS RDS service).
INSERT INTO TMPTABLE (inserts about 25,000 rows in no time)
SELECT FROM X JOIN TMPTABLE (joins with temp table also in no time)
DELETE FROM TMPTABLE (takes no time in a copy of the production database, up to 10 minutes in the production database)
If I change the delete to a truncate it is as fast as in development.
So I will of course deploy this change. But I would like to understand why it happens. The AWS team has been quite helpful, but they are a bit biased towards AWS and like to tell me that my 3,000 USD a month database server is not fast enough (I don't think so). I am not that fluent in Oracle administration, but I have understood that if the redo logs are constantly being filled, this can cause issues. I have increased their size quite substantially, but then again, that doesn't really add up.
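For what it's worth, this is roughly how I have been looking at redo log switch frequency, assuming I am reading v$log_history correctly:

SELECT TRUNC(first_time, 'HH') AS hour, COUNT(*) AS log_switches
  FROM v$log_history
 WHERE first_time > SYSDATE - 1
 GROUP BY TRUNC(first_time, 'HH')
 ORDER BY 1;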
This is a fairly standard issue when deleting large amounts of data. The delete operation has to process each and every row individually: every deleted row has to be logged (undo/redo in Oracle; in SQL Server terms, each row is written to the transaction log and given an LSN).
truncate, on the other hand, skips all that and simply deallocates the data in the table.
You'll find this behavior is consistent across various RDBMS solutions; Oracle, MSSQL, PostgreSQL, and MySQL all have the same issue.
I suggest you use an Oracle global temporary table. They are fast, and their rows don't need to be explicitly deleted: depending on the ON COMMIT clause, the rows are removed automatically at commit or when the session ends.
For example:
CREATE GLOBAL TEMPORARY TABLE TMP_T
(
ID NUMBER(32)
)
ON COMMIT DELETE ROWS;
See https://docs.oracle.com/cd/B28359_01/server.111/b28310/tables003.htm#ADMIN11633
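With ON COMMIT DELETE ROWS the flow from your question then becomes just insert, join, commit; there is no DELETE or TRUNCATE step at all, because the rows disappear at commit. Roughly (TMP_T as above, the other names are placeholders):

INSERT INTO TMP_T (ID)
SELECT id FROM source_rows;       -- the ~25,000 rows

SELECT x.*
  FROM x
  JOIN TMP_T t ON t.id = x.id;    -- same join as before

COMMIT;                           -- rows in TMP_T are removed automatically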
Hey EXPERIENCED SSIS DEVELOPERS, I need your help.
High-Level Requirements
Query a SQL Server table (on a different server than my SSIS server), resulting in a results set of about 200-300k records.
Use three output columns from each row to look up data in an Oracle database.
Insert or Update SQL Server table with results.
Use SSIS.
SQL Server 2008
Sounds easy, right?
Here is what I have done:
Created an Execute SQL Task on the Control Flow that gets a recordset from SQL Server. Very fast, easy query, something like select field1, field2, field3 from table where condition > 0. That's it. Takes less than a second.
Created a variable (evaluated as expression) for the Oracle query that uses the results set from the above in its WHERE clause (the query it evaluates to for each row is sketched after this list).
Created a ForEachLoop Container that takes the results (from #1 above) for each row in the recordset and runs it through a Data Flow that uses the Oracle query (from #2 above) with Data access mode: SQL command from variable against an Oracle data source. Fast, simple query with only about 6 columns returned.
Data Conversion - obvious reasons - changing 3 columns from Oracle data types to SQL Server data types.
OLE DB Destination to insert to SQL Server using Fast Load to staging table.
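For reference, per loop iteration the variable evaluates to a plain Oracle query roughly like this (table and column names are placeholders), which the Data Flow then runs as the SQL command from variable:

SELECT col1, col2, col3, col4, col5, col6
  FROM ora_schema.ora_table
 WHERE key1 = 'value_of_field1'
   AND key2 = 'value_of_field2'
   AND key3 = 'value_of_field3'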
It works perfectly! Hooray! Bad news - it is very, very slow. When I say slow, I mean it processes 3,000 records per hour. Holy moly - so freaking slow.
Question: am I missing a way to speed it up? It seems like the ForEachLoop Container is the bottleneck. Growl.
Important Points:
- I have NO write access in the Oracle environment, so don't even suggest a potential solution that requires it. Not a possibility. At all.
- Oracle sources do not allow for direct parameter definition. So no SELECT FIELD FROM TABLE WHERE ?. Don't suggest it - it doesn't work.
Ideas
- Should I find a way to break down the results of the Execute SQL Task and send them through several ForEachLoop Containers for faster processing?
- Is there another design that is more appropriate?
- Is there a script I can use that is faster?
- Would it be faster to create a temporary table in memory and populate it, then use the results to bulk insert into SQL Server? Does this work when using an Oracle data source?
- ANY OTHER IDEAS?