PostgreSQL 9.6: writing data from a remote Oracle DB is slow

I'm using the oracle_fdw extension in my PostgreSQL database. I'm trying to copy the data of many tables in the Oracle database into my PostgreSQL tables. I do so by running insert into local_postgresql_temp select * from remote_oracle_table. The performance of this operation is very slow, so I tried to check the reason for it and maybe choose a different alternative.
1) First method - insert into local_postgresql_table select * from remote_oracle_table. This generated a total disk write of 7 MB/s and an actual disk write of 4 MB/s (iotop). For a 32 GB table it took 2 hours and 30 minutes.
2) Second method - copy (select * from oracle_remote_table) to /tmp/dump. This generates a total disk write of 4 MB/s and an actual disk write of 100 KB/s. The COPY utility is supposed to be very fast, but here it seems very slow.
-When I run copy from the local dump, the reading is very fast, 300 MB/s.
-I created a 32 GB file on the Oracle server and used scp to copy it, and it took only a few minutes.
-The WAL directory is located on a different file system.
The parameters I assigned:
min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 5
max_worker_processes = 8
effective_cache_size = 12GB
work_mem = 128MB
maintenance_work_mem = 4GB
shared_buffers = 2000MB
RAM : 16G
CPU CORES : 8
How can I increase the write speed? How can I get the data faster from the Oracle database into my PostgreSQL database?
I also ran perf on the entire process (the perf output is not reproduced here).

I'd concentrate on the bits you said were fast:
-I created a 32 GB file on the Oracle server and used scp to copy it, and it took only a few minutes.
-When I run copy from the local dump, the reading is very fast, 300 MB/s.
I'd suggest you combine these two. Use a dump tool (or SQL*Plus) to export the data from Oracle into a file that PostgreSQL's COPY command can read. You could generate the binary COPY format directly, but it's a bit tricky; generating a CSV or similarly delimited version shouldn't be hard. An example of that is in "How do I spool to a CSV formatted file using SQLPLUS?"
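A minimal sketch of that approach, with the file name and spool settings invented for illustration: a SQL*Plus script spools a delimited dump on the Oracle side, and COPY loads it on the PostgreSQL side. Note that colsep does not quote or trim values, so messy data may need the columns concatenated explicitly (or SET MARKUP CSV ON on Oracle 12.2+).
-- export.sql, run on the Oracle side, e.g.: sqlplus -s user/pass@ORACLE_SID @export.sql
set pagesize 0
set feedback off
set termout off
set trimspool on
set linesize 32767
set colsep ','
spool /tmp/remote_oracle_table.csv
select * from remote_oracle_table;
spool off
exit

-- PostgreSQL side (psql), after copying the file over; the target table must already exist.
-- COPY ... FROM reads a server-side file; use psql's \copy instead if the file sits on the client.
COPY local_postgresql_table FROM '/tmp/remote_oracle_table.csv' WITH (FORMAT csv);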

Related

Oracle Datapump Export is very slow

My Oracle 11.2.0.3 FULL DATABASE Data Pump export is very slow. When I query V$SESSION_LONGOPS:
SELECT USERNAME, OPNAME, TARGET_DESC, SOFAR, TOTALWORK, MESSAGE, SYSDATE,
       ROUND(100*SOFAR/TOTALWORK,2)||'%' COMPLETED
FROM V$SESSION_LONGOPS
WHERE SOFAR/TOTALWORK != 1
it shows me 2 records: in OPNAME, one contains SYS_EXPORT_FULL_XX and the other "Rowid Range Scan", and the message for the latter is
Rowid Range Scan: MY_SCHEMA.BIG_TABLE: 28118329 out of 30250532 Blocks done
and it takes hours and hours.
MY_SCHEMA.BIG_TABLE is 220 GB in size and has 2 CLOB columns.
If you have CLOBs in the table it will take a long time to export, because CLOB export doesn't parallelize. Exactly what phase are you stuck in? Could you paste the last lines from the log file or get a status from Data Pump?
There are some best practices that you could try out (a sketch follows below):
-SecureFile LOBs can be faster than BasicFile LOBs. That is yet another reason for moving to SecureFile LOBs.
-You could try increasing STREAMS_POOL_SIZE to at least 256 MB, although I don't think that is the reason.
-Use the PARALLEL option and set it to 2 x CPU cores. Never export statistics - it is better to either export them using DBMS_STATS or regather them at the target database.
Regards,
Daniel
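As a rough illustration of the STREAMS_POOL_SIZE suggestion and of pulling a job status from the dictionary (the 256M value is only an example, and DEGREE reflects the PARALLEL setting of a running job):
-- Give the Streams pool a fixed floor; size it against your own SGA.
ALTER SYSTEM SET streams_pool_size = 256M SCOPE=BOTH;

-- See what the Data Pump job is doing and how many workers it has.
SELECT owner_name, job_name, operation, job_mode, state, degree
FROM dba_datapump_jobs;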
For 11g and 12cR1, waits on the Streams AQ enqueue are a common culprit for this as well. If that is the issue (and it is a very common one), running ALTER SYSTEM SET EVENTS 'IMMEDIATE TRACE NAME MMAN_CREATE_DEF_REQUEST LEVEL 6' will help.

Slow cross-loading from oracle (oracle-fdw) into PostgreSQL

I created multiple posts in the forum about the performance problem that I have, but now, after I have run some tests and gathered all the needed information, I'm creating this post.
I have performance issues with two big tables. Those tables are located on a remote Oracle database. I'm running the query:
insert into local_postgresql_table select * from oracle_remote_table.
The first table has 45M records and its size is 23 GB. The import of the data from the remote Oracle database takes 1 hour and 38 minutes. After that I create 13 regular indexes on the table, which takes 10 minutes per index -> 2 hours and 10 minutes in total.
The second table has 29M records and its size is 26 GB. The import of the data from the remote Oracle database takes 2 hours and 30 minutes. The creation of the indexes takes 1 hour and 30 minutes (some are single-column indexes that take 5 min each, and some are multi-column indexes that take 11 min each).
These operations are very problematic for me and I'm searching for a solution to improve the performance. The parameters I assigned:
min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 5
max_worker_processes = 8
effective_cache_size = 2500MB
work_mem = 16MB
maintenance_work_mem = 1500MB
shared_buffers = 2000MB
RAM : 5G
CPU CORES : 8
-I tried running select count(*) from the table in Oracle and in PostgreSQL; the running times are almost equal.
-Before importing the data I drop the indexes and the constraints.
-I tried copying a 23 GB file from the Oracle server to the PostgreSQL server and it took 12 minutes.
Please advise how I can continue. How can I improve something in this operation?
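One option worth trying, sketched here with a made-up numeric key column (id) and assuming your oracle_fdw version supports the prefetch table option, is to raise the per-round-trip fetch size and to split the load into range-restricted inserts that run in separate sessions at the same time:
-- Fetch more rows per round trip; check your oracle_fdw documentation for the option name and limits.
-- Use SET instead of ADD if the option is already defined on the foreign table.
ALTER FOREIGN TABLE oracle_remote_table OPTIONS (ADD prefetch '1000');

-- Run each of these in its own psql session so the Oracle-side fetches happen concurrently.
-- "id" is a hypothetical numeric primary key; split on whatever key your table really has.
INSERT INTO local_postgresql_table SELECT * FROM oracle_remote_table WHERE id BETWEEN 1 AND 15000000;
INSERT INTO local_postgresql_table SELECT * FROM oracle_remote_table WHERE id BETWEEN 15000001 AND 30000000;
INSERT INTO local_postgresql_table SELECT * FROM oracle_remote_table WHERE id BETWEEN 30000001 AND 45000000;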

H2 database Load csv data faster

I want to load about 2 million rows from a CSV formatted file into a database, run some SQL statements for analysis, and then remove the data. The file is 2 GB in size. The data is web server log messages.
I did some research and found that the H2 in-memory database seems to be faster, since it keeps the data in memory. When I tried to load the data I got an OutOfMemory error message because of 32-bit Java. I'm planning to try with 64-bit Java.
I am looking for all the optimization options to load the data quickly and run the SQL.
test.sql
CREATE TABLE temptable (
f1 varchar(250) NOT NULL DEFAULT '',
f2 varchar(250) NOT NULL DEFAULT '',
f3 varchar(250) NOT NULL DEFAULT '' -- response time
) as select * from CSVREAD('log.csv');
I'm running it like this with 64-bit Java:
java -Xms256m -Xmx4096m -cp h2*.jar org.h2.tools.RunScript -url 'jdbc:h2:mem:test;LOG=0;CACHE_SIZE=65536;LOCK_MODE=0;UNDO_LOG=0' -script test.sql
If any other database is available to use on AIX, please let me know.
Thanks
If the CSV file is 2 GB, then it will need more than 4 GB of heap memory when using a pure in-memory database. The exact memory requirements depend a lot on how redundant the data is. If the same values appear over and over again, the database will need less memory, as common objects are re-used (no matter whether it's a string, long, timestamp, ...).
Please note that LOCK_MODE=0, UNDO_LOG=0, and LOG=0 are not needed when using create table as select. In addition, CACHE_SIZE does not help when using the mem: prefix (but it does help for in-memory file systems).
I suggest trying the in-memory file system first (memFS: instead of mem:), which is slightly slower than mem: but usually needs less memory:
jdbc:h2:memFS:test;CACHE_SIZE=65536
If this is not enough, try the compressed in-memory mode (memLZF:), which is again slower but uses even less memory:
jdbc:h2:memLZF:test;CACHE_SIZE=65536
If this is still not enough, I suggest trying the regular persistent mode and seeing how fast that is:
jdbc:h2:~/data/test;CACHE_SIZE=65536
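For completeness, a minimal sketch of the load-analyze-drop workflow from the question; the column names and the grouping query are invented, and the CSVREAD options string is only needed if the defaults (comma separator, header row as column names) don't match the file:
-- Load the log into a table (works the same with mem:, memFS:, memLZF:, or a persistent URL).
CREATE TABLE templog AS SELECT * FROM CSVREAD('log.csv', null, 'charset=UTF-8 fieldSeparator=,');

-- Example analysis: rows per value of the first field (hypothetical column name f1).
CREATE INDEX idx_templog_f1 ON templog(f1);
SELECT f1, COUNT(*) AS hits FROM templog GROUP BY f1 ORDER BY hits DESC;

-- Remove the data when done.
DROP TABLE templog;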

Oracle Data Pump Export (expdp) locks table (or something similar)

I must export data from a partitioned table with a global index that must be online all the time, but I am having trouble doing that.
For the data export I am using Data Pump Export - expdp - and I am exporting only one partition, the oldest one, not the active one.
My expdp command exports the correct data and looks like this:
expdp user/pass#SID DIRECTORY=EXP_DIR
DUMPFILE=part23.dmp TABLES=SCHEMA_NAME.TABLE_NAME:TABLE_PARTITION_23
The application that uses the database has a connection timeout of 10 seconds. This parameter can't be changed. If INSERT queries do not finish within 10 seconds, the data is written to a backup file.
My problem is that during the export process, which lasts a few minutes, some data ends up in the backup file and not in the database. I want to know why, and how to avoid it.
Partitions are organized weekly, and I am keeping 4 partitions active (the last 4 weeks). Each partition is up to 3 GB.
I am using Oracle 11.2.
Are you licensed to use the AWR? If so, do you have an AWR report for the snapshot when the timeouts occurred?
Oracle readers don't block writers and there would be no reason for an export process to lock anything that would impact new inserts.
Is this a single INSERT operation that has a timeout of 10 seconds (i.e. you are inserting a large number of rows in a single INSERT statement)? Or is this a batch of individual inserts such that some of the inserts can succeed in the 10 second window and some can fail? You say that "some data ends up in the backup file" but I'm not sure which of these scenarios is more accurate.
During normal operations, how close are you to the 10 second time-out?
Is it possible that the system is I/O bound and that doing the export increases the load on the I/O system causing all operations to be slower? If you've got an I/O bottleneck and you add an export process that has to read a 3 GB partition and write that data to disk (presumably also on the database server), that could certainly cause a general slowdown. If you're reasonably close to the 10 second time-out already, that could certainly push you over the edge.
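If you are licensed for the Diagnostics Pack, one way to see whether the timeouts line up with I/O waits is to sample the active session history for the window in which the export ran (the timestamps below are placeholders):
SELECT wait_class, event, COUNT(*) AS samples
FROM v$active_session_history
WHERE sample_time BETWEEN TIMESTAMP '2013-01-07 10:00:00'
                      AND TIMESTAMP '2013-01-07 10:15:00'
GROUP BY wait_class, event
ORDER BY samples DESC;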

Resolving ORA-4031 "unable to allocate x bytes of shared memory"

I need some pointers on how to diagnose and fix this problem. I don't know if this is a simple server setup problem or an application design problem (or both).
Once or twice every few months this Oracle XE database reports ORA-4031 errors. It doesn't point to any particular part of the SGA consistently. A recent example is:
ORA-04031: unable to allocate 8208 bytes of shared memory ("large pool","unknown object","sort subheap","sort key")
When this error comes up, if the user keeps refreshing, clicking on different links, they'll generally get more of these kinds of errors at different times, then soon they'll get "404 not found" page errors.
Restarting the database usually resolves the problem for a while; then a month or so later it comes up again, but rarely at the same location in the program (i.e. it doesn't seem linked to any particular portion of code; the above example error was raised from an Apex page which was sorting 5000+ rows from a table).
I've tried increasing sga_max_size from 140M to 256M and hope this will help things. Of course, I won't know if this has helped since I had to restart the database to change the setting :)
I'm running Oracle XE 10.2.0.1.0 on an Oracle Enterprise Linux 5 box with 512 MB of RAM. The server only runs the database, Oracle Apex (v3.1.2) and the Apache web server. I installed it with pretty much all default parameters and it's been running quite well for a year or so. Most issues I've been able to resolve myself by tuning the application code; it's not intensively used and isn't a business-critical system.
These are some current settings I think may be relevant:
pga_aggregate_target 41,943,040
sga_max_size 268,435,456
sga_target 146,800,640
shared_pool_reserved_size 5,452,595
shared_pool_size 104,857,600
If it's any help here's the current SGA sizes:
Total System Global Area 268435456 bytes
Fixed Size 1258392 bytes
Variable Size 251661416 bytes
Database Buffers 12582912 bytes
Redo Buffers 2932736 bytes
Even though you are using ASMM, you can set a minimum size for the large pool (MMAN will not shrink it below that value).
You can also try pinning some objects and increasing SGA_TARGET; for example:
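(The values and the package name below are only illustrative; on some installations DBMS_SHARED_POOL has to be created first by running ?/rdbms/admin/dbmspool.sql.)
-- Floor for the large pool that ASMM will not shrink below; size it against your own SGA.
ALTER SYSTEM SET large_pool_size = 32M SCOPE=BOTH;
-- Raise the automatic SGA target; it must stay at or below sga_max_size (256M here).
ALTER SYSTEM SET sga_target = 200M SCOPE=BOTH;
-- Pin a package that keeps getting aged out of the shared pool (example object).
EXEC DBMS_SHARED_POOL.KEEP('SYS.DBMS_UTILITY', 'P');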
Don't forget about fragmentation.
If you have a lot of traffic, your pools can become fragmented, and even if you have several MB free, there may be no free chunk larger than 4 KB.
Check the size of the largest free chunk with a query like:
select
'0 (<140)' BUCKET, KSMCHCLS, KSMCHIDX,
10*trunc(KSMCHSIZ/10) "From",
count(*) "Count" ,
max(KSMCHSIZ) "Biggest",
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ<140
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 10*trunc(KSMCHSIZ/10)
UNION ALL
select
'1 (140-267)' BUCKET,
KSMCHCLS,
KSMCHIDX,
20*trunc(KSMCHSIZ/20) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 140 and 267
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 20*trunc(KSMCHSIZ/20)
UNION ALL
select
'2 (268-523)' BUCKET,
KSMCHCLS,
KSMCHIDX,
50*trunc(KSMCHSIZ/50) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 268 and 523
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 50*trunc(KSMCHSIZ/50)
UNION ALL
select
'3-5 (524-4107)' BUCKET,
KSMCHCLS,
KSMCHIDX,
500*trunc(KSMCHSIZ/500) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ between 524 and 4107
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 500*trunc(KSMCHSIZ/500)
UNION ALL
select
'6+ (4108+)' BUCKET,
KSMCHCLS,
KSMCHIDX,
1000*trunc(KSMCHSIZ/1000) ,
count(*) ,
max(KSMCHSIZ) ,
trunc(avg(KSMCHSIZ)) "AvgSize",
trunc(sum(KSMCHSIZ)) "Total"
from
x$ksmsp
where
KSMCHSIZ >= 4108
and
KSMCHCLS='free'
group by
KSMCHCLS, KSMCHIDX, 1000*trunc(KSMCHSIZ/1000);
Code from
All of the current answers are addressing the symptom (shared memory pool exhaustion) and not the problem, which is likely not using bind variables in your SQL / JDBC queries, even when it does not seem necessary to do so. Passing queries without bind variables causes Oracle to "hard parse" the query each time, determining its execution plan, etc.
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::p11_question_id:528893984337
Some snippets from the above link:
"Java supports bind variables, your developers must start using prepared statements and bind inputs into it. If you want your system to ultimately scale beyond say about 3 or 4 users -- you will do this right now (fix the code). It is not something to think about, it is something you MUST do. A side effect of this - your shared pool problems will pretty much disappear. That is the root cause. "
"The way the Oracle
shared pool (a very important shared memory data structure)
operates is predicated on developers using bind variables."
" Bind variables are SO MASSIVELY important -- I cannot in any way shape or form OVERSTATE their importance. "
The following are not needed, as they do not fix the error:
ps -ef|grep oracle
Find the smon and kill the pid for it
SQL> startup mount
SQL> create pfile from spfile;
Restarting the database will flush your pool, but that addresses a symptom, not the problem.
Fix your large_pool so it cannot go lower than a certain point, or add memory and set a higher maximum memory.
This is an Oracle bug: a memory leak in the shared pool, most likely related to a database managing lots of partitions.
Solution: in my opinion no patch exists; check with Oracle support. You can try using subpools or enabling/disabling AMM ...
Error
ORA-04031: unable to allocate 4064 bytes of shared memory ("shared pool","select increment$,minvalue,m...","sga heap(3,0)","kglsim heap")
Solution (by nepasoft nepal):
1.-
ps -ef|grep oracle
2.- Find the smon and kill the pid for it
3.-
SQL> startup mount
ORACLE instance started.
Total System Global Area 4831838208 bytes
Fixed Size 2027320 bytes
Variable Size 4764729544 bytes
Database Buffers 50331648 bytes
Redo Buffers 14749696 bytes
Database mounted.
4.-
SQL> alter system set shared_pool_size=100M scope=spfile;
System altered.
5.-
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
6.-
SQL> startup
ORACLE instance started.
Total System Global Area 4831838208 bytes
Fixed Size 2027320 bytes
Variable Size 4764729544 bytes
Database Buffers 50331648 bytes
Redo Buffers 14749696 bytes
Database mounted.
Database opened.
7.-
SQL> create pfile from spfile;
File created.
SOLVED
