Low throughput in only one partition out of 64 in Informatica SQ - informatica-powercenter

We are fetching millions of records from a source DB using Informatica PowerCenter 9.6.1. We have partitioned the table and are fetching the data through those partitions.
However, out of 64 partitions, only 1 partition has low throughput. Our CPU has 72 threads and nothing is running in parallel to the session.
We understand that 64 partitions is a lot, but our server is capable of handling it. Low throughput in only 1 random partition is something we don't understand.

It looks like the thread for that partition might be getting killed partway through due to a network issue, so Informatica loses connectivity.

Related

How can I increase the background merge threads in ClickHouse

I'm using ClickHouse to store stock prices on a Win10 machine via Docker/WSL2.
My computer has 128GB RAM and 12 cores, so I set WSL2 to use 64GB and 8 cores, which I can confirm from inside the Docker container.
However, after inserting half a billion records into ClickHouse, I continuously see ~16-20 background merge threads running when checking system.metrics, and they only take ~200% CPU and 1GB RAM. I tried to delete some rows from one of the tables, and the mutation job never starts.
I've set the config.xml to be like
<merge_tree>
    <parts_to_delay_insert>1000</parts_to_delay_insert>
    <parts_to_throw_insert>1200</parts_to_throw_insert>
    <max_part_loading_threads>20</max_part_loading_threads>
</merge_tree>
and users.xml to be like
<max_memory_usage>60000000000</max_memory_usage>
<background_pool_size>64</background_pool_size>
<background_move_pool_size>32</background_move_pool_size>
The number of partitions is set to around 1,500 for each table, and an insert happens every time 100K records come in.
Is there anything I did wrong? How can I get the ClickHouse engine to use the maximum resources available?
Thanks!
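
For reference, the merge and mutation activity described above can be watched from ClickHouse's system tables. A minimal sketch using JDBC (the clickhouse-jdbc driver, host, and port are assumptions, not from the question):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MergeWatch {
    public static void main(String[] args) throws Exception {
        // Assumes the clickhouse-jdbc driver is on the classpath and the
        // server is reachable on the default HTTP port.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:clickhouse://localhost:8123/default");
             Statement st = conn.createStatement()) {
            // Background merges currently running.
            try (ResultSet rs = st.executeQuery(
                     "SELECT table, elapsed, progress FROM system.merges")) {
                while (rs.next()) {
                    System.out.printf("merge %s: %.0fs elapsed, %.1f%%%n",
                        rs.getString(1), rs.getDouble(2), rs.getDouble(3) * 100);
                }
            }
            // Mutations (e.g. ALTER ... DELETE) that have not finished yet.
            try (ResultSet rs = st.executeQuery(
                     "SELECT table, command FROM system.mutations WHERE is_done = 0")) {
                while (rs.next()) {
                    System.out.println("pending mutation on " + rs.getString(1)
                        + ": " + rs.getString(2));
                }
            }
        }
    }
}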

Why are queries getting progressively slower when using Postgres with Spring Batch?

I'm running a job using Spring Batch 4.2.0 with Postgres (11.2) as the backend, all wrapped in a Spring Boot app. I have 5 steps, and each uses a simple partitioning strategy to divide the data by id ranges and read it into partitions (which are processed by separate threads). I have about 18M rows in the table; each step reads all 18M rows, changes a few fields, and writes them back. The issue I'm facing is that the queries that pull data into each thread scan by id range, like
select field_1, field_2, field_66 from table where id >= 1 and id < 10000.
In this case each thread processes 10,000 rows at a time. When there's no traffic, the query takes less than a second to read all 10,000 rows. But when the job runs, there are about 70 threads reading that data in, and it gets progressively slower, up to almost a minute and a half. Any ideas where to start troubleshooting this?
I do see autovacuum running in the background for almost the whole duration of the job. The app definitely has enough memory to hold all that data (about 6GB max heap). Postgres has shared_buffers of 2GB and max_wal_size of 2GB, but I'm not sure whether that in itself is sufficient. Another thing I see is loads of COMMIT queries hanging around when checking pg_stat_activity, usually as many as the number of partitions. So, instead of 70 connections being used by 70 partitions, there are 140 connections used up, with 70 of them running COMMIT. As time progresses these COMMITs get progressively slower too.
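
For context, the id-range split described above is typically implemented with a custom Partitioner. A minimal sketch for Spring Batch 4.x (class name, context key names, and the fixed-range math are assumptions, not the asker's code):

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Splits [minId, maxId] into contiguous ranges, one per partition/thread.
public class IdRangePartitioner implements Partitioner {

    private final long minId;
    private final long maxId;

    public IdRangePartitioner(long minId, long maxId) {
        this.minId = minId;
        this.maxId = maxId;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        long rangeSize = (maxId - minId + 1 + gridSize - 1) / gridSize; // ceiling
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext ctx = new ExecutionContext();
            long from = minId + i * rangeSize;               // inclusive
            long to = Math.min(from + rangeSize, maxId + 1); // exclusive
            ctx.putLong("fromId", from);
            ctx.putLong("toId", to);
            partitions.put("partition" + i, ctx);
        }
        return partitions;
    }
}

Each partition's reader would then bind fromId/toId into a query of the form shown above (where id >= :fromId and id < :toId).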
You are probably hitting https://github.com/spring-projects/spring-batch/issues/3634.
This issue has been fixed; the fix will be part of version 4.2.3, planned for release this week.

Oracle Change Notification, performance overhead?

Is there any performance overhead to using Oracle Change Notification on a fairly large database table on which 5k-8k operations are performed daily?
After running it continuously for two days, I have found a few 'java.lang.IndexOutOfBoundsException' errors.
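
For reference, a typical registration with the Oracle JDBC driver looks roughly like the sketch below (connection details and the table name are placeholders; the listener body is a stub):

import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleStatement;
import oracle.jdbc.dcn.DatabaseChangeEvent;
import oracle.jdbc.dcn.DatabaseChangeListener;
import oracle.jdbc.dcn.DatabaseChangeRegistration;

public class DcnRegistrationSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        OracleConnection conn = (OracleConnection) DriverManager.getConnection(
            "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");

        Properties props = new Properties();
        props.setProperty(OracleConnection.DCN_NOTIFY_ROWIDS, "true");
        DatabaseChangeRegistration dcr =
            conn.registerDatabaseChangeNotification(props);

        dcr.addListener(new DatabaseChangeListener() {
            @Override
            public void onDatabaseChangeNotification(DatabaseChangeEvent event) {
                // Stub: a real listener would inspect
                // event.getTableChangeDescription() here.
                System.out.println("change: " + event);
            }
        });

        // Tie a query to the registration: rows touched by this query
        // become the registered set that triggers notifications.
        Statement stmt = conn.createStatement();
        ((OracleStatement) stmt).setDatabaseChangeRegistration(dcr);
        ResultSet rs = stmt.executeQuery("SELECT id FROM big_table"); // placeholder table
        while (rs.next()) { /* consume to complete the registration */ }
        rs.close();
        stmt.close();
    }
}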
8k operations per day is trivial; my 10-year-old laptop can do that. I suspect your Java error is not performance/overhead related.

PostgreSQL dropping CONSTRAINT on large table extremely slow

We have a database table in PostgreSQL 9.2 running on Windows with approximately 750 million rows in it. We dropped an FK constraint on a column, but the statement has now been running for 48 hours and is still not complete.
The server has 32GB RAM and 16 CPUs. Unfortunately, we did not increase maintenance_work_mem before running the SQL statement, so it is still set at 256MB. The resources on the machine are not coming close to their maximum: CPU usage is below 3%, 80% of RAM is free, and disk I/O does not go above 5MB/s, even though the machine can easily exceed 100MB/s.
Why does it take this long to drop a FK CONSTRAINT?
Is there a way to increase the performance of this statement execution whilst it is running?
What is the most efficient way of adding a FK CONSTRAINT to a table of this size?
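
Not stated in the question, but the near-idle CPU and disk are at least consistent with the ALTER TABLE waiting on a lock rather than doing work, which is worth ruling out. A minimal JDBC sketch for checking this on 9.2 (connection details are placeholders; `waiting` is the pre-9.6 column name, replaced by wait_event_type/wait_event in later versions):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LockCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost:5432/mydb", "user", "password");
             Statement st = conn.createStatement();
             // List sessions that are blocked on a lock or actively working.
             ResultSet rs = st.executeQuery(
                 "SELECT pid, waiting, state, query " +
                 "FROM pg_stat_activity WHERE waiting OR state <> 'idle'")) {
            while (rs.next()) {
                System.out.printf("pid=%d waiting=%b state=%s query=%s%n",
                    rs.getInt(1), rs.getBoolean(2),
                    rs.getString(3), rs.getString(4));
            }
        }
    }
}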

HBase concurrency making it slow

I have 1 master server and 5 region servers, each with 200 GB of disk space and 16 GB of RAM. I created a table in HBase which has 10 million records. I am using HBase 0.96 on Hadoop 2.
Table Name - sh_self_profiles
column family - profile
In this table, we have 30 columns in each row.
When I get a single column value from HBase, it takes around 10 ms. My problem is that when I send 100 or more concurrent requests, the time slowly accumulates and increases to more than 400 ms instead of each request completing in 10 ms. When 100 requests are sent sequentially, each takes only 10 ms.
One thing that you should check is how well distributed your table is.
You can do this by going to the HBase master web console (http://<master-host>:60010), where you will be able to see how many regions you have for your table. If you did nothing special on table creation, you could easily have only one or two regions, which means that all the requests are being directed to a single region server.
If this is the case, you can recreate your table with pre-split regions (I would suggest a multiple of 5, such as 15 or 20) and make sure that the concurrent gets you are doing are spread evenly over the row-key space, as in the sketch below.
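
A minimal sketch of such a pre-split create with the 0.96-era Java API (table and family names are from the question; the split points are assumptions and must match your actual row-key distribution):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc =
            new HTableDescriptor(TableName.valueOf("sh_self_profiles"));
        desc.addFamily(new HColumnDescriptor("profile"));

        // Hypothetical split points: each byte[] is the first row key of
        // a new region, so choose values that divide your keys evenly.
        byte[][] splits = new byte[][] {
            Bytes.toBytes("2"), Bytes.toBytes("4"),
            Bytes.toBytes("6"), Bytes.toBytes("8"),
        };
        admin.createTable(desc, splits);
        admin.close();
    }
}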
Also, please check how much RAM you have allocated to the region server; you might need to increase it from the default. If you are not running anything other than the HBase Region Server on those machines, you could probably increase it to 8GB.
Other than that, you could also adjust the default for hbase.regionserver.handler.count.
I hope this helps.
Which client are you using? Are you using the standard Java client, the Thrift client, the HTTP REST client, or something else? If your use case is a high amount of random reads of single column values, I highly recommend you try asynchbase as it is much faster than the standard synchronous Java client.
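
For illustration, a single-column random read with asynchbase might look roughly like this (the ZooKeeper quorum spec, row key, and column name are placeholders; table and family are from the question):

import java.util.ArrayList;

import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;

public class AsyncGet {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper quorum spec.
        final HBaseClient client = new HBaseClient("zk-host");
        try {
            GetRequest get = new GetRequest("sh_self_profiles", "some-row-key")
                .family("profile")
                .qualifier("some_column");
            // The client is non-blocking; joinUninterruptibly() is used here
            // only to keep the example synchronous.
            ArrayList<KeyValue> row = client.get(get).joinUninterruptibly();
            for (KeyValue kv : row) {
                System.out.println(new String(kv.value()));
            }
        } finally {
            client.shutdown().joinUninterruptibly();
        }
    }
}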
