I have a performance problem with my process. It's an asynchronous task launched from a CMT bean (on a JBoss server).
Each iteration performs 1 update and 3 inserts to my DB via Hibernate. The process is split into a new transaction every 100 iterations.
Flush is called on the EntityManager after every update/insert.
While the performance of the first batch is satisfying (around 5-8 s), it slows down drastically over time. The 30th batch takes around 30 s to finish, and later batches grow to over 2 minutes each.
I tried switching FlushModeType to COMMIT, manually clearing/closing EntityManagers, and clearing the EntityManager cache; I looked for memory leaks and can't find the reason for this slowdown.
I measured the execution time of small pieces of code, and every piece involving the database connection slows down over time. I understand that a transaction slows down as more entities are processed, but why is a new transaction also slower?
The latest run consists of 250,000 iterations (2,500 transactions in 1 thread) and takes forever to finish.
If needed I'll provide more information. Any help would be appreciated.
I tried simplifying this code to do just 1 Hibernate insert and no other operations, and it still slows down over time. This is an abstract pseudo-view of what's going on inside:
Bean1

@Asynchronous
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void mainTask() {
    while (/* more work */) {
        bean2.subTask(); // each call runs in its own new transaction
    }
}

Bean2

@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void subTask() {
    for (int i = 0; i < 100; i++) {
        // 3 inserts
        // 1 update
        // (flush after each statement)
    }
}
I'm not sure my suggestions will be accurate, or you may have tried them already, but I'd like to give it a try. Since you mentioned that the DB connection is the bottleneck, I'll go after that.
After reading the question, I find that the time taken by a transaction is proportional to the iteration number. So it looks like entities created in the first iteration are being sent to Hibernate again in later iterations.
For example, in the 4th iteration, entities created in the 1st, 2nd and 3rd iterations are also being sent for update, or are reaching Hibernate somehow.
That could be the reason for the degradation of performance as iterations progress, as the number of records to be updated/inserted/selected increases with each iteration.
I can think of the following possibilities off the top of my head:
The Hibernate session used in the first iteration is being reused until the end. Because of this, entities created in the first iteration are updated again in later iterations as well. I read that you tried closing entity managers etc., but still, please check where the session actually ends. You can create a new session in each transaction, or remove the entities created in the session after each transaction.
The list which filters records for each iteration is returning already-processed records. That is, the list should send records 300 to 399 in the 3rd iteration, but records 0 to 399 are being sent to the transaction.
If you're using HQL, try using a named query. The last time I used Hibernate (about 8 months back), I noticed that with HQL the number of objects loaded by Hibernate was much higher than with a named query.
Hibernate provides a way to print the actual SQL query/parameters sent to the DB, so you can check the query that is actually executed. Link for this:
How to print a query string with parameter values when using Hibernate
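One common way to address the first possibility (the persistence context growing across iterations) is to flush and clear it at fixed batch boundaries. Below is a minimal sketch; the `Session` interface here is a hypothetical stand-in for the JPA `EntityManager` (only `persist`/`flush`/`clear` matter for the pattern), not Hibernate's real `Session` API:

```java
import java.util.List;

// A hypothetical stand-in for the JPA EntityManager; only the three
// operations relevant to the pattern are modelled.
interface Session {
    void persist(Object entity);
    void flush();
    void clear();
}

class BatchWriter {
    static final int BATCH_SIZE = 100;

    // Persist all entities, flushing and clearing every BATCH_SIZE items so
    // the persistence context never grows beyond one batch.
    // Returns the number of full-batch flushes performed.
    static int write(Session session, List<?> entities) {
        int fullBatches = 0;
        for (int i = 0; i < entities.size(); i++) {
            session.persist(entities.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                session.flush();  // push the pending SQL to the DB
                session.clear();  // detach everything persisted so far
                fullBatches++;
            }
        }
        session.flush();          // write out any trailing partial batch
        session.clear();
        return fullBatches;
    }
}
```

With this shape, the cost of dirty-checking stays constant per batch instead of growing with the total number of entities seen so far.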
We faced similar problems on a couple of occasions while executing batch programs with JPA.
The only solution we could find was to use the JDBC API for batch programs that involve a lot of processing.
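For reference, the plain-JDBC version of such a batch usually goes through `PreparedStatement.addBatch()` with a periodic `executeBatch()`. A sketch, with made-up table and column names, and a batch size of 100 to match the question:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchInsert {
    static final int BATCH_SIZE = 100;

    // Pure helper: how many executeBatch() round trips n rows will need.
    static int batchCount(int n) {
        return (n + BATCH_SIZE - 1) / BATCH_SIZE;
    }

    // Insert all rows using JDBC batching: one round trip per 100 rows
    // instead of one per row. "item"/"name" are illustrative names.
    static void insertAll(Connection conn, List<String> names) throws SQLException {
        String sql = "INSERT INTO item (name) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) ps.executeBatch();
        }
    }
}
```

Because there is no persistence context at all, memory use stays flat no matter how many rows the batch writes.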
It turns out it was a Java problem: we tested our application on different configurations, and it works fine when our JBoss runs on Java 1.7 (as opposed to 1.6). Every batch finishes in about 5 seconds. We now face a choice: upgrade our Java to 1.7, or dig deeper and find out what's wrong with our setup on Java 1.6.
I am using a Spring Boot application which is run through a bat file. I use it for many background services which interact with the database, create txt files, etc. All methods are annotated with @Scheduled(fixedDelayString = "${fixedDelay}"), where fixedDelay=2000.
I have configured as many threads through the application.properties file as there are @Scheduled annotations. There is one method which calls three MSSQL database procedures; I have to wait for all three procedures to respond and then proceed further. For this I used Executors.newFixedThreadPool(3) and wait for the responses via future.get(). This is a long-running process that might take 2 or 3 hours or even more. Now I am getting an OutOfMemoryError: GC overhead limit exceeded. First I tried to increase the heap size, but the issue still occurs. And when I look at the heap dump, I find that the OutOfMemoryError arises in this method's execution. Is there some limitation that we cannot run a SQL statement that takes 3 hours or more on a thread?
I took a heap dump, but from it I am not able to find the root cause, because I am just calling the procedures here and not doing any other operation. Please help with this. A heap dump image is attached.
Thanks
I noticed that after inserting 400,000 rows into MySQL using Spring Boot, the heap size and RAM consumption go up but never drop back down afterwards.
If a similar request is made a second time, the heap rises by an additional 5-10%.
What could the problem be?
Why is the heap not cleared after executing a save query?
Is there a problem with the garbage collector? If yes, how can it be fixed?
P.S.: I tried some of the solutions provided in this article, but they did not help:
https://medium.com/quick-code/what-a-recurring-outofmemory-error-taught-me-11f2061063a1
Edited:
I tried to select 1.2 million records in order to fill the heap.
After 20-25 minutes, the heap started to decrease. Why is this happening? Isn't the garbage collector supposed to clear the heap faster? This way, if other requests come in during those 25 minutes, the server will just crash.
Edited 2:
It seems that when trying the same on EC2, the garbage collector does not work at all. It has already happened 3 times that the server just runs out of memory and crashes.
Does anyone know the cause?
I ran 1 @Async thread which started another 2 @Async threads called from different beans. They finished execution, and after that the heap never went down.
I do not have any @Cacheable methods or similar; I am just getting some data from tables, processing it and updating it. There are no input streams, etc.
My Spring Boot application is going to listen to 1 million records an hour from a Kafka broker. The entire processing logic for each message takes 1-1.5 seconds, including a database insert. The broker has 64 partitions, which is also the concurrency of my @KafkaListener.
My current code is only able to process 90 records a minute in a lower environment where I am listening to around 50k records an hour. Below is the code; all other config parameters, such as max.poll.records, are at their default values:
@KafkaListener(id = "xyz-listener", concurrency = "64", topics = "my-topic")
public void listener(String record) {
    // processing logic
}
I do get "it is likely that the consumer was kicked out of the group" 7-8 times an hour. I think both of these issues could be solved by isolating the listener method and multithreading the processing of each message, but I am not sure how to do that.
There are a few points to consider here. First, 64 consumers seems a bit too much for a single application to handle consistently.
Considering each poll by default fetches up to 500 records per consumer at a time, your app might be getting overloaded, causing consumers to be kicked out of the group whenever a single batch takes longer than the 5-minute default of max.poll.interval.ms to process.
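To make that concrete with the numbers from the question (the per-record time is taken from the question; the two config values are Kafka's consumer defaults):

```java
// Back-of-envelope check for the "consumer was kicked out" symptom.
class PollBudget {
    public static void main(String[] args) {
        int maxPollRecords = 500;            // Kafka default max.poll.records
        double maxPollIntervalSec = 300.0;   // default max.poll.interval.ms = 5 min
        double secondsPerRecord = 1.25;      // midpoint of the 1-1.5 s quoted

        // Worst case: one poll returns a full batch of 500 records.
        double batchSeconds = maxPollRecords * secondsPerRecord;
        System.out.println(batchSeconds > maxPollIntervalSec); // true: the consumer is evicted

        // Parallelism needed for 1M records/hour at that per-record cost.
        double workers = 1_000_000 * secondsPerRecord / 3600.0;
        System.out.println(Math.ceil(workers)); // 348.0, far above 64 threads
    }
}
```

So a full poll takes roughly 625 s against a 300 s budget, and the target throughput needs several hundred concurrent workers' worth of capacity, which is why per-record processing cannot keep up.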
So first, I'd consider scaling the application horizontally so that each application handles a smaller amount of partitions / threads.
A second way to increase throughput would be using a batch listener, and handling processing and DB insertions in batches as you can see in this answer.
Using both, you should be processing a sensible amount of work in parallel per app, and should be able to achieve your desired throughput.
Of course, you should load test each approach with different figures to have proper metrics.
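On the batch-listener side, the spring-kafka method would receive a `List<String>` instead of a single record; the splitting of that list into fixed-size DB inserts is plain Java and can be sketched independently of any framework (the `BatchChunker` name and the chunk size are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Splits one polled batch of records into fixed-size chunks, so each chunk
// can become a single multi-row DB insert instead of row-by-row statements.
class BatchChunker {
    static <T> List<List<T>> chunks(List<T> records, int size) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < records.size(); i += size) {
            out.add(records.subList(i, Math.min(i + size, records.size())));
        }
        return out;
    }
}
```

Each chunk then maps to one multi-row INSERT, which is where most of the throughput gain comes from.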
EDIT: Addressing your comment, if you want to achieve this throughput I wouldn't give up on batch processing just yet. If you do the DB operations row by row you'll need a lot more resources for the same performance.
If your rule engine doesn't do any I/O you can iterate each record from the batch through it without losing performance.
About data consistency, you can try some strategies. For example, you can have a lock to ensure that even through a rebalance only one instance will process a given batch of records at a given time - or perhaps there's a more idiomatic way of handling that in Kafka using the rebalance hooks.
With that in place, you can batch load all the information you need to filter out duplicated / outdated records when you receive the records, iterate each record through the rule engine in memory, and then batch persist all results, to then release the lock.
Of course, it's hard to come up with an ideal strategy without knowing more details about the process. The point is by doing that you should be able to handle around 10x more records within each instance, so I'd definitely give it a shot.
We're currently using H2 version 1.4.199 in embedded mode with the default nio file protocol and the MVStore storage engine. The WRITE_DELAY parameter is set to 60 seconds.
We run a batch of about 30,000 insert/update/delete statements within 2 seconds (in one transaction), followed by another batch of a couple of hundred statements only 30 seconds later (in a second transaction). The next attempt to open a DB connection (only 2 minutes later) shows that the DB is corrupt:
File corrupted while reading record: null. Possible solution: use the recovery tool [90030-199]
Since the transactions occur within a minute of each other, we wonder whether the WRITE_DELAY of 60 seconds might be contributing to the issue.
Changing WRITE_DELAY to 60 s (from the default of 0.5 s) will definitely increase your risk of lost transactions, and I do not see a good reason for doing it. It should not cause DB corruption, though. More likely some thread interruptions do that, since you are running a web server and who knows what else in the same JVM. Using the async file store might help in that area, and yes, it is stable enough (how much worse can it get for your app than a database corruption, anyway?).
Does Spring Batch hold connections from the datasource for the whole job's running time?
In general I have long-running steps in a Spring Batch job. During execution, Spring takes a connection from a datasource managed by C3P0, and when a step runs too long, C3P0 reclaims the connection via unreturnedConnectionTimeout, which prevents Spring from finishing its work with the DB.
To manage this, I am considering refactoring the long-running tasklet steps to be chunk-oriented, in the hope that Spring acquires a connection from the datasource only for the duration of a chunk rather than the whole step. If Spring acquires the connection for the whole job, this refactoring will not help.
Spring Batch obtains connections at a number of different points for a number of different reasons, so with the information you've provided I can't give a clear answer to your exact question. That being said, if you have long-running chunks or tasklets, each of those is executed within the scope of a single transaction and therefore needs to be tied to a single connection.
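To illustrate the difference the question is after, here is a plain-Java sketch (not the Spring Batch API) of chunk-oriented processing: the transaction, and with it the connection checkout, is scoped to each chunk rather than to the whole step, so no single connection is held for the step's full duration:

```java
import java.util.List;
import java.util.function.Consumer;

class ChunkRunner {
    // Processes items in chunks; each chunk gets its own transaction,
    // i.e. its own short-lived connection checkout from the pool.
    // Returns how many transactions (connection checkouts) were used.
    static <T> int run(List<T> items, int chunkSize, Consumer<List<T>> writer) {
        int transactions = 0;
        for (int i = 0; i < items.size(); i += chunkSize) {
            // begin transaction: borrow a connection from the pool
            writer.accept(items.subList(i, Math.min(i + chunkSize, items.size())));
            // commit: return the connection well before unreturnedConnectionTimeout
            transactions++;
        }
        return transactions;
    }
}
```

Under this model each connection is borrowed only for one chunk's worth of work, which is exactly what keeps C3P0's unreturnedConnectionTimeout from firing.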