What is the default connection pool size that Spring Boot HikariCP provides when the container loads?
I am using the properties below to set the maximum connection pool size, but I was wondering what the default pool size is if no value is given in the application.properties file.
spring.datasource.hikari.minimumIdle=5
spring.datasource.hikari.maximumPoolSize=20
spring.datasource.hikari.idleTimeout=30000
spring.datasource.hikari.poolName=SpringBootJPAHikariCP
spring.datasource.hikari.maxLifetime=2000000
spring.datasource.hikari.connectionTimeout=30000
Also, if I set the maximum pool size to 100 in application.properties but only use 20 connections, will that affect my application's performance?
maximumPoolSize
Default: 10
The HikariCP documentation lists the default property values:
https://github.com/brettwooldridge/HikariCP
Read about Pool Size here:
Maximum Connection Pool Size
Regarding the maximum pool size, PostgreSQL, for example, recommends the following formula:
pool_size = ((core_count * 2) + effective_spindle_count)
core_count is the number of CPU cores
effective_spindle_count is the number of disks in a RAID
According to that same document:
but we believe it will be largely applicable across databases.
That means the formula can generally be applied to other databases as well.
For Oracle, for example, you can read this article and watch the video.
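The pool sizing formula quoted above can be worked through with a short Python sketch. The function name and the example hardware (4 cores, 1 disk) are illustrative assumptions, not values taken from the HikariCP documentation.

```python
def suggested_pool_size(core_count: int, effective_spindle_count: int) -> int:
    """Apply pool_size = (core_count * 2) + effective_spindle_count."""
    return core_count * 2 + effective_spindle_count


# Hypothetical server: 4 CPU cores and a single disk.
print(suggested_pool_size(core_count=4, effective_spindle_count=1))  # prints 9
```

Under these assumed numbers the result is close to the default maximumPoolSize of 10 mentioned above.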
With the Fluent Bit configuration below, we are getting errors from OpenSearch under heavy load.
HTTP bulk requests to OpenSearch from Fluent Bit (the 429 errors show up as spikes)
Fluent Bit config:
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
DB /var/log/flb_kube.db
Mem_Buf_Limit 400M
storage.type filesystem
Skip_Long_Lines On
Refresh_Interval 1
Rotate_Wait 600
[OUTPUT]
Name es
Match kube.*
Host ${ES_HOST}
Port ${PORT}
Buffer_Size False
AWS_Auth Off
AWS_Role_ARN ${ES_ARN}
AWS_External_ID ${ES_IAMROLE}
HTTP_User ${ES_USER}
HTTP_Passwd ${ES_PASSWD}
tls On
tls.verify Off
Trace_Output ${TRACE_OUTPUT}
Trace_Error On
Replace_Dots On
Index fluentbit
Type flb
AWS_Region ${AWS_REGION}
Logstash_Format On
Logstash_Prefix ${ES_LOGSTASHPREFIX}_app_log
Logstash_DateFormat %Y.%m.%d
Retry_Limit 10
storage.total_limit_size 1G
To resolve this, we upgraded our OpenSearch instance type from r5.xlarge.search (4 nodes) to r5.2xlarge.search (3 nodes), but that did not solve the issue.
We also increased the ES index refresh_interval to 60s, but that didn't help.
We read that Fluent Bit's output to ES can be controlled via buffering, so we decreased Mem_Buf_Limit to 400M, but that didn't help either.
Can someone suggest anything else we can try, or point out what we are missing?
The issue here lies not with Fluent Bit but with OpenSearch/Elasticsearch.
The HTTP 429 errors (es_rejected_execution_exception) in ES occur when more requests are sent to the cluster than its thread pool can handle. OpenSearch allocates separate thread pools to different tasks, with search operations getting a larger share. The option to manually modify thread pool allocation is not available for versions 5.1 and later.
You can try to resolve this in a few ways.
1: Refresh rate (you already did that and it didn't help).
2: Reduce the indexing rate. Try sending logs at a longer interval than you do currently.
3: Upscale (you did and it didn't work either)
You can get an idea of the thread pool sizes from the following formulas.
Number of threads allocated for writes = number of virtual CPUs (your case)
Number of threads allocated for search = ((3 * number of virtual CPUs) / 2) + 1
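As a rough illustration, the two formulas above can be evaluated in Python. The vCPU counts used here (4 for r5.xlarge, 8 for r5.2xlarge) and the helper names are assumptions made for the sketch; check them against your actual instance types.

```python
def write_threads(vcpus: int) -> int:
    """Write threads per node: one per virtual CPU."""
    return vcpus


def search_threads(vcpus: int) -> int:
    """Search threads per node: ((3 * vCPUs) / 2) + 1, using integer division."""
    return (3 * vcpus) // 2 + 1


# Assumed cluster layouts: (node count, vCPUs per node).
clusters = {"4 x r5.xlarge": (4, 4), "3 x r5.2xlarge": (3, 8)}
for name, (nodes, vcpus) in clusters.items():
    print(name,
          "write threads:", nodes * write_threads(vcpus),
          "search threads:", nodes * search_threads(vcpus))
```

Under these assumptions, the upgrade raises the cluster-wide write thread count only from 16 to 24, which may explain why upscaling alone did not remove the 429 errors.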
So, I am guessing your issue here is a large number of shards. You can either reduce the number of shards per index or, if you only see this issue occasionally under extra load, set the replica count to 0 and change it back to the original value once the peak is over.
Check these two links to find out more about optimizing your ES domain.
indexing performance
Best practices
I have a large database (100M rows) indexed by SphinxSearch. Each search takes 0.1-0.5s. However, if I run 10 searches concurrently, they take 20s on average.
Is it the expected behaviour of SphinxSearch?
Should I adjust the config or move to another search engine for concurrency?
My config file is simple:
searchd
{
listen = 9312
listen = 9306:mysql41
pid_file = /var/searchd.pid
read_timeout = 30
log = /var/log/sphinxsearch/searchd.log
query_log = /var/log/sphinxsearch/query.log
}
Is it the expected behaviour of SphinxSearch?
It heavily depends on the number of CPUs. If you have more than 10 physical CPUs, then latency degrading from 0.5 sec to 20 sec when concurrency increases from 1 to 10 is definitely not expected. In that case, first make sure all your CPUs are busy under the concurrent load. If they are not, then, depending on your Sphinx version and multi-tasking mode, configure it to run with more threads.
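To make the dependence on CPU count concrete, here is a deliberately idealized Python model. It assumes each query is purely CPU-bound, needs a fixed amount of CPU time, and that concurrent queries share the cores evenly; real searchd behavior (I/O, locking, threading mode) is not modeled, and the function name is made up for the example.

```python
def estimated_latency(cpu_seconds: float, concurrency: int, cores: int) -> float:
    """Idealized latency when `concurrency` CPU-bound queries share `cores` fairly."""
    return cpu_seconds * max(1.0, concurrency / cores)


# A 0.5 s query run 10 at a time, first on 1 core and then on 10 cores.
print(estimated_latency(0.5, concurrency=10, cores=1))
print(estimated_latency(0.5, concurrency=10, cores=10))
```

In this simplified model even a single core gives about 5 s for ten concurrent 0.5 s queries, so an observed 20 s hints that something besides plain CPU sharing may be involved.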
Should I adjust the config or move to another search engine for concurrency?
I recommend Manticore Search as:
it's open source - https://github.com/manticoresoftware/manticoresearch/
it's the only fork of Sphinx, so if you are familiar with Sphinx in general, migrating shouldn't be a problem
hundreds of bugs have been fixed
the multi-tasking mode is completely different (coroutines)
Background: We are evaluating Hazelcast as an alternative to Redis.
Setup :
3 members in a cluster under a single subnet (production boxes). Each member has ~1.4GB of data.
Near caching is off.
Each member has 1 backup.
The code is deployed as a Spring Boot jar, and the cache is embedded.
VM config: 8 cores, 31GB RAM
The code uses IMap to get and put keys in the cache.
LoadTest : Attempted 18K/s rest API calls to read the data.
But Hazelcast is showing an average get latency of around 3-4 ms, which I feel should be in microseconds, since we already see get latency of that order with our Redis setup.
CPU Load was ~95% during this test.
The member that showed this latency has heap usage of ~60% (committed: 7.85GB, used: 4.68GB). The same is true of all the members in the cluster.
I need help understanding whether my configuration is wrong somewhere, which would explain why I am NOT able to achieve get latency in microseconds.
Config for starting embedded cache:
config.addMapConfig(mapConfig());
NetworkConfig networkConfig = config.getNetworkConfig();
JoinConfig join = networkConfig.getJoin();
join.getMulticastConfig().setEnabled(false);
join.getTcpIpConfig().setEnabled(true).setMembers(
    Arrays.asList(
        "ip1:5701",
        "ip2:5701",
        "ip3:5701"
    )
);
return config;
I was facing this issue in my Spring Boot application, which connects to a DB and MQ and uses the Atomikos transaction manager.
com.atomikos.jms.AtomikosJMSException|Connection pool exhausted - try increasing 'maxPoolSize' and/or 'borrowConnectionTimeout' on the AtomikosConnectionFactoryBean.
com.atomikos.datasource.pool.PoolExhaustedException: ConnectionPool: pool is empty - increase either maxPoolSize or borrowConnectionTimeout
at com.atomikos.datasource.pool.ConnectionPool.waitForAtLeastOneAvailableConnection(ConnectionPool.java:326)
at com.atomikos.datasource.pool.ConnectionPool.findOrWaitForAnAvailableConnection(ConnectionPool.java:144)
at com.atomikos.datasource.pool.ConnectionPool.borrowConnection(ConnectionPool.java:132)
at com.atomikos.datasource.pool.ConnectionPoolWithSynchronizedValidation.borrowConnection(ConnectionPoolWithSynchronizedValidation.java:23)
at com.atomikos.jms.AtomikosConnectionFactoryBean.createConnection(AtomikosConnectionFactoryBean.java:601)
at org.springframework.jms.support.JmsAccessor.createConnection(JmsAccessor.java:196)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.access$100(AbstractPollingMessageListenerContainer.java:77)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer$MessageListenerContainerResourceFactory.createConnection(AbstractPollingMessageListenerContainer.java:490)
at org.springframework.jms.connection.ConnectionFactoryUtils.doGetTransactionalSession(ConnectionFactoryUtils.java:325)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:281)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:245)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076)
at java.lang.Thread.run(Thread.java:748)
I printed the maxPoolSize and found that it was 1. I then came across this page (https://www.atomikos.com/Documentation/ConfiguringJms), where the MaxPoolSize is increased to 5. I tried setting it to 2, and it worked.
AtomikosConnectionFactoryBean xaConnectionFactory = new AtomikosConnectionFactoryBean();
xaConnectionFactory.setXaConnectionFactory(ibmMQXAConnectionFactory);
xaConnectionFactory.setMaxPoolSize(2);
Can someone help me understand what the ideal pool size should be, and what the pool is for?
In order to process messages Atomikos uses DB and JMS connections (in your case).
These connections are taken from pools of available connections. To understand why connection pools are needed, see this link as a starting point: Connection_pool
To put it simply: in order to process one message at a time, Atomikos needs one DB connection and one JMS connection/session. So if you plan to process 10 messages in parallel, each connection pool must have a size of at least 10 (10 for the DB pool and 10 for the JMS pool).
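The sizing rule above can be illustrated with a toy pool in Python. This is not Atomikos code: `ToyPool` and `can_process_in_parallel` are hypothetical names, and the borrow timeout only loosely mirrors the idea behind borrowConnectionTimeout.

```python
import queue


class ToyPool:
    """A minimal fixed-size pool; borrowing blocks until a connection is free."""

    def __init__(self, max_pool_size: int):
        self._free = queue.Queue()
        for i in range(max_pool_size):
            self._free.put(f"conn-{i}")

    def borrow(self, timeout: float) -> str:
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            # Comparable in spirit to a "pool exhausted" error.
            raise RuntimeError("pool exhausted") from None

    def give_back(self, conn: str) -> None:
        self._free.put(conn)


def can_process_in_parallel(max_pool_size: int, parallel_messages: int) -> bool:
    """Check whether every in-flight message can hold its own connection."""
    pool = ToyPool(max_pool_size)
    held = []
    try:
        for _ in range(parallel_messages):
            held.append(pool.borrow(timeout=0.01))
        return True
    except RuntimeError:
        return False
    finally:
        for conn in held:
            pool.give_back(conn)


print(can_process_in_parallel(max_pool_size=10, parallel_messages=10))
print(can_process_in_parallel(max_pool_size=1, parallel_messages=2))
```

The second call fails because two messages in flight need two connections while the pool only holds one, which is the situation the rule is meant to avoid.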
I am using Sidekiq (on Heroku with Puma) to send emails asynchronously and would like to use Redis to keep counters and cache models.
RedisCloud's free plan includes 30 connections to Redis. It is not clear to me how to manage:
redis connections used by Sidekiq
redis connections used in models (caching and counters)
Sidekiq Client size is configured like this:
Sidekiq.configure_client do |config|
config.redis = {url: ENV["REDISCLOUD_URL"], size: 3}
end
If I understood this correctly, Puma forks multiple processes, 2 in my case, which will result in:
2 (Puma Workers) * 3 (size) * 1 (Web Dyno) = 6 connections to redis used to push jobs.
Sidekiq Server
With Sidekiq taking 2 connections (or 5 in version 4), setting a concurrency of 10 would result in a default server pool size of 12 or 15.
If I wanted to use all the remaining available connections (30 - 6 = 24), I could set:
Sidekiq.configure_server do |config|
config.redis = { size: 19 }
end
Total Redis connections would be 19 + 5 (Sidekiq 4) = 24, and using the default concurrency of 25 would be OK.
As Mike Perham stated, the concurrency generally must not be more than (server pool size - 2) * 2.
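The connection arithmetic above can be checked with a small Python sketch. The function names are invented for the example, and the concurrency rule is the one quoted above, taken as given rather than verified against Sidekiq itself.

```python
def client_connections(puma_workers: int, client_pool_size: int, web_dynos: int) -> int:
    """Redis connections used by the web processes to push jobs."""
    return puma_workers * client_pool_size * web_dynos


def max_concurrency(server_pool_size: int) -> int:
    """Upper bound on concurrency per the quoted rule: (pool size - 2) * 2."""
    return (server_pool_size - 2) * 2


PLAN_LIMIT = 30  # connections included in the RedisCloud free plan

used_by_clients = client_connections(puma_workers=2, client_pool_size=3, web_dynos=1)
remaining = PLAN_LIMIT - used_by_clients
print(used_by_clients, remaining, max_concurrency(19))
```

With a server pool size of 19, the quoted rule allows a concurrency of up to 34, so the default of 25 fits within it.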
Now, where it starts to get confusing for me is the use of Redis out of Sidekiq.
# initializers/redis.rb
$redis = Redis.new(:url => uri)
Whenever I use Redis in a model or controller I call like so:
$redis.hincrby("mycounter", "key", 1)
As I understand it, all the puma threads wait on each other on a single Redis connection when $redis.whateverFunction is called.
In the answer to "What is the best way to use Redis in a Multi-threaded Rails environment? (Puma / Sidekiq)", the recommended approach is to use the connection_pool gem, as described in the Sidekiq Wiki https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
require 'connection_pool'
$redis = ConnectionPool.new(size: 10) { Redis.new }
If I understand correctly, in that case $redis.whateverFunction would have its own connection pool of 10, and Sidekiq would have its own connection pool, now limited to a total of 20 Redis connections (30 available in total - 10 model connections), so the Sidekiq client and server sizes would need to be changed.
How do you determine the size of the connection pool (here 10) needed for model/controller Redis connections? Since Redis is single-threaded, how does increasing the connection pool actually increase the performance of Redis operations?
Any thoughts on this would be of great help.
Thx!
Redis is single-threaded, but it is written in C, uses an event loop internally, and handles connections asynchronously, so the number of connections does not affect it much for the same number of requests. It can handle requests faster than your application can generate them (because of network delay, Ruby being slower than compiled and optimized C, etc.), so you do not need to worry about it being single-threaded.
Increasing the number of connections benefits concurrent requests from different threads, because a thread does not have to wait for another thread's response to be delivered over the network before the connection is released; in addition, Ruby can perform I/O in parallel.
You can also tell that the pool is too small when connection checkout times become worse than you expect or tolerate and the corresponding thread/worker sits idle while waiting for a connection, so benchmark your code and take a good look at your actual usage and behavior patterns.
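As a back-of-the-envelope illustration of why a larger pool helps concurrent threads, the Python sketch below uses an idealized model: every request costs one fixed network round trip, a connection serves one request at a time, and Redis's own processing time is treated as negligible. The function name and the 1 ms round trip are assumptions.

```python
import math


def wall_time_ms(threads: int, pool_size: int, round_trip_ms: float) -> float:
    """Time for `threads` simultaneous requests when only `pool_size` can be in flight."""
    waves = math.ceil(threads / pool_size)
    return waves * round_trip_ms


# Ten threads issuing one request each, with different pool sizes.
for size in (1, 5, 10):
    print(size, wall_time_ms(threads=10, pool_size=size, round_trip_ms=1.0))
```

In this model the gain comes entirely from overlapping network waits, not from Redis doing more work per unit of time; a benchmark of the real application is still needed to pick a size.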
On the other hand, I'd advise against using the entire connection limit; there are times when you might need those extra connections. For example:
for graceful/"zero downtime" dyno restarts ("preboot") you need twice the connections, since the old processes keep running for some time
keep at least one connection free for emergency debugging, as you may want to connect from a console directly and see what data is inside when some unexpected high load comes
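One way to turn this advice into numbers is sketched below in Python. The budgeting rule (reserve some connections for debugging, then halve the rest when preboot is used) is an assumption made for illustration, as are the function and parameter names.

```python
def steady_state_budget(plan_limit: int, debug_reserve: int = 1, preboot: bool = True) -> int:
    """Connections the app may use in steady state while leaving headroom."""
    usable = plan_limit - debug_reserve
    # During preboot, old and new processes hold connections at the same time.
    return usable // 2 if preboot else usable


print(steady_state_budget(30))                 # with preboot
print(steady_state_budget(30, preboot=False))  # without preboot
```

Under these assumptions a 30-connection plan leaves 14 connections for normal operation with preboot, and 29 without it.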