How to configure Redis connections with Rails 4, Puma and Sidekiq? - ruby

I am using Sidekiq (on Heroku with Puma) to send emails asynchronously and would like to use Redis to keep counters and cache models.
RedisCloud's free plan includes 30 connections to Redis. It is not clear to me how to manage:
Redis connections used by Sidekiq
Redis connections used in models (caching and counters)
The Sidekiq client size is configured like this:
Sidekiq.configure_client do |config|
  config.redis = { url: ENV["REDISCLOUD_URL"], size: 3 }
end
If I understood this correctly, Puma forks multiple processes, 2 in my case, which will result in:
2 (Puma workers) * 3 (size) * 1 (web dyno) = 6 Redis connections used to push jobs.
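For reference, this is roughly what the Puma side looks like; a minimal config/puma.rb sketch, assuming the usual Heroku-style WEB_CONCURRENCY / RAILS_MAX_THREADS environment variables (which are not named in the question):
# config/puma.rb -- sketch: 2 worker processes, each with its own thread pool
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count
preload_app!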
Sidekiq Server
With Sidekiq itself taking 2 connections (or 5 in version 4), setting a concurrency of 10 would result in a default server pool size of 12 or 15.
If I wanted to use all of the remaining available connections (30 - 6 = 24), I could set:
Sidekiq.configure_server do |config|
  config.redis = { size: 19 }
end
Total Redis connections would be 19 + 5 (Sidekiq 4) = 24, and using the default concurrency of 25 would be OK.
As Mike Perham stated, the concurrency generally must not be more than (server pool size - 2) * 2.
Now, where it starts to get confusing for me is the use of Redis outside of Sidekiq.
# initializers/redis.rb
$redis = Redis.new(:url => uri)
Whenever I use Redis in a model or controller, I call it like so:
$redis.hincrby("mycounter", "key", 1)
As I understand it, all the Puma threads wait on each other on a single Redis connection when $redis.whateverFunction is called.
In the answer to "What is the best way to use Redis in a multi-threaded Rails environment? (Puma / Sidekiq)", the recommended approach is to use the connection_pool gem, as described in the Sidekiq wiki: https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
require 'connection_pool'
$redis = ConnectionPool.new(size: 10) { Redis.new }
If I understand it right, in that case $redis.whateverFunction would have its own connection pool of 10, and Sidekiq its own pools, which would now have to fit into a new total of 20 Redis connections (30 available in total - 10 for the model connections), so the Sidekiq client and server sizes would need to be changed.
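One detail worth spelling out (a sketch, assuming the connection_pool gem's standard API): with a ConnectionPool you check a connection out explicitly via #with rather than calling Redis commands on the pool object itself:
require 'connection_pool'
require 'redis'

$redis = ConnectionPool.new(size: 10) { Redis.new(url: ENV["REDISCLOUD_URL"]) }

# Checks a connection out of the pool for the duration of the block and then
# returns it, so Puma threads no longer serialize on a single connection.
$redis.with do |conn|
  conn.hincrby("mycounter", "key", 1)
end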
How do you determine the size of the connection pool (here 10) needed for model/controller Redis connections? And since Redis is single-threaded, how does increasing the connection pool actually increase Redis performance?
Any thoughts on this would be of great help.
Thx!

Redis is single-threaded, but it is written in pure C, uses an event loop internally, and handles connections asynchronously, so the connection count does not affect it much for the same number of requests. It can handle requests faster than your application can generate them (because of network delay, Ruby being slower than compiled and optimized C, etc.), so you do not need to worry about it being single-threaded.
Increasing the number of connections is beneficial for concurrent requests from different threads: there is no need to wait for a response to travel back over the network before the connection is freed up, and Ruby can do the I/O in parallel.
You can also tell the pool is too small when connection checkout times become worse than you expect or tolerate and the corresponding thread/worker sits idle while waiting, so benchmark your code and take a good look at your actual usage and behavior patterns.
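As a rough sketch of that kind of measurement (illustrative names and thresholds, not a recommendation), you can time the checkout itself and log when it exceeds what you tolerate:
require 'connection_pool'
require 'redis'

POOL = ConnectionPool.new(size: 10, timeout: 1) { Redis.new(url: ENV["REDISCLOUD_URL"]) }

def timed_incr(counter, field)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  POOL.with do |conn|
    waited = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    # Log checkouts slower than 50 ms; tune the threshold to what you tolerate.
    warn "slow Redis checkout: #{(waited * 1000).round(1)} ms" if waited > 0.05
    conn.hincrby(counter, field, 1)
  end
end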
On the other hand, I'd advise against using the whole connection limit; there are times when you might need those extra connections. For example:
For graceful/"zero downtime" dyno restarts ("preboot") you need twice the connections, since the old processes keep running for some time.
Keep at least one free connection for emergency debugging, as you may want to connect directly from a console and see what data is inside when some unexpected high load hits.

Related

Cassandra long connection times (compared to Redis)

I'm surprised by the long connection times to Cassandra (compared to Redis) from the Python client (cassandra-driver) to a single-node Cassandra cluster running on the same host. How can they be improved?
More info
Even though the Cassandra server runs on the same host and I connect directly (through a k8s/OCP service port), the connection times out unless I set a relatively long, triple-digit (milliseconds) connect_timeout threshold (relative to our standards, built on three years of experience with Redis in several clusters).
In the case of Redis the connection takes just a fraction of a millisecond, and a total (connection plus read) timeout of 10 ms has always been sufficient under these conditions, while for Cassandra it would have to be set orders of magnitude higher!
2022-09-18 11:08:16.899097 - connecting to Redis database...
Redis<ConnectionPool<Connection<host=<redacted>.svc.cluster.local,port=<redacted>,db=0>>>
2022-09-18 11:08:16.899423 - connected to Redis database in 0.32 ms
versus:
2022-09-18 12:29:13.229688 - connecting to Cassandra database...
<cassandra.cluster.Session object at 0x7fd0065e80d0>
2022-09-18 12:29:13.409084 - connected to Cassandra database in 190.856 ms
What's even more surprising, the read timeout for Cassandra can be much shorter than the connection timeout: only 16 ms for SELECT queries (with a cold start, i.e. the first query for a given key) vs. 128+ ms for the connection alone (every time; apparently there is no connection caching on the client side).
Almost reproducible example (just fill in your cluster and db user details):
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider  # note: lives in cassandra.auth, not cassandra.cluster

# connection timeout
conn_timeout_ms = 64      # 64 ms: the connection times out
# conn_timeout_ms = 128   # 128 ms: OK (best time)

cas_auth_provider = PlainTextAuthProvider(username=cas_user,
                                          password=cas_pass)
cas_cluster = Cluster(contact_points=[cas_uri],
                      port=cas_port,
                      auth_provider=cas_auth_provider,
                      connect_timeout=conn_timeout_ms / 1000,  # ms -> sec
                      )
cas_session = cas_cluster.connect()
I've also tried setting the low-level TCP_NODELAY socket option via the sockopts argument exposed by Cluster (to disable Nagle's aggregation algorithm and send the data as soon as it's available), but it did not help:
import socket
sockopts = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]

MongoDB-Java performance with rebuilt Sync driver vs Async

I have been testing MongoDB 2.6.7 for the last couple of months using YCSB 0.1.4. I have captured good data comparing SSD to HDD and am producing engineering reports.
After my testing was completed, I wanted to explore the allanbank async driver. When I got it up and running (I am not a developer, so it was a challenge for me), I first wanted to try the rebuilt sync driver. I found performance improvements of 30-100%, depending on the workload, and was very happy with it.
Next, I tried the async driver. I was not able to see much difference between it and my results with the native driver.
The command I'm running is:
./bin/ycsb run mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://192.168.0.13:27017/ycsb -p mongodb.writeConcern=strict -threads 96
Over the course of my testing (mostly with the native driver), I have experimented with more and fewer threads than 96; turned on "noatime"; tried both xfs and ext4; disabled hyperthreading; disabled half my 12 cores; put the journal on a different drive; changed sync from 60 seconds to 1 second; and checked the network bandwidth between the client and server to ensure it's not oversubscribed (10GbE).
Any feedback or suggestions welcome.
The async move exceeded my expectations. My experience is with the Python sync driver (pymongo) and async driver (motor), and the async driver achieved greater than 10x the throughput. Further, motor still uses pymongo under the hood but adds the async ability; that could easily be the case with your allanbank driver as well.
Often the dramatic changes come from threading policies and OS configurations.
Async code needn't, and shouldn't, use any more threads than there are cores on the VM or machine. For example, if your server code is spawning a new thread per incoming connection, then all bets are off, so start by looking at the way the driver is being utilized. A 4-core machine should use <= 4 incoming threads.
At the OS level, you may have to fine-tune parameters like net.core.somaxconn, net.core.netdev_max_backlog, fs.file-max, and nofile in /etc/security/limits.conf; the best place to start is nginx-related performance guides. nginx is the server that spearheaded, or at least caught the attention of, many Linux sysadmin enthusiasts. Contrary to popular lore, you should reduce your keepalive timeout rather than lengthen it: the default keep-alive timeout is some absurd number of seconds (4 hours' worth), and you might want to cut the cord at 1 minute. Basically, aim for a short, sweet relationship with your clients' connections.
Bear in mind that Mongo is not async, so you can use a Mongo driver pool. Nevertheless, don't let the driver get stalled on slow queries: cut it off after 5 to 10 seconds using the Java equivalents of the following (pymongo) settings. I'm just cutting and pasting here, with no recommendations.
# Specifies a time limit for a query operation. If the specified time is exceeded, the operation will be aborted and ExecutionTimeout is raised. If max_time_ms is None no limit is applied.
# Raises TypeError if max_time_ms is not an integer or None. Raises InvalidOperation if this Cursor has already been used.
CONN_MAX_TIME_MS = None
# socketTimeoutMS: (integer) How long (in milliseconds) a send or receive on a socket can take before timing out. Defaults to None (no timeout).
CLIENT_SOCKET_TIMEOUT_MS=None
# connectTimeoutMS: (integer) How long (in milliseconds) a connection can take to be opened before timing out. Defaults to 20000.
CLIENT_CONNECT_TIMEOUT_MS=20000
# waitQueueTimeoutMS: (integer) How long (in milliseconds) a thread will wait for a socket from the pool if the pool has no free sockets. Defaults to None (no timeout).
CLIENT_WAIT_QUEUE_TIMEOUT_MS=None
# waitQueueMultiple: (integer) Multiplied by max_pool_size to give the number of threads allowed to wait for a socket at one time. Defaults to None (no waiters).
CLIENT_WAIT_QUEUE_MULTIPLY=None
Hopefully you will have the same success. I was ready to bail on Python prior to async.

Can you determine the number of workers running in your application

In order to scale our Sidekiq workers properly to the size of our database pool, we came up with a little formula in our configuration:
sidekiq.rb
def workers
  # ... the number of workers configured for our project ...
  (ENV['HEROKU_WORKERS'] || 1).to_i
end

Sidekiq.configure_server do |config|
  config.options[:concurrency] = ((ENV['DB_POOL'] || 5).to_i - 1) / workers
end
We're setting HEROKU_WORKERS by hand, but it would be sweet if there was a way to interrogate the Heroku API from within the application.
Modulo all the things that can happen (workers going up or down, changing the number of workers, etc.), this seems to get us out of the initial problem, where our workers would consume all of the database pool connections and then start crashing on startup.
The heroku-api gem should provide you with this.
https://github.com/heroku/heroku.rb
You should find your API key here: https://dashboard.heroku.com/account
require 'heroku-api'
heroku = Heroku::API.new(api_key: API_KEY)
Total number of current processes:
heroku.get_ps('heroku-app-name').body.count
(You should be able to parse this to get the total number of workers... or a count of a specific kind of worker, if you have different kinds defined in your Procfile/Heroku app.)
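For example, something along these lines (a hedged sketch: it assumes each entry in the response exposes a "process" field such as "worker.1", which may differ between API versions):
worker_count = heroku.get_ps('heroku-app-name').body.count do |ps|
  ps['process'].to_s.start_with?('worker')
end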

What happens if I have a pool_size of 1 in mongoid2 and I'm running unicorn with 3 worker_processes?

I'm running into a connection timeout. With a pool_size of 1, does that mean that at most one connection is ever in the pool (i.e. does pool_size = max_pool_size)?
Also, what happens when I have 3 Unicorn processes running? Are they all using that same single connection, making things slower than expected?
I'm running into a connection timeout. With a pool_size of 1, does that mean that at most one connection is ever in the pool (i.e. does pool_size = max_pool_size)?
In Mongoid 2, the pool size is the maximum number of connections that will ever be open, and they are likely open at all times.
Mongoid 3 does not use a connection pool (though it did before it switched to the Moped driver).
Also, what happens when I have 3 Unicorn processes running? Are they all using that same single connection, making things slower than expected?
If you’re using Mongoid 3 with Rails, Mongoid will automatically reconnect when Unicorn forks a worker. If you’re using Mongoid 2, or not using Rails, you should call Mongoid.default_session.disconnect (that's the Mongoid 3 call; I'm not sure exactly what to call in 2.x) in Unicorn’s before_fork hook.
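A sketch of what that looks like in a Unicorn config file (the disconnect call shown is the Mongoid 3 one mentioned above; the 2.x equivalent may differ):
# config/unicorn.rb
before_fork do |server, worker|
  # Close the master's Mongo connection so each worker opens its own after forking.
  Mongoid.default_session.disconnect
end

after_fork do |server, worker|
  # Mongoid 3 reconnects lazily on first use after the fork, so nothing is
  # strictly required here.
end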

Django1.3 multiple gunicorn workers caching problems

I have weird caching problems with the 1.3 version of Django. I probably have something configured wrong, but I'm not sure what.
A good example is django-avatar, which uses caching and which many people use. Even if I don't have a cache backend defined, the avatar seems to be cached, which by itself would be OK, but it keeps switching back and forth between the last values cached. Example: I upload a new avatar; now on approximately 50% of the requests it shows me the new one, and 50% the old one. If I delete the old one I still get it on the site 50% of the time. The only way to fix it is to disable the caching of the avatar by setting its timeout to one second.
First I thought it was because I used django.core.cache.backends.locmem.LocMemCache, which I had never used before, but it even happens when I don't configure a cache backend at all.
I found one similar bug:
Django caching bug .. even if caching is disabled
but my pages render just fine; it's the templatetags (for now) that cause the problems in my setup.
I use django 1.3, postgres, nginx, gunicorn 0.12.0, greenlet==0.3.1, eventlet==0.9.16
I just did some more testing and realized that it only happens when I start gunicorn using the config file. If I start it with ./manage.py run_gunicorn everything is fine. Running "gunicorn_django -c deploy/gunicorn.conf.py" causes the problems.
The only explanation I can think of is that each worker gets its own cache (I wonder why, since I did not define a cache).
Update: running ./manage.py run_gunicorn -w 4 also causes the same problems. Therefore I am almost certain that the multiple workers are causing the problems and each worker caches the values separately.
My configuration:
import os
import socket
import sys
PORT = 8000
PROC_NAME = 'myapp_gunicorn'
LOGFILE_NAME = 'gunicorn.log'
TIMEOUT = 3600
IP = '127.0.0.1'
DEPLOYMENT_ROOT = os.path.dirname(os.path.abspath(__file__))
SITE_ROOT = os.path.abspath(os.path.sep.join([DEPLOYMENT_ROOT, '..']))
CPU_CORES = os.sysconf("SC_NPROCESSORS_ONLN")
sys.path.insert(0, os.path.join(SITE_ROOT, "apps"))
bind = '%s:%s' % (IP, PORT)
logfile = os.path.sep.join([DEPLOYMENT_ROOT, 'logs', LOGFILE_NAME])
proc_name = PROC_NAME
timeout = TIMEOUT
worker_class = 'eventlet'
workers = 2 * CPU_CORES + 1
I also tried it without using 'eventlet', but got the same errors.
Thanks for any help.
It is most likely defaulting to the in-memory cache (LocMemCache), which means each worker has its own version of the cache in its own memory space. If you hit worker 1 you get a different cache than if you hit worker 3. nginx is most likely spreading the load between the workers via round-robin distribution, so you change workers on each hit, which explains your wacky results.
When you run manage.py run_gunicorn it is most likely running a single worker, and thus there is only one cache, which is why you don't see the same behavior.
Using memcached or something similar is the way to go.
