Heroku Redis connection issues using Sidekiq

I am using Sidekiq to process jobs. I am using Heroku basic plan which allows up to 40 connections. My understanding is that each thread can have up to 1 connection. Sidekiq has a default number of threads of 25. In my thinking I should never be getting more than 25 connections.
But I have been getting "too many connections" errors from Redis. How would this be possible? Should I cut down the number of Sidekiq workers? Or is there something else I can do? I currently have my Procfile like this:
worker: bundle exec sidekiq
Would switching it to this fix it?
worker: bundle exec sidekiq -c 10
Is it possible Sidekiq is not closing connections properly? Also, when I get this "too many connections" error, it basically brings down the site. Is there a way to let it fail gracefully, which it seems like it should do?
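For reference, the connection counts on both sides can be pinned down explicitly instead of relying on the -c flag. This is a hedged sketch using Sidekiq's configure_server/configure_client hooks; the size values are illustrative, not recommendations:

```ruby
# config/initializers/sidekiq.rb -- sketch; :size values are illustrative.
# The server pool should cover the worker threads plus Sidekiq's few
# internal connections; the client pool on web dynos can stay small.
Sidekiq.configure_server do |config|
  config.redis = { url: ENV["REDIS_URL"], size: 12 }
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV["REDIS_URL"], size: 2 }
end
```

Note that web dynos also open Redis connections as Sidekiq clients, so the 25 server threads are not the whole budget against a 40-connection cap.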

The short answer was that Heroku wasn't accurately showing the number of connections, which was tripping up any debugging. And I was pretty amateurish in figuring it out, since Redis had pretty much just worked until now.
My timeout was set to zero (determined via heroku redis:info), which basically meant that connections are held open indefinitely. The dashboard didn't show this accurately: it said I was using around 10 connections when I really had around 35.
Connecting to Redis via heroku redis:cli and then running CLIENT LIST showed the problem: there were many connections/clients, probably in something like a TIME_WAIT state.
Changing the timeout fixed this:
heroku redis:timeout --seconds 60
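If you want to quantify the problem before changing the timeout, the CLIENT LIST output can be tallied by its idle field. A small sketch, assuming only the standard format of one client per line with space-separated key=value fields; the sample data is invented:

```ruby
# Sketch: tally long-idle Redis clients from CLIENT LIST output.
# Assumes one client per line, space-separated key=value fields.
def idle_clients(client_list, threshold: 60)
  client_list.lines.select do |line|
    fields = line.split.to_h { |pair| pair.split("=", 2) }
    fields["idle"].to_i >= threshold
  end.map { |line| line[/addr=(\S+)/, 1] }
end

# Invented sample resembling CLIENT LIST output:
sample = <<~LIST
  id=3 addr=10.0.0.5:52113 fd=8 age=930 idle=930 cmd=client
  id=4 addr=10.0.0.9:40121 fd=9 age=12 idle=1 cmd=brpop
LIST

idle_clients(sample)  # => ["10.0.0.5:52113"]
```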
Honestly, calling it a connection pool is a bit inaccurate.

Related

Redis intermittent "crash" on Laravel Horizon. Redis stops working every few weeks/months

I have an issue with Redis that affects the running of my Laravel Horizon queue, and I am unsure how to debug it at this stage, so I am looking for some advice.
Issue
Approx. every 3 - 6 weeks my queues stop running. Every time this happens, the first set of exceptions I see are:
Redis Exception: socket error on read socket
Redis Exception: read error on connection to 127.0.0.1:6379
Both of these are caused by Horizon running the command:
artisan horizon:work redis
Theory
We push around 50k - 100k jobs through the queue each day and I am guessing that Redis is running out of resources over the 3-6 week period. Maybe general memory, maybe something else?
I am unsure if this is due to a leak within my system or something else.
Current Fix
At the moment, I simply run the command redis-cli FLUSHALL to completely clear the database and we are back working again for another 3 - 6 weeks. This is obviously not a great fix!
Other Details
Currently Redis runs within the webserver (not a dedicated Redis server). I am open to changing that but it is not fixing the root cause of the issue.
Help!
At this stage, I am really unsure where to start in terms of debugging and identifying the issue. I feel that is probably a good first step!

How can I detect whether my code is running "inside" Sidekiq server or Puma?

I'm using Puma as a web server, and Sidekiq as my queue runner.
For multiple things (Database connections, Redis connections, other external services) I'm using the ConnectionPool gem to manage safe access to connections.
Now, depending on whether I'm running in the context of Sidekiq or of Puma, I need those pools to be different sizes (as large as the number of Sidekiq threads or Puma threads respectively, and they are different).
What is the best way to know, in your initializers, how big to make your connection pools based on execution context?
Thanks!
You use Sidekiq.server?, which returns nil when not running inside the Sidekiq process itself.
I don't know about your specific case (puma/sidekiq), but in general you can find this information in the $PROGRAM_NAME variable. Also similar are $0 and __FILE__.
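Putting the accepted answer to work in an initializer might look like this sketch. Only Sidekiq.server? comes from the answer; the environment-variable names and fallback sizes are assumptions to adapt:

```ruby
# Sketch: pick a ConnectionPool size for the current process using the
# Sidekiq.server? predicate (truthy only inside the Sidekiq server).
# Env var names and fallback numbers below are assumptions.
def context_pool_size
  if defined?(Sidekiq) && Sidekiq.server?
    Integer(ENV.fetch("SIDEKIQ_CONCURRENCY", 25))  # match worker threads
  else
    Integer(ENV.fetch("RAILS_MAX_THREADS", 5))     # match Puma threads
  end
end
```

An initializer can then build each pool once, e.g. ConnectionPool.new(size: context_pool_size) { Redis.new }.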

Running Delayed_Job on Unicorn with 2 web dynos

I have 2 web dynos active on Heroku.
I'm running Unicorn and Cedar-14.
#unicorn.rb
worker_processes 3
timeout 30
#Procfile
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
How can I run delayed_job using Unicorn processes? I want 2 Dynos to keep the server online but don't want to pay for an additional "worker" dyno to process some lengthy database actions.
I've seen examples for using resque but nothing concrete for Unicorn + DelayedJob.
I've been looking into the same thing lately and, while I have yet to implement anything, the consensus is that the best way to accomplish this is to spin up worker dynos when you need them and spin them down when you're finished.
There are a few gems that do this but from what I've seen, they all have drawbacks. I've also read that there are some services that will charge a small monthly fee to handle this which eliminated the issues the various gems had.
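For completeness, the no-extra-dyno variant people usually try first is to have Unicorn's master spawn a delayed_job worker next to the web workers. rake jobs:work is delayed_job's standard worker task; the rest of this unicorn.rb sketch is an assumption, and it trades reliability for cost (the job worker shares the web dyno's lifecycle and memory):

```ruby
# config/unicorn.rb -- sketch only, not a drop-in config
worker_processes 3
timeout 30

before_fork do |server, worker|
  # Spawn a single delayed_job worker from the master process, once.
  @dj_pid ||= spawn("bundle exec rake jobs:work", out: $stdout, err: $stderr)
end
```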

TCP Socket communication between processes on Heroku worker dyno

I'd like to know how to communicate between processes on a Heroku worker dyno.
We want a Resque worker to read off a queue and send the data to another process running on the same dyno. The "other process" is an off-the-shelf piece of software that usually uses TCP sockets (port xyz) to listen for commands. It is set up to run as a background process before the Resque worker starts.
However, when we try to connect locally to that TCP socket, we get nowhere.
Our Rake task for setting up the queue does this:
task "resque:setup" do
  # First launch our listener process in the background
  `./some_process_that_listens_on_port_12345 &`
  # Now get our queue worker ready, set up Redis backing store
  port = 12345
  ENV['QUEUE'] = '*'
  ENV['PORT'] = port.to_s
  Resque.redis = ENV['REDISTOGO_URL']
  # Start working from the queue
  WorkerClass.enqueue
end
And that works -- our listener process runs, and Resque tries to process queued tasks. However, the Resque jobs fail because they can't connect to localhost:12345 (specifically, Errno::ECONNREFUSED).
Possibly, Heroku is blocking TCP socket communication on the same dyno. Is there a way around this?
I tried to take the "code" out of the situation and just executed on the command line (after the server process claims that it is properly bound to 12345):
nc localhost 12345 -w 1 </dev/null
But this does not connect either.
We are currently investigating changing the client/server code to use UNIXSocket on both sides as opposed to TCPSocket, but as it's an off-the-shelf piece of software, we'd rather avoid our own fork if possible.
Use a message queue Heroku add-on, like IronMQ for example.
Have you tried Fifo?
http://www.gnu.org/software/libc/manual/html_node/FIFO-Special-Files.html#FIFO-Special-Files
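The FIFO idea needs no networking at all; both processes just agree on a path on the dyno's filesystem. A minimal sketch, with the path and message invented:

```ruby
require "tmpdir"

# Pass one line between the two ends of a named pipe (FIFO).
dir  = Dir.mktmpdir
path = File.join(dir, "commands.fifo")
File.mkfifo(path)

writer = Thread.new do
  File.open(path, "w") { |f| f.puts "process-job-42" }
end

command = File.open(path, "r", &:gets)  # blocks until the writer connects
writer.join
command  # => "process-job-42\n"
```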
Reading your question, you've answered your own question: you cannot connect to localhost:12345.
This way of setting up your processes is a strange one, as you're running two processes within one Heroku dyno, which removes a lot of the benefits of Heroku, i.e. independent process scaling, isolation, and clean dependency declaration.
I would strongly recommend running this as two separate processes that interact via a third-party backing service.
Heroku only lets you listen on a single port ($PORT) per dyno, I think.
I see two solutions here:
Use Redis as communication middleware, so the worker would write to Redis again, and the listener process, instead of listening on a port, would query Redis for new jobs.
Get another Heroku dyno (or better, a completely different application), launch the listening process there (on $PORT), and have the two applications communicate.
@makdad, is the "3rd party software" written in Ruby? If so, I would run it with a monkey patch which fakes out TCPSocket or whatever class it is using to access the TCP socket. Put the monkey patch in a file of its own, which will only be required by the Ruby process which is running the 3rd party software. The monkey patch could even read data directly from the queue, and make TCPSocket behave as if that data had been received.
Yes, it's not very elegant, and I'm sure there may be a better way to do it, but when you're trying to get a job done (not spend days doing research), sometimes you just have to bite the bullet and do something which is ugly but works. Whatever solution you choose, make sure to document it for those who work on the project later.
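A sketch of that monkey-patch idea, with every name invented for illustration: TCPSocket.new is rerouted for one port to a stub whose "network data" comes from an in-process queue.

```ruby
require "socket"

# All names here are hypothetical. Connections to port 12345 get a stub
# socket; every other connection goes through the real TCPSocket.
FAKE_COMMANDS = Queue.new

class StubSocket
  def gets
    FAKE_COMMANDS.pop
  end

  def close; end
end

class << TCPSocket
  alias_method :original_new, :new

  def new(host, port, *args)
    return StubSocket.new if port == 12345
    original_new(host, port, *args)
  end
end

FAKE_COMMANDS << "RUN job-7\n"
sock = TCPSocket.new("localhost", 12345)
line = sock.gets  # => "RUN job-7\n"
sock.close
```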

Is it possible to list all database connections currently in the pool?

I'm getting ActiveRecord::ConnectionTimeoutError in a daemon that runs independently from the rails app. I'm using Passenger with Apache and MySQL as the database.
Passenger's default pool size is 6 (at least that's what the documentation tells me), so it shouldn't use more than 6 connections.
I've set ActiveRecord's pool size to 10, even though I thought that my daemon should only need one connection. My daemon is one process with multiple threads that calls ActiveRecord here and there to save stuff to the database that it shares with the rails app.
What I need to figure out is whether the threads simply can't share one connection or if they just keep asking for new connections without releasing their old connections. I know I could just increase the pool size and postpone the problem, but the daemon can have hundreds of threads and sooner or later the pool will run out of connections.
The first thing I would like to verify is that Passenger is indeed just using 6 connections and that the problem lies with the daemon. How do I test that?
Second, I would like to figure out if every thread needs its own connection or if they just need to be told to reuse the connection they already have. If they do need their own connections, maybe they just need to be told not to hold on to them when they're not using them? The threads are, after all, sleeping most of the time.
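The borrow-and-return behavior in question can be seen with a toy pool (illustrative only, not ActiveRecord's implementation): threads that check a connection out and back in can share a couple of connections, while a thread that never checks in would starve the rest.

```ruby
# Toy connection pool, invented for illustration.
class TinyPool
  def initialize(size)
    @conns = Queue.new
    size.times { |i| @conns << "conn-#{i}" }
  end

  def with_connection
    conn = @conns.pop          # blocks while all connections are checked out
    yield conn
  ensure
    @conns << conn if conn     # check the connection back in
  end
end

pool = TinyPool.new(2)
used = Array.new(10) { Thread.new { pool.with_connection { |c| c } } }.map(&:value)
used.uniq.sort  # => ["conn-0", "conn-1"]
```

In ActiveRecord, the equivalent idiom for a thread is ActiveRecord::Base.connection_pool.with_connection { |conn| ... }, which checks the connection back in when the block returns.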
You can get at the connection pools that ActiveRecord is using through ActiveRecord::Base.connection_handler.connection_pools, which should be an array of connection pools. You will probably only have one in there, and it has a connections method on it to get an array of the connections it knows about.
You can also do ActiveRecord::Base.connection_handler.connection_pools.each(&:clear_stale_cached_connections!) and it will check in any checked-out connections whose thread is no longer alive.
Don't know if that helps or confuses more
As of February 2019, clear_stale_cached_connections! has been deprecated and its behavior moved to reap
Commit
Previous accepted answer updated:
ActiveRecord::Base.connection_handler.connection_pools.each(&:reap)
