Redis Pub/Sub causes web socket connection to hang - ruby

I'm building a web application that connects to a server via WebSockets. The server component is a small Ruby application based on sinatra, redis, and faye-websocket, running on Phusion Passenger. A separate updater daemon constantly pulls updates from various sources and publishes them to Redis (using the redis gem and Redis#publish).
In order to push the updates to the clients I tried the following in my Sinatra app:
get '/' do
  if Faye::WebSocket.websocket?(request.env)
    store = Redis.new
    ws = Faye::WebSocket.new(request.env)

    ws.on(:open) do |event|
      store.incr('connection_count')
      puts 'Client connected (connection count: %s)' % store.get('connection_count')
    end

    ws.on(:close) do |event|
      store.decr('connection_count')
      puts 'Client disconnected (connection count: %s)' % store.get('connection_count')
    end

    ws.rack_response

    store.subscribe(:updates) do |on|
      on.message do |ch, payload|
        puts "Got update"
        ws.send(payload) if payload
      end
    end
  end
end
This works only partially. A client can connect successfully and also receives updates but the store.incr and store.decr calls don't work. Also, the connections don't seem to be closed correctly—when I fire up multiple clients, I noticed that the connections pile up and the Passenger server stops working eventually.
Log output:
devserver_1 | App 614 stdout: Got update
devserver_1 | App 614 stdout: Got update
devserver_1 | App 614 stdout: Got update
When I comment out the following block, keeping track of the connections suddenly works:
store.subscribe(:updates) do |on|
  on.message do |ch, payload|
    puts "Got update"
    ws.send(payload) if payload
  end
end
Log output:
devserver_1 | App 1028 stdout: Client connected (connection count: 1)
devserver_1 | App 1039 stdout: Client connected (connection count: 2)
devserver_1 | App 1039 stdout: Client disconnected (connection count: 1)
devserver_1 | App 1028 stdout: Client disconnected (connection count: 0)
So using Redis#subscribe seems to somehow interfere with the WebSocket connection.
How can I solve this?
Phusion Passenger version 4.0.58
ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux-gnu]
sinatra (1.4.6)
faye-websocket (0.9.2)

I think the problem here is that Faye uses EventMachine, which means there's a reactor running on your thread that handles events and invokes your callbacks, ws.on(:open) and ws.on(:close).
Now when you hit
store.subscribe(:updates) do |on|
  on.message do |ch, payload|
    puts "Got update"
    ws.send(payload) if payload
  end
end
this is a blocking operation: it ties up the current thread indefinitely. If your current thread is blocked, the reactor can't listen for events and call your callbacks.
One solution to this is to run your store.subscribe on a different thread so it doesn't matter if it blocks that thread.
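A minimal sketch of that thread-based workaround. Since the real store.subscribe needs a running Redis server, the blocking subscription is simulated here; the pattern is the same either way: the blocking loop lives on its own background thread and hands each payload over through a queue.

```ruby
# Stand-in for the blocking Redis subscription. In the real app the
# background thread would run:
#   store.subscribe(:updates) do |on|
#     on.message { |_channel, payload| updates << payload }
#   end
updates = Queue.new

subscriber = Thread.new do
  3.times { |i| updates << "update-#{i}" }  # simulated incoming messages
  updates << :done                          # sentinel to end the demo
end

received = []
while (msg = updates.pop) != :done
  received << msg   # in the real app: ws.send(msg)
end
subscriber.join
```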
But I think a better solution is to use a non-blocking, EventMachine-aware Redis library such as em-hiredis.
Based on its documentation (note the message payload is yielded to the subscribe block):
redis = EM::Hiredis.connect
pubsub = redis.pubsub
pubsub.subscribe(:updates) do |payload|
  puts "Got update"
  ws.send(payload) if payload
end
Both of these (em-hiredis and Faye) register with the EventMachine reactor loop, so it can dispatch events to both.

Related

What is the correct way to end a WEBrick server process after a timeout?

I'm trying to implement a small WEBrick server that ends itself when there are no requests after x number of seconds. However, I'm getting nowhere; my very first attempt at simply exiting after 2 seconds fails. Here's the simple code that doesn't work:
server = WEBrick::HTTPServer.new(:Port => 8000)
WEBrick::Utils::TimeoutHandler.register(2, Timeout::Error)
server.start
I thought that would simply exit the process after 2 seconds. Here's what actually happens:
[2020-01-19 15:41:10] INFO WEBrick 1.4.2
[2020-01-19 15:41:10] INFO ruby 2.5.1 (2018-03-29) [x86_64-linux-gnu]
[2020-01-19 15:41:10] INFO WEBrick::HTTPServer#start: pid=16622 port=8000
[2020-01-19 15:41:12] ERROR Timeout::Error: execution timeout
/usr/lib/ruby/2.5.0/webrick/server.rb:170:in `select'
And then the process keeps running. I have to ctrl-c to end it.
What's the correct way to shut down a server and end the process after a timeout?

How to send Apple Push Notifications by using apnotic gem?

I am trying to send an Apple push notification by following the instructions for the apnotic gem.
https://github.com/ostinelli/apnotic
require 'apnotic'

connection = Apnotic::Connection.development(
  auth_method: :token,
  cert_path: "AuthKey_foobar.p8",
  key_id: "foobar",
  team_id: "abcd1234"
)

# create a notification for a specific device token
token = 'abcdef123456..'
notification = Apnotic::Notification.new(token)
notification.topic = "bundle_id"
notification.alert = "Notification from Apnotic!"

# send notifications and get results
response = connection.push(notification)
puts response.status
puts response.body

connection.close
However, I got this response:
403
{"reason"=>"ExpiredProviderToken"}
I don't think the token is actually expired, because I can send notifications using this app with the same IDs and tokens:
https://github.com/onmyway133/PushNotifications
The problem was my virtual machine's clock. After adjusting the clock (by installing ntp), it worked.
$ sudo apt install ntp
$ sudo service ntp status
● ntp.service - Network Time Service
Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enab
Active: active (running) since Sat 2019-08-17 05:46:06 UTC; 7s ago
:
$ ruby push.rb
200
{":status"=>"200", "apns-id"=>"......"}
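This makes sense: apnotic's token-based auth generates a JWT whose iat (issued-at) claim Apple checks against its own clocks, so a VM clock that is far off yields ExpiredProviderToken even with valid keys. A quick way to sanity-check the claim yourself (a sketch; jwt is any hypothetical provider token string):

```ruby
require 'base64'
require 'json'

# A JWT is three dot-separated Base64URL segments: header.payload.signature.
# Returns how many seconds ago the token claims it was issued; a large
# positive value, or a negative one, points at a skewed clock.
def provider_token_age(jwt)
  payload_segment = jwt.split('.')[1]
  payload = JSON.parse(Base64.urlsafe_decode64(payload_segment))
  Time.now.to_i - payload.fetch('iat')
end
```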

Unresponsive socket after x time (puma - ruby)

I'm experiencing an unresponsive socket with my Puma setup after a random amount of time. Up to this point I don't have a clue what's causing the issue; I was hoping somebody here could help me with some answers or point me in the right direction. This is my setup:
I'm using the official Docker ruby-2.2.3-slim image together with the latest Puma release, 2.15.3. I've also installed Nginx as a reverse proxy, but I'm already sure Nginx isn't the problem here: I tried to verify that the socket was working using this script, and the socket wasn't working (I got a timeout there as well), so I can rule out Nginx.
This is a testing environment, so the server isn't experiencing any extreme load. I've also checked memory consumption; there are still several GBs free, so that can't be the issue either.
What triggered me to look at the puma socket was the error message I got in my Nginx error logging:
upstream timed out (110: Connection timed out) while reading response header from upstream
I also couldn't find anything in Puma's logs indicating what's going wrong. Here is my Puma config:
threads 0, 16

app_dir = ENV.fetch('APP_HOME')
environment ENV['RAILS_ENV']

daemonize
bind "unix://#{app_dir}/sockets/puma.sock"
stdout_redirect "#{app_dir}/log/puma.stdout.log", "#{app_dir}/log/puma.stderr.log", true
pidfile "#{app_dir}/pids/puma.pid"
state_path "#{app_dir}/pids/puma.state"
activate_control_app

on_worker_boot do
  require 'active_record'
  ActiveRecord::Base.connection.disconnect! rescue ActiveRecord::ConnectionNotEstablished
  ActiveRecord::Base.establish_connection(YAML.load_file("#{app_dir}/config/database.yml")[ENV['RAILS_ENV']])
end
And this is the content of my Puma state file:
---
pid: 43
config: !ruby/object:Puma::Configuration
  cli_options:
  conf:
  options:
    :min_threads: 0
    :max_threads: 16
    :quiet: false
    :debug: false
    :binds:
    - unix:///APP/sockets/puma.sock
    :workers: 1
    :daemon: true
    :mode: :http
    :before_fork: []
    :worker_timeout: 60
    :worker_boot_timeout: 60
    :worker_shutdown_timeout: 30
    :environment: staging
    :redirect_stdout: "/APP/log/puma.stdout.log"
    :redirect_stderr: "/APP/log/puma.stderr.log"
    :redirect_append: true
    :pidfile: "/APP/pids/puma.pid"
    :state: "/APP/pids/puma.state"
    :control_url: unix:///tmp/puma-status-1449260516541-37
    :config_file: config/puma.rb
    :control_url_temp: "/tmp/puma-status-1449260516541-37"
    :control_auth_token: cda8879717be7a645ea323d931b88d4b
    :tag: APP
The application itself is a Rails app on the latest version 4.2.5, it's deployed on GCE (Google Container Engine).
If somebody could give me some pointers on how to debug this further, it would be very much appreciated, because right now I don't see any output anywhere that could help me.
EDIT
I replaced the Unix socket with a TCP connection to Puma, with the same result: it still hangs after some time.
I'd start with:
How many requests get processed successfully per instance of puma?
Make sure you log the beginning and end of each request with the thread id of the thread executing it, what do you see?
Not knowing more about your application, I'd say it's likely the threads get stuck doing some long/blocking calls without timeouts or spinning on some computation until the whole thread pool gets depleted.
We'll see.
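The per-request logging suggested above could be a small Rack middleware, along these lines (a sketch; the class name is illustrative):

```ruby
# Logs the start and end of every request together with the object id of
# the servicing thread, so a thread that starts a request but never
# finishes it stands out in the log.
class RequestLogger
  def initialize(app)
    @app = app
  end

  def call(env)
    thread_id = Thread.current.object_id
    path = env['PATH_INFO']
    started = Time.now
    warn "[thread #{thread_id}] start #{path}"
    @app.call(env)
  ensure
    warn "[thread #{thread_id}] end #{path} (#{(Time.now - started).round(3)}s)"
  end
end
```

In a Rails app it would be registered with config.middleware.use RequestLogger; requests that log a start with no matching end identify the stuck threads.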
I finally found out why my application was behaving the way it was.
After trying a TCP connection and switching to Unicorn, I started looking into other possible sources.
That's when I thought my connection to Google Cloud SQL could be the problem. The Cloud SQL FAQ mentions that you have to tweak your Compute Engine instances to ensure they keep your DB connection open. I performed the steps they recommend and that solved the problem for me; I've added them here just in case:
# Display the current tcp_keepalive_time value.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
# Set tcp_keepalive_time to 60 seconds and make it permanent across reboots.
$ echo 'net.ipv4.tcp_keepalive_time = 60' | sudo tee -a /etc/sysctl.conf
# Apply the change.
$ sudo /sbin/sysctl --load=/etc/sysctl.conf
# Display the tcp_keepalive_time value to verify the change was applied.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
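The same check can be automated from Ruby, e.g. as a boot-time warning in an initializer (a sketch; the proc path exists only on Linux, and the 60-second threshold mirrors the recommendation above):

```ruby
# Warn if the kernel's TCP keepalive interval is longer than the value
# recommended for keeping idle Cloud SQL connections alive (Linux only).
KEEPALIVE_PROC = '/proc/sys/net/ipv4/tcp_keepalive_time'

if File.exist?(KEEPALIVE_PROC)
  keepalive_seconds = File.read(KEEPALIVE_PROC).to_i
  if keepalive_seconds > 60
    warn "tcp_keepalive_time is #{keepalive_seconds}s; consider lowering it to 60s"
  end
end
```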

suddenly PossibleAuthenticationFailureError in amqp

I'm using the Ruby amqp gem. I ran an AMQP.start event loop, but suddenly it raised a PossibleAuthenticationFailureError during the loop.
AMQP.start(amqp_config) do |connection|
  channel = AMQP::Channel.new connection
  channel.on_error do |channel, channel_close|
    puts "Oops... a channel-level exception: code = #{channel_close.reply_code}, message = #{channel_close.reply_text}"
  end

  my_worker = MyWorker.new
  my_worker.start
end
[amqp] Detected TCP connection failure
/home/raincole/.rvm/gems/ruby-1.9.3-p125/gems/amq-client-0.9.3/lib/amq/client/async/adapters/event_machine.rb:164:in `block in initialize': AMQP broker closed TCP connection before authentication succeeded: this usually means authentication failure due to misconfiguration. Settings are {:host=>"localhost", :port=>5672, :user=>"guest", :pass=>"guest", :vhost=>"/", :timeout=>nil, :logging=>false, :ssl=>false, :broker=>nil, :frame_max=>131072} (AMQP::PossibleAuthenticationFailureError)
The weird part is, my worker had received some messages before I got the PossibleAuthenticationFailureError. It seems like the configuration should be correct (and I checked it over and over again).
Are there other potential reasons for PossibleAuthenticationFailureError?
I recommend a 4 step approach to investigating this issue:
a) Eliminate the obvious - Are your credentials correct and is the user account alive and well (default = 'guest')? Are you connecting to the appropriate vhost (default = '/')?
$ rabbitmqctl list_users
Listing users ...
guest [administrator]
...done.
$ rabbitmqctl list_user_permissions guest
Listing permissions for user "guest" ...
/ .* .* .*
<your_vhost> .* .* .*
...done.
b) What do the rabbitmq connection logs say?
On a Mac OS installation of rabbitmq (using brew), the logs can be found in /usr/local/var/log/rabbitmq, but your log location could be elsewhere depending on OS and installation preferences.
You may see the following lines in the rabbit#localhost.log file. Not a lot of help...and so proceed to step (c). Otherwise, investigate as per what you see in the log.
=INFO REPORT==== 15-Feb-2013::00:42:21 ===
accepting AMQP connection <0.691.0> (127.0.0.1:53108 -> 127.0.0.1:5672)
=WARNING REPORT==== 15-Feb-2013::00:42:21 ===
closing AMQP connection <0.691.0> (127.0.0.1:53108 -> 127.0.0.1:5672):
connection_closed_abruptly
c) Is RabbitMQ's listener (the Erlang broker) alive? Default port = 5672. The simplest way to check is to send a garbage message to that port and look for an 'AMQP' response:
$ telnet localhost 5672
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
asdasd
AMQP
Connection closed by foreign host.
d) Is the event-loop reactor closing prematurely, before the AMQP.connect (or AMQP.start) actions have had a chance to complete authentication?
EM.run do
  connection = AMQP.connect(:host => 'localhost', :vhost => '/') do
    # your code here
  end
  EM.stop
end
With all 'your code' sitting in a callback, the EM.stop runs instantaneously after the AMQP.connect instruction. This gives no time for the connection to be suitably established.
What worked for me here was to add a timer and handle disconnects gracefully.
EM.run do
  connection = AMQP.connect(:host => 'localhost', :vhost => '/') do
    # your code here
  end

  graceful_exit = Proc.new {
    connection.close { EM.stop }
  }
  EM.add_timer(3, graceful_exit)
end
The reason I put the EM.stop block in a Proc is so that I can reuse it for other graceful exits (say, when trapping 'TERM' and 'INT' signals).
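For example, the same proc can back TERM and INT handlers (a sketch with a stand-in proc; in the real app graceful_exit would be the connection-closing proc above):

```ruby
stopped = false
# Stand-in for Proc.new { connection.close { EM.stop } }
graceful_exit = Proc.new { stopped = true }

Signal.trap('TERM') { graceful_exit.call }
Signal.trap('INT')  { graceful_exit.call }

# Simulate receiving a TERM signal; the handler runs at the next
# interpreter checkpoint, which the sleep loop provides.
Process.kill('TERM', Process.pid)
sleep 0.1 until stopped
```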
Hope this helps.

EventMachine UNIX socket connections: how do I use them with Thin running as a service?

How can I use EventMachine.connect_unix_domain while running Thin as a service (using the init script excerpt and configuration below)? The first code example below is the problem: I get an eventmachine not initialized: evma_connect_to_unix_server error. The second code example works, but doesn't allow me to daemonize Thin (I don't think). Doesn't Thin already have a running instance of EventMachine?
UPDATE: Oddly enough, stopping the server (with service thin stop) seems to get into the config.ru file and run the app (so it works, until the stop command times out and kills the process). What happens when Thin stops that could be causing this behavior?
Problematic Code
class Server < Sinatra::Base
  # Webserver code removed
end

module Handler
  def receive_data data
    $received_data_changed = 1
    $received_data = data
  end
end

$sock = EventMachine.connect_unix_domain("/tmp/mysock.sock", Handler)
Working Code
EventMachine.run do
  class Server < Sinatra::Base
    # Webserver code removed
  end

  module Handler
    def receive_data data
      $received_data_changed = 1
      $received_data = data
    end
  end

  $sock = EventMachine.connect_unix_domain("/tmp/mysock.sock", Handler)

  Server.run!(:port => 4567)
end
Init Script excerpt
DAEMON=/usr/local/bin/thin
SCRIPT_NAME=/etc/init.d/thin
CONFIG_PATH=/etc/thin

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

case "$1" in
  start)
    $DAEMON start --all $CONFIG_PATH
    ;;
Thin Config
---
chdir: /var/www
environment: development
timeout: 30
log: log/thin.log
pid: tmp/pids/thin.pid
max_conns: 1024
max_persistent_conns: 512
require: []
wait: 30
servers: 1
socket: /tmp/thin.server.sock
daemonize: true
Thin is built on top of EventMachine, so I think you should use EventMachine for serving your app. Try to debug further why Thin won't daemonize (what version are you using?). Alternatively, you can run Thin on another port such as 4000 and pass that as the upstream server to your proxy-forwarding server, if that is what you want to achieve.
What I ended up doing was removing the EventMachine.run do ... end and simply enclosing the socket connection in an EM.next_tick{ $sock = EventMachine.connect_unix_domain("/tmp/mysock.sock", Handler) }.
Could swear I tried this once before... but it works now.
EDIT: Idea for next_tick came from here.
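Put together, the config.ru looks roughly like this (a sketch; it assumes Thin has already started the EventMachine reactor by the time the next_tick callback fires):

```ruby
# config.ru (sketch)
require 'sinatra/base'
require 'eventmachine'

class Server < Sinatra::Base
  # Webserver code removed
end

module Handler
  def receive_data(data)
    $received_data_changed = 1
    $received_data = data
  end
end

# Defer the connection until Thin's reactor is running.
EM.next_tick do
  $sock = EventMachine.connect_unix_domain("/tmp/mysock.sock", Handler)
end

run Server
```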
