I'm experiencing an unresponsive socket with my Puma setup after a random amount of time. Up to this point I don't have a clue what's causing the issue, so I was hoping somebody here can help me with some answers or point me in the right direction. This is my setup:
I'm using the official Docker ruby-2.2.3-slim image together with the latest Puma release, 2.15.3, and I've installed Nginx as a reverse proxy. I'm already fairly sure Nginx isn't the problem here: I tried to verify whether the socket was working using this script, and the socket wasn't responding there either (I got a timeout), so I can rule Nginx out.
This is a testing environment, so the server isn't under any extreme load. I've also checked memory consumption; there are still several GB free, so that shouldn't be the issue either.
What made me look at the Puma socket was the error message in my Nginx error log:
upstream timed out (110: Connection timed out) while reading response header from upstream
I also couldn't find anything in Puma's logs indicating what is going wrong. Here is my Puma config:
threads 0, 16
app_dir = ENV.fetch('APP_HOME')
environment ENV['RAILS_ENV']
daemonize
bind "unix://#{app_dir}/sockets/puma.sock"
stdout_redirect "#{app_dir}/log/puma.stdout.log", "#{app_dir}/log/puma.stderr.log", true
pidfile "#{app_dir}/pids/puma.pid"
state_path "#{app_dir}/pids/puma.state"
activate_control_app
on_worker_boot do
  require 'active_record'
  ActiveRecord::Base.connection.disconnect! rescue ActiveRecord::ConnectionNotEstablished
  ActiveRecord::Base.establish_connection(YAML.load_file("#{app_dir}/config/database.yml")[ENV['RAILS_ENV']])
end
And this is the content of my Puma state file:
---
pid: 43
config: !ruby/object:Puma::Configuration
  cli_options:
  conf:
  options:
    :min_threads: 0
    :max_threads: 16
    :quiet: false
    :debug: false
    :binds:
    - unix:///APP/sockets/puma.sock
    :workers: 1
    :daemon: true
    :mode: :http
    :before_fork: []
    :worker_timeout: 60
    :worker_boot_timeout: 60
    :worker_shutdown_timeout: 30
    :environment: staging
    :redirect_stdout: "/APP/log/puma.stdout.log"
    :redirect_stderr: "/APP/log/puma.stderr.log"
    :redirect_append: true
    :pidfile: "/APP/pids/puma.pid"
    :state: "/APP/pids/puma.state"
    :control_url: unix:///tmp/puma-status-1449260516541-37
    :config_file: config/puma.rb
    :control_url_temp: "/tmp/puma-status-1449260516541-37"
    :control_auth_token: cda8879717be7a645ea323d931b88d4b
    :tag: APP
The application itself is a Rails app on the latest version, 4.2.5, deployed on Google Container Engine.
Any pointers on how to debug this further would be very much appreciated, because right now I don't see any output anywhere that could help me.
EDIT
I replaced the Unix socket with a TCP connection to Puma, with the same result: it still hangs after some amount of time.
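For reference, switching to TCP was a one-line change in the Puma config (the port here is just an example I picked):
# bind "unix://#{app_dir}/sockets/puma.sock"
bind "tcp://0.0.0.0:3000"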
I'd start with:
How many requests get processed successfully per Puma instance?
Make sure you log the beginning and end of each request together with the id of the thread executing it (see the middleware sketch below); what do you see?
Not knowing more about your application, I'd say it's likely that the threads get stuck in long or blocking calls without timeouts, or spin on some computation, until the whole thread pool is depleted.
We'll see.
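For the per-request logging suggested above, here is a minimal Rack middleware sketch (the class and file names are mine, not anything Puma or Rails ship with):
# app/middleware/request_thread_logging.rb (hypothetical)
class RequestThreadLogging
  def initialize(app)
    @app = app
  end

  def call(env)
    thread_id = Thread.current.object_id
    started   = Time.now
    Rails.logger.info "[thread #{thread_id}] START #{env['REQUEST_METHOD']} #{env['PATH_INFO']}"
    @app.call(env)
  ensure
    Rails.logger.info "[thread #{thread_id}] END   #{env['PATH_INFO']} after #{(Time.now - started).round(3)}s"
  end
end

# and in config/application.rb:
# config.middleware.insert_before 0, RequestThreadLogging
If requests log a START line but never an END line, you know which thread is stuck and on which path.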
I finally found out why my application was behaving the way it was.
After trying a TCP connection and switching to Unicorn, I started looking into other possible causes.
That's when I thought my connection to Google Cloud SQL might be the problem. The Cloud SQL FAQ mentions that you have to tweak your Compute Engine instances to make sure they keep your DB connections open. So I performed the steps they recommend, and that solved the problem for me. I've added them here just in case:
# Display the current tcp_keepalive_time value.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
# Set tcp_keepalive_time to 60 seconds and make it permanent across reboots.
$ echo 'net.ipv4.tcp_keepalive_time = 60' | sudo tee -a /etc/sysctl.conf
# Apply the change.
$ sudo /sbin/sysctl --load=/etc/sysctl.conf
# Display the tcp_keepalive_time value to verify the change was applied.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
The situation is this: I want to establish a QUIC connection based on quic-go from my local machine to an ECS server. The preliminary tests using localhost succeed on both the local and the remote device, that is:
#local: .$QUIC-GO-PATH/example/client/main -insecure -keylog ssl.log -qlog trial.log -v https://127.0.0.1:6121/demo/tile
#local: .$QUIC-GO-PATH/example/main -qlog -tcp -v
These tests complete successfully.
Now the problem: when I start a local-to-remote connection, an error occurs:
#remote: .$QUIC-GO-PATH/example/main -qlog -tcp -v
#local: .$QUIC-GO-PATH/example/client/main -insecure -keylog ssl.log -qlog trial.log -v https://$REMOTE_IPADDR:6121/demo/tile
timeout: no recent network activity
When I go through a Wireshark capture, it looks like the CRYPTO handshake never finishes:
Wireshark
The client qlog file is also attached here:
Qlog file
The code is unchanged from https://github.com/lucas-clemente/quic-go
Help!
This problem has been solved.
The example server in $QUIC-GO-PATH/example/main.go binds to 127.0.0.1:6121 by default, which means the server cannot be reached by a client from outside the machine. The fix is to bind it explicitly when starting the server:
-bind 0.0.0.0:6121
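So on my setup the remote invocation becomes (same flags as before, plus the bind address from the example's own source):
#remote: .$QUIC-GO-PATH/example/main -qlog -tcp -v -bind 0.0.0.0:6121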
I was happily following this advice on how to run a pry debugger inside my Phoenix controller tests:
require IEx in the target file
add IEx.pry to the desired line
run the tests inside IEx: iex -S mix test --trace
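(For illustration, a minimal sketch of what that looks like in a controller test; the module, route and assertion are made-up placeholders:)
# test/controllers/user_controller_test.exs (hypothetical)
defmodule Rumbl.UserControllerTest do
  use Rumbl.ConnCase
  require IEx                # step 1: require IEx in the target file

  test "index stops at the breakpoint", %{conn: conn} do
    conn = get conn, "/users"
    IEx.pry                  # step 2: execution pauses here when run via: iex -S mix test --trace
    assert html_response(conn, 200)
  end
end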
But after a few seconds this error always appeared:
16:51:08.108 [error] Postgrex.Protocol (#PID<0.250.0>) disconnected:
** (DBConnection.ConnectionError) owner #PID<0.384.0> timed out because
it owned the connection for longer than 15000ms
As the message says, the database connection appears to time out at this point and any commands that invoke the database connection will error out with a DBConnection.OwnershipError. How do I tell my database connection not to time out so I can debug my tests in peace?
The Ecto.Adapters.SQL.Sandbox FAQ mentions this issue and explains that you can add the :ownership_timeout setting to your Repo config to specify how long db connections should stay open before timing out. I set mine to 10 minutes (test environment only) so I never have to think about it again:
# config/test.exs
config :rumbl, Rumbl.Repo,
  # ...other settings...
  ownership_timeout: 10 * 60 * 1000 # long timeout so pry sessions don't break
As expected, I can now fool around in pry for 10 minutes before seeing that error.
When I run the following from Windows 7 with Cygwin to connect to a CFEngine 3.4.2 server:
cf-agent -Bs 217.64.173.210
Challenge response from server 217.64.173.210/217.64.173.210 was incorrect!
I: Made in version 'not specified' of '/var/cfengine/inputs/update.cf' near line 47
!! Authentication dialogue with 217.64.173.210 failed
Challenge response from server 217.64.173.210/217.64.173.210 was incorrect!
I: Made in version 'not specified' of '/var/cfengine/inputs/update.cf' near line
and line 47 of /var/cfengine/inputs/update.cf is:
47 : perms => m("600"),
On Cygwin, in the keys folder:
/var/cfengine/ppkeys
localhost.pub
localhost.priv
root-MD5=b8825ba0a0e7017e34b15766d3b3ac58 (which is also the shared key on the CFEngine server side)
On the CFEngine server side:
/var/cfengine/ppkeys/
localhost.priv
localhost.pub
root-MD5=b8825ba0a0e7017e34b15766d3b3ac58
With Regards
Sandeep
Did you also get the server to trust the client's key? Like so:
cf-key -t root-MD5=b8825ba0a0e7017e34b15766d3b3ac58
(on the server)
Also, try restarting cf-serverd in verbose mode with the -v switch on the server, and watch what error messages you get on that end.
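Something along these lines on the server (exactly how you stop and start cf-serverd depends on your init setup, so treat this as a sketch):
# trust the client's key
cf-key -t root-MD5=b8825ba0a0e7017e34b15766d3b3ac58

# stop the running daemon, then run it in the foreground with verbose output and watch for errors
cf-serverd --no-fork --verbose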
I want to use SSL with MongoDB. It's not enabled by default so one has to compile from source with the necessary options. I followed the official documentation and got the v2.6.4 binary built and running nicely on a freshly deployed server running Ubuntu 14.04. All good so far.
Next I set up mongod as described in the official docs, following their example of using a self-signed certificate for testing purposes. The relevant part of the config looks like:
...
net:
  bindIp: 127.0.0.1
  port: 27017
  ssl:
    mode: requireSSL
    PEMKeyFile: /opt/mongo/security/mongodb.pem
...
If I then run the client and tell it to use SSL, I connect fine ($ mongo --ssl). FWIW, if I try without the --ssl argument it doesn't connect.
OK, time to connect via Ruby. On the same server I try the following Ruby script:
require 'rubygems'
require 'mongo'
client = Mongo::MongoClient.new('localhost', 27017, {:ssl => true})
Nope. It's not having it:
/home/test/.rvm/gems/ruby-1.9.3-p547/gems/mongo-1.11.1/lib/mongo/mongo_client.rb:422:in `connect': Failed to connect to a master node at localhost:27017 (Mongo::ConnectionFailure)
from /home/test/.rvm/gems/ruby-1.9.3-p547/gems/mongo-1.11.1/lib/mongo/mongo_client.rb:661:in `setup'
from /home/test/.rvm/gems/ruby-1.9.3-p547/gems/mongo-1.11.1/lib/mongo/mongo_client.rb:177:in `initialize'
from test_mongo_ssl.rb:8:in `new'
from test_mongo_ssl.rb:8:in `<main>'
So, best to make sure there's nothing wrong with the default connection without SSL. I disable SSL on mongod and restart, then try the Ruby script again, this time without the ssl option:
...
client = Mongo::MongoClient.new('localhost', 27017)
And it's fine. Therefore I feel I've narrowed it down to the Ruby driver and SSL, but beyond that there's little else to go on.
EDIT: I tried the Python driver on the same server, using their example program:
from pymongo import MongoClient
c = MongoClient(host="localhost", port=27017, ssl=True)
And that did connect OK. So at least I can feel fairly confident that the mongod is configured properly and the issue lies somewhere within the Mongo Ruby driver. Quite possibly a bug in their current driver (v1.11.1).
UPDATE: I've also had success connecting via SSL using the Node.js driver:
var mongo = require('mongodb');
var database = new mongo.Db("my_database", new mongo.Server("127.0.0.1", 27017, {ssl: true}), {w: 0});
database.open(function(err, db) {
  if (err) throw err;
  db.authenticate('user', 'password', function(err, result) {
    var collection = db.collection('foo');
    collection.findOne(function(err, item) {
      if (err) throw err;
      console.log(item);
      db.close();
    });
  });
});
It therefore seems increasingly likely that there's either a bug in the Ruby driver, or the documentation is incomplete and doesn't accurately explain how to use SSL connections. I've opened a new issue on MongoDB's issue tracker to hopefully get to the bottom of this.
Rather embarrassingly, the solution to this issue was that my /etc/hosts file had a typo in the localhost entry:
127.0.0.1 localhost.localdomain locahost
As you can see, it's missing the second "l" in "localhost". (I suspect it went missing during an accidental vim gesture.) To resolve it, I just had to reinstate the missing "l":
127.0.0.1 localhost.localdomain localhost
It's still a mystery why the Python sample worked correctly, and that's why I didn't twig earlier that the problem was with the hosts file.
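For anyone hitting similar symptoms: connecting by IP address bypasses /etc/hosts entirely, so a quick variant of the script above would likely have pointed at name resolution much sooner:
require 'rubygems'
require 'mongo'

# same call as before, but by IP so /etc/hosts name resolution is taken out of the picture
client = Mongo::MongoClient.new('127.0.0.1', 27017, :ssl => true)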
I downloaded gsoap 2.8, went into the samples folder and ran make. Everything seems to have built fine. I then navigated into the "ssl" folder and ran sslserver in one xterm and sslclient in a second xterm window (I am running RHEL 6). The server seems to run fine; it says "Bind successful: socket = 4". But when I run the client I receive the following message:
Error -1 fault: SOAP-ENV:Client [no subcode]
"End of file or no input: Operation interrupted or timed out (30 s receive delay) (30 s send delay)"
Detail: [no detail]
I have not modified any of the sample code, so it seems like it should just work. Can anyone please give me some advice as to what I should look at? I am trying to learn how to set up a SOAP server that uses SSL (I already have a gsoap server running). I searched all day for an example on the web and, as usual, there isn't one.
Thank you so much for any help.
You could rebuild this example with the compiler switch -DDEBUG to enable message logging (make 'sslclient_CFLAGS = -DWITH_OPENSSL -DWITH_GZIP -DDEBUG'). The TEST.log file will tell you what went wrong. I suspect it is a network issue with the server address/port, which is set by default to "https://localhost:18081".
You could also increase the timeout parameter: soap.recv_timeout = 60 (for 60 seconds).
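If you want to try that, the timeouts are plain members of struct soap on the client side; a minimal sketch (the SSL context setup and the actual service call stay exactly as in the sslclient sample):
#include "soapH.h"            /* header generated by soapcpp2 for the sample */

struct soap soap;
soap_init(&soap);
soap.connect_timeout = 60;    /* give up connecting after 60 s */
soap.send_timeout    = 60;    /* raise the default 30 s send timeout    */
soap.recv_timeout    = 60;    /* raise the default 30 s receive timeout */
/* ... then set up the SSL client context and issue the call exactly as the sample does ... */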