Message/logging from Thin - ruby

How can I stop Rack Thin from returning initial messages of the following type?
>> Thin web server (v1.3.1 codename Triple Espresso)
>> Maximum connections set to 1024
>> istening on 0.0.0.0:3000, CTRL+C to stop
I am using it like this:
Rack::Handler::Thin.run(Rack::Builder.new do
map("/resource/"){run(Rack::File.new("/"))}
map("/") do
run(->env{
h = Rack::Utils.parse_nested_query(env["QUERY_STRING"])
[200, {},[routine_to_generate_dynamic_content(h)]]
})
end
end, Port: 3000)

Looks like the initial messages are coming from Thin. As per their Github Issue #31, Disabling all logging, you can add Thin::Logging.silent = true before the rest of the code to silence the initial messages.
However, this will also silence all other messages from the Thin adapter. A glance at the source says these other messages will also be silenced:
Waiting for n connection(s) to finish, can take up to n sec, CTRL+C to stop now
Stopping ...
!! Ruby 1.8.5 is not secure please install cgi_multipart_eof_fix:    gem install cgi_multipart_eof_fix
Hope this helps!

Those messages does not come from rack, they come from thin: https://github.com/macournoyer/thin/blob/master/lib/thin/server.rb#L150 You can set the logging preferences according to this: https://github.com/macournoyer/thin/blob/master/lib/thin/logging.rb Thin::Logging.silent = true, but do you really want to silence all? Maybe direct it to a log file instead of stdout?

Related

Why would socket.write hang indefinitely?

What would make a write call to a TCPSocket hang indefinitely?
lotsOfBytes = # a really large number of bytes, like 1 or 2 MB of data
socket = TCPSocket.new # some config
socket.write(lotsOfBytes) # this line hangs
I am trying to debug an issue where a get_multi operation sent to memcached with a large number of keys hangs indefinitely, and it does so on a line that resembles that code snippet. I'm trying to better understand how the low-level sockets on which this library is built are expected to work.
What are the values of following attributes on your TCPSocket:
Keep-alive activated and what value is set?
Timeout set and what value is set?
If you will do a Wireshark dump, it's much better so see what happens before hanging connection.
tcpdump? are there any attempts to send anything?
netstat - for see output queue.
does it work on a small number of bytes in your environment?

How to receive many messages in a batch in the Azure Event Hub via AMQP 1.0

I set up an AMQP 1.0 link with just the path and a filter using the Apache Qpid Electron Go wrapper for Qpid Proton like this:
amqpConnection.Receiver(
// the path containing the consumer group
// and the partition Id
electron.Source("<EVENTHUB_PATH>"),
// the filter map contains some annotations filters
// for the Event Hub offset
electron.Filter(filterMap),
)
I followed this doc to setup the AMQP link options: https://godoc.org/qpid.apache.org/electron#LinkOption
However while running the Go app I've realised it was very slow in fetching messages, so I've added 2 more link options like this:
amqpConnection.Receiver(
electron.Source("<EVENTHUB_PATH>"),
electron.Capacity(100),
electron.Prefetch(true),
electron.Filter(filterMap),
)
but after adding the capacity and the prefetch link options I don't see any improvement in the performance.
I keep receiving approximately 10 messages every ~5 seconds from 4 parallel links (one link per partition).
I've tried to run the app with the environment variable PN_TRACE_RAW=true for the verbose output from Qpid Proton (cf. this: https://qpid.apache.org/releases/qpid-proton-0.18.0/proton/c/api/group__transport.html), but I am not sure
on what should I look for to troubleshoot this issue.
I don't think there is any issue with the Qpid settings, but anyway this is what I see on the terminal:
[0x9fd490]:0 -> #attach(18) [name="<MY_CUSTOM_NAME>",
handle=1, role=true, snd-settle-mode=0, rcv-settle-mode=0, source=#source(40) [address="<MY_CUSTOM_PATH>",
durable=0, expiry-policy=:"link-detach", timeout=0, dynamic=false, filter={:string=#:"apache.org:selector-filter:string"
"amqp.annotation.x-opt-offset > '<MY_CUSTOM_OFFSET>'"}], target=#target(41) [address="",
durable=0, expiry-policy=:"link-detach", timeout=0, dynamic=false], initial-delivery-count=0,
max-message-size=0]
[0x9fd490]:0 -> #flow(19) [next-incoming-id=1, incoming-window=2147483647, next-outgoing-id=1,
outgoing-window=0, handle=1, delivery-count=0, link-credit=100, drain=false]
I also tried to run the Go app in a Azure VM in the same Azure Location as the Event Hub, but no improvement in the performance.
How could I fetch many messages at the same time in the same "round trip"?
I need to process thousands of messages per seconds.
You are correct that you need a prefetch window but the electron client can do a LOT better than that.
I did a quick test with the electron examples from https://github.com/apache/qpid-proton/tree/master/examples/go/electron
I get 3000 msg/sec even without prefetch, and nearly 10000 msgs/sec with.
$ ./broker -qsize 100000 &
Listening on [::]:5672
$ ./send -count 10000 /x ; time ./receive -count 10000 /x
Received all 10000 acknowledgements
Listening on 1 connections
Received 10000 messages
real 0m2.612s
user 0m1.611s
sys 0m0.510s
$ ./send -count 10000 /x ; time ./receive -count 10000 -prefetch 1000 /x
Received all 10000 acknowledgements
Listening on 1 connections
Received 10000 messages
real 0m1.053s
user 0m1.272s
sys 0m0.277s
There is clearly something funny going on - I'd like to help you get to the bottom of it.
PN_TRACE_RAW is a bit too verbose to be helpful, try PN_TRACE_FRM=1 which will give you a more readable summary.
I'm happy to continue the conversation either here or on users#qpid.apache.org if it turns into more of a support case than a question/answer.

How to configure Redis connections with Rails 4, Puma and Sidekiq?

I am using Sidekiq (on Heroku with Puma) to send emails asynchronously and would like to use Redis to keep counters and cache models.
RedisCloud's free plan includes 30 connections to Redis. It is not clear to me how to manage:
redis connections used by Sidekiq
redis connections used in models (caching and counters)
Sidekiq Client size is configured like this:
Sidekiq.configure_client do |config|
config.redis = {url: ENV["REDISCLOUD_URL"], size: 3}
end
If I understood this correctly, Puma forks multiple processes, 2 in my case, which will result in:
2 (Puma Workers) * 3 (size) * 1 (Web Dyno) = 6 connections to redis used to push jobs.
Sidekiq Server
With Sidekiq taking 2 connections (or 5 in version 4), setting a concurrency of 10 would default in a server size of 12 or 15.
If I wanted to use all the remaining available connections (30 - 6 = 24), I could set :
Sidekiq.configure_client do |config|
config.redis = { size: 19 }
end
Total redis connections would be 19 + 5 (Sidekiq 4) = 24, and use the default concurrency of 25 would be ok.
As Mike Perham stated generally the concurrency must not be more than (server pool size - 2) * 2.
Now, where it starts to get confusing for me is the use of Redis out of Sidekiq.
# initializers/redis.rb
$redis = Redis.new(:url => uri)
Whenever I use Redis in a model or controller I call like so:
$redis.hincrby("mycounter", "key", 1)
As I understand it, all the puma threads wait on each other on a single Redis connection when $redis.whateverFunction is called.
In this answer What is the best way to use Redis in a Multi-threaded Rails environment? (Puma / Sidekiq), the recommended approach is using the connection_pool gem, related to the Sidekiq Wiki https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
require 'connection_pool'
$redis = ConnectionPool.new(size: 10) { Redis.new }
If I understand it right, it that case $redis.whateverFunction would have its own connection pool of 10, and sidekiq its own connection workers pool which would now be set out a new total of 20 redis connections ( 30 (available total) - 10 (redis model connections ), and Sidekiq client and server size would need to be changed.
How do you determine the size of the connection pool (here 10) needed for model/controller redis connections? Since Redis is single-threaded, how does increasing the connection pool actually increases redis operations performance?
Any thoughts on this would be of great help.
Thx!
Redis is single-threaded, but written in pure C, uses an event loop inside and handles connections asynchronously, so connection count does not affect it by much provided the same number of requests. It is capable of handling requests faster than your application can generate them because of network delay, ruby being slower than compiled and optimized C, etc, so you do not need to worry about it being single-threaded.
Increasing number of connections is beneficial for concurrent requests from different threads because there's no need to wait for response to be delivered over network to unlock connection, plus ruby can do parallel IOs.
Also you can tell if pool is too small when connection checkout times become worse than you expect/tolerate and corresponding thread/worker is idling while waiting for it, so benchmark your code and have a good look on your actual usage and behavior patterns.
On the other side i'd advise against using all of the connection count limit, there're times when you might need these extra connections. For example:
for graceful/"zero downtime" dyno restarts ("preboot") you need twice the connections, since old processes are still running for some time
keep at least one free connection for emergency debug as you may want to be able to connect from console/directly and see what data is inside when some unexpected highload comes

Ruby - Elastic Search & RabbitMQ - data import being lost, script crashing silently

Stackers
I have a lot of messages in a RabbitMQ queue (running on localhost in my dev environment). The payload of the messages is a JSON string that I want to load directly into Elastic Search (also running on localhost for now). I wrote a quick ruby script to pull the messages from the queue and load them into ES, which is as follows :
#! /usr/bin/ruby
require 'bunny'
require 'json'
require 'elasticsearch'
# Connect to RabbitMQ to collect data
mq_conn = Bunny.new
mq_conn.start
mq_ch = mq_conn.create_channel
mq_q = mq_ch.queue("test.data")
# Connect to ElasticSearch to post the data
es = Elasticsearch::Client.new log: true
# Main loop - collect the message and stuff it into the db.
mq_q.subscribe do |delivery_info, metadata, payload|
begin
es.index index: "indexname",
type: "relationship",
body: payload
rescue
puts "Received #{payload} - #{delivery_info} - #{metadata}"
puts "Exception raised"
exit
end
end
mq_conn.close
There are around 4,000,000 messages in the queue.
When I run the script, I see a bunch of messages, say 30, being loaded into Elastic Search just fine. However, I see around 500 messages leaving the queue.
root#beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4333080
...done.
root#beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4332580
...done.
The script then silently exits without telling me an exception. The begin/rescue block never triggers an exception so I don't know why the script is finishing early or losing so many messages. Any clues how I should debug this next.
A
I've added a simple, working example here:
https://github.com/elasticsearch/elasticsearch-ruby/blob/master/examples/rabbitmq/consumer-publisher.rb
It's hard to debug your example when you don't provide examples of the test data.
The Elasticsearch "river" feature is deprecated, and will be removed, eventually. You should definitely invest time into writing your own custom feeder, if RabbitMQ and Elasticsearch are a central part of your infrastructure.
Answering my own question, I then learned that this is a crazy and stupid way to load a message queue of index instructions into Elastic. I created a river and can drain instructions much faster than I could with a ropey script. ;-)

How do I absolutely ensure that a Phusion Passenger instance stays alive?

I'm having a problem where no matter what I try all Passenger instances are destroyed after an idle period (5 minutes, but sometimes longer). I've read the Passenger docs and related questions/answers on Stack Overflow.
My global config looks like this:
PassengerMaxPoolSize 6
PassengerMinInstances 1
PassengerPoolIdleTime 300
And my virtual config:
PassengerMinInstances 1
The above should ensure that at least one instance is kept alive after the idle timeout. I'd like to avoid setting PassengerPoolIdleTime to 0 as I'd like to clean up all but one idle instance.
I've also added the ruby binary to my CSF ignore list to prevent the long running process from being culled.
Is there somewhere else I should be looking?
Have you tried to set the PassengerMinInstances to anything other than 1 like 3 and see that work?
Ok, I found the answer for you on this link: http://groups.google.com/group/phusion-passenger/browse_thread/thread/7557f8ef0ff000df/62f5c42aa1fe5f7e . Look at the last comment by Phusion guy.
Is there a way to ensure that I always have 10 processes up and
running, and that each process only serves 500 requests before being
shut down?
"Not at this time. But the current behavior is such that the next time
it determines that more processes need to be spawned it will make sure
L at least PassengerMinInstances processes exist."
I have to say their documentation doesn't seem to match what the current behavior.
This seems to be quite a common problem for people running Apache on WHM/cPanel:
http://techiezdesk.wordpress.com/2011/01/08/apache-graceful-restart-requested-every-two-hours/
Enabling piped logging sorted the problem out for me.

Resources