Ruby TCPServer performance issue - ruby

I am encountering an interesting issue with Ruby TCPServer, where once a client connects, it continually uses more and more CPU processing power until it hits 100% and then the entire system starts to bog down and can't process incoming data.
The processing class that is having an issue is designed to be a TCP Client that receives data from an embedded system, processes it, then returns the processed data to be further used (either by other similar data processors, or output to a user).
In this particular case, there is an external piece of code that would like this processed data, but cannot access it from the main parent code (the thing that the original process class is returning its data to). This external piece may or may not be connected at any point while it is running.
To solve this, I set up a Thread with a TCPServer, and the processing class continually adds to a queue, and the Thread pulls from the queue and sends it to the client.
It works great, except for the performance issues. I am curious if I have something funky going on in my code, or if it's just the nature of this methodology and it will never be performant enough to work.
Thanks in advance for any insight/suggestions with this problem!
Here is my code/setup, with some test helpers:
process_data.rb
require 'socket'
class ProcessData
  def initialize
    super
    @queue = Queue.new
    @client_active = false
    Thread.new do
      # Waiting for connection
      @server = TCPServer.open('localhost', 5000)
      loop do
        Thread.start(@server.accept) do |client|
          puts 'Client connected'
          # Connection established
          @client_active = true
          begin
            # Continually attempt to send data to client
            loop do
              unless @queue.empty?
                # If data exists, send it to client
                begin
                  until @queue.empty?
                    client.puts(@queue.pop)
                  end
                rescue Errno::EPIPE => error
                  # Client disconnected
                  client.close
                end
              end
              sleep(1)
            end
          rescue IOError => error
            # Client disconnected
            @client_active = false
          end
        end # Thread.start(@server.accept)
      end # loop do
    end # Thread.new do
  end
  def read(data)
    # Data comes in from embedded system on this method
    # Do some processing
    processed_data = data.to_i + 5678
    # Ready to send data to external client
    if @client_active
      @queue << processed_data
    end
    return processed_data
  end
end
test_embedded_system.rb (source of the original data)
require 'socket'
@data = '1234'*100000 # Simulate lots of data coming in
embedded_system = TCPServer.open('localhost', 5555)
client_connection = embedded_system.accept
loop do
  client_connection.puts(@data)
  sleep(0.1)
end
parent.rb (this is what will create/call the ProcessData class)
require_relative 'process_data'
processor = ProcessData.new
loop do
  begin
    s = TCPSocket.new('localhost', 5555)
    while data = s.gets
      processor.read(data)
    end
  rescue => e
    sleep(1)
  end
end
random_client.rb (wants data from ProcessData)
require 'socket'
loop do
  begin
    s = TCPSocket.new('localhost', 5000)
    while processed_data = s.gets
      puts processed_data
    end
  rescue => e
    sleep(1)
  end
end
To run the test on Linux, open three terminal windows:
Window 1: ./test_embedded_system.rb
Window 2: ./parent.rb (CPU usage is stable)
Window 3: ./random_client.rb (CPU usage continually grows)

I ended up figuring out what the issue was, and unfortunately I led folks astray with my example.
It turns out my example didn't quite reproduce the issue I was having; the main difference was that the sleep(1) was not in my real version of process_data.rb.
That sleep is actually incredibly important, because it is inside of a loop do. Without the sleep, the Thread never blocks, so it won't voluntarily yield the GVL and will spin through the loop, continually eating up CPU resources.
Essentially, it was unrelated to TCP, and had more to do with Threads and loops.
If you stumble on this question later on, you can put a sleep(0) in your loops if you don't want the loop to wait but you do want it to yield the GVL.
Check out these answers as well for more info:
Ruby infinite loop causes 100% cpu load
sleep 0 has special meaning?
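Another option, instead of polling the queue and sleeping, is to let the sender thread block on Queue#pop, which parks the thread until an item arrives. The following is only a minimal sketch under the assumption that the sender has nothing else to do between items; the `received` array and the `:done` sentinel are illustrative names, not part of the original code.

```ruby
queue = Queue.new
received = []

# Consumer: Queue#pop blocks while the queue is empty, so the thread
# sleeps instead of spinning through an empty loop.
consumer = Thread.new do
  loop do
    item = queue.pop
    break if item == :done # hypothetical sentinel used to stop the loop
    received << item
  end
end

# Producer: push a few processed values, then the sentinel.
[5679, 5680, 5681].each { |value| queue << value }
queue << :done

consumer.join
puts received.inspect
```

In process_data.rb the same idea would replace the `unless @queue.empty?` / `sleep(1)` pair with a single blocking `client.puts(@queue.pop)` inside the loop.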

Related

Writing a simple circuit breaker with thread support

I'm looking to extend the simple circuit breaker written in Ruby to work across multiple threads...
So far I have managed to come up with something like this:
## The following is a simple circuit breaker implementation with thread support.
## https://github.com/soundcloud/simple_circuit_breaker/blob/master/lib/simple_circuit_breaker.rb
class CircuitBreaker
  class Error < StandardError
  end
  def initialize(retry_timeout=10, threshold=30)
    @mutex = Mutex.new
    @retry_timeout = retry_timeout
    @threshold = threshold
    reset!
  end
  # The block has to be forwarded explicitly; a bare `execute` call
  # would make the `yield` inside it fail with "no block given".
  def handle(&block)
    if tripped?
      raise CircuitBreaker::Error.new('circuit opened')
    else
      execute(&block)
    end
  end
  def execute
    result = yield
    reset!
    result
  rescue Exception => exception
    fail!
    raise exception
  end
  def tripped?
    opened? && !timeout_exceeded?
  end
  def fail!
    @mutex.synchronize do
      @failures += 1
      if @failures >= @threshold
        @open_time = Time.now
        @circuit = :opened
      end
    end
  end
  def opened?
    @circuit == :opened
  end
  def timeout_exceeded?
    @open_time + @retry_timeout < Time.now
  end
  def reset!
    @mutex.synchronize do
      @circuit = :closed
      @failures = 0
    end
  end
end
http_circuit_breaker = CircuitBreaker.new
http_circuit_breaker.handle { make_http_request }
but I'm not sure about a few things.
Multithreaded code has always puzzled me, so I'm not confident enough to say that this approach is correct.
Read operations are not under the mutex:
The write operations are guarded by the mutex (which, I think, ensures that two threads never race while updating the state), but the read operations are mutex free. So there can be a scenario where thread 1 holds the mutex while changing the @circuit or @failures variable, and another thread reads a stale value in the meantime.
I can't work out whether achieving full consistency (by also locking the reads) is worth the trade-off here: consistency might be 100%, but the code would run a bit slower because of the extra locking.
It's unclear what you are asking, so I guess your post will be closed.
Nevertheless, I think that the only thread-safe way to implement a circuit breaker would be to have the mutex around all data operations, which would result in a sequential flow, so it's basically useless.
Otherwise you will have race conditions like:
thread-a starts (server does not respond immediately due to network issues)
thread-b starts (10 seconds later)
thread-b finishes all good
thread-a aborts due to a timeout -> opens circuit with stale data
A version that is mentioned on Martin Fowler's blog is a circuit breaker in combination with a thread pool: https://martinfowler.com/bliki/CircuitBreaker.html
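One possible middle ground is to guard only the breaker's own state (reads as well as writes) with the mutex and to run the protected call outside the lock, so callers are not serialized. The following is a minimal sketch under that assumption; the names LockedCircuitBreaker, OpenError, record_failure and record_success are illustrative and do not come from the original gem. It removes the stale reads of the counters, but it does not remove the race listed above, where a slow call reports its failure late.

```ruby
class LockedCircuitBreaker
  class OpenError < StandardError; end

  def initialize(retry_timeout = 10, threshold = 30)
    @mutex = Mutex.new
    @retry_timeout = retry_timeout
    @threshold = threshold
    @failures = 0
    @open_time = nil
  end

  def handle
    raise OpenError, 'circuit opened' if tripped?
    begin
      result = yield # the protected call runs outside the lock
    rescue StandardError
      record_failure
      raise
    end
    record_success
    result
  end

  # Reads go through the same mutex as writes, so no stale values.
  def tripped?
    @mutex.synchronize do
      !@open_time.nil? && Time.now < @open_time + @retry_timeout
    end
  end

  private

  def record_failure
    @mutex.synchronize do
      @failures += 1
      @open_time = Time.now if @failures >= @threshold
    end
  end

  def record_success
    @mutex.synchronize do
      @failures = 0
      @open_time = nil
    end
  end
end

breaker = LockedCircuitBreaker.new(10, 30)
puts breaker.handle { 'response body' }
```

The lock is held only for a few assignments and comparisons, so the cost per call should be small compared with an HTTP request.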

Handle exceptions in concurrent-ruby thread pool

How to handle exceptions in concurrent-ruby thread pools (http://ruby-concurrency.github.io/concurrent-ruby/file.thread_pools.html)?
Example:
pool = Concurrent::FixedThreadPool.new(5)
pool.post do
raise 'something goes wrong'
end
# how to rescue this exception here
Update:
Here is simplified version of my code:
def process
  pool = Concurrent::FixedThreadPool.new(5)
  products.each do |product|
    new_product = generate_new_product
    pool.post do
      store_in_db(new_product) # here exception is raised, e.g. connection to db failed
    end
  end
  pool.shutdown
  pool.wait_for_termination
end
So what I want to achieve is to stop processing (break the loop) in case of any exception.
This exception is also rescued at a higher level of the application, where some cleanup jobs are executed (like setting the state of the model to failure and sending some notifications).
The following answer is from jdantonio, taken from https://github.com/ruby-concurrency/concurrent-ruby/issues/616
"
Most applications should not use thread pools directly. Thread pools are a low-level abstraction meant for internal use. All of the high-level abstractions in this library (Promise, Actor, etc.) all post jobs to the global thread pool and all provide exception handling. Simply pick the abstraction that best fits your use case and use it.
If you feel the need to configure your own thread pool rather than use the global thread pool, you can still use the high-level abstractions. They all support an :executor option which allows you to inject your custom thread pool. You can then use the exception handling provided by the high-level abstraction.
If you absolutely insist on posting jobs directly to a thread pool rather than using our high-level abstractions (which I strongly discourage) then just create a job wrapper. You can find examples of job wrappers in all our high-level abstractions, Rails ActiveJob, Sucker Punch, and other libraries which use our thread pools."
So how about an implementation with Promises?
http://ruby-concurrency.github.io/concurrent-ruby/Concurrent/Promise.html
In your case it would look something like this:
promises = []
products.each do |product|
  new_product = generate_new_product
  promises << Concurrent::Promise.execute do
    store_in_db(new_product)
  end
end
# .value will wait for the Thread to finish.
# The ! means that all exceptions will be propagated to the main thread.
# .zip will make one Promise which contains all other promises.
Concurrent::Promise.zip(*promises).value!
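For readers wondering what the "job wrapper" mentioned in the quoted answer might look like, here is a rough stdlib-only sketch. It is not concurrent-ruby's API: the JobWrapper class is hypothetical, and a couple of plain threads draining a Queue stand in for the pool, so that the example runs without the gem.

```ruby
# Hypothetical wrapper: runs a job and records any error instead of
# letting it vanish inside a worker thread.
class JobWrapper
  def initialize(errors, &job)
    @errors = errors # thread-safe Queue shared by all jobs
    @job = job
  end

  def call
    @job.call
  rescue StandardError => e
    @errors << e
  end
end

errors = Queue.new
jobs = Queue.new
jobs << JobWrapper.new(errors) { :stored }
jobs << JobWrapper.new(errors) { raise 'connection to db failed' }
jobs.close # no more jobs; pop returns nil once the queue is drained

# Stand-in for a pool: a fixed number of threads draining one queue.
workers = Array.new(2) do
  Thread.new do
    while (job = jobs.pop)
      job.call
    end
  end
end
workers.each(&:join)

first_error = errors.empty? ? nil : errors.pop
puts "first error: #{first_error.message}" if first_error
```

With a real pool, the same wrapper could presumably be handed over with `pool.post { wrapper.call }`, and the caller would re-raise the first collected error after the pool has terminated.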
There may be a better way, but this does work. You will want to change the error handling within wait_for_pool_to_finish.
def process
  pool = Concurrent::FixedThreadPool.new(10)
  errors = Concurrent::Array.new
  10_000.times do
    pool.post do
      begin
        # do the work
      rescue StandardError => e
        errors << e
      end
    end
  end
  wait_for_pool_to_finish(pool, errors)
end
private
def wait_for_pool_to_finish(pool, errors)
  pool.shutdown
  until pool.shutdown?
    if errors.any?
      pool.kill
      fail errors.first
    end
    sleep 1
  end
  pool.wait_for_termination
end
I've created issue #634. The concurrent-ruby thread pool can support an abortable worker without any problems.
require "concurrent"
Concurrent::RubyThreadPoolExecutor.class_eval do
  # Inspired by "ns_kill_execution".
  def ns_abort_execution aborted_worker
    @pool.each do |worker|
      next if worker == aborted_worker
      worker.kill
    end
    @pool = [aborted_worker]
    @ready.clear
    stopped_event.set
    nil
  end
  def abort_worker worker
    synchronize do
      ns_abort_execution worker
    end
    nil
  end
  def join
    shutdown
    # We should wait for stopped event.
    # We couldn't use timeout.
    stopped_event.wait nil
    @pool.each do |aborted_worker|
      # Rubinius could receive an error from aborted thread's "join" only.
      # MRI Ruby doesn't care about "join".
      # It will receive error anyway.
      # We can "raise" error in aborted thread and then "join" it from this thread.
      # We can "join" aborted thread from this thread and then "raise" error in aborted thread.
      # The order of "raise" and "join" is not important. We will receive target error anyway.
      aborted_worker.join
    end
    @pool.clear
    nil
  end
  class AbortableWorker < self.const_get :Worker
    def initialize pool
      super
      @thread.abort_on_exception = true
    end
    def run_task pool, task, args
      begin
        task.call *args
      rescue StandardError => error
        pool.abort_worker self
        raise error
      end
      pool.worker_task_completed
      nil
    end
    def join
      @thread.join
      nil
    end
  end
  self.send :remove_const, :Worker
  self.const_set :Worker, AbortableWorker
end
class MyError < StandardError; end
pool = Concurrent::FixedThreadPool.new 5
begin
  pool.post do
    sleep 1
    puts "we shouldn't receive this message"
  end
  pool.post do
    puts "raising my error"
    raise MyError
  end
  pool.join
rescue MyError => error
  puts "received my error, trace: \n#{error.backtrace.join("\n")}"
end
sleep 2
Output:
raising my error
received my error, trace:
...
This patch works fine for any version of MRI Ruby and Rubinius. JRuby is not working and I don't care. Please patch JRuby executor if you want to support it. It should be easy.

Running a loop (such as one for a mock webserver) within a thread

I'm trying to run a mock webserver within a thread within a class. I've tried passing the class's @server property to the thread block, but as soon as I try to do server.accept the thread stops. Is there some way to make this work? I basically want to be able to run a webserver off of this script while still taking user input via STDIN.gets. Is this possible?
class Server
  def initialize()
    @server = TCPServer.new(8080)
  end
  def run()
    @thread = Thread.new(@server) { |server|
      while true
        newsock = server.accept
        puts "some stuff after accept!"
        next if !newsock
        # some other stuff
      end
    }
  end
end
def processCommand()
  # some user commands here
end
test = Server.new
while true do
  processCommand(STDIN.gets)
end
In the above sample, the thread dies on server.accept
In the code you posted, you're not calling Server#run. That's probably just an oversight in making the post. Server.accept is supposed to block a thread, returning only when someone has connected.
Anyone who goes into writing an HTTP server with bright eyes soon learns that it's more fun to let someone else do that work. For quick and dirty HTTP servers, I've got good results enlisting the aid of WEBrick. It's a part of the Ruby library (bundled up to Ruby 2.7; from Ruby 3.0 on it is a separate gem). Here's a WEBrick server that will serve up "Boo!" when you connect your browser to localhost:8080/:
#!/usr/bin/ruby1.8
require 'webrick'
class MiniServer
  def initialize
    Thread.new do
      Thread::abort_on_exception = true
      server = WEBrick::HTTPServer.new(:BindAddress=>'127.0.0.1',
                                       :Port=>8080,
                                       :Logger=>WEBrick::Log.new('/dev/stdout'))
      server.mount('/', Servlet, self)
      server.start
    end
  end
  private
  class Servlet < WEBrick::HTTPServlet::AbstractServlet
    def initialize(webrick_server, mini_server)
    end
    def do_GET(req, resp)
      resp.body = "<html><head></head><body>Boo!</body></html>"
    end
    alias :do_POST :do_GET
  end
end
server = MiniServer.new
gets
I don't know ruby, but it looks like server.accept is blocking until you get a tcp connection... your thread will continue as soon as a connection is accepted.
You should start the server in your main thread and then spawn a new thread for each connection that you accept, that way your server will immediately go to accept another connection and your thread will service the one that was just accepted.
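The thread-per-connection idea can be sketched as follows. This is only an illustration under a few assumptions: the accept loop runs in a background thread (so the main thread stays free for STDIN, as the question wants), port 0 lets the OS pick a free port instead of 8080, and the main thread plays the client so that the example runs on its own.

```ruby
require 'socket'

# Port 0 asks the OS for any free port (an assumption for this sketch;
# the question uses 8080).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

# Accept loop: accept blocks only this thread, then each accepted socket
# is handed to a thread of its own.
acceptor = Thread.new do
  loop do
    Thread.start(server.accept) do |client|
      client.puts 'hello from the server'
      client.close
    end
  end
end

# The main thread is free to do other work; here it acts as a client.
socket = TCPSocket.new('127.0.0.1', port)
reply = socket.gets
socket.close
puts reply

acceptor.kill
server.close
```

In the original script the client part would be replaced by the `processCommand(STDIN.gets)` loop.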

Script stops while waiting for user input from STDIN.gets

I'm trying to do something like this, where I have two loops going in separate threads. The problem I am having is that in the main thread, when I use gets and the script is waiting for user input, the other thread is stopped to wait as well.
class Server
  def initialize
    @server = TCPServer.new(8080)
    run
  end
  def run
    @thread = Thread.new(@server) { |server|
      while true
        newsock = server.accept
        puts "some stuff after accept!"
        next if !newsock
        # some other stuff
      end
    }
  end
end
def processCommand
  # some user commands here
end
test = Server.new
while true do
  processCommand(STDIN.gets)
end
The above is just a sample of what I want to do.
Is there a way to make the main thread block while waiting for user input?
You might want to take a look at the select method of the IO class, which can be used to handle asynchronous input. Depending on which version of Ruby you're using, you might have issues with STDIN though; I'm pretty sure it always triggers the select in 1.8.6.
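A small sketch of how IO.select behaves, assuming a pipe as a stand-in for STDIN so the example can run unattended (the `idle`, `ready` and `command` names are illustrative):

```ruby
# IO.select waits until one of the given IOs is readable or the timeout
# (in seconds) expires, in which case it returns nil.
reader, writer = IO.pipe

# Nothing has been written yet, so select times out and returns nil.
idle = IO.select([reader], nil, nil, 0.1)

# Once a line is available, select returns the readable IOs.
writer.puts 'quit'
ready = IO.select([reader], nil, nil, 0.1)
command = ready ? ready[0].first.gets.chomp : nil

puts "idle=#{idle.inspect} command=#{command.inspect}"
```

In a real script the array passed to IO.select would contain STDIN (and possibly the server socket), and the loop would call gets only on the IOs that select reports as readable.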
I'm not sure if this is what you are looking for, but I was looking for something similar and this example does exactly what I wanted. The thread will continue processing until the user hits enter, and then the thread will be able to handle your user input as desired.
user_input = nil
t1 = Thread.new do
  while !user_input
    puts "Running"
  end
  puts "Stopping per user input: #{user_input}"
end
user_input = STDIN.gets
t1.join

Thread lockup in ruby with Soap4r

This is related to a question I asked here:
Thread Locking in Ruby (use of soap4r and QT)
However it is particular to one part of that question and is supported by a simpler example. The test code is:
require 'rubygems'
require 'thread'
require 'soap/rpc/standaloneserver'
class SOAPServer < SOAP::RPC::StandaloneServer
  def initialize(* args)
    super
    # Exposed methods
    add_method(self, 'test', 'x', 'y')
  end
  def test(x, y)
    return x + y
  end
end
myServer = SOAPServer.new('monitorservice', 'urn:ruby:MonitorService', 'localhost', 4004)
Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end
sleep(4)
#Thread.new do
testnum = 0
while testnum < 4000 do
  testnum += 1
  puts myServer.test(0,testnum)
  sleep(2)
end
#end
puts myServer.test(0,4001)
puts myServer.test(0,4002)
puts myServer.test(0,4003)
puts myServer.test(0,4004)
gets
When I run this with the thread commented out, everything runs along fine. However, once the thread is put in, the process hangs. I poked into WEBrick and found that the stop occurs here (the puts are, of course, mine):
while @status == :Running
  begin
    puts "1.1"
    if svrs = IO.select(@listeners, nil, nil, 2.0)
      svrs[0].each{|svr|
        puts "-+-"
        @tokens.pop # blocks while no token is there.
        if sock = accept_client(svr)
          th = start_thread(sock, &block)
          th[:WEBrickThread] = true
          thgroup.add(th)
        else
          @tokens.push(nil)
        end
      }
    end
    puts ".+."
When run with the thread NOT commented out I get something like this:
Starting web services
1.1
.+.
1.1
4001
4002
4003
4004
1
.+.
1.1
If the problem is caused by the gets() call and the purpose of the gets() call in your code is to prevent the Ruby interpreter from exiting, you can replace it with Thread.join() calls for each thread that you create. Join() will block until that thread has finished executing and therefore it'll prevent the Ruby interpreter from exiting.
E.g.:
t1 = Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end
t2 = ...
...
t1.join
t2.join
Alternatively, you can join() only one of the threads if there is a single thread that controls the execution of the application; the other threads will be killed on exit.
The trailing gets blocks Ruby's IO. I'm not sure why. If it is replaced with pretty much anything the program works. I used a sleeping loop:
loop do
  sleep 1
end
ADDED:
I should note that I also get strange behavior with sleep based on the sleep increment. In the end I abandoned Ruby since the threading behavior was too wonky.
