Ruby EventMachine PeriodicTimer chaos

When creating a PeriodicTimer in Ruby EventMachine, it is very inaccurate and seems to behave randomly and incorrectly.
I'm wondering if there is a way to fix that, or if I am perhaps using it the wrong way.
Below is an example.
I have a PeriodicTimer that every 10 seconds drains a global array (which contains MySQL statements) and executes the MySQL commands.
When run like that, the PeriodicTimer might end up firing only every few minutes, or it stops firing completely after some time, among other oddities.
EventMachine.run {
  EM::PeriodicTimer.new(10) {
    mysql_queue = $mysql_queue.dup
    $mysql_queue = []
    mysql_queue.each do |command|
      begin
        Mysql.query(command)
      rescue Exception => e
        puts "Error occurred: #{e} #{command}"
      end
    end
  }

  # Here follow random tasks/loops that add MySQL statements to $mysql_queue
  100_000.times {
    if ...
      $mysql_queue << "INSERT INTO `...` (...) VALUES (...);"
    end
  }
  # ...
}
In case you are wondering, I like to use such a "MySQL queue" because it prevents race conditions.
In practice this is a small TCP/IP client that handles multiple concurrent connections and executes MySQL commands to log various actions.

What do you expect this code to do? Why do you think the timer will ever be called if you never pass control back to the EM loop?
Try splitting your 100_000 loop into smaller parts:
10_000.times do
  EM.next_tick do
    10.times do
      if ...
        $mysql_queue << "INSERT INTO `...` (...) VALUES (...);"
      end
    end
  end
end
Using a global variable ($mysql_queue) is a bad idea here; it's not clean, and it's unsafe if threads may be accessing it concurrently.
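As a hedged sketch of one alternative (not part of the original answer): EventMachine ships an EM::Queue class whose pop takes a callback, so the consumer can re-arm itself instead of polling a global array. Mysql.query is carried over from the question; everything else here is an assumption.

queue = EM::Queue.new

# Re-arming consumer: pop hands the next item to the block when one
# is available, then we schedule the next pop.
processor = proc do |command|
  begin
    Mysql.query(command)
  rescue => e
    puts "Error occurred: #{e} #{command}"
  end
  queue.pop(&processor)
end

EventMachine.run do
  queue.pop(&processor)
  # producers push work instead of appending to a global:
  # queue.push("INSERT INTO `...` (...) VALUES (...);")
end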

Related

Can't run multithreading with Celluloid

I run this simple example on JRuby, but only one thread runs:
require 'benchmark'
require 'celluloid/current'

TIMES = 10

def delay
  sleep 1
  # 40_000_000.times.each { |i| i * i }
end

p 'celluloid: true multithreading?'

class FileWorker
  include Celluloid

  def create_file(id)
    delay
    p "Done!"
    File.open("out_#{id}.txt", 'w') { |f| f.write(Time.now) }
  end
end

workers_pool = FileWorker.pool(size: 10)

TIMES.times do |i|
  # workers_pool.async.create_file(i) # also doesn't work
  future = Celluloid::Future.new { FileWorker.new.create_file(i) }
  p future.value
end
The created files are each one second apart.
Please help me turn Celluloid into multithreaded mode, where all files are created simultaneously.
Thanks!
FIXED:
Indeed, an array of "futures" helps!
futures = []
TIMES.times do |i|
  futures << Celluloid::Future.new { FileWorker.new.create_file(i) }
end
futures.each { |f| p f.value }
Thanks, jrochkind!
Ah, I think I see.
Inside your loop you are waiting for each future to complete at the end of each iteration -- which means you are waiting for one future to complete before creating the next one.
TIMES.times do |i|
  # workers_pool.async.create_file(i) # also doesn't work
  future = Celluloid::Future.new { FileWorker.new.create_file(i) }
  p future.value
end
Try changing it to this:
futures = []
TIMES.times do |i|
  futures << Celluloid::Future.new { FileWorker.new.create_file(i) }
end
futures.each { |f| p f.value }
In your version, consider the first iteration of the loop -- you create a future, then call future.value, which waits for the future to complete. The future.value call won't return until the future completes, and the loop won't move on to create another future until it does. So you've effectively made it synchronous by waiting on each future with value before creating the next.
Make sense?
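For completeness, a minimal sketch reusing the FileWorker pool from the question: Celluloid pool proxies also expose a future method, which avoids constructing Celluloid::Future by hand (this variant is my assumption, not part of the original answer).

workers_pool = FileWorker.pool(size: 10)

# each call returns immediately with a future; the pool's 10 workers
# run the jobs concurrently
futures = TIMES.times.map { |i| workers_pool.future.create_file(i) }
futures.each { |f| p f.value } # all files appear at roughly the same time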
Also, for short code blocks like this, it's way easier on potential SO answerers if you put the code directly in the question, properly indented to format as code, instead of linking out.
In general, if you are using a fairly widely used library like Celluloid and find it doesn't seem to do the main thing it's supposed to do, the first guess should probably be a bug in your code, not that the library fundamentally doesn't work at all (someone else would have noticed before now!). A question title reflecting that, even just "Why doesn't my Celluloid code appear to run multi-threaded?", might have gotten more favorable attention than a title suggesting Celluloid fundamentally does not work -- without any code in the question itself demonstrating it!

How can I run synchronous long running operations in EventMachine without blocking the reactor?

I'd like to run a series of Procs in a specified order (i.e., they can't run asynchronously). Some of them may take an arbitrarily long time.
My code is running within the context of an EventMachine reactor.
Is there a known idiom for writing this kind of code without blocking the main reactor?
As @maniacalrobot said, using EM.defer/Deferrable lets the procs run without blocking the reactor.
But then you enter "callback hell" when you need to run several procs serially.
I know two solutions that make the code more readable: promises and fibers.
Promises give you a nice API to compose asynchronous calls; there are a lot of good articles out there, including:
http://domenic.me/2012/10/14/youre-missing-the-point-of-promises/
http://www.html5rocks.com/en/tutorials/es6/promises/
Fibers are a more Ruby-specific tool that makes your code look synchronous while doing asynchronous things.
Here is a helper method that executes a proc asynchronously (deferred) but still blocks the calling code without blocking the main reactor (that's the magic of Fibers):
require 'fiber' # needed for Fiber.current

def deferring(action)
  f = Fiber.current
  # wrap the action so exceptions are captured instead of killing the thread
  safe_action = proc do
    begin
      res = action.call
      [nil, res]
    rescue => e
      [e, nil]
    end
  end
  EM.defer(safe_action, proc { |error, result| f.resume([error, result]) })
  error, result = Fiber.yield
  raise error if error
  result
end
Example of usage:
action1_res = deferring(proc do
  puts 'Async action 1'
  42
end)

begin
  deferring(proc do
    puts "Action1 answered #{action1_res}"
    raise 'action2 failed'
  end)
rescue => error
  puts error
end
Any code that would normally block the main reactor loop should be run using EM#defer. EM#defer takes two blocks as arguments: the first is run on a separate thread, so it does not block the reactor; the second, optional block is called once the first has completed (and receives the result of the first block).
Further reading: https://github.com/eventmachine/eventmachine/wiki/EM::Deferrable-and-EM.defer
An example of chaining two long-running operations would look like this:
logic_block = Proc.new { long_running_operation }
callback = Proc.new { |long_running_operation_result| EM.defer(next_long_running_operation) }
EM.defer(logic_block, callback)
Beware: the second (callback) block is run on the reactor loop, so if you plan on chaining multiple blocks of long-running code together, you'll need to call EM.defer again inside the callbacks.
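To make the nesting concrete, here is a hedged sketch chaining three hypothetical procs op1, op2, op3 serially -- exactly the "callback hell" the fiber-based helper above avoids:

# each callback runs on the reactor, so it immediately defers the next step
EM.defer(op1, proc { |r1|
  EM.defer(proc { op2.call(r1) }, proc { |r2|
    EM.defer(proc { op3.call(r2) }, proc { |r3| puts r3 })
  })
})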

ruby rescue block -- respond with more than just one command

I'm running a script against an API that often times out. I'm using begin/rescue blocks to get it to redo when this happens, but I want to log what is happening to the command line before I run the redo command.
begin
  # ...api query...
rescue ErrorClass
  puts("retrying #{id}") && redo
end
Unfortunately the above script doesn't work. Only the first command is run.
I would like to force the rescue block to run multiple lines of code like so:
begin
  # api query
rescue ErrorClass do  # or: rescue ErrorClass do |e|
  puts "retrying #{id}"
  redo
end
but those don't work either.
I've had luck creating a separate method to run like so:
def example
  id = 34314
  begin
    5 / 0
  rescue ZeroDivisionError
    eval(handle_zerodiv_error(id))
  end
end

def handle_zerodiv_error(id)
  puts "retrying #{id}"
  "redo"
end
...that actually works, but it requires too many lines of code in my opinion, and it uses eval, which is not kosher by any means according to my mentor(s).
You are unnecessarily complicating things by using && or do. The && version does not work because puts returns nil, so by short-circuit evaluation of &&, the part that follows is never evaluated. If you use || or ; instead, then it will work:
begin
  ...
rescue ErrorClass
  puts("retrying #{id}") || redo
end

begin
  ...
rescue ErrorClass
  puts("retrying #{id}"); redo
end
But even this is not necessary. You somehow seem to believe that you need a block within rescue to write multiple lines, but that does not make sense, because you are not using a block with a single line either. There is no Ruby construct that requires a block only when you have multiple lines. So just put them on multiple lines:
begin
  ...
rescue ErrorClass
  puts("retrying #{id}")
  redo
end
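One detail worth noting (not in the original answer): redo is only legal inside a loop or block, so in context the pattern looks something like this, with ids and api_query as hypothetical stand-ins:

ids.each do |id|
  begin
    api_query(id)
  rescue ErrorClass
    puts "retrying #{id}"
    redo # re-runs this iteration for the same id
  end
end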
There is also a built-in retry (note the difference: retry re-runs the begin block from the top, while redo re-runs the current loop iteration). This example is from "The Ruby Programming Language", p. 162:
require "open-uri"
tries = 0
begin
tries +=1
open("http://www.example.com/"){|f| puts f.readlines}
rescue OpenURI::HTTPError => e
puts e.message
if (tries < 4)
sleep (2**tries) # wait for 2, 4 or 8 seconds
retry # and try again
end
end

ruby "on error resume next" function

Is there a way of doing the old "on error resume next" routine in Ruby?
I've got an array of values filled in dynamically from elsewhere (read from MQTT topics, to be precise), and I want to do a bunch of numeric calculations on them and publish the results. The values SHOULD be numeric but may be missing or non-numeric.
At the moment my code looks something like:
values = []
# values get loaded here
begin
  Publish('topic1', values[0]*10 + values[1])
rescue TypeError, NoMethodError, ZeroDivisionError
end
begin
  Publish('topic2', values[3]/values[4])
rescue TypeError, NoMethodError, ZeroDivisionError
end
# etc. etc.
If a calculation fails for any reason, the program should just skip that step and go on.
It works, but surely there's a better way than all those identical begin..rescue blocks? Ruby is about "DRY", after all.
Is there a way of rewriting the above so that a single begin..rescue construct is used while still allowing all calculations to be attempted?
UPDATED
How safe is it to do something like:
def safe_Publish(topic, value)
  return if value.nil?
  Publish(topic, value)
end
and call it with:
safe_Publish('topic2', (values[3]/values[4] rescue nil))
The main problem is that the above catches ALL exceptions, not just the ones I'm expecting, which makes me a little nervous.
The "on error resume next" coding style is really dangerous, as it makes new bugs you accidentally introduce very hard to find. Instead, I would just write a different version of publish that doesn't throw those exceptions:
def try_publish(topic_name)
  begin
    Publish(topic_name, yield)
  rescue TypeError, NoMethodError, ZeroDivisionError
    # are you sure you don't want to do anything here? Even logging
    # the errors somewhere could be useful.
  end
end
You can then call this with:
try_publish('topic1') { values[0]*10 + values[1] }
If TypeError, NoMethodError, or ZeroDivisionError is thrown by the expression, it will be caught and ignored.
Now your original method won't require any rescues.
If you really wanted an on error resume next, you could possibly do it by monkey patching the raise method in Kernel, but that would be a horrible idea.
If you think a bit more carefully about what you are doing, and why you want on error resume next, I think you will see that you don't really need to suppress all exceptions. As the other posters pointed out, that would make it hard to find and fix bugs.
Your problem is that you have a bunch of numbers scraped from the Internet and want to run some calculations on them, but some may be invalid or missing. For invalid or missing numbers, you want to skip over any calculations which would use those numbers.
A few possible solutions:
1. Pre-filter your data and remove anything which is not a valid number (see the sketch after this list).
2. Put each calculation you want to do into a method of its own, and put a rescue Exception on the method definition.
3. Define "safe" wrappers for the numeric classes which don't raise exceptions on divide by zero, etc. Use these wrappers for your calculations.
The "wrappers" might look something like this (don't expect complete, tested code; this is just to give you the idea):
# This is not designed for "mixed" arithmetic between SafeNumerics and
# ordinary Numerics, but if you want to do mixed arithmetic, that can
# also be achieved; more checks will be needed, and it will also need
# a "coerce" method.
class SafeNumeric
  attr_reader :__numeric__

  def initialize(numeric)
    @__numeric__ = numeric.is_a?(String) ? numeric.to_f : numeric
  end

  def zero?
    !@__numeric__.nil? && @__numeric__.zero?
  end

  def /(other)
    if @__numeric__.nil? || other.__numeric__.nil? || other.zero?
      SafeNumeric.new(nil) # could use a constant for this to reduce allocations
    else
      SafeNumeric.new(@__numeric__ / other.__numeric__)
    end
  end

  def to_s;    @__numeric__.to_s;    end
  def inspect; @__numeric__.inspect; end

  # similar methods are also needed for +, -, *
end
Then use it like:
numbers = scraped_from_net.map { |n| SafeNumeric.new(n) }
# now you can do arithmetic on "numbers" at will
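A quick hypothetical demo of the wrapper above: dividing by a zero-valued SafeNumeric yields a nil-wrapped SafeNumeric instead of raising.

a = SafeNumeric.new("10")
b = SafeNumeric.new("0")
p(a / b) # prints nil (the wrapped value); no ZeroDivisionError is raised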
This shows how to wrap a bunch of quick operations into a loop with each one being protected by a begin/rescue:
values = [1, 2, 3, 0, 4]
ops = [->{ values[0] / values[1] }, ->{ values[2] / values[3] }]

ops.each do |op|
  begin
    puts "answer is #{op.call}"
  rescue ZeroDivisionError
    puts "cannot divide by zero"
  end
end
I prefer the safe_publish method, however, as you can unit test that and it encapsulates the logic of making safe calls and handling errors in a single place:
def safe_publish(topic, &block)
  begin
    value = block.call
    publish(topic, value)
  rescue
    # handle the error
  end
end
and then you can call this with code like:
safe_publish 'topic0' do
  values[0]*10 + values[1]
end

Ruby's speed of threads

I have the following code to write to a file in a thread-safe way:
threads = []
@@lock_flag = 0
@@write_flag = 0

def add_to_file
  old_i = 0
  File.open("numbers.txt", "r") { |f| old_i = f.read.to_i }
  File.open("numbers.txt", "w+") { |f| f.write(old_i + 1) }
  # puts old_i
end

File.open("numbers.txt", "w") { |f| f.write(0) } unless File.exist?("numbers.txt")

2000.times do
  threads << Thread.new {
    done_flag = 0
    while done_flag == 0 do
      print "." #### THIS LINE
      if @@lock_flag == 0
        @@lock_flag = 1
        if @@write_flag == 0
          @@write_flag = 1
          add_to_file
          @@write_flag = 0
          done_flag = 1
        end
        @@lock_flag = 0
      end
    end
  }
end

threads.each { |t| t.join }
If I run this code, it takes about 1.5 seconds to write all 2000 numbers into the file. So far, so good.
But if I remove the print "." line marked "THIS LINE", it takes ages! Without it, the code needs about 12 seconds for only 20 threads to complete.
Now my question: why does the print speed up that code so much?
I'm not sure how you can call that thread-safe at all, when it's simply not. You can't use a simple variable to ensure safety, because of race conditions: what happens between testing that a flag is zero and setting it to one? You simply don't know; anything can and eventually will happen in that brief interval if you're unlucky enough.
What is probably happening is that the print statement stalls the thread just long enough that your broken locking mechanism ends up working. When testing this example on Ruby 1.9.2 it doesn't even finish, printing dots seemingly forever.
You might want to try rewriting it using a Mutex:
write_mutex = Mutex.new
read_mutex = Mutex.new

2000.times do
  threads << Thread.new {
    done_flag = false
    while !done_flag do
      print "." #### THIS LINE
      write_mutex.synchronize do
        read_mutex.synchronize do
          add_to_file
          done_flag = true
        end
      end
    end
  }
end
This is the proper Ruby way to do thread synchronization. A Mutex will not yield the lock until it is sure you have exclusive control over it. There's also the try_lock method that will try to grab it and will fail if it is already taken.
Threads can be a real nuisance to get right, so be very careful when using them.
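If the two nested mutexes above look confusing: since there is only one shared resource here, a single Mutex is enough. A minimal hedged sketch, reusing add_to_file from the question:

mutex = Mutex.new
threads = Array.new(2000) do
  Thread.new do
    # only one thread at a time may touch the file
    mutex.synchronize { add_to_file }
  end
end
threads.each(&:join)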
First off, there are gems that can make this sort of thing easier. threach and jruby_threach ("threaded each") are ones that I wrote, and while I'm deeply unhappy with the implementation and will get around to cleaning them up at some point, they work fine when you have thread-safe code.
(1..100).threach(2) {|i| do_something_with(i)} # run method in two threads
or
File.open('myfile.txt', 'r').threach(3, :each_line) {|line| process_line(line)}
You should also look at peach and parallel for other examples of easily working in parallel with multiple threads.
Above and beyond the problems already pointed out -- that your loop isn't thread-safe -- none of it matters, because the code you're calling (add_to_file) isn't thread-safe either. You're opening and closing the same file willy-nilly across threads, and that's going to give you problems. I can't quite tell what you're trying to do, but keep in mind that you have absolutely no idea of the order in which things in different threads will run.
