If I fire of 1 or 1000 of these in a controller action:
Thread.new {
# do some stuff
}
Will they indeed run asynchronously with the http request?
If an exception is raised, where does it trickle up to?
Anything else I should know about?
Threads and exceptions are not necessarily friends since exceptions can't bust out of the current thread and alert the parent thread. You also need to turn on thread exception notification or you'll never hear about them at all:
Thread.new do
Thread.abort_on_exception = true
end
You'll also need to call Thread.join on each new thread or the main one will hurry along without them.
This way your code will at least halt on an exception instead of simply terminating the thread that generated one and continuing on as if nothing had happened.
Make sure that the things you're calling inside your thread are thread safe or you may get unexpected results.
Related
I've been using resque on Heroku, which will from time to time interrupt your jobs with a SIGTERM.
Thus far I've handled this with a simple:
def process(options)
do_the_job
rescue Resque::TermException
self.defer options
end
We've started using resque-status so that we can keep track of jobs, but the method above obviously breaks that as the job will show completed when actually it's been deferred to another job.
My current thinking is that instead of deferring the current job in resque, there needs to be another job that re-queues jobs that have failed due to SIGTERM.
The trick comes in that some jobs are more complicated:
def process(options)
do_part1 unless options['part1_finished']
options['part1_finished']
do_part2
rescue Resque::TermException
self.defer options
end
Simply removing the rescue and simply retrying those jobs would cause an exception when do_part1 gets repeated.
Looking more deeply into how resque-status works, a possible work around is to go straight to resque for the re-queue using the same parameters that resque-status would use.
def process
do_part1 unless options['part1_finished']
options['part1_finished']
do_part2
rescue Resque::TermException
Resque.enqueue self.class, uuid, options
raise DeferredToNewJob
end
Of course, this is undocumented so may be incompatible with future releases of resque-status.
There is a draw back: between that job failing and the new job picking it up, the status of the first job will be reported by resque-status.
This is why I re-raise a new exception - otherwise the job status will show completed until the new worker picks up the old job, which may confuse processes that are watching and waiting for the job to finish.
By raising a new exception DeferredToNewJob, the job status will temporarily show failure, which is easier to work around at the front end, and the specific exception can be automatically cleared from the resque failure queue.
UPDATE
resque-status provides support for on_failure handler. If a method with this name is defined as an instance method on the class, we can make this even simpler
Here's my on_failure
def on_failure(e)
if e.is_a? DeferredToNewJob
tick('Waiting for new job')
else
raise e
end
end
With this in place the job spends basically no time in the failed state for processes watching it's status.
In addition, if resque-status finds this handler, then it won't raise the exception up to resque, so it won't get added to the failed queue.
Basically, all of my logic is in a bunch of event handlers that are fired by threads. After I establish the event handlers in the main thread:
puts 'Now connecting...'
socket = SocketIO::Client::Simple.connect 'http://localhost:3000'
socket.on :connect do
puts 'Connected'
end
I don't really have anything else to do in the main thread... but when I exit it, the whole process exits! I guess I could just do a while 1 {sleep 3} or something but that seems like a hack.
From what I can tell, daemon threads also don't work on Windows, so what am I supposed to do here?
If you're creating threads then it's your obligation to wait for them to finish before terminating. Normally this is done with join on the thread or threads in question.
Do you have a way of getting the thread out of that SocketIO instance? If so, join it.
This is more of an opinion oriented question. When handling exceptions in nested codes such as:
Assuming you have a class that initialize another class to run a job. The job returns a value, which is then processed by the class which initially called it.
Where would you put the exception and error logging? Would you define it on the initialization of the job class in the calling class, which will handle then exception in the job execution or on both levels ?
if the job handles exceptions then you don't need to wrap the call to the job in a try catch.
but the class that initializes and runs the job could throw exceptions, so you should handle exceptions at that level as well.
here is an example:
def some_job
begin
# a bunch of logic
rescue
# handle exception
# log it
end
end
it wouldn't make sense then to do this:
def some_manager
begin
some_job
rescue
# log
end
end
but something like this makes more sense:
def some_manager
begin
# a bunch of logic
some_job
# some more logic
rescue
# handle exception
# log
end
end
and of course you would want to catch specific exceptions.
Probably the best answer, in general, for handling Exceptions in Ruby is reading Exceptional Ruby. It may change your perspective on error handling.
Having said that, your specific case. When I hear "job" in hear "background process", so I'll base my answer on that.
Your job will want to report status while it's doing it's thing. This could be states like "in queue", "running", "finished", but it also could be more informative (user facing) information: "processing first 100 out of 1000 records".
So, if an error happens in your background process, my suggestion is two-fold:
Make sure you catch exceptions before you exit the job. Your background job processor might not like a random exception coming from your code. I, personally, like the idea of catching the exception and saving it to the database, for easy retrieval later. Then again, depending on your background job processor, maybe it handles error reporting for you. (I think reque does, for example).
On the front end, use AJAX (or something) to occasionally check in to how the job is doing. Say every 10 seconds or something. In additional to getting the status of the job, also make sure you return this additional information to the user (if appropriate).
I write a simple bot using "SimpleMUCClient". But got error: app.rb:73:in stop': deadlock detected (fatal)
from app.rb:73:in'. How to fix it?
Most likely the code you're running is executed in another thread. That particular thread is then joined (meaning Ruby waits for it to finish upon exiting the script) using Thread.join(). Calling Thread.stop() while also calling .join() is most likely the cause of the deadlock. Having said that you should following the guides of StackOverflow regarding how to ask questions properly, since you haven't done so I've down voted your question.
Joining a thread while still calling Thread.stop can be done as following:
th = Thread.new do
Thread.stop
end
if th.status === 'sleep'
th.run
else
th.join
end
It's not the cleanest way but it works. Also, if you want to actually terminate a thread you'll have to call Thread.exit instead.
I have a library method that occasionally hangs on a network connection, and there's no timeout mechanism.
What's the easiest way to add my own? Basically, I'm trying to keep my code from getting indefinitely stuck.
timeout.rb has some problems where basically it doesn't always work quite right, and I wouldn't recommend using it. Check System Timer or Terminator instead
The System Timer page in particular describes why timeout.rb can fail, complete with pretty pictures and everything. Bottom line is:
For timeout.rb to work, a freshly created “homicidal” Ruby thread has to be scheduled by the Ruby interpreter.
M.R.I. 1.8, the interpreter used by most Ruby applications in production, implements Ruby threads as green threads.
It is a well-known limitations of the green threads (running on top of a single native thread) that when a green thread performs a blocking system call to the underlying operating systems, none of green threads in the virtual machine will run until the system call returns.
Answered my own question:
http://www.ruby-doc.org/stdlib/libdoc/timeout/rdoc/index.html
require 'timeout'
status = Timeout::timeout(5) {
# Something that should be interrupted if it takes too much time...
}
To prevent an ugly error on timeout I suggest enclosing it and using a rescue like this:
begin
status = Timeout::timeout(5) do
#Some stuff that should be interrupted if it takes too long
end
rescue Timeout::Error
puts "Error message: It took too long!\n"
end