I am using Ruby 1.9.2.
I have a thread running which makes periodic calls to a database. The calls can be quite long, and sometimes (for various reasons) the DB connection disappears. If it does disappear, the thread just silently hangs there forever.
So, I want to wrap it all in a timeout to handle this. The problem is that the second time a timeout should fire (it is always the second time), the thread still simply hangs and the timeout never takes effect. I know this problem existed in 1.8, but I was led to believe timeout.rb worked in 1.9.
t = Thread.new do
  while true do
    sleep SLEEPTIME
    begin
      Timeout::timeout(TIMEOUTTIME) do
        puts "About to do DB stuff, it will hang here on the second timeout"
        db.do_db_stuff()
        process_db_stuff()
      end
    rescue Timeout::Error
      puts "Timed out"
      #handle stuff here
    end
  end
end
Any idea why this is happening and what I can do about it?
One possibility is that your thread does not hang, it actually dies. Here's what you should do to figure out what's going on. Add this before you create your worker thread:
Thread.abort_on_exception = true
With this setting, an uncaught exception raised inside your thread terminates the whole process, so you can see which exception was raised. Otherwise (and this is the default), only the thread is killed, and it dies silently.
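If aborting the whole process is too drastic, the thread object itself can also tell you whether it is hung or dead. The following is a minimal sketch, not the original database code: `thread_outcome` is a hypothetical helper name, and it relies only on `Thread#status`, which returns `nil` for a thread that terminated with an exception and `false` for one that finished normally.

```ruby
# Hypothetical helper: classify a thread by its status.
# Thread#status is nil after an uncaught exception, false after a normal
# exit, and a string such as "sleep" or "run" while the thread is alive.
def thread_outcome(thread)
  case thread.status
  when nil   then :died_with_exception
  when false then :finished_normally
  else            :still_running
  end
end

# Stand-in worker that dies the way the real database thread might.
crashed = Thread.new { raise IOError, 'connection lost' }

# Wait for it to finish without calling join (join would re-raise here).
sleep 0.01 while crashed.alive?

puts thread_outcome(crashed)
```

A worker that reports `:still_running` long after it should have finished is genuinely hanging; one that reports `:died_with_exception` was killed by an error nobody rescued.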
If this turns out not to be the problem, read on...
Ruby's implementation of timeouts is pretty naive. It sets up a separate thread that sleeps for n seconds, then blindly raises a Timeout exception inside the original thread.
Now, the original code might actually be in the middle of a rescue or ensure block. Raising an exception in such a block will silently abort any kind of cleanup code. This might leave the code that times out in an improper state.
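To make the failure mode above concrete, here is a small sketch in which the timeout fires while an ensure block is still running. The method name `demo_interrupted_cleanup` and the sleep durations are invented for illustration; the slow `sleep` stands in for real cleanup such as releasing a lock.

```ruby
require 'timeout'

# Illustrative only: shows cleanup code being cut short by Timeout.
def demo_interrupted_cleanup
  cleanup_finished = false
  timed_out = false

  begin
    Timeout.timeout(0.1) do
      begin
        :work_done # the "work" finishes immediately in this sketch
      ensure
        # Slow cleanup: the timeout exception is raised in the middle of it.
        sleep 0.5
        cleanup_finished = true # skipped when the exception interrupts sleep
      end
    end
  rescue Timeout::Error
    timed_out = true
  end

  { timed_out: timed_out, cleanup_finished: cleanup_finished }
end

p demo_interrupted_cleanup
```

The ensure block starts but never completes, which is exactly the state a database driver can be left in if it was cleaning up when the timeout hit.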
It's quite difficult to tell if this is your problem exactly, but seeing how database handlers might do a fair bit of locking and exception handling, it might be very likely. Here's an article that explains the issue in more depth.
Is there any way you can use your database library's built-in timeout handling? It might be implemented on a lower level, not using Ruby's timeout implementation.
A simple alternative is to schedule the database calls in a separate process. You can fork the main process each time you do the heavy database-lifting. Or you could set up a simple cronjob to execute a script that executes it. This will be slightly more difficult if you need to communicate with your main thread. Please leave some more details if you want any advice on which option might suit your needs.
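The fork approach can be sketched roughly as follows. This assumes a Unix-like platform where `fork` is available (it is not on Windows or JRuby); `run_with_deadline` is a hypothetical helper name, and the 50 ms polling interval is an arbitrary choice.

```ruby
# Run a block in a child process and kill it if it exceeds a deadline.
# Returns :finished if the child exited on its own, :killed otherwise.
def run_with_deadline(seconds, &work)
  pid = fork { work.call }
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds

  loop do
    # WNOHANG makes wait return nil immediately if the child is still running.
    return :finished if Process.wait(pid, Process::WNOHANG)

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      Process.kill('KILL', pid) # the child gets no chance to hang on
      Process.wait(pid)         # reap it so no zombie is left behind
      return :killed
    end

    sleep 0.05
  end
end

puts run_with_deadline(1) { sleep 0.1 }
```

Because the child is a separate process, killing it cannot leave the parent's locks or connections in a half-cleaned state. The cost is that results have to be passed back explicitly, for example through a pipe or the database itself.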
Based on your comments, the thread is dying. This might be a fault in libraries or application code that you may or may not be able to fix. If you wish to trap any arbitrary error that is generated by the database handling code and subsequently retry, you can try something like the following:
t = Thread.new do
  loop do
    sleep INTERVAL
    begin
      # Execute database queries and process data
    rescue StandardError
      # Log error or recover from error situation before retrying
    end
  end
end
You can also use the retry keyword in the rescue block to retry immediately, but you probably should keep a counter to make sure you're not accidentally retrying indefinitely when an unrecoverable error keeps occurring.
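A bounded retry along those lines might look like the sketch below. `with_retries` and `MAX_ATTEMPTS` are names made up for this example, and the limit of 3 is arbitrary.

```ruby
MAX_ATTEMPTS = 3 # arbitrary limit chosen for this sketch

# Runs the block, retrying immediately on StandardError, and re-raises
# the last error once max_attempts has been reached.
def with_retries(max_attempts = MAX_ATTEMPTS)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

# Usage: the block fails twice, then succeeds on the third attempt.
outcome = with_retries do |attempt|
  raise 'transient failure' if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts outcome
```

The counter lives outside the `begin` block so that `retry` does not reset it.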
Can anyone explain what would happen in the code below if the delayed_job that's scheduled in two weeks fails?
My understanding is that this entire transaction would sit in memory until the transaction runs successfully or exhausts the permitted number of attempts (i.e. the transaction doesn't merely guarantee that the job itself is created). Am I correct? If anyone could also elaborate on the implications of this structure (memory leaks, race conditions, performance, etc.) and on potential improvements, that would be greatly appreciated!
...
def process
  old_user.transaction(requires_new: true) do
    begin
      update_user_attributes
      TransferUserDataJob.new(old_user, new_user).delay(run_at: 14.days.from_now, queue: 'transfer_user_data_queue').perform
      raise ActiveRecord::Rollback if user.status.nil?
    rescue Exception => e
      raise ActiveRecord::Rollback
    end
  end
...
No, the transaction will NOT wait two weeks. This is precisely the reason why background jobs exist: do the expensive/heavy stuff later, so that frontend can respond as quickly as possible. If your user transfer process needs to happen in the same transaction, either move everything to the worker, or do everything on the spot, without delaying.
That is, transaction doesn't merely guarantee that the job itself is created?
This is exactly what happens in your code.
That code isn't doing what you expect it to do.
This part:
TransferUserDataJob.new(old_user, new_user).delay(run_at: 14.days.from_now, queue: 'transfer_user_data_queue').perform
will schedule that job to be performed by a separate process, typically on a separate server. So, it won't run in the context of your transaction.
Instead, you need to go into your TransferUserDataJob class and place that transaction inside the perform method.
If I have an app that is creating threads which do their work and then exit, and one or more threads get themselves into a deadlock (possibly through no fault of my own!), is there a way of programmatically forcing one of the threads to advance past the WaitForSingleObject it might be stuck at, and thus resolving the deadlock?
I don't necessarily want to terminate the thread, I just want to have it move on (and thus allow the threads to exit "gracefully").
(Yes, I know this sounds like a duplicate of my earlier question Delphi 2006 - What's the best way to gracefully kill a thread and still have the OnTerminate handler fire?, but the situation is slightly different. What I'm asking here is whether it is possible to make a WaitForSingleObject (Handle, INFINITE) behave like a WaitForSingleObject (Handle, ItCantPossiblyBeWorkingProperlyAfterThisLong).)
Please be gentle with me.
* MORE INFO *
The problem is not necessarily in code I have the source to. The actual situation is a serial COM port library (AsyncFree) that is thread based. When the port is USB-based, the library seems to have a deadlock between two of the threads it creates on closing the port. I've already discussed this at length in this forum. I did recode one of the WaitForSingleObject calls to not be infinite, and that cured that deadlock, but then another one appeared later in the thread shutdown sequence, this time in the Delphi TThread.Destroy routine.
So my rationale for this is simple: when my threads deadlock, I fix the code if I can. If I can't, or one appears that I don't know about, I just want the thread to finish. It doesn't have to be pretty. I can't afford to have my app choke.
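You can make a handle used in WaitForSingleObject invalid by closing it (from some other thread). In this case WaitForSingleObject should return WAIT_FAILED and your thread will be 'moved on'. Be aware, though, that the Windows documentation describes the behavior as undefined when a handle is closed while a wait on it is still pending, so treat this as a last resort.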
If you don't use INFINITE but set a finite timeout instead, you can check whether the call returned because the timeout expired or because the handle you were waiting for became signalled. Then your code can decide what to do next: enter another waiting cycle, or simply exit anyway, maybe reporting somewhere 'hey, I was waiting but it took too long and I terminated anyway'.
Another option is to use WaitForMultipleObjects and use something like an event to make the wait terminate when needed. The advantage is that it doesn't need the timeout to expire.
Of course, once the thread is awakened it must be able to handle the "exceptional" condition of continuing even though the "main" handle it was waiting for didn't become signalled in time.
When I spawn a thread with Thread.new{} it looks like any exception that happens in that thread never sees the light of day, and the app just quietly ignores it.
Normally, threads are isolated from each other, so an exception in one won't terminate the whole application.
But, although I have never used them, the Thread class has several abort_on_exception methods, and the documentation even has some examples. They should do what you want.
http://corelib.rubyonrails.org/classes/Thread.html
Adding to Nikita's answer, you can also trigger the exception by calling thread.join on the thread you've generated.
If you run the program with the debug flag on (ruby -d), then you'll also abort when an unhandled exception is raised in a thread.
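As a quick illustration of the join behaviour mentioned above, here is a small sketch. `error_from` is a hypothetical helper name; the only behaviour it relies on is that `Thread#join` re-raises the exception that killed the thread.

```ruby
# Joins the thread and returns the exception that killed it,
# or nil if the thread finished normally.
def error_from(thread)
  thread.join
  nil
rescue StandardError => e
  e
end

worker = Thread.new { raise ArgumentError, 'boom inside thread' }
error = error_from(worker)
puts "worker failed: #{error.class}: #{error.message}" if error
```

Note that recent Ruby versions also print a report to stderr when a thread dies from an unhandled exception, so the silence described in the question is specific to older interpreters.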
After posting a question related to nginx, I'm a bit further along with my investigation. The problem is that the merb framework times out after about 30 seconds. If I tell the underlying nginx server not to time out, merb does, and I can't find a way to tell it not to; I need to make requests that take up to several minutes.
Any hints? Thanks a lot.
-- UPDATE --
Seems that mongrel behind merb is causing the error. Is there any way to change the mongrel-timeout running with merb?
Perhaps a different approach would yield better results: rather than working around the timeouts, how about maximizing throughput by deferring the execution of the task?
Some approaches for long-running tasks are to either use run_later or exec a separate worker process to complete the task ...
def run_in_background(r)
  Thread.new do
    response = IO.popen(r) do |f|
      f.read
    end
  end
end
In both cases you should return 202 (Accepted) as the status code and a URL where the calling application can get status updates.
I use this approach to handle requests which cause background batch processes to execute. Each writes its start time, progress and completion time to a database (you could easily use a file). When the URL is invoked, I fetch the progress from the database and return it to the calling process.
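The accept-then-poll pattern described above could be sketched like this. Everything here is an assumption made for illustration: an in-memory hash stands in for the database, `start_job` and `job_status` are invented names, and the `/jobs/<id>` path is a made-up URL scheme rather than anything merb provides.

```ruby
require 'securerandom'

JOBS = {}             # stand-in for the database table of job records
JOBS_LOCK = Mutex.new # protects JOBS across threads

# Starts the work in a background thread and returns what the HTTP layer
# would send back: status 202 plus a URL to poll. The thread is returned
# as well so that callers (or tests) can wait for it.
def start_job(&work)
  id = SecureRandom.hex(4)
  JOBS_LOCK.synchronize { JOBS[id] = { status: 'running', started_at: Time.now } }

  thread = Thread.new do
    begin
      result = work.call
      JOBS_LOCK.synchronize { JOBS[id].merge!(status: 'done', result: result, finished_at: Time.now) }
    rescue StandardError => e
      JOBS_LOCK.synchronize { JOBS[id].merge!(status: 'failed', error: e.message, finished_at: Time.now) }
    end
  end

  [202, "/jobs/#{id}", thread]
end

# What the status URL handler would look up and return.
def job_status(id)
  JOBS_LOCK.synchronize { JOBS[id]&.dup }
end

code, url, thread = start_job { 21 * 2 }
thread.join
puts "#{code} #{url} -> #{job_status(url.split('/').last)[:status]}"
```

In a real application the job record would live in the database or a file, as described above, so that it survives restarts and is visible to other processes.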
I'm writing a job-scheduling app in Ruby for my work (primarily to move files using various protocols at a given frequency).
My main loop looks like this :
while true do
  # some code to launch the proper job
  sleep CONFIG["interval"]
end
It's working like a charm, but I'm not really sure if it is safe enough as the application might run on a server with cpu-intensive software running.
Is there another way to do the same thing, or is sleep() safe enough in my case ?
Any time I feel the need to block, I use an event loop; usually libev. Here is a Ruby binding:
http://rev.rubyforge.org/rdoc/
Basically, sleep is perfectly fine if you want your process to go to sleep without having anything else going on in the background. If you ever want to do other things, though, like sleep and also wait for TCP connections or a filehandle to become readable, then you're going to have to use an event loop. So, why not just use one at the beginning?
The flow of your application will be:
main {
  Timer->new( after => 0, every => 60 seconds, run => { <do your work> } )
  loop();
}
When you want to do other stuff, you just create the watcher, and it happens for you. (The jobs that you are running can also create watchers.)
Using sleep is likely OK for quick and dirty things. But for things that need a bit more robustness or reliability I suggest that sleep is evil :) The problem with sleeping is that the thread (I'm assuming Windows here...) is truly asleep: the scheduler will not run the thread until some time after the sleep interval has passed.
During this time, the thread will not wake up for anything. This means it cannot be canceled, or wake up to process some kind of event. Of course, the process can be killed, but that doesn't give the sleeping thread an opportunity to wake up and clean anything up.
I'm not familiar with Ruby, but I assume it has some kind of facility for waiting on multiple things. If you can, I suggest that instead of using sleep, you wait on two things:
A timer that wakes the thread periodically to do its work.
An event that is set when the process needs to cancel or quit (trapping control-C, for example).
It would be even better if there is some kind of event that can be used to signal the need to do work. This would avoid polling on a timer. This generally leads to lower resource utilization and a more responsive system.
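In Ruby, one way to get a wait that ends on either a timer or a cancel event is a condition variable with a timeout. The class below is a sketch under that assumption; `StoppableTimer` is an invented name, not a standard library class.

```ruby
# A sleep that can be cut short from another thread.
class StoppableTimer
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @stopped = false
  end

  # Waits up to `seconds`. Returns true if stop was requested,
  # false if the full interval elapsed (time to do periodic work).
  def wait(seconds)
    deadline = now + seconds
    @mutex.synchronize do
      until @stopped
        remaining = deadline - now
        break if remaining <= 0
        # May wake early (spuriously), hence the surrounding loop.
        @cond.wait(@mutex, remaining)
      end
      @stopped
    end
  end

  # Wakes every waiter and makes all later waits return immediately.
  def stop
    @mutex.synchronize do
      @stopped = true
      @cond.broadcast
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

timer = StoppableTimer.new
worker = Thread.new do
  until timer.wait(0.05)
    # periodic work would go here
  end
  :stopped_cleanly
end
sleep 0.2
timer.stop
puts worker.value
```

A main loop written this way replaces `sleep interval` with `timer.wait(interval)` and leaves the loop as soon as another thread calls `stop`, so the worker gets a chance to clean up instead of being killed mid-sleep.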
If you don't need an exact interval, then it makes sense to me. If you need to be awoken at regular times without drift, you probably want to use some kind of external timer. But when you're asleep, you're not using CPU resources. It's the task switch that's expensive.
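If drift does matter, it can be reduced inside the process by sleeping until the next scheduled tick rather than for a fixed interval. A rough sketch follows; `run_every` is a hypothetical helper, and it runs a fixed number of ticks only so that the example terminates.

```ruby
# Calls the block `ticks` times, aiming for one call per `interval`
# seconds regardless of how long each call takes.
def run_every(interval, ticks)
  next_tick = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ticks.times do
    yield
    next_tick += interval
    delay = next_tick - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # If the work overran the interval, skip sleeping and catch up.
    sleep(delay) if delay > 0
  end
end

run_every(0.05, 3) { puts "tick at #{Time.now.strftime('%H:%M:%S.%L')}" }
```

With a plain `sleep interval`, the time spent doing the work is added to every cycle; here it is subtracted from the sleep, so the schedule stays anchored to the starting time.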
While sleep(timeout) is perfectly appropriate for some designs, there's one important caveat to bear in mind.
Ruby installs signal handlers with SA_RESTART (see here), meaning that your sleep (or the equivalent select(nil, nil, nil, timeout)) cannot easily be interrupted. Your signal handler will fire, but the program will go right back to sleep. This may be inconvenient if you wish to react promptly to, say, a SIGTERM.
Consider that ...
#! /usr/bin/ruby
Signal.trap("USR1") { puts "Hey, wake up!" }
Process.fork() { sleep 2 and Process.kill("USR1", Process.ppid) }
sleep 30
puts "Zzz. I enjoyed my nap."
... will take about 30 seconds to execute, rather than 2.
As a workaround, you might instead throw an exception in your signal handler, which would interrupt the sleep (or anything else!) above. You might also switch to a select-based loop and use a variant of the self-pipe trick to wake up "early" upon receipt of a signal. As others have pointed out, fully-featured event libraries are available, too.
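A minimal version of the self-pipe approach might look like the following. It assumes a Unix-like system where the USR1 signal exists; `WAKE_READER`, `WAKE_WRITER` and `nap` are names chosen for this sketch.

```ruby
# The signal handler writes a byte to this pipe; the main loop selects on it.
WAKE_READER, WAKE_WRITER = IO.pipe

Signal.trap('USR1') do
  begin
    WAKE_WRITER.write_nonblock('.')
  rescue IO::WaitWritable
    # Pipe already full: a wake-up is pending anyway, nothing to do.
  end
end

# Sleeps up to `seconds`, but returns early if a signal arrived.
def nap(seconds)
  ready = IO.select([WAKE_READER], nil, nil, seconds)
  return :timed_out unless ready

  WAKE_READER.read_nonblock(16) # drain the pipe for the next nap
  :woken
end

# Usage: signal ourselves shortly after starting a long nap.
Thread.new { sleep 0.1; Process.kill('USR1', Process.pid) }
puts nap(5)
```

The handler itself does almost nothing, which keeps it safe to run at any point; the decision about what to do after waking up is made in normal code after `nap` returns.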
It won't use CPU while it is sleeping, but if you are sleeping for a long time I would be more concerned about the running Ruby interpreter holding on to memory while it isn't doing anything. This is not that big of a deal, though.