Implications of Wrapping Delayed Job in Transaction for Rails App - ruby

Can anyone explain what would happen in the code below if the delayed_job that's scheduled in two weeks fails?
My understanding is that this entire transaction would sit in memory until the job runs successfully or exhausts the permitted number of attempts (i.e. the transaction doesn't merely guarantee that the job itself is created). Am I correct? If anyone could also elaborate on the implications of this structure (memory leaks, race conditions, performance, etc.) and on potential improvements, that would be greatly appreciated!
...
def process
  old_user.transaction(requires_new: true) do
    begin
      update_user_attributes
      TransferUserDataJob.new(old_user, new_user).delay(run_at: 14.days.from_now, queue: 'transfer_user_data_queue').perform
      raise ActiveRecord::Rollback if user.status.nil?
    rescue Exception => e
      raise ActiveRecord::Rollback
    end
  end
...

No, the transaction will NOT wait two weeks. This is precisely the reason why background jobs exist: do the expensive/heavy stuff later, so that the frontend can respond as quickly as possible. If your user transfer process needs to happen in the same transaction, either move everything to the worker, or do everything on the spot, without delaying.
You asked whether the transaction merely guarantees that the job itself is created.
That is exactly what happens in your code: the transaction covers creating the job, not running it.

That code isn't doing what you expect it to do.
This part:
TransferUserDataJob.new(old_user, new_user).delay(run_at: 14.days.from_now, queue: 'transfer_user_data_queue').perform
will schedule that job to be performed by a separate process, typically on a separate server. So it won't run in the context of your transaction.
Instead, you need to go into your TransferUserDataJob class and place that transaction inside the perform method.
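To make the point above concrete, here is a minimal sketch in plain Ruby. It assumes a database-backed job queue (as with Delayed Job's ActiveRecord backend), where enqueuing a job is just an insert into a jobs table. FakeDB, Rollback and the row layout are hypothetical stand-ins for illustration, not Delayed Job or ActiveRecord APIs.

```ruby
# Toy model only: FakeDB and Rollback are hypothetical stand-ins,
# not Delayed Job or ActiveRecord APIs.
class Rollback < StandardError; end

class FakeDB
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Run the block; on Rollback, restore the rows as they were before it started.
  def transaction
    snapshot = @rows.dup
    yield
  rescue Rollback
    @rows = snapshot
  end

  def insert(row)
    @rows << row
  end
end

# Mirrors the shape of `process`: update the user, enqueue the job, maybe roll back.
def process(db, status:)
  db.transaction do
    db.insert(table: :users, change: "attributes updated")
    # "Enqueuing" is just another insert; the job body does not run here.
    db.insert(table: :delayed_jobs, handler: "TransferUserDataJob", run_in_days: 14)
    raise Rollback if status.nil?
  end
  db.rows.map { |row| row[:table] }
end

p process(FakeDB.new, status: "active") # => [:users, :delayed_jobs]
p process(FakeDB.new, status: nil)      # => []
```

In this model the transaction is finished as soon as `process` returns. Whatever the worker does two weeks later happens outside it, which is why any atomicity the transfer itself needs has to be set up inside the job's own perform method.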

Related

Invoking Mono.block() on "nioEventLoopGroup-*" threads ends up hanging all the threads

The project I am working on uses Spring WebFlux. I came across a very odd issue.
Some pieces of the code are written purely in Reactor style (a couple of Flux/Mono pipelines); however, in one of the inner publishers I have to call a method that has "Mono.block()" inside.
The weird thing I noticed is that the whole service becomes totally stuck, and when I captured a thread dump, I saw that all the "nioEventLoopGroup-*" threads were hung.
A fun fact is that if I use a "simple" thread (new Thread(..)) to call the method (with .block inside), everything works fine.
So my question is: are those "nioEventLoopGroup-*" threads not allowed to call any blocking code?
Sorry for asking a dumb question, but it's a blocking issue for now, so I am looking forward to your insight.
Reactor, by default, uses a fixed size thread pool. When you use block(), the actual work needs to be done in some thread or another, which depends on the nature of the subscription and the Mono/Flux. Most likely a set of new tasks will be scheduled on the same scheduler, but block() will suspend its thread, waiting for those tasks to complete, so there is one fewer thread for those other tasks to be scheduled on. Evidently you have enough of these calls to exhaust the entire thread pool. All your block() calls are waiting for other tasks to complete, but there are no threads available for them to be run on.
There's no reason to call block() inside a mapping in a reactive stream. There are always other ways of achieving the same goal without blocking - flatMap(), zip() etc etc.

Why does resque use child processes for processing each job in a queue?

We have been using Resque in most of our projects, and we have been happy with it.
In a recent project, we were connecting to a live streaming API from Twitter. Since we had to maintain the connection, we dumped each line from the streaming API into a Resque queue so that the connection would not be lost, and processed the queue afterwards.
The insertion rate into the queue was on the order of 30-40/second, while the rate at which the queue was popped was only 3-5/second. Because of this, the queue kept growing. When we looked into the reasons, we found that Resque has a parent process that forks a child process for each job in the queue, and the child process processes the job. Our Rails environment was quite heavy, and forking the child process was taking time.
So, we implemented another rake task of this sort, for the time being:
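task :process_queue => :environment do
  # Pop jobs directly in this process instead of forking a child per job.
  while true
    begin
      interaction = Resque.pop("process_twitter_resque")
      if interaction
        ProcessTwitterResque.perform(interaction)
      end
    rescue => e
      # Log the error and keep looping; failed jobs are not re-queued.
      puts e.message
      puts e.backtrace.join("\n")
    end
  end
end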
and started the task like this:
nohup bundle exec rake process_queue --trace >> log/workers/process_queue/worker.log 2>&1 &
This does not handle failed jobs, among other things.
But my question is: why does Resque use a forked child process to process the jobs from the queue? The jobs definitely do not need to be processed in parallel (since it is a queue, we expect it to process jobs one after the other, sequentially, and I believe Resque also forks only one child process at a time).
I am sure Resque has done it with some purpose in mind. What is the exact purpose behind this parent/child process architecture?
The Ruby process that sits and listens for jobs in Redis is not the process that ultimately runs the job code written in the perform method. It is the “master” process, and its only responsibility is to listen for jobs. When it receives a job, it forks yet another process to run the code. This other “child” process is managed entirely by its master. The user is not responsible for starting or interacting with it using rake tasks. When the child process finishes running the job code, it exits and returns control to its master. The master now continues listening to Redis for its next job.
The advantage of this master-child process organization – and the advantage of Resque processes over threads – is the isolation of job code. Resque assumes that your code is flawed, and that it contains memory leaks or other errors that will cause abnormal behavior. Any memory claimed by the child process will be released when it exits. This eliminates the possibility of unmanaged memory growth over time. It also provides the master process with the ability to recover from any error in the child, no matter how severe. For example, if the child process needs to be terminated using kill -9, it will not affect the master’s ability to continue processing jobs from the Redis queue.
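The isolation described above can be sketched with Ruby's own fork. This is a simplified illustration of the parent/child pattern, not Resque's actual worker code. The helper name run_in_child is hypothetical, and fork is only available on Unix-like platforms.

```ruby
# Run a "job" in a forked child and return its exit status to the parent.
def run_in_child
  pid = fork do
    yield
    exit!(0) # leave the child immediately, skipping at_exit handlers
  end
  _, status = Process.wait2(pid)
  status
end

# A job that allocates a lot of memory: it is all released when the child exits.
ok = run_in_child { Array.new(100_000) { "x" * 10 } }
puts "clean exit: #{ok.success?}"

# A job that dies as if it had been terminated with kill -9.
killed = run_in_child do
  Process.kill("KILL", Process.pid)
  sleep # never gets far: the signal terminates the child
end
puts "killed by signal #{killed.termsig}; parent is still running"
```

The parent only ever sees an exit status, so neither the memory the child allocated nor the way it died affects the parent's ability to pick up the next job.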
In earlier versions of Ruby, Resque’s main criticism was its potential to consume a lot of memory. Creating new processes means creating a separate memory space for each one. Some of this overhead was mitigated with the release of Ruby 2.0 thanks to copy-on-write. However, Resque will always require more memory than a solution that uses threads because the master process is not forked. It’s created manually using a rake task, and therefore must load whatever it needs into memory from the start. Of course, manually managing each worker process in a production application with a potentially large number of jobs quickly becomes untenable. Thankfully, we have pool managers for that.
Resque uses #fork for 2 reasons (among others): ability to prevent zombie workers (just kill them) and ability to use multiple cores (since it's another process).
Maybe this will help you with your fast-executing jobs: http://thewebfellas.com/blog/2012/12/28/resque-worker-performance

Ruby on Rails, Resque

I have a Resque job class that is responsible for producing a report on user activity. The class queries the database and then performs numerous calculations and data parsing to send out an email to certain people. My question is: should Resque jobs like this, which have numerous methods (200 lines or so of code), consist entirely of class methods that hang off the single ResqueClass.perform method? Or should I instantiate a new instance of this Resque class to represent the single report that is being produced? If both approaches properly calculate the data and email it, is there a convention or best practice for how this should be handled in background jobs?
Thank You
Both strategies are valid. I generally approach this from the perspective of concurrency. While your job is running, the resque worker servicing your job is busy, so if you have N workers and N of these jobs running, you're going to have to wait until one is done before anything else in the queue gets processed.
Maybe that's ok - if you just have one report at a time then you in effect will dedicate one worker to running the report, your others can do other things. But if you have a pile of these and it takes a while, you might impact other jobs in your queue.
The downside is that if your report dies, you may need logic to pick up where you left off. If you instantiate the report once per user, you'd simply need to retry the failed jobs - no "where was I" logic is required.
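One common way to combine the two styles from the question is to keep the class-level perform that Resque calls and have it delegate to an instance representing one report. The sketch below is plain Ruby with no Resque dependency. The class name, the hard-coded events and the returned summary string are hypothetical stand-ins for the real query and email.

```ruby
class UserActivityReport
  @queue = :reports # Resque reads the queue name from this class-level instance variable

  # Resque calls this class method with the job's arguments.
  def self.perform(user_id)
    new(user_id).summary
  end

  def initialize(user_id)
    @user_id = user_id
    @events = load_events
  end

  # One instance holds the state of one report, so helpers need no extra arguments.
  def summary
    logins = @events.count { |event| event == :login }
    "user #{@user_id}: #{@events.size} events, #{logins} logins"
  end

  private

  # Hypothetical stand-in for the real database query.
  def load_events
    [:login, :view, :view, :login]
  end
end

puts UserActivityReport.perform(42) # prints "user 42: 4 events, 2 logins"
```

Enqueuing one such job per user also fits the retry point above: a failed job can simply be run again with the same argument.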

Using ruby timeout in a thread making a database call

I am using Ruby 1.9.2.
I have a thread running which makes periodic calls to a database. The calls can be quite long, and sometimes (for various reasons) the DB connection disappears. If it does disappear, the thread just silently hangs there forever.
So, I want to wrap it all in a timeout to handle this. The problem is that the second time a timeout should fire (it is always the second), the thread still simply hangs. The timeout never takes effect. I know this problem existed in 1.8, but I was led to believe timeout.rb worked in 1.9.
t = Thread.new do
  while true do
    sleep SLEEPTIME
    begin
      Timeout::timeout(TIMEOUTTIME) do
        puts "About to do DB stuff, it will hang here on the second timeout"
        db.do_db_stuff()
        process_db_stuff()
      end
    rescue Timeout::Error
      puts "Timed out"
      #handle stuff here
    end
  end
end
Any idea why this is happening and what I can do about it?
One possibility is that your thread does not hang, it actually dies. Here's what you should do to figure out what's going on. Add this before you create your worker thread:
Thread.abort_on_exception = true
When an exception is raised inside your thread that is never caught, your whole process is terminated, and you can see which exception was raised. Otherwise (and this is the default), your thread is killed.
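As a small illustration of the default behaviour described above, the sketch below shows a thread that dies from an uncaught exception while the rest of the program carries on. The IOError is a made-up stand-in for a database failure, and report_on_exception is switched off only to keep the demo output quiet (recent Ruby versions print such errors to stderr by default).

```ruby
Thread.report_on_exception = false # demo only: suppress the default stderr report

# Start a worker that fails, and wait until it is no longer alive.
def run_worker
  worker = Thread.new { raise IOError, "DB connection lost" }
  sleep 0.01 while worker.alive?
  worker
end

worker = run_worker
p worker.status # nil means the thread terminated with an exception

begin
  worker.join # join re-raises the exception that killed the thread
rescue IOError => e
  puts "worker died: #{e.message}"
end
```

From the outside, a dead worker looks just like a hung one unless something checks its status, joins it, or enables Thread.abort_on_exception.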
If this turns out not to be the problem, read on...
Ruby's implementation of timeouts is pretty naive. It sets up a separate thread that sleeps for n seconds, then blindly raises a Timeout exception inside the original thread.
Now, the original code might actually be in the middle of a rescue or ensure block. Raising an exception in such a block will silently abort any kind of cleanup code. This might leave the code that times out in an improper state.
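A minimal sketch of that failure mode, using only the standard timeout library: the timeout fires while the block is inside its ensure clause, so the rest of the cleanup never runs. The sleep stands in for slow cleanup work such as releasing a lock. The durations and the method name are arbitrary choices for the demo.

```ruby
require "timeout"

# Returns true only if the ensure block ran to completion.
def cleanup_finished_under_timeout?
  cleaned_up = false
  begin
    Timeout.timeout(0.1) do
      begin
        nil # the real work would go here
      ensure
        sleep 0.5         # slow cleanup, interrupted by the timeout
        cleaned_up = true # skipped when the timeout fires during the sleep
      end
    end
  rescue Timeout::Error
    # The caller only learns that the block timed out.
  end
  cleaned_up
end

p cleanup_finished_under_timeout? # => false
```

The caller sees an ordinary Timeout::Error and has no indication that cleanup was cut short, which is how a connection or lock can be left in a bad state.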
It's quite difficult to tell if this is your problem exactly, but seeing how database handlers might do a fair bit of locking and exception handling, it might be very likely. Here's an article that explains the issue in more depth.
Is there any way you can use your database library's built-in timeout handling? It might be implemented on a lower level, not using Ruby's timeout implementation.
A simple alternative is to schedule the database calls in a separate process. You can fork the main process each time you do the heavy database-lifting. Or you could set up a simple cronjob to execute a script that executes it. This will be slightly more difficult if you need to communicate with your main thread. Please leave some more details if you want any advice on which option might suit your needs.
Based on your comments, the thread is dying. This might be a fault in libraries or application code that you may or may not be able to fix. If you wish to trap any arbitrary error that is generated by the database handling code and subsequently retry, you can try something like the following:
t = Thread.new do
  loop do
    sleep INTERVAL
    begin
      # Execute database queries and process data
    rescue StandardError
      # Log error or recover from error situation before retrying
    end
  end
end
You can also use the retry keyword in the rescue block to retry immediately, but you probably should keep a counter to make sure you're not accidentally retrying indefinitely when an unrecoverable error keeps occurring.
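A bounded retry along those lines might look like the following sketch. The helper name with_retries and the simulated IOError are assumptions for illustration. In real code the block would contain the database calls.

```ruby
# Retry the block until it succeeds or max_attempts is reached.
def with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise # give up and let the caller see the last error
  end
end

# Simulated flaky call: fails twice, then succeeds.
result = with_retries(3) do |attempt|
  raise IOError, "connection lost" if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result # prints "succeeded on attempt 3"
```

Re-raising after the last attempt keeps an unrecoverable error visible instead of looping forever.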

Clarification on Threads and Run Loops In Cocoa

I'm trying to learn about threading and I'm thoroughly confused. I'm sure all the answers are in the Apple docs, but I found them really hard to break down and digest. Maybe somebody could clear a thing or two up for me.
1) performSelectorOnMainThread
Does the above simply register an event in the main run loop or is it somehow a new thread even though the method says "mainThread"? If the purpose of threads is to relieve processing on the main thread how does this help?
2) RunLoops
Is it true that if I want to create a completely separate thread I use
"detachNewThreadSelector"? Does calling start on this initiate a default run loop for the thread that has been created? If so where do run loops come into it?
3) And finally, I've seen examples using NSOperationQueue. Is it true to say that if you use performSelectorOnMainThread the threads are in a queue anyway, so NSOperation is not needed?
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
Run Loops
You can think of a run loop as an event-processing loop associated with a thread. The system provides one for every thread, but it is only run automatically for the main thread.
Note that running run loops and executing a thread are two distinct concepts. You can execute a thread without running a run loop, when you're just performing long calculations and you don't have to respond to various events.
If you want to respond to various events from a secondary thread, you retrieve the run loop associated to the thread by
[NSRunLoop currentRunLoop]
and run it. The events that run loops can handle are called input sources. You can add input sources to a run loop.
PerformSelector
performSelectorOnMainThread: adds the target and the selector to a special input source called performSelector input source. The run loop of the main thread dequeues that input source and handles the method call one by one, as part of its event processing loop.
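The idea can be modelled outside Cocoa. The sketch below is a plain-Ruby analogy, not Cocoa API: a background thread hands blocks to a queue, and the thread that owns the loop dequeues and runs them one by one. The queue, the :stop marker and the method name are all invented for the illustration.

```ruby
def run_loop_demo
  loop_thread = Thread.current # plays the role of the main thread
  events = Queue.new           # plays the role of the input source
  handled = []

  # Background thread: does the computation, then asks the loop thread to record it.
  worker = Thread.new do
    3.times do |i|
      square = i * i
      events << -> { handled << [square, Thread.current == loop_thread] }
    end
    events << :stop
  end

  # The "run loop": dequeue and run one event at a time on this thread.
  loop do
    event = events.pop
    break if event == :stop
    event.call
  end
  worker.join
  handled
end

p run_loop_demo # => [[0, true], [1, true], [4, true]]
```

Each recorded pair shows that the block was created on the background thread but executed on the thread running the loop, and that no new thread was involved in running it.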
NSOperation/NSOperationQueue
I think of NSOperation as a way to explicitly declare various tasks inside an app which take some time but can be run mostly independently. It's also easier to use than detaching a new thread and maintaining everything yourself. An NSOperationQueue automatically maintains a set of background threads which it reuses, and runs NSOperations in parallel.
So yes, if you just need to queue up operations in the main thread, you can do away with NSOperationQueue and just use performSelectorOnMainThread:, but that's not the main point of NSOperation.
GCD
GCD is a new infrastructure introduced in Snow Leopard. NSOperationQueue is now implemented on top of it.
It works at the level of functions / blocks. Feeding blocks to dispatch_async is extremely handy, but for a larger chunk of operations I prefer to use NSOperation, especially when that chunk is used from various places in an app.
Summary
You need to read the official Apple documentation! There are many informative blog posts on this point, too.
1) performSelectorOnMainThread
Does the above simply register an event in the main run loop …
You're asking about implementation details. Don't worry about how it works.
What it does is perform that selector on the main thread.
… or is it somehow a new thread even though the method says "mainThread"?
No.
If the purpose of threads is to relieve processing on the main thread how does this help?
It helps you when you need to do something on the main thread. A common example is updating your UI, which you should always do on the main thread.
There are other methods for doing things on new secondary threads, although NSOperationQueue and GCD are generally easier ways to do it.
2) RunLoops
Is it true that if I want to create a completely separate thread I use "detachNewThreadSelector"?
That has nothing to do with run loops.
Yes, that is one way to start a new thread.
Does calling start on this initiate a default run loop for the thread that has been created?
No.
I don't know what you're “calling start on” here, anyway. detachNewThreadSelector: doesn't return anything, and it starts the thread immediately. I think you mixed this up with NSOperations (which you also don't start yourself—that's the queue's job).
If so where do run loops come into it?
Run loops just exist, one per thread. On the implementation side, they're probably lazily created upon demand.
3) And finally, I've seen examples using NSOperationQueue. Is it true to say that if you use performSelectorOnMainThread the threads are in a queue anyway, so NSOperation is not needed?
These two things are unrelated.
performSelectorOnMainThread: does exactly that: Performs the selector on the main thread.
NSOperations run on secondary threads, one per operation.
An operation queue determines the order in which the operations (and their threads) are started.
Threads themselves are not queued (except maybe by the scheduler, but that's part of the kernel, not your application). The operations are queued, and they are started in that order. Once started, their threads run in parallel.
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
GCD is more or less the same set of concepts as operation queues. You won't understand one as long as you don't understand the other.
So what are all these things good for?
Run loops
Within a thread, a way to schedule things to happen. Some may be scheduled at a specific date (timers), others simply “whenever you get around to it” (sources). Most of these are zero-cost when idle, only consuming any CPU time when the thing happens (timer fires or source is signaled), which makes run loops a very efficient way to have several things going on at once without any threads.
You generally don't handle a run loop yourself when you create a scheduled timer; the timer adds itself to the run loop for you.
Threads
Threads enable multiple things to happen at the exact same time on different processors. Thing 1 can happen on thread A (on processor 1) while thing 2 happens on thread B (on processor 0).
This can be a problem. Multithreaded programming is a dance, and when two threads try to step in the same place, pain ensues. This is called contention, and most discussion of threaded programming is on the topic of how to avoid it.
NSOperationQueue and GCD
You have a thing you need done. That's an operation. You can't have it done on the main thread, or you'd simply send a message like normal; you need to run it in the background, on a secondary thread.
To achieve this, express it as either an NSOperation object (you create a subclass of NSOperation and instantiate it) or a block (or both), then add it to either an NSOperationQueue (NSOperations, including NSBlockOperation) or a dispatch queue (bare block).
GCD can be used to make things happen on the main thread, as well; you can create serial queues and add blocks to them. A serial queue, as its name suggests, will run exactly one block at a time, rather than running a bunch of them in parallel.
So what should I do?
I would not recommend creating threads directly. Use NSOperationQueue or GCD instead; they force you into better thinking habits that will reduce the risk of your threaded code inducing headaches.
For things that run periodically, not fitting into the “thing I need done” model of NSOperations and GCD blocks, consider just using the run loop on the main thread. Chances are, you don't need to put it on a thread after all. A rendering loop in a 3D game, for example, can be a simple timer.
