Ensure orphaned processes are killed when the parent process dies - ruby

In Ruby, how do I ensure that child processes spawned from my program don't keep running when my main process exits or is killed?
Initially I thought I could just use at_exit in the main process, but that won't work if my main process gets kill -9ed or calls Kernel.exec. I need a solution that is (basically) foolproof, and cross-platform.

If you have to handle kill -9 termination for your parent app, then you have only a couple of choices that I can see:
1. Create a work queue manager and spawn/kill the child processes from the work queue manager. If you can't guarantee that the work queue manager won't also be killed without warning, then option 2 is your only choice, I think, since the only thing you know for sure is that the child processes are still running.
http://www.celeryproject.org/
http://aws.amazon.com/elasticbeanstalk/
(A more aggressive approach - basically spawn off whole OS instances, but they will definitely get killed off within your parameters for operation.)
2. Have the child processes check a "heartbeat" from the parent process: through RPC, by monitoring the parent PID in memory, or by watching the date/time on a keep-alive file in /tmp to make sure it's current. If the child processes fail to see the parent process doing its job - either responding to RPC messages, staying in memory, or keeping the file timestamp current - the child processes must kill themselves.
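As a rough illustration of option 2, here is a minimal child-side watchdog sketch. The file name, timeout, and polling interval are made up; it simply checks that the parent PID recorded at startup still exists and that a keep-alive file is being touched, and exits otherwise.

KEEPALIVE  = "/tmp/parent.keepalive"   # parent is expected to touch this file periodically
PARENT_PID = Process.ppid              # parent's PID, recorded at startup

Thread.new do
  loop do
    parent_gone =
      begin
        Process.kill(0, PARENT_PID)    # signal 0 checks that the process exists without killing it
        false
      rescue Errno::ESRCH
        true
      end

    stale = !File.exist?(KEEPALIVE) ||
            (Time.now - File.mtime(KEEPALIVE)) > 30

    if parent_gone || stale
      warn "parent appears to be dead, shutting down"
      exit!
    end
    sleep 5
  end
end

# ... the child's real work continues on the main thread ...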

Related

Is it possible to make a console wait on another child process?

Usually when a program is run from the Windows console, the console will wait for the process to exit and then print the prompt and wait for user input. However, if the process starts a child process, the console will still only wait for the first process to exit. It will not wait for the child as well.
Is there a way for the program to get the console to wait on another child process instead of (or as well as) the current process.
I would assume it's impossible because presumably the console is waiting on the process' handle and there's no way to replace that handle. However, I'm struggling to find any confirmation of this.
Is there a way for the program to get the console to wait on another child process instead of (or as well as) the current process.
No. As you noted, as soon as the 1st process the console creates has exited, the console stops waiting. It has no concept of any child processes being created by that 1st process.
So, what you can do instead is either:
simply have the 1st process wait for any child process it creates before then exiting itself.
if that is not an option, then create a separate helper process that creates a Job Object and then starts the main process and assigns it to that job. Any child processes it creates will automatically be put into the same job as well [1]. The helper process can then wait for all processes in the job to exit before then exiting itself. Then, you can have the console run and wait on the helper process rather than the main process.
[1] By default; a process spawner can choose to break a new child process out of the current job, if the job is set up to allow that.

Killing PPID can kill all child process association with it at same time?

I tried killing a parent process (PPID), which terminates it (and is supposed to kill its child PIDs as well) and returns to my console within seconds, but the child processes are taking time to respond to the termination signal. Does anyone have any idea why this is happening?
Whenever the parent process gets killed, the child processes become orphan processes, and the init process becomes the parent of those orphans. The init process is designed so that whenever any process gets killed, all of its children are taken care of by init until they finish.
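You can see this reparenting directly with a throwaway sketch like the one below (Linux/macOS): the child's PPID changes to 1, or to a per-session subreaper such as systemd, once its parent exits.

# Demonstrates orphan reparenting: the child's parent PID changes after the parent dies.
fork do
  original_ppid = Process.ppid
  sleep 2                                      # give the parent time to exit
  puts "ppid was #{original_ppid}, is now #{Process.ppid}"
end
# The parent exits immediately, orphaning the child; init (or a subreaper) adopts it.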
It looks like the parent process did not catch any signals, while the child processes did.
Alternatively, the child processes had resources open and are attempting a graceful exit, making sure those resources are properly taken care of.
In this case you may need to rewrite the parent process to catch the signal, forward it to its children, and then wait() for them to finish, and exit.
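A minimal sketch of that pattern in Ruby (the number of children and the choice of TERM are only illustrative):

# parent.rb - catch TERM/INT, forward the signal to the children, then wait for them before exiting
child_pids = 3.times.map do
  fork do
    trap("TERM") { exit }    # child handles the forwarded signal and exits cleanly
    sleep                    # stand-in for the child's real work
  end
end

%w[TERM INT].each do |sig|
  trap(sig) do
    child_pids.each { |pid| Process.kill("TERM", pid) rescue nil }
  end
end

Process.waitall              # reap every child before the parent itself exits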

Why does resque use child processes for processing each job in a queue?

We have been using Resque in most of our projects, and we have been happy with it.
In a recent project, we had a situation where we are making a connection to Twitter's live streaming API. Since we have to maintain the connection, we were dumping each line from the streaming API into a Resque queue, so that the connection is not lost, and processing the queue afterwards.
We had a situation where the insertion rate into the queue was on the order of 30-40/second, while the rate at which the queue was popped was only 3-5/second. Because of this, the queue kept growing. When we looked for reasons, we found that Resque has a parent process, and for each job in the queue it forks a child process, and that child process does the actual work. Our Rails environment was quite heavy, and forking a child process for every job was taking time.
So, we implemented another rake task of this sort, for the time being:
# Poll the queue in this process instead of letting Resque fork a child per job.
task :process_queue => :environment do
  while true
    begin
      interaction = Resque.pop("process_twitter_resque")
      if interaction
        ProcessTwitterResque.perform(interaction)
      end
    rescue => e
      puts e.message
      puts e.backtrace.join("\n")
    end
  end
end
and started the task like this:
nohup bundle exec rake process_queue --trace >> log/workers/process_queue/worker.log 2>&1 &
This does not handle failed jobs and the like.
But my question is: why does Resque fork a child process to handle each job from the queue? The jobs definitely do not need to be processed in parallel (since it is a queue, we expect the jobs to be processed one after the other, sequentially, and I believe Resque also forks only one child process at a time).
I am sure Resque has done it with some purpose in mind. What is the exact purpose behind this parent/child process architecture?
The Ruby process that sits and listens for jobs in Redis is not the process that ultimately runs the job code written in the perform method. It is the “master” process, and its only responsibility is to listen for jobs. When it receives a job, it forks yet another process to run the code. This other “child” process is managed entirely by its master. The user is not responsible for starting or interacting with it using rake tasks. When the child process finishes running the job code, it exits and returns control to its master. The master now continues listening to Redis for its next job.
The advantage of this master-child process organization – and the advantage of Resque processes over threads – is the isolation of job code. Resque assumes that your code is flawed, and that it contains memory leaks or other errors that will cause abnormal behavior. Any memory claimed by the child process will be released when it exits. This eliminates the possibility of unmanaged memory growth over time. It also provides the master process with the ability to recover from any error in the child, no matter how severe. For example, if the child process needs to be terminated using kill -9, it will not affect the master’s ability to continue processing jobs from the Redis queue.
In earlier versions of Ruby, Resque’s main criticism was its potential to consume a lot of memory. Creating new processes means creating a separate memory space for each one. Some of this overhead was mitigated with the release of Ruby 2.0 thanks to copy-on-write. However, Resque will always require more memory than a solution that uses threads because the master process is not forked. It’s created manually using a rake task, and therefore must load whatever it needs into memory from the start. Of course, manually managing each worker process in a production application with a potentially large number of jobs quickly becomes untenable. Thankfully, we have pool managers for that.
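To make the mechanism concrete, here is a stripped-down, runnable sketch of the fork-per-job pattern described above. It is not Resque's actual code; the in-memory jobs array is just a stand-in for the Redis queue.

# Conceptual sketch of the master/child loop (not Resque's real implementation).
jobs = [-> { puts "job 1 in pid #{Process.pid}" },
        -> { puts "job 2 in pid #{Process.pid}" }]

until jobs.empty?
  job = jobs.shift                # in Resque this would be a blocking pop from Redis

  child = fork do
    job.call                      # run the job code in an isolated child process
    exit!                         # child exits; any memory it claimed is released to the OS
  end

  Process.wait(child)             # the master blocks until the child finishes, then loops
end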
Resque uses #fork for 2 reasons (among others): ability to prevent zombie workers (just kill them) and ability to use multiple cores (since it's another process).
Maybe this will help you with your fast-executing jobs: http://thewebfellas.com/blog/2012/12/28/resque-worker-performance

Resque: Does dequeueing kill the process?

I'm implementing resque on this project where I need the feature of killing whatever gets enqueued to resque. So, I've seen that there is a dequeuing method, which will remove the jobs from the queue. But, if this job has already been started, and is currently running, does dequeuing kill the process?
Also important: If a job gets dequeued, do I get a handle where I can do something, or is an exception thrown?
As far as I know, it doesn't kill the process; it just removes the job from the queue if it exists (check here).
But if you want to kill a running job, you need to use one of the various signals that Resque provides.
Here is a list of them:
Resque workers respond to a few different signals:
QUIT - Wait for child to finish processing then exit
TERM / INT - Immediately kill child then exit
USR1 - Immediately kill child but don't exit
USR2 - Don't start to process any new jobs
CONT - Start to process new jobs again after a USR2
In your case it would be USR1.
Hope this helps.
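For example, assuming you know the worker's PID (the 12345 below is made up), you could send USR1 from Ruby, equivalent to running kill -USR1 12345 in a shell:

# Kill only the child that is running the current job; the worker itself stays
# alive and keeps polling the queue for new jobs.
worker_pid = 12345                  # hypothetical PID of the Resque worker process
Process.kill("USR1", worker_pid)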
The answer to this issue was actually using one of the many extensions for the Resque gem, called resque-status. This handles worker instances, assigns a unique id to each of them (which I can use to identify them, the feature I needed most) and provides me with a kill method to be called on a job, which guarantees that the job will process the kill signal the next time I call a certain method of their API (not exactly killing it and raising an exception, but it's better than nothing).

IContextMenu::InvokeCommand, break away from job?

I have a child process in a job that has JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE specified.
When I invoke IContextMenu::InvokeCommand, though, any processes that are started are automatically killed when my child process exits, because they are automatically included in a job.
How can I prevent this from happening?
The solution I've found is to specify
JOB_OBJECT_LIMIT_BREAKAWAY_OK | JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK
for the child process, to allow its children to automatically break away from the job.
