Does Ruby Spawn or Fork share memory with parent process after completion? - ruby

I have an app that processes images and creates PDFs in the background using Delayed Job. The images are being processed using Process.spawn (through the subexec gem) which runs a GraphicsMagick process. We are then creating a PDF file using the Prawn gem, which includes these images and text components. I don't believe that the Prawn gem using Fork or Spawn. We are using Ruby 1.9.3. What happens is that our Delayed Job processes balloons from ~120MB to over 800MB of memory after a few PDF files are made.
I know that the spawned GraphicsMagick processes share memory with the parent process, but is that memory given back to the system after the child processes are finished? If I created the PDF files in a forked process, would the memory used in creating the PDF file be returned to the system after that forked process was completed?

I went ahead and forked the PDF creation tasks and tested it out. The forked processes do indeed release the memory back to the system after they exit, and our Delayed Job processes no longer balloon out of control. I have attached the code in case anyone is facing a similar problem. Interestingly enough, running Ruby 1.9.3 + Rails 3.2.x, we don't need to reconnect to the database in either the forked processes or the parent process. There was much debate about this in other Stackoverflow questions.
def run_in_fork
read, write = IO.pipe
pid = fork do
error = nil
read.close
begin
yield
rescue => e
error = e
end
Marshal.dump(error, write)
exit!(0) # skips exit handlers.
end
write.close
result = read.read
Process.wait(pid)
raise "Child process failed" if result.empty?
if exception = Marshal.load(result)
raise exception
end
return true
end
def my_method
object = MyObject.find.first
run_in_fork do
object.long_running_process
end
object.reload
end

Related

How can a process die in a way that Process.wait wouldn't notice?

I have this ruby script to manage que processes. que doesn't support multi-proccess, see discussion here):
#!/usr/bin/env ruby
cluster_size = 2
puts "starting Que cluster with #{cluster_size} workers"; STDOUT.flush
%w[INT TERM].each do |signal|
trap(signal) do
#pids.each{|pid| Process.kill(signal, pid) }
end
end
#pids = []
cluster_size.to_i.times do |n|
puts "Starting Que daemon #{n}"; STDOUT.flush
#pids << Process.spawn("que --worker-count $MAX_THREADS")
end
Process.waitall
puts "Que cluster has shut down"; STDOUT.flush
The script has been working well for a couple months. The other day I found things in a state where the script was running, but both child processes were dead.
I experimented with trying to replicate this. I killed the children with various signals, had them raise exceptions. In all cases, the script knew the process died and itself died.
How could the child process have died without the parent script knowing?
How could the child process have died without the parent script
knowing?
My guess is that the child process turned into a zombie and missed by Process.waitall. Did you check if the child processes are zombies when it happens?
The zombie:
If you have zombie processes it means those zombies have not been waited for by their parent (check the PPID with ps -l). In the end you have three choices: Fix the parent process (make it wait); kill the parent; or get over it.
Could you check your list of signals and trap it?
You can list all Signal(s) available (below is on windows):
Signal.list
=> {"EXIT"=>0, "INT"=>2, "ILL"=>4, "ABRT"=>22, "FPE"=>8, "KILL"=>9, "SEGV"=>11, "TERM"=>15}
Could you try to trap it via e.g. INT (note: you can have one trap per Signal) (
Signal.trap('SEGV') { throw :sigsegv }
catch :sigsegv
start_what_you_need
end
puts 'OMG! Got a SEGV!'
Since your question is a general one, it is hard to give you a specific answer.
Zombies are not the only possible cause for this problem -- stopped children may not be reported for a variety of reasons.
The existence of a zombie typically means that the parent has not properly waited on them. The posted code looks OK, though, so unless there's a framework bug lurking somewhere I'd want to look beyond the zombie apocalypse to explain this problem.
In contrast to zombies, which can't be fully reaped because they have no accessible parent, frozen processes have an intact parent but have stopped responding for some reason (waiting for an external process or I/O operation, memory problems, long or infinite looping, slow database operations, etc.).
On some platforms, Ruby can add a flag requesting return of stopped children that haven't been reported, using the following syntax:
waitpid(pid, Process::WUNTRACED)
AFAIK waitall doesn't have a version that accepts flags, so you'd have to aggregate this yourself, or use pid = -1 to wait for any child process (the default if you omit pid) or pid = 0 to wait for any child with the same process groupID as the calling process.
See documentation here.

RSpec testing of a multiprocess library

I'm trying to test a gem I'm creating with RSpec. The gem's purpose is to create queues (using 'bunny'). It will serve to communicate between processes on several servers.
But I cannot find documentation on how to safely create processes inside RSpec running environment without spawning several testing processes (all displaying example failures and successes).
Here is what I wanted the tests to do :
Spawn children processes, waiting on the queue
Push messages from the main RSpec process
Consumes the queue on the children processes
Wait for children to stop and get the number of messages received from each child.
For now I implemented a simple case where child is consuming only one message and then stops.
Here is my code currently :
module Queues
# Basic CR accepting only jobs of type cmd_line
class CR
attr_reader :nb_jobs
def initialize
# opening communication pipes
#rout, #wout = IO.pipe
#nb_jobs = nil # not yet available.
end
def main
#todo = JobPipe.instance
job = #todo.pop do |j|
# accept only CMD_LINE type of jobs.
j.type == Messages::Job::CMD_LINE
end
# run command
%x{#{job.cmd}}
#wout.puts "1" # saying that we did one job
end
def run
#pid = Process.fork
if #pid.nil? then
# we are in the child
self.main
#rout.close
#wout.close
exit
end
end
def wait
#nb_jobs = #rout.gets(nil).to_i
Process.wait(#pid)
#rout.close
#wout.close
#nb_jobs
end
end
#job = Messages::Job.new({:type => Messages::Job::CMD_LINE, :cmd => "sleep 1" })
RSpec.describe JobPipe do
context "one Orchestrator and one CR" do
before(:each) do
indalo_queue_pre_configure
end
it "can send a job with Orchestrator and be received by CR" do
cr = CR.new
cr.run # execute the C.R. process
todo = JobPipe.instance
todo.push(#job)
nb_jobs = cr.wait
expect(nb_jobs).to eql(1)
end
end
context "one Orchestrator and severals CR" do
it 'can send one job per CR and get all back' do
crs = Array.new(rand(2..10)) { CR.new }
crs.each do |cr|
cr.run
end
todo = JobPipe.instance
crs.each do |_|
todo.push(#job)
end
nb_jobs = 0
crs.each do |cr|
nb_jobs += cr.wait
end
expect(nb_jobs).to eql(crs.length)
end
end
end
end
Edit: The question is (sorry not putting it right away, this was a mistake):
Is there a way to use correctly RSpec on a multi-process environment ?
I'm not looking for a code review, just wanted to display a clear example of what I wanted to do. Here I used fork, but this duplicate all the process (including RSpec part) and got numerous RSpec outputs which is not what we would expect in a test suite.
I would expect that only the main program states the RSpec outputs/stats and the subprocesses just interact with it.
The only way I see to do that correctly is not fork, but call subprocesses through an other mean. Maybe I answer alone to this question...
But not knowing well RSpec, I was wondering if someone knew how to do it within RSpec without writing external code. It seems to me that having separate codes linked to a single test example is not a good idea.
What I found about multi-process testing is this plugin to RSpec. The only thing is I don't know about the mock concept, but maybe I have to learn about it...
Ok, I found an answer which is to use the &block argument of the Process.fork method. In this case, you don't really duplicate all the process, but just execute the block of code in an other process and then return 0 (like said in the Ruby doc).
This prevent the children to get all the RSpec environment and displaying plenty of times the states of your tests.
PS : Be careful not to forget to redirect STDOUT/STDERR of child process if you don't want them to pollute the STDOUT/STDERR of the test.
PS2: don't forget to close #wout on the parent side if you call #rout.gets(nil) in it, because having it opened on the parent prevent EOF from happening (a bug in the code I presented) even if you close it in the child.
PS3: Use two pipes instead of one to prevent child/parent to talk and listen in the same. Childhood error but I did it again.
PS4: Use exit statement (at the end of the &block) to prevent zombie state of the child and usure parent not waiting too long that the rest of the child process dies.
Sorry for that long post, but it's good it stays also for me ^^

Background thread in Rails can't see instance variables

I need to gather up some data from a rails application, aggregate it, and send it off to a remote server periodically. I instantiate my aggregation class in a global variable (I know, I know) in application.rb.
Inside my aggregation class, I fire up a thread that sleeps for 10 seconds, then looks at the queue, processes the data, and sends it. The queue is a hash stored in an instance variable of the class.
From the rails controller, I call a method in the aggregator class to queue the data in the hash. Of course this is on a different thread than the background task that reads the queue. The problem is that the background task never sees any data in the hash. In my log, I print out the object_id of the hash both when I write to it (from the controllers thread), and when I read from it (from the background thread). The hash#object_id matches from both threads, but the background thread never sees the data.
Whats killing me is that this works fine outside of rails. I've set up tests with many threads that really pound on it, and it works fine (there is some thread protection that I am not showing for clarity). Anyone know how the object_id's can match, but the contents are not consistent?
class Aggregator
def initialize
#q = {}
#timer = nil
end
def start
#timer = Thread.new do
loop do
sleep(10)
flush_q
end
end
end
def flush_q
logger.debug "flush: q.object_id = #{#q.object_id}" # matches what I get below
logger.debug "flush: q.length = #{#q.length}" # always zero!
#q.each_pair do |k,v|
# pack it up and send it
end
#q.clear
end
def add(item)
logger.debug "add: q.object_id = #{#q.object_id}" # matches what I get above
#q[item.name] ||= item
logger.debug "add: q.length = #{#q.length}" # increases with each add
# not actually that simple, but not relevant
end
end
I'm going to go out on a limb and assume that your code is deployed using a forking app server (eg unicorn or passenger).
This means that your app is loaded once and then new instances are forked from that master instances. Forking is cheap so this means that new instances of the app can be started up/shutdown really quickly.
I believe that your aggregator instance is getting created/started in this master process. When this forks the process's entire memory space is copied (so there an instance of aggregator in the new process, with the same object id and so on).
However when forking only the current thread is copied , so the aggregator flushing is only happening in the master process, but all the appending is happening in the child processes. You could confirm this by adding Proccess.pid to what you log - you should see that your logging is coming from 2 different process.
One way of fixing this would be to start/restart your thread after the child process has forked. How you do this depends on how the app is being served. With unicorn you can do this in your unicorn config via the after_fork method. With passenger you do
PhusionPassenger.on_event(:starting_worker_process) do |forked|
if forked
...
end
end

How to use ruby fibers to avoid blocking IO

I need to upload a bunch of files in a directory to S3. Since more than 90% of the time required to upload is spent waiting for the http request to finish, I want to execute several of them at once somehow.
Can Fibers help me with this at all? They are described as a way to solve this sort of problem, but I can't think of any way I can do any work while an http call blocks.
Any way I can solve this problem without threads?
I'm not up on fibers in 1.9, but regular Threads from 1.8.6 can solve this problem. Try using a Queue http://ruby-doc.org/stdlib/libdoc/thread/rdoc/classes/Queue.html
Looking at the example in the documentation, your consumer is the part that does the upload. It 'consumes' a URL and a file, and uploads the data. The producer is the part of your program that keeps working and finds new files to upload.
If you want to upload multiple files at once, simply launch a new Thread for each file:
t = Thread.new do
upload_file(param1, param2)
end
#all_threads << t
Then, later on in your 'producer' code (which, remember, doesn't have to be in its own Thread, it could be the main program):
#all_threads.each do |t|
t.join if t.alive?
end
The Queue can either be a #member_variable or a $global.
To answer your actual questions:
Can Fibers help me with this at all?
No they can't. Jörg W Mittag explains why best.
No, you cannot do concurrency with Fibers. Fibers simply aren't a concurrency construct, they are a control-flow construct, like Exceptions. That's the whole point of Fibers: they never run in parallel, they are cooperative and they are deterministic. Fibers are coroutines. (In fact, I never understood why they aren't simply called Coroutines.)
The only concurrency construct in Ruby is Thread.
When he says that the only concurrency contruct in Ruby is Thread, remember that there are many different implimentations of Ruby and that they vary in their threading implementations. Jörg once again provides a great answer to these differences; and correctly concludes that only something like JRuby (that uses JVM threads mapped to native threads) or forking your process is how you can achieve true parallelism.
Any way I can solve this problem without threads?
Other than forking your process, I would also suggest that you look at EventMachine and something like em-http-request. It's an event driven, non-blocking, reactor pattern based HTTP client that is asynchronous and does not incur the overhead of threads.
Aaron Patterson (#tenderlove) uses an example almost exactly like yours to describe exactly why you can and should use threads to achieve concurrency in your situation.
Most I/O libraries are now smart enough to release the GVL (Global VM Lock, or most people know it as the GIL or Global Interpreter Lock) when doing IO. There is a simple function call in C to do this. You don't need to worry about the C code, but for you this means that most IO libraries worth their salt are going to release the GVL and allow other threads to execute while the thread that is doing the IO waits for the data to return.
If what I just said was confusing, you don't need to worry about it too much. The main thing that you need to know is that if you are using a decent library to do your HTTP requests (or any other I/O operation for that matter... database, interprocess communication, whatever), the Ruby interpreter (MRI) is smart enough to be able to release the lock on the interpreter and allow other threads to execute while one thread awaits IO to return. If the next thread has its own IO to grab, the Ruby interpreter will do the same thing (assuming that the IO library is built to utilize this feature of Ruby, which I believe most are these days).
So, to sum up what I am saying, use threads! You should see the performance benefit. If not, check to see whether your http library is using the rb_thread_blocking_region() function in C and, if not, find out why not. Maybe there is a good reason, maybe you need to consider using a better library.
The link to the Aaron Patterson video is here: http://www.youtube.com/watch?v=kufXhNkm5WU
It is worth a watch, even if just for the laughs, as Aaron Patterson is one of the funniest people on the internet.
You could use separate processes for this instead of threads:
#!/usr/bin/env ruby
$stderr.sync = true
# Number of children to use for uploading
MAX_CHILDREN = 5
# Hash of PIDs for children that are working along with which file
# they're working on.
#child_pids = {}
# Keep track of uploads that failed
#failed_files = []
# Get the list of files to upload as arguments to the program
#files = ARGV
### Wait for a child to finish, adding the file to the list of those
### that failed if the child indicates there was a problem.
def wait_for_child
$stderr.puts " waiting for a child to finish..."
pid, status = Process.waitpid2( 0 )
file = #child_pids.delete( pid )
#failed_files << file unless status.success?
end
### Here's where you'd put the particulars of what gets uploaded and
### how. I'm just sleeping for the file size in bytes * milliseconds
### to simulate the upload, then returning either +true+ or +false+
### based on a random factor.
def upload( file )
bytes = File.size( file )
sleep( bytes * 0.00001 )
return rand( 100 ) > 5
end
### Start a child uploading the specified +file+.
def start_child( file )
if pid = Process.fork
$stderr.puts "%s: uploaded started by child %d" % [ file, pid ]
#child_pids[ pid ] = file
else
if upload( file )
$stderr.puts "%s: done." % [ file ]
exit 0 # success
else
$stderr.puts "%s: failed." % [ file ]
exit 255
end
end
end
until #files.empty?
# If there are already the maximum number of children running, wait
# for one to finish
wait_for_child() if #child_pids.length >= MAX_CHILDREN
# Start a new child working on the next file
start_child( #files.shift )
end
# Now we're just waiting on the final few uploads to finish
wait_for_child() until #child_pids.empty?
if #failed_files.empty?
exit 0
else
$stderr.puts "Some files failed to upload:",
#failed_files.collect {|file| " #{file}" }
exit 255
end

In ruby, how do I attempt a block of code but move on after n seconds?

I have a library method that occasionally hangs on a network connection, and there's no timeout mechanism.
What's the easiest way to add my own? Basically, I'm trying to keep my code from getting indefinitely stuck.
timeout.rb has some problems where basically it doesn't always work quite right, and I wouldn't recommend using it. Check System Timer or Terminator instead
The System Timer page in particular describes why timeout.rb can fail, complete with pretty pictures and everything. Bottom line is:
For timeout.rb to work, a freshly created “homicidal” Ruby thread has to be scheduled by the Ruby interpreter.
M.R.I. 1.8, the interpreter used by most Ruby applications in production, implements Ruby threads as green threads.
It is a well-known limitations of the green threads (running on top of a single native thread) that when a green thread performs a blocking system call to the underlying operating systems, none of green threads in the virtual machine will run until the system call returns.
Answered my own question:
http://www.ruby-doc.org/stdlib/libdoc/timeout/rdoc/index.html
require 'timeout'
status = Timeout::timeout(5) {
# Something that should be interrupted if it takes too much time...
}
To prevent an ugly error on timeout I suggest enclosing it and using a rescue like this:
begin
status = Timeout::timeout(5) do
#Some stuff that should be interrupted if it takes too long
end
rescue Timeout::Error
puts "Error message: It took too long!\n"
end

Resources