When sleep is used in a spawned process it kills the process - Ruby

I am primarily a Java guy tinkering with Process.spawn and have hit a bit of a wall. Basically I tossed together a few RSpec files that just call puts so I could tinker with it, and that worked pretty well. Then I tossed a sleep in, and the whole process just dies when the sleep is reached. I am not sure why; I thought spawning gets around the shared-resource issues of threads. Below is what I was using to create the processes:
test_files = Dir["../spec/*.rb"]
test_files.each do |file|
  pid = Process.spawn("rspec #{file}")
  p pid
end
Process.wait
There is very little going on in the specs; below is a sample test.
it 'should another thing' do
  sleep(0.3)
  p "another1"
end
Could someone be so kind as to explain where I am going wrong?

It worked for me. I'm assuming you think it's not working because you get something like this:
script.rb:6:in `wait': No child processes (Errno::ECHILD)
from script.rb:6:in `<main>'
And I'm assuming you get that because you have this script in, say, a lib directory, but you're running it from the root of your app. When it does Dir["../spec/*.rb"], you're expecting it to go up to the root of your app and then down into the spec directory, but the glob is resolved relative to your pwd, so it goes to the parent directory of the app and looks for a spec directory there, which doesn't exist.
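To illustrate (directory names hypothetical):

# Run from /home/me/my_app: the glob resolves against pwd, not against this script's location
Dir.chdir("/home/me/my_app") { p Dir["../spec/*.rb"] }
# => [] -- it looked in /home/me/spec, not /home/me/my_app/spec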
Some things I'd advocate in this scenario: check that the glob actually finds files before trying to spawn them, and use absolute filepaths unless you intentionally want the behavior to depend on where you run the program from. If this is all true, try writing it like this:
spec_glob = File.expand_path('../spec/*.rb', __dir__)
test_files = Dir[spec_glob]
p test_files
test_files.each do |file|
  pid = Process.spawn("rspec #{file}")
  p pid
end
Process.wait
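One more gotcha worth mentioning: a bare Process.wait only reaps a single child, so with several spawned rspec processes the script can exit before the rest have finished. A sketch of waiting on every pid instead:

# Collect each child's pid, then wait for all of them (Process.waitall also works)
pids = test_files.map { |file| Process.spawn("rspec #{file}") }
pids.each { |pid| Process.wait(pid) }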

Related

Ruby spawn process, capturing STDOUT/STDERR, while behaving as if it were spawned regularly

What I'm trying to achieve:
From a Ruby process, spawning a subprocess
The subprocess should print as normal back to the terminal. By "normal", I mean the process shouldn't lose color output or ignore user input (STDIN).
For that subprocess, capturing STDOUT/STDERR (jointly), e.g. into a String variable that can be accessed after the subprocess is dead. Escape characters and all.
Capturing STDOUT/STDERR is possible by passing a different IO pipe; however, the subprocess can then detect that it's not in a tty. For example, git log will not print characters that influence text color, nor use its pager.
Using a pty to launch the process essentially "tricks" the subprocess into thinking it's being launched by a user. As far as I can tell, this is exactly what I want, and the result of this essentially ticks all the boxes.
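A quick way to see the pipe/pty difference for yourself (a sketch; the child just reports whether its stdout looks like a terminal):

require 'pty'

# Under a PTY the child's stdout is a tty, so tty-aware tools keep color and paging
PTY.spawn("ruby", "-e", "puts $stdout.tty?") do |read, _write, pid|
  puts read.gets # => "true"; spawned onto an IO.pipe, the same child prints "false"
  Process.wait(pid)
end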
My general tests for whether a solution fits my needs are:
Does it run ls -al normally?
Does it run vim normally?
Does it run irb normally?
The following Ruby code is able to check all the above:
require 'pty'
require 'io/console'

to_execute = "vim"
output = ""

master, slave = PTY.open
slave.raw!
pid = ::Process.spawn(to_execute, :in => STDIN, [:out, :err] => slave)
slave.close
master.winsize = $stdout.winsize
Signal.trap(:WINCH) { master.winsize = $stdout.winsize }
Signal.trap(:SIGINT) { ::Process.kill("INT", pid) }

master.each_char do |char|
  STDOUT.print char
  output.concat(char)
end

::Process.wait(pid)
master.close
This works for the most part but it turns out it's not perfect. For some reason, certain applications seem to fail to switch into a raw state. Even though vim works perfectly fine, it turned out neovim did not. At first I thought it was a bug in neovim but I have since been able to reproduce the problem using the Termion crate for the Rust language.
By setting to raw manually (IO.console.raw!) before executing, applications like neovim behave as expected, but then applications like irb do not.
Oddly spawning another pty in Python, within this pty, allows the application to work as expected (using python -c 'import pty; pty.spawn("/usr/local/bin/nvim")'). This obviously isn't a real solution, but interesting nonetheless.
For my actual question, I guess I'm looking for any help in resolving the weird raw issue or, if I've completely misunderstood tty/pty, any pointer toward a different direction for where/how I should look at the problem.
[edited: see the bottom for the amended update]
Figured it out :)
To really understand the problem I read up a lot on how a PTY works. I don't think I really understood it properly until I drew it out. Basically a PTY could be used for a terminal emulator, and that was the simplest way to think of its data flow:
keyboard -> OS -> terminal -> master pty -> termios -> slave pty -> shell
                                               |
                                               v
 monitor <- OS <- terminal <- master pty <- termios
(note: this might not be 100% correct, I'm definitely no expert on the subject, just posting it in case it helps anybody else understand it)
So the important bit in the diagram that I hadn't really realised was that when you type, the only reason you see your input on screen is because it's passed back (left-wards) to the master.
So first things first - this Ruby script should set the tty to raw (IO.console.raw!) up front; it can restore it after execution is finished (IO.console.cooked!). This makes sure the keyboard inputs aren't echoed by this parent Ruby script.
Second thing: the slave itself should not be raw, so the slave.raw! call is removed. To explain this, I originally added it because it removes extra carriage returns from the output: running echo hello results in "hello\r\n". What I missed was that this carriage return is a key instruction to the terminal emulator (whoops).
Third thing, the process should only be talking to the slave. Passing STDIN felt convenient, but it upsets the flow shown in the diagram.
This brings up a new problem of how to pass user input through. I first tried passing STDIN through to the master in a thread:
input_thread = Thread.new do
  STDIN.each_char do |char|
    master.putc(char) rescue nil
  end
end
That kind of worked, but it had its own issues, in that some interactive processes weren't receiving a key some of the time. Time will tell, but using IO.copy_stream instead appears to solve that issue (and reads much nicer, of course).
input_thread = Thread.new { IO.copy_stream(STDIN, master) }
update 21st Aug:
So the above example mostly worked, but for some reason keys like CTRL+C still wouldn't behave correctly. I even looked up other people's approaches to see what I could be doing wrong, and they were effectively the same; IO.copy_stream(STDIN, master) was successfully sending 3 to the master. None of the following seemed to help at all:
master.putc 3
master.putc "\x03"
master.putc "\003"
Before I went and delved into trying to achieve this in a lower-level language, I tried out one more thing - the block syntax. Apparently the block syntax magically fixes this problem.
To prevent this answer getting a bit too verbose, the following appears to work:
require 'pty'
require 'io/console'

def run
  output = ""
  IO.console.raw!
  input_thread = nil

  PTY.spawn('bash') do |read, write, pid|
    Signal.trap(:WINCH) { write.winsize = STDOUT.winsize }
    input_thread = Thread.new { IO.copy_stream(STDIN, write) }

    read.each_char do |char|
      STDOUT.print char
      output.concat(char)
    end

    Process.wait(pid)
  end

  input_thread.kill if input_thread
  IO.console.cooked!
end

Bundler.send(:with_env, Bundler.clean_env) do
  run
end
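If you want the captured String afterwards, one option (an assumption, not part of the original) is to make run return output as its final expression; then:

output = run
File.write("session.log", output) # raw bytes, escape sequences and all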

How to stop a process from within the tests, when testing a never-ending process?

I am developing a long-running program in Ruby. I am writing some integration tests for this. These tests need to kill or stop the program after starting it; otherwise the tests hang.
For example, with a file bin/runner
#!/usr/bin/env ruby
while true do
  puts "Hello World"
  sleep 10
end
The (integration) test would be:
class RunReflectorTest < TestCase
  test "it prints a welcome message over and over" do
    out, err = capture_subprocess_io do
      system "bin/runner"
    end
    assert_empty err
    assert_includes out, "Hello World"
  end
end
Only, obviously, this will not work; the test starts and never stops, because the system call never returns.
How should I tackle this? Is the problem in system itself, and would Kernel#spawn provide a solution? If so, how? Somehow the following leaves out empty:
class RunReflectorTest < TestCase
  test "it prints a welcome message over and over" do
    out, err = capture_subprocess_io do
      pid = spawn "bin/runner"
      sleep 2
      Process.kill "TERM", pid
    end
    assert_empty err
    assert_includes out, "Hello World"
  end
end
This direction also seems like it will cause a lot of timing issues (and slow tests). Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered, then immediately kill the subprocess. I cannot find how to do this with Process.
Test Behavior, Not Language Features
First, what you're doing is a TDD anti-pattern. Tests should focus on behaviors of methods or objects, not on language features like loops. If you must test a loop, construct a test that checks for a useful behavior like "entering an invalid response results in a re-prompt." There's almost no utility in checking that a loop loops forever.
However, you might decide to test a long-running process by checking to see:
If it's still running after t time.
If it's performed at least i iterations.
If a loop exits properly given certain input or upon reaching a boundary condition.
Use Timeouts or Signals to End Testing
Second, if you decide to do it anyway, you can just escape the block with Timeout::timeout. For example:
require 'timeout'
# Terminates block
Timeout::timeout(3) { `sleep 300` }
This is quick and easy. However, note that using timeout doesn't actually signal the process. If you run this a few times, you'll notice that sleep is still running multiple times as a system process.
It's better to signal the process when you want to exit with Process::kill, ensuring that you clean up after yourself. For example:
pid = spawn 'sleep 300'
Process::kill 'TERM', pid
sleep 3
Process::wait pid
Aside from resource issues, this is a better approach when you're spawning something stateful and don't want to pollute the independence of your tests. You should almost always kill long-running (or infinite) processes in your test teardown whenever you can.
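As a sketch of that teardown idea in Minitest (bin/runner is the script from the question):

class RunnerTest < Minitest::Test
  def setup
    @pid = Process.spawn("bin/runner", out: File::NULL)
  end

  def teardown
    Process.kill("TERM", @pid)
    Process.wait(@pid) # reap the child so no zombie is left behind
  end
end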
Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered and then immediately kill the subprocess. I cannot find how to do this with Process.
You can redirect the stdout of a spawned process to any file descriptor by specifying the out option:
pid = spawn(command, :out=>"/dev/null") # write mode
Documentation
Example of redirection
With the answer from CodeGnome on how to use Timeout::timeout and the answer from andyconhin on how to redirect Process::spawn IO, I came up with two Minitest helpers that can be used as follows:
it "runs a deamon" do
wait_for(timeout: 2) do
wait_for_spawned_io(regexp: /Hello World/, command: ["bin/runner"])
end
end
The helpers are:
require 'timeout'
require 'shellwords'

def wait_for(timeout: 1, &block)
  Timeout::timeout(timeout) do
    yield block
  end
rescue Timeout::Error
  flunk "Test did not pass within #{timeout} seconds"
end

def wait_for_spawned_io(regexp: //, command: [])
  buffer = ""
  begin
    read_pipe, write_pipe = IO.pipe
    pid = Process.spawn(command.shelljoin, out: write_pipe, err: write_pipe)
    loop do
      buffer << read_pipe.readpartial(1000)
      break if regexp =~ buffer
    end
  ensure
    read_pipe.close
    write_pipe.close
    Process.kill("INT", pid)
  end
  buffer
end
These can be used in a test, which allows me to start a subprocess, capture its STDOUT, and pass as soon as the output matches the regular expression; otherwise it waits until the timeout and flunks (fails the test).
The loop captures output and passes the test once it sees matching output. It uses an IO.pipe because that is most transparent for subprocesses (and their children) to write to.
I doubt this will work on Windows, and wait_for_spawned_io needs some cleaning up, as it is doing slightly too much IMO. Another problem is that the Process.kill("INT") might not reach children which are orphaned but still running after this test has run. I need to find a way to ensure the entire subtree of processes is killed.
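One way to do that (a sketch, untested here) is to start the child in its own process group with spawn's pgroup option and signal the group via a negative pid:

pid = Process.spawn(command.shelljoin, out: write_pipe, err: write_pipe, pgroup: true)
# ...
Process.kill("INT", -pid) # a negative pid signals the whole process group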

Running resque without Rakefile

I have built my own job server, which is essentially a private gem, built as a wrapper around resque.
(I am not running this in a Rails environment)
Everywhere I look, it seems like the documented/recommended way to start the workers is with something like this:
$ QUEUE=* rake resque:work
Which means that it must be executed in a folder where the Rakefile exists.
I am looking for a way to start it without a Rakefile.
What I have learned so far:
I have looked through the issues, in case someone asked a similar question.
I have looked through the wiki, and specifically the FAQ.
I know I can probably create my own "bin" to run it without rake, by analyzing the tasks file.
I saw that resque installs a resque binary, but it only seems to provide limited functionality, like removing and listing workers, but not starting one.
My current workaround is that my gem's binary does a chdir to the gem's folder (which has a Rakefile) before running, like the code below.
def start_worker
  ENV['QUEUE'] = '*'
  Dir.chdir gemdir do
    exec "rake resque:work"
  end
end

def gemdir
  File.expand_path "../../", __dir__
end
Appreciate any nudge in the right direction.
The current solution I have worked up for this:
def start_worker
  interval = 5
  queue = '*'
  ENV['QUEUE'] = queue
  worker = Resque::Worker.new
  Resque.logger = Logger.new STDOUT
  Resque.logger.level = Logger::INFO
  ## this is not yet implemented in 1.26.0, keeping here as a reminder
  # worker.prepare
  worker.log "Starting worker"
  worker.work interval
end
Which is an adaptation of the code from the rake task.
For reference, I also opened a github issue, in the off chance that someone else also needs such functionality.
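One side benefit of running the worker in-process like this is that Resque's normal signal handling still applies, so it can be stopped gracefully from outside (worker_pid below is hypothetical, however you track the worker process):

Process.kill("QUIT", worker_pid) # QUIT asks a Resque worker to finish its current job, then exit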
I have created a script that starts daemonized worker processes using the following worker-starting API.
def start_worker(id)
  ENV['QUEUE'] = @queues || "*"
  ENV['PIDFILE'] = pid_file(id)
  ENV['JOBS_PER_FORK'] = @jobs_per_fork || "1000"
  ENV['BACKGROUND'] = 'true'
  ENV['TERM_CHILD'] = 'true'
  if @debug
    ENV['VVERBOSE'] = 'true'
  else
    ENV['VERBOSE'] = 'true'
  end
  begin
    worker = Resque::Worker.new
  rescue Resque::NoQueueError
    Resque.logger.error "No queue is set for worker_id = #{id}"
  end
  worker.prepare
  worker.log "Starting worker #{self}"
  worker.work(5) # interval, will block
end

How can I determine which examples RSpec will run

I want to execute some code before an arbitrary RSpec test is run, but only in cases where the example groups to be tested are either in a specific directory or carry a specific tag.
For example, if I have the following groups:
## spec/file_one.rb
describe "Spec One - A group which needs the external app running", :external => true do
describe "Spec Two - A group which does not need the external app running" do

## spec/file_two.rb
describe "Spec Three - A group which does NOT need the external app running" do

## spec/always_run/file_three.rb
describe "Spec Four - A group which does need the external app running"
Then I want the code to be executed only when a test run contains Spec One or Spec Four.
This is relatively easy to do when I can rely on the filename, but harder when relying on the tag. How can I check which examples will be run and then check their tags?
I'd just have a support setup like this:
PID_FILE = File.join(Rails.root, "tmp", "pids", "external.pid")

def read_pid
  return nil unless File.exist? PID_FILE
  File.read(PID_FILE).strip.to_i
end

def write_pid(pid)
  File.open(PID_FILE, "w") { |f| f.print pid }
end

def external_running?
  # Some test to see if the external app is running here
  begin
    !!Process.getpgid(read_pid)
  rescue
    false
  end
end

def start_external
  unless external_running?
    write_pid spawn("./run_server")
    # Maybe some wait loop here for the external service to boot up
  end
end

def stop_external
  Process.kill "TERM", read_pid if external_running?
end

RSpec.configure do |c|
  c.before(:each) do |example|
    start_external if example.metadata[:external]
  end

  c.after(:suite) do
    stop_external
  end
end
Each test tagged with :external would attempt to start the external process if it's not already started. Thus, the first time you run a test that needs it, the process would be booted. If no tests with the tag are run, the process is never booted. The suite then cleans up after itself by terminating the process as a part of the shutdown process.
This way, you don't have to pre-process the test list, your tests aren't interdependent, and your external app is automatically cleaned up after. If the external app is running before the test suite gets a chance to invoke it, it will read the pid file and use the existing instance.
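As for the "maybe some wait loop" comment in start_external, a simple poll until the app accepts connections could look like this (the port, host, and timeout are assumptions about the external app):

require 'socket'
require 'timeout'

# Block until the external app accepts TCP connections, or give up after `timeout` seconds
def wait_until_booted(port: 4000, timeout: 10)
  Timeout.timeout(timeout) do
    begin
      TCPSocket.new("localhost", port).close
    rescue Errno::ECONNREFUSED
      sleep 0.1
      retry
    end
  end
end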
Rather than relying on metadata[:external] you could parse the full name of the example and determine if it needs the external app for a more "magical" setup, but that's kind of smelly to me; example descriptions are for humans, not for the spec suite to parse.

How do I ensure only one instance of a Ruby script is running at a time?

I have a process that runs on cron every five minutes. Usually, it takes only a few seconds to run, but sometimes it takes several minutes. I want to ensure that only one version of this is running at a time.
I tried an obvious way...
File.open("/tmp/indexer_lock.tmp",'w') do |f|
exit unless f.flock(File::LOCK_EX)
end
...but it's not testing to see if it can get the lock; it's blocking until the lock is released.
Any idea what I'm missing? I'd rather not hack something using ps, but that's an alternative.
I know this is old, but for anyone interested, there's a non-blocking constant that you can pass to flock so that it returns instead of blocking.
File.new("/tmp/foo.lock").flock( File::LOCK_NB | File::LOCK_EX )
Update for slhck
flock returns 0 (which is truthy) if this process obtained the lock, and false otherwise. So to ensure just one process is running at a time, you just want to try to get the lock and exit if you couldn't. It's as simple as putting an exit unless in front of the line of code I have above:
exit unless File.new("/tmp/foo.lock").flock( File::LOCK_NB | File::LOCK_EX )
Depending on your needs, this should work just fine and doesn't require creating another file anywhere: DATA is a File object on the script itself (opened at the line after __END__), so flocking it locks the script's own file.
exit unless DATA.flock(File::LOCK_NB | File::LOCK_EX)
# your script here
__END__
DO NOT REMOVE: required for the DATA object above.
Although this isn't directly answering your question, if I were you I'd probably write a daemon script (you could use http://daemons.rubyforge.org/)
You could have your indexer (assuming it's indexer.rb) run through a wrapper script named script/index, for example:
require 'rubygems'
require 'daemons'
Daemons.run('indexer.rb')
And your indexer can do almost the same thing, except you specify a sleep interval
loop do
  # code executing your indexing
  sleep INDEXING_INTERVAL
end
This is how job processors in tandem with a queue server usually function.
You could create and delete a temporary file and check for existence of this file.
Please check the answer to this question:
one instance shell script
There's a lockfile gem for exactly this situation. I've used it before and it's dead simple.
If you're using cron, it might be easier to do something like this in the shell script that cron calls:
#!/usr/local/bin/bash
#
if ps -C $PROGRAM_NAME &> /dev/null ; then
  : # Program is already running; appropriate action can be performed here (kill it?)
else
  # Program is not running; launch it.
  $PROGRAM_NAME
fi
Here's a one-liner that should work at the top of any Ruby script:
exit unless File.new(__FILE__).tap { |f| f.autoclose = false }.flock(File::LOCK_NB | File::LOCK_EX)
There are two issues with the original code.
First, the reason it's blocking is that the call to #flock is missing File::LOCK_NB:
Don't block when locking. May be combined
with other lock options using logical or.
Second, if a File object is closed (whether at the end of an #open block as in the code above, via explicit #close, or implicitly auto-closed when the File is garbage-collected), the underlying file descriptor is closed and the lock is released. To prevent this, you can set #autoclose = false.
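To make that concrete, here's a minimal sketch (the lock path is hypothetical); run it in two terminals and the second copy exits immediately:

# Create/open the lock file and hold the lock for the life of the process
lock = File.new("/tmp/myscript.lock", File::CREAT | File::RDWR)
lock.autoclose = false
if lock.flock(File::LOCK_NB | File::LOCK_EX)
  puts "got the lock, working..."
  sleep 60 # stand-in for the real work
else
  puts "another instance holds the lock"
  exit 1
end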
Ok, working off notes from #shodanex's pointer, here's what I have. I rubied it up a little bit (though I don't know of a touch analogue in Ruby).
tmp_file = File.expand_path(File.dirname(__FILE__)) + "/indexer.lock"
if File.exist?(tmp_file)
  puts "quitting"
  exit
else
  `touch #{tmp_file}`
end

# .. do stuff ..

File.delete(tmp_file)
Can you not add File::LOCK_NB to your lock, to make it non-blocking (i.e. it fails if it can't get the lock)? That would work in C, Perl, etc.
At a higher level, you might find the lock_method gem useful:
def the_method_my_cron_job_calls
  # something really expensive
end
lock_method :the_method_my_cron_job_calls
It uses lockfiles stored on the local filesystem (as discussed above) by default, but you can also configure remote lock storage:
LockMethod.config.storage = Redis.new([...]) # a remote RedisToGo instance, perhaps?
Also...
def the_method_my_cron_job_calls
  # something really expensive
end
lock_method :the_method_my_cron_job_calls, (60*60) # automatically expire lock after an hour
