While trying to run a Ruby program and pipe its output to another program like this:
ruby hello.rb | whoami
The command whoami is executed first as expected, but after that, hello.rb crashes with:
Traceback (most recent call last):
2: from p.rb:2:in `<main>'
1: from p.rb:2:in `print'
p.rb:2:in `write': Broken pipe # io_write - <STDOUT> (Errno::EPIPE)
This happens only when STDOUT.sync is set to true:
STDOUT.sync = true
STDOUT.print "Hello!"
A similar error is raised by STDOUT.flush after STDOUT.puts when the output is piped to another program.
What is the reason behind this crash?
Introduction
Firstly, an explanation can be found here.
Anyways, here's my thought...
When a pipe is used like this:
a | b
Both a and b are executed concurrently. b waits for standard input from a.
Speaking of Errno::EPIPE, the Linux man page for write(2) says:
EPIPE fd is connected to a pipe or socket whose reading end is
closed. When this happens the writing process will also
receive a SIGPIPE signal. (Thus, the write return value is
seen only if the program catches, blocks or ignores this
signal.)
Talking about the problem in the question:
whoami never reads its standard input: it prints the user name and exits, which closes the reading end of the pipe. When hello.rb then writes to the pipe, nobody is left to read it, resulting in a broken pipe.
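To see the error without a shell pipeline, here is a minimal sketch that creates a pipe with IO.pipe, closes the reading end, and then writes to it. The method name write_to_closed_pipe is made up for illustration, and the sketch assumes a Unix-like system where Ruby reports the failed write as Errno::EPIPE.

```ruby
# Hypothetical helper: writes to a pipe whose reading end is already closed.
def write_to_closed_pipe
  reader, writer = IO.pipe
  reader.close              # plays the role of a consumer that has exited
  writer.sync = true        # hand every write straight to the OS
  writer.print "Hello!"
  :written
rescue Errno::EPIPE
  :broken_pipe              # the underlying write failed with EPIPE
ensure
  writer.close if writer && !writer.closed?
end

p write_to_closed_pipe
```

The closed reader stands in for whoami here: it is gone before the writer gets around to writing.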
Here I wrote 2 ruby programs, named p.rb and q.rb to test that:
p.rb
#!/usr/bin/env ruby
print ?* * 100_000
q.rb
#!/usr/bin/ruby
exit! 0
Running:
bash[~] $ ruby p.rb | ruby q.rb
Traceback (most recent call last):
2: from p.rb:2:in `<main>'
1: from p.rb:2:in `print'
p.rb:2:in `write': Broken pipe # io_write - <STDOUT> (Errno::EPIPE)
Let's change the code of q.rb a bit, so that it accepts inputs:
#!/usr/bin/ruby -w
STDIN.gets
Running:
bash[~] $ ruby p.rb | ruby q.rb
Right, it actually displays nothing. The reason is that q.rb now waits for standard input. Apparently, the waiting is what matters most here. Now p.rb will not crash, even with STDOUT.sync or STDOUT.flush, when piped to this q.rb.
Another Example:
p.rb
STDOUT.sync = true
loop until print("\e[2K<<<#{Time.now.strftime('%H:%M:%S:%2N')}>>>\r")
[warning: the loop without sleep may bring up your CPU usage]
q.rb
sleep 3
Running:
bash[~] $ time ruby p.rb | ruby q.rb
Traceback (most recent call last):
2: from p.rb:2:in `<main>'
1: from p.rb:2:in `print'
p.rb:2:in `write': Broken pipe # io_write - <STDOUT> (Errno::EPIPE)
real 0m3.186s
user 0m0.282s
sys 0m0.083s
As you can see, the program crashed after about 3 seconds. It would crash after about 5.1 seconds if q.rb had sleep 5. Similarly, sleep 0 in q.rb makes p.rb crash after about 0.1 seconds. I guess the additional 0.1 seconds depends on the system, because my system takes about 0.1 seconds to load the Ruby interpreter.
I wrote two Crystal programs, p.cr and q.cr, to test this. Crystal is compiled, so the programs don't need the extra 0.1 seconds to start up.
The Crystal Programs:
p.cr
STDOUT.sync = true
loop do print("\e[2KHi!\r") end rescue exit
q.cr
sleep 3
I compiled them, and ran:
bash[~] $ time ./p | ./q
real 0m3.013s
user 0m0.007s
sys 0m0.019s
The binary ./p runs for very close to 3 seconds; it then hits "Unhandled exception: Error writing file: Broken pipe (Errno)", which the rescue handles, and exits. The remaining 0.01 seconds is probably the time the two Crystal programs need to start, and the kernel may also take a little time to launch the processes.
Also note that STDERR#print, STDERR#puts, STDERR#putc, STDERR#printf, STDERR#write and STDERR#syswrite don't raise Errno::EPIPE even when the output is in sync. This is expected: | connects only standard output to the pipe, so STDERR is still attached to the terminal.
Conclusion
Pipes are arcane. Setting STDOUT#sync to true or calling STDOUT#flush hands all buffered data to the underlying operating system immediately.
When running hello.rb | whoami without sync, I can write 8191 bytes of data and hello.rb doesn't crash. But with sync, writing a single byte to the pipe crashes hello.rb.
So when hello.rb pushes its output to the pipe right away and whoami has already exited without waiting for it, hello.rb raises Errno::EPIPE because the pipe between the two programs is broken (correct me if I am wrong here).
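The difference between buffered and synced writes can be sketched inside a single process. The helper names below are hypothetical, and the sketch assumes a Unix-like system. The write end of IO.pipe is in sync mode by default, so the sketch turns sync off to mimic a piped STDOUT.

```ruby
# Hypothetical helper: runs the block and reports whether it hit a broken pipe.
def attempt
  yield
  :ok
rescue Errno::EPIPE
  :broken_pipe
end

# Hypothetical helper: a small unsynced write stays in Ruby's own buffer,
# and the broken pipe only surfaces once that buffer is flushed.
def buffered_then_flushed
  reader, writer = IO.pipe
  reader.close                # the consumer is gone
  writer.sync = false         # IO.pipe enables sync by default; turn it off

  outcome = {}
  outcome[:print] = attempt { writer.print "x" }   # buffered, no system call yet
  outcome[:flush] = attempt { writer.flush }       # forces the write to the OS
  outcome
ensure
  begin
    writer.close if writer && !writer.closed?
  rescue Errno::EPIPE
    # closing flushes again; ignore the repeated error
  end
end

p buffered_then_flushed
```

This suggests why the unsynced hello.rb appears to survive: the failing system call is simply postponed until the buffer is flushed.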
Related
I am trying to monitor the progress of copying a raspberry-pi OS image to a microSD card. This is similar to Kill a process called using open3 in ruby, except I'm not killing the process, I'm sending it a command for it to issue a progress message.
rpath = device_path.gsub(/disk/, "rdisk")
puts "\n\nCopying image to #{rpath}"
if false
stdout_err, status = Open3.capture2e( "sudo", "dd", "bs=1m", "if=#{source_path}", "of=#{rpath}" )
puts stdout_err
else
cmd = "sudo dd bs=1m if=#{source_path} of=#{rpath}"
Open3.popen2e(cmd) do |stdin, stdout_err, wait_thr|
Thread.new do
stdout_err.each {|l| puts l}
end
Thread.new do
while true
sleep 5
if true
Process.kill("INFO", wait_thr.pid) #Tried INFO, SIGINFO, USR1, SIGUSR1
# all give: `kill': Operation not permitted (Errno::EPERM)
else
stdin.puts 20.chr #Should send ^T -- has no effect, nothing to terminal during flash
end
end
end
wait_thr.value
end
end
The first section (after 'if false') flashes the image using Open3.capture2e. This works, but of course issues no progress information.
The section after the 'else' flashes the image using Open3.popen2e. It also attempts to display progress by either issuing 'Process.kill("INFO", wait_thr.pid)', or by sending ^T (20.chr) to the stdin stream every 5 seconds.
The Process.kill line generates an "Operation not permitted" error. The stdin.puts line has no effect at all.
One other thing... While the popen2e process is flashing, hitting ctrl-T on the keyboard DOES generate a progress response. I just can't get it to do it programmatically.
Any help is appreciated!
Newer versions of dd have an optional progress bar, as seen here. Even so, I think you'll want to rethink how you execute that shell command, so that it thinks it's attached to a terminal. The easiest thing to do is fork/exec, like:
cmd = "sudo dd bs=1m if=#{source_path} of=#{rpath} status=progress"
fork do
exec(cmd) # this replaces the forked process with the cmd, giving it direct access to your terminal
end
Process.wait() # waits for the child process to exit
If that's not an option you may want to look into other ways of getting unbuffered output, including just writing a bash script instead of a ruby one.
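As a possible variation on fork/exec, Process.spawn also starts a child that inherits the parent's standard input, output and error by default, so the child talks to the terminal directly. This is only a sketch: echo is a harmless stand-in for the real sudo dd command line, which is not reproduced here.

```ruby
# Stand-in command so the sketch is safe to run; in practice this would be
# the sudo dd ... status=progress command line shown above.
cmd = ["echo", "pretend this is dd progress output"]

pid = Process.spawn(*cmd)        # child shares this process's stdin/stdout/stderr
_, status = Process.wait2(pid)   # block until the child exits
puts "child exited with status #{status.exitstatus}"
```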
I'm trying to run a ruby script with nohup:
nohup ruby script.rb &
This takes hours to run, but logs its progress via puts.
Usually, I can look at nohup.out to view the recent output of anything I run with nohup. However, my ruby script doesn't seem to output anything until it finishes, or is killed.
What am I doing wrong?
I'm not familiar with running commands through nohup, but treating this as an "I'm writing to a file and the content only appears after the script exits" type of problem, the usual cause is that the output is being buffered.
So it's very likely that, when run through nohup (which redirects the puts output to nohup.out), STDOUT is no longer a terminal and Ruby buffers it. You might need to flush occasionally or enable sync. Since puts is "equivalent to $stdout.puts":
$stdout.flush # run this, occasionally
# or just
$stdout.sync = true
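The effect of sync can be checked with a temporary file standing in for nohup.out. This is a small sketch; the method name visible_bytes_after_puts is made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: writes one progress line to a temporary log file and
# returns how many bytes another process could already see on disk.
def visible_bytes_after_puts(sync:)
  Tempfile.create("nohup_demo") do |log|
    log.sync = sync
    log.puts "step 1 done"
    File.size(log.path)      # measured before the file is closed
  end
end

puts "without sync: #{visible_bytes_after_puts(sync: false)} bytes visible"
puts "with sync:    #{visible_bytes_after_puts(sync: true)} bytes visible"
```

With sync off, the line sits in Ruby's buffer until the buffer fills up, the file is flushed, or it is closed, which matches the symptom of nohup.out staying empty.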
I am getting unexpected behaviour using popen3, which I want to use to run a command-line tool à la cmd < file1 > file2. The example below hangs, so that "stdout done" is never reached. With tools other than cat, it may hang so that "stdin done" is never reached. I suspect I am suffering from buffering, but how do I fix this?
#!/usr/bin/env ruby
require 'open3'
Open3.popen3("cat") do |stdin, stdout, stderr, wait_thr|
stdin.puts "foobar"
puts "stdin done"
stdout.each_line { |line| puts line }
puts "stdout done"
puts wait_thr.value
end
puts "all done"
stdout.each_line is waiting for further output from cat because cat's output stream is still open. It's still open because cat is still waiting for input from the user because its input stream hasn't been closed yet (you'll notice that when you open cat in a terminal and type in foobar, it will still be running and waiting for input until you press ^d to close the stream).
So to fix this, simply call stdin.close before you print the output.
Your code is hanging, because stdin is still open!
You need to close it with IO#close or with IO#close_write if you use popen3.
If you use popen then you need to use IO#close_write because it only uses one file descriptor.
#!/usr/bin/env ruby
require 'open3'
Open3.popen3("cat") do |stdin, stdout, stderr, wait_thr|
stdin.puts "foobar"
stdin.close # close stdin like this! or with stdin.close_write
stdout.each_line { |line| puts line }
puts wait_thr.value
end
See also:
Ruby 1.8.7 IO#close_write
Ruby 1.9.2 IO#close_write
Ruby 2.3.1 IO#close_write
The answers by Tilo and by sepp2k are right: If you close stdin, your simple test will end. Problem solved.
Though in your comment to the answer of sepp2k, you indicate that you still experience hangs.
Well, there are some traps that you might have overlooked.
Stuck on full buffer for stderr
If you call a program that prints more to stderr than the buffer of an anonymous pipe can hold (64 KiB on current Linux systems), the program blocks on its write and is suspended. A suspended program neither exits nor closes stdout, so reading from its stdout will hang. To do it right, you have to read from both stdout and stderr in parallel or by turns, using threads or IO.select with non-blocking, unbuffered reads, so that you never get stuck.
Stuck on full buffer for stdin
If you try to feed more (much more) than "foobar" to your program (cat), the buffer of the anonymous pipe for stdout will get full. The OS will suspend cat. If you write even more to stdin, the buffer of the anonymous pipe for stdin will get full. Then your call to stdin.write will get stuck. This means: You need to write to stdin, read from stdout and read from stderr in parallel or by turns.
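One way to do the parallel reading by hand is to drain stdout and stderr on separate threads while the main thread writes to stdin. The following is a sketch under assumptions, not a complete solution: run_with_threads is a hypothetical helper, and cat is used as the child only because it is the tool from the question.

```ruby
require "open3"

# Hypothetical helper: feeds input to a child while draining stdout and
# stderr on separate threads, so no pipe buffer can fill up and stall it.
def run_with_threads(cmd, input)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }

    begin
      stdin.write(input)
    rescue Errno::EPIPE
      # the child exited before reading everything; keep what we have
    ensure
      stdin.close                # signals end of input to the child
    end

    [out_reader.value, err_reader.value, wait_thr.value]
  end
end

big_input = "foobar\n" * 100_000   # 700,000 bytes, far more than one pipe buffer
out, err, status = run_with_threads(["cat"], big_input)
puts "read #{out.bytesize} bytes, stderr #{err.bytesize} bytes, success=#{status.success?}"
```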
Conclusion
Read a good book (W. Richard Stevens, "UNIX Network Programming, Volume 2: Interprocess Communications") and use good library functions. IPC (interprocess communication) is complicated and prone to nondeterministic run-time behavior. It is far too much hassle to get it right by trial and error.
Use Open3.capture3.
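For the cat example from the question, that could look like the sketch below. The stdin_data option passes the input to the child, and capture3 takes care of closing stdin and reading both output streams.

```ruby
require "open3"

# capture3 writes stdin_data to the child, closes its stdin, and collects
# stdout, stderr and the exit status without any manual pipe handling.
out, err, status = Open3.capture3("cat", stdin_data: "foobar\n")

puts out
puts "stderr was empty: #{err.empty?}"
puts "exit status: #{status.exitstatus}"
```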
I have Ruby programA that calls Ruby programB with:
system("ruby programB.rb <parameters>")
Under certain conditions, I want programB to terminate its operation (and the associated subshell) but allow programA to continue on to the next set of parameters.
However, exit() and abort() kill both the subshell and the parent, and I am unable to get Process.kill("SIGTERM",0) to work in programB (unfortunately, this is on Windows). I'm running ruby 1.9.2.
How can I terminate programB without also killing programA?
If the regular system call isn't cutting it, the usual way is to do something like this:
pid = fork do
exec("ruby programB.rb ...")
end
Process.kill("SIGTERM", pid)
The fork operation gives you a process identifier you can kill. system will block until the child process returns, so any call to kill in the parent process will affect only the parent process.
Unfortunately there's no fork in Windows, but there are alternatives that achieve the same thing.
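One such alternative is Process.spawn, which returns the child's pid without needing fork. The sketch below uses a sleeping Ruby child as a stand-in for programB and assumes a Unix-like system for the signal behaviour; I have not verified how it behaves on Windows.

```ruby
require "rbconfig"

# Stand-in for programB: a child Ruby process that would otherwise run for a minute.
pid = Process.spawn(RbConfig.ruby, "-e", "sleep 60")

sleep 0.5                          # give the child a moment to start
Process.kill("TERM", pid)          # affects only the child
_, status = Process.wait2(pid)     # reap it and collect its status

puts "child #{pid} stopped, killed by signal: #{status.signaled?}"
puts "parent is still running"
```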
exit() and abort() don't kill the parent, at least not on macOS and Linux, in my experience.
Try saving this as abort.rb:
puts RUBY_VERSION
puts `date`
puts 'aborting'
abort
and this as exit.rb:
puts RUBY_VERSION
puts `date`
puts 'exiting'
exit
Then save this as test.rb in the same directory and run it:
puts `ruby exit.rb`
puts `ruby abort.rb`
On my system I see:
1.9.3
Fri Dec 21 22:17:12 MST 2012
exiting
1.9.3
Fri Dec 21 22:17:12 MST 2012
aborting
They do exit the currently running script in the sub-shell, which then exits because it's not a login shell, and they can set a return status that matters to the calling program, but I have yet to see them kill the parent.
If you need to capture STDERR, using backticks or %x won't work. I'd recommend using Open3.capture3 for simplicity if you need to know what status code was returned, or whether STDERR returned anything.
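A minimal sketch of that, using an inline Ruby one-liner as a stand-in for the child script (the message text and the exit code 3 are arbitrary choices for illustration):

```ruby
require "open3"
require "rbconfig"

# Stand-in child: prints to stdout, warns on stderr, and exits with code 3.
child_code = 'puts "exiting"; warn "something went wrong"; exit 3'

out, err, status = Open3.capture3(RbConfig.ruby, "-e", child_code)

puts "stdout: #{out}"
puts "stderr: #{err}"
puts "exit code: #{status.exitstatus}"
puts "parent is still running"
```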
The only thing that works reliably for me is this:
kill -INT $$
It reliably kills the script and only the script, even if it was source'd from the command line. Note that I'm running GNU bash, version 4.4.12(1)-release (x86_64-apple-darwin15.6.0); I can't recall if this works on bash 3.x