Why does io.Pipe() continue to block even when EOF is reached? - go

While playing with subprocesses and reading stdout through pipes I noticed interesting behaviour.
If I use an io.Pipe() to read the stdout of a subprocess created through os/exec, reading from that pipe hangs forever even when EOF is reached (the process is finished):
cmd := exec.Command("/bin/echo", "Hello, world!")
r, w := io.Pipe()
cmd.Stdout = w
cmd.Start()
io.Copy(os.Stdout, r) // Prints "Hello, world!" but never returns
However, if I use the built-in method StdoutPipe() it works:
cmd := exec.Command("/bin/echo", "Hello, world!")
p, _ := cmd.StdoutPipe()
cmd.Start()
io.Copy(os.Stdout, p) // Prints "Hello, world!" and returns
Digging into the source code of /usr/lib/go/src/os/exec/exec.go, I can see that the StdoutPipe() method actually uses os.Pipe(), not io.Pipe():
pr, pw, err := os.Pipe()
c.Stdout = pw
c.closeAfterStart = append(c.closeAfterStart, pw)
c.closeAfterWait = append(c.closeAfterWait, pr)
return pr, nil
This gives me two clues:
1. File descriptors are being closed at certain points. Critically, the "write" end of the pipe is being closed after the process starts.
2. Instead of io.Pipe() as I used above, os.Pipe() (a lower-level call that roughly maps to pipe(2) in POSIX) is used.
However I am still unable to understand why my original example behaves the way it does after taking into account this newfound knowledge.
If I try to close the write end of an io.Pipe() (instead of an os.Pipe()) then it appears to break it completely and nothing gets read (as if I'm reading from a closed pipe even though I thought I passed it to the subprocess):
cmd := exec.Command("/bin/echo", "Hello, world!")
r, w := io.Pipe()
cmd.Stdout = w
cmd.Start()
w.Close()
io.Copy(os.Stdout, r) // Prints nothing, no read buffer available
Okay, so I guess an io.Pipe() is quite different than an os.Pipe(), and probably doesn't behave like Unix pipes where one close() doesn't close it for everybody.
Just so you don't think I'm asking for a quick fix, I already know I can achieve my expected behaviour by using this code:
cmd := exec.Command("/bin/echo", "Hello, world!")
r, w, _ := os.Pipe() // using os.Pipe() instead of io.Pipe()
cmd.Stdout = w
cmd.Start()
w.Close()
io.Copy(os.Stdout, r) // Prints "Hello, world!" and returns on EOF. Works. :-)
What I'm asking for is why does io.Pipe() seem to ignore an EOF from the writer, leaving the reader blocking forever? A valid answer could be that io.Pipe() is the wrong tool for the job because $REASONS but I can't figure out what those $REASONS are because according to the documentation what I'm trying to do seems perfectly reasonable.
Here is a complete example to illustrate what I'm talking about:
package main
import (
"fmt"
"os"
"os/exec"
"io"
)
func main() {
cmd := exec.Command("/bin/echo", "Hello, world!")
r, w := io.Pipe()
cmd.Stdout = w
cmd.Start()
io.Copy(os.Stdout, r) // Blocks here even though EOF is reached
fmt.Println("Finished io.Copy()")
cmd.Wait()
}

"why does io.Pipe() seem to ignore an EOF from the writer, leaving the reader blocking forever?" Because there is no such thing as "EOF from the writer". All an EOF is (in unix) is an indication to the reader that no processes hold the write side of the pipe open. When a process attempts to read from a pipe which has no writers, the read system call returns a value that is conveniently named EOF. Since your parent still has one copy of the write side of the pipe open, read blocks. Stop thinking of EOF as a thing. It is merely an abstraction, and the writer never "sends" it.

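To make that concrete for io.Pipe specifically: an io.PipeReader only reports io.EOF after Close (or CloseWithError) is called on the corresponding PipeWriter. A minimal sketch of my own, not from the answer above:
package main
import (
    "io"
    "os"
)
func main() {
    r, w := io.Pipe()
    go func() {
        w.Write([]byte("hello\n"))
        w.Close() // without this Close, the io.Copy below would block forever
    }()
    io.Copy(os.Stdout, r) // returns with EOF only once w.Close() has run
}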
You can use a goroutine:
package main
import (
"os"
"os/exec"
"io"
)
func main() {
r, w := io.Pipe()
c := exec.Command("go", "version")
c.Stdout = w
c.Start()
go func() {
io.Copy(os.Stdout, r)
r.Close()
}()
c.Wait()
}
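If you also want the io.Copy itself to observe EOF, instead of relying on the program exiting, one option (my own variant, not part of the answer) is to close the write end of the io.Pipe after Wait returns:
package main
import (
    "io"
    "os"
    "os/exec"
)
func main() {
    r, w := io.Pipe()
    c := exec.Command("go", "version")
    c.Stdout = w
    done := make(chan struct{})
    go func() {
        io.Copy(os.Stdout, r) // returns once w is closed below
        close(done)
    }()
    c.Start()
    c.Wait()  // the command has exited and exec has finished writing into w
    w.Close() // unblocks the io.Copy in the goroutine
    <-done
}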

Related

Capture stdout from exec.Command line by line and also pipe to os.Stdout

Can anyone help?
I have an application I am running via exec.CommandContext (so I can cancel it via ctx). It would normally not stop unless it errors out.
I currently have it relaying its output to os.Stdout, which is working great. But I also want to get each line via a channel - the idea behind this is that I will look for a regular expression on the line, and if it matches I will set an internal state of "ERROR", for example.
I tried NewScanner, although I can't get it to work. Here is my code.
As I say, it does output to os.Stdout, which is great, but I would also like to receive each line, as it happens, on the channel I set up.
Any ideas?
Thanks in advance.
func (d *Daemon) Start() {
ctx, cancel := context.WithCancel(context.Background())
d.cancel = cancel
go func() {
args := "-x f -a 1"
cmd := exec.CommandContext(ctx, "mydaemon", strings.Split(args, " ")...)
var stdoutBuf, stderrBuf bytes.Buffer
cmd.Stdout = io.MultiWriter(os.Stdout, &stdoutBuf)
cmd.Stderr = io.MultiWriter(os.Stderr, &stderrBuf)
lines := make(chan string)
go func() {
scanner := bufio.NewScanner(os.Stdin)
for scanner.Scan() {
fmt.Println("I am reading a line!")
lines <- scanner.Text()
}
}()
err := cmd.Start()
if err != nil {
log.Fatal(err)
}
select {
case outputx := <-lines:
// I will do something with this!
fmt.Println("Hello!!", outputx)
case <-ctx.Done():
log.Println("I am done!, probably cancelled!")
}
}()
}
Also tried using this
go func() {
scanner := bufio.NewScanner(&stdoutBuf)
for scanner.Scan() {
fmt.Println("I am reading a line!")
lines <- scanner.Text()
}
}()
Even with that, the "I am reading a line!" message never gets printed; I also debugged it and it never enters the "for scanner..." loop.
I also tried scanning &stderrBuf; same thing, nothing comes through.
cmd.Start() does not wait for the command to finish. Also, cmd.Wait() needs to be called to be informed about the end of the process.
reader, writer := io.Pipe()
cmdCtx, cmdDone := context.WithCancel(context.Background())
scannerStopped := make(chan struct{})
go func() {
defer close(scannerStopped)
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
}()
cmd := exec.Command("ls")
cmd.Stdout = writer
_ = cmd.Start()
go func() {
_ = cmd.Wait()
cmdDone()
writer.Close()
}()
<-cmdCtx.Done()
<-scannerStopped
scannerStopped is added to demonstrate that the scanner goroutine stops now.
reader, writer := io.Pipe()
scannerStopped := make(chan struct{})
go func() {
defer close(scannerStopped)
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
}()
cmd := exec.Command("ls")
cmd.Stdout = writer
_ = cmd.Run()
writer.Close()
<-scannerStopped
And handle the lines as needed.
Note: I wrote this in a bit of a hurry. Let me know if anything is unclear or not correct.
For a correct program using concurrency and goroutines, we should try to show there are no data races, the program can't deadlock, and goroutines don't leak. Let's try to achieve this.
Full code
Playground: https://play.golang.org/p/Xv1hJXYQoZq. I recommend copying and running it locally, because as far as I know the playground doesn't stream output, and it has timeouts.
Note that I've changed the test command to % find /usr/local, a typically long-running command (>3 seconds) with plenty of output lines, since it is better suited for the scenarios we should test.
Walkthrough
Let's look at the Daemon.Start method. At the start, it is mostly the same. Most noticeably, though, the new code doesn't have a goroutine around a large part of the method. Even without this, the Daemon.Start method remains non-blocking and will return "immediately".
The first noteworthy fix is these updated lines.
outR, outW := io.Pipe()
cmd.Stdout = io.MultiWriter(outW, os.Stdout)
Instead of constructing a bytes.Buffer variable, we call io.Pipe. If we didn't make this change and stuck with a bytes.Buffer, scanner.Scan() would return false as soon as there is no more data to read. This can happen if the command writes to stdout only occasionally (even a millisecond apart, for that matter). After scanner.Scan() returns false, the goroutine exits and we miss processing future output.
Instead, by using the read end of io.Pipe, scanner.Scan() will wait for input from the pipe's read end until the pipe's write end is closed.
This fixes the race issue between the scanner and the command output.
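To illustrate the difference (a standalone sketch of my own, not part of the answer's program): a Scanner over a bytes.Buffer stops as soon as the buffer is currently empty, while a Scanner over an io.Pipe reader waits until data arrives or the write end is closed.
package main
import (
    "bufio"
    "bytes"
    "fmt"
    "io"
)
func main() {
    var buf bytes.Buffer
    sc := bufio.NewScanner(&buf)
    fmt.Println(sc.Scan()) // false right away: the buffer is empty now, even if output would arrive later

    pr, pw := io.Pipe()
    go func() {
        fmt.Fprintln(pw, "late line")
        pw.Close()
    }()
    sc = bufio.NewScanner(pr)
    fmt.Println(sc.Scan(), sc.Text()) // true "late line": Scan waited for the write
}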
Next, we construct two closely-related goroutines: the first to consume from <-lines, and the second to produce into lines<-.
go func() {
for line := range lines {
fmt.Println("output line from channel:", line)
...
}
}()
go func() {
defer close(lines)
scanner := bufio.NewScanner(outR)
for scanner.Scan() {
lines <- scanner.Text()
}
...
}()
The consumer goroutine will exit when the lines channel is closed, as the closing of the channel would naturally cause the range loop to terminate; the producer goroutine closes lines upon exit.
The producer goroutine will exit when scanner.Scan() returns false, which happens when the write end of the io.Pipe is closed. This closing happens in upcoming code.
Note from the two paragraphs above that the two goroutines are guaranteed to exit (i.e. will not leak).
Next, we start the command. Standard stuff, it's a non-blocking call, and it returns immediately.
// Start the command.
if err := cmd.Start(); err != nil {
log.Fatal(err)
}
Moving on to the final piece of code in Daemon.Start. This goroutine waits for the command to exit via cmd.Wait(). Handling this is important because the command may exit for reasons other than Context cancellation.
In particular, we want to close the write end of the io.Pipe (which, in turn, causes the output-lines producer goroutine to exit, as mentioned earlier).
go func() {
err := cmd.Wait()
fmt.Println("command exited; error is:", err)
outW.Close()
...
}()
As a side note, by waiting on cmd.Wait(), we don't have to separately wait on ctx.Done(). Waiting on cmd.Wait() handles both exits caused by natural reasons (command successfully finished, command ran into an internal error, etc.) and exits caused by Context cancellation.
This goroutine, too, is guaranteed to exit. It will exit when cmd.Wait() returns. This can happen either because the command exited normally with success, exited with failure due to a command error, or exited with failure due to Context cancellation.
That's it! We should have no data races, no deadlocks, and no leaked goroutines.
The lines elided ("...") in the snippets above are geared towards the Done(), CmdErr(), and Cancel() methods of the Daemon type. These methods are fairly well-documented in the code, so these elided lines are hopefully self-explanatory.
Besides that, look for the TODO comments for error handling you may want to do based on your needs!
Test it!
Use this driver program to test the code.
func main() {
var d Daemon
d.Start()
// Enable this code to test Context cancellation:
// time.AfterFunc(100*time.Millisecond, d.Cancel)
<-d.Done()
fmt.Println("d.CmdErr():", d.CmdErr())
}
You have to scan stdoutBuf instead of os.Stdin:
scanner := bufio.NewScanner(&stdoutBuf)
The command is terminated when the context is canceled. If it's OK to read all output from the command until the command is terminated, then use this code:
func (d *Daemon) Start() {
ctx, cancel := context.WithCancel(context.Background())
d.cancel = cancel
args := "-x f -a 1"
cmd := exec.CommandContext(ctx, "mydaemon", strings.Split(args, " ")...)
stdout, err := cmd.StdoutPipe()
if err != nil {
log.Fatal(err)
}
err = cmd.Start()
if err != nil {
log.Fatal(err)
}
go func() {
defer cmd.Wait()
scanner := bufio.NewScanner(stdout)
for scanner.Scan() {
s := scanner.Text()
fmt.Println(s) // echo to stdout
// Do something with s
}
}()
}
The command is terminated when the context is canceled.
Read on stdout returns io.EOF when the command is terminated. The goroutine breaks out of the scan loop when stdout returns an error.
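If you want to know why the scan loop ended, you can check scanner.Err() after it: it is nil on a plain io.EOF and non-nil for any other read error. A small addition (mine) to the loop in the goroutine above:
scanner := bufio.NewScanner(stdout)
for scanner.Scan() {
    s := scanner.Text()
    fmt.Println(s) // echo to stdout
    // Do something with s
}
if err := scanner.Err(); err != nil {
    // a read error other than EOF
    log.Println("scanner error:", err)
}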

manipulating/reading before passing to command

How to read/manipulate the input from a connection that is passed to a command stdin?
For example, given the following code.
c, _ := net.Dial("tcp", somehost)
cmd := exec.Command("/bin/sh")
cmd.Stdin, cmd.Stdout, cmd.Stderr = c, c, c
cmd.Run()
How would it be possible to reverse the string from the connection before it is passed to cmd.Stdin, or how could I parse the string and not pass it on to cmd.Stdin at all?
I've considered reading from the connection with bufio and then passing it to Command's second argument (the args), but I was hoping for a better solution that does not require me to handle all the different cases for argument input to a command, and instead just passes the data on to Stdin after analysing it.
OK, since you mentioned in the comments that "my real issue is how to intercept the input from the connection, parse it and pass it to the Stdin of the command. Seems when I do cmd.Run() I block and hence can't really continuously parse"
Here is how I will do it:
import (
"io"
"os"
"os/exec"
)
func main() {
//All errors are not checked
cmd := exec.Command("/bin/sh")
cmdStdin, _ := cmd.StdinPipe()
go func() {
defer cmdStdin.Close()
//here you will need to loop on reading the connection,
//for simplicity lets assume you do that & receive data
//let says you got ls from connection
cmdStdin.Write([]byte("ls\n"))
}()
cmdStdout, _ := cmd.StdoutPipe()
go io.Copy(os.Stdout, cmdStdout)
cmd.Run()
}
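To tie this back to the original question (intercepting what comes from the connection before it reaches the shell), the goroutine above could read from the connection line by line, transform each line, and then forward it. A sketch under those assumptions, where c is the net.Conn from the question, bufio and strings are imported alongside the packages already shown, and strings.ToUpper stands in for whatever parsing or rewriting you actually need:
go func() {
    defer cmdStdin.Close()
    sc := bufio.NewScanner(c) // c is the net.Conn from the question
    for sc.Scan() {
        line := sc.Text()
        // inspect, filter, or rewrite the line here before forwarding it
        io.WriteString(cmdStdin, strings.ToUpper(line)+"\n")
    }
}()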

Golang os/exec flushing stdin without closing it

I would like to manage a process in Go with the package os/exec. I would like to start it and be able to read the output and write several times to the input.
The process I launch in the code below, menu.py, is just a python script that does an echo of what it has in input.
func ReadOutput(rc io.ReadCloser) (string, error) {
x, err := ioutil.ReadAll(rc)
s := string(x)
return s, err
}
func main() {
cmd := exec.Command("python", "menu.py")
stdout, err := cmd.StdoutPipe()
Check(err)
stdin, err := cmd.StdinPipe()
Check(err)
err = cmd.Start()
Check(err)
go func() {
defer stdin.Close() // If I don't close the stdin pipe, the python code will never take what I write in it
io.WriteString(stdin, "blub")
}()
s, err := ReadOutput(stdout)
if err != nil {
Log("Process is finished ..")
}
Log(s)
// STDIN IS CLOSED, I CAN'T RETRY !
}
And the simple code of menu.py :
while 1 == 1:
    name = raw_input("")
    print "Hello, %s. \n" % name
The Go code works, but if I don't close the stdin pipe after I write to it, the python code never takes what is in it. That is okay if I only want to send one thing to the input at a time, but what if I want to send something again a few seconds later? The pipe is closed! What should I do? The question could be "How do I flush a pipe from the WriteCloser interface?", I suppose.
I think the primary problem here is that the python process doesn't work the way you might expect. Here's a bash script echo.sh that does the same thing:
#!/bin/bash
while read INPUT
do echo "Hello, $INPUT."
done
Calling this script from a modified version of your code doesn't have the same issue with needing to close stdin:
func ReadOutput(output chan string, rc io.ReadCloser) {
r := bufio.NewReader(rc)
for {
x, _ := r.ReadString('\n')
output <- string(x)
}
}
func main() {
cmd := exec.Command("bash", "echo.sh")
stdout, err := cmd.StdoutPipe()
Check(err)
stdin, err := cmd.StdinPipe()
Check(err)
err = cmd.Start()
Check(err)
go func() {
io.WriteString(stdin, "blab\n")
io.WriteString(stdin, "blob\n")
io.WriteString(stdin, "booo\n")
}()
output := make(chan string)
defer close(output)
go ReadOutput(output, stdout)
for o := range output {
Log(o)
}
}
The Go code needed a few minor changes: the ReadOutput method had to be modified so that it doesn't block, since ioutil.ReadAll would have waited for an EOF before returning.
Digging a little deeper, it looks like the real problem is the behaviour of raw_input - it doesn't flush stdout as expected. You can pass the -u flag to python to get the desired behaviour:
cmd := exec.Command("python", "-u", "menu.py")
or update your python code to use sys.stdin.readline() instead of raw_input() (see this related bug report: https://bugs.python.org/issue526382).
Even though there is some problem with your python script, the main problem is the Go pipe. A trick to solve this is to use two pipes, as follows:
// parentprocess.go
package main
import (
"bufio"
"log"
"io"
"os/exec"
)
func request(r *bufio.Reader, w io.Writer, str string) string {
w.Write([]byte(str))
w.Write([]byte("\n"))
str, err := r.ReadString('\n')
if err != nil {
panic(err)
}
return str[:len(str)-1]
}
func main() {
cmd := exec.Command("bash", "menu.sh")
inr, inw := io.Pipe()
outr, outw := io.Pipe()
cmd.Stdin = inr
cmd.Stdout = outw
if err := cmd.Start(); err != nil {
panic(err)
}
go cmd.Wait()
reader := bufio.NewReader(outr)
log.Printf(request(reader, inw, "Tom"))
log.Printf(request(reader, inw, "Rose"))
}
The subprocess code follows the same logic as your python code:
#!/usr/bin/env bash
# menu.sh
while true; do
read -r name
echo "Hello, $name."
done
If you want to use your python code you should do some changes:
while 1 == 1:
    try:
        name = raw_input("")
        print "Hello, %s. \n" % name
        sys.stdout.flush()  # there's a stdout buffer
    except:
        pass  # make sure this process won't die when it comes across EOF
// StdinPipe returns a pipe that will be connected to the command's
// standard input when the command starts.
// The pipe will be closed automatically after Wait sees the command exit.
// A caller need only call Close to force the pipe to close sooner.
// For example, if the command being run will not exit until standard input
// is closed, the caller must close the pipe.
func (c *Cmd) StdinPipe() (io.WriteCloser, error) {}
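To connect that doc comment back to the question: you only need to Close the stdin pipe when you are completely done writing. A sketch of my own (reusing the echo.sh script from the earlier answer) that writes to the same StdinPipe several seconds apart without closing it in between:
package main
import (
    "io"
    "os"
    "os/exec"
    "time"
)
func main() {
    cmd := exec.Command("bash", "echo.sh")
    stdin, _ := cmd.StdinPipe()
    stdout, _ := cmd.StdoutPipe()
    cmd.Start()
    done := make(chan struct{})
    go func() {
        io.Copy(os.Stdout, stdout)
        close(done)
    }()
    io.WriteString(stdin, "first\n")
    time.Sleep(2 * time.Second) // the pipe simply stays open between writes
    io.WriteString(stdin, "second\n")
    stdin.Close() // close only once you are finished; echo.sh then sees EOF and exits
    <-done        // drain the remaining output before calling Wait
    cmd.Wait()
}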

Transparent (filter-like) gzip/gunzip in Go

I'm trying, just for fun, to connect a gzip Writer directly to a gzip Reader, so I could write to the Writer and read from the Reader on the fly. I expected to read exactly what I wrote. I'm using gzip, but I'd like to use this method also with crypto/aes; I suppose it should work very similarly, and it could be used with other readers/writers like jpeg, png...
This is my best attempt; it is not working, but I hope you can see what I mean: http://play.golang.org/p/7qdUi9wwG7
package main
import (
"bytes"
"compress/gzip"
"fmt"
)
func main() {
s := []byte("Hello world!")
fmt.Printf("%s\n", s)
var b bytes.Buffer
gz := gzip.NewWriter(&b)
ungz, err := gzip.NewReader(&b)
fmt.Println("err: ", err)
gz.Write(s)
gz.Flush()
uncomp := make([]byte, 100)
n, err2 := ungz.Read(uncomp)
fmt.Println("err2: ", err2)
fmt.Println("n: ", n)
uncomp = uncomp[:n]
fmt.Printf("%s\n", uncomp)
}
It seems that gzip.NewReader(&b) is trying to read immediately, and an EOF is returned.
You'll need to do two things to make it work:
1. Use an io.Pipe to connect the reader and writer together - you can't read and write from the same buffer.
2. Run the reading and writing in separate goroutines. Because the first thing that gzip does is attempt to read the header, you'll get a deadlock unless you have another goroutine attempting to write it.
Here is what that looks like
Playground
func main() {
s := []byte("Hello world!")
fmt.Printf("%s\n", s)
in, out := io.Pipe()
gz := gzip.NewWriter(out)
go func() {
ungz, err := gzip.NewReader(in)
fmt.Println("err: ", err)
uncomp := make([]byte, 100)
n, err2 := ungz.Read(uncomp)
fmt.Println("err2: ", err2)
fmt.Println("n: ", n)
uncomp = uncomp[:n]
fmt.Printf("%s\n", uncomp)
}()
gz.Write(s)
gz.Flush()
}
Use a pipe. For example,
Package io
func Pipe
func Pipe() (*PipeReader, *PipeWriter)
Pipe creates a synchronous in-memory pipe. It can be used to connect
code expecting an io.Reader with code expecting an io.Writer. Reads on
one end are matched with writes on the other, copying data directly
between the two; there is no internal buffering. It is safe to call
Read and Write in parallel with each other or with Close. Close will
complete once pending I/O is done. Parallel calls to Read, and
parallel calls to Write, are also safe: the individual calls will be
gated sequentially.
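A minimal sketch of that idea on its own (mine, not from the answer): one goroutine writes into the *PipeWriter, while the *PipeReader is handed to code that expects an io.Reader.
package main
import (
    "fmt"
    "io"
    "io/ioutil"
)
func main() {
    pr, pw := io.Pipe()
    go func() {
        defer pw.Close() // Close signals EOF to the reading side
        fmt.Fprintln(pw, "written into the pipe") // code expecting an io.Writer
    }()
    data, _ := ioutil.ReadAll(pr) // code expecting an io.Reader
    fmt.Print(string(data))
}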

Redirect stdout pipe of child process in Go

I'm writing a program in Go that executes a server-like program (also Go). Now I want to have the stdout of the child program in my terminal window where I started the parent program. One way to do this is with the cmd.Output() function, but this prints the stdout only after the process has exited. (That's a problem because this server-like program runs for a long time and I want to read the log output.)
The variable out is of type io.ReadCloser and I don't know what I should do with it to achieve my task, and I can't find anything helpful on the web on this topic.
func main() {
cmd := exec.Command("/path/to/my/child/program")
out, err := cmd.StdoutPipe()
if err != nil {
fmt.Println(err)
}
err = cmd.Start()
if err != nil {
fmt.Println(err)
}
//fmt.Println(out)
cmd.Wait()
}
Explanation of the code: uncomment the Println call to get the code to compile; I know that Println(out io.ReadCloser) is not a meaningful call (it produces the output &{3 |0 <nil> 0}). These lines are only there to get the code to compile.
Now I want to have the stdout of the child program in my terminal
window where I started the parent program.
No need to mess with pipes or goroutines, this one is easy.
func main() {
// Replace `ls` (and its arguments) with something more interesting
cmd := exec.Command("ls", "-l")
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.Run()
}
I believe that if you import io and os and replace this:
//fmt.Println(out)
with this:
go io.Copy(os.Stdout, out)
(see documentation for io.Copy and for os.Stdout), it will do what you want. (Disclaimer: not tested.)
By the way, you'll probably want to capture standard-error as well, by using the same approach as for standard-output, but with cmd.StderrPipe and os.Stderr.
For those who don't need this in a loop, but would like the command output to echo into the terminal without having cmd.Wait() blocking other statements:
package main
import (
"fmt"
"io"
"log"
"os"
"os/exec"
)
func checkError(err error) {
if err != nil {
log.Fatalf("Error: %s", err)
}
}
func main() {
// Replace `ls` (and its arguments) with something more interesting
cmd := exec.Command("ls", "-l")
// Create stdout, stderr streams of type io.Reader
stdout, err := cmd.StdoutPipe()
checkError(err)
stderr, err := cmd.StderrPipe()
checkError(err)
// Start command
err = cmd.Start()
checkError(err)
// Don't let main() exit before our command has finished running
defer cmd.Wait() // Doesn't block
// Non-blockingly echo command output to terminal
go io.Copy(os.Stdout, stdout)
go io.Copy(os.Stderr, stderr)
// I love Go's trivial concurrency :-D
fmt.Printf("Do other stuff here! No need to wait.\n\n")
}
