I am streaming command output to a client with the code below. The command is built with context cancellation: the client sends a "cancel" request to the server, which signals the client's cancelCh, which in turn triggers cancel().
The issue is that when the command is cancelled, the rest of the command output still streams to the client as if the command had not been cancelled. Only after the command completes is exit status 1 received, which shows that the command was indeed cancelled.
If I move the done channel to block after cmd.Wait() instead of before, I get the behavior I expect. The client immediately gets exit status 1 and no more data is sent. But that seems to cause a data race issue: https://github.com/golang/go/issues/19685. That issue is old but I think it's relevant.
What is the proper way to stream output to the client in real-time while also immediately exiting via context cancellation?
go func() {
	defer func() {
		cancel()
	}()
	<-client.cancelCh
}()
output := make(chan []byte)
go execute(cmd, output)
for data := range output {
	fmt.Fprintf(w, "data: %s\n\n", data)
	flusher.Flush()
}

func execute(cmd *exec.Cmd, output chan []byte) {
	defer close(output)
	cmdReader, err := cmd.StdoutPipe()
	if err != nil {
		output <- []byte(fmt.Sprintf("Error getting stdout pipe: %v", err))
		return
	}
	cmd.Stderr = cmd.Stdout
	scanner := bufio.NewScanner(cmdReader)
	done := make(chan struct{})
	go func() {
		for scanner.Scan() {
			output <- scanner.Bytes()
		}
		done <- struct{}{}
	}()
	err = cmd.Start()
	if err != nil {
		output <- []byte(fmt.Sprintf("Error executing: %v", err))
		return
	}
	<-done
	err = cmd.Wait()
	if err != nil {
		output <- []byte(err.Error())
	}
	//<-done
}
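One possible shape for the forwarding loop, offered as a sketch rather than a definitive answer: check the context before every send, so that lines read after cancellation are dropped instead of streamed. The helper name streamLines and the in-memory io.Pipe standing in for the command's stdout are assumptions made for illustration; with a real command the reader would be the one returned by cmd.StdoutPipe().

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
)

// streamLines (hypothetical helper) forwards each line read from r to out.
// It stops at end of input or as soon as it notices that ctx is cancelled,
// and closes out on return so the consumer's range loop ends.
func streamLines(ctx context.Context, r io.Reader, out chan<- string) {
	defer close(out)
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		if ctx.Err() != nil {
			return // cancelled: drop this line and everything after it
		}
		// Text() returns a fresh string; the slice from Bytes() may be
		// overwritten by the next Scan, so it is unsafe to send as-is.
		select {
		case out <- scanner.Text():
		case <-ctx.Done():
			return
		}
	}
}

// demo feeds two lines through an in-memory pipe and cancels between them.
func demo() []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	pr, pw := io.Pipe()
	out := make(chan string)
	go streamLines(ctx, pr, out)

	var got []string
	pw.Write([]byte("first\n")) // returns once the scanner has read it
	got = append(got, <-out)
	cancel()
	pw.Write([]byte("second\n")) // read by the scanner, but never forwarded
	pw.Close()
	for line := range out {
		got = append(got, line)
	}
	return got
}

func main() {
	fmt.Println(demo())
}
```

Note that a Scan call that is already blocked only notices the cancellation when the next line (or end of input) arrives, so the consumer loop may also want to select on ctx.Done() rather than rely on the channel closing promptly.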
I run a client and a socket server written in Go (1.12) on macOS localhost.
The server calls SetKeepAlive and SetKeepAlivePeriod on the net.TCPConn.
The client sends a packet and then either closes the connection (FIN) or is terminated abruptly.
Tcpdump shows that even after the client closes the connection, the server keeps sending keep-alive probes.
Shouldn't it detect that the peer is "dead" and close the connection?
The question is generic, feel free to clarify if I'm missing some basics.
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
	"time"
)

func main() {
	var client bool
	flag.BoolVar(&client, "client", false, "")
	flag.Parse()
	if client {
		fmt.Println("Client mode")
		conn, err := net.Dial("tcp", "127.0.0.1:12345")
		checkErr("Dial", err)
		written, err := conn.Write([]byte("howdy"))
		checkErr("Write", err)
		fmt.Printf("Written: %v\n", written)
		fmt.Println("Holding conn")
		time.Sleep(60 * time.Second)
		err = conn.Close()
		checkErr("Close", err)
		fmt.Println("Closed conn")
		return
	}
	fmt.Println("Server mode")
	l, err := net.Listen("tcp", "127.0.0.1:12345")
	checkErr("listen", err)
	defer l.Close()
	for {
		c, err := l.Accept()
		checkErr("accept", err)
		defer c.Close()
		tcpConn := c.(*net.TCPConn)
		err = tcpConn.SetKeepAlive(true)
		checkErr("SetKeepAlive", err)
		err = tcpConn.SetKeepAlivePeriod(5 * time.Second)
		checkErr("SetKeepAlivePeriod", err)
		b := make([]byte, 1024)
		n, err := c.Read(b)
		checkErr("read", err)
		fmt.Printf("Received: %v\n", string(b[:n]))
	}
}

func checkErr(location string, err error) {
	if err != nil {
		fmt.Printf("%v: %v\n", location, err)
		os.Exit(-1)
	}
}
The response to that question:
Sending keepalives is only necessary when you need the connection open but idle. In that case there is a risk that the connection is broken, so keepalive probes try to detect broken connections.
If you had closed the connection on the server side with a proper conn.Close(), the keepalive would not be triggered (you deferred it to the end of the main function).
If you test your server code, you will see it start sending keepalives after the timeout you set.
You will notice that only after all the keepalive probes (default 9 from the kernel) and the time between the probes (8x) do you get an io.EOF error on the server-side Read (yes, the server stops sending)!
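To illustrate closing the server side promptly, here is a minimal sketch of a read loop that keeps reading until the peer closes and then closes its own end. The helper name drain is hypothetical, and an in-memory net.Pipe stands in for the TCP connection (an assumption made so the example runs without a network); on a real net.TCPConn, Read likewise returns io.EOF once the peer's FIN has been received.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// drain (hypothetical helper) reads from conn until Read fails, then closes
// our side of the connection. It returns everything received and whether
// the loop ended because the peer closed cleanly (io.EOF).
func drain(conn net.Conn) (string, bool) {
	defer conn.Close()
	var received []byte
	buf := make([]byte, 1024)
	for {
		n, err := conn.Read(buf)
		received = append(received, buf[:n]...)
		if err != nil {
			return string(received), err == io.EOF
		}
	}
}

// demo plays the client from the question: send one message, then close.
func demo() (string, bool) {
	server, client := net.Pipe()
	go func() {
		client.Write([]byte("howdy"))
		client.Close()
	}()
	return drain(server)
}

func main() {
	msg, eof := demo()
	fmt.Printf("received %q, peer closed cleanly: %v\n", msg, eof)
}
```

In a real server each accepted connection would typically get its own goroutine running such a loop, so that the accept loop is not blocked and no connection is left open until main returns.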
Currently the Go implementation is the same on Linux and OS X: it sets both TCP_KEEPINTVL and TCP_KEEPIDLE to the value you pass to the setKeepAlivePeriod function, so the behavior will depend on the kernel version.
func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
	// The kernel expects seconds so round to next highest second.
	d += (time.Second - time.Nanosecond)
	secs := int(d.Seconds())
	if err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs); err != nil {
		return wrapSyscallError("setsockopt", err)
	}
	err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs)
	runtime.KeepAlive(fd)
	return wrapSyscallError("setsockopt", err)
}
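The rounding step at the top of that function can be tried in isolation. The sketch below copies just the arithmetic into a standalone function; the name roundUpSeconds is made up for this example and is not part of the net package.

```go
package main

import (
	"fmt"
	"time"
)

// roundUpSeconds (hypothetical name) repeats the rounding used in the quoted
// setKeepAlivePeriod: add just under one second, then truncate to whole
// seconds, so any fractional second is rounded up.
func roundUpSeconds(d time.Duration) int {
	d += time.Second - time.Nanosecond
	return int(d.Seconds())
}

func main() {
	for _, d := range []time.Duration{
		5 * time.Second,
		1500 * time.Millisecond,
		time.Nanosecond,
	} {
		fmt.Printf("%v -> %d s\n", d, roundUpSeconds(d))
	}
}
```

A whole number of seconds is passed through unchanged, while anything with a fractional part is bumped to the next second, which is the value then handed to both socket options.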
There is a request, open since 2014, to provide a way to set the keepalive time and interval separately.
Some references:
rfc1122
net: enable TCP keepalive on new connections from net.Dial
net: enable TCP keepalives by default
TCP keep-alive to determine if client disconnected in netty
Using TCP keepalive with Go
In Go, I'm trying to:
start a subprocess
read from stdout and stderr separately
implement an overall timeout
After much googling, we've come up with some code that seems to do the job, most of the time. But there seems to be a race condition whereby some output is not read.
The problem seems to only occur on Linux, not Windows.
Following the simplest possible solution found with google, we tried creating a context with a timeout:
context.WithTimeout(context.Background(), 10*time.Second)
While this worked most of the time, we were able to find cases where it would just hang forever. There was some aspect of the child process that caused this to deadlock. (Something to do with grandchildren that were not sufficiently dissociated from the child process, and thus caused the child to never completely exit.)
Also, it seemed that in some cases the error returned when the timeout occurs would indicate a timeout, but would only be delivered after the process had actually exited (thus making the whole concept of the timeout useless).
func GetOutputsWithTimeout(command string, args []string, timeout int) (io.ReadCloser, io.ReadCloser, int, error) {
	start := time.Now()
	procLogger.Tracef("Initializing %s %+v", command, args)
	cmd := exec.Command(command, args...)
	// get pipes to standard output/error
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.StdoutPipe() error: %+v", err.Error())
	}
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.StderrPipe() error: %+v", err.Error())
	}
	// setup buffers to capture standard output and standard error
	var buf bytes.Buffer
	var ebuf bytes.Buffer
	// create a channel to capture any errors from wait
	done := make(chan error)
	// create a semaphore to indicate when both pipes are closed
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		if _, err := buf.ReadFrom(stdout); err != nil {
			procLogger.Debugf("%s: Error Slurping stdout: %+v", command, err)
		}
		wg.Done()
	}()
	go func() {
		if _, err := ebuf.ReadFrom(stderr); err != nil {
			procLogger.Debugf("%s: Error Slurping stderr: %+v", command, err)
		}
		wg.Done()
	}()
	// start process
	procLogger.Debugf("Starting %s", command)
	if err := cmd.Start(); err != nil {
		procLogger.Errorf("%s: failed to start: %+v", command, err)
		return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.Start() error: %+v", err.Error())
	}
	go func() {
		procLogger.Debugf("Waiting for %s (%d) to finish", command, cmd.Process.Pid)
		err := cmd.Wait() // this can be 'forced' by the killing of the process
		procLogger.Tracef("%s finished: errStatus=%+v", command, err) // err could be nil here
		//notify select of completion, and the status
		done <- err
	}()
	// Wait for timeout or completion.
	select {
	// Timed out
	case <-time.After(time.Duration(timeout) * time.Second):
		elapsed := time.Since(start)
		procLogger.Errorf("%s: timeout after %.1f\n", command, elapsed.Seconds())
		if err := TerminateTree(cmd); err != nil {
			return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), -1,
				fmt.Errorf("failed to kill %s, pid=%d: %+v",
					command, cmd.Process.Pid, err)
		}
		wg.Wait() // this *should* take care of waiting for stdout and stderr to be collected after we killed the process
		return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), -1,
			fmt.Errorf("%s: timeout %d s reached, pid=%d process killed",
				command, timeout, cmd.Process.Pid)
	//Exited normally or with a non-zero exit code
	case err := <-done:
		wg.Wait() // this *should* take care of waiting for stdout and stderr to be collected after the process terminated naturally.
		elapsed := time.Since(start)
		procLogger.Tracef("%s: Done after %.1f\n", command, elapsed.Seconds())
		rc := -1
		// Note that we have to use go1.10 compatible mechanism.
		if err != nil {
			procLogger.Tracef("%s exited with error: %+v", command, err)
			exitErr, ok := err.(*exec.ExitError)
			if ok {
				ws := exitErr.Sys().(syscall.WaitStatus)
				rc = ws.ExitStatus()
			}
			procLogger.Debugf("%s exited with status %d", command, rc)
			return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), rc,
				fmt.Errorf("%s: process done with error: %+v",
					command, err)
		} else {
			ws := cmd.ProcessState.Sys().(syscall.WaitStatus)
			rc = ws.ExitStatus()
		}
		procLogger.Debugf("%s exited with status %d", command, rc)
		return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), rc, nil
	}
	//NOTREACHED: should not reach this line!
}
Calling GetOutputsWithTimeout("uname", []string{"-mpi"}, 10) will return the expected single line of output most of the time. But sometimes it will return no output, as if the goroutine that reads stdout didn't start soon enough to "catch" all the output (or exited early?). The "most of the time" strongly suggests a race condition.
We will also sometimes see errors from the goroutines about "file already closed" (this seems to happen with the timeout condition, but will happen at other "normal" times as well).
I would have thought that starting the goroutines before the cmd.Start() would have ensured that no output would be missed, and that using the WaitGroup would guarantee they would both complete before reading the buffers.
So how are we missing output? Is there still a race condition between the two "reader" goroutines and the cmd.Start()? Should we ensure those two are running using yet another WaitGroup?
Or is there a problem with the implementation of ReadFrom()?
Note that we are currently using go1.10 due to backward-compatibility problems with older OSs but the same effect occurs with go1.12.4.
Or are we overthinking this, and a simple implementation with context.WithTimeout() would do the job?
But sometimes it will return no output, as if the goroutine that reads stdout didn't start soon enough to "catch" all the output
This is impossible, because a pipe can't "lose" data. If the process is writing to stdout and the Go program isn't reading yet, the process will block.
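The buffering claim can be checked with a small sketch. It uses os.Pipe directly rather than a child process (an assumption made to keep the example self-contained): the writer finishes and closes its end before any read is attempted, and the reader still receives every byte. The payload string is just a stand-in for a command's output.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

// demo writes to a pipe and closes the write end before anyone reads,
// then reads the pipe to EOF and returns what was received.
func demo() (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	// The "process" side: write a short line and exit (close the write end).
	if _, err := w.Write([]byte("hello from child\n")); err != nil {
		w.Close()
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}

	// The reader starts only now; the bytes were held in the pipe buffer.
	// ioutil.ReadAll is used to match the thread's go1.10 constraint.
	data, err := ioutil.ReadAll(r)
	return string(data), err
}

func main() {
	out, err := demo()
	fmt.Printf("%q %v\n", out, err)
}
```

What the sketch does not cover is the read end being closed while a reader is still working, which is a different failure from the pipe dropping data.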
The simplest way to approach the problem is:
Launch goroutines to collect stdout, stderr
Start the process
Launch a timer that kills the process
Wait for the readers to finish, then call .Wait() (the process either exits on its own or is killed by the timer)
If timer is fired, return timeout error
Handle wait error
func GetOutputsWithTimeout(command string, args []string, timeout int) ([]byte, []byte, int, error) {
	cmd := exec.Command(command, args...)
	// get pipes to standard output/error
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return nil, nil, -1, fmt.Errorf("cmd.StdoutPipe() error: %+v", err.Error())
	}
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return nil, nil, -1, fmt.Errorf("cmd.StderrPipe() error: %+v", err.Error())
	}
	// setup buffers to capture standard output and standard error
	var stdoutBuf, stderrBuf []byte
	// create 2 reader goroutines: stdout, stderr.
	// Use a waitgroup to wait.
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		var err error
		if stdoutBuf, err = ioutil.ReadAll(stdout); err != nil {
			log.Printf("%s: Error Slurping stdout: %+v", command, err)
		}
		wg.Done()
	}()
	go func() {
		var err error
		if stderrBuf, err = ioutil.ReadAll(stderr); err != nil {
			log.Printf("%s: Error Slurping stderr: %+v", command, err)
		}
		wg.Done()
	}()
	// start process
	if err := cmd.Start(); err != nil {
		return nil, nil, -1, fmt.Errorf("cmd.Start() error: %+v", err.Error())
	}
	// Arm the timer only after Start, so cmd.Process is set when it fires.
	t := time.AfterFunc(time.Duration(timeout)*time.Second, func() {
		cmd.Process.Kill()
	})
	// Let the readers drain both pipes before calling Wait: the os/exec docs
	// say it is incorrect to call Wait before all reads from the pipes have
	// completed, because Wait closes them.
	wg.Wait()
	err = cmd.Wait()
	timedOut := !t.Stop()
	// check if the timer timed out.
	if timedOut {
		return stdoutBuf, stderrBuf, -1,
			fmt.Errorf("%s: timeout %d s reached, pid=%d process killed",
				command, timeout, cmd.Process.Pid)
	}
	if err != nil {
		rc := -1
		if exitErr, ok := err.(*exec.ExitError); ok {
			rc = exitErr.Sys().(syscall.WaitStatus).ExitStatus()
		}
		return stdoutBuf, stderrBuf, rc,
			fmt.Errorf("%s: process done with error: %+v",
				command, err)
	}
	// cmd.Wait docs say that if err == nil, exit code is 0
	return stdoutBuf, stderrBuf, 0, nil
}
I'm running a bash command to start up a server in the background : "./starServer &" However, my server takes a few seconds to start up. I'm wondering what I can do to continuously check the port that it's running on to ensure it's up before I actually move on and do other things. I couldn't find anything in the golang api that helped with this. Any help is appreciated!
c := exec.Command("/bin/sh", "-c", command)
err := c.Start()
if err != nil {
	log.Fatalf("error: %v", err)
}
l, err1 := net.Listen("tcp", ":" + port)
You could connect to the port using net.DialTimeout or net.Dial, and if successful, immediately close it. You can do this in a loop until successful.
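for {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort("", port), timeout)
	if err == nil {
		// The port accepted a connection, so the server is up.
		conn.Close()
		break
	}
	// Not listening yet: pause briefly instead of spinning.
	time.Sleep(100 * time.Millisecond)
}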
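A variant with an overall deadline may be useful, so that a server that never comes up does not block the caller forever. This is a minimal sketch; the helper name waitForPort and the chosen intervals are assumptions for illustration, and the demo uses a throwaway listener on a loopback port in place of the real server.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForPort (hypothetical helper) dials addr repeatedly until a connection
// succeeds or the total time budget is used up.
func waitForPort(addr string, total, interval time.Duration) error {
	deadline := time.Now().Add(total)
	for {
		conn, err := net.DialTimeout("tcp", addr, interval)
		if err == nil {
			conn.Close() // we only wanted to know that it is listening
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("%s not reachable after %v: %w", addr, total, err)
		}
		time.Sleep(interval)
	}
}

// demo checks one address while a listener is up and again after it closed.
func demo() (up bool, down bool) {
	l, err := net.Listen("tcp", "127.0.0.1:0") // port chosen by the OS
	if err != nil {
		return false, false
	}
	addr := l.Addr().String()
	up = waitForPort(addr, time.Second, 50*time.Millisecond) == nil
	l.Close()
	down = waitForPort(addr, 200*time.Millisecond, 50*time.Millisecond) != nil
	return up, down
}

func main() {
	up, down := demo()
	fmt.Println("reachable while listening:", up)
	fmt.Println("reported unreachable after close:", down)
}
```

A successful dial only shows that something is accepting connections on the port; if the server needs further warm-up, an application-level health check would be a more reliable signal.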
A simple tiny library (I wrote) for a similar purpose might also be of interest: portping.
I'm using Go on an OSX machine and trying to make a program to open an external application and then after few seconds, close it - the application, not exit the Go script.
I'm using the library available on https://github.com/skratchdot/open-golang to start the app and it works fine. I also already have the timeout running. But the problem comes when I have to close the application.
Would someone give a hint of how I would be able to exit the app?
Thanks in advance.
It looks like that library is hiding details that you'd use to close the program, specifically the process ID (PID).
If you launch instead with the os/exec package or get a handle on that PID then you can use the Process object to kill or send signals to the app to try and close it gracefully.
https://golang.org/pkg/os/#Process
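As a rough sketch of the "close it gracefully" idea: send SIGTERM through the Process handle first and fall back to Kill if the program has not exited within a grace period. The helper name stopGracefully, the grace period, and the use of the sleep command as a stand-in for the external application are all assumptions for illustration, and the signal part applies to Unix-like systems such as macOS and Linux only.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// stopGracefully (hypothetical helper) asks an already started command to
// exit with SIGTERM and kills it if it is still running after grace.
// It reports whether the hard kill was needed.
func stopGracefully(cmd *exec.Cmd, grace time.Duration) (killed bool, err error) {
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }() // reap the process when it exits

	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return false, err
	}
	select {
	case <-done:
		return false, nil // exited on its own after the signal
	case <-time.After(grace):
		if err := cmd.Process.Kill(); err != nil {
			return false, err
		}
		<-done // let the Wait goroutine finish
		return true, nil
	}
}

// demo starts a long-running stand-in process and stops it right away.
func demo() (bool, error) {
	cmd := exec.Command("sleep", "30") // stand-in for the external application
	if err := cmd.Start(); err != nil {
		return false, err
	}
	return stopGracefully(cmd, 2*time.Second)
}

func main() {
	killed, err := demo()
	fmt.Println("needed hard kill:", killed, "error:", err)
}
```

Whether a GUI application reacts to SIGTERM by shutting down cleanly depends on the application, which is why the fallback to Kill is kept.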
Thank you guys for the help. I was able to do what I was trying to do with the following code.
cmd := exec.Command(path)
if err := cmd.Start(); err != nil {
	log.Fatalf("failed to start: %v", err)
}
// Wait in a goroutine so the select below can race it against the timer.
done := make(chan error, 1)
go func() {
	done <- cmd.Wait()
}()
select {
case <-time.After(30 * time.Second): // Kills the process after 30 seconds
	if err := cmd.Process.Kill(); err != nil {
		log.Fatal("failed to kill: ", err)
	}
	<-done // allow goroutine to exit
	log.Println("process killed")
	indexInit()
case err := <-done:
	if err != nil {
		log.Printf("process done with error = %v", err)
	}
}
I placed that right after starting the app with the os/exec package, as #JimB recommended.