How to tell whether an exec.Cmd was canceled - Go

I'm trying to return a specific error when a command is canceled by its context.
After looking into ProcessState, I found that an exit code of -1 means the process was terminated by a signal (or has not exited):
https://golang.org/pkg/os/#ProcessState.ExitCode
Is there a more elegant way?
Could I produce this error from the cancel function?
Or is the exit code not a reliable enough indicator that the command was canceled?
var (
	CmdParamsErr      = errors.New("failed to get params for execution command")
	ExecutionCanceled = errors.New("command canceled")
)
func execute(m My) error {
	filePath, args, err := cmdParams(m)
	if err != nil {
		log.Infof("cmdParams: err: %v\n, m: %v\n", err, m)
		return CmdParamsErr
	}
	var out bytes.Buffer
	var errStd bytes.Buffer
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	cmd := exec.CommandContext(ctx, filePath, args...)
	cmd.Stdout = &out
	cmd.Stderr = &errStd
	err = cmd.Run()
	if err != nil {
		// ExitCode returns -1 if the process was terminated by a signal
		// or has not exited.
		if cmd.ProcessState.ExitCode() == -1 {
			log.Warnf("execution was canceled by signal, err: %v\n", err)
			err = ExecutionCanceled
			return err
		} else {
			log.Errorf("run failed, err: %v, filePath: %v, args: %v\n", err, filePath, args)
			return err
		}
	}
	return err
}

exec.ExitError doesn't carry a reason for the exit code (it has neither a relevant struct field nor an Unwrap method), so you have to check the context directly:
if ctx.Err() != nil {
	log.Println("canceled")
}
Note that this is slightly racy: the context may be canceled just after the command failed for a different reason, and there is nothing you can do about that.

There is no straightforward or elegant way to figure out if a process was killed because a context was canceled. The closest you can come is this:
func run() error {
	ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancel()
	cmd := exec.CommandContext(ctx, "bash", "-c", "exit 1")
	// Start() returns an error if the process can't be started. It will return
	// ctx.Err() if the context is expired before starting the process.
	if err := cmd.Start(); err != nil {
		return err
	}
	if err := cmd.Wait(); err != nil {
		if e, ok := err.(*exec.ExitError); ok {
			// If the process exited by itself, just return the error to the
			// caller.
			if e.Exited() {
				return e
			}
			// We know now that the process could be started, but didn't exit
			// by itself. Something must have killed it. If the context is done,
			// we can *assume* that it has been killed by the exec.Command.
			// Let's return ctx.Err() so our user knows that this *might* be
			// the case.
			select {
			case <-ctx.Done():
				return ctx.Err()
			default:
				return e
			}
		}
		return err
	}
	return nil
}
The problem is that this check is racy, so returning ctx.Err() can be misleading. For example, imagine the following scenario:
1. The process starts.
2. The process is killed by an external actor.
3. The context is canceled.
4. You check the context.
At this point the function above returns ctx.Err(), even though the process was not killed because the context was canceled. If you use code similar to the function above, keep in mind that it is only an approximation.

Related

Why is this code getting `Error during running of the command error="exec: not started"`?

Here's my code (writeFromProcessToFileWithMax is an internal function, and is working properly):
// Go routines for non-blocking reading of stdout and stderr and writing to files
g := new(errgroup.Group)
// Read stdout in goroutine.
g.Go(func() error {
err = writeFromProcessToFileWithMax(stdoutScanner, stdoutFileWriter, maxStdoutFileLengthInGB)
if err != nil {
log.Error().Err(err).Msgf("Error writing to stdout file: %s", stdoutFilename)
return err
}
return nil
})
// Read stderr in goroutine.
g.Go(func() error {
err = writeFromProcessToFileWithMax(stderrScanner, stderrFileWriter, maxStderrFileLengthInGB)
if err != nil {
log.Error().Err(err).Msgf("Error writing to stderr file: %s", stderrFilename)
return err
}
return nil
})
// Wait the command in a goroutine.
g.Go(func() error {
return cmd.Wait()
})
// Starting the command
if err = cmd.Start(); err != nil {
log.Error().Err(err).Msg("Error starting command")
return err
}
// Waiting until errorGroups groups are done
if err = g.Wait(); err != nil {
log.Error().Err(err).Msg("Error during running of the command")
}
When I run it, I get the following error: Error during running of the command error="exec: not started". But everything works properly.
Will this come back to bite me, or should I suppress it?
You're waiting for cmd before you start it. In your code, cmd.Wait() is usually invoked before cmd.Start(). (There is no guarantee about the relative order of operations in two different goroutines unless you explicitly use synchronisation points.)
Swap the order so that cmd.Start() runs before the goroutine that calls cmd.Wait() is launched:
// Start the command first.
if err = cmd.Start(); err != nil {
	log.Error().Err(err).Msg("Error starting command")
	return err
}
// Only then wait for the command in a goroutine.
g.Go(func() error {
	return cmd.Wait()
})
When you start the goroutine that waits after starting the command, you're guaranteed to perform cmd.Start() and cmd.Wait() in the correct order.
As to why it seemed to work: g.Wait() "blocks until all function calls from the Go method have returned, then returns the first non-nil error (if any) from them."
So all the goroutines complete, including the ones that copy the output, and only then do you see the error from the one that called cmd.Wait().
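The ordering can be sketched end to end with the standard library alone. This swaps errgroup for sync.WaitGroup and io.Copy into buffers purely to keep the example self-contained; runAndCapture is a hypothetical name. It also waits for both readers before calling Wait, since the os/exec documentation says it is incorrect to call Wait before all reads from the pipes have completed.

```go
package main

import (
	"bytes"
	"io"
	"os/exec"
	"sync"
)

// runAndCapture (hypothetical helper) starts the command first, drains
// stdout and stderr in goroutines, and calls Wait only after both readers
// have finished.
func runAndCapture(name string, args ...string) (string, string, error) {
	cmd := exec.Command(name, args...)
	outPipe, err := cmd.StdoutPipe()
	if err != nil {
		return "", "", err
	}
	errPipe, err := cmd.StderrPipe()
	if err != nil {
		return "", "", err
	}

	// Start must happen before anything calls Wait.
	if err := cmd.Start(); err != nil {
		return "", "", err
	}

	var outBuf, errBuf bytes.Buffer
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); io.Copy(&outBuf, outPipe) }()
	go func() { defer wg.Done(); io.Copy(&errBuf, errPipe) }()
	wg.Wait() // both pipes hit EOF once the process closes them

	err = cmd.Wait()
	return outBuf.String(), errBuf.String(), err
}
```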

Calling an executable with a timeout

I am trying to use the context package to run a binary with a 10 second timeout, as such:
func RunQL(file db.File, flag string) string {
// 10-second timeout for the binary to run
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "qltool", "run", "-f", file.Path, "--rootfs", file.Root, "--args", flag)
out, err := cmd.Output()
// check to see if our timeout was executed
if ctx.Err() == context.DeadlineExceeded {
return ""
}
// no timeout (either completed successfully or errored)
if err != nil {
return ""
}
return string(out)
}
But for some reason, it still hangs if the process runs longer than 10 seconds, and I'm not sure what is causing this. I also noticed that the documentation for the CommandContext() function appears to be wrong or misleading. It shows the following code:
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()
if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
// This will fail after 100 milliseconds. The 5 second sleep
// will be interrupted.
}
}
But CommandContext() returns type *Cmd, not error.

script command execution hung forever in go program

func Run() error {
log.Info("In Run Command")
cmd := exec.Command("bash", "/opt/AlterKafkaTopic.sh")
stdout, err := cmd.StdoutPipe()
if err != nil {
return err
}
if err = cmd.Start(); err != nil {
return err
}
f, err := os.Create(filepath.Join("/opt/log/", "execution.log"))
if err != nil {
return err
}
if _, err := io.Copy(f, stdout); err != nil {
return err
}
if err := cmd.Wait(); err != nil {
return err
}
return f.Close()
}
I am trying to execute a bash script from Go code. The script changes some Kafka topic properties. But the execution hangs at io.Copy(f, stdout) and does not continue past it.
This program is running on a RHEL 7.2 server.
Could someone suggest where I am going wrong?
From the docs:
Wait will close the pipe after seeing the command exit.
In other words, io.Copy exits when Wait() is called, but Wait is never called because it's blocked by Copy. Either run Copy in a goroutine, or simply assign f to cmd.Stdout:
f, err := os.Create(filepath.Join("/opt/log/", "execution.log"))
if err != nil {
	return err
}
defer f.Close()
cmd := exec.Command("bash", "/opt/AlterKafkaTopic.sh")
cmd.Stdout = f // the child writes directly to the file, so no pipe or io.Copy is needed
return cmd.Run()

Is there a good way to cancel a blocking read?

I've got a command I have to run via the OS, let's call it 'login', that is interactive and therefore requires me to read from stdin and pass the input to the command's stdin pipe for it to work correctly. The only problem is that the goroutine blocks on a read from stdin, and I haven't been able to find a way to cancel a Reader in Go so that it doesn't hang on the blocking call. For example, from the user's perspective, after the command appears to have completed, you can still write to stdin once more (after which the goroutine moves past the blocking read and exits).
Ideally I would like to avoid parsing output from the command's StdoutPipe, as that would make my code fragile and error-prone if the strings printed by the login command were to change.
loginCmd := exec.Command("login")
stdin , err := loginCmd.StdinPipe()
if err != nil {
return err
}
out, err := loginCmd.StdoutPipe()
if err != nil {
return err
}
if err := loginCmd.Start(); err != nil {
return err
}
ctx, cancel := context.WithCancel(context.TODO())
var done sync.WaitGroup
done.Add(1)
ready := make(chan bool, 1)
defer cancel()
go func(ctx context.Context) {
reader := bufio.NewReader(os.Stdin)
for {
select {
case <- ctx.Done():
done.Done()
return
default:
//blocks on this line, if a close can unblock the read, then it should exit normally via the ctx.Done() case
line, err :=reader.ReadString('\n')
if err != nil {
fmt.Println("Error: ", err.Error())
}
stdin.Write([]byte(line))
}
}
}(ctx)
var bytesRead = 4096
output := make([]byte, bytesRead)
reader := bufio.NewReader(out)
for err == nil {
bytesRead, err = reader.Read(output)
if err != nil && err != io.EOF {
return err
}
fmt.Printf("%s", output[:bytesRead])
}
if err := loginCmd.Wait(); err != nil {
return err
}
cancel()
done.Wait()

Libcontainer - Running multiple processes in the container

I am trying to implement something to the effect of docker run and docker exec using libcontainer.
I have been able to create a container and run a process inside it with the following code:
func Run(id string, s *specs.LinuxSpec, f *Factory) (int, error) {
...
container, err := f.CreateContainer(id, config)
if err != nil {
return -1, err
}
process := newProcess(s.Process)
tty, err := newTty(s.Process.Terminal, process, rootuid)
if err != nil {
return -1, err
}
// Defer Close only after the error check, so it is never called on a failed tty.
defer tty.Close()
defer func() {
if derr := Destroy(container); derr != nil {
err = derr
}
}()
handler := NewSignalHandler(tty)
defer handler.Close()
if err := container.Start(process); err != nil {
return -1, err
}
return handler.forward(process)
}
This works great (I believe), but the problem comes when I have to run another process(es) inside the same container. For example, a container is already running (the main process is running in foreground mode): How can I achieve what Docker allows you to do with docker exec?
I have the following code:
func Exec(container libcontainer.Container, process *libcontainer.Process, onData func(data []byte), onErr func(err error)) (int, error) {
reader, writer := io.Pipe()
process.Stdin = os.Stdin
rootuid, err := container.Config().HostUID()
if err != nil {
return -1, err
}
tty, err := newTty(true, process, rootuid)
if err != nil {
return -1, err
}
// Defer Close only after the error check, so it is never called on a failed tty.
defer tty.Close()
handler := NewSignalHandler(tty)
defer handler.Close()
// Redirect process output
process.Stdout = writer
process.Stderr = writer
// Todo: Fix this, it waits for the main process to exit before it starts
if err := container.Start(process); err != nil {
return -1, err
}
go func(reader io.Reader) {
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
onData(scanner.Bytes())
}
if err := scanner.Err(); err != nil {
onErr(err)
}
}(reader)
return handler.forward(process)
}
This also works, but the problem is that it waits for the main process to exit before it runs. Sometimes it does run, but my memory usage goes to 100% after calling that function 5-7 times with a simple whoami command.
I'm pretty sure I am doing something wrong, I just don't know what. Or is my understanding of containers failing me?
I used this project as a reference:
https://github.com/opencontainers/runc
It's probably better to use Docker as a reference for your case, because it uses the same libcontainer.Container objects both for starting a container and for exec-ing new processes in it. You can find the code that interacts with libcontainer here:
https://github.com/docker/docker/tree/master/daemon/execdriver/native
It's also better to post the whole code, so people can try it and debug it to help you.
EDIT:
Here is example code for running multiple containers: https://gist.github.com/anonymous/407eb530c0cb6c87ec9f
I ran it like this:
go run procs.go path-to-busybox
You can see with ps that there are indeed multiple processes in the container.
Feel free to ask if you have any questions.

Resources