I need to call lots of short-lived (and occasionally some long-lived) external processes in rapid succession and process both stdout and stderr in realtime. I've found numerous solutions for this using StdoutPipe and StderrPipe with a bufio.Scanner for each, packaged into goroutines. This works most of the time, but it swallows the external command's output occasionally, and I can't figure out why.
Here's a minimal example displaying that behaviour on MacOS X (Mojave) and on Linux:
package main
import (
"bufio"
"log"
"os/exec"
"sync"
)
func main() {
for i := 0; i < 50000; i++ {
log.Println("Loop")
var wg sync.WaitGroup
cmd := exec.Command("echo", "1")
stdout, err := cmd.StdoutPipe()
if err != nil {
panic(err)
}
cmd.Start()
stdoutScanner := bufio.NewScanner(stdout)
stdoutScanner.Split(bufio.ScanLines)
wg.Add(1)
go func() {
for stdoutScanner.Scan() {
line := stdoutScanner.Text()
log.Printf("[stdout] %s\n", line)
}
wg.Done()
}()
cmd.Wait()
wg.Wait()
}
}
I've left out the stderr handling for this. When running this, I get only about 49,900 [stdout] 1 lines (the actual number varies with each run), though there should be 50,000. I'm seeing 50,000 loop lines, so it doesn't seem to die prematurely. This smells like a race condition somewhere, but I can't figure out where.
It works just fine if I don't put the scanning loop in a goroutine, but then I lose the ability to simultaneously read stderr, which I need.
I've tried running this with -race, Go reports no data races.
I'm out of ideas, what am I getting wrong?
You're not checking for errors in several places.
In some, this is not actually causing problems, but it's still a good idea to check:
cmd.Start()
may return an error, in which case the command was never run. (This is not the actual problem.)
When stdoutScanner.Scan() returns false, stdoutScanner.Err() may show an error. If you start checking this, you'll find some errors:
2020/02/19 15:38:17 [stdout err] read |0: file already closed
This error is a symptom rather than the root cause, but it matches what you see: not all of the output gets read. Now, why would reading stdout claim that the file is closed? Well, where did stdout come from? It's from here:
stdout, err := cmd.StdoutPipe()
Take a look at the source code for this function, which ends with these lines:
c.closeAfterStart = append(c.closeAfterStart, pw)
c.closeAfterWait = append(c.closeAfterWait, pr)
return pr, nil
(and pr is the pipe-read return value). Hmm: what could closeAfterWait mean?
Now, here are your last two lines in your loop:
cmd.Wait()
wg.Wait()
That is, first we wait for cmd to finish. (When cmd finishes, what gets closed?) Then we wait for the goroutine that's reading cmd's stdout to finish. (Hm, what could still be reading from the pr pipe?)
The fix is now obvious: swap the wg.Wait(), which waits for the consumer of the stdout pipe to finish reading it, with the cmd.Wait(), which waits for echo ... to exit and then closes the read end of the pipe. If you close while the readers are still reading, they may never read what you expected.
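A minimal sketch of the corrected ordering, extended to stderr as well. The helper names scanLines and runOnce are made up for illustration and are not part of the original code. The point is only that both readers finish (wg.Wait) before cmd.Wait closes the pipes, and that the errors from Start, Scanner.Err and Wait are checked:

```go
package main

import (
	"bufio"
	"io"
	"log"
	"os/exec"
	"sync"
)

// scanLines reads r line by line and returns the lines plus any scan error.
func scanLines(r io.Reader) ([]string, error) {
	var lines []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

// runOnce (hypothetical helper) runs one command and returns its stdout and
// stderr lines. The readers are waited for BEFORE cmd.Wait is called.
func runOnce(name string, args ...string) ([]string, []string, error) {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return nil, nil, err
	}
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return nil, nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, nil, err
	}

	var (
		wg                 sync.WaitGroup
		outLines, errLines []string
		outErr, errErr     error
	)
	wg.Add(2)
	go func() {
		defer wg.Done()
		outLines, outErr = scanLines(stdout)
	}()
	go func() {
		defer wg.Done()
		errLines, errErr = scanLines(stderr)
	}()

	wg.Wait()             // first: both pipes have been read to EOF
	waitErr := cmd.Wait() // then: reap the process and close the pipes

	for _, e := range []error{outErr, errErr, waitErr} {
		if e != nil {
			return outLines, errLines, e
		}
	}
	return outLines, errLines, nil
}

func main() {
	total := 0
	for i := 0; i < 200; i++ {
		out, _, err := runOnce("echo", "1")
		if err != nil {
			log.Fatal(err)
		}
		total += len(out)
	}
	log.Printf("read %d stdout lines from 200 runs", total)
}
```

The readers see EOF once the child exits and its ends of the pipes are closed, so wg.Wait returns without needing cmd.Wait first.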
I'm writing a Jupyter kernel for Go, and before executing the Go code I create a side named pipe (syscall.Mkfifo) as a mechanism to allow one to publish html, images, etc.
My kernel creates the fifo, and then opens it for reading in a new goroutine and polls for input accordingly. Now, opening a fifo for reading is a blocking operation (it waits until someone opens on the other side).
But some programs are not interested in using this mechanism and will never open the other side of the fifo. When this happens my goroutine leaks, and it forever waits on the opening fifo -- even after I remove it.
The code looks more or less (less error handling) like this:
...
syscall.Mkfifo(pipePath, 0600)
go func() {
pipeReader, _ := os.Open(pipePath)
go poll(pipeReader) // Reads and processes each entry until pipeReader is closed.
<-doneChan
pipeReader.Close()
}()
go func() {
<-doneChan
os.Remove(pipePath)
}()
Where doneChan is closed when the program I start finishes executing.
And the issue is that os.Open(pipePath) never returns if the other end is never opened,
even though the os.Remove(pipePath) is properly executed.
Is there a way to forcefully interrupt os.Open(pipePath), or another way of achieving the same thing?
Thanks in advance!!
Ugh, it took just a bit more coffee and thinking.
I forgot that I can open the other end of the pipe myself, if I know the program I executed didn't open it. And this unblocks the open for reading.
The revised code, in case anyone bumps into this:
pipeOpenedForReading := false
var muFifo sync.Mutex
syscall.Mkfifo(pipePath, 0600)
go func() {
pipeReader, _ := os.Open(pipePath)
muFifo.Lock()
pipeOpenedForReading = true
muFifo.Unlock()
go poll(pipeReader) // Reads and processes each entry until pipeReader is closed.
<-doneChan
pipeReader.Close()
}()
go func() {
<-doneChan
muFifo.Lock()
if !pipeOpenedForReading {
// Open for writing, unblocking the os.Open for reading.
f, err := os.OpenFile(pipePath, os.O_WRONLY, 0600)
if err == nil {
f.Close()
}
}
muFifo.Unlock()
os.Remove(pipePath)
}()
You can open the FIFO in non-blocking mode, according to the documentation. So something like this should work:
go func() {
pipeReader, _ := os.OpenFile(pipePath, syscall.O_RDONLY|syscall.O_NONBLOCK, 0)
go poll(pipeReader) // Reads and processes each entry until pipeReader is closed.
<-doneChan
pipeReader.Close()
}()
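A small self-contained sketch of the non-blocking open, for Unix-like systems only. The helper name openFifoNonBlocking and the temporary path are assumptions for illustration. It only shows that the open returns without any writer being present. A real poll loop should also expect reads to report io.EOF while no writer is connected, and retry:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// openFifoNonBlocking (hypothetical helper) opens the read end of the FIFO
// at path without waiting for a writer to open the other end.
func openFifoNonBlocking(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_RDONLY|syscall.O_NONBLOCK, 0)
}

func run() error {
	// Temporary directory and FIFO name are made up for this demo.
	dir, err := os.MkdirTemp("", "fifo-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	pipePath := filepath.Join(dir, "side.fifo")
	if err := syscall.Mkfifo(pipePath, 0600); err != nil {
		return err
	}

	// No writer ever opens the FIFO here, yet this call returns.
	pipeReader, err := openFifoNonBlocking(pipePath)
	if err != nil {
		return err
	}
	defer pipeReader.Close()

	fmt.Println("opened", pipeReader.Name(), "without waiting for a writer")
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```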
I have this snippet of code which concurrently runs a function using an input and output channel and associated WaitGroups, but I was clued in to the fact that I've done some things wrong. Here's the code:
func main() {
concurrency := 50
var tasksWG sync.WaitGroup
tasks := make(chan string)
output := make(chan string)
for i := 0; i < concurrency; i++ {
tasksWG.Add(1)
// evidently, because I'm processing tasks in a goroutine, I'm not blocking, and I end up closing the tasks channel almost immediately and stopping tasks from executing
go func() {
for t := range tasks {
output <- process(t)
continue
}
tasksWG.Done()
}()
}
var outputWG sync.WaitGroup
outputWG.Add(1)
go func() {
for o := range output {
fmt.Println(o)
}
outputWG.Done()
}()
go func() {
// because of what was mentioned in the previous comment, the tasks wait group finishes almost immediately which then closes the output channel almost immediately as well which ends ranging over output early
tasksWG.Wait()
close(output)
}()
f, err := os.Open(os.Args[1])
if err != nil {
log.Panic(err)
}
s := bufio.NewScanner(f)
for s.Scan() {
tasks <- s.Text()
}
close(tasks)
// and finally the output wait group finishes almost immediately as well because tasks gets closed right away due to my improper use of goroutines
outputWG.Wait()
}
func process(t string) string {
time.Sleep(3 * time.Second)
return t
}
I've indicated in the comments where I've implemented things wrong. Now these comments make sense to me. The funny thing is that this code does indeed seem to run asynchronously and dramatically speeds up execution. I want to understand what I've done wrong, but it's hard to wrap my head around it when the code seems to execute in an asynchronous way. I'd love to understand this better.
Your main goroutine is doing a couple of things sequentially and others concurrently, so I think your order of execution is off:
f, err := os.Open(os.Args[1])
if err != nil {
log.Panic(err)
}
s := bufio.NewScanner(f)
for s.Scan() {
tasks <- s.Text()
}
Shouldn't you move this up top, so that values are sent to tasks first,
and only THEN run the concurrency loop that starts the 50 goroutines ranging over tasks? (You want to have something in tasks before calling code that ranges over it.)
go func() {
// because of what was mentioned in the previous comment, the tasks wait group finishes almost immediately which then closes the output channel almost immediately as well which ends ranging over output early
tasksWG.Wait()
close(output)
}()
The logic here is confusing me, you're spawning a goroutine to wait on the waitgroup, so here the wait is nonblocking on the main goroutine - is that what you want to do? It won't wait for tasksWG to be decremented to zero inside main, it'll do that inside the goroutine that you've created. I don't believe you want to do that?
It might be easier to debug if you could give more details on the expected output?
I'm writing a program which opens a named pipe for reading, and then processes any lines written to this pipe:
err = syscall.Mkfifo("/tmp/myfifo", 0666)
if err != nil {
panic(err)
}
pipe, err := os.OpenFile("/tmp/myfifo", os.O_RDONLY, os.ModeNamedPipe)
if err != nil {
panic(err)
}
reader := bufio.NewReader(pipe)
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
line := scanner.Text()
process(line)
}
This works fine as long as the writing process does not restart or for other reasons send an EOF. When this happens, the loop terminates (as expected from the specifications of Scanner).
However, I want to keep the pipe open to accept further writes. I could just reinitialize the scanner of course, but I believe this would create a race condition where the scanner might not be ready while a new process has begun writing to the pipe.
Are there any other options? Do I need to work directly with the File type instead?
From the bufio GoDoc:
Scan ... returns false when the scan stops, either by reaching the end of the input or an error.
So you could leave the file open, read until EOF, and then scan again when more data arrives or at a regular interval (e.g. from a goroutine). A Scanner that has reached EOF keeps returning false, so this needs a new Scanner over the same pipe. Make sure the pipe variable doesn't go out of scope, so you can keep reading from it.
If I understand your concern about a race condition correctly, this shouldn't be an issue (unless write and read operations must be synchronized): as long as the read end stays open, data written while no Scanner is active stays in the pipe until it is read. Note that re-initializing a scanner over a reopened regular file would start again at the beginning of the file; a pipe has no such position.
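A minimal sketch of the "scan again after EOF" idea. The burstReader type is an invented stand-in for a FIFO whose writers come and go, and scanRounds is a hypothetical helper. The sketch creates a fresh Scanner over the same reader after each EOF:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// burstReader stands in for a FIFO: each burst of data is followed by one
// io.EOF, the way a pipe reports EOF when its last writer closes.
type burstReader struct {
	bursts []string
	cur    *strings.Reader
}

func (b *burstReader) Read(p []byte) (int, error) {
	if b.cur == nil {
		if len(b.bursts) == 0 {
			return 0, io.EOF
		}
		b.cur = strings.NewReader(b.bursts[0])
		b.bursts = b.bursts[1:]
	}
	n, err := b.cur.Read(p)
	if err == io.EOF {
		b.cur = nil // the next Read starts the next burst
	}
	return n, err
}

// scanRounds (hypothetical helper) reads lines from r, creating a new
// Scanner after each EOF while keeping the same underlying reader.
func scanRounds(r io.Reader, rounds int, handle func(string)) error {
	for i := 0; i < rounds; i++ {
		sc := bufio.NewScanner(r)
		for sc.Scan() {
			handle(sc.Text())
		}
		if err := sc.Err(); err != nil {
			return err
		}
		// With a real FIFO, a long-running service would loop forever here
		// and pause briefly before retrying instead of counting rounds.
	}
	return nil
}

func main() {
	r := &burstReader{bursts: []string{"first writer\n", "second writer\n"}}
	err := scanRounds(r, 2, func(line string) { fmt.Println("got:", line) })
	if err != nil {
		fmt.Println("scan error:", err)
	}
}
```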
For some reason, once I started adding strings through a channel in my goroutine, the code stalls when I run it. I thought that it was a scope/closure issue so I moved all code directly into the function to no avail. I have looked through Golang's documentation and all examples look similar to mine so I am kind of clueless as to what is going wrong.
func getPage(url string, c chan<- string, swg sizedwaitgroup.SizedWaitGroup) {
defer swg.Done()
doc, err := goquery.NewDocument(url)
if err != nil{
fmt.Println(err)
}
nodes := doc.Find(".v-card .info")
for i := range nodes.Nodes {
el := nodes.Eq(i)
var name string
if el.Find("h3.n span").Size() != 0{
name = el.Find("h3.n span").Text()
}else if el.Find("h3.n").Size() != 0{
name = el.Find("h3.n").Text()
}
address := el.Find(".adr").Text()
phoneNumber := el.Find(".phone.primary").Text()
website, _ := el.Find(".track-visit-website").Attr("href")
//c <- map[string] string{"name":name,"address":address,"Phone Number": phoneNumber,"website": website,};
c <- fmt.Sprintf("%s%s%s%s",name,address,phoneNumber,website)
fmt.Println([]string{name,address,phoneNumber,website,})
}
}
func getNumPages(url string) int{
doc, err := goquery.NewDocument(url)
if err != nil{
fmt.Println(err);
}
pagination := strings.Split(doc.Find(".pagination p").Contents().Eq(1).Text()," ")
numItems, _ := strconv.Atoi(pagination[len(pagination)-1])
return int(math.Ceil(float64(numItems)/30))
}
func main() {
arrChan := make(chan string)
swg := sizedwaitgroup.New(8)
zips := []string{"78705","78710","78715"}
for _, item := range zips{
swg.Add()
go getPage(fmt.Sprintf(base_url,item,1),arrChan,swg)
}
swg.Wait()
}
Edit:
So I fixed it by passing the sizedwaitgroup by reference, but when I remove the buffer it doesn't work. Does that mean that I need to know in advance how many elements will be sent to the channel?
Issue
Building off of Colin Stewart's answer, from the code you have posted, as far as I can tell, your issue is actually with reading your arrChan. You write into it, but there's no place where you read from it in your code.
From the documentation:
If the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value has been copied to the buffer; if the buffer is full, this means waiting until some receiver has retrieved a value.
By making the channel buffered, what's happening is your code is no longer blocking on the channel write operations, the line that looks like:
c <- fmt.Sprintf("%s%s%s%s",name,address,phoneNumber,website)
My guess is that if you're still hanging when the channel has a size of 5000, it's because more than 5000 values are returned across all of your loops over nodes.Nodes. Once your buffered channel is full, send operations block until the channel has space, just as if you were writing to an unbuffered channel.
Fix
Here's a minimal example showing how you would fix something like this (basically, just add a reader):
package main
import "sync"
func getPage(item string, c chan<- string) {
c <- item
}
func readChannel(c <-chan string) {
for {
<-c
}
}
func main() {
arrChan := make(chan string)
wg := sync.WaitGroup{}
zips := []string{"78705", "78710", "78715"}
for _, item := range zips {
wg.Add(1)
go func(item string) { // pass item explicitly so each goroutine gets its own copy
defer wg.Done()
getPage(item, arrChan)
}(item)
}
go readChannel(arrChan) // comment this out and you'll deadlock
wg.Wait()
}
Your channel has no buffer, so writes will block until the value can be read, and at least in the code you have posted, there are no readers.
You don't need to know the size to make it work, but you might need it in order to exit cleanly. This can be a bit tricky to observe at times, because your program exits as soon as your main function returns, and all goroutines still running are killed immediately, finished or not.
As a warm up example, change readChannel in photoionized's response to this:
func readChannel(c <-chan string) {
for {
url := <-c
fmt.Println(url)
}
}
It only adds printing to the original code, but now you can see better what is actually happening. Notice how it usually prints only two strings when the code actually writes three. This is because the program exits once all writing goroutines finish, and the reading goroutine is aborted midway as a result. You can "fix" it by removing "go" before readChannel (which is the same as reading the channel in the main function). Then you'll see three strings printed, but the program crashes with a deadlock, as readChannel is still reading from the channel but nobody writes into it anymore. You can fix that too by reading exactly three strings in readChannel(), but that requires knowing how many strings you expect to receive.
Here is my minimal working example (I'll use it to illustrate the rest):
package main
import (
"fmt"
"sync"
)
func getPage(url string, c chan<- string, wg *sync.WaitGroup) {
defer wg.Done()
c <- fmt.Sprintf("Got page for %s\n",url)
}
func readChannel(c chan string, wg *sync.WaitGroup) {
defer wg.Done()
var url string
ok := true
for ok {
url, ok = <- c
if ok {
fmt.Printf("Received: %s\n", url)
} else {
fmt.Println("Exiting readChannel")
}
}
}
func main() {
arrChan := make(chan string)
var swg sync.WaitGroup
base_url := "http://test/%s/%d"
zips := []string{"78705","78710","78715"}
for _, item := range zips{
swg.Add(1)
go getPage(fmt.Sprintf(base_url,item,1),arrChan,&swg)
}
var wg2 sync.WaitGroup
wg2.Add(1)
go readChannel(arrChan, &wg2)
swg.Wait()
// All written, signal end to readChannel by closing the channel
close(arrChan)
wg2.Wait()
}
Here I close the channel to signal to readChannel that there is nothing left to read, so it can exit cleanly at the proper time. But sometimes you might instead want to tell readChannel to read exactly 3 strings and finish. Or maybe you would want to start one reader for each writer, with each reader reading exactly one string... Well, there are many ways to skin a cat and the choice is all yours.
Note, if you remove the wg2.Wait() line, your code becomes equivalent to photoionized's response and will only print two strings whilst writing 3. This is because the code exits once all writers finish (ensured by swg.Wait()), but it does not wait for readChannel to finish.
If you remove the close(arrChan) line instead, your code will crash with a deadlock after printing 3 lines, as the code waits for readChannel to finish, but readChannel waits to read from a channel which nobody is writing to anymore.
If you just remove "go" before the readChannel call, it becomes equivalent to reading from the channel inside the main function. It will again crash with a deadlock after printing 3 strings, because readChannel is still reading when all writers have already finished (and readChannel has already read everything they have written). A tricky point here is that the swg.Wait() line will never be reached by this code, as readChannel never exits.
If you move the readChannel call after the swg.Wait(), then the code will crash before even printing a single string. But this is a different deadlock. This time the code reaches swg.Wait() and stops there waiting for the writers. But the channel is not buffered, so each writer blocks until someone receives its value. The trouble is that nobody reads from the channel yet, as readChannel has not been called yet. So it stalls and crashes with a deadlock. This particular issue can be "fixed" by making the channel buffered, as in make(chan string, 3), as that will allow writers to keep writing even though nobody is reading from that channel yet. And sometimes this is what you want. But here again you have to know the maximum number of messages that will ever be in the channel buffer. And most of the time it's only deferring a problem: just add one more writer and you are back where you started. The code stalls and crashes, as the channel buffer is full and that one extra writer is waiting for someone to read from the buffer.
Well, this should cover all bases. So, check your code and see which case is yours.
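As a small illustration of the buffered variant described above: a sketch in which the buffer has exactly one slot per writer, so every writer can finish before anything is read. The helper name collect is an assumption for illustration. It relies on knowing the number of messages in advance, which is exactly the limitation mentioned:

```go
package main

import (
	"fmt"
	"sync"
)

// collect (hypothetical helper) starts one writer per item, waits for all
// writers, and only then drains the channel.
func collect(items []string) []string {
	c := make(chan string, len(items)) // one buffer slot per writer
	var wg sync.WaitGroup
	for _, item := range items {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			c <- "Got page for " + s // does not block: the buffer has room
		}(item)
	}

	wg.Wait() // safe before reading only because the buffer fits everything
	close(c)  // lets the range below terminate

	var got []string
	for s := range c {
		got = append(got, s)
	}
	return got
}

func main() {
	for _, s := range collect([]string{"78705", "78710", "78715"}) {
		fmt.Println(s)
	}
}
```

With a buffer smaller than the number of writers, the wg.Wait() call would stall again, which is the deferred problem described in the answer.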
I'm writing a service that has to stream the output of an executed command both to the parent and to a log. When the process runs for a long time, the problem is that cmd.StdoutPipe gives me only a final (string) result.
Is it possible to get partial output of what is going on, like in a shell?
func main() {
cmd := exec.Command("sh", "-c", "some long running task")
stdout, _ := cmd.StdoutPipe()
cmd.Start()
scanner := bufio.NewScanner(stdout)
for scanner.Scan() {
m := scanner.Text()
fmt.Println(m)
log.Print(m) // Print, not Printf: m is data, not a format string
}
cmd.Wait()
}
P.S. To just print the output, this would be enough:
cmd.Stdout = os.Stdout
But in my case it is not enough.
The code you posted works (with a reasonable command executed).
Here is a simple "some long running task" written in Go for you to call and test your code:
func main() {
fmt.Println("Child started.")
time.Sleep(time.Second*2)
fmt.Println("Tick...")
time.Sleep(time.Second*2)
fmt.Println("Child ended.")
}
Compile it and call it as your command. You will see the different lines appear immediately as written by the child process, "streamed".
Reasons why it may not work for you
The Scanner returned by bufio.NewScanner() reads whole lines and only returns something if a newline character is encountered (as defined by the bufio.ScanLines() function).
If the command you execute doesn't print newline characters, its output won't be returned immediately (only when newline character is printed, internal buffer is filled or the process ends).
Possible workarounds
If you have no guarantee that the child process prints newline characters but you still want to stream the output, you can't read whole lines. One solution is to read by words, or even read by characters (runes). You can achieve this by setting a different split function using the Scanner.Split() method:
scanner := bufio.NewScanner(stdout)
scanner.Split(bufio.ScanRunes)
The bufio.ScanRunes function reads the input by runes so Scanner.Scan() will return whenever a new rune is available.
Or reading manually without a Scanner (in this example byte-by-byte):
oneByte := make([]byte, 1)
for {
_, err := stdout.Read(oneByte)
if err != nil {
break
}
fmt.Printf("%c", oneByte[0])
}
Note that the above code would incorrectly print runes that take multiple bytes in UTF-8 encoding. To handle multi-byte UTF-8 runes, we need a bigger buffer:
oneRune := make([]byte, utf8.UTFMax)
for {
count, err := stdout.Read(oneRune)
if err != nil {
break
}
fmt.Printf("%s", oneRune[:count])
}
Things to keep in mind
Processes have default buffers for standard output and for standard error (usually the size of a few KB). If a process writes to the standard output or standard error, it goes into the respective buffer. If this buffer gets full, further writes will block (in the child process). If you don't read the standard output and standard error of a child process, your child process may hang if the buffer is full.
So it is recommended to always read both the standard output and error of a child process. Even if you know that the command doesn't normally write to its standard error, if some error occurs, it will probably start dumping error messages to its standard error.
Edit: As Dave C mentions, by default the standard output and error streams of the child process are discarded and will not cause a block / hang if not read. But still, by not reading the error stream you might miss a thing or two from the process.
I found good examples of how to implement progress output in this article by Krzysztof Kowalczyk.