How to set timeout in *os.File/io.Read in golang - go

I know there is a function called SetReadDeadline that can set a timeout in socket(conn.net) reading, while io.Read not. There is a way that starts another routine as a timer to solve this problem, but it brings another problem that the reader routine(io.Read) still block:
func (self *TimeoutReader) Read(buf []byte) (n int, err error) {
ch := make(chan bool)
n = 0
err = nil
go func() { // this goroutime still exist even when timeout
n, err = self.reader.Read(buf)
ch <- true
}()
select {
case <-ch:
return
case <-time.After(self.timeout):
return 0, errors.New("Timeout")
}
return
}
This question is similar in this post, but the answer is unclear.
Do you guys have any good idea to solve this problem?

Instead of setting a timeout directly on the read, you can close the os.File after a timeout. As written in https://golang.org/pkg/os/#File.Close
Close closes the File, rendering it unusable for I/O. On files that support SetDeadline, any pending I/O operations will be canceled and return immediately with an error.
This should cause your read to fail immediately.

Your mistake here is something different:
When you read from the reader you just read one time and that is wrong:
go func() {
n, err = self.reader.Read(buf) // this Read needs to be in a loop
ch <- true
}()
Here is a simple example (https://play.golang.org/p/2AnhrbrhLrv)
buf := bytes.NewBufferString("0123456789")
r := make([]byte, 3)
n, err := buf.Read(r)
fmt.Println(string(r), n, err)
// Output: 012 3 <nil>
The size of the given slice is used when using the io.Reader. If you would log the n variable in your code you would see, that not the whole file is read. The select statement outside of your goroutine is at the wrong place.
go func() {
a := make([]byte, 1024)
for {
select {
case <-quit:
result <- []byte{}
return
default:
_, err = self.reader.Read(buf)
if err == io.EOF {
result <- a
return
}
}
}
}()
But there is something more! You want to implement the io.Reader interface. After the Read() method is called until the file ends you should not start a goroutine in here, because you just read chunks of the file.
Also the timeout inside the Read() method doesn't help, because that timeout works for each call and not for the whole file.

In addition to #apxp's point about looping over Read, you could use a buffer size of 1 byte so that you never block as long is there is data to read.
When interacting with external resources anything can happen. It is possible for any given io.Reader implementation to simply block forever. Here, I'll write one for you...
type BlockingReader struct{}
func (BlockingReader) Read(b []byte) (int, error) {
<-make(chan struct{})
return 0, nil
}
Remember anyone can implement an interface, so you can't make any assumptions that it will behave like *os.File or any other standard library io.Reader. In addition to asinine coding like mine above, an io.Reader could legitimately connect to a resources that can block forever.
You cannot kill gorountines, so if an io.Reader truly blocks forever the blocked goroutine will continue to consume resources until your application terminates. However, this shouldn't be a problem, a blocked goroutine does not consume much in the way of resources, and should be fine as long as you don't blindly retry blocked Reads by spawning more gorountines.

Related

how to set timeout on bufio.ReadBytes()

I am writing a code, which reads data from network with timeouts. Initially I wrote this (much simplified skipped error checks too)
func read_msg(conn net.Conn, ch chan []byte) {
b := bufio.NewReader(conn)
msg,_ := b.ReadBytes(byte('\n'))
ch <- msg
close(ch)
}
func main() {
ln,_ := net.Listen("tcp", ":12345")
conn,_ := ln.Accept()
for {
ch := make(chan []byte)
go read_msg(conn,ch)
select {
case msg := <-ch:
fmt.Println("message received: ", msg)
case <-time.After(time.Duration(1)*time.Second):
fmt.Println("timeout")
}
}
}
When the timeout happens, the goroutine stays active, waiting on the bufio.ReadBytes(). Is there anyway to set a timeout on the bufio itself?
No, bufio does not have this feature (you can read the docs here).
The more appropriate solution is to use the SetReadDeadline method of the net.Conn value.
// A deadline is an absolute time after which I/O operations
// fail instead of blocking. The deadline applies to all future
// and pending I/O, not just the immediately following call to
// Read or Write.
...
// SetReadDeadline sets the deadline for future Read calls
// and any currently-blocked Read call.
// A zero value for t means Read will not time out.
For example:
conn.SetReadDeadline(time.Now().Add(time.Second))

Data race issues while reading data from file and sending it simultaneously

I am trying to read data from a file and send it immediately the chunk read without waiting the other goroutine to complete file read. I have two function
func ReadFile(stream chan []byte, stop chan bool) {
file.Lock()
defer file.Unlock()
dir, _ := os.Getwd()
file, _ := os.Open(dir + "/someFile.json")
chunk := make([]byte, 512*4)
for {
bytesRead, err := file.Read(chunk)
if err == io.EOF {
break
}
if err != nil {
panic(err)
break
}
stream <- chunk[:bytesRead]
}
stop <- true
}
First one is responsible for reading the file and sending data chunks to the "stream" channel received from the other function
func SteamFile() {
stream := make(chan []byte)
stop := make(chan bool)
go ReadFile(stream, stop)
L:
for {
select {
case data := <-stream:
//want to send data chunk by socket here
fmt.Println(data)
case <-stop:
break L
}
}
}
Second function creates 2 channel, sends them to the first function and starts to listen to the channels.
The problem is that sometimes data := <-stream is missing. I do not receive for example the first part of the file but receive all the others. When I run the program with -race flag it tells that there is a data race and two goroutines write to and read from the channel simultaneously. To tell the truth I thought that's the normal way to use channels but now I see that it brings to even more issues.
Can someone please help me to solve this issue?
When I run the program with -race flag it tells that there is a data race and two goroutines write to and read from the channel simultaneously. To tell the truth I thought that's the normal way do use channels.
It is, and you are almost certainly misunderstanding the output of the race detector.
You're sharing the slice in ReadFile (and therefore it's underlying array), so the data is overwritten in ReadFile while it's being read in SteamFile [sic]. While this shouldn't trigger the race detector it is definitely a bug. Create a new slice for each Read call:
func ReadFile(stream chan []byte, stop chan bool) {
file.Lock()
defer file.Unlock()
dir, _ := os.Getwd()
file, _ := os.Open(dir + "/someFile.json")
- chunk := make([]byte, 512*4)
for {
+ chunk := make([]byte, 512*4)
bytesRead, err := file.Read(chunk)

Synchronizing two goroutines with a size 1 channel

In an effort to learn golang, I was looking through the go source for reverseproxy:
https://golang.org/src/net/http/httputil/reverseproxy.go
I found this block of code (truncated):
...
errc := make(chan error, 1)
spc := switchProtocolCopier{user: conn, backend: backConn}
go spc.copyToBackend(errc)
go spc.copyFromBackend(errc)
<-errc
return
}
// switchProtocolCopier exists so goroutines proxying data back and
// forth have nice names in stacks.
type switchProtocolCopier struct {
user, backend io.ReadWriter
}
func (c switchProtocolCopier) copyFromBackend(errc chan<- error) {
_, err := io.Copy(c.user, c.backend)
errc <- err
}
func (c switchProtocolCopier) copyToBackend(errc chan<- error) {
_, err := io.Copy(c.backend, c.user)
errc <- err
}
The portion that caught my attention was the creation of the errc buffered channel. I thought (probably naively) that we would use an unbuffered channel and the later receive from errc would need to run twice, like this:
<-errc
<-errc
As written, I understand that reading from the channel will ensure at least one of the copy methods has run. I also understand that the first send to the channel will not block, while the second will block only if the first one has not yet been received.
What I don't understand, is why it is written like this. Is it to ensure that only one of the methods completes? If that is the case, couldn't they technically both run?
Thanks!
The channel of size one helps realize a binary semaphore.
Since at most one value is consumed from the channel (on line 549), changing the size of the channel to be greater than one will not affect the currently exhibited behavior, which is wait until at least one of the two go routines complete executing the Copy operation.

How can I stop a goroutine that is reading from UDP?

I have a go program that uses a goroutine to read UDP packets.
I wanted to use a select clause and a "stopping" channel to close the goroutine to shut down as soon as it is not needed anymore.
Here is a simple code example for the goroutine:
func Run(c chan string, q chan bool, conn *net.UDPConn) {
defer close(c)
buf := make([]byte, 1024)
for {
select {
case <- q:
return
default:
n, _, err := conn.ReadFromUDP(buf)
c <- string(buf[0:n])
fmt.Println("Received ", string(buf[0:n]))
if err != nil {
fmt.Println("Error: ", err)
}
}
}
}
The connection is created as:
conn, err := net.ListenUDP("udp",addr.Addr)
And the goroutine is supposed to terminate using:
close(q)
After closing the "stopping" channel ("q") the goroutine does not immediately stop. I need to send one more string via the UDP connection. When doing so the goroutine stops.
I simply do not understand this behaviour and I would be grateful if somebody could enlighten me.
Thank you in advance!
Your program is likely stopped at this line when you close the channel:
n, _, err := conn.ReadFromUDP(buf)
Because execution is blocked at a ReadFrom method, the select statement is not being evaluated, therefore the close on channel q is not immediately detected. When you do another send on the UDP connection, ReadFrom unblocks and (once that loop iteration finishes) control moves to the select statement: at that point the close on q is detected.
You can close the connection to unblock ReadFrom, as was suggested in a comment. See the PacketConn documentation in the net package, especially "Any blocked ReadFrom or WriteTo operations will be unblocked and return errors":
// Close closes the connection.
// Any blocked ReadFrom or WriteTo operations will be unblocked and return errors.
Close() error
Depending on your needs a timeout might be an option as well, see again PacketConn documentation in the net package:
// ReadFrom reads a packet from the connection,
// copying the payload into b. It returns the number of
// bytes copied into b and the return address that
// was on the packet.
// ReadFrom can be made to time out and return
// an Error with Timeout() == true after a fixed time limit;
// see SetDeadline and SetReadDeadline.
ReadFrom(b []byte) (n int, addr Addr, err error)

Why is my code causing a stall or race condition?

For some reason, once I started adding strings through a channel in my goroutine, the code stalls when I run it. I thought that it was a scope/closure issue so I moved all code directly into the function to no avail. I have looked through Golang's documentation and all examples look similar to mine so I am kind of clueless as to what is going wrong.
func getPage(url string, c chan<- string, swg sizedwaitgroup.SizedWaitGroup) {
defer swg.Done()
doc, err := goquery.NewDocument(url)
if err != nil{
fmt.Println(err)
}
nodes := doc.Find(".v-card .info")
for i := range nodes.Nodes {
el := nodes.Eq(i)
var name string
if el.Find("h3.n span").Size() != 0{
name = el.Find("h3.n span").Text()
}else if el.Find("h3.n").Size() != 0{
name = el.Find("h3.n").Text()
}
address := el.Find(".adr").Text()
phoneNumber := el.Find(".phone.primary").Text()
website, _ := el.Find(".track-visit-website").Attr("href")
//c <- map[string] string{"name":name,"address":address,"Phone Number": phoneNumber,"website": website,};
c <- fmt.Sprint("%s%s%s%s",name,address,phoneNumber,website)
fmt.Println([]string{name,address,phoneNumber,website,})
}
}
func getNumPages(url string) int{
doc, err := goquery.NewDocument(url)
if err != nil{
fmt.Println(err);
}
pagination := strings.Split(doc.Find(".pagination p").Contents().Eq(1).Text()," ")
numItems, _ := strconv.Atoi(pagination[len(pagination)-1])
return int(math.Ceil(float64(numItems)/30))
}
func main() {
arrChan := make(chan string)
swg := sizedwaitgroup.New(8)
zips := []string{"78705","78710","78715"}
for _, item := range zips{
swg.Add()
go getPage(fmt.Sprintf(base_url,item,1),arrChan,swg)
}
swg.Wait()
}
Edit:
so I fixed it by passing sizedwaitgroup as a reference but when I remove the buffer it doesn't work does that mean that I need to know how many elements will be sent to the channel in advance?
Issue
Building off of Colin Stewart's answer, from the code you have posted, as far as I can tell, your issue is actually with reading your arrChan. You write into it, but there's no place where you read from it in your code.
From the documentation :
If the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value
has been copied to the buffer; if the buffer is full, this means
waiting until some receiver has retrieved a value.
By making the channel buffered, what's happening is your code is no longer blocking on the channel write operations, the line that looks like:
c <- fmt.Sprint("%s%s%s%s",name,address,phoneNumber,website)
My guess is that if you're still hanging at when the channel has a size of 5000, it's because you have more than 5000 values returned across all of your loops over node.Nodes. Once your buffered channel is full, the operations block until the channel has space, just like if you were writing to an unbuffered channel.
Fix
Here's a minimal example showing you how you would fix something like this (basically just add a reader)
package main
import "sync"
func getPage(item string, c chan<- string) {
c <- item
}
func readChannel(c <-chan string) {
for {
<-c
}
}
func main() {
arrChan := make(chan string)
wg := sync.WaitGroup{}
zips := []string{"78705", "78710", "78715"}
for _, item := range zips {
wg.Add(1)
go func() {
defer wg.Done()
getPage(item, arrChan)
}()
}
go readChannel(arrChan) // comment this out and you'll deadlock
wg.Wait()
}
Your channel has no buffer, so writes will block until the value can be read, and at least in the code you have posted, there are no readers.
You don't need to know size to make it work. But you might in order to exit cleanly. Which can be a bit tricky to observe at time because your program will exit once your main function exits and all goroutines still running are killed immediately finished or not.
As a warm up example, change readChannel in photoionized's response to this:
func readChannel(c <-chan string) {
for {
url := <-c
fmt.Println (url)
}
}
It only adds printing to the original code. But now you'll see better what is actually happening. Notice how it usually only prints two strings when code actually writes 3. This is because code exits once all writing goroutines finish, but reading goroutine is aborted mid way as result. You can "fix" it by removing "go" before readChannel (which would be same as reading the channel in main function). And then you'll see 3 strings printed, but program crashes with a dead lock as readChannel is still reading from the channel, but nobody writes into it anymore. You can fix that too by reading exactly 3 strings in readChannel(), but that requires knowing how many strings you expect to receive.
Here is my minimal working example (I'll use it to illustrate the rest):
package main
import (
"fmt"
"sync"
)
func getPage(url string, c chan<- string, wg *sync.WaitGroup) {
defer wg.Done()
c <- fmt.Sprintf("Got page for %s\n",url)
}
func readChannel(c chan string, wg *sync.WaitGroup) {
defer wg.Done()
var url string
ok := true
for ok {
url, ok = <- c
if ok {
fmt.Printf("Received: %s\n", url)
} else {
fmt.Println("Exiting readChannel")
}
}
}
func main() {
arrChan := make(chan string)
var swg sync.WaitGroup
base_url := "http://test/%s/%d"
zips := []string{"78705","78710","78715"}
for _, item := range zips{
swg.Add(1)
go getPage(fmt.Sprintf(base_url,item,1),arrChan,&swg)
}
var wg2 sync.WaitGroup
wg2.Add(1)
go readChannel(arrChan, &wg2)
swg.Wait()
// All written, signal end to readChannel by closing the channel
close(arrChan)
wg2.Wait()
}
Here I close the channel to signal to readChannel that there is nothing left to read, so it can exit cleanly at proper time. But sometimes you might want instead to tell readChannel to read exactly 3 strings and finish. Or may be you would want to start one reader for each writer and each reader will read exactly one string... Well, there are many ways to skin a cat and choice is all yours.
Note, if you remove wg2.Wait() line your code becomes equivalent to photoionized's response and will only print two strings whilst writing 3. This is because code exits once all writers finish (ensured by swg.Wait()), but it does not wait for readChannel to finish.
If you remove close(arrChan) line instead, your code will crash with a deadlock after printing 3 lines as code waits for readChannel to finish, but readChannel waits to read from a channel which nobody is writing to anymore.
If you just remove "go" before the readChannel call, it becomes equivalent of reading from channel inside main function. It will again crash with a dead lock after printing 3 strings because readChannel is still reading when all writers have already finished (and readChannel has already read all they written). A tricky point here is that swg.Wait() line will never be reached by this code as readChannel never exits.
If you move readChannel call after the swg.Wait() then code will crash before even printing a single string. But this is a different dead lock. This time code reaches swg.Wait() and stops there waiting for writers. First writer succeeds, but channel is not buffered, so next writer blocks until someone reads from the channel the data already written. Trouble is - nobody reads from the channel yet as readChannel has not been called yet. So, it stalls and crashes with a dead lock. This particular issue can be "fixed", but making channel buffered as in make(chan string, 3) as that will allow writers to keep writing even though nobody is reading from that channel yet. And sometimes this is what you want. But here again you have to know the maximum of messages to ever be in the channel buffer. And most of the time it's only deferring a problem - just add one more writer and you are where you started - code stalls and crashes as channel buffer is full and that one extra writer is waiting for someone to read from the buffer.
Well, this should covers all bases. So, check your code and see which case is yours.

Resources