How can I check in a loop whether the connection is available for Read or Write? If the conn is closed or unavailable, we should stop the loop.
For example:
package main

import (
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:1111")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	for {
		buf := make([]byte, 1)
		_, err := conn.Read(buf)
		if err != nil {
			// currently we can only stop the loop
			// when an error occurs
			log.Fatal(err)
		}
	}
}
You can get a number of errors, depending on how the connection was closed. The only error that you can count on receiving from a Read is io.EOF, which is the value used to indicate that a connection was closed normally.
Other errors can be checked against the net.Error interface for its Timeout and Temporary methods. These are usually of the type net.OpError. Any non-temporary error returned from a Write is fatal, as it indicates the write couldn't succeed, but note that due to the underlying network API, writes returning no error still aren't guaranteed to have succeeded.
In general you can just follow the io.Reader API:
When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. An instance of this general case is that a Reader returning a non-zero number of bytes at the end of the input stream may return either err == EOF or err == nil. The next Read should return 0, EOF.
If there was data read, you handle that first. After you handle the data, you can break from the loop on any error. If it was io.EOF, the connection is closed normally, and any other errors you can handle as you see fit.
If you are using Go 1.16 or newer and only working with the standard libraries (not some arbitrary networking stack), you can use a function like this to check the closed errors and handle them differently than other errors:
func isNetConnClosedErr(err error) bool {
	switch {
	case errors.Is(err, net.ErrClosed),
		errors.Is(err, io.EOF),
		errors.Is(err, syscall.EPIPE):
		return true
	default:
		return false
	}
}
Note that there is an os.ErrClosed error (an alias for fs.ErrClosed) that you could add if you were dealing with files, but you don't need it when only using the net package. While your code only shows a client, there is probably a server side doing listener.Accept() that gets closed by a different goroutine, and you don't want to log nasty errors when that happens, so you want net.ErrClosed in the list above. As for syscall.EPIPE, it comes in handy when writing to a remote end that has already closed the connection. Remember that with network connections, you may not get the EPIPE on the first write, as the OS may need to send some data to discover that the remote end was closed.
Related
According to the Go documentation, the net.Conn.Write() function will return an error if it could not send all bytes in the given slice.
How do I know which type of error is returned in this case? Should I just check if there is an error, and if n > 0 and n < len(p) ? Or is it enough to just check if n < len(p) alone? Are there unrecoverable errors for which n < len(p) ?
Say I want to send X gigabytes of data for example. What would a "correct" Write() loop look like?
If the output buffers are full, will Write() simply block until it has sent everything from p, making a check for n < len(p) superfluous?
Well, net.Conn is an interface. This means that it is completely up to the implementation to determine which errors to send back. A TCP connection and UNIX socket connection can have very different reasons why a write can't be fully completed.
The net.Conn.Write signature is exactly the same as the io.Writer signature. Which means that every implementation of net.Conn also implements io.Writer. So you can use any existing method like io.Copy or io.CopyN to write data to a connection.
How do I know which type of error is returned in this case? Should I just check if there is an error, and if n > 0 and n < len(p) ?
Use n < len(p) to detect the case where the write stopped early. The error is not nil in this case.
Are there unrecoverable errors for which n < len(p) ?
Yes. A network connection can fail after some data is written.
There are also recoverable errors. If write fails because the write deadline is exceeded, a later write with a new deadline can succeed.
Say I want to send X gigabytes of data for example. What would a "correct" Write() loop look like?
If you are asking how to write a []byte to a connection, then a loop is not needed in most cases. The Write call blocks until the data is written or an error occurs. Use this code:
_, err := c.Write(p)
If you are asking how to copy an io.Reader to a network connection, then use _, err := io.Copy(conn, r).
Write is different from Read. Read can return before filling the buffer.
If the output buffers are full, will Write() simply just block until it has sent everything from p?
Write blocks until all data is sent or the write fails with an error (deadline exceeded, network failure, network connection closed in other goroutine, ...).
net.Conn.Write() implements the io.Writer interface, which has the following contract regarding errors:
Write must return a non-nil error if it returns n < len(p).
There is no single correct write loop. For certain cases, it might be important to know how much data was written. However, for network connections, based on this contract, the following should work:
var err error
for data, done := getNextSegment(); !done && err == nil; data, done = getNextSegment() {
	_, err = conn.Write(data)
}
To keep the total number of bytes written:
var (
	err error
	n   int
)
written := 0
for data, done := getNextSegment(); !done && err == nil; data, done = getNextSegment() {
	n, err = conn.Write(data)
	written += n
}
So, I went digging into the Go source code itself. I tracked the Write() call to a file named src/internal/poll/fd_unix.go.
// Write implements io.Writer.
func (fd *FD) Write(p []byte) (int, error) {
	if err := fd.writeLock(); err != nil {
		return 0, err
	}
	defer fd.writeUnlock()
	if err := fd.pd.prepareWrite(fd.isFile); err != nil {
		return 0, err
	}
	var nn int
	for {
		max := len(p)
		if fd.IsStream && max-nn > maxRW {
			max = nn + maxRW
		}
		n, err := ignoringEINTRIO(syscall.Write, fd.Sysfd, p[nn:max])
		if n > 0 {
			nn += n
		}
		if nn == len(p) {
			return nn, err
		}
		if err == syscall.EAGAIN && fd.pd.pollable() {
			if err = fd.pd.waitWrite(fd.isFile); err == nil {
				continue
			}
		}
		if err != nil {
			return nn, err
		}
		if n == 0 {
			return nn, io.ErrUnexpectedEOF
		}
	}
}
This loop already handles partial writes by retrying until the whole slice is sent. So Write() does actually guarantee that either everything is sent, or a fatal, unrecoverable error occurs.
It seems to me that there is no need at all to care about the value of n, other than for logging purposes. If an error ever occurs, it is severe enough that there is no reason to try to resend the remaining len(p)-n bytes.
Bit of a newb to both Go and gRPC, so bear with me.
Using go version go1.14.4 windows/amd64, proto3, and the latest grpc (1.31, I think). I'm trying to set up a bidi streaming connection that will likely be open for long periods of time. Everything works locally, except that if I terminate the client (or one of them), it kills the server as well with the following error:
Unable to trade data rpc error: code = Canceled desc = context canceled
This error comes out of this code server side
func (s *exchangeserver) Trade(stream proto.ExchageService_TradeServer) error {
	endchan := make(chan int)
	defer close(endchan)
	go func() {
		for {
			req, err := stream.Recv()
			if err == io.EOF {
				break
			}
			if err != nil {
				log.Fatal("Unable to trade data ", err)
				break
			}
			fmt.Println("Got ", req.GetNumber())
		}
		endchan <- 1
	}()
	go func() {
		for {
			resp := &proto.WordResponse{Word: "Hello again "}
			err := stream.Send(resp)
			if err != nil {
				log.Fatal("Unable to send from server ", err)
				break
			}
			time.Sleep(time.Duration(500 * time.Millisecond))
		}
		endchan <- 1
	}()
	<-endchan
	return nil
}
And the Trade() RPC is so simple it isn't worth posting the .proto.
The error is clearly coming out of the Recv() call, but that call blocks until it sees a message, like the client disconnect, at which point I would expect it to kill the stream, not the whole process. I've tried adding a service handler with HandleConn(context, stats.ConnStats) and it does catch the disconnect before the server dies, but I can't do anything with it. I've even tried creating a global channel that the serve handler pushes a value into when HandleRPC(context, stats.RPCStats) is called and only allowing Recv() to be called when there's a value in the channel, but that can't be right, that's like blocking a blocking function for safety and it didn't work anyway.
This has to be one of those really stupid mistakes that beginners make. Of what use would gRPC be if it couldn't handle a client disconnect without dying? Yet I have read probably a trillion(ish) posts from every corner of the internet and no one else is having this issue. On the contrary, the more popular version of this question is "My client stream stays open after disconnect". I'd expect that issue. Not this one.
I'm not 100% sure how this is supposed to behave, but I note that you are starting separate receive and send goroutines at the same time. This might be valid but is not the typical approach. Instead, you would usually receive what you want to process and then start a nested loop to handle the reply.
See an example of typical bidirectional streaming implementation from here: https://grpc.io/docs/languages/go/basics/
func (s *routeGuideServer) RouteChat(stream pb.RouteGuide_RouteChatServer) error {
	for {
		in, err := stream.Recv()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		key := serialize(in.Location)
		... // look for notes to be sent to client
		for _, note := range s.routeNotes[key] {
			if err := stream.Send(note); err != nil {
				return err
			}
		}
	}
}
Sending and receiving at the same time might be valid for your use case, but if that is what you are trying to do, then I believe your handling of the channels is incorrect. Either way, please read on to understand the issue, as it is a common one in Go.
You have a single channel which only blocks until it receives a single message; once it unblocks, the function ends and the channel is closed (by the defer).
You are trying to send to this channel from both your send and receive loops.
When the last one to finish tries to send to the channel, it will already have been closed (by the first to finish) and the server will panic. Annoyingly, you won't actually see any sign of this, as the server exits before the goroutine can dump its panic (no clues, which is probably why you landed here).
see an example of the issue here (grpc code stripped out):
https://play.golang.org/p/GjfgDDAWNYr
Note: comment out the last pause in the main func to stop showing the panic reliably (as in your case)
So one simple fix would probably be to create two separate channels (one for send, one for receive) and block on both. This, however, would keep the send loop running even if you never get a chance to respond, so it is probably better to structure the handler like the example above unless you have good reason to pursue something different.
Another possibility is some sort of server/request context mix-up, but I'm pretty sure the above will fix it. Drop an update with your server setup code if you're still having issues after the above changes.
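For reference, a grpc-free sketch of the two-channel fix (the channel names and the simulated work are made up for illustration, standing in for the real Recv/Send loops): each goroutine closes its own channel when done, and the handler blocks on both, so nothing ever sends on a closed channel and neither goroutine's completion is lost.

```go
package main

import (
	"fmt"
	"time"
)

func handler() error {
	recvDone := make(chan struct{})
	sendDone := make(chan struct{})

	go func() {
		defer close(recvDone)
		// simulated receive loop (stands in for stream.Recv())
		time.Sleep(10 * time.Millisecond)
	}()

	go func() {
		defer close(sendDone)
		// simulated send loop (stands in for stream.Send())
		time.Sleep(20 * time.Millisecond)
	}()

	// Block until BOTH loops finish. Closing (rather than sending on)
	// the channels means a goroutine can never panic on a channel that
	// was closed by someone else.
	<-recvDone
	<-sendDone
	return nil
}

func main() {
	fmt.Println(handler())
}
```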
http.Serve either returns an error as soon as it is called or blocks if successfully executing.
How can I make it so that if it blocks it does so in its own goroutine? I currently have the following code:
func serveOrErr(l net.Listener, handler http.Handler) error {
	starting := make(chan struct{})
	serveErr := make(chan error)
	go func() {
		starting <- struct{}{}
		if err := http.Serve(l, handler); err != nil {
			serveErr <- err
		}
	}()
	<-starting
	select {
	case err := <-serveErr:
		return err
	default:
		return nil
	}
}
This seemed like a good start and works on my test machine, but I believe there are no guarantees that serveErr <- err would execute before case err := <-serveErr, leading to inconsistent results due to a race if http.Serve were to produce an error.
http.Serve either returns an error as soon as it is called or blocks if successfully executing
This assumption is not correct, and I believe the early-error case rarely occurs. http.Serve calls net.Listener.Accept in a loop; an error can occur at any time (socket closed, too many open file descriptors, etc.). It's http.ListenAndServe, usually used for running HTTP servers, that often fails early, while binding the listening socket (no permissions, address already in use).
In my opinion what you're trying to do is wrong, unless really your net.Listener.Accept is failing on the first call for some reason. Is it? If you want to be 100% sure your server is working, you could try to connect to it (and maybe actually transmit something), but once you successfully bound the socket I don't see it really necessary.
You could use a timeout on your select statement, e.g.
timeout := time.After(5 * time.Millisecond) // TODO: adjust the value
select {
case err := <-serveErr:
	return err
case <-timeout:
	return nil
}
This way your select will block until serveErr has a value or the specified timeout has elapsed. Note that the execution of your function will therefore block the calling goroutine for up to the duration of the specified timeout.
Rob Pike's excellent talk on go concurrency patterns might be helpful.
I'm experiencing some strange behavior. I'm trying to set up a small webapp that fetches some data using Aerospike 3.5 Community running on an Ubuntu 12.04 server. I'm using the default aerospike.conf file (using the 'test' namespace) and am following the example of how to query here.
When I attempt to query some records with a filter, the Errors channel randomly returns a nil error. (This example points to my dev database instance.)
To replicate, compile and run the following multiple times, you'll see either data returned or a panic:
package main

import (
	"fmt"

	"github.com/aerospike/aerospike-client-go"
)

func main() {
	c, err := aerospike.NewClient("52.7.157.46", 3000)
	if err != nil {
		panic(err)
	}
	recs := liststuff(c)
	fmt.Printf("got results: %v", recs)
}
func liststuff(client *aerospike.Client) []*aerospike.Record {
	// fetch some records with a filter
	stm := aerospike.NewStatement("test", "products")
	stm.Addfilter(aerospike.NewEqualFilter("visible", 1))
	fmt.Println("querying...")
	recordset, err := client.Query(nil, stm)
	if err != nil {
		panic(err)
	}
	// collect results into a slice
	recs := []*aerospike.Record{}
L:
	for {
		select {
		case rec, chanOpen := <-recordset.Records:
			if !chanOpen {
				break L
			}
			fmt.Printf("found record %v\n", rec)
			recs = append(recs, rec)
		case err := <-recordset.Errors:
			if err != nil {
				panic(err)
			} else {
				panic(fmt.Errorf("error nil when it should exist"))
			}
		}
	}
	return recs
}
Just to post an update, both Errors and Records channels are closed automatically when the record stream is over from the server-side, hence the nil value from the Errors channel.
So this wasn't an error after all. We've updated the thread in our Aerospike user forum post accordingly.
I'm not familiar with the aerospike package, but running your example code shows that it always panics, whether or not it returns data.
That means that the Errors channel always sends either an error or nil. If that's the expected behavior you'd have to handle it accordingly and only panic when the error is not nil.
Sending nil on the channel still means that a value is being sent on the channel and it will trigger the select statement. Thus the panic on a nil error.
The randomness you see, i.e. sometimes data is returned and sometimes it isn't is due to the nature of the select statement. If both data and a nil error are being sent at the same time both cases are true and select will pseudo randomly select one of the two.
If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection. Otherwise, if there is a default case, that case is chosen. If there is no default case, the "select" statement blocks until at least one of the communications can proceed.
If it selects the data channel first it will print the data and then on the next iteration select the error channel and panic. If it picks the error channel first it panics and the data never prints.
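The pseudo-random choice is easy to observe in isolation. This small sketch (pickCounts is a name made up here) keeps both channels ready on every iteration and counts which case select picks; over many iterations both cases get chosen:

```go
package main

import "fmt"

// pickCounts runs a select with both channels ready `iters` times and
// returns how often each case was chosen.
func pickCounts(iters int) (dataPicks, errPicks int) {
	data := make(chan int, 1)
	errs := make(chan error, 1)
	for i := 0; i < iters; i++ {
		data <- 1
		errs <- nil
		// Both cases can proceed, so select chooses pseudo-randomly.
		select {
		case <-data:
			dataPicks++
			<-errs // drain the other channel for the next round
		case <-errs:
			errPicks++
			<-data
		}
	}
	return
}

func main() {
	d, e := pickCounts(1000)
	fmt.Println("data:", d, "errs:", e) // both cases are chosen
}
```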
Turns out it's a legit bug and should be fixed soon: https://discuss.aerospike.com/t/aerospike-randomly-returning-nil-errors-when-using-query-with-go-client/1346
Not sure how to formulate the question, or whether it really relates only to Go, but what I am trying to do is have a TCP server and client that exchange data: the client streams a large amount of data in smaller chunks to the server, the server waits to read each chunk and then replies with a status code, which the client reads and uses to decide what to do next.
I use the function below as a test to read the data from the client on the server (please note, I am aware it is not perfect; it's just for testing):
func createBufferFromConn(conn net.Conn) *bytes.Buffer {
	buffer := &bytes.Buffer{}
	doBreak := false
	for {
		incoming := make([]byte, BUFFER_SIZE)
		conn.SetReadDeadline(time.Now().Add(time.Second * 2))
		bytesRead, err := conn.Read(incoming)
		conn.SetReadDeadline(time.Time{})
		if err != nil {
			if err == io.EOF {
				fmt.Println(err)
			} else if neterr, ok := err.(net.Error); ok && neterr.Timeout() {
				fmt.Println(err)
			}
			doBreak = true
		}
		if doBreak == false && bytesRead == 0 {
			continue
		}
		if bytesRead > 0 {
			buffer.Write(incoming[:bytesRead])
			if bytes.HasSuffix(buffer.Bytes(), []byte("|")) {
				bb := bytes.Trim(buffer.Bytes(), "|")
				buffer.Reset()
				buffer.Write(bb)
				doBreak = true
			}
		}
		if doBreak {
			break
		}
	}
	return buffer
}
Now in my case, if I connect via telnet (the Go code also includes a client() to connect to the server()) and I type something like test 12345|, fair enough, everything works just fine and the buffer contains all the bytes written from telnet (except the pipe, which is removed by the Trim() call).
If I remove the if bytes.HasSuffix(buffer.Bytes(), []byte("|")) { block from the code, then I get a timeout after 2 seconds. Again, as expected, because no data is received in that amount of time and the server closes the connection; and if I don't set a read deadline on the connection, it will wait forever to read data and will never know when to stop.
I guess my question is: if I send multiple chunks of data, do I have to specify a delimiter of my own so that I know when to stop reading from the connection and avoid waiting forever, or waiting for the server to time out the connection?
I guess my question is, if i send multiple chunks of data, do i have to specify a delimiter of my own so that i know when to stop reading from the connection and avoid waiting forever or waiting for the server to timeout the connection
Yes. TCP is a stream protocol, and there's no way to determine where messages within the protocol start and stop without framing them in some way.
A more common framing method is to send a size prefix, so that the receiver knows exactly how much to read without having to buffer the results and scan for a delimiter. This can be as simple as message_length:data.... (see also netstring and type-length-value encoding).