Program fails with: write tcp broken pipe error - go

I've got a Go app that downloads files using github.com/jlaffaye/ftp, a library that implements the File Transfer Protocol (FTP).
I defined a struct that holds an FTP connection:
type Client struct {
conn *ftp.ServerConn
sync.Mutex
}
I have a global client
var cli *Client = nil
Which I then initialize and assign to the global variable to be used later
conn, err := ftp.Dial(cfg.Host+cfg.Port, ftp.DialWithContext(ctx))
if err != nil {
return fmt.Errorf("ftp dial error: %v", err)
}
err = conn.Login(cfg.Username, pwd)
if err != nil {
return fmt.Errorf("ftp login error: %v", err)
}
c := &Client{}
c.Lock()
c.conn = conn
cli = c
c.Unlock()
I've got an HTTP handler that takes the conn field from the Client struct and calls
fileNames, err := conn.NameList(".")
if err != nil {
return err
}
Most of the time, the application fails on the call to conn.NameList(".") with error
write tcp xxx.xxx.xx.x:48572->xx.xxx.xxx.xxx:21: write: broken pipe
And sometimes
write tcp xxx.xxx.xx.xx:63037->xx.xxx.xxx.xxx:21: wsasend: An established connection was aborted by the software in your host machine
I don't close the connection prematurely.
Does anyone have an idea on why this is happening? Or maybe you could recommend a better library that uses FTP?
In response to the question of whether multiple services run at the same time and whether this is concurrency-safe:
When a process is running, another can't run until setIsRunning(false) is called:
func (s *Service) setIsRunning(b bool) error {
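// Assumes Service has a field `mu sync.Mutex`. A mutex created inside the
// call is new on every invocation and therefore guards nothing.
s.mu.Lock()
defer s.mu.Unlock()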
if b && s.isRunning {
return errors.New("already running")
}
s.isRunning = b
return nil
}
An attempt to call that handler while it's running will yield
"error":"already running"
I also take the mutex while reading from the global Client, like so:
cli.Lock()
conn = cli.conn
cli.Unlock()

Switching to a non-global client resolves the issue.
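One plausible reading of why this helps (an assumption, nothing in the question confirms it): a long-lived global control connection sits idle between requests, the server or something in between drops it, and the next write fails with broken pipe. Below is a minimal sketch of the per-request pattern. The session interface, the dialer type and fakeSession are hypothetical names introduced here so the example runs without a server. In a real program the dialer would wrap ftp.Dial and Login, and *ftp.ServerConn already has NameList and Quit methods with these signatures.

```go
package main

import "fmt"

// session lists the two calls this sketch needs. It is a hypothetical
// interface; *ftp.ServerConn from github.com/jlaffaye/ftp has methods
// with these signatures.
type session interface {
	NameList(path string) ([]string, error)
	Quit() error
}

// dialer opens (and logs in) a fresh session. Hypothetical helper type.
type dialer func() (session, error)

// listFiles opens a new session for this call only and always closes it,
// so no connection is left idle between requests.
func listFiles(dial dialer, path string) ([]string, error) {
	s, err := dial()
	if err != nil {
		return nil, fmt.Errorf("ftp dial error: %w", err)
	}
	defer s.Quit()
	return s.NameList(path)
}

// fakeSession is an in-memory stand-in so the sketch runs without a server.
type fakeSession struct {
	names []string
	quits *int // counts Quit calls
}

func (f fakeSession) NameList(string) ([]string, error) { return f.names, nil }
func (f fakeSession) Quit() error                       { *f.quits++; return nil }

func main() {
	quits := 0
	dial := func() (session, error) {
		return fakeSession{names: []string{"a.txt", "b.txt"}, quits: &quits}, nil
	}
	names, err := listFiles(dial, ".")
	fmt.Println(names, err, quits)
}
```

The cost is one login per request. If that is too slow, keeping a single connection but checking it (for example with the library's NoOp call) and re-dialing on error is another option.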

Related

automatic gRPC unix reconnect after EOF

I have an application (let's call it client) connecting to another process (let's call it server) on the same machine via gRPC. The communication goes over unix socket.
If server is restarted, my client gets an EOF and does not re-establish the connection, although I expected the clientConn to handle the reconnection automatically.
Why isn't the dialer taking care of the reconnection?
I expect it to do so with the backoff params I passed.
Below is a pseudo-MWE:
Run establishes the initial connection, then spawns goroutineOne.
goroutineOne waits for the connection to be ready and delegates the send to fooUpdater.
fooUpdater streams the data, or returns in case of errors.
For waitUntilReady, I used the pseudo-code referenced by this answer to get a new stream.
func main() {
go func() {
if err := Run(ctx); err != nil {
log.Errorf("connection error: %v", err)
}
ctxCancel()
}()
// some wait logic
}
func Run(ctx context.Context) error {
backoffConfig := backoff.Config{
BaseDelay: time.Duration(1 * time.Second),
Multiplier: backoff.DefaultConfig.Multiplier,
Jitter: backoff.DefaultConfig.Jitter,
MaxDelay: time.Duration(120 * time.Second),
}
myConn, err := grpc.DialContext(ctx,
"/var/run/foo.bar",
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithConnectParams(grpc.ConnectParams{Backoff: backoffConfig, MinConnectTimeout: time.Duration(1 * time.Second)}),
grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
d := net.Dialer{}
c, err := d.DialContext(ctx, "unix", addr)
if err != nil {
return nil, fmt.Errorf("connection to unix://%s failed: %w", addr, err)
}
return c, nil
}),
)
if err != nil {
return fmt.Errorf("could not establish socket for foo: %w", err)
}
defer myConn.Close()
return goroutineOne(ctx, myConn)
}
func goroutineOne(ctx context.Context, myConn *grpc.ClientConn) error {
reconnect := make(chan struct{})
for {
if ready := waitUntilReady(ctx, myConn, time.Duration(2*time.Minute)); !ready {
return fmt.Errorf("myConn: %w, timeout: %s", ErrWaitReadyTimeout, "2m")
}
go func() {
if err := fooUpdater(ctx, dataBuffer, myConn); err != nil {
log.Errorf("foo updater: %v", err)
}
reconnect <- struct{}{}
}()
select {
case <-ctx.Done():
return nil
case <-reconnect:
}
}
}
func fooUpdater(ctx context.Context, dataBuffer custom.CircularBuffer, myConn *grpc.ClientConn) error {
clientStream, err := myConn.Stream(ctx) // custom pb code, returns grpc.ClientConn.NewStream(...)
if err != nil {
return fmt.Errorf("could not obtain stream: %w", err)
}
for {
select {
case <-ctx.Done():
return nil
case data := <-dataBuffer:
if err := clientStream.Send(data); err != nil {
return fmt.Errorf("could not send data: %w", err)
}
}
}
}
func waitUntilReady(ctx context.Context, conn *grpc.ClientConn, maxTimeout time.Duration) bool {
ctx, cancel := context.WithTimeout(ctx, maxTimeout)
defer cancel()
currentState := conn.GetState()
timeoutValid := true
for currentState != connectivity.Ready && timeoutValid {
timeoutValid = conn.WaitForStateChange(ctx, currentState)
currentState = conn.GetState()
// debug print currentState -> prints IDLE
}
return currentState == connectivity.Ready
}
Debugging hints also welcome :)
Based on the provided code and information, there might be an issue with how ctx.Done is being utilized.
ctx.Done() is used in both the fooUpdater and goroutineOne functions. When the connection breaks, I believe the ctx.Done() case fires in both functions, in the following order:
The connection breaks and the ctx.Done() case in fooUpdater fires, exiting the function. The select statement in goroutineOne then also takes the ctx.Done() case, which exits the function, so the client doesn't reconnect.
Try debugging it to check if both select case blocks get executed, but I believe that is the issue here.
According to the gRPC documentation, the connection is re-established after a transient failure; otherwise it fails immediately. You can verify whether the failure is transient by printing the connectivity state.
You should also print the error code to understand why the RPC failed.
Maybe what you have tried is not considered a transient failure.
Also, according to the following entry, retry logic does not work with streams: grpc-java: Proper handling of retry on client for service streaming call
Here are the links to the corresponding docs:
https://grpc.github.io/grpc/core/md_doc_connectivity-semantics-and-api.html
https://pkg.go.dev/google.golang.org/grpc#section-readme
Also, check the following entry:
Ways to wait if server is not available in gRPC from client side
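The debug print in the question shows IDLE, which suggests one more thing to check. As far as I know, an idle grpc-go channel does not dial on its own; it waits for an RPC or an explicit Connect() call, and WaitForStateChange alone does not trigger one. Here is a minimal sketch of a wait loop that nudges an idle connection. State, stateConn and fakeConn are hypothetical stand-ins defined here so the example runs without gRPC; they only mirror the GetState, WaitForStateChange and Connect methods of *grpc.ClientConn, whose real state type is connectivity.State.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// State is a hypothetical stand-in for connectivity.State.
type State int

const (
	Idle State = iota
	Connecting
	Ready
)

// stateConn mirrors the three *grpc.ClientConn methods the loop relies on.
type stateConn interface {
	GetState() State
	WaitForStateChange(ctx context.Context, last State) bool
	Connect()
}

// waitUntilReady blocks until conn is Ready or maxTimeout elapses.
// Unlike a plain wait loop, it asks an idle connection to start dialing.
func waitUntilReady(ctx context.Context, conn stateConn, maxTimeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(ctx, maxTimeout)
	defer cancel()
	for {
		s := conn.GetState()
		if s == Ready {
			return true
		}
		if s == Idle {
			conn.Connect() // leave Idle; otherwise nothing changes
		}
		if !conn.WaitForStateChange(ctx, s) {
			return false // timed out or cancelled
		}
	}
}

// fakeConn becomes Ready on Connect unless stuck is set.
type fakeConn struct {
	state    State
	stuck    bool
	connects int
}

func (f *fakeConn) GetState() State { return f.state }
func (f *fakeConn) Connect() {
	f.connects++
	if !f.stuck {
		f.state = Ready
	}
}
func (f *fakeConn) WaitForStateChange(ctx context.Context, last State) bool {
	if f.state != last {
		return true
	}
	<-ctx.Done()
	return false
}

// demo runs the loop against the fake with a short timeout.
func demo(stuck bool) (ready bool, connects int) {
	c := &fakeConn{state: Idle, stuck: stuck}
	ready = waitUntilReady(context.Background(), c, 50*time.Millisecond)
	return ready, c.connects
}

func main() {
	fmt.Println(demo(false))
}
```

With the real client, the same loop would call myConn.Connect() when GetState() returns connectivity.Idle. Whether that is the actual cause here depends on the grpc-go version in use, so treat it as something to try rather than a diagnosis.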

using stomp function in golang

I am trying to establish a STOMP connection on the client side using the stomp.Dial() function, but I am unable to connect to the server.
I am using the go-stomp library. I tried different approaches, such as using net.Dial and then stomp.Connect, but they result in the same error:
read tcp: wsarecv: An existing connection was forcibly closed by the remote host
What exactly is happening here, and how can I resolve it?
My code looks something like
ticker := time.NewTicker(time.Second * 5)
defer ticker.Stop()
for ; ; <-ticker.C {
st, err := stomp.Dial("tcp", conn.ConfigStr)
if err != nil {
log.Println("Stomp connect error", err.Error())
continue
}
log.Println("CONNECTED TO ", conn.ProviderName)
I was able to accomplish your request in this way.
First, I ran a STOMP server locally on port 61613. To launch it, I used this command:
docker run -it --rm -p 61613:61613 efrecon/stomp -verbose 5
Then, I used the package go-stomp together with the function Dial to connect to it:
package main
import "github.com/go-stomp/stomp/v3"
func main() {
conn, err := stomp.Dial("tcp", "localhost:61613")
if err != nil {
panic(err)
}
defer func() {
if err = conn.Disconnect(); err != nil {
panic(err)
}
}()
}
Lastly, I used the Disconnect method to close the connection.
Let me know if this works also for you.

How to make net.Dial in Go reconnect if connection is lost?

I have an app in Go that's connecting to XMPP host using tcp and then xml Decoder to talk XMPP. How can I make net.Dial reconnect if tcp connection is dropped?
I am getting the following error on my error channel when the connection is dropped:
write tcp client:port->xmpp_server:5222: write: broken pipe. However I'm not sure how to properly handle it in my Dial function to make it reconnect.
// package xmpp
// Conn represents a connection
type Conn struct {
incoming *xml.Decoder
outgoing net.Conn
errchan chan error
}
// SetErrorChannel sets the channel for handling errors
func (c *Conn) SetErrorChannel(channel chan error) {
c.errchan = channel
}
// Dial dials an xmpp host
func Dial(host string) (*Conn, error) {
c := new(Conn)
var err error
c.outgoing, err = net.Dial("tcp", host+":5222")
if err != nil {
log.Printf("Can't dial %s:5222: %s", host, err)
return c, err
}
// TCP Keep Alive
err = c.outgoing.(*net.TCPConn).SetKeepAlive(true)
if err != nil {
c.errchan <- err
}
err = c.outgoing.(*net.TCPConn).SetKeepAlivePeriod(30 * time.Second)
if err != nil {
c.errchan <- err
}
c.incoming = xml.NewDecoder(c.outgoing)
log.Printf("Connected to: %s", c.outgoing.RemoteAddr())
return c, nil
}
// In a separate package
func NewXMPPClient(config) (*Client, error) {
errchannel := make(chan error)
connection, err := xmpp.Dial(host)
if err != nil {
return nil, err
}
connection.SetErrorChannel(errchannel)
// Do XMPP auth, receive messages, etc...
Figured it out. I just started to close the current tcp connection on any error in my error channel and re-create both TCP and XMPP (auth+listen) connections.
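For anyone looking for the shape of that, here is a minimal sketch of the "close and re-create" step, assuming a fixed retry delay. redial, dialFunc and demo are hypothetical names introduced for this example, and the demo uses in-memory net.Pipe connections instead of a real XMPP server so that it runs anywhere.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// dialFunc opens a new connection. In a real client it could wrap
// (&net.Dialer{}).DialContext(ctx, "tcp", host+":5222").
type dialFunc func(ctx context.Context) (net.Conn, error)

// redial closes the broken connection (if any) and keeps dialing until
// it succeeds or ctx is done, sleeping delay between attempts.
func redial(ctx context.Context, old net.Conn, dial dialFunc, delay time.Duration) (net.Conn, error) {
	if old != nil {
		old.Close() // discard the broken connection first
	}
	for {
		conn, err := dial(ctx)
		if err == nil {
			return conn, nil
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(delay):
		}
	}
}

// demo simulates two failed attempts followed by a successful one and
// reports how many dial attempts were needed.
func demo() (attempts int, err error) {
	dial := func(ctx context.Context) (net.Conn, error) {
		attempts++
		if attempts < 3 {
			return nil, errors.New("connection refused (simulated)")
		}
		client, server := net.Pipe()
		go io.Copy(io.Discard, server) // simulated peer that only reads
		return client, nil
	}
	broken, peer := net.Pipe()
	defer peer.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	conn, err := redial(ctx, broken, dial, 10*time.Millisecond)
	if err != nil {
		return attempts, err
	}
	defer conn.Close()
	_, err = conn.Write([]byte("ping")) // the new connection is usable
	return attempts, err
}

func main() {
	fmt.Println(demo())
}
```

In the XMPP case, the goroutine that reads from the error channel would call something like redial and then repeat the auth and listen steps, since a new TCP connection starts a new XMPP stream.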

Proper way to close a crypto ssh session freeing all resources in golang?

TL;DR - What is the proper way to close a golang.org/x/crypto/ssh session freeing all resources?
My investigation thus far:
The golang.org/x/crypto/ssh *Session has a Close() function which calls the *Channel Close() function which sends a message (I'm guessing to the remote server) to close, but I don't see anything about closing other resources like the pipe returned from the *Session StdoutPipe() function.
Looking at the *Session Wait() code, I see that the *Session stdinPipeWriter is closed but nothing about the stdoutPipe.
This package feels a lot like the os/exec package which guarantees that using the os/exec Wait() function will clean up all the resources. Doing some light digging there shows some similarities in the Wait() functions. Both use the following construct to report errors on io.Copy calls to their stdout, stderr, stdin readers/writers (well if I'm reading this correctly actually only one error) - crypto package shown:
var copyError error
for _ = range s.copyFuncs {
if err := <-s.errors; err != nil && copyError == nil {
copyError = err
}
}
But the os/exec Wait() also calls this close descriptor method
c.closeDescriptors(c.closeAfterWait)
which is just calling the close method on a slice of io.Closer interfaces:
func (c *Cmd) closeDescriptors(closers []io.Closer) {
for _, fd := range closers {
fd.Close()
}
}
when os/exec creates the pipe, it tracks what needs closing:
func (c *Cmd) StdoutPipe() (io.ReadCloser, error) {
if c.Stdout != nil {
return nil, errors.New("exec: Stdout already set")
}
if c.Process != nil {
return nil, errors.New("exec: StdoutPipe after process started")
}
pr, pw, err := os.Pipe()
if err != nil {
return nil, err
}
c.Stdout = pw
c.closeAfterStart = append(c.closeAfterStart, pw)
c.closeAfterWait = append(c.closeAfterWait, pr)
return pr, nil
}
During this I noticed that x/crypto/ssh *Session StdoutPipe() returns an io.Reader and os/exec returns an io.ReadCloser. Also, x/crypto/ssh does not track what to close. I can't find a call to os.Pipe() in the library, so maybe the implementation is different and I'm missing something, confused by the Pipe name.
A session is closed by calling Close(). There are no file descriptors involved, nor are there any calls to os.Pipe, as the "pipe" returned from Session.StdoutPipe is only a pipe in concept and is of type ssh.Channel. Go channels don't need to be closed, because closing a channel is not a cleanup operation, rather it's simply a type of message sent to the channel. There is only ever one network connection involved in the ssh transport.
The only resource you need to close is the network connection; there are no other system resources to be freed. Calling Close() on the ssh.Client will call ssh.Conn.Close, and in turn close the net.Conn.
If you need to handle the network connection yourself, you can always skip the ssh.Dial convenience function and dial the network connection directly:
c, err := net.DialTimeout(network, addr, timeout)
if err != nil {
return nil, err
}
conn, chans, reqs, err := ssh.NewClientConn(c, addr, config)
if err != nil {
return nil, err
}
// calling conn.Close will close the underlying net.Conn
client := ssh.NewClient(conn, chans, reqs)

Whether to create connection every time when amqp.Dial is threadsafe or not in go lang

The RabbitMQ docs mention that TCP connections are expensive to create, which is why the concept of a channel was introduced. Now I came across this example. In main() it creates the connection every time a message is published:
conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
Shouldn't it be declared globally once, like a singleton object, with a failover mechanism in case the connection gets closed? That is, if amqp.Dial is thread-safe, which I suppose it should be.
Edited question :
I am handling the connection error in the following manner: I listen on a channel and create a new connection on error. But when I kill the existing connection and try to publish a message, I get the following error.
error :
2016/03/30 19:20:08 Failed to open a channel: write tcp 172.16.5.48:51085->172.16.0.20:5672: use of closed network connection
exit status 1
Code :
func main() {
Conn, err := amqp.Dial("amqp://guest:guest@172.16.0.20:5672/")
failOnError(err, "Failed to connect to RabbitMQ")
context := &appContext{queueName: "QUEUENAME",exchangeName: "ExchangeName",exchangeType: "direct",routingKey: "RoutingKey",conn: Conn}
c := make(chan *amqp.Error)
go func() {
error := <-c
if(error != nil){
Conn, err = amqp.Dial("amqp://guest:guest@172.16.0.20:5672/")
failOnError(err, "Failed to connect to RabbitMQ")
Conn.NotifyClose(c)
}
}()
Conn.NotifyClose(c)
r := web.New()
// We pass an instance to our context pointer, and our handler.
r.Get("/", appHandler{context, IndexHandler})
graceful.ListenAndServe(":8086", r)
}
Of course, you shouldn't create a connection for each request. Make it a global variable or, better, part of an application context that you initialize once at startup.
You can handle connection errors by registering a channel using Connection.NotifyClose:
func initialize() {
c := make(chan *amqp.Error)
go func() {
err := <-c
log.Println("reconnect: " + err.Error())
initialize()
}()
conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
if err != nil {
panic("cannot connect")
}
conn.NotifyClose(c)
// create topology
}
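One detail worth spelling out: reconnecting only helps if the code that publishes picks up the new connection. In the question, the appContext keeps the old Conn value, which would explain "use of closed network connection" after the reconnect. Below is a rough sketch of a holder that swaps the connection under a lock. conn, holder and demo are hypothetical names, and conn is only a stand-in for a broker connection, with its closed channel playing the role of the channel registered via NotifyClose.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// conn is a hypothetical stand-in for a broker connection.
type conn struct {
	id     int
	closed chan error // receives a value when the connection dies
}

// holder owns the current connection and replaces it after a close.
type holder struct {
	mu   sync.RWMutex
	cur  *conn
	dial func() (*conn, error)
}

func newHolder(dial func() (*conn, error)) (*holder, error) {
	c, err := dial()
	if err != nil {
		return nil, err
	}
	h := &holder{cur: c, dial: dial}
	go h.watch(c)
	return h, nil
}

// watch waits for the current connection to report a close, dials a
// replacement and stores it so later get calls see the new one.
func (h *holder) watch(c *conn) {
	for {
		<-c.closed
		next, err := h.dial()
		if err != nil {
			return // a real program would retry with a delay
		}
		h.mu.Lock()
		h.cur = next
		h.mu.Unlock()
		c = next
	}
}

// get returns the connection to use right now; call it per publish
// instead of caching the result.
func (h *holder) get() *conn {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.cur
}

// demo kills the first connection and reports the ids seen before and after.
func demo() (before, after int) {
	next := 0
	h, err := newHolder(func() (*conn, error) {
		next++
		return &conn{id: next, closed: make(chan error, 1)}, nil
	})
	if err != nil {
		return 0, 0
	}
	first := h.get()
	first.closed <- errors.New("connection reset (simulated)")
	for i := 0; i < 200 && h.get() == first; i++ {
		time.Sleep(5 * time.Millisecond) // wait for the swap
	}
	return first.id, h.get().id
}

func main() {
	fmt.Println(demo())
}
```

Handlers would then call get() for each publish rather than holding a connection in the context struct.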

Resources