How do I get last connection error when dialling gRPC server? - go

I have the following code:
dialCtx, cancel := context.WithTimeout(ctx, 120*time.Second)
defer cancel()
conn, err := grpc.DialContext(dialCtx, address,
    grpc.WithTransportCredentials(creds),
    grpc.WithKeepaliveParams(keepAlive),
    grpc.WithBlock(),
)
if err != nil {
    return fmt.Errorf("failed to connect to server: %v", err)
}
I am trying to connect to a gRPC server. Importantly, I am using WithBlock(), which blocks the dial until the connection is ready or the context times out. However, when the context times out, I don't learn what the actual connection problem was (the last connection error); I only get context deadline exceeded.
I tried the following:
- Using grpc.FailOnNonTempDialError(true): an error is returned when the service is not available, but when TLS verification fails, reconnection continues.
- Using grpc.WithContextDialer(...): this does not work for me, because sometimes the initial dial succeeds, but if validation of the server certificate fails, the whole connection is closed.
How can I get that last connection error?

After some more research, I decided to update the grpc package version. I was using v1.27.0 and the latest is v1.35.0. Between these versions, the problem was fixed and a new dial option was introduced:
grpc.WithReturnConnectionError()
It's much better now, but there is room for improvement. Currently, the last error and the context error are combined like this:
conn, err = nil, fmt.Errorf("%v: %v", ctx.Err(), err)
The problem is that the underlying error's type is lost, so the only way to act on the error is string comparison, which is not reliable.
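To illustrate why the type is lost, here is a minimal standard-library sketch. It does not use grpc-go at all: tlsError, combineV, combineW and isTLSError are hypothetical names introduced only for this example. It compares the quoted "%v: %v" formatting with wrapping via %w, which keeps the underlying error reachable through errors.As.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// tlsError is a hypothetical stand-in for a typed connection error,
// such as a certificate verification failure.
type tlsError struct{ reason string }

func (e *tlsError) Error() string { return "tls: " + e.reason }

// combineV mirrors the quoted line: both errors are flattened into text.
func combineV(ctxErr, lastErr error) error {
	return fmt.Errorf("%v: %v", ctxErr, lastErr)
}

// combineW wraps the last connection error with %w so its type survives.
func combineW(ctxErr, lastErr error) error {
	return fmt.Errorf("%v: %w", ctxErr, lastErr)
}

// isTLSError reports whether err is, or wraps, a *tlsError.
func isTLSError(err error) bool {
	var target *tlsError
	return errors.As(err, &target)
}

// demo reports whether the typed error is still detectable in each variant.
func demo() (flattened, wrapped bool) {
	last := &tlsError{reason: "certificate signed by unknown authority"}
	flattened = isTLSError(combineV(context.DeadlineExceeded, last))
	wrapped = isTLSError(combineW(context.DeadlineExceeded, last))
	return
}

func main() {
	fmt.Println(demo()) // false true
}
```

With the %v form, the caller can only inspect the message text; with %w, errors.As and errors.Is can still find the original error.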
I hope this answer is useful.

Related

Why is Go connecting to a database synchronously?

I'm coming from a Node background and trying to get into Go, by looking at code examples.
I do find it weird that code is mostly synchronous - even things like connecting and communicating with the database, e.g.
func main() {
    // Create a new client and connect to the server
    client, err := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri))
    if err != nil {
        panic(err)
    }
}
Doesn't this block the thread until DB sends back a response? If not, how is that possible?
Yeah there's this difference:
In Node, everything is non-blocking unless you say otherwise (with await or a callback).
In Go, everything is blocking unless you say otherwise (with go).
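As a rough sketch of what "saying otherwise" looks like in Go: a blocking call blocks only the goroutine that makes it, so moving it into its own goroutine lets the caller continue. slowConnect below is a hypothetical stand-in for a blocking call such as a database connect, and connectAsync is a name made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// slowConnect is a hypothetical stand-in for a blocking call such as a
// database connect. It blocks only the goroutine that calls it.
func slowConnect() string {
	time.Sleep(50 * time.Millisecond)
	return "connected"
}

// connectAsync runs slowConnect in its own goroutine and returns a channel
// that delivers the result once it is available.
func connectAsync() <-chan string {
	result := make(chan string, 1) // buffered so the goroutine never blocks on send
	go func() { result <- slowConnect() }()
	return result
}

func main() {
	pending := connectAsync() // returns immediately
	fmt.Println("doing other work while the connect is in flight")
	fmt.Println(<-pending) // wait here only when the result is needed
}
```

The channel receive plays a role similar to await in Node: execution pauses at that point until the value arrives.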

How to know which connection closed in "use of closed network connection" error

I'm proxying TCP connections in Go using io.Copy
_, err := io.Copy(src, dst)
if err != nil {
log.Println(err)
}
and when one connection closes, I get this error:
readfrom tcp 171.31.80.49:10000->88.39.116.204:56210: use of closed network connection
How do I know which network connection closed? i.e. 171.31.80.49:10000 or 88.39.116.204:56210.
A TCP connection is a pair of IP and port pairs. In your case, the connection is 171.31.80.49:10000->88.39.116.204:56210. It is the connection, and it is closed. There is no connection 171.31.80.49:10000 or 88.39.116.204:56210.
There are two connections in your example: src and dst (you misnamed them, by the way: https://pkg.go.dev/io#Copy). If your question is which connection is getting closed, then, according to the error message, it's dst (which is supposed to be named src).
Why? Because the message says readfrom ..., so the error occurs when io.Copy is reading from the Reader, which in our case is dst.
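If the error text alone is not enough to tell the two sides apart, one option is to wrap each side so that its errors carry a label. The following is only a sketch under assumptions: taggedReader, taggedWriter, brokenWriter, copyTagged and the labels "client" and "backend" are hypothetical, and brokenWriter merely simulates a closed connection. Note the argument order: io.Copy takes the destination first.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// taggedReader labels read errors. io.EOF passes through untouched so that
// io.Copy still terminates normally at end of stream.
type taggedReader struct {
	r     io.Reader
	label string
}

func (t taggedReader) Read(p []byte) (int, error) {
	n, err := t.r.Read(p)
	if err != nil && err != io.EOF {
		err = fmt.Errorf("read from %s: %w", t.label, err)
	}
	return n, err
}

// taggedWriter labels write errors.
type taggedWriter struct {
	w     io.Writer
	label string
}

func (t taggedWriter) Write(p []byte) (int, error) {
	n, err := t.w.Write(p)
	if err != nil {
		err = fmt.Errorf("write to %s: %w", t.label, err)
	}
	return n, err
}

// brokenWriter is a hypothetical stand-in for a connection that was closed.
type brokenWriter struct{}

func (brokenWriter) Write([]byte) (int, error) {
	return 0, errors.New("use of closed network connection")
}

// copyTagged copies src to dst; the returned error names the failing side.
func copyTagged(dst io.Writer, src io.Reader) error {
	_, err := io.Copy(
		taggedWriter{w: dst, label: "backend"},
		taggedReader{r: src, label: "client"},
	)
	return err
}

// demo copies into a writer that always fails.
func demo() error {
	return copyTagged(brokenWriter{}, strings.NewReader("payload"))
}

func main() {
	fmt.Println(demo()) // write to backend: use of closed network connection
}
```

One trade-off of this design: the wrappers hide any ReadFrom or WriteTo methods of the underlying connections, so io.Copy falls back to its generic buffer loop.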

Google pubsub golang subscriber stops receiving new published message(s) after being idle for a few hours

I created a TOPIC in google pubsub, and created a SUBSCRIPTION inside the TOPIC, with the following settings
Then I wrote a puller in Go, using Receive to pull and acknowledge published messages:
package main

import (
    ...
)

func main() {
    ctx := context.Background()
    client, err := pubsub.NewClient(ctx, config.C.Project)
    if err != nil {
        // do things with err
    }
    sub := client.Subscription(config.C.PubsubSubscription)
    // err is already declared above, so assign with = rather than :=
    err = sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
        msg.Ack()
    })
    if err != context.Canceled {
        logger.Error(fmt.Sprintf("Cancelled: %s", err.Error()))
    }
    if err != nil {
        logger.Error(fmt.Sprintf("Error: %s", err.Error()))
    }
}
Nothing fancy, and it's working well, but after a while (after roughly 3 hours idle) it stops receiving newly published messages, with no errors at all. Am I missing something?
In general, there can be several reasons why a subscriber may stop receiving messages:
- If a subscriber does not ack or nack messages, the flow control limits can be reached, meaning no more messages can be delivered. This does not seem to be the case in your particular instance given that you immediately ack messages.
- If another subscriber starts up for the same subscription, it could be receiving the messages. In this scenario, one would expect the subscriber to receive a subset of the messages rather than no messages at all.
- Publishers just stop publishing messages and therefore there are no messages to receive. If you restart the subscriber and it starts receiving messages again, this probably isn't the case. You can also verify that a backlog is being built up by looking at the Stackdriver metric for subscription/backlog_bytes.
If your problem does not fall into one of those categories, it would be best to reach out to Google Cloud support with your project name, topic name, and subscription name so that they can narrow down the issue to either your user code, the client library itself, or the service.
I was experiencing something similar and I was pretty sure there was not another subscriber pulling those messages.
Try this: go to the topic and create a new bogus subscription (name it whatever you want, because you'll just delete it later). Right after I did that, both the fake subscription (which I was subscribing to with the Python sample client) and the real one were receiving messages again. Strange solution, but maybe it kicked the topic awake again.
Hopefully someone from Google could give us some insight into what's happening here, but I'm definitely not paying them enough to get direct support.
A few changes will help you investigate the issue better:
- Check error from Receive
- Use separate context for Receive
ctx := context.Background()
err := sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
    msg.Ack()
})
if err != nil {
    log.Fatal(err)
}
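The two suggestions can also be sketched without the Pub/Sub client. In the example below, receive is a hypothetical stand-in for a blocking call like sub.Receive, and its return value on cancellation is an assumption of this sketch, not a statement about the real library; classify and run are likewise made-up names. The point is only the pattern: give the blocking call its own cancellable context and separate a deliberate shutdown from a real failure with errors.Is.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// receive is a hypothetical stand-in for a blocking call like sub.Receive.
// In this sketch it returns ctx.Err() once the context is done.
func receive(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// classify separates a deliberate shutdown from a real failure.
func classify(err error) string {
	switch {
	case err == nil:
		return "stopped cleanly"
	case errors.Is(err, context.Canceled):
		return "cancelled by caller"
	default:
		return "failed: " + err.Error()
	}
}

// run gives the blocking call its own cancellable context derived from parent.
func run(parent context.Context, stopAfter time.Duration) string {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()
	timer := time.AfterFunc(stopAfter, cancel) // simulates a shutdown signal
	defer timer.Stop()
	return classify(receive(ctx))
}

// demo runs the loop and stops it after a short delay.
func demo() string {
	return run(context.Background(), 10*time.Millisecond)
}

func main() {
	fmt.Println(demo()) // cancelled by caller
}
```

Checking for nil before calling err.Error() also avoids a nil-pointer panic when the call returns without an error.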
Did your code work before? I have been having problems with PubSub since today. Methods like get_topic() and create_topic() in the Python PubSub library stopped working, but I don't have any problems sending and pulling messages. Yesterday everything was working fine, but today it is not.

get notified when http.Server starts listening

When I look at the net/http server interface, I don't see an obvious way to get notified and react when the http.Server comes up and starts listening:
ListenAndServe(":8080", nil)
The function doesn't return until the server actually shuts down. I also looked at the Server type, but there doesn't appear to be anything that lets me tap into that timing. Some function or a channel would have been great but I don't see any.
Is there any way that will let me detect that event, or am I left to just sleeping "enough" to fake it?
ListenAndServe is a helper function that opens a listening socket and then serves connections on that socket. Write the code directly in your application to signal when the socket is open:
l, err := net.Listen("tcp", ":8080")
if err != nil {
    // handle error
}
// Signal that server is open for business.
if err := http.Serve(l, rootHandler); err != nil {
    // handle error
}
If the signalling step does not block, then http.Serve will easily consume any backlog on the listening socket.
Related question: https://stackoverflow.com/a/32742904/5728991
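One way to fill in the signalling step is a buffered channel, sketched below. serveWithSignal and fetch are hypothetical names for this example, and the handler and the 127.0.0.1:0 address (which asks the OS for a free port) are assumptions chosen so the sketch is self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// serveWithSignal behaves like http.ListenAndServe, but sends the bound
// address on ready as soon as the listening socket is open.
func serveWithSignal(addr string, h http.Handler, ready chan<- string) error {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	ready <- l.Addr().String() // ready must be buffered so this does not block
	return http.Serve(l, h)
}

// fetch starts the server, waits for the ready signal, then issues one request.
func fetch() (string, error) {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	})
	ready := make(chan string, 1)
	errc := make(chan error, 1)
	go func() { errc <- serveWithSignal("127.0.0.1:0", h, ready) }()

	select {
	case addr := <-ready: // the socket is open, so connecting is safe now
		resp, err := http.Get("http://" + addr)
		if err != nil {
			return "", err
		}
		defer resp.Body.Close()
		body, err := io.ReadAll(resp.Body)
		return string(body), err
	case err := <-errc: // listening failed before the signal was sent
		return "", err
	}
}

func main() {
	body, err := fetch()
	fmt.Println(body, err)
}
```

Because the signal is sent after net.Listen succeeds, a client that connects before http.Serve starts accepting is simply queued in the socket backlog, so no sleep is needed.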

GORM pq too many connections

I am using GORM in my project. Everything was fine until I got this error:
pq: sorry, too many clients already
I am just using the default configuration. The error happened after I sent a lot of test requests to my application.
The error goes away after I restart my application, so I suspect that the GORM connection is not released after I'm done with a query. I haven't looked deeply into the GORM code yet, so I'm asking here in case someone has already experienced this.
The error message you are getting is a PostgreSQL error, not a GORM one. It is caused by opening the database connection more than once.
db, err := gorm.Open("postgres", "user=gorm dbname=gorm")
It should be initialized once and reused after that.
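var (
    once     sync.Once
    instance *gorm.DB
)

// GetDB (an illustrative name) opens the connection on the first call
// and returns the same handle on every later call.
func GetDB() *gorm.DB {
    once.Do(func() {
        // PostgreSQL DSN as used in the gorm.Open example above.
        db, err := gorm.Open("postgres", "user=gorm dbname=gorm")
        if err != nil {
            log.Println("Connection Failed to Open")
            return
        }
        log.Println("Connection Established here")
        db.DB().SetMaxIdleConns(10)
        db.LogMode(true)
        instance = db
    })
    return instance
}
You can restrict the connection to a singleton function so that the connection is opened only once, even if the function is called multiple times.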
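As a runnable check of the once-only behaviour, here is a standard-library sketch that does not touch GORM. pool is a hypothetical stand-in for the database handle, and getDB, openCount and the opens counter exist only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical stand-in for a database handle such as *gorm.DB.
type pool struct{ id int }

var (
	once     sync.Once
	instance *pool
	opens    int // how many times the "connection" was opened
)

// getDB opens the pool on first use and returns the same handle afterwards.
func getDB() *pool {
	once.Do(func() {
		opens++
		instance = &pool{id: opens}
	})
	return instance
}

// openCount calls getDB from n goroutines and reports how many opens happened.
func openCount(n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			getDB()
		}()
	}
	wg.Wait()
	return opens
}

func main() {
	fmt.Println(openCount(100)) // 1
}
```

Even with 100 concurrent callers the initializer runs a single time, so only one connection pool is ever created.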
