Constantly Reconnect to Cassandra - go

I have read online that the best practice when using Cassandra is to have 1 Cluster and 1 Session for the lifetime of your service.
My questions are:
In case our Cassandra server goes down, how do I make sure that my Cluster and/or Session will keep trying to reconnect until our Cassandra server gets back online?
Should only the Cluster attempt to reconnect, or only the Session, or both?
We are using Go and github.com/gocql/gocql for our service.
I have seen the following snippet in the gocql documentation, but it looks like it only has a limited number of retries:
cluster.ReconnectionPolicy = &gocql.ConstantReconnectionPolicy{MaxRetries: 10, Interval: 8 * time.Second}
I also found the below snippet online, but it doesn't look like it's designed to handle this scenario:
var cluster *gocql.ClusterConfig
var session *gocql.Session

func getCassandraSession() *gocql.Session {
    if session == nil || session.Closed() {
        if cluster == nil {
            cluster = gocql.NewCluster("127.0.0.1:9042")
            cluster.Keyspace = "demodb"
            cluster.Consistency = gocql.One
            cluster.ProtoVersion = 4
        }
        var err error
        if session, err = cluster.CreateSession(); err != nil {
            panic(err)
        }
    }
    return session
}
Are any of the above methods sufficient to ensure that reconnections are attempted until our Cassandra server gets back online? And if not, what's the best practice for this scenario?

Special thanks to @Jim Wartnick for this. I just tried turning Cassandra off on my local machine and then turning it back on, and gocql instantly reconnected without my having to use any of the snippets from my question.
As long as your Cluster and Session have connected to Cassandra at least once, even if Cassandra goes down, they will instantly reconnect to it as soon as Cassandra gets back online.
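The one gap is the initial connection: if Cassandra is already down when the service starts, CreateSession() fails outright and there is no session yet to reconnect. If you need to survive that, a simple retry loop around the initial connection is enough. A minimal sketch (the retry interval and logging are my own choices, not gocql requirements):
func connectWithRetry(cluster *gocql.ClusterConfig) *gocql.Session {
    for {
        session, err := cluster.CreateSession()
        if err == nil {
            return session
        }
        // Cassandra is unreachable; wait and try again until it comes back.
        log.Printf("Cassandra not reachable yet: %v; retrying...", err)
        time.Sleep(5 * time.Second)
    }
}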
Many thanks to everyone who helped!


Scheduler-worker cluster without port forwarding

Hello Stack Overflow!
TLDR I would like to recreate https://github.com/KorayGocmen/scheduler-worker-grpc without port forwarding on the worker.
I am trying to build a competitive programming judge server for evaluation of submissions as a project for my school where I teach programming to kids.
Because the evaluation is computationally heavy I would like to have multiple worker nodes.
The scheduler would receive submissions and hand them out to the worker nodes. For ease of worker deployment (as workers will change often), I would like a worker to be able to subscribe to the scheduler, thus becoming a worker and receiving jobs.
The workers may not be on the same network as the scheduler, and each worker resides in a VM (it may later be ported to Docker, but currently there are issues with that).
The scheduler should be able to know resource usage of the worker, send different types of jobs to the worker and receive a stream of results.
I am currently thinking of using grpc to address my requirements of communication between workers and the scheduler.
I could create multiple scheduler service methods like:
register worker, receive a stream of jobs
stream job results, receive nothing
stream worker state periodically, receive nothing
However, I would prefer the following, but I don't know whether it is possible:
The scheduler GRPC api:
register a worker ( making the worker GRPC api available to the scheduler )
The worker GRPC api:
start a job ( returns stream of job status )
cancel a job ???
get resource usage
The worker should unregister automatically if the connection is lost.
So my question is... is it possible to create a grpc worker api that can be registered to the scheduler for later use if the worker is behind a NAT without port forwarding?
Additional possibly unnecessary information:
Making matters worse, I have multiple radically different types of jobs (streaming an interactive console, executing code against prepared test cases). I may just create different workers for the different job types.
Sometimes the jobs involve large files on the local filesystem (up to 500 MB) that are usually kept near the scheduler, so I would like to send such a job to a worker that has already downloaded the specific files from the scheduler, and otherwise have one of the workers download them. Keeping all the files on one worker at the same time would take more than 20 GB, so I would like to avoid that.
A worker can run multiple jobs ( up to 16 ) at the same time.
I am writing the system in go.
As long as only the workers initiate the connections you don't have to worry about NAT. gRPC supports streaming in either direction (or both). This means that all of your requirements can be implemented using just one server on the scheduler; there is no need for the scheduler to connect back to the workers.
Given your description your service could look something like this:
syntax = "proto3";
import "google/protobuf/empty.proto";
service Scheduler {
rpc GetJobs(GetJobsRequest) returns (stream GetJobsResponse) {}
rpc ReportWorkerStatus(stream ReportWorkerStatusRequest) returns (google.protobuf.Empty) {}
rpc ReportJobStatus(stream JobStatus) returns (stream JobAction) {}
}
enum JobType {
JOB_TYPE_UNSPECIFIED = 0;
JOB_TYPE_CONSOLE = 1;
JOB_TYPE_EXEC = 2;
}
message GetJobsRequest {
// List of job types this worker is willing to accept.
repeated JobType types = 1;
}
message GetJobsResponse {
string jobId = 0;
JobType type = 1;
string fileName = 2;
bytes fileContent = 3;
// etc.
}
message ReportWorkerStatusRequest {
float cpuLoad = 0;
uint64 availableDiskSpace = 1;
uint64 availableMemory = 2;
// etc.
// List of filenames or file hashes, or whatever else you need to precisely
// report the presence of files.
repeated string haveFiles = 2;
}
Much of this is a matter of preference (you can use oneof instead of enums, for instance), but hopefully it's clear that a single connection from client to server is sufficient for your requirements.
Maintaining the set of available workers is quite simple:
func (s *Server) GetJobs(req *pb.GetJobsRequest, stream pb.Scheduler_GetJobsServer) error {
    ctx := stream.Context()
    s.scheduler.AddWorker(req)
    defer s.scheduler.RemoveWorker(req)

    for {
        job, err := s.scheduler.GetJob(ctx, req)
        switch {
        case ctx.Err() != nil: // client disconnected
            return nil
        case err != nil:
            return err
        }
        if err := stream.Send(job); err != nil {
            return err
        }
    }
}
The Basics tutorial includes examples for all types of streaming, including server and client implementations in Go.
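For completeness, the worker side of GetJobs could look roughly like the sketch below. Everything outside the generated API is assumed: the import path, the scheduler address, and the runJob helper are placeholders.
package main

import (
    "context"
    "io"
    "log"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    pb "example.com/judge/proto" // hypothetical import path for the generated code
)

func runJob(job *pb.GetJobsResponse) { /* placeholder for the actual evaluation */ }

func main() {
    // The worker dials out, so NAT on the worker's side is not a problem.
    conn, err := grpc.Dial("scheduler.example.com:50051",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    client := pb.NewSchedulerClient(conn)
    stream, err := client.GetJobs(context.Background(), &pb.GetJobsRequest{
        Types: []pb.JobType{pb.JobType_JOB_TYPE_EXEC},
    })
    if err != nil {
        log.Fatal(err)
    }
    for {
        job, err := stream.Recv()
        if err == io.EOF {
            return // scheduler closed the stream
        }
        if err != nil {
            log.Fatal(err) // in practice: back off and re-dial
        }
        go runJob(job)
    }
}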
As for registration, that usually just means creating some sort of credential that a worker will use when communicating with the server. This might be a randomly generated token (which the server can use to load associated metadata), or a username/password combination, or a TLS client certificate, or similar. Details will depend on your infrastructure and desired workflow when setting up workers.
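As an illustration of the token approach, gRPC's per-RPC credentials let a worker attach a pre-shared token to every call. A hedged sketch; the header name and environment variable are my own placeholders, and RequireTransportSecurity should return true once TLS is in place:
type tokenAuth struct{ token string }

// GetRequestMetadata attaches the worker's token to every outgoing RPC.
func (t tokenAuth) GetRequestMetadata(ctx context.Context, uri ...string) (map[string]string, error) {
    return map[string]string{"authorization": "Bearer " + t.token}, nil
}

// RequireTransportSecurity is false only because this sketch skips TLS.
func (t tokenAuth) RequireTransportSecurity() bool { return false }

// Usage when dialing the scheduler:
//   conn, err := grpc.Dial(addr,
//       grpc.WithTransportCredentials(insecure.NewCredentials()),
//       grpc.WithPerRPCCredentials(tokenAuth{token: os.Getenv("WORKER_TOKEN")}))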

Why is Go connecting to a database synchronously?

I'm coming from a Node background and trying to get into Go, by looking at code examples.
I do find it weird that code is mostly synchronous - even things like connecting and communicating with the database, e.g.
func main() {
    // Create a new client and connect to the server.
    client, err := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri))
    if err != nil {
        panic(err)
    }
}
Doesn't this block the thread until DB sends back a response? If not, how is that possible?
Yeah, there's this difference:
In Node, everything is non-blocking unless you say otherwise (await, or a callback).
In Go, everything is blocking unless you say otherwise (go).
Note that a blocking call in Go only blocks its goroutine, not an OS thread: the runtime parks the goroutine and keeps running others on the same thread, which is why synchronous-looking code stays cheap.
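If you do want the Node-style "start it and come back later" shape, you opt in with a goroutine and a channel. A minimal sketch, with the URI as a placeholder:
package main

import (
    "context"
    "log"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    const uri = "mongodb://localhost:27017" // placeholder

    type result struct {
        client *mongo.Client
        err    error
    }
    done := make(chan result, 1)

    // The connect call blocks only this goroutine, not the whole program.
    go func() {
        c, err := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri))
        done <- result{c, err}
    }()

    // ... other startup work can happen here in the meantime ...

    r := <-done // the rough equivalent of Node's await
    if r.err != nil {
        log.Fatal(r.err)
    }
    log.Println("connected")
}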

Snowflake Go Sessions Keep Terminating

I am using the gosnowflake 1.40 driver. I am seeing my sessions cycle after 2 queries as seen in the image below, less than 1 second apart.
Connection setup looks something like this:
dsn, err := sf.DSN(sfConfig)
if err != nil {
    panic("cannot get snowflake session: " + err.Error())
}
DBSession, err = sql.Open("snowflake", dsn)
if err != nil {
    panic("cannot get snowflake session: " + err.Error())
}
return DBSession, nil
I use the following query pattern inside a function:
result = dbSession.QueryRow(command)
This session cycling pattern is not ideal, as I'd like to be able to assume a role and run multiple commands. Can someone point me to what I can do to make the Snowflake sessions persist? I don't have this problem using the WebUI.
A sql.DB handle maintains a pool of connections, and each connection in the pool will have a unique session ID. From the documentation:
DB is a database handle representing a pool of zero or more underlying connections. It's safe for concurrent use by multiple goroutines.
The sql package creates and frees connections automatically; it also maintains a free pool of idle connections.
You have a couple of options for bypassing the default behavior of cycling through the pool of connections:
Obtain a specific Conn instance from the connection pool using DB.Conn(); the documentation specifically states: "Queries run on the same Conn will be run in the same database session." (See the sketch below.)
Modify the connection pool parameters using DB.SetMaxOpenConns(). I suspect that setting this to 1 will also obtain the desired behavior. However, this introduces scalability/concurrency concerns that are addressed by having a connection pool in the first place.
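A minimal sketch of the Conn approach; the role name is a placeholder and the error handling is illustrative:
ctx := context.Background()

// Pin one connection (and therefore one Snowflake session) from the pool.
conn, err := dbSession.Conn(ctx)
if err != nil {
    return err
}
defer conn.Close() // returns the connection to the pool

// Both statements run on the same connection, so the assumed role sticks.
if _, err := conn.ExecContext(ctx, "USE ROLE my_role"); err != nil { // placeholder role
    return err
}
row := conn.QueryRowContext(ctx, command)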
Note, I'm not familiar with the Snowflake driver in particular. There may be other options that the driver supports.

sql.DB on aws-lambda too many connection

As I understand it, in Go the DB handle is meant to be long-lived and shared between many goroutines.
But when using Go with AWS Lambda, it's a very different story, since Lambda freezes the execution environment when the function finishes.
I am using defer db.Close() in the Lambda handler, but it has no effect: MySQL still keeps the connection around as a Sleep query. As a result, it causes too many connections on MySQL.
Currently, I have to set wait_timeout in MySQL to a small number, but that's not the best solution, in my opinion.
Is there any way to close the connection when using the Go SQL driver with Lambda?
Thanks,
There are two problems that we need to address:
Correctly managing state between Lambda invocations
Configuring a connection pool
Correctly managing state
Let us understand a bit of how the container is managed by AWS. From the AWS docs:
After a Lambda function is executed, AWS Lambda maintains the
execution context for some time in anticipation of another Lambda
function invocation. In effect, the service freezes the execution
context after a Lambda function completes, and thaws the context for
reuse, if AWS Lambda chooses to reuse the context when the Lambda
function is invoked again.
This execution context reuse approach has the following implications:
Any declarations in your Lambda function code (outside the handler
code, see Programming Model) remains initialized, providing additional
optimization when the function is invoked again. For example, if your
Lambda function establishes a database connection, instead of
reestablishing the connection, the original connection is used in
subsequent invocations. We suggest adding logic in your code to check
if a connection exists before creating one.
Each execution context provides 500MB of additional disk space in the
/tmp directory. The directory content remains when the execution
context is frozen, providing transient cache that can be used for
multiple invocations. You can add extra code to check if the cache has
the data that you stored. For information on deployment limits, see
AWS Lambda Limits.
Background processes or callbacks initiated by your Lambda function
that did not complete when the function ended resume if AWS Lambda
chooses to reuse the execution context. You should make sure any
background processes or callbacks (in case of Node.js) in your code
are complete before the code exits.
This first bullet point says that state is maintained between executions. Let us see this in action:
let counter = 0

module.exports.handler = (event, context, callback) => {
  counter++
  callback(null, { count: counter })
}
If you deploy this and call multiple times consecutively you will see that the counter will be incremented between calls.
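The same experiment in Go, using aws-lambda-go; a minimal sketch:
package main

import (
    "github.com/aws/aws-lambda-go/lambda"
)

// counter survives between invocations for as long as AWS reuses
// this execution context.
var counter int

func handler() (int, error) {
    counter++
    return counter, nil
}

func main() {
    lambda.Start(handler)
}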
Now that you know that, you should not call defer db.Close(); instead you should reuse the database instance. You can do that by simply making db a package-level variable.
First, create a database package that will export an Open function:
package database

import (
    "fmt"
    "os"

    _ "github.com/go-sql-driver/mysql"
    "github.com/jinzhu/gorm"
)

var (
    host = os.Getenv("DB_HOST")
    port = os.Getenv("DB_PORT")
    user = os.Getenv("DB_USER")
    name = os.Getenv("DB_NAME")
    pass = os.Getenv("DB_PASS")
)

func Open() (db *gorm.DB) {
    args := fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
    // Initialize a new db connection.
    db, err := gorm.Open("mysql", args)
    if err != nil {
        panic(err)
    }
    return
}
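Since each Lambda container keeps its own pool, it can also help to cap that pool right after opening it. A hedged sketch using gorm v1's access to the underlying *sql.DB; the numbers are illustrative, not tuned:
// Inside Open(), after gorm.Open succeeds (requires the "time" import):
db.DB().SetMaxOpenConns(2)                  // at most 2 connections per container
db.DB().SetMaxIdleConns(1)                  // keep 1 warm between invocations
db.DB().SetConnMaxLifetime(5 * time.Minute) // recycle before MySQL's wait_timeout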
Then use it on your handler.go file:
package main

import (
    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/jinzhu/gorm"

    "github.com/<username>/<name-of-lib>/database"
)

var db *gorm.DB

func init() {
    db = database.Open()
}

func Handler() (events.APIGatewayProxyResponse, error) {
    // You can use db here.
    return events.APIGatewayProxyResponse{
        StatusCode: 201,
    }, nil
}

func main() {
    lambda.Start(Handler)
}
Note: don't forget to replace github.com/<username>/<name-of-lib>/database with the right path.
Now, you might still see the too many connections error. If that happens you will need a connection pool.
Configuring a connection pool
From Wikipedia:
In software engineering, a connection pool is a cache of database
connections maintained so that the connections can be reused when
future requests to the database are required. Connection pools are
used to enhance the performance of executing commands on a database.
You will need a connection pool whose number of allowed connections matches the number of Lambdas running in parallel. You have two choices:
MySQL Proxy
MySQL Proxy is a simple program that sits between your client and
MySQL server(s) and that can monitor, analyze or transform their
communication. Its flexibility allows for a wide variety of uses,
including load balancing, failover, query analysis, query filtering
and modification, and many more.
AWS Aurora:
Amazon Aurora Serverless is an on-demand, auto-scaling configuration
for Amazon Aurora (MySQL-compatible edition), where the database will
automatically start up, shut down, and scale capacity up or down based
on your application's needs. It enables you to run your database in
the cloud without managing any database instances. It's a simple,
cost-effective option for infrequent, intermittent, or unpredictable
workloads.
Regardless of your choice, there are plenty of tutorials on the internet on how to configure both.

GORM pq too many connections

I am using GORM in my project. Everything was good until I got an error that said:
pq: sorry, too many clients already
I just use the default configuration. The error happened after I made a lot of test requests against my application.
The error goes away after I restart my application, so I suspect that the GORM connection is not released after I'm done with the query. I haven't dug very deep into the GORM code; I'm asking here in case someone has already run into this.
The error message you are getting is a PostgreSQL error, not a GORM one. It is caused by opening the database connection more than once.
db, err := gorm.Open("postgres", "user=gorm dbname=gorm")
It should be initialized once and referred to after that.
var (
    instance *gorm.DB
    once     sync.Once
)

func GetDB() *gorm.DB {
    once.Do(func() {
        var err error
        instance, err = gorm.Open("postgres",
            "host=localhost user=root dbname=rav password=password sslmode=disable")
        if err != nil {
            log.Println("Connection Failed to Open")
            return
        }
        log.Println("Connection Established here")
        instance.DB().SetMaxIdleConns(10)
        instance.LogMode(true)
    })
    return instance
}
Wrapping the connection in a singleton accessor like this ensures it is opened only once, no matter how many times the function gets called.
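Usage is then just a call to the accessor wherever a handle is needed; a minimal sketch, where User is a placeholder model:
func listUsers() ([]User, error) {
    var users []User
    // GetDB always returns the same *gorm.DB, so no new pool is created per call.
    if err := GetDB().Find(&users).Error; err != nil {
        return nil, err
    }
    return users, nil
}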
