As the title says I don't know if having multiple sql.Open statements is a good or bad thing or what or if I should have a file with just an init that is something like:
var db *sql.DB
func init() {
var err error
db, err = sql.Open
}
just wondering what the best practice would be. Thanks!
You should at least check the error.
As mentioned in "Connecting to a database":
Note that Open does not directly open a database connection: this is deferred until a query is made. To verify that a connection can be made before making a query, use the Ping function:
if err := db.Ping(); err != nil {
log.Fatal(err)
}
After use, the database is closed using Close.
If possible, limit the number of opened connection to a database to a minimum.
See "Go/Golang sql.DB reuse in functions":
You shouldn't need to open database connections all over the place.
The database/sql package does connection pooling internally, opening and closing connections as needed, while providing the illusion of a single connection that can be used concurrently.
As elithrar points out in the comment, database.sql/#Open does mention:
The returned DB is safe for concurrent use by multiple goroutines and maintains its own pool of idle connections.
Thus, the Open function should be called just once.
It is rarely necessary to close a DB.
As mentioned here
Declaring *sql.DB globally also have some additional benefits such as SetMaxIdleConns (regulating connection pool size) or preparing SQL statements across your application.
You can use a function init, which will run even if you don't have a main():
var db *sql.DB
func init() {
db, err = sql.Open(DBparms....)
}
init() is always called, regardless if there's main or not, so if you import a package that has an init function, it will be executed.
You can have multiple init() functions per package, they will be executed in the order they show up in the code (after all variables are initialized of course).
Related
I'm coming from a Node background and trying to get into Go, by looking at code examples.
I do find it weird that code is mostly synchronous - even things like connecting and communicating with the database, e.g.
func main() {
// Create a new client and connect to the server
client, err := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri))
if err != nil {
panic(err)
}
}
Doesn't this block the thread until DB sends back a response? If not, how is that possible?
Yeah there's this difference:
In Node everything is not blocking until you say it otherwise, await or callabck.
In Go everything is blocking until you say it otherwise, go.
I have been using GORM for my application based on AWS lambda. I used gorm.Open() for every Handler function,
db, err := gorm.Open(mysql.Open(dsn), &gorm.Config{
Logger: logger.Default.LogMode(logger.Info),
})
so can someone help me confirm that does gorm.Open(...) automatically close the connection or not? Or I must use generic database interface bellow?
// Get generic database object sql.DB to use its functions
sqlDB, err := db.DB()
// Ping
sqlDB.Ping()
// Close
sqlDB.Close()
// Returns database statistics
sqlDB.Stats()
A gorm.DB object is intended to be reused, like a sql.DB handle. You rarely have to explicitly close these objects. Just create it once and reuse it.
gorm.DB contains a sql.DB which uses a connection pool to manage the connections. If it is closed, it will stop accepting new queries, wait for running queries to finish and close all connections.
I am using the gosnowflake 1.40 driver. I am seeing my sessions cycle after 2 queries as seen in the image below, less than 1 second apart.
Connection setup looks something like this:
dsn, err := sf.DSN(sfConfig)
if err != nil {
panic("cannot get snowflake session: " + err.Error())
}
DBSession, err = sql.Open("snowflake", dsn)
if err != nil {
panic("cannot get snowflake session: " + err.Error())
}
return DBSession, nil
I use the following query pattern inside a function:
result = dbSession.QueryRow(command)
This session cycling pattern is not ideal, as I'd like to be able to assume a role and run multiple commands. Can someone point me to what I can do to make the Snowflake sessions persist? I don't have this problem using the WebUI.
DB maintains a pool of connections. Each connection in the pool will have a unique session ID. From the documentation:
DB is a database handle representing a pool of zero or more underlying connections. It's safe for concurrent use by multiple goroutines.
The sql package creates and frees connections automatically; it also maintains a free pool of idle connections.
You have a couple options for bypassing the default behavior of cycling through the pool of connections:
Obtain a specific Conn instance
from the connection pool using
DB.Conn(). The documentation
specifically states:
Queries run on the same Conn will be run in the same database session.
Modify the connection pool parameters using
DB.SetMaxOpenConns().
I suspect that setting this to 1 will also obtain the desired behavior.
However, this introduces scalability/concurrency concerns that are
addressed by having a connection pool in the first place.
Note, I'm not familiar with the Snowflake driver in particular. There may be other options that the driver supports.
As I understand in Golang: the DB handle is meant to be long-lived and shared between many goroutines.
But when I using Golang with AWS lambda, it's a very different story since lambdas are stopping the function when it's finished.
I am using:defer db.Close() in Lambda Invoke function but it isn't affect. On MySQL, it's still keep that connection as Sleep query. As a result, it causes too many connections on MySQL.
Currently, I have to set wait_timeout in MySQL to small number. But it's not the best solution, in my opinion.
Is there any way to close connection when using Go SQL driver with Lambda?
Thanks,
There are two problems that we need to address
Correctly managing state between lambda invocations
Configuring a connection pool
Correctly managing state
Let us understand a bit of how the container is managed by AWS. From the AWS docs:
After a Lambda function is executed, AWS Lambda maintains the
execution context for some time in anticipation of another Lambda
function invocation. In effect, the service freezes the execution
context after a Lambda function completes, and thaws the context for
reuse, if AWS Lambda chooses to reuse the context when the Lambda
function is invoked again.
This execution context reuse approach has the following implications:
Any declarations in your Lambda function code (outside the handler
code, see Programming Model) remains initialized, providing additional
optimization when the function is invoked again. For example, if your
Lambda function establishes a database connection, instead of
reestablishing the connection, the original connection is used in
subsequent invocations. We suggest adding logic in your code to check
if a connection exists before creating one.
Each execution context provides 500MB of additional disk space in the
/tmp directory. The directory content remains when the execution
context is frozen, providing transient cache that can be used for
multiple invocations. You can add extra code to check if the cache has
the data that you stored. For information on deployment limits, see
AWS Lambda Limits.
Background processes or callbacks initiated by your Lambda function
that did not complete when the function ended resume if AWS Lambda
chooses to reuse the execution context. You should make sure any
background processes or callbacks (in case of Node.js) in your code
are complete before the code exits.
This first bullet point says that state is maintained between executions. Let us see this in action:
let counter = 0
module.exports.handler = (event, context, callback) => {
counter++
callback(null, { count: counter })
}
If you deploy this and call multiple times consecutively you will see that the counter will be incremented between calls.
Now that you know that - you should not call defer db.Close(), instead you should be reusing the database instance. You can do that by simply making db a package level variable.
First, create a database package that will export an Open function:
package database
import (
"fmt"
"os"
_ "github.com/go-sql-driver/mysql"
"github.com/jinzhu/gorm"
)
var (
host = os.Getenv("DB_HOST")
port = os.Getenv("DB_PORT")
user = os.Getenv("DB_USER")
name = os.Getenv("DB_NAME")
pass = os.Getenv("DB_PASS")
)
func Open() (db *gorm.DB) {
args := fmt.Sprintf("%s:%s#tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
// Initialize a new db connection.
db, err := gorm.Open("mysql", args)
if err != nil {
panic(err)
}
return
}
Then use it on your handler.go file:
package main
import (
"context"
"github.com/aws/aws-lambda-go/events"
"github.com/aws/aws-lambda-go/lambda"
"github.com/jinzhu/gorm"
"github.com/<username>/<name-of-lib>/database"
)
var db *gorm.DB
func init() {
db = database.Open()
}
func Handler() (events.APIGatewayProxyResponse, error) {
// You can use db here.
return events.APIGatewayProxyResponse{
StatusCode: 201,
}, nil
}
func main() {
lambda.Start(Handler)
}
OBS: don't forget to replace github.com/<username>/<name-of-lib>/database with the right path.
Now, you might still see the too many connections error. If that happens you will need a connection pool.
Configuring a connection pool
From Wikipedia:
In software engineering, a connection pool is a cache of database
connections maintained so that the connections can be reused when
future requests to the database are required. Connection pools are
used to enhance the performance of executing commands on a database.
You will need a connection pool that the number of allowed connections must be equal to the number of parallel lambdas running, you have two choices:
MySQL Proxy
MySQL Proxy is a simple program that sits between your client and
MySQL server(s) and that can monitor, analyze or transform their
communication. Its flexibility allows for a wide variety of uses,
including load balancing, failover, query analysis, query filtering
and modification, and many more.
AWS Aurora:
Amazon Aurora Serverless is an on-demand, auto-scaling configuration
for Amazon Aurora (MySQL-compatible edition), where the database will
automatically start up, shut down, and scale capacity up or down based
on your application's needs. It enables you to run your database in
the cloud without managing any database instances. It's a simple,
cost-effective option for infrequent, intermittent, or unpredictable
workloads.
Regardless of your choice, there are plenty of tutorials on the internet on how to configure both.
Hypothetically speaking, is it good practice to connect to a database for each request and close in when the request has completed?
I'm using mongodb with mgo for the database.
In my project, I would like to connect to a certain database by getting the database name from the request header (of course, this is combined with an authentication mechanism, e.g. JWT in my app). The flow goes something like:
User authentication:
POST to http://api.app.com/authenticate
// which checks the user in a "global" database,
// authenticates them and returns a signed JWT token
// The token is stored in bolt.db for the authentication mechanism
Some RESTful operations
POST to http://api.app.com/v1/blog/posts
// JWT middleware for each request to /v1* is set up
// `Client-Domain` in header is set to a database's name, e.g 'app-com'
// so we open a connection to that database and close when
// request finishes
So my questions are:
Is this feasible? - I've read about connection pools and reusing them but I haven't read much about them yet
Is there a better way of achieving the desired functionality?
How do I ensure the session is only closed when the request has completed?
The reason why I need to do this is because we have multiple vendors that have the same database collections with different entries with restricted access to their own databases.
Update / Solution
I ended up using Go's built in Context by Copying a session and using it anywhere I need to do any CRUD ops
Something like:
func main() {
...
// Configure connection and set in global var
model.DBSession, err = mgo.DialWithInfo(mongoDBDialInfo)
defer model.DBSession.Close()
...
n := negroni.Classic()
n.Use(negroni.HandlerFunc(Middleware))
...
}
func Middleware(res http.ResponseWriter, req *http.Request, next http.HandlerFunc) {
...
db := NewDataStore(clientDomain)
// db.Close() is an alias for ds.session.Close(), code for this function is not included in this post
// Im still experimenting with this, I need to make sure the session is only closed after a request has completed, currently it does not always do so
defer db.Close()
ctx := req.Context()
ctx = context.WithValue(ctx, auth.DataStore, db)
req = req.WithContext(ctx)
...
}
func NewDataStore(db string) *DataStore {
store := &DataStore{
db: DBSession.Copy().DB(db),
session: DBSession.Copy(),
}
return store
}
And then use it in a HandlerFunc, example /v1/system/users:
func getUsers(res http.ResponseWriter, req *http.Request) {
db := req.Context().Value(auth.DataStore).(*model.DataStore)
users := make([]SystemUser{}, 0)
// db.C() is an alias for ds.db.C(), code for this function is not included in this post
db.C("system_users").Find(nil).All(&users)
}
40% response time decrease over the original method I experimented with.
Hypothetically speaking is not a good practice because:
The database logic is scattered among several packages.
It's difficult to test
You can't apply DI (mainly it will be hard to maintain the code)
Replying to your questions:
Yes is feasible BUT you will not use the connection pool inside them go package (take a look to the code here if you want know more about Connection Pool)
A better way is to create a global variable that contains the database connection and close when the application is going to stop (and not close the connection every request)
How do I ensure the session is only closed when the request has complete<- you should checkout the answer fro your db query and then close the connection (but I don't recommend to close the connection after a request because you'll need to open again for another request and close again etc...)