I am connecting my Go scripts to Redshift using the Go Postgres driver. When a query takes 5+ minutes to complete, my program never regains control. Checking on the Redshift server, I can see that the query completed in ~7 minutes.
I'm not sure why this is happening.
My code
func truncate_and_populate_set_1(parameter string) {
insert_q := `...`
db := GetDB() // the handle is obtained here, so it is not also taken as a parameter
util.ExeQ(db, insert_q)
log.Println("Done adding records to table")
}
func GetDB() *sql.DB {
connection_string := "postgres://%s:%s@host"
db, err := sql.Open("postgres", connection_string)
if err != nil {
fmt.Println(err)
}
return db
}
func ExeQ(db *sql.DB, query string) {
_, err := db.Exec(query)
if err != nil {
log.Fatal(err)
}
}
You need to alter the keep-alive behavior of the library that's managing the Redshift connection. Unfortunately, I can't advise you on how to do that in Go.
For a JDBC URL you could append the options:
jdbc:redshift://my-cluster … :5439/user?tcpKeepAlive=true&TCPKeepAliveMinutes=2
See the documentation here for more options: http://docs.aws.amazon.com/redshift/latest/mgmt/troubleshooting-connections.html
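For Go, here is a minimal sketch of the same idea. In the standard library, the TCP keep-alive interval is set through the KeepAlive field of net.Dialer. How that dialer is handed to the Postgres driver depends on the driver and is not shown; it is an assumption that your driver offers a custom-dialer hook. The helper names keepAliveDialer and dialLoopback are hypothetical, and the loopback listener only stands in for the Redshift endpoint.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// keepAliveDialer returns a dialer that sends TCP keep-alive probes at the
// given interval, so an idle connection is less likely to be dropped by
// intermediate network devices while a long query runs.
func keepAliveDialer(interval time.Duration) *net.Dialer {
	return &net.Dialer{
		Timeout:   10 * time.Second, // connect timeout, not a query timeout
		KeepAlive: interval,
	}
}

// dialLoopback opens a throwaway local listener and connects to it with the
// keep-alive dialer. It stands in for the real cluster endpoint.
func dialLoopback() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()

	conn, err := keepAliveDialer(2 * time.Minute).Dial("tcp", ln.Addr().String())
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if err := dialLoopback(); err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	fmt.Println("connected with keep-alive enabled")
}
```

The two-minute interval mirrors TCPKeepAliveMinutes=2 from the JDBC URL above; the right value depends on the idle timeout of whatever sits between the client and the cluster.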
I'm doing an online course on Golang. The following piece of code is presented in the course material as an example of misuse of sync.Once:
var (
once sync.Once
db *sql.DB
)
func DbOnce() (*sql.DB, error) {
var err error
once.Do(func() {
fmt.Println("Am called")
db, err = sql.Open("mysql", "root:test@tcp(127.0.0.1:3306)/test")
if err != nil {
return
}
err = db.Ping()
})
if err != nil {
return nil, err
}
return db, nil
}
Supposedly, the above is a faulty implementation of an SQL connection manager. We, the students, are to find the error ourselves, which I struggle with. The code runs fine even in parallel. This is how I used it:
func main() {
wg := sync.WaitGroup{}
wg.Add(10)
for i := 0; i < 10; i++ {
go (func() {
db, err := DbOnce()
if err != nil {
panic(err)
}
var v int
r := db.QueryRow("SELECT 1")
err = r.Scan(&v)
fmt.Println(v, err)
wg.Done()
})()
}
wg.Wait()
}
I understand that homework questions are discouraged here, so I'm not asking for a complete solution, just a hint would be fine. Is the error related to concurrency (i.e. I need to run it in a specific concurrent context)? Is it usage of sql.Open specifically?
Initialization of the db variable is OK. The problem is with the returned error.
If you call DbOnce() for the first time and opening a DB connection fails, that error will be returned properly. But what about subsequent calls? The db initialization code will not be run again, so nil db may be returned, and since the initialization code is not run, the default value of the err variable is returned, which will be nil. To sum it up, the initialization error is lost and will not be reported anymore.
One solution is to stop the app if connection fails (at the first call). Another option is to store the initialization error too in a package level variable along with db, and return that from DbOnce() (and not use a local variable for that). The former has the advantage that you don't have to handle errors returned from DbOnce(), as it doesn't even have to return an error (if there's an error, your app will terminate).
The latter could look like this:
var (
once sync.Once
db *sql.DB
dbErr error
)
func DbOnce() (*sql.DB, error) {
once.Do(func() {
fmt.Println("Am called")
db, dbErr = sql.Open("mysql", "root:test@tcp(127.0.0.1:3306)/test")
if dbErr != nil {
return
}
dbErr = db.Ping()
})
return db, dbErr
}
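To see the lost error without a database, here is a small self-contained sketch. The conn type and the always-failing openConn function are hypothetical stand-ins for *sql.DB and sql.Open; only the sync.Once behavior is the point.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn stands in for *sql.DB so the example runs without a database.
type conn struct{}

// openConn is a hypothetical opener that always fails, simulating an
// unreachable server.
func openConn() (*conn, error) {
	return nil, errors.New("connection refused")
}

var (
	buggyOnce sync.Once
	buggyConn *conn
)

// buggyGet mirrors the course example: err is a local variable, so only the
// call that actually runs the once.Do body can ever see the failure.
func buggyGet() (*conn, error) {
	var err error
	buggyOnce.Do(func() {
		buggyConn, err = openConn()
	})
	return buggyConn, err
}

var (
	fixedOnce sync.Once
	fixedConn *conn
	fixedErr  error
)

// fixedGet keeps the error at package level, so every caller sees it.
func fixedGet() (*conn, error) {
	fixedOnce.Do(func() {
		fixedConn, fixedErr = openConn()
	})
	return fixedConn, fixedErr
}

func main() {
	_, err1 := buggyGet()
	c2, err2 := buggyGet()
	fmt.Println(err1, c2 == nil, err2) // connection refused true <nil>

	_, err3 := fixedGet()
	_, err4 := fixedGet()
	fmt.Println(err3, err4) // connection refused connection refused
}
```

The second call to buggyGet returns a nil handle together with a nil error, which is exactly the case a caller cannot detect.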
I want to test an API that interacts with Cassandra on my local machine. In my func TestMain(m *testing.M) function, I want to clear the tables before running the tests. The TestMain function looks like this:
func TestMain(m *testing.M) {
keyspace = "staging"
cassandra.SetKeyspace(keyspace)
client = http.DefaultClient
// Ensure all tables are empty before tests run
err := ClearAllTables()
if err != nil {
logrus.Errorf("Failed to clear all tables: %v.", err)
os.Exit(1)
}
// Run tests
statusCode := m.Run()
os.Exit(statusCode)
}
The ClearAllTables function looks like this...
func ClearAllTables() (err error) {
// Create a DB session
session := cassandra.CreateSession()
defer session.Close()
// Get names of all existing tables
tableNameList := []string{"cliques", "users"}
// Remove all rows from each table
for _, tableName := range tableNameList {
if err := session.Query(`TRUNCATE TABLE ` + tableName).Exec(); err != nil {
return err
}
}
return nil
}
For some reason, when I try to TRUNCATE the table, Cassandra times out and I get this error:
level=error msg="Failed to clear all tables: gocql: no response received from cassandra within timeout period."
This only seems to happen when I test the program. A snippet of code I wrote in the main function works fine:
func main () {
session := cassandra.CreateSession()
defer session.Close()
if err := session.Query(`TRUNCATE TABLE cliques`).Exec(); err != nil {
return
}
fmt.Println("Table truncated") //works
}
I also wrote a snippet of code that returns some rows from the database, and that also works fine in the main function.
Here is how I create my Cassandra session:
// CreateSession connect to a cassandra instance
func CreateSession() *gocql.Session {
// Connect to the cluster
cluster := gocql.NewCluster("127.0.0.1")
cluster.Keyspace = "staging"
cluster.ProtoVersion = 4
cluster.CQLVersion = "3.0.0"
cluster.Consistency = gocql.One
session, err := cluster.CreateSession()
if err != nil {
logrus.Errorf("Failed to connect to Cassandra: %v.", err)
}
return session
}
Am I missing anything? Why would Cassandra work fine with go run main.go but not with go test?
I have a set of functions in my web API app. They perform some operations on the data in the Postgres database.
func CreateUser () {
db, err := sql.Open("postgres", "user=postgres password=password dbname=api_dev sslmode=disable")
// Do some db operations here
}
I suppose functions should work with db independently from each other, so now I have sql.Open(...) inside each function. I don't know if it's a correct way to manage db connection.
Should I open it somewhere once the app starts and pass db as an argument to the corresponding functions instead of opening the connection in every function?
Opening a db connection every time it's needed is a waste of resources and it's slow.
Instead, you should create an sql.DB once, when your application starts (or on first demand), and either pass it where it is needed (e.g. as a function parameter or via some context), or simply make it a global variable and so everyone can access it. It's safe to call from multiple goroutines.
Quoting from the doc of sql.Open():
The returned DB is safe for concurrent use by multiple goroutines and maintains its own pool of idle connections. Thus, the Open function should be called just once. It is rarely necessary to close a DB.
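As a sketch of the "pass it where it is needed" option: the function can accept a small interface that *sql.DB already satisfies, which also makes it testable without a server. The execer interface, the recorder test double, and the users table with its name column are all hypothetical names chosen for illustration.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// execer is the small slice of *sql.DB that CreateUser needs.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// Compile-time check that *sql.DB satisfies execer.
var _ execer = (*sql.DB)(nil)

// CreateUser receives the shared handle instead of opening its own.
// The users table and its name column are hypothetical.
func CreateUser(db execer, name string) error {
	_, err := db.Exec(`INSERT INTO users (name) VALUES ($1)`, name)
	return err
}

// recorder is a test double that remembers the statements it was given.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string, args ...interface{}) (sql.Result, error) {
	r.queries = append(r.queries, query)
	return driver.RowsAffected(1), nil
}

// demo runs CreateUser against the recorder and reports how many
// statements were issued.
func demo() (int, error) {
	rec := &recorder{}
	if err := CreateUser(rec, "alice"); err != nil {
		return 0, err
	}
	return len(rec.queries), nil
}

func main() {
	n, err := demo()
	fmt.Println(n, err) // 1 <nil>
}
```

In the real application the caller would pass the single *sql.DB created at start-up; nothing in CreateUser changes.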
You may use a package init() function to initialize it:
var db *sql.DB
func init() {
var err error
db, err = sql.Open("yourdriver", "yourDs")
if err != nil {
log.Fatal("Invalid DB config:", err)
}
}
One thing to note here is that sql.Open() may not create an actual connection to your DB, it may just validate its arguments. To test if you can actually connect to the db, use DB.Ping(), e.g.:
func init() {
var err error
db, err = sql.Open("yourdriver", "yourDs")
if err != nil {
log.Fatal("Invalid DB config:", err)
}
if err = db.Ping(); err != nil {
log.Fatal("DB unreachable:", err)
}
}
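The difference between sql.Open() and DB.Ping() can be observed with a stub driver that counts how many physical connections database/sql requests. This is only a sketch: the "stub" driver, its types, and the opensAfter helper are made up for the demonstration, and a real driver may behave differently in detail.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// stubDriver counts how many physical connections database/sql asks for.
type stubDriver struct{ opens int32 }

func (d *stubDriver) Open(dsn string) (driver.Conn, error) {
	atomic.AddInt32(&d.opens, 1)
	return stubConn{}, nil
}

// stubConn is a minimal driver.Conn; it cannot run queries.
type stubConn struct{}

func (stubConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("stub: no queries") }
func (stubConn) Close() error                        { return nil }
func (stubConn) Begin() (driver.Tx, error)           { return nil, errors.New("stub: no transactions") }

var stub = &stubDriver{}

func init() { sql.Register("stub", stub) }

// opensAfter reports how many connections were opened by sql.Open alone,
// and how many had been opened once Ping returned.
func opensAfter() (afterOpen, afterPing int32, err error) {
	base := atomic.LoadInt32(&stub.opens)
	db, err := sql.Open("stub", "ignored-dsn")
	if err != nil {
		return 0, 0, err
	}
	defer db.Close()
	afterOpen = atomic.LoadInt32(&stub.opens) - base
	if err = db.Ping(); err != nil {
		return afterOpen, 0, err
	}
	afterPing = atomic.LoadInt32(&stub.opens) - base
	return afterOpen, afterPing, nil
}

func main() {
	fmt.Println(opensAfter()) // 0 1 <nil>
}
```

With this stub, sql.Open returns without opening anything, and the first connection is only made when Ping asks for one.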
I will use a postgres example
package main
import necessary packages and don't forget the postgres driver
import (
"database/sql"
_ "github.com/lib/pq" //postgres driver
)
initialize your connection in the package scope
var db *sql.DB
have an init function for your connection
func init() {
var err error
db, err = sql.Open("postgres", "connectionString")
// connectionString example => 'postgres://username:password@localhost/dbName?sslmode=disable'
if err != nil {
panic(err)
}
err = db.Ping()
if err != nil {
panic(err)
}
// Note: we don't defer db.Close() in init, because that would close the connection as soon as init returns. Close it in main, or omit it.
}
main function
func main() {
defer db.Close() //optional
//run your db functions
}
Check out this example:
https://play.golang.org/p/FAiGbqeJG0H
I'm new to Go. I have the following legacy code:
var db *sql.DB
func init() {
go feedChan()
connString := os.Getenv("DB_CONN")
var err error
db, err = sql.Open("postgres", connString)
if err != nil {
log.Fatalf("Failed to connect to database at %q: %q\n", connString, err)
}
// confirm connection
if err = db.Ping(); err != nil {
log.Fatalf("Unable to ping database at %q: %q\n", connString, err)
}
}
func feedChan() {
selectQuery, err := db.Prepare(`
SELECT id, proxy
FROM proxy
WHERE fail_count < 2
ORDER BY date_added DESC, last_used ASC, fail_count ASC
LIMIT 5
`)
....
This code works on Linux, but on Windows it fails with a nil pointer error on
selectQuery, err := db.Prepare(`
That makes sense to me, since db is initialized after the feedChan goroutine is launched. What doesn't make sense is why it works on Linux.
So the question is: why does this code work on Linux without errors?
That's probably a race condition. Import "time", put this line after go feedChan(), and see if it still works on Linux:
time.Sleep(3 * time.Second)
In order to avoid this situation, you could either initialize db before you spawn the routine (which uses db) or use some sort of barrier:
func init() {
barrier := make(chan int)
go feedChan(barrier)
connString := os.Getenv("DB_CONN")
var err error
db, err = sql.Open("postgres", connString)
if err != nil {
log.Fatalf("Failed to connect to database at %q: %q\n", connString, err)
// Retry.
} else {
barrier <- 1 // Opens barrier.
}
// ...
}
func feedChan(barrier chan int) {
<-barrier // Blocks until db is ready.
selectQuery, err := db.Prepare(`
SELECT id, proxy
FROM proxy
WHERE fail_count < 2
ORDER BY date_added DESC, last_used ASC, fail_count ASC
LIMIT 5
`)
// ...
}
After reading the first lines of your functions, I can say that your legacy code has a serious bug, and it can be fixed easily by moving the line go feedChan() to the end of the init() function.
Also note that the main issue is not a race condition as such; the goroutine simply has to wait until the db variable has been initialized correctly.
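The reordering can be sketched without a database. Here a hypothetical resource type stands in for *sql.DB, and feed and start are made-up names. Under the Go memory model, the go statement is ordered before the start of the goroutine it launches, so anything assigned before that statement is visible inside the goroutine.

```go
package main

import "fmt"

// resource stands in for *sql.DB.
type resource struct{ name string }

var db *resource

// feed dereferences the package-level db, as feedChan does in the question.
func feed(out chan<- string) {
	out <- db.name
}

// start initializes db first and only then launches the goroutine, so feed
// is guaranteed to observe the assignment.
func start() string {
	db = &resource{name: "ready"}
	out := make(chan string)
	go feed(out)
	return <-out
}

func main() {
	fmt.Println(start()) // ready
}
```

If the go statement came before the assignment, feed could run while db is still nil, which is the failure seen on Windows.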
I've an app that uses net/http. I register some handlers with http that need to fetch some stuff from a database before we can proceed to writing the response and be done with the request.
My question is about the best practice for connecting to this database. I want this to work whether the app gets one request per minute or 10 requests per second.
I could connect to database within each handler every time a request comes in. (This would spawn a connection to mysql for each request?)
package main
import (
"database/sql"
_ "github.com/go-sql-driver/mysql"
"net/http"
"fmt"
)
func main() {
http.HandleFunc("/",func(w http.ResponseWriter, r *http.Request) {
db, err := sql.Open("mysql","dsn....")
if err != nil {
panic(err)
}
defer db.Close()
row := db.QueryRow("select...")
// scan row
fmt.Fprintf(w,"text from database")
})
http.ListenAndServe(":8080",nil)
}
I could connect to database at app start. Whenever I need to use the database I Ping it and if it's closed I reconnect to it. If it's not closed I continue and use it.
package main
import (
"database/sql"
_ "github.com/go-sql-driver/mysql"
"net/http"
"fmt"
"sync"
)
var db *sql.DB
var mutex sync.RWMutex
func GetDb() *sql.DB {
mutex.Lock()
defer mutex.Unlock()
err := db.Ping()
if err != nil {
db, err = sql.Open("mysql","dsn...")
if err != nil {
panic(err)
}
}
return db
}
func main() {
var err error
db, err = sql.Open("mysql","dsn....")
if err != nil {
panic(err)
}
http.HandleFunc("/",func(w http.ResponseWriter, r *http.Request) {
row := GetDb().QueryRow("select...")
// scan row
fmt.Fprintf(w,"text from database")
})
http.ListenAndServe(":8080",nil)
}
Which of these ways is best, or is there a better way? Is it a bad idea to have multiple requests use the same database connection?
It's unlikely I will create an app that runs into the MySQL connection limit, but I don't want to ignore the fact that there's a limit.
The best way is to create the database handle once at app start-up and use that handle afterwards. Additionally, the sql.DB type is safe for concurrent use, so you don't even need mutexes to guard it. Finally, depending on your driver, the database handle will reconnect automatically, so you don't need to do that yourself.
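A minimal sketch of that setup, with one handle shared by every request and a cap on the pool size for the connection-limit concern. To keep it runnable without MySQL, it registers a made-up "stub" driver and drives the handler through httptest; the app type, serveConcurrently, and the limit of 5 are illustrative assumptions.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// stubDriver lets the example run without a MySQL server.
type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) { return stubConn{}, nil }

type stubConn struct{}

func (stubConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("stub: no queries") }
func (stubConn) Close() error                        { return nil }
func (stubConn) Begin() (driver.Tx, error)           { return nil, errors.New("stub: no transactions") }

func init() { sql.Register("stub", stubDriver{}) }

// app holds the one handle that every request shares.
type app struct{ db *sql.DB }

func (a *app) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// No mutex around db: *sql.DB manages its own connection pool.
	if err := a.db.PingContext(r.Context()); err != nil {
		http.Error(w, "database unavailable", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprint(w, "text from database")
}

// serveConcurrently fires n parallel requests at one shared handle and
// returns how many were answered with 200 OK.
func serveConcurrently(n int) (int, error) {
	db, err := sql.Open("stub", "ignored-dsn")
	if err != nil {
		return 0, err
	}
	defer db.Close()
	db.SetMaxOpenConns(5) // keep the pool below the server's connection limit

	h := &app{db: db}
	var wg sync.WaitGroup
	var ok int32 // counts successful responses, updated atomically
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			rec := httptest.NewRecorder()
			h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
			if rec.Code == http.StatusOK {
				atomic.AddInt32(&ok, 1)
			}
		}()
	}
	wg.Wait()
	return int(ok), nil
}

func main() {
	fmt.Println(serveConcurrently(20)) // 20 <nil>
}
```

Requests beyond the pool limit wait for a free connection rather than opening new ones, so the handler code stays the same at one request per minute or many per second.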
var db *sql.DB
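// store is the package-level wrapper; it is named differently from the
// Database type so the two declarations do not collide.
var store *Database
func init(){
hostName := os.Getenv("DB_HOST")
port := os.Getenv("DB_PORT")
username := os.Getenv("DB_USER")
password := os.Getenv("DB_PASS")
database := os.Getenv("DB_NAME")
var err error
// os.Getenv returns strings, so the port is formatted with %s.
db, err = sql.Open("mysql", fmt.Sprintf("%s:%s@tcp(%s:%s)/%s", username, password, hostName, port, database))
if err != nil {
panic(err)
}
// Do not defer db.Close() here: it would close the handle as soon as init returns.
err = db.Ping()
if err != nil {
panic(err)
}
// Plain assignment, so the package-level variable is set rather than shadowed.
store = &Database{conn: db}
}
type Database struct {
conn *sql.DB
}
func (d *Database) GetConn() *sql.DB {
return d.conn
}
func main() {
row := store.GetConn().QueryRow("select * from")
_ = row // scan the row here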
}
I'd recommend making the connection to your database in init().
Why? Because init() is guaranteed to run before main(), and you definitely want to make sure your db configuration is set up correctly before the real work begins.
var db *sql.DB
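func GetDb() (*sql.DB, error) {
// Use a local handle here; the package-level db is assigned in init.
conn, err := sql.Open("mysql","dsn...")
if err != nil {
return nil, err
}
return conn, nil
}
func init() {
var err error
// Plain assignment, so the package-level db is set rather than shadowed.
db, err = GetDb()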
if err != nil {
panic(err)
}
err = db.Ping()
if err != nil {
panic(err)
}
}
I did not test the code above but it should technically look like this.