Multiple queries to Postgres within the same function - go

I'm new to Go, so sorry for the silly question in advance!
I'm using the Gin framework and want to make multiple queries to the database within the same handler (database/sql + lib/pq):
userIds := []int{}
bookIds := []int{}
var id int

/* Handling first query here */
rows, err := pgClient.Query(getUserIdsQuery)
defer rows.Close()
if err != nil {
    return
}
for rows.Next() {
    err := rows.Scan(&id)
    if err != nil {
        return
    }
    userIds = append(userIds, id)
}

/* Handling second query here */
rows, err = pgClient.Query(getBookIdsQuery)
defer rows.Close()
if err != nil {
    return
}
for rows.Next() {
    err := rows.Scan(&id)
    if err != nil {
        return
    }
    bookIds = append(bookIds, id)
}
I have a couple of questions regarding this code (any improvements and best practices would be appreciated):
Does Go properly handle defer rows.Close() in such a case? I mean, I reassign the rows variable further down the code, so will the compiler track both and properly close them at the end of the function?
Is it OK to reuse the shared id variable, or should I redeclare it inside each rows.Next() loop?
What's a better approach when there are even more queries within one handler? Should I have some kind of writer that accepts a query and a slice and populates the slice with the IDs retrieved?
Thanks.

I've never worked with the go-pg library, and my answer mostly focuses on the other points, which are generic and not specific to Go or go-pg.
Regardless of the fact that rows here is the same variable shared between the two queries (so one rows.Close() call would suffice, unless the library has some special implementation), defining two variables, like userRows and bookRows, is cleaner.
Although, as I said, I have not worked with go-pg, I believe you won't need to iterate through the rows and scan the id manually; the library appears to provide an API like this (based on a quick look at the documentation):
userIds := []int{}
err := pgClient.Query(&userIds, "select id from users where ...", args...)
Regarding your second question, it depends on what you mean by "ok". Since you're iterating synchronously, I don't think it would result in bugs, but when it comes to coding style, I personally wouldn't do this.
I think that the best thing to do in your case is this:
// repo layer
func getUserIds(args whatever) ([]int, error) {...}
// these can be exposed, based on your packaging logic
func getBookIds(args whatever) ([]int, error) {...}

// service layer, or wherever you want to aggregate both queries
func getUserAndBookIds() ([]int, []int, error) {
    userIds, err := getUserIds(...)
    // potential error handling
    bookIds, err := getBookIds(...)
    // potential error handling
    return userIds, bookIds, nil // you have done err handling earlier
}
I think this code is easier to read/maintain. You won't face the variable reassignment and other issues.
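If you stay on plain database/sql + lib/pq (as in your snippet), such a repo-layer function could look roughly like this; it's just a sketch, with getUserIdsQuery assumed to be the query you already have:
func getUserIds(db *sql.DB) ([]int, error) {
    rows, err := db.Query(getUserIdsQuery)
    if err != nil {
        return nil, err
    }
    defer rows.Close() // safe: rows is non-nil because err was checked first

    var ids []int
    for rows.Next() {
        var id int // declared per iteration, so no shared variable is reused
        if err := rows.Scan(&id); err != nil {
            return nil, err
        }
        ids = append(ids, id)
    }
    // rows.Err reports any error encountered during iteration
    return ids, rows.Err()
}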
You can take a look at the go-pg documentation for more details on how to improve your query.

Related

How to avoid code-duplication with open/close database use-case (context management)?

Just getting started with Go and I'm wondering about the following situation:
I have a pretty simple codebase where I just want to open/close a database connection and execute a simple query. I can do this as follows (showing only the important bits here):
import (
    "database/sql"

    _ "github.com/lib/pq"
)

func (db *Database) ExecQueryA() {
    dbConn, err := sql.Open("postgres", db.psqlconn)
    if err != nil {
        panic(err)
    }
    defer dbConn.Close()

    _, err = dbConn.Exec(...
    if err != nil {
        panic(err)
    }
}
The above idea works fine, but if I want to write x more of these functions, I do not want to duplicate this part:
dbConn, err := sql.Open("postgres", db.psqlconn)
if err != nil {
    panic(err)
}
defer dbConn.Close()
at the start of each function (i.e. I want to avoid code duplication). In Python I would write a context manager for this, i.e. I would use a with statement which would open and close the database connection for me. When using Go, what is the best way to avoid code duplication in this use case?
As Brits points out in the comments on your question, *sql.DB does not need to be opened and closed every time you intend to use it. Instead, a single shared instance of *sql.DB, opened once at the launch of your app, is a common and recommended practice:
... the Open function should be called just once. It is rarely necessary to close a DB.
Note that *sql.DB is not a connection; instead, it is a pool that manages multiple connections: it opens as many as necessary (and possible), keeps idle ones around when useful, frees them when not, and so on. Most of all, it is safe for concurrent use:
DB is a database handle representing a pool of zero or more underlying
connections. It's safe for concurrent use by multiple goroutines.
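For illustration, here is a minimal sketch of that shared-pool approach; the Database struct and its fields are assumptions rather than your exact types, and the imports are the same as in your snippet:
type Database struct {
    conn *sql.DB // shared pool, created once at startup
}

func NewDatabase(psqlconn string) (*Database, error) {
    conn, err := sql.Open("postgres", psqlconn)
    if err != nil {
        return nil, err
    }
    // sql.Open may not actually connect, so Ping verifies the DSN works.
    if err := conn.Ping(); err != nil {
        return nil, err
    }
    return &Database{conn: conn}, nil
}

func (db *Database) ExecQueryA(query string) error {
    // Reuse the shared pool; no per-call Open/Close.
    _, err := db.conn.Exec(query)
    return err
}
With this, every ExecQueryX method just uses db.conn and the duplicated open/close block disappears.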
To answer your actual question, one pattern to reduce the repetition of obtaining-and-releasing resources is to pass a function literal to a wrapper function:
func (db *Database) run(f func(c *sql.DB)) {
    c, err := sql.Open("postgres", db.psqlconn)
    if err != nil {
        panic(err)
    }
    defer c.Close()
    f(c)
}
func (db *Database) ExecQueryA() {
    db.run(func(c *sql.DB) {
        _, err := c.Exec(...
        if err != nil {
            panic(err)
        }
    })
}
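As a variation, if you prefer propagating errors over panicking, the same wrapper works with an error return; a sketch, keeping your Database and psqlconn as-is:
func (db *Database) run(f func(c *sql.DB) error) error {
    c, err := sql.Open("postgres", db.psqlconn)
    if err != nil {
        return err
    }
    defer c.Close()
    return f(c)
}

func (db *Database) ExecQueryA() error {
    return db.run(func(c *sql.DB) error {
        _, err := c.Exec("...") // query elided, as in the original example
        return err
    })
}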

Return row ID created by CockroachDB after a transaction

After executing a CockroachDB transaction that creates a row, I would like to return that row's id to my application.
I am currently defining a transaction using the Go database/sql package. I then use the crdb client to execute the transaction against CockroachDB.
I cannot return anything from the crdb transaction except for an error relating to the transaction itself. How do people retrieve their most recently created rows if the interface does not provide a way to return a new row id?
I have a transaction that executes two inserts. I get the resulting ids of those inserts and return a response with both of those id values:
func myTransaction(ctx context.Context, tx *sql.Tx) (MyTxnResponse, error) {
    var aResult string
    if err := tx.QueryRowContext(ctx, `INSERT INTO foo (a,b,c) VALUES (1,2,3) RETURNING foo_id`).Scan(&aResult); err != nil { ... }

    var bResult string
    if err := tx.QueryRowContext(ctx, `INSERT INTO bar (d,e,f) VALUES (4,5,6) RETURNING bar_id`).Scan(&bResult); err != nil { ... }

    return MyTxnResponse{A: aResult, B: bResult}, nil
}
Now, I want to execute the transaction against CockroachDB:
var responseStruct MyTxnResponse
err := crdb.ExecuteTx(context.Background(), db, nil, func(tx *sql.Tx) error {
    res, err := myTransaction(ctx, tx)
    // This is what I am currently doing... it seems bad
    responseStruct.A = res.A
    responseStruct.B = res.B
    return err
})
// ... use responseStruct for something
I am able to get the result of the transaction by using this external-variable solution and just mutating its state in the closure, but that seems like a really bad pattern. I'm sure I'm missing something that will blow up in my face if I keep this pattern.
Is there a standard way to return something from a transaction?
You've got the right idea: declare variables outside the transaction and assign them inside the transaction closure. This is the idiomatic way to "return" things in Go from a closure like this, since the only alternative is to have the function return an interface{} which would need a runtime cast to be useful.
There are a couple of shortcuts available here: since responseStruct and res are the same type, you don't need to list all the fields. You can simply say responseStruct = res.
You can even assign the result of myTransaction directly to responseStruct, avoiding the temporary res entirely. However, you have to be careful with your variable scopes here - you can't use := because that would create a new responseStruct inside the closure. Instead, you must declare all other variables (here, err) so that you can use the plain = assignment:
var responseStruct MyTxnResponse
err := crdb.ExecuteTx(context.Background(), db, nil, func(tx *sql.Tx) error {
    var err error
    responseStruct, err = myTransaction(ctx, tx)
    return err
})
// use responseStruct (but only if err is nil)

Spawning go routines in a loop with closure

I have a list of strings which can contain anywhere from 1 to 100,000 elements. I want to verify each string and see if it is stored in a database, which requires a network call.
In order to maximize efficiency, I want to spawn a goroutine for each element.
The goal is to return false if one of the verifications inside the goroutine returns an error, and true if there is no error. So if we find at least one error we can stop, since we already know the result is going to be false.
This is the basic idea, and the function below is the structure I've been thinking about using so far. I'd like to know if there is a better way (perhaps using channels?):
for _, id := range userIdList {
    go func(id string) {
        user, err := verifyId(id)
        if err != nil {
            return err
        }
        // ...
        // few more calls to other APIs for verifications
        if err != nil {
            return err
        }
    }(id)
}
I have written a small function that might be helpful for you.
Please take a look at limited parallel operations.
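For the stop-on-first-error behaviour the question asks about, one common approach is golang.org/x/sync/errgroup; here is a rough sketch, assuming verifyId is the function from the question:
import (
    "context"

    "golang.org/x/sync/errgroup"
)

func verifyAll(ctx context.Context, userIdList []string) (bool, error) {
    g, ctx := errgroup.WithContext(ctx)
    for _, id := range userIdList {
        id := id // capture the loop variable for the goroutine
        g.Go(func() error {
            // If another goroutine already failed, the group context is
            // cancelled, so bail out without doing the network call.
            select {
            case <-ctx.Done():
                return ctx.Err()
            default:
            }
            if _, err := verifyId(id); err != nil {
                return err // the first error cancels the group context
            }
            // ... few more calls to other APIs for verifications
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        return false, err
    }
    return true, nil
}
With 100,000 elements you may still want to bound how many goroutines run at once (for example with a worker pool or a semaphore channel) rather than spawning one per element.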

Why is RethinkDB very slow?

I am getting started with RethinkDB; I have never used it before. I'm giving it a try together with Gorethink, following this tutorial.
To sum up this tutorial, there are two programs:
The first one updates entries infinitely:
for {
    var scoreentry ScoreEntry
    pl := rand.Intn(1000)
    sc := rand.Intn(6) - 2

    res, err := r.Table("scores").Get(strconv.Itoa(pl)).Run(session)
    if err != nil {
        log.Fatal(err)
    }
    err = res.One(&scoreentry)

    scoreentry.Score = scoreentry.Score + sc
    _, err = r.Table("scores").Update(scoreentry).RunWrite(session)
}
And the second one receives these changes and logs them:
res, err := r.Table("scores").Changes().Run(session)
var value interface{}
if err != nil {
    log.Fatalln(err)
}
for res.Next(&value) {
    fmt.Println(value)
}
In the statistics that RethinkDB shows, I can see that there are 1.5K reads and writes per second. But in the console of the second program, I see only roughly 1 or 2 changes per second.
Why does this occur? Am I missing something?
This code:
r.Table("scores").Update(scoreentry).RunWrite(session)
Probably doesn't do what you think it does. This attempts to update every document in the table by merging scoreentry into it. This is why the RethinkDB console is showing so many writes per second: every time you run that query it's resulting in thousands of writes.
Usually you want to update documents inside of ReQL, like so:
r.Table("scores").Get(strconv.Itoa(pl)).Update(func(row r.Term) interface{} {
    return map[string]interface{}{"Score": row.GetField("Score").Add(sc)}
})
If you need to do the update in Go code, though, you can replace just that one document like so:
r.Table("scores").Get(strconv.Itoa(pl)).Replace(scoreentry)
I'm not sure why it is quite that slow; it could be because, by default, each query blocks until the write has been completely flushed. I would first add some kind of instrumentation to see which operation is so slow. There are also a couple of ways you can improve the performance:
Set the Durability of the write using UpdateOpts:
_, err = r.Table("scores").Update(scoreentry, r.UpdateOpts{
    Durability: "soft",
}).RunWrite(session)
Execute each query in a goroutine to allow your code to run multiple queries in parallel (you may need to use a pool of goroutines instead, but this code is just a simplified example):
for {
    go func() {
        var scoreentry ScoreEntry
        pl := rand.Intn(1000)
        sc := rand.Intn(6) - 2

        res, err := r.Table("scores").Get(strconv.Itoa(pl)).Run(session)
        if err != nil {
            log.Fatal(err)
        }
        err = res.One(&scoreentry)

        scoreentry.Score = scoreentry.Score + sc
        _, err = r.Table("scores").Update(scoreentry).RunWrite(session)
    }()
}
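As a sketch of the goroutine-pool idea, you can bound how many queries run at once with a buffered channel used as a semaphore (the limit of 10 here is arbitrary):
sem := make(chan struct{}, 10) // at most 10 queries in flight
for {
    sem <- struct{}{} // acquire a slot; blocks while 10 are already running
    go func() {
        defer func() { <-sem }() // release the slot when done

        var scoreentry ScoreEntry
        pl := rand.Intn(1000)
        sc := rand.Intn(6) - 2

        res, err := r.Table("scores").Get(strconv.Itoa(pl)).Run(session)
        if err != nil {
            log.Fatal(err)
        }
        if err := res.One(&scoreentry); err != nil {
            log.Fatal(err)
        }

        scoreentry.Score = scoreentry.Score + sc
        if _, err := r.Table("scores").Update(scoreentry).RunWrite(session); err != nil {
            log.Fatal(err)
        }
    }()
}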

golang couchbase gocb.RemoveOp - doesn't remove all

I think I made a silly mistake somewhere, but I couldn't figure out where for a long time :( The code is rough, I'm just testing things.
It deletes documents, but for some reason not all of them. I rewrote it to delete them one by one, and that worked fine.
I use the official package for Couchbase, http://github.com/couchbase/gocb
Here is the code:
var items []gocb.BulkOp
myQuery := gocb.NewN1qlQuery([Selecting ~ 283k documents from 1.5mln])
rows, err := myBucket.ExecuteN1qlQuery(myQuery, nil)
checkErr(err)

var idToDelete map[string]interface{}
for rows.Next(&idToDelete) {
    items = append(items, &gocb.RemoveOp{Key: idToDelete["id"].(string)})
}
if err := rows.Close(); err != nil {
    fmt.Println(err.Error())
}
if err := myBucket.Do(items); err != nil {
    fmt.Println(err.Error())
}
This way it deleted ~70k documents; I ran it again and it deleted ~43k more.
Then I just let it delete them one by one, and that worked fine:
//var items []gocb.BulkOp
myQuery := gocb.NewN1qlQuery([Selecting ~ 180k documents from ~1.3mln])
rows, err := myBucket.ExecuteN1qlQuery(myQuery, nil)
checkErr(err)

var idToDelete map[string]interface{}
for rows.Next(&idToDelete) {
    //items = append(items, &gocb.RemoveOp{Key: idToDelete["id"].(string)})
    _, err := myBucket.Remove(idToDelete["id"].(string), 0)
    checkErr(err)
}
if err := rows.Close(); err != nil {
    fmt.Println(err.Error())
}
//err = myBucket.Do(items)
By default, queries against N1QL use a consistency level called 'not bounded'. Thus, the second time you run the program, the query will use whatever index state is valid at the time of the query, rather than considering all of your previous mutations by waiting until the index is up to date. You can read more about this in Couchbase's Developer Guide; it looks like you'll want to add the RequestPlus consistency to your myQuery through the consistency method on the query.
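For example, something along these lines (this assumes gocb's Consistency option on N1qlQuery; check it against the gocb version you're using):
myQuery := gocb.NewN1qlQuery([Selecting ~ 283k documents from 1.5mln])
// Wait for the index to catch up with your prior mutations before running the query.
myQuery = myQuery.Consistency(gocb.RequestPlus)
rows, err := myBucket.ExecuteN1qlQuery(myQuery, nil)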
This kind of eventually consistent secondary indexing is pretty powerful and flexible, because it gives you, as the developer, the ability to decide what level of consistency you want to pay for, since index recalculations have a cost.

Resources