Use a single mutex across multiple goroutines - go

I'm trying to reduce the number of HTTP requests my Discord bot is making.
It reads from an API.
With the fetched data it updates an internal database and outputs the changes.
The thing is, that database is different for every server the bot is in, and that's where I'm using goroutines. But some servers need to fetch the same data, and this is where I want to reduce the HTTP requests. Right now I'm making requests regardless of whether I've already fetched a character. I want to create some sort of data structure that can be shared between the goroutines, and search it before making a request.
I was advised to use mutex. I'm trying. Original question: Working with unbuffered channels in golang
I made a skeleton of the real code I've tried: https://play.golang.org/p/mt229ns1R8m
In this example master := make([][]map[string]interface{}, 0) is simulating the discord servers.
Chars and Chars2 would be the tracked chars for each individual server.
The char "Test" is mutual to both of them, so it should be fetched from the API only once.
It's outputting this:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test]]Cache: [map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
I was expecting the output to be:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
But a new cache is being generated by every goroutine. How can I fix this?
Thanks.

There are too many unknowns here for me to really write a proper design, but let's make a few notes:
Try not to use interface{} at all, if at all possible. In this case, it seems that it must be possible, though I'm not sure what the actual types will be.
Try to make your data as simple as possible, but no simpler. In this case, that probably means: have one data structure for "thing that talks to a Discord server" and a separate one for "thing that talks to the local database" (is this a caching database? if so, what are the criteria for invalidating a cache entry?). But if one "character" (whatever that is—apparently a string) can have different properties per Discord server, that means that your index into your local database is not just a character, but rather a pair of values: the string value itself plus a Discord-server-identifier.
This might give you a functional interface like this:
var cacheServer *CacheServer
func InitCacheServer() error {
    cacheServer = ... // whatever it takes to initialize the cache server
    return nil
}
(I've assumed lazy initialization of the cache server. If you can do up-front initialization, you can drop the nil test at the top of the function below. Replace ValueType with the type of the result of a cached lookup of a name.)
func (ds DiscordServer) Get(name string) (ValueType, error) {
    if cacheServer == nil {
        if err := InitCacheServer(); err != nil {
            return nil, err
        }
    }
    // Do a cache lookup. Tell the cache server that if there
    // is no entry, it should return a NoEntry error and we will
    // fill the cache ourselves, so it should hold this slot as
    // "will be filled, so wait for it".
    slot, v, err := cacheServer.Lookup(name, ds.identity, IntentToFill)
    if err == NoEntry {
        // We have the slot held. Try to look up the right info
        // directly in the Discord server, then cache it.
        v, err = ds.UncachedGet(name)
        // Tell cache server that this is the value, or that it should
        // produce this error instead of NoEntry.
        cacheServer.FillSlot(slot, v, err)
    }
    return v, err
}
You might only want to cache some error types, rather than all; that's another one of those design questions that needs an answer that I cannot provide here. There are other ways to do this that don't necessarily need a slot pointer return value, too; I've just chosen this one for this example.
Note that most of the "hard work" is now in the cache server, which definitely requires some fancy footwork. In particular you will want to lock the overall data structure for a little while, use that to find the correct slot, then hold the slot itself so that other users of the slot must wait, while releasing the overall lock so that other users of other entries need not wait. This introduces locking order constraints: be careful to avoid deadlock. One method that should work is:
type CacheServer struct {
    lock sync.Mutex
    data map[string]map[string]*Entry
    // more fields
}
type Entry struct {
    lock sync.Mutex
    cachedValue ValueType
    cachedError error
}
(You'll need some more types, like Intent—just two enumerated integers for now—below, and probably more fields in the above; this is just a skeleton.)
func (cs *CacheServer) Lookup(name, srv string, flags Intent) (*Entry, ValueType, error) {
    cs.lock.Lock()
    defer cs.lock.Unlock()
    // first, look up the server - if it does not exist, create one
    smap := cs.data[srv]
    if smap == nil {
        smap = make(map[string]*Entry)
        cs.data[srv] = smap
    }
    entry := smap[name]
    if entry == nil {
        // no cached entry - if this is a pure lookup, just error,
        // but if not, make a locked entry
        if flags == IntentToFill {
            // make a new entry and return with it locked
            entry = &Entry{}
            smap[name] = entry
            entry.lock.Lock() // and do not unlock
        }
        return entry, nil, NoEntry
    }
    // wait for someone to fill it, if needed
    // (as written, cs.lock is still held during this wait because of the
    // deferred unlock, so other lookups block until the entry is filled)
    entry.lock.Lock()
    defer entry.lock.Unlock()
    return nil, entry.cachedValue, entry.cachedError
}
You need a routine to fill and release the entry as well, but it's pretty simple. You could, if you choose, make this a method on the Entry type rather than on the CacheServer type, as at least in this particular prototype, there is no need to use the cache server data structures directly. If you start getting fancier with cache invalidation, though, it might be nice to have access to the CacheServer object.
Note: I've designed this so that you can do a cache lookup without an intent-to-fill, if that's useful. If not, there's no reason to bother with the Intent argument.
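To make the locking order concrete, here is a minimal runnable sketch of the same idea: a map lock that is held only briefly, plus a per-entry lock that makes other goroutines wait while a slot is being filled. It is a simplification and not the skeleton above. The names Cache, Value and Get, the single-level map, and the fetch callback standing in for the API request are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Value stands in for ValueType; the real type is not known here.
type Value struct {
	Name  string
	Level int
}

// Entry is one cache slot. Its lock is held while the slot is being filled.
type Entry struct {
	lock  sync.Mutex
	value Value
	err   error
}

// Cache is a hypothetical, simplified stand-in for the CacheServer skeleton.
type Cache struct {
	lock sync.Mutex
	data map[string]*Entry
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]*Entry)}
}

// Get returns the cached value for name and calls fetch at most once per
// name, even when several goroutines ask for the same name concurrently.
func (c *Cache) Get(name string, fetch func(string) (Value, error)) (Value, error) {
	c.lock.Lock()
	entry, found := c.data[name]
	if !found {
		// New slot: lock it before publishing it so that waiters block on it.
		entry = &Entry{}
		entry.lock.Lock()
		c.data[name] = entry
	}
	c.lock.Unlock() // release the map lock before doing any slow work

	if !found {
		// We own the slot: fill it, then release it.
		v, err := fetch(name)
		entry.value, entry.err = v, err
		entry.lock.Unlock()
		return v, err
	}
	// Another goroutine created the slot: wait until it has been filled.
	entry.lock.Lock()
	defer entry.lock.Unlock()
	return entry.value, entry.err
}

func main() {
	cache := NewCache()
	var mu sync.Mutex
	calls := 0
	fetch := func(name string) (Value, error) {
		mu.Lock()
		calls++
		mu.Unlock()
		fmt.Println("A call would be made for", name)
		return Value{Name: name, Level: len(name)}, nil
	}

	// Two "servers" that both track the character "Test".
	servers := [][]string{{"Test", "Test2"}, {"Test3", "Test"}}
	var wg sync.WaitGroup
	for _, chars := range servers {
		wg.Add(1)
		go func(chars []string) {
			defer wg.Done()
			for _, name := range chars {
				cache.Get(name, fetch)
			}
		}(chars)
	}
	wg.Wait()
	fmt.Println("calls:", calls) // three distinct names, so: calls: 3
}
```

This sketch caches errors along with values and never invalidates anything. Both are design decisions that still have to be made for the real bot.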

Related

How to create a Singleton Cache in Go

I am working in Golang, which I am new to, and I have come across two interesting articles:
https://hackernoon.com/in-memory-caching-in-golang
The one from hackernoon is really good, and the first example (Simple Map) is precisely what I am looking for, as it shows how to expire values in a cache. What I am struggling to understand is whether the implementation creates just one instance of the cache rather than multiple copies. With multiple copies, a value could end up in one copy and not in another, and lookups would not work properly.
Another article, https://thedevelopercafe.com/articles/singleton-in-golang-839d8610958b, talks about instantiating a single cache.
Both articles use sync, so my question is: can someone with experience in Golang advise me whether the function called newlocalcache in the Hackernoon example sets up a singleton and, if not, what I need to do to add that?
"the function called newlocalcache sets up a singleton"
No, it constructs and returns a new local cache every time it's called.
"if not what do I need to do to add it?"
Call it just once. For example:
var localCacheSingleton *localCache
var newLocalCacheOnce sync.Once
func newLocalCache(cleanupInterval time.Duration) *localCache {
newLocalCacheOnce.Do(func() {
lc := &localCache{
users: make(map[int64]cachedUser),
stop: make(chan struct{}),
}
lc.wg.Add(1)
go func(cleanupInterval time.Duration) {
defer lc.wg.Done()
lc.cleanupLoop(cleanupInterval)
}(cleanupInterval)
localCacheSingleton = lc
})
return localCacheSingleton
}
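As a quick check that callers really share one instance, here is a minimal self-contained sketch of the same sync.Once pattern. The cache type and the getCache name are simplified stand-ins invented for this example, not the article's localCache.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a simplified, hypothetical stand-in for the article's localCache.
type cache struct {
	mu    sync.RWMutex
	users map[int64]string
}

var (
	instance *cache
	once     sync.Once
)

// getCache builds the cache on first use and returns the same pointer on
// every later call, from any goroutine.
func getCache() *cache {
	once.Do(func() {
		instance = &cache{users: make(map[int64]string)}
	})
	return instance
}

func main() {
	a := getCache()
	a.mu.Lock()
	a.users[1] = "alice"
	a.mu.Unlock()

	b := getCache()
	b.mu.RLock()
	name := b.users[1]
	b.mu.RUnlock()

	fmt.Println(a == b, name) // true alice
}
```

One consequence of this pattern, which also applies to the newLocalCache version above: arguments passed on any call after the first (such as cleanupInterval) are ignored, because the function given to Do runs only once.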

Ensuring a value is retrieved only once

I'm developing a Go package to access a web service (via HTTP). Every time I retrieve a page of data from that service, I also get the total of pages available. The only way to get this total is by getting one of the pages (usually the first one). However, requests to this service take time and I need to do the following:
When the GetPage method is called on a Client and the page is retrieved for the first time, the retrieved total should be stored somewhere in that client.
When the Total method is called and the total hasn't yet been retrieved, the first page should be fetched and the total returned.
If the total was retrieved before, either by a call to GetPage or Total, it should be returned immediately, without any HTTP requests at all.
This needs to be safe for use by multiple goroutines. My idea is something along the lines of sync.Once but with the function passed to Do returning a value, which is then cached and automatically returned whenever Do is called.
I remember seeing something like this before, but I can't find it now even though I tried. Searching for sync.Once with value and similar terms didn't yield any useful results. I know I could probably do that with a mutex and a lot of locking, but mutexes and a lot of locking don't seem to be the recommended way to do stuff in go.
General "init-once" solution
In the general / usual case, the easiest solution to only init once, only when it's actually needed is to use sync.Once and its Once.Do() method.
You don't actually need to return any value from the function passed to Once.Do(), because you can store values to e.g. global variables in that function.
See this simple example:
var (
total int
calcTotalOnce sync.Once
)
func GetTotal() int {
// Init / calc total once:
calcTotalOnce.Do(func() {
fmt.Println("Fetching total...")
// Do some heavy work, make HTTP calls, whatever you want:
total++ // This will set total to 1 (once and for all)
})
// Here you can safely use total:
return total
}
func main() {
fmt.Println(GetTotal())
fmt.Println(GetTotal())
}
Output of the above (try it on the Go Playground):
Fetching total...
1
1
Some notes:
You can achieve the same using a mutex or sync.Once, but the latter is actually faster than using a mutex.
If GetTotal() has been called before, subsequent calls to GetTotal() will not do anything but return the previously calculated value, this is what Once.Do() does / ensures. sync.Once "tracks" if its Do() method has been called before, and if so, the passed function value will not be called anymore.
sync.Once provides all the needs for this solution to be safe for concurrent use from multiple goroutines, given that you don't modify or access the total variable directly from anywhere else.
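For reference, Go 1.21 added sync.OnceValue to the standard library, which is close to the "sync.Once that returns a value" the question describes. A minimal sketch, assuming Go 1.21 or later; fetchTotal and the value 42 are placeholders for the real HTTP call:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchTotal is a hypothetical stand-in for the slow HTTP request.
func fetchTotal() int {
	fmt.Println("Fetching total...")
	return 42
}

// getTotal runs fetchTotal on the first call only and returns the cached
// result on every later call. It is safe for concurrent use.
var getTotal = sync.OnceValue(fetchTotal)

func main() {
	fmt.Println(getTotal())
	fmt.Println(getTotal())
}
```

This prints "Fetching total..." once, followed by 42 twice. It covers only the general case where the value is computed in a single place.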
Solution to your "unusual" case
The general case assumes the total is only accessed via the GetTotal() function.
In your case this does not hold: you want to access it via the GetTotal() function and you want to set it after a GetPage() call (if it has not yet been set).
We may solve this with sync.Once too. We would need the above GetTotal() function; and when a GetPage() call is performed, it may use the same calcTotalOnce to attempt to set its value from the received page.
It could look something like this:
var (
total int
calcTotalOnce sync.Once
)
func GetTotal() int {
calcTotalOnce.Do(func() {
// total is not yet initialized: get page and store total number
page := getPageImpl()
total = page.Total
})
// Here you can safely use total:
return total
}
type Page struct {
Total int
}
func GetPage() *Page {
page := getPageImpl()
calcTotalOnce.Do(func() {
// total is not yet initialized, store the value we have:
total = page.Total
})
return page
}
func getPageImpl() *Page {
// Do HTTP call or whatever
page := &Page{}
// Set page.Total from the response body
return page
}
How does this work? We create and use a single sync.Once in the variable calcTotalOnce. This ensures that its Do() method may only call the function passed to it once, no matter where / how this Do() method is called.
If someone calls the GetTotal() function first, then the function literal inside it will run, which calls getPageImpl() to fetch the page and initialize the total variable from the Page.Total field.
If the GetPage() function is called first, it also calls calcTotalOnce.Do(), which simply stores the Page.Total value in the total variable.
Whichever route is walked first, that will alter the internal state of calcTotalOnce, which will remember the total calculation has already been run, and further calls to calcTotalOnce.Do() will never call the function value passed to it.
Or just use "eager" initialization
Also note that if this total is likely to be needed at some point during the lifetime of your program, it might not be worth the above complexity, as you may just as easily initialize the variable once, when it's created.
var Total = getPageImpl().Total
Or if the initialization is a little more complex (e.g. needs error handling), use a package init() function:
var Total int
func init() {
page := getPageImpl()
// Other logic, e.g. error handling
Total = page.Total
}

Golang garbage collector and maps

I'm processing some user session data inside of a goroutine and creating a map to keep track of user id -> session data inside of it. The goroutine loops through a slice and if a SessionEnd event is found, the map key is deleted inside the same iteration. This doesn't seem to always be the case, as I can still retrieve some of the data as well as the 'key exists' bool variable sometimes in the following iterations. It's as if some variables haven't yet been zeroed.
Each map has only one goroutine writing/reading from it. From my understanding there shouldn't be a race condition, but it definitely seems that there is with the map and delete().
The code works fine if the garbage collector is run on every iteration. Am I using a map for the wrong purpose?
Pseudocode (a function that is run inside a single goroutine, lines is passed as a variable):
active := make(ActiveSessions) // map[int]UserSession
for _, l := range lines { // lines is a slice of a parsed log
u = l.EventData.(parser.User)
s, exists = active[u.SessionID]
switch l.Event {
// Contains cases which can check if exists is true or false
// errors if contains an event that can't happen,
// for example UserDisconnect before UserConnect,
// or UserConnect while a session is already active
case "UserConnect":
if exists {
// error, can't occur
// The same session id can occur in the log after a prior session has completed,
// which is exactly when the problems occur
}
case "UserDisconnect":
sessionFinished = true
}
// ...
if sessionFinished {
// <add session to finished sessions>
delete(active, u.SessionID)
// Code works only if runtime.GC() is executed here, could just be a coincidence
}
}

How to FIFO Order a Map during Unmarshalling

I know from reading around that Maps are intentionally unordered in Go, but they offer a lot of benefits that I would like to use for this problem I'm working on. My question is how might I order a map FIFO style? Is it even worth trying to make this happen? Specifically I am looking to make it so that I can unmarshal into a set of structures hopefully off of an interface.
I have:
type Package struct {
Account string
Jobs []*Jobs
Libraries map[string]string
}
type Jobs struct {
// Name of the job
JobName string `mapstructure:"name" json:"name" yaml:"name" toml:"name"`
// Type of the job. should be one of the strings outlined in the job struct (below)
Job *Job `mapstructure:"job" json:"job" yaml:"job" toml:"job"`
// Not marshalled
JobResult string
// For multiple values
JobVars []*Variable
}
type Job struct {
// Sets/Resets the primary account to use
Account *Account `mapstructure:"account" json:"account" yaml:"account" toml:"account"`
// Set an arbitrary value
Set *Set `mapstructure:"set" json:"set" yaml:"set" toml:"set"`
// Contract compile and send to the chain functions
Deploy *Deploy `mapstructure:"deploy" json:"deploy" yaml:"deploy" toml:"deploy"`
// Send tokens from one account to another
Send *Send `mapstructure:"send" json:"send" yaml:"send" toml:"send"`
// Utilize eris:db's native name registry to register a name
RegisterName *RegisterName `mapstructure:"register" json:"register" yaml:"register" toml:"register"`
// Sends a transaction which will update the permissions of an account. Must be sent from an account which
// has root permissions on the blockchain (as set by either the genesis.json or in a subsequent transaction)
Permission *Permission `mapstructure:"permission" json:"permission" yaml:"permission" toml:"permission"`
// Sends a bond transaction
Bond *Bond `mapstructure:"bond" json:"bond" yaml:"bond" toml:"bond"`
// Sends an unbond transaction
Unbond *Unbond `mapstructure:"unbond" json:"unbond" yaml:"unbond" toml:"unbond"`
// Sends a rebond transaction
Rebond *Rebond `mapstructure:"rebond" json:"rebond" yaml:"rebond" toml:"rebond"`
// Sends a transaction to a contract. Will utilize eris-abi under the hood to perform all of the heavy lifting
Call *Call `mapstructure:"call" json:"call" yaml:"call" toml:"call"`
// Wrapper for mintdump dump. WIP
DumpState *DumpState `mapstructure:"dump-state" json:"dump-state" yaml:"dump-state" toml:"dump-state"`
// Wrapper for mintdump restore. WIP
RestoreState *RestoreState `mapstructure:"restore-state" json:"restore-state" yaml:"restore-state" toml:"restore-state"`
// Sends a "simulated call" to a contract. Predominantly used for accessor functions ("Getters" within contracts)
QueryContract *QueryContract `mapstructure:"query-contract" json:"query-contract" yaml:"query-contract" toml:"query-contract"`
// Queries information from an account.
QueryAccount *QueryAccount `mapstructure:"query-account" json:"query-account" yaml:"query-account" toml:"query-account"`
// Queries information about a name registered with eris:db's native name registry
QueryName *QueryName `mapstructure:"query-name" json:"query-name" yaml:"query-name" toml:"query-name"`
// Queries information about the validator set
QueryVals *QueryVals `mapstructure:"query-vals" json:"query-vals" yaml:"query-vals" toml:"query-vals"`
// Makes an assertion (useful for testing purposes)
Assert *Assert `mapstructure:"assert" json:"assert" yaml:"assert" toml:"assert"`
}
What I would like to do is to have jobs contain a map of string to Job and eliminate the job field, while maintaining order in which they were placed in from the config file. (Currently using viper). Any and all suggestions for how to achieve this are welcome.
You would need to hold the keys in a separate slice and work with that.
type fifoJob struct {
m map[string]*Job
order []string
result []string
// Not sure where JobVars will go.
}
func (str *fifoJob) Enqueue(key string, val *Job) {
str.m[key] = val
str.order = append(str.order, key)
}
func (str *fifoJob) Dequeue() {
if len(str.order) > 0 {
delete(str.m, str.order[0])
str.order = str.order[1:]
}
}
Anyways if you're using viper you can use something like the fifoJob struct defined above. Also note that I'm making a few assumptions here.
type Package struct {
Account string
Jobs *fifoJob
Libraries map[string]string
}
var config Package
config.Jobs = &fifoJob{} // Jobs is a *fifoJob, so take the address
config.Jobs.m = map[string]*Job{}
// Your config file would need to store the order in an array.
// Would've been easy if viper had a getSlice method returning []interface{}
config.Jobs.order = viper.GetStringSlice("package.jobs.order")
for k,v := range viper.GetStringMap("package.jobs.jobmap") {
if job, ok := v.(Job); ok {
config.Jobs.m[k] = &job
}
}
PS: You're giving too many irrelevant details in your question. I was asking for an MCVE.
Maps are by nature unordered but you can fill up a slice instead with your keys. Then you can range over your slice and sort it however you like. You can pull out specific elements in your slice with [i].
Check out pages 170, 203, or 204 of this book for some great examples:
Programming in Go
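As a small illustration of the "keys in a separate slice" idea from both answers, here is a minimal sketch of a map that remembers insertion order. The type name orderedJobs and the string values are assumptions made for the example; in the real code the values would be *Job.

```go
package main

import "fmt"

// orderedJobs pairs a map with a slice that records insertion order.
type orderedJobs struct {
	m     map[string]string
	order []string
}

func newOrderedJobs() *orderedJobs {
	return &orderedJobs{m: make(map[string]string)}
}

// Set stores val under key. A key keeps the position of its first insert,
// so updating an existing key does not add a duplicate to the order slice.
func (o *orderedJobs) Set(key, val string) {
	if _, exists := o.m[key]; !exists {
		o.order = append(o.order, key)
	}
	o.m[key] = val
}

// Each visits the entries in insertion order.
func (o *orderedJobs) Each(fn func(key, val string)) {
	for _, k := range o.order {
		fn(k, o.m[k])
	}
}

func main() {
	jobs := newOrderedJobs()
	jobs.Set("deploy", "contract.sol")
	jobs.Set("set", "x=1")
	jobs.Set("call", "getX")
	jobs.Set("set", "x=2") // update: position of "set" is unchanged

	jobs.Each(func(k, v string) { fmt.Println(k, v) })
	// Output:
	// deploy contract.sol
	// set x=2
	// call getX
}
```

This only preserves order once the data is in memory. Whether the config loader hands the keys over in file order is a separate question, which is why the first answer stores the order explicitly in the config file.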

Bypass sql null value problems in Go

I want to use Go to make an API for an existing database that uses null values extensively. Go will not scan nulls to empty strings (or equivalent), and so I need to implement a workaround.
The workarounds I have discovered have left me unsatisfied. In fact I went looking for a dynamic language because of this problem, but Go has certain attractions and I would like to stick with it if possible. Here are the workarounds that did not satisfy:
1. Don't use nulls in the database. Unsuitable because the database is pre-existing and I do not have liberty to interfere with its structure. The database is more important than my app, not the other way around.
2. In sql queries, use COALESCE, ISNULL, etc to convert nulls to empty strings (or equiv) before the data gets to my app. Unsuitable because there are many fields and many tables. Apart from a couple of obvious ones (primary key, surname), I don't know for sure which fields can be relied upon not to give me a null value, so I would be defensively cluttering my sql queries everywhere.
3. Use sql.NullString, sql.NullInt64, sql.NullFloat64, etc to convert nulls to empty strings (or equiv) as an intermediate step before settling them into their destination type. This suffers from the same problem as above, only I am cluttering my Go code instead of my sql queries.
4. Use a combination of *pointers and []byte, to scan each item in to a memory location without committing it to a particular type (other than []byte), and then somehow work with the raw data. But to do something meaningful with the data you have to convert it to something more useful, and then you are back to sql.NullString or if x==nil{handle it}, and this again is happening on a case by case basis for any field that I need to work with. So, again, we are looking at cluttered, messy, error-prone code and I'm repeating myself all the time instead of being DRY in my coding.
5. Look to the Go ORM libraries for help. Well I did that, but to my surprise none of them tackle this issue.
6. Make my own helper package to convert all null strings to "", null ints to 0, null floats to 0.00, null bools to false, etc, and make it part of the process of scanning in from the sql driver, resulting in regular, normal strings, ints, floats and bools.
Unfortunately if 6 is the solution, I do not have the expertise. I suspect the solution would involve something like "if the intended type of the item to be scanned to is a string, make it an sql.NullString and extract an empty string from it. But if the item to be scanned to is an int, make it a NullInt64 and get a zero from that. But if ...(etc)"
Is there anything I have missed? Thank you.
Using pointers as the SQL scan destinations lets the data be scanned in, worked with (subject to a != nil check) and marshalled to JSON to be sent out from the API, without having to put hundreds of sql.NullString, sql.NullFloat64, etc. everywhere. Nulls are preserved and sent out through the marshalled JSON (see Fathername at the bottom). At the other end, the client can work with the nulls in JavaScript, which is better equipped to handle them.
func queryToJson(db *sql.DB) []byte {
    rows, err := db.Query(
        "select mothername, fathername, surname from fams "+
            "where surname = ?", "Nullfather",
    )
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()
    type record struct {
        // the key: use pointers, so that a SQL NULL becomes a nil pointer
        Mname   *string `json:"Mothername"`
        Fname   *string `json:"Fathername"`
        Surname *string `json:"Surname"`
    }
    records := []record{}
    for rows.Next() {
        var r record
        // Scan needs the address of each *string field; passing the nil
        // pointers themselves would fail
        err := rows.Scan(&r.Mname, &r.Fname, &r.Surname)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(r)
        records = append(records, r)
    }
    if err := rows.Err(); err != nil {
        log.Fatal(err)
    }
    j, err := json.Marshal(records)
    if err != nil {
        log.Fatal(err)
    }
    return j
}
j := queryToJson(db)
fmt.Println(string(j)) // [{"Mothername":"Mary","Fathername":null,"Surname":"Nullfather"}]
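If zero values are acceptable instead of nulls, the helper package the question describes can be sketched with types that implement the sql.Scanner interface. ZeroString and ZeroInt64 are hypothetical names invented for this example. The sketch calls Scan directly to simulate what rows.Scan would pass in, so it runs without a database.

```go
package main

import (
	"database/sql"
	"fmt"
)

// ZeroString scans like a string but turns SQL NULL into "".
type ZeroString string

// Scan implements sql.Scanner by delegating to sql.NullString.
func (z *ZeroString) Scan(src interface{}) error {
	var ns sql.NullString
	if err := ns.Scan(src); err != nil {
		return err
	}
	*z = ZeroString(ns.String) // ns.String is "" when the column was NULL
	return nil
}

// ZeroInt64 scans like an int64 but turns SQL NULL into 0.
type ZeroInt64 int64

// Scan implements sql.Scanner by delegating to sql.NullInt64.
func (z *ZeroInt64) Scan(src interface{}) error {
	var ni sql.NullInt64
	if err := ni.Scan(src); err != nil {
		return err
	}
	*z = ZeroInt64(ni.Int64) // ni.Int64 is 0 when the column was NULL
	return nil
}

func main() {
	var name ZeroString
	var age ZeroInt64

	// Simulate NULL columns.
	_ = name.Scan(nil)
	_ = age.Scan(nil)
	fmt.Printf("%q %d\n", name, age) // "" 0

	// Simulate non-NULL columns.
	_ = name.Scan("Mary")
	_ = age.Scan(int64(30))
	fmt.Printf("%q %d\n", name, age) // "Mary" 30
}
```

With types like these as struct fields, rows.Scan(&r.Name, &r.Age) would need no per-field null handling. The trade-off is that a NULL and a genuine empty string or zero become indistinguishable, which the pointer approach above avoids.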

Resources