Ensuring a value is retrieved only once - Go

I'm developing a Go package to access a web service (via HTTP). Every time I retrieve a page of data from that service, I also get the total of pages available. The only way to get this total is by getting one of the pages (usually the first one). However, requests to this service take time and I need to do the following:
When the GetPage method is called on a Client and the page is retrieved for the first time, the retrieved total should be stored somewhere in that client. When the Total method is called and the total hasn't yet been retrieved, the first page should be fetched and the total returned. If the total was retrieved before, either by a call to GetPage or Total, it should be returned immediately, without any HTTP requests at all. This needs to be safe for use by multiple goroutines. My idea is something along the lines of sync.Once but with the function passed to Do returning a value, which is then cached and automatically returned whenever Do is called.
I remember seeing something like this before, but I can't find it now even though I tried. Searching for sync.Once with value and similar terms didn't yield any useful results. I know I could probably do this with a mutex and a lot of locking, but mutexes and a lot of locking don't seem to be the recommended way to do things in Go.

General "init-once" solution
In the general / usual case, the easiest way to initialize something only once, and only when it's actually needed, is to use sync.Once and its Once.Do() method.
You don't actually need to return any value from the function passed to Once.Do(), because you can store the result in e.g. a package-level variable from inside that function.
See this simple example:
package main

import (
	"fmt"
	"sync"
)

var (
	total         int
	calcTotalOnce sync.Once
)

func GetTotal() int {
	// Init / calc total once:
	calcTotalOnce.Do(func() {
		fmt.Println("Fetching total...")
		// Do some heavy work, make HTTP calls, whatever you want:
		total++ // This will set total to 1 (once and for all)
	})
	// Here you can safely use total:
	return total
}

func main() {
	fmt.Println(GetTotal())
	fmt.Println(GetTotal())
}
Output of the above (try it on the Go Playground):
Fetching total...
1
1
Some notes:
You could achieve the same with a sync.Mutex and a boolean flag, but sync.Once is typically faster; for comparison, a mutex-based equivalent is sketched after these notes.
If GetTotal() has been called before, subsequent calls do nothing but return the previously calculated value; this is what Once.Do() ensures. sync.Once tracks whether its Do() method has already been called, and if so, the function value passed to it is not called again.
sync.Once provides everything needed for this solution to be safe for concurrent use by multiple goroutines, as long as you don't modify or access the total variable directly from anywhere else.
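For comparison, here is roughly what the mutex-based equivalent would look like; this is a minimal sketch, and fetchTotal() is a placeholder for whatever HTTP call produces the number:

var (
	totalMu      sync.Mutex
	totalFetched bool
	total        int
)

func GetTotal() int {
	totalMu.Lock()
	defer totalMu.Unlock()
	if !totalFetched {
		total = fetchTotal() // the expensive HTTP call, done at most once
		totalFetched = true
	}
	return total
}

Every caller takes the mutex even after the value is cached, whereas Once.Do() only performs an atomic check on its fast path once the function has run, which is why sync.Once is both shorter and cheaper here.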
Solution to your "unusual" case
The general case assumes the total is only accessed via the GetTotal() function.
In your case this does not hold: you want to access it via the GetTotal() function and you want to set it after a GetPage() call (if it has not yet been set).
We may solve this with sync.Once too. We would need the above GetTotal() function; and when a GetPage() call is performed, it may use the same calcTotalOnce to attempt to set its value from the received page.
It could look something like this:
var (
	total         int
	calcTotalOnce sync.Once
)

func GetTotal() int {
	calcTotalOnce.Do(func() {
		// total is not yet initialized: get page and store total number
		page := getPageImpl()
		total = page.Total
	})
	// Here you can safely use total:
	return total
}

type Page struct {
	Total int
}

func GetPage() *Page {
	page := getPageImpl()
	calcTotalOnce.Do(func() {
		// total is not yet initialized, store the value we have:
		total = page.Total
	})
	return page
}

func getPageImpl() *Page {
	// Do HTTP call or whatever
	page := &Page{}
	// Set page.Total from the response body
	return page
}
How does this work? We create and use a single sync.Once in the variable calcTotalOnce. This ensures that the function passed to its Do() method can run at most once, no matter where or how Do() is called.
If GetTotal() is called first, the function literal inside it runs, calls getPageImpl() to fetch a page, and initializes the total variable from the Page.Total field.
If GetPage() is called first, it also calls calcTotalOnce.Do(), which simply stores the Page.Total value of the page it already has in the total variable (no extra fetch is needed).
Whichever route runs first alters the internal state of calcTotalOnce: it remembers that the total has already been calculated, and further calls to calcTotalOnce.Do() will never invoke the function value passed to them.
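For illustration, a hypothetical caller of the sketch above (with fmt imported) triggers exactly one fetch no matter which method it uses first:

func main() {
	page := GetPage()       // performs the single page fetch and caches page.Total
	fmt.Println(page.Total)
	fmt.Println(GetTotal()) // no further HTTP request: the cached total is returned
}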
Or just use "eager" initialization
Also note that if the total will most likely have to be fetched at some point during the lifetime of your program anyway, the above complexity may not be worth it: you can just as easily initialize the variable eagerly, when it is created.
var Total = getPageImpl().Total
Or if the initialization is a little more complex (e.g. needs error handling), use a package init() function:
var Total int

func init() {
	page := getPageImpl()
	// Other logic, e.g. error handling
	Total = page.Total
}
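As a side note, if you can rely on Go 1.21 or newer, the standard library also has sync.OnceValue (and sync.OnceValues), which is essentially the "sync.Once whose function returns a value" the question asks for. A minimal sketch, reusing the getPageImpl() helper from above:

var cachedTotal = sync.OnceValue(func() int {
	// Runs at most once; the result is cached and returned on every later call.
	return getPageImpl().Total
})

func Total() int {
	return cachedTotal()
}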

Related

Cypress: global variables in callbacks

I am trying to set up a new Cypress framework and I hit a point where I need help.
The scenario I am trying to work out: a page that is calling the same endpoint after every change and I use an interceptor to wait for the call. As we all know you can’t use the same intercept name for the same request multiple times so I did a trick here. I used a dynamically named alias, which works.
var counters = {}

function registerIntercept(method, url, name) {
  counters[name] = 1;
  cy.intercept(method, url, (req) => {
    var currentCounter = counters[name]++;
    cy.wrap(counters).as(counters)
    req.alias = name + (currentCounter);
  })
}

function waitForCall(call) {
  waitForCall_(call + (counters[call.substr(1)]));
  // HERE counters[any_previously_added_key] is always 1, even though the
  // counters entry in registerIntercept is bigger.
  // I suspect counters here is not using the same counters value as registerIntercept.
}

function waitForCall_(call) {
  cy.wait(call)
  cy.wait(100)
}
It is supposed to be used as waitForCall("#callalias"), which will be converted to #callalias1, #callalias2 and so on.
The problem is that the global counters works in registerIntercept, but I can't get its values from waitForCall. It always retrieves the value 1 for a specific key, while the counters key in registerIntercept is already at a bigger number.
This code works if you use it as waitForCall_("#callalias4") to wait for the 4th request, for example. Each intercept will get an alias ending with an incremented number. But I don't want to keep track of how many calls were made; I want the code to retrieve that from counters and build the wait.
Any idea why counters in waitForCall is not having the same values for its keys as it has in registerIntercept?

Use a single mutex across multiple goroutines

I'm trying to reduce the amount of http requests my discord bot is making.
It's reading from an API.
With the fetched data it updates an internal database and outputs the changes.
Thing is: that database is different for every server the bot is in, and that's where I'm using the goroutines. But some servers need to fetch the same data, and that's where I want to reduce the HTTP requests. Right now I'm making requests regardless of whether I've already fetched a character. I want to create some sort of data structure that can be shared between the goroutines and searched before making a request.
I was advised to use a mutex. I'm trying. Original question: Working with unbuffered channels in golang
I made a skeleton of the real code I've tried: https://play.golang.org/p/mt229ns1R8m
In this example master := make([][]map[string]interface{}, 0) is simulating the discord servers.
Chars and Chars2 would be the tracked chars for each individual server.
The char "Test" is common to both of them, so it should be fetched from the API only once.
It's outputting this:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test]]Cache: [map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
I was expecting the output to be:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
But a new cache is being generated by every goroutine. How can I fix this?
Thanks.
There are too many unknowns here for me to really write a proper design, but let's make a few notes:
Try not to use interface{} at all, if at all possible. In this case, it seems that it must be possible, though I'm not sure what the actual types will be.
Try to make your data as simple as possible, but no simpler. In this case, that probably means: have one data structure for "thing that talks to a Discord server" and a separate one for "thing that talks to the local database" (is this a caching database? if so, what are the criteria for invalidating a cache entry?). But if one "character" (whatever that is—apparently a string) can have different properties per Discord server, that means that your index into your local database is not just a character, but rather a pair of values: the string value itself plus a Discord-server-identifier.
This might give you a functional interface like this:
var cacheServer *CacheServer

func InitCacheServer() error {
	cacheServer = ... // whatever it takes to initialize the cache server
}
(I've assumed lazy initialization of the cache server. If you can do up-front initialization, you can drop the next test below. Replace ValueType with the type of the result of a cached lookup of a name.)
func (ds *DiscordServer) Get(name string) (ValueType, error) {
	if cacheServer == nil {
		if err := InitCacheServer(); err != nil {
			return nil, err
		}
	}
	// Do a cache lookup. Tell the cache server that if there
	// is no entry, it should return a NoEntry error and we will
	// fill the cache ourselves, so it should hold this slot as
	// "will be filled, so wait for it".
	slot, v, err := cacheServer.Lookup(name, ds.identity, IntentToFill)
	if err == NoEntry {
		// We have the slot held. Try to look up the right info
		// directly in the Discord server, then cache it.
		v, err = ds.UncachedGet(name)
		// Tell the cache server that this is the value, or that it should
		// produce this error instead of NoEntry.
		cacheServer.FillSlot(slot, v, err)
	}
	return v, err
}
You might only want to cache some error types, rather than all; that's another one of those design questions that needs an answer that I cannot provide here. There are other ways to do this that don't necessarily need a slot pointer return value, too; I've just chosen this one for this example.
Note that most of the "hard work" is now in the cache server, which definitely requires some fancy footwork. In particular you will want to lock the overall data structure for a little while, use that to find the correct slot, then hold the slot itself so that other users of the slot must wait, while releasing the overall lock so that other users of other entries need not wait. This introduces locking order constraints: be careful to avoid deadlock. One method that should work is:
type CacheServer struct {
	lock sync.Mutex
	data map[string]map[string]*Entry
	// more fields
}

type Entry struct {
	lock        sync.Mutex
	cachedValue ValueType
	cachedError error
}
(You'll need some more types, like Intent—just two enumerated integers for now—below, and probably more fields in the above; this is just a skeleton.)
func (cs *CacheServer) Lookup(name, srv string, flags Intent) (*Entry, ValueType, error) {
	cs.lock.Lock()
	defer cs.lock.Unlock()
	// First, look up the server map - if it does not exist, create one.
	smap := cs.data[srv]
	if smap == nil {
		smap = make(map[string]*Entry)
		cs.data[srv] = smap
	}
	entry := smap[name]
	if entry == nil {
		// No cached entry - if this is a pure lookup, just error,
		// but if not, make a locked entry.
		if flags == IntentToFill {
			// Make a new entry and return with it locked.
			entry = &Entry{}
			smap[name] = entry
			entry.lock.Lock() // and do not unlock
		}
		return entry, nil, NoEntry
	}
	entry.lock.Lock() // wait for someone to fill it, if needed
	defer entry.lock.Unlock()
	return nil, entry.cachedValue, entry.cachedError
}
You need a routine to fill and release the entry as well, but it's pretty simple. You could, if you choose, make this a method on the Entry type rather than on the CacheServer type, as at least in this particular prototype, there is no need to use the cache server data structures directly. If you start getting fancier with cache invalidation, though, it might be nice to have access to the CacheServer object.
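A minimal sketch of that fill-and-release routine, assuming the Entry and CacheServer types above (the exact shape is a design choice, not something fixed by the code so far):

// FillSlot stores the looked-up value (or the error to hand out instead of
// NoEntry) into a slot previously returned by Lookup with IntentToFill,
// then releases the slot lock so any waiting readers can proceed.
func (cs *CacheServer) FillSlot(e *Entry, v ValueType, err error) {
	e.cachedValue = v
	e.cachedError = err
	e.lock.Unlock() // locked in Lookup; waiters blocked on this lock wake up now
}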
Note: I've designed this so that you can do a cache lookup without an intent-to-fill, if that's useful. If not, there's no reason to bother with the Intent argument.

Golang garbage collector and maps

I'm processing some user session data inside of a goroutine and creating a map inside of it to keep track of user id -> session data. The goroutine loops through a slice and, if a SessionEnd event is found, the map key is deleted in the same iteration. But the deletion doesn't always seem to take effect: in later iterations I can sometimes still retrieve the data, and the 'key exists' bool is still true. It's as if some variables haven't been zeroed yet.
Each map has only one goroutine writing/reading from it. From my understanding there shouldn't be a race condition, but it definitely seems that there is with the map and delete().
The code works fine if the garbage collector is run on every iteration. Am I using a map for the wrong purpose?
Pseudocode (a function that is run inside a single goroutine, lines is passed as a variable):
active := make(ActiveSessions) // map[int]UserSession
for _, l := range lines { // lines is a slice of a parsed log
	u = l.EventData.(parser.User)
	s, exists = active[u.SessionID]
	switch l.Event {
	// Contains cases which can check if exists is true or false;
	// errors if it contains an event that can't happen,
	// for example UserDisconnect before UserConnect,
	// or UserConnect while a session is already active.
	case "UserConnect":
		if exists {
			// error, can't occur
			// The same session id can occur in the log after a prior session has completed,
			// which is exactly when the problems occur
		}
	case "UserDisconnect":
		sessionFinished = true
	}
	// ...
	if sessionFinished {
		// <add session to finished sessions>
		delete(active, u.SessionID)
		// Code works only if runtime.GC() is executed here, could just be a coincidence
	}
}

How to call a function in golang based on attribute change?

Sample code : https://play.golang.org/p/vl3xwMWf5G
In the above code, I am making sure that we don't unnecessarily call the LocateBasket function; it is called only once, during the GetBasket call. But if there is any attribute change (e.g. the quantity was changed to 30), I want to make sure that when the user calls GetBasket it internally calls LocateBasket again.
In my example there is only one such function, but if there are multiple attributes which need to be recalculated (based on attribute changes) when their corresponding getter functions are called, what is the best approach?
Write a setter method, and either immediately LocateBasket or flag it for GetBasket to do:
func (ball *Ball) SetQuantity(n int) {
	ball.Quantity = n
	// ball.LocateBasket(ball) // or recalculate immediately here
	ball.dirty = true
}

func (ball *Ball) GetBasket() BasketType {
	if ball.dirty {
		ball.LocateBasket(ball)
		ball.dirty = false
	}
	return ball.basket
}
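A hypothetical usage sketch; Ball, BasketType, LocateBasket and the field names are stand-ins for the types in the linked playground code:

b := &Ball{Quantity: 10, dirty: true} // start dirty so the first GetBasket locates
_ = b.GetBasket()                     // LocateBasket runs, dirty is cleared
_ = b.GetBasket()                     // cached basket is returned, nothing recalculated
b.SetQuantity(30)                     // marks the ball dirty again
_ = b.GetBasket()                     // basket is recalculated because the quantity changed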

calling HTTP requests in angularjs in batched form?

I have two for loops and an HTTP call inside them.
for (i = 0; i < m; i++) {
  for (j = 0; j < n; j++) {
    $http call that uses i and j as GET parameters
      .success(//something)
      .error(//something more)
  }
}
The problem with this is that it makes around 200-250 AJAX calls, depending on the values of m and n. This causes the browser to crash when the page is accessed from mobile.
I would like to know if there is a way to call HTTP requests in batched form (n requests at a time) and once these calls are finished, move to next batch and so on.
You could always use a proper HTTP batch module like angular-http-batcher, which will take all of the requests and turn them into a single HTTP POST request before sending it to the server, reducing 250 calls to 1! The module is here https://github.com/jonsamwell/angular-http-batcher and a detailed explanation of it is here http://jonsamwell.com/batching-http-requests-in-angular/
Yes, use the async library found here: https://github.com/caolan/async
First, use the loop to create your tasks:
var tasks = []; //array to hold the tasks
for (i = 0; i < m; i++) {
  for (j = 0; j < n; j++) {
    //we add a function to the array of "tasks"
    //Async will pass that function a standard callback(error, data)
    tasks.push(function (cb) {
      //because of the way closures work, you may not be able to rely on i and j here
      //if i/j don't work here, create another closure and store them as params
      $http call that uses i and j as GET parameters
        .success(function (data) { cb(null, data); })
        .error(function (err) { cb(err); });
    });
  }
}
Now that you've got an array full of callback-ready functions that can be executed, you must use async to execute them, async has a great feature to "limit" the number of simultaneous requests and therefore "batch".
async.parallelLimit(tasks, 10, function (error, results) {
  //results is an array with each tasks results.
  //Don't forget to use $scope.$apply or $timeout to trigger a digest
});
In the above example you will run 10 tasks at a time in parallel.
Async has a ton of other amazing options as well: you can run things in series or parallel, map arrays, etc. It's worth noting that you might be able to achieve greater efficiency by using a single function and the eachLimit function of async.
The way I did it is as follows (this will help when one wants to call HTTP requests in a batch of n requests at a time )
call batchedHTTP(with i=0);

batchedHTTP = function () {
  /* check for terminating condition (in this case, i=m) */
  for (j = 0; j < n; j++) {
    var promise = $http call with i and j GET parameters
      .success(// do something)
      .error(// do something else)
    promisesArray.push(promise);
  }
  $q.all(promisesArray).then(function () {
    call batchedHTTP(with i=i+1)
  });
}
