Golang garbage collector and maps - go

I'm processing some user session data inside of a goroutine and creating a map to keep track of user id -> session data inside of it. The goroutine loops through a slice and, if a SessionEnd event is found, the map key is deleted inside the same iteration. However, the deletion doesn't always seem to take effect: in the following iterations I can sometimes still retrieve some of the data, and the 'key exists' bool variable is still true. It's as if some variables haven't yet been zeroed.
Each map has only one goroutine writing/reading from it. From my understanding there shouldn't be a race condition, but it definitely seems that there is with the map and delete().
The code works fine if the garbage collector is run on every iteration. Am I using a map for the wrong purpose?
Pseudocode (a function that is run inside a single goroutine, lines is passed as a variable):
active := make(ActiveSessions) // map[int]UserSession
for _, l := range lines { // lines is a slice of a parsed log
    u = l.EventData.(parser.User)
    s, exists = active[u.SessionID]
    switch l.Event {
    // Contains cases which can check if exists is true or false
    // errors if contains an event that can't happen,
    // for example UserDisconnect before UserConnect,
    // or UserConnect while a session is already active
    case "UserConnect":
        if exists {
            // error, can't occur
            // The same session id can occur in the log after a prior session has completed,
            // which is exactly when the problems occur
        }
    case "UserDisconnect":
        sessionFinished = true
    }
    // ...
    if sessionFinished {
        // <add session to finished sessions>
        delete(active, u.SessionID)
        // Code works only if runtime.GC() is executed here, could just be a coincidence
    }
}
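For comparison, here is a minimal runnable sketch of the loop described above. The Event and UserSession types and the process function are hypothetical stand-ins for the parser types in the question, not the real code. Within a single goroutine, delete removes the key immediately and does not depend on the garbage collector. The sketch also declares sessionFinished inside the loop so that a stale flag cannot carry over into the next iteration, which is one thing worth ruling out in the real code.

```go
package main

import "fmt"

// Event and UserSession are hypothetical stand-ins for the parsed log types.
type Event struct {
	Name      string
	SessionID int
}

type UserSession struct{ ID int }

// process walks the log once and returns the finished sessions plus any
// sequencing errors (connect while active, disconnect while inactive).
func process(lines []Event) (finished []UserSession, errs []error) {
	active := make(map[int]UserSession)
	for _, l := range lines {
		s, exists := active[l.SessionID]
		sessionFinished := false // reset on every iteration
		switch l.Name {
		case "UserConnect":
			if exists {
				errs = append(errs, fmt.Errorf("session %d is already active", l.SessionID))
				continue
			}
			active[l.SessionID] = UserSession{ID: l.SessionID}
		case "UserDisconnect":
			if !exists {
				errs = append(errs, fmt.Errorf("session %d is not active", l.SessionID))
				continue
			}
			sessionFinished = true
		}
		if sessionFinished {
			finished = append(finished, s)
			delete(active, l.SessionID) // takes effect immediately, no GC involved
		}
	}
	return finished, errs
}

func main() {
	lines := []Event{
		{"UserConnect", 1}, {"UserDisconnect", 1},
		{"UserConnect", 1}, {"UserDisconnect", 1}, // same ID reused after completion
	}
	finished, errs := process(lines)
	fmt.Println(len(finished), len(errs)) // prints "2 0"
}
```

If a loop shaped like this still reports a key as present right after delete, the cause is more likely in state kept outside the loop (or in another goroutine touching the map) than in the map itself.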

Related

Cypress: global variables in callbacks

I am trying to set up a new Cypress framework and I hit a point where I need help.
The scenario I am trying to work out: a page that is calling the same endpoint after every change and I use an interceptor to wait for the call. As we all know you can’t use the same intercept name for the same request multiple times so I did a trick here. I used a dynamically named alias, which works.
var counters = {}
function registerIntercept(method, url, name) {
    counters[name] = 1;
    cy.intercept(method, url, (req) => {
        var currentCounter = counters[name]++;
        cy.wrap(counters).as(counters)
        req.alias = name + (currentCounter);
    })
}
function waitForCall(call) {
    waitForCall_(call + (counters[call.substr(1)]));
    // HERE counters[any_previously_added_key] is always 1, even though the counters entry in registerIntercept is bigger
    // I suspect counters here is not using the same counters value as registerIntercept
}
function waitForCall_(call) {
    cy.wait(call)
    cy.wait(100)
}
It is meant to be called as waitForCall("#callalias"), which is then converted to #callalias1, #callalias2 and so on.
The problem is that the global counters works in registerIntercept, but I can't get its values from waitForCall. It always retrieves the value 1 for a specific key, while the same key in registerIntercept is already at a bigger number.
This code works if you call waitForCall_("#callalias4") directly, for example to wait for the 4th request. Each intercept gets an alias ending with an incremented number. But I don't want to keep track of how many calls were made; I want the code to retrieve that from counters and build the wait.
Any idea why counters in waitForCall does not have the same values for its keys as it has in registerIntercept?

Synchronized conditional statement in Go

I have a method that may be used in multiple goroutines and run concurrently.
Inside this method, I have a conditional statement. If the conditional statement is true, I want all other goroutines calling this method to wait for one and only one of the goroutines to execute this conditional statement before proceeding to the next section.
For example:
type SomeClass struct {
    mu sync.Mutex
}
func (c *SomeClass) SomeFunc() {
    // Do some calculation
    if condition {
        // This part should be executed by only one goroutine if the condition is true.
        // All others must wait for this to finish
    }
    // Additional calculations
}
And I want to use it like this:
func main() {
    // initialize
    go someClass.SomeFunc()
    // If the condition is true, the following will wait at the conditional statement until the first one finishes the code inside the conditional block
    // Once it's done, they can run concurrently
    go someClass.SomeFunc()
    go someClass.SomeFunc()
}
Edit
This is perhaps not the right design for this so I'm looking for any suggestions on how to implement this.
Edit2:
Note that each goroutine has its own condition; its value is not shared between goroutines. However, the work inside the conditional block should run only once when the condition happens to be true in two or more goroutines at the same time.
You'll want a mutex protecting the condition from concurrent read/writes, and then a method for resetting the condition when you wish to execute the synchronous code again.
type SomeClass struct {
    conditionMu sync.Mutex
    condition   bool
}
func (c *SomeClass) SomeFunc() {
    // Lock the mutex, so that concurrent calls to SomeFunc will wait here.
    c.conditionMu.Lock()
    if c.condition {
        // Synchronous code goes here.
        // Reset the condition to false so that any waiting goroutines won't run the code inside this block again.
        c.condition = false
    }
    // Unlock the mutex, and any waiting goroutines.
    c.conditionMu.Unlock()
}
// ResetCondition sets the stored condition to true in a thread-safe manner.
func (c *SomeClass) ResetCondition() {
    c.conditionMu.Lock()
    c.condition = true
    c.conditionMu.Unlock()
}
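To see the behaviour of this approach, here is a small self-contained sketch that calls SomeFunc from several goroutines and counts how often the guarded block runs. The runs field and the runConcurrently helper are additions made only for this illustration; they are not part of the answer's code.

```go
package main

import (
	"fmt"
	"sync"
)

type SomeClass struct {
	conditionMu sync.Mutex
	condition   bool
	runs        int // hypothetical counter, only here to observe the behaviour
}

func (c *SomeClass) SomeFunc() {
	c.conditionMu.Lock()
	if c.condition {
		c.runs++ // stands in for the synchronous work
		c.condition = false
	}
	c.conditionMu.Unlock()
}

// runConcurrently calls SomeFunc from n goroutines and reports how often
// the guarded block actually ran.
func runConcurrently(n int) int {
	c := &SomeClass{condition: true}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.SomeFunc()
		}()
	}
	wg.Wait() // all goroutines are done, so reading c.runs is safe
	return c.runs
}

func main() {
	fmt.Println(runConcurrently(10)) // prints 1
}
```

Note that every caller takes the lock here, even when the condition is false, so all calls are serialized at that point.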
The other answers to this question are incorrect because they do not satisfy its requirements.
If the lock is placed outside the conditional statement, it acts as a barrier and forces all goroutines to synchronize at that spot. That is not the point of this question. Suppose resolving the condition value takes a long time: we do not want to check the value one goroutine at a time. We want every goroutine to check the condition at once, so that if the condition is false it can move forward without stopping.
We want to ensure that the goroutines run in parallel if the condition is not true. Adding a lock inside the method but outside the conditional statement does not allow that.
The following solutions are correct, passed all tests, and performed well.
Solution 1:
Use two nested conditional statements, as in the code that follows. Note that if the condition is false, no lock is taken and no synchronization is needed, so everything can run in parallel.
type SomeClass struct {
    conditionMu            sync.Mutex
    rwMu                   sync.RWMutex
    additionalWorkRequired bool
}
func (c *SomeClass) SomeFunc() {
    // Do some work ...
    // Note: The condition is not shared; some goroutines can have false and some true at the same time, which is fine.
    condition := true
    // All goroutines can check this condition and enter the block if it is true
    if condition {
        c.rwMu.Lock()
        c.additionalWorkRequired = true
        c.rwMu.Unlock()
        // Lock so other goroutines wait here for the first one
        c.conditionMu.Lock()
        if c.additionalWorkRequired {
            // Synchronous code goes here.
            c.additionalWorkRequired = false
        }
        // Unlock so all other goroutines can move forward in parallel
        c.conditionMu.Unlock()
    }
    // Finish up the remaining work
}
Solution 2:
Use the Do method of singleflight.Group from the golang.org/x/sync/singleflight package, which can handle this situation automatically.
From documentation:
Do executes and returns the results of the given function, making sure that only one execution is in-flight for a given key at a time. If a duplicate comes in, the duplicate caller waits for the original to complete and receives the same results. The return value shared indicates whether v was given to multiple callers.
Edit:
Since many seem to be confused by this question and answer, I'm adding a use case which might make things more clear:
1. Send an HTTP request
2. If the server returns an error saying the credentials are incorrect (this is the condition):
2.1. Save current credentials in a local variable
2.2. Acquire the mutex lock
2.2.1. Compare the shared credentials with the ones in the local variable(This is the second condition)
If they are the same, then replace them with new ones
2.3. Unlock
2.4. Retry request
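The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Client type, the renewIfStale method, the simulate helper, and the "token-N" strings are all hypothetical, and the HTTP request itself is left out. All goroutines are assumed to have sent their request with the same, already expired credentials.

```go
package main

import (
	"fmt"
	"sync"
)

// Client is a hypothetical type holding the shared credentials.
type Client struct {
	mu        sync.Mutex
	creds     string
	refreshes int // how many times new credentials were obtained
}

// current returns the shared credentials under the lock.
func (c *Client) current() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.creds
}

// renewIfStale replaces the credentials only if they still equal the ones
// the caller saw fail (steps 2.2 to 2.3).
func (c *Client) renewIfStale(failed string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.creds == failed { // the second condition
		c.refreshes++
		c.creds = fmt.Sprintf("token-%d", c.refreshes)
	}
}

// simulate lets n goroutines react to the same rejected credentials.
func simulate(n int) (int, string) {
	c := &Client{creds: "expired"}
	used := c.current() // 2.1: the credentials every request was sent with
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The server rejected `used`; renew unless someone already did.
			c.renewIfStale(used)
			// 2.4: the retry would use c.current() here.
		}()
	}
	wg.Wait()
	return c.refreshes, c.current()
}

func main() {
	fmt.Println(simulate(8)) // prints "1 token-1"
}
```

Goroutines whose request succeeds never touch the mutex, which is the property the question asks for.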

Use a single mutex across multiple goroutines

I'm trying to reduce the amount of http requests my discord bot is making.
It's reading from an API.
With the fetched data it updates an internal database and outputs the changes.
Thing is: that database is different for every server the bot is in, and that's where I'm using the goroutines. But some servers need to fetch the same data, and this is where I want to reduce the HTTP requests. Right now I'm making requests regardless of whether I've already fetched a character. I want to create some sort of data structure that can be shared between the goroutines and searched before making a request.
I was advised to use mutex. I'm trying. Original question: Working with unbuffered channels in golang
I made a skeleton of the real code I've tried: https://play.golang.org/p/mt229ns1R8m
In this example master := make([][]map[string]interface{}, 0) is simulating the discord servers.
Chars and Chars2 would be the tracked chars for each individual server.
The char "Test" is mutual to both of them, so it should be fetched from the API only once.
It's outputting this:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test]]Cache: [map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
I was expecting the output to be:
[[map[Level:15 Name:Test] map[Level:150 Name:Test2]] [map[Level:1500 Name:Test3] map[Level:15 Name:Test]]]
------
A call would be made
A call would be made
A call would be made
Cache: [map[Level:150 Name:Test2] map[Level:15 Name:Test] map[Level:1500 Name:Test3]]Done
But a new cache is being generated by every goroutine. How can I fix this?
Thanks.
There are too many unknowns here for me to really write a proper design, but let's make a few notes:
Try not to use interface{} at all, if at all possible. In this case, it seems that it must be possible, though I'm not sure what the actual types will be.
Try to make your data as simple as possible, but no simpler. In this case, that probably means: have one data structure for "thing that talks to a Discord server" and a separate one for "thing that talks to the local database" (is this a caching database? if so, what are the criteria for invalidating a cache entry?). But if one "character" (whatever that is—apparently a string) can have different properties per Discord server, that means that your index into your local database is not just a character, but rather a pair of values: the string value itself plus a Discord-server-identifier.
This might give you a functional interface like this:
var cacheServer *CacheServer
func InitCacheServer() error {
    cacheServer = ... // whatever it takes to initialize the cache server
    return nil
}
(I've assumed lazy initialization of the cache server. If you can do up-front initialization, you can drop the next test below. Replace ValueType with the type of the result of a cached lookup of a name.)
func (ds DiscordServer) Get(name string) (ValueType, error) {
    if cacheServer == nil {
        if err := InitCacheServer(); err != nil {
            return nil, err
        }
    }
    // Do a cache lookup. Tell the cache server that if there
    // is no entry, it should return a NoEntry error and we will
    // fill the cache ourselves, so it should hold this slot as
    // "will be filled, so wait for it".
    slot, v, err := cacheServer.Lookup(name, ds.identity, IntentToFill)
    if err == NoEntry {
        // We have the slot held. Try to look up the right info
        // directly in the Discord server, then cache it.
        v, err = ds.UncachedGet(name)
        // Tell cache server that this is the value, or that it should
        // produce this error instead of NoEntry.
        cacheServer.FillSlot(slot, v, err)
    }
    return v, err
}
You might only want to cache some error types, rather than all; that's another one of those design questions that needs an answer that I cannot provide here. There are other ways to do this that don't necessarily need a slot pointer return value, too; I've just chosen this one for this example.
Note that most of the "hard work" is now in the cache server, which definitely requires some fancy footwork. In particular you will want to lock the overall data structure for a little while, use that to find the correct slot, then hold the slot itself so that other users of the slot must wait, while releasing the overall lock so that other users of other entries need not wait. This introduces locking order constraints: be careful to avoid deadlock. One method that should work is:
type CacheServer struct {
    lock sync.Mutex
    data map[string]map[string]*Entry
    // more fields
}
type Entry struct {
    lock        sync.Mutex
    cachedValue ValueType
    cachedError error
}
(You'll need some more types, like Intent—just two enumerated integers for now—below, and probably more fields in the above; this is just a skeleton.)
func (cs *CacheServer) Lookup(name, srv string, flags Intent) (*Entry, ValueType, error) {
    cs.lock.Lock()
    // first, look up the server - if it does not exist, create one
    smap := cs.data[srv]
    if smap == nil {
        smap = make(map[string]*Entry)
        cs.data[srv] = smap
    }
    entry := smap[name]
    if entry == nil {
        // no cached entry - if this is a pure lookup, just error,
        // but if not, make a locked entry
        if flags == IntentToFill {
            // make a new entry and return with it locked
            entry = &Entry{}
            smap[name] = entry
            entry.lock.Lock() // and do not unlock
        }
        cs.lock.Unlock()
        return entry, nil, NoEntry
    }
    // release the overall lock before waiting on the entry, so that
    // users of other entries need not wait
    cs.lock.Unlock()
    entry.lock.Lock() // wait for someone to fill it, if needed
    defer entry.lock.Unlock()
    return nil, entry.cachedValue, entry.cachedError
}
You need a routine to fill and release the entry as well, but it's pretty simple. You could, if you choose, make this a method on the Entry type rather than on the CacheServer type, as at least in this particular prototype, there is no need to use the cache server data structures directly. If you start getting fancier with cache invalidation, though, it might be nice to have access to the CacheServer object.
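As an illustration of that fill routine, here is a compact, runnable sketch of the whole idea. It makes several simplifying assumptions that are not part of the skeleton above: ValueType is taken to be int, the Intent argument is dropped (every miss is treated as intent-to-fill), the fill routine is a Fill method on Entry, and the fetchAll helper with its "server-1" identifier exists only to exercise the code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type ValueType = int // assumption for this sketch

var ErrNoEntry = errors.New("no cache entry")

type Entry struct {
	lock        sync.Mutex
	cachedValue ValueType
	cachedError error
}

// Fill stores the result and releases the entry that Lookup returned locked.
func (e *Entry) Fill(v ValueType, err error) {
	e.cachedValue, e.cachedError = v, err
	e.lock.Unlock()
}

type CacheServer struct {
	lock sync.Mutex
	data map[string]map[string]*Entry
}

// Lookup returns a locked, empty entry plus ErrNoEntry on a miss. On a hit it
// waits until the entry has been filled and returns the cached result.
func (cs *CacheServer) Lookup(name, srv string) (*Entry, ValueType, error) {
	cs.lock.Lock()
	smap := cs.data[srv]
	if smap == nil {
		smap = make(map[string]*Entry)
		cs.data[srv] = smap
	}
	entry := smap[name]
	if entry == nil {
		entry = &Entry{}
		entry.lock.Lock() // held until Fill is called
		smap[name] = entry
		cs.lock.Unlock()
		return entry, 0, ErrNoEntry
	}
	cs.lock.Unlock()
	entry.lock.Lock() // blocks while another goroutine is filling
	defer entry.lock.Unlock()
	return nil, entry.cachedValue, entry.cachedError
}

// fetchAll looks up each name from its own goroutine and returns how many
// upstream calls were needed.
func fetchAll(names []string) int {
	cs := &CacheServer{data: make(map[string]map[string]*Entry)}
	var (
		calls   int
		callsMu sync.Mutex
		wg      sync.WaitGroup
	)
	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			slot, _, err := cs.Lookup(name, "server-1")
			if errors.Is(err, ErrNoEntry) {
				callsMu.Lock()
				calls++ // stands in for the real API request
				callsMu.Unlock()
				slot.Fill(len(name), nil)
			}
		}(name)
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println(fetchAll([]string{"Test", "Test2", "Test3", "Test"})) // prints 3
}
```

Because the entry is created and locked while the overall lock is held, exactly one goroutine per key sees the miss; the others block on the entry until Fill releases it.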
Note: I've designed this so that you can do a cache lookup without an intent-to-fill, if that's useful. If not, there's no reason to bother with the Intent argument.

Is * operator of std::shared_ptr thread safe?

I have a std::shared_ptr which changes asynchronously from a callback.
In main thread, I want to read the "latest" value and do complex calculations on it, and I do not care if the pointer's value changes while those calculations are running.
For this, I am simply making a copy of the contained value on the main thread:
// async thread
void callback(P new_data) {
smart_pointer_ = new_data;
}
// main thread loop!
Value copy_of_pointer_value = *smart_pointer_; // smart_pointer_ could be changing in callback right now
// do calcs with copy_of_pointer_value
Is this safe or should I be explicitly making a copy of the smart pointer before trying to read its value, like this:
// main thread loop!
auto smart_copy = smart_pointer_;
// I know I could work with *smart_copy directly, but I need to copy anyway for other reasons
Value copy_of_pointer_value = *smart_copy;
// do calcs with copy_of_pointer_value

Ensuring a value is retrieved only once

I'm developing a Go package to access a web service (via HTTP). Every time I retrieve a page of data from that service, I also get the total of pages available. The only way to get this total is by getting one of the pages (usually the first one). However, requests to this service take time and I need to do the following:
When the GetPage method is called on a Client and the page is retrieved for the first time, the retrieved total should be stored somewhere in that client. When the Total method is called and the total hasn't yet been retrieved, the first page should be fetched and the total returned. If the total was retrieved before, either by a call to GetPage or Total, it should be returned immediately, without any HTTP requests at all. This needs to be safe for use by multiple goroutines. My idea is something along the lines of sync.Once but with the function passed to Do returning a value, which is then cached and automatically returned whenever Do is called.
I remember seeing something like this before, but I can't find it now even though I tried. Searching for sync.Once with value and similar terms didn't yield any useful results. I know I could probably do that with a mutex and a lot of locking, but mutexes and a lot of locking don't seem to be the recommended way to do stuff in go.
General "init-once" solution
In the general / usual case, the easiest solution to only init once, only when it's actually needed is to use sync.Once and its Once.Do() method.
You don't actually need to return any value from the function passed to Once.Do(), because you can store values to e.g. global variables in that function.
See this simple example:
package main

import (
    "fmt"
    "sync"
)

var (
    total         int
    calcTotalOnce sync.Once
)
func GetTotal() int {
    // Init / calc total once:
    calcTotalOnce.Do(func() {
        fmt.Println("Fetching total...")
        // Do some heavy work, make HTTP calls, whatever you want:
        total++ // This will set total to 1 (once and for all)
    })
    // Here you can safely use total:
    return total
}
func main() {
    fmt.Println(GetTotal())
    fmt.Println(GetTotal())
}
Output of the above (try it on the Go Playground):
Fetching total...
1
1
Some notes:
You can achieve the same using a mutex or sync.Once, but the latter is actually faster than using a mutex.
If GetTotal() has been called before, subsequent calls to GetTotal() do nothing but return the previously calculated value; this is what Once.Do() ensures. sync.Once "tracks" whether its Do() method has been called before, and if so, the passed function value will not be called again.
sync.Once provides everything needed for this solution to be safe for concurrent use from multiple goroutines, given that you don't modify or access the total variable directly from anywhere else.
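If you can use Go 1.21 or newer, the standard library also has sync.OnceValue, which is close to the "sync.Once with a return value" asked about in the question. A minimal sketch, where the fetches counter and the value 42 are placeholders for illustration only:

```go
package main

import (
	"fmt"
	"sync"
)

var fetches int // counts how often the expensive function actually ran

// getTotal runs the wrapped function on the first call and returns the
// cached result on every later call. Requires Go 1.21 or newer.
var getTotal = sync.OnceValue(func() int {
	fetches++
	// Do the heavy work here (e.g. the HTTP call); 42 is a placeholder.
	return 42
})

func main() {
	a, b := getTotal(), getTotal()
	fmt.Println(a, b, fetches) // prints "42 42 1"
}
```

There is also sync.OnceValues for functions that return two values, such as a result and an error.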
Solution to your "unusual" case
The general case assumes the total is only accessed via the GetTotal() function.
In your case this does not hold: you want to access it via the GetTotal() function and you want to set it after a GetPage() call (if it has not yet been set).
We may solve this with sync.Once too. We would need the above GetTotal() function; and when a GetPage() call is performed, it may use the same calcTotalOnce to attempt to set its value from the received page.
It could look something like this:
var (
    total         int
    calcTotalOnce sync.Once
)
func GetTotal() int {
    calcTotalOnce.Do(func() {
        // total is not yet initialized: get page and store total number
        page := getPageImpl()
        total = page.Total
    })
    // Here you can safely use total:
    return total
}
type Page struct {
    Total int
}
func GetPage() *Page {
    page := getPageImpl()
    calcTotalOnce.Do(func() {
        // total is not yet initialized, store the value we have:
        total = page.Total
    })
    return page
}
func getPageImpl() *Page {
    // Do HTTP call or whatever
    page := &Page{}
    // Set page.Total from the response body
    return page
}
How does this work? We create and use a single sync.Once in the variable calcTotalOnce. This ensures that its Do() method may only call the function passed to it once, no matter where / how this Do() method is called.
If someone calls the GetTotal() function first, then the function literal inside it will run, which calls getPageImpl() to fetch the page and initialize the total variable from the Page.Total field.
If the GetPage() function is called first, it also calls calcTotalOnce.Do(), which simply assigns the Page.Total value to the total variable.
Whichever route is walked first, that will alter the internal state of calcTotalOnce, which will remember the total calculation has already been run, and further calls to calcTotalOnce.Do() will never call the function value passed to it.
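Since the question asks for the total to be stored in the client, the same idea can be moved from package-level variables into a struct. The following is a sketch under assumptions: the Client type, its fetchPage method, the fetches counter, and the fixed total of 7 are all made up for illustration, and fetchPage stands in for the real HTTP request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Page struct{ Total int }

// Client keeps the once-guard and the cached total per instance.
type Client struct {
	once    sync.Once
	total   int
	fetches atomic.Int64 // hypothetical counter to make the behaviour observable
}

// fetchPage stands in for the HTTP request.
func (c *Client) fetchPage() *Page {
	c.fetches.Add(1)
	return &Page{Total: 7}
}

// GetPage always fetches a page and records the total if nobody has yet.
func (c *Client) GetPage() *Page {
	page := c.fetchPage()
	c.once.Do(func() { c.total = page.Total })
	return page
}

// Total fetches a page only if the total is not known yet.
func (c *Client) Total() int {
	c.once.Do(func() { c.total = c.fetchPage().Total })
	return c.total
}

func main() {
	c := &Client{}
	c.GetPage()
	fmt.Println(c.Total(), c.Total(), c.fetches.Load()) // prints "7 7 1"
}
```

One caveat of any sync.Once based variant: if the first fetch fails, the Once is still marked as done, so a real implementation would need to decide how to handle or retry errors.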
Or just use "eager" initialization
Also note that if this total is likely to be needed at some point during the lifetime of your program, the above complexity might not be worth it, as you may just as easily initialize the variable once, when it's created.
var Total = getPageImpl().Total
Or if the initialization is a little more complex (e.g. needs error handling), use a package init() function:
var Total int
func init() {
    page := getPageImpl()
    // Other logic, e.g. error handling
    Total = page.Total
}
