How to keep track of the count of instances of a type? - go

In object-oriented languages I use class variables to track how many instances are currently alive, incrementing on construction and decrementing on destruction.
I tried to implement similar behaviour in Go:
package entity

type Entity struct {
	Name string
}

var counter int

func New(name string) Entity {
	entity := Entity{name}
	counter++
	return entity
}

func (e *Entity) Count() int {
	return counter
}
That only works halfway, as I cannot decrement the counter without a destructor.
Can I somehow mimic object destruction?
How would I keep track of the instance count correctly?

You can use runtime.SetFinalizer like this. See here for a playground version.
package main

import (
	"fmt"
	"runtime"
)

type Entity struct {
	Name string
}

var counter int

func New(name string) Entity {
	entity := Entity{name}
	counter++
	// The finalizer is attached to the local entity; it runs some time
	// after that value becomes unreachable.
	runtime.SetFinalizer(&entity, func(_ *Entity) {
		counter--
	})
	return entity
}

func (e *Entity) Count() int {
	return counter
}

func main() {
	e := New("Sausage")
	fmt.Println("Entities", counter, e)
	e = New("Potato")
	fmt.Println("Entities", counter, e)
	runtime.GC()
	fmt.Println("Entities", counter)
	e = New("Leek")
	fmt.Println("Entities", counter)
	runtime.GC()
	fmt.Println("Entities", counter)
}
This prints
Entities 1 {Sausage}
Entities 2 {Potato}
Entities 0
Entities 1
Entities 0
Note this from the docs about gotchas with finalizers:
The finalizer for x is scheduled to run at some arbitrary time after x
becomes unreachable. There is no guarantee that finalizers will run
before a program exits, so typically they are useful only for
releasing non-memory resources associated with an object during a
long-running program.

There was a discussion on golang-nuts about finalizers.
For now:
- there is no finalizer function (edit: no reliable finalizer function, as was shown to me by Nick)
- the GC doesn't use and doesn't maintain any reference count
So you have to manage your instance count yourself.
Usually, you don't have instances living on their own, so for many practical uses (not including the profiling of a complex, hard-to-understand program) you can use defer to track the end of life of your variables. I won't pretend this really replaces finalizers, but it's simple and often sufficient.
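For illustration, here is a minimal sketch of that defer approach, assuming instances live no longer than the function that creates them; release is a hypothetical stand-in for a destructor, not part of the question's package:

package main

import "fmt"

var counter int

type Entity struct {
	Name string
}

func New(name string) Entity {
	counter++
	return Entity{name}
}

// release is a hypothetical stand-in for a destructor.
func release(*Entity) { counter-- }

func work() {
	e := New("Sausage")
	defer release(&e)                 // runs when work returns, like a scoped destructor
	fmt.Println("Entities:", counter) // 1
}

func main() {
	work()
	fmt.Println("Entities:", counter) // 0
}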

Related

Lock slice before reading and modifying it

My experience with Go is recent, and while reviewing some code I have seen that although writes are protected, there is a problem with reading the data. Not with the reading itself, but with modifications that can occur between reading the slice and modifying it.
type ConcurrentSlice struct {
	sync.RWMutex
	items []Item
}

type Item struct {
	Index int
	Value Info
}

type Info struct {
	Name    string
	Labels  map[string]string
	Failure bool
}
As mentioned, the writing is protected in this way:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
	found := false
	i := 0
	for inList := range cs.Iter() {
		if item.Name == inList.Value.Name {
			cs.items[i] = item
			found = true
		}
		i++
	}
	if !found {
		cs.Lock()
		defer cs.Unlock()
		cs.items = append(cs.items, item)
	}
}
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
	c := make(chan ConcurrentSliceItem)
	f := func() {
		cs.Lock()
		defer cs.Unlock()
		for index, value := range cs.items {
			c <- ConcurrentSliceItem{index, value}
		}
		close(c)
	}
	go f()
	return c
}
But between collecting the contents of the slice and modifying it, modifications can occur. Another goroutine may modify the same slice, and by the time a value is assigned, the target index may no longer exist: slice[i] = item.
What would be the right way to deal with this?
I have implemented this method:
var denylist *ConcurrentSlice

func GetList() *ConcurrentSlice {
	if denylist == nil {
		denylist = NewConcurrentSlice()
	}
	return denylist
}
And I use it like this:
concurrentSlice := GetList()
concurrentSlice.UpdateOrAppend(item)
But I understand that between the get and the modification, even if it is practically immediate, another goroutine may have modified the slice. What would be the correct way to perform the two operations atomically, so that the slice I read is exactly the one I modify? If I try to assign an item to an index that no longer exists, it will break the execution.
Thank you in advance!
The way you are doing the locking is incorrect, because it does not ensure that the items you iterated over have not been removed in the meantime. (In the case of an update, the slice would at least still be the same length.)
A simpler solution that works could be the following:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
	found := false
	cs.Lock()
	defer cs.Unlock()
	for i, it := range cs.items {
		if item.Name == it.Name {
			cs.items[i] = item // store the new item, not the old one
			found = true
		}
	}
	if !found {
		cs.items = append(cs.items, item)
	}
}
Use a sync.Map if the order of the values is not important.
type Items struct {
	m sync.Map
}

func (items *Items) Update(item Info) {
	items.m.Store(item.Name, item)
}

func (items *Items) Range(f func(Info) bool) {
	items.m.Range(func(key, value any) bool {
		return f(value.(Info))
	})
}
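For illustration, a short usage sketch of this wrapper (the values are made up, and fmt is assumed to be imported):

func printAll(items *Items) {
	items.Update(Info{Name: "a"})
	items.Update(Info{Name: "b", Failure: true})
	items.Range(func(i Info) bool {
		fmt.Println(i.Name, i.Failure)
		return true // returning false would stop the iteration
	})
}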
Data structures 101: always pick the best data structure for your use case. If you're going to be looking up objects by name, that's EXACTLY what a map is for. If you still need to maintain the order of the items, you use a treemap.
Concurrency 101: like transactions, your critical section should be atomic, consistent, and isolated. You're failing isolation here, because the data-structure read does not fall inside your mutex lock.
Your code should look something like this (sketched as real Go, assuming a hypothetical Store type backed by a plain map keyed by name):

func (s *Store) UpdateOrAppend(item Info) {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Check the map for the name: with a map, update and add
	// are the same assignment.
	s.byName[item.Name] = item
}
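The standard library has no treemap; as a hedged stand-in (all names here are made up), insertion order can be kept with a map for lookups plus a slice of keys for order:

type OrderedStore struct {
	mu     sync.Mutex
	byName map[string]Info
	order  []string // names in insertion order
}

func (s *OrderedStore) UpdateOrAppend(item Info) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.byName[item.Name]; !ok {
		s.order = append(s.order, item.Name) // first time: remember the position
	}
	s.byName[item.Name] = item // update or add
}

(A real treemap keeps keys sorted; this sketch keeps insertion order, which is what the original slice provided.)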
After some tests, I can say that the situation you fear can indeed happen with sync.RWMutex. I think it could happen with sync.Mutex too, but I can't reproduce it. Maybe I'm missing some information, or maybe the calls happen in order because they are all blocked and acquire the lock in some defined order.
One way to keep your two calls safe from other goroutines would be to use another mutex for every task on that object: lock it before your read and write, and release it when you're done. You would also have to use that mutex in any other call that writes to (or reads) that object. You can find an implementation of what I'm talking about here in the main.go file. To reproduce the issue with RWMutex, simply comment out the startTask and endTask calls, and the issue is visible in the terminal output.
EDIT : my first answer was wrong as I misinterpreted a test result, and fell in the situation described by OP.
tl;dr:
If ConcurrentSlice is to be used from a single goroutine, the locks are unnecessary, because the way the algorithm is written there cannot be any concurrent reads/writes to slice elements or to the slice itself.
If ConcurrentSlice is to be used from multiple goroutines, the existing locks are not sufficient, because UpdateOrAppend may modify slice elements concurrently.
A safe version would need two versions of Iter:
This one can be called by users of ConcurrentSlice, but it cannot be called from UpdateOrAppend:
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
	c := make(chan ConcurrentSliceItem)
	f := func() {
		cs.RLock()
		defer cs.RUnlock()
		for index, value := range cs.items {
			c <- ConcurrentSliceItem{index, value}
		}
		close(c)
	}
	go f()
	return c
}
and this is only to be called from UpdateOrAppend:
func (cs *ConcurrentSlice) internalIter() <-chan ConcurrentSliceItem {
	c := make(chan ConcurrentSliceItem)
	f := func() {
		// No locking
		for index, value := range cs.items {
			c <- ConcurrentSliceItem{index, value}
		}
		close(c)
	}
	go f()
	return c
}
And UpdateOrAppend should be synchronized at the top level:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
	cs.Lock()
	defer cs.Unlock()
	// ...
}
Here's the long version:
This is an interesting piece of code. Based on my understanding of the go memory model, the mutex lock in Iter() is only necessary if there is another goroutine working on this code, and even with that, there is a possible race in the code. However, UpdateOrAppend only modifies elements of the slice with lower indexes than what Iter is working on, so that race never manifests itself.
The race can happen as follows:
1. The for-loop in Iter reads element 0 of the slice.
2. The element is sent through the channel. Thus, the channel receive happens after the read in step 1.
3. The receiving end potentially updates element 0 of the slice. There is no problem up to here.
4. The sending goroutine then reads element 1 of the slice. This is when a race can happen: if step 3 updated index 1 instead, the read in step 4 races with that write. You can see this if you start with i := 1 in UpdateOrAppend and run with the -race flag.
But UpdateOrAppend always modifies slice elements that Iter has already seen when i starts at 0, so this code is safe, even without the lock.
If there will be other goroutines accessing and modifying the structure, you need the mutex, but you need it to protect the complete UpdateOrAppend method, because only one goroutine should be allowed to run it at a time. You need the mutex to protect the potential updates in the first for-loop, and it must also cover the append case, because the append may actually modify the slice header of the underlying object.
If Iter is only called from UpdateOrAppend, then this single mutex is sufficient. If, however, Iter can be called from multiple goroutines, there is another race possibility: if one UpdateOrAppend runs concurrently with multiple Iter instances, some of those Iter instances will read the modified slice elements concurrently, causing a race. So multiple Iters should only run when there are no UpdateOrAppend calls in flight; that is an RWMutex.
But Iter can be called from UpdateOrAppend while the write lock is held, so it cannot call RLock, otherwise it would deadlock.
Thus, you need two versions of Iter: one that can be called outside UpdateOrAppend and issues RLock in the goroutine, and another that can only be called from UpdateOrAppend and does not lock.

Preventing structs from being collected while passing their pointers to C calls

I'm writing a program which needs to call a number of C functions. These functions require pointers to structs which I allocate. I know I can't simply convert the pointers to uintptrs and pass those, because a uintptr is just an integer to the GC, which may then take my structs away.
So, in order to keep GC from collecting my structs, I came up with this:
type Person struct {
	name  string
	thing uint32
	other uint32
}

// Keeps pointers from being collected.
// Note this is a global variable.
var people map[*Person]struct{}

// Creates one or more structs.
func createPeople() (p1, p2 *Person) {
	p1 = &Person{name: "John"}
	people[p1] = struct{}{} // keep
	p2 = &Person{name: "Mary"}
	people[p2] = struct{}{} // keep
	return
}

// Cleanup.
func deletePeople() {
	for k := range people {
		delete(people, k)
	}
}

func main() {
	people = make(map[*Person]struct{})
	p1, p2 := createPeople()
	_ = p2 // unused in this reduced example
	// Lots of calls to C functions passing people pointers.
	rawP1 := uintptr(unsafe.Pointer(p1))
	ffiCall(rawP1)
	deletePeople() // mandatory?
}
Will the solution above work?
Is there any risk of the structs being moved, thus messing up the pointers?
What if I don't call deletePeople()?
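A related tool worth knowing here is runtime.KeepAlive (Go 1.7+): it keeps a value reachable until the point of the call, which covers the duration of a C call without a global map. A minimal sketch, reusing the question's hypothetical ffiCall:

	rawP1 := uintptr(unsafe.Pointer(p1))
	ffiCall(rawP1)        // hypothetical C call from the question
	runtime.KeepAlive(p1) // p1 is guaranteed reachable until at least this point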

Is it thread safe to create a new Mutex in Go?

I have a struct in Go which contains a mutex, and I want to ensure that the mutex is never nil. To that end, I have implemented a GetMux() function, which checks whether the mutex is nil and, if it is, assigns it a value.
My question is: is the following code thread-safe? If not, what would be an idiomatic way to ensure that mux is always initialized? The only thing I can think of is a global mutex in this package, used within my GetMux() function, but perhaps there is a different approach.
package main

import (
	"sync"
)

type Counter struct {
	mux     *sync.Mutex
	counter int
}

// Is this thread safe?
func (c *Counter) GetMux() *sync.Mutex {
	if c.mux == nil {
		c.mux = &sync.Mutex{}
	}
	return c.mux
}

func (c *Counter) Inc() {
	c.GetMux().Lock()
	c.counter++
	c.GetMux().Unlock()
}

func main() {
	c := &Counter{}
	c.Inc()
}
No, it's not safe if Counter.GetMux() is called from multiple goroutines concurrently: GetMux() both reads and writes the Counter.mux field.
The general way is to use a constructor-like function that takes care of the initialization, like this:

func NewCounter() *Counter {
	return &Counter{
		mux: &sync.Mutex{},
	}
}
And of course always create counters with this NewCounter().
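For example:

	c := NewCounter()
	c.Inc() // c.mux is guaranteed non-nil here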
Another, more limited, way would be to use a non-pointer mutex value:
type Counter struct {
	mux     sync.Mutex
	counter int
}
So when you have a Counter struct value, it includes a mutex by design. But if you do this, then Counter should always be used as a pointer, and Counter struct values must not be copied (else the mutex field would also be copied; as the package doc of sync states: "Values containing the types defined in this package should not be copied.").
The obvious advantage of this is that the zero value of Counter is a valid, ready-to-use counter (something you should aim for with your custom types), and no constructor function is needed.
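A minimal runnable sketch of that zero-value approach:

package main

import (
	"fmt"
	"sync"
)

type Counter struct {
	mux     sync.Mutex
	counter int
}

func (c *Counter) Inc() {
	c.mux.Lock()
	c.counter++
	c.mux.Unlock()
}

func main() {
	var c Counter // the zero value is ready to use; no constructor needed
	c.Inc()
	fmt.Println(c.counter) // 1
}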

GoRoutines and passing struct to original context

I have a configuration that defines a number of instances (SomeConfigItems), and a thing() is created for each of them.
That thing is a struct returned by an imported package; it contains, among other things, a Price (float64) and a nested struct which maintains a map of trades.
The problem is that I can loop through thing.Streams.Trades and see all trades happening in real time from my main()'s for{} loop, but I am not able to see an updated thing.Price, even though it is set in the handler on occasion.
I am having a hard time understanding how the nested structs can contain data while Price does not update. I feel as though I am missing something about scoping, goroutines, or pointers when instantiating new objects.
Any help would be appreciated, I will continue reading in the meantime. I've reduced the code to what seems relevant.
main.go:
package main

import "thing"

var Things []thing.Handler

func main() {
	for _, name := range SomeConfigItems {
		handler, _ := thing.New(name)
		Things = append(Things, handler)
	}
	for {
		for _, t := range Things {
			// Prints 0 every iteration, even though I can see live data
			// in t.Streams.Trades.
			log.Info("Price: ", t.Price)
		}
	}
}
thing.go:
package thing

import "streams"

type Handler struct {
	Name    string
	Price   float64
	Streams streams.Streams
}

func New(name string) (h Handler, err error) {
	stream, err := streams.New(strings.ToLower(name))
	h = Handler{
		Name:    name,
		Price:   0.0,
		Streams: stream,
	}
	go h.handler()
	return h, err
}

func (bot *Handler) handler() {
	var currentPrice float64
	for {
		currentPrice = external.GetPrice(bot.Name).Price // validated that this returns a float64
		bot.Price = currentPrice                         // verified that this is updated immediately afterwards, in this context
		// Unable to see Price updated from the outer context.
	}
}
streams.go:
package streams

type Streams struct {
	Trades
}

type State struct {
	Price    string `json:"p"`
	Quantity string `json:"q"`
}

type Trades struct {
	Trades     map[float64]float64
	TradeMutex sync.Mutex
	Updates    chan State
}

func New(name string) (s Streams, err error) {
	p := newTradeStream(name)
	s = Streams{
		Trades: p,
	}
	return s, err
}

func newTradeStream(name string) (ts Trades) {
	ts = Trades{}
	ts.Trades = make(map[float64]float64, MaxDepth)
	ts.Updates = make(chan State, 500)
	// ... other watchdog code
	return ts
}
Note: I added some debug logging in multiple locations. From within the handler, the price was printed successfully, then updated, then printed successfully again, showing no gap in the setting of Price within the handler() function.
When adding the same kind of debugging to the main() for{} loop, I tried setting an incrementing counter and assigning it to thing.Price; printing thing.Price on each loop results in 0. Even if I set the price (and validate that it gets set) in the same loop, it is back to 0 on the next iteration.
This behavior is why I think I am missing something very fundamental.
In Go, arguments are passed to functions by value: what the function gets is a copy of the value, not a reference to the variable. The same is true of the function receiver, and also of the return list.
It's not the most elegant description, but for the sake of explanation, let's call this the "function wall." If the value being passed one way or the other is a pointer, the function still gets a copy, but it's a copy of a memory address, and so the pointer can be used to change the value of the variable on the other side of the wall. If it is a reference type, which uses a pointer in the implementation of the type, then again a change to the thing being pointed to can cross that wall. But otherwise the change does not cross the wall, which is one reason so many Go functions are written to return values instead of just modifying values.
Here's a runnable example:
package main

import (
	"fmt"
)

type Car struct {
	Color string
}

func (c Car) Change() { // c was passed by value; it's a copy
	c.Color = "Red"
}

func main() {
	ride := Car{"Blue"}
	ride.Change()
	fmt.Println(ride.Color)
}
This prints "Blue".
But two small changes:
func (c *Car) Change() { // here
	c.Color = "Red"
}

func main() {
	ride := &Car{"Blue"} // and here
	ride.Change()
	fmt.Println(ride.Color)
}
And now it prints "Red". A struct is not a reference type, so if you want modifications to a struct to cross the wall without using the return list, use a pointer. Of course, this only applies to values passed via argument, return list, or receiver; not to variables that are in scope on both sides of the wall, nor to modifying the underlying value behind a reference type.
See also "Pointers Versus Values" in Effective Go, and "Go Data Structures" by Russ Cox.

Finalizer statistics

Is there a way to obtain the total number of finalizers registered using runtime.SetFinalizer and which have not yet run?
We are considering adding a struct with a registered finalizer to some of our products to release memory allocated using malloc, and the object could potentially have a relatively high allocation rate. It would be nice if we could monitor the number of finalizers, to make sure that they do not pile up and trigger out-of-memory errors (like they tend to with other garbage collectors).
(I'm aware that explicit deallocation would avoid this problem, but we cannot change the existing code, which does not call a Close function or something like that.)
You can keep a count of these objects by incrementing and decrementing an unexported package variable when a new object is created and finalized, respectively.
For example:
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

var totalObjects int32

func TotalObjects() int32 {
	return atomic.LoadInt32(&totalObjects)
}

type Object struct {
	p uintptr // C-allocated pointer
}

func NewObject() *Object {
	o := &Object{}
	// TODO: perform other initializations
	atomic.AddInt32(&totalObjects, 1)
	runtime.SetFinalizer(o, (*Object).finalizer)
	return o
}

func (o *Object) finalizer() {
	atomic.AddInt32(&totalObjects, -1)
	// TODO: perform finalizations
}

func main() {
	fmt.Println("Total objects:", TotalObjects())
	for i := 0; i < 100; i++ {
		_ = NewObject()
		runtime.GC()
	}
	fmt.Println("Total objects:", TotalObjects())
}
https://play.golang.org/p/n35QABBIcj
It's possible to write a wrapper around runtime.SetFinalizer which does the counting for you. Of course, it's then a question of using it everywhere you would use SetFinalizer.
In case this is problematic, you could also modify the SetFinalizer source code directly, but that requires a modified Go toolchain.
Atomic operations are used because SetFinalizer may be called from different goroutines, and without them the counters could be inaccurate due to races. Go runs all finalizers on a single goroutine, so no atomic operation is needed inside the wrapper function itself.
https://play.golang.org/p/KKCH2UwTFYw
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"sync/atomic"
)

var finalizersCreated int64
var finalizersRan int64

func SetFinalizer(obj interface{}, finalizer interface{}) {
	finType := reflect.TypeOf(finalizer)
	funcType := reflect.FuncOf([]reflect.Type{finType.In(0)}, nil, false)
	f := reflect.MakeFunc(funcType, func(args []reflect.Value) []reflect.Value {
		finalizersRan++ // runs on the single finalizer goroutine; no atomic needed
		return reflect.ValueOf(finalizer).Call([]reflect.Value{args[0]})
	})
	runtime.SetFinalizer(obj, f.Interface())
	atomic.AddInt64(&finalizersCreated, 1)
}

func main() {
	v := "a"
	SetFinalizer(&v, func(a *string) {
		fmt.Println("Finalizer ran")
	})
	fmt.Println(finalizersRan, finalizersCreated)
	runtime.GC()
	fmt.Println(finalizersRan, finalizersCreated)
}
