I have a map where every value is a pointer to another struct that itself has a lock.
type StatMap map[string]*Stats
type Stats struct {
sync.RWMutex
someStats, someMoreStats float64
}
I have implemented a method where I pack the StatMap into another struct and have a Mutex lock for the entire map, but I am expecting to modify every entry in the map simoultaniously from hundreds of goroutines, so it would be more effective to lock every element in the map so that two or more goroutines can read and modify values for entries in parallell.
What I am wondering is how I can initialize a new entry in the map whenever there comes a new key? I cannot lock the entry if it isn't in the map already, and I cannot check if it is in the map (as far as I know) in case another goroutine is currently modifying that entry.
I do not know what keys will be in the map before runtime.
My current implementation (that causes data races):
initializeStatMap("key")
statMap["key"].Lock()
// . . .
func initializeStatMap(key string) {
if statMap[key] != nil {
return
}
statMap[key] = &Stats{someStats: 0, someMoreStats: 0}
}
The Go's map semantics are as follows:
A map stores values (not variables) and that's why these values are
not adressable, and that's why you can't do something like
type T struct {
X int
}
m := make(map[int]T)
m[0] = T{}
m[0].x = 42 // won't compile
This requirement mostly comes from from the fact a map, being
an intricate dynamic data structure, should allow its particular
implementations to physically move the values it contains
around in memory — when doing rebalancing etc.
That's why the only three operations a map supports is adding
(or replacing) of its elements, getting them back and deleting them.
A map is not safe for concurrent use, so in order to do any of those three
operations on the same map concurrently, you need to protect it in
one way or another.
Consequently, once you have read a value from a map, orchestrating
concurrent access to it is a completely another story,
and here we're facing another fact about the map's semantics:
since it keeps values and is free to copy them around in memory,
it's not allowed to keep in a map anything which you want to have
reference semantics. For instance, it would be incorrect to keep values
of your Stats type in the map directly—because they embed instances
of sync.Mutex, and copying of them prohibited after they are first used.
Here you're already doing the right thing by storing pointers to your
variables.
Now you can see that it's pretty OK to roll like this:
Access the map itself to get a value bound to a key in a concurrent-safe
way (say, by holding a lock).
Lock the mutex on that variable and operate on it. That does not
involve the map at all.
The only remaining possible problem is as follows.
Suppose you're protecting the access to your map with a lock.
So you grab the lock, obtain the value bound to a key, by copying
it to a variable, release the lock and work with the copy of the
value.
Now while you're working with the copy of that value another
goroutine is free to update the map by deleting the value or replacing it.
While in your case it's fine technically — because your map operates on
pointers to variables, and it's fine to copy pointers — this might be inappropriate from the standpoint of the semantics of your program,
and this is something you have to think through.
To make it more clear, once you've got a pointer to some instance of Stats
and locked it, a pointer to this instance can be removed from the map,
or the map entry which held it could be updated by another pointer —
pointing to another instance of Stats, so once you're done with the
instance, it might have become unreachable via the map.
Related
I am implementing a cache in Golang. Let's say the cache could be implemented as sync.Map with integer key and value as a struct:
type value struct {
fileName string
functionName string
}
Huge number of records have the same fileName and functionName. To save memory I want to use string pool. Go has immutable strings and my idea looks like:
var (
cache sync.Map
stringPool sync.Map
)
type value struct {
fileName string
functionName string
}
func addRecord(key int64, val value) {
fileName, _ := stringPool.LoadOrStore(val.fileName, val.fileName)
val.fileName = fileName.(string)
functionName, _ := stringPool.LoadOrStore(val.functionName, val.functionName)
val.functionName = functionName.(string)
cache.Store(key, val)
}
My idea is to keep every unique string (fileName and functionName) in memory once. Will it work?
Cache implementation must be concurrent safe. The number of records in the cache is about 10^8. The number of records in the string pool is about 10^6.
I have some logic that removes records from the cache. There is no problem with main cache size.
Could you please suggest how to manage string pool size?
I am thinking about storing reference count for every record in the string pool. It will require additional synchronizations or probably global locks to maintain it. I would like to implementation as simple as possible. You can see in my code snippet I don't use additional mutexes.
Or may be I need to follow completely different approach to minimize memory usage for my cache?
What you are trying to do with stringPool is commonly known as string interning. There are libraries like github.com/josharian/intern that provide "good enough" solutions to that kind of problem, and that do not require you to manually maintain the stringPool map. Note that no solution (including yours, assuming you eventually remove some elements from stringPool) can reliably deduplicate 100% of strings without incurring impractical levels of CPU overhead.
As a side note, it's worth pointing out that sync.Map is not really designed for update-heavy workloads. Depending on the keys used, you may actually experience significant contention when calling cache.Store. Furthermore, since sync.Map relies on interface{} for both keys and values, it normally incurs much more allocations that a plain map. Make sure to benchmark with realistic workloads to ensure that you picked the right approach.
I'm writing an app which should at some point get the value of a defglobal variable and change it. For this I do the following:
DATA_OBJECT cur_time_q;
if (!EnvGetDefglobalValue(CLIEnvironment, cur_timeq_kw, &cur_time_q)) return CUR_TIME_GLBVAR_MISSING;
uint64_t cur_time = t_left;
SetType(cur_time_q, INTEGER);
void* val = EnvAddLong(CLIEnvironment, cur_time);
SetValue(cur_time_q, val);
EnvSetDefglobalValue(CLIEnvironment, cur_timeq_kw, &cur_time_q);
I partly took this approach from "Advanced Programming Guide" and it works fine, but I have some questions:
Does EnvAddLong(...) add a value, which will retain in memory, until the environment is destroyed? May it consume memory and increase the execution time of other API-functions like EnvRun(...), if the function with this fragment of code is called for, say, several thousand iterations?
Isn't it overkill? Should I go for something like EnvEval("(bind ...)") instead?
There's information in the CLIPS Advanced Programming Guide on how CLIPS handles garbage collection. API calls like EnvAddLong which are used to create values to pass to other API functions don't trigger garbage collection. Generally, API calls which cause code to execute or deallocate data structures such as Run, Reset, Clear, and Eval, trigger garbage collection and will deallocate any transient data created by functions like EnvAddLong. So if your program design repeatedly assigns values to globals and then runs, any CLIPS data structures you allocate will eventually be freed once the data is confirmed to be garbage and is no longer referenced by any CLIPS data structures.
If you can easily construct a string to pass to the Eval function, it's often easier to do this rather than make multiple API calls to achieve the same result.
The API was overhauled in release 6.4, so many tasks such as assigning a value to a defglobal can be done with one step rather than several.
CLIPSValue rv;
Defglobal *global;
mainEnv = CreateEnvironment();
Build(mainEnv,"(defglobal ?*x* = 3.1)");
Eval(mainEnv,"?*x*",&rv);
printf("%lf\n",rv.floatValue->contents);
global = FindDefglobal(mainEnv,"x");
if (global != NULL)
{
DefglobalSetInteger(global,343433);
Eval(mainEnv,"(println ?*x*)",NULL);
DefglobalGetValue(global,&rv);
printf("%lf\n",rv.floatValue->contents);
}
I am trying to build a slice of pointers manually and with C.calloc() for allocating the array portion of the slice. I am able to do this successfully though when I try and add pointers that I allocate with make() some of the values (of what the pointers point to) get changed seemingly randomly. On the other hand if I C.calloc() space for the pointers I will be adding, the value are not changed. Or, if I allocate the slices with make() and the pointers I add are allocated with make() the values are not changed.
I do notice that the memory locations of the pointers when using C.calloc() vs make() are very different but I don't see why this should cause the memory to be changed randomly. I am new to Go so please forgive me if I am overlooking some very simple.
Here is the code I use for allocating my slices manually:
type caster struct {
ptr *byte;
len int64;
cap int64;
}
var temp caster;
temp.ptr=(*byte)(C.calloc(C.ulong(size),8));
temp.len=int64(size);
temp.cap=int64(size);
newTable.table=*(*[]*entry)(unsafe.Pointer(&temp));
This works if the entries I add are allocated as follows:
var temp caster;
var e []entry;
temp.ptr=(*byte)(C.calloc(C.ulong(ninserts),8));
temp.len=int64(ninserts);
temp.cap=int64(ninserts);
e=*(*[]entry)(unsafe.Pointer(&temp));
for i:=0;i<ninserts;i++ {
e[i].val=hint64(rand.Int63());
}
for i:=0;i<ninserts;i++ {
ht.insert(&e[i]);
}
though the memory of of the entries gets randomly changed if they are allocated as follows:
var e []entry = make([]entry, ninserts);
for i:=0;i<ninserts;i++ {
e[i].val=hint64(rand.Int63());
}
for i:=0;i<ninserts;i++ {
ht.insert(&e[i]);
}
Unless I build my slices normally as follows:
newTable.table = make([]*entry, size);
I am trying to build a slice of pointers manually and with C.calloc() for allocating the array portion of the slice.
This is explicitly forbidden.
To quote from the official cgo documentation:
Go is a garbage collected language, and the garbage collector needs to know the location of every pointer to Go memory. Because of this, there are restrictions on passing pointers between Go and C.
In this section the term Go pointer means a pointer to memory allocated by Go (such as by using the & operator or calling the predefined new function) and the term C pointer means a pointer to memory allocated by C (such as by a call to C.malloc). Whether a pointer is a Go pointer or a C pointer is a dynamic property determined by how the memory was allocated; it has nothing to do with the type of the pointer.
Note that values of some Go types, other than the type's zero value, always include Go pointers. This is true of string, slice, interface, channel, map, and function types. A pointer type may hold a Go pointer or a C pointer. Array and struct types may or may not include Go pointers, depending on the element types. All the discussion below about Go pointers applies not just to pointer types, but also to other types that include Go pointers.
The boldface above is mine. It means that you must not allocate any of these types via C's allocators.
It is possible to defeat this enforcement by using the unsafe package, and of course there is nothing stopping the C code from doing anything it likes. However, programs that break these rules are likely to fail in unexpected and unpredictable ways.
This bit of your own code:
newTable.table=*(*[]*entry)(unsafe.Pointer(&temp));
violates the rules, but defeats their enforcement. You have allocated C memory, and are now trying to use it as if it were Go memory, in the form of a slice.
I saw the issue on Github which says sync.Pool should be used only with pointer types, for example:
var TPool = sync.Pool{
New: func() interface{} {
return new(T)
},
}
Does it make sense? What about return T{} and which is the better choice, why?
The whole point of sync.Pool is to avoid (expensive) allocations. Large-ish buffers, etc. You allocate a few buffers and they stay in memory, available for reuse. Hence the use of pointers.
But here you'll be copying the values on every step, defeating the purpose. (Assuming your T is a "normal" struct and not something like SliceHeader)
It is not necessary. In most cases it should be a pointer as you want to share an object, not to make copies.
In some use cases this can be a non pointer type, like an id of some external resource. I can imagine a pool of paths (mounted disk drives) represented with strings where some large file operations are being conducted.
I wrote some code like this:
shared_ptr<int> r = make_shared<int>();
int *ar = r.get();
delete ar; // report double free or corruption
// still some code
When the code ran up to delete ar;, the program crashed, and reported "double free or corruption", I'm confused why double free? The "r" is still in the scope, and not popped-off from stack. Do the delete operator do something magic?? Does it know the raw pointer is handled by a smart pointer currently? and then counter in "r" be decremented to zero automatically?
I know the operations is not recommended, but I want to know why?
You are deleting a pointer that didn't come from new, so you have undefined behavior (anything can happen).
From cppreference on delete:
For the first (non-array) form, expression must be a pointer to an object type or a class type contextually implicitly convertible to such pointer, and its value must be either null or pointer to a non-array object created by a new-expression, or a pointer to a base subobject of a non-array object created by a new-expression. If expression is anything else, including if it is a pointer obtained by the array form of new-expression, the behavior is undefined.
If the allocation is done by new, we can be sure that the pointer we have is something we can use delete on. But in the case of shared_ptr.get(), we cannot be sure if we can use delete because it might not be the actual pointer returned by new.
shared_ptr<int> r = make_shared<int>();
There is no guarantee that this will call new int (which isn't strictly observable by the user anyway) or more generally new T (which is observable with a user defined, class specific operator new); in practice, it won't (there is no guarantee that it won't).
The discussion that follows isn't just about shared_ptr, but about "smart pointers" with ownership semantics. For any owning smart pointer smart_owning:
The primary motivation for make_owning instead of smart_owning<T>(new T) is to avoid having a memory allocation without owner at any time; that was essential in C++ when order of evaluation of expressions didn't provide the guarantee that evaluation of the sub-expressions in the argument list was immediately before call of that function; historically in C++:
f (smart_owning<T>(new T), smart_owning<U>(new U));
could be evaluated as:
T *temp1 = new T;
U *temp2 = new U;
auto &&temp3 = smart_owning<T>(temp1);
auto &&temp4 = smart_owning<U>(temp2);
This way temp1 and temp2 are not managed by any owning object for a non trivial time:
obviously new U can throw an exception
constructing an owning smart pointer usually requires the allocation of (small) ressources and can throw
So either temp1 or temp2 could be leaked (but not both) if an exception was thrown, which was the exact problem we were trying to avoid in the first place. This means composite expressions involving construction of owning smart pointers was a bad idea; this is fine:
auto &&temp_t = smart_owning<T>(new T);
auto &&temp_u = smart_owning<U>(new U);
f (temp_t, temp_u);
Usually expression involving as many sub-expression with function calls as f (smart_owning<T>(new T), smart_owning<U>(new U)) are considered reasonable (it's a pretty simple expression in term of number of sub-expressions). Disallowing such expressions is quite annoying and very difficult to justify.
[This is one reason, and in my opinion the most compelling reason, why the non determinism of the order of evaluation was removed by the C++ standardisation committee so that such code is not safe. (This was an issue not just for memory allocated, but for any managed allocation, like file descriptors, database handles...)]
Because code frequently needed to do things such as smart_owning<T>(allocate_T()) in sub-expressions, and because telling programmers to decompose moderately complex expressions involving allocation in many simple lines wasn't appealing (more lines of code doesn't mean easier to read), the library writers provided a simple fix: a function to do the creation of an object with dynamic lifetime and the creation of its owning object together. That solved the order of evaluation problem (but was complicated at first because it needed perfect forwarding of the arguments of the constructor).
Giving two tasks to a function (allocate an instance of T and a instance of smart_owning) gives the freedom to do an interesting optimization: you can avoid one dynamic allocation by putting both the managed object and its owner next to each others.
But once again, that was not the primary purpose of functions like make_shared.
Because exclusive ownership smart pointers by definition don't need to keep a reference count, and by definition don't need to share the data needed for the deleter either between instances, and so can keep that data in the "smart pointer"(*), no additional allocation is needed for the construction of unique_ptr; yet a make_unique function template was added, to avoid the dangling pointer issue, not to optimize a non-thing (an allocation that isn't done in the fist place).
(*) which BTW means unique owner "smart pointers" do not have pointer semantic, as pointer semantic implies that you can makes copies of the "pointer", and you can't have two copies of a unique owner pointing to the same instance; "smart pointers" were never pointers anyway, the term is misleading.
Summary:
make_shared<T> does an optional optimization where there is no separate dynamic memory allocation for T: there is no operator new(sizeof (T)). There is obviously still the creation of an instance with dynamic lifetime with another operator new: placement new.
If you replace the explicit memory deallocation with an explicit destruction and add a pause immediately after that point:
class C {
public:
~C();
};
shared_ptr<C> r = make_shared<C>();
C *ar = r.get();
ar->~C();
pause(); // stops the program forever
The program will probably run fine; it is still illogical, indefensible, incorrect to explicitly destroy an object managed by a smart pointer. It isn't "your" resource. If pause() could exit with an exception, the owning smart pointer would try to destroy the managed object which doesn't even exist anymore.
It of course depends on how library implements make_shared, however most probable implementation is that:
std::make_shared allocates one block for two things:
shared pointer control block
contained object
std::make_shared() will invoke memory allocator once and then it will call placement new twice to initialize (call constructors) of mentioned two things.
| block requested from allocator |
| shared_ptr control block | X object |
#1 #2 #3
That means that memory allocator has provided one big block, which address is #1.
Shared pointer then uses it for control block (#1) and actual contained object (#2).
When you invoke delete with actual object kept by shred_ptr ( .get() ) you call delete(#2).
Because #2 is not known by allocator you get an corruption error.
See here. I quot:
std::shared_ptr is a smart pointer that retains shared ownership of an object through a pointer. Several shared_ptr objects may own the same object. The object is destroyed and its memory deallocated when either of the following happens:
the last remaining shared_ptr owning the object is destroyed;
the last remaining shared_ptr owning the object is assigned another pointer via operator= or reset().
The object is destroyed using delete-expression or a custom deleter that is supplied to shared_ptr during construction.
So the pointer is deleted by shared_ptr. You're not suppose to delete the stored pointer yourself
UPDATE:
I didn't realize that there were more statements and the pointer was not out of scope, I'm sorry.
I was reading more and the standard doesn't say much about the behavior of get() but here is a note, I quote:
A shared_ptr may share ownership of an object while storing a pointer to another object. get() returns the stored pointer, not the managed pointer.
So it looks that it is allowed that the pointer returned by get() is not necessarily the same pointer allocated by the shared_ptr (presumably using new). So delete that pointer is undefined behavior. I will be looking a little more into the details.
UPDATE 2:
The standard says at § 20.7.2.2.6 (about make_shared):
6 Remarks: Implementations are encouraged, but not required, to perform no more than one memory allocation. [ Note: This provides efficiency equivalent to an intrusive smart pointer. — end note ]
7 [ Note: These functions will typically allocate more memory than sizeof(T) to allow for internal bookkeeping structures such as the reference counts. — end note ]
So an specific implementation of make_shared could allocate a single chunk of memory (or more) and use part of that memory to initialize the stored pointer (but maybe not all the memory allocated). get() must return a pointer to the stored object, but there is no requirement by the standard, as previously said, that the pointer returned by get() has to be the one allocated by new. So delete that pointer is undefined behavior, you got a signal raised but anything can happen.