Hash table entry linked list - Mutex locks for thread safe operation - winapi

Would a Win32 Mutex be the most efficient way to limit thread access to a linked list in a hash table? I didn't want to create a lot of handles, and the size of the hash table is variable. It could potentially be thousands. I didn't want to lock the whole table down when only one entry's list is being changed, so that would call for multiple Mutexes (one per list), but I figured I could probably get away with pooling about 20 Mutex handles and reusing them, since there shouldn't be that many threads accessing it simultaneously. Is there an alternative to Mutex locks for this case?

A lot here depends on the details of your hash table. My immediate reaction would be to avoid a mutex/critical section altogether, at least if you can.
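At least for adding an item to the linked list, it's pretty easy to avoid one by using InterlockedCompareExchangePointer instead. Presumably you have a struct something like:
    struct LL_item {
        LL_item *next;
        std::string key;
        whatever_type value;
    };
To insert an item of this type at the head of a bucket's linked list, you do something like:
    LL_item *item = new LL_item;
    // set key and value here
    LL_item *old_head;
    do {
        old_head = bucket->head;   // address of the current first item
        item->next = old_head;     // link the new item in front of it
        // publish the new item only if the head has not changed meanwhile
    } while (InterlockedCompareExchangePointer((PVOID volatile *)&bucket->head, item, old_head) != old_head);
On each pass, bucket->head contains the address of the first item currently in the list, and we store that address in our new item's next pointer. We then (atomically) replace the pointer to the head of the list with the address of our new node, but only if the head still holds the value we just read. If another thread changed the head in the meantime, the call returns a different value and we retry. A plain InterlockedExchangePointer is not enough here: it would publish the new node before its next pointer is set, so a concurrent reader could briefly see a truncated list. After the loop, the new node's next pointer contains the address of the previously-first item in the list, and the pointer to the head of the list contains the address of our new node.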
I believe you can (probably) use an exchange to remove an item from a list as well, but I'm not sure; I haven't thought that through quite as thoroughly. Quite a few hash tables don't (even try to) support deletion anyway, so you may not care about that.
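As a portable illustration of lock-free insertion at the head of a bucket's list, here is a minimal sketch using std::atomic and a compare-exchange loop. std::atomic stands in for the Win32 Interlocked functions (the closest Win32 counterpart would be InterlockedCompareExchangePointer), and Node, Bucket, push_front and length are hypothetical names chosen for the example. It covers insertion only and assumes nodes are never removed while other threads may be reading.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical node and bucket types, mirroring the LL_item struct above.
struct Node {
    Node *next = nullptr;
    std::string key;
    int value = 0;
};

struct Bucket {
    std::atomic<Node *> head{nullptr};
};

// Push a fully initialised node onto the front of the bucket's list.
// Safe against concurrent pushes and concurrent readers; it does not
// address removal, which needs extra care (for example the ABA problem).
void push_front(Bucket &bucket, Node *item) {
    Node *old_head = bucket.head.load(std::memory_order_relaxed);
    do {
        item->next = old_head;  // link to the head we last observed
        // On failure, compare_exchange_weak reloads old_head and we retry.
    } while (!bucket.head.compare_exchange_weak(
        old_head, item,
        std::memory_order_release, std::memory_order_relaxed));
}

// Count nodes; a reader only needs an acquire load of the head.
int length(const Bucket &bucket) {
    int n = 0;
    for (Node *p = bucket.head.load(std::memory_order_acquire); p; p = p->next)
        ++n;
    return n;
}
```

The new node becomes visible to other threads only once its next pointer is already set, which is why the loop uses compare-exchange rather than an unconditional exchange.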

I'd suggest a slim reader/writer (SRW) lock. Sure, it locks the entire data structure when you're doing updates, but typically you'll have a lot more reads than writes to the hash table. My experience with SRW locks is that they work quite well and performance is very good. You should probably give it a try. That'll get your program working. Then you can profile the code to determine whether there are bottlenecks and, if so, where they are. It's quite possible that the SRW lock is plenty fast enough.
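If profiling does show that a single lock is a bottleneck, the pooled-lock idea from the question can be sketched as lock striping: a fixed array of locks, with each bucket mapped to one of them by index. The following is only an illustration in portable C++, assuming std::mutex in place of Win32 handles, a fixed bucket count greater than zero, and no resizing; StripedTable, put and get are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chained hash table in which a small fixed pool of mutexes
// guards a variable number of buckets (lock striping).
class StripedTable {
public:
    explicit StripedTable(std::size_t bucket_count) : buckets_(bucket_count) {}

    // Insert or overwrite; only the stripe covering this bucket is locked.
    void put(const std::string &key, int value) {
        std::size_t b = bucket_of(key);
        std::lock_guard<std::mutex> guard(locks_[b % kStripes]);
        for (auto &kv : buckets_[b]) {
            if (kv.first == key) { kv.second = value; return; }
        }
        buckets_[b].emplace_back(key, value);
    }

    // Copy the value out while the stripe is held; returns false if absent.
    bool get(const std::string &key, int &out) {
        std::size_t b = bucket_of(key);
        std::lock_guard<std::mutex> guard(locks_[b % kStripes]);
        for (const auto &kv : buckets_[b]) {
            if (kv.first == key) { out = kv.second; return true; }
        }
        return false;
    }

private:
    static constexpr std::size_t kStripes = 20;  // pool size from the question

    std::size_t bucket_of(const std::string &key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
    std::array<std::mutex, kStripes> locks_;
};
```

Two threads contend only when their keys land in buckets that share a stripe. Growing the table would require holding every stripe at once, which this sketch leaves out.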


data structure for page-aligned entries

I need to store many page-aligned entries, each of them page-sized; basically I need to collect/bind together memory pages. The only requirement is that, when adding an entry, I can check whether one already exists by matching the machine-word-sized key. It is not possible to overwrite an entry; if the same key is used, the existing entry must be found.
The function to add/replace the entry receives some machine-word-sized key (32/64 bits), checks if there is a page-aligned entry which contains the same key. If there is no entry, it is created via mmap, and is added with the required key. The C declaration of the entry looks like this:
    #include <stdint.h>

    struct entry {
        uintptr_t key;        /* machine-word-sized key */
        unsigned char meta[]; /* this space may be used for storing in data structure */
    };
The caller receives the key and decides whether to use the existing entry or to allocate a new one. That is, the key must be looked up on insertion, and the pages only need to be collected so that they can all be removed in a loop.
All I need is to add an entry and to remove all entries (in no particular order); since I use each entry as epoll.data.ptr, I don't even need fast lookup after adding the entry. Given that each entry has some space for metadata, I'm OK with dedicating some of that space to whatever payload the data structure needs to store the entry.
I thought about using a hash table. I have no math or crypto background, so coming up with a good hash is a problem. I looked at several well-known hashes, but they seem quite generic, i.e. intended to work with any data. My case, however, seems very specific: a user would never use the table directly, and the conditions (page-aligned and page-sized entries plus a word-sized key) are unlikely to change.
The questions are:
Am I right that a hash table is OK for this case? If yes, what kind of hash would you suggest? If the hash table is linked-list based, it had better be intrusive (i.e. all required metadata should live inside the entry, not outside, like the Linux kernel's struct list_head).
I was also looking at page tables, as described at https://wiki.osdev.org/Paging. However, that page concentrates mostly on how the MMU does its job, and I'm not sure whether I can adapt it to a purely software implementation or how to apply these concepts. Since a machine-word-sized key must be used for inserting the entry, the concepts from this link only show how to organize pages effectively for page-to-page mappings.
I currently need to care only about 4096-byte pages, but the generic case is better (i.e. some algorithm that operates on PAGE_SIZE, be it 4K, 8K or whatever). It would also be nice if the data structure did not assume page-sized entries (though page alignment is a strict requirement, since all memory is obtained via mmap).
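To make the intrusive hash table idea from the question concrete, here is a minimal sketch under several assumptions: a fixed bucket count, a multiplicative (Fibonacci) hash on the word-sized key, and a chain link stored in the entry's own meta space. The names hash_next, page_table, bucket_of, lookup, add and remove_all are hypothetical, and the mmap/munmap calls are left to the caller. It is written as C-style C++ rather than C.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: the chain link lives in the entry's own meta space,
// so the table needs no allocations besides the entries themselves.
struct entry {
    std::uintptr_t key;   // machine-word-sized key
    entry *hash_next;     // intrusive link, carved out of the meta area
};

constexpr unsigned kBits = 10;                          // 1024 buckets (assumption)
constexpr std::size_t kBuckets = std::size_t{1} << kBits;

struct page_table {
    entry *buckets[kBuckets] = {};                      // all chains start empty
};

// Multiplicative (Fibonacci) hashing: multiply by 2^64 / golden ratio and
// keep the top kBits bits, which mixes keys whose low bits are all alike.
inline std::size_t bucket_of(std::uintptr_t key) {
    const std::uint64_t h =
        static_cast<std::uint64_t>(key) * 0x9E3779B97F4A7C15ULL;
    return static_cast<std::size_t>(h >> (64 - kBits));
}

// Return the entry with this key, or nullptr if none was added.
inline entry *lookup(page_table &t, std::uintptr_t key) {
    for (entry *e = t.buckets[bucket_of(key)]; e; e = e->hash_next)
        if (e->key == key) return e;
    return nullptr;
}

// Add e unless its key is already present; returns the entry now in the
// table (the existing one on a duplicate, so nothing is overwritten).
inline entry *add(page_table &t, entry *e) {
    if (entry *found = lookup(t, e->key)) return found;
    entry *&head = t.buckets[bucket_of(e->key)];
    e->hash_next = head;
    head = e;
    return e;
}

// Unlink every entry, handing each to release (e.g. a munmap wrapper).
template <class F>
void remove_all(page_table &t, F release) {
    for (entry *&head : t.buckets) {
        while (entry *e = head) {
            head = e->hash_next;  // read the link before the page goes away
            release(e);
        }
    }
}
```

Nothing in the sketch depends on PAGE_SIZE or on the entries being page-sized; it only needs room for one pointer in each entry.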

Is there a linux header for hashtable with spinlock-protected buckets?

I am writing code that rarely creates/removes objects (up to several thousand) but very frequently modifies them in soft IRQ context. These objects are also rarely read (and probably will also be rarely modified) from task context (via procfs: one file per object). Currently my code contains global per-CPU data blocks, each one guarded by a spinlock. Each block contains a fixed-size hashtable for object storage.
Obviously the current design is not optimal, especially under very high object update loads: reading objects from procfs will cause data loss in the soft IRQs doing the updates. I need to rewrite the synchronisation scheme to get rid of the global locks. The most obvious choice is to have a spinlock for each hashtable bucket, which should scale well. The problem is that I'll probably need to use my own hashtable implementation or at least reimplement several top-level macros (I didn't find any for spinlock-protected buckets in linux/hashtable.h). Should I also look at an RCU-enabled hashtable (although I have no solid understanding of this synchronisation approach yet)?
Buckets with lock protection are declared in the header linux/list_bl.h. They use the lowest bit of the head pointer as a lock bit.
RCU-protected access to the bucket is defined along with the other hash table functions in the header linux/hashtable.h (those functions have the _rcu suffix).
Choosing between locks and RCU is up to you. Note that RCU by itself cannot resolve modify-modify conflicts, and it helps mostly with frequently read data, which does not seem to be your case.
Only one locking function, hlist_bl_lock, is declared for struct hlist_bl_head, and it is unaware of IRQs. When the hash table can also be used from IRQ or bottom-half context, additional steps are therefore needed around the lock, for example disabling bottom halves with local_bh_disable() before hlist_bl_lock and re-enabling them with local_bh_enable() after hlist_bl_unlock.

how to use std::map in threaded application safely?

I have a threaded application. Each thread will probably insert its own specific item into the map, or erase the item it inserted; all other threads will just use find or traverse the whole map via an iterator.
Again, each thread will only insert or erase its own specific item in the map.
In that case, should I take a lock before insert or erase to avoid a race? If so, how?
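Without looking at the code, I can only suggest that you use a ConcurrentHashMap for your needs (note that ConcurrentHashMap is a Java class, so for a C++ std::map it applies only as a design idea). You may also want to read this: What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?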
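For std::map itself, one common approach (not taken from the answer above) is to guard the map with a reader-writer lock: insert and erase take it exclusively, while find and traversal share it. The sketch below assumes C++17 for std::shared_mutex; SharedMap and its member names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical wrapper: writers take the lock exclusively, readers share it.
class SharedMap {
public:
    void insert(int key, const std::string &value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        map_[key] = value;
    }

    void erase(int key) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        map_.erase(key);
    }

    // Copy the value out so no reference into the map escapes the lock.
    bool find(int key, std::string &out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    // Traverse under one shared lock held for the whole iteration.
    // visit must not call back into this SharedMap.
    template <class Visitor>
    void for_each(Visitor visit) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        for (const auto &kv : map_) visit(kv.first, kv.second);
    }

private:
    mutable std::shared_mutex mutex_;
    std::map<int, std::string> map_;
};
```

The lock is needed even though each thread only inserts or erases its own item, because any insert or erase restructures the tree that other threads are searching or iterating over at the same time.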

Thread-safe (Goroutine-safe) cache in Go

Question 1
I am building/searching for an in-memory cache layer for my server. It is a simple LRU cache that needs to handle concurrent requests (both Gets and Sets).
I have found https://github.com/pmylund/go-cache claiming to be thread safe.
This is true as far as getting the stored interface goes. But if multiple goroutines request the same data, they all retrieve a pointer (stored in the interface) to the same block of memory. If any goroutine changes the data, this is no longer very safe.
Are there any cache packages out there that tackle this problem?
Question 1.1
If the answer to Question 1 is No, then what would be the suggested solution?
I see two options:
Alternative 1
Solution: Storing the values in a wrapping struct with a sync.Mutex so that each goroutine needs to lock the data before reading/writing to it.
    type cacheElement struct { value interface{}; lock sync.Mutex }
Drawbacks: The cache becomes unaware of changes made to data or might even have dropped it out of the cache. One goroutine might also lock others.
Alternative 2
Solution: Make a copy of the data (assuming the data in itself doesn't contain pointers)
Drawbacks: Memory allocation every time a cache Get is performed, more garbage collection.
Sorry for the multipart question. But you don't have to answer all of them. If you have a good answer to Question 1, that would be sufficient for me!
Alternative 2 sounds good to me, but please note that you do not have to copy the data for each cache.Get(). As long as your data can be considered immutable, it can be accessed by multiple readers at once.
You only have to create a copy if you intend to modify it. This idiom is called COW (copy on write) and is quite common in concurrent software design. It's especially well suited for scenarios with a high read/write ratio (just like a cache).
So, whenever you want to modify a cached entry, you basically have to:
1. Create a copy of the old cached data, if any.
2. Modify the data (after this step, the data should be considered immutable and must not be changed anymore).
3. Add / replace the existing element in the cache. You could either use the go-cache library you have pointed out earlier (which is based on locks) for that, or write your own lock-free library that simply swaps the pointers to the data element atomically.
At this point any goroutine that performs a cache.Get operation will get the new data. Existing goroutines however, might still be reading the old data. So, your program might operate on many different versions of the same data at once. But don't worry, as soon as all goroutines have finished accessing the old data, the GC will collect it automatically.
tux21b gave a good answer. I'll just point out that you don't have to return pointers to data. You can store non-pointer values in your cache, and Go will pass them by value, i.e. as a copy. Then your Get and Set methods will be safe, since nothing can actually modify the cache contents (as long as the values themselves contain no pointers, slices or maps).

Is NSObject's retain method atomic?

Is NSObject's retain method atomic?
For example, when retaining the same object from two different threads, is it promised that the retain count has gone up twice, or is it possible for the retain count to be incremented just once?
NSObject as well as object allocation and retain count functions are thread-safe — see Appendix A: Thread Safety Summary in the Thread Programming Guide.
Edit: I’ve decided to take a look at the open source part of Core Foundation. In CFRuntime.c, __CFDoExternRefOperation() is the function responsible for updating the retain counters. It tests whether the process has more than one thread and, if so, acquires a spin lock before updating the retain count, which makes this operation thread safe.
Interestingly enough, the retain count is not an attribute (or instance variable) of an object in the struct (class) sense. The runtime keeps a separate structure with retain counters. In fact, if I understand it correctly, this structure is an array of hash tables and there’s a spin lock for each hash table. This means that a lock refers to multiple objects that have been placed in the same hash table, i.e., the lock is neither global (for all instances) nor per instance.
