Thread-safe (Goroutine-safe) cache in Go

Question 1
I am building/searching for an in-memory cache layer for my server. It is a simple LRU cache that needs to handle concurrent requests (both Gets and Sets).
I have found https://github.com/pmylund/go-cache claiming to be thread safe.
This is true as far as retrieving the stored interface goes. But if multiple goroutines request the same data, they all receive a pointer (stored in the interface) to the same block of memory. If any goroutine changes the data, this is no longer safe at all.
Are there any cache packages out there that tackle this problem?
Question 1.1
If the answer to Question 1 is No, then what would be the suggested solution?
I see two options:
Alternative 1
Solution: Store the values in a wrapping struct with a sync.Mutex, so that each goroutine needs to lock the data before reading/writing it.
type cacheElement struct { value interface{}; lock sync.Mutex }
Drawbacks: The cache becomes unaware of changes made to the data, or may even have dropped it out of the cache. One goroutine may also block others while holding the lock.
Alternative 2
Solution: Make a copy of the data (assuming the data itself doesn't contain pointers).
Drawbacks: A memory allocation on every cache Get, and more garbage collection.
Sorry for the multipart question. But you don't have to answer all parts; a good answer to Question 1 would be sufficient for me!

Alternative 2 sounds good to me, but note that you do not have to copy the data on each cache.Get(). As long as your data can be considered immutable, you can access it with many readers at once.
You only have to create a copy if you intend to modify it. This idiom is called COW (copy-on-write) and is quite common in concurrent software design. It is especially well suited to scenarios with a high read/write ratio (just like a cache).
So, whenever you want to modify a cached entry, you basically have to:
Create a copy of the old cached data, if any.
Modify the data (after this step, the data should be considered immutable and must not be changed anymore).
Add/replace the element in the cache. You could either use the go-cache library you pointed out earlier (which is based on locks) for that, or write your own lock-free code that simply swaps the pointer to the data element atomically.
At this point, any goroutine that performs a cache.Get operation will see the new data. Goroutines that are already running, however, might still be reading the old data. So your program might operate on many different versions of the same data at once. But don't worry: as soon as all goroutines have finished accessing the old data, the GC will collect it automatically.
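To make step 3 concrete, here is a minimal sketch of the COW pattern, using atomic.Value from Go's sync/atomic package to publish immutable snapshots. The User type and the function names are illustrative only, not part of any library:

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // User is an example payload. Once published, a *User must be
    // treated as immutable.
    type User struct {
        Name  string
        Score int
    }

    // entry holds the current immutable snapshot for one cache key.
    var entry atomic.Value // stores *User

    // get returns the current snapshot; safe for any number of readers.
    func get() *User {
        v, _ := entry.Load().(*User)
        return v
    }

    // update performs copy-on-write: copy the old value, modify the
    // copy, then publish it atomically. Readers still holding the old
    // pointer keep a consistent view; the GC reclaims it later. Note
    // that concurrent writers would still need coordination (e.g. a
    // mutex around update) to avoid lost updates.
    func update(modify func(*User)) {
        fresh := &User{}
        if old := get(); old != nil {
            *fresh = *old // 1. copy the old cached data
        }
        modify(fresh)      // 2. modify; immutable from here on
        entry.Store(fresh) // 3. replace the element atomically
    }

    func main() {
        update(func(u *User) { u.Name = "alice"; u.Score = 1 })
        update(func(u *User) { u.Score = 2 })
        fmt.Println(get().Name, get().Score) // alice 2
    }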

tux21b gave a good answer. I'll just point out that you don't have to return pointers to data. You can store non-pointer values in your cache, and Go will pass them by value, which makes a copy. Then your Get and Set methods will be safe, since nothing can actually modify the cache contents.
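As a quick illustration of that point (a hypothetical sketch, not the go-cache API), a map-backed cache that stores struct values hands every caller an independent copy:

    package main

    import (
        "fmt"
        "sync"
    )

    type Config struct {
        Host string
        Port int
    }

    // valueCache stores Config values (not *Config), so Get returns a copy.
    type valueCache struct {
        mu sync.RWMutex
        m  map[string]Config
    }

    func (c *valueCache) Set(key string, v Config) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.m[key] = v
    }

    func (c *valueCache) Get(key string) (Config, bool) {
        c.mu.RLock()
        defer c.mu.RUnlock()
        v, ok := c.m[key] // map access copies the value
        return v, ok
    }

    func main() {
        c := &valueCache{m: make(map[string]Config)}
        c.Set("db", Config{Host: "localhost", Port: 5432})

        got, _ := c.Get("db")
        got.Port = 9999 // mutates only the caller's copy

        again, _ := c.Get("db")
        fmt.Println(again.Port) // 5432: cache contents unchanged
    }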

Related

Is it safe to read concurrently from a pointer?

I'm working on an image uploader and want to concurrently resize the image to different sizes. Once I've read the file as a []byte, I'm passing a reference to that buffer to my resize functions, which run concurrently.
Is this safe? I'm thinking that passing a reference to the large file, to be read by the resize functions, will save memory, and the concurrency will save time.
Thank you!
Read-only data is usually fine for concurrent access, but you have to be very careful when passing references (pointers, slices, maps and so on) around. Today maybe no one is modifying them while you're also reading, but tomorrow someone may be.
If this is a throwaway script, you'll be fine. But if it's part of a larger program, I'd recommend future-proofing your code by judiciously protecting concurrent access. In your case something like a reader-writer lock could be a good match - all the readers will be able to acquire the lock concurrently, so the performance impact is negligible. And then if you do decide in the future this data could be modified, you already have the proper groundwork laid down w.r.t. safety.
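Here is a minimal sketch of that groundwork with sync.RWMutex; the resize method is a stand-in for a real resize function, and all names are illustrative:

    package main

    import (
        "fmt"
        "sync"
    )

    // sharedImage guards a byte buffer with a reader-writer lock:
    // any number of readers may hold RLock concurrently, while a
    // writer holding Lock has exclusive access.
    type sharedImage struct {
        mu   sync.RWMutex
        data []byte
    }

    // resize is a stand-in for a real resize function; it only reads.
    func (s *sharedImage) resize(width int) int {
        s.mu.RLock()
        defer s.mu.RUnlock()
        return len(s.data) / width // placeholder computation
    }

    // replace swaps in new image bytes; writers take the write lock.
    func (s *sharedImage) replace(b []byte) {
        s.mu.Lock()
        defer s.mu.Unlock()
        s.data = b
    }

    func main() {
        img := &sharedImage{data: make([]byte, 1<<20)}

        var wg sync.WaitGroup
        for _, w := range []int{100, 200, 400} {
            wg.Add(1)
            go func(w int) { // concurrent readers
                defer wg.Done()
                fmt.Println(img.resize(w))
            }(w)
        }
        wg.Wait()
    }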
Don't forget to run your code with the race detector enabled.

Is there a Linux header for a hashtable with spinlock-protected buckets?

I am writing code that rarely creates/removes objects (up to several thousand) but modifies them very frequently in soft-IRQ context. These objects are also rarely read (and will probably also be rarely modified) from task context (via procfs: one file per object). Currently my code contains global per-CPU data blocks, each guarded by a spinlock. Such a block contains a fixed-size hashtable for object storage.
Obviously the current design is not optimal, especially under very high object-update loads: reading objects from procfs will cause data loss in the updating soft IRQs. I need to rewrite the synchronisation scheme to get rid of global locks. The most obvious choice, a spinlock per hashtable bucket, should scale well. The problem is that I'll probably need to use my own hashtable implementation, or at least reimplement several top-level macros (I didn't find any in linux/hashtable.h for spinlock-protected buckets). Should I also look towards an RCU-enabled hashtable (though I have no solid understanding of that synchronisation approach)?
Buckets with lock protection are declared in the header linux/list_bl.h. They use the lowest bit of the head pointer as a lock bit.
RCU-protected access to the buckets is defined along with the other hash table functions in the header linux/hashtable.h (the functions with the _rcu suffix).
Choosing between locks and RCU is up to you. Note that RCU by itself cannot resolve modify-modify conflicts, and it helps mostly with frequently read data, which does not seem to be your case.
Since only one locking function, hlist_bl_lock, is declared for struct hlist_bl_head, and this function is unaware of IRQs, additional actions are needed when the hash table can be used in IRQ context or bottom halves:
The equivalent of spin_lock_irqsave:

    local_irq_save(flags);
    hlist_bl_lock(...);

The equivalent of spin_unlock_irqrestore:

    hlist_bl_unlock(...);
    local_irq_restore(flags);

The equivalent of spin_lock_bh:

    local_bh_disable();
    hlist_bl_lock(...);

The equivalent of spin_unlock_bh:

    hlist_bl_unlock(...);
    local_bh_enable();

Difference between Read-Copy-Update and Reader-Writer-Lock?

They look pretty much the same to me from a programming perspective. From what I have read, when updating data, RCU needs to maintain an old copy until all readers are done, which creates a large overhead.
Is that the only difference when it comes to implementation?
Read-copy-update (RCU) is not the same as a reader-writer lock; here are some of the points I can think of:
It separates updates from reclamation, allowing both readers and writers to avoid locking altogether.
From an implementation point of view, RCU is suitable for dynamically allocated data structures, such as linked lists, because the writer does not modify the data in place; instead it allocates a new element, which it initialises with the updated data. The old element is then replaced with the new one via an atomic pointer update, and new readers will see the newly updated data. The drawback is that old readers will still see the old copy of the data. That old copy must be tracked, and readers must notify the RCU infrastructure when their read is complete, so the old data can be reclaimed.
Reader-writer lock: Here a writer prevents any reader or other writer from acquiring the lock while it holds the lock itself. Multiple readers can acquire the lock simultaneously, provided no writer has taken it.
hope this helps!
In simple terms:
RCU - concurrency is allowed between a single writer and multiple readers.
Reader-writer locks - concurrent reads are allowed, but a write excludes all readers and other writers.
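For a user-space flavour of the difference, here is a hedged Go sketch: the reader-writer-lock variant blocks readers while a write is in progress, whereas the RCU-style variant never blocks readers, because the writer publishes a new copy with an atomic pointer swap (Go's garbage collector standing in for RCU's reclamation phase). All names are illustrative:

    package main

    import (
        "fmt"
        "sync"
        "sync/atomic"
    )

    type snapshot struct{ counter int }

    // Variant 1: reader-writer lock. A writer holding mu.Lock()
    // blocks all readers until it is done.
    var (
        mu   sync.RWMutex
        data = &snapshot{}
    )

    func readLocked() int {
        mu.RLock()
        defer mu.RUnlock()
        return data.counter
    }

    func writeLocked() {
        mu.Lock()
        defer mu.Unlock()
        data = &snapshot{counter: data.counter + 1}
    }

    // Variant 2: RCU-style publication. Readers never block: the
    // writer copies the data, updates the copy, and swaps the
    // pointer atomically. The GC frees the old snapshot once the
    // last reader drops it. As with RCU, concurrent writers still
    // need mutual exclusion among themselves.
    var cur atomic.Value // holds *snapshot

    func readRCU() int {
        return cur.Load().(*snapshot).counter
    }

    func writeRCU() {
        old := cur.Load().(*snapshot)
        cur.Store(&snapshot{counter: old.counter + 1})
    }

    func main() {
        cur.Store(&snapshot{})
        writeLocked()
        writeRCU()
        fmt.Println(readLocked(), readRCU()) // 1 1
    }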

What does __rcu stand for in Linux?

I am new to the Linux kernel. My question is about task_struct.
I know that each task_struct has a reference to its parent process via a pointer to the parent's task_struct.
Looking at the task_struct definition in sched.h, I noticed the following:
struct task_struct __rcu *real_parent; /* real parent process */
I found that it is defined in compiler.h. I guess that "__rcu" stands for "read copy update".
Can someone clarify the syntax?
Read-copy-update is an algorithm that enables concurrent access to readers of a data structure without having to lock the structure. It can be read about here.
If the kernel is built with the CONFIG_SPARSE_RCU_POINTER config option, __rcu is defined in include/linux/compiler.h as
# define __rcu __attribute__((noderef, address_space(4)))
This is an annotation for the Sparse code-analysis tool, which can warn about certain things the programmer may have overlooked. How this is relevant to RCU is explained in Documentation/RCU/checklist.txt:
__rcu sparse checks: tag the pointer to the RCU-protected data
structure with __rcu, and sparse will warn you if you
access that pointer without the services of one of the
variants of rcu_dereference().
rcu_dereference() returns a pointer that can be safely dereferenced by the code and documents the programmer's intention to protect the pointer with the RCU mechanism, enabling tools like Sparse to check for programming errors and omissions.
RCU stands for "read, copy, update". It is an algorithm that allows multiple readers to access data which can be updated or even deleted at the same time by writers.
Under RCU, writers still have to ensure mutual exclusion with regard to one another, but readers do not acquire a lock. Care has to be taken that the shared data structure is updated in ways that do not violate read integrity. If something has to be removed or deleted, the unlinking of that item from the data structure can be done in parallel with the readers but the actual deletion of the memory has to wait until the last reader has finished.
Rather than making the readers acquire a lock, the whereabouts of the readers are inferred in other ways. Threads can announce their intent to browse the data structure by joining a "read side critical section" which is not really a lock but a kind of global phase.
For instance, suppose that some threads entered the RCU read-side critical section in phase 0. An updater has performed a deletion and wants to free a piece of memory. It simply has to wait for all threads in the system to vacate phase 0. In the meanwhile, other readers are looking at the data structure already, but when they declare their intent to RCU, they do so by entering the RCU read-side critical section under phase 1. Only the phase 0 threads can possibly still have a pointer to the object that was deleted, so when the last thread leaves phase 0, the object can safely be freed. Newly arriving threads in phase 1 do not see the object, because it has been removed from the data structure, so they have no way to find it.
RCU takes advantage of the idea that we do not need lock objects that are "owned" in order to know information like "no thread can be accessing this object any more".

What does 'dirty-flag' / 'dirty-values' mean?

I see some variables named 'dirty' in some source code at work and some other code. What does it mean? What is a dirty flag?
Generally, a dirty flag is used to indicate that some data has changed and eventually needs to be written to some external destination. It isn't written immediately, because adjacent data may also change, and writing data in bulk is generally more efficient than writing individual values.
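As a small illustration (hypothetical names, not from any particular codebase), a dirty flag lets a save routine skip the external write entirely when nothing has changed:

    package main

    import "fmt"

    // Document tracks whether its in-memory state has diverged from
    // what was last persisted.
    type Document struct {
        text  string
        dirty bool // true: has unsaved changes
    }

    func (d *Document) SetText(s string) {
        if d.text == s {
            return // no change: stay clean, avoid a pointless write
        }
        d.text = s
        d.dirty = true
    }

    // Save writes to the backing store only if needed, then clears
    // the flag.
    func (d *Document) Save() {
        if !d.dirty {
            return
        }
        fmt.Println("writing to disk:", d.text) // stand-in for real I/O
        d.dirty = false
    }

    func main() {
        d := &Document{}
        d.SetText("hello")
        d.Save() // writes
        d.Save() // no-op: flag already cleared
    }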
There's a deeper issue here. Rather than asking "What does 'dirty' mean?" in the context of code, I think we should really be asking: is 'dirty' an appropriate term for what is generally intended?
'Dirty' is potentially confusing and misleading. To many new programmers it will suggest corrupt or erroneous form data. The word 'dirty' implies that something is wrong and that the data needs to be purged or removed. Something dirty is, after all, undesirable, unclean and unpleasant.
If we mean 'the form has been touched' or 'the form has been amended but the changes haven't yet been written to the server', then why not 'touched' or 'writePending' rather than 'dirty'?
That, I think, is a question the programming community needs to address.
Dirty could mean a number of things; you need to provide more context. But in a very general sense, a "dirty flag" is used to indicate whether something has been touched/modified.
For instance, see the usage of "dirty bit" in the context of memory management in the Wikipedia article on Page Table.
"Dirty" is often used in the context of caching, from application-level caching to architectural caching.
In general, there are two kinds of caching mechanisms: (1) write-through and (2) write-back, WT and WB for short.
WT means that writes are done synchronously both to the cache and to the backing store. (The cache and the backing store can stand, for example, for main memory and disk respectively, in the context of databases.)
In contrast, with WB, a write initially goes only to the cache. The write to the backing store is postponed until the cache blocks containing the data are about to be modified/replaced by new content.
Such not-yet-written-back data is what we call dirty values. When implementing a WB cache, you can set dirty bits to indicate which cache blocks contain dirty values.
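Here is a minimal write-back sketch in Go (all names hypothetical): writes only touch the cache and set the dirty bit, and the backing store is written on flush, which in a real cache happens when a block is evicted or replaced:

    package main

    import "fmt"

    // block is one cache entry in a write-back (WB) cache.
    type block struct {
        value string
        dirty bool // set on write; cleared when flushed to the store
    }

    type wbCache struct {
        blocks  map[string]*block
        backing map[string]string // stand-in for disk/database
    }

    // Write updates only the cache and sets the dirty bit (WB).
    // A write-through (WT) cache would update backing here as well.
    func (c *wbCache) Write(key, value string) {
        c.blocks[key] = &block{value: value, dirty: true}
    }

    // Flush writes every dirty block to the backing store; in a real
    // cache this happens on eviction or replacement.
    func (c *wbCache) Flush() {
        for k, b := range c.blocks {
            if b.dirty {
                c.backing[k] = b.value
                b.dirty = false
            }
        }
    }

    func main() {
        c := &wbCache{
            blocks:  map[string]*block{},
            backing: map[string]string{},
        }
        c.Write("a", "1")
        c.Write("a", "2") // two writes, still zero backing-store writes
        c.Flush()         // one write reaches the store
        fmt.Println(c.backing["a"]) // 2
    }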
