What does 'dirty-flag' / 'dirty-values' mean? - caching

I see some variables named 'dirty' in some source code at work and some other code. What does it mean? What is a dirty flag?

Generally, dirty flags are used to indicate that some data has changed and needs to eventually be written to some external destination. It isn't written immediate because adjacent data may also get changed and writing bulk of data is generally more efficient than writing individual values.

There's a deeper issue here - rather than "What does 'dirty mean?" in the context of code, I think we should really be asking - is 'dirty' an appropriate term for what is generally intended.
'Dirty' is potentially confusing and misleading. It will suggest to many new programmers corrupt or erroneous form data. The work 'dirty' implies that something is wrong and that the data needs to be purged or removed. Something dirty is, after all undesirable, unclean and unpleasant.
If we mean 'the form has been touched' or 'the form has been amended but the changes haven't yet been written to the server', then why not 'touched' or 'writePending' rather than 'dirty'?
That I think, is a question the programming community needs to address.

Dirty could mean a number of things, you need to provide more context. But in a very general sense a "dirty flag" is used to indicate whether something has been touched / modified.
For instance, see usage of "dirty bit" in the context of memory management in the wiki for Page Table

"Dirty" is often used in the context of caching, from application-level caching to architectural caching.
In general, there're two kinds of caching mechanisms: (1) write through; and (2) write back. We use WT and WB for short.
WT means that the write is done synchronously both to the cache and to the backing store. (By saying the cache and the backing store, for example, they can stand for the main memory and the disk, respectively, in the context of databases).
In contrast, for WB, initially, writing is done only to the cache. The write to the backing store is postponed until the cache blocks containing the data are about to be modified/replaced by new content.
The data is the dirty values. When implementing a WB cache, you can set dirty bits to indicate whether a cache block contains dirty value or not.

Related

Are cache ways in gem5 explicit or are they implied/derived from the number of cache sets and cache size?

I am trying to implement a gem5 version of HybCache as described in HYBCACHE: Hybrid Side-Channel-Resilient Caches for Trusted Execution Environments (which can be found at https://www.usenix.org/system/files/sec20spring_dessouky_prepub.pdf).
A brief summary of HybCache is that a subset of all the cache is reserved for use by secure processes and are isolated. This is achieved by using a limited subset of cache ways when the process is in 'isolated' mode. Non-isolated processes uses the cache operations normally, having access to the entire cache and using the replacement policy and associativity given in the configuration. The isolated subset of cache ways uses random replacement policy and is fully associative. Here is a picture demonstrating the idea.
The ways 6 and 7 are grey and represent the isolated cache ways.
So, I need to manipulate the placement of data into these ways. My question is, since I have found no mention of cache ways in the gem5 code, does that mean that the cache ways only exist logically? That is, do I have to manually calculate the location of each cache way? If cache ways are used in gem5, then were are they used? What is the file name?
Any help would be greatly appreciated.
This answer is only valid for the Classic cache model (src/mem/cache/).
In gem5 the number of cache ways is determined automatically from the cache size and the associativity. Check the files in src/mem/cache/tags/indexing_policies/ for the relevant code (specifically, the constructor of base.cc).
There are two ways you could tackle this implementation:
1 - Create a new class that inherits from BaseTags (e.g., HybCacheTags). This class will contain the decision of whether it should work in secure mode or not, and how to do so (i.e., when to call which indexing and replacement policy). Depending on whatever else is proposed in the paper, you may also need to derive from Cache to create a HybCache.
The new tags need one indexing policy per operation mode. One is the conventional (SetAssociative), and the other is derived from SetAssociative, where the parameter assoc makes the numSets become 1 (to make it fully associative). The derived one will also have to override at least one function, getPossibleEntries(), to only allow selecting the ways that you want. You can check skewed_assoc.cc for an example of a more complex location selection.
The new tags need one replacement policy per operation mode. You will likely just use the ones in the replacement_policies folder.
2 - You could create a HybCache based on the Cache class that has two tags, one conventional (i.e., BaseSetAssoc), and the other based on the FALRU class (rewritten to work as a, e.g., FARandom).
I believe the first option is easier and less hardcoded. FALRU has not been split into an indexing policy and replacement policy, so if you need to change one of these, you will have to reimplement it.
While implementing you may encounter coherence faults. If it happens it is much likely a problem in the indexing logic, and I wouldn't look into trying to find issues in the coherence model.

Thread-safe (Goroutine-safe) cache in Go

Question 1
I am building/searching for a RAM memory cache layer for my server. It is a simple LRU cache that needs to handle concurrent requests (both Gets an Sets).
I have found https://github.com/pmylund/go-cache claiming to be thread safe.
This is true as far as getting the stored interface. But if multiple goroutines requests the same data, they are all retrieving a pointer (stored in the interface) to the same block of memory. If any goroutine changes the data, this is no longer very safe.
Are there any cache-packages out there that tackles this problem?
Question 1.1
If the answer to Question 1 is No, then what would be the suggested solution?
I see two options:
Alternative 1
Solution: Storing the values in a wrapping struct with a sync.Mutex so that each goroutine needs to lock the data before reading/writing to it.
type cacheElement struct { value interface{}, lock sync.Mutex }
Drawbacks: The cache becomes unaware of changes made to data or might even have dropped it out of the cache. One goroutine might also lock others.
Alternative 2
Solution: Make a copy of the data (assuming the data in itself doesn't contain pointers)
Drawbacks: Memory allocation every time a cache Get is performed, more garbage collection.
Sorry for the multipart question. But you don't have to answer all of them. If you have a good answer to Question 1, that would be sufficient for me!
Alternative 2 sounds good to me, but please note that you do not have to copy the data for each cache.Get(). As long as your data can be considered immutable, you can access it with many multiple readers at once.
You only have to create a copy if you intend to modify it. This idiom is called COW (copy on write) and is quite common in concurrent software design. It's especially well suited for scenarios with a high read/write ratio (just like a cache).
So, whenever you want to modify a cached entry, you basically have to:
create a copy of the old cached data, if any.
modify the data (after this step, the data should be considered immutable and must not be changed anymore)
add / replace the existing element in the cache. You could either use the go-cache library you have pointed out earlier (which is based on locks) for that, or write your own lock-free library that simply swaps the pointers to the data element atomically.
At this point any goroutine that performs a cache.Get operation will get the new data. Existing goroutines however, might still be reading the old data. So, your program might operate on many different versions of the same data at once. But don't worry, as soon as all goroutines have finished accessing the old data, the GC will collect it automatically.
tux21b gave a good answer. I'll just point out that you don't have to return pointers to data. you can store non pointer values in your cache and go will pass by value which will be a copy. Then your Get and Set methods will be safe since nothing can actually modify the cache contents.

Why does loading cached objects increase the memory consumption drastically when computing them will not?

Relevant background info
I've built a little software that can be customized via a config file. The config file is parsed and translated into a nested environment structure (e.g. .HIVE$db = an environment, .HIVE$db$user = "Horst", .HIVE$db$pw = "my password", .HIVE$regex$date = some regex for dates etc.)
I've built routines that can handle those nested environments (e.g. look up value "db/user" or "regex/date", change it etc.). The thing is that the initial parsing of the config files takes a long time and results in quite a big of an object (actually three to four, between 4 and 16 MB). So I thought "No problem, let's just cache them by saving the object(s) to .Rdata files". This works, but "loading" cached objects makes my Rterm process go through the roof with respect to RAM consumption (over 1 GB!!) and I still don't really understand why (this doesn't happen when I "compute" the object all anew, but that's exactly what I'm trying to avoid since it takes too long).
I already thought about maybe serializing it, but I haven't tested it as I would need to refactor my code a bit. Plus I'm not sure if it would affect the "loading back into R" part in just the same way as loading .Rdata files.
Question
Can anyone tell me why loading a previously computed object has such effects on memory consumption of my Rterm process (compared to computing it in every new process I start) and how best to avoid this?
If desired, I will also try to come up with an example, but it's a bit tricky to reproduce my exact scenario. Yet I'll try.
Its likely because the environments you are creating are carrying around their ancestors. If you don't need the ancestor information then set the parents of such environments to emptyenv() (or just don't use environments if you don't need them).
Also note that formulas (and, of course, functions) have environments so watch out for those too.
If it's not reproducible by others, it will be hard to answer. However, I do something quite similar to what you're doing, yet I use JSON files to store all of my values. Rather than parse the text, I use RJSONIO to convert everything to a list, and getting stuff from a list is very easy. (You could, if you want, convert to a hash, but it's nice to have layers of nested parameters.)
See this answer for an example of how I've done this kind of thing. If that works out for you, then you can forego the expensive translation step and the memory ballooning.
(Taking a stab at the original question...) I wonder if your issue is that you are using an environment rather than a list. Saving environments might be tricky in some contexts. Saving lists is no problem. Try using a list or try converting to/from an environment. You can use the as.list() and as.environment() functions for this.

Read a File From Cache, But Without Polluting the Cache (in Windows)

Windows has a FILE_FLAG_NO_BUFFERING flag that allows you to specify whether or not you want your I/O to be cached by the file system.
That's fine, but what if I want to use the cache if possible, but avoid modifying it?
In other words, how do you tell Windows the following?
Read this file from the cache if it's already cached, but my data doesn't exhibit locality, so do not put it into the cache!
The SCSI standard defines a Disable Page Out bit that does precisely this, so I'm wondering how (if at all) it is possible to use that feature from Windows (with cooperation of the file system cache too, of course)?
Edit: TL;DR:
What's the equivalent of FILE_FLAG_WRITE_THROUGH for reads?
About the closest Windows provides to what you're asking is FILE_FLAG_WRITE_THROUGH.
I see two flags that look suspiciously like what you are asking for:
FILE_FLAG_RANDOM_ACCESS
FILE_FLAG_SEQUENTIAL_SCAN
The later's doc clearly suggests that it won't retain pages in cache, though it will probably read-ahead sequentially. The former's doc is completely opaque, but would seem to imply what you want. If the pattern is quite random, hanging onto pages for later reuse would be a waste of memory.
Keep in mind that, for files, the Windows kernel always will use some pages of 'cache' to hold the I/O. It has nowhere else to put it. So it's not meaningful to say 'don't cache it,' as opposed to 'evict the old pages of this file before evicting some other pages.'

Executable sections marked as "execute" AND "read"?

I've noticed (on Win32 at least) that in executables, code sections (.text) have the "read" access bit set, as well as the "execute" access bit. Are there any bonafide legit reasons for code to be reading itself instead of executing itself? I thought this was what other sections were for (such as .rdata).
(Specifically, I'm talking about IMAGE_SCN_MEM_READ.)
IMAGE_SCN_MEM_EXECUTE |IMAGE_SCN_MEM_READ are mapped into memory as PAGE_EXECUTE_READ, which is equivalent to PAGE_EXECUTE_WRITECOPY. This is needed to enable copy-on-write access. Copy-on-write means that any attempts to modify the page results in a new, process-private copy of the page being created.
There are a few different reasons for needing write-copy:
Code that needs to be relocated by the loader must have this set so that the loader can do the fix-ups. This is very common.
Sections that have code and data in single section would need this as well, to enable modifying process globals. Code & data in a single section can save space, and possibly improve locality by having code and the globals the code uses being on the same page.
Code that attempts to modify itself. I believe this is fairly rare.
Compile-time constants, particularly for long long or double values, are often loaded with a mov register, address statement from the code segment.
The one example I can think of for a reason to read code is to allow for self modifying code. Code must necessarily be able to read itself in order to be self modifying.
Also consider the opposite side. What advantage is gained from disallowing code from reading itself? I struggled for a bit on this one but I can see no advantage gained from doing so.

Resources