Write-Through No-Write-Allocate Penalty Calculation - caching

I am considering a write through, no write allocate (write no allocate) cache. I understand these by the following definitions:
Write through: information is written both to the block in the cache and to the block in the lower-level memory.
No write allocate: on a write miss, the block is modified in main memory and not loaded into the cache.
tcache : the time it takes to access the first level of cache
tmem : the time it takes to access something in memory
We have the following scenarios:
read hit: value is found in cache, only tcache is required
read miss: value is not found in cache, ( tcache + tmem )
write hit: writes to both cache and main memory, ( tcache + tmem )
write miss: writes directly to main memory, ( tcache + tmem )
The Wikipedia flow diagram for write-through/no-write-allocate shows that we always have to go through the cache first, even though we aren't populating the cache. If we know a write will never populate the cache in this situation, why can't we spend only tmem performing the operation, rather than ( tcache + tmem )? It seems like we are unnecessarily spending extra time checking something we know we will not update.
My only guess is that Paul A. Clayton's comment on a previous question about this type of cache is the reason we still have to interact with the cache on a write. But even then, I don't see why the cache check and the memory update can't be done in parallel.
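To make the timing question concrete, here is a toy model (the cycle counts are made up, not from any real machine) comparing the serial probe-then-write cost against overlapping the tag check with the memory write:

```python
# Toy timing model for a write-through, no-write-allocate cache.
# t_cache and t_mem are illustrative numbers, not measurements.
t_cache = 1    # cycles to probe the L1 tags
t_mem = 100    # cycles to access main memory

# Serial model: probe the cache first, then access memory.
write_serial = t_cache + t_mem

# Parallel model: start the memory write while the tag check runs.
# The tag check is still needed (a write *hit* must update the cached
# copy to keep it consistent), but it can overlap the memory access.
write_parallel = max(t_cache, t_mem)

print(write_serial)    # 101
print(write_parallel)  # 100
```

Under this model the tag check costs essentially nothing once it is overlapped with the memory write, which is exactly the parallelism the question asks about.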

Modifying the cache access delay in gem5 does not work

When testing the cache access latency on my gem5 build, the access latency of L1 is 100 cycles lower than that of L2. My modification was to change tag_latency, data_latency, and response_latency in the L2 class in gem5/configs/common/Caches.py. Their original value was 20; I changed them all to 5, or all to 0. Every time I recompile gem5 and run it again, the timing does not change. Why is that?
I am using the classic cache model.
By the way, do data_latency, tag_latency, and response_latency mean the data access delay, the tag access delay, and the delay in responding to the CPU?
gem5/build/X86/gem5.opt --debug-flags=O3CPUAll --debug-start=120000000000 \
    --outdir=gem5/results/test/final gem5/configs/example/attack_code_config.py \
    --cmd=final \
    --benchmark_stdout=gem5/results/test/final/final.out \
    --benchmark_stderr=gem5/results/test/final/final.err \
    --mem-size=4GB --l1d_size=32kB --l1d_assoc=8 --l1i_size=32kB --l1i_assoc=8 \
    --l2_size=256kB --l2_assoc=8 --l1d_replacement=LRU --l1i_replacement=LRU \
    --caches --cpu-type=DerivO3CPU
--cmd, --l1d_replacement, etc. are options I added to the option parser.
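For reference, a sketch of how such an L2 class is typically declared in the style of gem5/configs/common/Caches.py for the classic memory system (the values are illustrative, and whether they take effect depends on the config script actually instantiating this class rather than overriding the parameters from its own command-line options):

```python
# Hypothetical sketch in the style of gem5/configs/common/Caches.py
# (classic memory system). Values are illustrative only.
from m5.objects import Cache

class L2Cache(Cache):
    # All three latencies are expressed in cycles of the cache's
    # clock domain.
    tag_latency = 5        # time to look up the tag array
    data_latency = 5       # time to read the data array
    response_latency = 5   # time to send the response back upstream
    size = '256kB'
    assoc = 8
    mshrs = 20             # outstanding-miss buffers
    tgts_per_mshr = 12
```

Note that the command line above already passes --l2_size and --l2_assoc, which suggests the config script sets L2 parameters from its own options; if it does the same for the latencies, edits to the class defaults would be silently overridden.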

what happens to cache and DRAM when executing "a=5"?

If a process writes a immediate operand to an address
int a;
a = 5;
what happens to L1-Data cache and DRAM?
DRAM fills "5" first or L1-Data Cache fills "5" first?
The compiler assigns some memory address to variable a. When a = 5 is executed on a multiprocessor system, a request is sent downstream to invalidate all other copies of the line, giving the processor executing the code that cache line in an exclusive coherence state. The value 5 is then written to the L1 data cache first; DRAM is updated later (assuming a write-back policy, which keeps the line in the cache rather than immediately writing it out to memory/DRAM).

Writing a block in cache whose dirty bit is set

In computer architecture, if the processor wants to read a block in the cache whose dirty bit is set, will the processor first write this block back to memory, or just read the block without a write-back?
For reads, the data is read from the cache, as that is the latest copy of the data. For writes to the same block, the new data (to the same address) is written into the cache and the dirty bit is set again. Only when the block is evicted (for example on a conflict miss, when a different address maps to the same cache set) is the data actually written back to the next level of the memory hierarchy.
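The policy above can be sketched with a toy one-line write-back cache (a hypothetical model, not any real hardware): reads and writes to a dirty block touch only the cache, and the write-back happens only on eviction.

```python
# Minimal sketch of dirty-bit handling in a single-line write-back
# cache (toy model: one block, memory modeled as a dict).
class WriteBackLine:
    def __init__(self):
        self.tag = None
        self.data = None
        self.dirty = False

    def access(self, memory, addr, value=None):
        """Read (value is None) or write one block; returns the data."""
        if self.tag != addr:                 # miss
            if self.dirty:                   # evict: write back first
                memory[self.tag] = self.data
            self.tag = addr                  # allocate the new block
            self.data = memory.get(addr, 0)
            self.dirty = False
        if value is None:                    # read hit: no write-back
            return self.data
        self.data = value                    # write hit: update cache,
        self.dirty = True                    # memory updated lazily
        return self.data

memory = {0: 7}
line = WriteBackLine()
line.access(memory, 0, 42)     # write: cached copy is 42, line dirty
print(line.access(memory, 0))  # read of dirty block: 42, no write-back
print(memory[0])               # memory still holds the stale 7
line.access(memory, 8)         # conflict: dirty block written back
print(memory[0])               # now 42
```

Reading the dirty block returns the cached value directly; memory only catches up when address 8 evicts the line.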

Write in invalid state of MESI protocol

How is a write to a memory location that is not in the cache handled in the MESI protocol? The state diagrams I have seen mark it as a Write Miss, but I can't follow what happens in reality.
I think this results in a load operation on the bus to ensure that the processor trying to do the write gets exclusive access to the location, and then the block is modified. Is this how it's done in reality, or is the handling of a write in the Invalid state implementation-defined?
If the policy is allocate on a write miss:
If the block was not present in any other cache but only in main memory, the block is first fetched into the cache and marked with the M (Modified) state, and then the write proceeds.
If the block was present in some other caches, its copies in those caches are first invalidated, so that this cache gains the only copy of the block, and then the write proceeds.
If the policy is no allocate on write miss: all write misses go directly to main memory, and no copy is fetched into the cache. If main memory does not have the only copy of the block (some other cache has one), the other copies are first invalidated, and then the write takes place in main memory.
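The allocate-on-write-miss case can be sketched as a toy state-transition function (states only; a real protocol also moves data, handles the E state on reads, and arbitrates races on the bus):

```python
# Toy MESI write-miss sketch with write-allocate: the writer issues a
# read-for-ownership, snoopers invalidate, writer ends up in M.
M, E, S, I = "M", "E", "S", "I"

def write_miss(states, writer):
    """states[i] is the MESI state of the block in cache i."""
    for i, s in enumerate(states):
        if i != writer and s != I:
            states[i] = I      # snooping caches invalidate their copies
    states[writer] = M         # writer holds the only, modified copy
    return states

# Block is Shared in caches 0 and 1; cache 2 writes it (miss in 2).
print(write_miss([S, S, I], writer=2))  # ['I', 'I', 'M']
```

This matches the answer above: whether the block came from memory alone or from peer caches, the end state is one Modified copy in the writer and Invalid everywhere else.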

Why does copy-on-write virtual memory need to be backed by a disk page?

Reading about copy-on-write in Windows memory management: it says the system will find a free page in RAM for the shared memory (backed immediately by a disk page).
Why is it necessary to back the RAM page with a disk page? It is not being swapped out; it was just created.
I remember RAM pages only get swapped out when there are not enough free pages.
The system needs a guarantee that when a write happens, space will be available: the allocation must fail now, rather than letting a write fail later when the system runs out of disk space.
That doesn't mean the disk is written to; the page reservation is merely bookkeeping.
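The reservation-as-bookkeeping idea can be sketched with a toy commit-charge counter (hypothetical numbers and names; Windows' real commit accounting is far more involved):

```python
# Toy model of commit accounting for copy-on-write pages: backing
# store is reserved up front, but nothing is written to disk.
class CommitCharge:
    def __init__(self, limit_pages):
        self.limit = limit_pages   # RAM + pagefile, in pages
        self.reserved = 0

    def reserve(self, pages):
        """Bookkeeping only: fail now rather than on a later write."""
        if self.reserved + pages > self.limit:
            raise MemoryError("commit limit exceeded")
        self.reserved += pages

charge = CommitCharge(limit_pages=4)
charge.reserve(3)        # COW mapping created: counted, no disk I/O
print(charge.reserved)   # 3
try:
    charge.reserve(2)    # would exceed the commit limit
except MemoryError:
    print("refused")
```

The failure happens at reservation time, which is exactly why the mapping must be backed by (accounted against) disk pages even though none are written.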
