So on the website GeeksforGeeks I found this formula for the average memory access time of a multi-level cache:
Tavg = H1 * C1 + (1 – H1) * (H2 * C2 + (1 – H2) * M)
where
H1 is the Hit rate in the L1 caches.
H2 is the Hit rate in the L2 cache.
C1 is the Time to access information in the L1 caches.
C2 is the Miss penalty to transfer information from the L2 cache to an L1 cache.
M is the Miss penalty to transfer information from the main memory to the L2 cache.
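As a quick sanity check, the two-level formula can be evaluated with a short sketch; none of the numbers below come from the question, they are made up for illustration:

```python
def amat_two_level(h1, c1, h2, c2, m):
    """Average memory access time for an L1/L2 hierarchy.

    h1, h2: hit rates; c1: L1 access time; c2: L2 miss penalty;
    m: main-memory miss penalty (all times in the same unit).
    """
    return h1 * c1 + (1 - h1) * (h2 * c2 + (1 - h2) * m)

# Illustrative (made-up) numbers: 90% L1 hits at 1 ns, 80% L2 hits at 10 ns,
# 100 ns to go to main memory.
print(amat_two_level(0.9, 1, 0.8, 10, 100))  # 0.9*1 + 0.1*(0.8*10 + 0.2*100) ≈ 3.7
```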
But I wonder if you have an L3, will it be this?:
Tavg = H1 * C1 + (1 – H1) * (H2 * C2 +(1 – H2) *M ) * (H3 * C3+(1-H3)*M3)
where
H3 is the Hit rate in the L3 cache.
C3 is the Miss penalty to transfer information from the L3 cache to an L2 cache.
M3 is the Miss penalty to transfer information from the main memory to the L3 cache.
It doesn't seem right, especially since L3 is often a cache for the whole chip rather than just an L2 backup (from what I understand).
On another site I found this:
Tavg = H1*T1+(1-H1)H2(T1+T2)+(1-H1)(1-H2)(T1+T2+T3)
But that doesn't seem right either; the only term from L3 is T3. If it is right, nothing explains why, so I'm confused.
I'm trying to find Tavg both when there is an L3 (with L2 and L1) and when there are only L2 and L1. Any advice is appreciated!
Thank you!
In my research there wasn't much about the access time of L3, except the last link, but I can't find an explanation of the access time with three levels.
Related
I am attempting to find the average memory access time (AMAT) of a single level cache. In order to do so, miss penalty must be calculated since the AMAT formula requires it.
Doing this for a multilevel cache requires using the next level cache penalty. But for a single level, there is obviously no other cache level.
So how is this calculated?
formula:
AMAT = HIT-TIME + MISS-RATE * MISS-PENALTY
You have the correct formula to calculate the AMAT, however you may be misinterpreting the components of the formula. Let’s take a look at how to use this equation, first with a single-level cache and next with a multi-level cache.
Suppose you have a single-level cache. Hit time represents the amount of time required to search and retrieve data from the cache. Miss rate denotes the percentage of the data requested that does not reside in the cache i.e. the percentage of data you have to go to main memory to retrieve. Miss penalty is the amount of time required to retrieve the data once you miss in the cache. Because we are dealing with a single-level cache, the only other level in the memory hierarchy to consider is main memory for the miss penalty.
Here’s a good example for single-level cache:
L1 cache has an access time of 5ns and a miss rate of 50%
Main memory has an access time of 500ns
AMAT = 5ns + 0.5 * 500ns = 255ns
You always check the cache first, so you always incur the 5 ns hit time overhead. Because our miss rate is 0.5, we find what we are looking for in the L1 cache half the time and must go to main memory the other half of the time. You can calculate the effective miss penalty as a weighted average:
(0.5 * 0ns) + (0.5 * 500ns) = (0.5 * 500ns) = 250ns.
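The single-level arithmetic above can be reproduced directly:

```python
def amat_single(hit_time, miss_rate, miss_penalty):
    # AMAT = hit time + miss rate * miss penalty
    return hit_time + miss_rate * miss_penalty

# L1: 5 ns access time, 50% miss rate; main memory: 500 ns
print(amat_single(5, 0.5, 500))  # 255.0 ns
```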
Now, suppose you have a multi-level cache i.e. L1 and L2 cache. Hit time now represents the amount of time to retrieve data in the L1 cache. Miss rate is an indication of how often we miss in the L1 cache. Calculating the miss penalty in a multi-level cache is not as straightforward as before because we need to consider the time required to read data from the L2 cache as well as how often we miss in the L2 cache.
Here’s a good example:
L1 cache has an access time of 5 ns and miss rate of 50%
L2 cache has an access time of 50 ns and miss rate of 20%
Main memory has an access time of 500 ns
AMAT = 5ns + 0.5 * (50ns + 0.2 * 500ns) = 80 ns
Again, you always check the L1 cache first, so you always incur a 5 ns hit time overhead. Because our miss rate is 0.5, we find what we are looking for in the L1 cache half the time and must go down the memory hierarchy (L2 cache, main memory) the other half of the time. If we do not find the data in the L1 cache, we always look in the L2 cache next, so we incur a 50 ns hit time overhead every time we miss in the L1 cache. When the data is not in the L2 cache either (which is 20% of the time), we must go to main memory, which has an access time of 500 ns.
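The multi-level example works the same way, with the L1 miss penalty being the bracketed L2-plus-memory term:

```python
def amat_two_level(l1_hit_time, l1_miss_rate, l2_hit_time, l2_miss_rate, mem_time):
    # The miss penalty of L1 is itself an AMAT over the L2 cache and main memory.
    return l1_hit_time + l1_miss_rate * (l2_hit_time + l2_miss_rate * mem_time)

# L1: 5 ns, 50% miss; L2: 50 ns, 20% miss; main memory: 500 ns
print(amat_two_level(5, 0.5, 50, 0.2, 500))  # 80.0 ns
```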
Is it correct to say the following statements:
Statement 1: The global miss rate for an L2 cache is the same as the local miss rate for the L2 cache, since for a memory reference accessing L2, missing both L1 and L2 is equivalent to missing L2, because it has already missed L1 by virtue of attempting to access L2 (for a system of 2 hierarchical caches, L1 and L2).
Statement 2: The Global Miss Rate for an L1 cache is the same as Local Miss Rate for L1 Cache (For a system of 2 hierarchical caches - L1 and L2)
Statement 3: The Global Miss Rate for an Ln cache is the same as the Local Miss Rate for Ln cache (For a system of 2 hierarchical caches - L1, L2, L3,..., Ln)
Let me answer this as clearly as possible.
Local Miss Rate = Number of Misses in this cache/Number of references to this cache
Global Miss Rate = No. of Misses in this cache/Total number of references made by the processor
Statement 1: False
Explanation:
No. of accesses to L2 = No. of misses in L1
Total number of references made by the processor = Number of accesses to L1
(Since all memory references made by the processor are first served, or attempted to be served, by the L1 cache.)
Therefore for L2 cache,
Local MR = Number of misses in L2 / Number of misses in L1
Global MR = Number of misses in L2 / Number of accesses to L1
So, for L2, Local MR != Global MR
Statement 2: True
Explanation:
For L1 cache,
Local MR = Number of misses in L1 / Number of accesses to L1
Global MR = Number of misses in L1 / Total number of references made by the processor
Total number of references made by the processor = Number of accesses to L1 (since all memory references made by the processor are first served, or attempted to be served, by the L1 cache).
So, for L1, Local MR = Global MR
Statement 3: False. (Correction: the statement should say "for a system of n hierarchical caches L1, L2, ..., Ln".)
Explanation:
This statement is analogous to Statement 1. It means that for L2 in 2-level system, L3 in 3-level system, L4 in 4-level system and so on...
We proved that it is false for L2 in 2-level system in Statement 1. Same explanation follows for the rest.
Thus, for Ln in n-level hierarchical system, Local MR != Global MR
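The definitions above can be checked with hypothetical event counts (the numbers here are made up):

```python
# Hypothetical counts for a two-level hierarchy.
cpu_refs  = 1000   # all processor references go to L1 first
l1_misses = 100    # each L1 miss becomes an L2 access
l2_misses = 20

l1_local  = l1_misses / cpu_refs    # accesses to L1 = all CPU references,
l1_global = l1_misses / cpu_refs    # so local and global MR of L1 coincide
l2_local  = l2_misses / l1_misses   # misses in L2 / accesses to L2
l2_global = l2_misses / cpu_refs    # misses in L2 / all CPU references

print(l1_local, l1_global)  # equal      -> Statement 2 is true
print(l2_local, l2_global)  # different  -> Statement 1 is false
```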
Suppose there are 2 caches L1 and L2
L1
Hit rate of L1=0.8
Access time of L1 = 2ns
and transfer time b/w L1 and CPU is 10ns
L2
Hit rate of L2=0.9
Access time of L2 =5ns
and transfer time b/w L2 and L1 is 100ns
What will be the effective access time in the case of the Look-through and Look-aside policies?
Look through and Look aside are the read policies of a cache architecture.
First, we will see the difference between them.
(1) LOOK THROUGH policy: if the processor wants some content, it first looks in the cache. On a cache hit, it gets the content; on a cache miss (here it would search L2 next), it goes to main memory, reads the block from main memory, and copies the block into the cache for further accesses.
Here , To calculate Access time
h = hit rate
c = cache access time
m = main memory access time
Access time = h * c + (1 - h ) * ( c + m )
for L1: 2 + 10 = 12 ns
for L2 (through L1): 12 + 5 + 100 = 117 ns
for main memory (through L1 + L2): 117 + Tmem ns, where Tmem is the main-memory access time (not given here)
Access time = (0.8 * 12) + (0.2 * 0.9 * 117) + (0.2 * 0.1 * (117 + Tmem)) = (0.8 * 12) + (0.18 * 117) + (0.02 * (117 + Tmem))
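The look-through weighted average can be sketched as follows; since the question does not give the main-memory access time, the 100 ns used below is a made-up assumption:

```python
def look_through_time(h1, h2, t_l1, t_l2, t_mem_path):
    """Effective access time under a look-through policy.

    t_l1 and t_l2 are the cumulative times to resolve a request at L1
    and at L2 (each level is searched only after the previous one misses);
    t_mem_path is the cumulative time when both caches miss.
    """
    return h1 * t_l1 + (1 - h1) * h2 * t_l2 + (1 - h1) * (1 - h2) * t_mem_path

# From the question: L1 resolves in 2 + 10 = 12 ns, L2 in 12 + 5 + 100 = 117 ns.
# The main-memory access time is not given; 100 ns is an assumed value.
mem_access = 100
print(look_through_time(0.8, 0.9, 12, 117, 117 + mem_access))  # ≈ 35 ns under this assumption
```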
(2) LOOK ASIDE policy: the processor looks for the content in the cache and in main memory simultaneously.
Look aside requires more signaling for every access (cache and main memory), and when the content is found in the cache, a cancel signal has to be sent to main memory, which is the biggest disadvantage of the look-aside policy.
Here, to calculate the access time, you have to consider the signaling time of every operation.
Note - most caches use look-through, because nowadays the cache hit ratio is above 95%, so most of the time the content is available in the cache.
[For software/application cache]In both, look-aside and look-through caches, the data is looked up first in the cache. In the look-aside case, it is the responsibility of the application to maintain the consistency of the data in the cache and insert the data back into cache, whereas in the look-through case, the consistency is handled transparently by the cache, without the application being involved.
This means that for look-aside cache, the application sends the request to the main memory, while in look-through cache the request is forwarded from the cache itself.
See the slides 14 and 15 in this slide deck for the visual illustration: https://www.cs.princeton.edu/courses/archive/fall19/cos316/lectures/08-caching.pdf
There is this question regarding solving the AMAT(Average Memory Access Time) given these data:
Legends: Cache Level 1 = L1 Cache Level 2 = L2 Main Memory = M
L1, L2 and M's Hit Time are 1, 10 and 100 respectively whilst
L1 Miss Rate is 5%, L2 5% and M 50%.
Find the AMAT in clock cycles.
After attempting to solve this question, here is my solution:
AMAT's formula is = Hit Time X Hit Rate + Miss Penalty * Miss Rate
Miss Penalty = AMAT for the next cache(say for example, AMAT of L2)
So I manipulated the formula, resulting into something like this:
AMAT = Hit Time L1 X Hit Rate L1 + AMAT L2 * Miss Rate L1
AMAT L2 = Hit Time L2 X Hit Rate L2 + AMAT M * Miss Rate L2
AMAT M = Hit Time M X Hit Rate M + [???] * Miss Rate M
Providing the numerical values, the formulas look like this:
AMAT = 1 X .95 + AMAT L2 * .05
AMAT L2 = 10 X .95 + AMAT M * .05
AMAT M = 100 X .5 + [???] * .5
So my first question would be, is my formula correct?
Next, how to get M's Miss Penalty?
Your "cascading" deductions are correct. If you miss L1 then the data has to be fetched from L2, involving a penalty, and if you miss L2 the data has to be fetched from RAM involving a higher penalty. Based on this your method of computing AMAT is correct.
Now, let's look a bit at the computer architecture. After L1 you have L2; after L2 there can be an L3 or the RAM. After RAM you have the persistent storage (hard disk).
However, the access time of the HDD is very difficult to compute. The majority of profilers (e.g. perf, oprofile) just give you miss rates for the caches and assume that the rest of the reads are served directly from RAM; usually this is the case in modern computers. Reads from hard drives show up as I/O requests. Why is the HDD access time hard to compute? Because your data block can be at any location on the HDD, depending on fragmentation. In the case of spinning disks, the access time can be lower if the data is located near the magnetic reader. So the HDD access time is usually hard to predict when computing AMAT.
This means that if the problem statement gives you only the L1, L2, and M hit times, you cannot compute the HDD hit time. However, you can measure/estimate it in real-life applications using profilers.
So in short:
Is your formula correct? Yes, from a mathematical point of view. From a practical point of view, it is hard to use and generally does not make sense if the M hit rate is not 100%.
How to compute M's miss penalty? If it is not given, you cannot compute it. In practice, by analyzing profiler output you can determine it for a specific measurement (with no guarantee that it will be the same for another measurement, for the reasons described above).
Cheers.
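The cascading idea can be sketched like this, using the standard AMAT = hit time + miss rate * miss penalty form and treating main memory as the last level (assumed to always hit), which sidesteps the unknown miss penalty of M:

```python
def cascaded_amat(levels, last_level_time):
    """levels: list of (hit_time, miss_rate) pairs, from L1 outward.

    The miss penalty of each level is the AMAT of the next one;
    the last level (here, main memory) is assumed to always hit.
    """
    amat = last_level_time
    for hit_time, miss_rate in reversed(levels):
        amat = hit_time + miss_rate * amat
    return amat

# L1: 1 cycle, 5% miss; L2: 10 cycles, 5% miss; memory: 100 cycles, assumed 100% hit.
print(cascaded_amat([(1, 0.05), (10, 0.05)], 100))  # 1 + 0.05*(10 + 0.05*100) = 1.75
```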
(a) You are given a memory system that has two levels of cache (L1 and L2). Following are the specifications:
Hit time of L1 cache: 2 clock cycles
Hit rate of L1 cache: 92%
Miss penalty to L2 cache (hit time of L2): 8 clock cycles
Hit rate of L2 cache: 86%
Miss penalty to main memory: 37 clock cycles
Assume for the moment that hit rate of main memory is 100%.
Given a 2000 instruction program with 37% data transfer instructions (loads/stores), calculate the CPI (Clock Cycles per Instruction) for this scenario.
For this part, I calculated it like this (am I doing this right?):
(m1: miss rate of L1, m2: miss rate of L2)
AMAT = HitTime_L1 + m1*(HitTime_L2 + m2*MissPenalty_L2)
CPI(actual) = CPI(ideal) + (AMAT - CPI(ideal))*AverageMemoryAccess
(b) Now lets add another level of cache, i.e., L3 cache between the L2 cache and the main memory. Consider the following:
Miss penalty to L3 cache (hit time of L3 cache): 13 clock cycles
Hit rate of L3 cache: 81%
Miss penalty to main memory: 37 clock cycles
Other specifications remain as part (a)
For the same 2000 instruction program (which has 37% data transfer instructions), calculate the CPI.
(m1: miss rate of L1, m2: miss rate of L2, m3: miss rate of L3)
AMAT = HitTime_L1
+ m1*(HitTime_L2 + m2*MissPenalty_L2)
+ m2*(HitTime_L3 + m3*MissPenalty_L3)
Is this formula correct and where do I add the miss penalty to main memory in this formula?
It should probably be added with the miss penalty of L3 but I am not sure.
(a) The AMAT calculation is correct if you notice that the MissPenalty_L2 parameter is what you called Miss penalty to main memory.
The CPI is a bit more difficult.
First of all, let's assume that the CPU is not pipelined (sequential processor).
There are 1.37 memory accesses per instruction (one access to fetch the instruction and 0.37 due to data transfer instructions). The ideal case is that all memory accesses hit in the L1 cache.
So, knowing that:
CPI(ideal) = CPI(computation) + CPI(mem) =
CPI(computation) + Memory_Accesses_per_Instruction*HitTime_L1 =
CPI(computation) + 1.37*HitTime_L1
With real memory, the average memory access time is AMAT, so:
CPI(actual) = CPI(computation) + Memory_Accesses_per_Instruction*AMAT =
CPI(ideal) + Memory_Accesses_per_Instruction*(AMAT - HitTime_L1) =
CPI(ideal) + 1.37*(AMAT - HitTime_L1)
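Plugging the part (a) numbers into these formulas gives the following; only the memory-stall term can be computed, since CPI(computation) is not given in the exercise:

```python
# Part (a) numbers from the exercise.
hit_l1, h1 = 2, 0.92       # L1: 2-cycle hit time, 92% hit rate
hit_l2, h2 = 8, 0.86       # L2: 8-cycle hit time, 86% hit rate
mem_penalty = 37           # miss penalty to main memory
accesses_per_instr = 1.37  # 1 instruction fetch + 0.37 data accesses

m1, m2 = 1 - h1, 1 - h2
amat = hit_l1 + m1 * (hit_l2 + m2 * mem_penalty)
stall_cpi = accesses_per_instr * (amat - hit_l1)

# CPI(actual) = CPI(ideal) + stall_cpi, with CPI(ideal) left symbolic here.
print(round(amat, 4), round(stall_cpi, 4))  # AMAT ≈ 3.0544 cycles
```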
(b) Your AMAT calculation is wrong. After a miss at L2, an L3 access follows, and it can itself be a hit or a miss. Try to finish the exercise yourself.