I'm studying for a final and I have the following example problem with professor answer:
Question
An instruction cache (I-cache) has a storage capacity of 524288 bytes
and employs 128-byte cache lines. The memory with which the cache is
used is 209715200 bytes in size. How many different memory blocks
could potentially map to set 10 within the 4-way set associative
cache?
Answer
Any memory block whose block number has bits 16 through 7 equal to
0000001010 would map to set 10. Since the memory is 209715200 bytes in
size (=0xC800000), so only the low 28 bits of the address will ever be
none-zero. That is, at most 28 bits are needed to specify any address
within the memory. So the upper 28-10-7 = 11 bits would determine how
many different blocks map to an individual set. The value in the upper
11 bits range from 0 to 11000111111 = decimal1599. So 1600 different
blocks map to set 10 within the 4-way set associative cache.
Can anyone explain this answer? I am particularly confused about the following items. I thought the upper 11 bits were the tag. Why would this have to do with whether it maps to set 10 or not. I also don't understand where the upper bound of 11000111111 for the 11 bits came from?
You are correct. The tag is the upper 11 bits. But what the poorly worded explanation is getting at is that the tag corresponding to the physical memory will never be mapped to greater than 11000111111, because the physical memory with a size of 0xC800000 has an upper limit of that number. In other words, a tag greater than 11000111111 will map to a physical address of 0xC800001 or greater.
Essentially to find the number of blocks that a set can be mapped to, just count the number of tag possibilities that there can be given the size of the physical memory and the size of the set bits and offset bits.
Related
I'm sorry if I made an error in posting this. Please let me know if I need to change anything.
I've received my computer architecture homework back and I missed this question. My professor's explanation didn't make sense to me, and I disagree with what he told me, so I am here asking what you guys think.
Here is the question:
A computer uses 16-bit memory addresses. Main memory is 512KB, and the cache is 1KB with 32B per block. Given each of the following mapping functions, calculate the number of bits in each field of the memory address.
Here is how I worked through the direct mapping part of the problem:
Cache memory: 1KB (2^10), 16-bit memory addresses (1 word = 2B) -> 1024B/2B = 512 words, 16 words per block (32B) -> 512/16 = 32 cache memory blocks.
Main memory: 512 KB (2^19), 16-bit memory addresses (1 word = 2B) -> 524288B/2B = 256K words, 16 words per block (32B) -> 256K/16 = 16384 or 16K main memory blocks.
I understand the word tag as such: 32B per block allows for 16 16-bit memory addresses per block. This (I believe) supports that: 1 word = 16 bits = 2 B -> 32B/2B = 16 words in each block. This equates to 2^4 = 4 bits for determining which word in the block, leaving 12 bits for tag and block bits in the memory address.
Now, in order to map 16K main memory blocks directly into 32 cache memory blocks, there will have to be 512 main memory blocks mapped to each cache memory block. So 512/16K blocks per 1/32 blocks.
Here is where I am confused. Doesn't this require 9 tag bits, as 2^9 = 512 (main memory blocks possibly mapped into one cache memory block)?
For the block bits, which point to a particular block in the cache, this requires 5 bits. 2^5 = 32, blocks in cache memory.
This would require 18 bits in the memory address.
Here is my professor's answer for this question:
2^5 = 32 -> 5 Word bits
(1KB)/(32B) = 32 blocks -> 5 Block bits
16 – 5 – 5 = 6 Tag bits
I did not realize I could simply subtract the required block and word bits to get the tag bits. But it still doesn't make sense to me. 2^6 = 64 blocks per cache block. 64*32 gives 2048. I can't wrap my head around this. Can someone please help?
Okay, the terminology that i learnt is slightly different but the principal should be the same for this explanation.
So cache will have multiple sets (sort of like a cell). And each set will have 1 cache line (containing 1 block of data) or multiple cache lines (each contain 1 block of data) (direct mapping or n-associativity mapping).
In mapping the main memory blocks to the cache, the main memory address (16 bit) is divided into 3 fields: tag, index bits and offset bits. A memory cell is 1 byte and a block is made up of a few cells
Offset bits are used to access the individual bytes of a memory block. Think of it as the offset on top of the block base address to get the byte you want (i assume your memory should be byte-addressable rather than word-addressable as it doesn't make sense to access 2B word as this would be inflexible) And here your prof/textbook call it as word bit. Hence if a block has 32 Bytes, there would be log2(block size) = 5 bits needed to access the individuals cells in the mapped block.
Index bits (in direct mapped cache is called block bits too as the number of set is the same as the number of blocks in the cache) is used to identify which set/cache line/ cache block that the main memory block is mapped to the cache. There are 1KB/32B = 32 cache blocks in the cache. As direct mapping is used, each set contain only 1 cache block and therefore there will be 32 set in this cache. Thus to access the correct set in cache, 5 bits is needed and therefore index bits = 5 bits
Tag is a name to determine if the data block in cache is the correct one we are looking one from the main memory. As the address of main memory is 16 bit and we already know index and offset fields, it is easy to deduce that tag will need 16 - 5 - 5 6 bits. How we determine the tag is not really a concern as the block size and cache size (and hence no. of sets in cache is given here).
So Im having trouble understanding some parts to direct mapped caching. I have a byte addressed memory system that has 64KB memory with a 2KB direct-mapped cache. Cache blocks are 32 bytes.
From what I understand and please correct me if i'm wrong, I have 2048B/32B = 64 cache blocks. I need to figure out how many total bits are needed for each cache entry (tag, "dirty" bit, etc).
I believe i'll need 6 index bits (2^6 = 64 (# of blocks))
and 5 offset bits (2^5 = 32 (size of cache block))
Im just having trouble figuring out the rest that are needed.
The bits of a physical address can be split into 3 groups - the least significant group of bits that determine "offset of byte within cache block" and doesn't need to be stored in the tag, the middle group of bits that determine "index of cache block within the cache" and doesn't need to be stored in the tag, and the most significant group of bits that is used to check if the data in the cache is the data you want which must be stored in the tag.
With 64 KiB of physical address space a physical address would have 16 bits; and if your cache is 2048 bytes then (for "direct mapped") the least significant group of bits and the middle group of bits combined must add up to a total of 11 bits. That means the most significant group of bits (which must be stored in the tag) needs to be 5 bits (because 16 bits - 11 bits = 5 bits).
For other bits; you always need something to indicate if the entry is used or empty; if the cache is "write-back" you need a dirty bit but if the cache is "write-through" you don't; if there are multiple CPUs and cache coherency you need more bits for that (e.g. exclusive/shared); and if there's any kind of error detection or correction you need more bits for that (e.g. a "parity bit"). This means the total tag size is at least 6 bits (but may be more).
This question is mostly just to clarify my understanding.
Say I have a 32-bit Computer, with virtual memory space of 2^32 bytes.
Memory paging is used, each page is 2^8 bytes.
So the memory address sizes are 24 bits. Since (2^32/2^8 = 2^24 bytes).
And the offset would be 8 bits? This I do not quite understand. Since I know that the total address is 32, and 24 is already taken by the pages, so the remainder is the offset of 8.
Lastly for the page size. If each physical memory address is stored in 32 bits (4 bytes), the table size would be 2^26 (2^24 * 2^2). Is this correct?
Page Table size = number of entries*size of entry
In your case, each page is 2^8 bytes, that is - you need 8 bits offset. You got that one right.
This leaves us with 24 bits for Page. 2^24 different pages.
Size of page-table for process X is: 2^24*Entry-Size. which is not provided by you here.
Lets assume it needs 32 bits per entry. Then, 2^24*32 = 2^24*2^5 = 2^29 bits.
I'm trying to understand direct mapped cache, but it is a very complex concept. I have written what I think I understand so far, but I am unsure whether I am correct or not. Can somebody please verify if the explanation below is correct?
E.g, for a made up computer, just for the sake of this question, there 1024 memory locations (cells) in the RAM. This equals 2^10 so the address for each of these memory locations must be 10 bits long.
The CPU is asked to get data from the RAM memory address 1100100111. However the CPU doesn't access the data directly from this memory address in the RAM. The RAM stores this data to cache memory and then the CPU gets the data from the cache memory.
There are different ways of doing this, one being direct mapped cache. The cache memory and ram memory are divided up into blocks, where the number of cells in the blocks in each memory must be the same. The number of blocks in the RAM and cache must also be a power of 2.
In this example lets say there are 2^6 = 64 blocks in the RAM, so there are 1024/64 = 16 cells in each block. Lets say there are 2^2 = 4 blocks in the cache, so the cache has 64 cells. The "6" and "2" in the exponents of these numbers are important later on.
Because the The number of blocks in the RAM and cache is a power of 2, it makes the calculations easy. In our address 1100100111 the last 6 bits mark the offset 100111 (the 6 comes from the fact that 2^6 = 64), and the remaining 4 bits 1100 mark the RAM block number the data is stored in. Within this block number are two other important numbers. First the cache block number; this is the cache block that that RAM block would store to. This is the first 2 bits after the offset, so it will be 00 (The 2 comes from the fact that There are 2^2 = 4 blocks in the cache). The remaining 2 numbers in the address mark the tag. This will be 11.
So when the CPU is asked to get data from memory address 1100100111 it will look for this data in cache block number 00. It will compare the tag of the address 11 to the tag saved in the cache, which is a separate piece of memory used to store information about where from the RAM the data has come from. If the tags are the same this is a hit and this is the data the CPU is looking for. If the tag of the address and the tag in the memory are different, then this is a miss, and the data isn't stored in the cache.
If this is the case, the cache controller will get the data from block number 1100 in the RAM and store it in the cache block number 00, and update the tag in this block to 11. The CPU can now get the data in this block.
Is this all correct? I need to understand this before I can start to try and understand associative and set associative memory.
Thanks!
You have the right idea, but your numbers went wrong somewhere. In your example you have a direct-mapped cache of 4 blocks/lines of 16 bytes/cells each. The address 1100100111 will be divided up as follows. You use the least significant four bits 0111 as the offset because it refers to which cell of a particular block you want. I think you accidentally included the block number as part of the offset. Anyway, the next least significant two bits 10 will be the block number and the most significant four bits 1100 will be the tag.
Your understanding seems to be fine. One thing more that is necessary is a bit to indicate if the cache block is valid or not. Good luck with the associative stuff!
I have a very simple (n00b) question.
A 20-bit external address bus gave a 1 MB physical address space (2^20
= 1,048,576).(Wikipedia)
Why 1 MByte?
2^20 = 1,048,576 bit = 1Mbit = 128KByte not 1MB
I misunderstood something.
When you have 20 bits you can address up to 2^20. This is your range, not the number of bits.
I.e. if you have 8 bits your range is up to 255 (unsigned) not 2^8 bits.
So with 20 bits you can address up to 2^20 bytes i.e. 1MB
I.e. with 20 bits you can represent addresses from 0 up to 2^20 = 1,048,576. I.e. you can reference up to 1MB of memory.
1 << 20 addresses, that is 1,048,576 bytes addressable. Hence, 1 MB physical address space.
Because the smallest addressable unit of memory (in general - some architectures have small bit-addressable pieces of memory) is the byte, not the bit. That is, each address refers to a byte, rather than to a bit.
Why, you ask? Direct access to individual bits is almost never needed - and if you need it, you can still load the surrounding byte and get the bit with bit masks and shifts. Increasing the bits per address allows you to address more memory with the same address range.
Note that a byte doesn't have to be 8 bit, strictly speaking, though it's ubiquitous by now. But regardless of the byte size, you're grouping bits together to be able to handle larger quantities of them.