ARMv8 address fields for cache? - caching

I am reading the ARM Cortex-A Series Programmer’s Guide for ARMv8-A.
In 11.1.2 Cache tags and Physical Addresses, there is an example of cache address fields.
Example:
Cache is 4-way, 32KB
Cache line = 16 words (64 bytes)
And the address fields stated in the document:
Set(index) = 8 bits, Offset = 6 bits, Tag = 30 bits
From my understanding, an 8-bit index corresponds to 256 cache lines in each way (which is illustrated correctly in the example), and the 6-bit offset (2^6 = 64) correctly addresses the bytes inside a 64-byte line.
However, the cache is 4-way, which means the cache size would be 4 * 256 * 64 = 64KB, not 32KB.
Is my analysis correct, or am I missing something?

Someone asked the same question on arm community website: https://community.arm.com/developer/ip-products/processors/f/cortex-a-forum/8159/how-to-compute-a-cache-size
Here is the reply to that question:
" Got reply from ARM. It is a document error. It should be 2-way set-associative cache. 16KB * 2 = 32 KB "

Related

Cache Configuration for MIPS Processor

I am having trouble figuring out the tag, set, and byte offset fields for a 32-bit MIPS processor with the following cache configurations:
(a) A direct mapped cache with capacity 8192 bytes and block size 32 bytes.
(b) An 8-way set-associative cache with capacity 2048 bytes and block size 16 bytes.
I would greatly appreciate any help or guidance on this matter. Thank you!
I attempted to use the formula for calculating the number of bits for each field in a cache memory address:
```
Tag = 32 - (log2(cache capacity/block size) + log2(block size))
Set/Index = log2(cache capacity/block size)
Byte offset = log2(block size)
```
I expected to be able to use this formula to find the number of bits for each field, but I am unsure if this is the correct formula for this specific MIPS processor.
As a result, I am seeking assistance to verify my understanding and solution.
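For what it's worth, here is a quick Python sketch (my own, not part of the question) of how I would apply that formula. The one caveat is that the index counts sets, so for a set-associative cache you divide the capacity by block size times the number of ways; with ways = 1 it reduces to the direct-mapped formula quoted above.

```python
from math import log2

def fields(capacity, block, ways, addr_bits=32):
    """Split a 32-bit MIPS address into tag/index/offset widths."""
    sets = capacity // (block * ways)        # number of sets, not blocks
    offset = int(log2(block))                # byte within the block
    index = int(log2(sets))                  # which set
    tag = addr_bits - index - offset
    return tag, index, offset

print(fields(8192, 32, 1))   # (a) direct mapped -> (19, 8, 5)
print(fields(2048, 16, 8))   # (b) 8-way         -> (24, 4, 4)
```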

Cache memory design - address decoding

I was given the following problem:
A CPU generates 32 bit addresses for a byte addressable memory. Design an 8 KB cache memory for this CPU (8 KB is the cache size only for the data; it does not include the tag). The block size is 32 bytes. Show the block diagram, and the address decoding for direct mapped cache memory.
I determined that:
8 bits are needed for indexing
5 bits are needed for block offset
19 bits for the tag.
Is my solution correct? How should I do the decoding?
The numbers seem correct; however, it is always worth pointing out in your solution that you're taking cache associativity into account. Specifically, 32 - 8 - 5 = 19 is only valid when the cache is direct-mapped.
The decoding part is nicely illustrated in your drawing – it's simply the act of splitting the 32-bit address used by the CPU into the tag, index, and offset fields.
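If it helps, here is a minimal Python sketch of that decoding for this particular cache (8KB direct-mapped, 32-byte blocks). The constants and the `decode` helper are illustrative only; hardware does the same thing by simply wiring the address bits to the tag comparator and the RAM index.

```python
OFFSET_BITS = 5                                  # 32-byte blocks
INDEX_BITS  = 8                                  # 8KB / 32B = 256 lines
TAG_BITS    = 32 - INDEX_BITS - OFFSET_BITS      # 19

def decode(addr):
    """Split a 32-bit address into (tag, index, byte offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index  = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(decode(0x12345678))   # tag=0x91A2, index=0xB3, offset=0x18
```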

Mapping 512KB main memory into 1KB cache homework question

I'm sorry if I made an error in posting this. Please let me know if I need to change anything.
I've received my computer architecture homework back and I missed this question. My professor's explanation didn't make sense to me, and I disagree with what he told me, so I am here asking what you guys think.
Here is the question:
A computer uses 16-bit memory addresses. Main memory is 512KB, and the cache is 1KB with 32B per block. Given each of the following mapping functions, calculate the number of bits in each field of the memory address.
Here is how I worked through the direct mapping part of the problem:
Cache memory: 1KB (2^10), 16-bit memory addresses (1 word = 2B) -> 1024B/2B = 512 words, 16 words per block (32B) -> 512/16 = 32 cache memory blocks.
Main memory: 512 KB (2^19), 16-bit memory addresses (1 word = 2B) -> 524288B/2B = 256K words, 16 words per block (32B) -> 256K/16 = 16384 or 16K main memory blocks.
I understand the word bits as follows: 32B per block allows for 16 16-bit memory addresses per block. This (I believe) supports that: 1 word = 16 bits = 2B -> 32B/2B = 16 words in each block. This equates to 2^4 = 4 bits for determining which word in the block, leaving 12 bits for the tag and block bits in the memory address.
Now, in order to map 16K main memory blocks directly into 32 cache memory blocks, there will have to be 512 main memory blocks mapped to each cache memory block. So 512 of the 16K blocks map to each 1 of the 32 cache blocks.
Here is where I am confused. Doesn't this require 9 tag bits, as 2^9 = 512 (main memory blocks possibly mapped into one cache memory block)?
For the block bits, which point to a particular block in the cache, this requires 5 bits. 2^5 = 32, blocks in cache memory.
This would require 18 bits in the memory address.
Here is my professor's answer for this question:
2^5 = 32 -> 5 Word bits
(1KB)/(32B) = 32 blocks -> 5 Block bits
16 – 5 – 5 = 6 Tag bits
I did not realize I could simply subtract the required block and word bits to get the tag bits. But it still doesn't make sense to me. 2^6 = 64 blocks per cache block. 64*32 gives 2048. I can't wrap my head around this. Can someone please help?
Okay, the terminology that I learnt is slightly different, but the principle should be the same for this explanation.
A cache has multiple sets (sort of like cells). Each set contains either one cache line (holding 1 block of data) for direct mapping, or multiple cache lines (each holding 1 block of data) for n-way associative mapping.
In mapping main memory blocks to the cache, the main memory address (16 bits) is divided into 3 fields: tag, index bits and offset bits. A memory cell is 1 byte, and a block is made up of several cells.
Offset bits are used to access the individual bytes of a memory block. Think of them as the offset on top of the block base address to get the byte you want (I assume your memory is byte-addressable rather than word-addressable, as only being able to access 2B words would be inflexible). This is what your prof/textbook calls the word bits. Hence, if a block has 32 bytes, log2(block size) = 5 bits are needed to address the individual bytes in the mapped block.
Index bits (called block bits in a direct-mapped cache, because the number of sets equals the number of blocks in the cache) identify which set/cache line/cache block a main memory block is mapped to. There are 1KB/32B = 32 cache blocks in the cache. As direct mapping is used, each set contains only 1 cache block, so there are 32 sets in this cache. Thus, to select the correct set, 5 bits are needed, and therefore index bits = 5.
The tag is used to determine whether the data block in the cache is the one we are actually looking for from main memory. As the main memory address is 16 bits and we already know the index and offset fields, it is easy to deduce that the tag will need 16 - 5 - 5 = 6 bits. How the tag is determined is not really a concern, as the block size and cache size (and hence the number of sets in the cache) are given here.
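Putting your professor's numbers into a few lines of Python (my own sketch, using the given 1KB cache, 32B blocks and 16-bit addresses) gives the same 6/5/5 split; the constant names are just for illustration:

```python
from math import log2

ADDR_BITS  = 16
CACHE_SIZE = 1024      # 1 KB cache
BLOCK_SIZE = 32        # 32 B per block

offset_bits = int(log2(BLOCK_SIZE))                 # "word" bits: 5
index_bits  = int(log2(CACHE_SIZE // BLOCK_SIZE))   # "block" bits: 5 (32 blocks)
tag_bits    = ADDR_BITS - index_bits - offset_bits  # tag bits: 6

print(tag_bits, index_bits, offset_bits)   # 6 5 5
```

The key point is that the tag does not need to count the 512 main memory blocks that compete for a cache block; it only needs to fill whatever address bits are left over after the index and offset.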

Direct Mapped Cache Bits

So I'm having trouble understanding some parts of direct-mapped caching. I have a byte-addressed memory system that has 64KB memory with a 2KB direct-mapped cache. Cache blocks are 32 bytes.
From what I understand (and please correct me if I'm wrong), I have 2048B/32B = 64 cache blocks. I need to figure out how many total bits are needed for each cache entry (tag, "dirty" bit, etc.).
I believe I'll need 6 index bits (2^6 = 64, the number of blocks)
and 5 offset bits (2^5 = 32, the size of a cache block).
I'm just having trouble figuring out the rest that are needed.
The bits of a physical address can be split into 3 groups: the least significant group, which determines the "offset of the byte within the cache block" and doesn't need to be stored in the tag; the middle group, which determines the "index of the cache block within the cache" and doesn't need to be stored in the tag; and the most significant group, which is used to check whether the data in the cache is the data you want and must be stored in the tag.
With 64 KiB of physical address space a physical address would have 16 bits; and if your cache is 2048 bytes then (for "direct mapped") the least significant group of bits and the middle group of bits combined must add up to a total of 11 bits. That means the most significant group of bits (which must be stored in the tag) needs to be 5 bits (because 16 bits - 11 bits = 5 bits).
For other bits: you always need something to indicate if the entry is used or empty; if the cache is "write-back" you need a dirty bit, but if the cache is "write-through" you don't; if there are multiple CPUs and cache coherency you need more bits for that (e.g. exclusive/shared); and if there's any kind of error detection or correction you need more bits for that (e.g. a "parity bit"). This means the storage per entry, beyond the data itself, is at least 6 bits (5 tag bits plus a valid bit), but may be more.
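Here is a rough Python tally for this configuration, as a sketch only: it assumes a valid bit and a write-back dirty bit and ignores coherency and parity bits, so treat the totals as a lower bound.

```python
from math import log2

ADDR_BITS  = 16          # 64 KiB of physical address space
CACHE_SIZE = 2048        # 2 KiB, direct-mapped
BLOCK_SIZE = 32          # bytes per block

index_bits  = int(log2(CACHE_SIZE // BLOCK_SIZE))    # 6 (64 blocks)
offset_bits = int(log2(BLOCK_SIZE))                  # 5
tag_bits    = ADDR_BITS - index_bits - offset_bits   # 5

valid_bit = 1            # entry used / empty
dirty_bit = 1            # only needed for a write-back cache
data_bits = BLOCK_SIZE * 8

entry_bits = tag_bits + valid_bit + dirty_bit + data_bits
print(entry_bits)                                 # 263 bits per cache entry
print(entry_bits * (CACHE_SIZE // BLOCK_SIZE))    # 16832 bits for the whole cache
```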

Why 16-bit address with 12-bit offset results in 4KB page size?

I'm reading the "Modern Operating Systems" book. And I'm confused about the "Page Size".
In the book, the author says,
The incoming 16-bit virtual address is split into a 4-bit page number and 12-bit offset. With 4 bits for the page number, we can have 16 pages, and with 12 bits for the offset, we can address all 4096 bytes within a page.
Why 4096 bytes? With 12 bits we can address 4096 entries within a page, correct. But one entry is an address (in this case, address size = 16 bits). So I think we can address 4096 (entries) * 16 (bits) = 4096 (entries) * 2 (bytes) = 8KB, so why does the book say that we can address 4096 bytes?
Thanks in advance! :)
This is assuming byte-addressed memory (which almost every machine made in the past 30 years uses), so each address refers to a byte, not an entry or address or any other larger value. To hold a 16-bit value, you'll need two consecutive addresses (two bytes).
More than 30 years ago, there used to be machines which were word addressed, which worked like you surmise. But such machines had a tough time dealing with byte-oriented data (such as ASCII characters), and so have fallen out of favor. Nowadays, things like byte addressability, 8-bit bytes and twos-complement integers are pretty much just assumed.
The 12 bits are an offset within a page. The offset is in bytes, not addresses. 2^12 is 4096.
Because with 12 bits, we can address 2^12 = 4096 slots. Each slot represents an address whose size is 1 byte in byte-addressable memory. Hence the total size is 4096 * 1 = 4096 bytes = 4KB.
What you are calculating is the page size, i.e. the size of a page in memory. As we use 12 bits for the offset, each frame in physical memory is 2^12 = 4096 bytes. However, 2^12 entries x 2 bytes = 8K is what you would get if each offset selected a 2-byte entry rather than a byte.
Okay, so you have a 16-bit virtual address; let's see what that means. It means you have 2^16 = 65536 bytes.
A 4-bit page number means there are 16 pages, as 2^4 = 16.
Now name the pages page1, page2, ..., page16.
We are left with 12 bits; let's see how many addresses 12 bits can represent: 2^12 = 4096 bytes.
The 65536 bytes can also be reached by dividing them into 16 pages of 4096 bytes each, as 4096 * 16 = 65536.
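As a small illustration (my own sketch, not from the book), this is all the split amounts to; the `split` helper name is made up:

```python
OFFSET_BITS = 12                      # 4096-byte pages
PAGE_MASK   = (1 << OFFSET_BITS) - 1  # 0x0FFF

def split(vaddr):
    """Split a 16-bit virtual address into (page number, byte offset)."""
    page   = vaddr >> OFFSET_BITS     # top 4 bits -> one of 16 pages
    offset = vaddr & PAGE_MASK        # bottom 12 bits -> byte within the page
    return page, offset

print(split(0x2ABC))   # (2, 0xABC): byte 0xABC of page 2
```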
