what is the difference between byte addressable and word addressable - caching

A processor has 64 bits address line, and has a 16-way set-associative cache. The memory is a word (i.e 2 bytes) addressable. The cache size is 2 MByte, and the line size is 64 bytes long.
a. Show the memory address structure.
What is the effect of the phrase (2 byte addressable) on solving the question?
The solution will vary between byte addressable and word addressable !

In ye olde days it was common for processors to access memory in units other bytes. It was extremely common for computers to use memory in units of 36 bits (DEC, Sperry). There were deskop computers that used 14-bits.
A word is then the smallest unit of memory that a specific computer addresses. When the word is 8-bits, it is called a byte.
In your example the word becomes 16-bits.
You question is totally screwed up. If you have a word addressable machine, then it does not have bytes.
They appear to be saying
The cache size is 1 Megaword, and the line size is 4 words long
I am continually amazed at how academics come up with ways to make the simple complex.
There are actually two meanings of the term "word" in common use. (1) is the addressable unit size. (2) Processors generally support multiple integer sizes (e.g., 8, 16, 32, 64, 128 bits). Is it common for processor documentation to call one of this sizes greater than 8 a word.
The processor might have a MOVE BYTE instruction that moves 8-bits and a MOVE WORD instruction that moves 16, 32 or 64 bytes; whatever the processor documentation want to call a word.

Related

What percentage of the bits used for data in a 32kB (32,768 byte) direct-mapped write-back cache with a 64 byte cache line?

Encountered this problem and the solution said
"32 bit address bits, 64 byte line means we have 6 bits for the word address in the line that aren't in the tag, 32,768 bytes in the cache at 64 byte lines is 512 total lines, which means we have 12 bits of address for the cache index, write back means we need a dirty bit, and we always need a valid bit. So each line has 64*8=512 data bits, 32-6- 12=14 tag bits, and 2 flag bits: data/total bits = 512/(512+14+2)=512/528."
When I tried to solve the problem I got 32kB/64byte=512 lines in total, i.e. 2^9=512. In addition, a 64 byte cache line size, 1 word=4 bytes, is 64/4=16 words per line i.e. 2^4.
To my understanding the total amount of bits in a cache is given by total amount of entries/lines in the caches*(tag address + data)-> 2^9*((32-9-4+2)+16*32). Thus, the amount of data bits per cache line is 512 (16 words *32 bits per word), and the tag is 32-9-4+2=21 (the 9 is the cache index for direct mapped cache, the 4 is to address each word and the 2 is the valid bit and dirty bit)
Effectively, the answer should be 512/533 and not 512/528.
Correct?
512 lines = 9 bits not 12 as they claim, so you are right on this point.
However, they are right that 64 byte lines gives 6 bits for the block offset — though it is a byte offset, not word as they say.
So, 32-6-9=17 tag bits, then plus the 2 for dirty & valid.
FYI, there's nothing in the above problem that indicates a conversion from bytes to words. While it is true that there will be 16 x 32-bit words per line (i.e. 64 bytes per line) it is irrelevant: we should presume that the 32-bit address is a byte address unless otherwise stated. (It would be unusual to state cache size in bytes for a word (not byte) addressable machine; it would also be unusual for a 32-bit machine to be word addressable — some teaching architectures like LC-3 are word addressable, however, they are 16-bits; other word addressable machines have odd sizes like 12 or 18 or 36 bit words — though those pre-date caches!)

Why do bytes exist? Why don't we just use bits?

A byte consists of 8 bits on most systems.
A byte typically represents the smallest data type a programmer may use. Depending on language, the data types might be called char or byte.
There are some types of data (booleans, small integers, etc) that could be stored in fewer bits than a byte. Yet using less than a byte is not supported by any programming language I know of (natively).
Why does this minimum of using 8 bits to store data exist? Why do we even need bytes? Why don't computers just use increments of bits (1 or more bits) rather than increments of bytes (multiples of 8 bits)?
Just in case anyone asks: I'm not worried about it. I do not have any specific needs. I'm just curious.
because at the hardware level memory is naturally organized into addressable chunks. Small chunks means that you can have fine grained things like 4 bit numbers; large chunks allow for more efficient operation (typically a CPU moves things around in 'chunks' or multiple thereof). IN particular larger addressable chunks make for bigger address spaces. If I have chunks that are 1 bit then an address range of 1 - 500 only covers 500 bits whereas 500 8 bit chunks cover 4000 bits.
Note - it was not always 8 bits. I worked on a machine that thought in 6 bits. (good old octal)
Paper tape (~1950's) was 5 or 6 holes (bits) wide, maybe other widths.
Punched cards (the newer kind) were 12 rows of 80 columns.
1960s:
B-5000 - 48-bit "words" with 6-bit characters
CDC-6600 -- 60-bit words with 6-bit characters
IBM 7090 -- 36-bit words with 6-bit characters
There were 12-bit machines; etc.
1970-1980s, "micros" enter the picture:
Intel 4004 - 4-bit chunks
8008, 8086, Z80, 6502, etc - 8 bit chunks
68000 - 16-bit words, but still 8-bit bytes
486 - 32-bit words, but still 8-bit bytes
today - 64-bit words, but still 8-bit bytes
future - 128, etc, but still 8-bit bytes
Get the picture? Americans figured that characters could be stored in only 6 bits.
Then we discovered that there was more in the world than just English.
So we floundered around with 7-bit ascii and 8-bit EBCDIC.
Eventually, we decided that 8 bits was good enough for all the characters we would ever need. ("We" were not Chinese.)
The IBM-360 came out as the dominant machine in the '60s-70's; it was based on an 8-bit byte. (It sort of had 32-bit words, but that became less important than the all-mighty byte.
It seemed such a waste to use 8 bits when all you really needed 7 bits to store all the characters you ever needed.
IBM, in the mid-20th century "owned" the computer market with 70% of the hardware and software sales. With the 360 being their main machine, 8-bit bytes was the thing for all the competitors to copy.
Eventually, we realized that other languages existed and came up with Unicode/utf8 and its variants. But that's another story.
Good way for me to write something late on night!
Your points are perfectly valid, however, history will always be that insane intruder how would have ruined your plans long before you were born.
For the purposes of explanation, let's imagine a ficticious machine with an architecture of the name of Bitel(TM) Inside or something of the like. The Bitel specifications mandate that the Central Processing Unit (CPU, i.e, microprocessor) shall access memory in one-bit units. Now, let's say a given instance of a Bitel-operated machine has a memory unit holding 32 billion bits (our ficticious equivalent of a 4GB RAM unit).
Now, let's see why Bitel, Inc. got into bankruptcy:
The binary code of any given program would be gigantic (the compiler would have to manipulate every single bit!)
32-bit addresses would be (even more) limited to hold just 512MB of memory. 64-bit systems would be safe (for now...)
Memory accesses would be literally a deadlock. When the CPU has got all of those 48 bits it needs to process a single ADD instruction, the floppy would have already spinned for too long, and you know what happens next...
Who the **** really needs to optimize a single bit? (See previous bankruptcy justification).
If you need to handle single bits, learn to use bitwise operators!
Programmers would go crazy as both coffee and RAM get too expensive. At the moment, this is a perfect synonym of apocalypse.
The C standard is holy and sacred, and it mandates that the minimum addressable unit (i.e, char) shall be at least 8 bits wide.
8 is a perfect power of 2. (1 is another one, but meh...)
In my opinion, it's an issue of addressing. To access individual bits of data, you would need eight times as many addresses (adding 3 bits to each address) compared to using accessing individual bytes. The byte is generally going to be the smallest practical unit to hold a number in a program (with only 256 possible values).
Some CPUs use words to address memory instead of bytes. That's their natural data type, so 16 or 32 bits. If Intel CPUs did that it would be 64 bits.
8 bit bytes are traditional because the first popular home computers used 8 bits. 256 values are enough to do a lot of useful things, while 16 (4 bits) are not quite enough.
And, once a thing goes on for long enough it becomes terribly hard to change. This is also why your hard drive or SSD likely still pretends to use 512 byte blocks. Even though the disk hardware does not use a 512 byte block and the OS doesn't either. (Advanced Format drives have a software switch to disable 512 byte emulation but generally only servers with RAID controllers turn it off.)
Also, Intel/AMD CPUs have so much extra silicon doing so much extra decoding work that the slight difference in 8 bit vs 64 bit addressing does not add any noticeable overhead. The CPU's memory controller is certainly not using 8 bits. It pulls data into cache in long streams and the minimum size is the cache line, often 64 bytes aka 512 bits. Often RAM hardware is slow to start but fast to stream so the CPU reads kilobytes into L3 cache, much like how hard drives read an entire track into their caches because the drive head is already there so why not?
First of all, C and C++ do have native support for bit-fields.
#include <iostream>
struct S {
// will usually occupy 2 bytes:
// 3 bits: value of b1
// 2 bits: unused
// 6 bits: value of b2
// 2 bits: value of b3
// 3 bits: unused
unsigned char b1 : 3, : 2, b2 : 6, b3 : 2;
};
int main()
{
std::cout << sizeof(S) << '\n'; // usually prints 2
}
Probably an answer lies in performance and memory alignment, and the fact that (I reckon partly because byte is called char in C) byte is the smallest part of machine word that can hold a 7-bit ASCII. Text operations are common, so special type for plain text have its gain for programming language.
Why bytes?
What is so special about 8 bits that it deserves its own name?
Computers do process all data as bits, but they prefer to process bits in byte-sized groupings. Or to put it another way: a byte is how much a computer likes to "bite" at once.
The byte is also the smallest addressable unit of memory in most modern computers. A computer with byte-addressable memory can not store an individual piece of data that is smaller than a byte.
What's in a byte?
A byte represents different types of information depending on the context. It might represent a number, a letter, or a program instruction. It might even represent part of an audio recording or a pixel in an image.
Source

How to determine the number of redundant address bits?

A memory module has a data bus that is 128 bits wide. If module holds 4GB (2^31 bytes), how many address bits are redundant?
I believe there is some sort of formula (if not a formula a logical procedure) which we can use to find out the address bus then from there we can find out the redundant number of address bits. I don't have a basic idea of how these things are related: address bus, bus width, data bus.etc.
Each line on one of those parallel buses represents one bit of information. This is why the number of lines that comprise the bus is also referred to its width in bits.
The address bus is used to send the address to be read from or written to to the memory module. Because each line can 'transport' only one bit multiple lines are used in parallel.
For example, to be able to address 256 different locations in the memory, (at least) 8 lines are required for the address bus (because 2^8 = 256). Those 4 billion memory locations will therefor need 32 bus lines on the address bus, the address bus is 32 bits wide.
Note that I used the word "memory location" above, because the address sent to the memory module may refer to bytes or to some other unit of storage, like "words" (2 bytes) or "double-words" (4 bytes) or something else.
How big the minumum unit of storage that can be directly addressed is depends on the memory module and its internal organisation.
A memory module with a bus width of 128 bits for the data bus can send or receive 128 bits = 16 bytes at the same time. For this kind of memory the smallest addressable unit may be 128 bits, so that it could be accessed only in blocks of 16 bytes, which are usually aligned on multiples of this block size.
In that case, the first block of 16 bytes would be addressed by address 0 and would occupy the first 16 bytes of memory. Then at address 1 would be the next block, starting at byte #16. Address 2 gives 16 bytes from byte #32, and so on.
So, if each address on the address bus is used to address 16 bytes at the same time, less addresses will be needed to access the whole memory as compared to byte-wise addressing.
To be able to address each byte of those 4GB individually, the address bus needs to be 32 bits wide (2^32 bytes= 4GB). If, however, only whole blocks of 16 bytes can be addresses individually, one will only need (2^32)/16 different addressed to address the whole memory. 16 = 2^4, so (2^32)/(2^4) = 2^28. -> 28 bits would be needed to address each whole block of 16 bytes (=128 bits) and the width of the address bus could potentially be reduced to 28 lines.

I don't understand something in memory addressing

I have a very simple (n00b) question.
A 20-bit external address bus gave a 1 MB physical address space (2^20
= 1,048,576).(Wikipedia)
Why 1 MByte?
2^20 = 1,048,576 bit = 1Mbit = 128KByte not 1MB
I misunderstood something.
When you have 20 bits you can address up to 2^20. This is your range, not the number of bits.
I.e. if you have 8 bits your range is up to 255 (unsigned) not 2^8 bits.
So with 20 bits you can address up to 2^20 bytes i.e. 1MB
I.e. with 20 bits you can represent addresses from 0 up to 2^20 = 1,048,576. I.e. you can reference up to 1MB of memory.
1 << 20 addresses, that is 1,048,576 bytes addressable. Hence, 1 MB physical address space.
Because the smallest addressable unit of memory (in general - some architectures have small bit-addressable pieces of memory) is the byte, not the bit. That is, each address refers to a byte, rather than to a bit.
Why, you ask? Direct access to individual bits is almost never needed - and if you need it, you can still load the surrounding byte and get the bit with bit masks and shifts. Increasing the bits per address allows you to address more memory with the same address range.
Note that a byte doesn't have to be 8 bit, strictly speaking, though it's ubiquitous by now. But regardless of the byte size, you're grouping bits together to be able to handle larger quantities of them.

Understanding word alignment

I understand what it means to access memory such that it is aligned but I don’t understand why this is necessary. For instance, why can I access a single byte from an address 0x…1 but I cannot access a half word (two bytes) from the same address.
Again, I understand that if you have an address A and an object of size s that the access is aligned if A mod s = 0. But I just don’t understand why this is important at the hardware level.
Hardware is complex; this is a simplified explanation.
A typical modern computer might have a 32-bit data bus. This means that any fetch that the CPU needs to do will fetch all 32 bits of a particular memory address. Since the data bus can't fetch anything smaller than 32 bits, the lowest two address bits aren't even used on the address bus, so it's as if RAM is organised into a sequence of 32-bit words instead of 8-bit bytes.
When the CPU does a fetch for a single byte, the read cycle on the bus will fetch 32 bits and then the CPU will discard 24 of those bits, loading the remaining 8 bits into whatever register. If the CPU wants to fetch a 32 bit value that is not aligned on a 32-bit boundary, it has several general choices:
execute two separate read cycles on the bus to load the appropriate parts of the data word and reassemble them
read the 32-bit word at the address determined by throwing away the low two bits of the address
read some unexpected combination of bytes assembled into a 32-bit word, probably not the one you wanted
throw an exception
Various CPUs I have worked with have taken all four of those paths. In general, for maximum compatibility it is safest to align all n-bit reads to an n-bit boundary. However, you can certainly take shortcuts if you are sure that your software will run on some particular CPU family with known unaligned read behaviour. And even if unaligned reads are possible (such as on x86 family CPUs), they will be slower.
The computer always reads in some fixed size chunks which are aligned.
So, if you don't align your data in memory, you will have to probably read more than once.
Example
word size is 8 bytes
your structure is also 8 bytes
if you align it, you'll have to read one chunk
if you don't align it, you'll have to read two chunks
So, it's basically to speed up.
The reason for all alignment rules are the various widths of the Cache Lines (Instruction-Cache do have 16 Byte lines for the Core2 Architecture, and the Data-Cache do have 64-Byte Lines for L1 and 128-Byte Lines for L2).
So if you want to store/load data that crosses a Cahce-Line Boundary you need to load and store both Cache-lines, which hits the performance.
So you just don't do it because of the performance hit, its that simple.
Try reading a serial port. The data is 8 bits wide.
Nice hardware designers ensure it lies on a least significant byte of the word.
If you have a C structure that has elements not word aligned ( from backwards compatibility or conservation of memory say )
then the address of any byte within the structure is not word aligned.

Resources