How many bits are needed to address this much memory?

I'm taking a programming fundamentals course and currently I'm on the chapter where it talks about computer organization and operations on bits - how the CPU (ALU, CU, registers, etc.) works.
I have a fairly good understanding of binary. I understand sign/magnitude format, 1's complement, 2's complement, etc.
In the book I've learned that a nibble = 4 bits and 8 bits = 1 byte; next is a word, which usually comes in sizes of 8, 16, 32 or 64 bits (and so on), and all this makes perfect sense to me. Here's my homework question, which is kind of confusing to me:
"A computer has 64 MB of memory. Each word is 4 bytes. How many bits are needed to address each single word in memory?"
Well, I'm confused now. The book just told me that a word is typically a multiple of 8 bits.
However, I know that 1 byte = 8 bits, so since there are 4 bytes and 1 byte = 8 bits, would it be correct to think that 4 bytes x 8 bits = 32 bits? Is this the answer?

A 1-bit address can address two words (0, 1).
A 2-bit address can address four words (00, 01, 10, 11).
A 3-bit address can address eight words (000, 001, 010, 011, 100, 101, 110, 111).
So first answer: How many words do you have? Then answer: How many bits does your address need in order to address them?

64 MB = 67108864 bytes, and 67108864 bytes / 4 bytes per word = 16777216 words in memory. Each single word can thus be addressed in 24 bits (the first word has address 000000000000000000000000 and the last has address 111111111111111111111111). Also, 2 raised to 24 = 16777216, so 24 bits are needed to address each word in memory.
The requirement is to represent each memory word with an address, which is in bits, in such a way that each and every word can be represented.
For example, to represent 4 words, you need 4 addresses, 2 raised to 2 is 4, so you need two bits. 00 is the address of the first word, 01 is the address of the second word, 10 is the address of the third word, and 11 is the address of the 4th word.
For 8 words, you need 8 addresses, and 2 raised to 3 is 8, so 3 bits are needed. 000, 001, 010, 011, 100, 101, 110, 111 are the 8 addresses.
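The counting argument above can be sketched in a few lines of Python. This is only an illustration: the helper name `list_addresses` is made up for this example, and it relies on the built-in `int.bit_length()` to find how many bits the largest address (number of words - 1) needs.

```python
def list_addresses(num_words):
    """Return the binary address of every word, padded to the minimum width."""
    # Width = bits needed to write the largest address, num_words - 1.
    width = max(1, (num_words - 1).bit_length())
    return [format(addr, f"0{width}b") for addr in range(num_words)]

print(list_addresses(4))          # ['00', '01', '10', '11']
print(len(list_addresses(8)[0]))  # 3

# The homework case: 64 MB of memory split into 4-byte words.
words = 64 * 1024 * 1024 // 4
print(words, (words - 1).bit_length())  # 16777216 24
```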

1 byte = 8 bits, so since there are 4 bytes and 1 byte = 8 bits, would it be correct to think that 4 bytes x 8 bits = 32 bits is the answer?
No, that's not the answer. If your computer has 64 MB of memory and each word is 4 bytes, how many words are there in your memory? How many bits would you need to address each word (that is, the bits needed to represent a number from 0 to the number of words - 1)?

The formula is:
log(Memory Size / Addressable Unit Size) / log 2
Example 1:
How many address bits are required to address 16 GBytes of memory, where each addressable unit is 1 byte wide?
Ans: log(16*1024*1024*1024/1)/log 2 = 34 bits
Example 2:
How many address bits are required to address 16 GBytes of memory, where each addressable unit is 2 bytes wide?
Ans: log(16*1024*1024*1024/2)/log 2 = 33 bits
Example 3:
How many address bits are required to address 64 MBytes of memory, where each addressable unit is 4 bytes wide?
Ans: log(64*1024*1024/4)/log 2 = 24 bits
Example 4:
How many address bits are required to address 16 MBytes of memory, where each addressable unit is 1 byte wide?
Ans: log(16*1024*1024/1)/log 2 = 24 bits
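As a quick check, the formula can be written as a small Python function. This is a sketch, not part of the original answer: the name `address_bits` is hypothetical, and the `math.ceil` is an added assumption so that a unit count that is not a power of two rounds up to a whole number of bits.

```python
import math

def address_bits(memory_bytes, unit_bytes=1):
    """Bits needed to give every addressable unit its own address."""
    units = memory_bytes // unit_bytes
    # log2(units) equals log(units) / log(2); round up to whole bits.
    return math.ceil(math.log2(units))

GB = 1024 * 1024 * 1024
MB = 1024 * 1024

print(address_bits(16 * GB, 1))  # 34
print(address_bits(16 * GB, 2))  # 33
print(address_bits(64 * MB, 4))  # 24
print(address_bits(16 * MB, 1))  # 24
```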

Related

What percentage of the bits are used for data in a 32kB (32,768 byte) direct-mapped write-back cache with a 64 byte cache line?

Encountered this problem and the solution said
"32 bit address bits, 64 byte line means we have 6 bits for the word address in the line that aren't in the tag, 32,768 bytes in the cache at 64 byte lines is 512 total lines, which means we have 12 bits of address for the cache index, write back means we need a dirty bit, and we always need a valid bit. So each line has 64*8=512 data bits, 32-6- 12=14 tag bits, and 2 flag bits: data/total bits = 512/(512+14+2)=512/528."
When I tried to solve the problem I got 32kB/64byte=512 lines in total, i.e. 2^9=512. In addition, a 64 byte cache line size, 1 word=4 bytes, is 64/4=16 words per line i.e. 2^4.
To my understanding the total amount of bits in a cache is given by total amount of entries/lines in the caches*(tag address + data)-> 2^9*((32-9-4+2)+16*32). Thus, the amount of data bits per cache line is 512 (16 words *32 bits per word), and the tag is 32-9-4+2=21 (the 9 is the cache index for direct mapped cache, the 4 is to address each word and the 2 is the valid bit and dirty bit)
Effectively, the answer should be 512/533 and not 512/528.
Correct?
512 lines = 9 bits, not 12 as they claim, so you are right on this point.
However, they are right that 64-byte lines give 6 bits for the block offset, though it is a byte offset, not a word offset as they say.
So, 32-6-9=17 tag bits, plus the 2 for dirty & valid, which gives data/total bits = 512/(512+17+2) = 512/531.
FYI, there's nothing in the above problem that indicates a conversion from bytes to words. While it is true that there will be 16 x 32-bit words per line (i.e. 64 bytes per line), it is irrelevant: we should presume that the 32-bit address is a byte address unless otherwise stated. (It would be unusual to state cache size in bytes for a word (not byte) addressable machine; it would also be unusual for a 32-bit machine to be word addressable. Some teaching architectures like LC-3 are word addressable, however, they are 16-bit; other word addressable machines have odd sizes like 12, 18 or 36 bit words, though those pre-date caches!)
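The bit accounting in this answer can be reproduced with a short Python sketch. The function name and its parameters are illustrative assumptions; it assumes a byte-addressed 32-bit address, power-of-two sizes, and exactly two flag bits (valid and dirty), as the answer does.

```python
from fractions import Fraction

def direct_mapped_line_bits(cache_bytes, line_bytes, address_bits=32, flag_bits=2):
    """Split the bits stored per line of a direct-mapped cache."""
    lines = cache_bytes // line_bytes
    index_bits = (lines - 1).bit_length()        # selects the line
    offset_bits = (line_bytes - 1).bit_length()  # byte offset within the line
    tag_bits = address_bits - index_bits - offset_bits
    data_bits = line_bytes * 8
    total_bits = data_bits + tag_bits + flag_bits
    return index_bits, offset_bits, tag_bits, data_bits, total_bits

index_bits, offset_bits, tag_bits, data_bits, total_bits = \
    direct_mapped_line_bits(32 * 1024, 64)
print(index_bits, offset_bits, tag_bits)  # 9 6 17
print(Fraction(data_bits, total_bits))    # 512/531
```

That fraction is roughly 96.4% of the stored bits being data.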

Purpose of setting the least significant bits to 0 in MMIX memory operations?

In the MMIX documentation (mmix-doc, page 3, paragraph 4):
We use the notation M_{2^t}[k] to stand for a number consisting of
2^t consecutive bytes starting at location k ∧ (2^64 − 2^t). (The notation
k ∧ (2^64 − 2^t) means that the least significant t bits of k are set to
0, and only the least 64 bits of the resulting address are retained.
...
The notation M_{2^t}[k] is just a formal symbolism to express an address divisible by 2^t.
This is confirmed just after the definition:
All accesses to 2^t-byte quantities by MMIX are aligned, in the
sense that the first byte is a multiple of 2^t.
Most architectures, especially RISC ones, require memory accesses to be aligned; this means that the address must be a multiple of the size accessed.
So, for example, reading a 64-bit word (an octa in MMIX notation) from memory requires the address to be divisible by 8, because MMIX memory is byte addressable(1) and there are 8 bytes in an octa.
If all the possible data sizes are powers of two, we see a pattern emerge:
Multiples of 2   Multiples of 4   Multiples of 8
0000             0000             0000
0010             0100             1000
0100             1000
0110             1100
1000
1010
1100
1110
Multiples of 2 = 2^1 have the least bit always set to zero(2), multiples of 4 = 2^2 have the two least bits set to zero, multiples of 8 = 2^3 have the three least bits set to zero, and so on.
In general, multiples of 2^t have the least t bits set to zero.
You can formally prove this by induction over t.
A way to align a 64-bit number (the size of the MMIX address space) is to clear its lower t bits. This can be done by performing an AND operation with a mask of the form
11111...1000...0
\       /\     /
 64 - t     t
Such a mask can be expressed as 2^64 - 2^t.
2^64 is a big number for an example, so let's pretend the address space is only 2^5.
Let's say we have the address 17h, or 10111b in binary, and we want to align it to octas.
Octas are 8 bytes, 2^3, so we need to clear the lower 3 bits and preserve the other 2 bits.
The mask to use is 11000b, or 18h in hexadecimal. This number is 2^5 - 2^3 = 32 - 8 = 24 = 18h.
If we perform the boolean AND between 17h and 18h we get 10h, which is the aligned address.
This explains the notation k ∧ (2^64 − 2^t) used shortly after; the "wedge" symbol ∧ is a logical AND.
So this notation just "pictures" the steps necessary to align the address k.
Note that the notation k ∨ (2^t − 1) is also introduced. This is the complement: ∨ is the OR, and the whole effect is to have the lower t bits set to 1.
This is the greatest address occupied by an aligned access of size 2^t.
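Both notations map directly onto bitwise operators. The following Python sketch is an illustration only: the function names `align_down` and `last_byte` and the `address_bits` parameter are assumptions made for this example, so that the 5-bit toy address space from the text can be reproduced.

```python
def align_down(k, t, address_bits=64):
    """k AND (2^address_bits - 2^t): clear the low t bits of k."""
    mask = (1 << address_bits) - (1 << t)
    return k & mask

def last_byte(k, t, address_bits=64):
    """k OR (2^t - 1): set the low t bits, keeping only address_bits bits."""
    return (k | ((1 << t) - 1)) & ((1 << address_bits) - 1)

# The toy example: 5-bit address space, align 17h to an octa (t = 3).
print(hex((1 << 5) - (1 << 3)))    # 0x18 (the mask)
print(hex(align_down(0x17, 3, 5))) # 0x10
print(hex(last_byte(0x10, 3, 5)))  # 0x17

# Every multiple of 2^t has its low t bits equal to zero.
print(all((m * 8) & 0b111 == 0 for m in range(100)))  # True
```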
The notation itself is used to explain the endianness.
If you wonder why aligned accesses are important, it has to do with the hardware implementation.
Long story short, the CPU interface to the memory has a predefined size, say 64 bits, despite the memory being byte addressable.
So the CPU accesses the memory in blocks of 64 bits, each one starting at an address that is a multiple of 64 bits (i.e. aligned on 8 bytes).
Accessing an unaligned location may require the CPU to perform two accesses. To keep the picture small, assume for this example a 4-byte interface and a CPU reading a tetra (4 bytes) at address 2, so we need the bytes at 2, 3, 4 and 5:
Address 0 1 2 3 4 5 6 7 8 9 A B ...
        \     / \     /
           A       B
The CPU reads the 4-byte block at 0 (access A) and the 4-byte block at 4 (access B), then combines the two reads.
RISC machines tend to avoid this complexity and forbid unaligned accesses entirely.
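To make the cost concrete, here is a small Python sketch that lists which aligned blocks an access touches. The helper `aligned_accesses` and its `bus_bytes` parameter are hypothetical names for this illustration; `bus_bytes` must be a power of two. With 4-byte blocks it reproduces the two accesses (at 0 and at 4) pictured above for bytes 2 through 5.

```python
def aligned_accesses(address, size, bus_bytes=8):
    """Start addresses of the aligned blocks covering [address, address + size)."""
    first = address & ~(bus_bytes - 1)               # block holding the first byte
    last = (address + size - 1) & ~(bus_bytes - 1)   # block holding the last byte
    return list(range(first, last + bus_bytes, bus_bytes))

print(aligned_accesses(2, 4, bus_bytes=4))  # [0, 4]  two reads (A and B)
print(aligned_accesses(4, 4, bus_bytes=4))  # [4]     aligned: one read
print(aligned_accesses(2, 8, bus_bytes=8))  # [0, 8]  unaligned octa: two reads
print(aligned_accesses(8, 8, bus_bytes=8))  # [8]     aligned octa: one read
```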
(1) Quoting: "If k is any unsigned octabyte, M[k] is a 1-byte quantity".
(2) 2^0 = 1 is the only odd power of two, so by removing it we only get even numbers.

Addressing Size Regarding Bytes

Just to make sure, does every single address contain one byte? So say you had theoretical addresses FFF0 and FFFF: there are 16 values between these two addresses, which means between them they contain 16 bytes, or 8 x 16 bits? Every individual address is linked to a single byte?
Just to make sure, does every single address contain one byte?
...which means between them they contain 16 bytes, or 8 x 16 bits?
Every individual address is linked to a single byte?
Yes to all three questions.
This is the reason for the limitation with 32-bit addressing: you can only access 2^32 bytes == 4,294,967,296 bytes == 4 GiB*. Each addressable memory location gives access to 1 byte.
If we could access 2 bytes with one address, then that limit would have been 8 GiB. But the architecture of modern chips and all software would have to be modified to determine whether they want both bytes or just the first or the second. So you'd need, say, 1 more bit to determine that. Guess what: if you had 33-bit machines, that's what we'd get, a maximum addressable space of 8 GiB, which is still effectively 1-byte-containing addresses. Workarounds do exist, but that's not related to your questions.
* GiB = gibibyte (binary gigabyte, 2^30 bytes).
Note that this is not related to "types", where a char is 1 byte and an int is typically 4 bytes. Programming languages compensate for that when accessing the value of a variable stored at some location(s). Type sizes are often quoted in total bits rather than total bytes, so an int is considered as 32 bits rather than 4 bytes. When C fetches such an int's value from memory, it will fetch all 4 bytes even though the address of the int refers to just one, the address of the first byte.
Yes. Addresses map to bytes 1 to 1, even if they expect you to work with a word size of two or four bytes at a time.
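A byte-addressable memory can be modelled in Python as a `bytearray`, where each index plays the role of one address. This is only a sketch under stated assumptions: a 4-byte int and little-endian byte order (the `"<i"` format of the standard `struct` module), neither of which is guaranteed on every machine.

```python
import struct

# Addresses FFF0 through FFFF inclusive name 16 separate bytes.
print(0xFFFF - 0xFFF0 + 1)  # 16

memory = bytearray(16)      # 16 addresses, one byte each

# Store a 4-byte int "at address 4": it really occupies addresses 4..7.
struct.pack_into("<i", memory, 4, 0x11223344)
print(memory[4:8].hex())    # 44332211

# Reading it back starts from the address of the first byte only.
value, = struct.unpack_from("<i", memory, 4)
print(hex(value))           # 0x11223344
```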

Word size in bits to bytes conversion confusion

I have a pretty elementary question which is somewhat confusing me. It will be great to get some refresher on this.
Every computer has a word size. The word size is the maximum size of the virtual address space. So if we have, let's say, a 32-bit word size, we have a virtual address space that ranges to a max of 2^32 values. In references it says 2^32 bytes. Why is the range in bytes?
Also, what I am failing to understand is how 2^32 possible values can be an address range of 4GB. My confusion stems from turning the 32-bit word size into a 4-byte word size, and then how 4 bytes, multiplied 2^32 times, result in 4GB.
One way I tried to rationalize it is as follows:
2^32 bits = 2^2(bytes) x 2^10(kilobytes) x 2^10(megabytes) x 2^10(gigabytes)
So successive division of 2^32 by 2^10 results in 2^2 GB or 4 GB.
Can somebody point out how the 32-bit word size go to a 4GB page range?
Thanks
The argument in my head goes like this: We have 32 bits available to us, each bit can be at most 1. So the largest number we can accommodate is when all 32 bits (the 0 bit to the 31 bit that is) are filled with 1s. So the trick is to find the largest number in decimal form, by converting from binary to decimal we get:
11111111 11111111 11111111 11111111 (binary) = 4294967295 (decimal)
But what is 4294967295? It's actually one less than 2^32. Now there's another important thing to keep in mind:
4GB = 4294967296 bytes
But why is it 1 greater than our result? Because our first byte is byte 0 while the last is byte 4294967295 for a total of 4294967296 bytes.
So now we're in a position where the smallest number that can exist in a 32-bit register is 0 and the largest number that can exist in a 32-bit register is 4294967295.
0 (binary) - 11111111 11111111 11111111 11111111 (binary)
0 (decimal) - 4294967295 (decimal)
0 (hex) - 0xFFFFFFFF (hex)
So there is 4GB of addressable space because anything above 4GB will have an address that is too big of a number to fit inside a 32-bit number and thus inside a 32-bit register.
I did all this stuff inside excel and seeing it helped me a lot.
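The same numbers can be checked in Python instead of a spreadsheet. This is just a sketch of the arithmetic above; `BITS`, `largest` and `count` are names chosen for this example.

```python
BITS = 32

largest = (1 << BITS) - 1        # all 32 bits set to 1
print(largest)                   # 4294967295
print(hex(largest))              # 0xffffffff
print(bin(largest).count("1"))   # 32

# Addresses run from 0 to largest, so there are largest + 1 of them.
count = largest + 1
print(count)                     # 4294967296
print(count // 2**30, "GB")      # 4 GB, one byte per address
```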

I don't understand something in memory addressing

I have a very simple (n00b) question.
A 20-bit external address bus gave a 1 MB physical address space (2^20 = 1,048,576). (Wikipedia)
Why 1 MByte?
2^20 = 1,048,576 bit = 1Mbit = 128KByte not 1MB
I misunderstood something.
When you have 20 bits you can address up to 2^20. This is your range, not the number of bits.
I.e. if you have 8 bits your range is up to 255 (unsigned) not 2^8 bits.
So with 20 bits you can address up to 2^20 bytes i.e. 1MB
I.e. with 20 bits you can represent addresses from 0 up to 2^20 - 1 = 1,048,575, which is 2^20 = 1,048,576 distinct addresses. So you can reference up to 1MB of memory.
1 << 20 addresses, that is 1,048,576 bytes addressable. Hence, 1 MB physical address space.
Because the smallest addressable unit of memory (in general - some architectures have small bit-addressable pieces of memory) is the byte, not the bit. That is, each address refers to a byte, rather than to a bit.
Why, you ask? Direct access to individual bits is almost never needed, and if you need it, you can still load the surrounding byte and get the bit with bit masks and shifts. Making each address refer to more bits allows you to address more memory with the same address range.
Note that a byte doesn't have to be 8 bits, strictly speaking, though that size is ubiquitous by now. But regardless of the byte size, you're grouping bits together to be able to handle larger quantities of them.
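The difference between the two readings of "2^20" can be sketched in Python, along with the mask-and-shift trick for reaching a single bit. The helper `get_bit` is a made-up name for this illustration, and 8-bit bytes are assumed.

```python
ADDRESS_BITS = 20
addresses = 1 << ADDRESS_BITS
print(addresses)               # 1048576 distinct addresses

# One byte per address: 1,048,576 bytes = 1024 KB = 1 MB.
print(addresses // 1024)       # 1024

# If each address named a single bit instead: 1,048,576 bits = 128 KB.
print(addresses // 8 // 1024)  # 128

def get_bit(byte, position):
    """Read one bit out of a loaded byte with a shift and a mask."""
    return (byte >> position) & 1

print(get_bit(0b00100000, 5))  # 1
print(get_bit(0b00100000, 4))  # 0
```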

Resources