Address sizes in Intel i5 - memory-management

My cpuinfo file says that my processor has address sizes as 39 bits physical, and 48 bits virtual. This has got me into some confusion.
Mine is a 64 bit machine. From what I understand, this is the word size of my machine. That is, it will fetch data from memory in chunks of 8 bytes. Also, a 64 bit machine means that the CPU can address 2^64 byte addressable locations, which is a lot. So, manufacturers cut-down some of these lines.
Here are the questions:
If the CPU only generates logical addresses, then what is the need for having 39 bits physical address size.
When we say that the CPU can access 2^64 bytes, do we mean Physical memory or the virtual memory?
I read somewhere that a 64 bit machine has size of its registers as 64 bits, and a 32 bit machine has 32 bit registers. Is this the case?
I think I have confused myself terribly, and need some help. Any other information on this would be appreciated. Thanks!

I can see why people are puzzled considering the number of academic questions posed on this board that suggest there is some mathematical relationship among address sizes.
The processor word size, physical address size, logical address size, and bus size are all independent to some degrees.
If the CPU only generates logical addresses, then what is the need for having 39 bits physical address size.
The CPU translates logical addresses to physical addresses.
When we say that the CPU can access 2^64 bytes, do we mean Physical memory or the virtual memory?
I could be either.
I read somewhere that a 64 bit machine has size of its registers as 64 bits, and a 32 bit machine has 32 bit registers. Is this the case?
Generally this is true for general registers but special purpose registers may be a different size (e.g., floating point, control registers)

There have been many occasions when a processor does not use all available bits for the generation of addresses.
In ancient times, the old MC68000 had 32 bit registers but only a 24 bit address bus.
For the i5 consider that a 64 bit address would control a mind boggling memory space of 17,179,869,184 gigabytes. A stupendously huge number even compared to the storage at Google or the NSA or the planet Earth.
The i5 designers, trim this insane number down to a more manageable 512 gigabytes of physical address space and 262,144 gigabytes of virtual address space.

Related

Why 4-level paging can only cover 64 TiB of physical address

There are the words in linux/Documentation/x86/x86_64/5level-paging.rst
Original x86-64 was limited by 4-level paging to 256 TiB of virtual address space and 64 TiB of physical address space.
I know that the limit of virtual address is 256TB because 2^48 = 256TB. But I don't know why its limit of physical is only 64TB.
Suppose we set the size of each page to 4k. Thus a linear address has 12 bits of offset, 9 bits indicate the index in each four level, which means 512 entries per level. A linear address can cover 512^4 pages, 512^4 * 4k = 256TB of space.
This is my understanding of the calculation of space limit. I'm wondering what's wrong with it.
The x86-64 ISA's physical address space limit is unchanged by PML5, remaining at 52-bit. Real CPUs implement some narrower number of physical address bits, saving bits in cache tags and TLB entries, among other places.
The 64 TiB limit is not imposed by x86-64 itself, but by the way Linux requires more virtual address space than physical for its own convenience and efficiency. See x86_64/mm.txt for the actual layout of Linux's virtual address space on x86-64 with PML4 paging, and note the 64 TB "direct mapping of all physical memory (page_offset_base)"
x86-64 Linux doesn't do HIGHMEM / LOWMEM
Linux can't actually use more than phys mem = 1/4 virtual address space, without nasty HIGHMEM / LOWMEM stuff like in the bad old days of 32-bit kernels on machines with more than 1 GiB of RAM (vm/highmem.html). (With a 3:1 user:kernel split of address space, letting user-space have 3GiB, but with the kernel having to map pages in/out of its own space if not accessing them via the current process's user-space addresses.)
Linus's rant about 32-bit PAE expands on why it's nice for an OS to have enough virtual address space to keep everything mapped, with the usual assertion that people who don't agree with him are morons. :P I tend to agree with him on this, that there are obvious efficiency advantages and that PAE is a huge PITA for the kernel. Probably even moreso on an SMP system.
If anyone had proposed a patch to add highmem support for x86-64 to allow using more than 64 TiB of physical memory with the existing PML4 format, I'd expect Linus would tell them 1995 called and wants its bad idea back. He wouldn't consider merging such a patch unless much RAM became common for servers, but hardware vendors still hadn't provided an extension for wider virtual addresses.
Fortunately that didn't happen: probably no CPU has supported wider than 46-bit phys addrs without supporting PML5. Vendors know that supporting more RAM than mainline Linux can use wouldn't be a selling point. But as the doc said, commercial systems were getting up to a max capacity of 64 TiB.
x86-64's page-table format has room for 52-bit physical addresses
The x86-64 page-table format itself has always had that much room: Why in x86-64 the virtual address are 4 bits shorter than physical (48 bits vs. 52 long)? has diagrams from AMD's manuals. Of course early CPUs had narrower physical addresses so you couldn't for example have a PCIe device put its device memory way up high in physical address space.
Your calculation has nothing to do with physical address limits, which is set by the number of bits in each page-table entry that can be used for that.
In x86-64 (and PAE), the page table format reserves bits up to bit #51 for use as physical-address bits, so OSes must zero them for forward compatibility with future CPUs. The low 12 bits are used for other things, but the physical address is formed by zeroing out the bits other than the phys-address bits in the PTE, so those low 12 bits become the low zero bits in an aligned physical-page address.
x86 terminology note: logical addresses are seg:off, and segment_base + offset gives you a linear address. With paging enabled (as required in long mode), linear addresses are virtual, and are what's used as a search key for the page tables (effectively a radix tree cached by the TLB).
Your calculation is just correctly reiterating the 256 TiB size of virtual address space, based on 4-level page tables with 4k pages. That's how much memory can be simultaneously mapped with PML4.
A physical page has to be the same size as a virtual page, and in x86-64 yes that's 4 KiB. (Or 2M largepage or 1G hugepage).
Fun fact: the x86-64 page-table-entry format is the same as PAE, so modern CPUs can also access large amounts of memory 32-bit mode. But of course not map it all at once. It's probably not a coincidence that AMD chose to use an existing well-designed format when making AMD64, so their CPUs would only need two different modes for hardware page-table walker: legacy x86 with 4-byte PTEs (10 bits per level) and PAE/AMD64 with 8-byte PTEs (9 bits per level).

Relation between size of address bus and memory size; memory Segmentation in 8086

My question is related to memory segmentation in 8086. I learnt that,
8086 has a 20 bit address bus. And so it can address 2^20 different addresses. Which means it has an memory size of 2^20, i.e, 1MB.
I have a few doubts:
What I understand from the fact that 8086 has a 20 bit address bus is that it could have 2^20 different combinations of 0s and 1s, each of which represents one physical address. What I don't understand is that how does 2^20 different address locations mean 1 MB of addressable memory? How is total number of different addresses locations related to memory size (in Megabytes)?
Also, correct me if I'm wrong, the 16 bit segment registers in 8086 hold the starting address of the different segments in the memory (Code, Stack, Data, Extra).My question is, aren't the addresses in memory of 20 bits? Then how can the 16 bit register hold 20 bit addresses? If it contains the upper 16 bit of the 20 bit address, how does the processor make out to which exact address location it has to point?
P.S: I am a beginner is micro-processors and total reliant on self study, so kindly excuse if my questions seem a bit silly.
Thanks in advance.
For this question, its important to remember there is a different between the number of possible memory addresses and the amount of actual memory (RAM) installed in the system. For the 8086, memory addresses are 20-bits long as you note, so that means there are 2^20 possible memory addresses (which is exactly 1 MiB in size since 1 MiB is 1024 or 2^10 KiB and 1 KiB is 1024 or 2^10 Bytes). This does NOT mean the system has 1 MiB worth of RAM necessarily, it very likely has less but the most addresses the 8086 could possibly address is 1 MiB; so if nothing but RAM was in the address space, the most RAM it could possibly have is 1 MiB. Frequently, you might have gaps in the address space not filled with anything, some of the address space is used for ROM or other peripherals. So, that size of the address space is 1 MiB but that does not mean there is 1 MiB of RAM/memory in the system.
Correct, the segment registers are all 16-bits for the 8086. A memory address is created by combining the appropriate segment register with the argument (the argument being the result of whatever the addressing mode being used by the instruction) by adding the argument to the segment register's value shifted by 4 bits. So, if for example the ss is 0x1111, sp is at 0x2222 and you preform a push ax instruction, the 20-bit address to which the value is pushed is (ss << 4) + sp or 0x11110 + 0x02222 = 0x13332. More information can be found on Wikipedia under the Real Mode section: https://en.wikipedia.org/wiki/X86_memory_segmentation

A 32 bit register represents 4Gbyte or 4Gbit of memory?

It may be a very stupid question but everywhere I read, it says a 32 bit register can represent maximum of 4GByte of memory but should it represent 4Gbit of memory? As 2^2 . 2^30 gives 4G. For addition of byte there must be another factor of 2^3. Can anyone help me if I am missing something here?
It depends on what is your 32 bits register meaning. If this is byte address, it can handle 4G bytes address space. If it is index of 512 bytes sectors it can handle 2Tera bytes of storage space. And finaly if it is a bit index only 512M bytes (4G bits).
32-bit register can address 2^32 bytes. 2^32 in decimal is 4,294,967,296. Hence number of addressable BYTES is 4+ billion. 4 GB addressable space.
See also:
Math behind 4GB limit on 32 bit systems
how much memory can be accessed by a 32 bit machine?
Can a 32-bit program use more than 4GB of memory on a 64-bit OS?

8 and 16 bit architecture

I'm a bit confused about bit architectures. I just cant find a good article that answers my questions, so I figured I'd ask SO.
Question 1:
When speaking of a 16 bit architecture, does it mean each ram address is 16 bits long? So if I create an int (32 bit) in C++ the variable would take up 2 addresses?
Question 2:
in a 16 bit architecture there are only 2^16 (65536) amount of addresses inside the RAM. Why can't they add more? Is this because 16 bit can't represent a higher value and therefore can't reference to adresses above 65535?
When speaking of a 16 bit architecture, does it mean each ram address is 16 bits long? So if I create an int (32 bit) in C++ the variable would take up 2 addresses?
You'd have to ask whoever was speaking of a 16-bit architecture what they meant by it. They could mean addresses are 16-bits long. They could mean general-purpose CPU registers are 16-bits long. They could mean something else. But there's no way we could know what some hypothetical person might mean. There is no universal definition of what makes something a "16-bit architecture".
For example, the 8032 is an 8-bit architecture with 8-bit general purpose registers. But it has a 16-bit pointer register that can be used to address 65,536 bytes of storage.
Regardless of bitness, almost all systems use byte addresses. So a 32-bit variable will take up 4 addresses on a machine of any bitness.
in a 16 bit architecture there are only 2^16 (65536) amount of addresses inside the RAM. Why can't they add more? Is this because 16 bit can't represent a higher value and therefore can't reference to adresses above 65535?
With 16-bits, there are only 65,536 possible ways those bits can be set. So a 16-bit register has 65,536 possible values.
Yes. Note, though that int on 16-bit architectures is usually just 16 bits wide.
Also note that it doesn't make sense to say that a variable "takes up" two addresses. The correct thing to say is that a 32-bit variable is as wide as two pointers on a 16-bit platform.
It will still occupy four bytes of space, no matter what architecture.
Yes; that's exactly what 16-bit addresses mean.
Note that each of these addresses points to a single byte of memory.
Depends on your definitions of 8-bit and 16-bit architecture.
The 6502 was considered an 8-bit CPU, because it operated on 8-bit values (the register size), yet had 16-bit addresses.
The 68000 was considered a 16-bit CPU, yet had 32-bit registers and addresses.
With x86, it is generally the address size that defines the architecture.
Also, '64-bit' CPUs don't always have a full 64-bit external address bus. They might internally handle addresses of that size, so the virtual address space can be large, but it doesn't mean they can have that much external memory.
Example From Wikipedia - All internal registers, as well as internal and external data buses, were 16 bits wide, firmly establishing the "16-bit microprocessor" identity of the 8086. A 20-bit external address bus gave a 1 MB physical address space (2^20 = 1,048,576). This address space was addressed by means of internal 'segmentation'. The data bus was multiplexed with the address bus in order to fit a standard 40-pin dual in-line package. 16-bit I/O addresses meant 64 KB of separate I/O space (2^16 = 65,536). The maximum linear address space was limited to 64 KB, simply because internal registers were only 16 bits wide. Programming over 64 KB boundaries involved adjusting segment registers (see below) and remained so until the 80386 introduced wider (32 bits) registers (and more advanced memory management hardware).
So you can see that there are no fixed rules that a 16 bit architecture will have 16 address lines only. Don't mix up two things, though it's intuitive to believe so.

Why does the 8086 use an extra register to address 1MB of memory?

I heard that the 8086 has 16-bit registers which allow it to only address 64K of memory. Yet it is still able to address 1MB of memory which would require 20-bit registers. It does this by using another register to to hold another 16 bits, and then adds the value in the 16-bit registers to the value in this other register to be able to generate numbers which can address up to 1MB of memory. Is that right?
Why is it done this way? It seems that there are 32-bit registers, which is more than sufficient to address 1MB of memory.
Actually this has nothing to do with number of registers. Its the size of the register which matters. A 16 bit register can hold up to 2^16 values so it can address 64K bytes of memory.
To address 1M, you need 20 bits (2^20 = 1M), so you need to use another register for the the additional 4 bits.
The segment registers in an 8086 are also sixteen bits wide. However, the segment number is shifted left by four bits before being added to the base address. This gives you the 20 bits.
the 8088 (and by extension, 8086) is instruction compatible with its ancestor, the 8008, including the way it uses its registers and handles memory addressing. the 8008 was a purely 16 bit architecture, which really couldn't address more than 64K of ram. At the time the 8008 was created, that was adequate for most of its intended uses, but by the time the 8088 was being designed, it was clear that more was needed.
Instead of making a new way for addressing more ram, intel chose to keep the 8088 as similar as possible to the 8008, and that included using 16 bit addressing. To allow newer programs to take advantage of more ram, intel devised a scheme using some additional registers that were not present on the 8008 that would be combined with the normal registers. these "segment" registers would not affect programs that were targeted at the 8008; they just wouldn't use those extra registers, and would only 'see' 16 addres bits, the 64k of ram. Applications targeting the newer 8088 could instead 'see' 20 address bits, which gave them access to 1MB of ram
I heard that the 8086 has 16 registers which allow it to only address 64K of memory. Yet it is still able to address 1MB of memory which would require 20 registers.
You're misunderstanding the number of registers and the registers' width. 8086 has eight 16-bit "general purpose" registers (that can be used for addressing) along with four segment registers. 16-bit addressing means that it can only support 216 B = 64 KB of memory. By getting 4 more bits from the segment registers we'll have 20 bits that can be used to address a total of 24*64KB = 1MB of memory
Why is it done this way? It seems that there are 32 registers, which is more than sufficient to address 1MB of memory.
As said, the 8086 doesn't have 32 registers. Even x86-64 nowadays don't have 32 general purpose registers. And the number of registers isn't relevant to how much memory a machine can address. Only the address bus width determines the amount of addressable memory
At the time of 8086, memory is extremely expensive and 640 KB is an enormous amount that people didn't think that would be reached in the near future. Even with a lot of money one may not be able to get that large amount of RAM. So there's no need to use the full 32-bit address
Besides, it's not easy to produce a 32-bit CPU with the contemporary technology. Even 64-bit CPUs today aren't designed to use all 64-bit address lines
Why can't OS use entire 64-bits for addressing? Why only the 48-bits?
Why do x86-64 systems have only a 48 bit virtual address space?
It'll takes more wires, registers, silicons... and much more human effort to design, debug... a CPU with wider address space. With the limited transistor size of the technology in the 70s-80s that may not even come into reality.
8086 doesn't have any 32-bit integer registers; that came years later in 386 which had a much higher transistor budget.
8086's segmentation design made sense for a 16-bit-only CPU that wanted to be able to use 20-bit linear addresses.
Segment registers could have been only 8-bit or something with a larger shift, but apparently there are some advantages to fine-grained segmentation where a segment start address can be any 16-byte aligned linear address. (A linear address is computed from (seg << 4) + off.)

Resources