What is the need for Logical Memory?

While studying OS, I came across the concept of logical memory. So why is there a need for logical memory? How does a CPU generate logical memory? Is the output of the "&ptr" expression a logical or a physical address? Are logical memory and virtual memory the same?

If you're talking about C's and C++'s sizeof, it returns a size, never an address. And the CPU does not generate any memory; it generates addresses.
On x86 CPUs there are several layers in address calculations and translations. x86 programs operate with logical addresses comprised of two items: a segment selector (this one isn't always specified explicitly in instructions and may come from the cs, ds, ss or es segment register) and an offset.
The segment selector is then translated into a segment base address: either directly (the selector is multiplied by 16 in the real address mode and in the virtual-8086 mode of the CPU), or through a special segment descriptor table (global or local, GDT or LDT, in the protected mode of the CPU), where the selector is used as an index into the descriptor table and the base address is pulled from the selected descriptor.
Then the sum segment base address + offset forms a linear address (AKA virtual address).
If the CPU is in the real address mode, that's the final, physical address.
If the CPU is in the protected mode (or virtual 8086), that linear/virtual address can be further translated into the physical address by means of page tables (if page translation is enabled, of course, otherwise, it's the final physical address as well).
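To make the layering concrete, here's a minimal sketch in C of the real-mode part of the calculation (the protected-mode descriptor lookup and the page-table walk are performed by hardware, so only the arithmetic is shown; the function name is mine):

    #include <stdint.h>
    #include <stdio.h>

    /* Real-mode address formation: base = segment * 16, then base + offset.
       With no paging, the resulting linear address is also the physical one. */
    static uint32_t real_mode_phys(uint16_t segment, uint16_t offset) {
        return ((uint32_t)segment << 4) + offset;   /* segment * 16 + offset */
    }

    int main(void) {
        /* Classic example: the VGA text buffer at B800:0000 -> physical 0xB8000 */
        printf("%#x\n", real_mode_phys(0xB800, 0x0000));
        return 0;
    }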
Physical memory is your RAM or ROM (or flash). Virtual memory is physical memory extended by disk storage space (which could also be flash, now that we have SSDs).
You really need to read up on this. You seem to have no idea.

Related

Does memory-mapped I/O work by using RAM addresses?

Imagine a processor capable of addressing an 8-bit range (I know this is ridiculously small in reality) with 128 bytes of RAM. And there is some 8-bit device register mapped to address 100. In order to store a value to it, does the CPU need to store a value at address 100, or does it specifically need to store a value at address 100 within RAM? In pseudo-assembly:
    STI 100, value

vs.

    STI RAM_start+100, value
Usually, the address of a device is specified relative to the start of the address space it lives in.
The datasheet surely has more context and will clarify whether the address is relative to something else.
However, before using it you have to translate that address into the address the CPU actually sees.
For example, if your 8-bit address range accessible with the sti instruction is split in half:
0-127 => RAM
128-255 => IO
Because the hardware is wired this way, then, as seen from the CPU, the IO address range starts at 128, so an IO address of x is accessible at 128 + x.
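On a bare-metal system wired that way, the store could look like this in C (a minimal sketch; IO_BASE and the register offset 100 are the assumed values from the example above, and this only works where the address space is directly accessible, i.e. with no MMU in the way):

    #include <stdint.h>

    #define IO_BASE 128u   /* assumed: the I/O half of the 8-bit address space */

    /* Store a value to the device register at I/O address 100.
       volatile keeps the compiler from omitting or reordering the MMIO store. */
    static void write_device_reg(uint8_t value) {
        volatile uint8_t *reg = (volatile uint8_t *)(uintptr_t)(IO_BASE + 100);
        *reg = value;   /* the CPU-visible address is 128 + 100 = 228 */
    }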
The CPU datasheet usually establishes the convention used to give the addresses of the devices and the memory map of the CPU.
Address spaces can be hierarchical (e.g. as in PCI) or windowed (e.g. the legacy PCI config space on x86), they can have aliases, and they may involve special instructions or overlapping regions (e.g. reads going to ROM, writes going to RAM).
Always refer to the CPU manual/datasheet to understand the CPU memory map and how its address ranges are routed.

Why 4-level paging can only cover 64 TiB of physical address space

These words are from linux/Documentation/x86/x86_64/5level-paging.rst:
Original x86-64 was limited by 4-level paging to 256 TiB of virtual address space and 64 TiB of physical address space.
I know that the virtual address limit is 256 TiB because 2^48 = 256 TiB. But I don't know why the physical limit is only 64 TiB.
Suppose we set the size of each page to 4 KiB. Then a linear address has 12 bits of offset, plus 9 bits of index at each of the four levels, which means 512 entries per level. A linear address can therefore cover 512^4 pages, and 512^4 * 4 KiB = 256 TiB of space.
This is my understanding of the calculation of the space limit. I'm wondering what's wrong with it.
The x86-64 ISA's physical address space limit is unchanged by PML5, remaining at 52 bits. Real CPUs implement some narrower number of physical address bits, saving bits in cache tags and TLB entries, among other places.
The 64 TiB limit is not imposed by x86-64 itself, but by the way Linux requires more virtual address space than physical for its own convenience and efficiency. See x86_64/mm.txt for the actual layout of Linux's virtual address space on x86-64 with PML4 paging, and note the 64 TB "direct mapping of all physical memory (page_offset_base)".
x86-64 Linux doesn't do HIGHMEM / LOWMEM
Linux can't actually use physical memory larger than 1/4 of the virtual address space without nasty HIGHMEM / LOWMEM stuff like in the bad old days of 32-bit kernels on machines with more than 1 GiB of RAM (vm/highmem.html). (Those kernels used a 3:1 user:kernel split of address space, letting user-space have 3 GiB, but forcing the kernel to map pages into and out of its own space when not accessing them via the current process's user-space addresses.)
Linus's rant about 32-bit PAE expands on why it's nice for an OS to have enough virtual address space to keep everything mapped, with the usual assertion that people who don't agree with him are morons. :P I tend to agree with him on this: there are obvious efficiency advantages, and PAE is a huge PITA for the kernel, probably even more so on an SMP system.
If anyone had proposed a patch to add highmem support for x86-64, to allow using more than 64 TiB of physical memory with the existing PML4 format, I'd expect Linus would tell them 1995 called and wants its bad idea back. He wouldn't consider merging such a patch unless that much RAM became common for servers while hardware vendors still hadn't provided an extension for wider virtual addresses.
Fortunately that didn't happen: probably no CPU has supported physical addresses wider than 46 bits without also supporting PML5. Vendors know that supporting more RAM than mainline Linux can use wouldn't be a selling point. But as the doc said, commercial systems were getting up to a max capacity of 64 TiB.
x86-64's page-table format has room for 52-bit physical addresses
The x86-64 page-table format itself has always had that much room: Why in x86-64 the virtual address are 4 bits shorter than physical (48 bits vs. 52 long)? has diagrams from AMD's manuals. Of course, early CPUs had narrower physical addresses, so you couldn't, for example, have a PCIe device put its device memory way up high in physical address space.
Your calculation has nothing to do with the physical address limit, which is set by the number of bits in each page-table entry that can be used for a physical address.
In x86-64 (and PAE), the page table format reserves bits up to bit #51 for use as physical-address bits, so OSes must zero them for forward compatibility with future CPUs. The low 12 bits are used for other things, but the physical address is formed by zeroing out the bits other than the phys-address bits in the PTE, so those low 12 bits become the low zero bits in an aligned physical-page address.
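As an illustration, extracting the physical-page base from a PTE is just a mask (a minimal sketch; the mask follows the bit layout described above, and the constant names are mine, not from any kernel header):

    #include <stdint.h>
    #include <stdio.h>

    /* Bits 51:12 of an x86-64/PAE PTE hold the physical-page base; the low
       12 bits are flags (present, writable, ...), and bit 63 is NX. */
    #define PTE_PHYS_MASK 0x000FFFFFFFFFF000ULL

    int main(void) {
        uint64_t pte  = 0x8000000123456067ULL;  /* hypothetical PTE: NX + flags 0x067 */
        uint64_t phys = pte & PTE_PHYS_MASK;    /* -> 0x123456000 */
        printf("physical page base: %#llx\n", (unsigned long long)phys);
        return 0;
    }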
x86 terminology note: logical addresses are seg:off, and segment_base + offset gives you a linear address. With paging enabled (as required in long mode), linear addresses are virtual, and are what's used as a search key for the page tables (effectively a radix tree cached by the TLB).
Your calculation is just correctly reiterating the 256 TiB size of virtual address space, based on 4-level page tables with 4k pages. That's how much memory can be simultaneously mapped with PML4.
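For reference, a quick sanity-check of that arithmetic in C:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint64_t pages = 512ULL * 512 * 512 * 512;  /* 512^4 mappable 4 KiB pages */
        uint64_t bytes = pages * 4096;              /* total mappable bytes */
        printf("%llu TiB\n", (unsigned long long)(bytes >> 40));  /* prints: 256 */
        return 0;
    }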
A physical page has to be the same size as a virtual page, and in x86-64 yes that's 4 KiB. (Or 2M largepage or 1G hugepage).
Fun fact: the x86-64 page-table-entry format is the same as PAE's, so modern CPUs can also access large amounts of memory in 32-bit mode, though of course not map it all at once. It's probably not a coincidence that AMD chose to reuse an existing well-designed format when creating AMD64, so their CPUs would only need two modes for the hardware page-table walker: legacy x86 with 4-byte PTEs (10 bits per level) and PAE/AMD64 with 8-byte PTEs (9 bits per level).

x86 address space calculation PAE to 36 bits

I'm having a hard time understanding PAE. I know it creates a 3rd level of indirection via the PDPT, so that the address translation goes CR3 -> PDPT (4 entries) -> PD (512 entries) -> PT (512 entries) -> page (4096 bytes). But the address is still 32 bits; how do you get 36-bit addresses from this scheme? I'd appreciate an example. How does adding another table "increase" the address space?
PAE changes nothing about 32-bit virtual addresses, only the size of physical address they're mapped to. (Which sucks a lot, nowhere near enough virtual address space to map all those physical pages at once. Linus Torvalds wrote a nice rant about PAE: https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/ originally posted on https://www.realworldtech.com/forum/?threadid=76912&curpostid=76973 / https://www.realworldtech.com/forum/?threadid=76912&curpostid=76980)
It also widens a PTE (Page Table Entry) from 4 bytes to 8 bytes, which means 2 levels aren't enough anymore; that's where the small extra level comes in, translating the top 2 bits of a virtual address via those 4 PDPT entries.
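Here's how a 32-bit virtual address splits under PAE, as a minimal sketch in C (the field widths are the 2 + 9 + 9 + 12 just described; the names are mine):

    #include <stdint.h>
    #include <stdio.h>

    static void split_pae(uint32_t va) {
        unsigned pdpt = (va >> 30) & 0x3;    /* top 2 bits: 4 PDPT entries */
        unsigned pd   = (va >> 21) & 0x1FF;  /* 9 bits: 512 PD entries */
        unsigned pt   = (va >> 12) & 0x1FF;  /* 9 bits: 512 PT entries */
        unsigned off  = va & 0xFFF;          /* 12-bit offset within the 4 KiB page */
        printf("PDPT=%u PD=%u PT=%u offset=%#x\n", pdpt, pd, pt, off);
    }

    int main(void) {
        split_pae(0xDEADBEEF);
        return 0;
    }

The widening happens at the end of the walk, not in the virtual address: each 8-byte PTE has room for a physical address wider than 32 bits, so the final physical-page base can sit above 4 GiB even though no virtual address ever does.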
36-bit just happened to be the supported physical address size in the first generation of CPUs that implemented PAE, the Pentium Pro. There is no inherent 36-bit limit in PAE.
x86-64 adopted the PTE format, which has room for up to 52-bit physical addresses. Current x86-64 CPUs support the same physical address size in legacy mode with PAE as they do in 64-bit mode (as reported by CPUID). That limit is a design choice that saves bits in cache tags, TLB entries, store-buffer entries, etc., and in the comparators involved with them. It's normally chosen to be more than the amount of RAM a real system could actually use, given the commercially available DIMM sizes and the number of memory controllers even in multi-socket systems, while still leaving room for some I/O address space.
x86-64 came soon after PAE, or at least soon enough to be relevant for desktop use, so it's a common misconception that PAE means exactly 36 bits. (It's also that 64-bit mode is a vastly better way to address more memory, allowing a single process to use more than 2 or 3 GiB, depending on the user/kernel split.)

What happens when accessing a non-existent physical address in an x86 system?

I am working on a Linux kernel module which maps a physical address range into a process's virtual address space by manipulating the process's page tables.
This leads me to a question: what happens if a PTE points to a non-existent physical address?
For example, my x86 laptop has 8 GB of DRAM; if a PTE has the value 0x8000000400001227, will the CPU generate some exception for this invalid address access?
I did a quick test with that, but nothing unusual happened, and I'm totally confused.
Please help clarify the reason behind this, or let me know if I really need to read some x86 documents.
Typically a memory read to non-existent memory will return all FF's and a memory write will be discarded. (With some platforms and/or some address ranges, reads may return 0. It depends on how the address range is decoded by the chipset.)
Page table entry bits 51:M are reserved (where M is the physical address width supported by the processor), so if you map and try to access an address above the supported physical address width, you will get a page fault due to a reserved-bit violation. I think M is typically 39 bits for client processors, and more for servers. You can find the value for your system using CPUID with eax=0x80000008 and examining bits 7:0 of eax.
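On GCC or Clang you can query that leaf from C using the compiler-provided cpuid.h helper (a minimal sketch):

    #include <stdio.h>
    #include <cpuid.h>   /* GCC/Clang; MSVC has __cpuid in <intrin.h> instead */

    int main(void) {
        unsigned eax, ebx, ecx, edx;
        /* Leaf 0x80000008: EAX bits 7:0 = physical address width (M),
           bits 15:8 = linear (virtual) address width. */
        if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 0x80000008 not supported");
            return 1;
        }
        printf("physical address width: %u bits\n", eax & 0xFF);
        printf("linear address width:   %u bits\n", (eax >> 8) & 0xFF);
        return 0;
    }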

Addressability vs address space vs address bus

How do you determine addressability based on address space? How do you determine the size of the address bus based on the addressability? E.g.: the addressability of a machine is 32 bits; what is the size of the address bus?
The address bus connects the CPU with the main memory. So if the address bus has 32 bits, the max size of the main memory is 2^32 bytes, i.e. 4 GB.
The address bus transfers a physical address, and thus the physical address space in this example is 4 GB.
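The relationship is simply 2^(bus width) bytes of physical address space; a quick illustration in C:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        unsigned bus_bits = 32;                 /* address-bus width from the example */
        uint64_t bytes = 1ULL << bus_bits;      /* 2^32 addressable bytes */
        printf("%u-bit bus -> %llu bytes = %llu GiB\n",
               bus_bits, (unsigned long long)bytes,
               (unsigned long long)(bytes >> 30));   /* prints: 4 GiB */
        return 0;
    }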
However the CPU generates virtual addresses, and the virtual addresses are the virtual address space. The virtual addresses have to be mapped to physical addresses by a memory management unit.
In principle, one can map a small virtual address space to a large physical one (as was done earlier, e.g. in the PDP-11 computers), but nowadays it's usually a larger virtual address space mapped to a smaller physical one, e.g. from a 64-bit CPU with a 2^64-byte virtual address space to a physical memory with a 32-bit address bus, which is thus 4 GB in size.
So if you have a primitive system without memory management, and you want all addresses that the CPU can generate to be existing main-memory addresses, then your address bus must have the same number of bits as the CPU uses for addressing, e.g. 32 bits.
But in a real system the virtual CPU addresses are essentially independent from the physical memory addresses.
