user kernel mode division for windows process - winapi

How is the process address space (4 GB) divided between user-mode and kernel-mode modules in Windows?
When I checked explorer.exe in Process Explorer, the lower 2 GB was occupied by user-mode DLLs,
and the upper 3-4 GB address range of the System process was occupied by drivers (*.sys files).
So my question is: is this 3-4 GB address range shared by all processes, or is it duplicated for each process?

The upper gigabyte is where the OS kernel lives, along with all the drivers and additional modules, I/O buffers, and other kernel-only data memory. That area is shared by all processes, and indeed has to be for the kernel to work at all. The page tables live in a region called hyperspace, which is at the 3 GB boundary and is the only section of memory above 2 GB that is not shared between processes. The 3rd gigabyte is used by the kernel by default, but if you build your programs to have 3 GB of user-mode memory, then this area will belong to the process.
That's all off the top of my head, so feel free to correct me.

To provide a short answer:
The (virtual) memory layout depends on your OS. Of course, there are differences between 32 bit and 64 bit versions of windows, but also between the different versions.
See here (MSDN) and here (MS blogger).
Hope this helps.

By default, up to XP, 2 GB were used by the kernel, and the other 2 GB were available to each program. When XP is started with the /3GB boot switch, programs linked with the /LARGEADDRESSAWARE flag can use up to 3 GB of virtual address space.
That means each application can manage up to 3 GB. 32-bit Windows can swap memory out to a pagefile, and this may well become larger than 4 GB. Thereby it's possible that two applications together may allocate much more than 3 GB in total.
I just tested this on a 4 GB XP 32-bit machine. I started 3 applications, each allocating 2 GB using VirtualAlloc and filling it using memset. The task manager shows that the total amount of virtual allocated memory is 7 GB. This of course is not very practical. If two of these applications try to use all their memory simultaneously, the machine will slow down so much that it appears to hang.
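To make the 2 GB / 3 GB split described above concrete, here is a rough sketch in Python. The function names are made up for illustration, and it ignores details such as the small guard regions Windows keeps at the edges of user space. It only models where the user/kernel boundary sits for a 32-bit process on 32-bit Windows.

```python
GIB = 1 << 30

def user_space_limit(three_gb_boot, large_address_aware):
    """Return the first address above user space (hypothetical helper).

    Assumes 32-bit Windows: the boundary moves from 2 GB to 3 GB only
    when the system was booted with /3GB AND the executable was linked
    with /LARGEADDRESSAWARE.
    """
    if three_gb_boot and large_address_aware:
        return 3 * GIB  # 0xC0000000
    return 2 * GIB      # 0x80000000

def is_user_address(addr, three_gb_boot=False, large_address_aware=False):
    """True if addr falls in the per-process (user) part of the address space."""
    return 0 <= addr < user_space_limit(three_gb_boot, large_address_aware)
```

With the default split, 0x90000000 is a kernel address; with both options enabled, the same address belongs to the process.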

Related

Virtual memory in OS

I have been trying to understand the virtual address space concept used by running programs. Let me work with an example of a 32-bit application running on a 32-bit Windows OS.
As far as I have understood, each process considers (or "thinks") itself to be the only application running on the system (is this correct?) and has access to 4 GB of addresses, of which, in the standard configuration, 2 GB is allocated to the kernel and 2 GB to the user process. I have the following questions on this:
Why does a user process need to have kernel code loaded in its address space? Why can't the kernel have its own full 4 GB address space so that each process can enjoy 4 GB of space?
In the 2GB+2GB configuration, is 2 GB sufficient for the kernel to load all its code? Surely all the application code making up the kernel is (or can be) more than 2 GB? Similarly, a user process which is allocated the 2 GB address space surely needs more than 2 GB when you consider its own code as well as the other dependencies such as DLLs?
Another question I have on this topic is about the various locations where a running process is present on the computer system. Say, for example, I have a program C:\Program Files\MyApp\app.exe. When I launch it, it's loaded into the process using virtual address space and uses paging (pagefile.sys) to make use of the limited RAM. My question is: once app.exe is launched, is it loaded into RAM+pagefile in its entirety, or is only a portion of the program loaded from C:\Program Files\MyApp\myapp.exe, so that it keeps referring to the exe location for more as and when needed?
Last question: on a 32-bit OS, if I had more than 4 GB of RAM, can the memory management use the RAM in excess of 4 GB, or does it go to waste?
Thanks
Steve
Why does a user process need to have kernel code loaded in its address
space? Why can't the kernel have its own full 4 GB address space so
that each process can enjoy 4GB space?
A process can have (a tiny little bit less than) 4 GiB. The problem is that converting virtual addresses into physical addresses is expensive, so the CPU uses a "translation look-aside buffer" (TLB) to speed it up; and (at least on older CPUs) changing the virtual address space (e.g. because the kernel is in its own virtual address space) causes TLB entries to be discarded, which causes (virtual) memory accesses to become slow (because of "TLB misses"). Mapping the kernel into all virtual address spaces avoids/avoided this performance problem.
Note: For modern CPUs with the "PCID" feature the performance problem can be avoided by giving each virtual address space an ID; but most operating systems were designed before this feature existed, so (even with Meltdown patches) they still use virtual address spaces in the same way.
In 2GB+2GB configuration, is 2GB sufficient for Kernel to load all its
code? Surely all the application code making up the kernel is more
than 2GB? Similarly, a user process which is allocated the 2GB address
space surely needs more than 2 GB when you consider its own code as
well as the other dependencies such as dlls?
Code is never the problem; it's data. In general, most software either doesn't need 2 GiB of space or needs more than 4 GiB of space; and there's very little that needs 2 GiB but doesn't need more than 4 GiB. For things that need more than 4 GiB of space, everything shifted to 64 bit (typically with 131072 GiB or more of "user space") about 10 years ago, so...
My question is, once app.exe is launched, does it load into RAM+Pagefile in its entirety or it only loads a portion of the program from C:\Program Files\MyApp\myapp.exe and hence it keeps on referring to the exe location for more as and when needed?
Most modern operating systems use "memory mapped files". The idea is that the executable file isn't initially loaded into RAM at all, but if/when something within a page is actually accessed the first time it causes a "page fault" and the page fault handler fetches the page from disk. This tends to reduce RAM consumption (stuff that isn't accessed is never loaded from disk) and improve process start up times.
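The same idea is available to ordinary programs. As a minimal sketch (not the OS loader itself, and with a made-up helper name), Python's standard mmap module maps a file into the address space, and the operating system is then free to fetch pages from disk only when they are touched:

```python
import mmap

def read_byte_via_mapping(path, offset):
    """Map a file read-only and read one byte at the given offset.

    Illustrative helper: mapping the file does not by itself read its
    contents; the OS typically brings in the page containing `offset`
    when it is first accessed.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset]
```

Reading a single byte near the end of a large file this way does not require reading the whole file first, which is the property the answer above describes for executables.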
On a 32-bit OS if i had more than 4 GB RAM, can the memory management use the RAM space in excess of 4 GB or it goes waste?
There are multiple virtual address spaces where virtual addresses might be 32 bits wide, and a single physical address space where (depending on extensions that the CPU supports) physical addresses might be 36 bits wide (or even wider). This means that you could have a 32-bit OS running on a "32-bit only" CPU that can effectively use up to (e.g.) 64 GiB of RAM (if you can find a motherboard that actually supports it). In this case the CPU still converts virtual addresses into physical addresses, and processes needn't be aware of the physical address size; but a single process won't be able to use all of the RAM by itself (you'd need many processes to use all the RAM).
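For a feel of the numbers, here is a small arithmetic sketch. The helper names are invented, and the 2 GiB of user space per process is just the default split assumed for the example:

```python
GIB = 1 << 30

def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits

def min_processes_to_touch_all_ram(ram_bytes, user_space_per_process):
    """Smallest number of processes whose user spaces could cover all RAM
    (ceiling division; ignores kernel memory and sharing)."""
    return -(-ram_bytes // user_space_per_process)
```

Under these assumptions, 32-bit virtual addresses cover 4 GiB, 36-bit physical addresses cover 64 GiB, and it would take at least 32 processes with 2 GiB of user space each to touch 64 GiB of RAM.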
Why does a user process need to have kernel code loaded in its address space? Why can't the kernel have its own full 4 GB address space so that each process can enjoy 4GB space?
There normally are no kernel processes (except for the NULL process). Most CPUs process exceptions and interrupts in the context of the currently running process. To support that, the kernel needs to be in the same location and have the same layout in all processes. Otherwise, an interrupt occurring during one process would be handled differently than one occurring while another process is running.
In 2GB+2GB configuration, is 2GB sufficient for Kernel to load all its code? Surely all the application code making up the kernel is(or can be) more than 2GB? Similarly, a user process which is allocated the 2GB address space surely needs more than 2 GB when you consider its own code as well as the other dependencies such as dlls?
You have a misconception here. There is no application code in the kernel space. The kernel space code only executes in response to an interrupt or exception.
2GB is more than sufficient for any kernel I have seen. In fact, some 32-bit systems (where the hardware permits it) make the kernel space less than 2GB and increase the size of the user space accordingly.
Another question I have on this topic is about the various locations where a running process is present on the computer system -Say for example I have a program C:\Program Files\MyApp\app.exe. When I launch it, it's loaded into the process using virtual address space and uses paging (pagefile.sys) to use the limited RAM. My question is, once app.exe is launched, does it load into RAM+Pagefile in its entirety or it only loads a portion of the program from C:\Program Files\MyApp\myapp.exe and hence it keeps on referring to the exe location for more as and when needed?
That depends upon the system. On any rationally designed system, secondary storage will be allocated to back every valid page in the process user address space. The "where" depends upon the system. For example, some systems use the executable as the page file for the code and static data. Only the writeable data will go to the page file. However, some primitive operating systems do not support paging directly to a file in that manner.
Last question - On a 32-bit OS if i had more than 4 GB RAM, can the memory management use the RAM space in excess of 4 GB or it goes waste?
That depends upon the system. It is possible for a 32-bit OS to use more than 4GB of RAM. Each process is limited to 4GB, but the various processes together can use more than 4GB of physical memory.
Let's say that you have 4K pages. That's 12 bits of page offset. In theory a 32-bit processor could have 64-bit page table entries. In that case the processor could easily access more than 4GB of physical memory.
The more common case is that a 32-bit processor has 32-bit page table entries. In theory a 32-bit page table with 4K pages could access 2 ^ (32 + 12) bytes of memory. In practice some of the 32 bits in the page table entry have to be used for system purposes. If there are fewer than 12 control bits, the processor can use more than 4GB of physical memory.
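The arithmetic above can be sketched as follows. This is a simplified model with made-up names: it treats every non-control bit of the page table entry as part of the physical page frame number, which real hardware does not necessarily do.

```python
def max_physical_bytes(pte_bits, control_bits, page_offset_bits=12):
    """Upper bound on addressable physical memory in this simplified model.

    frame-number bits = PTE bits minus control bits; each frame covers
    2**page_offset_bits bytes (4K pages by default).
    """
    frame_number_bits = pte_bits - control_bits
    return 1 << (frame_number_bits + page_offset_bits)
```

With a 32-bit entry and 12 control bits the result is exactly 4 GB; with 8 control bits it would be 64 GB; with no control bits at all it reaches the theoretical 2 ^ (32 + 12) bytes mentioned above.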

Windows 8: Unable to allocate 2GB with 3GB User Address Space

I'm trying to create a Windows 8, 32-bit program for testing. Testing includes a large allocation, and I'm having trouble. The OS was booted with /3GB, the machine has 8GB and a page file, and the program was linked with /LARGEADDRESSAWARE, so I should not be memory constrained. (It's important for me to use a 32-bit program for testing because of the way some types are defined, for example a size_t.)
The trouble is I'm not able to allocate 2GB (0x80000000) of memory from new or VirtualAlloc. new throws bad_alloc and VirtualAlloc returns NULL with ERROR_NOT_ENOUGH_MEMORY.
In previous versions of Windows, a 3GB Address Space meant the application was given 0x00000000 to 0xBFFFFFFF, and the OS used 0xC0000000 to 0xFFFFFFFF (see Richter's Programming Applications for Windows or Solomon and Russinovich's Windows Internals). In principle, I believe that means I have the theoretical space.
If I switch to x64, everything works as expected. I suspect I'm missing something very obvious, but I'm not sure what (like a shared memory region right in the middle of the address space).
Are there any ideas how I might be able to perform an allocation of 0x80000000 on a 32-bit machine?
In previous versions of Windows, a 3GB Address Space meant the application was given 0x00000000 to 0xBFFFFFFF, and the OS used 0xC0000000 to 0xFFFFFFFF (see Richter's Programming Applications for Windows or Solomon and Russinovich's Windows Internals). In principle, I believe that means I have the theoretical space.
Nothing has changed in Windows 8. What you stated is still true. In order, on a 32 bit system, to be able to reserve a 2GB block of memory you need at least the following to be true:
1. Your process is large address aware.
2. Your system is booted with the /3GB switch.
3. The virtual address space of your process has an unreserved range of addresses that is 2GB in size.
It's easy enough to arrange for the first two conditions to hold, but the third condition is harder to control. You should not assume that your process will be able to find a 2GB contiguous range of address space in a 32 bit process. That's an unrealistic expectation.
If your test system is a 64 bit system then you should consider testing on 32 bit system also. For example, on a 64 bit system there is no /3GB boot option and all large address aware 32 bit processes have a 4GB address space. Of course, you are still subject to item 3 on my list.
The /3GB option has no meaning on a 64-bit operating system and is no longer supported on Vista and up. The option is IncreaseUserVA on modern 32-bit versions of Windows that use BCDEdit, like Windows 8. So it is very unlikely that you actually got what you hoped for; in all likelihood you actually got a 2 GB address space, which is the quickest explanation for why you can't allocate 2 GB.
A 32-bit process gets a 4 GB address space on a 64-bit operating system, since none of the upper pages are needed by the operating system. You have to opt in, though, by telling the operating system that you don't use unwise pointer shenanigans like relying on the upper bit of an address to be zero; the /LARGEADDRESSAWARE link.exe or editbin.exe option is required.
That still doesn't mean you get to allocate 4 GB; you face the same problem you have now with the 2 GB address space you currently get. The address space is shared between code and data. It takes just one DLL with an awkward base address to cut the available space in two.
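To illustrate how one awkwardly placed module limits the largest possible allocation, here is a small sketch. The function name and the example addresses are hypothetical, and it models the address space as a simple range with a list of already-reserved (base, size) regions:

```python
GIB = 1 << 30

def largest_free_block(space_size, reserved):
    """Size of the largest contiguous unreserved range in [0, space_size).

    `reserved` is an iterable of (base, size) tuples in any order;
    overlapping regions are tolerated.
    """
    largest = 0
    cursor = 0  # end of the highest reserved byte seen so far
    for base, size in sorted(reserved):
        if base > cursor:
            largest = max(largest, base - cursor)
        cursor = max(cursor, base + size)
    if space_size > cursor:
        largest = max(largest, space_size - cursor)
    return largest
```

In this model, a 3 GB user space with a single 1 MB DLL loaded at 0x60000000 has no free range larger than 1.5 GB, so a 2 GB allocation fails even though almost 3 GB is unused. The same DLL at 0x10000000 would leave a range of well over 2 GB.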

Windows memory management

I'm a bit confused about Windows memory management.
I've read somewhere that every process in Windows (32-bit) gets its own 4 GB of memory, thanks to swapping to disk. But 32-bit Windows can use at most 4 GB. So I thought that every process only "thinks" it has 4 GB but in reality has less.
Am I correct?
So how can I access data from one process to another? If 2 PEs are loaded at 0x400000, how do I do that? Could you give me an example in C or ASM?
Can somebody explain this to me further? Maybe point me to some good article.
Just a brief description is enough :).
Thanks.
Processes can address up to 4 GB of addresses, which may or may not be backed by "real" memory. Windows, even 32-bit, can address more than 4 GB of physical memory, but it might have reasons to limit this amount, or the limits may be imposed by hardware.
About Memory Management:
Each process on 32-bit Microsoft Windows has its own virtual address space that enables addressing up to 4 gigabytes of memory. [...]
Windows Internals Book - Chapter 9: Memory Management

Can I use the Windows boot.ini /3GB switch with less than 4 GB physical memory?

I ran into this issue while creating an application that needs to allocate large contiguous amounts of memory and must run on 32-bit Windows XP computers with 2 GB of physical memory.
Initially I ran into out-of-memory problems. Setting the /3GB switch in boot.ini and LARGE_ADDRESS_AWARE in the executable solved the problem on my computer with 4 GB of physical memory.
The question is: can I use the same strategy on a computer with less than 4 GB of physical memory, e.g. 2 GB? I.e., are these options all about the virtual address space, or do they have some relation to physical memory as well?
The /3GB switch does apply to virtual memory, so you can use it on a machine with less memory. It is discussed here. That's not to say that you will get great performance using that option in that situation. But if it is simply a matter of "making things work", then it may be a reasonable solution.

Why is Available Physical Memory (dwAvailPhys) > Available Virtual Memory (dwAvailVirtual) in call GlobalMemoryStatus on Windows Vista x64

I am playing with an MSDN sample to do memory stress testing (see: http://msdn.microsoft.com/en-us/magazine/cc163613.aspx) and an extension of that tool that specifically eats physical memory (see http://www.donationcoder.com/Forums/bb/index.php?topic=14895.0;prev_next=next). I am obviously confused, though, about the differences between virtual and physical memory. I thought each process has 2 GB of virtual memory (although I also read 1.5 GB because of "overhead"). My understanding was that some/all/none of this virtual memory could be physical memory, and the amount of physical memory used by a process could change over time (memory could be swapped out to disk, etc.). I further thought that, in general, when you allocate memory, the operating system could use physical memory or virtual memory. From this, I conclude that dwAvailVirtual should always be equal to or greater than dwAvailPhys in the call GlobalMemoryStatus. However, I often (always?) see the opposite. What am I missing?
I apologize in advance if my question is not well formed. I'm still trying to get my head around the whole memory management system in Windows. Tutorials/Explanations/Book recs are most welcome!
Andrew
That was only true in the olden days, back when RAM was expensive. The operating system maps pages of virtual memory to RAM as needed. If there isn't enough RAM to satisfy a program's request, it starts unmapping pages to make room. If such a page contains data instead of code, it gets written to the paging file. Whenever the program accesses that page again, it generates a page fault, letting the operating system read the page back from disk.
If the machine has little RAM and lots of processes consuming virtual memory pages, that can cause a very unpleasant effect called "thrashing". The operating system is constantly accessing the disk and machine performance slows down to a crawl.
More RAM means less disk access. There's very little reason not to use 3 or 4 GB of RAM on a 32-bit operating system; it's cheap. You may not get to use all 4 GB, because not all of it will be addressable due to hardware devices taking space on the address bus (video, mostly). But that won't change the size of the virtual memory accessible by user code; it is still 2 gigabytes.
Windows Internals is a good book.
The amount of virtual memory is limited by size of the address space - which is 4GB per process on a 32-bit system. And you have to subtract from this the size of regions reserved for system use and the amount of VM used already by your process (including all the libraries mapped to its address space).
On the other hand, the total amount of physical memory may be higher than the amount of virtual memory space the system has left free for your process to use (and these days it often is).
This means that if you have more than ~2GB of RAM, you can't use all your physical memory in one process (since there's not enough virtual memory space to map it to), but it can be used by many processes. Note that this limitation is removed in a 64-bit system.
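As a back-of-the-envelope sketch of the point above (the helper name and the example figures are assumptions, not values taken from the question), the virtual space still available to a 32-bit process is bounded by its user address space, whatever the amount of free RAM:

```python
GIB = 1 << 30
MIB = 1 << 20

def available_virtual(user_space_bytes, used_virtual_bytes):
    """Address space a process can still reserve: the user-space size
    minus what is already reserved or committed (never negative)."""
    return max(0, user_space_bytes - used_virtual_bytes)
```

For example, a process with a 2 GB user space that has already used 200 MB of it has under 2 GB of virtual space left, which is less than the free physical memory on a machine with, say, 6 GB of RAM available.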
I don't know if this is your issue, but the MSDN page for the GlobalMemoryStatus function contains the following warning:
On computers with more than 4 GB of memory, the GlobalMemoryStatus function can return incorrect information, reporting a value of –1 to indicate an overflow. For this reason, applications should use the GlobalMemoryStatusEx function instead.
Additionally, that page says:
On Intel x86 computers with more than 2 GB and less than 4 GB of memory, the GlobalMemoryStatus function will always return 2 GB in the dwTotalPhys member of the MEMORYSTATUS structure. Similarly, if the total available memory is between 2 and 4 GB, the dwAvailPhys member of the MEMORYSTATUS structure will be rounded down to 2 GB. If the executable is linked using the /LARGEADDRESSAWARE linker option, then the GlobalMemoryStatus function will return the correct amount of physical memory in both members.
Since you're referring to members like dwAvailPhys instead of ullAvailPhys, it sounds like you're using a MEMORYSTATUS structure instead of a MEMORYSTATUSEX structure. I don't know the consequences of that on a 64-bit platform, but on a 32-bit platform that definitely could cause incorrect memory sizes to be reported.

Resources