Problems using GlobalMemoryStatusEX on Win32 - visual-studio-2010

I am facing the following problem: I need to find the available memory on my system. GlobalMemoryStatusEx works fine when built for x64, but gives a wrong answer when built for Win32. I am using Intel Visual Fortran 2010 on Windows 7 64-bit.
Here is a sample of my code:
program test
    use kernel32
    use ifwinty
    implicit none
    type(T_MEMORYSTATUSEX) :: status
    integer :: RetVal
    status%dwLength = sizeof(status)
    RetVal = GlobalMemoryStatusEx(status)
end program test
Thank you very much!

Your program doesn't ever display the values returned, and you don't say what is "wrong", so I don't know what problem you're referring to. Intel Visual Fortran supplies a MemoryStatus example program (in the Win32 sample collection) that uses GlobalMemoryStatusEx. When I run it on my 64-bit system in 32-bit mode it displays:
48% of memory is in use
15.99GB total physical memory
8.28GB available physical memory
31.98GB total pageable memory
24.42GB available pageable memory
2.00GB total virtual memory
1.98GB available virtual memory
and when run in 64-bit mode:
48% of memory is in use
15.99GB total physical memory
8.28GB available physical memory
31.98GB total pageable memory
24.42GB available pageable memory
8192.00GB total virtual memory
8191.99GB available virtual memory
Notice that the only difference is in virtual memory, correctly reflecting the difference between 32-bit and 64-bit mode.
In most cases, you don't need to find the amount of available memory; almost any figure you can obtain will not be helpful during the run of a program. In particular, available virtual memory doesn't necessarily mean you can allocate a single chunk of data that size, as the memory pool may be somewhat fragmented.
So my answer is that your code, as far as you showed it, is the correct way to find out how much memory, physical and virtual, is available to Windows. You just need to know how to interpret the results correctly.

Related

CreateFileMapping size limit in a 64-bit process

Is there a size limit to the file mapping object? The reason I'm asking is that there is a mention of a 2GB limit somewhere in MSDN (I've lost track of it), and I also checked this sample, which also expects a 2GB size limit:
https://cpp.hotexamples.com/examples/-/-/CreateFileMapping/cpp-createfilemapping-function-examples.html
But I tried it on a 40GB file with no problems on the newest Windows 10, so I'm a bit worried there might have been some limitation on older Windows, for example.
There is no 2GB limit for file mappings. You can boot 32-bit Windows with the 3GB option, or, for a 32-bit process running on a 64-bit system, get the full 4GB if the correct PE flag is set. All these limits are theoretical and you will never reach them in practice.
How large a view you can map depends on two things:
The contiguous range of free addresses in your process's address space.
Available kernel memory for keeping track of the memory pages.
The first one is the big limit on 32-bit systems since the address space of your process is shared with system libraries, 3rd-party libraries (anti-virus, injected "tweaking" tools etc.), the PEB and TEBs, the system region, thread stacks and memory reserved by hardware. This will often put you well below 2GB. Any design requiring more than 500MB should probably be changed to only map in specific smaller ranges as needed.
For a 64-bit process on 64-bit Windows, the virtual address space is the 128-terabyte range 0x000'00000000 through 0x7FFF'FFFFFFFF (KB889654 claims 8 TB, but that only applies to versions before Windows 8.1). Any usable range is going to be smaller, but you can assume a couple of terabytes at least. 40GB is no problem and not enough to run into trouble with low system resources either.
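To make the "map in specific smaller ranges as needed" advice concrete, here is a minimal C++ sketch under stated assumptions: the file name, offset and view size are placeholders, and error handling is kept to a bare minimum. The mapping object covers the whole file; only a small window is ever mapped into the address space.

// Minimal sketch (Win32, C++): map a small window of a very large file.
// "bigfile.dat", the 8GB offset and the 64MB view size are placeholders.
#include <windows.h>
#include <cstdio>

int main()
{
    HANDLE file = CreateFileW(L"bigfile.dat", GENERIC_READ, FILE_SHARE_READ,
                              nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // Size 0 means "use the current size of the file"; the size is a 64-bit
    // quantity, so files far larger than 4GB are fine.
    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return 1; }

    // Map a 64MB view starting at byte offset 8GB. The offset must be a
    // multiple of the system allocation granularity (64KB).
    ULONGLONG offset = 0x200000000ull;
    SIZE_T viewSize = 64 * 1024 * 1024;
    const char* view = static_cast<const char*>(
        MapViewOfFile(mapping, FILE_MAP_READ,
                      static_cast<DWORD>(offset >> 32),
                      static_cast<DWORD>(offset & 0xFFFFFFFF),
                      viewSize));
    if (view)
    {
        std::printf("first byte at offset 8GB: %d\n", view[0]);
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}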

Does mmap allocate all the memory at once?

I'm working on a JIT compiler which will generate machine code in memory. This JIT is targeted primarily at 64-bit x86 POSIX systems, and I'm concerned about jumps in the code always being encodable as 32-bit relative offsets. What I'd like to do is mmap a 2-4GB chunk of executable memory for machine code, and manage this memory area myself.
What I'm wondering about specifically is: is it safe for me to mmap 4GB of memory at once, on a 64-bit system, even if the system doesn't have 4GB of memory? I'm assuming that most (or all) OSes won't really allocate the pages I don't write to, and so if I always allocate at the lower addresses first, I'll be OK, so long as I don't actually use more memory than the system physically has.
I would also be curious to hear alternative suggestions as to how to manage machine code allocation so that the machine code always resides in the same 4GB space on a 64-bit machine.
Your mmap of 4GB may succeed in allocating the virtual memory; physical pages will be allocated as they are "dirtied", i.e. modified by your program. If you run out of physical memory, your process may be terminated. See also this question.
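To make the "reserve big, commit lazily" approach explicit, here is a minimal sketch under Linux/POSIX assumptions: reserve the whole arena with PROT_NONE and MAP_NORESERVE so no physical memory or swap is committed up front, then mprotect blocks to read/write/execute only as code is generated into them. The sizes are illustrative.

// Minimal sketch (Linux/POSIX assumptions): reserve a 2GB code arena up
// front, then commit pages only as the JIT actually emits code into them.
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int main()
{
    const size_t arenaSize = 2ull * 1024 * 1024 * 1024;   // 2GB of address space
    // PROT_NONE + MAP_NORESERVE: claim contiguous addresses, no RAM or swap yet.
    void* arena = mmap(nullptr, arenaSize, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (arena == MAP_FAILED) { std::perror("mmap"); return 1; }

    // When the JIT needs another block for generated code, make just that
    // block usable. The 1MB block size here is arbitrary.
    const size_t blockSize = 1024 * 1024;
    if (mprotect(arena, blockSize, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        std::perror("mprotect");
        return 1;
    }
    // Because the whole arena spans less than 2GB, any jump between two
    // addresses inside it fits in a signed 32-bit relative offset.
    std::printf("code arena reserved at %p\n", arena);

    munmap(arena, arenaSize);
    return 0;
}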

32-bit pointers with the x86-64 ISA: why not?

The x86-64 instruction set adds more registers and other improvements to help streamline executable code. However, in many applications the increased pointer size is a burden. The extra, unused bytes in every pointer clog up the cache and might even overflow RAM. GCC, for example, builds with the -m32 flag, and I assume this is the reason.
It's possible to load a 32-bit value and treat it as a pointer. This doesn't necessitate extra instructions, just load/compute the 32 bits and load from the resulting address. The trick won't be portable, though, as platforms have different memory maps. On Mac OS X, the entire low 4 GiB of address space is reserved. Still, for one program I wrote, hackishly adding 0x100000000L to 32-bit "addresses" before use improved performance greatly over true 64-bit addresses, or compiling with -m32.
Is there any fundamental impediment to having a 32-bit, x86-64 platform? I suppose that supporting such a chimera would add complexity to any operating system, and anyone wanting that last 20% should just Make it Work™, but it still seems that this would be the best fit for a variety of computationally intensive programs.
There is an ABI called "x32" for Linux in development. It's a mix between x86_64 and ia32, similar to what you describe: a 32-bit address space while using the full 64-bit register set. It needs a custom kernel, binutils and gcc.
Some SPEC runs indicate a performance improvement of about 30% in some benchmarks. See further information at https://sites.google.com/site/x32abi/
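If you want to try it, a quick sanity check (assuming an x32-enabled gcc, glibc and kernel) is to verify that pointers really are 4 bytes in the resulting binary while the compiler still has the 64-bit register set available:

// Quick sanity check for the x32 model: pointers and longs are 4 bytes.
// Build (assuming an x32-enabled toolchain): g++ -mx32 check_x32.cpp -o check_x32
#include <cstdio>

int main()
{
    std::printf("sizeof(void*) = %zu, sizeof(long) = %zu\n",
                sizeof(void*), sizeof(long));
    // Expected under x32: 4 and 4. Under plain x86_64 (-m64): 8 and 8.
    return 0;
}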
As Mysticial commented above, ICC has the -auto-ilp32 / /Qauto-ilp32 option to use 32-bit pointers in 64-bit mode:
Instructs the compiler to analyze the program to determine if there are 64-bit pointers that can be safely shrunk into 32-bit pointers and if there are 64-bit longs (on Linux* systems) that can be safely shrunk into 32-bit longs.
On Windows there's no x32 ABI like on Linux, but you can still use 32-bit pointers by disabling the /LARGEADDRESSAWARE flag, which is enabled for 64-bit binaries by default:
By default, 64-bit Microsoft Windows-based applications have a user-mode address space of several terabytes. For precise values, see Memory Limits for Windows and Windows Server Releases. However, applications can specify that the system should allocate all memory for the application below 2 gigabytes. This feature is beneficial for 64-bit applications if the following conditions are true:
A 2 GB address space is sufficient.
The code has many pointer truncation warnings.
Pointers and integers are freely mixed.
The code has polymorphism using 32-bit data types.
All pointers are still 64-bit pointers, but the system ensures that every memory allocation occurs below the 2 GB limit, so that if the application truncates a pointer, no significant data is lost. Pointers can be truncated to 32-bit values, then extended to 64-bit values by either sign extension or zero extension.
Virtual Address Space
Of course there's no direct compiler support like the -mx32 option in GCC, so you may need to deal with pointers manually every time you store a pointer to memory or dereference it. The simplest solution is to write a class wrapping a 32-bit pointer to handle that (a rough sketch follows the list below). Luckily MS also has experience with mixing 32-bit and 64-bit pointers in the same architecture, so there are plenty of supporting keywords/macros:
POINTER_32/__ptr32
POINTER_64/__ptr64
POINTER_SIGNED/__sptr
POINTER_UNSIGNED/__uptr
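As a rough illustration of the wrapper idea mentioned above (a hand-rolled sketch, not one of the MS keywords): store a 32-bit offset from a fixed base address and widen it back to a real pointer on access. The base address and the guarantee that all wrapped objects live within 4GB of it are assumptions the application has to provide.

// Hand-rolled sketch of a "near pointer": store a 32-bit offset from a
// known base address and rebuild the real 64-bit pointer on access.
#include <cstdint>
#include <cassert>

template <typename T>
class Ptr32 {
public:
    Ptr32() : offset_(0) {}
    explicit Ptr32(T* p) : offset_(compress(p)) {}

    T* get() const { return reinterpret_cast<T*>(base() + offset_); }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }

private:
    static std::uintptr_t base() {
        // In a real program this would be the start of a dedicated arena
        // from which all wrapped objects are allocated; placeholder here.
        static char arena[1];
        return reinterpret_cast<std::uintptr_t>(arena);
    }
    static std::uint32_t compress(T* p) {
        std::uintptr_t delta = reinterpret_cast<std::uintptr_t>(p) - base();
        assert(delta <= UINT32_MAX && "object outside the 4GB window");
        return static_cast<std::uint32_t>(delta);
    }
    std::uint32_t offset_;   // 4 bytes stored instead of 8
};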
Google's V8 engine takes a different approach, compressing pointers to 32 bits to save memory as well as improve performance. See the comparison of memory and performance improvements here.
See also How does the compressed pointer implementation in V8 differ from JVM's compressed Oops?
Read more
How to use 32-bit pointers in 64-bit application?
Can a C compiler generate an executable 64-bits where pointers are 32-bits?
I do not expect it to be very hard to support such a model in the OS. About the only thing that needs to change for processes in this model is page management: pages must be allocated below the 4 GB point. The kernel too should allocate its buffers from the first 4 GB of the virtual address space if it passes them to the application. The same applies to the loader that loads and starts applications. Other than that, a 64-bit kernel should be able to handle such apps without major modifications.
Compiler support shouldn't be a big issue either. It's mostly a matter of generating code that can use the extra CPU registers and their full 64 bits and adding proper REX prefixes whenever needed.
It's called "x86-32 emulation", or WOW64 on Windows (presumably something else on other OSes) and it's a hardware flag in the processor. No need for any user-mode tricks here.

memory allocation issues

I have the following question:
I have 2.5 GB of RAM in my computer. What I want to know is: if I allocate essentially all of the memory to some process, for example
char *buffer = malloc(2.4GB);
can no other process (Google Chrome, Microsoft games on the computer, etc.) run?
Probably not. First, your operating system has protections; malloc eventually becomes a system call in your OS, so it will fail instead of killing everything.
Second, because of virtual memory you can have more allocated memory than RAM, so even if your OS were to let you allocate 2.5 gigs, it would still be able to function and run other processes.
While it is OS and compiler dependent, on Visual C++ under 32-bit Windows you will typically be unable to malloc more than 512MB at a time. This is controlled by the preprocessor constant _HEAP_MAXREQ. For details of the approach I used to work around this limitation, see the following thread. If you go to 64 bits, this also ceases to be an issue, although you might end up using much more virtual memory than you would expect.
In an OS like Windows, where each process gets a 4GB virtual address space (assuming a 32-bit OS), it doesn't matter how much RAM you have. In that case malloc(2.4GB) will surely fail, as the user address space is limited to 2GB. Even allocating 2GB will most probably fail, since the system has to find 2GB of contiguous virtual address space for malloc, and that much contiguous free space is nearly impossible to come by due to fragmentation.
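To see the effect of the address-space limit for yourself, here is a minimal C++ sketch (the 2.4GB figure is taken from the question): request the block and check the return value. On a 32-bit build the request will almost certainly fail, and either way the rest of the system keeps running.

// Minimal check: ask for ~2.4GB and see whether the C runtime can find a
// contiguous block that large. Other processes keep running regardless,
// because each has its own virtual address space.
#include <cstdlib>
#include <cstdio>

int main()
{
    const size_t request = 2400ull * 1024 * 1024;   // ~2.4GB
    char* buffer = static_cast<char*>(std::malloc(request));
    if (buffer == nullptr) {
        std::puts("malloc failed: no contiguous 2.4GB region available");
        return 1;
    }
    std::puts("malloc succeeded (pages are committed lazily on most systems)");
    std::free(buffer);
    return 0;
}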
Computers work with virtual memory; it has no direct relation to the real size of RAM.

Why is Available Physical Memory (dwAvailPhys) > Available Virtual Memory (dwAvailVirtual) in call GlobalMemoryStatus on Windows Vista x64

I am playing with an MSDN sample to do memory stress testing (see: http://msdn.microsoft.com/en-us/magazine/cc163613.aspx) and an extension of that tool that specifically eats physical memory (see http://www.donationcoder.com/Forums/bb/index.php?topic=14895.0;prev_next=next). I am obviously confused, though, about the differences between virtual and physical memory. I thought each process has 2 GB of virtual memory (although I also read 1.5 GB because of "overhead"). My understanding was that some/all/none of this virtual memory could be physical memory, and the amount of physical memory used by a process could change over time (memory could be swapped out to disk, etc.). I further thought that, in general, when you allocate memory, the operating system could use physical memory or virtual memory. From this, I conclude that dwAvailVirtual should always be equal to or greater than dwAvailPhys in the call GlobalMemoryStatus. However, I often (always?) see the opposite. What am I missing?
I apologize in advance if my question is not well formed. I'm still trying to get my head around the whole memory management system in Windows. Tutorials/Explanations/Book recs are most welcome!
Andrew
That was only true in the olden days, back when RAM was expensive. The operating system maps pages of virtual memory to RAM as needed. If there isn't enough RAM to satisfy a program's request, it starts unmapping pages to make room. If such a page contains data instead of code, it gets written to the paging file. Whenever the program accesses that page again, it generates a paging fault, letting the operating system read the page back from disk.
If the machine has little RAM and lots of processes consuming virtual memory pages, that can cause a very unpleasant effect called "thrashing". The operating system is constantly accessing the disk and machine performance slows down to a crawl.
More RAM means less disk access. There's very little reason not to use 3 or 4 GB of RAM on a 32-bit operating system; it's cheap. Even then you do not get to use all 4 GB; not all of it will be addressable due to hardware devices taking space on the address bus (video, mostly). But that won't change the size of the virtual memory accessible by user code; it is still 2 gigabytes.
Windows Internals is a good book.
The amount of virtual memory is limited by the size of the address space, which is 4GB per process on a 32-bit system. And you have to subtract from this the size of regions reserved for system use and the amount of VM already used by your process (including all the libraries mapped into its address space).
On the other hand, the total amount of physical memory may be higher than the amount of virtual memory space the system has left free for your process to use (and these days it often is).
This means that if you have more than ~2GB of RAM, you can't use all your physical memory in one process (since there isn't enough virtual memory space to map it into), but it can be used by many processes. Note that this limitation is removed on a 64-bit system.
I don't know if this is your issue, but the MSDN page for the GlobalMemoryStatus function contains the following warning:
On computers with more than 4 GB of memory, the GlobalMemoryStatus function can return incorrect information, reporting a value of –1 to indicate an overflow. For this reason, applications should use the GlobalMemoryStatusEx function instead.
Additionally, that page says:
On Intel x86 computers with more than 2 GB and less than 4 GB of memory, the GlobalMemoryStatus function will always return 2 GB in the dwTotalPhys member of the MEMORYSTATUS structure. Similarly, if the total available memory is between 2 and 4 GB, the dwAvailPhys member of the MEMORYSTATUS structure will be rounded down to 2 GB. If the executable is linked using the /LARGEADDRESSAWARE linker option, then the GlobalMemoryStatus function will return the correct amount of physical memory in both members.
Since you're referring to members like dwAvailPhys instead of ullAvailPhys, it sounds like you're using a MEMORYSTATUS structure instead of a MEMORYSTATUSEX structure. I don't know the consequences of that on a 64-bit platform, but on a 32-bit platform that definitely could cause incorrect memory sizes to be reported.
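For reference, here is a minimal C++ sketch of the suggested switch: MEMORYSTATUSEX with GlobalMemoryStatusEx uses 64-bit ull* members, so it avoids the overflow and 2 GB rounding described in the quoted documentation.

// Sketch of the recommended API: MEMORYSTATUSEX has 64-bit (ull*) fields,
// so values are not capped or rounded on machines with lots of RAM.
#include <windows.h>
#include <cstdio>

int main()
{
    MEMORYSTATUSEX status = {};
    status.dwLength = sizeof(status);          // required before the call
    if (!GlobalMemoryStatusEx(&status)) {
        std::printf("GlobalMemoryStatusEx failed: %lu\n", GetLastError());
        return 1;
    }
    std::printf("%lu%% of memory in use\n", status.dwMemoryLoad);
    std::printf("physical: %llu MB total, %llu MB available\n",
                status.ullTotalPhys >> 20, status.ullAvailPhys >> 20);
    std::printf("virtual:  %llu MB total, %llu MB available\n",
                status.ullTotalVirtual >> 20, status.ullAvailVirtual >> 20);
    return 0;
}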
