Why Windows Server 2008 R2 x64 can't allocate more than 1.2GB for a 32-bit process

I am seeing very strange behavior from Windows regarding memory allocation in 32-bit processes. The problem: a 32-bit process seems unable to allocate more than roughly 1.2GB, and the limit floats between 1.1GB and 1.3GB depending on conditions I cannot identify.
My environment is:
Windows Server 2008 R2
Total physical RAM - 16GB
RAM in use at the time of the experiment - 6GB (i.e. around 9+GB free)
I have a self-written C++ tool that repeatedly allocates memory via malloc() in 32MB blocks until the OS refuses the request (the upper limit right now is 1070MB, but as I mentioned, the limit floats; the highest number I remember was around 1.3GB) and then releases all these blocks in reverse order. This is done in a cycle.
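One iteration of the cycle looks roughly like this (a simplified sketch; the real tool linked below also changes the block size dynamically while running):

#include <cstdio>
#include <cstdlib>
#include <vector>

int main() {
    const size_t blockSize = 32 * 1024 * 1024;   // 32MB blocks, as described above
    std::vector<void*> blocks;

    // Allocate until the OS (via the CRT heap) refuses the request.
    for (;;) {
        void* p = malloc(blockSize);
        if (p == NULL)
            break;
        blocks.push_back(p);
    }
    printf("Got %u blocks = %u MB before the first failure\n",
           (unsigned)blocks.size(), (unsigned)(blocks.size() * 32));

    // Release everything in reverse order; the real tool repeats this cycle forever.
    while (!blocks.empty()) {
        free(blocks.back());
        blocks.pop_back();
    }
    return 0;
}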
So the questions I have are:
Why can't I get close to the real 2GB limit for a 32-bit process? 1.3GB is only about 65% of the theoretical limit. Not being able to get at least another 0.5GB seems very strange to me, given that I have 9GB of unused RAM.
What (at runtime) can influence the upper limit? It changes over time and I have no idea why. I'd like to control it somehow; perhaps there is some magic command to optimize the OS address space :)
In reply to comments:
show the loop where you allocate the memory: here it is https://code.google.com/p/membounce/source/browse/
Note: after looking deeper into the code I noticed a bug that incorrectly calculated the upper limit (when the allocation block size is changed dynamically while the tool is running). The actual limit I'm getting right now is around 1.6GB; better than 1.2GB, but still not close to 2GB.
Build for x64 instead: in this case I'm interested specifically in 32-bit. The original cause of this question is that sometimes I cannot start Java (specifically 32-bit NetBeans) with -Xmx900M (it says JVM creation failed), yet sometimes it DOES start with -Xmx1200M (-Xmx is the JVM parameter for the maximum heap the JVM may use). The reason I want the non-x64 NetBeans is that Java (at least Oracle's JVM) effectively consumes roughly twice as much memory when switched to x64, i.e. I have to give the JVM 2GB of RAM to get the same capacity as with 1GB on a 32-bit JVM. I run a number of Java processes and want to fit more of them into my 16GB of RAM :) (this is somewhat off-topic background)

Related

CreateFileMapping size limit in a 64-bit process

Is there a size limit on a file mapping object? I'm asking because there is a mention of a 2GB limit somewhere in MSDN (I've lost track of where), and I also checked this sample, which likewise assumes a 2GB size limit:
https://cpp.hotexamples.com/examples/-/-/CreateFileMapping/cpp-createfilemapping-function-examples.html
But I tried it on a 40GB file with no problems on the newest Windows 10, so I'm wondering whether there was some limitation on older Windows versions, for example.
There is no 2GB limit for file mappings. You can boot 32-bit Windows with the /3GB option, and a 32-bit process running on a 64-bit system gets the full 4GB if the correct PE flag (LARGEADDRESSAWARE) is set. All these limits are theoretical and you will never reach them in practice.
How large a view you can map depends on two things:
The contiguous range of free addresses in your process's address space.
Available kernel memory for keeping track of the memory pages.
The first one is the big limit on 32-bit systems since the address space of your process is shared with system libraries, 3rd-party libraries (anti-virus, injected "tweaking" tools etc.), the PEB and TEBs, the system region, thread stacks and memory reserved by hardware. This will often put you well below 2GB. Any design requiring more than 500MB should probably be changed to only map in specific smaller ranges as needed.
For a 64-bit process on 64-bit Windows, the virtual address space is the 128-terabyte range 0x00000000'00000000 through 0x00007FFF'FFFFFFFF (KB889654 claims 8 TB, but that only applies to versions before Windows 8.1). Any usable range is going to be smaller, but you can assume a couple of terabytes at least. 40GB is no problem, and not enough to run into problems with low system resources either.
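As a sketch of the "map smaller views" advice (the path, offset, and sizes below are made up for illustration, and error reporting is trimmed): create the mapping over the whole file, then map only a small window of it at a time.

#include <windows.h>
#include <stdio.h>

int main() {
    // Open an existing large file read-only (hypothetical path).
    HANDLE file = CreateFileW(L"C:\\data\\large.bin", GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // Size 0/0 means "use the current size of the file"; a 40GB file is fine here.
    HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mapping == NULL) { CloseHandle(file); return 1; }

    // Map a 64MB view at a 1GB offset instead of the whole file. The offset is
    // passed as high/low DWORDs and must be a multiple of the allocation granularity.
    ULONGLONG offset = 1ULL << 30;           // 1GB into the file
    SIZE_T viewSize = 64 * 1024 * 1024;      // 64MB window
    const unsigned char* view = (const unsigned char*)MapViewOfFile(
        mapping, FILE_MAP_READ, (DWORD)(offset >> 32), (DWORD)(offset & 0xFFFFFFFF), viewSize);
    if (view != NULL) {
        printf("Byte at offset 1GB: %u\n", (unsigned)view[0]);
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}

In a 32-bit process the same code only needs 64MB of contiguous free address space for the view, which is far easier to satisfy than a multi-gigabyte mapping of the whole file.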

Why would the JVM suddenly not allocate up to the maximum heap setting, even when running low on memory and with plenty of free OS memory?

My JVM is set to have a maximum heap size of 2GB. It is currently running slowly due to being low on memory, but it will not allocate beyond 1841MB (even though it has done so before on this run). I have over 16GB memory free.
Why would this suddenly happen to a running JVM? Could it be because it is "fenced in", i.e. it cannot get a larger contiguous range of memory?
This is for java 1.8.0_73 (64bit) on Windows 10. But I have seen this now and then for other java versions and on Windows 7 and XP too.
32-bit JVMs usually struggle to use more than about 1800MB. Exactly how much they can allocate depends on your operating system and how it lays out the 32-bit address space (which can vary between runs).
Use a 64-bit JVM to get more.
Start the JVM with
java -Xmx2048m -Xms2048m
This preallocates the full 2GB heap at JVM startup (even if it is not needed yet), so you find out immediately whether the address space for it is available.
You cannot make a program run faster just by increasing the heap size. A program may be slow for various reasons.
In your case it may not be about memory usage at all: increasing the heap does not make the program use that memory to the fullest, or run faster. The heap only fills up if you create a lot of objects that remain in use and are not garbage collected.
Another reason for the slowness could be that some parts of the program use up the processing power (poorly performing algorithms?).
It could also be due to slow I/O operations (file reads/writes?).
These are only a few possibilities; pinpointing the slowness requires knowing more about your program.
You could look for slow-running parts of your code by going through its logs (if any) or by using profiling tools such as JConsole (shipped with the JDK), VisualVM, etc.
You could also tune the JVM by passing parameters that customize garbage collection, the sizes of the various heap regions, the thread stack size, etc.
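For example, a tuning command line might look like the following (the values and the MyApp class name are placeholders; sensible settings depend entirely on the application, and -XX:+UseG1GC assumes a JVM new enough to ship the G1 collector):
java -Xms1024m -Xmx2048m -Xss512k -XX:+UseG1GC -verbose:gc MyApp
Here -verbose:gc logs every collection, which is often the quickest way to see whether garbage collection is really the bottleneck.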

Relation between RAM size and Virtual memory with JVM heap size

For performance testing I need 2GB of heap memory, so I am setting the parameter in the Java settings via "-Xmx2048m" and also increasing the virtual memory. But while running the application it gives errors like "the Java runtime environment cannot be loaded" and "Several JVMs running in the same process caused an error" (in fact, these errors only appear for values above 1GB).
So is it possible to set the heap to 2GB, or can it be at most 1GB? If it is possible, how do I do it?
I'm using Windows 7 64-bit with 8GB of RAM, and Java 1.6.
Since you are running a 32-bit JVM, there is a limit on how much memory the process can use. Due to how virtual memory is laid out, 32-bit processes can only access 2GB of memory (or up to 3-4GB with special settings). Since Java needs some memory for its own bookkeeping, which is not part of the heap available to your application, the actual usable limit for -Xmx must be somewhere below 2GB. According to this answer, the limit for 32-bit Java on Windows is -Xmx1500m (I'm not sure whether it has changed in newer releases, but due to the limitations outlined above it must stay below 2GB, so it's likely to have remained around 1500MB).
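A quick way to probe the limit on a particular machine is to start a throwaway JVM with the heap size in question (the value below is just an example):
java -Xmx1500m -version
If the requested heap cannot be reserved, the JVM exits with a message along the lines of "Could not reserve enough space for object heap" instead of printing the version.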

Why is there a huge performance difference between 32 and 64 bit JDK? [duplicate]

Recently I've been doing some benchmarking of the write performance of my company's database product, and I've found that simply switching to a 64bit JVM gives a consistent 20-30% performance increase.
I'm not allowed to go into much detail about our product, but basically it's a column-oriented DB, optimised for storing logs. The benchmark involves feeding it a few gigabytes of raw logs and timing how long it takes to analyse them and store them as structured data in the DB. The processing is very heavy on both CPU and I/O, although it's hard to say in what ratio.
A few notes about the setup:
Processor: Xeon E5640 2.66GHz (4 core) x 2
RAM: 24GB
Disk: 7200rpm, no RAID
OS: RHEL 6 64bit
Filesystem: Ext4
JVMs: 1.6.0_21 (32bit), 1.6.0_23 (64bit)
Max heap size (-Xmx): 512 MB (for both 32bit and 64bit JVMs)
Constants for both JVMs:
Same OS (64bit RHEL)
Same hardware (64bit CPU)
Max heap size fixed to 512 MB (so the speed increase is not due to the 64bit JVM using a larger heap)
For simplicity I've turned off all multithreading options in our product, so pretty much all processing is happening in a single-threaded manner. (When I turned on multi-threading, of course the system got faster, but the ratio between 32bit and 64bit performance stayed about the same.)
So, my question is... Why would I see a 20-30% speed improvement when using a 64bit JVM? Has anybody seen similar results before?
My intuition up until now has been as follows:
64bit pointers are bigger, so the L1 and L2 caches overflow more easily, so performance on the 64bit JVM is worse.
The JVM uses some fancy pointer compression tricks to alleviate the above problem as much as possible. Details on the Sun site here.
The JVM is allowed to use more registers when running in 64bit mode, which speeds things up slightly.
Given the above three points, I would expect 64-bit performance to be slightly slower than, or approximately equal to, that of the 32-bit JVM.
Any ideas? Thanks in advance.
Edit: Clarified some points about the benchmark environment.
From: http://www.oracle.com/technetwork/java/hotspotfaq-138619.html#64bit_performance
"Generally, the benefits of being able to address larger amounts of memory come with a small performance loss in 64-bit VMs versus running the same application on a 32-bit VM. This is due to the fact that every native pointer in the system takes up 8 bytes instead of 4. The loading of this extra data has an impact on memory usage which translates to slightly slower execution depending on how many pointers get loaded during the execution of your Java program. The good news is that with AMD64 and EM64T platforms running in 64-bit mode, the Java VM gets some additional registers which it can use to generate more efficient native instruction sequences. These extra registers increase performance to the point where there is often no performance loss at all when comparing 32 to 64-bit execution speed.
The performance difference comparing an application running on a 64-bit platform versus a 32-bit platform on SPARC is on the order of 10-20% degradation when you move to a 64-bit VM. On AMD64 and EM64T platforms this difference ranges from 0-15% depending on the amount of pointer accessing your application performs."
Without knowing your hardware, I'm just taking some wild stabs:
Your specific CPU may be using microcode to 'emulate' some x86 instructions, most notably the x87 ISA.
x64 uses SSE math instead of x87 math; I've noticed a 10-20% speedup in some math-heavy C++ apps in this case. Math differences could be the real killer if you're using strictfp.
Memory. 64 bits gives you much more address space. Maybe the GC is a little less aggressive in 64-bit mode because you have extra RAM.
Is your OS in 64-bit mode and running a 32-bit JVM via some compatibility layer?
The 64-bit instruction set has 8 more general-purpose registers, which should make the code faster overall.
But since processors nowadays mostly wait for memory or disk, I suppose that either the memory subsystem or the disk I/O might be more efficient in 64-bit mode.
My best guess, based on a quick Google search for 32- vs 64-bit performance charts, is that 64-bit I/O is more efficient. I suppose you do a lot of I/O... If memcpy is involved when moving the data, it's probably more efficient to copy longs (64-bit) than ints (32-bit).
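Purely to illustrate that last point (a toy sketch, not how a real memcpy is written): on a 64-bit build, copying 8 bytes per iteration moves the same buffer in half the iterations of a 4-bytes-per-iteration loop.

#include <cstdint>
#include <vector>

// Copy the same buffer 4 bytes at a time and 8 bytes at a time.
void copy32(uint32_t* dst, const uint32_t* src, size_t words) {
    for (size_t i = 0; i < words; ++i) dst[i] = src[i];
}
void copy64(uint64_t* dst, const uint64_t* src, size_t words) {
    for (size_t i = 0; i < words; ++i) dst[i] = src[i];
}

int main() {
    const size_t bytes = 64 * 1024 * 1024;                 // 64MB test buffer
    std::vector<uint64_t> src(bytes / 8, 42), dst(bytes / 8);

    copy32(reinterpret_cast<uint32_t*>(&dst[0]),
           reinterpret_cast<const uint32_t*>(&src[0]), bytes / 4);  // 16M iterations
    copy64(&dst[0], &src[0], bytes / 8);                            // 8M iterations
    return 0;
}

Real memcpy implementations typically use even wider SSE moves, so in practice the gap is narrower than this suggests, but the direction is the same.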
Realize that the 64-bit JVM is not magic pixie dust that makes Java apps go faster. The 64-bit JVM allows heaps >> 4 GB and, as such, only makes sense for applications which can take advantage of huge memory on systems which have it.
Generally there is either a slight improvement (due to certain hardware optimizations on certain platforms) or minor degradation (due to increased pointer size). Generally speaking there will be a need for fewer GCs, but when they do occur they will likely be longer.
In-memory databases or search engines that can use the increased memory for caching objects, and thus avoid IPC or disk accesses, will see the biggest application-level improvements. In addition, a 64-bit JVM will also allow you to run many, many more threads than a 32-bit one, because there's more address space for things like thread stacks. The maximum number of threads for a 32-bit JVM is generally around 1000, whereas a 64-bit JVM can handle around 100000.
Some drawbacks though:
Additional issues with the 64-bit JVM are that certain client-oriented features like the Java Plug-in and Java Web Start are not supported. Also, any native code would need to be compatible (e.g. JNI for things like Type II JDBC drivers). This is a bonus for pure-Java developers, as pure apps should just run out of the box.
More on this Thread at Java.net

How to check the maximum amount of address space you can use in one process

When the LARGEADDRESSAWARE flag is not set in a 32-bit executable, 2GB of address space (give or take) is available to the process. When the LARGEADDRESSAWARE flag is present in the PE header of the executable, this limit can be (correct me if I am wrong):
2GB if 32-bit Windows was not started with the /3GB switch
3GB if 32-bit Windows was started with the /3GB switch
almost 4GB if the process runs as a 32-bit process under a 64-bit Windows OS
My question is: how can one determine this memory limit (with and/or without the LARGEADDRESSAWARE flag)? And as a sidenote: is the enumeration of possibilities above correct?
Note: I am not interested in the amount of memory the process is using, also not the limit due to external effects, just the maximum amount of memory I can allocate in the ideal case.
I think the best approach is to call GetSystemInfo and work out what you need from lpMinimumApplicationAddress and lpMaximumApplicationAddress. You can simply subtract the former from the latter to obtain the total available addressable memory space.
Your three bullet points of the various possibilities are correct.
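A minimal sketch of that approach (it reports only the size of the user-mode address range, not how much of it is currently free or contiguous):

#include <windows.h>
#include <stdio.h>

int main() {
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    // The user-mode address range available to this process.
    ULONG_PTR minAddr = (ULONG_PTR)si.lpMinimumApplicationAddress;
    ULONG_PTR maxAddr = (ULONG_PTR)si.lpMaximumApplicationAddress;

    printf("Lowest  address: %p\n", si.lpMinimumApplicationAddress);
    printf("Highest address: %p\n", si.lpMaximumApplicationAddress);
    printf("Addressable user space: %llu MB\n",
           (unsigned long long)(maxAddr - minAddr) / (1024 * 1024));
    return 0;
}

Built as a 32-bit executable, this prints roughly 2GB without LARGEADDRESSAWARE and roughly 4GB with the flag set on 64-bit Windows, matching the bullet points above.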
