Why is the JVM using more memory than I am allocating?

I am starting the Java process with the following command:
java -Xmx32m -jar winstone-lite.jar --warfile=myWarFile.war
Instead of using the amount of memory I specified, it is still allocating 144m.
When I say allocate, I mean that when I look at the process in "top" I see 144m as the amount of memory being used.
I am using the current version from http://www.oracle.com/technetwork/java/embedded/documentation/index.html.
I would have expected the JVM to crash if my application required more memory than I am allocating.

-Xmx just tells the JVM how much memory it may use for its internal heap.
The JVM needs memory for other purposes (permanent generation, temporary space etc.), plus like every binary it needs space for its own binary code, plus any libraries/DLLs/.so it loads.
The 144 MiB you quote probably contains at least some of these other memory uses.
How did you measure the memory usage? On a modern OS that uses virtual memory, measuring the memory usage of a process is not trivial, and it cannot be expressed as a single value.
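To see the split from inside the process, the JVM's own management beans report heap and non-heap usage separately. The following is a minimal sketch, assuming a standard JVM that provides java.lang.management; the class name JvmMemoryReport is made up for illustration. Note that thread stacks, the JVM binary and shared libraries are not included in either figure, so the total shown by "top" will still be higher.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: prints what the JVM itself tracks as heap and non-heap memory.
class JvmMemoryReport {
    /** Bytes currently committed for the Java heap (bounded by -Xmx). */
    static long heapCommitted() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        return bean.getHeapMemoryUsage().getCommitted();
    }

    /** Bytes committed for JVM-managed non-heap pools (e.g. metaspace, code cache). */
    static long nonHeapCommitted() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        return bean.getNonHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("max heap (-Xmx):     " + Runtime.getRuntime().maxMemory() / mib + " MiB");
        System.out.println("heap committed:      " + heapCommitted() / mib + " MiB");
        System.out.println("non-heap committed:  " + nonHeapCommitted() / mib + " MiB");
        // Not covered here: thread stacks, native libraries, the JVM's own code.
    }
}
```

Running it with the same -Xmx32m flag as the original command should show that the non-heap figure is not limited by -Xmx.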


Java buildpack memory calculation

The Java buildpack memory calculator, running with a Spring Boot application inside a Docker container with 1GB of memory, calculates memory as the documentation describes: it takes the entire available memory. These are the calculated JVM options:
Calculated JVM Memory Configuration: -XX:MaxDirectMemorySize=10M -Xmx747490K -XX:MaxMetaspaceSize=157725K -Xss1M (Total Memory: 1G, Thread Count: 50, Loaded Class Count: 25433, Headroom: 0%)
The question is: why does it take the entire available memory and give it to the JVM? It should leave some memory for the Java process outside of the JVM. This can lead to OOM because the JVM thinks it has 1GB for itself (747490K for heap), while in reality it has less because some of its memory is used by native memory, outside of the JVM.
Should I stop using this calculator and set the JVM configuration myself, or can I reconfigure it somehow?
Question is why does it take the entire available memory and give it to the JVM?
The assumption is that the only thing running in your container is your Java application, thus it assigns all of the available memory to be used.
If you do things like shell out and run other processes or run other processes in the container, you need to tell the memory calculator so it can take that into account.
This can lead to OOM because the JVM thinks it has 1GB for itself (747490K for heap), while in reality it has less because some of its memory is used by native memory, outside of the JVM.
The memory calculator takes into consideration the major memory regions within a Java process. Not just heap. That said, it cannot 100% guarantee that you will never go over your memory limit. That's impossible with a Java app.
There are things you can do as an application developer, like creating 10,000 threads or using JNI, that cannot be restricted and could potentially consume a whole ton of memory. If you do that, your app will go over its container memory limit and crash.
The memory calculator attempts to give you a reasonable memory configuration for most common Java workloads: running a web app, running a microservice, running some batch jobs, etc.
If you are doing something that doesn't fit within that pattern, then you can simply tell the memory calculator and it'll adjust things accordingly.
Should I stop using this calculator and set the JVM configuration myself, or can I reconfigure it somehow?
Even if you need to customize what the calculator is doing it can be helpful. It's additional toil to calculate these values manually, especially when it's so easy to change the memory limits. If your ops team increases the memory limit of the container, you want your application to automatically adjust to that configuration (as well as it can).
Beyond that, the memory calculator is also good at detecting problems early. If you configure the JVM manually and you mess it up, let's say you over-allocate memory, the JVM won't necessarily care until it tries to get more memory and can't. At some point down the road, you're going to have a problem but it's not clear when (probably at 3am on a Sat, lol).
With the memory calculator, the math is done when your container first starts to make sure that the memory settings are sane. If there's something off with the configuration, it'll fail and let you know.
You can override a memory calculator-defined value by simply setting that JVM option in the JAVA_TOOL_OPTIONS env variable. For example, if I want to allow for more direct memory, I would set JAVA_TOOL_OPTIONS='-XX:MaxDirectMemorySize=50M'. Then when you restart the container, the memory calculator will shift memory around to accommodate that.
The one thing you don't want to set is -Xmx. The memory calculator should always set this because it will set it to whatever is left after other regions have been accounted for. You can think of it like HEAP = CONTAINER_MEMORY_LIMIT - (all static memory regions).
If you were to set -Xmx, you have to get it exactly right. If it's too low then you're wasting memory. If it's too high then you could exceed the container memory limit and get crashes.
In short, if you think you want to set -Xmx, you should either increase the container memory limit or decrease one of the static memory regions.
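The HEAP = CONTAINER_MEMORY_LIMIT - (all static memory regions) relationship can be checked against the log line from the question. This is only a sketch of the arithmetic, not the buildpack's actual code: the class HeapBudget and its parameters are hypothetical, and otherStaticKiB is a stand-in for any regions that are not printed in that log line (its value below is simply back-computed so the numbers add up).

```java
// Hypothetical model of "heap is whatever is left after the static regions".
class HeapBudget {
    static long heapKiB(long containerKiB, int headroomPercent, long directKiB,
                        long metaspaceKiB, int threadCount, long stackKiB,
                        long otherStaticKiB) {
        long headroomKiB = containerKiB * headroomPercent / 100;
        long staticKiB = directKiB + metaspaceKiB
                + (long) threadCount * stackKiB + otherStaticKiB;
        long heap = containerKiB - headroomKiB - staticKiB;
        if (heap <= 0) {
            // Mirrors the idea of failing at startup instead of at 3am.
            throw new IllegalArgumentException("static regions exceed the container limit");
        }
        return heap;
    }

    public static void main(String[] args) {
        long container = 1024L * 1024L;   // Total Memory: 1G, in KiB
        long direct = 10L * 1024L;        // -XX:MaxDirectMemorySize=10M
        long metaspace = 157725L;         // -XX:MaxMetaspaceSize=157725K
        long other = 81921L;              // assumption: remainder not shown in the log line
        // 50 threads with -Xss1M, 0% headroom
        System.out.println(heapKiB(container, 0, direct, metaspace, 50, 1024L, other)); // prints 747490
    }
}
```

Raising any static region, or adding headroom, lowers the resulting heap by the same amount, which is why setting -Xmx by hand works against the calculation.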
If you run other things in the container, you need to set the headroom. This is done with the BPL_JVM_HEAD_ROOM env variable. Give it a percentage of the total container memory limit. Ex: BPL_JVM_HEAD_ROOM=20 would use 80% of the container's memory limit for Java and leave 20% for other things.
Setting some headroom can be useful in other cases as well, like if you're troubleshooting a container crash and you want a little extra room, or if you don't like operating at 100% the memory limit. You can leave 5 or 10% unused to match your comfort level.
If you have an application that uses a lot of threads, you'll need to adjust the thread count as well (the BPL_JVM_THREAD_COUNT env variable). The default is 250 threads, which works well for many web/servlet-based applications (thread per request model). We do automatically lower it to 50 threads if you're specifically using Spring Webflux, which does not need so many threads.
For other cases, it's up to you to configure this. For example, if you have a batch application that only needs a thread pool of 10, then you could set this to 40 or 50. 40-50 seems high for this example, but the JVM creates a number of its own threads and you need to account for those in addition to application-specific threads. When in doubt, look at a thread dump.
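As a rough first check before taking a full thread dump, the running application can report its own thread counts. This is a sketch using the standard ThreadMXBean; the class name LiveThreadCount is made up for illustration. It counts Java-level threads only, so some native JVM threads (for example GC workers) may not be included, and a real thread dump remains the more complete picture.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: reports live and peak Java thread counts for this JVM.
class LiveThreadCount {
    static int liveThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getThreadCount();
    }

    static int peakThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreads());
        System.out.println("peak threads: " + peakThreads());
        // List thread names to see which ones belong to the JVM rather than the app.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            System.out.println("  " + t.getName() + (t.isDaemon() ? " (daemon)" : ""));
        }
    }
}
```

Even an application that starts no threads of its own will usually list several here, which is the reason for the extra margin in the thread count setting.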

Why would the JVM suddenly not allocate up to the maximum heap setting, even when running low on memory and with plenty of free OS memory?

My JVM is set to have a maximum heap size of 2GB. It is currently running slowly due to being low on memory, but it will not allocate beyond 1841MB (even though it has done so before on this run). I have over 16GB memory free.
Why would this suddenly happen to a running JVM? Could it be because it is "fenced in", that is, it cannot get a larger contiguous range of memory?
This is for java 1.8.0_73 (64bit) on Windows 10. But I have seen this now and then for other java versions and on Windows 7 and XP too.
32 bit JVMs usually struggle to use more than about 1800 MB. Exactly how much they can allocate depends on your operating system and how it lays out the 32 bit address space (which can vary between runs).
Use a 64 bit JVM to get more.
Start the JVM with
java -Xmx2048m -Xms2048m
This sets the initial heap size equal to the maximum, so the JVM sets aside the full 2GB for the heap at startup (even if it is not needed).
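To check what the running JVM actually got, it can print its own heap limits and architecture. This is a minimal sketch; the class name HeapLimits is hypothetical, and the sun.arch.data.model property is HotSpot-specific, so it is read with a fallback in case another JVM does not define it.

```java
// Hypothetical helper: shows max, committed and used heap as the JVM sees them.
class HeapLimits {
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();      // upper bound, from -Xmx
    }

    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();    // heap currently obtained from the OS
    }

    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();    // part of the committed heap in use
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("os.arch:        " + System.getProperty("os.arch"));
        // Assumption: only HotSpot-style JVMs set this; others print "unknown".
        System.out.println("data model:     " + System.getProperty("sun.arch.data.model", "unknown"));
        System.out.println("max heap:       " + maxBytes() / mib + " MiB");
        System.out.println("committed heap: " + committedBytes() / mib + " MiB");
        System.out.println("used heap:      " + usedBytes() / mib + " MiB");
    }
}
```

With -Xms equal to -Xmx, the committed figure should start out close to the maximum instead of growing over time.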
You cannot make a program run faster by just increasing the heap memory. A program may be slow due to various reasons.
In your case, maybe the slowness is not caused by memory usage at all: increasing the heap does not make the program use that memory to the fullest, or run faster. The heap is used up only if you create a lot of objects which are still in use and therefore not garbage collected.
Another reason for the slowness could be that some parts of the program are using up the processing power (poorly performing algorithms?).
It could also be due to slow I/O operations (file reads/writes?).
These are only a few reasons. We could narrow down the cause of the slowness by getting to know more about your program.
You could look for slow running parts of your code by going through its logs (if any) or by using various profiling tools like jconsole (shipped with jdk), VisualVM, etc.
You could also tune your JVM by passing various parameters to customize the Garbage collection, various parts of the heap, thread stack size, etc.
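One quick way to tell whether garbage collection is a likely cause, before reaching for a profiler, is to compare accumulated GC time with JVM uptime. This is a sketch using the standard management beans; the class name GcOverhead is made up, and the resulting ratio is only a rough indicator because collectors report their time differently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates how much of the JVM's uptime went into GC.
class GcOverhead {
    /** Sum of the approximate collection times of all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time divided by uptime; 0.0 if uptime is not yet measurable. */
    static double gcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.printf("approx. share of uptime spent in GC: %.1f%%%n", gcFraction() * 100);
    }
}
```

A small share points towards CPU or I/O as the bottleneck; a large and growing share suggests the heap really is under pressure.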

Gradle heap error on uploadArchives

I am trying to upload an archive that's 600MB in size.
I get this error:
Execution failed for task ':uploadArchives'.
> Java heap space
Caused by: java.lang.OutOfMemoryError: Java heap space
I have tried to set GRADLE_OPTS, JVM_OPTS and MAVEN_OPTS variables, for setting the max. heap size, like for example:
export GRADLE_OPTS=-Xmx1024m
gradle uploadArchives
But I am still getting the same error.
What am I missing here?
Ultimately you always have a finite maximum heap to use, no matter what platform you are running on. On 32 bit Windows this is around 2 GB (not specifically heap, but the total amount of memory per process). Java simply makes the default smaller (presumably so that the programmer can't create programs with runaway memory allocation without running into this problem and having to examine exactly what they are doing).
Given this, there are several approaches you could take, either to determine how much memory you need or to reduce the amount of memory you are using. One common mistake in garbage collected languages such as Java or C# is to keep references to objects that you are no longer using, or to allocate many objects when you could reuse them instead. As long as objects have a reference to them they will continue to use heap space, as the garbage collector will not delete them.
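A typical source of such lingering references is a cache or list that only ever grows. One possible way to cap it, shown here as a sketch and not as the only approach, is a LinkedHashMap that evicts its least recently used entry once a limit is reached; the class name BoundedCache and the limit of 100 are arbitrary choices for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: dropping the eldest entry releases the only reference
// the cache holds, so the garbage collector is free to reclaim that object.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // true = access order, which gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;   // called by LinkedHashMap after each insertion
    }

    public static void main(String[] args) {
        BoundedCache<Integer, byte[]> cache = new BoundedCache<>(100);
        for (int i = 0; i < 10_000; i++) {
            cache.put(i, new byte[64 * 1024]);   // 64 KiB per entry
        }
        // Only the 100 most recent buffers are still referenced by the cache.
        System.out.println("entries kept: " + cache.size());   // prints 100
    }
}
```

An unbounded map in the same loop would keep all 10,000 buffers reachable, which is exactly the pattern a memory profiler tends to reveal.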
To track down such cases you can use a Java memory profiler to determine which methods in your program are allocating large numbers of objects, and then determine whether there is a way to make sure they are no longer referenced, or to not allocate them in the first place. One option which I have used in the past is "JMP" http://www.khelekore.org/jmp/.
If you determine that you are allocating these objects for a reason and you need to keep around references (depending on what you are doing this might be the case), you will just need to increase the max heap size when you start the program. However, once you do the memory profiling and understand how your objects are getting allocated you should have a better idea about how much memory you need.
In general, if you can't guarantee that your program will run in some finite amount of memory (perhaps depending on input size) you will always run into this problem. Only after exhausting all of this will you need to look into caching objects out to disk etc. At this point you should have a very good reason to say "I need X GB of memory" for something, and you can't work around it by improving your algorithms or memory allocation patterns. This will usually only be the case for algorithms operating on large datasets (like a database or some scientific analysis program), and then techniques like caching and memory mapped IO become useful.
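As an illustration of the memory mapped IO technique mentioned above, the following sketch reads a file through fixed-size mapped windows instead of loading it into the heap. It uses the standard java.nio API; the class name MappedChecksum, the 64 MiB window and the byte-sum "checksum" are arbitrary choices for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example: sums all bytes of a file without holding it on the Java heap.
class MappedChecksum {
    private static final long WINDOW = 64L * 1024 * 1024;   // map at most 64 MiB at a time

    static long sumBytes(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long sum = 0;
            for (long pos = 0; pos < size; pos += WINDOW) {
                long len = Math.min(WINDOW, size - pos);
                // The mapping is backed by the OS page cache, not by the Java heap.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;   // treat each byte as unsigned
                }
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience for trying it out: writes data to a temp file and sums it. */
    static long sumOfTempFile(byte[] data) {
        try {
            Path tmp = Files.createTempFile("mapped-checksum", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, data);
            return sumBytes(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfTempFile(new byte[] {1, 2, 3, (byte) 250}));   // prints 256
    }
}
```

Heap usage stays roughly constant regardless of file size, which is the property that matters when the data set is larger than any reasonable -Xmx.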
Run Java with the command-line option -Xmx, which sets the maximum size of the heap.
The problem was that the actual size of the package was a lot larger than expected, because Gradle was not able to handle symlinks.
Once I handled the symlinks manually, the problem went away.
The JVM heap size can be set in the gradle.properties file, in the root directory of your gradle project. Like this:
org.gradle.jvmargs=-Xms256m -Xmx1024m

Benefits of reserving vs. committing+reserving memory using VirtualAlloc on large arrays

I am writing a C++ program that essentially works with very large arrays. On Windows, I am using VirtualAlloc to allocate memory for my arrays. Now I fully understand the difference between reserving and committing memory using VirtualAlloc; however, I am wondering whether there is any benefit in committing memory page-by-page to a reserved region. In particular, MSDN (http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx) contains the following explanation for the MEM_COMMIT option:
Actual physical pages are not allocated unless/until the virtual addresses are actually accessed.
My experiments confirm this: I can reserve and commit several GB of memory without increasing the memory usage of my process (as shown in Task Manager); actual memory gets allocated only when I access it.
Now I have seen quite a few examples arguing that one should reserve a large portion of the address space and then commit memory page-by-page (or in some larger blocks, depending on the app's logic). As explained above, however, physical memory does not seem to be allocated before one accesses it; thus, I'm wondering whether there is any real benefit in committing memory page-by-page. In fact, committing memory page-by-page might actually slow my program down due to the many system calls needed for committing memory. If I commit the entire region at once, I pay for just one system call, but the kernel seems to be smart enough to allocate only the memory that I actually use.
I would appreciate it if someone could explain to me which strategy is better.
The difference is that commit "backs" the memory against the page file. To give an example:
Given 2GB of physical ram and 2GB of swap (assume fixed-size swap for this purpose).
Reserve 6GB - OK.
Commit first 2GB - OK.
Commit remaining 4GB - fails.
Extend swap file to 8GB
Commit remaining 4GB - succeeds.
The reason for using MEM_COMMIT up front would primarily be runtime error suppression (app stability). If you have a process that commits pages on demand, then there's always a chance that a commit along the way could fail if it exceeds the amount of memory+swap available. When memory has been backed by the page file, you have a strong guarantee that the memory is available for use from now until the point that you release it.
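The accounting in the 2GB RAM / 2GB swap example above can be written out as a toy model. This sketch does not call any Windows API; CommitModel is a hypothetical class that only tracks the rule "total committed memory may not exceed physical RAM plus swap", and it simplifies reserving to always succeed, which ignores address-space limits.

```java
// Hypothetical model of commit accounting: commits are checked against RAM + swap.
class CommitModel {
    private final long physicalGiB;
    private long swapGiB;
    private long committedGiB;

    CommitModel(long physicalGiB, long swapGiB) {
        this.physicalGiB = physicalGiB;
        this.swapGiB = swapGiB;
    }

    /** Reserving only claims address space, so this simplified model never refuses it. */
    boolean reserve(long gib) {
        return true;
    }

    /** Committing must be backed by RAM or page file, so it can fail. */
    boolean commit(long gib) {
        if (committedGiB + gib > physicalGiB + swapGiB) {
            return false;
        }
        committedGiB += gib;
        return true;
    }

    void resizeSwap(long newSwapGiB) {
        this.swapGiB = newSwapGiB;
    }

    long committedGiB() {
        return committedGiB;
    }

    public static void main(String[] args) {
        CommitModel m = new CommitModel(2, 2);
        System.out.println("reserve 6GB: " + m.reserve(6));   // true
        System.out.println("commit 2GB:  " + m.commit(2));    // true
        System.out.println("commit 4GB:  " + m.commit(4));    // false, 6 > 2 + 2
        m.resizeSwap(8);
        System.out.println("commit 4GB:  " + m.commit(4));    // true, 6 <= 2 + 8
    }
}
```

The point of the model is that the failure happens at commit time, so committing up front moves a possible failure to a place where the program can handle it.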
There's a number of reasons to go one way or the other, and I don't think there's any perfect science to deciding which. MEM_RESERVE alone is only needed for very large sparse array scenarios, ex: multi-gigabyte array which has at most 25-33% utilization (a popular technique for accelerating hash tables, etc).
Almost everything else is gray area where you could probably go either way -- MEM_COMMIT up-front would make your own app a little more stable and essentially give it priority to physical ram over competing apps that might allocate on-demand. (if you grab the ram first then your app will be the last left standing when physical memory is exhausted) At the same time, if you're not actually using all that ram then you may end up limiting the multi-tasking potential of your client's machine or causing unnecessary wasted disk space via a growing page file.

memory allocation vs. swapping (under Windows)

Sorry for my rather general question, but I could not find a definite answer to it:
Given that I have free swap memory left and I allocate memory in reasonable chunks (~1MB), can memory allocation still fail for any reason?
The smartass answer would be "yes, memory allocation can fail for any reason". That may not be what you are looking for.
Generally, whether your system has free memory left is not related to whether allocations succeed. Rather, the question is whether your process address space has free virtual address space.
The allocator (malloc, operator new, ...) first looks if there is free address space in the current process that is already mapped, that is, the kernel is aware that the addresses should be usable. If there is, that address space is reserved in the allocator and returned.
Otherwise, the kernel is asked to map new address space to the process. This may fail, but generally doesn't, as mapping does not imply using physical memory yet -- it is just a promise that, should someone try to access this address, the kernel will try to find physical memory and set up the MMU tables so the virtual->physical translation finds it.
When the system is out of memory, that is, when there is no physical memory left, the process is suspended and the kernel attempts to free physical memory by moving other processes' memory to disk. The application does not notice this, except that executing a single assembler instruction apparently took a long time.
Memory allocations in the process fail if there is no mapped free region large enough and the kernel refuses to establish a mapping. For example, not all virtual addresses are useable, as most operating systems map the kernel at some address (typically, 0x80000000, 0xc0000000, 0xe0000000 or something such on 32 bit architectures), so there is a per-process limit that may be lower than the system limit (for example, a 32 bit process on Windows can only allocate 2 GB, even if the system is 64 bit). File mappings (such as the program itself and DLLs) further reduce the available space.
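The per-process limit follows directly from where the kernel is mapped. As a small worked example, using only the base addresses quoted above, the following sketch computes how much of a 32 bit address space lies below each kernel base; the class name UserAddressSpace is made up, and real systems lose additional space to file mappings and DLLs.

```java
// Hypothetical arithmetic helper for a 32 bit address space (4096 MiB in total).
class UserAddressSpace {
    private static final long MIB = 1024L * 1024L;
    private static final long TOTAL_32_BIT = 1L << 32;

    /** User space spans addresses 0 .. kernelBase-1, so its size equals kernelBase. */
    static long userMiB(long kernelBase) {
        return kernelBase / MIB;
    }

    /** The kernel occupies everything from kernelBase up to the 4 GiB boundary. */
    static long kernelMiB(long kernelBase) {
        return (TOTAL_32_BIT - kernelBase) / MIB;
    }

    public static void main(String[] args) {
        // The L suffix matters: without it these literals would be negative ints.
        long[] bases = {0x80000000L, 0xC0000000L, 0xE0000000L};
        for (long base : bases) {
            System.out.printf("kernel at 0x%08X: %d MiB user, %d MiB kernel%n",
                    base, userMiB(base), kernelMiB(base));
        }
        // 0x80000000 -> 2048 MiB user, 0xC0000000 -> 3072 MiB, 0xE0000000 -> 3584 MiB
    }
}
```

The first case corresponds to the 2 GB limit for a 32 bit process on Windows mentioned above.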
A very general and theoretical answer would be no, it cannot. One of the reasons it could possibly fail, under very peculiar circumstances, is some weird fragmentation of your available / allocatable memory. I wonder whether you're trying to get a (probably very minor) performance boost (skipping the if pointer == NULL kind of check) or you're just wondering and want to discuss it, in which case you should probably use chat.
Yes, memory allocation often fails when you run out of address space in a 32-bit application (the limit can be 2, 3 or 4 GB depending on OS version and settings). This is often due to a memory leak. It can also fail if your OS runs out of space in your swap file.
