Geoserver memory issues - geoserver

There seems to be a few issues raised about geoserver's memory usage - I'm wondering if someone can speculate as to why my memory use is so high.
I'm currently running geoserver 2.11.1 under tomcat7 with a -Xms set to 8gb and -Xmx set to 12Gb. I have 45,000 layers containing small shapefiles (~3Mb each) across the same number of stores and around 60 workspaces.
Memory use, using the rest of the default values from tomcat and geoserver, immediately reaches 8Gb and I can't figure out why this would be the case.
Thoughts include:
1) Garbage collection is inefficient
2) There is a memory leak
3) There is not enough memory on the server
Hopefully someone can shine some light on why memory use is so high - any help is appreciated!

The Xms is the startup memory size, so it is expected that your application takes this amount of memory. Now let's not forget that it reserves this memory, it doesn't necessarily use it right away.

Related

Java buildpack memory calculation

Java buildpack memory calculator with Spring Boot application inside of Docker container with 1GB memory calculates memory as it says in documentation, it takes entire available memory and this are calculated JVM options:
Calculated JVM Memory Configuration: -XX:MaxDirectMemorySize=10M -Xmx747490K -XX:MaxMetaspaceSize=157725K -Xss1M (Total Memory: 1G, Thread Count: 50, Loaded Class Count: 25433, Headroom: 0%)
Question is why does it takes entire available memory and gives it to JVM? It should leave some memory for java process outside of JVM. This can lead to OOM because JVM thinks it has 1GB for itself (747490K for heap), and in reality it has less because some of it's memory is used by native memory, outside of JVM.
Should I not use this calculator and set JVM configuration by myself or I can reconfigure this somehow?
Question is why does it takes entire available memory and gives it to JVM?
The assumption is that the only thing running in your container is your Java application, thus it assigns all of the available memory to be used.
If you do things like shell out and run other processes or run other processes in the container, you need to tell memory calculator so it can take that into account.
This can lead to OOM because JVM thinks it has 1GB for itself (747490K for heap), and in reality it has less because some of it's memory is used by native memory, outside of JVM.
The memory calculator takes into consideration the major memory regions within a Java process. Not just heap. That said, it cannot 100% guarantee that you will never go over your memory limit. That's impossible with a Java app.
There are things you can do as an application developer, like create 10,000 threads or JNI, that cannot be restricted and could potentially consume a whole ton of memory. If you do that, your app will go over its container memory limit and crash.
The memory calculator attempts to give you a reasonable memory configuration for most common Java workloads. Running a web app, running a microservice, running some batch jobs, etc...
If you are doing something that doesn't fit within that pattern, then you can simply tell the memory calculator and it'll adjust things accordingly.
Should I not use this calculator and set JVM configuration by myself or I can reconfigure this somehow?
Even if you need to customize what the calculator is doing it can be helpful. It's additional toil to calculate these values manually, especially when it's so easy to change the memory limits. If your ops team increases the memory limit of the container, you want your application to automatically adjust to that configuration (as well as it can).
Beyond that, memory calculator is also good at detecting problems early. If you configure the JVM manually and you mess it up, let's say you over-allocate memory, the JVM won't necessarily care until it tries to get more memory and can't. At some point down the road, you're going to have a problem but it's not clear when (probably at 3am on a Sat, lol).
With memory calculator, it's doing the math when your container first starts to make sure that memory settings are sane. If there's something off with the configuration, it'll fail and let you know.
TIPS:
You can override a memory calculator-defined value by simply setting that JVM option in the JAVA_TOOL_OPTIONS env variable. For example, if I want to allow for more direct memory, I would set JAVA_TOOL_OPTIONS='-XX:MaxDirectMemorySize=50M'. Then when you restart the container, the memory calculator will shift memory around to accommodate that.
The one thing you don't want to set is -Xmx. The memory calculator should always set this because it will set it to whatever is left after other regions have been accounted for. You can think of it like HEAP = CONTAINER_MEMORY_LIMIT - (all static memory regions).
If you were to set -Xmx, you have to get it exactly right. If it's too low then you're wasting memory. If it's too high then you could exceed the container memory limit and get crashes.
In short, if you think you want to set -Xmx, you should either increase the container memory limit or decrease one of the static memory regions.
If you run other things in the container, you need to set the headroom. This is done with the BPL_JVM_HEAD_ROOM env variable. Give it a percent of the total container memory limit. Ex: BPL_JVM_HEAD_ROOM=20 would use 80% of the container's memory limit for Java and 20 for other stuff.
Setting some headroom can be useful in other cases as well, like if you're troubleshooting a container crash and you want a little extra room, or if you don't like operating at 100% the memory limit. You can leave 5 or 10% unused to match your comfort level.
If you have an application that uses a lot of threads, you'll need to adjust this as well. The default is 250 threads, which works well for many web/servlet-based applications (thread per request model). We do automatically lower to 50 threads if you're specifically using Spring Webflux which does not need so many threads.
For other cases, it's up to you to configure this. For example, if you have a batch application that only needs a thread pool of 10, then you could set this 40 or 50. 40-50 seems weird in this example, but the JVM creates a number of its own threads and you need to account for those in addition to application-specific threads when in doubt look at a thread dump.

Why would the JVM suddenly not allocate up to the maximum heap setting, even when running low on memory and with plenty of free OS memory?

My JVM is set to have a maximum heap size of 2GB. It is currently running slowly due to being low on memory, but it will not allocate beyond 1841MB (even though it has done so before on this run). I have over 16GB memory free.
Why would this suddenly happen to a running JVM? Could it be because it is "fenced in" - it cannot get a larger continuous range of physical memory?
This is for java 1.8.0_73 (64bit) on Windows 10. But I have seen this now and then for other java versions and on Windows 7 and XP too.
32 bit JVMs usually struggle to use more than about 1800Mb. Exactly how much they can allocate depends on your Operating System and how it lays out the 32 bit address space (which can vary between runs).
Use a 64 bit JVM to get more.
Start JVM with
java -Xmx2048m -Xms2048m
This will preallocate 2GB at JVM startup (even if not needed).
You cannot make a program run faster by just increasing the heap memory. A program may be slow due to various reasons.
In your case, may be it's not because of the memory usage, as the increase in the heap memory does not cause the program to use that memory to the fullest, or run faster. The heap is used up if you create a lot of objects which are in use and not garbage collected.
The other reason for the slowness could be due to some parts of the program using up the processing power (poorly performing algorithms ?).
It could also be due to slow I/O operations (file reads/writes ?).
These are only a few reasons. We can determine the slowness by getting to know more about your program.
You could look for slow running parts of your code by going through its logs (if any) or by using various profiling tools like jconsole (shipped with jdk), VisualVM, etc.
You could also tune your JVM by passing various parameters to customize the Garbage collection, various parts of the heap, thread stack size, etc.

Methods to decrease Memory consumption in ArangoDB

We currently have a ArangoDB cluster running on version 3.0.10 for a POC with about 81 GB stored on disk and Main memory consumption of about 98 GB distributed across 5 Primary DB servers. There are about 200 million vertices and 350 million Edges. There are 3 Edge collections and 3 document collections, most of the memory(80%) is consumed due to the presence of the edges
I'm exploring methods to decrease the main memory consumption. I'm wondering if there are any methods to compress/serialize the data so that less amount of main memory is utilized.
The reason for decreasing memory is to reduce infrastructure costs, I'm willing to trade-off on speed for my use case.
Please can you let me know, if there any Methods to reduce main memory consumption for ArangoDB
It took us a while to find out that our original recommendation to set vm.overcommit_memory to 2 this is not good in all situations.
It seems that there is an issue with the bundled jemalloc memory allocator in ArangoDB with some environments.
With an vm.overcommit_memory kernel settings value of 2, the allocator had a problem of splitting existing memory mappings, which made the number of memory mappings of an arangod process grow over time. This could have led to the kernel refusing to hand out more memory to the arangod process, even if physical memory was still available. The kernel will only grant up to vm.max_map_count memory mappings to each process, which defaults to 65530 on many Linux environments.
Another issue when running jemalloc with vm.overcommit_memory set to 2 is that for some workloads the amount of memory that the Linux kernel tracks as "committed memory" also grows over time and does not decrease. So eventually the ArangoDB daemon process (arangod) may not get any more memory simply because it reaches the configured overcommit limit (physical RAM * overcommit_ratio + swap space).
So the solution here is to modify the value of vm.overcommit_memory from 2 to either 1 or 0. This will fix both of these problems.
We are still observing ever-increasing virtual memory consumption when using jemalloc with any overcommit setting, but in practice this should not cause problems.
So when adjusting the value of vm.overcommit_memory from 2 to either 0 or 1 (0 is the Linux kernel default btw.) this should improve the situation.
Another way to address the problem, which however requires compilation of ArangoDB from source, is to compile a build without jemalloc (-DUSE_JEMALLOC=Off when cmaking). I am just listing this as an alternative here for completeness. With the system's libc allocator you should see quite stable memory usage. We also tried another allocator, precisely the one from libmusl, and this also shows quite stable memory usage over time. The main problem here which makes exchanging the allocator a non-trivial issue is that jemalloc has very nice performance characteristics otherwise.
(quoting Jan Steemann as can be found on github)
Several new additions to the rocksdb storage engine were made meanwhile. We demonstrate how memory management works in rocksdb.
Many Options of the rocksdb storage engine are exposed to the outside via options.
Discussions and a research of a user led to two more options to be exposed for configuration with ArangoDB 3.7:
--rocksdb.cache-index-and-filter-blocks-with-high-priority
--rocksdb.pin-l0-filter-and-index-blocks-in-cache
--rocksdb.pin-top-level-index-and-filter

neo4j getting slow & stuck on amazon ec2

I have a neo4j running on an ec2 instance (large, ubuntu) and I'm running some scripts on it that do lots of writings.
I noticed that after a while that those scripts run (after they wrote couple of thousand nodes) the server starting to run very slow, sometimes to the point it get absolutely stuck. another weird part - resetting the instance from this situation usually ends up in the server taking much longer than usual to init.
first I suspected that neo4j uses up all the RAM and this is a paging problem, but I've read that neo4j calculates dynamically the heap size and stack size limits. also I checked memory usage with top and it looked like most of the RAM was unused, except for Java process occasionally popping up, taking few GBs and then disappear quickly, which I assumed was neo4j.
anyway here's my question(s): do I need to config neo4j server and/or wrapper, or should I let neo4j calculate it dynamically on its own? and did someone encountered something like I described and have any idea what could cause it?
thanks!
It's been my experience that you definitely need to tweak the memory settings to your needs. The neo4j manual has a whole section on it:
http://neo4j.com/docs/stable/configuration.html
I've not really heard of neo4j automatically adjusting to your server's memory capabilities, though just last night I did run across what seemed like a new configuration variable in conf/neo4j.properties:
# The amount of memory to use for mapping the store files, either in bytes or
# as a percentage of available memory. This will be clipped at the amount of
# free memory observed when the database starts, and automatically be rounded
# down to the nearest whole page. For example, if "500MB" is configured, but
# only 450MB of memory is free when the database starts, then the database will
# map at most 450MB. If "50%" is configured, and the system has a capacity of
# 4GB, then at most 2GB of memory will be mapped, unless the database observes
# that less than 2GB of memory is free when it starts.
#mapped_memory_total_size=50%

JVM memory tuning for eXist

Suppose you had a server with 24G RAM at your disposal, how much memory would you allocate to (Tomcat to run) eXist?
I'm setting up our new webserver, with an Intel Xeon E5649 (2.53GHz) processor, running Ubuntu 12.04 64-bit. eXist is running as a webapp inside Tomcat, and the db is only used for querying 'stable' collections --that is, no updates are being executed to the resources inside eXist.
I've been experimenting with different heap sizes (via -Xms and -Xmx settings when starting the Tomcat process), and so far haven't noticed much difference in response time for queries against eXist. In other words, it doesn't seem to matter much whether the JVM is allocated 4G or 16G. I have also upped the #cachesize and #collectionCache in eXist's WEB-INF/conf.xml file to e.g. 8192M, but this doesn't seem to have much effect. I suppose these settings /do/ have an influence when eXist is running inside Tomcat?
I know each situation is different (and I know there's a Tomcat server involved), but are there some rules of thumb for eXist performance w.r.t. the memory it is allocated? I'd like to get at a sensible memory configuration for a setup with a larger amount of RAM available.
This question was asked and answered on the exist-open mailing list. The answer from wolfgang#exist-db.org was:
Giving more memory to eXist will not necessarily improve response times. "Bad"
queries may consume lots of RAM, but the better your queries are optimized, the
less RAM they need: most of the heavy processing will be done using index
lookups and the optimizer will try to reduce the size of the node sets to be
passed around. Caching memory thus has to be large enough to hold the most
relevant index pages. If this is already the case, increasing the caching space
will not improve performance anymore. On the other hand, a too small cacheSize
of collectionCache will result in a recognizable bottleneck. For example, a
batch upload of resources or creating a backup can take several hours (instead
of e.g. minutes) if #collectionCache is too small.
If most of your queries are optimized to use indexes, 8gb RAM for eXist does
usually give you enough room to handle the occasional high load. Ideally you
could run some load tests to see what the maximum memory use actually is. For
#cacheSize, I rarely have to go beyond 512m. The setting for #collectionCache
depends on the number of collections and documents in the database. If you have
tens or hundreds of thousands of collections, you may have to increase it up to
768m or more. As I said above, you will recognize a sudden breakdown in
performance during uploads or backups if the collectionCache becomes too small.
So to summarize, a reasonable setting for me would be: -Xmx8192m,
#cacheSize="512m", #collectionCache="768m". If you can afford giving 16G main
memory it certainly won’t hurt. Also, if you are using the lucene index or the
new range index, you should consider increasing the #buffer setting in the
corresponding index module configurations in conf.xml as well:
<module id="lucene-index" buffer="256" class="org.exist.indexing.lucene.LuceneIndex" />
<module id="range-index" buffer="256" class="org.exist.indexing.range.RangeIndex"/>

Resources