Greenplum gp_vmem_protect_limit configuration - greenplum

We are doing a PoC by installing Greenplum on AWS environment. We have setup each of our segment servers as d2.8xlarge instance types which has 240 GB of RAM with no SWAP.
I am now trying to setup the gp_vmem_protect_limit using the formula mentioned in gpdb documents and the value is coming to 25600MB.
But in one of the Zendesk Notes it says that gp_vmem_protect_limit will be breached when "sessions executing on this segment are attempting together to use more than configured limit. " Does the segment in this text mean Segment Host or number of primary segments?
Also, with the Eager Free option being set I see that the memory utilization is very poor when running the TPC-DS benchmark with 5 concurrent users. I would like to improve the memory utilization of the environment and below are the other memory configurations
gpconfig -c gp_vmem_protect_limit -v 25600MB
gpconfig -c max_statement_mem -v 16384MB
gpconfig -c statement_mem -v 2400MB
Any suggestions?
Thanks,
Jayadeep

There is a calculator for it!
http://greenplum.org/calc/
You should also add a swap file or disk. It is pretty easy to do in Amazon too. I would add at least a 4GB swap file to each host when you have 240GB of RAM.

Related

memory usage grows until VM crashes while running Wildfly 9 with Java 8

We are having an issue with virtual servers (VMs) running out of native memory. These VMs are running:
Linux 7.2(Maipo)
Wildfly 9.0.1
Java 1.8.0._151 running with (different JVMs have different heap sizes. They range from 0.5G to 2G)
The JVM args are:
-XX:+UseG1GC
-XX:SurvivorRatio=1
-XX:NewRatio=2
-XX:MaxTenuringThreshold=15
-XX:-UseAdaptiveSizePolicy
-XX:G1HeapRegionSize=16m
-XX:MaxMetaspaceSize=256m
-XX:CompressedClassSpaceSize=64m
-javaagent:/<path to new relic.jar>
After about a month, sometimes longer, the VMs start to use all of their swap space and then eventually the OOM-Killer notices that java is using too much memory and kills one of our JVMs.
The amount of memory being used by the java process is larger than heap + metaSpace + compressed as revealed by using -XX:NativeMemoryTracking=detail
Are there tools that could tell me what is in this native memory(like a heap dump but not for the heap)?
Are there any tools that can map java heap usage to native memory usage (outside the heap) that are not jemalloc? I have used jemalloc to try to achieve this but the graph that is being drawn contains only hex values and not human readable class names so I cant really get anything out of it. Maybe I'm doing something wrong or perhaps I need another tool.
Any suggestions would be greatly appreciated.
You can use jcmd.
Start application with -XX:NativeMemoryTracking=summary or -
XX:NativeMemoryTracking=detail
Use jcmd to monitor the NMT (native memory tracker)
jcmd "pid" VM.native_memory baseline //take the baseline
jcmd "pid" VM.native_memory detail.diff // use based on your need to analyze more on change in native memory from its baseline

Ruby OOM in container

Recently we've encountered a problem with Ruby inside a Docker container. Despite quite low load, application tends to consume huge amounts of memory and after some time under mentioned load it OOMs.
After some investigation we narrowed down problem to the one-liner
docker run -ti -m 209715200 ruby:2.1 ruby -e 'while true do array = []; 3000000.times do array << "hey" end; puts array.length; end;'
On some machines it OOMed (was killed by oom-killer because of exceeding the limit) soon after the start, but on some it worked, though slowly, without OOMs. It seems like (only seems, maybe it's not the case) in some configurations ruby is able to deduce cgroup's limits and adjust it's GC.
Configurations tested:
CentOS 7, Docker 1.9 — OOM
CentOS 7, Docker 1.12 — OOM
Ubuntu 14.10, Docker 1.9 — OOM
Ubuntu 14.10, Docker 1.12 — OOM
MacOS X Docker 1.12 — No OOM
Fedora 23 Docker 1.12 — No OOM
If you look at the memory consumption of ruby process, in all cases it behaved similar to this picture, staying on the same level slightly below the limit, or crashing into the limit and being killed.
Memory consumption plot
We want to avoid OOMs at all cost, because it reduces resiliency and poses a risk of loosing data. Memory really needed for the application is way below the limit.
Do you have any suggestions as of what to do with ruby to avoid OOMing, possibly by loosing in performance?
We can't figure out what're the significant differences between tested installations.
Edit: Changing the code or increasing memory limit are not available. First one because we run fluentd with community plugins which we have no control of, second one because it won't guarantee that we won't face this issue again in the future.
You can try to tweak rubies garbage collection via environment variables (depending on your ruby version):
RUBY_GC_MALLOC_LIMIT=4000100
RUBY_GC_MALLOC_LIMIT_MAX=16000100
RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=1.1
Or call garbage collection manualy via GC.start
For your example, try
docker run -ti -m 209715200 ruby:2.1 ruby -e 'while true do array = []; 3000000.times do array << "hey" end; puts array.length; array = nil; end;'
to help the garbage collector.
Edit:
I don't have a comparable environment to yours. On my machine (14.04.5 LTS, docker 1.12.3, RAM 4GB, Intel(R) Core(TM) i5-3337U CPU # 1.80GHz) the following looks quite promising.
docker run -ti -m 500MB -e "RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=1" \
-e "RUBY_GC_MALLOC_LIMIT=5242880" \
-e "RUBY_GC_MALLOC_LIMIT_MAX=16000100" \
-e "RUBY_GC_HEAP_INIT_SLOTS=500000" \
ruby:2.1 ruby -e 'while true do array = []; 3000000.times do array << "hey" end; puts array.length; puts `ps -o rss -p #{Process::pid}`.chomp.split("\n").last.strip.to_i / 1024.0 / 1024 ; puts GC.stat; end;'
But every ruby app needs a different setup for fine tuning and if you experience memory leaks, your lost.
I don't think this is a docker issue. You're overusing the resources of the container and Ruby tends to not behave well once you hit memory thresholds. It can GC, but if another process tries to take some memory or Ruby attempts to allocate again while you are maxed out then the kernel will (usually) kill the process with the most memory. If you're worried about memory usage on a server, add some threshold alerts at 80% RAM and allocate the appropriately sized resources for the job. When you start hitting thresholds, allocate more RAM or look at the particular job parameters/allocations to see if it needs to be redesigned to have a lower footprint.
Another potential option if you really want to have a nice fixed memory band to GC against is to use JRuby and set the JVM max memory to leave a little wiggle room on the container memory. The JVM will manage OOM within its own context better as it isn't sharing those resources with other processes nor letting the kernel think the server is dying.
I had a similar issue with a few java based Docker containers that were running on a single Docker host. The problem was each container saw the total available memory of the host machine and assumed it could use all of that memory for itself. It didn't run GC very often and I ended up getting out of memory exceptions. I ended up manually limiting the amount of memory each container could use and I no longer got OOMs. Within the contianer I also limited the memory of the JVM.
Not sure if this is the same issue you're seeing but it could be related.
https://docs.docker.com/engine/reference/run/#/runtime-constraints-on-resources

Cannot increase work_mem above 1GB using PostgreSQL 9.3 on Windows Server

I would like to tweak the postgres config for use on a Windows server. Here is my current posgresql.conf file: http://pastebin.com/KpSi2zSd
I would like to increase work_mem and maintenance_work_mem, but if I raise the values above 1GB I get this error when starting the service:
Nothing is added to the log files (at least not in data\pg_log). How can I figure out what is causing the issue (increase logging)? Could the have anything to do with issues management between windows and postgres?
Here are my server specs:
Windows Server 2012 R2 Datacenter (64 bit)
Intel CPU E5-2670 v2 # 2.50 GHz
512 GB RAM
PostgreSQL 9.3
Under Windows the value for work_mem is limited to 2GB (even on a 64bit system) - there is no workaround as far as I know.
I don't know why you couldn't set it to 1GB though. Maybe the sum of work_mem and maintenance_work_mem has another limit I am not aware of.
Setting work_mem that high by default is usually not a good idea. With 512GB RAM and just 10 users this might work, but keep in mind that the amount of work_mem is requested by a statement for every sort, group or hash operation in a single query. So you could have a statement requesting this amount of memory 15 or 20 times.
You don't need to change this in postgresql.conf - this can be changed dynamically if you know that the following query will benefit from a large work_mem, by running:
set session work_mem='2097151';
If you use a higher number, you'll get an error message telling you the limit:
ERROR: 2097152 is outside the valid range for parameter "work_mem" (64 .. 2097151)
Even if Postgres isn't using all the memory, it still benefits from it. Postgres (unlike e.g. Oracle) relies heavily on the filesystem cache rather than doing all the caching itself. Values for shared_buffers beyond roughly 8GB rarely show any benefit.
What you do need to tell Postgres is how much memory the operating system usually uses for caching, by setting effective_cache_size to the appropriate value. Postgres doesn't use that for caching, but it influences the planner's choice to e.g. prefer an index scan over a seq scan if the index is likely to be in the file system cache.
You can see the current size of the file system cache in the Windows task manager (or e.g. ProcessExplorer)
as described above, in windows it is more beneficial to rely on the OS cache.
If you use RAMMAP from sysinternals (Microsoft) you can see exactly what is being used by postgres in the OS cache, and hence how much is actually cached to it.

JMeter issues when running large number of threads

I'm testing using Apache's Jmeter, I'm simply accessing one page of my companies website and turning up the number of users until it reaches a threshold, the problem is that when I get to around 3000 threads JMeter doesn't run all of them. Looking at the Aggregate Graph
it only runs about 2,536 (this number varies but is always around here) of them.
The partial run comes with the following exception in the logs:
01:16 ERROR - jmeter.JMeter: Uncaught exception:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at org.apache.jmeter.threads.ThreadGroup.start(ThreadGroup.java:293)
at org.apache.jmeter.engine.StandardJMeterEngine.startThreadGroup(StandardJMeterEngine.java:476)
at org.apache.jmeter.engine.StandardJMeterEngine.run(StandardJMeterEngine.java:395)
at java.lang.Thread.run(Unknown Source)
This behavior is consistent. In addition one of the times JMeter crashed in the middle outputting a file that said:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32756 bytes for ChunkPool::allocate
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (allocation.cpp:211), pid=10748, tid=11652
#
# JRE version: 6.0_31-b05
# Java VM: Java HotSpot(TM) Client VM (20.6-b01 mixed mode, sharing windows-x86 )
Any ideas?
I tried changing the heap size in jmeter.bat, but that didn't seem to help at all.
JVM is simply not capable of running so many threads. And even if it is, JMeter will consume a lot of CPU resources to purely switch contexts. In other words, above some point you are not benchmarking your web application but the client computer, hosting JMeter.
You have few choices:
experiment with JVM options, e.g. decrease default -Xss512K to something smaller
run JMeter in a cluster
use tools taking radically different approach like Gatling
I had a similar issue and increased the heap size in jmeter.bat to 1024M and that fixed the issue.
set HEAP=-Xms1024m -Xmx1024m
For the JVM, if you read hprof it gives you some solutions among which are:
switch to a 64 bits jvm ( > 6_u25)
with this you will be able to allocate more Heap (-Xmx) , ensure you have this RAM
reduce Xss with:
-Xss256k
Then for JMeter, follow best-practices:
http://jmeter.apache.org/usermanual/best-practices.html
http://www.ubik-ingenierie.com/blog/jmeter_performance_tuning_tips/
Finally ensure you use last JMeter version.
Use linux OS preferably
Tune the TCP stack, limits
Success will depend on your machine power (cpu and memory) and your test plan.
If this is not enough (for 3000 threads it should be OK), you may need to use distributed testing
Increasing the heap size in jmeter.bat works fine
set HEAP=-Xms1024m -Xmx1024m
OR
you can do something like below if you are using jmeter.sh:
JVM_ARGS="-Xms512m -Xmx1024m" jmeter.sh etc.
I ran into this same problem and the only solution that helped me is: https://stackoverflow.com/a/26190804/5796780
proper 100k threads on linux:
ulimit -s 256
ulimit -i 120000
echo 120000 > /proc/sys/kernel/threads-max
echo 600000 > /proc/sys/vm/max_map_count
echo 200000 > /proc/sys/kernel/pid_max
If you don't have root access:
echo 200000 | sudo dd of=/proc/sys/kernel/pid_max
After increasing Xms et Xmx heap size, I had to make my Java run in 64 bits mode. In jmeter.bat :
set JM_LAUNCH=java.exe -d64
Obviously, you need to run a 64 bits OS and have installed Java 64 bits (see https://www.java.com/en/download/manual.jsp)

redis bgsave failed because fork Cannot allocate memory

all:
here is my server memory info with 'free -m'
total used free shared buffers cached
Mem: 64433 49259 15174 0 3 31
-/+ buffers/cache: 49224 15209
Swap: 8197 184 8012
my redis-server has used 46G memory, there is almost 15G memory left free
As my knowledge,fork is copy on write, it should not failed when there has 15G free memory,which is enough to malloc necessary kernel structures .
besides, when redis-server used 42G memory, bgsave is ok and fork is ok too.
Is there any vm parameter I can tune to make fork return success ?
More specifically, from the Redis FAQ
Redis background saving schema relies on the copy-on-write semantic of fork in modern operating systems: Redis forks (creates a child process) that is an exact copy of the parent. The child process dumps the DB on disk and finally exits. In theory the child should use as much memory as the parent being a copy, but actually thanks to the copy-on-write semantic implemented by most modern operating systems the parent and child process will share the common memory pages. A page will be duplicated only when it changes in the child or in the parent. Since in theory all the pages may change while the child process is saving, Linux can't tell in advance how much memory the child will take, so if the overcommit_memory setting is set to zero fork will fail unless there is as much free RAM as required to really duplicate all the parent memory pages, with the result that if you have a Redis dataset of 3 GB and just 2 GB of free memory it will fail.
Setting overcommit_memory to 1 says Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Redis.
Redis doesn't need as much memory as the OS thinks it does to write to disk, so may pre-emptively fail the fork.
Modify /etc/sysctl.conf and add:
vm.overcommit_memory=1
Then restart sysctl with:
On FreeBSD:
sudo /etc/rc.d/sysctl reload
On Linux:
sudo sysctl -p /etc/sysctl.conf
From proc(5) man pages:
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
In mode 0, calls of mmap(2) with MAP_NORESERVE set are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed". Under Linux 2.4
any non-zero value implies mode 1. In mode 2 (available since Linux 2.6), the total virtual address space on the system is limited to (SS + RAM*(r/100)), where SS is the size
of the swap space, and RAM is the size of the physical memory, and r is the contents of the file /proc/sys/vm/overcommit_ratio.
Redis's fork-based snapshotting method can effectively double physical memory usage and easily OOM in cases like yours. Reliance on linux virtual memory for doing snapshotting is problematic, because Linux has no visibility into Redis data structures.
Recently a new redis-compatible project Dragonfly has been released. Among other things, it solves the OOM problem entirely. (disclosure - I am the author of this project).

Resources