Greenplum cannot use all the memory of the server - greenplum

I am new to Greenplum. I have a single server with Greenplum installed (1 master instance, 6 segment instances), and we have imported a huge amount of data (about 10 TB). After running it for about a month, memory utilization is low (15 GB of 128 GB), but the CPU is at almost 100% when we run some calculations on it.
It sometimes reports an OOM error on a segment.
OS version: CentOS 7.2, Server Type: VM
Here are the OS settings:
kernel.shmmax = 107374182400
kernel.shmall = 26214400
kernel.shmmin = 4096
And the Greenplum setting:
gp_vmem_protect_limit=11900
Any help is appreciated

kernel.shmall should be less than 50% of RAM. Note that shmall is counted in pages, so 26214400 pages x 4 KB is about 100 GB, well over half of your 128 GB.
You have a single VM (128 GB) with the gpdb master process and 6 primary segment processes. Am I right? Do you have mirror segment processes? How many CPU cores does your VM have?
gp_vmem_protect_limit = 11900 MB, roughly 12 GB. This means you have 12 GB x 7 (1 master, 6 primary segments) = 84 GB (see the rough calculation below).
A single-node VM handling 10 TB of data? Your CPU is probably waiting for I/O all the time. This is not right.
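For reference, here is a rough sketch of the sizing math, following the formula from the Greenplum documentation (please verify it against your GPDB version). The RAM, swap, and segment-count values are taken from the question, and the 50%-of-RAM shmall rule of thumb assumes 4 KB pages:

# Rough sizing sketch; values come from the question (128 GB RAM, 6 primary
# segments) and swap is assumed to be 0 -- adjust to your host.
RAM_GB = 128
SWAP_GB = 0
PRIMARY_SEGMENTS = 6

# gp_vmem: memory usable by all Greenplum processes on the host
gp_vmem_gb = ((SWAP_GB + RAM_GB) - (7.5 + 0.05 * RAM_GB)) / 1.7

# per-segment protect limit, in MB (the unit gp_vmem_protect_limit expects)
gp_vmem_protect_limit_mb = gp_vmem_gb / PRIMARY_SEGMENTS * 1024

# kernel.shmall is in 4 KB pages and should stay at or below half of RAM
PAGE_SIZE = 4096
shmall_pages = RAM_GB * 1024**3 // 2 // PAGE_SIZE
shmmax_bytes = shmall_pages * PAGE_SIZE

print(f"gp_vmem               ~ {gp_vmem_gb:.1f} GB")                            # ~67 GB
print(f"gp_vmem_protect_limit ~ {gp_vmem_protect_limit_mb:.0f} MB per segment")  # ~11450 MB
print(f"kernel.shmall         = {shmall_pages} pages")                           # 16777216
print(f"kernel.shmmax         = {shmmax_bytes} bytes")                           # 68719476736

With the numbers from the question this lands close to the configured 11900 MB, so the protect limit itself does not look far off.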

Related

How to get better performance in a ProxmoxVE + CEPH cluster

We have been running ProxmoxVE since 5.0 (now on 6.4-15) and we have noticed a degradation in performance whenever there is heavy reading/writing.
We have 9 nodes, 7 with CEPH and 56 OSDs (8 on each node). The OSDs are hard drives (HDD), WD Gold or better (4-12 TB). Nodes have 64/128 GB RAM and dual Xeon CPU mainboards (various models).
We have already tried simple tests like "ceph tell osd.* bench", getting a stable 110 MB/sec transfer to each of them with about a 10 MB/sec spread during normal operations. Apply/Commit latency is normally below 55 ms, with a couple of OSDs reaching 100 ms and one-third below 20 ms.
The front network and back network are both 1 Gbps (separated in VLANs); we are trying to move to 10 Gbps, but we ran into problems we are still trying to solve (unstable OSD disconnections).
The pool is defined as "replicated" with 3 copies (2 needed to keep running). The total amount of disk space is now 305 TB (72% used); reweight is in use because some OSDs were getting much more data than others.
Virtual machines run on the same 9 nodes, most are not CPU intensive:
Avg. VM CPU Usage < 6%
Avg. Node CPU Usage < 4.5%
Peak VM CPU Usage 40%
Peak Node CPU Usage 30%
But I/O Wait is a different story:
Avg. Node IO Delay 11
Max. Node IO Delay 38
Disk writing load is around 4 Mbytes/sec average, with peaks up to 20 Mbytes/sec.
Anyone with experience in getting better Proxmox+CEPH performance?
Thank you all in advance for taking the time to read,
Ruben.
Here are some Ceph pointers you could follow...
Get some good NVMe drives (one or two per server; with 8 HDDs per server, one should be enough) and put the DB/WAL on them (make sure they have power-loss protection).
The "ceph tell osd.* bench" test is not that relevant for the real world; I suggest trying some fio tests, see here.
Set osd_memory_target to at least 8 GB of RAM per OSD.
To save some writes on your HDDs (data is not replicated X times), create your RBD pool as EC (erasure coded), but please do some research on that because there are tradeoffs: recovery takes some extra CPU (see the rough capacity sketch after this answer).
All in all, hyper-converged clusters are good for training, small projects, and medium projects without such a big workload on them... Keep in mind that planning is gold.
Just my 2 cents,
B.
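To make the erasure-coding point concrete, here is a back-of-the-envelope sketch of usable capacity under 3x replication versus an EC pool; the 4+2 profile is only an assumed example, not a recommendation for this cluster:

# Raw capacity from the question; the EC profile (k=4, m=2) is an assumed example.
raw_tb = 305

# Replicated pool with size=3: every object is stored 3 times
usable_replicated_tb = raw_tb / 3

# Erasure-coded pool: overhead is (k + m) / k instead of 3x
k, m = 4, 2
usable_ec_tb = raw_tb * k / (k + m)

print(f"usable with 3x replication : {usable_replicated_tb:.0f} TB")  # ~102 TB
print(f"usable with EC {k}+{m}          : {usable_ec_tb:.0f} TB")     # ~203 TB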

How is the CPU config for haproxy handled within Docker?

I'm wondering about haproxy performance from within a container. To make things simple: if I have a VM running haproxy with this CPU config, I know what to expect:
nbproc 1
nbthread 8
cpu-map auto:1/1-8 0-7
If I want to port the (whole) config to Docker for testing purposes, without any fancy Swarm magic or setup, just plain Docker, so that I can understand how things map, I'd imagine that the CPU config gets simpler and that the haproxy instance is meant to scale. I guess I have two questions:
Would you even bother configuring cpu from within an haproxy docker container or would you scale the container from behind a service? Maybe you need both.
Can a single container utilise the above config as though it were running on the system as a daemon? Would docker / containerd even care about this config?
I know having 4 containers each with their own config with the cpu evenly mapped like so wouldn't scale or make any sense:
nbproc 1
nbthread 2
cpu-map auto:1/1-2 0-1

nbproc 1
nbthread 2
cpu-map auto:1/3-4 2-3

nbproc 1
nbthread 2
cpu-map auto:1/5-6 4-5

nbproc 1
nbthread 2
cpu-map auto:1/7-8 6-7
But it's this sort of saturation that I'm wondering about. Just how does haproxy / docker handle this sort of cpu nuance?
I've confirmed that there is little to no perceivable impact on service when running haproxy under containerd versus under systemd, using the image provided by haproxy. Running a single container with -d, --network host, and no limits on CPU or memory, at worst I've seen a 2-3% impact on external web latency with live traffic peaking at about 50-60 MB/sec, which itself depends on throughput and the type of requests. On an 8-core VM with 4 GB of memory (the host CPU is a Xeon Gold 6130) and a gigabit interface, memory utilisation is almost identical. CPU performance also remains stable, with a potential 3-5% increase in utilisation. These tests are private and unpublished.
As far as CPU configuration goes:
nbproc 1
nbthread 8
cpu-map auto:1/1-8 0-7
master-worker
This config maps 1:1 between containerd and systemd and yields the results already mentioned. The process and threads start up under containerd and function as you would expect. At peak this takes about 80-90% out of the total CPU (800%), i.e. less than one fully loaded core, so in theory this container could be scaled a further 8 times with this configuration, or 5 or 6 times to leave some headroom.
Also note that any fluctuations in these performance figures are likely due to my environment; these tests were taken from a real environment across multiple sites, not a test bed where I controlled every aspect. Depending on your host CPU and load, your results will vary wildly.
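For what it's worth, here is the back-of-the-envelope arithmetic behind that scaling estimate, using the figures from this answer; the 25% headroom value is just an assumption:

# Figures from the answer: 8-core VM, one container peaking at roughly 90%
# of a single core (about 90 out of the 800% total). Headroom is assumed.
host_cores = 8
per_container_peak_cores = 0.9
headroom = 0.25

theoretical_total = host_cores / per_container_peak_cores                 # ~8.9 containers
with_headroom_total = host_cores * (1 - headroom) / per_container_peak_cores  # ~6.7 containers

print(f"additional replicas, theoretical   : ~{theoretical_total - 1:.0f}")    # ~8
print(f"additional replicas, with headroom : ~{with_headroom_total - 1:.0f}")  # ~6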

How to accelerate a specific process on Ubuntu

I'm executing pleskbackup on an Ubuntu 18.04 LTS server to create a full backup.
This task has already been running for over half a day now, while the server isn't working anywhere near capacity.
CPU: 1.5% (of 800%, 8 cores)
Memory: 3.9% (626 MB of 15.6 GB)
Is there any way to give this specific task more resources to speed it up?
I've set the priority already to the highest via htop.
– Thanks in advance.

Need help understanding my ElasticSearch Cluster Health

When querying my cluster, I noticed these stats for one of my nodes. I am new to Elastic and would like the community's help in understanding what they mean and whether I need to take any corrective measures.
Does the heap used look on the higher side, and if yes, how would I rectify it? Any comments on the system memory used would also be helpful - it feels like it's on the really high side as well.
These are the JVM-level stats:
JVM
Version OpenJDK 64-Bit Server VM (1.8.0_171)
Process ID 13735
Heap Used % 64%
Heap Used/Max 22 GB / 34.2 GB
GC Collections (Old/Young) 1 / 46,372
Threads (Peak/Max) 163 / 147
These are the OS-level stats:
Operating System
System Memory Used % 90%
System Memory Used 59.4 GB / 65.8 GB
Allocated Processors 16
Available Processors 16
OS Name Linux
OS Architecture amd64
As you state that you are new to Elasticsearch, I suggest you go through the cluster API as well as the cat API; you can find documentation at cluster API and cat API.
This will help you understand things in more depth.
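As a starting point, here is a minimal sketch of calling those two APIs; it assumes the node is reachable at http://localhost:9200 with no authentication, so adjust the URL and credentials to your cluster:

# Minimal example of the cluster health and cat nodes APIs mentioned above.
import requests

BASE = "http://localhost:9200"

# Overall cluster health: status, node count, unassigned shards, ...
health = requests.get(f"{BASE}/_cluster/health").json()
print(health["status"], health["number_of_nodes"], health["unassigned_shards"])

# Per-node heap and RAM usage, similar to the numbers quoted in the question
cols = "name,heap.percent,heap.current,heap.max,ram.percent,cpu"
print(requests.get(f"{BASE}/_cat/nodes", params={"v": "true", "h": cols}).text)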

Memory management by OS

I am trying to understand memory management by the OS.
What I understand so far is that on a 32-bit system, each process is allocated a space of 4 GB [2 GB user + 2 GB kernel] in the virtual address space.
What confuses me is: is this 4 GB space unique for every process? If I have, say, 3 processes p1, p2, p3 running, would I need 12 GB of space on the hard disk?
Also, if I have, say, 2 GB of RAM on a 32-bit system, how will it manage to handle a process which needs 4 GB? [Through the paging file?]
[2 GB user + 2 GB kernel]
That is a constraint by the OS. On an x86 32-bit system without PAE enabled, the virtual address space is 4 GiB (note that GB usually denotes 1000 MB while GiB stands for 1024 MiB).
What confuses me is: is this 4 GB space unique for every process?
Yes, every process has its own 4 GiB virtual address space.
If I have, say, 3 processes p1, p2, p3 running, would I need 12 GB of space on the hard disk?
No. With three processes, they can occupy a maximum of 12 GiB of storage. Whether that's primary or secondary storage is left to the kernel (primary is preferred, of course). So, you'd need your primary memory size + some secondary storage space to be at least 12 GiB to contain all three processes if all those processes really occupied the full range of 4 GiB, which is pretty unlikely to happen.
Also, if I have, say, 2 GB of RAM on a 32-bit system, how will it manage to handle a process which needs 4 GB? [Through the paging file?]
Yes, in a way. You mean the right thing, but the "paging file" is just an implementation detail. It is used by Windows, whereas Linux, for example, uses a separate swap partition instead. So, to be technically correct, "secondary storage (an HDD, for example) is needed to store the remaining 2 GiB of the process" would be right.
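To see the difference between reserved virtual address space and memory that actually has to be backed by RAM (or swap), here is a small Linux-only sketch; the 1 GiB size is just illustrative:

# Linux-only sketch: reserving virtual address space costs (almost) nothing
# until the pages are actually touched.
import mmap

def vm_stats():
    # VmSize = total virtual address space, VmRSS = pages resident in RAM (kB)
    stats = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize", "VmRSS")):
                stats[line.split(":")[0]] = line.split()[1] + " kB"
    return stats

print("before        :", vm_stats())

buf = mmap.mmap(-1, 1 << 30)           # reserve 1 GiB of anonymous memory
print("after reserve :", vm_stats())   # VmSize jumps by ~1 GiB, VmRSS barely moves

for off in range(0, len(buf), 4096):   # touch one byte per 4 KiB page
    buf[off] = 1
print("after touch   :", vm_stats())   # VmRSS grows once pages are backed by RAM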
