Difference between memory_get_peak_usage and actual php process' memory usage

Why does the result of PHP's memory_get_peak_usage differ so much from the memory size shown as allocated to the process by the 'top' or 'ps' commands in Linux?
I've set memory_limit to 2 MB in php.ini.
My one-line PHP script,
echo memory_get_peak_usage(true);
says that it is using 786432 bytes (768 KB).
If I ask the system about the current PHP process with
echo shell_exec('ps -p '.getmypid().' -Fl');
it gives me
F S UID PID PPID C PRI NI ADDR SZ WCHAN RSS PSR STIME TTY TIME CMD
5 S www-data 14599 14593 0 80 0 - 51322 pipe_w 6976 2 18:53 ? 00:00:00 php-fpm: pool www
The RSS value is 6976, so I take the memory usage to be 6976 * 4096 = 28573696 bytes, or ~28 MB.
Where do those 28 MB come from, and is there any way to decrease the amount of memory used by the php-fpm process?

Most of that memory is used by the PHP process itself. memory_get_peak_usage() returns only the memory used by your specific script. Ways to reduce the overhead include loading fewer extensions, compiling PHP statically, and so on. But don't forget that php-fpm forks (or should), and that much of the memory that is identical across PHP processes is in fact shared between them (until one of them changes it).

PHP itself may only be set to a 2 MB limit, but it's running WITHIN an Apache child process, and that process will have a much higher memory footprint.
If you were running the script from the command line, you'd get the memory usage of PHP by itself, as it's not wrapped within Apache and is running on its own.

The peak memory usage is for the current script only.
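
A note on the units in the ps output above: the procps ps manual documents the RSS column in kilobytes and the SZ column in pages, so multiplying RSS by the page size overstates it. Here is a minimal sketch of the conversion, assuming a 4096-byte page size (check with getconf PAGESIZE); the constant and method names are made up for illustration:

```ruby
# Assumption: 4 KiB pages, the usual default on x86 Linux.
PAGE_SIZE = 4096

# procps `ps` reports RSS in KiB, so only a unit conversion is needed.
def rss_kib_to_mib(rss_kib)
  rss_kib / 1024.0
end

# procps `ps` reports SZ in pages of the process image (virtual size).
def sz_pages_to_mib(sz_pages)
  sz_pages * PAGE_SIZE / (1024.0 * 1024.0)
end

puts format('RSS: %.1f MiB', rss_kib_to_mib(6976))   # values taken from the
puts format('SZ:  %.1f MiB', sz_pages_to_mib(51322)) # ps output in the question
```

By that reading the worker has about 6.8 MiB resident, which is still well above the 768 KB that memory_get_peak_usage reports and is consistent with the answers above: the difference is the interpreter and its extensions, not the script.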

Related

High RSS and OOM kill despite low value in runtime.MemStats.Sys

I have a process which slowly consumes more RAM until it eventually hits its cgroup limit and is OOM killed, and I'm trying to figure out why.
Oddly, Go's runtime seems to think not much RAM is used, whereas the OS seems to think a lot is used.
Specifically, looking at runtime.MemStats (via the expvar package) I see:
"Alloc":51491072,
"TotalAlloc":143474637424,
"Sys":438053112,
"Lookups":0,
"Mallocs":10230571,
"Frees":10195515,
"HeapAlloc":51491072,
"HeapSys":388464640,
"HeapIdle":333824000,
"HeapInuse":54640640,
"HeapReleased":0,
"HeapObjects":35056,
"StackInuse":14188544,
"StackSys":14188544,
"MSpanInuse":223056,
"MSpanSys":376832,
"MCacheInuse":166656,
"MCacheSys":180224,
"BuckHashSys":2111104,
"GCSys":13234176,
"OtherSys":19497592
But from the OS perspective:
$ ps auxwf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 178 0.0 0.0 3996 3372 pts/0 Ss 17:33 0:00 bash
root 246 0.0 0.0 7636 2828 pts/0 R+ 17:59 0:00 \_ ps auxwf
root 1 166 2.8 11636248 5509288 ? Ssl 17:24 57:15 app server -api-public
So, the OS reports an RSS of 5380 MiB, but the Sys field in MemStats shows only 417 MiB. My understanding is that these fields should be approximately the same.
GC is running, as confirmed by setting GODEBUG=gctrace=1,madvdontneed=1. For example, I see output like:
gc 6882 @2271.137s 0%: 0.037+2.2+0.087 ms clock, 3.5+0.78/37/26+8.4 ms cpu, 71->72->63 MB, 78 MB goal, 96 P
The numbers vary a bit depending on the process, but they are all <100 MB, whereas the OS is reporting >1GB (and growing, until eventual OOM).
madvdontneed=1 was a shot in the dark but seems to make no difference. I wouldn't think the madvise parameters would be relevant, since it doesn't seem there's any need to return memory to the kernel, as the Go runtime doesn't think it's using much memory anyway.
What could explain this discrepancy? Am I not correctly understanding the semantics of these fields? Are there mechanisms that would result in the growth of RSS (and an eventual OOM kill) but not increase MemStats.Sys?
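
To put the two headline figures in the same unit, here is a quick sketch of the arithmetic only. It assumes, as above, that ps reports RSS in KiB and that MemStats.Sys is in bytes; the constant and method names are illustrative:

```ruby
MIB = 1024.0 * 1024.0

SYS_BYTES = 438_053_112 # "Sys" from runtime.MemStats, in bytes
RSS_KIB   = 5_509_288   # RSS column from ps, in KiB

def bytes_to_mib(bytes)
  bytes / MIB
end

def kib_to_mib(kib)
  kib / 1024.0
end

sys_mib = bytes_to_mib(SYS_BYTES)
rss_mib = kib_to_mib(RSS_KIB)
puts format('MemStats.Sys: %.1f MiB', sys_mib)
puts format('RSS:          %.1f MiB', rss_mib)
puts format('Gap:          %.1f MiB', rss_mib - sys_mib)
```

The gap comes out at a little under 5,000 MiB, which is the amount of resident memory that the runtime's own accounting does not explain.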

Diagnosing high CPU usage on Docker for Mac

How do I diagnose the cause of Docker on MacOS, specifically com.docker.hyperkit using 100% of CPU?
Docker stats
Docker stats shows all the running containers have low CPU, memory, net IO and block IO.
iosnoop
iosnoop shows that com.docker.hyperkit performs about 50 writes per second totaling 500KB per second to the file Docker.qcow2. According to What is Docker.qcow2?, Docker.qcow2 is a sparse file that's the persistent storage for all Docker containers.
In my case the file isn't that sparse. The physical size matches the logical size.
dtrace (dtruss)
Running sudo dtruss -p $DOCKER_PID shows a large number of psynch_cvsignal and psynch_cvwait calls.
psynch_cvsignal(0x7F9946002408, 0x4EA701004EA70200, 0x4EA70100) = 257 0
psynch_mutexdrop(0x7F9946002318, 0x5554700, 0x5554700) = 0 0
psynch_mutexwait(0x7F9946002318, 0x5554702, 0x5554600) = 89474819 0
psynch_cvsignal(0x10BF7B470, 0x4C8095004C809600, 0x4C809300) = 257 0
psynch_cvwait(0x10BF7B470, 0x4C8095014C809600, 0x4C809300) = 0 0
psynch_cvwait(0x10BF7B470, 0x4C8096014C809700, 0x4C809600) = -1 Err#316
psynch_cvsignal(0x7F9946002408, 0x4EA702004EA70300, 0x4EA70200) = 257 0
psynch_cvwait(0x7F9946002408, 0x4EA702014EA70300, 0x4EA70200) = 0 0
psynch_cvsignal(0x10BF7B470, 0x4C8097004C809800, 0x4C809600) = 257 0
psynch_cvwait(0x10BF7B470, 0x4C8097014C809800, 0x4C809600) = 0 0
psynch_cvwait(0x10BF7B470, 0x4C8098014C809900, 0x4C809800) = -1 Err#316
Update: top on Docker host
From https://stackoverflow.com/a/58293240/30900:
docker run -it --rm --pid host busybox top
The CPU usage on the Docker embedded host is ~3%. CPU usage on my MacBook was ~100%. So the Docker embedded host isn't causing the CPU usage spike.
Update: running dtrace scripts of most common stack traces
Stack traces from the dtrace scripts in the answer below: https://stackoverflow.com/a/58293035/30900.
These kernel stack traces look innocuous.
AppleIntelLpssGspi`AppleIntelLpssGspi::regRead(unsigned int)+0x1f
AppleIntelLpssGspi`AppleIntelLpssGspi::transferMmioDuplexMulti(void*, void*, unsigned long long, unsigned int)+0x91
AppleIntelLpssSpiController`AppleIntelLpssSpiController::transferDataMmioDuplexMulti(void*, void*, unsigned int, unsigned int)+0xb2
AppleIntelLpssSpiController`AppleIntelLpssSpiController::_transferDataSubr(AppleInfoLpssSpiControllerTransferDataRequest*)+0x5bc
AppleIntelLpssSpiController`AppleIntelLpssSpiController::_transferData(AppleInfoLpssSpiControllerTransferDataRequest*)+0x24f
kernel`IOCommandGate::runAction(int (*)(OSObject*, void*, void*, void*, void*), void*, void*, void*, void*)+0x138
AppleIntelLpssSpiController`AppleIntelLpssSpiDevice::transferData(IOMemoryDescriptor*, void*, unsigned long long, unsigned long long, IOMemoryDescriptor*, void*, unsigned long long, unsigned long long, unsigned int, AppleIntelSPICompletion*)+0x151
AppleHSSPISupport`AppleHSSPIController::transferData(IOMemoryDescriptor*, void*, unsigned long long, unsigned long long, IOMemoryDescriptor*, void*, unsigned long long, unsigned long long, unsigned int, AppleIntelSPICompletion*)+0xcc
AppleHSSPISupport`AppleHSSPIController::doSPITransfer(bool, AppleHSSPITransferRetryReason*)+0x97
AppleHSSPISupport`AppleHSSPIController::InterruptOccurred(IOInterruptEventSource*, int)+0xf8
kernel`IOInterruptEventSource::checkForWork()+0x13c
kernel`IOWorkLoop::runEventSources()+0x1e2
kernel`IOWorkLoop::threadMain()+0x2c
kernel`call_continuation+0x2e
53
kernel`waitq_wakeup64_thread+0xa7
pthread`__psynch_cvsignal+0x495
pthread`_psynch_cvsignal+0x28
kernel`psynch_cvsignal+0x38
kernel`unix_syscall64+0x27d
kernel`hndl_unix_scall64+0x16
60
kernel`hndl_mdep_scall64+0x4
113
kernel`ml_set_interrupts_enabled+0x19
524
kernel`ml_set_interrupts_enabled+0x19
kernel`hndl_mdep_scall64+0x10
5890
kernel`machine_idle+0x2f8
kernel`call_continuation+0x2e
43395
The most common stack traces in user space over 17 seconds clearly implicate com.docker.hyperkit. There were 1365 stack traces in 17 seconds in which com.docker.hyperkit created threads, which averages out to about 80 threads per second.
com.docker.hyperkit`0x000000010cbd20db+0x19f9
com.docker.hyperkit`0x000000010cbdb98c+0x157
com.docker.hyperkit`0x000000010cbf6c2d+0x4bd
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
19
Hypervisor`hv_vmx_vcpu_read_vmcs+0x1
com.docker.hyperkit`0x000000010cbd4c4f+0x2a
com.docker.hyperkit`0x000000010cbd20db+0x174a
com.docker.hyperkit`0x000000010cbdb98c+0x157
com.docker.hyperkit`0x000000010cbf6c2d+0x4bd
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
22
Hypervisor`hv_vmx_vcpu_read_vmcs
com.docker.hyperkit`0x000000010cbdb98c+0x157
com.docker.hyperkit`0x000000010cbf6c2d+0x4bd
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
34
com.docker.hyperkit`0x000000010cbd878d+0x36
com.docker.hyperkit`0x000000010cbd20db+0x42f
com.docker.hyperkit`0x000000010cbdb98c+0x157
com.docker.hyperkit`0x000000010cbf6c2d+0x4bd
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
47
Hypervisor`hv_vcpu_run+0xd
com.docker.hyperkit`0x000000010cbd20db+0x6b6
com.docker.hyperkit`0x000000010cbdb98c+0x157
com.docker.hyperkit`0x000000010cbf6c2d+0x4bd
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
135
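
Output like the above gets long, so it can help to total the sample counts per module. This is a rough sketch that assumes the layout shown here (frame lines followed by a bare count line) and credits each stack to the module named before the backtick in its innermost frame. The method name and the shortened sample are made up for illustration:

```ruby
# Total DTrace aggregation counts per module of the innermost frame.
def samples_by_module(text)
  totals = Hash.new(0)
  frames = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line =~ /\A\d+\z/
      # A bare integer closes the current stack trace.
      top = frames.first
      mod = top ? top.split('`').first : '(no stack)'
      totals[mod] += line.to_i
      frames = []
    else
      frames << line
    end
  end
  totals
end

# Shortened excerpt of the user-space traces above.
SAMPLE_STACKS = <<~'OUT'
  Hypervisor`hv_vmx_vcpu_read_vmcs
  com.docker.hyperkit`0x000000010cbdb98c+0x157
  34
  com.docker.hyperkit`0x000000010cbd878d+0x36
  libsystem_pthread.dylib`thread_start+0xd
  47
  Hypervisor`hv_vcpu_run+0xd
  com.docker.hyperkit`0x000000010cbd20db+0x6b6
  135
OUT

samples_by_module(SAMPLE_STACKS).sort_by { |_, n| -n }.each do |mod, n|
  puts "#{n}\t#{mod}"
end
```

Grouping by the innermost frame is only one choice; grouping by any frame that mentions com.docker.hyperkit would attribute the Hypervisor calls to their caller instead.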
Related issues
GitHub - docker/for-mac: com.docker.hyperkit 100% cpu usage is back again #3499. One comment suggests adding the volume caching described here: https://www.docker.com/blog/user-guided-caching-in-docker-for-mac/. I tried this and got a small ~10% reduction in CPU usage.

I have the same problem. My CPU % went back down to normal after I removed all my volumes.
docker system prune --volumes
I also manually removed some named volumes:
docker volume rm NameOfVolumeHere
That doesn't solve the overall issue of not being able to use volumes with Docker for Mac. Right now I'm just being careful about the number of volumes I use and closing Docker Desktop when it's not in use.

My suspicion is that the issue is IO related. With MacOS volumes, this involves osxfs where there is some performance tuning you can perform. Mainly, if you can accept fewer consistency checks, you can set the volume mode to delegated for faster performance. See the docs for more details: https://docs.docker.com/docker-for-mac/osxfs-caching/. However, if your image contains a large number of small files, performance will suffer, especially if you also have lots of image layers.
You can also try the following command to debug any process issues within the embedded VM that docker uses:
docker run -it --rm --pid host busybox top
(To exit, use <ctrl>-c)
To track down if it's IO, you can also try the following:
$ docker run -it --rm --pid host alpine /bin/sh
$ apk add sysstat
$ pidstat -d 5 12
That will run inside the alpine container running in the VM pid namespace, showing any IO happening from any process, whether or not that process is inside of a container. The stats are every 5 seconds for one minute (12 times) and then it will give you an average table per process. You can then <ctrl>-d to destroy the alpine container.
From the comments and edits, these stats may check out. A 4-core MBP has 8 hardware threads, so full CPU utilization should be 800% if macOS reports it the same way as other Unix-based systems. Inside the VM, top shows a load of over 100% for the one-minute average (though less for the 5- and 15-minute averages), which is roughly what you see for the hyperkit process on the host. The instantaneous usage from top is over 12%, not 3%, since you need to add the system and user percentages. And the IO numbers shown in pidstat align roughly with what you see written to the qcow2 image.
If the docker engine itself is thrashing (e.g. restarting containers, or running lots of healthchecks), then you can debug that by watching the output of:
docker events

EDIT: after a few weeks, my CPU issues have come back, so the solutions below probably aren't worth it.
My CPU was always running crazy high, and it wasn't I/O, as determined using docker stats.
I tried a bunch of things, but usage suddenly dropped to reasonable levels and stayed that way for over a week after I did the following:
Set the right number of CPUs in Preferences | Resources: not the number you have, but HALF that number. Mine was set to more than half, and I feel this was the real problem.
Decrease the number of file shares if possible (Preferences | Resources): /private, /tmp/, /var/folders.
Disable "Use gRPC FUSE for file sharing" (Preferences | Resources).

Changing the volumes to use a delegated configuration worked for me and resulted in a drastic drop in CPU usage.
See the documentation: https://docs.docker.com/docker-for-mac/osxfs-caching/#delegated
Here is how I set it in my docker-compose.yml:
version: "3"
services:
  my_service:
    image: python3.6
    ports:
      - "80:10000"
    volumes:
      - ./code:/www/code:cached
This worked for me on macOS 10.15.5 with Docker Desktop 2.3.0.

This is a small dTrace script I use to find where the kernel is spending its time (it's from Solaris, and dates back to the early days of Solaris 10):
#!/usr/sbin/dtrace -s
profile:::profile-1001hz
/arg0/
{
    @[ stack() ] = count();
}
It simply samples kernel stack traces and counts each one it encounters in the @ aggregation.
Run it as root:
... # ./kernelhotspots.d > /tmp/kernel_hot_spots.txt
Let it run for a decent amount of time while you're having CPU issues, then hit CTRL-C to break the script. It will emit all the kernel stack traces it encountered, the most common last. If you need more (or fewer) stack frames than the default, pass a frame count to stack():
@[ stack( 15 ) ] = count();
That will record stack traces up to 15 frames deep.
The last few stack traces will be where your kernel is spending most of its time. That may or may not be informative.
This script will do the same for user-space stack traces:
#!/usr/sbin/dtrace -s
profile:::profile-1001hz
/arg1/
{
    @[ ustack() ] = count();
}
Run it similarly:
... # ./userspacehotspots.d > /tmp/userspace_hot_spots.txt
ustack() is a bit slower - to emit the actual function names, dTrace has to do a lot more work to get them from the address spaces of the appropriate processes.
Disabling System Integrity Protection might help you get better stack traces.
See DTrace Action Basics for some more details.

I had the same issue with Docker today on Big Sur (I tried pruning images and switching to Apple virtualization; nothing helped). However, stopping Docker Desktop from launching at startup in the preferences and never opening the desktop GUI seems to fix it for me. Docker now runs at only about 10% CPU usage, even after starting a few containers. Once I open the desktop GUI, though, usage slowly rises to over 90% again and keeps hogging the CPU even after the Docker Desktop process is closed. Docker version 20.10.13, build a224086.

The solution I found was to increase the resources given to Docker. I increased the Memory from 2GB to 8GB, the Swap from 1GB to 2GB, and the disk image size to 160GB. Completely solved the problem for me, and it's an easy one for readers to try.

Disabling "Use gRPC FUSE for file sharing" might not be a good idea. I found this feedback from the Docker team in another issue; see below:
So we'll look into that. However,
osxfs will not be supported long term.
We can't maintain two solutions.
Here is the Docker issue thread.

There is an open issue here https://github.com/docker/for-mac/issues/6166
It seems there are a few bugs going on
For some people (myself included), unchecking "Open Docker Dashboard at startup" and manually restarting Docker does the job.
For other people, increasing resources like CPU and memory works.

Oracle Database CPU Usage on AIX

I want to find the CPU process usage for all Oracle processes on an AIX box.
On Solaris I can do the following:
prstat -n 400 -c -s cpu -p 9013 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
9013 oracle 3463M 2928M sleep 53 0 0:00:35 0.9% oracle/2
Total: 1 processes, 2 lwps, load averages: 2.25, 2.32, 2.40
This basically reports the CPU usage for a given process ID (in this case 9013). Given a list of all the Oracle PIDs, I can use this command to get the CPU usage for each one, sum them up and, hey presto, I have my Oracle database CPU usage.
How can I get the same with AIX?
Thanks

You can try nmon or topas, which will show the current %CPU. You might also want to look into using WLM to create a class for all the Oracle processes, then use wlmstat to see the CPU usage for that class. That would save you the trouble of adding them up manually.

How to get CPU utilisation, RAM utilisation in MAC from commandline

I know that we can see CPU utilisation using Activity Monitor, but I want to get this information from my remote system through a script. The top command is not helpful for me. Please suggest any other way to get it.

What is the objection to top in logging mode?
top -l 1 | grep -E "^CPU|^Phys"
CPU usage: 3.27% user, 14.75% sys, 81.96% idle
PhysMem: 5807M used (1458M wired), 10G unused.
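Since the question asks for something scriptable, here is a small sketch that pulls the numbers out of that "CPU usage" line. It assumes the line has exactly the format shown above; the method name is made up, and on a real machine the line would come from running top -l 1 rather than from the sample constant:

```ruby
# Extract the percentages from the "CPU usage:" line printed by `top -l 1`.
# Returns nil when the line does not have the expected format.
def parse_cpu_usage(line)
  m = line.match(/CPU usage:\s*([\d.]+)% user,\s*([\d.]+)% sys,\s*([\d.]+)% idle/)
  return nil unless m
  { user: m[1].to_f, sys: m[2].to_f, idle: m[3].to_f }
end

# Sample line copied from the output above.
SAMPLE_TOP_LINE = 'CPU usage: 3.27% user, 14.75% sys, 81.96% idle'

cpu = parse_cpu_usage(SAMPLE_TOP_LINE)
puts format('busy: %.2f%%', cpu[:user] + cpu[:sys])
```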
Or use sysctl
sysctl vm.loadavg
vm.loadavg: { 1.31 1.85 2.00 }
Or use vm_stat
vm_stat
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 3569.
Pages active: 832177.
Pages inactive: 283212.
Pages speculative: 2699727.
Pages throttled: 0.
Pages wired down: 372883.
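vm_stat reports pages rather than bytes, so a script has to multiply by the page size from the header line. A minimal sketch, assuming the output format shown above (the method name and the shortened sample are illustrative):

```ruby
# Turn vm_stat output into a hash of byte counts, using the page size
# announced in its header line.
def parse_vm_stat(text)
  page_size = text[/page size of (\d+) bytes/, 1].to_i
  stats = {}
  text.each_line do |line|
    # Lines look like "Pages free: 3569." with a trailing period.
    if (m = line.match(/\A(Pages [^:]+):\s+(\d+)\.?\s*\z/))
      stats[m[1]] = m[2].to_i * page_size
    end
  end
  stats
end

# Shortened copy of the output above.
SAMPLE_VM_STAT = <<~OUT
  Mach Virtual Memory Statistics: (page size of 4096 bytes)
  Pages free: 3569.
  Pages active: 832177.
  Pages wired down: 372883.
OUT

parse_vm_stat(SAMPLE_VM_STAT).each do |name, bytes|
  puts format('%-18s %10.1f MiB', name, bytes / (1024.0 * 1024.0))
end
```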

Errno::ENOMEM: Cannot allocate memory - cat

I have a job running in production which processes XML files.
There are around 4k XML files, 8 to 9 GB in total.
After processing we get CSV files as output. I have a cat command which merges all the CSV files into a single file, and I'm getting:
Errno::ENOMEM: Cannot allocate memory
on the cat (backtick) command.
Below are a few details:
System Memory - 4 GB
Swap - 2 GB
Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
Here, there is a block of code which processes the 4,000 XML files and saves the output as CSV, one per XML file (sorry, I'm not supposed to share it because of company policy).
Below is the code which merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
  `cat #{file} >> #{final_output_file}`
}
I've taken memory consumption snapshots during processing. It consumes almost all of the memory, but it doesn't fail.
It always fails on the cat command.
I guess that the backtick tries to fork a new process, which can't get enough memory, so it fails.
Please let me know your opinion and any alternative to this.

So it seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory that is left.
If you don't mind losing some speed, you can merge the files in Ruby, with small buffers.
This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
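buffer_size = 4096
# The block form of File.open closes each file automatically.
File.open(final_output_file, 'w') do |output_file|
  Dir["#{processing_directory}/*.csv"].sort_by { |file| [file.count("/"), file] }.each do |file|
    File.open(file) do |f|
      # Copy in fixed-size chunks so memory use stays bounded.
      while (buffer = f.read(buffer_size))
        output_file.write(buffer)
      end
    end
  end
end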
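A variation on the same idea, as a sketch rather than a tested drop-in: Ruby's IO.copy_stream does the chunked copying itself, so the loop can be dropped. The merge_files name is made up for illustration:

```ruby
# Concatenate the files in `paths` into `dest` without spawning a shell.
# IO.copy_stream streams the data, so no file is read into memory whole.
def merge_files(paths, dest)
  File.open(dest, 'wb') do |out|
    paths.each do |path|
      File.open(path, 'rb') { |src| IO.copy_stream(src, out) }
    end
  end
end
```

With the variables from the question it would be called as merge_files(Dir["#{processing_directory}/*.csv"].sort_by { |file| [file.count("/"), file] }, final_output_file). Like the loop version, it truncates the output file first instead of appending to it.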

You are probably out of physical memory, so double-check that and verify your swap (free -m). If you don't have any swap space, create some.
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a.
They can be changed with ulimit (see: help ulimit), e.g.
ulimit -Sn unlimited && ulimit -Sl unlimited
To make these limits persistent, create a limits file with the following shell command:
sudo tee /etc/security/limits.d/01-${USER}.conf <<EOF
${USER} soft core unlimited
${USER} soft fsize unlimited
${USER} soft nofile 4096
${USER} soft nproc 30654
EOF
Or use /etc/sysctl.conf to change the limit globally (man sysctl.conf), e.g.
kern.maxprocperuid=1000
kern.maxproc=2000
kern.maxfilesperproc=20000
kern.maxfiles=50000

I had the same problem, but with sendmail (via the mail gem) instead of cat.
The fix I found was to install the posix-spawn gem, e.g.
gem install posix-spawn
and here is the example:
a = (1..500_000_000).to_a
require 'posix/spawn'
POSIX::Spawn::spawn('ls')
This time creating child process should succeed.
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.
