Why are OSX core files so large? - macos

I have a simple program:
#include <iostream>
int main() {
std::cout << "Starting" << std::endl;
while (1) {
}
return 0;
}
I compile and run it:
clang -o test test.cpp
bash$ ./test
Starting
In another terminal, I examine it and kill it:
bash$ top
Processes: 284 total, 3 running, 9 stuck, 272 sleeping, 1505 threads 14:45:49
Load Avg: 1.86, 1.81, 2.06 CPU usage: 13.57% user, 0.96% sys, 85.45% idle
SharedLibs: 21M resident, 12M data, 0B linkedit.
MemRegions: 87372 total, 4974M resident, 86M private, 1203M shared.
PhysMem: 15G used (1887M wired), 1450M unused.
VM: 729G vsize, 1064M framework vsize, 3299196(0) swapins, 3711511(0) swapouts.
Networks: packets: 2686338/648M in, 2068186/355M out.
Disks: 1141818/44G read, 1926370/100G written.
PID COMMAND %CPU TIME #TH #WQ #PORT MEM PURG CMPRS PGRP PPID STATE
30161 test 100.3 02:12.71 1/1 0 9 312K 0B 0B 29470 28181 running
66064 Terminal 2.7 01:29.69 9 3 257 53M 0B 4580K 66064 1 sleeping
bash$ ls -al /cores/
total 917024
drwxrwxr-t# 3 root admin 102 May 1 14:54 .
drwxr-xr-x 33 root wheel 1190 Apr 14 09:13 ..
bash$ killall -SIGSEGV test
bash$ ls -al /cores/
total 1818048
drwxrwxr-t# 4 root admin 136 May 1 14:55 .
drwxr-xr-x 33 root wheel 1190 Apr 14 09:13 ..
-r-------- 1 stebro admin 469516288 May 1 14:54 core.30161
vmmap says:
==== Summary for process 50745
ReadOnly portion of Libraries: Total=316K resident=280K(89%) swapped_out_or_unallocated=36K(11%)
Writable regions: Total=40.3M written=16K(0%) resident=12.1M(30%) swapped_out=14.4M(36%) unallocated=28.3M(70%)
REGION TYPE VIRTUAL
=========== =======
Kernel Alloc Once 4K
MALLOC 9388K see MALLOC ZONE table below
MALLOC (admin) 24K
STACK GUARD 56.0M
Stack 8192K
VM_ALLOCATE 4K
__DATA 680K
__LINKEDIT 70.9M
__TEXT 5960K
shared memory 4K
=========== =======
TOTAL 150.6M
VIRTUAL ALLOCATION BYTES
MALLOC ZONE SIZE COUNT ALLOCATED % FULL
=========== ======= ========= ========= ======
DefaultMallocZone_0x104a0e000 9216K 357 27K 0%
Why is my core file 469 MB?

Your core dump includes the complete state of memory as well as symbol information for the frameworks and libraries that your application was running with at the time. That's a lot of data. For more information you might check out the core dumps section of technical note 2124

Related

How much RAM is actually available for applications in Linux?

I’m working on embedded Linux targets (32-bit ARM) and need to determine how much RAM is available for applications once the kernel and core software are launched. Available memory reported by free and /proc/meminfo don’t seem to align with what testing shows is actually usable by applications. Is there a way to correctly calculate how much RAM is truly available without running e.g., stress on each system?
The target system used in my tests below has 256 MB of RAM and does not use swap (CONFIG_SWAP is not set). I’m used the 3.14.79-rt85 kernel in the tests below but have also tried 4.9.39 and see similar results. During boot, the following is reported:
Memory: 183172K/262144K available (5901K kernel code, 377K rwdata, 1876K rodata, 909K init, 453K bss, 78972K reserved)
Once system initialization is complete and the base software is running (e.g., dhcp client, ssh server, etc.), I get the following reported values:
[root#host ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 210016 320 7880 0 0 0 0 186 568 0 2 97 0 0
[root#host ~]# free -k
total used free shared buff/cache available
Mem: 249616 31484 209828 68 8304 172996
Swap: 0 0 0
[root#host ~]# cat /proc/meminfo
MemTotal: 249616 kB
MemFree: 209020 kB
MemAvailable: 172568 kB
Buffers: 712 kB
Cached: 4112 kB
SwapCached: 0 kB
Active: 4684 kB
Inactive: 2252 kB
Active(anon): 2120 kB
Inactive(anon): 68 kB
Active(file): 2564 kB
Inactive(file): 2184 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 2120 kB
Mapped: 3256 kB
Shmem: 68 kB
Slab: 13236 kB
SReclaimable: 4260 kB
SUnreclaim: 8976 kB
KernelStack: 864 kB
PageTables: 296 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 124808 kB
Committed_AS: 47944 kB
VmallocTotal: 1810432 kB
VmallocUsed: 3668 kB
VmallocChunk: 1803712 kB
[root#host ~]# sysctl -a | grep '^vm'
vm.admin_reserve_kbytes = 7119
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 3
vm.extfrag_threshold = 500
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 32
vm.max_map_count = 65530
vm.min_free_kbytes = 32768
vm.mmap_min_addr = 4096
vm.nr_pdflush_threads = 0
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 60
vm.user_reserve_kbytes = 7119
vm.vfs_cache_pressure = 100
Based on the numbers above, I expected to have ~160 MiB available for future applications. By tweaking sysctl vm.min_free_kbytes I can boost this to nearly 200 MiB since /proc/meminfo appears to take this reserve into account, but for testing I left it set as it is above.
To test how much RAM was actually available, i used the stress tool as follows:
stress --vm 11 --vm-bytes 10M --vm-keep --timeout 5s
At 110 MiB, the system remains responsive and both free and vmstat reflect the increased RAM usage. The lowest reported free/available values are below:
[root#host ~]# free -k
total used free shared buff/cache available
Mem: 249616 146580 93196 68 9840 57124
Swap: 0 0 0
[root#host ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
11 0 0 93204 1792 8048 0 0 0 0 240 679 50 0 50 0 0
Here is where things start to break down. After increasing stress’ memory usage to 120 MiB - still well shy of the 168 MiB reported as available - the system freezes for the 5 seconds while stress is running. Continuously running vmstat during the test (or as continuously as possible due to the freeze) shows:
[root#host ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 209664 724 6336 0 0 0 0 237 666 0 1 99 0 0
3 0 0 121916 1024 6724 0 0 289 0 1088 22437 0 45 54 0 0
1 0 0 208120 1328 7128 0 0 1652 0 4431 43519 28 22 50 0 0
Due to the significant increase in interrupts and IO, I’m guessing the kernel is evicting pages containing executable code and then promptly needing to read them back in from flash. My questions are a) is this a correct assessment? and b) why would the kernel be doing this with RAM still available?
Note that if try to use a single worker with stress and claim 160 MiB of memory, the OOM gets activated and kills the test. OOM does not trigger in the scenarios described above.

rsync: failed to set times on "/cygdrive/e/.": Invalid argument (22)

I get the below error message when I try to rsync from a local hard disk to a USB disk mounted at E: on Windows 10.
rsync: failed to set times on "/cygdrive/e/.": Invalid argument (22)
My rsync command is as below (path shortened for brevity):
rsync -rtv --delete --progress --modify-window=5 /cygdrive/d/path/to/folder/ /cygdrive/e/
I actually need to set modification times (on directories as well) and rsync actually sets modification times perfectly. It only fails to set times on root of the USB disk.
I experienced exactly the same problem.
I created a dir containing one text file and when trying to rsync it to an removable (USB) drive, I got the error. However, the file was copied to the destination. The problem is not reproducible if the destination is a folder (other than root) on the removable drive
I then repeated the process using a fixed drive as destination, and the problem was not reproducible
The 1st difference that popped up between the 2 drives, was the file system (for more details, check [MS.Docs]: File Systems Technologies):
FAT32 - on the removable drive
NTFS - on the fixed one
So this was the cause of my failure. Formatting the USB drive as NTFS fixed the problem:
The USB drive formatted as FAT32 (default):
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ ll /cygdrive/
total 20
dr-xr-xr-x 1 cfati None 0 Jul 14 17:58 .
drwxrwx---+ 1 cfati None 0 Jun 9 15:04 ..
d---r-x---+ 1 NT SERVICE+TrustedInstaller NT SERVICE+TrustedInstaller 0 Jul 13 22:21 c
drwxrwx---+ 1 SYSTEM SYSTEM 0 Jul 14 13:19 e
drwxr-xr-x 1 cfati None 0 Dec 31 1979 n
drwxr-xr-x 1 cfati None 0 Dec 31 1979 w
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ rsync -rtv --progress --modify-window=5 ./dir/ /cygdrive/w
sending incremental file list
rsync: failed to set times on "/cygdrive/w/.": Invalid argument (22)
./
a.txt
3 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/2)
sent 111 bytes received 111 bytes 444.00 bytes/sec
total size is 3 speedup is 0.01
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1196) [sender=3.1.2]
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ ll /cygdrive/
total 20
dr-xr-xr-x 1 cfati None 0 Jul 14 17:58 .
drwxrwx---+ 1 cfati None 0 Jun 9 15:04 ..
d---r-x---+ 1 NT SERVICE+TrustedInstaller NT SERVICE+TrustedInstaller 0 Jul 13 22:21 c
drwxrwx---+ 1 SYSTEM SYSTEM 0 Jul 14 13:19 e
drwxr-xr-x 1 cfati None 0 Dec 31 1979 n
drwxr-xr-x 1 cfati None 0 Dec 31 1979 w
After formatting the USB drive as NTFS:
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ ll /cygdrive/
total 24
dr-xr-xr-x 1 cfati None 0 Jul 14 17:59 .
drwxrwx---+ 1 cfati None 0 Jun 9 15:04 ..
d---r-x---+ 1 NT SERVICE+TrustedInstaller NT SERVICE+TrustedInstaller 0 Jul 13 22:21 c
drwxrwx---+ 1 SYSTEM SYSTEM 0 Jul 14 13:19 e
drwxr-xr-x 1 cfati None 0 Dec 31 1979 n
drwxrwxrwx+ 1 Administrators Administrators 0 Jul 14 17:59 w
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ rsync -rtv --progress --modify-window=5 ./dir/ /cygdrive/w
sending incremental file list
./
a.txt
3 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/2)
sent 111 bytes received 38 bytes 298.00 bytes/sec
total size is 3 speedup is 0.02
cfati#cfati-e5550-0 /cygdrive/e/Work/Dev/StackOverflow/q045006385
$ ll /cygdrive/
total 24
dr-xr-xr-x 1 cfati None 0 Jul 14 17:59 .
drwxrwx---+ 1 cfati None 0 Jun 9 15:04 ..
d---r-x---+ 1 NT SERVICE+TrustedInstaller NT SERVICE+TrustedInstaller 0 Jul 13 22:21 c
drwxrwx---+ 1 SYSTEM SYSTEM 0 Jul 14 13:19 e
drwxr-xr-x 1 cfati None 0 Dec 31 1979 n
drwxrwxrwx+ 1 Administrators Administrators 0 Jul 14 13:19 w
As a side note, when I was at step #2., I was an idiot and kept the --delete arg, so til I hit Ctrl + C, it deleted some data. Luckily, it didn't get to delete crucial files / folders.

Bad health due to com.cloudera.cmon.agent.DnsTest timeout

Problems:
More and more data nodes become bad health in Cloudera Manager.
Clue1:
no any task or job, just an idle data node here,
top
-bash-4.1$ top
top - 18:27:22 up 4:59, 3 users, load average: 4.55, 3.52, 3.18
Tasks: 139 total, 1 running, 137 sleeping, 1 stopped, 0 zombie
Cpu(s): 14.8%us, 85.2%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7932720k total, 1243372k used, 6689348k free, 52244k buffers
Swap: 6160376k total, 0k used, 6160376k free, 267228k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13766 root 20 0 2664m 21m 7048 S 85.4 0.3 190:34.75 java
17688 root 20 0 2664m 19m 7048 S 75.5 0.3 1:05.97 java
12765 root 20 0 2859m 21m 7140 S 36.9 0.3 133:25.46 java
2909 mapred 20 0 1894m 113m 14m S 1.0 1.5 2:55.26 java
1850 root 20 0 1469m 62m 4436 S 0.7 0.8 2:54.53 python
1332 root 20 0 50000 3000 2424 S 0.3 0.0 0:12.04 vmtoolsd
2683 hbase 20 0 1927m 152m 18m S 0.3 2.0 0:36.64 java
Clue2:
-bash-4.1$ ps -ef|grep 13766
root 13766 1850 99 16:01 ? 03:12:54 java -classpath /usr/share/cmf/lib/agent-4.6.3.jar com.cloudera.cmon.agent.DnsTest
Clue3:
in cloudera-scm-agent.log,
[30/Aug/2013 16:01:58 +0000] 1850 Monitor-HostMonitor throttling_logger ERROR Timeout with args ['java', '-classpath', '/usr/share/cmf/lib/agent-4.6.3.jar', 'com.cloudera.cmon.agent.DnsTest']
None
[30/Aug/2013 16:01:58 +0000] 1850 Monitor-HostMonitor throttling_logger ERROR Failed to collect java-based DNS names
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/src/cmf/monitor/host/dns_names.py", line 53, in collect
result, stdout, stderr = self._subprocess_with_timeout(args, self._poll_timeout)
File "/usr/lib64/cmf/agent/src/cmf/monitor/host/dns_names.py", line 42, in _subprocess_with_timeout
return SubprocessTimeout().subprocess_with_timeout(args, timeout)
File "/usr/lib64/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 70, in subprocess_with_timeout
raise Exception("timeout with args %s" % args)
Exception: timeout with args ['java', '-classpath', '/usr/share/cmf/lib/agent-4.6.3.jar', 'com.cloudera.cmon.agent.DnsTest']
"cloudera-scm-agent.log" line 30357 of 30357 --100%-- col 1
Backgrouds:
if I restart all nodes, then everythings are OK, but after half and hour or more, bad health is coming one by one.
Version: Cloudera Standard 4.6.3 (#192 built by jenkins on 20130812-1221 git: fa61cf8559fbefeb5af7f223fd02164d1a0adfdb)
I added all nodes in /etc/hosts
the installed CDH is 4.3.1.
in fact, these nodes are VMs with fixed IP address.
Any suggestions?
BTW, where can I download source code of com.cloudera.cmon.agent.DnsTest?

Oracle 11gr2 failed check of kernel parameters on hp-ux

I'm installing oracle 11gR2 on 64 bit itanium HP-UX (v 11.31) system ( for HP Operation Manager 9 ).
According with the installation requiremens, I've changed the kernel parameters but when I start the installation process it don't recognize them.
Below the parameters that I've set :
Parameter ( Manual) (on server)
-------------------------------------------------------------
fs_async 0 0
ksi_alloc_max (nproc*8) 10240*8 = 81920
executable_stack 0 0
max_thread_proc 1024 3003
maxdsiz 0x40000000 (1073741824) 2063835136
maxdsiz_64bit 0x80000000 (2147483648) 2147483648
maxfiles 256 (a) 4096
maxssiz 0x8000000 (134217728) 134217728
maxssiz_64bit 0x40000000 (1073741824) 1073741824
maxuprc ((nproc*9)/10) 9216
msgmni (nproc) 10240
msgtql (nproc) 32000
ncsize 35840 95120
nflocks (nproc) 10240
ninode (8*nproc+2048) 83968
nkthread (((nproc*7)/4)+16) 17936
nproc 4096 10240
semmni (nproc) 10240
semmns (semmni*2) 20480
semmnu (nproc-4) 10236
semvmx 32767 65535
shmmax size of memory or 0x40000000 (higher one) 1073741824
shmmni 4096 4096
shmseg 512 1024
vps_ceiling 64 64
if this can help:
[root#HUG30055 /] # swapinfo
Kb Kb Kb PCT START/ Kb
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 4194304 0 4194304 0% 0 - 1 /dev/vg00/lvol2
dev 8388608 0 8388608 0% 0 - 1 /dev/vg00/lvol10
reserve - 742156 -742156
memory 7972944 3011808 4961136 38%
[root#HUG30055 /] # bdf /tmp
Filesystem kbytes used avail %used Mounted on
/dev/vg00/lvol6 4194304 1773864 2401576 42% /tmp

percentage of memory used used by a process

percentage of memory used used by a process.
normally prstat -J will give the memory of process image and RSS(resident set size) etc.
how do i knowlist of processes with percentage of memory is used by a each process.
i am working on solaris unix.
addintionally ,what are the regular commands that you use for monitoring processes,performences of processes that might be very useful to all!
The top command will give you several memory-consumption numbers. htop is much nicer, and will give you percentages, but it isn't installed by default on most systems.
run
top and then Shift+O this will bring you to the options, press n (this maybe different on your machine) for memory and then hit enter
Example of memory sort.
top - 08:17:29 up 3 days, 8:54, 6 users, load average: 13.98, 14.01, 11.60
Tasks: 654 total, 2 running, 652 sleeping, 0 stopped, 0 zombie
Cpu(s): 14.7%us, 1.5%sy, 0.0%ni, 59.5%id, 23.5%wa, 0.1%hi, 0.8%si, 0.0%st
Mem: 65851896k total, 49049196k used, 16802700k free, 1074664k buffers
Swap: 50331640k total, 0k used, 50331640k free, 32776940k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21635 oracle 15 0 6750m 636m 51m S 1.6 1.0 62:34.53 oracle
21623 oracle 15 0 6686m 572m 53m S 1.1 0.9 61:16.95 oracle
21633 oracle 16 0 6566m 445m 235m S 3.7 0.7 30:22.60 oracle
21615 oracle 16 0 6550m 428m 220m S 3.7 0.7 29:36.74 oracle
16349 oracle RT 0 431m 284m 41m S 0.5 0.4 2:41.08 ocssd.bin
17891 root RT 0 139m 118m 40m S 0.5 0.2 41:08.19 osysmond
18154 root RT 0 182m 98m 43m S 0.0 0.2 10:02.40 ologgerd
12211 root 15 0 1432m 84m 14m S 0.0 0.1 17:57.80 java
Another method on Solaris is to do the following
prstat -s size 1 1
Example prstat output
www004:/# prstat -s size 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
420 nobody 139M 60M sleep 29 10 1:46:56 0.1% webservd/76
603 nobody 135M 59M sleep 29 10 5:33:18 0.1% webservd/96
339 root 134M 70M sleep 59 0 0:35:38 0.0% java/24
435 iplanet 132M 55M sleep 29 10 1:10:39 0.1% webservd/76
573 nobody 131M 53M sleep 29 10 0:24:32 0.0% webservd/76
588 nobody 130M 53M sleep 29 10 2:40:55 0.1% webservd/86
454 nobody 128M 51M sleep 29 10 0:09:01 0.0% webservd/76
489 iplanet 126M 49M sleep 29 10 0:00:13 0.0% webservd/74
405 root 119M 45M sleep 29 10 0:00:13 0.0% webservd/31
717 root 54M 46M sleep 59 0 2:31:27 0.2% agent/7
Keep in mind this is sorted by Size not RSS, if you need it by RSS use the rss key
www004:/# prstat -s rss 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
339 root 134M 70M sleep 59 0 0:35:39 0.1% java/24
420 nobody 139M 60M sleep 29 10 1:46:57 0.4% webservd/76
603 nobody 135M 59M sleep 29 10 5:33:19 0.5% webservd/96
435 iplanet 132M 55M sleep 29 10 1:10:39 0.0% webservd/76
573 nobody 131M 53M sleep 29 10 0:24:32 0.0% webservd/76
588 nobody 130M 53M sleep 29 10 2:40:55 0.0% webservd/86
454 nobody 128M 51M sleep 29 10 0:09:01 0.0% webservd/76
489 iplanet 126M 49M sleep 29 10 0:00:13 0.0% webservd/74
I'm not sure if ps is standardized but at least on linux, ps -o %mem gives the percentage of memory used (you would obviously want to add some other columns as well)

Resources