I'm experiencing performance issues with my application, and one of my suspects is excessive I/O.
iostat shows a rate of about 10K blocks written per second. How can I tell whether this is a lot? How can I find the limit of this specific machine and disk?
Edit: Following Elliot's request:
iostat output:
avg-cpu: %user %nice %system %iowait %steal %idle
16.39 0.00 0.52 11.43 0.00 71.66
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
cciss/c0d0 315.20 0.00 10341.80 0 51709
uptime:
2:08am up 17 days 17:26, 5 users, load average: 9.13, 9.32, 8.73
top:
top - 02:10:02 up 17 days, 17:27, 5 users, load average: 8.89, 9.18, 8.72
Tasks: 202 total, 2 running, 200 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.9%us, 0.7%sy, 0.0%ni, 90.5%id, 2.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 96556M total, 15930M used, 80626M free, 221M buffers
Swap: 196615M total, 93M used, 196522M free, 2061M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20027 root 16 0 10.5g 9.8g 12m S 74 10.4 2407:55 /usr/intel/pkgs/java/1.6.0.31-64/jre//bin/java -server -X
Thanks
I can tell you from experience that's a very high block write rate for most systems. However, your system could be perfectly capable of handling it; that depends on your hardware. What matters is your load average and the iowait percentage: if your load is high (i.e., higher than the number of cores on your system) and it largely consists of iowait, then you have a problem.
Could you share with us the full output of iostat, uptime, and a snapshot of top -c output while your application is running?
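In the meantime, a quick way to run the load-versus-cores check yourself (a minimal sketch, assuming a Linux box with coreutils and sysstat installed):
nproc          # number of CPU cores, to compare against the load average
uptime         # 1-, 5- and 15-minute load averages
iostat -x 5    # extended per-device stats every 5 seconds; the first report is the since-boot average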
Perspective:
If it's spinning disk, that's a high value.
If it's an SSD or a SAN with a write cache, that's reasonable.
Use iostat -x for wide and extended metrics:
[xxx#xxxxxxxxxx]$ iostat -x
Linux 2.6.32-358.6.2.el6.x86_64 (boxname.goes.here) 12/12/2013 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 0.21 0.00 0.00 99.21
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.06 28.38 0.04 3.99 0.99 259.15 64.58 0.01 2.11 0.55 0.22
The %util column is your friend. If you look at iostat.c (see it at: http://code.google.com/p/tester-higkoo/source/browse/trunk/Tools/iostat/iostat.c) you can see that it calculates this percentage by comparing the amount of time (in processor ticks) spent doing I/O against the total number of ticks that have passed. In other words, %util is the percentage of time the device was busy doing I/O.
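Keep in mind that the first report iostat prints covers the interval since boot, so for live numbers give it an interval and a count (a small sketch, assuming sysstat is installed):
iostat -x 5 3    # three extended reports at 5-second intervals; ignore the first, since-boot one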
I have a process which slowly consumes more RAM until it eventually hits its cgroup limit and is OOM killed, and I'm trying to figure out why.
Oddly, Go's runtime seems to think not much RAM is used, whereas the OS seems to think a lot is.
Specifically, looking at runtime.MemStats (via the expvar package) I see:
"Alloc":51491072,
"TotalAlloc":143474637424,
"Sys":438053112,
"Lookups":0,
"Mallocs":10230571,
"Frees":10195515,
"HeapAlloc":51491072,
"HeapSys":388464640,
"HeapIdle":333824000,
"HeapInuse":54640640,
"HeapReleased":0,
"HeapObjects":35056,
"StackInuse":14188544,
"StackSys":14188544,
"MSpanInuse":223056,
"MSpanSys":376832,
"MCacheInuse":166656,
"MCacheSys":180224,
"BuckHashSys":2111104,
"GCSys":13234176,
"OtherSys":19497592
But from the OS perspective:
$ ps auxwf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 178 0.0 0.0 3996 3372 pts/0 Ss 17:33 0:00 bash
root 246 0.0 0.0 7636 2828 pts/0 R+ 17:59 0:00 \_ ps auxwf
root 1 166 2.8 11636248 5509288 ? Ssl 17:24 57:15 app server -api-public
So, the OS reports an RSS of 5380 MiB, but the Sys field in MemStats shows only 417 MiB. My understanding is that these fields should be approximately the same.
GC is running, as confirmed by setting GODEBUG=gctrace=1,madvdontneed=1. For example, I see output like:
gc 6882 @2271.137s 0%: 0.037+2.2+0.087 ms clock, 3.5+0.78/37/26+8.4 ms cpu, 71->72->63 MB, 78 MB goal, 96 P
The numbers vary a bit depending on the process, but they are all <100 MB, whereas the OS is reporting >1GB (and growing, until eventual OOM).
madvdontneed=1 was a shot in the dark but seems to make no difference. I wouldn't think the madvise parameters would be relevant, since it doesn't seem there's any need to return memory to the kernel, as the Go runtime doesn't think it's using much memory anyway.
What could explain this discrepancy? Am I not correctly understanding the semantics of these fields? Are there mechanisms that would result in the growth of RSS (and an eventual OOM kill) but not increase MemStats.Sys?
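One way to narrow this down is to look at where the resident memory actually lives (a diagnostic sketch, assuming a recent Linux kernel with smaps_rollup and a cgroup v1 memory controller mounted at the usual path; PID 1 is the app's PID from the ps output above):
PID=1
cat /proc/$PID/smaps_rollup            # splits RSS into anonymous, file-backed and shared pages
cat /sys/fs/cgroup/memory/memory.stat  # cache vs rss as the cgroup (and its OOM killer) accounts them
If most of the resident memory turns out to be file-backed or off-heap (e.g. mmap or cgo allocations), it would never show up in MemStats, which only tracks memory the Go runtime itself obtained.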
I'm using Puppet as a provisioner for Vagrant, and am running into an issue where Puppet hangs for an extremely long time when I do a "vagrant provision". Building the box from scratch using "vagrant up" doesn't seem to be a problem; only subsequent provisions are.
If I turn Puppet debugging on and watch where it hangs, it seems to stop at various, seemingly arbitrary, points, the first of which is:
Info: Applying configuration version '1401868442'
Debug: Prefetching yum resources for package
Debug: Executing '/bin/rpm --version'
Debug: Executing '/bin/rpm -qa --nosignature --nodigest --qf '%{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n''
Executing this command on the server myself returns immediately.
Eventually, it gets past this and continues. Using the summary option, I get the following, after waiting for a very long time for it to complete:
Debug: Finishing transaction 70191217833880
Debug: Storing state
Debug: Stored state in 9.39 seconds
Notice: Finished catalog run in 1493.99 seconds
Changes:
  Total: 2
Events:
  Failure: 2
  Success: 2
  Total: 4
Resources:
  Total: 18375
  Changed: 2
  Failed: 2
  Skipped: 35
  Out of sync: 4
Time:
  User: 0.00
  Anchor: 0.01
  Schedule: 0.01
  Yumrepo: 0.07
  Augeas: 0.12
  Package: 0.18
  Exec: 0.96
  Service: 1.07
  Total: 108.93
  Last run: 1401869964
  Config retrieval: 16.49
  Mongodb database: 3.99
  File: 76.60
  Mongodb user: 9.43
Version:
  Config: 1401868442
  Puppet: 3.4.3
This doesn't seem very helpful to me, as the times listed total only 108.93 seconds, so where have the other ~1385 seconds gone?
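One way to find out where the time actually goes, assuming the manifest can be applied by hand, is Puppet's evaltrace option, which logs how long each resource takes to evaluate:
puppet apply --verbose --evaltrace manifest.pp    # manifest.pp is a placeholder for your entry-point manifest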
Throughout, Puppet hammers the box, using a lot of CPU, but doesn't seem to advance, and the memory it uses continually increases. When I kick off the command, top looks like this:
Cpu(s): 10.2%us, 2.2%sy, 0.0%ni, 85.5%id, 2.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4956928k total, 2849296k used, 2107632k free, 63464k buffers
Swap: 950264k total, 26688k used, 923576k free, 445692k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28486 root 20 0 439m 334m 3808 R 97.5 6.9 2:02.92 puppet
22 root 20 0 0 0 0 S 1.3 0.0 0:07.55 kblockd/0
18276 mongod 20 0 788m 31m 3040 S 1.3 0.6 2:31.82 mongod
20756 jboss-as 20 0 3081m 1.5g 21m S 1.3 31.4 7:13.15 java
20930 elastics 20 0 2340m 236m 6580 S 1.0 4.9 1:44.80 java
266 root 20 0 0 0 0 S 0.3 0.0 0:03.85 jbd2/dm-0-8
22717 vagrant 20 0 98.0m 2252 1276 S 0.3 0.0 0:01.81 sshd
28762 vagrant 20 0 15036 1228 932 R 0.3 0.0 0:00.10 top
1 root 20 0 19364 1180 964 S 0.0 0.0 0:00.86 init
To me this seems fine: there's over 2GB of available memory and plenty of available swap. I have a max open files limit of 1024.
About 10-15 minutes later, still no advance in the console output, but top looks like this:
Cpu(s): 11.2%us, 1.6%sy, 0.0%ni, 86.9%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%s
Mem: 4956928k total, 3834376k used, 1122552k free, 64248k buffers
Swap: 950264k total, 24408k used, 925856k free, 445728k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28486 root 20 0 1397m 1.3g 3808 R 99.6 26.7 15:16.19 puppet
18276 mongod 20 0 788m 31m 3040 R 1.7 0.6 2:45.03 mongod
20756 jboss-as 20 0 3081m 1.5g 21m S 1.3 31.4 7:25.93 java
20930 elastics 20 0 2340m 238m 6580 S 0.7 4.9 1:52.03 java
8486 root 20 0 308m 952 764 S 0.3 0.0 0:06.03 VBoxService
As you can see, Puppet is now using a lot more memory, and it continues in this fashion. The box it's building has 5GB of RAM, so I wouldn't have expected it to have memory issues. However, further down the line, after a long wait, I do get "Cannot allocate memory - fork(2)"
Running ulimit -a, I get:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 38566
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Which, again, looks fine to me...
To be honest, I'm completely at a loss as to how to go about solving this, or what is causing it.
Any help or insight would be greatly appreciated!
EDIT:
So I managed to fix this eventually... It came down to using recurse with a file resource on a large directory. The target directory contained around 2GB worth of files, and Puppet took a huge amount of time loading it all into memory and doing its hashes and comparisons. The first time I stood the server up, the directory was relatively empty, so the check was quick, but other resources were later placed in it that increased its size massively, meaning subsequent runs took much longer.
The memory error that was eventually thrown was, I can only assume, because Puppet was loading the whole thing into memory in order to do its checks...
I found a way around using recurse, and am now trying to avoid it like the plague...
Yeah, the problem with the recurse parameter on the file type is that it checksums every single file, which on a massive directory adds up very quickly.
As Felix suggests, using checksum => none is one way to fix it. Another is to accomplish the task you're trying to do (say, chmod or chown a whole directory) with an exec performing the native command, plus an unless to check whether it's already been done.
Something like:
define check_mode($mode) {
  # $name is the resource title (the path); chmod runs only when stat
  # reports a mode different from the desired one, keeping the exec idempotent.
  exec { "/bin/chmod $mode $name":
    unless => "/bin/sh -c '[ $(/usr/bin/stat -c %a $name) == $mode ]'",
  }
}
Taken from http://projects.puppetlabs.com/projects/1/wiki/File_Permission_Check_Patterns
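To see what that unless guard actually does, you can run essentially the same test by hand (a hypothetical path and mode; = is the portable spelling of the == used above, which works there because /bin/sh is bash on RHEL-family systems):
/bin/sh -c '[ "$(/usr/bin/stat -c %a /var/www)" = 755 ]' && echo "mode already correct, chmod would be skipped"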
Most of the user-space processes end up in D-state after the unit runs for around 3-4 days; the unit runs on an ARM processor. From the top output we can see that the processes in D-state are blocked in the kernel in "page_fault" and "squashfs_readpage". Ultimately this leads to a watchdog reset. The processes that go into D-state take an unusually long time to recover.
Here is the top output when the system ends up in trouble:
top - 12:00:11 up 3 days, 2:40, 3 users, load average: 2.77, 1.90, 1.72
Tasks: 250 total, 3 running, 238 sleeping, 0 stopped, 9 zombie
Cpu(s): 10.0% us, 75.5% sy, 0.0% ni, 0.0% id, 10.3% wa, 0.0% hi, 4.2% si
Mem: 191324k total, 188896k used, 2428k free, 2548k buffers
Swap: 0k total, 0k used, 0k free, 87920k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1003 root 20 0 225m 31m 4044 S 15.2 16.7 0:21.91 user_process_1
3745 root 20 0 80776 9476 3196 **D** 9.0 5.0 1:31.79 user_process_2
129 root 15 -5 0 0 0 S 7.4 0.0 0:27.65 **mtdblockd**
4624 root 20 0 3640 256 160 **D** 6.5 0.1 0:00.20 GetCounters_cus
3 root 15 -5 0 0 0 S 3.2 0.0 43:38.73 ksoftirqd/0
31363 root 20 0 2356 1176 792 R 2.6 0.6 40:09.58 top
347 root 30 10 0 0 0 S 1.9 0.0 28:56.04 **jffs2_gcd_mtd3**
1169 root 20 0 225m 31m 4044 S 1.9 16.7 39:31.36 user_process_1
604 root 20 0 0 0 0 S 1.6 0.0 27:22.76 user_process_3
1069 root -23 0 225m 31m 4044 S 1.3 16.7 20:45.39 user_process_1
4545 root 20 0 3640 564 468 S 1.0 0.3 0:00.08 GetCounters_cus
64 root 15 -5 0 0 0 **D** 0.3 0.0 0:00.83 **kswapd0**
969 root 20 0 20780 1856 1376 S 0.3 1.0 14:18.89 user_process_4
973 root 20 0 225m 31m 4044 S 0.3 16.7 3:35.74 user_process_1
1070 root -23 0 225m 31m 4044 S 0.3 16.7 16:41.04 user_process_1
1151 root -81 0 225m 31m 4044 S 0.3 16.7 23:13.05 user_process_1
1152 root -99 0 225m 31m 4044 S 0.3 16.7 8:48.47 user_process_1
One more interesting observation: when the system lands in this state, we consistently see the "mtdblockd" process running in the top output. Swap is disabled on this unit, and there is no apparent memory leak.
Any idea what the possible reasons are for the processes getting stuck in D-state?
D-state means the processes are stuck in the kernel in a TASK_UNINTERRUPTIBLE sleep. This is unlikely to be a bug in the Squashfs error handling code: if a process exited Squashfs holding a mutex, the system would quickly grind to a halt as other processes entered Squashfs and slept forever waiting for the mutex. You would also see a low load average and low system time, as most processes would be sleeping. Furthermore, there is no evidence Squashfs has hit any I/O errors.
The load average (2.77) and system time (75.5%) are extremely high. Coupled with the fact that a lot of processes are in squashfs_readpage (which is completing, but slowly), this indicates the system is thrashing: there is too little memory, and the system is spending all its time constantly (re-)demand-paging pages from disk. That accounts for the many processes in squashfs_readpage, and system time is extremely high because the system spends most of its time in Squashfs on the CPU-intensive task of decompression. The other processes are stuck in Squashfs waiting on the decompressor mutex (only one process can be decompressing at a time, because the decompressor state is shared).
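If you want to confirm the thrashing, the major page-fault rate is the number to watch (a sketch, assuming the sysstat tools can be installed on the unit):
sar -B 1        # system-wide paging stats per second; look at majflt/s
pidstat -r 1    # per-process majflt/s; the D-state processes should dominate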
I just started learning Scala. Is there any way to find the CPU time, real time, and number of cores used by a program when using the actor model?
Thanks in advance.
You may use a profiler such as VisualVM, or a more ad hoc and pleasant solution: the Typesafe Console.
How about using the Unix time command?
time scalac HelloWorld.scala
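Note that this times the compilation, not the program. To time the run itself, invoke the compiled class through the scala runner (assuming HelloWorld is on the classpath in the current directory):
time scala HelloWorld    # prints real (wall-clock), user, and sys CPU time for the run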
If you are running on Linux (specifically, here I'm describing Ubuntu), mpstat is a useful program.
You can install it using the following command:
sudo apt-get install sysstat
Once installed, you can run a bash loop to record CPU info. In this case I'm logging 10 times a second. It will run until you press Control-C to kill the logging. (Also, consider removing the sleep 0.1; for better data.)
while true; do mpstat -P ALL >> cpu_log.txt; sleep 0.1; done
In another terminal you can fire off your program (or ANY program) and track the performance data. The output looks like this:
03:21:08 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:21:08 PM all 0.37 0.00 0.33 0.50 0.00 0.02 0.00 0.00 98.78
03:21:08 PM 0 0.51 0.00 0.45 0.57 0.00 0.03 0.00 0.00 98.43
03:21:08 PM 1 0.29 0.00 0.26 0.45 0.00 0.01 0.00 0.00 99.00
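One caveat: without an interval argument, each mpstat report shows averages since boot rather than a live snapshot, so mpstat's own interval sampling gives better data than the shell loop:
mpstat -P ALL 1 60 > cpu_log.txt    # one per-CPU report every second, 60 times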
What is the throughput of Google File System?
http://labs.google.com/papers/gfs-sosp2003.pdf
Page 11, section 6.1.2 (Writes), has some throughput figures. Not sure if that's what you're looking for.
Here is an article about it.
From the article:
Cluster                       A          B
Read Rate  - last minute      583 MB/s   380 MB/s
Read Rate  - last hour        562 MB/s   384 MB/s
Read Rate  - since restart    589 MB/s   49 MB/s
Write Rate - last minute      1 MB/s     101 MB/s
Write Rate - last hour        2 MB/s     117 MB/s
Write Rate - since restart    25 MB/s    13 MB/s
Master Ops - last minute      325 Op/s   533 Op/s
Master Ops - last hour        381 Op/s   518 Op/s
Master Ops - since restart    202 Op/s   347 Op/s