Core usage in Scala actor model - performance

I just started learning Scala. Is there any way to find the CPU time, the real (wall-clock) time, and the number of cores used by a program when using the actor model?
Thanks in advance.

You may use a profiler such as VisualVM, or a more ad hoc and pleasant solution: the Typesafe Console.

How about using the Unix time command?
time scalac HelloWorld.scala
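Note that the scalac invocation above times the compiler, not your program. To time the program itself, compile first and then time the runner; a rough sketch (assuming the scala launcher is on your PATH and the class is named HelloWorld):
scalac HelloWorld.scala
time scala HelloWorld
If the reported user + sys time exceeds the real (wall-clock) time, the program was, on average, using more than one core.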

If you are running on Linux (specifically, here I'm describing Ubuntu), mpstat is a useful program.
You can install it using the following command:
sudo apt-get install sysstat
Once installed you can run a bash script to record CPU info. In this case I'm logging roughly ten times a second. This will block until you press Control-C to kill the logging. (Also, consider removing the sleep 0.1; for more frequent samples.)
while true; do mpstat -P ALL >> cpu_log.txt; sleep 0.1; done
In another terminal you can fire off your program (or any program) and track the performance data. The output looks like this:
03:21:08 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:21:08 PM all 0.37 0.00 0.33 0.50 0.00 0.02 0.00 0.00 98.78
03:21:08 PM 0 0.51 0.00 0.45 0.57 0.00 0.03 0.00 0.00 98.43
03:21:08 PM 1 0.29 0.00 0.26 0.45 0.00 0.01 0.00 0.00 99.00
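One caveat worth checking on your sysstat version: without interval arguments, mpstat reports averages since boot, so the loop above may log nearly identical numbers each time. A simpler sketch lets mpstat do the sampling itself:
# one report per second for 60 seconds, all CPUs
mpstat -P ALL 1 60 > cpu_log.txt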

Related

jupyter notebook %%time doesn't measure cpu time of %%sh commands?

When I run Python code in a JupyterLab (v3.4.3) IPython (v8.4.0) notebook and use the %%time cell magic, both CPU time and wall time are reported.
%%time
for i in range(10000000):
a = i*i
CPU times: user 758 ms, sys: 121 µs, total: 758 ms
Wall time: 757 ms
But when the same computation is performed using the %%sh magic to run it as a shell script, the CPU time results are nonsense.
%%time
%%sh
python -c "for i in range(10000000): a = i*i"
CPU times: user 6.14 ms, sys: 12.5 ms, total: 18.6 ms
Wall time: 920 ms
The docs for %time do say "Time execution of a Python statement or expression.", but this still surprised me, because I had assumed that the shell script would run in a Python subprocess and thus could also be measured. So, what's going on here? Is this a bug, or just a known caveat of using %%sh?
I know I can use the shell builtin time or /usr/bin/time to get similar output, but this is a bit cumbersome for multiple lines of shell. Is there a better workaround?
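One workaround sketch: use the %%bash cell magic and bash's time keyword, which (unlike /usr/bin/time) can time a whole command group, so several lines of shell get a single real/user/sys report:
%%bash
time {
python -c "for i in range(10000000): a = i*i"
# ...any further shell lines are timed as part of the same group...
}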

strace'ing/profiling a bash script

I'm currently trying to benchmark a bash script in 4 different versions. Each one does a giant rsync job and usually takes a very long time to finish. There are many steps in the bash script that involve setting up and tearing down the environment to rsync to.
However, when I ran strace on the bash scripts, I got surprisingly short results, which leads me to believe that strace is not actually tracing the time spent waiting for a command like rsync (which might be spawned in a subshell and so is not recorded by strace at all), or that it is waking up intermittently and sleeping again for time that strace is not counting. Here's a snippet:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.98 12.972555 120116 108 52 wait4
0.01 0.000751 13 56 clone
0.00 0.000380 1 553 rt_sigprocmask
0.00 0.000303 2 197 85 stat
0.00 0.000274 2 134 read
0.00 0.000223 19 12 open
0.00 0.000190 48 4 getdents
0.00 0.000110 1 82 8 close
0.00 0.000110 1 153 rt_sigaction
0.00 0.000084 1 61 getegid
0.00 0.000074 4 19 write
So what tools can I use that are similar to strace? Or am I missing some kind of recursive flag in strace that would correctly show where my bash script is spending its time waiting?
I would like something along the lines of:
% time command
------ --------
... rsync
... ls
Any suggestions would be appreciated. Thank you!
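One thing to check (assuming GNU strace): by default strace only traces the top-level bash process, so time spent inside children like rsync shows up only as wait4 in the parent, exactly as in your snippet. The -f flag follows forks; combined with -c it aggregates syscall time across the whole process tree. A sketch:
# -f follows child processes, -c prints the summary table, -o logs it
strace -f -c -o strace_summary.txt bash myscript.sh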

User processes in D-state lead to a watchdog reset using Linux 2.6.24 on an ARM processor

Most of the user-space processes end up in D-state after the unit runs for around 3-4 days; the unit runs on an ARM processor. From the top output we can see that the processes in D-state are waiting on the system calls "page_fault" and "squashfs_readpage". Ultimately this leads to a watchdog reset. The processes that go into D-state take an unusually long time to recover.
Following is the top output when the system ends up in trouble:
top - 12:00:11 up 3 days, 2:40, 3 users, load average: 2.77, 1.90, 1.72
Tasks: 250 total, 3 running, 238 sleeping, 0 stopped, 9 zombie
Cpu(s): 10.0% us, 75.5% sy, 0.0% ni, 0.0% id, 10.3% wa, 0.0% hi, 4.2% si
Mem: 191324k total, 188896k used, 2428k free, 2548k buffers
Swap: 0k total, 0k used, 0k free, 87920k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1003 root 20 0 225m 31m 4044 S 15.2 16.7 0:21.91 user_process_1
3745 root 20 0 80776 9476 3196 **D** 9.0 5.0 1:31.79 user_process_2
129 root 15 -5 0 0 0 S 7.4 0.0 0:27.65 **mtdblockd**
4624 root 20 0 3640 256 160 **D** 6.5 0.1 0:00.20 GetCounters_cus
3 root 15 -5 0 0 0 S 3.2 0.0 43:38.73 ksoftirqd/0
31363 root 20 0 2356 1176 792 R 2.6 0.6 40:09.58 top
347 root 30 10 0 0 0 S 1.9 0.0 28:56.04 **jffs2_gcd_mtd3**
1169 root 20 0 225m 31m 4044 S 1.9 16.7 39:31.36 user_process_1
604 root 20 0 0 0 0 S 1.6 0.0 27:22.76 user_process_3
1069 root -23 0 225m 31m 4044 S 1.3 16.7 20:45.39 user_process_1
4545 root 20 0 3640 564 468 S 1.0 0.3 0:00.08 GetCounters_cus
64 root 15 -5 0 0 0 **D** 0.3 0.0 0:00.83 **kswapd0**
969 root 20 0 20780 1856 1376 S 0.3 1.0 14:18.89 user_process_4
973 root 20 0 225m 31m 4044 S 0.3 16.7 3:35.74 user_process_1
1070 root -23 0 225m 31m 4044 S 0.3 16.7 16:41.04 user_process_1
1151 root -81 0 225m 31m 4044 S 0.3 16.7 23:13.05 user_process_1
1152 root -99 0 225m 31m 4044 S 0.3 16.7 8:48.47 user_process_1
One more interesting observation is that when the system lands in this problem, we can consistently see the "mtdblockd" process running in the top output. We have swap disabled on this unit, and there is no apparent memory leak.
Any idea what the possible reasons could be for the processes getting stuck in D-state?
D-state means the processes are stuck in the kernel in a TASK_UNINTERRUPTIBLE sleep. This is unlikely to be a bug in the Squashfs error-handling code, because if a process exited Squashfs holding a mutex, the system would quickly grind to a halt as other processes entered Squashfs and slept forever waiting for the mutex. You would also see a low load average/system time, as most processes would be sleeping. Furthermore, there is no evidence Squashfs has hit any I/O errors.
The load average (2.77) and system time (75.5%) are extremely high; coupled with the fact that a lot of processes are in squashfs_readpage (which is completing, but slowly), this indicates the system is thrashing. There is too little memory, and the system is spending all its time constantly (re-)demand-paging pages from disk. This accounts for the fact that a lot of processes are in squashfs_readpage; system time is extremely high because the system is spending most of its time in Squashfs on the CPU-intensive task of decompression. The other processes are stuck in Squashfs waiting on the decompressor mutex (only one process can be decompressing at a time, because the decompressor state is shared).
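A quick way to confirm the thrashing hypothesis from a shell on the unit (a sketch, assuming procps vmstat is available on this build):
# free memory and blocks read in (bi) once per second; sustained high
# bi with near-zero free memory suggests constant demand paging
vmstat 1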

Analyzing iostat output

I'm suffering performance issues with my application, and one of my suspects is excessive I/O.
iostat shows a rate of 10K blocks written per second. How can I tell whether this is a lot or not? How can I know the limit of the specific machine and disk?
Edit: following Elliot's request:
iostat output:
avg-cpu: %user %nice %system %iowait %steal %idle
16.39 0.00 0.52 11.43 0.00 71.66
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
cciss/c0d0 315.20 0.00 10341.80 0 51709
uptime:
2:08am up 17 days 17:26, 5 users, load average: 9.13, 9.32, 8.73
top:
top - 02:10:02 up 17 days, 17:27, 5 users, load average: 8.89, 9.18, 8.72
Tasks: 202 total, 2 running, 200 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.9%us, 0.7%sy, 0.0%ni, 90.5%id, 2.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 96556M total, 15930M used, 80626M free, 221M buffers
Swap: 196615M total, 93M used, 196522M free, 2061M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20027 root 16 0 10.5g 9.8g 12m S 74 10.4 2407:55 /usr/intel/pkgs/java/1.6.0.31-64/jre//bin/java -server -X
Thanks
I can tell you from experience that's a very high block write rate for most systems. However, your system could be perfectly capable of handling it; that depends on what kind of hardware you have. What's important is your server load figure and the iowait percentage. If your server load is high (i.e., higher than the number of cores on your system) and your load largely consists of iowait, then you have a problem.
Could you share with us the full output of iostat, uptime, and a snapshot of top -c output while your application is running?
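As a concrete sketch of that check (assuming coreutils and sysstat are installed): compare the load average against the core count, and watch how much of the CPU time is iowait:
nproc # number of cores
uptime # 1-, 5- and 15-minute load averages
iostat -c 5 # CPU-only report every 5 seconds, including %iowait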
Perspective:
If it's spinning disk, that's a high value.
If it's an SSD or a SAN with a write cache, that's reasonable.
Use iostat -x for wide and extended metrics:
[xxx#xxxxxxxxxx]$ iostat -x
Linux 2.6.32-358.6.2.el6.x86_64 (boxname.goes.here) 12/12/2013 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 0.21 0.00 0.00 99.21
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.06 28.38 0.04 3.99 0.99 259.15 64.58 0.01 2.11 0.55 0.22
The %util column is your friend. If you look at iostat.c (see it at http://code.google.com/p/tester-higkoo/source/browse/trunk/Tools/iostat/iostat.c) you can see it calculates this percentage by comparing the amount of time (in processor ticks) spent doing I/O against the total number of ticks that have passed. In other words, %util is the percentage of time the device was busy with I/O.
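To watch this over time rather than as a single snapshot, give iostat an interval (a sketch; note the caveat about the first report):
# extended stats every 5 seconds; the first report is the average
# since boot, so judge current load from the later ones
iostat -x 5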

Is it possible to profile for file system interaction in Ruby?

Is it possible to profile a Ruby application to see how much it interacts with the file system?
Background: In the past I've written code that reads files within a loop when it only needs to do so once. I'd like to make sure that I eliminate all such code.
There are already perfectly capable programs for this purpose out there that you don't need to duplicate. I don't think you should complicate your program with special logic checking for a relatively obscure programming error (at least, I've never accidentally committed the error you describe). In such cases, the solution is to check a performance characteristic from outside the program. Assuming you are on Linux, I would turn to a test/spec that exercises your code while watching iostat (or similar) in a separate thread.
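As a concrete sketch of that outside-the-program approach (assuming Linux with strace installed): count the file-related syscalls your script and its children make, and look for the same path being opened repeatedly:
# -f follows child processes; -e trace=file selects file-related
# syscalls (open, stat, ...); -c prints a summary table
strace -f -e trace=file -c ruby myscript.rb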
Sure, you can simply require 'profile' at the top of a script:
# myscript.rb
require 'profile'
Dir.entries(".").each { |e| puts e }
$ ruby myscript.rb
(list of filenames...)
% cumulative self self total
time seconds seconds calls ms/call ms/call name
0.00 0.00 0.00 1 0.00 0.00 Dir#open
0.00 0.00 0.00 1 0.00 0.00 Dir#each
0.00 0.00 0.00 1 0.00 0.00 Enumerable.to_a
0.00 0.00 0.00 1 0.00 0.00 Dir#entries
0.00 0.00 0.00 56 0.00 0.00 IO#write
0.00 0.00 0.00 28 0.00 0.00 IO#puts
0.00 0.00 0.00 28 0.00 0.00 Kernel.puts
0.00 0.00 0.00 1 0.00 0.00 Array#each
0.00 0.01 0.00 1 0.00 10.00 #toplevel
Or you can just pass in an option on the command line:
$ ruby -r profile myscript.rb
If you want finer control over what to profile, take a look at the ruby-prof library.
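For instance, a minimal sketch (assuming the gem ships its usual ruby-prof command-line wrapper):
gem install ruby-prof
ruby-prof myscript.rb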
I'm a sucker for brute force.
require 'fileutils'

File.open(path, 'r') do |file|
...
end
FileUtils.mv(path, path + '.hidden') # Temporary; File has no mv method, so use FileUtils
If the code tries to open the file twice, it won't find it the second time. After the test, you can reverse the rename with a shell one-liner. In Bash (and probably in other *nix shells as well):
for i in *.hidden ; do mv "$i" "${i%.hidden}" ; done
