Mac OS X Shell Script Measure Time Elapsed - macos

How can I measure the time elapsed in milliseconds in a shell script in Mac OS X?

Use the time command (manpage). This will be much cheaper than invoking ruby just to tell you elapsed time:
$ time a_command
To extract the real time from the command in bash, capture the report that time writes to stderr:
real_time=$( { time a_command >/dev/null 2>&1; } 2>&1 | awk '/^real/ {print $2}')
(where a_command can be a shell function if necessary)
This reports the value in minutes and seconds. If you want the result in milliseconds, use python (or your favourite scripting language) to run the process, with timing functions wrapped around the sub-process invocation; that way you do not incur the cost of invoking the scripting language just to get the current time.

You may use:
start_ms=$(ruby -e 'puts (Time.now.to_f * 1000).to_i')
# do some work
end_ms=$(ruby -e 'puts (Time.now.to_f * 1000).to_i')
elapsed_ms=$((end_ms - start_ms))
echo "$elapsed_ms ms passed"
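Or use date alone, with no scripting language (works in bash and zsh, but needs a date that supports %N, such as GNU date; the date shipped with Mac OS X traditionally does not):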
start_ns=$(date +%s%N)
# do some work
end_ns=$(date +%s%N)
elapsed_ms=$(((end_ns - start_ns) / 1000000))
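On systems where date does not expand %N (reportedly the case for the stock date on Mac OS X), a fallback can help. Here is a minimal sketch, assuming Perl with its core Time::HiRes module is installed. The helper name now_ms is made up for this example and does not come from the answers above:

```shell
# Print the current time in milliseconds since the Unix epoch.
now_ms() {
    local ns
    ns=$(date +%s%N)
    case $ns in
        *N)
            # %N was not expanded (e.g. BSD date), so fall back to Perl.
            perl -MTime::HiRes=time -e 'printf "%d\n", time() * 1000'
            ;;
        *)
            # GNU date: seconds and nanoseconds concatenated, so divide down.
            echo "$((ns / 1000000))"
            ;;
    esac
}

start_ms=$(now_ms)
sleep 0.2                      # the work being timed
end_ms=$(now_ms)
elapsed_ms=$((end_ms - start_ms))
echo "$elapsed_ms ms passed"
```

The fallback still costs one process per timestamp, so the measured interval includes a few milliseconds of overhead either way.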

Related

How to count number of forked (sub-?)processes

Somebody else has written (TM) some bash script that forks very many sub-processes. It needs optimization. But I'm looking for a way to measure "how bad" the problem is.
Can I / How would I get a count that says how many sub-processes were forked by this script all-in-all / recursively?
This is a simplified version of what the existing, forking code looks like - a poor man's grep:
#!/bin/bash
file=/tmp/1000lines.txt
match=$1
let cnt=0
while read line
do
cnt=`expr $cnt + 1`
lineArray[$cnt]="${line}"
done < $file
totalLines=$cnt
cnt=0
while [ $cnt -lt $totalLines ]
do
cnt=`expr $cnt + 1`
matches=`echo ${lineArray[$cnt]}|grep $match`
if [ "$matches" ] ; then
echo ${lineArray[$cnt]}
fi
done
It takes the script 20 seconds to look for $1 in 1000 lines of input. This code forks way too many sub-processes. In the real code, there are longer pipes (e.g. progA | progB | progC) operating on each line using grep, cut, awk, sed and so on.
This is a busy system with lots of other stuff going on, so a count of how many processes were forked on the entire system during the run-time of the script would be of some use to me, but I'd prefer a count of processes started by this script and descendants. And I guess I could analyze the script and count it myself, but the script is long and rather complicated, so I'd just like to instrument it with this counter for debugging, if possible.
To clarify:
I'm not looking for the number of processes under $$ at any given time (e.g. via ps), but the number of processes run during the entire life of the script.
I'm also not looking for a faster version of this particular example script (I can do that). I'm looking for a way to determine which of the 30+ scripts to optimize first to use bash built-ins.
You can count the forked processes simply by trapping the SIGCHLD signal. If you can edit the script file, you can do this:
set -o monitor # or set -m
trap "((++fork))" CHLD
The fork variable will then contain the number of forks. At the end you can print this value:
echo $fork FORKS
For a 1000 lines input file it will print:
3000 FORKS
This code forks for two reasons: once for each expr ... and once for each `echo ...|grep...`. In the reading while-loop it forks once per line read (for expr); in the processing while-loop it forks twice per line (once for expr ... and once for `echo ...|grep ...`). So for a 1000-line file it forks 3000 times.
But this is not exact! It counts only the forks done by the calling shell. There are more, because `echo ...|grep...` first forks a bash subshell to run the pipeline, and that subshell then forks twice more: once for echo and once for grep. So it is actually 3 forks, not one, which makes it 5000 forks rather than 3000.
If you need to count the forks of the forks (of the forks...) as well, or you cannot modify the bash script, or you want to do the counting from another script, a more exact solution is to use strace:
strace -fo s.log ./x.sh
It will print lines like this:
30934 execve("./x.sh", ["./x.sh"], [/* 61 vars */]) = 0
Then You need to count the unique PIDs using something like this (first number is the PID):
awk '{n[$1]}END{print length(n)}' s.log
In case of this script I got 5001 (the +1 is the PID of the original bash script).
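To make the counting step concrete, here is a small sketch that runs it on a made-up log. The three log lines are hypothetical and abbreviated, and count_pids is just an illustrative name. The awk program is a variant of the one above that avoids length() on an array, which some older awk implementations may not accept:

```shell
# Count distinct PIDs (first column) in a log written by `strace -f -o FILE`.
count_pids() {
    awk '!seen[$1]++ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical excerpt, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
30934 execve("./x.sh", ["./x.sh"], [/* 61 vars */]) = 0
30935 execve("/usr/bin/expr", ["expr", "0", "+", "1"], [/* 61 vars */]) = 0
30934 exit_group(0) = ?
EOF

pid_total=$(count_pids "$sample")
echo "$pid_total distinct PIDs"   # subtract 1 for the script itself
rm -f "$sample"
```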
COMMENTS
Actually in this case all forks can be avoided:
Instead of
cnt=`expr $cnt + 1`
Use
((++cnt))
Instead of
matches=`echo ${lineArray[$cnt]}|grep $match`
if [ "$matches" ] ; then
echo ${lineArray[$cnt]}
fi
You can use bash's internal pattern matching:
[[ ${lineArray[cnt]} =~ $match ]] && echo "${lineArray[cnt]}"
Mind that bash's =~ uses extended regular expressions (ERE), not the basic ones (BRE) that grep uses by default. So it behaves like egrep (or grep -E), not plain grep.
I assume that the defined lineArray is not pointless (otherwise in the reading loop the matching could be tested and the lineArray is not needed) and it is used for other purpose as well. In that case I may suggest a little bit shorter version:
readarray -t lineArray <infile
for line in "${lineArray[@]}"; do [[ $line =~ $match ]] && echo "$line"; done
The first line reads the complete infile into lineArray without any loop. The second line processes the array element by element.
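If the array is not needed for anything else, the match can be tested directly in the reading loop, as hinted above. A minimal bash sketch of that variant follows; the function name match_lines and its argument order are my own choices for illustration:

```shell
# Print the lines of FILE that match the extended regular expression PATTERN,
# using only bash builtins (read, [[ =~ ]], printf), so nothing is forked.
match_lines() {   # usage: match_lines PATTERN FILE
    local pattern=$1 file=$2 line
    while IFS= read -r line; do
        if [[ $line =~ $pattern ]]; then
            printf '%s\n' "$line"
        fi
    done < "$file"
}

# Example call (path taken from the question's script):
#   match_lines "$1" /tmp/1000lines.txt
```

Leaving $pattern unquoted on the right-hand side of =~ keeps it a regular expression; quoting it would turn it into a literal string match.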
MEASURES
Original script for 1000 lines (on cygwin):
$ time ./test.sh
3000 FORKS
real 0m48.725s
user 0m14.107s
sys 0m30.659s
Modified version
FORKS
real 0m0.075s
user 0m0.031s
sys 0m0.031s
Same on linux:
3000 FORKS
real 0m4.745s
user 0m1.015s
sys 0m4.396s
and
FORKS
real 0m0.028s
user 0m0.022s
sys 0m0.005s
So this version uses no fork (or clone) at all. I would suggest using it only for small (<100 KiB) files. For larger inputs grep, egrep, or awk outperforms the pure bash solution, but this should be checked with a performance test.
For a thousand lines on linux I got the following:
$ time grep Solaris infile # Solaris is not in the infile
real 0m0.001s
user 0m0.000s
sys 0m0.001s

/usr/bin/time --format output elapsed time in milliseconds

I use the /usr/bin/time program to measure the time for a command.
with the --format parameter i can format the output.
e.g.
/usr/bin/time -f "%e" ls
Is there a way to output the elapsed seconds with higher precision, or just output milliseconds instead of seconds?
In the manual of /usr/bin/time it only says something about seconds, but maybe there is a way and someone can help me...
thanks!
EDIT:
I know about the bash command "time", which uses the format in the environment variable "TIMEFORMAT". Sorry, but I don't want to change that env var; it seems too risky to me. The solution should be something that doesn't change the running system at all :)
One possibility is to use the date command:
ts=$(date +%s%N) ; my_command ; tt=$((($(date +%s%N) - $ts)/1000000)) ; echo "Time taken: $tt milliseconds"
%N should return nanoseconds, and 1 millisecond is 1000000 nanoseconds, so dividing by 1000000 gives the time taken to execute my_command in milliseconds.
NOTE that %N is supported on most systems, but not all.
For convenience I made devnull's answer into a script (I named it millisecond-time).
#!/bin/bash
ts=$(date +%s%N) ; "$@" ; tt=$((($(date +%s%N) - $ts)/1000000)) ; echo "Time taken: $tt milliseconds"
I put the script in /usr/local/bin.
Gave it execute rights chmod +x /usr/local/bin/millisecond-time.
Now I can use it like this: millisecond-time my_command
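One thing such a wrapper loses is the exit status of the timed command. A possible function variant that passes the status through and reports on stderr is sketched below. It assumes a date that supports %N (e.g. GNU date), and the name millisecond_time is only illustrative:

```shell
# Run a command, report its wall-clock time in milliseconds on stderr,
# and return the command's own exit status.
millisecond_time() {
    local ts tt status
    ts=$(date +%s%N)
    "$@"
    status=$?
    tt=$(( ($(date +%s%N) - ts) / 1000000 ))
    echo "Time taken: $tt milliseconds" >&2
    return "$status"
}

millisecond_time sleep 0.1
```

Writing the report to stderr keeps it out of any pipeline that consumes the command's stdout.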
There are a couple of things getting confused in this thread.
Bash has a built-in time command which supports a TIMEFORMAT environment variable that will let you format the output. For details on this run man bash and search for TIMEFORMAT.
There is also a standard /usr/bin/time command-line utility which supports a TIME environment variable that will let you format the output (or you can use -f or --format on the command line). For details on this run man time and search for TIME.
If you want the number of seconds the command took to run you can either use the built-in bash command (which supports a maximum precision of three decimal places):
bash# export TIMEFORMAT="%3lR"
bash# time find /etc > /dev/null
0m0.015s
Or you can use the command-line utility (which supports a maximum precision of two decimal places):
shell# export TIME="%E"
shell# /usr/bin/time find /opt/ > /dev/null
0:00.72
Neither of these variables is used by anything else, so both are safe to change.
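If changing TIMEFORMAT globally is still a concern, it can be confined to a function. The following is a sketch only, assuming bash; time_ms is a hypothetical helper, and it discards the timed command's own output:

```shell
# Print the wall-clock time of a command in whole milliseconds.
# TIMEFORMAT is declared local, so the caller's setting is left untouched.
time_ms() {
    local TIMEFORMAT=%3R secs
    # bash's time keyword reports on the shell's stderr; the braces let us capture it.
    secs=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
    secs=${secs/[.,]/}          # "0.203" -> "0203" (the separator may follow the locale)
    echo "$((10#$secs))"        # force base 10 so leading zeros are not read as octal
}

ms=$(time_ms sleep 0.2)
echo "sleep took $ms ms"
```

Since %3R gives exactly three decimal places, removing the separator yields milliseconds directly.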

Performance profiling tools for shell scripts

I'm attempting to speed up a collection of scripts that invoke subshells and do all sorts of things. I was wondering whether there are any tools available to time the execution of a shell script and its nested shells and report on which parts of the script are the most expensive.
For example, if I had a script like the following.
#!/bin/bash
echo "hello"
echo $(date)
echo "goodbye"
I would like to know how long each of the three lines took. time will only give me the total time for the script. bash -x is interesting but does not include timestamps or other timing information.
You can set PS4 to show the time and line number. Doing this doesn't require installing any utilities and works without redirecting stderr to stdout.
For this script:
#!/bin/bash -x
# Note the -x flag above, it is required for this to work
PS4='+ $(date "+%s.%N ($LINENO) ")'
for i in {0..2}
do
echo $i
done
sleep 1
echo done
The output looks like:
+ PS4='+ $(date "+%s.%N ($LINENO) ")'
+ 1291311776.108610290 (3) for i in '{0..2}'
+ 1291311776.120680354 (5) echo 0
0
+ 1291311776.133917546 (3) for i in '{0..2}'
+ 1291311776.146386339 (5) echo 1
1
+ 1291311776.158646585 (3) for i in '{0..2}'
+ 1291311776.171003138 (5) echo 2
2
+ 1291311776.183450114 (7) sleep 1
+ 1291311777.203053652 (8) echo done
done
This assumes GNU date, but you can change the output specification to anything you like or whatever matches the version of date that you use.
Note: If you have an existing script that you want to do this with without modifying it, you can do this:
PS4='+ $(date "+%s.%N ($LINENO) ")' bash -x scriptname
In Bash 5 and later, you can avoid forking date (but you get microseconds instead of nanoseconds):
PS4='+ $EPOCHREALTIME ($LINENO) '
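A trace in the "+ seconds.fraction (line) command" layout can be post-processed to rank lines by how long they took, i.e. the gap until the next traced line. Below is a rough sketch using awk and sort, fed with the sample output shown earlier in this answer. The function name slowest_lines is made up for this example:

```shell
# List the traced commands that took longest, slowest first.
slowest_lines() {   # usage: slowest_lines TRACE_FILE [COUNT]
    awk '
        $1 ~ /^[+]+$/ && $2 ~ /^[0-9]+[.][0-9]+$/ {
            ts = $2
            # The previous command lasted until this timestamp.
            if (seen) printf "%.3f %s\n", ts - prev_ts, prev_cmd
            $1 = ""; $2 = ""; sub(/^ +/, "")   # keep only "(line) command"
            prev_ts = ts; prev_cmd = $0; seen = 1
        }
    ' "$1" | LC_ALL=C sort -rn | head -n "${2:-10}"
}

trace=$(mktemp)
cat > "$trace" <<'EOF'
+ PS4='+ $(date "+%s.%N ($LINENO) ")'
+ 1291311776.108610290 (3) for i in '{0..2}'
+ 1291311776.120680354 (5) echo 0
0
+ 1291311776.133917546 (3) for i in '{0..2}'
+ 1291311776.146386339 (5) echo 1
1
+ 1291311776.158646585 (3) for i in '{0..2}'
+ 1291311776.171003138 (5) echo 2
2
+ 1291311776.183450114 (7) sleep 1
+ 1291311777.203053652 (8) echo done
done
EOF

top=$(slowest_lines "$trace" 1)
echo "$top"          # prints: 1.020 (7) sleep 1
rm -f "$trace"
```

The last traced command has no successor, so it gets no duration. Each figure also includes the cost of forking date for the next prompt, if PS4 uses date.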
You could pipe the output of running under -x through to something that timestamps each line when it is received. For example, tai64n from djb's daemontools.
As a basic example,
sh -x slow.sh 2>&1 | tai64n | tai64nlocal
This conflates stdout and stderr but it does give everything a timestamp.
You'd have to then analyze the output to find expensive lines and correlate that back to your source.
You might also conceivably find using strace helpful. For example,
strace -f -ttt -T -o /tmp/analysis.txt slow.sh
This will produce a very detailed report, with lots of timing information in /tmp/analysis.txt, but at a per-system call level, which might be too detailed.
Sounds like you want to time each echo. If echo is all that you're doing this is easy
alias echo='time echo'
If you're running other command this obviously won't be sufficient.
Updated
See enable_profiler/disable_profiler in
https://github.com/vlovich/bashrc-wrangler/blob/master/bash.d/000-setup
which is what I use now. I haven't tested it on all versions of bash specifically, but if you have the ts utility installed it works very well with low overhead.
Old
My preferred approach is below. Reason is that it supports OSX as well (which doesn't have high precision date) & runs even if you don't have bc installed.
#!/bin/bash
_profiler_check_precision() {
    if [ -z "$PROFILE_HIGH_PRECISION" ]; then
        #debug "Precision of timer is unknown"
        if which bc > /dev/null 2>&1 && date '+%s.%N' | grep -vq '\.N$'; then
            PROFILE_HIGH_PRECISION=y
        else
            PROFILE_HIGH_PRECISION=n
        fi
    fi
}
_profiler_ts() {
    _profiler_check_precision
    if [ "y" = "$PROFILE_HIGH_PRECISION" ]; then
        date '+%s.%N'
    else
        date '+%s'
    fi
}
profile_mark() {
    _PROF_START="$(_profiler_ts)"
}
profile_elapsed() {
    _profiler_check_precision
    local NOW="$(_profiler_ts)"
    local ELAPSED=
    if [ "y" = "$PROFILE_HIGH_PRECISION" ]; then
        ELAPSED="$(echo "scale=10; $NOW - $_PROF_START" | bc | sed 's/\(\.[0-9]\{0,3\}\)[0-9]*$/\1/')"
    else
        ELAPSED=$((NOW - _PROF_START))
    fi
    echo "$ELAPSED"
}
do_something() {
    local _PROF_START
    profile_mark
    sleep 10
    echo "Took $(profile_elapsed) seconds"
}
Here's a simple method that works on almost every Unix and needs no special software:
enable shell tracing, e.g. with set -x
pipe the output of the script through logger:
sh -x ./slow_script 2>&1 | logger
This writes the output to syslog, which automatically adds a time stamp to every message. If you use Linux with journald, you can get high-precision time stamps using
journalctl -o short-monotonic _COMM=logger
Many traditional syslog daemons also offer high precision time stamps (milliseconds should be sufficient for shell scripts).
Here's an example from a script that I was just profiling in this manner:
[1940949.100362] bremer root[16404]: + zcat /boot/symvers-5.3.18-57-default.gz
[1940949.111138] bremer root[16404]: + '[' -e /var/tmp/weak-modules2.OmYvUn/symvers-5.3.18-57-default ']'
[1940949.111315] bremer root[16404]: + args=(-E $tmpdir/symvers-$krel)
[1940949.111484] bremer root[16404]: ++ /usr/sbin/depmod -b / -ae -E /var/tmp/weak-modules2.OmYvUn/symvers-5.3.18-57-default 5.3.18-57>
[1940952.455272] bremer root[16404]: + output=
[1940952.455738] bremer root[16404]: + status=0
where you can see that the "depmod" command is taking a lot of time.
Copied from here:
Since I've ended up here at least twice now, I implemented a solution:
https://github.com/walles/shellprof
It runs your script, transparently clocks all lines printed, and at the end prints a top 10 list of the lines that were on screen the longest:
~/s/shellprof (master|✔) $ ./shellprof ./testcase.sh
quick
slow
quick
Timings for printed lines:
1.01s: slow
0.00s: <<<PROGRAM START>>>
0.00s: quick
0.00s: quick
~/s/shellprof (master|✔) $
I'm not aware of any shell profiling tools.
Historically one just rewrites too-slow shell scripts in Perl, Python, Ruby, or even C.
A less drastic idea would be to use a faster shell than bash. Dash and ash are available for all Unix-style systems and are typically quite a bit smaller and faster.

Custom format for time command

I'd like to use the time command in a bash script to calculate the elapsed time of the script and write that to a log file. I only need the real time, not the user and sys. Also need it in a decent format. e.g 00:00:00:00 (not like the standard output). I appreciate any advice.
The expected format is supposed to be 00:00:00.0000 (milliseconds): [hours]:[minutes]:[seconds].[milliseconds]
I already have 3 scripts. I saw an example like this:
{ time {
# section code goes here
} } 2> timing.log
But I only need the real time, not the user and sys. Also need it in a decent format. e.g 00:00:00:00 (not like the standard output).
In other words, I'd like to know how to turn the time output into something easier to process.
You could use the date command to get the current time before and after performing the work to be timed and calculate the difference like this:
#!/bin/bash
# Get time as a UNIX timestamp (seconds elapsed since Jan 1, 1970 0:00 UTC)
T="$(date +%s)"
# Do some work here
sleep 2
T="$(($(date +%s)-T))"
echo "Time in seconds: ${T}"
printf "Pretty format: %02d:%02d:%02d:%02d\n" "$((T/86400))" "$((T/3600%24))" "$((T/60%60))" "$((T%60))"
Notes:
$((...)) can be used for basic integer arithmetic in bash. Whitespace around the operators inside it is optional, so $((a - b)) and $((a-b)) are equivalent.
See also: http://tldp.org/LDP/abs/html/arithexp.html
EDIT:
Additionally, you may want to take a look at sed to search and extract substrings from the output generated by time.
EDIT:
Example for timing with milliseconds (actually nanoseconds but truncated to milliseconds here). Your version of date has to support the %N format and bash should support large numbers.
# UNIX timestamp concatenated with nanoseconds
T="$(date +%s%N)"
# Do some work here
sleep 2
# Time interval in nanoseconds
T="$(($(date +%s%N)-T))"
# Seconds
S="$((T/1000000000))"
# Milliseconds
M="$((T/1000000))"
echo "Time in nanoseconds: ${T}"
printf "Pretty format: %02d:%02d:%02d:%02d.%03d\n" "$((S/86400))" "$((S/3600%24))" "$((S/60%60))" "$((S%60))" "${M}"
DISCLAIMER:
My original version said
M="$((T%1000000000/1000000))"
but this was edited out because it apparently did not work for some people whereas the new version reportedly did. I did not approve of this because I think that you have to use the remainder only but was outvoted.
Choose whatever fits you.
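To see the difference between the two variants, take a hypothetical interval of 2.003456789 seconds. T/1000000 gives 2003, the total number of milliseconds. T%1000000000/1000000 gives 3, the milliseconds within the current second, which is what a field printed after the whole seconds needs. A small sketch in the [hours]:[minutes]:[seconds].[milliseconds] layout the question asks for, with format_ns as an illustrative name:

```shell
# Format an interval given in nanoseconds as HH:MM:SS.mmm.
format_ns() {   # usage: format_ns NANOSECONDS
    local t=$1 s ms
    s=$((t / 1000000000))                 # whole seconds
    ms=$((t % 1000000000 / 1000000))      # milliseconds left over after the whole seconds
    printf '%02d:%02d:%02d.%03d\n' "$((s / 3600))" "$((s / 60 % 60))" "$((s % 60))" "$ms"
}

format_ns 2003456789      # prints 00:00:02.003
```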
To use the Bash builtin time rather than /bin/time you can set this variable:
TIMEFORMAT='%3R'
which will output the real time that looks like this:
5.009
or
65.233
The number specifies the precision and can range from 0 to 3 (the default).
You can use:
TIMEFORMAT='%3lR'
to get output that looks like:
3m10.022s
The l (ell) gives a long format.
From the man page for time:
There may be a shell built-in called time, avoid this by specifying /usr/bin/time
You can provide a format string and one of the format options is elapsed time - e.g. %E
/usr/bin/time -f'%E' $CMD
Example:
$ /usr/bin/time -f'%E' ls /tmp/mako/
res.py res.pyc
0:00.01
Use the bash built-in variable SECONDS. Each time you reference the variable it will return the elapsed time since the script invocation.
Example:
echo "Start $SECONDS"
sleep 10
echo "Middle $SECONDS"
sleep 10
echo "End $SECONDS"
Output:
Start 0
Middle 10
End 20
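Assigning to SECONDS restarts the counter, which makes it easy to time just one section and print it in a clock-like format. A short bash sketch (whole-second resolution only):

```shell
SECONDS=0                      # restart bash's second counter
sleep 1                        # the section being timed
elapsed=$SECONDS
printf 'Elapsed: %02d:%02d:%02d\n' "$((elapsed / 3600))" "$((elapsed / 60 % 60))" "$((elapsed % 60))"
```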
Not quite sure what you are asking, have you tried (the braces and 2>&1 are needed because bash's time reports on stderr):
{ time yourscript; } 2>&1 | tail -n1 >log
Edit: ok, so you know how to get the times out and you just want to change the format. It would help if you described what format you want, but here are some things to try:
time -p script
This changes the output to one time per line in seconds with decimals. You only want the real time, not the other two so to get the number of seconds use:
{ time -p script; } 2>&1 | tail -n 3 | head -n 1
As first posted, the accepted answer's printf line ended with a stray double quote, so running it gave me this output:
# bash date.sh
Time in seconds: 51
date.sh: line 12: unexpected EOF while looking for matching `"'
date.sh: line 21: syntax error: unexpected end of file
This is how I solved the issue
#!/bin/bash
date1=$(date +%s) # seconds since the epoch at the start of the script
somecommand
date2=$(date +%s) # seconds since the epoch at the end of the script
difference=$((date2 - date1)) # elapsed seconds
date3=$(echo "scale=2 ; $difference/3600" | bc) # elapsed seconds / 3600 = elapsed hours
echo "SCRIPT TOOK $date3 HRS TO COMPLETE" # pretty output

Print execution time of a shell command

Is it possible to print the execution time of a shell command with the following combination?
root@hostname:~# "command to execute" && echo "execution time"
time is a built-in command in most shells that writes execution time information to the tty.
You could also try something like
start_time=`date +%s`
<command-to-execute>
end_time=`date +%s`
echo execution time was `expr $end_time - $start_time` s.
Or in bash:
start_time=`date +%s`
<command-to-execute> && echo run time is $(expr `date +%s` - $start_time) s
Don't forget that there is a difference between bash's builtin time (which should be called by default when you do time command) and /usr/bin/time (which should require you to call it by its full path).
The builtin time always prints to stderr, but /usr/bin/time will allow you to send time's output to a specific file, so you do not interfere with the executed command's stderr stream. Also, /usr/bin/time's format is configurable on the command line or by the environment variable TIME, whereas bash's builtin time format is only configured by the TIMEFORMAT environment variable.
$ time factor 1234567889234567891 # builtin
1234567889234567891: 142662263 8653780357
real 0m3.194s
user 0m1.596s
sys 0m0.004s
$ /usr/bin/time factor 1234567889234567891
1234567889234567891: 142662263 8653780357
1.54user 0.00system 0:02.69elapsed 57%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+215minor)pagefaults 0swaps
$ /usr/bin/time -o timed factor 1234567889234567891 # log to file `timed`
1234567889234567891: 142662263 8653780357
$ cat timed
1.56user 0.02system 0:02.49elapsed 63%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+217minor)pagefaults 0swaps
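With the bash builtin, something similar can be approximated by routing the command's stderr through a spare file descriptor, so that only the timing report lands in the log. This is a sketch rather than a drop-in replacement; timed_to_log is a hypothetical name and fd 3 is assumed to be free:

```shell
# Run a command under bash's time keyword; send the timing report to LOGFILE
# while the command's own stderr stays on the caller's stderr.
timed_to_log() {   # usage: timed_to_log LOGFILE COMMAND [ARGS...]
    local log=$1
    shift
    # 3>&2 saves the original stderr as fd 3, then 2>"$log" points stderr at the log.
    # The command writes its errors to fd 3; only time's report goes to fd 2.
    { time "$@" 2>&3; } 3>&2 2>"$log"
}

# Example call (arguments taken from the answer above):
#   timed_to_log timed factor 1234567889234567891
```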
root@hostname:~# time [command]
It also distinguishes between real time used and system time used.
For a line-by-line delta measurement, try gnomon.
It is a command line utility, a bit like moreutils's ts, to prepend timestamp information to the standard output of another command. Useful for long-running processes where you'd like a historical record of what's taking so long.
Piping anything to gnomon will prepend a timestamp to each line, indicating how long that line was the last line in the buffer--that is, how long it took the next line to appear. By default, gnomon will display the seconds elapsed between each line, but that is configurable.
Adding to @mob's answer:
Appending %N to date +%s gives us nanosecond resolution:
start=`date +%s%N`;<command>;end=`date +%s%N`;echo `expr $end - $start`
In zsh you can use
=time ...
In bash or zsh you can use
command time ...
These (by different mechanisms) force an external command to be used.
If I'm starting a long-running process like a copy or hash and I want to know later how long it took, I just do this:
$ date; sha1sum reallybigfile.txt; date
Which will result in the following output:
Tue Jun 2 21:16:03 PDT 2015
5089a8e475cc41b2672982f690e5221469390bc0 reallybigfile.txt
Tue Jun 2 21:33:54 PDT 2015
Granted, as implemented here it isn't very precise and doesn't calculate the elapsed time. But it's dirt simple and sometimes all you need.
If you are using zsh, you can have it print the time at the start and end of each command's execution. You can accomplish this by adding the following to your ~/.zshrc:
# print time before & after every command
preexec() { echo "<CMD STARTED> $(date +'[%D_%H:%M:%S]')" }
precmd() { echo "<CMD FINISHED> $(date +'[%D_%H:%M:%S]')" }
and open a new terminal window to have the changes take effect in all future terminal sessions.
Just ps -o etime= -p "<your_process_pid>"
