Parsing the output of Bash's time builtin - bash

I'm running a C program from a Bash script, and running it through a command called time, which outputs some time statistics for the running of the algorithm.
If I were to perform the command
time $ALGORITHM $VALUE $FILENAME
It produces the output:
real 0m0.435s
user 0m0.430s
sys 0m0.003s
The values depend on how the algorithm runs.
However, what I would like to be able to do is to take the 0.435 and assign it to a variable.
I've read into awk a bit, enough to know that if I pipe the above command into awk, I should be able to grab the 0.435 and place it in a variable. But how do I do that?
Many thanks

You must be careful: there's the Bash time (strictly a reserved word, though commonly called a builtin) and there's the external command time, usually located in /usr/bin/time (run type -a time to list every time available on your system).
If your shell is Bash, when you issue
time stuff
you're calling the builtin time. You can't directly catch the output of time without some minor trickery. This is because time doesn't want to interfere with possible redirections or pipes you'll perform, and that's a good thing.
To get time output on standard out, you need:
{ time stuff; } 2>&1
(grouping and redirection).
Now, about parsing the output: parsing the output of a command is usually a bad idea, especially when it's possible to do without. Fortunately, Bash's time command accepts a format string. From the manual:
TIMEFORMAT
The value of this parameter is used as a format string specifying how the timing information for pipelines prefixed with the time reserved word should be displayed. The % character introduces an escape sequence that is expanded to a time value or other information. The escape sequences and their meanings are as follows; the braces denote optional portions.
%%
A literal `%`.
%[p][l]R
The elapsed time in seconds.
%[p][l]U
The number of CPU seconds spent in user mode.
%[p][l]S
The number of CPU seconds spent in system mode.
%P
The CPU percentage, computed as (%U + %S) / %R.
The optional p is a digit specifying the precision, the number of fractional digits after a decimal point. A value of 0 causes no decimal point or fraction to be output. At most three places after the decimal point may be specified; values of p greater than 3 are changed to 3. If p is not specified, the value 3 is used.
The optional l specifies a longer format, including minutes, of the form MMmSS.FFs. The value of p determines whether or not the fraction is included.
If this variable is not set, Bash acts as if it had the value
$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'
If the value is null, no timing information is displayed. A trailing newline is added when the format string is displayed.
So, to fully achieve what you want:
var=$(TIMEFORMAT='%R'; { time $ALGORITHM $VALUE $FILENAME; } 2>&1)
As @glennjackman points out, if your command sends any messages to standard output or standard error, you must take care of that too. For that, some extra plumbing is necessary:
exec 3>&1 4>&2
var=$(TIMEFORMAT='%R'; { time $ALGORITHM $VALUE $FILENAME 1>&3 2>&4; } 2>&1)
exec 3>&- 4>&-
Source: BashFAQ032 on the wonderful Greg's wiki.
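The plumbing above can be wrapped in a small helper. This is only a sketch: the function name measure and the variable elapsed are made up for illustration, and the saved descriptors are scoped to a brace group rather than opened with exec, which should behave the same way for this purpose.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with its stdout/stderr left attached to
# the caller's, and store the elapsed wall-clock seconds in the global
# variable `elapsed` (an illustrative name, not from the original answer).
measure() {
    local TIMEFORMAT='%R'    # print only the elapsed seconds, e.g. 0.435
    # fds 3 and 4 are copies of the caller's stdout and stderr; the command's
    # own output goes there, so only the report from `time` is captured.
    { elapsed=$( { time "$@" 1>&3 2>&4; } 2>&1 ); } 3>&1 4>&2
}

measure sleep 0.2
echo "elapsed: ${elapsed}s"
```

Because TIMEFORMAT is declared local, the caller's own time format is left untouched.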

You could try the awk command below, which uses the split function to split the second field on a digit followed by m, or on the trailing s.
$ foo=$(awk '/^real/{split($2,a,"[0-9]m|s$"); print a[2]}' file)
$ echo "$foo"
0.435

You can use this awk:
var=$(awk '$1=="real"{gsub(/^[0-9]+[hms]|[hms]$/, "", $2); print $2}' file)
echo "$var"
0.435

Related

Idiomatic use of process substitution

I have learned Bash process substitution from Bash's man page. Unfortunately, my unskilled usage of the feature is ugly.
DEV=<(some commands that produce lines of data) && {
while read -u ${DEV##*/} FIELD1 FIELD2 FIELD3; do
some commands that consume the fields of a single line of data
done
}
Do skilled programmers have other ways to do this?
If an executable sample is desired, try this:
DEV=<(echo -ne "Cincinnati Hamilton Ohio\nAtlanta Fulton Georgia\n") && {
while read -u ${DEV##*/} FIELD1 FIELD2 FIELD3; do
echo "$FIELD1 lies in $FIELD2 County, $FIELD3."
done
}
Sample output:
Cincinnati lies in Hamilton County, Ohio.
Atlanta lies in Fulton County, Georgia.
In my actual application, the "some commands" are more complicated, but the above sample captures the essence of the question.
Process substitution <() is required. Alternatives to process substitution would not help.
Redirect into the loop's stdin with the operator <.
while read city county state; do
echo "$city lies in $county County, $state."
done < <(echo -ne "Cincinnati Hamilton Ohio\nAtlanta Fulton Georgia\n")
Output:
Cincinnati lies in Hamilton County, Ohio.
Atlanta lies in Fulton County, Georgia.
Note that in this example, a pipe works just as well.
echo -ne "Cincinnati Hamilton Ohio\nAtlanta Fulton Georgia\n" |
while read city county state
do
echo "$city lies in $county County, $state."
done
Also, uppercase variable names should be reserved for environment variables (like PATH) and other special variables (like RANDOM). And descriptive variable names are always good.
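One difference between the two forms matters as soon as the loop has to hand a value back: with a pipe, Bash runs the loop in a subshell (unless the lastpipe option is in effect), so variables set inside it are lost, while with process substitution the loop runs in the current shell. A minimal sketch, with made-up variable names:

```shell
#!/usr/bin/env bash
# Count lines with a pipe: the loop runs in a subshell, so the count is lost.
piped=0
printf 'a\nb\nc\n' | while read -r line; do piped=$((piped + 1)); done

# Count lines with process substitution: the loop runs in the current shell.
substituted=0
while read -r line; do substituted=$((substituted + 1)); done < <(printf 'a\nb\nc\n')

echo "pipe: $piped, process substitution: $substituted"
```

With default shell options this prints pipe: 0, process substitution: 3.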
There are a few alternatives that will be portable. The 'right' choice depends on the specific case, in particular on how long it takes to produce the input data and on the size of the input:
If it takes a lot of time to process the data, you want the data generation and the 'while' loop to run in parallel. This gives incremental processing: output processing can start without waiting for all the input to be produced.
If the input is very large (and does not fit into a shell variable), you might have no choice but to force an actual pipe. This is also true when the data is binary, Unicode, or similar, where a bash variable will not work.
Mapping to the original question: PRODUCE is echo Cincinnati ..., and CONSUME is echo "$city ...".
For the trivial case (small input, fast produce/consume), the following will work. Bash will run them SEQUENTIALLY: PRODUCE, then CONSUME.
while read ... ; do
CONSUME
done <<< "$(PRODUCE)"
For the complex case (large input, or slow produce and consume), the following can be used to request PARALLEL execution:
while read ... ; do
CONSUME
done < <(PRODUCE)
If the PRODUCE code is complex (loops, conditionals, etc.) or long (multiple lines), consider moving it into a function instead of inlining it into the loop command.
function produce {
PRODUCE
}
while read ... ; do
CONSUME
done < <(produce)

Sum time output from processes (bash)

I've made a script which measures the time of some processes. This is the file that I get:
real 0m6.768s
real 0m5.719s
real 0m5.173s
real 0m4.245s
real 0m5.257s
real 0m5.479s
real 0m6.446s
real 0m5.418s
real 0m5.654s
The command I use to get the time is this one:
{ time my-command; } |& grep real >> times.txt
What I need is to sum all these times and get the result as (hours, if applicable) minutes and seconds, using a bash script.
From man bash; if your PAGER is less, search with /time:
If the time reserved word precedes a pipeline, the elapsed as well as user and system time consumed by its execution are reported when the pipeline terminates. The -p option changes the output format to that specified by POSIX. The TIMEFORMAT variable may be set to a format string that specifies how the timing information should be displayed; see the description of TIMEFORMAT under Shell Variables below.
Then search with /TIMEFORMAT:
The optional l specifies a longer format, including minutes, of the form MMmSS.FFs. The value of p determines whether or not the fraction is included.
If this variable is not set, bash acts as if it had the value $'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'. If the value is null, no timing information is displayed. A trailing newline is added when the format string is displayed.
If it can be changed to something like
TIMEFORMAT=$'\nreal\t%3R'
without the l, it may be easier to sum.
Note also that the format may depend on the locale (LANG), for example in the decimal separator; compare
(LANG=fr_FR.UTF-8; time sleep 1)
and
(LANG=C; time sleep 1)
With the format changed to omit the l, the sum can be done with an external tool like awk
awk '/^real/ {sum+=$2} END{print sum} ' times.txt
or perl
perl -aln -e '$sum+=$F[1] if /^real/; END{print $sum}' times.txt
Pipe the output to this command:
grep real | awk '{ gsub("m","*60+",$2); gsub("s","+",$2); printf("%s",$2); } END { printf("0\n"); }' | bc
This should work if you have generated the output using the built-in time command. The output is in seconds.
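The answers above give the total in seconds; the question also asks for hours and minutes. A possible sketch that parses the default MMmSS.FFFs form and formats the total itself is shown below. The function name sum_real_times and the output layout are assumptions, and it expects a period as the decimal separator.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum lines such as "real 0m6.768s" and print the total
# as [Hh]MmS.SSSs. Reads the files given as arguments, or stdin if none.
sum_real_times() {
    awk '/^real/ {
             split($2, part, /m|s/)    # "0m6.768s" -> part[1]=0, part[2]=6.768
             total += part[1] * 60 + part[2]
         }
         END {
             hours   = int(total / 3600)
             minutes = int((total - hours * 3600) / 60)
             seconds = total - hours * 3600 - minutes * 60
             if (hours > 0) printf "%dh", hours
             printf "%dm%.3fs\n", minutes, seconds
         }' "$@"
}
```

Usage would be sum_real_times times.txt; for the nine sample lines in the question this should print 0m50.159s.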

conditional statements, arithmetic operation and output redirection in Makefiles

I have two registers, reg_a and reg_b, each 32 bits wide. reg_a is used to store the epoch time (Unix time), so it can go up to a maximum of 2^32 - 1. If an overflow occurs, the overflow should be stored in reg_b. I also want to write them to a file, rom.txt. I am trying to do this in a Makefile. This is how far I got (it is more of a pseudocode; there are syntax errors). I would be happy to know if there is a better way to do this.
# should be 2^32-1, but lets consider the below for example
EPOCH_MAX = 1500000000
identifier:
# get the epoch value
epoch=$(shell date +%s)
# initialize the reg_a, reg_b, assuming that overflow has not occurred
reg_a=$(epoch)
reg_b=0
# if overflow occurs
if [ ${epoch} -gt $(EPOCH_MAX)] ; then \
reg_a=$(EPOCH_MAX) ;\
reg_b=$(shell $(epoch)\-$(EPOCH_MAX) | bc) ;\
fi ;\
# here I want to print the values in a text file
echo $$(reg_a) > rom.txt
echo $$(reg_b) >> rom.txt
I am a novice with Makefiles. The above is just a sort of pseudocode showing what I want to do (mostly pieced together from reading some webpages). I would be happy if someone could help me with it. Thanks.
You've been asking a lot of questions about make. I think you might benefit from spending some time reading the GNU make manual.
Pertinent to your question, each logical line in a recipe is run in a separate shell. So, you cannot set a shell variable in one logical line, then use the results in another one.
A "logical line" is all the physical lines where the previous one ends in backslash/newline.
So:
identifier:
# get the epoch value
epoch=$(shell date +%s)
# initialize the reg_a, reg_b, assuming that overflow has not occurred
reg_a=$(epoch)
reg_b=0
Will run 5 separate shells, one for each line (including the comments! Every line indented with a TAB character is considered a recipe line, even ones that begin with comments).
On the other hand, this:
if [ ${epoch} -gt $(EPOCH_MAX)] ; then \
reg_a=$(EPOCH_MAX) ;\
reg_b=$(shell $(epoch)\-$(EPOCH_MAX) | bc) ;\
fi ;\
Runs the entire if-statement in a single shell, because the backslash/newline pairs create a single logical line.
Second, you have to keep very clear in your mind the difference between make variables and shell variables. In the above, the line epoch=$(shell date +%s) is setting the shell variable epoch (whose value is immediately lost again when the shell exits).
The line reg_a=$(epoch) is referencing the make variable epoch, which is not set and so is empty.
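One way to sidestep both problems is to keep the whole computation in shell, so that every variable lives in the same shell. The sketch below is plain Bash rather than a recipe, and the function name split_epoch is made up; to use the same logic inside a recipe, it would have to form a single logical line (backslash-continued), with each shell $ written as $$.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split an epoch value into reg_a (capped at epoch_max)
# and reg_b (the overflow), printing one value per line.
split_epoch() {
    local epoch=$1 epoch_max=$2
    local reg_a=$epoch reg_b=0
    if [ "$epoch" -gt "$epoch_max" ]; then
        reg_a=$epoch_max
        reg_b=$((epoch - epoch_max))
    fi
    printf '%s\n%s\n' "$reg_a" "$reg_b"
}

# 1500000000 is the example limit from the question, not 2^32-1.
split_epoch "$(date +%s)" 1500000000 > rom.txt
```

Shell arithmetic ($((...))) replaces the call to bc here, since the values are integers.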

Readable output for tracking runtime

I want to have a proper output style using /usr/bin/time and when I try something like
/usr/bin/time -f'time=%E' ls > /dev/null
the output is
time=0:00.05
where the 5 says 5 centiseconds.
If my command/script runs a longer time, the output will be e.g.:
time=1:30:05
where the 5 says 5 seconds.
I wanted the output described in man time:
The format string
The format is interpreted in the usual printf-like way. Ordinary characters are directly copied, tab, newline and backslash are escaped using \t, \n and \\, a percent sign is represented by %%, and otherwise % indicates a conversion. The program time will always add a trailing newline itself. The conversions follow. All of those used by tcsh(1) are supported.
Time
%E Elapsed real time (in [hours:]minutes:seconds).
So I don't want to have those confusing centiseconds. The format should be logical and easily readable without using additional scripts like sed. When I have a log for several commands, the output should be something like:
time=0:00:01
time=3:30:12
time=0:10:01
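One possible approach, sketched here under the assumption that whole-second resolution is enough, swaps /usr/bin/time for Bash's own SECONDS counter and formats the result by hand. The function names format_hms and timed are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format a whole number of seconds as time=H:MM:SS.
format_hms() {
    local total=$1
    printf 'time=%d:%02d:%02d\n' \
        $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

# Hypothetical wrapper: run a command, then report its elapsed time on stderr
# (as time does), using Bash's SECONDS counter (whole seconds only).
timed() {
    local start=$SECONDS status
    "$@"
    status=$?
    format_hms $((SECONDS - start)) >&2
    return "$status"
}
```

For example, timed ls > /dev/null keeps the timing line out of the redirected output, and format_hms 12612 prints time=3:30:12.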

Evaluating a mathematical expression stored as a string, into a single number (bash)

I am working on Mac OSX and using bash as my shell. I currently have a string which I want evaluated as a number. When I echo the string I get 1.e8*1.07**100. Is there any way to pass this string on to be evaluated as a number?
The background as to why it is a string to start with is that the expression was built step by step. First 1.e8*1.07**%%d is within the code, then the user inputs an integer to be taken as the power 1.07 will be raised to. So in the example above, the user would have input 100, and thus the script is stuck with 1.e8*1.07**100, which is the correct expression I was hoping for, but I would have liked it to be evaluated when I echo the variable where it is stored.
Actual important bits of code:
BASE=$(printf '1.e8*1.07**%%d')
#Get user input assigned to pow
NUM=$(printf ${BASE} ${pow})
echo $NUM #1.e8*1.07**100
Thanks for any help you can offer.
[Edit: I would also like to not just echo the answer, but store it as a variable.]
How about:
python -c "print($NUM)"
By the way, you could just write
BASE="1.e8*1.07**%d"
(In fact, you don't even need the quotes.)
On most Unix-like systems you'll find a tool called bc that can perform calculations. You might need to rewrite your input though; I think it accepts ^ instead of **, and I'm not sure about the 1.e8 notation.
It happens that perl can evaluate that exact expression
$ x="1.e8*1.07**100"
$ y=$(perl -E "say $x")
$ echo $y
86771632556.6417

Resources