Script for finding average runtime of a program - shell

I found partial solutions on several sites, so I pulled several parts together, but I still couldn't figure it out.
Here is what I am doing:
I am running a simple java program from Terminal, and need to find the average runtime for the program.
What I am doing is running the command several times, finding the total time, and then dividing that total time by the number of times I ran the program.
I would also like to acquire the output of the program rather than displaying it on standard output.
Here is my current code and the output.
Shell Script:
startTime=$(date +%s%N)
for ((i = 0; i < $runTimes; i++))
do
java Program test.txt > /dev/null
done
endTime=$(date +%s%N)
timeDiff=$(( $endTime - $startTime ))
timeAvg=$(( $timeDiff / $runTimes ))
echo "Avg Time Taken: "
echo $timeAvg
Output:
./run: line 12: 1305249784N: value too great for base (error token is "1305249784N")
The line number 12 is off because this code is part of a larger file.
The line number 12 is the line with timeDiff being evaluated.
I appreciate any help, and apologize if this question is redundant or off-topic.

On my machine, I don't see what the %N format for date is getting you, as the value seems to be 7 zeros, BUT it is making a much bigger number to evaluate in the math, i.e. 1305250833570000000. Do you really need nano-second precision? I'll bet if you go with just %s it will be fine.
Otherwise you look to be on the right track.
P.S.
Oh yeah, minor point,
echo "Avg Time Taken: $timeAvg"
is a simpler way to achieve your required output ;-)
Option 2. You could take out the date calculations altogether and turn your loop into a small script. Then you can use a built-in feature of the shell:
time myJavaTest.sh
Will give you details like
real 0m0.049s
user 0m0.016s
sys 0m0.015s
I hope this helps.
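To make the first suggestion concrete, here is a minimal sketch that sticks to plain %s and wraps the loop in a function. The name avg_runtime and its arguments are made up for illustration. Because date +%s only has one-second resolution, the printed average (in milliseconds) is only accurate to roughly 1000/RUNS ms, so use enough runs.

```shell
# avg_runtime RUNS CMD...: run CMD RUNS times, discard its standard output,
# and print the mean wall-clock time per run in milliseconds.
# (Hypothetical helper; resolution is limited by date +%s, i.e. 1 s / RUNS.)
avg_runtime() {
    runs=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $(( (end - start) * 1000 / runs ))
}

# Hypothetical usage, matching the question:
#   echo "Avg Time Taken: $(avg_runtime 100 java Program test.txt) ms"
```

If you want to keep the program's output instead of discarding it, redirecting to a file inside the loop rather than to /dev/null should do.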

Related

Calculating days left to a specific date

I'm having trouble figuring out a way to calculate amount of days left to a specific date (passed as an argument). I tried this, however it doesn't even work correctly with any date of 2021.
days=$[$(date +%j -d $1)-$(date +%j $now)];
if (( $days < 0 ))
then
echo "error";
exit 1;
fi
echo "Theres" $days "left to this date.";
Does anyone have an idea on how I could fix it?
There are a few problems I see.
I don't recognize the $[...] syntax and can't find a shell that does (see the comments, below, where user @KamilCuk explains this further).
The if syntax is wrong, possibly swapped with the previous line.
Because of the parentheses, what's read as a less-than sign (<) is going to try to redirect input to a program.
The answer will always print out.
As you point out, there's no chance of a Julian day working with different years.
Try something like this, instead.
#!/bin/sh
# Split up the dates for easier debugging, and use +%s to get "UNIX time": seconds since the start of 1970.
now=$(date +%s)
target=$(date +%s -d "$1")
days=$((target - now))
# Fix the conditional syntax.
if [ "$days" -lt 0 ]
then
echo error
exit 1
# Put the output into an else clause.
else
# Since we have the answer in seconds, divide by the number of seconds in a typical day.
days=$((days / 86400))
echo "There are $days days left to this date."
fi
I also cleaned up the echo syntax for clarity.
Note that this still isn't perfect. Depending on your definition of "one day," there are going to be cases where the answer from this script differs from what you want; in that case, you'll need to adjust $target to match a particular time of day. In addition, not every day is 86400 seconds long, because of daylight saving time and leap seconds. But with those caveats, it should work well enough, and handling those cases in a script sounds like more work than "how many days?" should warrant.
If you want to see the steps it takes for debugging, run it with sh -x date.sh '2022-12-31' (with your script's name and date), since the -x argument tells the shell to give you a "trace" of intermediate steps.
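If the time-of-day caveat matters, one way around it is to compare both dates at midnight UTC, where every day really is 86400 seconds long. The following is only a sketch: days_until is a made-up name, and it assumes GNU date (for -d).

```shell
# days_until TARGET [FROM]: print the whole days from FROM (default: today's
# local date) to TARGET. Both dates are interpreted as midnight UTC, so
# daylight saving changes cannot skew the result. Assumes GNU date (-d).
days_until() {
    target=$(date -u -d "$1" +%s) || return 1
    from=$(date -u -d "${2:-$(date +%F)}" +%s) || return 1
    echo $(( (target - from) / 86400 ))
}

# Hypothetical usage:
#   days_until 2022-12-31              # days left from today
#   days_until 2022-12-31 2022-12-01   # days between two given dates
```

A negative result means the target date is already in the past, which the caller can turn into the "error" branch.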

Unique Linux filename, sortable by time

Previously I was using uuidgen to create unique filenames that I then need to iterate over by date/time via a bash script. I've since found that simply looping over said files via 'ls -l' will not suffice, because evidently I can only trust the OS to keep timestamp resolution in seconds (the nanoseconds field is all zeros when viewing files via stat on this particular filesystem and kernel).
So I then thought maybe I could just use something like date +%s%N for my filename. This will print the seconds since 1970 followed by the current nanoseconds.
I'm possibly over-engineering this at this point, but these are files generated on high-usage enterprise systems so I don't really want to simply trust the nanosecond timestamp on the (admittedly very small) chance two files are generated in the same nanosecond and we get a collision.
I believe the uuidgen script has logic baked in to handle this occurrence so it's still guaranteed to be unique in that case (correct me if I'm wrong there... I read that someplace I think but the googles are failing me right now).
So... I'm considering something like
FILENAME=`date +%s`-`uuidgen -t`
echo $FILENAME
to ensure I create a unique filename that can then be iterated over with a simple 'ls' and whose name can be trusted to be both unique and sequential by time.
Any better ideas or flaws with this direction?
If you order your date format by year, month (zero padded), day (zero padded), hour (zero padded), minute (zero padded), then you can sort by time easily:
FILENAME=`date '+%Y-%m-%d-%H-%M'`-`uuidgen -t`
echo $FILENAME
or
FILENAME=`date '+%Y-%m-%d-%H-%M'`-`uuidgen -t | head -c 5`
echo $FILENAME
Which would give you:
2015-02-23-08-37-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
or
2015-02-23-08-37-xxxxx
# the same as above, but shorter unique string
You can choose other delimiters for the date/time besides - as you wish, as long as they're within the valid characters for Linux file name.
You will need %N for precision (nanoseconds):
filename=$(date +%s.%N)_$(uuidgen -t); echo $filename
1424699882.086602550_fb575f02-bb63-11e4-ac75-8ca982a9f0aa
BTW if you use %N and you're not using multiple threads, it should be unique enough.
You could take what TIAGO said about %N precision, and combine it with taskset
You can find some info here: http://manpages.ubuntu.com/manpages/hardy/man1/taskset.1.html
and then run your script
taskset --cpu-list 1 my_script
Never tested this, but, it should run your script only on the first core of your CPU. I'm thinking that if your script runs on your first CPU core, combined with date %N (nanoseconds) + uuidgen there's no way you can get duplicate filenames.
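For comparison, here is a different technique from the uuidgen ones above: let mktemp pick the random suffix. mktemp creates the file itself and fails rather than reuse an existing name, so uniqueness within one directory does not depend on the clock at all. This is a sketch under assumptions: new_stamped_file is a made-up name, and files created within the same second still sort in arbitrary order relative to each other.

```shell
# new_stamped_file DIR: create an empty, uniquely named file in DIR and print
# its path. The name starts with a zero-padded UTC timestamp, so a plain `ls`
# lists files in order of creation second; mktemp replaces the X's with
# random characters and refuses to reuse an existing name.
new_stamped_file() {
    mktemp "$1/$(date -u +%Y%m%dT%H%M%S)-XXXXXXXX"
}

# Hypothetical usage:
#   FILENAME=$(new_stamped_file /var/spool/myapp)
#   echo "$FILENAME"
```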

Parsing the output of Bash's time builtin

I'm running a C program from a Bash script, and running it through a command called time, which outputs some time statistics for the running of the algorithm.
If I were to perform the command
time $ALGORITHM $VALUE $FILENAME
It produces the output:
real 0m0.435s
user 0m0.430s
sys 0m0.003s
The values depend on the particular run of the algorithm.
However, what I would like to be able to do is to take the 0.435 and assign it to a variable.
I've read into awk a bit, enough to know that if I pipe the above command into awk, I should be able to grab the 0.435 and place it in a variable. But how do I do that?
Many thanks
You must be careful: there's the Bash builtin time and there's the external command time, usually located in /usr/bin/time (type type -a time to have all the available times on your system).
If your shell is Bash, when you issue
time stuff
you're calling the builtin time. You can't directly catch the output of time without some minor trickery. This is because time doesn't want to interfere with possible redirections or pipes you'll perform, and that's a good thing.
To get time output on standard out, you need:
{ time stuff; } 2>&1
(grouping and redirection).
Now, about parsing the output: parsing the output of a command is usually a bad idea, especially when it's possible to do without. Fortunately, Bash's time command accepts a format string. From the manual:
TIMEFORMAT
The value of this parameter is used as a format string specifying how the timing information for pipelines prefixed with the time reserved word should be displayed. The % character introduces an escape sequence that is expanded to a time value or other information. The escape sequences and their meanings are as follows; the braces denote optional portions.
%%
A literal `%`.
%[p][l]R
The elapsed time in seconds.
%[p][l]U
The number of CPU seconds spent in user mode.
%[p][l]S
The number of CPU seconds spent in system mode.
%P
The CPU percentage, computed as (%U + %S) / %R.
The optional p is a digit specifying the precision, the number of fractional digits after a decimal point. A value of 0 causes no decimal point or fraction to be output. At most three places after the decimal point may be specified; values of p greater than 3 are changed to 3. If p is not specified, the value 3 is used.
The optional l specifies a longer format, including minutes, of the form MMmSS.FFs. The value of p determines whether or not the fraction is included.
If this variable is not set, Bash acts as if it had the value
$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'
If the value is null, no timing information is displayed. A trailing newline is added when the format string is displayed.
So, to fully achieve what you want:
var=$(TIMEFORMAT='%R'; { time $ALGORITHM $VALUE $FILENAME; } 2>&1)
As @glennjackman points out, if your command sends any messages to standard output and standard error, you must take care of that too. For that, some extra plumbing is necessary:
exec 3>&1 4>&2
var=$(TIMEFORMAT='%R'; { time $ALGORITHM $VALUE $FILENAME 1>&3 2>&4; } 2>&1)
exec 3>&- 4>&-
Source: BashFAQ032 on the wonderful Greg's wiki.
You could try the awk command below, which uses the split function to split the second field on a digit followed by m, or on the trailing s.
$ foo=$(awk '/^real/{split($2,a,"[0-9]m|s$"); print a[2]}' file)
$ echo "$foo"
0.435
You can use this awk:
var=$(awk '$1=="real"{gsub(/^[0-9]+[hms]|[hms]$/, "", $2); print $2}' file)
echo "$var"
0.435
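Both awk one-liners above simply drop the minutes part, which is fine while the run takes under a minute. If longer runs are possible, a variant that folds the minutes into the result might look like this (a sketch; real_seconds is a made-up name, and it reads the default time output on standard input):

```shell
# real_seconds: read `time` output on stdin and print the "real" value
# converted to seconds, e.g. "real 1m2.500s" becomes 62.500.
real_seconds() {
    awk '$1 == "real" { split($2, t, /[ms]/); printf "%.3f\n", t[1] * 60 + t[2] }'
}

# Hypothetical usage in Bash, capturing the builtin's output from stderr:
#   var=$( { time $ALGORITHM $VALUE $FILENAME > /dev/null 2>&1; } 2>&1 | real_seconds )
```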

conditional statements, arithmetic operation and output redirection in Makefiles

I have two registers, reg_a and reg_b, each 32 bits wide. reg_a is used to store the epoch time (Unix time), so it can go up to a maximum of 2^32 - 1. If an overflow occurs, the overflow should be stored in reg_b. I also want to write both values to a file, rom.txt. I am trying to do this in a Makefile. This is how far I got (it is more of a pseudocode; there are syntax errors). I would be happy to know if there is a better way to do this.
# should be 2^32-1, but lets consider the below for example
EPOCH_MAX = 1500000000
identifier:
# get the epoch value
epoch=$(shell date +%s)
# initialize the reg_a, reg_b, assuming that overflow has not occurred
reg_a=$(epoch)
reg_b=0
# if overflow occurs
if [ ${epoch} -gt $(EPOCH_MAX)] ; then \
reg_a=$(EPOCH_MAX) ;\
reg_b=$(shell $(epoch)\-$(EPOCH_MAX) | bc) ;\
fi ;\
# here I want to print the values in a text file
echo $$(reg_a) > rom.txt
echo $$(reg_b) >> rom.txt
I am novice to Makefiles. The above is just a sort of pseudocode which tells what I want to do (Mostly through reading some webpages). I will be happy if someone can help me with the above. Thanks.
You've been asking a lot of questions about make. I think you might benefit from spending some time reading the GNU make manual.
Pertinent to your question, each logical line in a recipe is run in a separate shell. So, you cannot set a shell variable in one logical line, then use the results in another one.
A "logical line" is all the physical lines where the previous one ends in backslash/newline.
So:
identifier:
# get the epoch value
epoch=$(shell date +%s)
# initialize the reg_a, reg_b, assuming that overflow has not occurred
reg_a=$(epoch)
reg_b=0
Will run 5 separate shells, one for each line (including the comments! Every line indented with a TAB character is considered a recipe line, even ones that begin with comments).
On the other hand, this:
if [ ${epoch} -gt $(EPOCH_MAX)] ; then \
reg_a=$(EPOCH_MAX) ;\
reg_b=$(shell $(epoch)\-$(EPOCH_MAX) | bc) ;\
fi ;\
Runs the entire if-statement in a single shell, because the backslash/newline pairs create a single logical line.
Second, you have to keep very clear in your mind the difference between make variables and shell variables. In the above, the line epoch=$(shell date +%s) is setting the shell variable epoch (whose value is immediately lost again when the shell exits).
The line reg_a=$(epoch) is referencing the make variable epoch, which is not set and so is empty.
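One way to sidestep both problems is to keep the whole calculation in shell, so that it runs in a single shell and uses only shell variables. Here is a sketch of that logic as a plain shell function (split_epoch is a made-up name). To use it inside a recipe you would either call a script containing it, or join the lines with backslashes and write each shell $ as $$.

```shell
# split_epoch EPOCH MAX: print reg_a on the first line and reg_b on the second.
# reg_a saturates at MAX; reg_b holds whatever exceeds MAX (0 if nothing does).
split_epoch() {
    epoch=$1
    max=$2
    if [ "$epoch" -gt "$max" ]; then
        reg_a=$max
        reg_b=$((epoch - max))
    else
        reg_a=$epoch
        reg_b=0
    fi
    printf '%s\n%s\n' "$reg_a" "$reg_b"
}

# Hypothetical usage, with the example limit from the question:
#   split_epoch "$(date +%s)" 1500000000 > rom.txt
```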

What performance increases can we expect as the Perl 6 implementations mature?

Each time I have downloaded a new copy of Rakudo Perl 6, I have run the following expression just to get an idea of its current performance:
say [+] 1 .. 100000;
And the speeds have been increasing, but each time, there is a noticeable delay (several seconds) for the calculation. As a comparison, something like this in Perl 5 (or other interpreted languages) returns almost instantly:
use List::Util 'sum';
print sum(1 .. 100000), "\n";
or in Ruby (also nearly instant):
(1 .. 100000).inject(0) {|sum,x| sum+x}
Rewriting the expression as a Perl6 loop ends up being about twice as fast as reducing the range, but it is still a very noticeable delay (more than a second) for the simple calculation:
my $sum;
loop (my $x = 1; $x <= 100000; $x++) {$sum += $x}
So my question is, what aspects of the Perl6 implementation are causing these performance issues? And should this improve with time, or is this overhead an unfortunate side effect of the "everything is an object" model that Perl6 is using?
And lastly, what about the loop construct is faster than the [+] reduction operator? I would think that the loop would result in more total ops than the reduction.
EDIT:
I'd accept both moritz's and hobbs's answers if I could. That everything is being handled as a method call more directly answers why [+] is being slow, so that one gets it.
Another thing you have to understand about the lack of optimization is that it's compounded. A large portion of Rakudo is written in Perl 6. So for example the [+] operator is implemented by the method Any.reduce (called with $expression set to &infix:<+>), which has as its inner loop
for @.list {
@args.push($_);
if (@args == $arity) {
my $res = $expression.(@args[0], @args[1]);
@args = ($res);
}
}
in other words, a pure-perl implementation of reduce, which itself is being run by Rakudo. So not only is the code you can see not getting optimized, the code that you don't see that's making your code run is also not getting optimized. Even instances of the + operator are actually method calls, since although the + operator on Num is implemented by Parrot, there's nothing yet in Rakudo to recognize that you've got two Nums and optimize away the method call, so there's a full dynamic dispatch before Rakudo finds multi sub infix:<+>(Num $a, Num $b) and realizes that all it's really doing is an 'add' opcode. It's a reasonable excuse for being 100-1000x slower than Perl 5 :)
Update 8/23/2010
More information from Jonathan Worthington on the kinds of changes that need to happen with the Perl 6 object model (or at least Rakudo's conception of it) to make things fast while retaining Perl 6's "everything is method calls" nature.
Update 1/10/2019
Since I can see that this is still getting attention... over the years, Rakudo/MoarVM have gotten JIT, inlining, dynamic specialization, and tons of work by many people optimizing every part of the system. The result is that most of those method calls can be "compiled out" and have nearly zero runtime cost. Perl 6 scores hundreds or thousands of times faster on many benchmarks than it did in 2010, and in some cases it's faster than Perl 5.
In the case of the sum-to-100,000 problem that the question started with, Rakudo 2018.06 is still a bit slower than perl 5.26.2:
$ time perl -e 'use List::Util 'sum'; print sum(1 .. 100000), "\n";' >/dev/null
real 0m0.023s
user 0m0.015s
sys 0m0.008s
$ time perl6 -e 'say [+] 1 .. 100000;' >/dev/null
real 0m0.089s
user 0m0.107s
sys 0m0.022s
But if we amortize out startup cost by running the code 10,000 times, we see a different story:
$ time perl -e 'use List::Util 'sum'; for (1 .. 10000) { print sum(1 .. 100000), "\n"; }' > /dev/null
real 0m16.320s
user 0m16.317s
sys 0m0.004s
$ time perl6 -e 'for 1 .. 10000 { say [+] 1 .. 100000; }' >/dev/null
real 0m0.214s
user 0m0.245s
sys 0m0.021s
perl6 uses a few hundred more milliseconds than perl5 on startup and compilation, but then it figures out how to do the actual summation around 70 times faster.
There are really various reasons why Rakudo is so slow.
The first and maybe most important reason is that Rakudo doesn't do any optimizations yet. The current goals are more to explore new features and to become more robust. You know, they say "first make it run, then make it right, then make it fast".
The second reason is that parrot doesn't offer any JIT compilation yet, and the garbage collector isn't the fastest. There are plans for a JIT compiler, and people are working on it (the previous one was ripped out because it was i386 only and a maintenance nightmare). There are also thoughts of porting Rakudo to other VMs, but that'll surely wait till after end of July.
In the end, nobody can really tell how fast a complete, well-optimized Perl 6 implementation will be until we have one, but I do expect it to be much better than now.
BTW the case you cited [+] 1..$big_number could be made to run in O(1), because 1..$big_number returns a Range, which is introspectable. So you can use a sum formula for the [+] Range case. Again it's something that could be done, but that hasn't been done yet.
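The "sum formula" here is the usual arithmetic-series identity: the sum of first..last is (first + last) * count / 2, which takes the same time no matter how long the range is. Purely as an illustration of the arithmetic (in shell rather than Perl 6, with range_sum a made-up name, and limited to what fits in the shell's native integers):

```shell
# range_sum FIRST LAST: sum of the integers FIRST..LAST without looping,
# using the arithmetic-series formula (FIRST + LAST) * COUNT / 2.
range_sum() {
    echo $(( ($1 + $2) * ($2 - $1 + 1) / 2 ))
}

range_sum 1 100000   # prints 5000050000
```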
It certainly isn't because everything is an object, because that's true in a number of other languages too (like Ruby). There's no reason why Perl 6 would have to be magnitudes slower than other languages like Perl 5 or Ruby, but the fact is that Rakudo is not as mature as perl or CRuby. There hasn't been much speed optimization yet.
Considering that your test case is now optimized to an O(1) algorithm that returns nearly instantly, and that there seem to be several optimizations a week, I expect quite a performance improvement all around.
$ perl6 -e 'say [+] 1..10**1000; say now - INIT now'
5000000000000000000000000000000000000000000000 ...
0.007447
Even if that wasn't special-cased for ranges it is still quite a bit faster than it was.
It now does your test calculation in less than a fifth of a second.
$ perl6 -e 'say [+] (1..100000).list; say now - INIT now'
5000050000
0.13052975
I submitted these to Fefe's language competition in December 2008. wp.pugs.pl is a literal translation of the Perl 5 example; wp.rakudo.pl is far sixier. I have two programs because the two implementations cover different subsets of the spec. The build information is outdated by now. The sources:
#!/usr/bin/env pugs
# Pugs: <http://pugs.blogs.com/> <http://pugscode.org/>
# prerequisite: ghc-6.8.x, not 6.10.x
# svn co http://svn.pugscode.org/pugs/
# perl Makefile.PL
# make
# if build stops because of haskeline, do:
# $HOME/.cabal/bin/cabal update ; $HOME/.cabal/bin/cabal install haskeline
# learn more: <http://jnthn.net/papers/2008-tcpw-perl64danoob-slides.pdf>
my %words;
for =<> {
for .split {
%words{$_}++
}
}
for (sort { %words{$^b} <=> %words{$^a} }, %words.keys) {
say "$_ %words{$_}"
}
#!/usr/bin/env perl6
# Rakudo: <http://rakudo.org/> <http://www.parrot.org/download>
# svn co http://svn.perl.org/parrot/trunk parrot
# perl Configure.pl
# make perl6
# Solution contributed by Frank W. & Moritz Lenz
# <http://use.perl.org/~fw/journal/38055>
# learn more: <http://jnthn.net/papers/2008-tcpw-perl64danoob-slides.pdf>
my %words;
$*IN.lines.split(/\s+/).map: { %words{$_}++ };
for %words.pairs.sort: { $^b.value <=> $^a.value } -> $pair {
say $pair
}
These were the results in 2008:
$ time ./wp.pugs.pl < /usr/src/linux/COPYING > foo
real 0m2.529s
user 0m2.464s
sys 0m0.064s
$ time ./wp.rakudo.pl < /usr/src/linux/COPYING > foo
real 0m32.544s
user 0m1.920s
sys 0m0.248s
Today:
$ time ./wp.pugs.pl < /usr/src/linux/COPYING > foo
real 0m5.105s
user 0m4.898s
sys 0m0.096s
$ time ./wp.rakudo.pl < /usr/src/linux/COPYING > foo
Divide by zero
current instr.: '' pc -1 ((unknown file):-1)
Segmentation fault
real 0m3.236s
user 0m0.447s
sys 0m0.080s
Late additions: The crash has been dealt with at "Why do I get 'divide by zero' errors when I try to run my script with Rakudo?". The Rakudo program is inefficient; see the comments below and http://justrakudoit.wordpress.com/2010/06/30/rakudo-and-speed/.

Resources