Is there a way to use the time reserved word in zsh to time multiple commands, without starting a subshell?
I know that this works:
{ time (
    sleep 5
    sleep 3
    PROMPT='foobar> '
) }
However, the parentheses mean that a subshell is created, and variables set inside it don't survive in the parent shell.
I know I can capture the time before and after, like:
start=$(time)
# do something
end=$(time)
echo "$end - $start" | bc
Though for ad hoc timing this is a little cumbersome.
No, time can only work on a separate process, so it won't work with { ... } or with a builtin, like:
time { ls }
time echo
Note that your method of capturing the time output won't work if there are already children (their times while the commands run will also be taken into account). Ditto if you have traps and the corresponding signals occur.
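For ad hoc timing without a subshell, one workaround (a sketch that sidesteps time entirely) is zsh's special SECONDS parameter, which can be made floating point and reset:
typeset -F SECONDS      # zsh: report SECONDS as a float
SECONDS=0               # restart the counter
sleep 5
sleep 3
PROMPT='foobar> '       # runs in the current shell, so the variable survives
print "elapsed: ${SECONDS}s"
Note this measures wall-clock time only, not CPU time.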
Not sure if this is a Perl problem, a Cygwin problem, or a Windows problem:
I'm running Perl inside Cygwin under Windows 8. We have a comprehensive set of small scripts for individual tasks, and I've recently written a top-level script which repeatedly calls several of these scripts via 'system' calls. Each script runs flawlessly on its own; however, execution only happens in chunks, i.e. the top-level script starts to operate, after about 10 seconds it stops, the computer is idle for another 10-15 seconds, then it starts again for 10 seconds, and so on. Apart from this script the PC is only running the usual Windows background processes, i.e. the top-level script is the only process causing significant CPU load.
The script is too long to show here, but it essentially consists of structures where a few variables are defined in loops and then combined via sprintf strings to call the scripts, just like the following snippet:
(...)
foreach $period (@periods)
{
    foreach $wt (@wtlist)
    {
        foreach $type ('WT', 'Ref')
        {
            $out = 1;
            $dir1 = 0*$sectorwidth;
            $dir2 = 1*$sectorwidth;
            $addfile0 = sprintf("%s/files/monthly_recal/%s%s_from%s.%s_%03d.da1", $workingdir_correl, $nameRoot, $wt, $type, $period, $dir1);
            $addfile1 = sprintf("%s/files/monthly_recal/%s%s_from%s.%s_%03d.da1", $workingdir_correl, $nameRoot, $wt, $type, $period, $dir2);
            if (-e $addfile0 && -e $addfile1)
            {
                $cmd = sprintf("perl ../00Bin/add_sort_ts.pl $addfile0 $addfile1 %s/files/monthly_recal/tmp/%s%s_from%s.%s.out%02d 0 $timeStep\n", $workingdir_correl, $nameRoot, $wt, $type, $period, $out);
                print($cmd);
                system($cmd);
            }
        }
    }
}
(...)
All variables are defined (simple strings or integers) and the individual calls are all working.
When this top-level script is running, it executes several loop iterations one after another, so I don't think it's a matter of startup delays in the called scripts. To me it looks more as if Windows denies too many system calls in a row. I have other Perl scripts without 'system' calls which run for 10 minutes without showing this intermittent behaviour.
I have no real clue where to look, so any suggestion would be appreciated. The whole execution of the top-level script can take several hours, so any improvement here would greatly improve efficiency!
--UPDATE: From the discussion on Hakon's answer below it turned out that the problem lies in the shell that is used to run the Perl scripts - intermittent operation appears when the code is run from Windows cmd or a non-login shell, but not when run explicitly from a login shell (e.g. when using bash --login or starting mintty -). I will open another thread soon to clarify why this happens... Thanks to all contributors here!
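A comparison along these lines (toplevel.pl standing in as a hypothetical name for the actual top-level script) reproduces the two cases:
time bash -c 'perl toplevel.pl'           # non-login shell: intermittent pauses
time bash --login -c 'perl toplevel.pl'   # login shell: runs continuously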
Can you try to simplify your problem a little bit? It will make the problem easier to locate. For example, the following code running on Windows 10 with Strawberry Perl 5.30 (run from CMD, not Cygwin) shows no problems:
use strict;
use warnings;
for my $i (1..10) {
    system 'cmd.exe /c worker.bat';
}
with a simple worker.bat like:
@echo off
echo %time%
timeout 5 > NUL
echo %time%
The output from running the Perl script is:
10:35:00.71
10:35:05.18
10:35:05.22
10:35:10.17
10:35:10.22
10:35:15.15
10:35:15.19
10:35:20.19
10:35:20.23
10:35:25.15
10:35:25.20
10:35:30.16
10:35:30.21
10:35:35.14
10:35:35.18
10:35:40.16
10:35:40.20
10:35:45.16
10:35:45.21
10:35:50.15
Showing no delays between the system calls.
I have a shell script which usually takes nearly 10 minutes for a single run, but I need to know what happens if another request to run the script comes in while an instance is already running: does the new request have to wait for the existing instance to complete, or is a new instance started?
I need a new instance to be started whenever a request arrives for the same script.
How do I do that?
The shell script is a polling script which looks for a file in a directory and executes it. Executing the file takes nearly 10 minutes or more, but if a new file arrives during execution, it also has to be executed simultaneously.
The shell script is below; how do I modify it to handle multiple requests?
#!/bin/bash
while [ 1 ]; do
    newfiles=`find /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -newer /afs/rch/usr$
    touch /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/.my_marker
    if [ -n "$newfiles" ]; then
        echo "found files $newfiles"
        name2=`ls /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -Art |tail -n 2 |head $
        echo " $name2 "
        mkdir -p -m 0755 /afs/rch/usr8/fsptools/WWW/dumpspace/$name2
        name1="/afs/rch/usr8/fsptools/WWW/dumpspace/fipsdumputils/fipsdumputil -e -$
        $name1
        touch /afs/rch/usr8/fsptools/WWW/dumpspace/tempfiles/$name2
    fi
    sleep 5
done
When writing scripts like the one you describe, I take one of two approaches.
First, you can use a pid file to indicate that a second copy should not run. For example:
#!/bin/sh
pidfile=/var/run/${0##*/}.pid
# remove pid if we exit normally or are terminated
trap "rm -f $pidfile" 0 1 3 15
# Write the pid as a symlink
if ! ln -s "pid=$$" "$pidfile"; then
    echo "Already running. Exiting." >&2
    exit 0
fi
# Do your stuff
I like using symlinks to store pid because writing a symlink is an atomic operation; two processes can't conflict with each other. You don't even need to check for the existence of the pid symlink, because a failure of ln clearly indicates that a pid cannot be set. That's either a permission or path problem, or it's due to the symlink already being there.
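A quick sketch of that atomicity: of two racing ln -s calls against the same path, exactly one succeeds and the other exits nonzero:
ln -s "pid=123" /tmp/demo.pid && echo "first caller: lock acquired"
ln -s "pid=456" /tmp/demo.pid || echo "second caller: already locked"
rm -f /tmp/demo.pid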
The second option is to make it possible .. nay, preferable .. not to block additional instances, and instead configure whatever it is that this script does to permit multiple servers to run at the same time on different queue entries. "Single-queue-single-server" is never as good as "single-queue-multi-server". Since you haven't included code in your question, I have no way to know whether this approach would be useful for you, but here's some explanatory meta bash:
#!/usr/bin/env bash
workdir=/var/tmp                # Set a better $workdir than this.
a=( $(get_list_of_queue_ids) )  # A command? A function? Up to you.
for qid in "${a[@]}"; do
    # Set a "lock" for this item .. or don't, and move on.
    if ! ln -s "pid=$$" "$workdir/$qid.working"; then
        continue
    fi
    # Do your stuff with just this $qid.
    ...
    # And finally, clean up after ourselves
    remove_qid_from_queue "$qid"
    rm "$workdir/$qid.working"
done
The effect of this is to transfer the idea of "one at a time" from the handler to the data. If you have a multi-CPU system, you probably have enough capacity to handle multiple queue entries at the same time.
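And if plain concurrency is all you need, the same loop can simply background each handler (handle_item being a hypothetical function wrapping the per-entry work):
for qid in "${a[@]}"; do
    handle_item "$qid" &    # each queue entry is processed concurrently
done
wait                        # block until every background handler finishes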
ghoti's answer shows some helpful techniques, if modifying the script is an option.
Generally speaking, for an existing script:
Unless you know with certainty that:
the script has no side effects other than output to the terminal or writes to files with shell-instance-specific names (such as filenames incorporating $$, the current shell's PID) or some other instance-specific location (see the sketch below),
OR that the script was explicitly designed for parallel execution,
I would assume that you cannot safely run multiple copies of the script simultaneously.
It is not reasonable to expect the average shell script to be designed for concurrent use.
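For instance, a script whose only output path incorporates $$ is safe in that respect, since concurrent runs write to distinct files (a hypothetical sketch):
out=/tmp/myscript.$$.log    # $$ is this shell's PID, unique per running instance
echo "run started at $(date)" >"$out"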
From the viewpoint of the operating system, several processes may of course execute the same program in parallel. No need to worry about this.
However, it is conceivable that a (careless) programmer wrote the program in such a way that it produces incorrect results when two copies are executed in parallel.
I am trying to write a task-runner for the command line. No rationale. Just wanted to do it. Basically it just runs a command, stores the output in a file (instead of stdout), meanwhile prints a progress indicator of sorts on stdout, and when it's all done, prints Completed ($TIME_HERE).
Here's the code:
#!/bin/bash
task() {
    TIMEFORMAT="%E"
    COMMAND=$1
    printf "\033[0;33m${2:-$COMMAND}\033[0m\n"
    while true
    do
        for i in 1 2 3 4 5
        do
            printf '.'
            sleep 0.5
        done
        printf "\b\b\b\b\b     \b\b\b\b\b"
        sleep 0.5
    done &
    WHILE=$!
    EXECTIME=$({ TIMEFORMAT='%E'; time $COMMAND >log; } 2>&1)
    kill -9 $WHILE
    echo $EXECTIME
    #printf "\rCompleted (${EXECTIME}s)\n"
}
There are some unnecessarily fancy bits in there, I admit. But I went through tons of StackOverflow questions to try out different kinds of fancy stuff. If it were to be applied anywhere, a lot of fat could be cut off. But it's not.
It is to be called like:
task "ping google.com -c 4" "Pinging google.com 4 times"
What it'll do is print Pinging google.com 4 times in yellow, then on the next line print a period, then another period every .5 seconds. After five periods, it starts again from the beginning of the same line, repeating until the command completes. Then it's supposed to print Completed ($TIME_HERE) with (obviously) the time it took to execute the command in place of $TIME_HERE. (I've commented that part out; the current version just prints the time.)
The Issue
The issue is that instead of the execution time, something very weird gets printed. It's probably something stupid I'm doing, but I don't know where the problem originates. Here's the output:
$ sh taskrunner.sh
Pinging google.com 4 times
..0.00user 0.00system 0:03.51elapsed 0%CPU (0avgtext+0avgdata 996maxresident)k 0inputs+16outputs (0major+338minor)pagefaults 0swaps
Running COMMAND='ping google.com -c 4';EXECTIME=$({ TIMEFORMAT='%E';time $COMMAND >log; } 2>&1);echo $EXECTIME in a terminal works as expected, i.e. it prints out the time (3.559s in my case).
I have checked and /bin/sh is a symlink to dash. (However that shouldn't be a problem because my script runs in /bin/bash as per the shebang on the top.)
I'm looking to learn while solving this issue so a solution with explanation will be cool. T. Hanks. :)
When you invoke a script with:
sh scriptname
the script is passed to sh (dash in your case), which will ignore the shebang line. (In a shell script, a shebang is a comment, since it starts with a #. That's not a coincidence.)
Shebang lines are only interpreted for commands started as commands, since they are interpreted by the system's command launcher, not by the shell.
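Compare (assuming the script sits in the current directory):
chmod +x taskrunner.sh
./taskrunner.sh    # started as a command: the kernel honours the shebang, so bash runs it
sh taskrunner.sh   # handed to sh directly: the shebang is just a comment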
By the way, your invocation of time does not correctly separate the output of the time builtin from any output the timed command might send to stderr. I think you'd be better off with:
EXECTIME=$({ TIMEFORMAT=%E; time $COMMAND >log.out 2>log.err; } 2>&1)
but that isn't sufficient. You will continue to run into the standard problem with trying to put commands into string variables: it only works with very simple commands. See the Bash FAQ. Or look at some of these answers:
How to escape a variable in bash when passing to command line argument
bash quotes in variable treated different when expanded to command
Preserve argument splitting when storing command with whitespaces in variable
find command fusses on -exec arg
Using an environment variable to pass arguments to a command
(Or probably hundreds of other similar answers.)
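One recurring fix from those answers, sketched against your example: store the command as an array instead of a string, so argument boundaries survive expansion:
cmd=(ping google.com -c 4)     # each array element is one argument
EXECTIME=$({ TIMEFORMAT=%E; time "${cmd[@]}" >log.out 2>log.err; } 2>&1)
echo "$EXECTIME"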
I have an interesting situation. All code here is functional pseudo-code exemplifying the exact issue I am facing, so no jokes about assigning the output of date. I actually want to capture the output of a slower, more resource-dependent function, but date works well to show the functional obstacle I have run into.
I am writing a bash script where I want to assign the output of a process to a variable, like so:
RESPONSE=$(nice -n 19 date);
Now that gives me the RESPONSE in a nice variable, right? Okay, what if I want to get the process ID of the function called within $()? How would I do that? I assumed this would work:
RESPONSE=$(nice -n 19 date & PID=(`jobs -l | awk '{print $2}'`));
Which does give me the process ID in the variable PID, but then I no longer get the output sent to RESPONSE.
The code I am using as a functional example is this. This example works, but gives no PID; yes, I am not assigning a PID here, but again, this is an example:
RESPONSE=$(nice -n 19 date);
wait ${PID};
echo "${RESPONSE}";
echo "${PID}";
This example gives me a PID but no RESPONSE:
RESPONSE=$(nice -n 19 date & PID=(`jobs -l | awk '{print $2}'`));
wait ${PID};
echo "${RESPONSE}";
echo "${PID}";
Anyone know how I can get the value of RESPONSE with the PID as well?
Depending on exactly how you set it up, using RESPONSE=$(backgroundcommand) will either wait for the command to complete (in which case it's too late to get its PID), or won't wait for the command to complete (in which case its output won't exist yet, so it can't be assigned to RESPONSE). You're going to have to store the command's output someplace else (like a temporary file), and then collect it when the process finishes:
responsefile=$(mktemp -t response)
nice -n 19 date >$responsefile &
pid=$!
wait $pid
response=$(<$responsefile)
rm $responsefile
(Note that the $(<$responsefile) construct is only available in bash, not in plain posix shells. If you don't start the script with #!/bin/bash, use $(cat $responsefile) instead.)
This still may not do quite what you're looking for, because (at least as far as I can see) what you're looking for doesn't really make sense. The command's PID and output never exist at the same time, so getting both isn't really meaningful. The script I gave above technically gives you both, but the PID is irrelevant (the process has exited) by the time response gets assigned, so while you have the PID at the end, it's meaningless.
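If you really do need the PID while the command is still running (say, to monitor or renice it), one sketch of a variant reads the output through a FIFO instead of a regular file; note that mktemp -u only generates an unused name, it doesn't create the file:
fifo=$(mktemp -u)              # unused name only; small race window here
mkfifo "$fifo"
nice -n 19 date >"$fifo" &
pid=$!
echo "running as PID $pid"     # the PID is meaningful here, before completion
response=$(<"$fifo")           # blocks until the writer closes the FIFO
rm "$fifo"
wait "$pid"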
In bash I am able to write a script that contains something like this:
{ time {
    # series of commands
    echo "something"
    echo "another command"
    echo "blah blah blah"
} } 2> $LOGFILE
In zsh the equivalent code does not work, and I cannot figure out how to make it work for me. The following code works, but I don't know exactly how to get it to wrap multiple commands:
{ time echo "something" } 2>&1
I know I can create a new script and put the commands in there then time the execution properly, but is there a way to do it either using functions or a similar method to the bash above?
Try the following instead:
{ time ( echo hello ; sleep 10s; echo hola ; ) } 2>&1
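Adapted to the multi-command, log-file form from the question (time writes its report to stderr, which the outer braces redirect):
{ time (
    echo "something"
    echo "another command"
    echo "blah blah blah"
) } 2> $LOGFILE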
If you want to profile your code you have a few alternatives:
Time subshell execution like:
time ( commands ... )
Use REPORTTIME to check for slow commands:
export REPORTTIME=3 # display commands with execution time >= 3 seconds
setopt xtrace, as explained here
The zprof module (sketched below)
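The zprof route ships with zsh as a loadable module; a minimal sketch of its use:
zmodload zsh/zprof    # load the profiler before the code of interest
# ... run the functions you want to profile ...
zprof                 # print a per-function timing report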
Try replacing { with ( ?
I think this should help.
You can also use the times POSIX shell builtin in conjunction with functions.
It will report the user and system time used by the shell and its children. See
http://pubs.opengroup.org/onlinepubs/009695399/utilities/times.html
Example:
somefunc() {
    # code you want to time here
    times
}
The reason for using a shell function is that it creates a new shell context, at the start of which times is all zeros (try it). Otherwise the result contains the contribution of the current shell as well. If that is what you want, forget about the function and put times last in your script.