Use PS0 and PS1 to display execution time of each bash command - bash

It seems that by executing code in the PS0 and PS1 variables (which, as I understand it, are evaluated before and after each prompted command runs) it should be possible to record the time of each command and display it in the prompt. Something like this:
user@machine ~/tmp
$ sleep 1
user@machine ~/tmp 1.01s
$
However, I quickly got stuck with recording time in PS0, since something like this doesn't work:
PS0='$(START=$(date +%s.%N))'
As I understand it, the START assignment happens in a sub-shell, so it is not visible in the outer shell. How would you approach this?

I was looking for a solution to a different problem and came upon this question, and decided this sounded like a cool feature to have. Using @Scheff's excellent answer as a base, in addition to the solutions I developed for my other problem, I came up with a more elegant and full-featured solution.
First, I created a few functions that read/write the time to/from memory. Writing to the shared-memory folder prevents disk access, and the data does not persist across reboots if the files are not cleaned up for some reason.
function roundseconds (){
    # rounds a number to 3 decimal places
    echo m=$1";h=0.5;scale=4;t=1000;if(m<0) h=-0.5;a=m*t+h;scale=3;a/t;" | bc
}

function bash_getstarttime (){
    # places the current epoch time (with ns precision) into shared memory
    date +%s.%N >"/dev/shm/${USER}.bashtime.${1}"
}

function bash_getstoptime (){
    # reads the stored epoch time and subtracts it from the current one
    local endtime=$(date +%s.%N)
    local starttime=$(cat "/dev/shm/${USER}.bashtime.${1}")
    roundseconds $(echo "$endtime - $starttime" | bc)
}
The input to the bash_ functions is the bash PID
Those functions and the following are added to the ~/.bashrc file
ROOTPID=$BASHPID
bash_getstarttime $ROOTPID
These create the initial time value and store the bash PID as a different variable that can be passed to a function. Then you add the functions to PS0 and PS1
PS0='$(bash_getstarttime $ROOTPID) etc..'
PS1='\[\033[36m\] Execution time $(bash_getstoptime $ROOTPID)s\n'
PS1="$PS1"'and your normal PS1 here'
Now the time is recorded in PS0 just before the command runs, recorded again in PS1 after the command completes, and the difference is calculated and added to PS1. Finally, this code cleans up the stored time when the terminal exits:
function runonexit (){
    rm "/dev/shm/${USER}.bashtime.${ROOTPID}"
}
trap runonexit EXIT
Putting it all together, plus some additional code being tested, it looks like this:
The important parts are the execution time in ms, and the user.bashtime files for all active terminal PIDs stored in shared memory. The PID is also shown right after the terminal input, as I added display of it to PS0, and you can see the bashtime files added and removed.
PS0='$(bash_getstarttime $ROOTPID) $ROOTPID experiments \[\033[00m\]\n'

As @tc said, using arithmetic expansion allows you to assign variables during the expansion of PS0 and PS1. Newer bash versions also allow prompt-string expansions (like \D{...}) there, so you don't even need a subshell to get the current time. With bash 4.4:
# PS0 extracts a substring of length 0 from PS1; as a side-effect it assigns
# the current time in epoch seconds to PS0time (no visible output in this case)
PS0='\[${PS1:$((PS0time=\D{%s}, PS1calc=1, 0)):0}\]'
# PS1 uses the same trick to calculate the time elapsed since PS0 was output.
# It also expands the previous command's exit status ($?), the current time
# and directory ($PWD rather than \w, which shortens your home directory path
# prefix to "~") on the next line, and finally the actual prompt: 'user@host> '
PS1='\nSeconds: $((PS1calc ? \D{%s}-$PS0time : 0)) Status: $?\n\D{%T} ${PWD:PS1calc=0}\n\u@\h> '
(The %N date directive does not seem to be implemented as part of \D{...} expansion in bash 4.4. This is a pity, since it limits us to whole-second resolution.)
Since PS0 is only evaluated and printed when there is a command to execute, the PS1calc flag is set to 1 to tell PS1 whether to compute the time difference (following the command) or not: PS1calc being 0 means PS0 was not previously expanded and so PS0time was not refreshed. PS1 then resets PS1calc to 0. This way an empty line (just hitting return) doesn't accumulate seconds between return key presses.
One nice thing about this method is that there is no output when you have set -x active. No subshells or temporary files in sight: everything is done within the bash process itself.

I took this as a puzzle and want to show the result of my puzzling:
First I fiddled with time measurement. date +%s.%N (which I didn't know before) was where I started from. Unfortunately, bash's arithmetic evaluation does not support floating point. Thus, I chose something else:
$ START=$(date +%s.%N)
$ awk 'BEGIN { printf("%fs", '$(date +%s.%N)' - '$START') }' /dev/null
8.059526s
$
This is sufficient to compute the time difference.
Next, I confirmed what you already described: sub-shell invocation prevents the use of shell variables. Thus, I thought about where else I could store the start time: something global to sub-shells but local enough to be usable in multiple concurrent interactive shells. My solution is temporary files (in /tmp). To get a unique name, I came up with this pattern: /tmp/$USER.START.$BASHPID.
$ date +%s.%N >/tmp/$USER.START.$BASHPID ; \
> awk 'BEGIN { printf("%fs", '$(date +%s.%N)' - '$(cat /tmp/$USER.START.$BASHPID)') }' /dev/null
cat: /tmp/ds32737.START.11756: No such file or directory
awk: cmd. line:1: BEGIN { printf("%fs", 1491297723.111219300 - ) }
awk: cmd. line:1: ^ syntax error
$
Damn! Again I'm trapped by the sub-shell issue. To get around it, I defined another variable:
$ INTERACTIVE_BASHPID=$BASHPID
$ date +%s.%N >/tmp/$USER.START.$INTERACTIVE_BASHPID ; \
> awk 'BEGIN { printf("%fs", '$(date +%s.%N)' - '$(cat /tmp/$USER.START.$INTERACTIVE_BASHPID)') }' /dev/null
0.075319s
$
Next step: fiddle this together with PS0 and PS1. In a similar puzzle (SO: How to change bash prompt color based on exit code of last command?), I already mastered the "quoting hell". Thus, I should be able to do it again:
$ PS0='$(date +%s.%N >"/tmp/${USER}.START.${INTERACTIVE_BASHPID}")'
$ PS1='$(awk "BEGIN { printf(\"%fs\", "$(date +%s.%N)" - "$(cat /tmp/$USER.START.$INTERACTIVE_BASHPID)") }" /dev/null)'"$PS1"
0.118550s
$
Ahh. It starts to work. Thus, there was only one issue left: finding the right start-up script for initializing INTERACTIVE_BASHPID. I found ~/.bashrc, which seems to be the right place, and which I had already used in the past for some other personal customizations.
So, putting it all together - these are the lines I added to my ~/.bashrc:
# command duration puzzle
INTERACTIVE_BASHPID=$BASHPID
date +%s.%N >"/tmp/${USER}.START.${INTERACTIVE_BASHPID}"
PS0='$(date +%s.%N >"/tmp/${USER}.START.${INTERACTIVE_BASHPID}")'
PS1='$(awk "BEGIN { printf(\"%fs\", "$(date +%s.%N)" - "$(cat /tmp/$USER.START.$INTERACTIVE_BASHPID)") }" /dev/null)'"$PS1"
The 3rd line (the date command) has been added to solve another issue. Comment it out and start a new interactive bash to find out why.
A snapshot of my cygwin xterm with bash, where I added the above lines to ~/.bashrc.
Notes:
I consider this more a solution to a puzzle than a "serious productive" solution. I'm sure this kind of time measurement itself consumes a lot of time. The time command might provide a better solution: SE: How to get execution time of a script effectively?. However, this was a nice exercise for practicing bash...
Don't forget that this code pollutes your /tmp directory with a growing number of small files. Either clean up /tmp from time to time or add appropriate clean-up commands (e.g. to ~/.bash_logout).
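For example, a single line in ~/.bash_logout (or an EXIT trap, as in the answer above) removes at least the current shell's file, assuming the /tmp/$USER.START.$BASHPID naming scheme from this answer:
rm -f "/tmp/${USER}.START.${INTERACTIVE_BASHPID}"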

Arithmetic expansion runs in the current process and can assign to variables. It also produces output, which you can consume with something like \e[$((...,0))m (to output \e[0m) or ${t:0:$((...,0))} (to output nothing, which is presumably better). Bash's 64-bit integer support will count POSIX nanoseconds until the year 2262.
$ PS0='${t:0:$((t=$(date +%s%N),0))}'
$ PS1='$((( t )) && printf %d.%09ds $((t=$(date +%s%N)-t,t/1000000000)) $((t%1000000000)))${t:0:$((t=0))}\n$ '
0.053282161s
$ sleep 1
1.064178281s
$
$
PS0 is not evaluated for empty commands, which leaves a blank line (I'm not sure if you can conditionally print the \n without breaking things). You can work around that by switching to PROMPT_COMMAND instead (which also saves a fork):
$ PS0='${t:0:$((t=$(date +%s%N),0))}'
$ PROMPT_COMMAND='(( t )) && printf %d.%09ds\\n $((t=$(date +%s%N)-t,t/1000000000)) $((t%1000000000)); t=0'
0.041584565s
$ sleep 1
1.077152833s
$
$
That said, if you do not require sub-second precision, I would suggest using $SECONDS instead (which is also more likely to return a sensible answer if something sets the time).
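A minimal whole-second sketch along those lines, using $SECONDS (t=-1 marks "no command pending", so empty lines print nothing):
t=-1
PS0='${t:0:$((t=SECONDS,0))}'
PROMPT_COMMAND='(( t >= 0 )) && echo "$((SECONDS - t))s"; t=-1'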

As correctly stated in the question, command substitution in PS0 runs inside a sub-shell, which makes it unusable for setting the start time.
Instead, one can use the history command with epoch seconds %s and the built-in variable $EPOCHSECONDS to calculate when the command finished by leveraging only $PROMPT_COMMAND.
# Save start time before executing command (does not work due to PS0 sub-shell)
# preexec() {
#     STARTTIME=$EPOCHSECONDS
# }
# PS0=preexec
# Save end time, without duplicating commands when pressing Enter on an empty line
precmd() {
    local st=$(HISTTIMEFORMAT='%s ' history 1 | awk '{print $2}');
    if [[ -z "$STARTTIME" || ( -n "$STARTTIME" && "$STARTTIME" -ne "$st" ) ]]; then
        ENDTIME=$EPOCHSECONDS
        STARTTIME=$st
    else
        ENDTIME=0
    fi
}
__timeit() {
    precmd;
    if ((ENDTIME - STARTTIME >= 0)); then
        printf 'Command took %d seconds.\n' "$((ENDTIME - STARTTIME))";
    fi
    # Do not forget your:
    # - OSC 0 (set title)
    # - OSC 777 (notification in gnome-terminal, urxvt; note, this one has preexec and precmd as OSC 777 features)
    # - OSC 99 (notification in kitty)
    # - OSC 7 (set url) - out of scope for this question
}
export PROMPT_COMMAND=__timeit
Note: If you have ignoredups in your $HISTCONTROL, then this will not report back for a command that is re-run.
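For reference, this is the kind of record that precmd parses; awk's $2 is the epoch timestamp (hypothetical entry):
$ HISTTIMEFORMAT='%s ' history 1
  102  1712345678 sleep 2
You can check whether the caveat applies with [[ $HISTCONTROL == *ignoredups* || $HISTCONTROL == *ignoreboth* ]] (ignoreboth implies ignoredups).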

Following @SherylHohman's use of variables in PS0, I've come up with this complete script. I noticed you don't need a separate flag: since PS0 is not expanded on empty prompts, PS0time keeps its reset value and the _elapsed function just exits.
#!/bin/bash

# string preceding the elapsed time; use color codes or ascii
_ELAPTXT=$'\E[1;33m \uf135 '

# current time in ms since the epoch
_printtime () {
    local _var=${EPOCHREALTIME/[.,]/}  # strip the decimal separator (locale may use . or ,)
    echo ${_var%???}
}

# get the time difference, print it, and end any color codings
_elapsed () {
    [[ -v "${1}" ]] || ( local _VAR=$(_printtime);
        local _ELAPSED=$(( ${_VAR} - ${1} ));
        echo "${_ELAPTXT}$(_formatms ${_ELAPSED})"$'\n\e[0m' )
}
# format _elapsed with simple string substitution
_formatms () {
    local _n=$((${1})) && case ${_n} in
        ? | ?? | ???)
            echo $_n"ms"
            ;;
        ????)
            echo ${_n:0:1}${_n:0,-3}"ms"
            ;;
        ?????)
            echo ${_n:0:2}","${_n:0,-3}"s"
            ;;
        ??????)
            printf $((${_n:0:3}/60))m+$((${_n:0:3}%60)),${_n:0,-3}"s"
            ;;
        ???????)
            printf $((${_n:0:4}/60))m$((${_n:0:4}%60))s${_n:0,-3}"ms"
            ;;
        *)
            printf "too much!"
            ;;
    esac
}
# prompts
PS0='${PS1:(PS0time=$(_printtime)):0}'
PS1='$(_elapsed $PS0time)${PS0:(PS0time=0):0}\u#\h:\w\$ '
Save it as _prompt and source it to try:
source _prompt
Change the text, ASCII codes, and colors in _ELAPTXT:
_ELAPTXT='\e[33m Elapsed time: '

Related

Bash command completion with full path expansion injected into history for vim

i've spent a solid week searching online and trying many different ways to solve a tricky problem. basically i would like to use vim to edit custom commands / scripts that are in my $PATH without having to actually cd to their given directories first or manually type their full paths on the command line.
in essence, i'd love to be able to combine stock bash command completion (compgen -c) with simultaneous path expansion when specifying scripts in my $PATH as vim FILE ARGUMENTS. btw i'm using the caps to make clear what can be a tricky subject and not shouting.
it's probably easier to show you what i'm trying to do than explain it. lets say i have scripts in directories that are on my $PATH
~/bin/x/y/cmd1.sh
~/bin/a/b/cmd2.sh
/ppp/n/m/cmd3.sh
sometimes these scripts provide functionality on files that exist in other directories so i'd like to be able to edit them easily from anywhere in the file system. sometimes i just want to be able to edit those scripts from other directories because it's more convenient. lets say i'm currently in the following directory.
/completely/different/dir
but now i need to vim edit
~/bin/a/b/cmd2.sh
my options to achieve this solely with default bash functionality is to do one of the following which takes a long time
cd ~/bin/a/b/; vim cmd2.sh
vim ~/<tab-complete-my-way-to-file>
open a new terminal window plus some combination of the above
since i know the names of my custom scripts it would be soooo much easier to just do the following which requires no tab completion of the full path to the file or directory as well as no cd'ing to a different directory to change my context!!!
vim cmd2.sh
but this won't work by default b/c vim needs the full path to the script
my first thought was to write a vim wrapper function which basically uses which to do the $PATH expansion for me and then tie bash command completion to my vc function like this:
vc () { vim $(which "$@"); }
complete -c vc
i can run the following in the shell to complete partial script names that start with "c" from the choices of cmd1.sh, cmd2.sh, cmd3.sh
vc c<tab>
until i get what i want here which is great
vc cmd2.sh
when i hit enter and execute the command it all works fine BUT it doesn't inject the expanded path into the READLINE command line and thus the FULL EXPANDED PATH of 'cmd2.sh' never winds up in my command history! my history will show this
vc cmd2.sh
instead of
vc ~/bin/a/b/cmd2.sh
or
vim ~/bin/a/b/cmd2.sh
i want that expanded path in my command history because it makes future operations on that script file super easy when reusing command history. ie i can ls, file, diff, mv, cp that expanded path much more easily by reusing history than by writing more wrapper scripts for ls, file, diff, mv, cp etc. like i had to do with vc above.
QUESTIONS :
OPTION 1
is there a way to reinject the full expanded path provided by which in my vc function directly back into the original vc READLINE or just inject the entire "vim " command that actually gets executed in vc as a replacement for the original vc command into READLINE? any method that allows me to get the expanded vim command into the history even if it is in addition to the original vc command is ok by me.
basically how do you access and edit the current READLINE programmatically in bash?
OPTION 2
note i can also do something like this DIRECTLY on the command line in real-time
vim $(which cmd2.sh) C-x-e
which gives me what i want (it expands the path, which then puts it into history) but i have to always type the extra subshell and which text as well as the C-x-e (to expand the line) on every iteration of the command, while losing the command completion functionality, which basically makes this useless. put another way, is there any way to automate the above using a bind key so that
vc cmd2.sh
is automatically transformed first into
vim $(which cmd2.sh)
and then automatically follows up with C-x-e so that it gets expanded to
vim ~/bin/a/b/cmd2.sh
but have all the editing movement, text insertion and final command line expansion happen all in the same bindkey macro? this might be the best solution of all.
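fwiw, readline macros can mix commands with literal text, so something along these lines might do it (untested sketch; C-x C-v is an arbitrary key choice):
bind '"\C-x\C-v": "\C-avim $(which \C-e)\e\C-e"'
# type the bare script name, hit C-x C-v:
#   cmd2.sh  ->  vim $(which cmd2.sh)  ->  vim ~/bin/a/b/cmd2.sh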
OPTION 3
alternatively, since bash command completion automatically winds up in the READLINE and thus the history, a custom completion function would solve my problem. is there a way to make vc use a completion function that would BOTH complete commands in $PATH when used as vim arguments as described above AND ALSO SIMULTANEOUSLY EXPAND THEM TO THEIR FULL PATHS?
i know how to write a basic completion function. countless hours of attempts (which i am choosing not to put here to keep confusion / post length down) are failing for the simple reason that i'm not sure command completion is compatible with simultaneous full path expansion b/c it breaks traditional completion.
with a custom completion function, here's what happens when i try to find one of my scripts, "cmd2.sh", living at "~/bin/a/b/cmd2.sh", but start with a "c" and hit "<tab>".
vim c<tab>
instead of getting me these completions to choose from
cmd1.sh
cmd2.sh
cmd3.sh
it completes the first one it finds in the $PATH and inserts it into the READLINE which might be
/ppp/n/m/cmd3.sh
when i really want
~/bin/a/b/cmd2.sh
this effectively kills the completion lookup because the word before my cursor in the READLINE now starts with /ppp/n/m/cmd3.sh and there's no way of getting back to cmd2.sh
i hope that's clear.
thanks
This requires some boilerplate in your .bashrc file, but might work for you. It makes use of the directory stack (some might say it abuses the directory stack, but if you aren't using it for anything else, it might be OK).
In your .bashrc, add each directory of interest to your directory stack. End the list with your home directory, as pushd also changes your current working directory.
pushd ~/bin/x/y
pushd ~/bin/a/b
pushd /ppp/n/m
pushd ~
Yes, it duplicates your PATH entry a bit, but I contend you don't really need access to every directory in your PATH, just the ones where you have files you intend to edit. (Are you really going to try to edit anything in /bin or /usr/bin?)
Now, in your interactive shell, you can run dirs -v to see, along with its index, the directories in your stack:
$ dirs -v
0 ~
1 /ppp/n/m
2 ~/bin/a/b
3 ~/bin/x/y
4 ~
Now, no matter where you are, if you want to edit ~/bin/x/y/cmd1.sh, you can use
$ vi ~3/cmd1.sh
As long as you don't use popd or pushd elsewhere to modify the stack, the indices will stay the same. (Using pushd will add a new directory to the top of the stack, increasing each index; popd will decrease each index after it removes the top directory.)
A much simpler process would be to simply define some variables whose values are the desired directories:
binab=~/bin/a/b
binxy=~/bin/x/y
ppp=/ppp/n/m
and simply expand them
$ vi $ppp/cmd3.sh
The shell performs parameter name completion, so the variable names don't have to be particularly short, but the dirstack approach guarantees you only need 2 or 3 characters. (Also, it doesn't pollute the global namespace with additional variables.)
Interestingly, I've found myself wanting to do something similar a while back. I hacked together the following bash script. It's pretty self-explanatory. If I want to edit one of my scripts (this one, for example is ~/bin/vm), I just run vm vm. I can open several files in my path, either in buffers, or vertical/horizontal splits etc...
Do with it what you like, pasting it here because it's all ready to use:
#!/usr/bin/env bash

Usage() {
    cat <<-__EOF_
${0##*/} Opens scripts in PATH from any location (vim -O)
Example: ${0##*/} ${0##*/}
    opens this script in vim editor
-o: Change default behaviour (vim -O) to -o
-b: Change default behaviour to open in buffers (vim file1 file2)
-h: Display this message
__EOF_
}

flag="O"

vimopen() {
    local wrapped
    local located
    local found
    found=false
    [ $# -lt 1 ] && echo "No script given" && return
    wrapped=""
    for arg in "$@"; do
        if located=$(which "${arg}" 2> /dev/null); then
            found=true
            wrapped="${wrapped} ${located}"
        else
            echo "${arg} not found!"
        fi
    done
    $found || return
    # We WANT word splitting to occur here
    # shellcheck disable=SC2086
    case ${flag} in
        O)
            vim $wrapped -O
            ;;
        o)
            vim $wrapped -o
            ;;
        *)
            vim $wrapped
    esac
}

while getopts :boh f; do
    case $f in
        h)
            Usage
            exit 0
            ;;
        o)
            flag="o"
            shift
            ;;
        b)
            flag=""
            shift
            ;;
        *)
            echo "Unknown option ${f}-${OPTARG}"
            Usage
            exit 1
            ;;
    esac
done

vimopen "$@"
Let me share something that answers the OPTION 3 part of your question:
Behavior of this solution
The solutions I will show offer up basenames of commands (i.e. what compgen -c ${cur} returns, where cur is the last word on the command line) until there is only one candidate, in which case it is replaced by the full path of the command.
$ vc c<TAB><TAB>
Display all 216 possibilities? (y or n)
$ vc cm<TAB>
cmake cmake-gui cmcprompt cmd1.sh cmd2.sh cmd3.sh cmp cmpdylib cmuwmtopbm
$ vc cmd<TAB>
cmd1.sh cmd2.sh cmd3.sh
$ vc cmd1<TAB>
$ vc /Users/pcarphin/vc/bin/cmd1.sh
which I think is what you want.
And for your vc function, you can still do
vc(){
    vim "$(which "${1}")"
}
since which /Users/pcarphin/vc/bin/cmd3.sh returns /Users/pcarphin/vc/bin/cmd3.sh and so it will work whether you do vc cmd3.sh<ENTER> or if you do vc cmd3.sh<TAB><ENTER>
Basic solution
So here it is. It's as simple as using compgen -c to get command basename candidates, then checking whether there is only a single candidate, and if so, replacing it with the full path.
_vc(){
    local cur prev words cword
    _init_completion || return;
    COMPREPLY=( $(compgen -c ${cur}) )
    #
    # If there is only one candidate for completion, replace it with the
    # full path returned by which.
    #
    if ((${#COMPREPLY[@]} == 1)) ; then
        COMPREPLY[0]=$(which ${COMPREPLY[0]})
    fi
}
complete -F _vc vc
Solution that filters out shell functions
The compgen -c command will include the names of shell functions and if you want to leave those out (maybe because your vc function would fail which would be inelegant for an argument supplied by a completion function), here is what you can do:
_vc(){
    local cur prev words cword
    _init_completion || return;
    local candidates=($(compgen -c ${cur}))
    #
    # Put in COMPREPLY only the command names that are files in PATH
    # and leave out shell functions
    #
    local i=0
    for cmd in "${candidates[@]}" ; do
        if which "$cmd" >/dev/null 2>&1 ; then
            COMPREPLY[i++]=${cmd}
        fi
    done
    #
    # If there is only one candidate for completion, replace it with the
    # full path returned by which.
    #
    if ((${#COMPREPLY[@]} == 1)) ; then
        COMPREPLY[0]=$(which ${COMPREPLY[0]})
    fi
}
Solution that handles shell functions
If we want to handle shell functions, then we can get rid of the part that filters them out and enhance the part that replaces the command name by a full path when COMPREPLY contains only one candidate. This is based on turning on extdebug which causes declare -F shell_function to output the file where shell_function was defined:
cmd_location(){
    local location
    if location=$(which "${1}" 2>/dev/null) ; then
        echo "${location}"
    else
        # If extdebug is off, remember that and turn it on
        local user_has_extdebug
        if ! shopt extdebug >/dev/null ; then
            user_has_extdebug=no
            shopt -s extdebug
        fi
        info=$(declare -F "${1}")
        if [[ -n "${info}" ]] ; then
            echo ${info} | cut -d ' ' -f 3
        fi
        # Turn extdebug back off if it was off before
        if [[ "${user_has_extdebug}" == no ]] ; then
            shopt -u extdebug
        fi
    fi
}
_vc(){
    local cur prev words cword
    _init_completion || return;
    COMPREPLY=( $(compgen -c ${cur}) )
    if ((${#COMPREPLY[@]} == 1)) ; then
        COMPREPLY[0]=$(cmd_location ${COMPREPLY[0]})
    fi
}
And in this case, your vc function would need the same kind of logic, or you could just remember to always use the shell completion so you end up calling it with a full path. That's why I factored out the cmd_location function:
vc(){
if [[ "${1}" == /* ]] ; then
vim "${1}"
else
vim $(cmd_location "${1}")
fi
}
I was looking for something else but I found this question, which inspired me to do this for myself, so thank you; now I'll have a neat vc function with a cool completion function. Personally, I'm going to use the last version, which handles shell functions.
The declare -F command with extdebug prints out the function name, the line number, and the file, so I'll see if I can adapt the solution so that in the case of shell functions, it opens the file at the location.
For that, I'd have to get rid of the part that puts a full path on the command line, so what I'm going to do for myself won't be an answer to your question. Note the use of parentheses for the body of open_shell_function, which makes it run in a subshell so I don't have to do the whole extdebug-remembering dance.
open_shell_function()(
    # Use subshell so as not to turn on extdebug in the user's shell
    # and avoid doing this remembering stuff
    shopt -s extdebug
    local info=$(declare -F ${1})
    if [[ -z "${info}" ]] ; then
        echo "No info from 'declare -F' for '${1}'"
        return 1
    fi
    local lineno
    if ! lineno=$(echo ${info} | cut -d ' ' -f 2) ; then
        echo "Error getting line number from info '${info}' on '${1}'"
        return 1
    fi
    local file
    if ! file=$(echo ${info} | cut -d ' ' -f 3) ; then
        echo "Error getting filename from info '${info}' on '${1}'"
        return 1
    fi
    vim ${file} +${lineno}
)
vc(){
    local file
    if file=$(which ${1} 2>/dev/null) ; then
        vim ${file}
    else
        echo "no '${1}' found in path, looking for shell function"
        open_shell_function "${1}"
    fi
}
complete -c vc

Bash: Extract user path (/home/userID) from read line containing full path and replace with "~"

I'm constructing a bash script file a bit at a time. I'm learning as I go. But I can't find anything online to help me at this point: I need to extract a substring from a large string, and the two methods I found using ${} (curly brackets) just won't work.
The first, ${x#y}, doesn't do what it should.
The second, ${x:p} or ${x:p:n}, keeps reporting bad substitution. It only seems to work with constants.
${#x} returns the string length as text, not as a number, meaning it does not work with either ${x:p} or ${x:p:n}.
Fact is, it seems really hard to get bash to do much math at all, except in for statements. But that is just counting, and this isn't a task for a for loop.
I've consolidated my script file here as a means of helping you all understand what it is that I am doing. It's for working with PureBasic source files, but you only have to change grep's --include= argument and it can search other types of text files instead.
#!/bin/bash
home=$(echo ~) # Copy the user's path to a variable named home
len=${#home} # Showing how to find the length. Problem is, this is treated
# as a string, not a number. Can't find a way to make over into
# into a number.
echo $home "has length of" $len "characters."
read -p "Find what: " what # Intended to search PureBasic (*.pb?) source files for text matches
grep -rHn $what $home --include="*.pb*" --exclude-dir=".cache" --exclude-dir=".gvfs" > 1.tmp
while read line # this checks for and reads the next line
do # the closing 'done' has the file to be read appended with "<"
a0=$line # this is each line as read
a1=$(echo "$a0" | awk -F: '{print $1}') # this gets the full path before the first ':'
echo $a0 # Shows full line
echo $a1 # Shows just full path
q1=${line#a1}
echo $q1 # FAILED! No reported problem, but failed to extract $a1 from $line.
q1=${a0#a1}
echo $q1 # FAILED! No reported problem, but failed to extract $a1 from $a0.
break # Can't do a 'read -n 1', as it just reads 1 char from the next line.
# Can't do a pause, because it doesn't exist. So just run from the
# terminal so that after break we can see what's on the screen .
len=${#a1} # Can get the length of $a1, but only as a string
# q1=${line:len} # Right command, wrong variable
# q1=${line:$len} # Right command, right variable, but wrong variable type
# q1=${line:14} # Constants work, but all $home's aren't 14 characters long
done < 1.tmp
The following works:
x="/home/user/rest/of/path"
y="~${x#/home/user}"
echo $y
Will output
~/rest/of/path
If you want to use "/home/user" inside a variable, say prefix, you need to use $ after the #, i.e., ${x#$prefix}, which I think is your issue.
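That is:
x="/home/user/rest/of/path"
prefix="/home/user"
y="~${x#$prefix}"
echo $y    # ~/rest/of/path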
The help I got was most appreciated. I got it done, and here it is:
#!/bin/bash
len=${#HOME}  # Showing how to find the length. Problem is, this is treated
              # as a string, not a number. Can't find a way to make it over
              # into a number.
echo $HOME "has length of" $len "characters."
while :
do
    echo
    read -p "Find what: " what  # Intended to search PureBasic (*.pb?) source files for text matches
    a0=""; > 0.tmp; > 1.tmp
    grep -rHn $what $HOME --include="*.pb*" --exclude-dir=".cache" --exclude-dir=".gvfs" >> 0.tmp
    while read line  # this checks for and reads the next line
    do               # the closing 'done' has the file to be read appended with "<"
        a1=$(echo $line | awk -F: '{print $1}')  # this gets the full path before the first ':'
        a2=${line#$a1":"}  # remove path and first colon from rest of line
        if [[ $a0 != $a1 ]]
        then
            echo >> 1.tmp
            echo $a1":" >> 1.tmp
            a0=$a1
        fi
        echo " "$a2 >> 1.tmp
    done < 0.tmp
    cat 1.tmp | less
done
What I don't have yet is an answer as to whether variables can be used in place of constants in the ${x:p:n} substring form. If it really required constants, the only choice might be to generate a child script using the variables (which would appear as constants in the child), execute it, then return the results in an environment variable or temporary file. I did stuff like that with MSDOS a lot. The limitation there is that you then have to make the produced file executable with "chmod +x filename", or call it with "/bin/bash filename".
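For the record, variables do work as substring offsets and lengths in bash; the offset and length are arithmetic contexts, so both bare names and $-prefixed forms are accepted. "Bad substitution" errors there usually mean the script was run by a non-bash /bin/sh. A quick check (hypothetical values):
line="/home/user/project/file.pb:42: some match"
len=10
echo "${line:len}"     # offset from a variable
echo "${line:$len:7}"  # the $ is optional inside the offset/length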
Another bash limitation I found is that you cannot use "sudo" in the script without discontinuing execution of the present script. I guess a way around that is to use sudo to call /bin/bash on a child script that you produced. I assume that when the child completes, you return to the parent script where you stopped, unless you did "sudo -i", "sudo su", or some other variation where you become the super user; then you likely need to do an "exit" to drop the super-user overlay.
If you exit the child script still as super user, would typing "exit" put you back to completing the parent script? I suspect so, which makes for some interesting scenarios.
Another question: If doing a "while read line", what can you do in bash to check for a keyboard key press? The "read" option is already taken while in this loop.

How to count number of forked (sub-?)processes

Somebody else has written (TM) some bash script that forks very many sub-processes. It needs optimization. But I'm looking for a way to measure "how bad" the problem is.
Can I / How would I get a count that says how many sub-processes were forked by this script all-in-all / recursively?
This is a simplified version of what the existing, forking code looks like - a poor man's grep:
#!/bin/bash
file=/tmp/1000lines.txt
match=$1
let cnt=0
while read line
do
cnt=`expr $cnt + 1`
lineArray[$cnt]="${line}"
done < $file
totalLines=$cnt
cnt=0
while [ $cnt -lt $totalLines ]
do
cnt=`expr $cnt + 1`
matches=`echo ${lineArray[$cnt]}|grep $match`
if [ "$matches" ] ; then
echo ${lineArray[$cnt]}
fi
done
It takes the script 20 seconds to look for $1 in 1000 lines of input. This code forks way too many sub-processes. In the real code, there are longer pipes (e.g. progA | progB | progC) operating on each line using grep, cut, awk, sed and so on.
This is a busy system with lots of other stuff going on, so a count of how many processes were forked on the entire system during the run-time of the script would be of some use to me, but I'd prefer a count of processes started by this script and descendants. And I guess I could analyze the script and count it myself, but the script is long and rather complicated, so I'd just like to instrument it with this counter for debugging, if possible.
To clarify:
I'm not looking for the number of processes under $$ at any given time (e.g. via ps), but the number of processes run during the entire life of the script.
I'm also not looking for a faster version of this particular example script (I can do that). I'm looking for a way to determine which of the 30+ scripts to optimize first to use bash built-ins.
You can count the forked processes simply by trapping the SIGCHLD signal. If you can edit the script file, you can do this:
set -o monitor # or set -m
trap "((++fork))" CHLD
The fork variable will then contain the number of forks. At the end you can print this value:
echo $fork FORKS
For a 1000 lines input file it will print:
3000 FORKS
This code forks for two reasons: one for each expr ... and one for `echo ...|grep ...`. So in the reading while-loop it forks once per line read; in the processing while-loop it forks twice per line (once for expr and once for `echo ...|grep ...`). So for a 1000-line file it forks 3000 times.
But this is not exact! These are just the forks done by the calling shell. There are more, because `echo ...|grep ...` forks a subshell to run the pipeline, and that subshell then forks twice more: once for echo and once for grep. So that construct is actually 3 forks, not one, and the real total is closer to 5000 FORKS, not 3000.
If you need to count the forks of the forks (of the forks...) as well (or you cannot modify the bash script, or you want to do it from another script), a more exact solution is to use
strace -fo s.log ./x.sh
It will print lines like this:
30934 execve("./x.sh", ["./x.sh"], [/* 61 vars */]) = 0
Then you need to count the unique PIDs, using something like this (the first number is the PID):
awk '{n[$1]}END{print length(n)}' s.log
In case of this script I got 5001 (the +1 is the PID of the original bash script).
COMMENTS
Actually in this case all forks can be avoided:
Instead of
cnt=`expr $cnt + 1`
Use
((++cnt))
Instead of
matches=`echo ${lineArray[$cnt]}|grep $match`
if [ "$matches" ] ; then
echo ${lineArray[$cnt]}
fi
You can use bash's internal pattern matching:
[[ ${lineArray[cnt]} =~ $match ]] && echo ${lineArray[cnt]}
Mind that bash's =~ uses ERE, not BRE (like grep). So it will behave like egrep (or grep -E), not grep.
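For example:
echo aaa | grep 'a+'          # no match: + is literal in grep's BRE
echo aaa | grep -E 'a+'       # matches
[[ aaa =~ a+ ]] && echo match # matches: =~ is ERE, like grep -E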
I assume that lineArray is not pointless (otherwise the matching could be tested right in the reading loop and the array would not be needed) and that it is used for some other purpose as well. In that case I may suggest a slightly shorter version:
readarray -t lineArray <infile
for line in "${lineArray[@]}";{ [[ $line =~ $match ]] && echo $line; }
The first line reads the complete infile into lineArray without any loop. The second line processes the array element by element.
MEASURES
Original script for 1000 lines (on cygwin):
$ time ./test.sh
3000 FORKS
real 0m48.725s
user 0m14.107s
sys 0m30.659s
Modified version
FORKS
real 0m0.075s
user 0m0.031s
sys 0m0.031s
Same on linux:
3000 FORKS
real 0m4.745s
user 0m1.015s
sys 0m4.396s
and
FORKS
real 0m0.028s
user 0m0.022s
sys 0m0.005s
So this version uses no fork (or clone) at all. I suggest using this version only for small (<100 KiB) files; in other cases grep, egrep, or awk outperform the pure bash solution. But this should be checked by a performance test.
For a thousand lines on linux I got the following:
$ time grep Solaris infile # Solaris is not in the infile
real 0m0.001s
user 0m0.000s
sys 0m0.001s

customize linux shell prompt

In my .bashrc, I got this:
PS1="[\w $]"
And every time I cd to a deeply nested dir, the shell prompt takes up almost the whole line (terminal size: 80x24), like:
[/level_a_dir/level_b_dir/level_c_dir/level_d_dir/level_e_dir $]
Question
I want to cut the prompt short if the pwd is longer than 20 chars, just keep the last dir, like:
[.../level_e_dir $]
#[/level_a_dir/level_b_dir/level_c_dir/level_d_dir] is replaced with ...
How to do it?
I have done it in the following way.
First you have to create a shell script, truncate.sh:
#!/bin/bash
MAXLEN=20
REPLACEMENT="..."
# replace /home/user by ~
TPWD=$(echo ${PWD} | sed 's#'${HOME}'#~#;')
# truncate
if [ ${#TPWD} -gt ${MAXLEN} ] ; then
    PWDOFFSET=$(( ${#TPWD} - ${MAXLEN} ))
    TPWD="${REPLACEMENT}${TPWD:${PWDOFFSET}:${MAXLEN}}"
fi
echo ${TPWD}
Next you have to replace your PS1:
export PS1="[\$(truncate.sh) ] "
If you really want just the last 20 characters, whatever they might be (or fewer), then the simplest I can think of is:
export PS1='[${PWD:$((${#PWD}-20))} $]'
I would drop the brackets if you don't have much space, or think about having a two-line prompt (which I personally hate :-)
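Worth noting: if your PS1 uses \w rather than $PWD, bash 4.0+ can do this natively with PROMPT_DIRTRIM, which keeps the trailing path components and replaces the rest with "...":
PROMPT_DIRTRIM=1   # keep only the last path component
PS1='[\w $]'       # a deep path now shows as [.../level_e_dir $]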

Performance profiling tools for shell scripts

I'm attempting to speed up a collection of scripts that invoke subshells and do all sorts of things. I was wondering if there are any tools available to time the execution of a shell script and its nested shells and report on which parts of the script are the most expensive.
For example, if I had a script like the following.
#!/bin/bash
echo "hello"
echo $(date)
echo "goodbye"
I would like to know how long each of the three lines took. time will only give me the total time for the script. bash -x is interesting but does not include timestamps or other timing information.
You can set PS4 to show the time and line number. Doing this doesn't require installing any utilities and works without redirecting stderr to stdout.
For this script:
#!/bin/bash -x
# Note the -x flag above, it is required for this to work
PS4='+ $(date "+%s.%N ($LINENO) ")'
for i in {0..2}
do
echo $i
done
sleep 1
echo done
The output looks like:
+ PS4='+ $(date "+%s.%N ($LINENO) ")'
+ 1291311776.108610290 (3) for i in '{0..2}'
+ 1291311776.120680354 (5) echo 0
0
+ 1291311776.133917546 (3) for i in '{0..2}'
+ 1291311776.146386339 (5) echo 1
1
+ 1291311776.158646585 (3) for i in '{0..2}'
+ 1291311776.171003138 (5) echo 2
2
+ 1291311776.183450114 (7) sleep 1
+ 1291311777.203053652 (8) echo done
done
This assumes GNU date, but you can change the output specification to anything you like or whatever matches the version of date that you use.
Note: If you have an existing script that you want to do this with without modifying it, you can do this:
PS4='+ $(date "+%s.%N ($LINENO) ")' bash -x scriptname
In the upcoming Bash 5, you will be able to avoid forking date (though you get microseconds instead of nanoseconds):
PS4='+ $EPOCHREALTIME ($LINENO) '
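For instance (assuming bash >= 5):
#!/bin/bash -x
PS4='+ $EPOCHREALTIME ($LINENO) '
sleep 1
echo done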
You could pipe the output of running under -x through to something that timestamps each line when it is received. For example, tai64n from djb's daemontools.
As a basic example,
sh -x slow.sh 2>&1 | tai64n | tai64nlocal
This conflates stdout and stderr but it does give everything a timestamp.
You'd have to then analyze the output to find expensive lines and correlate that back to your source.
You might also conceivably find using strace helpful. For example,
strace -f -ttt -T -o /tmp/analysis.txt slow.sh
This will produce a very detailed report, with lots of timing information in /tmp/analysis.txt, but at a per-system call level, which might be too detailed.
Sounds like you want to time each echo. If echo is all you're doing, this is easy:
alias echo='time echo'
If you're running other command this obviously won't be sufficient.
Updated
See enable_profiler/disable_profiler in
https://github.com/vlovich/bashrc-wrangler/blob/master/bash.d/000-setup
which is what I use now. I haven't tested it on every version of bash specifically, but if you have the ts utility installed it works very well, with low overhead.
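For example, piping a trace through ts gives per-line timestamps; a sketch assuming moreutils' ts (the -i flag for incremental deltas is an assumption about your ts version):
bash -x ./slow.sh 2>&1 | ts      # wall-clock timestamp per line
bash -x ./slow.sh 2>&1 | ts -i   # time elapsed since the previous line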
Old
My preferred approach is below. The reason is that it supports OSX as well (which doesn't have a high-precision date) and runs even if you don't have bc installed.
#!/bin/bash
_profiler_check_precision() {
    if [ -z "$PROFILE_HIGH_PRECISION" ]; then
        #debug "Precision of timer is unknown"
        if which bc > /dev/null 2>&1 && date '+%s.%N' | grep -vq '\.N$'; then
            PROFILE_HIGH_PRECISION=y
        else
            PROFILE_HIGH_PRECISION=n
        fi
    fi
}

_profiler_ts() {
    _profiler_check_precision
    if [ "y" = "$PROFILE_HIGH_PRECISION" ]; then
        date '+%s.%N'
    else
        date '+%s'
    fi
}

profile_mark() {
    _PROF_START="$(_profiler_ts)"
}

profile_elapsed() {
    _profiler_check_precision
    local NOW="$(_profiler_ts)"
    local ELAPSED=
    if [ "y" = "$PROFILE_HIGH_PRECISION" ]; then
        ELAPSED="$(echo "scale=10; $NOW - $_PROF_START" | bc | sed 's/\(\.[0-9]\{0,3\}\)[0-9]*$/\1/')"
    else
        ELAPSED=$((NOW - _PROF_START))
    fi
    echo "$ELAPSED"
}

do_something() {
    local _PROF_START
    profile_mark
    sleep 10
    echo "Took $(profile_elapsed) seconds"
}
Here's a simple method that works on almost every Unix and needs no special software:
enable shell tracing, e.g. with set -x
pipe the output of the script through logger:
sh -x ./slow_script 2>&1 | logger
This writes the output to syslog, which automatically adds a time stamp to every message. If you use Linux with journald, you can get high-precision time stamps using
journalctl -o short-monotonic _COMM=logger
Many traditional syslog daemons also offer high-precision time stamps (milliseconds should be sufficient for shell scripts).
Here's an example from a script that I was just profiling in this manner:
[1940949.100362] bremer root[16404]: + zcat /boot/symvers-5.3.18-57-default.gz
[1940949.111138] bremer root[16404]: + '[' -e /var/tmp/weak-modules2.OmYvUn/symvers-5.3.18-57-default ']'
[1940949.111315] bremer root[16404]: + args=(-E $tmpdir/symvers-$krel)
[1940949.111484] bremer root[16404]: ++ /usr/sbin/depmod -b / -ae -E /var/tmp/weak-modules2.OmYvUn/symvers-5.3.18-57-default 5.3.18-57>
[1940952.455272] bremer root[16404]: + output=
[1940952.455738] bremer root[16404]: + status=0
where you can see that the "depmod" command is taking a lot of time.
Since I've ended up here at least twice now, I implemented a solution:
https://github.com/walles/shellprof
It runs your script, transparently clocks all lines printed, and at the end prints a top 10 list of the lines that were on screen the longest:
~/s/shellprof (master|✔) $ ./shellprof ./testcase.sh
quick
slow
quick
Timings for printed lines:
1.01s: slow
0.00s: <<<PROGRAM START>>>
0.00s: quick
0.00s: quick
~/s/shellprof (master|✔) $
I'm not aware of any shell profiling tools.
Historically one just rewrites too-slow shell scripts in Perl, Python, Ruby, or even C.
A less drastic idea would be to use a faster shell than bash. Dash and ash are available for all Unix-style systems and are typically quite a bit smaller and faster.
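A quick way to check whether the shell itself is the bottleneck (the script must avoid bashisms for the comparison to be fair):
time bash ./slow_script.sh
time dash ./slow_script.sh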
