Explain some bash tips - bash

I have a piece of code for PID file control, and there are parts of the programmer's style that I don't understand.
What puzzles me:
The use of && in
[[ $mypid -ne $procpid ]] &&
and the line that relaunches the script (which does not work on Mac OS X):
$0 $# &
The complete code:
function createpidfile() {
    mypid=$1
    pidfile=$2
    # Close stderr, don't overwrite existing file, shove my pid in the lock file.
    $(exec 2>&-; set -o noclobber; echo "$mypid" > "$pidfile")
    [[ ! -f "$pidfile" ]] && exit # Lock file creation failed
    procpid=$(<"$pidfile")
    [[ $mypid -ne $procpid ]] && {
        # I'm not the pid in the lock file
        # Is the process pid in the lockfile still running?
        isrunning "$pidfile" || {
            # No. Kill the pidfile and relaunch ourselves properly.
            rm "$pidfile"
            $0 $# &
        }
        exit
    }
}
I'm lost

[[ ! -f "$pidfile" ]] && exit means "if there is no file called $pidfile then exit" (using the short-circuit evaluation) - exit will not be evaluated if the file exists.
$0 $# &:
$0 - the first argument in the command line (meaning the executable itself);
$# - all the remaining arguments passed onto the command line;
& - send the process to background after the launch.

command1 && command2
command2 is executed if, and only if, command1 returns an exit status of zero.
$0 is the name the script was invoked as.
$# is the number of positional parameters, not the parameters themselves (that would be "$@").
And the trailing & sends the process to the background.
Everything is documented in the bash manual; see e.g. section 3.4.2, Special Parameters.
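The exit-status rule in two lines:
true  && echo "runs: exit status was 0"
false && echo "never runs: exit status was 1"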

&& is a logical AND.
If the condition [[ $mypid -ne $procpid ]] is true, the code in the block {...} gets executed.
$0 $# & is meant to restart the script in the background with the same arguments (though $# is actually the argument count; "$@" is what passes the arguments themselves).
$0 is the command that invoked the script.
$# is the number of arguments passed to the script.
& indicates the preceding command should be executed in the background.

It's boolean short-circuiting: if the part before the && (and) operator evaluates to false, there's no need to execute the second part (the block between { and }). The same trick is used with the || operator, which only executes the second part if the first returned false (i.e. a non-zero exit status).
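Both operators in a quick sketch:
[[ -f /etc/passwd ]] && echo "file exists"     # second part runs: the test succeeded
[[ -f /no/such/file ]] || echo "file missing"  # second part runs: the test failed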


Calling exit doesn't work as expected

I need to run several child processes in the background and pipe data between them. When the script exits, I want to kill any of them that remain, so I added
trap cleanup EXIT
cleanup()
{
    echo "Cleaning up!"
    pkill -TERM -P $$
}
Since I need to react if one of the processes reports an error, I created wrapper functions. Anything that ends with fd is a previously opened file descriptor, connected to a FIFO pipe.
run_gui()
{
    "$GAME_BIN" $args <&$gui_infd >&$gui_outfd # redirecting IO to some file descriptors
    if [[ $? == 0 ]]; then
        echo exiting ok
        exit $OK_EXITCODE
    else
        exit $ERROR_EXITCODE
    fi
}
The functions run_ai1() and run_ai2() are analogous.
run_ai1()
{
    "$ai1" <&$ai1_infd >&$ai1_outfd
    if [[ $? == 0 || $? == 1 || $? == 2 ]]; then
        exit $OK_EXITCODE
    else
        exit $ERROR_EXITCODE
    fi
}
run_ai2()
{
    "$ai2" <&$ai2_infd >&$ai2_outfd
    if [[ $? == 0 || $? == 1 || $? == 2 ]]; then
        exit $OK_EXITCODE
    else
        exit $ERROR_EXITCODE
    fi
}
Then I run the functions and do the necessary piping:
printinit 1 >&$ai1_infd
printinit 2 >&$ai2_infd
run_gui &
run_ai1 &
run_ai2 &
while true; do
    echo "Started the loop"
    while true; do
        read -u $ai1_outfd line || echo "Nothing read"
        echo $line
        if [[ $line ]]; then
            echo "$line" >&$gui_infd
            echo "$line" >&$ai2_infd
            if [[ "$line" == "END_TURN" ]]; then
                break
            fi
        fi
    done
    sleep $turndelay
    while true; do
        read -u $ai2_outfd line || echo "nothing read"
        echo $line
        if [[ $line ]]; then
            echo "$line" >&$gui_infd
            echo "$line" >&$ai1_infd
            if [[ "$line" == "END_TURN" ]]; then
                break
            fi
        fi
    done
    sleep $turndelay
done
When $GAME_BIN exits, i.e. the GUI is closed with the close button, I can see the exiting ok message on stdout, but the cleanup function is not called at all. When I add a manual call to cleanup before calling exit $OK_EXITCODE, the processes do get killed:
./game.sh: line 309: 9193 Terminated run_gui
./game.sh: line 309: 9194 Terminated run_ai1
./game.sh: line 309: 9195 Terminated run_ai2
./game.sh: line 309: 9203 Terminated sleep $turndelay
but the loop runs anyway and the script doesn't exit as it should (with exit $OK_EXITCODE). The AI scripts are simple:
#!/bin/sh
while true; do
    echo END_TURN
done
There is no wait call anywhere in my script. What am I doing wrong?
What's interesting: when I call jobs -p right after run_ai2 &, I get 3 pids listed; but when I invoke the same command from the cleanup function, the output is empty.
Besides, why is the sleep $turndelay process terminated? It's not a process invoked by a child.
An EXIT trap fires when the trapping script exits. Your toplevel script isn't exiting here.
The trap isn't inherited by the sub-shells that your run_* functions run in (a consequence of their being run in the background), so it never triggers when those sub-shells exit.
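A minimal standalone sketch of that behaviour:
#!/bin/bash
trap 'echo "EXIT trap fired by $$"' EXIT
( exit 0 ) &    # background sub-shell: the trap does NOT run when it exits
wait
echo "main script exiting now"
# The trap fires exactly once, after the line above, when the top-level script exits.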
What you want is most likely what you did manually (though, from the sound of it, slightly incorrectly).
You want the cleanup function called from run_gui when $GAME_BIN has exited. Something like this:
run_gui() {
    "$GAME_BIN" $args <&$gui_infd >&$gui_outfd # redirecting IO to some file descriptors
    ret=$?
    cleanup
    exit $ret
}
Then you'll just need to make sure that cleanup gets the right value of $$. In bash it will, for your usage, even in a sub-shell, since $$ in a sub-shell is still the main shell's process ID; but you might want to make that more explicit by setting up a handler for a signal in your main script and signalling the main script when run_gui terminates instead.
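The $$-in-a-sub-shell point, illustrated ($BASHPID, unlike $$, is the sub-shell's own PID):
echo "$$ $BASHPID"        # identical in the main shell
( echo "$$ $BASHPID" ) &  # $$ is unchanged in the sub-shell; $BASHPID differs
wait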
I'd guess you are getting some child processes kicked off by a child process. Do this: in another window, run ps -ft pts/1 (or whatever your tty is) and verify.
Also change the pkill to kill $(jobs -p) and see if that works.
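That is, something like this sketch of the cleanup function:
cleanup()
{
    echo "Cleaning up!"
    kill $(jobs -p) 2>/dev/null    # kill this shell's remaining background jobs, if any
}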

Run a shell script with While condition in an infinite loop based on conditions

I need to create a shell script that places indicator/flag files in a directory, say /dir1/dir2/flag_file_directory, based on the request flags received in a directory /dir1/dir2/req_flag_file_directory and the source files present in a directory, say /dir1/dir2/source_file_directory. For this I need to run a script with a while condition in an infinite loop, as I do not know when the source files will be made available.
My implementation plan is roughly this: JOB1, scheduled to run at some time in the morning, will first place (touch) a request flag (e.g. touch /dir1/dir2/req_flag_file_directory/req1.req) saying that this job is running. The script should then look in the source file directory for source files matching the pattern file_pattern_YYYYMMDD.CSV (the patterns differ per job) and, if they are present, count them. If the count is correct, it should first delete the request flag for this job and then touch an indicator/flag file in /dir1/dir2/flag_file_directory. That indicator file will then signal that the source files are all present and that the job can continue loading them into our system.
I will keep all the details about the jobs and their flag files in a file with the structure shown below. Based on the request flag, the script should know what other criteria to check before placing the indicator file:
request_flags|source_name|job_name|file_pattern|file_count|indicator_flag_file
req1.req|Sourcename1|jobname1|file_pattern_1|3|ind1.ind
req2.req|Sourcename2|jobname2|file_pattern_2|6|ind2.ind
req3.req|Sourcename3|jobname3|file_pattern_3|1|ind3.ind
reqN.req|SourcenameN|jobnameN|file_pattern_N|2|indN.ind
Please let me know how this can be achieved, and whether you have other suggestions or solutions.
Rather than having the service daemon script poll in an infinite loop (i.e. wake up periodically to check whether it needs to do work), you could use file locking and a named pipe to create an event queue.
Outline of the service daemon, daemon.sh. This script loops forever, blocking on the read from the named pipe at read line until a message arrives (i.e., some other process writes to $RequestPipe).
#!/bin/bash
# daemon.sh
LockDir="/dir1/dir2/req_flag_file_directory"
LockFile="${LockDir}/.MultipleWriterLock"
RequestPipe="${LockDir}/.RequestQueue"
while true ; do
    if read line < "$RequestPipe" ; then
        # ... commands to be executed after message received ...
        echo "$line" # for example
    fi
done
An outline of requestor.sh, the script that wakes up the service daemon when everything is ready. It does all the necessary preparation, e.g. creating files in req_flag_file_directory and source_file_directory, then wakes the service daemon script by writing to the named pipe. It could even send a message that contains more information for the service daemon, say "Job 1 ready".
#!/bin/bash
# requestor.sh
LockDir="/dir1/dir2/req_flag_file_directory"
LockFile="${LockDir}/.MultipleWriterLock"
RequestPipe="${LockDir}/.RequestQueue"
# ... create all the necessary files ...
(
    flock --exclusive 200
    # Unblock the service daemon/listener by sending a line of text.
    echo Wake up sleepyhead. > "$RequestPipe"
) 200>"$LockFile" # subshell exit releases lock automatically
daemon.sh fleshed out with some error handling:
#!/bin/bash
# daemon.sh
LockDir="/dir1/dir2/req_flag_file_directory"
LockFile="${LockDir}/.MultipleWriterLock"
RequestPipe="${LockDir}/.RequestQueue"
SharedGroup=$(echo need to put a group here 1>&2; exit 1)
#
if [[ ! -w "$RequestPipe" ]] ; then
    # Handle 1st time. Or fix a problem.
    mkfifo --mode=775 "$RequestPipe"
    chgrp "$SharedGroup" "$RequestPipe"
    if [[ ! -w "$RequestPipe" ]] ; then
        echo "ERROR: request queue, can't write to $RequestPipe" 1>&2
        exit 1
    fi
fi
while true ; do
    if read line < "$RequestPipe" ; then
        # ... commands to be executed after message received ...
        echo "$line" # for example
    fi
done
requestor.sh fleshed out with some error handling:
#!/bin/bash
# requestor.sh
LockDir="/dir1/dir2/req_flag_file_directory"
LockFile="${LockDir}/.MultipleWriterLock"
RequestPipe="${LockDir}/.RequestQueue"
SharedGroup=$(echo need to put a group here 1>&2; exit 1)
# ... create all the necessary files ...
#
if [[ ! -w "$LockFile" ]] ; then
    # Handle 1st time. Or fix a problem.
    touch "$LockFile"
    chgrp "$SharedGroup" "$LockFile"
    chmod 775 "$LockFile"
    if [[ ! -w "$LockFile" ]] ; then
        echo "ERROR: write lock, can't write to $LockFile" 1>&2
        exit 1
    fi
fi
if [[ ! -w "$RequestPipe" ]] ; then
    # Handle 1st time. Or fix a problem.
    mkfifo --mode=775 "$RequestPipe"
    chgrp "$SharedGroup" "$RequestPipe"
    if [[ ! -w "$RequestPipe" ]] ; then
        echo "ERROR: request queue, can't write to $RequestPipe" 1>&2
        exit 1
    fi
fi
(
    flock --exclusive 200 || {
        echo "ERROR: write lock, $LockFile flock failed." 1>&2
        exit 1
    }
    # Unblock the service daemon/listener by sending a line of text.
    echo Wake up sleepyhead. > "$RequestPipe"
) 200> "$LockFile" # subshell exit releases lock automatically
I still have some doubts about the contents of the requests file, but I think I've come up with a rather simple solution:
#!/bin/bash
DETAILS_FILE="details.txt"
DETAILS_LINES=$((`wc -l $DETAILS_FILE|awk '{print $1}'`-1)) # to remove banner line (first line)
DETAILS=`tail -$DETAILS_LINES $DETAILS_FILE|tr '\n\r' ' '`
PIDS=()
IFS=' '
waitall () { # PID...
    ## Wait for children to exit and indicate whether all exited with 0 status.
    local errors=0
    while :; do
        debug "Processes remaining: $*"
        for pid in "$@"; do
            echo "PID: $pid"
            shift
            if kill -0 "$pid" 2>/dev/null; then
                debug "$pid is still alive."
                set -- "$@" "$pid"
            elif wait "$pid"; then
                debug "$pid exited with zero exit status."
            else
                debug "$pid exited with non-zero exit status."
                ((++errors))
            fi
        done
        (("$#" > 0)) || break
        # TODO: how to interrupt this sleep when a child terminates?
        sleep ${WAITALL_DELAY:-1}
    done
    ((errors == 0))
}
debug () { echo "DEBUG: $*" >&2; }
# Function to check for the number of source files matching a pattern in a directory.
# Params: req3.req Sourcename3 jobname3 file_pattern_3 1 ind3.ind
check () {
    NOFILES=`find $2 -type f | egrep -c $4`
    if [ $NOFILES -eq "$5" ]; then
        echo "Touching file $6. done."
        touch $6
    else
        echo "$NOFILES matching $4 pattern. exiting"
    fi
}
echo "parsing $DETAILS_FILE file..."
read -a lines <<< "$DETAILS"
for line in "${lines[@]}"
do
    IFS='|'
    read -a ARRAY <<< "$line"
    echo "Line processed. Dispatching job ${ARRAY[2]}..."
    check "${ARRAY[@]}" &
    IFS=' '
    PIDS+=("$!")
    #echo $PIDS
done
waitall "${PIDS[@]}"
wait
Although it's not exactly an infinite loop, this script is intended to run from a crontab.
First it reads the details.txt file, as per your example.
After parsing all the details, the script dispatches the check function, whose sole purpose is to count the number of files matching file_pattern in each source_name folder; if the number of files equals file_count, it touches the indicator_flag_file.
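For example, a hypothetical crontab entry that runs such a script every five minutes (script path and log file are assumptions):
*/5 * * * * /path/to/dispatch_checks.sh >> /var/log/dispatch_checks.log 2>&1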
Hope that helps!

why does pgrep fail in this process monitor?

I have a monitor shell script that effectively monitors and keeps a process running. But it often fails, in the sense that it starts a second, third or more instance of the process. I have also seen the pgrep command (pgrep -n -f wx_nanoserver) return the wrong pid at the command line...
Here's my script:
#!/bin/bash
check_process() {
    # echo "$ts: checking $1"
    [ "$1" = "" ] && return 0
    [ `pgrep -n -f $1` ] && return 1 || return 0
}
while [ 1 ]; do
    # timestamp
    ts=`date +%T`
    NOW=`date +"%Y%m%d-%H%M%S"`
    # echo "$ts: begin checking..."
    check_process "wx_nanoserver"
    [ $? -eq 0 ] && echo "$ts: not running, restarting..." && `php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log &`
    sleep 5
done
try:
pgrep -n -f "$1" && return 1 || return 0
If you use [ ], you test pgrep's stdout (which your script never compares against anything); without [ ], you use pgrep's exit code directly.
Two weird things about your script:
[ `pgrep -n -f $1` ] && return 1 || return 0
works through side effects. The backquoted part evaluates to either the pid of the process if found, or nothing if no process is found. The single-bracket notation [ is a synonym for the test builtin (or command on earlier systems), which happens to return true if its argument is a nonempty string and false if it is given no argument. So when a pid is found, the test becomes something like [ 1234 ], which evaluates to true, and [ ] otherwise, which evaluates to false. That is indeed what you want, but it would be cleaner to write:
pgrep -n -f "$1" &>/dev/null && return 1 || return 0
Another thing is
`php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log &`
where you use command substitution for no apparent reason. You're asking bash to evaluate the output of your command rather than simply running it. As its output is redirected, it always evaluates to an empty string, so it has no further effect. A side effect is that the command is run in a subshell, which is a good thing for daemonizing it. Though, it would be cleaner to write:
( php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log & )
Not sure though what the actual problem might be. Seems to be working that way anyhow.
Final note: the backquote notation has been deprecated in favour of the $() notation.

bash script: how to save return value of first command in a pipeline?

Bash: I want to run a command and pipe the results through some filter, but if the command fails, I want to return the command's error value, not the boring return value of the filter:
E.g.:
if !(cool_command | output_filter); then handle_the_error; fi
Or:
set -e
cool_command | output_filter
In either case it's the return value of cool_command that I care about -- for the 'if' condition in the first case, or to exit the script in the second case.
Is there some clean idiom for doing this?
Use the PIPESTATUS builtin variable.
From man bash:
PIPESTATUS
    An array variable (see Arrays below) containing a list of exit status values from the processes in the most-recently-executed foreground pipeline (which may contain only a single command).
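Applied to the question's pipeline, a minimal sketch (read PIPESTATUS immediately, since the next command overwrites it):
cool_command | output_filter
status=${PIPESTATUS[0]}    # exit status of cool_command, not of output_filter
if [ $status -ne 0 ]; then handle_the_error; fi
For the set -e variant, set -o pipefail makes the pipeline as a whole fail whenever cool_command fails.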
If you didn't need to display the error output of the command, you could do something like
if ! echo | mysql $dbcreds mysql; then
    error "Could not connect to MySQL. Did you forget to add '--db-user=' or '--db-password='?"
    die "Check your credentials or ensure server is running with /etc/init.d/mysqld status"
fi
In the example, error and die are functions defined elsewhere in the script. $dbcreds is also defined, though it is built from command-line options. If the command generates no error, nothing is returned. If an error occurs, this particular command returns the error text.
Correct me if I'm wrong, but I get the impression you're really looking to do something a little more convoluted than
[ `id -u` -eq '0' ] || die "Must be run as root!"
where you actually grab the user ID prior to the if statement, and then perform the test. Doing it this way, you could then display the result if you choose. This would be
uid=`id -u`   # note: lowercase, since UID is a readonly variable in bash
if [ $uid -eq '0' ]; then
    echo "User is root"
else
    echo "User is not root"
    exit 1 ## set an exit code higher than 0 if you're exiting because of an error
fi
The following script uses a fifo to filter the output in a separate process. This has the following advantages over the other answers. First, it is not bash specific. In particular it does not rely on PIPESTATUS. Second, output is not stalled until the command has completed.
$ cat >test_filter.sh <<'EOF'
#!/bin/sh
cmd()
{
    echo $1
    echo $2 >&2
    return $3
}
filter()
{
    while read line
    do
        echo "... $line"
    done
}
tmpdir=$(mktemp -d)
fifo="$tmpdir"/out
mkfifo "$fifo"
filter <"$fifo" &
pid=$!
cmd a b 10 >"$fifo" 2>&1
ret=$?
wait $pid
echo exit code: $ret
rm -f "$fifo"
rmdir "$tmpdir"
EOF
$ sh ./test_filter.sh
... a
... b
exit code: 10

How to terminate script's process tree in Cygwin bash from bash script

I have a Cygwin bash script that I need to watch and terminate under certain conditions - specifically, after a certain file has been created. I'm having difficulty figuring out how exactly to terminate the script with the same level of completeness that Ctrl+C does, however.
Here's a simple script (called test1) that does little more than wait around to be terminated.
#!/bin/bash
test -f kill_me && rm kill_me
touch kill_me
tail -f kill_me
If this script is run in the foreground, Ctrl+C will terminate both the tail and the script itself. If the script is run in the background, a kill %1 (assuming it is job 1) will also terminate both tail and the script.
However, when I try to do the same thing from a script, I'm finding that only the bash process running the script is terminated, while tail hangs around disconnected from its parent. Here's one way I tried (test2):
#!/bin/bash
test -f kill_me && rm kill_me
(
    touch kill_me
    tail -f kill_me
) &
while true; do
    sleep 1
    test -f kill_me && {
        kill %1
        exit
    }
done
If this is run, the bash subshell running in the background is terminated OK, but tail still hangs around.
If I use an explicitly separate script, like this, it still doesn't work (test3):
#!/bin/bash
test -f kill_me && rm kill_me
# assuming test1 above is included in the same directory
./test1 &
while true; do
    sleep 1
    test -f kill_me && {
        kill %1
        exit
    }
done
tail is still hanging around after this script is run.
In my actual case, the process creating files is not particularly instrumentable, so I can't get it to terminate of its own accord; by finding out when it has created a particular file, however, I can at that point know that it's OK to terminate it. Unfortunately, I can't use a simple killall or equivalent, as there may be multiple instances running, and I only want to kill the specific instance.
/bin/kill (the program, not the bash builtin) interprets a negative PID as "kill the process group", which will get all the children too.
Changing
kill %1
to
/bin/kill -- -$$
works for me.
Adam's link put me in a direction that will solve the problem, albeit not without some minor caveats.
The script doesn't work unmodified under Cygwin, so I rewrote it, and with a couple more options. Here's my version:
#!/bin/bash
function usage
{
    echo "usage: $(basename $0) [-c] [-<sigspec>] <pid>..."
    echo "Recursively kill the process tree(s) rooted by <pid>."
    echo "Options:"
    echo "    -c          Only kill children; don't kill root"
    echo "    <sigspec>   Arbitrary argument to pass to kill, expected to be signal specification"
    exit 1
}
kill_parent=1
sig_spec=-9
function do_kill # <pid>...
{
    kill "$sig_spec" "$@"
}
function kill_children # pid
{
    local target=$1
    local pid=
    local ppid=
    local i
    # Returns alternating ids: first is pid, second is parent
    for i in $(ps -f | tail +2 | cut -b 10-24); do
        if [ ! -n "$pid" ]; then
            # first in pair
            pid=$i
        else
            # second in pair
            ppid=$i
            (( ppid == target && pid != $$ )) && {
                kill_children $pid
                do_kill $pid
            }
            # reset pid for next pair
            pid=
        fi
    done
}
test -n "$1" || usage
while [ -n "$1" ]; do
    case "$1" in
        -c)
            kill_parent=0
            ;;
        -*)
            sig_spec="$1"
            ;;
        *)
            kill_children $1
            (( kill_parent )) && do_kill $1
            ;;
    esac
    shift
done
The only real downside is the somewhat ugly message that bash prints out when it receives a fatal signal, namely "Terminated", "Killed" or "Interrupted" (depending on what you send). However, I can live with that in batch scripts.
This script looks like it'll do the job:
#!/bin/bash
# Author: Sunil Alankar
##
# recursive kill. kills the process tree down from the specified pid
#
# foreach child of pid, recursive call dokill
dokill() {
    local pid=$1
    local itsparent=""
    local aprocess=""
    local x=""
    # next line is a single line
    for x in `/bin/ps -f | sed -e '/UID/d;s/[a-zA-Z0-9_-]\{1,\} \{1,\}\([0-9]\{1,\}\) \{1,\}\([0-9]\{1,\}\) .*/\1 \2/g'`
    do
        if [ "$aprocess" = "" ]; then
            aprocess=$x
            itsparent=""
            continue
        else
            itsparent=$x
            if [ "$itsparent" = "$pid" ]; then
                dokill $aprocess
            fi
            aprocess=""
        fi
    done
    echo "killing $1"
    kill -9 $1 > /dev/null 2>&1
}
case $# in
    1) PID=$1
       ;;
    *) echo "usage: rekill <top pid to kill>";
       exit 1;
       ;;
esac
dokill $PID
