I'm trying to capture logs for a set period, parse them, and produce a report. Here's what I do:
(tail -F log_file | grep --line-buffered some_text | awk '{process lines} END {produce report}') & pid=$! && disown
sleep 60
pkill -TERM -P $pid
kill -TERM $pid
Explanation:
tail the log file and pipe it through grep and awk, which process lines and produce the report in awk's END block. Run these in a command group (within ()).
Wait 60 seconds.
Kill the children of the command group (tail, grep, awk).
Kill the command group.
Now the problem is that when awk is killed, it won't write the report (the END block never runs)! What am I doing wrong here? Can you suggest a workaround?
You already explained your problem. "When Awk is killed, it won't write the report."
The proper solution is to only kill tail and then wait for the rest of the pipeline to finish.
If your tail supports the --pid= argument, that's easy: start a sentinel sleep process in the background, let it run for as long as you need, and wait for it to exit; tail exits with it, and awk sees EOF. You can even reuse the sleep 60 you already have; just start it before the tail pipeline.
sleep 60 &
zzz=$!
tail -F log_file --pid=$zzz | awk '/some_text/ {process lines} END {produce report}'
wait $zzz
(Note also the refactoring that folds the useless grep into awk's pattern.)
If it doesn't, things easily get a lot more involved. I can imagine that it might be possible by setting a clever trap in the subshell, but I would probably simply write a short Python or Perl wrapper with a signal handler to write the report when terminated.
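If --pid isn't available, one workaround (a sketch, not a drop-in fix: the helper name run_window is made up, the pattern and report are placeholders, and it assumes pkill and bash's $BASHPID) is to terminate only the tail process; awk then reads EOF and still runs its END block:

```shell
# run_window LOGFILE SECONDS: follow LOGFILE for SECONDS, then report.
run_window () {
    local logfile=$1 secs=$2
    tail -F -- "$logfile" | awk '/some_text/ {n++} END {print "matches:", n+0}' &
    sleep "$secs"
    pkill -P "$BASHPID" -x tail   # kill only our tail child ...
    wait                          # ... so awk sees EOF, runs END, and exits
}
```

In place of the original pipeline you would then call run_window log_file 60.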
Related
I am debugging an application that has a long running time and produces only a logfile as reliably complete output, so I usually use tail -f to monitor the output. Since it also requires some specific setup of environment variables, I have wrapped the whole invocation, including tail -f LOGFILE &, into a bash script.
However, this creates a tail process that won't be terminated automatically and will remain running. Cleanup with trap leads to complicated code once there is more than a single cleanup task, and there is no obvious way to account for all the ways the script may be terminated.
Using the timeout command, I could limit tail -f to terminate after a fixed total time, but that would break cases where it is SUPPOSED to run longer.
So I was wondering if there is a way to limit tail -f such that it terminates if the followed file doesn't change for a specified amount of time.
Update: The script below worked for me when executed on its own, but in some instances the tail process would not terminate regardless. It isn't entirely clear whether tail -f detects that the process it is piping to has terminated.
Lacking a builtin solution, a stdout-based timeout can be produced in bash, exploiting the fact that tail terminates once its stdout is closed (it is killed by SIGPIPE on its next write).
# Usage: withTimeout TIMEOUT COMMAND [ARGS ...]
# Execute COMMAND. Terminate, if it hasn't produced new output in TIMEOUT seconds.
# Depending on the platform, TIMEOUT may be fractional. See `help read`.
withTimeout () {
    local timeout="$1"; shift
    # Forward each line; when read times out, the loop ends, the pipe
    # closes, and COMMAND is terminated by SIGPIPE on its next write.
    "$@" | while IFS= read -r -t "${timeout}" line; do
        printf '%s\n' "$line"
    done
}
withTimeout 2 tail -f LOGFILE &
Note that tail may resort to polling the file once per second if it cannot use inotify. If faster output is needed, the -s option (tail's sleep interval) can be supplied.
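The primitive this builds on can be seen in isolation: read -t fails once no input arrives within the timeout, the loop ends, and the writer is left with a closed pipe. A self-contained check (the stalling producer is made up for the demonstration):

```shell
# A producer emits one line, then stalls; read -t 1 times out, the
# reading loop exits, and "beta" is never forwarded.
( echo alpha; sleep 3; echo beta ) | while IFS= read -r -t 1 line; do
    printf '%s\n' "$line"
done
```

Only alpha makes it through; when the producer finally writes beta, the reading side is gone and the write raises SIGPIPE.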
All,
I am trying to run a bash script that kicks off several subprocesses. The processes redirect to their own log files, and I must kick them off in parallel. To do this I have written a check_procs function that monitors the number of processes sharing the script's PID as their parent. Once the number reaches 1 again, the script should continue. However, it seems to just hang. I am not sure why, but the code is below:
check_procs() {
while true; do
mypid=$$
backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
until [ $backup_procs == 1 ]; do
echo $backup_procs
sleep 5
backup_procs=`ps -eo ppid | grep -w $mypid | wc -w`
done
done
}
This function is called after the processes are kicked off. I can see it echoing out the number of processes, but then the echoing stops (suggesting the inner loop has finished because the process count is now 1), yet nothing else happens, and I can see the script is still in the server's process list. I have to kill it off manually. The part where the function is called is below:
for ((i=1; i <= $threads; i++)); do
<Some trickery here to generate $cmdfile and $logfile>
nohup rman target / cmdfile=$cmdfile log=$logfile &
x=$(($x+1))
done
check_procs
$threads is a command-line parameter passed to the script and is a small number like 4 or 6. The jobs are kicked off using nohup, as shown. When the condition in check_procs is satisfied, everything hangs instead of executing the remainder of the script. What's wrong with my function?
Maybe I'm mistaken, but isn't this expected? Your outer loop runs forever; there is no exit point. Once the process count reaches 1, the inner until loop finishes, but while true immediately restarts it, so the function never returns. Worse, at that point it spins without any delay, which is not recommended.
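A sketch of the fix under that diagnosis: drop the outer while true so the function returns once the count falls to 1. The CHECK_INTERVAL variable is a made-up knob (defaulting to the original 5 seconds); everything else follows the original logic:

```shell
check_procs () {
    local mypid=$$ n
    n=$(ps -eo ppid= | grep -wc "$mypid")
    # Poll until only one process (the counting subshell itself)
    # still has this script as its parent.
    until [ "$n" -le 1 ]; do
        echo "$n"
        sleep "${CHECK_INTERVAL:-5}"
        n=$(ps -eo ppid= | grep -wc "$mypid")
    done
}
```

That said, since the nohup'd rman jobs are direct children of the script, a plain wait after the for loop would block until they all finish without any polling at all.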
I used the following bash code:
for pid in `top -n 1 | awk '{if($8 == "R") print $1;}'`
do
kill $pid
done
It says:
./kill.sh: line 3: kill: 29162: arguments must be process or job IDs
./kill.sh: line 3: kill: 29165: arguments must be process or job IDs
./kill.sh: line 3: kill: 29166: arguments must be process or job IDs
./kill.sh: line 3: kill: 29169: arguments must be process or job IDs
What causes this error and how do I kill processes in Bash?
I usually use:
pkill <process name>
In your case:
pkill R
Note that this will kill all the running instances of R, which may or may not be what you want.
Probably this awk command is not returning any reliable data; in any case, there's a much easier way:
kill `pidof R`
Or:
killall R
It seems there may be a (possibly aliased) script kill.sh in your current directory which is acting as an intermediary and calling the kill builtin. However, the script is passing the wrong arguments to the builtin. (I can't give details without seeing the script.)
Solution.
Your command will work fine using the kill builtin. The simplest solution is to ensure you use the bash builtin kill. Execute:
chmod a-x kill.sh
unalias kill
unset -f kill
This will prevent the script from running and remove any alias or function that may interfere with your use of the kill builtin.
Note.
You can also simplify your command:
kill `top -n 1 | awk '{if($8 == "R") print $1;}'`
Alternatively...
You can also use builtin to bypass any functions and aliases:
builtin kill `top -n 1 | awk '{if($8 == "R") print $1;}'`
Please try:
top -n 1 | awk '{print "(" $1 ") - (" $8 ")";}'
to understand how $1 and $8 are evaluated.
Please also post the content of ./kill.sh and explain the purpose of this script.
top issues a header you need to suppress or skip over.
Or maybe you can use ps since its output is highly configurable.
Also note that kill accepts multiple PIDs, so there's no reason to employ a loop.
kill $( ps h -o pid,state | awk '$2 == "R" { print $1 }' )
In addition to awk, sed can provide pid isolation in the list for kill:
kill $(ps h -o pid,state | grep R | sed -e 's/\s.*$//')
Basically, the opposite side of the same coin.
Where I work, I have seen the below snippet in shell scripts to check for the completion of background jobs:
until [[ `ps -ef | grep backgroundjob | grep -v grep | wc -l` -eq 0 ]];
do
sleep 30
done
Having read the man page of the wait command, I know these 3 lines can be replaced by the wait command in a shorter and more readable way. My questions are:
1. Are there any disadvantages or scenarios where the wait command might not work as well as the snippet above?
2. How is the wait command implemented? It seems to return almost immediately, so is it a tight loop? If so, wouldn't the snippet above, which sleeps for 30 seconds, go easier on the CPU than wait?
wait only works for child processes of the current shell. This means that, if your process forked to background itself, the original child process will have exited, and the shell won't be able to wait on it.
Some shells' wait builtin will return immediately if there is no such child process; others, like bash, will warn you:
$ wait 1234
bash: wait: pid 1234 is not a child of this shell
wait doesn't impose any CPU load because it uses the waitpid(2) system call, which blocks the calling process until the nominated child has exited.
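A small self-contained illustration of both points: wait blocks without polling and hands back the child's exit status, but only for a direct child of the shell:

```shell
# Start a direct child of this shell that exits with status 3.
( sleep 0.2; exit 3 ) & pid=$!

status=0
wait "$pid" || status=$?   # blocks in waitpid(2); no polling loop
echo "child exited with status $status"
```

A process that double-forked to daemonize itself would no longer be a child here, and wait on its PID would fail with the "not a child of this shell" message shown above.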