Do a tail -F until matching a pattern - shell

I want to do a tail -F on a file until matching a pattern. I found a way using awk, but IMHO my command is not really clean. The problem is that I need to do it in only one line, because of some limitations.
tail -n +0 -F /tmp/foo | \
awk -W interactive '{if ($1 == "EOF") exit; print} END {system("echo EOF >> /tmp/foo")}'
The tail will block until EOF appears in the file. It works pretty well. The END block is mandatory because awk's exit does not exit right away: it makes awk evaluate the END block before quitting. The END block hangs on a read call (because of tail), so the last thing I need to do is write another line to the file to force tail to exit.
Does someone know a better way to do that?
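The awk behaviour described in the question (exit in a main rule jumps to the END block instead of quitting immediately) can be seen in isolation with a small sketch. The function name print_until_eof and the sample input are made up for illustration; no tail is involved, so nothing blocks.

```shell
#!/bin/sh
# Print lines until the first field equals "EOF". The END rule still runs
# after `exit` is called from the main rule.
print_until_eof() {
    awk '$1 == "EOF" { exit } { print } END { print "END block ran" }'
}

# Prints "one", "two", then "END block ran"; "three" is never reached.
printf 'one\ntwo\nEOF\nthree\n' | print_until_eof
```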

Use tail's --pid option and tail will stop when the shell dies. There is no need to append anything extra to the tailed file.
sh -c 'tail -n +0 --pid=$$ -f /tmp/foo | { sed "/EOF/ q" && kill $$ ;}'

Try this:
sh -c 'tail -n +0 -f /tmp/foo | { sed "/EOF/ q" && kill $$ ;}'
The whole command-line will exit as soon as the "EOF" string is seen in /tmp/foo.
There is one side-effect: the tail process will be left running (in the background) until anything is written to /tmp/foo.

I got no results with this solution:
sh -c 'tail -n +0 -f /tmp/foo | { sed "/EOF/ q" && kill $$ ;}'
There is some issue related to buffering: if no more lines are appended to the file, sed will not read the input. So, with a little more research, I came up with this:
sed '/EOF/q' <(tail -n 0 -f /tmp/foo)
The script is in https://gist.github.com/2377029

This is something Tcl is quite good at. If the following is "tail_until.tcl",
#!/usr/bin/env tclsh
proc main {filename pattern} {
set pipe [open "| tail -n +0 -F $filename"]
set pid [pid $pipe]
fileevent $pipe readable [list handler $pipe $pattern]
vwait ::until_found
catch {exec kill $pid}
}
proc handler {pipe pattern} {
if {[gets $pipe line] == -1} {
if {[eof $pipe]} {
set ::until_found 1
}
} else {
puts $line
if {[string first $pattern $line] != -1} {
set ::until_found 1
}
}
}
main {*}$argv
Then you'd do tail_until.tcl /tmp/foo EOF

Does this work for you?
tail -n +0 -F /tmp/foo | sed '/EOF/q'
I'm assuming that 'EOF' is the pattern you're looking for. The sed command quits when it finds it, which means that the tail should quit the next time it writes.
I suppose that there is an outside chance that tail would hang around if the pattern is found at about the end of the file, waiting for more output to appear in the file which will never appear. If that's really a concern, you could probably arrange to kill it - the pipeline as a whole will terminate when sed terminates (unless you're using a funny shell that decides that isn't the correct behaviour).
Grump about Bash
As feared, bash (on MacOS X, at least, but probably everywhere) is a shell that thinks it needs to hang around waiting for tail to finish even though sed quit. Sometimes - more often than I like - I prefer the behaviour of good old Bourne shell, which wasn't so clever and therefore guessed wrong less often than Bash does. dribbler is a program which dribbles out messages one per second ('1: Hello' etc. in the example), with the output going to standard output. In Bash, this command sequence hung until I ran 'echo pqr >>/tmp/foo' in a separate window.
date
{ timeout -t 2m dribbler -t -m Hello; echo EOF; } >/tmp/foo &
echo Hi
sleep 1 # Ensure /tmp/foo is created
tail -n +0 -F /tmp/foo | sed '/EOF/q'
date
Sadly, I don't immediately see an option to control this behaviour. I did find shopt lithist, but that's unrelated to this problem.
Hooray for Korn Shell
I note that when I run that script using Korn shell, it works as I'd expect - leaving a tail lurking around to be killed somehow. What works there is 'echo pqr >> /tmp/foo' after the second date command completes.

Here's an extended version of Jon's solution which uses sed instead of grep so that the output of tail goes to stdout:
sed -r '/EOF/q' <( exec tail -n +0 -f /tmp/foo ); kill $! 2> /dev/null
This works because sed gets created before tail so $! holds the PID of tail
The main advantage of this over the sh -c solutions is that killing a sh seems to print something to the output such as 'Terminated' which is unwelcome

sh -c 'tail -n +0 --pid=$$ -f /tmp/foo | { sed "/EOF/ q" && kill $$ ;}'
Here the main problem is with $$.
If you run command as is, $$ is set not to sh but to the PID of the current shell where command is run.
To make kill work you need to change kill $$ to kill \$$
After that you can safely get rid of --pid=$$ passed to tail command.
Summarising, following will work just fine:
/bin/sh -c 'tail -n 0 -f /tmp/foo | { sed "/EOF/ q" && kill \$$ ;}'
Optionally you can pass -n to sed to keep it quiet :)

To kill the dangling tail process as well, you may execute the tail command in a (Bash) process substitution context, which can later be killed as if it had been a backgrounded process. (Code taken from How to read one line from 'tail -f' through a pipeline, and then terminate?).
: > /tmp/foo
grep -m 1 EOF <( exec tail -f /tmp/foo ); kill $! 2> /dev/null
echo EOF > /tmp/foo # terminal window 2
As an alternative you could use a named pipe.
(
: > /tmp/foo
rm -f pidfifo
mkfifo pidfifo
sh -c '(tail -n +0 -f /tmp/foo & echo $! > pidfifo) |
{ sed "/EOF/ q" && kill $(cat pidfifo) && kill $$ ;}'
)
echo EOF > /tmp/foo # terminal window 2

Ready to use for Tomcat:
sh -c 'tail -f --pid=$$ catalina.out | { grep -i -m 1 "Server startup in" && kill $$ ;}'
For the scenario above:
sh -c 'tail -f --pid=$$ /tmp/foo | { grep -i -m 1 EOF && kill $$ ;}'

tail -f <filename> | grep -q "<pattern>"

Related

bash get command that was used before pipe symbol

For a half-finished script that already uses the output of a program I also need the name and the parameters of the program that was used to pipe to my script.
So I run it like this:
yay something | ./myscript
Now I need to store "yay something" into a variable.
There is a way to get previously run commands, or the current one, by using set -o history -o histexpand and echo !! or echo $0, but that doesn't include what I wrote right before the pipe.
Maybe you would suggest passing the name of the program and its parameters to my script as parameters and then running it there, but I don't want this (pass a command as an argument to bash script).
UPDATED SOLUTION (old below):
#!/bin/bash -i
#get processes
processes=$(> >(ps -f))
echo beginning:
echo "$processes"
#filter bin/bash -i
pac=$(echo "$processes" | sed '1,/bin\/bash -i/!d')
pac=$(echo "$pac" | tail -2 | head -1)
#kill
delete=$(echo $pac | grep -oP "(?<=$USER\s)\w+")
pac=$(echo "$pac" | grep -o -P '(?<=00:00:00).*(?=)')
echo "$delete"
kill -9 "$delete"
#print
echo " "
echo end:
echo "${pac:1}"
Note: When you use echo, man or cat then $pac will be empty.
OLD Text:
Thanks to Charles for his enormous effort and his link that finally led me to processes=$(> >(ps -f)).
Here is a working example. You can use it with, for example, vi test | ./testprocesses (or nano, or package helpers like yay or trizen, but it won't work with echo, man or cat):
#!/bin/bash -i
#get processes
processes=$(> >(ps -f))
echo beginning:
echo $processes
#filter
pac=$(echo $processes | grep -o -P '(?<=CM).*(?=testprocesses)' | grep -o -P '(?<=D).*(?=testprocesses)' | grep -o -P "(?<=00:00:00).*(?=$USER)")
#kill
delete=$(echo $pac | grep -oP "(?<=$USER\s)\w+")
pac=$(echo $pac | grep -o -P '(?<=00:00:00).*(?=)')
kill -9 $delete
#print
echo " "
echo end:
echo $pac
The kill part is necessary to kill the vi instance; otherwise it will still be running and will eventually interfere with future executions of the script.

xargs output buffering -P parallel

I have a bash function that I call in parallel using xargs -P, like so:
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction #'
Everything works fine but output is messed up for obvious reasons (no buffering)
Trying to figure out a way to buffer output effectively. I was thinking I could use awk, but I'm not good enough to write such a script and I can't find anything worthwhile on google? Can someone help me write this "output buffer" in sed or awk? Nothing fancy, just accumulate output and spit it out after process terminates. I don't care the order that shell functions execute, just need their output buffered... Something like:
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction # | sed -u ""'
P.S. I tried to use stdbuf as per
https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe but it did not work. I specified buffering on o and e, but the output is still unbuffered:
echo ${list} | xargs -n 1 -P 24 -I# stdbuf -i0 -oL -eL bash -l -c 'myAwesomeShellFunction #'
Here's my first attempt, this only captures first line of output:
$ bash -c "echo stuff;sleep 3; echo more stuff" | awk '{while (( getline line) > 0 )print "got ",$line;}'
$ got stuff
This isn't quite atomic if your output is longer than a page (4kb typically), but for most cases it'll do:
xargs -P 24 bash -c 'for arg; do printf "%s\n" "$(myAwesomeShellFunction "$arg")"; done' _
The magic here is the command substitution: $(...) creates a subshell (a fork()ed-off copy of your shell), runs the code ... in it, and then reads that in to be substituted into the relevant position in the outer script.
Note that we don't need -n 1 when dealing with a large number of arguments (for a small number it may improve parallelization), since each of your 24 parallel bash instances iterates over all the arguments it is passed.
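As a rough illustration of the command-substitution trick, here is a minimal sketch. The helper name run_buffered and the two-line "start/end" job are hypothetical stand-ins for myAwesomeShellFunction, and -n 1 is used only because the demo has just three arguments. Each worker collects its whole output first and then writes it with a single printf, so the two lines of a job stay together even though the jobs run in parallel.

```shell
#!/bin/sh
# run_buffered ARG... (hypothetical helper): run one worker per argument,
# three at a time, and emit each worker's output as a single block.
run_buffered() {
    printf '%s\n' "$@" | xargs -n 1 -P 3 sh -c '
        # Stand-in job: two lines with a pause in between.
        out=$(echo "start $1"; sleep 1; echo "end $1")
        # Emit the collected output in one go.
        printf "%s\n" "$out"
    ' _
}

# The order of the a/b/c blocks may vary from run to run.
run_buffered a b c
```

As the answer notes, this relies on each block being small enough to be written in one piece.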
If you want to make it truly atomic, you can do that with a lockfile:
# generate a lockfile, arrange for it to be deleted when this shell exits
lockfile=$(mktemp -t lock.XXXXXX); export lockfile
trap 'rm -f "$lockfile"' 0
xargs -P 24 bash -c '
for arg; do
{
output=$(myAwesomeShellFunction "$arg")
flock -x 99
printf "%s\n" "$output"
} 99>"$lockfile"
done
' _

How to break a tail -f command in bash

How can I break a tail -f in bash? This is related to the question
"tail -f | awk and end tail once data is found".
I tried the following:
#! /bin/bash
tvar="testing"
(set -o pipefail && tail -f <<< "$tvar" | awk '{print; exit} END{ exit 1}' )
But the script is still hanging on to tail -f
Well, the problem is not the tail -f but the awk, which hangs. It is meant to terminate when EOF is found (with exit 1). But no EOF is ever found: the tail -f does not terminate, so no EOF arrives.
If the awk terminated, this would also break the pipe, and the tail would receive a SIGPIPE on its next write (which would terminate it).
You must find a different condition on which to terminate.
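The broken-pipe mechanism can be observed with a writer that never goes idle. This is only a sketch: yes stands in for a busy producer, and the function name is invented for the example.

```shell
#!/bin/sh
# `yes` writes forever. Once sed has printed one line and quit, the next
# write by `yes` fails (SIGPIPE or a write error) and the pipeline ends.
# An idle `tail -f` never attempts that next write, which is why it lingers.
first_line_of_endless_stream() {
    yes 2>/dev/null | sed 1q
}

first_line_of_endless_stream   # prints a single "y" and returns
```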
EDIT:
To achieve what you want you can start the tail -f in the background, remember its PID and kill it as soon as you do not need it anymore. Running in the background and using a pipe at the same time is tricky. The easiest way to do it would be to use a named pipe (FIFO):
mkfifo log.pipe
tail -f log > log.pipe & tail_pid=$!
awk ... < log.pipe
kill $tail_pid
rm log.pipe
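The FIFO steps above could be wrapped up roughly as follows. This is a sketch under a few assumptions: tail_until is a hypothetical helper name, the pattern is a basic regular expression without "/" characters, sed replaces the elided awk command, and mktemp -d is available (it is not required by POSIX but is widely provided).

```shell
#!/bin/sh
# tail_until FILE PATTERN (hypothetical helper): print FILE from the start
# and keep following it until a line matches PATTERN, then stop the tail.
tail_until() {
    file=$1
    pattern=$2
    dir=$(mktemp -d) || return 1
    fifo=$dir/log.pipe
    mkfifo "$fifo" || return 1

    tail -n +1 -f "$file" > "$fifo" &   # writer runs in the background
    tail_pid=$!

    sed "/$pattern/q" < "$fifo"         # reader quits on the first match

    kill "$tail_pid" 2>/dev/null        # tail may still be waiting for data
    wait "$tail_pid" 2>/dev/null
    rm -rf "$dir"
}

# Demo with a throwaway file that already contains the marker line.
demo=$(mktemp)
printf 'one\ntwo\nEOF\nthree\n' > "$demo"
tail_until "$demo" EOF                  # prints one, two, EOF
rm -f "$demo"
```

Because the PID of tail is captured with $!, there is no need to write an extra line to the file just to make tail notice that its reader has gone.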
It seems that switching from using <<< to echo "$tvar" | tail -f does what you want instead?
$> cat test.sh
#! /bin/bash
tvar="testing"
(set -o pipefail && echo "$tvar" | tail -f | awk '{print} END{ exit 1}' )
$> ./test.sh
testing
$>
Although the awk doesn't print anything out afterwards.

nice way to kill piped process?

I want to process each line of a shell script's stdout the moment it is produced. I want to grab the output of test.sh (a long process). My current approach is this:
./test.sh >tmp.txt &
PID=$!
tail -f tmp.txt | while read line; do
echo $line
ps ${PID} > /dev/null
if [ $? -ne 0 ]; then
echo "exiting.."
fi
done;
But unfortunately, this will print "exiting" and then wait, as the tail -f is still running. I tried both break and exit
I run this on FreeBSD, so I cannot use the --pid= option of some linux tails.
I can use ps and grep to get the pid of the tail and kill it, but that seems very ugly to me.
Any hints?
Why do you need the tail process?
Could you instead do something along the lines of
./test.sh | while read line; do
# process $line
done
or, if you want to keep the output in tmp.txt :
./test.sh | tee tmp.txt | while read line; do
# process $line
done
If you still want to use an intermediate tail -f process, maybe you could use a named pipe (fifo) instead of a regular pipe, to allow detaching the tail process and getting its pid:
./test.sh >tmp.txt &
PID=$!
mkfifo tmp.fifo
tail -f tmp.txt >tmp.fifo &
PID_OF_TAIL=$!
while read line; do
# process $line
kill -0 ${PID} 2>/dev/null || kill ${PID_OF_TAIL}
done <tmp.fifo
rm tmp.fifo
I should mention, however, that such a solution suffers from several serious race conditions:
the PID of test.sh could be reused by another process;
if the test.sh process is still alive when you read the last line, you won't have any other occasion to detect its death afterwards and your loop will hang.

What is the equivalent to xargs -r under OsX

Is there any equivalent under OSX to xargs -r under Linux? I'm trying to find a way to interrupt a pipe if there's no data.
For instance imagine you do the following:
touch test
cat test | xargs -r echo "content: "
That doesn't yield any result because xargs interrupts the pipe.
Is there either some hidden xargs option or something else to achieve the same result under OSX?
The POSIX standard for xargs mandates that the command be executed once, even if there are no arguments. This is a nuisance, which is why GNU xargs has the -r option. Unfortunately, neither BSD (MacOS X) nor the other mainstream Unix versions (AIX, HP-UX, Solaris) support it.
If it is crucial to you, obtain and install GNU xargs somewhere that your environment will find it, without affecting the system (so don't replace /usr/bin/xargs unless you're a braver man than I am — but /usr/local/bin/xargs might be OK, or $HOME/bin/xargs, or …).
You can use test or [:
if [ -s test ] ; then cat test | xargs echo content: ; fi
There is no standard way to determine if the xargs you are running is GNU or not. I set $gnuxargs to either "true" or "false" and then have a function that replaces xargs and does the right thing.
On Linux, FreeBSD and MacOS this script works for me. The POSIX standard for xargs mandates that the command be executed once, even if there are no arguments. FreeBSD and MacOS X violate this rule, thus don't need "-r". GNU finds it annoying, and adds -r. This script does the right thing and can be enhanced if you find a version of Unix that does it some other way.
#!/bin/bash
gnuxargs=$(xargs --version 2>&1 |grep -s GNU >/dev/null && echo true || echo false)
function portable_xargs_r() {
if $gnuxargs ; then
cat - | xargs -r "$@"
else
cat - | xargs "$@"
fi
}
echo 'this' > foo
echo '=== Expect one line'
portable_xargs_r <foo echo "content: "
echo '=== DONE.'
cat </dev/null > foo
echo '=== Expect zero lines'
portable_xargs_r <foo echo "content: "
echo '=== DONE.'
Here's a quick and dirty xargs-r using a temporary file.
#!/bin/sh
t=$(mktemp -t xargsrXXXXXXXXX) || exit
trap 'rm -f "$t"' EXIT HUP INT TERM
cat >"$t"
test -s "$t" || exit
exec xargs "$@" <"$t"
With POSIX xargs¹, to avoid running the-command when the input is empty, you could use moreutils's ifne (for if not empty):
... | ifne xargs ... the-command ...
Or use a sh wrapper that checks the number of arguments:
... | xargs ... sh -c '[ "$#" -eq 0 ] || exec the-command ... "$@"' sh
¹ Though one can hardly use xargs POSIXly, as it doesn't support -0, has unspecified behaviour when the input is non-text (as with filenames, which on most systems are not guaranteed to be text except in the POSIX locale), parses its input in a very arcane and locale-dependent way, and gives no guarantee if any word is more than 255 bytes long!
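A concrete instance of the sh wrapper, as a sketch: echo stands in for the-command, and the function name echo_content_if_any is made up for the example. The inner sh checks its argument count, so the result is the same whether or not the local xargs runs the command on empty input.

```shell
#!/bin/sh
# echo_content_if_any (hypothetical name): read words from stdin and run
# `echo content: WORDS...` only when at least one word arrived.
echo_content_if_any() {
    xargs sh -c '[ "$#" -eq 0 ] || exec echo "content:" "$@"' sh
}

printf '' | echo_content_if_any        # empty input: prints nothing
printf 'a b\n' | echo_content_if_any   # prints: content: a b
```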
You could make sure that the input always has at least one line. This may not always be possible, but you'd be surprised how many creative ways this can be done.
A typical use case looks like:
find . -print0 | xargs -r -0 grep PATTERN
Some versions of xargs do not have an -r flag. In that case, you can supply /dev/null as the first filename so that grep is never handed an empty list of filenames. Since the pattern will never be found in /dev/null, this won't affect the output:
find . -print0 | xargs -0 grep PATTERN /dev/null
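The /dev/null trick can be tried out with a small sketch. The helper name grep_tree is hypothetical, -l is added only to keep the output short, and find -print0 with xargs -0 is assumed to be available (GNU and BSD tools provide both).

```shell
#!/bin/sh
# grep_tree DIR PATTERN (hypothetical helper): list files under DIR that
# contain PATTERN. /dev/null guarantees grep always has a file operand,
# even when find produces no names at all.
grep_tree() {
    find "$1" -type f -print0 | xargs -0 grep -l -- "$2" /dev/null
}

# Usage: grep_tree /some/dir needle
```

On an empty directory the function prints nothing, regardless of whether the local xargs runs grep on empty input.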
You can test if the stream has any content:
cat test | { if IFS= read -r tmp; then { printf "%s\n" "$tmp"; cat; } | xargs echo "content: "; fi; }
# Reading the pipeline from left to right:
#   if IFS= read -r tmp    - try reading anything
#   printf "%s\n" "$tmp"   - re-emit the first line
#   cat                    - and the rest of the input
#   | xargs ...            - pass it all to xargs
#   fi                     - otherwise just do nothing
# or with a function
# even TODO: add the check of `portable_xargs_r` in the other answer and call `xargs -r` when available.
xargs_r() {
if IFS= read -r tmp; then
{ printf "%s\n" "$tmp"; cat; } | xargs "$@"
fi
}
cat test | xargs_r echo "content: "
This method runs the check inside the pipe inside the subshell, so it effectively can be used in a complicated pipe setup.
