tail -f | sed to file doesn't work [duplicate] - shell

This question already has answers here:
write to a file after piping output from tail -f through to grep
(4 answers)
Closed 5 years ago.
I am having an issue with filtering a log file that is being written and writing the output to another file (if possible using tee, so I can see it working as it goes).
I can get it to output on stdout, but not write to a file, either using tee or >>.
I can also get it to write to the file, but only if I drop the -f options from tail, which I need.
So, here is an overview of the commands:
tail -f without writing to file: tail -f test.log | sed 's/a/b/' works
tail writing to file: tail test.log | sed 's/a/b/' | tee -a a.txt works
tail -f writing to file: tail -f test.log | sed 's/a/b/' | tee -a a.txt neither outputs on stdout nor writes to the file.
I would like 3. to work.

It's the sed buffering. Use sed -u. man sed:
-u, --unbuffered
load minimal amounts of data from the input files and flush the
output buffers more often
And here's a test for it (creates files foo and bar):
$ for i in {1..3} ; do echo a $i ; sleep 1; done >> foo &
[1] 12218
$ tail -f foo | sed -u 's/a/b/' | tee -a bar
b 1
b 2
b 3
Be quick or increase the {1..3} to suit your skillz.
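If your sed has no -u (it's a GNU extension; BSD sed on macOS offers -l for line-buffered output instead), here is a sketch of the same fix using GNU coreutils' stdbuf, assuming it is installed:
tail -f test.log | stdbuf -oL sed 's/a/b/' | tee -a a.txt
stdbuf -oL forces sed's stdout to be line-buffered, which is enough for tee to receive each line as it is produced.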


Is it possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use the tail -f <file> | grep <pattern> all the time.
It will wait till grep flushes, not till it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
This sets grep's output buffering mode to unbuffered.
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
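An equivalent line-oriented spelling, which to my knowledge works with both GNU and BSD tail, is -n +1 (start output at line 1):
tail -n +1 -f <file> | grep --line-buffered <pattern>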
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
Consider this answer an enhancement. I usually use
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case of file rotation (-f will not follow a rotated file properly).
-A and -B are useful for getting the lines just before and after the pattern occurrence; these blocks appear between dashed separators.
But personally I prefer doing the following:
tail -F <file> | less
This is very useful if you want to search inside streamed logs: you can go back and forward and look closely.
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed would be a better choice (stream editor)
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
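If the +1 guess worries you (it can break under other shells or unusual scheduling), here is a sketch that captures sed's PID explicitly through a named pipe instead of guessing; it assumes GNU tail's --pid and a mktemp that supports -u:
fifo=$(mktemp -u) && mkfifo "$fifo"
sed -n '/search string/{p; q}' < "$fifo" &   # now $! really is sed's PID
tail --pid=$! -n0 -f <file> > "$fifo"        # tail exits when sed does
rm -f "$fifo"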
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This one command works for me (SUSE):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
collecting logins to mail service
Coming somewhat late to this question, and considering this kind of work an important part of the monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more powerful than the already published answers suggest.
Difference between follow option tail -f and tail -F, from manpage:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This means: by using -F instead of -f, tail will re-open the file(s) when they are removed (on log rotation, for example).
This is useful for watching a logfile over many days.
Ability to follow more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
For following events through hundreds of files... (see the rest of this answer to understand how to make it readable... ;)
Use the -n switch (don't use -c, which counts bytes rather than lines!). By default tail will show the last 10 lines. This can be tuned:
tail -n 0 -F file
Will follow the file, printing only newly appended lines.
tail -n +0 -F file
Will print the whole file before following its progress.
2. Buffer issues when piping:
If you plan to filter the output, consider buffering! See the -u option for sed, --line-buffered for grep, or the stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
is (besides being more efficient than using grep) a lot more responsive than the same command without sed's -u switch.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent systems, instead of tail -f /var/log/syslog you can run journalctl -xf, in much the same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read the man page; this tool was built for log analysis!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample script that watches several files, coloring output from the first file differently from the others:
#!/bin/bash
tail -F "$#" |
sed -une "
/^==> /{h;};
//!{
G;
s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
p;}"
This works fine on my host, running:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
Perhaps you are watching logs in order to react to events?
Here is a little script that plays a sound when a USB device appears or disappears, but the same script could send mail or perform any other action, like powering on the coffee machine...
#!/bin/bash
# Open a dedicated file descriptor on a background tail of the kernel log.
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!    # bash stores the PID of the process substitution in $!
while :; do
    # Poll the keyboard for up to .3s; press q (or Q) to quit.
    read -rsn 1 -t .3 keyboard
    [ "${keyboard,}" = "q" ] && break
    # With -t 0, read only tests whether input is waiting on the descriptor.
    if read -ru $tailF -t 0 _; then
        read -ru $tailF line
        case $line in
            *New\ USB\ device\ found* ) play /some/sound.ogg ;;
            *USB\ disconnect* ) play /some/othersound.ogg ;;
        esac
        printf "\r%s\e[K" "$line"    # overwrite the current terminal line
    fi
done
echo
exec {tailF}<&-    # close the descriptor...
kill $tailPid      # ...and stop the background tail
You can quit by pressing the Q key.
You certainly won't succeed with
tail -f /var/log/foo.log |grep --line-buffered string2search
when you are using "colortail" as an alias for tail, e.g. in bash:
alias tail='colortail -n 30'
You can check with
type tail
If this outputs something like
tail is aliased to `colortail -n 30'
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary with this command:
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk (another great command-line utility) instead of grep when you don't have a line-buffered option! It will continuously stream your data from tail.
This is how you use grep:
tail -f <file> | grep pattern
This is how you would use awk
tail -f <file> | awk '/pattern/{print $0}'
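One caveat worth hedging: when awk's own output goes into a further pipe, many implementations buffer it too, so an explicit flush is a safer sketch:
tail -f <file> | awk '/pattern/ { print; fflush() }'
fflush() with no arguments flushes standard output in gawk, mawk, and busybox awk.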

Getting head to display all but the last line of a file: command substitution and standard I/O redirection

I have been trying to get the head utility to display all but the last line of standard input. The actual code that I needed is something along the lines of cat myfile.txt | head -n $(($(wc -l)-1)). But that didn't work. I'm doing this on Darwin/OS X which doesn't have the nice semantics of head -n -1 that would have gotten me similar output.
None of these variations work either.
cat myfile.txt | head -n $(wc -l | sed -E -e 's/\s//g')
echo "hello" | head -n $(wc -l | sed -E -e 's/\s//g')
I tested out more variations and in particular found this to work:
cat <<EOF | echo $(($(wc -l)-1))
>Hola
>Raul
>Como Esta
>Bueno?
>EOF
3
Here's something simpler that also works.
echo "hello world" | echo $(($(wc -w)+10))
This one understandably gives me an illegal line count error. But it at least tells me that the head program is not consuming the standard input before passing stuff on to the subshell/command substitution, a remote possibility, but one that I wanted to rule out anyway.
echo "hello" | head -n $(cat && echo 1)
What explains the behavior of head and wc and their interaction through subshells here? Thanks for your help.
head -n -1 will give you all except the last line of its input (with GNU head; as the question notes, this option isn't available on Darwin/OS X).
head is the wrong tool. If you want to see all but the last line, use:
sed \$d
The reason that
# Sample of incorrect code:
echo "hello" | head -n $(wc -l | sed -E -e 's/\s//g')
fails is that wc consumes all of the input and there is nothing left for head to see. wc inherits its stdin from the subshell in which it is running, which is reading from the output of the echo. Once it consumes the input, it returns and then head tries to read the data...but it is all gone. If you want to read the input twice, the data will have to be saved somewhere.
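For instance, a minimal sketch of the save-then-reread approach when the data only arrives on a pipe (the mktemp temp file is just for illustration):
tmp=$(mktemp)
cat > "$tmp"                                 # drain stdin into a temp file
head -n "$(($(wc -l < "$tmp") - 1))" "$tmp"  # now the line count is known
rm -f "$tmp"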
Using sed:
sed '$d' filename
will delete the last line of the file.
$ seq 1 10 | sed '$d'
1
2
3
4
5
6
7
8
9
For Mac OS X specifically, I found an answer from a comment to this Q&A.
Assuming you are using Homebrew, run brew install coreutils then use the ghead command:
cat myfile.txt | ghead -n -1
Or, equivalently:
ghead -n -1 myfile.txt
Lastly, see brew info coreutils if you'd like to use the commands without the g prefix (e.g., head instead of ghead).
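If memory serves, the hint brew info coreutils prints amounts to putting its gnubin directory on your PATH; the exact prefix varies by machine, so treat this path as an assumption and check the command's output:
export PATH="$(brew --prefix)/opt/coreutils/libexec/gnubin:$PATH"   # path is an assumption; confirm with brew info coreutils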
cat myfile.txt | echo $(($(wc -l)-1))
This works. It's overly complicated: you could just write echo $(($(wc -l)-1)) <myfile.txt or echo $(($(wc -l <myfile.txt)-1)). The problem is the way you're using it.
cat myfile.txt | head -n $(wc -l | sed -E -e 's/\s//g')
wc consumes all the input as it's counting the lines. So there is no data left to read in the pipe by the time head is started.
If your input comes from a file, you can redirect both wc and head from that file.
head -n $(($(wc -l <myfile.txt) - 1)) <myfile.txt
If your data may come from a pipe, you need to duplicate it. The usual tool to duplicate a stream is tee, but that isn't enough here, because the two outputs from tee are produced at the same rate, whereas here wc needs to fully consume its output before head can start. So instead, you'll need to use a single tool that can detect the last line, which is a more efficient approach anyway.
Conveniently, sed offers a way of matching the last line. Either printing all lines but the last, or suppressing the last output line, will work:
sed -n '$! p'
sed '$ d'
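For example:
$ seq 1 5 | sed -n '$! p'
1
2
3
4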
Here is a one-liner that can get you the desired output, and it can be used more generally for getting all lines from a file except the last n lines.
grep -n "" myfile.txt \ # output the line number for each line
| sort -nr \ # reverse the file by using those line numbers
| sed '1,4d' \ # delete first 4 lines (last 4 of the original file)
| sort -n \ # reverse the reversed file (correct the line order)
| sed 's/^[0-9]*://' # remove the added line numbers
Here is the above command in an actual single line and runnable (can't execute the above due to the added comments):
grep -n "" myfile.txt | sort -nr | sed '1,4d' | sort -n | sed 's/^[0-9]*://'
It's a little cumbersome, and this problem can be solved with more comprehensive commands like ghead, but when you can't or don't want to download such tools, it's nice to be able to do this with the more basic options. I've been in situations where it's simply not an option to get better tools.
awk 'NR>1{print p}{p=$0}'
For this job, an awk one-liner is a bit longer than a sed one.
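For example:
$ seq 1 5 | awk 'NR>1{print p}{p=$0}'
1
2
3
4
The trick is that each line is held in p and only printed once the next line proves it wasn't the last.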

In bash, how to tail a fifo that is currently being written to, and then tail the fifo again, grepping for different text?

I have a process that is writing to standard out, and I want to be able to monitor the output by grepping for various strings while running tail -f. One way to do this is to write to a normal file and then tail the file grepping for one string, then tail it grepping for another. However, I don't want to write to a file because this will fill up the disk, and I don't want to deal with the hassle of rotating log files.
I thought I could achieve this using fifos. E.g.,
mkfifo myfifo
foo > myfifo &
tail -f myfifo | grep bar
tail -f myfifo | grep baz
Unfortunately, this doesn't seem to work. In fact, the only way I see any output when tailing is when I first execute tail -f myfifo and then foo > myfifo, but I don't want to restart foo (that's the whole point, otherwise I can just grep standard out directly and restart the process to grep for a different string). Does anyone know why this is happening or have a suggestion for how to achieve this?
This is happening because a fifo is a data stream. Once a piece of data is read from a FIFO, it's removed from the FIFO. In your case, the output of foo that's stored in myfifo is being read by the first tail -f that greps for "bar", leaving nothing for the second tail -f.
But you don't need to send the output of foo to a file at all (FIFO or otherwise). You can just send its output directly into tee and have it send that output to as many processes as you want. For example:
$ foo | tee >(grep -o bar) >(grep -o baz) >/dev/null
Or if you're just using grep, you can use -e as many times as you want on the output:
$ foo | grep -e bar -e baz
You can use a different grep syntax:
tail -f myfifo | grep -e 'pattern1' -e 'pattern2' -e 'pattern3'
I think you want both to happen at the same time on the same output stream, right?
Otherwise you could play with tee
tail -f myfifo | tee -a somefile | grep bar
grep foo somefile

Pipe output to two different commands [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
osx/linux: pipes into two processes?
Is there a way to pipe the output from one command into the input of two other commands, running them simultaneously?
Something like this:
$ echo 'test' |(cat) |(cat)
test
test
The reason I want to do this is that I have a program which receives an FM radio signal from a USB SDR device, and outputs the audio as raw PCM data (like a .wav file but with no header.) Since the signal is not music but POCSAG pager data, I need to pipe it to a decoder program to recover the pager text. However I also want to listen to the signal so I know whether any data is coming in or not. (Otherwise I can't tell if the decoder is broken or there's just no data being broadcast.) So as well as piping the data to the pager decoder, I also need to pipe the same data to the play command.
Currently I only know how to do one - either pipe it to the decoder and read the data in silence, or pipe it to play and hear it without seeing any decoded text.
How can I pipe the same data to both commands, so I can read the text and hear the audio?
I can't use tee as it only writes the duplicated data to a file, but I need to process the data in real-time.
It should be ok if you use both tee and mkfifo.
mkfifo pipe
cat pipe | (command 1) &
echo 'test' | tee pipe | (command 2)
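Applied to the question's SDR setup it might look like this, where sdr_program, decoder_program, and play_command are placeholders for whatever you actually run:
mkfifo audio.fifo
play_command < audio.fifo &                       # hypothetical: your audio player, reading raw PCM
sdr_program | tee audio.fifo | decoder_program    # hypothetical program names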
Recent bash provides the >(command) syntax:
echo "Hello world." | tee >(sed 's/^/1st: /') >(sed 's/^/2nd cmd: /') >/dev/null
May return:
2nd cmd: Hello world.
1st: Hello world.
Download somefile.ext, save it, and compute its md5sum and sha1sum:
wget -O - http://somewhere.someland/somepath/somefile.ext |
tee somefile.ext >(md5sum >somefile.md5) | sha1sum >somefile.sha1
or
wget -O - http://somewhere.someland/somepath/somefile.ext |
tee >(md5sum >somefile.md5) >(sha1sum >somefile.sha1) >somefile.ext
Old answer
There is a way to do that via unnamed pipe (tested under linux):
(( echo "hello" |
tee /dev/fd/5 |
sed 's/^/1st occure: /' >/dev/fd/4
) 5>&1 |
sed 's/^/2nd command: /'
) 4>&1
gives:
2nd command: hello
1st occure: hello
This sample lets you download somefile.ext, save it, compute its md5sum, and compute its sha1sum:
(( wget -O - http://somewhere.someland/somepath/somefile.ext |
tee /dev/fd/5 |
md5sum >/dev/fd/4
) 5>&1 |
tee somefile.ext |
sha1sum
) 4>&1
Maybe take a look at the tee command. What it does is simply print its input to a file, while also printing it to standard output. So something like:
echo "Hello" | tee try.txt | <some_command>
will create a file with the content "Hello" AND also let "Hello" flow through the pipeline to end up as <some_command>'s STDIN.

Trouble with piping through sed

I am having trouble piping through sed. Once I have piped output to sed, I cannot pipe the output of sed elsewhere.
wget -r -nv http://127.0.0.1:3000/test.html
Outputs:
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/test.html [99/99] -> "127.0.0.1:3000/test.html" [1]
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/robots.txt [83/83] -> "127.0.0.1:3000/robots.txt" [1]
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/shop [22818/22818] -> "127.0.0.1:3000/shop.29" [1]
I pipe the output through sed to get a clean list of URLs:
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g'
Outputs:
http://127.0.0.1:3000/test.html
http://127.0.0.1:3000/robots.txt
http://127.0.0.1:3000/shop
I would like to then dump the output to file, so I do this:
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g' > /tmp/DUMP_FILE
I interrupt the process after a few seconds and check the file, yet it is empty.
Interesting, the following yields no output (same as above, but piping sed output through cat):
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g' | cat
Why can I not pipe the output of sed to another program like cat?
When sed is writing to another process or to a file, it will buffer data.
Try adding the --unbuffered option to sed.
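Putting that together with your original pipeline (a sketch; --unbuffered is GNU sed, and on BSD/macOS sed the rough equivalent is -l):
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 |
  grep --line-buffered -v ERROR |
  sed --unbuffered 's/^.*URL:\([^ ]*\).*/\1/g' > /tmp/DUMP_FILE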
You can also use awk. Since your URL appears in field 3, you can use $3, and you can remove the grep as well:
awk '!/ERROR/{sub("URL:","",$3);print $3}' file
