Debugger for unix pipe commands - shell

As I build *nix piped commands, I find that I want to see the output of one stage to verify correctness before building the next stage, but I don't want to re-run every stage. Does anyone know of a program that would help with this? It would automatically keep the output of the last stage for use by any new stages. I usually do this by sending the result of each command to a temporary file (using tee, or running each command one at a time), but it would be nice for a program to handle it.
I envision something like a tabbed interface where each tab is labeled with a pipe command, and selecting a tab shows the output (at least a hundred lines) of applying that command to the previous result.
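Pending such a tool, the caching can be approximated with two small bash functions. This is only a rough sketch (run, stage, and the STAGE variable are made-up names):
run()   { STAGE=$(mktemp); "$@" > "$STAGE"; head -n 100 "$STAGE"; }                                   # run and cache the first stage
stage() { local next; next=$(mktemp); "$@" < "$STAGE" > "$next"; STAGE=$next; head -n 100 "$next"; }  # filter the cached output, cache the result
Each stage then runs exactly once:
run cat /var/log/syslog
stage grep something
stage sed 's/foo/bar/g'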

Use 'tee' to copy the intermediate results out to some file as well as pass them on to the next stage of the pipe, like so:
cat /var/log/syslog | tee /tmp/syslog.out | grep something | tee /tmp/grep.out | sed 's/foo/bar/g' | tee /tmp/sed.out | cat >>/var/log/syslog.cleaned

You can also use named pipes if you need bidirectional communication (e.g. with netcat):
mknod backpipe p
nc -l -p 80 0<backpipe | tee -a inflow | nc localhost 81 | tee -a outflow 1>backpipe
This builds a logging proxy: requests arriving on port 80 are logged to inflow and forwarded to localhost:81, and the replies are logged to outflow and fed back to the client through the backpipe FIFO.

There's also the "pv" command, available in the Debian/Ubuntu repositories, which shows you the throughput of your pipes.
An example from the man page:
Transferring a file from another process and passing the expected size to pv:
cat file | pv -s 12345 | nc -w 1 somewhere.com 3000
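Another typical pv pattern (a generic example; the file name is just a placeholder) is watching progress while pushing a local file through a pipeline:
pv somefile.log | gzip > somefile.log.gz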

tee(1) is your friend. It sends its input to both the specified file and stdout.
Stick it between your pipes. For example:
ls | tee /tmp/out1 | sort | tee /tmp/out2 | sed 's/foo/bar/g'

Is it possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use tail -f <file> | grep <pattern> all the time.
It will wait until grep flushes, not until it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
This sets grep's output buffering mode to unbuffered.
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
Consider this answer an enhancement. I usually use
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case the file gets rotated (-f will not work properly if the file is rotated).
-A and -B are useful for getting the lines just before and after the pattern occurrence; these blocks will appear between dashed-line separators.
But personally I prefer doing the following:
tail -F <file> | less
This is very useful if you want to search inside streamed logs, moving back and forth and looking closely.
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed (the stream editor) would be a better choice:
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
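An alternative that doesn't depend on the pipeline's PID layout (assuming GNU grep, whose -m NUM option stops after NUM matching lines) is to let grep itself quit on the first match; tail then dies on its next write to the broken pipe:
tail -n0 -f <file> | grep --line-buffered -m1 'search string'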
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This command works for me (SUSE):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
(collecting logins to the mail service)
Coming somewhat late to this question, and considering this kind of work an important part of any monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more powerful than the already published answers suggest.
Difference between the follow options tail -f and tail -F, from the man page:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This means: by using -F instead of -f, tail will re-open the file(s) when they are removed (on log rotation, for example).
This is useful for watching logfiles over many days.
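A quick throwaway demo of the difference (demo.log is just a scratch file):
touch demo.log
tail -F demo.log &
echo one >> demo.log     # printed by tail
mv demo.log demo.log.1   # simulate log rotation
echo two >> demo.log     # tail -F re-opens the new file and prints "two"; plain -f would stay silent
kill %1                  # stop the background tail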
The ability to follow more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
for following events through hundreds of files... (see the rest of this answer for how to make that readable ;)
Use the -n switch (don't use -c, which counts bytes rather than lines!). By default tail shows the last 10 lines. This can be tuned:
tail -n 0 -F file
will follow the file, but only print new lines.
tail -n +0 -F file
will print the whole file before following its growth.
2. Buffer issues when piping:
If you plan to filter the output, consider buffering! See the -u option for sed, --line-buffered for grep, or the stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
is (besides being a lot more efficient than using grep) a lot more reactive than if you don't use the -u switch in the sed command.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent systems, instead of tail -f /var/log/syslog you have to run journalctl -xf, in much the same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read the man page; this tool was built for log analysis!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample script that watches several files, coloring output from the 1st file differently than from the others:
#!/bin/bash
# Follow every file passed as argument ("$@", not "$#"); tail separates
# chunks with "==> filename <==" headers, which sed folds into colors below.
tail -F "$@" |
    sed -une "
    /^==> /{h;};
    //!{
    G;
    s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
    s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
    p;}"
This works fine on my host, run as:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
Maybe you are watching logs in order to react to events?
Here is a little script that plays a sound when a USB device appears or disappears; the same script could send mail, or perform any other interaction, like powering on the coffee machine...
#!/bin/bash
# Open a dedicated file descriptor fed by `tail -F` running in the background
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!
while :; do
    # Poll the keyboard for 0.3s; "q" (or "Q") quits
    read -rsn 1 -t .3 keyboard
    [ "${keyboard,}" = "q" ] && break
    # `read -t 0` only tests whether a line is waiting on the descriptor
    if read -ru $tailF -t 0 _; then
        read -ru $tailF line
        case $line in
            *New\ USB\ device\ found* ) play /some/sound.ogg ;;
            *USB\ disconnect* )         play /some/othersound.ogg ;;
        esac
        printf "\r%s\e[K" "$line"
    fi
done
echo
exec {tailF}<&-    # close the descriptor
kill $tailPid      # and stop tail
You can quit by pressing the Q key.
You certainly won't succeed with
tail -f /var/log/foo.log |grep --line-buffered string2search
if you use "colortail" as an alias for tail, e.g. in bash:
alias tail='colortail -n 30'
You can check with
type tail
If this outputs something like
tail is an alias of colortail -n 30
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary with this command:
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk (another great shell utility) instead of grep where you don't have the line-buffered option! It will continuously stream your data from tail.
This is how you would use grep:
tail -f <file> | grep pattern
This is how you would use awk
tail -f <file> | awk '/pattern/{print $0}'

Use output from previous commands without repeating them

If I enter a command that runs for a long time or produces a lot of output, I often want to process that output in some way, but I don't want to re-run the command. For example, I might run
$ command
$ command | grep foo
$ command | grep foo | sort | uniq
But if command takes a long time, this is tedious to re-run. Is there a way to have bash (or any other shell) save the output of the last command, similar to the Python REPL's _? I am aware of tee, but I would rather have my shell do this automatically without having to use tee all the time.
I am also aware I could store the output of a command, but again, I would like my shell to do this automatically, so I don't have to think about storing the command and I can just use my shell normally, and process the previous output when I want to.
You can store the output in a variable:
output=$(command)
echo "$output" | grep foo
echo "$output" | grep foo | sort | uniq
(Quoting "$output" matters: unquoted, word splitting would collapse the newlines before grep sees them.)
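If you want the shell to come closer to doing this automatically, a small wrapper function helps (a sketch; keep and LAST are hypothetical names):
keep() { LAST=$(mktemp); "$@" | tee "$LAST"; }
keep command                     # runs once; output is shown and saved
grep foo "$LAST"
grep foo "$LAST" | sort | uniq   # reuse the saved output without re-running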

Echo server built with Netcat and FIFO fails when used with rev

I'm playing around with Netcat, and I've successfully made an echo server like this:
mkfifo fifo
cat fifo | nc -l 3000 > fifo
Next, I'd like to apply some transformation to the data before it's echoed back:
cat fifo | nc -l 3000 | rev > fifo
# Or:
cat fifo | rev | nc -l 3000 > fifo
But neither of the above works. The same happens with any text-transforming program, not just rev. But if I replace rev with cat, it works again:
cat fifo | nc -l 3000 | cat > fifo
This leads me to believe there's something special about how cat uses standard in and standard out. (As compared to rev, tr, and other similar text-transforming programs.)
What's going on here? Why does inserting rev into the pipeline break the echo server? Is cat indeed special, and if so, how?
This is due to buffering. glibc automatically buffers output when stdout is not a terminal, for efficiency.
You can see the same effect in your terminal where rev reverses each line as you type it, while rev | cat does not.
To fix it, you have to get your command to not buffer. GNU has a stdbuf tool for doing this for arbitrary commands:
cat fifo | nc -l 3000 | stdbuf -o 0 rev > fifo
The interactive command scripting tool expect also comes with an unbuffer command to do the same.
Buffering is only efficient when combining multiple small writes into a large one. Programs that just copy from one place to another (like cat and dd) don't benefit from buffering and therefore don't do it.
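You can verify that terminal behavior directly; nothing beyond an interactive shell is assumed:
rev              # each line comes back reversed as soon as you press Enter
rev | cat        # nothing appears until Ctrl-D, when rev exits and its buffer is flushed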

Pipe output to two different commands [duplicate]

Possible Duplicate:
osx/linux: pipes into two processes?
Is there a way to pipe the output from one command into the input of two other commands, running them simultaneously?
Something like this:
$ echo 'test' |(cat) |(cat)
test
test
The reason I want to do this is that I have a program which receives an FM radio signal from a USB SDR device, and outputs the audio as raw PCM data (like a .wav file but with no header.) Since the signal is not music but POCSAG pager data, I need to pipe it to a decoder program to recover the pager text. However I also want to listen to the signal so I know whether any data is coming in or not. (Otherwise I can't tell if the decoder is broken or there's just no data being broadcast.) So as well as piping the data to the pager decoder, I also need to pipe the same data to the play command.
Currently I only know how to do one - either pipe it to the decoder and read the data in silence, or pipe it to play and hear it without seeing any decoded text.
How can I pipe the same data to both commands, so I can read the text and hear the audio?
I can't use tee as it only writes the duplicated data to a file, but I need to process the data in real-time.
It should be OK if you use tee together with mkfifo:
mkfifo pipe
cat pipe | (command 1) &
echo 'test' | tee pipe | (command 2)
Recent bash provides the >(command) process substitution syntax:
echo "Hello world." | tee >(sed 's/^/1st: /') >(sed 's/^/2nd cmd: /') >/dev/null
May return:
2nd cmd: Hello world.
1st: Hello world.
To download somefile.ext, save it, and compute its md5sum and sha1sum:
wget -O - http://somewhere.someland/somepath/somefile.ext |
tee somefile.ext >(md5sum >somefile.md5) | sha1sum >somefile.sha1
or
wget -O - http://somewhere.someland/somepath/somefile.ext |
tee >(md5sum >somefile.md5) >(sha1sum >somefile.sha1) >somefile.ext
Old answer
There is a way to do this via unnamed pipes (tested under Linux):
(( echo "hello" |
tee /dev/fd/5 |
sed 's/^/1st occure: /' >/dev/fd/4
) 5>&1 |
sed 's/^/2nd command: /'
) 4>&1
This gives:
2nd command: hello
1st occure: hello
This sample will let you download somefile.ext, save it, and compute both its md5sum and its sha1sum:
(( wget -O - http://somewhere.someland/somepath/somefile.ext |
tee /dev/fd/5 |
md5sum >/dev/fd/4
) 5>&1 |
tee somefile.ext |
sha1sum
) 4>&1
Maybe take a look at the tee command. It simply writes its input to a file, but it also passes its input through to standard output. So something like:
echo "Hello" | tee try.txt | <some_command>
will create a file with the content "Hello" AND also let "Hello" flow through the pipeline to end up on <some_command>'s stdin.

How to pipe stdout while keeping it on screen? (and not to an output file)

I would like to pipe the standard output of a program while keeping it on screen.
A simple example (echo is used here just for illustration purposes):
$ echo 'ee' | foo
ee <- the output I would like to see
I know tee can copy stdout to a file, but that's not what I want.
$ echo 'ee' | tee output.txt | foo
I tried
$ echo 'ee' | tee /dev/stdout | foo
but it does not work, since tee's output to /dev/stdout is piped to foo.
Here is a solution that works on any Unix/Linux implementation, assuming it cares to follow the POSIX standard. It works in some non-Unix environments like Cygwin too.
echo 'ee' | tee /dev/tty | foo
Reference: The Open Group Base Specifications Issue 7, IEEE Std 1003.1, 2013 Edition, §10.1:
/dev/tty
In each process, a synonym for the controlling terminal associated with the process group of that process, if any. It is useful for programs or shell procedures that wish to be sure of writing messages to or reading data from the terminal no matter how output has been redirected. It can also be used for applications that demand the name of a file for output, when typed output is desired and it is tiresome to find out what terminal is currently in use.
Some environments, like Google Colab, have been reported not to implement /dev/tty while their tty command still returns a usable device. Here is a workaround:
tty=$(tty)
echo 'ee' | tee "$tty" | foo
or with an ancient Bourne shell:
tty=`tty`
echo 'ee' | tee "$tty" | foo
Another thing to try is:
echo 'ee' | tee >(foo)
The >(foo) is a process substitution.
Edit:
To make it a bit clearer, >(...) here starts a new child process attached to the current terminal, which is where its output is redirected.
echo ee | tee >(wc | grep 1)
# ^^^^^^^^^^^^^^ => child process
Apart from the fact that variable declarations/changes in the child process are not reflected in the parent, there is little to be concerned about when running commands in a child process.
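A quick illustration of that caveat (the variable name x is arbitrary):
echo hi | tee >(read x) >/dev/null
echo "${x:-unset}"    # prints "unset": read ran in a subshell, so x never reached the parent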
Try:
$ echo 'ee' | tee /dev/stderr | foo
If using stderr is an option, of course.
Access to "/dev/stdout" is denied on some systems, but access to the user terminal is given by "/dev/tty".
Using "wc" for "foo", the above examples work OK (on Linux, OS X, etc.) as:
% echo 'Hi' | tee /dev/tty | wc
Hi
1 1 3
To add a count at the bottom of a list of matching files, I use something like:
% ls [A-J]* | tee /dev/tty | wc -l
To avoid having to remember all this, I define aliases:
% alias t tee /dev/tty
% alias wcl wc -l
so that I can simply say:
% ls [A-J]* | t | wcl
POSTSCRIPT: For the younger set, who might titter at its pronunciation as "titty", I might add that "tty" was once the common
abbreviation for a "teletype" terminal, which used a roll of yellow
paper and had round keys that often stuck.
First you need to figure out the terminal associated with your screen (or whichever terminal you want the output to display on):
tty
Then you can tee the output to that terminal and pipe the other copy through your foo program:
echo ee | tee /dev/pts/2 | foo
