How can I expand arguments to a bash function into a chain of piped commands? - bash

I often find myself doing something like this:
something | grep cat | grep bat | grep rat
when all I recall is that those three words occurred somewhere, in some order, in the output of something. Now, I could do something like this:
something | grep '.*cat.*bat.*rat.*'
but that implies ordering (bat appears after cat). As such, I was thinking of adding a bash function to my environment called mgrep which would turn:
mgrep cat bat rat
into
grep cat | grep bat | grep rat
but I'm not quite sure how to do it (or whether there is an alternative?). One idea would be to loop over the parameters like so:
while (($#)); do
grep $1 some_thing > some_thing
shift
done
cat some_thing
where some_thing is possibly some fifo, as when one does >(cmd) in bash, but I'm not sure. How would one proceed?

I believe you could generate a pipeline one command at a time, by redirecting stdin at each step. But it's much simpler and cleaner to generate your pipeline as a string and execute it with eval, like this:
CMD="grep '$1'" # consume the first argument
shift
for arg in "$@" # add the rest as further pipeline stages
do
CMD="$CMD | grep '$arg'"
done
eval "$CMD" # quoted, so the string reaches eval unchanged
This will generate a pipeline of greps that always reads from standard input, as in your model. Note that it protects spaces in quoted arguments (though an argument that itself contains a single quote will break it), so that it works correctly if you write:
mgrep 'the cat' 'the bat' 'the rat'
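To see what eval is actually handed, it can help to print the generated string instead of running it. The following is only a sketch: mgrep_show is a made-up name, and it reuses the same single-quote wrapping as the answer above.

```bash
# Hypothetical helper: print the pipeline string instead of evaluating it.
mgrep_show() {
    local cmd arg
    cmd="grep '$1'"                 # first argument starts the pipeline
    shift
    for arg in "$@"; do             # each remaining argument adds one stage
        cmd="$cmd | grep '$arg'"
    done
    printf '%s\n' "$cmd"            # print rather than eval
}

mgrep_show 'the cat' 'the bat'      # prints: grep 'the cat' | grep 'the bat'
mgrep_show "it's"                   # prints: grep 'it's'   (unbalanced quote)
```

The second call shows the weak spot: a pattern containing a single quote produces a string that eval cannot parse as intended.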

Thanks to Alexis, this is what I did:
function mgrep() # grep for multiple keywords, in any order
{
local CMD=''
while (($#)); do
CMD="$CMD grep \"$1\" | "
shift
done
eval "${CMD%| }" # strip the trailing "| " and run the pipeline
}

You can write a recursive function; I'm not happy with the base case, but I can't think of a better one. It seems a waste to need to call cat just to pass standard input to standard output, and the while loop is a bit inelegant:
mgrep () {
local e=$1;
# shift && grep "$e" | mgrep "$@" || while read -r; do echo "$REPLY"; done
shift && grep "$e" | mgrep "$@" || cat
# Maybe?
# shift && grep "$e" | mgrep "$@" || echo "$(</dev/stdin)"
}
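One possible way around the cat base case is to stop the recursion one step earlier, when a single pattern is left, and let that last grep write straight to standard output. This is a sketch under that assumption, not the answerer's code; the usage message is invented for illustration.

```bash
# Sketch: recursive mgrep whose base case is the last grep, so no cat is needed.
mgrep() {
    if [ "$#" -eq 0 ]; then
        echo "usage: mgrep PATTERN..." >&2   # hypothetical usage message
        return 2
    elif [ "$#" -eq 1 ]; then
        grep -e "$1"                         # last stage writes to stdout
    else
        local first="$1"
        shift
        grep -e "$first" | mgrep "$@"        # filter, then recurse on the rest
    fi
}

printf 'cat bat rat\nrat cat\nbat rat cat\n' | mgrep cat bat rat
```

Only the first and third sample lines contain all three words, so only those two are printed.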

Related

Create a chained command line in bash function

I have a question: I would like to create a function that, depending on the number of arguments entered, would build a so-called "chained" command line. The code I have written so far looks as follows:
function ignore {
if [ -n "$#" ] && [ "$#" > 0 ]; then
count=$#
if [ ${count} -eq 1 ]; then
return "grep -iv $1"
else
for args in "$@"; do
## Here should be code that would put (using pseudo code) as many "grep -iv $a | grep -iv $(a+1) | ... | grep -iv $(a+n)", where the part "$(a+..) represent the placeholder of next argument"
done
fi
fi
}
Any ideas? Thanks
Update
To clarify the above: the function would be used as follows:
some_bash_function | ignore
Example:
apt-cache search apache2 | ignore doc lib
Maybe this helps a bit more.
This seems horribly inefficient. A much better solution would look like grep -ive "${array[0]}" -e "${array[1]}" -e "${array[2]}" etc. Here's a simple way to build that.
# don't needlessly use Bash-only function declaration syntax
ignore () {
local args=()
local t
for t; do
args+=(-e "$t")
done
grep -iv "${args[@]}"
}
In the end, git status | ignore foo bar baz is not a lot simpler than git status | grep -ive foo -e bar -e baz so this function might not be worth these 116 bytes (spaces included). But hopefully at least this can work to demonstrate a way to build command lines programmatically. The use of arrays is important; there is no good way to preserve quoting of already quoted values if you smash everything into a single string.
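If arrays are not available (plain sh), the same idea can be sketched by rebuilding the positional parameters instead. ignore_posix is a hypothetical name; the body runs in a subshell so that the helper variables n and t do not leak into the caller.

```bash
# Sketch: build "-e pat1 -e pat2 ..." without arrays, using "set --".
ignore_posix() (
    n=$#                      # remember how many patterns were passed
    for t
    do
        set -- "$@" -e "$t"   # append "-e pattern" after the originals
    done
    shift "$n"                # drop the original, un-prefixed patterns
    grep -iv "$@"
)

printf 'foo 1\nBAR 2\nqux 3\n' | ignore_posix foo bar   # prints: qux 3
```

The for loop iterates over the arguments as they were when the loop started, so appending to "$@" inside the loop is safe. With no arguments grep is called without a pattern and reports a usage error.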
A more sustainable solution still is to just combine everything into a single regex. You can do that with grep -iv 'foo\|bar\|baz' though personally, I would probably switch to the more expressive regex dialect of grep -E; grep -ivE 'foo|bar|baz'.
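A minimal sketch of building that single regex from the arguments: "$*" joins the arguments with the first character of IFS, so setting IFS to | produces the alternation. ignore_regex is a made-up name, and the arguments are assumed to be valid extended regular expressions (metacharacters are not escaped).

```bash
# Sketch: join all arguments into one ERE alternation and run a single grep.
ignore_regex() (
    IFS='|'                 # "$*" joins arguments with the first IFS character
    grep -ivE -e "$*"       # e.g. foo bar baz -> 'foo|bar|baz'
)

printf 'foo 1\nBAR 2\nqux 3\n' | ignore_regex foo bar   # prints: qux 3
```

The parentheses make the function body a subshell, so the changed IFS does not affect the caller.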
If you really wanted to build a structure of pipes, I guess a recursive function would work.
# FIXME: slow and ugly, prefer the one above
ignore_slowly () {
if [[ $# -eq 1 ]]; then
grep -iv "$1"
else
local t=$1
shift
grep -iv "$t" | ignore_slowly "$@"
fi
}
But generally, you want to minimize the number of processes you create.
Though inefficient, what you want can be done like this:
#!/bin/bash
ignore () {
printf -v pipe 'grep -iv %q | ' "$@"
pipe=${pipe%???} # remove trailing ' | '
bash -c "$pipe"
}
ignore 'regex1' 'regex2' … 'regexn' < file

How can I grep a list of names from case?

So as an example, I have a bunch of apps that are constantly writing to /var/log/app/*/nonsence.file. There's nothing else in those folders, just logs from this one set of apps, so I can easily do:
cat /var/log/app/*/nonsence.file
and I'll get a nice stream of the app logs.
Mixed into this stream are periodic references to people. I'd like to build a script to trigger when certain names appear in the stream.
I can do this easily enough:
cat /var/log/app/*/nonsence.file | grep 'greg\|john\|suzy\|stacy'
and I can put THAT into a simple script thusly:
#!/bin/sh
NAME=`cat /var/log/app/*/nonsence.file | grep 'greg\|john\|suzy\|stacy'`
case "$NAME" in
"greg" ) echo "I found greg!" >> ~/names.meh ;;
"john" ) echo "I found john!" >> ~/names.meh ;;
"suzy" ) echo "I found suzy!" >> ~/names.meh ;;
"stacy" ) echo "I found stacy!" >> ~/names.meh ;;
* ) echo "forever alone..." >> ~/names.meh ;;
esac
Easy peasy!
The trouble is, the list of names changes from time to time and I would really like a neater list.
After some thinking, I believe what I REALLY want to do is add each name into the case section only. So what do I need to do in the NAME variable section to tell the command to grep the names referenced in the case section?
cat file | grep is a useless use of cat. Just grep file.
Commands in a pipeline are block-buffered by default.
The >> ~/names.meh is just repetition. Just specify it once for the whole block.
The backticks ` are discouraged. It's preferred to use $(..) instead.
Each time NAME=... is assigned the whole file is read, while you seem to want:
... I'd like to build a script to trigger when certain names appear in the stream.
which suggests you want to react when the name appears in the stream, not some time later.
You may try:
patterns=(greg john suzy stacy)
printf "%s\n" /var/log/app/*/nonsence.file |
# tail each file at the same time by spawning a background process for each
xargs -P0 -n1 tail -F -n+1 |
# grep for the patterns
# pass the patterns from a file
# the <(...) is a process substitution, a bash extension
grep --line-buffered -f <(printf "%s\n" "${patterns[@]}") -o |
# for each grepped match, execute a different action
while IFS= read -r line; do
case "$line" in
"greg") someaction ;;
# etc
*) echo "Internal error - unhandled pattern" ;;
esac
done >> ~/names.meh
Because specifying the patterns twice is lame, you could use an associative array to map the patterns to function names, or just use unique function names and generate the pattern list from them:
pattern_greg() { echo "greg"; }
pattern_kamil() { echo "well, not greg"; }
patterns=($(declare -F | sed 's/declare -f //; /^pattern_/!d; s/pattern_//'))
... |
while IFS= read -r line; do
if declare -f pattern_"$line" >/dev/null 2>&1; then
pattern_"$line"
else
echo "Internal error occurred"
fi
done
alternatively, but I like the functions better:
greg_function() { echo do something; }
kamil_callback() { echo do something else; }
declare -A patterns
patterns=([greg]=greg_function [kamil]=kamil_callback)
... | grep -f <(printf "%s\n" "${!patterns[@]}") ... |
while IFS= read -r line; do
# I think this is how to check if array element is set
if [[ -n "${patterns[$line]}" ]]; then
"${patterns[$line]}"
else
echo error
fi
done
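For trying the grep -o plus case idea without tail -F or live logs, here is a cut-down sketch that reads a finite sample from standard input. react_to_names and the messages are made up for illustration; the names are the ones from the question, and the redirection is written once, at the call site.

```bash
# Sketch: emit one match per line with grep -o, then dispatch with case.
react_to_names() {
    grep -oE 'greg|john|suzy|stacy' |
    while IFS= read -r name; do
        case "$name" in
            greg) echo "I found greg!" ;;
            john) echo "I found john!" ;;
            *)    echo "unhandled name: $name" ;;
        esac
    done
}

# A single redirection covers the whole loop, e.g.:
#   cat /var/log/app/*/nonsence.file | react_to_names >> ~/names.meh
printf 'hello greg\nnothing here\njohn and suzy\n' | react_to_names
```

grep -o prints every match on its own line, so a log line mentioning two names triggers two actions.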

Why does history require a numeric value for grep?

I am trying to make a custom function (hisgrep) to grep from history.
I had it working before, when the code was basically "history | grep $1", but I want the implementation to be able to grep multiple keywords. (e.g. "hisgrep docker client" would equal "history | grep docker | grep client").
My problem is that, when I try to do this I get this error: "-bash: history: |: numeric argument required."
I've tried changing how the command was called in the end from `$cmd` to just $cmd, but that did nothing.
Here's the code:
#!/bin/bash
function hisgrep() {
cmd='history'
for arg in "$@"; do
cmd="$cmd | grep $arg"
done
`$cmd`
}
Sadly, bash doesn't have a "foldl" or similar function.
You can do it like this:
histgrep() {
local str;
# save the history into some list
# I already filter the first argument, so the initial list is shorter
str=$(history | grep -e "$1");
shift;
# for each argument
for i; do
# pass the string via grep
str=$(<<<"$str" grep "$i")
done
printf "%s\n" "$str"
}
Notes:
Doing cmd="$cmd | grep $arg" and then doing `$cmd` looks unsafe.
Remember to quote your variables.
Use https://www.shellcheck.net/ to check your scripts.
Backticks ` are discouraged. Use $() command substitution.
Using both the function keyword and parentheses, as in function func(), is not portable. Just do func().
As for the unsafe version, you need to pass it via eval (and eval is evil), which, with a clever use of printf, shortens to just:
histgrep() { eval "history $(printf "| grep -e '%s' " "$@")"; }
But I think we can do a lot safer by expanding the arguments after command substitution, inside the eval call:
histgrep() { eval "history $(printf '| grep -e "${%s}" ' $(seq $#))"; }
The eval here will see history | grep -e "${1}" | grep -e "${2}" | ... which I think looks actually quite safe. (The braces matter from the tenth argument on, since "$10" would be read as "${1}0".)
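The same trick can be tried on ordinary standard input rather than history. This is a sketch: mgrep_refs is a hypothetical name, and a plain counter loop stands in for seq. The string handed to eval contains only references such as "${1}", never the pattern text itself, so quotes and spaces inside a pattern cannot change how the pipeline is parsed.

```bash
# Sketch: eval a pipeline that refers to arguments by position.
mgrep_refs() (
    pipeline='cat'
    i=1
    while [ "$i" -le "$#" ]; do
        # Appends the literal text:  | grep -e "${N}"
        pipeline="$pipeline | grep -e \"\${$i}\""
        i=$((i + 1))
    done
    eval "$pipeline"    # e.g. cat | grep -e "${1}" | grep -e "${2}"
)

printf "it's a cat\nits a cat\n" | mgrep_refs "it's" cat   # prints: it's a cat
```

The body is a subshell, so pipeline and i stay private, while the positional parameters remain visible to eval.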
It does not work because | is interpreted as an argument to the history command.
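A small demonstration of that point, using echo in place of history so it can be run anywhere: an unquoted variable is only split into words, so the | ends up as an ordinary argument, whereas eval parses the string as shell syntax. The variable names are arbitrary.

```bash
cmd='echo hello | tr a-z A-Z'

expanded=$($cmd)            # word splitting only: echo gets "|", "tr", ... as arguments
evaluated=$(eval "$cmd")    # eval re-parses the string, so the pipe is a real pipe

printf '%s\n' "$expanded"   # prints: hello | tr a-z A-Z
printf '%s\n' "$evaluated"  # prints: HELLO
```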

How do I avoid the usage of the "for" loop in this bash function?

I am creating this function to make multiple grep's over every line of a file. I run it as following:
cat file.txt | agrep string1 string2 ... stringN
function agrep () {
for a in $@; do
cmd+=" | grep '$a'";
done ;
while read line ; do
eval "echo "\'"$line"\'" $cmd";
done;
}
The idea is to print every line that contains all the strings: string1, string2, ..., stringN. This already works but I want to avoid the usage of the for to construct the expression:
| grep string1 | grep string2 ... | grep stringN
And if it's possible, also the usage of eval. I tried to make some expansion as follows:
echo "| grep $"{1..3}
And I get:
| grep $1 | grep $2 | grep $3
This is almost what I want but the problem is that when I try:
echo "| grep $"{1..$#}
The expansion doesn't occur because bash can't expand {1..$#} due to the $#. It just works with numbers. I would like to construct some expansion that works in order to avoid the usage of the for in the agrep function.
agrep () {
if [ $# = 0 ]; then
cat
else
pattern="$1"
shift
grep -e "$pattern" | agrep "$@"
fi
}
Instead of running multiple greps on each line, just get all the lines that match string1, then pipe that to grep for string2, etc. One way to do this is to make agrep recursive.
agrep () {
if (( $# == 0 )); then
cat # With no arguments, just output everything
else
grep "$1" | agrep "${@:2}"
fi
}
It's not the most efficient solution, but it's simple.
(Be sure to note Rob Mayoff's answer, which is the POSIX-compliant version of this.)
awk to the rescue!
You can avoid multiple grep calls, and constructing the command, by switching to awk:
awk -v pat='string1 string2 string3' 'BEGIN{n=split(pat,p)}
{for(i=1;i<=n;i++) if($0!~p[i]) next}1 ' file
Enter your space-delimited strings as in the example above.
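To use the awk approach as a drop-in agrep, the patterns can be passed as separate arguments instead of one space-delimited string, which also lets a pattern contain spaces. This is a sketch with a hypothetical name, agrep_awk; it copies the patterns out of ARGV and then sets ARGC to 1 so that awk reads standard input rather than treating the patterns as file names.

```bash
# Sketch: one awk process prints the lines that match every argument.
agrep_awk() {
    awk '
        BEGIN {
            n = ARGC - 1
            for (i = 1; i <= n; i++) pat[i] = ARGV[i]   # patterns, not files
            ARGC = 1                                    # so awk reads stdin
        }
        {
            for (i = 1; i <= n; i++)
                if ($0 !~ pat[i]) next                  # any miss drops the line
            print
        }
    ' "$@"
}

printf 'cat bat rat\nrat cat\nthe bat, the rat, the cat\n' | agrep_awk cat bat rat
```

The first and third sample lines are printed. Each argument is treated as an awk regular expression, as in the one-liner above.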
Not building a string for the command is definitely better (see chepner's and Rob Mayoff's answers). However, just as an example, you can avoid the for by using printf:
agrep () {
cmd=$(printf ' | grep %q' "$@")
# %q emits bash-style quoting, so let bash (not plain sh) parse it
bash -c "cat $cmd"
}
Using printf also helps somewhat with special characters in the patterns. From help printf:
In addition to the standard format specifications described in printf(1),
printf interprets:
%b expand backslash escape sequences in the corresponding argument
%q quote the argument in a way that can be reused as shell input
%(fmt)T output the date-time string resulting from using FMT as a format
string for strftime(3)
Since the aim of %q is providing output suitable for shell input, this should be safe.
Also: You almost always want to use "$@" with the quotes, not just plain $@.
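The difference is easy to see by counting the arguments that arrive on the other side. The three function names below are invented for the demonstration.

```bash
count_args() { echo "$#"; }         # report how many arguments were received

quoted()   { count_args "$@"; }     # each original argument stays one word
unquoted() { count_args $@; }       # arguments are re-split on whitespace

quoted   'the cat' 'the bat'        # prints: 2
unquoted 'the cat' 'the bat'        # prints: 4
```

With the unquoted form, a pattern such as 'the cat' would reach grep as two separate words.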

Speed up bash filter function to run commands consecutively instead of per line

I have written the following filter as a function in my ~/.bash_profile:
hilite() {
export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
while read line
do
echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
done
exit 0
}
to find lines of anything piped into it matching a regular expression, and highlight matches using ANSI escape codes on a VT100-compatible terminal.
For example, the following finds and highlights the strings bin, U or 1 which are whole words in the last 10 lines of /etc/passwd:
tail /etc/passwd | hilite "\b(bin|[U1])\b"
However, the script runs very slowly as each line forks an echo, egrep and sed.
In this case, it would be more efficient to do egrep on the entire input, and then run sed on its output.
How can I modify my function to do this? I would prefer to not create any temporary files if possible.
P.S. Is there another way to find and highlight lines in a similar way?
sed can do a bit of grepping itself: if you give it the -n flag (or #n instruction in a script) it won't echo any output unless asked. So
while read line
do
echo $line | egrep "$1" | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
done
could be simplified to
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
EDIT:
Here's the whole function:
hilite() {
REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g");
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
}
That's all there is to it - no while loop, reading, grepping, etc.
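A variant sketch for sed implementations that do not understand \x1b in the replacement: the escape character is produced with printf and spliced in. hilite_bre is a hypothetical name, and the sketch assumes the pattern is already a basic regular expression for sed and contains no / character.

```bash
# Sketch: print only matching lines, with each match in reverse video.
hilite_bre() {
    local esc
    esc=$(printf '\033')                       # literal ESC character
    sed -n "s/$1/${esc}[7m&${esc}[0m/gp"       # -n plus p: only changed lines
}

printf 'apple\nbanana\n' | hilite_bre an
```

Only banana is printed, with both occurrences of an highlighted; apple has no match and is suppressed by -n.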
If your egrep supports --color, just put this in .bash_profile:
hilite() { command egrep --color=auto "$@"; }
(Personally, I would name the function egrep; hence the usage of command).
I think you can replace the whole while loop with simply
sed -n "s/$REGEX_SED/\x1b[7m&\x1b[0m/gp"
because sed can read from stdin line-by-line so you don't need read
I'm not sure if running egrep and piping to sed is faster than using sed alone, but you can always compare using time.
Edit: added -n and p to sed to print only highlighted lines.
Well, you could simply do this:
egrep "$1" $line | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
But I'm not sure that it'll be that much faster ; )
Just for the record, this is a method using a temporary file:
hilite() {
export REGEX_SED=$(echo $1 | sed "s/[|()]/\\\&/g")
export FILE=$2
if [ -z "$FILE" ]
then
export FILE=~/tmp
echo -n > $FILE
while read line
do
echo $line >> $FILE
done
fi
egrep "$1" $FILE | sed "s/$REGEX_SED/\x1b[7m&\x1b[0m/g"
return $?
}
which also takes a file/pathname as the second argument, for cases like
cat /etc/passwd | hilite "\b(bin|[U1])\b"
