Execute command when grep finds a match - bash

I have a program that continuously runs and is piped into grep and every once in a while grep finds a match. I want my bash script to execute when grep finds the match.
Like this:
program | grep "pattern"
grep outputs -> script.sh executes

If I understand your question correct (the "every once in a while grep finds a match" part), your program produces the positive pattern several times during its longish runtime and you want script.sh run at those points in time .
When you pipe program to grep and watch on greps exit code with &&, then you execute script.sh only once and only at the end of programs runtime.
If my assumptions are correct (that you want script.sh run several times), then you need "if then logic" behind the | symbol. One way to do it is using awk instead of grep:
program | awk '/pattern/ {system("./script.sh")}'
/pattern/ {system("./script.sh")} in single quotes is the awk script
/pattern/ instructs awk to look for lines matching pattern and do the instructions inside the following curly braces
system("./script.sh") inside the curly braces behind the /.../ executes the script.sh in current directory via awks system command
The overall effect is: script.sh is executed as often as there are lines matching pattern in the output of program.
Another way to this is using sed:
program | sed -n '/pattern/ e ./script.sh'
-n prevents sed from printing every line in programs out put
/pattern/ is the filter criteria for lines in programs output: lines must match pattern
e ./script.sh is the action that sed does when a line matches the pattern: it executes your script.sh

Use -q grep parameter and && syntax, like this:
program | grep -q "pattern" && sh script.sh
where:
-q: Runs in quiet mode but exit immediately with zero status if any match is found
&&: is used to chain commands together, such that
the next command is run if and only if the preceding command exited
without errors (or, more accurately, exits with a return code of 0).
Explained here.
sh script.sh: Will run only if "pattern" is found

Equivalent to #masterguru's answer, but perhaps more understandable:
if program | grep -q "pattern"
then
# execute command here
fi
grep returns a zero exit status when it finds a match, so it can be used as a conditional in if.

Related

Bash replace lines in file that contain functions

I have a shell script that contains the following line
PROC_ID=$(cat myfile.sh | grep running)
which, after you echo out the value would be 1234 or something like that.
What I want to do is find and replace instances of this line with a literal value
I want to replace it with PROC_ID=1234 instead of having the function call.
I've tried doing this in another shell script using sed but I can't get it to work
STR_TO_USE="PROC_ID=${1}"
STR_TO_REP='PROC_ID=$(cat myfile.sh | grep running)'
sed -i "s/$STR_TO_REP/$STR_TO_USE/g" sample.sh
but it complains stating sed: 1: "sample.sh": unterminated substitute pattern
How can I achieve this?
EDIT:
sample.sh should contain beforehand
#!/bin/bash
....
PROC_ID=$(cat myfile.sh | grep running)
echo $PROC_ID
....
After, it should contain
#!/bin/bash
....
PROC_ID=1234
echo $PROC_ID
....
The script I'm using as described above will be taking the in an arg from the command line, hence STR_TO_USE="PROC_ID=${1}"
Simply:
sed /^PROC_ID=/s/=.*/=1234/
Translation:
At line begining by PROC_ID=
replace = to end of line by =1234.
or more accurate
sed '/^[ \o11]*PROC_ID=.*myfile.*running/s/=.*/=1234/'
could be enough
([ \o11]* mean some spaces and or tabs could even prepand)
Well, first, I want to point out something obvious. This: $(cat myfile.sh | grep running) will at the very least NOT only contain the string 1234 but will certainly also contain the string running. But since you aren't asking for help with that, I'll leave it alone.
All you need in your above sed, is first to backslash the $.
STR_TO_REP='PROC_ID=\$(cat myfile.sh | grep running)'
This allows the sed command to be terminated.

How to add string to cat result in bash?

File info has some certain info on a line starting with myline. Im trying to pass it to a script like this:
bash myscript `cat info | grep myline`
This works well. Script gets "myline" as first argument. But now i want to add a "w" at the end of that. I tried
bash myscript `cat info | grep myline`w
This is already problematic, the script gets "wyline" as first argument.
And now the next step is that i actually want to have an if statement whether i want to add w or not. Tried this:
bash myscript `cat info | grep myline``[ "condition" == "condition"] && echo "w"`
This works the same way. Script gets "wyline" as first argument.
So I have two questions:
1) How to fix the "wyline" result to get desired "mylinew"
2) Is there a better way to write this if statement after cat?
Do not use backticks `, use $(...) instead. bash hackers obsolete deprecated syntax
cat file | grep is a useless use of cat useless use of cat award. Just grep file.
Just quote the result and add w:
myscript "$(grep myline info)w"
You can add a trailing w to the last line of input with sed:
myscript "$(grep myline info | sed '$s/$/w/')"
I would advise to always quote your variable expansions.
Script gets "wyline" as first argument.
Your input file has dos line endings. Inspect output with cut -v or hexdump -C or xxd. Use dos2unix and remove carriage return characters.

unusual chaining of "grep" in the shell

I stumbled upon a shell instruction which looks odd:
ls -a | grep ".qmail-" | grep -v "mail" | grep ".mail" > t ; echo $?
I suspect that the returned value would represent an error. Could anyone confirm this or explain in which circumstances this instruction would be applied ?
The first grep only allows through lines that contain qmail (preceded by any character and followed by a dash, but that is largely immaterial). The second grep strips out lines that contain mail, which means every line passed by the first grep is deleted by the second. There's nothing left for the third one to process, so the file t will always be empty. The value for $? should be 1, failure, since the third grep failed to find any lines that matched its pattern (because it got no lines to process).
It is a mistake.
It is hard to know how to fix it without knowing what it is trying to do.
The bash shell (and most other shells) let users use the output of one command as the input of another. This is accomplished with the | operator which is called a pipe. So the output of of ls -a is fed to grep ".qmail-" and so on. The > operator sends the output of the command to a file, in this case t. So ls -a | grep ".qmail-" | grep -v "mail" | grep ".mail" > t lists the contents of a directory and passes the output through successive filters before finally saving the output to the file t.
The semicolon signals the end of a command and allows multiple bash commands to be entered on a single line.
echo $? prints out the return value of the last executed command, in this case, ls -a | grep ".qmail-" | grep -v "mail" | grep ".mail" > t. By convention, any value besides 0 indicates some sort of error occurred.The Linux Documentation Project gives a list of some common exit codes.

perform an operation for *each* item listed by grep

How can I perform an operation for each item listed by grep individually?
Background:
I use grep to list all files containing a certain pattern:
grep -l '<pattern>' directory/*.extension1
I want to delete all listed files but also all files having the same file name but a different extension: .extension2.
I tried using the pipe, but it seems to take the output of grep as a whole.
In find there is the -exec option, but grep has nothing like that.
If I understand your specification, you want:
grep --null -l '<pattern>' directory/*.extension1 | \
xargs -n 1 -0 -I{} bash -c 'rm "$1" "${1%.*}.extension2"' -- {}
This is essentially the same as what #triplee's comment describes, except that it's newline-safe.
What's going on here?
grep with --null will return output delimited with nulls instead of newline. Since file names can have newlines in them delimiting with newline makes it impossible to parse the output of grep safely, but null is not a valid character in a file name and thus makes a nice delimiter.
xargs will take a stream of newline-delimited items and execute a given command, passing as many of those items (one as each parameter) to a given command (or to echo if no command is given). Thus if you said:
printf 'one\ntwo three \nfour\n' | xargs echo
xargs would execute echo one 'two three' four. This is not safe for file names because, again, file names might contain embedded newlines.
The -0 switch to xargs changes it from looking for a newline delimiter to a null delimiter. This makes it match the output we got from grep --null and makes it safe for processing a list of file names.
Normally xargs simply appends the input to the end of a command. The -I switch to xargs changes this to substitution the specified replacement string with the input. To get the idea try this experiment:
printf 'one\ntwo three \nfour\n' | xargs -I{} echo foo {} bar
And note the difference from the earlier printf | xargs command.
In the case of my solution the command I execute is bash, to which I pass -c. The -c switch causes bash to execute the commands in the following argument (and then terminate) instead of starting an interactive shell. The next block 'rm "$1" "${1%.*}.extension2"' is the first argument to -c and is the script which will be executed by bash. Any arguments following the script argument to -c are assigned as the arguments to the script. This, if I were to say:
bash -c 'echo $0' "Hello, world"
Then Hello, world would be assigned to $0 (the first argument to the script) and inside the script I could echo it back.
Since $0 is normally reserved for the script name I pass a dummy value (in this case --) as the first argument and, then, in place of the second argument I write {}, which is the replacement string I specified for xargs. This will be replaced by xargs with each file name parsed from grep's output before bash is executed.
The mini shell script might look complicated but it's rather trivial. First, the entire script is single-quoted to prevent the calling shell from interpreting it. Inside the script I invoke rm and pass it two file names to remove: the $1 argument, which was the file name passed when the replacement string was substituted above, and ${1%.*}.extension2. This latter is a parameter substitution on the $1 variable. The important part is %.* which says
% "Match from the end of the variable and remove the shortest string matching the pattern.
.* The pattern is a single period followed by anything.
This effectively strips the extension, if any, from the file name. You can observe the effect yourself:
foo='my file.txt'
bar='this.is.a.file.txt'
baz='no extension'
printf '%s\n'"${foo%.*}" "${bar%.*}" "${baz%.*}"
Since the extension has been stripped I concatenate the desired alternate extension .extension2 to the stripped file name to obtain the alternate file name.
If this does what you want, pipe the output through /bin/sh.
grep -l 'RE' folder/*.ext1 | sed 's/\(.*\).ext1/rm "&" "\1.ext2"/'
Or if sed makes you itchy:
grep -l 'RE' folder/*.ext1 | while read file; do
echo rm "$file" "${file%.ext1}.ext2"
done
Remove echo if the output looks like the commands you want to run.
But you can do this with find as well:
find /path/to/start -name \*.ext1 -exec grep -q 'RE' {} \; -print | ...
where ... is either the sed script or the three lines from while to done.
The idea here is that find will ... well, "find" things based on the qualifiers you give it -- namely, that things match the file glob "*.ext", AND that the result of the "exec" is successful. The -q tells grep to look for RE in {} (the file supplied by find), and exit with a TRUE or FALSE without generating any of its own output.
The only real difference between doing this in find vs doing it with grep is that you get to use find's awesome collection of conditions to narrow down your search further if required. man find for details. By default, find will recurse into subdirectories.
You can pipe the list to xargs:
grep -l '<pattern>' directory/*.extension1 | xargs rm
As for the second set of files with a different extension, I'd do this (as usual use xargs echo rm when testing to make a dry run; I haven't tested it, it may not work correctly with filenames with spaces in them):
filelist=$(grep -l '<pattern>' directory/*.extension1)
echo $filelist | xargs rm
echo ${filelist//.extension1/.extension2} | xargs rm
Pipe the result to xargs, it will allow you to run a command for each match.

How to get the part of a file after the first line that matches a regular expression

I have a file with about 1000 lines. I want the part of my file after the line which matches my grep statement.
That is:
cat file | grep 'TERMINATE' # It is found on line 534
So, I want the file from line 535 to line 1000 for further processing.
How can I do that?
The following will print the line matching TERMINATE till the end of the file:
sed -n -e '/TERMINATE/,$p'
Explained: -n disables default behavior of sed of printing each line after executing its script on it, -e indicated a script to sed, /TERMINATE/,$ is an address (line) range selection meaning the first line matching the TERMINATE regular expression (like grep) to the end of the file ($), and p is the print command which prints the current line.
This will print from the line that follows the line matching TERMINATE till the end of the file:
(from AFTER the matching line to EOF, NOT including the matching line)
sed -e '1,/TERMINATE/d'
Explained: 1,/TERMINATE/ is an address (line) range selection meaning the first line for the input to the 1st line matching the TERMINATE regular expression, and d is the delete command which delete the current line and skip to the next line. As sed default behavior is to print the lines, it will print the lines after TERMINATE to the end of input.
If you want the lines before TERMINATE:
sed -e '/TERMINATE/,$d'
And if you want both lines before and after TERMINATE in two different files in a single pass:
sed -e '1,/TERMINATE/w before
/TERMINATE/,$w after' file
The before and after files will contain the line with terminate, so to process each you need to use:
head -n -1 before
tail -n +2 after
IF you do not want to hard code the filenames in the sed script, you can:
before=before.txt
after=after.txt
sed -e "1,/TERMINATE/w $before
/TERMINATE/,\$w $after" file
But then you have to escape the $ meaning the last line so the shell will not try to expand the $w variable (note that we now use double quotes around the script instead of single quotes).
I forgot to tell that the new line is important after the filenames in the script so that sed knows that the filenames end.
How would you replace the hardcoded TERMINATE by a variable?
You would make a variable for the matching text and then do it the same way as the previous example:
matchtext=TERMINATE
before=before.txt
after=after.txt
sed -e "1,/$matchtext/w $before
/$matchtext/,\$w $after" file
to use a variable for the matching text with the previous examples:
## Print the line containing the matching text, till the end of the file:
## (from the matching line to EOF, including the matching line)
matchtext=TERMINATE
sed -n -e "/$matchtext/,\$p"
## Print from the line that follows the line containing the
## matching text, till the end of the file:
## (from AFTER the matching line to EOF, NOT including the matching line)
matchtext=TERMINATE
sed -e "1,/$matchtext/d"
## Print all the lines before the line containing the matching text:
## (from line-1 to BEFORE the matching line, NOT including the matching line)
matchtext=TERMINATE
sed -e "/$matchtext/,\$d"
The important points about replacing text with variables in these cases are:
Variables ($variablename) enclosed in single quotes ['] won't "expand" but variables inside double quotes ["] will. So, you have to change all the single quotes to double quotes if they contain text you want to replace with a variable.
The sed ranges also contain a $ and are immediately followed by a letter like: $p, $d, $w. They will also look like variables to be expanded, so you have to escape those $ characters with a backslash [\] like: \$p, \$d, \$w.
As a simple approximation you could use
grep -A100000 TERMINATE file
which greps for TERMINATE and outputs up to 100,000 lines following that line.
From the man page:
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines.
Places a line containing a group separator (--) between
contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.
A tool to use here is AWK:
cat file | awk 'BEGIN{ found=0} /TERMINATE/{found=1} {if (found) print }'
How does this work:
We set the variable 'found' to zero, evaluating false
if a match for 'TERMINATE' is found with the regular expression, we set it to one.
If our 'found' variable evaluates to True, print :)
The other solutions might consume a lot of memory if you use them on very large files.
If I understand your question correctly you do want the lines after TERMINATE, not including the TERMINATE-line. AWK can do this in a simple way:
awk '{if(found) print} /TERMINATE/{found=1}' your_file
Explanation:
Although not best practice, you could rely on the fact that all variables defaults to 0 or the empty string if not defined. So the first expression (if(found) print) will not print anything to start off with.
After the printing is done, we check if this is the starter-line (that should not be included).
This will print all lines after the TERMINATE-line.
Generalization:
You have a file with start- and end-lines and you want the lines between those lines excluding the start- and end-lines.
start- and end-lines could be defined by a regular expression matching the line.
Example:
$ cat ex_file.txt
not this line
second line
START
A good line to include
And this line
Yep
END
Nope more
...
never ever
$ awk '/END/{found=0} {if(found) print} /START/{found=1}' ex_file.txt
A good line to include
And this line
Yep
$
Explanation:
If the end-line is found no printing should be done. Note that this check is done before the actual printing to exclude the end-line from the result.
Print the current line if found is set.
If the start-line is found then set found=1 so that the following lines are printed. Note that this check is done after the actual printing to exclude the start-line from the result.
Notes:
The code rely on the fact that all AWK variables defaults to 0 or the empty string if not defined. This is valid, but it may not be best practice so you could add a BEGIN{found=0} to the start of the AWK expression.
If multiple start-end-blocks are found, they are all printed.
grep -A 10000000 'TERMINATE' file
is much, much faster than sed, especially working on really a big file. It works up to 10M lines (or whatever you put in), so there isn't any harm in making this big enough to handle about anything you hit.
Use Bash parameter expansion like the following:
content=$(cat file)
echo "${content#*TERMINATE}"
There are many ways to do it with sed or awk:
sed -n '/TERMINATE/,$p' file
This looks for TERMINATE in your file and prints from that line up to the end of the file.
awk '/TERMINATE/,0' file
This is exactly the same behaviour as sed.
In case you know the number of the line from which you want to start printing, you can specify it together with NR (number of record, which eventually indicates the number of the line):
awk 'NR>=535' file
Example
$ seq 10 > a #generate a file with one number per line, from 1 to 10
$ sed -n '/7/,$p' a
7
8
9
10
$ awk '/7/,0' a
7
8
9
10
$ awk 'NR>=7' a
7
8
9
10
If for any reason, you want to avoid using sed, the following will print the line matching TERMINATE till the end of the file:
tail -n "+$(grep -n 'TERMINATE' file | head -n 1 | cut -d ":" -f 1)" file
And the following will print from the following line matching TERMINATE till the end of the file:
tail -n "+$(($(grep -n 'TERMINATE' file | head -n 1 | cut -d ":" -f 1)+1))" file
It takes two processes to do what sed can do in one process, and if the file changes between the execution of grep and tail, the result can be incoherent, so I recommend using sed. Moreover, if the file doesn’t not contain TERMINATE, the first command fails.
Alternatives to the excellent sed answer by jfg956, and which don't include the matching line:
awk '/TERMINATE/ {y=1;next} y' (Hai Vu's answer to 'grep +A': print everything after a match)
awk '/TERMINATE/ ? c++ : c' (Steven Penny's answer to 'grep +A': print everything after a match)
perl -ne 'print unless 1 .. /TERMINATE/' (tchrist's answer to 'grep +A': print everything after a match)
This could be one way of doing it. If you know in what line of the file you have your grep word and how many lines you have in your file:
grep -A466 'TERMINATE' file
sed is a much better tool for the job:
sed -n '/re/,$p' file
where re is a regular expression.
Another option is grep's --after-context flag. You need to pass in a number to end at, using wc on the file should give the right value to stop at. Combine this with -n and your match expression.
This will print all lines from the last found line "TERMINATE" till the end of the file:
LINE_NUMBER=`grep -o -n TERMINATE $OSCAM_LOG | tail -n 1 | sed "s/:/ \\'/g" | awk -F" " '{print $1}'`
tail -n +$LINE_NUMBER $YOUR_FILE_NAME

Resources