grep: compare string from file with another string - bash

I have a list of files paths that I need to compare with a string:
git_root_path=$(git rev-parse --show-toplevel)
list_of_files=.git/ForGeneratingSBConfigAlert.txt
cd $git_root_path
echo "These files needs new OSB config:"
while read -r line
do
modfied="$line"
echo "File for compare: $modfied"
if grep -qf $list_of_files" $modfied"; then
echo "Found: $modfied"
fi
done < <(git status -s | grep -v " M" | awk '{if ($1 == "M") print $2}')
$modified - is a string variable that stores path to file
Pattern file example:
SVCS/resources/
SVCS/bus/projects/busCallout/
SVCS/bus/projects/busconverter/
SVCS/bus/projects/Resources/ (ignore .jar)
SVCS/bus/projects/Teema/
SVCS/common/
SVCS/domain/
SVCS/techutil/src/
SVCS/tech/mds/src/java/fi/vr/h/service/tech/mds/exception/
SVCS/tech/mds/src/java/fi/vr/h/service/tech/mds/interfaces/
SVCS/app/cashmgmt/src/java/fi/vr/h/service/app/cashmgmt/exception/
SVCS/app/cashmgmt/src/java/fi/vr/h/service/app/cashmgmt/interfaces/
SVCS/app/customer/src/java/fi/vr/h/service/app/customer/exception/
SVCS/app/customer/src/java/fi/vr/h/service/app/customer/interfaces/
SVCS/app/etravel/src/java/fi/vr/h/service/app/etravel/exception/
SVCS/app/etravel/src/java/fi/vr/h/service/app/etravel/interfaces/
SVCS/app/hermes/src/java/fi/vr/h/service/app/hermes/exception/
SVCS/app/hermes/src/java/fi/vr/h/service/app/hermes/interfaces/
SVCS/app/journey/src/java/fi/vr/h/service/app/journey/exception/
SVCS/app/journey/src/java/fi/vr/h/service/app/journey/interfaces/
SVCS/app/offline/src/java/fi/vr/h/service/app/offline/exception/
SVCS/app/offline/src/java/fi/vr/h/service/app/offline/interfaces/
SVCS/app/order/src/java/fi/vr/h/service/app/order/exception/
SVCS/app/order/src/java/fi/vr/h/service/app/order/interfaces/
SVCS/app/payment/src/java/fi/vr/h/service/app/payment/exception/
SVCS/app/payment/src/java/fi/vr/h/service/app/payment/interfaces/
SVCS/app/price/src/java/fi/vr/h/service/app/price/exception/
SVCS/app/price/src/java/fi/vr/h/service/app/price/interfaces/
SVCS/app/product/src/java/fi/vr/h/service/app/product/exception/
SVCS/app/product/src/java/fi/vr/h/service/app/product/interfaces/
SVCS/app/railcar/src/java/fi/vr/h/service/app/railcar/exception/
SVCS/app/railcar/src/java/fi/vr/h/service/app/railcar/interfaces/
SVCS/app/reservation/src/java/fi/vr/h/service/app/reservation/exception/
SVCS/app/reservation/src/java/fi/vr/h/service/app/reservation/interfaces/
kraken_test.txt
namaker_test.txt
shmaker_test.txt
I need to compare file search pattern with a string, is it possible using grep?

I'm not sure I understand the overall logic, but a few immediate suggestions come to mind.
You can avoid grep | awk in the vast majority of cases.
A while loop with a grep on a line at a time inside the loop is an antipattern. You probably just want to run one grep on the whole input.
Your question would still benefit from an explanation of what you are actually trying to accomplish.
cd "$(git rev-parse --show-toplevel)"
git status -s | awk '!/ M/ && $1 == "M" { print $2 }' |
grep -Fxf .git/ForGeneratingSBConfigAlert.txt
I was trying to think of a way to add back your human-readable babble, but on second thought, this program is probably better without it.
The -x option to grep might be wrong, depending on what you are really hoping to accomplish.

This should work:
git status -s | grep -v " M" | awk '{if ($1 == "M") print $2}' | \
grep --file=.git/ForGeneratingSBConfigAlert.txt --fixed-strings --line-regexp
Piping the awk output directly to grep avoids the while loop entirely. In most cases you'll find you don't really need to print debug messages and the like in it.
--file takes a file with one pattern to match per line.
--fixed-strings avoids treating any characters in the patterns as special.
--line-regexp anchors the patterns so that they only match if a full line of input matches one of the patterns.
All that said, could you clarify what you are actually trying to accomplish?

Related

How to get the nth recent file in the nth last modified subdirectory using pipes

I'm doing an exercise for OS exam. It requires to get the 3rd recent file of the 2nd last modified sub-directory inside current directory. Then I have to print its lines in reverse order. I can not use tac command. The text suggest to use (other than awk and sed): head, tails, wc.
I've succeded getting filename of the requested file (but in a too complex way I think). Now I have to print it in reverse. I think I can use this awk solution https://stackoverflow.com/a/744093/11614625.
This is how I'm getting the filename:
ls -t | head | awk '{system("test -d \"" $0 "\" && echo \"" $0 "\"")}' | awk 'NR==2 {system("ls \"" $0 "\" | head")}' | awk 'NR==1'
How can I do better? And what if 3rd directory or 2nd file doesn't exists?
See https://mywiki.wooledge.org/ParsingLs and awk '{system("test -d \"" $0 "\" && echo \"" $0 "\"")}' is calling shell to call awk to call system to call shell to call test which is clearly a worse approach than just having shell call test in the first place if you were going to do that. Also, any solution that reads the whole file into memory (as any sed or a naive awk solution would) will fail for large files as they'll exceed available memory.
Unfortunately this is how to do what you want robustly:
dir="$(find . -mindepth 1 -maxdepth 1 -type d -printf '%T+\t%p\0' |
sort -rz |
awk -v RS='\0' 'NR==2{sub(/[^\t]+\t/,""); print; exit}')" &&
file="$(find "$dir" -mindepth 1 -maxdepth 1 -type f -printf '%T+\t%p\0' |
sort -z |
awk -v RS='\0' 'NR==3{sub(/[^\t]+\t/,""); print; exit}')" &&
cat -n "$file" | sort -rn | cut -f2-
If any of the commands in any of the pipes fail then the error message from the command that failed will be printed and then no other command will execute and the overall exit status will be the failure one from that failing command.
I used cat | sort | cut rather than awk or sed to print the file in reverse because awk (unless you write demand paging in it) or sed would have to read the whole file into memory at once and so would fail for very large files while sort is designed to handle large files by using paging with tmp files as necessary and only keeping parts of the file in memory at a time so it's limited only by how much free disk space you have on your device.
The above requires GNU tools to provide/handle NUL line-endings - if you don't have those then change \0 to \n in the find command, remove the z from sort options, and remove -v RS='\0' from the awk command and be aware that the result will only work if your directory or file names don't contain newlines.

Regex multiline output variable in if clause

Consider the following on a debian based system:
VAR=$(dpkg --get-selections | awk '{print $1}' | grep linux-image)
This will print a list of installed packages with the string "linux-image" in them on my system this output looks like:
linux-image-3.11.0-17-generic
linux-image-extra-3.11.0-17-generic
linux-image-generic
Now as we all know
echo $VAR
results in
linux-image-3.11.0-17-generic linux-image-extra-3.11.0-17-generic linux-image-generic
and
echo "$VAR"
results in
linux-image-3.11.0-17-generic
linux-image-extra-3.11.0-17-generic
linux-image-generic
I do not want to use external commands in a if clause, it seems rather dirty and not very elegant, so I wanted to use bash built in regex matching:
if [[ "$VAR" =~ ^linux-image-g ]]; then
echo "yes"
fi
however that does not work, since it does not seem to consider multiple lines here. How can I match beginnings of lines in a variable?
There's nothing wrong with using an external command as part of the if statement; I would skip the VAR variable altogether and use
if dpkg --get-selections | awk '{print $1}' | grep -q linux-image;
The -q option to grep suppresses its output, and the if statement uses the exit status of grep directly. You could also drop the grep and test $1 directly in the awk script:
if dpkg --get-selections | awk '$1 =~ "^linux-image" { exit 0; } END {exit 1}'; then
or you can skip awk, since there doesn't seem to be a real need to drop the other fields before calling grep:
if dpkg --get-selections | grep -q '^linux-image'; then

how to grep multiples variable in bash

I need to grep multiple strings, but i don't know the exact number of strings.
My code is :
s2=( $(echo $1 | awk -F"," '{ for (i=1; i<=NF ; i++) {print $i} }') )
for pattern in "${s2[#]}"; do
ssh -q host tail -f /some/path |
grep -w -i --line-buffered "$pattern" > some_file 2>/dev/null &
done
now, the code is not doing what it's supposed to do. For example if i run ./script s1,s2,s3,s4,.....
it prints all lines that contain s1,s2,s3....
The script is supposed to do something like grep "$s1" | grep "$s2" | grep "$s3" ....
grep doesn't have an option to match all of a set of patterns. So the best solution is to use another tool, such as awk (or your choice of scripting languages, but awk will work fine).
Note, however, that awk and grep have subtly different regular expression implementations. It's not clear from the question whether the target strings are literal strings or regular expression patterns, and if the latter, what the expectations are. However, since the argument comes delimited with commas, I'm assuming that the pieces are simple strings and should not be interpreted as patterns.
If you want the strings to be interpreted as patterns, you can change index to match in the following little program:
ssh -q host tail -f /some/path |
awk -v STRINGS="$1" -v IGNORECASE=1 \
'BEGIN{split(STRINGS,strings,/,/)}
{for(i in strings)if(!index($0,strings[i]))next}
{print;fflush()}'
Note:
IGNORECASE is only available in gnu awk; in (most) other implementations, it will do nothing. It seems that is what you want, based on the fact that you used -i in your grep invocation.
fflush() is also an extension, although it works with both gawk and mawk. In Posix awk, fflush requires an argument; if you were using Posix awk, you'd be better off printing to stderr.
You can use extended grep
egrep "$s1|$s2|$s3" fileName
If you don't know how many pattern you need to grep, but you have all of them in an array called s, you can use
egrep $(sed 's/ /|/g' <<< "${s[#]}") fileName
This creates a herestring with all elements of the array, sed replaces the field separator of bash (space) with | and if we feed that to egrep we grep all strings that are in the array s.
test.sh:
#!/bin/bash -x
a=" $#"
grep ${a// / -e } .bashrc
it works that way:
$ ./test.sh 1 2 3
+ a=' 1 2 3'
+ grep -e 1 -e 2 -e 3 .bashrc
(here is lots of text that fits all the arguments)

Bash grep variable from multiple variables on a single line

I am using GNU bash, version 4.2.20(1)-release (x86_64-pc-linux-gnu). I have a music file list I dumped into a variable: $pltemp.
Example:
/Music/New/2010s/2011;Ziggy Marley;Reggae In My Head
I wish to grep the 3rd field above, in the Master-Music-List.txt, then continue another grep for the 2nd field. If both matched, print else echo "Not Matched".
So the above will search for the Song Title (Reggae In My Head), then will make sure it has the artist "Shaggy" on the same line, for a success.
So far, success for a non-variable grep;
$ grep -i -w -E 'shaggy.*angel' Master-Music-MM-Playlist.m3u
$ if ! grep Shaggy Master-Music-MM-Playlist.m3u ; then echo "Not Found"; fi
$ grep -i -w Angel Master-Music-MM-Playlist.m3u | grep -i -w shaggy
I'm not sure how to best construct the 'entire' list to process.
I want to do this on a single line.
I used this to dump the list into the variable $pltemp...
Original: \Music\New\2010s\2011\Ziggy Marley - Reggae In My Head.mp3
$ pltemp="$(cat Reggae.m3u | sed -e 's/\(.*\)\\/\1;/' -e 's/\(.*\)\ -\ /\1;/' -e 's/\\/\//g' -e 's/\\/\//g' -e 's/.mp3//')"
If you realy want to "grep this, then grep that", you need something more complex than grep by itself. How about awk?
awk -F';' '$3~/title/ && $2~/artist/ {print;n=1;exit;} END {if(n=0)print "Not matched";}'
If you want to make this search accessible as a script, the same thing simply changes form. For example:
#!/bin/sh
awk -F';' -vartist="$1" -vtitle="$2" '$3~title && $2~artist {print;n=1;exit;} END {if(n=0)print "Not matched";}'
Write this to a file, make it executable, and pipe stuff to it, with the artist substring/regex you're looking for as the first command line option, and the title substring/regex as the second.
On the other hand, what you're looking for might just be a slightly more complex regular expression. Let's wrap it in bash for you:
if ! echo "$pltemp" | egrep '^[^;]+;[^;]*artist[^;]*;.*title'; then
echo "Not matched"
fi
You can compress this to a single line if you like. Or make it a stand-along shell script, or make it a function in your .bashrc file.
awk -F ';' -v title="$title" -v artist="$artist" '$3 ~ title && $2 ~ artist'
Well, none of the above worked, so I came up with this...
for i in *.m3u; do
cat "$i" | sed 's/.*\\//' | while read z; do
grep --color=never -i -w -m 1 "$z" Master-Music-Playlist.m3u \
| echo "#NotFound;"$z" "
done > "$i"-MM-Final.txt;
done
Each line is read (\Music\Lady Gaga - Paparazzi.mp3), the path is stripped, the song is searched in the Master Music List, if not found, it echos "Not Found", saved into a new playlist.
Works {Solved}
Thanks anyway.

String Manipulation in Bash

I am a newbie in Bash and I am doing some string manipulation.
I have the following file among other files in my directory:
jdk-6u20-solaris-i586.sh
I am doing the following to get jdk-6u20 in my script:
myvar=`ls -la | awk '{print $9}' | egrep "i586" | cut -c1-8`
echo $myvar
but now I want to convert jdk-6u20 to jdk1.6.0_20. I can't seem to figure out how to do it.
It must be as generic as possible. For example if I had jdk-6u25, I should be able to convert it at the same way to jdk1.6.0_25 so on and so forth
Any suggestions?
Depending on exactly how generic you want it, and how standard your inputs will be, you can probably use AWK to do everything. By using FS="regexp" to specify field separators, you can break down the original string by whatever tokens make the most sense, and put them back together in whatever order using printf.
For example, assuming both dashes and the letter 'u' are only used to separate fields:
myvar="jdk-6u20-solaris-i586.sh"
echo $myvar | awk 'BEGIN {FS="[-u]"}; {printf "%s1.%s.0_%s",$1,$2,$3}'
Flavour according to taste.
Using only Bash:
for file in jdk*i586*
do
file="${file%*-solaris*}"
file="${file/-/1.}"
file="${file/u/.0_}"
do_something_with "$file"
done
i think that sed is the command for you
You can try this snippet:
for fname in *; do
newname=`echo "$fname" | sed 's,^jdk-\([0-9]\)u\([0-9][0-9]*\)-.*$,jdk1.\1.0_\2,'`
if [ "$fname" != "$newname" ]; then
echo "old $fname, new $newname"
fi
done
awk 'if(match($9,"i586")){gsub("jdk-6u20","jdk1.6.0_20");print $9;}'
The if(match()) supersedes the egrep bit if you want to use it. You could use substr($9,1,8) instead of cut as well.
garph0 has a good idea with sed; you could do
myvar=`ls jdk*i586.sh | sed 's/jdk-\([0-9]\)u\([0-9]\+\).\+$/jdk1.\1.0_\2/'`
You're needing the awk in there is an artifact of the -l switch on ls. For pattern substitution on lines of text, sed is the long-time champion:
ls | sed -n '/^jdk/s/jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\)$/jdk1.\1.0_\2/p'
This was written in "old-school" sed which should have greater portability across platforms. The expression says:
don't print lines unless they match -n
on lines beginning with 'jdk' do:
on a line that contains only "jdk-IntegerAuIntegerB"
change it to "jdk.1.IntegerA.0_IntegerB"
and print it
Your sample becomes even simpler as:
myvar=`echo *solaris-i586.sh | sed 's/-solaris-i586\.sh//'`

Resources