Using grep to get the line number of first occurrence of a string in a file - bash

I am using a bash script for testing. During my testing I have to find the line number of the first occurrence of a string in a file. I have tried both "awk" and "grep", but neither of them returns the value.
Awk example
#!/bin/bash
....
VAR=searchstring
...
cpLines=$(awk '/$VAR/{print NR}' $MYDIR/Configuration.xml)
This does not expand $VAR. If I hard-code the value of VAR in the pattern it works, but I want to use the variable.
Grep example
#!/bin/bash
...
VAR=searchstring
...
cpLines=grep -n -m 1 $VAR $MYDIR/Configuration.xml |cut -f1 -d:
This gives the error: line 20: -n: command not found

grep -n -m 1 SEARCH_TERM FILE_PATH |sed 's/\([0-9]*\).*/\1/'
grep switches:
-n = prefix each output line with its line number
-m 1 = stop after the first matching line
sed options (stream editor):
's/X/Y/' - replace X with Y
\([0-9]*\) - a regular expression matching zero or more digits; the escaped parentheses form a group, and the text the group matches is available as \1 in Y (the replacement string)
\([0-9]*\).* - the trailing .* matches any characters, zero or more times, so the rest of the line after the number is dropped.

You need $() (command substitution) to capture the output of grep in a variable:
cpLines=$(grep -n -m 1 $VAR $MYDIR/Configuration.xml |cut -f1 -d: )
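A minimal sketch of the command-substitution fix, run against a throwaway file. The file contents and the name cfg are assumptions made for illustration; only the grep and cut pipeline comes from the answer above.

```shell
#!/bin/bash
# Hypothetical sample file standing in for Configuration.xml.
cfg=$(mktemp)
printf '%s\n' '<root>' '  <item>alpha</item>' \
    '  <item>searchstring</item>' '  <item>searchstring</item>' '</root>' > "$cfg"

VAR=searchstring

# $(...) captures the pipeline's output in cpLines.
# Quoting "$VAR" and "$cfg" protects values that contain spaces;
# "--" stops option parsing in case the pattern starts with "-".
cpLines=$(grep -n -m 1 -- "$VAR" "$cfg" | cut -d: -f1)

echo "$cpLines"
rm -f "$cfg"
```

If no line matches, cpLines ends up empty, so a script would probably want to test for that before using the value.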

Try something like:
awk -v search="$var" '$0~search{print NR; exit}' inputFile
In awk, the pattern between / / is taken literally, so a variable name placed there is not expanded. You need to use the match operator (~) instead. Here we test each input line against the variable; on the first match we print the line number stored in NR and exit.
-v creates an awk variable (search in the example above) and assigns it the value of your bash variable ($var).
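The awk approach can be wrapped in a small function. This is only a sketch: the function name first_match_line and the sample file are hypothetical, not part of the original script.

```shell
#!/bin/bash
# first_match_line PATTERN FILE
# Prints the number of the first line in FILE that matches the regex PATTERN.
first_match_line() {
    awk -v search="$1" '$0 ~ search { print NR; exit }' "$2"
}

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'one' 'two searchstring' 'three searchstring' > "$sample"

var=searchstring
line=$(first_match_line "$var" "$sample")
echo "$line"
rm -f "$sample"
```

Because $0 ~ search treats the value as a regular expression, a search string containing characters such as . or * may match more than intended; testing index($0, search) instead is one way to get a literal match.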

grep -n -m 1 SEARCH_TERM FILE_PATH | grep -Po '^[0-9]+'
explanation:
-Po = -P -o
-P use perl regex
-o only print matched string (not the whole line)

Try piping:
grep -P 'SEARCH TERM' fileName.txt | wc -l
Note that this counts the matching lines rather than reporting the line number of the first match.


Replace one character by the other (and vice-versa) in shell

Say I have strings that look like this:
$ a='/o\\'
$ echo $a
/o\
$ b='\//\\\\/'
$ echo $b
\//\\/
I'd like a shell script (ideally a one-liner) to replace / occurrences by \ and vice-versa.
Suppose the command is called invert, it would yield (in a shell prompt):
$ invert $a
\o/
$ invert $b
/\\//\
For example using sed, it seems unavoidable to use a temporary character, which is not great, like so:
$ echo $a | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
\o/
$ echo $b | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
/\\//\
For some context, this is useful for proper printing of git log --graph --all | tac (I like to see newer commits at the bottom).
tr is your friend:
% echo 'abc' | tr ab ba
bac
% echo '/o\' | tr '\\/' '/\\'
\o/
(escaping the backslashes in the output might require a separate step)
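A possible shape for the invert command asked about in the question, built on the tr idea. This is a sketch; printf is used instead of echo because some shells' echo interprets backslashes, which is an assumption about why the sample output in the question looks the way it does.

```shell
#!/bin/bash
# invert: swap every "/" with "\" and vice versa in the arguments.
# printf '%s\n' prints the text verbatim, with no backslash interpretation.
invert() {
    printf '%s\n' "$*" | tr '/\\' '\\/'
}

a='/o\'
b='\//\\/'
invert "$a"
invert "$b"
```

Here a and b hold the strings as they are displayed in the question, and the arguments are quoted so the shell passes the backslashes through unchanged.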
I think this can be done with (g)awk:
$ echo a/\\b\\/c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a\/b/\c
$ echo a\\/b/\\c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a/\b\/c
$
-F "/" defines the field separator: the input is split on "/", so the fields themselves no longer contain a "/" character. OFS="\\" makes awk join the fields with a backslash when the record is rebuilt.
for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); replaces every backslash (\) with a slash (/) in each field of the input.
If you want to swap every / with \ and vice versa, you can use the y command of sed, which is quite similar to what tr does:
$ a='/o\'
$ echo "$a"
/o\
$ echo "$a" | sed 'y|/\\|\\/|'
\o/
$ b='\//\\/'
$ echo "$b"
\//\\/
$ echo "$b" | sed 'y|/\\|\\/|'
/\\//\
If you are strictly limited to GNU AWK, you might get the desired result the following way. Let the content of file.txt be
\//\\\\/
then
awk 'BEGIN{FPAT=".";OFS="";arr["/"]="\\";arr["\\"]="/"}{for(i=1;i<=NF;i+=1){if($i in arr){$i=arr[$i]}};print}' file.txt
gives output
/\\////\
Explanation: I tell GNU AWK that a field is any single character using the FPAT built-in variable, and that the output field separator (OFS) is the empty string. I create an array whose key-value pairs are the character to be replaced and its replacement; \ needs to be escaped, hence \\ denotes a literal \. Then for each line I iterate over all fields using a for loop, and if a field holds a character present among the keys of arr, I exchange it for the corresponding value. After the loop I print the line.
(tested in gawk 4.2.1)
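FPAT is specific to gawk. Where only a POSIX awk is available, the same character-by-character swap could be sketched with substr() instead; the function name swap_slashes is hypothetical.

```shell
#!/bin/bash
# swap_slashes: read lines on stdin and swap "/" with "\" in each of them.
# Walks the line one character at a time, so no temporary character is needed.
swap_slashes() {
    awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "/")       c = "\\"
            else if (c == "\\") c = "/"
            out = out c
        }
        print out
    }'
}

# Same input as file.txt above.
printf '%s\n' '\//\\\\/' | swap_slashes
```

For this input the function prints the same line as the gawk version.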

Why does my bash script flag this awk substring command as a syntactic error when it works in the terminal?

I'm trying to extract a list of dates from a series of links using lynx's dump function and piping the output through grep and awk. This operation works successfully in the terminal and outputs dates accurately. However, when it is placed into a shell script, bash claims a syntax error:
Scripts/ETC/PreD.sh: line 18: syntax error near unexpected token `('
Scripts/ETC/PreD.sh: line 18: ` lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10)}' >> dates.txt'
For context, this is part of a while-read loop in which $link is being read from a file. Operations undertaken inside this while-loop when the awk command is removed are all successful, as are similar while-loops that include other awk commands.
I know that either I'm misunderstanding how bash handles variable substitution, or how bash handles awk commands, or some combination of the two. Any help would be immensely appreciated.
EDIT: Shellcheck is divided on this, the website version finds no error, but my downloaded version provides error SC1083, which says:
This { is literal. Check expression (missing ;/\n?) or quote it.
A check on the Shellcheck GitHub page provides this:
This error is harmless when the curly brackets are supposed to be literal, in e.g. awk {'print $1'}.
However, it's cleaner and less error prone to simply include them inside the quotes: awk '{print $1}'.
Script follows:
#!/bin/bash
while read -u 4 link
do
IFS=/ read a b c d e <<< "$link"
echo "$e" >> 1.txt
lynx --dump "$link" | grep -A 1 -e With: | tr -d [:cntrl:][:digit:][] | sed 's/\With//g' | awk '{print substr($0,10)}' | sed 's/\(.*\),/\1'\ and'/' | tr -s ' ' >> 2.txt
lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10)}' >> dates.txt
done 4< links.txt
In the sed command you have an unmatched ', due to an unquoted '.
In the awk script you have a constant zero-length variable.
From gawk manual:
substr(string, start [, length ])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character number one. For example, substr("washington", 5, 3) returns "ing".
If length is not present, substr() returns the whole suffix of string that begins at character number start. For example, substr("washington", 5) returns "ington". The whole suffix is also returned if length is greater than the number of characters remaining in the string, counting from character start.
If start is less than one, substr() treats it as if it was one. (POSIX doesn’t specify what to do in this case: BWK awk acts this way, and therefore gawk does too.) If start is greater than the number of characters in the string, substr() returns the null string. Similarly, if length is present but less than or equal to zero, the null string is returned.
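The manual's two examples can be checked directly from the shell. The last line below uses a made-up input line to show what substr($0,10) keeps; the real lynx output may be laid out differently.

```shell
#!/bin/bash
# The two examples from the gawk manual.
ing=$(awk 'BEGIN { print substr("washington", 5, 3) }')
ington=$(awk 'BEGIN { print substr("washington", 5) }')

# Hypothetical dump line: three spaces, "Date:", one space, then the value.
# substr($0, 10) drops the first nine characters and keeps the rest.
date_part=$(printf '%s\n' '   Date: 2020-01-02' | awk '{ print substr($0, 10) }')

echo "$ing"
echo "$ington"
echo "$date_part"
```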
Also, I suggest you combine grep|awk|sed|tr into a single awk script, and debug the awk script with printouts.
From:
lynx --dump "$link" | grep -A 1 -e With: | tr -d [:cntrl:][:digit:][] | sed 's/\With//g' | awk '{print substr($0,10,length)}' | sed 's/\(.*\),/\1'\ and'/' | tr -s ' ' >> 2.txt
To:
lynx --dump "$link" | awk '/With/{found=1;next} found{found=0; s=substr($0,10); gsub(/ +/," ",s); if (match(s,/.*,/)) s=substr(s,1,RLENGTH-1) " and" substr(s,RLENGTH+1); print s}' >> 2.txt
From:
lynx --dump "$link" | grep -m 1 Date | awk '{print substr($0,10,length)}' >> dates.txt
To:
lynx --dump "$link" | awk '/Date/{print substr($0,10)}' >> dates.txt

Regular expression using sed in UNIX

I want to replace a variable using sed. To replace it I need to know what is present in the file, so I want to extract that string using a regular expression.
$ cat file1.txt
select * from ${database_name}.tab_name;
I want to take ${type_database_name_env} into a string and use the sed replace command to replace that variable with the actual name:
sed -n 's/[${][a-z][_][a-z][_][a-z][_][a-z][}]/,/./p' file1.txt
I need output as
$ var1=`sed command` # I am looking for proper sed command
$ echo $var1
${database_name}
With grep, you may use
var1="$(grep -o '\${[^{}]*}' file1.txt | head -1)"
The | head -1 is used to extract the first match in case there is more than one.
See the online demo:
f='select * from ${database_name}.tab_name;'
var1="$(grep -o '\${[^{}]*}' <<< "$f" | head -1)"
echo "$var1"
With sed, you may use
var1="$(sed -En 's/.*(\$\{[^{}]*}).*/\1/p' file)"
See the online demo:
f='select * from ${database_name}.tab_name;'
var1="$(sed -En 's/.*(\$\{[^{}]*}).*/\1/p' <<< "$f")"
echo "$var1"
# => ${database_name}
Regex details
.* - matches 0+ chars
(\$\{[^{}]*}) - captures into Group 1 (\1) a $ char followed with {, 0+ chars other than { and } and then a }
.* - matches 0+ chars.
As the replacement is the reference to the Group 1 text, it is all there remains after sed does its job. Note the -E option: it enables the POSIX ERE syntax where (...) are used to specify a capturing group, not \(...\).
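Once the placeholder is captured, the replacement step from the question might look like the sketch below. The value sales_db is invented for illustration, and bash parameter expansion is used here in place of a second sed call so that the $, { and } in the placeholder need no escaping.

```shell
#!/bin/bash
f='select * from ${database_name}.tab_name;'

# Extract the first ${...} placeholder with the sed expression explained above.
var1=$(sed -En 's/.*(\$\{[^{}]*\}).*/\1/p' <<< "$f")

# Hypothetical replacement value.
actual_name='sales_db'

# Quoting "$var1" inside the expansion makes bash treat it as literal text,
# not as a glob pattern.
result=${f//"$var1"/$actual_name}

echo "$var1"
echo "$result"
```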
You could just use awk:
$ awk -F'[ .]+' '{print $4}' file
${database_name}

bash script command output execution doesn't assign full output when using backticks

I have used backticks (``) many times to capture the output of a command into a variable, but with the following code I am not getting the right output.
#!/bin/bash
export XLINE='($ZWP_SCRIP_NAME),$ZWP_LT_RSI_TRIGGER)R),$ZWP_RTIMER'
echo 'Original XLINE'
echo $XLINE
echo '------------------'
echo 'Extract all word with $ZWP'
#works fine
echo $XLINE | sed -e 's/\$/\n/g' | sed -e 's/.*\(ZWP[_A-Z]*\).*/\1/g' | grep ZWP
echo '------------------'
echo 'Assign all word with $ZWP to XVAR'
#XVAR doesn't get all the values
export XVAR=`echo $XLINE | sed -e 's/\$/\n/g' | sed -e 's/.*\(ZWP[_A-Z]*\).*/\1/g' | grep ZWP` #fails
echo "$XVAR"
and i get:
Original XLINE
($ZWP_SCRIP_NAME),$ZWP_LT_RSI_TRIGGER)R),$ZWP_RTIMER
------------------
Extract all word with $ZWP
ZWP_SCRIP_NAME
ZWP_LT_RSI_TRIGGER
ZWP_RTIMER
------------------
Assign all word with $ZWP to XVAR
ZWP_RTIMER
Why doesn't XVAR get all the values?
However, if I use $() to capture the output instead of ``, it works fine. Why is `` not working?
Having GNU grep you can use this command:
XVAR=$(grep -oP '\$\KZWP[A-Z_]+' <<< "$XLINE")
If you pass -P grep is using Perl compatible regular expressions. The key here is the \K escape sequence. Basically the regex matches $ZWP followed by one or more uppercase characters or underscores. The \K after the $ removes the $ itself from the match, while its presence is still required to match the whole pattern. Call it poor man's lookbehind if you want, I like it! :)
Btw, grep -o prints each match on its own line instead of printing the whole lines that match the pattern.
If you don't have GNU grep or you care about portability you can use awk, like this:
XVAR=$(awk -F'$' '{sub(/[^A-Z_].*/, "", $2); print $2}' RS=',' <<< "$XLINE")
First, the smallest change that makes your code "work":
echo "$XLINE" | tr '$' '\n' | sed -e 's/.*\(ZWP[_A-Z]*\).*/\1/g' | grep ZWP_
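The use of tr replaces a sed expression that didn't actually do what you thought it did -- try looking at its output to see. Inside backticks the shell turns \$ into $ before sed runs, so sed receives s/$/\n/g, which matches the end of the line rather than a literal dollar sign; $() leaves the backslash alone.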
One sane alternative would be to rely on GNU grep's -o option. If you can't do that...
zwpvars=( ) # create a shell array
zwp_assignment_re='[$](ZWP_[[:alnum:]_]+)(.*)' # ...and a regex
content="$XLINE"
while [[ $content =~ $zwp_assignment_re ]]; do
zwpvars+=( "${BASH_REMATCH[1]}" ) # found a reference
content=${BASH_REMATCH[2]} # stuff the remaining content aside
done
printf 'Found variable: %s\n' "${zwpvars[@]}"
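The difference between the two substitution forms can be seen in isolation. The sketch below uses a short made-up string, a$b, and then collects the names from XLINE with $() and tr; the variable names with_backticks and with_dollar_paren are illustrative.

```shell
#!/bin/bash
# Inside backticks, a backslash before $ is removed before the command runs,
# so sed receives s/$/X/ and appends X at the end of the line.
with_backticks=`printf '%s\n' 'a$b' | sed -e 's/\$/X/'`

# Inside $(...), the backslash survives and sed replaces the literal dollar sign.
with_dollar_paren=$(printf '%s\n' 'a$b' | sed -e 's/\$/X/')

echo "$with_backticks"
echo "$with_dollar_paren"

# Collecting every ZWP name with $() and tr, as suggested above.
XLINE='($ZWP_SCRIP_NAME),$ZWP_LT_RSI_TRIGGER)R),$ZWP_RTIMER'
XVAR=$(printf '%s\n' "$XLINE" | tr '$' '\n' | sed -n 's/.*\(ZWP[_A-Z]*\).*/\1/p')
echo "$XVAR"
```

This is consistent with the output in the question: with backticks only the end of the line was matched, so the second sed saw a single line and kept just the last name.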

parse output in bash

My file looks like
//
[297]((((21:0.125204,20:0.125204):0.00994299,(28:0.0790047,(7:0.0146105,5:0.0146105):0.0643943):0.0561423):0.0578754,(((23:0.0386924,((((26:0.0160606,22:0.0160606):0.00378,(19:0.0160596,16:0.0160596):0.00378096):0.00242531,12:0.0222659):0.0146336,((29:0.0160393,(17:0.00712055,14:0.00712055):0.00891871):0.0195068,11:0.0355461):0.00135346):0.00179282):0.0468499,4:0.0855423):0.0451632,((25:0.059669,(30:0.0155625,13:0.0155625):0.0441064):0.0223692,(3:0.0288957,1:0.0288957):0.0531425):0.0486673):0.062317):0.60861,((((((62:0.00660739,58:0.00660739):0.011345,(70:0.00496959,54:0.00496959):0.0129828):0.0065665,((68:0.00291155,53:0.00291155):0.0178013,(66:0.0163583,((65:0.0045002,(69:0.00305355,59:0.00305355):0.00144664):0.000757378,(61:0.00311373,52:0.00311373):0.00214385):0.0111007):0.00435459):0.003806):0.123648,(76:0.0395418,(40:0.00641035,34:0.00641035):0.0331314):0.108625):0.0327298,((((46:0.00103749,42:0.00103749):0.0373456,(48:0.0259862,41:0.0259862):0.0123969):0.00173179,(47:0.0275497,39:0.0275497):0.0125652):0.106275,((((44:0.00708562,36:0.00708562):0.0773928,(37:0.025,27:0.025):0.0594785):0.00501024,18:0.0894887):0.0248315,(15:0.0649576,6:0.0649576):0.0493626):0.0320701):0.0345064):0.0680223,((((80:0.0173948,73:0.0173948):0.0162433,(67:0.0129751,((63:0.00435012,57:0.00435012):0.00727273,(60:0.00848091,(64:0.00386096,((56:0.00203231,55:0.00203231):0.00103,51:0.0030623):0.000798654):0.00461996):0.00314194):0.00135223):0.0206631):0.0296773,(33:0.0415374,((75:0.0372575,(45:0.0371022,38:0.0371022):0.000155282):0.0029007,((43:0.0101608,32:0.0101608):0.0242563,31:0.0344171):0.00574108):0.00137926):0.021778):0.147776,((((74:0.0336172,((79:0.0258073,(77:0.0203659,(78:0.00390563,72:0.00390563):0.0164602):0.00544144):0.00767555,49:0.0334829):0.000134364):0.0132633,(35:0.0137148,24:0.0137148):0.0331656):0.0721567,(10:0.0147938,8:0.0147938):0.104243):0.0343567,((71:0.0427659,50:0.0427659):0.0221428,(9:0.0467372,2:0.0467372):0.0181715):0.0884852):0.0576977):0.0378275):0.552713);
[2271]((((21:0.125204,20:0.125204):0.00994299,(28:0.0790047,(7:0.0146105,5:0.0146105):0.0643943):0.0561423):0.0578754,(((23:0.0386924,((((26:0.0160606,22:0.0160606):0.00378,(19:0.0160596,16:0.0160596):0.00378096):0.00242531,12:0.0222659):0.0146336,((29:0.0160393,(17:0.00712055,14:0.00712055):0.00891871):0.0195068,11:0.0355461):0.00135346):0.00179282):0.0468499,4:0.0855423):0.0451632,((25:0.059669,(30:0.0155625,13:0.0155625):0.0441064):0.0223692,(3:0.0288957,1:0.0288957):0.0531425):0.0486673):0.062317):0.60861,((((47:0.0363305,(((62:0.00660739,58:0.00660739):0.011345,(70:0.00496959,54:0.00496959):0.0129828):0.0065665,((68:0.00291155,53:0.00291155):0.0178013,(66:0.0163583,((65:0.0045002,(69:0.00305355,59:0.00305355):0.00144664):0.000757378,(61:0.00311373,52:0.00311373):0.00214385):0.0111007):0.00435459):0.003806):0.0118116):0.111837,(76:0.0395418,(40:0.00641035,34:0.00641035):0.0331314):0.108625):0.0327298,((((46:0.00103749,42:0.00103749):0.0373456,(48:0.0259862,41:0.0259862):0.0123969):0.00173179,39:0.0401149):0.106275,((((44:0.00708562,36:0.00708562):0.0773928,(37:0.025,27:0.025):0.0594785):0.00501024,18:0.0894887):0.0248315,(15:0.0649576,6:0.0649576):0.0493626):0.0320701):0.0345064):0.0680223,((((80:0.0173948,73:0.0173948):0.0162433,(67:0.0129751,((63:0.00435012,57:0.00435012):0.00727273,(60:0.00848091,(64:0.00386096,((56:0.00203231,55:0.00203231):0.00103,51:0.0030623):0.000798654):0.00461996):0.00314194):0.00135223):0.0206631):0.0296773,(33:0.0415374,((75:0.0372575,(45:0.0371022,38:0.0371022):0.000155282):0.0029007,((43:0.0101608,32:0.0101608):0.0242563,31:0.0344171):0.00574108):0.00137926):0.021778):0.147776,((((74:0.0336172,((79:0.0258073,(77:0.0203659,(78:0.00390563,72:0.00390563):0.0164602):0.00544144):0.00767555,49:0.0334829):0.000134364):0.0132633,(35:0.0137148,24:0.0137148):0.0331656):0.0721567,(10:0.0147938,8:0.0147938):0.104243):0.0343567,((71:0.0427659,50:0.0427659):0.0221428,(9:0.0467372,2:0.0467372):0.0181715):0.0884852):0.0576977):0.0378275):0.552713)
;
[687]((((21:0.125204,20:0.125204):0.00994299,(28:0.0790047,(7:0.0146105,5:0.0146105):0.0643943):0.0561423):0.0578754,((4:0.128716,(23:0.0386924,((((26:0.0160606,22:0.0160606):0.00378,(19:0.0160596,16:0.0160596):0.00378096):0.00242531,12:0.0222659):0.0146336,((29:0.0160393,(17:0.00712055,14:0.00712055):0.00891871):0.0195068,11:0.0355461):0.00135346):0.00179282):0.0900232):0.0019898,((25:0.059669,(30:0.0155625,13:0.0155625):0.0441064):0.0223692,(3:0.0288957,1:0.0288957):0.0531425):0.0486673):0.062317):0.60861,((((47:0.0363305,(((62:0.00660739,58:0.00660739):0.011345,(70:0.00496959,54:0.00496959):0.0129828):0.0065665,((68:0.00291155,53:0.00291155):0.0178013,(66:0.0163583,((65:0.0045002,(69:0.00305355,59:0.00305355):0.00144664):0.000757378,(61:0.00311373,52:0.00311373):0.00214385):0.0111007):0.00435459):0.003806):0.0118116):0.111837,(76:0.0395418,(40:0.00641035,34:0.00641035):0.0331314):0.108625):0.0327298,((((46:0.00103749,42:0.00103749):0.0373456,(48:0.0259862,41:0.0259862):0.0123969):0.00173179,39:0.0401149):0.106275,((((44:0.00708562,36:0.00708562):0.0773928,(37:0.025,27:0.025):0.0594785):0.00501024,18:0.0894887):0.0248315,(15:0.0649576,6:0.0649576):0.0493626):0.0320701):0.0345064):0.0680223,((((80:0.0173948,73:0.0173948):0.0162433,(67:0.0129751,((63:0.00435012,57:0.00435012):0.00727273,(60:0.00848091,(64:0.00386096,((56:0.00203231,55:0.00203231):0.00103,51:0.0030623):0.000798654):0.00461996):0.00314194):0.00135223):0.0206631):0.0296773,(33:0.0415374,((75:0.0372575,(45:0.0371022,38:0.0371022):0.000155282):0.0029007,((43:0.0101608,32:0.0101608):0.0242563,31:0.0344171):0.00574108):0.00137926):0.021778):0.147776,((((74:0.0336172,((79:0.0258073,(77:0.0203659,(78:0.00390563,72:0.00390563):0.0164602):0.00544144):0.00767555,49:0.0334829):0.000134364):0.0132633,(35:0.0137148,24:0.0137148):0.0331656):0.0721567,(10:0.0147938,8:0.0147938):0.104243):0.0343567,((71:0.0427659,50:0.0427659):0.0221428,(9:0.0467372,2:0.0467372):0.0181715):0.0884852):0.0576977):0.0378275):0.552713);
[186]((((21:0.125204,20:0.125204):0.00994299,(28:0.0790047,(7:0.0146105,5:0.0146105):0.0643943):0.0561423):0.0578754,((4:0.128716,(23:0.0386924,((((26:0.0160606,22:0.0160606):0.00378,(19:0.0160596,16:0.0160596):0.00378096):0.00242531,12:0.0222659):0.0146336,((29:0.0160393,(17:0.00712055,14:0.00712055):0.00891871):0.0195068,11:0.0355461):0.00135346):0.00179282):0.0900232):0.0019898,((25:0.059669,(30:0.0155625,13:0.0155625):0.0441064):0.0223692,(3:0.0288957,1:0.0288957):0.0531425):0.0486673):0.062317):0.60861,((((47:0.0363305,(((62:0.00660739,58:0.00660739):0.011345,(70:0.00496959,54:0.00496959):0.0129828):0.0065665,((68:0.00291155,53:0.00291155):0.0178013,(66:0.0163583,((65:0.0045002,(69:0.00305355,59:0.00305355):0.00144664):0.000757378,(61:0.00311373,52:0.00311373):0.00214385):0.0111007):0.00435459):0.003806):0.0118116):0.111837,(76:0.0395418,(40:0.00641035,34:0.00641035):0.0331314):0.108625):0.0327298,((((44:0.00708562,36:0.00708562):0.0773928,(37:0.025,27:0.025):0.0594785):0.00501024,18:0.0894887):0.0248315,(15:0.0649576,6:0.0649576):0.0493626):0.0665766):0.0680223,((((80:0.0173948,73:0.0173948):0.0162433,(67:0.0129751,((63:0.00435012,57:0.00435012):0.00727273,(60:0.00848091,(64:0.00386096,((56:0.00203231,55:0.00203231):0.00103,51:0.0030623):0.000798654):0.00461996):0.00314194):0.00135223):0.0206631):0.0296773,(33:0.0415374,((75:0.0372575,(45:0.0371022,38:0.0371022):0.000155282):0.0029007,((43:0.0101608,32:0.0101608):0.0242563,31:0.0344171):0.00574108):0.00137926):0.021778):0.147776,((((74:0.0336172,((79:0.0258073,(77:0.0203659,(78:0.00390563,72:0.00390563):0.0164602):0.00544144):0.00767555,49:0.0334829):0.000134364):0.0132633,(35:0.0137148,24:0.0137148):0.0331656):0.0721567,(10:0.0147938,8:0.0147938):0.104243):0.0343567,((((46:0.00103749,42:0.00103749):0.0373456,(48:0.0259862,41:0.0259862):0.0123969):0.00173179,39:0.0401149):0.0339623,((71:0.0427659,50:0.0427659):0.0221428,(9:0.0467372,2:0.0467372):0.0181715):0.00916857):0.0793167):0.0576977):0.0378275):0.552713)
;
So after the first line, every line starts with a number in brackets. I need to grep the number in brackets and output it into a new file (without the [ and ]). How would that be done?
grep -Po '(?<=\[)\d+(?=\])' file > new_file
-P for Perl regexs so it is possible to use:
\d for a digit
positive lookbehind and positive lookahead ((?<=\[) and (?=\]))
-o for only matching
Another possibility if your grep doesn't support the -P option but awk is available could be this:
awk -F '[][]' '{ if ($2 != "") print $2 }' file > new_file
-F tells awk to accept both ] and [ as a field delimiter, $2 then contains the number you want and is printed.
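A quick way to try the awk variant is to run it on a shortened stand-in for the tree file. The sample lines below are invented and much shorter than the real data; only the leading [number] matters here.

```shell
#!/bin/bash
# Hypothetical, shortened version of the input file.
sample=$(mktemp)
printf '%s\n' '//' '[297]((1:0.1,2:0.1):0.2);' '[2271]((1:0.1,2:0.1):0.2)' ';' > "$sample"

# Split on "[" or "]"; the number is then the second field.
# Lines without brackets have an empty $2 and are skipped.
numbers=$(awk -F '[][]' '$2 != "" { print $2 }' "$sample")

echo "$numbers"
rm -f "$sample"
```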
In three steps using simple commands:
grep -v "//" inputfile | cut -d"[" -f2 | cut -d"]" -f1
With sed you can remove everything outside the []:
grep -v "//" inputfile | sed 's/.*\[\(.*\)].*/\1/'
