How to get the line number of a string in another string in Shell

Given
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
I'd like to get the line number of the first occurrence of $str in $sourceStr, which should be 3.
I don't know how to do it.
I have tried:
awk 'match($0, v) { print NR; exit }' v=$str <<<$sourceStr
grep -n $str <<< $sourceStr | grep -Eo '^[^:]+';
grep -n $str <<< $sourceStr | cut -f1 -d: | sort -ug
grep -n $str <<< $sourceStr | awk -F: '{ print $1 }' | sort -u
All output 1, not 3.
How can I get the line number of $str in $sourceStr?
Thanks!

You may use this awk + printf in bash:
awk -v s="$str" '$0 == s {print NR; exit}' <(printf "%b\n" "$sourceStr")
3
Or even this awk without any bash support:
awk -v s="$str" -v source="$sourceStr" 'BEGIN {
split(source, a); for (i=1; i in a; ++i) if (a[i] == s) {print i; exit}}'
3
You may use this sed as well:
sed -n "/^$str$/{=;q;}" <(printf "%b\n" "$sourceStr")
3
Or this grep + cut:
printf "%b\n" "$sourceStr" | grep -nxF -m 1 "$str" | cut -d: -f1
3

It's not clear if you've just made a cut-n-paste error, but your sourceStr is not a multiline string (as demonstrated below). Also, you really need to quote your herestring (also demonstrated below). Perhaps you just want:
$ sourceStr="abc\nefg\nhij\nlmn\nhij"
$ echo "$sourceStr"
abc\nefg\nhij\nlmn\nhij
$ sourceStr=$'abc\nefg\nhij\nlmn\nhij'
$ echo "$sourceStr"
abc
efg
hij
lmn
hij
$ cat <<< $sourceStr
abc efg hij lmn hij
$ cat <<< "$sourceStr"
abc
efg
hij
lmn
hij
$ str=hij
$ awk "/${str}/ {print NR; exit}" <<< "$sourceStr"
3

Just use sed!
printf 'abc\nefg\nhij\nlmn\nhij\n' \
| sed -n '/hij/ { =; q; }'
Explanation: if sed meets a line that contains "hij" (regex /hij/), it prints the line number (the = command) and exits (the q command). Else it doesn't print anything (the -n switch) and goes on with the next line.
[update] Hmmm, sorry, I just noticed your "All output 1, not 3".
The primary reason why your commands don't output 3 is that sourceStr="abc\nefg\nhij\nlmn\nhij" doesn't automagically change your \n into new lines, so it ends up being one single line and that's why your commands always display 1.
If you want a multiline string, here are two solutions with bash:
printf -v sourceStr "abc\nefg\nhij\nlmn\nhij"
sourceStr=$'abc\nefg\nhij\nlmn\nhij'
And now that your variable contains whitespace characters (newlines), as stated by William Pursell, you must enclose $sourceStr in double quotes in order to preserve them:
grep -n "$str" <<< "$sourceStr" | ...
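The difference between the literal and the multiline string can be checked directly. The following is a minimal sketch, not taken from any of the answers: first_line_of is a hypothetical helper name, and it assumes the search string contains no backslashes (awk -v would interpret them as escapes).

```shell
# Hypothetical helper: print the number of the first line of text $2 that equals $1.
first_line_of() {
    printf '%s\n' "$2" | awk -v s="$1" '$0 == s { print NR; exit }'
}

literal='abc\nefg\nhij\nlmn\nhij'               # backslash and n stay two separate characters
multiline=$(printf 'abc\nefg\nhij\nlmn\nhij')   # printf expands \n in its format string

first_line_of hij "$literal"     # prints nothing: the text is one single line
first_line_of hij "$multiline"   # prints 3
```

Because the text is passed quoted to printf '%s\n', the newlines survive, and the sketch should also work in shells that lack herestrings.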

There's always a hard way to do it:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | nl | grep $str | head -1 | gawk '{ print $1 }'
or, a bit more efficient:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | gawk '/'$str/'{ print NR; exit }'

Related

How to grab fields in inverted commas

I have a text file which contains the following lines:
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab the two fields jeffrey and 90 days from inside the inverted commas and save them in variables?
If awk is an option, you could save an array and then save the elements as individual variables.
$ IFS="\"" read -ra var <<< $(awk -F, '/jeffrey/{ print $1, $NF }' input_file)
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
while read -r line; do # read in line by line
name=$(echo "$line" | awk -F, ' { print $1} ' | sed 's/"//g') # grab first col and strip "
expire=$(echo "$line" | awk -F, ' { print $3} '| sed 's/"//g') # grab third col and strip "
echo "$name" "$expire" # do your business
done < yourfile.txt
IFS=","
arr=( $(cat txt | head -2 | tail -1 | cut -d, -f 1,3 | tr -d '"') )
echo "${arr[0]}"
echo "${arr[1]}"
The result is stored in an array; you can access the elements by index.
Maybe the method below, using the sed and awk commands, will help you:
#!/bin/sh
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}' | tr -d '"') # strip the surrounding quotes
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}' | tr -d '"') # strip the surrounding quotes
echo "$expires_in"
Output :
jeffrey
90 days
Note:
This method works only if the username is unique in the file.
As far as I know, usernames are not duplicated.
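Splitting on the double quote itself avoids the separate quote-stripping step used in the answers above. This is only a sketch under stated assumptions: the sample data is held in a variable named data rather than a file, no field contains an embedded double quote, and no field contains the | character that is used here as a temporary separator.

```shell
# Sample rows from the question, held in a variable for the sake of the example.
data='"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"'

# With '"' as the field separator, field 2 is the user and field 6 is expires_in.
result=$(printf '%s\n' "$data" | awk -F'"' '$2 == "jeffrey" { print $2 "|" $6; exit }')

name=${result%%|*}      # everything before the first |
expires=${result#*|}    # everything after the first |
printf '%s\n%s\n' "$name" "$expires"
```

Comparing $2 exactly, instead of matching /jeffrey/ anywhere in the line, also avoids picking up a row that merely contains the name in another field.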

Shell awk - Print a position from variable

Here is my string that needs to be parsed.
line='aaa vvv ccc'
I need to print the values one by one.
no_of_users=$(echo $line| wc -w)
If the no_of_users is greater than 1 then I need to print the values one by one.
aaa
vvv
ccc
I used this script.
if [ $no_of_users -gt 1 ]
then
for ((n=1;n<=$no_of_users;n++))
do
-- here is my issue ##echo 'user:'$n $line|awk -F ' ' -vno="${n}" 'BEGIN { print no }'
done
fi
In the { print no } I have to print the value in that position.
You may use this awk:
awk 'NF>1 {OFS="\n"; $1=$1} 1' <<< "$line"
aaa
vvv
ccc
What it does:
NF>1: If number of fields are greater than 1
OFS="\n": Set output field separator to \n
$1=$1: Force restructure of a record
1: Print a record
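The role of $1=$1 is easiest to see by leaving it out. In this small sketch (the variable names unchanged and rebuilt are only for illustration), setting OFS alone has no visible effect, because awk only rejoins the fields with OFS once a field has been assigned.

```shell
line='aaa vvv ccc'

# OFS is set, but $0 is never rebuilt, so the record prints as it was read.
unchanged=$(printf '%s\n' "$line" | awk '{ OFS = "\n"; print }')

# Assigning to a field forces awk to rebuild $0 with the new OFS.
rebuilt=$(printf '%s\n' "$line" | awk '{ OFS = "\n"; $1 = $1; print }')

printf '%s\n' "$unchanged"
printf '%s\n' "$rebuilt"
```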
1st solution: Try the following, which does everything within a single awk. Here var is an awk variable that holds the value of the shell variable line.
awk -v var="$line" '
BEGIN{
num=split(var,arr," ")
if(num>1){
for(i=1;i<=num;i++){ print arr[i] }
}
}'
Explanation: Adding detailed explanation for above.
awk -v var="$line" ' ##Starting awk program and creating var variable which has line shell variable value in it.
BEGIN{ ##Starting BEGIN section of program from here.
num=split(var,arr," ") ##Splitting var into array arr here. Saving its total length into variable num to check it later.
if(num>1){ ##Checking condition if num is greater than 1 then do following.
for(i=1;i<=num;i++){ print arr[i] } ##Running for loop from i=1 to till value of num here and printing arr value with index i here.
}
}'
2nd solution: Adding one more solution tested and written in GNU awk.
echo "$line" | awk -v RS= -v OFS="\n" 'NF>1{$1=$1;print}'
Another option:
if [ $no_of_users -gt 1 ]
then
for ((n=1;n<=$no_of_users;n++))
do
echo 'user:'$n $(echo $line|awk -F ' ' -v x=$n '{printf $x }')
done
fi
You can use grep
echo $line | grep -o '[a-z][a-z]*'
Also with awk:
awk '{print $1, $2, $3}' OFS='\n' <<< "$line"
aaa
vvv
ccc
the key is setting OFS='\n'
Or a really toughie:
printf "%s\n" $line
(note: $line is unquoted)
printf will consume all words in line with word-splitting applied so each word is taken as a single input.
Example Use/Output
$ line='aaa vvv ccc'; printf "%s\n" $line
aaa
vvv
ccc
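The effect of the missing quotes can be compared side by side. A short sketch, assuming the default IFS and a value without glob characters such as * (an unquoted expansion would also expand those):

```shell
line='aaa vvv ccc'

# Unquoted: the shell splits $line into three arguments and printf
# reuses the format once per argument.
unquoted=$(printf '%s\n' $line)

# Quoted: printf receives one argument and prints one line.
quoted=$(printf '%s\n' "$line")

printf '%s\n' "$unquoted"
printf '%s\n' "$quoted"
```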
Using bash:
$ line='aaa vvv ccc'
$ [[ $line =~ \ ]] && echo -e ${line// /\\n}
aaa
vvv
ccc
$ line=aaa
$ [[ $line =~ \ ]] && echo -e ${line// /\\n}
$
If you are on another shell:
$ line="foo bar baz" bash -c '[[ $line =~ \ ]] && echo -e ${line// /\\n}'
grep -Eq '[[:space:]]' <<< "$line" && xargs printf "%s\n" <<< $line
Do a silent grep for a space in the variable, if true, print with names on separate lines.
awk -v OFS='\n' 'NF>1{$1=$1; print}'
e.g.
$ line='aaa vvv ccc'
$ echo "$line" | awk -v OFS='\n' 'NF>1{$1=$1; print}'
aaa
vvv
ccc
$ line='aaa'
$ echo "$line" | awk -v OFS='\n' 'NF>1{$1=$1; print}'
$
another golfed awk variation
$ awk 'gsub(FS,RS)'
only print if there is a substitution.
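This works because gsub returns the number of substitutions it made, and awk uses that number as the pattern: zero is false, so a line without a separator is not printed. A quick sketch with the default FS and RS (the names multi and single are only for illustration):

```shell
# Two substitutions are made, so the pattern is true and the record is printed.
multi=$(printf '%s\n' 'aaa vvv ccc' | awk 'gsub(FS, RS)')

# No substitution is made, gsub returns 0, and nothing is printed.
single=$(printf '%s\n' 'aaa' | awk 'gsub(FS, RS)')

printf '%s\n' "$multi"
```

Note that every single space is replaced, so consecutive spaces in the input would produce empty lines.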

Merging awk and cut into one command

My line is:
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"
awk -F ";;;" '{print $1}' <<<$var1 | cut -d= -f2
AAAAA
awk -F ";;;" '{print $2}' <<<$var1 | cut -d= -f2
BBBB
How can I get to the same result using only AWK?
Awk lets you split a field on another delimiter.
awk -F ";;;" '{split($1, a, /=/); print a[2] }'
However, perhaps a more fruitful approach would be to transform this horribly hostile input format to something a little bit more normal, and take it from there with standard tools.
sed 's/;;;/\
/g' inputfile | ...
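One way the rest of that pipeline might look, as a sketch only: value_of is a hypothetical helper, the record is passed as a string instead of being read from inputfile, and values are assumed not to contain = themselves.

```shell
var1="_source.statistics.test1=AAAAA;;;_source.statistics.test2=BBBB;;;_source.statistics.test3=CCCCC"

# Hypothetical helper: print the value stored under tag $1 in record $2.
value_of() {
    # Put every tag=value pair on its own line, then select the pair by its tag.
    printf '%s\n' "$2" |
        sed 's/;;;/\
/g' |
        awk -F= -v tag="$1" '$1 == tag { print $2; exit }'
}

value_of _source.statistics.test1 "$var1"
value_of _source.statistics.test2 "$var1"
```

Once the pairs are one per line, ordinary line-oriented tools such as grep and cut apply as well.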
Try the following, within a single awk, by setting the field separator (-F) to either = or ; for each line passed to awk.
echo "$var1" | awk -F'=|;' '{print $2}'
AAAAA
echo "$var1" | awk -F'=|;' '{print $6}'
BBBB
OR
echo "$var1" | awk -F"=|;;;" '{print $2}'
AAAAA
echo "$var1" | awk -F"=|;;;" '{print $4}'
BBBB
If you need these outputs in variables, you could use sed to place the values in an array and use them later. IMHO this is what arrays are for: they save us from creating N separate variables.
Creation of an array with sed:
array=( $(echo "$var1" | sed 's/\([^=]*\)=\([^;]*\)\([^=]*\)=\([^;]*\)\(.*\)/\2 \4/' ) )
Creating of an array with awk:
array=( $(echo "$var1" | awk -F"=|;;;" '{print $2,$4}') )
The above creates an array with the values AAAAA and BBBB. To fetch them, you could use:
for i in {0..1}; do echo "$i : ${array[$i]}"; done
0 : AAAAA
1 : BBBB
I have used a for loop to illustrate it; one could directly use array[0] for AAAAA or array[1] for BBBB.
Whenever you have name/tag=val input data, it's useful to create an array of tag-value pairs so you can print or do whatever else you like with the data by its tags, e.g.:
$ awk -F';;;|=' '{for (i=1; i<NF; i+=2) f[$i]=$(i+1); print f["_source.statistics.test1"]}' <<<"$var1"
AAAAA
$ awk -F';;;|=' '{for (i=1; i<NF; i+=2) f[$i]=$(i+1); print f["_source.statistics.test3"], f["_source.statistics.test2"]}' <<<"$var1"
CCCCC BBBB

Extract data between delimiters from a Shell Script variable

I have this shell script variable, var. It keeps 3 entries separated by new line. From this variable var, I want to extract 2, and 0.078688. Just these two numbers.
var="USER_ID=2
# 0.078688
Suhas"
These are the code I tried:
echo "$var" | grep -o -P '(?<=\=).*(?=\n)' # For extracting 2
echo "$var" | awk -v FS="(# |\n)" '{print $2}' # For extracting 0.078688
None of the above works. What is the problem here? How can I fix this?
Just use tr alone for retaining the numerical digits, the dot (.) and the white-space and remove everything else.
tr -cd '0-9. ' <<<"$var"
2 0.078688
From the man page of tr, on the usage of the -c and -d flags:
tr [OPTION]... SET1 [SET2]
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
To store it in variables,
IFS=' ' read -r var1 var2 < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "$var1"
2
printf "%s\n" "$var2"
0.078688
Or in an array as
IFS=' ' read -ra numArray < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "${numArray[@]}"
2
0.078688
Note: The -c and -d flags of tr are POSIX compliant and will work on any system that has tr installed.
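In a shell without process substitution, the same tr filter can feed plain parameter expansions instead of read. A sketch under the assumption that the only digits, dots, and spaces in the input belong to the two wanted numbers and the single space after the #; the names cleaned, user_id, and score are introduced here for illustration.

```shell
var='USER_ID=2
# 0.078688
Suhas'

# Keep only digits, dots and spaces; newlines are deleted as well,
# so the result is the single line "2 0.078688".
cleaned=$(printf '%s' "$var" | tr -cd '0-9. ')

user_id=${cleaned%% *}   # text before the first space
score=${cleaned#* }      # text after the first space
printf '%s\n%s\n' "$user_id" "$score"
```

If the name on the third line contained a digit or a space, it would leak into the result, which is the main limitation of filtering by character class.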
echo "$var" |grep -oP 'USER_ID=\K.*'
2
echo "$var" |grep -oP '# \K.*'
0.078688
Your solution is nearly perfect; you need to change \n to $, which represents the end of a line.
echo "$var" |awk -F'# ' '/#/{print $2}'
0.078688
echo "$var" |awk -F'=' '/USER_ID/{print $2}'
2
You can do it with pure bash using a regex:
#!/bin/bash
var="USER_ID=2
# 0.078688
Suhas"
[[ ${var} =~ =([0-9]+).*#[[:space:]]([0-9\.]+) ]] && result1="${BASH_REMATCH[1]}" && result2="${BASH_REMATCH[2]}"
echo "${result1}"
echo "${result2}"
With awk:
First value:
echo "$var" | grep 'USER_ID' | awk -F "=" '{print $2}'
Second value:
echo "$var" | grep '#' | awk '{print $2}'
Assuming this is the format of data as your sample
# For extracting 2
echo "$var" | sed -e '/.*=/!d' -e 's///'
echo "$var" | awk -F '=' 'NR==1{ print $2}'
# For extracting 0.078688
echo "$var" | sed -e '/.*#[[:blank:]]*/!d' -e 's///'
echo "$var" | awk -F '#' 'NR==2{ print $2}'

Unix command to convert multiple line data in a single line along with delimiter

Here is the actual file data:
abc
def
ghi
jkl
mno
And the required output should be in this format:
'abc','def','ghi','jkl','mno'
The command I used to do this gives this output:
abc,def,ghi,jkl,mno
The command is as follows:
sed -n 's/[0-3]//;s/ //;p' Split_22_05_2013 | \
awk -v ORS= '{print $0" ";if(NR%4==0){print "\n"}}'
In response to sudo_O's comment, I am adding an awk-less solution in pure bash. It does not exec any external program at all. Of course, instead of the <<XXX ... XXX here-document, one could use <filename.
c=""
while read w; do
echo -e "$c'$w'\c"
c=,
done<<XXX
abc
def
ghi
jkl
mno
XXX
Output:
'abc','def','ghi','jkl','mno'
An even shorter version:
printf -v out ",'%s'" $(<infile)
echo ${out:1}
Without the horrifying pipe snakes, you can try something like this:
awk 'NR>1{printf ","}{printf "\x27%s\x27",$0}' <<XXX
abc
def
ghi
jkl
mno
XXX
Output:
'abc','def','ghi','jkl','mno'
Or another version which reads the whole input as one record:
awk -vRS="" '{gsub("\n","\x27,\x27");print"\x27"$0"\x27"}'
Or a version which lets awk use its internal variables more:
awk -vRS="" -F"\n" -vOFS="','" -vORS="'" '{$1=$1;print ORS $0}'
The $1=$1; is needed to tell awk to rebuild $0 using the new output field separator (OFS).
$ cat test.txt
abc
def
ghi
jkl
mno
$ cat test.txt | tr '\n' ','
abc,def,ghi,jkl,mno,
$ cat test.txt | awk '{print "\x27" $1 "\x27"}' | tr '\n' ','
'abc','def','ghi','jkl','mno',
$ cat test.txt | awk '{print "\x27" $1 "\x27"}' | tr '\n' ',' | sed 's/,$//'
'abc','def','ghi','jkl','mno'
The last command can be shortened to avoid UUOC:
$ awk '{print "\x27" $1 "\x27"}' test.txt | tr '\n' ',' | sed 's/,$//'
'abc','def','ghi','jkl','mno'
Using sed alone:
sed -n "/./{s/^\|\$/'/g;H}; \${x;s/\n//;s/\n/,/gp};" test.txt
Edit: Fixed, it should also work with or without empty lines now.
$ cat file
abc
def
ghi
jkl
mno
$ cat file | tr '\n' ' ' | awk -v q="'" -v OFS="','" '$1=$1 { print q $0 q }'
'abc','def','ghi','jkl','mno'
Replace '\n' with ' ' -> (tr '\n' ' ')
Replace each separator (' ' space) with (',' quote-comma-quote) ->
(-v OFS="','")
Add quotes to the begin and end of line -> (print q $0 q)
This can be done pretty briefly with sed and paste:
<infile sed "s/^\|\$/'/g" | paste -sd,
Or more portably (I think, cannot test right now):
sed "s/^\|\$/'/g" infile | paste -s -d , -
$ sed "s/[^ ][^ ]*/'&',/g" input.txt | tr -d '\n'
'abc','def','ghi','jkl','mno',
To clean the last ,, throw in a
| sed 's/,$//'
awk 'seen == 1 { printf("'"','"'%s", $1);} seen == 0 {seen = 1; printf("'"'"'%s", $1);} END { printf("'"'"'\n"); }'
In slightly more readable format (suitable for awk -f):
# Print quote-terminator, separator, quote-start, thing
seen == 1 { printf("','%s", $1); }
# Set the "print separator" flag, print quote-start thing
seen == 0 { seen = 1; printf("'%s", $1); }
END { printf("'\n"); } # Print quote-end
perl -l54pe 's/.*/\x27$&\x27/' file
