Given
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
I'd like to get the line number of the first occurrence of $str in $sourceStr, which should be 3.
I don't know how to do it.
I have tried:
awk 'match($0, v) { print NR; exit }' v=$str <<<$sourceStr
grep -n $str <<< $sourceStr | grep -Eo '^[^:]+';
grep -n $str <<< $sourceStr | cut -f1 -d: | sort -ug
grep -n $str <<< $sourceStr | awk -F: '{ print $1 }' | sort -u
All output 1, not 3.
How can I get the line number of $str in $sourceStr?
Thanks!
You may use this awk + printf in bash:
awk -v s="$str" '$0 == s {print NR; exit}' <(printf "%b\n" "$sourceStr")
3
Or even this awk without any bash support:
awk -v s="$str" -v source="$sourceStr" 'BEGIN {
split(source, a); for (i=1; i in a; ++i) if (a[i] == s) {print i; exit}}'
3
You may use this sed as well:
sed -n "/^$str$/{=;q;}" <(printf "%b\n" "$sourceStr")
3
Or this grep + cut:
printf "%b\n" "$sourceStr" | grep -nxF -m 1 "$str" | cut -d: -f1
3
It's not clear whether you've just made a cut-and-paste error, but your sourceStr is not a multiline string (as demonstrated below). You also need to quote your here-string (also demonstrated below). Perhaps you just want:
$ sourceStr="abc\nefg\nhij\nlmn\nhij"
$ echo "$sourceStr"
abc\nefg\nhij\nlmn\nhij
$ sourceStr=$'abc\nefg\nhij\nlmn\nhij'
$ echo "$sourceStr"
abc
efg
hij
lmn
hij
$ cat <<< $sourceStr
abc efg hij lmn hij
$ cat <<< "$sourceStr"
abc
efg
hij
lmn
hij
$ str=hij
$ awk "/${str}/ {print NR; exit}" <<< "$sourceStr"
3
Just use sed!
printf 'abc\nefg\nhij\nlmn\nhij\n' \
| sed -n '/hij/ { =; q; }'
Explanation: when sed reaches a line that contains "hij" (regex /hij/), it prints the line number (the = command) and exits (the q command). Otherwise it prints nothing (the -n switch) and moves on to the next line.
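The same = and q combination can be wrapped in a small function when the pattern lives in a variable. This is only a minimal sketch: the name first_match_line is hypothetical, and it assumes the pattern is a valid basic regular expression that contains no "/" characters.

```shell
# Hypothetical helper: print the number of the first input line that
# matches the basic regular expression in $1, then stop reading.
# Assumes the pattern contains no "/" characters.
first_match_line() {
    sed -n "/$1/{=;q;}"
}

printf 'abc\nefg\nhij\nlmn\nhij\n' | first_match_line hij   # prints 3
```

If no line matches, the function prints nothing, so an empty result can be used to detect "not found".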
[update] Hmmm, sorry, I just noticed your "All output 1, not 3".
The primary reason your commands don't output 3 is that sourceStr="abc\nefg\nhij\nlmn\nhij" doesn't automagically turn \n into newlines, so the variable holds one single line and your commands always print 1.
If you want a multiline string, here are two solutions with bash:
printf -v sourceStr "abc\nefg\nhij\nlmn\nhij"
sourceStr=$'abc\nefg\nhij\nlmn\nhij'
And now that your variable contains whitespace characters (newlines), as William Pursell stated, you must enclose $sourceStr in double quotes to preserve them:
grep -n "$str" <<< "$sourceStr" | ...
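To see the difference between the two assignments, one can count the lines each variable really holds. The following is a rough sketch for bash; count_lines and first_line_of are hypothetical helper names, and first_line_of treats the search text as a fixed string rather than a regex.

```shell
literal="abc\nefg\nhij\nlmn\nhij"     # backslash-n stays as two characters
multiline=$'abc\nefg\nhij\nlmn\nhij'  # ANSI-C quoting yields real newlines

# Hypothetical helper: count the lines in the string passed as $1.
count_lines() {
    printf '%s\n' "$1" | awk 'END { print NR }'
}

# Hypothetical helper: number of the first line of $2 that contains the
# fixed string $1 (prints nothing if there is no match).
first_line_of() {
    printf '%s\n' "$2" | grep -nF -- "$1" | head -n 1 | cut -d: -f1
}

count_lines "$literal"            # prints 1
count_lines "$multiline"          # prints 5
first_line_of hij "$multiline"    # prints 3
```

Run against the literal variable, first_line_of prints 1, which matches the behaviour described in the question.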
There's always a hard way to do it:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e "$sourceStr" | nl | grep "$str" | head -n 1 | gawk '{ print $1 }'
or, a bit more efficient:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e "$sourceStr" | gawk -v s="$str" '$0 ~ s { print NR; exit }'
I have a text file a.txt with the following data
abc/def/ghi
jkl/mno/pqr/stu
I need to cut each line so that I get the first and last strings, using "/" as the delimiter
Expected output is
abc ghi
jkl stu
cat a.txt |cut -d "/" -f1 #gives me first cell
cat a.txt |rev |cut -d "/" -f1 |rev #gives me last cell
I want both cells from a single command. Kindly help.
You could use awk for this,
$ awk -F/ '{print $1,$NF}' file
abc ghi
jkl stu
Through sed,
$ sed 's~^\([^/]*\).*\/\(.*\)$~\1 \2~g' file
abc ghi
jkl stu
Through perl,
$ perl -pe 's;^([^/]*).*\/(.*)$;$1 $2;g' file
abc ghi
jkl stu
Ugly hack through grep and paste,
$ grep -oP '^[^/]*|\w+(?=$)' file | paste -d' ' - -
abc ghi
jkl stu
Another sed ( without capture ),
sed 's#/.*/# #g' yourfile
Test:
$ sed 's#/.*/# #g' yourfile
abc ghi
jkl stu
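For comparison, the same first/last extraction can be sketched with shell parameter expansion instead of awk, sed or perl. This is a different technique from the answers above, shown only as an illustration; first_and_last is a hypothetical name. A line without any "/" would be printed twice, once as the first and once as the last string.

```shell
# Hypothetical helper: for each input line, print the text before the
# first "/" and the text after the last "/", separated by a space.
first_and_last() {
    while IFS= read -r line; do
        # %%/* strips from the first "/" onward; ##*/ strips through the last "/"
        printf '%s %s\n' "${line%%/*}" "${line##*/}"
    done
}

printf 'abc/def/ghi\njkl/mno/pqr/stu\n' | first_and_last
```

For large files the awk version is likely to be faster, since the shell loop reads one line at a time.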
Here is the actual file data:
abc
def
ghi
jkl
mno
And the required output should be in this format:
'abc','def','ghi','jkl','mno'
The command I used gives this output instead:
abc,def,ghi,jkl,mno
The command is as follows:
sed -n 's/[0-3]//;s/ //;p' Split_22_05_2013 | \
awk -v ORS= '{print $0" ";if(NR%4==0){print "\n"}}'
In response to sudo_O's comment, here is an awk-less solution in pure bash. It does not exec any external program at all. Of course, instead of the here-document (<<XXX ... XXX) one could redirect from a file with <filename.
c=""
while read w; do
echo -e "$c'$w'\c"
c=,
done<<XXX
abc
def
ghi
jkl
mno
XXX
Output:
'abc','def','ghi','jkl','mno'
An even shorter version:
printf -v out ",'%s'" $(<infile)
echo ${out:1}
Without the horrifying pipe snakes, you can try something like this:
awk 'NR>1{printf ","}{printf "\x27%s\x27",$0}' <<XXX
abc
def
ghi
jkl
mno
XXX
Output:
'abc','def','ghi','jkl','mno'
Or another version, which reads the whole input as one record:
awk -vRS="" '{gsub("\n","\x27,\x27");print"\x27"$0"\x27"}'
Or a version that lets awk's built-in variables do more of the work:
awk -vRS="" -F"\n" -vOFS="','" -vORS="'" '{$1=$1;print ORS $0}'
The $1=$1; is needed to make awk rebuild $0 with the new output field separator (OFS). ORS supplies the quotes: it is printed explicitly before $0 and appended automatically by print after it.
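The effect of the $1=$1 assignment is easier to see in isolation. A minimal sketch, with hypothetical function names, comparing the same awk program with and without the assignment:

```shell
# Hypothetical helper: join whitespace-separated fields with "-".
# Assigning to any field ($1 = $1 here) makes awk rebuild $0 with OFS.
join_with_dash() {
    awk -v OFS='-' '{ $1 = $1; print }'
}

# Same program without the assignment: $0 is printed unchanged.
print_unchanged() {
    awk -v OFS='-' '{ print }'
}

echo 'a b c' | join_with_dash     # prints a-b-c
echo 'a b c' | print_unchanged    # prints a b c
```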
$ cat test.txt
abc
def
ghi
jkl
mno
$ cat test.txt | tr '\n' ','
abc,def,ghi,jkl,mno,
$ cat test.txt | awk '{print "\x27" $1 "\x27"}' | tr '\n' ','
'abc','def','ghi','jkl','mno',
$ cat test.txt | awk '{print "\x27" $1 "\x27"}' | tr '\n' ',' | sed 's/,$//'
'abc','def','ghi','jkl','mno'
The last command can be shortened to avoid UUOC:
$ awk '{print "\x27" $1 "\x27"}' test.txt | tr '\n' ',' | sed 's/,$//'
'abc','def','ghi','jkl','mno'
Using sed alone:
sed -n "/./{s/^\|\$/'/g;H}; \${x;s/\n//;s/\n/,/gp};" test.txt
Edit: Fixed, it should also work with or without empty lines now.
$ cat file
abc
def
ghi
jkl
mno
$ cat file | tr '\n' ' ' | awk -v q="'" -v OFS="','" '$1=$1 { print q $0 q }'
'abc','def','ghi','jkl','mno'
Replace '\n' with ' ' -> (tr '\n' ' ')
Replace each separator (' ' space) with (',' quote-comma-quote) -> (-v OFS="','")
Add quotes to the beginning and end of the line -> (print q $0 q)
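The three steps can be sketched as one reusable function that reads from standard input instead of a file. The name quote_and_join is hypothetical, and the sketch assumes each input line is a single word without embedded whitespace or single quotes.

```shell
# Hypothetical helper: read one word per line on stdin and print them
# as a single line of single-quoted, comma-separated values.
quote_and_join() {
    tr '\n' ' ' |
        awk -v q="'" -v OFS="','" '{ $1 = $1; print q $0 q }'
}

printf 'abc\ndef\nghi\njkl\nmno\n' | quote_and_join
```

Because awk's default field splitting ignores leading and trailing blanks, the extra space that tr leaves at the end of the line does not produce an empty trailing value.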
This can be done pretty briefly with sed and paste:
<infile sed "s/^\|\$/'/g" | paste -sd,
Or more portably (I think, cannot test right now):
sed "s/^\|\$/'/g" infile | paste -s -d , -
$ sed "s/[^ ][^ ]*/'&',/g" input.txt | tr -d '\n'
'abc','def','ghi','jkl','mno',
To remove the trailing comma, throw in a
| sed 's/,$//'
awk 'seen == 1 { printf("'"','"'%s", $1);} seen == 0 {seen = 1; printf("'"'"'%s", $1);} END { printf("'"'"'\n"); }'
In slightly more readable format (suitable for awk -f):
# Print quote-terminator, separator, quote-start, thing
seen == 1 { printf("','%s", $1); }
# Set the "print separator" flag, print quote-start thing
seen == 0 { seen = 1; printf("'%s", $1); }
END { printf("'\n"); } # Print quote-end
perl -l54pe 's/.*/\x27$&\x27/' file
Given this file
$ cat foo.txt
,,,,dog,,,,,111,,,,222,,,333,,,444,,,
,,,,cat,,,,,555,,,,666,,,777,,,888,,,
,,,,mouse,,,,,999,,,,122,,,133,,,144,,,
I can print the first field like so
$ awk -F, '{print $5}' foo.txt
dog
cat
mouse
However, I would like to ignore those empty fields so that I can call awk like this
$ awk -F, '{print $1}' foo.txt
You can use it like this:
$ awk -F',+' '{print $2}' file
dog
cat
mouse
Similarly, you can use $3, $4, $5, and so on. $1 cannot be used in this case because each record begins with a delimiter, which leaves $1 empty.
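The off-by-one between "value number" and awk field number can be hidden behind a small wrapper. This is just a sketch under the assumption that every line starts with at least one comma, as in foo.txt; nth_value is a hypothetical name.

```shell
# Hypothetical helper: print the Nth non-empty comma-separated value of
# each line, assuming every line starts with at least one comma.
# With FS=",+" the leading commas make $1 empty, so value N is $(N+1).
nth_value() {
    awk -F',+' -v n="$1" '{ print $(n + 1) }'
}

printf ',,,,dog,,,,,111,,,,222,,,\n' | nth_value 1   # prints dog
printf ',,,,dog,,,,,111,,,,222,,,\n' | nth_value 2   # prints 111
```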
$ awk '{print $1}' FPAT='[^,]+' foo.txt
dog
cat
mouse
You can squeeze repeated occurrences of a character into one with tr -s 'char':
$ tr -s ',' < your_file
,dog,111,222,333,444,
,cat,555,666,777,888,
,mouse,999,122,133,144,
And then you can access dog, etc. with:
$ tr -s ',' < your_file | awk -F, '{print $2}'
dog
cat
mouse
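If the goal is to address dog as field 1 rather than field 2, one possible variation is to drop the leading and trailing comma after squeezing. A minimal sketch; squeezed_field is a hypothetical name, and cut stands in for awk here only to keep the pipeline short.

```shell
# Hypothetical helper: squeeze runs of commas, drop the leading and
# trailing comma, then print the comma-separated field numbered $1.
squeezed_field() {
    tr -s ',' | sed 's/^,//; s/,$//' | cut -d, -f"$1"
}

printf ',,,,cat,,,,,555,,,,666,,,\n' | squeezed_field 1   # prints cat
printf ',,,,cat,,,,,555,,,,666,,,\n' | squeezed_field 3   # prints 666
```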
perl -anF,+ -e 'print "$F[1]\n"' foo.txt
dog
cat
mouse
This is not awk, but you get to use 1 instead of 2.
awk -F, '{gsub(/^,*|,*$/,"");gsub(/,+/,",");print $1}' your_file
tested below:
> cat temp
,,,,dog,,,,,111,,,,222,,,333,,,444,,,
,,,,cat,,,,,555,,,,666,,,777,,,888,,,
,,,,mouse,,,,,999,,,,122,,,133,,,144,,,
execution:
> awk -F, '{gsub(/^,*|,*$/,"");gsub(/,+/,",");print $1}' temp
dog
cat
mouse