How to extract the string that appears after one particular string in Shell - bash

I am working on a script where I am grepping lines that contain -abc_1.
I need to extract the string that appears just after this string, as follows:
option : -abc_1 <some_path>
I have used the following code:
grep "abc_1" | awk -F " " '{print $4}'
This code fails if more spaces are used between the strings, e.g.:
option   :   -abc_1      <some_path>
It would be helpful if I could extract the path regardless of the number of spaces.

This should do:
echo 'option : -abc_1 <some_path>' | awk '/abc_1/ {print $4}'
If you do not specify a field separator, awk uses one or more blanks as the separator.
PS: you do not need both grep and awk.
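The default splitting can be checked with a small sketch. The helper name extract_path and the path /tmp/some_path are placeholders invented for this illustration, and the field number 4 assumes the colon is always surrounded by blanks, as in the question.

```shell
# Hypothetical helper: print the field that follows "-abc_1" on matching lines.
# With no -F, awk splits on runs of blanks, so extra spaces do not shift fields.
extract_path() {
    awk '/-abc_1/ {print $4}'
}

# Same line with single and with repeated spaces (placeholder path).
printf '%s\n' 'option : -abc_1 /tmp/some_path' | extract_path
printf '%s\n' 'option   :    -abc_1      /tmp/some_path' | extract_path
```

Both commands should print /tmp/some_path, because the number of blanks between fields does not matter.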

With sed you can do the search and the filter in one step:
sed -n 's/^.*-abc_1 *\([^ ]*\).*$/\1/p'
The -n option suppresses printing, but the p command at the end still prints if a successful substitution was made.

perl -lne 'print $1 if /-abc_1\s+(\S+)/' your_file
Or if you want to use awk:
awk '{for(i=1;i<=NF;i++)if($i=="-abc_1")print $(i+1)}' your_file

Try this grep-only way (-P needs a grep with PCRE support, such as GNU grep):
grep -Po '^option\s*:\s*-abc_1\s*\K.*' file
Or, if the whitespace is fixed:
grep -Po '^option : -abc_1 \K.*' file


Remove starting substring http from strings using AWK?

I'm wondering whether there is a better, cleaner way to remove strings at the beginning and end of each line in a file using AWK only.
Here's what I have so far:
cat results.txt | awk '{gsub("https://", "") ;print}' | tr -d ":443"
File: results.txt
To get the result
With GNU awk.
Use / and : as field separators and print fourth column:
awk -F '[/:]' '{print $4}' results.txt
Or use https:// and : as field separators and print second column:
awk -F 'https://|:' '{print $2}' results.txt
If it's a list of URLs like that, you could take advantage of the fact that the field separator in awk can be a regular expression:
awk -F':(//)?' '{print $2}'
This says that your field separator is ": optionally followed by //", which would split each line into:
[$1] http
[$3] 443
And then we print out only field $2.
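To see how the regular-expression separator splits a line, here is a small sketch. The host example.com and the helper name split_url are assumptions made for the illustration, not values from the original file.

```shell
# Hypothetical input: example.com stands in for a real host name.
# FS is ":" optionally followed by "//", so the line splits into
# $1 = scheme, $2 = host, $3 = port.
split_url() {
    awk -F':(//)?' '{printf "[$1] %s\n[$2] %s\n[$3] %s\n", $1, $2, $3}'
}

printf '%s\n' 'https://example.com:443' | split_url
```

The second line of the output is the host name, which is the part that print $2 keeps.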
cat results.txt | awk '{gsub("https://", "") ;print}' | tr -d ":443"
I think you are misunderstanding what tr -d does: it deletes the enumerated characters (not a substring). It only seems to do what you want because your test input
does not contain any :, 4 or 3 that should be kept; a test input containing one of those characters elsewhere would show the malfunction.
The code above also features the anti-pattern known as useless use of cat, since GNU AWK can read the file on its own. That is,
cat results.txt | awk '{gsub("https://", "") ;print}'
can be written more succinctly as
awk '{gsub("https://", "") ;print}' results.txt
I would rewrite your whole pipeline (cat, awk, tr) as a single awk command:
awk '{gsub("^https://|:443$","");print}' results.txt
Explanation: replace https:// at the start of the line (^) or (|) :443 at the end of the line ($) with the empty string (i.e. delete these parts), then print. Note that ^ and $ prevent deleting https:// and :443 in the middle of a string; feel free to remove ^ and $ if you consider such occurrences unlikely.
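The difference between the two approaches shows up as soon as the host itself contains one of the characters :, 4 or 3. In the sketch below, host43.example.com and both function names are made up for the demonstration.

```shell
# host43.example.com is a hypothetical host, picked because it contains 4 and 3.
url='https://host43.example.com:443'

# tr -d treats ":443" as a set of characters and deletes every ':', '4' and '3'.
strip_with_tr() {
    awk '{gsub("https://", ""); print}' | tr -d ':443'
}

# Anchored gsub removes only a leading "https://" and a trailing ":443".
strip_with_gsub() {
    awk '{gsub("^https://|:443$", ""); print}'
}

printf '%s\n' "$url" | strip_with_tr     # digits inside the host name are lost
printf '%s\n' "$url" | strip_with_gsub   # host name survives intact
```

The first command prints host.example.com and the second prints host43.example.com.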

How can I prefix the output of each match in grep with some text?

I have a file with a list of phrases
I'm running cat file.txt | xargs -I% sh -c "grep -Eio '(an)' >> output.txt"
What I can't figure out, is that I want the output to contain the original line, for example:
How can I prefix the output of grep to also include the value being piped to it?
This looks like a task for awk; try the following:
awk '/an/{print $0",an"}' Input_file
This looks for the string an in every line of Input_file and appends ,an to each line that contains it.
Solution with sed:
sed '/an/s/$/,an/' input_file
This finds lines that match the pattern /an/ and appends ,an at the end of the line ($).
Use awk instead of grep:
$ awk -v s="an" '                       # search string
BEGIN { OFS="," }                       # separating comma
match($0,s) {                           # when there is a match
    print $0,substr($0,RSTART,RLENGTH)  # output
}' file

I am trying to use awk to get the name of a file given the absolute path to the file.
For example, when given the input path /home/parent/child/filename I would like to get filename
I have tried:
awk -F "/" '{print $5}' input
which works perfectly.
However, I am hard coding $5 which would be incorrect if my input has the following structure:
So a generic solution requires always taking the last field (which will be the filename).
Is there a simple way to do this with the awk substr function?
Use the fact that awk splits the lines in fields based on a field separator, that you can define. Hence, defining the field separator to / you can say:
awk -F "/" '{print $NF}' input
Since NF holds the number of fields in the current record, $NF is the last field.
So given a file like this:
This would be the output:
$ awk -F"/" '{print $NF}' file
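As a self-contained check of the $NF approach, the sketch below feeds two paths of different depth through the same command. The function name last_field is invented for the example.

```shell
# Hypothetical helper: print the last "/"-separated field of each input line.
last_field() {
    awk -F/ '{print $NF}'
}

# Paths of different depth; the last field is the file name in both cases.
printf '%s\n' '/home/parent/child/filename' \
              '/home/parent/child1/child2/filename' | last_field
```

Both lines yield filename. One caveat: a path that ends in / has an empty last field, so this prints an empty line for it.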
In this case it is better to use basename instead of awk:
$ basename /home/parent/child1/child2/filename
If you're open to a Perl solution, here one similar to fedorqui's awk solution:
perl -F/ -lane 'print $F[-1]' input
-F/ specifies / as the field separator
$F[-1] is the last element in the @F autosplit array
Another option is to use bash parameter substitution.
$ foo="/home/parent/child/filename"
$ echo ${foo##*/}
$ foo="/home/parent/child/child2/filename"
$ echo ${foo##*/}
Like 5 years late, I know, thanks for all the proposals, I used to do this the following way:
$ echo /home/parent/child1/child2/filename | rev | cut -d '/' -f1 | rev
Glad to see there are better ways.
This should be a comment on the basename answer, but I don't have enough points.
If you do not use double quotes, basename will not work with a path that contains a space character:
$ basename /home/foo/bar foo/bar.png
ok with quotes " "
$ basename "/home/foo/bar foo/bar.png"
file example
$ cat a
/home/parent/child 1/child 2/child 3/filename1
/home/parent/child 1/child2/filename2
$ while read b ; do basename "$b" ; done < a
I know I'm like 3 years late on this but....
you should consider parameter expansion, it's built-in and faster.
if your input is in a var, let's say, $var1, just do ${var1##*/}. Look below
$ var1='/home/parent/child1/filename'
$ echo ${var1##*/}
$ var1='/home/parent/child1/child2/filename'
$ echo ${var1##*/}
$ var1='/home/parent/child1/child2/child3/filename'
$ echo ${var1##*/}
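The expansion can be wrapped in a small function, quoted so that paths with spaces also work. This is only a sketch: the name file_name is hypothetical, while the ${parameter##pattern} expansion itself is standard shell.

```shell
# Hypothetical helper: strip the longest prefix matching "*/" from the argument.
# The double quotes keep paths with spaces in one piece.
file_name() {
    printf '%s\n' "${1##*/}"
}

file_name '/home/parent/child1/child2/filename'
file_name '/home/parent/child 1/child 2/filename1'
```

The calls print filename and filename1, without starting any external process.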
you can skip all of that complex regex :
echo '/home/parent/child1/child2/filename' |
mawk '$!_=$-_=$NF' FS='[/]'
2nd to last :
mawk '$!--NF=$NF' FS='/'
3rd last field :
echo '/home/parent/child1/child2/filename' |
mawk '$!--NF=$--NF' FS='[/]'
4th-last :
mawk '$!--NF=$(--NF-!-FS)' FS='/'
echo '/home/parent/child000/child00/child0/child1/child2/filename' |
echo '/home/parent/child1/child2/filename'
major caveat :
- `gawk/nawk` has a slight discrepancy with `mawk` regarding
- how it tracks multiple,
- and potentially conflicting, decrements to `NF`,
- so other than the 1st solution regarding last field,
- the rest for now, are only applicable to `mawk-1/2`
just realized it's much much cleaner this way in mawk/gawk/nawk :
echo '/home/parent/child1/child2/filename' | …
awk ++NF FS='.+/' OFS= # updated such that
# root "/" still gets printed
You can also use:
sed -n 's/.*\/\([^\/]\{1,\}\)$/\1/p'
sed -n 's/.*\/\([^\/]*\)$/\1/p'

How to retrieve digits including the separator "."

I am using grep to get a string like this: ANS_LENGTH=266.50 then I use sed to only get the digits: 266.50
This is my full command: grep --text 'ANS_LENGTH=' log.txt | sed -e 's/[^[[:digit:]]]*//g'
The result is : 26650
How can this line be changed so the result still shows the separator: 266.50
You don't need grep if you are going to use sed. Just use a sed /.../ address to select the lines you need to print.
sed -n '/ANS_LENGTH/s/[^=]*=\(.*\)/\1/p' log.txt
-n suppresses the automatic printing of every line.
The captured group \(.*\) holds the value after the = sign, and \1 puts it in place of the whole line.
The p flag at the end prints only the lines where the substitution succeeded, i.e. those matching /ANS_LENGTH/.
If your grep happens to support -P option then you can do:
grep -oP '(?<=ANS_LENGTH=).*' log.txt
(?<=...) is a look-behind construct that allows us to match the lines you need. This requires the -P option
-o allows us to print only the value part.
You need to keep literal dots as well as the digits.
Try sed -e 's/[^[:digit:].]*//g'
Inside a bracket expression the dot is already literal, so it needs no backslash; outside brackets an unescaped dot matches any single character.
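A bracket expression that keeps only digits and dots can be tried out as follows. This is a sketch, and the function name digits_and_dots is made up for it.

```shell
# Hypothetical helper: delete every character that is neither a digit nor a dot.
# [:digit:] must sit inside the outer brackets; the dot is literal there.
digits_and_dots() {
    sed -e 's/[^[:digit:].]*//g'
}

printf '%s\n' 'ANS_LENGTH=266.50' | digits_and_dots
printf '%s\n' 'some data ANS_LENGTH=266.50 other=22' | digits_and_dots
```

The first line gives 266.50. The second gives 266.5022, because digits from elsewhere on the line are kept too; when other numbers can appear on the line, selecting just the value after ANS_LENGTH= is safer.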
Here is some awk example:
cat file:
some data ANS_LENGTH=266.50 other=22
not mye data=43
GNU awk (a regular-expression RS is not POSIX)
awk '/ANS_LENGTH/ {f=NR} f&&NR-1==f' RS="[ =]" file
awk '/ANS_LENGTH/ {getline;print}' RS="[ =]" file
Plain awk
awk -F"[ =]" '{for(i=1;i<=NF;i++) if ($i=="ANS_LENGTH") print $(i+1)}' file
awk '{for(i=1;i<=NF;i++) if ($i~"ANS_LENGTH") {split($i,a,"=");print a[2]}}' file

bash scripting removing optional <Integer><colon> prefix

I have a list in which every entry looks like:
I need to remove the N: prefix but leave the rest of each string as is.
Have tried:
cat service-rpmu.list | sed -ne "s/#[#:]\+://p" > end.list
cat service-rpmu.list | egrep -o '#[#:]+' > end.list
both result in an empty end.list
//* the N:, just denotes an epoch version */
With sed:
sed 's/^[0-9]\+://' your.file
Btw, your list looks like the output of a grep command with the option -n. If this is true, then omit the -n option there. Also it is likely that your whole task can be done with a single sed command.
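Since the original list is not shown, the sketch below uses invented package strings to illustrate the anchored substitution. It spells the repetition as [0-9][0-9]*, which also works in sed implementations that lack the GNU \+ extension; strip_epoch is a hypothetical name.

```shell
# Hypothetical helper: drop a leading "<digits>:" prefix, if there is one.
strip_epoch() {
    sed 's/^[0-9][0-9]*://'
}

# Made-up entries: one with an epoch prefix, one without.
printf '%s\n' '1:foo-1.2-3.x86_64' 'bar-2.0-1.noarch' | strip_epoch
```

This prints foo-1.2-3.x86_64 and bar-2.0-1.noarch. Lines without the prefix pass through unchanged, and a colon later in the line is left alone because the pattern is anchored with ^.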
awk '{ sub(/^[0-9]+:/, ""); print }' sample
Here is another way with awk:
awk -F: '{print $NF}' service-rpmu.list
