Replace line after match - bash

Given this file
$ cat foo.txt
AAA
111
BBB
222
CCC
333
I would like to replace the first line after BBB with 999. I came up with this command
awk '/BBB/ {f=1; print; next} f {$1=999; f=0} 1' foo.txt
but I am curious about any shorter commands with either awk or sed.

This might work for you (GNU sed)
sed '/BBB/!b;n;c999' file
If a line contains BBB, print that line and then change the following line to 999.
!b negates the preceding address (the regexp), so lines that do not match branch to the end of the script and are printed unchanged. n prints the current line and then reads the next one into the pattern space. c replaces the current line with the text following the command.
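The one-line c999 form relies on a GNU extension. As a minimal sketch, the same n-then-c idea can be written in the multi-line form that POSIX sed specifies. The temporary file below only recreates the sample foo.txt from the question; the variable names tmp and result are illustrative.

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' AAA 111 BBB 222 CCC 333 > "$tmp"

# On a BBB line: print it and load the next line (n),
# then replace that line with 999 (c, in its POSIX multi-line form).
result=$(sed '/BBB/{
n
c\
999
}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

This form is longer than the GNU one-liner, so it only makes sense when portability matters more than brevity.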

This is somewhat shorter:
awk 'f{$0="999";f=0}/BBB/{f=1}1' file
f {$0="999";f=0} if f is true, set line to 999 and f to 0
/BBB/ {f=1} if pattern match set f to 1
1 print all lines, since 1 is always true.
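The flag idea above can be generalized with a countdown instead of a boolean. The following is a sketch only: the variable n (how many lines after the match the target sits) and the counter c are names chosen for illustration, and the input is the question's sample recreated in a temporary file.

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' AAA 111 BBB 222 CCC 333 > "$tmp"

# c counts down the lines remaining until the replacement.
# When the countdown reaches zero, the current line becomes 999.
first=$(awk -v n=1 'c && !--c {$0 = "999"} /BBB/ {c = n} 1' "$tmp")
second=$(awk -v n=2 'c && !--c {$0 = "999"} /BBB/ {c = n} 1' "$tmp")

printf 'n=1:\n%s\n' "$first"
printf 'n=2:\n%s\n' "$second"
rm -f "$tmp"
```

With n=1 this behaves like the flag version; with n=2 it replaces CCC instead of 222.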

You can also use sed, which is shorter:
sed '/BBB/{n;s/.*/999/}' file

$ awk '{print (f?999:$0); f=0} /BBB/{f=1}' file
AAA
111
BBB
999
CCC
333

awk '/BBB/{print;getline;$0="999"}1' your_file

sed 's/\(BBB\)/\1\
999/'
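This works on macOS (BSD sed). Note that it inserts 999 as a new line after BBB; it does not replace the line that follows.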

Related

Searching a string and replacing another string above the searched string

I have a file with the lines below
123
456
123
789
abc
efg
xyz
I need to search for abc and replace the nearest 123 above it with 111. abc occurs only once in the file, but 123 can occur multiple times and at any position above abc.
Please help me.
I have tried the sed command below:
sed -i.bak "/abc/!{x;1!p;d;};x;s/123/111/" filename
This replaces 123 only if it is the line immediately above abc; if 123 is two lines above abc, the replacement fails.
There's more than one way to do it. Here's one:
sed -i.bak '1{h;d;};/123/{x;p;d;};/abc/{x;s/123/111/;p;d;};H;${x;p;};d' filename
ed comes in handy for complex editing of files in scripts:
ed -s file <<EOF
/^abc$/;?^123$?;.c
111
.
w
EOF
This sets the current line to the first one matching abc (/^abc$/;), then changes the nearest line before that point that matches 123 to 111 (?XXX? searches backwards for a matching regular expression, and ?^123$?;. selects that single line for c to change), and finally saves the modified file.
This is a classic case where you keep track of the previous line and change it depending on conditions satisfied by the current line. Generally, such an awk program looks like this:
awk '(FNR==1){prev=$0; next}
(condition_on_$0) { action_on_prev }
{ print prev; prev = $0 }
END { print $0 }'
So in the case of the OP, this would read (note that it only covers the case where 123 is the line immediately above abc):
awk '(FNR==1){prev=$0; next}
$0 == "abc" { if (prev == "123") prev = "111" }
{ print prev; prev = $0 }
END { print $0 }'
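The previous-line template only looks one line back, while the question allows 123 to sit any number of lines above abc. One possible sketch for that case buffers the lines in an array and remembers where the most recent 123 was seen. The names line, last and done are illustrative, and the whole file is held in memory, so this assumes the input is reasonably small.

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 123 456 123 789 abc efg xyz > "$tmp"

result=$(awk '
$0 == "123" && !done { last = NR }        # remember the most recent 123 seen before abc
$0 == "abc" && !done {                    # first abc: rewrite the remembered line, if any
    if (last) line[last] = "111"
    done = 1
}
{ line[NR] = $0 }                         # buffer every line
END { for (i = 1; i <= NR; i++) print line[i] }
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```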
This might work for you (GNU sed):
sed -Ez 's/(.*)(\n123.*\nabc)/\1\n111\2/' file
This slurps the file into memory and inserts 111 in front of the last occurrence of 123 before abc.
A less memory intensive solution:
sed -E '/^123$/{:a;N;/\n123$/{h;s///p;g;s/.*\n//;ba};/\nabc$/!ba;s/^/111\n/}' file
This gathers up lines following a line containing 123. If another line containing 123 is encountered it offloads all lines before it and begins gathering lines again. If it finds a line containing abc it inserts 111 at the front of the lines gathered so far.
Another alternative:
sed '/abc/{x;/./{s/^/111\n/p;z};x;b};/123/{x;/./p;x;h;$!d;b};x;/./{x;H;$!d};x' file
$ tac file | awk 'f && sub(/123/,"111"){f=0} /abc/{f=1} 1' | tac
123
456
111
789
abc
efg
xyz

How to get lines from the last match to the end of file?

I need to print the lines after the last match through the end of the file. The number of matches is not fixed. I have some text as shown below.
MARKER
aaa
bbb
ccc
MARKER
ddd
eee
fff
MARKER
ggg
hhh
iii
MARKER
jjj
kkk
lll
Output desired is
jjj
kkk
lll
Do I use awk with RS and FS to get the desired output?
You can actually do it with awk (gawk) without using any pipe.
$ awk -v RS='(^|\n)MARKER\n' 'END{printf "%s", $0}' file
jjj
kkk
lll
Explanations:
RS='(^|\n)MARKER\n' sets the record separator to the regex (^|\n)MARKER\n; by default it is the newline character.
'END{printf "%s", $0}' => at the end of the file, print the last record. Because records are split on (^|\n)MARKER\n, $0 then holds every line after the last MARKER up to EOF.
Another option is to use grep (GNU):
$ grep -zoP '(?<=MARKER\n)(?:(?!MARKER)[^\0])+\Z' file
jjj
kkk
lll
Explanations:
-z to use the ASCII NUL character as delimiter
-o to print only the matching
-P to activate the perl mode
PCRE regex: (?<=MARKER\n)(?:(?!MARKER)[^\0])+\Z explained here https://regex101.com/r/RpQBUV/2/
Last but not least, the following sed approach can also be used:
sed -n '/^MARKER$/{n;h;b};H;${x;p}' file
jjj
kkk
lll
Explanations:
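n replaces the pattern space with the next line of input
h copies the pattern space into the hold space, replacing what was there
H does the same but appends to the hold space instead of replacing it
${x;p} on the last line, exchanges (x) the hold space and the pattern space and prints (p) the result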
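To see h, H and x in isolation, here is a tiny sketch on made-up input (the letters a, b, c are not from the question). It collects every line in the hold space and prints them joined by commas on the last line.

```shell
# 1h   : on line 1, copy the pattern space into the hold space
# 1!H  : on every other line, append the pattern space to the hold space
# ${…} : on the last line, swap the spaces, turn newlines into commas, print
joined=$(printf '%s\n' a b c | sed -n '1h;1!H;${x;s/\n/,/g;p;}')

printf '%s\n' "$joined"
```

The MARKER command above uses the same accumulation, except that h restarts the collection each time a new MARKER is seen.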
that can be turned into:
tac file | sed -n '/^MARKER$/q;p' | tac
if we use tac.
Could you please try following.
tac file | awk '/MARKER/{print val;exit} {val=(val?val ORS:"")$0}' | tac
The benefit of this approach is that awk reads only the last block of Input_file (which is the first block awk sees after tac reverses it) and exits after that.
Explanation:
tac file | ##Printing Input_file in reverse order.
awk '
/MARKER/{ ##Searching for a string MARKER in a line of Input_file.
print val ##Printing variable val here. Because we need last occurrence of string MARKER,which has become first instance after reversing the Input_file.
exit ##Using exit to exit from awk program itself.
}
{
val=(val?val ORS:"")$0 ##Creating variable named val whose value will be keep appending to its own value with a new line to get values before string MARKER as per OP question.
}
' | ##Sending output of awk command to tac again to make it in its actual form, since tac prints it in reverse order.
tac ##Using tac to make it in correct order(lines were reversed because of previous tac).
You can try Perl as well
$ perl -0777 -ne ' /.*MARKER\n(.*)/s and print $1 ' input.txt
jjj
kkk
lll
$
This might work for you (GNU sed):
sed -nz 's/.*MARKER.//p' file
This uses greedy matching to delete everything up to and including the last occurrence of MARKER.
Simplest to remember:
tac fun.log | sed "/MARKER/Q" | tac
This awk solution would work with any version of awk on any OS:
awk '/^MARKER$/ {s=""; next} {s = s $0 RS} END {printf "%s", s}' file
jjj
kkk
lll

Print all lines between two patterns, exclusive, first instance only (in sed, AWK or Perl) [duplicate]

Using sed, AWK (or Perl), how do you print all lines between (the first instance of) two patterns, exclusive of the patterns? [1]
That is, given as input:
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
Or possibly even:
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
fff
PATTERN1
ggg
hhh
iii
PATTERN2
jjj
I would expect, in both cases:
bbb
ccc
ddd
[1] A number of users voted to close this question as a duplicate of "How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?". In the end, I provided a gist showing that the two questions are different. This question is also superficially similar to a number of others, but none is an exact match and none is of high quality. Because I believe this specific problem is the one most commonly faced, it deserves a clear formulation and a set of correct, clear answers.
If you have GNU sed (tested using version 4.7 on Mac OS X), the simplest solution could be:
sed '0,/PATTERN1/d;/PATTERN2/Q'
Explanation:
The d command deletes from line 1 to the line matching /PATTERN1/ inclusive.
The Q command then exits without printing on the first line matching /PATTERN2/.
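The 0,/PATTERN1/ address form is a GNU extension, and it differs from 1,/PATTERN1/ when PATTERN1 is on the very first line. A small sketch, assuming GNU sed and a made-up four-line input, shows the difference:

```shell
# With 0,/re/ the end regex may match on line 1 itself, so only that line is deleted.
with_zero=$(printf '%s\n' PATTERN1 bbb PATTERN2 eee | sed '0,/PATTERN1/d;/PATTERN2/Q')

# With 1,/re/ the end regex is first checked on line 2. No later line matches
# PATTERN1, so the range runs to the end of input and everything is deleted.
with_one=$(printf '%s\n' PATTERN1 bbb PATTERN2 eee | sed '1,/PATTERN1/d;/PATTERN2/Q')

printf 'with 0,: [%s]\n' "$with_zero"
printf 'with 1,: [%s]\n' "$with_one"
```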
If the file has only one instance of the pattern, or if you don't mind extracting all of them, and you want a solution that doesn't depend on a GNU extension, this works:
sed -n '/PATTERN1/,/PATTERN2/{//!p}'
Explanation:
Note that the empty regular expression // reuses the most recently used regular expression, so //!p skips both delimiter lines.
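If only the first block is wanted and GNU extensions are to be avoided, one possible variation is to quit at the first PATTERN2. This is a sketch rather than part of the original answer; it spells the two patterns out instead of using //, and the input is the question's second sample recreated in a temporary file.

```shell
# Recreate the two-block sample input from the question.
tmp=$(mktemp)
printf '%s\n' aaa PATTERN1 bbb ccc ddd PATTERN2 eee fff \
              PATTERN1 ggg hhh iii PATTERN2 jjj > "$tmp"

# Inside the range: quit at the closing delimiter (q prints nothing under -n),
# otherwise print every line except the opening delimiter.
first_block=$(sed -n '/PATTERN1/,/PATTERN2/{/PATTERN2/q;/PATTERN1/!p;}' "$tmp")

printf '%s\n' "$first_block"
rm -f "$tmp"
```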
With awk (assumes that PATTERN1 and PATTERN2 are always present in pairs and that neither of them occurs inside a pair)
$ cat ip.txt
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
fff
PATTERN1
ggg
hhh
iii
PATTERN2
jjj
$ awk '/PATTERN2/{exit} f; /PATTERN1/{f=1}' ip.txt
bbb
ccc
ddd
/PATTERN1/{f=1} set flag if /PATTERN1/ is matched
/PATTERN2/{exit} exit if /PATTERN2/ is matched
f; print input line if flag is set
Generic solution, where the block required can be specified
$ awk -v b=1 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
bbb
ccc
ddd
$ awk -v b=2 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
ggg
hhh
iii
This might work for you (GNU sed):
sed -n '/PATTERN1/{:a;n;/PATTERN2/q;p;$!ba}' file
This prints only the lines between the first set of delimiters, or if the second delimiter does not exist, to the end of the file.
I attempted to answer twice, but the questions kept switching between on-hold and duplicate status.
Borrowing the input from @Sundeep and adding the answer that I shared in the question comments.
Using awk
awk -v x=0 -v y=1 ' /PATTERN1/&&y { x=1;next } /PATTERN2/&&y { x=0;y=0; next } x ' file
with Perl
perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if $x++ <1 } '
Results:
$ cat ip.txt
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
PATTERN1
2
46
PATTERN2
xyz
$
$ awk -v x=0 -v y=1 ' /PATTERN1/&&y { x=1;next } /PATTERN2/&&y { x=0;y=0; next } x ' ip.txt
bbb
ccc
ddd
$ perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if $x++ <1 } ' ip.txt
bbb
ccc
ddd
$
To make it generic
With awk, y is the number of the block to print:
awk -v x=0 -v y=2 ' /PATTERN1/ { x++;next } /PATTERN2/ { if(x==y) exit } x==y ' ip.txt
2
46
With Perl, compare ++$x against the desired occurrence; here it is 2:
perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if ++$x==2 } ' ip.txt
2
46
Adding a few more solutions for fun, without claiming that they are better than the usual ones. All were written and tested in GNU awk, and only against the given examples.
1st Solution:
awk -v RS="" -v FS="PATTERN2" -v ORS="" '$1 ~ /\nPATTERN1\n/{sub(/.*PATTERN1\n/,"",$1);print $1}' Input_file
2nd solution:
awk -v RS="" -v ORS="" 'match($0,/PATTERN1[^(PATTERN2)]*/){val=substr($0,RSTART,RLENGTH);gsub(/^PATTERN1\n|^$\n/,"",val);print val}' Input_file
3rd solution:
awk -v RS="" -v OFS="\n" -v ORS="" 'sub(/PATTERN2.*/,"") && sub(/.*PATTERN1/,"PATTERN1"){$1=$1;sub(/^PATTERN1\n/,"")} 1' Input_file
All of the above produce the following output.
bbb
ccc
ddd
Using GNU sed:
sed -nE '/PATTERN1/{:s n;/PATTERN2/q;p;bs}'
-n suppresses automatic printing, so only the lines printed explicitly by the p command appear in the output.
An address applies only to the single command that follows it, so the commands are grouped with {}.
The n command discards the PATTERN1 line by reading the next line. If that line matches PATTERN2, q quits immediately; otherwise p prints it and bs branches back to the label s to process the following line.

gawk use to replace a line containing a pattern with multiple lines using variable

I am trying to replace a line containing the Pattern using gawk, with a set of lines. Let's say, file aa contains
aaaa
ccxyzcc
aaaa
ddxyzdd
I'm using gawk to replace all lines containing xyz with a set of lines 111\n222, my changed contents would contain:
aaaa
111
222
aaaa
111
222
But, if I use:
gawk -v nm2="111\n222" -v nm1="xyz" '{ if (/nm1/) print nm2;else print $0}' "aa"
The changed content shows:
aaaa
ccxyzcc
aaaa
ddxyzdd
I need the entire lines that contain xyz, i.e. ccxyzcc and ddxyzdd, to be replaced with 111 followed by 222. Please help.
The problem with your code is that /nm1/ tries to match the literal text nm1 as the pattern, not the value of the nm1 variable.
$ gawk -v nm2="111\n222" -v nm1="xyz" '$0 ~ nm1{print nm2; next} 1' aa
aaaa
111
222
aaaa
111
222
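The difference between /nm1/ and $0 ~ nm1 can be seen in isolation with a two-line input. This is a small sketch on made-up input (the lines nm1 and xyz), not the file aa from the question:

```shell
# /nm1/ is a regex literal: it matches the text "nm1", never the variable's value.
literal=$(printf '%s\n' nm1 xyz | awk -v nm1="xyz" '/nm1/ {print "regex literal matched: " $0}')

# $0 ~ nm1 uses the string stored in the variable as a dynamic regex.
dynamic=$(printf '%s\n' nm1 xyz | awk -v nm1="xyz" '$0 ~ nm1 {print "variable matched: " $0}')

printf '%s\n%s\n' "$literal" "$dynamic"
```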
Thanks to @fedorqui for the suggestion: next can be avoided by simply overwriting the input line that matches the pattern with the required text
gawk -v nm2="111\n222" -v nm1="xyz" '$0 ~ nm1{$0=nm2} 1' aa
Solution with GNU sed
$ nm1='xyz'
$ nm2='111\n222'
$ sed "/$nm1/c $nm2" aa
aaaa
111
222
aaaa
111
222
The c command would delete the line matching pattern and add the text given
When using awk's ~ operator, you don't need to provide a literal regex on the right-hand side; a variable holding the pattern works too.
Your command, with the syntax corrected, would be something like:
gawk -v nm2="111\n222" -v nm1="xyz" '{ if ( $0 ~ nm1 ) print nm2;else print $0}' input-file
which produces the output.
aaaa
111
222
aaaa
111
222
This is how I'd do it:
$ cat aa
aaaa
ccxyzcc
aaaa
ddxyzdd
$ awk '{gsub(/.*xyz.*/, "111\n222")}1' aa
aaaa
111
222
aaaa
111
222
$
Passing variables as patterns to awk is always a bit tricky.
awk -v nm2='111\n222' '{if ($1 ~ /xyz/){ print nm2 } else {print}}' aa
will give you the output, but the 'xyz' pattern is now fixed.
Splicing nm1 in as a shell variable will also work:
nm1=xyz
awk -v nm2='111\n222' '{if ($1 ~ /'"$nm1"'/){ print nm2 } else {print}}' aa
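When the search text should be taken literally rather than as a regex, awk's index() function is one way to sidestep the quoting problem. The sketch below uses a made-up pattern a.c and a made-up three-line input to show where the two approaches diverge; pat, as_regex and as_string are illustrative names.

```shell
# Made-up input: only the second line contains the literal text "a.c".
tmp=$(mktemp)
printf '%s\n' aaaa 'xxa.cxx' abc > "$tmp"
pat='a.c'

# As a regex, the dot matches any character, so "abc" is replaced as well.
as_regex=$(awk -v nm1="$pat" -v nm2='111\n222' '$0 ~ nm1 {print nm2; next} 1' "$tmp")

# index() does a plain substring search, so only the literal "a.c" line is replaced.
as_string=$(awk -v nm1="$pat" -v nm2='111\n222' 'index($0, nm1) {print nm2; next} 1' "$tmp")

printf 'regex:\n%s\n' "$as_regex"
printf 'substring:\n%s\n' "$as_string"
rm -f "$tmp"
```

Note that -v still processes backslash escapes in the value, which is what turns \n in nm2 into a newline.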

Transpose Columns in a single comma separated row conditionally

I have an input file that looks like this:
aaa 111
aaa 222
aaa 333
bbb 444
bbb 555
I want to create a transposed output file that looks like this:
aaa 111,222,333
bbb 444,555
How can I do this using awk, sed, etc?
One way using awk:
$ awk '{a[$1]=a[$1]?a[$1]","$2:$2}END{for(k in a)print k,a[k]}' file
aaa 111,222,333
bbb 444,555
And if your implementation of awk doesn't support the ternary operator then:
$ awk 'a[$1]{a[$1]=a[$1]","$2;next}{a[$1]=$2}END{for(k in a)print k,a[k]}' file
aaa 111,222,333
bbb 444,555
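Both commands above print with for (k in a), and awk does not specify the order of that loop, so the groups may come out in a different order than they appeared in the input. A sketch that keeps first-seen order, using an extra array (named order here purely for illustration) and the question's sample recreated in a temporary file:

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'aaa 111' 'aaa 222' 'aaa 333' 'bbb 444' 'bbb 555' > "$tmp"

grouped=$(awk '
{
    if ($1 in a) {
        a[$1] = a[$1] "," $2      # key seen before: append to its list
    } else {
        order[++n] = $1           # new key: record its position
        a[$1] = $2
    }
}
END { for (i = 1; i <= n; i++) print order[i], a[order[i]] }
' "$tmp")

printf '%s\n' "$grouped"
rm -f "$tmp"
```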
Your new file does not cause any problems for the script, what output are you getting? I suspect it's probably a line ending issue. Run dos2unix file to fix the line ending.
$ cat file
APM00065101435 189
APM00065101435 190
APM00065101435 191
APM00065101435 390
190104555 00C7
190104555 00D1
190104555 00E1
190104555 0454
190104555 0462
$ awk '{a[$1]=a[$1]?a[$1]","$2:$2}END{for(k in a)print k,a[k]}' file
APM00065101435 189,190,191,390
190104555 00C7,00D1,00E1,0454,0462
Code for GNU sed:
I made a question for this and got a very good & useful answer from potong:
sed -r ':a;$!N;s/^(([^ ]+ ).*)\n\2/\1,/;ta;P;D' file
sed -r ':a;$!N;s/^((\S+\s).*)\n\2/\1,/;ta;P;D' file

Resources