Insert Text 2 lines above Pattern - shell

I have some C code from a codegenerator and have to insert a line two lines before a specific mark in the code.
Example, generated Code: file.c
void foo () {
bar();
}
# <- I want a line to be added here!
void example () {
MARKERFUNCTION();
...
}
I do not know the exact position of MARKERFUNCTION() but it appears only once in the file and is the first line in example(). example() is not a fixed name.
A single sed command would be great, but a combination of grep and something else would do too.

$ sed '/()\s*{/{N; s/.*MARKERFUNCTION/== Text to add ==\n\0/}' file.c
void foo () {
bar();
}
== Text to add ==
void example () {
MARKERFUNCTION();
...
}
/()\s*{/ matches () followed by optional whitespace and {, i.e. the line that opens a function. If the parentheses can contain parameters, /)\s*{/ could work instead.
{N; if the above condition matches, append the next line to the pattern space.
s/.*MARKERFUNCTION/== Text to add ==\n\0/ performs the substitution only if MARKERFUNCTION occurs in the two lines now held. The replacement is the text to add, a newline, and the matched text, which GNU sed makes available as \0 (the portable spelling is &).
awk solution, assuming there is a blank line prior to start of function having the required pattern
$ awk -v RS= -v ORS="\n\n" '/MARKERFUNCTION/{$0 = "--Text to add--\n" $0} 1' file.c
void foo () {
bar();
}
--Text to add--
void example () {
MARKERFUNCTION();
...
}
With RS set to the empty string, awk reads the input file one paragraph at a time; if a paragraph contains the required pattern, a line is added in front of it before printing.

Using tac with awk:
tac file.c | awk '/MARKERFUNCTION/{p=NR} p && NR==p+2{print "new line added"} 1' | tac
void foo () {
bar();
}
# <- I want a line to be added here!
new line added
void example () {
MARKERFUNCTION();
...
}
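A variant that needs neither tac nor a blank line in front of the function: hold every line back by one, and when the current line contains the marker, print the new text before releasing the held line. This is only a sketch. The function name insert_two_above and its argument order are made up for illustration, and it assumes the marker is not on the first line of the file.

```shell
# Hypothetical helper (name and argument order are illustrative, not from the thread).
# Usage: insert_two_above PATTERN TEXT FILE
insert_two_above() {
    awk -v pat="$1" -v text="$2" '
        # From line 2 on, release the previous line; if the current line
        # holds the marker, emit the new text just before releasing it.
        NR > 1 {
            if ($0 ~ pat) print text
            print prev
        }
        { prev = $0 }                      # hold the current line back by one
        END { if (NR > 0) print prev }     # release the last held line
    ' "$3"
}
```

Called as insert_two_above 'MARKERFUNCTION' '== Text to add ==' file.c it writes the result to stdout and leaves file.c untouched.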

Related

sed or awk add content to a body of a c function running in git bash v2.21.0

To turn this function in a c file (test.c)
void Fuction(uint8, var)
{
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
into:
void Fuction(uint8, var)
{
injected1();
injected2();
injected3();
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
By injecting this one (inject.c)
injected1();
injected2();
injected3();
I tried several approaches with sed and awk, but I was not able to inject the code below the opening curly brace; the code was always injected before it.
On a regex website I was able to select the pattern including the curly brace, but in my script it did not work. Maybe awk is more suitable, but I have no deeper experience with awk. Maybe someone could help here?
With awk I had the additional problem of passing the pattern variable with a ^ anchor.
The call in Git Bash should look like this:
./inject.sh "void Fuction(uint8, var)" test.c inject.c
(my actual inject.sh bash script)
PATTERN=$1
FILE=$2
INJECTFILE=$3
sed -i "/^$PATTERN/r $INJECTFILE" $FILE
#sed -i "/^$PATTERN\r\{/r $INJECTFILE" $FILE
I actually have no idea how to also catch the \n and the { on the next line.
My result is:
void Fuction(uint8, var)
injected1();
injected2();
injected3();
{
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
Expanding on OP's sed code:
sed "/^${PATTERN}/,/{/ {
/{/ r ${INJECTFILE}
}" $FILE
# or as a one-liner
sed -e "/^${PATTERN}/,/{/ {" -e "/{/ r ${INJECTFILE}" -e "}" $FILE
Where:
/^${PATTERN}/,/{/ finds range of rows starting with ^${PATTERN} and ending with a line that contains a {
{ ... } within that range ...
/{/ r ${INJECTFILE} - find the line containing a { and append the contents of ${INJECTFILE}
Results:
$ ./inject.sh "void Fuction(uint8, var)" test.c inject.c
void Fuction(uint8, var)
{
injected1();
injected2();
injected3();
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
Once OP verifies the output the -i flag can be added to force sed to overwrite the file.
NOTE: OP's expected output shows the injected lines with some additional leading white space; if the intention is to auto-indent the injected lines to match with the current lines ... I'd probably want to look at something like awk in order to provide the additional formatting.
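As a rough sketch of the awk route mentioned in the note: match the signature as a literal string at the start of the line, wait for the first line containing {, and print the injected file with a fixed indent. The function name inject_indented and the four-space indent are assumptions made for illustration; real auto-indentation would have to inspect the following line.

```shell
# Hypothetical helper; the name and the fixed indent are assumptions.
# Usage: inject_indented "void Fuction(uint8, var)" test.c inject.c
inject_indented() {
    awk -v sig="$1" -v inj="$3" -v indent="    " '
        { print }
        # Literal match at column 1, so commented-out copies do not match
        # and the parentheses in the signature are not treated as regex.
        index($0, sig) == 1 { armed = 1 }
        # First opening brace at or after the signature line.
        armed && index($0, "{") {
            while ((getline line < inj) > 0) print indent line
            close(inj)
            armed = 0
        }
    ' "$2"
}
```

Because index() compares plain strings, the ^ anchor problem from the question does not arise. Output goes to stdout, so redirect it to a temporary file and move that over the original once the result looks right.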

awk substitution ascii table rules bash

I want to perform a hierarchical set of (non-recursive) substitutions in a text file.
I want to define the rules in an ascii file "table.txt" which contains lines of blank space tabulated pairs of strings:
aaa 3
aa 2
a 1
I have tried to solve it with an awk script "substitute.awk":
BEGIN { while (getline < file) { subs[$1]=$2; } }
{ line=$0; for(i in subs)
{ gsub(i,subs[i],line); }
print line;
}
When I call the script giving it the string "aaa":
echo aaa | awk -v file="table.txt" -f substitute.awk
I get
21
instead of the desired "3". Permuting the lines in "table.txt" doesn't help. Can anyone explain what the problem is here, and how to get around it? (This is a simplified version of my actual task, where I have a large file containing ASCII-encoded phonetic symbols which I want to convert into LaTeX code. The ASCII encoding of the symbols contains {$,&,-,%,[a-z],[0-9],...).)
Any comments and suggestions are welcome!
PS:
Of course in this application for a substitution table.txt:
aa ab
a 1
an original string "aa" should be converted into "ab" and not "1b". That means a string that was produced by applying a rule must be left untouched.
How to account for that?
The order of the loop for (i in subs) is undefined by default.
In newer versions of awk you can use PROCINFO["sorted_in"] to control the sort order. See section 12.2.1 Controlling Array Traversal and (the linked) section 8.1.6 Using Predefined Array Scanning Orders for details about that.
Alternatively, if you can't or don't want to do that you could store the replacements in numerically indexed entries in subs and walk the array in order manually.
To do that you will need to store both the pattern and the replacement in the value of the array and that will require some care to combine. You can consider using SUBSEP or any other character that cannot be in the pattern or replacement and then split the value to get the pattern and replacement in the loop.
Also note the caveats with getline listed at http://awk.info/?tip/getline, and consider not calling it manually but instead using NR==FNR{...} and just listing table.txt as the first file argument to awk.
Edit: Actually, for the manual loop version you could also just keep two arrays one mapping input file line number to the patterns to match and another mapping patterns to replacements. Then looping over the line number array will get you the pattern and the pattern can be used in the second array to get the replacement (for gsub).
Instead of storing the replacements in an associative array, put them in two arrays indexed by integer (one array for the strings to replace, one for the replacements) and iterate over the arrays in order:
BEGIN {i=0; while ((getline < file) > 0) { subs[i]=$1; repl[i++]=$2}
n = i}
{ for(i=0;i<n;i++) { gsub(subs[i],repl[i]); }
print tolower($0);
}
It seems like perl's zero-width word boundary is what you want. It's a pretty straightforward conversion from the awk:
#!/usr/bin/env perl
use strict;
use warnings;
my %subs;
BEGIN{
open my $f, '<', 'table.txt' or die "table.txt:$!";
while(<$f>) {
my ($k,$v) = split;
$subs{$k}=$v;
}
}
while(<>) {
while(my($k, $v) = each %subs) {
s/\b$k\b/$v/g;
}
print;
}
Here's an answer pulled from another StackExchange site, from a fairly similar question: Replace multiple strings in a single pass.
It's slightly different in that it does the replacements in inverse order by length of target string (i.e. longest target first), but that is the only sensible order for targets which are literal strings, as appears to be the case in this question as well.
If you have tcc installed, you can use the following shell function, which processes the file of substitutions into a lex-generated scanner which it then compiles and runs using tcc's compile-and-run option.
# Call this as: substitute replacements.txt < text_to_be_substituted.txt
# Requires GNU sed because I was too lazy to write a BRE
substitute () {
tcc -run <(
{
printf %s\\n "%option 8bit noyywrap nounput" "%%"
sed -r 's/((\\\\)*)(\\?)$/\1\3\3/;
s/((\\\\)*)\\?"/\1\\"/g;
s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
"$1"
printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
} | lex -t)
}
With gcc or clang, you can use something similar to compile a substitution program from the replacement list, and then execute that program on the given text. POSIX-standard c99 does not allow input from stdin, but gcc and clang are happy to do so provided you tell them explicitly that it is a C program (-x c). In order to avoid excess compilations, we use make (which needs to be gmake, GNU make).
The following requires that the list of replacements be in a file with a .txt extension; the cached compiled executable will have the same name with a .exe extension. If the makefile were in the current directory with the name Makefile, you could invoke it as make repl (where repl is the name of the replacement file without a text extension), but since that's unlikely to be the case, we'll use a shell function to actually invoke make.
Note that in the following file, the whitespace at the beginning of each line starts with a tab character:
substitute.mak
.SECONDARY:
%: %.exe
	@$(<D)/$(<F)
%.exe: %.txt
	@{ printf %s\\n "%option 8bit noyywrap nounput" "%%"; \
	sed -r \
	's/((\\\\)*)(\\?)$$/\1\3\3/; #\
	s/((\\\\)*)\\?"/\1\\"/g; #\
	s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
	"$<"; \
	printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"; \
	} | lex -t | c99 -D_POSIX_C_SOURCE=200809L -O2 -x c -o "$@" -
Shell function to invoke the above:
substitute() {
gmake -f/path/to/substitute.mak "${1%.txt}"
}
You can invoke the above command with:
substitute file
where file is the name of the replacements file. (The filename must end with .txt but you don't have to type the file extension.)
The format of the input file is a series of lines consisting of a target string and a replacement string. The two strings are separated by whitespace. You can use any valid C escape sequence in the strings; you can also \-escape a space character to include it in the target. If you want to include a literal \, you'll need to double it.
If you don't want C escape sequences and would prefer to have backslashes not be metacharacters, you can replace the sed program with a much simpler one:
sed -r 's/([\\"])/\\\1/g' "$<"; \
(The ; \ is necessary because of the way make works.)
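For the longest-target-first ordering mentioned above, the table can be reordered with standard tools before it is handed to whichever substitution program is used. A minimal sketch, assuming one rule per line with the target in the first whitespace-separated field and no tab inside the target; the name longest_first is made up for illustration.

```shell
# Hypothetical helper: print a replacement table with the longest targets first.
# Usage: longest_first table.txt > sorted.txt
longest_first() {
    # 1. prefix every rule with the length of its target and a tab
    # 2. sort numerically on that prefix, largest first
    # 3. cut the prefix off again
    awk '{ print length($1) "\t" $0 }' "$1" |
        sort -k1,1nr |
        cut -f2-
}
```

The relative order of rules whose targets have the same length is not defined by this sketch; for literal targets of equal length that should not matter, since distinct targets cannot match the same text at the same position.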
a) Don't use getline unless you have a very specific need and fully understand all the caveats, see http://awk.info/?tip/getline
b) Don't use regexps when you want strings (yes, this means you cannot use sed).
c) The while loop needs to constantly move beyond the part of the line you've already changed or you could end up in an infinite loop.
You need something like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
while ( sstart = index(tail,old) ) {
$0 = $0 substr(tail,1,sstart-1) new
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
print
}
$ echo aaa | awk -f substitute.awk table.txt -
3
$ echo aaaa | awk -f substitute.awk table.txt -
31
Adding some RE metacharacters to table.txt shows that they are treated just like every other character, and the following also shows how to run it when the target text is stored in a file instead of being piped:
$ cat table.txt
aaa 3
aa 2
a 1
. 7
\ 4
* 9
$ cat foo
a.a\aa*a
$ awk -f substitute.awk table.txt foo
1714291
Your new requirement requires a solution like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
delete news
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
charPos = 0
while ( sstart = index(tail,old) ) {
charPos += sstart
news[charPos] = new
$0 = $0 substr(tail,1,sstart-1) RS
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
numChars = split($0, olds, "")
$0 = ""
for (charPos=1; charPos <= numChars; charPos++) {
$0 = $0 (charPos in news ? news[charPos] : olds[charPos])
}
print
}
.
$ cat table.txt
1 a
2 b
$ echo "121212" | awk -f substitute.awk table.txt -
ababab
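A different way to leave already-replaced text untouched is to scan each input line once from left to right: at every position, try the rules in table order, emit the replacement of the first rule that matches there, and otherwise copy one character. This is a sketch under assumptions, not the script above: the name substitute_once is made up, the table must be non-empty, and the result can differ from the rule-by-rule script when rules overlap, because here the leftmost position wins rather than the earliest rule.

```shell
# Hypothetical helper: single left-to-right pass, replaced text is never rescanned.
# Usage: substitute_once table.txt < text_to_be_substituted.txt
substitute_once() {
    awk '
        # First file: load the rules in table order.
        NR == FNR {
            if (NF == 2) { old[++n] = $1; new[n] = $2 }
            next
        }
        # Remaining input: walk the line position by position.
        {
            out = ""; pos = 1; len = length($0)
            while (pos <= len) {
                hit = 0
                for (i = 1; i <= n; i++) {
                    if (substr($0, pos, length(old[i])) == old[i]) {
                        out = out new[i]          # emit the replacement
                        pos += length(old[i])     # skip past the matched target
                        hit = 1
                        break
                    }
                }
                if (!hit) { out = out substr($0, pos, 1); pos++ }
            }
            print out
        }
    ' "$1" -
}
```

With the table "aa ab" / "a 1", the input aa becomes ab, and with the table "1 a" / "2 b" the input 121212 becomes ababab. Like the script above it compares plain strings, so RE metacharacters need no escaping.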

Replace or append block of text in file with contents of another file

I have two files:
super.conf
someconfig=23;
second line;
#blockbegin
dynamicconfig=12
dynamicconfig2=1323
#blockend
otherconfig=12;
input.conf
newdynamicconfig=12;
anothernewline=1234;
I want to run a script and have input.conf replace the contents between the #blockbegin and #blockend lines.
I already have this:
sed -i -ne '/^#blockbegin/ {p; r input.conf' -e ':a; n; /#blockend/ {p; b}; ba}; p' super.conf
It works well until I change or remove the #blockend line in super.conf; then the script replaces all lines after #blockbegin.
In addition, I want the script to replace the block or, if the block doesn't exist in super.conf, append a new block with the content of input.conf to super.conf.
It can be accomplished by remove + append, but how do I remove the block using sed or another Unix command?
Though I gotta question the utility of this scheme -- I tend to favor systems that complain loudly when expectations aren't met instead of being more loosey-goosey like this -- I believe the following script will do what you want.
Theory of operation: It reads in everything up-front, and then emits its output all in one fell swoop.
Assuming you name the file injector, call it like injector input.conf super.conf.
#!/usr/bin/awk -f
#
# Expects to be called with two files. First is the content to inject,
# second is the file to inject into.
FNR == 1 {
# This switches from "read replacement content" to "read template"
# at the boundary between reading the first and second files. This
# will of course do something surprising if you pass more than two
# files.
readReplacement = !readReplacement;
}
# Read a line of replacement content.
readReplacement {
rCount++;
replacement[rCount] = $0;
next;
}
# Read a line of template content.
{
tCount++;
template[tCount] = $0;
}
# Note the beginning of the replacement area.
/^#blockbegin$/ {
beginAt = tCount;
}
# Note the end of the replacement area.
/^#blockend$/ {
endAt = tCount;
}
# Finished reading everything. Process it all.
END {
if (beginAt && endAt) {
# Both beginning and ending markers were found; replace what's
# in the middle of them.
emitTemplate(1, beginAt);
emitReplacement();
emitTemplate(endAt, tCount);
} else {
# Didn't find both markers; just append.
emitTemplate(1, tCount);
emitReplacement();
}
}
# Emit the indicated portion of the template to stdout.
function emitTemplate(from, to) {
for (i = from; i <= to; i++) {
print template[i];
}
}
# Emit the replacement text to stdout.
function emitReplacement() {
for (i = 1; i <= rCount; i++) {
print replacement[i];
}
}
I've written a Perl one-liner:
perl -0777lni -e 'BEGIN{open(F,pop(@ARGV))||die;$b="#blockbegin";$e="#blockend";local $/;$d=<F>;close(F);}s|\n$b(.*)$e\n||s;print;print "\n$b\n",$d,"\n$e\n" if eof;' edited.file input.file
Arguments:
edited.file - path to updated file
input.file - path to file with new content of block
The script first deletes the block (if it finds a matching one) and then appends a new block with the new content.
Do you mean something like this?
sed '/^#blockbegin/,/#blockend/d' super.conf
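On its own that range delete has the same weakness as the original command: if #blockend is missing, the range runs to the end of the file. A hedged sketch of the remove + append idea that checks for both markers first; the function name replace_or_append_block is made up, and like the Perl one-liner it re-adds the block at the end of the file rather than in its old place. It does not check that #blockbegin comes before #blockend.

```shell
# Hypothetical helper: remove the marked block (only if both markers exist),
# then append a fresh block built from the second file.
# Usage: replace_or_append_block super.conf input.conf
replace_or_append_block() {
    conf=$1
    input=$2
    tmp=$(mktemp) || return 1
    if grep -q '^#blockbegin$' "$conf" && grep -q '^#blockend$' "$conf"; then
        # Both markers present: drop the old block including its markers.
        sed '/^#blockbegin$/,/^#blockend$/d' "$conf" > "$tmp"
    else
        # A marker is missing: keep the file as it is and only append.
        cat "$conf" > "$tmp"
    fi
    {
        printf '%s\n' '#blockbegin'
        cat "$input"
        printf '%s\n' '#blockend'
    } >> "$tmp"
    # Write back through redirection so the permissions of the config file survive.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```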

Shell script to combine three files using AWK

I have three files G_P_map.txt, G_S_map.txt and S_P_map.txt. I have to combine these three files using awk. The example contents are the following -
(G_P_map.txt contains)
test21g|A-CZ|1mos
test21g|A-CZ|2mos
...
(G_S_map.txt contains)
nwtestn5|A-CZ
nwtestn6|A-CZ
...
(S_P_map.txt contains)
3mos|nwtestn5
4mos|nwtestn6
Expected Output :
1mos, 3mos
2mos, 4mos
Here is the code I tried. I was able to combine the first two files, but I couldn't bring in the third one.
awk -F"|" 'NR==FNR {file1[$1]=$1; next} {$2=file[$1]; print}' G_S_map.txt S_P_map.txt
Any ideas/help is much appreciated. Thanks in advance!
I would look at a combination of join and cut.
GNU AWK (gawk) 4 has BEGINFILE and ENDFILE which would be perfect for this. However, the gawk manual includes a function that will provide this functionality for most versions of AWK.
#!/usr/bin/awk -f
BEGIN {
FS = "|"
}
function beginfile(ignoreme) {
files++
}
function endfile(ignoreme) {
# endfile() would be defined here if we were using it
}
FILENAME != _oldfilename \
{
if (_oldfilename != "")
endfile(_oldfilename)
_oldfilename = FILENAME
beginfile(FILENAME)
}
END { endfile(FILENAME) }
files == 1 { # save all the key, value pairs from file 1
file1[$2] = $3
next
}
files == 2 { # save all the key, value pairs from file 2
file2[$1] = $2
next
}
files == 3 { # perform the lookup and output
print file1[file2[$2]], $1
}
# Place the regular END block here, if needed. It would be in addition to the one above (there can be more than one)
Call the script like this:
./scriptname G_P_map.txt G_S_map.txt S_P_map.txt
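If only the file counter is needed, the beginfile/endfile scaffolding can be replaced by counting on FNR == 1. A compact sketch of the same lookups, wrapped in a function whose name (combine_maps) is made up; the ", " separator is taken from the expected output in the question, and the counter assumes that none of the three files is empty.

```shell
# Hypothetical helper doing the same lookups as the script above.
# Usage: combine_maps G_P_map.txt G_S_map.txt S_P_map.txt
combine_maps() {
    awk -F'|' '
        FNR == 1 { files++ }                       # a new input file starts
        files == 1 { file1[$2] = $3; next }        # G_P_map: field 2 -> field 3
        files == 2 { file2[$1] = $2; next }        # G_S_map: field 1 -> field 2
        files == 3 { print file1[file2[$2]] ", " $1 }   # S_P_map: look up and print
    ' "$1" "$2" "$3"
}
```

As in the script above, a key that occurs more than once in the first file (such as A-CZ in the sample data) keeps only its last value, so the lookups give one result per key.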

Joining lines that matches specific conditions in bash

I need a command that will join lines if:
-following line starts with more than 5 spaces
-length of the joined lines won't be greater than 79 characters
-those lines are not between lines with pattern1 and pattern2
-same as above but with another set of patterns, like pattern3 and pattern4
It will work on a file like this:
Long line that contains too much text for combining it with following one
That line cannot be attached to the previous becouse of the length
This one also
becouse it doesn't start with spaces
This one
could be
expanded
pattern1
here are lines
that shouldn't be
changed
pattern2
Another line
to grow
After running the command, output should be:
Long line that contains too much text for combining it with following one
That line cannot be attached to the previous becouse of the length
This one also
becouse that one doesn't start with spaces
This one could be expanded
pattern1
here are lines
that shouldn't be
changed
pattern2
Another line to grow
It can't move part of the line.
I'm using bash 2.05, sed 3.02, awk 3.1.1 and grep 2.5.1, and I don't know how to solve this problem :)
Here's a start for you:
#!/usr/bin/awk -f
BEGIN {
TRUE = printflag1 = printflag2 = 1
FALSE = 0
}
# using two different flags prevents premature enabling when blocks are
# nested or intermingled
/pattern1/ {
printflag1 = FALSE
}
/pattern2/ {
printflag1 = TRUE
}
/pattern3/ {
printflag2 = FALSE
}
/pattern4/ {
printflag2 = TRUE
}
{
line = $0
sub(/^ +/, " ", line)
sub(/ +$/, "", line)
}
/^ / &&
length(accum line) <= 79 &&
printflag1 &&
printflag2 {
accum = accum line
next
}
{
print accum
accum = line
}
# flush the last accumulated line, which would otherwise be lost
END {
print accum
}

Resources