sed or awk add content to a body of a c function running in git bash v2.21.0 - bash

To turn this function in a c file (test.c)
void Fuction(uint8, var)
{
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
into:
void Fuction(uint8, var)
{
injected1();
injected2();
injected3();
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
By injecting this one (inject.c)
injected1();
injected2();
injected3();
I tried several approaches with sed and awk, but I was not able to inject the code below the opening curly brace; the code was always injected before it.
On a regex website I was able to select the pattern including the curly brace, but in my script it did not work. Maybe awk is more suitable, but I have no deeper experience with awk. Could someone help here?
With awk I had the additional problem of passing the pattern variable with a ^ anchor.
The call in git bash should look like this:
./inject.sh "void Fuction(uint8, var)" test.c inject.c
(my actual inject.sh bash script)
PATTERN=$1
FILE=$2
INJECTFILE=$3
sed -i "/^$PATTERN/r $INJECTFILE" $FILE
#sed -i "/^$PATTERN\r\{/r $INJECTFILE" $FILE
I have no idea how to also match the \n and the { on the next line.
My result is:
void Fuction(uint8, var)
injected1();
injected2();
injected3();
{
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content

Expanding on OP's sed code:
sed "/^${PATTERN}/,/{/ {
/{/ r ${INJECTFILE}
}" $FILE
# or as a one-liner
sed -e "/^${PATTERN}/,/{/ {" -e "/{/ r ${INJECTFILE}" -e "}" $FILE
Where:
/^${PATTERN}/,/{/ finds range of rows starting with ^${PATTERN} and ending with a line that contains a {
{ ... } within that range ...
/{/ r ${INJECTFILE} - find the line containing a { and append the contents of ${INJECTFILE}
Results:
$ ./inject.sh "void Fuction(uint8, var)" test.c inject.c
void Fuction(uint8, var)
{
injected1();
injected2();
injected3();
dosomething();
}
// void Fuction(uint8, var)
// should not be injected below a comment with same pattern content
Once OP verifies the output the -i flag can be added to force sed to overwrite the file.
NOTE: OP's expected output shows the injected lines with some additional leading white space; if the intention is to auto-indent the injected lines to match with the current lines ... I'd probably want to look at something like awk in order to provide the additional formatting.
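As a rough illustration of the awk route mentioned in the note, the sketch below prints the injected file after the first line containing { that follows the matched line, and prefixes every injected line with a caller-supplied indent. The function name inject_after_brace and its optional fourth argument are assumptions made for this example and are not part of OP's script. The pattern is compared as a literal line prefix, not as a regex, so the commented-out copy of the signature is ignored.

```shell
# Hypothetical helper: inject_after_brace PATTERN FILE INJECTFILE [INDENT]
inject_after_brace() {
  # Note: awk -v interprets backslash escapes, so a pattern containing
  # backslashes would need extra care (assumption: OP's patterns have none).
  awk -v pat="$1" -v inj="$3" -v indent="${4-}" '
    # Literal prefix test: the line must start with the pattern text.
    index($0, pat) == 1 { armed = 1 }
    { print }
    # First line containing "{" at or after the armed line: emit the file.
    armed && index($0, "{") {
      while ((getline line < inj) > 0) print indent line
      close(inj)
      armed = 0
    }
  ' "$2"
}
```

Called as inject_after_brace "void Fuction(uint8, var)" test.c inject.c "    ", this writes the result to stdout; redirect it to a temporary file and move that over test.c once the output looks right.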

Related

Sed substitution - spaces by tabs

I'm trying to format a batch of .c files with sed in a shell script so that the function names are properly aligned. I'm replacing int(space)function1() with int(3tab)function1()
int function1(int foo)
{
*my_function_code*
}
char function2(int foo)
{
*my_function_code*
}
int main(int foo)
{
*my_function_code*
}
I'm actually using the following loop to apply my substitution :
#align global scope
printf " Correct global scope alignement...\n"
for file in "${FILES[@]}"; do
sed -i -e 's/^int */int\t\t\t/g' \
-i -e 's/^char */char\t\t\t/g' \
-i -e 's/^float */float\t\t\t/g' \
-i -e 's/^long int */long int\t\t\t/g' "${file}"
done
The problem is, if I rerun the script, instead of doing nothing, it will add multiple tabs again. Giving me this :
int function1(int foo)
{
*my_function_code*
}
char function2(int foo)
{
*my_function_code*
}
int main(int foo)
{
*my_function_code*
}
Isn't the * supposed to match only spaces and not tabs, or does it match all blank characters?
Could you please try the following, written and tested with the shown samples. It simply checks whether a line starts with int or char (you could add float and long int to the condition too) and then replaces the first run of spaces with 3 tabs.
sed -E '/^int|^char/s/ +/\t\t\t/' Input_file
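One way to make the substitution safe to rerun is to require at least one blank after the type name and to treat tabs as blanks too, so an already aligned line is rewritten to itself. The following is a minimal sketch under that assumption; the function name align_types is made up for this example, and the tab is built with printf so the script does not rely on sed understanding \t.

```shell
# Hypothetical helper: align_types [FILE...]  (reads stdin when no file is given)
align_types() {
  tab=$(printf '\t')
  # [[:blank:]]+ needs one or more spaces/tabs, so "int<3 tabs>name"
  # is replaced by exactly "int<3 tabs>name" again on a second run.
  sed -E "s/^(long int|int|char|float)[[:blank:]]+/\\1${tab}${tab}${tab}/" "$@"
}
```

Because only the blanks directly after the leading type are touched, spaces inside the parameter list stay as they are.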

Match a pattern within a block of text bounded by two different patterns

Is there a way to match a pattern within a block of text whose boundaries are two unique patterns on different lines? By that I mean:
/*some comment*/
void something_patternWord1()
{
some_code;
some_code;
some_code_patternWord3;
some_code;
}
/*some comment patternWord2
Desired Output:
/*some comment*/
void something_patternWord1()
{
some_code;
some_code;
new_line_of_some_code; //after performing sed command
some_code_patternWord3;
some_code;
}
/*some comment patternWord2
What I've tried:
sed '/patternWord1/,/patternWord2/{/patternWord3/i new_line_of_some_code;}' inputFile > outputFile
The above command doesn't seem to be working.
Ultimately, what I'm trying to achieve is using bash script, read in some stuff from .csv file, and if the word read in is someWord then search for a block of text bounded by patternWord1 and patternWord2, match patternWord3 within that block of text and insert some code above it.
I have many .cpp files with many different functions and therefore many combinations of different patternWord1 and patternWord2.
Is what I'm trying to do even possible with sed, and is sed the best tool for something like this?
Perhaps another approach: would it be possible for me to use if-then statements in bash while having the script read in a .cpp file, and then use the sed command to match for patternWord3 and insert a new line of code?
If you could help me out with sed that would be great, but also I'm open to other suggestions as well if it make more sense/it is more efficient. Thank you in advance.
The point here is to tell the i (insert) command where its text ends.
You can do this in several ways: either put the inserted text on a line of its own, or end the -e expression with the closing ' where the inserted line should end and supply the closing } in a separate -e.
You may use
sed -e '/patternWord1/,/patternWord2/{/patternWord3/i\
new_line_of_some_code;
}' inputFile > outputFile
Or,
sed -e '/patternWord1/,/patternWord2/{/patternWord3/i \ \ \ new_line_of_some_code;' -e '}' inputFile > outputFile
See the online demo:
s='/*some comment*/
void something_patternWord1()
{
some_code;
some_code;
some_code_patternWord3;
some_code;
}
/*some comment patternWord2'
sed -e '/patternWord1/,/patternWord2/{/patternWord3/i \ \ \ new_line_of_some_code;' -e '}' <<< "$s"
Output:
/*some comment*/
void something_patternWord1()
{
some_code;
some_code;
new_line_of_some_code;
some_code_patternWord3;
some_code;
}
/*some comment patternWord2
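Since the three patterns are meant to come from a .csv file, it may be easier to pass them as variables. Here is a hedged awk sketch of the same idea; the function name insert_before_in_block and its argument order are assumptions for illustration. Unlike the sed range, the block here also closes if the end pattern appears on the start line itself.

```shell
# Hypothetical helper:
#   insert_before_in_block START_RE END_RE TARGET_RE NEW_LINE FILE
insert_before_in_block() {
  awk -v s="$1" -v e="$2" -v t="$3" -v new="$4" '
    $0 ~ s            { inblock = 1 }   # block opens on the start pattern
    inblock && $0 ~ t { print new }     # new line goes above the target line
                      { print }
    inblock && $0 ~ e { inblock = 0 }   # block closes on the end pattern
  ' "$5"
}
```

For the sample above, insert_before_in_block patternWord1 patternWord2 patternWord3 'new_line_of_some_code;' inputFile > outputFile gives the desired output.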

Insert Text 2 lines above Pattern

I have some C code from a code generator and have to insert a line two lines before a specific marker in the code.
Example, generated Code: file.c
void foo () {
bar();
}
# <- I want a line to be added here!
void example () {
MARKERFUNCTION();
...
}
I do not know the exact position of MARKERFUNCTION() but it appears only once in the file and is the first line in example(). example() is not a fixed name.
A single sed command would be great, but a combination of grep and something else would do too.
$ sed '/()\s*{/{N; s/.*MARKERFUNCTION/== Text to add ==\n\0/}' file.c
void foo () {
bar();
}
== Text to add ==
void example () {
MARKERFUNCTION();
...
}
/()\s*{/ match () followed by any spaces and { - to detect line with function.. if there can be content inside (), using /)\s*{/ could work
{N; if above condition matches, get next line
s/.*MARKERFUNCTION/== Text to add ==\n\0/ perform substitution by checking if the pattern MARKERFUNCTION is there in the two lines we have. Then in replacement section, add the required text, a newline and the pattern matched which by default is in \0
awk solution, assuming there is a blank line prior to start of function having the required pattern
$ awk -v RS= -v ORS="\n\n" '/MARKERFUNCTION/{$0 = "--Text to add--\n" $0} 1' file.c
void foo () {
bar();
}
--Text to add--
void example () {
MARKERFUNCTION();
...
}
This reads the input file paragraph wise and if a paragraph contains the required pattern, add a line before printing
Using tac with awk:
tac file.c | awk '/MARKERFUNCTION/{p=NR} p && NR==p+2{print "new line added"} 1' | tac
void foo () {
bar();
}
# <- I want a line to be added here!
new line added
void example () {
MARKERFUNCTION();
...
}
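If neither a blank line before the function nor a `() {` header can be relied on, another option is to hold back one line in awk and emit the new text just before the held line whenever the current line contains the marker. This is only a sketch; the function name insert_above_header is hypothetical, and it assumes the marker is never on the first line of the file.

```shell
# Hypothetical helper: insert_above_header MARKER TEXT FILE
insert_above_header() {
  awk -v marker="$1" -v text="$2" '
    NR > 1 {
      # The held line sits directly above the current one; if the current
      # line contains the marker (literal match), the text goes above it.
      if (index($0, marker)) print text
      print prev
    }
    { prev = $0 }
    END { if (NR) print prev }   # flush the last held line
  ' "$3"
}
```

Called as insert_above_header MARKERFUNCTION '== Text to add ==' file.c, it places the text on the line before the function header, whatever that header looks like.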

Nested dollar signs inside quotes

Trying to write a bash script containing nested dollar variables and I can't get it to work :
#!/bin/bash
sed '4s/.*/$(grep "remote.*$1" /home/txtfile)/' /home/target
The error says :
sed: -e expression #1, char 30: unknown option to `s'
The problem seems to come from $1, which needs to be replaced by the parameter passed on the bash command line; then the whole $(...) needs to be replaced by the command's output, so that line 4 of the target is replaced by that string.
Variable expansion and Command substitution won't be done when put inside single quotes, use double quotes instead:
sed "4s/.*/$(grep "remote.*$1" /home/txtfile)/" /home/target
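The difference between the two kinds of quotes can be checked in isolation. The snippet below is a small sketch; the argument value web01 is made up for the example.

```shell
set -- web01              # pretend the script was called with one argument
single='remote.*$1'       # single quotes: $1 is kept literally
double="remote.*$1"       # double quotes: the shell expands $1
printf '%s\n' "$single" "$double"
```

The first printed line still contains the text $1, while the second contains the argument value, which is what grep needs to receive.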
Your approach is wrong, the right way to do what you want is just one command, something like this (depending on your possible $1 values and input file contents which you haven't shown us):
awk -v tgt="remote.*$1" '
NR==FNR { if ($0 ~ tgt) str = str $0 ORS; next }
FNR==4 { printf "%s", str; next }
{ print }
' /home/txtfile /home/target

awk substitution ascii table rules bash

I want to perform a hierarchical set of (non-recursive) substitutions in a text file.
I want to define the rules in an ascii file "table.txt" which contains lines of blank space tabulated pairs of strings:
aaa 3
aa 2
a 1
I have tried to solve it with an awk script "substitute.awk":
BEGIN { while (getline < file) { subs[$1]=$2; } }
{ line=$0; for(i in subs)
{ gsub(i,subs[i],line); }
print line;
}
When I call the script giving it the string "aaa":
echo aaa | awk -v file="table.txt" -f substitute.awk
I get
21
instead of the desired "3". Permuting the lines in "table.txt" doesn't help. Can someone explain what the problem is here, and how to circumvent it? (This is a simplified version of my actual task, where I have a large file containing ASCII-encoded phonetic symbols which I want to convert into LaTeX code. The ASCII encoding of the symbols contains {$,&,-,%,[a-z],[0-9],...}.)
Any comments and suggestions are welcome!
PS:
Of course in this application for a substitution table.txt:
aa ab
a 1
an original string "aa" should be converted into "ab" and not "1b". That means a string that was produced by applying a rule must be left untouched.
How to account for that?
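The order of the loop for (i in subs) is undefined by default, so the rules are not applied in the order they appear in table.txt. If, for example, aa happens to be visited before aaa, the input aaa becomes 2a, and the a rule then turns that into 21.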
In newer versions of awk you can use PROCINFO["sorted_in"] to control the sort order. See section 12.2.1 Controlling Array Traversal and (the linked) section 8.1.6 Using Predefined Array Scanning Orders for details about that.
Alternatively, if you can't or don't want to do that you could store the replacements in numerically indexed entries in subs and walk the array in order manually.
To do that you will need to store both the pattern and the replacement in the value of the array and that will require some care to combine. You can consider using SUBSEP or any other character that cannot be in the pattern or replacement and then split the value to get the pattern and replacement in the loop.
Also note the caveats with getline listed on http://awk.info/?tip/getline and consider not calling it manually but instead using NR==FNR{...} and just listing table.txt as the first file argument to awk.
Edit: Actually, for the manual loop version you could also just keep two arrays one mapping input file line number to the patterns to match and another mapping patterns to replacements. Then looping over the line number array will get you the pattern and the pattern can be used in the second array to get the replacement (for gsub).
Instead of storing the replacements in an associative array, put them in two arrays indexed by integer (one array for the strings to replace, one for the replacements) and iterate over the arrays in order:
BEGIN { n = 0; while ((getline < file) > 0) { subs[n] = $1; repl[n++] = $2 } }
{ for (i = 0; i < n; i++) { gsub(subs[i], repl[i]) }
print
}
It seems like perl's zero-width word boundary is what you want. It's a pretty straightforward conversion from the awk:
#!/usr/bin/env perl
use strict;
use warnings;
my %subs;
BEGIN{
open my $f, '<', 'table.txt' or die "table.txt:$!";
while(<$f>) {
my ($k,$v) = split;
$subs{$k}=$v;
}
}
while(<>) {
while(my($k, $v) = each %subs) {
s/\b$k\b/$v/g;
}
print;
}
Here's an answer pulled from another StackExchange site, from a fairly similar question: Replace multiple strings in a single pass.
It's slightly different in that it does the replacements in inverse order by length of target string (i.e. longest target first), but that is the only sensible order for targets which are literal strings, as appears to be the case in this question as well.
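If the rules file is not already ordered that way, it can be reordered before use. The following is a small sketch using awk, sort and cut; the function name sort_rules_longest_first is an assumption for this example, and it expects the target string to be the first whitespace-separated field of each line.

```shell
# Hypothetical helper: sort_rules_longest_first TABLE_FILE
sort_rules_longest_first() {
  # Prefix each rule with the length of its target, sort on that number in
  # descending order, then strip the prefix again.
  awk 'NF >= 2 { print length($1), $0 }' "$1" | sort -k1,1nr | cut -d' ' -f2-
}
```

The sorted rules go to stdout, so the result can be written to a new file and used in place of the original table.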
If you have tcc installed, you can use the following shell function, which processes the file of substitutions into a lex-generated scanner which it then compiles and runs using tcc's compile-and-run option.
# Call this as: substitute replacements.txt < text_to_be_substituted.txt
# Requires GNU sed because I was too lazy to write a BRE
substitute () {
tcc -run <(
{
printf %s\\n "%option 8bit noyywrap nounput" "%%"
sed -r 's/((\\\\)*)(\\?)$/\1\3\3/;
s/((\\\\)*)\\?"/\1\\"/g;
s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
"$1"
printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
} | lex -t)
}
With gcc or clang, you can use something similar to compile a substitution program from the replacement list, and then execute that program on the given text. POSIX-standard c99 does not allow input from stdin, but gcc and clang are happy to do so provided you tell them explicitly that it is a C program (-x c). In order to avoid excess compilations, we use make (which needs to be gmake, GNU make).
The following requires that the list of replacements be in a file with a .txt extension; the cached compiled executable will have the same name with a .exe extension. If the makefile were in the current directory with the name Makefile, you could invoke it as make repl (where repl is the name of the replacement file without the .txt extension), but since that's unlikely to be the case, we'll use a shell function to actually invoke make.
Note that in the following file, the whitespace at the beginning of each line starts with a tab character:
substitute.mak
.SECONDARY:
%: %.exe
	@$(<D)/$(<F)
%.exe: %.txt
	@{ printf %s\\n "%option 8bit noyywrap nounput" "%%"; \
	   sed -r \
	   's/((\\\\)*)(\\?)$$/\1\3\3/; #\
	    s/((\\\\)*)\\?"/\1\\"/g; #\
	    s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
	   "$<"; \
	   printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"; \
	 } | lex -t | c99 -D_POSIX_C_SOURCE=200809L -O2 -x c -o "$@" -
Shell function to invoke the above:
substitute() {
gmake -f/path/to/substitute.mak "${1%.txt}"
}
You can invoke the above command with:
substitute file
where file is the name of the replacements file. (The filename must end with .txt but you don't have to type the file extension.)
The format of the input file is a series of lines consisting of a target string and a replacement string. The two strings are separated by whitespace. You can use any valid C escape sequence in the strings; you can also \-escape a space character to include it in the target. If you want to include a literal \, you'll need to double it.
If you don't want C escape sequences and would prefer to have backslashes not be metacharacters, you can replace the sed program with a much simpler one:
sed -r 's/([\\"])/\\\1/g' "$<"; \
(The ; \ is necessary because of the way make works.)
a) Don't use getline unless you have a very specific need and fully understand all the caveats, see http://awk.info/?tip/getline
b) Don't use regexps when you want strings (yes, this means you cannot use sed).
c) The while loop needs to constantly move beyond the part of the line you've already changed or you could end up in an infinite loop.
You need something like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
while ( sstart = index(tail,old) ) {
$0 = $0 substr(tail,1,sstart-1) new
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
print
}
$ echo aaa | awk -f substitute.awk table.txt -
3
$ echo aaaa | awk -f substitute.awk table.txt -
31
and adding some RE metacharacters to table.txt to show they are treated just like every other character and showing how to run it when the target text is stored in a file instead of being piped:
$ cat table.txt
aaa 3
aa 2
a 1
. 7
\ 4
* 9
$ cat foo
a.a\aa*a
$ awk -f substitute.awk table.txt foo
1714291
Your new requirement requires a solution like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
delete news
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
charPos = 0
while ( sstart = index(tail,old) ) {
charPos += sstart
news[charPos] = new
$0 = $0 substr(tail,1,sstart-1) RS
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
numChars = split($0, olds, "")
$0 = ""
for (charPos=1; charPos <= numChars; charPos++) {
$0 = $0 (charPos in news ? news[charPos] : olds[charPos])
}
print
}
$ cat table.txt
1 a
2 b
$ echo "121212" | awk -f substitute.awk table.txt -
ababab
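The "leave already replaced text untouched" requirement can also be approached with a single left-to-right scan: at each position the rules are tried in table order, the first literal match is replaced and skipped, and otherwise one character is copied. The sketch below is an alternative under the assumption that table.txt lists longer targets before their prefixes; the function name substitute_once is made up for this example and is not taken from the answers above.

```shell
# Hypothetical helper: substitute_once TABLE_FILE [INPUT_FILE...]
# Reads stdin when no input file is given.
substitute_once() {
  table=$1; shift
  [ $# -eq 0 ] && set -- -
  awk '
    # First file: remember the rules in the order they are listed.
    NR == FNR { if (NF == 2) { old[++n] = $1; new[n] = $2 }; next }
    {
      out = ""; rest = $0
      while (rest != "") {
        hit = 0
        for (i = 1; i <= n; i++) {
          # Literal comparison at the current position, no regex involved.
          if (substr(rest, 1, length(old[i])) == old[i]) {
            out = out new[i]                      # emit the replacement
            rest = substr(rest, length(old[i]) + 1)  # and never rescan it
            hit = 1
            break
          }
        }
        if (!hit) { out = out substr(rest, 1, 1); rest = substr(rest, 2) }
      }
      print out
    }
  ' "$table" "$@"
}
```

With the table aa ab / a 1, echo aa | substitute_once table.txt prints ab, because the b produced by the first rule is never scanned again.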
