How to decrement (subtract) number in file with sed - bash

I've got some source code like the following where I call a function in C:
void myFunction (
&((int) table[1, 0]),
&((int) table[2, 0]),
&((int) table[3, 0])
);
...the only problem is that the function has >300 parameters (it's an auto-generated wrapper for initialising and calling a whole module; it was given to me and I cannot change it). And as you can see: I began accessing the array with a 1 instead of a 0... Great times, modifying all the 300 parameters, i.e. decrasing 300 x the x-coordinate of the array, by hand.
The solution I am looking for is how I could force sed to to do the work for me ;)
EDIT: Please note that the syntax above for accessing a two-dimensional array in C is wrong anyway! Of course it should be [1][0]... (so don't just copy-and-paste ;))

Basically, the command I came up with, was the following:
sed -r 's/(.*)(table\[)([0-9]+)(,)(.*)/echo "\1\2$((\3-1))\4\5"/ge' inputfile.c > outputfile.c
Well, this does not look very intuitive on the first sight - and I was missing good explanations for nearly every example I found.
So I will try to give a detailed explanation on this:
sed
--> basic command
-r
--> most examples you find are using -e; however, the -r parameter (only works with GNU sed) enables extended regular expressions and brings support for the + in a regex. It basically means "one or more matches".
's/input/output/ge'
--> this is the basic replacement syntax. It basically means "replace 'input' by 'output'". The /g is a "global" flag, i.e. sed will replace all occurences and not only the first one. You can add an additional e to execute the result in the bash. This is what we want to do here to handle the calculation.
(.*)
--> this matches "everthing" from the last match to the next match
(table\[)
--> the \ is to escape the bracket. This part of the expression will match Strings like table[
([0-9]+)
--> this one matches numbers with at least one digit, however, it can also match higher numbers with more than only one digit.
(,)
--> this simply matches the comma ,
(.*)
--> and again: the rest of the line
And now the interesting part:
echo "\1\2$((\3-1))\4\5"
the echo is a bash command
the \n (you can use every value from \1 up to \9) is some kind of "variable" for the inputs: \1 will contain the first match, \2 the seconds match, ... --> this helps you to preserve parts of the input string
the $((1+1)) is a simple bash syntax to calculate the value of the term inside the double brackets (in the complete sed command above, the \3 will of course be automatically replaced by the 3rd match, i.e. the 1st part inside the brackets to access the table's cells)
please note that we use quotation marks around the echo content to also be able to process lines with characters like & which would otherwise not work
The already mentioned e of \ge at the end will trigger the execution of the result in the bash. E.g. the first two lines of the example source code in the question would produce the following bash statements:
echo "void myFunction ("
echo " &((int) table[$((1-1)), 0]),"
which is being executed and results in the following output:
void myFunction (
&((int) table[0, 0]),
...which is exatcly what I wanted :)
BTW:
text > output.c
is simple bash syntax to output text (or in this case the sed-processed source code) to a file called output.c.
Good links about this topic are:
sed basics
regular expressions basics
Ahh and one more thing: You can also use sed in the git-Bash on Windows - if you are "forced" to use Windows at work like me ;)
PS: In the meantime I could have easily done this by hand but using sed was a lot more fun ;)

Here's another way you could do it, using Perl:
perl -pe 's/(table\[)(\d+)(,)/$1.($2-1).$3/e' file.c
This uses the e modifier to execute an expression in the replacement. The capture groups are concatenated together but the middle group has 1 subtracted from its value.
This will output to standard output so you can check that it does what you want. When you're happy, you can add the -i switch to overwrite the original file.

Related

Sed keep original indentation and camel-casing a variable

I have a simple sed script and I am replacing a bunch of lines in my application dynamically with a variable, the variable is a list of strings.My function works but does not keep the original indentation.the function deletes the line if it contains the certain string and replaces the line with a completely new line, I could not do a replace due to certain syntax restrictions.
How do I keep my original indentation when the line is replaced
Can I capitalize my variable and remove the underscore on the fly, i.e. the title is a capitalize and underscore removed version of the variableName, the list of items in the variable array is really long so I am trying to do this in one shot.
Ex: I want report_type -> Report Type done mid process
Is there a better way to solve this with sed? Thanks for any inputs much appreciated.
sed function is as follows
variableName=$1
sed -i "/name\=\"${variableName}\.name\" value\=model\.${variableName}\.name options\=\#lists\./c\\{\{\> \_dropdown title\=\"${variableName}\" required\=true name\=\"${variableName}\"\}\}" test
SAMPLE INPUT
{{> _select title="Report Type" required=true name="report_type.name" value=model.report_type.name options=#lists.report_type}}
SAMPLE EXPECTED OUPUT
{{> _dropdown title="Report Type" required=true name="report_type" value=model.report_type.name}}
sample input variable
report_type
Try this:
sed -E "s/^(\s+).*name\=\"(report_type)\.name\" value\=model\.report_type\.name options\=\#lists\..*$/\1\{\{\> \_dropdown title\=\"\2\" required\=true name\=\"\2\"\}\}/;T;s/\"(\w+)_(\w+)\"/\"\u\1 \u\2\"/g" input.txt > output.txt
I used "report_type" instead of ${variableName} for testing as an sed one-liner.
Please change back to ${variableName}.
Then go back to using -i (in addition to -E, which is for extended regex).
I am not sure whether I can do it without extended regex, let me know if that is necessary.
use s/// to replace fine tuned line
first capture group for the white space making the indentation
second capture group for the variable name
stop if that did not replace anything, T;
another s///
look for something consisting of only letters between "",
with a "_" between two parts,
seems safe enough because this step is only done on the already replaced line
replace by two parts, without "_"
\u for making camel case
Note:
Doing this on your sample input creates two very similar lines.
I assume that is intentional. Otherwise please provide desired output.
Using GNU sed version 4.2.1.
Interesting line of output:
{{> _dropdown title="Report Type" required=true name="Report Type"}}

Delete string from a file with beginning and ending tag like <ex></ex> in a shell script

I am writing a bash script to uncomment a comment with beginning of a tag
ex;
/*<O33>*/
// here my code
/*</O33>*/
Here's my script, and I successfully uncomment it.
sed -i "/^[ \t]*\/\*<O33>\*\//,/^[ \t]*\/\*<\/O33>\*\//s/\/\///g" $Path/DebugVersion.c
and this is the result:
/*<O33>*/
here my code
/*</O33>*/
Now I'm trying the reverse the process, to recomment the string between the begin and end tags, but I don't know how to do it.
The task of reinserting the comments is a little trickier because you do not want to insert the // markers on the start and end lines. I ended up with a file, script.sed, that contained:
\#/\*<O33>\*/#, \#/\*</O33>\*/# {
\#/\*</\{0,1\}O33>\*/# ! s%\([[:space:]]*\)%&//%
}
I then ran:
$ sed -f script.sed data
/*<O33>*/
//here my code
/*</O33>*/
$
Your pattern matching is complicated by the presence of / and * in the material to be matched. One way around this is to change sed's search character by using, in this case, \# to tell it that # marks the end of the pattern. (You can choose any other character that's convenient: = or % work well too.) The first line has the form patt1, patt2{ where the first pattern looks for /*<O33*/ (note that is a capital letter O, not a zero) and the second looks for /*</O33>*/. The { starts grouping the commands. It means that lines in the range from the first pattern to the second pattern will be subjected to the commands contained within the matching braces { … }.
The second line uses a slightly different variant on the pattern match to identify both begin and end lines (the /\{0,1\} is the classic sed way of finding 0 or 1 instances of /; if you use modern extended regexes, it is equivalent to /?). The ! operator inverts the sense of the match; only lines that do not match the end tags will be subjected to the s%%% substitution. This avoids inserting // in front of /*<O33>*/, for example.
The s%\([[:space:]]*\)%&//% operation looks for zero or more leading white space characters (blanks or tabs), and adds // after them. Again, this avoids using s/// because / is part of the replacement text.
This can all be squished into one line rather than needing a separate script file, but I like script files because I don't have to fight the shell. That's not a big problem here; there are no quotes — single, double or back — to confuse things. Use single quotes around the script unless you need to interpolate shell variables into the script. If you must use double quotes, you have to double up the backslashes, and it rapidly gets fiddly. Avoid double quotes around the script if you can.
$ sed '\#/\*<O33>\*/#, \#/\*</O33>\*/# { \#/\*</\{0,1\}O33>\*/# ! s%\([[:space:]]*\)%&//%; }' data
/*<O33>*/
//here my code
/*</O33>*/
$
The trailing semicolon is optional in GNU sed; it is necessary in BSD (macOS) and classic sed.
I note in passing that in the decommenting sed script, you have:
s/\/\///g
The g is probably not a good idea. If you have:
/*<O33>*/
// printf("%s\n", __func__); // Identify the function!
/*</O33*/
Then the uncommenting operation will leave:
/*<O33>*/
printf("%s\n", __func__); Identify the function!
/*</O33*/
removing the second comment marker too. The code won't compile. Drop the g; you only want to remove the first comment marker on the lines. (Clearly, if you only use // comments to disable blocks of code, this isn't critical — but it is safer in the long run.)
Also note that if you're working in C or C++ with the C preprocessor available, you'd do better to use:
#ifdef O33
printf("%s\n", __func__); // Identify the function!
#endif
as you don't have to edit the file to enable or disable the code; you can simply recompile with or without -DO33 or equivalent.
i learn from Jonathan's answer and SLePort's answer and i get it.
sed -i "/^[ \t]*\/\*<O33>\*\//,/^[ \t]*\/\*<\/O33>\*\//d" $Path/DebugVersion.c

Substitution of substring doesn't work in bash (tried sed, ${a/b/c/})

Before to write, of course I read many other similar cases. Example I used #!/bin/bash instead of #!/bin/sh
I have a very simple script that reads lines from a template file and wants to replace some keywords with real data. Example the string <NAME> will be replaced with a real name. In the example I want to replace it with the word Giuseppe. I tried 2 solutions but they don't work.
#!/bin/bash
#read the template and change variable information
while read LINE
do
sed 'LINE/<NAME>/Giuseppe' #error: sed: -e expression #1, char 2: extra characters after command
${LINE/<NAME>/Giuseppe} #error: WORD(*) command not found
done < template_mail.txt
(*) WORD is the first word found in the line
I am sorry if the question is too basic, but I cannot see the error and the error message is not helping.
EDIT1:
The input file should not be changed, i want to use it for every mail. Every time i read it, i will change with a different name according to the receiver.
EDIT2:
Thanks your answers i am closer to the solution. My example was a simplified case, but i want to change also other data. I want to do multiple substitutions to the same string, but BASH allows me only to make one substitution. In all programming languages i used, i was able to substitute from a string, but BASH makes this very difficult for me. The following lines don't work:
CUSTOM_MAIL=$(sed 's/<NAME>/Giuseppe/' template_mail.txt) # from file it's ok
CUSTOM_MAIL=$(sed 's/<VALUE>/30/' CUSTOM_MAIL) # from variable doesn't work
I want to modify CUSTOM_MAIL a few times in order to include a few real informations.
CUSTOM_MAIL=$(sed 's/<VALUE1>/value1/' template_mail.txt)
${CUSTOM_MAIL/'<VALUE2>'/'value2'}
${CUSTOM_MAIL/'<VALUE3>'/'value3'}
${CUSTOM_MAIL/'<VALUE4>'/'value4'}
What's the way?
No need to do the loop manually. sed command itself runs the expression on each line of provided file:
sed 's/<NAME>/Giuseppe/' template_mail.txt > output_file.txt
You might need g modifier if there are more appearances of the <NAME> string on one line: s/<NAME>/Giuseppe/g

Understanding 'sed' command

I am currently trying to install GCC-4.1.2 on my machine: Fedora 20.
In the instruction, the first three commands involve using 'sed' commands, for Makefile modification. However, I am having difficulty in using those commands properly for my case. The website link for GCC-4.1.2.
The commands are:
sed -i 's/install_to_$(INSTALL_DEST) //' libiberty/Makefile.in &&
sed -i 's#\./fixinc\.sh#-c true#' gcc/Makefile.in &&
sed -i 's/#have_mktemp_command#/yes/' gcc/gccbug.in &&
I am trying to understand them by reading the 'sed' man page, but it is not so easy to do so. Any help/tip would be appreciated!
First, the shell part: &&. That just chains the commands together, so each subsequent line will only be run if the prior one is run successfully.
sed -i means "run these commands inline on the file", that is, modify the file directly instead of printing the changed contents to STDOUT. Each sed command here (the string) is a substitute command, which we can tell because the command starts with s.
Substitute looks for a piece of text in the file, and then replaces it. So the order is always s/needle/replacement/. See how the first and last lines have those same forward-slashes? That's the traditional delimiter between the command (substitute), the needle to find in the haystack (install_to_$(INSTALL_DEST), and the text to replace it with ().
So, the first one looks for the string and deletes it (the empty replacement). The last one looks for #have_mktemp_command# and replaces it with yes.
The middle one is a bit weird. See how it starts with s# instead of s/? Well, sed will let you use any delimiter you like to separate the needle from the replacement. Since this needle had a / in it (\./fixinc\.sh), it made sense to use a different delimiter than /. It will replace the text ./fixinc.sh with -c true.
Last note: Why does the second needle have \. instead of .? Well, in a Regular Expression like the needle is (but not used in your example), some characters are magical and do magical fairy dust operations. One of those magic characters is .. To avoid the magic, we put a \ in front of it, escaping away from the magic. (The magic is "match any character", and we want a literal period. That's why.)

Why does sed not replace overlapping patterns

I have a database unload file with field separated with the <TAB> character. I am running this file through sed to replace any occurences of <TAB><TAB> with <TAB>\N<TAB>. This is so that when the file is loaded into MySQL the \N in interpreted as NULL.
The sed command 's/\t\t/\t\N\t/g;' almost works except that it only replaces the first instance e.g. "...<TAB><TAB><TAB>..." becomes "...<TAB>\N<TAB><TAB>...".
If I use 's/\t\t/\t\N\t/g;s/\t\t/\t\N\t/g;' it replaces more instances.
I have a notion that despite the /g modifier this is something to do with the end of one match being the start of another.
Could anyone explain what is happening and suggest a sed command that would work or do I need to loop.
I know I could probably switch to awk, perl, python but I want to know what is happening in sed.
Not dissimilar to the perl solution, this works for me using pure sed
With #Robin A. Meade improvement
sed ':repeat;
s|\t\t|\t\n\t|g;
t repeat'
Explanation
:repeat is a label, used for branch commands, similar to batch
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global flag because if you have, say, 15 tabs, you will only need to loop twice, rather than 14 times.
t repeat means if the "s" command did any replaces, then goto the label repeat, else it goes onto the next line and starts over again.
So it goes like this. Keep repeating (goto repeat) as long as there is a match for the pattern of 2 tabs.
While the argument can be made that you could just do two identical global replaces and call it good, this same technique could work in more complicated scenarios.
As #thorn-blake points out, sed just doesn't support advanced features like lookahead, so you need to do a loop like this.
Original Answer
sed ':repeat;
/\t\t/{
s|\t\t|\t\n\t|g;
b repeat
}'
Explanation
:repeat is a label, used for branch commands, similar to batch
/\t\t/ means match the pattern 2 tabs. If the pattern it matched, the command following the second / is executed.
{} - In this case the command following the match command is a group. So all of the commands in the group are executed if the match pattern is met.
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global because if you have say 15 tabs, you will only need to loop twice, rather than 14 times.
b repeat means always goto (branch) the label repeat
Short version
Which can be shortened to
sed ':r;s|\t\t|\t\n\t|g; t r'
# Original answer
# sed ':r;/\t\t/{s|\t\t|\t\n\t|g; b r}'
MacOS
And the Mac (yet still Linux/Windows compatible) version:
sed $':r\ns|\t\t|\t\\\n\t|g; t r'
# Original answer
# sed $':r\n/\t\t/{ s|\t\t|\t\\\n\t|g; b r\n}'
Tabs need to be literal in BSD sed
Newlines need to be both literal and escaped at the same time, hence the single slash (that's \ before it is processed by the $, making it a single literal slash ) plus the \n which becomes an actual newline
Both label names (:r) and branch commands (b r when not the end of the expression) must end in a newline. Special characters like semicolons and spaces are consumed by the label name/branch command in BSD, which makes it all very confusing.
I know you want sed, but sed doesn't like this at all, it seems that it specifically (see here) won't do what you want. However, perl will do it (AFAIK):
perl -pe 'while (s#\t\t#\t\n\t#) {}' <filename>
As a workaround, replace every tab with tab + \N; then remove all occurrences of \N which are not immediately followed by a tab.
sed -e 's/\t/\t\\N/g' -e 's/\\N\([^\t]\)/\1/g'
... provided your sed uses backslash before grouping parentheses (there are sed dialects which don't want the backslashes; try without them if this doesn't work for you.)
Right, even with /g, sed will not match the text it replaced again. Thus, it's read <TAB><TAB> and output <TAB>\N<TAB> and then reads the next thing in from the input stream. See http://www.grymoire.com/Unix/Sed.html#uh-7
In a regex language that supports lookaheads, you can get around this with a lookahead.
Well, sed simply works as designed. The input line is scanned once, not multiple times. Maybe it helps to look at the consequences if sed used rescanning the input line to deal with overlapping patterns by default: in this case even simple substitutions would work quite differently--some might say counter-intuitively--, e.g.
s/^/ / inserting a space at the beginning of a line would never terminate
s/$/foo/ appending foo to each line - likewise
s/[A-Z][A-Z]*/CENSORED/ replacing uppercase words with CENSORED - likewise
There are probably many other situations. Of course these could all be remedied with, say, a substitution modifier, but at the time sed was designed, the current behavior was chosen.

Resources