sed command for inserting text inside single quote - bash

Suppose there's a text file with the following line:
export MYSQL_ADMIN=''
I want to insert text inside that single quote using the sed command, so that it changes to something like this for example:
export MYSQL_ADMIN='abc1'
What is the appropriate sed command for that in Linux?
I tried
sed -i -e ''/MYSQL_ADMIN/s/''/'abc1'/g"
but it didn't work.

Something like sed -i "s;export MYSQL_ADMIN=.*;export MYSQL_ADMIN='abc1';" /path/to/file.ext
-i modify file in place
s means substitute,
First block is what you are matching as an regular expression - the .* matches everything to the end of the line, this ensures you don't keep any text on that line after the substitue - and second block is what you are replacing with that match.
Always check the file after each run of sed if there is no error and check what changed.
To get the single quotes to print you may have to do ""'"" like ""'""abc1""'""

It is important to understand that although
I want to insert text inside that single quote using the sed command
is a perfectly good characterization of the effect you want to achieve, it does not map directly onto operations from sed's repertoire. With sed, the appropriate tool for most line modifications is the s command, which substitutes specified text for one or more matches to a specified regular expression. That would be the most natural thing to use for your case.
Additionally, it is important with sed to understand how and when to bind commands to specific lines. If you don't do that for a given command then it is applied to all lines. Sometimes that's fine, but other times it will produce unwanted results.
I tried
sed -i -e ''/MYSQL_ADMIN/s/''/'abc1'/g"
but it didn't work.
The two leading single quotes in that sed expression match each other, leaving the trailing double quote unmatched. Also, you do not specify the name of the file to modify. This variation would at least be valid shell syntax, and it would have the desired effect on the specified line appearing in file my_script:
sed -i -e "/MYSQL_ADMIN/s/''/'abc1'/g" my_script
That might also make other, unwanted changes, however.
You need to make some assumptions about the content of the file in order to do such a thing at all. The above depends on the text MYSQL_ADMIN and '' to appear on the same line only in the line(s) you want to modify. That may turn out to hold, but it seems unnecessarily risky. An assumption more likely to hold in general would be that there will be only one assignment to variable MYSQL_ADMIN, or that it is acceptable to modify all such assignments that assign a single-quote-delimited empty value.
Going with the latter, one might end up with this:
sed -i -e "s/\<MYSQL_ADMIN=''\(\s\|$\)/MYSQL_ADMIN='abc1'\1/g" my_script
The pattern \<MYSQL_ADMIN=''\(\s\|$\) improves on your plain MYSQL_ADMIN in these significant ways:
the \< causes it to match only immediately after a word boundary -- start of line, whitesepace, or punctuation. This prevents substitutions for other variables whose names happen to end with MYSQL_ADMIN. If you prefer, it would be even stronger to instead anchor the match to the beginning of the line with ^.
including the ='' in the pattern distinguishes between MYSQL_ADMIN and variables whose names contain that as an initial substring. It also ensures that the '' that gets replaced, if any, goes with the variable and does not merely appear somewhere else on the line.
the \(\s\|$\) both matches and captures either a whitespace character or the empty string at the end of a line. This distinguishes between assignments of an empty value and assignments of values that are merely prefixed by '' (which is valid if the file is a shell script). Having included it in the match, the capture allows the matched text, if any, to be preserved in the output (via the \1 in the replacement).
Because that matches the whole assignment, a complete assignment must appear in the replacement, too. On the other hand, this means that (probably) you can apply the command to every line, as shown, with no particular loss of efficiency relative to the previous command.
Even that might produce changes you didn't want, however, such as in comment lines or quoted text.

Related

Sed keep original indentation and camel-casing a variable

I have a simple sed script and I am replacing a bunch of lines in my application dynamically with a variable, the variable is a list of strings.My function works but does not keep the original indentation.the function deletes the line if it contains the certain string and replaces the line with a completely new line, I could not do a replace due to certain syntax restrictions.
How do I keep my original indentation when the line is replaced
Can I capitalize my variable and remove the underscore on the fly, i.e. the title is a capitalize and underscore removed version of the variableName, the list of items in the variable array is really long so I am trying to do this in one shot.
Ex: I want report_type -> Report Type done mid process
Is there a better way to solve this with sed? Thanks for any inputs much appreciated.
sed function is as follows
variableName=$1
sed -i "/name\=\"${variableName}\.name\" value\=model\.${variableName}\.name options\=\#lists\./c\\{\{\> \_dropdown title\=\"${variableName}\" required\=true name\=\"${variableName}\"\}\}" test
SAMPLE INPUT
{{> _select title="Report Type" required=true name="report_type.name" value=model.report_type.name options=#lists.report_type}}
SAMPLE EXPECTED OUPUT
{{> _dropdown title="Report Type" required=true name="report_type" value=model.report_type.name}}
sample input variable
report_type
Try this:
sed -E "s/^(\s+).*name\=\"(report_type)\.name\" value\=model\.report_type\.name options\=\#lists\..*$/\1\{\{\> \_dropdown title\=\"\2\" required\=true name\=\"\2\"\}\}/;T;s/\"(\w+)_(\w+)\"/\"\u\1 \u\2\"/g" input.txt > output.txt
I used "report_type" instead of ${variableName} for testing as an sed one-liner.
Please change back to ${variableName}.
Then go back to using -i (in addition to -E, which is for extended regex).
I am not sure whether I can do it without extended regex, let me know if that is necessary.
use s/// to replace fine tuned line
first capture group for the white space making the indentation
second capture group for the variable name
stop if that did not replace anything, T;
another s///
look for something consisting of only letters between "",
with a "_" between two parts,
seems safe enough because this step is only done on the already replaced line
replace by two parts, without "_"
\u for making camel case
Note:
Doing this on your sample input creates two very similar lines.
I assume that is intentional. Otherwise please provide desired output.
Using GNU sed version 4.2.1.
Interesting line of output:
{{> _dropdown title="Report Type" required=true name="Report Type"}}

Bash script " & " symbol creating an issues

I have a tag
<string name="currencysym">$</string>
in string.xml And what to change the $ symbol dynamically. Use the below command but didn't work:
currencysym=₹
sed -i '' 's|<string name="currencysym">\(.*\)<\/string>|<string name="currencysym">'"<!\[CDATA\[${currencysym}\]\]>"'<\/string>|g'
Getting OUTPUT:
<string name="currencysym"><![CDATA[<string name="currencysym">$</string>#x20B9;]]></string>
" & " Has Removed...
But I need:
<string name="currencysym"><![CDATA[₹]]></string>
using xml-parser/tool to handle xml is first choice
instead of <..>\(.*\)<..> better use <..>\([^<]*\)<..> in case you have many tags in one line
& in replacement has special meaning, it indicates the whole match (\0) of the pattern. That's why you see <...>..</..> came to your output. If you want it to be literal, you should escape it -> \&
First problem is the line
currencysym=₹
This actually reads as "assign empty to currencysym and start no process in the background":
In bash you can set an environment variable (or variables) just or one run of a process by doing VAR=value command. This is how currencysym= is being interpreted.
The & symbol means start process in the background, except there is no command specified, so nothing happens.
Everything after # is interpreted as a comment, so #x20B9; is just whitespace from Bash's point of view.
Also, ; is a command separator, like &, which means "run in foreground". It is not used here because it is commented out by #.
You have to either escape &, # and ;, or just put your string into single quotes: currencysym=\&\#x20B9\; or currencysym='₹'.
Now on top of that, & has a special meaning in sed, so you will need to escape it before using it in the sed command. You can do this directly in the definition like currencysym=\\\&\#x20B9\; or currencysym='\₹', or you can do it in your call to sed using builtin bash functionality. Instead of accessing ${currencysym}, reference ${currencysym/&/\&}.
You should use double-quotes around variables in your sed command to ensure that your environment variables are expanded, but you should not double-quote exclamation marks without escaping them.
Finally, you do not need to capture the original currency symbol since you are going to replace it. You should make your pattern more specific though since the * quantifier is greedy and will go to the last closing tag on the line if there is more than one:
sed 's|<string name="currencysym">[^<]*</string>|<string name="currencysym"><![CDATA['"${currencysym/&/\&}"']]></string>|' test.xml
Yields
<string name="currencysym"><![CDATA[₹]]></string>
EDIT
As #fedorqui points out, you can use this example to show off correct use of capture groups. You could capture the parts that you want to repeat exactly (the tags), and place them back into the output as-is:
sed 's|\(<string name="currencysym">\)[^<]*\(</string>\)|\1<![CDATA['"${currencysym/&/\&}"']]>\2|' test.xml
sed -i '' 's|\(<string name="currencysym">\)[^<]*<|\1<![CDATA[\₹]]><|g' YourFile
the group you keep in buffer is the wrong one in your code, i keep first part not the &
grouping a .* is not the same as all untl first < you need. especially with the goption meaning several occurence could occur and in this case everithing between first string name and last is the middle part (your group).
carrefull with & alone (not escaped) that mean 'whole search pattern find' in replacement part

Unable to remove a value from a text file using -sed

I'm trying to remove an ID number from a text file using a series of commands (using terminal), but they don't seem to be working. I need to remove the number and the associated "ID" text
Text in File:
{"id":"098765432"}
Commands I've been using (but don't seem to be working):
sed -i.bak 's/"id":[0-9]\{1,\},//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{1,\}",//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{9,\}",//g' ./Filename.txt
sed -i.bak 's/"id":[0-9]\{9,\},//g' ./Filename.txt
sed -i.bak 's/"[0-9]\{1,\}",//g' ./Filename.txt
Thanks for the help :)
As #Wintermute already noted in the comment, the problem is in the comma before //. However, I am going to explain the whole line, just so the others may understand it completely, in case something is not clear to those who come across this question later.
So, the proper command that will satisfy your requirement is:
sed -i.bak 's/"id":"[0-9]\{1,\}"//g' ./Filename.txt
sed is the command that calls stream editor.
Flag -i is the flag used to represent editing files in place (it makes backup if extension is supplied). In this case, extension written is .bak and indeed the backup file (containing initial context of our file) is created with the original name + the extension provided.
Argument 's/"id":"[0-9]{1,}"//g' is the argument given to the sed command.
Since this argument (regular expression in it) was the cause of the problem, I am going to explain it in detail.
First part we should notice is that its structure is s/Regex/Replacement/g where
Regex = "id":"[0-9]{1,}"
Replacement = nothing (literally nothing, not even blank space)
So basically, as described by Bruce Barnett, s stands for substitution. Regex is the part we will replace with the Replacement. At the end, letter g means that we will change more than just one occurrence of this regex per line (without g, it would replace just the first occurrence in every line, no matter how many are there).
And at the end we have ./Filename.txt, which is the source file we are applying this command on (./ means that the file is in the same directory from where we are running this command).
About the regex used ("id":"[0-9]{1,}"):
It starts with the literals ("id":") and this part will match literally any part in the file which is exactly the same as this one. Next, we have ([0-9]{1,}), which means that we want to, in addition to the first part, look for the at least one occurrence of a number (but it can be more of them, as the matched example from the question shows).
Now you may understand why comma caused this problem. There is no comma in the original text in the file. Thus, none of the commands tried (since all of them contain comma) worked. Of course, some of them have even more reasons.
EDIT: As #ghoti pointed out, replacement is not a regex. It is the string we will put at the place(s) that are found by our regex expression. So in this case, our replacement is blank string (since we want to delete the specified part).

Error while executing sed command

I am trying to execute script with commands:
sed -i "USER/c\export USER=${signumid}" .bashrc
sed -i "DEVENVHOME=$/c\export DEVENVHOME=${DEVENVHOME:-/home/${signumid}/CPM_WORKAREA/devenv.x}" .bashrc
 
I want to replace the line with string "USER" in .bashrc with export USER=${signumid} where $signumid variable is being provided through Cygwin prompt. Similarly I want to replace line with string DEVENVHOME=$ with export DEVENVHOME=${DEVENVHOME:-/home/${signumid}/CPM_WORKAREA/devenv.x} in .bashrc file, where $signumid variable is provided through Cygwin prompt.
But I am getting following errors on Cygwin termminal.:
sed: -e expression #1, char 1: unknown command: `U'
sed: -e expression #1, char 3: extra characters after command
The general syntax of a sed script is a sequence of address command arguments statements (separated by newline or semicolon). The most common command is the s substitution command, with an empty address, so we can perhaps assume that that is what you want to use here. You seem to be attempting to interpolate a shell variable $signumid which adds a bit of a complication to this exposition.
If your strings were simply static text, it would make sense to use single quotes; then, the shell does not change the text within the quotes at all. The general syntax of the s command is s/regex/replacement/ where the slash as the argument separator is just a placeholder, as we shall soon see.
sed -i 's/.*USER.*/export USER=you/
s% DEVENVHOME=\$%export DEVENVHOME=${DEVENVHOME:-/home/you/CPM_WORKAREA/devenv.x}%' .bashrc
This will find any line with USER and substitute the entire line with export USER=you; then find any line which contains DEVENVHOME=$ (with a space before, and a literal dollar character) and replace the matched expression with the long string. Because the substitution string uses slashes internally, we use a different regex separator % -- alternatively, we could backslash-escape the slashes which are not separators, but as we shall see, that quickly becomes untenable when we add the following twist. Because the dollar sign has significance as the "end of line" metacharacter in regular expressions, we backslash-escape it.
I have ignored the c\ in your attempt on the assumption that it is simply a misunderstanding of sed syntax. If it is significant, what do you hope to accomplish with it? c\export is not a valid Bash command, so you probably mean something else, but I cannot guess what.
Now, to interpolate the value of the shell variable signumid into the replacement, we cannot use single quotes, because those inhibit interpolation. You have correctly attempted to use double quotes instead (in your edited question), but that means we have to make some additional changes. Inside double quotes, backslashes are processed by the shell, so we need to double all backslashes, or find alternative constructs. Fortunately for us, the only backslash is in \$ which can equivalently be expressed as [$], so let's switch to that notation instead. Also, where a literal dollar sign is wanted in the replacement string, we backslash-escape it in order to prevent the shell from processing it.
sed -i "s/.*USER.*/export USER=$signumid/
s% DEVENVHOME=[$]%export DEVENVHOME=\${DEVENVHOME:-/home/$signumid/CPM_WORKAREA/devenv.x}%" .bashrc
Equivalenty, you could use single quotes around the parts of the script which are meant to be untouched by the shell, and then put an adjacent double-quoted string around the parts which need interpolation, like
'un$touched*by$(the!shell)'"$signumid"'more$[complex]!stuff'
This final script still rests on a number of lucky or perhaps rather unlucky guesses about what you actually want. On the first line, I have changed just USER to a regular expression which matches the entire line -- maybe that's not what you want? On the other hand, the second line makes the opposite assumption, just so you can see the variations -- it only replaces the actual text we matched. Probably one or the other needs to be changed.
Finally, notice how the two separate sed commands have been conflated into a single script. Many newcomers do not realize that sed is a scripting language which accepts an arbitrary number of commands in a script, and simply treat it as a "replace" program with a funny syntax.
Another common source of confusion is the evaluation order. The shell processes the double-quoted string even before sed starts to execute, so if you have mistakes in the quoting, you can easily produce syntax errors in the sed script which lead to rather uninformative error messages (because what sed tells you in the error message is based on what the script looks like after the shell's substutions). For example, if signumid contains slashes, it will produce syntax errors, because sed will see those as terminating separators for the s/// command. An easy workaround is to switch to a separator which does not occur in the value of signumid.

Why does sed not replace overlapping patterns

I have a database unload file with field separated with the <TAB> character. I am running this file through sed to replace any occurences of <TAB><TAB> with <TAB>\N<TAB>. This is so that when the file is loaded into MySQL the \N in interpreted as NULL.
The sed command 's/\t\t/\t\N\t/g;' almost works except that it only replaces the first instance e.g. "...<TAB><TAB><TAB>..." becomes "...<TAB>\N<TAB><TAB>...".
If I use 's/\t\t/\t\N\t/g;s/\t\t/\t\N\t/g;' it replaces more instances.
I have a notion that despite the /g modifier this is something to do with the end of one match being the start of another.
Could anyone explain what is happening and suggest a sed command that would work or do I need to loop.
I know I could probably switch to awk, perl, python but I want to know what is happening in sed.
Not dissimilar to the perl solution, this works for me using pure sed
With #Robin A. Meade improvement
sed ':repeat;
s|\t\t|\t\n\t|g;
t repeat'
Explanation
:repeat is a label, used for branch commands, similar to batch
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global flag because if you have, say, 15 tabs, you will only need to loop twice, rather than 14 times.
t repeat means if the "s" command did any replaces, then goto the label repeat, else it goes onto the next line and starts over again.
So it goes like this. Keep repeating (goto repeat) as long as there is a match for the pattern of 2 tabs.
While the argument can be made that you could just do two identical global replaces and call it good, this same technique could work in more complicated scenarios.
As #thorn-blake points out, sed just doesn't support advanced features like lookahead, so you need to do a loop like this.
Original Answer
sed ':repeat;
/\t\t/{
s|\t\t|\t\n\t|g;
b repeat
}'
Explanation
:repeat is a label, used for branch commands, similar to batch
/\t\t/ means match the pattern 2 tabs. If the pattern it matched, the command following the second / is executed.
{} - In this case the command following the match command is a group. So all of the commands in the group are executed if the match pattern is met.
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global because if you have say 15 tabs, you will only need to loop twice, rather than 14 times.
b repeat means always goto (branch) the label repeat
Short version
Which can be shortened to
sed ':r;s|\t\t|\t\n\t|g; t r'
# Original answer
# sed ':r;/\t\t/{s|\t\t|\t\n\t|g; b r}'
MacOS
And the Mac (yet still Linux/Windows compatible) version:
sed $':r\ns|\t\t|\t\\\n\t|g; t r'
# Original answer
# sed $':r\n/\t\t/{ s|\t\t|\t\\\n\t|g; b r\n}'
Tabs need to be literal in BSD sed
Newlines need to be both literal and escaped at the same time, hence the single slash (that's \ before it is processed by the $, making it a single literal slash ) plus the \n which becomes an actual newline
Both label names (:r) and branch commands (b r when not the end of the expression) must end in a newline. Special characters like semicolons and spaces are consumed by the label name/branch command in BSD, which makes it all very confusing.
I know you want sed, but sed doesn't like this at all, it seems that it specifically (see here) won't do what you want. However, perl will do it (AFAIK):
perl -pe 'while (s#\t\t#\t\n\t#) {}' <filename>
As a workaround, replace every tab with tab + \N; then remove all occurrences of \N which are not immediately followed by a tab.
sed -e 's/\t/\t\\N/g' -e 's/\\N\([^\t]\)/\1/g'
... provided your sed uses backslash before grouping parentheses (there are sed dialects which don't want the backslashes; try without them if this doesn't work for you.)
Right, even with /g, sed will not match the text it replaced again. Thus, it's read <TAB><TAB> and output <TAB>\N<TAB> and then reads the next thing in from the input stream. See http://www.grymoire.com/Unix/Sed.html#uh-7
In a regex language that supports lookaheads, you can get around this with a lookahead.
Well, sed simply works as designed. The input line is scanned once, not multiple times. Maybe it helps to look at the consequences if sed used rescanning the input line to deal with overlapping patterns by default: in this case even simple substitutions would work quite differently--some might say counter-intuitively--, e.g.
s/^/ / inserting a space at the beginning of a line would never terminate
s/$/foo/ appending foo to each line - likewise
s/[A-Z][A-Z]*/CENSORED/ replacing uppercase words with CENSORED - likewise
There are probably many other situations. Of course these could all be remedied with, say, a substitution modifier, but at the time sed was designed, the current behavior was chosen.

Resources