Cannot understand a sed pattern - bash

My original issue was to be able to add a line at the end of a specific block in a configuration file.
############
# MY BLOCK #
############
VALUE1 = XXXXX
VALUE2 = YYYYY
MYNEWVALUE = XXXXX <<< I want to add this one
##############
# MY BLOCK 2 #
##############
To do this I used the following sed script and it works flawlessly (found it in another post):
sed -i -e "/# MY BLOCK #/{:a;n;/^$/!ba;i\MYNEWVALUE = XXXXX" -e '}' myfile
This worked perfectly when executed inside a shell script but I can't manage to use it directly in an interactive shell (it gave me an error: "!ba event not found"). To solve this, I tried to add '\' before '!ba' but now it gave me another error which tells me that '\' is an unknown command.
Could anyone explain where my mistake is on the above issue and how this script works?
Here is my understanding:
-i : insert new line (I think the first one is useless, am I right?)
-e : execute this sed script (I don't understand why there is a second one at the end to close the })
:a : begin a loop
n : read each line with the pattern ^$ (empty lines)
! : reverse the loop
ba : end of the loop
Thanks !

Use ' instead of " to avoid having bash try to do history substitution on the !
If XXXXX contains a shell parameter expansion or somesuch, you can do it like this:
sed -i -e"/# $BLOCK_NAME "'#/{:a;n;/^$/!ba;i\'"$NEW_VAR = $NEW_VALUE" -e"}" myfile
The second -e is required to effectively insert a newline to close off the i command. You could actually insert the newline directly, instead:
sed -i -e"/# $BLOCK_NAME "'#/{:a;n;/^$/!ba;i\'"$NEW_VAR = $NEW_VALUE"$'\n}' myfile

:a introduces a label, named a.
n writes the current pattern space to output, and replaces the pattern space with the next line of input.
/^$/! means to match lines that are NOT (!) blank lines in pattern space; the following ba is a "branch to label a" when that match (not blank line) occurs.
If the branch doesn't occur, the i insert then takes place.
Use single quotes (') instead of double quotes (") on command line to prevent shell from performing shell substitutions (including the "$" and "!" characters).
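To see the loop in action, here is a minimal sketch that runs the single-quoted command on a throwaway file. It assumes GNU sed (for -i and the one-line i\ form) and a blank line after the block, since that is what /^$/ looks for. The temporary file and its contents are made up for illustration.

```shell
# Build a sample config; the empty string produces the blank line that ends the block.
tmp=$(mktemp)
printf '%s\n' \
  '############' '# MY BLOCK #' '############' \
  'VALUE1 = XXXXX' 'VALUE2 = YYYYY' '' \
  '##############' '# MY BLOCK 2 #' '##############' > "$tmp"

# From the header line, keep printing and reading lines (n) and branching
# back to label a until a blank line is reached, then insert before it.
sed -i -e '/# MY BLOCK #/{:a;n;/^$/!ba;i\MYNEWVALUE = XXXXX' -e '}' "$tmp"

result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

The new line should land directly after VALUE2 = YYYYY, just before the blank line that closes the first block.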

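In interactive shells, ! is used for history substitution, so you need to protect it. Inside double quotes a backslash before ! stops the history expansion but is still passed on to sed, which then reports an unknown command, so close the double quotes and put the ! in single quotes instead:
sed -i -e "/# MY BLOCK #/{:a;n;/^\$/"'!'"ba;i\MYNEWVALUE = XXXXX" -e '}' myfile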
You should also escape $, since it has special meaning inside doublequoted strings (although in this case it's OK, because it's followed by /, not a variable name).
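A quick way to check what a given quoting style actually hands to sed is to store the argument in a variable and print it. This is only a sketch; the variable names are made up for illustration. Note that inside double quotes a backslash before ! is kept as part of the string, while a backslash before $ is removed.

```shell
# Single quotes: every character is passed through untouched.
single='/^$/!ba'

# Double quotes with backslashes: \$ becomes $, but \! keeps its backslash.
double="/^\$/\!ba"

# Double quotes closed around a single-quoted !
mixed="/^\$/"'!'"ba"

printf '%s\n' "$single" "$double" "$mixed"
```

Only the first and third forms give sed the text /^$/!ba; the second one still contains the backslash.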

Related

What is the function of this sed shell script?

I was reading this example about setting up a cluster with pgpool and Watchdog and decided to give it a try as an exercise.
I'm far from being a master of shell scripting, but I could follow the documentation and modify it according to the settings of my virtual machines. But I don't get what is the purpose of the following snippet:
if [ ${PGVERSION} -ge 12 ]; then
sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
-e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${DEST_NODE_PGDATA}/postgresql.conf
fi
In my case PGVERSION will be 12 (so the script will execute the code after the condition), RECOVERYCONF is /usr/local/pgsql/data/myrecovery.conf and DEST_NODE_PGDATA is /usr/local/pgsql/data.
I get (please excuse and correct me if I'm wrong) that -e indicates that a script comes next, the $(some commands) part evaluates the expression and returns the result, and that the sed regular expression indicates that the '/'s will be replaced by \/ (a backslash followed by a forward slash). What is puzzling me are the "\\\$ainclude_if_exists =" and "/^include_if_exists" parts: I don't know what they mean, what they are intended for, or how they interact. Also, the -e after the first sed regular expression is confusing me.
If you are interested in the context, those commands are near the end of the /var/lib/pgsql/11/data/recovery_1st_stage example script.
Thanks in advance for your time.
Here's a tiny representation of the same code:
sed -i -e '$amyvalue = foo' -e '/^myvalue = foo/d' myfile.txt
The first sed expression is:
$ # On the last line
a # Append the following text
myvalue = foo # (text to be appended)
The second is:
/ # On lines matching regex..
^myvalue = foo # (regex to match)
/ # (end of regex)
d # ..delete the line
So it deletes any myvalue = foo line that may already exist, and then adds one such line at the end. The point is just to ensure that you have exactly one such line, by A. adding the line if it's missing, and B. not duplicating the line if it already exists.
The rest of the expression is merely complicated by the fact that this snippet uses variables and is embedded in a double quoted string that's being passed to a different host via ssh, and therefore requires some additional escaping both of the variables and of the quotes.
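As a rough check of the "exactly one line" behaviour, the tiny version can be run several times against a scratch file. This sketch assumes GNU sed (for -i and the one-line a form); the file contents are invented for the example.

```shell
tmp=$(mktemp)
printf '%s\n' 'port = 5432' 'myvalue = foo' 'listen = all' > "$tmp"

# Run the same command repeatedly; each run removes any existing copy
# and appends a fresh one after the last line.
for run in 1 2 3; do
  sed -i -e '$amyvalue = foo' -e '/^myvalue = foo/d' "$tmp"
done

count=$(grep -c '^myvalue = foo$' "$tmp")
last=$(tail -n 1 "$tmp")
rm -f "$tmp"
printf '%s\n' "copies: $count" "last line: $last"
```

However many times it runs, the file should end up with a single myvalue = foo line, and that line is the last one.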

Replace Double quotes with space

This is perhaps one of the most discussed topics here. I tried almost all the commands and other tweaks found here, but something doesn't seem to be working.
I want to replace all the double quotes in my file with whitespace/blank.
I'm seeing the error below when I try to execute this command.
sed "s/"/ \''/g' x_orbit.txt > new.tx
sed: -e expression #1, char 3: unterminated `s' command
You're close. Just use single quotes, so the shell doesn't try to expand the metacharacters in your sed command:
sed 's/"/ /g' x_orbit.txt > new.txt
You could try tr for example:
tr '"' ' ' < x_orbit.txt > new.txt
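Both suggestions can be tried on a string instead of a file. In this small sketch the sample text and variable names are made up.

```shell
input='say "hello" to "sed"'

# Every double quote becomes a single space in both cases.
via_sed=$(printf '%s\n' "$input" | sed 's/"/ /g')
via_tr=$(printf '%s\n' "$input" | tr '"' ' ')

printf '[%s]\n' "$via_sed" "$via_tr"
```

The brackets in the output are only there to make the trailing space visible.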
The script you provided:
sed "s/"/ \''/g' x_orbit.txt > new.tx
means:
sed # invoke sed to execute the following script:
" # enclose the script in double quotes rather than single so the shell can
# interpret it (e.g. to expand variables like $HOME) before sed gets to
# interpret the result of that expansion
s/ # replace what follows until the next /
" # exit the double quotes so the shell can now not only expand variables
# but can now do globbing and file name expansion on wildcards like foo*
/ # end the definition of the regexp you want to replace so it is null since
# after the shell expansion there was no text for sed to read between
# this / and the previous one (the 2 regexp delimiters)
\' # provide a blank then an escaped single quote for the shell to interpret for some reason
'/g' # enclose the /g in single quotes as all scripts should be quoted by default.
That is so far off the correct syntax it's kinda shocking, which is why I dissected it above to try to help you understand what you wrote, so you'll see why it doesn't work. Where did you get the idea to write it that way (or, to put it another way, what did you think each character in that script meant)? I'm asking because it indicates a fundamental misunderstanding of how quoting and escaping work in shell, so it'd be good if we could help correct that misunderstanding rather than just correct that script.
When you use any script or string in shell, simply always enclose it in single quotes:
sed 'script' file
var='string'
unless you NEED to use double quotes to let a variable expand, and then use double quotes unless you NEED to use no quotes to let globbing and file name expansion happen.
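The three levels of quoting can be compared side by side. The following is just a sketch: the variable names and the scratch directory are assumptions for the demo.

```shell
name='World'
single='Hello $name'   # single quotes: $name stays literal
double="Hello $name"   # double quotes: $name is expanded

# Globbing only happens when the wildcard is outside any quotes.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"
quoted=$(printf '%s\n' "$dir/*.txt" | wc -l)    # one literal string
unquoted=$(printf '%s\n' "$dir"/*.txt | wc -l)  # one line per matching file
rm -rf "$dir"

printf '%s\n' "$single" "$double"
```

Here the quoted pattern is printed as one line, while the unquoted one expands to the two files created in the scratch directory.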
An awk version:
awk '{gsub(/"/," ")}1' file
gsub is used for the replace
1 is always true, so line is printed

Understanding sed command

Please excuse if the question is too naive. I am new to shell scripting and am not able to find any good resource to understand the specifics. I am trying to make sense of a legacy script. Please can someone tell me what the following command does:
sed "s#s3AtlasExtractName#$i#g" load_xyz.sql >> load_abc.sql;
This command will replace all occurrences of s3AtlasExtractName with whatever $i is.
s - Substitute
# - Delimiter
s3AtlasExtractName - Word that needs substituting
# - Delimiter
$i - i variable that will be used to replace s3AtlasExtractName
# - Delimiter
g - Global Replace all instance of s3AtlasExtractName in a single line and not just the first occurrence of it
So this will parse through load_xyz.sql and change all occurrences of s3AtlasExtractName to the value of $i and append the whole of the contents of load_xyz.sql to a file called load_abc.sql with the sed substitutions.
sed is a command line stream editor. You can find information about it here:
http://www.computerhope.com/unix/used.htm
An easy example is shown below where sed is used to replace the word "test" with the word "example" in myfile.txt but output is sent to newfile.txt
sed 's/test/example/g' myfile.txt > newfile.txt
It seems that your script is performing a similar function by replacing the content of the load_xyz.sql file and storing it in a new file, load_abc.sql. Without more code I am just guessing, but it seems that the parameter $i could be used as a counter to insert similar but new values into the load_abc.sql file.
In short, this reads load_xyz.sql and replaces every occurrence of "s3AtlasExtractName" by whatever has been stored in the shell variable "i".
The long version is that sed accepts many subcommands with different formats. Any "simple" sed command will look like 'sed subcommand file'. The first letter of the subcommand tells you which operation sed is going to do with your files.
The "s" operation stands for "substitution" and is the most commonly used. It is followed by a Perl-like regexp: separator, regexp to look for, separator, value to substitute, separator, flags. In your case, the separator is '#', which is pretty unusual but not forbidden, so the command substitutes the value of $i for every instance of 's3AtlasExtractName'. The 'g' flag tells sed to replace every occurrence of the pattern (the default is to only replace its first occurrence on every line in the input).
Finally, the use of "$i" inside a double-quote-delimited string tells the shell to actually expand the shell variable 'i' so you'll want to look for a shell statement setting that (possibly a 'for' statement).
Hope this helps.
edit: I focused on the 'sed' part and kinda missed the redirection part. The '>>' token tells the shell to take the output of the sed command (i.e. the contents of load_xyz.sql with all occurrences of s3AtlasExtractName replaced by the contents of $i) and append it to the file 'load_abc.sql'.
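Since the original loop is not shown, here is a hypothetical sketch of how such a command might be used: a for loop sets i, and each pass appends a substituted copy of the template to the output file. The table names, the template line and the scratch directory are all invented for the example.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'COPY s3AtlasExtractName FROM s3AtlasExtractName;' > "$tmpdir/load_xyz.sql"

# Hypothetical loop: each value of i yields one substituted copy,
# appended (>>) to load_abc.sql rather than overwriting it.
for i in orders customers; do
  sed "s#s3AtlasExtractName#$i#g" "$tmpdir/load_xyz.sql" >> "$tmpdir/load_abc.sql"
done

result=$(cat "$tmpdir/load_abc.sql")
rm -rf "$tmpdir"
printf '%s\n' "$result"
```

Because of the g flag both occurrences on the template line are replaced, and because of >> the output file grows by one copy per loop iteration.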

sed partial replace or variable

I'd like to use sed to do a replace, but not by searching for what to replace.
Allow me to explain. I have a variable set to a default value initially.
VARIABLE="DEFAULT"
I can do a sed to replace DEFAULT with what I want, but then I would have to put DEFAULT back when I was all done. This is because what gets stored to VARIABLE is unique to the user. I'd like to use sed to search for something other than what to replace. For example, search for VARIABLE=" and " and replace what's between them. That way it just constantly updates and there is no need to reset VARIABLE.
This is how I do it currently:
I call the script and pass an argument
./script 123456789
Inside the script, this is what happens:
sed -i "s%DEFAULT%$1%" file_to_modify
This replaces
VARIABLE="DEFAULT"
with
VARIABLE="123456789"
It would be nice if I didn't have to search for "DEFAULT", because then I would not have to reset VARIABLE at end of script.
sed -r 's/VARIABLE="[^"]*"/VARIABLE="123456789"/' file_to_modify
Or, more generally:
sed -r 's/VARIABLE="[^"]*"/VARIABLE="'"$1"'"/' file_to_modify
Both of the above use a regular expression that looks for 'VARIABLE="anything-at-all"' and replaces it with, in the first example above 'VARIABLE="123456789"' or, in the second, 'VARIABLE="$1"' where "$1" is the first argument to your script. The key element is [^"]. It means any character other than double-quote. [^"]* means any number of characters other than double-quote. Thus, we replace whatever was in the double-quotes before, "[^"]*", with our new value "123456789" or, in the second case, "$1".
The second case is a bit tricky. We want to substitute $1 into the expression but the expression is itself in single quotes. Inside single-quotes, bash will not substitute for $1. So, the sed command is broken up into three parts:
# spaces added for exposition but don't try to use it this way
's/VARIABLE="[^"]*"/VARIABLE="' "$1" '"/'
The first part is in single quotes and bash passes it literally to sed. The second part is in double quotes, so bash will substitute the value of $1. The third part is in single quotes and gets passed to sed literally.
MORE: Here is a simple way to test this approach on the command line without depending on any files:
$ new=1234 ; echo 'VARIABLE="DEFAULT"' | sed -r 's/VARIABLE="[^"]*"/VARIABLE="'"$new"'"/'
VARIABLE="1234"
The first line above is the command run at the prompt ($). The second is the output from running the command.
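The same idea can be tried against a file, to confirm that repeated runs keep updating the value without any reset to DEFAULT. This is a sketch assuming GNU sed (for -i and -r); the scratch file and the two values are made up.

```shell
tmp=$(mktemp)
printf '%s\n' 'VARIABLE="DEFAULT"' > "$tmp"

# Each run replaces whatever is currently between the quotes.
for value in 123456789 987654321; do
  sed -i -r 's/VARIABLE="[^"]*"/VARIABLE="'"$value"'"/' "$tmp"
done

result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

One caveat worth keeping in mind: a value containing / or & would need escaping, or a different delimiter, before being spliced into the replacement.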

How to make bash regex'es more readable?

Here is an example of nicely indented Python regex (taken from here):
charref = re.compile(r"""
&[#] # Start of a numeric entity reference
(
0[0-7]+ # Octal form
| [0-9]+ # Decimal form
| x[0-9a-fA-F]+ # Hexadecimal form
)
; # Trailing semicolon
""", re.VERBOSE)
Now, I would like to use the same technique for bash regexes (i.e. sed or grep), but can't find any reference to similar features so far. Is it even possible to indent (and comment) something like this?
echo "$MULTILINE" | sed -re 's/(expr1|expr2)|(expr3|expr4)/expr5/g'
You can use bash's line continuation, possibly:
echo "start of a line \
continues the previous line \
yet another continuation
oops. this is a brand new line"
Note the backslashes at the end of the first two lines. They essentially 'escape' the newline/linebreak that would otherwise tell bash you're starting a new line, which also implicitly terminates the statement being defined.
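Applied to the regex from the question, line continuation might look like the sketch below. It assumes GNU sed for -r, and the regex variable is only an illustration. Unlike Python's re.VERBOSE, this gives no room for inline comments, and any indentation on a continued line would become part of the regex.

```shell
# Backslash-newline inside double quotes is removed by the shell,
# so the two source lines below form a single regex.
regex="(expr1|expr2)\
|(expr3|expr4)"

result=$(echo 'expr2 and expr3' | sed -re "s/$regex/expr5/g")
printf '%s\n' "$regex" "$result"
```

The first line printed is the joined regex as sed receives it, the second is the substituted text.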

Resources