sed -- replace variable in for do loop - bash

I want to insert a counter into a text file using sed. For example, the file has the following content:
please.add.number.00
Here is the script I'm using:
for i in $(seq 0 10)
do
sed -i 's/please.add.number.00/please.add.number.$i/' filename.txt
done
But the value ($i) in the file doesn't change. I want to substitute the value of $i in this line of filename.txt. I would appreciate any help to fix this issue!

As mentioned in comments, there is a difference between these two lines:
echo '$HOME'
echo "$HOME"
Single quotes print the literal string $HOME; double quotes let the shell expand it to your home directory.
Based on edits to your question, it looks like your actual problem is a misunderstanding of how sed works. The substitute command takes two parameters: a search pattern and a replacement. If the search pattern (please.add.number.00) never changes, it can only match on the first iteration: after that substitution the file no longer contains please.add.number.00, so later iterations find nothing to replace.
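A minimal sketch of both fixes together, assuming the goal is for the line to end up with the latest value of $i. It works on a shell variable (line) instead of filename.txt so it can be run anywhere; the [0-9]* pattern is an assumption about what the trailing number looks like.

```shell
# Stand-in for the line in filename.txt
line='please.add.number.00'

for i in $(seq 0 10); do
  # Double quotes let the shell expand $i before sed sees the expression.
  # [0-9]* matches whatever number is currently there, so every iteration matches.
  line=$(printf '%s\n' "$line" | sed "s/please\.add\.number\.[0-9]*/please.add.number.$i/")
done

echo "$line"   # please.add.number.10
```

The same expression should work with sed -i on the real file; only the quoting and the search pattern differ from the original script.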

Related

How do I use sed to store a variable?

I am trying to use sed to use as input for a variable. The user will choose from a list of files that have numbers before each to identify individual files. Then they choose a number corresponding to a name. I need to get the name of that file. My code is:
for entry in *; do
((i++))
echo "$i) $entry: "
done
echo What file # do you want to choose?:
read filenum
fileName=$(./myscript.sh | sed -n "${filenum}p")
echo $fileName ###this is to see if anything goes into fileName. nothing is ever output
echo What do you want to do with $fileName?
Ideally I would use $() instead of backticks, but I can't seem to figure out how. I've looked at the links below, but can't get those ideas to work. I believe the problem may be that I am trying to include the filenum variable inside my sed expression.
https://www.linuxquestions.org/questions/linux-newbie-8/storing-output-of-sed-in-a-variable-in-shell-script-499997/
Store output of sed into a variable
Don't put backticks around $filenum. That will try to execute the contents of $filenum as a command. Put variables inside double quotes.
And if you do want to nest a backtick expression inside another set of backticks, you have to escape them. That's where $() becomes useful -- they nest without any hassle.
When you use sed -n, you need to use the p command to print the lines that you want to show in the output.
fileName=$(sed -n "${filenum}p" myscript.sh)
This will put the contents of line $filenum of myscript.sh in the variable.
If you actually wanted to execute myscript.sh and print the selected line of its output, you need to pipe to sed:
fileName=$(./myscript.sh | sed -n "${filenum}p")
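A small sketch of the piped form, using a hypothetical list_files function in place of ./myscript.sh and a hard-coded filenum in place of the value from read:

```shell
# Hypothetical stand-in for ./myscript.sh: prints one file name per line
list_files() {
  printf '%s\n' 'alpha.txt' 'beta.txt' 'gamma.txt'
}

filenum=2   # would normally come from: read filenum

# -n suppresses the default output; "2p" prints only line 2
fileName=$(list_files | sed -n "${filenum}p")

echo "$fileName"   # beta.txt
```

The braces in "${filenum}p" matter: without them the shell would look for a variable named filenump.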

What does this sed syntax mean? "s/MY_BASE_DIR=\(.*\)/MY_BASE_DIR=${MY_BASE_DIR-\1}/"

This is a simple question but i am unable to find it in tutorials. Could anybody please explain what this statement does when executed in a bash shell within a folder containing .sh scripts. I know -i does in place editing, i understand that it will run sed on all scripts within the current directory. And i know that it does some sort of substitution. But what does this \(.*\) mean?
sed -i 's/MY_BASE_DIR=\(.*\)/MY_BASE_DIR=${MY_BASE_DIR-\1}/' *.sh
Thanks in advance.
You have an expression like:
sed -i 's/XXX=\(YYY\)/XXX=ZZZ/' file
This looks for the string XXX= in the file and captures whatever follows it (YYY) in a group, which the replacement can refer to as \1. The whole match is then replaced with XXX=ZZZ. Finally, the -i flag makes sed edit the file in place.
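For the replacement, it uses the following syntax described in Shell parameter expansion (the command itself uses the form without the colon, ${parameter-word}, which substitutes word only when parameter is unset, not when it is null):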
${parameter:-word}
If parameter is unset or null, the expansion of word is substituted.
Otherwise, the value of parameter is substituted.
Example:
$ d=5
$ echo ${d-3}
5
$ echo ${a-3}
3
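The difference between the two forms can be sketched like this; the variable name v and the word fallback are arbitrary, and each case runs in a subshell so it does not affect the others:

```shell
# v is unset: both forms fall back
unset_case=$(unset v; echo "${v-fallback}")

# v is set but empty: the form without the colon keeps the empty value
empty_dash=$(v=''; echo "${v-fallback}")

# v is set but empty: the form with the colon falls back
empty_colon=$(v=''; echo "${v:-fallback}")

echo "unset, -  : '$unset_case'"    # 'fallback'
echo "empty, -  : '$empty_dash'"    # ''
echo "empty, :- : '$empty_colon'"   # 'fallback'
```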
So with ${MY_BASE_DIR-\1} you are saying: print $MY_BASE_DIR, and if this variable is unset, print what is stored in \1.
All together, this is resetting MY_BASE_DIR to the value in the variable $MY_BASE_DIR unless this is not set; in such case, the value remains the same.
Note though that the variable won't be expanded unless you use double quotes.
Test:
$ d=5
$ cat a
d=23
blabla
$ sed "s/d=\(.*\)/d=${d-\1}/" a # double quotes -> value is replaced
d=5
blabla
$ sed 's/d=\(.*\)/d=${d-\1}/' a # single quotes -> variable is not expanded
d=${d-23}
blabla
And see how the value remains the same if $d is not set:
$ unset d
$ sed "s/d=\(.*\)/d=${d-\1}/" a
d=23
The scripts contain lines like this:
MY_BASE_DIR=/usr/local
The sed expression changes them to:
MY_BASE_DIR=${MY_BASE_DIR-/usr/local}
The effect is that /usr/local is not used as a fixed value, but only as the default value. You can override it by setting the environment variable MY_BASE_DIR.
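As a rough illustration, the rewrite and its effect can be checked without touching any script. The value /opt/custom is a made-up override; the subshells keep the caller's environment unchanged.

```shell
# 1. What sed does to the line (single quotes: the shell expands nothing)
rewritten=$(printf 'MY_BASE_DIR=/usr/local\n' |
  sed 's/MY_BASE_DIR=\(.*\)/MY_BASE_DIR=${MY_BASE_DIR-\1}/')
echo "$rewritten"        # MY_BASE_DIR=${MY_BASE_DIR-/usr/local}

# 2. How the rewritten line behaves when a script runs it
default_result=$(unset MY_BASE_DIR
  MY_BASE_DIR=${MY_BASE_DIR-/usr/local}
  echo "$MY_BASE_DIR")
override_result=$(MY_BASE_DIR=/opt/custom
  MY_BASE_DIR=${MY_BASE_DIR-/usr/local}
  echo "$MY_BASE_DIR")

echo "$default_result"   # /usr/local
echo "$override_result"  # /opt/custom
```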
For future reference, I would take a look at the ExplainShell website:
http://explainshell
that will give you a breakdown of the command structure etc. In this instance, let's step through the details. Let's start with a simple example: commenting out all lines by adding a "#" before each line. We can do this for all files with the ".sh" extension in the current directory:
sed 's/^/\#/' *.sh
i.e. Substitute beginning of line ^, with a # ...
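A quick way to try this substitution on sample input rather than on real scripts (the two echo lines are invented input, and the backslash before # is dropped because # needs no escaping here):

```shell
# ^ matches the empty string at the start of each line, so "#" is inserted there
commented=$(printf 'echo one\necho two\n' | sed 's/^/#/')

echo "$commented"
# #echo one
# #echo two
```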
Caveat: You did not specify the OS you are using. You may get different results with different versions of sed and OS...
OK, now we can drill into the substitution in the script. An example is probably the easiest way to explain it:
File: t.sh
MY_BASE_DIR="/important data/data/bin"
The command sed -i 's/MY_BASE_DIR=\(.*\)/MY_BASE_DIR=${MY_BASE_DIR-\1}/' *.sh
will search for "MY_BASE_DIR=" in each .sh file in the directory.
When it finds a line matching "MY_BASE_DIR=.*", the group \(.*\) captures the rest of the line, here "/important data/data/bin" (quotes included). That captured text is inserted for \1 on the right side of the expression, MY_BASE_DIR=${MY_BASE_DIR-\1}, which becomes
MY_BASE_DIR=${MY_BASE_DIR-"/important data/data/bin"}
Essentially, the substitute operation takes
MY_BASE_DIR="/important data/data/bin"
and replaces it with
MY_BASE_DIR=${MY_BASE_DIR-"/important data/data/bin"}
Now, if we run the script with the variable MY_BASE_DIR set
export MY_BASE_DIR="/new/import/dir"
the scripts modified by the sed command will use /new/import/dir instead of /important data/data/bin.

Understanding sed command

Please excuse if the question is too naive. I am new to shell scripting and am not able to find any good resource to understand the specifics. I am trying to make sense of a legacy script. Please can someone tell me what the following command does:
sed "s#s3AtlasExtractName#$i#g" load_xyz.sql >> load_abc.sql;
This command will replace all occurrences of s3AtlasExtractName with whatever $i is.
s - Substitute
# - Delimiter
s3AtlasExtractName - Word that needs substituting
# - Delimiter
$i - i variable that will be used to replace s3AtlasExtractName
# - Delimiter
g - Global: replace all instances of s3AtlasExtractName on a line, not just the first occurrence
So this will parse through load_xyz.sql and change all occurrences of s3AtlasExtractName to the value of $i and append the whole of the contents of load_xyz.sql to a file called load_abc.sql with the sed substitutions.
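A self-contained sketch of that behaviour, assuming (as the later answers guess) that the command sits inside a for loop. The template contents and the values extract_a and extract_b are invented for illustration; the files live in a temporary directory.

```shell
workdir=$(mktemp -d)

# Invented template with the placeholder twice on one line
printf 'COPY t FROM s3AtlasExtractName; -- s3AtlasExtractName\n' > "$workdir/load_xyz.sql"

for i in extract_a extract_b; do
  # '#' is the delimiter; 'g' replaces every occurrence on the line;
  # '>>' appends, so each pass adds another copy of the template
  sed "s#s3AtlasExtractName#$i#g" "$workdir/load_xyz.sql" >> "$workdir/load_abc.sql"
done

out=$(cat "$workdir/load_abc.sql")
echo "$out"
# COPY t FROM extract_a; -- extract_a
# COPY t FROM extract_b; -- extract_b

rm -rf "$workdir"
```

Note that load_xyz.sql itself is never modified; only load_abc.sql grows.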
sed is a command line stream editor. You can find information about it here:
http://www.computerhope.com/unix/used.htm
An easy example is shown below where sed is used to replace the word "test" with the word "example" in myfile.txt but output is sent to newfile.txt
sed 's/test/example/g' myfile.txt > newfile.txt
It seems that your script is performing a similar function: it replaces text in the contents of load_xyz.sql and stores the result in a new file, load_abc.sql. Without more code I am just guessing, but it seems that the parameter $i could be used as a counter to insert similar but new values into the load_abc.sql file.
In short, this reads load_xyz.sql and replaces every occurrence of "s3AtlasExtractName" by whatever has been stored in the shell variable "i".
The long version is that sed accepts many subcommands with different formats. Any "simple" sed command will look like 'sed SUBCOMMAND FILES'. The first letter of the subcommand tells you which operation sed is going to perform on your files.
The "s" operation stands for "substitution" and is the most commonly used. It is followed by a Perl-like expression: separator, regexp to look for, separator, value to substitute, separator, flags. In your case, the separator is '#', which is unusual but not forbidden, so the command substitutes the value of $i for every instance of 's3AtlasExtractName'. The 'g' flag tells sed to replace every occurrence of the pattern (the default is to replace only its first occurrence on each line of the input).
Finally, the use of "$i" inside a double-quote-delimited string tells the shell to actually expand the shell variable 'i' so you'll want to look for a shell statement setting that (possibly a 'for' statement).
Hope this helps.
edit: I focused on the 'sed' part and kinda missed the redirection part. The '>>' token tells the shell to take the output of the sed command (i.e. the contents of load_xyz.sql with all occurrences of s3AtlasExtractName replaced by the contents of $i) and append it to the file 'load_abc.sql'.

bash grep variable as pattern

I don't usually work in bash, but grep could be a really fast solution in this case. I have read a lot of questions on grep and variable assignment in bash, yet I do not see the error. I have tried several flavours of double quotes around $pattern, and used `...` or $(...), but nothing worked.
So here's what I try to do:
I have two files. The first contains several names. Each of them I want to use as a pattern for grep in order to search them in another file. Therefore I loop through the lines of the first file and assign the name to the variable pattern.
This step works as the variable is printed out properly.
But somehow grep does not recognize/interpret the variable. When I substitute "$pattern" with an actual name everything is fine as well. Therefore I don't think the variable assignment has a problem but the interpretation of "$pattern" as the string it should represent.
Any help is greatly appreciated!
#!/bin/bash
while IFS='' read -r line || [[ -n $line ]]; do
a=( $line )
pattern="${a[2]}"
echo "Text read from file: $pattern"
var=$(grep "$pattern" 9606.protein.aliases.v10.txt)
echo "Matched Line in Alias is: $var"
done < "$1"
> bash match_Uniprot_StringDB.sh ~/Chromatin_Computation/.../KDM.protein.tb
output:
Text read from file: "UBE2B"
Matched Line in Alias is:
Text read from file: "UTY"
Matched Line in Alias is:
EDIT
The solution drvtiny suggested works. It is necessary to get rid of the double quotes to match the string. Adding the following lines makes the script work.
pattern="${pattern#\"}"
pattern="${pattern%\"}"
Please look at the "-f FILE" option in man grep.
I believe this option does exactly what you need without any bash loops or other such "hacks" :)
And yes, according to the output of your code, you read pattern including double quotes literally. In other words, you read from file ~/Chromatin_Computation/.../KDM.protein.tb this string:
"UBE2B"
But not
UBE2B
as you probably expect.
Maybe you need to remove double quotes on the boundaries of your $pattern?
Try to do this after reading pattern:
pattern=${pattern#\"}
pattern=${pattern%\"}
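Both suggestions can be sketched on throwaway data. The alias lines below are invented, and -F (treat patterns as fixed strings) is an addition that seemed reasonable for plain names, not something the answer above asked for.

```shell
workdir=$(mktemp -d)
printf '9606.A\tUBE2B\tsrc\n9606.B\tKDM5D\tsrc\n' > "$workdir/aliases.txt"

# Pattern as read from the input file, quotes included
pattern='"UBE2B"'
quoted_hits=$(grep -c "$pattern" "$workdir/aliases.txt" || true)   # 0 matching lines

# Strip one leading and one trailing double quote
pattern=${pattern#\"}
pattern=${pattern%\"}
clean_hits=$(grep -c "$pattern" "$workdir/aliases.txt" || true)    # 1 matching line

# Loop-free variant: one pattern per line in a file, passed with -f
printf 'UBE2B\nUTY\n' > "$workdir/names.txt"
matches=$(grep -F -f "$workdir/names.txt" "$workdir/aliases.txt")

echo "$quoted_hits $clean_hits"
echo "$matches"

rm -rf "$workdir"
```

With -f, the names file would first need its quotes and extra columns removed, since grep uses each whole line as a pattern.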

sed partial replace or variable

I'd like to use sed to do a replace, but not by searching for what to replace.
Allow me to explain. I have a variable set to a default value initially.
VARIABLE="DEFAULT"
I can do a sed to replace DEFAULT with what I want, but then I would have to put DEFAULT back when I was all done. This is because what gets stored in VARIABLE is unique to the user. I'd like to use sed to search for something other than the text being replaced. For example, search for VARIABLE=" and " and replace what's between them. That way it just constantly updates and there is no need to reset VARIABLE.
This is how I do it currently:
I call the script and pass an argument
./script 123456789
Inside the script, this is what happens:
sed -i "s%DEFAULT%$1%" file_to_modify
This replaces
VARIABLE="DEFAULT"
with
VARIABLE="123456789"
It would be nice if I didn't have to search for "DEFAULT", because then I would not have to reset VARIABLE at end of script.
sed -r 's/VARIABLE="[^"]*"/VARIABLE="123456789"/' file_to_modify
Or, more generally:
sed -r 's/VARIABLE="[^"]*"/VARIABLE="'"$1"'"/' file_to_modify
Both of the above use a regular expression that looks for 'VARIABLE="anything-at-all"' and replaces it with, in the first example above 'VARIABLE="123456789"' or, in the second, 'VARIABLE="$1"' where "$1" is the first argument to your script. The key element is [^"]. It means any character other than double-quote. [^"]* means any number of characters other than double-quote. Thus, we replace whatever was in the double-quotes before, "[^"]*", with our new value "123456789" or, in the second case, "$1".
The second case is a bit tricky. We want to substitute $1 into the expression but the expression is itself in single quotes. Inside single-quotes, bash will not substitute for $1. So, the sed command is broken up into three parts:
# spaces added for exposition but don't try to use it this way
's/VARIABLE="[^"]*"/VARIABLE="' "$1" '"/'
The first part is in single quotes and bash passes it literally to sed. The second part is in double quotes, so bash will substitute the value of $1. The third part is in single quotes and gets passed to sed literally.
MORE: Here is a simple way to test this approach on the command line without depending on any files:
$ new=1234 ; echo 'VARIABLE="DEFAULT"' | sed -r 's/VARIABLE="[^"]*"/VARIABLE="'"$new"'"/'
VARIABLE="1234"
The first line above is the command run at the prompt ($). The second is the output from running the command.
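An alternative sketch, not from the answer above: put the whole expression in double quotes and escape the inner double quotes, instead of splitting it into three quoted parts. It drops -r, since this pattern needs no extended syntax, and it assumes the new value contains no / or & characters, which sed would treat specially in the replacement.

```shell
new=1234

# Inside double quotes, \" yields a literal " and $new is expanded by the shell
result=$(echo 'VARIABLE="DEFAULT"' | sed "s/VARIABLE=\"[^\"]*\"/VARIABLE=\"$new\"/")

echo "$result"   # VARIABLE="1234"
```

Which form reads better is a matter of taste; the three-part version avoids the backslashes, while this one keeps the expression in a single string.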
