Replacing special characters in a shell script using sed - shell

I am trying to write a shell script that will replace whatever characters/strings I choose using sed. My first attempt worked with the exception of special characters. I have been trying to use sed to escape the special characters so that they too will be searched for or replaced. I decided to simplify the script for testing purposes and deal with just a single offending character. However, I am still having problems.
Edited Script
#! /bin/sh
oldString=$1
newString=$2
file=$3
oldStringFixed=$(echo "$oldString" | sed 's/\\/\\\\/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\[/\\\[/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\]/\\\]/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\^/\\\^/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\*/\\\*/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\+/\\\+/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\./\\\./g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\$/\\\$/g')
oldStringFixed=$(echo "$oldStringFixed" | sed 's/\-/\\\-/g')
sed -e "s/$oldStringFixed/$newString/g" "$file" > newfile.updated
mv newfile.updated "$file"
In case it is not clear, I am trying to search through oldString for the [ character, and replace it with an escaped version and assign the results to oldStringFixed (do I need the backticks for this?). The bottom two lines are slightly modified versions of my original script that I believe works correctly.
When I echo the fixed string, nothing is displayed, and sed outputs an error
sed: can't read [: No such file or directory
Can anyone explain what is wrong with my first sed line?
EDIT:
Thanks to Jite, the script is working better. However, I am still having a problem with replacing single quoted characters with spaces, i.e. ' *'. The new version is above.

I suggest two improvements:
Do not stack calls to sed as you do, instead pack all of them in a single function, as escape_string below.
You can use a fancy delimiter for the sed substitute command to avoid issues linked to / being part of the strings involved.
With these changes, your script looks like:
#! /bin/sh
oldString="$1"
newString="$2"
file="$3"
escape_string()
{
    # Prefix each special character with a backslash (& is the matched text).
    printf '%s' "$1" | sed -e 's/[][\\^*+.$-]/\\&/g'
}
fancyDelim=$(printf '\001')
oldStringFixed=$(escape_string "$oldString")
sed -e "s$fancyDelim$oldStringFixed$fancyDelim$newString${fancyDelim}g" "$file" \
> newfile.updated
mv newfile.updated "$file"
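As a minimal sketch of the same idea, assuming the goal is a purely literal search and replace, both sides of the substitution can be escaped before they reach sed. The helper names below (escape_pattern, escape_replacement, replace_literal) are hypothetical and not part of the answer above. The sketch assumes basic regular expressions and strings without embedded newlines.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Escape the characters that are special in a basic regular expression
# ([ \ . * ^ $) plus the / delimiter used by replace_literal below.
escape_pattern() {
    printf '%s\n' "$1" | sed -e 's|[[\\/.*^$]|\\&|g'
}

# Escape the characters that are special on the replacement side:
# backslash, & (the matched text) and the / delimiter.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Replace every literal occurrence of $1 with $2 in the text read from stdin.
replace_literal() {
    sed -e "s/$(escape_pattern "$1")/$(escape_replacement "$2")/g"
}

printf '%s\n' 'price: $5.00 [a*b]' | replace_literal '[a*b]' 'x/y&z'
# prints: price: $5.00 x/y&z
```

Escaping the replacement matters as much as escaping the pattern: an unescaped & would otherwise be expanded to the matched text.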

To replace values containing special characters, try using sed with "|" instead of "/" as the delimiter.
E.g.: sed -i 's|'"$original_value"'|'"$new_value"'|g' file
where original_value="comprising_special_char_/"
new_value="comprising_new_special_char:"

Change:
oldStringFixed= `sed 's/\[/\[/g' "$oldString"`
to:
oldStringFixed=$(echo "$oldString" | sed 's/\[/\\\[/g')
Problem 1: Space after =, it's not allowed when assigning shell variables.
Problem 2: sed expects a file as input, not a string. You may pipe it as my solution does though.
Problem 3: You need to escape the backslash first \\, then you need to escape your char \[, totalling \\\[ :)
Side note: I changed `` to $() since the latter is the recommended practice (it nests cleanly, but that is another topic).

For me it was just a nightmare trying to get sed to do this for the general case. I gave up and wrote a short Python script to replace sed:
#!/usr/bin/env python3
# replace.py
import sys

# Replace a literal string in a file (in place)
match = sys.argv[1]
replace = sys.argv[2]
filename = sys.argv[3]
print("Replacing strings in", filename)
with open(filename, "r") as f:
    data = f.read().replace(match, replace)
with open(filename, "w") as f:
    f.write(data)
Which can then be used like:
#!/bin/bash
orig='<somethinghorrible>'
out='<replacement>'
python replace.py "$orig" "$out" myfile.txt

You can use this to replace " with \": sed 's/\"/\\\"/g' filename

Related

Insert the contents of the variable in SED command [duplicate]

If I run these commands from a script:
#my.sh
PWD=bla
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
xxx
bla
it is fine.
But, if I run:
#my.sh
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
$ sed: -e expression #1, char 8: Unknown option to `s'
I read in tutorials that to substitute environment variables from shell you need to stop, and 'out quote' the $varname part so that it is not substituted directly, which is what I did, and which works only if the variable is defined immediately before.
How can I get sed to recognize a $var as an environment variable as it is defined in the shell?
Your two examples look identical, which makes problems hard to diagnose. Potential problems:
You may need double quotes, as in sed 's/xxx/'"$PWD"'/'
$PWD may contain a slash, in which case you need to find a character not contained in $PWD to use as a delimiter.
To nail both issues at once, perhaps
sed 's#xxx#'"$PWD"'#'
In addition to Norman Ramsey's answer, I'd like to add that you can double-quote the entire string (which may make the statement more readable and less error prone).
So if you want to search for 'foo' and replace it with the content of $BAR, you can enclose the sed command in double-quotes.
sed 's/foo/$BAR/g'
sed "s/foo/$BAR/g"
In the first, $BAR will not expand correctly while in the second $BAR will expand correctly.
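The difference between the two quoting styles can be checked with a small sketch. The value baz assigned to BAR is only an illustrative assumption.

```shell
#!/bin/sh
BAR='baz'   # illustrative value

# Single quotes: the shell leaves $BAR alone, so sed inserts the text "$BAR".
single=$(printf '%s\n' 'foo' | sed 's/foo/$BAR/g')

# Double quotes: the shell expands $BAR before sed ever sees the script.
double=$(printf '%s\n' 'foo' | sed "s/foo/$BAR/g")

printf 'single quotes: %s\n' "$single"   # single quotes: $BAR
printf 'double quotes: %s\n' "$double"   # double quotes: baz
```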
Another easy alternative:
Since $PWD will usually contain a slash /, use | instead of / for the sed statement:
sed -e "s|xxx|$PWD|"
You can use other characters besides "/" in substitution:
sed "s#$1#$2#g" -i FILE
1. Bad way: change the delimiter
sed 's/xxx/'"$PWD"'/'
sed 's:xxx:'"$PWD"':'
sed 's#xxx#'"$PWD"'#'
These may not be the final answer: you cannot know which characters will occur in $PWD, whether /, : or #. If the delimiter character appears in $PWD, it breaks the expression.
The good way is to escape the delimiter character in $PWD.
2. Good way: escape the delimiter
For example, replace the placeholder URL in the string $tmp with the value of $url, which contains both : and /.
$url is:
x.com:80/aa/bb/aa.js
$tmp is:
URL
A. Use / as the delimiter
Escape / as \/ in the variable before using it in the sed expression.
## step 1: try the escape
echo ${url//\//\\/}
x.com:80\/aa\/bb\/aa.js #escaped correctly
echo ${url//\//\/}
x.com:80/aa/bb/aa.js #not escaped
echo "${url//\//\/}"
x.com:80\/aa\/bb\/aa.js #escaped correctly, note the double quotes
## step 2: run sed
echo $tmp | sed "s/URL/${url//\//\\/}/"
x.com:80/aa/bb/aa.js
echo $tmp | sed "s/URL/${url//\//\/}/"
x.com:80/aa/bb/aa.js
OR
B. Use : as the delimiter (more readable than /)
Escape : as \: in the variable before using it in the sed expression.
## step 1: try the escape
echo ${url//:/\:}
x.com:80/aa/bb/aa.js #not escaped
echo "${url//:/\:}"
x.com\:80/aa/bb/aa.js #escaped correctly, note the double quotes
## step 2: run sed
echo $tmp | sed "s:URL:${url//:/\:}:g"
x.com:80/aa/bb/aa.js
With your question edit, I see your problem. Let's say the current directory is /home/yourname ... in this case, your command below:
sed 's/xxx/'$PWD'/'
will be expanded to
sed 's/xxx//home/yourname/'
which is not valid. You need to put a \ character in front of each / in your $PWD if you want to do this.
Actually, the simplest thing (in GNU sed, at least) is to use a different separator for the sed substitution (s) command. So, instead of s/pattern/'$mypath'/ being expanded to s/pattern//my/path/, which will of course confuse the s command, use s!pattern!'$mypath'!, which will be expanded to s!pattern!/my/path!. I’ve used the bang (!) character (or use anything you like) which avoids the usual, but-by-no-means-your-only-choice forward slash as the separator.
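A minimal sketch of the delimiter problem described above, assuming an illustrative value /my/path for mypath. It records sed's exit status for the failing form instead of relying on a particular error message, since the wording differs between sed implementations.

```shell
#!/bin/sh
mypath='/my/path'   # illustrative value

# With / as the delimiter the script becomes s/pattern//my/path/,
# which sed rejects, so keep its exit status instead of aborting.
slash_status=0
printf '%s\n' 'pattern' | sed "s/pattern/$mypath/" >/dev/null 2>&1 || slash_status=$?

# With ! as the delimiter the slashes in $mypath are ordinary characters.
bang_result=$(printf '%s\n' 'pattern' | sed "s!pattern!$mypath!")

printf 'exit status with / as delimiter: %s\n' "$slash_status"
printf 'result with ! as delimiter: %s\n' "$bang_result"   # /my/path
```

The same caveat applies to any delimiter: it only helps as long as that character cannot appear in the variable.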
Dealing with VARIABLES within sed
[root@gislab00207 ldom]# echo domainname: None > /tmp/1.txt
[root@gislab00207 ldom]# cat /tmp/1.txt
domainname: None
[root@gislab00207 ldom]# echo ${DOMAIN_NAME}
dcsw-79-98vm.us.oracle.com
[root@gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: ${DOMAIN_NAME}/g'
--- Below is the result -- very funny: the variable is not expanded inside single quotes.
domainname: ${DOMAIN_NAME}
--- You need to close the single quotes around your variable like this ...
[root@gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: '${DOMAIN_NAME}'/g'
--- The right result is below
domainname: dcsw-79-98vm.us.oracle.com
VAR=8675309
echo 'abcde:jhdfj$jhbsfiy/.hghi$jh:12345:dgve::' |\
sed 's/:[0-9]*:/:'"$VAR"':/1'
where VAR contains the value you want to put in the field
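I had a similar problem: I had a list and had to build a SQL script from a template (which contained #INPUT# as the element to replace):
for i in $LIST # LIST holds the whitespace-separated values
do
    # substitute the value and print every line of the template
    awk -v val="$i" '{ sub(/#INPUT#/, val); print }' template.sql >> output
done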
If your replacement string may contain other sed control characters, then a two-step substitution (first escaping the replacement string) may be what you want:
PWD='/a\1&b$_' # these are problematic for sed
PWD_ESC=$(printf '%s\n' "$PWD" | sed -e 's/[\/&\\]/\\&/g') # escape /, & and \
echo 'xxx' | sed "s/xxx/$PWD_ESC/" # now this works as expected
For me, replacing some text in a file with the value of an environment variable using sed works only with quotes, as follows:
sed -i 's/original_value/'"$MY_ENVIRONMENT_VARIABLE"'/g' myfile.txt
BUT when the value of MY_ENVIRONMENT_VARIABLE contains a URL (e.g. https://andreas.gr), the above does not work.
THEN use a different delimiter:
sed -i "s|original_value|$MY_ENVIRONMENT_VARIABLE|g" myfile.txt

Error on sed script - extra characters after command

I've been trying to create a sed script that reads a list of phone numbers and only prints ones that match the following schemes:
+1(212)xxx-xxxx
1(212)xxx-xxxx
I'm an absolute beginner, but I tried to write a sed script that would print this for me using the -n -r flags (the contents of which are as follows):
/\+1\(212\)[0-9]{3}-[0-9]{4}/p
/1\(212\)[0-9]{3}-[0-9]{4}/p
If I run this in sed directly, it works fine (i.e. sed -n -r '/\+1\(212\)[0-9]{3}-[0-9]{4}/p' sample.txt prints matching lines as expected). This does NOT work in the sed script I wrote; instead sed says:
sed: -e expression #1, char 2: extra characters after command
I could not find a good solution, this error seems to have so many causes and none of the answers I found apply easily here.
EDIT: I ran it with sed -n -r script.sed sample.txt
sed can not automatically determine whether you intended a parameter to be a script file or a script string.
To run a sed script from a file, you have to use -f:
$ echo 's/hello/goodbye/g' > demo.sed
$ echo "hello world" | sed -f demo.sed
goodbye world
If you neglect the -f, sed will try to run the filename as a command, and the delete command is not happy to have emo.sed after it:
$ echo "hello world" | sed demo.sed
sed: -e expression #1, char 2: extra characters after command
Of the various unix tools out there, two use BRE as their default regex dialect. Those two tools are sed and grep.
In most operating systems, you can use egrep or grep -E to tell that tool to use ERE as its dialect. A smaller (but still significant) number of sed implementations will accept a -E option to use ERE.
In BRE mode, however, you can still create atoms with brackets. And you do it by escaping parentheses. That's why your initial expression is failing -- the parentheses are NOT special by default in BRE, but you're MAKING THEM SPECIAL by preceding the characters with backslashes.
The other thing to keep in mind is that if you want sed to execute a script from a command line argument, you should use the -e option.
So:
$ cat ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
212-xxx-xxxx
$ grep '^+\{0,1\}1([0-9]\{3\})' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ egrep '^[+]?1\([0-9]{3}\)' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -n -e '/^+\{0,1\}1([0-9]\{3\})/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -E -n -e '/^[+]?1\([0-9]{3}\)/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
Depending on your OS, you may be able to get a full list of how this works from man re_format.
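Putting the two points together (the -f option for script files and ERE syntax), a sketch of the phone-number filter might look as follows. The file names and sample numbers are made up for illustration, and the sketch assumes a sed that accepts -E. A single pattern with an optional leading + is used, because two separate /.../p commands would print a +1 line twice.

```shell
#!/bin/sh
# Illustrative file names and sample data.
workdir=$(mktemp -d)

cat > "$workdir/phones.sed" <<'EOF'
/^\+?1\(212\)[0-9]{3}-[0-9]{4}$/p
EOF

cat > "$workdir/sample.txt" <<'EOF'
+1(212)555-0100
1(212)555-0101
212-555-0102
1(646)555-0103
EOF

# -n: print only on request; -E: extended regex; -f: read the script from a file.
matches=$(sed -n -E -f "$workdir/phones.sed" "$workdir/sample.txt")
printf '%s\n' "$matches"

rm -rf "$workdir"
```

Only the first two sample lines are printed, since the others do not match the 1(212) scheme.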

Replace all unquoted characters from a file bash

Using bash, how would one replace all unquoted characters from a file?
I have a system that I can't modify that spits out CSV files such as:
code;prop1;prop2;prop3;prop4;prop5;prop6
0,1000,89,"a1,a2,a3",33,,
1,,,"a55,a10",1,1 L,87
2,25,1001,a4,,"1,5 L",
I need this to become, for a new system being added
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;a1,a2,a3;33;;
1;;;a55,a10;1;1 L;87
2;25;1001;a4;;1,5 L;
If the quotes can be removed after this substitution happens in one command it would be nice :) But I prefer clarity to complicated one-liners for future maintenance.
Thank you
With sed:
sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop'
Test:
$ sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop' yourfile
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;"a1,a2,a3";33;;
1;;;"a55,a10";1;1 L;87
2;25;1001;a4;;"1,5 L";
You want to use a csv parser. Parsing csv with shell tools is hard (you will encounter regular expressions soon, and they rarely get all cases).
There is one in almost every language. I recommend python.
You can also do this using excel/openoffice variants by opening the file and then saving with ; as the separator.
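Short of a full CSV parser, the quote-aware idea can be sketched with a small awk filter that walks each line character by character. This is only a sketch under simple assumptions: the function name csv_to_semicolons is hypothetical, and escaped quotes ("") or fields spanning several lines are not handled.

```shell
#!/bin/sh
# Hypothetical helper: convert commas outside double quotes to semicolons
# and drop the quotes themselves. Reads stdin, writes stdout.
csv_to_semicolons() {
    awk '
    {
        out = ""
        in_quotes = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                in_quotes = !in_quotes      # toggle state, drop the quote
            } else if (c == "," && !in_quotes) {
                out = out ";"               # field separator outside quotes
            } else {
                out = out c                 # ordinary character, keep as is
            }
        }
        print out
    }'
}

printf '%s\n' '0,1000,89,"a1,a2,a3",33,,' | csv_to_semicolons
# prints: 0;1000;89;a1,a2,a3;33;;
```

A real CSV library remains the safer choice once quotes can be escaped or fields can contain newlines.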
You can use sed:
echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g"
This will replace " with the empty string (deletes it), and you can pipe another sed to replace the , with ;:
sed -e "s|,|;|g"
$ echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g" | sed -e "s|,|;|g"
>> 0;1000;89;a1;a2;a3;33;;
Note that you can use any separator you want instead of | inside the sed command. For example, you can rewrite the first sed as:
sed -e "s-\"--g"

using sed to find and replace in bash for loop

I have a large number of words in a text file to replace.
This script is working up until the sed command where I get:
sed: 1: "*.js": invalid command code *
PS... Bash isn't one of my strong points - this doesn't need to be pretty or efficient
cd '/Users/xxxxxx/Sites/xxxxxx'
echo `pwd`;
for line in `cat myFile.txt`
do
export IFS=":"
i=0
list=()
for word in $line; do
list[$i]=$word
i=$[i+1]
done
echo ${list[0]}
echo ${list[1]}
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
done
You're running BSD sed (under OS X), therefore the -i flag requires an argument specifying what you want the suffix to be.
Also, no files match the glob *.js.
This looks like a simple typo:
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
Should be:
sed -i "s/${list[0]}/${list[1]}/g" *.js
(just like the echo lines above)
So myFile.txt contains a list of from:to substitutions, and you are looping over each of those. Why don't you create a sed script from this file instead?
cd '/Users/xxxxxx/Sites/xxxxxx'
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt |
# Output from first sed script is a sed script!
# It contains substitutions like this:
# s:from:to:
# s:other:substitute:
sed -f - -i~ *.js
Your sed might not like the -f - which means sed should read its script from standard input. If that is the case, perhaps you can create a temporary script like this instead;
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt >script.sed
sed -f script.sed -i~ *.js
Another approach, if you don't feel very confident with sed and think you will forget in a week what those voodoo symbols mean, is to use IFS more efficiently:
while IFS=":" read -r PATTERN REPLACEMENT # read each line of myFile.txt, splitting the fields on ":"
do
    sed -i~ "s/${PATTERN}/${REPLACEMENT}/g" *.js
done < myFile.txt
The only pitfall I can see (there may be more) is that if either PATTERN or REPLACEMENT contains a slash (/), it will break your sed expression.
You can change the sed separator to a non-printable character and you should be safe.
Anyway, if you know what's in your myFile.txt, you can just pick any character that does not occur in it.
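The non-printable separator idea can be sketched like this. The function name apply_substitutions and the demo files are hypothetical, and the patterns are still treated as regular expressions rather than literal text. The sketch writes to a temporary file and renames it, which sidesteps the differences between GNU and BSD sed -i.

```shell
#!/bin/sh
# Hypothetical helper: apply_substitutions MAPFILE TARGET...
# MAPFILE holds one from:to pair per line.
apply_substitutions() {
    map_path=$1
    shift
    delim=$(printf '\001')   # control character unlikely to occur in text
    while IFS=: read -r pattern replacement; do
        [ -n "$pattern" ] || continue   # skip empty lines
        for target in "$@"; do
            sed "s${delim}${pattern}${delim}${replacement}${delim}g" "$target" \
                > "$target.tmp" && mv "$target.tmp" "$target"
        done
    done < "$map_path"
}

# Demo with made-up data; note the slashes in the second pair.
workdir=$(mktemp -d)
printf '%s\n' 'foo:bar' 'a/b:c/d' > "$workdir/map.txt"
printf '%s\n' 'foo a/b foo' > "$workdir/demo.js"
apply_substitutions "$workdir/map.txt" "$workdir/demo.js"
result=$(cat "$workdir/demo.js")
printf '%s\n' "$result"   # bar c/d bar
rm -rf "$workdir"
```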

SED bad substitution error

Here's my problem, I have written the following line of code to format properly a list of files found recursively in a directory.
find * | sed -e '/\(.*\..*\)/ !d' | sed -e "s/^.*/\${File} \${INST\_FILES} &/" | sed -e "s/\( \)\([a-zA-Z0-9]*\/\)/\/\2/" | sed -e "s/\(\/\)\([a-zA-Z0-9\_\-\(\)\{\}\$]*\.[a-zA-Z0-9]*\)/ \2/"
The second step is to write the output of this command in a script. While the code above has the expected behavior, the problem occurs when I try to store its output to a variable, I get a bad substitution error from the first sed command in the line.
#!/bin/bash
nsisscript=myscript.sh
FILES=*
for f in $(find $FILES); do
v=`echo $f | sed -e '/\(.*\..*\)/ !d' | sed -e "s/^.*/\${File} \${INST\_FILES} &/" | sed -e "s/\( \)\([a-zA-Z0-9]*\/\)/\/\2/" | sed -e "s/\(\/\)\([a-zA-Z0-9\_\-\(\)\{\}\$]*\.[a-zA-Z0-9]*\)/ \2/"`
sed -i.backup -e "s/\;Insert files here/$v\\n&/" $nsisscript
done
Could you please help me understand what the difference is between the two cases and why I get this error ?
Thanks in advance!
Well, my guess is that your escaping of the underscore in INST_FILES is the problem: underscore is not a special character in the shell or in sed. The error disappears when you delete the '\' before '_'.
my 2 cents
Parsing inside of backquote-style command substitution is a bit weird -- it requires an extra level of escaping (i.e. backslashes) to control when expansions take place. Ugly solution: add more backslashes. Better solution: use $() instead of backquotes -- it does the same thing, but without the weird parsing and escaping issues.
BTW, your script seems to have some other issues. First, I don't know about the sed on your system, but the versions I'm familiar with don't interpret \n in the substitution as a newline (which I presume you want), but as a literal n character. One solution is to include a literal newline in the substitution (preceded by a backslash).
Also, the loop executes for each found file, but for files that don't have a period in the name, the first sed command removes them, $v is empty, and you add a blank line to myscript.sh. You should either put the filtering sed call in the for statement, or add it as a filter to the find command.
#!/bin/bash
nsisscript=myscript.sh
nl=$'\n'
FILES=*
for f in $(find $FILES -name "*.*"); do
v=$(echo $f | sed -e "s/^.*/\${File} \${INST\_FILES} &/" | sed -e "s/\( \)\([a-zA-Z0-9]*\/\)/\/\2/" | sed -e "s/\(\/\)\([a-zA-Z0-9\_\-\(\)\{\}\$]*\.[a-zA-Z0-9]*\)/ \2/")
sed -i.backup -e "s/\;Insert files here/$v\\$nl&/" $nsisscript
done
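The newline handling used above can be isolated in a small sketch that also works in a plain POSIX shell, where $'\n' is not available. The function name insert_before_marker is hypothetical, and the inserted text is assumed to contain no /, &, backslash or newline.

```shell
#!/bin/sh
# A literal newline held in a variable (portable alternative to $'\n').
nl='
'

# Hypothetical helper: insert_before_marker TEXT
# Reads stdin and puts TEXT on its own line before the marker line.
insert_before_marker() {
    # "\\${nl}" becomes a backslash followed by a real newline, which sed
    # treats as a newline in the replacement; & re-emits the marker itself.
    sed -e "s/;Insert files here/$1\\${nl}&/"
}

result=$(printf '%s\n' 'Section' ';Insert files here' | insert_before_marker 'File readme.txt')
printf '%s\n' "$result"
```

The output has three lines: Section, then File readme.txt, then the marker.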
