Escaping in bash within perl - bash

I'm running something in perl and have the following command, which deletes consecutive duplicate lines (only keeping one of them)
system("sed -i '$!N; /^\(.*\)\n\1$/!p; d' *[13579].csv");
However, when I run this, I get the following error:
sed: -e expression #1, char 11: unterminated address regex
I have a feeling it has to do with my escaping, but I'm not too certain as I am rather inexperienced with perl and bash. I know the dollar signs should be escaped, but what about the backslashes? Does anyone have a good resource they can point me to to learn more about escaping bash within perl? Thanks!

Putting sed inside Perl can be fussy because the command passes through two layers of quoting: Perl's and the shell's. There are a couple of things you could do. The first is to flip the quotes: wrap the command that system runs in single quotes (so Perl interpolates nothing) and the sed program in double quotes, escaping each $ so the shell does not expand it. The other option is to keep your original quoting and escape the \ and $ characters for Perl.
system('sed -i "\$!N;/^\(.*\)\n\1\$/!p;d" *filename');
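Note: if your file name contains literal special characters such as brackets, they need escaping to survive globbing (e.g. *\\[13579].csv). If [13579] is meant as a glob character class (any name ending in an odd digit before .csv), leave it unescaped. With everything escaped for Perl, the command would be something like this:
system("sed -i '\$!N;/^\\(.*\\)\\n\\1\$/!p;d' *\\[13579].csv");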
If your file name happens to include spaces, those need escaping as well.
system("sed -i '\$!N;/^\\(.*\\)\\n\\1\$/!p;d' *\\[12345]\\ \\[13579].csv");
sed would then in-place edit any files whose names end in the literal text [12345] [13579].csv.
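One way to check the escaping before involving Perl at all is to look at what the shell would hand to sed. The following is a minimal sketch, meant to be run as a script (typed interactively, bash history expansion may react to the !). The variable names are made up for illustration. It shows that a single-quoted sed program and a double-quoted one with each $ escaped are the same string:

```shell
#!/bin/sh
# The sed program exactly as sed must receive it.
# Inside single quotes the shell changes nothing.
prog_single='$!N;/^\(.*\)\n\1$/!p;d'

# The same program inside double quotes: only the dollar signs need a
# backslash, because the shell leaves \( \n and \1 untouched there.
prog_double="\$!N;/^\(.*\)\n\1\$/!p;d"

if [ "$prog_single" = "$prog_double" ]; then
    quoting_check="same"
else
    quoting_check="different"
fi
printf '%s\n' "$quoting_check"
```

Whatever Perl quoting is chosen, the goal is that the string reaching the shell looks like one of these two forms.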

Related

Changing a line of text with sed with special characters

The name in the title says it all. However, I'm absolutely the worst with the sed command. So I'm trying to edit the following file:
/var/www/html/phpMyAdmin/config.inc.php
I want to edit the line that says
$cfg['Servers'][$i]['AllowRoot'] = false;
into the following
$cfg['Servers'][$i]['AllowRoot'] = true;
It has so many special characters and whatnot and I have no prior knowledge of how sed works. So here's some commands I've tried to specifically edit that one line.
sed -i "/*.AllowRoot.*/\$cfg['Servers'][\$i]['AllowRoot'] = true;/" /var/www/html/phpMyAdmin/config.inc.php
sed -i "/*.AllowRoot.*/$cfg['Servers'][$i]['AllowRoot'] = true;/" /var/www/html/phpMyAdmin/config.inc.php
# this one finds the line successfully and prints it so I know it's got the right string:
sed -n '/AllowRoot/p' /var/www/html/phpMyAdmin/config.inc.php
sed -i "s/'AllowRoot|false'/'AllowRoot|true'/" /var/www/html/phpMyAdmin/config.inc.php
I have absolutely no idea what I'm doing and I'm not learning a whole lot besides the feeling that the last command splits up 'AllowRoot|false' makes sure that both must be present in the sentence to come back as a result. So to my logic, I thought changing the word false into true would make that happen, but nothing. The other commands return... bizarre results at best, one even emptying the file. Or that's one of the commands I had not written down here, I've lost track after 50 attempts. What is the solution here?
The [ and ] need to be escaped to match literal brackets, instead of inadvertently starting a bracket expression. This should work:
$ sed -i "/\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\]/s/false/true/" /var/www/html/phpMyAdmin/config.inc.php
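To try this without touching the real file, the same expression can be run on a sample line through a pipe. This is a small sketch, assuming a sed that accepts \[ and \] as literal brackets (GNU sed does); config_line is a hypothetical stand-in for the line in config.inc.php:

```shell
#!/bin/sh
# Hypothetical one-line stand-in for the line in config.inc.php.
config_line="\$cfg['Servers'][\$i]['AllowRoot'] = false;"

# Address: the literal text $cfg['Servers'][$i]['AllowRoot'].
# Action: replace false with true, on matching lines only.
edited=$(printf '%s\n' "$config_line" |
    sed "/\$cfg\['Servers'\]\[\$i\]\['AllowRoot'\]/s/false/true/")

printf '%s\n' "$edited"
```

Once the output looks right, -i can be added to edit the file in place.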
There are not many things to escape in sed. The main problem in your line is the /, which you have chosen as the delimiter (the most common choice, but not required). I suggest you use # instead:
sed -i "s#$cfg['Servers'][$i]['AllowRoot'] = false;<br />#$cfg['Servers'][$i]['AllowRoot'] = true;<br />#g" input.txt
However, you need to think about the shell as well: inside double quotes, $i and $cfg will be expanded as shell variables. When you want to match a string like this, my suggestion is to put the sed expression in a text file like this:
cat allow_root_true.sed
s#['Servers'][]['AllowRoot'] = false;<br />#['Servers'][]['AllowRoot'] = true;<br />#g
and run the command using sed -f like this:
sed -i -f allow_root_true.sed input.txt
Warning: -i will change the input file in place.
sed can't do literal string matching which is why you need to escape so many characters (see Is it possible to escape regex metacharacters reliably with sed), but awk can:
$ awk -v str="\$cfg['Servers'][\$i]['AllowRoot']" 'index($0,str){sub(/false/,"true")} 1' file
//some text here
$cfg['Servers'][$i]['AllowRoot'] = true;<br />
//some more text here
In the above we only have to escape the $s to protect them from the shell since the string is enclosed in "s to allow it to include 's.
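The literal match can be checked on sample input before running it against the real file. A minimal sketch, with made-up sample lines and variable names; index() compares plain text, so the brackets in str need no escaping:

```shell
#!/bin/sh
# The literal text to look for; only the $ signs are escaped, for the shell.
str="\$cfg['Servers'][\$i]['AllowRoot']"

# Hypothetical sample input standing in for the real config file.
awk_result=$(printf '%s\n' \
    "// some text here" \
    "\$cfg['Servers'][\$i]['AllowRoot'] = false;" \
    "// some more text here" |
    awk -v str="$str" 'index($0, str) { sub(/false/, "true") } 1')

printf '%s\n' "$awk_result"
```

Only the line containing str is changed; the other lines pass through untouched because of the trailing 1.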

How to append the specific path in the given file list and update the filelist [duplicate]

I have a file r. I want to replace the words File and MINvac.pdb in it with nothing. The commands I used are
sed -i 's/File//g' /home/kanika/standard_minimizer_prosee/r
and
sed -i 's/MINvac.pdb//g' /home/kanika/standard_minimizer_prosee/r
I want to combine both sed commands into one, but I don't know the way. Can anyone help?
The file looks like this:
-6174.27 File10MINvac.pdb
-514.451 File11MINvac.pdb
4065.68 File12MINvac.pdb
-4708.64 File13MINvac.pdb
6674.54 File14MINvac.pdb
8563.58 File15MINvac.pdb
sed is a scripting language. You separate commands with semicolon or newline. Many sed dialects also allow you to pass each command as a separate -e option argument.
sed -i 's/File//g;s/MINvac\.pdb//g' /home/kanika/standard_minimizer_prosee/r
I also added a backslash to properly quote the literal dot before pdb, but in this limited context that is probably unimportant.
For completeness, here is the newline variant. Many newcomers are baffled that the shell allows literal newlines in quoted strings, but it can be convenient.
sed -i 's/File//g
s/MINvac\.pdb//g' /home/kanika/standard_minimizer_prosee/r
Of course, in this limited case, you could also combine everything into one regex:
sed -i 's/\(File\|MINvac\.pdb\)//g' /home/kanika/standard_minimizer_prosee/r
(Some sed dialects will want this without backslashes, and/or offer an option to use extended regular expressions, where they should be omitted. BSD sed, and thus also MacOS sed, demands a mandatory argument to sed -i which can however be empty, like sed -i ''.)
Use the -e flag:
sed -i -e 's/File//g' -e 's/MINvac.pdb//g' /home/kanika/standard_minimizer_prosee/r
Once you get more commands than are convenient to define with -es, it is better to store the commands in a separate file and include it with the -f flag.
In this case, you'd make a file containing:
s/File//g
s/MINvac.pdb//g
Let's call that file 'sedcommands'. You'd then use it with sed like this:
sed -i -f sedcommands /home/kanika/standard_minimizer_prosee/r
With only two commands, it's probably not worthwhile using a separate file of commands, but it is quite convenient if you have a lot of transformations to make.
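The variants discussed above can be compared side by side on a couple of lines from the question, reading from a pipe so that nothing is edited in place. This is a sketch; the third form assumes a sed that supports -E for extended regular expressions (current GNU and BSD versions do):

```shell
#!/bin/sh
# Two sample lines taken from the question.
sample='-6174.27 File10MINvac.pdb
-514.451 File11MINvac.pdb'

# 1. Two commands separated by a semicolon.
out_semicolon=$(printf '%s\n' "$sample" | sed 's/File//g;s/MINvac\.pdb//g')

# 2. One -e option per command.
out_dash_e=$(printf '%s\n' "$sample" | sed -e 's/File//g' -e 's/MINvac\.pdb//g')

# 3. A single extended regex with alternation (assumes -E is available).
out_ere=$(printf '%s\n' "$sample" | sed -E 's/(File|MINvac\.pdb)//g')

printf '%s\n' "$out_semicolon"
```

All three produce the same text, so the choice is mostly a matter of readability.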

How to use sed to remove ./ between two characters in Unix shell

I am trying to remove ./ between two characters using sed but not getting the desired output.
Sample:
e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt
I tried the below but it is not working as expected, even the . in the ".txt" is getting removed.
sed -i 's/[./,]//g'
Beware: don't even think of using the -i option until you know the code is working. You can screw things up big time!
Use:
sed -e 's%[.]/%%g'
You can choose the delimiter in a s/// command, and when the regular expressions involve /, it is sensible to choose something else — I often use % when it doesn't figure in the text. The -e is optional. Using [.] to detect an actual dot is one way; you can write \. if you prefer, but I'm allergic to avoidable backslashes (if you've never had to write 16 backslashes in a row to get troff to do what you want, you haven't suffered enough).
Be aware that the -i option behaves differently in GNU sed and BSD (macOS) sed. Using -i.bak works in both (for an arbitrary, non-empty string such as .bak). Otherwise, your code isn't portable (which may or may not matter to you now, but might well do later on).
You have:
sed -i 's/[./,]//g'
The trouble with this is that it looks for any of the characters ., / or , in isolation — so it removes the . in .txt as well as the . and / in ./. You need to look for consecutive characters — as in my suggested solution.
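The difference is easy to see by running both expressions on the sample line, without -i. A short sketch (the variable names are only for illustration):

```shell
#!/bin/sh
line='e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt'

# Bracket expression: removes every ".", "/" or "," wherever it occurs.
wrong=$(printf '%s\n' "$line" | sed 's/[./,]//g')

# Sequence: removes only a "." immediately followed by a "/".
right=$(printf '%s\n' "$line" | sed 's%[.]/%%g')

printf '%s\n' "$wrong" "$right"
```

The first result has lost the dot in .txt as well; the second keeps it.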
Try this:
echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed -e 's|\./||'
You need to escape the . with a backslash so that it matches a literal dot:
's#\.\/##g'
:=>echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed 's#\.\/##g'
e2b66a3d84ee448c33d7f2a2f7e51c58 2017_06_10_0400.txt
:=>

unterminated address regex while using sed

I am trying to use the sed command to find and print the number that appears between "\MP2=" and "\" in a portion of a line that appears like this in a large .log file
\MP2=-193.0977448\
I am using the command below and getting the following error:
sed "/\MP2=/,/\/p" input.log
sed: -e expression #1, char 12: unterminated address regex
Advice on how to alter this would be greatly appreciated!
Superficially, you just need to double up the backslashes (and it's generally best to use single quotes around the sed program):
sed '/\\MP2=/,/\\/p' input.log
Why? The double-backslash is necessary to tell sed to look for one backslash. The shell also interprets backslashes inside double quoted strings, which complicates things (you'd need to write 4 backslashes to ensure sed sees 2 and interprets it as 'look for 1 backslash') — using single quoted strings avoids that problem.
However, the /pat1/,/pat2/ notation refers to two separate lines. It looks like you really want:
sed -n '/\\MP2=.*\\/p' input.log
The -n suppresses the default printing (probably a good idea on the first alternative too), and the pattern looks for a single line containing \MP2= followed eventually by a backslash.
If you want to print just the number (as the question says), then you need to work a little harder. You need to match everything on the line, but capture just the 'number' and remove everything except the number before printing what's left (which is just the number):
sed -n '/.*\\MP2=\([^\]*\)\\.*/ s//\1/p' input.log
You don't need the double backslash in the [^\] (negated) character class, though it does no harm.
If the starting and ending pattern are on the same line, you need a substitution. The range expression /r1/,/r2/ is true from (an entire) line which matches r1, through to the next entire line which matches r2.
You want this instead:
sed -n 's/.*\\MP2=\([^\\]*\)\\.*/\1/p' file
This extracts just the match, by replacing the entire line with just the match (the escaped parentheses create a group which you can refer back to in the substitution; this is called a back reference. Some sed dialects don't want backslashes before the grouping parentheses.)
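The substitution can be tried on the fragment from the question before pointing it at the real log. A minimal sketch; log_line holds just the sample fragment, not a full line from a .log file:

```shell
#!/bin/sh
# The fragment from the question; single quotes keep the backslashes literal.
log_line='\MP2=-193.0977448\'

# Replace the whole line with the captured text between "\MP2=" and the
# next backslash, and print only lines where the substitution happened.
mp2=$(printf '%s\n' "$log_line" | sed -n 's/.*\\MP2=\([^\\]*\)\\.*/\1/p')

printf '%s\n' "$mp2"
```

Lines without \MP2= produce no output at all, thanks to -n together with the p flag.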
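awk is a better tool for this:
awk -F= '$1=="MP2" {print $2}' RS='\\' input.log
Set the record separator to \ and the field separator to =, and it's pretty trivial. (The backslash is doubled because awk processes escape sequences in command-line variable assignments.)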

Find and replace variables containing non-escaped characters with sed

I can use this, to find all instances of "fly" and replace it with "insect" in my file:
sed -i 's/fly/insect/g' ./animals.txt
How can I find a BASH variable and replace it with another BASH variable? E.g.:
name=$(echo "fly")
category=$(echo "insect")
sed -i 's/$name/$category/g' ./animals.txt
Update:
I am using GNU sed version 4.2.1. When I try the solutions below, it reports this error:
sed: -e expression #1, char 73: unknown option to `s'
Update:
I discovered why the error is coming up. $category frequently contains lots of symbols (e.g. "/", "$", "#", "!", brackets, etc.).
name=$(echo "fly")
category=$(echo "in/sect")
sed -i "s/$name/$category/g" ./animals.txt
The above code will create the same error:
sed: -e expression #1, char 7: unknown option to `s'
Is there a way to let sed complete the replaces, even when it encounters these symbols?
Using double quotes
Use double quotes to make the shell expand variables while keeping whitespace:
sed -i "s/$name/$category/g" ./animals.txt
Note: in the replacement, & stands for the matched text and \& is a literal ampersand. Because the shell also processes backslashes inside double quotes, it is safest to double any backslash that is meant for sed (e.g. for back references).
Using single quotes
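If you have a lot of shell metacharacters in the expression, you can keep the sed program in single quotes and step out of them only for the variables, something like this:
sed -i 's/'"$name"'/'"$category"'/g' ./animals.txt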
I discovered why the error is coming up. $category frequently contains lots of symbols (e.g. "/", "$", "#", "!", brackets, etc.).
If the pattern or the replacement contains characters like /, you can use a different sed delimiter, such as # % , ; : | or _. Any character works as long as it does not occur in either the pattern or the replacement.
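When no delimiter is guaranteed to be absent from the variable, another approach is to escape the troublesome characters in the replacement first. The following is a sketch under assumptions: the value of category is made up, the pattern in name is assumed to be plain text, and values containing newlines are not handled:

```shell
#!/bin/sh
name='fly'
category='in/sect & co'    # hypothetical value containing / and &

# Put a backslash in front of every \, / and & in the replacement so that
# sed treats them literally. A comma is used as the delimiter here so the
# / inside the bracket expression needs no special care.
escaped_category=$(printf '%s\n' "$category" | sed 's,[\\/&],\\&,g')

replaced=$(printf '%s\n' 'a fly landed' | sed "s/$name/$escaped_category/g")

printf '%s\n' "$replaced"
```

The escaped value can then be used with the usual / delimiter, as in the last sed call.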
Use double-quotes so the shell will interpolate your variables:
sed -i "s/$name/$category/g" ./animals.txt

Resources