Is there a way to do multiple "sed" at once in bash? [duplicate] - bash

This question already has answers here:
Combine multiple sed commands [duplicate]
(5 answers)
Closed 8 years ago.
So I am looking to edit a number of bits of a file prior to using it as an input file for model simulations. At the moment I am passing it back and forth between a couple of temporary files (it was a bit buggy when I tried to write to the same temporary file) before finally making a file I can use to run the model. Is there a way to get all this alterations made simultaneously? I reckon doing it the way I am now is probably quite inefficient. Example of code below:
sed -e "s/9000000.0/${naerval}/" MC_NAMELIST_Pin14_Run3.IN > /tmp/temp1.in
# sed is a way to change a string in a text file
sed -e "s/8000000.0/${sig_aer}/" /tmp/temp1.in > /tmp/temp2.in
sed -e "s/7000000.0/${d_aer}/" /tmp/temp2.in > /tmp/temp1.in
sed -e "s/6000000.0/${t_twall}/" /tmp/temp1.in > /tmp/temp2.in
sed -e "s/5000000.0/${RH}/" /tmp/temp2.in > /tmp/temp1.in
sed -e "s/4000000.0/${Therm_Coeff}/" /tmp/temp1.in > /tmp/temp2.in
sed -e "s/3000000.0/${press_decay}/" /tmp/temp2.in > /tmp/temp1.in
sed -e "s/2000000.0/${kappa}/" /tmp/temp1.in > /tmp/NAMELIST.IN
./main.exe /tmp/NAMELIST.IN
I have additionally attempted replacing this code with:
sed -i.bak s~9000000.0~${naerval}~;s~8000000.0~${sig_aer}~;s~7000000.0~${d_aer}~;s~6000000.0~${t_twall}~;s~5000000.‌0~${RH}~;s~4000000.0~${Therm_Coeff}~;s~3000000.0~${press_decay}~;s~2000000.0~${ka‌​ppa}~;" MC_NAMELIST_Pin14_Run3.IN > /tmp/NAMELIST.IN
./main.exe /tmp/NAMELIST.IN
However, this causes an error in main.exe while the original code does not. I assume therefore that this code does not alter MC_NAMELIST_Pin14_Run3.IN in the expected way.

You can combine several sed commands like this:
sed -i.bak "s/9000000.0/${naerval}/; s/8000000.0/${sig_aer}/" /tmp/temp1.in
-i.bak enables in-place editing and saves the original file with a .bak extension.
Keep in mind that your replacement strings must not contain a slash or a newline.
You can use an alternate delimiter like this:
sed -i.bak "s~9000000.0~${naerval}~; s~8000000.0~${sig_aer}~" /tmp/temp1.in
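As a minimal sketch of the alternate-delimiter form: the values and the two-line sample input below are made up (they are not taken from the real namelist), and the output goes to a variable instead of editing a file in place. It only shows that a replacement containing a slash is harmless once ~ is the delimiter:

```shell
# Hypothetical values; in the question these come from the model setup.
naerval="1.5e3"
sig_aer="0.2/0.4"   # contains a slash, so "/" would break the s command

# Two substitutions in one sed call, using ~ as the delimiter.
out=$(printf 'naer = 9000000.0\nsig = 8000000.0\n' |
  sed "s~9000000.0~${naerval}~; s~8000000.0~${sig_aer}~")

printf '%s\n' "$out"
```

The same restriction then applies to ~: whichever delimiter you pick must not occur in the values.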

At the moment I am passing it back and forth between a couple of temporary files
Writing and reading all those temp files is crazy, that's what pipes are for!
sed -e "s/9000000.0/${naerval}/" MC_NAMELIST_Pin14_Run3.IN | \
sed -e "s/8000000.0/${sig_aer}/" | \
sed -e "s/7000000.0/${d_aer}/" | etc.
But you can combine all the edits into one sed invocation with multiple scripts, preceding each one with -e:
sed -e "s/9000000.0/${naerval}/" -e "s/8000000.0/${sig_aer}/" -e "s/7000000.0/${d_aer}/" -e etc. etc. MC_NAMELIST_Pin14_Run3.IN > /tmp/NAMELIST.IN
Or as a single script with many commands, separated by semi-colons:
sed -e "s/9000000.0/${naerval}/;s/8000000.0/${sig_aer}/;s/7000000.0/${d_aer}/;..." MC_NAMELIST_Pin14_Run3.IN > /tmp/NAMELIST.IN
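To check that the three forms really are interchangeable for simple substitutions like these, here is a small sketch. The values and the two-line stand-in for the namelist file are assumptions made for illustration only:

```shell
# Hypothetical replacement values and a tiny stand-in for the namelist file.
naerval=100
sig_aer=2.5
input=$(printf 'naer = 9000000.0\nsig = 8000000.0\n')

# 1. A pipeline of separate sed processes.
piped=$(printf '%s\n' "$input" |
  sed -e "s/9000000.0/${naerval}/" |
  sed -e "s/8000000.0/${sig_aer}/")

# 2. One sed process, one -e per command.
multi_e=$(printf '%s\n' "$input" |
  sed -e "s/9000000.0/${naerval}/" -e "s/8000000.0/${sig_aer}/")

# 3. One sed process, one script with semicolon-separated commands.
semicolons=$(printf '%s\n' "$input" |
  sed -e "s/9000000.0/${naerval}/;s/8000000.0/${sig_aer}/")

printf '%s\n' "$multi_e"
```

All three variables end up with the same text here, because each substitution works on one line at a time and none of them changes what the others match.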

Try something like this (with double quotes, so that the shell expands the variables):
sed -i.bak -e "s/9000000.0/${naerval}/" -e "s/8000000.0/${sig_aer}/" MC_NAMELIST_Pin14_Run3.IN
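Whichever form is used, the quoting decides whether ${naerval} is expanded at all: the shell only substitutes variables inside double quotes. A small sketch with a made-up value:

```shell
naerval=42   # hypothetical value

# Single quotes: the shell passes ${naerval} to sed literally.
single=$(printf '9000000.0\n' | sed -e 's/9000000.0/${naerval}/')

# Double quotes: the shell expands ${naerval} before sed runs.
double=$(printf '9000000.0\n' | sed -e "s/9000000.0/${naerval}/")

printf 'single: %s\ndouble: %s\n' "$single" "$double"
```

With single quotes the output file would contain the text ${naerval} instead of the number, which a model reading the file is unlikely to accept.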

Related

multiple sed with -e and escape characters

I'm trying to do multiple replacements in a gzipped file and have been having trouble.
zcat PteBra.fa.align.gz | sed -e 's#Simple_repeat/Satellite/Y-chromosome#Simple_repeat/Satellite#g' -e sed 's#Unknown/Unknown/Y-chromosome#Unknown/Unknown#g' -e sed 's#DNA/DNA/TcMar#DNA/TcMar#g' -e sed 's#DNA/DNA/Crypton#DNA/Crypton#g' -e sed 's#DNA/DNA/PIF-Harbinger#DNA/PIF-Harbinger#g' -e sed 's#DNA/DNA/CMC-Chapaev-3#DNA/CMC-Chapaev-3#g' -e sed 's#SINE/SINE/RTE#SINE/RTE#g' > PteBra.fa.align.corrected
Note that I'm using # instead of the standard / because of the presence of / in the text I want to replace. Each individual sed works with no problem but stringing them together yields this consistent error:
sed: -e expression #2, char 3: unterminated `s' command
I have looked all over for a solution but finally, to get the work done, just did all the sed's individually. It takes FOREVER, so I'd like to get this option working.
I've been at this for hours and would appreciate some help.
What am I doing wrong?
Thanks.
You don't have to write -e sed each time! -e will do.
zcat PteBra.fa.align.gz | sed -e 's#Simple_repeat/Satellite/Y-chromosome#Simple_repeat/Satellite#g' -e 's#Unknown/Unknown/Y-chromosome#Unknown/Unknown#g' -e 's#DNA/DNA/TcMar#DNA/TcMar#g' -e 's#DNA/DNA/Crypton#DNA/Crypton#g' -e 's#DNA/DNA/PIF-Harbinger#DNA/PIF-Harbinger#g' -e 's#DNA/DNA/CMC-Chapaev-3#DNA/CMC-Chapaev-3#g' -e 's#SINE/SINE/RTE#SINE/RTE#g' > PteBra.fa.align.corrected
Or you can separate the commands with semicolons inside a single sed expression:
zcat PteBra.fa.align.gz | sed -e '
s#Simple_repeat/Satellite/Y-chromosome#Simple_repeat/Satellite#g;
s#Unknown/Unknown/Y-chromosome#Unknown/Unknown#g;
s#DNA/DNA/TcMar#DNA/TcMar#g;
s#DNA/DNA/Crypton#DNA/Crypton#g;
s#DNA/DNA/PIF-Harbinger#DNA/PIF-Harbinger#g;
s#DNA/DNA/CMC-Chapaev-3#DNA/CMC-Chapaev-3#g;
s#SINE/SINE/RTE#SINE/RTE#g
' > PteBra.fa.align.corrected
As you already have a proper answer, this is not another answer
but a small suggestion for the actual operation.
Writing the whole sed command on one line can get messy. How about
preparing a lookup table in CSV format, with one search string and
its replacement per line, like this:
table.txt
Simple_repeat/Satellite/Y-chromosome,Simple_repeat/Satellite
Unknown/Unknown/Y-chromosome,Unknown/Unknown
DNA/DNA/TcMar,DNA/TcMar
DNA/DNA/Crypton,DNA/Crypton
DNA/DNA/PIF-Harbinger,DNA/PIF-Harbinger
DNA/DNA/CMC-Chapaev-3,DNA/CMC-Chapaev-3
SINE/SINE/RTE,SINE/RTE
Then you can execute the following awk script to replace the strings:
zcat PteBra.fa.align.gz | awk -F, '
NR==FNR {repl[$1] = $2; next}
{
for (r in repl) gsub(r, repl[r])
print
}
' table.txt - > PteBra.fa.align.corrected
Hope this helps.
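If you would rather stay with sed, the same lookup table can be turned into a sed script and run with -f. This is only a sketch of that variation, not part of the suggestion above: it uses a shortened two-row table and made-up input, and it assumes the table entries contain no #, no extra commas, and no regex metacharacters.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)

# Shortened copy of the lookup table: search string, comma, replacement.
cat > "$workdir/table.txt" <<'EOF'
DNA/DNA/TcMar,DNA/TcMar
SINE/SINE/RTE,SINE/RTE
EOF

# Turn each "old,new" row into an "s#old#new#g" command.
sed 's/^\([^,]*\),\(.*\)$/s#\1#\2#g/' "$workdir/table.txt" > "$workdir/fix.sed"

# Apply the generated script to some sample input.
out=$(printf 'x DNA/DNA/TcMar y\nz SINE/SINE/RTE\n' | sed -f "$workdir/fix.sed")
printf '%s\n' "$out"

rm -r "$workdir"
```

Unlike the awk loop over the array, the generated script applies the rules in the order they appear in the table.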

Error on sed script - extra characters after command

I've been trying to create a sed script that reads a list of phone numbers and only prints ones that match the following schemes:
+1(212)xxx-xxxx
1(212)xxx-xxxx
I'm an absolute beginner, but I tried to write a sed script that would print this for me using the -n -r flags (the contents of which are as follows):
/\+1\(212\)[0-9]{3}-[0-9]{4}/p
/1\(212\)[0-9]{3}-[0-9]{4}/p
If I run this in sed directly, it works fine (i.e. sed -n -r '/\+1\(212\)[0-9]{3}-[0-9]{4}/p' sample.txt prints matching lines as expected). This does NOT work in the sed script I wrote; instead, sed says:
sed: -e expression #1, char 2: extra characters after command
I could not find a good solution, this error seems to have so many causes and none of the answers I found apply easily here.
EDIT: I ran it with sed -n -r script.sed sample.txt
sed can not automatically determine whether you intended a parameter to be a script file or a script string.
To run a sed script from a file, you have to use -f:
$ echo 's/hello/goodbye/g' > demo.sed
$ echo "hello world" | sed -f demo.sed
goodbye world
If you neglect the -f, sed will try to run the filename as a command, and the delete command is not happy to have emo.sed after it:
$ echo "hello world" | sed demo.sed
sed: -e expression #1, char 2: extra characters after command
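Applied to the phone number case, a sketch could look like the following. The sample numbers are invented, the two patterns from the question are folded into one with an optional leading +, and -E is used in place of -r (GNU sed accepts both):

```shell
# Write the script to a temporary file (stands in for script.sed).
script=$(mktemp)
cat > "$script" <<'EOF'
/^\+?1\(212\)[0-9]{3}-[0-9]{4}$/p
EOF

# -f tells sed that the argument is a script file, not a script string.
out=$(printf '+1(212)555-0100\n1(212)555-0101\n1(646)555-0102\n' |
  sed -n -E -f "$script")
printf '%s\n' "$out"

rm -f "$script"
```

Only the two 212 numbers are printed; the 646 number is filtered out.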
Of the various unix tools out there, two use BRE as their default regex dialect. Those two tools are sed and grep.
In most operating systems, you can use egrep or grep -E to tell that tool to use ERE as its dialect. A smaller (but still significant) number of sed implementations will accept a -E option to use ERE.
In BRE mode, however, you can still create atoms with brackets. And you do it by escaping parentheses. That's why your initial expression is failing -- the parentheses are NOT special by default in BRE, but you're MAKING THEM SPECIAL by preceding the characters with backslashes.
The other thing to keep in mind is that if you want sed to execute a script from a command line argument, you should use the -e option.
So:
$ cat ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
212-xxx-xxxx
$ grep '^+\{0,1\}1([0-9]\{3\})' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ egrep '^[+]?1\([0-9]{3}\)' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -n -e '/^+\{0,1\}1([0-9]\{3\})/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -E -n -e '/^[+]?1\([0-9]{3}\)/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
Depending on your OS, you may be able to get a full list of how this works from man re_format.

How to combine multiple sed commands into one [duplicate]

This question already has answers here:
Combining two sed commands
(2 answers)
Closed 1 year ago.
I have 4 different sed commands which I run on a file. To improve performance, I want to combine these 4 commands into one.
Each one is a complex command using the -E switch. I searched many forums but could not find an answer for my specific case.
sed -i -E ':a; s/('"$search_str"'X*)[^X&]/\1X/; ta' "$newfile"
sed -i -E '/[<]ExtData[>?" "]/{:a; /Name=/{/Name="'"$nvp_list_ORed"'"/!b}; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' "$newfile"
sed -i -E ':a; s/('"$search_str1"'X*)[^X\<]/\1X/; ta' "$newfile"
sed -i -E ':a; s/('"$search_str2"'X*)[^X\/]/\1X/; ta' "$newfile"
I want to combine them into something like:
sed -i -E 'command1' -e 'command2' -e 'command3' -e 'command4'
"$newfile"
But it is not working, maybe because -E and -e can't be combined.
Please let me know.
Thanks !! Puneet
-E means "extended regex" and is a standalone flag, -e means "expression" and must be followed by a sed expression.
You can combine them, but if you want several sed expressions, each one must be preceded by -e, which isn't the case for your first one.
sed -i -E -e 'command1' -e 'command2' -e 'command3' -e 'command4' "$newfile"
A second option is to write all the commands in the same expression:
sed -i -E 'command1;command2;command3;command4' "$newfile"
However, since you're using labels, I wouldn't rely on this option; some implementations may not support it, as John1024 pointed out.
Lastly, as mentioned by Mad Physicist, you can write your sed expressions to a file, which you then reference through the -f option.
The file must contain one sed expression per line (you can write multiline expressions by suffixing each line but the last with a \, thus escaping the line-feed).
Simply pipe them:
sed -E 'A' file | sed -E 'B' | ... >file.tmp && mv file.tmp file
As @Aaron observed, if you want to give multiple separate expressions to sed, you must designate them as -e options; they will be combined. You can also combine a bunch of expressions into one by separating the pieces with semicolons.
Your case is a bit special, however: your particular expressions use labels and branch instructions, with one of the label names (a) repeated in each expression. In order to combine these, each label should be distinct, and each branch (either conditional or absolute) should specify the correct label. That would look something like this:
sed -i -E \
-e ':a1; s/('"$search_str"'X*)[^X&]/\1X/; ta1' \
-e '/[<]ExtData[>?" "]/ {:a2; /Name=/ {/Name="'"$nvp_list_ORed"'"/ !b}; /Value=/ bb2; n; ba2; :b2; s/(Value="X*)[^X"]/\1X/; tb2; }' \
-e ':a3; s/('"$search_str1"'X*)[^X\<]/\1X/; ta3' \
-e ':a4; s/('"$search_str2"'X*)[^X\/]/\1X/; ta4' \
"$newfile"
Do note that even with proper quoting from a shell perspective, which you appear to have, your approach will not do what you expect if the value of any of the interpolated shell variables contains a regex metacharacter.
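Both points can be tried out on a toy input. In the sketch below, escape_ere is a hypothetical helper (not a standard tool) that backslash-escapes ERE metacharacters, and the keys user.id= and pw= are made-up stand-ins for $search_str and $search_str1. Putting each label, s command, and branch in its own -e is a portability choice, not something the original commands require.

```shell
# Hypothetical helper: backslash-escape ERE metacharacters and "/".
escape_ere() {
  printf '%s\n' "$1" | sed 's,[][(){}.*+?^$|\\/],\\&,g'
}

# Made-up search strings; the dot in the first one is meant literally.
key1=$(escape_ere 'user.id=')
key2=$(escape_ere 'pw=')

# Each loop has its own label (a1, a2), so the branches cannot collide.
out=$(printf 'userXid=abc&user.id=abc&pw=xyz&n=1\n' | sed -E \
  -e ':a1' -e "s/(${key1}X*)[^X&]/\1X/" -e 'ta1' \
  -e ':a2' -e "s/(${key2}X*)[^X&]/\1X/" -e 'ta2')

printf '%s\n' "$out"
```

Because the dot is escaped, userXid=abc is left alone and only the values after user.id= and pw= are masked. Escaping only makes sense for parts that are meant literally; a variable that deliberately holds a regex, such as an alternation, should not go through the helper.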
Warning: It is not always possible to combine multiple sed scripts into a single one without change. Sometimes you might have to do a redesign of your algorithm.
Sed has two concepts of memory: the pattern space and the hold space. Concatenation only works if these two spaces are identical in both sed commands. Below is an example where the pattern space changes:
$ echo aa | sed -e 's/./&\n/' | sed -e '1s/a/b/g'
b
a
$ echo aa | sed -e 's/./&\n/' -e '1s/a/b/g'
b
b
$ echo aa | gsed -e 's/./&\n/;1s/a/b/g'
b
b
In the original pipeline, the first sed command works on the pattern space aa, while the second script's pattern space is only a.

Using a bash variable to pass multiple -e clauses to sed [duplicate]

This question already has answers here:
Why does shell ignore quoting characters in arguments passed to it through variables? [duplicate]
(3 answers)
Closed 6 years ago.
I'm creating a variable from an array which build up multiple -e clauses for a sed command.
The resulting variable is something like:
sedArgs="-e 's/search1/replace1/g' -e 's/search2/replace2/g' -e 's/search3/replace3/g'"
But when I try to call sed with this as the argument I get the error sed: -e expression #1, char 1: unknown command: ''
I've tried to call sed the following ways:
cat $myFile | sed $sedArgs
cat $myFile | sed ${sedArgs}
cat $myFile | sed `echo $sedArgs`
cat $myFile | sed "$sedArgs"
cat $myFile | sed `echo "$sedArgs"`
and all give the same error.
UPDATE - Duplicate question
As has been identified, this is a 'quotes expansion' issue - I thought it was something sed specific, but the duplicate question that has been identified put me on the right track.
I managed to resolve the issue by creating the sedArgs string as:
sedArgs="-e s/search1/replace1/g -e s/search2/replace2/g -e s/search3/replace3/g"
and calling it with:
cat $myFile | sed $sedArgs
which works perfectly.
Then I took the advice of tripleee and kicked the useless cat out!
sed $sedArgs $myFile
also works perfectly.
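To see why the quoted version fails, it helps to print the words sed actually receives. This sketch uses a one-clause version of the string; printf here just stands in for sed:

```shell
sedArgs="-e 's/search1/replace1/g'"

# Unquoted expansion splits on whitespace only; the quote characters stay
# inside the words instead of being interpreted by the shell.
words=$(printf '<%s>\n' $sedArgs)
printf '%s\n' "$words"
```

The second word starts with a literal ', which is exactly the "unknown command" that sed complains about. Dropping the inner quotes works only as long as no expression contains whitespace or glob characters.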
Use BASH arrays instead of simple string:
# sed arguments in an array
sedArgs=(-e 's/search1/replace1/g' -e 's/search2/replace2/g' -e 's/search3/replace3/g')
# then use it as
sed "${sedArgs[@]}" file
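Since the question mentions building the clauses from an array, here is a sketch of how that could look in bash. The pairs array and its search:replace format are assumptions made for the example, not something from the question:

```shell
# Hypothetical input: one "search:replace" pair per element (bash only).
pairs=("search1:replace1" "search2:replace2" "search3:replace3")

# Build the argument list element by element; each -e and each
# expression stays a separate word, whatever characters it contains.
sedArgs=()
for pair in "${pairs[@]}"; do
  sedArgs+=(-e "s/${pair%%:*}/${pair#*:}/g")
done

out=$(printf 'search1, search2 and search3\n' | sed "${sedArgs[@]}")
printf '%s\n' "$out"
```

Quoting "${sedArgs[@]}" expands every element as its own argument, so no second round of quote parsing is needed.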
There is no sane way to do that, but you can pass the script as a single string.
sedArgs='s/search1/replace1/g
s/search2/replace2/g
s/search3/replace3/g'
: then
sed "$sedArgs" "$myFile"
The single-quoted string spans multiple lines; this is scary when you first see it, but perfectly normal shell script. Notice also how the cat is useless as ever, and how the file name needs to be quoted, too.

Replace all unquoted characters from a file bash

Using bash, how would one replace all unquoted characters from a file?
I have a system that I can't modify that spits out CSV files such as:
code;prop1;prop2;prop3;prop4;prop5;prop6
0,1000,89,"a1,a2,a3",33,,
1,,,"a55,a10",1,1 L,87
2,25,1001,a4,,"1,5 L",
I need this to become, for a new system being added
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;a1,a2,a3;33;;
1;;;a55,a10;1;1 L;87
2;25;1001;a4;;1,5 L;
If the quotes can be removed after this substitution happens in one command it would be nice :) But I prefer clarity to complicated one-liners for future maintenance.
Thank you
With sed:
sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop'
Test:
$ sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop' yourfile
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;"a1,a2,a3";33;;
1;;;"a55,a10";1;1 L;87
2;25;1001;a4;;"1,5 L";
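To also drop the quotes, as asked in the question, one more substitution can be appended after the loop. This is a sketch checked only against the sample rows from the question; a line with more than one quoted field may not convert correctly. The label, substitution, and branch are split into separate -e options for portability.

```shell
# Sample rows from the question.
csv=$(printf '%s\n' \
  'code;prop1;prop2;prop3;prop4;prop5;prop6' \
  '0,1000,89,"a1,a2,a3",33,,' \
  '1,,,"a55,a10",1,1 L,87' \
  '2,25,1001,a4,,"1,5 L",')

# 1. Turn every comma into a semicolon.
# 2. Loop: turn semicolons inside a quoted field back into commas.
# 3. Remove the quotes.
out=$(printf '%s\n' "$csv" | sed \
  -e 's/,/;/g' \
  -e ':loop' \
  -e 's/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/' \
  -e 't loop' \
  -e 's/"//g')

printf '%s\n' "$out"
```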
You want to use a csv parser. Parsing csv with shell tools is hard (you will encounter regular expressions soon, and they rarely get all cases).
There is one in almost every language. I recommend python.
You can also do this using excel/openoffice variants by opening the file and then saving with ; as the separator.
You can use sed:
echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g"
This will replace " with the empty string (deletes it), and you can pipe another sed to replace the , with ;:
sed -e "s|,|;|g"
$ echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g" | sed -e "s|,|;|g"
>> 0;1000;89;a1;a2;a3;33;;
Note that you can use any separator you want instead of | inside the sed command. For example, you can rewrite the first sed as:
sed -e "s-\"--g"
