Linux Shell script Sed inserting - bash

I need to write a shell script to insert a parameter string after every big letter in a file.
parameter="4"
Example input.txt
AppLe
House
Example output.txt
A4ppL4e
H4ouse
I've tried to use
sed '/[A-Z]/i\$1\'
Can anyone help me?
THX

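With GNU or busybox sed, which accept the -i option without a backup suffix:
param=4
sed -i -e 's/\([[:upper:]]\)/\1'"$param"'/g' input.txt
This edits input.txt in place: each uppercase letter is captured by \(...\) and replaced, globally, with itself (\1) followed by the value of the variable param. Note that the shell reduces -i'' to plain -i, so BSD sed, which requires a suffix argument, needs the empty suffix passed separately as -i ''.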
With standard sed you need a temporary file or sponge from the moreutils package:
param=4
sed 's/\([[:upper:]]\)/\1'"$param"'/g' input.txt > temp && mv temp input.txt
param=4
sed 's/\([[:upper:]]\)/\1'"$param"'/g' input.txt | sponge input.txt
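If param could contain characters that are special in a sed replacement (a backslash, & or the / delimiter), it would need escaping before being spliced into the script. A minimal sketch, assuming a hypothetical helper called insert_after_upper that is not part of the answer above:

```shell
#!/bin/sh
# insert_after_upper is a hypothetical helper name, not from the original answer.
# It reads text on stdin and inserts "$1" after every uppercase letter.
insert_after_upper() {
    # Escape backslash, ampersand and slash so that sed treats them
    # literally in the replacement part of the s command.
    esc=$(printf '%s\n' "$1" | sed 's|[\\&/]|\\&|g')
    sed 's/\([[:upper:]]\)/\1'"$esc"'/g'
}

result=$(printf '%s\n' AppLe House | insert_after_upper 4)
printf '%s\n' "$result"
```

This sketch only handles single-line parameters; a parameter containing a newline would still break the sed script.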

Use a file editor like ed to edit files:
printf "%s\n" 'g/[[:upper:]]/s/\([[:upper:]]\)/\1'"$param"'/g' w | ed -s input.txt
or if you like heredocs better
ed -s input.txt <<EOF
g/[[:upper:]]/s/\([[:upper:]]\)/\1${param}/g
w
EOF

Related

How to remove consecutive repeating characters from every line?

I have the below lines in a file
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;;;;
Acanthocephala;;;;;;;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;
and I want to remove the repeating semi-colon characters from all lines to look like below (note- there are repeating semi-colons in the middle of some of the above lines too)
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;
Acanthocephala;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Polymorphus;
I would appreciate if someone could kindly share a bash one-liner to accomplish this.
You can use tr with "squeeze":
tr -s ';' < infile
perl -p -e 's/;+/;/g' myfile # writes output to stdout
or
perl -p -i -e 's/;+/;/g' myfile # does an in-place edit
If you want to edit the file itself:
printf "%s\n" 'g/;;/s/;\{2,\}/;/g' w | ed -s foo.txt
If you want to pipe a modified copy of the file to something else and leave the original unchanged:
sed 's/;\{2,\}/;/g' foo.txt | whatever
These replace runs of 2 or more semicolons with single ones.
This could be solved easily with substitutions.
Here is an awk solution that plays with the FS/OFS variables:
awk -F';+' -v OFS=';' '$1=$1' file
or
awk -F';+' -v OFS=';' '($1=$1)||1' file
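The second form matters because a bare $1=$1 acts as a condition: awk prints the line only when the value assigned to $1 is non-empty and non-zero. A small sketch with made-up input lines (not taken from the question) that seems to illustrate the difference:

```shell
#!/bin/sh
# Made-up input: the first field is "0" on one line and empty on the other.
input=$(printf '%s\n' '0;;a' ';;b')

# $1=$1 alone is the pattern, so both of these lines are filtered out.
strict=$(printf '%s\n' "$input" | awk -F';+' -v OFS=';' '$1=$1')

# ($1=$1)||1 is always true, so every line is rebuilt with OFS and printed.
safe=$(printf '%s\n' "$input" | awk -F';+' -v OFS=';' '($1=$1)||1')

printf 'strict: [%s]\nsafe:\n%s\n' "$strict" "$safe"
```

For the data in the question every line starts with a non-empty name, so either form should give the same result there.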
Here's a sed version of alaniwi's answer:
sed 's/;\+/;/g' myfile # Write output to stdout
or
sed -i 's/;\+/;/g' myfile # Edit the file in-place
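As a quick check of the approaches above, the following sketch runs tr -s and the POSIX-style sed expression on the last sample line from the question and stores both results. The variable names are only for illustration.

```shell
#!/bin/sh
# One sample line from the question, with repeated semicolons.
line='Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;'

# Squeeze runs of ";" with tr, and with a sed interval expression.
with_tr=$(printf '%s\n' "$line" | tr -s ';')
with_sed=$(printf '%s\n' "$line" | sed 's/;\{2,\}/;/g')

printf '%s\n' "$with_tr" "$with_sed"
```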

How to delete a line (matching a pattern) from a text file? [duplicate]

How would I use sed to delete all lines in a text file that contain a specific string?
To remove the line and print the output to standard out:
sed '/pattern to match/d' ./infile
To directly modify the file – does not work with BSD sed:
sed -i '/pattern to match/d' ./infile
Same, but for BSD sed (Mac OS X and FreeBSD) – does not work with GNU sed:
sed -i '' '/pattern to match/d' ./infile
To directly modify the file (and create a backup) – works with BSD and GNU sed:
sed -i.bak '/pattern to match/d' ./infile
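A small sketch of the backup variant on a throwaway file, to show what ends up where. The directory and file contents are made up for the demonstration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is modified.
dir=$(mktemp -d)
printf '%s\n' 'first line' 'pattern to match here' 'last line' > "$dir/infile"

# Delete matching lines in place; the original is kept as infile.bak.
sed -i.bak '/pattern to match/d' "$dir/infile"

edited=$(cat "$dir/infile")
backup=$(cat "$dir/infile.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
rm -rf "$dir"
```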
There are many other ways to delete lines with specific string besides sed:
AWK
awk '!/pattern/' file > temp && mv temp file
Ruby (1.9+)
ruby -i.bak -ne 'print if not /pattern/' file
Perl
perl -ni.bak -e "print unless /pattern/" file
Shell (bash 3.2 and later)
while IFS= read -r line
do
[[ ! $line =~ pattern ]] && printf '%s\n' "$line"
done <file > o
mv o file
GNU grep
grep -v "pattern" file > temp && mv temp file
And of course sed (printing the inverse is faster than actual deletion):
sed -n '/pattern/!p' file
You can use sed to delete lines in place in a file. However, it seems to be much slower than using grep to write the non-matching lines into a second file and then moving the second file over the original.
e.g.
sed -i '/pattern/d' filename
or
grep -v "pattern" filename > filename2; mv filename2 filename
The first command takes 3 times longer on my machine anyway.
The easy way to do it, with GNU sed:
sed --in-place '/some string here/d' yourfile
You may consider using ex (which is a standard Unix command-based editor):
ex +g/match/d -cwq file
where:
+ executes given Ex command (man ex), same as -c which executes wq (write and quit)
g/match/d - Ex command to delete lines with given match, see: Power of g
The above example is a POSIX-compliant method for in-place editing a file as per this post at Unix.SE and POSIX specifications for ex.
The difference with sed is that:
sed is a Stream EDitor, not a file editor. (BashFAQ)
Unless you enjoy unportable code, I/O overhead and some other bad side effects. So basically some parameters (such as in-place/-i) are non-standard FreeBSD extensions and may not be available on other operating systems.
I was struggling with this on Mac. Plus, I needed to do it using variable replacement.
So I used:
sed -i '' "/$pattern/d" "$file"
where $file is the file where deletion is needed and $pattern is the pattern to be matched for deletion.
I picked the '' from this comment.
The thing to note here is the use of double quotes in "/$pattern/d". The variable is not expanded when we use single quotes.
You can also use this:
grep -v 'pattern' filename
Here -v inverts the match, so only the lines that do not contain your pattern are printed.
To get an in-place-like result with grep you can do this:
echo "$(grep -v "pattern" filename)" >filename
I have made a small benchmark with a file which contains approximately 345 000 lines. The way with grep seems to be around 15 times faster than the sed method in this case.
I have tried both with and without the setting LC_ALL=C; it does not seem to change the timings significantly. The search string (CDGA_00004.pdbqt.gz.tar) is somewhere in the middle of the file.
Here are the commands and the timings:
time sed -i "/CDGA_00004.pdbqt.gz.tar/d" /tmp/input.txt
real 0m0.711s
user 0m0.179s
sys 0m0.530s
time perl -ni -e 'print unless /CDGA_00004.pdbqt.gz.tar/' /tmp/input.txt
real 0m0.105s
user 0m0.088s
sys 0m0.016s
time (grep -v CDGA_00004.pdbqt.gz.tar /tmp/input.txt > /tmp/input.tmp; mv /tmp/input.tmp /tmp/input.txt )
real 0m0.046s
user 0m0.014s
sys 0m0.019s
Delete the matching lines from all files that contain the match:
grep -rl 'text_to_search' . | xargs sed -i '/text_to_search/d'
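File names containing spaces or newlines can trip up a plain grep | xargs pipeline. A hedged variant, assuming GNU grep (for -Z), GNU sed (for -i without a suffix) and an xargs that supports -0 and -r, passes the names NUL-separated instead. The scratch files below are made up for the demonstration.

```shell
#!/bin/sh
# Assumes GNU grep (-Z), GNU sed (-i without suffix) and xargs with -0 and -r.
dir=$(mktemp -d)
printf '%s\n' 'keep me' 'text_to_search here' > "$dir/has space.txt"
printf '%s\n' 'nothing to see' > "$dir/other.txt"

# -Z ends each file name with a NUL byte; xargs -0 splits on NUL only,
# and -r skips running sed when no file matched.
grep -rlZ 'text_to_search' "$dir" | xargs -0 -r sed -i '/text_to_search/d'

spaced=$(cat "$dir/has space.txt")
other=$(cat "$dir/other.txt")
printf '%s\n' "$spaced" "$other"
rm -rf "$dir"
```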
SED:
'/James\|John/d'
-n '/James\|John/!p'
AWK:
'!/James|John/'
/James|John/ {next;} {print}
GREP:
-v 'James\|John'
perl -i -nle'/regexp/||print' file1 file2 file3
perl -i.bk -nle'/regexp/||print' file1 file2 file3
The first command edits the file(s) inplace (-i).
The second command does the same thing but keeps a copy or backup of the original file(s) by adding .bk to the file names (.bk can be changed to anything).
You can also delete a range of lines in a file.
For example to delete stored procedures in a SQL file.
sed '/CREATE PROCEDURE.*/,/END ;/d' sqllines.sql
This will remove all lines from CREATE PROCEDURE through END ; (both inclusive).
I have cleaned up many SQL files with this sed command.
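A small sketch of the range deletion on a made-up SQL fragment (the statements are invented for illustration):

```shell
#!/bin/sh
# Made-up SQL input with one stored procedure between two plain statements.
sql=$(mktemp)
cat > "$sql" <<'EOF'
SELECT 1;
CREATE PROCEDURE p()
BEGIN
  SELECT 2;
END ;
SELECT 3;
EOF

# Delete from the line matching the first address through the line
# matching the second address, inclusive.
result=$(sed '/CREATE PROCEDURE.*/,/END ;/d' "$sql")
printf '%s\n' "$result"
rm -f "$sql"
```

If a CREATE PROCEDURE line has no later END ; line, the range runs to the end of the input, so everything after it is deleted as well.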
echo -e "/thing_to_delete\ndd\033:x\n" | vim file_to_edit.txt
Just in case someone wants to do it for exact matches of strings, you can use the -w flag in grep - w for whole. That is, for example if you want to delete the lines that have number 11, but keep the lines with number 111:
-bash-4.1$ head file
1
11
111
-bash-4.1$ grep -v "11" file
1
-bash-4.1$ grep -w -v "11" file
1
111
It also works with the -f flag if you want to exclude several exact patterns at once. If "blacklist" is a file with several patterns on each line that you want to delete from "file":
grep -w -v -f blacklist file
to show the treated text in console
cat filename | sed '/text to remove/d'
to save treated text into a file
cat filename | sed '/text to remove/d' > newfile
to append treated text into an existing file
cat filename | sed '/text to remove/d' >> newfile
to treat already treated text, in this case remove more lines of what has been removed
cat filename | sed '/text to remove/d' | sed '/remove this too/d' | more
the | more will show text in chunks of one page at a time.
Curiously enough, the accepted answer does not actually answer the question directly. The question asks about using sed to delete lines containing a specific string, but the answer seems to presuppose knowledge of how to convert an arbitrary string into a regex.
Many programming language libraries have a function to perform such a transformation, e.g.
python: re.escape(STRING)
ruby: Regexp.escape(STRING)
java: Pattern.quote(STRING)
But how to do it on the command line?
Since this is a sed-oriented question, one approach would be to use sed itself:
sed 's/\([\[/({.*+^$?]\)/\\\1/g'
So given an arbitrary string $STRING we could write something like:
re=$(sed 's/\([\[/({.*+^$?]\)/\\\1/g' <<< "$STRING")
sed "/$re/d" FILE
or as a one-liner:
sed "/$(sed 's/\([\[/({.*+^$?]\)/\\\1/g' <<< "$STRING")/d" FILE
with variations as described elsewhere on this page.
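When the goal is only to drop lines containing a literal string, one alternative to escaping is to avoid regular expressions altogether. This is a different technique from the sed approach above: a minimal sketch using grep -F (fixed strings) with a temporary file, where the sample string and file contents are made up.

```shell
#!/bin/sh
# STRING is matched literally thanks to -F; no regex escaping is needed.
STRING='a.b*c'
file=$(mktemp)
printf '%s\n' 'keep axbxc' 'drop a.b*c here' 'keep too' > "$file"

tmp=$(mktemp)
grep -vF -e "$STRING" "$file" > "$tmp"
status=$?
# grep exits 0 if some lines were kept, 1 if every line matched,
# and with a larger value on a real error.
if [ "$status" -le 1 ]; then
    mv "$tmp" "$file"
else
    rm -f "$tmp"
fi

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```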
cat filename | grep -v "pattern" > filename.1
mv filename.1 filename
You can use good old ed to edit a file in a similar fashion to the answer that uses ex. The big difference in this case is that ed takes its commands via standard input, not as command line arguments like ex can. When using it in a script, the usual way to accommodate this is to use printf to pipe commands to it:
printf "%s\n" "g/pattern/d" w | ed -s filename
or with a heredoc:
ed -s filename <<EOF
g/pattern/d
w
EOF
This solution is for doing the same operation on multiple files.
for file in *.txt; do grep -v "Matching Text" "$file" > temp_file.txt; mv temp_file.txt "$file"; done
I found most of the answers not useful for me. If you use vim, I found this very easy and straightforward:
:g/<pattern>/d

Inserting a line in the beginning of a file using sed in HP-UX

I am trying to insert a line in the beginning of a file using sed.
I tried below commands :
sed -i '1s/^/LINE TO INSERT\n/' test.txt
sed: illegal option -- i --> Error thrown
sed '1i/^/LINE TO INSERT\n/' test.txt
sed: Function 1i/^/LINE TO INSERT\n/ cannot be parsed. --> Error thrown
Both attempts failed.
Any possible solution to it ? I am using ksh script on HP-UX.
Thanks.
How about good old ed?
printf '%s\n' 1i 'LINE TO INSERT' . w | ed -s file
printf is used to send each command to ed on a separate line.
Alternatively, if you're terrified of ed like me, you can just use a temporary file, as suggested in the comments:
echo 'LINE TO INSERT' > tmp && cat tmp test > new && mv new test && rm tmp
I think you have a typo: you're missing the closing apostrophe from your 1st command. Otherwise it's fine. I.e.:
You have this: sed -i '1s/^/... test.txt
But you need this: sed -i '1s/^/...' test.txt
Putting all together: sed -i '1s/^/LINE TO INSERT\n/' test.txt
Update: if -i is not supported, then you can use a temporary file:
sed '1s/^/LINE TO INSERT\n/' test.txt > /tmp/test.txt.tmp
mv /tmp/test.txt.tmp test.txt
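The \n in the replacement is not understood by every sed. A sketch of the POSIX form of the i command (backslash, newline, then the text), combined with a temporary file, which may be worth trying on systems without -i. Whether HP-UX sed accepts it exactly as written is an assumption, and the scratch file below is made up.

```shell
#!/bin/sh
# Scratch copy standing in for test.txt.
dir=$(mktemp -d)
printf '%s\n' 'first' 'second' > "$dir/test.txt"

# POSIX form: "1i\" followed by a newline and the text to insert.
sed '1i\
LINE TO INSERT
' "$dir/test.txt" > "$dir/test.txt.tmp" && mv "$dir/test.txt.tmp" "$dir/test.txt"

result=$(cat "$dir/test.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

Note that 1i only fires when the input has a first line, so an empty file stays empty with this approach.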

search and replace multiple occurrences

So I have a file containing millions of lines.
and now within the file I have occurrences such as
=Continent
=Country
=State
=City
=Street
Now I have an excel file in which I have the text that should replace these occurrences - as an example :
=Continent should be replaced with =Asia
Similarly for other text
Now I was thinking of writing a java program to read my input file , read the mapping file and for each occurrence search and replace.
I am being lazy here - was wondering if I could do the same using editors like VIM ?
would that be possible ?
NOTE - I dont want to do a single text replace - I have multiple text that need to be found and replaced and I dont want to do the search and replace manually for each.
EDIT1:
Contents of my file that I want to replace: "1.txt"
continent=cont_text
country=country_text
The file that contains the values I want to replace with : "to_replace.txt"
=cont_text~Asia
=country_text~India
and finally using 'sed' here is my .sh file - but I am doing something wrong - it does not replace the contents of "1.txt"
while IFS="~" read foo bar;
do
echo $foo
echo $bar
for filename in 1.txt; do
sed -i.backup 's/$foo/$bar/g;' $filename
done
done < to_replace.txt
You can't put $foo and $bar in single quotes because the shell won't expand them there. You don't need the for filename in 1.txt loop because sed already processes every line of 1.txt. And you can't use -i.backup inside the loop because the backup is overwritten on every iteration, so the original is not preserved. So your script should be:
#!/bin/bash
cp 1.txt 1.txt.backup
while IFS="~" read -r foo bar
do
echo "$foo"
echo "$bar"
sed -i "s/$foo/=$bar/g" 1.txt
done < to_replace.txt
Output:
$ cat 1.txt
continent=Asia
country=India
sed is for simple substitutions on individual lines and shell is an environment from which to call tools not a tool to manipulate text so any time you write a shell loop to manipulate text you are doing it wrong.
Just use the tool that the same guys who invented sed and shell also invented to do general text processing jobs like this, awk:
$ awk -F'[=~]' -v OFS="=" 'NR==FNR{map[$2]=$3;next} {$2=map[$2]} 1' to_replace.txt 1.txt
continent=Asia
country=India
This sed command will do it without any loop:
sed -n 's#\(^=[^~]*\)~\(.*\)#s/\1/=\2/g#p' to_replace.txt |sed -i -f- 1.txt
Or sed with extended regex:
sed -nr 's#(^=[^~]*)~(.*)#s/\1/=\2/g#p' to_replace.txt | sed -i -f- 1.txt
Explanation:
The sed command:
sed -n 's#\(^=[^~]*\)~\(.*\)#s/\1/=\2/g#p' to_replace.txt
generates an output:
s/=cont_text/=Asia/g
s/=country_text/=India/g
which is then used as a sed script for the next sed after the pipe.
$ cat 1.txt
continent=Asia
country=India
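Reading the script from standard input with -f- is not available in every sed. A hedged variant of the same idea writes the generated commands to a script file first and uses a temporary output file instead of -i. The file name map.sed is made up for this sketch; the data comes from the question.

```shell
#!/bin/sh
# Recreate the two files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'continent=cont_text' 'country=country_text' > "$dir/1.txt"
printf '%s\n' '=cont_text~Asia' '=country_text~India' > "$dir/to_replace.txt"

# Turn each "=key~value" row into a sed command "s/=key/=value/g".
sed -n 's#^\(=[^~]*\)~\(.*\)#s/\1/=\2/g#p' "$dir/to_replace.txt" > "$dir/map.sed"

# Apply all generated substitutions in a single pass over 1.txt.
sed -f "$dir/map.sed" "$dir/1.txt" > "$dir/1.txt.new" && mv "$dir/1.txt.new" "$dir/1.txt"

result=$(cat "$dir/1.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

Keys or values containing / or regex metacharacters would need escaping before being turned into sed commands.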

sed in-place command not deleting from file in bash

I have a bash script which checks for a string pattern in a file and deletes the entire line in the same file, but somehow it is not deleting the line and not throwing any error. The same command from the command prompt deletes the line from the file.
#array has patterns
for k in "${patternarr[@]}
do
sed -i '/$k/d' file.txt
done
The sed version is greater than 4.
When this loop completes, I want all lines matching the string patterns in the array to be deleted from file.txt.
When I run sed -i '/pattern/d' file.txt from the command prompt it works fine, but not inside the bash script.
Thanks in advance
Here:
sed -i '/$k/d' file.txt
The sed script is singly-quoted, which prevents shell variable expansion. It will (probably) work with
sed -i "/$k/d" file.txt
I say "probably" because what it will do depends on the contents of $k, which is just substituted into the sed code and interpreted as such. If $k contains slashes, it will break. If it comes from an untrustworthy source, you open yourself up to code injection (particularly with GNU sed, which can be made to execute shell commands).
Consider k=^/ s/^/rm -Rf \//e; #.
It is generally a bad idea to substitute shell variables into sed code (or any other code). A better way would be with GNU awk:
awk -i inplace -v pattern="$k" '!($0 ~ pattern)' file.txt
Or to just use grep -v and a temporary file.
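If $k is meant as a literal string rather than a regular expression, one way to keep it out of the program text entirely is to pass it through the environment and compare with index(). This is a swapped-in technique, not the answer's regex match: a minimal sketch with plain POSIX awk and a temporary file, where the environment variable name needle and the sample data are made up.

```shell
#!/bin/sh
# k contains characters that would break or change a sed address.
k='a/b.c'
file=$(mktemp)
printf '%s\n' 'x a/b.c y' 'x a/bzc y' > "$file"

tmp=$(mktemp)
# index() returns 0 when the literal string does not occur in the line,
# so only lines without the string are printed. The value reaches awk
# through the environment and is never parsed as code.
needle="$k" awk 'index($0, ENVIRON["needle"]) == 0' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```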
First of all, you have an unclosed double quote around ${patternarr[@]} in your for statement.
Then your problem is that you use single quotes in the sed argument, making your shell not evaluate the $k within the quotes:
% declare -a patternarr=(foo bar fu foobar)
% for k in ${patternarr[@]}; do echo sed -i '/$k/d' file.txt; done
sed -i /$k/d file.txt
sed -i /$k/d file.txt
sed -i /$k/d file.txt
sed -i /$k/d file.txt
if you replace them with double quotes, here it goes:
% for k in ${patternarr[@]}; do echo sed -i "/$k/d" file.txt; done
sed -i /foo/d file.txt
sed -i /bar/d file.txt
sed -i /fu/d file.txt
sed -i /foobar/d file.txt
Any time you write a loop in shell just to manipulate text you have the wrong approach. This is probably closer to what you really should be doing (no surrounding loop required):
awk -v ks="${patternarr[*]}" 'BEGIN{gsub(/ /,")|(",ks); ks="("ks")"} $0 !~ ks' file.txt
but there may be even better approaches still (e.g. only checking 1 field instead of the whole line, or using word boundaries, or string comparison or....) if you show us some sample input and expected output.
You need to use double quotes to interpolate shell variables inside the sed command, like:
for k in "${patternarr[@]}"; do
sed -i "/$k/d" file.txt
done

Resources