Redirect output from sed 's/c/d/' myFile to myFile - bash

I am using sed in a script to do a replace and I want to have the replaced file overwrite the file. Normally I think that you would use this:
% sed -i 's/cat/dog/' manipulate
sed: illegal option -- i
However, as you can see, my sed does not have that option.
I tried this:
% sed 's/cat/dog/' manipulate > manipulate
But this just turns manipulate into an empty file (makes sense).
This works:
% sed 's/cat/dog/' manipulate > tmp; mv tmp manipulate
But I was wondering if there was a standard way to redirect output into the same file that input was taken from.

I commonly use the 3rd way, but with an important change:
$ sed 's/cat/dog/' manipulate > tmp && mv tmp manipulate
I.e. change ; to && so the move only happens if sed is successful; otherwise you'll lose your original file as soon as you make a typo in your sed syntax.
Note! For those reading the title and missing the OP's constraint "my sed doesn't support -i": For most people, sed will support -i, so the best way to do this is:
$ sed -i 's/cat/dog/' manipulate
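The temporary-file pattern above can be wrapped in a small function so the cleanup on failure is not forgotten. This is only a sketch, assuming mktemp is available; the name edit_in_place and the scratch file are made up for illustration. The temporary file is created next to the target so that mv is a rename within one filesystem.

```shell
# edit_in_place FILE SED_SCRIPT  (hypothetical helper, not a standard tool)
edit_in_place() {
    file=$1
    script=$2
    # Temp file next to the target so that mv is a same-filesystem rename.
    tmp=$(mktemp "$(dirname "$file")/tmp.XXXXXX") || return 1
    if sed "$script" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"      # sed failed: leave the original untouched
        return 1
    fi
}

# Demo on a scratch file.
workdir=$(mktemp -d)
printf 'the cat sat\n' > "$workdir/manipulate"
edit_in_place "$workdir/manipulate" 's/cat/dog/'
cat "$workdir/manipulate"
```

Note that after the mv the file carries the temporary file's permissions (mktemp creates it readable and writable by the owner only), which may matter for scripts or shared files.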

Yes, -i is also supported in FreeBSD/MacOSX sed, but needs the empty string as an argument to edit a file in-place.
sed -i "" 's/old/new/g' file # FreeBSD sed
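One spelling that, as far as I know, both GNU sed and FreeBSD/macOS sed accept is a backup suffix attached directly to -i with no space; the backup can be removed afterwards. A small sketch (the file name and contents are made up):

```shell
workdir=$(mktemp -d)
printf 'old text, old habits\n' > "$workdir/file"

# -i.bak (suffix attached, no space) edits in place and keeps file.bak;
# the backup is deleted only if sed succeeded.
sed -i.bak 's/old/new/g' "$workdir/file" && rm -f "$workdir/file.bak"

cat "$workdir/file"
```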

If you don't want to move copies around, you could use ed:
ed file.txt <<EOF
%s/cat/dog/
wq
EOF

Kernighan and Pike in The Unix Programming Environment discuss this issue. Their solution is to write a script called overwrite, which allows one to do such things.
The usage is: overwrite file cmd file.
# overwrite: copy standard input to output after EOF
opath=$PATH
PATH=/bin:/usr/bin
case $# in
0|1) echo 'Usage: overwrite file cmd [args]' 1>&2; exit 2
esac
file=$1; shift
new=/tmp/overwr1.$$; old=/tmp/overwr2.$$
trap 'rm -f $new $old; exit 1' 1 2 15 # clean up
if PATH=$opath "$@" >$new # run the command with the caller's PATH
then
cp $file $old # save original
trap '' 1 2 15 # we are committed; ignore signals
cp $new $file
else
echo "overwrite: $1 failed, $file unchanged" 1>&2
exit 1
fi
rm -f $new $old
Once you have the above script in your $PATH, you can do:
overwrite manipulate sed 's/cat/dog/' manipulate
To make your life easier, you can use replace script from the same book:
# replace: replace str1 in files with str2 in place
PATH=/bin:/usr/bin
case $# in
0|1|2) echo 'Usage: replace str1 str2 files' 1>&2; exit 1
esac
left="$1"; right="$2"; shift; shift
for i
do
overwrite $i sed "s#$left#$right#g" $i
done
Having replace in your $PATH too will allow you to say:
replace cat dog manipulate

You can use sponge from the moreutils.
sed "s/cat/dog/" manipulate | sponge manipulate
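If moreutils is not installed, the idea behind sponge can be approximated in a few lines of shell. This is a rough stand-in, not the real tool; the name soak is made up, and mktemp is assumed to be available. It reads all of standard input into a temporary file and only then opens the target for writing.

```shell
# soak FILE: read all of stdin, then write it to FILE (hypothetical stand-in for sponge)
soak() {
    soak_tmp=$(mktemp) || return 1
    cat > "$soak_tmp"            # returns only once the upstream command reaches EOF
    cat "$soak_tmp" > "$1"       # the target is opened (and truncated) only now
    rm -f "$soak_tmp"
}

workdir=$(mktemp -d)
printf 'cat\ncatalog\n' > "$workdir/manipulate"
sed 's/cat/dog/' "$workdir/manipulate" | soak "$workdir/manipulate"
cat "$workdir/manipulate"
```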

Perhaps -i is specific to GNU sed, or your sed is just an old version. Either way, you're on the right track. The first option is probably the most common one; the third option is the one to use if you want it to work everywhere (including Solaris machines)... :) These are the 'standard' ways of doing it.

To change multiple files (use -i.bak in place of -i to save a backup of each as *.bak):
perl -p -i -e "s/oldtext/newtext/g" *
replaces every occurrence of oldtext with newtext in all files in the current folder. However, you will have to escape any Perl special characters within oldtext and newtext using a backslash.
This is called a "Perl pie" (mnemonic: easy as pie).
The -i flag tells it to do in-place replacement, and it should be OK to use single (') as well as double (") quotes, though inside double quotes the shell expands $ and backticks before Perl sees them.
Using ./* instead of just * matches the same files but protects against file names that begin with a dash; neither pattern descends into sub-directories (find can be used for that).
See man perlrun for more details, including how to take a backup file of the original.
using sed:
sed -i 's/old/new/g' ./* (used in GNU)
sed -i '' 's/old/new/g' ./* (used in FreeBSD)

-i option is not available in standard sed.
Your alternatives are your third way or perl.

A lot of answers, but none of them is correct. Here is the correct and simplest one:
$ echo "111 222 333" > file.txt
$ sed -i -s s/222/444/ file.txt
$ cat file.txt
111 444 333
$

Workaround using open file handles:
exec 3<manipulate
Prevent open file from being truncated:
rm manipulate
sed 's/cat/dog/' <&3 > manipulate

Related

need to clean file via SED or GREP

I have these files
NotRequired.txt (having lines which need to be removed)
Need2CleanSED.txt (big file , need to clean)
Need2CleanGRP.txt (big file , need to clean)
content:
more NotRequired.txt
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
[wat-7600-1_414]
[indo-pak_isu-5_761]
I am reading above file and want to remove lines from Need2Clean???.txt, trying via SED and GREP but no success.
myFile="NotRequired.txt"
while IFS= read -r HKline
do
sed -i '/$HKline/d' Need2CleanSED.txt
done < "$myFile"
myFile="NotRequired.txt"
while IFS= read -r HKline
do
grep -vE \"$HKline\" Need2CleanGRP.txt > Need2CleanGRP.txt
done < "$myFile"
It looks as if the variable and the [ ] characters are causing the problem.
What you're doing is extremely inefficient and error prone. Just do this:
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt
Thanks to grep -F the above treats each line of NotRequired.txt as a string rather than a regexp so you don't have to worry about escaping RE metachars like [ and you don't need to wrap it in a shell loop - that one command will remove all undesirable lines in one execution of grep.
By the way, never do command file > file: the shell sets up the > file redirection, truncating file, before command even starts, so command reads an empty file. Always do command file > tmp && mv tmp file instead.
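To see the fixed-string matching at work, here is a self-contained run on made-up sample data in a scratch directory (the sample lines are invented; only the file names follow the question):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[wat-7600-1_414]' > NotRequired.txt
printf '%s\n' 'keep this line' '[abc-xyz_pqr-pe2_123] remove me' \
              '[wat-7600-1_414]' 'a-b-c stays' > Need2CleanGRP.txt

# -F: patterns are fixed strings; -f: read them from a file; -v: keep non-matching lines.
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt

cat Need2CleanGRP.txt
```

One thing to keep in mind: if every line is filtered out, grep exits with status 1, so the mv is skipped and the original file stays as it was.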
Your assumption is correct. The [...] construct looks for any characters in that set, so you have to preface ("escape") them with \. The easiest way is to do that in your original file:
sed -i -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}"
If you don't like that, you can probably put the sed command in where you're directing the file in:
done < replace.txt|sed -e 's:\[:\\[:' -e 's:\]:\\]:'
Finally, you can use sed on each HKline variable:
HKline=$( echo $HKline | sed -e 's:\[:\\[:' -e 's:\]:\\]:' )
try gnu sed:
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt| sed -Ef - Need2CleanSED.txt
Two sed processes are chained by a shell pipe.
The first sed slurps NotRequired.txt all at once (-z), replaces each \n with |, escapes the [ and ] metacharacters as \[ and \], and wraps the result in /.../d. The second sed reads that as its script (-f -) and applies it to the input file, i.e. Need2CleanSED.txt. Output of the first process:
/\[abc-xyz_pqr-pe2_123\]|\[lon-abc-tkt_1202\]|\[wat-7600-1_414\]|\[indo-pak_isu-5_761\]/d
Add the -u (unbuffered) option if you want output flushed as it is produced rather than in batches.

Remove middle of filenames

I have a list of filenames like this in bash
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
And I want them to look like this
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz
I do not have the perl rename command and sed 's/_Other*160418./_/' *.gz
is not doing anything. I've tried other rename scripts on here but either nothing occurs or my shell starts printing huge amounts of code to the console and freezes.
This post (Removing Middle of Filename) is similar however the answers given do not explain what specific parts of the command are doing so I could not apply it to my problem.
Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:
for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done
Remove the echo before mv to perform actual renaming.
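If the middle part varies between files (other barcodes or dates), a variant that relies only on the first underscore and the first dot may be easier to reuse. This is a sketch, assuming every name has the form sample_anything.rest with no underscore or dot inside the sample part; the function name short_name is made up.

```shell
# short_name NAME: print NAME with everything between the first "_" and the first "." removed
short_name() {
    sample=${1%%_*}    # drop the longest suffix starting at "_"  -> UTSHoS10
    rest=${1#*.}       # drop the shortest prefix ending at "."   -> R1.fq.gz
    printf '%s_%s\n' "$sample" "$rest"
}

# Dry run on two of the names from the question; nothing is renamed.
for f in UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz \
         UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
do
    echo mv "$f" "$(short_name "$f")"
done
```

Both expansions are plain POSIX parameter expansions, so this should also work in shells other than bash.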
You can do something like this in the directory which contains the files to be renamed:
for file_name in *.gz
do
new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name");
mv "$file_name" "$new_file_name";
done
The pattern (_[^.]*\.) starts matching from the FIRST _ till the FIRST . (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.
Example:
AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done
AMD$ ls
UTSHoS10_R1.fq.gz UTSHoS10_R2.fq.gz UTSHoS11_R2.fq.gz UTSHoS12_R1.fq.gz UTSHoS12_R2.fq.gz
Pure Bash, using substring operation and assuming that all file names have the same length:
for file in UTS*.gz; do
echo mv -i "$file" "${file:0:9}${file:38:8}"
done
Outputs:
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz
Once verified, remove echo from the line inside the loop and run again.
Going with your sed command, this can work as a bash one-liner:
for name in UTSH*fq.gz; do newname=$(echo $name | sed 's/_Other.*160418\./_/'); echo mv $name $newname; done
Notes:
I've adjusted your sed command: it had an * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the dot needs escaping.
To see if it works, without actually renaming the files, I've left the echo command in. Easy to remove just that to make it functional.
It doesn't have to be a one-liner, obviously. But sometimes, that makes editing and browsing your command-line history easier.

Can envsubst not do in-place substitution?

I have a config file which contains some ENV_VARIABLE styled variables.
This is my file.
It might contain $EXAMPLES of text.
Now I want that variable replaced with a value which is saved in my actual environment variables. So I'm trying this:
export EXAMPLES=lots
envsubst < file.txt > file.txt
But it doesn't work when the input file and output file are identical. The result is an empty file of size 0.
There must be a good reason for this, some bash basics that I'm not aware of?
How do I achieve what I want to do, ideally without first outputting to a different file and then replacing the original file with it?
I know that I can do it easily enough with sed, but when I discovered the envsubst command I thought that it should be perfect for my use case, so I'd like to use that.
Here is the solution that I use:
originalfile="file.txt"
tmpfile=$(mktemp)
cp --attributes-only --preserve $originalfile $tmpfile
cat $originalfile | envsubst > $tmpfile && mv $tmpfile $originalfile
Be careful with other solutions that do not use a temporary file. Pipes are asynchronous, so the file will occasionally be read after it has already been truncated.
Redirects are handled by the shell, not the program being executed, and they are set up before the program is invoked.
The redirect >output.file has the effect of creating output.file if it doesn't exist and emptying it if it does. Either way, you end up with an empty file, and that is what the program's output is redirected to.
Programs like sed which are capable of "in-place" modification must take the filename as a command-line argument, not as a redirect.
In your case, I would suggest using a temporary file and then renaming it if all goes OK.
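The effect is easy to reproduce with any filter. The sketch below uses tr on a scratch file rather than envsubst, purely so that it runs without extra packages; the file names are made up.

```shell
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/file.txt"

# The shell opens file.txt for writing (truncating it) before tr starts,
# so tr reads an already-empty file.
tr 'a-z' 'A-Z' < "$workdir/file.txt" > "$workdir/file.txt"
size_after=$(wc -c < "$workdir/file.txt" | tr -d ' ')
echo "size after same-file redirect: $size_after"

# Going through a temporary file keeps the data.
printf 'hello\n' > "$workdir/file.txt"
tr 'a-z' 'A-Z' < "$workdir/file.txt" > "$workdir/file.tmp" &&
mv "$workdir/file.tmp" "$workdir/file.txt"
cat "$workdir/file.txt"
```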
envsubst < file.txt | tee file.txt
Another shortcut is to write to a temporary file and then rename it over the original:
envsubst < in.txt > out.txt && mv out.txt in.txt
To avoid creating a temporary file, use sponge not tee:
envsubst < file.txt | sponge file.txt
From https://linux.die.net/man/1/sponge:
sponge reads standard input and writes it out to the specified file. Unlike a shell redirect, sponge soaks up all its input before opening the output file. This allows constructing pipelines that read from and write to the same file.
You can achieve in-place substitution by calling envsubst from gnu sed with the "e" command:
EXAMPLES=lots sed -i 's/.*/echo & | envsubst/e' file.txt
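It's worth noting that the mv solution won't maintain file permissions: the file ends up with the temporary file's mode. Copying the contents back with plain cp (without -p, which would stamp the temporary file's mode and timestamps onto the original) would be preferable in the case that you're modifying an executable file.
tmpfile=$(mktemp)
cat file.txt | envsubst > "$tmpfile" && cp -f "$tmpfile" file.txt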
rm -f "$tmpfile"
This answer was framed from two other answers. I guess this is the best solution.
originalFile=file.txt
tmpfile=$(mktemp)
# plain cp keeps the destination's existing mode; -p would copy the temp file's mode instead
cat "$originalFile" | envsubst > "$tmpfile" && cp -f "$tmpfile" "$originalFile"
rm -f "$tmpfile"
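Whether permissions survive can be checked directly. The sketch below uses a hypothetical helper, rewrite_keep_mode, which writes the new contents into the existing file through a redirect instead of renaming a temporary file over it, so the inode and mode stay those of the original. sed stands in for envsubst here only so the example runs without gettext installed; envsubst could be passed in the same way.

```shell
# rewrite_keep_mode FILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD as a filter over FILE and writes the result back into FILE.
rewrite_keep_mode() {
    target=$1; shift
    tmp=$(mktemp) || return 1
    if "$@" < "$target" > "$tmp"; then
        cat "$tmp" > "$target"     # redirect into the existing file: same inode, same mode
        status=$?
    else
        status=1                   # filter failed: original left as it was
    fi
    rm -f "$tmp"
    return "$status"
}

workdir=$(mktemp -d)
printf '#!/bin/sh\necho cat\n' > "$workdir/script.sh"
chmod 755 "$workdir/script.sh"
rewrite_keep_mode "$workdir/script.sh" sed 's/cat/dog/'
ls -l "$workdir/script.sh"
```

The trade-off is that the write-back is not atomic: a reader could briefly see a partially written file, which a rename with mv avoids.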
Updated 20221011 - Using 1 sed command
sed -i -r 's/["`]|\$\(/\\&/g; s/.*/echo "&"/ e' ./input.txt
Updated 20221007 - Using 2 sed commands
sed -i -r 's/["`]|\$\(/\\&/g' input.txt
sed -i -r 's/.*/echo "&"/ e' input.txt
Do it without envsubst
envsubst_file () {
local original_file=$1
local temp_file=$(mktemp)
trap "rm -f ${temp_file}" 0 2 3 15
cp -p ${original_file} ${temp_file}
cat ${original_file} | sed -r 's/["`]|\$\(/\\&/g' | sed -r 's/.*/echo "&"/g' | sh > ${temp_file}
mv ${temp_file} ${original_file}
}
envsubst_file 'input.txt'
First, sed escapes double quotes ("), backticks (`) and command substitutions $( by prefixing them with a backslash (\); then a second sed wraps each line in
echo "&"
Finally, the resulting shell script is executed and its output is redirected to ${temp_file}.
If you use bash, check this:
a=`<file.txt` && envsubst <<<"$a" >file.txt
Tested on 500mb file, works as expected.
In the end I found that using envsubst was too dangerous after all. My files might contain dollar signs in places where I don't want any substitution to happen, and envsubst will just replace them with empty strings if no corresponding environment variable is defined. Not cool.

sed delete not working with cat variable

I have a file named test-domain, the contents of which contain the line 100.am.
When I do this, the line with 100.am is deleted from the test-domain file, as expected:
for x in $(echo 100.am); do sed -i "/$x/d" test-domain; done
However, if instead of echo 100.am, I read each line from a file named unwanted-lines, it does NOT work.
for x in $(cat unwanted-lines); do sed -i "/$x/d" test-domain; done
This is even if the only contents of unwanted-lines is one line, with the exact contents 100.am.
Does anyone know why sed delete line works if you use echo in your variable, but not if you use cat?
fgrep -v -f unwanted-lines test-domain > /tmp/Buffer
mv /tmp/Buffer test-domain
sed is a poor fit here because it is invoked once per line from the shell loop (inefficient and resource-hungry). You could still use sed by preloading the lines to delete and searching against them, but that is much heavier than fgrep in this case.
Does anyone know why sed delete line works if you use echo in your
variable, but not if you use cat?
I believe that your file containing unwanted lines contains CR+LF line endings due to which it doesn't work when you use the file. You could strip the CR in your loop:
for x in $(cat unwanted-lines); do x="${x//$'\r'}"; sed -i "/$x/d" test-domain; done
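The CR explanation can be checked on a scratch copy. The sketch below builds a pattern file with a CRLF line ending, shows that it matches nothing as-is, then strips the carriage returns with tr before filtering. The file contents are made up, and grep -F is used in place of the sed loop to keep it short.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '100.am\r\n' > unwanted-lines        # CRLF line ending, as saved by some Windows editors
printf 'keep.example\n100.am\n' > test-domain

# With the CR still attached, the pattern is "100.am<CR>" and matches nothing.
grep -vF -f unwanted-lines test-domain > with-cr.out

# Strip the CRs first, then filter; replace the file only if grep succeeded.
tr -d '\r' < unwanted-lines > unwanted-clean
grep -vF -f unwanted-clean test-domain > tmp && mv tmp test-domain

cat test-domain
```

Running od -c unwanted-lines is a quick way to confirm whether a \r is really present in your own file.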
One better strategy than yours would be to use a genuine editor, e.g., ed, as so:
ed -s test-domain < <(
shopt -s extglob
while IFS= read -r l; do
[[ $l = *([[:space:]]) ]] && continue
l=${l//./\\.}
echo "g/$l/d"
done < unwanted-lines
echo "wq"
)
Caveat. You must make sure that the file unwanted-lines doesn't contain any character that could clash with ed's regexps and commands. I have already included a match for a period (i.e., replace . with \.).
This method is quite efficient, as you're not forking so many times on sed, writing temp files, renaming them, etc.
Another possibility would be to use grep, but then you won't have the editing option ed offers.
Remark. ed is the standard editor.
Why not just apply the sed command to your file directly?
sed -i '/.*100\.am/d' your_file

replace words using grep and sed in shell

I have some 150 files and in them I want to remove this following code:
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="/height.js"></SCRIPT>
What I'm doing is:
sed -e 's/<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="/height.js"></SCRIPT>/ /' file_names
This doesn't seem to work.
I want to remove this script from all the files in one go. How can I do that?
You have to worry about the slashes in the text you are replacing.
Either: use '\/' for each slash,
Or: cheat and use '.' to match any character at the point where the slash should appear.
The alternative exploits the improbability of a file containing the HTML. Theoretically, if you don't like the second alternative, you should also use '\.' at each point where '.' appears in the string you're looking at.
sed -e 's/<SCRIPT LANGUAGE="JavaScript" TYPE="text.javascript" SRC=".height.js"><.SCRIPT>/ /' file_names
This is copied from your example and slashes are replaced by dots. However, supplying all the file names on the command line like that will simply write the output as the concatenation of all the edited files to standard output.
Classically, to edit files more or less in situ, you'd write:
tmp=${TMPDIR:-/tmp}/xxx.$$
trap 'rm -f $tmp; exit 1' 0 1 2 3 13 15
for file in ...list...
do
sed -e '...' $file > $tmp
mv $tmp $file
done
rm -f $tmp
trap 0
This includes reasonably bullet-proof clean-up of the temporary - it is not perfect. This variant backs up the original before replacing it with the edited version:
tmp=${TMPDIR:-/tmp}/xxx.$$
trap 'rm -f $tmp; exit 1' 0 1 2 3 13 15
for file in ...list...
do
sed -e '...' $file > $tmp
mv $file $file.bak
mv $tmp $file
done
rm -f $tmp
trap 0
With GNU sed, you can use the '-i' or '--in-place' option to overwrite the files; you can use '--in-place=.bak' to create backup copies of each file in file.bak.
You need to escape the special characters with an extra backslash.
Note that the output will also all go to the console. If you want 150 separate output files, you might want to look at the xargs command, something like:
ls -1 | xargs -t -I {} sed -i -e "replace comment" {}
Be aware that the sed '-i' option will edit the files in place so get your replacement right first and back the files up!
You should pipe the output
e.g.
sed -e 's/<.*SRC="\/height.js".*>//g' < foo.html > foo1.html
You don't have to use the / with the s command:
sed 's|old|new|' files...
If you want to do inline-replace, then the first step is to back up all of your files then issue this command, substitute 'old' and 'new' with strings of your choice:
sed -i .bak 's|old|new|' files...
Because inline replacement is very dangerous, I can't emphasize enough that you must back up all of your files before running the command.
Did I mention that you should back up all of your files?
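Putting the alternative delimiter together with a temporary file, removing the tag from the question could look roughly like this. It is a sketch on made-up scratch files, assuming the tag appears exactly as quoted in the question and on a single line.

```shell
workdir=$(mktemp -d)
tag='<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="/height.js"></SCRIPT>'
printf '<html>\n%s\n<body>hi</body>\n' "$tag" > "$workdir/a.html"
printf '<p>%s</p>\n' "$tag" > "$workdir/b.html"

for file in "$workdir"/*.html; do
    tmp=$(mktemp) || exit 1
    # "|" as the delimiter means the slashes in the tag need no escaping;
    # the dot is escaped so it matches only a literal ".".
    if sed 's|<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="/height\.js"></SCRIPT>||' "$file" > "$tmp"; then
        cat "$tmp" > "$file"     # write back only if sed succeeded
    fi
    rm -f "$tmp"
done

cat "$workdir/b.html"
```

Where the tag stood alone on a line, an empty line is left behind; a /pattern/d command would remove the whole line instead, if that is what you want.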
