find specific text in a directory and delete the lines from the files - bash

I want to find specific text in a directory, and then delete the lines from the files that include the specific text.
Now I have two questions:
How can I achieve the task?
What is wrong with What I have tried? I have tried the methods below, but failed. the details are following:
grep -rnw "./" -e "webdesign"
This searches the current directory with pattern "webdesign", and I get the result:
.//pages/index.html:1:{% load webdesign %}
.//pages/pricing.html:1:{% load webdesign %}
.//prototypes.py:16: 'django.contrib.webdesign',
Then I use sed to remove the lines from those files, which doesn't work, only get blank file ( I mean it deletes all my file content):
sed -i "/webdesign/d" ./pages/index.html
or
sed "/webdesign/d" ./pages/index.html > ./pages/index.html
My software environment is: OS X Yosemite, Mac Terminal, Bash

A loop in bash will do the trick provided that there are no filenames with spaces (in which case other solutions are possible, but this is the simplest)
for i in `grep -lrnw "yourdirectory/" -e "webdesign"`
do
sed "/webdesign/d" $i > $i.tmp
# safety to avoid destroying the file if problem arises (disk full?)
if [ $? = 0 ] ; then
mv -f $i.tmp $i
fi
done
note that you should not locate this script in the current directory because it contains webdesign and it will be modified as well :)
Thanks to choroba, I know that -i option doesn't work like wished. But it has another meaning or it would be rejected by the opt parser. It has something to do with suffixes, well, it doesn't matter now, but it's difficult to see the problem at first.
Without -i you cannot work on a file in-place. And redirecting output to the input just destroys the input file (!). That's why your solution did not work.

You can install GNU sed that supports the -i option, then
sed -i '/webdesign/d' files
should work. Note that it's safer to use -i~ to create a backup.
You cannot write to the same file you're reading from, that's why
sed /webdesign/d file > file
doesn't work (it overwrites the file before you can read anything from it). Create a temporary file
sed /webdesign/d file > file.tmp
mv file.tmp file

Related

How can I redirect output of a `sed` and `tr` pipe and overwrite the input file? [duplicate]

I would like to run a find and replace on an HTML file through the command line.
My command looks something like this:
sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html > index.html
When I run this and look at the file afterward, it is empty. It deleted the contents of my file.
When I run this after restoring the file again:
sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html
The stdout is the contents of the file, and the find and replace has been executed.
Why is this happening?
When the shell sees > index.html in the command line it opens the file index.html for writing, wiping off all its previous contents.
To fix this you need to pass the -i option to sed to make the changes inline and create a backup of the original file before it does the changes in-place:
sed -i.bak s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html
Without the .bak the command will fail on some platforms, such as Mac OSX.
An alternative, useful, pattern is:
sed -e 'script script' index.html > index.html.tmp && mv index.html.tmp index.html
That has much the same effect, without using the -i option, and additionally means that, if the sed script fails for some reason, the input file isn't clobbered. Further, if the edit is successful, there's no backup file left lying around. This sort of idiom can be useful in Makefiles.
Quite a lot of seds have the -i option, but not all of them; the posix sed is one which doesn't. If you're aiming for portability, therefore, it's best avoided.
sed -i 's/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g' index.html
This does a global in-place substitution on the file index.html. Quoting the string prevents problems with whitespace in the query and replacement.
use sed's -i option, e.g.
sed -i bak -e s/STRING_TO_REPLACE/REPLACE_WITH/g index.html
To change multiple files (and saving a backup of each as *.bak):
perl -p -i -e "s/\|/x/g" *
will take all files in directory and replace | with x
this is called a “Perl pie” (easy as a pie)
You should try using the option -i for in-place editing.
Warning: this is a dangerous method! It abuses the i/o buffers in linux and with specific options of buffering it manages to work on small files. It is an interesting curiosity. But don't use it for a real situation!
Besides the -i option of sed
you can use the tee utility.
From man:
tee - read from standard input and write to standard output and files
So, the solution would be:
sed s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html | tee | tee index.html
-- here the tee is repeated to make sure that the pipeline is buffered. Then all commands in the pipeline are blocked until they get some input to work on. Each command in the pipeline starts when the upstream commands have written 1 buffer of bytes (the size is defined somewhere) to the input of the command. So the last command tee index.html, which opens the file for writing and therefore empties it, runs after the upstream pipeline has finished and the output is in the buffer within the pipeline.
Most likely the following won't work:
sed s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html | tee index.html
-- it will run both commands of the pipeline at the same time without any blocking. (Without blocking the pipeline should pass the bytes line by line instead of buffer by buffer. Same as when you run cat | sed s/bar/GGG/. Without blocking it's more interactive and usually pipelines of just 2 commands run without buffering and blocking. Longer pipelines are buffered.) The tee index.html will open the file for writing and it will be emptied. However, if you turn the buffering always on, the second version will work too.
sed -i.bak "s#https.*\.com#$pub_url#g" MyHTMLFile.html
If you have a link to be added, try this. Search for the URL as above (starting with https and ending with.com here) and replace it with a URL string. I have used a variable $pub_url here. s here means search and g means global replacement.
It works !
The problem with the command
sed 'code' file > file
is that file is truncated by the shell before sed actually gets to process it. As a result, you get an empty file.
The sed way to do this is to use -i to edit in place, as other answers suggested. However, this is not always what you want. -i will create a temporary file that will then be used to replace the original file. This is problematic if your original file was a link (the link will be replaced by a regular file). If you need to preserve links, you can use a temporary variable to store the output of sed before writing it back to the file, like this:
tmp=$(sed 'code' file); echo -n "$tmp" > file
Better yet, use printf instead of echo since echo is likely to process \\ as \ in some shells (e.g. dash):
tmp=$(sed 'code' file); printf "%s" "$tmp" > file
And the ed answer:
printf "%s\n" '1,$s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g' w q | ed index.html
To reiterate what codaddict answered, the shell handles the redirection first, wiping out the "input.html" file, and then the shell invokes the "sed" command passing it a now empty file.
I was searching for the option where I can define the line range and found the answer. For example I want to change host1 to host2 from line 36-57.
sed '36,57 s/host1/host2/g' myfile.txt > myfile1.txt
You can use gi option as well to ignore the character case.
sed '30,40 s/version/story/gi' myfile.txt > myfile1.txt
With all due respect to the above correct answers, it's always a good idea to "dry run" scripts like that, so that you don't corrupt your file and have to start again from scratch.
Just get your script to spill the output to the command line instead of writing it to the file, for example, like that:
sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html
OR
less index.html | sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g
This way you can see and check the output of the command without getting your file truncated.

Sed & Mac OS Terminal: How to remove parentheses content from the first line of every file?

I am on Mac Os 10.14.6 and have a directory that contains subdirectories that all contain text files. Altogether, there are many hundreds of text files.
I would like to go through the text files and check for any content in the first line that is in parentheses. If such content is found, then the parentheses (and content in the parentheses) should be removed.
Example:
Before removal:
The new world (82 edition)
After removal:
The new world
How would I do this?
Steps I have tried:
Google around, it seems SED would be best for this.
I have found this thread, which provides SED code for removing bracketed content.
sed -e 's/([^()]*)//g'
However, I am not sure how to adapt it to work on multiple files and also to limit it to the first line of those files. I found this thread which explains how to use SED on multiple files, but I am not sure how to adapt the example to work with parentheses content.
Please note: As long as the solution works on Mac OS terminal, then it does not need to use SED. However, from Googling, SED seems to be the most suited.
I managed to achieve what you're after simply by using a bash script and sed together, as so:
#!/bin/bash
for filename in $PWD/*.txt; do
sed -i '' '1 s/([^()]*)//g' $filename
done
The script simply iterates over all the .txt files in $PWD (the current working directory, so that you can add this script to your bin and run it anywhere), and then runs the command
sed -ie '1 s/([^()]*)//g' $filename
on the file. By starting the command with the number 1 we tell sed to only work on the first line of the file :)
Edit: Best Answer
The above works fine in a directory where all contained objects are files, and not including directories; in other words, the above does not perform recursive search through directories.
Therefore, after some research, this command should perform exactly what the question asks:
find . -name "*.txt" -exec sed -i '' '1 s/([^()]*)//g' {} \;
I must iterate, and reiterate, that you test this on a backup first to test it works. Otherwise, use the same command as above but change the '' in order to control the creation of backups. For example,
find . -name "*.txt" -exec sed -i '.bkp' '1 s/([^()]*)//g' {} \;
This command will perform the sed replace in the original file (keeping the filename) but will create a backup file for each with the appended .bkp, for example test1.txt becomes test1.txt.bkp. This a safer option, but choose what works best for you :)
Good try,
The command you where looking for single line:
sed -E '1s|\([^\)]+\)||'
The command to replace each input file first line:
sed -Ei '1s|\([^\)]+\)||' *.txt
example:
echo "The new world (82 edition)" |sed -E '1s|\([^\)]+\)||'
The new world
Explanation
sed -Ei E option: the extended RegExp syntax, i option: for in-place file replacement
sed -Ei '1s|match RegExp||' for first line only, replace first matched RegExp string with empty string
\([^\)]+\) RegExp matching: start with (, [^\)]any char not ), + - more than once, terminate with )
Try:
# create a temporary file
tmp=$(mktemp)
# for each something in _the current directory_
for i in *; do
# if it is not a file, don't parse it
if [ ! -f "$i" ]; then continue; fi
# remove parenthesis on first line, save the output in temporary file
sed '1s/([^)]*)//g' "$i" > "$tmp"
# move temporary file to the original file
mv "$tmp" "$i"
done
# remove temporary file
rm "$tmp"

Search text and append to each end of line of text file - OSX

I'm new to OSX command line tools.
I am trying to find a block of text in a file and append this text at the end of all lines in another text file. At run time I don't know what this text will be, I just know it will be located within "BEGINHMM" and "ENDHMM". Also, I don't know the makeup of the destination file, except for that it will not be an empty text file.
The command which finds the block of text of interest is:
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto
where "proto" is a text file containing the text of interest.
I've been trying to pipe the output of the above command to another 'sed' command, in the following manner:
xargs -I '{}' sed -i .bak 's/$/{}/' monophones0.txt
but I am getting some bizarre results, I see the "{}" inserted in the text for example.
I've also tried piping to:
xargs -0 sed -i .bak 's/$/&/' monophones0.txt
but I just get the printout (similar to terminal echo) of the text I am trying to grab.
Ultimately I want to loop over several 'proto' files in multiple directories and copy the text between the "BEGINHMM", "ENDHMM" block in each directory, and append the selected text to that directory's monophones.txt lines.
I am running the commands in the terminal, bash, OSX 10.12.2
Any help would be appreciated.
(1) Your sed command is of the form sed -n '/A/,/B/p'; this will include the lines on which A and B occur, even if these strings do not appear at the beginning of the line. This form may have other surprises in store for you as well (what do expect will happen if B is missing or repeated?), but the remainder of this post assumes that's what you want.
(2) It's not clear how you intend to specify the "proto" files, but you do indicate they might be in several directories, so for the remainder of this post, I'll assume they are listed, one per line, in a file named proto.txt in each directory. This will ensure that you don't run into any limitations on command-line length, but the following can easily be modified if you don't want to create such a file.
(3) Here is a script which will use the sed command you've mentioned to copy segments from each of the "proto" files specified in a directory to monophones0.txt in the directory in which the script is executed.
#!/bin/bash
OUT=monophones0.txt
cat proto.txt | while read file
do
if [ -r "$file" ] ; then
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' "$file" >> $OUT
elif [ -n "$file" ] ; then
echo "NOT FOUND: $file" >&2
fi
done
Just like what you did before. tmpfile=$(mktemp); sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto >$tmpfile; sed -i .bak "r $tmpfile" monophones0.txt; rm $tmpfile. This is the basic idea; there are other checks you need to perform to make this a robust script.
– 4ae1e1

How do I write a bash script to copy files into a new folder based on name?

I have a folder filled with ~300 files. They are named in this form username#mail.com.pdf. I need about 40 of them, and I have a list of usernames (saved in a file called names.txt). Each username is one line in the file. I need about 40 of these files, and would like to copy over the files I need into a new folder that has only the ones I need.
Where the file names.txt has as its first line the username only:
(eg, eternalmothra), the PDF file I want to copy over is named eternalmothra#mail.com.pdf.
while read p; do
ls | grep $p > file_names.txt
done <names.txt
This seems like it should read from the list, and for each line turns username into username#mail.com.pdf. Unfortunately, it seems like only the last one is saved to file_names.txt.
The second part of this is to copy all the files over:
while read p; do
mv $p foldername
done <file_names.txt
(I haven't tried that second part yet because the first part isn't working).
I'm doing all this with Cygwin, by the way.
1) What is wrong with the first script that it won't copy everything over?
2) If I get that to work, will the second script correctly copy them over? (Actually, I think it's preferable if they just get copied, not moved over).
Edit:
I would like to add that I figured out how to read lines from a txt file from here: Looping through content of a file in bash
Solution from comment: Your problem is just, that echo a > b is overwriting file, while echo a >> b is appending to file, so replace
ls | grep $p > file_names.txt
with
ls | grep $p >> file_names.txt
There might be more efficient solutions if the task runs everyday, but for a one-shot of 300 files your script is good.
Assuming you don't have file names with newlines in them (in which case your original approach would not have a chance of working anyway), try this.
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername
The printf is necessary to work around the various issues with ls; passing the list of all the file names to grep in one go produces a list of all the matches, one per line; and passing that to xargs cp performs the copying. (To move instead of copy, use mv instead of cp, obviously; both support the -t option so as to make it convenient to run them under xargs.) The function of xargs is to convert standard input into arguments to the program you run as the argument to xargs.

How will i use sed shell script for replacing a pattern in a list of file

I have some 100 files which have my name in it, RAHUL. I want it replaced with another term RAHUL2. I have a file which contains the list of files and i want to fetch it in a sed to do the changes.
files :
C:/desktop/file1.txt
C:/desktop/rahul/file1.txt
C:/desktop/rahul/file3.txt
C:/desktop/rahul/file4.txt
C:/desktop/rahul/file6.txt
C:/desktop/rahul/file8.txt
C:/desktop/rahul/file9.txt
and in each file data, i want to replace all occurance of term RAHUL with RAHUL2
I assume you are using some Cygwin environment on Windows, since you have sed. Then you can use find to list all the files and execute sed on them:
find C:/desktop -type f -name 'file*.txt' -exec sed -i 's/RAHUL/RAHUL2/g' {} \;
Please make a backup of the original files if you are using sed -i but aren't sure if the command is working already. This because sed -i will overwrite the original file. You have been warned.
hek2mgl's answer is good if you want to search for all files matching file*.txt pattern on your C:/desktop directory.
Now as you already have a file containing the list of files to edit, here is another way to proceed:
while read FILE ; do sed -i 's/RAHUL/RAHUL2/g' $FILE ; done < files_to_edit.txt
The read command will read your input one line after another. Input is files_to_edit.txt file, as indicated by the < input redirection operator.
Remark about -i option remains valid: it will edit your files in place, so make backup, or at least run it first on a couple of files (potentially without -i option, just to check what output is).

Resources