Search for text and append it to the end of each line of a text file - OSX - macOS

I'm new to OSX command line tools.
I am trying to find a block of text in a file and append this text to the end of all lines in another text file. At run time I don't know what this text will be; I just know it will be located between "BEGINHMM" and "ENDHMM". Also, I don't know the makeup of the destination file, except that it will not be an empty text file.
The command which finds the block of text of interest is:
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto
where "proto" is a text file containing the text of interest.
I've been trying to pipe the output of the above command to another 'sed' command, in the following manner:
xargs -I '{}' sed -i .bak 's/$/{}/' monophones0.txt
but I am getting some bizarre results; for example, I see the literal "{}" inserted in the text.
I've also tried piping to:
xargs -0 sed -i .bak 's/$/&/' monophones0.txt
but I just get the printout (similar to terminal echo) of the text I am trying to grab.
Ultimately I want to loop over several 'proto' files in multiple directories, copy the text between the "BEGINHMM" and "ENDHMM" markers in each directory's proto file, and append the selected text to the lines of that directory's monophones0.txt.
I am running the commands in the terminal (bash) on OSX 10.12.2.
Any help would be appreciated.

(1) Your sed command is of the form sed -n '/A/,/B/p'; this will include the lines on which A and B occur, even if these strings do not appear at the beginning of the line. This form may have other surprises in store for you as well (what do you expect will happen if B is missing or repeated?), but the remainder of this post assumes that's what you want.
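For instance, here is a toy demonstration of the missing-end-marker surprise (made-up two-line input on stdin; when the end pattern never matches, the range runs to the end of the input):
printf 'a\n<BEGINHMM>\nb\n' | sed -n '/<BEGINHMM>/,/<ENDHMM>/p'
# prints <BEGINHMM> and b, even though <ENDHMM> never appears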
(2) It's not clear how you intend to specify the "proto" files, but you do indicate they might be in several directories, so for the remainder of this post, I'll assume they are listed, one per line, in a file named proto.txt in each directory. This will ensure that you don't run into any limitations on command-line length, but the following can easily be modified if you don't want to create such a file.
(3) Here is a script which will use the sed command you've mentioned to copy segments from each of the "proto" files specified in a directory to monophones0.txt in the directory in which the script is executed.
#!/bin/bash
# Append the <BEGINHMM>..<ENDHMM> segment of each listed proto file to $OUT.
OUT=monophones0.txt
while read -r file
do
  if [ -r "$file" ] ; then
    sed -n '/<BEGINHMM>/,/<ENDHMM>/p' "$file" >> "$OUT"
  elif [ -n "$file" ] ; then
    echo "NOT FOUND: $file" >&2
  fi
done < proto.txt
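For example, proto.txt could be produced beforehand with find, and the script (saved here under the hypothetical name collect_hmm.sh) run from the directory holding monophones0.txt:
find . -name proto > proto.txt   # one proto file path per line
bash collect_hmm.sh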

Just like what you did before:
tmpfile=$(mktemp)
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto >$tmpfile
sed -i .bak "r $tmpfile" monophones0.txt
rm $tmpfile
This is the basic idea; there are other checks you need to perform to make this a robust script.
– 4ae1e1

Related

How to use bash for loop with sed to search and replace the string in a file repeatedly?

I am not very familiar with bash programming, and I tried to write a bash script to reduce my work.
What I want is to open the in.lammps file in every folder in this directory and replace some strings in that file, but my script shows an error.
This is what I tried:
#!/bin/bash
for file in temp-*;
do
  cd $file;
  sed -i -f mycommands "$file/in.lammps";
done
here is the content of the file 'mycommands':
s/0.080/0.039/g
s/2.45/2.93/g
s/2.625/3.382/g
s/Pt/AU/g
Do you have any idea what is wrong? I have 20 folders in this directory, and I need to replace some strings in each in.lammps file.
Diving into directories on each iteration with cd would just add unnecessary complexity to your code. Simply write
for file in */in.lammps; do
  sed -i -f mycommands "$file"
done
This will detect all in.lammps files in the sub-directories and apply sed to each.
Just so that you understand what's going on, you can put an echo "$file" before the sed line to output all the affected files.
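For example, a dry-run version of the same loop (the sed line commented out and the suggested echo left in) could look like this:
for file in */in.lammps; do
  echo "$file"                    # print each file that would be edited
  # sed -i -f mycommands "$file"  # re-enable once the list looks right
done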

Sed & Mac OS Terminal: How to remove parentheses content from the first line of every file?

I am on Mac OS 10.14.6 and have a directory that contains subdirectories that all contain text files. Altogether, there are many hundreds of text files.
I would like to go through the text files and check for any content in the first line that is in parentheses. If such content is found, then the parentheses (and content in the parentheses) should be removed.
Example:
Before removal:
The new world (82 edition)
After removal:
The new world
How would I do this?
Steps I have tried:
Googling around, it seems sed would be best for this.
I have found this thread, which provides sed code for removing bracketed content:
sed -e 's/([^()]*)//g'
However, I am not sure how to adapt it to work on multiple files and also to limit it to the first line of those files. I found this thread, which explains how to use sed on multiple files, but I am not sure how to adapt the example to work with parentheses content.
Please note: as long as the solution works in the Mac OS terminal, it does not need to use sed. However, from Googling, sed seems the most suitable.
I managed to achieve what you're after simply by using a bash script and sed together, like so:
#!/bin/bash
for filename in "$PWD"/*.txt; do
  sed -i '' '1 s/([^()]*)//g' "$filename"
done
The script simply iterates over all the .txt files in $PWD (the current working directory, so that you can add this script to your bin and run it anywhere), and then runs the command
sed -i '' '1 s/([^()]*)//g' "$filename"
on the file. By starting the command with the number 1 we tell sed to only work on the first line of the file :)
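A quick way to convince yourself of the first-line restriction is to run the same sed program on stdin instead of a file:
printf 'one (a)\ntwo (b)\n' | sed '1 s/([^()]*)//g'
# output:
# one       <- "(a)" is gone; the space before it remains
# two (b)   <- untouched, since it is not the first line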
Edit: Best Answer
The above works fine in a directory where all the contained objects are files; in other words, it does not search recursively through directories.
Therefore, after some research, this command should perform exactly what the question asks:
find . -name "*.txt" -exec sed -i '' '1 s/([^()]*)//g' {} \;
I must reiterate: test this on a backup first to make sure it works. Alternatively, use the same command as above but change the '' in order to control the creation of backups. For example,
find . -name "*.txt" -exec sed -i '.bkp' '1 s/([^()]*)//g' {} \;
This command will perform the sed replacement in the original file (keeping the filename) but will create a backup file for each, with .bkp appended; for example, test1.txt becomes test1.txt.bkp. This is a safer option, but choose what works best for you :)
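Once you are happy with the results, the .bkp backups created by that command can be cleaned up in one go, for example:
find . -name "*.txt.bkp" -delete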
Good try!
The command you were looking for, as a single line:
sed -E '1s|\([^\)]+\)||'
The command to do the replacement on the first line of each input file:
sed -Ei '1s|\([^\)]+\)||' *.txt
Example:
echo "The new world (82 edition)" |sed -E '1s|\([^\)]+\)||'
The new world
Explanation
sed -Ei - the E option selects the extended RegExp syntax; the i option selects in-place file replacement
sed -Ei '1s|match RegExp||' - for the first line only, replace the first string matching the RegExp with the empty string
\([^\)]+\) - RegExp matching: starts with (, [^\)] matches any char that is not ), + means one or more times, terminated with )
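One caveat for the Mac OS mentioned in the question: the BSD sed that ships with it requires an explicit (possibly empty) backup suffix as a separate argument after -i, so the in-place command there would be:
sed -E -i '' '1s|\([^\)]+\)||' *.txt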
Try:
# create a temporary file
tmp=$(mktemp)
# for each something in _the current directory_
for i in *; do
# if it is not a file, don't parse it
if [ ! -f "$i" ]; then continue; fi
# remove parenthesis on first line, save the output in temporary file
sed '1s/([^)]*)//g' "$i" > "$tmp"
# move temporary file to the original file
mv "$tmp" "$i"
done
# remove the temporary file if it is still around (mv already consumed it on the last iteration)
rm -f "$tmp"

find specific text in a directory and delete the lines from the files

I want to find specific text in a directory, and then delete the lines that contain that text from the files.
Now I have two questions:
How can I achieve the task?
What is wrong with what I have tried? I tried the methods below but failed; the details follow:
grep -rnw "./" -e "webdesign"
This searches the current directory with pattern "webdesign", and I get the result:
.//pages/index.html:1:{% load webdesign %}
.//pages/pricing.html:1:{% load webdesign %}
.//prototypes.py:16: 'django.contrib.webdesign',
Then I use sed to remove the lines from those files, which doesn't work; I only get a blank file (I mean it deletes all my file content):
sed -i "/webdesign/d" ./pages/index.html
or
sed "/webdesign/d" ./pages/index.html > ./pages/index.html
My software environment is: OS X Yosemite, Mac Terminal, Bash
A loop in bash will do the trick, provided that there are no filenames with spaces (in which case other solutions are possible, but this is the simplest).
for i in `grep -lrnw "yourdirectory/" -e "webdesign"`
do
  sed "/webdesign/d" $i > $i.tmp
  # safety to avoid destroying the file if a problem arises (disk full?)
  if [ $? = 0 ] ; then
    mv -f $i.tmp $i
  fi
done
Note that you should not put this script in the directory you are processing, because the script itself contains webdesign and would be modified as well :)
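If filenames with spaces do turn up, a variant of the same loop (a sketch; it still assumes no newlines in names) can read grep's output line by line instead of word-splitting it:
grep -lrnw "yourdirectory/" -e "webdesign" | while IFS= read -r i
do
  # only replace the file if sed succeeded
  sed "/webdesign/d" "$i" > "$i.tmp" && mv -f "$i.tmp" "$i"
done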
Thanks to choroba, I know that the -i option doesn't work as wished. It has another meaning, or it would be rejected by the option parser: it takes a backup suffix as its argument. That makes the problem difficult to see at first.
Without -i you cannot work on a file in place, and redirecting the output to the input just destroys the input file (!). That's why your solution did not work.
You can install GNU sed that supports the -i option, then
sed -i '/webdesign/d' files
should work. Note that it's safer to use -i~ to create a backup.
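For example, with GNU sed installed:
sed -i~ '/webdesign/d' ./pages/index.html
# the edited file replaces index.html; the original is kept as index.html~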
You cannot write to the same file you're reading from; that's why
sed /webdesign/d file > file
doesn't work (the shell truncates the file before sed can read anything from it). Create a temporary file:
sed /webdesign/d file > file.tmp
mv file.tmp file
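Chaining the two steps with && keeps the same safety as the loop in the other answer, since the mv only runs if sed succeeded:
sed /webdesign/d file > file.tmp && mv file.tmp file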

How do I write a bash script to copy files into a new folder based on name?

I have a folder filled with ~300 files, named in the form username#mail.com.pdf. I need about 40 of them, and I have a list of the usernames (saved in a file called names.txt, one username per line). I would like to copy the files I need into a new folder that contains only those.
For example, where the first line of names.txt is the username only (e.g., eternalmothra), the PDF file I want to copy over is named eternalmothra#mail.com.pdf.
while read p; do
  ls | grep $p > file_names.txt
done < names.txt
This seems like it should read from the list and, for each line, find the file named username#mail.com.pdf. Unfortunately, it seems like only the last one is saved to file_names.txt.
The second part of this is to copy all the files over:
while read p; do
  mv $p foldername
done < file_names.txt
(I haven't tried that second part yet because the first part isn't working).
I'm doing all this with Cygwin, by the way.
1) What is wrong with the first script that it won't copy everything over?
2) If I get that to work, will the second script correctly copy them over? (Actually, I think it's preferable if they just get copied, not moved over).
Edit:
I would like to add that I figured out how to read lines from a txt file from here: Looping through content of a file in bash
Solution from comment: Your problem is just that echo a > b overwrites the file, while echo a >> b appends to it, so replace
ls | grep $p > file_names.txt
with
ls | grep $p >> file_names.txt
There might be more efficient solutions if the task runs every day, but for a one-shot of 300 files your script is good.
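Put together, the corrected first part would look like this (the truncation line is an addition, so that rerunning the loop does not accumulate results from previous runs):
: > file_names.txt        # start from an empty list
while read p; do
  ls | grep "$p" >> file_names.txt
done < names.txt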
Assuming you don't have file names with newlines in them (in which case your original approach would not have a chance of working anyway), try this.
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername
The printf is necessary to work around the various issues with ls; passing the list of all the file names to grep in one go produces a list of all the matches, one per line; and passing that to xargs cp performs the copying. (To move instead of copy, use mv instead of cp, obviously; both support the -t option so as to make it convenient to run them under xargs.) The function of xargs is to convert standard input into arguments to the program you run as the argument to xargs.
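For example (cp -t is a GNU coreutils option, available under the Cygwin mentioned in the question; the target folder must already exist):
mkdir -p foldername
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername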

How do I use sed in a shell script to replace a pattern in a list of files?

I have some 100 files which have my name, RAHUL, in them. I want it replaced with another term, RAHUL2. I have a file which contains the list of files, and I want to feed it to sed to make the changes.
files:
C:/desktop/file1.txt
C:/desktop/rahul/file1.txt
C:/desktop/rahul/file3.txt
C:/desktop/rahul/file4.txt
C:/desktop/rahul/file6.txt
C:/desktop/rahul/file8.txt
C:/desktop/rahul/file9.txt
In each file, I want to replace all occurrences of the term RAHUL with RAHUL2.
I assume you are using some Cygwin environment on Windows, since you have sed. Then you can use find to list all the files and execute sed on them:
find C:/desktop -type f -name 'file*.txt' -exec sed -i 's/RAHUL/RAHUL2/g' {} \;
Please make a backup of the original files if you are using sed -i but aren't sure the command is working already. This is because sed -i will overwrite the original file. You have been warned.
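For instance, GNU sed (which Cygwin ships) accepts a backup suffix glued onto -i, so a variant of the same command that keeps a .bak copy of every edited file would be:
find C:/desktop -type f -name 'file*.txt' -exec sed -i.bak 's/RAHUL/RAHUL2/g' {} \;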
hek2mgl's answer is good if you want to search for all files matching the file*.txt pattern in your C:/desktop directory.
Now as you already have a file containing the list of files to edit, here is another way to proceed:
while read FILE ; do sed -i 's/RAHUL/RAHUL2/g' "$FILE" ; done < files_to_edit.txt
The read command will read your input one line at a time. The input is the files_to_edit.txt file, as indicated by the < input redirection operator.
The remark about the -i option remains valid: it will edit your files in place, so make a backup, or at least run the command first on a couple of files (possibly without the -i option, just to check what the output is).
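For example, a backup-keeping variant of the same loop (GNU sed's glued -i.bak suffix assumed):
while read FILE ; do
  sed -i.bak 's/RAHUL/RAHUL2/g' "$FILE"
done < files_to_edit.txt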
