Bash sed text replace with weird filenames - bash

I have a list of files in a directory where the files have spaces and ()
File1 (in parenthesis).txt
File 2 (in parenthesis).txt
File name 3.txt
And on one line in each text file is the name of the file between <short_description>
<short_description>File1 (in parenthesis)</short_description>
I need to modify it to look like this
<short_description>TEST-File1 (in parenthesis)</short_description>
But I can't seem to get it... I can print the filenames out BUT when I try and do the sed command to just replace the whole line with what I want...
for FILE in "$(find . -type f -iname '*.txt')"
do
sed -i "s/^<short_description> .*$/<short_description>TEST-$FILE<\/short_description>/" "$FILE"
done
... this one gives me an error
"sed: -e expression #1, char 54: unknown option to `s''"
which I'm assuming means I haven't escaped something but honestly I have no idea what.
Can someone help?
Thank you!

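If you write for FILE in "$(find . -type f -iname '*.txt')", the double quotes merge all the filenames
printed by find into a single long string that contains whitespace and newlines,
so the loop body runs only once.
I can print the filenames out
Even if you debug with echo "$FILE", the filenames may look as if they
are processed properly, but they are not. You can see this with something like
echo "***${FILE}***".
The sed error has a second cause: each path printed by find . starts with ./, and that / ends the replacement part of s/.../.../, so sed reads the first character of the file name (character 54 of the expression) as an option to s.
Then would you please try: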
for file in *.txt
do
sed -i "s#^\(<short_description>\)\(.*\)\(</short_description>\)#\1TEST-\2\3#" "$file"
done
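To make the difference between the two loops visible, here is a minimal sketch that counts iterations in a scratch directory. The directory and the sample file names are made up for illustration; only the two loop forms come from the discussion above.

```shell
# Scratch directory with two sample files (names are illustrative).
demo_dir=$(mktemp -d)
touch "$demo_dir/File1 (in parenthesis).txt" "$demo_dir/File name 3.txt"

# Quoted command substitution: the whole find output is ONE word.
quoted_count=0
for FILE in "$(find "$demo_dir" -type f -iname '*.txt')"; do
    quoted_count=$((quoted_count + 1))
done

# Glob: one word per file, with spaces and parentheses kept intact.
glob_count=0
for file in "$demo_dir"/*.txt; do
    glob_count=$((glob_count + 1))
done

echo "quoted find loop ran $quoted_count time(s)"
echo "glob loop ran $glob_count time(s)"
rm -rf "$demo_dir"
```

The first loop should report a single iteration even though two files exist, which is why the glob form is the safer choice here.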

Related

Sed & Mac OS Terminal: How to remove parentheses content from the first line of every file?

I am on macOS 10.14.6 and have a directory that contains subdirectories that all contain text files. Altogether, there are many hundreds of text files.
I would like to go through the text files and check for any content in the first line that is in parentheses. If such content is found, then the parentheses (and content in the parentheses) should be removed.
Example:
Before removal:
The new world (82 edition)
After removal:
The new world
How would I do this?
Steps I have tried:
Googling around, it seems sed would be best for this.
I have found this thread, which provides SED code for removing bracketed content.
sed -e 's/([^()]*)//g'
However, I am not sure how to adapt it to work on multiple files and also to limit it to the first line of those files. I found this thread which explains how to use SED on multiple files, but I am not sure how to adapt the example to work with parentheses content.
Please note: As long as the solution works on Mac OS terminal, then it does not need to use SED. However, from Googling, SED seems to be the most suited.
I managed to achieve what you're after by using a bash script and sed together, like so:
#!/bin/bash
for filename in "$PWD"/*.txt; do
sed -i '' '1 s/([^()]*)//g' "$filename"
done
The script iterates over all the .txt files in $PWD (the current working directory, so that you can add this script to your bin and run it anywhere), and then runs the command
sed -i '' '1 s/([^()]*)//g' "$filename"
on each file. Starting the command with the number 1 tells sed to work only on the first line of the file :)
Edit: Best Answer
The above works fine in a directory that contains only files; it does not search recursively through subdirectories.
Therefore, after some research, this command should perform exactly what the question asks:
find . -name "*.txt" -exec sed -i '' '1 s/([^()]*)//g' {} \;
I must stress that you should try this on a backup first to check that it works. Alternatively, use the same command as above but change the '' in order to control the creation of backups. For example,
find . -name "*.txt" -exec sed -i '.bkp' '1 s/([^()]*)//g' {} \;
This command will perform the sed replace in the original file (keeping the filename) but will also create a backup of each file with .bkp appended, for example test1.txt is backed up as test1.txt.bkp. This is a safer option, but choose what works best for you :)
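Before editing anything in place, it can help to preview which first lines would change. The following is a sketch under assumptions: the scratch tree and file contents are invented for the demonstration, and the -n option with the p flag is swapped in for -i so that no file is modified.

```shell
# Scratch tree standing in for the real directory (names are illustrative).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
printf 'The new world (82 edition)\nbody (kept)\n' > "$demo_dir/a.txt"
printf 'No parentheses here\n' > "$demo_dir/sub/b.txt"

# -n suppresses normal output; the p flag prints line 1 only when the
# substitution changed it. Nothing is written back to the files.
preview=$(find "$demo_dir" -name '*.txt' -exec sed -n '1 s/([^()]*)//gp' {} \;)

printf '%s\n' "$preview"
rm -rf "$demo_dir"
```

Only first lines that contain parentheses are printed, already in their edited form; parentheses on later lines are ignored because of the 1 address.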
Good try,
The command you were looking for, for a single line:
sed -E '1s|\([^\)]+\)||'
The command to replace the first line of each input file (on macOS, -i needs an explicit backup suffix, which may be empty):
sed -E -i '' '1s|\([^\)]+\)||' *.txt
example:
echo "The new world (82 edition)" |sed -E '1s|\([^\)]+\)||'
The new world
Explanation
sed -E -i '' E option: the extended RegExp syntax, i option: in-place file replacement, with '' meaning no backup file
sed -E -i '' '1s|match RegExp||' for first line only, replace first matched RegExp string with empty string
\([^\)]+\) RegExp matching: start with (, [^\)] any char not ), + one or more times, terminate with )
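One detail worth checking: the patterns above remove the parentheses but leave the space that preceded them. The sketch below compares the pattern from the answers with a variant that also consumes that space; the " *" prefix is an addition for illustration, not part of the original answers.

```shell
line='The new world (82 edition)'

# Pattern from the answers: removes the parentheses, keeps the space before them.
plain=$(printf '%s\n' "$line" | sed '1 s/([^()]*)//g')

# Variant: " *" also consumes any spaces in front of the opening parenthesis.
trimmed=$(printf '%s\n' "$line" | sed '1 s/ *([^()]*)//g')

# Brackets make a trailing space visible.
printf '[%s]\n[%s]\n' "$plain" "$trimmed"
```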
Try:
# create a temporary file
tmp=$(mktemp)
# for each something in _the current directory_
for i in *; do
# if it is not a file, don't parse it
if [ ! -f "$i" ]; then continue; fi
# remove parenthesis on first line, save the output in temporary file
sed '1s/([^)]*)//g' "$i" > "$tmp"
# copy the temporary file's contents over the original file
# (keeps the original's permissions, and $tmp still exists for the final rm)
cat "$tmp" > "$i"
done
# remove temporary file
rm "$tmp"

bash: trouble with find and sed in a directory

Pulling my hair out - somebody save me from an early Q-ball.
I have a folder with loads of powerpoint files and I want to change a substring in each title. All of them are of the form "lecture 2 2014.pptx" and I want to change "2014" to "2016".
Inside the directory I try commands like:
find . -name "*2014*" | xargs -0 sed -i 's/2014/2016/g'
to no avail. Any advice?
Edit: my goal is to change the file name, "Lecture 2 2014.pptx" to "Lecture 2 2016.pptx"
rename s/2014/2016/ *2014.pptx
If the list is too long for the shell to expand, try:
find . -name \*2014.pptx -exec rename s/2014/2016/ {} \;
rename was already mentioned. Be aware that there are two versions floating around: one with the syntax
rename [options] expression replacement file...
and one with the syntax
rename s/old/new/ file...
As an alternative: a simple Bash loop with a regex extracting the "2014" from each file name, replacing it with "2016"
re='(.*)2014(.*)'
for fname in *2014*; do
[[ $fname =~ $re ]]
mv "$fname" "${BASH_REMATCH[1]}2016${BASH_REMATCH[2]}"
done
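If neither version of rename is available, the same rename can be sketched with plain POSIX parameter expansion instead of a regex. The scratch directory and file names below are invented for the demonstration, and this variant replaces the first "2014" in each name rather than the last.

```shell
# Scratch directory with sample files (names are illustrative).
demo_dir=$(mktemp -d)
touch "$demo_dir/lecture 1 2014.pptx" "$demo_dir/lecture 2 2014.pptx"

for path in "$demo_dir"/*2014*; do
    [ -e "$path" ] || continue      # skip if the glob matched nothing
    fname=${path##*/}               # file name without the directory
    prefix=${fname%%2014*}          # text before the first "2014"
    suffix=${fname#*2014}           # text after the first "2014"
    mv -- "$path" "$demo_dir/${prefix}2016${suffix}"
done

renamed=$(ls "$demo_dir")
printf '%s\n' "$renamed"
rm -rf "$demo_dir"
```

Quoting every expansion keeps the spaces in names such as "lecture 2 2014.pptx" from splitting the mv arguments.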

Transfer a path with space in bash

I'm trying to run a program on every file in a directory.
But there are spaces in the file names. For example, a file can be named «/my/good/path/MY - AWFUL, FILE.DOC»
And when I try to send the path to my other tool (a Python script), I get an error saying «MY» is not an existing file. :(
Here is my current bash code:
#!/usr/bin/bash
for file in $(find "/my/pash" -name "*.DOC")
do
newvar=`program "$file"`
done
So… where is my problem?
Thanks everyone :)
Some correct answers, but no explanations so far:
A for loop is intended to iterate over words, not lines. The given (unquoted) string is subject to word splitting (which is what is troubling you) and filename expansion, and then you iterate over the resulting words. You could set IFS to contain only a newline. The safest way is to use find -print0 and xargs -0, as demonstrated by Vytenis's answer:
find -name "*.DOC" -print0 | xargs -r -0 -n1 program
#!/usr/bin/bash
find "/my/pash" -name "*.DOC" | while IFS= read -r file; do
newvar="$(program "$file")"
done
Note that this only fixes the case where a space or tab is in the file name. If you have a newline in the file name, it gets a little more complicated.
That is because the for loop takes every word in the output of find as an element to iterate over. for will see it as:
for file in /my/good/path/MY - AWFUL, FILE.DOC
do
echo "$file"
done
And will print:
/my/good/path/MY
-
AWFUL,
FILE.DOC
One solution to this problem is to use the xargs program to pass the result of the find as your python program argument:
find "/my/pash" -name "*.DOC" -print0 | xargs -0 -I {} program {}
The for loop treats blanks as delimiters, so try this one, which reads whole lines instead:
find "/my/pash" -name "*.DOC" | while IFS= read -r file; do
newvar=$(program "$file")
done
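Another way to avoid word splitting entirely is to let find pass the paths as arguments to a small inline script. This is a sketch, not one of the answers above: the scratch directory and file names are invented, and printf stands in for the real program.

```shell
# Scratch directory with sample files (names are illustrative).
demo_dir=$(mktemp -d)
touch "$demo_dir/MY - AWFUL, FILE.DOC" "$demo_dir/plain.DOC"

# find hands each path to the inline script as a separate argument,
# so spaces, commas, and even newlines in names are never split.
# printf is a stand-in for the real program from the question.
result=$(find "$demo_dir" -name '*.DOC' -exec sh -c '
    for file do
        printf "got: %s\n" "${file##*/}"
    done
' sh {} + | sort)

printf '%s\n' "$result"
rm -rf "$demo_dir"
```

Because the names never pass through a text stream that the shell has to re-split, no IFS or read handling is needed.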

How to overwrite the contents in the sed, without having backup file

I have a command like this:
sed -i -e '/console.log/ s/^\/*/\/\//' *.js
which comments out all console.log statements. But there are two things:
It keeps a backup file like test.js-e, which I don't want.
Say I want to run the same process recursively through a folder; how do I do it?
You don't need the -e option in this particular case. Dropping it solves your first problem, because your sed is treating -e as the backup suffix for the -i option. If your sed requires a suffix after -i (BSD sed, as on macOS), pass an empty one with -i '' instead.
For the second part, you can try something like this:
find . -type f -name "*.js" -exec sed -i '/console.log/ s/^\/*/\/\//' {} +
Use find to recursively find all .js files and do the replacement.
When checking sed's help, -i takes a suffix and uses it as a backup,
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
and the backup file name seems to be the original name plus -e, the argument that follows -i. That suggests your sed expects the suffix as a separate argument (as BSD/macOS sed does), so try giving it an explicit empty suffix and see if that works
sed -i '' '/console.log/ s/^\/*/\/\//' *.js
As for the recursion, you could use find with -exec or xargs. Please adjust the find command and test it before running -exec
find . -name '*.js' -type f -exec sed -i '' '/console.log/ s/^\/*/\/\//' {} \;
From your original post I presume you just want to make a C-style comment leading like:
/*
to a double-slash style like:
//
right?
Then you can do it with this command
find . -name "*.js" -type f -exec sed -i '/console.log/ s#^/\*#//#g' '{}' \;
Be aware that:
In sed the delimiter is normally /, but if you find it annoying to escape when your pattern or replacement contains a /, you can change the delimiter to # or | as you like. I find it a very useful trick.
If what you want is what I presumed, be sure to escape the * character, because the regex /* matches / occurring zero or more times, which matches every line. That is very dangerous!
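To see what the questioner's original expression does, here is a small sketch that runs it on sample text from standard input instead of editing files. The three sample lines are invented; the sed expression is the one from the question, written with # as the delimiter and with the dot in console.log escaped.

```shell
sample='console.log("a");
//console.log("b");
var x = 1;'

# On lines mentioning console.log, replace the leading run of slashes
# (possibly empty) with exactly two slashes.
commented=$(printf '%s\n' "$sample" | sed '/console\.log/ s#^/*#//#')

printf '%s\n' "$commented"
```

Because ^/* also matches an existing // prefix, running the command a second time does not stack extra slashes, and lines without console.log are untouched thanks to the address in front of s.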

Bash sed in loop

I'm trying to use the follow to substitute the tab with comma in several file:
#!/bin/sh
for i in *output_*.txt
do
sed 's/ /;/g' $i > $i
done
But it is not working: the output file still has the tab delimiters. It only works when I use it on a single file without the for loop.
Any help?
Thanks.
Several things are wrong: unquoted variables, and output redirection into the same file (the shell truncates $i before sed gets to read it). The loop is also not needed.
Try:
sed -i 's/\t/;/g' *output_*.txt
The correct script you need is as follows:
find . -name '*output_*.txt' | while IFS= read -r FILENAME; do
sed -e 's/\t/;/g' < "$FILENAME" > "${FILENAME%.txt}.tmp" && mv "${FILENAME%.txt}.tmp" "$FILENAME"
done
This script has several important features:
It finds all files called *output_*.txt in the current directory and all subdirectories. If you do not want to recurse into subdirectories, then use:
find . -maxdepth 1 -name '*output_*.txt' | while IFS= read -r FILENAME; do
as the first line.
It does not overwrite your original input file if sed encounters an error. sed generates its output to a temporary file (<filename>.tmp) and it only replaces the original file if it is successful.
As pointed out by other posters, the tab character is represented by \t in sed scripts (this is a GNU sed feature).
An example transformation performed by this script is as follows (the sequence <tab> represents a tab character):
Input:
<tab><tab><tab><tab><tab>line 1<tab><tab>
<tab><tab><tab>line 2<tab><tab>
<tab><tab>line 3<tab><tab>
<tab><tab><tab>line 4<tab><tab>
<tab><tab><tab><tab><tab>line<tab><tab> 5
Output:
;;;;;line 1;;
;;;line 2;;
;;line 3;;
;;;line 4;;
;;;;;line;; 5
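Where \t is not understood by the installed sed, a literal tab can be passed in through a shell variable. This is a sketch under that assumption; the variable name tab and the sample line are illustrative.

```shell
# Store a literal tab character in a variable.
tab=$(printf '\t')

# Double quotes let the shell insert the tab into the sed script.
converted=$(printf '\t\tline 3\t\t\n' | sed "s/${tab}/;/g")

printf '%s\n' "$converted"
```

The sample reproduces the "line 3" row of the transformation shown above.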
