Bash sed in loop - bash

I'm trying to use the following to replace the tab with a semicolon in several files:
#!/bin/sh
for i in *output_*.txt
do
sed 's/ /;/g' $i > $i
done
But it is not working: the output files still have the tab delimiter. It only works when I use it on a single file without the for loop.
Any help?
Thanks.

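Several things are wrong: the variables are unquoted, and the output is redirected into the same file that sed is reading, which truncates it before sed can read it. The loop is also not needed.
Try (with GNU sed, which supports -i without a suffix and understands \t):
sed -i 's/\t/;/g' *output_*.txt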

The correct script you need is as follows:
find . -name '*output_*.txt' | while IFS= read -r FILENAME; do
# write to a temporary file; replace the original only if sed succeeds
sed -e 's/\t/;/g' <"${FILENAME}" >"${FILENAME%.txt}.tmp" && mv "${FILENAME%.txt}.tmp" "${FILENAME}"
done
This script has several important features:
It finds all files called *output_*.txt in the current directory and all subdirectories. If you do not want to recurse into subdirectories, then use:
find . -maxdepth 1 -name '*output_*.txt' | while IFS= read -r FILENAME; do
as the first line.
It does not overwrite your original input file if sed encounters an error. sed generates its output to a temporary file (<filename>.tmp) and it only replaces the original file if it is successful.
As pointed out by other posters, the tab character is represented by \t in GNU sed scripts; other sed implementations may need a literal tab character instead.
An example transformation performed by this script is as follows (the sequence <tab> represents a tab character):
Input:
<tab><tab><tab><tab><tab>line 1<tab><tab>
<tab><tab><tab>line 2<tab><tab>
<tab><tab>line 3<tab><tab>
<tab><tab><tab>line 4<tab><tab>
<tab><tab><tab><tab><tab>line<tab><tab> 5
Output:
;;;;;line 1;;
;;;line 2;;
;;line 3;;
;;;line 4;;
;;;;;line;; 5
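The temporary-file approach described above can also be sketched with a plain glob loop and a literal tab, so it does not depend on GNU-specific \t or -i. This is a minimal illustration only: the scratch directory and the demo_output_*.txt files are made up for the demonstration.

```shell
#!/bin/bash
# Demo setup (hypothetical files in a scratch directory).
workdir=$(mktemp -d)
printf 'a\tb\tc\n' > "$workdir/demo_output_1.txt"
printf '\t\tline 1\t\n' > "$workdir/demo_output_2.txt"

tab=$(printf '\t')   # literal tab, so the sed script does not rely on \t

for f in "$workdir"/*output_*.txt; do
    [ -f "$f" ] || continue          # skip the unexpanded pattern if nothing matched
    tmp="$f.tmp"
    # Write to a temporary file; replace the original only if sed succeeded.
    sed "s/$tab/;/g" "$f" > "$tmp" && mv "$tmp" "$f"
done

result1=$(cat "$workdir/demo_output_1.txt")
result2=$(cat "$workdir/demo_output_2.txt")
printf '%s\n%s\n' "$result1" "$result2"
rm -rf "$workdir"
```

Because the glob is expanded before the loop starts, the .tmp files created along the way are never picked up as inputs.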

Related

Bash sed text replace with weird filenames

I have a list of files in a directory where the files have spaces and ()
File1 (in parenthesis).txt
File 2 (in parenthesis).txt
File name 3.txt
And on one line in each text file is the name of the file between <short_description>
<short_description>File1 (in parenthesis)</short_description>
I need to modify it to look like this
<short_description>TEST-File1 (in parenthesis)</short_description>
But I can't seem to get it... I can print the filenames out BUT when I try and do the sed command to just replace the whole line with what I want...
for FILE in "$(find . -type f -iname '*.txt')"
do
sed -i "s/^<short_description> .*$/<short_description>TEST-$FILE<\/short_description>/" "$FILE"
done
... this one gives me an error
"sed: -e expression #1, char 54: unknown option to `s''"
which I'm assuming means I haven't escaped something but honestly I have no idea what.
Can someone help?
Thank you!
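If you say for FILE in "$(find . -type f -iname '*.txt')", all the filenames
output by find are enclosed in one pair of double quotes and merged into a single long
string that contains whitespace and newlines, so the loop body runs only once.
The "unknown option to `s'" error arises because that string contains / characters (find prints paths such as ./File1 (in parenthesis).txt), which clash with the / delimiter of the s command.
"I can print the filenames out"
Even if you try to debug with echo "$FILE", it may look as if the filenames
are processed properly, but they are not. You can see this with something like
echo "***${FILE}***".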
Then would you please try:
for file in *.txt
do
sed -i "s#^\(<short_description>\)\(.*\)\(</short_description>\)#\1TEST-\2\3#" "$file"
done
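To make the difference visible, here is a small sketch that counts loop iterations for the quoted command substitution and for a NUL-delimited find loop, and applies the same # delimited substitution in the latter. The scratch directory and file contents are assumptions for the demo, and the edit goes through a temporary file rather than sed -i so that it does not depend on the sed flavour.

```shell
#!/bin/bash
# Demo setup (hypothetical files in a scratch directory).
workdir=$(mktemp -d)
printf '<short_description>File1 (in parenthesis)</short_description>\n' > "$workdir/File1 (in parenthesis).txt"
printf '<short_description>File name 3</short_description>\n' > "$workdir/File name 3.txt"

# Quoted command substitution: one single word, so exactly one iteration.
quoted_iterations=0
for f in "$(find "$workdir" -type f -name '*.txt')"; do
    quoted_iterations=$((quoted_iterations + 1))
done

# NUL-delimited names: one iteration per file, whatever the name contains.
tmp=$(mktemp)   # temporary file kept outside the directory being searched
safe_iterations=0
while IFS= read -r -d '' f; do
    safe_iterations=$((safe_iterations + 1))
    sed 's#^\(<short_description>\)\(.*\)\(</short_description>\)#\1TEST-\2\3#' "$f" > "$tmp" &&
        cat "$tmp" > "$f"
done < <(find "$workdir" -type f -name '*.txt' -print0)

first=$(cat "$workdir/File1 (in parenthesis).txt")
echo "quoted: $quoted_iterations, safe: $safe_iterations"
echo "$first"
rm -rf "$workdir" "$tmp"
```

The process substitution and read -d '' are bash features, so this form is not suitable for plain sh.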

Use bash to replace substring in filename with a new substring based on pairwise (old,new) values in a .csv file

I have a directory containing hundreds of image files, each named differently, such as:
abdogf.png
abcatf.png
abhorsef.png
I have created a changenames.csv file containing two columns with oldstring in the first column and newstring in the second such as:
"dog","woof"
"cat","miaow"
"horse","neigh"
These strings are currently in quotation marks, as shown.
I would like to invoke a bash command or .sh script from the command line to replace each oldstring substring with each newstring substring in the directory's filenames (not within file contents), such that the directory newly contains files called:
abwooff.png
abmiaowf.png
abneighf.png
instead of the original files.
I have tried various solutions such as https://superuser.com/questions/508731/find-and-replace-string-in-filenames/508758#508758 and How to find and replace part of filenames from list without success.
For example, I have tried invoking the following within the directory with the files:
#!/bin/bash
inputfile=${1}
while read line
do
IFS=',' read -a names <<< "${line}"
for file in `find . -maxdepth 1 -type f -name "*${names[0]}*"`; do
rename "s/${names[0]}/${names[1]}/" *
done
done < ${inputfile}
using the command line command test.sh changenames.csv.
This produces no error but makes no changes to the filenames.
I have also tried this solution https://stackoverflow.com/a/55866613/10456769 which generated an error in which #echo was not a recognised command.
Thank you in advance for any help.
You need to strip the double quotes off first. The code tries to find
files such as *"cat"*, which do not exist.
Moreover you do not need to execute the find command. You are not
using the variable file at all.
Would you please try the following:
while IFS=',' read -r old new; do
old=${old//\"/} # remove all double quotes
new=${new//\"/} # same as above
rename "s/$old/$new/" *
done < "$1"
The IFS=',' read -a names <<< "${line}" does not remove " from the input. Your filenames do not have " in them, so you have to remove them too.
Backticks ` are discouraged. Don't use them. Use $(...) instead.
"for file in `...`" is as bad as for file in $(cat): it's a common bash antipattern. Don't use it; you will have problems with elements containing spaces or tabs. Use while IFS= read -r line to read input line by line.
There is a problem with rename: there are two common versions, the util-linux rename and the Perl rename. Your script seems to target the Perl version, so make sure that is the one installed.
Let rename do the renaming; there is no need for for file in find here.
Doing while read line and then IFS=, read <<<"$line" duplicates the work; just do while IFS=, read -a names; do from the beginning.
So you could do:
# split the input on ','
while IFS=',' read -r pre post; do
# remove quotes
pre=${pre//\"/}
post=${post//\"/}
# do the rename
rename "s/${pre}/${post}/" *
done < "${inputfile}"
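If it is unclear which rename is installed, the same idea can be sketched with only mv and bash parameter expansion. This is an illustration under assumptions: the scratch directory is created just for the demo, and only the first occurrence of the old string in each name is replaced.

```shell
#!/bin/bash
# Demo setup mirroring the question (hypothetical scratch directory).
workdir=$(mktemp -d)
touch "$workdir/abdogf.png" "$workdir/abcatf.png" "$workdir/abhorsef.png"
printf '%s\n' '"dog","woof"' '"cat","miaow"' '"horse","neigh"' > "$workdir/changenames.csv"

while IFS=',' read -r old new; do
    old=${old//\"/}                  # strip all double quotes
    new=${new//\"/}
    [ -n "$old" ] || continue        # ignore empty lines
    for f in "$workdir"/*"$old"*; do
        [ -f "$f" ] || continue      # pattern matched nothing
        name=${f##*/}                # work on the basename only
        mv -- "$f" "$workdir/${name/"$old"/$new}"
    done
done < "$workdir/changenames.csv"

renamed=$(cd "$workdir" && echo *.png)
echo "$renamed"
rm -rf "$workdir"
```

Quoting "$old" inside the expansion makes bash treat it as a literal string rather than a glob pattern.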
I think I would do the following script that uses sed:
# find all files in a directory
find . -maxdepth 1 -type f |
# convert filenames into pairs (filename,new_filename) separated by newline
sed "
# hold the line in hold space
h
# replace the characters as in the other file
$(
# generate from "abc","def" -> s"abc"def"g
sed 's#"\([^"]*\)","\([^"]*\)"#s"\1"\2"g#' changenames.csv
)
# switch pattern and hold space
x
# append the line
G
# remove the line if substitute is the same
/^\(.*\)\n\1$/d
" |
# outputs two lines per each filename:
# one line with old filename and one line with new filename
# so just pass that to mv
xargs -l2 echo mv -v
and a one liner:
find . -maxdepth 1 -type f | sed "h;$(sed 's#"\([^"]*\)","\([^"]*\)"#s"\1"\2"g#' changenames.csv);x;G; /^\(.*\)\n\1$/d" | xargs -l2 echo mv -v
With the following recreation of files structure:
touch abdogf.png abcatf.png abhorsef.png
cat <<EOF >changenames.csv
"dog","woof"
"cat","miaow"
"horse","neigh"
EOF
The script outputs on repl:
mv -v ./abdogf.png ./abwooff.png
mv -v ./abcatf.png ./abmiaowf.png
mv -v ./abhorsef.png ./abneighf.png
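The inner sed in that pipeline is easier to follow in isolation. The sketch below, using a shortened two-line CSV as sample data, shows the sed commands it generates and what they do to one filename.

```shell
#!/bin/bash
# Sample CSV content (a shortened copy of changenames.csv).
csv=$(printf '%s\n' '"dog","woof"' '"cat","miaow"')

# Turn each "old","new" pair into a sed command that uses " as its delimiter.
generated=$(printf '%s\n' "$csv" | sed 's#"\([^"]*\)","\([^"]*\)"#s"\1"\2"g#')
echo "$generated"

# Apply the generated commands to a single filename.
renamed=$(echo './abdogf.png' | sed "$generated")
echo "$renamed"
```

Each CSV row becomes one s command, so every filename that passes through the outer sed is tested against all pairs.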

Sed & Mac OS Terminal: How to remove parentheses content from the first line of every file?

I am on Mac Os 10.14.6 and have a directory that contains subdirectories that all contain text files. Altogether, there are many hundreds of text files.
I would like to go through the text files and check for any content in the first line that is in parentheses. If such content is found, then the parentheses (and content in the parentheses) should be removed.
Example:
Before removal:
The new world (82 edition)
After removal:
The new world
How would I do this?
Steps I have tried:
Google around, it seems SED would be best for this.
I have found this thread, which provides SED code for removing bracketed content.
sed -e 's/([^()]*)//g'
However, I am not sure how to adapt it to work on multiple files and also to limit it to the first line of those files. I found this thread which explains how to use SED on multiple files, but I am not sure how to adapt the example to work with parentheses content.
Please note: As long as the solution works on Mac OS terminal, then it does not need to use SED. However, from Googling, SED seems to be the most suited.
I managed to achieve what you're after simply by using a bash script and sed together, as so:
#!/bin/bash
for filename in "$PWD"/*.txt; do
sed -i '' '1 s/([^()]*)//g' "$filename"
done
The script simply iterates over all the .txt files in $PWD (the current working directory, so that you can add this script to your bin and run it anywhere), and then runs the command
sed -i '' '1 s/([^()]*)//g' "$filename"
on each file. Prefixing the s command with the address 1 tells sed to work only on the first line of the file :)
Edit: Best Answer
The above works fine in a directory that contains only files and no subdirectories; in other words, it does not search recursively through directories.
Therefore, after some research, this command should perform exactly what the question asks:
find . -name "*.txt" -exec sed -i '' '1 s/([^()]*)//g' {} \;
I must iterate, and reiterate, that you test this on a backup first to test it works. Otherwise, use the same command as above but change the '' in order to control the creation of backups. For example,
find . -name "*.txt" -exec sed -i '.bkp' '1 s/([^()]*)//g' {} \;
This command will perform the sed replace in the original file (keeping the filename) but will create a backup file for each with the appended .bkp, for example the backup of test1.txt is test1.txt.bkp. This is a safer option, but choose what works best for you :)
Good try,
The command you were looking for, for a single line:
sed -E '1s|\([^\)]+\)||'
The command to replace the first line of each input file (on macOS, -i needs an explicit, possibly empty, backup suffix):
sed -E -i '' '1s|\([^\)]+\)||' *.txt
Example:
echo "The new world (82 edition)" |sed -E '1s|\([^\)]+\)||'
The new world
Explanation
sed -E -i '': the E option enables extended RegExp syntax; the i option edits files in place (the empty '' means no backup file is kept)
sed -E -i '' '1s|match RegExp||': on the first line only, replace the first RegExp match with an empty string
\([^\)]+\) RegExp matching: start with (, then [^\)] any char that is not ), + one or more times, terminated by )
Try:
# create a temporary file
tmp=$(mktemp)
# for each something in _the current directory_
for i in *; do
# if it is not a file, don't parse it
if [ ! -f "$i" ]; then continue; fi
# remove parenthesis on first line, save the output in temporary file
sed '1s/([^)]*)//g' "$i" > "$tmp"
# move temporary file to the original file
mv "$tmp" "$i"
done
# remove the temporary file if it is still there (the last mv has already moved it away)
rm -f "$tmp"
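Since the question asks about files in subdirectories, the temporary-file loop can be combined with find. The following is a sketch under assumptions: the scratch directory and sample file are invented for the demo, and the pattern is a slight variation that also removes the spaces just before the opening parenthesis.

```shell
#!/bin/bash
# Demo setup (hypothetical scratch tree with a space in a directory name).
workdir=$(mktemp -d)
mkdir -p "$workdir/sub dir"
printf 'The new world (82 edition)\nbody (keep this)\n' > "$workdir/sub dir/a.txt"

# find hands batches of files to a small inline sh script.
find "$workdir" -type f -name '*.txt' -exec sh -c '
    tmp=$(mktemp) || exit 1
    for f do
        # Edit line 1 only; copy back over the original only if sed succeeded.
        sed "1s/ *([^()]*)//g" "$f" > "$tmp" && cat "$tmp" > "$f"
    done
    rm -f "$tmp"
' sh {} +

first_line=$(sed -n '1p' "$workdir/sub dir/a.txt")
second_line=$(sed -n '2p' "$workdir/sub dir/a.txt")
printf '%s\n%s\n' "$first_line" "$second_line"
rm -rf "$workdir"
```

Copying the result back with cat keeps each file's original permissions, and avoiding sed -i sidesteps the difference between the macOS and GNU versions of that option.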

List file using ls with a condition and process/grep files that only whitespaces

I have a list of files in a folder, and some of the filenames contain spaces.
I need to replace the whitespace with _, but first I need to list the files matching a condition: ls *_[1-4]*[A-c]* . After filtering, some of the files have whitespace at no fixed position (front, middle, or end).
How can I replace the whitespace after the ls command?
You don't want to process the output from ls. Simply loop over the matching files.
for file in *_[1-4]*[A-c]*; do
# Skip files which do not contain any whitespace
case $file in *\ *) ;; *) continue;; esac
echo mv -n "$file" "${file// /_}"
done
The echo is there as a safeguard; take it out if the output looks correct.
The case and the substitution look for a space (ASCII 32); if you also want to match tabs, form feeds, etc., adapt accordingly. bash allows for something like [$'\t '] to match a tab or space, but this is not portable to other Bourne shell implementations.
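One possible bash-only adaptation uses the [[:space:]] character class so that tabs and other whitespace are replaced as well. The filenames below are invented for the demo, and mv -n is used so that an existing target is not overwritten.

```shell
#!/bin/bash
# Demo setup (hypothetical filenames in a scratch directory).
workdir=$(mktemp -d)
touch "$workdir/rep_1 Alpha file" "$workdir/rep_2Beta" "$workdir/other file"
touch "$workdir/rep_3"$'\t'"Btab"    # a name containing a tab

for file in "$workdir"/*_[1-4]*[A-c]*; do
    [ -e "$file" ] || continue       # pattern matched nothing
    name=${file##*/}
    new=${name//[[:space:]]/_}       # replace every whitespace character
    [ "$new" = "$name" ] && continue # nothing to rename
    mv -n -- "$file" "$workdir/$new"
done

if [ -e "$workdir/rep_1_Alpha_file" ] && [ -e "$workdir/rep_3_Btab" ] &&
   [ -e "$workdir/rep_2Beta" ] && [ -e "$workdir/other file" ]; then
    outcome=ok
else
    outcome=unexpected
fi
echo "$outcome"
rm -rf "$workdir"
```

The file named "other file" is left alone because it does not match the *_[1-4]*[A-c]* pattern in the first place.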
I would use find to list the files and pipe the results to sed (this only prints the names with spaces replaced; it does not rename anything):
find . -maxdepth 1 -type f -name '*_[1-4]*[A-c]*' | sed 's/ /_/g'

Removing last n characters from Unix Filename before the extension

I have a bunch of files in Unix Directory :
test_XXXXX.txt
best_YYY.txt
nest_ZZZZZZZZZ.txt
I need to rename these files as
test.txt
best.txt
nest.txt
I am using ksh on AIX. Please let me know how I can accomplish the above using a single command.
Thanks,
In this case, it seems you have an _ to start every section you want to remove. If that's the case, then this ought to work:
for f in *.txt
do
g="${f%%_*}.txt"
echo mv "${f}" "${g}"
done
Remove the echo if the output seems correct, or replace the last line with done | ksh.
If the files aren't all .txt files, this is a little more general:
for f in *
do
ext="${f##*.}"
g="${f%%_*}.${ext}"
echo mv "${f}" "${g}"
done
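Two files can map to the same target name (for example test_A.txt and test_B.txt), so a cautious variant of the loop might skip a rename when the target already exists. This is a sketch using the question's example names in a scratch directory; the extra plain.txt and the use of mktemp are assumptions made for the demo.

```shell
# Demo setup (hypothetical scratch directory).
workdir=$(mktemp -d)
touch "$workdir/test_XXXXX.txt" "$workdir/best_YYY.txt" \
      "$workdir/nest_ZZZZZZZZZ.txt" "$workdir/plain.txt"

for f in "$workdir"/*_*.txt; do
    [ -f "$f" ] || continue              # pattern matched nothing
    name=${f##*/}
    g="$workdir/${name%%_*}.txt"         # drop everything from the first _
    if [ -e "$g" ]; then
        echo "skipping $name: ${g##*/} already exists" >&2
        continue
    fi
    mv "$f" "$g"
done

result=$(cd "$workdir" && echo *.txt)
echo "$result"
rm -rf "$workdir"
```

Restricting the glob to *_*.txt also leaves names without an underscore, such as plain.txt, untouched.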
If this is a one time (or not very often) occasion, I would create a script with
$ ls > rename.sh
$ vi rename.sh
:%s/\(.*\)/mv \1 \1/
(edit manually to remove all the XXXXX from the second file names)
:x
$ source rename.sh
If this need occurs frequently, I would need more insight into what XXXXX, YYY, and ZZZZZZZZZ are.
Addendum
Modify this to your liking:
ls | sed "{s/\(.*\)\(............\)\.txt$/mv \1\2.txt \1.txt/}" | sh
It transforms filenames by omitting the 12 characters before .txt and passes the resulting mv commands to a shell.
Beware: if there are non-matching filenames, sed passes them through unchanged, so the shell tries to execute the filename itself instead of a mv command. I omitted a way to select only matching filenames.
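One way to select only matching filenames is sed -n with the p flag, so that lines without a substitution are never printed. The sketch below only generates the commands and does not pipe them to a shell; the sample names are made up, and names containing quotes or newlines would still need separate handling.

```shell
# Sample input standing in for the output of ls (hypothetical names).
names=$(printf '%s\n' 'test_ABCDEFGHIJK.txt' 'README' 'short.txt')

# -n suppresses automatic printing; the p flag prints only substituted lines.
commands=$(printf '%s\n' "$names" |
    sed -n 's/^\(.*\)\(.\{12\}\)\.txt$/mv "\1\2.txt" "\1.txt"/p')
echo "$commands"
```

Here .\{12\} is the interval form of the twelve dots, and README and short.txt produce no output because they do not match the pattern.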

Resources