Recursive BASH renaming - bash

EDIT: Ok, I'm sorry, I should have specified that I was on Windows, and using win-bash, which is based on bash 1.14.2, along with the gnuwin32 tools. This means all of the solutions posted unfortunately didn't help out. It doesn't contain many of the advanced features. I have however figured it out finally. It's an ugly script, but it works.
#!/bin/bash
function readdir
{
cd "$1"
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
cd ..
}
function renamer
{
#replace " - " with a single underscore.
NEWFILE1=`echo "$1" | sed 's/\s-\s/_/g'`
#replace spaces with underscores
NEWFILE2=`echo "$NEWFILE1" | sed 's/\s/_/g'`
#replace "-" dashes with underscores.
NEWFILE3=`echo "$NEWFILE2" | sed 's/-/_/g'`
#remove exclamation points
NEWFILE4=`echo "$NEWFILE3" | sed 's/!//g'`
#remove commas
NEWFILE5=`echo "$NEWFILE4" | sed 's/,//g'`
#remove single quotes
NEWFILE6=`echo "$NEWFILE5" | sed "s/'//g"`
#replace & with _and_
NEWFILE7=`echo "$NEWFILE6" | sed "s/&/_and_/g"`
#remove typographic (curly) apostrophes
NEWFILE8=`echo "$NEWFILE7" | sed "s/’//g"`
mv "$1" "$NEWFILE8"
}
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
ls
I'm trying to create a bash script to recurse through a directory and rename files, to remove spaces, dashes and other characters. I've gotten the script working fine for what I need, except for the recursive part of it. I'm still new to this, so it's not as efficient as it should be, but it works. Anyone know how to make this recursive?
#!/bin/bash
for infile in *.*;
do
#replace " - " with a single underscore.
NEWFILE1=`echo $infile | sed 's/\s-\s/_/g'`;
#replace spaces with underscores
NEWFILE2=`echo $NEWFILE1 | sed 's/\s/_/g'`;
#replace "-" dashes with underscores.
NEWFILE3=`echo $NEWFILE2 | sed 's/-/_/g'`;
#remove exclamation points
NEWFILE4=`echo $NEWFILE3 | sed 's/!//g'`;
#remove commas
NEWFILE5=`echo $NEWFILE4 | sed 's/,//g'`;
mv "$infile" "$NEWFILE5";
done;

find is the command that lists every element in a filesystem hierarchy. You can have it execute a command on each file it finds, or pipe the results to xargs, which will handle the execution part.
Take care with filenames containing whitespace: the glob in for infile in *.* handles them, but unquoted expansions such as echo $infile do not. When piping find output to xargs, use the -print0 option of find together with the -0 option of xargs.
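A minimal sketch of the find-based approach, assuming bash and a find that supports -print0, -depth and -mindepth (GNU and BSD find both do). The helper names sanitize_name and rename_tree are made up for illustration. Only the last path component is rewritten, and -depth makes find list a directory's contents before the directory itself, so parents are renamed after their children:

```shell
#!/usr/bin/env bash
# Sketch only: sanitize_name and rename_tree are hypothetical helper names.

# Map one basename to its cleaned-up form using bash parameter expansion.
sanitize_name() {
    local name=$1
    name=${name// - /_}   # " - " becomes a single underscore
    name=${name// /_}     # remaining spaces become underscores
    name=${name//-/_}     # dashes become underscores
    name=${name//[,!]/}   # drop commas and exclamation points
    printf '%s\n' "$name"
}

# Rename everything below the given root, children before parents.
rename_tree() {
    local root=$1 path dir base new
    find "$root" -mindepth 1 -depth -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}
        base=${path##*/}
        new=$(sanitize_name "$base")
        # Skip unchanged names and never overwrite an existing entry.
        if [ "$base" != "$new" ] && [ ! -e "$dir/$new" ]; then
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Calling rename_tree . would process the current directory; trying it on a scratch copy first is a sensible precaution.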

All those semicolons are superfluous and there's no reason to use all those variables. If you want to put the sed commands on separate lines and intersperse detailed comments you can still do that.
#!/bin/bash
find . | while read -r file
do
newfile=$(echo "$file" | sed '
#replace " - " with a single underscore.
s/\s-\s/_/g
#replace spaces with underscores
s/\s/_/g
#replace "-" dashes with underscores.
s/-/_/g
#remove exclamation points
s/!//g
#remove commas
s/,//g')
mv "$file" "$newfile"
done
This is much shorter:
#!/bin/bash
find . | while read -r file
do
# replace " - " or space or dash with underscores
# remove exclamation points and commas
newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/\s/_/g; s/-/_/g; s/!//g; s/,//g')
mv "$file" "$newfile"
done
Shorter still:
#!/bin/bash
find . | while read -r file
do
# replace " - " or space or dash with underscores
# remove exclamation points and commas
newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/[-[:space:]]/_/g; s/[!,]//g')
mv "$file" "$newfile"
done

In bash 4, setting the globstar option allows recursive globbing.
shopt -s globstar
for infile in **
...
Otherwise, use find.
while read infile
do
...
done < <(find ...)
or
find ... -exec ...
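As a rough illustration of the -exec form, the sketch below runs a small inline sh script on every regular file and prints each basename. The function name print_basenames is an assumption made for this example; the inline script is where a rename command would go:

```shell
#!/usr/bin/env bash
# Sketch only: print_basenames is a hypothetical helper name.
print_basenames() {
    # "{} +" batches many paths into one sh invocation;
    # the extra "sh" argument fills $0 of the inline script.
    find "$1" -type f -exec sh -c '
        for path do
            printf "%s\n" "${path##*/}"
        done
    ' sh {} +
}
```

Passing the paths as arguments to sh -c, rather than splicing {} into the script text, keeps filenames with spaces or quotes intact.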

I've used 'find' in the past to locate files then had it execute another application.
See '-exec'

rename 's/pattern/replacement/' glob_pattern

Related

Remove part of name of multiple files on mac os

I have a directory full of .png files with random characters in the middle of the filenames, like
T1_021_É}ÉcÉjÉV_solid box.png
T1_091_ÉRÉjÉtÉ#Å[_City.png
T1_086_ÉnÉiÉ~ÉYÉL_holiday.png
I expect this after removing
T1_021_solid box.png
T1_091_City.png
T1_086_holiday.png
Thank you
Using for to collect the file lists and bash parameter expansion with substring removal, you can do the following in the directory containing the files:
for i in T1_*; do
beg="${i%_*_*}" ## trim from back to 2nd '_'
end="${i##*_}" ## trim from front through last '_'
mv "$i" "${beg}_$end" ## mv file to new name.
done
(note: you don't have to use the variables beg and end; you can just combine both parameter expansions to form the new filename, e.g. mv "$i" "${i%_*_*}_${i##*_}". It's up to you, but beg and end make things a bit more readable.)
Result
New file names:
$ ls -al T1_*
T1_021_solid box.png
T1_086_holiday.png
T1_091_City.png
Just another way to approach it from bash only.
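The two expansions can be checked without touching any files. This is a small dry-run sketch; the function name drop_middle_field and the ASCII placeholder names (XXXX, YYYY) are assumptions standing in for the real filenames:

```shell
#!/usr/bin/env bash
# Sketch only: drop_middle_field is a hypothetical helper name.
drop_middle_field() {
    local name=$1
    local beg=${name%_*_*}   # remove the shortest suffix matching "_*_*"
    local end=${name##*_}    # remove the longest prefix matching "*_"
    printf '%s_%s\n' "$beg" "$end"
}

# Dry run: print the mv commands instead of executing them.
for name in 'T1_021_XXXX_solid box.png' 'T1_091_YYYY_City.png'; do
    printf 'mv "%s" "%s"\n' "$name" "$(drop_middle_field "$name")"
done
```

Note that this relies on the part after the last underscore containing no further underscores.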
Using cut
You can use cut to remove the 3rd field, using '_' as the delimiter:
for i in T1_*; do
mv "$i" "$(cut -d'_' -f-2,4- <<< "$i")"
done
(same output)
The only drawback is that using cut in a command substitution spawns an additional subshell on each iteration.
If the set of random characters have _ before and after
find . -type f -iname "T1_0*" 2>/dev/null | while IFS= read -r file; do
mv "${file}" "$(echo "${file}" | cut -d'_' -f1,2,4-)"
done
Explanation:
Find all files that start with T1_
Read the list line by line using the while loop
Use _ as delimiter and cut the 3rd column
Use mv to rename
Filenames after renaming:
T1_021_solid box.png
T1_086_holiday.png
T1_091_City.png

Substitute shortest match of pattern in filename

I have files with the following filename pattern:
C14_1_S1_R1_001_copy1.fastq.gz
That I would like to be renamed this way:
C14_1_S1_R1.fastq.gz
I have tested unsuccessfully the following pattern replacement strategy:
for f in *.fastq.gz; do echo mv "$f" "${f/_*./_}"; done
Any suggestion is welcome.
Your original filename has several underscore characters but you only want to remove from the second to last underscore. In that case, try:
mv "$f" "${f%_*_*}.fastq.gz"
Consider a directory with these files:
$ ls -1
C14_1_S1_R1_001_copy1.fastq.gz
C15_1_S1_R1_001_copy1.fastq.gz
If we run our loop and then run a new ls, we see the changed filenames:
$ for f in ./*.fastq.gz; do mv "$f" "${f%_*_*}.fastq.gz"; done
$ ls -1
C14_1_S1_R1.fastq.gz
C15_1_S1_R1.fastq.gz
The key here is that ${var%word} is suffix removal and it matches the shortest possible suffix that matches the glob word. Thus, ${f%_*_*} removes the second-to-last underscore character and everything after it. ${f%_*_*}.fastq.gz removes the second-to-last underscore character and everything after and then restores your desired suffix of .fastq.gz.
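To make the shortest-versus-longest distinction concrete, here is a small sketch using the filename from the question. It also shows why the original ${f/_*./_} attempt failed: the / form replaces the longest match of its pattern. The variable names are chosen for this illustration only:

```shell
#!/usr/bin/env bash
f='C14_1_S1_R1_001_copy1.fastq.gz'

shortest=${f%_*_*}    # %  removes the shortest suffix matching "_*_*"
longest=${f%%_*_*}    # %% removes the longest suffix matching "_*_*"
attempt=${f/_*./_}    # /  replaces the longest match of "_*." with "_"

printf '%s\n' "$shortest" "$longest" "$attempt"
# prints:
# C14_1_S1_R1
# C14
# C14_gz
```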
str="C14_1_S1_R1_001_copy1.fastq.gz"
front=$(echo "${str}" | cut -d'_' -f1-4)
back=$(echo "${str}" | cut --complement -d'.' -f1)
echo "${front}.${back}"
With regex using the =~ test operator and BASH_REMATCH
#!/usr/bin/env bash
for file in *.fastq.gz; do
if [[ $file =~ ^(.+)(_[[:digit:]]+_copy.*[^\.])(\.fastq\.gz)$ ]]; then
echo mv -v "$file" "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
fi
done
Basically it just splits C14_1_S1_R1_001_copy1.fastq.gz into three parts.
BASH_REMATCH[1] has C14_1_S1_R1
BASH_REMATCH[2] has _001_copy1
BASH_REMATCH[3] has .fastq.gz
Remove the echo if you're ok with the output so the files can be renamed.

Removing "!" , "[" and "]" from Filenames

I'm trying to rename over 1700 videos for an emulator I'm putting together.
Some of the files can look like the following examples:
romfilename1!!! (Japan) [SLUS-01005].mp4
romfilename2 (USA) [SLUS-28605] (Disc 1).mp4
romfilename3 (USA) [SLUS-28605] (Disc 2).mp4
I'm trying to achieve the following results:
romfilename1.mp4
romfilename2 (Disc 1).mp4
romfilename3 (Disc 2).mp4
So far I've been able to remove (USA) & (Japan) by using:
for i in *.mp4
do
mv "$i" "`echo $i | sed 's/ (USA)//'`"
done
So now I'm stuck on how to go about removing the exclamation marks.
I've spent a lot of time searching for an answer but haven't had much luck.
I am also stuck on how to go about removing these codes, e.g. "[SLUS-28605]",
mostly because of the brackets "[" and "]"; the code inside is not important.
I've tried the following, but these particular characters mess things up.
for i in *.mp4
do
mv "$i" "`echo $i | sed 's/!!//'`"
done
and...
for i in *.mp4
do
mv "$i" "`echo $i | sed 's/[SLUS-28605]//'`"
done
and..
for i in *.mp4
do
mv "$i" "`echo $i | sed -i 's/[]"[]//g'
done
Thanks in advance for any assistance, Nem
You don't need sed for any of this.
shopt -s extglob
for i in *.mp4
do
# Remove all !; the ! doesn't need to be escaped if history
# expansion is disabled.
new_i=${i//\!}
# Remove the *first* parenthesized group (which contains the country)
new_i=${new_i/ (+([!)]))}
# Remove the bracketed group
new_i=${new_i// \[*]}
#mv "$i" "$new_i"
echo "mv \"$i\" \"$new_i\""
done
You can remove the echo once you verify that the mv commands are correct.
You can substitute multiple patterns in one line using sed, and you should escape special characters like spaces and square brackets:
#!/bin/bash
for i in *.mp4
do
mv "$i" "$(echo "$i" | sed 's/!!!//; s/\ (USA)\ //; s/\ (Japan)\ //; s/\[SLUS-[^][]*\]//')"
done
You can use the rename command for that. The Perl-based version supports regexes, so the command will look like this (run it with -n first to preview the changes):
rename 's/[!\[\]]//g' *
or
rename 's/!+| *\[[^\]]*\]| *\((?:Japan|USA)\)//g' *
Though please double-check the man page of the rename available on your system. E.g. deb-based and rpm-based distributions ship different versions, and the syntax will vary depending on your local rename version.
Regex should be adjusted to your complete requirement, as it is not really clear from the question.
It will also save you from possible issues with special characters in filenames, such as \n.
Remove the ! :
for i in *.mp4
do
name=`echo $i | sed 's/!//g'`
mv "$i" "$name"
done
Remove the [???] :
for i in *.mp4
do
name=`echo $i | sed 's/\[[^][]*\]//g'`
mv "$i" "$name"
done
Remove the (???) :
for i in *.mp4
do
name=`echo $i | sed 's/([^)(]*)//g'`
mv "$i" "$name"
done
If you want to remove everything at once:
for i in *.mp4
do
name=`echo $i | sed 's/!//g' | sed 's/([^)(]*)//g' | sed 's/\[[^][]*\]//g' `
mv "$i" "$name"
done

How to split the contents of `$PATH` into distinct lines?

Suppose echo $PATH yields /first/dir:/second/dir:/third/dir.
Question: How does one echo the contents of $PATH one directory at a time as in:
$ newcommand $PATH
/first/dir
/second/dir
/third/dir
Preferably, I'm trying to figure out how to do this with a for loop that issues one instance of echo per instance of a directory in $PATH.
echo "$PATH" | tr ':' '\n'
Should do the trick. This simply takes the output of echo "$PATH" and replaces every colon with a newline.
Note that the quotation marks around $PATH prevent the collapsing of multiple successive spaces in the value of $PATH while still outputting the content of the variable.
As an additional option (and in case you need the entries in an array for some other purpose) you can do this with a custom IFS and read -a:
IFS=: read -r -a patharr <<<"$PATH"
printf %s\\n "${patharr[@]}"
Or since the question asks for a version with a for loop:
for dir in "${patharr[@]}"; do
echo "$dir"
done
How about this:
echo "$PATH" | sed -e 's/:/\n/g'
(See sed's s command; sed -e 'y/:/\n/' will also work, and is equivalent to the tr ":" "\n" from some other answers.)
It's preferable not to complicate things unless absolutely necessary: a for loop is not needed here. There are other ways to execute a command for each entry in the list, more in line with the Unix Philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
such as:
echo "$PATH" | sed -e 's/:/\n/g' | xargs -n 1 echo
This is functionally equivalent to a for-loop iterating over the PATH elements, executing that last echo command for each element. The -n 1 tells xargs to supply only 1 argument to its command; without it we would get the same output as echo "$PATH" | sed -e 'y/:/ /'.
Since this uses xargs, which has built-in support to split the input, and echoes the input if no command is given, we can write that as:
echo -n "$PATH" | xargs -d ':' -n 1
The -d ':' tells xargs to use : to separate its input rather than whitespace, and the -n tells the leading echo not to write a trailing newline; otherwise we end up with a blank trailing line.
Here is another, shorter one:
echo -e ${PATH//:/\\n}
You can use tr (translate) to replace the colons (:) with newlines (\n), and then iterate over that in a for loop.
directories=$(echo $PATH | tr ":" "\n")
for directory in $directories
do
echo $directory
done
My idea is to use echo and awk.
echo "$PATH" | awk 'BEGIN {FS=":"} {for (i=1; i<=NF; i++) print $i}'
EDIT
This command is better than my former idea.
echo "$PATH" | awk 'BEGIN {FS=":"; OFS="\n"} {$1=$1; print $0}'
If you can guarantee that PATH does not contain embedded spaces, you can:
for dir in ${PATH//:/ }; do
echo $dir
done
If there are embedded spaces, this will fail badly.
# preserve the existing internal field separator
OLD_IFS=${IFS}
# define the internal field separator to be a colon
IFS=":"
# do what you need to do with $PATH
for DIRECTORY in ${PATH}
do
echo ${DIRECTORY}
done
# restore the original internal field separator
IFS=${OLD_IFS}
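An alternative to saving and restoring IFS by hand is to confine the change to a subshell. The sketch below assumes a POSIX-style shell; print_path_entries is a made-up name. Because the function body is wrapped in parentheses, the modified IFS and the set -f (which stops entries such as /opt/* from being glob-expanded) never leak into the caller:

```shell
#!/usr/bin/env bash
# Sketch only: print_path_entries is a hypothetical helper name.
# The ( ... ) body runs in a subshell, so IFS and set -f stay local to it.
print_path_entries() (
    IFS=:
    set -f                      # disable globbing during word splitting
    for dir in $1; do           # intentionally unquoted: split on ":"
        printf '%s\n' "$dir"
    done
)

print_path_entries "/first/dir:/second/dir:/third/dir"
# prints:
# /first/dir
# /second/dir
# /third/dir
```

Entries containing spaces survive intact, since only the colon is a separator inside the subshell.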

change lowercase file names to uppercase with awk ,sed or bash

I would like to change lowercase filenames to uppercase with awk/sed/bash
your help would be appreciated
aaaa.txt
vvjv.txt
acfg.txt
desired output
AAAA.txt
VVJV.txt
ACFG.txt
PREFACE:
If you don't care about the case of your extensions, simply use the 'tr' utility in a shell loop:
for i in *.txt; do mv "$i" "$(echo "$i" | tr '[a-z]' '[A-Z]')"; done
If you do care about the case of the extensions, then you should be aware that there is more than one way to do it (TIMTOWTDI). Personally, I believe the Perl solution, listed here, is probably the simplest and most flexible solution under Linux. If you have multiple file extensions, simply specify the number you wish to keep unchanged. The BASH4 solution is also a very good one, but you must be willing to write out the extension a few times, or alternatively, use another variable to store it. But if you need serious portability then I recommend the last solution in this answer which uses octals. Some flavours of Linux also ship with a tool called rename that may also be worth checking out. Its usage will vary from distro to distro, so type man rename for more info.
SOLUTIONS:
Using Perl:
# single extension
perl -e 's/\.[^\.]*$/rename $_, uc($`) . $&/e for @ARGV' *.txt
# multiple extensions
perl -e 's/(?:\.[^\.]*){2}$/rename $_, uc($`) . $&/e for @ARGV' *.tar.gz
Using BASH4:
# single extension
for i in *.txt; do j="${i%.txt}"; mv "$i" "${j^^}.txt"; done
# multiple extensions
for i in *.tar.gz; do j="${i%.tar.gz}"; mv "$i" "${j^^}.tar.gz"; done
# using a var to store the extension:
e='.tar.gz'; for i in *${e}; do j="${i%${e}}"; mv "$i" "${j^^}${e}"; done
Using GNU awk:
for i in *.txt; do
mv "$i" "$(echo "$i" | awk '{ sub(/\.txt$/,""); print toupper($0) ".txt" }')"
done
Using GNU sed:
for i in *.txt; do
mv "$i" "$(echo "$i" | sed -r -e 's/.*/\U&/' -e 's/\.TXT$/\u.txt/')"
done
Using BASH3.2:
for i in *.txt; do
stem="${i%.txt}";
for ((j=0; j<"${#stem}"; j++)); do
chr="${stem:$j:1}"
if [[ "$chr" == [a-z] ]]; then
chr=$(printf "%o" "'$chr")
chr=$((chr - 40))
chr=$(printf '\'"$chr")
fi
out+="$chr"
done
mv "$i" "$out.txt"
out=
done
In general for lowercase/upper case modifications "tr" ( translate characters ) utility is often used, it's from the set of command line utilities used for character replacement.
dtpwmbp:~ pwadas$ echo "xxx" | tr '[a-z]' '[A-Z]'
XXX
dtpwmbp:~ pwadas$
Also, for renaming files there's "rename" utility, delivered with perl ( man rename ).
SYNOPSIS
rename [ -v ] [ -n ] [ -f ] perlexpr [ files ]
DESCRIPTION
"rename" renames the filenames supplied according to the rule specified as the first argument. The perlexpr argument is a Perl expression which is expected to modify the $_ string in
Perl for at least some of the filenames specified. If a given filename is not modified by the expression, it will not be renamed. If no filenames are given on the command line,
filenames will be read via standard input.
For example, to rename all files matching "*.bak" to strip the extension, you might say
rename 's/\.bak$//' *.bak
To translate uppercase names to lower, you'd use
rename 'y/A-Z/a-z/' *
I would suggest using rename, if you only want to uppercase the filename and not the extension, use something like this:
rename -n 's/^([^.]*)\.(.*)$/\U$1\E.$2/' *
\U uppercases everything until \E, see perlreref(1). Remove the -n when you're happy with the output.
Bash 4 parameter expansion can perform case changes:
for i in *.txt; do
i="${i%.txt}"
mv "$i.txt" "${i^^?}.txt"
done
bash:
for f in *.txt; do
no_ext=${f%.txt}
mv "$f" "${no_ext^^}.txt"
done
for f in *.txt; do
mv "$f" "$(tr '[:lower:]' '[:upper:]' <<< "${f%.*}").txt"
done
An easier, lightweight and portable approach would be:
for i in *.txt
do
fname=$(echo "$i" | cut -d"." -f1 | tr '[a-z]' '[A-Z]')
ext=$(echo "$i" | cut -d"." -f2)
mv "$i" "$fname.$ext"
done
This would work on almost every version of BASH since we are using most common external utilities (cut, tr) found on every Unix flavour.
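One caveat with splitting on the first dot via cut is that a name such as my.file.txt loses its middle part. The sketch below is one possible variant, assuming only POSIX sh features plus tr; the function name upper_stem is made up for illustration. It splits on the last dot instead and leaves names without an extension untouched apart from the case change:

```shell
#!/bin/sh
# Sketch only: upper_stem is a hypothetical helper name.
# Uppercase everything before the last dot; keep the extension as-is.
upper_stem() {
    name=$1
    case $name in
        *.*) stem=${name%.*}; ext=.${name##*.} ;;
        *)   stem=$name;      ext= ;;
    esac
    printf '%s%s\n' "$(printf '%s' "$stem" | tr '[:lower:]' '[:upper:]')" "$ext"
}

# Dry run: print the mv commands instead of executing them.
for i in aaaa.txt 'my file.v2.txt'; do
    printf 'mv "%s" "%s"\n' "$i" "$(upper_stem "$i")"
done
```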
Simply use (on terminal):
for i in *.txt; do mv "$i" "$(echo "${i%.*}" | tr '[:lower:]' '[:upper:]').txt"; done
This might work for you (GNU sed):
printf "%s\n" *.txt | sed 'h;s/[^.]*/\U&/;H;g;s/\(.*\)\n/mv -v \1 /' | sh
or more simply:
printf "%s\n" *.txt | sed 'h;s/[^.]*/\U&/;H;g;s/\(.*\)\n/mv -v \1 /e'
for i in *.jar; do mv "$i" "$(echo "$i" | tr '[:upper:]' '[:lower:]')"; done
this works for me.
