Removing Underscore of Files under a Directory - bash

I want to remove the underscores from the names of the files inside some directories.
I have this:
Divulgation
├── Biology
│   └── Dawkins, C. Richard
│       └── Books
│           ├── The_Blind_Watchmaker.pdf
│           ├── The_God_Delusion.pdf
│           └── The_Selfish_Gene_(3rd_Ed.).pdf
├── Chemistry
│   └── Gray, Theodore W
│       └── Books
│           ├── Molecules_The_Elements_and_the_Architecture_of Everything.epub
│           └── The_Elements,_A_Visual_Exploration_of_Every_Known_Atom in_the_Universe.pdf
└── Physics
    ├── Hawking, Stephen
    │   └── Books
    │       ├── A_Brief_History_of_Time_from_the_Big_Bang_to_Black Holes.djvu
    │       ├── My_Brief_History.epub
    │       └── The_Universe_in_a_Nutshell.pdf
    └── Sagan, Carl E
        └── Books
            ├── Billions_and_Billions_Thoughts_on_Life_and_Death_at_the Brink_of_the_Millennium.pdf
            ├── Cosmos.pdf
            └── The_Dragons_of_Eden.epub
I want this:
My Attempt:
#!/bin/bash
shopt -s extglob # allow fancy @(...) construct to specify dirs
shopt -s globstar # add double-asterisk for flexible depth
for f in @(Divulgation)/**/Books/*.pdf # Search for Files under Books Directory
do echo "mv "$f" `echo "$f" | sed 's/_/ /g'` " # Show the command first
mv "$f" `echo "$f" | sed 's/_/ /g'` # Rename File by removing Underscores using the s Command
done
The error:
mv Divulgation/Biology/Dawkins, C. Richard/Books/The_Blind_Watchmaker.pdf Divulgation/Biology/Dawkins, C. Richard/Books/The Blind Watchmaker.pdf
mv: target 'Watchmaker.pdf' is not a directory
mv Divulgation/Biology/Dawkins, C. Richard/Books/The God Delusion.pdf Divulgation/Biology/Dawkins, C. Richard/Books/The God Delusion.pdf
mv: target 'Delusion.pdf' is not a directory
mv Divulgation/Biology/Dawkins, C. Richard/Books/The Selfish Gene (3rd Ed.).pdf Divulgation/Biology/Dawkins, C. Richard/Books/The Selfish Gene (3rd Ed.).pdf
mv: target 'Ed.).pdf' is not a directory
mv Divulgation/Chemistry/Gray, Theodore W/Books/The Elements, A Visual Exploration of Every Known Atom in the Universe.pdf Divulgation/Chemistry/Gray, Theodore W/Books/The Elements, A Visual Exploration of Every Known Atom in the Universe.pdf
mv: target 'Universe.pdf' is not a directory
mv Divulgation/Physics/Hawking, Stephen/Books/The Universe in a Nutshell.pdf Divulgation/Physics/Hawking, Stephen/Books/The Universe in a Nutshell.pdf
mv: target 'Nutshell.pdf' is not a directory
mv Divulgation/Physics/Sagan, Carl E/Books/Billions and Billions Thoughts on Life and Death at the Brink of the Millennium.pdf Divulgation/Physics/Sagan, Carl E/Books/Billions and Billions Thoughts on Life and Death at the Brink of the Millennium.pdf
mv: target 'Millennium.pdf' is not a directory
mv Divulgation/Physics/Sagan, Carl E/Books/Cosmos.pdf Divulgation/Physics/Sagan, Carl E/Books/Cosmos.pdf
mv: target 'E/Books/Cosmos.pdf' is not a directory
Help ... Umm the PDFs are reviews of the Books, not the Books themselves ..

Great script, almost there.
Using backticks ` is discouraged. Don't use them. Use $(..) instead.
Always remember to quote expansions (and also backticks, if you use them).
Remember that echo "$f" may fail for strange filenames like -e. Use printf "%s\n" "$f" instead. But since it's bash, you can just use a here string: <<<"$f".
You could use set -x to see what is happening. I think simple mv -v might be more... simple.
You need to quote the backticks. If you don't quote them, the shell will do word splitting: it splits the result on spaces into multiple arguments. So for example ls $(echo "filename with spaces.txt") will run ls with 3 arguments and ls will interpret them as 3 separate files - filename, with and spaces.txt.
And replace backticks with $(...):
for f in Divulgation/**/Books/*.pdf; do
    mv -v "$f" "$(<<<"$f" sed 's/_/ /g')"
done
For substituting single characters, tr '_' ' ' seems to me like a more "suitable" command than sed.
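To see the word splitting in isolation, here is a minimal sketch. count_args is a hypothetical helper introduced only for this illustration; it prints how many arguments it received. The last lines compare sed, tr and bash's own ${var//_/ } expansion, which should all produce the same name for this input.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports its argument count.
count_args() { echo "$#"; }

f='The_Blind_Watchmaker.pdf'

# Unquoted substitution: the result is split on spaces into 3 words.
unquoted=$(count_args $(sed 's/_/ /g' <<<"$f"))
# Quoted substitution: the result stays a single word.
quoted=$(count_args "$(sed 's/_/ /g' <<<"$f")")
echo "unquoted: $unquoted, quoted: $quoted"

# Three ways to turn every underscore into a space.
with_sed=$(sed 's/_/ /g' <<<"$f")
with_tr=$(tr '_' ' ' <<<"$f")
with_expansion=${f//_/ }
echo "$with_expansion"
```

The unquoted call is what mv received in the question: several words instead of one target name, hence "is not a directory".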

This might work for you (find, rename and GNU parallel):
find Divulgation -path '*/Books/*.pdf' | rename -v 's/_/ /g'
If you want to parameterize the top directory, use:
parallel --dry-run find {} -path '*/Books/*.pdf' \| rename -v 's/_/ /g' ::: Divulgation SomeotherVulation
When satisfied the commands are correct, remove the --dry-run option.

Related

How to rename files recursively with Bash

I am trying to make a bash script that should replace any occurrence of a given pattern with an other, given, expression in any path in a given directory. For instance, if I have the following tree structure:
.
|- file1
|- file-pattern-pattern.html
|- directory-pattern/
|  |- another-pattern
|  \- pattern.pattern
\- other-pattern/
   \- a-file-pattern
it should end up looking like
.
|- file1
|- file-expression-expression.html
|- directory-expression/
|  |- another-expression
|  \- expression.expression
\- other-expression/
   \- a-file-expression
The main issue I have is that most solutions I have found either use the ** glob pattern together with shopt -s globstar nullglob, or use find to execute rename on all the files. But since I actually change the name of a directory during that operation, it breaks with messages like find: ./directory-pattern: No such file or directory or rename: ./directory-pattern/another-expression: not accessible: No such file or directory.
The second issue is that, according to rename's man page, it "will rename the specified files by replacing the first occurrence" of the pattern, not all occurrences, and I didn't find any option to override this behavior. Of course, I don't want to "just run rename with -v until it doesn't spit anything anymore", which just sounds silly.
So the question is: how do I achieve that bulk-renaming in Bash?
Edit: leave only the 1-pass solution that apparently works as well as the 2-passes.
You'll probably have to explore the hierarchy depth first. Example with find and a bash exec script:
$ find . -depth -name '*pattern*' -exec bash -c \
'f=$(basename "$1"); d=$(dirname "$1"); \
mv "$1" "$d/${f//pattern/expression}"' _ {} \;
Demo:
$ tree .
.
├── file-pattern-pattern.html
├── file1
├── foo-pattern
│   └── directory-pattern
│       ├── another-pattern
│       └── pattern.pattern
└── other-pattern
    └── a-file-pattern
$ find . -depth -name '*pattern*' -exec bash -c \
'f=$(basename "$1"); d=$(dirname "$1"); \
mv "$1" "$d/${f//pattern/expression}"' _ {} \;
$ tree .
.
├── file-expression-expression.html
├── file1
├── foo-expression
│   └── directory-expression
│       ├── another-expression
│       └── expression.expression
└── other-expression
    └── a-file-expression
Explanation: -depth tells find to process each directory's contents before the directory itself. This avoids one of the issues you encountered when referring to a directory that was already renamed. The bash script uses simple pattern substitutions to replace all occurrences of string pattern by string expression.
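The difference between replacing the first occurrence and replacing all of them can be seen directly in bash's parameter expansion. This is a small sketch using one of the names from the demo tree; the variable names are only illustrative.

```shell
#!/usr/bin/env bash
f='pattern.pattern'

first_only=${f/pattern/expression}   # single slash: first match only
all=${f//pattern/expression}         # double slash: every match

echo "$first_only"   # expression.pattern
echo "$all"          # expression.expression
```

The double-slash form is the one used in the find command above, which is why a single pass is enough for names that contain the pattern more than once.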

how can I process a file that has different names depending on the folder (in other words get the name first)?

Let's suppose there is a folder with several subfolders. In each subfolder there is a file, that has a different name depending on the folder. For example
basefolder
|________f1_1_1: video_1_1_1.mp4
|________f1_2_1: video_1_2_1.mp4
|
|_ .....
I want to write a shell script that does some processing on these files.
So I have
search_dir=/path/to/the/basefolder/
for entry in "$search_dir"*/
do
    echo "$entry"
    #ls "$entry" #<--------HERE
    echo "========================"
done
As you can see I can list the subfolders.
I want to do something like
process video_1_1_1.mp4 video_1_1_1_out.mp4
but the file name varies.
Yes I see that I can perhaps use the entry variable to compose the name of the file, but what if the files don't follow this pattern and the only thing I know is that they start with "video"?
Is there a way to get the name of the file in the folder so as to use it later?
Consider this file tree:
$ tree /tmp/test
/tmp/test
├── one
│   ├── one-1.mp4
│   ├── one-2.mp4
│   ├── one-3.mp4
│   ├── video-1.mp4
│   └── video-2.mp4
└── two
    ├── two-1.mp4
    ├── two-2.mp4
    ├── two-3.mp4
    ├── video-1.mp4
    └── video-2.mp4
2 directories, 10 files
You can use a recursive glob (in bash, ** only recurses after shopt -s globstar) to find all the .mp4 files in that tree:
$ for fn in "/tmp/test/"**/*".mp4"; do echo "$fn"; done
/tmp/test/one/one-1.mp4
/tmp/test/one/one-2.mp4
/tmp/test/one/one-3.mp4
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/two-1.mp4
/tmp/test/two/two-2.mp4
/tmp/test/two/two-3.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Or just the ones starting with video:
$ for fn in "/tmp/test/"**/"video-"*".mp4"; do echo "$fn"; done
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Instead of echo you can process...
If process involves more than one file, you can use xargs.
You can also use find:
$ find "/tmp/test/" -iname "video*.mp4" -type f
/tmp/test//one/video-1.mp4
/tmp/test//one/video-2.mp4
/tmp/test//two/video-1.mp4
/tmp/test//two/video-2.mp4
Then you would construct a pipe to xargs or use find -exec:
$ find [ what ] -print0 | xargs -0 process # xargs way
$ find [ what ] -exec process {} + # modern find
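Putting this together for the original question, the output name can be derived from whatever name the glob finds. The sketch below is only an illustration: process is a hypothetical stand-in function that just copies its input, and the tree is created in a temporary directory.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real `process` command: it only copies.
process() { cp -- "$1" "$2"; }

# Throwaway tree mirroring the layout from the question.
base=$(mktemp -d)
mkdir -p "$base/f1_1_1" "$base/f1_2_1"
touch "$base/f1_1_1/video_1_1_1.mp4" "$base/f1_2_1/video_1_2_1.mp4"

for fn in "$base"/*/video*.mp4; do
    [ -e "$fn" ] || continue      # the glob matched nothing
    out="${fn%.mp4}_out.mp4"      # strip the extension, add the suffix
    process "$fn" "$out"
done

find "$base" -name '*_out.mp4'
```

The glob is expanded once before the loop starts, so the freshly created _out files are not picked up again during the same run.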

Script to replace/delete characters in directory and file names (work in progress)

I am trying to delete a set of characters like single quote (') and spaces from file names and directories. Example, I have:
Directory I'm confused which contains file you're right
So far, I have been able to create a short script:
#!/bin/sh
for f in *; do mv "$f" `echo $f | tr ' ' '_'`; done
for f in *; do mv "$f" `echo $f | tr -d \'`; done
which renames the dir to Im_confused as intended. The file in the directory of course is not affected.
How can I replace and delete characters in subdirectories as well?
For example, for depth 2, the command is:
REP_CHARS=" →" # Characters to replace
DEL_CHARS="'," # Characters to delete
find . -maxdepth 2 | sort -r |
sed -n -e '/^\.\+$/!{p;s#.\+/#&\n#;p}' |
sed "n;n;s/[$DEL_CHARS]//g;s/[$REP_CHARS]/_/g" |
sed "n;N;s/\n//" |
xargs -L 2 -d '\n' mv 2>/dev/null
Use find with -maxdepth.
Use sort to order from the deepest.
Use sed to replace only the end part.
Use xargs to perform mv.
[Original]
├── I'm confused
│   ├── I'm confused
│   │   └── you're right
│   ├── comma, comma
│   └── you're right
└── talking heads-love → building on fire
    └── talking heads-love → building on fire
[After]
├── Im_confused
│   ├── Im_confused
│   │   └── you're right
│   ├── comma_comma
│   └── youre_right
└── talking_heads-love___building_on_fire
    └── talking_heads-love___building_on_fire
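Why sort -r puts the deepest paths first: a parent path is always a prefix of its children, so it sorts before them, and reversing the order lists children before their parent. A small sketch with made-up paths (LC_ALL=C is an assumption added here to make the ordering byte-based and predictable):

```shell
#!/usr/bin/env bash
# Example paths only; they do not come from the answer above.
sorted=$(printf '%s\n' './a b' './a b/c d' './a b/c d/e f' './z' | LC_ALL=C sort -r)
printf '%s\n' "$sorted"
```

Each entry is listed before its own parent, so renaming in that order never invalidates a path that still has to be processed.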
I would use this rename script:
#!/bin/sh
for f in *; do
    g=$(printf '%s' "$f" | tr -s '[:space:]' _ | tr -d "'")
    [ "$f" != "$g" ] && mv -v "$f" "$g"
done
and this find invocation
find . -depth -execdir /absolute/path/to/rename.sh '{}' +
-depth does a depth-first descent into the file hierarchy so the files are renamed before their parent directories
-execdir performs the command in the directory where the file is found, so the value of $f only contains the filename not its directory as well.
Demo
$ mkdir -p "a b/c d/e f"
$ touch a\ b/c\ d/e\ f/"I'm confused"
$ touch "a file with spaces"
$ tree
.
├── a\ b
│   └── c\ d
│       └── e\ f
│           └── I'm\ confused
├── a\ file\ with\ spaces
└── rename.sh
3 directories, 3 files
$ find . -depth -execdir /tmp/rename.sh '{}' +
renamed 'a b' -> 'a_b'
renamed 'a file with spaces' -> 'a_file_with_spaces'
renamed "I'm confused" -> 'Im_confused'
renamed 'e f' -> 'e_f'
renamed 'c d' -> 'c_d'
$ tree
.
├── a_b
│   └── c_d
│       └── e_f
│           └── Im_confused
├── a_file_with_spaces
└── rename.sh
3 directories, 3 files
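The name transformation in rename.sh can be tried on its own, without touching any files. This sketch uses a made-up name with repeated spaces to show that tr -s squeezes each run into a single underscore:

```shell
#!/bin/sh
# Made-up sample name; the pipeline is the one from rename.sh.
f="I'm  really   confused"
g=$(printf '%s' "$f" | tr -s '[:space:]' _ | tr -d "'")
echo "$g"   # Im_really_confused
```

Using printf '%s' rather than echo avoids a trailing newline, which tr would otherwise turn into an underscore as well.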

deleting intermediary folders

Maybe one of you guys has something like this at hand already? I tried to use robocopy on windows but to no avail. I also tried to write a bash script in linux with find etc... but gave up on that one also ^^ Google search brought no solution also unfortunately. I need this for my private photo library.
Solution could be linux or windows based, both are fine. Any ideas?
I would like to get rid of hundreds of 'intermediary folders'.
I define an 'intermediary folder' as a folder that contains nothing else than exactly one sub-folder. Example
folder 1
    file in folder 1
    folder 2 <-- 'intermediary folder: contains exactly one sub-folder, nothing else'
        folder 3
            file in folder 3
What I would like to end up with is:
folder 1
    file in folder 1
    folder 3
        file in folder 3
I do not need the script to be recursive (removing several layers of intermediary folders at once), I'll just run it several times.
Even cooler would be if the script could rename folder 3 in the above example to 'folder 2 - folder 3', but I can live without this feature I guess.
I guess one of you linux experts has a one liner handy for that? ^^
Thank you very much!
Take a look at this code:
#!/usr/bin/env bash
shopt -s nullglob
while IFS= read -rd '' dir; do
    f=("$dir"/*)
    if ((${#f[@]}==1)) && [[ -d $f ]]; then
        mv -t "${dir%/*}" "$f" || continue
        rm -r "$dir"
    fi
done < <(find folder1 -depth -mindepth 1 -type d -print0)
Explanation:
shopt -s nullglob: allows filename patterns which match no files to expand to a null string
find ... -depth: makes find traverse the file system in a depth-first order
find ... -mindepth 1: processes all directories except the starting-point
find ... -type d: finds only directories
find ... -print0: prints the directories separated by a null character \0 (to correctly handle possible newlines in filenames)
while IFS= read ...: loops over all the directories (the output of find)
f=("$dir"/*): creates an array with all files in the currently processed directory
((${#f[@]}==1)) && [[ -d $f ]]: true if there is only one file and it is a directory
mv -t "${dir%/*}" "$f": moves the only subdirectory one directory above
mv ... || continue: mv can fail if the subdirectory already exists in the directory above. || continue ignores such subdirectory
rm -r "$dir": removes the processed directory
Test run:
$ tree folder1
folder1
├── file1
├── folder2
│   └── folder3
│       └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder5
    └── folder6
        ├── file6
        └── folder7
            └── folder8
                └── folder9
                    ├── dir9
                    └── file9
$ ./script
$ tree folder1
folder1
├── file1
├── folder3
│   └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder6
    ├── file6
    └── folder9
        ├── dir9
        └── file9
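The test that decides whether a directory is "intermediary" can be tried separately before anything is moved or deleted. In this sketch, is_intermediary is a hypothetical helper name (it is not part of the script above) and the tree is built in a temporary directory.

```shell
#!/usr/bin/env bash
shopt -s nullglob

# Hypothetical helper: succeeds if "$1" holds exactly one entry, a directory.
is_intermediary() {
    local entries=("$1"/*)
    (( ${#entries[@]} == 1 )) && [[ -d ${entries[0]} ]]
}

tmp=$(mktemp -d)
mkdir -p "$tmp/folder2/folder3" "$tmp/folder4"
touch "$tmp/folder2/folder3/file3" "$tmp/folder4/file4a"

for d in folder2 folder2/folder3 folder4; do
    if is_intermediary "$tmp/$d"; then
        echo "$d: intermediary"
    else
        echo "$d: keep"
    fi
done
```

One caveat: without shopt -s dotglob, * does not match hidden entries, so a directory holding one sub-folder plus some dotfiles would also pass this test. It may be worth checking for that before running rm -r on a photo library.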

Search and replace using prename or sed

I am trying to perform search and replace in file names as following:
$ tree
.
├── a_a
│   ├── a_b_c
│   ├── a b c.mkv
│   ├── b_b
│   └── b_c_d
├── b_a
│   ├── a_f_r
│   ├── c_d
│   ├── f_r_e
│   └── r r
├── c_r
│   ├── d_f
│   └── r_a_s.mkv
└── d.mkv
I want to replace the underscores in the file and folder names by spaces. And the way I want to do this is by replacing the underscores in the base names of the files and folder present in the inner directories first and then move up so that the path I am recursing still exists in the next iteration, since if I rename the upper layer directories, in the next iteration the path to access its inner directories and files will become invalid.
I know I can recurse over files using the find command. Now I want to use a tool to perform the replace operation starting with the files inside and then moving outwards. I don't have much experience in writing regex but I think we may be able to do this using grouping in regex, but I am not sure so please help.
Till now I have been able to figure out that we can use regex groups to access some parts of the file name. To be more specific, we can get the base name of the folders and files using following regex:
rename -n 's!([^/]*\Z)!uc($1)!e' ./*
Using above regex in the rename command we can convert the base name group to uppercase and I want to know how can I replace the underscores in that group to spaces.
PS: Also I know some of you might say this is a duplicate question, but please read it again, I have researched a lot before asking the question and could not find this specific question anywhere.
#!/bin/bash
find -depth | while IFS= read -r fn; do
    pnew=$(dirname "$fn")
    fnew=$(basename "$fn")
    if [[ "$fnew" =~ "_" ]]; then
        new="$pnew/${fnew//_/ }"
        echo "$fn -> $new"
        mv "$fn" "$new"
    fi
done
Remarks:
The -depth argument lets find traverse the directories depth-first.
The dirname/basename-split prevents directories from getting renamed along with their children. Only the leaf file/directory may be renamed at a time.
Everything is quoted where needed to allow for spaces in filenames (including incoming filenames).
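The dirname/basename split can also be written with parameter expansion alone. This is a small sketch on one path from the question's tree, showing that only the last component changes while the parent keeps its underscore until its own turn comes:

```shell
#!/usr/bin/env bash
fn='./a_a/a_b_c'

dir=${fn%/*}              # like dirname:  ./a_a
base=${fn##*/}            # like basename: a_b_c
new="$dir/${base//_/ }"   # replace underscores in the last component only

echo "$fn -> $new"        # ./a_a/a_b_c -> ./a_a/a b c
```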
#!/bin/bash
# man find, search for -type
#
# these are other types:
# b - block special, c - character special, d - directory, p - named pipe
# f - regular file, l - symbolic link, s - socket
# Rename regular files first, then directories (deepest first, via -depth),
# so that no path is invalidated before it is used
for TYPE in f d; do
    find . -depth -type "$TYPE" -name '*_*' -print0 |
    while IFS= read -r -d '' NAME; do
        # Only change the last path component; parents are renamed later
        BASE=${NAME##*/}
        NEW_NAME="${NAME%/*}/${BASE//_/-}" # Change '-' to ' ' if you insist on spaces
        echo "renaming '$NAME' to '$NEW_NAME'"
        mv "$NAME" "$NEW_NAME"
    done
done
