Search and replace using prename or sed - bash

I am trying to perform a search and replace in file names as follows:
$ tree
.
├── a_a
│   ├── a_b_c
│   ├── a b c.mkv
│   ├── b_b
│   └── b_c_d
├── b_a
│   ├── a_f_r
│   ├── c_d
│   ├── f_r_e
│   └── r r
├── c_r
│   ├── d_f
│   └── r_a_s.mkv
└── d.mkv
I want to replace the underscores in the file and folder names by spaces. And the way I want to do this is by replacing the underscores in the base names of the files and folder present in the inner directories first and then move up so that the path I am recursing still exists in the next iteration, since if I rename the upper layer directories, in the next iteration the path to access its inner directories and files will become invalid.
I know I can recurse over files using the find command. Now I want a tool that performs the replacement starting with the innermost files and then moving outwards. I don't have much experience writing regular expressions, but I think this might be possible using regex groups. I am not sure, so please help.
Till now I have been able to figure out that we can use regex groups to access some parts of the file name. To be more specific, we can get the base name of the folders and files using following regex:
rename -n 's!([^/]*\Z)!uc($1)!e' ./*
Using above regex in the rename command we can convert the base name group to uppercase and I want to know how can I replace the underscores in that group to spaces.
PS: Also I know some of you might say this is a duplicate question, but please read it again, I have researched a lot before asking the question and could not find this specific question anywhere.

#!/bin/bash
find -depth | while IFS= read -r fn; do
pnew=$(dirname "$fn")
fnew=$(basename "$fn")
if [[ "$fnew" =~ "_" ]]; then
new="$pnew/${fnew//_/ }"
echo "$fn -> $new"
mv "$fn" "$new"
fi
done
Remarks:
The -depth argument lets find traverse the directories depth-first.
The dirname/basename-split prevents directories from getting renamed along with their children. Only the leaf file/directory may be renamed at a time.
Everything is quoted where needed to allow for spaces in filenames (including incoming filenames).
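The same depth-first idea can be sketched with NUL-delimited find output, which also copes with newlines in file names. This is only a sketch under assumptions: the function name rename_underscores is made up for illustration, and mv -n is used so that an existing target is never overwritten.

```shell
#!/bin/bash
# Hypothetical helper: replace underscores with spaces in every name below "$1".
rename_underscores() {
    local root=$1 path dir base
    # -depth lists children before their parent; -mindepth 1 skips the root itself.
    while IFS= read -r -d '' path; do
        dir=${path%/*}      # parent path, left untouched
        base=${path##*/}    # last component, the only part that is rewritten
        if [[ $base == *_* ]]; then
            mv -n -- "$path" "$dir/${base//_/ }"
        fi
    done < <(find "$root" -depth -mindepth 1 -name '*_*' -print0)
}
```

Calling rename_underscores . from the top of the tree shown in the question would rename the inner entries first and a_a, b_a and c_r last.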

#!/bin/bash
# man find, search for -type
#
# these are other types:
# b - block special, c - character special, d - directory, p - named pipe
# f - regular file, l - symbolic link, s - socket
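# Rename regular files first, then directories (deepest first, via -depth),
# so that a path is never invalidated before it is processed
for TYPE in f d; do
while IFS= read -r -d '' NAME; do
BASE=${NAME##*/} # only the last path component is rewritten
if [[ $BASE == *_* ]]; then
NEW_NAME="${NAME%/*}/${BASE//_/-}" # Change '-' to ' ' if you insist on spaces
echo "renaming '$NAME' to '$NEW_NAME'"
mv "$NAME" "$NEW_NAME"
fi
done < <( find . -depth -mindepth 1 -type "$TYPE" -print0 )
done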

Related

how can I process a file that has different names depending on the folder (in other words get the name first)?

Let's suppose there is a folder with several subfolders. In each subfolder there is a file, that has a different name depending on the folder. For example
basefolder
|________f1_1_1: video_1_1_1.mp4
|________f1_2_1: video_1_2_1.mp4
|
|_ .....
I want to write a shell script that do some processing on these files
So I have
search_dir=/path/to/the/basefolder/
for entry in "$search_dir"*/
do
echo "$entry"
#ls "$entry" #<--------HERE
echo "========================"
done
As you can see I can list the subfolders.
I want to do something like
process video_1_1_1.mp4 video_1_1_1_out.mp4
but the file name varies.
Yes I see that I can perhaps use the entry variable to compose the name of the file, but what if the files don't follow this pattern and the only thing I know is that they start with "video"?
Is there a way to get the name of the file in the folder so as to use it later?
Consider this file tree:
$ tree /tmp/test
/tmp/test
├── one
│   ├── one-1.mp4
│   ├── one-2.mp4
│   ├── one-3.mp4
│   ├── video-1.mp4
│   └── video-2.mp4
└── two
├── two-1.mp4
├── two-2.mp4
├── two-3.mp4
├── video-1.mp4
└── video-2.mp4
2 directories, 10 files
You can use a recursive glob to find all the .mp4 files in that tree (in bash, ** only recurses after shopt -s globstar; otherwise it behaves like a single *):
$ for fn in "/tmp/test/"**/*".mp4"; do echo "$fn"; done
/tmp/test/one/one-1.mp4
/tmp/test/one/one-2.mp4
/tmp/test/one/one-3.mp4
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/two-1.mp4
/tmp/test/two/two-2.mp4
/tmp/test/two/two-3.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Or just the ones starting with video:
$ for fn in "/tmp/test/"**/"video-"*".mp4"; do echo "$fn"; done
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Instead of echo you can process...
If process involves more than one file, you can use xargs.
You can also use find:
$ find "/tmp/test/" -iname "video*.mp4" -type f
/tmp/test//one/video-1.mp4
/tmp/test//one/video-2.mp4
/tmp/test//two/video-1.mp4
/tmp/test//two/video-2.mp4
Then you would construct a pipe to xargs or use find -exec:
$ find [ what ] -print0 | xargs -0 process # xargs way
$ find [ what ] -exec process {} + # modern find
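To connect this back to the question, here is a minimal sketch of deriving the output name from whatever file name is found. The function names out_name and list_jobs are hypothetical, and process remains the asker's placeholder command, so it is left commented out.

```shell
#!/bin/bash
# Hypothetical helper: turn "dir/video_1.mp4" into "dir/video_1_out.mp4".
out_name() {
    local in=$1
    printf '%s\n' "${in%.mp4}_out.mp4"
}

# Hypothetical driver: print one "input -> output" pair per matching file
# found one level below the base folder "$1".
list_jobs() {
    local base=$1 fn
    for fn in "$base"/*/video*.mp4; do
        [[ -e $fn ]] || continue   # the glob matched nothing
        printf '%s -> %s\n' "$fn" "$(out_name "$fn")"
        # process "$fn" "$(out_name "$fn")"   # 'process' is the asker's placeholder
    done
}
```

Because the name is taken from the glob match, only the "video" prefix and the .mp4 suffix need to be known in advance.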

Removing Underscore of Files under a Directory

I want to remove the underscores of the files that there are inside some Directories
I have this:
Divulgation
├── Biology
│   └── Dawkins, C. Richard
│   └── Books
│   ├── The_Blind_Watchmaker.pdf
│   ├── The_God_Delusion.pdf
│   └── The_Selfish_Gene_(3rd_Ed.).pdf
├── Chemistry
│   └── Gray, Theodore W
│   └── Books
│   ├── Molecules_The_Elements_and_the_Architecture_of Everything.epub
│   └── The_Elements,_A_Visual_Exploration_of_Every_Known_Atom in_the_Universe.pdf
└── Physics
├── Hawking, Stephen
│   └── Books
│   ├── A_Brief_History_of_Time_from_the_Big_Bang_to_Black Holes.djvu
│   ├── My_Brief_History.epub
│   └── The_Universe_in_a_Nutshell.pdf
└── Sagan, Carl E
└── Books
├── Billions_and_Billions_Thoughts_on_Life_and_Death_at_the Brink_of_the_Millennium.pdf
├── Cosmos.pdf
└── The_Dragons_of_Eden.epub
I want this:
My Attempt:
#!/bin/bash
shopt -s extglob # allow fancy @(...) construct to specify dirs
shopt -s globstar # add double-asterisk for flexible depth
for f in @(Divulgation)/**/Books/*.pdf # Search for files under the Books directory
do echo "mv "$f" `echo "$f" | sed 's/_/ /g'` " # Show the command first
mv "$f" `echo "$f" | sed 's/_/ /g'` # Rename File by removing Underscores using the s Command
done
The error:
mv Divulgation/Biology/Dawkins, C. Richard/Books/The_Blind_Watchmaker.pdf Divulgation/Biology/Dawkins, C. Richard/Books/The Blind Watchmaker.pdf
mv: target 'Watchmaker.pdf' is not a directory
mv Divulgation/Biology/Dawkins, C. Richard/Books/The God Delusion.pdf Divulgation/Biology/Dawkins, C. Richard/Books/The God Delusion.pdf
mv: target 'Delusion.pdf' is not a directory
mv Divulgation/Biology/Dawkins, C. Richard/Books/The Selfish Gene (3rd Ed.).pdf Divulgation/Biology/Dawkins, C. Richard/Books/The Selfish Gene (3rd Ed.).pdf
mv: target 'Ed.).pdf' is not a directory
mv Divulgation/Chemistry/Gray, Theodore W/Books/The Elements, A Visual Exploration of Every Known Atom in the Universe.pdf Divulgation/Chemistry/Gray, Theodore W/Books/The Elements, A Visual Exploration of Every Known Atom in the Universe.pdf
mv: target 'Universe.pdf' is not a directory
mv Divulgation/Physics/Hawking, Stephen/Books/The Universe in a Nutshell.pdf Divulgation/Physics/Hawking, Stephen/Books/The Universe in a Nutshell.pdf
mv: target 'Nutshell.pdf' is not a directory
mv Divulgation/Physics/Sagan, Carl E/Books/Billions and Billions Thoughts on Life and Death at the Brink of the Millennium.pdf Divulgation/Physics/Sagan, Carl E/Books/Billions and Billions Thoughts on Life and Death at the Brink of the Millennium.pdf
mv: target 'Millennium.pdf' is not a directory
mv Divulgation/Physics/Sagan, Carl E/Books/Cosmos.pdf Divulgation/Physics/Sagan, Carl E/Books/Cosmos.pdf
mv: target 'E/Books/Cosmos.pdf' is not a directory
Help ... Umm the PDFs are reviews of the Books, not the Books themselves ..
Great script, almost there.
Using backticks ` is discouraged. Don't use them. Use $(..) instead.
Always remember to quote expansions (and also backticks, if you use them).
Remember that echo "$f" may fail for strange filenames like -e. Use printf "%s\n" "$f" instead. But since this is bash, you can just use a here string: <<<"$f".
You could use set -x to see what is happening. I think a simple mv -v might be more... simple.
You need to quote the backticks. If you don't quote them, the shell will do word splitting: it splits the result on spaces into multiple arguments. So for example ls $(echo "filename with spaces.txt") will run ls with 3 arguments, and ls will interpret them as 3 separate files: filename, with and spaces.txt.
And replace backticks with $(...):
for f in Divulgation/**/Books/*.pdf; do
mv -v "$f" "$(<<<"$f" sed 's/_/ /g')"
done
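The word splitting described above can be observed directly. This is a small illustration, assuming the default IFS; count_args is a hypothetical helper that only reports how many arguments it received.

```shell
#!/bin/bash
# Hypothetical helper: report how many arguments it received.
count_args() { printf '%s\n' "$#"; }

name='The Blind Watchmaker.pdf'
unquoted=$(count_args $(printf '%s\n' "$name"))    # substitution is split on spaces
quoted=$(count_args "$(printf '%s\n' "$name")")    # substitution stays one argument
printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"   # prints "unquoted: 3, quoted: 1"
```

This is why mv in the question treated 'Watchmaker.pdf' as its target: it was the last of several split words.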
For substituting single characters, tr '_' ' ' seems to me a more "suitable" command than sed.
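Since the script already runs in bash, neither sed nor tr is strictly required. The following sketch uses parameter expansion and only touches the last path component, so underscores in directory names are left alone. The function name spaced_name is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print "$1" with the underscores in its last path
# component replaced by spaces; the directory part is returned unchanged.
spaced_name() {
    local path=$1 dir base
    base=${path##*/}
    case $path in
        */*) dir=${path%/*}/ ;;   # keep everything up to the last slash
        *)   dir= ;;              # no directory part at all
    esac
    printf '%s\n' "$dir${base//_/ }"
}
```

Inside the loop it could be used as mv -v -- "$f" "$(spaced_name "$f")".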
This might work for you (find, rename and GNU parallel):
find Divulgation -path '*/Books/*.pdf' | rename -v 's/_/ /g'
If you want to parameterize the top directory, use:
parallel --dryrun find {} -path '*/Books/*.pdf' \| rename -v 's/_/ /g' ::: Divulgation SomeotherVulation
When satisfied the commands are correct, remove the --dryrun option.

deleting intermediary folders

Maybe one of you guys has something like this at hand already? I tried to use robocopy on windows but to no avail. I also tried to write a bash script in linux with find etc... but gave up on that one also ^^ Google search brought no solution also unfortunately. I need this for my private photo library.
Solution could be linux or windows based, both are fine. Any ideas?
I would like to get rid of hundreds of 'intermediary folders'.
I define an 'intermediary folder' as a folder that contains nothing else than exactly one sub-folder. Example
folder 1
file in folder 1
folder 2 <-- 'intermediary folder: contains exactly one sub-folder, nothing else'
folder 3
file in folder 3
What I would like to end up with is:
folder 1
file in folder 1
folder 3
file in folder 3
I do not need the script to be recursive (removing several layers of intermediary folders at once), I'll just run it several times.
Even cooler would be if the script could rename folder 3 in the above example to 'folder 2 - folder 3', but I can live without this feature I guess.
I guess one of you linux experts has a one liner handy for that? ^^
Thank you very much!
Take a look at this code:
#!/usr/bin/env bash
shopt -s nullglob
while IFS= read -rd '' dir; do
f=("$dir"/*)
if ((${#f[@]}==1)) && [[ -d $f ]]; then
mv -t "${dir%/*}" "$f" || continue
rm -r "$dir"
fi
done < <(find folder1 -depth -mindepth 1 -type d -print0)
Explanation:
shopt -s nullglob: allows filename patterns which match no files to expand to a null string
find ... -depth: makes find traverse the file system in a depth-first order
find ... -mindepth 1: processes all directories except the starting-point
find ... -type d: finds only directories
find ... -print0: prints the directories separated by a null character \0 (to correctly handle possible newlines in filenames)
while IFS= read ...: loops over all the directories (the output of find)
f=("$dir"/*): creates an array with all files in the currently processed directory
((${#f[@]}==1)) && [[ -d $f ]]: true if there is only one file and it is a directory
mv -t "${dir%/*}" "$f": moves the only subdirectory one directory above
mv ... || continue: mv can fail if the subdirectory already exists in the directory above. || continue ignores such subdirectory
rm -r "$dir": removes the processed directory
Test run:
$ tree folder1
folder1
├── file1
├── folder2
│   └── folder3
│   └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder5
└── folder6
├── file6
└── folder7
└── folder8
└── folder9
├── dir9
└── file9
$ ./script
$ tree folder1
folder1
├── file1
├── folder3
│   └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder6
├── file6
└── folder9
├── dir9
└── file9
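The optional renaming asked for in the question ('folder 2 - folder 3') is not covered by the script above. One possible variation is sketched below, with the caveat that the function name flatten_with_names and the exact " - " naming are assumptions. It uses rmdir instead of rm -r, so a folder that still contains something (for example a hidden file) is left in place.

```shell
#!/bin/bash
# Hypothetical variant: flatten intermediary folders below "$1", naming the
# promoted folder "<intermediary> - <child>".
flatten_with_names() {
    local root=$1 dir target
    local -a f
    while IFS= read -rd '' dir; do
        f=("$dir"/*)                        # hidden entries are not counted
        if (( ${#f[@]} == 1 )) && [[ -d ${f[0]} && ! -L ${f[0]} ]]; then
            target="${dir%/*}/${dir##*/} - ${f[0]##*/}"
            [[ -e $target ]] && continue    # never overwrite an existing entry
            mv -- "${f[0]}" "$target" && rmdir -- "$dir"
        fi
    done < <(find "$root" -depth -mindepth 1 -type d -print0)
}
```

Because the traversal is depth-first, chains of intermediary folders accumulate names from the inside out, for example 'folder7 - folder8 - folder9'.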

Bash: How to replace a prefix of folders

I have a bunch of folders:
test_001
test_002
and I would like to replace the prefix test with ftp to get:
ftp_001
ftp_002
One problem: I have access to a Linux server with a minimal installation. For example, rename is not installed, and probably not even sed. So, how can I replace the prefix using pure bash?
Since you have a minimal installation I have tried to make a command that does not require tr, sed or find.
INPUT:
$ tree .
.
├── a
├── b
├── c
├── test_001
└── test_002
2 directories, 3 files
CMD:
for d in test_*/; do mv "${d:0:-1}" "ftp${d:4:-1}"; done
OUTPUT:
$ tree .
.
├── a
├── b
├── c
├── ftp_001
└── ftp_002
2 directories, 3 files
Explanations about substrings in bash : https://www.tldp.org/LDP/abs/html/string-manipulation.html
This little script may help:
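for dir in test*/
do
mv "$dir" "${dir/test/ftp}"
done
Execute it in the parent directory of your test_00x directories. Restricting the glob to test*/ avoids calling mv on directories that do not contain the prefix.
It could be written as a compact one-liner:
for dir in test*/; do mv "$dir" "${dir/test/ftp}"; done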
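A slightly more defensive sketch of the same idea removes the prefix with ${name#prefix}, which only matches at the start of the name, and skips the loop body when nothing matches. The function name swap_prefix and its three parameters are hypothetical; apart from mv it uses only bash builtins.

```shell
#!/bin/bash
# Hypothetical helper: rename every directory in "$1" whose name starts with
# the prefix "$2" so that it starts with "$3" instead.
swap_prefix() {
    local parent=$1 old=$2 new=$3 dir name
    for dir in "$parent"/"$old"*/; do
        [[ -d $dir ]] || continue      # the glob matched nothing
        name=${dir%/}                  # drop the trailing slash
        name=${name##*/}               # keep only the directory name
        mv -- "$parent/$name" "$parent/$new${name#"$old"}"
    done
}
```

For the example above, swap_prefix . test ftp would turn test_001 and test_002 into ftp_001 and ftp_002 and leave regular files alone.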

Compress all files of certain file types in subfolders in one file per subfolder using shell script or AppleScript

I am looking for a way to archive all files of certain file types in one zip file per subfolder.
My folder structure is as follows:
/path/to
└── TopLevel
├── SubLevel1
│   ├── SubSubLevel1
│   ├── SubSubLevel2
│   └── SubSubLevel3
├── SubLevel2
│   ├── SubSubLevel1
│   ├── SubSubLevel2
│   └── SubSubLevel3
├── SubLevel3
│   ├── SubSubLevel1
│   └── SubSubLevel2
└── SubLevel4
In each folder or subfolder or sub-subfolder, there are files of the file type *.abc, *.xyz and also *.001 through *.999 and all these files I want to compress into one zip file per folder, i.e. all files of the specified types in folder "SubSubLevel1" of "SubLevel1" of "TopLevel" should be packaged into one file named "SubSubLevel1_data.zip" inside the "SubSubLevel1" folder. All other files in these folders, which do not match the search criteria as described above, should be kept unzipped in the same directory.
I have found some ideas here or here, but both approaches are based on a different way of archiving the files and I have so far not found a way to adopt them to my needs since I am not very experienced with shell scripting. I have also tried to get a solution with AppleScript, but there I face the problem how to get all files in the folder with the number as an extension (*.001 through *.999). With RegEx, I would do something like ".abc|.xyz.\d\d\d" which would cover my search for certain file types, but I am also not sure now how to implement the result of a grep in AppleScript.
I guess someone out there must have an idea how to address my archiving issue. Thanks in advance for your suggestions.
After some playing around I came up with the following solution:
#!/bin/bash
shopt -s nullglob
find -E "$PWD" -type d -maxdepth 1 -regex ".*201[0-5][0-1][0-9].*" -print0 | while IFS="" read -r -d "" thisFolder ; do
echo "The current folder is: $thisFolder"
to_archive=( "$thisFolder"/*.[Aa][Bb][Cc] "$thisFolder"/*.[Xx][Yy][Zz] "$thisFolder"/*.[0-9][0-9][0-9] )
if [ ${#to_archive[@]} != 0 ]
then
7z a -mx=9 -uz1 -x!.DS_Store "$thisFolder"/"${thisFolder##*/}"_data.7z "${to_archive[@]}" && rm "${to_archive[@]}"
fi
find "$thisFolder" -type d -mindepth 1 -maxdepth 1 -print0 | while IFS="" read -r -d "" thisSubFolder ; do
echo "The current subfolder is: $thisSubFolder"
to_archive=( "$thisSubFolder"/*.[Aa][Bb][Cc] "$thisSubFolder"/*.[Xx][Yy][Zz] "$thisSubFolder"/*.[0-9][0-9][0-9] )
if [ ${#to_archive[@]} != 0 ]
then
7z a -mx=9 -uz1 -x!.DS_Store "$thisSubFolder"/"${thisSubFolder##*/}"_data.7z "${to_archive[@]}" && rm "${to_archive[@]}"
fi
done
done
My script uses two nested while loops to iterate through the folders and their subfolders. With find, I match a regex pattern so that only folders from 2010-2015 are backed up. All files matching the specified extensions inside each folder are compressed into one target archive per folder.
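For folder trees of arbitrary depth, the two nested loops could be replaced by a single find over all directories. The sketch below only reports what would be archived; the function names are hypothetical, the zip call is an assumption (it presumes the Info-ZIP command-line tool is installed) and is therefore left commented out. Note that the three-digit pattern also matches .000.

```shell
#!/bin/bash
# Hypothetical helper: fill the global array to_archive with the regular files
# in "$1" that end in .abc, .xyz (any letter case) or in three digits.
collect_to_archive() {
    local dir=$1 f
    to_archive=()
    for f in "$dir"/*.[Aa][Bb][Cc] "$dir"/*.[Xx][Yy][Zz] "$dir"/*.[0-9][0-9][0-9]; do
        [[ -f $f ]] && to_archive+=("$f")   # skips unmatched patterns and directories
    done
    return 0
}

# Hypothetical driver: visit every directory below "$1" and report the archive
# that would be created inside it.
report_tree() {
    local dir
    while IFS= read -r -d '' dir; do
        collect_to_archive "$dir"
        (( ${#to_archive[@]} )) || continue
        printf '%s: %d file(s) -> %s_data.zip\n' "$dir" "${#to_archive[@]}" "${dir##*/}"
        # zip -j "$dir/${dir##*/}_data.zip" "${to_archive[@]}"   # assumes zip is installed
    done < <(find "$1" -type d -print0)
}
```

Files that do not match any of the three patterns are never added to the array, so they stay untouched in their folder.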

Resources