Renaming files but keeping them in their present subdirectory - bash

I have a script that renames HTML files based on information in their tags. This script goes through the current directory and all subdirectories and performs this renaming recursively. However, after renaming them it moves them into the current working directory I am executing my shell script from. How can I make sure the files remain in their subdirectories and are not moved to the working directory?
Here is what I am working with:
#!/usr/bin/env bash
for f in `find . -type f | grep \.htm`
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv ./"$f" "${title//[ ]/-}".htm
done

Never use this construct:
for f in `find . -type f | grep \.htm`
as the loop fails for file names that contain spaces, and the grep is unnecessary since find has a -name option for that. Use this instead:
find . -type f -name '*.htm*' -print |
while IFS= read -r f
This:
awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF}
can be reduced and clarified to:
awk 'BEGIN{IGNORECASE=1;FS="</?title>";RS=""}
Note that the use of EOF was misleading as EOF is just an undefined variable which therefore contains a null string (so your first record will go until the first blank line, not until the end of the file). You could have used RS=bubba and got the same effect but just setting RS to an empty string is clearer. Not saying it's what you SHOULD be doing, but it's a clearer implementation of what you ARE doing.
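A quick throwaway illustration of that behaviour (not part of the fix itself):
# RS=EOF is effectively RS="" (paragraph mode), so the first record stops at the blank line
printf 'line1\nline2\n\nline3\n' | awk 'BEGIN{RS=EOF} {printf "first record: [%s]\n", $0; exit}'
# prints: first record: [line1
# line2]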
Finally, putting it all back together, something like this should work for you:
find . -type f -name '*.htm*' -print |
while IFS= read -r f
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="</?title>";RS=""} {print $2}' "$f" )
mv -- "$f" "$(dirname "$f")/${title//[ ]/-}.htm"
done

Try:
mv ./"$f" "$(dirname "$f")/${title//[ ]/-}".htm
Note that your for f in `find ...` will fail on any file name with a space or CR in it. You can avoid that with a line like:
find . -type f -name '*.htm' -exec myrename.sh {} \;
where the renaming code is in a script called myrename.sh.
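A hypothetical myrename.sh for that -exec call might look something like this (the script name and the title extraction are assumptions carried over from the loop above, not a tested implementation):
#!/usr/bin/env bash
# myrename.sh (sketch): rename the single file passed in by find, keeping it in its own directory
f=$1
title=$( awk 'BEGIN{IGNORECASE=1;FS="</?title>";RS=""} {print $2}' "$f" )
mv -- "$f" "$(dirname "$f")/${title//[ ]/-}.htm"
Remember to make it executable (chmod +x myrename.sh) and give find a path it can resolve, e.g. -exec ./myrename.sh {} \;.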

shopt -s globstar nullglob
for f in **/*.htm
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv "$f" "$(dirname "$f")/${title//[ ]/-}".htm
done

Issues renaming files using bash script with input from .txt file with find -exec rename command

Update 01/12/2022
With triplee's helpful suggestions, I resolved it to handle both files and directories by using -type d,f (d and f separated by a comma); the final code now looks like this:
while read -r old new;
do echo "replacing ${old} by ${new}" >&2
find '/path/to/dir' -depth -type d,f -name "$old" -exec rename "s/${old}/${new}/" {} ';'
done <input.txt
Thank you!
Original request:
I am trying to rename a list of files (from $old to $new), all present in $homedir or in subdirectories in $homedir.
In the command line this line works to rename files in the subfolders:
find ${homedir}/ -name ${old} -exec rename "s/${old}/${new}/" */${old} ';'
However, when I want to implement this line in a simple bash script getting the $old and $new filenames from input.txt, it doesn't work anymore...
input.txt looks like this:
name_old name_new
name_old2 name_new2
etc...
the script looks like this:
#!/bin/bash
homedir='/path/to/dir'
cat input.txt | while read old new;
do
echo 'replacing' ${old} 'by' ${new}
find ${homedir}/ -name ${old} -exec rename "s/${old}/${new}/" */${old} ';'
done
After running the script, the echo line with the $old and $new filenames being replaced is printed for the entire loop, but no files are renamed. No error is printed either. What am I missing? Your help would be greatly appreciated!
I checked whether the $old and $new variables were correctly passed to the find -exec rename command, but because they are printed by echo that doesn't seem to be the issue.
If you add an echo, like -exec echo rename ..., you'll see what actually gets executed. I'd say that the path to $old is wrong (you're not using the result of find in the -exec clause), and that */$old isn't quoted, so it might be expanded by the shell before find ever gets to see it.
You're also leaving most other expansions unquoted, which can lead to all sorts of trouble.
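For instance, a debugging run of the loop body from the question, with nothing but echo added (the unquoted expansions are left as-is on purpose, so you can see what they turn into):
# same find as in the script, but echo prints the rename command lines instead of running them
find ${homedir}/ -name ${old} -exec echo rename "s/${old}/${new}/" */${old} ';'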
You could do it in pure Bash (drop echo when output looks good):
shopt -s globstar
for f in **/"$old"; do echo mv "$f" "${f/%$old/$new}"; done
Or with rename directly, though this would run into trouble if too many files match (drop -n when output looks good):
rename -n "s/$old\$/$new/" **/"$old"
Or with GNU find, using -execdir to run in the same directory as the matching file (drop echo when output looks good):
find -type f -name "$old" -execdir echo mv "$old" "$new" \;
And finally, a version with find that spawns just a single subshell (drop echo when output looks right):
find -type f -name "$old" -exec bash -c '
new=$1
shift
for f; do
echo mv "$f" "${f%/*}/$new"
done
' bash "$new" {} +
The argument to rename should be the file itself, not */${old}. You also have a number of quoting errors, and a useless cat.
#!/bin/bash
while read -r old new;
do
echo "replacing ${old} by ${new}" >&2
find /path/to/dir -name "$old" -exec rename "s/${old}/${new}/" {} ';'
done <input.txt
Running find multiple times on the same directory is hugely inefficient, though. Probably a better solution is to find all files in one go, and simply skip any that are not on the list.
find /path/to/dir -type f -exec sh -c '
for f in "$@"; do
awk -v f="$f" "f==\$1 { print \"s/\" \$1 \"/\" \$2 \"/\" }" "$0" |
xargs -I _ -r rename _ "$f"
done' input.txt {} +
(Untested; probably try with echo before you run this live.)

Show directory path with only files present in them

This is the folder structure that I have.
Using the find command find . -type d in the root folder gives me the following result:
Result
./folder1
./folder1/folder2
./folder1/folder2/folder3
However, I want the result to be only ./folder1/folder2/folder3, i.e. only print a directory if there is a .txt file present inside it.
Can someone help with this scenario? Hope it makes sense.
find . -type f -name '*.txt' |
sed 's=/[^/]*\.txt$==' |
sort -u
Find all .txt files, remove file names with sed to get the parent directories only, then sort -u to remove duplicates.
This won't work on file names/paths that contain a newline.
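A newline-safe variant of the same idea, assuming GNU find and GNU sort, could look like this:
# %h prints the parent directory; NUL delimiters keep embedded newlines intact
find . -type f -name '*.txt' -printf '%h\0' | sort -z -u | tr '\0' '\n'
# the trailing tr is only for display; drop it if you feed the NUL-separated list to another tool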
You may use this find command, which finds all the *.txt files and then prints their unique parent directory names:
find . -type f -name '*.txt' -exec bash -c '
for f; do
f="${f#.}"
printf "%s\0" "$PWD${f%/*}"
done
' _ {} + | awk -v RS='\0' '!seen[$0]++'
We are using printf "%s\0" to address directory names with newlines, spaces and glob characters.
gnu-awk is then used to print only the unique directory names.
Using an associative array and process substitution:
#!/usr/bin/env bash
declare -A uniq_path
while IFS= read -rd '' files; do
path_name=${files%/*}
if ((!uniq_path["$path_name"]++)); then
printf '%s\n' "$path_name"
fi
done < <(find . -type f -name '*.txt' -print0)
Check the value of uniq_path
declare -p uniq_path
Maybe this POSIX one?
find root -type f -name '*.txt' -exec dirname {} \; | awk '!seen[$0]++'
* adds a trailing \n after each directory path
* breaks when a directory in a path has a \n in its name
Or this BSD/GNU one?
find root -type f -name '*.txt' -exec dirname {} \; -exec printf '\0' \; | sort -z -u
* adds a trailing \n\0 after each directory path

In bash, how can I recursively rename each file to the name of its parent folder, retaining the original extension?

I have a large directory of folders, each of which has only one file:
directory/folder1/208hasdfasdf.jpg
directory/folder2/f230fsdf.gif
directory/folder3/23fsdbfasf.jpg
I'd like to rename this to:
directory/folder1/folder1.jpg
directory/folder2/folder2.gif
directory/folder3/folder3.jpg
How can I do that?
For the path and filenames shown, you can use a loop and combination of find and sed to make the substitutions, e.g.
for f in $(find directory -type f -wholename "*folder*"); do
mv "$f" $(sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' <<< "$f")
done
Where sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' uses the alternative delimiter '|' instead of '/' to ease dealing with pathnames, and then separates and captures the "directory" with ^([^/]+) and then the "folderX" with ([^/]+), followed by the filename without the extension ([^.]+) and lastly the extension (.*)$, making each component available through the numbered backreferences \1, \2, \3, and \4, respectively.
Then to form the new filename, you just duplicate the \2 foldername in place of the \3 filename, for a new filename of \1/\2/\2.\4
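Applied to one of the sample paths, the substitution works out like this (a quick check you can run on its own):
echo 'directory/folder1/208hasdfasdf.jpg' |
sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|'
# -> directory/folder1/folder1.jpg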
Example Use/Output
$ find tmp-david -type f -wholename "*folder*"
tmp-david/folder3/23fsdbfasf.jpg
tmp-david/folder2/f230fsdf.gif
tmp-david/folder1/208hasdfasdf.jpg
And the replacement of the filenames with
$ for f in $(find tmp-david -type f -wholename "*folder*"); do
> mv "$f" $(sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' <<< "$f")
> done
Resulting in:
$ find tmp-david -type f -wholename "*folder*"
tmp-david/folder3/folder3.jpg
tmp-david/folder2/folder2.gif
tmp-david/folder1/folder1.jpg
You could try something like this, assuming you're using bash:
find directory/ \( -name '*.gif' -o -name '*.jpg' \) -print |
while read old; do
parent=${old%/*}
base=${parent##*/}
ext=${old##*.}
mv $old $parent/$base.$ext
done
If you're dealing with filenames that contain whitespace you're going to need to massage this a bit (a sketch of one way to do that follows the sample output below).
Before running this script:
$ find directory -type f -print
directory/folder2/f230fsdf.gif
directory/folder1/208hasdfasdf.jpg
directory/folder3/23fsdbfasf.jpg
After running this script:
$ find directory -type f -print
directory/folder2/folder2.gif
directory/folder1/folder1.jpg
directory/folder3/folder3.jpg
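And the whitespace-tolerant massaging mentioned above could look roughly like this (a sketch assuming GNU find for -print0):
find directory/ \( -name '*.gif' -o -name '*.jpg' \) -print0 |
while IFS= read -r -d '' old; do
# every expansion quoted so spaces in paths survive
parent=${old%/*}
base=${parent##*/}
ext=${old##*.}
mv -- "$old" "$parent/$base.$ext"
done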

Renaming directories in bash using sed

I have several directories which I want to rename, e.g.:
"duedate-year" directory to "duedate" (just removing -year)
"start-year" directory to "start"
This is what I've tried:
for CACHE in `find ${DESTINATION_REPO} -maxdepth 1 -type d -name "*year" ` ;
do
set UPDATE="awk -F"-year" '{print $1}' $CACHE" ;
mv $CACHE $UPDATE
done
However, it doesn't succeed. Is there a way to rename a directory using the "sed" command?
You're assigning the result of awk incorrectly. It should be inside backticks or $(...). And to process a variable, you need to pipe echo $CACHE to it, not use $CACHE as the filename argument (that will process the contents of the file).
And variables aren't assigned using set, you just write var=value.
So that line should be:
UPDATE=$(echo "$CACHE" | awk -F-year '{print $1}')
But there's no need to use awk for this at all, you can use shell variable expansion operators:
UPDATE=${CACHE%%-year*}
%%-year* means to remove the longest trailing part of the value that matches the wildcard -year*.
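A one-line illustration of that expansion:
CACHE=/repo/duedate-year
echo "${CACHE%%-year*}"   # prints /repo/duedate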
Many shell solutions will "work" for a given sample input set and then blow up disastrously later, usually due to unquoted variables, incorrect processing of blanks, etc. This should be safe unless your file name contains newlines (in which case see find -print0 and xargs -0):
find "$DESTINATION_REPO" -maxdepth 1 -type d -name "*-year" |
while IFS= read -r CACHE
do
mv -- "$CACHE" "${CACHE%-year}"
done
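For the newline case mentioned above, a sketch of the -print0 flavour (assuming GNU find and bash) might be:
# NUL-delimited paths survive newlines in directory names
find "$DESTINATION_REPO" -maxdepth 1 -type d -name '*-year' -print0 |
while IFS= read -r -d '' CACHE
do
mv -- "$CACHE" "${CACHE%-year}"
done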
Or use the rename command
rename 's/-year//' *year
Yes you can use a pipe. I do this:
for DIR in $(find ${DESTINATION_REPO} -maxdepth 1 -type d -name "*year"); do
mv "${DIR}" "$(echo "${DIR}" | sed -E 's/-year//')"
done
It should be noted that I am very much self taught and sometimes have bad habits...
After consulting gniourf_gniourf I am posting a more robust version, which is a "code lift" of Ed Morton's answer above.
find "${DESTINATION_REPO}" -maxdepth 1 -type d -name "*-year" |
while IFS= read -r DIR
do
mv "${DIR}" "$(echo "${DIR}" | sed -E 's/-year//')"
done

How can I list all unique file names without their extensions in bash?

I have a task where I need to move a bunch of files from one directory to another. I need to move all files with the same file name (i.e. blah.pdf, blah.txt, blah.html, etc...) at the same time, and I can move a set of these every four minutes. I had a short bash script to just move a single file at a time at these intervals, but the new name requirement is throwing me off.
My old script is:
find ./ -maxdepth 1 -type f | while read line; do mv "$line" ~/target_dir/; echo "$line"; sleep 240; done
For the new script, I basically just need to replace find ./ -maxdepth 1 -type f
with a list of unique file names without their extensions. I can then just replace do mv "$line" ~/target_dir/; with do mv "$line*" ~/target_dir/;.
So, with all of that said, what's a good way to get a unique list of file names without their extensions with a bash script? I was thinking about using a regex to grab file names and then throwing them in a hash to get uniqueness, but I'm hoping there's an easier/better/quicker way. Ideas?
A one-liner tolerant of weirdly named files could be:
find . -maxdepth 1 -type f -and -iname 'blah*' -print0 | xargs -0 -I {} mv {} ~/target/dir
If the files can start with multiple prefixes, you can use logic operators in find. For example, to move blah.* and foo.*, use:
find . -maxdepth 1 -type f -and \( -iname 'blah.*' -or -iname 'foo.*' \) -print0 | xargs -0 -I {} mv {} ~/target/dir
EDIT
Updated after comment.
Here's how I'd do it:
find ./ -type f -printf '%f\n' | sed 's/\..*//' | sort | uniq | ( while read filename ; do find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \; ; sleep 240; done )
Perhaps it needs some explanation:
find ./ -type f -printf '%f\n': find all files and print just their name, followed by a newline. If you don't want to look in subdirectories, this can be substituted by a simple ls;
sed 's/\..*//': strip the file extension by removing everything after the first dot. Both foo.tar and foo.tar.gz are transformed into foo;
sort | uniq: sort the filenames just found and remove duplicates;
(: open a subshell:
while read filename: read a line and put it into the $filename variable;
find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \;: find in the current directory (find .) all the files (-type f) whose name starts with the value in filename (-iname "$filename"'*', this works also for files containing whitespaces in their name) and execute the mv command on each one (-exec mv {} /dest/dir \;)
sleep 240: sleep
): end of subshell.
Add -maxdepth 1 as argument to find as you see fit for your requirements.
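Roughly the same idea written out as a small script, in case the one-liner is hard to follow (a sketch only; it assumes GNU find for -printf, and the destination and the 4-minute pause are taken from the question):
#!/usr/bin/env bash
# Move every group of files that share a basename stem, one stem every four minutes
dest=~/target_dir
find . -maxdepth 1 -type f -printf '%f\n' |
sed 's/\..*//' |
sort -u |
while IFS= read -r stem; do
echo "moving $stem.*"
# note: files without an extension are not picked up by the $stem.* glob
mv -- ./"$stem".* "$dest"/
sleep 240
done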
Nevermind, I'm dumb. There's a uniq command. Duh. New working script is: find ./ -maxdepth 1 -type f | sed -e 's/\.[a-zA-Z]*$//' | uniq | while read line; do mv "$line"* ~/target_dir/; echo "$line"; sleep 240; done
