In bash, how can I recursively rename each file to the name of its parent folder, retaining the original extension? - bash

I have a large directory of folders, each of which has only one file:
directory/folder1/208hasdfasdf.jpg
directory/folder2/f230fsdf.gif
directory/folder3/23fsdbfasf.jpg
I'd like to rename this to:
directory/folder1/folder1.jpg
directory/folder2/folder2.gif
directory/folder3/folder3.jpg
How can I do that?

For the path and filenames shown, you can use a loop and combination of find and sed to make the substitutions, e.g.
for f in $(find directory -type f -wholename "*folder*"); do
mv "$f" "$(sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' <<< "$f")"
done
Here sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' uses the alternative delimiter '|' instead of '/' to make pathnames easier to handle. It captures the "directory" with ^([^/]+), then the "folderX" with ([^/]+), then the filename without its extension with ([^.]+), and finally the extension with (.*)$, making each component available through the numbered backreferences \1, \2, \3, and \4, respectively.
To form the new filename, you simply use the \2 folder name in place of the \3 filename, giving a new filename of \1/\2/\2.\4
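To preview the substitution before moving anything, the same sed expression can be run on a single sample path. This is only a minimal sketch; the variable names path and new are illustrative and not part of the original answer.

```shell
# Sample path taken from the question
path='directory/folder1/208hasdfasdf.jpg'

# Same substitution as in the loop, applied to one path; nothing is moved
new=$(printf '%s\n' "$path" |
    sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|')

printf '%s -> %s\n' "$path" "$new"
# prints: directory/folder1/208hasdfasdf.jpg -> directory/folder1/folder1.jpg
```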
Example Use/Output
$ find tmp-david -type f -wholename "*folder*"
tmp-david/folder3/23fsdbfasf.jpg
tmp-david/folder2/f230fsdf.gif
tmp-david/folder1/208hasdfasdf.jpg
And the replacement of the filenames with
$ for f in $(find tmp-david -type f -wholename "*folder*"); do
> mv "$f" "$(sed -E 's|^([^/]+)/([^/]+)/([^.]+)[.](.*)$|\1/\2/\2.\4|' <<< "$f")"
> done
Resulting in:
$ find tmp-david -type f -wholename "*folder*"
tmp-david/folder3/folder3.jpg
tmp-david/folder2/folder2.gif
tmp-david/folder1/folder1.jpg

You could try something like this, assuming you're using bash:
find directory/ \( -name '*.gif' -o -name '*.jpg' \) -print |
while read old; do
parent=${old%/*}
base=${parent##*/}
ext=${old##*.}
mv $old $parent/$base.$ext
done
This loop does not quote its variables and uses a plain read, so it will
break on filenames that contain whitespace or backslashes; you will need
to adapt it if you have any.
Before running this script:
$ find directory -type f -print
directory/folder2/f230fsdf.gif
directory/folder1/208hasdfasdf.jpg
directory/folder3/23fsdbfasf.jpg
After running this script:
$ find directory -type f -print
directory/folder2/folder2.gif
directory/folder1/folder1.jpg
directory/folder3/folder3.jpg
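For names that contain whitespace, one possible adaptation of the loop above hands the paths from find straight to a small inline shell script instead of piping them through read. This is a sketch under stated assumptions: rename_to_parent is a hypothetical helper name, the extensions are the two from the question, and every matched file is assumed to have an extension.

```shell
rename_to_parent() {
    # $1: top-level directory to search (hypothetical helper, not from the answer)
    find "$1" -type f \( -name '*.gif' -o -name '*.jpg' \) -exec sh -c '
        for old in "$@"; do
            parent=${old%/*}       # e.g. directory/folder1
            base=${parent##*/}     # e.g. folder1
            ext=${old##*.}         # e.g. jpg
            new=$parent/$base.$ext
            # Skip files that already have the target name
            [ "$old" = "$new" ] || mv -- "$old" "$new"
        done
    ' sh {} +
}

# Usage: rename_to_parent directory
```

Because find passes each path as a separate argument, spaces and other unusual characters in folder or file names are preserved without any extra quoting tricks.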

Related

Show directory path with only files present in them

This is the folder structure that I have.
Using the find command find . -type d in root folder gives me the following result
Result
./folder1
./folder1/folder2
./folder1/folder2/folder3
However, I want the result to be only ./folder1/folder2/folder3, i.e., only print a directory if there's a file of type .txt present inside it.
Can someone help with this scenario? Hope it makes sense.
find . -type f -name '*.txt' |
sed 's=/[^/]*\.txt$==' |
sort -u
Find all .txt files, remove file names with sed to get the parent directories only, then sort -u to remove duplicates.
This won’t work on file names/paths that contain a new line.
You may use this find command, which finds all the *.txt files and then prints their unique parent directory names:
find . -type f -name '*.txt' -exec bash -c '
for f; do
f="${f#.}"
printf "%s\0" "$PWD${f%/*}"
done
' _ {} + | awk -v RS='\0' '!seen[$0]++'
We are using printf "%s\0" to address directory names with newlines, spaces and glob characters.
Using gnu-awk to get only unique directory names printed
Using Associative array and Process Substitution.
#!/usr/bin/env bash
declare -A uniq_path
while IFS= read -rd '' files; do
path_name=${files%/*}
if ((!uniq_path["$path_name"]++)); then
printf '%s\n' "$path_name"
fi
done < <(find . -type f -name '*.txt' -print0)
Check the value of uniq_path
declare -p uniq_path
Maybe this POSIX one?
find root -type f -name '*.txt' -exec dirname {} \; | awk '!seen[$0]++'
* adds a trailing \n after each directory path
* breaks when a directory in a path has a \n in its name
Or this BSD/GNU one?
find root -type f -name '*.txt' -exec dirname {} \; -exec printf '\0' \; | sort -z -u
* adds a trailing \n\0 after each directory path

Rename all files in directory and subdirectory

How do I rename files in directory and subdirectory?
I found this program, but I need to go change files in subdirectory.
for file in *#me01
do
mv "$file" "${file/#me01/_me01}"
done
n#me01
to
n_me01
The following one-liner will likely work for you:
find . -type f -name '*#me01' -execdir rename '#me01' '_me01' {} \;
The following form is likely more correct, as it changes only the final #me01 if the file name contains multiple occurrences of #me01:
for f0 in $(find . -type f -name '*#me01')
do
f1=$(printf '%s' "$f0" | sed 's/#me01$/_me01/')
mv "$f0" "$f1"
done
This latter form is also more flexible and can be built upon more easily as the regex language in sed is much more powerful than rename expressions.
If rename of directories is also required the following can easily be added...
Either:
find . -type d -name '*#me01' -execdir rename '#me01' '_me01' {} \;
Or:
for d0 in $(find . -type d -name '*#me01')
do
d1=$(printf '%s' "$d0" | sed 's/#me01$/_me01/')
mv "$d0" "$d1"
done
Using bash:
shopt -s globstar
for name in **/*#me01; do
mv "$name" "${name%#me01}_me01"
done
This enables the globstar shell option in bash which makes ** match across path separators in pathnames.
It also uses a standard parameter substitution to delete the #me01 portion at the very end of the found pathname and replace it with _me01.
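The effect of that suffix-stripping expansion can be checked on a few sample names without touching any files. A minimal sketch follows; new_name and the sample names are hypothetical and only there for illustration.

```shell
new_name() {
    # % strips the shortest match of the pattern from the END of the value,
    # so only the final "#me01" is removed before "_me01" is appended
    printf '%s\n' "${1%#me01}_me01"
}

# Sample names, including one with two occurrences of "#me01"
for name in 'n#me01' 'sub dir/n#me01' 'a#me01b#me01'; do
    printf '%s -> %s\n' "$name" "$(new_name "$name")"
done
# prints:
#   n#me01 -> n_me01
#   sub dir/n#me01 -> sub dir/n_me01
#   a#me01b#me01 -> a#me01b_me01
```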

How to log variable in bash

for i in *.txt;
do
xxd -l 3 $i >> log
done
I also want to log file names $i for each result. E.g.:
file_name
result_of_command
You probably just need to use printf:
for f in *.txt; do
printf "%s: %s\n" "$f" "$(xxd -l 3 "$f")"
done >> log
I'm not totally clear what you are asking, but is this what you want?
for i in *.txt;
do
echo "$i" >> log
xxd -l 3 "$i" >> log
done
It's better to use find with the -exec option to run a command for every file matching certain criteria.
If you want all files in your current directory matching *.txt, you can use find with the -exec option to run a command for each file. {} is replaced by the name of the file, and \; (an escaped ;) terminates the command. You can use + instead to tell find to replace {} with multiple filenames.
find . -maxdepth 1 -type f -name '*.txt' -exec xxd -l 3 {} \; >> log
Note that the above example includes hidden files; you can exclude them using a regex.
find . -maxdepth 1 -type f \( ! -regex '.*/\..*' \) -name '*.txt' -exec xxd -l 3 {} \; >> log
Also, if you're going to be globbing files in the current directory and using them in commands, always use ./*. Paths beginning with - are likely to be interpreted by your command as options.
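The ./* advice could be combined with the logging loop roughly as follows. This is a sketch, not a definitive implementation: log_with_names is a hypothetical helper, and the command to run on each file is passed in as arguments so that xxd -l 3 from the question can be supplied by the caller.

```shell
log_with_names() {
    # $1: log file to append to; remaining arguments: command to run per file
    log=$1
    shift
    for f in ./*.txt; do
        [ -e "$f" ] || continue    # the glob matched nothing
        printf '%s\n' "$f"         # file name first
        "$@" "$f"                  # then the output of the command
    done >> "$log"
}

# Usage, matching the question: log_with_names log xxd -l 3
```

Because every path starts with ./, a file named something like -n.txt is passed on as ./-n.txt and cannot be mistaken for an option.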

Renaming files but keeping them in their present subdirectory

I have a script that renames html files based on information in their tags. This script goes through the current directory and all subdirectories and performs this renaming recursively. However, after renaming them it moves them into the current working directory I am executing my shell script from. How can I make sure the files remain in their subdirectories, and are not moved to the working directory?
Here is what I am working with:
#!/usr/bin/env bash
for f in `find . -type f | grep \.htm`
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv ./"$f" "${title//[ ]/-}".htm
done
Never use this construct:
for f in `find . -type f | grep \.htm`
because the loop breaks on file names that contain spaces, and the grep is unnecessary because find has a -name option for that. Use this instead:
find . -type f -name '*.htm*' -print |
while IFS= read -r f
This:
awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF}
can be reduced and clarified to:
awk 'BEGIN{IGNORECASE=1;FS="</?title>";RS=""}
Note that the use of EOF was misleading as EOF is just an undefined variable which therefore contains a null string (so your first record will go until the first blank line, not until the end of the file). You could have used RS=bubba and got the same effect but just setting RS to an empty string is clearer. Not saying it's what you SHOULD be doing, but it's a clearer implementation of what you ARE doing.
Finally putting it all back together something like this should work for you:
find . -type f -name '*.htm*' -print |
while IFS= read -r f
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="</?title>";RS=""} {print $2}' "$f" )
mv -- "$f" "$(dirname "$f")/${title//[ ]/-}".htm
done
Try:
mv ./"$f" "$(dirname "$f")/${title//[ ]/-}".htm
Note that your for f in `find ...` loop will fail on any file name with a space or newline in it. You can avoid that with a line like:
find . -type f -name '*.htm' -exec myrename.sh {} \;
where the renaming code is in a script called myrename.sh.
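The answer does not show myrename.sh, so the following is only a hypothetical sketch of what such a script might contain. It assumes the <title>...</title> pair sits on a single line and that the title contains no slash, and it replaces spaces inside awk rather than with the bash expansion from the question; rename_by_title is an illustrative name.

```shell
#!/bin/sh
# myrename.sh (hypothetical): rename each HTML file given as an argument
# after its <title>, keeping it in the directory it is already in.

rename_by_title() {
    # Take the text between the title tags on the first line that has both,
    # matching the tag name case-insensitively, and turn spaces into dashes.
    title=$(awk -F'</?[Tt][Ii][Tt][Ll][Ee]>' \
        'NF >= 3 { gsub(/ /, "-", $2); print $2; exit }' "$1")
    # Leave the file alone if no title was found
    [ -n "$title" ] || return 0
    mv -- "$1" "$(dirname "$1")/$title.htm"
}

for f in "$@"; do
    rename_by_title "$f"
done
```

Using dirname on the original path is what keeps the renamed file in its own subdirectory instead of moving it to the working directory.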
shopt -s globstar nullglob
for f in **/*.htm
do
title=$( awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' "$f" )
mv "$f" "$(dirname "$f")/${title//[ ]/-}".htm
done

How can I list all unique file names without their extensions in bash?

I have a task where I need to move a bunch of files from one directory to another. I need move all files with the same file name (i.e. blah.pdf, blah.txt, blah.html, etc...) at the same time, and I can move a set of these every four minutes. I had a short bash script to just move a single file at a time at these intervals, but the new name requirement is throwing me off.
My old script is:
find ./ -maxdepth 1 -type f | while read line; do mv "$line" ~/target_dir/; echo "$line"; sleep 240; done
For the new script, I basically just need to replace find ./ -maxdepth 1 -type f
with a list of unique file names without their extensions. I can then just replace do mv "$line" ~/target_dir/; with do mv "$line*" ~/target_dir/;.
So, with all of that said: what's a good way to get a unique list of file names without their extensions in a bash script? I was thinking about using a regex to grab file names and then throwing them in a hash to get uniqueness, but I'm hoping there's an easier/better/quicker way. Ideas?
A weird-named files tolerant one-liner could be:
find . -maxdepth 1 -type f -and -iname 'blah*' -print0 | xargs -0 -I {} mv {} ~/target/dir
If the files can start with multiple prefixes, you can use logic operators in find. For example, to move blah.* and foo.*, use:
find . -maxdepth 1 -type f -and \( -iname 'blah.*' -or -iname 'foo.*' \) -print0 | xargs -0 -I {} mv {} ~/target/dir
EDIT
Updated after comment.
Here's how I'd do it:
find ./ -type f -printf '%f\n' | sed 's/\..*//' | sort | uniq | ( while read filename ; do find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \; ; sleep 240; done )
Perhaps it needs some explanation:
find ./ -type f -printf '%f\n': find all files and print just their name, followed by a newline. If you don't want to look in subdirectories, this can be substituted by a simple ls;
sed 's/\..*//': strip the file extension by removing everything after the first dot. Both foo.tar and foo.tar.gz are transformed into foo;
sort | uniq: sort the filenames just found and remove duplicates;
(: open a subshell;
while read filename: read a line and put it into the $filename variable;
find . -type f -iname "$filename"'*' -exec mv {} /dest/dir \;: find in the current directory (find .) all the files (-type f) whose name starts with the value in filename (-iname "$filename"'*', this also works for files containing whitespace in their name) and execute the mv command on each one (-exec mv {} /dest/dir \;);
sleep 240: pause for 240 seconds (four minutes) before handling the next name;
): end of subshell.
Add -maxdepth 1 as argument to find as you see fit for your requirements.
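The same steps can also be sketched with shell globbing and parameter expansion instead of find and sed, for the non-recursive case only. This is an illustration under stated assumptions rather than the answer's own method: move_groups is a hypothetical helper, it works on the current directory, and it assumes file names contain no newlines.

```shell
move_groups() {
    # $1: destination directory, $2: pause in seconds between groups
    for f in ./*; do
        [ -f "$f" ] || continue
        name=${f#./}
        printf '%s\n' "${name%%.*}"     # name up to the first dot
    done | sort -u |
    while IFS= read -r stem; do
        [ -n "$stem" ] || continue      # skip names that start with a dot
        # Move the bare name (if any) and every stem.* file together
        for f in "./$stem" "./$stem".*; do
            [ -f "$f" ] && mv -- "$f" "$1"/
        done
        sleep "$2"
    done
}

# Usage, matching the question: move_groups ~/target_dir 240
```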
Nevermind, I'm dumb. There's a uniq command. Duh. New working script is:
find ./ -maxdepth 1 -type f | sed -e 's/\.[a-zA-Z]*$//' | sort | uniq | while read line; do mv "$line"* ~/target_dir/; echo "$line"; sleep 240; done
EDIT: Forgot close tag on code and a backslash.
