Find and rename multiple files using a bash script in Linux - bash

As an example, the directory /home/hel/files/ contains thousands of files and hundreds of directories.
An application saves its output files there, with special characters in the file names.
I want to replace these special characters with underscores in all file names, e.g. -:"<>#
I wrote a bash script which simply repeats a command to rename the files using the Linux/Unix 'rename' utility.
Example: file name: rename.sh
#!/bin/bash
rename "s/\'/_/g" *
rename 's/[-:"<>#\,&\s\(\)\[\]?!–~%„“;│\´\’\+#]/_/g' *
rename 'y/A-Z/a-z/' *
rename 's/\.(?=[^.]*\.)/_/g' *
rename 's/[_]{2,}/_/g' *
I execute the following find command:
find /home/hel/files/ -maxdepth 1 -type f -execdir /home/hel/scripts/rename.sh {} \+
Now the issue:
This works fine, except that it also renames subdirectories if they have the searched characters in their names.
Yet the find command searches only for files, not for directories.
I tried some other find variations like:
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} \+
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} +
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} \;
They all work, but with the same result.
What does not work:
find /home/hel/files/ -maxdepth 1 -type f -exec sh /home/hel/scripts/rename.sh {} \+
This one is dangerous, because it also renames the directories and files in the current directory, i.e. wherever you call the find command from.
Maybe someone has an idea why this happens, or has a better solution.

The script rename.sh did not use its command line arguments at all, but instead searched files and directories (!) on its own using the glob *.
Change your script to the following.
#!/bin/bash
rename -d 's/'\''/_/g;
s/[-:"<>#\,&\s\(\)\[\]?!–~%„“;│\´\’\+#]/_/g;
y/A-Z/a-z/;
s/\.(?=[^.]*\.)/_/g;
s/[_]{2,}/_/g' "$@"
Then use find ... -maxdepth 1 -type f -exec sh .../rename.sh {} +.
Changes Made
Use "$@" instead of * to process the files given as arguments rather than everything in the current directory.
Execute rename only once, as a 2nd rename wouldn't find the files specified with "$@" after they were renamed by the 1st rename.
Use the -d option such that only the basenames are modified. find always puts a path in front of the files, at the very least ./. Without this option rename would change ./filename to mangledPath/newFilename and therefore move the file to another directory.
Note that man rename is a bit misleading:
--path, --fullpath
Rename full path: including any directory component. DEFAULT
-d, --filename, --nopath, --nofullpath
Do not rename directory: only rename filename component of path.
For a given path rename -d 's...' some/path/basename just processes the basename and ignores the leading components some/path/. If basename is a directory it will still be renamed despite the -d option.
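Where the Perl rename utility is not installed, the same idea (act only on the paths passed as arguments, and only on their basename) can be sketched in plain bash. This is a simplified illustration rather than the script from the answer: the function names sanitize_name and rename_args are hypothetical, and the character class is reduced to "anything other than ASCII letters, digits, dot and underscore".

```shell
#!/bin/bash
# Sketch only: function names and the simplified character class are assumptions.

# Print the cleaned-up form of a single basename.
sanitize_name() {
    local LC_ALL=C                      # make the A-Z style ranges ASCII-only
    local name=$1
    name=${name//[^A-Za-z0-9._]/_}      # every other byte becomes an underscore
    name=${name,,}                      # lower-case (bash 4 or later)
    if [[ $name == *.* ]]; then         # keep only the last dot
        local stem=${name%.*} ext=${name##*.}
        name=${stem//./_}.$ext
    fi
    while [[ $name == *__* ]]; do       # squeeze runs of underscores
        name=${name//__/_}
    done
    printf '%s\n' "$name"
}

# Rename each regular file given as an argument; directories are skipped.
rename_args() {
    local path dir base new
    for path in "$@"; do
        [[ -f $path ]] || continue
        dir=.
        [[ $path == */* ]] && dir=${path%/*}
        base=${path##*/}
        new=$(sanitize_name "$base")
        # mv -n refuses to overwrite an existing file with the new name
        [[ $new == "$base" ]] || mv -n -- "$path" "$dir/$new"
    done
}

rename_args "$@"
```

Saved as a script, it would be called the same way as above, for example with find ... -maxdepth 1 -type f -exec bash .../rename.sh {} +. Because only the part after the last slash is rewritten, the file stays in its directory.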

Related

Recurse through subdirectories and rename all files of a given extension the same filename [duplicate]

This should be simple but I am getting stuck somewhere.
I want to recurse through a directory and rename all pdfs the same filename. The renamed files should remain in their current subdirectory.
The current PDF filenames are arbitrary.
Assume that I am running this script from the top directory. Inside this top directory are several subdirs, each with a PDF with an arbitrary filename.
This works to rename the files in place:
find . -iname "*.pdf" -exec rename 's/test.pdf/commonname.pdf/' '{}' \;
But since the current filenames are arbitrary, I need to swap out a regex for "any characters or digits" in place of test.pdf
My understanding is that the correct regular expression is .*
So I tried:
find . -iname "*.pdf" -exec rename 's/.*/commonname.pdf/' '{}' \;
When I run this, the first PDF gets renamed to commonname.pdf, but it is moved into the top directory. My use case requires that the PDFs get renamed in place.
I am missing something obvious here, clearly - can you spot my mistake?
The problem is that in s/.*/commonname.pdf/, .* matches the complete path, not just the filename. You could make sure that the regular expression applies to nothing but the filename by matching on non-slashes:
find . -iname '*.pdf' -exec rename 's|[^/]*$|commonname.pdf|' '{}' \;
or you could use GNU find's -execdir, which sets the working directory to the directory containing the matching file:
find . -iname '*.pdf' -execdir rename 's/.*/commonname.pdf/' '{}' \;
or not use rename at all:
find . -iname '*.pdf' -execdir mv {} commonname.pdf \;
or not use find, but a single invocation of rename:
rename 's|[^/]*$|commonname.pdf|' **/*.pdf
This requires the globstar shell option to enable the ** glob.
Use the -n option to rename for a dry run without actually changing filenames.
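As an illustration of the "rename only the filename part" idea without the rename utility, here is a minimal sketch that lets find hand the matches to a small inline shell loop. The wrapper name rename_pdfs_in_place and its two parameters are hypothetical, and it assumes the first argument is a directory, so that every path find reports contains a slash.

```shell
#!/bin/bash
# Sketch only: rename_pdfs_in_place and its arguments are hypothetical names.

# Give every PDF below directory $1 the basename $2, leaving it where it is.
rename_pdfs_in_place() {
    local top=$1 newname=$2
    find "$top" -type f -iname '*.pdf' -exec sh -c '
        newname=$1; shift
        for f in "$@"; do
            # ${f%/*} strips the last path component, leaving the directory
            mv -n -- "$f" "${f%/*}/$newname"
        done
    ' sh "$newname" {} +
}
```

Calling rename_pdfs_in_place . commonname.pdf from the top directory would correspond to the commands above. mv -n is used so that a second PDF in the same subdirectory does not overwrite the first one.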

How to delete all files or Sub-folders (both) in a folder except 2 folders with shell script

I would like to know how to delete all the contents of a folder (it contains other folders and some files) except for 2 folders and their contents.
The below command keeps the folder conf and removes all the other folders
find . ! -name 'conf' -type d -exec rm -rf {} +
I have tried to pipe it like below
find . -maxdepth 1 -type d ! -name 'conf' |find . -maxdepth 1 -type d ! -name 'foldername2'
but it didn't work.
Is it possible to do this with a single command?
You haven't specified which shell you're using, but if you're using bash then extended globs can help:
printf '%s\n' !(@(conf|foldername2)/)
If you're happy with the list of files and directories produced by that, then pass the same glob to rm -rf:
rm -rf !(@(conf|foldername2)/)
Inside a script, you may need to enable extglob using shopt -s extglob. Later, you can change -s to -u to unset the option.
If you're using a different shell, then you can add some more options to your find command:
find . -mindepth 1 -maxdepth 1 ! -name 'conf' -a ! -name 'foldername2' -exec rm -rf {} +
Try it without the -exec part first to print the matches rather than deleting everything.
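The "print first, delete afterwards" advice can be wrapped into one small function. This is only a sketch: purge_except and its list/delete modes are hypothetical, and the two kept names are simply the ones from the question.

```shell
#!/bin/bash
# Sketch only: purge_except and its "list"/"delete" modes are hypothetical.

# With mode "list", print what would be removed from $1;
# with mode "delete", actually remove it.
purge_except() {
    local dir=$1 mode=$2
    local -a action=(-print)
    [[ $mode == delete ]] && action=(-exec rm -rf -- {} +)
    # -mindepth 1 keeps the starting directory itself out of the matches
    find "$dir" -mindepth 1 -maxdepth 1 \
        ! -name conf ! -name foldername2 "${action[@]}"
}
```

A cautious session would run purge_except . list, check the output, and only then run purge_except . delete.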
Maybe my little utility program can help you.
First, find the paths of your .sh files,
then find the main folders that contain those files,
then remove everything except those folders.
I wrote drr for exactly this purpose, so that such a task becomes easy.
drr stands for "remove or rename files based on regular expressions". It is written in the D language, so you must compile it before using it.
Please be careful, since this is not an appropriate tool for beginners.

How to move files en-masse while skipping a few files and directories

I'm trying to write a shell script that moves all files except for the ones that end with .sh and .py. I also don't want to move directories.
This is what I've got so far:
cd FILES/user/folder
shopt -s extglob
mv !(*.sh|*.py) MoveFolder/ 2>/dev/null
shopt -u extglob
This moves all files except the ones that contain .sh or .py, but all directories are moved into MoveFolder as well.
I guess I could rename the folders, but other scripts already have those folders assigned for their work, so renaming might give me more trouble. I also could add the folder names but whenever someone else creates a folder, I would have to add its name to the script or it will be moved as well.
How can I improve this script to skip all folders?
Use find for this:
find -maxdepth 1 \! -type d \! -name "*.py" \! -name "*.sh" -exec mv -t MoveFolder {} +
What it does:
find: find things...
-maxdepth 1: that are in the current directory...
\! -type d: and that are not a directory...
\! -name "*.py": and whose name does not end with .py...
\! -name "*.sh": and whose name does not end with .sh...
-exec mv -t MoveFolder {} +: and move them to directory MoveFolder
The -exec flag is special: contrary to the prior flags, which are conditions, this one is an action. The + that ends the command directs find to collect the matching file names and place them where {} appears, at the end of the command. Once the files have been found, find executes the resulting command (i.e. mv -t MoveFolder file1 file2 ... fileN).
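To try the command out safely on a scratch directory first, it can be wrapped in a function that takes the source and target directories as parameters. This is a sketch only: move_plain_files is a hypothetical name, and mv -t is assumed to be available (it is a GNU coreutils option).

```shell
#!/bin/bash
# Sketch only: move_plain_files is a hypothetical wrapper around the find command.

# Move everything directly inside $1 that is neither a directory
# nor named *.sh / *.py into the directory $2.
move_plain_files() {
    local src=$1 dest=$2
    # mv -t DIR (GNU coreutils) names the target first, so find can
    # append any number of file names after it.
    find "$src" -maxdepth 1 ! -type d ! -name '*.py' ! -name '*.sh' \
        -exec mv -t "$dest" -- {} +
}
```

For the question's layout this would be called as move_plain_files FILES/user/folder FILES/user/folder/MoveFolder. The target directory is skipped automatically because it is itself a directory.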
You'll have to check every element to see if it is a directory or not, as well as its extension:
for f in FILES/user/folder/*
do
    extension="${f##*.}"
    if [ ! -d "$f" ] && [[ ! "$extension" =~ ^(sh|py)$ ]]; then
        mv "$f" MoveFolder
    fi
done
Otherwise, you can also use find -type f and do some stuff with maxdepth and a regexp.
Regexp for the file name based on Check if a string matches a regex in Bash script, extension extracted through the solution to Extract filename and extension in Bash.

Bash Script for listing subdirectories and files in textfile

I need a Script that writes the directory and subdirectory in a text-file.
For example the script lies in /Mainfolder and in this folder are four other folders. Each contains several files.
Now I would like the script to write the path of each file in the textfile.
Subfolder1/File1.dat
Subfolder1/File2.dat
Subfolder2/File1.dat
Subfolder3/File1.dat
Subfolder4/File1.dat
Subfolder4/File2.dat
Important is that there is no slash in front of the listing.
Use the find command:
find Mainfolder > outputfile
and if you only want the files listed, do
find Mainfolder -type f > outputfile
You can also strip the leading ./ if you search the current directory, with the %P format option:
find . -type f -printf '%P\n' > outputfile
If your bash version is recent enough (4.0 or later, which introduced globstar), you can do it like this:
#!/bin/bash
shopt -s globstar
printf '%s\n' ** > yourtextfile
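The ** glob also matches the directories themselves. If only files are wanted, one per line and without a leading slash, a loop with a file test is one way to filter them. This is a minimal sketch; list_files is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: list_files is a hypothetical name.

# Print every regular file below $1, one per line, relative to $1.
# The body runs in a subshell, so cd and shopt do not affect the caller.
list_files() (
    shopt -s globstar nullglob
    cd -- "$1" || exit
    for path in **; do
        if [[ -f $path ]]; then
            printf '%s\n' "$path"
        fi
    done
)
```

For the layout in the question, list_files /Mainfolder > outputfile would write lines such as Subfolder1/File1.dat.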
This solution assumes that the subdirectories contain only files -- they do not contain any directory in turn.
find . -type f -print | sed 's|^.*/S|S|'
I have created a single file in each of the four subdirectories. The original output is:
./Subfolder1/File1.dat
./Subfolder4/File4.dat
./Subfolder2/File2.dat
./Subfolder3/File3.dat
The filtered output is:
Subfolder1/File1.dat
Subfolder4/File4.dat
Subfolder2/File2.dat
Subfolder3/File3.dat
You can use this find with -exec:
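find . -type f -exec bash -c 'echo "${1:2}"' _ {} \;
This prints all files below the current directory, with the leading ./ removed. The file name is passed as an argument ($1) rather than pasted into the script text, so names containing quotes or $ cannot break the command.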

dealing filenames with shell regex

138.096.000.015.00111-138.096.201.072.38717
138.096.000.015.01008-138.096.201.072.00790
138.096.201.072.00790-138.096.000.015.01008
138.096.201.072.33853-173.194.020.147.00080
138.096.201.072.34293-173.194.034.009.00080
138.096.201.072.38717-138.096.000.015.00111
138.096.201.072.41741-173.194.034.025.00080
138.096.201.072.50612-173.194.034.007.00080
173.194.020.147.00080-138.096.201.072.33853
173.194.034.007.00080-138.096.201.072.50612
173.194.034.009.00080-138.096.201.072.34293
173.194.034.025.00080-138.096.201.072.41741
I have many folders, each containing many files, with file names like the ones above.
I want to remove the files whose names contain the substring "138.096.000",
and sometimes I want to list the files whose names contain the substring "00080".
To delete files with name containing "138.096.000":
find /root/of/files -type f -name '*138.096.000*' -exec rm {} \;
To list files with names containing "00080":
find /root/of/files -type f -name '*00080*'
rm $(find . -name \*138.096.000\*)
This uses the find command to find the appropriate files. This is executed within a subshell, and the output (the list of files) is used by rm. Note the escaping of the * pattern, since the shell will try and expand * itself.
This assumes you don't have filenames with spaces etc. You may prefer to do something like:
for i in $(find . -name \*138.096.000\*); do
    rm "$i"
done
in this scenario, or even
find . -name \*138.096.000\* | xargs rm
Note that in the loop above you'll execute rm once for each file, whereas the xargs variant will execute rm as few times as possible (depending on the number of files you have, it may execute only once).
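Both the loop and the plain xargs pipeline split the file list on whitespace. Where names may contain spaces, a NUL-delimited pipeline avoids that. This is a sketch under the assumption that your find and xargs support -print0 and -0 (the GNU and BSD versions do); remove_matching is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: remove_matching is a hypothetical name.

# Delete regular files below $1 whose name matches the glob $2,
# even when names contain spaces or newlines.
remove_matching() {
    local top=$1 pattern=$2
    # -print0 and -0 pass each name terminated by a NUL byte instead of
    # relying on whitespace; rm -f tolerates being run with no file names.
    find "$top" -type f -name "$pattern" -print0 | xargs -0 rm -f --
}
```

For the files in the question this would be called as remove_matching /root/of/files '*138.096.000*', with the pattern quoted so that the shell does not expand it.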
However, if you're using zsh then you can simply do:
rm **/*138.096.000*
(I'm assuming your directories aren't named like your files. If they are, note the -type f test used in Kamil's answer.)

Resources