Find and replace filenames recursively ignoring folders - bash

I am trying to normalize filenames in a large group of files inside a folder and all its subfolders. I have had limited success with this command:
find ./ -type f -name "*.txt" -print0 | xargs -0 rename 's/bob.smith/bob smith /' {} \;
This command works as expected except when "bob.smith" also appears in the name of a folder that contains a file with 'bob.smith' in its filename. In that case I receive the following:
Can't rename ./today.bob.smith.ok/1.bob.smith.344.txt ./today.bob smith.ok/1.bob.smith.344.txt: No such file or directory

This should work:
find ./ -type f -name "*.txt" -print0 | xargs -0 rename 's/bob\.smith([^\/]+)$/bob smith $1/'
I modified the regex to match only a bob.smith that has no / after it, and I used a capture group ((regex) and $1) so that whatever follows bob.smith is not lost.
Note that since you wrote s/bob.smith/ and not s/bob.smith./ I assumed that you want the . after bob.smith. The result looks like this:
1.bob smith .344.txt
Note also the bob\.smith instead of bob.smith. In regex . means "any char".
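The same idea (touch only the last path component) can be sketched without rename at all, assuming a POSIX sh, find and sed. The function name rename_in_basename and the demo directory are hypothetical and only serve the illustration:

```shell
#!/bin/sh
# Hypothetical helper: rename_in_basename ROOT
# Replaces the first "bob.smith" in the file name only, never in the
# directory part of the path.
rename_in_basename() {
    find "$1" -type f -name '*bob.smith*.txt' -exec sh -c '
        for path do
            dir=${path%/*}      # everything before the last slash
            base=${path##*/}    # everything after the last slash
            new=$(printf "%s\n" "$base" | sed "s/bob\.smith/bob smith /")
            [ "$base" = "$new" ] || mv -- "$path" "$dir/$new"
        done' sh {} +
}

# Demo in a throwaway directory that mirrors the failing case.
demo=$(mktemp -d)
mkdir "$demo/today.bob.smith.ok"
: > "$demo/today.bob.smith.ok/1.bob.smith.344.txt"
rename_in_basename "$demo"
find "$demo" -type f
```

Because the path is split before the substitution runs, the folder name is left alone even though it also contains bob.smith.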

Related

Removing white spaces from files but not from directories throws an error

I'm trying to recursively rename some files with parent folders that contain spaces, I've tried the following command in ubuntu terminal:
find . -type f -name '* *' -print0 | xargs -0 rename 's/ //'
It gave the following error, referring to the folder names:
Can't rename ./FOLDER WITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg
./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg: No such file
or directory
If I'm not mistaken, the fact that the folders have white spaces in them shouldn't affect the process, since it uses -type f.
What is passed to xargs is the full path of the file, not just the file name. So your s/ // substitute command also removes spaces from the directory part. And as the new directories (without spaces) don't exist you get the error you see. The renaming, in your example, was:
./FOLDER WITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg ->
./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg
And this is not possible if directories ./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1 don't already exist.
Try with the -d option of rename:
find . -type f -name '* *' -print0 | xargs -0 rename -d 's/ //'
(the -d option only renames the filename component of the path.)
Note that you don't need xargs. You could use the -execdir action of find:
find . -type f -name '* *' -execdir rename 's/ //' {} +
And as the -execdir command is executed in the subdirectory containing the matched file, you don't need the -d option of rename any more. The -print0 action of find is not needed either.
Last note: if you want to replace all spaces in the file names, not just the first one, do not forget to add the g flag: rename 's/ //g'.
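To make the difference concrete, here is a small string-only sketch (no files are touched). The path is a shortened version of the one in the question, and sed stands in for rename's substitution; the variable names are arbitrary:

```shell
#!/bin/sh
path='./FOLDER WITH SPACES/FOLDER1.1/FILE.01 A.jpg'

# Whole-path substitution: the first blank sits in the directory part,
# so the result points into a directory that does not exist.
whole=$(printf '%s\n' "$path" | sed 's/ //')

# Basename-only substitution: split the path at the last slash first,
# then edit only the file name.
dir=${path%/*}
base=${path##*/}
fixed="$dir/$(printf '%s\n' "$base" | sed 's/ //g')"

printf '%s\n' "$whole" "$fixed"
```

The first printed line has a changed folder name, the second keeps the folder and changes only the file name, which is what rename -d and -execdir achieve.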
You're correct in that -type f -name '* *' only finds files with blanks in the name, but find prints the entire path including parent directories, so if you have
dir with blank/file with blank.txt
and you do rename 's/ //' on that string, you get
dirwith blank/file with blank.txt
because the first blank in the entire string was removed. And now the path has changed, invalidating previously found results.
You could use a different incantation of rename to a) only apply to the part after the last / and b) replace multiple blanks:
find . -type f -name '* *' -print0 | xargs -0 rename -n 's| (?=[^/]*$)||g'
s| (?=[^/]*$)||g matches all blanks that are followed only by characters other than / until the end of the string, where (?=...) is a look-ahead [1]. You can use rename -n to dry-run until everything looks right.
Alternatively (with GNU find), use -execdir to operate relative to the directory where the file is found, and use Bash parameter expansion instead of rename:
find \
-type f \
-name '* *' \
-execdir bash -c 'for f; do mv "$f" "${f//[[:blank:]]}"; done' _ {} +
This collects as many matches as possible and then calls the Bash command with all the matches; for f iterates over all positional parameters (i.e., each file), and the mv command removes all blanks. _ is a stand-in for $0 within bash -c and doesn't really do anything.
${f//[[:blank:]]} is a parameter expansion that removes all instances of [[:blank:]] from the string $f.
You can use echo mv until everything looks right.
[1] There's an easier method to achieve the same using rename -d, see Renaud's answer.
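Removing blanks can make two names collide (for example "a b.txt" and "ab.txt"), and mv would silently overwrite the existing file. A minimal sketch of a collision check follows, assuming a POSIX sh, find and tr, and using plain -exec with a manual path split instead of -execdir. The helper name strip_blanks is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: strip_blanks ROOT
# Removes blanks from file names only and skips a file when the
# blank-free name is already taken.
strip_blanks() {
    find "$1" -type f -name '*[[:blank:]]*' -exec sh -c '
        for path do
            dir=${path%/*}
            base=${path##*/}
            new=$(printf "%s\n" "$base" | tr -d "[:blank:]")
            if [ -e "$dir/$new" ]; then
                printf "skipped (target exists): %s\n" "$path" >&2
            else
                mv -- "$path" "$dir/$new"
            fi
        done' sh {} +
}

demo=$(mktemp -d)
mkdir "$demo/dir with blank"
: > "$demo/dir with blank/file with blank.txt"
: > "$demo/a b.txt"
: > "$demo/ab.txt"     # would be overwritten without the check
strip_blanks "$demo"
find "$demo" -type f
```

The directory "dir with blank" keeps its name, and "a b.txt" is reported on stderr and left in place.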

Script to find recursively the number of files with a certain extension

We have a highly nested directory structure, where we have a directory, let's call it 'my Dir', appearing many times in our hierarchy. I am interested in counting the number of "*.csv" files in all directories named 'my Dir' (yes, there is a whitespace in the name). How can I go about it?
I tried something like this, but it does not work:
find . -type d -name "my Dir" -exec ls "{}/*.csv" \; | wc -l
If you want the number of files matching the pattern '*.csv' under "my Dir", then:
don't ask for -type d; ask for -type f
don't ask for -name "my Dir" if you really want -name '*.csv'
don't try to ls *.csv on each match, because if there are N csv files in a directory, you would potentially count each one N times
also beware of embedding {} in -exec code!
For counting files from find, I like to use a trick I learned from Stéphane Chazelas on U&L; for example, from: Counting files in Linux:
find "my Dir" -type f -name '*.csv' -printf . | wc -c
This requires GNU find, as -printf is a GNU extension to the POSIX standard.
It works by looking within "my Dir" (relative to the current working directory) for files that match the pattern; for each matching file, it prints a single dot (period). The output is piped to wc, which counts the characters (periods) that find produced, i.e. the number of matching files.
You could exclude all paths that are not under 'my Dir':
find . -type f -not '(' -not -path '*/my Dir/*' -prune ')' -name '*.csv'
Another solution is to use the -path predicate to select your files.
find . -path '*/my Dir/*.csv'
Counting the number of occurrences could be a simple matter of piping to wc -l, though this will obviously produce the wrong result if some of the files contain newlines in their names. (This is slightly pathological, but definitely something you want to cover in production code.) A common arrangement is to just print a newline for every found file, instead of its name.
find . -path '*/my Dir/*.csv' -printf '.\n' | wc -l
(The -printf predicate is not in POSIX but it's not hard to replace with an -exec or similar.)
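As a sketch of the -exec replacement mentioned above, assuming only a POSIX sh, find and wc: the helper name count_csv and the demo tree are hypothetical, and the demo includes a file with a newline in its name to show that it is still counted once.

```shell
#!/bin/sh
# Hypothetical helper: count_csv ROOT
count_csv() {
    # Emit one empty line per match, independent of the file name;
    # $(( )) trims the padding some wc implementations add.
    echo $(( $(find "$1" -type f -path '*/my Dir/*.csv' \
                   -exec sh -c 'for f do echo; done' sh {} + | wc -l) ))
}

demo=$(mktemp -d)
mkdir -p "$demo/a/my Dir" "$demo/b/c/my Dir" "$demo/b/other"
: > "$demo/a/my Dir/1.csv"
: > "$demo/a/my Dir/2.csv"
: > "$demo/b/c/my Dir/3.csv"
: > "$demo/b/other/4.csv"    # not under a "my Dir", so not counted
nl='
'
: > "$demo/a/my Dir/odd${nl}name.csv"    # newline inside the file name
count_csv "$demo"
```

Like the -path example it is based on, this also counts csv files in subdirectories of a 'my Dir', because * in a -path pattern matches slashes too.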

recursively rename files and replacing characters

In directory ~/foo/ there are some file and directory names that I want to rewrite recursively using a bash script.
I want to replace every space with . in directory names, and every space with - in file names.
I have searched similar questions, and all of them use the find command, but I was not able to find a way to use it.
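To replace in directories you could try this (-depth makes find process a directory's contents before the directory itself, so a rename never invalidates a path that find still has to visit):
find ~/foo -depth -type d -name "* *" -execdir perl-rename -v 's/ /./g' '{}' \+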
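And for files (-execdir again, so that only the file name and not the directory part of the path is rewritten):
find ~/foo -type f -name "* *" -execdir perl-rename -v 's/ /-/g' '{}' \;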
Options used:
-type [fd] to specify file or directory.
-name to only match the directories/files with space
-exec / -execdir to execute a command, in this case, perl-rename
NOTE
Depending on your OS, you could use just rename instead of perl-rename.
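If perl-rename is not available, both replacements can be sketched in one pass with POSIX sh, find and tr. The helper name normalize_names is hypothetical, and -depth is used so that the contents of a directory are renamed before the directory itself:

```shell
#!/bin/sh
# Hypothetical helper: normalize_names ROOT
# Spaces become "." in directory names and "-" in all other names.
# ROOT itself is never renamed.
normalize_names() {
    find "$1" -depth ! -path "$1" -name '* *' -exec sh -c '
        for path do
            dir=${path%/*}
            base=${path##*/}
            # [ -d ] follows symlinks, so a link to a directory
            # is treated like a directory here.
            if [ -d "$path" ]; then repl=.; else repl=-; fi
            new=$(printf "%s\n" "$base" | tr " " "$repl")
            mv -- "$path" "$dir/$new"
        done' sh {} +
}

demo=$(mktemp -d)
mkdir -p "$demo/my dir/sub dir"
: > "$demo/my dir/sub dir/a file.txt"
: > "$demo/my dir/top file.txt"
normalize_names "$demo"
find "$demo"
```

This sketch does not check whether the new name already exists; a guard like [ -e "$dir/$new" ] could be added before mv if collisions are possible.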

Rename all files in subfolders - replace string in filename

I want to rename all files in a folder and its subfolders.
I need to change the string HEX20 to the string HEX8.
Some filenames have other numbers, so I cannot simply change the 20 to an 8.
An example of the full path is:
\\FRDS01006\z188018\FEM\Linear\HEX20\3HEX20\3HEX20.bof
I would like to do the same replacement for the folder names.
How about this:
find . -name "*HEX20*" -exec rename HEX20 HEX8 '{}' +
This will search recursively through the current directory and any subdirectories to match HEX20. (The flag -type f is omitted because the asker wants to change the names of directories in addition to files.) It will then build a long rename command and ultimately call it. This type of construction may be simpler than building a series of commands with sed and then executing them one-by-one.
Try this:
find . -type f -name "*HEX20*" | sed 's/\(.*\)HEX20\(.*\)/mv "&" "\1HEX8\2"/' | sh
This way you first find the regular files that have HEX20 in their names:
find . -type f -name "*HEX20*"
then replace the last occurrence of HEX20 with HEX8 and build the mv command (& is the whole matched path; the double quotes keep names with spaces intact):
find . -type f -name "*HEX20*" | sed 's/\(.*\)HEX20\(.*\)/mv "&" "\1HEX8\2"/'
and finally execute the generated commands with sh:
find . -type f -name "*HEX20*" | sed 's/\(.*\)HEX20\(.*\)/mv "&" "\1HEX8\2"/' | sh
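Since the question also asks for the folder names to change, here is a sketch that handles files and directories together and offers a dry run, assuming a POSIX sh, find and sed. The helper name hex_rename and its --dry-run argument are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: hex_rename ROOT [--dry-run]
# Replaces every HEX20 in the last path component with HEX8.
# -depth renames the contents of a directory before the directory.
hex_rename() {
    find "$1" -depth ! -path "$1" -name '*HEX20*' -exec sh -c '
        mode=$1; shift
        for path do
            dir=${path%/*}
            base=${path##*/}
            new=$(printf "%s\n" "$base" | sed "s/HEX20/HEX8/g")
            if [ "$mode" = --dry-run ]; then
                printf "%s -> %s\n" "$path" "$dir/$new"
            else
                mv -- "$path" "$dir/$new"
            fi
        done' sh "${2:-}" {} +
}

demo=$(mktemp -d)
mkdir -p "$demo/HEX20/3HEX20" "$demo/HEX27"
: > "$demo/HEX20/3HEX20/3HEX20.bof"
: > "$demo/HEX27/3HEX27.bof"     # other numbers stay untouched
hex_rename "$demo" --dry-run     # prints the plan, changes nothing
hex_rename "$demo"
find "$demo"
```

Unlike piping generated text into sh, the file names are passed as arguments here, so spaces or quotes in a name are not reinterpreted by the shell.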

Delete all files but keep all directories in a bash script?

I'm trying to do something which is probably very simple, I have a directory structure such as:
dir/
subdir1/
subdir2/
file1
file2
subsubdir1/
file3
I would like to run a command in a bash script that will delete all files recursively from dir on down, but leave all directories. Ie:
dir/
subdir1/
subdir2/
subsubdir1
What would be a suitable command for this?
find dir -type f -print0 | xargs -0 rm
find lists all files that match a given expression in a given directory, recursively. -type f matches regular files. -print0 prints the names using \0 as the delimiter (any other character, including \n, might appear in a path name). xargs gathers the file names from standard input and passes them as parameters. -0 makes xargs understand the \0 delimiter.
xargs is wise enough to call rm multiple times if the parameter list would get too big, so it is much better than something like rm $(find ...). It is also much faster than calling rm once per file, as in find ... -exec rm {} \;.
With GNU's find you can use the -delete action:
find dir -type f -delete
With standard find you can use -exec rm:
find dir -type f -exec rm {} +
find dir -type f -exec rm '{}' +
find dir -type f -exec rm {} \;
where dir is the top level of where you want to delete files from
Note that this will only delete regular files, not symlinks, not devices, etc. If you want to delete everything except directories, use
find dir -not -type d -exec rm {} \;
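A quick way to convince yourself before running this on real data is to try it on a throwaway tree. The sketch below assumes a POSIX sh and find; the helper name delete_non_dirs is hypothetical and the layout is loosely modeled on the one in the question, with a symlink added:

```shell
#!/bin/sh
# Hypothetical helper: delete_non_dirs ROOT
# Removes everything that is not a directory (regular files,
# symlinks, ...) and leaves the directory skeleton in place.
delete_non_dirs() {
    find "$1" ! -type d -exec rm -f -- {} +
}

demo=$(mktemp -d)
mkdir -p "$demo/dir/subdir1" "$demo/dir/subdir2/subsubdir1"
: > "$demo/dir/subdir2/file1"
: > "$demo/dir/subdir2/file2"
: > "$demo/dir/subdir2/subsubdir1/file3"
ln -s file1 "$demo/dir/subdir2/link"    # symlinks are removed too

delete_non_dirs "$demo/dir"
find "$demo/dir" | sort
```

Replacing the -exec part with -print first gives a list of what would be deleted.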

Resources