Rename multiple filenames in multiple folders - bash

I know you can do this to rename all filenames in a single folder with something like this:
for file in 1_*; do
mv "$file" "${file/1_/}"
done
However, is there a way to do this across multiple folders? For example, it will search through all the folders in the current directory and change them.
I have bash version 4.3

A robust solution, assuming you have GNU or BSD/OSX find:
find . -type f -name '1_*' -execdir sh -c 'f=${1#./}; echo mv -- "$f" "${f#1_}"' _ {} \;
Note:
- This will only echo the mv commands, to be safe; remove the echo to perform actual renaming.
- The OP's substitution, "${file/1_/}" was changed to the POSIX-compliant "${file#1_}", which is actually closer to the intent.
- If you truly need a substitution such as "${file/1_/}", which the sh on your system may or may not support, it is better to explicitly invoke a shell known to support it, such as bash.
- GNU find passes the file to -execdir with a ./ prefix (./name); f=${1#./} strips that prefix, if present, so that the 1_ prefix can then be removed.
- Symlinks are ignored (both files and directories); use find -L ... to include them (both as potential files to be renamed and to make find descend into symlinks to directories).
find . -type f -name '1_*' finds all files (-type f) with names matching 1_* (-name '1_*') in the current dir.'s (.) subtree.
-execdir executes the command passed to it in the subdirectory in which the file at hand is located.
sh -c 'f=${1#./}; echo mv -- "$f" "${f#1_}"' _ {} \; invokes the default shell (sh):
with a command string (passed to -c)
f=${1#./}; mv -- "$f" "${f#1_}" effectively removes prefix 1_ from the filename represented by the first positional parameter ($1), after stripping any leading ./.
and dummy parameter _ (which sh will assign to $0, which is not of interest here)
and the path of the file at hand, {}, which the shell will bind to $1;
\; simply terminates -execdir's argument.
Note that -- ensures that any filename that happens to start with - isn't mistaken for an option by mv (applies analogously below).
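To illustrate how the prefix removal behaves on bare names and on ./-prefixed names (the form GNU find's -execdir hands over), here is a minimal sketch in plain POSIX sh. The helper name strip_prefix and the sample filenames are assumptions made up for illustration:

```shell
# Hypothetical helper: strip an optional leading "./", then the "1_" prefix.
strip_prefix() {
  f=${1#./}                # drop a leading ./ if there is one
  printf '%s\n' "${f#1_}"  # drop 1_ only if it is a prefix
}

strip_prefix '1_report.txt'     # prints: report.txt
strip_prefix './1_report.txt'   # prints: report.txt
strip_prefix 'a1_b.txt'         # prints: a1_b.txt (1_ is not a prefix here)
```

Unlike "${file/1_/}", the # form never touches a 1_ that occurs later in the name.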
-execdir is not POSIX-compliant; a POSIX-compliant variant is more cumbersome:
find . -type f -name '1_*' -exec sh -c \
'cd -- "${1%/*}" && f=${1##*/} && echo mv -- "$f" "${f#1_}"' _ {} \;
cd -- "${1%/*}" changes to the directory in which the file at hand is located.
Note: cd -- "$(dirname -- "$1")" is generally more robust, but less efficient; since we know that $1 is always a path rather than a mere filename in this scenario, we can use the more efficient cd -- "${1%/*}".
f=${1##*/} extracts the mere filename from the file path at hand.
The remainder of the command line then works as above, analogously.
Performance note:
The above solutions are concise, but inefficient, because a shell instance has to be spawned for each matching file.
A potential speed-up is to use a variant of peak's approach, but only if you avoid calling external utilities in the while loop (except for mv):
find . -type f -name '1_*' | while IFS= read -r f; do
new_name=${f##*/} # extract filename
new_name=${new_name#1_} # perform substitution
d=${f%/*} # extract dir
echo mv -- "$f" "$d/$new_name" # call mv to rename
done
The above bears the hypothetical risk of breaking with filenames with embedded newlines (very rare); with GNU or BSD find, this problem can be solved by using -print0 together with bash's read -d ''.
With this approach, only a single shell instance is spawned, which processes all matching filenames in a loop - as long as the only external utility that is called in the loop is mv, this will generally be faster than the find-only solutions with -exec[dir].
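A newline-safe variant of the loop could look roughly like the following sketch. It assumes bash (for read -d '') and a find that supports -print0; the function name rename_prefixed is hypothetical, and the echo keeps it a dry run:

```shell
#!/usr/bin/env bash
# Hypothetical helper; $1 = directory tree to search.
# Dry run: prints the mv commands instead of executing them.
rename_prefixed() {
  # -print0 and read -d '' separate names with NUL bytes,
  # so names with embedded newlines are read intact.
  find "$1" -type f -name '1_*' -print0 |
    while IFS= read -r -d '' f; do
      new_name=${f##*/}          # extract filename
      new_name=${new_name#1_}    # remove the 1_ prefix
      d=${f%/*}                  # extract dir
      echo mv -- "$f" "$d/$new_name"
    done
}
```

Call it as rename_prefixed . and remove the echo once the printed commands look right.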

If you don't want to depend on too many subtleties, consider this very pedestrian approach, which assumes a bash-like shell and that all the usual suspects (find, sed, ....) are directly available:
find . -type f -name "1_*" | while read -r file ; do
x=$(basename "$file")
y=$(sed 's/1_//' <<< "$x")
d=$(dirname "$file")
mv "$file" "$d/$y"
done
(You might want to try this using "mv -i" or "echo mv ....". You might also want to use find with the -follow option.)

Related

Recursively Rename Files and Directories with Bash on macOS

I'm writing a script that will perform some actions, and one of those actions is to find all occurrences of a string in both file names and directory names, and replace it with another string.
I have this so far
find . -name "*foo*" -type f -depth | while read file; do
newpath=${file//foo/bar}
mv "$file" "$newpath"
done
This works fine as long as the path to the file doesn't also contain foo, but that isn't guaranteed.
I feel like the way to approach this is to ONLY change the file names first, then go back through and change the directory names, but even then, if you have a structure that has more than one directory with foo in it, it will not work properly.
Is there a way to do this with built in macOS tools? (I say built-in, because this script is going to be distributed to some other folks in our organization and it can't rely on any packages to be installed).
Separating the path_name from the file_name, something like.
#!/usr/bin/env bash
while read -r file; do
path_name="${file%/*}"; printf 'Path is %s\n' "$path_name"
file_name="${file#"$path_name"}"; printf 'Filename is %s\n' "$file_name"
newpath="$path_name${file_name//foo/bar}"
echo mv -v "$file" "$newpath"
done < <(find . -name "*foo*" -type f)
Have a look at basename and dirname as well.
The printf calls are just there to show which part is the path and which is the filename.
The script only replaces foo with bar in file_name. The same can be done with path_name, using the same syntax.
newpath="${path_name//bar/more}${file_name//foo/bar}"
So renaming both path_name and file_name.
Renaming the path_name first and then the file_name, as you suggested, is also an option.
path_name="${file%/*}"
file_name="${file#"$path_name"}"
new_pathname="${path_name//bar/more}"
mv -v "$path_name" "$new_pathname"
new_filename="${file_name//foo/bar}"
mv -v "$new_pathname$file_name" "$new_pathname$new_filename"
No additional external tools or utilities are used, apart from the ones your script already uses.
Remove the echo if you're satisfied with the output.
You can use -execdir to run a command on just the filename (basename) in the relevant directory:
find . -depth -name '*foo*' -execdir bash -c 'mv -- "${1}" "${1//foo/bar}"' _ {} \;
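For comparison, here is a sketch of the same idea as a loop that renames only the last path component and handles children before their parent directories. It assumes bash and a find with -print0 and -mindepth (both GNU and BSD find have them); the function name rename_foo_to_bar is made up, and name collisions are not handled:

```shell
#!/usr/bin/env bash
# Hypothetical helper; $1 = directory tree to process.
rename_foo_to_bar() {
  # -depth lists a directory's contents before the directory itself, so every
  # path read here is still valid when it is renamed.
  # -mindepth 1 leaves the starting directory itself alone.
  find "$1" -mindepth 1 -depth -name '*foo*' -print0 |
    while IFS= read -r -d '' path; do
      dir=${path%/*}      # everything before the last /
      base=${path##*/}    # last component only
      mv -- "$path" "$dir/${base//foo/bar}"
    done
}
```

Because only base is substituted, a foo further up the path does not produce a target path that does not exist yet.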

Using bash I need to perform a find of 0 byte files but report on their existence before deletion

The history of this problem is:
I have millions of files and directories on a NAS system. I found a count of 1,095,601 empty (0 byte) files. These files used to have data but were destroyed by a predecessor not using the correct toolsets to migrate data between an XSAN and this Isilon NAS.
The files were media production data, like fonts, pdfs and image files. They are no longer useful beyond the history of their existence. Before I proceed to delete them, the production users need a record of which files used to exist, so when they browse a project folder, they can use the unaffected files but then refer to a text file in the same directory which records which files used to also be there and thus provide reason as to why certain reference files are broken.
So how do I find files across multiple directories and delete them but first output their filename to a text file which would be saved to each relevant path location?
I am thinking along the lines of:
for file in $(find . -type f -size 0); do
echo "$file" >> /PATH/TO/FOUND/FILE/PARENT/DIR/deletedFiles.txt -print0 |
xargs -0 rm ;
done
To delete each empty file while leaving behind a file called deletedFiles.txt which contains the names of the deleted files, try:
PATH=/bin:/usr/bin find . -empty -type f -execdir bash -c 'printf "%s\n" "$@" >>deletedFiles.txt' none {} + -delete
How it works
PATH=/bin:/usr/bin
This sets a temporary but secure path.
find .
This starts find looking in the current directory
-empty
This tells find to only look for empty files
-type f
This restricts find to looking for regular files.
-execdir bash -c 'printf "%s\n" "$@" >>deletedFiles.txt' none {} +
In each directory that contains an empty file, this adds the name of each empty file to the file deletedFiles.txt.
Notice the peculiar use of none in the command:
bash -c 'printf "%s\n" "$@" >>deletedFiles.txt' none {} +
When this command is run, bash will execute the string printf "%s\n" "$@" >>deletedFiles.txt and the arguments that follow that string are assigned to the positional parameters: $0, $1, $2, etc. When we use $@, it does not include $0. It, as is usual, expands to $1, $2, .... Thus, we add the placeholder none so that the placeholder is assigned to $0, which we will ignore, and the complete list of file names is assigned to "$@".
-delete
This deletes each empty file.
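The effect of the placeholder can be checked in isolation with a small sketch. The file names a.txt and b.txt are made-up sample arguments:

```shell
# The first argument after the command string becomes $0; the rest become $1, $2, ...
with_placeholder=$(bash -c 'printf "%s\n" "$@"' none a.txt b.txt)
without_placeholder=$(bash -c 'printf "%s\n" "$@"' a.txt b.txt)

printf 'with placeholder:\n%s\n' "$with_placeholder"        # a.txt and b.txt
printf 'without placeholder:\n%s\n' "$without_placeholder"  # b.txt only; a.txt became $0
```

Without the placeholder, the first file name would be silently dropped from the list.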
Why not simply
find . -type f -size 0 -exec rm -v {} + |
sed -e 's%^removed .\./%%' -e 's/.$//' >deletedFiles.txt
If your find is too old to support -exec ... + you'll need to revert to -exec rm -v {} \; or refactor to
find . -type f -size 0 -print0 |
xargs -r -0 rm -v |
sed -e 's%^removed .\./%%' -e 's/.$//' >deletedFiles.txt
The brief sed script is to postprocess the output from rm -v which looks like
removed ‘./bar’
removed ‘./foo’
(with some funny quote characters around the file name) on my system. If you are fine with that output, of course, just omit the sed script from the pipeline.
If you know in advance which directories contain empty files, you can run the above snippet individually in those directories. Assuming you saved the snippet above as a script (with a proper shebang and execute permissions) named find-empty, you could simply use
for path in /path/to/first /path/to/second/directory /path/to/etc; do
cd "$path" && find-empty
done
This will only work if you have absolute paths (if not, you can run the body of the loop in a subshell by adding parentheses around it).
If you want to inspect all the directories in a tree, change the script to print to standard output instead (remove >deletedFiles.txt from the script) and try something like
find /path/to/tree -type d -exec sh -c '
t=$(mktemp -t find-emptyXXXXXXXX)
cd "$1" &&
find-empty | grep . >"$t" &&
mv "$t" deletedFiles.txt ||
rm "$t"' _ {} \;
This uses a temporary file so as to avoid updating the timestamp of directories which do not contain any empty files. The grep . is used purely for side effect; if any (non-empty) lines are printed, it will return success, whereas otherwise, it will report failure; this way, we know whether or not to move the temporary file to the target directory.
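The exit status that grep . contributes can be seen with a minimal sketch. The variable names and the sample input are assumptions for illustration:

```shell
# grep . succeeds only if it reads at least one non-empty line.
if printf 'a.txt\n' | grep . >/dev/null; then has_lines=yes; else has_lines=no; fi
if printf '' | grep . >/dev/null; then empty_case=yes; else empty_case=no; fi

echo "$has_lines $empty_case"   # prints: yes no
```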
With prompting from @JonathanLeffler I have succeeded with the following:
#!/bin/bash
## call this script with: find . -type f -empty -exec handleEmpty.sh {} +
for file in "$@"
do
file2="$(basename "$file")"
echo "$file2" >> "$(dirname "$file")"/deletedFiles.txt
rm "$file"
done
This means I retain a trace of the removed files in a deletedFiles.txt flag file in each respective directory for the users to see when files are missing. That way, they can pursue going back to archive CDs to retrieve these deleted files, which are hopefully not 0 byte files.
Thanks to @John1024 for the suggestion of using the empty flag rather than size.

Move all files from subdirectory into a new directory without overwriting

I want to consolidate into 1 directory files that are in multiple subdirectories.
The following comes close except that the random string is added after the extension; I want it before the extension:
find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$0" "./$( mktemp "$( basename "$0" ).XXX" )"' '{}' \;
I've searched through dozens of other posts but nothing addressed the specifics of my situation:
I'm on OS X (so it's a BSD flavor of Bash; for ex. there's no -t option for mv)
Many of the files have identical names so I need to rewrite them during the mv (and I can't just use the -n option for mv because then too many files would not get moved)
The files are not all the same kind, so I need to use a find -type f
I want to exclude .DS_store files, so it seems like a good option is find -type f -iname "[a-z,0-9]*"
I want the rewritten files' names to be in the form of: oldname-random_string.xyz (but I'm also OK with having the files being renamed as a sequential list: 00001.xyz, 00002.xyz, etc.)
The files are buried 4 levels down from my master directory:
Master/Top dir
Dir 2
Dir 3
Dir 4
Dir 5
file
For the sake of simplicity I prefer a bash command to a .sh script (but I'm happy with either)
GNU Solution
This uses basically the same command that you were using but I supply a template to mktemp so that the XXX pattern appears just before the suffix. With GNU sed:
find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$1" "./$(mktemp -u "$(basename "$1" | sed -E -e '\''s/\.([^.]+)$/.XXX.\1/'\'' -e '\''/XXX/ !s/$/.XXX/'\'')" )"' _ '{}' \;
The key addition above is the use of sed to insert XXX before the suffix in the file name:
sed -E -e 's/\.([^.]+)$/.XXX.\1/' -e '/XXX/ !s/$/.XXX/'
This has two commands. The first puts .XXX before the extension. The second command is run only if the file name has no extension in which case it adds .XXX to the end of the file name.
In the first command, the source regex consists of two parts. The first is \. which matches a period. The second is ([^.]+)$ which captures the extension into group 1. The substitution replaces this with .XXX.\1 where \1 is sed notation for group 1 which, in our case, is the file's extension.
OSX Solution
Under OSX, mktemp is not useful because it only supports templates with the XXX part trailing. As a workaround, we can use a bash script that generates non-overlapping file names:
#!/bin/bash
find . -type f -iname "[a-z,0-9]*" -print0 |
while IFS= read -r -d '' fname
do
new=$(basename "$fname")
[ "$fname" = "./$new" ] && continue
[ "$new" = .DS_Store ] && continue
name=${new%.*}
ext=${new#"$name"}
n=0
new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
while [ -f "$new" ]
do
n=$(($n + 1))
new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
done
mv -v "$fname" "$new"
done
The above uses the find command to get the file names. The option -print0 is used to assure that it works with difficult file names. The while loop reads these file names one by one, into the variable fname. fname includes the full path to the source file. The file name without the path is then stored in new. Then two checks are performed. If the source file is already in the current directory, the script continues on to the next loop. Similarly, if the file name is .DS_Store, it is also skipped. (The find command, as given, already skips these files. This line is there just for future flexibility.) Next, the file name is split into two parts: the name and ext, the extension. ext includes the leading period. Next, a loop checks for files of the form name.NNN.ext and stops at the first one that doesn't yet exist. The source file is moved to a file of that name.
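The name-splitting and numbering steps can be tried on their own with the following sketch in POSIX sh. The function name next_free_name and its two-argument interface are assumptions for illustration, not part of the script above:

```shell
# Hypothetical helper: print the first name of the form NAME.NNN.EXT that
# does not yet exist in directory $1, for the file name $2.
next_free_name() {
  dir=$1 new=$2
  name=${new%.*}        # part before the last period
  ext=${new#"$name"}    # remainder, including the leading period (empty if none)
  n=0
  candidate=$(printf '%s.%03d%s' "$name" "$n" "$ext")
  while [ -e "$dir/$candidate" ]; do
    n=$((n + 1))
    candidate=$(printf '%s.%03d%s' "$name" "$n" "$ext")
  done
  printf '%s\n' "$candidate"
}
```

For example, next_free_name . photo.jpg prints photo.000.jpg if no file of that name exists in the current directory yet.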
Related Notes Regarding the GNU Solution and its Compatibility
Quoting in the above GNU command is complex. The argument to bash -c needs to be in single-quotes to prevent the calling bash from performing premature variable substitution. In addition, the sed commands need to be in single-quotes when executed by the bash subshell to prevent history expansion from interfering with the use of negation, !, within the sed command.
The OSX (BSD) sed does not support combining commands together with semicolons. Consequently, each command is supplied to sed via a separate -e option.
The OSX (BSD) sed seems to treat + differently from the GNU sed. This incompatibility seems to go away when using the -E (extended regex) option. (The corresponding GNU option is -r but, as an undocumented compatibility feature, GNU sed supports -E also.)

Repeated input redirection to c++ executable in bash

I have written an executable in c++, which is designed to take input from a file, and output to stdout (which I would like to redirect to a single file). The issue is, I want to run this on all of the files in a folder, and the find command that I am using is not cooperating. The command that I am using is:
find -name files/* -exec ./stagger < {} \;
From looking at examples, it is my understanding that {} replaces the file name. However, I am getting the error:
-bash: {}: No such file or directory
I am assuming that once this is ironed out, in order to get all of the results into one file, I could simply use the pattern Command >> outputfile.txt.
Thank you for any help, and let me know if the question can be clarified.
The problem that you are having is that redirection is processed before the find command. You can work around this by spawning another bash process in the -exec call:
find files/* -exec bash -c '/path/to/stagger < "$1"' -- {} \;
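Here is a self-contained sketch of the same pattern, with tr standing in for the stagger executable (an assumption, since stagger itself is not available here) and made-up sample files in a temporary directory:

```shell
# Sample input files in a throwaway directory.
demo_dir=$(mktemp -d)
printf 'alpha\n' >"$demo_dir/a.txt"
printf 'beta\n'  >"$demo_dir/b.txt"

# The inner sh, not the calling shell, performs the < redirection for each file.
# tr is only a stand-in for a program that reads stdin and writes to stdout.
result=$(find "$demo_dir" -type f \
  -exec sh -c 'tr "[:lower:]" "[:upper:]" <"$1"' _ {} \; | sort)

printf '%s\n' "$result"
rm -rf "$demo_dir"
```

To collect everything in one file, redirect the output of the whole find command once rather than inside the inner shell.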
The < operator is interpreted as a redirect by the shell prior to running the command. The shell tries redirecting input from a file named {} to find's stdin, and an error occurs if the file doesn't exist.
The argument to -name is unquoted and contains a glob character. The shell applies pathname expansion and gives nonsensical arguments to find.
Filenames can't contain slashes. The argument to -name can't work even if it were quoted. If GNU find is available, -path can be used to specify a glob pattern files/*, but this doesn't mean "files in directories named files", for that you need -regex. Portable solutions are harder.
You need to specify one or more paths for find to start from.
Assuming what you really wanted was to have a shell perform the redirect, Here's a way with GNU find.
find . -type f -regex '.*foo/[^/]*$' -exec sh -c 'for x; do ./stagger <"$x"; done' -- {} +
This is probably the best portable way using find (-depth and -prune won't work for this):
find . -type d -name files -exec sh -c 'for x; do for y in "$x"/*; do [ -f "$y" ] && ./stagger <"$y"; done; done' -- {} +
If you're using Bash, this problem is a very good candidate for just using a globstar pattern instead of find.
#!/usr/bin/env bash
shopt -s extglob globstar nullglob
for x in **/files/*; do
[[ -f "$x" ]] && ./stagger <"$x"
done
Simply escape the less-than symbol, so that redirection is carried out by the find command rather than the shell it is running in:
find files/* -exec ./stagger \< {} \;

How do I apply a shell command to many files in nested (and poorly escaped) subdirectories?

I'm trying to do something like the following:
for file in `find . *.foo`
do
somecommand $file
done
But the command isn't working because $file is very odd. Because my directory tree has crappy file names (including spaces), I need to escape the find command. But none of the obvious escapes seem to work:
-ls gives me the space-delimited filename fragments
-fprint doesn't do any better.
I also tried: for file in "find . *.foo -ls"; do echo $file; done
- but that gives all of the responses from find in one long line.
Any hints? I'm happy for any workaround, but am frustrated that I can't figure this out.
Thanks,
Alex
(Hi Matt!)
You have plenty of answers that explain well how to do it; but for the sake of completeness I'll repeat and add to it:
xargs is only ever useful for interactive use (when you know all your filenames are plain - no spaces or quotes) or when used with the -0 option. Otherwise, it'll break everything.
find is a very useful tool; but using it to pipe filenames into xargs (even with -0) is rather convoluted as find can do it all itself with either -exec command {} \; or -exec command {} + depending on what you want:
find /path -name 'pattern' -exec somecommand {} \;
find /path -name 'pattern' -exec somecommand {} +
The former runs somecommand with one argument for each file recursively in /path that matches pattern.
The latter runs somecommand with as many arguments as fit on the command line at once for files recursively in /path that match pattern.
Which one to use depends on somecommand. If it can take multiple filename arguments (like rm, grep, etc.) then the latter option is faster (since you run somecommand far less often). If somecommand takes only one argument then you need the former solution. So look at somecommand's man page.
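The difference between the two forms can be observed with a small sketch in which an inner sh stands in for somecommand and reports how many arguments each invocation received. The three .foo files are made-up sample data:

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/a.foo" "$demo_dir/b.foo" "$demo_dir/c.foo"

# Each invocation of the inner sh prints the number of file arguments it got.
per_file=$(find "$demo_dir" -name '*.foo' -exec sh -c 'echo "$#"' _ {} \;)
batched=$(find "$demo_dir" -name '*.foo' -exec sh -c 'echo "$#"' _ {} +)

printf 'with \\; :\n%s\n' "$per_file"   # three lines, each saying 1
printf 'with + :\n%s\n' "$batched"      # one line saying 3
rm -rf "$demo_dir"
```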
More on find: http://mywiki.wooledge.org/UsingFind
In bash, for is a statement that iterates over arguments. If you do something like this:
for foo in "$bar"
you're giving for one argument to iterate over (note the quotes!). If you do something like this:
for foo in $bar
you're asking bash to take the contents of bar and tear it apart wherever there are spaces, tabs or newlines (technically, whatever characters are in IFS) and use the pieces of that operation as arguments to for. That is NOT filenames. Assuming that tearing apart a long string that contains filenames wherever there is whitespace yields a pile of filenames is just wrong. As you have just noticed.
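A minimal sketch of the two cases, assuming the default IFS and a made-up value for bar that holds two filenames, one of which contains a space:

```shell
bar='my file.txt other.txt'

count_unquoted=0
for foo in $bar; do count_unquoted=$((count_unquoted + 1)); done

count_quoted=0
for foo in "$bar"; do count_quoted=$((count_quoted + 1)); done

echo "unquoted: $count_unquoted, quoted: $count_quoted"   # prints: unquoted: 3, quoted: 1
```

Neither count is 2, the actual number of filenames.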
The answer is: Don't use for, it's obviously the wrong tool. The above find commands all assume that somecommand is an executable in PATH. If it's a bash statement, you'll need this construct instead (iterates over find's output, like you tried, but safely):
while read -r -d ''; do
somebashstatement "$REPLY"
done < <(find /path -name 'pattern' -print0)
This uses a while-read loop that reads parts of the string find outputs until it reaches a NULL byte (which is what -print0 uses to separate the filenames). Since NULL bytes can't be part of filenames (unlike spaces, tabs and newlines) this is a safe operation.
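As a sketch of why this form is convenient, the loop below counts matches in a variable that is still set after the loop ends, because the loop runs in the current shell rather than in a pipeline subshell. It assumes bash; the sample files, including one with a newline in its name, are made up:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.foo" "$demo_dir/with space.foo" "$demo_dir/with"$'\n'"newline.foo"

count=0
while IFS= read -r -d '' file; do
  count=$((count + 1))    # the loop body runs in the current shell
done < <(find "$demo_dir" -name '*.foo' -print0)

echo "$count"             # prints: 3
rm -rf "$demo_dir"
```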
If you don't need somebashstatement to be part of your script (eg. it doesn't change the script environment by keeping a counter or setting a variable or some such) then you can still use find's -exec to run your bash statement:
find /path -name 'pattern' -exec bash -c 'somebashstatement "$1"' -- {} \;
find /path -name 'pattern' -exec bash -c 'for file; do somebashstatement "$file"; done' -- {} +
Here, the -exec executes a bash command with three or more arguments.
The bash statement to execute.
A --. bash will put this in $0, you can put anything you like here, really.
Your filename or filenames (depending on whether you used {} \; or {} + respectively). The filename(s) end(s) up in $1 (and $2, $3, ... if there's more than one, of course).
The bash statement in the first find command here runs somebashstatement with the filename as argument.
The bash statement in the second find command here runs a for(!) loop that iterates over each positional parameter (that's what the reduced for syntax - for foo; do - does) and runs a somebashstatement with the filename as argument. The difference here between the very first find statement I showed with -exec {} + is that we run only one bash process for lots of filenames but still one somebashstatement for each of those filenames.
All this is also well explained in the UsingFind page linked above.
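The reduced for syntax can also be tried without find. This sketch passes two made-up filenames directly, with _ as the $0 placeholder:

```shell
# "for file; do" iterates over "$@", i.e. $1, $2, ... ($0 is skipped).
result=$(sh -c 'for file; do printf "[%s]\n" "$file"; done' _ "a b.txt" c.txt)

printf '%s\n' "$result"   # prints [a b.txt] and [c.txt] on separate lines
```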
Instead of relying on the shell to do that work, rely on find to do it:
find . -name "*.foo" -exec somecommand "{}" \;
Then the file name will be properly escaped, and never interpreted by the shell.
find . -name '*.foo' -print0 | xargs -0 -n 1 somecommand
It does get messy if you need to run a number of shell commands on each item, though.
xargs is your friend. You will also want to investigate the -0 (zero) option with it. find (with -print0) will help to produce the list. The Wikipedia page has some good examples.
Another useful reason to use xargs, is that if you have many files (dozens or more), xargs will split them up into individual calls to whatever xargs is then called upon to run (in the first wikipedia example, rm)
find . -name '*.foo' -print0 | xargs -0 sh -c 'for F in "${@}"; do ...; done' "${0}"
I had to do something similar some time ago, renaming files to allow them to live in Win32 environments:
#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$@"
do
newf=$(echo "${f}" | sed -e 's/[\\/:\*\?#"\|<>]/_/g')
if [ "${newf}" != "${f}" ]; then
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
fi
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .
This is probably a little simplistic, doesn't avoid name collisions, and I'm sure it could be done better -- but this does remove the need to use basename on the find results (in my case) before performing my sed replacement.
I might ask, what are you doing to the found files, exactly?

Resources