How do I apply a shell command to many files in nested (and poorly escaped) subdirectories? - bash

I'm trying to do something like the following:
for file in `find . *.foo`
do
    somecommand $file
done
But the command isn't working because $file is very odd. Because my directory tree has crappy file names (including spaces), I need to escape the find command. But none of the obvious escapes seem to work:
-ls gives me the space-delimited filename fragments
-fprint doesn't do any better.
I also tried: for file in "`find . *.foo -ls`"; do echo $file; done
- but that gives all of the responses from find in one long line.
Any hints? I'm happy for any workaround, but am frustrated that I can't figure this out.
Thanks,
Alex
(Hi Matt!)

You have plenty of answers that explain well how to do it; but for the sake of completeness I'll repeat and add to it:
xargs is only ever useful for interactive use (when you know all your filenames are plain - no spaces or quotes) or when used with the -0 option. Otherwise, it'll break everything.
find is a very useful tool; but using it to pipe filenames into xargs (even with -0) is rather convoluted, as find can do it all itself with either -exec command {} \; or -exec command {} + depending on what you want:
find /path -name 'pattern' -exec somecommand {} \;
find /path -name 'pattern' -exec somecommand {} +
The former runs somecommand with one argument for each file recursively in /path that matches pattern.
The latter runs somecommand with as many arguments as fit on the command line at once for files recursively in /path that match pattern.
Which one to use depends on somecommand. If it can take multiple filename arguments (like rm, grep, etc.) then the latter option is faster (since you run somecommand far less often). If somecommand takes only one argument then you need the former solution. So look at somecommand's man page.
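If somecommand were just echo, you could watch the difference (a throwaway illustration, assuming a few matching files exist):
find . -name '*.foo' -exec echo {} \;   # one echo invocation per matching file
find . -name '*.foo' -exec echo {} +    # one echo invocation covering many files at once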
More on find: http://mywiki.wooledge.org/UsingFind
In bash, for is a statement that iterates over arguments. If you do something like this:
for foo in "$bar"
you're giving for one argument to iterate over (note the quotes!). If you do something like this:
for foo in $bar
you're asking bash to take the contents of bar and tear it apart wherever there are spaces, tabs or newlines (technically, whatever characters are in IFS) and use the pieces of that operation as arguments to for. Those pieces are NOT filenames. Assuming that tearing a long string that contains filenames apart wherever there is whitespace yields a pile of filenames is just wrong, as you have just noticed.
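You can watch that splitting happen with a throwaway example:
bar='one two  three'
for foo in $bar; do echo "word: $foo"; done   # three iterations (one, two, three), not one string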
The answer is: Don't use for, it's obviously the wrong tool. The above find commands all assume that somecommand is an executable in PATH. If it's a bash statement, you'll need this construct instead (iterates over find's output, like you tried, but safely):
while read -r -d ''; do
    somebashstatement "$REPLY"
done < <(find /path -name 'pattern' -print0)
This uses a while-read loop that reads parts of the string find outputs until it reaches a NULL byte (which is what -print0 uses to separate the filenames). Since NULL bytes can't be part of filenames (unlike spaces, tabs and newlines) this is a safe operation.
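As an illustration of why you'd want the loop to run in the current shell, here is a minimal sketch that keeps a counter (somebashstatement is a stand-in as before; IFS= and a named variable are used so surrounding whitespace survives too):
count=0
while IFS= read -r -d '' file; do
    somebashstatement "$file"
    count=$((count + 1))
done < <(find /path -name 'pattern' -print0)
echo "processed $count files"   # count survives because the loop ran in this shell, not in a pipe subshell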
If you don't need somebashstatement to be part of your script (eg. it doesn't change the script environment by keeping a counter or setting a variable or some such) then you can still use find's -exec to run your bash statement:
find /path -name 'pattern' -exec bash -c 'somebashstatement "$1"' -- {} \;
find /path -name 'pattern' -exec bash -c 'for file; do somebashstatement "$file"; done' -- {} +
Here, the -exec executes a bash command with three or more arguments:
- The bash statement to execute.
- A --. bash will put this in $0; you can put anything you like here, really.
- Your filename or filenames (depending on whether you used {} \; or {} +, respectively). The filename(s) end(s) up in $1 (and $2, $3, ... if there's more than one, of course).
The bash statement in the first find command here runs somebashstatement with the filename as argument.
The bash statement in the second find command here runs a for(!) loop that iterates over the positional parameters (that's what the reduced for syntax - for foo; do - does) and runs somebashstatement with each filename as argument. The difference from the plain -exec somecommand {} + form shown earlier is that we start only one bash process for lots of filenames, but still run one somebashstatement for each of those filenames.
All this is also well explained in the UsingFind page linked above.

Instead of relying on the shell to do that work, rely on find to do it:
find . -name "*.foo" -exec somecommand "{}" \;
Then the file name will be properly escaped, and never interpreted by the shell.
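A quick way to convince yourself, using a hypothetical file name with a space:
touch 'a b.foo'
find . -name '*.foo' -exec printf '[%s]\n' {} \;   # prints [./a b.foo] - one intact argument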

find . -name '*.foo' -print0 | xargs -0 -n 1 somecommand
It does get messy if you need to run a number of shell commands on each item, though.
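If you do need several shell commands per file, one way around that (a sketch; somecommand and the .bak copy are just stand-ins) is to hand xargs an inline shell:
find . -name '*.foo' -print0 | xargs -0 sh -c 'for f in "$@"; do cp -- "$f" "$f.bak"; somecommand "$f"; done' sh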

xargs is your friend. You will also want to investigate the -0 (zero) option with it. find (with -print0) will help to produce the list. The Wikipedia page has some good examples.
Another useful reason to use xargs is that if you have many files (dozens or more), it will split them across several invocations of the command it is told to run (rm, in the first Wikipedia example), keeping each invocation under the system's command-line length limit.
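You can watch the batching with a toy pipeline:
printf '%s\0' file1 file2 file3 file4 file5 | xargs -0 -n 2 echo   # runs echo three times: two, two, then one argument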

find . -name '*.foo' -print0 | xargs -0 sh -c 'for F in "$@"; do ...; done' sh

I had to do something similar some time ago, renaming files to allow them to live in Win32 environments:
#!/bin/bash
IFS=$'\n'    # split word expansion on newlines only, so names with spaces survive
function RecurseDirs
{
    for f in "$@"
    do
        newf=$(echo "${f}" | sed -e 's/[\\/:\*\?#"\|<>]/_/g')
        if [ "${newf}" != "${f}" ]; then
            echo "${f}" "${newf}"
            mv "${f}" "${newf}"
            f="${newf}"
        fi
        if [[ -d "${f}" ]]; then
            cd "${f}"
            RecurseDirs $(ls -1 ".")
        fi
    done
    cd ..    # undoes the caller's cd into this directory
}
RecurseDirs .
This is probably a little simplistic, doesn't avoid name collisions, and I'm sure it could be done better -- but this does remove the need to use basename on the find results (in my case) before performing my sed replacement.
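For what it's worth, here is a hedged sketch of the same idea built on find instead of manual recursion (GNU or BSD find assumed for -mindepth; -depth renames children before their parents so pending paths stay valid; the character class matches the same Win32-hostile characters, minus / which can't appear in a basename anyway):
find . -depth -mindepth 1 -exec bash -c '
    for p; do
        dir=${p%/*} base=${p##*/}
        newbase=$(printf "%s\n" "$base" | sed "s/[\\\\:*?#\"|<>]/_/g")
        if [ "$newbase" != "$base" ]; then
            echo mv -- "$p" "$dir/$newbase"
        fi
    done' _ {} +
Remove the echo once the printed mv commands look right.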
I might ask, what are you doing to the found files, exactly?


Renaming bunch of files with xargs

I've been trying to rename a bunch of files in a proper order using xargs, but to no avail. While digging through piles of similar questions, I found answers that use sed alongside xargs. Novice me wants to avoid the use of sed; I presume there must be some easier way around it.
To be more specific, I've got some files as follows:
Abc.jpg
Def.jpg
Ghi.jpg
Jkl.jpg
and I want these to be renamed in an ordered way, like:
Something1.jpg
Something2.jpg
Something3.jpg
Something4.jpg
Could xargs command along with seq achieve this? If so, how do I implement it?
I don't know why anyone would try to engage sed for this. Probably not xargs or seq, either. Here's a pure-Bash one-liner:
(x=1; for f in *.jpg; do mv "$f" "Something$((x++)).jpg"; done)
At its core, that's a for loop over the files you want to rename, performing a mv command on each one. The files to operate on are expressed via a single glob expression, but you could also name them individually, use multiple globs, or use one of a variety of other techniques. Variable x is used as a simple counter, initialized to 1 before entering the loop. $((x++)) expands to the current value of x, with the side effect of incrementing x by 1. The whole thing is wrapped in parentheses to run it in a subshell, so that nothing in it affects the host shell environment. (In this case, that means it does not create or modify any variable x in the invoking shell.)
If you were putting that in a script instead of typing it on the command line then it would be more readable to split it over several lines:
(
    x=1
    for f in *.jpg; do
        mv "$f" "Something$((x++)).jpg"
    done
)
You can type it that way, too, if you wish.
This is an example of how to find, number and rename jpgs, regardless of which find options you need (recursion, -mindepth, -maxdepth, -regex, ...). You can number find's output with nl and feed number and file to xargs as two arguments, $1 and $2:
$ find . -type f -name "*.jpg" |nl| xargs -n 2 bash -c 'echo mv "$2" Something"$1".jpg' argv0
The echo mv ... will only show what would be run:
mv ./Jkl.jpg Something1.jpg
mv ./Abc.jpg Something2.jpg
mv ./Def.jpg Something3.jpg
The same using sort, and testing the number of arguments as a safety check:
$ find . -type f -name "*.jpg" |sort|nl| xargs -n 2 bash -c '[ "$#" -eq 2 ] && echo mv "$2" Something"$1".jpg' argv0
mv ./Abc.jpg Something1.jpg
mv ./Def.jpg Something2.jpg
mv ./Jkl.jpg Something3.jpg
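If the names may contain whitespace, a NUL-delimited variant is safer (a sketch assuming bash and GNU sort for -z; echo again shows what would run):
i=1
while IFS= read -r -d '' f; do
    echo mv -- "$f" "Something$((i++)).jpg"
done < <(find . -type f -name '*.jpg' -print0 | sort -z)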

Rename multiple filename in multiple folders

I know you can rename all the files in a single folder with something like this:
for file in 1_*; do
    mv "$file" "${file/1_/}"
done
However, is there a way to do this across multiple folders? For example, it will search through all the folders in the current directory and change them.
I have bash version 4.3
A robust solution, assuming you have GNU or BSD/OSX find:
find . -type f -name '1_*' -execdir sh -c 'echo mv -- "$1" "${1#1_}"' _ {} \;
Note:
- This will only echo the mv commands, to be safe; remove the echo to perform actual renaming.
- The OP's substitution, "${file/1_/}" was changed to the POSIX-compliant "${file#1_}", which is actually closer to the intent.
- If you truly need a substitution such as "${file/1_/}", which the sh on your system may or may not support, it is better to explicitly invoke a shell known to support it, such as bash.
- Symlinks are ignored (both files and directories); use find -L ... to include them (both as potential files to be renamed and to make find descend into symlinks to directories).
find . -type f -name '1_*' finds all files (-type f) with names matching 1_* (-name '1_*') in the current dir.'s (.) subtree.
-execdir executes the command passed to it in the subdirectory in which the file at hand is located.
sh -c 'echo mv -- "$1" "${1#1_}"' _ {} \; invokes the default shell (sh):
- with a command string (passed to -c), in which mv -- "$1" "${1#1_}" effectively removes prefix 1_ from the filename represented by the first positional parameter ($1),
- with dummy parameter _ (which sh will assign to $0, which is not of interest here),
- and with the path of the file at hand, {}, which the shell will bind to $1.
\; simply terminates -execdir's argument.
Note that -- ensures that any filename that happens to start with - isn't mistaken for an option by mv (applies analogously below).
-execdir is not POSIX-compliant; a POSIX-compliant variant is therefore more cumbersome:
find . -type f -name '1_*' -exec sh -c \
'cd -- "${1%/*}" && f=${1##*/} && echo mv -- "$f" "${f#1_}"' _ {} \;
cd -- "${1%/*}" changes to the directory in which the file at hand is located.
Note: cd -- "$(dirname -- "$1")" is generally more robust, but less efficient; since we know that $1 is always a path rather than a mere filename in this scenario, we can use the more efficient cd -- "${1%/*}".
f=${1##*/} extracts the mere filename from the file path at hand.
The remainder of the command line then works as above, analogously.
Performance note:
The above solutions are concise, but inefficient, because a shell instance has to be spawned for each matching file.
A potential speed-up is to use a variant of peak's approach, but only if you avoid calling external utilities in the while loop (except for mv):
find . -type f -name '1_*' | while IFS= read -r f; do
    new_name=${f##*/}        # extract filename
    new_name=${new_name#1_}  # perform substitution
    d=${f%/*}                # extract dir
    echo mv -- "$f" "$d/$new_name"  # call mv to rename
done
The above bears the hypothetical risk of breaking with filenames with embedded newlines (very rare); with GNU or BSD find, this can be solved by switching to -print0 and read -r -d ''.
With this approach, only a single shell instance is spawned, which processes all matching filenames in a loop - as long as the only external utility that is called in the loop is mv, this will generally be faster than the find-only solutions with -exec[dir].
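For completeness, the newline-safe variant just alluded to (assumes bash and GNU or BSD find; echo left in for a dry run):
find . -type f -name '1_*' -print0 | while IFS= read -r -d '' f; do
    new_name=${f##*/}
    new_name=${new_name#1_}
    echo mv -- "$f" "${f%/*}/$new_name"
done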
If you don't want to depend on too many subtleties, consider this very pedestrian approach, which assumes a bash-like shell and that all the usual suspects (find, sed, ....) are directly available:
find . -type f -name "1_*" | while read -r file ; do
x=$(basename "$file")
y=$(sed 's/1_//' <<< "$x")
d=$(dirname "$file")
mv "$file" "$d/$y"
done
(You might want to try this using "mv -i" or "echo mv ....". You might also want to use find with the -follow option.)

Repeated input redirection to c++ executable in bash

I have written an executable in c++, which is designed to take input from a file, and output to stdout (which I would like to redirect to a single file). The issue is, I want to run this on all of the files in a folder, and the find command that I am using is not cooperating. The command that I am using is:
find -name files/* -exec ./stagger < {} \;
From looking at examples, it is my understanding that {} replaces the file name. However, I am getting the error:
-bash: {}: No such file or directory
I am assuming that once this is ironed out, in order to get all of the results into one file, I could simply use the pattern Command >> outputfile.txt.
Thank you for any help, and let me know if the question can be clarified.
The problem that you are having is that redirection is processed before the find command. You can work around this by spawning another bash process in the -exec call:
find files/* -exec bash -c '/path/to/stagger < "$1"' -- {} \;
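To collect all the output in one file, as the question anticipates, redirect the find invocation as a whole (a sketch: outputfile.txt is the placeholder name from the question, and find files -type f starts from the directory itself rather than relying on the files/* glob):
find files -type f -exec bash -c '/path/to/stagger < "$1"' -- {} \; > outputfile.txt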
There are several problems with the original command:
- The < operator is interpreted as a redirect by the shell prior to running the command. The shell tries redirecting input from a file named {} to find's stdin, and an error occurs if that file doesn't exist.
- The argument to -name is unquoted and contains a glob character, so the shell applies pathname expansion and gives nonsensical arguments to find.
- Filenames can't contain slashes, so the argument to -name can't work even if it were quoted. If GNU find is available, -path can be used to specify a glob pattern files/*, but this doesn't mean "files in directories named files"; for that you need -regex. Portable solutions are harder.
- You need to specify one or more paths for find to start from.
Assuming what you really wanted was to have a shell perform the redirect, here's a way with GNU find.
find . -type f -regex '.*foo/[^/]*$' -exec sh -c 'for x; do ./stagger <"$x"; done' -- {} +
This is probably the best portable way using find (-depth and -prune won't work for this):
find . -type d -name files -exec sh -c 'for x; do for y in "$x"/*; do [ -f "$y" ] && ./stagger <"$y"; done; done' -- {} +
If you're using Bash, this problem is a very good candidate for just using a globstar pattern instead of find.
#!/usr/bin/env bash
shopt -s extglob globstar nullglob
for x in **/files/*; do
    [[ -f "$x" ]] && ./stagger <"$x"
done
Simply escaping the less-than symbol, as in
find files/* -exec ./stagger \< {} \;
does not do what you might hope: find performs no redirection of its own, so the escaped < is passed to ./stagger as a literal argument rather than causing input redirection.

Is there such a thing as inline bash scripts?

I want to do something on the lines of:
find -name *.mk | xargs "for i in $# do mv i i.aside end"
I realize that there might be more than on error in this, but I'd like to specifically know about this sort of inline command definition that I can pass xargs to.
This particular command isn't a great example, but you can use an "inline shell script" by giving sh -c 'here is the script' as a command. And you can give it arguments, which will be "$@" inside the script, but there's a catch: the first argument after the script goes to $0 inside the script, so you have to put an extra word there or you'll lose the first argument.
find . -name '*.mk' -exec sh -c 'for i; do mv "$i" "$i.aside"; done' fnord '{}' +
Another fun feature I took advantage of there is the fact that for loops iterate over the command line arguments by default: for i; do ... is equivalent to for i in "$@"; do ...
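You can check that equivalence at a prompt:
set -- a 'b c' d
for i; do printf '%s\n' "$i"; done   # prints a, then b c, then d - exactly the positional parameters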
I reiterate, the above command is convoluted and slow compared to the many other methods of doing the bulk mv. I'm posting it only to show some cool syntax.
There's no need for xargs here:
find . -name '*.mk' -exec mv {} {}.aside \;
(Replacing {} inside a larger word such as {}.aside is a GNU find feature; POSIX only guarantees substitution when {} is an argument on its own.)
I'm not sure what the semantics of your for loop should be, but blindly coding it would give something like this:
find . -name '*.mk' | while read file
do
    for i in $file; do mv $i $i.aside; done
done
If the body is used in multiple places, you can also use bash functions.
Some versions of find need a starting-point argument: . for the current directory.
The star * must be escaped or quoted so the shell doesn't expand it first.
You can try with the echo command to be sure what the command will do:
find . -name '*.mk' -print0 | xargs -0i sh -c "echo mv '{}' '{}.aside'"
See man xargs (search for -i) and man sh (search for -c).
I'm certain you could do this in a nice manner, but since you requested xargs:
find -name "*.tk" | xargs -I% mv % %.aside
Looping over filenames makes no sense, since you can only rename one at a time. Using inline uglyness is not necessary, but I could not make it work with the pipe and either eval or bash -c.

Loop over directories with whitespace in Bash

In a bash script, I want to iterate over all the directories in the present working directory and do stuff to them. They may contain special symbols, especially whitespace. How can I do that? I have:
for dir in $( ls -l ./)
do
if [ -d ./"$dir" ]
but this skips my directories with whitespace in their name. Any help is appreciated.
Give this a try:
for dir in */
Take your pick of solutions:
http://www.cyberciti.biz/tips/handling-filenames-with-spaces-in-bash.html
The general idea is to change the default separator (IFS).
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for f in *
do
    echo "$f"
done
IFS=$SAVEIFS
There are multiple ways. Here is something that is very fast:
find /your/dir -type d -print0 | xargs -0 echo
This will scan /your/dir recursively for directories and pass all paths to the command echo (replace it with whatever you need). It may call echo multiple times, but it will try to pass as many directory names at once as the command line allows. This is extremely fast because few processes need to be started, but it only works with programs that can take an arbitrary number of names as arguments.
-print0 tells find to separate file paths using a zero byte (and -0 tells xargs to read arguments separated by zero bytes)
If your program can't do that, you can do this:
find /your/dir -type d -print0 | xargs -0 -n 1 echo
or
find /your/dir -type d -exec echo '{}' ';'
The -n 1 option tells xargs to pass at most one argument at a time to your program.
If you don't want find to scan recursively, you can use its -maxdepth option to disable recursion (GNU and BSD find support it).
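For example, to list only the immediate subdirectories:
find /your/dir -mindepth 1 -maxdepth 1 -type d -print0 | xargs -0 echo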
Though if that's usable in your particular script is another question ;-).
