When using scp or rsync, I often run into the 'Argument list too long' error and fail to work around it. For mv or rm I have no problem using find and xargs, but I fail to understand how to use find with -exec, despite all the SE posts on the subject. Consider the following issue...
I tried
$scp /Path/to/B* Me#137.92.4.152:/Path/to/
-bash: /usr/bin/scp: Argument list too long
So I tried
$find . -name "/Path/to/B*" -exec scp "{}" Me#137.92.4.152:/Path/to/ '\;'
find: -exec: no terminating ";" or "+"
so I tried
$find . -name "/Path/to/B*" -exec scp "{}" Me#137.92.4.152:/Path/to/ ';'
find: ./.gnupg: Permission denied
find: ./.subversion/auth: Permission denied
So I tried
$sudo find . -name "/Path/to/B*" -exec scp "{}" Me#137.92.4.152:/Path/to/ ';'
and nothing happens once I enter my password.
I am on Mac OSX version 10.11.3, Terminal version 2.6.1
R. Saban's helpful answer solves your primary problem:
-name only accepts a filename pattern, not a path pattern.
Alternatively, you could simply use the -path primary instead of the -name primary.
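To see the difference between the two primaries, here is a minimal sketch on a throwaway tree. The scratch directory and the file names B1, B2 and C1 are made up for the demonstration; they are not from the question.

```shell
# Hypothetical scratch tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/to"
touch "$demo/to/B1" "$demo/to/B2" "$demo/to/C1"

# -name is compared with the last path component only, so a pattern that
# contains a slash matches nothing (GNU find also prints a warning).
name_hits=$(find "$demo" -name "$demo/to/B*" 2>/dev/null | wc -l)

# -path is compared with the whole path, exactly as find would print it.
path_hits=$(find "$demo" -path "$demo/to/B*" | wc -l)

echo "-name matched $name_hits, -path matched $path_hits"
```

The -name variant matches nothing, while the -path variant matches the two B files.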
As for using as few invocations of scp as possible - each of which requires specifying a password by default:
As an alternative, consider bypassing the use of scp altogether, as suggested in Eric Renouf's helpful answer.
find's -exec primary accepts the terminator + in lieu of ; (which must be passed as ';' or \; to prevent the shell from interpreting ; as a command terminator). With +, find passes as many filenames as will fit on a single command line (a built-in xargs, in a manner of speaking). However, this is NOT an option here, because + requires that the placeholder {} come last on the command line, immediately before +, whereas scp needs the destination as its last argument.
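The difference between the two terminators can be observed with a small experiment. This is only a sketch: the scratch directory and its three files are invented, and sh -c 'echo "$#"' simply reports how many arguments each invocation received.

```shell
# Hypothetical scratch directory with three files.
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# Terminator ';' : the command runs once per file, so three lines appear.
per_file_runs=$(find "$demo" -type f -exec sh -c 'echo "$#"' _ {} \; | wc -l)

# Terminator '+' : the files are batched, so a single run sees all three.
batch_args=$(find "$demo" -type f -exec sh -c 'echo "$#"' _ {} +)

echo "runs with ';': $per_file_runs, arguments in the '+' run: $batch_args"
```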
However, since you're on macOS, you can use BSD xargs's nonstandard -J option to place the placeholder anywhere on the command line while still passing as many arguments as possible at once. Combining BSD find's nonstandard -print0 option with xargs's nonstandard -0 option ensures that all filenames are passed as-is, even if they contain embedded spaces, for instance:
find /Path/to -path "/Path/to/B*" -print0 | xargs -0 -J {} scp {} Me#137.92.4.152:/Path/to/
Now you will at most be prompted a few times: one prompt for every batch of arguments, as required to accommodate all arguments while observing the max. command-line length with the fewest number of calls possible.
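Where xargs -J is unavailable (it is a BSD extension), a commonly used portable workaround is to let -exec ... {} + hand each batch to a small sh -c wrapper, which can then place "$@" anywhere on the command line. The following is a sketch under stated assumptions, not the method from the answer above: echo stands in for scp so nothing is copied, and user@host:/dest/ and the scratch directory are placeholders.

```shell
# Hypothetical scratch directory; only the B* files should be picked up.
demo=$(mktemp -d)
touch "$demo/B1" "$demo/B2" "$demo/C1"

# {} + must end the -exec clause, but inside the wrapper "$@" can sit
# anywhere, so the (placeholder) destination can still come last.
# echo stands in for scp so that nothing is actually copied.
cmdline=$(find "$demo" -name 'B*' -exec sh -c 'echo scp "$@" user@host:/dest/' _ {} +)

echo "$cmdline"
```

Replacing echo scp with scp would issue one scp call per batch, which for a modest number of files means a single call.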
EDIT after your update:
find "/Path/to" -maxdepth 1 -name "B*" -exec scp {} Me#137.92.4.152:/Path/to/ \;
A solution that wouldn't require multiple scp connections (and therefore password entries) would be to tar on one side and untar on the other like:
find /Path/to -maxdepth 1 -name 'B*' -print0 | tar -c --null -T - | ssh ME#137.92.4.152 tar -x -C /Path/to
This assumes your version of find supports -print0 and the like. It works by having find print a null-terminated list of files and telling tar to read its list of files from stdin (-T -), to treat that list as null-terminated (--null), and to create a new archive (-c). On many systems tar writes the archive to stdout by default; if yours does not, add -f - to make that explicit.
So then we'll pipe that archive to an ssh command on the target host. That command reads the output of the previous command on its stdin, so we use tar there to extract (-x) the archive into the given directory (-C /Path/to).
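The pipeline can be tried without a remote host by letting the receiving tar run locally. In this sketch the two scratch directories stand in for the local and remote /Path/to, the file names are invented, and -f - is spelled out instead of relying on tar's default archive.

```shell
src=$(mktemp -d)   # stands in for the local /Path/to
dst=$(mktemp -d)   # stands in for the remote /Path/to
touch "$src/B1" "$src/B 2" "$src/C1"

# Same pipeline shape as above; the receiving tar runs locally instead of
# under ssh. "-f -" names stdout/stdin explicitly as the archive.
(cd "$src" && find . -maxdepth 1 -name 'B*' -print0 | tar -c --null -T - -f -) |
  tar -x -f - -C "$dst"

ls "$dst"
```

Running find from inside the source directory keeps the archived names relative, so they unpack directly into the target directory.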
I want to unzip multiple files.
Using this answer, I found the following command.
find -name '*.zip' -exec sh -c 'unzip -d "${1%.*}" "$1"' _ {} \;
How do I use GNU Parallel with the above command to unzip multiple files?
Edit 1:
As per questions by user Mark Setchell
Where are the files ?
All the zip files are generally in a single directory.
As far as I understand, though, the command finds the files recursively or non-recursively, depending on the depth given to the find command.
How are the files named?
abcd_sdfa_fasfasd_dasd14.zip
how do you normally unzip a single one?
unzip abcd_sdfa_fasfasd_dasd14.zip -d abcd_sdfa_fasfasd_dasd14
You can first use find with the -print0 option to NUL-delimit the file names, then have GNU Parallel read them back with the NUL delimiter (-0) and apply unzip:
find . -type f -name '*.zip' -print0 | parallel -0 unzip -d {/.} {}
The {/.} part is a replacement string that takes the basename of the file and removes its extension (the part following the last .); see "7. Get basename, and remove last ({.}) or any ({:}) extension" in the GNU Parallel documentation. You can further set the number of parallel jobs with the -j flag, e.g. -j8 or -j64.
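For comparison, the same string manipulation can be written with plain POSIX parameter expansion. This is shell syntax rather than GNU Parallel syntax, shown only to make clear what each replacement string yields; the path is a made-up example built from the file name in the question.

```shell
path=./downloads/abcd_sdfa_fasfasd_dasd14.zip   # hypothetical input path

no_ext=${path%.*}        # what {.} yields: extension removed, directory kept
base=${path##*/}         # what {/} yields: directory removed
base_no_ext=${base%.*}   # what {/.} yields: directory and extension removed

printf '%s\n' "$no_ext" "$base" "$base_no_ext"
```

One consequence: with {/.} the target directory is created relative to the current directory, whereas {.} keeps it next to the archive.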
You could also use the + variant of -exec. It starts parallel after find has completed, but still allows you to use -print/-printf/-ls/etc. and possibly abort the find before the command is executed:
find . -type f -name '*.zip' -ls -exec parallel unzip -d {.} ::: {} \+
Note that GNU Parallel also uses {} to specify the input arguments. In this case, however, we use {.} to strip the extension, as shown in your example. You can override GNU Parallel's replacement string {} with -I (for example, -I## lets you use ## instead of {}).
I recommend using GNU Parallel's --dry-run flag or prepending unzip with an echo to test the command first and see what would be executed.
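The echo trick works the same way with the plain find -exec command from the question. A minimal sketch, assuming a scratch directory with empty placeholder files (they are not real archives, which is fine because unzip never runs):

```shell
demo=$(mktemp -d)
touch "$demo/one.zip" "$demo/two.zip" "$demo/notes.txt"   # empty placeholders

# echo prints each unzip command instead of running it.
plan=$(find "$demo" -name '*.zip' -exec sh -c 'echo unzip -d "${1%.*}" "$1"' _ {} \;)

printf '%s\n' "$plan"
```

Once the printed commands look right, removing echo runs them for real.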
With GNU find, it is easy to pipe to xargs. A typical (useless) example:
find /var/log -name "*.log" | xargs dirname
This returns all the directory names containing some log file.
The same command with BSD find does not work, ending with:
usage: dirname path
That is, xargs is unable to pass the file list entries to dirname.
BSD find's manpage mentions the -exec and -execdir options, stating "This behaviour is similar to that of xargs(1)."
-exec utility [argument ...] {} +
Same as -exec, except that ``{}'' is replaced with as many pathnames as possible for
each invocation of utility. This behaviour is similar to that of xargs(1).
-execdir utility [argument ...] {} +
Same as -execdir, except that ``{}'' is replaced with as many pathnames as possible
for each invocation of utility. This behaviour is similar to that of xargs(1).
Each time I fall back on these two flags, I have to read the documentation again. I seem unable to remember their usage! Also, I am concerned with script portability across GNU/BSD systems, basically Linux, Open/FreeBSD, and MacOS.
Is there any way to pipe BSD find to xargs, or is -exec really the only option?
Both the GNU and FreeBSD versions of xargs support passing the strings from stdin to the command via the -I flag. All you need to do is:
find /var/log -name "*.log" | xargs -I {} dirname -- "{}"
The GNU xargs man page describes the flag as follows:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input.
This provides an alternative to using -exec or -execdir. That said, using -exec is not too complex for your case:
find /var/log -name "*.log" -type f -exec dirname "{}" \;
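The xargs -I form can be checked on a scratch tree rather than on /var/log. A minimal sketch; the directory layout and log file names are invented, and sort -u is added only to collapse duplicate directory names.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/app" "$demo/db"
touch "$demo/app/a.log" "$demo/app/b.log" "$demo/db/c.log"

# -I runs dirname once per input line, so dirname always receives exactly
# one operand, which is all that some dirname implementations accept.
dirs=$(find "$demo" -name '*.log' | xargs -I {} dirname -- {} | sort -u)

printf '%s\n' "$dirs"
```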
Say I want to edit every .html file in a directory one after the other using vim, I can do this with:
find . -name "*.html" -exec vim {} \;
But what if I only want to edit every html file containing a certain string, one after the other? I use grep to find files containing those strings, but how can I pipe each one to vim, similar to the find command? Perhaps I should use something other than grep, or somehow pipe the find command to grep and then exec vim. Does anyone know how to edit files containing a certain string one after the other, in the same fashion as the find command example I give above?
grep -l 'certain string' *.html | xargs vim
This assumes you don't have eccentric file names with spaces etc. in them. If you have to deal with eccentric file names, check whether your grep has an option to terminate each output file name with a null byte (--null in GNU and BSD grep, also spelled -Z in GNU grep; note that GNU grep's -z is a different option that changes how input lines are delimited) and whether your xargs has a -0 option to read such input. If so, then:
grep -l --null 'certain string' *.html | xargs -0 vim
If you need to search subdirectories, your version of Bash may support ** (Bash 4 or later, after running shopt -s globstar):
grep -l --null 'certain string' **/*.html | xargs -0 vim
Note: these commands run vim on batches of files. If you must run it once per file, pass -n 1 as an extra option to xargs before you mention vim. If you have GNU xargs, you can use -r to prevent it from running vim when there are no file names in its input (that is, when none of the files scanned by grep contain the 'certain string').
The variations can be continued as you invent new ways to confuse things.
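The effect of null-terminated file names can be observed without opening an editor. This is a sketch under assumptions: the scratch files are invented, echo stands in for vim, -n 1 makes each argument visible as one line, and grep is assumed to support --null (GNU and BSD grep do).

```shell
demo=$(mktemp -d)
printf 'certain string\n' > "$demo/has space.html"
printf 'certain string\n' > "$demo/plain.html"
printf 'nothing here\n'   > "$demo/other.html"

# Newline-delimited names: xargs splits "has space.html" into two arguments.
naive=$(grep -l 'certain string' "$demo"/*.html | xargs -n 1 echo | wc -l)

# NUL-delimited names: each file name arrives as exactly one argument.
safe=$(grep -l --null 'certain string' "$demo"/*.html | xargs -0 -n 1 echo | wc -l)

echo "arguments without --null: $naive, with --null: $safe"
```

Two files match, yet the newline-delimited pipeline hands three arguments to the command.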
With find :
find . -type f -name '*.html' -exec bash -c 'grep -q "yourtext" "${1}" && vim "${1}"' _ {} \;
For each file, this calls a bash command that greps the file for yourtext and, if the text matches, opens the file in vim.
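A variation that avoids the bash -c wrapper is to use -exec itself as a test, since -exec ... \; evaluates to true only when the command exits with status 0. A minimal sketch on invented scratch files, with -print standing in for the editor step:

```shell
demo=$(mktemp -d)
printf 'yourtext\n' > "$demo/match.html"
printf 'other\n'    > "$demo/skip.html"

# -exec ... \; is true only when the command exits 0, so -print (or a
# second -exec vim {} \;) is reached only for files that grep matched.
matches=$(find "$demo" -type f -name '*.html' -exec grep -q 'yourtext' {} \; -print)

printf '%s\n' "$matches"
```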
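Solution with a for loop:
for i in $(find . -type f -name '*.html'); do vim "$i"; done
This should open each file in its own vim session, one after the other, each time you close the previous one. Note that the unquoted $(find ...) is split on whitespace, so this breaks on file names containing spaces.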
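A loop that survives unusual file names can be had by letting find pass the names as arguments to a small sh -c loop. The sketch below only counts iterations; the scratch files are invented and echo stands in for vim.

```shell
demo=$(mktemp -d)
touch "$demo/a b.html" "$demo/c.html"

# Word splitting turns "a b.html" into two words, so this loop runs 3 times.
split=0
for i in $(find "$demo" -type f -name '*.html'); do split=$((split + 1)); done

# Letting find pass the names as arguments keeps each one intact.
# echo stands in for vim here.
intact=$(find "$demo" -type f -name '*.html' \
  -exec sh -c 'for f do echo "$f"; done' _ {} + | wc -l)

echo "iterations with \$(find): $split, with -exec sh -c: $intact"
```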
I'm trying to pipe some files from the find command to the interactive remove command, so that I can double check the files I'm removing, but I've run into some trouble.
find -name '#*#' -print0 | xargs -0 rm -i
I thought the above would work, but instead I just get a string of "rm: remove regular file ./some/path/#someFile.js#? rm: remove regular file ./another/path/#anotherFile#?..."
Can someone explain to me what exactly is happening, and what I can do to get my desired results? Thanks.
You can do this by using the -exec option with find. Use the command:
find . -name '#*#' -exec rm -i {} \;
xargs will not work (unless you use options such as -o or -p) because it uses stdin to build commands. Since stdin is already in use, you cannot type a response to rm's prompt.
Can someone explain to me what exactly is happening,
As the man page for xargs says (under the -a option): "If you use this option, stdin remains unchanged when commands are run. Otherwise, stdin is redirected from /dev/null."
Since you're not using the -a option, each rm -i command that xargs is running gets its stdin from /dev/null (i.e. no input is available). When rm asks whether to remove a particular file, the answer is effectively "no" because /dev/null gives no reply. rm receives an EOF on its input, so it does not remove that file, and goes on to the next file.
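This can be observed without deleting anything. In the sketch below a small sh -c snippet plays the role of rm -i by trying to read an answer from its stdin; the file name fed to xargs is a dummy.

```shell
# The child tries to read an answer from stdin, much as rm -i would.
result=$(printf 'somefile\n' |
  xargs sh -c 'if read -r answer; then echo "got: $answer"; else echo "EOF"; fi' _)

echo "$result"
```

The read fails immediately, which is the same situation rm -i finds itself in when it asks its question.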
and what I can do to get my desired results?
Besides using find -exec as unxnut explained, another way to do it is to use the -o (or --open-tty) option with xargs:
find -name '#*#' -print0 | xargs -0 -o rm -i
That's probably the ideal way, because it allows rm -i to handle interactive confirmation itself, as designed.
Another way is to use the -p (or --interactive) option with xargs:
find -name '#*#' -print0 | xargs -0 -p rm
With this approach, xargs handles the interactive confirmation instead of having rm do it. You may also want to use -n 1, so that each prompt only asks about one file:
find -name '#*#' -print0 | xargs -0 -p -n 1 rm
The advantage of using xargs over find -exec is that you can use it with any command that generates the file path arguments, not just with find.
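As an illustration of that last point, here is a sketch in which the list of names comes from printf rather than from find. The scratch directory and file names are invented, and rm runs without -i so the example needs no terminal.

```shell
demo=$(mktemp -d)
touch "$demo/#a#" "$demo/#b c#" "$demo/keep"

# The NUL-separated list comes from printf rather than find.
printf '%s\0' "$demo/#a#" "$demo/#b c#" | xargs -0 rm --

ls "$demo"
```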
You can use this simple command to solve your problem (note that -delete removes the matching files without prompting):
find . -name '#*#' -delete
Can someone show me how to use xargs properly? Or, if not xargs, what Unix command should I use?
I basically want to supply more than one file name for the <localfile> input parameter.
For example:
1. use `find` to get list of files
2. use each filename as input to shell script
Usage of shell script:
test.sh <localdir> <localfile> <projectname>
My attempt, which is not working:
find /share1/test -name '*.dat' | xargs ./test.sh /staging/data/project/ '{}' projectZ \;
Edit:
After some input from everybody and trying -exec, I am finding that the <localfile> filename passed in from find also includes the full path: /path/filename.dat instead of filename.dat. Is there a way to get the basename from find? I think this will have to be a separate question.
I'd just use find -exec here:
% find /share1/test -name '*.dat' -exec ./test.sh /staging/data/project/ {} projectZ \;
This will invoke ./test.sh with your three arguments once for each .dat file under /share1/test.
xargs would pack up all of these filenames and pass them into one invocation of ./test.sh, which doesn't look like your desired behaviour.
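If the script should receive only the file name rather than the full path, one possible approach is to wrap the call in sh -c and strip the directory part with parameter expansion. A minimal sketch, with echo standing in for ./test.sh and a scratch directory standing in for /share1/test:

```shell
demo=$(mktemp -d)
touch "$demo/one.dat" "$demo/two.dat"

# ${1##*/} strips everything up to the last slash, leaving the base name.
# echo stands in for ./test.sh.
calls=$(find "$demo" -name '*.dat' \
  -exec sh -c 'echo /staging/data/project/ "${1##*/}" projectZ' _ {} \; | sort)

printf '%s\n' "$calls"
```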
If you want to execute the shell script for each file (as opposed to executing it only once on the whole list of files), you may want to use find -exec:
find /share1/test -name '*.dat' -exec ./test.sh /staging/data/project/ '{}' projectZ \;
Remember:
find -exec ... \; is for when you want to run a command on one file, for each file.
xargs instead, by default, runs a command as few times as possible, using all the files as arguments.
xargs stuffs as many files as it can onto the end of the command line.
Do you want to execute the script on one file at a time or on all files at once? For one at a time, use find's -exec, which it looks like you're already using the syntax for, and which xargs doesn't use:
find /share1/test -name '*.dat' -exec ./test.sh /staging/data/project/ '{}' projectZ \;
xargs does not have to combine arguments; that is just the default behavior. The following uses xargs with -I to execute the command once per file, as intended:
find /share1/test -name '*.dat' -print0 | xargs -0 -I'{}' ./test.sh /staging/data/project/ '{}' projectZ
When piping find to xargs, NUL termination is usually preferred, so I recommend appending the -print0 option to find. You must then add -0 to xargs so that it expects NUL-terminated arguments. This ensures proper handling of filenames. It's not POSIX, but it is widely supported. You can always drop the NUL-terminating options if your commands lack support for them.
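The once-per-file behaviour of -I, together with the NUL-terminated hand-off, can be checked on a scratch directory. In this sketch echo stands in for ./test.sh and the .dat file names are invented; one of them contains a space on purpose.

```shell
demo=$(mktemp -d)
touch "$demo/a b.dat" "$demo/c.dat"

# echo stands in for ./test.sh; -I makes xargs run it once per file name,
# and -print0 / -0 keep "a b.dat" together as a single argument.
runs=$(find "$demo" -name '*.dat' -print0 |
  xargs -0 -I{} echo /staging/data/project/ {} projectZ | wc -l)

echo "invocations: $runs"
```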
Remember, while find's purpose is finding files, xargs is much more generic. I often use xargs to process non-filename arguments.