Why can't I exclude a directory using find - bash

I am attempting to run a command on all subdirectories in a directory using find and -exec. However, the user the script runs under does not have adequate permissions on one of the directories, and I get an error (permission denied). I am attempting to ignore that directory using either ! -path or -prune. Neither of these methods works. I have tried both of the commands below.
I have tried every combination of subDirToExclude— with and without ./ at the beginning, with and without /* at the end. I've tried relative path, full path and every single combination of all of them that you can think of to try and match this path, but it simply does not work. The man page is unhelpful and no suggestions from any related questions on this forum produce any useful results. Why do none of the methods suggested in the man page work? How can this actually be done?
find /path/to/dir -maxdepth 1 -type d ! -path "subDirToExclude" -exec somecommand {} +
find /path/to/dir -maxdepth 1 -type d -path "subDirToExclude" -prune -o -exec somecommand {} +
find: ‘/path/to/dir/subDirToExclude’: Permission denied

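The argument to -path is matched against the whole pathname that find reports, starting point included, so it has to be a full pathname (here /path/to/dir/subDirToExclude), not just the name of the directory. Use -name if you just want to match the name of the directory.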
find /path/to/dir -maxdepth 1 -type d ! -name "subDirToExclude" -exec somecommand {} +
You could also do this without using find at all, since you're not recursing into subdirectories because of -maxdepth 1.
shopt -s extglob
somecommand /path/to/dir /path/to/dir/!(subDirToExclude)/
Putting / at the end of the filename makes the wildcard only match directories. Actually, this will also match symbolic links to directories; if that's a problem, you can't use this solution.
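To see the difference between -name and -path for yourself, here is a small sketch you can run against a throwaway tree. The scratch directory and the names keep1 and keep2 are made up for the demonstration; nothing here touches /path/to/dir.

```shell
# Build a scratch tree: two ordinary subdirectories plus the one to skip.
top=$(mktemp -d)
mkdir "$top/keep1" "$top/keep2" "$top/subDirToExclude"

# -path is compared with the whole path find reports (starting point
# included), so a bare directory name never matches and nothing is excluded.
with_path=$(find "$top" -maxdepth 1 -type d ! -path "subDirToExclude" | sort)

# -name is compared with the last path component only, so this excludes it.
with_name=$(find "$top" -maxdepth 1 -type d ! -name "subDirToExclude" | sort)

# -path also works once the pattern covers the whole reported path.
with_full_path=$(find "$top" -maxdepth 1 -type d ! -path "$top/subDirToExclude" | sort)

printf 'bare -path pattern:\n%s\n\n-name:\n%s\n' "$with_path" "$with_name"
rm -rf "$top"
```

The first listing still contains subDirToExclude; the second does not.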

Related

Trying to find files containing an identifier, then move them to a new directory within terminal

I'm a beginner with this stuff and seem to be running into an issue.
Basically, I have many files with names containing a keyword (let's call it "Category1") within a directory. For example:
ABC-Category1-XYZ.txt
I'm trying to move them from a directory into another directory with the same name as the keyword.
I started with this:
find /path_A -name "*Category1*" -exec mv {} /path_A/Category1 \;
It spit out something like this:
mv: rename /path_A/Category1 to /path_A/Category1/Category1: Invalid Argument
So I did some fiddling and hypothesized that the problem was caused by the command trying to move the directory Category1 into itself (maybe). I decided to exclude directories from the search so it would only attempt to move files. I came up with this:
find /path_A -name "*Category1*" \( ! -type d \) -exec mv {} /path_A/Category1 \;
This did move the files from their original location to where I wanted them, but it still gave me something like:
mv: /path_A/Category1/ABC-Category1-XYZ.txt and /path_A/Category1/ABC-Category1-XYZ.txt are identical
I'm no expert, so I could be wrong... but I believe the command is finding and moving the files from their original directory, then finding them again. The directory Category1 is a subdirectory of the starting point, /path_A, so I believe it is finding the files it just moved into Category1 and attempting to move them again.
Can anyone help me fix this issue?
You are creating new files that find then tries to process. The safest approach is to move them somewhere that is not under the /path_A tree you are searching with find.
Or you can use prune to ignore that directory if you don't have any other directory matching:
find /path_A -name '*Category1*' -prune -type f -exec mv {} /path_A/Category1/ \;
Although another post has been accepted, let me post a proper answer.
Would you please try:
find /path_A -name 'Category1' -prune -o -type f -name '*Category1*' -exec mv -- {} /path_A/Category1/ \;
-prune is an action rather than a test. It tells find not to descend into the directory matched by the conditions before it. In this case it excludes the directory Category1 from the search.
The following -o is a logical OR and may be read as "or else": whatever was not pruned is handled by the expression on its right. The order of the options makes a difference.
Please note that the first Category1 is the name of the directory to exclude and the second, *Category1*, is the pattern for the filenames to find.
If you are not sure which files are the result of find, try to execute:
find /path_A -name 'Category1' -prune -o -type f -name '*Category1*' -print
then tweak the options to see the change of output.
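If you would rather not experiment on the real files, the same command can be tried on a scratch copy first. This is only a sketch: the temporary directory stands in for /path_A, and the file names and the sub directory are invented.

```shell
# Scratch stand-in for /path_A: two matching files, one non-matching file.
path_a=$(mktemp -d)
mkdir "$path_a/Category1" "$path_a/sub"
touch "$path_a/ABC-Category1-XYZ.txt" \
      "$path_a/sub/DEF-Category1-UVW.txt" \
      "$path_a/other.txt"

# Skip the destination directory entirely; otherwise move every matching
# regular file into it.
find "$path_a" -name 'Category1' -prune -o -type f -name '*Category1*' \
    -exec mv -- {} "$path_a/Category1/" \;

# What ended up in the destination, and which regular files stayed outside it.
moved=$(ls "$path_a/Category1" | LC_ALL=C sort)
left=$(find "$path_a" -name 'Category1' -prune -o -type f -print)

printf 'moved:\n%s\n\nleft behind:\n%s\n' "$moved" "$left"
rm -rf "$path_a"
```

Because Category1 is pruned, find never sees the files after they have been moved, so mv is not asked to move them a second time.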

Find error - unknown primary or operator

find /Volumes/COMMON-LIC-PHOTO/STAGING/Completed -type d -maxdepth 2 -iname -iregex '.*_OUTPUT' -exec rsync -rtWv --stats --progress {} /Volumes/COMMON-LIC-PHOTO/ASPERA/ASPERA_STAGING/ \;
The code above is designed to look inside the directory Completed for any sub-directories with the phrase "_OUTPUT" (ignoring case, hence -iname) at the end of the directory name and copy what it finds to a new location, ASPERA_STAGING. I'm running the code in a .sh triggered by the launchd app LaunchControl whenever a new directory is moved to Completed (which could be part of the issue, because cron seems to be very picky).
It works about half the time, the other half it does nothing at all. An OUTPUT directory won't be copied. I can't find a pattern, it almost seems random. I've noticed in the debug log that it is giving me the following error:
find: .*_OUTPUT: unknown primary or operator
I've spent hours tinkering, trying to figure it out. I've followed a lot of suggestions found on here and other sites but so far nothing has worked. It obviously has something to do with it looking for the Output folders but I just can't get to the bottom of it.
As commenters have noticed, -iname requires a parameter, therefore the -iregex that follows is understood as that parameter and the parameter to -iregex is (mis)taken as an operator, hence your error message.
In your context, -iname and -iregex seem redundant, so your command should be either:
find /Volumes/COMMON-LIC-PHOTO/STAGING/Completed -type d -maxdepth 2 -iname '*_OUTPUT' -exec ... \;
or:
find /Volumes/COMMON-LIC-PHOTO/STAGING/Completed -type d -maxdepth 2 -iregex '.*_OUTPUT' -exec ... \;
(notice how the parameters to -iname and to -iregex slightly differ)
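The reason the parameters differ is that -iname takes a shell-style glob matched against the directory name alone, while -iregex takes a regular expression matched against the whole path. A minimal sketch of that, using a made-up scratch tree (job1, job2, job3 are invented names) in place of the STAGING/Completed volume:

```shell
staging=$(mktemp -d)
mkdir -p "$staging/job1/Photos_OUTPUT" \
         "$staging/job2/scans_output" \
         "$staging/job3/OUTPUT_raw"

# Glob on the last path component, case-insensitive.
by_name=$(find "$staging" -maxdepth 2 -type d -iname '*_OUTPUT' | sort)

# Regular expression on the entire path, case-insensitive.
by_regex=$(find "$staging" -maxdepth 2 -type d -iregex '.*_OUTPUT' | sort)

printf '%s\n' "$by_name"
rm -rf "$staging"
```

Both commands should select the same two directories here; OUTPUT_raw is skipped because the phrase is not at the end of its name.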

Bash script for removing specific file from certain subdirectories

On a unix server, I'm trying to figure out how to remove a file, say "example.xls", from any subdirectories that start with v0 ("v0*").
I have tried something like:
find . -name "v0*" -type d -exec find . -name "example.xls" -type f -exec rm {} \;
But I get errors. I have a solution but it works too well, i.e. it will delete the file in any subdirectory, regardless of its name:
find . -type f -name "example.xls" -exec rm -f {} \;
Any ideas?
You will probably have to do it in two steps, i.e. first find the directories and then the files. You can use xargs to do it in a single command line, like
find . -name "v0*" -type d | \
xargs -I[] \
find [] -name "example.xls" -type f -exec rm {} \;
The first find generates a list of matching directory names, and xargs then calls the second find once per name to locate the file within that directory. (-I makes xargs run the command once per input line.)
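One caveat with piping plain find output into xargs is that directory names containing spaces or quotes can get mangled. A possible variant, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions do), is sketched below. The scratch tree and its directory names are invented for the demonstration.

```shell
work=$(mktemp -d)
mkdir -p "$work/v01" "$work/v0 two/deep" "$work/other"
touch "$work/v01/example.xls" \
      "$work/v0 two/deep/example.xls" \
      "$work/other/example.xls"

# First find: NUL-terminated list of directories whose name starts with v0.
# Second find (run once per directory by xargs): remove example.xls below it.
find "$work" -type d -name 'v0*' -print0 |
    xargs -0 -I'[]' find '[]' -type f -name 'example.xls' -exec rm {} \;

# Only the copy outside the v0* directories should be left.
remaining=$(find "$work" -type f -name 'example.xls')
printf '%s\n' "$remaining"
rm -rf "$work"
```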
Try:
find -path '*/v0*/example.xls' -delete
This matches only files named example.xls that, somewhere in their path, have a parent directory whose name starts with v0.
Note that since find offers -delete as an action, it is not necessary to invoke the external executable rm.
Example
Consider this directory structure:
$ find .
.
./a
./a/example.xls
./a/v0
./a/v0/b
./a/v0/b/example.xls
./a/v0/example.xls
We can identify the files named example.xls that have a parent directory named v0*:
$ find -path '*/v0*/example.xls'
./a/v0/b/example.xls
./a/v0/example.xls
To delete those files:
find -path '*/v0*/example.xls' -delete
Alternative: find only those files directly under directory v0*
find -regex '.*/v0[^/]*/example.xls'
Using the above directory structure, this approach returns one file:
$ find -regex '.*/v0[^/]*/example.xls'
./a/v0/example.xls
To delete such files:
find -regex '.*/v0[^/]*/example.xls' -delete
Compatibility
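My tests were performed with GNU find. Of the options used here, -path is specified by POSIX; -regex and -delete are extensions, but both GNU find and the find on OSX support them. Leaving out the starting directory, as in the commands above, is a GNU extension and may not work with other implementations; if in doubt, write find . -path ... instead.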
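If you want to reproduce the example before deleting anything for real, the directory structure above can be rebuilt in a scratch location. A minimal sketch, with an explicit . as the starting point so that it does not rely on GNU find's default:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/v0/b"
touch "$demo/a/example.xls" "$demo/a/v0/example.xls" "$demo/a/v0/b/example.xls"

# Any example.xls with a v0* directory somewhere above it.
any_depth=$(cd "$demo" && find . -path '*/v0*/example.xls' | LC_ALL=C sort)

# Only example.xls sitting directly inside a v0* directory.
direct_only=$(cd "$demo" && find . -regex '.*/v0[^/]*/example.xls' | LC_ALL=C sort)

printf -- '-path:\n%s\n\n-regex:\n%s\n' "$any_depth" "$direct_only"
rm -rf "$demo"
```

Swap the listing for -delete only once the output is what you expect.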

Exclude specified directory when using `find` command

I have a directory which contains a number of files (no subdirectories). I wish to find these files. The following gets me close:
$ find docs
docs
docs/bar.txt
docs/baz.txt
docs/foo.txt
I don't want the directory itself to be listed. I could do this instead:
$ find docs -type f
docs/bar.txt
docs/baz.txt
docs/foo.txt
Using a wildcard seems to do the trick as well:
$ find docs/*
docs/bar.txt
docs/baz.txt
docs/foo.txt
My understanding is that these work in different ways: with -type, we're providing a single path to find, whereas in the latter case we're using wildcard expansion to pass several paths to find. Is there a reason to favour one approach over the other?
You have a UNIX tag, and your example has a *. Some versions of find have a problem with that.
If the directory has no subdirectories.
FYI.
Generally the first parameters to find have to be a directory or a list of directories
find /dir1 /dir2 -print
find is recursive, so it will follow each directory down, listing everything: symlinks, directories, pipes, and regular files. This can be confusing. -type limits your search
find /dir1 /dir2 -type f -print
You can also have find act on what it finds, for example have it rm files older than 30 days:
find /dir1 /dir2 -type f -mtime +30 -exec rm {} \;
Or give complete information
find /dir1 /dir2 -type f -mtime +30 -exec ls -l {} \;
find /dir1 /dir2 -type f -mtime +30 -ls # works on some systems
To answer your question: because find can be dangerous, ALWAYS fully specify each directory, file type, etc., when you are using a nasty command like rm. You might have forgotten your favorite directory is also in there. Or the one used to generate your paycheck. Using a wildcard is OK for just looking around.
Using *
find /path/to/files -type f -name 'foo*'
Put tics or quotes around strings with a star in them, so that the shell does not expand the star before find sees it.
find docs -type f
will get you a listing of every regular file in docs and in every subdirectory of docs
find docs/*
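will get you a listing of every file AND every subdirectory of docs, except that the shell's * does not expand to names beginning with a dot, so hidden entries directly under docs are left out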
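The difference is easiest to see when docs does contain a subdirectory and a dot file. The following is just a sketch with an invented layout (docs/sub and docs/.hidden are not part of the question):

```shell
root=$(mktemp -d)
mkdir -p "$root/docs/sub"
touch "$root/docs/foo.txt" "$root/docs/.hidden" "$root/docs/sub/bar.txt"

# One starting point; -type f keeps regular files only, at any depth.
files_only=$(cd "$root" && find docs -type f | LC_ALL=C sort)

# The shell expands docs/* first, so find receives docs/foo.txt and docs/sub
# as separate starting points; docs itself and docs/.hidden are never passed.
via_glob=$(cd "$root" && find docs/* | LC_ALL=C sort)

printf 'find docs -type f:\n%s\n\nfind docs/*:\n%s\n' "$files_only" "$via_glob"
rm -rf "$root"
```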

What does this bash script means

I've found the following line of code in a script. Could someone explain to me what this line of code means?
Basically, the purpose of this line is to find a set of files to archive. Since I am not familiar with bash scripts, it is difficult for me to understand this line of code.
_filelist=`cd ${_path}; find . -type f -mtime ${ARCHIVE_DELAY} -name "${_filename}" -not -name "${_ignore_filename}" -not -name "${_ignore_filename2}"`
Let's break it down:
cd ${_path} : changes to the directory stored in the ${_path} variable
find is used to find files based on the following criteria:
. : look in the current directory and recurse through all sub-directories
-type f: look for regular files only (not directories)
-mtime ${ARCHIVE_DELAY} : look for files last modified ${ARCHIVE_DELAY}*24 hours ago, counted in whole 24-hour periods with the remainder discarded (a leading + or - in the value means more or fewer periods than that)
-name "${_filename}": look for files which have name matching ${_filename}
-not -name "${_ignore_filename}" : do not find files which have name matching ${_ignore_filename}
-not -name "${_ignore_filename2}" : do not find files which have name matching ${_ignore_filename2}
All the files found are stored in a variable called _filelist.
The backticks (`...`) run the enclosed commands and substitute their output, which is then assigned to the variable.
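To get a feel for how -mtime selects files, something like the following can be run in a scratch directory. It is a sketch under a few assumptions: touch -d with a relative date is GNU touch syntax, the file names and the value 3 for ARCHIVE_DELAY are invented, and $(...) is used as the modern spelling of the backticks.

```shell
archive=$(mktemp -d)
touch "$archive/fresh.log"
touch -d '84 hours ago' "$archive/old.log"   # 3.5 days old (GNU touch syntax)
touch -d '84 hours ago' "$archive/old.tmp"

ARCHIVE_DELAY=3
_filename='*.log'

# Command substitution: run cd and find in a subshell, capture what find prints.
_filelist=$(cd "$archive" && find . -type f -mtime "$ARCHIVE_DELAY" -name "$_filename")

# With a leading +, the age in whole days must be greater than the number.
older_than_two=$(cd "$archive" && find . -type f -mtime +2 -name "$_filename")

printf '%s\n' "$_filelist"
rm -rf "$archive"
```

Only old.log is selected: fresh.log is too new, and old.tmp fails the -name test.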
Your script is assigning to $_filelist what you get by:
Changing directory to $_path
Finding in the current directory (.) files (-type f) where
Name is $_filename (a pattern, I suppose)
Name is not $_ignore_filename or $_ignore_filename2
I think you could as well change that to find ${_path} ... without the cd, but please try it out.
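One thing to watch for if you drop the cd: the paths find prints are relative to its starting point, so the contents of the variable change. A quick sketch with a scratch directory and an invented file name:

```shell
_path=$(mktemp -d)
touch "$_path/report.txt"

# With cd, find starts at "." and prints paths beginning with "./".
with_cd=$(cd "$_path" && find . -type f)

# Without cd, find starts at $_path and prints paths beginning with it.
without_cd=$(find "$_path" -type f)

printf 'with cd:    %s\nwithout cd: %s\n' "$with_cd" "$without_cd"
rm -rf "$_path"
```

Whatever consumes _filelist later in the script has to agree with whichever form is used.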
_filelist=`somecode`
makes the variable _filelist contain the output of the command somecode.
Somecode, in this case, is mostly a find command, which searches recursively for files.
find . -type f -mtime ${ARCHIVE_DELAY} -name "${_filename}" -not -name "${_ignore_filename}" -not -name "${_ignore_filename2}"
find .
searches the current directory, which the preceding cd has just changed to ${_path}.
-type f
matches only ordinary files (not dirs, sockets, ...)
-mtime
requires the modification time of the files to be ${ARCHIVE_DELAY} days ago
-name
explains itself: the name has to match "${_filename}"
-not -name
explains itself too, I guess: the name must not match the given pattern.
So the whole line sets the variable _filelist to the files found by some criteria: name, age, and type.

Resources