I'm trying to parse the Android source directory and I need to extract all the directory names, excluding certain patterns. As you can see below, for now I have included only one directory in the exclude list, but I will be adding more.
The find command doesn't exclude the directory with name 'docs'.
The commented-out line works, but the other one doesn't. For easy debugging, I included -mindepth and -maxdepth, which I will remove later.
Any comments or hints on why it doesn't work?
#! /bin/bash
ANDROID_PATH=$1
root=/
EXCLUDES=( doc )
cd ${root}
for dir in "${EXCLUDES[@]}"; do
exclude_name_cmd_string=${exclude_name_cmd_string}$(echo \
"-not -name \"${dir}*\" -prune")
done
echo -e ${exclude_name_cmd_string}
custom_find_cmd=$(find ${ANDROID_PATH} -mindepth 1 -maxdepth 1 \
${exclude_name_cmd_string} -type d)
#custom_find_cmd=$(find ${ANDROID_PATH} -mindepth 1 -maxdepth 1 \
# -not -name "doc*" -prune -type d)
echo ${custom_find_cmd}
Building up a command string with possibly-quoted arguments is a bad idea. You get into nested quoting levels and eval and a bunch of other dangerous/confusing syntactic stuff.
Use an array to build the find; you've already got the EXCLUDES in one.
Also, the repeated -not and -prune seems weird to me. I would write your command as something like this:
excludes=()
for dir in "${EXCLUDES[@]}"; do
    excludes+=(-name "${dir}*" -prune -o)
done
find "${ANDROID_PATH}" -mindepth 1 -maxdepth 1 "${excludes[@]}" -type d -print
The upshot is, you want the argument to -name to be passed to find as a literal wildcard that find will expand, not a list of files returned by the shell's expansion, nor a string containing literal quotation marks. This is very hard to do if you try to build the command as a string, but trivial if you use an array.
Friends don't let friends build shell commands as strings.
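For anyone who wants to try the array approach without touching a real source tree, here is a minimal sketch. The scratch directory and the names in it (docs, documentation, src, build, out) are made up for the demonstration, and a second exclude prefix is added only to show that one -prune -o group per prefix is enough.

```shell
#!/bin/bash
# Hypothetical sandbox: these directory names are invented for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/docs" "$tmp/documentation" "$tmp/src" "$tmp/build" "$tmp/out"

EXCLUDES=(doc out)

# One "-name PATTERN -prune -o" group per excluded prefix.
excludes=()
for dir in "${EXCLUDES[@]}"; do
    excludes+=(-name "${dir}*" -prune -o)
done

# Each pattern reaches find as one literal argument (doc*, out*),
# so find does the wildcard matching, not the shell.
result=$(find "$tmp" -mindepth 1 -maxdepth 1 "${excludes[@]}" -type d -print | sort)
echo "$result"

rm -rf "$tmp"
```

Only the build and src directories should be listed; anything starting with doc or out is pruned.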
When I run your script (named fin.sh) as:
bash -x fin.sh $HOME/tmp
one of the lines of trace output is:
find /Users/jleffler/tmp -mindepth 1 -maxdepth 1 -not -name '"doc*"' -prune -type d
Do you see the single quotes around the double quotes? That's bash trying to be helpful. I'm guessing that your "doesn't work" problem is that you still get directories under doc* included in the output; other than that, it seems to work for me.
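The stray quotes in that trace can be reproduced without running find at all. The following is only an illustrative sketch (the variable names are mine, not from the script): it compares the words produced by expanding a command string with the words stored in an array.

```shell
#!/bin/bash
dir=doc

# String approach from the question: the double quotes become part of the data.
as_string="-not -name \"${dir}*\" -prune"

# Unquoted expansion splits on whitespace but does not re-parse the quotes.
# Globbing is switched off here only to keep the demo deterministic.
set -f
from_string=( $as_string )
set +f

# Array approach: quoting is interpreted once, when the array is built.
from_array=( -not -name "${dir}*" -prune )

printf 'string: <%s>\n' "${from_string[@]}"
printf 'array:  <%s>\n' "${from_array[@]}"
```

The third word from the string is "doc*" with the quote characters included, which is what find ends up matching against; the third word from the array is plain doc*.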
How to fix that?
...it seems you've found a way to fix that...I'm not sure I'd trust it with a Bourne shell (but the Korn shell seems to agree with Bash), but it looks like it might work with Bash. I'm pretty sure this is something that changed during the last 30 years or so, but it is hard to prove that; getting hands on the old code is not easy.
I also wonder whether you need repeated -prune options if you have repeated excluded directories; I'm not sufficiently familiar with -prune to be sure.
Found the problem. It's with the escape sequence in exclude_name_cmd_string.
Correct syntax should have been
exclude_name_cmd_string=${exclude_name_cmd_string}$(echo \
"-not -name ${dir}* -prune")
I have the following Makefile which should find all .tex files starting with prefix "slides" and then compile all these latex files:
TSLIDES = $(shell find . -maxdepth 1 -iname 'slides*.tex' -printf '%f\n')
TPDFS = $(TSLIDES:%.tex=%.pdf)
all: $(TPDFS)
$(TPDFS): %.pdf: %.tex
latexmk -pdf $<
However, I keep getting the error messages (I am pretty sure it used to work and am very confused why I am getting this error now...)
/usr/bin/find: paths must precede expression: `slides01-intro.tex'
/usr/bin/find: possible unquoted pattern after predicate `-iname'?
In the manual, I found this
NON-BUGS
Operator precedence surprises
The command find . -name afile -o -name bfile -print will never print
afile because this is actually equivalent to find . -name afile -o \(
-name bfile -a -print \). Remember that the precedence of -a is
higher than that of -o and when there is no operator specified
between tests, -a is assumed.
“paths must precede expression” error message
$ find . -name *.c -print
find: paths must precede expression
Usage: find [-H] [-L] [-P] [-Olevel] [-D ... [path...] [expression]
This happens because *.c has been expanded by the shell resulting in
find actually receiving a command line like this:
find . -name frcode.c locate.c word_io.c -print
That command is of course not going to work. Instead of doing things
this way, you should enclose the pattern in quotes or escape the
wildcard:
$ find . -name '*.c' -print
$ find . -name \*.c -print
But this does not help in my case, as I have used quotes to avoid shell expansion. Any idea how I can fix this? (I have also tried TSLIDES = $(shell find . -maxdepth 1 -iname 'slides*.tex') in the first line of my Makefile, but it exits with the same error.)
EDIT: I am on windows and use the git bash (which is based on mingw-64).
You should always make very clear up-front in questions using Windows, that you're using Windows. Running POSIX-based tools like make on Windows always requires a bit of extra work. But I'm assuming based on the mingw-w64 label that you are, in fact, on Windows.
I tried your example on my GNU/Linux system and it worked perfectly. My suspicion is that your version of GNU make is invoking Windows cmd.exe instead of a POSIX shell like bash. In Windows cmd.exe, the single-quote character ' is not treated like a quote character.
Try replacing your single quotes with double-quotes " and see if it works:
TSLIDES = $(shell find . -maxdepth 1 -iname "slides*.tex" -printf "%f\n")
I'm also not sure if the \n will be handled properly. But you don't really need it, you can just use -print (or even, in GNU find, leave it out completely as it's the default action).
I'm not a Windows person so the above might not help but it's my best guess. If not please edit your question and provide more details about the environment you're using: where you got your version of make, where you're running it from, etc.
I have arraylist of files and I am trying to use rm with xargs to remove files like:
dups=["test.csv","man.csv","teams.csv"]
How can I pass the complete dups array to find and delete these files?
I want to make changes below to make it work
find ${dups[@]} -type f -print0 | xargs -0 rm
Your find command is wrong.
# XXX buggy: read below
find foo bar baz -type f -print0
means look in the paths foo, bar, and baz, and print any actual files within those. (If one of the paths is a directory, it will find all files within that directory. If one of the paths is a file in the current directory, it will certainly find it, but then what do you need find for?)
If these are files in the current directory, simply
rm -- "${dups[@]}"
(notice also how to properly quote the array expansion).
If you want to look in all subdirectories for files with these names, you will need something like
find . -type f \( -name "test.csv" -o -name "man.csv" -o -name "teams.csv" \) -delete
or perhaps
find . -type f -regextype egrep -regex '.*/(test\.csv|man\.csv|teams\.csv)' -delete
though the -regex features are somewhat platform-dependent (try find -E instead of find -regextype egrep on *BSD/MacOS to enable ERE regex support).
Notice also how find has a built-in predicate -delete so you don't need the external utility rm at all. (Though if you wanted to run a different utility, find -exec utility {} + is still more efficient than xargs. Some really old find implementations didn't have the + syntax for -exec but you seem to be on Linux where it is widely supported.)
Building this command line from an array is not entirely trivial; I have proposed a duplicate which has a solution to a similar problem. But of course, if you are building the command from Java, it should be easy to figure out how to do this on the Java side instead of passing in an array to Bash; and then, you don't need Bash at all (you can pass this to find directly, or at least use sh instead of bash because the command doesn't require any Bash features).
I'm not a Java person, but from Python this would look like
import subprocess
command = ["find", ".", "-type", "f"]
prefix = "("
for filename in dups:
    command.extend([prefix, "-name", filename])
    prefix = "-o"
command.extend([")", "-delete"])
subprocess.run(command, check=True, encoding="utf-8")
Notice how the backslashes and quotes are not necessary when there is no shell involved.
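For what it's worth, the same construction is possible in Bash itself. This is only a sketch, assuming dups is a real Bash array (the syntax shown in the question is not valid Bash) and that your find supports -delete; the scratch directory and the keep.csv file are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical sandbox so that nothing real gets deleted.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b"
touch "$tmp/test.csv" "$tmp/a/man.csv" "$tmp/a/b/teams.csv" "$tmp/a/keep.csv"

dups=("test.csv" "man.csv" "teams.csv")

# Build: -o -name X -o -name Y -o -name Z
names=()
for f in "${dups[@]}"; do
    names+=(-o -name "$f")
done
# Drop the leading -o. Note: an empty dups array would leave "( )",
# which find rejects, so guard against that in real use.
names=("${names[@]:1}")

find "$tmp" -type f \( "${names[@]}" \) -delete

remaining=$(find "$tmp" -type f | sort)
echo "$remaining"

rm -rf "$tmp"
```

After the run, only keep.csv should be left in the scratch tree.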
I am looking for a way to use the find command to tell if a folder has no files in it. I have tried using the -empty flag, but since I am on macOS the system files the OS places in the directory such as .DS_Store cause find to not consider the directory empty. I have tried telling find to ignore .DS_Store but it still considers the directory not empty because that file is present.
Is there a way to have find exclude certain files from what it considers -empty? Also is there a way to have find return a list of directories with no visible files?
The -empty predicate is rather simple: it's true for a directory only if it has no entries other than . and .. (so a lone .DS_Store is enough to make it false).
Kind of an ugly solution, but you can use -exec to run another find in each directory which will implement your criteria for deciding what directories you want to include.
Below:
the outer find will execute sh -c for each directory in /starting/point
sh will execute another find with different criteria.
the inner find will print the first match and then quit
read will consume the output (if any) of the inner find. read will have an exit status of 0 only if the inner find printed at least one line, non-zero otherwise
if there was no output from the inner find, the outer find's -exec predicate will evaluate to false
since -exec is followed by -o, the following -print action will be executed only for those directories which do not match the inner find's criteria
find /starting/point \
-type d \( \
-exec sh -c \
'find "$1" -mindepth 1 -maxdepth 1 ! -name ".*" -print -quit | read -r line' \
sh {} \; \
-o -print \
\)
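The command above can be checked in a scratch directory. This is a rough sketch under a few assumptions: the directory names are invented, find supports -quit (as the command above already requires), and read is given a variable name because POSIX read expects at least one.

```shell
#!/bin/bash
# Hypothetical layout: one directory with only a dotfile, one with a
# visible file, and one with nothing at all.
tmp=$(mktemp -d)
mkdir -p "$tmp/only_hidden" "$tmp/has_file" "$tmp/truly_empty"
touch "$tmp/only_hidden/.DS_Store" "$tmp/has_file/notes.txt"

# Print directories that contain no visible (non-dot) entries.
visible_empty=$(
  find "$tmp" -mindepth 1 -type d \( \
    -exec sh -c \
      'find "$1" -mindepth 1 -maxdepth 1 ! -name ".*" -print -quit | read -r line' \
      sh {} \; \
    -o -print \
  \) | sort
)
echo "$visible_empty"

rm -rf "$tmp"
```

Here only_hidden and truly_empty should be reported, while has_file should not.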
Also note that 'find FOLDER -empty' is somewhat tricky. It produces output even if FOLDER contains files, as long as these are empty, because -empty matches empty regular files as well as empty directories.
Maybe not exactly what was asked, but I prefer the brute force approach if I want to avoid a no-match error on using FOLDER/*. In tcsh:
ls -d FOLDER/* >& /dev/null
if !($status) COMMANDS FOLDER/* ...
A variation of this might be usable here (like also using
ls -d FOLDER/.* | wc -l
and drawing the desired conclusions from the combined results).
I am trying to find all directories that start with a year in brackets, such as this:
[1990] Nature Documentary
and then rename them removing brackets and inserting a dash in between.
1990 - Nature Documentary
The find command below seems to find the results; however, I could not prefix the pattern with ^ to mark the start of the directory name, because with it no hits are returned.
I am pretty sure I need to use -exec or -execdir, but I am not sure how to store the found pattern and manipulate it.
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]] *'
With [p]rename:
-depth -exec prename -n 's/\[(\d{4})]([^\/]+)$/$1 -$2/' {} +
Drop -n if the output looks good.
Without it, you'd need a shell script with several hardly intelligible parameter expansions there:
-depth -exec sh -c '
for dp; do
yr=${dp##*/[} yr=${yr%%]*}
echo mv "$dp" "${dp%/*}/$yr -${dp##*/\[????]}"
done' sh {} +
Remove echo to apply changes.
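To make those parameter expansions a little less opaque, here they are applied step by step to a single sample path. This is just an illustration: the path is made up, and the opening bracket is escaped throughout for clarity.

```shell
#!/bin/bash
# Hypothetical sample path.
dp='./films/[1990] Nature Documentary'

yr=${dp##*/\[}          # remove everything through the last "/["
                        #   leaves: 1990] Nature Documentary
yr=${yr%%]*}            # remove everything from the first "]" onwards
                        #   leaves: 1990
rest=${dp##*/\[????]}   # remove everything through "/[", four characters and "]"
                        #   leaves: " Nature Documentary" (with the leading space)
parent=${dp%/*}         # remove the last path component
                        #   leaves: ./films

new="$parent/$yr -$rest"
echo "$new"
```

The result is ./films/1990 - Nature Documentary, which is the target name that mv receives in the loop above.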
You can use the rename command
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]\] *' | rename -n 's/\[(\d{4})\] ([\w,\s]+)$/$1 - $2/'
Note: The effect will not take place until you delete the -n option.
find . -type d
can be used to find all directories below some start point. But it returns the current directory (.) too, which may be undesired. How can it be excluded?
Not only the recursion depth of find can be controlled by the -maxdepth parameter, the depth can also be limited from “top” using the corresponding -mindepth parameter. So what one actually needs is:
find . -mindepth 1 -type d
POSIX 7 solution:
find . ! -path . -type d
For this particular case (.), golfs better than the mindepth solution (24 vs 26 chars), although this is probably slightly harder to type because of the !.
To exclude other directories, this will golf less well and requires a variable for DRYness:
D="long_name"
find "$D" ! -path "$D" -type d
My decision tree between ! and -mindepth:
script? Use ! for portability.
interactive session on GNU?
    exclude .? Throw a coin.
    exclude long_name? Use -mindepth.
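The two variants can be compared side by side in a scratch directory. A small sketch, assuming a find that supports -mindepth (GNU and the BSDs do); the directory names are invented.

```shell
#!/bin/bash
# Hypothetical sandbox.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b" "$tmp/c"

# Run both in a subshell so the caller's working directory is untouched.
with_mindepth=$(cd "$tmp" && find . -mindepth 1 -type d | sort)
with_not_path=$(cd "$tmp" && find . ! -path . -type d | sort)

echo "$with_mindepth"

rm -rf "$tmp"
```

Both variables should hold the same three lines (./a, ./a/b and ./c), with no bare . entry.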
I use find ./* <...> when I don't mind ignoring first-level dotfiles (the * glob doesn't match these by default in bash - see the 'dotglob' option in the shopt builtin: https://www.gnu.org/software/bash/manual/html_node/The-Shopt-Builtin.html).
eclipse tmp # find .
.
./screen
./screen/.testfile2
./.X11-unix
./.ICE-unix
./tmux-0
./tmux-0/default
eclipse tmp # find ./*
./screen
./screen/.testfile2
./tmux-0
./tmux-0/default
Well, a simple workaround as well (the solution was not working for me on windows git bash)
find * -type d
It might not be very performant, but gets the job done, and it's what we need sometimes.
[Edit]: As @AlexanderMills commented, it will not show hidden directories in the root location (e.g. ./.hidden), but it will show hidden subdirectories (e.g. ./folder/.hiddenSub). [Tested with git bash on Windows]
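The behaviour of the glob with hidden directories, and the effect of bash's dotglob option mentioned earlier, can be seen in a quick experiment. A sketch only, with invented directory names; it needs bash because of shopt.

```shell
#!/bin/bash
# Hypothetical sandbox with one hidden top-level directory and one
# hidden subdirectory.
tmp=$(mktemp -d)
mkdir -p "$tmp/folder/.hiddenSub" "$tmp/.hidden"

# Default globbing: * skips top-level names that start with a dot.
default_glob=$(
  cd "$tmp" || exit 1
  find ./* -type d | sort
)

# With dotglob set, * also matches dot names (but never . or ..).
with_dotglob=$(
  cd "$tmp" || exit 1
  shopt -s dotglob
  find ./* -type d | sort
)

echo "default: $default_glob"
echo "dotglob: $with_dotglob"

rm -rf "$tmp"
```

In the default case ./.hidden is missing while ./folder/.hiddenSub is still listed; with dotglob, ./.hidden appears as well.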
Pipe it to sed. Don't forget the -r option, which enables extended regular expressions.
find . -type d | sed -r '/^\.$/d'