I'm trying to understand what the following bash script snippet is doing.
The sequential bangs ('!') are the main thing tripping me up, and searching online doesn't seem to really yield anything useful.
for file in $(find $pwd/localroot -type f ! -path '*\.git*' ! -path '*README\.md' ! -path "*?scriptname"); do
It means "not". From the find(1) man page:
! expr
True if expr is false. This character will also usually need protection from interpretation by the shell.
There are implicit ands between each of the tests.
Find files: -type f
But not inside .git directories: ! -path '*\.git*'
And ignore README.md: ! -path '*README\.md'
And ignore ?scriptname: ! -path "*?scriptname", where ? is a single character.
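To see the chained negations in action, here is a minimal sketch. It assumes a throwaway directory made with mktemp, and the file names are invented for illustration; only two of the three exclusions from the question are used.

```shell
#!/bin/bash
# Scratch tree with hypothetical file names (illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/localroot/.git"
touch "$demo/localroot/.git/config" \
      "$demo/localroot/README.md" \
      "$demo/localroot/keep.txt"

# Every "! -path ..." test must be true for a file to be printed;
# find joins adjacent tests with an implicit -a (and).
found=$(cd "$demo" && find localroot -type f ! -path '*\.git*' ! -path '*README\.md')
echo "$found"

rm -rf "$demo"
```

Only the file that passes both negated tests is printed.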
Related
I would like a loop because I have a lot of exclusions. The issue is that I don't know how to use variables to do it.
Here is my code:
tabSearch=("20220514*" "20220508*" "20220515*")
find . \( ! -name "${tabSearch[0]}" -a ! -name "${tabSearch[1]}" \)
The idea is to add as many -name $variable tests as needed with a loop, but I have a syntax issue. Can you help me please?
Best to create the condition string separately, then insert it into the command.
Also best to keep each condition separate. find automatically treats adjacent conditions as 'and'ed unless you specify "-o".
#!/bin/sh
CONDS=""
for cond in '20220514*' '20220508*' '20220515*'
do
CONDS="${CONDS} \( ! -name '${cond}' \)"
done
eval find . ${CONDS} -print
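If you would rather avoid eval, one alternative that still works in plain sh is to collect the tests in the positional parameters. This is a sketch under assumptions: the scratch directory and the file names are made up, and it overwrites "$@", so it belongs in a script or function that no longer needs its own arguments.

```shell
#!/bin/sh
# Scratch directory with hypothetical file names to try the idea on.
demo=$(mktemp -d)
touch "$demo/20220514_a.log" "$demo/20220508_b.log" "$demo/20220601_c.log"

# Collect the tests in the positional parameters: one word per parameter,
# so the patterns are never re-parsed or globbed by the shell.
set --
for cond in '20220514*' '20220508*' '20220515*'
do
    set -- "$@" ! -name "$cond"
done

result=$(cd "$demo" && find . -type f "$@")
echo "$result"

rm -rf "$demo"
```

Because each pattern stays a separate word, no quoting has to survive a second round of parsing.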
I have a script that recursively searches all directories for specific files or specific file endings.
I want to save the paths of these files in a descriptor file.
It looks like this, for example:
./org/apache/commons/.../file1.pom
./org/apache/commons/.../file1.jar
./org/apache/commons/.../file1.zip
and so on.
In a blacklist, I describe which file endings I want to ignore.
! -path "./.cache/*" ! -path "./org/*" ! -name "*.sha1" ! -name"*.lastUpdated"
and so on.
Now I want to read this blacklist file during the search so that the described files are ignored:
find . -type f $(cat blacklist) > artifact.descriptor
Unfortunately, the blacklist is not applied during the search.
When:
echo "find . -type f $(cat blacklist) > artifact.descriptor"
Result is as expected:
find . -type f ! -path "./.cache/*" ! -path "./org/*" ! -name "*.sha1" ! -name"*.lastUpdated" > artifact.descriptor
But it does not work: the described files are not excluded.
I tried the following command and it works, but I want to know why it does not work with find alone.
find . -type f | grep -vf $blacklist > artifact.descriptor
Hopefully someone can explain it to me :)
Thanks a lot.
As tripleee suggests, it is generally considered bad practice to store a command in a variable because it does not handle all the corner cases.
However, you can use eval as a workaround:
/tmp/test$ ls
blacklist test.a test.b test.c
/tmp/test$ cat blacklist
-not -name *.c -not -name *.b
/tmp/test$ eval "find . -type f "`cat blacklist`
./test.a
./blacklist
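In your case I think it fails because the quotes in your blacklist file are treated as literal characters rather than as quoting the patterns: the output of $(cat blacklist) is split into words and globbed, but quotes inside it are not interpreted. I think it works if you remove them, but it is still probably not safe for other reasons (the unquoted patterns can be expanded by the shell if they happen to match files in the current directory).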
! -path ./.cache/* ! -path ./org/* ! -name *.sha1 ! -name *.lastUpdated
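A safer variant, sketched here under the assumption that you are free to change the blacklist format to one -name pattern per line (that format is my invention, not something from the question), is to read the file into a bash array and let find receive each pattern as its own word:

```shell
#!/bin/bash
# Scratch directory mirroring the /tmp/test example above.
demo=$(mktemp -d)
touch "$demo/test.a" "$demo/test.b" "$demo/test.c"

# Hypothetical blacklist format: one -name pattern per line, no quotes.
printf '%s\n' '*.c' '*.b' > "$demo/blacklist"

# Turn every non-empty line into the three words: ! -name PATTERN
args=()
while IFS= read -r pat; do
    if [ -n "$pat" ]; then
        args+=( ! -name "$pat" )
    fi
done < "$demo/blacklist"

result=$(cd "$demo" && find . -type f "${args[@]}" | sort)
echo "$result"

rm -rf "$demo"
```

No eval is involved, so nothing in the file is ever executed or globbed by the shell.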
I'm trying to run find, and exclude several directories listed in an array. I'm finding some weird behavior when it's expanding, though, which is causing me issues:
~/tmp> skipDirs=( "./dirB" "./dirC" )
~/tmp> bars=$(find . -name "bar*" -not \( -path "${skipDirs[0]}/*" $(printf -- '-o -path "%s/*" ' "${skipDirs[@]:1}") \) -prune); echo $bars
./dirC/bar.txt ./dirA/bar.txt
This did not skip dirC as I would have expected. The problem is that the quotes printed by printf around "./dirC/*" end up as literal characters in the pattern.
~/tmp> set -x
+ set -x
~/tmp> bars=$(find . -name "bar*" -not \( -path "${skipDirs[0]}/*" $(printf -- '-o -path "%s/*" ' "${skipDirs[@]:1}") \) -prune); echo $bars
+++ printf -- '-o -path "%s/*" ' ./dirC
++ find . -name 'bar*' -not '(' -path './dirB/*' -o -path '"./dirC/*"' ')' -prune
+ bars='./dirC/bar.txt
./dirA/bar.txt'
+ echo ./dirC/bar.txt ./dirA/bar.txt
./dirC/bar.txt ./dirA/bar.txt
If I try to remove the quotes in the $(print..), then the * gets expanded immediately, which also gives the wrong results. Finally, if I remove the quotes and try to escape the *, then the \ escape character gets included as part of the filename in the find, and that does not work either. I'm wondering why the above does not work, and, what would work? I'm trying to avoid using eval if possible, but currently I'm not seeing a way around it.
Note: This is very similar to: Finding directories with find in bash using a exclude list, however, the posted solutions to that question seem to have the issues I listed above.
The safe approach is to build your array explicitly:
#!/bin/bash
skipdirs=( "./dirB" "./dirC" )
skipdirs_args=( -false )
for i in "${skipdirs[@]}"; do
skipdirs_args+=( -o -type d -path "$i" )
done
find . \! \( \( "${skipdirs_args[@]}" \) -prune \) -name 'bar*'
I slightly modified the logic in your find since you had a slight (logic) error in there. Your command was:
find -name 'bar*' -not stuff_to_prune_the_dirs
How does find proceed? It walks the file tree, and only when it finds a file (or directory) that matches bar* does it apply the -not ... part. That's really not what you want: a directory such as ./dirB does not match bar*, so your -prune is never applied to it!
Look at this instead:
find . \! \( -type d -path './dirA' -prune \)
Here find will completely prune the directory ./dirA and print everything else. Now it's among everything else that you want to apply the filter -name 'bar*'. The order is very important! There's a big difference between this:
find . -name 'bar*' \! \( -type d -path './dirA' -prune \)
and this:
find . \! \( -type d -path './dirA' -prune \) -name 'bar*'
The first one doesn't work as expected at all! The second one is fine.
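The difference is easy to check on a scratch tree. The following is only a sketch: the directory layout is made up, and the variable names wrong and right are mine.

```shell
#!/bin/bash
# Scratch tree: two directories, each holding a bar.txt.
demo=$(mktemp -d)
mkdir -p "$demo/dirA" "$demo/dirB"
touch "$demo/dirA/bar.txt" "$demo/dirB/bar.txt"

# -name first: ./dirA itself fails -name 'bar*', so -prune is never reached
# and find still descends into it.
wrong=$(cd "$demo" && find . -name 'bar*' \! \( -type d -path './dirA' -prune \) | sort)

# prune first: ./dirA is cut off before -name is ever consulted.
right=$(cd "$demo" && find . \! \( -type d -path './dirA' -prune \) -name 'bar*' | sort)

printf 'name first:\n%s\n' "$wrong"
printf 'prune first:\n%s\n' "$right"

rm -rf "$demo"
```

With -name first, both bar.txt files are listed; with the prune first, only the one under dirB remains.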
Notes.
I'm using \! instead of -not as \! is POSIX, -not is an extension not specified by POSIX. You'll argue that -path is not POSIX either so it doesn't matter to use -not. That's a detail, use whatever you like.
You had to use some dirty trick to build your commands to skip your dir, as you had to consider the first term separately from the other. By initializing the array with -false, I don't have to consider any terms specially.
I'm specifying -type d so that I'm sure I'm pruning directories.
Since my pruning really applies to the directories, I don't have to include wildcards in my exclude terms. This is funny: your problem that seemingly is about wildcards that you can't handle disappears completely when you use find appropriately as explained above.
Of course, the method I gave really applies with wildcards too. For example, if you want to exclude/prune all subdirectories called baz inside subdirectories called foo, the skipdirs array given by
skipdirs=( "./*/foo/baz" "./*/foo/*/baz" )
will work fine!
The issue here is that the quotes you are using on "%s/*" aren't doing what you think they are.
That is to say, you think you need the quotes on "%s/*" to prevent the results from the printf from being globbed. However, that isn't what is happening. Try the same thing without the directory separator and with files that start and end with double quotes and you'll see what I mean.
$ ls
"dirCfoo"
$ skipDirs=( "dirB" "dirC" )
$ printf -- '%s\n' -path "${skipDirs[0]}*" $(printf -- '-o -path "%s*" ' "${skipDirs[@]:1}")
-path
dirB*
-o
-path
"dirCfoo"
$ rm '"dirCfoo"'
$ printf -- '%s\n' -path "${skipDirs[0]}*" $(printf -- '-o -path "%s*" ' "${skipDirs[@]:1}")
-path
dirB*
-o
-path
"dirC*"
See what I mean? The quotes aren't being handled specially by the shell. They just happen not to glob in your case.
This issue is part of why things like what is discussed at http://mywiki.wooledge.org/BashFAQ/050 don't work.
To do what you want here I believe you need to create the find arguments array manually.
sD=(-path /dev/null)
for dir in "${skipDirs[@]}"; do
sD+=(-o -path "$dir")
done
and then expand "${sD[@]}" on the find command line (-not \( "${sD[@]}" \) or so).
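Put together, that might look like the sketch below. Two things here are my assumptions rather than part of the snippet above: the scratch tree, and the "/*" appended to each directory so that the contents are excluded without relying on -prune. I also write ! where the text above says -not; the two are equivalent.

```shell
#!/bin/bash
# Scratch tree matching the question's layout.
demo=$(mktemp -d)
mkdir -p "$demo/dirA" "$demo/dirB" "$demo/dirC"
touch "$demo/dirA/bar.txt" "$demo/dirB/bar.txt" "$demo/dirC/bar.txt"

skipDirs=( "./dirB" "./dirC" )

# Seed with a test that can never match a path under ".",
# so every real entry can uniformly start with -o.
sD=( -path /dev/null )
for dir in "${skipDirs[@]}"; do
    sD+=( -o -path "$dir/*" )
done

# The quoted array expansion hands find one word per element,
# with no further globbing or splitting.
result=$(cd "$demo" && find . -name 'bar*' ! \( "${sD[@]}" \))
echo "$result"

rm -rf "$demo"
```

Only the bar.txt under dirA survives the filter.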
And yes, I believe this makes the answer you linked to incorrect (though the other answer might work, for files without whitespace etc., because of the array indirection that is going on).
I don't code in Bash daily. I'm trying to implement a small piece of functionality: the user defines an array of directories or files to omit in a find command. Unfortunately I have a problem with the shell expanding the asterisk and other metacharacters (* is expanded during concatenation). My code is:
excluded=( "subdirectory/a/*"
"subdirectory/b/*"
)
cnt=0
for i in "${excluded[@]}"; do
directories="$directories ! -path \"./${excluded[$cnt]}\""
cnt=$(($cnt+1))
done
echo "$directories"
for i in $(find . -type f -name "*.txt" $directories); do
new_path=$(echo "$i"|sed "s/\.\//..\/up\//g")
echo $new_path
done
Unfortunately, I still see excluded directories in results.
EDIT:
This is not a duplicate of an existing question. I'm not asking how to exclude directories in find. My problem is that metacharacters like "*" get expanded when I pass variables to the find command. E.g. I have an almost working solution below:
excluded=( "subdirectory/a/*"
"subdirectory/b/*"
)
cnt=0
for i in "${excluded[@]}"; do
directories="$directories ! -path ./${excluded[$cnt]}"
cnt=$(($cnt+1))
done
echo "$directories"
for i in $(find . -type f -name "*.txt" $directories); do
new_path=$(echo "$i"|sed "s/\.\//..\/up\//g")
echo $new_path
done
It works, but the problem appears when e.g. directory c contains more than one file. In that case, the asterisk is replaced by the full file paths. Consequently I get an error:
find: paths must precede expression: ./subdirectory/c/fgo.txt
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
Why? Because the asterisk has been expanded to full file names:
! -path ./subdirectory/a/aaa.txt ! -path ./subdirectory/b/dfdji.txt ! -path ./subdirectory/c/asd.txt ./subdirectory/c/fgo.txt
My question is: how to avoid such situation?
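You'll want to use the -prune switch in find.
Here's an example (I found this on Stack Overflow itself):
find . -type d \( -path ./dir1 -o -path ./dir2 -o -path ./dir3 \) -prune -o -print
This omits dir1, dir2 and dir3. Note that -path is matched against the whole path, including the leading ./ that comes from the starting point.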
Source: https://stackoverflow.com/a/4210072/1220089
My initial thought is to use double quotes to prevent the expansion:
for i in $(find . -type f -name "*.txt" "$directories" | sed 's#\./#\.\./up/#g'); do
but this of course fails, because the whole string then reaches find as a single argument. You can get the intended effect by storing each word as its own array element (untested):
excluded=( ! -path "./subdirectory/a/*"
! -path "./subdirectory/b/*"
)
for i in $(find . -type f -name "*.txt" "${excluded[@]}" | sed ...); do
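One way to keep the patterns away from the shell's globbing is to build the find arguments as an array with one word per element. Here is a sketch along those lines; the scratch tree reuses the file names from the error message in the question, the sed expression is my simplification of the original one, and a while read loop replaces the for loop so that paths are not word-split.

```shell
#!/bin/bash
# Scratch tree using the file names from the question's error message.
demo=$(mktemp -d)
mkdir -p "$demo/subdirectory/a" "$demo/subdirectory/b" "$demo/subdirectory/c"
touch "$demo/subdirectory/a/aaa.txt" "$demo/subdirectory/b/dfdji.txt" \
      "$demo/subdirectory/c/asd.txt" "$demo/subdirectory/c/fgo.txt"

excluded=( "subdirectory/a/*" "subdirectory/b/*" )

# Three array elements per exclusion: "!", "-path" and the quoted pattern.
args=()
for pat in "${excluded[@]}"; do
    args+=( ! -path "./$pat" )
done

# Rewrite the leading ./ of every surviving path to ../up/
result=$(cd "$demo" && find . -type f -name '*.txt' "${args[@]}" \
             | sed 's#^\./#../up/#' | sort)

# Read line by line instead of word-splitting a command substitution.
while IFS= read -r new_path; do
    echo "$new_path"
done <<< "$result"

rm -rf "$demo"
```

The asterisks reach find untouched, so directory c may contain any number of files.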
I need to find all of the TIFFs in a directory, recursively, but ignore some artifacts (basically all hidden files) that also happen to end with ".tif". This command:
find . -type f -name '*.tif' ! -name '.*'
works exactly how I want it on the command line, but inside a bash script it doesn't find anything. I've tried replacing ! with -and -not and, I think, just about every escaping permutation I can think of and/or recommended by the googlesphere, e.g. .\*, leaving out single quotes, etc. Obviously I'm missing something; any help is appreciated.
EDIT: here's the significant part of the script; the directory it's doing the find on is parameterized, but I've been debugging with it hard-coded; it makes no difference:
#!/bin/bash
RECURSIVE=1
DIR=$1
#get the absolute path to $DIR
DIR=$(cd $DIR; pwd)
FIND_CMD="find $DIR -type f -name '*.tif' ! -name '.*'"
if [ $RECURSIVE == 1 ]; then
FIND_CMD="$FIND_CMD -maxdepth 1"
fi
for in_img in $($FIND_CMD | sort); do
echo $in_img # for debugging
#stuff
done
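It was related to having the expression stored in a variable: when $FIND_CMD is expanded unquoted, the single quotes inside it are passed to find as literal characters instead of being removed by the shell, so the patterns never match. The solution was to use eval, which re-parses the string so that the quotes take effect. Everything is the same as above except at the start of the for loop: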
for in_img in $(eval $FIND_CMD | sort); do
#stuff
done
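An alternative that avoids eval is to keep the command in a bash array, so that each word (patterns included) is stored already parsed. This is a sketch, not the original script: the scratch directory and file names are invented, and -maxdepth is simply always on.

```shell
#!/bin/bash
# Scratch directory with hypothetical file names.
demo=$(mktemp -d)
touch "$demo/scan.tif" "$demo/.hidden.tif" "$demo/notes.txt"

# One array element per word: the quotes are consumed here, once,
# and the patterns are stored without them.
find_cmd=( find "$demo" -maxdepth 1 -type f -name '*.tif' ! -name '.*' )

# Expanding the quoted array runs the command with exactly those words.
result=$("${find_cmd[@]}" | sort)
echo "$result"

rm -rf "$demo"
```

Extra options can be appended conditionally with find_cmd+=( ... ) before the command is run.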
To not find hidden files you can use the following:
find . \( ! -regex '.*/\..*' \) -type f -name "*.tif"
Because -regex is matched against the whole path, the ! inside the parentheses rejects any path that has a component beginning with a dot: hidden files as well as anything inside hidden directories.