find with nested command reading blacklist - shell

I have a script that recursively searches all directories for specific files or file endings.
I want to save the paths of these files in a descriptor file.
It looks like this, for example:
./org/apache/commons/.../file1.pom
./org/apache/commons/.../file1.jar
./org/apache/commons/.../file1.zip
and so on.
In a blacklist, I describe which paths and file endings I want to ignore:
! -path "./.cache/*" ! -path "./org/*" ! -name "*.sha1" ! -name"*.lastUpdated"
and so on.
Now I want to read this blacklist file during the search so that the described files are ignored:
find . -type f $(cat blacklist) > artifact.descriptor
Unfortunately, the blacklist is not applied during the search.
When I echo the command:
echo "find . -type f $(cat blacklist) > artifact.descriptor"
the result is as expected:
find . -type f ! -path "./.cache/*" ! -path "./org/*" ! -name "*.sha1" ! -name"*.lastUpdated" > artifact.descriptor
But the actual search does not exclude the described files.
I tried the following command and it works, but I want to know why it does not work with find alone.
find . -type f | grep -vf $blacklist > artifact.descriptor
Hopefully someone can explain it to me :)
Thanks a lot.

As tripleee suggests, it is generally considered bad practice to store a command in a variable because it does not handle all the corner cases.
However, you can use eval as a workaround:
/tmp/test$ ls
blacklist test.a test.b test.c
/tmp/test$ cat blacklist
-not -name *.c -not -name *.b
/tmp/test$ eval "find . -type f "`cat blacklist`
./test.a
./blacklist
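In your case I think it fails because the quotes in your blacklist file are treated as literal characters rather than as quoting the patterns: quote removal is not applied to the result of a command substitution, so find receives the quote characters as part of each pattern. I think it works if you remove them, but it is still probably not safe, because the unquoted patterns can then be expanded by the shell if they happen to match files relative to the current directory.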
! -path ./.cache/* ! -path ./org/* ! -name *.sha1 ! -name *.lastUpdated
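A minimal sketch of one way to avoid both eval and the quoting problem, assuming bash 4 or later (for mapfile) and a blacklist stored with one find argument per line and no quotes. The file name blacklist.args and the demo files are made up for illustration; they are not from the question.

```shell
#!/usr/bin/env bash
# Build a throwaway tree so the sketch can be run anywhere.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p .cache org keep
touch .cache/x.jar org/y.pom keep/a.jar keep/a.jar.sha1 keep/b.lastUpdated

# Hypothetical blacklist format: ONE find argument per line, unquoted.
printf '%s\n' '!' '-path' './.cache/*' '!' '-path' './org/*' \
              '!' '-name' '*.sha1' '!' '-name' '*.lastUpdated' > blacklist.args

# Each line becomes exactly one array element; no word splitting, no globbing.
mapfile -t exclude < blacklist.args

# The quoted array expansion hands every element to find as a separate argument.
result=$(find . -type f ! -name blacklist.args "${exclude[@]}" | sort)
printf '%s\n' "$result"
```

Because the patterns never pass through unquoted expansion, the shell cannot expand the asterisks and find sees them exactly as written in the file.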

Related

jenkins bash script - remove directory path from find command results

I need to search in my dist directory for minified .js and .css files within jenkins.
I have a bash script with a find cmd, like this:
for path in $(/usr/bin/find dist/renew-online -maxdepth 1 -name "*.js" -or -name "*.css" -type f); do
# URL of the JavaScript file on the web server
url=$linkTarget/$path
echo "url=$linkTarget/$path"
Where linkTarget is: http://uat.xxxx.com/renew-online.
I want to append the minified files from dist/renew-online to the linkTarget,
for example:
http://uat.xxxx.com/renew-online/main-es2015.cf7da54187dc97781fff.js
But I keep getting: http://uat.xxxx.com/renew-online/dist/renew-online/main-es2015.cf7da54187dc97781fff.js
I've also tried -maxdepth 0 but can't get the correct URL (I'm a newbie at scripts).
Hopefully one of you can help. Thanks for your time.
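This can be achieved with the find command alone (-printf is a GNU find extension):
/usr/bin/find dist/renew-online -maxdepth 1 \( -name "*.js" -o -name "*.css" \) -type f -printf "$linkTarget/%f\n"
It is also recommended to group the -o ("or") alternatives inside escaped parentheses; otherwise the implicit "and" binds more tightly than -o, and -type f applies only to the second -name test.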
This is more a bash question than a Jenkins one, and there are multiple ways to do it.
If all your files are in a single directory, which you are in effect enforcing with -maxdepth, you can use cut:
for path in $(/usr/bin/find dist/renew-online -maxdepth 1 -name "*.js" -or -name "*.css" -type f | cut -d'/' -f3); do
On the other hand, the answer at https://serverfault.com/questions/354403/remove-path-from-find-command-output strips the path with -printf '%f\n'.
Please note as well that iterating over find output in a for loop is fragile; a while read loop is recommended instead, see https://github.com/koalaman/shellcheck/wiki/SC2044
EDIT
The field used in cut depends on how many directories there are in the path you pass to find (here the file name is the third field of dist/renew-online/<file>). The most accurate way is the one in the serverfault link above.
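A sketch of the while read loop that the ShellCheck page recommends, combined with stripping the directory part through parameter expansion. It assumes bash and a find/sort that support -print0 and -z (GNU tools do); the three demo files are invented, and linkTarget is the value given in the question.

```shell
#!/usr/bin/env bash
linkTarget="http://uat.xxxx.com/renew-online"   # value from the question

# Throwaway tree standing in for the real dist directory (illustrative only).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p dist/renew-online
touch dist/renew-online/main.js dist/renew-online/styles.css dist/renew-online/index.html

urls=()
# NUL-delimited names survive spaces and newlines in file names.
while IFS= read -r -d '' path; do
    # ${path##*/} removes everything up to the last slash, leaving the file name.
    urls+=("$linkTarget/${path##*/}")
done < <(find dist/renew-online -maxdepth 1 -type f \
            \( -name '*.js' -o -name '*.css' \) -print0 | sort -z)

printf '%s\n' "${urls[@]}"
```

The process substitution keeps the loop in the current shell, so the urls array is still populated after the loop ends.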

how to use find result as part of regex in bash shell

I want to find out which directories don't have a *.dsc file in them, so I tried:
find . -type d -exec ls {}/*.dsc
The output is:
ls: cannot access './abc/*.dsc': No such file or directory
I'm sure that there is a .dsc file in the abc/ directory.
It seems that "{}/*.dsc" is passed to ls as a literal string rather than being expanded as a pattern, so I tried again:
find . -type d|xargs -I {} ls {}/*.dsc
but the result is the same.
How can I get the command to work as I need?
Can you try this one:
find . ! -iname "*.dsc" -type d
!: negates the following test, so entries whose name matches *.dsc (case-insensitively) are skipped.
-type d: restricts the results to directories.
I wasn't able to use wildcards in find + ls, but the command below works.
find . -type d -not -path . | while read -r dir; do [ -f $dir/*\.dsc ] || echo $dir; done
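It tests, for each directory, whether a *.dsc file exists and echoes the directory otherwise. Note that the [ -f ... ] test only behaves correctly when the pattern matches at most one file; with several matches the test itself fails and the directory is echoed anyway.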
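A sketch of a variant that copes with several .dsc files per directory and with spaces in directory names. It assumes bash plus a find that supports -print0, -mindepth, -maxdepth and -quit (GNU find does); the demo directories are invented for illustration.

```shell
#!/usr/bin/env bash
# Throwaway tree: abc has two .dsc files, the other directories have none.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p abc def "with space"
touch abc/pkg.dsc abc/other.dsc def/readme.txt

missing=()
while IFS= read -r -d '' dir; do
    # Ask find for the first *.dsc directly inside $dir and stop at once.
    first=$(find "$dir" -maxdepth 1 -type f -name '*.dsc' -print -quit)
    # Empty output means the directory holds no .dsc file.
    [ -n "$first" ] || missing+=("$dir")
done < <(find . -mindepth 1 -type d -print0 | sort -z)

printf '%s\n' "${missing[@]}"
```

Letting find evaluate the pattern avoids shell glob expansion entirely, which is what went wrong in the -exec ls attempt.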

Not expanding asterisk by shell - excluding paths from find

I don't code in Bash daily. I'm trying to implement a small piece of functionality: the user defines an array of directories or files to omit from a find command. Unfortunately, I have a problem with the shell expanding the asterisk and other metacharacters (* is expanded during concatenation). My code is:
excluded=( "subdirectory/a/*"
"subdirectory/b/*"
)
cnt=0
for i in "${excluded[@]}"; do
directories="$directories ! -path \"./${excluded[$cnt]}\""
cnt=$(($cnt+1))
done
echo "$directories"
for i in $(find . -type f -name "*.txt" $directories); do
new_path=$(echo "$i"|sed "s/\.\//..\/up\//g")
echo $new_path
done
Unfortunately, I still see excluded directories in results.
EDIT:
This is not a duplicate of an existing question. I'm not asking how to exclude directories in find. My problem is the expansion of metacharacters like "*" when passing variables to the find command. E.g. I have an almost working solution below:
excluded=( "subdirectory/a/*"
"subdirectory/b/*"
)
cnt=0
for i in "${excluded[@]}"; do
directories="$directories ! -path ./${excluded[$cnt]}"
cnt=$(($cnt+1))
done
echo "$directories"
for i in $(find . -type f -name "*.txt" $directories); do
new_path=$(echo "$i"|sed "s/\.\//..\/up\//g")
echo $new_path
done
It works, but there is a problem when, e.g., directory c contains more than one file. In that case the asterisk is replaced by the full file paths, and I get an error:
find: paths must precede expression: ./subdirectory/c/fgo.txt
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
Why? Because the asterisk has been expanded to full file names:
! -path ./subdirectory/a/aaa.txt ! -path ./subdirectory/b/dfdji.txt ! -path ./subdirectory/c/asd.txt ./subdirectory/c/fgo.txt
My question is: how to avoid such situation?
You'll want to use the -prune switch in find.
Here's an example (I found this on Stack Overflow itself):
find . -type d \( -path dir1 -o -path dir2 -o -path dir3 \) -prune -o -print
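This omits dir1, dir2 and dir3. Note that -path matches against the path as find prints it, so with find . the patterns need to be written as ./dir1, ./dir2 and ./dir3.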
Source: https://stackoverflow.com/a/4210072/1220089
My initial thought is to use double quotes to prevent the expansion:
for i in $(find . -type f -name "*.txt" "$directories" | sed 's#\./#\.\./up/#g'); do
but this of course fails, because find then receives the whole string as a single argument. You can accomplish the same effect with an array (untested):
pre='! -path'
excluded=( $pre "./subdirectory/a/*"
$pre "./subdirectory/b/*"
)
for i in $(find . -type f -name "*.txt" "${excluded[@]}" | sed ...); do
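A runnable sketch of the array approach, assuming bash. Each find argument is stored as its own array element, so the patterns are never exposed to unquoted expansion. The directory layout mirrors the file names mentioned in the question; the args array name is made up for this example.

```shell
#!/usr/bin/env bash
# Throwaway tree using the file names from the question.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p subdirectory/a subdirectory/b subdirectory/c
touch subdirectory/a/aaa.txt subdirectory/b/dfdji.txt \
      subdirectory/c/asd.txt subdirectory/c/fgo.txt

excluded=( "subdirectory/a/*"
           "subdirectory/b/*"
)

# Build the find expression as an array: three elements per excluded pattern.
args=()
for pattern in "${excluded[@]}"; do
    args+=( ! -path "./$pattern" )
done

# Quoted "${args[@]}" passes every element to find unchanged.
result=$(find . -type f -name '*.txt' "${args[@]}" | sed 's#^\./#../up/#' | sort)
printf '%s\n' "$result"
```

The string-concatenation versions fail for the opposite reasons: embedded quotes reach find literally, while unquoted patterns are expanded by the shell before find runs.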

Why is find finding .git directories?

I have the following find command and I'm surprised to see .git directories being found. Why?
$ find . ! -name '*git*' | grep git
./.git/hooks
./.git/hooks/commit-msg
./.git/hooks/applypatch-msg.sample
./.git/hooks/prepare-commit-msg.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
Because find searches for files and none of the found files have the search pattern in their name (see the man page). You need to remove the offending directory via the -prune switch:
find . -path ./.git -prune -o -not -name '*git*' -print |grep git
See Exclude directory from find . command
[edit] An alternative without -prune (and much more natural imho):
find . -not -path "*git*" -not -name '*git*' |grep git
You're just seeing expected behaviour of find. The -name test is only applied to the filename itself, not the whole path. If you want to search everything but the .git directory, you can use bash(1)'s extglob option:
$ shopt -s extglob
$ find !(.git)
It doesn't really find those git-files. Instead it finds files under ./.git/ that match the pattern ! -name '*git*' which includes all files that don't include git in their filename (not path name).
find's -name test applies to the file name only, not to the path.
Try -iwholename instead of -name:
find . ! -iwholename '*git*'
This is what I needed:
find . ! -path '*git*'
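A small sketch that contrasts the three behaviours discussed in these answers on a made-up tree (the file names are invented for illustration): -name tests only the last path component, -path tests the whole path, and -prune stops find from descending at all.

```shell
#!/usr/bin/env bash
# Throwaway tree with a .git directory and one ordinary file whose name contains "git".
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p .git/hooks src
touch .git/config .git/hooks/commit-msg src/main.c src/gitlog.txt

# -name: ./.git itself is rejected, but its children have names without "git".
by_name=$(find . ! -name '*git*' | sort)

# -path: anything with "git" anywhere in its path is rejected, including src/gitlog.txt.
by_path=$(find . ! -path '*git*' | sort)

# -prune: only the ./.git subtree is skipped; src/gitlog.txt is still listed.
pruned=$(find . -path ./.git -prune -o -print | sort)

printf 'by -name:\n%s\n\nby -path:\n%s\n\nwith -prune:\n%s\n' \
       "$by_name" "$by_path" "$pruned"
```

Which variant fits depends on whether files that merely have "git" in their name elsewhere in the tree should be kept.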

bash `find` escaping

I need to find all of the TIFFs in a directory, recursively, but ignore some artifacts (basically all hidden files) that also happen to end with ".tif". This command:
find . -type f -name '*.tif' ! -name '.*'
works exactly how I want it on the command line, but inside a bash script it doesn't find anything. I've tried replacing ! with -and -not and--I think--just about every escaping permutation I can think of and/or recommended by the googlesphere, e.g. .\*, leaving out single quotes, etc. Obviously I'm missing something, any help is appreciated.
EDIT: here's the significant part of the script; the directory it's doing the find on is parameterized, but I've been debugging with it hard-coded; it makes no difference:
#!/bin/bash
RECURSIVE=1
DIR=$1
#get the absolute path to $DIR
DIR=$(cd $DIR; pwd)
FIND_CMD="find $DIR -type f -name '*.tif' ! -name '.*'"
if [ $RECURSIVE == 1 ]; then
FIND_CMD="$FIND_CMD -maxdepth 1"
fi
for in_img in $($FIND_CMD | sort); do
echo $in_img # for debugging
#stuff
done
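It was related to having the expression stored in a variable: when $FIND_CMD is expanded, the single quotes inside the string are passed to find literally instead of quoting the patterns. The solution was to use eval, which of course would be the right thing to do anyway. Everything is the same as above except at the start of the for loop: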
for in_img in $(eval $FIND_CMD | sort); do
#stuff
done
To not find hidden files you can use the following:
find . \( ! -regex '.*/\..*' \) -type f -name "*.tif"
-regex matches against the whole path, and the ! inside the parentheses rejects any path that has a component beginning with a dot, i.e. hidden files and anything below a hidden directory.
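A sketch of building the find command as a bash array instead of a string, which needs neither eval nor extra escaping. It assumes bash and a find/sort with -print0 and -z support. The demo files are invented, and the condition on RECURSIVE is an assumption: here -maxdepth 1 is added when recursion is switched off, which may or may not match the intent of the original script.

```shell
#!/usr/bin/env bash
# Throwaway tree: one visible TIFF, one hidden TIFF, one TIFF in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.tif" "$demo/.hidden.tif" "$demo/sub/b.tif" "$demo/notes.txt"

RECURSIVE=0
DIR=$demo

# Every argument is one array element, so the quotes do their normal job here.
find_args=( "$DIR" )
if [ "$RECURSIVE" != 1 ]; then
    find_args+=( -maxdepth 1 )      # assumption: limit depth only when not recursive
fi
find_args+=( -type f -name '*.tif' ! -name '.*' )

images=()
while IFS= read -r -d '' in_img; do
    images+=("$in_img")
done < <(find "${find_args[@]}" -print0 | sort -z)

printf '%s\n' "${images[@]}"
```

Placing -maxdepth before the tests also avoids the warning GNU find prints when a global option follows other expressions.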
