Fast recursive grepping of svn working copy [duplicate]

Duplicate of: Exclude .svn directories from grep
I need to search all cpp/h files in an svn working copy for "foo", excluding svn's special folders completely. What is the exact command for GNU grep?

I use ack for this purpose; it's like grep but automatically knows how to exclude source-control directories (among other useful things).

grep -ir --exclude-dir=.svn foo *
run from the top of the working copy will do.
Omit the 'i' if you want the search to be case-sensitive.
If you want to check only .cpp and .h files, use
grep -ir --include=\*.{cpp,h} --exclude-dir=.svn foo *
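The braces are expanded by bash before grep ever runs, so the option is effectively repeated once per extension. A quick way to see this (assuming bash):
$ echo --include=\*.{cpp,h}
--include=*.cpp --include=*.h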

Going a little off-topic:
If you have a working copy with a lot of untracked files (i.e. not version-controlled) and you only want to search source controlled files, you can do
svn ls -R | xargs -d '\n' grep <string-to-search-for>
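For the original cpp/h question you can narrow the listing before grepping; a sketch, run from the working-copy root (assumes GNU xargs for -d):
svn ls -R | grep -E '\.(cpp|h)$' | xargs -d '\n' grep -n foo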

This is an RTFM. I typed 'man grep' and '/exclude' and got:
--exclude=GLOB
Skip files whose base name matches GLOB (using wildcard
matching). A file-name glob can use *, ?, and [...] as
wildcards, and \ to quote a wildcard or backslash character
literally.
--exclude-from=FILE
Skip files whose base name matches any of the file-name globs
read from FILE (using wildcard matching as described under
--exclude).
--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive
searches.
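Putting those options together for the original question (quoting the globs so the shell passes them to grep intact):
grep -rn --include='*.cpp' --include='*.h' --exclude-dir=.svn foo .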

I wrote this script which I've added to my .bashrc. It automatically excludes SVN directories from grep, find and locate.

I use these bash functions for grepping for content and files in svn trees. I find it faster and more pleasant to search from the command line (and use vim for coding) than in a GUI-based IDE:
s () {
    local PATTERN=$1
    local COLOR=$2
    shift; shift
    local MOREFLAGS=$*
    if ! test -n "$COLOR" ; then
        # is stdout connected to a terminal?
        if test -t 1; then
            COLOR=always
        else
            COLOR=none
        fi
    fi
    find -L . \
        -not \( -name .svn -a -prune \) \
        -not \( -name templates_c -a -prune \) \
        -not \( -name log -a -prune \) \
        -not \( -name logs -a -prune \) \
        -type f \
        -not -name \*.swp \
        -not -name \*.swo \
        -not -name \*.obj \
        -not -name \*.map \
        -not -name access.log \
        -not -name \*.gif \
        -not -name \*.jpg \
        -not -name \*.png \
        -not -name \*.sql \
        -not -name \*.js \
        -exec grep -iIHn -E --color=${COLOR} ${MOREFLAGS} -e "${PATTERN}" \{\} \;
}
# s foo | less
sl () {
    local PATTERN=$*
    # -R lets less render grep's color escape sequences
    s "$PATTERN" always | less -R
}
# like s but only lists the files that match
smatch () {
    local PATTERN=$1
    s "$PATTERN" always -l
}
# recursive search (filenames) - find file
f () {
    find -L . -not \( -name .svn -a -prune \) \( -type f -or -type d \) -name "$1"
}
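Typical usage, assuming the functions above have been sourced from .bashrc (the patterns are just illustrative):
$ s connect_timeout            # colorized recursive grep, svn dirs skipped
$ sl connect_timeout           # the same, paged through less
$ smatch connect_timeout       # only list the matching file names
$ f '*.tpl'                    # find files or directories by name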

Related

Bash: Find command with multiple -name variable [duplicate]

I have a find command that finds files with names matching multiple patterns given to the -name option:
find -L . \( -name "SystemOut*.log" -o -name "*.out" -o -name "*.log" -o -name "javacore*.*" \)
This finds required files successfully at the command line. What I am looking for is to use this command in a shell script and join this with a tar command to create a tar of all log files. So, in a script I do the following:
LIST="-name \"SystemOut*.log\" -o -name \"*.out\" -o -name \"*.log\" -o -name \"javacore*.*\" "
find -L . \( ${LIST} \)
This does not print the files that I am looking for.
First, why does this script not behave like the command? Once it does, can I combine it with cpio or similar to create a tar in one shot?
find never sees the patterns the way you typed them: the quotes embedded in an unquoted variable are not re-parsed by the shell, so they reach find as literal characters in the glob. This syntax works for me (using bash arrays):
LIST=( -name \*.tar.gz )
find . "${LIST[@]}"
Your example would become the following:
LIST=( -name SystemOut\*.log -o -name \*.out -o -name \*.log -o -name javacore\*.\* )
find -L . \( "${LIST[@]}" \)
eval "find -L . \( ${LIST} \)"
You could use an eval and xargs,
eval "find -L . \( $LIST \) " | xargs tar cf 1.tar
When you have a long list of file names you use, you may want to try the following syntax instead:
# List of file patterns
Pat=( "SystemOut*.log"
"*.out"
"*.log"
"javacore*.*" )
# Loop through each file pattern and build a 'find' string
find $startdir \( -name $(printf -- $'\'%s\'' "${Pat[0]}") $(printf -- $'-o -name \'%s\' ' "${Pat[#]:1}") \)
That method constructs the argument list sequentially from elements of an array, which tends to work better (at least in my recent experience).
You can use find's -exec option to pass the results to an archiving program:
find -L . \( .. \) -exec tar -Af archive.tar {} \;
LIST="-name SystemOut*.log -o -name *.out -o -name *.log -o -name javacore*.*"
The outer double quotes already protect the wildcards at assignment time, so you don't need to quote them again. Moreover, in
LIST="-name \"SystemOut*.log\""
the inner quotes are preserved, and find receives them as part of the argument.
Building -name list for find command
Here is a proper way to do this:
cmd=();for p in \*.{log,tmp,bak} .debug-\*;do [ "$cmd" ] && cmd+=(-o);cmd+=(-name "$p");done
Or
cmd=()
for p in \*.{log,tmp,bak,'Spaced FileName'} {.debug,.log}-\* ;do
[ "$cmd" ] && cmd+=(-o)
cmd+=(-name "$p")
done
You can dump your cmd array:
declare -p cmd
declare -a cmd=([0]="-name" [1]="*.log" [2]="-o" [3]="-name" [4]="*.tmp" [5]="-o"
[6]="-name" [7]="*.bak" [8]="-o" [9]="-name" [10]="*.Spaced FileName"
[11]="-o" [12]="-name" [13]=".debug-*" [14]="-o" [15]="-name" [16]=".log-*")
Now you can run
find [-L] [/path] \( "${cmd[@]}" \)
as in:
find \( "${cmd[@]}" \)
(Note: if no path is supplied, the current path . is the default.)
find /home/user/SomeDir \( "${cmd[@]}" \)
find -L /path -type f \( "${cmd[@]}" \)
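Tying this back to the original goal of tarring the matches, a sketch using the asker's patterns (assumes GNU find and GNU tar for -print0/--null; the archive name is just an example):
cmd=()
for p in 'SystemOut*.log' '*.out' '*.log' 'javacore*.*'; do
    [ "${#cmd[@]}" -gt 0 ] && cmd+=(-o)
    cmd+=(-name "$p")
done
find -L . \( "${cmd[@]}" \) -print0 | tar -cf logs.tar --null -T -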

find option available to omit leading './' in result

I think this is probably a pretty n00ber question but I just gotsta ask it.
When I run:
$ find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \)
and get:
./01.Adagio - Allegro Vivace.mp3
./03.Allegro Vivace.mp3
./02.Adagio.mp3
./04.Allegro Ma Non Troppo.mp3
why does find prepend a ./ to the file name? I am using this in a script:
fList=()
while read -r -d $'\0'; do
fList+=("$REPLY")
done < <(find . -type f \( -name "*.mp3" -o -name "*.ogg" \) -print0)
fConv "$fList" "$dBaseN"
and I have to use a bit of a hacky sed fix at the beginning of a for loop in the function fConv, accessing the array elements, to remove the leading ./. Is there a find option that would simply omit the leading ./ in the first place?
The ./ at the beginning of the file name is the path; the "." means the current directory.
You can use "sed" to remove it.
find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) | sed 's|./||'
I do not recommend doing this, though: since find can search through multiple directories, how would you know whether a found file is located in the current directory?
If you ask it to search under /tmp, the results will be on the form /tmp/file:
$ find /tmp
/tmp
/tmp/.X0-lock
/tmp/.com.google.Chrome.cUkZfY
If you ask it to search under . (like you do), the results will be on the form ./file:
$ find .
.
./Documents
./.xmodmap
If you ask it to search through foo.mp3 and bar.ogg, the result will be on the form foo.mp3 and bar.ogg:
$ find *.mp3 *.ogg
click.ogg
slide.ogg
splat.ogg
However, this is just the default. With GNU and other modern finds, you can modify how to print the result. To always print just the last element:
find /foo -printf '%f\0'
If the result is /foo/bar/baz.mp3, this will result in baz.mp3.
To print the path relative to the argument under which it's found, you can use:
find /foo -printf '%P\0'
For /foo/bar/baz.mp3, this will show bar/baz.mp3.
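Applied to the asker's read loop, %P drops the leading ./ without any sed post-processing (GNU find):
fList=()
while IFS= read -r -d '' f; do
    fList+=("$f")
done < <(find . -type f \( -name '*.mp3' -o -name '*.ogg' \) -printf '%P\0')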
However, you shouldn't be using find at all. This is a job for plain globs, as suggested by R Sahu.
shopt -s nullglob
files=(*.mp3 *.ogg)
echo "Converting ${files[*]}:"
fConv "${files[#]}"
find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) -exec basename "{}" \;
Having said that, I think you can use a simpler approach:
for file in *.mp3 *.ogg
do
if [[ -f $file ]]; then
# Use the file
fi
done
If your -maxdepth is 1, you can simply use ls:
$ ls *.mp3 *.ogg
Of course, that will pick up any directory with a *.mp3 or *.ogg suffix, but you probably don't have such a directory anyway.
Another option is to munge your results:
$ find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) | sed 's#^\./##'
This will remove all ./ prefixes, but not touch other file names. Note the ^ anchor in the substitution command.

Find and delete old files excluding some subdirectories

I have been searching for a while, but can't seem to find a succinct solution. I am trying to delete old files while excluding some subdirectories (passed via parm) and their child subdirectories.
The issue I am having is that when a subdirectory is itself older than the given duration (also passed via parm), the find command includes that subdirectory name in its results. In reality the remove won't be able to delete these subdirectories, because rm's default options don't remove directories.
Here is the find command generated by the script:
find /directory/ \( -type f -name '*' -o -type d \
-name subdirectory1 -prune -o -type d -name directory3 \
-prune -o -type d -name subdirectory2 -prune -o \
-type d -name subdirectory3 -prune \) -mtime +60 \
-exec rm {} \; -print
Here is the list of files (and subdirectories) brought back by the find command:
/directory/subdirectory1 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory2 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory3 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory51/file51
/directory/file1 with spaces
Apart from this, the script works fine, correctly excluding the files under these 3 subdirectories:
subdirectory1, subdirectory2 and subdirectory3.
Thank you.
The following command will delete only files older than 60 days.
You can exclude directories as shown in the example below; directories test1 and test2 will be excluded.
find /path/ -mtime +60 -type d \( -path ./test1 -o -path ./test2 \) -prune -o -type f -print0 | xargs -0 rm -f
Though it would be advisable to first check what's going to be deleted, using -print:
find /path/ -mtime +60 -type d \( -path ./test1 -o -path ./test2 \) -prune -o -type f -print
find /directory/ -type d \(
-name subdirectory1 -o \
-name subdirectory2 -o \
-name subdirectory3 \) -prune -o \
-type f -mtime +60 -print -exec rm -f {} +
Note that the AND operator (-a, implicit between two predicates if not specified) has precedence over the OR one (-o). So the above is like:
find /directory/ \( -type d -a \(
-name subdirectory1 -o \
-name subdirectory2 -o \
-name subdirectory3 \) -a -prune \) -o \
\( -type f -a -mtime +60 -a -print -a -exec rm -f {} + \)
Note that every file name matches the * pattern, so -name '*' is like -true and is of no use.
Using + instead of ; runs fewer rm commands (as few as possible, and each is passed several files to remove).
Do not use the code above on directories writable by others, as it is vulnerable to an attack whereby the attacker changes a directory into a symlink to another directory in between the time find traverses it and rm is called, tricking you into deleting any file on the filesystem. This can be alleviated by replacing the -exec part with -delete, or with -execdir rm -f {} \;, if your find supports them.
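For example, the safer -execdir form would look like this (a sketch; assumes a find with -execdir, such as GNU or BSD find):
find /directory/ -type d \( \
    -name subdirectory1 -o \
    -name subdirectory2 -o \
    -name subdirectory3 \) -prune -o \
    -type f -mtime +60 -print -execdir rm -f {} \;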
See also the -path predicate if you want to exclude a specific subdirectory1 instead of any directory whose name is subdirectory1.

Why does find . -not -name ".*" not exclude hidden files?

I want to ignore all hidden files, especially .git and .svn ones, when searching (and later replacing in) files, but I have found that the most basic way to exclude such hidden files, described in many online tutorials, doesn't work here:
find . -not -name ".*"
will also print hidden files.
The script I'm trying to write is
replace() {
if [ -n "$3" ]; then expr="-name \"$3\""; fi
find . -type f \( $expr -not -name ".*" \) -exec echo sed -i \'s/$1/$2/g\' {} \;
unset expr
}
The thing is that -not -name ".*" matches every file and directory whose name starts with anything but "."; however, it doesn't prune them from the search, so you'll still get matches from inside hidden directories. To prune paths, use -prune:
find $PWD -name ".*" -prune -o -print
(I use $PWD because otherwise the start of the search "." would also be pruned and there would be no output)
correct version
replace() {
if [ -n "$3" ]; then expr=-name\ $3; fi
find $PWD -name '.*' -prune -o $expr -type f -exec sed -i s/$1/$2/g {} \;
unset expr
}
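A slightly more robust sketch of the same idea, using a bash array so the optional -name pattern and the sed script survive word splitting (assumes GNU sed for -i):
replace() {
    local expr=()
    [ -n "$3" ] && expr=(-name "$3")
    find "$PWD" -name '.*' -prune -o "${expr[@]}" -type f -exec sed -i "s/$1/$2/g" {} \;
}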

bash: Filtering out directories and extensions from find?

I'm trying to find files modified recently with this
find . -mtime 0
Which gives me
en/content/file.xml
es/file.php
en/file.php.swp
css/main.css
js/main.js
But I'd like to filter out the en and es directories while grabbing everything else. In addition, I'd like to filter .swp files out of those results.
So I want to get back:
css/main.css
js/main.js
xml/foo.xml
In addition to every other file not within es/en and not ending in .swp
Properly, just in find:
find -mtime 0 -not \( -name '*.swp' -o -path './es*' -o -path './en*' \)
The -prune action prevents find from descending into the directories you wish to avoid:
find . \( -name en -o -name es \) -prune , -mtime 0 ! -name "*.swp"
Try this:
find . -mtime 0 | grep -v '^en' | grep -v '^es'
Adding the caret character at the beginning of the pattern given to grep anchors the match to the start of the line.
Update: following Chen Levy's comment(s), use the following instead of the above:
find . -mtime 0 | grep -v '^\./en' | grep -v '^\./es'
find is great, but its implementation differs across UNIX versions, so I prefer solutions that are easier to memorize and that use commands with more standard options:
find . -mtime 0 | grep -v '^en' | grep -v '^es' | grep -v '\.swp$'
The -v flag makes grep print only the lines that don't match the pattern.
The -regex option of find(1) (which on BSD find can be combined with the -E option to enable extended regular expressions) matches the whole file path as well:
find . -mtime 0 -not \( -name '*.swp' -o -regex '\./es.*' -o -regex '\./en.*' \)
find "$(pwd -P)" -mtime 0 -not \( -name '*.swp' -o -regex '.*/es.*' -o -regex '.*/en.*' \)
