So far I've come up with this:
find . -name 'CVS' -type d -exec rm -rf {} \;
It's worked locally so far. Can anyone see any potential issues? I want this to recursively delete 'CVS' directories that were accidentally uploaded onto a server.
Also, how can I make it a script in which I can specify a directory to clean up?
Well, the obvious caveat: it'll delete directories named CVS regardless of whether they're actually CVS administrative directories.
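If you want to guard against that, a real CVS administrative directory contains Root, Repository and Entries files, so you could test for one of them before deleting. A rough sketch, assuming GNU find (which substitutes {} even when it's embedded in an argument):
# -prune keeps find from trying to descend into a CVS dir it is about to delete
find . -type d -name CVS -prune -exec test -f '{}/Root' \; -print0 | xargs -0 rm -rf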
You can turn it into a script fairly easily:
#!/bin/sh
if [ -z "$1" ]; then
echo "Usage: $0 path"
exit 1
fi
find "$1" -name 'CVS' -type d -print0 | xargs -0 rm -Rf
# or find … -exec like you have, if you can't use -print0/xargs -0
# print0/xargs will be slightly faster.
# or find … -exec rm -Rf '{}' + if you have reasonably modern find
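Save it under whatever name you like (clean-cvs.sh below is just a placeholder), make it executable, and pass it the directory to clean:
chmod +x clean-cvs.sh
./clean-cvs.sh /srv/www/site1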
Edit:
If you want to make it safer/more fool-proof, you could do something like this after the first if/fi block (there are several ways to write this):
⋮
case "$1" in
/srv/www* | /home)
true
;;
*)
echo "Sorry, can only clean from /srv/www and /home"
exit 1
;;
esac
⋮
You can make it as fancy as you want (for example, instead of aborting, it could prompt if you really meant to do that). Or you could make it resolve relative paths, so you wouldn't have to always specify a full path (but then again, maybe you want that, to be safer).
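For instance, a rough sketch of both ideas, assuming GNU readlink for the path resolution:
# Resolve a relative argument to an absolute path (GNU readlink -f)
target=$(readlink -f "$1")

# Prompt before doing anything destructive
printf 'Really delete all CVS directories under %s? [y/N] ' "$target"
read -r answer
if [ "$answer" != "y" ]; then
    echo "Aborted."
    exit 1
fi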
A simple way to do it would be:
find . -iname CVS -type d | xargs rm -rf
(As above, add -print0 and xargs -0 if paths may contain whitespace.)
I need to delete a file present in multiple directories if it is found, else ignore it. I tried the following snippet:
ls $dir/"$input.xml" 2> /dev/null
var = `echo$?`
if [[ $var == 0 ]]; then
echo -e "\n Deleting...\n"
rm $dir/"$input.xml"
It failed.
Can anyone suggest a better solution, or modify the above snippet to suit?
Not 100% sure what you mean by "delete a file present in multiple directories if it is found else ignore". Assuming that you simply want to delete some files that are somewhere under $dir, do this:
Use find to find the files, and pipe to xargs rm:
find "$dir" -type f -name "*.xml" | xargs rm
If your filename is likely to contain spaces then do this:
find "$dir" -type f -name "*.xml" -print0 | xargs -0 rm
To suppress the rm error message in case there are no files:
find "$dir" -type f -name "*.xml" -print0 | xargs -0 rm 2>/dev/null
To make your code work, insert a space:
`echo $?`
instead of
`echo$?`
(Note that `var = ...` is also broken: shell assignments must be written with no spaces around the =, i.e. var=....)
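That said, capturing $? via echo is unnecessary here; a minimal sketch of the same logic using a direct test (the variable names are the asker's):
if [ -e "$dir/$input.xml" ]; then
    echo -e "\n Deleting...\n"
    rm "$dir/$input.xml"
fi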
I cannot get the following piece of script (which is part of a larger backup script) to work correctly:
BACKUPDIR=/BACKUP/db01/physical/incremental # Backups base directory
FULLBACKUPDIR=$BACKUPDIR/full # Full backups directory
INCRBACKUPDIR=$BACKUPDIR/incr # Incremental backups directory
KEEP=5 # Number of full backups (and its incrementals) to keep
...
FIRST_DELETE=`expr $KEEP + 1` # add one to the number of backups to keep, this will be the first deleted
FILE0=`ls -ltr $FULLBACKUPDIR | awk '{print $9}' | tail -$FIRST_DELETE | head -1` # search for the first backup to be deleted
...
find $FULLBACKUPDIR -maxdepth 1 -type d ! -newer $FULLBACKUPDIR/$FILE0 -execdir echo "removing: "$FULLBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $FULLBACKUPDIR/$(basename {})' \; -execdir echo "removing: "$INCRBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $INCRBACKUPDIR/$(basename {})' \;
The find on its own works correctly and outputs something like this:
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-51-28
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-25-07
What I want is the -exec to echo a line showing what is being removed and then remove the folder from both directories.
I've tried various ways to get just the basename but nothing seems to be working. I get this:
removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/incr/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-25-07"
And of course the folders aren't deleted, because those paths don't exist; the rms just fail silently thanks to the -f option. If I remove the -f I get a 'cannot be found' error on each rm.
How do I accomplish this? Because backups and parts of backups may be stored across different storage systems I really need the ability to just get the folder name for use in any known path.
The $(basename {}) is expanded by the shell before find runs, turning removing: "$INCRBACKUPDIR/$(basename {}) into removing: "$INCRBACKUPDIR/{}; only after that does find substitute the matched path for {}.
One way around it is to pipe the command to bash:
-exec echo "echo \"removing: \\\"$INCRBACKUPDIR/\$(basename {})\\\"\" | bash" \;
There's a lot broken here.
All caps variables are by convention env vars and should not be used in scripts.
Using legacy backticks instead of $()
Parsing the output of ls (!)
Parsing the output of ls -l (!!!)
Expanding variables known to contain paths without full quotes.
All you absolutely need in order to improve this is to -exec bash properly, e.g.
-execdir bash -c 'filepath="$1" ; base=$(basename "$filepath") ; echo use $filepath and $base here' -- {} \;
But how about this instead:
#!/usr/bin/env bash
backup_base=/BACKUP/db01/physical/incremental
full_backup="$backup_base"/full
incremental_backup="$backup_base"/incr
keep=5
rm=echo
let n=0
while IFS= read -r -d $'\0' line ; do
    file="${line#* }"
    if [[ $n -lt $keep ]] ; then
        let n=n+1
        continue
    fi
    base=$(basename "$file")
    echo "removing: $full_backup/$base"
    "$rm" -rf -- "$full_backup"/"$base"
    echo "removing: $incremental_backup/$base"
    "$rm" -rf -- "$incremental_backup"/"$base"
done < <(find "$full_backup" -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' 2>/dev/null | sort -z -r -n -t. -k1,2)
Iterate over the files and directories immediately under the backup dir, skipping the 5 newest; delete entries matching the names of the rest from both the full and incremental dirs.
This is an essentially safe version, except of course for races between the check and the removal.
I have defined rm as echo to avoid accidental deletes; swap it back to rm for actual deletion once you're sure the script is correct.
The following bash script is slow when scanning for .git directories because it looks at every directory. If I have a collection of large repositories it takes a long time for find to churn through every directory, looking for .git. It would go much faster if it would prune the directories within repos, once a .git directory is found. Any ideas on how to do that, or is there another way to write a bash script that accomplishes the same thing?
#!/bin/bash
# Update all git directories below current directory or specified directory
HIGHLIGHT="\e[01;34m"
NORMAL='\e[00m'
DIR=.
if [ "$1" != "" ]; then DIR=$1; fi
cd "$DIR" > /dev/null; echo -e "${HIGHLIGHT}Scanning ${PWD}${NORMAL}"; cd - > /dev/null
for d in `find . -name .git -type d`; do
    cd "$d/.." > /dev/null
    echo -e "\n${HIGHLIGHT}Updating `pwd`$NORMAL"
    git pull
    cd - > /dev/null
done
Specifically, how would you use these options? For this problem, you cannot assume that the collection of repos is all in the same directory; they might be within nested directories.
top
    repo1
    dirA
    dirB
    dirC
        repo1
Check out Dennis' answer in this post about find's -prune option:
How to use '-prune' option of 'find' in sh?
find . -name .git -type d -prune
will speed things up a bit, as find won't descend into .git directories. But it still descends into the git working trees themselves, looking for other .git folders, and that could be a costly operation.
What would be cool is if find had some sort of lookahead pruning mechanism, where if a folder has a subfolder called .git, it would prune at that folder...
That said, I'm betting your bottleneck is the network operation git pull, not the find command, as others have posted in the comments.
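If the pulls do dominate, running them in parallel can help; a rough sketch assuming GNU xargs for -P (the 4 is arbitrary, and output from concurrent pulls will interleave):
find . -name .git -type d -prune -print0 |
    xargs -0 -n1 -P4 sh -c 'cd "$(dirname "$1")" && echo "Updating $PWD" && git pull' _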
Here is an optimized solution:
#!/bin/bash
# Update all git directories below current directory or specified directory
# Skips directories that contain a file called .ignore
HIGHLIGHT="\e[01;34m"
NORMAL='\e[00m'
function update {
    local d="$1"
    if [ -d "$d" ]; then
        if [ -e "$d/.ignore" ]; then
            echo -e "\n${HIGHLIGHT}Ignoring $d${NORMAL}"
        else
            cd "$d" > /dev/null
            if [ -d ".git" ]; then
                echo -e "\n${HIGHLIGHT}Updating `pwd`$NORMAL"
                git pull
            else
                scan *
            fi
            cd .. > /dev/null
        fi
    fi
    #echo "Exiting update: pwd=`pwd`"
}

function scan {
    #echo "`pwd`"
    for x in "$@"; do
        update "$x"
    done
}
if [ "$1" != "" ]; then cd $1 > /dev/null; fi
echo -e "${HIGHLIGHT}Scanning ${PWD}${NORMAL}"
scan *
I've taken the time to copy-paste the script in your question and compare it to the script in your own answer. Here are some interesting results:
Please note that:
I've disabled the git pull by prefixing it with an echo.
I've removed the color codes.
I've removed the .ignore file testing from the bash solution.
I've removed the unnecessary > /dev/null here and there.
I've removed the pwd calls in both.
I've added -prune, which was obviously lacking in the find example.
I've used while instead of for, which was also counterproductive in the find example.
I've considerably untangled the second example to get to the point.
I've added a test to the bash solution to NOT follow symlinks, to avoid cycles and to behave like the find solution.
I've added shopt to allow * to expand to dotted directory names too, matching the find solution's functionality.
Thus, we are comparing the find-based solution:
#!/bin/bash
find . -name .git -type d -prune | while read d; do
    cd "$d/.."
    echo "$PWD >" git pull
    cd "$OLDPWD"
done
With the bash builtin solution:
#!/bin/bash
shopt -s dotglob
update() {
    for d in "$@"; do
        test -d "$d" -a \! -L "$d" || continue
        cd "$d"
        if [ -d ".git" ]; then
            echo "$PWD >" git pull
        else
            update *
        fi
        cd ..
    done
}
update *
Note: builtins (the function and the for) are immune to the OS's ARG_MAX limit on launching processes, so the * won't break even on very large directories.
Technical differences between solutions:
The find-based solution uses C code to crawl the repositories; it:
has to load a new process for the find command.
will avoid .git contents but will crawl the working directories of git repositories, losing some time in those (and possibly finding more matching elements).
will have to chdir through several levels of sub-directories for each match and go back.
will have to chdir once in the find command and once more in the bash part.
The bash-based solution uses builtins (a near-C implementation, but interpreted) to crawl the repositories; note that it:
will use only one process.
will avoid the subdirectories of git working directories.
will only chdir one level at a time.
will only chdir once to both test and run the command.
Actual speed results between solutions:
I have a working development collection of git repositories on which I launched the scripts:
find solution: ~0.080s (bash chdir takes ~0.010s)
bash solution: ~0.017s
I have to admit that I wasn't prepared to see such a win from bash builtins. It became more apparent and normal after analyzing what's going on. To add insult to injury, if you change the shell from /bin/bash to /bin/sh (you must comment out the shopt line, and be prepared for it not to parse dotted directories), you fall to ~0.008s. Beat that!
Note that you can be more clever with the find solution by using:
find . -type d \( -exec /usr/bin/test -d "{}/.git" -a "{}" != "." \; -print -prune \
-o -name .git -prune \)
which effectively avoids crawling all the subdirectories of a found git repository, at the price of spawning a process for each directory crawled. The final find solution I came up with was around ~0.030s: more than twice as fast as the previous find version, but still about 2 times slower than the bash solution.
Note that /usr/bin/test is important to avoid the $PATH search, which costs time, and that I needed -o -name .git -prune and -a "{}" != "." because my main repository was itself a git sub-repository.
In conclusion, I won't be using the bash builtin solution, because it has too many corner cases for me (and my first test hit one of its limitations). But it was important for me to explain why it can be (much) faster in some cases; the find solution seems much more robust and consistent to me.
The answers above all rely on finding a ".git" directory. However, not all git repos have one (e.g. bare repos). The following command loops through all directories and asks git whether it considers each one to be a repository; if so, it prunes sub-dirs off the tree and continues.
find . -type d -exec sh -c 'cd "{}"; git rev-parse --git-dir 2> /dev/null 1>&2' \; -prune -print
It's a lot slower than the other solutions because it executes a command in each directory, but it doesn't rely on a particular repository structure. It could be useful for finding bare git repositories, for example.
I list all git repositories anywhere in the current directory using:
find . -type d -execdir test -d {}/.git \; -prune -print
This is fast since it stops recursing once it finds a git repository. (Although it does not handle bare repositories.) Of course, you can change the . to whatever directory you want. If you need, you can change the -print to -print0 for null-separated values.
To also ignore directories containing a .ignore file:
find . -type d \( -execdir test -e {}/.ignore \; -prune \) -o \( -execdir test -d {}/.git \; -prune -print \)
I've added this alias to my ~/.gitconfig file:
[alias]
repos = !"find -type d -execdir test -d {}/.git \\; -prune -print"
Then I just need to execute:
git repos
To get a complete listing of all the git repositories anywhere in my current directory.
For windows, you can put the following into a batch file called gitlist.bat and put it on your PATH.
@echo off
if {%1}=={} goto :usage
for /r %1 /d %%I in (.) do echo %%I | find ".git\."
goto :eof
:usage
echo usage: gitlist ^<path^>
Check out the answer using the locate command:
Is there any way to list up git repositories in terminal?
The advantages of using locate instead of a custom script are:
The search is indexed, so it scales
It does not require the use (and maintenance) of a custom bash script
The disadvantages of using locate are:
The db that locate uses is updated weekly, so freshly-created git repositories won't show up
Going the locate route, here's how to list all git repositories under a directory, for OS X:
Enable locate indexing (will be different on Linux):
sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.locate.plist
Run this command after indexing completes (might need some tweaking for Linux):
repoBasePath=$HOME
locate '.git' | egrep '.git$' | egrep "^$repoBasePath" | xargs -I {} dirname "{}"
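On most Linux distributions the same idea would look something like this (hedged: option spellings differ between mlocate and GNU locate):
sudo updatedb                    # refresh the locate database
locate -r '/\.git$' | xargs -I {} dirname "{}"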
This answer combines the partial answer provided by @Greg Barrett with my optimized answer above.
#!/bin/bash
# Update all git directories below current directory or specified directory
# Skips directories that contain a file called .ignore
HIGHLIGHT="\e[01;34m"
NORMAL='\e[00m'
export PATH=${PATH/':./:'/:}
export PATH=${PATH/':./bin:'/:}
#echo "$PATH"
DIRS="$( find "$#" -type d \( -execdir test -e {}/.ignore \; -prune \) -o \( -execdir test -d {}/.git \; -prune -print \) )"
echo -e "${HIGHLIGHT}Scanning ${PWD}${NORMAL}"
for d in $DIRS; do
    cd "$d" > /dev/null
    echo -e "\n${HIGHLIGHT}Updating `pwd`$NORMAL"
    git pull 2> >(sed -e 's/X11 forwarding request failed on channel 0//')
    cd - > /dev/null
done
In a script I'm working on, I want to rm -rf a folder hierarchy. But before doing so, I want to make sure the process can complete successfully. So, this is the approach I came up with:
#!/bin/bash
set -o nounset
set -o errexit
BASE=$1
# Check if we can delete the target base folder
echo -n "Testing write permissions in $BASE..."
if ! find $BASE \( -exec test -w {} \; -o \( -exec echo {} \; -quit \) \) | xargs -I {} bash -c "if [ -n "{}" ]; then echo Failed\!; echo {} is not writable\!; exit 1; fi"; then
    exit 1
fi
echo "Succeeded"
rm -rf $BASE
Now I'm wondering if there might be a better solution (more readable, shorter, reliable, etc).
Please note that I am fully aware of the fact that there could be changes in file access permissions between the check and the actual removal. In my use case, that is acceptable (compared to not doing any checks). If there was a way to avoid this though, I'd love to know how.
Are you aware of the -perm switch of find (if it is present in your version)?
find -perm -u+w
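If your find is GNU find, its -writable test (which checks via access(2) and so accounts for who you actually are, unlike -perm, which only inspects the mode bits) allows a much shorter version. A minimal sketch; note that, like the original, it checks the entries themselves rather than the write permission on their parent directories, which is what unlinking actually requires:
#!/bin/bash
set -o nounset
set -o errexit

base=$1

# Print the first non-writable path, if any, and stop searching there
unwritable=$(find "$base" ! -writable -print -quit)
if [ -n "$unwritable" ]; then
    echo "Failed! $unwritable is not writable!" >&2
    exit 1
fi
rm -rf "$base"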
I have a bunch of dirs, say a, b, c0, d, Z, foo, and so on.
I want to remove all the directories except foo, foo2, a and b.
Can anyone provide the syntax to do this in the shell?
Thanks
UPDATE
I just want to say Thank you to all of you for your responses!
echo `ls -1 -d */ | egrep -v '^(foo|foo2|a|b)/$'`
If you are satisfied with the output, replace echo with rmdir (or rm -r, if the directories still contain data).
Probably the easiest way:
mkdir ../tempdir
mv foo foo2 a b ../tempdir
rm *
mv ../tempdir/* .
rmdir ../tempdir
Please note that this also deletes all files in the current directory, not just the other directories.
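If you want to delete only directories and keep the files, the same trick works with a trailing slash, since the glob */ matches directories only:
mkdir ../tempdir
mv foo foo2 a b ../tempdir
rm -r -- */
mv ../tempdir/* .
rmdir ../tempdir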
You can use find on a complicated command line, but perhaps the simplest and, more importantly, safest way is to create a file that lists all of the directories you want to remove. Then use the file as input to rm like this:
find . -maxdepth 1 -type d > dirs_to_remove
Now edit the file and take out any directories you want to keep, then use rm:
rm -ir $(<dirs_to_remove)
Note the -i argument. It's optional and forces rm to ask you before deleting each file.
Also note the $(<filename) syntax, which is specific to bash and is equivalent to, but cheaper than, $(cat filename).
One of the more powerful ways to do this sort of trick is using find + grep + xargs:
DONT_REMOVE='a|b|c0|d|Z|foo'
find . -mindepth 1 -type d -print | egrep -v "^\./($DONT_REMOVE)$" | xargs rm -r
The only trick here is making sure the pattern matches only those you don't want to remove.
The above pattern only matches directories immediately under the current directory. You can make it more or less permissive, e.g.:
IF_PATH_IS_IMMEDIATE_SUBDIR="^\./($DONT_REMOVE)$"
IF_PATH_ENDS_IN="/($DONT_REMOVE)$"
IF_PATH_CONTAINS="/($DONT_REMOVE)(/.*)?$"
Then pass one of these in your egrep, e.g:
find . -mindepth 1 -type d -print | egrep -v "$IF_PATH_ENDS_IN" | xargs rm -r
To invert the choice (i.e. delete exactly those items instead), just remove the -v from the egrep.
One way:
find . -mindepth 1 -maxdepth 1 -type d \( ! -name "foo" -a ! -name "foo2" -a ! -name "a" -a ! -name "b" \) -delete # to remove files as well, remove -type d
Or try using extglob:
shopt -s extglob
rm -rf !(foo|foo2|a|b)/ # to delete files as well, remove the last "/"
And yes, this assumes you don't want anything in the current directory except those four directories.
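As always with rm -rf, previewing the expansion first costs nothing:
shopt -s extglob
echo !(foo|foo2|a|b)/    # lists exactly what the rm above would remove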