Can't seem to crack this one.
I have a bash script to search a folder and exclude certain file types.
list=`find . -type f ! \( -name "*data.php" -o -name "*.log" -o -iname "._*" -o -path "*patch" \)`
I want to exclude files which start with dot-underscore (._), but the above just refuses to work.
Here's some more of the script, but I am still getting files copied which start with ._
O/S is CentOS 5.3
list=`find . -type f ! \( -name "*data.php" -o -name "*.log" -o -iname "._*" -o -path "*patch" \)`
for a in $list; do
    if [ ! -f "$OLDFOL$a" ]; then
        cp --preserve=all --parents $a $UPGFOL
        continue
    fi

    diff $a "$OLDFOL$a" > /dev/null
    if [[ "$?" == "1" ]]; then
        # exists & different so copy
        cp --preserve=all --parents $a $UPGFOL
    fi
done
First -- don't do it that way.
files="`find ...`"
splits names on whitespace, meaning that Some File becomes two files, Some and File. Even splitting on newlines is unsafe, as valid UNIX filenames can contain $'\n' (any character other than / and null is valid in a UNIX filename). Instead...
getfiles() {
    find . -type f '!' '(' \
            -name '*data.php' -o \
            -name '*.log' -o \
            -iname '._*' -o \
            -path '*patch' ')' \
        -print0
}
while IFS= read -r -d '' file; do
    if [[ ! -e "$orig_dir/$file" ]] ; then
        cp --preserve=all --parents "$file" "$dest_dir"
        continue
    fi
    if ! cmp -s "$file" "$orig_dir/$file" ; then
        cp --preserve=all --parents "$file" "$dest_dir"
    fi
done < <(getfiles)
The above does a number of things right:
It is safe against filenames containing spaces or newlines.
It uses cmp -s, not diff. cmp exits with status 0 if the files are identical and 1 as soon as it finds a difference, rather than calculating the full delta between the two files, and is thus far faster.
Read BashFAQ #1, UsingFind, and BashPitfalls #1 to understand some of the differences between this and the original.
Also -- I've validated that this correctly excludes filenames which start with ._ -- but the original version did too. Perhaps what you really want is to exclude filenames matching *._* rather than ._*?
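A quick sketch of that distinction, using hypothetical names:

# -name tests the basename only, wherever the file sits in the tree:
#   -name '._*'   excludes ./sub/._resource       (basename starts with ._)
#   -name '*._*'  also excludes ./sub/backup._old (basename merely contains ._)
find . -type f ! -name '*._*'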
Related
I have a list of videos that I'd like to convert. To build up the file list I need to work with, I use the following:
file_list=( $(find . \( \
-name '*.[Mm][Oo][Vv]' -o \
-name '*.[Aa][Vv][Ii]' -o \
-name '*[!-][!h][!v][!c][!1].mp4' \
\) -print) )
task_list=()
for i in "${!file_list[@]}" ; do
    m="${file_list[$i]}"
    n="${m%.*}-hvc1.mp4"
    if [[ ! -f "$n" ]] ; then
        task_list+=("$m")
    fi
done
Is there some way I might be able to fold this logic into find and get the file list in one pass, or am I stuck with this two-pass script, where I do a find and then loop over the results?
Maybe this works for you:
task_list=()
for file in *.[Mm][Oo][Vv] *.[Aa][Vv][Ii] *[!-][!h][!v][!c][!1].mp4 ; do
    n="${file%.*}-hvc1.mp4"
    if [[ ! -f "$n" ]] ; then
        task_list+=("$file")
    fi
done
It does not use find, but I suppose that is not a must.
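One caveat: plain globs only match in the current directory. If you need the recursion the original find gave you, bash 4's globstar option can supply it; a minimal sketch, assuming bash 4+:

shopt -s globstar nullglob   # ** recurses into subdirectories; nullglob drops unmatched patterns

task_list=()
for file in **/*.[Mm][Oo][Vv] **/*.[Aa][Vv][Ii] **/*[!-][!h][!v][!c][!1].mp4 ; do
    n="${file%.*}-hvc1.mp4"
    if [[ ! -f "$n" ]] ; then
        task_list+=("$file")
    fi
done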
for subj in `cat dti_list.txt`; do
    echo $subj
    find . -type f -iname '*306.nii' -execdir bash -c 'rename.ul "$subj" DTI_MAIN_AP.nii *.nii' \+
done
I have some trouble with a small bash script: when I use the rename.ul function, it adds the new name instead of replacing the old one.
Currently, the code adds DTI_MAIN_AP.nii in front of the old name.
My goal is to take each name from the subject list, use find to locate any directory containing a *306.nii file, and then use -execdir to run rename.ul and rename that file using the entries from dti_list.txt.
Any solution, or correction to get the code working, will be appreciated.
If you just want to rename the first file matching *306.nii in each directory to DTI_MAIN_AP.nii, that might look like:
find . -type f -iname '*306.nii' \
    -execdir sh -c '[ -e DTI_MAIN_AP.nii ] || mv "$1" DTI_MAIN_AP.nii' _ {} +
If, instead of matching on *306.nii, you want to iterate over names from dti_list.txt, that might look like:
while IFS= read -r filename <&3; do
    find . -type f -name "$filename" \
        -execdir sh -c '[ -e DTI_MAIN_AP.nii ] || mv "$1" DTI_MAIN_AP.nii' _ {} +
done 3<dti_list.txt
(dti_list.txt is newline-delimited text, so no -d '' is needed; reading on FD 3 keeps the loop from competing for stdin.)
References of note:
BashFAQ #1 (on reading files line-by-line)
UsingFind
I would like to get all the files from a directory which match a pattern and are not excluded by a .ignore file.
I've tried this command:
find . -name '*.js' | grep -Fxv .ignore
but find outputs paths like ./directory/file.js, while the format in my .ignore is the following:
*.min.js
directory/directory2/*
directory/file_56.js
So grep does not match anything.
Does anyone have an idea/clue of how to do this?
Update
So I've found something, but it's not completely working:
find . -name '*.js' -type f $(printf "! -name %s " $(cat .ignore | sed 's/\//\\/g')) | # keeps the path
sed 's/^\.\///' | # deleting './'
grep -Fxvf .ignore
It works (the files are excluded) for *.min.js and directory/file_56.js, but not for directory/directory2/*.
It looks like you're looking for a subset of the functionality supported by Git's .gitignore file:
args=()
while read -r pattern; do
    [[ ${#args[@]} -gt 0 ]] && args+=( '-o' )
    [[ $pattern == */* ]] && args+=( -path "./$pattern" ) || args+=( -name "$pattern" )
done < .ignore

find . -name '*.js' ! \( "${args[@]}" \)
The exclusion tests for find are built up in a Bash array first, which allows applying line-specific logic:
Note how a -path or -name test is used, depending on whether the pattern at hand from .ignore contains at least one / or not:
Patterns for -path tests are prefixed with ./ to match the paths output by find.
Patterns for -name are left as-is; a pattern such as *.min.js will therefore match anywhere in the subtree.
With your sample .ignore file, the above results in the following find command:
find . -name '*.js' ! \( \
-name '*.min.js' -o -path './directory/directory2/*' -o -path './directory/file_56.js' \
\)
I wrote this code a few months ago and didn't touch it again. Now I picked it up to complete it. This is part of a larger script to find all files with specific extensions, find which ones have a certain word, and replace every instance of that word with another one.
In this excerpt, ARG4 is the directory it starts looking at (it keeps going recursively).
ARG2 is the word it looks for.
ARG3 is the word that replaces ARG2.
ARG4="$4"
find -P "$ARG4" -type f -name '*.h' -o -name '*.C' \
-o -name '*.cpp' -o -name "*.cc" \
-exec grep -l "$ARG2" {} \; | while read file; do
echo "$file"
sed -n -i -E "s/"$ARG2"/"$ARG3"/g" "$file"
done
Like I said it's been a while, but I've read the code and I think it's pretty understandable. I think the problem must be in the while loop. I googled more info about "while read ---" but I didn't find much.
EDIT 2: See my answer down below for the solution.
I discovered that find wasn't working properly. It turns out that it's because of -maxdepth 0 which I put there so that the search would only happen in the current directory. I took it out, but then the output of find was one single string with all of the file names. They needed to be separate entities so that the while loop could read each one. So I rewrote it:
files=(`find . -type f \( -name "*.h" -o -name "*.C" -o \
                          -name "*.cpp" -o -name "*.cc" \) \
            -exec grep -l "$ARG1" {} \;`)

for i in "${files[@]}" ; do
    echo "$i"
    gsed -E -i "s/$ARG1/$ARG2/g" "$i"
done
I had to install GNU sed; the regular one just wouldn't accept the file names.
It's hard to say if this is the only issue, since you haven't said precisely what's wrong. However, your find command's -exec action is only being applied for *.cc files. If you want it to apply for any of those, it should look more like:
ARG4="$4"
find -P "$ARG4" -type f \( -name '*.h' -o -name '*.C' \
-o -name '*.cpp' -o -name "*.cc" \) \
-exec grep -l "$ARG2" {} \; | while read file; do
echo "$file"
sed -n -i -E "s/"$ARG2"/"$ARG3"/g" "$file"
done
Note the added \( and \) for grouping, which attach the -exec action to the result of all of the -name tests. I also dropped the -n from sed: combined with -i, it suppresses output while writing the (empty) result back, which would truncate your files.
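To see why the grouping matters, here is a minimal sketch (hypothetical patterns): find's implicit -a binds tighter than -o, so without parentheses the action attaches only to the final test.

# Without grouping, find parses the expression as:
#   ( -type f -a -name '*.h' )  -o  ( -name '*.C' -a -exec echo {} ; )
find . -type f -name '*.h' -o -name '*.C' -exec echo {} \;

# With grouping, the -exec applies to files matching either pattern:
find . -type f \( -name '*.h' -o -name '*.C' \) -exec echo {} \;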
A bunch of Word & Excel documents were being moved on the server when the process terminated before it was complete. As a result, we're left with several perfectly fine files that have a .tmp extension, and we need to rename these files back to the appropriate .xlsx or .docx extension.
Here's my current code to do this in Bash:
#!/bin/sh
for i in "$(find . -type f -name *.tmp)"; do
ft="$(file "$i")"
case "$(file "$i")" in
"$i: Microsoft Word 2007+")
mv "$i" "${i%.tmp}.docx"
;;
"$i: Microsoft Excel 2007+")
mv "$i" "${i%.tmp}.xlsx"
;;
esac
done
It seems that while this does search recursively, it only processes one file. If it finds an initial match, it doesn't go on to rename the rest of the files. How can I get this to loop correctly through the directories recursively, rather than stopping after a single file?
Try the find command like this:
while IFS= read -r -d '' i; do
    ft="$(file "$i")"
    case "$ft" in
        "$i: Microsoft Word 2007+")
            mv "$i" "${i%.tmp}.docx"
            ;;
        "$i: Microsoft Excel 2007+")
            mv "$i" "${i%.tmp}.xlsx"
            ;;
    esac
done < <(find . -type f -name '*.tmp' -print0)
Using <(...) is called process substitution; it is used here to feed the output of the find command to the loop.
Quote the filename pattern in find so the shell doesn't expand it before find sees it.
Use -print0 to get find output delimited by null characters, which allows space/newline characters in file names.
Use IFS= and -d '' to read the null-separated filenames; a short demonstration follows below.
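As a quick sketch of what the null delimiting buys you (hypothetical file name):

touch 'monthly report.tmp'              # a name containing a space
while IFS= read -r -d '' i; do
    printf 'found: [%s]\n' "$i"         # the whole name arrives intact, space included
done < <(find . -type f -name '*.tmp' -print0)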
I too would recommend using find. I would do this in two passes of find:
find . -type f -name \*.tmp \
    -exec sh -c 'file "{}" | grep -q "Microsoft Word 2007"' \; \
    -exec sh -c 'f="{}"; echo mv "$f" "${f%.tmp}.docx"' \;

find . -type f -name \*.tmp \
    -exec sh -c 'file "{}" | grep -q "Microsoft Excel 2007"' \; \
    -exec sh -c 'f="{}"; echo mv "$f" "${f%.tmp}.xlsx"' \;
Lines are split for readability. Note that the echo in front of mv makes these dry runs; remove it once the output looks right.
Each instance of find will search for .tmp files, then use -exec to test the output of file. This is similar to how you're doing it within the while loop in your shell script, only it's launched from within find itself. We're using the pipe to grep instead of your case statement.
The second -exec only gets run if the first one returned "true" (i.e. grep -q ... found something), and executes the rename in a tiny shell instance.
I haven't profiled this to see whether it would be faster or slower than a loop in a shell script. Just another way to handle things.
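For completeness, the same idea can be folded into a single pass by moving the case logic into one -exec; a sketch (again with echo as a dry run, and likewise unprofiled):

find . -type f -name '*.tmp' -exec sh -c '
    for f; do
        case $(file "$f") in
            *"Microsoft Word 2007"*)  echo mv "$f" "${f%.tmp}.docx" ;;
            *"Microsoft Excel 2007"*) echo mv "$f" "${f%.tmp}.xlsx" ;;
        esac
    done
' _ {} +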