bash: IF condition to remove files with a given pattern occurring within subfolders

As part of my bash routine, I am trying to add an IF condition which should remove all CSV files containing the pattern "filt" in their names:
# this is the folder containing all subfolders
results=./results
# looping all directories located in the $results
for d in "${results}"/*/; do
if [ -f "${d}"/*filt*.csv ]; check if csv file is present within dir $d
rm "${d}"/*filt*.csv # remove this csv file
fi
done
However, a version without the condition finds and removes the CSV files properly:
rm "${d}"/*filt*.csv
Executing my example with the IF, though, gives the following error:
line 27: [: too many arguments
where line 27 corresponds to the IF condition. How can it be fixed?
Alternatively, can I use something like the following, without any IF statement?
# find all CSV files matching the pattern within "${d}"
find "${d}" -type f -iname '*filt*.csv' -delete

You could use shopt -s nullglob and then skip the test and use rm -f "$d"/*filt*.csv directly. The nullglob option makes sure that a glob matching nothing expands to nothing (instead of staying as the literal pattern), and -f silences rm when there is nothing to remove.
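In the context of your loop, that would look like this (a minimal sketch, with results defined as in your script):
shopt -s nullglob
for d in "${results}"/*/; do
rm -f "$d"/*filt*.csv
done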
You could also skip the outer loop and simplify everything to
shopt -s nullglob
rm -f results/*/*filt*.csv
This could fail if the glob matches so many files that the maximum length of the argument list is exceeded. In that case, you're better off with find:
find results -name '*filt*.csv' -exec rm {} +
or, with GNU find:
find results -name '*filt*.csv' -delete
If there are subdirectories you want to skip, use -maxdepth 1. If there are directories matching the pattern, use -type f.
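Combining these suggestions to mirror the original layout (CSV files exactly one level down, regular files only; -delete assumes GNU find):
find results -mindepth 2 -maxdepth 2 -type f -name '*filt*.csv' -delete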

Related

find and delete empty files in directory and its subdirs without find

I am trying to make a bash script that finds and removes empty files in a directory, including subdirectories, without using the find command.
This is the part of the script that uses find, but I am unsure how to rewrite this line without it:
find . -type f -empty -delete
Try this code:
# enable recursive globstar matching
shopt -s globstar
# directory to delete files from
dir="/tmp"
# loop through files recursively
for f in "${dir}"/* "${dir}"/**/*; do
# only remove regular files that are empty
if [ -f "$f" ] && [ ! -s "$f" ]; then
# remove the empty file
rm "$f"
fi
done

Randomly select a file with a given extension and copy it to a given directory

I would like to write a simple bash script that randomly selects a .flac file from the /music/ folder (it will have to be recursive, because there are many subfolders within that folder) and copies that file to /test/random.flac.
This post is very close to what I want to do but I'm not sure how to change the script to do what I want.
I tried this:
ls /music | sort -R | tail -1 | while read file; do
cp $file /test/random.flac
done
But I'm missing how to tell ls to do a recursive search of all the .flac inside the subfolders.
You can shove all the files into an array, then select one at a random index to copy. This requires the globstar option to be enabled so you can use the **/* glob pattern:
shopt -s globstar
flacfiles=(/music/**/*.flac)
cp "${flacfiles[RANDOM % ${#flacfiles[#]}]}" /test/random.flac
To select a random index, we take $RANDOM modulo the number of elements in the flacfiles array.
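For instance, with a hypothetical four-element array:
arr=(one two three four)
echo "${arr[RANDOM % ${#arr[@]}]}" # prints one of the four elements, pseudo-randomly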
If you want an error message in case the glob doesn't match anything, you can use the failglob shell option:
shopt -s globstar failglob
if flacfiles=(/music/**/*.flac); then
cp "${flacfiles[RANDOM % ${#flacfiles[#]}]}" /test/random.flac
fi
This fails with an error message
-bash: no match: music/**/*.flac
in case there are no matching files and doesn't try to copy anything.
If you know for sure that there are .flac files, you can ignore this.
Recurse
Use find instead of ls:
find /music/ -iname '*.flac' -print0 | shuf -z -n1
The find part will find all flac files inside the directory /music/ and list them.
shuf shuffles that list and prints the first entry (which is random, because the list was shuffled).
-print0 and -z are there to use NUL (\0) for separating the file names. Without these options, \n (newline) would be used, which is unsafe because filenames can contain newlines (even though that is very uncommon).
Copy
If you want to copy just one random file, there's no need for a while loop. Use command substitution, $(), instead.
cp "$(find /music/ -iname '*.flac' -print0 | shuf -z -n1)" /test/random.flac

shell script iterate throw directories and split filenames

I need to extract two things from filenames - the extension and a number.
I have a folder, "/var/www/html/MyFolder/"; this folder contains a few more folders, and in each of those folders some files are stored.
The files have the following structure: "a_X_mytest.jpg" or "a_X_mytest.png".
The "a_" prefix is fixed and the same in each folder; I need the "X" and the file extension.
My script looks like this:
#!/bin/bash
for dir in /var/www/html/MyFolder/*/
do
dir=${dir%*/}
find "/var/www/html/MyFolder/${dir##*/}/a_*.*" -maxdepth 1 -mindepth 1 -type f
done
That's only the beginning from my script.
There is a mistake in my script:
find: `/var/www/html/MyFolder/first/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/sec/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/test/a_*.*': No such file or directory
Does anybody know where the mistake is?
The next step, when the lines above are working, is to split the found files and get the two parts.
To split, I would use this:
arrFIRST=(${IN//_/ })
echo ${arrFIRST[1]}
arrEXT=(${IN//./ })
echo ${arrEXT[1]}
Can anybody help me with my problem?
tl;dr:
Your script can be simplified to the following:
for file in /var/www/html/MyFolder/*/a_*.*; do
[[ -f $file ]] || continue
[[ "${file##*/}" =~ _(.*)_.*\.(.*)$ ]] &&
x=${BASH_REMATCH[1]} ext=${BASH_REMATCH[2]}
echo "$x"
echo "$ext"
done
A single glob (filename pattern, wildcard pattern) is sufficient in your case, because a glob can have multiple wildcards across levels of the hierarchy: /var/www/html/MyFolder/*/a_*.* finds files matching a_*.* in any immediate subfolder (*/) of folder /var/www/html/MyFolder.
You only need find to match files located on different levels of a subtree (but you may also need it for more complex matching needs).
[[ -f $file ]] || continue ensures that only regular files are considered; it also handles the case where the glob matches NOTHING, because the pattern then expands to itself, fails the -f test, and the loop body is skipped.
[[ ... =~ ... ]] uses bash's regex-matching operator, =~, to extract the tokens of interest from the filename part of each matching file (${file##*/}).
The results of the regex matching are stored in the special array variable BASH_REMATCH, with the element at index 1 containing what the 1st parenthesized subexpression ((...), a.k.a. capture group) captured, and so on.
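For example, with a hypothetical filename:
$ name="a_42_mytest.jpg"
$ [[ $name =~ _(.*)_.*\.(.*)$ ]] && echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
42 jpg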
Alternatively, you could have used read with an array to parse matching filenames into their components:
IFS='_.' read -ra tokens <<<"${file##*/}"
x="${tokens[0]}"
ext="${tokens[#]: -1}"
As for why what you tried didn't work:
find does NOT support globs in its path arguments, so it interprets "/var/www/html/MyFolder/${dir##*/}/a_*.*" literally, as a single filename.
Also, you have to separate the root folder for your search from the filename pattern to look for on any level of the root folder's subtree:
the root folder becomes the path argument
the filename pattern is passed (always quoted) via the -name or -iname (for case-insensitive matching) options
Ergo: find "/var/www/html/MyFolder/${dir##*/}" -name 'a_*.*' ..., analogous to @konsolebox's answer.
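A minimal corrected version of your loop could thus look like this (a sketch, keeping your paths):
for dir in /var/www/html/MyFolder/*/; do
find "$dir" -maxdepth 1 -type f -name 'a_*.*'
done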
I'm not sure about the needed complexity but perhaps what you want is
find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*'
Thus:
while IFS= read -r FILE; do
# Do something with "$FILE"...
done < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')
Or
readarray -t FILES < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')
for FILE in "${FILES[#]}"; do
# Do something with "$FILE"...
done

How to search for *~ as in anything ending with ~ in a bash script

I'm writing a Bash script and I need to find and move/delete all files with names ending in ~ or beginning and ending with #, that is file~ or #file#, emacs junk files.
I'm trying to use [ -f *~ ] && ( ... move or delete those files ... ) to determine whether any files of this kind exist before I try to do anything to them, so as not to get error messages from rm or mv if they don't find the files. However, this results in "binary operator expected". I think it has something to do with the fact that ~ is a unary operator. Is there a way to make it work as intended?
Nothing wrong with what you were doing originally for the current directory (it's not any slower than find), though it's not as one-liney.
#!/bin/bash
for file in *"~"; do
if [ -f "$file" ]; then
#do something with $file
fi
done
Also, "binary operator expected" is just coming from bash expecting a single argument for the "-f" operator, whereas *~ can expand to multiple arguments, e.g.
$ mkdir test && cd test
$ touch "1~"
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
Confirmed file ending in ~
$ touch {2..10}"~" && echo *"~"
1~ 10~ 2~ 3~ 4~ 5~ 6~ 7~ 8~ 9~
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
bash: [: too many arguments
$ if [ -f "arg1" "arg2"; then echo "Confirmed file ending in ~"; fi
bash: [: arg1: binary operator expected
The errors differ because of how [ parses its argument list: with exactly three arguments, as in [ -f arg1 arg2 ], it expects the middle one to be a binary operator, hence "binary operator expected"; with more arguments than that, it gives up with "too many arguments". So either error can result, depending on how many names the glob expands to.
Your problem stems from the fact that file-testing operators such as -f are not designed to be used with globbing patterns - only with a single, literal path.
You can simply let bash's path expansion (globbing) do the work:
Note: The approaches below are an alternative to using a loop (as demonstrated in @BroSlow's answer).
Simplest approach:
rm -f *'~' '#'*'#'
This removes all matching files, if any, and, if there are no matches, does nothing (and outputs nothing and reports exit code 0) - thanks to the -f option (tip of the hat to @chris).
Caveat: This also silently removes files marked as read-only, IF you have sufficient permissions to make them writable. In other words: if files match that you have intentionally marked as read-only, they will still get removed.
Also, if directories happen to match, they will NOT be removed, an error message will be displayed and the exit code will be 1 - matching files, however, are still removed.
At your own peril you may add -r to also quietly remove any matching directories (whether they're empty or not).
Using find, if explicitly ruling out directories is desired:
To avoid matching directories, you can use find, but to make it safe, the command gets lengthy:
# delete
find . -maxdepth 1 -type f \( -name '*~' -or -name '#*#' \) -delete
# move
find . -maxdepth 1 -type f \( -name '*~' -or -name '#*#' \) -exec mv {} /tmp/ \;
(Two general notes on find:
The path itself (., in this case) is by default included in the set of items (not a concern in this particular case due to excluding directories from matching) - to avoid that, add -mindepth 1.
Terminating the command passed to the -exec primary with + rather than \; is generally preferable, as find then substitutes as many matches as will safely fit for {}, resulting in much fewer invocations (typically just 1) of the command (assuming, of course, that your command can take argument lists of variable length) - this is similar to xargs' behavior.
Here's the catch: -exec only accepts commands terminated with + if {} is the command's last argument (and will otherwise fail with the misleading error message find: missing argument to '-exec').
Thus, in the case at hand + cannot be used as-is, because the mv command's last argument must be the target (but see the workaround after this note).
)
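One workaround, assuming GNU mv (whose -t option lets the target directory come first, so that {} can be the last argument):
find . -maxdepth 1 -type f \( -name '*~' -or -name '#*#' \) -exec mv -t /tmp/ {} +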
The shell will expand your *~ to a list of all files ending in ~. So if you have more than one of them, they will all be in the parameter list of -f, but -f handles only one parameter.
Try
find . -name "*~" -print | xargs rm
and read about the parameters to find if you want to stop it from recursing your whole directory structure.
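For example, to stay in the current directory and handle unusual filenames safely (-print0/-0 assume GNU find and xargs):
find . -maxdepth 1 -name '*~' -print0 | xargs -0 rm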
The find command is generally used for things of this nature. It even has a built-in -delete flag.
find -name '*~' -delete
or, with xargs (to move, for example)
# Moves files to /tmp using the replacement string specified with the -I flag
find -name '*~' -print0 | xargs -0 -I _ mv _ /tmp/
If you prefer to use xargs for deletion as well, you can do away with the use of -I
find -name '*~' -print0 | xargs -0 rm
Note the use of the -print0 and -0 flags to specify NUL-terminated paths. This allows paths with spaces to be handled properly. Without -0, a filename with spaces anywhere in it will be split and treated as separate (possibly invalid) paths.
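A quick hypothetical transcript showing the problem (the exact wording of rm's errors may vary):
$ touch 'a b~'
$ find . -name '*~' -print | xargs rm
rm: cannot remove './a': No such file or directory
rm: cannot remove 'b~': No such file or directory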

Iterate through subdirectories in bash

How can we iterate over the subdirectories of a given directory and get the files within those subdirectories in bash? Can I do that using the grep command?
This will go one subdirectory deep. The inner for loop will iterate over enclosed files and directories. The if statement will exclude directories. You can set options to include hidden files and directories (shopt -s dotglob).
shopt -s nullglob
for dir in /some/dir/*/
do
for file in "$dir"/*
do
if [[ -f $file ]]
then
do_something_with "$file"
fi
done
done
This will be recursive. You can limit the depth using the -maxdepth option.
find /some/dir -mindepth 2 -type f -exec do_something {} \;
Using -mindepth excludes files in the current directory, but it includes files in the next level down (and below, depending on -maxdepth).
Well, you can do that using grep:
grep -rl ^ /path/to/dir
But why? find is better.
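For instance, to list every file under a directory recursively:
find /path/to/dir -type f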
You are probably looking for find(1).
