How to use multiple filepaths with spaces in find and grep - bash

In Bash, I am reading from a file the list of files matching a pattern to search in. After reading, the variable content will be something like this:
files="C:/Downloads/tutorial java*.txt D:/text materials*.java"
It is then used in find
find $files -type f -exec grep -PrnIi my-search-term --color=auto {} /dev/null \;
I tried escaping the space with '\' like this:
files="C:/Downloads/tutorial\ java*.txt D:/text\ materials*.java"
But that does not work. I cannot hard-code the list of files, as it needs to be read from a different file.

You are combining paths with the patterns to match in those paths. That's fine, but you would need to search basically the entire file system for files matching the full paths.
find / \( -path "C:/Downloads/tutorial java*.txt" -o -path "D:/text materials*.java" \) ...
If you want to store these in a variable, use an array, not a regular variable.
files=( "C:/Downloads/tutorial java*.txt" "D:/text materials*.java" )
patterns=(-path "${files[0]}")
for pattern in "${files[@]:1}"; do
    patterns+=(-o -path "$pattern")
done
find / \( "${patterns[@]}" \)
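Since the asker's patterns come from another file, here is a minimal sketch wiring the two together. The file name patterns.txt and the demo paths are invented for illustration, and bash 4+ is assumed for mapfile; the real search root would be / rather than the demo directory:

```shell
#!/usr/bin/env bash
# Hypothetical input file: one find -path pattern per line.
printf '%s\n' '*/Downloads/tutorial java*.txt' '*/text materials*.java' > patterns.txt

mapfile -t files < patterns.txt          # read lines into an array, spaces intact

# Build \( -path P1 -o -path P2 ... \) as an argument array.
patterns=(-path "${files[0]}")
for pattern in "${files[@]:1}"; do
    patterns+=(-o -path "$pattern")
done

# Demo tree; in the real case you would search / (or a narrower root).
mkdir -p 'demo/Downloads'
echo hello > 'demo/Downloads/tutorial java 1.txt'
find demo \( "${patterns[@]}" \) -type f
```

Because the patterns live in array elements, the embedded spaces never go through word splitting.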


Assign all file matches from a folder to a variable

I would like to search for a simple text query (inside of a directory named "textfiles") and based on the matches, assign the results to a variable in bash (as an array or list). This query should be case-insensitive, and the context is inside of a bash script (.sh file). The names I'd hope to see in the array are simply the filenames, not the full paths.
What I am trying:
myfiles=./textfiles/*text*.txt
This matches all files that have the word text in them, but not the word TEXT.
I've also tried
myfiles=(find textfiles -iname *text*)
...and...
myfiles=find textfiles -iname *text*
Is there a solution to this?
myfiles=$(find textfiles -iname '*text*' -exec basename '{}' \; 2>/dev/null)
Note how -exec allows you to perform powerful operations on the files find finds. Maybe you do not even need the array after all, and can do what you need to do right there in the -exec argument.
And be aware that the -exec argument may be a script or other executable of your own making...
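As a sketch of that idea, an inline script can be handed to -exec via sh -c; the file names below are made up. The trailing sh fills $0 and {} + appends the matches as positional arguments:

```shell
#!/usr/bin/env bash
mkdir -p textfiles
echo hello > 'textfiles/my text file.txt'

# -exec with an inline script: loop over the matches inside sh, spaces preserved.
find textfiles -type f -iname '*text*' -exec sh -c '
    for f in "$@"; do
        printf "found: %s\n" "${f##*/}"   # strip the directory part
    done
' sh {} +
```

The `${f##*/}` expansion plays the role of basename without spawning an extra process per file.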
# plain: word-splitting turns each whitespace-separated result into an array element
myfiles=($(find textfiles -iname '*text*'))
# if you write it like below, you get the result in myfiles as a single string
myfiles=$(find textfiles -iname '*text*')
# if you want to assign literal strings as an array, you write it the following way
myfiles=(abc def ijk)
But this imposes a problem: if there is a space in a file name or directory name, it will give you an incorrect result. A better solution would be
myfiles=()
while read -r fname; do
myfiles+=("$fname")
done < <(find . -type f)
As @Roadowl suggested, xargs can be a better alternative.
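A hedged variant that avoids the explicit loop entirely is to NUL-delimit the names and let mapfile collect them (mapfile -d '' needs bash 4.4+; the directory name here is invented):

```shell
#!/usr/bin/env bash
mkdir -p 'dir with space'
echo x > 'dir with space/file one.txt'

# NUL-delimited names survive spaces (and even newlines) in paths.
mapfile -d '' myfiles < <(find 'dir with space' -type f -print0)

printf 'count=%s\n' "${#myfiles[@]}"
printf 'first=%s\n' "${myfiles[0]}"
```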
There is more than one way to solve a problem.
Since you said explicitly in your posting that you want files containing text but not TEXT, you cannot do a case-insensitive search; you have to be case-sensitive:
myfiles=($(find -name '*text*' 2>/dev/null))
However, this would also return a file named x.text.y.TEXT.z. If you want to exclude this file (since you consider exclusion of TEXT more important than inclusion of text), you can do a
myfiles=($(find -name '*text*' '!' -name '*TEXT*' 2>/dev/null))

Adding prefixes to certain filenames in Unix

I need to create a script that will go through and add underscores to all files in multiple directories, ignoring the files that already have prefixes. For example, _file1, _file2, file3, file4 needs to look like _file1, _file2, _file3, _file4
I've got little to no knowledge of Unix scripting, so a simple explanation would be greatly appreciated!
You could use one liner like this:
find dir_with_files -regextype posix-extended -type f -regex '^.*\/[^_][^\/]*$' -exec rename -v 's/^(.*\/)([^_][^\/]*)$/$1_$2/' '{}' \;
where dir_with_files is the top-level directory under which you search for your files. It then finds files whose names do not start with _, and each of them is renamed.
Before making any changes, you can run rename with the -n -v parameters, which shows what operations would take place without actually executing them:
find dir_with_files -regextype posix-extended -type f -regex '^.*\/[^_][^\/]*$' -exec rename -v -n 's/^(.*\/)([^_][^\/]*)$/$1_$2/' '{}' \;
From the best Bash resource out there:
1. Create a glob which matches all of the relevant files.
2. Loop through all of the matching files.
3. Remove the underscore from the file name and save the result to a variable.
4. Prepend an underscore to the variable.
5. echo the original file name followed by the changed file name using proper quotes to check that they look sane (the quotes will not be printed by echo since they are syntax).
6. Use mv instead of echo to actually rename the files.
In addition:
If your mv supports -n/--no-clobber, use it to avoid the possibility of data loss in case you mess up
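A minimal sketch of those steps, assuming a flat demo directory (the name renames is invented); the echo line is the dry run, and mv -n does the actual rename:

```shell
#!/usr/bin/env bash
mkdir -p renames && cd renames
touch _file1 _file2 file3 file4

for f in *; do                 # 1./2. glob matches all relevant files, loop over them
    base=${f#_}                # 3. strip a leading underscore, if any
    new="_${base}"             # 4. prepend an underscore
    echo "$f -> $new"          # 5. sanity-check the mapping first
    [ "$f" = "$new" ] || mv -n -- "$f" "$new"   # 6. then rename for real
done
```

The `[ "$f" = "$new" ]` guard skips files that already carry the prefix, matching the "ignore files that already have prefixes" requirement.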

using find with variables in bash

I am new to bash scripting and need help:
I need to remove specific files from a directory. My goal is to find in each subdirectory a file called "filename.A" and remove all files that start with "filename" and have extension B,
that is: "filename01.B", "filename02.B" etc.
I tried:
B_folders="$(find /someparentdirectory -type d -name "*.B" | sed 's#\(.*\)/.*#\1#' | uniq)"
A_folders="$(find "$B_folders" -type f -name "*.A")"
for FILE in "$A_folders" ; do
A="${file%.A}"
find "$FILE" -name "$A*.B" -exec rm -f {} \;
done
Started to get problems when the directories name contained spaces.
Any suggestions for the right way to do it?
EDIT:
My goal is to find in each subdirectory (may have spaces in its name), files in the form: "filename.A"
if such a file exists:
check if "filename*.B" exists and remove it,
that is, remove: "filename01.B", "filename02.B" etc.
In bash 4, it's simply
shopt -s globstar nullglob
for f in some_parent_directory/**/filename.A; do
rm -f "${f%.A}"*.B
done
If the space is the only issue you can modify the find inside the for as follows:
find "$FILE" -name "$A*.B" -print0 | xargs -0 rm
man find shows:
-print0
True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
and xarg's manual
-0 Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end of file string, which is treated like any other argument. Useful when input items might contain white space, quote marks, or backslashes. The GNU find -print0 option produces input suitable for this mode.

Find a file and delete the parent level dir

How would it be possible to delete the parent dir (only one level above) in which a file is located, when the file is found with a find command like
find . -type f -name "*.root" -size 1M
which returns
./level1/level1_chunk84/file.root
So, I want to actually delete the level1_chunk84 dir recursively, for example.
Thanks
You can try something like:
find . -type f -name "*.root" -size 1M -print0 | \
xargs -0 -n1 -I'{}' bash -c 'fpath={}; rm -r ${fpath%%$(basename {})}'
find + xargs combo is very common. Please refer to man find and you will find a few examples showing how to use them together.
All I did here I simply added -print0 flag to your original find statement:
-print0
True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print
uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that
process the find output. This option corresponds to the -0 option of xargs.
Then piped out everything to xargs which serves as a helper to craft further commands:
- execute everything in bash subshell
- assign file path to a variable fpath={}
- extract dirname from your file path
${parameter%%word}
Remove matching suffix pattern. The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches a
trailing portion of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the
shortest matching pattern (the `%' case) or the longest matching pattern (the `%%' case) deleted. If parameter is @ or *, the
pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is
an array variable subscripted with @ or *, the pattern removal operation is applied to each member of the array in turn, and the
expansion is the resultant list.
- and finally remove recursively
Also there's a little shorter version of it:
find . -type f -name "*.root" -size 1M -print0 | \
xargs -0 -n1 -I'{}' bash -c 'fpath={}; rm -r ${fpath%/*}'
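Note that substituting {} directly into the bash -c string is fragile: a file name containing quotes or $( ) would be re-parsed by the inner shell. A hedged, safer sketch of the same idea collects the paths first and then deletes (demo tree invented, and -size dropped so the empty demo file matches):

```shell
#!/usr/bin/env bash
mkdir -p level1/level1_chunk84 level1/keepme
: > level1/level1_chunk84/file.root      # empty demo file

# Collect NUL-delimited paths; no shell re-parsing of the names happens.
mapfile -d '' roots < <(find . -type f -name '*.root' -print0)

for f in "${roots[@]}"; do
    rm -r -- "${f%/*}"                   # remove the file's parent directory
done

ls level1                                # only keepme should remain
```

Collecting before deleting also avoids removing directories while find is still traversing them.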

How to search for file names that contain a number at a specific location in bash

How do I ensure that there is a number after a file name in find? Conceptually:
find ./directory -name filename{number}.temp
If I enter
find ./directory -name filename'[0-9]'*.temp
it will give me file names of the form filename'[0-9]'text.temp as well.
find ./directory -regex '.*/filename[0-9][0-9]*\.temp'
Note that -regex matches on the whole path, not just the filename.
Older versions of Unix find don't do regular expressions or Kornshell-style globs. You can use either ? or * in your glob, but that's it. The find commands on Linux and Macs do have the -regex expression.
If your find command isn't gnu compatible and doesn't have the -regex parameter, you need to pipe the output to grep:
find ./directory -name 'filename*.temp' | grep '/filename[0-9][0-9]*\.temp$'
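A quick way to see the difference between the glob and the regex (file names invented for the demo):

```shell
#!/usr/bin/env bash
mkdir -p directory
touch directory/filename1.temp directory/filename12.temp directory/filename1x.temp

# The glob also matches filename1x.temp (text after the digit is allowed):
find directory -name 'filename[0-9]*.temp' | sort

echo ---

# The regex requires digits all the way up to .temp:
find directory -regex '.*/filename[0-9][0-9]*\.temp' | sort
```

The regex run should list only filename1.temp and filename12.temp, while the glob run lists all three.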
