Exclude a string from wildcard search in a shell - bash

I am trying to exclude a certain string from a file search.
Suppose I have a list of files: file_Michael.txt, file_Thomas.txt, file_Anne.txt.
I want to be able to write something like
ls *<and not Thomas>.txt
to give me file_Michael.txt and file_Anne.txt, but not file_Thomas.txt.
The reverse is easy:
ls *Thomas.txt
Doing it with a single character is also easy:
ls *[^s].txt
But how to do it with a string?
Sebastian

You can use find to do this:
$ find . -name '*.txt' -a ! -name '*Thomas.txt'
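With the example files above, this would print (in directory order):
./file_Michael.txt
./file_Anne.txt
Note that find prefixes each result with the starting directory, here . (unlike a plain glob).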

With Bash
shopt -s extglob
ls !(*Thomas).txt
where the first line means "set extended globbing", see the manual for more information.
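With the example files, a session might look like this:
$ shopt -s extglob
$ ls !(*Thomas).txt
file_Anne.txt  file_Michael.txt
Besides !(pattern), extglob also enables ?(pattern), *(pattern), +(pattern), and @(pattern).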
Some other ways could be:
find . -type f \( -iname "*.txt" -a -not -iname "*thomas*" \)
ls *.txt | grep -vi "thomas"

If you are looping a wildcard, just skip the rest of the iteration if there is something you want to exclude.
for file in *.txt; do
case $file in *Thomas*) continue;; esac
: ... do stuff with "$file"
done

find command - get base name only - NOT with basename command / NOT with printf

Is there any way to get the basename in the command find?
What I don't need:
find /dir1 -type f -printf "%f\n"
find /dir1 -type f -exec basename {} \;
Why, you may ask? Because I need to continue using the found filename. I basically want something like this:
find . -type f -exec find /home -type l -name "*{}*" \;
But it uses ./file1, not file1, as the argument for -name.
Thanks for your help :)
If you've got Bash version 4.3 or later, try this Shellcheck-clean pure Bash code:
#! /bin/bash -p
shopt -s dotglob globstar nullglob
for path in ./**; do
[[ -L $path ]] && continue
[[ -f $path ]] || continue
base=${path##*/}
for path2 in /home/**/*"$base"*; do
[[ -L $path2 ]] && printf '%s\n' "$path2"
done
done
shopt -s ... enables some Bash settings that are required by the code (there is a short demo after this list):
dotglob enables globs to match files and directories whose names begin with a dot (.); find shows such files by default.
globstar enables the use of ** to match paths recursively through directory trees. globstar was introduced in Bash 4.0, but it was dangerous to use before Bash 4.3 (2014) because it followed symlinks when looking for matches.
nullglob makes globs expand to nothing when nothing matches (otherwise they expand to the glob pattern itself, which is almost never useful in programs).
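A quick interactive sketch of dotglob and nullglob (globstar is harder to show briefly), assuming a directory that contains only the files .hidden and visible:
$ printf '%s\n' *           # default: dotfiles are skipped
visible
$ shopt -s dotglob
$ printf '%s\n' *           # now dotfiles match too
.hidden
visible
$ shopt -s nullglob
$ printf '%s\n' *.md        # no match: the glob expands to nothing
                            # (without nullglob this would print the literal pattern *.md)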
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of ${path##*/}. That always works, even in some rare cases where $(basename "$path") doesn't.
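In short, ${path##*/} deletes the longest prefix matching the pattern */, leaving only the final path component:
path=/home/user/docs/report.txt
printf '%s\n' "${path##*/}"    # prints: report.txt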
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why I used printf instead of echo to output the found paths.
This solution works correctly if you've got files that contain pattern characters (?, *, [, ], \) in their names.
Spawn a shell and make the second call to find from there
find /dir1 -type f -exec sh -c '
for p; do
find /dir2 -type l -name "*${p##*/}*"
done' sh {} +
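A note on the trailing sh {} +: the bare sh becomes $0 inside the inner shell, and the pathnames gathered by {} + become its positional parameters, which is what for p; do iterates over.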
If your files may contain special characters in their names (like [, ?, etc.), you may want to escape them like this to avoid false positives
find /dir1 -type f -exec sh -c '
for p; do
esc=$(printf "%sx\n" "${p##*/}" | sed "s/[?*[\]/\\\&/g")
esc=${esc%x}
find /dir2 -type l -name "*$esc*"
done' sh {} +
You'll have to forward the name to another evaluator; there is no way to do that within find itself.
find . -type f -printf '%f\0' |
xargs -r0I{} find /home -type l -name '*{}*'
This answers your question about merging the functionality of %f and -exec find, and it is based on your example. Be aware, though, that your example injects raw filenames as -name patterns; avoid that and look at the other solutions instead.
Simply spawn a bash shell:
find /dir1 -type f -exec bash -c '
base=$(basename "$1")
echo "$base"
do_something_else "$base"
' bash {} \;
$1 in the bash part is each file filtered by find.

Use "rm" command with inverse match

I want to remove all files that do not match R1.fastq.gz in my list of files.
How do I use rm with inverse match?
Use the extended pattern syntax available in bash:
shopt -s extglob
printf '%s\n' !(R1.fastq.gz) # To verify the list of files the pattern matches
rm !(R1.fastq.gz) # To actually remove them.
Or, use find:
find . ! -name R1.fastq.gz -print # Verify
find . ! -name R1.fastq.gz -exec rm {} + # Delete
If your version of find supports it, you can use -delete instead of -exec rm {} +:
find . ! -name R1.fastq.gz -delete
You may want the "invert match" option, grep -v, which selects lines that do not match the regex, and then remove the files it lists.
Something like rm $(ls | grep -v -e "R1.fastq.gz") should do it.
Please note that this will erase all files in the current folder except R1.fastq.gz.
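Also, if the filenames might contain spaces or glob characters, the command substitution above can misbehave; a plain loop is safer. A minimal sketch:
for f in *; do
[ "$f" = R1.fastq.gz ] && continue   # skip the file we want to keep
rm -- "$f"                           # -- guards against names starting with -
done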

How to log variable in bash

for i in *.txt;
do
xxd -l 3 $i >> log
done
I also want to log file names $i for each result. E.g.:
file_name
result_of_command
You probably just need to use printf:
for f in *.txt; do
printf "%s: %s\n" "$f" "$(xxd -l 3 "$f")"
done >> log
I'm not totally clear what you are asking, but is this what you want?
for i in *.txt;
do
echo "$i" >> log
xxd -l 3 "$i" >> log
done
It's better to use find with the -exec option to run a command for every file matching certain criteria.
If you want all files in your current directory matching *.txt, you can use find. Use the -exec option to run a command for each file: {} is replaced with the name of the file, and \; (an escaped ;) terminates the command. You can use + instead of \; to tell find to replace {} with multiple filenames at once.
find . -maxdepth 1 -type f -name '*.txt' -exec xxd -l 3 {} \; >> log
Note that the above example includes hidden files; you can exclude them using a regex.
find . -maxdepth 1 -type f \( ! -regex '.*/\..*' \) -name '*.txt' -exec xxd -l 3 {} \; >> log
Also, if you're going to be globbing files in the current directory and using them in commands, always use ./*. Paths beginning with - are likely to be interpreted by your command as options.
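For example, in a hypothetical directory containing a file literally named -l:
$ touch -- -l other.txt
$ ls *        # expands to: ls -l other.txt  ("-l" is parsed as an option)
$ ls ./*      # expands to: ls ./-l ./other.txt  (unambiguous paths)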

How to find directories without dot in bash?

I'm trying to find folders without a dot symbol in their names.
I search for them in users' directories via this script:
#!/bin/bash
users=$(ls /home)
for user in $users;
do
find /home/$user/web/ -maxdepth 1 -type d -iname '*' ! -iname "*.*"
done
But the results still include folders with a dot, for example test.uk or test.cf.
What am I doing wrong?
Thanks in advance!
You can use find with the -regex option for that:
find /home/$user/web/ -maxdepth 1 -type d -regex '.*/[^.]*'
Since -regex tests the whole path, '.*/[^.]*' matches paths whose final component contains no dot.
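A quick sketch of that whole-path behavior (the paths are hypothetical):
$ find /home/alice/web/ -maxdepth 1 -type d -regex '.*/[^.]*'
/home/alice/web/
/home/alice/web/blog
Here /home/alice/web/test.uk would not be printed, because its final component contains a dot.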
The problem is that your command finds directories in /home/username/web/ where the directory doesn't contain a dot.
It does not check to see if username itself contains a dot.
To see if there's a dot anywhere in the path, you can use -ipath instead of -iname:
#!/bin/bash
users=$(ls /home)
for user in $users;
do
find /home/$user/web/ -maxdepth 1 -type d -iname '*' ! -ipath "*.*"
done
Or more correctly and succinctly:
#!/bin/bash
find /home/*/web/ -maxdepth 1 -type d ! -ipath "*.*"
No need for find; just use an extended glob to match any files not containing a dot:
shopt -s extglob
for dir in /home/*/;
do
printf '%s\n' "$dir"/!(*.*)
done
You could even do away with the loop entirely:
shopt -s extglob
printf '%s\n' /home/*/!(*.*)
To exclude directories in /home that contain a ., you can change /home/*/ to /home/!(*.*)/ in either example.

What's a more concise way of finding text in a set of files?

I currently use the following command, but it's a little unwieldy to type. What's a shorter alternative?
find . -name '*.txt' -exec grep 'sometext' '{}' \; -print
Here are my requirements:
limit to a file extension (I use SVN and don't want to be searching through all those .svn directories)
can default to the current directory, but it's nice to be able to specify a different directory
must be recursive
UPDATE: Here's my best solution so far:
grep -r 'sometext' * --include='*.txt'
UPDATE #2: After using grep for a bit, I realized that I like the output of my first method better. So, I followed the suggestions of several responders and simply made a shell script and now I call that with two parameters (extension and text to find).
grep has -r (recursive) and --include (to search only in files and directories matching a pattern).
If it's too unwieldy, write a script that does it and put it in your personal bin directory. I have a 'fif' script which searches source files for text, basically just doing a single find like you have here:
#!/bin/bash
set -f # disable pathname expansion
pattern="-iname *.[chsyl] -o -iname *.[ch]pp -o -iname *.hh -o -iname *.cc
-o -iname *.java -o -iname *.inl"
prune=""
moreargs=true
while $moreargs && [ $# -gt 0 ]; do
case $1 in
-h)
pattern="-iname *.h -o -iname *.hpp -o -iname *.hh"
shift
;;
-prune)
prune="-name $2 -prune -false -o $prune"
shift
shift
;;
*)
moreargs=false;
;;
esac
done
find . $prune $pattern | sed 's/ /\\ /g' | xargs grep "$@"
it started life as a single-line script and got features added over the years as I needed them.
This is much more efficient since it invokes grep many fewer times, though it's hard to say it's more succinct:
find . -name '*.txt' -print0 | xargs -0 grep 'sometext' /dev/null
Notes:
find -print0 and xargs -0 make pathnames with embedded blanks work correctly.
The /dev/null argument makes sure grep always prepends a filename.
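Why /dev/null helps: grep only prefixes matches with filenames when given more than one file to search, and xargs may invoke grep with a single pathname in its final batch. A sketch with a hypothetical notes.txt:
$ grep sometext notes.txt              # one file: no filename prefix
a line with sometext
$ grep sometext /dev/null notes.txt    # two files: prefix guaranteed
notes.txt:a line with sometext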
Install ack and use
ack -aG'\.txt$' 'sometext'
I second ephemient's suggestion of ack. I'm writing this post to highlight a particular issue.
In response to jgormley (in the comments): ack is available as a single file which will work wherever the right Perl version is installed (which is everywhere).
Given that on non-Linux platforms grep regularly does not accept -R, arguably using ack is more portable.
I use zsh, which has recursive globbing. If you needed to look at specific filetypes, the following would be equivalent to your example:
grep 'sometext' **/*.txt
If you don't care about the filetype, the -r option will be better:
grep -r 'sometext' *
However, a minor tweak to your original example will give you exactly what you want:
find . -name '*.txt' \! -wholename '*/.svn/*' -exec grep 'sometext' '{}' \; -print
If this is something you do frequently, make it a function (put this in your shell config):
function grep_no_svn {
find . -name "${2:-*}" \! -wholename '*/.svn/*' -exec grep "$1" '{}' \; -print
}
Where the first argument to the function is the text you're searching for. So:
$ grep_no_svn "sometext"
Or:
$ grep_no_svn "sometext" "*.txt"
You could write a script (in bash or whatever -- I have one in Groovy) and place it on the path. E.g.
$ myFind.sh txt targetString
where myFind.sh is:
find . -name "*.$1" -exec grep "$2" {} \; -print
I usually avoid "man find" by using grep $(find . -name "*.txt")
You say that you like the output of your method (using find) better. The only difference I can see between them is that grepping multiple files will put the filename on the front.
You can always (in GNU grep, but you must be using that or -r and --include wouldn't work) turn the filename off by using -h (--no-filename). The opposite, for anyone who does want filenames but has to use find for some other reason, is -H (--with-filename).
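For instance, with the recursive grep form from the question (the files and matches shown are hypothetical):
$ grep -r --include='*.txt' -H 'sometext' .   # filename:match (the default for multiple files)
./notes.txt:a line with sometext
$ grep -r --include='*.txt' -h 'sometext' .   # matching lines only
a line with sometext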
