List all the files that don't match a pattern in bash

I need to find the files whose names don't match either of these patterns:
*abc*.txt and *xyz*.txt
Please suggest a way to list all the files that don't match the above patterns.

You can use an extended glob, such as the following:
!(*@(abc|xyz)*.txt)
In ksh, this works by default, whereas in bash you need to first enable a shell option:
shopt -s extglob
! negates the match and @ matches any one of the pipe-separated patterns.
This pattern expands to the list of files that don't match *abc*.txt or *xyz*.txt, so you can pass it to another command to see the result, e.g. printf:
printf '%s\n' !(*@(abc|xyz)*.txt)
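As a minimal sketch of this approach, the following creates a few made-up files in a scratch directory and collects the non-matching names with the @(...) operator, which matches exactly one of the listed patterns. The file names and the others array are assumptions for illustration. It assumes bash, with extglob enabled on its own line before the pattern is parsed.

```bash
# extglob has to be on before bash parses the line containing !(...)
shopt -s extglob

demo_dir=$(mktemp -d)          # scratch directory, illustration only
cd "$demo_dir" || exit 1
touch 1abc.txt 2xyz.txt notes.txt readme.md

# Everything except *abc*.txt and *xyz*.txt
others=( !(*@(abc|xyz)*.txt) )
printf '%s\n' "${others[@]}"   # prints notes.txt and readme.md
```

Storing the expansion in an array keeps names with spaces intact when they are passed on to another command.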

With find command:
find -type f ! \( -name '*abc*.txt' -o -name '*xyz*.txt' \)

You can use the --hide=PATTERN option with ls. In your case it would be
ls --hide="*abc*.txt" --hide="*xyz*.txt"

I found a solution: a simple grep works for me.
ls | grep -v "abc" | grep -v "xyz"
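The two grep calls can probably be folded into one extended regular expression that also keeps the .txt restriction from the question. This is only a sketch: the file names are made up, and because it parses ls output it assumes file names without newlines.

```bash
demo_dir=$(mktemp -d)          # scratch directory, illustration only
touch "$demo_dir/1abc.txt" "$demo_dir/2xyz.txt" \
      "$demo_dir/notes.txt" "$demo_dir/abc.md"

# Drop names that contain abc or xyz AND end in .txt
kept=$(ls "$demo_dir" | grep -Ev '(abc|xyz).*\.txt$')
printf '%s\n' "$kept"          # prints abc.md and notes.txt
```

Unlike the plain grep -v "abc" filter, abc.md survives here because the regular expression is anchored on the .txt suffix.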

Related

Unix shell-scripting: Can find-result be made independent of string capitalization?

First I'm not a star with shell-scripting, more used to programming in Python, but have to work with an existing Python script which calls Unix commands via subprocess.
In this script we use 2 find commands to check if 2 certain strings can be found in an xml file / file-name:
FIND_IN_FILE_CMD: find <directory> -name *.xml -exec grep -Hnl STRING1|STRING2 {} +
FIND_IN_FILENAME_CMD: find <directory> ( -name *STRING1*xml -o -name *STRING2*xml )
The problem we saw is that STRING1 and STRING2 are not always written capitalized.
Now I can do something like STRING1|STRING2|String1|String2|string1|string2 and ( -name *STRING1*xml -o -name *STRING2*xml -o -name *String1*xml -o -name *String2*xml -o -name *string1*xml -o -name *string2*xml ), but I was wondering if there was something more efficient to do this check in one go which basically matches all different writing styles.
Can anybody help me with that?
Both of your commands have syntax errors:
$ find -name *.xml -exec grep -Hnl STRING1|STRING2 {} +
bash: STRING2: command not found
find: missing argument to `-exec'
This is because you cannot have an unquoted | in a shell command as that is taken as a pipe symbol. As you can see above, the shell tries to execute STRING2 as a command. In any case, grep cannot understand | unless you use the -E flag or, if your grep supports it, the -P flag. For vanilla grep, you need STRING1\|STRING2.
All implementations of grep should support the POSIX-mandated -i and -E options:
-E
Match using extended regular expressions. Treat each pattern specified as an ERE, as described in XBD Extended Regular Expressions. If any entire ERE pattern matches some part of an input line excluding the terminating <newline>, the line shall be matched. A null ERE shall match every line.
-i
Perform pattern matching in searches without regard to case; see XBD Regular Expression General Requirements.
This means you can use -i for case insensitive matching and -E for extended regular expressions, making your command:
find <directory> -name '*.xml' -exec grep -iEHnl 'STRING1|STRING2' {} +
Note how I also quoted the *.xml since, without the quotes, if any xml files are present in the directory you ran the command in, then *.xml would be expanded by the shell to the list of xml files in that directory.
Your next command also has issues:
$ find ( -name *STRING1*xml -o -name *STRING2*xml )
bash: syntax error near unexpected token `-name'
This is because the ( has a special meaning in the shell (it opens a subshell), so you need to escape it (\(). As for case insensitive matching, GNU find, the default on Linux, has an -iname option which is equivalent to -name but case insensitive. If you are using GNU find, then you can do:
find <directory> \( -iname '*STRING1*xml' -o -iname '*STRING2*xml' \)
If your find doesn't have -iname, you are stuck with writing out all possible permutations. In all cases, however, you will need to quote the patterns and escape the parentheses as I have done above.
If you are going to continue using find, just replace -name with the case insensitive version -iname.
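To see both case-insensitive checks side by side, here is a small sketch run against a scratch directory. The file names, the name_hits and by_content variables and the file contents are all made up for illustration, and it assumes a find that supports -iname (GNU and BSD find do).

```bash
demo_dir=$(mktemp -d)          # stands in for <directory>
touch "$demo_dir/report_String1.xml" "$demo_dir/STRING2_data.xml"
printf 'this line mentions string1\n' > "$demo_dir/other.xml"

# File NAME check: -iname ignores case, patterns quoted, parentheses escaped
name_hits=$(find "$demo_dir" \( -iname '*STRING1*xml' -o -iname '*STRING2*xml' \) | wc -l)

# File CONTENT check: -i ignores case, -E enables | alternation, -l lists files
by_content=$(find "$demo_dir" -name '*.xml' -exec grep -iEl 'STRING1|STRING2' {} +)

printf 'names matching: %s\n' "$name_hits"
printf 'content match:  %s\n' "$by_content"
```

From Python, passing each of these arguments as a separate list element to subprocess (without shell=True) avoids the quoting and escaping problems entirely, since no shell is involved.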

Move files that do not contain a specific string

I would like to move files using mv that do not contain the letter S in the filename. I could not find anything in the mv manual. Maybe a combination with find or grep? It has to be case-sensitive.
input:
file1
fileS1
file2
fileS2
file to move:
file1
file2
You can do the selection in pure Bash without any extra software, if you enable extended globbing, which is off by default:
shopt -s extglob
mv !(*S*) /target/dir
For more information, search for extglob in the bash(1) manual page (the info is at the second match).
You could also use the Ignore-pattern switch from ls, like:
mv $(ls -I '*S*') /target/dir
You can use find with the -not flag for example.
find /path/to/source/dir -type f -not -name '*S*' \
| xargs mv -t /path/to/target/dir
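If mv -t (a GNU option) is not available, or file names may contain spaces, a variant that lets find call mv itself might look like the sketch below. The src and dst directories and the file names are assumptions for illustration.

```bash
src=$(mktemp -d)               # hypothetical source directory
dst=$(mktemp -d)               # hypothetical target directory
touch "$src/file1" "$src/fileS1" "$src/file2" "$src/fileS2" "$src/files3"

# -name is case-sensitive, so files3 (lowercase s) is moved as well
find "$src" -type f ! -name '*S*' -exec mv {} "$dst"/ \;

moved=$(ls "$dst")
left=$(ls "$src")
printf 'moved:\n%s\nleft behind:\n%s\n' "$moved" "$left"
```

This runs one mv per file, which is slower than batching but does not depend on how xargs splits its input.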
grep's -v flag can also be used here. According to the docs,
-v, --invert-match
Invert the sense of matching, to select non-matching lines.
Since grep takes a regular expression rather than a glob, the pattern is simply S. Just use
ls | grep -v 'S' | xargs mv -t target_dir/
Also, see this post.

Check if any of the files within a folder contain a pattern then return filename

I'm writing a script that aims to automate filling in some variables, and I'm looking for help to achieve this:
I have an nginx sites-enabled folder which contains some reverse-proxied sites.
I need to:
check if a pattern $var1 is found in any of the files in "/volume1/nginx/sites-enabled/"
return the name of the file containing $var1 as $var2
Many thanks for your attention and help!
I have found some snippets, but none that check all the files in a folder:
if grep -q $var1 "/volume1/nginx/sites-enabled/testfile"; then
echo "found"
fi
find and grep can be used to produce a list of matching files:
find /volume1/nginx/sites-enabled/ -type f -exec grep -le "${var1}" {} +
The ‘trick’ is using find’s -exec and grep’s -l.
If you only want the filenames you could use:
find /volume1/nginx/sites-enabled/ -type f -exec grep -qe "${var1}" {} \; -exec basename {} \;
If you want to assign the result to a variable use command substitution ($()):
var2="$(find …)"
Don’t forget to quote your variables!
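Put together, the steps above could look like the following sketch. The scratch directory stands in for /volume1/nginx/sites-enabled/, and the config file names and contents are invented for the example. The -F flag is an addition here so that dots in $var1 are matched literally rather than as regular-expression wildcards.

```bash
conf_dir=$(mktemp -d)          # stands in for /volume1/nginx/sites-enabled/
printf 'server_name app.example.com;\n'  > "$conf_dir/app.conf"
printf 'server_name blog.example.com;\n' > "$conf_dir/blog.conf"

var1='app.example.com'

# -l prints only the names of matching files; -e protects a pattern
# that happens to start with a dash
var2=$(find "$conf_dir" -type f -exec grep -lF -e "$var1" {} +)

if [ -n "$var2" ]; then
  printf 'found in: %s\n' "$var2"
else
  printf 'not found\n'
fi
```

If several files match, var2 holds one path per line, so a script that expects a single file should check for that case.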
This command is the most traditional and efficient one; it works on any Unix without requiring a GNU version of grep with special features.
It is efficient because xargs passes grep as many filenames per invocation as the system's limit on command-line length allows, so grep is executed as few times as possible.
With grep's -l option, the name of each matching file is printed only once.
find /path -type f -print | xargs grep -l pattern
Assuming you have GNU Grep, this will store all files containing the contents of $var1 in an array $var2.
for file in /volume1/nginx/sites-enabled/*
do
if grep --fixed-strings --quiet "$var1" "$file"
then
var2+=("$file")
fi
done
This will loop through NUL-separated paths:
while IFS= read -d '' -r -u9 path
do
…something with "$path"
done 9< <(grep --fixed-strings --files-with-matches --null --recursive "$var1" /volume1/nginx/sites-enabled)

How do I find all files that do not begin with a given prefix in bash?

I have a bunch of files in a folder:
foo_1
foo_2
foo_3
bar_1
bar_2
buzz_1
...
I want to find all the files that do not start with a given prefix and save the list to a text file. Here is an example for the files that do have a given prefix:
find bar_* > Positives.txt
If you're searching subdirectories as well:
find . ! -name "bar_*"
Or, equivalently,
find . -not -name "bar_*"
This should do the trick in any shell
ls | grep -v '^prefix'
The -v option inverts grep's search logic, making it filter out all matches.
Using grep instead of find you can use powerful regular expressions instead of the limited glob patterns.
You want to find filenames not starting with bar_?
recursive:
find ! -name 'bar_*' > Negatives.txt
top directory:
find -maxdepth 1 ! -name 'bar_*' > Negatives.txt
With extended globs:
shopt -s extglob
ls !(bar_*) > filelist.txt
The !(pattern) matches anything but pattern, so !(bar_*) is any filename that does not start with bar_.
Using bash and wildcards: ls [!bar_]*. There is a caveat: [!bar_] is a bracket expression that matches one character other than b, a, r or _, so the order of the letters is not important and rab_something.txt will not be listed either.
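The difference between the bracket expression and a real negated prefix is easy to check in a scratch directory. This is a sketch with made-up file names; the bracket and negated arrays are only there to hold the two expansions, and bash with extglob is assumed.

```bash
shopt -s extglob               # needed for !(bar_*), on its own line first

demo_dir=$(mktemp -d)          # scratch directory, illustration only
cd "$demo_dir" || exit 1
touch bar_1 foo_1 rab_x apple buzz_1

bracket=( [!bar_]* )           # first character is not b, a, r or _
negated=( !(bar_*) )           # name does not start with bar_

printf 'bracket: %s\n' "${bracket[*]}"   # prints bracket: foo_1
printf 'negated: %s\n' "${negated[*]}"   # prints negated: apple buzz_1 foo_1 rab_x
```

Only the extglob form keeps apple, buzz_1 and rab_x, which merely share a first letter with the prefix.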
In my case I had an extra requirement, the files must end with the .py extension. So I use:
find . -name "*.py" | grep -v prefix_
In your case, to just exclude files with prefix_:
find . | grep -v prefix_
Note that this includes all sub-directories. There are many ways to do this, but it can be easy to remember for those already familiar with find and grep -v which excludes results.

How can I use inverse or negative wildcards when pattern matching in a unix/linux shell?

Say I want to copy the contents of a directory excluding files and folders whose names contain the word 'Music'.
cp [exclude-matches] *Music* /target_directory
What should go in place of [exclude-matches] to accomplish this?
In Bash you can do it by enabling the extglob option, like this (replace ls with cp and add the target directory, of course)
~/foobar> shopt extglob
extglob off
~/foobar> ls
abar afoo bbar bfoo
~/foobar> ls !(b*)
-bash: !: event not found
~/foobar> shopt -s extglob # Enables extglob
~/foobar> ls !(b*)
abar afoo
~/foobar> ls !(a*)
bbar bfoo
~/foobar> ls !(*foo)
abar bbar
You can later disable extglob with
shopt -u extglob
The extglob shell option gives you more powerful pattern matching in the command line.
You turn it on with shopt -s extglob, and turn it off with shopt -u extglob.
In your example, you would initially do:
$ shopt -s extglob
$ cp !(*Music*) /target_directory
The full available extended globbing operators are (excerpt from man bash):
If the extglob shell option is enabled using the shopt builtin, several extended
pattern matching operators are recognized. A pattern-list is a list of one or more patterns separated by a |. Composite patterns may be formed using one or more of the following sub-patterns:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
So, for example, if you wanted to list all the files in the current directory that are not .c or .h files, you would do:
$ ls -d !(*@(.c|.h))
Of course, normal shell globbing works, so the last example could also be written as:
$ ls -d !(*.[ch])
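The operators can be tried out without touching any files by matching strings inside [[ ... ]]. The sketch below assumes bash; matches is a hypothetical helper written for this example, and the names being tested are made up.

```bash
shopt -s extglob

# Hypothetical helper: succeeds when the name in $1 matches the pattern in $2.
# $2 stays unquoted on the right-hand side so it is treated as a pattern.
matches() {
  [[ $1 == $2 ]]
}

# Collect the names that are neither .c nor .h files
not_c=""
for name in main.c util.h notes.txt; do
  if matches "$name" '!(*@(.c|.h))'; then
    not_c="$not_c$name "
  fi
done
printf 'not C sources or headers: %s\n' "$not_c"

matches colour 'colo?(u)r' && echo '?(u) allows zero or one u'
matches aaa '+(a)' && echo '+(a) needs one or more a'
```

Passing the pattern as a quoted string and expanding it unquoted inside [[ ... ]] keeps the shell from globbing it against the current directory.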
Not in bash (that I know of), but:
cp `ls | grep -v Music` /target_directory
I know this is not exactly what you were looking for, but it will solve your example.
If you want to avoid the overhead of starting a new process for every file with -exec ... \;, I believe you can do better with xargs. I think the second command below is a more efficient alternative to the first (note that xargs still runs one cp per file when a replacement string is used):
find foo -type f ! -name '*Music*' -exec cp {} bar \; # new proc for each exec
find . -maxdepth 1 -name '*Music*' -prune -o -print0 | xargs -0 -I {} cp {} dest/
A trick I haven't seen on here yet that doesn't use extglob, find, or grep is to treat two file lists as sets and "diff" them using comm:
comm -23 <(ls) <(ls *Music*)
comm is preferable over diff because it doesn't have extra cruft.
This returns all elements of set 1, ls, that are not also in set 2, ls *Music*. This requires both sets to be in sorted order to work properly. No problem for ls and glob expansion, but if you're using something like find, be sure to invoke sort.
comm -23 <(find . | sort) <(find . | grep -i '\.jpg' | sort)
Potentially useful.
You can also use a pretty simple for loop:
for f in `find . -not -name "*Music*"`
do
cp "$f" /target/dir
done
In bash, an alternative to shopt -s extglob is the GLOBIGNORE variable. It's not really better, but I find it easier to remember.
An example that may be what the original poster wanted:
GLOBIGNORE="*techno*"; cp *Music* /only_good_music/
When done, unset GLOBIGNORE to be able to rm *techno* in the source directory.
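A quick way to watch GLOBIGNORE take effect is to expand the same glob with and without it. This is a sketch with made-up file names in a scratch directory; the kept and all arrays are only there to record the two expansions.

```bash
demo_dir=$(mktemp -d)          # scratch directory, illustration only
cd "$demo_dir" || exit 1
touch Music_rock.mp3 Music_techno.mp3 notes.txt

GLOBIGNORE='*techno*'          # names matching this are dropped from every glob
kept=( *Music* )

unset GLOBIGNORE               # restore normal globbing
all=( *Music* )

printf 'with GLOBIGNORE: %s\n' "${kept[*]}"   # prints with GLOBIGNORE: Music_rock.mp3
printf 'without:         %s\n' "${all[*]}"    # prints without:         Music_rock.mp3 Music_techno.mp3
```

Because the variable affects every later glob in the same shell, unsetting it right after the command is the safer habit.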
My personal preference is to use grep and the while command. This allows one to write powerful yet readable scripts ensuring that you end up doing exactly what you want. Plus by using an echo command you can perform a dry run before carrying out the actual operation. For example:
ls | grep -v "Music" | while read filename
do
echo "$filename"
done
will print out the files that you will end up copying. If the list is correct the next step is to simply replace the echo command with the copy command as follows:
ls | grep -v "Music" | while read filename
do
cp "$filename" /target_directory
done
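A slightly more defensive version of the same loop is sketched below. It uses IFS= read -r so that leading spaces and backslashes in names survive. The src and dst directories and the file names are assumptions for illustration, and names containing newlines are still not handled because ls output is parsed line by line.

```bash
src=$(mktemp -d)               # hypothetical source directory
dst=$(mktemp -d)               # hypothetical target directory
touch "$src/a song.mp3" "$src/Music list.txt" "$src/notes.txt"

ls "$src" | grep -v 'Music' | while IFS= read -r filename
do
  # -- stops option parsing in case a name starts with a dash
  cp -- "$src/$filename" "$dst"/
done

copied=$(ls "$dst")
printf '%s\n' "$copied"        # prints "a song.mp3" and "notes.txt"
```

Replacing cp with echo in the loop body gives the same dry run as before.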
One solution for this can be found with find.
$ mkdir foo bar
$ touch foo/a.txt foo/Music.txt
$ find foo -type f ! -name '*Music*' -exec cp {} bar \;
$ ls bar
a.txt
Find has quite a few options, you can get pretty specific on what you include and exclude.
Edit: Adam in the comments noted that this is recursive. find options mindepth and maxdepth can be useful in controlling this.
The following lists all *.txt files in the current dir, except those that begin with a number.
This works in bash, dash, zsh and all other POSIX compatible shells.
for FILE in /some/dir/*.txt; do # for each *.txt file
case "${FILE##*/}" in # if file basename...
[0-9]*) continue ;; # starts with digit: skip
esac
## otherwise, do stuff with $FILE here
done
In line one the pattern /some/dir/*.txt will cause the for loop to iterate over all files in /some/dir whose names end with .txt.
In line two a case statement is used to weed out undesired files. The ${FILE##*/} expression strips off any leading dir name component from the filename (here /some/dir/) so that patterns can match against only the basename of the file. (If you're only weeding out filenames based on suffixes, you can shorten this to $FILE instead.)
In line three, all files matching the case pattern [0-9]*) will be skipped (the continue statement jumps to the next iteration of the for loop). If you want to, you can do something more interesting here, e.g. skipping all files which do not start with a letter (a–z) using [!a-z]*, or you could use multiple patterns to skip several kinds of filenames, e.g. [0-9]*|*.bak to skip both .bak files and files which start with a number.
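The multiple-pattern variant mentioned above could look like this sketch. It loops over every file rather than only *.txt so that the *.bak branch has something to match. The scratch directory, the file names and the kept variable are assumptions for illustration, and only POSIX shell features are used.

```bash
demo_dir=$(mktemp -d)          # stands in for /some/dir
touch "$demo_dir/1draft.txt" "$demo_dir/notes.txt" \
      "$demo_dir/old.bak" "$demo_dir/todo.txt"

kept=""
for FILE in "$demo_dir"/*; do
  case "${FILE##*/}" in
    [0-9]*|*.bak) continue ;;  # starts with a digit, or ends in .bak: skip
  esac
  kept="$kept${FILE##*/} "     # otherwise, remember the basename
done

printf '%s\n' "$kept"          # prints notes.txt todo.txt
```

The | inside a case pattern separates alternatives, so each additional exclusion is one more pattern on the same line.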
This would do it, excluding exactly 'Music' (the ^ negation is zsh syntax and needs setopt extendedglob):
cp -a ^'Music' /target
And these exclude things like Music?* or *?Music:
cp -a ^\*?'complete' /target
cp -a ^'complete'?\* /target
