What does "[chS]" mean in the regex of this shell command? - shell

egrep -r --include "*.[chS]" "myregularexpression" .
What does [chS] mean in the shell command above?

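That is not part of the regex; it is a glob pattern that selects which files are searched. Because it is quoted, the shell passes it through unchanged and grep's --include option matches it against file names.
The expression [chS] matches a single character that is c, h, or S.
So, the glob "*.[chS]" selects all files that have the extension .c, .h, or .S (the match is case-sensitive, so .s is not included).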

[chS] is a character class and is equivalent to the expression c|h|S. It matches any one of the listed characters. In this case, *.[chS] is matching files (*.c, or *.h or *.S), i.e., C source and headers, and assembly files.
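To see the character class in isolation, here is a minimal sketch that tests a few names against the same pattern with a case statement. The helper name is_c_or_asm_source and the sample file names are made up for illustration.

```shell
# Hypothetical helper: succeeds when the name matches the glob *.[chS].
is_c_or_asm_source() {
  case $1 in
    *.[chS]) return 0 ;;   # ends in .c, .h, or .S
    *)       return 1 ;;   # anything else, including lowercase .s
  esac
}

# Sample names (assumed, not from the original command).
for name in main.c util.h boot.S boot.s notes.txt; do
  if is_c_or_asm_source "$name"; then
    echo "$name: match"
  else
    echo "$name: no match"
  fi
done
```

Only main.c, util.h, and boot.S are reported as matches, because the class lists an uppercase S and no lowercase s.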

Related

What's the wildcard for dot in bash?

I have files like 0001.file1.email.data.spam.txt and 0001.file1.email.data.spam_1.txt
Now I want to delete all files that end with "_1.txt". I tried "rm -rf *spam_1.txt", but it cannot find the files, maybe because * does not match a dot.
Dot is a literal character which simply matches itself.
The * wildcard matches any sequence of characters, and the ? wildcard matches any single character, but wildcard matching will ignore files which start with a dot unless you have dotglob enabled (it is off by default).
You can easily examine what files are being matched by a wildcard expression with something like
printf '>>%s<<\n' *spam_1.txt
(The >>...<< decoration is just to make it easy to see any leading or trailing whitespace, and isn't strictly necessary.)
In the absence of nullglob, the above will print the wildcard itself if it doesn't find any matches. Also check out failglob which causes an error to be printed and the command to be aborted if the wildcard doesn't match anything.
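The effect of these options can be checked in a throwaway directory. The following is a rough Bash sketch, assuming mktemp is available; the hidden file name and the pattern *no_such_suffix are invented for the demonstration.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch 0001.file1.email.data.spam_1.txt .hidden_spam_1.txt

# Default settings: the name starting with a dot is skipped.
default_matches=( *spam_1.txt )

# With dotglob, names starting with a dot are matched as well.
shopt -s dotglob
dotglob_matches=( *spam_1.txt )
shopt -u dotglob

# Without nullglob, an unmatched pattern is kept as a literal string.
unmatched=( *no_such_suffix )

# With nullglob, an unmatched pattern expands to nothing.
shopt -s nullglob
with_nullglob=( *no_such_suffix )
shopt -u nullglob

echo "default: ${#default_matches[@]} match(es)"
echo "dotglob: ${#dotglob_matches[@]} match(es)"
echo "no nullglob: ${unmatched[0]}"
echo "nullglob: ${#with_nullglob[@]} match(es)"

# Clean up.
cd - >/dev/null
rm -rf "$tmp"
```

Storing each expansion in an array makes it easy to count the matches without depending on how the names are sorted.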

Removing an optional / (directory separator) in Bash

I have a Bash script that takes in a directory as a parameter, and after some processing will do some output based on the files in that directory.
The command would be like the following, where dir is a directory with the following structure inside
dir/foo
dir/bob
dir/haha
dir/bar
dir/sub-dir
dir/sub-dir/joe
> myscript ~/files/stuff/dir
After some processing, I'd like the output to be something like this
foo
bar
sub-dir/joe
The code I have to remove the path passed in is the following:
shopt -s extglob
for file in $files ; do
filename=${file#${1}?(/)}
This gets me to the following, but for some reason the optional / is not being taken care of. Thus, my output looks like this:
/foo
/bar
/sub-dir/joe
The reason I'm making it optional is because if the user runs the command
> myscript ~/files/stuff/dir/
I want it to still work. And, as it stands, if I run that command with the trailing slash, it outputs as desired.
So, why does my ?(/) not work? Based on everything I've read, that should be the right syntax, and I've tried a few other variations as well, all to no avail.
Thanks.
that other guy's helpful answer solves your immediate problem, but there are two things worth noting:
enumerating filenames with an unquoted string variable (for file in $files) is ill-advised, as sjsam's helpful answer points out: it will break with filenames with embedded spaces and filenames that look like globs; as stated, storing filenames in an array is the robust choice.
there is no strict need to change global shell option shopt -s extglob: parameter expansions can be nested, so the following would work without changing shell options:
# Sample values:
file='dir/sub-dir/joe'
set -- 'dir/' # set $1; value 'dir' would have the same effect.
filename=${file#"${1%/}"/} # -> 'sub-dir/joe'
The inner parameter expansion, ${1%/}, removes a trailing (%) / from $1, if any; the outer expansion then strips that directory name plus one / from the front of $file.
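As a sketch of how the nested expansion could be wrapped for reuse, the helper below (the name strip_dir_prefix is hypothetical) accepts the directory with or without a trailing slash and prints the path relative to it. It adds a literal / after the inner expansion so that the separator is removed as well.

```shell
# Hypothetical helper: print $2 with the directory prefix $1 removed,
# whether or not $1 was given with a trailing slash.
strip_dir_prefix() {
  # ${1%/} drops one trailing slash from the directory, if present;
  # the outer ${2#...} then removes "directory/" from the front of $2.
  printf '%s\n' "${2#"${1%/}"/}"
}

strip_dir_prefix 'dir'  'dir/sub-dir/joe'   # directory without a trailing slash
strip_dir_prefix 'dir/' 'dir/foo'           # directory with a trailing slash
```

Quoting the inner expansion keeps any glob characters in the directory name from being treated as a pattern.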
I suggested you change files to an array which is a possible workaround for non-standard filenames that may contain spaces.
files=("dir/A/B" "dir/B" "dir/C")
for filename in "${files[@]}"
do
echo "${filename##dir/}" # replace dir/ with your param.
done
Output
A/B
B
C
Here's the documentation from man bash under "Parameter Expansion":
${parameter#word}
${parameter##word}
Remove matching prefix pattern. The word is
expanded to produce a pattern just as in pathname
expansion. If the pattern matches the beginning of
the value of parameter, then the result of the
expansion is the expanded value of parameter with
the shortest matching pattern (the ``#'' case) or
the longest matching pattern (the ``##'' case)
deleted.
Since # tries to delete the shortest match, it will never include any trailing optional parts.
You can just use ## instead:
filename=${file##${1}?(/)}
Depending on what your script does and how it works, you can also just rewrite it to cd to the directory to always work with paths relative to .
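The difference between the shortest and the longest prefix match can be seen side by side in a small Bash sketch. The sample values for file and prefix are assumptions chosen to mirror the question.

```shell
shopt -s extglob   # required for the ?(...) pattern

# Sample values (assumed for illustration).
file='dir/sub-dir/joe'
prefix='dir'

# Shortest match: ?(/) is satisfied by matching nothing, so the slash stays.
shortest=${file#$prefix?(/)}

# Longest match: ?(/) consumes the slash when one is present.
longest=${file##$prefix?(/)}

printf '%s\n' "shortest: $shortest" "longest: $longest"
```

With prefix set to 'dir/' instead, both forms give the same result, which is why the original script appeared to work only when the trailing slash was supplied.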

How can I get a long listing of text files containing "foo" followed by two digits?

Using metacharacters, I need to perform a long listing of all files whose name contains the string foo followed by two digits, then followed by .txt. foo**.txt will not work, obviously. I can't figure out how to do it.
Use Valid Shell Globbing with Character Class
To find your substring anywhere in a filename like bar-foo12-baz.txt, you need a wildcard before and after the match. You can also use a character class in your pattern to match a limited range of characters. For example, in Bash:
# Explicit character classes.
ls -l *foo[0-9][0-9]*.txt
# POSIX character classes.
ls -l *foo[[:digit:]][[:digit:]]*.txt
See Also
Filename Expansion
Pattern Matching
Something like ls -l foo[0-9][0-9]*.txt, or whatever exactly fits your pattern.
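A quick way to check which names such a pattern picks up is to try it in a scratch directory. This is only a sketch: the file names are invented, and printf is used instead of ls -l so that the output contains nothing but the names.

```shell
# Scratch directory with a few sample names (assumed for illustration).
tmp=$(mktemp -d)
touch "$tmp/foo12.txt" "$tmp/bar-foo34-baz.txt" "$tmp/foo1.txt" "$tmp/fooab.txt"

# Expand the pattern inside the scratch directory, one name per line.
matches=$(cd "$tmp" && printf '%s\n' *foo[0-9][0-9]*.txt)
printf '%s\n' "$matches"

rm -rf "$tmp"
```

Only bar-foo34-baz.txt and foo12.txt are listed; foo1.txt has a single digit and fooab.txt has none. Note that a name such as foo123.txt would also match, because the trailing * accepts further characters after the two digits.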

How to delete files like 'Incoming11781rKD'

I have a program that generates files with names like "Incoming11781Arp". The name always starts with Incoming, followed by 5 digits and then 3 characters that can be upper-case or lower-case letters, digits, or an underscore, in any combination, such as Incoming11781_pi or Incoming11781rKD.
How can I delete them using a script run from a cron job please? I've tried -
#!/bin/bash
file=~/Mail/Incoming******
rm "$file";
but it failed saying that there was no matching file or directory.
You mustn't double-quote the variable reference for pathname expansion to occur - if you do, the wildcard characters are treated as literals.
Thus:
rm $file
Caveat: ~/Mail/Incoming****** doesn't work the way you think it does and will potentially match more files than intended, as it is equivalent to ~/Mail/Incoming*, meaning that any file that starts with Incoming will match.
To only match files starting with Incoming that are followed by exactly 6 characters, use ~/Mail/Incoming??????, as @Jidder suggests in a comment. (The sample names have 8 characters after Incoming, so they would need 8 question marks.)
Note that you could make your glob (pattern) even more specific:
file=~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]
See the bash manual for a description of pathname expansion and pattern syntax: http://www.gnu.org/software/bash/manual/bashref.html#index-pathname-expansion.
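The stricter pattern can be tried safely against a scratch directory before pointing it at real mail. In this sketch, mail_dir is a temporary stand-in for ~/Mail, the file Incoming-notes.txt is invented, and the last three characters are assumed to be letters, digits, or underscores as the question describes.

```shell
# Temporary stand-in for ~/Mail (assumption for the demonstration).
mail_dir=$(mktemp -d)
touch "$mail_dir/Incoming11781rKD" "$mail_dir/Incoming11781_pi" "$mail_dir/Incoming-notes.txt"

# Quote the directory part, but leave the pattern unquoted so it expands.
# "--" stops option parsing in case a name starts with a dash.
rm -- "$mail_dir"/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]

# Record what survived, then clean up.
remaining=$(ls "$mail_dir")
echo "$remaining"
rm -rf "$mail_dir"
```

Incoming-notes.txt is left alone because the character after Incoming is not a digit, whereas Incoming* would have removed it too.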
You can achieve the same effect with the find command...
$ directory="$HOME/Mail/"
$ file_pattern='Incoming*'
$ find "${directory}" -name "${file_pattern}" -delete
The first two lines define the directory and the file pattern separately; the find command then deletes any matching files inside that directory, including its subdirectories.

What does two asterisks together in file path mean?

What does the following file path mean?
$(Services_Jobs_Drop_Path)\**\*.config
The variable just holds some path, nothing interesting. I'm a lot more concerned, what the hell the ** mean.
Any ideas?
P.S. The following path is used in msbuild scripts, if it helps.
\**\ This pattern is often used in the Copy task for recursive folder tree traversal. Basically, it means that all files with the extension .config in $(Services_Jobs_Drop_Path) and in all of its subdirectories are processed.
MSDN, Using Wildcards to Specify Items:
You can use the **, *, and ? wildcard characters to specify a group of
files as inputs for a build instead of listing each file separately.
The ? wildcard character matches a single character.
The * wildcard character matches zero or more characters.
The ** wildcard character sequence matches a partial path.
MSDN, Specifying Inputs with Wildcards
To include all .jpg files in the Images directory and subdirectories
Use the following Include attribute:
Include="Images\**\*.jpg"
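For readers more familiar with the shell, a rough analogue of such a recursive wildcard is a find command that searches a directory and everything below it. This is a sketch of the idea only, not of how MSBuild evaluates the pattern; the directory layout and file names are invented.

```shell
# Hypothetical stand-in for the drop path, with a nested folder.
root=$(mktemp -d)
mkdir -p "$root/jobs/a"
touch "$root/app.config" "$root/jobs/a/job.config" "$root/jobs/readme.txt"

# List every *.config file at any depth, relative to $root,
# sorted in the C locale so the order is predictable.
found=$(cd "$root" && find . -type f -name '*.config' | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$root"
```

Both app.config in the top-level directory and job.config two levels down are listed, while readme.txt is skipped.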
