I have files like 0001.file1.email.data.spam.txt and 0001.file1.email.data.spam_1.txt
Now I want to delete all files end with "_1.txt", I tried to use "rm -rf *spam_1.txt", but it cannot find the files. Maybe because * cannot viewed as dot.
Dot is a literal character which simply matches itself.
The * wildcard matches any sequence of characters, and the ? wildcard matches any single character, but wildcard matching will ignore files which start with a dot unless you have dotglob enabled (it is off by default).
You can easily examine what files are being matched by a wildcard expression with something like
printf '>>%s<<\n' *spam_1.txt
(The >>...<< decoration is just to make it easy to see any leading or trailing whitespace, and isn't strictly necessary.)
In the absence of nullglob, the above will print the wildcard itself if it doesn't find any matches. Also check out failglob which causes an error to be printed and the command to be aborted if the wildcard doesn't match anything.
Related
I have a Bash script that takes in a directory as a parameter, and after some processing will do some output based on the files in that directory.
The command would be like the following, where dir is a directory with the following structure inside
dir/foo
dir/bob
dir/haha
dir/bar
dir/sub-dir
dir/sub-dir/joe
> myscript ~/files/stuff/dir
After some processing, I'd like the output to be something like this
foo
bar
sub-dir/joe
The code I have to remove the path passed in is the following:
shopt -s extglob
for file in $files ; do
filename=${file#${1}?(/)}
This gets me to the following, but for some reason the optional / is not being taken care of. Thus, my output looks like this:
/foo
/bar
/sub-dir/joe
The reason I'm making it optional is because if the user runs the command
> myscript ~/files/stuff/dir/
I want it to still work. And, as it stands, if I run that command with the trailing slash, it outputs as desired.
So, why does my ?(/) not work? Based on everything I've read, that should be the right syntax, and I've tried a few other variations as well, all to no avail.
Thanks.
that other guy's helpful answer solves your immediate problem, but there are two things worth nothing:
enumerating filenames with an unquoted string variable (for file in $files) is ill-advised, as sjsam's helpful answer points out: it will break with filenames with embedded spaces and filenames that look like globs; as stated, storing filenames in an array is the robust choice.
there is no strict need to change global shell option shopt -s extglob: parameter expansions can be nested, so the following would work without changing shell options:
# Sample values:
file='dir/sub-dir/joe'
set -- 'dir/' # set $1; value 'dir' would have the same effect.
filename=${file#${1%/}} # -> '/sub-dir/joe'
The inner parameter expansion, ${1%/}, removes a trailing (%) / from $1, if any.
I suggested you change files to an array which is a possible workaround for non-standard filenames that may contain spaces.
files=("dir/A/B" "dir/B" "dir/C")
for filename in "${files[#]}"
do
echo ${filename##dir/} #replace dir/ with your param.
done
Output
A/B
B
C
Here's the documentation from man bash under "Parameter Expansion":
${parameter#word}
${parameter##word}
Remove matching prefix pattern. The word is
expanded to produce a pattern just as in pathname
expansion. If the pattern matches the beginning of
the value of parameter, then the result of the
expansion is the expanded value of parameter with
the shortest matching pattern (the ``#'' case) or
the longest matching pattern (the ``##'' case)
deleted.
Since # tries to delete the shortest match, it will never include any trailing optional parts.
You can just use ## instead:
filename=${file##${1}?(/)}
Depending on what your script does and how it works, you can also just rewrite it to cd to the directory to always work with paths relative to .
Using metacharacters, I need to perform a long listing of all files whose name contains the string foo followed by two digits, then followed by .txt. foo**.txt will not work, obviously. I can't figure out how to do it.
Use Valid Shell Globbing with Character Class
To find your substring anywhere in a filename like bar-foo12-baz.txt, you need a wilcard before and after the match. You can also use a character class in your pattern to match a limited range of characters. For example, in Bash:
# Explicit character classes.
ls -l *foo[0-9][0-9]*.txt
# POSIX character classes.
ls -l *foo[[:digit:]][[:digit:]]*.txt
See Also
Filename Expansion
Pattern Matching
Something like ls foo[0-9][0-9]*.txt of whatever exactly fits your pattern.
I have a programme that is generating files like this "Incoming11781Arp", and there is always Incoming, and there is always 5 numbers, but there are 3 letters/upper-case/lower-case/numbers/special case _ in any way. Like Incoming11781_pi, or Incoming11781rKD.
How can I delete them using a script run from a cron job please? I've tried -
#!/bin/bash
file=~/Mail/Incoming******
rm "$file";
but it failed saying that there was no matching file or directory.
You mustn't double-quote the variable reference for pathname expansion to occur - if you do, the wildcard characters are treated as literals.
Thus:
rm $file
Caveat: ~/Mail/Incoming****** doesn't work the way you think it does and will potentially match more files than intended, as it is equivalent to ~/Mail/Incoming*, meaning that any file that starts with Incoming will match.
To only match files starting with Incoming that are followed by exactly 6 characters, use ~/Mail/Incoming??????, as #Jidder suggests in a comment.
Note that you could make your glob (pattern) even more specific:
file=~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alpha:]_][[:alpha:]_][[:alpha:]_]
See the bash manual for a description of pathname expansion and pattern syntax: http://www.gnu.org/software/bash/manual/bashref.html#index-pathname-expansion.
You can achieve the same effect with the find command...
$ directory='~/Mail/'
$ file_pattern='Incoming*'
$ find "${directory}" -name "${file_pattern}" -delete
The first two lines define the directory and the file pattern separately, the find command will then proceed to delete any matching files inside that directory.
In a bash script i want to copy a file but the file name will change over time.
The start and end of the file name will however stay the same.
is there a way so i get the file like so:
cp start~end.jar
where ~ can be anything?
the cp command would be run a a bash script on a ubuntu machine if this makes and difference.
A glob (start*end) will give you all matching files.
Check out the Expansion > Pathname Expansion > Pattern Matching section of the bash manual for more specific control
* Matches any string, including the null string.
? Matches any single character.
[...] Matches any one of the enclosed characters. A pair of characters separated by a hyphen denotes a range expression; any character that sorts between those two characters, inclusive, using the current locale's collat-
ing sequence and character set, is matched. If the first character following the [ is a ! or a ^ then any character not enclosed is matched. The sorting order of characters in range expressions is determined by
the current locale and the value of the LC_COLLATE shell variable, if set. A - may be matched by including it as the first or last character in the set. A ] may be matched by including it as the first character in
the set.
and if you enable extglob:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
#(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
Use a glob to capture the variable text:
cp start*end.jar
I thought I understood wildcards, till this happened to me. Essentially, I'm looking for a wild card pattern that would return all files that are not named .gitignore. I came up with this, which seems to work for all cases I could conjure:
ls *[!{gitignore}]
To really validate if this works, I thought I'd negate the expression and see if it returns the file named .gitignore (actually any file that ended with gitignore; so 1.gitignore should also be returned). To that effect, I thought the negated expression would be:
ls *[{gitignore}]
However, this expression doesn't return a files named .gitignore (although it returns a file named 1.gitignore).
Essentially, my question, after simplification, boils down to:
Why doesn't *.abc match a file that is named .abc
I think I can take it from there.
PS:
I am working on Mac OSX Lion (10.7.4)
I wanted to add a clause to .gitignore such that I would ignore every file, except .gitignore in a given folder. So I ended up adding * in the .gitignore file. Result was, git ended up ignoring .gitignore :)
From the numerous searches I've made on google - Use the asterisk character (*) to represent zero or more characters.
I assume you're using Bash. From the Bash manual:
When a pattern is used for filename expansion, the character ‘.’ at the start of a filename or immediately following a slash must be matched explicitly, unless the shell option dotglob is set.
.gitignore patterns, however, are treated differently:
Otherwise, git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: wildcards in the pattern will not match a / in the pathname.
According to the fnmatch(3) docs, a leading dot has to be explicitly matched only if the FNM_PERIOD flag is set, so *gitignore as a gitignore pattern would match .gitignore.
There is an easier way to accomplish this, though. To have .gitignore ignore everything except .gitignore:
*
!.gitignore
If you want to ignore everything except the gitignore file, use this as the file:
*
!.gitignore
Lines starting with an exclamation point are interpreted as exceptions.