A command that prints a list of files and folders in the current directory along with their total sizes is du -sh *. That command alone doesn't, however, list hidden files or folders. I found a solution for a command that does correctly list the hidden files and folders along with the rest: du -sh .[!.]* *. Although it works perfectly, the solution was provided as-is, without any explanation.
What is the meaning of .[!.]*, exactly? How does it work?
It's a globbing pattern that tells bash to match all names that start with a ., followed by one character that is not a ., followed by any number of further characters.
See this page for a great explanation of bash globbing patterns.
. - match a ., prefix of hidden file
[!.] - match any character, as long as it is not a ., see ref
* - any number of characters
So the pattern matches names that start with a single . but not names that start with two dots (..).
.[!.]* matches any file or directory name that starts with . and whose second character is not a ., so it includes the hidden files and directories in the current directory but excludes the parent directory (..).
This behaviour comes from the shell's glob expansion, not from du, so you can run ls -d .[!.]* to see what the pattern expands to in your shell (-d keeps ls from listing the contents of hidden directories).
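As a minimal sketch of the pattern in action, the snippet below builds a throwaway directory and captures what .[!.]* expands to. The file names (.hidden, ..double, visible, .config) are made up for the demonstration, and mktemp -d is assumed to be available.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the demo does not depend on your own files.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .hidden ..double visible
mkdir .config

# .[!.]* : a dot, then one character that is not a dot, then anything.
set -- .[!.]*
hidden_matches="$*"
echo "$hidden_matches"

cd / && rm -rf "$demo"
```

Note that ..double is not matched, because its second character is a dot, and neither is visible, because it does not start with a dot.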
BTW, you can turn dotglob on in your shell to simplify your du command.
$ shopt -s dotglob
$ du -sh *
$ shopt -u dotglob
From bash manual
dotglob If set, bash includes filenames beginning with a `.' in the results of pathname expansion.
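A small sketch of the effect of dotglob, assuming bash and a scratch directory with invented file names (.a, .b, c). Running the dotglob variant inside a command substitution keeps the option from leaking into the rest of the session, so no shopt -u is needed afterwards.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .a .b c

# Default behaviour: * skips names that begin with a dot.
set -- *
without_dotglob="$*"

# The subshell created by $(...) confines the option change.
with_dotglob=$(shopt -s dotglob; set -- *; echo "$*")

echo "default: $without_dotglob"
echo "dotglob: $with_dotglob"

cd / && rm -rf "$demo"
```

Even with dotglob set, * does not expand to . or .., so the du command stays safe.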
I want to list all .jpg files in all subdirectories using ls.
For the same directory this works fine:
ls *.jpg
However, when using the -R for recursiveness:
ls -R *.jpg
I get:
zsh:no matches found: *.jpg
Why does this not work?
Note: I know it can be done using find or grep but I want to know why the above does not work.
The program ls is not designed to handle patterns by itself.
When you run ls -R *.jpg, the pattern *.jpg is not directly passed to ls. The shell replaces it with a list of all files that match the pattern. (Only when no file name matches does a shell such as bash pass the literal string *.jpg to ls, which then fails to find a file of that name.)
Since you are using zsh (with the default setting setopt nomatch), it prints an error message instead of passing the pattern to ls.
If there are matching files, e.g. A.jpg, B.jpg, C.jpg, the command
ls *.jpg
will be run by the shell as
ls A.jpg B.jpg C.jpg
In contrast to this, find is designed to handle patterns with its -name test. When using find you should make sure the pattern is not replaced by the shell, e.g. by using -name '*.jpg' or -name \*.jpg. Otherwise you might get unexpected results or an error if there are matching files in the current directory.
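The following sketch shows the quoted form on a scratch tree. The directory layout and file names are invented for the example; the point is only that the quotes let find receive the pattern itself.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
mkdir -p "$demo/photos/2020"
touch "$demo/top.jpg" "$demo/photos/a.jpg" "$demo/photos/2020/b.jpg" "$demo/notes.txt"

# The single quotes stop the shell from expanding *.jpg,
# so find applies the pattern in every directory it visits.
jpg_count=$(find "$demo" -type f -name '*.jpg' | wc -l | tr -d ' ')
echo "found $jpg_count jpg files"

rm -rf "$demo"
```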
Edit:
As shown in Martin Tournoij's answer, you could use the recursive glob pattern ls **/*.jpg, but this is also handled by the shell, not by ls, so you don't need the -R option. In zsh the recursive pattern ** is enabled by default; in bash you need to enable it with shopt -s globstar.
The shell first expands any glob patterns, and then runs the command. So from ls's perspective, ls *.jpg is exactly the same as if you had typed ls one.jpg two.jpg. The -R flag to ls only makes sense if you use it on a directory, which you're not doing here.
This is also why mv *.jpg *.png doesn't work as expected on Unix systems, since mv never sees those patterns but just the filenames it expanded to (it does on e.g. Windows, where the globbing is done by the program rather than the shell; there are advantages and disadvantages to both approaches).
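Because mv only ever sees the expanded names, a bulk rename is usually written as a loop. This is one common way to do it, sketched on made-up files in a scratch directory; it only changes the file name, not the image format.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo" || exit 1
touch one.jpg two.jpg

# Rename each match individually; ${f%.jpg} strips the trailing .jpg.
for f in *.jpg; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
    mv -- "$f" "${f%.jpg}.png"
done

set -- *
renamed="$*"
echo "$renamed"

cd / && rm -rf "$demo"
```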
* matches all characters except a /, so *.jpg only expands to files in the current directory. **/ is similar, but also matches /, so it expands to files in any directory. Both bash and zsh support this, although bash needs shopt -s globstar first (available since bash 4.0).
So ls **/*.jpg will do what you want; you don't need to use find or grep. In zsh especially, you rarely need find, since its globbing is much more powerful than in the standard Bourne shell or bash.
In zsh you can also use setopt glob_star_short and then **.jpg will work as well, which is a shortcut for **/*.jpg.
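For bash, a minimal sketch of the recursive glob looks like this. It assumes bash 4.0 or later (older versions have no globstar option), and the directory tree is invented for the demonstration.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p sub/deeper
touch top.jpg sub/mid.jpg sub/deeper/low.jpg sub/readme.txt

shopt -s globstar          # not needed in zsh
set -- **/*.jpg            # matches at any depth, including the current directory
recursive_count=$#
printf '%s\n' "$@"
shopt -u globstar

cd / && rm -rf "$demo"
```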
I want to copy all the files in the current directory to directory "folder_1", except for those ending in .txt and .png
I've tried the following:
shopt -s extglob
cp !(*.txt) folder_1
But I need to make this more general to include png as well
cp !(*.txt|*.png) folder_1
bash manual
If the extglob shell option is enabled using the shopt builtin, several extended pattern matching operators are recognized. In the following description, a pattern-list is a list of one or more patterns separated by a ‘|’. Composite patterns may be formed using one or more of the following sub-patterns:
...
!(pattern-list)
Matches anything except one of the given patterns.
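Here is a sketch of the negated pattern on a scratch directory with invented file names. One assumption goes beyond the original command: folder_1 is added to the exclusion list, because otherwise the pattern also matches the destination directory and cp complains about it.

```shell
#!/usr/bin/env bash
shopt -s extglob           # must be on before the line using !(...) is parsed
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir folder_1
touch a.txt b.png c.csv d.log

# Everything except *.txt, *.png and the destination directory itself.
cp -- !(*.txt|*.png|folder_1) folder_1

copied=$(cd folder_1 && set -- * && echo "$*")
echo "$copied"

cd / && rm -rf "$demo"
```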
I am trying to list all files located in specific sub-directories of a directory in my bash script. Here is what I tried.
root_dir="/home/shaf/data/"
sub_dirs_prefixes=('ab' 'bp' 'cd' 'cn' 'll' 'mr' 'mb' 'nb' 'nh' 'nw' 'oh' 'st' 'wh')
ls "$root_dir"{$(IFS=,; echo "${sub_dirs_prefixes[*]}")}"rrc/"
Please note that I do not want to expand value stored in $root_dir as it may contain spaces but I do want to expand sub-path contained in {} which is a comma delimited string of contents of $sub_dirs_prefixes. I stored sub-directories prefixes in an array variable, $sub_dirs_prefixes , because I have to use them later on for something else.
I am getting following error:
ls: cannot access /home/shaf/data/{ab,bp,cd,cn,ll,mr,mb,nb,nh,nw,oh,st,wh}rrc/
If I copy the path in error message and run ls from command line, it does display contents of listed sub-directories.
Brace expansion is performed before parameter expansion and command substitution, so braces that only appear in the output of a command substitution are never expanded. You can use command substitution to generate an extended pattern instead.
shopt -s extglob
ls "$root_dir"/$(IFS="|"; echo "@(${sub_dirs_prefixes[*]})rrc")
By the time parameter and command substitutions have completed, the shell sees this just before performing pathname expansion:
ls "/home/shaf/data/"/@(ab|bp|cd|cn|ll|mr|mb|nb|nh|nw|oh|st|wh)rrc
The @(...) pattern matches exactly one of the enclosed prefixes.
It gets a little trickier if the components of the directory names contain characters that need to be quoted, since we aren't quoting the command substitution.
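The approach can be tried out safely with a sketch like the one below. The temporary directory stands in for /home/shaf/data/, the prefix list is shortened, and the subdirectory names are invented; only two of the three prefixes have a matching directory.

```shell
#!/usr/bin/env bash
shopt -s extglob
root_dir=$(mktemp -d)                  # stand-in for the real data directory
sub_dirs_prefixes=('ab' 'bp' 'cd')
mkdir "$root_dir/abrrc" "$root_dir/cdrrc" "$root_dir/zzrrc"

# Join the prefixes with | to build a pattern such as @(ab|bp|cd)rrc.
pattern=$(IFS='|'; echo "@(${sub_dirs_prefixes[*]})rrc")

# $root_dir stays quoted; $pattern is unquoted so pathname expansion applies to it.
matches=( "$root_dir"/$pattern )
match_count=${#matches[@]}
printf '%s\n' "${matches[@]}"

rm -rf "$root_dir"
```

Unlike brace expansion, the pattern only yields directories that actually exist, so the missing bprrc is silently left out rather than reported by ls.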
I'm just writing a simple shell script and I'm a little confused:
Here is my script:
% for f in $FILES; do echo "Processing $f file.."; done
The Command:
ls -a | grep bash
produces:
% ls -a | grep bash
.bash_from_cshrc
.bash_history
.bash_profile
.bashrc
When
FILES=".bash*"
I get the same results (different formatting) as ls -a. However when
FILES="*bash*"
I get this output:
Processing *bash* file..
This is not the output I expect. Am I not allowed to have a wildcard at the beginning of the file name? Is the . at the beginning of the file name "special" somehow?
Setting
FILES="bash*"
Also does not work.
The default globbing in bash does not include filenames starting with a . (aka hidden files).
You can change that with
shopt -s dotglob
$ ls -a
. .. .a .b .c d e f
$ ls *
d e f
$ shopt -s dotglob
$ ls *
.a .b .c d e f
$
To disable it again, run shopt -u dotglob.
If you want hidden and non hidden, set dotglob (bash)
#!/bin/bash
shopt -s dotglob
for file in *
do
echo "$file"
done
FILES=".bash*" works because the names of the hidden files begin with a .
FILES="bash*" doesn't work because the names of the hidden files begin with a ., not a b
FILES="*bash*" doesn't work because a * wildcard at the beginning of a pattern does not match the leading . of hidden files.
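The three cases can be reproduced with a short sketch. The helper expand_pattern is hypothetical and exists only for this demonstration, as do the two dotfiles created in the scratch directory. A pattern that matches nothing is handed back unchanged, which is why the loop in the question printed the pattern itself.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .bashrc .bash_profile

# Hypothetical helper: expands a pattern held in a string and prints the result.
expand_pattern() {
    # $1 is unquoted on purpose so that pathname expansion is applied to it.
    set -- $1
    printf '%s\n' "$*"
}

dot_result=$(expand_pattern ".bash*")
star_result=$(expand_pattern "*bash*")
plain_result=$(expand_pattern "bash*")

echo ".bash*  -> $dot_result"
echo "*bash*  -> $star_result"
echo "bash*   -> $plain_result"

cd / && rm -rf "$demo"
```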
Yes, the . at the front is special, and normally won't be matched by a * wildcard, as documented in the bash man page (and common to most Unix shells):
When a pattern is used for pathname expansion, the character “.”
at the start of a name or immediately following a slash must
be matched explicitly, unless the shell option dotglob is
set. When matching a pathname, the slash character must
always be matched explicitly. In other cases, the “.”
character is not treated specially.
If you want to include hidden files, you can specify two wildcards; one for the hidden files, and another for the others.
for f in .[!.]* *; do
echo "Processing $f file.."
done
The wildcard .* would expand to all the dot files, but that includes the parent directory, which you normally would want to exclude; so .[!.]* matches all files whose first character is a dot, but the second one isn't.
If you have other files with two leading dots, you need to specify a third wildcard to cover those but exclude the parent directory! Try ..?* which requires there to be at least one character after the second dot.
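Putting the three wildcards together might look like the sketch below, run against invented files in a scratch directory. The existence check is an addition of this example: any of the patterns that matches nothing would otherwise be passed to the loop as literal text.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .hidden ..twodots plain

seen=""
for f in .[!.]* ..?* *; do
    # Skip a pattern that was left unexpanded because nothing matched it.
    [ -e "$f" ] || [ -L "$f" ] || continue
    echo "Processing $f file.."
    seen="$seen$f "
done

cd / && rm -rf "$demo"
```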
for file in directory/{.[!.]*,*};do echo "$file";done
This should echo both hidden and normal files. Thanks to tripleee for the .[!.]* tip.
The curly braces are expanded into one word per alternative, so {pattern1,pattern2} acts like an 'or' between the two patterns.
I have a simple test bash script which looks like that:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
$cmd # execute command
When I run the script it also copies the files ending with a ~ even though I meant to exclude them. When I run the very same rsync command directly from the command line, it works! Does someone know why, and how to make the bash script work?
Btw, I know that I can also work with --exclude-from but I want to know how this works anyway.
Try eval:
#!/bin/bash
cmd="rsync -rv --exclude '*~' ./dir ./new"
eval "$cmd" # execute command
The problem isn't that you're running it in a script; it's that you put the command in a variable and then run the expanded variable. The shell parses quotes before it expands variables, so the single quotes that come out of the variable are treated as ordinary characters and never removed, and rsync winds up excluding files with names starting with ' and ending with ~'. To fix this, just remove the quotes around the pattern (the whole thing is already in double-quotes, so they aren't needed). Be aware that the unquoted $cmd still undergoes pathname expansion, so *~ is replaced by file names if anything in the current directory happens to match it:
#!/bin/bash
cmd="rsync -rv --exclude *~ ./dir ./new"
$cmd # execute command
...speaking of which, why are you putting the command in a variable before running it? In general, this is a good way to make code more confusing than it needs to be, and to trigger parsing oddities (some even weirder than this). So how about:
#!/bin/bash
rsync -rv --exclude '*~' ./dir ./new
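If the command really does have to be stored before it is run, a bash array is a common alternative to a string, since each element stays one argument and the quotes are parsed when the array is built. The sketch below is not taken from the answers above; show_args is a hypothetical stand-in for rsync so that the arguments can be inspected without copying anything.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for rsync: prints each argument it receives in brackets.
show_args() { printf '<%s>\n' "$@"; }

# One array element per argument; '*~' is stored without its quotes.
cmd=(show_args -rv --exclude '*~' ./dir ./new)

# "${cmd[@]}" expands to exactly those elements, with no word splitting or globbing.
received=$("${cmd[@]}")
echo "$received"
arg_count=$(printf '%s\n' "$received" | wc -l | tr -d ' ')
```

Replacing show_args with rsync in the array would run the real command with the same argument list.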
You can use a simple --exclude '~' as (according to the man page):
if the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions. Thus "/foo" would match a name of "foo" at either the "root of the transfer" (for a global rule) or in the merge-file's directory (for a per-directory rule). An unqualified "foo" would match a name of "foo" anywhere in the tree because the algorithm is applied recursively from the top down; it behaves as if each path component gets a turn at being the end of the filename. Even the unanchored "sub/foo" would match at any point in the hierarchy where a "foo" was found within a directory named "sub". See the section on ANCHORING INCLUDE/EXCLUDE PATTERNS for a full discussion of how to specify a pattern that matches at the root of the transfer.
if the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.
rsync chooses between doing a simple string match and wildcard matching by checking if the pattern contains one of these three wildcard characters: '*', '?', and '[' .
a '*' matches any path component, but it stops at slashes.
use '**' to match anything, including slashes.