Glob matching only returning first match - bash

I'm coding up a sort of custom rm script that I would like to pass wildcard matches to. I have several files in the working directory that would match the wildcard that I'm passing to the script, but I'm only getting one of them back from a simple test case:
sh remove r*
Inside the remove script, I've whittled it down to just
echo $1
Here's the directory contents:
$ ls
file2 file4 newTestFile remove_engine restore
file3 fileName_1234 remove remove_w restore_engine
And here's what I get back.
$ sh remove r*
remove
I understand that Bash expands the wildcard even before the script is executed. But why am I not getting all of the files in the directory that match r*?

Pathname expansion, aka globbing, expands a single shell word into multiple. In your case
./remove r*
is entirely identical to running
./remove remove remove_engine remove_w restore restore_engine
As you discovered, $1 will be remove because it is the first argument. The rest of the matches become separate positional parameters, so $2 will be remove_engine and $3 will be remove_w.
To process all of the arguments, you use "$@", either in a loop:
for file in "$@"
do
printf 'One of the matches was: %s\n' "$file"
done
or just directly in commands that also accept multiple parameters:
# Delete all matches
echo rm "$@"

Related

How can I loop through a list of files, from directories specified in a variable?

I am trying to loop through the files in some directories, and perform an action on each file.
The list of directories is specified by a list of strings, stored as an environment variable
LIST_OF_DIRECTORIES=dir1 dir2 dir3
for dir in $LIST_OF_DIRECTORIES; do
for file in $dir/* ; do
echo $file
done
done
This results in nothing. I'm expecting all of the files within that directory to be echoed.
I am basing my logic off of Bash For-Loop on Directories, and trying to make this work for my use case.
You have to place quotes around strings with spaces; otherwise each "word" will be interpreted separately. In your example, LIST_OF_DIRECTORIES=dir1 is executed (LIST_OF_DIRECTORIES is indeed assigned dir1), but because it precedes what is now interpreted as a simple command (dir2 dir3), the assignment only lives temporarily for that command.
You should do either of these instead:
LIST_OF_DIRECTORIES="dir1 dir2 dir3"
LIST_OF_DIRECTORIES='dir1 dir2 dir3'
From Simple Command Expansion:
If no command name results, the variable assignments affect the
current shell environment. In the case of such a command (one that
consists only of assignment statements and redirections), assignment
statements are performed before redirections. Otherwise, the variables
are added to the environment of the executed command and do not affect
the current shell environment. If any of the assignments attempts to
assign a value to a readonly variable, an error occurs, and the
command exits with a non-zero status.
Also, as a suggestion, use arrays for storing multiple entries instead, and don't rely on word splitting unless your script doesn't use filename expansion and noglob is enabled with set -f or shopt -so noglob.
LIST_OF_DIRECTORIES=(dir1 dir2 dir3)
for dir in "${LIST_OF_DIRECTORIES[@]}"; do
Other References:
Quoting
Arrays
Filename Expansion
Word Splitting
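Putting the array suggestion together with the nested loop from the question (dir1, dir2, dir3 are the question's placeholder names), a minimal sketch:

```shell
#!/bin/bash
# An array keeps each directory as a separate element; no word splitting needed.
LIST_OF_DIRECTORIES=(dir1 dir2 dir3)
for dir in "${LIST_OF_DIRECTORIES[@]}"; do
    # Quoting "$dir" protects names with spaces; the glob still expands.
    for file in "$dir"/*; do
        echo "$file"
    done
done
```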
This will work fine for you (for simple filenames without whitespace; parsing ls output is fragile in general):
LIST_OF_DIRECTORIES="dir1 dir2 dir3"
for dir in $LIST_OF_DIRECTORIES;
#add all the files to the files variable
do files=`ls $dir`;
for file in $files;
#Take action on your file here, I am just doing ls for my file here.
do echo `ls $dir/$file`;
done;
done

Produce a file that contains names of all empty subfolders

I want to write a script that takes a name of a folder as a command line argument and produces a file that contains the names of all subfolders with size 0 (empty subfolder). This is what I got:
#!/bin/bash
echo "Name of a folder'
read FOLDER
for entry in "$search_dir"/*
do
echo "$entry"
done
Your script doesn't have the logic you intended. The find command has a feature for exactly this:
$ find path/to/dir -type d -empty
This will print empty directories starting from the given path/to/dir.
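Since the question asks for the names to end up in a file, the find output can simply be redirected (empty_dirs.txt is just an example name):

```shell
find path/to/dir -type d -empty > empty_dirs.txt
```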
I would suggest you accept the answer which suggests to use find instead. But just to be complete, here is some feedback on your code.
You read the input directory into FOLDER but then never use this variable.
As an aside, don't use uppercase for your private variables; this is reserved for system variables.
You have unpaired quotes in the prompt string. If the opening quote is double, you need to close with a double quote, or vice versa for single quotes.
You loop over directory entries, but do nothing to isolate just the ones which are directories, let alone empty directories.
Finally, nothing in your script uses Bash-only facilities, so it would be safe and somewhat more portable to use #!/bin/sh.
Now, looping over directories can be done by using search_dir/*/ instead of just search_dir/*; and finding out which ones are empty can be done by checking whether a wildcard within the directory returns just the directory itself. (This assumes default globbing behavior -- with nullglob you would make a wildcard with no matches expand to an empty list, but this is problematic in some scenarios so it's not the default.)
#!/bin/bash
# read -p is not POSIX
read -p "Name of a folder" search_dir
for dir in "$search_dir"/*/
do
# [[ is Bash only
if [[ "$dir"/* = "$dir/*" ]]; then # Notice tricky quoting
echo "$dir"
fi
done
Using the wildcard expansion with [ is problematic because it is not prepared to deal with a wildcard expansion -- you get "too many arguments" if the wildcard expands into more than one filename -- so I'm using the somewhat more mild-tempered Bash replacement [[ which copes just fine with this. Alternatively, you could use case, which I would actually prefer here; but I've stuck to if in order to make only minimal changes to your script.
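For completeness, here is a sketch of the case variant the paragraph above says it would prefer; it relies on the same default-globbing trick (an unmatched wildcard stays literal), so it also assumes nullglob is off:

```shell
#!/bin/sh
search_dir=${1:-.}
for dir in "$search_dir"/*/
do
    # If the glob found nothing, $(echo ...) yields the literal pattern.
    case $(echo "$dir"/*) in
        "$dir/*") echo "$dir" ;;   # empty directory
    esac
done
```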

How to prevent glob patterns from expanding while splitting colon separated glob patterns by colon?

Here is my shell script written for POSIX shell. I am aiming at POSIX shell compatibility. There is a PATTERNS variable that contains colon separated glob patterns. This variable is a user input. I cannot change this. The rest of the code is my code and I can change it to achieve my purpose. The purpose in the small demo program below is to log (or display) each glob pattern on a separate line.
PATTERNS=*.*:*.txt:*.html
IFS=:
for pattern in $PATTERNS
do
unset IFS
echo Pattern: "$pattern"
done
This is the current working directory.
$ ls
a.txt b.txt foo.sh
When I run the code, I get this output.
$ sh foo.sh
Pattern: a.txt
Pattern: b.txt
Pattern: foo.sh
Pattern: a.txt
Pattern: b.txt
Pattern: *.html
Instead of displaying each glob pattern in a line, it is displaying each file that was matched by the glob pattern in a line. Since *.html did not match any file in the current directory, only this glob pattern was displayed as desired.
The output I desire is:
Pattern: *.*
Pattern: *.txt
Pattern: *.html
So I want the glob patterns to be not expanded into the filenames in the current directory. How can I do that? The solution must work fine for any POSIX shell.
I'm not positive I didn't miss anything, or that there isn't a better way, but this works and doesn't use anything I know to be unportable.
#!/bin/sh
PAT="*.rc:*.rpms:*.sh"
# Function so we can repeatedly expand the remaining string.
read_next() {
IFS=: read hd tl <<EOF
$1
EOF
}
tl=$PAT
while read_next "$tl" && [ -n "$hd" ]; do
echo "Pattern: $hd"
done
Alternatively (and this is simpler, though it requires the "global" IFS modification, which I am never a fan of; I greatly prefer the scoped IFS modification on read), you could modify the global shell settings: disable pathname expansion entirely with the -f flag to set (and re-enable it with set +f afterwards if desired) and then just keep your original loop.
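A sketch of that simpler set -f variant, keeping the original loop (the pattern string matches the question's demo):

```shell
#!/bin/sh
PATTERNS='*.*:*.txt:*.html'
set -f      # disable pathname expansion so the patterns stay literal
IFS=:       # split the variable on colons only
for pattern in $PATTERNS
do
    printf 'Pattern: %s\n' "$pattern"
done
unset IFS
set +f      # restore globbing for the rest of the script
```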

Bash glob parameter only shows first file instead of all files

I want to run this cmd line script
$ script.sh lib/* ../test_git_thing
I want it to process all the files in the lib/ folder.
FILES=$1
for f in $FILES
do
echo "Processing $f file..."
done
Currently it only prints the first file. If I use $#, it gives me all the files, but also the last param which I don't want. Any thoughts?
The argument list is expanded by the shell at the command line: when you invoke script.sh lib/*, your script is called with every file in lib/ as a separate argument. Since you only reference $1 in your script, it only prints the first file. You need to escape or quote the wildcard on the command line so it is passed through to your script, which can then perform the globbing itself.
As correctly noted, lib/* on the command line is being expanded into all files in lib. To prevent expansion, you have 2 options. (1) quote your input:
$ script.sh 'lib/*' ../test_git_thing
Or (2), turn file globbing off. The set -f option disables all pathname expansion within the shell, and setting it inside the script doesn't help here, since the expansion is done by the calling shell before the arguments reach your script. In your case, it is probably better to quote the input, or to pass the first argument as a directory name and do the expansion inside the script:
DIR=$1
for f in "$DIR"/*
In bash and ksh you can iterate through all arguments except the last like this:
for f in "${@:1:$#-1}"; do
echo "$f"
done
In zsh, you can do something similar:
for f in $@[1,${#}-1]; do
echo "$f"
done
$# is the number of arguments and ${@:start:length} is substring/subsequence notation in bash and ksh, while $@[start,end] is subsequence notation in zsh. In all cases, the subscript expressions are evaluated as arithmetic expressions, which is why $#-1 works. (In zsh, you need ${#}-1 because $#- is interpreted as "the length of $-".)
In all three shells, you can use the ${x:start:length} syntax with a scalar variable to extract a substring; in bash and ksh, you can use ${a[@]:start:length} with an array to extract a subsequence of values.
This answers the question as given, without using non-POSIX features, and without workarounds such as disabling globbing.
You can find the last argument using a loop, and then exclude that when processing the list of files. In this example, $d is the directory name, while $f has the same meaning as in the original answer:
#!/bin/sh
if [ $# != 0 ]
then
for d in "$#"; do :; done
if [ -d "$d" ]
then
for f in "$#"
do
if [ "x$f" != "x$d" ]
then
echo "Processing $f file..."
fi
done
fi
fi
Additionally, it would be a good idea to also test whether "$f" exists, since by default shells pass the wildcard pattern through to the argument list literally if no match is found.
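That existence check can be sketched as a small guard at the top of the processing loop (pure POSIX; -e also accepts directories, so use -f if only regular files should pass):

```shell
#!/bin/sh
for f in "$@"; do
    # An unmatched glob arrives here as the literal pattern; skip it.
    [ -e "$f" ] || continue
    echo "Processing $f file..."
done
```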

Bash -- How to execute syntax: ./myscript.sh *.dat?

I would like to run recursively myscript.sh, to execute all files in the directory:
It has been discussed here that I could do it like this:
#!/bin/bash
for file in * ; do
echo $file
done
But I would like myscript.sh to execute with this syntax, so that I could select only certain filetypes to be executed:
./myscript.sh *.dat
Thus I modify the script above:
#!/bin/bash
for file in $1 ; do
echo $file
done
When executed this way, it only processes the first match, not all files with the .dat extension.
What is wrong here?
The wildcard *.dat is expanded by the shell before your script ever sees it. So the filenames show up in your script as $1, $2, $3, etc.
You can work with them all at once by using the special "$@" parameter:
for file in "$@"; do
echo "$file"
done
Note that the double quotes around "$@" are special. From man bash:
@ Expands to the positional parameters, starting from one. When the expansion
occurs within double quotes, each parameter expands to a separate word. That
is, "$@" is equivalent to "$1" "$2" ... If the double-quoted expansion occurs
within a word, the expansion of the first parameter is joined with the
beginning part of the original word, and the expansion of the last parameter
is joined with the last part of the original word. When there are no
positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).
You could do the same thing with just: ls *.dat | xargs echo
You need to get the actual contents of *, like so:
for file in "$@"; do
echo "$file"
done
(Unquoted $* would also appear to work here, but it re-splits and re-expands the arguments, so it breaks on filenames containing whitespace or glob characters.)

Resources