how to match a specific file extension in shellscript - bash

I looked at some other posts and learned to match a file extension in the following way, but why is my code not working? Thanks.
#!/bin/sh

for i in `ls`
do
    if [[ "$i" == *.txt ]]
    then
        echo "$i is .txt file"
    else
        echo "$i is NOT .txt file"
    fi
done
Edit:
I realized that #!/bin/sh and #!/bin/bash are different. If you are reading this post later, remember to check which one you are using.

The [[ ]] expression is only available in some shells, like bash and zsh. Some more basic shells, like dash, do not support it. I'm guessing you're running this on a recent version of Ubuntu or Debian, where /bin/sh is actually dash, and hence doesn't recognize [[. And actually, you shouldn't use [[ ]] with a #!/bin/sh shebang anyway, since it's unsafe to depend on a feature that the shebang doesn't request.
So, what to do about it? You'll have the [ ] type of test expression available, but it doesn't do pattern matching (like *.txt). There are a number of alternate ways to do it:
The case statement is available in even basic shells, and has the same pattern matching capability as [[ = ]]. This is the most common way to do this type of thing, especially when you have a list of different patterns to check against.
More indirectly, you can use ${var%pattern} to try to remove .txt from the end of the value (see "Remove Smallest Suffix Pattern" in the POSIX shell documentation), and then check whether that changed the value:
if [ "$i" != "${i%.txt}" ]
More explanation: suppose $i is "file.txt"; then this expands to [ "file.txt" != "file" ], so they're not equal, and the test (for !=) succeeds. On the other hand, if $i is "file.pdf", then it expands to [ "file.pdf" != "file.pdf" ], which fails because the strings are the same.
Other notes: when using [ ], use a single equal sign for string comparison, and be sure to properly double-quote all variable references to avoid confusion. Also, if you use anything that has special meaning to the shell (like < or >), you need to quote or escape them.
You could use the expr command's : operator to do regular expression matching. (Regular expressions are a different type of pattern from the basic wildcard or "glob" expression.) You could do this, but don't.
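As a minimal sketch of the case approach described above, the loop from the question could be written for plain /bin/sh roughly like this. The helper name is_txt is made up for illustration; it is not part of any shell.

```shell
# is_txt is a hypothetical helper: it succeeds if the name ends in .txt.
is_txt() {
    case $1 in
        *.txt) return 0 ;;
        *)     return 1 ;;
    esac
}

# Loop over the files in the current directory with a glob instead of ls.
for i in *
do
    if is_txt "$i"
    then
        echo "$i is .txt file"
    else
        echo "$i is NOT .txt file"
    fi
done
```

Because case does its own pattern matching, this should behave the same under dash, bash and other Bourne-compatible shells.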

#!/bin/bash
for i in `ls`
do
if [[ "$i" = *".txt" ]] ; then
echo "$i is .txt file"
else
echo "$i is NOT .txt file"
fi
done

You don't have to loop over the output of ls, and the sh implementation may vary among OS distributions.
Consider:
#!/bin/bash
for i in *
do
if [[ "$i" == *.txt ]]
then
echo "$i is txt file"
else
echo "$i is NOT txt file"
fi
done

Related

Why am i getting the binary operator expected error

I'm trying to write a shell script to check if there's a file existing that ends with .txt using an if statement.
Within single bracket conditionals, all of the Shell Expansions will occur, particularly in this case Filename expansion.
The conditional construct acts upon the number of arguments it's given: -f expects exactly one argument to follow it, a filename. Apparently your *.txt pattern matches more than one file.
If your shell is bash, you can do
files=(*.txt)
if (( ${#files[@]} > 0 )); then ...
or, more portably:
count=0
for file in *.txt; do
count=1
break
done
if [ "$count" -eq 0 ]; then
echo "no *.txt files"
else
echo "at least one *.txt file"
fi
I finally get your perspective now. I've been giving you some incomplete advice. This is what you need:
for f in *.txt; do
if [ -f "$f" ]; then
do_something_with "$f"
fi
done
The reason: if there are no files matching the pattern, then the shell leaves the pattern as a plain string. On the first iteration of the loop, we have f="*.txt" and mv responds with "file not found".
I'm used to working in bash with the nullglob option that handles this edge case.

Fixing POSIX sh warning in a small Bash program

I wrote the following code in Bash:
#!/bin/sh
host=$1
regex="^(((git|ssh|http(s)?)|(git@[\w\.]+))(:(\/\/)?)([A-Za-z0-9.@:_/-]+)\.com)(.*)"
if [[ "$host" =~ $regex ]]; then
d=${BASH_REMATCH[1]}
if [[ "$d" = *github* ]]; then
return
fi
fi
die "Current repository is not stored in Github."
I want to learn how to write better Bash code, so I ran it through shellcheck.net:
Line 5:
if [[ "$host" =~ $regex ]]; then
^-- SC2039: In POSIX sh, [[ ]] is undefined.
Line 6:
d=${BASH_REMATCH[1]}
^-- SC2039: In POSIX sh, array references are undefined.
Line 7:
if [[ "$d" = *github* ]]; then
^-- SC2039: In POSIX sh, [[ ]] is undefined.
I'm trying to understand how to fix those warnings. I understand that in order to fix [[ ]] I need to switch it to [ ], but then I get an error due to globs. Also, how should I replace the =~ operator?
When you write #!/bin/sh then you shouldn't use bash-specific features like [[. But you don't need to change [[ to [ or anything like that; just change the shebang line to #!/bin/bash. Then you can use all the bash features you like.
Use grep and sed in posix.
# use grep -q to match with regex
if printf "%s\n" "$host" | grep -q '\(git\|ssh\|http\(s\)\)etc. etc. etc.'; then
# use sed to extract part of the string matching regex
d=$(printf "%s\n" "$host" | sed 's/\(g\|ssh\|http\(s\)\)etc. etc. etc./\2/')
if printf "%s\n" "$d" | grep -q github; then
return
fi
fi
Finding out proper regexes is left to others.
You could try to parse out the different parts with parameter expansions though it's going to get a bit tedious. (The link is to the Bash manual; only a few of the expansions supported by Bash are POSIX.)
Assuming the input is a valid, well-formed URL (which may or may not be warranted) maybe try
host=$1
tail=${1#*://*/}
case $tail in "$host") tail=${host#*/};; esac
case ${host%/$tail} in
*github.com) return ;;
esac
die "Current repository is not stored in Github."
(where of course we assume that this is in a context where return makes sense, and where die is defined separately, like we have to assume in the original code).
This is quite a lot simpler than the regex you presented, and definitely does not cover all the strings that the regex would be able to handle; but perhaps it doesn't have to be all that complex if we can assume that the URL has gone through some sort of validation (i.e. if it's the output from git remote it's pretty safe to assume that the user has verified it by other means already).
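A somewhat more defensive sketch of the same parameter-expansion idea, wrapped in a function so that return makes sense. The name is_github_url is made up for this example, and it assumes the input is either a scheme://host/path URL or an scp-style user@host:path address.

```shell
# is_github_url is a hypothetical name; it succeeds if the host part of
# its argument is exactly github.com. Uses only POSIX parameter expansion.
is_github_url() {
    rest=${1#*://}              # drop "scheme://" if present
    authority=${rest%%/*}       # keep everything before the first "/"
    authority=${authority#*@}   # drop "user@" if present
    host=${authority%%:*}       # drop ":port" or an scp-style ":path"
    case $host in
        github.com) return 0 ;;
        *)          return 1 ;;
    esac
}
```

With die defined elsewhere, as in the original code, the call site could then read: is_github_url "$1" || die "Current repository is not stored in Github."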

shell script how to compare file name with expected filename but different extension in a single line

I have a question about shell scripting.
To describe the scenario: $file contains the file name I'm interested in.
$file can be foo.1, foo.2, or foo.3; foo is constant,
but .1, .2, .3 will change. I want to test this in a single line in an if statement, something like
if [ $file = "foo.[1-9]" ]; then
echo "File name is $file"
fi
I know the above script doesn't work :) Can anyone suggest what I should refer to for this?
Trim any extension, then see if it's "foo"?
base=${file%.[1-9]}
if [ "$base" = "foo" ]; then
echo Smashing success
fi
Equivalently, I always like to recommend case because it's portable and versatile.
case $file in
foo.[1-9] ) echo Smashing success ;;
esac
The syntax may seem weird at first but it's well worth knowing.
Both of these techniques should be portable to any Bourne-compatible shell, including Dash and POSIX sh.
Use [[ instead for regex matching.
if [[ $file =~ ^foo\.[1-9]$ ]] ; ...

Pattern matching within a string

I am writing a bash script that contains a command similar to:
echo Configure with --with-foo=\"/tmp/foo-*\"
I wanted this command to produce output such as:
Configure with --with-foo="/tmp/foo-1.3.2"
but the pattern wasn't expanded because it was embedded within a string. I got it to work by using command substitution:
echo Configure with --with-foo=\"$(echo /tmp/foo-*)\"
I think this is the standard /bin/sh solution, but does bash support a solution that doesn't require forking a sub-shell, in the same way that $((6 * 7)) can be used in place of $(expr 6 \* 7)? Also, is there a way to restrict the result to a single match?
To check how many files your pattern expands into, store the expansion in an array before using it:
shopt -s nullglob
foo=(/tmp/foo-*)
if (( ${#foo[@]} == 0 )); then echo "no foo files"
elif (( ${#foo[@]} > 1 )); then echo "too many foo files"
else echo "Configure with --with-foo=\"${foo[0]}\""
fi
As an alternative, use a for loop and break after the first iteration:
shopt -s nullglob
for f in /tmp/foo-*; do
echo "Configure with --with-foo=\"$f\""
break
done
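For a plain /bin/sh script, where arrays and nullglob are not available, a similar check can be sketched with the positional parameters, again without a command substitution. The function name single_match, the match variable and the return codes are made up for this example.

```shell
# single_match PREFIX: store the one path that starts with PREFIX in $match.
# Returns 1 if nothing matches and 2 if several paths match.
single_match() {
    set -- "$1"*              # expand the glob into $1, $2, ...
    match=
    [ -e "$1" ] || return 1   # an unmatched glob stays a literal string
    [ "$#" -eq 1 ] || return 2
    match=$1
}

if single_match /tmp/foo-; then
    echo "Configure with --with-foo=\"$match\""
fi
```

The set -- call only replaces the function's own positional parameters, so the caller's arguments are left alone.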

Bash script if statements

In Bash script, what is the difference between the following snippets?
1) Using single brackets:
if [ "$1" = VALUE ] ; then
# code
fi
2) Using double brackets:
if [[ "$1" = VALUE ]] ; then
# code
fi
The [[ ]] construct is the more versatile Bash version of [ ]. This is the extended test command, adopted from ksh88.
Using the [[ ... ]] test construct, rather than [ ... ] can prevent many logic errors in scripts. For example, the &&, ||, <, and > operators work within a [[ ]] test, despite giving an error within a [ ] construct.
More info on the Advanced Bash Scripting Guide.
In your snippets, there's no difference as you're not using any of the additional features.
[ is a bash builtin, [[ is a keyword. See the bash FAQ. Beware: most bash scripts on the internet are crap (don't work with filenames with spaces, introduce hidden security holes, etc.), and bash is much more difficult to master than one might think. If you want to do bash programming, you should study at least the bash guide and the bash pitfalls.
Using [[ suppresses the normal word splitting and pathname expansion on the expression in the brackets. It also enables a number of additional operations, like pattern matching.
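A small sketch of that difference, assuming the script runs under bash (or another shell that supports [[ ]]). The variable names are only for illustration.

```shell
# Requires bash or another shell with [[ ]]; plain sh (e.g. dash) will fail.
name="my file.txt"

# [[ ]]: $name is not word-split, the right-hand side of == is a glob
# pattern, and && can be used inside the brackets.
if [[ -n $name && $name == *.txt ]]; then
    double_result=match
else
    double_result="no match"
fi

# [ ]: every expansion must be quoted, there is no pattern matching,
# and the two tests are joined with the shell's own &&.
if [ -n "$name" ] && [ "$name" != "${name%.txt}" ]; then
    single_result=match
else
    single_result="no match"
fi

echo "[[ ]]: $double_result, [ ]: $single_result"
```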
Just in case portability is needed:
For portability testing you can get the Bourne shell via the Heirloom project or:
http://freshmeat.net/projects/bournesh
(On Mac OS X, for example, /bin/sh is no pure Bourne shell.)
[ is also an external program, which doesn't mean that it isn't a builtin.
which [
/usr/bin/[
Inside single square brackets you have to use -lt for 'less than' rather than <, whereas inside double square brackets you can also use < (which compares its operands as strings):
if [ 3 -lt 4 ] ; then echo yes ; fi
yes
if [ 3 < 4 ] ; then echo yes ; fi
bash: 4: No such file or directory
if [[ 3 < 4 ]] ; then echo yes ; fi
yes
if [[ 3 -lt 4 ]] ; then echo yes ; fi
yes
"4: No such file or directory" means that the shell tried to read from a file named "4", because < redirects stdin. The same applies to > and stdout.
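One caveat, sketched here under the assumption that your shell's [ accepts an escaped \< (bash does, although POSIX does not require it): < compares strings, while -lt compares integers, so the two can disagree once the numbers have different lengths.

```shell
# String comparison: "10" sorts before "9" because the character 1
# comes before the character 9.
if [ 10 \< 9 ]; then string_result=true; else string_result=false; fi

# Integer comparison: 10 is not less than 9.
if [ 10 -lt 9 ]; then numeric_result=true; else numeric_result=false; fi

echo "string: $string_result, numeric: $numeric_result"
```

For numbers, -lt (or bash's (( )) arithmetic) is the safer choice.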

Resources