How to validate filename in bash script? - bash

A little bit about my application:
I am writing a small application as a bash script. The application must store personal settings in the user's home directory.
My settings are key/value pairs, stored as filename/content. For example:
~/my-github-app
├── github-account
├── github-token
My current solution for adding a key/value:
read KEY
read VALUE
# FIXME! should check for a valid filename.
# A user can attempt path injection with KEY="../../path/to/yourfile"
echo "$VALUE" > ~/my-github-app/"$KEY"
What is the simplest and safe way to validate $KEY?
A built-in function?
A regular expression?
I really need a reusable solution, not just for this application.
Edit:
"validate filename" means checking that a string has a proper filename format, accepted by the OS.
"bob" : good filename format
"" : bad because filename can not be empty
"*" : ?
"/" : ?
"con" : ?
....

The only reliable way to make this secure is to use a whitelist: instead of blacklisting bad characters, you allow only known-good ones. Blacklists tend to fail because you can't enumerate all of the weird characters; you will always forget something, especially if you're working with Unicode strings.
Filenames could contain anything. According to wikipedia:
Ext4 Allowed characters in filenames: All bytes except NUL ('\0') and '/'
Which means that whole bash scripts could be valid filenames.
So, if I were you, I would only allow a-zA-Z as valid characters. Problem solved.
That's how you do it:
# if [[ $key =~ [^a-zA-Z] ]]; then # or this. Whatever makes more sense to you
if ! [[ $key =~ ^[a-zA-Z]+$ ]]; then
echo 'Wrong key. Only a-zA-Z characters are allowed' >&2 # write to stderr
exit 1
fi
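Since the question asks for something reusable, the same whitelist check could be wrapped in a function. This is only a sketch: the names is_valid_key and save_setting are made up for illustration, and the optional directory argument is an assumption that is not part of the original script. LC_ALL=C is set locally so that a-z and A-Z mean plain ASCII ranges.

```bash
# Hypothetical helper: succeeds only if the key consists of ASCII letters.
is_valid_key() {
    local LC_ALL=C            # make [a-zA-Z] mean plain ASCII ranges
    [[ $1 =~ ^[a-zA-Z]+$ ]]
}

# Hypothetical helper: store VALUE in DIR/KEY after validating KEY.
# DIR defaults to the directory used in the question.
save_setting() {
    local key="$1" value="$2" dir="${3:-$HOME/my-github-app}"
    if ! is_valid_key "$key"; then
        echo 'Wrong key. Only a-zA-Z characters are allowed' >&2
        return 1
    fi
    mkdir -p -- "$dir" && printf '%s\n' "$value" > "$dir/$key"
}
```

Note that with this strict whitelist a key such as github-token from the question would be rejected, because it contains a dash; the character class has to be widened if such keys are wanted.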

In addition to @Aleks-Daniel Jakimenko-A.'s answer, the following script checks the conditions below and prints True only if all of them are met:
a-z
A-Z
0-9
underscore (_)
dash (-)
period (.)
Max length is 255
First character should be a letter or a digit: a-z, A-Z, or 0-9 (note that the script below also accepts a leading period)
my_script.sh:
#!/bin/bash
# To run: bash my_script.sh <my_string_is>
key=$1
val=${#key}  # length of the key
if [[ $key == "" ]]; then
echo "False";
exit
fi
if [[ $key == "." ]] || [[ $key == ".." ]]; then
# "." and ".." are added automatically and always exist, so you can't have a
# file named . or .. // https://askubuntu.com/a/416508/660555
echo "False";
exit
fi
if [ $val -gt 255 ]; then
# String's length check
echo "False";
exit
fi
if ! [[ $key =~ ^[0-9a-zA-Z._-]+$ ]]; then
# Checks whether valid characters exist
echo "False";
exit
fi
_key=${key:0:1}  # first character of the key
if ! [[ $_key =~ ^[0-9a-zA-Z.]+$ ]]; then
# Checks the first character
echo "False";
exit
fi
echo "True";
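The same checks could also be expressed as a function that reports its result through the exit status rather than by printing True or False, which makes it easier to reuse in an if statement. This is a sketch under assumptions: the name is_safe_filename is hypothetical, and this variant rejects a leading period, following the rule as listed above rather than the script.

```bash
# Hypothetical helper: exit status 0 if $1 passes all the checks, 1 otherwise.
is_safe_filename() {
    local LC_ALL=C                                # ASCII-only ranges
    local name="$1"
    [[ -n $name ]]                    || return 1 # not empty
    (( ${#name} <= 255 ))             || return 1 # length limit
    [[ $name =~ ^[0-9a-zA-Z._-]+$ ]]  || return 1 # allowed characters only
    [[ $name =~ ^[0-9a-zA-Z] ]]       || return 1 # first char: letter or digit
    return 0
}

for candidate in bob github-token .hidden ../x ""; do
    if is_safe_filename "$candidate"; then
        echo "ok:  '$candidate'"
    else
        echo "bad: '$candidate'"
    fi
done
```

Because the first character must be a letter or digit, the special names . and .. are rejected without a separate test.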

If you just want to check whether a file already exists, use the test command for your validation:
if [[ ! -e "$KEY" ]]
then
# file does not exist
:
fi

What do you want to validate, just that a key doesn't contain any path info?
KEY=$(basename "$KEY")
This would remove any parts of the KEY that are part of the path. That said, there are plenty of things the user could enter that would probably be a bad idea. Can you perhaps have a list of allowed keys, then reject anything that isn't in that list?
If you're trying to see if a file is writable, you could check if a) it exists and is writable (-w) or b) just try to write to it and check for errors.
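The "list of allowed keys" idea could look like the sketch below. The two key names come from the question's example; the function name is_known_key is made up for illustration. A fixed list also avoids a gap in the basename approach: basename "../.." still returns "..".

```bash
# Hypothetical helper: accept only keys from a fixed list.
is_known_key() {
    case $1 in
        github-account|github-token) return 0 ;;
        *)                           return 1 ;;
    esac
}

KEY="../../path/to/yourfile"
if is_known_key "$KEY"; then
    echo "accepted: $KEY"
else
    echo "rejected: $KEY" >&2
fi
```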

Related

Shell, if statement of variable only containing alphabet? [duplicate]

Trying to verify that a string has only lowercase, uppercase, or numbers in it.
if ! [[ "$TITLE" =~ ^[a-zA-Z0-9]+$ ]]; then echo "INVALID"; fi
Thoughts?
* UPDATE *
The variable TITLE currently only has uppercase text, so it should pass and nothing should be output. If, however, I add a special character to TITLE, the if statement should catch it and echo INVALID. Currently it does not work: it always echoes INVALID. I think this is because my regex is wrong; I think the way I have it written, it's looking for a title that has all three in it.
Bash 4.2.25
The idea is, the user should be able to add any title as long as it only contains uppercase, lowercase or numbers. All other characters should fail.
* UPDATE *
If TITLE = ThisIsAValidTitle it echoes INVALID.
If TITLE = ThisIs#######InvalidTitle it also echoes INVALID.
* SOLUTION *
Weird, well it started working when I simplified it down to this:
TEST="Valid0"
if ! [[ "$TEST" =~ [^a-zA-Z0-9] ]]; then
echo "VALID"
else
echo "INVALID"
fi
* REAL SOLUTION *
My variable had spaces in it... DUH
Sorry for the trouble guys...
* FINAL SOLUTION *
This accounts for spaces in titles
if ! [[ "$TITLE" =~ [^a-zA-Z0-9\ ] ]]; then
echo "VALID"
else
echo "INVALID"
fi
I'd invert the logic. Test for invalid characters and echo a warning if at least one is present:
if [[ "$TITLE" =~ [^a-zA-Z0-9] ]]; then
echo "INVALID"
fi
With that said, your original check worked for me, so you probably need to provide more context (i.e. a larger portion of your script).
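Why can't we use alnum? The class has to be written inside a bracket expression, as [[:alnum:]], and anchored so that the whole string is checked:
[[ 'mystring123' =~ ^[[:alnum:]]+$ ]] && echo "ok" || echo "no"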
The nominated answer is wrong because it doesn't check to the end of the string. It is also inverted, as the conditional says: "if the start of the string is valid characters then echo invalid".
[[ $TITLE =~ ^[a-zA-Z0-9_-]{3,20}$ ]] && ret="VALID" || ret="INVALID"
echo $ret
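To see how the anchored and the negated forms from this thread behave, the sketch below runs both against a few sample titles. The sample values and the function name check_title are invented for illustration. Storing each pattern in a variable sidesteps the quoting problems that a literal space inside the regex causes.

```bash
anchored='^[a-zA-Z0-9 ]+$'   # whole string must be letters, digits, spaces
negated='[^a-zA-Z0-9 ]'      # matches if any other character is present

# Hypothetical helper: prints VALID or INVALID for the given title.
check_title() {
    local title="$1"
    # $anchored and $negated stay unquoted so they are treated as regexes.
    if [[ $title =~ $anchored && ! $title =~ $negated ]]; then
        echo "VALID"
    else
        echo "INVALID"
    fi
}

check_title "ThisIsAValidTitle"
check_title "Title With Spaces 42"
check_title "ThisIs#######InvalidTitle"
check_title ""
```

The two forms disagree only on the empty string, which the negated test on its own would accept.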

How do I check if a variable contains at least one alphabetic character in Bash?

The version of bash I use is 4.3.11 and I use 'mcedit' as my script editor. I want to check if a variable contains at least one alphabetical character, such that 'harry33' and 'a1111' are deemed valid.
I've tried the code below in my script however an error is returned which states that there is an error with '[[:'
SOLVED
#name = "test123"
read -p "Enter your name: " name
until [[ "$name" =~ [A-Za-z] ]]; do
read -p "Please enter a valid name: " name
done
The code you wrote has a couple of spacing issues: one after the [[ (which you have already corrected), and the spaces around the equals sign in the assignment, which must not be there:
name="test123"
if [[ "$name" =~ [A-Za-z] ]]; then
echo "Please enter valid input: "
fi
The line "Please enter valid input: " will be printed in this case, because $name contains several characters in the range a-z.
Maybe what you want is the opposite, that the line is printed if the variable contains characters outside the range:
name="test"
if [[ "$name" =~ [^A-Za-z] ]]; then
echo "The input contains characters outside the a-z or A-Z range."
fi
But in this case, the accepted characters may include accented (international) characters like é, è, or ë, which fall inside the a-z range in the collation sequences of several locales.
That also happens with [^[:alpha:]].
Either you embrace full internationalization or limit yourself to old ASCII:
name="test"
LC_COLLATE=C
if [[ "$name" =~ [^A-Za-z] ]]; then
echo "The input contains characters outside the a-z or A-Z range."
fi
If you want to have as valid names with Alpha and digits, there are two options. One which is very strict (old ASCII ranges):
name="harry33"
LC_COLLATE=C
if [[ "$name" =~ ^[0-9A-Za-z]+$ ]]; then
echo "The input only contains digits and alpha."
fi
The other option will also allow é, ß or đ, etc. (which is perfectly fine for an internationalized name); which characters count as alphanumeric is determined by the current locale (LC_CTYPE, or LC_ALL if it is set).
name="harry33"
if [[ "$name" =~ ^[[:alnum:]]+$ ]]; then
echo "The input only contains digits and alpha."
fi
This option will reject $%&() and similar.
The portable solution free of bashisms such as [[ would be
case $name in
(*[a-zA-Z]*)
echo "Yay! Got an alphabetic character."
;;
(*)
echo "Hmm, no a-z or A-Z found."
;;
esac
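For reuse, the case statement could be wrapped in a function that reports through its exit status. A sketch: the name has_alpha is hypothetical, the sample inputs are taken from the question plus two invented ones, and the body uses only POSIX sh constructs, so it also runs under bash.

```bash
# Hypothetical helper: succeeds if $1 contains at least one character
# in the a-z or A-Z range.
has_alpha() {
    case $1 in
        *[a-zA-Z]*) return 0 ;;
        *)          return 1 ;;
    esac
}

for name in harry33 a1111 12345 ""; do
    if has_alpha "$name"; then
        echo "valid:   '$name'"
    else
        echo "invalid: '$name'"
    fi
done
```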
First, Bash is picky about spacing, so put a space after [[ and before ]]. Also, if you are looking for user names, I'd think you'd want the name to start with a letter and, if it doesn't, to echo your prompt.
if [[ ! $var =~ ^[[:alpha:]] ]]; then
echo -n "Please enter valid input: "
read response
fi

Bash - check for a string in file path

How can I check for a string in a file path in bash? I am trying:
if [[$(echo "${filePathVar}" | sed 's#//#:#g#') == *"File.java"* ]]
to replace all forward slashes with a colon (:) in the path. It's not working. Bash is seeing the file path string as a file path and throws the error "No such file or directory". The intention is for it to see the file path as a string.
Example: filePathVar could be
**/myloc/src/File.java
in which case the check should return true.
Please note that I am writing this script inside a Jenkins job as a build step.
Updates as of 12/15/15
The following returns Not found, which is wrong.
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/B.java"
if [[ "${sources}" = ~B.java[^/]*$ ]];
then
echo "Found!!"
else
echo "Not Found!!"
fi
The following returns Found, which is also wrong (removed the space around the comparator =).
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/C.java"
if [[ "${sources}"=~B.java[^/]*$ ]];
then
echo "Found!!"
else
echo "Not Found!!"
fi
The comparison operation is clearly not working.
It is easier to use bash's builtin regex matching facility:
$ filePathVar=/myLoc/src/File.java
if [[ "$filePathVar" =~ File.java[^/]*$ ]]; then echo Match; else echo No Match; fi
Match
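Inside [[...]], the operator =~ does regex matching. The regular expression File.java[^/]*$ matches any string in which File.java appears after the last /, optionally followed by other characters. (The unescaped . matches any character; write File\.java to require a literal period.)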
It worked in a simpler way as below:
#!/bin/bash
sources="**/src/TESTS/A.java **/src/TESTS/B.java"
if [[ $sources == *"A.java"* ]]
then
echo "Found!!"
else
echo "Not Found!!"
fi
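A substring match like the one above also succeeds for names such as XA.java. If an exact file name is wanted, one option is to split the list on whitespace and compare the last component of each path. This is a sketch that assumes the paths themselves contain no spaces; contains_file is a hypothetical name.

```bash
# Hypothetical helper: succeeds if any whitespace-separated path in $1
# has exactly $2 as its last component.
contains_file() {
    local list="$1" wanted="$2" path
    local -a paths
    read -r -a paths <<< "$list"       # split on whitespace, no globbing
    for path in "${paths[@]}"; do
        [[ ${path##*/} == "$wanted" ]] && return 0
    done
    return 1
}

sources="**/src/TESTS/A.java **/src/TESTS/B.java"
if contains_file "$sources" "B.java"; then
    echo "Found!!"
else
    echo "Not Found!!"
fi
```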

Check if a variable contains a domain name in a set of TLDs

I have a variable name which contains name of DNS record. I want to check for certain conditions, for example if it ends with .com or .es.
Right now I am using
name="test.com"
namecheck=`echo $name | grep -w "^[A-Za-z]*..com"`
but it only checks the com and ignores the period. Also, is it possible to check it against a series of values stored in an array, like
domain=[ ".es" ".com" ".de"]
Pure bash implementation:
name=test.com
domains=(es com de)
for dom in "${domains[@]}"; do
[[ $name == *."$dom" ]] && echo "$name" && break
done
You can use this egrep:
egrep -q "\.(com|es|de)$"
This will return 0 (true) if given input is ending with .com OR .es OR .de
EDIT: Using it with an array of allowed domains:
domain=( "es" "com" "de" )
str=$(printf "|%s" "${domain[@]}")
str="${str:1}"
echo "abc.test.com"|egrep "\.(${str})$"
abc.test.com
echo "abc.test.org"|egrep "\.(${str})$"
POSIX shell compliant and super-fast:
case "$name" in *.de|*.com|*.es) echo "$name" or whatever ;; esac
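Combining the array from the question with the pattern match from the pure bash answer gives a small reusable check. This is a sketch; has_allowed_tld is a hypothetical name and the sample names are invented.

```bash
# Hypothetical helper: has_allowed_tld NAME TLD...
# Succeeds if NAME ends with a period followed by one of the given TLDs.
has_allowed_tld() {
    local name="$1" tld
    shift
    for tld in "$@"; do
        # Quoting "$tld" makes it match literally rather than as a glob.
        [[ $name == *."$tld" ]] && return 0
    done
    return 1
}

domains=(es com de)
for name in test.com abc.test.org testcom; do
    if has_allowed_tld "$name" "${domains[@]}"; then
        echo "$name: allowed"
    else
        echo "$name: not allowed"
    fi
done
```

Passing the list as arguments keeps the function independent of any particular global array.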

BASH: Everything but not slash? IF STATEMENT (STRING COMPARISON)

I'm trying to match any string that starts with /John/ but does not contain a / after /John/
if
[ $string == /John/[!/]+ ]; then ....
fi
This is what I got and it doesn't seem to be working.
So I tried
if
[[ $string =~ ^/John/[!/]+$ ]]; then ....
fi
It still didn't work, and so I changed it to
if
[[ $string =~ /John/[^/] ]]; then ....
fi
It worked, but it also matches all the strings that have a / after /John/.
For bash you want [[ $string =~ ^/John/[^/]*$ ]] -- the ^ anchors the match at the start of the string, and the end-of-line anchor ensures there are no slashes after the last acceptable slash.
How about "the string starts with '/John/' and doesn't contain any slashes after '/John/'"?
[[ $string = /John/* && $string != /John/*/* ]]
Or you could compare against a parameter expansion that only expands if the conditions are met. This says "after stripping off everything including and after the last slash, the string is /John":
[[ ${string%/*} = /John ]]
In fact, this last solution is the only entirely POSIXLY_STRICT one I can come up with without multiple test expressions.
[ "${string%/*}" = /John ]
By the way, your problem is probably simply using double-equals inside a single-bracket test expression. bash does accept it inside double-bracket test expressions, but a single equals is a better idea.
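A quick way to compare the pattern test and the parameter-expansion test is to run both over a few sample strings. The sample values and the function name under_john are invented for illustration. Both forms also accept /John/ itself, with nothing after the slash.

```bash
# Hypothetical helper: succeeds if $1 starts with /John/ and has no later slash.
under_john() {
    [[ $1 = /John/* && $1 != /John/*/* ]]
}

for string in /John/Lennon /John/Lennon/Yoko /Paul/John/x /John/ /John; do
    if under_john "$string"; then pattern=yes; else pattern=no; fi
    if [ "${string%/*}" = /John ]; then expansion=yes; else expansion=no; fi
    echo "$string pattern=$pattern expansion=$expansion"
done
```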
You can also use plain old grep:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -q "/John[^/]" ; then
echo "matched"
else
echo "no match found"
fi
This only fails if /John is at the very end of the string... if that's a possibility then you can tweak to handle that case, for instance:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -qP "(/John[^/])|(/John$)" ; then
echo "matched"
else
echo "no match found"
fi
Not sure what language you're using, but normal negative character classes are prefixed with a ^
e.g.
[^/]
You can also put in start/end qualifiers (clojure example, so Java's regex engine). Usually ^ at beginning and $ at end.
user => (re-matches #"^/[a-zA-Z]+[^/]$" "/John/")
nil
