Checking a string in Bash to see if it's a domain - bash

I am working on a project in Bash that takes a live xlsx file, converts it into a csv file, and checks the file to make sure that the data inside it are URLs. This is part of a larger program that will eventually test each URL for domain squatting.
I am having problems verifying the string data. I am teaching myself Bash as I go along, since this is a self-study class. Thanks for the help!
INPUT=domain3.csv
while IFS= read -r line
do
if [[ "$line" == *".com"*] || [ "$line" == *".net"*] || [ "$line" == *".org"*] || [ "$line" == *".biz"*]];
then echo "$line"
else echo "$line is not an URL"
fi
echo "Finished!"
done

Use the =~ operator to perform a regular expression match:
if [[ $INPUT =~ \.(com|net|org)$ ]]
then
echo $INPUT is a domain
else
echo $INPUT is not a domain
fi
The expression reads: if $INPUT contains a dot (\.), followed by one of "com", "net", or "org", followed by the end of the string ($), then it is a domain.
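As a minimal sketch of that match applied to several strings: the helper name is_domain and the sample values are assumptions made for illustration, and "biz" is added to the list only because the question checks for it.

```bash
#!/usr/bin/env bash
# is_domain is a hypothetical helper, not part of the original answer.
is_domain() {
  # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
  local re='\.(com|net|org|biz)$'
  [[ $1 =~ $re ]]
}

for candidate in example.com example.comzzz plainword; do
  if is_domain "$candidate"; then
    echo "$candidate is a domain"
  else
    echo "$candidate is not a domain"
  fi
done
```

Because the pattern ends with $, a string such as example.comzzz is rejected.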

[[ ... ]] (since bash 4.1) temporarily enables the extglob option, so you can write
if [[ "$line" == *.@(com|net|org|biz)* ]];
You probably don't actually want the trailing *, which would let you match things like foo.comzzz.
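A small sketch of the pattern without the trailing *, assuming a hypothetical helper named has_known_tld. The explicit shopt line is only a precaution for shells older than 4.1.

```bash
#!/usr/bin/env bash
shopt -s extglob   # implied inside [[ ]] on bash 4.1+; set explicitly for older versions

# has_known_tld is a hypothetical name used for illustration.
has_known_tld() {
  # @(a|b|c) matches exactly one of the listed alternatives.
  [[ $1 == *.@(com|net|org|biz) ]]
}

has_known_tld "foo.com"    && echo "foo.com matches"
has_known_tld "foo.comzzz" || echo "foo.comzzz does not match"
```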

A case statement.
#!/bin/sh
while IFS= read -r line; do
case $line in
*.com|*.net|*.org|*.biz)
echo "$line";;
*) printf >&2 '%s is not a url!\n' "$line" ;;
esac
done

Run the code below once, then compare it with yours to find the error.
while IFS= read -r line
do
if [[ "$line" == *".com"* ]] || [[ "$line" == *".net"* ]] || [[ "$line" == *".org"* ]] || [[ "$line" == *".biz"* ]]
then
echo "$line"
else
echo "$line is not an URL"
fi
done < "$INPUT"
echo "Finished!"

Related

[[Bash]] Search for combined Expressions in every row

I am very new to Bash scripting and I have a question regarding my CheckOurCodingRules.sh script:
I want to search for every 'hPar,' in a text file and, if found, check whether there is also a 'const' in the same row.
That's what I have so far, but there is something wrong here:
while read line
do
if [[ $line == *hPar\,* ]] && [[ $line == *const\*]];then
DOCUMENTATION_TEST_A=1
else
echo DOCUMENTATION_TEST_A=0
fi
done < $INPUT_FILE
if [[DOCUMENTATION_TEST_A=0]];then
echo "error: Rule1: No const before hpar"
fi
There are a couple of issues with your script, see the code below which works for me:
DOCUMENTATION_TEST_A=0 # initial value
while read line
do
# spaces between conditional and brackets, no backslashes
if [[ $line == *hPar,* ]] && [[ $line == *const* ]]
then
DOCUMENTATION_TEST_A=1
break # optional, no need to scan the rest of the file
fi
done < $INPUT_FILE
# spaces and $, -eq is used for numerical comparisons
if [[ $DOCUMENTATION_TEST_A -eq 0 ]];
then
echo "error: Rule1: No const before hpar"
fi
A cleaner solution would be to use grep:
if ! grep "hPar," $INPUT_FILE | grep "const" >/dev/null
then
echo "error: Rule1: No const before hpar"
fi
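The grep approach can be tried on a throwaway file. This is only a sketch: the function name check_rule1 and the sample lines are assumptions, and -F (fixed string) and -q (quiet) are used in place of the redirect to /dev/null.

```bash
#!/usr/bin/env bash
# check_rule1 and the sample file are hypothetical, for illustration only.
check_rule1() {
  # Succeeds when at least one line contains both "hPar," and "const".
  grep -F 'hPar,' "$1" | grep -qF 'const'
}

sample=$(mktemp)
printf '%s\n' 'void f(const int hPar, int x);' 'void g(int y);' > "$sample"

if ! check_rule1 "$sample"; then
  echo "error: Rule1: No const before hpar"
else
  echo "Rule1 ok"
fi
rm -f "$sample"
```

Like the original pipeline, this only checks that both strings occur on the same line, not that const actually comes before hPar.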

If loop with OR

I have following strings in input.txt file.
Running
isn't running
is running
Stopped
stopped
Aborted
aborted
Here I need to match everything except "Running" and "is running". So far I have the code below, but it seems to print "Running" and "is running" as well. Can someone help?
exec < input.txt
while read line
do
if [[ $line =~ [Aa]borted || [Ss]topped || isn*t ]]; then
echo "$line"
else
echo "FINE"
fi
done
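Why not use grep?
grep -v "unning" < input.txt
prints every line that does not contain "unning". Note that this also drops "isn't running", because that line contains "unning" as well.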
grep -vE '\<(R|is r)unning\>' input.txt
The \< and \> markers are word boundaries, so you won't filter out "this running"
In shell
while IFS= read -r line; do
case $line in
*"Running"* | *"is running"*) echo FINE ;;
*) echo "$line" ;;
esac
done < input.txt
Problem in your code: this line
[[ $line =~ [Aa]borted || [Ss]topped || isn*t ]]
is not equivalent to this
[[ $line =~ [Aa]borted ]] || [[ $line =~ [Ss]topped ]] || [[ $line =~ isn*t ]]
it is equivalent to this
[[ $line =~ [Aa]borted ]] || [[ -n "[Ss]topped" ]] || [[ -n "isn*t" ]]
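because the "or" operator || has lower precedence than the regex match operator =~, so only the first word is matched against $line and each remaining word is tested on its own as a non-empty string.
Since the last 2 conditions are always true, every line passes.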
Note also that you're using glob patterns with a regular expression match operator, so this [[ $line =~ isn*t ]] will not match isn't, it will match is and t with zero or more n in between. (ist, isnt, isnnt, etc)
You intended to write this
[[ $line == [Aa]borted || $line == [Ss]topped || $line == isn*t ]]
which is more concisely written with case:
case $line in
[Aa]borted | [Ss]topped | isn*t) echo "$line" ;;
*) echo FINE ;;
esac
which is the inverse of my answer.
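The parsing difference can be checked directly. In this sketch the function names buggy_match and fixed_match are made up, and the fixed version assumes substring matching is wanted, so that "isn't running" counts as a hit.

```bash
#!/usr/bin/env bash
# buggy_match and fixed_match are hypothetical names for this demonstration.
buggy_match() {
  # Parsed as: ($1 =~ [Aa]borted) || (-n "[Ss]topped") || (-n "isn*t")
  [[ $1 =~ [Aa]borted || [Ss]topped || isn*t ]]
}

fixed_match() {
  # Each alternative names the string it is tested against.
  [[ $1 == *[Aa]borted* || $1 == *[Ss]topped* || $1 == *"isn't"* ]]
}

for s in "Running" "is running" "isn't running" "Stopped"; do
  buggy_match "$s" && b=match || b=no
  fixed_match "$s" && f=match || f=no
  printf '%-14s buggy=%s fixed=%s\n' "$s" "$b" "$f"
done
```

The buggy test reports a match for every input, while the fixed one skips "Running" and "is running".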

Bash while loop if statement

Can anyone see what's wrong here? If I put X|9 in lan.db (or any db in this directory) and run the following code, the if statement does not work. It's weird! If you echo $LINE, it is indeed pulling X|9 out of lan.db (or any db in this directory) and assigning it to LINE, but it won't do the comparison.
DBREGEX="^[0-9]|[0-9]$"
shopt -s nullglob
DBARRAY=(databases/*)
i=0
for i in "${!DBARRAY[@]}"; do
cat ${DBARRAY[$i]} | grep -v \# | while read LINE; do
echo "$LINE" (Whats weird is that LINE DOES contain X|9)
if [[ !( $LINE =~ $DBREGEX ) ]]; then echo "FAIL"; fi
done
done
If, however, I just manually set LINE="X|9", the same code (minus the while) works fine, i.e. LINE=X|9 fails, but LINE=9|9 succeeds.
DBREGEX="^[0-9]|[0-9]$"
Comment shopt -s nullglob
Comment DBARRAY=(databases/*)
Comment i=0
Comment for i in "${!DBARRAY[@]}"; do
Comment cat ${DBARRAY[$i]} | grep -v \# | while read LINE; do
LINE="X|9"
if [[ !( $LINE =~ $DBREGEX ) ]]; then echo "FAIL"; fi
Comment done
Comment done
* UPDATE *
UGH I GIVE UP
Now not even this is working...
DBREGEX="^[0-9]|[0-9]$"
LINE="X|9"
if [[ ! $LINE =~ $DBREGEX ]]; then echo "FAIL"; fi
* UPDATE *
Ok, so it looks like I have to escape |
DBREGEX="^[0-9]\|[0-9]$"
LINE="9|9"
echo "$LINE"
if [[ ! $LINE =~ $DBREGEX ]]; then echo "FAIL"; fi
This seems to work ok again
| has a special meaning in a regular expression. ^[0-9]|[0-9]$ means "starts with a digit, or ends with a digit". If you want to match a literal vertical bar, backslash it:
DBREGEX='^[0-9]\|[0-9]$'
for LINE in 'X|9' '9|9' ; do
echo "$LINE"
if [[ ! $LINE =~ $DBREGEX ]] ; then echo "FAIL" ; fi
done
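To see the two readings side by side, here is a short sketch that runs both regexes against the same inputs. The variable names alt_re and lit_re and the helper matches are assumptions for illustration.

```bash
#!/usr/bin/env bash
alt_re='^[0-9]|[0-9]$'      # alternation: starts with a digit OR ends with a digit
lit_re='^[0-9]\|[0-9]$'     # literal bar: digit, "|", digit

# matches is a hypothetical helper, called as: matches STRING REGEX
matches() { [[ $1 =~ $2 ]]; }

for s in 'X|9' '9|9'; do
  matches "$s" "$alt_re" && a=yes || a=no
  matches "$s" "$lit_re" && l=yes || l=no
  echo "$s alternation=$a literal=$l"
done
```

X|9 passes the unescaped regex because it ends with a digit, which is why the original check never reported FAIL for it.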
You don't need round brackets in the regex evaluation. Your script is also creating a subshell and making a useless use of cat, both of which can be avoided.
Try this script instead:
while read LINE; do
echo "$LINE"
[[ "$LINE" =~ $DBREGEX ]] && echo "PASS" || echo "FAIL"
done < <(grep -v '#' databases/lan.db)
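Why the subshell matters can be shown with a counter. This is a sketch with made-up function names, assuming bash's default settings (the lastpipe option is off).

```bash
#!/usr/bin/env bash
# count_piped and count_procsub are hypothetical names for this comparison.
count_piped() {
  local n=0 line
  printf '%s\n' a b c | while read -r line; do n=$((n + 1)); done
  echo "$n"   # the loop ran in a subshell, so n is unchanged here
}

count_procsub() {
  local n=0 line
  while read -r line; do n=$((n + 1)); done < <(printf '%s\n' a b c)
  echo "$n"   # the loop ran in the current shell, so n was updated
}

echo "pipe: $(count_piped)"
echo "process substitution: $(count_procsub)"
```

With the pipe the count stays at 0; with process substitution it reaches 3, so any flag set inside the loop survives it.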

A simple bash program that prints the third character of a string, that uses [[ ]] (compare to method)

Unfortunately I tried this and it doesn't work. I must use [[ ]].
read input
for i in input
do
if [[ i = "$input" ]]
then
echo "i"
fi
done
When I run this nothing happens; it only reads my input.
This line:
if [[ i = "$input" ]]
should be:
if [[ "$i" = "$input" ]]
OR:
if [[ "$i" == "$input" ]]
PS: The same applies to input.
Remember that shell variables are accessed with the $ prefix.
Maybe you can refactor your script like this:
read input
for i in $input
do
[[ "$i" == "something" ]] && echo "$i"
done
I think when you use only numbers you can also try:
for i in $input
do
if [[ $i -eq "$input" ]]
then
echo "$i"
fi
done
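None of the snippets above actually prints the third character mentioned in the title. One possible sketch, assuming "third character" means the character at offset 2 and using a hypothetical helper named third_char, keeps [[ ]] for the length check:

```bash
#!/usr/bin/env bash
# third_char is a hypothetical helper; it assumes "third character" means offset 2.
third_char() {
  local s=$1
  if [[ ${#s} -ge 3 ]]; then
    echo "${s:2:1}"          # substring expansion: offset 2, length 1
  else
    echo "input is shorter than three characters" >&2
    return 1
  fi
}

third_char "bash"
```

In an interactive script the argument would come from read input followed by third_char "$input".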

how to search for blank lines in bash

I am trying to create an if statement that performs an action when it reads a blank line.
I would assume it would be something like this : if ($line=='\n');then
where line is the line that it is reading from a text file. But this is not working.
while read line; do
if [ "$line" = "" ]; then
echo BLANK
fi
done < filename.txt
or a slight variation:
while read line; do
if [ "$line" ]; then
echo NOT BLANK
else
echo BLANK
fi
done < filename.txt
try this:
if [[ "x$line" == "x" ]]; then...
or
if [[ "$line" =~ ^$ ]]; ...
Or also:
grep -q '.' <<< "$line"
Returns 1 if line is empty, 0 if non-empty
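The checks above treat a line as blank only when it is empty. A hedged sketch that also separates whitespace-only lines follows; classify_line is a made-up helper and the three sample lines stand in for filename.txt.

```bash
#!/usr/bin/env bash
# classify_line is a hypothetical helper for illustration.
classify_line() {
  if [[ -z $1 ]]; then
    echo "BLANK"
  elif [[ $1 =~ ^[[:space:]]+$ ]]; then
    echo "WHITESPACE ONLY"
  else
    echo "NOT BLANK"
  fi
}

# IFS= and -r keep each line exactly as written, including leading spaces.
while IFS= read -r line; do
  classify_line "$line"
done < <(printf '%s\n' 'text' '' '   ')
```

With a plain while read line, leading and trailing whitespace is stripped first, so a whitespace-only line would be reported as blank.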

Resources