Check for valid link (URL) - bash

I was reading through this other question, which has some really good regexes for the job, but as far as I can see none of them work with Bash commands, as Bash commands don't support such complex regexes.
if echo "http://www.google.com/test/link.php" | grep -q '(https?|ftp|file)://[-A-Z0-9\+&##/%?=~_|!:,.;]*[-A-Z0-9\+&##/%=~_|]'; then
echo "Link valid"
else
echo "Link not valid"
fi
But this doesn't work as grep -q doesn't work ...
Edit: OK, I just realised that grep has an "extended-regex" (-E) option, which seems to make it work. But if anyone has a better/faster way, I would still love to hear about it.
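For reference, a minimal sketch of the -E variant mentioned in the edit. The function name is_valid_link is hypothetical, and the -i flag is an addition here, on the assumption that it is needed because the bracket expressions only list A-Z while the sample URL is lower-case.

```shell
#!/bin/bash
# Hypothetical helper, not from the original post.
is_valid_link() {
    # -E: extended regex, so ?, | and () are operators without backslashes
    # -i: ignore case (assumption: the classes list only A-Z)
    # -q: print nothing, report the result through the exit status
    printf '%s\n' "$1" |
        grep -Eqi '(https?|ftp|file)://[-A-Z0-9\+&##/%?=~_|!:,.;]*[-A-Z0-9\+&##/%=~_|]'
}

if is_valid_link "http://www.google.com/test/link.php"; then
    echo "Link valid"
else
    echo "Link not valid"
fi
```

Because the regex is not anchored, this only checks that the input contains something URL-shaped somewhere in it.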

The following works in Bash >= version 3.2 without using grep:
regex='(https?|ftp|file)://[-[:alnum:]\+&##/%?=~_|!:,.;]*[-[:alnum:]\+&##/%=~_|]'
string='http://www.google.com/test/link.php'
if [[ $string =~ $regex ]]
then
echo "Link valid"
else
echo "Link not valid"
fi
I simplified your regex by using [:alnum:], which also matches non-ASCII alphanumeric characters (e.g. Э or ß), although support for that varies with the underlying regex library. Another potential simplification is to use + instead of * followed by a repeated bracket expression (although your second bracket expression differs slightly from the first):
regex='(https?|ftp|file)://[-[:alnum:]\+&##/%?=~_|!:,.;]+'
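One thing worth noting is that the regexes above are unanchored, so they succeed if a URL appears anywhere in the string. The following is a sketch of an anchored variant wrapped in a function; the name is_url and the added ^ and $ are assumptions for illustration, not part of the answer.

```shell
#!/bin/bash
# Hypothetical wrapper; ^ and $ are additions not present in the regex above.
is_url() {
    local regex='^(https?|ftp|file)://[-[:alnum:]\+&##/%?=~_|!:,.;]+$'
    # The regex is expanded unquoted on purpose: quoting the right-hand side
    # of =~ makes Bash (3.2 and later) treat it as a literal string.
    [[ $1 =~ $regex ]]
}

for candidate in 'http://www.google.com/test/link.php' 'see http://www.google.com now'; do
    if is_url "$candidate"; then
        echo "valid:   $candidate"
    else
        echo "invalid: $candidate"
    fi
done
```

With the anchors in place, the second candidate is rejected because of the surrounding text; without them it would be accepted.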

Since I don't have enough rep to comment above, I am going to amend the answer given by Dennis above with this one.
I incorporated Christopher's update to the regex and then added more to it so that the URL has to at least be in this format:
http://w.w (has to have a period in it).
And tweaked output a bit :)
regex='^(https?|ftp|file)://[-A-Za-z0-9\+&##/%?=~_|!:,.;]*[-A-Za-z0-9\+&##/%=~_|]\.[-A-Za-z0-9\+&##/%?=~_|!:,.;]*[-A-Za-z0-9\+&##/%=~_|]$'
url='http://www.google.com/test/link.php'
if [[ $url =~ $regex ]]
then
echo "$url IS valid"
else
echo "$url IS NOT valid"
fi

Probably because the regular expression is written in PCRE syntax. See if you have (or can install) the program pcregrep on your system - it has the same syntax as grep but accepts Perl-compatible regexes - and you should be able to make that work.
Another option is to try the -P option to grep, but the man page says that's "highly experimental" so it may or may not actually work.
I will say that you should think carefully about whether it's really appropriate to be using this or any regex to validate a URL. If you want to have a correct validation, you'd probably be better off finding or writing a small script in, say, Perl, to use the URL validation facilities of the language.
EDIT: In response to your edit in the question, I didn't notice that that regex is also valid in "extended" syntax. I don't think you can get better/faster than that.

Related

How could I change this egrep script to a zgrep script and still have it work?

I'm trying to look for phone numbers in any of the following formats: +1.570.555.1212, 570.555.1212, (570)555-1212, and 570-555-1212. We also need to look in compressed folders using zgrep, but when I tried that my code came back with "No matches found". The code below works as it is for finding phone numbers in txt files. It is very bad, but here it is.
Code:
#!/bin/bash
egrep '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
if [ $? -eq 0 ] ; then echo $1 ; else echo "No matches found" ; fi 2>/dev/null

zgrep without any options is equivalent in its regex capabilities to grep; you need to say zgrep -E if you want to use grep -E (aka egrep) regex syntax when searching compressed files.
#!/bin/bash
if zgrep -E -q '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
then
echo "$1"
else
echo "No matches found" >&2
fi
Notice also the questions 'Why is testing “$?” to see if a command succeeded or not, an anti-pattern?' and 'When to wrap quotes around a shell variable', as well as the preference for -q over redirecting to /dev/null, and the displaying of error messages on standard error (>&2 redirection).
Your regex could also use some refactoring; maybe try
(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}
Notice how round brackets and the plus sign need to be backslash-escaped to match literally, and how after refactoring out the +(1) prefix as optional the rest of the regex subsumes all the other variants you had enumerated, because . matches - and ( and . and many other characters. (The optional prefix could also be dropped completely and this would still match the same strings, but I had to guess some things so I am leaving it in with this remark.)
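To see how permissive the unescaped dots are, here is a small sketch that runs the suggested regex and a tighter one over sample lines. The tighter regex, the sample data and the helper names are all hypothetical, chosen only to illustrate the point about . matching any character.

```shell
#!/bin/bash
# Sample lines: the four formats from the question plus one non-phone string.
samples() {
    printf '%s\n' '+1.570.555.1212' '570.555.1212' '(570)555-1212' \
        '570-555-1212' '5701555x1212'
}

# Regex as suggested in the answer: every '.' matches any character.
loose_re='(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}'

# Hypothetical tighter variant: separators limited to '.', '-' or ')'.
strict_re='(\+1\.)?\(?[0-9]{3}[-.)][0-9]{3}[-.][0-9]{4}'

# Hypothetical helper: count the sample lines that match a given regex.
count_matches() {
    samples | grep -E -c -- "$1"
}

echo "loose:  $(count_matches "$loose_re")"
echo "strict: $(count_matches "$strict_re")"
```

The loose regex counts all five lines, including 5701555x1212, while the tighter variant counts only the four intended formats. The same regexes should work with zgrep -E on compressed files.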

bash: comparing strings with grep-variable issue

bash-newbie here.
I want to use the following simple script as a shortcut to enable/disable the touchpad of my laptop:
#!/bin/bash
result=$(xinput --list-props 11 | grep "Device Enabled")
echo $result
# Output: Device Enabled (140): 1
if [[ "$result" = "Device Enabled (140): 1" ]]; then
`xinput set-prop 11 "Device Enabled" 0`
else
`xinput set-prop 11 "Device Enabled" 1`
fi
The if-condition is however never entered. echo $result shows that the variable really contains the string-value that I want to compare. I have been searching for a while but can not at all figure out why the result-variable and the string do not match in the if-condition.

The string obtained by grep has a tab at the beginning, which needed to be included in the compared string. Checking again with echo "$result" (with added quotation marks) helped.

In bash (but not more basic shells), you can use [[ ]]'s pattern matching capabilities to check whether a string contains a pattern; using it this way removes the need to worry about leading tabs, or even the need to use grep to pick out the relevant line:
if [[ "$(xinput --list-props 11 | grep "Device Enabled")" = *"Device Enabled (140): 1"* ]]; then
Note that the *s at the beginning and end of the pattern mean "preceded by anything" and "followed by anything" respectively.
Also, the double-quotes around $(xinput ...) aren't really necessary here, but IMO keeping track of which places are safe to leave double-quotes off and when it's not safe is too much trouble. (The left side of an = comparison inside [[ ]] is one of the safe places, but the right side isn't, and in [ ] it's almost never safe -- good luck remembering all that correctly!) So I prefer to just always use double-quotes.
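The difference between the exact comparison and the pattern match can be tried without a touchpad. This sketch uses a fixed string as a stand-in for the grep output, with the leading tab assumed as described in the other answer; match_kind is a hypothetical helper.

```shell
#!/bin/bash
# Stand-in for the grep output; the leading tab is assumed, as described above.
result=$'\tDevice Enabled (140): 1'

# Hypothetical helper: report how the string compares to the expected text.
match_kind() {
    if [[ "$1" = "Device Enabled (140): 1" ]]; then
        echo "exact match"
    elif [[ "$1" = *"Device Enabled (140): 1"* ]]; then
        echo "substring match"
    else
        echo "no match"
    fi
}

# printf %q shows the string in a re-usable quoted form, exposing the tab.
printf '%q\n' "$result"
match_kind "$result"
```

printf '%q' is a handy way to spot invisible whitespace that an unquoted echo $result hides.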

Check if string is in date format (UNIX)?

I would like to check whether a variable is in the correct date format or is empty. If it is in the correct date format, then I will perform something.
I have tried:
dada=2015-10-11
if [[ "$dada" = ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]
then echo "Date $dada is valid (YYYY-MM-DD)"
else echo "Date $dada is not invalid format (YYYY-MM-DD)"
fi
And also
if [ "`date '+%Y-%m-%d' -d $d 2>/dev/null`" = "$dada" ]
then echo "Date $dada is valid (YYYY-MM-DD)"
else echo "Date $dada is not invalid format (YYYY-MM-DD)"
fi
But it seems like it will always return and telling me that my format is incorrect.
$dada is a dynamic variable whereby it can be a number ('444.1'), a date ('2017-11-12') or a string ('hello this is not valid').

Converting an extensive set of comments into an answer.
How thorough a check do you want? Should the check reject 2015-02-29, for example?
2015-02-29 should be also rejected yup!
If you need to reject 2015-02-29, you're going to need much more checking than a single line — or the single line will be very long and complex and will have many alternatives in it.
The classic way to validate the data pattern would use the pattern matching from the case statement — maybe using something like this:
case "$dada" in
([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]) : OK;;
(*) : Not OK;;
esac
but there are probably better modern ways of doing it. That mainly allows years 18xy, 19xy, 20xy, 21xy (though it does also let through 10xy, 11xy, 28xy, 29xy); you'll have to decide whether that's sensible. Similarly, it lets through months 13-19 (and 00), and days 32-39 (and 00); those are unconditionally invalid. Then you're left with "30 days hath September, …" to worry about.
If you removed the leading ( around the patterns, that statement would work in antique and archaic shells such as the Bourne shell. It isn't tied to Korn shell — it is standard notation in POSIX-like shells, and pre-POSIX shells.
How about just checking if the string format is in place like XXXX-XX-XX?
The case command I showed does a reasonable job for years in the range 1800 through 2199. But it is 'old school' notation. The merit is it works and I don't have to read the manual. Test it — change the : commands into echo.
I have tried the case but it seems like the code did not identify my data as a date. Is there any problem with my declaration of dada?
On my Mac, I was able to run (verbatim — a single line command):
ksh -c 'dada=2015-10-11; case "$dada" in ([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]) echo OK;; (*) echo Not OK;; esac'
and I got OK as the output. For values such as 2215-10-11 and 2015-20-11, I got Not OK. It would be better, but isn't actually crucial, to use dada="2015-01-11"; instead of the unquoted form.
How about if I were to add a time at the back of the date — 2015-20-11 23:21? Can I write it as 'case "$HELLO" in ([0-3][0-9]/[01][0-9]/[0-9][0-9] [0-2][0-9]:[0-6][0-9])'
You could certainly add a glob expression that would match the time. I don't understand why the one you propose might be correct, but other patterns could be used.
For example:
dada="2015-11-20 23:21"
case "$dada" in
([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]\ [012][0-9]:[0-5][0-9])
echo OK;;
(*) echo Not OK;;
esac
Note that the backslash is needed before the space in the pattern. When run with the data shown, the script reports OK. Change 23 to 32 and it reports Not OK.
There probably is a way to do this with the [[ command instead of writing out the case statement.
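A sketch of the [[ route, assuming Bash 3.2 or later: the operator has to be =~ (a plain = compares against a glob pattern, not a regex), and the regex is kept in a variable and expanded unquoted. The function name is_date_shaped is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: checks only the YYYY-MM-DD shape, not the calendar.
is_date_shaped() {
    local re='^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
    [[ $1 =~ $re ]]
}

for dada in '2015-10-11' '444.1' 'hello this is not valid' ''; do
    if is_date_shaped "$dada"; then
        echo "Date '$dada' is in YYYY-MM-DD format"
    else
        echo "Date '$dada' is not in YYYY-MM-DD format"
    fi
done
```

This is purely a shape check, so impossible dates such as 2015-02-29 or 2015-99-99 still pass.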
Doing more complex (thorough) validation using case is probably not a good idea. You'd do better to invoke a tool that validates dates properly. You might be able to use the (GNU) date command, or you could use Perl or Python or one of those scripting languages. These would reject 2015-02-29 23:21 but allow 2016-02-29 23:21 without problem.

dada=2015-10-11
if date -d "$dada" > /dev/null 2>&1
then echo ok
else echo not ok
fi
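GNU date -d on its own also accepts relative expressions such as tomorrow, so a stricter check might combine a shape test with a round trip through date. This is a sketch under the assumption that GNU date is available; is_real_date is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper; requires GNU date (the -d option).
is_real_date() {
    local d=$1 re='^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
    # Shape check first, so inputs such as "tomorrow" are rejected.
    [[ $d =~ $re ]] || return 1
    # Round trip: GNU date must accept the value and print it back unchanged.
    # -u avoids surprises from local daylight-saving transitions.
    [[ "$(date -u -d "$d" '+%Y-%m-%d' 2>/dev/null)" == "$d" ]]
}

for dada in '2016-02-29' '2015-02-29' '444.1'; do
    if is_real_date "$dada"; then
        echo "$dada: ok"
    else
        echo "$dada: not ok"
    fi
done
```

On systems without GNU date (for example stock macOS or BSD), the -d option means something different and this sketch would need to be adapted.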

BASH if then else with quoted text

I'm trying to write a short script that checks for verizon fios availability by zip code from a list of 5 digit us zip codes.
I have the basics of this working, but comparing the received output from curl to the expected output in the if statements isn't working.
I know there is a better & cleaner way to do this however I'm really interested in what is wrong with this method. I think it's something to do with the quotes getting jumbled up.
Let me know what you guys think. I originally thought this would be a quick little script. ha. Thanks for the help
Here is what I have so far:
#!/bin/bash
Avail='<link rel="canonical" href="http://fios.verizon.com/fios-plans.html#availability-msg" />'
NotAvail='<link rel="canonical" href="http://fios.verizon.com/order-now.html#availability-msg" />'
while read zip; do
chk=`curl -s -d "ref=GIa6uiuwP81j047HjKMHOwEyW4QJTYjG&PageID=page9765&AvailabilityZipCode=$zip" http://fios.verizon.com/availability_post4.php --Location | grep "availability-msg"`
#echo $chk
if [ "$chk" = "$Avail" ]
then
fios=1
elif [ "$chk" = "$NotAvail" ]
then
fios=0
else
fios=err
fi
echo "$zip | $fios"
done < zipcodes.txt

Most likely, the line read from curl ends in CR/LF. grep will take the LF as a line-end, and leave the CR in the line, where it will not match either of your patterns. (Other whitespace issues could also cause a similarly invisible mismatch, but stray CR's are very common since HTTP insists on them.)
The easiest solution is to use a less specific match, like a glob or regex; these are both available with bash's [[ (rather than [) command.
Eg.:
if [[ $chk =~ /fios-plans\.html ]]; then
will do a substring comparison
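If the exact comparison is to be kept, another option is to strip the carriage return before comparing. This is a sketch that assumes a single trailing CR really is the cause; the value assigned to chk is a stand-in for the curl output, not captured data.

```shell
#!/bin/bash
Avail='<link rel="canonical" href="http://fios.verizon.com/fios-plans.html#availability-msg" />'

# Stand-in for the curl | grep result, with an assumed trailing carriage return.
chk="$Avail"$'\r'

# Remove one trailing CR, if present; other content is left untouched.
chk=${chk%$'\r'}

if [ "$chk" = "$Avail" ]; then
    echo "match after stripping CR"
else
    echo "still no match"
fi
```

If the mismatch persists after this, printing the value with printf '%q\n' "$chk" should reveal any other invisible characters.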

finding a specific part of a variable?

OK, I know my question seems confusing, so I'll explain it here. I'm making an MMO with a bash script (I am bored; please don't say do it with Java or C++ or something like that). I won't really explain it other than that, when registering, I want an if statement to see whether users have anything provocative in their username, then tell them so and make them choose a different username. I'm just trying to make it more appropriate. So for my if statement to see the word, I need something pretty much like this:
if [ var == provocative word ]; then
echo "You have a provocative word in your username. Please change"
fi
But to do this I'll need it to check whether the word is in the username.
I know that in Java it is done with just 'word*', the star making it so that if it sees the word it will do whatever the if said, even though the value might have been 'wordghdksjgh'. Thanks in advance to whoever answers.

To check if a variable contains a substring, you can use:
if [[ $var == *"foo"* ]]
then
echo "The variable contains the substring 'foo'"
fi

In addition to pattern-matching using
if [[ $var = *foo* ]]; then
you can also use regular expressions:
if [[ $var =~ foo ]]; then # Successfully match anything with "foo" as a substring
The nocasematch option applies to regular expression matches as well.
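For a username filter, a case-insensitive substring test is probably what is wanted. A minimal sketch using nocasematch follows; the function name contains_word and the sample username are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if $2 occurs anywhere in $1, ignoring case.
# The body is a subshell ( ... ), so nocasematch does not leak to the caller.
contains_word() (
    shopt -s nocasematch
    [[ $1 == *"$2"* ]]
)

username='xXFooBarXx'
if contains_word "$username" "foo"; then
    echo "You have a provocative word in your username. Please change"
fi
```

Quoting "$2" inside the pattern keeps any *, ? or [ in the word from being treated as glob characters.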
