Avoid entering white space in regex password Laravel 5.4

I am trying to write a regex for strong passwords.
My regex is below. It works for the following rules:
Min 1 digit
Min 1 lowercase char
Min 1 uppercase char
Min 1 special char
Min 8 chars
Max 15 chars
^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[^\w]).{8,15}$
Can somebody suggest how to reject passwords that contain whitespace?

How about this?
^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[^\w])(?!.*?\s).{8,15}$
I just added a negative lookahead for whitespace in addition to all your positive lookaheads.
As for what it means, the pattern has a bunch of "lookaheads". A lookahead means "only match at this position if what follows satisfies the given sub-pattern", and it does not consume any characters. It has four different lookaheads:
(?=.*?[A-Z]) // followed by any number of characters and then a capital letter
(?=.*?[a-z]) // followed by any number of characters and then a lowercase letter
(?=.*?[0-9]) // followed by any number of characters and then a number
(?=.*?[^\w]) // followed by any number of characters and then not a word character (0-9a-zA-Z_)
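The ^ at the beginning anchors the match to the start of the string. So the pattern says that the start of the string must be followed by all four conditions specified above. I just added one more condition that says the start may NOT be followed by any run of characters and then a whitespace character, which means the string may not contain whitespace anywhere. It's called a "negative lookahead":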
(?!.*?\s)
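The effect of the added negative lookahead can be checked with a short script. This is only a sketch in Ruby rather than Laravel; the constant and method names are made up for illustration, and \A and \z are used instead of ^ and $ because in Ruby ^ and $ also match at line breaks.

```ruby
# Ruby adaptation of the pattern above (assumption: \A and \z replace ^ and $,
# since Ruby treats ^ and $ as line anchors rather than string anchors).
STRONG_PASSWORD = /\A(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[^\w])(?!.*?\s).{8,15}\z/

# Hypothetical helper: true when the candidate satisfies every rule.
def strong_password?(candidate)
  STRONG_PASSWORD.match?(candidate)
end

puts strong_password?("Abcdef1!")   # true: meets every rule
puts strong_password?("Abc def1!")  # false: contains a space
puts strong_password?("abcdef1!")   # false: no uppercase letter
```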

How can I mask everything but the last four characters of a credit card number (PAN) with "#" symbols?

I have a credit card number like 1234567891234 and I want to show only the last 4 characters of this string, like #########1234. How can I do this?
string.gsub(/.(?=....)/, '*')
=> "*********1234"
gsub without the ! does not mutate the original object that string points to, and it can take a regex argument.
. matches any character that is not a line break, and (?=....) is a positive lookahead, so any character that is followed by at least four more characters (none of them line breaks) will be replaced with the second gsub parameter, which is *.
string.gsub(/\d(?=[0-9]{4})/, '*')
=> "*********1234"
produces the same output, looking for digits with \d and doing a positive lookahead with [0-9]{4} which matches for four characters between zero and nine.
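The question asked for "#" rather than "*", and the point about gsub not mutating its receiver is easy to verify. A minimal sketch, where the method name mask_all_but_last_four and its mask_char parameter are invented for this example:

```ruby
# Hypothetical helper: replace every character that still has at least
# four characters after it with the masking character.
def mask_all_but_last_four(number, mask_char = "#")
  number.gsub(/.(?=.{4})/, mask_char)
end

card = "1234567891234"
puts mask_all_but_last_four(card)  # "#########1234"
puts card                          # "1234567891234", the original is unchanged
```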
Masking All Digits Except Last Four
If you're just trying to mask the credit card number, there are a number of ways to do that. However, what makes it potentially tricky is that credit card numbers can have anywhere from 13-19 digits, although 16 is certainly the most common.
One of the easiest ways to work around this expected variation is to use String#slice! to save the last four digits, and then String#tr to convert the remainder of the digits to your masking character. For example:
def mask_credit_card card_number
  credit_card_number = String(card_number).scan(/\d/).join
  last_four_digits = credit_card_number.slice! -4..-1
  credit_card_number.tr("0-9", "#") << last_four_digits
end

# Test against various lengths & formats.
[
  "1234567890123456",
  "1234-5678-9012-3456",
  1234567890123456,
  "1234-567890-12345",
].map { |card_number| mask_credit_card card_number }
#=> ["############3456", "############3456", "############3456", "###########2345"]
Caveats & Considerations
Some cards like Diners Club can start with a zero, making the card number unsuitable for processing as an Integer. Treating the card number as a String can be more reliable, but forces you to think about how you'll handle unexpected characters or spacing.
Extracting digits with #scan is safer than using #delete when invoking the mask on unsanitized input. For example, String(card_number).delete "-\s\t" would normalize the example data above, but might not catch other unexpected characters. Never trust user input!
If you want to preserve spacing, dashes, and so forth in your masking, you run the risk that a malformed string (e.g. "1234-5678-9012-34 5-6") will yield unexpected results like " 5-6" as the last four digits. It's usually better to normalize your inputs, and apply your chosen formatting to your outputs (e.g. with printf or sprintf) instead. Of course, your specific use case may vary.

Counting words from a mixed-language document

Given a set of lines containing Chinese characters, Latin-alphabet-based words or a mixture of both, I wanted to obtain the word count.
To wit:
this is just an example
这只是个例子
should give 10 words ideally; but of course, without access to a dictionary, 例子 would best be treated as two separate characters. Therefore, a count of 11 words/characters would also be an acceptable result here.
Obviously, wc -w is not going to work. It considers the 6 Chinese characters / 5 words as 1 "word", and returns a total of 6.
How do I proceed? I am open to trying different languages, though bash and python will be the quickest for me right now.
You should split the text on Unicode word boundaries, then count the elements which contain letters or ideographs. If you're working with Python, you could use the uniseg or nltk packages, for example. Another approach is to simply use Unicode-aware regexes but these will only break on simple word boundaries. Also see the question Split unicode string on word boundaries.
Note that you'll need a more complex dictionary-based solution for some languages. UAX #29 states:
For Thai, Lao, Khmer, Myanmar, and other scripts that do not typically use spaces between words, a good implementation should not depend on the default word boundary specification. It should use a more sophisticated mechanism, as is also required for line breaking. Ideographic scripts such as Japanese and Chinese are even more complex. Where Hangul text is written without spaces, the same applies. However, in the absence of a more sophisticated mechanism, the rules specified in this annex supply a well-defined default.
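The simpler "Unicode-aware regex" route mentioned in this answer could look roughly like the sketch below. It is not a real word segmenter: it assumes every Han character counts as one unit and every run of Latin letters or digits counts as one word, which gives the "acceptable" count of 11 for the example rather than the ideal 10. The method name count_words is made up for illustration.

```ruby
# Rough count: each Han ideograph is one unit, each run of Latin
# letters/decimal digits is one word. No dictionary is involved.
def count_words(text)
  text.scan(/\p{Han}|[\p{Latin}\p{Nd}]+/).size
end

puts count_words("this is just an example\n这只是个例子")  # 11
puts count_words("这是test")                               # 3
```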
I thought about a quick hack, since Chinese characters are typically 3 bytes long in UTF-8:
(pseudocode)
for each byte in the line:
    if the byte's high bit is 1 (it belongs to a multi-byte character):
        add 1 to total chinese bytes
    if it is a space:
        add 1 to total "normal" words
    if it is a newline:
        break
Then take total chinese bytes / 3 + total words to get the sum for each line. This will give an erroneous count for the case of mixed languages, but should be a good start.
这是test
However, the above sentence will give a total of 2 (1 for each of the Chinese characters.) A space between the two languages would be needed to give the correct count.
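For reference, the byte-counting hack can be written out as a few lines of Ruby. This sketch follows the pseudocode literally, so it inherits its limitations: it counts spaces rather than words, and it assumes every non-ASCII character takes exactly 3 bytes. The method name rough_count is hypothetical.

```ruby
# Literal transcription of the hack: bytes with the high bit set are
# assumed to belong to 3-byte Chinese characters; each space counts as
# one "normal" word; counting stops at the first newline.
def rough_count(line)
  high_bytes = 0
  spaces = 0
  line.each_byte do |byte|
    break if byte == 0x0A
    if byte >= 0x80
      high_bytes += 1
    elsif byte == 0x20
      spaces += 1
    end
  end
  high_bytes / 3 + spaces
end

puts rough_count("这是test")   # 2
puts rough_count("这是 test")  # 3
```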

Can someone give me an example of regular expressions using {x} and {x,y}?

I just learned from a book about regular expressions in the Ruby language. I did Google it, but still got confused about {x} and {x,y}.
The book says:
{x}→Match x occurrences of the preceding character.
{x,y}→Match at least x occurrences and at most y occurrences.
Can anyone explain this better, or provide some examples?
Sure, look at these examples:
http://rubular.com/r/sARHv0vf72
http://rubular.com/r/730Zo6rIls
/a{4}/
is the short version for:
/aaaa/
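It says: match exactly 4 consecutive 'a' characters.
Whereas
/a{2,4}/
says: match at least 2 and at most 4 consecutive 'a' characters.
Applied to a whole string, it will match
aa
aaa
aaaa
and it won't match
a
xxx
Note that a longer run such as aaaaa still contains a match (its first four characters) unless the pattern is anchored, for example /^a{2,4}$/.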
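These {x} and {x,y} examples can be tried directly in Ruby. A small sketch, assuming \A and \z anchors are added so that the whole string has to match; the constant names are made up for illustration:

```ruby
# Anchored versions: the entire string must consist of the repeated 'a'.
EXACTLY_FOUR = /\Aa{4}\z/
TWO_TO_FOUR  = /\Aa{2,4}\z/

%w[a aa aaa aaaa aaaaa xxx].each do |candidate|
  puts "#{candidate}: #{TWO_TO_FOUR.match?(candidate)}"
end

# Without anchors the quantifier simply takes up to four characters
# out of a longer run.
puts "aaaaa"[/a{2,4}/]  # "aaaa"
```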
The "Limiting Repetition" tutorial is a good online reference for this.
I highly recommend regexbuddy.com. Very briefly, the regex below does what you refer to:
[0-9]{3}|\w{3}
The [ ] characters define a character class, so [0-9] matches any single digit between 0 and 9. The { } with a 3 inside means match exactly 3 of the preceding item, here 3 digits in a row. The | is an "or" (alternation). The \w is shorthand for any word character, and once again the {3} matches only runs of 3.
If you go to RegexPal.com you can enter the code above and test it. I used the following data to test the expression:
909 steve kinzey
and the expression matched the 909, the 'ste', the 'kin' and the 'zey'. It did not match the 've' because it is only 2 word characters long and a word character does not span white space so it could not carry over to the second word.
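The same test can be reproduced in Ruby with String#scan. A minimal sketch; the constant name is invented for this example:

```ruby
THREE_DIGITS_OR_WORD_CHARS = /[0-9]{3}|\w{3}/

# scan returns every non-overlapping match, left to right.
p "909 steve kinzey".scan(THREE_DIGITS_OR_WORD_CHARS)
# ["909", "ste", "kin", "zey"]
```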
Interval Expressions
GNU awk refers to these as "interval expressions" in the Regexp Operators section of its manual. It explains the expressions as follows:
{n}
{n,}
{n,m}
One or two numbers inside braces denote an interval expression. If there is one number in the braces, the preceding regexp is repeated n times. If there are two numbers separated by a comma, the preceding regexp is repeated n to m times. If there is one number followed by a comma, then the preceding regexp is repeated at least n times:
The manual also includes these reference examples:
wh{3}y
Matches ‘whhhy’, but not ‘why’ or ‘whhhhy’.
wh{3,5}y
Matches ‘whhhy’, ‘whhhhy’, or ‘whhhhhy’, only.
wh{2,}y
Matches ‘whhy’ or ‘whhhy’, and so on.
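The gawk examples behave the same way in Ruby. The sketch below wraps each pattern in \A and \z so that only whole-string matches count, which is an assumption added here to mirror the "but not" cases; matches_whole? is a hypothetical helper name.

```ruby
# Hypothetical helper: true when the pattern matches the entire text.
def matches_whole?(pattern, text)
  /\A(?:#{pattern})\z/.match?(text)
end

puts matches_whole?('wh{3}y', 'whhhy')     # true
puts matches_whole?('wh{3}y', 'why')       # false
puts matches_whole?('wh{3,5}y', 'whhhhhy') # true
puts matches_whole?('wh{2,}y', 'whhy')     # true
```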
See Also
Ruby's Regexp class.
Quantifiers section of Ruby's oniguruma engine.

matching single letters in a sentence with a regular expression

I want to match single letters in a sentence. So in ...
I want to have my turkey. May I. I 20,000-t bar-b-q
I'd like to match
*I* want to have my turkey. May *I*. *I* 20,000-t bar-b-q
right now I'm using
/\b\w\b/
as my regular expression, but that is matching
*I* want to have my turkey. May *I*. *I* 20,000-*t* bar-*b*-*q*
Any suggestions on how to get past that last mile?
Use a negative lookbehind and negative lookahead to fail if the previous character is a word character or a hyphen, or if the next character is a word character or a hyphen:
/(?<![\w\-])\w(?![\w\-])/
Example: http://www.rubular.com/r/9upmgfG9u4
Note that as mentioned by rtcherry, this will also match single numbers. To prevent this you may want to change the \w that is outside of the character classes to [a-zA-Z].
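A quick comparison of the original /\b\w\b/ with the lookaround version on the sample sentence, as a sketch. It uses the [a-zA-Z] variant suggested in the note above, and the constant names are made up for illustration:

```ruby
TURKEY_SENTENCE = "I want to have my turkey. May I. I 20,000-t bar-b-q"

# Variant with [a-zA-Z] in place of the outer \w, so lone digits are skipped.
STANDALONE_LETTER = /(?<![\w\-])[a-zA-Z](?![\w\-])/

p TURKEY_SENTENCE.scan(/\b\w\b/)          # ["I", "I", "I", "t", "b", "q"]
p TURKEY_SENTENCE.scan(STANDALONE_LETTER) # ["I", "I", "I"]
```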
F.J's answer will also include numbers. The pattern below is restricted to ASCII characters, but you really need to define what characters can be side by side and still count as a single letter.
/(?<![0-9a-zA-Z\-])[a-zA-Z](?![0-9a-zA-Z\-])/
That will also avoid things like This -> 1a <- is not a single letter. Neither is -> 2 <- that.
As long as we're being picky, non-ASCII letters are easy to include:
/(?<![[:alnum:]-])[[:alpha:]](?![[:alnum:]-])/
This will avoid matching the t in 'Cómo eres tú'
Notice that it's not necessary to escape the - when it is the last character in a character class (which I'm not sure that this technically is).
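The difference between the ASCII-only and POSIX-class versions shows up on the Spanish example. A sketch, assuming a UTF-8 string (where Ruby's POSIX bracket classes are Unicode-aware); the hyphen is escaped in both patterns to sidestep the question raised above, and the accented letters are written as escapes so the precomposed code points are unambiguous:

```ruby
ASCII_SINGLE_LETTER   = /(?<![0-9a-zA-Z\-])[a-zA-Z](?![0-9a-zA-Z\-])/
UNICODE_SINGLE_LETTER = /(?<![[:alnum:]\-])[[:alpha:]](?![[:alnum:]\-])/

spanish = "C\u00F3mo eres t\u00FA"  # "Cómo eres tú"

# The ASCII classes do not see ó and ú as letters, so C and t look isolated.
p spanish.scan(ASCII_SINGLE_LETTER)    # ["C", "t"]
p spanish.scan(UNICODE_SINGLE_LETTER)  # []
```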
You are asking far too much of a regular expression. \w matches a word character, which includes upper and lower case alphabetics, the ten digits, and underscore. So it is the same as [0-9A-Z_a-z].
\b matches the (zero-width) boundary where a word character doesn't have another word character next to it, for instance at the beginning or end of a string, or next to some punctuation or white space.
Using negative look-behinds and look-aheads, this amounts to \b\w\b being equivalent to
(?<!\w)\w(?!\w)
i.e. a word character that doesn't have another word character before or after it.
As you have found, that finds t, b and q in 20,000-t bar-b-q. So it's back in your court to define what you really mean by "single letters in a sentence".
It nearly works to say "any letter that isn't preceded or followed by a printable character", which is
/(?<!\S)[A-Za-z](?!\S)/
But that leaves out I in May I. because it has a dot after it.
So, do you mean a single letter that isn't preceded by a printable character, and is followed by whitespace, a dot, or the end of the string (or a comma, a semicolon or a colon for good measure)? Then you want
/(?<!\S)[A-Za-z](?=(?:[\s.,;:]|\z))/
which finds exactly three I characters in your string.
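The two patterns from this answer can be compared on the sample sentence. A minimal sketch with invented constant names:

```ruby
LETTER_SAMPLE = "I want to have my turkey. May I. I 20,000-t bar-b-q"

WHITESPACE_DELIMITED = /(?<!\S)[A-Za-z](?!\S)/
PUNCTUATION_TOLERANT = /(?<!\S)[A-Za-z](?=(?:[\s.,;:]|\z))/

# The first pattern misses the I in "May I." because of the trailing dot.
puts LETTER_SAMPLE.scan(WHITESPACE_DELIMITED).size  # 2
puts LETTER_SAMPLE.scan(PUNCTUATION_TOLERANT).size  # 3
```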
I hope that helps.

Password validation

I need to validate a password with the following requirements:
1. Be at least seven characters long
2. Contain at least one letter (a-z or A-Z)
3. Contain at least one number (0-9)
4. Contain at least one symbol (#, $, %, etc)
Can anyone give me the correct expression?
/.{7,}/
/[a-zA-Z]/
/[0-9]/
/[-!##$%^...]/
For a single regex, the most straightforward way to check all of the requirements would be with lookaheads:
/(?=.*[a-zA-Z])(?=.*\d)(?=.*[^a-zA-Z0-9\s]).{7,}/
Breaking it down:
1. .{7,} - at least seven characters
2. (?=.*[a-zA-Z]) - a letter must occur somewhere after the start of the string
3. (?=.*\d) - ditto 2, except a digit
4. (?=.*[^a-zA-Z0-9\s]) - ditto 2, except something not a letter, digit, or whitespace
However, you might choose to simply utilize multiple separate regex matches to keep things even more readable - chances are you aren't validating a ton of passwords at once, so performance isn't really a huge requirement.
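The "multiple separate regex matches" approach might be sketched as follows. The names PASSWORD_CHECKS and password_problems are hypothetical, and the symbol check reuses the "not a letter, digit, or whitespace" class from the lookahead version, which is an assumption about what should count as a symbol.

```ruby
# One small pattern per requirement, keyed by a human-readable label.
PASSWORD_CHECKS = {
  "at least seven characters" => /.{7,}/,
  "a letter"                  => /[a-zA-Z]/,
  "a digit"                   => /[0-9]/,
  "a symbol"                  => /[^a-zA-Z0-9\s]/
}.freeze

# Returns the labels of the requirements the password fails to meet.
def password_problems(password)
  PASSWORD_CHECKS.reject { |_label, pattern| pattern.match?(password) }.keys
end

p password_problems("abc123!")  # []
p password_problems("abcdefg")  # ["a digit", "a symbol"]
```

Keeping the checks separate also makes it easy to tell the user which requirement was missed, which a single combined regex cannot do.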
