Ruby valid license plate regex - ruby

I am trying to write a function in Ruby that determines whether a given string is a valid license plate. A valid license plate has the following format: 3 capital letters, followed by a dash, followed by 4 digits. Example: HES-2098.
I have written the following function but need some help for pattern matching.
def liscence()
plate = "HES-2098"
plateNo = plate.upcase
if(plate.length == 8)
if(plate == plateNo)
if(/\A-Z\A-Z\A-Z\-\d{4}/.match(plate))
puts "valid"
else
puts "invalid"
end
else
puts "First 3 letter must be uppercase"
end
else
puts "Only 8 char long"
end
end
liscence()

Your regex did not work because \A matches the start-of-string position, which occurs only once in a string (and your pattern uses it three times). To match an uppercase ASCII letter, you may use the [A-Z] character class.
You can use
if /\A[A-Z]{3}-[0-9]{4}\z/ =~ plate
See the regex and Ruby demos.
Pattern details:
\A - start of string (not line, as with ^) anchor
[A-Z]{3} - exactly 3 (since the limiting quantifier {n} is used) uppercase ASCII letters (from the [A-Z] range in a character class)
- - a literal hyphen (not necessary to escape it outside the character class)
[0-9]{4} - exactly 4 ASCII digits
\z - the very end of the string anchor (not $ that matches the end of the line)
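As a minimal sketch, the pattern above can be wrapped in a predicate. The names PLATE_PATTERN and valid_plate? are hypothetical and not from the question; String#match? (Ruby 2.4 and later) is used because it returns a plain boolean.

```ruby
# Hypothetical helper: true only for 3 uppercase ASCII letters,
# a hyphen, and 4 ASCII digits, with nothing before or after.
PLATE_PATTERN = /\A[A-Z]{3}-[0-9]{4}\z/

def valid_plate?(plate)
  plate.match?(PLATE_PATTERN)
end

puts valid_plate?("HES-2098")    # true
puts valid_plate?("hes-2098")    # false: lowercase letters
puts valid_plate?("HES-2098\n")  # false: \z rejects a trailing newline
puts valid_plate?("HES-20981")   # false: five digits
```

Because the anchors and quantifiers already fix the length and the letter case, the separate length and upcase checks from the question are not needed for a yes/no answer.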

plate[/\A[A-Z]{3}-\d{4}\z/] ? 'valid' : 'invalid'

Related

How to get formatted data from an array and convert it into a new array in Ruby

I have a data array like below. I need to format it like shown
a = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
Need to format like
["EC006", "ED009", "AX009"]
arr = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
To merely extract the strings of interest, assuming the data is formatted correctly, we may write the following.
arr.map { |s| s[/(?<=\[)[^\]]*/] }
#=> ["EC006", "ED009", "AX009"]
See String#[] and Demo
In the regular expression (?<=\[) is a positive lookbehind that asserts the previous character is '['. The ^ at the beginning of the character class [^\]] means that any character other than ']' must be matched. Appending the asterisk ([^\]]*) causes the character class to be matched zero or more times.
Alternatively, we could use the regular expression
/\[\K[^\]]*/
where \K causes the beginning of the match to be reset to the current string location and all previously-matched characters to be discarded from the match that is returned.
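To see the two forms side by side, here is a small sketch. The method names codes_with_lookbehind and codes_with_keep are hypothetical; the regular expressions are the two shown above.

```ruby
# Lookbehind: assert that '[' precedes the match without including it.
def codes_with_lookbehind(items)
  items.map { |s| s[/(?<=\[)[^\]]*/] }
end

# \K: match '[' first, then drop it from the match that is returned.
def codes_with_keep(items)
  items.map { |s| s[/\[\K[^\]]*/] }
end

arr = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
p codes_with_lookbehind(arr)  # ["EC006", "ED009", "AX009"]
p codes_with_keep(arr)        # ["EC006", "ED009", "AX009"]
```

For this input both forms return the same array; neither checks the rest of the string, so a string with no '[' simply yields nil.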
To confirm the correctness of the formatting as well, use
arr.map { |s| s[/\A[1-9]\d{3} \[\K[A-Z]{2}\d{3}(?=]\z)/] }
#=> ["EC006", "ED009", "AX009"]
Demo
Note that at the link I replaced \A and \z with ^ and $, respectively, in order to test the regex against multiple strings.
This regular expression can be broken down as follows.
\A # match beginning of string
[1-9] # match a digit other than zero
\d{3} # match 3 digits
[ ] # match one space
\[ # match '['
\K # reset start of match to current string location and discard
# all characters previously matched from match that is returned
[A-Z]{2} # match 2 uppercase letters
\d{3} # match 3 digits
(?=]\z) # positive lookahead asserts following character is
# ']' and that character is at the end of the string
In the above I placed a space character in a character class ([ ]) merely to make it visible to the reader.
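The validating form returns nil for any entry that does not fit the expected layout. The sketch below illustrates that with a few malformed entries; the names ENTRY_PATTERN, extract_code and extract_codes are hypothetical, the closing bracket is escaped only for readability, and Enumerable#filter_map assumes Ruby 2.7 or later.

```ruby
# Same regular expression as above, stored in a hypothetical constant.
ENTRY_PATTERN = /\A[1-9]\d{3} \[\K[A-Z]{2}\d{3}(?=\]\z)/

# Returns the code for a well-formed entry, nil otherwise.
def extract_code(entry)
  entry[ENTRY_PATTERN]
end

# Keeps only the codes of well-formed entries (nil results are dropped).
def extract_codes(entries)
  entries.filter_map { |entry| extract_code(entry) }
end

mixed = ["8619 [EC006]", "0123 [ED009]", "1034 [ax009]", "1034 [AX009] "]
p mixed.map { |entry| extract_code(entry) }  # ["EC006", nil, nil, nil]
p extract_codes(mixed)                       # ["EC006"]
```

Whether malformed entries should be dropped, reported, or raise an error depends on the application; dropping them is just one option.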
Input
a = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
Code
p a.collect { |x| x[/\[(.*)\]/, 1] }
Output
["EC006", "ED009", "AX009"]

Regex: Match all hyphens or underscores not at the beginning or the end of the string

I am writing some code that needs to convert a string to camel case. However, I want to allow any _ or - at the beginning of the code.
I have had success matching up an _ character using the regex here:
^(?!_)(\w+)_(\w+)(?<!_)$
when the inputs are:
pro_gamer #matched
#ignored
_proto
proto_
__proto
proto__
__proto__
#matched as nerd_godess_of, skyrim
nerd_godess_of_skyrim
I recursively apply my method on the first match if it looks like nerd_godess_of.
I am having trouble adding - matches to the same regex. I assumed that just adding a - to the mix like this would work:
^(?![_-])(\w+)[_-](\w+)(?<![_-])$
and it matches like this:
super-mario #matched
eslint-path #matched
eslint-global-path #NOT MATCHED.
I would like to understand why the regex fails to match the last case given that it worked correctly for the _.
The (almost) full set of test inputs can be found here
The regex
^(?![_-])(\w+)[_-](\w+)(?<![_-])$
does not match "eslint-global-path" because the anchors ^ and $ force it to span the whole line, while it allows for only one hyphen or underscore outside the \w+ runs. This regex reads, "Match the beginning of the line, not followed by a hyphen or underscore, then match one or more word characters (including underscores) in a capture group, a hyphen or underscore, and then one or more word characters in a second capture group. Lastly, assert that the line does not end with a hyphen or underscore."
The fact that an underscore (but not a hyphen) is a word (\w) character completely messes up the regex. In general, rather than using \w, you might want to use \p{Alpha} or \p{Alnum} (or POSIX [[:alpha:]] or [[:alnum:]]).
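A short sketch makes the difference visible. The pattern is the one from the question; the names QUESTION_PATTERN, word_char? and split_once are hypothetical.

```ruby
# The pattern from the question, unchanged.
QUESTION_PATTERN = /^(?![_-])(\w+)[_-](\w+)(?<![_-])$/

# True if the single character is matched by \w.
def word_char?(char)
  char.match?(/\A\w\z/)
end

# Returns the two captures, or nil when the pattern does not match.
def split_once(str)
  m = QUESTION_PATTERN.match(str)
  m && m.captures
end

p word_char?("_")                      # true
p word_char?("-")                      # false
p split_once("nerd_godess_of_skyrim")  # ["nerd_godess_of", "skyrim"]
p split_once("eslint-global-path")     # nil
```

With underscores, the first \w+ can absorb the earlier separators, so one explicit separator is enough. With hyphens it cannot, so a string with two hyphens has nowhere to put the second one.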
Try this.
r = /
(?<= # begin a positive lookbehind
[^_-] # match a character other than an underscore or hyphen
) # end positive lookbehind
( # begin capture group 1
(?: # begin a non-capture group
-+ # match one or more hyphens
| # or
_+ # match one or more underscores
) # end non-capture group
[^_-] # match any character other than an underscore or hyphen
) # end capture group 1
/x # free-spacing regex definition mode
'_cats_have--nine_lives--'.gsub(r) { |s| s[-1].upcase }
#=> "_catsHaveNineLives--"
This regex is conventionally written as follows.
r = /(?<=[^_-])((?:-+|_+)[^_-])/
If all the letters are lower case one could alternatively write
'_cats_have--nine_lives--'.split(/(?<=[^_-])(?:_+|-+)(?=[^_-])/).
map(&:capitalize).join
#=> "_catsHaveNineLives--"
where
'_cats_have--nine_lives--'.split(/(?<=[^_-])(?:_+|-+)(?=[^_-])/)
#=> ["_cats", "have", "nine", "lives--"]
(?=[^_-]) is a positive lookahead that requires the characters on which the split is made to be followed by a character other than an underscore or hyphen
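The gsub approach above can be wrapped in a method and checked against the inputs from the question. This is a sketch only; the names SEPARATOR_THEN_CHAR and camelize are hypothetical.

```ruby
# One or more hyphens, or one or more underscores, preceded and followed
# by a character that is neither. The following character is part of the match.
SEPARATOR_THEN_CHAR = /(?<=[^_-])((?:-+|_+)[^_-])/

# Replaces each separator run plus the next character with that
# character upcased; leading and trailing separators are left alone.
def camelize(str)
  str.gsub(SEPARATOR_THEN_CHAR) { |match| match[-1].upcase }
end

puts camelize("eslint-global-path")        # eslintGlobalPath
puts camelize("nerd_godess_of_skyrim")     # nerdGodessOfSkyrim
puts camelize("__proto__")                 # __proto__
puts camelize("_cats_have--nine_lives--")  # _catsHaveNineLives--
```

Because every interior separator is handled in one pass, no recursive application on the first capture is needed.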
You can try the regex
^(?=[^-_])(\w+[-_]\w*)+(?=[^-_])\w$
See the demo here.
Switch _- to -_ so that - is not treated as a range op, as in a-z.

Matching strings that contain a letter with the first character not being a number

How do I write a regular expression that has at least one letter, but the first character must not be a number? I tried this
str = "a"
str =~ /^[^\d][[:space:]]*[a-z]*/i
# => 0
str = "="
str =~ /^[^\d][[:space:]]*[a-z]*/i
# => 0
The "=" is matched even though it contains no letters. I expect "a" to match, whereas a string like "3abcde" should not match.
The [a-z]* and [[:space:]]* patterns can match an empty string, so they contribute nothing to validation. Also, = is not a digit, so it is matched by the [^\d] negated character class, which is a consuming pattern: it requires one character other than a digit to be present in the string.
You may rely on a lookahead that will restrict the start of string position:
/\A(?!\d).*[a-z]/im
Or even a bit faster and Unicode-friendly version:
/\A(?!\d)\P{L}*\p{L}/
See the regex demo
Details:
\A - start of a string
(?!\d) - the first char cannot be a digit
\P{L}* - 0 or more (*) chars other than letters
or
.* - any 0+ chars (including line breaks if the /m modifier is used)
\p{L} - a letter
The m modifier enables the . to match line break chars in a Ruby regex.
Use [a-z] when you need to restrict the letters to those in ASCII table only. Also, \p{L} may be replaced with [[:alpha:]] and \P{L} with [^[:alpha:]].
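The Unicode-friendly pattern can be tried against the strings from the question with a sketch like the following. The names LETTER_NO_LEADING_DIGIT and letter_without_leading_digit? are hypothetical.

```ruby
# Start of string, first char not a digit, then any non-letters, then a letter.
LETTER_NO_LEADING_DIGIT = /\A(?!\d)\P{L}*\p{L}/

def letter_without_leading_digit?(str)
  str.match?(LETTER_NO_LEADING_DIGIT)
end

p letter_without_leading_digit?("a")         # true
p letter_without_leading_digit?("=")         # false: no letter at all
p letter_without_leading_digit?("3abcde")    # false: starts with a digit
p letter_without_leading_digit?("*!\n?a>")   # true: \P{L} also matches the newline
```

No /m modifier is needed here, because \P{L} is a negated class and matches line breaks on its own.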
If two regular expressions were permitted you could write:
def pass_de_test?(str)
str[0] !~ /\d/ && str =~ /[[:alpha:]]/
end
pass_de_test?("*!\n?a>") #=> 4 (truthy)
pass_de_test?("3!\n?a>") #=> false
If you want true or false returned, change the operative line to:
(str[0] !~ /\d/ && str =~ /[[:alpha:]]/) ? true : false
or
!!(str[0] !~ /\d/ && str =~ /[[:alpha:]]/)

How to remove a certain character after substring in Ruby

I have a string with exclamation marks. I want to remove the exclamation marks at the end of the word, not the ones before a word. Assume there is no exclamation mark by itself/ not accompanied by a word. By word I mean [a..z], can be uppercased.
For example:
exclamation("Hello world!!!")
#=> ("Hello world")
exclamation("!!Hello !world!")
#=> ("!!Hello !world")
I have read How do I remove substring after a certain character in a string using Ruby? ; these two are close, but different.
def exclamation(s)
s.slice(0..(s.index(/\w/)))
end
# exclamation("Hola!") returns "Hol"
I have also tried s.gsub(/\w!+/, ''). Although it retains the '!' before word, it removes both the last letter and exclamation mark. exclamation("!Hola!!") #=> "!Hol".
How can I remove only the exclamation marks at the end?
If you don't want to use a regex, which can sometimes be difficult to understand, use this:
def exclamation(sentence)
words = sentence.split
words_wo_exclams = words.map do |word|
word.split('').reverse.drop_while { |c| c == '!' }.reverse.join
end
words_wo_exclams.join(' ')
end
Although you haven't given a lot of test data, here's an example of something that might work:
def exclamation(string)
string.gsub(/(\w+)\!(?=\s|\z)/, '\1')
end
The \s|\z part means either a space or the end of the string, and (?=...) means to just peek ahead in the string but not actually match against it.
Note that this won't work in the case of things like "I'm mad!" where the exclamation mark is not adjacent to a space, but you could always add that as another potential end-of-word match.
"!!Hello !world!, world!! I say".gsub(r, '')
#=> "!!Hello !world, world! I say"
where
r = /
(?<=[[:alpha:]]) # match an uppercase or lowercase letter in a positive lookbehind
! # match an exclamation mark
/x # free-spacing regex definition mode
or
r = /
[[:alpha:]] # match an uppercase or lowercase letter
\K # discard match so far
! # match an exclamation mark
/x # free-spacing regex definition mode
If the above example should return "!!Hello !world, world I say", change ! to !+ in the regexes.
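Putting the !+ variant into the method from the question gives the sketch below. It assumes that "end of a word" means "directly after a letter", as in the lookbehind version above.

```ruby
# Removes every run of exclamation marks that directly follows a letter.
# Marks that precede a word are left in place.
def exclamation(str)
  str.gsub(/(?<=[[:alpha:]])!+/, "")
end

puts exclamation("Hello world!!!")   # Hello world
puts exclamation("!!Hello !world!")  # !!Hello !world
puts exclamation("!Hola!!")          # !Hola
```

Exclamation marks that follow a digit or punctuation are not touched by this pattern, which matches the question's definition of a word as letters only.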

Ruby search a string for matching character pairs

I want to match character pairs in a string. Let's say the string is:
"zttabcgqztwdegqf". Both "zt" and "gq" are matching pairs of characters in the string.
The following code finds the "zt" matching pair, but not the "gq" pair:
#!/usr/bin/env ruby
string = "zttabcgqztwdegqf"
puts string.scan(/.{1,2}/).detect{ |c| string.count(c) > 1 }
The code provides matching pairs where the indices of the pairs are 0&1,2&3,4&5... but not 1&2,3&4,5&6, etc:
zt
ta
bc
gq
zt
wd
eg
qf
I'm not sure regex in Ruby is the best way to go. But I want to use Ruby for the solution.
You can do your search with a single regex:
puts string.scan(/(?=(.{2}).*\1)/)
regex101 demo
Output
zt
gq
Regex Breakout
(?= # Start a lookahead
(.{2}) # Search any couple of char and group it in \1
.*\1 # Search ahead in the string for another \1 to validate
) # Close lookahead
Note
Putting all the checks inside a lookahead ensures that the regex engine does not consume the pair while validating it.
So it also works with overlapping couples like in the string abcabc: the output will correctly be ab,bc.
Oddity
If the regex engine does not consume the chars, how can it reach the end of the string?
Internally, after the check, Onigmo (the Ruby regex engine) automatically advances one position. Most regex flavours behave this way, but e.g. the JavaScript engine needs the programmer to increment the last match index manually.
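As a sketch, the lookahead scan can be wrapped in a method that returns a flat array. The name repeated_pairs is hypothetical; flatten is needed because scan returns one sub-array per match when the pattern contains a capture group, and uniq removes pairs reported more than once.

```ruby
# Returns each two-character pair that occurs again later in the string.
# The whole pattern is a lookahead, so scan tests every start position.
def repeated_pairs(str)
  str.scan(/(?=(.{2}).*\1)/).flatten.uniq
end

p repeated_pairs("zttabcgqztwdegqf")  # ["zt", "gq"]
p repeated_pairs("abcabc")            # ["ab", "bc"]
p repeated_pairs("abcdef")            # []
```

Note that . does not match a line break unless the /m modifier is added, so pairs are only compared within a single line as written.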
str = "ztcabcgqzttwtcdegqf"
r = /
(.) # match any character in capture group 1
(?= # begin a positive lookahead
(.) # match any character in capture group 2
.+ # match >= 1 characters
\1 # match capture group 1
\2 # match capture group 2
) # close positive lookahead
/x # extended/free-spacing regex definition mode
str.scan(r).map(&:join)
#=> ["zt", "tc", "gq"]
Here is one way to do this without using regex:
string = "zttabcgqztwdegqf"
p string.split('').each_cons(2).map(&:join).select {|i| string.scan(i).size > 1 }.uniq
#=> ["zt", "gq"]
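A variation on the non-regex approach counts every adjacent pair in a single pass instead of rescanning the string for each pair. This is a sketch only: the name repeated_pairs_by_count is hypothetical, and Enumerable#tally assumes Ruby 2.7 or later.

```ruby
# Counts each adjacent two-character pair and keeps those seen more than once.
# Assumption: overlapping occurrences count, so "aaa" reports "aa".
def repeated_pairs_by_count(str)
  str.each_char.each_cons(2).map(&:join).tally.select { |_pair, count| count > 1 }.keys
end

p repeated_pairs_by_count("zttabcgqztwdegqf")  # ["zt", "gq"]
p repeated_pairs_by_count("abcabc")            # ["ab", "bc"]
```

The results differ from the scan-based versions only when a pair overlaps with itself, as in "aaa", where String#scan finds a single non-overlapping occurrence.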
