Regex to reject a repeated special character - Ruby

I'm doing a regex to check a slug.
Currently my regex is: /^[^-][a-z\-].*[^-]+$/
Here's what I'm checking right now:
my-awesome-project => valid
-my-awesome-project => invalid
my-awesome-project- => invalid
Now I also want to check whether the dash is repeated:
my-awesome-project => should be valid
my-awesome--project => should not be valid
my----awesome-project => should not be valid
Can I do that with a regex?
Thank you,

I think this regexp should work:
/^[a-z]+(-[a-z]+)*$/
What this does: ^[a-z]+ matches if the string begins with at least one lower-case letter. After that, (-[a-z]+)*$ allows zero or more occurrences of a dash followed by, again, at least one lower-case letter.
See on Rubular.
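A quick way to try this pattern against the examples from the question is sketched below. It assumes each slug is a single-line string, so it swaps ^ and $ for \A and \z (in Ruby, ^ and $ match at line boundaries, not only at the ends of the string). The names SLUG and valid_slug? are made up for illustration.

```ruby
# Hypothetical names; the pattern is the one above with \A and \z anchors.
SLUG = /\A[a-z]+(-[a-z]+)*\z/

def valid_slug?(str)
  SLUG.match?(str) # Regexp#match? needs Ruby 2.4 or later
end

%w[
  my-awesome-project
  -my-awesome-project
  my-awesome-project-
  my-awesome--project
  my----awesome-project
].each do |slug|
  puts "#{slug}: #{valid_slug?(slug)}"
end
```

Only the first slug should print true; the other four start or end with a dash or contain a repeated dash.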

As I understand, the string is valid unless it:
contains a character other than a lower-case letter or hyphen,
begins with a hyphen,
ends with a hyphen, or
contains two (or more) hyphens in a row.
If that's the case, it's easiest to check whether it is invalid:
R = /
[^a-z-] # match one character other than a lower-case letter or hyphen
| # or
^- # match a hyphen as the first character
| # or
-$ # match a hyphen as the last character
| # or
-- # match two hyphens
/x
def valid?(str)
str !~ R
end
valid? 'my-awesome-project' #=> true
valid? '-my-awesome-project' #=> false
valid? 'my-awesome-project-' #=> false
valid? 'my-awesome--project' #=> false
valid? 'my----awesome-project' #=> false

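The regex below may be helpful. Note that it also accepts upper-case letters and digits, and that it needs anchors (\A and \z) to validate a whole string.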
[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*

Related

Ruby regex to filter out word ending with a "string" suffix

I am trying to come up with a Ruby Regex that will match the following string:
MAINT: Refactor something
STRY-1: Add something
STRY-2: Update something
But should not match the following:
MAINT: Refactored something
STRY-1: Added something
STRY-2: Updated something
MAINT: Refactoring something
STRY-3: Adding something
STRY-4: Updating something
Basically, the first word after : should not end with either ed or ing
This is what I have currently:
^(MAINT|(STRY|PRB)-\d+):\s([A-Z][a-z]+)\s([a-zA-Z0-9._\-].*)
I have tried [^ed] and [^ing], but they do not work here since I am targeting more than a single character.
I am not able to come up with a proper solution to achieve this.
You could use
^[-\w]+:\s*(?:(?!(?:ed|ing)\b)\w)+\b.+
See a demo on regex101.com.
Broken down this says:
^ # start of the line/string
[-\w]+:\s* # match - and word characters, 1+ then :
(?: # non-capturing group
(?!(?:ed|ing)\b) # neg. lookahead: no ed or ing followed by a word boundary
\w # match a word character
)+\b # as long as possible, followed by a boundary
.+ # match the rest of the string (at least one character)
I have no experience in Ruby but I guess you could alternatively do a split and check if the second word ends with ed or ing. The latter approach might be easier to handle for future programmers/colleagues.
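The split-and-check idea could look roughly like this in Ruby. This is only a sketch: the method name imperative_subject? is hypothetical, and the prefix check (MAINT, STRY-n or PRB-n) is an assumption taken from the regex in the question.

```ruby
# Hypothetical helper: true when the first word after the colon
# does not end in "ed" or "ing".
def imperative_subject?(line)
  prefix, rest = line.split(":", 2)
  return false if rest.nil? # no colon at all
  # Assumed prefix format, borrowed from the question's regex.
  return false unless prefix.match?(/\A(?:MAINT|(?:STRY|PRB)-\d+)\z/)

  first_word = rest.split.first # split on whitespace
  return false if first_word.nil? # nothing after the colon
  !first_word.end_with?("ed", "ing")
end

puts imperative_subject?("MAINT: Refactor something")    # true
puts imperative_subject?("STRY-1: Added something")      # false
puts imperative_subject?("STRY-4: Updating something")   # false
```

Like the regex versions, this also rejects words that merely happen to end in "ed" or "ing", such as "Embed".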
r = /
\A # match beginning of string
(?: # begin a non-capture group
MAINT # match 'MAINT'
| # or
STRY\-\d+ # match 'STRY-' followed by one or more digits
) # end non-capture group
:[ ] # match a colon followed by a space
[[:alpha:]]+ # match one or more letters
(?<! # begin a negative lookbehind
ed # match 'ed'
| # or
ing # match 'ing'
) # end negative lookbehind
[ ] # match a space
/x # free-spacing regex definition mode
"MAINT: Refactor something".match?(r) #=> true
"STRY-1: Add something".match?(r) #=> true
"STRY-2: Update something".match?(r) #=> true
"MAINT: Refactored something".match?(r) #=> false
"STRY-1: Added something".match?(r) #=> false
"STRY-2: Updated something".match?(r) #=> false
"A MAINT: Refactor something".match?(r) #=> false
"STRY-1A: Add something".match?(r) #=> false
This regular expression is conventionally written as follows.
r = /\A(?:MAINT|STRY\-\d+): [[:alpha:]]+(?<!ed|ing) /
Expressed this way, the two spaces can each be represented by a space character. In free-spacing mode, however, all spaces outside character classes are removed, which is why I needed to enclose each space in a character class.
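The effect of free-spacing mode on spaces can be seen with two tiny patterns. This is just an illustration; the variable names bare and classed are made up.

```ruby
# In free-spacing mode (/x) a bare space in the pattern is ignored...
bare = /a b/x
# ...while a space inside a character class is kept.
classed = /a[ ]b/x

puts bare.match?("ab")     # true: the pattern is effectively /ab/
puts bare.match?("a b")    # false: there is no "ab" in "a b"
puts classed.match?("a b") # true: [ ] matches the literal space
```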
(Posted on behalf of the question author).
This is what I ended up using:
^(MAINT|(STRY|PRB)-\d+):\s(?:(?!(?:ed|ing)\b)[A-Za-z])+\s([a-zA-Z0-9._\-].*)

How do I match something only if a character doesn't follow a pattern?

I'm using Ruby 2.4. How do I write a regular expression that matches a series of digits, then a plus sign, and then any sequence that follows, provided that sequence doesn't contain another digit? For example, this would match per my rules
23+abcdef
as would this
1111111+ __++
But this would not
2+3
Neither would this
2+ L43
I tried this but was unsuccessful ...
/\d+[[:space:]]*(\+|plus).*([^\d]|$)/i.match(mystr)
r = /\A # match beginning of string
\d+ # match one or more digits
\+ # match plus sign
\D* # match zero or more characters other than a digit
\z # match end of string
/x # free-spacing regex definition mode
"23+abcdef".match?(r)
#=> true
"1111111+ __++".match?(r)
#=> true
"23 abcdef".match?(r)
#=> false
"2+3".match?(r)
#=> false
"2+ L43".match?(r)
#=> false
If at least one character that is not a digit is to follow '+', change \D* in the regex to \D+.
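A short comparison of the two variants, written without free-spacing mode (the variable names are only for illustration):

```ruby
zero_or_more = /\A\d+\+\D*\z/ # the regex above, on one line
at_least_one = /\A\d+\+\D+\z/ # variant requiring something after the plus

puts zero_or_more.match?("23+")       # true: nothing needs to follow "+"
puts at_least_one.match?("23+")       # false: a non-digit must follow "+"
puts at_least_one.match?("23+abcdef") # true
puts at_least_one.match?("2+3")       # false
```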

Capitalize the first character after a dash

So I've got a string that's an improperly formatted name. Let's say, "Jean-paul Bertaud-alain".
I want to use a regex in Ruby to find the first character after every dash and make it uppercase. So, in this case, I want to apply a method that would yield: "Jean-Paul Bertaud-Alain".
Any help?
String#gsub can take a block argument, so this is as simple as:
str = "Jean-paul Bertaud-alain"
str.gsub(/-[a-z]/) {|s| s.upcase }
# => "Jean-Paul Bertaud-Alain"
Or, more succinctly:
str.gsub(/-[a-z]/, &:upcase)
Note that the regular expression /-[a-z]/ only matches letters in the a-z range, so it won't match e.g. à. In Ruby versions before 2.4 this makes no practical difference, because String#upcase there does not change characters with diacritics; capitalization is language-dependent (e.g. i is capitalized differently in Turkish than in English). Read this answer for more information: https://stackoverflow.com/a/4418681
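On Ruby 2.4 and later, String#upcase does apply Unicode case mapping, so a broader character class can be useful there. A minimal sketch, assuming UTF-8 strings and swapping [a-z] for the \p{Lower} property (my substitution, not part of the answer above):

```ruby
# "\u00e9" is the letter e with an acute accent, written as an escape
# so the example does not depend on the file's encoding.
name = "Jean-\u00e9lie Bertaud-alain"

# \p{Lower} matches any lower-case letter, not only a-z.
fixed = name.gsub(/-\p{Lower}/, &:upcase)
puts fixed
```

With Ruby 2.4 or later this should upcase both the accented letter after the first dash and the plain a after the second.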
"Jean-paul Bertaud-alain".gsub(/(?<=-)\w/, &:upcase)
# => "Jean-Paul Bertaud-Alain"
I suggest you make the test more demanding by requiring that the letter to be upcased 1) be preceded by a capitalized word followed by a hyphen and 2) be followed by lower-case letters followed by a word break.
r = /
\b # Match a word break
[A-Z] # Match an upper-case letter
[a-z]+ # Match >= 1 lower-case letters
\- # Match hyphen
\K # Forget everything matched so far
[a-z] # Match a lower-case letter
(?= # Begin a positive lookahead
[a-z]+ # Match >= 1 lower-case letters
\b # Match a word break
) # End positive lookahead
/x # Free-spacing regex definition mode
"Jean-paul Bertaud-alain".gsub(r) { |s| s.upcase }
#=> "Jean-Paul Bertaud-Alain"
"Jean de-paul Bertaud-alainM".gsub(r) { |s| s.upcase }
#=> "Jean de-paul Bertaud-alainM"

Ruby Regular expression match

Why does this regexp
(?<!\S)[^\s]*[aeiou][^\s]*(?<=\d)(?!\S)
match test123 but not 123test?
I want to match a word that contains both a vowel and a digit. As I am new to this, I don't understand all the methods completely; maybe that's causing the problem.
i want to match a word which must have a vowel and digit
(?<!\S)\S*(?:[aeiou]\S*\d|\d\S*[aeiou])\S*(?!\S)
This part in your regex,
(?<=\d)(?!\S)
requires that the character just before the current position be a digit and that no non-space character follows, i.e. the word must end with a digit. In test123 the final 3 satisfies this, so your regex matches. It fails on 123test because none of its digits sits at the end of the word. Your regex also requires a vowel to appear somewhere before that final digit, which 123test does not satisfy either.
(?<!\S)[^\s]*[aeiou][^\s]*(?<=\d)(?!\S)
                          ^^^^^^^
Because of the lookbehind, the regex engine requires a digit immediately before the end of the word, which is not present in 123test.
For your needs you can simply use
\b(?=[a-zA-Z0-9]*[0-9])(?=[a-zA-Z0-9]*[aeiou])[a-zA-Z0-9]+\b
See demo.
https://www.regex101.com/r/fJ6cR4/27
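In Ruby, the two-lookahead pattern can be tried with String#scan. This is just a usage sketch; the constant name and the sample text are made up.

```ruby
# Each lookahead checks one requirement from the start of the word:
# the first needs a digit somewhere in it, the second a vowel.
WORD_WITH_VOWEL_AND_DIGIT =
  /\b(?=[a-zA-Z0-9]*[0-9])(?=[a-zA-Z0-9]*[aeiou])[a-zA-Z0-9]+\b/

text = "test123 123test test 123"
p text.scan(WORD_WITH_VOWEL_AND_DIGIT) # => ["test123", "123test"]
```

Because the lookaheads do not consume anything, the order of the vowel and the digit inside the word does not matter.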
Alternative way of doing this:
def vowel_and_num?(str)
!(str.scan(/[aeiou]/).empty? || str.scan(/[0-9]/).empty?)
end
vowel_and_num?("test123")
# => true
vowel_and_num?("123test")
# => true
vowel_and_num?("test")
# => false
vowel_and_num?("123")
# => false

Regex condition on first and last characters

How can I write a regex to match a string that does not start or end with a white space character? A matching string can have any character in the middle, and importantly, a single-character string should match.
My attempt was:
/\A\S.*\S\z/
but this will not match a single character.
This is one of the cases where you should not attempt to build a regex that matches something, but rather a regex that matches the complement of something, and use the regex negatively.
re = /\A\s|\s\z/
re !~ " " # => false
re !~ "" # => true
re !~ "sss" # => true
re !~ "s ss" # => true
re !~ " s ss" # => false
is_ok = lambda do |str|
a, z = str.chars.first, str.chars.last
"#{a}#{z}" =~ / |\n|\t/ ? false : true
end
#"more elegant" (yeah dude I rock)
is_ok = lambda {|str| [0, -1].map{|i| str.chars[i] }.join =~ / |\n|\t/ ? false : true}
Use this regex:
\A\S+(?:\s*\S+)*\Z
You can play with the Test String part of this demo to see how this works. I'm assuming that strings can span multiple lines, hence the \A and \Z
In Ruby, something like:
if subject =~ /\A\S+(?:\s*\S+)*\Z/
  match = $&
end
Explanation
The \A anchor asserts that we are at the beginning of the subject string
\S+ matches one or more non-whitespace characters (whitespace includes tabs, newlines, etc.). Alternatively, if you want to allow newlines at the beginning and only exclude a space character, you can use [^ ]+ instead of \S+
(?:\s*\S+) matches any number of optional whitespace characters, followed by one or more non-space characters
The * quantifier repeats that zero or more times
The \Z anchor asserts that we are at the end of the subject string
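One detail worth checking when choosing the end anchor: in Ruby, \Z also matches just before a single trailing newline, while \z matches only at the very end of the string. The sketch below compares the two; the \z variant is my substitution, not part of the answer above.

```ruby
with_big_z   = /\A\S+(?:\s*\S+)*\Z/ # the regex from the answer
with_small_z = /\A\S+(?:\s*\S+)*\z/ # stricter variant (assumption)

puts with_big_z.match?("a b")     # true
puts with_big_z.match?("a b\n")   # true: \Z tolerates one trailing newline
puts with_small_z.match?("a b\n") # false: \z is the very end of the string
puts with_small_z.match?("a")     # true: a single character still matches
```

If a trailing newline should count as trailing whitespace, the \z form is the safer choice.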
Use lookaheads, like this:
\A(?=\S).*\S\Z
Regex101 Demo
This matches the start of the string and requires (1) that the first character be a non-whitespace character and (2) that the last character be a non-whitespace character.
Matches:
a
a b
a b c d 1231 e
Non matches:
(just a space)
a (leading space)
b (trailing space)
empty string
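The lists above can be checked in Ruby with a short loop. This is only a usage sketch; the variable name no_edge_space is made up.

```ruby
no_edge_space = /\A(?=\S).*\S\Z/

# Expected to match
["a", "a b", "a b c d 1231 e"].each do |s|
  puts "#{s.inspect}: #{no_edge_space.match?(s)}"
end

# Expected not to match
["", " ", " a", "b "].each do |s|
  puts "#{s.inspect}: #{no_edge_space.match?(s)}"
end
```

Without the m flag, . does not match a newline, so a string with a newline in the middle will not match this pattern.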
