Building regex to match 2 words only

I'm trying to build a regexp that matches only 2 words with a single space between them. No special symbols, only [a-zA-Z] space [a-zA-Z].
Foo Bar # Match (two words and one space only)
Foo # Mismatch (only one word)
Foo  Bar # Mismatch (2 spaces)
Foo Bar Baz # Mismatch (3 words)

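You want ^[a-zA-Z]+\s[a-zA-Z]+$
^ # matches the start of a line (in Ruby, \A matches the start of the string)
+ # quantifier meaning one or more of the preceding character class
\s # matches a single whitespace character (space, tab, newline, ...)
$ # matches the end of a line (in Ruby, \z matches the end of the string)
The anchors ^ and $ are important here: without them the pattern could match anywhere inside a longer string.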
Demo:
if "foo bar" =~ /^[a-zA-Z]+\s[a-zA-Z]+$/
  puts "match 1"
end
if "foo  bar" =~ /^[a-zA-Z]+\s[a-zA-Z]+$/ # two spaces
  puts "match 2"
end
if "foo bar biz" =~ /^[a-zA-Z]+\s[a-zA-Z]+$/
  puts "match 3"
end
Output:
match 1
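One caveat worth illustrating: in Ruby, ^ and $ anchor to line boundaries rather than to the whole string, so a multi-line input can slip through. A minimal sketch, assuming the input may contain newlines and that exactly one literal space is wanted; TWO_WORDS and two_words? are names made up for this example, and Regexp#match? needs Ruby 2.4 or later:

```ruby
# Hypothetical helper: \A and \z anchor to the whole string, not to a line.
TWO_WORDS = /\A[a-zA-Z]+ [a-zA-Z]+\z/

def two_words?(text)
  TWO_WORDS.match?(text)
end

puts two_words?("Foo Bar")          # true
puts two_words?("Foo")              # false (only one word)
puts two_words?("Foo  Bar")         # false (two spaces)
puts two_words?("Foo Bar Baz")      # false (three words)
puts two_words?("Foo Bar\nBaz Qux") # false, although the ^...$ version matches the first line
```

Using a literal space instead of \s also rejects a tab or newline between the two words, which seems closer to what the question asks for.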

Related

How to match the last occurrence of a pattern?

Given that I have the following string:
String 1 | string 2 | string 3
I want my regex to match the value after the last pipe and space, which in this case is "string 3".
Right now I am using /[^|]+$/i, but it also returns the space character after the pipe.
https://regex101.com/r/stnW0D/1

Without regex:
"String 1 | string 2 | string 3".split(" | ").last # => "string 3"

'String 1 | string 2 | string 3'[/(?<=\|\s)(\w+\s\d+\z)/]
# "string 3"
Where (escaped):
\| # a pipe
\s # a whitespace
\w+ # one or more of any word character
\s # a whitespace
\d+ # one or more digits
\z # end of string
(...) # capture everything enclosed
(?<=...) # a positive lookbehind
Notice that in this case the regex already finds the last occurrence of the pattern, because it is anchored to the end of the string (\z). In other cases you could use [\|\s]? instead of the lookbehind (?<=\|\s), scan for every word followed by a whitespace and a number, and take the last element of the returned array:
'String 1 | string 2 | string 3'.scan(/[\|\s]?(\w+\s\d+)/).last
# ["string 3"]

str = "string 1 | string 2 | string 3"
str[/[^ |][^|]*\z/]
#=> "string 3"
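The last answer comes without an explanation, so here is the same pattern wrapped in a small method with each piece annotated. This is only a sketch; the method name last_field is made up for illustration:

```ruby
# Hypothetical helper wrapping the pattern from the answer above.
def last_field(text)
  # [^ |]  first character: neither a space nor a pipe
  # [^|]*  followed by any run of characters that are not pipes
  # \z     and the match must reach the very end of the string
  text[/[^ |][^|]*\z/]
end

puts last_field("String 1 | string 2 | string 3") # string 3
puts last_field("no pipes here")                  # no pipes here
```

Because the first character may not be a space, the leading space after the last pipe is skipped. A string without any pipe is returned whole.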

what would the regular expression to extract the 3 from be?

I basically need to get the bit after the last pipe
"3083505|07733366638|3"
What would the regular expression for this be?

You can do this without regex. Here:
"3083505|07733366638|3".split("|").last
# => "3"
With regex (assuming it's always going to be integer values):
"3083505|07733366638|3".scan(/\|(\d+)$/)[0][0] # or use \w+ if you want to extract any word after `|`
# => "3"

Try this regex :
.*\|(.*)
It returns whatever comes after LAST | .
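In Ruby, that pattern could be applied with String#[] and a capture-group index. A small sketch using the string from the question (the variable names are only for illustration):

```ruby
line = "3083505|07733366638|3"

# The greedy .* runs as far as it can, so \| ends up matching the LAST pipe;
# the second argument 1 selects capture group 1.
last = line[/.*\|(.*)/, 1]
puts last # 3

# With no pipe in the string there is no match and the result is nil.
missing = "3083505"[/.*\|(.*)/, 1]
p missing # nil
```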

You could do that most easily by using String#rindex:
line = "3083505|07733366638|37"
line[line.rindex('|')+1..-1]
#=> "37"
If you insist on using a regex:
r = /
.* # match any number of any character (greedily!)
\| # match pipe
(.+) # match one or more characters in capture group 1
/x # extended mode
line[r,1]
#=> "37"
Alternatively:
r = /
.* # match any number of any character (greedily!)
\| # match pipe
\K # forget everything matched so far
.+ # match one or more characters
/x # extended mode
line[r]
#=> "37"
or, as suggested by @engineersmnky in a comment on @shivam's answer:
r = /
(?<=\|) # match a pipe in a positive lookbehind
\d+ # match one or more digits
\z # match end of string
/x # extended mode
line[r]
#=> "37"

I would use split and last, but you could do
last_field = line.sub(/.+\|/, "")
That removes all characters up to and including the last pipe.

Check if string1 is before string2 on the same line

I am trying to match comment lines in C#/SQL code. CREATE may come before or after /*, and both can be on the same line.
line6 = " CREATE /* this is ACTIVE line 6"
line5 = " charlie /* CREATE inside this is comment 5"
In the first case, it will be an active line; in the second, it will be a comment. I probably can do some kind of charindex, but maybe there is a simpler way.
regex1 = /\/\*|--/
if line6 =~ regex1 then puts "Match comment___" + line6 else puts '____' end
if line5 =~ regex1 then puts "Match comment___" + line5 else puts '____' end

With the regex
r = /
\/ # match forward slash
\* # match asterisk
\s+ # match > 0 whitespace chars
CREATE # match chars
\b # match word break (to avoid matching CREATED)
/x # extended mode for regex def
you can return an array of the comment lines thus:
[line6, line5].select { |l| l =~ r }
#=> [" charlie /* CREATE inside this is comment 5"]
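The regex above only catches a CREATE that directly follows the comment opener. If the goal is to compare positions, as the question's "charindex" idea suggests, a sketch along these lines might do. The method name create_in_comment? is hypothetical, and the sketch assumes single-line input and ignores block comments spanning several lines as well as comment markers inside string literals:

```ruby
# Hypothetical helper: true when CREATE appears after a comment opener (/* or --).
def create_in_comment?(line)
  create_at  = line.index(/\bCREATE\b/) # position of the keyword, or nil
  comment_at = line.index(%r{/\*|--})   # position of the first comment opener, or nil
  return false if create_at.nil? || comment_at.nil?
  comment_at < create_at
end

line6 = " CREATE /* this is ACTIVE line 6"
line5 = " charlie /* CREATE inside this is comment 5"

puts create_in_comment?(line6) # false (CREATE comes first, so the line is active)
puts create_in_comment?(line5) # true  (CREATE sits inside the comment)
```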

Why $ doesn't match \r\n

Can someone explain this:
str = "hi there\r\n\r\nfoo bar"
rgx = /hi there$/
str.match rgx # => nil
rgx = /hi there\s*$/
str.match rgx # => #<MatchData "hi there\r\n\r">
On the one hand it seems like $ does not match \r. But then if I first capture all the white spaces, which also include \r, then $ suddenly does appear to match the second \r, not continuing to capture the trailing "\nfoo bar".
Is there some special rule here about consecutive \r\n sequences? The docs on $ simply say it will match "end of line" which doesn't explain this behavior.

$ is a zero-width assertion. It doesn't match any character, it matches at a position. Namely, it matches either immediately before a \n, or at the end of string.
/hi there\s*$/ matches because \s* matches "\r\n\r", which allows the $ to match at the position before the second \n. The $ could have also matched at the position before the first \n, but the \s* is greedy and matches as much as it can, while still allowing the overall regex to match.
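The positions described above can be checked directly. A short sketch using the string from the question (the variable names are only for illustration):

```ruby
str = "hi there\r\n\r\nfoo bar"

plain  = str[/hi there$/]     # nil: "\r" sits between "there" and the first "\n"
with_r = str[/hi there\r?$/]  # "hi there\r": $ asserts the position before "\n"
greedy = str[/hi there\s*$/]  # "hi there\r\n\r": \s* backs off to the last position before a "\n"
lazy   = str[/hi there\s*?$/] # "hi there\r": the lazy \s*? stops at the first such position

p plain, with_r, greedy, lazy
```

The optional \r? is one way to tolerate Windows line endings without swallowing the following blank lines.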

Variables inside Regexp

Let's say I have the code:
str = "foobar"
print "Enter in the letters you would like to match: "
match = gets
# Pseudocode:
str =~ /[match]/
I don't want to match the literal string "match"; I just want to match each of the letters, like:
str =~ /[aeiou]/
would yield the vowels.
How do I make it so I can match the letters the user inputs?

Try this:
match = gets.chomp # cut off that trailing \n
str =~ /[#{match}]/
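Plain interpolation breaks if the user types characters that are special inside a character class, such as ], ^ or -. A hedged sketch that escapes the input first with Regexp.escape; the method name matching_letters is made up for this example, and letters stands in for the result of gets.chomp:

```ruby
# Hypothetical helper: returns every character of str that appears in letters.
def matching_letters(str, letters)
  return [] if letters.empty? # an empty character class [] would be a syntax error
  str.scan(/[#{Regexp.escape(letters)}]/)
end

p matching_letters("foobar", "aeiou") # ["o", "o", "a"]
p matching_letters("a-b]c", "-]")     # ["-", "]"]
```

String#scan is used here instead of =~ so that all matching letters are returned rather than just the index of the first one.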
