This code:
/ell/ === 'Hello'
evaluates to true in IRB.
I don't understand why this makes sense logically. Integer === 30 makes sense because 30 is a PART OF the Integer class, but in what way is the string 'Hello' a PART OF /ell/? I don't get it.
Semantically, you're asking whether the regular expression /ell/ matches the string 'Hello'. Since 'Hello' contains the substring 'ell', the result is true.
The '===' method is described here:
http://www.ruby-doc.org/core-2.0.0/Regexp.html#method-i-3D-3D-3D
You should not use === for anything in Ruby except case equality. See the documentation on Regexp#===:
Following a regular expression literal with the === operator allows you to compare against a String.
/^[a-z]*$/ === "HELLO" #=> false
/^[A-Z]*$/ === "HELLO" #=> true
=== is the case equality operator. It is primarily used in case statements and should rarely appear on its own.
case my_string
when /ll/ then puts 'the string might be hello'
when /x/ then puts 'all I know is that the string contains x'
else puts 'I have no idea'
end
It can also be used in some other functions such as grep:
array = ['ll', 'aa', 'hello']
p array.grep(/ll/){|x| x.upcase} #=> ["LL", "HELLO"]
Any other use is discouraged and it really does not need to make any sense.
A regular expression describes a language, i.e. a set of strings. The === checks whether the string is a member of that set.
See my answer to a similar question for details.
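The set-membership reading can be checked directly in IRB. The snippet below is only a sketch: in_language? is a made-up helper name, and Regexp#match? assumes Ruby 2.4 or later. It treats each pattern as a set and asks whether a value belongs to it.

```ruby
# Hypothetical helper: asks whether `value` is a member of the set
# described by `pattern`, using case equality.
def in_language?(pattern, value)
  pattern === value
end

p in_language?(/ell/, 'Hello')    # true:  'Hello' is in the set of strings containing "ell"
p in_language?(/ell/, 'Goodbye')  # false: 'Goodbye' is not
p in_language?(Integer, 30)       # true:  30 is in the set of Integer instances
p in_language?(1..10, 5)          # true:  5 is in the set of numbers from 1 to 10

# For a Regexp and a String, === gives the same answer as match?
p /ell/.match?('Hello') == (/ell/ === 'Hello')  # true
```

Seen this way, Integer === 30 and /ell/ === 'Hello' are the same kind of question; only the way the set is described differs.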
Related
I had mostly been using === for matching values to patterns in Ruby. Recently, I discovered that the language also supports the =~ operator for regular expressions.
The Ruby Documentation defines === as "case equality" and =~ as "pattern match".
Case Equality – For class Object, effectively the same as calling #==, but typically overridden by descendants to provide meaningful semantics in case statements.
Pattern Match—Overridden by descendants (notably Regexp and String) to provide meaningful pattern-match semantics.
By experimentation, I find that === works for regular expressions, class names, literal values, and even ranges, while =~ only seems to return useful values for regular expressions. My question is: why would I ever use =~? It seems like === supports everything =~ does and then some. Is there something I'm missing here that =~ is intended to do differently?
Firstly, =~ is symmetric:
'string' =~ /regex/
And
/regex/ =~ 'string'
Both work.
Secondly, as you noted, === works with other classes. If you want to match strings, you should use the operator meant for matching. === is called the case operator for a reason: case uses it internally.
case foo
when bar then x
when baz then y
else z
end
Is the same as:
if bar === foo
x
elsif baz === foo
y
else
z
end
Explicitly using === is considered unidiomatic.
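Because case only ever calls === on the when expression, any object that defines === can act as a pattern. Here is a minimal sketch of that idea; the MultipleOf class and the classify method are hypothetical names invented for this example.

```ruby
# Hypothetical matcher: case equality is defined by the object that
# appears in the `when` clause, not by the value being tested.
class MultipleOf
  def initialize(n)
    @n = n
  end

  # Called by `case` as MultipleOf.new(n) === value.
  def ===(other)
    other.is_a?(Integer) && (other % @n).zero?
  end
end

def classify(number)
  case number
  when MultipleOf.new(15) then 'fizzbuzz'
  when MultipleOf.new(3)  then 'fizz'
  when MultipleOf.new(5)  then 'buzz'
  else number.to_s
  end
end

puts classify(30)  # fizzbuzz
puts classify(9)   # fizz
puts classify(7)   # 7
```

The === stays hidden inside case, which is where it reads most naturally.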
str = "Something is amiss."
r = /me/
r === str #=> true
str =~ r #=> 2
What if you want to know if there's a match and if so, where it begins?
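To make the difference concrete: === only tells you whether there is a match, while =~ also tells you where. The following is a small sketch, and match_report is a hypothetical helper rather than a library method.

```ruby
# Hypothetical helper that collects what each operator reports
# for the same string and pattern.
def match_report(str, pattern)
  {
    case_equal: pattern === str,  # true or false, nothing more
    position: str =~ pattern,     # index of the first match, or nil
    matched_text: str[pattern]    # the matched substring, or nil
  }
end

p match_report('Something is amiss.', /me/)
# {:case_equal=>true, :position=>2, :matched_text=>"me"} (newer Rubies print it as case_equal: true, ...)

p match_report('Something is amiss.', /xyz/)
# case_equal is false, position and matched_text are nil
```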
I'm testing some strings to make sure they start with a letter:
name =~ /\A[a-zA-Z].*/
but since in Ruby this evaluates to either nil or 0 and both cast to false, I need to put an additional .nil? test:
if(name =~ /\A[a-zA-Z].*/).nil? ...
Is this the proper way or am I missing something?
EDIT:
Thanks for the replies, in my ignorance I made wrong assumptions, oversimplified the example. It should read (note the negation):
name !=~ /\A[a-zA-Z].*/
irb(main):001:0> a = "abc"
=> "abc"
irb(main):006:0> (a !=~/\Aabc/)
=> true
irb(main):007:0> (a !=~/\Ab/)
=> true
but since in Ruby this evaluates to either nil or 0 and both cast to false
Wrong: only nil (and false, to be precise) are treated as false in conditionals. 0 is treated as true. So
if name =~ /\A[a-zA-Z].*/
is perfectly ok.
About your edited question: you can't put an exclamation mark (!) in front of an arbitrary operator to negate it. There is no !=~ operator (by the way, these 'operators' are actually methods). Ruby parses a !=~ /re/ as a != (~/re/), which is why both of your irb examples return true. To achieve your goal, you should do:
if !(name =~ /\A[a-zA-Z].*/)
or you can use unless instead:
unless name =~ /\A[a-zA-Z].*/
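Ruby does have a built-in negated match operator, !~, and since Ruby 2.4 there is also match?, which returns a plain boolean. The sketch below shows both, plus what the !=~ spelling actually evaluates; all three method names are hypothetical and exist only for this illustration.

```ruby
# true when name starts with an ASCII letter. match? returns a real
# boolean (assumes Ruby 2.4 or later).
def starts_with_letter?(name)
  name.match?(/\A[a-zA-Z]/)
end

# !~ is the negation of =~ and returns true or false.
def missing_leading_letter?(name)
  name !~ /\A[a-zA-Z]/
end

# What `a !=~ /re/` really means: a != (~/re/). Unary ~ on a Regexp
# matches against $_ (the last line read by gets), which is nil here,
# so the comparison is true for any string.
def misparsed_negation(a)
  a != ~/\Aabc/
end

p starts_with_letter?('abc')       # true
p missing_leading_letter?('1abc')  # true
p misparsed_negation('abc')        # true, even though 'abc' matches the pattern
```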
After reading a comment to an answer in another question and doing a little research, I see that =~ is defined on Object and then overridden by String and Regexp. The implementations for String and Regexp appear to assume the other class:
"123" =~ "123" # => TypeError: type mismatch: String given
/123/ =~ /123/ # => TypeError: can't convert Regexp to String
Although =~ is defined for Object, + is not:
Object.new =~ 1 # => nil
Object.new + 1 # => undefined method `+' for #<Object:0x556d38>
Why has Object#=~ been defined, rather than restricting =~ to String and Regexp?
Because it allows any object to be used in a match expression:
Object.new =~ /abc/
=> nil
I guess this makes sense: Object.new does not match the regexp /abc/, and without Object#=~ the code would blow up whenever the left operand wasn't a String. So it generally simplifies the code, because you can have any object on the left side of the =~ operator.
Well, I suppose that's actually nicely answered in String =~ documentation:
Match: If obj is a Regexp, use it as a pattern to match against str, and
returns the position the match starts, or nil if there is no match.
Otherwise, invokes obj.=~, passing str as an argument. The default =~
in Object returns nil.
The point is that you can define =~ on your own class, and String#=~ will call it whenever the right-hand operand is not a Regexp.
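As a rough illustration of that delegation, here is a sketch with a made-up Suffix class (suffix_position is hypothetical too). Since Suffix is not a Regexp, the string hands the match over to Suffix#=~.

```ruby
# Hypothetical pattern-like class: matches strings that end with a
# given suffix and reports where the suffix starts.
class Suffix
  def initialize(ending)
    @ending = ending
  end

  # Invoked by String#=~ when the right-hand side is a Suffix.
  def =~(str)
    str.end_with?(@ending) ? str.length - @ending.length : nil
  end
end

def suffix_position(str, ending)
  str =~ Suffix.new(ending)  # String#=~ calls Suffix#=~(str)
end

p suffix_position('report.rb', '.rb')  # 6
p suffix_position('report.rb', '.py')  # nil
```

This works without relying on Object#=~ at all, because Suffix brings its own =~.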
From your comments, your actual question is why is =~ defined on Object while + isn't.
The reason is that Object#=~ can return nil for random objects (since they don't match), but Object#+ can not return a meaningful result.
It is not necessarily super useful, but it can not be said to be false (you would have to show a match to prove that a nil result is a contradiction). See the mathematical concept of vacuous truth. On the other hand, any result for Object.new + 1 could lead to contradictions.
This is similar to <=>, which can return nil (and is thus also defined on Object), whereas <, >, and so on cannot return either true or false for arbitrary objects and remain completely consistent. Note that for Class#> it was decided to return nil in those cases.
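The <=> behaviour is easy to try out. A short sketch follows, in which ordered? is a hypothetical helper name.

```ruby
# Hypothetical helper: reports whether two objects have a defined ordering.
def ordered?(a, b)
  !(a <=> b).nil?
end

p ordered?(1, 2)           # true:  Integer#<=> returns -1
p ordered?(1, 'a')         # false: Integer#<=> returns nil for a String
p ordered?(Object.new, 1)  # false: Object#<=> returns nil for unrelated objects

# Class#> also answers nil when neither class inherits from the other.
p Integer > String         # nil
p Numeric > Integer        # true

# By contrast, < on an Integer has no "don't know" answer to give, so it raises.
begin
  1 < 'a'
rescue ArgumentError => e
  puts e.message
end
```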
something like this:
a = 6
case a
when /\d/ then "it's a number"
end
no luck, it doesn't work
When case is given a value, all it does is call === on each when expression, passing the value as the argument. The problem isn't with case; try:
/\d/ === 6 # => false
All that to say, regexes match against strings only. Try replacing the second line with:
case (a.is_a?(String) ? a : a.to_s)
EDIT: To answer the OP's follow-up in comments, there's a subtlety here.
/\d/ === '6' # => true
'6' === /\d/ # => false
Perhaps unexpectedly to the beginner, String#=== and Regexp#=== have different effects. So, for:
case 'foo'
when String
end
This will call String === 'foo', not 'foo' === String, etc.
It doesn't work because regexes match against a string, whereas 6 is not a string. If you do a = '6', it will work.
Because regexps match strings, and a is a Fixnum (an Integer in Ruby 2.4 and later).
If you wrote a = "6", it would work. You can test whether a is a number with a.is_a?(Numeric).
One minor change to make it work:
a = 6
case a.to_s
when /\d/ then "it's a number"
end
The to_s will convert everything to a string. Note that your regex just checks for the existence of a digit anywhere in the string.
It would perhaps be better to do this:
case a
when Numeric then "it's a number"
end
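Both approaches can live in the same case, since each when clause applies its own ===. A minimal sketch, with describe_value as a hypothetical method name:

```ruby
# Hypothetical method combining a class check and a regexp check.
def describe_value(value)
  case value
  when Numeric    then "it's a number"            # Numeric === value
  when /\A\d+\z/  then "it's a string of digits"  # Regexp === value, false for non-strings
  else "it's something else"
  end
end

puts describe_value(6)      # it's a number
puts describe_value('6')    # it's a string of digits
puts describe_value('six')  # it's something else

# The receiver matters: the pattern goes on the left.
p(/\d/ === '6')  # true
p('6' === /\d/)  # false, String#=== is plain equality
```

Anchoring with \A and \z makes the regexp branch require digits only, rather than a digit anywhere in the string.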
What is the purpose of the question mark operator in Ruby?
Sometimes it appears like this:
assert !product.valid?
sometimes it's in an if construct.
It is a code style convention; it indicates that a method returns a boolean value (true or false) or an object to indicate a true value (or “truthy” value).
The question mark is a valid character at the end of a method name.
https://docs.ruby-lang.org/en/2.0.0/syntax/methods_rdoc.html#label-Method+Names
Also note ? along with a character acts as shorthand for a single-character string literal since Ruby 1.9.
For example:
?F # => is the same as "F"
This is referenced near the bottom of the string literals section of the ruby docs:
There is also a character literal notation to represent single
character strings, which syntax is a question mark (?) followed by a
single character or escape sequence that corresponds to a single
codepoint in the script encoding:
?a #=> "a"
?abc #=> SyntaxError
?\n #=> "\n"
?\s #=> " "
?\\ #=> "\\"
?\u{41} #=> "A"
?\C-a #=> "\x01"
?\M-a #=> "\xE1"
?\M-\C-a #=> "\x81"
?\C-\M-a #=> "\x81", same as above
?あ #=> "あ"
Prior to Ruby 1.9, this returned the ASCII character code of the character. To get the old behavior in modern Ruby, you can use the #ord method:
?F.ord # => will return 70
It's a convention in Ruby that methods that return boolean values end in a question mark. There's no more significance to it than that.
In your example it's just part of the method name. In Ruby you can also use exclamation points in method names!
Another example of question marks in Ruby would be the ternary operator.
customerName == "Fred" ? "Hello Fred" : "Who are you?"
It may be worth pointing out that ?s are only allowed in method names, not variables. In the process of learning Ruby, I assumed that ? designated a boolean return type so I tried adding them to flag variables, leading to errors. This led to me erroneously believing for a while that there was some special syntax involving ?s.
Relevant: Why can't a variable name end with `?` while a method name can?
In your example
product.valid?
is actually a method call: it calls a method named valid?. Certain types of "test for condition"/boolean methods have a question mark as part of the method name by convention.
I believe it's just a convention for things that are boolean. A bit like saying "IsValid".
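To show that the ? really is just part of the name, here is a sketch of how such a method might be defined. The Product class and its validity rule are hypothetical, loosely modelled on the example in the question.

```ruby
# Hypothetical Product class; the validity rule is invented for illustration.
class Product
  def initialize(title, price)
    @title = title
    @price = price
  end

  # The trailing ? is part of the method name. By convention it signals
  # that the method answers a yes/no question.
  def valid?
    !@title.to_s.empty? && @price.to_f > 0
  end
end

product = Product.new('Book', 9.99)
is_valid = product.valid?  # a local variable name cannot end in ?, so it is dropped here
puts "valid: #{is_valid}"  # valid: true
```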
It's also used in regular expressions, meaning "at most one repetition of the preceding character".
For example, the regular expression /hey?/ matches the strings "he" and "hey".
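A quick way to check the quantifier, sketched with a hypothetical helper (he_or_hey?) and assuming Ruby 2.4 or later for match?:

```ruby
# Hypothetical helper: true when the whole word matches /hey?/.
# \A and \z anchor the pattern so nothing may surround the match.
def he_or_hey?(word)
  word.match?(/\Ahey?\z/)
end

p he_or_hey?('he')    # true:  the optional y is absent
p he_or_hey?('hey')   # true:  the optional y appears once
p he_or_hey?('heyy')  # false: y may appear at most once

# Without anchors the pattern matches anywhere inside the string.
p 'hello'.match?(/hey?/)  # true, because 'hello' contains "he"
```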
Character literals are also conventionally used as the first argument to Kernel#test:
test ?d, "/dev" # directory exists?
# => true
test ?-, "/etc/hosts", "/etc/hosts" # are the files identical
# => true
as seen in this question here