How does `in` work in Ruby? - ruby

I could not find any documentation for in at https://docs.ruby-lang.org/en/master/index.html,
but it seems to be supported.
h = {a: 1}
:a in h # => true
2 in 1..3 # => true
How does it work? Is there any special method I can override like == or [] ?

I could not find any documentation for in at https://docs.ruby-lang.org/en/master/index.html,
but it seems to be supported.
All keywords are documented in the doc/keywords.rdoc document, where you can find this entry for the in keyword:
Used to separate the iterable object and iterator variable in a for loop. See control expressions syntax. It also serves as a pattern in a case expression. See pattern matching.
Since the example in your code is clearly not a for loop, it must therefore be Pattern Matching.
Ruby's Pattern Matching feature is documented in the Pattern Matching Syntax documentation doc/syntax/pattern_matching.rdoc:
Or with the => operator and the in operator, which can be used in a standalone expression:
<expression> => <pattern>
<expression> in <pattern>
[…]
<expression> in <pattern> is the same as case <expression>; in <pattern>; true; else false; end. You can use it when you only want to know if a pattern has been matched or not:
users = [{name: "Alice", age: 12}, {name: "Bob", age: 23}]
users.any? {|user| user in {name: /B/, age: 20..} } #=> true
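To make the difference between the two standalone forms concrete, here is a minimal sketch, assuming Ruby 3.0 or newer. The method names adult? and name_of are hypothetical and only serve the illustration.

```ruby
# Assumes Ruby 3.0+; adult? and name_of are hypothetical helper names.

# `in` evaluates to true or false and never raises on a mismatch.
def adult?(user)
  user in {age: 18..}
end

# `=>` binds variables on success and raises NoMatchingPatternError
# (or a subclass of it) when the pattern does not match.
def name_of(user)
  user => {name: String => name}
  name
end

bob = {name: "Bob", age: 23}
p adult?(bob)   # true
p name_of(bob)  # "Bob"
```

The boolean form suits predicates such as the any? call above, while => is the better fit when a mismatch should be treated as an error.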
How does it work?
A high-level overview about Pattern Matching can be found in the above-mentioned documentation.
However, as always, for the details, you need to check the various bits and pieces that make up the Ruby Language Specification (which, unfortunately, does not exist as a single document in a single place).
Unlike many other programming languages, Ruby does not have a single formal specification that defines what certain language constructs mean.
There are several resources, the sum of which can be considered kind of a specification for the Ruby programming language.
Some of these resources are:
The ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification – Note that the ISO Ruby Specification was written around 2009–2010 with the specific goal that all existing Ruby implementations at the time would easily be compliant. Since YARV only implements Ruby 1.9+ and MRI only implements Ruby 1.8 and lower, this means that the ISO Ruby Specification only contains features that are common to both Ruby 1.8 and Ruby 1.9. Also, the ISO Ruby Specification was specifically intended to be minimal and only contain the features that are absolutely required for writing Ruby programs. Because of that, it does for example only specify Strings very broadly (since they have changed significantly between Ruby 1.8 and Ruby 1.9). It obviously also does not specify features which were added after the ISO Ruby Specification was written, such as Ractors or Pattern Matching.
The Ruby Spec Suite aka ruby/spec – Note that the ruby/spec is unfortunately far from complete. However, I quite like it because it is written in Ruby instead of "ISO-standardese", which is much easier to read for a Rubyist, and it doubles as an executable conformance test suite.
The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto – This book was written by David Flanagan together with Ruby's creator matz to serve as a Language Reference for Ruby.
Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler – This book was the first English book about Ruby and served as the standard introduction and description of Ruby for a long time. This book also first documented the Ruby core library and standard library, and the authors donated that documentation back to the community.
The Ruby Issue Tracking System, specifically, the Feature sub-tracker – However, please note that unfortunately, the community is really, really bad at distinguishing between Tickets about the Ruby Programming Language and Tickets about the YARV Ruby Implementation: they both get intermingled in the tracker.
The Meeting Logs of the Ruby Developer Meetings.
New features are often discussed on the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists.
The Ruby documentation – Again, be aware that this documentation is generated from the source code of YARV and does not distinguish between features of Ruby and features of YARV.
In the past, there were a couple of attempts at formalizing changes to the Ruby Specification, such as the Ruby Change Request (RCR) and Ruby Enhancement Proposal (REP) processes, both of which were unsuccessful.
If all else fails, you need to check the source code of the popular Ruby implementations to see what they actually do.
Unfortunately, in this particular case, most of those resources are useless: the ISO Ruby Language Specification only specifies features that are common to Ruby 1.8 and 1.9 and has not been updated since 2010. Since Pattern Matching is a feature of Ruby 3.x, it is not described there. Likewise, the book by Flanagan and matz and the Programming Ruby book are too old to contain documentation about Pattern Matching.
The Ruby Spec Suite does have a quite extensive specification of Pattern Matching: language/pattern_matching_spec.rb, but the one thing it does not specify is the one-line form of the in operator.
There are some references in the Feature tracker:
Feature #15865 <expr> in <pattern> expression introduces the Feature.
Feature #17260 Promote pattern matching to official feature promotes Pattern Matching from an experimental Ruby feature to an official Ruby feature but removes one-line <expr> in <pattern>.
Feature #17371 Reintroduce expr in pat reintroduces one-line <expr> in <pattern> as an official feature.
There is also some discussion in the Meeting Notes of the 20201026Japan Developer Meeting and the corresponding ticket.
Is there any special method I can override like == or [] ?
For Value Patterns, the method used is ===.
For Array Patterns and Find Patterns, the method used is deconstruct.
For Hash Patterns, the method used is deconstruct_keys.
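As a rough sketch of how these three hooks can be implemented on your own objects (assuming Ruby 3.0 or newer): the Point and EvenMatcher classes, the EVEN constant, and the helper methods below are all hypothetical names invented for this example.

```ruby
# Assumes Ruby 3.0+. All class, constant, and method names are illustrative.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Called for array patterns and find patterns.
  def deconstruct
    [x, y]
  end

  # Called for hash patterns; `keys` is the list of requested keys (or nil).
  def deconstruct_keys(keys)
    {x: x, y: y}
  end
end

# Value patterns call === on the pattern object.
class EvenMatcher
  def ===(other)
    other.is_a?(Integer) && other.even?
  end
end
EVEN = EvenMatcher.new

def on_x_axis?(point)
  point in {y: 0}    # hash pattern, uses deconstruct_keys
end

def coordinates(point)
  point => [x, y]    # array pattern, uses deconstruct
  [x, y]
end

def even_value?(value)
  value in EVEN      # value pattern, uses EVEN === value
end

p on_x_axis?(Point.new(3, 0))   # true
p coordinates(Point.new(1, 2))  # [1, 2]
p even_value?(4)                # true
```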

Please see here:
https://docs.ruby-lang.org/en/master/syntax/pattern_matching_rdoc.html
<expression> in <pattern> is the same as case <expression>; in <pattern>; true; else false; end
So :a in h is the same as
case :a
in h
true
else
false
end
You can use it when you only want to know if a pattern has been matched or not
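One caveat about the snippet in the question, sketched here assuming Ruby 3.0 or newer: a bare lowercase identifier on the right-hand side of in is a variable pattern, so :a in h matches any value and rebinds h instead of looking up a key. The method names below are hypothetical and exist only to keep each case isolated.

```ruby
# Assumes Ruby 3.0+; method names are illustrative only.

# Variable pattern: always matches, and h is rebound to the matched value.
def bare_identifier
  h = {a: 1}
  matched = (:a in h)
  [matched, h]
end

# Pinned variable: compares with h === :a, which is false for a Hash.
def pinned_variable
  h = {a: 1}
  (:a in ^h)
end

# To test for a key, match the hash itself against a hash pattern.
def key_present
  h = {a: 1}
  (h in {a: Integer})
end

# A range is a value pattern, so this uses Range#===.
def in_range
  (2 in 1..3)
end

p bare_identifier  # [true, :a]
p pinned_variable  # false
p key_present      # true
p in_range         # true
```

So the first line of the question returns true only because the pattern is a variable binding, while the second returns true because 1..3 === 2 holds.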

I think Ruby 3.0's in is what you are looking for.
See also https://rubyreferences.github.io/rubychanges/3.0.html#in-as-a-truefalse-check. It says:
in was reintroduced to return true/false (whether the pattern matches) instead of raising NoMatchingPatternError.

Related

Module#Refine and Module#used - couldn't understand their use in Ruby

Can anyone help me understand how the two methods below work, with some examples?
Module#Refine
Module#used
Regarding "refine", it is part of an "experimental" feature named Refinements. Refinements are not part of the Ruby 2.0 core spec, as their value and consequences were still being discussed among the various Ruby implementors (remember that there is more to Ruby than its core implementation, "MRI": there are also JRuby, Rubinius and others).
Refinements (should they arrive one day in the spec) would allow a kind of "local monkey patching": patching an existing class only within the scope of a given module. If you are interested in the discussions around them, take a look at Charles Oliver Nutter's article on the subject (he is the main implementor of JRuby) or this one from Yehuda Katz.
Regarding "used", as per the source, it does not do much:
static VALUE
rb_obj_dummy(void)
{
    return Qnil;
}
After some research and a "call for help", here is the answer from Charles Oliver Nutter (JRuby lead implementor):
#used is called when a module appears in a refinement's "using" call
So your two questions are actually linked.
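For readers who want to see the "local monkey patching" idea in code, here is a minimal sketch. It assumes the refine/using syntax as it later shipped (Module#using inside a module body needs Ruby 2.1 or newer), which may differ from the experimental 2.0 behaviour discussed above. The Shout and Greeter modules and the shout method are hypothetical names.

```ruby
# Assumes Ruby 2.1+; Shout, Greeter, and #shout are illustrative names.
module Shout
  # refine patches String, but only where the refinement is activated.
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Greeter
  using Shout  # activates the refinement for the rest of this module body

  def self.greet(name)
    name.shout
  end
end

p Greeter.greet("hello")  # "HELLO!"

# Outside Greeter the refinement is not active, so String is unchanged.
begin
  "hello".shout
rescue NoMethodError
  puts "String#shout is not visible here"
end
```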

What is the formal term for the "#{}" token in Ruby syntax?

The Background
I recently posted an answer where I variously referred to #{} as a literal, an operator, and (in one draft) a "literal constructor." The squishiness of this definition didn't really affect the quality of the answer, since the question was more about what it does and how to find language references for it, but I'm unhappy with being unable to point to a canonical definition of exactly what to call this element of Ruby syntax.
The Ruby manual mentions this syntax element in the section on expression substitution, but doesn't really define the term for the syntax itself. Almost every reference to this language element says it's used for string interpolation, but doesn't define what it is.
Wikipedia Definitions
Here are some Wikipedia definitions that imply this construct is (strictly speaking) neither a literal nor an operator.
Literal (computer programming)
Operator (programming)
The Questions
Does anyone know what the proper term is for this language element? If so, can you please point me to a formal definition?
Ruby's parser calls #{} the "embexpr" operator. That's EMBedded EXPRession, naturally.
I would definitely call it neither a literal (that's more for, e.g. string literals or number literals themselves, but not parts thereof) nor an operator; those are solely for e.g. binary or unary (infix) operators.
I would either just refer to it without a noun (i.e. for string interpolation), or perhaps call those characters the string interpolation sequence or escape.
TL;DR
Originally, I'd hypothesized:
Embedded expression seems the most likely definition for this token, based on hints in the source code.
This turned out to be true, and has been officially validated by the Ruby 2.x documentation. Based on the updates to the Ripper documentation since this answer was originally written, it seems the parser token is formally defined as string_embexpr and the symbol itself is called an "embedded expression." See the Update for Ruby 2.x section at the bottom of this answer for detailed corroboration.
The remainder of the answer is still relevant, especially for older Rubies such as Ruby 1.9.3, and the methodology used to develop the original answer remains interesting. I am therefore updating the answer, but leaving the bulk of the original post as-is for historical purposes, even though the current answer could now be shorter.
Pre-2.x Answer Based on Ruby 1.9.3 Source Code
Related Answer
This answer calls attention to the Ruby source, which makes numerous references to embexpr throughout the code base. @Phlip suggests that this variable is an abbreviation for "EMBedded EXPRession." This seems like a reasonable interpretation, but neither the ruby-1.9.3-p194 source nor Google (as of this writing) explicitly references the term embedded expression in association with embexpr in any context, Ruby-related or not.
Additional Research
A scan of the Ruby 1.9.3-p194 source code with:
ack-grep -cil --type-add=YACC=.y embexpr .rvm/src/ruby-1.9.3-p194 |
sort -rnk2 -t: |
sed 's!^.*/!!'
reveals 9 files and 33 lines with the term embexpr:
test_scanner_events.rb:12
test_parser_events.rb:7
eventids2.c:5
eventids1.c:3
eventids2table.c:2
parse.y:1
parse.c:1
ripper.y:1
ripper.c:1
Of particular interest is the inclusion of string_embexpr on line 4,176 of the parse.y and ripper.y bison files. Likewise, TestRipper::ParserEvents#test_string_embexpr contains two references to parsing #{} on lines 899 and 902 of test_parser_events.rb.
The scanner, exercised in test_scanner_events.rb, is also noteworthy. This file defines tests in #test_embexpr_beg and #test_embexpr_end that scan for the token #{expr} inside various string expressions. The tests reference both embexpr and expr, raising the likelihood that "embedded expression" is indeed a sensible name for the thing.
Update for Ruby 2.x
Since this post was originally written, the documentation for the standard library's Ripper class has been updated to formally identify the token. The usage section provides "Hello, #{world}!" as an example, and says in part:
Within our :string_literal you’ll notice two @tstring_content, this is the literal part for Hello, and !. Between the two @tstring_content statements is a :string_embexpr, where embexpr is an embedded expression.
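The names quoted above can be checked directly with Ripper. The following is a small sketch using Ripper.sexp and Ripper.lex from the standard library; the constant names SOURCE, PARSER_EVENTS, and SCANNER_TOKENS are made up for this example, and the exact shape of the returned trees may vary between Ruby versions.

```ruby
require "ripper"

# Single quotes keep the #{...} literal, so Ripper sees it as source text.
SOURCE = '"Hello, #{world}!"'

# Parser side: collect every symbol that appears in the S-expression.
PARSER_EVENTS = Ripper.sexp(SOURCE).flatten.select { |node| node.is_a?(Symbol) }

# Scanner side: keep only the token type and its text.
SCANNER_TOKENS = Ripper.lex(SOURCE).map { |_position, type, token| [type, token] }

p PARSER_EVENTS.include?(:string_embexpr)
p SCANNER_TOKENS.select { |type, _token| type.to_s.include?("embexpr") }
```

The parser reports the whole construct as :string_embexpr, while the scanner reports the opening and closing delimiters as separate on_embexpr_beg and on_embexpr_end tokens.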
This blog post suggests that it is called an 'idiom':
http://kconrails.com/2010/12/08/ruby-string-interpolation/
The Wikipedia Article doesn't seem to contradict that:
http://en.wikipedia.org/wiki/Programming_idiom
#{} is called a placeholder and is used to reference variables within a string.
puts "My name is #{my_name}"

What Ruby parser would you suggest to parse Ruby sources?

A parser I'm looking for should:
be Ruby parsing friendly,
be elegant by rule design,
produce user friendly parsing errors,
have user documentation that goes beyond a calculator example,
UPD: allow optional whitespace to be omitted when writing a grammar.
Fast parsing is not an important feature.
I tried Citrus but the lack of documentation and need to specify every space in rules just turned me away from it.
Treetop
Ragel
Or in case you want to parse Ruby itself:
parse_tree and ruby_parser
Edit:
I just saw your last comment about needing a subset of Ruby for your project, in that case I'd also recommend having a look at tinyrb.

What are the pros and cons of Ruby's general delimited input? (percent syntax)

I don't understand why some people use the percentage syntax a lot in ruby.
For instance, I'm reading through the ruby plugin guide and it uses code such as:
%w{ models controllers }.each do |dir|
  path = File.join(File.dirname(__FILE__), 'app', dir)
  $LOAD_PATH << path
  ActiveSupport::Dependencies.load_paths << path
  ActiveSupport::Dependencies.load_once_paths.delete(path)
end
Every time I see something like this, I have to go and look up the percentage syntax reference because I don't remember what %w means.
Is that syntax really preferable to ["models", "controllers"].each ...?
I think in this latter case it's more clear that I've defined an array of strings, but in the former - especially to someone learning ruby - it doesn't seem as clear, at least for me.
If someone can tell me that I'm missing some key point here then please do, as I'm having a hard time understanding why the percent syntax appears to be preferred by the vast majority of ruby programmers.
One good use for general delimited input (as %w, %r, etc. are called) is to avoid having to escape delimiters. This makes it especially good for literals with embedded delimiters. Contrast the regular expression
/^\/home\/[^\/]+\/.myprogram\/config$/
with
%r|^/home/[^/]+/.myprogram/config$|
or the string
"I thought John's dog was called \"Spot,\" not \"Fido.\""
with
%Q{I thought John's dog was called "Spot," not "Fido."}
As you read more Ruby, the meaning of general delimited input (%w, %r, &c.), as well as Ruby's other peculiarities and idioms, will become plain.
I believe that it is no accident that Ruby often has several ways to do the same thing. Ruby, like Perl, appears to be a postmodern language: minimalism is not a core value, but merely one of many competing design forces.
The %w syntax shaves 3 characters off each item in the list... can't beat that!
It's easy to remember: %w{} is for "words", %r{} for regexps, %q{} for "quotes", and so on... It's pretty easy once you build such memory aids.
As the size of the array grows, the %w syntax saves more and more keystrokes by not making you type in all the quotes and commas. At least that's the reason given in Learning Ruby.
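To tie the mnemonics above to concrete values, here is a short sketch of the common percent literals next to what they evaluate to. The constant names are hypothetical, and the regular expression is a variant of the one above with the dot escaped.

```ruby
# Constant names are illustrative only.
WORDS   = %w[models controllers]   # array of strings
SYMBOLS = %i[models controllers]   # array of symbols
PATH_RE = %r{^/home/[^/]+/\.myprogram/config$}  # no escaping of "/" needed
QUOTED  = %Q{I thought John's dog was called "Spot," not "Fido."}
COUNTED = %Q{#{WORDS.size} directories}         # %Q interpolates
RAW     = %q{No #{interpolation} here}          # %q does not interpolate

p WORDS == ["models", "controllers"]                 # true
p SYMBOLS == [:models, :controllers]                 # true
p PATH_RE.match?("/home/alice/.myprogram/config")    # true
p COUNTED                                            # "2 directories"
puts QUOTED
puts RAW
```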

Ruby Parser

I want to know whether it is possible to parse the Ruby language using just a deterministic parser with no backtracking at all.
Instead of actually having to write a parser, you can always leverage the existing interpreter to do what you want.
For example, ruby2ruby: http://seattlerb.rubyforge.org/ruby2ruby/
I don't know any specific details about parsing Ruby, or why you insist on "no backtracking". My guess is that you believe the Ruby grammar isn't LALR(1), e.g., isn't processable by YACC or equivalents.
Regardless, if the problem is to parse a language whose grammar is context-free, one can do this using a GLR parser, which does not backtrack:
http://en.wikipedia.org/wiki/GLR_parser
I've used this to build production parsers for many real languages.

Resources