Optimizing basic method memoization with early return - ruby

When implementing basic memoization in Ruby, is there a pattern or simple way to return the memoized instance var if the value predicates on a more complex evaluation before hand?
Say the assignment of something requires an intense calculation, is Ruby smart enough to return the instance variable if it's present, or will something always be assigned within the scope of that method before setting #some_value?
def some_value
#some_value if #some_value.present? # possible?
something = something_else.try(:method_name) || another_something.method_name # prevent this from evaluating after execution
#some_value ||= MyClass.new(property: something.property)
end
What would a better memoization pattern be to implement what I have?

Based on how your code is currently written, the "intense calculation" will always occur. Ruby uses implicit return unless you explicitly use the keyword return, so, even if #some_value is present, the code will still execute to the last line.
def some_value
return #some_value if #some_value.present? # possible?
something = something_else.try(:method_name) || another_something.method_name # prevent this from evaluating after execution
#some_value ||= MyClass.new(property: something.property)
end
So, if you want to return #some_value if it is present, and not run any code afterwards, you will want to use explicit return. See above.
Now, Ruby will check if #some_value is present, and if that is true, the value is returned, otherwise, it will continue with the calculation.

Related

Does 'any?' break from the loop when a match is found? [duplicate]

This question already has answers here:
Do all? and any? guarantee short-circuit evaluation?
(3 answers)
Closed 4 years ago.
Does any? break from the loop when a match is found?
The following is the any? source code, but I don't understand it.
static VALUE
enum_any(VALUE obj)
{
VALUE result = Qfalse;
rb_block_call(obj, id_each, 0, 0, ENUMFUNC(any), (VALUE)&result);
return result;
}
Yes, it does break the loop. One does not need to dig into c code to check that:
[1,2,3].any? { |e| puts "Checking #{e}"; e == 2 }
# Checking 1
# Checking 2
#⇒ true
The term is "short-circuiting" and yes, any? does that. After it finds a match, it doesn't look any further.
Does any? break from the loop when a match is found?
The documentation is unclear about that:
The method returns true if the block ever returns a value other than false or nil.
Note: it does not say "when the block ever returns a value other than false or nil" or "as soon as the block ever returns a value other than false or nil".
This can be interpreted either way, or it can be interpreted as making no guarantees at all. If you go by this documentation, then you can neither guarantee that it will short-ciruit, nor can you guarantee that it won't short-circuit.
Generally speaking, this is typical for API specifications: make the minimum amount of guarantees, giving the API implementor maximum freedom in how to implement the API.
There is somewhere else we can look: the ISO Ruby Programming Language Specification (bold emphasis mine):
15.3.2.2.2 Enumerable#any?
any?(&block)
Visibility: public
Behavior:
a) Invoke the method each on the receiver
b) For each element X which each yields
If block is given, call block with X as the argument.
If this call results in a trueish object, return true
As you can see, again it only says "if", but not "when" or "as soon as". This sentence can be interpreted in two ways: "Return true as the result of the method" (no indication of how often the block gets called, only that the method will return true at the end) or "return true when you encounter an invocation of the block that evaluates to a trueish value".
Try #3: The Ruby Spec:
it "stops iterating once tähe return value is determined" do
So, yes, we can indeed rely on the fact that the block is only evaluated until the first truthy value is encountered.
The following is the any? source code, but I don't understand it.
Note: by looking at the source code, you can not determine how something behaves in Ruby. You can only determine how something behaves in that specific version of that specific implementation of Ruby. Different implementations may behave differently (for example, in YARV, Ruby threads cannot run at the same time, in JRuby, they can). Even different versions of the same implementation can behave differently.
It is usually not a good idea to make assumptions about the behavior of a programming language by just looking at a single version of a single implementation.
However, if you really want to look at some implementation, and are fully aware about the limitations of this approach, then I would suggest to look at Rubinius, Topaz, Opal, IronRuby, or JRuby. They are (in my opinion) better organized and easier to read than YARV.
For example, this is the code for Enumerable#any? in Rubinius:
def any?
if block_given?
each { |*element| return true if yield(*element) }
else
each { return true if Rubinius.single_block_arg }
end
false
end
This looks rather clear and readable, doesn't it?
This is the definition in Topaz:
def any?(&block)
if block
self.each { |*e| return true if yield(*e) }
else
self.each_entry { |e| return true if e }
end
false
end
This also looks fairly readable.
The soure in Opal is a little bit more complex, but only marginally so:
def any?(pattern = undefined, &block)
if `pattern !== undefined`
each do |*value|
comparable = `comparableForPattern(value)`
return true if pattern.public_send(:===, *comparable)
end
elsif block_given?
each do |*value|
if yield(*value)
return true
end
end
else
each do |*value|
if Opal.destructure(value)
return true
end
end
end
false
end
[Note the interesting use of overriding the ` method for injecting literal ECMAScript into the compiled code.]
Most of the added complexity compared to the Rubinius and Topaz versions stems from the fact that Opal already supports the third overload of any? taking a pattern which was introduced in Ruby 2.5, whereas Rubinius and Topaz only support the two overloads with a block and without any arguments at all.
IronRuby's implementation implements the short-circuiting like this:
if (predicate.Yield(item, out blockResult)) {
result = blockResult;
return selfBlock.PropagateFlow(predicate, blockResult);
}
JRuby's implementation is a little bit more involved still, but you can see that as soon as it encounters a truthy block value, it breaks out of the loop by throwing a SPECIAL_JUMP exception and catching it to return true.
Yes and it's easy to prove:
irb(main):009:0> %w{ant bear cat}.any? {|word| puts "hello"; word.length >= 4}
hello
hello
=> true
It has printed only twice. If it did not break it would print 3 times.

In Ruby, is an if/elsif/else statement's subordinate block the same as a 'block' that is passed as a parameter?

I was doing some reading on if/elsif/else in Ruby, and I ran into some differences in terminology when describing how control expressions work.
In the Ruby Programming Wikibooks (emphasis added):
A conditional Branch takes the result of a test expression and executes a block of code depending whether the test expression is true or false.
and
An if expression, for example, not only determines whether a subordinate block of code will execute, but also results in a value itself.
Ruby-doc.org, however, does not mention blocks at all in the definitions:
The simplest if expression has two parts, a “test” expression and a “then” expression. If the “test” expression evaluates to a true then the “then” expression is evaluated.
Typically, when I have read about 'blocks' in Ruby, it has almost always been within the context of procs and lambdas. For example, rubylearning.com defines a block:
A Ruby block is a way of grouping statements, and may appear only in the source adjacent to a method call; the block is written starting on the same line as the method call's last parameter (or the closing parenthesis of the parameter list).
The questions:
When talking about blocks of code in Ruby, are we talking about
the group of code that gets passed in to a method or are we simply
talking about a group of code in general?
Is there a way to easily differentiate between the two (and is there
a technical difference between the two)?
Context for these questions: I am wondering if referring to the code inside of conditionals as blocks will be confusing to to new Ruby programmers when they are later introduced to blocks, procs, and lambdas.
TL;DR if...end is an expression, not a block
The proper use of the term block in Ruby is the code passed to a method in between do...end or curly braces {...}. A block can be and often is implicitly converted into a Proc within a method by using the &block syntax in the method signature. This new Proc is an object with its own methods that can be passed to other methods, stored in variables and data structures, called repeatedly, etc...
def block_to_proc(&block)
prc = block
puts prc
prc.class
end
block_to_proc { 'inside the block' }
# "#<Proc:0x007fa626845a98#(irb):21>"
# => Proc
In the code above, a Proc is being implicitly created with the block as its body and assigned to the variable block. Likewise, a Proc (or a lambda, a type of Proc) can be "expanded" into blocks and passed to methods that are expecting them, by using the &block syntax at the end of an arguments list.
def proc_to_block
result = yield # only the return value of the block can be saved, not the block itself
puts result
result.class
end
block = Proc.new { 'inside the Proc' }
proc_to_block(&block)
# "inside the Proc"
# => String
Although there's somewhat of a two-way street between blocks and Procs, they're not the same. Notice that to define a Proc we had to pass a block to Proc.new. Strictly speaking a block is just a chunk of code passed to a method whose execution is deferred until explicitly called. A Proc is defined with a block, its execution is also deferred until called, but it is a bonafide object just like any other. A block cannot survive on its own, a Proc can.
On the other hand, block or block of code is sometimes casually used to refer to any discreet chunk of code enclosed by Ruby keywords terminating with end: if...else...end, begin...rescue...end, def...end, class...end, module...end, until...end. But these are not really blocks, per se, and only really resemble them on the surface. Often they also have deferred execution until some condition is met. But they can stand entirely on their own, and always have return values. Ruby-doc.org's use of "expression" is more accurate.
From wikipedia
An expression in a programming language is a combination of one or
more explicit values, constants, variables, operators, and functions
that the programming language interprets (according to its particular
rules of precedence and of association) and computes to produce ("to
return", in a stateful environment) another value.
This is why you can do things like this
return_value = if 'expression'
true
end
return_value # => true
Try doing that with a block
return_value = do
true
end
# SyntaxError: (irb):24: syntax error, unexpected keyword_do_block
# return_value = do
# ^
A block is not an expression on its own. It needs either yield or a conversion to a Proc to survive. What happens when we pass a block to a method that doesn't want one?
puts("indifferent") { "to blocks" }
# "indifferent"
# => nil
The block is totally lost, it disappears with no return value, no execution, as if it never existed. It needs yield to complete the expression and produce a return value.
class Object
def puts(*args)
super
yield if block_given?
end
end
puts("mindful") { "of blocks" }
# "mindful"
# => "of blocks"

doubts regarding "||=" OR EQUALS operator in ruby [duplicate]

This question already has answers here:
What does ||= (or-equals) mean in Ruby?
(23 answers)
Closed 8 years ago.
I have some doubts regarding OR EQUALS (||=) operator in ruby. How does ruby interpreter implement it? Here is a sample of code:
class C
def arr
#num ||= []
end
end
When we use OR EQUALS operator in this circumstances, the first call to this method initializes the variable and adds an element, that's fine. When a second call is made to arr, how does it know that array has one element in it..
In Ruby, there are two values that are considered logical false. The first is the boolean value false, the other is nil. Anything which is non-nil and not explicitly false is true. The first time though the method, #num is nil, which is treated as false and the logical or portion of ||= needs to be evaluated and ends up assigning the empty array to #num. Since that's now non-nil, it equates to true. Since true || x is true no matter what x is, in future invocations Ruby short circuits the evaluation and doesn't do the assignment.
In general terms x ||= y is equivalent to x = x || y, it's just shorthand. It's implemented as the expanded form, same as &&=, += or -=.
Most programming languages, Ruby included, will stop executing a logical comparison statement like || on the first true value it encounters and return that. Likewise, it will halt on the first false value when using &&.
In general terms:
false || foo()
This will return false and not evaluate foo().
The pattern is best described as a "lazy initializer", that is the variable is defined only once, but only when it's actually used. This is in contrast to an "eager initializer" that will do it as early as possible, like inside the initialize method.
You'll see other versions of this pattern, like:
def arr
#num ||= begin
stuff = [ ]
# ...
stuff
end
end
This handles cases where the initial value is not as trivial and may need some work to produce. Once again, it's only actually generated when the method is called for the first time.
How does Ruby know on the second pass to not initialize it again? Simple, by that point #num is already defined as something.
As a note, if you're using values like false that would evaluate as non-true, then ||= will trigger every time. The only two logically false values in Ruby are nil and false, so most of the time this isn't an issue.
You'll have to do this the long-form way if you need to handle false as well:
def arr
return #num unless num.nil?
#num = false
end
There was talk of adding an ||=-like operator that would only trigger on nil but I don't think that has been added to Ruby yet.

Best way to prevent returning last evaluated expression

Suppose I want to write a method in ruby whose last line is a method call but I do not want to return its return value. Is there a more elegant way to accomplish this other than adding a nil after the call?
def f(param)
# some extra logic with param
g(param) # I don't want to return the return of g
end
If you want to make it "poke you in the eye" explicit, just say "this method doesn't return anything":
def f(param)
# some extra logic with param
g(param) # I don't want to return the return of g
return
end
f(x) will still evaluate to nil but a bare return is an unambiguous way to say "this method doesn't return anything of interest", a trailing nil means that "this method explicitly returns nil" and that's not quite the same as not returning anything of use.
No, but if it is important that f indeed returns nil, and not whatever g(param) returns, then nothing is more elegant than spelling that out with a nil on the last line. Why would you want to obfuscate this away? Most of the time, elegance is in the explicit and the obvious.
A few tenants from The Zen of Python come to mind:
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
No. If you want to return nil, the last expression has to evaluate to nil. You can do this with a terminating nil line or by surrounding the method body in nil.tap {} or however else you like, but it's pretty straightforward — the last expression evaluated gets returned.
As the others have said, no. However, if you want to avoid adding another line, you have a couple of options:
g(param); nil
g(param) && nil
The first will always cause f to return nil; the second will return false (if g returns false) or nil (if g returns a truthy value).
No, there is no other way than to either explicitly return nil or evaluate some other expression which implicitly evaluates to nil (e.g. ()).
If you want to add some kind of semantic marker that shows that you explicitly want to ignore the return value, you could invent some convention for that, e.g.:
def f(param)
# some extra logic with param
g(param) # I don't want to return the return of g
()
end
or
def f(param)
# some extra logic with param
g(param) # I don't want to return the return of g
_=_
end
which will make those cases easily grepable but probably won't aid much in understanding.
This is a design choice of Ruby which it shares with many other expression-based languages: the value of a block/subroutine/procedure/function/method is the value of the last expression evaluated inside the block. That's how it works in Lisp, for example.
Note that there are other choices as well. E.g. in Smalltalk, the return value of a method must be explicitly returned using the ↑ operator, otherwise the return value is nil. In E, which is heavily focused on security, this is even a conscious design choice: automatically returning the value of the last expression is considered a potential information leak.

ruby, define []= operator, why can't control return value?

Trying to do something weird that might turn into something more useful, I tried to define my own []= operator on a custom class, which you can do, and have it return something different than the value argument, which apparently you can't do. []= operator's return value is always value; even when you override this operator, you don't get to control the return value.
class Weird
def []=(key, value)
puts "#{key}:#{value}"
return 42
end
end
x = Weird.new
x[:a] = "a"
output "a:a"
return value => "a" # why not 42?
Does anyone have an explanation for this? Any way around it?
ruby MRI 1.8.7. Is this the same in all rubys; Is it part of the language?
Note that this behavior also applies to all assignment expressions (i.e. also attribute assignment methods: def a=(value); 42; end).
My guess is that it is designed this way to make it easy to accurately understand assignment expressions used as parts of other expressions.
For example, it is reasonable to expect x = y.a = z[4] = 2 to:
call z.[]=(4,2), then
call y.a=(2), then
assign 2 to the local variable x, then finally
yield the value 2 to any “surrounding” (or lower precedence) expression.
This follows the principle of least surprise; it would be rather surprising if, instead, it ended up being equivalent to x = y.a=(z.[]=(4,2)) (with the final value being influenced by both method calls).
While not exactly authoritative, here is what Programming Ruby has to say:
Programming Ruby (1.8), in the Expressions section:
An assignment statement sets the variable or attribute on its left side (the lvalue) to refer to the value on the right (the rvalue). It then returns that value as the result of the assignment expression.
Programming Ruby 1.9 (3rd ed) in section 22.6 Expressions, Conditionals, and Loops:
(right after describing []= method calls)
The value of an assignment expression is its rvalue. This is true even if the assignment is to an attribute method that returns something different.
It’s an assignment statement, and those always evaluate to the assigned value. Making this different would be weird.
I suppose you could use x.[]= :a, "a" to capture the return value.

Resources