How do enumerators created with a code block actually run? - ruby

It's just a simple question: how is the y.<< method able to halt the code block mid-execution?
I expected the code block to run only once and never halt in the middle.
e = Enumerator.new do |y|
  puts "Ruby"
  y << 1
  y << 2
  puts "Ruby"
  y << 3
end
puts e.each.next
puts e.each.next
puts e.each.next
e.rewind
puts e.each.next
puts e.each.next
puts e.each.next

Almost all Ruby implementations are Free Software and Open Source, so you can just look at the source code to see how it is implemented.
In Rubinius, the most interesting part is Enumerator::Iterator#reset, implemented in core/enumerator.rb:
@fiber = Fiber.new stack_size: STACK_SIZE do
  obj = @object
  @result = obj.each { |*val| Fiber.yield *val }
  @done = true
end
end
and Enumerator::Iterator#next:
val = @fiber.resume
TruffleRuby's implementation is very similar, as you can see in src/main/ruby/truffleruby/core/enumerator.rb:
class FiberGenerator
  # irrelevant methods omitted
  def next
    reset unless @fiber
    val = @fiber.resume
    raise StopIteration, 'iteration has ended' if @done
    val
  end
  def reset
    @done = false
    @fiber = Fiber.new do
      obj = @object
      @result = obj.each do |*val|
        Fiber.yield(*val)
      end
      @done = true
    end
  end
end
JRuby is also very similar, as you can see in core/src/main/ruby/jruby/kernel/enumerator.rb:
class FiberGenerator
  # irrelevant methods omitted
  def next
    reset unless @fiber&.__alive__
    val = @fiber.resume
    raise StopIteration, 'iteration has ended' if @state.done
    val
  end
  def reset
    @state.done = false
    @state.result = nil
    @fiber = Fiber.new(&@state)
  end
end
MRuby's implementation is very similar, as you can see in mrbgems/mruby-enumerator/mrblib/enumerator.rb.
YARV also uses Fibers, as can be seen in enumerator.c, for example here:
static void
next_init(VALUE obj, struct enumerator *e)
{
    VALUE curr = rb_fiber_current();
    e->dst = curr;
    e->fib = rb_fiber_new(next_i, obj);
    e->lookahead = Qundef;
}
static VALUE
get_next_values(VALUE obj, struct enumerator *e)
{
    VALUE curr, vs;
    if (e->stop_exc)
        rb_exc_raise(e->stop_exc);
    curr = rb_fiber_current();
    if (!e->fib || !rb_fiber_alive_p(e->fib)) {
        next_init(obj, e);
    }
    vs = rb_fiber_resume(e->fib, 1, &curr);
    if (e->stop_exc) {
        e->fib = 0;
        e->dst = Qnil;
        e->lookahead = Qundef;
        e->feedvalue = Qundef;
        rb_exc_raise(e->stop_exc);
    }
    return vs;
}
So, not surprisingly, Enumerator is implemented using Fibers in many Ruby implementations. Fiber is essentially just Ruby's name for semi-coroutines, and of course, coroutines are a popular way of implementing generators and iterators. E.g. CPython and CoreCLR also implement generators using coroutines.
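To make the mechanism concrete, here is a minimal sketch of an external iterator built on a Fiber. The names TinyEnumerator and its nested Yielder are hypothetical and exist only for this illustration; this is not how any of the implementations above structure their code. It only shows why `y << value` can suspend the block: `<<` calls Fiber.yield.

```ruby
# Hypothetical class for illustration only; not taken from any Ruby implementation.
class TinyEnumerator
  # Stand-in for Enumerator::Yielder: `<<` hands a value back to the caller.
  class Yielder
    def <<(value)
      Fiber.yield(value) # suspends the generator block right here
      self               # allows chaining: y << 1 << 2
    end
  end

  def initialize(&block)
    @block = block
    rewind
  end

  def next
    raise StopIteration, 'iteration has ended' if @done
    value = @fiber.resume # runs the block until the next Fiber.yield
    raise StopIteration, 'iteration has ended' if @done
    value
  end

  def rewind
    @done = false
    # The fiber body does not run until the first resume.
    @fiber = Fiber.new do
      @block.call(Yielder.new)
      @done = true
      nil
    end
    self
  end
end

log = []
e = TinyEnumerator.new do |y|
  log << 'start'
  y << 1
  y << 2
  log << 'middle'
  y << 3
end

first  = e.next # runs the block up to `y << 1`
second = e.next # resumes after `y << 1`, stops at `y << 2`
third  = e.next # records 'middle', stops at `y << 3`
```

Each call to next resumes the fiber exactly where the previous Fiber.yield left it, which is why the block appears to halt in the middle.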
One exception to this seems to be Opal. My assumption was that Opal would use ECMAScript Generators to implement Ruby Enumerators, but it does not look like that is the case. The implementation of Ruby Enumerators in Opal is found in opal/corelib/enumerator.rb, opal/corelib/enumerator/generator.rb, and opal/corelib/enumerator/yielder.rb with some help from opal/corelib/runtime.js, but unfortunately, I don't fully understand it. It does not appear to use either Ruby Fibers or ECMAScript Generators, though.
By the way, your usage of Enumerators is somewhat strange: you call Enumerator#each six times without a block, but calling Enumerator#each without a block just returns the Enumerator itself:
each → enum
Iterates over the block according to how this Enumerator was constructed. If no block and no arguments are given, returns self.
So, in other words, all those calls to Enumerator#each are just no-ops. It would make much more sense to just call Enumerator#next directly:
puts e.next
puts e.next
puts e.next
e.rewind
puts e.next
puts e.next
puts e.next
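As a quick check of the point about each, the following sketch (variable names are mine) verifies that each without a block returns the very same Enumerator, and that rewind restarts the block from the top:

```ruby
e = Enumerator.new do |y|
  puts "Ruby"
  y << 1
  y << 2
  puts "Ruby"
  y << 3
end

# each without a block and without arguments returns the receiver itself
same_object = e.each.equal?(e)

values = [e.next, e.next, e.next]

e.rewind
after_rewind = e.next # the block starts over, so "Ruby" is printed again
```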

Related

Exponentiation not working

I'm new to programming, especially in Ruby so I've been making some basic projects. I have this code and as far as I know, it should work, but it gives results that I don't expect
The program takes a and b and returns a^b. I did this as a programming exercise, which is why I didn't just use a**b.
class Exponate
  attr_accessor :args
  def initialize args = {}
    @args = args
    @ans = nil
  end
  def index
    @args[:b].times {
      @ans = @args[:a] * @args[:a]
    }
    puts @ans
  end
end
e = Exponate.new(:a => 32, :b => 6)
e.index
e.args[:a] = 5
e.index
Returns
1024 # Should be 1_073_741_824
25 # Should be 15_625
But they are definitely not that
You can write like this:
class Exponate
  attr_accessor :args, :ans
  def initialize args = {}
    @args = args
  end
  def index
    @ans = 1 # multiplication will start from 1
    @args[:b].times {
      @ans *= @args[:a] # same as @ans = @ans * @args[:a]
    }
    puts @ans
  end
end
@ans = @args[:a] * @args[:a] returns the same value no matter how many times it is called; you need to reference the accumulator variable inside the loop to make use of the iteration.
Using an instance variable where a local would do does not seem right: instance variables live longer, so after the method exits they cannot be collected as long as the object itself is still referenced somewhere. The @s are also more error-prone: if you make a typo (for example, @asn instead of @ans), you get nil instead of a NameError, which can be harder to debug. So it is better to write it this way:
def index
ans = 1
args[:b].times {
ans *= args[:a]
}
puts ans
end
For loops with an accumulator in Ruby, it's better to use Enumerable#inject:
@ans = @args[:b].times.inject(1){|acc,v| acc * @args[:a]}
This way you are less likely to forget the initialisation.
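Pulled out of the class, the inject version can be sketched as a standalone method. The name power is hypothetical, and the sketch assumes the exponent is a non-negative Integer:

```ruby
# Hypothetical helper; assumes exponent is a non-negative Integer.
def power(base, exponent)
  # Start the accumulator at 1 and multiply by base once per iteration.
  exponent.times.inject(1) { |acc, _| acc * base }
end

big   = power(32, 6)
small = power(5, 6)
```

Because the initial value 1 is passed to inject, an exponent of 0 falls out naturally and returns 1.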

Ruby: How to chain methods specified in an array (or split string) of methods?

How is it possible to chain methods in Ruby when the method calls are specified as an array?
Example:
class String
def bipp(); self.to_s + "-bippity"; end
def bopp(); self.to_s + "-boppity"; end
def drop(); self.to_s + "-dropity"; end
end
## this produces the desired output
##
puts 'hello'.bipp.bopp.drop #=> hello-bippity-boppity-dropity
## how do we produce the same desired output here?
##
methods = "bipp|bopp|drop".split("|")
puts 'world'.send( __what_goes_here??__ ) #=> world-bippity-boppity-dropity
[Note to Ruby purists: stylistic liberties were taken with this example. For notes on preferred usage regarding semicolons, parenthesis, comments and symbols, please feel free to consult Ruby style guides (e.g., https://github.com/styleguide/ruby)]
Try this:
methods = "bipp|bopp|drop".split("|")
result = 'world'
methods.each {|meth| result = result.send(meth) }
puts result
or, using inject:
methods = "bipp|bopp|drop".split("|")
result = methods.inject('world') do |result, method|
result.send method
end
or, more briefly:
methods = "bipp|bopp|drop".split("|")
result = methods.inject('world', &:send)
By the way - Ruby doesn't need semicolons ; at the end of each line!
methods = "bipp|bopp|drop".split("|")
result = 'world'
methods.each {|meth| result = result.method(meth).call }
puts result #=> world-bippity-boppity-dropity
or
methods = "bipp|bopp|drop".split("|")
methods.each_with_object('world') {|meth,result| result.replace(result.method(meth).call)} #=> world-bippity-boppity-dropity

Is there a built-in lazy Hash in Ruby?

I need to populate a Hash with various values. Some of the values are accessed often and others only seldom.
The issue is that I use some computation to get the values, and populating the Hash becomes really slow with multiple keys.
Using some sort of cache is not an option in my case.
How can I make the Hash compute a value only when its key is first accessed, not when it is added?
This way, seldom-used values won't slow down the filling process.
I'm looking for something that is "kinda async" or lazy access.
There are many different ways to approach this. I recommend using an instance of a class that you define instead of a Hash. For example, instead of...
# Example of slow code using regular Hash.
h = Hash.new
h[:foo] = some_long_computation
h[:bar] = another_long_computation
# Access value.
puts h[:foo]
... make your own class and define methods, like this...
class Config
def foo
some_long_computation
end
def bar
another_long_computation
end
end
config = Config.new
puts config.foo
If you want a simple way to cache the long computations or it absolutely must be a Hash, not your own class, you can now wrap the Config instance with a Hash.
config = Config.new
h = Hash.new {|h,k| h[k] = config.send(k) }
# Access foo.
puts h[:foo]
puts h[:foo] # Not computed again. Cached from previous access.
One issue with the above example is that h.keys will not include :bar because you haven't accessed it yet. So you couldn't, for example, iterate over all the keys or entries in h because they don't exist until they're actually accessed. Another potential issue is that your keys need to be valid Ruby identifiers, so arbitrary String keys with spaces won't work when defining them on Config.
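The keys caveat is easy to see in a small self-contained sketch. Here the upcase call is only a stand-in for a slow computation, and the counter is there to show how often the block runs:

```ruby
calls = 0
h = Hash.new do |hash, key|
  calls += 1
  hash[key] = key.to_s.upcase # stand-in for a slow computation
end

keys_before = h.keys # nothing has been accessed yet
first       = h[:foo] # runs the block and stores the result
second      = h[:foo] # served from the hash, block not called again
keys_after  = h.keys
```

The default block only fires for missing keys, so the hash has no entries at all until something is read.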
If this matters to you, there are different ways to handle it. One way you can do it is to populate your hash with thunks and force the thunks when accessed.
class HashWithThunkValues < Hash
def [](key)
val = super
if val.respond_to?(:call)
# Force the thunk to get actual value.
val = val.call
# Cache the actual value so we never run long computation again.
self[key] = val
end
val
end
end
h = HashWithThunkValues.new
# Populate hash.
h[:foo] = ->{ some_long_computation }
h[:bar] = ->{ another_long_computation }
h["invalid Ruby name"] = ->{ a_third_computation } # Some key that's an invalid ruby identifier.
# Access hash.
puts h[:foo]
puts h[:foo] # Not computed again. Cached from previous access.
puts h.keys #=> [:foo, :bar, "invalid Ruby name"]
One caveat with this last example is that it won't work if your values are callable because it can't tell the difference between a thunk that needs to be forced and a value.
Again, there are ways to handle this. One way to do it would be to store a flag that marks whether a value has been evaluated. But this would require extra memory for every entry. A better way would be to define a new class to mark that a Hash value is an unevaluated thunk.
class Unevaluated < Proc
end
class HashWithThunkValues < Hash
def [](key)
val = super
# Only call if it's unevaluated.
if val.is_a?(Unevaluated)
# Force the thunk to get actual value.
val = val.call
# Cache the actual value so we never run long computation again.
self[key] = val
end
val
end
end
# Now you must populate like so.
h = HashWithThunkValues.new
h[:foo] = Unevaluated.new { some_long_computation }
h[:bar] = Unevaluated.new { another_long_computation }
h["invalid Ruby name"] = Unevaluated.new { a_third_computation } # Some key that's an invalid ruby identifier.
h[:some_proc] = Unevaluated.new { Proc.new {|x| x + 2 } }
The downside of this is that now you have to remember to use Unevaluated.new when populating your Hash. If you want all values to be lazy, you could override []= also. I don't think it would actually save much typing because you'd still need to use Proc.new, proc, lambda, or ->{} to create the block in the first place. But it might be worthwhile. If you did, it might look something like this.
class HashWithThunkValues < Hash
def []=(key, val)
super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
end
end
So here is the full code.
class HashWithThunkValues < Hash
# This can be scoped inside now since it's not used publicly.
class Unevaluated < Proc
end
def [](key)
val = super
# Only call if it's unevaluated.
if val.is_a?(Unevaluated)
# Force the thunk to get actual value.
val = val.call
# Cache the actual value so we never run long computation again.
self[key] = val
end
val
end
def []=(key, val)
super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
end
end
h = HashWithThunkValues.new
# Populate.
h[:foo] = ->{ some_long_computation }
h[:bar] = ->{ another_long_computation }
h["invalid Ruby name"] = ->{ a_third_computation } # Some key that's an invalid ruby identifier.
h[:some_proc] = ->{ Proc.new {|x| x + 2 } }
You can define your own indexer with something like this:
class MyHash
  def initialize
    @cache = {}
  end
  def [](key)
    @cache[key] || (@cache[key] = compute(key))
  end
  def []=(key, value)
    @cache[key] = value
  end
  def compute(key)
    @cache[key] = 1
  end
end
and use it as follows:
1.9.3p286 :014 > hash = MyHash.new
=> #<MyHash:0x007fa0dd03a158 @cache={}>
1.9.3p286 :019 > hash["test"]
=> 1
1.9.3p286 :020 > hash
=> #<MyHash:0x007fa0dd03a158 @cache={"test"=>1}>
you can use this:
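class LazyHash < Hash
  def [](key)
    # @self holds the procs registered with lazy_update but not yet forced.
    pending = (@self || {})[key]
    return super unless pending
    @self.delete(key)
    # Store the computed value and return it (not the proc).
    self[key] = pending.call
  end
  def lazy_update(key, &proc)
    (@self ||= {})[key] = proc
    self[key] = proc
  end
end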
Your lazy hash will behave exactly like a normal Hash, because it actually is a real Hash.
*** UPDATE - answering to nested procs question ***
Yes, it would work, but it is cumbersome.
See updated answer.
Use lazy_update instead of []= to add "lazy" values to your hash.
This isn't strictly an answer to the body of your question, but Enumerator::Lazy has been part of Ruby since 2.0. It lets you do lazy evaluation on iterator compositions:
lazy = [1, 2, 3].lazy.select(&:odd?)
# => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:select>
lazy.to_a
# => [1, 3]

Is there something like Python generators in Ruby?

I am new to Ruby, is there a way to yield values from Ruby functions? If yes, how? If not, what are my options to write lazy code?
Ruby's yield keyword is something very different from the Python keyword with the same name, so don't be confused by it. Ruby's yield keyword is syntactic sugar for calling a block associated with a method.
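A small sketch of that equivalence, with hypothetical method names: the first method uses the yield keyword, the second captures the same block explicitly and calls it. Both run the block immediately inside the method; neither suspends anything.

```ruby
# Uses the yield keyword to call the block passed to the method.
def twice_with_yield
  yield 1
  yield 2
end

# The same thing with the block captured as an explicit Proc.
def twice_with_block(&block)
  block.call(1)
  block.call(2)
end

from_yield = []
twice_with_yield { |x| from_yield << x }

from_block = []
twice_with_block { |x| from_block << x }
```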
The closest equivalent is Ruby's Enumerator class. For example, the equivalent of the Python:
def eternal_sequence():
    i = 0
    while True:
        yield i
        i += 1
is this:
def eternal_sequence
  Enumerator.new do |enum|
    i = 0
    while true
      enum.yield i # <- Notice that this is the yield method of the enumerator, not the yield keyword
      i += 1
    end
  end
end
You can also create Enumerators for existing enumeration methods with enum_for. For example, ('a'..'z').enum_for(:each_with_index) gives you an enumerator of the lowercase letters along with their place in the alphabet. You get this for free with the standard Enumerable methods like each_with_index in 1.9, so you can just write ('a'..'z').each_with_index to get the enumerator.
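Here is a short usage sketch for both ideas. It repeats the eternal_sequence definition so that it runs on its own; the variable names are mine.

```ruby
def eternal_sequence
  Enumerator.new do |enum|
    i = 0
    loop do
      enum.yield i
      i += 1
    end
  end
end

# take stops pulling values after five, so the infinite loop is never a problem
first_five = eternal_sequence.take(5)

# An Enumerator over an existing iteration method
letters    = ('a'..'z').enum_for(:each_with_index)
first_pair = letters.next

# Calling the method without a block gives the same kind of Enumerator
first_two = ('a'..'z').each_with_index.first(2)
```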
I've seen Fibers used in that way, look at an example from this article:
fib = Fiber.new do
x, y = 0, 1
loop do
Fiber.yield y
x,y = y,x+y
end
end
20.times { puts fib.resume }
If you are looking to lazily generate values, @Chuck's answer is the correct one.
If you are looking to lazily iterate over a collection, Ruby 2.0 introduced the new .lazy enumerator.
range = 1..Float::INFINITY
puts range.map { |x| x+1 }.first(10) # infinite loop
puts range.lazy.map { |x| x+1 }.first(10) # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
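Lazy enumerators can also be chained. As a sketch, each stage below only pulls as many elements from the infinite range as first(3) ends up needing:

```ruby
evens_squared = (1..Float::INFINITY).lazy
  .select(&:even?)      # keep 2, 4, 6, ...
  .map { |x| x * x }    # square each one
  .first(3)             # forces evaluation of just three results
```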
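Ruby 1.8 supported generators out of the box through the Generator class in its standard library (the library was removed in Ruby 1.9, where Enumerator takes its place):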
require 'generator'
# Generator from an Enumerable object
g = Generator.new(['A', 'B', 'C', 'Z'])
while g.next?
puts g.next
end
# Generator from a block
g = Generator.new { |g|
for i in 'A'..'C'
g.yield i
end
g.yield 'Z'
}
# The same result as above
while g.next?
puts g.next
end
https://ruby-doc.org/stdlib-1.8.7/libdoc/generator/rdoc/Generator.html
The Enumerator class and its next method behave similarly:
https://docs.ruby-lang.org/en/3.1/Enumerator.html#method-i-next
range = 1..Float::INFINITY
enumerator = range.each
puts enumerator.class # => Enumerator
puts enumerator.next # => 1
puts enumerator.next # => 2
puts enumerator.next # => 3

Conditional chaining in Ruby

Is there a good way to chain methods conditionally in Ruby?
What I want to do functionally is
if a && b && c
my_object.some_method_because_of_a.some_method_because_of_b.some_method_because_of_c
elsif a && b && !c
my_object.some_method_because_of_a.some_method_because_of_b
elsif a && !b && c
my_object.some_method_because_of_a.some_method_because_of_c
etc...
So depending on a number of conditions I want to work out what methods to call in the method chain.
So far my best attempt to do this in a "good way" is to conditionally build the string of methods, and use eval, but surely there is a better, more ruby, way?
You could put your methods into an array and then execute everything in this array
l= []
l << :method_a if a
l << :method_b if b
l << :method_c if c
l.inject(object) { |obj, method| obj.send(method) }
Object#send executes the method with the given name. Enumerable#inject iterates over the array, while giving the block the last returned value and the current array item.
If you want your method to take arguments you could also do it this way
l= []
l << [:method_a, arg_a1, arg_a2] if a
l << [:method_b, arg_b1] if b
l << [:method_c, arg_c1, arg_c2, arg_c3] if c
l.inject(object) { |obj, method_and_args| obj.send(*method_and_args) }
You can use tap:
my_object.tap{|o|o.method_a if a}.tap{|o|o.method_b if b}.tap{|o|o.method_c if c}
Sample class to demonstrate chaining methods that return a copied instance without modifying the caller.
This might be a lib required by your app.
class Foo
  attr_accessor :field
  def initialize
    @field = []
  end
  def dup
    # Note: objects in @field aren't dup'ed!
    super.tap{|e| e.field=e.field.dup }
  end
  def a
    dup.tap{|e| e.field << :a }
  end
  def b
    dup.tap{|e| e.field << :b }
  end
  def c
    dup.tap{|e| e.field << :c }
  end
end
monkeypatch: this is what you want to add to your app to enable conditional chaining
class Object
# passes self to block and returns result of block.
# More cumbersome to call than #chain_if, but useful if you want to put
# complex conditions in the block, or call a different method when your cond is false.
def chain_block(&block)
yield self
end
# passes self to block
# bool:
# if false, returns caller without executing block.
# if true, return result of block.
# Useful if your condition is simple, and you want to merely pass along the previous caller in the chain if false.
def chain_if(bool, &block)
bool ? yield(self) : self
end
end
Sample usage
# sample usage: chain_block
>> f = Foo.new
>> cond_a, cond_b, cond_c = true, false, true
>> f.chain_block{|e| cond_a ? e.a : e }.chain_block{|e| cond_b ? e.b : e }.chain_block{|e| cond_c ? e.c : e }
=> #<Foo:0x007fe71027ab60 @field=[:a, :c]>
# sample usage: chain_if
>> cond_a, cond_b, cond_c = false, true, false
>> f.chain_if(cond_a, &:a).chain_if(cond_b, &:b).chain_if(cond_c, &:c)
=> #<Foo:0x007fe7106a7e90 @field=[:b]>
# The chain_if call can also allow args
>> obj.chain_if(cond) {|e| e.argified_method(args) }
Although the inject method is perfectly valid, that kind of Enumerable use does confuse people and suffers from the limitation of not being able to pass arbitrary parameters.
A pattern like this may be better for this application:
object = my_object
if (a)
object = object.method_a(:arg_a)
end
if (b)
object = object.method_b
end
if (c)
object = object.method_c('arg_c1', 'arg_c2')
end
I've found this to be useful when using named scopes. For instance:
scope = Person
if (params[:filter_by_age])
scope = scope.in_age_group(params[:filter_by_age])
end
if (params[:country])
scope = scope.in_country(params[:country])
end
# Usually a will_paginate-type call is made here, too
@people = scope.all
Use #yield_self or, since Ruby 2.6, #then!
my_object.
then{ |o| a ? o.some_method_because_of_a : o }.
then{ |o| b ? o.some_method_because_of_b : o }.
then{ |o| c ? o.some_method_because_of_c : o }
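A runnable sketch of the same pattern with plain strings. The method name build_greeting and its keyword arguments are made up for the example, and it assumes Ruby 2.6 or later for then:

```ruby
# Hypothetical example method; each step either transforms the value or passes it through.
def build_greeting(name, shout: false, polite: false)
  name
    .then { |s| polite ? "Dear #{s}" : s }
    .then { |s| shout ? s.upcase : s }
end

plain  = build_greeting('bob')
polite = build_greeting('bob', polite: true)
both   = build_greeting('bob', polite: true, shout: true)
```

Unlike tap, then returns the block's result, so each step must return the unchanged object explicitly when its condition is false.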
Here's a more functional programming way.
Use break in order to get tap() to return the block's result. (Object#tap is part of core Ruby, not only Rails.)
'hey'.tap{ |x| x + " what's" if true }
.tap{ |x| x + "noooooo" if false }
.tap{ |x| x + ' up' if true }
# => "hey"
'hey'.tap{ |x| break x + " what's" if true }
.tap{ |x| break x + "noooooo" if false }
.tap{ |x| break x + ' up' if true }
# => "hey what's up"
Maybe your situation is more complicated than this, but why not:
my_object.method_a if a
my_object.method_b if b
my_object.method_c if c
I use this pattern:
class A
def some_method_because_of_a
...
return self
end
def some_method_because_of_b
...
return self
end
end
a = A.new
a.some_method_because_of_a().some_method_because_of_b()
If you're using Rails, you can use #try. Instead of
foo ? (foo.bar ? foo.bar.baz : nil) : nil
write:
foo.try(:bar).try(:baz)
or, with arguments:
foo.try(:bar, arg: 3).try(:baz)
Not defined in vanilla ruby, but it isn't a lot of code.
What I wouldn't give for CoffeeScript's ?. operator.
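For what it's worth, Ruby 2.3 and later have a safe navigation operator, &., that covers the nil case without Rails. A minimal sketch, using made-up Struct classes for the example:

```ruby
# Hypothetical structs for illustration.
Owner   = Struct.new(:name)
Account = Struct.new(:owner)

with_owner    = Account.new(Owner.new('Ann'))
without_owner = Account.new(nil)

name_present = with_owner&.owner&.name    # every link is non-nil
name_missing = without_owner&.owner&.name # owner is nil, so the chain stops
name_nil     = nil&.owner                 # receiver itself is nil
```

Note that &. only guards against a nil receiver. Unlike Rails' try, it still raises NoMethodError if a non-nil receiver does not respond to the method.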
I ended up writing the following:
class Object
# A naïve Either implementation.
# Allows for chainable conditions.
# (a -> Bool), Symbol, Symbol, ...Any -> Any
def either(pred, left, right, *args)
cond = case pred
when Symbol
self.send(pred)
when Proc
pred.call
else
pred
end
if cond
self.send right, *args
else
self.send left
end
end
# The up-coming identity method...
def itself
self
end
end
a = []
# => []
a.either(:empty?, :itself, :push, 1)
# => [1]
a.either(:empty?, :itself, :push, 1)
# => [1]
a.either(true, :itself, :push, 2)
# => [1, 2]
