Ruby lambda capture: a weird effect and workaround

closures = []
vals = ('a'..'z').to_a
until vals.empty?
val = vals.shift()
closures << lambda { puts val }
end
closures.each { |l| l.call() }
This Ruby code prints 'z' for every call, which is a bit surprising.
def closure(val)
lambda {puts val}
end
closures = []
vals = ('a'..'z').to_a
until vals.empty?
val = vals.shift()
closures << closure(val)
end
closures.each { |l| l.call() }
This prints 'a' to 'z', as expected.
So what I see here looks like misbehavior in how Ruby lambdas capture their variables at the moment of their creation.
Can anyone please explain this effect by citing the Ruby specification? (My Ruby is 2.2.5p319/Cygwin.)
Should this be reported as a bug in the Ruby bug tracker?
Or is it expected behavior?
Or has it already been fixed in a later version of Ruby?
Thanks in advance for your replies.
UPDATE. Here is the same code ported to Perl. Surprisingly, it works as expected:
use strict;
use warnings;
my @vals = 'a'..'z';
my @subs = ();
while (@vals) {
    my $val = shift @vals;
    push @subs, sub { print "$val\n"; };
}
$_->() for @subs;

Variables are captured by reference in Ruby, not by value (the same thing is true in Python, JavaScript, and many other languages). Furthermore, the scope of val is the function scope, not scoped to the inside of the loop, so you don't get a new variable val in each iteration of the loop -- it's the same variable val; you are just assigning another value to it in each iteration.
In each iteration of the loop, a closure is created that references the variable val -- the exact same variable val. Thus, when the closures are evaluated later, they all read the same value -- the value of the (single) variable val at that point.
When you pass it to a method and create the closure inside the method, it's different because the variable that the closure captures is the val in the body of the method closure, scoped to that method. Each time you call the method closure, you get a new variable val, whose value is the value passed in, and it is never altered thereafter (nothing in closure assigns to it). So when the value is read by the closure later, it is still the value passed in to the call of closure when the closure was created.
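Here is a minimal sketch of the point above, using only core Ruby. The method names shared_variable_closures and per_iteration_closures are made up for this illustration. The first builds its lambdas in an until loop, where every lambda shares the single variable val. The second uses a block, whose parameter is a fresh variable on every iteration.

```ruby
# All lambdas close over the same local variable `val`.
def shared_variable_closures
  closures = []
  vals = ('a'..'c').to_a
  until vals.empty?
    val = vals.shift            # reassigns the one and only `val`
    closures << lambda { val }
  end
  closures.map(&:call)
end

# Each block invocation gets its own `val`, so each lambda keeps its own value.
def per_iteration_closures
  ('a'..'c').map { |val| lambda { val } }.map(&:call)
end

p shared_variable_closures   # => ["c", "c", "c"]
p per_iteration_closures     # => ["a", "b", "c"]
```

So besides the helper-method workaround from the question, iterating with each or map instead of until should give one variable per closure.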

I believe what is happening here is in the first case
closures = []
vals = ('a'..'z').to_a
until vals.empty?
val = vals.shift()
closures << lambda { puts val }
end
closures.each { |l| l.call() }
Every time you push lambda { puts val } into closures, you are pushing a lambda that refers to the variable val rather than remembering its current value. By the time the until loop finishes, val is 'z' (adding puts val after the loop confirms this), so calling each lambda in closures runs puts val with that final value.
In the second case,
def closure(val)
lambda {puts val}
end
closures = []
vals = ('a'..'z').to_a
until vals.empty?
val = vals.shift()
closures << closure(val)
end
closures.each { |l| l.call() }
When you push closure(val) into closures, Ruby binds the current value to the method's own parameter val, so you are effectively pushing closure('a'), closure('b'), etc. Now, when you call each l in closures, it prints a to z.

Related

Advantages of using block, proc, lambda in Ruby

Example: LinkedList printing method.
For this object, you will find a printing method using block, proc, and lambda.
It is not clear to me what the advantages/disadvantages are (if any).
Thank you
What is a LinkedList?
A LinkedList is a node that has a specific value attached to it (which is sometimes called a payload), and a link to another node (or nil if there is no next item).
class LinkedListNode
attr_accessor :value, :next_node
def initialize(value, next_node = nil)
@value = value
@next_node = next_node
end
def method_print_values(list_node)
if list_node
print "#{list_node.value} --> "
method_print_values(list_node.next_node)
else
print "nil\n"
return
end
end
end
node1 = LinkedListNode.new(37)
node2 = LinkedListNode.new(99, node1)
node3 = LinkedListNode.new(12, node2)
#printing the linked list through a method defined within the scope of the class
node3.method_print_values(node3)
#---------------------------- Defining the printing method through a BLOCK
def block_print_value(list_node, &block)
if list_node
yield list_node
block_print_value(list_node.next_node, &block)
else
print "nil\n"
return
end
end
block_print_value(node3) { |list_node| print "#{list_node.value} --> " }
#---------------------------- Defining the printing method through a PROC
def proc_print_value(list_node, callback)
if list_node
callback.call(list_node) #this line invokes the print function defined below
proc_print_value(list_node.next_node, callback)
else
print "nil\n"
end
end
proc_print_value(node3, Proc.new {|list_node| print "#{list_node.value} --> "})
#---------------------------- Defining the printing method through a LAMBDA
def lambda_print_value(list_node, callback)
if list_node
callback.call(list_node) #this line invokes the print function defined below
lambda_print_value(list_node.next_node, callback)
else
print "nil\n"
end
end
lambda_print_value(node3, lambda {|list_node| print "#{list_node.value} --> "})
#---------------------------- Defining the printing method outside the class
def print_values(list_node)
if list_node
print "#{list_node.value} --> "
print_values(list_node.next_node)
else
print "nil\n"
return
end
end
print_values(node3)
These examples show different ways to do the same thing, so in this context there is no fundamental difference between them:
my_proc = Proc.new { |list_node| print "#{list_node.value} --> " }
block_print_value(node3, &my_proc)
proc_print_value(node3, my_proc)
lambda_print_value(node3, my_proc)
It is also possible to define a method with any of them (at the top level, define_method has to be sent to Object, since main itself does not respond to it):
Object.send(:define_method, :my_method, &proc { |p| puts p })
my_method 'hello' #=> hello
Object.send(:define_method, :my_method, &->(p) { puts p })
my_method 'hello' #=> hello
But Proc, lambda and block are not the same. First, it helps to look at how the & operator works. This summary from a helpful article explains it:
&object is evaluated in the following way:
if object is a block, it converts the block into a simple proc.
if object is a Proc, it converts the object into a block while preserving the lambda? status of the object.
if object is not a Proc, it first calls #to_proc on the object and then converts it into a block.
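To make the three rules above concrete, here is a small sketch. The helper name capture is made up for this example; everything else is core Ruby.

```ruby
# Hypothetical helper: turns whatever block it receives into a Proc and returns it.
def capture(&block)
  block
end

from_literal = capture { |x| x * 2 }            # literal block -> plain proc
from_lambda  = capture(&lambda { |x| x * 2 })   # lambda stays a lambda
from_symbol  = capture(&:upcase)                # Symbol#to_proc is called first

p from_literal.lambda?      # => false
p from_lambda.lambda?       # => true
p from_symbol.call("abc")   # => "ABC"
```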
But this still does not show the differences between them, so let's go to the Ruby source:
Proc objects are blocks of code that have been bound to a set of local variables. Once bound, the code may be called in different contexts and still access those variables.
And
lambda, proc and Proc.new preserve the tricks of a Proc object given by & argument.
lambda(&lambda {}).lambda? #=> true
proc(&lambda {}).lambda? #=> true
Proc.new(&lambda {}).lambda? #=> true
lambda(&proc {}).lambda? #=> false
proc(&proc {}).lambda? #=> false
Proc.new(&proc {}).lambda? #=> false
Proc created as:
VALUE block = proc_new(klass, FALSE);
rb_obj_call_init(block, argc, argv);
return block;
When lambda:
return proc_new(rb_cProc, TRUE);
Both are Procs; the only difference here is the TRUE or FALSE flag (is_lambda), which among other things controls whether the number of arguments is checked on each call.
So a lambda is like a stricter Proc:
is_proc = !proc->is_lambda;
Summary of Lambda vs Proc:
Lambdas check the number of arguments, while procs do not.
A return inside a proc exits the method in which the proc was defined.
A return inside a lambda exits only the lambda, and the enclosing method continues executing.
Lambdas are closer to a method.
Blocks: they are called closures in other languages and are a way of grouping code/statements. In Ruby, single-line blocks are written in {} and multi-line blocks are written using do..end.
A block is not an object and cannot be saved in a variable. Lambdas and procs are both objects.
So, let's do a small benchmark based on this answer:
# ruby 2.5.1
    user     system      total        real
0.016815   0.000000   0.016815 (  0.016823)
0.023170   0.000001   0.023171 (  0.023186)
0.117713   0.000000   0.117713 (  0.117775)
0.217361   0.000000   0.217361 (  0.217388)
This shows that using block.call is almost 2x slower than using yield.
Thanks, @engineersmnky, for the good references in the comments.
A Proc is an object wrapper around a block. A lambda is basically a proc with different behavior.
AFAIK plain blocks are a more sensible choice than procs when you only need to call the block.
def f
yield 123
end
Should be faster than
def g(&block)
block.call(123)
end
But a proc can be passed on further.
I guess you can find some articles with performance comparisons on the topic.
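Here is a rough sketch of what "passed on further" can look like. The method names call_with_yield, call_with_block and forward are invented for this example. It only shows behavior and makes no claim about speed.

```ruby
# Uses the block directly; no Proc object is needed here.
def call_with_yield(value)
  yield value
end

# Captures the block as a Proc and calls it explicitly.
def call_with_block(value, &block)
  block.call(value)
end

# Captures the block only in order to hand it to another method.
def forward(value, &block)
  call_with_yield(value, &block)
end

p call_with_yield(123) { |n| n + 1 }   # => 124
p call_with_block(123) { |n| n + 1 }   # => 124
p forward(123) { |n| n + 1 }           # => 124
```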
IMO, your block_print_value method is poorly designed/named, which makes it impossible to answer your question directly. From the name of the method, we would expect that the method "prints" something, but the only printing it does itself is in the terminating condition, which does a
print "nil\n"
So, while I would strongly vote against printing the list this way, that doesn't mean the whole idea of using a block for the printing problem is bad.
Since your problem looks like a programming assignment, I won't post a whole solution, but here is a hint:
Replace your block_print_value with, say, block_visit_value, which does the same as your current method but doesn't do any printing itself. Instead, the "else" part could also invoke the block to let it do the printing.
I'm sure that you will see the advantage of this method afterwards. If not, come back here for a discussion.
At a high level, procs are blocks of code that can be stored in variables, like so:
full_name = Proc.new { |first,last| first + " " + last }
I can call this in two ways: using the bracket syntax followed by the arguments I want to pass to it, or using the call method and passing the arguments inside parentheses, like so:
p full_name.call("Daniel","Cortes")
What I did in the first line above is create a new instance of Proc and assign it to a variable called full_name. Proc.new takes a code block, and the block here declares two parameters, which go inside the pipes.
I can also make it repeat my name five times:
full_name = Proc.new { |first| first * 5 }
The block I was referring to is called a closure in other programming languages. Blocks allow you to group statements together and encapsulate behavior. You can create blocks with curly braces or do...end syntax.
Why use Procs?
The answer is Procs give you more flexibility than methods. With Procs you can store an entire set of processes inside a variable and then call the variable anywhere else in your program.
Similar to Procs, Lambdas allow you to store functions inside a variable and call them from other parts of the program. So really the same code I had above can be used like so:
full_name = lambda { |first,last| first + " " + last }
p full_name["daniel","cortes"]
So what is the difference between the two?
There are two key differences in addition to syntax. Please note that the differences are subtle, even to the point that you may never even notice them while programming.
The first key difference is that Lambdas count the arguments you pass to them whereas Procs do not. For example:
full_name = lambda { |first,last| first + " " + last }
p full_name.call("Daniel","Cortes")
The code above works, however, if I pass it another argument:
p full_name.call("Daniel","Abram","Cortes")
Ruby raises an ArgumentError saying that I am passing in the wrong number of arguments.
However, with Procs it will not throw an error. It simply looks at the first two arguments and ignores anything after that.
Secondly, Lambdas and Procs have different behavior when it comes to returning values from methods, for example:
def my_method
x = lambda { return }
x.call
p "Text within method"
end
If I run this method, it prints out Text within method. However, if we try the same exact implementation with a Proc:
def my_method
x = Proc.new { return }
x.call
p "Text within method"
end
This returns nil without printing anything.
Why did this occur?
When the Proc saw the word return it exited out of the entire method and returned a nil value. However, in the case of the Lambda, it processed the remaining part of the method.
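The two differences can be checked with a short sketch like the one below. The method names return_from_lambda and return_from_proc are made up for this example, and the blocks return symbols instead of nil only so that the two paths are easy to tell apart.

```ruby
def return_from_lambda
  l = lambda { return :from_lambda }
  l.call              # returns only from the lambda
  :after_lambda       # so execution reaches this line
end

def return_from_proc
  pr = Proc.new { return :from_proc }
  pr.call             # returns from return_from_proc itself
  :after_proc         # never reached
end

p return_from_lambda   # => :after_lambda
p return_from_proc     # => :from_proc

# Argument counting: the lambda is strict, the proc is lenient.
strict  = lambda   { |first, last| "#{first} #{last}" }
lenient = Proc.new { |first, last| "#{first} #{last}" }

p lenient.call("Daniel", "Abram", "Cortes")   # => "Daniel Abram" (extra argument dropped)
begin
  strict.call("Daniel", "Abram", "Cortes")
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```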

How to use reduce/inject in Ruby without getting Undefined variable

When using an accumulator, does the accumulator exist only within the reduce block or does it exist within the function?
I have a method that looks like:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
str.split.reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
end
return true if (new_array == new_array.sort)
end
When I execute this code I get the error
"undefined variable new_array in line 11 (the return statement)"
I also tried assigning the new_array value to another variable as an else statement inside my reduce block but that gave me the same results.
Can someone explain to me why this is happening?
The problem is that new_array is created during the call to reduce, and then the reference is lost afterwards. Block parameters (and local variables first assigned inside a block) are scoped to that block. The array can be returned from reduce in your case, so you could use it there. However, you need to fix a couple of things:
str.split with no arguments splits on whitespace; it does not break a string into characters. You should use str.chars, or str.split('').
The accumulator for each new iteration of reduce is whatever the block returned on the previous one, so the block must return it each time. The simplest way to do this is to put new_array as the last expression in your block.
Thus:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
crazy_only = str.split('').reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (crazy_only == crazy_only.sort)
end
Note that your function is not very efficient, and not very idiomatic. Here's a shorter version of the function that is more idiomatic, but not much more efficient:
def my_useless_function(str)
crazy_letters = %w[a s d f g h]
crazy_only = str.chars.select{ |c| crazy_letters.include?(c) }
crazy_only == crazy_only.sort # evaluates to true or false
end
And here's a version that's more efficient:
def efficient_useless(str)
crazy_only = str.scan(/[asdfgh]/) # use regex to search for the letters you want
crazy_only == crazy_only.sort
end
Block local variables
new_array doesn't exist outside the block of your reduce call. It's a "block local variable".
reduce does return an object, though, and you should use it inside your method.
sum = [1, 2, 3].reduce(0){ |acc, elem| acc + elem }
puts sum
# 6
puts acc
# undefined local variable or method `acc' for main:Object (NameError)
Your code
Here's the smallest change to your method:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
new_array = str.split(//).reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (new_array == new_array.sort)
end
Notes:
return isn't needed at the end.
true if ... isn't needed either; the comparison already evaluates to true or false.
for loops are rarely used in idiomatic Ruby; prefer each or other Enumerable methods.
reduce uses the value of the last expression in the block as the next accumulator and returns the final one. In your code that last expression was the for loop, which evaluates to the range it iterated over.
If you always need to return the same object in reduce, it might be a sign you could use each_with_object.
"test".split is just ["test"].
String and Enumerable have methods that could help you. Using them, you could write a much cleaner and more efficient method, as in @Phrogz's answer.
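As a sketch of the each_with_object suggestion in the notes above: the method name crazy_letters_sorted? is made up, and this is only one possible rewrite of the question's method, not the one from the other answer.

```ruby
def crazy_letters_sorted?(str)
  crazy_letters = %w[a s d f g h]
  # each_with_object yields (element, memo) and always passes the same memo,
  # so the block does not have to return it.
  crazy_only = str.chars.each_with_object([]) do |letter, found|
    found << letter if crazy_letters.include?(letter)
  end
  crazy_only == crazy_only.sort
end

p crazy_letters_sorted?("adf")   # => true
p crazy_letters_sorted?("fda")   # => false
```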

Ruby block's result as an argument

There are many examples how to pass Ruby block as an argument, but these solutions pass the block itself.
I need a solution that takes some variable, executes an inline code block passing this variable as a parameter for the block, and the return value as an argument to the calling method. Something like:
a = 555
b = a.some_method { |value|
#Do some stuff with value
return result
}
or
a = 555
b = some_method(a) { |value|
#Do some stuff with value
return result
}
I could imagine a custom function:
class Object
def some_method(&block)
block.call(self)
end
end
or
def some_method(arg, &block)
block.call(arg)
end
but are there standard means present?
I think you are looking for instance_eval.
Evaluates a string containing Ruby source code, or the given block, within the context of the receiver (obj). In order to set the context, the variable self is set to obj while the code is executing, giving the code access to obj’s instance variables. In the version of instance_eval that takes a String, the optional second and third parameters supply a filename and starting line number that are used when reporting compilation errors.
a = 55
a.instance_eval do |obj|
  # do some operation on the object; the value of the
  # block's last expression is what instance_eval returns
  result = obj / 5 # last statement; its value (11) is returned
end # => 11
As @Arup Rakshit commented, use tap. Note that tap returns the receiver rather than the block's value:
def compute(x)
x + 1
end
compute(3).tap do |val|
logger.info(val)
end # => 4
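For what the question literally asks (pass a value to an inline block and get the block's result back), the closest standard method I know of is Object#yield_self, available from Ruby 2.5 and aliased as then from Ruby 2.6. A small sketch comparing it with tap follows. The division by 5 is just a placeholder for "do some stuff with value".

```ruby
a = 555

# yield_self passes the receiver to the block and returns the block's value.
b = a.yield_self { |value| value / 5 }
p b   # => 111

# tap also passes the receiver to the block, but returns the receiver itself.
c = a.tap { |value| value / 5 }
p c   # => 555
```

The block's value is simply its last expression, so an explicit return inside the block is not needed.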

Yield within Set to eliminate in an Array

I found the following code here for eliminating duplicate records in an array:
require 'set'
class Array
def uniq_by
seen = Set.new
select{ |x| seen.add?( yield( x ) ) }
end
end
And we can use the code above as follows:
@messages = Messages.all.uniq_by { |h| h.body }
I would like to know how and what happens when the method is called. Can someone explain the internals of the code above? In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
Let's break it down:
seen = Set.new
This creates an empty set.
select{ |x| seen.add?( yield( x ) ) }
Array#select keeps the elements for which the block returns a truthy value.
seen.add?(yield(x)) returns the set itself (truthy) if the result of the block could be added to the set, or nil if it was already there.
Here, yield(x) calls the block passed to the uniq_by method, passing x as an argument.
In our case, since our block is { |h| h.body }, it is the same as calling seen.add?(x.body).
Since the members of a set are unique, calling add? when the element already exists returns nil.
So it calls .body on each element of the array and tries to add the result to a set, keeping the elements for which the addition succeeded.
The method uniq_by accepts a block argument. This lets you specify by what criterion two elements are considered duplicates.
The yield statement evaluates the given block for the element and, with the block above, returns the value of the element's body attribute.
So, if you call uniq_by like above, you are stating that the body attribute of an element has to be unique for the element to be kept.
To answer the more specific question you have: yield will call the passed block {|h| h.body} like a method, substituting the current x for h and therefore returning x.body.
In Ruby, when you put the yield keyword inside a method (say #bar), you are declaring that #bar expects to be called with a block. yield then invokes that block, much as if it had been converted to a Proc object and called with Proc#call.
Example:
def bar
yield
end
p bar { "hello" } # "hello"
p bar # raises LocalJumpError: no block given (yield)
In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
You did: you used yield. Once yield appears in the method, the method knows what it is supposed to do with the block. In the line Messages.all.uniq_by { |h| h.body } you are passing a block { |h| h.body }, and inside uniq_by, yield calls that block, just as Proc#call would if the block had been converted to a Proc object.
Proof:
def bar
p block_given? # true
yield
end
bar { "hello" } # prints true and returns "hello"
To make this easier to understand:
class Array
def uniq_by
seen = Set.new
select{ |x| seen.add?( yield( x ) ) }
end
end
is the same as
class Array
def uniq_by
seen = Set.new
# Below you are telling uniq_by, you will be using a block with it
# by using `yield`.
select{ |x| var = yield(x); seen.add?(var) }
end
end
Read the documentation of yield:
Called from inside a method body, yields control to the code block (if any) supplied as part of the method call. If no code block has been supplied, calling yield raises an exception. yield can take an argument; any values thus yielded are bound to the block's parameters. The value of a call to yield is the value of the executed code block.
Array#select returns a new array containing all elements of the array for which the given block returns a true value.
The block passed to select uses Set#add? to determine whether the element is already there. add? returns nil if the same element is already in the set; otherwise it adds the element and returns the set itself.
That block in turn passes its argument (an element of the array) to another block (the one passed to uniq_by) using yield. The return value of yield is the return value of that block ({|h| h.body }).
The select ... statement is basically equivalent to the following:
select{ |x| seen.add?(x.body) }
But by using yield, the code avoids hard-coding .body and defers the decision to the block.
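To see the whole thing run, here is a small self-contained sketch. The Message struct and its sample data are made up as a stand-in for the Messages records in the question. The last line uses the built-in Array#uniq, which also accepts a block and should give the same result here.

```ruby
require 'set'

# Hypothetical stand-in for the records in the question.
Message = Struct.new(:id, :body)

class Array
  def uniq_by
    seen = Set.new
    # add? returns the set (truthy) for a new key and nil for a repeated one
    select { |x| seen.add?(yield(x)) }
  end
end

messages = [
  Message.new(1, "hello"),
  Message.new(2, "world"),
  Message.new(3, "hello")
]

p messages.uniq_by { |m| m.body }.map(&:id)   # => [1, 2]
p messages.uniq { |m| m.body }.map(&:id)      # => [1, 2]
```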

Ruby: Retrieving values from method parameters

I'm trying to figure out how I can create a method in Ruby where I can retrieve values from the method's parameters such as strings/integers.
For example, if this were a function coded in C, it might be done similar to this:
#include <stdio.h>

// The address of "value" is passed with the ampersand, so the function can update the caller's variable
void GetAnIntegerValue(int *value)
{
    *value = 5;
}

int main(void)
{
    int value;
    GetAnIntegerValue(&value);
    printf("The value is %d", value);
    return 0;
}
// The output would be "The value is 5"
I think the term for this is pass by value but I'm not sure. My mind is a little vague on this area and I couldn't find many decent results.
Here's my example Ruby function, the array that the parameters are being assigned to is only local to the class which is the reason for this usage:
def getRandomWordAndHint(&RandomWord, &RandomHint)
randIndex = rand(7)
RandomWord = EnglishLevel1Word[randIndex]
RandomHint = EnglishLevel1Hint[randIndex]
end
Cheers!
Ruby is pass-by-value. Always. No exceptions. You cannot do pass-by-reference in Ruby.
What you can do is put the object you want to change into some sort of mutable container:
class MutableCell
attr_accessor :val
def initialize(val)
self.val = val
end
end
def change_the_value(cell)
cell.val = 5
end
value = MutableCell.new(42)
change_the_value(value)
value.val
# => 5
Of course, you can just use an Array instead of writing your own MutableCell class, this is just for demonstration.
However, mutable state is a bad idea in general, and mutating arguments passed to methods is a really bad idea especially. Methods know about their own object (i.e. self) and thus can safely modify it, but for other objects, that's generally a no-go.
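An alternative that avoids mutating arguments altogether is to return both values and destructure them at the call site. This is only a sketch: the WORDS and HINTS lists and the method name random_word_and_hint are invented stand-ins for the question's EnglishLevel1Word and EnglishLevel1Hint arrays.

```ruby
# Hypothetical data standing in for EnglishLevel1Word / EnglishLevel1Hint.
WORDS = %w[apple river cloud].freeze
HINTS = ["a fruit", "flowing water", "floats in the sky"].freeze

# Returns both results as an array instead of writing to "out" parameters.
def random_word_and_hint
  index = rand(WORDS.length)
  [WORDS[index], HINTS[index]]
end

word, hint = random_word_and_hint   # destructuring assignment
puts "#{word}: #{hint}"
```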

Resources