Block-local variables exist to prevent a block from tampering with variables outside of its scope.
Using a block-local variable
x = 10
3.times do |y; x|
x = y
end
x # => 10
But the same thing is easily done by declaring a regular block parameter. The parameter is local to the block and takes precedence over any outer variable with the same name.
Without using a block-local variable
x = 10
3.times do |y, x|
x = y
end
x # => 10
The variable x outside the block doesn't get changed in either case. Is there any need for block-local variables other than for enhancing readability?
The block parameter is a real parameter, while a block local variable is not.
If you give yield two parameters like this:
def foo
yield("hello", "world")
end
Calling
x = 10
foo do |y; x|
puts x
end
x is nil inside the block because only the first argument is assigned to y; the second argument is discarded.
Calling
x = 10
foo do |y, x|
puts x
end
#=>world
x gets the parameter correctly as "world".
To expand on Yu Hao's answer, the difference between block parameters and block-local variables is not obvious when calling a method that only yields one value, but consider a method that yields multiple values:
def frob
yield 1, 2, 3
end
If you pass this a block with a single argument, you get the first value:
frob { |a| a.inspect }
# => "1"
But if you pass a block with multiple arguments, you get multiple values, even if you pass too few arguments, or too many:
frob { |a, b, c| [a, b, c].inspect }
# => "[1, 2, 3]"
frob { |a, b| [a, b].inspect }
# => "[1, 2]"
frob { |a, b, c, d| [a, b, c, d].inspect }
# => "[1, 2, 3, nil]"
If you pass block-scoped variables, however, those are independent of the yielded value(s):
frob { |a; b, c| [a, b, c].inspect }
# => "[1, nil, nil]"
Something similar happens with methods that yield an array, except that when you pass a block with a single argument, it gets the whole array:
def frobble
yield [1, 2, 3]
end
frobble {|a| a.inspect }
# => "[1, 2, 3]"
Multiple arguments, however, destructure the array --
frobble {|a, b| [a, b].inspect }
# => "[1, 2]"
-- while a block-scoped variable doesn't:
frobble {|a; b| [a, b].inspect }
# => "[[1, 2, 3], nil]"
(Even with a block-scoped variable present, though, multiple values will still destructure the array: frobble {|a, b; c| [a, b, c].inspect } will get you "[1, 2, nil]".)
For more discussion and examples, see also this answer.
Related
Today I was surprised to find that Ruby automatically finds the values of an array given as a block parameter.
For example:
foo = "foo"
bar = "bar"
p foo.chars.zip(bar.chars).map { |pair| pair }.first #=> ["f", "b"]
p foo.chars.zip(bar.chars).map { |a, b| "#{a},#{b}" }.first #=> "f,b"
p foo.chars.zip(bar.chars).map { |a, b,c| "#{a},#{b},#{c}" }.first #=> "f,b,"
I would have expected the last two examples to give some sort of error.
Is this an example of a more general concept in ruby?
I don't think my wording at the start of my question is correct, what do I call what is happening here?
Ruby blocks are quirky like that.
The rule is this: if a block takes more than one argument and it is yielded a single object that responds to to_ary, then that object is expanded. This makes yielding an array versus yielding a tuple seem to behave the same way for blocks that take two or more arguments.
yield [a,b] versus yield a,b do differ though when the block takes one argument only or when the block takes a variable number of arguments.
Let me demonstrate both:
def yield_tuple
yield 1, 2, 3
end
yield_tuple { |*a| p a }
yield_tuple { |a| p [a] }
yield_tuple { |a, b| p [a, b] }
yield_tuple { |a, b, c| p [a, b, c] }
yield_tuple { |a, b, c, d| p [a, b, c, d] }
prints
[1, 2, 3]
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, nil]
Whereas
def yield_array
yield [1,2,3]
end
yield_array { |*a| p a }
yield_array { |a| p [a] }
yield_array { |a, b| p [a, b] }
yield_array { |a, b, c| p [a, b, c] }
yield_array { |a, b, c, d| p [a, b, c, d] }
prints
[[1, 2, 3]]
[[1, 2, 3]]
[1, 2] # array expansion makes it look like a tuple
[1, 2, 3] # array expansion makes it look like a tuple
[1, 2, 3, nil] # array expansion makes it look like a tuple
And finally to show that everything in Ruby uses duck-typing
class A
def to_ary
[1,2,3]
end
end
def yield_arrayish
yield A.new
end
yield_arrayish { |*a| p a }
yield_arrayish { |a| p [a] }
yield_arrayish { |a, b| p [a, b] }
yield_arrayish { |a, b, c| p [a, b, c] }
yield_arrayish { |a, b, c, d| p [a, b, c, d] }
prints
[#<A:0x007fc3c2969190>]
[#<A:0x007fc3c2969050>]
[1, 2] # array expansion makes it look like a tuple
[1, 2, 3] # array expansion makes it look like a tuple
[1, 2, 3, nil] # array expansion makes it look like a tuple
PS, the same array expansion behavior applies for proc closures which behave like blocks, whereas lambda closures behave like methods.
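To make that PS concrete, here is a minimal sketch. The helper name call_with_pair is made up for illustration; it just calls whatever it is given with one array argument.

```ruby
# Hypothetical helper (not from the answer): calls any callable
# with a single array argument.
def call_with_pair(callable)
  callable.call([1, 2])
end

# A proc expands the lone array across its parameters, like a block.
p call_with_pair(proc { |a, b| [a, b] })       # prints [1, 2]

# A lambda checks its arity like a method and rejects the single array.
begin
  call_with_pair(lambda { |a, b| [a, b] })
rescue ArgumentError => e
  puts "lambda raised #{e.class}"              # prints "lambda raised ArgumentError"
end

# Explicit destructuring in the parameter list makes the lambda accept it.
p call_with_pair(lambda { |(a, b)| [a, b] })   # prints [1, 2]
```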
Ruby's block mechanics have a quirk: if you're iterating over something that contains arrays, you can expand them out into different variables:
[ %w[ a b ], %w[ c d ] ].each do |a, b|
puts 'a=%s b=%s' % [ a, b ]
end
This pattern is very useful when using Hash#each and you want to break out the key and value parts of the pair: each { |k,v| ... } is very common in Ruby code.
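As a rough illustration of the Hash#each case, the sketch below compares a two-parameter block with a one-parameter block. The method names and the sample hash are made up for the example.

```ruby
# Hypothetical helpers and data, for illustration only.
def describe_stock(stock)
  lines = []
  # Two block parameters: each key/value pair is split into name and count.
  stock.each { |name, count| lines << "#{name}=#{count}" }
  lines
end

def raw_pairs(stock)
  pairs = []
  # One block parameter: the pair arrives as a single two-element array.
  stock.each { |pair| pairs << pair }
  pairs
end

p describe_stock({ apples: 3, pears: 5 })  # prints ["apples=3", "pears=5"]
p raw_pairs({ apples: 3, pears: 5 })       # prints [[:apples, 3], [:pears, 5]]
```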
If your block takes more than one argument and the element being iterated is an array then it switches how the arguments are interpreted. You can always force-expand:
[ %w[ a b ], %w[ c d ] ].each do |(a, b)|
puts 'a=%s b=%s' % [ a, b ]
end
That's useful for cases where things are more complex:
[ %w[ a b ], %w[ c d ] ].each_with_index do |(a, b), i|
puts 'a=%s b=%s # %d' % [ a, b, i ]
end
In this case it's iterating over an array with another element tacked on, so each item is actually a tuple of the form %w[ a b ], 0 internally, which will be converted to an array if your block only accepts one argument.
This is much the same principle you can use when defining variables:
a, b = %w[ a b ]
a
# => 'a'
b
# => 'b'
That actually assigns independent values to a and b. Contrast with:
a, b = [ %w[ a b ] ]
a
# => [ 'a', 'b' ]
b
# => nil
I would have expected the last two examples to give some sort of error.
It does in fact work that way if you pass a proc from a method. Yielding to such a proc is much stricter – it checks its arity and doesn't attempt to convert an array argument to an argument list:
def m(a, b)
"#{a}-#{b}"
end
['a', 'b', 'c'].zip([0, 1, 2]).map(&method(:m))
#=> wrong number of arguments (given 1, expected 2) (ArgumentError)
This is because zip creates an array (of arrays) and map just yields each element, i.e.
yield ['a', 0]
yield ['b', 1]
yield ['c', 2]
each_with_index on the other hand works:
['a', 'b', 'c'].each_with_index.map(&method(:m))
#=> ["a-0", "b-1", "c-2"]
because it yields two separate values, the element and its index, i.e.
yield 'a', 0
yield 'b', 1
yield 'c', 2
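If you still want to reuse m with the zipped pairs, one possible workaround (my suggestion, not something the answer prescribes) is to expand each pair yourself before calling the method:

```ruby
def m(a, b)
  "#{a}-#{b}"
end

pairs = ['a', 'b', 'c'].zip([0, 1, 2])

# Splat each pair explicitly so that m receives two arguments.
p pairs.map { |pair| m(*pair) }   # prints ["a-0", "b-1", "c-2"]

# Or let an ordinary two-parameter block expand the pair first.
p pairs.map { |a, b| m(a, b) }    # prints ["a-0", "b-1", "c-2"]
```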
The code:
a = [1, 2, 3]
h = {a: 1}
def f args
p args
end
h.map(&method(:f))
a.map(&method(:f))
h.map do |k,v|
p [k,v]
end
The output:
[:a, 1]
1
2
3
[:a, 1]
Why can't I define f for a hash as follows?
def f k, v
p [k, v]
end
You are correct that the reason stems from one of the two main differences between procs and lambdas. I'll try explaining it in a slightly different way than you did.
Consider:
a = [:a, 1]
h = {a: 1}
def f(k,v)
p [k, v]
end
a.each(&method(:f))
#-> in `f': wrong number of arguments (1 for 2) (ArgumentError)
h.each(&method(:f))
#-> in `f': wrong number of arguments (1 for 2) (ArgumentError)
where I use #-> to show what is printed and #=> to show what is returned. You used map, but each is more appropriate here, and makes the same point.
In both cases elements of the receiver are being passed to the block [1]:
&method(:f)
which is (more-or-less, as I will explain) equivalent to:
{ |k,v| p [k,v] }
The block is complaining (for both the array and hash) that it is expecting two arguments but receiving only one, and that is not acceptable. "Hmmm", the reader is thinking, "why doesn't it disambiguate in the normal way?"
Let's try using the block directly:
a.map { |k,v| p [k,v] }
#-> [:a, nil]
# [1, nil]
h.map { |k,v| p [k,v] }
#-> [:a, 1]
This works as expected, but does not return what we wanted for the array.
The first element of a (:a) is passed into the block and the block variables are assigned:
k,v = :a
#=> :a
k #=> :a
v #=> nil
and
p [k,v]
#-> [:a, nil]
Next, 1 is passed to the block and [1,nil] is printed.
Let's try one more thing, using a proc created with Proc::new:
fp = Proc.new { |k,v| p [k, v] }
#=> #<Proc:0x007ffd6a0a8b00#(irb):34>
fp.lambda?
#=> false
a.each { |e| fp.call(e) }
#-> [:a, nil]
#-> [1, nil]
h.each { |e| fp[e] }
#-> [:a, 1]
(Here I've used one of three aliases for Proc#call.) We see that calling the proc has the same result as using a block. The proc expects two arguments but receives only one and, unlike the lambda, does not complain [2].
This tells us that we need to make small changes to a and f:
a = [[:a, 1]]
h = {a: 1}
def f((k,v))
  p [k, v]
end
a.each(&method(:f))
#-> [:a, 1]
h.each(&method(:f))
#-> [:a, 1]
Incidentally, I think you may have fooled yourself with the variable name args:
def f args
p args
end
as the method has a single argument regardless of what you call it. :-)
[1] The block is created by & calling Method#to_proc on the method f and then converting the proc (actually a lambda) to a block.
[2] From the docs for Proc: "For procs created using lambda or ->() an error is generated if the wrong number of parameters are passed to a Proc with multiple parameters. For procs created using Proc.new or Kernel.proc, extra parameters are silently discarded."
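The first footnote can be checked directly. This is only a small sketch; here f returns the pair instead of printing it so the result is easy to inspect.

```ruby
def f(k, v)
  [k, v]
end

# Method#to_proc returns a proc that reports itself as a lambda,
# so it keeps the method's strict argument checking.
as_proc = method(:f).to_proc
p as_proc.lambda?                        # prints true
p as_proc.arity                          # prints 2

# A proc created from a plain block is not a lambda.
p(proc { |k, v| [k, v] }.lambda?)        # prints false
```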
As it appears, it must be some sort of implicit destructuring (or non-strict arguments handling), which works for procs, but doesn't for lambdas:
irb(main):007:0> Proc.new { |k,v| p [k,v] }.call([1,2])
[1, 2]
=> [1, 2]
irb(main):009:0> lambda { |k,v| p [k,v] }.call([1,2])
ArgumentError: wrong number of arguments (1 for 2)
from (irb):9:in `block in irb_binding'
from (irb):9:in `call'
from (irb):9
from /home/yuri/.rubies/ruby-2.1.5/bin/irb:11:in `<main>'
But one can make it work:
irb(main):010:0> lambda { |(k,v)| p [k,v] }.call([1,2])
[1, 2]
=> [1, 2]
And therefore:
def f ((k, v))
p [k, v]
end
So Hash#map always passes one argument.
UPD
This implicit destructuring also happens in block arguments.
names = ["Arthur", "Ford", "Trillian"]
ids = [42, 43, 44]
id_names = ids.zip(names) #=> [[42, "Arthur"], [43, "Ford"], [44, "Trillian"]]
id_names.each do |id, name|
puts "user #{id} is #{name}"
end
http://globaldev.co.uk/2013/09/ruby-tips-part-2/
UPD Don't get me wrong. I'm not suggesting writing such code (def f ((k, v))). In the question I was asking for an explanation, not for a solution.
Why do Ruby (2.0) procs/blocks with splat arguments behave differently than methods and lambdas?
def foo (ids, *args)
p ids
end
foo([1,2,3]) # => [1, 2, 3]
bar = lambda do |ids, *args|
p ids
end
bar.call([1,2,3]) # => [1, 2, 3]
baz = proc do |ids, *args|
p ids
end
baz.call([1,2,3]) # => 1
def qux (ids, *args)
yield ids, *args
end
qux([1,2,3]) { |ids, *args| p ids } # => 1
Here's a confirmation of this behavior, but without explanation:
http://makandracards.com/makandra/20641-careful-when-calling-a-ruby-block-with-an-array
There are two types of Proc objects: lambdas, which handle the argument list in the same way as a normal method, and procs, which use "tricks" (see Proc#lambda?). A proc will splat an array if it's the only argument, ignore extra arguments, and assign nil to missing ones. You can partially mimic proc behavior with a lambda using destructuring:
->((x, y)) { [x, y] }[1] #=> [1, nil]
->((x, y)) { [x, y] }[[1, 2]] #=> [1, 2]
->((x, y)) { [x, y] }[[1, 2, 3]] #=> [1, 2]
->((x, y)) { [x, y] }[1, 2] #=> ArgumentError
Just encountered a similar issue!
Anyways, my main takeaways:
The splat operator works for array assignment in a predictable manner
Procs effectively assign arguments to input (see disclaimer below)
This leads to strange behavior, i.e. the example above:
baz = proc do |ids, *args|
p ids
end
baz.call([1,2,3]) # => 1
So what's happening? [1,2,3] gets passed to baz, which then assigns the array to its arguments
ids, *args = [1,2,3]
ids = 1
args = [2,3]
When run, the block only inspects ids, which is 1. In fact, if you insert p args into the block, you will find that it is indeed [2,3]. Certainly not the result one would expect from a method (or lambda).
Disclaimer: I can't say for sure if Procs simply assign their arguments to input under the hood. But it does seem to match their behavior of not enforcing the correct number of arguments. In fact, if you give a Proc too many arguments, it ignores the extras. Too few, and it passes in nils. Exactly like variable assignment.
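One way to probe that hunch is to put plain multiple assignment next to a proc with the same parameter shape. This only compares observable results for a few inputs and says nothing about the implementation; both helper names are made up.

```ruby
# Hypothetical helper: plain multiple assignment, `ids, *args = input`.
def by_assignment(input)
  ids, *args = input
  [ids, args]
end

# Hypothetical helper: the same shape expressed as proc parameters.
def by_proc(input)
  proc { |ids, *args| [ids, args] }.call(input)
end

# For an array, a one-element array, and a non-array value,
# both columns print the same result.
[[1, 2, 3], [1], 7].each do |input|
  p [input, by_assignment(input), by_proc(input)]
end
```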
I am confused about a good style to adopt to define block local variables. The choices are:
Choice A:
method_that_calls_block { |v, w| puts v, w }
Choice B:
method_that_calls_block { |v; w| puts v, w }
The confusion is compounded when I want the block local to have a default value. The choices I am confused about are:
Choice C:
method_that_calls_block { |v, w = 1| puts v, w }
Choice D:
method_that_calls_block { |v, w: 1| puts v, w }
Is there a convention about how block local variables must be defined?
P.S. Also it seems the ; syntax does not work when I need to assign default value to a block local variable! Strange.
Choice B is valid, though obscure, syntax, as @matt pointed out (see here: How to write an inline block to contain local variable scope in Ruby?)
Choice C gives a default value to w, which is a regular positional parameter, while Choice D is the syntax for a keyword parameter with a default value.
All four of these are valid, but they all have different semantics -- which is correct depends on what you're trying to accomplish.
Examples
Consider the following method, which yields multiple values.
def frob
yield 1, 2, 3
end
Choice A: block parameters
"Get me the first two yielded values, if any, I don't care about the others."
frob { |v, w| [v, w].inspect}
# => "[1, 2]"
Choice B: block parameter + block-local variable
"Get me the first value, I don't care about the others; and give me an additional, uninitialized variable".
frob { |v; w| [v, w].inspect}
# => "[1, nil]"
Choice C: block parameters, some with default values
"Get me the first two values, and if the second value isn't initialized, set that variable to 1":
frob { |v, w = 1| [v, w].inspect }
# => "[1, 2]" <-- all values are present, default value ignored
"Get me the first five values, and if the fifth value isn't initialized, set that variable to 99":
frob { |v, w, x, y, z = 99| [v, w, x, y, z].inspect }
# => "[1, 2, 3, nil, 99]"
Choice D: positional and keyword block parameters
"Get me the first value, and if the method yields a keyword parameter w, get that, too; if not, set it to 1."
frob { |v, w: 1| [v, w].inspect }
# => "[1, 1]"
This is designed for the case where a method does yield keyword parameters:
def frobble
yield 1, 2, 3, w: 4
end
frobble { |v, w: 1| [v, w].inspect }
# => "[1, 4]"
In Ruby < 2.7, a block with a keyword parameter will also destructure a hash, although Ruby 2.7 will give you a deprecation warning, just as if you'd passed a hash to a method that takes keyword arguments:
def frobnitz
h = {w: 99}
yield 1, 2, 3, h
end
# Ruby 2.7
frobnitz { |v, w: 1| [v, w].inspect }
# warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
# => "[1, 99]"
Ruby 3.0 doesn't give you a deprecation warning, but it also ignores the hash:
# Ruby 3.0
frobnitz { |v, w: 1| [v, w].inspect }
# => "[1, 1]"
Yielding an explicit keyword argument still works as expected in 3.0, though:
# Ruby 3.0
frobble { |v, w: 1| [v, w].inspect }
# => "[1, 4]"
Note that the keyword argument form will fail if the method yields unexpected keywords:
def frobnicate
yield 1, 2, 3, w: 99, z: -99
end
frobnicate { |v, w: 1| [v, w].inspect }
# => ArgumentError (unknown keyword: :z)
Array destructuring
Another way in which the differences become obvious is when considering a method that yields an array:
def gork
yield [1, 2, 3]
end
Passing a block with a single argument will get you the whole array:
gork { |v| v.inspect }
# => "[1, 2, 3]"
Passing a block with multiple arguments, though, will get you the elements of the array, even if you pass too few arguments, or too many:
gork { |v, w| [v, w].inspect }
# => "[1, 2]"
gork { |v, w, x, y| [v, w, x, y].inspect }
# => "[1, 2, 3, nil]"
Here again the ; syntax for block-local variables can come in handy:
gork { |v; w| [v, w].inspect }
# => "[[1, 2, 3], nil]"
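Note, though, that even a keyword argument will still cause the array to be destructured (at least up to Ruby 3.1; from Ruby 3.2 on, a proc that takes a single positional parameter plus keywords no longer autosplats):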
gork { |v, w: 99| [v, w].inspect }
# => "[1, 99]"
gork { |v, w: 99; x| [v, w, x].inspect }
# => "[1, 99, nil]"
Outer variable shadowing
Ordinarily, if you use the name of an outer variable inside a block, you're using that variable:
w = 1; frob { |v| w = 99}; w
# => 99
You can avoid this with any of the choices above; any of them will shadow the outer variable, hiding the outer variable from the block and ensuring that any effects the block has on it are local.
Choice A: block parameters:
w = 1; frob { |v, w| puts [v, w].inspect; w = 99}; w
# [1, 2]
# => 1
Choice B: block parameter + block-local variable
w = 1; frob { |v; w| puts [v, w].inspect; w = 99}; w
# [1, nil]
# => 1
Choice C: block parameters, some with default values
w = 1; frob { |v, w = 33| puts [v, w].inspect; w = 99}; w
# [1, 2]
# => 1
Choice D: positional and keyword block parameters
w = 1; frob { |v, w: 33| puts [v, w].inspect; w = 99}; w
# [1, 33]
# => 1
The other behavioral differences, though, still hold.
Default values
You can't set a default value for block-local variables.
frob { |v; w = 1| [v, w].inspect }
# syntax error, unexpected '=', expecting '|'
You also can't use a keyword argument as a block-local variable.
frob { |v; w: 1| [v, w].inspect }
# syntax error, unexpected ':', expecting '|'
If you know the method you're calling doesn't yield a keyword argument, though, you can declare a fake keyword parameter with a default value, and use that to get yourself a pre-initialized block-local variable. Repeated from the first Choice D example, above:
frob { |v, w: 1| [v, w].inspect }
# => "[1, 1]"
Local variables defined outside of a thread seem to be visible from inside so that the following two uses of Thread.new seem to be the same:
a = :foo
Thread.new{puts a} # => :foo
Thread.new(a){|a| puts a} # => :foo
The document gives the example:
arr = []
a, b, c = 1, 2, 3
Thread.new(a,b,c){|d, e, f| arr << d << e << f}.join
arr #=> [1, 2, 3]
but since a, b, c are visible from inside of the created thread, this should also be the same as:
arr = []
a, b, c = 1, 2, 3
Thread.new{d, e, f = a, b, c; arr << d << e << f}.join
arr #=> [1, 2, 3]
Is there any difference? When do you need to pass local variables as arguments to Thread.new?
When you pass a variable into a thread like that, the thread's block receives the value in its own local variable, so assignments to that variable inside the thread do not affect the variable you passed in from outside:
a = "foo"
Thread.new { a = "new" }.join     # join so the assignment has happened before we look
p a # => "new"
Thread.new(a) { |d| d = "old" }.join
p a # => "new"
p d # NameError: d is not defined outside the block
I think I hit the actual problem. With code like this:
sock = Socket.unix_server_socket(SOCK)
sock.listen 10
while conn = sock.accept do
io, address = conn
STDERR.puts "#{io.fileno}: Accepted connection from '#{address}'"
Thread.new{ serve io }
end
it appears to work when accepting only a few connections. The problem comes when accepting connections quickly one after another: the update to the local variable io is then visible to multiple concurrent threads unless io is passed as an argument to Thread.new.
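A socket-free sketch of the fix, under the assumption that the loop looks roughly like the one above. The method name handle_all, the string "connections" and the queue standing in for serve are all made up for illustration.

```ruby
# Hypothetical stand-in for the accept loop: `items` plays the role of the
# incoming connections, and pushing to a queue plays the role of `serve`.
def handle_all(items)
  handled = Queue.new
  threads = []
  index = 0
  while index < items.size
    io = items[index]   # `while` opens no new scope, so this `io` is shared
    # Passing io as an argument gives each thread its own block parameter.
    threads << Thread.new(io) do |conn|
      sleep 0.01        # give the loop time to reassign the outer `io`
      handled << conn
    end
    index += 1
  end
  threads.each(&:join)
  Array.new(handled.size) { handled.pop }
end

p handle_all(%w[a b c]).sort   # prints ["a", "b", "c"]
```

With Thread.new { handled << io } instead, each thread would read whatever io happens to hold when the thread finally runs, so the same connection could show up more than once.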