When do you need to pass arguments to `Thread.new`? - ruby

Local variables defined outside of a thread seem to be visible from inside so that the following two uses of Thread.new seem to be the same:
a = :foo
Thread.new{puts a} # prints foo
Thread.new(a){|a| puts a} # prints foo
The document gives the example:
arr = []
a, b, c = 1, 2, 3
Thread.new(a,b,c){|d, e, f| arr << d << e << f}.join
arr #=> [1, 2, 3]
but since a, b, c are visible from inside of the created thread, this should also be the same as:
arr = []
a, b, c = 1, 2, 3
Thread.new{d, e, f = a, b, c; arr << d << e << f}.join
arr #=> [1, 2, 3]
Is there any difference? When do you need to pass local variables as arguments to Thread.new?

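When you pass a variable into a thread like that, the block parameter is a new local variable inside the thread, initialized with the value the outer variable had when Thread.new was called. Reassigning it does not affect the variable outside of the thread. (Only the variable is new; the object it refers to is not copied.)
a = "foo"
Thread.new{ a = "new"}.join
p a # => "new"
Thread.new(a){|d| d = "old"}.join
p a # => "new"
p d # => NameError (undefined local variable or method `d')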
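To make the behaviour of the block parameter concrete, here is a minimal sketch (the method name thread_argument_demo is made up for illustration). It assumes nothing beyond core Ruby and shows two things: reassigning the parameter leaves the outer variable alone, while mutating the object in place is still visible outside, because both variables refer to the same object.

```ruby
# Hypothetical helper, for illustration only.
def thread_argument_demo
  a = "foo".dup # dup so the string is mutable even under frozen string literals

  # Reassigning the block parameter rebinds only the thread's own variable d.
  Thread.new(a) { |d| d = "reassigned" }.join
  after_reassign = a.dup

  # Mutating through the parameter changes the object that a also refers to.
  Thread.new(a) { |d| d << "bar" }.join

  [after_reassign, a]
end

p thread_argument_demo # => ["foo", "foobar"]
```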

I think I hit the actual problem. With code like this:
sock = Socket.unix_server_socket(SOCK)
sock.listen 10
while conn = sock.accept do
io, address = conn
STDERR.puts "#{io.fileno}: Accepted connection from '#{address}'"
Thread.new{ serve io }
end
it appears to work when accepting a few connections. The problem comes when connections are accepted quickly one after another: io is a single local variable shared by every thread's block, so a later assignment to io is visible in threads that are already running, unless io is passed as an argument to Thread.new.
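The race can be reproduced without sockets. The sketch below is an illustration under stated assumptions, not the server code: the method name shared_variable_demo and the Queue used as a gate are made up here, and the gate deliberately holds every thread back until the loop has finished so that the outcome is deterministic. In the real server the result depends on timing.

```ruby
# Hypothetical helper, for illustration only.
def shared_variable_demo
  gate = Queue.new # threads block on this until the loop is done
  shared = []
  passed = []
  items = [1, 2, 3]

  while item = items.shift
    # The block closes over the variable item, not over its current value.
    shared << Thread.new { gate.pop; item }
    # The argument is evaluated now and bound to the thread's own parameter.
    passed << Thread.new(item) { |mine| gate.pop; mine }
  end

  6.times { gate << :go } # release all six threads
  [shared.map(&:value), passed.map(&:value)]
end

p shared_variable_demo # => [[nil, nil, nil], [1, 2, 3]]
```

Applied to the loop above, the corresponding change would presumably be Thread.new(io){|client| serve client }, where client is just an illustrative parameter name.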

Related

Surprising Ruby scoping with while loop

(1)
a = [1, 2]
while b = a.pop do puts b end
outputs
2
1
(2)
a = [1, 2]
puts b while b = a.pop
results in an error
undefined local variable or method `b'
(3)
b = nil
a = [1, 2]
puts b while b = a.pop
outputs
2
1
What is going on? Why is the scope of b different in #2 than any of the rest?
$ ruby --version
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
EDIT: Originally I listed irb's behavior as different. It isn't; I was working in a "dirty" session.
Variables are declared to their scope by the lexical parser, which is linear. In while b = a.pop do puts b end, the assignment (b = a.pop) is seen by the parser before the use (puts b). In the second example, puts b while b = a.pop, the use is seen when the definition is still unknown, which produces the error.
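The following sketch shows both forms side by side so the difference can be checked directly. The method names are invented for the example, and the modifier form is wrapped in a rescue only so that the script keeps running.

```ruby
# Hypothetical helpers, for illustration only.
def block_while_demo
  a = [1, 2]
  out = []
  # The assignment is read first, so b is a known local inside the body.
  while b = a.pop do out << b end
  out
end

def modifier_while_demo
  a = [1, 2]
  begin
    # puts b is read before the assignment, so b is parsed as a method call.
    puts b while b = a.pop
  rescue NameError => e
    e.class
  end
end

p block_while_demo    # => [2, 1]
p modifier_while_demo # => NameError
```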
The parser reads the puts statement before it has seen the assignment to 'b', so 'b' is not yet a known local variable at that point, which results in the error.
As a similar example but with an until-statement, consider the following code:
a = [1, 2]
begin
puts "in the block"
end until b = a.pop
Would you expect b to be defined within the block?
Technically the only difference is that until stops on a true return value, while while will continue as long as a.pop returns a true value.
The point in both cases is that b is not in scope until the assignment has happened. Right after the assignment, e.g. when the loop returns, b becomes available in the current scope. That is called lexical scoping, and it is how Ruby works for local variables like this one.
I found this article to be helpful for understanding scope in ruby.
Update 1: In the previous version of my answer I wrote that this is not comparable to an if. While this is still true, it has nothing to do with the question, which is a simple scope issue.
Update 2: Added a link to more detailed explanations regarding scoping in Ruby.
Update 3: Removed the first sentence, as it was wrong.
(1)
a = [1, 2]
while b = a.pop do puts b end
is the same as
a = [1, 2]
b = a.pop
puts b
b = a.pop
puts b
(2)
a = [1, 2]
puts b while b = a.pop
is the same as this
a = [1, 2]
puts b
b = a.pop
puts b
b = a.pop
puts b
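When the parser reaches puts b for the first time, b has not yet been assigned, so it is not recognized as a local variable. Hence the error message.
(3)
b is initialized to nil, and even nil is an object in Ruby, so b is already a known local variable when puts b is reached.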
b = nil
a = [1, 2]
puts b while b = a.pop

How does Ruby return two values?

Whenever I swap values in an array, I make sure I store one of the values in a temporary variable first. But I found that Ruby can return two values as well as automatically swap two values. For example,
array = [1, 3, 5 , 6 ,7]
array[0], array[1] = array[1] , array[0] #=> [3, 1]
I was wondering how Ruby does this.
Unlike other languages, the return value of any method call in Ruby is always an object. This is possible because, like everything in Ruby, nil itself is an object.
There are three basic patterns you'll see. Returning no particular value:
def nothing
end
nothing
# => nil
Returning a singular value:
def single
1
end
x = single
# => 1
This is in line with what you'd expect from other programming languages.
Things get a bit different when dealing with multiple return values. These need to be specified explicitly:
def multiple
return 1, 2
end
x = multiple
# => [ 1, 2 ]
x
# => [ 1, 2 ]
When making a call that returns multiple values, you can break them out into independent variables:
x, y = multiple
# => [ 1, 2 ]
x
# => 1
y
# => 2
This strategy also works for the sorts of substitution you're talking about:
a, b = 1, 2
# => [1, 2]
a, b = b, a
# => [2, 1]
a
# => 2
b
# => 1
No, Ruby doesn't actually support returning two objects. (BTW: you return objects, not variables. More precisely, you return pointers to objects.)
It does, however, support parallel assignment. If you have more than one object on the right-hand side of an assignment, the objects are collected into an Array:
foo = 1, 2, 3
# is the same as
foo = [1, 2, 3]
If you have more than one "target" (variable or setter method) on the left-hand side of an assignment, the variables get bound to elements of an Array on the right-hand side:
a, b, c = ary
# is the same as
a = ary[0]
b = ary[1]
c = ary[2]
If the right-hand side is not an Array, it will be converted to one using the to_ary method:
a, b, c = not_an_ary
# is the same as
ary = not_an_ary.to_ary
a = ary[0]
b = ary[1]
c = ary[2]
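As a sketch of the to_ary conversion, the hypothetical Pair class below (not part of Ruby or of the answer) implements to_ary and is then destructured. The second method shows what happens, as far as I know, with an object that has no to_ary: it is treated as a single value and the remaining targets get nil.

```ruby
# Hypothetical class, for illustration only.
class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  # Multiple assignment calls this to obtain an Array.
  def to_ary
    [@left, @right]
  end
end

def destructure_pair
  a, b, c = Pair.new(1, 2)
  [a, b, c]
end

def destructure_plain_object
  a, b = 5 # Integer has no to_ary, so 5 is taken as the only element
  [a, b]
end

p destructure_pair         # => [1, 2, nil]
p destructure_plain_object # => [5, nil]
```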
And if we put the two together, we get that
a, b, c = d, e, f
# is the same as
ary = [d, e, f]
a = ary[0]
b = ary[1]
c = ary[2]
Related to this is the splat operator on the left-hand side of an assignment. It means "take all the left-over elements of the Array on the right-hand side":
a, b, *c = ary
# is the same as
a = ary[0]
b = ary[1]
c = ary.drop(2) # i.e. the rest of the Array
And last but not least, parallel assignments can be nested using parentheses:
a, (b, c), d = ary
# is the same as
a = ary[0]
b, c = ary[1]
d = ary[2]
# which is the same as
a = ary[0]
b = ary[1][0]
c = ary[1][1]
d = ary[2]
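The splat and nesting rules above can be checked with concrete values. This is only a small sketch; the method name and the sample arrays are made up for the example.

```ruby
# Hypothetical helper, for illustration only.
def splat_and_nested_demo
  first, *rest = [1, 2, 3, 4]             # splat collects the left-over elements
  *init, last = [1, 2, 3]                 # the splat may also come first
  head, (x, y), tail = [:a, [:b, :c], :d] # parentheses destructure the inner Array
  [first, rest, init, last, head, x, y, tail]
end

p splat_and_nested_demo
# => [1, [2, 3, 4], [1, 2], 3, :a, :b, :c, :d]
```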
When you return from a method or next or break from a block, Ruby will treat this kind-of like the right-hand side of an assignment, so
return 1, 2
next 1, 2
break 1, 2
# is the same as
return [1, 2]
next [1, 2]
break [1, 2]
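A runnable sketch of the three forms, assuming nothing beyond core Ruby (the method names are invented for illustration): in each case the two values arrive at the caller as one Array.

```ruby
# Hypothetical helpers, for illustration only.
def two_values
  return 1, 2
end

def return_next_break_demo
  from_return = two_values
  # next hands [n, n * 2] back to map as the block's value.
  from_next = [10].map { |n| next n, n * 2 }
  # break makes each itself return [n, n * 2].
  from_break = [10].each { |n| break n, n * 2 }
  [from_return, from_next, from_break]
end

p return_next_break_demo # => [[1, 2], [[10, 20]], [10, 20]]
```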
By the way, this also works in parameter lists of methods and blocks (with methods being more strict and blocks less strict):
def foo(a, (b, c), d) p a, b, c, d end
bar {|a, (b, c), d| p a, b, c, d }
Blocks being "less strict" is for example what makes Hash#each work. It actually yields a single two-element Array of key and value to the block, but we usually write
some_hash.each {|k, v| }
instead of
some_hash.each {|(k, v)| }
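To see the yielded pair directly, the sketch below (method name made up) runs the block forms against a one-entry hash. The last line uses each_with_index, where the explicit parentheses are actually needed because the pair is only the first of two yielded values.

```ruby
# Hypothetical helper, for illustration only.
def hash_block_forms
  h = { a: 1 }
  results = []
  h.each { |pair| results << pair }                         # the whole [key, value] Array
  h.each { |k, v| results << [k, v] }                       # implicitly destructured
  h.each { |(k, v)| results << [k, v] }                     # explicitly destructured
  h.each_with_index { |(k, v), i| results << [k, v, i] }    # parentheses required here
  results
end

p hash_block_forms
# => [[:a, 1], [:a, 1], [:a, 1], [:a, 1, 0]]
```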
tadman and Jörg W Mittag know Ruby better than me, and their answers are not wrong, but I don't think they are answering what OP wanted to know. I think that the question was not clear though. In my understanding, what OP wanted to ask has nothing to do with returning multiple values.
The real question is, when you want to switch the values of two variables a and b (or two positions in an array as in the original question), why is it not necessary to use a temporary variable temp like:
a, b = :foo, :bar
temp = a
a = b
b = temp
but can be done directly like:
a, b = :foo, :bar
a, b = b, a
The answer is that in multiple assignment, the whole right hand side is evaluated prior to assignment of the whole left hand side, and it is not done one by one. So a, b = b, a is not equivalent to a = b; b = a.
Evaluating the whole right-hand side before assignment is a necessity that follows from the adjustment needed when both sides of = have different numbers of terms, and Jörg W Mittag's description may be indirectly related to that, but that is not the main issue.
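The difference between parallel and one-by-one assignment can be checked with a short sketch (the method name is made up; the array is a shortened version of the one in the question):

```ruby
# Hypothetical helper, for illustration only.
def swap_demo
  a, b = :foo, :bar
  a, b = b, a # the right-hand side is evaluated in full before anything is assigned

  x, y = :foo, :bar
  x = y       # x is overwritten here...
  y = x       # ...so y just gets its own old value back

  array = [1, 3, 5]
  array[0], array[1] = array[1], array[0]

  [[a, b], [x, y], array]
end

p swap_demo # => [[:bar, :foo], [:bar, :bar], [3, 1, 5]]
```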
Arrays are a good option if you have only a few values. If you want multiple return values without having to know (and be confused by) the order of results, an alternative would be to return a Hash that contains whatever named values you want.
e.g.
def make_hash
x = 1
y = 2
{x: x, y: y}
end
hash = make_hash
# => {:x=>1, :y=>2}
hash[:x]
# => 1
hash[:y]
# => 2
Returning a hash, as suggested in another answer, is definitely better than returning an array, because array indexing can be confusing: when an additional attribute needs to be returned at a certain index, every place that uses the returned array has to change.
Another option is OpenStruct (available after require 'ostruct'). Its advantage over a hash is ease of access.
Example: computer = OpenStruct.new(ram: '4GB')
There are multiple ways to access the value of ram:
as a symbol key: computer[:ram]
as a string key: computer['ram']
as an attribute (accessor method): computer.ram
Reference Article: https://medium.com/rubycademy/openstruct-in-ruby-ab6ba3aff9a4

How to declare multiple variables

I want to know how I can declare multiple variables. I typed a,b=1 expecting to get a=1,b=1, but I got:
a,b=1
a #=> 1
b #=> nil
How am I able to do this?
After this code, I did:
a="Hello "
b=a
c="World~"
b << c
b #=> "Hello World~"
Why is b the same as a's value?
To declare multiple variables on the same line, you can do this:
a = b = "foo"
puts a # prints foo
puts b # prints foo too
About your second question: b << c does not assign a new value to b. It appends c to the String object that b refers to, modifying that object in place. Because b = a made both variables refer to the same object, a shows the change as well. Note that a = b = "foo" shares one object in the same way; use b = a.dup if you need an independent copy.
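For the second question, here is a small sketch that can be run to see what b << c does to a. The method name is invented, and dup is used on the literal only so the example also works when string literals are frozen.

```ruby
# Hypothetical helper, for illustration only.
def shared_string_demo
  a = "Hello ".dup
  b = a          # b and a now refer to the same String object
  b << "World~"  # appends in place, so a sees it too

  c = a.dup      # an independent copy
  c << "!"       # does not touch a

  [a, b, a.equal?(b), c, a.equal?(c)]
end

p shared_string_demo
# => ["Hello World~", "Hello World~", true, "Hello World~!", false]
```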
What you are doing is called destructuring assignment. Basically, you take what is on the right side of the equals sign, and destructure it, or break it apart, and then assign each section to each corresponding variable on the left.
Ruby is super friendly, and is providing some syntactic sugar that might be confusing.
When you type this:
a, b = 1
You are really saying something closer to this (pseudocode, not valid Ruby):
[a, b] = [1, nil]
A good example of destructuring assignment can be found here. It's for JavaScript, but I like it because the syntax is very explicit about what is happening when you do such an assignment.
I suppose, in the case of
a, b, c = 1, 2
the runtime system works the following way:
a, b, c = [1, 2]
_result = ( a, b, c = (_values = [1, 2]) )
a = _values[0] # => 1
b = _values[1] # => 2
c = _values[2] # => nil
_result = _values # => [1, 2]
However, in the case of a single value on the right hand side: a, b = 1, the computation process looks a bit different:
_result = ( a, b = ( _value = (_values = [1]).first ) )
a = _values[0] # => 1
b = _values[1] # => nil
_result = _value # => 1
Can someone approve or disprove my assumption?
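The observable part of this assumption can at least be tested: the sketch below captures the value of each assignment expression by wrapping it in parentheses. The method name is made up, and this only checks the results, not how the runtime arrives at them internally.

```ruby
# Hypothetical helper, for illustration only.
def masgn_return_values
  many = (a, b, c = 1, 2) # value of the whole multiple assignment
  single = (d, e = 1)
  [many, [a, b, c], single, [d, e]]
end

p masgn_return_values
# => [[1, 2], [1, 2, nil], 1, [1, nil]]
```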

Do block-local variables exist just to enhance readability?

Block-local variables are to prevent a block from tampering with variables outside of its scope.
Using a block-local variable
x = 10
3.times do |y; x|
x = y
end
x # => 10
But this is easily done by declaring a regular block parameter. A new local scope is created for that parameter, which takes precedence over previous variables/scopes.
Without using a block-local variable
x = 10
3.times do |y, x|
x = y
end
x # => 10
The variable x outside the block doesn't get changed in either case. Is there any need for block-local variables other than for enhancing readability?
The block parameter is a real parameter, while a block local variable is not.
If you give yield two parameters like this:
def foo
yield("hello", "world")
end
Calling
x = 10
foo do |y; x|
puts x
end
x is nil inside the block because only the first argument is assigned to y; the second argument is discarded.
Calling
x = 10
foo do |y, x|
puts x
end
# prints world
x gets the parameter correctly as "world".
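The three ways a block can treat x can be put side by side in one sketch. The helper yield_two mirrors foo above; both method names are made up for illustration.

```ruby
# Hypothetical helpers, for illustration only.
def yield_two
  yield "hello", "world"
end

def block_variable_forms
  x = 10
  yield_two { |y| x = y }             # no declaration: assigns to the outer x
  after_plain = x

  x = 10
  seen_local = :unset
  yield_two { |y; x| seen_local = x } # block-local x starts out as nil
  after_local = x

  x = 10
  seen_param = :unset
  yield_two { |y, x| seen_param = x } # parameter x receives the second value
  after_param = x

  [after_plain, seen_local, after_local, seen_param, after_param]
end

p block_variable_forms # => ["hello", nil, 10, "world", 10]
```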
To expand on Yu Hao's answer, the difference between block parameters and block-local variables is not obvious when calling a method that only yields one value, but consider a method that yields multiple values:
def frob
yield 1, 2, 3
end
If you pass this a block with a single argument, you get the first value:
frob { |a| a.inspect }
# => "1"
But if you pass a block with multiple arguments, you get multiple values, even if you pass too few arguments, or too many:
frob { |a, b, c| [a, b, c].inspect }
# => "[1, 2, 3]"
frob { |a, b| [a, b].inspect }
# => "[1, 2]"
frob { |a, b, c, d| [a, b, c, d].inspect }
# => "[1, 2, 3, nil]"
If you pass block-scoped variables, however, those are independent of the yielded value(s):
frob { |a; b, c| [a, b, c].inspect }
# => "[1, nil, nil]"
Something similar happens with methods that yield an array, except that when you pass a block with a single argument, it gets the whole array:
def frobble
yield [1, 2, 3]
end
frobble {|a| a.inspect }
# => "[1, 2, 3]"
Multiple arguments, however, destructure the array --
frobble {|a, b| [a, b].inspect }
# => "[1, 2]"
-- while a block-scoped variable doesn't:
frobble {|a; b| [a, b].inspect }
# => "[[1, 2, 3], nil]"
(Even with a block-scoped variable present, though, multiple values will still destructure the array: frobble {|a, b; c| [a, b, c].inspect } will get you "[1, 2, nil]".)
For more discussion and examples, see also this answer.

ruby multiple assignment not working as expected

I am trying to do something like this:
a = b = c = []
a << 1
Now I am expecting that b and c will be empty arrays, whereas a will have one element. But it's not working like that: here b and c also contain the same element. How is it working like this?
When you do this
a = b = c = []
All three variables point to the same object in memory. They are three references to the same Array.
So when you do
a << 1, you are modifying the one object referred to by all three variables.
If you want 3 separate arrays, do:
a, b, c = [], [], []
You can use .dup to create an object with same value at different memory location.
Here is your example without c because it is irrelevant.
irb(main):028:0> a = b = []
=> []
irb(main):029:0> a.object_id #a and b refer to the same location in memory
=> 19502520
irb(main):030:0> b.object_id
=> 19502520
irb(main):031:0> b = a.dup
=> []
irb(main):032:0> b.object_id #b refers to different location in memory
=> 18646920
irb(main):033:0> a << 1
=> [1]
irb(main):034:0> b
=> []
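The same check can be done without irb by comparing object identity with equal?. The sketch below uses an invented method name; Array.new(3) { [] } is included as one more way to get separate arrays, which the answers above do not mention.

```ruby
# Hypothetical helper, for illustration only.
def separate_arrays_demo
  a = b = []
  shared = a.equal?(b)       # true: one Array, two names

  x, y, z = [], [], []       # three distinct Array objects
  x << 1

  rows = Array.new(3) { [] } # the block runs once per slot, so each slot is a new Array
  rows[0] << 1

  [shared, [x, y, z], rows]
end

p separate_arrays_demo
# => [true, [[1], [], []], [[1], [], []]]
```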
