Proc.arity in ruby-3.0.2 vs ruby-2.X.X - ruby

Please take a look at the following discrepancy when executing exactly the same code:
Ruby-3.0.2:
3.0.2 :001 > id = :id
3.0.2 :002 > puts id.to_proc.arity.abs
2
Ruby-2.7.4:
2.7.4 :001 > id = :id
=> :id
2.7.4 :002 > puts id.to_proc.arity.abs
1
I used 2.7.4. However, any 2.X.X returns “1”.
Is this some kind of backward-compatibility issue or a documented change?
Thank you

The arity of a method or proc can be either a positive or a negative value. If it's positive, the method/proc accepts a static number of arguments with the number representing the number of required arguments.
If the arity is negative (as is the case for those procs), the method/proc accepts a variable number of arguments. The returned number encodes the number of required arguments as -1 - n, with n being the number of required arguments.
Now for the result of Symbol#to_proc, the more correct arity is -2 as you have to pass an argument (i.e. the receiver where the method representing the Symbol will be called on) plus any number of additional arguments (which will be passed along as method arguments).
In previous Ruby versions, the returned proc was defined to accept any number of arguments, similar to lambda { |*args| ... }. The "body" of the proc then checked the number of arguments and raised if there were too few arguments.
In Ruby 3.0, however, the returned proc is defined similar to lambda { |receiver, *args| ... }. Here, the interface of the proc makes it clearer that there is a single required argument and any number of optional arguments.
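The arity rules above can be checked against explicit lambdas of the two shapes mentioned. This is only a sketch: the helper name required_args is made up for illustration and is not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): recover the number of
# required arguments from an arity value.
def required_args(arity)
  arity >= 0 ? arity : -arity - 1
end

fixed      = lambda { |a, b| }             # exactly two arguments
any_args   = lambda { |*args| }            # zero or more arguments
recv_first = lambda { |receiver, *args| }  # one required, the rest optional

fixed_arity      = fixed.arity       # positive: fixed argument count
any_args_arity   = any_args.arity    # -1: nothing is required
recv_first_arity = recv_first.arity  # -2: one argument is required
```

Taking the absolute value, as in the question, therefore mixes two different encodings; a helper like the one above is a safer way to read a negative arity.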
The behavior of the proc itself has not changed much (but see below). In both versions the internal implementation is heavily optimized: the number of arguments is checked in C code inside the implementation of the proc, which handles its arguments on its own.
With Ruby 3.0, however, the returned Proc is a lambda, since it actually behaves like a lambda rather than a proc with regard to argument handling (and always has). This was changed in https://bugs.ruby-lang.org/issues/16260.
As such, the behavior itself has not changed, only the announced interface was updated in Ruby 3.0 to better reflect the actual behavior of the proc.
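The lambda-like argument handling can be observed directly, independent of what arity reports. A small sketch using two arbitrary symbols (:upcase and :+ are just examples):

```ruby
upcase = :upcase.to_proc
plus   = :+.to_proc

one_arg  = upcase.call("ruby")  # receiver only
two_args = plus.call(1, 2)      # receiver plus one method argument

# Calling without a receiver is rejected, as a lambda with one
# required parameter would do.
no_receiver_error =
  begin
    upcase.call
    nil
  rescue ArgumentError => e
    e.class
  end
```

The first argument is always consumed as the receiver and the remaining ones are forwarded to the method, which is the interface that an arity of -2 describes.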

Related

What is the meaning of the 'send' keyword in Ruby's AST?

I am trying to learn the Ruby lexer and parser (whitequark parser) to know more about the procedure to further generate machine code from a Ruby script.
On parsing the following Ruby code string.
def add(a, b)
  return a + b
end
puts add 1, 2
It results in the following S-expression notation.
s(:begin,
  s(:def, :add,
    s(:args,
      s(:arg, :a),
      s(:arg, :b)),
    s(:return,
      s(:send,
        s(:lvar, :a), :+,
        s(:lvar, :b)))),
  s(:send, nil, :puts,
    s(:send, nil, :add,
      s(:int, 1),
      s(:int, 2))))
Can anyone please explain to me the meaning of the :send keyword in the resulting S-expression notation?
Ruby is built on top of the “everything is an object” paradigm. That is, everything, including numbers, is an object.
Operators, as we see them in plain Ruby code, are nothing but syntactic sugar for method calls on the respective objects:
3.14.+(42)
#⇒ 45.14
The above is exactly how Ruby treats 3.14 + 42 short notation. It, in turn, might be written using generic Object#send:
3.14.send :+, 42
#⇒ 45.14
The latter should be read as: “send the message :+ with argument[s] (42) to the receiver 3.14.”
Ruby is an object-oriented language. In object-oriented programming, we do stuff by having objects send messages to other objects. For example,
foo.bar(baz)
means that self sends the message bar to the object obtained by dereferencing the local variable foo, passing the object obtained by dereferencing the local variable baz as argument. (Assuming that foo and baz are local variables. They could also be message sends, since Ruby allows you to leave out the receiver if it is self and the argument list if it is empty. Note that this would be statically known by the parser at this point, however, since local variables are created statically at parse time.)
In your code, there are several message sends:
a + b
sends the message + to the object in variable a passing the object in variable b
puts add 1, 2
sends the message add to self, passing the literal integers 1 and 2 as arguments, then sends the message puts to self, passing the result of the above message send as an argument.
Note that this has nothing to do with Object#send / Object#public_send. Those two are reflective methods that allow you to specify the message dynamically instead of statically in the source code. They are typically implemented internally by delegating to the same private internal runtime routine that the AST interpreter delegates to, not the other way around. The interpreter does not call Object#send (otherwise, you could customize method lookup rules in Ruby by monkey-patching Object#send, which, as you can easily verify, is not the case); rather, both Object#send and the interpreter call the same private internal implementation detail.
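The claim that ordinary message sends do not go through Object#send can be tried out with a small experiment. In this sketch, Probe and its counter are hypothetical names introduced only for the demonstration:

```ruby
# Hypothetical class: overrides send and counts how often it runs.
class Probe
  attr_reader :send_calls

  def initialize
    @send_calls = 0
  end

  def greet
    "hello"
  end

  # Override the reflective method and delegate to the original
  # behavior through __send__.
  def send(name, *args, &blk)
    @send_calls += 1
    __send__(name, *args, &blk)
  end
end

probe = Probe.new
direct = probe.greet                     # ordinary message send
calls_after_direct = probe.send_calls    # the override was not involved
reflective = probe.send(:greet)          # explicit reflective call
calls_after_reflective = probe.send_calls
```

Only the explicit probe.send(:greet) call increments the counter, which suggests that the :send node in the AST names the concept of a message send and not the send method.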

"NoMethodError: undefined method '-@' for ["some-text"]:Array" when inside while loop [duplicate]

The pre/post increment/decrement operators (++ and --) are pretty standard programming language syntax (for procedural and object-oriented languages, at least).
Why doesn't Ruby support them? I understand you could accomplish the same thing with += and -=, but it just seems oddly arbitrary to exclude something like that, especially since it's so concise and conventional.
Example:
i = 0 #=> 0
i += 1 #=> 1
i #=> 1
i++ #=> expect 2, but as far as I can tell,
#=> irb ignores the second + and waits for a second number to add to i
I understand Fixnum is immutable, but if += can just instantiate a new Fixnum and set it, why not do the same for ++?
Is consistency in assignments containing the = character the only reason for this, or am I missing something?
Here is how Matz(Yukihiro Matsumoto) explains it in an old thread:
Hi,
In message "[ruby-talk:02706] X++?"
on 00/05/10, Aleksi Niemelä <aleksi.niemela@cinnober.com> writes:
|I got an idea from http://www.pragprog.com:8080/rubyfaq/rubyfaq-5.html#ss5.3
|and thought to try. I didn't manage to make "auto(in|de)crement" working so
|could somebody help here? Does this contain some errors or is the idea
|wrong?
(1) ++ and -- are NOT reserved operator in Ruby.
(2) C's increment/decrement operators are in fact hidden assignment.
They affect variables, not objects. You cannot accomplish
assignment via method. Ruby uses +=/-= operator instead.
(3) self cannot be a target of assignment. In addition, altering
the value of integer 1 might cause severe confusion throughout
the program.
matz.
One reason is that up to now every assignment operator (i.e. an operator which changes a variable) has a = in it. If you add ++ and --, that's no longer the case.
Another reason is that the behavior of ++ and -- often confuses people. Case in point: the return value of i++ in your example would actually be 1, not 2 (the new value of i would be 2, however).
It's not conventional in OO languages. In fact, there is no ++ in Smalltalk, the language that coined the term "object-oriented programming" (and the language Ruby is most strongly influenced by). What you mean is that it's conventional in C and languages closely imitating C. Ruby does have a somewhat C-like syntax, but it isn't slavish in adhering to C traditions.
As for why it isn't in Ruby: Matz didn't want it. That's really the ultimate reason.
The reason no such thing exists in Smalltalk is because it's part of the language's overriding philosophy that assigning a variable is fundamentally a different kind of thing than sending a message to an object — it's on a different level. This thinking probably influenced Matz in designing Ruby.
It wouldn't be impossible to include it in Ruby: you could easily write a preprocessor that transforms all ++ into +=1. But evidently Matz didn't like the idea of an operator that did a "hidden assignment." It also seems a little strange to have an operator with a hidden integer operand inside of it. No other operator in the language works that way.
I think there's another reason: ++ in Ruby wouldn't be remotely useful as in C and its direct successors.
The reason being, the for keyword: while it's essential in C, it's mostly superfluous in Ruby. Most of the iteration in Ruby is done through Enumerable methods, such as each and map when iterating through some data structure, and Fixnum#times method, when you need to loop an exact number of times.
Actually, as far as I have seen, most of the time +=1 is used by people freshly migrated to Ruby from C-style languages.
In short, it's really questionable if methods ++ and -- would be used at all.
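To illustrate the point about iteration, here is a small comparison of a C-style counter loop with the Enumerable-based equivalent. The data is made up for the example:

```ruby
names = %w[alice bob carol]

# C-style counter loop, as someone coming from C might write it.
i = 0
manual = []
while i < names.length
  manual << "#{i}: #{names[i]}"
  i += 1
end

# Idiomatic Ruby: no explicit counter variable to increment.
idiomatic = names.each_with_index.map { |name, idx| "#{idx}: #{name}" }

# Looping a fixed number of times without a counter of your own.
squares = []
3.times { |n| squares << n * n }
```

Both loops build the same array; in the second form there is simply no variable left that a ++ operator could be applied to.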
You can define a .+ self-increment operator:
class Variable
  def initialize value = nil
    @value = value
  end
  attr_accessor :value
  def method_missing *args, &blk
    @value.send(*args, &blk)
  end
  def to_s
    @value.to_s
  end
  # pre-increment ".+" when x not present
  def +(x = nil)
    x ? @value + x : @value += 1
  end
  def -(x = nil)
    x ? @value - x : @value -= 1
  end
end
i = Variable.new 5
puts i #=> 5
# normal use of +
puts i + 4 #=> 9
puts i #=> 5
# incrementing
puts i.+ #=> 6
puts i #=> 6
More information on "class Variable" is available in "Class Variable to increment Fixnum objects".
I think Matz' reasoning for not liking them is that it actually replaces the variable with a new one.
ex:
a = SomeClass.new
def a.go
'hello'
end
# at this point, you can call a.go
# but if you did an a++
# that really means a = a + 1
# so you can no longer call a.go
# as you have lost your original
Now if somebody could convince him that it should just call #succ! or what not, that would make more sense, and avoid the problem. You can suggest it on ruby core.
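The effect described in the comments above can be reproduced with +=, which performs the same kind of rebinding that a hypothetical a++ would. A minimal sketch using a String in place of SomeClass:

```ruby
a = String.new("hello")

# Singleton method defined on this one object only.
def a.go
  "going"
end

original = a     # keep a second reference to the same object
a += " world"    # same as a = a + " world": a now names a new String

new_object_can_go = a.respond_to?(:go)         # the new object lacks go
old_object_can_go = original.respond_to?(:go)  # the old object still has it
```

The singleton method is not lost; it stays on the original object, and the variable a simply no longer refers to that object.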
And in the words of David Black from his book "The Well-Grounded Rubyist":
Some objects in Ruby are stored in variables as immediate values. These include
integers, symbols (which look like :this), and the special objects true, false, and
nil. When you assign one of these values to a variable (x = 1), the variable holds
the value itself, rather than a reference to it.
In practical terms, this doesn’t matter (and it will often be left as implied, rather than
spelled out repeatedly, in discussions of references and related topics in this book).
Ruby handles the dereferencing of object references automatically; you don’t have to
do any extra work to send a message to an object that contains, say, a reference to
a string, as opposed to an object that contains an immediate integer value.
But the immediate-value representation rule has a couple of interesting ramifications,
especially when it comes to integers. For one thing, any object that’s represented
as an immediate value is always exactly the same object, no matter how many
variables it’s assigned to. There’s only one object 100, only one object false, and
so on.
The immediate, unique nature of integer-bound variables is behind Ruby’s lack of
pre- and post-increment operators—which is to say, you can’t do this in Ruby:
x = 1
x++ # No such operator
The reason is that due to the immediate presence of 1 in x, x++ would be like 1++,
which means you’d be changing the number 1 to the number 2—and that makes
no sense.
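The uniqueness of integer objects, and the fact that += rebinds a variable instead of changing the number, can be observed with a few lines of Ruby. A small sketch:

```ruby
x = 100
y = 100

same_object = x.equal?(y)   # both variables hold the very same object
frozen      = x.frozen?     # integers cannot be mutated in place

before = x.object_id
x += 1                      # rebinds x to the object 101
rebound = x.object_id != before

y_unchanged = y             # y still refers to 100
```

Because x += 1 only changes which object x names, the object 100 that y holds is never affected.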
Couldn't this be achieved by adding a new method to the fixnum or Integer class?
$ ruby -e 'numb=1;puts numb.next'
returns 2
"Destructive" methods seem to be appended with ! to warn possible users, so adding a new method called next! would pretty much do what was requested, i.e.
$ ruby -e 'numb=1; numb.next!; puts numb'
returns 2 (since numb has been incremented)
Of course, the next! method would have to check that the object was an integer variable and not a real number, but this should be available.
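One obstacle to such a next! is that a method only receives the object, not the caller's variable, so it has no way to rebind that variable. A sketch of the problem, with bump as a hypothetical stand-in for the proposed method:

```ruby
# Hypothetical stand-in for the proposed next!: it can only rebind
# its own parameter, never the caller's variable.
def bump(n)
  n += 1
  n
end

numb = 1
returned   = bump(numb)   # the incremented value comes back as a result
after_call = numb         # the caller's variable is untouched

succ_value = numb.next    # Integer#next also just returns a new value
numb = numb.next          # an explicit assignment is what updates numb
```

This matches point (2) in the message quoted earlier: assignment cannot be accomplished via a method.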

How can you determine if a proc call is still free or bound?

I'd like to see if a proc call with the same arguments will give the same results every time. pureproc called with arguments is free, so every time I call pureproc(1,1), I'll get the same result. dirtyproc called with arguments is bound within its environment, and thus even though it has the same arity as pureproc, its output will depend on the environment.
ruby-1.9.2-p136 :001 > envx = 1
=> 1
ruby-1.9.2-p136 :003 > pureproc = Proc.new{ |a,b| a+b }
=> #
ruby-1.9.2-p136 :004 > dirtyproc = Proc.new{ |a,b| a+b+envx }
How can I programmatically determine whether a called proc or a method is free, as defined by only being bound over the variables that must be passed in? Any explanation of bindings, local variables, etc would be welcome as well.
Probably you can parse the source using some gem like sourcify, take out all the tokens, and check if there is anything that is a variable. But note that this is a different concept from the value of the proc/method call being constant. For example, if you had things like Time.now or Random.new in your code, that does not require any variable to be defined, but will still vary every time you call. Also, what would you want to be the case when a proc has envx - envx? That will remain constant, but will still affect the code in the sense that it will return an error unless envx is defined.
Hm, tricky. There's the parameters method that tells you about expected arguments (note how they are optional because you are using procs, not lambdas).
pureproc.parameters
=> [[:opt, :a], [:opt, :b]]
dirtyproc.parameters
=> [[:opt, :a], [:opt, :b]]
As for determining whether or not one of the closed-over variables is actually used to compute the return value of the proc, walking the AST comes to mind (there are gems for that), but seems cumbersome. My first idea was something like dirtyproc.instance_eval { local_variables }, but since both closures close over the same environment, that obviously doesn't get you very far...
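Why inspecting the closure does not help can be seen through Proc#binding: both procs capture the same environment, whether or not they use it. A sketch reusing the names from the question:

```ruby
envx = 1
pureproc  = Proc.new { |a, b| a + b }
dirtyproc = Proc.new { |a, b| a + b + envx }

# Identical signatures, so parameters cannot tell them apart.
same_params = pureproc.parameters == dirtyproc.parameters

# Both bindings can see envx, even though only one proc reads it.
pure_sees_envx  = pureproc.binding.local_variable_defined?(:envx)
dirty_sees_envx = dirtyproc.binding.local_variable_defined?(:envx)

first = dirtyproc.call(1, 1)
envx = 10                       # change the captured variable
second = dirtyproc.call(1, 1)   # the result follows the environment
pure_result = pureproc.call(1, 1)
```

The binding tells you what a proc could read, not what it does read, so it cannot distinguish pureproc from dirtyproc.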
The overall question though is: if you want to make sure something is pure, why not make it a proper method where you don't close over the environment in the first place?
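A method defined with def starts a fresh local scope, so it cannot accidentally depend on surrounding local variables. In this sketch, pure_add and leaky_add are hypothetical names used only for illustration:

```ruby
envx = 1

# A def body cannot see the local variable envx defined above.
def pure_add(a, b)
  a + b
end

def leaky_add(a, b)
  a + b + envx   # envx is not in scope here
end

leak_error =
  begin
    leaky_add(1, 1)
    nil
  rescue NameError => e
    e.class
  end
```

Note that this only rules out captured local variables; a method can still read constants, globals, instance variables, or the clock, so it is not a general purity guarantee.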

Enumerable::each_with_index now optionally takes arguments in Ruby 1.9. What is the significance of that, and/or what is a use case for it?

In Ruby 1.8.7 and prior, Enumerable::each_with_index did not accept any arguments. In Ruby 1.9, it will accept an arbitrary number of arguments. Documentation/code shows that it simply passes those arguments along to ::each. With the built in and standard library Enumerables, I believe passing an argument will yield an error, since the Enumerable's ::each method isn't expecting parameters.
So I would guess this is only useful in creating your own Enumerable in which you do create an ::each method that accepts arguments. What is an example where this would be useful?
Are there any other non-obvious consequences of this change?
I went through the code of some gems and found almost no uses of that feature. One that does use it is spreadsheet:
def each skip=dimensions[0], &block
  skip.upto(dimensions[1] - 1) do |idx|
    block.call row(idx)
  end
end
I don't really see that as an important change: #each is the base method for classes that mix-in module Enumerable, and methods added (map, select, ...) do not accept arguments.
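A self-contained sketch of the same idea, loosely modeled on the spreadsheet snippet: the Rows class below is hypothetical and defines an each that accepts a skip argument, which each_with_index then forwards.

```ruby
# Hypothetical Enumerable whose each accepts an optional argument.
class Rows
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(skip = 0)
    return enum_for(:each, skip) unless block_given?
    @rows.drop(skip).each { |row| yield row }
  end
end

rows = Rows.new(%w[header a b])

pairs = []
rows.each_with_index(1) { |row, idx| pairs << [row, idx] }  # 1 goes to each

# A built-in each takes no arguments, so the forwarded one is rejected.
builtin_error =
  begin
    [1, 2].each_with_index(1) { |x, idx| }
    nil
  rescue ArgumentError => e
    e.class
  end
```

The index still starts at 0 for the first yielded row; the argument only affects what each yields, not how each_with_index counts.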

How does ruby's String .hash method work?

I'm just a newbie to Ruby. I've seen a string method, (String).hash.
For example, in irb, I've tried
>> "mgpyone".hash
returns
=> 144611910
How does this method work?
The hash method is defined for all objects. See documentation:
Generates a Fixnum hash value for this
object. This function must have the
property that a.eql?(b) implies a.hash == b.hash.
The hash value is used by class Hash. Any hash value that
exceeds the capacity of a Fixnum will
be truncated before being used.
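The contract quoted above matters mostly when you want your own objects to work as Hash keys. A minimal sketch with a hypothetical Point class, in which eql? and hash are defined consistently:

```ruby
# Hypothetical value class: objects that are eql? also share a hash.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Derive the hash from the same fields that eql? compares.
  def hash
    [x, y].hash
  end
end

lookup = { Point.new(1, 2) => "found" }
hit  = lookup[Point.new(1, 2)]   # a different but eql? object finds the entry
miss = lookup[Point.new(2, 1)]
```

Hash first compares the hash values and then confirms with eql?, so defining only one of the two methods would make the lookup fail.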
So the String#hash method is implemented in C code. Roughly speaking (and over-simplified), it mixes the characters of the string into a single integer; in current Ruby versions the computation is also seeded with a value chosen at random for each process.
If you need consistent hashing output, I would recommend NOT using String#hash and instead considering Digest::MD5, which is safe in, for example, multi-instance cloud applications. You can test this, as mentioned in an earlier comment by @BenCrowell.
Run this 2x from your terminal, you will get different output each time:
ruby -e "puts 'a'.hash"
But if you run this the output will be consistent:
ruby -e "require 'digest'; puts Digest::MD5.hexdigest 'a'"
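The same comparison can be made from within a single script. Inside one process String#hash is stable for equal strings, while the MD5 digest is a fixed value that does not depend on the process at all:

```ruby
require "digest"

# Equal strings hash identically within one running process.
same_within_process = "a".hash == "a".dup.hash

# The digest depends only on the input bytes.
md5 = Digest::MD5.hexdigest("a")
```

MD5 is used here only because the answer above mentions it; it gives a stable identifier, not cryptographic security.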
