How does Ruby handle assignment semantically? - ruby

In Ruby, we assign values to objects with the = operator.
Combine this with implicit typing and we frequently get situations like this:
myVar = :asymbol
The above line both creates a new symbol object, and binds the object to the variable name myVar.
Semantically, how is this done?
I have had it hammered into my head that the = operator is not magic syntax built into the interpreter, but is actually just syntactic sugar for the object.=(value) method.
With this in mind, my best guess is that when the interpreter sees we are trying to assign a value to an undefined variable name, it first creates a new object of some special type, like undefined or null or something, and then passes the := message to that object with the payload being the value we are trying to assign.
However, calling .class on a name that has never been assigned just throws an exception, because Ruby thinks we're trying to call a method (whose name is the name of the variable we're trying to bring into existence) on self:
> obj.class
> NameError: undefined local variable or method `obj' for main:Object
So, as far as I can tell, I have no way of figuring this out experimentally.
Side note:
In the case of symbol assignment, I believe that the value assigned ( A.K.A. the value returned by the instantiated object's object_id method, A.K.A. the value of the unsigned long VALUE variable on the C level) is a number that represents an offset in a table somewhere (I believe this is how Ruby achieves 'immediate value' for symbol objects).
In other cases, the value may be a direct encoding of the object itself, or a value that is meant to be cast to a pointer in reference to a struct.
Regardless, the way that Ruby represents the object and whether we end up assigning a reference or the object itself is not what I am asking about here.
Additional question:
What class is the = method inherited from? I can't find it in the spec for Object or BasicObject.

Variables are, in a technical sense, just pointers to objects. There's nothing remarkable about that, but a simple variable assignment to an existing object does not involve any method calls or messages being sent.
Remember, variables are just there so that programmers can refer to objects by name instead of by some kind of internal identifier or memory location. So there's a bit of "magic" here: = is special when making an assignment, as there are rules for what you can do on the left and right side of it.
The only way you can send messages to something, that is, make method calls, is if you've defined it in a way the compiler understands. x = 1 is sufficient: it means x refers to the Integer in question.
Note that the Ruby parser has to decide whether x refers to a variable or a method call. A bare x = 1 is always a local variable assignment, even when a method named x= is defined on the object in whose context it is evaluated. To call the setter you need an explicit receiver, as in self.x = 1.
For example:
class Example
  def x=(value)
    @x = value
  end
  def test
    # Local variable assignment, even though x= is a method
    x = 1
    # Also a local variable assignment
    y = 2
    # Is always a method call because self is referenced.
    self.x = 3
  end
end
# Local variable assignment at the top level
x = 4
An assignment to a bare name never looks for an x= method; the name is always treated as a local variable.
You can't have a := message because that would imply you can replace one object with another, something that's not allowed. Once an object is created, it cannot magically change type. For that you need to create a new instance of a different object. Variables only appear to change types, but in fact, they just end up pointing to different objects.
So in short, there's no := method call, but there may be special methods like :x= that work in very specific cases.
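A minimal sketch of the points above, using only core Ruby. The names greeting and other are made up for illustration. defined? is a built-in keyword that reports how the parser classifies a name without evaluating it, which gives a way to poke at this experimentally.

```ruby
# Before any assignment has been seen, the name is unknown.
before = defined?(greeting)            # nil

greeting = "hello"                     # binds the name to a String object
after = defined?(greeting)             # "local-variable"

other = greeting                       # a second name for the same object
same_object = other.equal?(greeting)   # true: one object, two names

other = :asymbol                       # rebinds `other` only; no object is modified

# No object responds to a method literally named "=".
has_assign_method = greeting.respond_to?("=")
```

After this runs, greeting still refers to the original string, and has_assign_method is false, which is why no = method shows up in Object or BasicObject.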

Related

What is an appropriate way to think of the 'self' keyword in Ruby?

Regarding the semantics of self, is it more appropriate to say:
self is a keyword that holds a reference to whatever the current receiver is.
self is the only receiver in Ruby. When you call a method or invoke a class definition, the value bound to self becomes a copy of the value bound to that object. By value, I mean unsigned long VALUE
In other words, is it ever accurate to say that myObj is the actual receiver of a message, or is it instead the case that self is the true receiver and a copy of the unsigned long VALUE variable value that is bound to the myObj variable name gets bound to the self variable name?
In the latter case, you could only ever say that self is receiving a message, but self and myObj happen to reference the same object. (self is always the current object)
In the former case, you could actually say that the current object (message receiver) changes and Ruby just updates self accordingly.
What's going on under the hood?
The reason I am concerned with this apparently arbitrary distinction is that I am trying to figure out how Ruby "passes a message".
When you "pass a message" to an object, the message name is used to determine which method definition to execute. As arguments, that method definition receives both the rest of the message and the value to be used for self.
So from the point of view of the method code, self is just another parameter. But in the Ruby source, instead of being declared in the method's formal parameter list, it's implicit.
In that sense, it's certainly not true that only self can receive messages, because the invocant of a message only becomes self after the message is received, and continues to be so only within the body of the method responding to that message.
JavaScript, like Ruby, has a predefined name (this instead of self), but other languages deal with the invocant differently. Some let you pick your own name for it, which may differ between method definitions. Maybe the invocant is just the first formal parameter, as in Perl and Python; the answer to your question may be clearer in those languages. Or, since the syntax to specify the recipient of a message is normally different from the syntax of a passed-in argument, there might likewise be a special syntax for declaring an invocant parameter, as in Go.
Ruby has the additional wrinkle that a bareword which doesn't refer to a local variable is interpreted as a message with no explicit invocant, and automatically sent to self. Other than that, self is just another local variable (which happens to be predefined and read-only).
Execution in Ruby always occurs within the scope of an instance of an object. self is a keyword which refers to the object which is the receiver of the current stack frame. At the top level, this is an instance of Object called main. Each time you pass a message to a receiver, the Ruby VM pushes a frame onto the stack which includes the receiver of that message, which it then exposes through the self keyword.
It's probably most accurate to think of self as "the receiver in the current stack frame". To that end, it's inaccurate to say that "self is the only receiver", as otherwise you could never change receivers!
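A small sketch of "the receiver in the current stack frame". The class Probe and its method current_self are hypothetical names used only for this illustration.

```ruby
class Probe
  # Returns whatever self is while this method's frame is active.
  def current_self
    self
  end
end

outer_self = self                  # self in the calling code
probe = Probe.new
inner_self = probe.current_self    # inside the call, self was `probe`
outer_again = self                 # back in the caller, self is unchanged
```

Here inner_self is the very object that probe refers to, while outer_again is the same object as outer_self: the receiver changed for the duration of the call and was restored afterwards.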

how does the assignment symbol work - Ruby

In Ruby, if I just assign a local variable:
sound = "bang"
is that a main.sound=("bang") method? If so, where and how is that method "sound=" being defined? Or how is that assignment working? If not, what is actually happening?
I know that for a setter method you would say x.sound=("bang"), and you are calling the method "sound=" on the object "x" with the argument "bang", and you are creating an instance variable "sound".
I can picture all of that, but not when you assign a variable in the "main" object. As far as I know it isn't an instance variable of the Object class... or is it? I'm so confused.
In most programming languages, Ruby included, assignment is a strange beast. It is not a method or function, what it does is associate a name (also called an lvalue since it's left of the assignment) with a value.
Ruby adds the ability to define methods with names ending in = that can be invoked using the assignment syntax.
Attribute accessors are just methods that create other methods that fetch and assign member variables of the class.
So basically there are 3 ways you see assignment:
the primitive = operator
methods with names ending in =
methods generated for you by the attribute accessor (these are methods ending in =)
A variable assignment is just creating a reference to an object, like naming a dog "Spot". The "=" is not calling any method whatsoever.
As @ZachSmith comments, a simple expression such as sound could refer to a local variable named "sound" or a method of self named "sound". To resolve this ambiguity, Ruby treats an identifier as a local variable if it has "seen" a previous assignment to the variable.
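The "seen a previous assignment" rule can be sketched like this. Dog, sound and demo are illustrative names, not anything from the question.

```ruby
class Dog
  def sound
    "from the method"
  end

  def demo
    first = sound        # no assignment seen yet: this is a method call
    sound = "bang"       # from here on, `sound` is a local variable
    second = sound       # reads the local variable
    third = sound()      # parentheses force a method call again
    [first, second, third]
  end
end

results = Dog.new.demo
# results is ["from the method", "bang", "from the method"]
```

The assignment in the middle of demo never calls any method; it only changes how the parser reads the bare name sound for the rest of that scope.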
is that a main.sound=("bang") method?
No. main.sound = "bang" would call a sound= method on main, which would typically set an instance variable or an element of one.
With a dot (main.sound) you tell the object to run a method (in this case sound).
To manage local variables, Ruby creates a new scope.
class E
  a = 42
  def give_a
    puts a
  end
  def self.give_a
    puts a
  end
  binding
end
bin_e = _ # in pry, _ holds the last result: the Binding returned by the class body
E.give_a # NameError
E.new.give_a # NameError
Neither method knows about a. After the class body has been evaluated, a soon disappears, collected by the garbage collector. However, you can still get that value using the binding method. It saves the local scope, and you can assign it to a variable.
bin_e.eval "a" # 42
Lambdas capture the scope where they were defined:
local_var_a = 42
lamb = ->{puts local_var_a}
lamb.call() # 42
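The same idea works outside pry. This is a sketch with a hypothetical module ScopeDemo; Kernel#binding and Binding#local_variable_get are core Ruby.

```ruby
module ScopeDemo
  def self.capture
    a = 42
    binding              # returns a Binding that keeps this method's locals alive
  end
end

scope = ScopeDemo.capture
value = scope.local_variable_get(:a)   # reads `a` from the saved scope
names = scope.local_variables          # includes :a
visible_here = defined?(a)             # nil: `a` is not a local in this scope
```

The local variable a is reachable only through the saved Binding; the calling scope never sees it.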

Ruby module variable accessor not working as expected

So I want a module with a variable and access methods.
My code looks something like this
module Certificates
  module Defaults
    class << self
      attr_accessor :address
      def get_defaults
        address = "something"
        make_root_cert
      end
      def make_root_cert
        blub = address
        # do something
      end
    end
  end
end
I inspected it with pry.
The result is
Certificates::Defaults has methods called address and address=.
If I call address in the get_defaults method it returns "something" as expected
If I call it in make_root_cert it returns nil
I used this way of attr_accessor creation in another module and it worked fine. I hope I'm just misunderstanding the way ruby works and somebody can explain why this example doesn't work. Maybe using the implementation details of the ruby object model.
Jeremy is right.
My findings
This seems inconsistent to me.
If you use the expression "address" and the instance variable has not been set it returns the local variable
If the instance variable has been set and the local variable not it returns the instance variable.
If both have been set it returns the local variable.
On the other hand address="test" always sets the local variable.
In your get_defaults method, address is a local variable. To use the setter, you have to type this:
self.address = "something"
That will properly call the address= method.
This rather confusing behavior occurs because the Ruby interpreter places local variable definition at a higher precedence than method calls. There is consistency here, but unless you know how it works in advance it can be hard to see clearly.
Given that so many things in Ruby are objects and method calls, it might be natural to assume that variable definition was some sort of a method called on something (like Kernel or main or the object in which it was defined or whatever) and that the resulting variable was some sort of object. If this was the case, you would guess that the interpreter would resolve name conflicts between variable definitions and other methods according to the rules of method lookup, and would only define a new variable if it didn't find a method with the same name as the potential variable definition first.
However, variable definition is not a method call, and variables are not objects. Instead, variables are just references to objects, and variable definition is something the interpreter keeps track of below the surface of the language. This is why Kernel.local_variables returns an array of symbols, and there's no way to get an array of some sort of local variable objects.
So, Ruby needs a special set of rules to handle name conflicts between variables and methods. Non-local variables have a special prefix denoting their scope ($, @, etc.) which fixes this, but not so for local variables. If Ruby required parens after methods, that would also address this problem, but we are given the luxury of not having to do that. To get the convenience of referencing local variables without a prefix and invoking methods without parens, the language just defaults to assuming you want the local variable whenever it's in scope. It could have been designed the other way, but then you would have weird situations where you defined a local variable and it was instantly eclipsed by some faraway method with the same name halfway across the program, so it's probably better like this.
The Ruby Programming Language, p. 88, has this to say:
"...local variables don't have a punctuation character as a prefix. This means that local variable references look just like method invocation expressions. If the Ruby interpreter has seen an assignment to a local variable, it knows it is a variable and not a method, and it can return the value of the variable. If there has been no assignment, then Ruby treats the expression as a method invocation. If no method by that name exists, Ruby raises a NameError."
It goes on to explain why you were getting nil when calling address in make_root_cert:
"In general, therefore, attempting to use a local variable before it has been initialized results in an error. There is one quirk--a variable comes into existence when the Ruby interpreter sees an assignment expression for that variable. This is the case even if that assignment is not actually executed. A variable that exists but has not been assigned a value is given the default value nil. For example:
a = 0.0 if false # This assignment is never executed
print a # Prints nil: the variable exists but is not assigned
print b # NameError: no variable or method named b exists"
attr_accessor defines a getter and a setter that read and write the instance variable @address, which evaluates to nil until the setter has actually been called. address = "something" in get_defaults defines a local variable within that method called address that goes out of scope at the end of the method. When you call make_root_cert, there's no local variable called address, so the getter method address that you got with attr_accessor is called and returns nil because the setter method hasn't been called to give it some other value. self.address= lets the interpreter know that you want the setter method address= instead of a new local variable, resolving the ambiguity.
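The difference between the two forms can be sketched as follows. AddressDemo and its two methods are hypothetical stand-ins for the module in the question, reduced to the part that matters.

```ruby
module AddressDemo
  class << self
    attr_accessor :address

    def assign_without_self
      address = "local only"       # local variable; address= is NOT called
      address                      # reads the local variable
    end

    def assign_with_self
      self.address = "something"   # explicit receiver: calls the setter
    end
  end
end

local_result = AddressDemo.assign_without_self   # "local only"
after_local  = AddressDemo.address               # nil: the setter never ran
AddressDemo.assign_with_self
after_setter = AddressDemo.address               # "something"
```

Only the self.address = form reaches the setter, so only then does the getter return something other than nil from another method.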

Generic way to replace an object in its own method

With strings one can do this:
a = "hello"
a.upcase!
p a #=> "HELLO"
But how would I write my own method like that?
Something like (although that doesn't work obviously):
class MyClass
  def positify!
    self = [0, self].max
  end
end
I know there are some tricks one can use on String but what if I'm trying to do something like this for Object?
Many classes are immutable (e.g. Numeric, Symbol, ...), so have no method allowing you to change their value.
On the other hand, any Object can have instance variables and these can be modified.
There is an easy way to delegate the behavior to a known object (say 42) and be able to change, later on, to another object, using SimpleDelegator. In the example below, quacks_like_an_int behaves like an Integer:
require 'delegate'
quacks_like_an_int = SimpleDelegator.new(42)
quacks_like_an_int.round(-1) # => 40
quacks_like_an_int.__setobj__(666)
quacks_like_an_int.round(-1) # => 670
You can use it to design a class too, for example:
require 'delegate'
class MutableInteger < SimpleDelegator
  def plus_plus!
    __setobj__(self + 1)
    self
  end
  def positify!
    __setobj__(0) if self < 0
    self
  end
end
i = MutableInteger.new(-42)
i.plus_plus! # => -41
i.positify! # => 0
Well, the upcase! method doesn't change the object identity, it only changes its internal structure (s.object_id == s.upcase!.object_id).
On the other hand, numbers are immutable objects and therefore, you can't change their value without changing their identity. AFAIK, there's no way for an object to self-change its identity, but, of course, you may implement positify! method that changes properties of its object - and this would be an analogue of what upcase! does for strings.
Assignment, or binding of local variables (using the = operator) is built-in to the core language and there is no way to override or customize it. You could run a preprocessor over your Ruby code which would convert your own, custom syntax to valid Ruby, though. You could also pass a Binding in to a custom method, which could redefine variables dynamically. This wouldn't achieve the effect you are looking for, though.
Understand that self = could never work, because when you say a = "string"; a = "another string" you are not modifying any objects; you are rebinding a local variable to a different object. Inside your custom method, you are in a different scope, and any local variables which you bind will only exist in that scope; it won't have any effect on the scope which you called the method from.
You cannot change self to point to anything other than its current object. You can change the object's internal state, such as its instance variables; this is what upcase! does when it changes the string's underlying characters to upper case.
As pointed out in this answer:
Ruby and modifying self for a Float instance
There is a trick mentioned there that is a workaround, which is to write your class as a wrapper around the other object. Then your wrapper class can replace the wrapped object at will. I'm hesitant to say this is a good idea, though.
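A hand-rolled sketch of the "change state, keep identity" approach described in these answers, without SimpleDelegator. The class Amount and its value attribute are invented for illustration.

```ruby
class Amount
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Bang method: replaces the wrapped value but keeps the same Amount object,
  # much as upcase! changes a string's contents without changing its identity.
  def positify!
    @value = [0, @value].max
    self
  end
end

amount = Amount.new(-5)
alias_ref = amount               # second reference to the same object
id_before = amount.object_id
amount.positify!
```

Because the object itself was modified rather than replaced, alias_ref sees the new value too, and amount.object_id is still id_before.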

Which is better? Creating an instance variable or passing around a local variable in Ruby?

In general, what is the best practice, and what are the pros and cons, of creating an instance variable that can be accessed from multiple methods versus creating a local variable that is simply passed as an argument to those methods? Functionally they are equivalent, since the methods are still able to do the work using the variable. I could see a benefit if you were updating the variable and wanted to return the updated value, but in my specific case the variable is never updated, only read by each method to decide how to operate.
Example code to be clear:
class Test
  @foo = "something"
  def self.a
    if @foo == "something"
      puts "do #{@foo}"
    end
  end
  a()
end
vs
class Test
  foo = "something"
  def self.a(foo)
    if foo == "something"
      puts "do #{foo}"
    end
  end
  a(foo)
end
I don't pass instance variables around. They are state values for the instance.
Think of them as part of the DNA of that particular object, so they'll always be part of what makes the object be what it is. If I call a method of that object, it will already know how to access its own DNA and will do it internally, not through some parameter being passed in.
If I want to apply something that is foreign to the object, then I'll have to pass it in via the parameters.
As you mentioned, this is a non-functional issue about the code. With that in mind...
It's hard to give a definitive rule about it since it depends entirely on the context. Is the variable set once and then forgotten about, or constantly updated? How many methods share the same variable? How will the code be used?
In my experience, variables that drive behavior of the object but are seldom (if at all) modified are set in the initialize method, or given to the method that will cascade behavior. Libraries and leaf methods tend to have the variable passed in, as it's likely somebody will want to call it in isolation.
I'd suggest you start by passing everything first, and then refactoring if you notice the same variable being passed around all over the class.
If I need a variable that is scoped at the instance level, I use an instance variable, set in the initialize method.
If I need a variable that is scoped at the method level (that is, a value that is passed from one method to another method) I create the variable at the method level.
So the answer to your question is "When should my variable be in scope" and I can't really answer that without seeing all of your code and knowing what you plan to do with it.
If your object behavior should be statically set in the initialization phase, I would use an instance variable.
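As a rough sketch of that split, assuming a hypothetical Greeter class: the setting that drives behavior is fixed in initialize and kept as an instance variable, while the value that is foreign to the object is passed in per call.

```ruby
class Greeter
  def initialize(mode)
    @mode = mode             # state that belongs to this instance
  end

  # `name` is foreign to the object, so it arrives as a parameter.
  def greet(name)
    if @mode == "formal"
      "Good day, #{name}."
    else
      "Hi #{name}!"
    end
  end
end

formal = Greeter.new("formal").greet("Ada")
casual = Greeter.new("casual").greet("Ada")
```

Neither greet call has to be told the mode; each instance already knows its own.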

Resources