Best practice for avoiding mutating parameters? - ruby

Can someone suggest a good, ruby-idiomatic, way of avoiding this?
class Foo
  attr_accessor :bar
end
a = {one: 1}
x = Foo.new; x.bar = a
x.bar[:two] = 2
p a #=> {one: 1, two: 2}
I could simply not allow the users of a class to access its attributes, which solves the problem...in this case. (What about passing parameters to a method?) Anyway, avoiding everything but attr_reader, and using that only on non-mutable attributes, doesn't seem very Ruby-ish.
Or, I can just not write any code which mutates values, which appeals, but is not exactly easy to do in Ruby.
I could systematically dup or clone every parameter my class is given -- except that those methods don't work on NilClass, Fixnums, Symbols, etc. -- and worse, respond_to?(:dup) == true for those types. (Also, neither dup nor clone does a deep copy.)
In the example above I modify the bar attribute in the caller, but the problem remains the same if the code is in the class, or if I use a method on the class instead of attr_accessor : If I want a class which can accept a value and do something with it, and if for some reason I have to do that by mutating that value somewhere -- is there an idiomatic way in ruby to ensure that I don't infect the caller with that mutated value?
In Ruby we are supposed not to care about the type of the incoming data very much, but it looks as if I have to care about it quite a lot in order to tell how to make this value I want to mutate safe. If it's a NullObject or a Fixnum or a Symbol it's fine, otherwise I can dup it ... unless I need to deep copy it.
That can't be right, can it?
Edit: After Some More Thought
Sergio is of course right -- sometimes you want this behaviour. Not because using the side effect in your code is a good idea, but because sometimes the class you are passing a message to needs a live reference to an object that might change afterwards.
The only time this behaviour is going to be problematic is when you are passing an Enumerable. If I pass an Array or a Hash, I really don't want the receiver to modify that. So my takeaway is:
Do what Sergio said and code defensively whenever I pass stuff to a receiver, just in case the person who coded it hasn't been careful.
Implement a blanket rule in my own classes: dup all incoming Enumerables.

It is the responsibility of the caller to shield itself from the code being called. Let's say you have some command line option parsing code. You have a hash of parameters and you want to do some validation (or something). Now, the validating code was written by some other guy who likes to do things in-place for "efficiency". So it is likely that your hash will be mutated and you won't be able to use it later.
Solution? Pass a copy.
validate_params! Marshal.load(Marshal.dump(params)) # deep copy
Note that in some cases mutation is desirable. So it must be the caller who controls the effect (allows or prevents it).
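A minimal sketch of the difference between a shallow dup and the Marshal round-trip, assuming a made-up validate_params! that edits its argument in place (the method body and the params keys are hypothetical, not from the original code). Keep in mind that Marshal cannot dump everything (procs and IO objects, for example), so this only suits plain data.

```ruby
# Hypothetical stand-in for third-party code that edits its argument in place.
def validate_params!(params)
  params[:paths].map!(&:strip)
  params[:verbose] = !!params[:verbose]
  params
end

params = { verbose: nil, paths: [" a ", "b "] }

# A shallow dup copies the hash but still shares the nested array.
shallow = params.dup
shares_nested = shallow[:paths].equal?(params[:paths])

# Marshal round-trip: nested objects are rebuilt, so nothing is shared.
deep = Marshal.load(Marshal.dump(params))
validate_params!(deep)

p shares_nested
p params
p deep
```

After this runs, params is untouched while deep carries the validator's changes.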

I would consider using freeze:
class Foo
  attr_reader :bar
  def bar=(value)
    @bar = value.freeze # You may need to freeze nested values too
  end
end
a = { one: 1 }
x = Foo.new
x.bar = a
x.bar[:two] = 2
# raises: can't modify frozen Hash
Or if you prefer to not change Foo, freeze the value when assigning:
class Foo
  attr_accessor :bar
end
a = {one: 1}
x = Foo.new
x.bar = a.freeze
x.bar[:two] = 2
# raises: can't modify frozen Hash
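Since freeze is shallow, nested hashes and arrays stay mutable. Here is a rough sketch of a recursive helper; deep_freeze is a hypothetical name of my own, not a core Ruby method, and it only walks Hash and Array. Note that it freezes the caller's object too, which is itself a side effect.

```ruby
# Hypothetical helper: freezes a Hash/Array structure all the way down.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each do |key, value|
      deep_freeze(key)
      deep_freeze(value)
    end
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

settings = deep_freeze({ one: 1, nested: { list: [1, 2] } })

p settings.frozen?
p settings[:nested][:list].frozen?
```

With this in place, settings[:nested][:list] << 3 raises instead of silently changing the caller's data.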

Related

Ruby semantics for accepting an object or its id as an argument

I'm trying to work on the principle of least surprise here...
Let's say you've got a method that accepts two objects. The method needs these to be object instances, but in the place where you initialize the class you may only have reference IDs. This would be common in a router / controller in a web service, for example. The setup might look something like this:
post "/:foo_id/add_bar/:bar_id" do
  AddFooToBar.call(...)
end
There are many different ways that this could be solved. To me the most 'idiomatic' here is something like this:
def AddFooToBar.call(foo: nil, foo_id: nil, bar: nil, bar_id: nil)
  @foo = foo || Foo[foo_id]
  @bar = bar || Bar[bar_id]
  ...
end
Then when you call the method, you could call it like:
AddFooToBar.call(foo: a_foo, bar: a_bar)
AddFooToBar.call(foo_id: 1, bar_id: 2)
This creates a pretty clear interface, but the implementation is a little verbose, particularly if there are more than 2 objects and their names are longer than foo and bar.
You could use a good old fashioned hash instead...
def AddFooToBar.call(input={})
  @foo = input[:foo] || Foo[ input[:foo_id] ]
  @bar = input[:bar] || Bar[ input[:bar_id] ]
end
The method signature is super simple now, but it loses a lot of clarity compared to what you get using keyword arguments.
You could just use a single key instead, especially if both inputs are required:
def AddFooToBar.call(foo:, bar:)
  @foo = foo.is_a?(Foo) ? foo : Foo[foo]
  @bar = bar.is_a?(Bar) ? bar : Bar[bar]
end
The method signature is simple, though it's a little weird to pass just an ID using the same argument name you'd pass an object instance to. The lookup in the method definition is also a little uglier and less easy to read.
You could just decide not to internalize this at all and require the caller to initialize instances before passing them in.
post "/:foo_id/add_bar/:bar_id" do
  foo = Foo[ params[:foo_id] ]
  bar = Bar[ params[:bar_id] ]
  AddFooToBar.call(foo: foo, bar: bar)
end
This is quite clear, but it means that every place that calls the method needs to know how to initialize the required objects first, rather than having the option to encapsulate that behavior in the method that needs the objects.
Lastly, you could do the inverse, and only allow object ids to be passed in, ensuring the objects will be looked up in the method. This may cause double lookups though, in case you sometimes have instances already existing that you want to pass in. It's also harder to test since you can't just inject a mock.
I feel like this is a pretty common issue in Ruby, particularly when building web services, but I haven't been able to find much writing about it. So my questions are:
Which of the above approaches (or something else) would you expect as more conventional Ruby? (POLS)
Are there any other gotchas or concerns around one of the approaches above that I didn't list which should influence which one works best, or experiences you've had that led you to choose one option over the others?
Thanks!
I would go with allowing either the objects or the ids interchangeably. However, I would not do it the way you did:
def AddFooToBar.call(foo:, bar:)
  @foo = foo.is_a?(Foo) ? foo : Foo[foo]
  @bar = bar.is_a?(Bar) ? bar : Bar[foo]
end
In fact, I do not understand why you have Bar[foo] and not Bar[bar]. But besides this, I would put the conditions built-in within the [] method:
def Foo.[] arg
  case arg
  when Foo then arg
  else ...what_you_originally_had...
  end
end
Then, I would have the method in question to be defined like:
def AddFooToBar.call foo:, bar:
  @foo, @bar = Foo[foo], Bar[bar]
end
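To make the idea concrete, here is a self-contained sketch in which Foo[] accepts either an instance or an id. The in-memory REGISTRY is purely an assumption standing in for whatever lookup Foo[] originally did (a database query, most likely).

```ruby
class Foo
  # Hypothetical in-memory store standing in for the real lookup.
  REGISTRY = {}

  attr_reader :id

  def initialize(id)
    @id = id
    REGISTRY[id] = self
  end

  # Returns the argument unchanged if it is already a Foo,
  # otherwise treats it as an id and looks it up.
  def self.[](arg)
    case arg
    when Foo then arg
    else REGISTRY.fetch(arg)
    end
  end
end

foo = Foo.new(1)
p Foo[foo].equal?(foo)
p Foo[1].equal?(foo)
```

The caller never has to say which form it is passing, and a test can still inject a ready-made instance.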

Is there a way to safely override Module#=== for a given type?

In Rails (version 3.2 at least; I don't have 4 to be able to try there), ActiveRecord::Base#find chokes if given a SimpleDelegator, even if the object it delegates would otherwise work properly.
The reason for this is that AR::Base#find passes values into AR::ConnectionAdapters::Quoting#quote while creating an SQL statement, and since it doesn't know what to do with SimpleDelegator, it tries to pass it to YAML.dump, which raises an exception. AR determines how to quote by a case statement of classes (ie. String === value, etc.).
Now, of course, even if a SimpleDelegator contains a String, its class is SimpleDelegator, so the above check will fail. However, SimpleDelegator has a __getobj__ method, which provides access to the actual object being delegated to:
> s = SimpleDelegator.new("test")
#=> "test"
> String === s
#=> false
> String === s.__getobj__
#=> true
In order to get around this problem, I could override Class#=== to take SimpleDelegator into account:
class Class
  def ===(other)
    return super(other.__getobj__) if other.is_a?(SimpleDelegator)
    super
  end
end
> String === s
#=> true
However, this clearly does not look like a safe way to go about doing this (I don't know if this will negatively impact anything, but at the least, class equality to SimpleDelegator will be broken). On the other hand, this makes it easier to handle other instances of code like that in AR::ConnectionAdapters::Quoting#quote which I'm not yet aware of (as opposed to specifically monkey patching quote to know about SimpleDelegator, for instance).
Module#=== is a native C method in MRI, and makes use of a method called rb_obj_is_kind_of. I had hoped that overriding SimpleDelegator#kind_of? might allow me to do what I want to here in a safer manner, but it seemed to have no impact (I guess rb_obj_is_kind_of doesn't really have anything to do with Object#kind_of?).
Is there any way to do this in a "safe" manner, or am I just stuck monkey patching individual cases as they come up?
It looks like you can hack around this and cheat AR. It tries to YAML.dump? Well, we’ll help it:
require 'delegate' # provides SimpleDelegator
class A < SimpleDelegator
  def initialize *args
    @s = 'voilá'
  end
  # required by YAML (Psych)
  def encode_with coder
    coder.tag = nil
    coder.represent_scalar(nil, @s)
  end
end
require 'yaml'
puts YAML.dump(A.new)
# ⇒ --- voilá
Now AR should be able to dump it, and voilá. The encode_with method could be put into a module, so you can include it wherever you need to cheat AR:
module DelegateYamler
  def encode_with coder
    coder.tag = nil
    coder.represent_scalar(nil, __getobj__)
  end
end
Hope it helps.
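Another option that avoids patching anything is to unwrap the delegator at the call site before handing the value to AR. A small sketch, assuming a helper I am calling unwrap (a hypothetical name, not part of the delegate library):

```ruby
require 'delegate'

# Hypothetical helper: peels off any SimpleDelegator layers.
def unwrap(value)
  value = value.__getobj__ while SimpleDelegator === value
  value
end

s = SimpleDelegator.new("test")

p String === s
p String === unwrap(s)
p String === unwrap(SimpleDelegator.new(s))
```

This leaves Module#=== alone; the cost is that every call site has to remember to unwrap.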

Can't modify self in ruby for integer

I'm looking for a way in ruby to chain a destructive method to change the value of a variable by one, but I'm getting errors saying Can't change the value of self. Is this something not possible in Ruby?
guesses_left = 3
class Integer
  def decrement_guess_count!
    self -= 1
  end
end
guesses_left.decrement_guess_count!
That's by design. It's not specific to integers; all classes behave like that. For some classes (String, for example) you can change the state of an instance (this is called a destructive operation), but you can't completely replace the object. For integers you can't even change state: they don't have any.
If we were to allow such a thing, it would raise a ton of hard questions. Say foo references bar1, which we're replacing with bar2. Should foo keep pointing to bar1? Why? Why should it not? What if bar2 has a completely different type: how should users of bar1 react to this? And so on.
class Foo
  def try_mutate_into another
    self = another
  end
end
f1 = Foo.new
f2 = Foo.new
f1.try_mutate_into f2
# ~> -:3: Can't change the value of self
# ~> self = another
# ~> ^
I challenge you to find a language where this operation is possible. :)
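If a chainable "destructive" call is what you are after, one way is to keep the integer inside an object that does have state and rebind the instance variable there. A minimal sketch; GuessCounter and its method names are made up for illustration:

```ruby
# Hypothetical wrapper: the object is mutable, the integer inside is not.
class GuessCounter
  attr_reader :guesses_left

  def initialize(guesses_left)
    @guesses_left = guesses_left
  end

  # Rebinds the instance variable and returns self so calls can be chained.
  def decrement!
    @guesses_left -= 1
    self
  end
end

counter = GuessCounter.new(3)
counter.decrement!.decrement!
p counter.guesses_left
```

For a plain local variable, guesses_left -= 1 does the job; it rebinds the variable rather than changing the integer.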

Ruby: Automatically set instance variable as method argument?

Are there any plans to implement ruby behavior similar to the CoffeeScript feature of specifying an instance variable name in a method argument list?
Like
class User
  def initialize(@name, age)
    # @name is set implicitly, but @age isn't.
    # the local variable "age" will be set, just like it currently works.
  end
end
I'm aware of this question: in Ruby can I automatically populate instance variables somehow in the initialize method? , but all the solutions (including my own) don't seem to fit the ruby simplicity philosophy.
And, would there be any downsides for having this behavior?
UPDATE
One of the reasons for this is the DRY (don't repeat yourself) philosophy of the ruby community. I often find myself needing to repeat the name of an argument variable because I want it to be assigned to the instance variable of the same name.
def initialize(name)
  # not DRY
  @name = name
end
One downside I can think of is that it may look as though a method is doing nothing if it has no body. If you're scanning quickly, this may look like a no-op. But I think given time, we can adapt.
Another downside: if you're setting other instance variables in the body, and you try to be readable by putting all the assignments at the beginning, it can take more cognitive "power" to see that there are assignments also happening in the argument list. But I don't think this is any harder than, say, seeing a constant or method call and having to jump to its definition.
# notice: instance var assignments are happening in 2 places!
def initialize(@name)
  @errors = []
end
After some pondering, I wondered if it's possible to actually get the argument names from a ruby method. If so, I could use a special argument prefix like "iv_" to indicate which args should be set as instance variables.
And it is possible: How to get argument names using reflection.
Yes! So I can maybe write a module to handle this for me. Then I got stuck because if I call the module's helper method, it doesn't know the values of the arguments because they're local to the caller. Ah, but ruby has Binding objects.
Here's the module (ruby 1.9 only):
module InstanceVarsFromArgsSlurper
  # arg_prefix must be a valid local variable name, and I strongly suggest
  # ending it with an underscore for readability of the slurped args.
  def self.enable_for(mod, arg_prefix)
    raise ArgumentError, "invalid prefix name" if arg_prefix =~ /[^a-z0-9_]/i
    mod.send(:include, self)
    mod.instance_variable_set(:@instance_vars_from_args_slurper_prefix, arg_prefix.to_s)
  end
  def slurp_args(binding)
    defined_prefix = self.class.instance_variable_get(:@instance_vars_from_args_slurper_prefix)
    method_name = caller[0][/`.*?'/][1..-2]
    param_names = method(method_name).parameters.map{|p| p.last.to_s }
    param_names.each do |pname|
      # starts with and longer than prefix
      if pname.start_with?(defined_prefix) and (pname <=> defined_prefix) == 1
        ivar_name = pname[defined_prefix.size .. -1]
        eval "@#{ivar_name} = #{pname}", binding
      end
    end
    nil
  end
end
And here's the usage:
class User
  InstanceVarsFromArgsSlurper.enable_for(self, 'iv_')
  def initialize(iv_name, age)
    slurp_args(binding) # this line does all the heavy lifting
    p [:iv_name, iv_name]
    p [:age, age]
    p [:@name, @name]
    p [:@age, @age]
  end
end
user = User.new("Methuselah", 969)
p user
Output:
[:iv_name, "Methuselah"]
[:age, 969]
[:@name, "Methuselah"]
[:@age, nil]
#<User:0x00000101089448 @name="Methuselah">
It doesn't let you have an empty method body, but it is DRY. I'm sure it can be enhanced further by merely specifying which methods should have this behavior (implemented via alias_method), rather than calling slurp_args in each method - the specification would have to be after all the methods are defined though.
Note that the module and helper method name could probably be improved. I just used the first thing that came to mind.
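For comparison, here is a rough sketch of the same prefix trick without eval or parsing caller, assuming a newer Ruby where Binding#local_variable_get is available (2.1 and later). PREFIX is a hypothetical constant mirroring the 'iv_' prefix above; this is not a drop-in replacement for the module.

```ruby
class User
  PREFIX = "iv_" # hypothetical prefix, same idea as in the module above

  def initialize(iv_name, age)
    # parameters yields pairs such as [:req, :iv_name]
    method(__method__).parameters.each do |_kind, pname|
      name = pname.to_s
      next unless name.start_with?(PREFIX) && name.size > PREFIX.size
      value = binding.local_variable_get(pname)
      instance_variable_set("@#{name[PREFIX.size..-1]}", value)
    end
  end
end

user = User.new("Methuselah", 969)
p user.instance_variable_get(:@name)
p user.instance_variable_defined?(:@age)
```

Only iv_name ends up as an instance variable; age stays a plain local, as in the original output.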
Well, actually...
class User
  define_method(:initialize) { |@name| }
end
User.new(:name).instance_variable_get :@name
# => :name
Works in 1.8.7, but not in 1.9.3. Now, just where did I learn about this...
I think you answered your own question: it does not fit the Ruby simplicity philosophy. It would add complexity to how parameters are handled in methods, and it moves the logic for managing variables up into the method parameters. Whether it makes the code less readable is a toss-up in my view, but it does strike me as rather terse.
Some scenarios the @ param would have to contend with:
def initialize( first, last, @scope, @opts = {} )
def search( @query, condition )
def ratchet( @*arg )
Should all of these scenarios be valid? Just the initialize? The @*arg seems particularly dicey in my mind. All these rules and exclusions make the Ruby language more complicated. For the benefit of auto instance variables, I do not think it would be worth it.
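For simple value-holding classes, a Struct already removes most of the repetition with no new syntax. A small sketch reusing the User example from the question (the member names are just for illustration):

```ruby
# Struct.new generates the initializer and the accessors for each member.
User = Struct.new(:name, :age)

user = User.new("Methuselah", 969)
p user.name
p user.age
```

It does not cover the case where only some arguments should be stored, which is what the question is really asking for.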

Is it possible to compare private attributes in Ruby?

I'm thinking of something like:
class X
def new()
@a = 1
end
def m( other )
@a == other.@a
end
end
x = X.new()
y = X.new()
x.m( y )
But it doesn't work.
The error message is:
syntax error, unexpected tIVAR
How can I compare two private attributes from the same class then?
There have already been several good answers to your immediate problem, but I have noticed some other pieces of your code that warrant a comment. (Most of them trivial, though.)
Here are four trivial ones, all of them related to coding style:
Indentation: you are mixing 4 spaces for indentation and 5 spaces. It is generally better to stick to just one style of indentation, and in Ruby that is generally 2 spaces.
If a method doesn't take any parameters, it is customary to leave off the parentheses in the method definition.
Likewise, if you send a message without arguments, the parentheses are left off.
No whitespace after an opening parenthesis and before a closing one, except in blocks.
Anyway, that's just the small stuff. The big stuff is this:
def new
  @a = 1
end
This does not do what you think it does! This defines an instance method called X#new and not a class method called X.new!
What you are calling here:
x = X.new
is a class method called new, which you have inherited from the Class class. So, you never call your new method, which means @a = 1 never gets executed, which means @a is always undefined, which means it will always evaluate to nil which means the @a of self and the @a of other will always be the same which means m will always be true!
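A quick way to see this for yourself; a_is_set? is a helper I added purely for the demonstration:

```ruby
class X
  def new # an instance method named new, not a constructor
    @a = 1
  end

  # Hypothetical helper, only here to inspect the object's state.
  def a_is_set?
    instance_variable_defined?(:@a)
  end
end

x = X.new             # Class#new runs; X#new is never called
before = x.a_is_set?
x.new                 # only now does the body of X#new run
after = x.a_is_set?

p before
p after
```

The instance variable only appears after the instance method is called explicitly.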
What you probably want to do is provide a constructor, except Ruby doesn't have constructors. Ruby only uses factory methods.
The method you really wanted to override is the instance method initialize. Now you are probably asking yourself: "why do I have to override an instance method called initialize when I'm actually calling a class method called new?"
Well, object construction in Ruby works like this: object construction is split into two phases, allocation and initialization. Allocation is done by a public class method called allocate, which is defined as an instance method of class Class and is generally never overridden. It just allocates the memory space for the object and sets up a few pointers; however, the object is not really usable at this point.
That's where the initializer comes in: it is an instance method called initialize, which sets up the object's internal state and brings it into a consistent, fully defined state which can be used by other objects.
So, in order to fully create a new object, what you need to do is this:
x = X.allocate
x.initialize
[Note: Objective-C programmers may recognize this.]
However, because it is too easy to forget to call initialize and as a general rule an object should be fully valid after construction, there is a convenience factory method called Class#new, which does all that work for you and looks something like this:
class Class
  def new(*args, &block)
    obj = allocate
    obj.initialize(*args, &block)
    return obj
  end
end
[Note: actually, initialize is private, so reflection has to be used to circumvent the access restrictions like this: obj.send(:initialize, *args, &block)]
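The two phases can be run by hand, which makes the note above easy to check. A small sketch with a throwaway class X:

```ruby
class X
  attr_reader :a

  def initialize(a = 1)
    @a = a
  end
end

raw = X.allocate           # phase 1: memory only, initialize has not run
a_before = raw.a           # still nil at this point
raw.send(:initialize, 42)  # phase 2: initialize is private, hence send
a_after = raw.a

p a_before
p a_after
```

This is roughly what X.new(42) does for you in one step.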
Lastly, let me explain what's going wrong in your m method. (The others have already explained how to solve it.)
In Ruby, there is no way (note: in Ruby, "there is no way" actually translates to "there is always a way involving reflection") to access an instance variable from outside the instance. That's why it's called an instance variable after all, because it belongs to the instance. This is a legacy from Smalltalk: in Smalltalk there are no visibility restrictions, all methods are public. Thus, instance variables are the only way to do encapsulation in Smalltalk, and, after all, encapsulation is one of the pillars of OO. In Ruby, there are visibility restrictions (as we have seen above, for example), so it is not strictly necessary to hide instance variables for that reason. There is another reason, however: the Uniform Access Principle.
The UAP states that how to use a feature should be independent from how the feature is implemented. So, accessing a feature should always be the same, i.e. uniform. The reason for this is that the author of the feature is free to change how the feature works internally, without breaking the users of the feature. In other words, it's basic modularity.
This means for example that getting the size of a collection should always be the same, regardless of whether the size is stored in a variable, computed dynamically every time, lazily computed the first time and then stored in a variable, memoized or whatever. Sounds obvious, but e.g. Java gets this wrong:
obj.size # stored in a field
vs.
obj.getSize() # computed
Ruby takes the easy way out. In Ruby, there is only one way to use a feature: sending a message. Since there is only one way, access is trivially uniform.
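To illustrate, here are two made-up classes, one storing its size and one computing it; the class names are hypothetical. The caller sends the same message to both and cannot tell the difference.

```ruby
# Size stored once at construction time.
class StoredBag
  attr_reader :size

  def initialize(items)
    @size = items.length
  end
end

# Size computed on every call.
class ComputedBag
  def initialize(items)
    @items = items
  end

  def size
    @items.length
  end
end

bags = [StoredBag.new([1, 2, 3]), ComputedBag.new([1, 2, 3])]
sizes = bags.map(&:size)
p sizes
```

Either class could switch strategies later without breaking any caller.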
So, to make a long story short: you simply can't access another instance's instance variable. You can only interact with that instance via message sending. Which means that the other object has to either provide you with a method (in this case at least of protected visibility) to access its instance variable, or you have to violate that object's encapsulation (and thus lose Uniform Access, increase coupling and risk future breakage) by using reflection (in this case instance_variable_get).
Here it is, in all its glory:
#!/usr/bin/env ruby
class X
  def initialize(a=1)
    @a = a
  end
  def m(other)
    @a == other.a
  end
  protected
  attr_reader :a
end
require 'test/unit'
class TestX < Test::Unit::TestCase
  def test_that_m_evaluates_to_true_when_passed_two_empty_xs
    x, y = X.new, X.new
    assert x.m(y)
  end
  def test_that_m_evaluates_to_true_when_passed_two_xs_with_equal_attributes
    assert X.new('foo').m(X.new('foo'))
  end
end
Or alternatively:
class X
  def m(other)
    @a == other.instance_variable_get(:@a)
  end
end
Which of those two you choose is a matter of personal taste, I would say. The Set class in the standard library uses the reflection version, although it uses instance_eval instead:
class X
  def m(other)
    @a == other.instance_eval { @a }
  end
end
(I have no idea why. Maybe instance_variable_get simply didn't exist when Set was written. Ruby is going to be 17 years old in February, some of the stuff in the stdlib is from the very early days.)
There are several methods
Getter:
class X
  attr_reader :a
  def m( other )
    a == other.a
  end
end
instance_eval:
class X
  def m( other )
    @a == other.instance_eval { @a }
  end
end
instance_variable_get:
class X
  def m( other )
    @a == other.instance_variable_get :@a
  end
end
I don't think ruby has a concept of "friend" or "protected" access, and even "private" is easily hacked around. Using a getter creates a read-only property, and instance_eval means you have to know the name of the instance variable, so the connotation is similar.
If you don't use the instance_eval option (as @jleedev posted) and choose to use a getter method, you can still keep it protected.
If you want a protected method in Ruby, just do the following to create a getter that can only be read from objects of the same class:
class X
  def initialize
    @a = 1
  end
  def m( other )
    @a == other.a
  end
  protected
  def a
    @a
  end
end
x = X.new()
y = X.new()
x.m( y ) # Returns true
x.a # Throws error
Not sure, but this might help:
Outside of the class, it's a little bit harder:
# Doesn't work:
irb -> a.@foo
SyntaxError: compile error
(irb):9: syntax error, unexpected tIVAR
from (irb):9
# But you can access it this way:
irb -> a.instance_variable_get(:@foo)
=> []
http://whynotwiki.com/Ruby_/_Variables_and_constants#Variable_scope.2Faccessibility
