Can someone explain the difference between initializing "self" and having @variables when defining classes?
Here's an example
class Child < Parent
  def initialize(stuff)
    self.stuff = stuff
    super()
  end
end
So in this case, wouldn't I be able to replace self.stuff with @stuff? What's the difference? Also, super() just means that the Child should inherit whatever is in the Parent initialize method, right?
In general, no: self.stuff = stuff and @stuff = stuff are different. The former makes a method call to stuff= on the object, whereas the latter directly sets an instance variable. The former invokes a method which may be public (unless specifically declared private in the class), whereas the latter always sets an instance variable, which is not directly accessible from outside the object.
Usually, they look the same because it is common to define attr_accessor :stuff on classes. attr_accessor is roughly equivalent to the following:
def stuff
  @stuff
end
def stuff=(s)
  @stuff = s
end
So in that case, they are functionally identical. However, it is possible to define the public interface to allow for different results and side-effects, which would make those two "assignments" clearly different:
def stuff
  @stuff_called += 1 # Keeps track of how often this is called, a side effect (assumes @stuff_called was initialized to 0)
  return @stuff
end
def stuff=(s)
  if s.nil? # Validation, or other side effect. This is not triggered when setting the instance variable directly
    raise "Argument should not be nil"
  end
  @stuff = s
end
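To make the difference concrete, here is a minimal sketch. The Widget class and its reset! method are made up for illustration; the point is only that self.stuff = goes through the writer (and its validation) while @stuff = does not.

```ruby
# Hypothetical class, for illustration only.
class Widget
  def initialize(stuff)
    self.stuff = stuff # goes through the writer below, so validation runs
  end

  def stuff
    @stuff
  end

  def stuff=(s)
    raise ArgumentError, "Argument should not be nil" if s.nil?
    @stuff = s
  end

  def reset!
    @stuff = nil # direct assignment: skips the writer and its validation
  end
end

w = Widget.new("a")
w.reset!
w.stuff # nil, and no error was raised

begin
  w.stuff = nil # the same value through the writer is rejected
rescue ArgumentError => e
  e.message
end
```

Note that the explicit self. is required for the writer call: a bare stuff = value inside a method would just create a local variable.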
You actually can't use self.stuff = unless a stuff= writer exists, whether you create it with attr_writer (or attr_accessor) or define it by hand.
In fact, these are equivalent:
class Child
  attr_writer :stuff
end
class Child
  def stuff=(val)
    @stuff = val
  end
end
It is more common to use an attr_writer if that is the functionality you want, rather than the explicit method. But you will often use an explicit method if you want to perform extra error checking or change the way the assignment works.
To your question of when to use @stuff = and when to use self.stuff =, I would use the former if you only have simple assignments and your class is simple, and would move towards the latter if your requirements might become more complicated. There are many other reasons too, but it's more a matter of style than anything else.
Related
I'm wondering if there's a convention / best practice for how initialize should be used when building Ruby classes. I've recently built a class as follows:
class MyClass
  def initialize(file_path)
    @mapped_file = map_file(file_path)
  end
  def map_file(file_path)
    # do some processing and return the data
  end
  def run
    @mapped_file.do_something
  end
end
This uses initialize to do a lot of heavy lifting, before methods are subsequently called (all of which rely on @mapped_file).
My question is whether such processing should be handled outside of the constructor, with initialize used simply to store the instance's inputs. Would the following, for example, be preferable?
class MyClass
  def initialize(file_path)
    @file_path = file_path
  end
  def run
    mapped_file.do_something_else do
      etc_etc
    end
  end
  def mapped_file
    @mapped_file ||= map_the_file_here
  end
end
I hope this question isn't considered too opinion based, but will happily remove if it's deemed to be.
So, is there a 'correct' way to use initialize, and how would this fit with the scenarios above?
Any questions or comments, let me know.
As was mentioned in the comments, a constructor is usually used to prepare the object, not to do any actual work. More than a Ruby convention, this is a rule of thumb in almost all object-oriented languages.
What does "preparing the object" usually entail? Initializing members with default values, assigning passed arguments, calling the initializer of a super-class if such exists, etc.
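As a rough sketch of what that preparation can look like (the Parent and Child classes and their attributes are invented for this example):

```ruby
# Hypothetical classes, for illustration only.
class Parent
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Child < Parent
  attr_reader :stuff, :tags

  def initialize(name, stuff)
    super(name)    # call the superclass initializer with an explicit argument
    @stuff = stuff # assign a passed argument
    @tags = []     # initialize a member with a default value
  end
end

child = Child.new("kid", 42)
child.name # "kid", set by Parent#initialize
```

One detail worth knowing: a bare super forwards all of the current method's arguments, super() with empty parentheses passes none, and super(name) passes exactly what you list.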
In your case, this is how I would rewrite your class:
class MyClass
  def initialize(file_path)
    @file_path = file_path
  end
  def map_file
    @mapped_file ||= map_file_here(@file_path)
  end
  def run
    map_file.do_something
  end
end
Since run requires the file to be mapped, it always calls map_file first. But the internal map_file_here executes only once.
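To see the "only once" part in action, here is a small sketch. LazyMapper, the map_calls counter and the split-based stand-in for the real mapping work are all made up for illustration.

```ruby
# Hypothetical class, for illustration only.
class LazyMapper
  attr_reader :map_calls

  def initialize(file_path)
    @file_path = file_path # the constructor only stores its input
    @map_calls = 0
  end

  def map_file
    @mapped_file ||= map_file_here(@file_path)
  end

  def run
    map_file.size
  end

  private

  def map_file_here(path)
    @map_calls += 1
    path.split("/") # stand-in for the real, expensive processing
  end
end

mapper = LazyMapper.new("a/b/c")
mapper.run
mapper.run
mapper.map_calls # 1: the expensive step ran once, on the first call to run
```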
Say I have a parent class:
class Stat
  def val
    raise "method must be implemented by subclass"
  end
end
And a subclass:
class MyStat < Stat
  def val
    # performs costly calculation and returns value
  end
end
By virtue of extending the parent class, I would like the subclass to not have to worry about caching the return value of the "val" method.
There are many patterns one could employ here to this effect, and I've tried several on for size, but none of them feel right to me and I know this is a solved problem so it feels silly to waste the time and effort. How is this most commonly dealt with?
Also, it's occurred to me that I may be asking the wrong questions. Maybe I shouldn't be using inheritance at all but composition instead.
Any and all thoughts appreciated.
Edit:
Solution I went with can be summed up as follows:
class Stat
  def value
    @value ||= build_value
  end
  def build_value
    # to be implemented by subclass
  end
end
Typically I use a simple pattern regardless of the presence of inheritance:
class Parent
  def val
    @val ||= calculate_val
  end
  def calculate_val
    fail "Implementation missing"
  end
end
class Child < Parent
  def calculate_val
    # some expensive computation
  end
end
I always prefer to wrap the complex and expensive logic in its own method or methods that have no idea that their return value will be memoized. It gives you a cleaner separation of concerns; one method is for caching, one method is for computing.
It also happens to give you a nice way of overriding the logic, without overriding the caching logic.
In the simple example above, the memoized method val is pretty redundant. But the pattern also lets you memoize methods that accept arguments, or handle cases where the actual caching is less trivial, while maintaining that separation of responsibilities between caching and computing:
def is_prime(n)
  @is_prime ||= {}
  @is_prime[n] ||= compute_is_prime(n)
end
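One caveat with ||= is that it treats a cached false or nil as "missing", so a predicate like this one would be recomputed for every non-prime. A possible variation, sketched here with a made-up Primes class and a computations counter, uses Hash#fetch with a block so that false results are cached too:

```ruby
# Hypothetical class, for illustration only.
class Primes
  attr_reader :computations

  def initialize
    @computations = 0
    @is_prime = {}
  end

  # Caching only: fetch runs the block on a cache miss, so a stored
  # false is returned as-is instead of being recomputed.
  def prime?(n)
    @is_prime.fetch(n) { @is_prime[n] = compute_is_prime(n) }
  end

  private

  # Computing only: knows nothing about the cache.
  def compute_is_prime(n)
    @computations += 1
    return false if n < 2
    (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
  end
end

primes = Primes.new
primes.prime?(9) # false, computed
primes.prime?(9) # false, served from the cache
```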
If you want to keep the method names the same and not create new methods to put the logic in, then prepend modules instead of using parent/child inheritance.
module MA
  def val
    puts("module's method")
    @_val ||= super
  end
end
class CA
  def val
    puts("class's method")
    1
  end
  prepend MA
end
ca = CA.new
ca.val # will print "module's method" and "class's method". will return 1.
ca.val # will print "module's method". will return 1.
Due to the fact that Ruby doesn't support overloading (because of several trivial reasons), I am trying to find a way to 'simulate' it.
In statically typed languages, you shouldn't use instanceof (except in some particular cases, of course) to guide the application.
So, keeping this in mind, is this the correct way to overload a method in which I do care about the type of the variable? (In this case, I don't care about the number of parameters)
class User
  attr_reader :name, :car
end
class Car
  attr_reader :id, :model
end
class UserComposite
  attr_accessor :users
  # f could be a name, or a car id
  def filter(f)
    if (f.class == Car)
      filter_by_car(f)
    else
      filter_by_name(f)
    end
  end
  private
  def filter_by_name(name)
    # filtering by name...
  end
  def filter_by_car(car)
    # filtering by car id...
  end
end
There are cases where this is a good approach, and Ruby gives you the tools to deal with it.
However, your case is unclear because your example contradicts itself: if f.class == Car, then filter_by_car accepts a car, not a car id.
I'm assuming that you're actually passing instances of the class around, and if so you can do this:
# f could be a name, or a car
def filter(f)
  case f
  when Car
    filter_by_car(f)
  else
    filter_by_name(f)
  end
end
case [x] looks at each of its when [y] clauses and executes the first one for which [y] === [x] is true.
Effectively this is running Car === f. When you call #=== on a class object, it returns true if the argument is an instance of the class.
This is quite a powerful construct because different classes can define different "case equality". For example the Regexp class defines case equality to be true if the argument matches the expression, so the following works:
case "foo"
when Integer
  # Doesn't run, the string isn't an instance of Integer
when /bar/
  # Doesn't run, Regexp doesn't match
when /o+/
  # Does run
end
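Procs define case equality as well (Proc#=== calls the proc with the argument), so arbitrary checks can sit next to class and Regexp checks in one case. A small sketch, where the describe method and its return strings are made up for illustration:

```ruby
# Hypothetical helper, for illustration only.
def describe(value)
  case value
  when Integer                        then "an integer"           # Integer === value
  when /\A\d+\z/                      then "a string of digits"   # Regexp === value
  when ->(v) { v.respond_to?(:each) } then "something enumerable" # Proc === value
  else "something else"
  end
end

describe(3)    # "an integer"
describe("42") # "a string of digits"
describe([1])  # "something enumerable"
```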
Personally, I don't see a big problem in branching that way, although it would look cleaner with a case:
def filter(f)
  case f
  when Car
    filter_by_car(f)
  else
    filter_by_name(f)
  end
end
A slightly more complicated example involves replacing branching with objects (Ruby is an OOP language, after all :) ). Here we define handlers for specific formats (classes) of data and then look up those handlers by the class of the incoming data. Something along these lines:
class UserComposite
  def filter(f)
    handler(f).filter
  end
  private
  def handler(f)
    klass_name = "#{f.class}Handler"
    # const_get and const_defined? are Module methods, so call them on the class
    klass = self.class.const_get(klass_name) if self.class.const_defined?(klass_name)
    klass ||= DefaultHandler
    klass.new(f)
  end
  class CarHandler
    def initialize(data)
      @data = data
    end
    def filter
      # ...
    end
  end
  class DefaultHandler # filter by name or whatever
    def initialize(data)
      @data = data
    end
    def filter
      # ...
    end
  end
end
There could be a problem lurking in your architecture - UserComposite needs to know too much about Car and User. Suppose you need to add more types? UserComposite would gradually become bloated.
However, it's hard to give specific advice because the business logic behind filtering isn't clear (architecture should always adapt to your real-world use-cases).
Is there really a common action you need to do to both Cars and Users?
If not, don't conflate the behavior into a single UserComposite class.
If so, you should use decorators with a common interface. Roughly like this:
class Filterable
  # common public methods for filtering, to be called by UserComposite
  def filter
    filter_impl # to be implemented by subclasses
  end
end
class FilterableCar < Filterable
  def initialize(car)
    @car = car
  end
  private
  def filter_impl
    # do specific stuff with @car
  end
end
class DefaultFilterable < Filterable
  # Careful, how are you expecting this generic_obj to behave?
  # It might be better to replace the default subclass with a FilterableUser.
  def initialize(generic_obj)
    # ...
  end
  private
  def filter_impl
    # generic behavior
  end
end
Then UserComposite only needs to care that it gets passed a Filterable, and all it has to do is call filter on that object. Having the common filterable interface keeps your code predictable, and easier to refactor.
I recommend that you avoid dynamically generating the filterable subclass name, because if you ever decide to rename the subclass, it'll be much harder to find the code doing the generating.
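One way to avoid generated names is an explicit lookup table. This is only a sketch: the FILTERABLES constant, the filterable_for helper, the FilterableName class and the strings returned by filter_impl are all made up for illustration.

```ruby
# Hypothetical names throughout, for illustration only.
Car = Struct.new(:id, :model)

class Filterable
  def initialize(target)
    @target = target
  end

  def filter
    filter_impl # implemented by subclasses
  end
end

class FilterableCar < Filterable
  private

  def filter_impl
    "filter by car id #{@target.id}"
  end
end

class FilterableName < Filterable
  private

  def filter_impl
    "filter by name #{@target}"
  end
end

# Explicit mapping: every class name appears literally in the source,
# so a rename shows up in a plain text search.
FILTERABLES = { Car => FilterableCar }.freeze

def filterable_for(obj)
  FILTERABLES.fetch(obj.class, FilterableName).new(obj)
end

filterable_for(Car.new(7, "x")).filter # "filter by car id 7"
filterable_for("bob").filter           # "filter by name bob"
```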
I personally don't have anything against this, apart from the fact that it's long, but what really bothers me is the word eval.
I do a lot of stuff in JavaScript and I run from anything resembling eval like it's the devil, I also don't fancy the fact that the parameter is a string (again, probably because it's eval).
I know I could write my own method to fix the method-name-length problem, my 'method name issue' and the parameter-being-a-string thingy, but what I really want to know is: Is there a better, shorter, fancier, yet native, way of doing class_eval to extract class variables?
Side note: I know about the existence of class_variable_get() and class_variables(), but they don't really look appealing to me; horribly long, aren't they?
EDIT: Updated the question to be more specific.
Thanks!
Use class_variable_get, but only if you must
class_variable_get is the better way, other than the fact that it is not "appealing" to you. If you are reaching inside a class and breaking encapsulation, perhaps it is appropriate to have this extra barrier to indicate that you're doing something wrong.
Create accessor methods for the variables you want to access
If these are your classes, and accessing the variables doesn't break encapsulation, then you should create class accessor methods for them to make it easier and prettier:
class Foo
  def self.bar
    @@bar
  end
end
p Foo.bar
If this is your class, however, are you sure that you need class variables? If you don't understand the implications (see below), you may actually be wanting instance variables of the class itself:
class Foo
  class << self
    attr_accessor :bar
  end
end
Foo.bar = 42
p Foo.bar
The behavior of class variables
Class variables appear to newcomers like the right way to store information at a class level, mostly because of the name. They are also convenient because you can use the same syntax to read and write them whether you are in a method of the class or an instance method. However, class variables are shared between a class and all its subclasses.
For example, consider the following code:
class Rectangle
  def self.instances
    @@instances ||= []
  end
  def initialize
    (@@instances ||= []) << self
  end
end
class Square < Rectangle
  def initialize
    super
  end
end
2.times{ Rectangle.new }
p Rectangle.instances
#=> [#<Rectangle:0x25c7808>, #<Rectangle:0x25c77d8>]
Square.new
p Square.instances
#=> [#<Rectangle:0x25c7808>, #<Rectangle:0x25c77d8>, #<Square:0x25c76d0>]
Ack! Rectangles are not squares! Here's a better way to do the same thing:
class Rectangle
  def self.instances
    @instances ||= []
  end
  def initialize
    self.class.instances << self
  end
end
class Square < Rectangle
  def initialize
    super
  end
end
2.times{ Rectangle.new }
p Rectangle.instances
#=> [#<Rectangle:0x25c7808>, #<Rectangle:0x25c77d8>]
2.times{ Square.new }
p Square.instances
#=> [#<Square:0x25c76d0>, #<Square:0x25c76b8>]
By creating an instance variable and accessor methods on the class itself (which happens to be an instance of the Class class, similar to MyClass = Class.new), all instances of the class (and outsiders) have a common, clean location to read/write information that is not shared with other classes.
Note that explicitly tracking every instance created will prevent garbage collection on 'unused' instances. Use code like the above carefully.
Using class_eval in a cleaner manner
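Finally, if you're going to use class_eval, note that it also has a block form that doesn't have to parse and lex a string to evaluate it. One caveat: class variables are resolved lexically, so inside a block @@bar refers to the scope where the block was written, not to Foo. The block form works for method calls and for instance variables of the class, but for class variables you still need the string form (or class_variable_get):
Foo.class_eval('@@bar') # ugh, but it does reach Foo's class variable
Foo.class_eval{ bar } # yum: self is Foo inside the block, so this calls Foo.bar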
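For reference, a minimal sketch of two calls that read a class variable from outside the class. The Foo class and the value 42 are just placeholders for this example.

```ruby
# Hypothetical class, for illustration only.
class Foo
  @@bar = 42
end

Foo.class_variable_get(:@@bar) # reflection, explicit about what it does
Foo.class_eval('@@bar')        # string form, evaluated with Foo as its scope
```

Both return the same value; class_variable_get is longer to type but makes it obvious at the call site that encapsulation is being bypassed.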
I'm thinking in:
class X
def new()
@a = 1
end
def m( other )
@a == other.@a
end
end
x = X.new()
y = X.new()
x.m( y )
But it doesn't work.
The error message is:
syntax error, unexpected tIVAR
How can I compare two private attributes from the same class then?
There have already been several good answers to your immediate problem, but I have noticed some other pieces of your code that warrant a comment. (Most of them trivial, though.)
Here are four trivial ones, all of them related to coding style:
Indentation: you are mixing 4 spaces for indentation and 5 spaces. It is generally better to stick to just one style of indentation, and in Ruby that is generally 2 spaces.
If a method doesn't take any parameters, it is customary to leave off the parentheses in the method definition.
Likewise, if you send a message without arguments, the parentheses are left off.
No whitespace after an opening parenthesis and before a closing one, except in blocks.
Anyway, that's just the small stuff. The big stuff is this:
def new
  @a = 1
end
This does not do what you think it does! This defines an instance method called X#new and not a class method called X.new!
What you are calling here:
x = X.new
is a class method called new, which you have inherited from the Class class. So, you never call your new method, which means @a = 1 never gets executed, which means @a is never set, which means it always evaluates to nil, which means the @a of self and the @a of other will always be the same, which means m will always return true!
What you probably want to do is provide a constructor, except Ruby doesn't have constructors. Ruby only uses factory methods.
The method you really wanted to override is the instance method initialize. Now you are probably asking yourself: "why do I have to override an instance method called initialize when I'm actually calling a class method called new?"
Well, object construction in Ruby works like this: object construction is split into two phases, allocation and initialization. Allocation is done by a public class method called allocate, which is defined as an instance method of class Class and is generally never overridden. It just allocates the memory space for the object and sets up a few pointers; however, the object is not really usable at this point.
That's where the initializer comes in: it is an instance method called initialize, which sets up the object's internal state and brings it into a consistent, fully defined state which can be used by other objects.
So, in order to fully create a new object, what you need to do is this:
x = X.allocate
x.initialize
[Note: Objective-C programmers may recognize this.]
However, because it is too easy to forget to call initialize and as a general rule an object should be fully valid after construction, there is a convenience factory method called Class#new, which does all that work for you and looks something like this:
class Class
  def new(*args, &block)
    obj = allocate
    obj.initialize(*args, &block)
    return obj
  end
end
[Note: actually, initialize is private, so reflection has to be used to circumvent the access restrictions like this: obj.send(:initialize, *args, &block)]
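The two phases can be tried out by hand. This is just a sketch with a made-up Point class; in real code you would simply call Point.new.

```ruby
# Hypothetical class, for illustration only.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

raw = Point.allocate        # phase 1: memory only, initialize has not run
raw.x                       # nil, the instance variables are not set yet

raw.send(:initialize, 1, 2) # phase 2: initialize is private, hence send
raw.x                       # 1

Point.new(3, 4).x           # 3, new does both phases in one call
```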
Lastly, let me explain what's going wrong in your m method. (The others have already explained how to solve it.)
In Ruby, there is no way (note: in Ruby, "there is no way" actually translates to "there is always a way involving reflection") to access an instance variable from outside the instance. That's why it's called an instance variable after all, because it belongs to the instance. This is a legacy from Smalltalk: in Smalltalk there are no visibility restrictions, all methods are public. Thus, instance variables are the only way to do encapsulation in Smalltalk, and, after all, encapsulation is one of the pillars of OO. In Ruby, there are visibility restrictions (as we have seen above, for example), so it is not strictly necessary to hide instance variables for that reason. There is another reason, however: the Uniform Access Principle.
The UAP states that how to use a feature should be independent from how the feature is implemented. So, accessing a feature should always be the same, i.e. uniform. The reason for this is that the author of the feature is free to change how the feature works internally, without breaking the users of the feature. In other words, it's basic modularity.
This means for example that getting the size of a collection should always be the same, regardless of whether the size is stored in a variable, computed dynamically every time, lazily computed the first time and then stored in a variable, memoized or whatever. Sounds obvious, but e.g. Java gets this wrong:
obj.size # stored in a field
vs.
obj.getSize() # computed
Ruby takes the easy way out. In Ruby, there is only one way to use a feature: sending a message. Since there is only one way, access is trivially uniform.
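As a sketch of what that looks like in practice (StoredSize and ComputedSize are invented names), the caller sends the same size message whether the value is stored or computed:

```ruby
# Hypothetical classes, for illustration only.
class StoredSize
  attr_reader :size # the value is stored in an instance variable

  def initialize(items)
    @size = items.length
  end
end

class ComputedSize
  def initialize(items)
    @items = items
  end

  def size # the value is computed on every call
    @items.count
  end
end

# The caller cannot tell the two implementations apart.
[StoredSize.new([1, 2, 3]), ComputedSize.new([1, 2, 3])].map(&:size) # [3, 3]
```

Either class could switch strategies later without any change on the calling side.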
So, to make a long story short: you simply can't access another instance's instance variable. You can only interact with that instance via message sending. This means that the other object either has to provide you with a method (in this case of at least protected visibility) to access its instance variable, or you have to violate that object's encapsulation (and thus lose Uniform Access, increase coupling and risk future breakage) by using reflection (in this case instance_variable_get).
Here it is, in all its glory:
#!/usr/bin/env ruby
class X
  def initialize(a=1)
    @a = a
  end
  def m(other)
    @a == other.a
  end
  protected
  attr_reader :a
end
require 'test/unit'
class TestX < Test::Unit::TestCase
  def test_that_m_evaluates_to_true_when_passed_two_empty_xs
    x, y = X.new, X.new
    assert x.m(y)
  end
  def test_that_m_evaluates_to_true_when_passed_two_xs_with_equal_attributes
    assert X.new('foo').m(X.new('foo'))
  end
end
Or alternatively:
class X
  def m(other)
    @a == other.instance_variable_get(:@a)
  end
end
Which one of those two you choose is a matter of personal taste, I would say. The Set class in the standard library uses the reflection version, although it uses instance_eval instead:
class X
  def m(other)
    @a == other.instance_eval { @a }
  end
end
(I have no idea why. Maybe instance_variable_get simply didn't exist when Set was written. Ruby is going to be 17 years old in February, some of the stuff in the stdlib is from the very early days.)
There are several methods
Getter:
class X
  attr_reader :a
  def m( other )
    a == other.a
  end
end
instance_eval:
class X
  def m( other )
    @a == other.instance_eval { @a }
  end
end
instance_variable_get:
class X
  def m( other )
    @a == other.instance_variable_get :@a
  end
end
I don't think ruby has a concept of "friend" or "protected" access, and even "private" is easily hacked around. Using a getter creates a read-only property, and instance_eval means you have to know the name of the instance variable, so the connotation is similar.
If you don't use the instance_eval option (as @jleedev posted), and choose to use a getter method, you can still keep it protected.
If you want a protected method in Ruby, just do the following to create a getter that can only be read from objects of the same class:
class X
  def initialize
    @a = 1
  end
  def m(other)
    @a == other.a
  end
  protected
  def a
    @a
  end
end
x = X.new
y = X.new
x.m(y) # Returns true
x.a # Raises NoMethodError, because a is protected
Not sure, but this might help:
Outside of the class, it's a little bit harder:
# Doesn't work:
irb -> a.@foo
SyntaxError: compile error
(irb):9: syntax error, unexpected tIVAR
from (irb):9
# But you can access it this way:
irb -> a.instance_variable_get(:@foo)
=> []
http://whynotwiki.com/Ruby_/_Variables_and_constants#Variable_scope.2Faccessibility