In Ruby, what are the use cases for adding methods to an instance's singleton class? - ruby

Thanks to some other posts and reading, I understand singleton/meta classes. And I understand why we'd want to use them on a class. But I still don't understand why we'd want to use them on instance objects. And I've yet to see it in practice.
I'm referring to something like this:
class Vehicle
def odometer_reading
# some code
end
end
my_car = Vehicle.new
def my_car.open_door
# some code
end
At first thought, this seems like a bad idea as it would lead to difficulties in understanding the code and debugging.
Why would we want to do this? What are some examples of when this is a good idea?

One example is using it for testing purposes: creating mock and double objects, stubbing methods. Debugging is somewhere nearby: re-defining the logging method for a specific object that you suspect is mis-behaving, so that the log info is printed directly to console (or more info is printed) during the debug session.
Another example is dealing with special cases - instead of inheritance you can do just that. Starting from a classical example if you use two types of Employees, say, Engineers and SalesPersons, for which the rules of compensation calculation are different, you can put the common logic into the Employee class, then inherit the other two classes from it and implement their own calculate_salary methods there. Now, if there is an outlier - a star salesman that you have agreed to a different compensation scheme with, a CEO with a very special scheme, etc - instead of creating a whole sub-class for this special employee, you can just define this method for a specific object representing that employee.
The third example is dealing with an object lifecycle and performance considerations. Instead of having a long case of various states in some processing method. E.g. for a file-reading class that transparently caches the entire file in the background (I know a too-simplistic-for-real-life approach, but just as a model) all read requests while the file is not entirely read should check if the requested data is already in the cache or should be read from disk. Once the file is fully read they always go from the cache. Instead of having the if (case if there are more states) to deal with this you could simply re-define the read method at the object-level once the file is fully read to the cache. For this simple example it doesn't lead to any sizable performance benefit (if any benefit at all), but for more complex cases that may be worth it.

You wouldn't add them using def, that's a rather rigid way of doing it, but instead by using something like define_method or extend.
Although this is not the sort of thing you'd do on a routine basis, it does mean you can do some rather unusual things. ActiveRecord in Rails produces results in the form of an Array with additional methods added on to perform other operations.
An Object-Relationship Mapper would be a case where you'd probably want to do this. Sometimes, depending on how you fetch a record, the methods available differ significantly. Being able to add those dynamically means each fetched object can be completely customized even if they have the same class and general-purpose methods.

Another example: You have an array of hashes and you want each hash to have a method-call getter and setter. Something like:
user = HashOnSteroids.new(name: 'John')
user[:name] # => 'John'
user[:name] = 'Joe'
user.name # => 'Joe'
user.name = 'John'
user.set(name: 'Jim', age: 5)
This means you cannot write standard method definitions in the class as each hash will have a different set of keys (method names). This means you have to resort to defining singleton methods so each object has its own set of methods (not a pack of shared methods).
Warning: Using singleton methods for this use case is highly inefficient. A sneaky method_missing is faster and uses way less memory as it doesn't have to allocate a billion of proc objects.

Related

What is the relation in (class diagrams) between those 3 classes?

I have the code as follow :
class Synchronization
def initialize
end
def perform
detect_outdated_documents
update_documents
end
private
attr_reader :documents
def detect_outdated_documents
#documents = DetectOutdatedDocument.new.perform
end
def update_documents
UpdateOutdatedDocument.new(documents).perform
end
#documents is an array of Hashes I return from a method in DetectOutdatedDocument.
I then use this array of Hash to initialize the UpdateOutdatedDocument class and run the perform method.
Is something like this correct?
Or should I use associations or something else?
Ruby to UML mapping
I'm not a Ruby expert, but what I understand from your snippet given its syntax is:
There's a Ruby class Synchronization: That's one UML class
The Ruby class has 4 methods initialize, perform, detect_outdated_documents, and update_documents, the two last being private. These would be 4 UML operations.
initialize is the constructor, and since it's empty, you have not mentioned it in your UML class diagram, and that's ok.
The Ruby class has 1 instance variable #documents. In UML, that would be a property, or a role of an association end.
The Ruby class has a getter created with attr_reader. But since it is in a private section, its visibility should be -. This other answer explains how to work with getters and setters elegantly and accurately in UML (big thanks to #engineersmnky for the explanations on getters in Ruby, and for having corrected my initial misunderstanding in this regard)
I understand that SomeClass.new creates in Ruby a new object of class SomeClass.
Ruby and dynamic typing in UML
UML class diagrams are based on well-defined types/classes. You would normally indicate associations, aggregations and compositions only with known classes with whom there’s for sure a stable relation. Ruby is dynamically typed, and all what is known for sure about an instance variable is that it's of type Object, the highest generalization possible in Ruby.
Moreover, Ruby methods return the value of the latest statement/expression in its execution path. If you did not care about a return value of an object, you'd just mark it as being Object (Thanks engineersmnky for the explanation).
Additional remarks:
There is no void type in UML (see also this SO question). An UML operation that does not return anything, would just be an operation with no return type indicated.
Keep also in mind that the use of types that do not belong to the UML standard (such as Array, Hash, Object, ...) would suppose the use of a language specific UML profile.
Based on all this, and considering that an array is also an Object, your code would lead to a very simple UML diagram, with 3 classes, that are all specializations of Object, and a one-to-many association between Synchronization and Object, with the role #documents at the Object end.
Is it all what we can hope for?
The very general class diagram, may perhaps match very well the implementation. But it might not accurately represent the design.
It's your right to model in UML a design independently of the implementation. Hence, if the types of instance variables are known by design (e.g. you want it to be of some type and make sure via the initialization and the API design that the type will be enforced), you may well show this in your diagram even if it deviates from the code:
You have done some manual type inferencing to deduce the return type of the UML operations. Since all Ruby methods return something, we'd expect for all Ruby methods at least an Object return type. But it would be ok for you not to indicate any return type (the UML equivalent to void) to express taht the return value is not important.
You also have done some type inference for the instance variable (UML property): you clarify that the only value it can take is the value return by DetectOutdatedDocument.new.perform.
Your diagram indicates that the class is related to an unspecified number of DetectOutdatedDocument objects, and we guess it's becaus of the possible values of #documents. And the property is indicated as an array of objects. It's very misleading to have both on the diagram. So I recommend to remove the document property. Instead, prefer a document role at the association end on the side of DetectOutdatedDocument. This would greatly clarify for the non-Ruby-native readers why there is a second class on the diagram. :-) (It took me a while)
Now you should not use the black diamond for composition. Because documents has a public reader; so other objects could also be assigned to the same documents. Since Ruby seems to have reference semantic for objects, the copy would then refer to the same objects. That's shared aggregation (white diamond) at best. And since UML has not defined very well the aggregation semantic, you could even show a simple association.
A last remark: from the code you show, we cannot confirm that there is an aggregation between UpdateOutdatedDocument and DetectOutdatedDocument. If you are sure there is such a relationship, you may keep it. But if it's only based on the snippet you showed us, remove the aggregation relation. You could at best show a usage dependency. But normally in UML you would not show such a dependency if it is about the body of a method, since the operation could be implemented very differently without being obliged to have this dependency.
There is no relation, UML or otherwise, in the posted code. In fact, at first glance it might seem like a Synchronization has-many #documents, but the variable and its contents are never defined, initialized, or assigned.
If this is a homework assignment, you probably need to ask your instructor what the objective is, and what the correct answer should be. If it's a real-world project, you haven't done the following:
defined the collaborator objects like Document
initialized #documents in a way that's accessible to the Synchronization class
allowed your class method to accept any dependency injections
Without at least one of the items listed, your UML diagram doesn't really fit the posted code.

Should I use DataMapper entities only for persistence purposes?

I'm creating a non-Rails application and using DataMapper as ORM.
For entities which will be mapped to SQL tables I declare classes which include DataMapper::Resource.
The question is. Is it okay to use the instances of these classes as plain objects (pass to methods, manipulate values etc.)? Or they should be used only for persisting data (for instance in Repository classes)?
I'm new in the Ruby world and do not know the conventions.
If I have a User entity, which has methods creates, all etc., is it a good idea to create another class User, which only will store information (will have state - fields and no methods)? Analogue of POJO (Plain old java object) in Java?
I can see creating a wrapper class for a plain object list having some benefits. As you mention in the comment, if you want to store data in different ways then writing distinct classes is useful.
For typical DataMapper or ActiveRecord usage, though, I don't think it's common to create wrapper classes for plain-object lists, especially if you're not adding any methods to the collection. The main reason why it's not common is that query results in ActiveRecord or DataMapper are array-like already. Additionally, you're not really gaining any added functionality by converting your model instances to hashes. Let me show some example:
# collections are array-like
User.all.map(&:name) == User.all.to_a.map(&:name)
# converting a record to a hash doesn't add much
user = User.first
user_hash = user.attributes
user.name == user_hash[:name]
That being said, there is one caveat, and that has to do with chainable methods in the ORM:
# this is valid chaining
User.all.where(name: "max")
# this raises a NoMethodError for 'where'
User.all.to_a.where(name: "max")
where is a ORM method, not an array method. So if you convert the query result to an array you couldn't access it. For this reason, making a distinction between arrays and query collections is useful.
But how much benefit do you really get from creating an empty wrapper class?
class RecordsInMemory
def initialize(query_collection)
#list = query_collection.map(&:attributes)
end
end
records_in_memory = RecordsInMemory.new(User.all)
records_in_memory.list.map(&:name)
# versus ...
records_in_memory = User.all.map(&:attributes)
records_in_memory.map(&:name)
if you think in the long run you will add methods to the plain-object list, then you should make it into a class. But otherwise I think using clearly-named variables suffices.

Ruby: marshal and unmarshal a variable, not an instance

OK, Ruby gurus, this is a hard one to describe in the title, so bear with me for this explanation:
I'm looking to pass a string that represents a variable: not an instance, not the collection of properties that make up an object, but the actual variable: the handle to the object.
The reason for this is that I am dealing with resources that can be located on the filesystem, on the network, or in-memory. I want to create URI handler that can handle each of these in a consistent manner, so I can have schemes like eg.
file://
http://
ftp://
inmemory://
you get the idea. It's the last one that I'm trying to figure out: is there some way to get a string representation of a reference to an object in Ruby, and then use that string to create a new reference? I'm truly interested in marshalling the reference, not the object. Ideally there would be something like taking Object#object_id, which is easy enough to get, and using it to create a new variable elsewhere that refers to the same object. I'm aware that this could be really fragile and so is an unusual use case: it only works within one Ruby process for as long as there is an existing variable to keep the object from being garbage collected, but those are both true for the inmemory scheme I'm developing.
The only alternatives I can think of are:
marshal the whole object and cram it into the URI, but that won't work because the data in the object is an image buffer - very large
Create a global or singleton purgatory area to store a variable for retrieval later using e.g. a hash of object_id:variable pairs. This is a bit smelly, but would work.
Any other thoughts, StackOverflowers?
There's ObjectSpace._id2ref :
f = Foo.new #=> #<Foo:0x10036c9b8>
f.object_id #=> 2149278940
ObjectSpace._id2ref(2149278940) #=> #<Foo:0x10036c9b8>
In addition to the caveats about garbage collection ObjectSpace carries a large performance penalty in jruby (so much so that it's disabled by default)
Variables aren't objects in Ruby. You not only cannot marshal/unmarshal them, you can't do anything with them. You can only do something with objects, which variables aren't.
(It would be really nice if they were objects, though!)
You could look into MagLev which is an alternative Ruby implementation built on top of VMware's Gemstone. It has a distributes object model wiht might suit your use-case.
Objects are saved in the central Gemstne instance (with some nifty caching) and can be accessed by any number of remote worker instances. That way, any of the workers act on the same object space and can access the very same objects simultaneously. That way, you can even do things like having the global Garbage Collector running on a single Ruby instance or seamlessly moving execution at any point to different nodes (while preserving all the stack frames) using Continuations.

Philosophy Object/Properties Parameter Query

I'm looking at some code I've written and thinking "should I be passing that object into the method or just some of its properties?".
Let me explain:
This object has about 15 properties - user inputs. I then have about 10 methods that use upto 5 of these inputs. Now, the interface looks a lot cleaner, if each method has 1 parameter - the "user inputs object". But each method does not need all of these properties. I could just pass the properties that each method needs.
The fact I'm asking this question indicates I accept I may be doing things wrong.
Discuss......:)
EDIT: To add calrity:
From a web page a user enters details about their house and garden. Number of doors, number of rooms and other properties of this nature (15 in total).
These details are stored on a "HouseDetails" object as simple integer properties.
An instance of "HouseDetails" is passed into "HouseRequirementsCalculator". This class has 10 private methods like "calculate area of carpet", "caclulateExtensionPotential" etc.
For an example of my query, let's use "CalculateAreaOfCarpet" method.
should I pass the "HouseDetails" object
or should I pass "HouseDetails.MainRoomArea, HouseDetails.KitchenArea, HouseDetails.BathroomArea" etc
Based on my answer above and related to your edit:
a) You should pass the "HouseDetails"
object
Other thoughts:
Thinking more about your question and especially the added detail i'm left wondering why you would not just include those calculation methods as part of your HouseDetails object. After all, they are calculations that are specific to that object only. Why create an interface and another class to manage the calculations separately?
Older text:
Each method should and will know what part of the passed-in object it needs to reference to get its job done. You don't/shouldn't need to enforce this knowledge by creating fine-grained overloads in your interface. The passed-in object is your model and your contract.
Also, imagine how much code will be affected if you add and remove a property from this object. Keep it simple.
Passing individual properties - and different in each case - seems pretty messy. I'd rather pass whole objects.
Mind that you gave not enough insight into your situation. Perhaps try to describe the actual usage of this things? What is this object with 15 properties?, are those "10 methods that use upto 5 of these input" on the same object, or some other one?
After the question been edited
I should definitely go with passing the whole object and do the necessary calculations in the Calculator class.
On the other hand you may find Domain Driven Design an attractive alternative (http://en.wikipedia.org/wiki/Domain-driven_design). With regard to that principles you could add methods from calculator to the HouseDetails class. Domain Driven Design is quite nice style of writing apps, just depends how clean this way is for you.

OO Design: Multiple persistance design for a ruby class

I am designing a class for log entries of my mail server. I have parsed the log entries and created the class hierarchy. Now I need to save the in memory representation to the disk. I need to save it to multiple destinations like mysql and disk files. I am at a loss to find out the proper way to design the persistence mechanism. The challenges are:
How to pass persistence
initialization information like
filename, db connection parameters
passed to them. The options I can
think of are all ugly for eg:
1.1 Constructor: it becomes ugly as I
add more persistence.
1.2 Method: Object.mysql_params(" "),
again butt ugly
"Correct" method name to call each
persistance mechanism: eg:
Object.save_mysql, Object.save_file,
or Object.save (mysql) and
Object.save(file)
I am sure there is some pattern to solve this particular problem. I am using ruby as my language, with out any rails, ie pure ruby code. Any clue is much welcome.
raj
Personally I'd break things out a bit - the object representing a log entry really shouldn't be worrying about how it should save it, so I'd probably create a MySQLObjectStore, and FileObjectStore, which you can configure separately, and gets passed the object to save. You could give your Object class a class variable which contains the store type, to be called on save.
class Object
cattr_accessor :store
def save
##store.save(self)
end
end
class MySQLObjectStore
def initialize(connection_string)
# Connect to DB etc...
end
def save(obj)
# Write to database
end
end
store = MySQLObjectStore.new("user:password#localhost/database")
Object.store = store
obj = Object.new(foo)
obj.save
Unless I completely misunstood your question, I would recommend using the Strategy pattern. Instead of having this one class try to write to all of those different sources, delegate that responsibility to another class. Have a bunch of LogWriter classes, each one with the responsibility of persiting the object to a particular data store. So you might have a MySqlLogWriter, FileLogWriter, etc.
Each one of these objects can be instantiated on their own and then the persitence object can be passed to it:
lw = FileLogWriter.new "log_file.txt"
lw.Write(log)
You really should separate your concerns here. The message and the way the message is saved are two separate things. In fact, in many cases, it would also be more efficient not to open a new mysql connection or new file pointer for every message.
I would create a Saver class, extended by FileSaver and MysqlSaver, each of which have a save method, which is passed your message. The saver is responsible for pulling out the parts of the message that apply and saving them to the medium it's responsible for.

Resources