In the book The Well Grounded Rubyist (excerpt), David Black talks about the "Class/Object Chicken-and-Egg Paradox". I'm having a tough time understanding the entire concept.
Can someone explain it in better/easier/analogical/other terms?
Quote (emphasis mine):
The class Class is an instance of itself; that is, it’s a Class
object. And there’s more. Remember the class Object? Well, Object
is a class... but classes are objects. So, Object is an object. And
Class is a class. And Object is a class, and Class is an object.
Which came first? How can the class Class be created unless the
class Object already exists? But how can there be a class Object
(or any other class) until there’s a class Class of which there can
be instances?
The best way to deal with this paradox, at least for now, is to ignore
it. Ruby has to do some of this chicken-or-egg stuff in order to get
the class and object system up and running—and then, the circularity
and paradoxes don’t matter. In the course of programming, you just
need to know that classes are objects, instances of the class called
Class.
(If you want to know in brief how it works, it’s like this: every
object has an internal record of what class it’s an instance of, and
the internal record inside the object Class points back to Class.)
You can see the problem in this diagram:
(source: phrogz.net)
All object instances inherit from Object. All classes are objects, and Class is a class, therefore Class is an object. However, object instances inherit from their class, and Object is an instance of the Class class, therefore Object itself gets methods from Class.
As you can see in the diagram, however, there isn't a circular lookup loop, because there are two different inheritance 'parts' to every class: the instance methods and the 'class' methods. In the end, the lookup path is sane.
N.B.: This diagram reflects Ruby 1.8, and thus does not include the core BasicObject class introduced in Ruby 1.9.
In practical terms, all you need to understand is that Object is the mother of all classes. All classes extend Object. It is this relationship that you will use in programming, understanding inheritance and so forth.
Eg; You can call hash() on any instance of any object at any time? Why? Because that function appears in the Object class, and all classes inherit that function, because all classes extend Object.
As far as the idea of Class goes, this comes up much less frequently. An object of class Class is like a blueprint, it's like having the class in your hands, without creating an instance of it. There's a little more to it, but it's a difficult one to describe without a lengthy example. Rest assured, when (if ever) the time comes to use it, you'll see it's purpose.
All this excerpt is saying is that Object has a class of type Class and Class is an object, so must extend Object. Its cyclic, but it's irrelevant. The answer is buried somewhere in the compiler.
Regarding the which-came-first criterion, there are two kinds of Ruby objects:
Built-in objects. They exist at the start of a Ruby program and can be considered to have zero creation time.
User created objects. They are created after the program starts via object creation methods (new/clone/dup, class-definition, module-definition, literal-constructs, ...). Such objects are linearly ordered by their time of creation. This order happens to inversely correspond to class-inheritance and instance-of relations.
A detailed explanation of the Ruby object model is available at www.atalon.cz.
I know that my answer comes at least 3 years late, but I have learned about Ruby quite recently and I must admit that the literature sometimes presents (in my opinion) misleading explanations, even though one is handling a very simple problem. Moreover, I am (and was) surprised by such appalling phrases as:
"The best way to deal with this paradox, at least for now, is to ignore it."
stated by the author D. Black, and quoted in the question, but this attitude seems to be pervasive. I have experienced this tendency within the physics community but I have not suspected it had also spread through the programming one. I am not stating that nobody understands what is lurking behind, but it seems at least intriguing why not providing the (indeed) very simple and precise answer, as in this case there is one, without invoking any obscure words such as "paradox" within the explanation.
This so-called "paradox" (even though it is definitely NOT such thing) comes from the (at least misleading) belief that "being an instance of (a subclass of)" should be something as "being an element of" (in, say, naive set theory), or in other terms, a class (in Ruby) is the set containing all the objects sharing some common property (for instance, under this naive interpretation the class String includes all the string objects). Even though this naive point of view (which I may call the "membership interpretation") may help understanding isolated (and rather easy) classes such as String or Symbol, it indeed PRODUCES A CLEAR CONTRADICTION with our naive understanding (and also the axiomatic one, for it contradicts Von Neumann's regularity axiom of set theory, if someone knows what I am talking about) of the membership relationship, for an object could not be an element of itself, as the previous interpretation implies when regarding the object Class.
The previously stated problem is easily avoided if one gets rid of such misleading membership interpretation with a very simply minded model as follows.
I would have guess that my following explanation is well-known by the experts, so I claim no originality at all, but perhaps it was not rephrased in the (simple) terms I am going to present it: in some sense I am completely astonished that (apparently) nobody stated in these terms from the very beginning, and my intention is just to highlight what I believe is the basic underlying structure.
Let us consider a set O of (basic) objects, which consists of all the (basic) objects Ruby has, provided with a mapping f from O to O (which is more or less the .class method), satisfying that the image of the composition of f with itself has only one element.
This latter element (or object) is denoted Class (indeed, what you know to be the class Class).
I would be tempted to write this post using LaTeX code but I am not quite sure if it will be parsed and converted into typical math expressions.
The image of the mapping f is (by definition) the set of Ruby classes (e.g. String, Object, Symbol, Class, etc).
Ruby programmers (even though if they do not know it) interpret the previous construction as follows: we say that an object "x is an instance of y" if and only if y = f(x). By the way this tells us you exactly that Class is an instance of Class (i.e. of itself).
Now, we would need much more decorations to this simple model in order to get the full Ruby hierarchy of classes and functionality (imposing the existence of some fixed members on the image of the map f, a partial order on the image of the map f in order to define subclasses to get afterwards inheritance, etc), and in particular to get the nice picture that was interestingly upvoted, but I suppose that everybody can figure this out from the primitive model I gave (I have written some pages explaining this for myself, after having read several Ruby manuals. I may provide a copy if anybody finds it useful).
Which came first: The class Class or the class Object?
Every class is an instance of the class Class. It follows that the class Object is an instance of the class Class. So you need the class Class to create the class Object. Therefore:
The class Class exists before the class Object.
The class Class is a subclass of the class Object. So you need the class Object from which the class Class can be created. Therefore:
The class Object exists before the class Class.
So statement-2 conflicts with statement-1 and so we have our chicken & egg dilemma. Or you can just accept it as a circular definition!
I have done a dig into the source code of Ruby and created this post on how to make sense of it.
Related
I have the code as follow :
class Synchronization
def initialize
end
def perform
detect_outdated_documents
update_documents
end
private
attr_reader :documents
def detect_outdated_documents
#documents = DetectOutdatedDocument.new.perform
end
def update_documents
UpdateOutdatedDocument.new(documents).perform
end
#documents is an array of Hashes I return from a method in DetectOutdatedDocument.
I then use this array of Hash to initialize the UpdateOutdatedDocument class and run the perform method.
Is something like this correct?
Or should I use associations or something else?
Ruby to UML mapping
I'm not a Ruby expert, but what I understand from your snippet given its syntax is:
There's a Ruby class Synchronization: That's one UML class
The Ruby class has 4 methods initialize, perform, detect_outdated_documents, and update_documents, the two last being private. These would be 4 UML operations.
initialize is the constructor, and since it's empty, you have not mentioned it in your UML class diagram, and that's ok.
The Ruby class has 1 instance variable #documents. In UML, that would be a property, or a role of an association end.
The Ruby class has a getter created with attr_reader. But since it is in a private section, its visibility should be -. This other answer explains how to work with getters and setters elegantly and accurately in UML (big thanks to #engineersmnky for the explanations on getters in Ruby, and for having corrected my initial misunderstanding in this regard)
I understand that SomeClass.new creates in Ruby a new object of class SomeClass.
Ruby and dynamic typing in UML
UML class diagrams are based on well-defined types/classes. You would normally indicate associations, aggregations and compositions only with known classes with whom there’s for sure a stable relation. Ruby is dynamically typed, and all what is known for sure about an instance variable is that it's of type Object, the highest generalization possible in Ruby.
Moreover, Ruby methods return the value of the latest statement/expression in its execution path. If you did not care about a return value of an object, you'd just mark it as being Object (Thanks engineersmnky for the explanation).
Additional remarks:
There is no void type in UML (see also this SO question). An UML operation that does not return anything, would just be an operation with no return type indicated.
Keep also in mind that the use of types that do not belong to the UML standard (such as Array, Hash, Object, ...) would suppose the use of a language specific UML profile.
Based on all this, and considering that an array is also an Object, your code would lead to a very simple UML diagram, with 3 classes, that are all specializations of Object, and a one-to-many association between Synchronization and Object, with the role #documents at the Object end.
Is it all what we can hope for?
The very general class diagram, may perhaps match very well the implementation. But it might not accurately represent the design.
It's your right to model in UML a design independently of the implementation. Hence, if the types of instance variables are known by design (e.g. you want it to be of some type and make sure via the initialization and the API design that the type will be enforced), you may well show this in your diagram even if it deviates from the code:
You have done some manual type inferencing to deduce the return type of the UML operations. Since all Ruby methods return something, we'd expect for all Ruby methods at least an Object return type. But it would be ok for you not to indicate any return type (the UML equivalent to void) to express taht the return value is not important.
You also have done some type inference for the instance variable (UML property): you clarify that the only value it can take is the value return by DetectOutdatedDocument.new.perform.
Your diagram indicates that the class is related to an unspecified number of DetectOutdatedDocument objects, and we guess it's becaus of the possible values of #documents. And the property is indicated as an array of objects. It's very misleading to have both on the diagram. So I recommend to remove the document property. Instead, prefer a document role at the association end on the side of DetectOutdatedDocument. This would greatly clarify for the non-Ruby-native readers why there is a second class on the diagram. :-) (It took me a while)
Now you should not use the black diamond for composition. Because documents has a public reader; so other objects could also be assigned to the same documents. Since Ruby seems to have reference semantic for objects, the copy would then refer to the same objects. That's shared aggregation (white diamond) at best. And since UML has not defined very well the aggregation semantic, you could even show a simple association.
A last remark: from the code you show, we cannot confirm that there is an aggregation between UpdateOutdatedDocument and DetectOutdatedDocument. If you are sure there is such a relationship, you may keep it. But if it's only based on the snippet you showed us, remove the aggregation relation. You could at best show a usage dependency. But normally in UML you would not show such a dependency if it is about the body of a method, since the operation could be implemented very differently without being obliged to have this dependency.
There is no relation, UML or otherwise, in the posted code. In fact, at first glance it might seem like a Synchronization has-many #documents, but the variable and its contents are never defined, initialized, or assigned.
If this is a homework assignment, you probably need to ask your instructor what the objective is, and what the correct answer should be. If it's a real-world project, you haven't done the following:
defined the collaborator objects like Document
initialized #documents in a way that's accessible to the Synchronization class
allowed your class method to accept any dependency injections
Without at least one of the items listed, your UML diagram doesn't really fit the posted code.
Problem:
We know that Java doesn’t allow to extend multiple classes because it would result in the Diamond Problem where the compiler could’t decide which superclass method to use. With interface default methods the Diamond Problem were introduction in Java 8. That is, because if a class implements two interfaces, each defining the same default method, and the implementing class doesn’t override the common default method, the compiler couldn’t decide which implementation to chose.
Solution:
Java 8 requires to provide an implementation for default methods implemented by more than one interface. So if a class would implement both interfaces mentioned above, it would have to provide an implementation for the common default method. Otherwise the compiler would throw a compile time error.
Question:
Why is this solution not applicable for multiple class inheritance, by overriding common methods introduced by child class?
You didn’t understand the Diamond Problem correctly (and granted, the current state of the Wikipedia article doesn’t explain it sufficiently). As shown in this graphic,
the diamond problem occurs, when the same class is inherited multiple times through different inheritance paths. This isn’t a problem for interfaces (and never was), as they only define a contract and specifying the same contract multiple times makes no difference.
The main problem is not associated with the methods but the data of that super type. Should the instance state of A exist once or twice in that case? If once, C and B can have different, conflicting constraints on A’s instance state. Both classes might also assume to have full control over A’s state, i.e. not consider that other class having the same access level. If having two different A states, a widening conversion of a D reference to an A reference becomes ambiguous, as either A could be meant.
Interfaces don’t have these problems, as they do not carry instance data at all. They also have (almost) no accessibility issues as their methods are always public. Allowing default methods, doesn’t change this, as default methods still don’t access instance variables but operate with the interface methods only.
Of course, there is the possibility that B and C declared default methods with identical signature, causing an ambiguity that has to be resolved in D. But this is even the case, when there is no A, i.e. no “diamond” at all. So this scenario is not a correct example of the “Diamond Problem”.
Methods introduced by interfaces may always be overriden, while methods introduced by classes could be final. This is one reason why you potentially couldn't apply the same strategy for classes as you could for interfaces.
The conflict described as "diamond problem" can best be illustrated using a polymorphic call to method A.m() where the runtime type of the receiver has type D: Imagine D inherits two different methods both claiming to play the role of A.m() (one of them could be the original method A.m(), at least one of them is an override). Now, dynamic dispatch cannot decide which of the conflicting methods to invoke.
Aside: the distinction betwee the "diamond problem" and regular name clashes is particularly relevant in languages like Eiffel, where the conflict could be locally resolved for the perspective of type D, e.g., by renaming one method. This would avoid the name clash for invocations with static type D, but not for invocations with static type A.
Now, with default methods in Java 8, JLS was amended with rules that detect any such conflicts, requiring D to resolve the conflict (many different cases exist, depending on whether or not some of the types involved are classes). I.e., the diamond problem is not "solved" in Java 8, it is just avoided by rejecting any programs that would produce it.
In theory, similar rules could have been defined in Java 1 to admit multiple inheritance for classes. It's just a decision that was made early on, that the designers of Java did not want to support multiple inheritance.
The choice to admit multiple (implementation) inheritance for default methods but not for class methods is a purely pragmatic choice, not necessitated by any theory.
I've been reading Sandi Metz's Practical Object-Oriented Design in Ruby and many sites online discussing design in Ruby. Something I've had a hard time fully understanding is the proper way to implement dependency injection.
The internet is flooded with blog posts that explain how dependency injection works in what I think is a very partial way.
I understand that this is supposed to be bad:
class ThisClass
def initialize
#another_class = AnotherClass.new
end
end
While this is a solution:
class ThisClass
def initialize(another_class)
#another_class = another_class
end
end
And that I could send the AnotherClass.new like this:
this_class = ThisClass.new(AnotherClass.new)
That is the approach that Sandi Metz recommends at least. What I don't understand is where should a line like that go? It has to go somewhere and generally in examples of this what's shown is a line like that being placed totally outside of any class, method, or module as if I'm simply entering it all by hand in IRB for testing purposes.
This post (among others) suggests this different approach:
class ThisClass
def another_class
#another_class ||= AnotherClass.new
end
end
Jamis Buck would take a similar approach like this:
class AnotherClass
end
class ThisClass
def another_class_factory(class_name = AnotherClass)
class_name.new
end
end
However, these two examples both preserve AnotherClass's name inside ThisClass, which Sandi Metz says is one of the main things we're trying to avoid.
So what is the best practice for doing this? Should I make a 'dependency' module filled with methods that are factories for objects of each class in my application?
Something I've had a hard time fully understanding is the proper way to implement dependency injection.
I think the best definition of a "proper" implementation is one that adheres to the SOLID principles of object oriented design. In this case mostly the Dependency Inversion Principle.
In this regard, this is the only presented solution that does not violate the DIP(1):
class ThisClass
def initialize(another_class)
#another_class = another_class
end
end
In all other cases, ThisClass has a hard dependency on AnotherClass, and can not function without it. Furthermore, if we wish to replace AnotherClass with a third, we need to modify ThisClass, which is a violation of the Open Closed Principle.
Of course, in the example above, naming the parameter and instance variable another_class is not ideal, since we do not now (and do not need to know) what object is passed to us, as long as it responds to the expected interface. This is the beauty of polymorphism.
Consider the below example, taken from this ThoughtBot video on DIP:
class Copier
def initialize(reader, writer)
#reader = reader
#writer = writer
end
def copy
#writer.write(#reader.read_until_eof)
end
end
Here you can pass any reader and writer objects that respond to read_until_eof and write respectively. This gives you full freedom to compose your business logic using different pairs of read and write implementations, even at runtime:
Copier.new(KeyboardReader.new, Printer.new)
Copier.new(KeyboardReader.new, NetworkPrinter.new)
Which brings us to your next question.
It has to go somewhere and generally in examples of this what's shown is a line like that being placed totally outside of any class, method, or module [...]
You are correct. While object thinking involves modelling the domain with well isolated, decoupled, and composable objects, you will still need to define how these objects interact, in order to implement any business logic. After all, having composable objects is no good unless we compose them.
The analogy that is often made here is to think of your objects as actors. You are the director, and you still need to create a script(2) for the actors to know how to interact with each other.
That is, you need an entry point into your application. A place where the script starts. This might itself be an object--normally an abstract one. In a command line application, it can be your classic Main class, and in a Rails application it can be your controller.
This might seem strange at first, because the focus of object thinking is on modelling concrete domain objects, and a great deal of all writings on the subject is dedicated to this effort, but just remember the actor-script metaphor, and you'll be on your way.
I strongly recommend you pick up the book Object Thinking. It does a great job explaining the mindset behind object oriented design, without which knowing the language specific implementation details becomes rather futile.
(1): It is worth noting that some proponents consider storing an instance of another class in an instance variable an anti-pattern, but in Ruby, this is fairly idiomatic.
(2): I am not sure if this is the origin of the term script in programming in general, but maybe some historian can shed some light on this.
If we omit visibility modifiers(let's say all methods are public), is there any common convension to put new methods in class? I mean if I put them on bottom it's logically correct, because methods are sorted by date. If I put them on top, it's easy to see and compare what methods are added if the class is very long.
Just depends on what you and your team are comfortable with. I usually have method at the top of my class followed by fields. If there are many method that do different thing you are better off organizing them in a new class. Now without seeing any code I'm simply guessing.
No, I don't think there is.
As Kalagen said it's up to you and your team to decide it.
I would keep any methods that share similar functionality together and keep the class definition short.
The short answer in my opinion is no. Coding style might vary depending on the language you are using and the team you are working with. In addition you might have your own preference as well. I tend to add new methods close to methods that are related to it (e.g. if method1 calls method2 then method1 is above method2). Then it is relatively easy to find the method that's being called. On the other hand, most IDEs can find the method with a mouse click.
If you're using version control, you can easily see what methods were added and in which order, so sorting by date is not needed.
And as others have mentioned, keep the class small. Take a look at the Single responsibility principle. If the methods you are adding are not related to the responsibility of the class, extract them and create a new class.
Warning acronym overload approaching!!! I'm doing TDD and DDD with an MVP passive view pattern and DI. I'm finding myself adding dependency after dependency to the constructor of my presenter class as I write each new test. Most are domain objects. I'm using factories for dependency injection though I will likely be moving to an IoC container eventually.
When using constructor injection (as apposed to property injection) its easy to see where your dependencies are. A large number of dependencies is usually an indicator that a class has too much responsibility but in the case of a presenter, I fail to see how to avoid this.
I've thought of wrapping all the domain objects into a single "Domain" class which would act as a middle man but I have this gut feeling that I'd only be moving the problem instead of fixing it.
Am I missing something or is this unavoidable?
Often a large number of arguments to a method (constructor, function, etc) is a code smell. It can be hard to understand what all the arguments are. This is especially the case if you have large numbers of arguments of the same type. It is very easy for them to get confused which can introduce subtle bugs.
The refactoring is called "Introduce Parameter Object". Whether that's really a domain object or not, it is basically a data transfer object that minimizes the number of parameters passed to a method and gives them a bit more context.
I only use DI on the Constructor if I need something to be there from the start. Otherwise I use properties and have lazy loading for the other items. For TDD/DI as long as you can inject the item when you need it you don't need to add it to your constructor.
I recommend always following the Law of Demeter and not following the DI myth of everything needs to be in the constructor. Misko Hevery (Agile Coach at Google) describes it well on his blog http://misko.hevery.com/2008/10/21/dependency-injection-myth-reference-passing/
Having a Layer Supertype might not be a bad idea, but I think your code smell might be indicating something else. Geofflane mentioned the refactor pattern, Introduce Parameter Object. While it's a good pattern for this sort of problem, I'm not entirely sure it's the way to go for this situation.
Question: Why are you passing in Domain Model objects to the constructor?
There is such a thing as having too much abstraction. If there's one solid layer of code you should be able to trust, it's your Domain Model. You don't need to reference 3 IEntity objects when you're dealing with Customer, Vendor, and Product classes if those are part of your basic Domain Model and you don't necessarily need polymorphism.
My advice: Pass in application and domain services. Trust your Domain Model.
EDIT:
Re-reading the problem when it's not horribly late at night, I realize your "Domain class" is already the Introduce Parameter Object refactoring and not, in fact, a Layer Supertype, as I thought at 3AM.
I also realize that perhaps you need to reference the Model objects in the application code, outside the Presenter. Perhaps you're doing some initial setup of your Model objects before passing them in to the Presenter. If this is the case, your "Domain class" idea might be best. If there is some initial setup, when moving to an IoC, you'll want to look at something like Factory Support in Castle Windsor. (Other IoC containers have similar concepts.)