Related
I have the code as follow :
class Synchronization
def initialize
end
def perform
detect_outdated_documents
update_documents
end
private
attr_reader :documents
def detect_outdated_documents
#documents = DetectOutdatedDocument.new.perform
end
def update_documents
UpdateOutdatedDocument.new(documents).perform
end
#documents is an array of Hashes I return from a method in DetectOutdatedDocument.
I then use this array of Hash to initialize the UpdateOutdatedDocument class and run the perform method.
Is something like this correct?
Or should I use associations or something else?
Ruby to UML mapping
I'm not a Ruby expert, but what I understand from your snippet given its syntax is:
There's a Ruby class Synchronization: That's one UML class
The Ruby class has 4 methods initialize, perform, detect_outdated_documents, and update_documents, the two last being private. These would be 4 UML operations.
initialize is the constructor, and since it's empty, you have not mentioned it in your UML class diagram, and that's ok.
The Ruby class has 1 instance variable #documents. In UML, that would be a property, or a role of an association end.
The Ruby class has a getter created with attr_reader. But since it is in a private section, its visibility should be -. This other answer explains how to work with getters and setters elegantly and accurately in UML (big thanks to #engineersmnky for the explanations on getters in Ruby, and for having corrected my initial misunderstanding in this regard)
I understand that SomeClass.new creates in Ruby a new object of class SomeClass.
Ruby and dynamic typing in UML
UML class diagrams are based on well-defined types/classes. You would normally indicate associations, aggregations and compositions only with known classes with whom there’s for sure a stable relation. Ruby is dynamically typed, and all what is known for sure about an instance variable is that it's of type Object, the highest generalization possible in Ruby.
Moreover, Ruby methods return the value of the latest statement/expression in its execution path. If you did not care about a return value of an object, you'd just mark it as being Object (Thanks engineersmnky for the explanation).
Additional remarks:
There is no void type in UML (see also this SO question). An UML operation that does not return anything, would just be an operation with no return type indicated.
Keep also in mind that the use of types that do not belong to the UML standard (such as Array, Hash, Object, ...) would suppose the use of a language specific UML profile.
Based on all this, and considering that an array is also an Object, your code would lead to a very simple UML diagram, with 3 classes, that are all specializations of Object, and a one-to-many association between Synchronization and Object, with the role #documents at the Object end.
Is it all what we can hope for?
The very general class diagram, may perhaps match very well the implementation. But it might not accurately represent the design.
It's your right to model in UML a design independently of the implementation. Hence, if the types of instance variables are known by design (e.g. you want it to be of some type and make sure via the initialization and the API design that the type will be enforced), you may well show this in your diagram even if it deviates from the code:
You have done some manual type inferencing to deduce the return type of the UML operations. Since all Ruby methods return something, we'd expect for all Ruby methods at least an Object return type. But it would be ok for you not to indicate any return type (the UML equivalent to void) to express taht the return value is not important.
You also have done some type inference for the instance variable (UML property): you clarify that the only value it can take is the value return by DetectOutdatedDocument.new.perform.
Your diagram indicates that the class is related to an unspecified number of DetectOutdatedDocument objects, and we guess it's becaus of the possible values of #documents. And the property is indicated as an array of objects. It's very misleading to have both on the diagram. So I recommend to remove the document property. Instead, prefer a document role at the association end on the side of DetectOutdatedDocument. This would greatly clarify for the non-Ruby-native readers why there is a second class on the diagram. :-) (It took me a while)
Now you should not use the black diamond for composition. Because documents has a public reader; so other objects could also be assigned to the same documents. Since Ruby seems to have reference semantic for objects, the copy would then refer to the same objects. That's shared aggregation (white diamond) at best. And since UML has not defined very well the aggregation semantic, you could even show a simple association.
A last remark: from the code you show, we cannot confirm that there is an aggregation between UpdateOutdatedDocument and DetectOutdatedDocument. If you are sure there is such a relationship, you may keep it. But if it's only based on the snippet you showed us, remove the aggregation relation. You could at best show a usage dependency. But normally in UML you would not show such a dependency if it is about the body of a method, since the operation could be implemented very differently without being obliged to have this dependency.
There is no relation, UML or otherwise, in the posted code. In fact, at first glance it might seem like a Synchronization has-many #documents, but the variable and its contents are never defined, initialized, or assigned.
If this is a homework assignment, you probably need to ask your instructor what the objective is, and what the correct answer should be. If it's a real-world project, you haven't done the following:
defined the collaborator objects like Document
initialized #documents in a way that's accessible to the Synchronization class
allowed your class method to accept any dependency injections
Without at least one of the items listed, your UML diagram doesn't really fit the posted code.
Problem:
We know that Java doesn’t allow to extend multiple classes because it would result in the Diamond Problem where the compiler could’t decide which superclass method to use. With interface default methods the Diamond Problem were introduction in Java 8. That is, because if a class implements two interfaces, each defining the same default method, and the implementing class doesn’t override the common default method, the compiler couldn’t decide which implementation to chose.
Solution:
Java 8 requires to provide an implementation for default methods implemented by more than one interface. So if a class would implement both interfaces mentioned above, it would have to provide an implementation for the common default method. Otherwise the compiler would throw a compile time error.
Question:
Why is this solution not applicable for multiple class inheritance, by overriding common methods introduced by child class?
You didn’t understand the Diamond Problem correctly (and granted, the current state of the Wikipedia article doesn’t explain it sufficiently). As shown in this graphic,
the diamond problem occurs, when the same class is inherited multiple times through different inheritance paths. This isn’t a problem for interfaces (and never was), as they only define a contract and specifying the same contract multiple times makes no difference.
The main problem is not associated with the methods but the data of that super type. Should the instance state of A exist once or twice in that case? If once, C and B can have different, conflicting constraints on A’s instance state. Both classes might also assume to have full control over A’s state, i.e. not consider that other class having the same access level. If having two different A states, a widening conversion of a D reference to an A reference becomes ambiguous, as either A could be meant.
Interfaces don’t have these problems, as they do not carry instance data at all. They also have (almost) no accessibility issues as their methods are always public. Allowing default methods, doesn’t change this, as default methods still don’t access instance variables but operate with the interface methods only.
Of course, there is the possibility that B and C declared default methods with identical signature, causing an ambiguity that has to be resolved in D. But this is even the case, when there is no A, i.e. no “diamond” at all. So this scenario is not a correct example of the “Diamond Problem”.
Methods introduced by interfaces may always be overriden, while methods introduced by classes could be final. This is one reason why you potentially couldn't apply the same strategy for classes as you could for interfaces.
The conflict described as "diamond problem" can best be illustrated using a polymorphic call to method A.m() where the runtime type of the receiver has type D: Imagine D inherits two different methods both claiming to play the role of A.m() (one of them could be the original method A.m(), at least one of them is an override). Now, dynamic dispatch cannot decide which of the conflicting methods to invoke.
Aside: the distinction betwee the "diamond problem" and regular name clashes is particularly relevant in languages like Eiffel, where the conflict could be locally resolved for the perspective of type D, e.g., by renaming one method. This would avoid the name clash for invocations with static type D, but not for invocations with static type A.
Now, with default methods in Java 8, JLS was amended with rules that detect any such conflicts, requiring D to resolve the conflict (many different cases exist, depending on whether or not some of the types involved are classes). I.e., the diamond problem is not "solved" in Java 8, it is just avoided by rejecting any programs that would produce it.
In theory, similar rules could have been defined in Java 1 to admit multiple inheritance for classes. It's just a decision that was made early on, that the designers of Java did not want to support multiple inheritance.
The choice to admit multiple (implementation) inheritance for default methods but not for class methods is a purely pragmatic choice, not necessitated by any theory.
In my code I working with different types of collections and often converting one to another. I do it easily calling toList, toVector, toSet, toArray functions.
Now I am interested in performance of this operations. I find information about length, head, tail, apply performance in documentation. What actually happens when I call functions(toList, toVector, toSet, toArray) on List, Set, Array and Vector implementation in scala?
P.S. Question is only about standard scala collections which is immutable.
Well my advice would be: look yourself into the source code ! For instance, method toSet is defined as follow in the TraversableOnce trait (annotated by myself) :
def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, A, Col[A #uV]]): Col[A #uV] = {
val b = cbf() //generic way to build the collection, if it would be a List, it would create an empty List
b ++= seq // add all the elements
b.result() //transform the result to the target collection
}
So it means that the toSet method has a performance of O(N) since you traverse all the list once! I believe that all the collections inheriting this trait are using this implementation.
While playing around with the new Java 8 Stream API I got to wondering, why not:
public interface Map<K,V> extends Function<K, V>
Or even:
public interface Map<K,V> extends Function<K, V>, Predicate<K>
It would be fairly easy to implement with default methods on the Map interface:
#Override default boolean test(K k) {
return containsKey(k);
}
#Override default V apply(K k) {
return get(k);
}
And it would allow for the use of a Map in a map method:
final MyMagicMap<String, Integer> map = new MyMagicHashMap<>();
map.put("A", 1);
map.put("B", 2);
map.put("C", 3);
map.put("D", 4);
final Stream<String> strings = Arrays.stream(new String[]{"A", "B", "C", "D"});
final Stream<Integer> remapped = strings.map(map);
Or as a Predicate in a filter method.
I find that a significant proportion of my use cases for a Map are exactly that construct or a similar one - as a remapping/lookup Function.
So, why did the JDK designers not decide to add this functionality to the Map during the redesign for Java 8?
The JDK team was certainly aware of the mathematical relationship between java.util.Map as a data structure and java.util.function.Function as a mapping function. After all, Function was named Mapper in early JDK 8 prototype builds. And the stream operation that calls a function on each stream element is called Stream.map.
There was even a discussion about possibly renaming Stream.map to something else like transform because of possible confusion between a transforming function and a Map data structure. (Sorry, can't find a link.) This proposal was rejected, with the rationale being the conceptual similarity (and that map for this purpose is in common usage).
The main question is, what would be gained if java.util.Map were a subtype of java.util.function.Function? There was some discussion in comments about whether subtyping implies an "is-a" relationship. Subtyping is less about "is-a" relationships of objects -- since we're talking about interfaces, not classes -- but it does imply substitutability. So if Map were a subtype of Function, one would be able to do this:
Map<K,V> m = ... ;
source.stream().map(m).collect(...);
Right away we're confronted with baking in the behavior of what is now Function.apply to one of the existing Map methods. Probably the only sensible one is Map.get, which returns null if the key isn't present. These semantics are, frankly, kind of lousy. Real applications are probably going to have to write their own methods that supply key-missing policy anyway, so there seems to be very little advantage of being able to write
map(m)
instead of
map(m::get)
or
map(x -> m.getOrDefault(x, def))
The question is “why should it extend Function?”
Your example of using strings.map(map) doesn’t really justify the idea of changing the type inheritance (implying adding methods to the Map interface), given the little difference to strings.map(map::get). And it’s not clear whether using a Map as a Function is really that common that it should get that special treatment compared to, e.g. using map::remove as a Function or using map::get of a Map<…,Integer> as ToIntFunction or map::get of a Map<T,T> as BinaryOperator.
That’s even more questionable in the case of a Predicate; should map::containsKey really get a special treatment compared to map::containsValue?
It’s also worth noting the type signature of the methods. Map.get has a functional signature of Object → V while you suggests that Map<K,V> should extend Function<K,V> which is understandable from a conceptional view of maps (or just by looking at the type), but it shows that there are two conflicting expectations, depending on whether you look at the method or at the type. The best solution is not to fix the functional type. Then you can assign map::get to either Function<Object,V> or Function<K,V> and everyone is happy…
Because a Map is not a Function. Inheritance is for A is a B relationships. Not for A can be the subject of various kinds of B relationships.
To have a function transforming a key to its value, you just need
Function<K, V> f = map::get;
To have a predicate testing if an object is contained in a map, you just need
Predicate<Object> p = map::contains;
That is both clearer and more readable than your proposal.
In the book The Well Grounded Rubyist (excerpt), David Black talks about the "Class/Object Chicken-and-Egg Paradox". I'm having a tough time understanding the entire concept.
Can someone explain it in better/easier/analogical/other terms?
Quote (emphasis mine):
The class Class is an instance of itself; that is, it’s a Class
object. And there’s more. Remember the class Object? Well, Object
is a class... but classes are objects. So, Object is an object. And
Class is a class. And Object is a class, and Class is an object.
Which came first? How can the class Class be created unless the
class Object already exists? But how can there be a class Object
(or any other class) until there’s a class Class of which there can
be instances?
The best way to deal with this paradox, at least for now, is to ignore
it. Ruby has to do some of this chicken-or-egg stuff in order to get
the class and object system up and running—and then, the circularity
and paradoxes don’t matter. In the course of programming, you just
need to know that classes are objects, instances of the class called
Class.
(If you want to know in brief how it works, it’s like this: every
object has an internal record of what class it’s an instance of, and
the internal record inside the object Class points back to Class.)
You can see the problem in this diagram:
(source: phrogz.net)
All object instances inherit from Object. All classes are objects, and Class is a class, therefore Class is an object. However, object instances inherit from their class, and Object is an instance of the Class class, therefore Object itself gets methods from Class.
As you can see in the diagram, however, there isn't a circular lookup loop, because there are two different inheritance 'parts' to every class: the instance methods and the 'class' methods. In the end, the lookup path is sane.
N.B.: This diagram reflects Ruby 1.8, and thus does not include the core BasicObject class introduced in Ruby 1.9.
In practical terms, all you need to understand is that Object is the mother of all classes. All classes extend Object. It is this relationship that you will use in programming, understanding inheritance and so forth.
Eg; You can call hash() on any instance of any object at any time? Why? Because that function appears in the Object class, and all classes inherit that function, because all classes extend Object.
As far as the idea of Class goes, this comes up much less frequently. An object of class Class is like a blueprint, it's like having the class in your hands, without creating an instance of it. There's a little more to it, but it's a difficult one to describe without a lengthy example. Rest assured, when (if ever) the time comes to use it, you'll see it's purpose.
All this excerpt is saying is that Object has a class of type Class and Class is an object, so must extend Object. Its cyclic, but it's irrelevant. The answer is buried somewhere in the compiler.
Regarding the which-came-first criterion, there are two kinds of Ruby objects:
Built-in objects. They exist at the start of a Ruby program and can be considered to have zero creation time.
User created objects. They are created after the program starts via object creation methods (new/clone/dup, class-definition, module-definition, literal-constructs, ...). Such objects are linearly ordered by their time of creation. This order happens to inversely correspond to class-inheritance and instance-of relations.
A detailed explanation of the Ruby object model is available at www.atalon.cz.
I know that my answer comes at least 3 years late, but I have learned about Ruby quite recently and I must admit that the literature sometimes presents (in my opinion) misleading explanations, even though one is handling a very simple problem. Moreover, I am (and was) surprised by such appalling phrases as:
"The best way to deal with this paradox, at least for now, is to ignore it."
stated by the author D. Black, and quoted in the question, but this attitude seems to be pervasive. I have experienced this tendency within the physics community but I have not suspected it had also spread through the programming one. I am not stating that nobody understands what is lurking behind, but it seems at least intriguing why not providing the (indeed) very simple and precise answer, as in this case there is one, without invoking any obscure words such as "paradox" within the explanation.
This so-called "paradox" (even though it is definitely NOT such thing) comes from the (at least misleading) belief that "being an instance of (a subclass of)" should be something as "being an element of" (in, say, naive set theory), or in other terms, a class (in Ruby) is the set containing all the objects sharing some common property (for instance, under this naive interpretation the class String includes all the string objects). Even though this naive point of view (which I may call the "membership interpretation") may help understanding isolated (and rather easy) classes such as String or Symbol, it indeed PRODUCES A CLEAR CONTRADICTION with our naive understanding (and also the axiomatic one, for it contradicts Von Neumann's regularity axiom of set theory, if someone knows what I am talking about) of the membership relationship, for an object could not be an element of itself, as the previous interpretation implies when regarding the object Class.
The previously stated problem is easily avoided if one gets rid of such misleading membership interpretation with a very simply minded model as follows.
I would have guess that my following explanation is well-known by the experts, so I claim no originality at all, but perhaps it was not rephrased in the (simple) terms I am going to present it: in some sense I am completely astonished that (apparently) nobody stated in these terms from the very beginning, and my intention is just to highlight what I believe is the basic underlying structure.
Let us consider a set O of (basic) objects, which consists of all the (basic) objects Ruby has, provided with a mapping f from O to O (which is more or less the .class method), satisfying that the image of the composition of f with itself has only one element.
This latter element (or object) is denoted Class (indeed, what you know to be the class Class).
I would be tempted to write this post using LaTeX code but I am not quite sure if it will be parsed and converted into typical math expressions.
The image of the mapping f is (by definition) the set of Ruby classes (e.g. String, Object, Symbol, Class, etc).
Ruby programmers (even though if they do not know it) interpret the previous construction as follows: we say that an object "x is an instance of y" if and only if y = f(x). By the way this tells us you exactly that Class is an instance of Class (i.e. of itself).
Now, we would need much more decorations to this simple model in order to get the full Ruby hierarchy of classes and functionality (imposing the existence of some fixed members on the image of the map f, a partial order on the image of the map f in order to define subclasses to get afterwards inheritance, etc), and in particular to get the nice picture that was interestingly upvoted, but I suppose that everybody can figure this out from the primitive model I gave (I have written some pages explaining this for myself, after having read several Ruby manuals. I may provide a copy if anybody finds it useful).
Which came first: The class Class or the class Object?
Every class is an instance of the class Class. It follows that the class Object is an instance of the class Class. So you need the class Class to create the class Object. Therefore:
The class Class exists before the class Object.
The class Class is a subclass of the class Object. So you need the class Object from which the class Class can be created. Therefore:
The class Object exists before the class Class.
So statement-2 conflicts with statement-1 and so we have our chicken & egg dilemma. Or you can just accept it as a circular definition!
I have done a dig into the source code of Ruby and created this post on how to make sense of it.