ruby: check if two variables are the same instance - ruby

I have a class Issue in which each instance has a key field. Because keys are meant to be unique, I overrode the comparison operator such that two Issue objects are compared based on key like so:
def ==(other_issue)
  other_issue.key == @key
end
However, I am dealing with a case in which there may be two variables referring to the same instance of an Issue, and thus a comparison by key would not distinguish between them. Is there any way I could check if the two variables refer to the same place?

According to the source, the Object#equal? method should do what you're trying to do:
// From object.c
VALUE
rb_obj_equal(VALUE obj1, VALUE obj2)
{
    if (obj1 == obj2) return Qtrue;
    return Qfalse;
}
So the ruby code you would write is:
obj1.equal?(obj2)
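As a minimal sketch of the difference, assuming an Issue class shaped like the one in the question (the constructor and the key values here are made up for illustration), == compares keys while equal? compares identity:

```ruby
# Hypothetical Issue class mirroring the question: equality is by key.
class Issue
  attr_reader :key

  def initialize(key)
    @key = key
  end

  # Value equality: two issues are "the same" if their keys match.
  def ==(other)
    other.is_a?(Issue) && other.key == key
  end
end

a = Issue.new("PROJ-1")
b = a                    # second variable, same instance
c = Issue.new("PROJ-1")  # different instance, same key

a == c                        # => true, the keys match
a.equal?(b)                   # => true, both variables refer to one object
a.equal?(c)                   # => false, two distinct objects
a.object_id == b.object_id    # => true, another way to see identity
```

Comparing object_id values gives the same answer as the default equal?, which can be handy when inspecting objects in a debugger or log.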

No, not really. And that is a Good Thing™.
A basic tenet of object-orientation is that one object can simulate another object. If you could tell whether or not two objects are the same object ("Reference Equality"), then you could tell apart the simulation from the real thing, thus breaking object-orientation … which, for an object-oriented language like Ruby is not so good.
The default implementation of Object#equal? does indeed check for reference equality, but, just like every other method in Ruby, it can be overridden in subclasses or monkey-patched, so there is no way to guarantee that it will actually check for reference equality. And that is how it should be.
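To illustrate the point about overriding, here is a small sketch with a hypothetical Impostor class (not from the question) that redefines equal? so that it claims to be the object it wraps:

```ruby
# Hypothetical wrapper that pretends to be its target.
class Impostor
  def initialize(target)
    @target = target
  end

  # Legal but ill-advised: claim identity with the wrapped object.
  def equal?(other)
    other.__id__ == @target.__id__ || super
  end
end

real = Object.new
fake = Impostor.new(real)

fake.equal?(real)            # => true, although these are two distinct objects
real.equal?(fake)            # => false, the default still compares identity
fake.__id__ == real.__id__   # => false
```

In practice almost nobody overrides equal?, so it is usually safe to rely on, but nothing in the language enforces that.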

Related

What is the relation in (class diagrams) between those 3 classes?

I have the following code:
class Synchronization
  def initialize
  end

  def perform
    detect_outdated_documents
    update_documents
  end

  private

  attr_reader :documents

  def detect_outdated_documents
    @documents = DetectOutdatedDocument.new.perform
  end

  def update_documents
    UpdateOutdatedDocument.new(documents).perform
  end
end
@documents is an array of hashes that I return from a method in DetectOutdatedDocument.
I then use this array of hashes to initialize the UpdateOutdatedDocument class and run the perform method.
Is something like this correct?
Or should I use associations or something else?
Ruby to UML mapping
I'm not a Ruby expert, but what I understand from your snippet given its syntax is:
There's a Ruby class Synchronization: That's one UML class
The Ruby class has 4 methods initialize, perform, detect_outdated_documents, and update_documents, the two last being private. These would be 4 UML operations.
initialize is the constructor, and since it's empty, you have not mentioned it in your UML class diagram, and that's ok.
The Ruby class has 1 instance variable @documents. In UML, that would be a property, or a role of an association end.
The Ruby class has a getter created with attr_reader. But since it is in a private section, its visibility should be -. This other answer explains how to work with getters and setters elegantly and accurately in UML (big thanks to @engineersmnky for the explanations on getters in Ruby, and for having corrected my initial misunderstanding in this regard)
I understand that SomeClass.new creates in Ruby a new object of class SomeClass.
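The point about the private getter can be checked with a short sketch. The class name PrivateReaderSketch and its sample data are made up for illustration; only the private attr_reader arrangement comes from the question:

```ruby
# Hypothetical class reproducing the private attr_reader layout.
class PrivateReaderSketch
  def initialize
    @documents = [{ id: 1 }, { id: 2 }]
  end

  # Public method that uses the private reader internally.
  def document_count
    documents.size
  end

  private

  # Declared after `private`, so the generated getter is private too.
  attr_reader :documents
end

sketch = PrivateReaderSketch.new
sketch.document_count    # => 2

begin
  sketch.documents       # calling the getter from outside the object
rescue NoMethodError => e
  e.message              # explains that a private method was called
end
```

This is why the getter maps to an operation with - visibility rather than a public one.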
Ruby and dynamic typing in UML
UML class diagrams are based on well-defined types/classes. You would normally indicate associations, aggregations and compositions only with known classes with which there is for sure a stable relation. Ruby is dynamically typed, and all that is known for sure about an instance variable is that it's of type Object, the highest generalization possible in Ruby.
Moreover, a Ruby method returns the value of the last expression evaluated on its execution path. If you do not care about the return value of a method, you'd just mark it as being Object (thanks to engineersmnky for the explanation).
Additional remarks:
There is no void type in UML (see also this SO question). A UML operation that does not return anything would just be an operation with no return type indicated.
Keep also in mind that the use of types that do not belong to the UML standard (such as Array, Hash, Object, ...) would suppose the use of a language specific UML profile.
Based on all this, and considering that an array is also an Object, your code would lead to a very simple UML diagram, with 3 classes, that are all specializations of Object, and a one-to-many association between Synchronization and Object, with the role @documents at the Object end.
Is that all we can hope for?
The very general class diagram may match the implementation very well. But it might not accurately represent the design.
It's your right to model in UML a design independently of the implementation. Hence, if the types of instance variables are known by design (e.g. you want it to be of some type and make sure via the initialization and the API design that the type will be enforced), you may well show this in your diagram even if it deviates from the code:
You have done some manual type inferencing to deduce the return type of the UML operations. Since all Ruby methods return something, we'd expect for all Ruby methods at least an Object return type. But it would be ok for you not to indicate any return type (the UML equivalent to void) to express that the return value is not important.
You also have done some type inference for the instance variable (UML property): you clarify that the only value it can take is the value returned by DetectOutdatedDocument.new.perform.
Your diagram indicates that the class is related to an unspecified number of DetectOutdatedDocument objects, and we guess it's because of the possible values of @documents. And the property is indicated as an array of objects. It's very misleading to have both on the diagram. So I recommend removing the document property. Instead, prefer a document role at the association end on the side of DetectOutdatedDocument. This would greatly clarify for the non-Ruby-native readers why there is a second class on the diagram. :-) (It took me a while)
Now you should not use the black diamond for composition. Because documents has a public reader; so other objects could also be assigned to the same documents. Since Ruby seems to have reference semantic for objects, the copy would then refer to the same objects. That's shared aggregation (white diamond) at best. And since UML has not defined very well the aggregation semantic, you could even show a simple association.
A last remark: from the code you show, we cannot confirm that there is an aggregation between UpdateOutdatedDocument and DetectOutdatedDocument. If you are sure there is such a relationship, you may keep it. But if it's only based on the snippet you showed us, remove the aggregation relation. You could at best show a usage dependency. But normally in UML you would not show such a dependency if it is about the body of a method, since the operation could be implemented very differently without being obliged to have this dependency.
There is no relation, UML or otherwise, in the posted code. In fact, at first glance it might seem like a Synchronization has-many @documents, but the variable and its contents are never defined, initialized, or assigned.
If this is a homework assignment, you probably need to ask your instructor what the objective is, and what the correct answer should be. If it's a real-world project, you haven't done the following:
defined the collaborator objects like Document
initialized @documents in a way that's accessible to the Synchronization class
allowed your class method to accept any dependency injections
Without at least one of the items listed, your UML diagram doesn't really fit the posted code.

Deleting an instance of a class via a method of that class

I have a class for pieces on a board. I want to be able to delete an instance of Piece so that anything else in the program that points to that piece will just point to nil.
Here's the very basic code version of what I want to do:
piece = Piece.new
variable = piece
variable #=> <Piece:0x0000000xxxxxxxx>
piece.delete
variable #=> nil
This seems like a very basic task so I feel like I'm missing something obvious. I've tried creating a delete method for the class with "self = nil", but this returns an error ("Can't change the value of self").
So far I have just worked around this by updating the other things that point to the object in my 'delete' method, but it seems like there should be a better way.
This is not possible.
Firstly, Ruby is an object-oriented language, which means that all manipulation is done via messages to objects, and all that is manipulated are objects. Variables are not objects, therefore you cannot manipulate them. (The only things you can do with variables are assign a value to them and dereference them.)
And even if you could manipulate variables, you would still need to hunt down every single reference to the object in question and remove it, in order for the object to be eligible for "deletion" (i.e. garbage collection).
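A short sketch of why this is so, followed by one common workaround. The remove! and removed? methods are hypothetical names chosen for this example, and the flag-based approach is only one option among several:

```ruby
class Piece
  def initialize
    @removed = false
  end

  # Mark the piece as gone instead of trying to destroy the object.
  def remove!
    @removed = true
    self
  end

  def removed?
    @removed
  end
end

piece = Piece.new
variable = piece

piece = nil        # rebinds only this one variable
variable.nil?      # => false, the other variable still refers to the object

variable.remove!
variable.removed?  # => true, every holder of the reference sees the flag
```

Code that holds a reference then checks removed? before using the piece, rather than expecting the reference itself to become nil.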

What's better practice? Retrieve object or object.id?

This is more of a general question. It might be dumb, but since I constantly have this dilemma, I decided to ask.
I have a function (in Rails if it matters) and I was wondering which approach is best practice and more common when writing large apps.
def retrieve_object(id_of_someobject)
  # Gets class object ID (integer)
  OtherObject.where('object_id = ?', id_of_someobject)
end
Here for example it receives 12 as id_of_someobject
OR
def retrieve_object(someobject)
  # Gets class object
  OtherObject.where('object_id = ?', someobject.id)
end
Here it gets class object and gets its ID by triggering the object attribute 'id'.
In this instance I would prefer the second approach. They may be functionally equivalent, but in the event that there's an error (e.g. calling nil.id), it makes more sense to handle that within the function so that it's easier to debug in the event of failure.
For the first approach, passing in nil wouldn't result in an error, but rather would return an empty result. So it might be difficult to know why your results aren't what you expected. The second approach would raise an error and tell you exactly where the problem is. If you wanted to handle that case by returning an empty result, you could do so explicitly.
As Michael mentioned, passing the whole object also gives you the flexibility to perform other operations down the road if you desire. I don't see a whole lot of benefit to evaluating the id and then passing it to a method unless you already have that ID without having to instantiate the object. (That would be a compelling use case for the first option)
Support both. It's only one more line and this way you don't have to remember or care.
def retrieve_object(id_or_someobject)
  id = id_or_someobject.is_a?(SomeObject) ? id_or_someobject.id : id_or_someobject
  OtherObject.where('object_id = ?', id)
end
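A variation on the same idea, sketched without Rails so it can be run on its own. The extract_id helper is a hypothetical name, the Struct stands in for a model, and the nil check reflects the earlier answer's concern about silent failures:

```ruby
# Stand-in for a model class; only `id` matters here.
SomeObject = Struct.new(:id)

# Hypothetical helper: accept either a record or a raw id,
# and fail loudly on nil instead of silently matching nothing.
def extract_id(id_or_record)
  raise ArgumentError, "expected a record or an id, got nil" if id_or_record.nil?

  id_or_record.respond_to?(:id) ? id_or_record.id : id_or_record
end

extract_id(SomeObject.new(12))  # => 12
extract_id(12)                  # => 12
```

Using respond_to?(:id) instead of is_a? means any object exposing an id works, at the cost of being less strict about what is passed in.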

is it possible to tell rspec to warn me if a stub returns a different type of object than the method it's replacing?

I have a method called save_title:
def save_title(data)
  # ...
  # [ if the record exists, update, return 0 ]
  # [ if the record is new, create, return 1 ]
end
All fine, until I stubbed it:
saved_rows = []
proc.stub(:save_title) do |arg|
  saved_rows << arg
end
The bug here is that I was using the integer returned from the real method to determine how many records were created vs. updated. The stub doesn't return an integer; it returns the saved_rows array. Oooops. So the code worked fine in reality, but appeared broken in the test. A while later (more than I care to admit, cursing included) I realized the stub and the real method don't behave the same. Such are the pitfalls of dynamic languages I suppose.
Questions:
Can I tell rspec to warn me if the stub doesn't return the same sort of thing as the real method?
Is there an analyzer gem that I can use to warn about this sort of thing?
Is there some sort of best practice that I don't know about with returning values from methods?
1) There is no way that rspec can know what type of object the method is supposed to return, that's for you to tell it, however...
2) There is something you can look into. Instead of using a stub, try using a mock instead as your test double. It is basically the same thing as a stub, however, you can do many more validations on it (check out the documentation here). Things like how many times the specific method was called, the arguments it should be called with and what the return value should be as well. Your test will fail if any of those validations don't pass.
3) The best practice would be the method name itself. For example, methods ending in ? like object.exists? should always return a boolean value. In your case, I would suggest a refactoring of your method, maybe divide it in two, one for updating and one for creating and have another method to tell you if an object exists or not. It is not good practice to have a method behave in two different ways depending on the input (see separation of concerns)
Good luck! hope this helps.
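As far as I know, RSpec's verifying doubles check that a stubbed method exists and accepts the given arguments, but not what it returns. One way to catch the mismatch yourself is a small wrapper around the stub body. The following is a plain-Ruby sketch; checked_stub is a hypothetical helper, not an RSpec feature:

```ruby
# Hypothetical helper, not part of RSpec: wraps a stub body and
# raises if the returned value is not of the expected class.
def checked_stub(expected_class, &body)
  lambda do |*args|
    result = body.call(*args)
    unless result.is_a?(expected_class)
      raise TypeError, "stub returned #{result.class}, expected #{expected_class}"
    end
    result
  end
end

saved_rows = []

# Mirrors the stub from the question: returns the Array, not an Integer,
# so calling it raises TypeError.
bad_stub = checked_stub(Integer) { |arg| saved_rows << arg }

# Corrected stub: records the argument, then returns 1 like the real method
# does for a newly created record.
good_stub = checked_stub(Integer) do |arg|
  saved_rows << arg
  1
end

good_stub.call("some title")  # => 1
```

The simplest fix for the original test is the last one shown: make the stub's block end with an integer, so it honors the same contract as save_title.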

Using return statements to great effect!

When I am making methods with return values, I usually try and set things up so that there is never a case when the method is called in such a way that it would have to return some default value. When I started I would often write methods that did something, and would either return what they did or, if they failed to do anything, would return null. But I hate having ugly if(!null) statements all over my code.
I'm re-reading a guide to Ruby that I read many moons ago, by the Pragmatic Programmers, and I notice that they often return self (Ruby's this) when they wouldn't normally return anything. This is, they say, in order to be able to chain method calls, as in this example using setters that return the object whose attributes they set.
tree.setColor(green).setDecor(gaudy).setPractical(false)
Initially I find this sort of thing attractive. There have been a couple of times when I have rejoiced at being able to chain method calls, like Player.getHand().getSize() but this is somewhat different in that the object of the method call changes from step to step.
What does Stack Overflow think about return values? Are there any patterns or idioms that come to mind warmly when you think of return values? Any great ways to avoid frustration and increase beauty?
In my humble opinion, there are three kinds of return-cases that you should take into consideration:
Object property manipulation
The first is the manipulation of object properties. The pattern you describe here is very often used when manipulating objects. A very typical scenario is using it together with a factory. Consider this hypothetical creation call:
// When the object has manipulative methods:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes();
// When the factory has manipulative methods working on the
// object, IMHO more elegant from a semantic point of view:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes().getPizza();
It allows for a quick grasp of what exactly is being created or how an object is manipulated, because the methods form one human-readable expression. It's definitely nice, but don't overuse it. A rule of thumb is that this might be used with methods whose return value you could also declare as void.
Evaluating object properties
The second might be when a method evaluates something on an object. Consider, for example, the method car.getCurrentSpeed(), that could be interpreted as a message to an object asking for the current speed and returning that. It would simply return the value, not too complicated. :)
Make object do this or that
The third might be when a method makes an object perform an operation, returning some sort of value indicating how well the caller's intention was fulfilled - but laying out such a method could be difficult:
int new_gear = 20;
if (car.gears.changeGear(new_gear)) // does that mean success or fail?
This is where you can see a difficulty in designing the method. Should it return 0 upon success or failure? How about -1 if the gear could not be set, because the car only has 5 gears? Does that mean the current gear is at -1 now, too? The method could return the gear it changed to, meaning you would have to compare the argument supplied to the method to the return code. That would work. On the other hand, you could simply return true for success and false for failure, or false for success and true for failure. Which one to use could be decided by estimating if you'd expect those method calls to rather fail or succeed.
In my humble opinion, there is a way to better express the semantics of such return values, by giving them a semantic description. Future developers interacting with your objects will love you for not having to look up the comments or documentation for your methods:
class GearSystem {
    // (...)
public:
    enum GearChangeResult
    { GearChangeSuccess, NonExistingGear, MechanicalGearProblem };
    GearChangeResult changeGear (int gear);
};
That way, it becomes perfectly obvious for any programmer looking at your code what the return value means (consider: if (gears.changeGear(20) == GearSystem::GearChangeSuccess) - much clearer what that means than the example above).
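Since the question is about Ruby, the same idea can be sketched there with symbols in place of an enum. The five-gear range follows the example in the text; the method and symbol names are assumptions made for this illustration:

```ruby
class GearSystem
  GEARS = (1..5).freeze  # assumption: a five-gear car, as in the text

  attr_reader :current_gear

  def initialize
    @current_gear = 1
  end

  # Returns a symbol describing the outcome instead of a bare
  # number or boolean whose meaning has to be looked up.
  def change_gear(gear)
    return :non_existing_gear unless GEARS.cover?(gear)

    @current_gear = gear
    :gear_change_success
  end
end

gears = GearSystem.new
gears.change_gear(20)  # => :non_existing_gear
gears.current_gear     # => 1, unchanged after the failed attempt
gears.change_gear(3)   # => :gear_change_success
```

A caller can then write gears.change_gear(3) == :gear_change_success, which reads much like the C++ comparison above.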
Antipattern: Failures as return codes.
The fourth possibility for a return value I actually omitted, because in my opinion it isn't one: when there's an error in your program, like a logic error or a failure that needs to be dealt with - you could theoretically return a value indicating so. But today, that's not done so often anymore (or should not be), because for that, there are exceptions.
I don't agree that methods should never return null. The most obvious examples are from systems programming. For instance, if someone asks to open a file, you simply have to give them null if the open fails. There is no sane alternative. There are other cases where null is appropriate, such as a getNextNode(node) method, when called on the last node of a linked list. So I guess what these cases have in common is that null represents "no object" (either no file handle or no list node), which makes sense.
In other cases, the method should never fail, and there is an appropriate exception facility. Then, I think method chaining like your example can be used to great effect. I think it's a bit funny that you seem to believe this is an innovation of the "Pragmatic Programmers". In fact, it dates to Lisp if not before.
Returning this is also used in the "builder pattern", another case where method chaining can enhance readability as well as writing convenience.
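A minimal Ruby sketch of the chaining style from the question, with each setter returning self. The Tree class and its snake_case method names are made up to mirror the tree.setColor(...) example:

```ruby
class Tree
  attr_reader :color, :decor, :practical

  def set_color(color)
    @color = color
    self  # returning self is what makes the next call in the chain possible
  end

  def set_decor(decor)
    @decor = decor
    self
  end

  def set_practical(practical)
    @practical = practical
    self
  end
end

tree = Tree.new.set_color(:green).set_decor(:gaudy).set_practical(false)
tree.color      # => :green
tree.practical  # => false
```

Every call in the chain operates on the same object, which is the difference from Player.getHand().getSize(), where each step returns a different object.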
A null is often returned as an out-of-band value to indicate that no result could be produced. I believe that this is perfectly reasonable when getting no result is a normal event; examples would include a null return from readLine() at end-of-file, or a null returned when providing a non-existent key to the get(...) method of a Map. Reading to the end of the file is normal behavior (as opposed to an IOException, which indicates that something went abnormally wrong while trying to read). Similarly, looking up a key and being told that it has no value is a normal case.
A good alternative to null for some cases is a "null object", which is a full-fledged instance of the result class, but which has appropriate state and behavior for a "nobody's home" case. For instance, the result of looking up a non-existent user ID might well be a NullUser object which has a zero-length name and no permissions to do anything in the system.
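The null object idea can be sketched in Ruby as follows. The User and NullUser classes, the can? method and the USERS table are all hypothetical stand-ins for a real user store:

```ruby
class User
  attr_reader :name, :permissions

  def initialize(name, permissions)
    @name = name
    @permissions = permissions
  end

  def can?(action)
    permissions.include?(action)
  end
end

# Null object: same interface as User, but "nobody's home".
class NullUser
  def name
    ""
  end

  def permissions
    []
  end

  def can?(_action)
    false
  end
end

# Hypothetical lookup table standing in for a real user store.
USERS = { 1 => User.new("alice", [:read, :write]) }.freeze

# Returns a NullUser instead of nil when the id is unknown.
def find_user(id)
  USERS.fetch(id) { NullUser.new }
end

find_user(1).can?(:write)   # => true
find_user(99).name          # => ""
find_user(99).can?(:write)  # => false, and no nil check was needed
```

Callers can use the result directly without testing for nil, at the cost of not being able to tell "not found" apart unless they ask for it explicitly.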
It's confusing to me. OO programming languages need Smalltalk's semicolon:
tree color: green;
decor: gaudy;
practical: false.
obj method1; method2. means "call method1 on obj then method2 on obj". This kind of object setup is very common.

Resources