Using return statements to great effect! - coding-style

When I am making methods with return values, I usually try and set things up so that there is never a case when the method is called in such a way that it would have to return some default value. When I started I would often write methods that did something, and would either return what they did or, if they failed to do anything, would return null. But I hate having ugly if(!null) statements all over my code,
I'm reading a re-guide to ruby that I read many moons ago, by the pragmatic programmers, and I notice that they often return self (ruby's this) when they wouldn't normally return anything. This is, they say, in order to be able to chain method calls, as in this example using setters that return the object whose attributes they set.
tree.setColor(green).setDecor(gaudy).setPractical(false)
Initially I find this sort of thing attractive. There have been a couple of times when I have rejoiced at being able to chain method calls, like Player.getHand().getSize() but this is somewhat different in that the object of the method call changes from step to step.
What does Stack Overflow think about return values? Are there any patterns or idioms that come to mind warmly when you think of return values? Any great ways to avoid frustration and increase beauty?

In my humble opinion, there are three kinds of return-cases that you should take into consideration:
Object property manipulation
The first is the manipulation of object properties. The pattern you describe here is very often used when manipulating objects. A very typical scenario is using it together with a factory. Consider this hypothetical creation call:
// When the object has manipulative methods:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes();
// When the factory has manipulative methods working on the
// object, IMHO more elegant from a semantic point of view:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes().getPizza();
It allows for a quick grasp at what exactly is being created or how an object is manipulated, because the methods form one human-readable expression. It's definitely nice, but don't overuse. A rule of thumb is that this might be used with methods whose return value you could also declare as void.
Evaluating object properties
The second might be when a method evaluates something on an object. Consider, for example, the method car.getCurrentSpeed(), that could be interpreted as a message to an object asking for the current speed and returning that. It would simply return the value, not too complicated. :)
Make object do this or that
The third might be when a method makes an perform an operation, returning some sort of value indicating how well the caller's intention was fulfilled - but laying out such a method could be difficult:
int new_gear = 20;
if (car.gears.changeGear(new_gear)) // does that mean success or fail?
This is where you can see a difficulty in designing the method. Should it return 0 upon success or failure? How about -1 if the gear could not be set, because the car only has 5 gears? Does that mean the current gear is at -1 now, too? The method could return the gear it changed to, meaning you would have to compare the argument supplied to the method to the return code. That would work. On the other hand, you could simply return either true or false for failure or false or true for failure. Which one to use could be decided by estimating if you'd expect those method calls to rather fail or succeed.
In my humble opinion, there is a way to better express the semantics of such return values, by giving them a semantic description. Future developers interacting with your objects will love you for not having to look up the comments or documentation for your methods:
class GearSystem {
// (...)
public:
enum GearChangeResult
{ GearChangeSuccess, NonExistingGear, MechanicalGearProblem };
GearChangeResult changeGear (int gear);
};
That way, it becomes perfectly obvious for any programmer looking at your code, what the return value means (consider: if (gears.changeGear(20) == GearSystem::GearChangeSuccess) - much clearer what that means than the example above)
Antipattern: Failures as return codes.
The fourth possibility for a return value I actually omitted, because in my opinion it isn't any: when there's an error in your program, like a logic error or a failure that needs to be dealt with - you could theoretically return a value indicating so. But today, that's not done so often anymore (or should not be), because for that, there are exceptions.

I don't agree that methods should never return null. The most obvious examples are from systems programming. For instance, if someone asks to open a file, you simply have to give them null if the open fails. There is no sane alternative. There are other cases where null is appropriate, such as a getNextNode(node) method, when called on the last node of a linked list. So I guess what these cases have in common is that null represents "no object" (either no file handle or no list node), which makes sense.
In other cases, the method should never fail, and there is an appropriate exception facility. Then, I think method chaining like your example can be used to great effect. I think it's a bit funny that you seem to believe this is an innovation of the "Pragmatic Programmers". In fact, it dates to Lisp if not before.

Returning this is also used in the "builder pattern", another case where method chaining can enhance readability as well as writing convenience.
A null is often returned as an out-of-band value to indicate that no result could be produced. I believe that this is perfectly reasonable when getting no result is a normal event; examples would include a null return from readLine() at end-of-file, or a null returned when providing a non-existent key to the get(...) method of a Map. Reading to the end of the file is normal behavior (as opposed to an IOException, which indicates that something went abnormally wrong while trying to read). Similarly, looking up a key and being told that it has no value is a normal case.
A good alternative to null for some cases is a "null object", which is a full-fledged instance of the result class, but which has appropriate state and behavior for a "nobody's home" case. For instance, the result of looking up a non-existent user ID might well be a NullUser object which has a zero-length name and no permissions to do anything in the system.

It's confusing to me. OO programming languages need Smalltalk's semicolon:
tree color: green;
decor: gaudy;
practical: false.
obj method1; method2. means "call method1 on obj then method2 on obj". This kind of object setup is very common.

Related

What's better practice? Retrieve object or object.id?

This is more of a general question. And it might be dumb but since I constantly have this dilemma- decided to ask.
I have a function (in Rails if it matters) and I was wondering which approach is best practice and more common when writing large apps.
def retrieve_object(id_of_someobject)
# Gets class object ID (integer)
OtherObject.where('object_id = ?', id_of_someobject)
end
Here for example it receives 12 as id_of_someobject
OR
def retrieve_object(someobject)
# Gets class object
OtherObject.where('object_id = ?', someobject.id)
end
Here it gets class object and gets its ID by triggering the object attribute 'id'.
In this instance I would prefer the second approach. They may be functionally equivalent, but in the event that there's an error (e.g. calling nil.id), it makes more sense to handle that within the function so that it's easier to debug in the event of failure.
For the first approach, passing in nil wouldn't result in an error, but rather would return an empty array. So it might be difficult to know why your results aren't what you expected. The second approach would throw a flag and tell you exactly where that error is. If you wanted to handle that case by returning an empty array, you could do so explicitly.
As Michael mentioned, passing the whole object also gives you the flexibility to perform other operations down the road if you desire. I don't see a whole lot of benefit to evaluating the id and then passing it to a method unless you already have that ID without having to instantiate the object. (That would be a compelling use case for the first option)
Support both. It's only one more line and this way you don't have to remember or care.
def retrieve_object(id_or_someobject)
id = id_or_someobject.is_a?(SomeObject) ? id_or_someobject.id : id_or_someobject
OtherObject.where('object_id = ?', id)
end

Replace lots of switches with polymorphism but no type code

I have a class which could benefit with the state pattern. However the common "Replace Type Code with State/Strategy" refactoring does not seem to fit well in my case: the state is calculated by watching other objects, there is no type code variable.
Most of my class code is just "calculating" some state when it is called, and running the functions for that state.
Forcing a type code variable feels wrong because:
I will be forced to call an "updateState()" function in every place where the polymorphic functions are used.
My class will no longer be 100% behavior, which I would rather habe instead of some internal state.
Since the state must be calculated every single time its functions are called, I am wonder if I am thinking about the wrong pattern.
Normally I refactor this:
if (this.someOtherThingIsRunning()) {
...
} else {
...
}
like this:
typecode.doSomething()
// that being polymorphic
it seems strange doing:
updateTypeCode()
typecode.doSomething()
Does the state pattern applies to this case? Is there any alternative strategy pull from polymorphism without a type code?
While writing my question, I realized that maybe I could just make the type code a function and return a temporal (function scope) type code. Like:
typecode().doSomething()
This solution would never store the state, which is what I want to avoid. However I am still wondering if my problem started because I am using the wrong pattern.
If you're open to storing the state, maybe think about combining State and Observer to modify the state as the dependent classes change (rather than checking on every call). There's only certain models that this will work for though.
Otherwise you might as well say object.doSomething() and have the checks inside doSomething(). In this case using design patterns doesn't present any significant advantages (though if you loosen up slightly on the definitions of design patterns, many things would be considered such). I'd probably go with:
doSomething()
{
if (someOtherThingIsRunning())
doOneThing();
else
doAnotherThing();
}
The alternative (that you already suggested) is to have the above checks in typecode() and to return another class that contains the method doSomething().

is it possible to tell rspec to warn me if a stub returns a different type of object than the method it's replacing?

I have a method called save_title:
def save_title (data)
...
[ if the record exists, update, return 0]
[ if the record is new, create, return 1]
end
All fine, until I stubbed it:
saved_rows = []
proc.stub(:save_title) do |arg|
saved_rows << arg
end
The bug here is that I was using the integer returned from the real method to determine how many records were created vs. updated. The stub doesn't return an integer. Oooops. So the code worked fine in reality, but appeared broken in the test. A while later (more than I care to admit, cursing included) I realize the stub and the real method don't behave the same. Such are the pitfalls of dynamic languages I suppose.
Questions:
Can I tell rspec to warn me if the stub doesn't return the same sort of thing as the real method?
Is there an analyzer gem that I can use to warn about this sort of thing?
Is there some sort of best practice that I don't know about with returning values from methods?
1) There is no way that rspec can know what type of object the method is supposed to return, that's for you to tell it, however...
2) There is something you can look into. Instead of using a stub, try using a mock instead as your test double. It is basically the same thing as a stub, however, you can do many more validations on it (check out the documentation here). Things like how many times the specific method was called, the arguments it should be called with and what the return value should be as well. Your test will fail if any of those validations don't pass.
3) The best practice would be the method name itself. For example, methods ending in ? like object.exists? should always return a boolean value. In your case, I would suggest a refactoring of your method, maybe divide it in two, one for updating and one for creating and have another method to tell you if an object exists or not. It is not good practice to have a method behave in two different ways depending on the input (see separation of concerns)
Good luck! hope this helps.

Returning both computation result and status. Best practices

I was thinking about patterns which allow me to return both computation result and status:
There are few approaches which I could think about:
function returns computation result, status is being returned via out parameter (not all languages support out parameters and this seems wrong, since in general you don't expect parameters to be modified).
function returns object/pair consisting both values (downside is that you have to create artificial class just to return function result or use pair which have no semantic meaning - you know which argument is which by it's order).
if your status is just success/failure you can return computation value, and in case of error throw an exception (look like the best approach, but works only with success/failure scenario and shouldn't be abused for controlling normal program flow).
function returns value, function arguments are delegates to onSuccess/onFailure procedures.
there is a (state-full) method class which have status field, and method returning computation results (I prefer having state-less/immutable objects).
Please, give me some hints on pros, cons and situations' preconditions of using aforementioned approaches or show me other patterns which I could use (preferably with hints on preconditions when to use them).
EDIT:
Real-world example:
I am developing java ee internet application and I have a class resolving request parameters converting them from string to some business logic objects. Resolver is checking in db if object is being created or edited and then return to controller either new object or object fetched from db. Controller is taking action based on object status (new/editing) read from resolver. I know it's bad and I would like to improve code design here.
function returns computation result, status is being returned via out
parameter (not all languages support out parameters and this seems
wrong, since in general you don't expect parameters to be modified).
If the language supports multiple output values, then the language clearly was made to support them. It would be a shame not to use them (unless there are strong opinions in that particular community against them - this could be the case for languages that try and do everything)
function returns object/pair consisting both values (downside is that
you have to create artificial class just to return function result or
use pair which have no semantic meaning - you know which argument is
which by it's order).
I don't know about that downside. It seems to me that a record or class called "MyMethodResult" should have enough semantics by itself. You can always use such a class in an exception as well, if you are in an exceptional condition only of course. Creating some kind of array/union/pair would be less acceptable in my opinion: you would inevitably loose information somewhere.
if your status is just success/failure you can return computation
value, and in case of error throw an exception (look like the best
approach, but works only with success/failure scenario and shouldn't
be abused for controlling normal program flow).
No! This is the worst approach. Exceptions should be used for exactly that, exceptional circumstances. If not, they will halt debuggers, put colleagues on the wrong foot, harm performance, fill your logging system and bugger up your unit tests. If you create a method to test something, then the test should return a status, not an exception: to the implementation, returning a negative is not exceptional.
Of course, if you run out of bytes from a file during parsing, sure, throw the exception, but don't throw it if the input is incorrect and your method is called checkFile.
function returns value, function arguments are delegates to
onSuccess/onFailure procedures.
I would only use those if you have multiple results to share. It's way more complex than the class/record approach, and more difficult to maintain. I've used this approach to return multiple results while I don't know if the results are ignored or not, or if the user wants to continue. In Java you would use a listener. This kind of operation is probably more accepted for functinal languages.
there is a (state-full) method class which have status field, and
method returning computation results (I prefer having
state-less/immutable objects).
Yes, I prefer those to. There are producers of results and the results themselves. There is little need to combine the two and create a stateful object.
In the end, you want to go to producer.produceFrom(x): Result in my opinion. This is either option 1 or 2a, if I'm counting correctly. And yes, for 2a, this means writing some extra code.
My inclination would be to either use out parameters or else use an "open-field" struct, which simply contains public fields and specifies that its purpose is simply to carry the values of those fields. While some people suggest that everything should be "encapsulated", I would suggest that if a computation naturally yields two double values called the Moe and Larry coefficients, specifying that the function should return "a plain-old-data struct with fields of type double called MoeCoefficient and LarryCoefficient" would serve to completely define the behavior of the struct. Although the struct would have to be declared as a data type outside the method that performs the computation, having its contents exposed as public fields would make clear that none of the semantics associated with those values are contained in the struct--they're all contained in the method that returns it.
Some people would argue that the struct should be immutable, or that it should include validation logic in its constructor, etc. I would suggest the opposite. If the purpose of the structure is to allow a method to return a group of values, it should be the responsibility of that method to ensure that it puts the proper values into the structure. Further, while there's nothing wrong with a structure exposing a constructor as a "convenience member", simply having the code that will return the struct fill in the fields individually may be faster and clearer than calling a constructor, especially if the value to be stored in one field depends upon the value stored to the other.
If a struct simply exposes its fields publicly, then the semantics are very clear: MoeCoefficient contains the last value that was written to MoeCoefficient, and LarryCoefficient contains the last value written to LarryCoefficient. The meaning of those values would be entirely up to whatever code writes them. Hiding the fields behind properties obscures that relationship, and may impede performance as well.

Do you ToList()?

Do you have a default type that you prefer to use in your dealings with the results of LINQ queries?
By default LINQ will return an IEnumerable<> or maybe an IOrderedEnumerable<>. We have found that a List<> is generally more useful to us, so have adopted a habit of ToList()ing our queries most of the time, and certainly using List<> in our function arguments and return values.
The only exception to this has been in LINQ to SQL where calling .ToList() would enumerate the IEnumerable prematurely.
We are also using WCF extensively, the default collection type of which is System.Array. We always change this to System.Collections.Generic.List in the Service Reference Settings dialog in VS2008 for consistency with the rest of our codebase.
What do you do?
ToList always evaluates the sequence immediately - not just in LINQ to SQL. If you want that, that's fine - but it's not always appropriate.
Personally I would try to avoid declaring that you return List<T> directly - usually IList<T> is more appropriate, and allows you to change to a different implementation later on. Of course, there are some operations which are only specified on List<T> itself... this sort of decision is always tricky.
EDIT: (I would have put this in a comment, but it would be too bulky.) Deferred execution allows you to deal with data sources which are too big to fit in memory. For instance, if you're processing log files - transforming them from one format to another, uploading them into a database, working out some stats, or something like that - you may very well be able to handle arbitrary amounts of data by streaming it, but you really don't want to suck everything into memory. This may not be a concern for your particular application, but it's something to bear in mind.
We have the same scenario - WCF communications to a server, the server uses LINQtoSQL.
We use .ToArray() when requesting objects from the server, because it's "illegal" for the client to change the list. (Meaning, there is no purpose to support ".Add", ".Remove", etc).
While still on the server, however, I would recommend that you leave it as it's default (which is not IEnumerable, but rather IQueryable). This way, if you want to filter even more based on some criteria, the filtering is STILL on the SQL side until evaluated.
This is a very important point as it means incredible performance gains or losses depending on what you do.
EXAMPLE:
// This is just an example... imagine this is on the server only. It's the
// basic method that gets the list of clients.
private IEnumerable<Client> GetClients()
{
var result = MyDataContext.Clients;
return result.AsEnumerable();
}
// This method here is actually called by the user...
public Client[] GetClientsForLoggedInUser()
{
var clients = GetClients().Where(client=> client.Owner == currentUser);
return clients.ToArray();
}
Do you see what's happening there? The "GetClients" method is going to force a download of ALL 'clients' from the database... THEN the Where clause will happen in the GetClientsForLoogedInUser method to filter it down.
Now, notice the slight change:
private IQueryable<Client> GetClients()
{
var result = MyDataContext.Clients;
return result.AsQueryable();
}
Now, the actual evaluation won't happen until ".ToArray" is called... and SQL will do the filtering. MUCH better!
In the Linq-to-Objects case, returning List<T> from a function isn't as nice as returning IList<T>, as THE VENERABLE SKEET points out. But often you can still do better than that. If the thing you are returning ought to be immutable, IList is a bad choice because it invites the caller to add or remove things.
For example, sometimes you have a method or property that returns the result of a Linq query or uses yield return to lazily generate a list, and then you realise that it would be better to do that the first time you're called, cache the result in a List<T> and return the cached version thereafter. That's when returning IList may be a bad idea, because the caller may modify the list for their own purposes, which will then corrupt your cache, making their changes visible to all other callers.
Better to return IEnumerable<T>, so all they have is forward iteration. And if the caller wants rapid random access, i.e. they wish they could use [] to access by index, they can use ElementAt, which Linq defines so that it quietly sniffs for IList and uses that if available, and if not it does the dumb linear lookup.
One thing I've used ToList for is when I've got a complex system of Linq expressions mixed with custom operators that use yield return to filter or transform lists. Stepping through in the debugger can get mighty confusing as it jumps around doing lazy evaluation, so I sometimes temporarily add a ToList() to a few places so that I can more easily follow the execution path. (Although if the things you are executing have side-effects, this can change the meaning of the program.)
It depends if you need to modify the collection. I like to use an Array when I know that no one is going to add/delete items. I use a list when I need to sort/add/delete items. But, usually I just leave it as IEnumerable as long as I can.
If you don't need the added features of List<>, why not just stick with IQueryable<> ?!?!?! Lowest common denominator is the best solution (especially when you see Timothy's answer).

Resources