Replace lots of switches with polymorphism but no type code - refactoring

I have a class which could benefit with the state pattern. However the common "Replace Type Code with State/Strategy" refactoring does not seem to fit well in my case: the state is calculated by watching other objects, there is no type code variable.
Most of my class code is just "calculating" some state when it is called, and running the functions for that state.
Forcing a type code variable feels wrong because:
I will be forced to call an "updateState()" function in every place where the polymorphic functions are used.
My class will no longer be 100% behavior, which I would rather habe instead of some internal state.
Since the state must be calculated every single time its functions are called, I am wonder if I am thinking about the wrong pattern.
Normally I refactor this:
if (this.someOtherThingIsRunning()) {
...
} else {
...
}
like this:
typecode.doSomething()
// that being polymorphic
it seems strange doing:
updateTypeCode()
typecode.doSomething()
Does the state pattern applies to this case? Is there any alternative strategy pull from polymorphism without a type code?

While writing my question, I realized that maybe I could just make the type code a function and return a temporal (function scope) type code. Like:
typecode().doSomething()
This solution would never store the state, which is what I want to avoid. However I am still wondering if my problem started because I am using the wrong pattern.

If you're open to storing the state, maybe think about combining State and Observer to modify the state as the dependent classes change (rather than checking on every call). There's only certain models that this will work for though.
Otherwise you might as well say object.doSomething() and have the checks inside doSomething(). In this case using design patterns doesn't present any significant advantages (though if you loosen up slightly on the definitions of design patterns, many things would be considered such). I'd probably go with:
doSomething()
{
if (someOtherThingIsRunning())
doOneThing();
else
doAnotherThing();
}
The alternative (that you already suggested) is to have the above checks in typecode() and to return another class that contains the method doSomething().

Related

Single responsibility principle - function

I'm reading some tuts about SOLID programming, and I'm trying to refactor my test project to implement some of those rules.
Often I have doubts with SingleResponsibilityPrinciple, so I hope someone could help me with that.
As I understood, SRP means that (in case of a function), function should be responsible for only one thing. And that's seems pretty easy and simple, but I do get in a trap of doing more than thing.
This is simplified example:
class TicketService {
private ticket;
getTicket() {
httpClient.get().then(function(response) {
ticket = response.ticket;
emit(ticket); <----------------------
});
}
}
The confusing part is emit(ticket). So, my function is named getTicket, that's exactly what I'm doing there (fetching it from server e.g.), but on the other hand, I need to emit that change to all other parts of my application, and let them know that ticket is changed.
I could create separate set() function, where I could do setting of private variable, and emit it there, but that seems like a same thing.
Is this wrong? Does it break the rule? How would you fix it?
You could also return the ticket from the getTicket() function, and then have a separate function called setUpdatedTicket() that takes a ticket and sets the private parameter, and at the end calls the emit function.
This can lead to unexpected behavior. If I want to re-use your class in the future and I see with auto-completion in my IDE the method getTicket() I expect to get a Ticket.
However renaming this method to mailChangedTicket, ideally you want this method to call the getTicket method (which actually returns the ticket) and this way you have re-usable code which will make more sense.
You can take SRP really far, for example your TicketService has a httpClient, but it probably doesn't matter where the ticket comes from. In order to 'fix' this, you will have to create a seperate interface and class for this.
A few advantages:
Code is becoming more re-usable
It is easier to test parts separately
I can recommend the book 'Clean Code' from Robert C. Martin which gives some good guidelines to achieve this.

En/Decode struct containing many interfaces with different implementations each with gob

I have a quite complex struct that contains many interfaces with each different implementations. For en/decoding that struct in gob I seem to have to register every implementation that could be possibly used for every interface. So I end up with a method along these lines:
func registerImplementations() {
gob.Register(&Type1{})
gob.Register(&Type2{})
gob.Register(&Type3{})
gob.Register(&Type4{})
....
}
which I need to call before en/decoding. Is there an easier way to do this? Or should I look into possibilities for generating this method, since it's quite tedious to keep track of all possible implementations?
The documentation says:
We must register the concrete type for the encoder and decoder (which would
normally be on a separate machine from the encoder). On each end, this tells the
engine which concrete type is being sent that implements the interface.
So, at some point, you're going to want to call gob.Register, but you do want your code to be maintainable. This leaves (broadly) two options:
Creating a function like you're doing now, calling each struct after one another.
Advantage: all your Register-calls in a list, so you'll easily spot if you miss one, and you surely won't register one twice.
Disadvantage: you'll have to update it when using another implementation. You'll also have to call this function some time before encoding/decoding.
Creating something like this:
func Register(i interface{}) error {
gob.Register(i)
return nil
}
And then when writing a new implementation in your (let's say) dummy package, you can put this line below / above the interface declaration.
var regResult = reg.Register(myNewInterface{})
This will be called on startup (because it's global).
Advantage: not having to update the registerImplementations method.
Disadvantage: you'll have your registers all across your code (which can consist of a lot of files) - so you might miss one.
As to which is better: I'll leave that up to you.

Is there a good way to use polymorphism to remove this switch statement?

I've been reading on refactoring and replacing conditional statements with polymorphism. The trouble I have is that it only seems to make sense to me when you have a more complex case where, without polymorphism, you would have to repeat the same switch statements or if-elses many times. I don't see how it makes sense if you're only doing it once - you have to have that conditional somewhere, right?
As an example, I recently wrote the following class, which is responsible for reading a XML file and converting its data into the program's objects. There are 2 possible formats for the file that we are supporting, so I simply wrote a method in the class for handling each one, and used a case-switch to determine which one to use:
public class ComponentXmlReader
{
public IEnumerable<UserComponent> ImportComponentsFromXml(string path)
{
var xmlFile = XElement.Load(path);
switch (xmlFile.Name.LocalName)
{
case "CaseDefinition":
return ImportComponentsFromA(xmlFile);
case "Root":
return ImportComponentsFromB(xmlFile);
}
}
private IEnumerable<UserComponent> ImportComponentsFromA(XContainer file)
{
//do stuff
}
private IEnumerable<UserComponent> ImportComponentsFromB(XContainer file)
{
//do stuff
}
}
As far as I can tell, I could write a class hierarchy for this to do the parsing, but I don't see the advantage here - I'd still have to use a case-switch to determine which class to instantiate. It looks to me like it would be extra complexity for no benefit. If I was going to keep these classes around and do more things with them that depended on the file type, then it would eliminate doing the same switch in multiple places, but this is single-use. Is this right, or is there some reason or technique I'm not seeing that makes it a good idea to use a polymorphic class hierarchy to do this?
If you had, say, an abstract ComponentImporter class, with concrete subclasses FromA and FromB, you could instantiate one of each, and put it in a Map. Then you could call componentImporterMap.get(xmlFile.Name.LocalName).importComponents() and avoid the switch.
As with all design choices, context is key. In this case, you have what seems to be a fairly simple class handling two very similar tasks. If the two Import methods contained very little duplicate code, then including them in a single class is perhaps the best choice since, as you say, it reduces complexity.
However, it's possible you'll use this class in the future, and even add new types of imports. In that case, the class would be more reusable if it was polymorphic.
Additionally, since these methods sound very similar, you're likely to have a bunch of duplicate code, which you could keep in a base class and only put import-specific code in the child classes.
Plus, as Carl mentions, there are numbers of ways to implement this logic without using a case statement.

Returning both computation result and status. Best practices

I was thinking about patterns which allow me to return both computation result and status:
There are few approaches which I could think about:
function returns computation result, status is being returned via out parameter (not all languages support out parameters and this seems wrong, since in general you don't expect parameters to be modified).
function returns object/pair consisting both values (downside is that you have to create artificial class just to return function result or use pair which have no semantic meaning - you know which argument is which by it's order).
if your status is just success/failure you can return computation value, and in case of error throw an exception (look like the best approach, but works only with success/failure scenario and shouldn't be abused for controlling normal program flow).
function returns value, function arguments are delegates to onSuccess/onFailure procedures.
there is a (state-full) method class which have status field, and method returning computation results (I prefer having state-less/immutable objects).
Please, give me some hints on pros, cons and situations' preconditions of using aforementioned approaches or show me other patterns which I could use (preferably with hints on preconditions when to use them).
EDIT:
Real-world example:
I am developing java ee internet application and I have a class resolving request parameters converting them from string to some business logic objects. Resolver is checking in db if object is being created or edited and then return to controller either new object or object fetched from db. Controller is taking action based on object status (new/editing) read from resolver. I know it's bad and I would like to improve code design here.
function returns computation result, status is being returned via out
parameter (not all languages support out parameters and this seems
wrong, since in general you don't expect parameters to be modified).
If the language supports multiple output values, then the language clearly was made to support them. It would be a shame not to use them (unless there are strong opinions in that particular community against them - this could be the case for languages that try and do everything)
function returns object/pair consisting both values (downside is that
you have to create artificial class just to return function result or
use pair which have no semantic meaning - you know which argument is
which by it's order).
I don't know about that downside. It seems to me that a record or class called "MyMethodResult" should have enough semantics by itself. You can always use such a class in an exception as well, if you are in an exceptional condition only of course. Creating some kind of array/union/pair would be less acceptable in my opinion: you would inevitably loose information somewhere.
if your status is just success/failure you can return computation
value, and in case of error throw an exception (look like the best
approach, but works only with success/failure scenario and shouldn't
be abused for controlling normal program flow).
No! This is the worst approach. Exceptions should be used for exactly that, exceptional circumstances. If not, they will halt debuggers, put colleagues on the wrong foot, harm performance, fill your logging system and bugger up your unit tests. If you create a method to test something, then the test should return a status, not an exception: to the implementation, returning a negative is not exceptional.
Of course, if you run out of bytes from a file during parsing, sure, throw the exception, but don't throw it if the input is incorrect and your method is called checkFile.
function returns value, function arguments are delegates to
onSuccess/onFailure procedures.
I would only use those if you have multiple results to share. It's way more complex than the class/record approach, and more difficult to maintain. I've used this approach to return multiple results while I don't know if the results are ignored or not, or if the user wants to continue. In Java you would use a listener. This kind of operation is probably more accepted for functinal languages.
there is a (state-full) method class which have status field, and
method returning computation results (I prefer having
state-less/immutable objects).
Yes, I prefer those to. There are producers of results and the results themselves. There is little need to combine the two and create a stateful object.
In the end, you want to go to producer.produceFrom(x): Result in my opinion. This is either option 1 or 2a, if I'm counting correctly. And yes, for 2a, this means writing some extra code.
My inclination would be to either use out parameters or else use an "open-field" struct, which simply contains public fields and specifies that its purpose is simply to carry the values of those fields. While some people suggest that everything should be "encapsulated", I would suggest that if a computation naturally yields two double values called the Moe and Larry coefficients, specifying that the function should return "a plain-old-data struct with fields of type double called MoeCoefficient and LarryCoefficient" would serve to completely define the behavior of the struct. Although the struct would have to be declared as a data type outside the method that performs the computation, having its contents exposed as public fields would make clear that none of the semantics associated with those values are contained in the struct--they're all contained in the method that returns it.
Some people would argue that the struct should be immutable, or that it should include validation logic in its constructor, etc. I would suggest the opposite. If the purpose of the structure is to allow a method to return a group of values, it should be the responsibility of that method to ensure that it puts the proper values into the structure. Further, while there's nothing wrong with a structure exposing a constructor as a "convenience member", simply having the code that will return the struct fill in the fields individually may be faster and clearer than calling a constructor, especially if the value to be stored in one field depends upon the value stored to the other.
If a struct simply exposes its fields publicly, then the semantics are very clear: MoeCoefficient contains the last value that was written to MoeCoefficient, and LarryCoefficient contains the last value written to LarryCoefficient. The meaning of those values would be entirely up to whatever code writes them. Hiding the fields behind properties obscures that relationship, and may impede performance as well.

Using return statements to great effect!

When I am making methods with return values, I usually try and set things up so that there is never a case when the method is called in such a way that it would have to return some default value. When I started I would often write methods that did something, and would either return what they did or, if they failed to do anything, would return null. But I hate having ugly if(!null) statements all over my code,
I'm reading a re-guide to ruby that I read many moons ago, by the pragmatic programmers, and I notice that they often return self (ruby's this) when they wouldn't normally return anything. This is, they say, in order to be able to chain method calls, as in this example using setters that return the object whose attributes they set.
tree.setColor(green).setDecor(gaudy).setPractical(false)
Initially I find this sort of thing attractive. There have been a couple of times when I have rejoiced at being able to chain method calls, like Player.getHand().getSize() but this is somewhat different in that the object of the method call changes from step to step.
What does Stack Overflow think about return values? Are there any patterns or idioms that come to mind warmly when you think of return values? Any great ways to avoid frustration and increase beauty?
In my humble opinion, there are three kinds of return-cases that you should take into consideration:
Object property manipulation
The first is the manipulation of object properties. The pattern you describe here is very often used when manipulating objects. A very typical scenario is using it together with a factory. Consider this hypothetical creation call:
// When the object has manipulative methods:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes();
// When the factory has manipulative methods working on the
// object, IMHO more elegant from a semantic point of view:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes().getPizza();
It allows for a quick grasp at what exactly is being created or how an object is manipulated, because the methods form one human-readable expression. It's definitely nice, but don't overuse. A rule of thumb is that this might be used with methods whose return value you could also declare as void.
Evaluating object properties
The second might be when a method evaluates something on an object. Consider, for example, the method car.getCurrentSpeed(), that could be interpreted as a message to an object asking for the current speed and returning that. It would simply return the value, not too complicated. :)
Make object do this or that
The third might be when a method makes an perform an operation, returning some sort of value indicating how well the caller's intention was fulfilled - but laying out such a method could be difficult:
int new_gear = 20;
if (car.gears.changeGear(new_gear)) // does that mean success or fail?
This is where you can see a difficulty in designing the method. Should it return 0 upon success or failure? How about -1 if the gear could not be set, because the car only has 5 gears? Does that mean the current gear is at -1 now, too? The method could return the gear it changed to, meaning you would have to compare the argument supplied to the method to the return code. That would work. On the other hand, you could simply return either true or false for failure or false or true for failure. Which one to use could be decided by estimating if you'd expect those method calls to rather fail or succeed.
In my humble opinion, there is a way to better express the semantics of such return values, by giving them a semantic description. Future developers interacting with your objects will love you for not having to look up the comments or documentation for your methods:
class GearSystem {
// (...)
public:
enum GearChangeResult
{ GearChangeSuccess, NonExistingGear, MechanicalGearProblem };
GearChangeResult changeGear (int gear);
};
That way, it becomes perfectly obvious for any programmer looking at your code, what the return value means (consider: if (gears.changeGear(20) == GearSystem::GearChangeSuccess) - much clearer what that means than the example above)
Antipattern: Failures as return codes.
The fourth possibility for a return value I actually omitted, because in my opinion it isn't any: when there's an error in your program, like a logic error or a failure that needs to be dealt with - you could theoretically return a value indicating so. But today, that's not done so often anymore (or should not be), because for that, there are exceptions.
I don't agree that methods should never return null. The most obvious examples are from systems programming. For instance, if someone asks to open a file, you simply have to give them null if the open fails. There is no sane alternative. There are other cases where null is appropriate, such as a getNextNode(node) method, when called on the last node of a linked list. So I guess what these cases have in common is that null represents "no object" (either no file handle or no list node), which makes sense.
In other cases, the method should never fail, and there is an appropriate exception facility. Then, I think method chaining like your example can be used to great effect. I think it's a bit funny that you seem to believe this is an innovation of the "Pragmatic Programmers". In fact, it dates to Lisp if not before.
Returning this is also used in the "builder pattern", another case where method chaining can enhance readability as well as writing convenience.
A null is often returned as an out-of-band value to indicate that no result could be produced. I believe that this is perfectly reasonable when getting no result is a normal event; examples would include a null return from readLine() at end-of-file, or a null returned when providing a non-existent key to the get(...) method of a Map. Reading to the end of the file is normal behavior (as opposed to an IOException, which indicates that something went abnormally wrong while trying to read). Similarly, looking up a key and being told that it has no value is a normal case.
A good alternative to null for some cases is a "null object", which is a full-fledged instance of the result class, but which has appropriate state and behavior for a "nobody's home" case. For instance, the result of looking up a non-existent user ID might well be a NullUser object which has a zero-length name and no permissions to do anything in the system.
It's confusing to me. OO programming languages need Smalltalk's semicolon:
tree color: green;
decor: gaudy;
practical: false.
obj method1; method2. means "call method1 on obj then method2 on obj". This kind of object setup is very common.

Resources