How do I refactor chained methods?

Starting with this code:
new Person("ET").WithAge(88)
How can it be refactored to:
new Person("ET", 88)
What sequence of refactorings needs to be performed to complete the transformation?
Why? Because there could be hundreds of these, and I wouldn't want to introduce errors by doing it manually.
Would you say a drawback with fluent interfaces is they can't easily be refactored?
NOTE: I want to do this automatically without hand typing the code.

Perhaps the simplest way to refactor this is to change the name "WithAge" to "InitAge", make InitAge private, then call it from your constructor instead. Then update all references of new Person(string).WithAge(int) to use the new constructor.
If WithAge is a one-liner, you can just move the code to your new constructor instead, and do away with InitAge altogether, unless having the additional method provides extra readability.
Having good unit tests will isolate where errors are introduced, if they are.

Assuming that WithAge is a method on Person that returns a Person, what about something like
Person(string name, int age)
{
this.name = name;
this.WithAge(age);
}
Or more generalized:
Person(SomeType originalParameter, FluentParamType fluentParameter)
{
//Original constructor stuff
this.FluentMethod(fluentParameter);
}
Then make the FluentMethod private if you don't want it exposed, or keep it public if you want to allow both ways.

If this is C# (ideally you would tag the question with the language), the Person class needs this constructor:
public Person(string name, int age)
: this(name) { WithAge(age); }
To then change all client code to call this new constructor where appropriate, you would need to find all occurrences of the pattern:
new Person(x1).WithAge(x2)
where x1 and x2 are expressions, and replace them with:
new Person(x1, x2)
If there are other modifier methods aside from WithAge, it might get more complicated. For example:
new Person(x1).WithHair(x2).WithAge(x3)
Perhaps you'd want that to become:
new Person(x1, x3).WithHair(x2)
It all depends on whether you have an IDE that lets you define language-aware search/replace patterns like that. You can get a long way to the solution with simple textual search and replace, combined with a macro that replays a sequence of key presses.
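As a rough illustration of the plain textual search-and-replace route, here is a minimal Java sketch. The ChainRewriter class and its rewrite method are invented for this example. The pattern assumes that neither argument contains parentheses, so a call such as new Person(GetName()).WithAge(88) is deliberately left untouched and would need a syntax-aware tool.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any IDE or library.
class ChainRewriter {
    // Matches: new Person(<arg1>).WithAge(<arg2>) where neither argument
    // contains parentheses. Whitespace around the dot is tolerated.
    private static final Pattern CHAINED_CALL = Pattern.compile(
        "new\\s+Person\\(([^()]*)\\)\\s*\\.\\s*WithAge\\(([^()]*)\\)");

    static String rewrite(String source) {
        // $1 and $2 are the captured constructor and WithAge arguments.
        return CHAINED_CALL.matcher(source).replaceAll("new Person($1, $2)");
    }

    public static void main(String[] args) {
        // prints: var p = new Person("ET", 88);
        System.out.println(rewrite("var p = new Person(\"ET\").WithAge(88);"));
    }
}
```

Anything the pattern skips can then be reviewed by hand, which keeps the automated part limited to the simple cases.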
"Would you say a drawback with fluent interfaces is they can't easily be refactored?"
Not especially - it's more that refactoring features in IDEs are either designed flexibly enough to let you creatively invent new refactorings, or else they are hard-coded for certain common cases. I'd prefer the common cases to be defined as examples that I could mutate to invent new ones.

I don't have any practical experience with that sort of thing, but if I were in your situation the place I'd go looking would be custom Eclipse refactorings (or the equivalent in Refactor! Pro for .NET if that's what you're using).
Basically what you want is a match and replace, except that your regular expressions should match abstract syntax trees rather than plain text. That's what automated refactorings are.
One risk of this refactoring is that the target version is less precise than the original. Consider:
class Person {
public Person(String name, int age);
public Person(String name, int numberOfChildren);
}
There is no way to tell which of these constructors the chained call to Person.WithAge should be replaced with.
So, automated support for this would have to check for such ambiguities before allowing you to proceed. If there is already a constructor with the target parameters, abort the refactoring.
Other than that it seems pretty straightforward. Give the new constructor the following content:
public Person(String name, int age) {
this(name);
withAge(age);
}
Then you can safely replace the original call with the new.
(There is a subtle additional risk, in that calling withAge within the constructor, i.e. on a partially constructed object, isn't quite the same as calling it after the constructor. The difference matters if you have an inheritance chain and if withAge does something non-trivial. But then that's what your unit tests are for...)
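To make the equivalence concrete, here is a minimal Java sketch of such a Person class. The fields and getters are assumptions for illustration, since the question does not show the real class.

```java
// Illustrative only: fields and getters are assumed, not taken from the question.
class Person {
    private final String name;
    private int age;

    Person(String name) {
        this.name = name;
    }

    // New constructor: delegates to the existing one, then reuses withAge.
    Person(String name, int age) {
        this(name);
        withAge(age);
    }

    // Existing fluent method; returns this so calls can be chained.
    Person withAge(int age) {
        this.age = age;
        return this;
    }

    String getName() { return name; }
    int getAge() { return age; }

    public static void main(String[] args) {
        Person chained = new Person("ET").withAge(88);
        Person direct = new Person("ET", 88);
        System.out.println(chained.getAge() == direct.getAge()); // prints true
    }
}
```

With a trivial withAge like this one, both forms leave the object in the same state. The caveat about partially constructed objects only starts to matter once withAge is overridden or does more work.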

Write unit tests for the old code.
Refactor until the tests pass again.

Related

What's the better coding style in oop: one method with one parameter VS two methods without parameters?

What's the better way for clean code from the oop point of view? Having two related methods with different names or one common method with an extra parameter?
(Simplified) Example:
1.) public void LogError() { ... }
public void LogWarning() { ... }
VS
2.) public void Log(LogType logType) { ... } //LogType.Error vs LogType.Warning
Both are good choices. Maybe a few examples can make it more clear. Usually, I try to think who is gonna use the library (me or someone else) and what programming language I use.
For example:
If I use a statically typed language like Java or C#, then I prefer choice 2.
If I use something else like PHP or Python, then I prefer choice 1.
If I want to make a simplified interface for other developers that are gonna use my library, for example, then I prefer choice 1 too.
When you have LogType enum for example, then it really doesn’t matter. Just try to think about how to describe the intent and make it clear.
Watch out for boolean parameters, which are often confusing. For example:
public void SaveProduct(bool cache) { ... }
In those situations, choice 1 is usually better because it can be very hard to tell from the call site what the boolean value does (how it changes the behavior). A boolean parameter also usually indicates that the method is doing two different things, so there is probably a way to refactor it, for example by splitting it into two methods so that the developer does not need to know about the implementation details.
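A small Java sketch of that split, using a hypothetical ProductStore class. The class name, method names and in-memory lists are invented for illustration and stand in for a real database and cache.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; the lists stand in for a real database and cache.
class ProductStore {
    final List<String> saved = new ArrayList<>();
    final List<String> cached = new ArrayList<>();

    // Before: the call site saveProduct("tea", true) does not say what true means.
    void saveProduct(String product, boolean cache) {
        saved.add(product);
        if (cache) {
            cached.add(product);
        }
    }

    // After: two intention-revealing methods, no flag to decode.
    void saveProduct(String product) {
        saved.add(product);
    }

    void saveAndCacheProduct(String product) {
        saveProduct(product);
        cached.add(product);
    }
}
```

A call such as store.saveAndCacheProduct("coffee") now documents itself, whereas store.saveProduct("coffee", true) forces the reader to look up the parameter.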

When to use Encapsulate Collection?

In the smell Data Class as Martin Fowler described in Refactoring, he suggests if I have a collection field in my class I should encapsulate it.
The pattern Encapsulate Collection (208) says we should add the following methods:
get_unmodified_collection
add_item
remove_item
and remove these:
get_collection
set_collection
This ensures that any change to the collection has to go through the class.
Should I refactor every class which has a collection field with this pattern? Or does it depend on other factors, such as frequency of usage?
I use C++ in my project now.
Any suggestion would be helpful. Thanks.
These are well formulated questions and my answer is:
"Should I refactor every class which has a collection field with this pattern?"
No, you should not refactor every class which has a collection field. Every fundamentalism is a way to hell. Use common sense and do not make your design too good, just good enough.
Or it depends on some other reasons like frequency of usage?
The second question comes from a common mistake. The reason why we refactor or use design patterns is not primarily the frequency of use. We do it to make the code clearer, more maintainable, more extensible, more understandable and sometimes (but not always!) more efficient. Everything which adds to these goals is good. Everything which does not is bad.
You might have expected a yes/no answer, but that is not possible here. As said, use your common sense and measure your solution against the viewpoints mentioned above.
I generally like the idea of encapsulating collections. Also encapsulating plain Strings into named business classes. I do it almost always when the classes are meaningful in the business domain.
I would always prefer
public class People {
private final Collection<Man> people;
... // useful methods
}
over the plain Collection<Man> when Man is a business class (a domain object). Or I would sometimes do it in this way:
// List rather than Collection, because Collection has no get(int)
public class People implements List<Man> {
    private final List<Man> people;
    ... // delegate methods, such as
    @Override
    public int size() {
        return people.size();
    }
    @Override
    public Man get(int index) {
        // Here might also be some manipulation with the returned data etc.
        return people.get(index);
    }
    @Override
    public boolean add(Man man) {
        // Decoration - added some validation
        if (/* man does not match some criteria */) {
            return false;
        }
        return people.add(man);
    }
    ... // useful methods
}
Or similarly I prefer
public class StreetAddress {
private final String value;
public String getTextValue() { return value; }
...
// later I may add more business logic, such as parsing the street address
// to street name and house number etc.
}
over just using a plain String streetAddress. That way I keep the door open to any future change of the underlying logic and to adding any useful methods.
However, I try not to over-engineer my design when it is not needed, so I am just as happy with plain collections and plain Strings when they are better suited.
I think it depends on the language you are developing with, since some languages, C# and Java for example, already have interfaces that do just that. In C# we have ICollection, IEnumerable and IList; in Java, Collection, List, etc.
If your language doesn't have an interface for referring to a collection regardless of its inner implementation, and you need your own abstraction of that class, then it's probably a good idea to create one. And yes, you should not let the collection be modified directly, since that completely defeats the purpose.
It would really help if you told us which language you are developing with. Granted, it is kind of a language-agnostic question, but people knowledgeable in that language might recommend its best practices and tell you whether there's already a way to achieve what you need.
The motivation behind Encapsulate Collection is to reduce the coupling of the collection's owning class to its clients.
Every refactoring tries to improve the maintainability of the code, so that future changes are easier. In this case, changing the collection class from vector to list, for example, changes all the clients' uses of the class. If you encapsulate the collection with this refactoring, you can change it without changing the clients. This follows one of the SOLID principles, the dependency inversion principle: Depend upon Abstractions. Do not depend upon concretions.
You have to decide for your own code base, whether this is relevant for you, meaning that your code base is still being changed and has to be maintained (then yes, do it for every class) or not (then no, leave the code be).
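For reference, a minimal sketch of the shape that Encapsulate Collection produces, written in Java like the snippets in the earlier answer (the question itself concerns C++). The Course class and its method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical owner class; the names are illustrative.
class Course {
    // The concrete type can later change without touching clients.
    private final List<String> students = new ArrayList<>();

    // Read-only view: callers can iterate but not modify.
    List<String> getStudents() {
        return Collections.unmodifiableList(students);
    }

    // All changes go through the owner, which can validate them.
    boolean addStudent(String name) {
        if (name == null || name.isEmpty() || students.contains(name)) {
            return false;
        }
        return students.add(name);
    }

    boolean removeStudent(String name) {
        return students.remove(name);
    }
}
```

Clients that try course.getStudents().add("Bob") get an UnsupportedOperationException, so the only way to change the collection is through addStudent and removeStudent.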

Is there a good way to use polymorphism to remove this switch statement?

I've been reading on refactoring and replacing conditional statements with polymorphism. The trouble I have is that it only seems to make sense to me when you have a more complex case where, without polymorphism, you would have to repeat the same switch statements or if-elses many times. I don't see how it makes sense if you're only doing it once - you have to have that conditional somewhere, right?
As an example, I recently wrote the following class, which is responsible for reading an XML file and converting its data into the program's objects. There are 2 possible formats for the file that we are supporting, so I simply wrote a method in the class for handling each one, and used a switch statement to determine which one to use:
public class ComponentXmlReader
{
    public IEnumerable<UserComponent> ImportComponentsFromXml(string path)
    {
        var xmlFile = XElement.Load(path);
        switch (xmlFile.Name.LocalName)
        {
            case "CaseDefinition":
                return ImportComponentsFromA(xmlFile);
            case "Root":
                return ImportComponentsFromB(xmlFile);
            default:
                // Without a default, C# reports that not all code paths return a value.
                throw new NotSupportedException("Unknown root element: " + xmlFile.Name.LocalName);
        }
    }
    private IEnumerable<UserComponent> ImportComponentsFromA(XContainer file)
    {
        //do stuff
    }
    private IEnumerable<UserComponent> ImportComponentsFromB(XContainer file)
    {
        //do stuff
    }
}
As far as I can tell, I could write a class hierarchy for this to do the parsing, but I don't see the advantage here - I'd still have to use a case-switch to determine which class to instantiate. It looks to me like it would be extra complexity for no benefit. If I was going to keep these classes around and do more things with them that depended on the file type, then it would eliminate doing the same switch in multiple places, but this is single-use. Is this right, or is there some reason or technique I'm not seeing that makes it a good idea to use a polymorphic class hierarchy to do this?
If you had, say, an abstract ComponentImporter class, with concrete subclasses FromA and FromB, you could instantiate one of each, and put it in a Map. Then you could call componentImporterMap.get(xmlFile.Name.LocalName).importComponents() and avoid the switch.
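A minimal sketch of that map-based dispatch, written in Java to match the Map and get wording of this answer. ComponentImporter, FromA and FromB follow the names used above; the ImporterRegistry class, the importerFor method and the String return type are placeholders invented for this sketch, since the real importers would return the parsed components.

```java
import java.util.HashMap;
import java.util.Map;

abstract class ComponentImporter {
    // Placeholder return type; the real code would return parsed components.
    abstract String importComponents();
}

class FromA extends ComponentImporter {
    @Override
    String importComponents() { return "parsed format A"; }
}

class FromB extends ComponentImporter {
    @Override
    String importComponents() { return "parsed format B"; }
}

// Hypothetical registry keyed by the XML root element name.
class ImporterRegistry {
    private static final Map<String, ComponentImporter> IMPORTERS = new HashMap<>();
    static {
        IMPORTERS.put("CaseDefinition", new FromA());
        IMPORTERS.put("Root", new FromB());
    }

    static ComponentImporter importerFor(String rootName) {
        ComponentImporter importer = IMPORTERS.get(rootName);
        if (importer == null) {
            // Plays the role of the switch's default branch.
            throw new IllegalArgumentException("Unsupported format: " + rootName);
        }
        return importer;
    }
}
```

Supporting a third format then means registering one more entry rather than editing a switch, although the lookup for an unknown key still has to be handled somewhere.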
As with all design choices, context is key. In this case, you have what seems to be a fairly simple class handling two very similar tasks. If the two Import methods contained very little duplicate code, then including them in a single class is perhaps the best choice since, as you say, it reduces complexity.
However, it's possible you'll use this class in the future, and even add new types of imports. In that case, the class would be more reusable if it was polymorphic.
Additionally, since these methods sound very similar, you're likely to have a bunch of duplicate code, which you could keep in a base class and only put import-specific code in the child classes.
Plus, as Carl mentions, there are a number of ways to implement this logic without using a case statement.

Method naming convention

If a method takes a class/struct as an input parameter, what is the best way to name it?
Example:
class Person{}
class Address{}
class Utility{
//name style 1 - use method overloading
public void Save(Person p){}
public void Save(Address a){}
//name style 2 - use unique names that define what they are doing
//or public void SavePerson(Person p){}
//and public void SaveAddress(Address a){}
}
I personally like style 1 (Use the languages features - in this case overloading).
If you like style 1, can you point me to any "official" documentation, that states this to be a standard?
I would say your challenge is not in the field of method naming, but rather type design. A type that is responsible for saving both Person objects and Address objects seems like a type with more than one responsibility. Such a type will tend to grow and grow and grow and will eventually get hard to maintain. If you instead create more specialized types, method naming may automatically become a simpler task.
If you still want to collect these methods in the same type, it's mostly a matter of style. One thing to think about is whether this type may be consumed by code written in another language that does not support method overloading. In such cases the longer names are the way to go. Otherwise just stick to what feels best (or whatever is the ruling convention at your workplace).
It is a matter of style.
If you don't like long method names, go with 1.
If you don't like long overload lists, go with 2.
The important bit is to keep consistent, so do not mix the two styles in one project.
If you are seeing that you have many such methods, you may need to rethink your design - perhaps a solution involving inheritance would be more appropriate.
Distinct names avoid entirely any problems associated with method overloading. For example:
Ambiguity is avoided if an argument's type matches more than one of the candidates.
In C++, overloaded methods can hide those of the same name in a superclass.
In Java, type erasure prevents overloaded methods differing only by type parameterization.
It would also be worthwhile to ask whether polymorphism could be used instead of overloading.
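One concrete reason to ask that question: in Java, an overload is chosen at compile time from the declared type of the argument, whereas an overridden method is chosen at run time from the actual object. The sketch below shows the difference; the Employee subclass, the Saver class and the describe method are hypothetical names introduced only for this illustration.

```java
class Person {
    // Overridden in Employee; dispatched on the runtime type.
    String describe() { return "person"; }
}

class Employee extends Person {
    @Override
    String describe() { return "employee"; }
}

class Saver {
    // Overloads; selected from the static (declared) type of the argument.
    String save(Person p) { return "saved as person"; }
    String save(Employee e) { return "saved as employee"; }
}

class OverloadDemo {
    public static void main(String[] args) {
        Person p = new Employee(); // declared Person, actually an Employee
        System.out.println(new Saver().save(p)); // prints "saved as person"
        System.out.println(p.describe());        // prints "employee"
    }
}
```

If the behavior should follow the actual object rather than the declared type, an overridden method on the class itself is the safer choice.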

Where do you add new methods?

When you add a new method to a class where do you put it? At the end of the class...the top? Do you organize methods into specific groupings? Sorted alphabetically?
Just looking for general practices in keeping class methods organized.
Update When grouped where do you add the new method in the group? Just tack on the end or do you use some sort of sub-grouping, sorting?
Update 2 Mmmm...guess the question isn't as clear as I thought. I'm not really looking for class organization. I'm specifically interested in adding a new method to an existing class. For example:
public class Attendant
{
    public void GetDrinks(){}
    public void WelcomeGuests(){}
    public void PickUpTrash(){}
    public void StrapIn(){}
}
Now we're going to add a new method PrepareForCrash(). Where does it go? At the top of the list, bottom, alphabetically or near the StrapIn() method since it's related.
Near "StrapIn" because it's related. That way if you refactor later, all related code is nearby.
Most code editors allow you to browse method names alphabetically in another pane, so organizing your code functionally makes sense within the actual code itself. Grouping functionally related methods together makes life easier when navigating through the class.
For goodness sake, not alphabetically!
I tend to group my functions in the order I expect them to be called during the life of the object, so that a top to bottom read of the header file tends to explain the operation of the class.
I think it's a personal choice.
However I like to organise my classes as such.
public class classname
{
<member variables>
<constructors>
<destructor>
<public methods>
<protected methods>
<private methods>
}
The reason for this is as follows.
Member variables at the top: to see what member variables exist and whether they are initialised.
Constructors: to see whether the member variables are set up/initialised, as well as what construction options the class offers.
Destructor: to see how the class is cleaned up, and to verify that against the constructors and member variables.
Public methods: to see which contracts are available to callers of the object.
Protected methods: to see what inheriting classes would be using.
Private methods: these are information about the internals of the class, so if you need to know about the internals you can scroll straight to the end. What you need to know about the class's interface is all at the start.
UPDATE - Based on OP's update
Logically, a good way would be to organise the methods by category according to what they do.
This way you get the readability of categorised methods as well as the alphabetical search from your IDE (provided your IDE offers this).
In a practical sense, however, I think placing the new method at the end of its section is the best way. If the code is shared with other people, it would be quite hard to continually police where each method goes, because the choice is subjective.
If you were to make this a standard, it would be quite hard to define the boundaries for where to put each method.
What I like about C# and VB.NET is the ability to use #region tags, so generally my classes look like this:
class MyClass
{
#region Constructors
public MyClass()
{
}
public MyClass(int x)
{
_x = x;
}
#endregion
#region Members
private int _x;
#endregion
#region methods
public void DoSomething()
{
}
#endregion
#region Properties
public int Y {get; private set;}
#endregion
}
So basically you put similar things together, so that you can collapse everything to definitions and get to your stuff much faster.
Generally, it depends on the existing grouping; if there's an existing grouping that the new method fits into, I'll put it there. For example, if there's a grouping of operators, I'll put the new method with the operators if it's an operator.
Of course, if there is no good grouping, adding a method may suggest a new grouping; I treat that as an opportunity for refactoring, and try to regroup the existing operators where reasonable.
I organize all methods into regions such as public methods and private methods, or sometimes by feature, such as saving methods.
IMHO:
If you organize your methods alphabetically, where you put a new one depends on its name. Otherwise, put it at the bottom of the related group. This helps to show which method is newer. The bigger problem is how to organize methods into groups, e.g. depending on which properties they use, but this is more individual and depends on the specific class.

Resources