How to simplify complicated business "IF" logic? - refactoring

What are good ways to handle complicated business logic that at first glance requires many nested if statements?
Example:
A discount coupon could be:
1a) Value discount
1b) Percentage discount
2a) Normal discount
2b) Progressive discount
3a) Requires access coupon
3b) Do not require access coupon
4a) Applied only to the customer who already bought before
4b) Applied to any customer
5a) Applied to customer only from countries (X,Y,…)
That requires code even more complicated than this:
if (discount.isPercentage) {
    if (discount.isNormal) {
        if (discount.requiresAccessCoupon) {
        } else {
        }
    } else if (discount.isProgressive) {
        if (discount.requiresAccessCoupon) {
        } else {
        }
    }
} else if (discount.isValue) {
    if (discount.isNormal) {
        if (discount.requiresAccessCoupon) {
        } else {
        }
    } else if (discount.isProgressive) {
        if (discount.requiresAccessCoupon) {
        } else {
        }
    }
} else if (discount.isXXX) {
    if (discount.isNormal) {
    } else if (discount.isProgressive) {
    }
}
Even if you replace the IFs with switch/case, it's still too complicated.
What are the ways to make it readable, maintainable, more testable and easy to understand?

Good question. "Conditional Complexity" is a code smell. Polymorphism is your friend.
Conditional logic is innocent in its infancy, when it’s simple to understand and contained within a
few lines of code. Unfortunately, it rarely ages well. You implement several new features and
suddenly your conditional logic becomes complicated and expansive. [Joshua Kerievsky: Refactoring to Patterns]
One of the simplest things you can do to avoid nested if blocks is to learn to use Guard Clauses.
double getPayAmount() {
    if (_isDead) return deadAmount();
    if (_isSeparated) return separatedAmount();
    if (_isRetired) return retiredAmount();
    return normalPayAmount();
}
The other thing I have found simplifies things pretty well, and which makes your code self-documenting, is Consolidating conditionals.
double disabilityAmount() {
    if (isNotEligibleForDisability()) return 0;
    // compute the disability amount
    ...
}
Other valuable refactoring techniques associated with conditional expressions include Decompose Conditional, Replace Conditional with Visitor, and Reverse Conditional.

Specification pattern might be what you are looking for.
Summary:
In computer programming, the specification pattern is a particular software design pattern, whereby business logic can be recombined by chaining the business logic together using boolean logic.
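As a rough sketch of that idea in Java: each business rule becomes a small object that answers one yes/no question, and rules are combined with and/or/not. The Customer class, the CouponRules helper and the rule names are assumptions made up for illustration (they loosely follow cases 4a and 5a from the question); they do not come from any library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical domain object, used only for illustration.
final class Customer {
    final String country;
    final int previousOrders;

    Customer(String country, int previousOrders) {
        this.country = country;
        this.previousOrders = previousOrders;
    }
}

// A specification answers one yes/no business question about a candidate.
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    default Specification<T> and(Specification<T> other) {
        return c -> isSatisfiedBy(c) && other.isSatisfiedBy(c);
    }

    default Specification<T> or(Specification<T> other) {
        return c -> isSatisfiedBy(c) || other.isSatisfiedBy(c);
    }

    default Specification<T> not() {
        return c -> !isSatisfiedBy(c);
    }
}

// Hypothetical rules: "bought before" (4a) and "only from countries X, Y" (5a).
final class CouponRules {
    static Specification<Customer> returningCustomer() {
        return c -> c.previousOrders > 0;
    }

    static Specification<Customer> fromCountries(String... countries) {
        Set<String> allowed = new HashSet<>(Arrays.asList(countries));
        return c -> allowed.contains(c.country);
    }
}
```

With this in place, "returning customers from X or Y" reads as CouponRules.returningCustomer().and(CouponRules.fromCountries("X", "Y")), and every rule can be unit-tested on its own instead of through a nested if.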

I would write a generic state-machine that feeds on lists of things to compare.

The object oriented way of doing it is to have multiple discount classes implementing a common interface:
discount.apply(order)
Put the logic for determining whether the order qualifies for the discount within the discount classes.
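A minimal sketch of what that could look like in Java. The Order class, the amounts in cents and the class names are all assumptions for illustration, not something from the question. Note that the "requires access coupon" rule is written as a wrapper around any other discount, so it does not have to be repeated in every branch.

```java
// Hypothetical order type; amounts are in cents to avoid floating-point money.
final class Order {
    final long totalCents;
    final boolean hasAccessCoupon;

    Order(long totalCents, boolean hasAccessCoupon) {
        this.totalCents = totalCents;
        this.hasAccessCoupon = hasAccessCoupon;
    }
}

// Common interface: each discount knows whether it applies and how much it takes off.
interface Discount {
    boolean qualifies(Order order);

    long amountOff(Order order);

    // Returns the new total, or the unchanged total if the order does not qualify.
    default long apply(Order order) {
        if (!qualifies(order)) return order.totalCents;
        return Math.max(0, order.totalCents - amountOff(order));
    }
}

final class ValueDiscount implements Discount {
    private final long cents;
    ValueDiscount(long cents) { this.cents = cents; }
    public boolean qualifies(Order order) { return true; }
    public long amountOff(Order order) { return cents; }
}

final class PercentageDiscount implements Discount {
    private final int percent;
    PercentageDiscount(int percent) { this.percent = percent; }
    public boolean qualifies(Order order) { return true; }
    public long amountOff(Order order) { return order.totalCents * percent / 100; }
}

// Wrapper that adds the "requires access coupon" rule to any other discount.
final class RequiresAccessCoupon implements Discount {
    private final Discount inner;
    RequiresAccessCoupon(Discount inner) { this.inner = inner; }
    public boolean qualifies(Order order) {
        return order.hasAccessCoupon && inner.qualifies(order);
    }
    public long amountOff(Order order) { return inner.amountOff(order); }
}
```

A 10% discount that needs an access coupon is then new RequiresAccessCoupon(new PercentageDiscount(10)), and the calling code only ever sees discount.apply(order).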

You should really see
Clean Code Talks - Inheritance, Polymorphism, & Testing
by Miško Hevery
Google Tech Talks
November 20, 2008
ABSTRACT
Is your code full of if statements? Switch statements? Do you have the same switch statement in various places? When you make changes do you find yourself making the same change to the same if/switch in several places? Did you ever forget one?
This talk will discuss approaches to using Object Oriented techniques to remove many of those conditionals. The result is cleaner, tighter, better designed code that's easier to test, understand and maintain.

Using guard clauses might help some.

FWIW, I have used Hamcrest very successfully for this sort of thing. I believe you could say that it implements the Specification Pattern that @Arnis talked about.

My first thought is that this is not testable, so my solution starts from making it testable.
if (discount.isPercentage) {
    callFunctionOne(...);
} else if (discount.isValue) {
    callFunctionThree(...);
} else if (discount.isXXX) {
    callFunctionTwo(...);
}
Then you can have each nested if statement be a separate call. This way you can test them individually and when you test the large group you know that each individual one works.

Make methods that check for a particular case.
bool IsValueNormalAndRequiresCoupon(Discount discount){...}
etc
Once you start doing that it becomes easier to see where you can abstract out common logic between the choices. You can then go from there.
For complex decisions I often end up with a class that handles the possible states.

Related

What's the better coding style in oop: one method with one parameter VS two methods without parameters?

Which is cleaner from an OOP point of view: having two related methods with different names, or one common method with an extra parameter?
(Simplified) Example:
1.) public void LogError() { ... }
public void LogWarning() { ... }
VS
2.) public void Log(LogType logType) { ... } //LogType.Error vs LogType.Warning
Both are good choices. Maybe a few examples can make it more clear. Usually, I try to think who is gonna use the library (me or someone else) and what programming language I use.
For example:
If I use a strongly typed language like Java, C#, etc., then I prefer choice 2.
If I use something else like PHP or Python, then I prefer choice 1.
If I want to make a simplified interface for other developers that are gonna use my library, for example, then I prefer choice 1 too.
When you have LogType enum for example, then it really doesn’t matter. Just try to think about how to describe the intent and make it clear.
Watch out for boolean parameters, which are often confusing. For example:
public void SaveProduct(bool cache) { ... }
In those situations, choice 1 is usually better because it can be very hard to tell from the call site what the boolean value does (how it changes the behavior). A boolean parameter also usually means that the method is doing two different things, so there is probably a way to refactor it, for example by splitting it into two methods so that the developer does not need to know about the implementation details.
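To make the boolean-parameter point concrete, here is a small Java sketch (the answer's SaveProduct is C#-style; the ProductStore class and its two lists are hypothetical stand-ins for a real database and cache). The boolean overload is kept only to show the before/after contrast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical store; the lists stand in for a real database and cache.
final class ProductStore {
    final List<String> saved = new ArrayList<>();
    final List<String> cached = new ArrayList<>();

    // Before: at the call site, saveProduct("tea", true) does not say what "true" means.
    void saveProduct(String name, boolean cache) {
        if (cache) {
            saveProductAndCache(name);
        } else {
            saveProduct(name);
        }
    }

    // After: two methods whose names state the intent.
    void saveProduct(String name) {
        saved.add(name);
    }

    void saveProductAndCache(String name) {
        saveProduct(name);
        cached.add(name);
    }
}
```

A call such as store.saveProductAndCache("coffee") can be read without opening the method, which is the whole argument for choice 1 here.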

Why are there two identical methods in Laravel's MessageBag?

I was reading the Laravel documentation on MessageBag, where I found the any() and isNotEmpty() methods. Both do exactly the same thing: they determine whether there are any messages and return true if there are. I looked at the source code and found that isNotEmpty() does nothing but call any().
public function isNotEmpty()
{
    return $this->any();
}
public function any()
{
    return $this->count() > 0;
}
What I don't understand is why Laravel does the same thing in two places. Shouldn't one method be sufficient for this job?
You are right that one of the two methods would be sufficient. There is no single reason why both methods exist, but I guess a combination of the following (and possibly more) explains it:
The any() method was there long before isNotEmpty(), and there has always been an isEmpty() method alongside any() (as its inverse).
Because isNotEmpty() is a more obvious method name than any(), it was added some years ago.
For backwards compatibility, and because the implementation of any() is trivial, there has never been a good reason to remove any().
It is actually quite common for programming languages and libraries to offer different methods for the same thing. Some do this more often and more visibly than others, but I guess it has a lot to do with readability. Although isNotEmpty() as in if ($messages->isNotEmpty()) { ... } is a lot longer than if ($messages->any()) { ... }, it seems more readable and understandable to me. But not everyone sees it the same way, and my best guess is that there is a fan of any() among the framework maintainers who doesn't want to write more than necessary.

When to use Encapsulate Collection?

In the smell Data Class as Martin Fowler described in Refactoring, he suggests if I have a collection field in my class I should encapsulate it.
The Encapsulate Collection refactoring (208) says we should add the following methods:
get_unmodified_collection
add_item
remove_item
and remove these:
get_collection
set_collection
This makes sure that any change to the collection has to go through the class.
Should I refactor every class which has a collection field with this pattern? Or does it depend on other factors, like frequency of usage?
I use C++ in my project now.
Any suggestion would be helpful. Thanks.
These are well formulated questions and my answer is:
Should I refactor every class which has a collection field with this pattern?
No, you should not refactor every class which has a collection field. Every fundamentalism is a way to hell. Use common sense and do not make your design too good, just good enough.
Or it depends on some other reasons like frequency of usage?
The second question comes from a common mistake. The reason why we refactor or use design patterns is not primarily the frequency of use. We do it to make the code clearer, more maintainable, more extensible, more understandable, and sometimes (but not always!) more efficient. Everything which adds to these goals is good. Everything which does not is bad.
You might have expected a yes/no answer, but that is not possible here. As said, use your common sense and measure your solution against the goals mentioned above.
I generally like the idea of encapsulating collections. Also encapsulating plain Strings into named business classes. I do it almost always when the classes are meaningful in the business domain.
I would always prefer
public class People {
    private final Collection<Man> people;
    ... // useful methods
}
over the plain Collection<Man> when Man is a business class (a domain object). Or I would sometimes do it in this way:
public class People implements List<Man> {
    // List rather than Collection, because get(int) is declared on List only
    private final List<Man> people;
    ... // delegate methods, such as
    @Override
    public int size() {
        return people.size();
    }
    @Override
    public Man get(int index) {
        // Here might also be some manipulation with the returned data etc.
        return people.get(index);
    }
    @Override
    public boolean add(Man man) {
        // Decoration - added some validation
        if (/* man does not match some criteria */) {
            return false;
        }
        return people.add(man);
    }
    ... // useful methods
}
Or similarly I prefer
public class StreetAddress {
    private final String value;
    public String getTextValue() { return value; }
    ...
    // later I may add more business logic, such as parsing the street address
    // to street name and house number etc.
}
over just using a plain String streetAddress. That way I keep the door open to any future change of the underlying logic and to adding any useful methods.
However, I try not to over-engineer my design when it is not needed, so I am just as happy with plain collections and plain Strings when they are better suited.
I think it depends on the language you are developing with, since some languages already have interfaces that do just that. In C# we have ICollection, IEnumerable and IList; in Java, Collection, List, etc.
If your language doesn't have an interface for referring to a collection regardless of its inner implementation, and you need your own abstraction of that class, then it's probably a good idea to encapsulate. And yes, you should not let the collection be modified directly, since that completely defeats the purpose.
It would really help if you told us which language you are developing with. Granted, it is kind of a language-agnostic question, but people knowledgeable in that language might recommend its best practices and tell you whether there's already a way to achieve what you need.
The motivation behind Encapsulate Collection is to reduce the coupling of the collection's owning class to its clients.
Every refactoring tries to improve the maintainability of the code so that future changes are easier. In this case, changing the collection class from vector to list, for example, forces changes in every client that uses the class. If you encapsulate the collection with this refactoring, you can change it without changing the clients. This follows one of the SOLID principles, the dependency inversion principle: Depend upon Abstractions. Do not depend upon concretions.
You have to decide for your own code base, whether this is relevant for you, meaning that your code base is still being changed and has to be maintained (then yes, do it for every class) or not (then no, leave the code be).
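For reference, a minimal sketch of the refactoring discussed in this thread, written in Java to match the other examples here (the question itself is about C++). The Course class and its method names are hypothetical; they correspond to get_unmodified_collection, add_item and remove_item from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical owner class; the names are for illustration only.
final class Course {
    private final List<String> students = new ArrayList<>();

    // Read-only view: callers can iterate, but cannot modify the list directly.
    List<String> getStudents() {
        return Collections.unmodifiableList(students);
    }

    // All changes go through the owner, so rules can be enforced in one place.
    boolean addStudent(String name) {
        if (name == null || name.isEmpty() || students.contains(name)) {
            return false;
        }
        return students.add(name);
    }

    boolean removeStudent(String name) {
        return students.remove(name);
    }
}
```

There is deliberately no setter for the whole list, and clients never learn that an ArrayList is used, so the backing collection can be swapped later without touching them.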

Is there a good way to use polymorphism to remove this switch statement?

I've been reading on refactoring and replacing conditional statements with polymorphism. The trouble I have is that it only seems to make sense to me when you have a more complex case where, without polymorphism, you would have to repeat the same switch statements or if-elses many times. I don't see how it makes sense if you're only doing it once - you have to have that conditional somewhere, right?
As an example, I recently wrote the following class, which is responsible for reading an XML file and converting its data into the program's objects. There are 2 possible formats for the file that we are supporting, so I simply wrote a method in the class for handling each one, and used a case-switch to determine which one to use:
public class ComponentXmlReader
{
    public IEnumerable<UserComponent> ImportComponentsFromXml(string path)
    {
        var xmlFile = XElement.Load(path);
        switch (xmlFile.Name.LocalName)
        {
            case "CaseDefinition":
                return ImportComponentsFromA(xmlFile);
            case "Root":
                return ImportComponentsFromB(xmlFile);
            default:
                // needed so that every code path returns or throws
                throw new NotSupportedException("Unknown root element: " + xmlFile.Name.LocalName);
        }
    }
    private IEnumerable<UserComponent> ImportComponentsFromA(XContainer file)
    {
        //do stuff
    }
    private IEnumerable<UserComponent> ImportComponentsFromB(XContainer file)
    {
        //do stuff
    }
}
As far as I can tell, I could write a class hierarchy for this to do the parsing, but I don't see the advantage here - I'd still have to use a case-switch to determine which class to instantiate. It looks to me like it would be extra complexity for no benefit. If I was going to keep these classes around and do more things with them that depended on the file type, then it would eliminate doing the same switch in multiple places, but this is single-use. Is this right, or is there some reason or technique I'm not seeing that makes it a good idea to use a polymorphic class hierarchy to do this?
If you had, say, an abstract ComponentImporter class, with concrete subclasses FromA and FromB, you could instantiate one of each, and put it in a Map. Then you could call componentImporterMap.get(xmlFile.Name.LocalName).importComponents() and avoid the switch.
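A rough Java sketch of that map-based lookup. ComponentImporter and ImporterRegistry are hypothetical names, and the importers return a plain String instead of real components just to keep the example short. The two keys mirror the root element names from the question's switch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical importer interface; a real one would return parsed components.
interface ComponentImporter {
    String importComponents(String xml);
}

final class ImporterRegistry {
    private final Map<String, ComponentImporter> importersByRoot = new HashMap<>();

    ImporterRegistry() {
        // Root element name -> importer, mirroring the two cases in the switch.
        importersByRoot.put("CaseDefinition", xml -> "format A: " + xml);
        importersByRoot.put("Root", xml -> "format B: " + xml);
    }

    String importComponents(String rootName, String xml) {
        ComponentImporter importer = importersByRoot.get(rootName);
        if (importer == null) {
            // The lookup can miss, so the unknown-format case still has to be handled.
            throw new IllegalArgumentException("Unsupported root element: " + rootName);
        }
        return importer.importComponents(xml);
    }
}
```

The conditional has not disappeared; it has moved into the map's registration. That only pays off once new formats are added, which supports the asker's point that for a single two-way switch it may not be worth it.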
As with all design choices, context is key. In this case, you have what seems to be a fairly simple class handling two very similar tasks. If the two Import methods contained very little duplicate code, then including them in a single class is perhaps the best choice since, as you say, it reduces complexity.
However, it's possible you'll use this class in the future, and even add new types of imports. In that case, the class would be more reusable if it was polymorphic.
Additionally, since these methods sound very similar, you're likely to have a bunch of duplicate code, which you could keep in a base class and only put import-specific code in the child classes.
Plus, as Carl mentions, there are a number of ways to implement this logic without using a case statement.

Is there a DSL or declarative system for TPL Dataflow?

Is there any DSL or other fully- or partially-declarative mechanism for declaring TPL Dataflow flows? Or is the best (only) practice to just wire them up in code?
Failing that, is there any DSL or other fully- or partially-declarative mechanism for using any dataflow library that I could use as a model and/or source of ideas?
(I've searched without success so maybe one doesn't exist ... or maybe I didn't find it.)
Update: To answer @svick below as to why I want this and what I gain by it:
First, I just like a sparser syntax that more clearly shows the flow rather than the details. I think
downloadString => createWordList => filterWordList => findPalindromes;
is preferable to
downloadString.LinkTo(createWordList);
createWordList.LinkTo(filterWordList);
filterWordList.LinkTo(findPalindromes);
findPalindromes.LinkTo(printPalindrome);
with its repeated names and extra punctuation. Similar to the way you'd rather use the dot DSL to describe a DAG than a bunch of calls to the Visio DOM API. You can imagine a syntax for network flows, as well as pipelines, such that network flows in particular would be very clear. That may not seem compelling, of course, but I like it.
Second, I think that with a DSL you might be able to persist the DSL description, e.g., as a field in a row in a database, and then instantiate it later. Though perhaps that's a different capability entirely.
Let's start with the relevant facts and work from there:
There isn't anything like this for TPL Dataflow yet.
There isn't a good way of embedding a DSL into C#. The common compilers are not extensible and it would be hard to access local variables from a string-based DSL.
There are several limitations to operators in C#, but the most significant here is that operators can't be generic. This means that the sparser syntax either wouldn't be type-safe (which is unacceptable to me), or it can't use overloaded operators.
The IDisposable returned from LinkTo() that can be used to break the created link isn't used that often, so it doesn't have to be supported. (Or maybe the expression that sets up the flow could return a single IDisposable that breaks the whole flow?)
Because of this, I think the best that can be done is something like:
downloadString.Link(createWordList).Link(filterWordList).Link(findPalindromes);
This avoids the repetition of LinkTo(), but is not much better.
The implementation of the simple form of this is mostly trivial:
using System.Threading.Tasks.Dataflow;

public static class DataflowLinkExtensions
{
    // Links source to a propagator and returns it, so that calls can be chained.
    public static ISourceBlock<TTarget> Link<TSource, TTarget>(
        this ISourceBlock<TSource> source,
        IPropagatorBlock<TSource, TTarget> target)
    {
        source.LinkTo(
            target,
            new DataflowLinkOptions { PropagateCompletion = true });
        return target;
    }
    // Links source to a final target, which ends the chain.
    public static void Link<TSource>(
        this ISourceBlock<TSource> source, ITargetBlock<TSource> target)
    {
        source.LinkTo(
            target,
            new DataflowLinkOptions { PropagateCompletion = true });
    }
}
I chose to set PropagateCompletion to true, because I think that makes the most sense here. But it could also be an option of Link().
I think most of the alternative linking operators of Axum are not relevant to TPL Dataflow, but linking multiple blocks to or from the same block could be done by taking a collection or array as one of the parameters of Link():
new[] { source1, source2 }.Link(target);
source.Link(target1, target2);
If Link() actually returned something that represents the whole flow (similar to Encapsulate()), you could combine this to create more complicated flows, like:
source.Link(propagator1.Link(target1), target2);

Resources