Are there any tools to aid with complex 'if' logic?

One of my personal programming demons has always been complex logic that needs to be controlled by if statements (or similar). Not always necessarily that complex, either; sometimes just a few states that need to be accounted for.
Are there any tools or steps a developer can perform during design time to help see the 'states' and take measures to refactor the code down to something simpler? I'm thinking of drawing up a matrix or something along those lines...?

I'd recommend a basic course in propositional logic for every aspiring programmer. At first, the notation and Greek letters may seem off-putting to the math-averse, but it is really one of the most powerful (and oft-neglected) tools in your skillset, and rather simple, at the core.
The basic operators, de Morgan's and other basic laws, truth tables, and existence of e.g. disjunctive and conjunctive normal forms were an eye-opener to me. Before I learned about them, conditional expressions felt like dangerous beasts. Ever since, I know that I can whip them into submission whenever necessary by breaking out the heavy artillery!
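For example, De Morgan's laws let you push a negation inward, which often makes a condition read more naturally. A minimal sketch (the variable names are made up):

class DeMorganDemo {
    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean valid : values) {
            for (boolean complete : values) {
                // De Morgan: !(a && b) is equivalent to (!a || !b)
                boolean original = !(valid && complete);
                boolean rewritten = !valid || !complete;
                System.out.printf("%b %b -> %b %b%n", valid, complete, original, rewritten);
                assert original == rewritten; // run with -ea to verify every row
            }
        }
    }
}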

Truth tables are basically the exhaustive approach and will (hopefully) highlight all the possibilities.
You might like to take a look at Microsoft Pex, which can be helpful for spotting the fringe cases you hadn't thought of.

I think that the developer is asking how to make his life easier when dealing with complex if code.
The way that I handle complex if code is to write it as flat as possible and weed out all negations first. If you can get rid of a compound if by hoisting a portion of it out above, then do that.
The beauty of simplicity is that it doesn't take a book or a class to learn it. If you can break it up, do so. If you can remove any part of it, do so. If you don't understand it, do it differently. And flat is almost always better than nested (thanks, Python!).
It's simpler to read:
if (broken) {
    return false;
}
if (simple) {
    doit();
    return true;
}
if (complicated) {
    divide();
    conquer();
}
if (extra) {
    extra();
}
than it is to read:
if (!broken && (simple || complicated)) {
    ....
}
return false;

Truth tables and unit tests - draw up the tables (2^n rows for n Boolean variables), and then use these as inputs to your unit test, which can test each combination of variables and verify the results.
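As a sketch of what that can look like (assuming JUnit 5; the function under test and the expected values are invented), enumerate every row of the table and assert against it:

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class TruthTableTest {
    // Hypothetical function under test.
    static boolean shouldRun(boolean broken, boolean simple, boolean complicated) {
        return !broken && (simple || complicated);
    }

    @Test
    void everyRowOfTheTruthTable() {
        // Expected results, hand-filled from the truth table.
        // Row index encodes the inputs: bit 2 = broken, bit 1 = simple, bit 0 = complicated.
        boolean[] expected = {false, true, true, true, false, false, false, false};
        for (int row = 0; row < 8; row++) {
            boolean broken = (row & 4) != 0;
            boolean simple = (row & 2) != 0;
            boolean complicated = (row & 1) != 0;
            assertEquals(expected[row], shouldRun(broken, simple, complicated), "row " + row);
        }
    }
}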

The biggest problem I've seen through the years with complex IFs is that people don't test all the branches. Make sure to write a test for each possible branch no matter how unlikely it seems that you will hit it.

You might also want to try Karnaugh maps, which are good for up to 4 variables.

If you haven't already, I'd highly suggest reading Code Complete. It has a lot of advice on topics such as this. I don't have my copy handy at the moment, otherwise I'd post a summary of this section in the book.

Split the logic into discrete units (a && b, etc.), each with its own variable. Then build these up using the logic you need. Name each variable appropriately, so that your complex statement is fairly readable (although it may take up several extra lines and a fair few temporary variables).
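A minimal sketch of the idea (all names invented):

class EnrollmentCheck {
    static boolean canEnroll(int age, boolean hasConsented, double balance, boolean hasWaiver) {
        // Each sub-condition gets a descriptive name...
        boolean isEligible = age >= 18 && hasConsented;
        boolean isPaidUp = balance <= 0 || hasWaiver;
        // ...so the final test reads almost like prose.
        return isEligible && isPaidUp;
    }
}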

Any reason you cannot just handle the logic with guard statements?

Karnaugh maps can be nice ways of taking information from a truth table (suggested by Visage) and turning them into compact and/or/not expressions. These are typically taught in an EE digital logic course.
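As a tiny worked example (invented): suppose the truth table is true exactly when a is true, regardless of b and c. The raw sum-of-products has four terms, but grouping the four adjacent cells on the Karnaugh map collapses them to a single literal:

class KarnaughDemo {
    // Straight from the truth table: four minterms.
    static boolean raw(boolean a, boolean b, boolean c) {
        return (a && !b && !c) || (a && !b && c) || (a && b && !c) || (a && b && c);
    }

    // After grouping on the Karnaugh map: the b and c literals cancel out.
    static boolean simplified(boolean a, boolean b, boolean c) {
        return a;
    }
}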

Have you tried a design pattern? You might look into what is known as the Strategy pattern: http://en.wikipedia.org/wiki/Strategy_pattern
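A minimal sketch of how Strategy can replace a branching chain (all names invented): the condition selects a handler object instead of selecting a block of inline code.

import java.util.Map;

interface Handler {
    void handle();
}

class Dispatcher {
    // One entry per former if-branch; adding a case no longer touches the dispatch logic.
    private final Map<String, Handler> handlers = Map.of(
        "simple", () -> System.out.println("handle the simple case"),
        "complicated", () -> System.out.println("divide and conquer"));

    void dispatch(String state) {
        Handler h = handlers.get(state);
        if (h == null) {
            throw new IllegalArgumentException("unknown state: " + state);
        }
        h.handle();
    }
}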

Check out the nuclear option: Drools. There's quite a lot to it; it took me a day or two of perusing the literature just to get a handle on its capabilities. But if you have applications where your complex if-then logic is an evolving part of the project (for example, an application with modular algorithms), it might be just the thing.

Related

Style: patterns vs. duplicated code, which is good?

I have a lot of similar-looking small pieces of code, e.g. parsing config files with JDOM and converting them into regex patterns. It's all stuff that is done in 10 lines. Writing some abstract meta-monster that does all this would be very complicated.
Now, I always hear people crying about duplicated code. Is it really such a bad thing in my use case? Having the code similar makes it easy to understand and maintain, and there is no big interrelation between the functions.
Am I doing the right thing?
Over-engineering is an antipattern. If you don't need abstraction, don't use it.
Abstraction and patterns are the most useful when your project is large or is supposed to grow. If that isn't your situation, then Keep It Simple and Stupid.
It's also a matter of taste. Personally, even if it is sometimes discouraged, I prefer using patterns and abstraction even in simple situations if I feel that it might be useful in the future, because I hate rewriting the same lines of code twice. In addition, design patterns also help you to avoid errors because they put order into your code and class relations.
No, having very similar code makes it hard to maintain if you've got more than, say, three of those pieces of code. When you catch a bug (or get a spec change) that affects all or several of those pieces of code, you have to try and spot the differences. It may even be harder to fix than when they're all exactly the same.
The least you can do is try to lift out some commonalities and make a tiny library of well-named helper functions. Lifting out the tricky bits is more important than how many lines of code you save.
It really depends on what those 10 lines look like. Some cases that don't seem to warrant a proper abstraction can be solved with a simple loop.
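For instance (a sketch with invented names), ten near-identical parse-and-compile snippets can become one loop over a table of inputs:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class PatternTable {
    static Map<String, Pattern> compileAll(Map<String, String> regexByKey) {
        // One loop replaces N copies of the same ten lines.
        Map<String, Pattern> compiled = new LinkedHashMap<>();
        regexByKey.forEach((key, regex) -> compiled.put(key, Pattern.compile(regex)));
        return compiled;
    }
}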

When should I break a function?

It's prudent to break a long function into a chief function and helper functions.
I know that outside the module only the chief function will be called, but its length may prove intimidating.
Textbooks put a limit on the number of lines, but I feel that this is too rigid.
P.S. I am programming in Python and need to process incoming messages. The function returns a tuple containing the message, but converted to Python's internal data types.
So you can see somewhat independent code for each message type.
Duplicate Question
When is a function too long?
I think you need to go about this from the other end of the problem. Think bottom-up. Identify small units of work, as small as possible, and start composing your code that way. You will only run into spaghetti-code issues when you code top-down and don't keep a structured approach.
If you already have spaghetti code and need to refactor, you pretty much have to start over. It is probably more work to break up existing spaghetti code than to rewrite it, and the result may not be as good.
I don't think there should be a hard number for the lines of code in a method either, but well written code does not have methods with more than 5 to 10 lines in the lower layers, and 20 to 30 lines in the business logic. To give you some kind of metric.
I'm not a big fan of breaking a function into multiple functions unnecessarily. It's not a hard and fast thing - if there are things that seem like distinct logical units, then by all means, break those out and think about them separately. But don't just break things out for the sake of some guideline like "one page per function" or "N lines per function".
One good rule of thumb is that if it doesn't fit on a single screen, it is worth thinking about splitting it up. But only if it makes sense to split it up; some long functions are perfectly readable, and it doesn't make any sense to slavishly split them into multiple functions just for the sake of it.
Never write a function that, when printed on fanfold paper, is taller than you are.
I like the rule of thumb that you should break out the subfunction if you can think of a good domain-relevant name for it.
When someone can understand the top-level function without necessarily having to look up the definition of the sub-function, you've likely made a net gain. (But when you break it down too far, your names start referring to your implementation artifacts rather than the domain)
I was recently discussing this with a friend. He suggested refactoring to separate concerns and I must say I have to agree. That is, one function should do one thing, if it does more than one thing, split it up. If not, let it be together, it makes no sense to split up a function, only to have it obfuscate the meaning. After all, a function is a block of code that does one thing!
A limit in terms of number of lines is often impractical because it doesn't account for readability well. It's better to try to separate groups of lines of code that have just a few inputs and just a few outputs and make these a separate function. It's not always possible; then it's often wise to just leave the code as it is and not refactor for the sake of refactoring.
Well, since I am coding in Python, I have the liberty to write functions inside functions, unlike in C, C++, or Java. This, I feel, is a better choice.
It's not specified, but the line count should be as low as possible. You may follow the Rule of 30; I follow this in my PHP scripts when needed.
Rule of 30:
“Rule of 30” in Refactoring in Large Software Projects by Martin Lippert and Stephen Roock:
Methods should not have more than an average of 30 code lines.
A class should contain an average of less than 30 methods.
A package/library shouldn’t contain more than 30 classes.
Subsystems should avoid more than 30 packages.
A system with more than 30 subsystems may create problems.
If an element consists of more than 30 subelements, it is highly probable that there is a serious problem.
Personally, I break a function if it either saves total lines or total processing time.
If I only run the helper once per chief function, I don't bother.
The point is that in principle it's better to have specialized functions. But where one sets the limit depends very much on:
1) the "usual" programming style in certain languages (one can observe that object-oriented languages tend to have shorter procedures than, say, C or the like);
2) your way of programming. Every hard limit must be questioned, IMHO. Overall there will probably be some "natural" distribution of programs;
3) the task at hand. A function should do a certain task: a function for parsing, for example, is usually much longer than a function just setting some field in a structure. Or consider how an event loop in the Windows API may look. All of this suggests that there may be good reasons for long methods...
If there is independent code (in your case, specifics for each message type), those areas should be broken out.
Size matters not. Judge me by my size, do you? - Yoda
Your main concerns are readability, simplicity, and maintainability. A good indicator: if you need to write comments to explain a section of a function, then that section is a good candidate for a separate function.
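A small sketch of that indicator in action (contents invented): the comment's text becomes the helper's name.

import java.util.List;

class ConfigReader {
    static int countSettings(List<String> rawLines) {
        // Formerly: ten inline lines under a comment reading
        // "strip comments and blank lines".
        List<String> settings = stripCommentsAndBlanks(rawLines);
        return settings.size();
    }

    // The extracted section, named after the comment that used to explain it.
    private static List<String> stripCommentsAndBlanks(List<String> lines) {
        return lines.stream()
                    .map(String::trim)
                    .filter(l -> !l.isEmpty() && !l.startsWith("#"))
                    .toList();
    }
}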
There are many reasons to break a long function into its constituent pieces. The most important are:
readability
maintainability
code clarity/intent
Some functions simply cannot be broken into smaller pieces without negatively impacting the listed goals, so there is no hard-and-fast rule.
If you didn't write it and it's already in production: NEVER!!! If you break it up, you're likely to break it; it's that simple.
If you are writing it and you're not sure, the on-screen rule applies, as others have said.

How often do you use pseudocode in the real world?

Back in college, only the use of pseudocode was evangelized more than OOP in my curriculum. Just like commenting (and other preached 'best practices'), I found that in crunch time pseudocode was often neglected. So my question is: who actually uses it a lot of the time? Or do you only use it when an algorithm is really hard to conceptualize entirely in your head? I'm interested in responses from everyone: wet-behind-the-ears junior developers to grizzled vets who were around back in the punch card days.
As for me personally, I mostly only use it for the difficult stuff.
I use it all the time. Any time I have to explain a design decision, I'll use it. Talking to non-technical staff, I'll use it. It has application not only for programming, but for explaining how anything is done.
Working with a team on multiple platforms (Java front-end with a COBOL backend, in this case) it's much easier to explain how a bit of code works using pseudocode than it is to show real code.
During design stage, pseudocode is especially useful because it helps you see the solution and whether or not it's feasible. I've seen some designs that looked very elegant, only to try to implement them and realize I couldn't even generate pseudocode. Turned out, the designer had never tried thinking about a theoretical implementation. Had he tried to write up some pseudocode representing his solution, I never would have had to waste 2 weeks trying to figure out why I couldn't get it to work.
I use pseudocode when away from a computer and only have paper and pen. It doesn't make much sense to worry about syntax for code that won't compile (can't compile paper).
I almost always use it nowadays when creating any non-trivial routines. I create the pseudocode as comments, and continue to expand it until I get to the point that I can just write the equivalent code below it. I have found this significantly speeds up development: it reduces the "just write code" syndrome that often requires rewrites for things that weren't originally considered, because it forces you to think through the entire process before writing actual code, and it serves as a good base for code documentation after the code is written.
I and the other developers on my team use it all the time - in emails, on whiteboards, or just in conversation. Pseudocode is taught to help you think the way you need to in order to program. If you really understand pseudocode, you can catch on to almost any programming language, because the main difference between them all is syntax.
If I'm working out something complex, I use it a lot, but I use it as comments. For instance, I'll stub out the procedure, and put in each step I think I need to do. As I then write the code, I'll leave the comments: it says what I was trying to do.
procedure GetTextFromValidIndex (input int indexValue, output string textValue)
    // initialize
    // check to see if indexValue is within the acceptable range
    //   get min, max from db
    //   if indexValue is not between min and max
    //     then return with an error
    // find corresponding text in db based on indexValue
    // return textValue
    return "Not Written";
end procedure;
I've never, not even once, needed to write the pseudocode of a program before writing it.
However, occasionally I've had to write pseudocode after writing code, which usually happens when I'm trying to describe the high-level implementation of a program to get someone up to speed with new code in a short amount of time. And by "high-level implementation", I mean one line of pseudocode describes 50 or so lines of C#, for example:
Core dumps a bunch of XML files to a folder and runs the process.exe
executable with a few commandline parameters.
The process.exe reads each file
Each file is read line by line
Unique words are pulled out of the file and stored in a database
File is deleted when it's finished processing
That kind of pseudocode is good enough to describe roughly 1000 lines of code, and good enough to accurately inform a newbie what the program is actually doing.
On many occasions when I don't know how to solve a problem, I actually find myself drawing my modules on a whiteboard in very high-level terms to get a clear picture of how they're interacting, drawing a prototype of a database schema, or drawing a data structure (especially trees, graphs, arrays, etc.) to get a good handle on how to traverse and process it.
I use it when explaining concepts. It helps to trim out the unnecessary bits of language so that examples only have the details pertinent to the question being asked.
I use it a fair amount on StackOverflow.
I don't use pseudocode as it is taught in school, and haven't in a very long time.
I do use english descriptions of algorithms when the logic is complex enough to warrant it; they're called "comments". ;-)
When explaining things to others, or working things out on paper, I use diagrams as much as possible - the simpler the better.
Steve McConnell's Code Complete, in chapter 9, "The Pseudocode Programming Process", proposes an interesting approach: when writing a function longer than a few lines, use simple pseudocode (in the form of comments) to outline what the function/procedure needs to do before writing the actual code that does it. The pseudocode comments can then become actual comments in the body of the function.
I tend to use this for any function that does more than what can be quickly understood by looking at a screenful (max) of code. It works especially well if you are already used to separating your function body into code "paragraphs" - units of semantically related code separated by a blank line. Then the "pseudocode comments" work like "headers" for these paragraphs.
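A minimal sketch of that flow (the routine and its steps are invented): the comments were written first as pseudocode, then the code was filled in beneath each one.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordStats {
    static int countDistinctWords(List<String> lines) {
        // collect every word from every line
        Set<String> words = new HashSet<>();
        for (String line : lines) {
            // split on whitespace and normalize case
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word.toLowerCase());
                }
            }
        }
        // the distinct-word count is the size of the set
        return words.size();
    }
}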
PS: Some people may argue that "you shouldn't comment what, but why, and only when it's not trivial to understand for a reader who knows the language in question better than you". I generally agree with this, but I do make an exception for the PPP. The criteria for the presence and form of a comment shouldn't be set in stone, but ultimately governed by a wise, well-thought-out application of common sense anyway. If you find yourself refusing to try a slight bend to a subjective "rule" just for the sake of it, you might need to step back and ask whether you're facing it critically enough.
I mostly use it for nutting out really complex code, or when explaining code to other developers or to non-developers who understand the system.
I also use flow diagrams or UML-type diagrams when trying to do the above...
I generally use it when developing multiple nested if-else statements, which can be confusing.
This way I don't need to go back and document it, since it's already been done.
Fairly rarely, although I often document a method before writing the body of it.
However, if I'm helping another developer with how to approach a problem, I'll often write an email with a pseudocode solution.
I don't use pseudocode at all.
I'm more comfortable with the syntax of C style languages than I am with Pseudocode.
What I do do quite frequently for design purposes is essentially a functional decomposition style of coding.
public void doBigJob(params) {
    doTask1(params);
    doTask2(params);
    doTask3(params);
}

private void doTask1(params) {
    doSubTask1_1(params);
    ...
}
Which, in an ideal world, would eventually turn into working code as methods become more and more trivial. However, in real life, there is a heck of a lot of refactoring and rethinking of design.
We find this works well enough, as rarely do we come across an algorithm that is both incredibly complex and hard to code, and not better solved using UML or another modelling technique.
I never use it, and never have.
I always try to prototype in a real language when I need to do something complex, usually writing unit tests first to figure out what the code needs to do.

When to call the gang of four? [When to use design patterns?]

In The Guerrilla Guide to Interviewing, Joel says that guys who want to get things done but are not smart will do stupid things like using a visitor design pattern where a simple array would be sufficient.
I find it hard to detect whether a design pattern suggested by the Gang of Four should be applied.
Therefore, I would like some examples from your work experience:
When is a simple approach (fixed size array) sufficient?
What is the minimum size of a piece of software that justifies the use of the GoF patterns?
When to refactor from simple-minded to GoF? Can this be done in a sensible way?
I often find that using test driven development helps guide me when faced with these questions.
When is a simple approach sufficient? It is always sufficient to use the simplest approach to get the next test to pass. But knowing when/how to refactor is the real art form.
What is the minimum size of a piece of software that justifies the use of the GoF patterns? A rule of thumb I once read is that when you code something once, fine; when you duplicate that code somewhere a second time, make a note and move on. When you find a need for the same code a third time, it's time to refactor to remove duplication and simplify, and often that involves moving to a design pattern.
When to refactor from simple-minded to GoF? I like what #anopres said - it's time when you feel the pain of not having the design pattern in place. The pain (or code "smell") may manifest itself in several ways. Code duplication is the most obvious. Refactoring books like Fowler's Refactoring or Kerievsky's Refactoring to Patterns list many such pain points/code stenches.
Can this [refactoring] be done in a sensible way? The trick to refactoring is to have a suite of unit tests in place which you have confidence in, and then to refactor without causing any of those tests to fail. Refactoring, by definition, does not change the functionality of your code. Therefore, if your tests continue to pass, you can have a pretty good feeling that you didn't break anything. Although it can be difficult, I actually enjoy this part of TDD; it's almost like a game to make changes without breaking any tests.
In summary, I would say that TDD helps guide me to write the code that is sufficient at the time, and perhaps more importantly helps me to make the changes later when inevitably requirements change, more functionality is required, etc.
Design patterns are a consequence, not an objective. You don't think "today I shall use the Strategy pattern"; you just do it. Halfway through writing the third nearly identical class, you stop and use a paper notebook to figure out the general case and knock up a base class that describes the shared context. You refactor the first two classes to be descendants, and this gives you a reality check and quite a few changes to your base class. Then the next thirty are a walk in the park.
It's only the next day at the team meeting that you save everyone thirty minutes of boredom by saying "I used the Strategy pattern. They all work the same, so there's only one test program; it takes parameters to change the test case."
Intimate familiarity with patterns makes you use them reflexively, whenever the situation demands it. When people treat the use of patterns as an objective in its own right, you get stilted, ugly code that speaks of mechanism rather than purpose; the how rather than the why.
Most patterns address recurring fundamental problems like complexity mitigation and the need to provide extensibility points. Providing extensibility points when it's clear they won't be needed pointlessly complicates your code and creates more failure points and test cases. Unless you're building a framework for release into the wild, solve only the problems you actually face.
Patterns are only tools and vocabulary. You write the code to be as simple, understandable, and maintainable as you know how. By knowing patterns, you have more alternatives at your disposal, and you have a language in which to discuss the pros and cons of an approach before implementing it.
In either case, you don't just "switch" to "using a pattern". You just keep doing what you always do: write the code the best way you know how.
This is similar to any other design decision. Ultimately, it depends. You should learn those patterns that are useful in your language (many GoF patterns aren't needed in Lisp or Smalltalk, for example), learn their advantages and disadvantages, understand the constraints of your system, and make the choice that best fits your needs.
The best advice that I can give is to learn, learn, learn.
Switching from a simple approach to a formal design pattern is usually something that happens fairly naturally for me as a problem increases in complexity. The key is to be familiar enough with the patterns that you can recognize the tipping point and switch from the simple approach to a design pattern when it will bring the most benefit for current and future development.
For a larger, more complex project, the tipping point should be fairly early on; in many cases, before you even start coding. For smaller projects, you can afford to wait before deciding to implement a pattern.
One of the best things you can do to increase your ability to recognize when a pattern should be used is to take some time after completing a project to examine how complex your "simple" approach has become. If it would have taken you less time and effort to implement a pattern, or if the pattern would clarify what you were trying to do, you can file that knowledge away for the next time you encounter a similar problem.
When you have a problem that one of the patterns solves. The GoF book has a section in each chapter that explains what types of scenarios each pattern is appropriate for. You should not analyze each problem you have, then go look up what pattern to use. You should become familiar with the patterns so that you learn to recognize what situations call for them.
One of the nice things about the original GoF book is that there is a discussion of the scenarios where each pattern would best solve a problem. Reviewing these discussions can help you determine whether "it's time".
Another good place to start is with Head First Design Patterns. The exercises that illustrate the use of different design patterns are elaborate enough to offer a good learning experience. In addition, the exercises are grounded in real-world scenarios, so it's never a stretch to see when it's appropriate to apply design patterns.

Code replication and refactoring

I would like to hear opinions on small amounts of code replication within methods that check for the same condition,
e.g.
while (condition) {
    ...... do x
}
Normally, if there was any of this kind of replication, I would refactor the code, as it can make versioning a nightmare: if the condition changes, for example, you have to change every instance - not a nice job to do.
However, what if the condition is relatively simple and is only used within, say, 3 methods? Is it still wise to refactor?
So, in summary: where do people draw the line when refactoring code?
Refactoring is not free. Any change to the code can introduce bugs. So every conscientious developer thinks about whether he has time to carefully examine the changes, and in many cases decides not to refactor.
It depends on 2 things, IMO: the size of the duplication (how many lines are involved), and the locality of the duplications - how "close" they are in terms of context.
If you have duplicated code in several methods in the same class, then I would consider extracting even duplicates of a single line into a separate method (assuming that the line in question was an easily identifiable, easy-to-isolate, fairly uncoupled piece of code).
Alternatively, if I had some code in one part of a project and an almost identical piece of code in an (almost) unrelated area, then I wouldn't factor that out, as the scope for future divergence would seem quite high...
The key to refactoring is to do it when you need to. In the above example, if you have 3 different while loops with the same condition, who is to say that in the future you won't want different conditions? If you've already refactored, then you've introduced a potential error situation.
It's a matter of judgement: the same condition three times seems OK; the same condition 10 times is an obvious refactor. But where is the tipping point?
I am personally usually quite aggressive with refactorings. I believe that if you clean your code regularly you mostly need to do small and simple refactorings. If you leave it for a while, it gets so messy and difficult to maintain.
In your particular case, I would definitely refactor if the condition has a reasonable business meaning, because it will make the code more readable. But even if it is a technicality, I would consider refactoring, provided that it's just a matter of extracting a function or property.
Ideally, you should have unit tests that will make sure that your refactoring is still correct, so the cost of doing it should really be a few minutes - often less than writing this response.
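A sketch of that kind of extraction (all names invented): the shared condition gets a business-meaningful name and lives in one place.

import java.util.Queue;

class Worker {
    private volatile boolean shutdownRequested;

    // The condition the three loops shared, extracted and named.
    private boolean hasPendingWork(Queue<Runnable> queue) {
        return !queue.isEmpty() && !shutdownRequested;
    }

    void drain(Queue<Runnable> queue) {
        while (hasPendingWork(queue)) {
            queue.poll().run();
        }
    }
}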
A common rule of thumb is the Rule of Three, described by Martin Fowler in the seminal work Refactoring (he credits it to Don Roberts). The rule says that two things that are basically the same can stay, but once you add a third, you should refactor them.
Besides making future changes easier as you mention, refactoring helps with readability and can make intention more obvious.