I have a lot of similar-looking small pieces of code, e.g. parsing config files with JDOM and converting them into regex patterns. Each piece is done in about 10 lines. Writing some abstract meta-monster that does all of this would be very complicated.
Now, I always hear people complaining about duplicated code. Is it really such a bad thing in my use case? Keeping the code similar makes it easy to understand and maintain, and there is no big interrelation between the functions.
Am I doing the right thing?
Over-engineering is an antipattern. If you don't need abstraction, don't use it.
Abstraction and patterns are most useful when your project is large or is expected to grow. If that isn't your situation, then Keep It Simple and Stupid.
It's also a matter of taste. Personally, even if it is sometimes discouraged, I prefer using patterns and abstraction even in simple situations if I feel that it might be useful in the future, because I hate rewriting the same lines of code twice. In addition, design patterns also help you to avoid errors because they put order into your code and class relations.
No, having very similar code makes it hard to maintain if you've got more than, say, three of those pieces of code. When you catch a bug (or get a spec change) that affects all or several of those pieces of code, you have to try and spot the differences. It may even be harder to fix than when they're all exactly the same.
The least you can do is try to lift out some commonalities and make a tiny library of well-named helper functions. Lifting out the tricky bits is more important than how many lines of code you save.
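As a rough sketch of what such a tiny helper library could look like (in Python rather than Java/JDOM, purely for brevity): the names `read_values`, `wildcard_to_regex` and `load_patterns`, and the `<include>` config format, are made up for illustration and are not from the question.

```python
import re
import xml.etree.ElementTree as ET


def read_values(xml_text, tag):
    """Return the stripped text of every <tag> element in the document."""
    root = ET.fromstring(xml_text)
    return [(element.text or "").strip() for element in root.iter(tag)]


def wildcard_to_regex(wildcard):
    """The 'tricky bit': turn a '*' wildcard into an anchored, compiled regex."""
    escaped = re.escape(wildcard).replace(r"\*", ".*")
    return re.compile("^" + escaped + "$")


def load_patterns(xml_text, tag):
    """Each former 10-line copy shrinks to one call like this."""
    return [wildcard_to_regex(value) for value in read_values(xml_text, tag)]


if __name__ == "__main__":
    config = "<config><include>*.log</include><include>app-*</include></config>"
    for pattern in load_patterns(config, "include"):
        print(pattern.pattern)
```

The point is less the line count than that the escaping logic now lives in exactly one well-named place.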
It really depends on what those 10 lines look like. Some cases that don't seem to warrant a proper abstraction can be solved with a simple loop.
This is sort of a follow up on this question I read:
What is the biggest mistake people make when starting to use LINQ?
The top answer is "That it should be used for everything." That made me wonder what exactly that means.
What are some common examples where someone used LINQ when they should not have?
You shouldn't use LINQ when the alternative is simpler or significantly more efficient.
I would suggest avoiding LINQ any time it makes the code less obvious.
However, in general, I think LINQ makes things easier to follow, not more difficult, so I rarely avoid it.
It is possible for LINQ to be significantly slower than the alternatives, especially if you have a lot of intermediate Lists. However, you would need some pretty large datasets for that to matter, larger than any I have encountered.
However, one thing to keep in mind is that a well-written LINQ query can also be much faster than the alternative because of the way IEnumerable works.
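The "way IEnumerable works" here is deferred, element-by-element evaluation. A loose analogue using Python generators (not LINQ itself; the `numbers` and `first_even_squares` names are invented for this sketch) shows why no intermediate lists are built and why the pipeline can stop early:

```python
from itertools import islice


def numbers(limit, log):
    """Yield 0..limit-1, recording each value actually produced in `log`."""
    for n in range(limit):
        log.append(n)
        yield n


def first_even_squares(source, count):
    """Lazily filter and map, then take only the first `count` results."""
    evens = (n for n in source if n % 2 == 0)   # nothing is evaluated yet
    squares = (n * n for n in evens)            # still nothing evaluated
    return list(islice(squares, count))         # pulls only what it needs


if __name__ == "__main__":
    produced = []
    print(first_even_squares(numbers(1_000_000, produced), 3))
    print(len(produced), "source items were read")
```

An eager version that materialised the filtered and mapped lists first would walk all one million items before returning the same three values.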
Finally, using LINQ now will allow you to switch to Parallel LINQ when it is released with little or no changes.
It's still okay to use foreach. :)
One of my personal programming demons has always been complex logic that needs to be controlled by if statements (or similar). It isn't always that complex either; sometimes it's just a few states that need to be accounted for.
Are there any tools or steps a developer can perform during design time to help see the 'states' and take measures to refactor the code down to simplify the resulting code? I'm thinking drawing up a matrix or something along those lines...?
I'd recommend a basic course in propositional logic for every aspiring programmer. At first, the notation and Greek letters may seem off-putting to the math-averse, but it is really one of the most powerful (and oft-neglected) tools in your skillset, and rather simple, at the core.
The basic operators, de Morgan's and other basic laws, truth tables, and existence of e.g. disjunctive and conjunctive normal forms were an eye-opener to me. Before I learned about them, conditional expressions felt like dangerous beasts. Ever since, I know that I can whip them into submission whenever necessary by breaking out the heavy artillery!
Truth tables are basically the exhaustive approach and will (hopefully) highlight all the possibilities.
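To make that concrete, here is a small sketch that uses an exhaustive truth table to check that a rewritten condition still means the same thing as the original, using de Morgan's law as the example. The helper name `equivalent` is an assumption of this sketch, not an established API.

```python
from itertools import product


def equivalent(f, g, arity):
    """True when f and g agree on every combination of boolean inputs."""
    rows = product([False, True], repeat=arity)
    return all(f(*row) == g(*row) for row in rows)


def original(a, b):
    return not (a and b)


def rewritten(a, b):
    # de Morgan: not (a and b)  ==  (not a) or (not b)
    return (not a) or (not b)


def wrong_rewrite(a, b):
    # A common slip: negating the operands but forgetting to flip the operator.
    return (not a) and (not b)


if __name__ == "__main__":
    print(equivalent(original, rewritten, 2))
    print(equivalent(original, wrong_rewrite, 2))
```

With n inputs this checks 2^n rows, so it is only practical for the handful of variables a typical if statement involves.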
You might like to take a look at Microsoft Pex, which can be helpful for spotting the fringe cases you hadn't thought of.
I think that the developer is asking how to make his life easier when dealing with complex if code.
The way that I handle complex if code is to code as flat as possible and weed out all negations first. If you can get rid of a compound if by handling part of it earlier, then do that.
The beauty of simplicity is that it doesn't take a book or a class to learn it. If you can break it up, do so. If you can remove any part of it, do so. If you don't understand it, do it differently. And flat is almost always better than nested (thanks, Python!).
It's simpler to read:
if (broken) {
    return false;
}
if (simple) {
    doit();
    return true;
}
if (complicated) {
    divide();
    conquer();
}
if (extra) {
    extra();
}
than it is to read:
if (!broken && (simple || complicated)) {
    ....
}
return false;
Truth tables and unit tests - draw up the tables (n dimensional for n variables), and then use these as inputs to your unit test, which can test each combination of variables and verify the results.
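A minimal sketch of that idea follows: the truth table is written down by hand as data and every row is fed to the function under test. The `can_checkout` rule and its three flags are hypothetical and exist only to give the table something to describe.

```python
from itertools import product


def can_checkout(logged_in, cart_empty, account_locked):
    """Hypothetical rule under test, written with guard clauses."""
    if account_locked:
        return False
    if not logged_in:
        return False
    return not cart_empty


# The truth table, written from the spec: one row per input combination.
# Key order: (logged_in, cart_empty, account_locked)
EXPECTED = {
    (False, False, False): False,
    (False, False, True): False,
    (False, True, False): False,
    (False, True, True): False,
    (True, False, False): True,
    (True, False, True): False,
    (True, True, False): False,
    (True, True, True): False,
}


def check_truth_table(func, expected, arity):
    """Return (missing_rows, wrong_rows) after trying every combination."""
    rows = list(product([False, True], repeat=arity))
    missing = [row for row in rows if row not in expected]
    wrong = [row for row in rows if row in expected and func(*row) != expected[row]]
    return missing, wrong


if __name__ == "__main__":
    print(check_truth_table(can_checkout, EXPECTED, 3))
```

Reporting missing rows separately catches the case where a new flag is added to the function but nobody extends the table.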
The biggest problem I've seen through the years with complex IFs is that people don't test all the branches. Make sure to write a test for each possible branch no matter how unlikely it seems that you will hit it.
You might also want to try Karnaugh maps, which are good for up to 4 variables.
If you haven't already, I'd highly suggest reading Code Complete. It has a lot of advice on topics such as this. I don't have my copy handy at the moment, otherwise I'd post a summary of this section in the book.
Split the logic down into discrete units (a && b, etc.), each with their own variable. Then build these up using the logic you need. Name each variable with something appropriate, so that your complex statement is fairly readable (although it may take up several extra lines and a fair few temporary variables).
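For instance, a sketch with an invented document-permission rule (the `can_edit` function and all its parameter names are hypothetical):

```python
def can_edit(user_role, is_owner, doc_locked, doc_archived):
    """Decide whether a user may edit a document."""
    # Each discrete unit of logic gets its own well-named variable...
    is_admin = user_role == "admin"
    has_permission = is_admin or is_owner
    is_writable = not doc_locked and not doc_archived

    # ...so the final condition reads almost like the requirement itself.
    return has_permission and is_writable


if __name__ == "__main__":
    print(can_edit("admin", False, False, False))
    print(can_edit("guest", True, True, False))
```

The single-expression equivalent, `(user_role == "admin" or is_owner) and not doc_locked and not doc_archived`, is shorter but forces the reader to rediscover what each clause is for.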
Any reason you cannot just handle the logic with guard statements?
Karnaugh maps can be nice ways of taking information from a truth table (suggested by Visage) and turning them into compact and/or/not expressions. These are typically taught in an EE digital logic course.
Have you tried a design pattern? You might look into what is known as the Strategy pattern: http://en.wikipedia.org/wiki/Strategy_pattern
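A lightweight sketch of the idea, using plain functions as the interchangeable strategies instead of a class hierarchy. The shipping-cost scenario, the rates, and every name in it are assumptions made up for this example.

```python
def flat_rate(weight_kg):
    return 5.0


def by_weight(weight_kg):
    return 1.5 * weight_kg


def free_shipping(weight_kg):
    return 0.0


# Each branch of the former if/else chain becomes one entry in this table.
STRATEGIES = {
    "flat": flat_rate,
    "weight": by_weight,
    "free": free_shipping,
}


def shipping_cost(method, weight_kg):
    """Pick the strategy by name and delegate the calculation to it."""
    try:
        strategy = STRATEGIES[method]
    except KeyError:
        raise ValueError(f"unknown shipping method: {method}") from None
    return strategy(weight_kg)


if __name__ == "__main__":
    print(shipping_cost("weight", 4))
```

Adding a new case then means adding one function and one table entry rather than touching an existing conditional.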
Check out the nuclear option: Drools. There's quite a lot to it-- took me a day or two of perusing the literature just to get a handle on its capabilities. But if you have applications where your complex if-then logic is an evolving part of the project (for example, an application with modular algorithms) it might be just the thing.
Back in college, only the use of pseudocode was evangelized more than OOP in my curriculum. Just like commenting (and other preached 'best practices'), I found that in crunch time pseudocode was often neglected. So my question is: who actually uses it a lot of the time? Or do you only use it when an algorithm is really hard to conceptualize entirely in your head? I'm interested in responses from everyone, from wet-behind-the-ears junior developers to grizzled vets who were around back in the punch card days.
As for me personally, I mostly only use it for the difficult stuff.
I use it all the time. Any time I have to explain a design decision, I'll use it. Talking to non-technical staff, I'll use it. It has application not only for programming, but for explaining how anything is done.
Working with a team on multiple platforms (Java front-end with a COBOL backend, in this case) it's much easier to explain how a bit of code works using pseudocode than it is to show real code.
During design stage, pseudocode is especially useful because it helps you see the solution and whether or not it's feasible. I've seen some designs that looked very elegant, only to try to implement them and realize I couldn't even generate pseudocode. Turned out, the designer had never tried thinking about a theoretical implementation. Had he tried to write up some pseudocode representing his solution, I never would have had to waste 2 weeks trying to figure out why I couldn't get it to work.
I use pseudocode when away from a computer and only have paper and pen. It doesn't make much sense to worry about syntax for code that won't compile (can't compile paper).
I almost always use it nowadays when creating any non-trivial routines. I create the pseudo code as comments, and continue to expand it until I get to the point that I can just write the equivalent code below it. I have found this significantly speeds up development, reduces the "just write code" syndrome that often requires rewrites for things that weren't originally considered as it forces you to think through the entire process before writing actual code, and serves as good base for code documentation after it is written.
The other developers on my team and I use it all the time: in emails, on the whiteboard, or just in conversation. Pseudocode is taught to help you think the way you need to in order to program. If you really understand pseudocode, you can catch on to almost any programming language, because the main difference between them all is syntax.
If I'm working out something complex, I use it a lot, but I use it as comments. For instance, I'll stub out the procedure, and put in each step I think I need to do. As I then write the code, I'll leave the comments: it says what I was trying to do.
procedure GetTextFromValidIndex (input int indexValue, output string textValue)
    // initialize
    // check to see if indexValue is within the acceptable range
    //     get min, max from db
    //     if indexValue not between min and max
    //         then return with an error
    // find corresponding text in db based on indexValue
    // return textValue
    return "Not Written";
end procedure;
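Here is a sketch of how that stub might look once the code has been written underneath each comment. It is in Python, and a plain dict stands in for the database; that substitution, the snake_case name, and the choice of exceptions are assumptions of the sketch, not part of the original procedure.

```python
def get_text_from_valid_index(index_value, texts):
    """`texts` is a dict {index: text} standing in for the database table."""
    # check to see if index_value is within the acceptable range
    #     get min, max from "db"
    if not texts:
        raise ValueError("no texts available")
    lowest, highest = min(texts), max(texts)
    #     if index_value not between min and max then return with an error
    if not lowest <= index_value <= highest:
        raise IndexError(f"{index_value} is outside {lowest}..{highest}")

    # find corresponding text in "db" based on index_value
    text = texts.get(index_value)
    if text is None:
        raise KeyError(index_value)

    # return the text
    return text


if __name__ == "__main__":
    print(get_text_from_valid_index(2, {1: "one", 2: "two", 4: "four"}))
```

The original pseudocode lines survive as comments, so they keep saying what each paragraph of code was meant to do.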
I've never, not even once, needed to write the pseudocode of a program before writing it.
However, occasionally I've had to write pseudocode after writing code, which usually happens when I'm trying to describe the high-level implementation of a program to get someone up to speed with new code in a short amount of time. And by "high-level implementation", I mean one line of pseudocode describes 50 or so lines of C#, for example:
Core dumps a bunch of XML files to a folder and runs the process.exe
executable with a few command-line parameters.
The process.exe reads each file
    Each file is read line by line
    Unique words are pulled out of the file and stored in a database
    File is deleted when it's finished processing
That kind of pseudocode is good enough to describe roughly 1000 lines of code, and good enough to accurately inform a newbie what the program is actually doing.
On many occasions when I don't know how to solve a problem, I actually find myself drawing my modules on a whiteboard in very high-level terms to get a clear picture of how they're interacting, drawing a prototype of a database schema, drawing a data structure (especially trees, graphs, arrays, etc.) to get a good handle on how to traverse and process it, and so on.
I use it when explaining concepts. It helps to trim out the unnecessary bits of language so that examples only have the details pertinent to the question being asked.
I use it a fair amount on StackOverflow.
I don't use pseudocode as it is taught in school, and haven't in a very long time.
I do use English descriptions of algorithms when the logic is complex enough to warrant it; they're called "comments". ;-)
When explaining things to others, or working things out on paper, I use diagrams as much as possible. The simpler, the better.
Steve McConnell's Code Complete, in chapter 9, "The Pseudocode Programming Process", proposes an interesting approach: when writing a function longer than a few lines, use simple pseudocode (in the form of comments) to outline what the function/procedure needs to do before writing the actual code that does it. The pseudocode comments can then become actual comments in the body of the function.
I tend to use this for any function that does more than what can be quickly understood by looking at a screenful (max) of code. It works especially well if you are already used to separating your function body into code "paragraphs": units of semantically related code separated by a blank line. Then the "pseudocode comments" work like "headers" to these paragraphs.
PS: Some people may argue that "you shouldn't comment what, but why, and only when it's not trivial to understand for a reader who knows the language in question better than you". I generally agree with this, but I do make an exception for the PPP. The criteria for the presence and form of a comment shouldn't be set in stone, but ultimately governed by wise, well-thought-out application of common sense anyway. If you find yourself refusing to bend a subjective "rule" slightly just for the sake of the rule, you might need to step back and ask whether you are looking at it critically enough.
I mostly use it for nutting out really complex code, or when explaining code to either other developers or non-developers who understand the system.
I also use flow diagrams or UML-type diagrams when trying to do the above.
I generally use it when developing multiple nested if-else statements, which can be confusing.
This way I don't need to go back and document it, since it's already been done.
Fairly rarely, although I often document a method before writing the body of it.
However, if I'm helping another developer with how to approach a problem, I'll often write an email with a pseudocode solution.
I don't use pseudocode at all.
I'm more comfortable with the syntax of C style languages than I am with Pseudocode.
What I do do quite frequently for design purposes is essentially a functional decomposition style of coding.
public void doBigJob( params )
{
    doTask1( params);
    doTask2( params);
    doTask3( params);
}
private void doTask1( params)
{
    doSubTask1_1(params);
    ...
}
Which, in an ideal world, would eventually turn into working code as methods become more and more trivial. However, in real life, there is a heck of a lot of refactoring and rethinking of design.
We find this works well enough, as we rarely come across an algorithm that is both incredibly complex and hard to code and not better solved using UML or another modelling technique.
I never use or used it.
I always try to prototype in a real language when I need to do something complex, usually writing unit tests first to figure out what the code needs to do.
Reading this question I found this as (note the quotation marks) "code" to solve the problem (that's perl by the way).
100,{)..3%!'Fizz'*\5%!'Buzz'*+\or}%n*
Obviously this is an intellectual example without real (I hope to never see that in real code in my life) implications but, when you have to make the choice, when do you sacrifice code readability for performance? Do you apply just common sense, do you do it always as a last resort? What are your strategies?
Edit: I'm sorry, seeing the answers I might have expressed the question badly (English is not my native language). I don't mean performance vs. readability only after you've written the code; I'm asking about before you write it as well. Sometimes you can foresee a future performance improvement by choosing a more obscure design or by adding properties that make your class harder to follow. You may have to decide between multiple threads and a single one because you expect the scalability that threads may give you, even though that will make the code much more difficult to understand.
My process for situations where I think performance may be an issue:
Make it work.
Make it clear.
Test the performance.
If there are meaningful performance issues: refactor for speed.
Note that this does not apply to higher-level design decisions that are more difficult to change at a later stage.
I always start with the most readable version I can think of. If performance is a problem, I refactor. If the readable version makes it hard to generalize, I refactor.
The key is to have good tests so that refactoring is easy.
I view readability as the #1 most important issue in code, though working correctly is a close second.
Readability is most important. With modern computers, only the most intensive routines of the most demanding applications need to worry too much about performance.
My favorite answer to this question is:
Make it work
Make it right
Make it fast
In the scope of things no one gives a crap about readability except the next unlucky fool that has to take care of your code. However, that being said... if you're serious about your art, and this is an art form, you will always strive to make your code as performant as it can be while still being readable by others. My friend and mentor (who is a BADASS in every way) once graciously told me on a code review that "the fool writes code only they can understand, the genius writes code that anyone can understand." I'm not sure where he got that from, but it has stuck with me.
Reference
Programs must be written for people to read, and only incidentally for
machines to execute. — Abelson & Sussman, SICP
Well-written programs are probably easier to profile, and hence easier to make faster.
You should always go for readability first. The shape of a system will typically evolve as you develop it, and the real performance bottlenecks will be unexpected. Only when you have the system running and can see real evidence - as provided by a profiler or other such tool - will the best way to optimise be revealed.
"If you're in a hurry, take the long way round."
I agree with all of the above, but also:
When you decide that you want to optimize:
Fix algorithmic aspects before syntax (for example, don't do lookups in large arrays)
Make sure that you prove that your change really did improve things; measure everything
Comment your optimization so the next guy seeing that function doesn't simplify it back to where you started from
Check whether you can precompute results or move the computation to where it can be done more effectively (like a db)
In effect, keep readability as long as you can: finding the obscure bug in optimized code is much harder and more annoying than in the simple, obvious code.
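A small sketch of the first three points taken together: an algorithmic fix (a set instead of repeated scans of a large list), a measurement helper, and a comment that explains why the optimization is there. The function names and the word-counting task are invented for illustration, and actual timings will vary by machine.

```python
import timeit


def count_known_slow(words, known):
    """Readable first version: `w in known` scans the whole list each time."""
    return sum(1 for w in words if w in known)


def count_known_fast(words, known):
    """Same result with a different algorithm."""
    # OPTIMIZATION: membership tests against a large list were the bottleneck
    # when measured; a set makes each lookup roughly constant time.
    # Please re-measure before simplifying this back to a list.
    known_set = set(known)
    return sum(1 for w in words if w in known_set)


def measure(func, *args, repeat=3, number=5):
    """Best-of-`repeat` wall time in seconds for `number` calls of func(*args)."""
    return min(timeit.repeat(lambda: func(*args), repeat=repeat, number=number))


if __name__ == "__main__":
    known = [str(i) for i in range(2000)]
    words = [str(i) for i in range(0, 4000, 2)]
    # Prove the change kept the behaviour before looking at the numbers.
    assert count_known_slow(words, known) == count_known_fast(words, known)
    print("slow:", measure(count_known_slow, words, known))
    print("fast:", measure(count_known_fast, words, known))
```

Checking that both versions return the same answer before comparing times is what stops an optimization from quietly becoming a bug.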
I apply common sense - this sort of thing is just one of the zillion trade-offs that engineering entails, and has few special characteristics that I can see.
But to be more specific, the overwhelming majority of people doing weird unreadable things in the name of performance are doing them prematurely and without measurement.
Choose readability over performance unless you can prove that you need the performance.
I would say that you should only sacrifice readability for performance if there's a proven performance problem that's significant. Of course "significant" is the catch there, and what's significant and what isn't should be specific to the code you're working on.
"Premature optimization is the root of all evil." - Donald Knuth
Readability always wins. Always. Except when it doesn't. And that should be very rarely.
At times when optimization is necessary, I'd rather sacrifice compactness and keep the performance enhancement. Perl obviously has some deep waters to plumb in search of the conciseness/performance ratio, but as cute as it is to write one-liners, the person who comes along to maintain your code (who, in my experience, is usually myself 6 months later) might prefer something more in the expanded style, as documented here:
http://www.perl.com/pub/a/2004/01/16/regexps.html
There are exceptions to the premature optimization rule. For example, when accessing an image in memory, reading a pixel should not be an out-of-line function. And when providing for custom operations on the image, never do it like this:
typedef Pixel PixelModifierFunction(Pixel);
void ModifyAllPixels(PixelModifierFunction);
Instead, let external functions access the pixels in memory, though it's uglier. Otherwise, you are sure to write slow code that you'll have to refactor later anyway, so you're doing extra work.
At least, that's true if you know you're going to deal with large images.