In what ways should we evaluate system refactoring and why?

This is a college assignment question, but I haven't refactored any projects myself. I can only think of the following points:
Has the refactoring improved cohesion and lowered coupling?
Has the refactoring improved code readability?
Is it more convenient to add new functionality than it was before the refactoring?
Has performance improved?
Do you have any other ideas, such as points to add or corrections to mine?

Refactoring can have one or more of the following goals (not an exhaustive list):
reducing duplicate code (by extracting functions and types; see the sketch at the end of this answer)
enforcing invariants (by extracting types)
improving development speed (extract functions, rename, lower cyclomatic complexity, lower compile times, lower number of responsibilities of a piece of code, lower dependencies, improve readability)
optimize (for memory, execution time, cache friendliness, cpu usage)
If you have no clear idea of what you are trying to achieve, you may end up refactoring with no clear benefit.
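To make the first goal above concrete, here is a minimal C++ sketch of extracting a function to remove duplication (the helper name and the validation rule are invented for illustration):

#include <string>

// Before: the same check was copy-pasted into create_user() and rename_user():
//     if (name.empty() || name.size() > 64) return false;
// After: one extracted function, so the rule lives in exactly one place and
// the call sites read as intent rather than mechanics.
bool is_valid_user_name(const std::string& name)   // hypothetical helper
{
    return !name.empty() && name.size() <= 64;
}

After such a change you can evaluate the refactoring against the criteria above: the duplication is gone, readability improves, and changing the rule later touches a single function.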

Related

Does profile-guided optimization done by compiler notably hurt cases not covered with profiling dataset?

This question is not specific to C++; AFAIK certain runtimes like the Java RE can do profile-guided optimization on the fly, and I'm interested in that too.
MSDN describes PGO like this:
I instrument my program and run it under a profiler, then
the compiler uses data gathered by the profiler to automatically reorganize branching and loops in such a way that branch misprediction is reduced and the most often run code is placed compactly to improve its locality.
Now obviously profiling result will depend on a dataset used.
With normal manual profiling and optimization I'd find some bottlenecks and improve those bottlenecks and likely leave all the other code untouched. PGO seems to improve often-run code at the expense of making rarely run code slower.
Now what if that slowed-down code is often run on another dataset that the program will see in the real world? Will the program's performance degrade compared to a program compiled without PGO, and how bad will the degradation likely be? In other words, does PGO really improve my code's performance for the profiling dataset and possibly worsen it for other datasets? Are there any real examples with real data?
Disclaimer: I have not done more with PGO than read up on it and tried it once with a sample project for fun. A lot of the following is based on my experience with the "non-PGO" optimizations and educated guesses. TL;DR below.
This page lists the optimizations done by PGO. Let's look at them one by one (grouped by impact):
Inlining – For example, if there exists a function A that frequently calls function B, and function B is relatively small, then profile-guided optimizations will inline function B in function A.
Register Allocation – Optimizing with profile data results in better register allocation.
Virtual Call Speculation – If a virtual call, or other call through a function pointer, frequently targets a certain function, a profile-guided optimization can insert a conditionally-executed direct call to the frequently-targeted function, and the direct call can be inlined.
These apparently improve the prediction of whether or not certain optimizations pay off. There is no direct tradeoff for non-profiled code paths.
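As a rough source-level illustration of virtual call speculation, consider this C++ sketch; the types and the "s is almost always a Circle" assumption are invented, and a real compiler would test the vtable or function pointer directly rather than use dynamic_cast:

#include <iostream>

struct Shape { virtual double area() const = 0; virtual ~Shape() = default; };
struct Circle : Shape {
    double r;
    explicit Circle(double radius) : r(radius) {}
    double area() const override { return 3.14159 * r * r; }
};

double profiled_area(const Shape* s)
{
    // PGO observed that s is almost always a Circle, so the compiler may emit
    // the equivalent of a cheap type check guarding an inlined direct call,
    // keeping ordinary virtual dispatch as the cold fallback.
    if (auto c = dynamic_cast<const Circle*>(s))
        return 3.14159 * c->r * c->r;   // speculated, inlined Circle::area
    return s->area();                   // rare case: generic virtual call
}

int main()
{
    Circle c{2.0};
    std::cout << profiled_area(&c) << "\n";
}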
Basic Block Optimization – Basic block optimization allows commonly executed basic blocks that temporally execute within a given frame to be placed in the same set of pages (locality). This minimizes the number of pages used, thus minimizing memory overhead.
Function Layout – Based on the call graph and profiled caller/callee behavior, functions that tend to be along the same execution path are placed in the same section.
Dead Code Separation – Code that is not called during profiling is moved to a special section that is appended to the end of the set of sections. This effectively keeps this section out of the often-used pages.
EH Code Separation – The EH code, being exceptionally executed, can often be moved to a separate section when profile-guided optimizations can determine that the exceptions occur only on exceptional conditions.
All of this may reduce locality of non-profiled code paths. In my experience, the impact would be noticeable or severe if this code path has a tight loop that exceeds the L1 code cache (and maybe even thrashes L2). That sounds exactly like a path that should have been included in a PGO profile :)
Dead Code separation can have a huge impact - both ways - because it can reduce disk access.
If you rely on exceptions being fast, you are doing it wrong.
Size/Speed Optimization – Functions where the program spends a lot of time can be optimized for speed.
The rule of thumb nowadays is to "optimize for size by default, and only optimize for speed where needed (and verify it helps)". The reason is again code cache - in most cases, the smaller code will also be the faster code, because of code cache. So this kind of automates what you should do manually. Compared to a global speed optimization, this would slow down non-profiled code paths only in very atypical cases ("weird code" or a target machine with unusual cache behavior).
Conditional Branch Optimization – With the value probes, profile-guided optimizations can find if a given value in a switch statement is used more often than other values. This value can then be pulled out of the switch statement. The same can be done with if/else instructions where the optimizer can order the if/else so that either the if or else block is placed first depending on which block is more frequently true.
I would file that under "improved prediction", too, unless you feed the wrong PGO information.
The typical case where this can pay off a lot is run-time parameter / range validation and similar paths that should never be taken in a normal execution.
The breaking case would be:
if (x > 0) DoThis(); else DoThat();
in a relevant tight loop and profiling only the x > 0 case.
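Here is a hedged C++ sketch of the "pull the hot value out of the switch" idea; the opcodes and handler names are made up, and with PGO the compiler effectively produces the second form for you rather than you writing it by hand:

// Stub handlers, purely for illustration.
static int handle_add()  { return 1; }
static int handle_sub()  { return 2; }
static int handle_load() { return 3; }
static int handle_other(int) { return 0; }

// Before: the value that profiling shows to be dominant (say, 7) is buried in the switch.
int dispatch(int opcode)
{
    switch (opcode) {
        case 0:  return handle_add();
        case 1:  return handle_sub();
        case 7:  return handle_load();      // ~95% of calls in the profile
        default: return handle_other(opcode);
    }
}

// After: the effect of the optimization is as if the hot value were hoisted
// into a single, well-predicted compare ahead of the general switch.
int dispatch_pgo(int opcode)
{
    if (opcode == 7) return handle_load();  // hot path, checked first
    switch (opcode) {                        // cold fallback
        case 0:  return handle_add();
        case 1:  return handle_sub();
        default: return handle_other(opcode);
    }
}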
Memory Intrinsics – The expansion of intrinsics can be decided better if it can be determined if an intrinsic is called frequently. An intrinsic can also be optimized based on the block size of moves or copies.
Again, mostly better information with a small possibility of penalizing untested data.
Example: this is all an "educated guess", but I think it's quite illustrative for the entire topic.
Assume you have a memmove that is always called on well aligned non-overlapping buffers with a length of 16 bytes.
A possible optimization is verifying these conditions and using inlined MOV instructions for this case, calling a general memmove (handling alignment, overlap and odd lengths) only when the conditions are not met.
The benefits can be significant in a tight loop of copying structs around, as you improve locality, reduce the expected-path instruction count, and likely create more chances for pairing/reordering.
The penalty is comparatively small, though: in the general case without PGO, you would either always call the full memmove, or inline the full memmove implementation. The optimization adds a few instructions (including a conditional jump) to something rather complex; I'd assume a 10% overhead at most. In most cases, that 10% will be below the noise due to cache access.
However, there is a very slight chance of significant impact if the unexpected branch is taken frequently and the additional instructions for the expected case, together with the instructions for the default case, push a tight loop out of the L1 code cache.
Note that you are already at the limits of what the compiler could do for you. The additional instructions can be expected to be a few bytes, compared to a few K in code cache. A static optimizer could hit the same fate depending on how well it can hoist invariants - and how much you let it.
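Spelled out as (hedged) C++, the specialization described above might look like this if you wrote it by hand; a PGO-enabled compiler would make an equivalent decision internally, and the function name is invented:

#include <cstdint>
#include <cstring>

void copy_profiled(void* dst, const void* src, std::size_t n)
{
    const auto d = reinterpret_cast<std::uintptr_t>(dst);
    const auto s = reinterpret_cast<std::uintptr_t>(src);
    const bool aligned    = (d % 16 == 0) && (s % 16 == 0);
    const bool no_overlap = (d + n <= s) || (s + n <= d);
    if (n == 16 && aligned && no_overlap) {
        // Expected case from the profile: a fixed-size copy that the compiler
        // can turn into a couple of register moves.
        std::memcpy(dst, src, 16);
        return;
    }
    std::memmove(dst, src, n);   // general case: alignment, overlap, odd lengths
}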
Conclusion:
Many of the optimizations are neutral.
Some optimizations can have a slight negative impact on non-profiled code paths
The impact is usually much smaller than the possible gains
Very rarely, a small impact can be amplified by other contributing pathological factors
A few optimizations (namely, layout of code sections) can have a large impact, but again the possible gains significantly outweigh that
My gut feel would further claim that
A static optimizer, on a whole, would be at least equally likely to create a pathological case
It would be pretty hard to actually destroy performance even with bad PGO input.
At that level, I would be much more afraid of PGO implementation bugs/shortcomings than of failed PGO optimizations.
PGO can most certainly affect the run time of the code that is run less frequently. After all, you are modifying the locality of some functions/blocks, and that will make the blocks that now sit together more cache friendly.
What I have seen is that teams identify their high-priority scenarios. Then they run those to train the optimization profiler and measure the improvement. You don't want to run all the scenarios under PGO, because if you do you might as well not run any.
As with everything related to performance, you need to measure before you apply it. Measure your most common scenarios to see if they improved at all by using PGO training, and also measure the less common scenarios to see if they regressed at all.

Tips and tricks on improving Fortran code performance [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
As part of my Ph.D. research, I am working on development of numerical models of atmosphere and ocean circulation. These involve numerically solving systems of PDE's on the order of ~10^6 grid points, over ~10^4 time steps. Thus, a typical model simulation takes hours to a few days to complete when run in MPI on dozens of CPUs. Naturally, improving model efficiency as much as possible is important, while making sure the results are byte-to-byte identical.
While I feel quite comfortable with my Fortran programming, and am aware of quite a few tricks to make code more efficient, I feel like there is still room to improve, and tricks that I am not aware of.
Currently, I make sure I use as few divisions as possible, and try not to use literal constants (I was taught to do this from very early on, e.g. use half=0.5 instead of 0.5 in actual computations), use as few transcendental functions as possible etc.
What other performance sensitive factors are there? At the moment, I am wondering about a few:
1) Does the order of mathematical operations matter? For example if I have:
a=1E-7 ; b=2E4 ; c=3E13
d=a*b*c
would d evaluate with different efficiency based on the order of multiplication? Nowadays this must be compiler specific, but is there a straight answer? I notice d getting a (slightly) different value based on the order (precision limit), but will this impact efficiency or not?
2) Passing lots (e.g. dozens) of arrays as arguments to a subroutine versus accessing these arrays from a module within the subroutine?
3) Fortran 95 constructs (FORALL and WHERE) versus DO and IF? I know that these mattered back in the 90's when code vectorization was a big thing, but is there any difference now with modern compilers being able to vectorize explicit DO loops? (I am using PGI, Intel, and IBM compilers in my work)
4) Raising a number to an integer power versus multiplication? E.g.:
b=a**4
or
b=a*a*a*a
I have been taught to always use the latter where possible. Does this affect efficiency and/or precision? (probably compiler dependent as well)
Please discuss and/or add any tricks and tips that you know about improving Fortran code efficiency. What else is out there? If you know anything specific to what each of the compilers above do related to this question, please include that as well.
Added: Note that I do not have any bottlenecks or performance issues per se. I am asking if there are any general rules for optimizing the code in sense of operations.
Thanks!
Sorry but all the tricks you mentioned are simply ... ridiculous. More exactly, they have no meaning in practice. For instance:
what could be the advantage of using half(=0.5) instead of 0.5?
idem for computing a**4 or a*a*a*a. (a*a)**2 would be another possibility too. My personal taste is a**4, because a good compiler will automatically choose the best way.
For **, the only point which could matter is the difference between a ** 4 and a ** 4., the latter being much more CPU-time consuming. But even this point is meaningless without a measurement in an actual simulation.
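The same distinction exists outside Fortran; as a hedged C++ analogue (function names invented), a small constant integer exponent can be lowered to a few multiplications, while a floating-point exponent generally ends up as a call to the pow routine unless the compiler can prove it safe to special-case:

#include <cmath>

double fourth_power_int(double a)  { return a * a * a * a; }    // a few multiplies
double fourth_power_real(double a) { return std::pow(a, 4.0); } // usually a pow() call

As the answer says, though, measure in an actual simulation before worrying about it.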
In fact, your approach is wrong. Develop your code as well as possible. After that, measure objectively the cost of the different parts of your code. Optimizing without measuring first simply makes no sense.
If a part exhibits a high percentage of the CPU time, 50% for instance, don't forget that optimizing only that part cannot divide the cost of the overall code by a factor greater than two. Anyway, start the optimization work with the most expensive part (the bottleneck).
Don't forget also that the main improvements are generally coming from better algorithms.
I second the advice that these tricks you have been taught are silly in this era. Compilers do this for you now; such micro-optimizations are unlikely to make a significant difference and may not be portable. Write clear and understandable code. Carefully select your algorithm. One thing that can make a difference is using the indices of multi-dimensional arrays in the correct order ... recasting an M x N array to N x M can help, depending on the pattern of data access by your program. After this, if your program is too slow, measure where the CPU time is consumed and improve only those parts. Experience shows that guessing is frequently wrong and leads to writing more opaque code for no reason. If you make a code section in which your program spends 1% of its time twice as fast, it won't make any difference.
Here are previous answers on FORALL and WHERE: How can I ensure that my Fortran FORALL construct is being parallelized? and Do Fortran 95 constructs such as WHERE, FORALL and SPREAD generally result in faster parallel code?
You've got a-priori ideas about what to do, and some of them might actually help, but the biggest payoff is in a-posteriori analysis.
(Added: In other words, getting a*b*c into a different order might save a couple of cycles (which I doubt), while at the same time you don't know you're not getting blind-sided by something spending 1000 cycles for no good reason.)
No matter how carefully you code it, there will be opportunities for speedup that you didn't foresee. Here's how I find them. (Some people consider this method controversial).
It's best to start with optimization flags OFF when you do this, so the code isn't all scrambled.
Later you can turn them on and let the compiler do its thing.
Get it running under a debugger with enough of a workload so it runs for a reasonable length of time.
While it's running, manually interrupt it, and take a good hard look at what it's doing and why.
Do this several times, like 10, so you don't draw erroneous conclusions about what it's spending time at.
Here are examples of things you might find:
It could be spending a large fraction of time calling math library functions unnecessarily due to the way some expressions were coded, or with the same argument values as in prior calls.
It could be spending a large fraction of time doing some file I/O, or opening/closing a file, deep inside some routine that seemed harmless to call.
It could be in a general-purpose library function, calling a subordinate subroutine, for the purpose of checking argument flags to the upper function. In such a case, much of that time might be eliminated by writing a special-purpose function and calling that instead.
If you do this entire operation two or three times, you will have removed the stupid stuff that finds its way into any software when it's first written.
After that, you can turn on the optimization, parallelism, or whatever, and be confident no time is being spent on silly stuff.

Optimization! - What is it? How is it done?

It's common to hear about "highly optimized code" or some developer needing to optimize theirs and whatnot. However, as a self-taught, new programmer I've never really understood what exactly people mean when talking about such things.
Care to explain the general idea of it? Also, recommend some reading materials and really whatever you feel like saying on the matter. Feel free to rant and preach.
Optimize is a term we use lazily to mean "make something better in a certain way". We rarely "optimize" something - more, we just improve it until it meets our expectations.
Optimizations are changes we make in the hopes to optimize some part of the program. A fully optimized program usually means that the developer threw readability out the window and has recoded the algorithm in non-obvious ways to minimize "wall time". (It's not a requirement that "optimized code" be hard to read, it's just a trend.)
One can optimize for:
Memory consumption - Make a program or algorithm's runtime size smaller.
CPU consumption - Make the algorithm computationally less intensive.
Wall time - Do whatever it takes to make something faster
Readability - Instead of making your app better for the computer, you can make it easier for humans to read it.
Some common (and overly generalized) techniques to optimize code include:
Change the algorithm to improve performance characteristics. If you have an algorithm that takes O(n^2) time or space, try to replace that algorithm with one that takes O(n * log n).
To relieve memory consumption, go through the code and look for wasted memory. For example, if you have a string-intensive app you can switch to using substring references (where a reference contains a pointer to the string, plus indices to define its bounds) instead of allocating and copying memory from the original string.
To relieve CPU consumption, cache as many intermediate results as you can (a sketch follows this list). For example, if you need to calculate the standard deviation of a set of data, save that single numerical result instead of looping through the set each time you need to know the std dev.
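A minimal C++ sketch of that caching idea (the class and member names are invented; the principle is the same in any language): the standard deviation is recomputed only when the data changes, not on every query.

#include <cmath>
#include <vector>

class Samples {
public:
    void add(double x) { data_.push_back(x); dirty_ = true; }

    double std_dev() const
    {
        if (dirty_) {                     // recompute only when the data changed
            if (data_.empty()) {
                cached_ = 0.0;
            } else {
                double mean = 0.0;
                for (double x : data_) mean += x;
                mean /= static_cast<double>(data_.size());
                double var = 0.0;
                for (double x : data_) var += (x - mean) * (x - mean);
                cached_ = std::sqrt(var / static_cast<double>(data_.size()));
            }
            dirty_ = false;
        }
        return cached_;                   // cheap on every later call
    }

private:
    std::vector<double> data_;
    mutable double cached_ = 0.0;
    mutable bool   dirty_  = true;
};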
I'll mostly rant with no practical advice.
Measure First. Optimization should be done in places where it matters. Highly optimized code is often difficult to maintain and a source of problems. In places where the code does not slow down execution anyway, I always prefer maintainability to optimizations. Familiarize yourself with profiling, both intrusive (instrumented) and non-intrusive (low-overhead statistical). Learn to read a profiled stack, understand where the inclusive/exclusive time is spent, why certain patterns show up, and how to identify the trouble spots.
You can't fix what you cannot measure. Have your program report, through some performance infrastructure, the things it does and the time they take. I come from a Win32 background so I'm used to Performance Counters, and I'm extremely generous at sprinkling them all over my code. I even automated generating them.
And finally some words about optimizations. Most discussion about optimization I see focuses on stuff any compiler will optimize for you for free. In my experience, the greatest source of gains for 'highly optimized code' lies completely elsewhere: memory access. On modern architectures the CPU is idling most of the time, waiting for memory to be served into its pipelines. Between L1 and L2 cache misses, TLB misses, NUMA cross-node accesses and even hard page faults that must fetch the page from disk, the memory access pattern of a modern application is the single most important optimization one can make. I'm exaggerating slightly; of course there will be counter-example workloads that will not benefit from these memory-locality techniques, but most applications will. To be specific, what these techniques mean is simple: cluster your data in memory so that a single CPU can work on a tight memory range containing all it needs, with no expensive references to memory outside your cache lines or your current page. In practice this can mean something as simple as accessing an array by rows rather than by columns.
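To illustrate that last point, here is a hedged C++ sketch (C and C++ store 2-D data row by row, so the row-wise loop walks consecutive addresses while the column-wise loop strides across memory; the contiguous-buffer layout is an assumption of the example):

#include <cstddef>
#include <vector>

// The same sum computed two ways over a rows*cols matrix stored in one contiguous buffer.
double sum_by_rows(const std::vector<double>& m, std::size_t rows, std::size_t cols)
{
    double s = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            s += m[i * cols + j];        // consecutive addresses: cache friendly
    return s;
}

double sum_by_columns(const std::vector<double>& m, std::size_t rows, std::size_t cols)
{
    double s = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            s += m[i * cols + j];        // jumps a whole row ahead each step: poor locality
    return s;
}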
I would recommend you read up on the AlphaSort paper presented at the VLDB conference in 1995. This paper presented how cache-sensitive algorithms designed specifically for modern CPU architectures can blow the previous benchmarks out of the water:
We argue that modern architectures require algorithm designers to re-examine their use of the memory hierarchy. AlphaSort uses clustered data structures to get good cache locality...
The general idea is that when you build the source tree during the compilation phase, before generating code from it, you do an additional step (optimization) where, based on certain heuristics, you collapse branches together, delete branches that aren't used, or add extra nodes for temporary values that are used multiple times.
Think of stuff like this piece of code:
a=(b+c)*3-(b+c)
which gets translated into
        (-)
       /   \
     (*)   (+)
     /  \   / \
   (+)   3 b   c
   /  \
  b    c
To the optimizer it would be obvious that the (+) node with its two descendants appears twice, so the duplicates would be merged into a temporary variable, t, and the tree would be rewritten:
      (-)
     /   \
   (*)    t
   /  \
  t    3
Now an even better optimizer would see that, since t is an integer, t*3 - t can be safely folded to t*2, so the tree is simplified further to:
   (*)
   /  \
  t    2
and the intermediate code that you'd run your code-generation step on would finally be
int t=b+c;
a=t*2;
with t marked as a register variable, which is exactly what would be written for assembly.
One final note: you can optimize for more than just run time speed. You can also optimize for memory consumption, which is the opposite. Where unrolling loops and creating temporary copies would help speed up your code, they would also use more memory, so it's a trade off on what your goal is.
Here is an example of some optimization (fixing a poorly made decision) that I did recently. It's very basic, but I hope it illustrates that good gains can be made even from simple changes, and that 'optimization' isn't magic; it's just about making the best decisions to accomplish the task at hand.
In an application I was working on there were several LinkedList data structures that were being used to hold various instances of foo.
When the application was in use it was very frequently checking to see if the LinkedList contained object X. As the number of X's started to grow, I noticed that the application was performing more slowly than it should have been.
I ran a profiler and realized that each 'myList.Contains(x)' call was O(N), because the list has to iterate through each item it contains until it reaches the end or finds a match. This was definitely not efficient.
So what did I do to optimize this code? I switched most of the LinkedList data structures to HashSets, which can do a '.Contains(X)' call in O(1) - much better.
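The answer above uses .NET collections; here is the same change sketched in C++ (names invented): membership tests on a linked list scan node by node, while a hash-based set does an average O(1) lookup.

#include <algorithm>
#include <list>
#include <unordered_set>

bool contains_in_list(const std::list<int>& items, int x)
{
    // Walks the nodes one by one until a match or the end: O(N).
    return std::find(items.begin(), items.end(), x) != items.end();
}

bool contains_in_set(const std::unordered_set<int>& items, int x)
{
    // Hash lookup: average O(1), regardless of how many items are stored.
    return items.count(x) != 0;
}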
This is a good question.
Usually the best practice is 1) just write the code to do what you need it to do, 2) then deal with performance, but only if it's an issue. If the program is "fast enough" it's not an issue.
If the program is not fast enough (like it makes you wait) then try some performance tuning. Performance tuning is not like programming. In programming, you think first and then do something. In performance tuning, thinking first is a mistake, because that is guessing.
Don't guess what to fix; diagnose what the program is doing.
Everybody knows that, but mostly they do it anyway.
It is natural to say "Could be the problem is X, Y, or Z" but only the novice acts on guesses. The pro says "but I'm probably wrong".
There are different ways to diagnose performance problems.
The simplest is just to single-step through the program at the assembly-language level, and don't take any shortcuts. That way, if the program is doing unnecessary things, then you are doing the same things, and it will become painfully obvious.
Another is to get a profiling tool, and as others say, measure, measure, measure.
Personally I don't care for measuring. I think it's a fuzzy microscope for the purpose of pinpointing performance problems. I prefer this method, and this is an example of its use.
Good luck.
ADDED: I think you will find, if you go through this exercise a few times, you will learn what coding practices tend to result in performance problems, and you will instinctively avoid them. (This is subtly different from "premature optimization", which is assuming at the beginning that you must be concerned about performance. In fact, you will probably learn, if you don't already know, that premature concern about performance can well cause the very problem it seeks to avoid.)
Optimizing a program means: make it run faster
The only way of making the program faster is making it do less:
find an algorithm that uses fewer operations (e.g. N log N instead of N^2)
avoid slow components of your machine (keep objects in cache instead of in main memory, or in main memory instead of on disk); reducing memory consumption nearly always helps!
Further rules:
In looking for optimization opportunities, adhere to the 80-20-rule: 20% of typical program code accounts for 80% of execution time.
Measure the time before and after every attempted optimization (a timing sketch follows this list); often enough, attempted optimizations don't actually help.
Only optimize after the program runs correctly!
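As a minimal sketch of the "measure the time before and after" rule above (the lambdas are placeholders for your real code), a steady clock and two timed runs are usually enough to keep yourself honest:

#include <chrono>
#include <cstdio>

template <typename F>
double seconds(F&& work)
{
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main()
{
    const double before = seconds([] { /* old implementation of the hot spot */ });
    const double after  = seconds([] { /* the attempted optimization */ });
    std::printf("before: %.6f s   after: %.6f s\n", before, after);
}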
Also, there are ways to make a program appear to be faster:
separate GUI event processing from back-end tasks; prioritize user-visible changes over back-end calculation to keep the front-end "snappy"
give the user something to read while performing long operations (ever noticed the slideshows displayed by installers?)
However, as a self-taught, new programmer I've never really understood what exactly people mean when talking about such things.
Let me share a secret with you: nobody does. There are certain areas where we know mathematically what is and isn't slow. But for the most part, performance is too complicated to be able to understand. If you speed up one part of your code, there's a good possibility you're slowing down another.
Therefore, when anyone tells you that one method is faster than another, there's a good possibility they're just guessing, unless one of three things is true:
They have data
They're choosing an algorithm that they know is faster mathematically.
They're choosing a data structure that they know is faster mathematically.
Optimization means trying to improve computer programs for such things as speed. The question is very broad, because optimization can involve compilers improving programs for speed, or human beings doing the same.
I suggest you read a bit of theory first (from books, or Google for lecture slides):
Data structures and algorithms - what the O() notation is, how to calculate it,
what data structures and algorithms can be used to lower the O-complexity
Book: Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest
Compilers and assembly - how code is translated to machine instructions
Computer architecture - how the CPU, RAM, Cache, Branch predictions, out of order execution ... work
Operating systems - kernel mode, user mode, scheduling processes/threads, mutexes, semaphores, message queues
After reading a bit of each, you should have a basic grasp of all the different aspects of optimization.
Note: I wiki-ed this so people can add book recommendations.
I am going with the idea that optimizing a code is to get the same results in less time. And fully optimized only means they ran out of ideas to make it faster. I throw large buckets of scorn on claims of "fully optimized" code! There's no such thing.
So you want to make your application/program/module run faster? The first thing to do (as mentioned earlier) is measure, also known as profiling. Do not guess where to optimize. You are not that smart and you will be wrong. My guesses are wrong all the time, and large portions of my year are spent profiling and optimizing. So get the computer to do it for you. For the PC, VTune is a great profiler. I think VS2008 has a built-in profiler, but I haven't looked into it. Otherwise, measure functions and large pieces of code with performance counters. You'll find sample code for using performance counters on MSDN.
So where are your cycles going? You are probably waiting for data coming from main memory. Go read up on L1 & L2 caches. Understanding how the cache works is half the battle. Hint: Use tight, compact structures that will fit more into a cache-line.
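As a hedged illustration of that hint (the exact sizes depend on the ABI; these are typical for 64-bit targets), just reordering members so they pack tightly lets more elements fit into each cache line:

#include <cstdint>

struct Loose {            // typically 24 bytes: padding before and after value
    std::uint8_t flag;
    double       value;
    std::uint8_t kind;
};

struct Tight {            // typically 16 bytes: same fields, largest first
    double       value;
    std::uint8_t flag;
    std::uint8_t kind;
};
// An array of Tight packs roughly 50% more elements into each 64-byte cache
// line than an array of Loose, with no change to the code that uses the fields.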
Optimization is lots of fun. And it's never ending too :)
A great book on optimization is Write Great Code: Understanding the Machine by Randall Hyde.
Make sure your application produces correct results before you start optimizing it.

Do you find cyclomatic complexity a useful measure?

I've been playing around with measuring the cyclomatic complexity of a big code base.
Cyclomatic complexity is the number of linearly independent paths through a program's source code and there are lots of free tools for your language of choice.
The results are interesting but not surprising. That is, the parts I know to be the hairiest were in fact the most complex (with a rating of > 50). But what I am finding useful is that a concrete "badness" number is assigned to each method as something I can point to when deciding where to start refactoring.
Do you use cyclomatic complexity? What's the most complex bit of code you found?
We refactor mercilessly, and use Cyclomatic complexity as one of the metrics that gets code on our 'hit list'. 1-6 we don't flag for complexity (although it could get questioned for other reasons), 7-9 is questionable, and any method over 10 is assumed to be bad unless proven otherwise.
The worst we've seen was 87 from a monstrous if-else-if chain in some legacy code we had to take over.
Actually, cyclomatic complexity can be put to use beyond just method level thresholds. For starters, one big method with high complexity may be broken into several small methods with lower complexity. But has it really improved the codebase? Granted, you may get somewhat better readability by all those method names. But the total conditional logic hasn't changed. And the total conditional logic can often be reduced by replacing conditionals with polymorphism.
We need a metric that doesn't turn green by mere method decomposition. I call this CC100.
CC100 = 100 * (Total cyclomatic complexity of codebase) / (Total lines of code)
It's useful to me in the same way that big-O is useful: I know what it is, and can use it to get a gut feeling for whether a method is good or bad, but I don't need to compute it for every function I've written.
I think simpler metrics, like LOC, are at least as good in most cases. If a function doesn't fit on one screen, it almost doesn't matter how simple it is. If a function takes 20 parameters and makes 40 local variables, it doesn't matter if its cyclomatic complexity is 1.
Until there is a tool that can work well with C++ templates and metaprogramming techniques, it's not much help in my situation. Anyway, just remember that
"not all things that count can be measured, and not all things that can be measured count" - Einstein
So remember to pass any information of this type through human filtering too.
We recently started to use it. We use NDepend to do some static code analysis, and it measures cyclomatic complexity. I agree, it's a decent way to identify methods for refactoring.
Sadly, we have seen #'s above 200 for some methods created by our developers offshore.
You'll know complexity when you see it. The main thing this kind of tool is useful for is flagging the parts of the code that were escaping your attention.
I frequently measure the cyclomatic complexity of my code. I've found it helps me spot areas of code that are doing too much. Having a tool point out the hot-spots in my code is much less time consuming than having to read through thousands of lines of code trying to figure out which methods are not following the SRP.
However, I've found that when I do a cyclomatic complexity analysis on other people's code it usually leads to feelings of frustration, angst, and general anger when I find code with cyclomatic complexity in the 100's. What compels people to write methods that have several thousand lines of code in them?!
It's great for help identifying candidates for refactoring, but it's important to keep your judgment around. I'd support kenj0418's ranges for pruning guides.
There's a Java metric called CRAP4J that empirically combines cyclomatic complexity and JUnit test coverage to come up with a single metric. Its author has been doing research to try to improve his empirical formula. I'm not sure how widespread it is.
Cyclomatic complexity is just one component of what could be called Fabricated Complexity. A while back, I wrote an article to summarize several dimensions of code complexity:
Fighting Fabricated Complexity
Tooling is needed to be efficient at handling code complexity. The tool NDepend for .NET code will let you analyze many dimensions of the code complexity including code metrics like:
Cyclomatic Complexity, Nesting Depth, Lack Of Cohesion of Methods, Coverage by Tests...
including dependency analysis and a language (Code Query Language) dedicated to asking what is complex in my code and to writing rules about it.
Yes, we use it and I have found it useful too. We have a big legacy code base to tame, and we found alarmingly high cyclomatic complexity (387 in one method!). CC points you directly to areas that are worth refactoring. We use CCCC on C++ code.
I haven't used it in a while, but on a previous project it really helped identify potential trouble spots in someone else's code (wouldn't be mine, of course!)
Upon finding the areas to check out, I quickly found numerous problems with the logic (also lots of GOTOs, would you believe!) and some really strange WTF code.
Cyclomatic complexity is great for showing areas which are probably doing too much and therefore breaking the single responsibility principle. These should ideally be broken up into multiple functions.
I'm afraid that for the language of the project for which I would most like metrics like this, LPC, there are not, in fact, lots of free tools for producing it available. So no, not so useful to me.
+1 for kenj0418's hit list values.
The worst I've seen was a 275. There were a couple of others over 200 that we were able to refactor down to much smaller CCs; they were still high, but it got them pushed further back in line. We didn't have much luck with the 275 beast -- it was (probably still is) a web of if- and switch-statements that was just way too complex. Its only real value is as a step-through when they decide to rebuild the system.
The exceptions to high CC that I was comfortable with were factories; IMO, they are supposed to have a high CC, but only if they are doing nothing but simple object creation and returning.
After understanding what it means, I now have started to use it on a "trial" basis. So far I have found it to be useful, because usually high CC goes hand in hand with the Arrow Anti-Pattern, which makes code harder to read and understand. I do not have a fixed number yet, but NDepend is alerting for everything above 5, which looks like a good start to investigate methods.

Should a developer aim for readability or performance first? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Oftentimes a developer will be faced with a choice between two possible ways to solve a problem -- one that is idiomatic and readable, and another that is less intuitive, but may perform better. For example, in C-based languages, there are two ways to multiply a number by 2:
int SimpleMultiplyBy2(int x)
{
    return x * 2;
}
and
int FastMultiplyBy2(int x)
{
    return x << 1;
}
The first version is simpler to pick up for both technical and non-technical readers, but the second one may perform better, since bit shifting is a simpler operation than multiplication. (For now, let's assume that the compiler's optimizer would not detect this and optimize it, though that is also a consideration).
As a developer, which would be better as an initial attempt?
You missed one.
First code for correctness, then for clarity (the two are often connected, of course!). Finally, and only if you have real empirical evidence that you actually need to, you can look at optimizing. Premature optimization really is evil. Optimization almost always costs you time, clarity, maintainability. You'd better be sure you're buying something worthwhile with that.
Note that good algorithms almost always beat localized tuning. There is no reason you can't have code that is correct, clear, and fast. You'll be unreasonably lucky to get there starting off focusing on `fast' though.
IMO the obvious readable version first, until performance is measured and a faster version is required.
Take it from Don Knuth
Premature optimization is the root of all evil (or at least most of it) in programming.
Readability 100%
If your compiler can't do the "x*2" => "x <<1" optimization for you -- get a new compiler!
Also remember that 99.9% of your program's time is spent waiting for user input, waiting for database queries and waiting for network responses. Unless you are doing the multiplication 20 bajillion times, it's not going to be noticeable.
Readability for sure. Don't worry about the speed unless someone complains
In your given example, 99.9999% of the compilers out there will generate the same code for both cases. Which illustrates my general rule - write for readability and maintainability first, and optimize only when you need to.
Readability.
Coding for performance has its own set of challenges. Joseph M. Newcomer said it well:
Optimization matters only when it matters. When it matters, it matters a lot, but until you know that it matters, don't waste a lot of time doing it. Even if you know it matters, you need to know where it matters. Without performance data, you won't know what to optimize, and you'll probably optimize the wrong thing. The result will be obscure, hard to write, hard to debug, and hard to maintain code that doesn't solve your problem. Thus it has the dual disadvantage of (a) increasing software development and software maintenance costs, and (b) having no performance effect at all.
I would go for readability first. Considering the kind of optimized languages and hugely powerful machines we have these days, most of the code we write in a readable way will perform decently.
In some very rare scenarios, where you are pretty sure you are going to have a performance bottleneck (maybe from some past bad experiences), and you manage to find some weird trick which can give you a huge performance advantage, you can go for that. But you should comment that code snippet very well, which will help make it more readable.
Readability. The time to optimize is when you get to beta testing. Otherwise you never really know what you need to spend the time on.
An often overlooked factor in this debate is the extra time it takes for a programmer to navigate, understand and modify less readable code. Considering that a programmer's time goes for a hundred dollars an hour or more, this is a very real cost.
Any performance gain is countered by this direct extra cost in development.
Putting a comment there with an explanation would make it readable and fast.
It really depends on the type of project, and how important performance is. If you're building a 3D game, then there are usually a lot of common optimizations that you'll want to throw in there along the way, and there's no reason not to (just don't get too carried away early). But if you're doing something tricky, comment it so anybody looking at it will know how and why you're being tricky.
The answer depends on the context. In device driver programming or game development for example, the second form is an acceptable idiom. In business applications, not so much.
Your best bet is to look around the code (or in similar successful applications) to check how other developers do it.
If you're worried about readability of your code, don't hesitate to add a comment to remind yourself what and why you're doing this.
Using << would be a micro-optimization.
So Hoare's (not Knuth's) rule:
Premature optimization is the root of all evil.
applies, and you should just use the more readable version in the first place.
This rule is IMHO often misused as an excuse to design software that can never scale or perform well.
Both. Your code should balance both readability and performance, because ignoring either one will screw up the ROI of the project, which at the end of the day is all that matters to your boss.
Bad readability results in decreased maintainability, which results in more resources spent on maintenance, which results in a lower ROI.
Bad performance results in decreased investment and client base, which results in a lower ROI.
Readability is the FIRST target.
In the 1970's the army tested some of the then "new" techniques of software development (top down design, structured programming, chief programmer teams, to name a few) to determine which of these made a statistically significant difference.
The ONLY technique that made a statistically significant difference in development was...
ADDING BLANK LINES to program code.
The improvement in readability in that pre-structured, pre-object-oriented code was the only technique in these studies that improved productivity.
==============
Optimization should only be addressed when the entire project is unit tested and ready for instrumentation. You never know WHERE you need to optimize the code.
In their landmark books SOFTWARE TOOLS (1976) and SOFTWARE TOOLS IN PASCAL (1981), Kernighan and Plauger showed ways to create structured programs using top-down design. They created text-processing programs: editors, search tools, code pre-processors.
When the completed text-formatting function was INSTRUMENTED, they discovered that most of the processing time was spent in three routines that performed text input and output (in the original book, the I/O functions took 89% of the time; in the Pascal book, these functions consumed 55%!).
They were able to optimize these THREE routines and produced the results of increased performance with reasonable, manageable development time and cost.
The larger the codebase, the more readability is crucial. Trying to understand some tiny function isn't so bad. (Especially since the Method Name in the example gives you a clue.) Not so great for some epic piece of uber code written by the loner genius who just quit coding because he has finally seen the top of his ability's complexity and it's what he just wrote for you and you'll never ever understand it.
As almost everyone said in their answers, I favor readability. 99 out of 100 projects I run have no hard response time requirements, so it's an easy choice.
Before you even start coding you should already know the answer. Some projects have certain performance requirements, like 'need to be able to run task X in Y (milli)seconds'. If that's the case, you have a goal to work towards and you know when you have to optimize or not. (hopefully) this is determined at the requirements stage of your project, not when writing the code.
Good readability and the ability to optimize later on are a result of proper software design. If your software is of sound design, you should be able to isolate parts of your software and rewrite them if needed, without breaking other parts of the system. Besides, most true optimization cases I've encountered (ignoring some real low level tricks, those are incidental) have been in changing from one algorithm to another, or caching data to memory instead of disk/network.
If there is no readability, it will be very hard to get a performance improvement when you really need it.
Performance should only be improved when it is a problem in your program; there are many places more likely to be a bottleneck than this bit of syntax. Say you are squeezing a 1 ns improvement out of a << while ignoring 10 minutes of I/O time.
Also, regarding readability, a professional programmer should be able to read and understand computer-science terms. For example, we can name a method enqueue rather than having to say putThisJobInWorkQueue.
The bitshift versus the multiplication is a trivial optimization that gains next to nothing. And, as has been pointed out, your compiler should do that for you. Other than that, the gain is negligible anyhow, whatever CPU the instruction runs on.
On the other hand, if you need to perform serious computation, you will require the right data structures. But if your problem is complex, finding out about that is part of the solution. As an illustration, consider searching for an ID number in an array of 1,000,000 unsorted objects. Then reconsider using a binary tree or a hash map.
But optimizations like n << C are usually negligible and trivial to change to at any point. Making code readable is not.
It depends on the task that needs to be solved. Usually readability is more important, but there are still some tasks where you should think of performance in the first place. And you can't always just spend a day or two on profiling and optimization after everything works perfectly, because the optimization itself may require rewriting a significant part of the code from scratch. But that is not common nowadays.
I'd say go for readability.
But in the given example, I think that the second version is already readable enough, since the name of the function states exactly what is going on inside it.
If we just always had functions that told us what they do ...
You should always maximally optimize; performance always counts. The reason we have bloatware today is that most programmers don't want to do the work of optimization.
Having said that, you can always put comments in where slick coding needs clarification.
There is no point in optimizing if you don't know your bottlenecks. You may have made a function incredibly efficient (usually at the expense of readability to some degree) only to find that that portion of code hardly ever runs, or that it's spending more time hitting the disk or database than you'll ever save by twiddling bits.
So you can't micro-optimize until you have something to measure, and then you might as well start off for readability.
However, you should be mindful of both speed and understandability when designing the overall architecture, as both can have a massive impact and be difficult to change (depending on coding style and methodologies).
It is estimated that about 70% of the cost of software is in maintenance. Readability makes a system easier to maintain and therefore brings down cost of the software over its life.
There are cases where performance is more important than readability; that said, they are few and far between.
Before sacrificing readability, think: "Am I (or is my company) prepared to deal with the extra cost I am adding to the system by doing this?"
I don't work at google so I'd go for the evil option. (optimization)
In Chapter 6 of Jon Bentley's "Programming Pearls", he describes how one system had a 400-fold speedup by optimizing at 6 different design levels. I believe that, by not caring about performance at these 6 design levels, modern implementers can easily achieve 2-3 orders of magnitude of slowdown in their programs.
Readability first. But even more than readability is simplicity, especially in terms of data structure.
I'm reminded of a student doing a vision analysis program, who couldn't understand why it was so slow. He merely followed good programming practice - each pixel was an object, and it worked by sending messages to its neighbors...
check this out
Write for readability first, but expect the readers to be programmers. Any programmer worth his or her salt should know the difference between a multiply and a bitshift, or be able to read the ternary operator where it is used appropriately, be able to look up and understand a complex algorithm (you are commenting your code right?), etc.
Early over-optimization is, of course, quite bad at getting you into trouble later on when you need to refactor, but that doesn't really apply to the optimization of individual methods, code blocks, or statements.
How much does an hour of processor time cost?
How much does an hour of programmer time cost?
IMHO the two have nothing to do with each other. You should first go for code that works, as this is more important than performance or how well it reads. Regarding readability: your code should always be readable in any case.
However I fail to see why code can't be readable and offer good performance at the same time. In your example, the second version is as readable as the first one to me. What is less readable about it? If a programmer doesn't know that shifting left is the same as multiplying by a power of two and shifting right is the same as dividing by a power of two... well, then you have much more basic problems than general readability.
