Understanding the scope of refactoring - refactoring

I've always thought of code refactoring as only improving implementation details. I want to make sure I have the appropriate understanding of the scope to which refactoring applies (Wikipedia didn't help me much to understand this).
For example, I hear people talking about "refactoring their designs", which seems paradox. When you modify the design of something (rename a public method in a class, remove a method, or other similar changes) this (to me) is called "redesign", not refactoring.
What is the common accepted scope of refactoring? Is it really possible to refactor designs? Knowing this will really help communication with co-workers when I'm trying to describe work I'm doing as refactoring or not.

Refactoring can be applied to a design as well as well as code, when you work with the existing design of the system to make improvements to it, either to improve readability, clarity, or make it easier to add new features in the future.
Regardless of if you are applying refactoring to design or code, the idea is the same. You maintain the existing functionality (you don't add or remove capabilities of the system) yet produce something that's easier to work with in the end.

I think the scope of refactoring is dependent on the goal of the refactor.
I always consider refactoring as changing the code without changing the end result of what it does to those that use it (people, other methods, etc). I think the scope depends on what problem you are trying to solve. If you are just trying to speed one thing up then you may refactor a single method. However if you've accumulated a lot of "technical debt" and need to restructure your program to make it more extensible going forward, easier to read, easier to test, etc. then you may indeed refactor the entire design.
In fact, I just did a huge refactor of my design recently to make it easier to add additional features.

Related

What is refactoring?

I hear the word refactoring everywhere. Any programming tool has some blah-blah about how it helps refactoring, every programmer or a manager will tell me something about refactoring. But to me it still sounds like a magic word without any meaning. It seems that refactoring is just editing your code or what?
Wikipedia quote
Code refactoring is a "disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior",[1] undertaken in order to improve some of the nonfunctional attributes of the software. Advantages include improved code readability and reduced complexity to improve the maintainability of the source code, as well as a more expressive internal architecture or object model to improve extensibility.
WHAT? Does every(any)body understand this? Are all those people who talk to me about refactoring, really do mean this?
And why is the name? What's "factoring" then?
Refactoring is modifying existing code to improve its readability, re-usability, performance, extensibility and maintainability. Have you ever looked at code and thought, "Wow this is a mess" or "this could be done better"? When you start to clean up the code and improve different aspects of it, this is considered refactoring. Many times code will often repeat itself, requiring you to create abstractions to adhere to the DRY principle, another demonstration of refactoring. During most refactoring it is important to not break anything, which can be assured by using good unit tests.
Sometimes its best just to get some working code established that solves a particular problem. Think of this as a rough draft, it just gets the basic ideas established and allows you to think about the problem at hand. After the rough draft is finished, you return to the code and edit it, making improvements that leads to a final copy (refactoring). You may eventually receive further requirements that require further code modifications. At this point the cycle repeats. Get the initial ideas down in code, then revisit the code and clean it up (refactor it).
One of the main premises behind refactoring is that code can always be improved. When you make these improvements its refactoring.
Refactoring is all about making your code more maintainable.
Practically, the requirements of the software are changing continuously that leads to continuous changes in the software. Over a period of time, the software starts becoming complex. As correctly illustrated by Lehman in his excellent work on software evolution that "as a system evolves, its complexity increases unless work is done to maintain or reduce it". Hence, we have to reduce the complexity of the software periodically and that's why refactoring is required.
Let us consider an example:
Insufficient Modularization (or God class) design smell occurs when a class is too huge and/or complex. If you have such a class in your design then you will face multiple problems (such as understandability is poor - you will find difficult to understand your code, reliability issues - since the class is complex you may change the code incorrectly or change in one aspect leads to bug in another aspect). Therefore, it is better to refactor the class using techniques such as "Extract-class" refactoring.
(More information about "Insufficient Modularization" can be found in the book "Refactoring for software design smells")
From Wiktionary: http://en.wiktionary.org/wiki/refactor
(computing) to rewrite existing source code in order to improve its readability, reusability or structure without affecting its meaning or behaviour
"Refactor" is also the name of a menu of tools in Eclipse.
I will explain the "Rename" Eclipse tool as an example of the tools under the Eclipse "Refactor" menu.
In Eclipse, you can "refactor -> rename" your variables by highlighting the variable, right-clicking and going to "refactor" and "rename":
Instead of renaming each one by hand, it detects which variables refer to the same object, so you can change all of the applicable variables' names at once:
It's really convenient to prevent bugs that would happen when you forget to rename a variable.
“Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written.” - Martin Fowler (Father of Code Smell).
An example of a refactoring could be extract method (Figure 1): If a method is too long, it should be decomposed, using this refactoring technique. Find a clump of code (within the long method) that goes well together, create a new method with a descriptive name and move the code into the new method. If local variables are being used, these need to be passed as parameters. The last step is to add a invoke to the new method and test the code.
Although refactoring does not add features or functionalities in a software system, it is sharp weapon for developers in their maintenance activities. It makes a software system easier to understand and cheaper to modify without changing its observable behavior by changing its internal structure.
The purposes of refactoring according to M. Fowler are stated in the following:
Refactoring Improves the Design of Software
Refactoring Makes Software Easier to Understand
Refactoring Helps Finding Bugs
Refactoring Helps Programming Faster
Refactoring has been very clearing explained in the following link: Refactoring: Guru. But, for a quick overview, have created a diagram that would help in basic understanding -

When should you not refactor?

We all know that refactoring is good and I love it as much as the next guy, but do you have real cases where is better not to refactor ?
Something like time critical stuff or synchronization? Technical or human reasons are equally welcome. Real cases scenarios and experiences a plus.
Edit : from the answers thus far, it looks like the only reason not to refactor is money. My question is mostly relative to something like this: suppose you would like to perform "extract method", but if you add the additional function call, you will make the code slightly less faster and hinder a very strict synchronization. Just to give you an idea of what I mean.
Another reason I sometimes heard is that "others used to the current code layout will get annoyed by your changes". Of course, I doubt this is a good reason.
I'm a big fan of refactoring to keep code clean and maintainable. But you generally want to shy away from refactoring production modules that work fine and don't require change. However, when you do need to work on a module to fix bugs or introduce a new feature, some refactoring is usually worth it and won't cost much since you're already committed to doing a full set of tests and going through the release process. (Unit tests are very helpful, but are only part of the full test suite, as other posters noted.)
More significant refactorings may make it harder for others to find their way around the new code, and they may then react unfavorably to refactoring. To minimize this, bring other team members in on the process using an approach like pair programming.
Update (8/10): Another reason to not refactor is when you aren't approaching the existing code base with proper humility and respect. With these qualities you'll tend to be conservative and do only refactorings that really do make a difference. If you approach the code with too much arrogance, you may wind up just making changes instead of refactoring. Is that new method name really clearer, or did the old one have a name with a very specific meaning in your application domain? Did you really need to mechanically reformat that source file to your personal style, when the existing style met project guidelines? Again pair programming can help.
To reinforce the other answer (and touch on issues you mention): do not refactor a part of the code until it's well covered by all relevant kinds of testing. This doesn't mean "don't refactor it" -- the emphasis is on "add the necessary tests" (to do unit-tests properly may well require some refactoring, particularly the introduction of factory DPs and/or dependency injection DPs in code that's now solidly bolted to concrete dependencies).
Note that this does cover your second paragraph's issues: if a section of the code is time-critical it should be well covered by "load-tests" (which like the more usual kind, correctness-test, should cover both specific units [albeit performance-wise -- correctness-checking is other tests' business!-)] AND end-to-end operations -- the equivalent of unit tests and integration tests if one was talking about correctness rather than performance).
Multi-tasking code with subtle sync issues can be a nightmare as no test can really make you entirely confident about it -- no other refactoring (that might in any way affect any fragile sync that just appears to be working now) should be considered BEFORE one intended to make the synchronization much, MUCH more robust and sound (message-passing through guaranteed-threadsafe queues being BY FAR my favorite design pattern in this regard;-).
Hmmm - I disagree with the above (1st response). Given code with no tests, you may refactor it to to make it more testable.
You do not refactor code when you cannot test the resulting code in time to deliver it such that it is still valuable to the recipient.
You do not refactor code when your refactoring will not improve the quality of the code. Quality is not subjective, although at times, design may be.
You do not refactor code when there is no business justification for making an alteration.
There are probably more, but hopefully you get the idea...
As Martin Fowler writes, you shouldn't refactor if a deadline is near. That time in project is better suited to flush out bugs instead of improving design (refactoring). Do the refactoring omitted this time directly after the deadline is over.
Refactoring is not good in and of itself. Rather, its purpose is to improve code quality so that it can be maintained more cheaply and with less risk of adding defects. For actively developed code, the benefits of refactoring are worth the cost. For frozen code that there is no intention to do any further work on, refactoring yields no benefit.
Even for live code, refactoring has its own risks, which unit tests can minimize. It also has its own place in the development cycle, which is towards the front, where it's less disruptive. The best time and place for refactoring is just before you start to make major changes to some otherwise brittle code.
When it is not cost-effective. There's a guy at the place I work who loves refactoring. Making code perfect makes him very happy. He can check out a current project tree and go to town on it, moving functions and classes around and tightening things up so they look great, have better flow, and are more extensible in the future.
Unfortunately, it's not worth the money. If he spends a week refactoring some classes into more functional units that may be easier to work with in the future, that's a week's worth of salary lost to the company with no noticeable bottom-line improvements.
Code will never, ever be absolutely perfect. You learn to live with it, and keep your hands off something that could be done better, but perhaps isn't worth the time.
If the code seems very difficult to refactor without breaking, that's the most important code to refactor!
If there aren't any tests, write some as you refactor.
Honestly, the one case is where you are forbidden to touch some code by management/customer/SomeoneImportant, and when that happens I consider the project broken.
Here is my experience:
Don't refactor:
When you don't have test suite accompanying with the code you want to refactor. You might want to develop the test suit first instead.
When your manager doesn't really care about the maintainablity and extensibility of current code base, instead they care much about if they would be able to deliver the product on schedule, especially for the project with short and tight schedule.
If you stick to the principle that everything that you do should add value for the client/ business, then the times you should not refactor are the following:
Code that works and no new development is planned.
Code that is good enough / works and refactoring simply represents gold plating.
The cost of refactoring is higher than living with the existing code.
The cost of refactoring is higher that rewriting the code from scratch
Some of the other answsers say that you should not refactor code that does not have unit tests. If code needs refactoring, you should refactor it, you must however write tests first. If the code is written in a way that makes it difficult to test, it should be rewritten (in a perfect world).
When you've got other stuff to build. I always feel like refactoring an existing system when I'm supposed to be doing something else.
There's always a balance to be had between fixing or adding to code and refactoring. However, this balance is so far in favor of refactoring that I don't think I've ever been on a team that refactored too much. Chances are, if you think you're erring on the side of refactoring too much, you're right on the money.
Of course, the biggest determining factor is how close the deadline is. If a deadline is imminent, requirements come first.
Isn't the need to refactor code largely based on the propensity of people to cut and paste code rather than thinking the solution through, and doing the factoring in advance? In other words, whenever you feel the need to cut & paste some code, merely make that chunk of code a function, and document it.
I have had to maintain way too much code where people found it easier to cut and paste a whole function, only to make one or two trivial changes, which could easily have been parametrized. But like many other's experience, to try to refactor some of this code would have take a LOT of time and been very risky.
I have 4 projects wherein a 10K line collection of functions was merely copied and modified as needed. This is a horrid maintenance nightmare. Especially when the code has LOTS of problems, e.g. hard-wired endianness assumptions, tons of global variables, etc. I feel bile in my throat just thinking about it.
Don't refactor if you don't have the time to test the refactored code before release. Refactoring can introduce bugs. If you have well-tested and relatively bug-free code, why take the risk? Wait until the next development cycle.
If you're stuck maintaining an old flakey code base with no future beyond keeping it running until management can bite the bullet and do a rewrite then refactoring is a lose-lose situation. First the developer loses because refactoing bad flakey code is a nightmare and secondly the business loses because as the developer attempts to refactor the software breaks in unexpected and unforseen ways.
When you don't really know what the code is doing in the first place. And yes, I have seen people ignore that rule.
It's just a cost-benefit tradeoff. Estimate the cost to refactor, estimate the benefits, determine if you actually have the time to refactor given other tasks, determine if refactoring is the best time-benefit tradeoff. There may be other tasks more worth doing.

"refactor refactor refactor your code." What does this mean exactly and why do it?

I often heard from professionals blog something like refactoring your code whenever the chance you get. What is it exactly? Rewriting your code in simpler and fewer lines? What is the purpose of doing this?
Refactoring code is a process of cleaning up your code, reducing the clutter and improving the readability without causing any side effects or changes to features.
Basically, you refactor by applying a series of code change rules that improve code readability and re-usability, without affecting the logic.
Always unit test before and after refactoring to ensure your logic isn't affected.
This Wikipedia article will give you an idea of the types of things included in the general concept of Refactoring.
The idea is adapt / evolve your code as you go. Simple things may be to rename variables or method parameters, but others may be to pass an additional parameter or to drop one, or to change its type. The data model may evolve as well. etc.
Often refactoring, works hand-in-hand with unit-testing, whereby the risk of "breaking something" is offset by the fact that such an issue may likely be discovered by the automatic testing (provide a good coverage and relevant test cases...).
In a nutshell, the ability to refactor (and btw, most IDE or add-ons to the IDEs, offer various tools that make refactoring easier and less error prone) allows one to write more quickly without stressing about some decisions ("should this object include an array or a list etc...) letting the programmer change some of these decisions as times goes, and with the added insight offered by having a workable, if not perfect solution. See a related concept: agile development.
Beware, refactoring doesn't give you license to start coding without putting any thought in design, in the object model, the APIs etc., however it lessens the stiffness of some of these decisions.
Martin Fowler has probably done the most to popularize refactoring, but I think good developers have always done these sorts of restructurings. Check out Fowler'srefactoring web site, and his 1999 Refactoring, which is an excellent introduction and catalog of specific refactorings using Java.
And I see he's a co-author of the brand new Refactoring, Ruby Edition, which should be a great resource.
I find that regularly cleaning up your code like this makes it a lot clearer and more maintainable.
To take one example, I wrote a small (Java 1.6) client library for accessing remote web services (using the REST architectural style). The bulk of this library is in one source file, and about half of that deals with the web services, while the other half is a simple in-memory cache of the responses (for performance). Over time both halves have grown in functionality, to the point where the source file was getting too complex. So today I used Fowler's "Extract Class" refactoring to move the cache logic into a new class. Before that I had to do some "Extract Methods" to isolate the caching logic. Along the way I did a few "Rename Methods" and an "Introduce Explaining Variable".
As other folks have noted, it's very important to have a good set of unit tests to apply after you make each change. They help ensure that you're not introducing new bugs, among other good things.
In a nutshell, refactoring means improving the design and/or implementation of software, usually without changing its behavior. This is normally done to make the code easier to understand and work with going forward, thereby making future development faster and less bug-prone.
Refactoring is a long-term investment in your code - since it doesn't affect the outward "appearance" of the software, there is very often pressure (from management, etc.) to "just get it working and move on to the next thing." While this may sometimes be the right decision, depending on business drivers, a codebase that undergoes change but never gets refactored will decay into a difficult, buggy mess (See also Technical Debt).
Specifically, the top reasons to refactor are usually the following:
Getting rid of duplicated code
Breaking up a long method into smaller pieces by extracting new methods from sections of the longer method
Breaking up a class that has too many responsibilities into smaller, more targeted classes or subclasses
Moving methods from one class to another. Often this is done so the methods reside in the same class as the data they operate on.
In the simplest terms, refactoring code is optimizing code. The criteria for what is "better" code is open to much interpretation as there are various coding styles and patterns out there. A central idea with refactoring is the question of, "Could this code be made better?" A few examples of that criteria can include scalability, maintainability, readablity, performance, size of executable, or minimizing memory used in executing the code.
"Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." -- MartinFowler in RefactoringImprovingTheDesignOfExistingCode
see this WhatIsRefactoring for more explanation.
Refactoring code generally means taking code that has been patched multiple times and re-writing it so that the needs of the later patches are taken into account.

Why do we refactor? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I would like to know the reasons that we do refactoring and justify it. I read a lot of people upset over the idea of refactoring. Refactoring was variously described as:
A result of insufficient upfront
design.
Undisciplined hacking
A dangerous activity that needlessly risked destabilizing
working code
A waste of resources.
What are the responsible reasons that lead us to refactor our code?
I also found a similar question here how-often-should-you-refactor, it doesn't provide the reason for refactoring.
Why do we refactor?
Because there's no actual substitute for writing code. No amount of upfront planning or experience can substitute actual code writing. This is what an entire generation (called waterfall) learned the hard way.
Once you start writing the code and be in the middle of it, you reason about the way it works on a lower level you do notice things (performance, usability or correctness things) that escaped the higher design view.
Refactoring is perfecting.
Ask yourself: why do painters do multiple strokes with the brush on the same spot?
Refactoring is the way to pay the technical debt.
I'd like to briefly address three of your points.
1. "A result of insufficient up-front design"
Common sense (and several books and bloggers) tell us we should strive for the simplest, cleanest design possible to address a given problem. While it's quite possible that some code is written without sufficient work on developing an understanding of the requirements and the problem domain, it's probably more common that "poor code" wasn't "poor" when it was written; rather, it is no longer sufficient.
Requirements change, and designs have to support additional features and capabilities. It's not unreasonable to anticipate some future changes up-front, but McConnell et al. rightly caution against high-level, overly-flexible designs when there's no clear and present need for such an approach.
3. "A dangerous activity that needlessly risks destabilising working code"
Well, yes, if done improperly. Before you seek to make any significant modification to a working system, you should put in place proper measures to ensure that you're not causing any harm - a sort of "developmental Hippocratic oath", almost.
Typically, this will be done by a mixture of documentation and testing, and more often than not, the code wins out, because it's the most up-to-date description of the actual behaviour. In practical terms, this translates into having decent coverage with a unit test suite, so that if refactoring does introduce unexpected problems, these are identified and resolved.
Obviously, when you seek to refactor, you're going to break a certain number of tests, not least because you're trying to fix some broken code contracts. It is, however, perfectly possible to refactor with impunity, provided you have that mechanism in place to spot the accidental mistakes.
4. "A waste of resources"
Others have mentioned the concept of technical debt, which is, briefly, the idea that over time, the complexity of such systems builds up, and that some of that build-up has to be reduced, by refactoring and other techniques, in order to reasonably facilitate future development. In other words, sometimes you have to bite the bullet and go ahead with that change you've been putting off, because otherwise you'll be making a bad situation appallingly worse when you come to add something new in that area.
Obviously, there's a time and a place to pay off such things; you wouldn't try and repay a loan until you had the cash to do it, and you can't afford to go around refactoring willy nilly during a critical stage in development. Nevertheless, by making the decision to address some of the problems in your code base, you save future development time, and thus money, and maybe even further into the future, avoid the cost of having to abandon or completely rewrite some component that is beyond your understanding.
In order to keep a maintainable code base?
Code is more read than written, so it is necessary to have a code-base that is readable, understandable and maintainable. When you see something that is poorly written or designed, it can be refactored to improve the design of the code.
You clean your house also regularly, don't you? Although it may be considered a waste of time, it is necessary in order to keep your house clean, so that you have a nice environment to live in.
You may need to refactor if your code is
Inefficient
Buggy
Hard to extend
Hard to maintain
It all boils down to the original code not being very good, so you improve it.
If you have reasonable unit tests it shouldn't be dangerous at all.
Because hindsight is easier than foresight.
Software is one of the most complex things created by humans, so it is not easy to consider everything beforehand. For large projects it can even be impossible for the team (at least for one consisting of humans ;) ) to consider everything before they actually start developing it.
Another reason is that software isn't constructed, it's growing. That means software can and has to adapt to ever changing requirements and environments.
As Martin Fowler says, the only thing surprising about the requirements for software changing is that anyone is surprised by it.
The requirements will change, new features will be requested. This is a good thing. Enhancement efforts succeed most of the time, and when they fail, they fail small, so there is budget to do more. Big up front design projects fail often (one statistics puts the failure rate at 66%), so avoid them. The way to avoid them is to design enough for the first version, and as enhancements are added, refactor to the point where it looks like the system intended to do that in the first place. The lifespan of a project that can do this (there are issues when you publish data formats or APIs - once you go live you can't always be pristine anymore) is indefinite.
In response to the four points, I would say that a process that shuns refactoring demands:
A static world where nothing changes
so that the upfront design can hit a
non-moving target perfectly.
Will
result in ugly hacks to work around
design flaws that aren't being
refactored.
Will lead to dangerous
code duplication as the fear of
changing existing code sets in.
Will
waste resources over engineering the
problem and building large design
artifacts in anticipation of
requirements that never end up
getting built, causing large amounts
of code and complication to drag the
project down while not providing any
value.
One caveat, though. If you don't have the proper support, in an automated tool for simple cases, and thorough unit tests in the more complicated cases, it will hurt, there will be new bugs introduced, and you will develop a (quite rational) fear of doing it more. Refactoring is a great tool, but it requires safety equipment.
Another scenario where you need refactoring is TDD. The textbook approach for TDD is to write only the code you need to pass the test and then refactor it to something nicer afterwards.
...because coding is like gardening. Your codebase grows and you domain changes as time passes. What was a good idea back then often looks like a poor design now and what is a good design now may well not be optimal in the future.
Code should never be considered a permanent artifact nor should it be considered too sacred to touch. Confidence should be garnered through testing and refactoring is a mechanism to facilitate change.
While a lot of other people have already said perfectly valid reasons, here's mine:
Because it's fun. It's like beating your own time in steeplechase, having the stronger bicep in armwrestling or improving your highscore in a game of your choice.
A straightforward answer is, requirements change. No matter how elegant your design is, some requirements later on will not buy it.
Poor understanding of the requirements:
If developers don't have a clear understanding of the requirements, the resulting design and code cannot satisfy the customer. Later as the requirements become more clear, refactor becomes essential.
Supporting new requirements.
If a component is old, in most of the cases it will not be able handle the radical new requirements. It then becomes essential to go for refactoring.
Lots of bugs in the existing code.
If you have spent long hours in office fixing quite a few nasty bugs in a particular component, it becomes a natural choice for refactoring at the earliest.
Upfront: Refactoring does not need to be dangerous when a) supported by tools and b) you have a testsuite that you can run after the refactoring in order to check the functioning of your software.
One of the main reasons for refactoring is that at some point you find out that code is used by more than one code path and you don't want to duplicate (copy&paste) but reuse. This is especially important in cases where you find an error in that code. If you have refactored the duplicated code into an own method, you can fix that method and be done. If you copy&paste code around, there is a high chance that you don't fix all places where this code occurs (just think of projects with several members and thousands of lines of code).
You should of course not do refactoring just because of the sake of refactoring - then it is really a waste of resources.
For whatever reason, when I create or find a function that scrolls off the screen, I know it's time to sit back and consider whether it should be refactored or not - if I'm having to scroll the whole page to take in the function as a whole, chances are it's not a shining example of readability or maintainability.
To make insane stuff sane.
I mainly refactor when the code has suffered so much under copy + paste and a lack of architectural guideance that the action of understanding the code is akin to re-organising it and removing the duplication.
It is human to err, and you're ALWAYS going to make mistakes when you develop software. Creating a good design from the beginning helps a lot, and having skilled programmers on the team is also a good thing, but they will invariably make mistakes, and there will be code that is hard to read, tightly coupled or non-functional, etc. Refactoring is a tool to mend these flaws when they've already occurred. You should never stop working on preventing these things from happening to begin with, but when they do happen, you can fix them.
Refactoring to me is like cleaning my desk; it creates a better working environment because over time it will get messy.
I refactor because, without refactoring, it becomes harder and harder to add new features to a codebase over time. If I have features A, B, and C to add, feature C will be finished sooner, with less pain and suffering on my part, if I take time to refactor after features A and B. I'm happier, my boss is happier, and our customers are happier.
I think it's worth restating, in any conversation involving refactoring, that refactoring is verifiably behavior-preserving. If at the end of your "refactoring" your program has different outputs, or if you only think, but can't prove, that it has the same outputs, then what you've done isn't refactoring. (That doesn't mean it's worthless or not worth doing -- maybe it's an improvement. But it's not refactoring and shouldn't be confused with it.)
Refactoring is a central component in any agile software development methods.
Unless you fully understand all the requirements and technical limitations of your project you can't have a complete upfront design. In this case instead of using a traditional waterfall approach you're probably better off with an agile method - agile methods focus on adapting quickly to changing realities. And how would you adapt your source code without refactoring?
I've found code design and implementation, particularly with unfamiliar and large projects to be a learning process.
The scope and requirements of a project change over time, which has consequences on the design. It may be that after spending some time implementing your product you discover that your planned design is not optimal. Perhaps new requirements were added by the client. Or perhaps you're adding additional functionality to an older product and you need to refactor the code in order to sufficiently provide this functionality.
In my experience code has been written poorly and the refactoring has become necessary to prevent the product from failing and to ensure it is maintainable/extendable.
I believe an iterative design process, with prototyping early on is a good way to minimise refactoring later on. This also allows you to experiment with differing implementations to determine which is most suitable.
Not only that, but new ideas and methods for what you're doing may become available. Why stick with old, fallible code that could become problematic if it can be improved?
In short, projects will change overtime, which necessitates changes in structure to ensure it meets new requirements.
From my own personal experience I refactor because I find if I make software the way I want it made from first go that it takes a very long time to create something.
Therefore I value the pragmatism of developing software over clean code. Once I have something running I then begin to refactor it into the way it should be. Needless to say, the code never devolves into a piece of unreadable tripe.
Just a side note - I did my degree in software engineering after reading some material from Steve Mcconnell as a teen. I love design patterns, good code reuse, nicely thought out designs and so on. But I find when working on my own projects that designing things initially from that point of view just doesnt work unless I'm an absolute expert with the technology I'm using (Which is never the case)
Refactoring is done to help make code easier to understand/document.
To give a method a better name - perhaps the previous wasnt clear or incorrect.
To give variables more descriptive / better names.
Break up a really long method into many smaller methods representing the steps involved in solving the problem.
Move classes to a new package(namespace) to assist organisation.
Reduce duplicate code.
Does point number one even matter? If you're refactoring, the up-front design was obviously flawed. Don't waste time worrying about the flaws in the original design; it's old news. What matters is what you have now, so spend that time refactoring.
I refactor because proper refactoring makes maintenance SO much easier. I've had to maintain a TON of bad, awful code and I don't want to hand down any that I've written for someone else to maintain.
Maintenance costs of smelly code will almost always be higher than maintenance costs for sweet smelling code.
I refactor because:
Often my code is far from optimal first time around.
Hindsight is often 20-20.
My code will be easier to maintain for the next guy.
I have professional pride in the work I leave behind.
I believe time spent now can save a lot more time (and money) further down the track.
All your points are common descriptors of why people do refactor. I would say that the reason people should refactor lies within point #1: A Big Design Up Front (BDUF) is almost always imperfect. You learn about the system as you build it. In trying to anticipate what could happen you often end up building complex solutions to deal with things that never actually happen. (YAGNI - You ain't gonna need it).
Instead of the BDUF approach, a better solution is therefore to design the parts of the system you know you are going to need. Follow the principles of single responsibility principle, use inversion of control/dependency injection so that you can replace parts of your system when needed.
Write tests for your components. And then, when the requirements for your system change or you discover flaws in your initial design, you can refactor and extend your code. Since you have your unit tests and integration tests in place, you will know if and when the refactoring breaks something.
There is a difference between large refactorings (restructuring modules, class hierarchies, interfaces) and "unit" refactorings - within methods and classes.
Whenever I touch a piece of code I do a unit refactoring - renaming variables, extracting methods; because actually seeing the code in front of me gives me more information to make it better. Sometimes refactoring also helps me to better understand what the code is doing. It's like writing or painting, you extract a fuzzy idea out of your head; put a rough skeleton onto paper; then into code. You then refine the rough idea in the code.
With modern refactoring tools like ReSharper in C#, this kind of unit refactoring is extremely easy, quick & low risk.
Large refactorings are harder, break more things, and require communication with your team members. It will become clear to everyone when these need to happen - because requirements have changed so much that the original design no longer works - and then they should be planned like a new feature.
My last rule - only refactor code that you are actually working on. If code's functionality doesn't need to be changed, then it's good enough & doesn't need further work.
Avoid refactoring just for refactoring's sake; that's just refactorbating!

How do you decide which parts of the code shall be consolidated/refactored next?

Do you use any metrics to make a decision which parts of the code (classes, modules, libraries) shall be consolidated or refactored next?
I don't use any metrics which can be calculated automatically.
I use code smells and similar heuristics to detect bad code, and then I'll fix it as soon as I have noticed it. I don't have any checklist for looking problems - mostly it's a gut feeling that "this code looks messy" and then reasoning that why it is messy and figuring out a solution. Simple refactorings like giving a more descriptive name to a variable or extracting a method take only a few seconds. More intensive refactorings, such as extracting a class, might take up to a an hour or two (in which case I might leave a TODO comment and refactor it later).
One important heuristic that I use is Single Responsibility Principle. It makes the classes nicely cohesive. In some cases I use the size of the class in lines of code as a heuristic for looking more carefully, whether a class has multiple responsibilities. In my current project I've noticed that when writing Java, most of the classes will be less than 100 lines long, and often when the size approaches 200 lines, the class does many unrelated things and it is possible to split it up, so as to get more focused cohesive classes.
Each time I need to add new functionality I search for already existing code that does something similar. Once I find such code I think of refactoring it to solve both the original task and the new one. Surely I don't decide to refactor each time - most often I reuse the code as it is.
I generally only refactor "on-demand", i.e. if I see a concrete, immediate problem with the code.
Often when I need to implement a new feature or fix a bug, I find that the current structure of the code makes this difficult, such as:
too many places to change because of copy&paste
unsuitable data structures
things hardcoded that need to change
methods/classes too big to understand
Then I will refactor.
I sometimes see code that seems problematic and which I'd like to change, but I resist the urge if the area is not currently being worked on.
I see refactoring as a balance between future-proofing the code, and doing things which do not really generate any immediate value. Therefore I would not normally refactor unless I see a concrete need.
I'd like to hear about experiences from people who refactor as a matter of routine. How do you stop yourself from polishing so much you lose time for important features?
We use Cyclomatic_complexity to identify the code that needs to be refactored next.
I use Source Monitor and routinely refactor methods when the complexity metric goes aboove around 8.0.

Resources