
What are the advantages and disadvantages of refactoring code smell in software quality?

Although refactoring does not add features or functionality to a software system, it is a sharp weapon for developers in their maintenance activities. By changing a system's internal structure, refactoring makes the system easier to understand and cheaper to modify without changing its observable behavior.
The purposes of refactoring according to Martin Fowler (author of Refactoring) are as follows:
Refactoring Improves the Design of Software
Refactoring Makes Software Easier to Understand
Refactoring Helps You Find Bugs
Refactoring Helps You Program Faster
Especially for long-lived software, it is essential to refactor the code in order to keep the software adaptable. However, you should definitely not perform refactoring tasks if they exceed your budget and time. In essence, stop refactoring when you:
Run out of money
Run out of time

Advantages:
1. Refactoring is a really good weapon for maintaining the code
2. It's an interesting thing to do, whether as part of the current task or as a separate task
3. It makes the code clean and organized
4. It helps you follow principles like SOLID, GRASP, etc.
Disadvantages:
1. It's risky when the application is big
2. It's risky when the existing code doesn't have proper test cases
3. It's risky when developers don't understand what the code is all about


What is refactoring?

I hear the word refactoring everywhere. Any programming tool has some blah-blah about how it helps refactoring, and every programmer or manager will tell me something about refactoring. But to me it still sounds like a magic word without any meaning. Is refactoring just editing your code, or what?
Wikipedia quote
Code refactoring is a "disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior",[1] undertaken in order to improve some of the nonfunctional attributes of the software. Advantages include improved code readability and reduced complexity to improve the maintainability of the source code, as well as a more expressive internal architecture or object model to improve extensibility.
WHAT? Does every(any)body understand this? Do all those people who talk to me about refactoring really mean this?
And why that name? What's the "factoring", then?
Refactoring is modifying existing code to improve its readability, reusability, performance, extensibility and maintainability. Have you ever looked at code and thought, "Wow, this is a mess" or "this could be done better"? When you start to clean up the code and improve different aspects of it, that is refactoring. Code will often repeat itself, requiring you to create abstractions to adhere to the DRY principle; removing that duplication is another form of refactoring. During most refactoring it is important not to break anything, which can be assured by good unit tests.
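For instance, here is a minimal before/after sketch in Java; the tax-calculation methods are made up purely for illustration:

```java
// Hypothetical example: two copy-pasted calculations that differ only in
// the tax rate, consolidated into one parametrized method (DRY).
class PriceCalculator {

    // Before: near-identical methods, a classic copy-paste smell.
    double priceWithStandardTax(double price) {
        double taxed = price * 1.08;
        return Math.round(taxed * 100.0) / 100.0;
    }

    double priceWithReducedTax(double price) {
        double taxed = price * 1.05;
        return Math.round(taxed * 100.0) / 100.0;
    }

    // After: the shared logic is extracted; the varying part is a parameter.
    double priceWithTax(double price, double rate) {
        double taxed = price * (1.0 + rate);
        return Math.round(taxed * 100.0) / 100.0; // round to cents
    }
}
```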
Sometimes it's best just to get some working code established that solves a particular problem. Think of this as a rough draft: it gets the basic ideas down and lets you think about the problem at hand. After the rough draft is finished, you return to the code and edit it, making improvements that lead to a final copy (refactoring). You may eventually receive further requirements that require further code modifications, and at that point the cycle repeats: get the initial ideas down in code, then revisit the code and clean it up (refactor it).
One of the main premises behind refactoring is that code can always be improved. When you make these improvements, it's refactoring.
Refactoring is all about making your code more maintainable.
Practically speaking, the requirements of software change continuously, which leads to continuous changes in the software. Over a period of time, the software becomes more complex. As Lehman correctly observed in his work on software evolution, "as a system evolves, its complexity increases unless work is done to maintain or reduce it". Hence, we have to reduce the complexity of the software periodically, and that's why refactoring is required.
Let us consider an example:
The Insufficient Modularization (or God class) design smell occurs when a class is too large and/or complex. If you have such a class in your design, you will face multiple problems: understandability suffers (you will find it difficult to understand your code) and reliability suffers (since the class is complex, you may change the code incorrectly, or a change in one aspect may introduce a bug in another). Therefore, it is better to refactor the class using techniques such as Extract Class.
(More information about "Insufficient Modularization" can be found in the book "Refactoring for software design smells")
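A minimal Java sketch of what Extract Class looks like; the names are made up, and most of the God class is elided:

```java
// Before: a class accumulating unrelated address responsibilities
// (a real God class would have dozens more fields and methods).
class OrderBefore {
    String street, city, zip;
    String shippingLabel() { return street + "\n" + zip + " " + city; }
    // ... plus all the actual order-processing logic ...
}

// After Extract Class: the address concerns live in their own class,
// and Order simply delegates to it.
class Address {
    final String street, city, zip;
    Address(String street, String city, String zip) {
        this.street = street; this.city = city; this.zip = zip;
    }
    String label() { return street + "\n" + zip + " " + city; }
}

class Order {
    Address shippingAddress;
    String shippingLabel() { return shippingAddress.label(); }
    // ... order-processing logic, now free of address details ...
}
```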
From Wiktionary: http://en.wiktionary.org/wiki/refactor
(computing) to rewrite existing source code in order to improve its readability, reusability or structure without affecting its meaning or behaviour
"Refactor" is also the name of a menu of tools in Eclipse.
As an example of the tools under the Eclipse "Refactor" menu, consider "Rename".
In Eclipse, you can rename a variable by highlighting it, right-clicking, and choosing "Refactor" -> "Rename".
Instead of renaming each occurrence by hand, Eclipse detects which references refer to the same variable, so all of the applicable occurrences are renamed at once.
This is really convenient for preventing the bugs that happen when you forget to rename one occurrence.
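A small illustrative sketch (hypothetical names) of why a scope-aware rename matters:

```java
// The field and the local variable both happen to be named "count", but
// they are different variables. A plain text search-and-replace would
// rename both; Refactor -> Rename resolves references, so renaming the
// field to "stockCount" leaves the local variable alone.
class Inventory {
    int count; // field: items in stock

    int countDefects(int[] flags) {
        int count = 0; // local: shadows the field, entirely unrelated
        for (int f : flags) {
            if (f != 0) count++;
        }
        return count;
    }
}
```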
“Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written.” - Martin Fowler (author of Refactoring).
An example of a refactoring is Extract Method: if a method is too long, it should be decomposed using this technique. Find a clump of code within the long method that goes well together, create a new method with a descriptive name, and move the code into the new method. If local variables are being used, they need to be passed as parameters. The last step is to add a call to the new method and test the code.
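A minimal before/after sketch of those steps in Java, with hypothetical names:

```java
// The summing "clump" is moved into its own method, the local data it
// needs ("amounts") becomes a parameter, and the original method now
// just invokes it.
class InvoicePrinter {

    // Before: one long method doing two jobs.
    void printInvoice(String customer, double[] amounts) {
        System.out.println("Invoice for " + customer);
        double total = 0;
        for (double a : amounts) total += a;
        System.out.println("Total: " + total);
    }

    // After Extract Method:
    void printInvoiceRefactored(String customer, double[] amounts) {
        System.out.println("Invoice for " + customer);
        System.out.println("Total: " + sum(amounts));
    }

    double sum(double[] amounts) {
        double total = 0;
        for (double a : amounts) total += a;
        return total;
    }
}
```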
Refactoring is also very clearly explained at the following link: Refactoring Guru.

Can TDD Handle Complex Projects without an upfront design?

The idea of TDD is great, but I'm trying to wrap my head around how to implement a complex system if a design is not proposed upfront.
For example, let's say I have multiple services for a payment processing application. I'm not sure I understand how development would/can proceed across multiple developers if there is not a somewhat solid design upfront.
It would be great if someone can provide an example and high level steps to putting together a system in this manner. I can see how TDD can lead to simpler and more robust code; I'm just not sure how it can 1) bring together different developers to a common architectural vision and 2) result in a system that can abstract out behavior in order to prevent having to refactor large chunks of code (e.g. accept different payment methods or pricing models based on a long-term development roadmap).
I see the refactoring as a huge overhead in a production system where data model changes increase risks for customers and the company.
Clearly I'm probably missing something that TDD gurus have discovered...
IMHO, it depends on the team's composition and appetite for risk.
If the team consists of several experienced and good designers, you need a less formal 'architecture' phase. It could be just a back-of-the-napkin doodle or a couple of hours at the whiteboard, followed by furious coding to prove the idea. If the team is distributed and/or contains lots of less skilled designers, you'd need to put more time and effort (thinking and documenting) into the design phase before everyone goes off on their own path.
The next thing I can think of is to be risk-first. Continually assess the risks to your project, calculate your exposure/impact, and have mitigation plans. Focus on risky and difficult-to-reverse decisions first. If a decision is easily reversible, spend less time on it.
Skilled designers are able to evolve the architecture in tiny steps; if you have them, you can tone down the rigor of an explicit design phase.
TDD can necessitate some upfront design, but definitely not big design upfront, because no matter how perfect you think your design is before you start writing code, most of the time it won't pass the reality check TDD forces on it and will blow to pieces halfway through your TDD session (or your code will blow up if you absolutely want to bend it to your original plan).
The great force of TDD is precisely that it lets your design emerge and refine as you write tests and refactor. Therefore you should start small and simple, making the least assumptions possible about the details beforehand.
Practically, what you can do is sketch out a couple of UML diagrams with your pair (or the whole team if you really need a consensus on the big picture of what you're going to write) and use these diagrams as a starting point for your tests. But get rid of these models as soon as you've written your first few tests, because they would do more harm than good, misleading you to stick to a vision that is no longer true.
First of all, I don't claim to be a TDD guru, but here are some thoughts based on the information in your question.
My thoughts on #1 above: As you have mentioned, you need to have an architectural design up-front - I can't think of a methodology that can be successful without this. The architecture provides your team with the cohesion and vision. You may want to do just-enough-design up front, but that depends on how agile you want to be. The team of developers needs to know how they are going to put together the various components of the system before they start coding, otherwise it will just be one big hackfest.
It would be great if someone can provide an example and high level steps to putting together a system in this manner
If you are putting together a system that is composed of services, then I would start by defining the service interfaces and any messages that they will exchange. This defines how the various components of your system will interact (this would be an example of your up-front design). Once you have this, you can allocate various development resources to build the services in parallel.
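For illustration, here is a sketch of such an up-front contract in Java; all names are hypothetical, and the point is only that the interface and message types are agreed before implementation work is split up:

```java
// The service contract (interface plus message types) is fixed up front;
// each team can then build or consume an implementation in parallel.
interface PaymentService {
    PaymentResult charge(PaymentRequest request);
}

final class PaymentRequest {
    final String accountId;
    final long amountCents;
    PaymentRequest(String accountId, long amountCents) {
        this.accountId = accountId;
        this.amountCents = amountCents;
    }
}

final class PaymentResult {
    final boolean approved;
    final String reference;
    PaymentResult(boolean approved, String reference) {
        this.approved = approved;
        this.reference = reference;
    }
}
```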
As for #2: one of the advantages of TDD is that it gives you a "safety net" during refactoring. Since your code is covered by unit tests, when you come to change some code you will know pretty soon if you have broken something, especially if you are running continuous integration (which most people do with a TDD approach). In this case you either need to adapt your unit tests to cover the new behavior, or fix your code so that your unit tests pass.
result in a system that can abstract out behavior in order to prevent having to refactor large chunks of code
This is just down to your design, using e.g. a strategy pattern to allow you to abstract and replace behavior. TDD does not prescribe that your design has to suffer. It just asks that you only do what is required to satisfy some functional requirement. If the requirement is that the system must be able to adapt to new payment methods or pricing models, then that becomes part of your design. TDD, if done correctly, will make sure that you are satisfying your requirements and that your design is on the right lines.
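A minimal strategy-pattern sketch in Java (all names hypothetical):

```java
// Adding a new payment method means adding a new implementation of the
// interface, not editing Checkout or its callers.
interface PaymentStrategy {
    boolean pay(long amountCents);
}

class CardPayment implements PaymentStrategy {
    public boolean pay(long amountCents) {
        // would call the card gateway here
        return true;
    }
}

class InvoicePayment implements PaymentStrategy {
    public boolean pay(long amountCents) {
        // would issue an invoice here
        return true;
    }
}

class Checkout {
    private final PaymentStrategy strategy; // injected, hence replaceable
    Checkout(PaymentStrategy strategy) { this.strategy = strategy; }
    boolean settle(long amountCents) { return strategy.pay(amountCents); }
}
```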
I see the refactoring as a huge overhead in a production system where data model changes increase risks for customers and the company.
One of the problems of software design is that it is a wicked problem which means that refactoring is pretty much inevitable. Yes, refactoring is risky in production systems, but you can mitigate that risk and TDD will help you. You also need to have a supple design and a system with low coupling. TDD will help reduce your coupling since you are designing your code to be testable. And one of the by-products of writing testable code is that you reduce your dependencies on other parts of the system; you tend to code to interfaces which allows you to replace an implementation with a mock or stub. A good example of this is replacing a call to a database with a mock/stub that returns some known data - you don't want to hit a database in your unit tests. I guess I can mention here that a good mocking framework is invaluable with a TDD approach (Rhino mocks and Moq are both open source).
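For illustration, a hand-rolled stub in Java with hypothetical names; a mocking framework automates this kind of thing, but the idea is the same:

```java
interface CustomerRepository {
    String findName(int customerId); // in production, backed by a database
}

class StubCustomerRepository implements CustomerRepository {
    public String findName(int customerId) {
        return "Ada Lovelace"; // canned data: no database is touched
    }
}

class GreetingService {
    private final CustomerRepository repo; // coded to the interface
    GreetingService(CustomerRepository repo) { this.repo = repo; }
    String greet(int customerId) {
        return "Hello, " + repo.findName(customerId);
    }
}
```

In a unit test, new GreetingService(new StubCustomerRepository()) returns known data with no database in sight.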
I am sure there are some real TDD gurus out there who can give you some pearls of wisdom... Personally, I wouldn't consider starting a new project without a TDD approach.

When should you not refactor?

We all know that refactoring is good, and I love it as much as the next guy, but do you have real cases where it's better not to refactor?
Something like time-critical stuff or synchronization? Technical and human reasons are equally welcome. Real-world scenarios and experiences are a plus.
Edit: from the answers thus far, it looks like the only reason not to refactor is money. My question is mostly about cases like this: suppose you would like to perform "extract method", but adding the extra function call would make the code slightly slower and interfere with a very strict synchronization. Just to give you an idea of what I mean.
Another reason I have sometimes heard is that "others used to the current code layout will get annoyed by your changes". Of course, I doubt this is a good reason.
I'm a big fan of refactoring to keep code clean and maintainable. But you generally want to shy away from refactoring production modules that work fine and don't require change. However, when you do need to work on a module to fix bugs or introduce a new feature, some refactoring is usually worth it and won't cost much since you're already committed to doing a full set of tests and going through the release process. (Unit tests are very helpful, but are only part of the full test suite, as other posters noted.)
More significant refactorings may make it harder for others to find their way around the new code, and they may then react unfavorably to refactoring. To minimize this, bring other team members in on the process using an approach like pair programming.
Update (8/10): Another reason to not refactor is when you aren't approaching the existing code base with proper humility and respect. With these qualities you'll tend to be conservative and do only refactorings that really do make a difference. If you approach the code with too much arrogance, you may wind up just making changes instead of refactoring. Is that new method name really clearer, or did the old one have a name with a very specific meaning in your application domain? Did you really need to mechanically reformat that source file to your personal style, when the existing style met project guidelines? Again pair programming can help.
To reinforce the other answers (and touch on issues you mention): do not refactor a part of the code until it's well covered by all relevant kinds of testing. This doesn't mean "don't refactor it" -- the emphasis is on "add the necessary tests" (doing unit tests properly may itself require some refactoring, particularly the introduction of factory and/or dependency-injection design patterns in code that's currently bolted solidly to concrete dependencies).
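A minimal Java sketch of that kind of testability refactoring, with made-up names:

```java
// The concrete dependency is hidden behind an interface and injected
// through the constructor, so a test can substitute a fake.
interface Mailer {
    void deliver(String report);
}

class SmtpMailer implements Mailer {
    public void deliver(String report) { /* real SMTP delivery */ }
}

// Before: bolted to the concrete class, so tests can't substitute it.
class ReportSenderBefore {
    private final SmtpMailer mailer = new SmtpMailer();
    void send(String report) { mailer.deliver(report); }
}

// After: the dependency is injected; a test can pass any Mailer.
class ReportSender {
    private final Mailer mailer;
    ReportSender(Mailer mailer) { this.mailer = mailer; }
    void send(String report) { mailer.deliver(report); }
}
```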
Note that this does cover your second paragraph's issues: if a section of the code is time-critical, it should be well covered by load tests. Like the more usual correctness tests, these should cover both specific units (performance-wise, that is -- correctness checking is other tests' business) and end-to-end operations: the equivalent of unit tests and integration tests, but for performance rather than correctness.
Multithreaded code with subtle synchronization issues can be a nightmare, as no test can really make you entirely confident about it. No other refactoring (one that might in any way affect a fragile synchronization that just appears to be working now) should be considered before one intended to make the synchronization much, MUCH more robust and sound (message passing through guaranteed thread-safe queues being BY FAR my favorite design pattern in this regard).
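For illustration, a minimal Java sketch of that queue-based message-passing style, using java.util.concurrent.BlockingQueue:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Threads share no mutable state directly; they hand immutable messages
// to each other through a thread-safe queue.
public class QueueMessagingDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                queue.put("work-item-1"); // blocks if the queue is full
                queue.put("work-item-2");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        System.out.println(queue.take()); // blocks until an item arrives
        System.out.println(queue.take());
        producer.join();
    }
}
```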
Hmmm - I disagree with the first response. Given code with no tests, you may refactor it to make it more testable.
You do not refactor code when you cannot test the resulting code in time to deliver it such that it is still valuable to the recipient.
You do not refactor code when your refactoring will not improve the quality of the code. Quality is not subjective, although at times, design may be.
You do not refactor code when there is no business justification for making an alteration.
There are probably more, but hopefully you get the idea...
As Martin Fowler writes, you shouldn't refactor when a deadline is near. That time in the project is better spent flushing out bugs than improving the design (refactoring). Do the refactoring you omitted directly after the deadline has passed.
Refactoring is not good in and of itself. Rather, its purpose is to improve code quality so that it can be maintained more cheaply and with less risk of adding defects. For actively developed code, the benefits of refactoring are worth the cost. For frozen code that there is no intention to do any further work on, refactoring yields no benefit.
Even for live code, refactoring has its own risks, which unit tests can minimize. It also has its own place in the development cycle, which is towards the front, where it's less disruptive. The best time and place for refactoring is just before you start to make major changes to some otherwise brittle code.
When it is not cost-effective. There's a guy at the place I work who loves refactoring. Making code perfect makes him very happy. He can check out a current project tree and go to town on it, moving functions and classes around and tightening things up so they look great, have better flow, and are more extensible in the future.
Unfortunately, it's not worth the money. If he spends a week refactoring some classes into more functional units that may be easier to work with in the future, that's a week's worth of salary lost to the company with no noticeable bottom-line improvements.
Code will never, ever be absolutely perfect. You learn to live with it, and keep your hands off something that could be done better, but perhaps isn't worth the time.
If the code seems very difficult to refactor without breaking, that's the most important code to refactor!
If there aren't any tests, write some as you refactor.
Honestly, the one case is where you are forbidden to touch some code by management/customer/SomeoneImportant, and when that happens I consider the project broken.
Here is my experience:
Don't refactor:
When you don't have a test suite accompanying the code you want to refactor. You might want to develop the test suite first instead.
When your manager doesn't really care about the maintainability and extensibility of the current code base, and instead cares more about delivering the product on schedule, especially on projects with short, tight schedules.
If you stick to the principle that everything that you do should add value for the client/ business, then the times you should not refactor are the following:
Code that works and no new development is planned.
Code that is good enough / works and refactoring simply represents gold plating.
The cost of refactoring is higher than living with the existing code.
The cost of refactoring is higher than rewriting the code from scratch.
Some of the other answers say that you should not refactor code that does not have unit tests. If code needs refactoring, you should refactor it; you must, however, write tests first. If the code is written in a way that makes it difficult to test, it should be rewritten (in a perfect world).
When you've got other stuff to build. I always feel like refactoring an existing system when I'm supposed to be doing something else.
There's always a balance to be had between fixing or adding to code and refactoring. However, this balance is so far in favor of refactoring that I don't think I've ever been on a team that refactored too much. Chances are, if you think you're erring on the side of refactoring too much, you're right on the money.
Of course, the biggest determining factor is how close the deadline is. If a deadline is imminent, requirements come first.
Isn't the need to refactor code largely based on the propensity of people to cut and paste code rather than thinking the solution through, and doing the factoring in advance? In other words, whenever you feel the need to cut & paste some code, merely make that chunk of code a function, and document it.
I have had to maintain way too much code where people found it easier to cut and paste a whole function, only to make one or two trivial changes that could easily have been parametrized. But as many others have experienced, trying to refactor some of this code would have taken a LOT of time and been very risky.
I have 4 projects wherein a 10K-line collection of functions was merely copied and modified as needed. This is a horrid maintenance nightmare, especially when the code has LOTS of problems, e.g. hard-wired endianness assumptions, tons of global variables, etc. I feel bile in my throat just thinking about it.
Don't refactor if you don't have the time to test the refactored code before release. Refactoring can introduce bugs. If you have well-tested and relatively bug-free code, why take the risk? Wait until the next development cycle.
If you're stuck maintaining an old, flaky code base with no future beyond keeping it running until management can bite the bullet and do a rewrite, then refactoring is a lose-lose situation. First, the developer loses, because refactoring bad, flaky code is a nightmare; second, the business loses, because as the developer attempts to refactor, the software breaks in unexpected and unforeseen ways.
When you don't really know what the code is doing in the first place. And yes, I have seen people ignore that rule.
It's just a cost-benefit tradeoff. Estimate the cost to refactor, estimate the benefits, determine if you actually have the time to refactor given other tasks, determine if refactoring is the best time-benefit tradeoff. There may be other tasks more worth doing.

Why do we refactor? [closed]

I would like to know the reasons that we do refactoring and how to justify it. I have read that a lot of people are upset over the idea of refactoring. Refactoring was variously described as:
A result of insufficient upfront design.
Undisciplined hacking.
A dangerous activity that needlessly risks destabilizing working code.
A waste of resources.
What are the responsible reasons that lead us to refactor our code?
I also found a similar question here: how-often-should-you-refactor, but it doesn't give the reasons for refactoring.
Why do we refactor?
Because there's no actual substitute for writing code. No amount of upfront planning or experience can substitute for actually writing the code. This is what an entire generation (the waterfall generation) learned the hard way.
Once you start writing the code and are in the middle of it, you reason about the way it works at a lower level, and you notice things (performance, usability or correctness issues) that escaped the higher-level design view.
Refactoring is perfecting.
Ask yourself: why do painters do multiple strokes with the brush on the same spot?
Refactoring is the way to pay the technical debt.
I'd like to briefly address three of your points.
1. "A result of insufficient up-front design"
Common sense (and several books and bloggers) tells us we should strive for the simplest, cleanest design possible to address a given problem. While it's quite possible that some code is written without sufficient work on understanding the requirements and the problem domain, it's probably more common that "poor code" wasn't poor when it was written; rather, it is no longer sufficient.
Requirements change, and designs have to support additional features and capabilities. It's not unreasonable to anticipate some future changes up-front, but McConnell et al. rightly caution against high-level, overly-flexible designs when there's no clear and present need for such an approach.
3. "A dangerous activity that needlessly risks destabilising working code"
Well, yes, if done improperly. Before you seek to make any significant modification to a working system, you should put in place proper measures to ensure that you're not causing any harm - a sort of "developmental Hippocratic oath", almost.
Typically, this will be done by a mixture of documentation and testing, and more often than not, the code wins out, because it's the most up-to-date description of the actual behaviour. In practical terms, this translates into having decent coverage with a unit test suite, so that if refactoring does introduce unexpected problems, these are identified and resolved.
Obviously, when you seek to refactor, you're going to break a certain number of tests, not least because you're trying to fix some broken code contracts. It is, however, perfectly possible to refactor with impunity, provided you have that mechanism in place to spot the accidental mistakes.
4. "A waste of resources"
Others have mentioned the concept of technical debt, which is, briefly, the idea that over time, the complexity of such systems builds up, and that some of that build-up has to be reduced, by refactoring and other techniques, in order to reasonably facilitate future development. In other words, sometimes you have to bite the bullet and go ahead with that change you've been putting off, because otherwise you'll be making a bad situation appallingly worse when you come to add something new in that area.
Obviously, there's a time and a place to pay off such things; you wouldn't try and repay a loan until you had the cash to do it, and you can't afford to go around refactoring willy nilly during a critical stage in development. Nevertheless, by making the decision to address some of the problems in your code base, you save future development time, and thus money, and maybe even further into the future, avoid the cost of having to abandon or completely rewrite some component that is beyond your understanding.
In order to keep a maintainable code base?
Code is read more often than it is written, so it is necessary to have a code base that is readable, understandable and maintainable. When you see something that is poorly written or designed, it can be refactored to improve the design of the code.
You clean your house regularly too, don't you? Although it may be considered a waste of time, it is necessary in order to keep your house clean, so that you have a nice environment to live in.
You may need to refactor if your code is
Inefficient
Buggy
Hard to extend
Hard to maintain
It all boils down to the original code not being very good, so you improve it.
If you have reasonable unit tests it shouldn't be dangerous at all.
Because hindsight is easier than foresight.
Software is one of the most complex things created by humans, so it is not easy to consider everything beforehand. For large projects it can even be impossible for the team (at least for one consisting of humans ;) ) to consider everything before they actually start developing it.
Another reason is that software isn't constructed, it grows. That means software can and has to adapt to ever-changing requirements and environments.
As Martin Fowler says, the only thing surprising about the requirements for software changing is that anyone is surprised by it.
The requirements will change and new features will be requested. This is a good thing. Enhancement efforts succeed most of the time, and when they fail, they fail small, so there is budget to do more. Big up-front design projects fail often (one statistic puts the failure rate at 66%), so avoid them. The way to avoid them is to design enough for the first version and, as enhancements are added, refactor to the point where it looks like the system intended to do that in the first place. The lifespan of a project that can do this (there are issues when you publish data formats or APIs - once you go live you can't always be pristine anymore) is indefinite.
In response to the four points, I would say that a process that shuns refactoring:
Demands a static world where nothing changes, so that the upfront design can hit a non-moving target perfectly.
Will result in ugly hacks to work around design flaws that aren't being refactored.
Will lead to dangerous code duplication as the fear of changing existing code sets in.
Will waste resources over-engineering the problem and building large design artifacts in anticipation of requirements that never end up getting built, causing large amounts of code and complication to drag the project down while not providing any value.
One caveat, though. If you don't have the proper support, in an automated tool for simple cases, and thorough unit tests in the more complicated cases, it will hurt, there will be new bugs introduced, and you will develop a (quite rational) fear of doing it more. Refactoring is a great tool, but it requires safety equipment.
Another scenario where you need refactoring is TDD. The textbook approach to TDD is to write only the code you need to pass the test, and then refactor it into something nicer afterwards.
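A minimal red-green-refactor sketch (JUnit 4 assumed; Fibonacci is just a stand-in problem):

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Red: the test is written first, and fails because Fibonacci.of
// doesn't exist yet.
public class FibonacciTest {
    @Test
    public void fifthFibonacciNumberIsFive() {
        assertEquals(5, Fibonacci.of(5));
    }
}

class Fibonacci {
    // Green: the simplest thing that passes could even be "return 5;".
    // Refactor: with the test as a safety net, generalize and clean up.
    static int of(int n) {
        return n < 2 ? n : of(n - 1) + of(n - 2);
    }
}
```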
...because coding is like gardening. Your codebase grows and your domain changes as time passes. What was a good idea back then often looks like a poor design now, and what is a good design now may well not be optimal in the future.
Code should never be considered a permanent artifact nor should it be considered too sacred to touch. Confidence should be garnered through testing and refactoring is a mechanism to facilitate change.
While a lot of other people have already said perfectly valid reasons, here's mine:
Because it's fun. It's like beating your own time in a steeplechase, having the stronger bicep in arm wrestling, or improving your high score in a game of your choice.
A straightforward answer is: requirements change. No matter how elegant your design is, some later requirements will not fit it.
Poor understanding of the requirements:
If developers don't have a clear understanding of the requirements, the resulting design and code cannot satisfy the customer. Later, as the requirements become clearer, refactoring becomes essential.
Supporting new requirements:
If a component is old, in most cases it will not be able to handle radical new requirements. It then becomes essential to refactor.
Lots of bugs in the existing code:
If you have spent long hours in the office fixing quite a few nasty bugs in a particular component, it becomes a natural candidate for refactoring at the earliest opportunity.
Upfront: refactoring does not need to be dangerous when (a) it is supported by tools and (b) you have a test suite that you can run after the refactoring to check the functioning of your software.
One of the main reasons for refactoring is that at some point you find out that code is used by more than one code path and you don't want to duplicate it (copy & paste) but reuse it. This is especially important in cases where you find an error in that code. If you have refactored the duplicated code into its own method, you can fix that method and be done. If you copy & paste code around, there is a high chance that you won't fix all the places where this code occurs (just think of projects with several members and thousands of lines of code).
You should of course not refactor just for the sake of refactoring - then it really is a waste of resources.
For whatever reason, when I create or find a function that scrolls off the screen, I know it's time to sit back and consider whether it should be refactored or not - if I'm having to scroll the whole page to take in the function as a whole, chances are it's not a shining example of readability or maintainability.
To make insane stuff sane.
I mainly refactor when the code has suffered so much under copy + paste and a lack of architectural guidance that the act of understanding the code is akin to reorganising it and removing the duplication.
It is human to err, and you're ALWAYS going to make mistakes when you develop software. Creating a good design from the beginning helps a lot, and having skilled programmers on the team is also a good thing, but they will invariably make mistakes, and there will be code that is hard to read, tightly coupled, non-functional, etc. Refactoring is a tool for mending these flaws after they've occurred. You should never stop working on preventing these things from happening in the first place, but when they do happen, you can fix them.
Refactoring to me is like cleaning my desk; it creates a better working environment because over time it will get messy.
I refactor because, without refactoring, it becomes harder and harder to add new features to a codebase over time. If I have features A, B, and C to add, feature C will be finished sooner, with less pain and suffering on my part, if I take time to refactor after features A and B. I'm happier, my boss is happier, and our customers are happier.
I think it's worth restating, in any conversation involving refactoring, that refactoring is verifiably behavior-preserving. If at the end of your "refactoring" your program has different outputs, or if you only think, but can't prove, that it has the same outputs, then what you've done isn't refactoring. (That doesn't mean it's worthless or not worth doing -- maybe it's an improvement. But it's not refactoring and shouldn't be confused with it.)
Refactoring is a central component in any agile software development methods.
Unless you fully understand all the requirements and technical limitations of your project you can't have a complete upfront design. In this case instead of using a traditional waterfall approach you're probably better off with an agile method - agile methods focus on adapting quickly to changing realities. And how would you adapt your source code without refactoring?
I've found code design and implementation, particularly on unfamiliar and large projects, to be a learning process.
The scope and requirements of a project change over time, which has consequences on the design. It may be that after spending some time implementing your product you discover that your planned design is not optimal. Perhaps new requirements were added by the client. Or perhaps you're adding additional functionality to an older product and you need to refactor the code in order to sufficiently provide this functionality.
In my experience code has been written poorly and the refactoring has become necessary to prevent the product from failing and to ensure it is maintainable/extendable.
I believe an iterative design process, with prototyping early on is a good way to minimise refactoring later on. This also allows you to experiment with differing implementations to determine which is most suitable.
Not only that, but new ideas and methods for what you're doing may become available. Why stick with old, fallible code that could become problematic if it can be improved?
In short, projects will change over time, which necessitates changes in structure to ensure they meet the new requirements.
From my own personal experience, I refactor because I find that if I try to make software exactly the way I want it on the first go, it takes a very long time to create anything.
Therefore I value pragmatism in developing software over clean code. Once I have something running, I then begin to refactor it into the way it should be. Needless to say, the code never devolves into a piece of unreadable tripe.
Just a side note - I did my degree in software engineering after reading some material from Steve McConnell as a teen. I love design patterns, good code reuse, nicely thought-out designs and so on. But I find when working on my own projects that designing things from that point of view at the start just doesn't work, unless I'm an absolute expert with the technology I'm using (which is never the case).
Refactoring is done to help make code easier to understand and document. For example:
Give a method a better name - perhaps the previous one wasn't clear or was incorrect.
Give variables more descriptive and better names.
Break up a really long method into several smaller methods representing the steps involved in solving the problem.
Move classes to a new package (namespace) to assist organisation.
Reduce duplicate code.
Does point number one even matter? If you're refactoring, the up-front design was obviously flawed. Don't waste time worrying about the flaws in the original design; it's old news. What matters is what you have now, so spend that time refactoring.
I refactor because proper refactoring makes maintenance SO much easier. I've had to maintain a TON of bad, awful code and I don't want to hand down any that I've written for someone else to maintain.
Maintenance costs of smelly code will almost always be higher than maintenance costs for sweet smelling code.
I refactor because:
Often my code is far from optimal first time around.
Hindsight is often 20-20.
My code will be easier to maintain for the next guy.
I have professional pride in the work I leave behind.
I believe time spent now can save a lot more time (and money) further down the track.
All your points are common descriptions of why people do refactor. I would say that the reason people should refactor lies within point #1: a Big Design Up Front (BDUF) is almost always imperfect. You learn about the system as you build it. In trying to anticipate what could happen, you often end up building complex solutions to deal with things that never actually happen (YAGNI - You Ain't Gonna Need It).
Instead of the BDUF approach, a better solution is therefore to design the parts of the system you know you are going to need. Follow the single responsibility principle, and use inversion of control / dependency injection so that you can replace parts of your system when needed.
Write tests for your components. And then, when the requirements for your system change or you discover flaws in your initial design, you can refactor and extend your code. Since you have your unit tests and integration tests in place, you will know if and when the refactoring breaks something.
There is a difference between large refactorings (restructuring modules, class hierarchies, interfaces) and "unit" refactorings - within methods and classes.
Whenever I touch a piece of code I do a unit refactoring - renaming variables, extracting methods; because actually seeing the code in front of me gives me more information to make it better. Sometimes refactoring also helps me to better understand what the code is doing. It's like writing or painting, you extract a fuzzy idea out of your head; put a rough skeleton onto paper; then into code. You then refine the rough idea in the code.
With modern refactoring tools like ReSharper in C#, this kind of unit refactoring is extremely easy, quick & low risk.
Large refactorings are harder, break more things, and require communication with your team members. It will become clear to everyone when these need to happen - because requirements have changed so much that the original design no longer works - and then they should be planned like a new feature.
My last rule - only refactor code that you are actually working on. If code's functionality doesn't need to be changed, then it's good enough & doesn't need further work.
Avoid refactoring just for refactoring's sake; that's just refactorbating!

Test-Driven Development "Barriers to Entry"?

I'm in the process of doing a study on Test-Driven Development and one of the discussion points is the "Barrier to Entry" associated with TDD. Does anyone have any experience around this area, on any projects you've worked on that decided not to use TDD because the barrier to entry was too high?
From what I can tell the only barrier to entry is knowledge (and as such experience) of individual developers, with most not being entirely accustomed to the process and it being slightly alien. Financially it seems to be very appealing given most of the market leading tools are open source, freely available, well documented and well supported.
Thoughts/feelings appreciated.
Thanks,
EDIT - does anyone know of any high-profile quotes from people advocating TDD? I would love to see how far up the chain it goes. Cheers.
Some barriers include:
An existing code base which doesn't lend itself to unit testing.
A problem domain that is hard to unit test meaningfully, such as GUI work or integrations with third party systems.
A perception of integration problems over unit problems (in other words, if it doesn't work end to end it doesn't do anything, so what is the point of testing the unit).
A mindset that wants to design ahead of time and have a clear system design, rather than have tests drive the design.
A political culture where design is done by a different person/group than development, and that design is not unit-test friendly.
An inability to get over the fact that TDD is not about testing for conformance (arguments like "the one who writes the tests shouldn't be the one who codes it, they will be too lenient on themselves" and such variants).
It isn't the way they have coded until now, so the shift is harder.
Sometimes a certain test can be hard to set up, so the method will get abandoned because it "feels" slower.
Design requirements that don't lend themselves to evolving design well or at all (think nuclear plant control software or other systems where actual lives depend on their functioning correctly).
If everyone isn't running the tests before checking in code, tests start to break often for the wrong reasons (that is, the intended behavior of the code changed but the test didn't keep up, so the test is wrong, not the code), and so the tests can be perceived as a drag.
In terms of barriers to entry, effectively, because you are explicitly writing tests that must pass before code is considered to be complete, the lead time in the dev cycle involved in getting functional code is longer. Now, when using TDD, you're effectively guaranteeing a certain level of quality on the code (whatever level of quality you choose to test against) and so that is generally more than enough compensation for the lag in lead time, but strictly speaking, there IS a greater lead time to getting functional code using TDD.
Effectively, if you had coders who always wrote bug-free code, TDD would be a drag on your development cycle. The value of TDD, of course, is that there aren't any coders who can always write bug-free code, and so the cost of fixing bugs has to be factored in somewhere; in TDD, the cost of the test infrastructure is front-loaded.
Note that this is not in any way a negative thing about TDD; I'm just saying, that front-loading COULD be considered to be a "barrier to entry". Personally, as a coder, I would say that the Return on Investment is more than worth the effort, and I think most experienced dev managers would as well.
Team and/or management buy-in is the biggest obstacle in some companies. If you're the lone developer trying to use TDD and you can't get others on the project interested, it can be very frustrating.
Of course that's not a financial barrier at all. The biggest perceived financial barrier is probably time. If you have a large code base that you need to write unit tests for, it can seem quite daunting. Your manager (or someone above them) will question why you want to spend time writing code that will not add features/functionality to the code. Many people don't realize that writing the tests up front (as you do in TDD) can actually save you time, both immediately and in the long run when you're maintaining that code.
I think one major barrier is how it requires you to change the way you think.
Before I tried TDD, I would create a class, say Employee, then stub in things like FirstName, LastName, Email, etc. Then I would write some logic and forget that I had missed a few fields or something else. And before I knew it, I had a pretty complex class without knowing whether those fields were ever necessary.
Also, it's a complete change from how we are used to writing software. We are used to writing software as we receive features from the guys who sign our checks. We are not used to writing code which doesn't compile, making it compile, then making it work to make our tests pass.
The first time you do this, you feel a bit.. well silly and stupid. Why am I making my code intentionally fail? It seems illogical to the "make it work" philosophy we've all been taught for so long.
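For contrast, here is a test-first sketch of that Employee example (hypothetical behavior, JUnit 4 assumed): the class only grows what some test actually demands, so speculative fields never pile up.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// The test states the one thing currently needed, before the class exists.
public class EmployeeTest {
    @Test
    public void emailIsDerivedFromName() {
        Employee e = new Employee("Ada", "Lovelace");
        assertEquals("ada.lovelace@example.com", e.email());
    }
}

// The class contains only what the test demanded, nothing speculative.
class Employee {
    private final String firstName, lastName;

    Employee(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String email() {
        return (firstName + "." + lastName + "@example.com").toLowerCase();
    }
}
```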
A few reasons why it has failed so far where I work:
Most of the projects we work on are older apps - not pre-.NET, but .NET 2.0 and in some cases .NET 1.0.
Some of these projects are not well factored, either because the technology wasn't there in 1.0, or because they were built quickly when somebody needed something NOW.
As Jon pointed out, some things are still a PIA (pain-in-the-***) to unit test: UI, database, etc.
Expensive tools. If you are only allowed to use Microsoft tools, doing this the "right way" carries a high price tag. We use ReSharper, so it really isn't a problem.
Time. I'm in a team of three guys supporting a department of 30 people. We are considered overhead, and much of our development consists of interfacing systems together.
Yes, the main barrier to entry is in your head or in the heads of other programmers.
In the beginning you don't know what to write in your tests.
The trick is to think about how your code will be used instead of focusing on how you're going to write it. Easier said than done...
When you start to "get it", it's a bit hard to know where to stop writing tests.
You have to remember that tests prove nothing, so you can't write tests to cover all cases; you have to select the most useful ones... and that's already a lot!
I've certainly seen plenty of resistance. The barriers I've encountered are:
Unit testing user interfaces (web or thick client) is tricky. I know there are lots of attempts to solve the problem, but I don't think any of them have made it really simple - because it's a naturally hard problem.
At the other end, although there are various ways of making it easier to test the code involved with the database, it's still tricky and time-consuming.
While good tests definitely speed up development overall, testing is a skill - and while you suck at it, unit testing may well be more trouble than it's worth, which means you never build up the skill...
Managers often see it as an optional extra to development - a nice to have rather than critical. This means it's the first thing to go when the project inevitably has a resource squeeze.
I wrote a long-ish article about this a few weeks back, "Why I write tests first".
I think the biggest barrier is building the discipline to start with tests first, but I don't believe TDD (or any practice, for that matter) should be approached as an always, absolutely, 100%-of-the-time solution.
TDD is a tool in each developer's arsenal. I tend to think it works well for me most of the time. A developer who isn't accustomed to writing tests (first or otherwise) may find it difficult to get anything done if TDD is forced on them, because they can't think in terms of writing the test first.
I consider myself an experienced test-writer, but I can't always think in terms of tests. Some problems don't lend themselves well to it, or at least my head doesn't get wrapped around them some days. And some types of code (such as UI and client-side code) don't lend themselves well to always writing tests first.
If you have a team of developers that do not write tests as a matter of habit, I'd push that first. I have no problem requiring that all new code have accompanying unit tests where possible/practical. Once testing is a discipline, converting developers to TDD individually or as a team is much easier.
One non-obvious barrier (non-obvious to me, at least) is the build infrastructure. If developers don't have control over the build process, or if the infrastructure is too baroque to be manageable, then integrating tests into the build process is going to be shunted to the side in the name of "efficiency". (Of course, in these situations it's the build infrastructure that should be shunted aside in the name of efficiency.)
