Do you know best practices for refactoring other people's source code?

I have received a bunch of source code from other developers. The source code is not documented, and my task is to divide it into different modules. Do you have an approach from your own practice for doing this so that the existing application will not be broken?

The first step is to write test cases for the part of the code you want to refactor. Writing test cases will also help you understand what the code really does.
Once you have the tests running, you can continue with refactoring. Split your code into smaller methods (a method should do only one thing) and follow clean-code principles, verifying that the tests still pass.
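For instance, a minimal sketch in Python with pytest (the parse_price function and its behavior are hypothetical stand-ins for your legacy code): first pin down what the code does today with a "characterization" test, then refactor underneath it.

    import pytest

    # Hypothetical legacy function whose current behavior we pin down
    # before touching anything.
    def parse_price(text):
        cleaned = text.strip().lstrip("$")
        value = float(cleaned)          # raises ValueError on garbage
        return int(round(value * 100))  # price in cents

    def test_parse_price_pins_current_behavior():
        assert parse_price("$12.50") == 1250
        assert parse_price(" 3 ") == 300
        with pytest.raises(ValueError):
            parse_price("abc")

Every extract-method step you take afterwards has to keep this test green.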
Then I would start looking for dead and complex code, and move methods and classes together that belong together. For this job there are a lot of good tools out there that will ease your work; Sonar and dead-code detectors, to name two.
As the last step, I would change the model, find patterns, and transform your code accordingly.
The magic is to have tested code when you start, because then you can be pretty sure that your resulting new, shiny code behaves the same as before.
Books I can recommend for your task:
Clean Code; Cleaner Code; The Art of Readable Code; Code Simplicity; Quality Code: Software Testing Principles, Practices, and Patterns; Head First Design Patterns

Try to establish the dependencies between different segments of code. If you see repetition, that is almost certainly a place where you can increase modularity. Additionally, organize things cohesively: the functions in one file should really all serve similar, related purposes. Also, functions in general should usually do small, specialized tasks.
For documentation, I would recommend contacting the developers to ask exactly what they meant. If this is impossible, read, read, and read again. Make sure you understand it inside and out before you begin.

I like to establish metrics as goals for the exercise. The problem with refactoring is knowing when you're done, and by establishing metrics you force yourself to think about that.
I use a tool like Sonar, integrated with the source control system and continuous integration pipeline, as a measurement framework. (So, yes, the first thing you probably need to do is set up source control and continuous integration.)
The standard Sonar rulesets are pretty sensible. At a high level, you can set goals for unit test coverage and rules compliance. Rules compliance covers all sorts of metrics: code duplication, commented-out code, code dependencies, complexity, and so on.
You can also see graphs of the code's evolution over time, showing you whether you're going in the right direction.
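For instance, a minimal sonar-project.properties for the scanner might look like the following; the property keys are standard SonarQube scanner settings, while the values are placeholders for your own project:

    # minimal SonarQube scanner configuration (values are placeholders)
    sonar.projectKey=my-project
    sonar.projectName=My Project
    sonar.sources=src
    sonar.tests=test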
The next bit of "best practice" advice is to capture your work in a task tracking system, as a user story or whatever. In most cases, my business stakeholders want to know when I'm going to be done with the refactoring, and using the same processes I use for tracking feature development helps build confidence.
Finally, while Vadimo's book recommendations are solid, I'd want to add Refactoring to Patterns. It shows pragmatic steps to improve the code, without necessarily going all the way to "Gang of Four" purity.

Related

Trying to use TDD when unfamiliar with the domain you're programming in

Background:
I'm currently creating my first game engine project for learning purposes. This is the first larger project I've ever done. Going into the project, I knew the higher-level details of what's involved in a game engine (such as separate systems: input, graphics, audio, physics, etc.), but when getting into the nitty-gritty I'm kind of figuring things out as I go. My coding process right now looks like this:
1) Figure out some quick, higher-level details of the system I want to design
2) Start some experimental coding to see how exactly things need to work
3) Add tests afterwards, once I'm feeling good about what I have.
Since I'm pretty unfamiliar with the problem domain (game engine programming), I find I really need to experiment in code to see what functions I need and what design suits best. However, I know most of the community (so it seems, anyway) usually advocates more of a TDD approach to building applications. I can see the benefit of this, but I'm not quite sure how I would apply "write test first, fail, then pass" when I'm not really even sure what functions I need to be testing yet. Even if I can think of one or two definite functions, what if during the experimenting phase I find it's better to split those functions out into more functions across different classes? Then I would have to constantly redesign both my code and my tests.
My Issue/Question(s):
So, is there a good way to go about using a TDD approach when you're experimenting in code? Or is TDD generally meant for those who are familiar with the projects they're working on, and have an idea of what the design needs to be or what functions they will need to test?
A common approach when using TDD in unfamiliar territory:
Do some experimental coding to learn more
Set aside the experimental code and start the TDD "test first" approach
As you write production code to make your tests pass, you'll be harvesting pieces of your experimental code.
The benefit of this approach is that with the TDD "red, green, refactor" cycle, you'll usually be improving the design of your experimental code. Without it, your experimental code can often end up as a "big ball of mud".
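A sketch of steps 2 and 3 in Python with pytest, using a hypothetical game-engine example: the spike taught you the circle-overlap formula; now the test comes first, and the spike code is harvested to make it pass.

    import math

    # Test first: written after the spike, so we already know
    # exactly what behavior to demand.
    def test_circles_overlap_when_distance_below_sum_of_radii():
        assert circles_overlap((0, 0), 1.0, (1.5, 0), 1.0)
        assert not circles_overlap((0, 0), 1.0, (3.0, 0), 1.0)

    # Production code: harvested from the experimental spike.
    def circles_overlap(center1, radius1, center2, radius2):
        dx = center2[0] - center1[0]
        dy = center2[1] - center1[1]
        return math.hypot(dx, dy) < radius1 + radius2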
I have found that when I let "I'm not comfortable writing unit tests in this situation" be an excuse for not writing unit tests... I never write any unit tests.
If you know enough about the behavior of something to code it, you know enough to test it.
"Even if I can think of 1 or 2 definite functions, what if during the experimenting phase I find it's better to split those functions out into more functions across different classes?"
If you already have the original function under test, and you're only changing the interface, then there should be little test logic that has to change. And there's little risk of breaking something when you change the interface afterwards. Further, this concern is not specific to experimenting with a new domain.
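A hypothetical sketch of that point: the behavior (and the assertion) survives the split into classes; only the test's setup line changes.

    # Originally a single tested function:
    #   def compute_damage(attack, armor): return max(0, attack - armor)
    # After splitting it across classes, the assertion is untouched.
    class FlatArmor:
        def reduce(self, attack, armor):
            return max(0, attack - armor)

    class CombatResolver:
        def __init__(self, armor_model):
            self.armor_model = armor_model

        def compute_damage(self, attack, armor):
            return self.armor_model.reduce(attack, armor)

    def test_damage_is_attack_minus_armor():
        combat = CombatResolver(armor_model=FlatArmor())
        assert combat.compute_damage(10, 4) == 6  # same assertion as before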
Perhaps an unfamiliar domain is not the best place to start learning TDD. But I would certainly not say it is inherently unsuitable to the process.

TDD as a defect-reduction strategy

Can TDD be successful as a defect-reduction strategy without incorporating guidance on test case construction and evaluation?
In my opinion, the answer is no. For TDD to be effective, there have to be guidelines around what constitutes a test and what it means for something to be reasonably tested. Without a guideline, some developers may end up with tons of defects because their initial tests cover a very small set of inputs, e.g. only the valid ones, which can render the whole idea of using TDD worthless.
Test driven development can reduce defects in a QA cycle simply because testing allows developers to find defects prior to releasing their code to the QA team.
But without guidance on how to test, you really won't be able to create any kind of long-term benefit, since haphazard testing will only prevent defects by blind luck. Good tests based on good guidance can go a long way towards reducing defects.
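As a minimal illustration of that guidance (hypothetical names, in Python with pytest): a suite that only exercises the valid path misses exactly the defects TDD is supposed to catch.

    import pytest

    def withdraw(balance, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > balance:
            raise ValueError("insufficient funds")
        return balance - amount

    def test_valid_withdrawal():
        assert withdraw(100, 30) == 70

    # Without guidance, tests like these are the first to be skipped.
    def test_rejects_negative_amount():
        with pytest.raises(ValueError):
            withdraw(100, -5)

    def test_rejects_overdraft():
        with pytest.raises(ValueError):
            withdraw(100, 500)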
If you don't have tests to reproduce defects, how do you know that "defect reduction" has taken place?
Of course you do have tests; they're just manual, and thus tedious and time-consuming to reproduce...
Here's a study (warning: link to PDF file) done by Microsoft on some of their internal teams.
A quote from it:
The results of the case studies indicate that the pre-release defect density of the four products decreased between 40% and 90% relative to similar projects that did not use the TDD practice. Subjectively, the teams experienced a 15–35% increase in initial development time after adopting TDD
That's the only actual empirical study on TDD/unit testing that I'm aware of, but there are plenty of people (including myself) who will anecdotally tell you that TDD (and unit testing in general) will definitely provide an increase in the quality of your code.
From my own experience, there is definitely a reduction in the number of defects, but the numbers feel like they would be far less than even the 40% from the Microsoft study. This is (again, based solely on what I've seen) primarily because most corporate developers have little to no experience with unit testing (let alone TDD), and will invariably do a bad job of it while they are learning. Actually learning how to do TDD well requires at least a solid year of experience, and I've never worked in (or even seen) a team which actually had a full complement of developers with that experience.
You may want to pick up a copy of Gerard Meszaros' xUnit Test Patterns. Specifically, Chapter 5 might apply most directly to your question, where it covers the principles of test automation. Some of those principles apply to your scenario, where there needs to be some sort of guidance, common interest, or some sort of implied "do this or fear the wrath of...":
Principle: Communicate Intent
Tests need to be easy to maintain, and it should be readily apparent what each test is doing.
Principle: Keep Tests Independent
Small tests that cover one small piece. Tests should not interfere with each other.
Principle: Minimize Test Overlap
Design tests that cover a specific piece, and do not create tests that exercise the same paths repeatedly.
Principle: Verify One Condition Per Test
This one seemed simple enough to me, but in my experience it is one of the most challenging for people to grasp. I may write tests that have multiple asserts, but I try to keep those asserts together around one specific area. When it comes to hunting down failures and other test issues, it is MUCH easier to fiddle with a single test that exercises a specific path than with several different paths all clumped into a single test method.
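A small sketch of that last principle (hypothetical names): one clumped test split into three, each verifying one condition, so a failure points straight at the broken path.

    def normalize_username(name):
        return name.strip().lower()

    # Instead of one test_normalize() asserting all three behaviors,
    # each condition gets its own narrowly named test.
    def test_normalize_lowercases():
        assert normalize_username("Alice") == "alice"

    def test_normalize_strips_whitespace():
        assert normalize_username("  bob ") == "bob"

    def test_normalize_leaves_clean_input_alone():
        assert normalize_username("carol") == "carol"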
Further relating to my experiences, if we want to talk about the corporate developer, I have seen some folks that are interested and take the initiative to learn something new, but more often than not, you have folks that like to go with the flow, and like to have things laid out for them. Without some sort of direction, be it a mandate from a senior engineer-type, or some sort of joint-team discussions (see Practices of an Agile Developer for some ideas such as lunch time meetings once a week), I think your chance of success would be limited.
In a team situation, where your code is likely to be used by someone else, the tests have a fringe benefit that can reduce defects without necessarily even improving anyone's code.
Where documentation is poor (which during development is "often"), the tests act as a crib for how you expect your code to be called. So, even in cases where the code is really very fragile, TDD can still reduce the number of defects raised against the end product -- by ensuring your colleagues can see well-written tests before they use your code, you've ensured they know how you intend your code to be used before they call it. They are thus less likely to call your code in an unexpected sequence, or without having configured something you expected (but forgot to write a check for) as a prerequisite. Thus they are less likely to trigger the failure condition, and you are less likely to see them or the (human) test team raising a defect because something crashed.
Of course, whether you see that "there's a hidden bug in there, it's just not being called yet" as a problem itself is another good question.

When Refactoring a project to improve maintainability, what are some of the things to target?

I've got a project (about 80K LOC) that I'm working on, and I've got nearly a full month of luxury refactoring and feature-adding time prior to release, so long as I'm careful not to break anything. With that being said, what can I do to improve maintainability? Please note there was no unit testing in place on this project, and I'm not really PLANNING to add unit tests to it; however, if it is the common consensus, I will take it under consideration.
What are the key things I should look for and consider revising or refactoring to improve software maintainability and quality?
Edit: It's been made clear to me I need unit testing. This is not something I've ever really done, what are some good resources for a developer new to unit testing (preferably via VS2008's unit testing framework)?
Please note there was no unit testing in place on this project, and I'm not really PLANNING to add unit tests to it; however, if it is the common consensus, I will take it under consideration.
Frankly, if your goal is to improve maintainability, there is no substitute for unit testing.
This would be step one. The problem is, without unit testing, there's no way to know if you "break" something during the refactoring process.
Unit tests provide a layer of safety around your refactoring. It's difficult to feel comfortable refactoring, especially doing large-scale refactoring, if you have no way to verify that your refactoring isn't going to change behavior.
You can do some minor refactoring (small renames to improve understanding, etc.), but any large-scale refactoring, or any design-style refactoring to improve long-term maintainability, should come after designing and writing tests that help you protect yourself.
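One quick way to build that safety layer over untested legacy code is a "golden master" test: record the current output for a batch of inputs and fail on any change. A minimal sketch (the legacy_format function is a hypothetical stand-in):

    # Pin the legacy behavior wholesale before any large-scale refactoring.
    def legacy_format(n):
        return ("%05.1f" % n).replace(".", ",")

    def test_golden_master_pins_legacy_formatting():
        inputs = [0, 1, 2.5, 99.94, 100, -3.2]
        snapshot = ["000,0", "001,0", "002,5", "099,9", "100,0", "-03,2"]
        assert [legacy_format(n) for n in inputs] == snapshot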
The key thing to consider is why you want to refactor your code. Answer that question, and you'll have half your answer already.
You mention wanting to improve maintainability, which is a very common reason for refactoring. Given that as a goal, here are some things that I would specifically target:
1) Remove duplicate code. Most programmers try to avoid duplication, but large projects (especially projects with large teams) tend to accumulate it anyway. This is an easy target for refactoring (a small sketch follows this list).
2) Make simplicity your goal. Is each function/method/class clearly defined? Can you look at it and know exactly what it's doing? If not, it's a good target for a refactor. Good examples are modules that do lots of things (or have lots of side effects). Consider breaking them into smaller modules of logically grouped functionality.
3) Variable/class/function names. Are they clear? They don't necessarily need to be long, but they should make it very clear to you (or whomever is maintaining the code) exactly what the variable is for or what the function does. If there are some that are unclear, consider renaming them.
4) Do you have code that's never getting called? It could be worth leaving if you think you'll use it later. Otherwise, it's just a red herring for any maintainers. It's worth considering getting rid of it.
5) Performance enhancements. You may or may not have the time for full-blown algorithmic rewrites (the best kind of performance enhancement). However, this is a good time to check for simple things. As a C++ example: are you passing classes by const reference or by value? The former is much more efficient when you can get away with it (which is 95% of the time).
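As promised above, a small hypothetical sketch of target 1: two near-identical blocks collapse into one well-named helper (which also serves targets 2 and 3, simplicity and clear naming).

    # Before: the same strip/lower/validate logic was repeated inside
    # both import functions. After: one helper with a clear name.
    def normalized_email(raw):
        email = raw.strip().lower()
        if "@" not in email:
            raise ValueError("invalid email: %r" % raw)
        return email

    def import_customers(rows):
        return [normalized_email(row["email"]) for row in rows]

    def import_suppliers(rows):
        return [normalized_email(row["contact_email"]) for row in rows]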
Good luck on your refactoring!
[Edit] Also, I second everybody below with a recommendation that you write unit tests before refactoring to ensure that your code remains correct.
Even though you said no unit tests, I am going to plug them anyway. Wrap complicated logic up in tests before refactoring it.
Jrud's answer about code smells is good.
Also, study up on the S.O.L.I.D. principles.
I would look at the wiki article on Code Smells on this site; it's a great place to start!
Having a project well covered with solid tests (both unit tests, using mocking and the like so they run blazingly fast and you can run them all the time, AND integration tests, to be run more rarely, that actually interface with real databases and so on) is the key to maintainability: the single most important thing you can do to make a project easily maintainable for whatever purpose (features, bug removal, performance, porting, etc.). A strong test suite gives you confidence in the correctness of any further specific change you may want to try down the road; plus, code refactored to be well-testable (highly modular, using dependency injection, and so on) intrinsically becomes more flexible and maintainable.
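The unit/integration split in miniature, as a hedged Python sketch (all names are hypothetical): the unit test stubs the database with a mock so it runs in microseconds; only the rarer integration tests touch a real database.

    from unittest.mock import Mock

    class ReportService:
        def __init__(self, db):
            self.db = db  # dependency injection makes this mockable

        def total_sales(self, region):
            return sum(row["amount"] for row in self.db.fetch_sales(region))

    def test_total_sales_sums_rows_without_a_real_database():
        db = Mock()
        db.fetch_sales.return_value = [{"amount": 10}, {"amount": 5}]
        assert ReportService(db).total_sales("EU") == 15
        db.fetch_sales.assert_called_once_with("EU")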
I highly recommend Feathers' Working Effectively With Legacy Code (both the PDF I'm pointing to, and the book by the same title and author) for a thorough and highly practical guide to how best to proceed in such situations.
Find places that are likely to change in the future, and maybe make them more flexible.

Exercises to enforce good practices such as TDD and Mocking

I'm looking for resources that provide an actual lesson plan or path to encourage and reinforce programming practices such as TDD and mocking. There are plenty of resources that show examples, but I'm looking for something that actually provides a progression that allows the concepts to be learned instead of forcing emulation.
My primary goal is speeding up the process for someone to understand the concepts behind TDD and actually be effective at implementing them. Are there any free resources like this?
It's a difficult thing to encourage because it can be perceived (quite fairly) as a sea-change; not so much a progression to a goal but an entirely different approach to things.
The short-list of advice is:
You need to be the leader: you need to become proficient before you can convince others to follow, and you need to be able to show others the path and settle their uncertainties.
First become proficient in writing unit tests yourself
Practice writing tests for existing methods. You'll probably beat your head on the desk trying to test lots of your code--it's not because testing is hard or you can't understand testing; it's more likely because your existing code and coding style isn't very testable.
If you have a hard time getting started then find the simplest methods you can and use them as a starting point.
Then focus on improving the testability of the code you produce
The single biggest tip: make things smaller and more to the point. This one is the big change--this is the hardest part to get yourself to do, and even harder to convince others of.
Personally, I had my "moment of clarity" while reading Bob Martin's Clean Code; an early chapter talks about what a clean method should look like, and as an example he takes a ~40-line method that visually resembled something I'd produce and refactors it into a class which is barely larger line-count-wise, but consists of nothing but bite-sized methods that are perhaps 3-7 lines each.
Looking at these itty-bitty methods it suddenly clicked that the unit-testing cornerstone "each test only tests one thing" is easiest to achieve when your methods only do one thing (and do that one thing without having 30 internal mechanisms at play).
The good thing is that you can begin to apply your findings immediately; practice writing small methods and small classes and testing along the way. You'll probably start out slow, and hit a few snags fairly quickly, but the first couple months will help get you pointed in the right direction.
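A sketch of what those bite-sized methods buy you (a hypothetical example): each small method does one thing, so each test can verify one thing.

    class OrderReport:
        def __init__(self, orders):
            self.orders = orders

        def shipped(self):
            return [o for o in self.orders if o["status"] == "shipped"]

        def total(self, orders):
            return sum(o["amount"] for o in orders)

        def summary(self):
            shipped = self.shipped()
            return "%d shipped, %d total" % (len(shipped), self.total(shipped))

    def test_shipped_filters_by_status():
        report = OrderReport([{"status": "shipped", "amount": 5},
                              {"status": "open", "amount": 9}])
        assert report.shipped() == [{"status": "shipped", "amount": 5}]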
You could try attending (or hosting one if there is none near you!) a coding dojo
I attended one such exercise and it was fun learning TDD.
Books are always a good resource. Even though they are not free, the time you would spend searching for good free resources may be worth more than the money those books cost.
"Test driven development by example" by Kent Beck.
"Test Driven Development in Microsoft .NET" by James W. Newkirk and Alexei A. Vorontsov
please feel free to add to this list
One thing I worked through that helped me appreciate TDD more was NHibernate and the Unit of Work pattern. Although it's specific to NHibernate and .NET, I liked the way it was arranged. Using TDD, you develop something (a Unit of Work) that's actually useful, rather than some simple "this is what a mock looks like" example.
How I learn a concept best is by putting it to use towards an actual need. I suggest you take a look at the structure of the article and see if it's along the lines of what you're looking for.
Geeks are excellent at working to metrics, whether they are good for them or not!
You can use this to your advantage. Set up a CI server and fail the build whenever code coverage drops below 50 percent. Let them know that the threshold will rise 10 percent every month until it reaches 90. You could perhaps use commit hooks to stop such code from being checked in to begin with, but I've never tried this myself.
Let them know that the team's coverage will be taken into account in any performance reviews, etc. By emphasizing that it is the coverage of the team, you should get peer pressure helping you ensure good coverage.
This will only ensure they are testing their code, not how well they are testing their code, nor whether they are writing the tests first. However, it is strongly encouraging (or forcing) them to incorporate testing into their daily development process.
Generally, once people have something in their process, they'll want to do it as easily and efficiently as possible. TDD is the easiest way to write code with high coverage, since you don't write a line of code without it being covered.
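With pytest and the pytest-cov plugin, for example, the failing threshold is a single flag, and coverage.py's config file supports the same gate, which makes the monthly ratchet a one-line change (the project name is a placeholder):

    # fail the CI build when line coverage drops below the threshold
    pytest --cov=myproject --cov-fail-under=50

    # or keep the (monthly rising) threshold in .coveragerc:
    [report]
    fail_under = 50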
Find someone with experience and talk to them. If there isn't a local developer group, then start one.
You should also try pushing things too far to start with, and then learn when to back off. For example, the whole mocking thing started when someone asked, "What if we program with no getters?"
Finally, learn to "listen to the tests". When the tests look dreadful, consider whether it's the code that's at fault, not your testing technique.

Should I start using TDD on a project that doesn't use it already

I have a project that I have been working on for a while, just one of those little pet projects that I would like to one day release to open source.
I started the project about 12 months ago, but I was only working on it lightly; I have just started to concentrate a lot more of my time on it (almost every night).
Because it is a framework like application I sometimes struggle with a sense of direction due to the fact I don't have anything driving my design decisions and I sometimes end up making features that are hard to use or even find. I have been reading about how to do TDD and thought maybe this will help me with some of the problems that I am having.
So the question is: do you think it's a good idea to start using TDD on a project that doesn't already use it?
EDIT: I have just added a bit to clarify what I mean by struggling with a "sense of direction"; it probably wasn't the best thing to say without clarification.
In my opinion, it's never too late to adopt a better practice - or to drop a worse one - so I'd say "Yes, you should start".
However ... (there's always a "but") ...
... one of the biggest gains of TDD is that it impacts your design, encouraging you to keep responsibilities separate, interactions clean, and so on.
At this point in your project, you may find it difficult to get tests written for some aspects of your framework. Don't give up, though; even if you can't test some areas, your quality will be the better for the areas you can test, and your skills will improve with the experience.
Yes.
Basically, you can't do any harm by adding TDD for any new code you write, and for any changes you make to existing code. Obviously it would be tricky to go back and retrofit accurate tests to existing code, but it certainly couldn't hurt to cover the primary use cases.
Maybe consider having a look at Brownfield Application Development in .NET? It is full of pragmatic and practical advice for exactly this scenario (one of the definitions offered for "Brownfield" is "without proper unit tests").
Yes, absolutely a good idea to start doing TDD.
You will pay a start-up cost for at least two reasons:
Learning a new skill TDD/unit testing.
Retrofitting your code to be testable.
You'll need to do some of both, but as you work, if you find yourself struggling, think about which of those two is the source of the effort.
But the end result is worth it. From what you describe this is a project you intend to live with for quite a while. Remember that when you lose an hour here or there. In a year you'll be very happy that you made this investment both in your skill set and the code base.
At worst, you can just do TDD on new code while you slowly create tests for your existing code base.
Yes, it's never too late to start using TDD. I have introduced TDD to a commercial project that was already running for five years when I joined, and it was definitely a good decision.
While you are new to the technique, you should probably concentrate on using it for code that you are writing from a clean slate: new classes, new methods, etc. Once you get the hang of it, start writing tests for code that you change.
For some of the code, the latter might prove to be difficult, because the code you have written until now is unlikely to be written with testability in mind. There are some techniques to deal with that, but it's probably too early to care about them.
If you are missing a sense of direction, though, I doubt that TDD will help you a lot. You might want to look into Acceptance Testing instead, which is at least as important as unit testing, and will help you focus on the functionality of the system instead of single units of code. The TDD book by Lasse Koskela is a good introduction to both techniques.
Another technique that might help you is the Extreme Programming planning game, where you put pieces of functionality on index cards and prioritize them. I typically notice that getting ideas out of my head and in prioritized order helps me a lot in understanding where I want to go next.
As others have said, TDD shouldn't hurt a project in progress, but think carefully if you're tempted to do large-scale refactoring just to allow testing. Make sure the benefits justify the cost.
I'm a little concerned that you "struggle with a sense of direction." I don't know that TDD will help you there. I find it's a great help for low-level design decisions, but not so great for architecture decisions. Adding TDD to a directionless project sounds a bit like having a baby to save a marriage - unwise. Hopefully I misread your intention.
Yes.
TDD makes it easier for other people to understand the code, and it gives the application a better design over time.
In theory you were supposed to test first, but you didn't. In this scenario, contrary to others' opinions, I wouldn't start with new features.
Take advantage of the 80:20 rule: run a profiler (a sketch follows at the end of this answer) and put test cases around the most frequently called pieces of code.
Put tests around the crown-jewel, gut, most-important code.
Put tests around the annoying, always-breaking, recurrent déjà vu buggy code.
Put tests around all bugs you come across: write a failing test that reproduces the bug before fixing it.
Warning: putting test cases in place will require refactoring, which means you must change something that isn't broken.
If you still love unit tests at this point, you'd be Red, Green, Refactoring on your own.
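For the profiler step, Python's standard-library cProfile is enough to find the most frequently called code (main() here is a hypothetical stand-in for your application's entry point):

    import cProfile

    def main():
        sum(i * i for i in range(100000))  # stand-in for the real app

    if __name__ == "__main__":
        # sort by cumulative time; the top entries are where the
        # first tests (per the 80:20 rule) should go
        cProfile.run("main()", sort="cumulative")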
Absolutely.
Introduce TDD to new code and, if time allows, apply "Comment-Driven Design" to your existing code if it's not already tested:
Comment out the block of existing code you need to test
Write your test
Uncomment your original code one statement at a time (if you have an if block, uncomment the entire block)
Determine if your original code ultimately passes your test and if not, re-write to pass your tests accordingly
Writing tests for existing, working code that you don't plan to change doesn't fit with the thrust of TDD, which is to write tests that teach you about the system you're building.
My approach to bringing in TDD mid-stream has been to:
write tests for all new features, and
when changing a piece of code, write a test that covers the existing functionality (to make sure I understand it), then change the test before changing the code.
It can also be beneficial to write tests for code related to the code you're changing; e.g., if you're altering a parent class, you may want to build tests around the child classes first, to protect yourself from potential damage.
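A minimal sketch of that tactic (hypothetical classes): pin the child-class behavior you rely on before refactoring the parent, so any damage to the base class shows up immediately.

    class Shape:
        def area(self):
            raise NotImplementedError

    class Square(Shape):
        def __init__(self, side):
            self.side = side

        def area(self):
            return self.side * self.side

    # Must stay green while the parent (Shape) is being refactored.
    def test_square_area_before_touching_shape():
        assert Square(3).area() == 9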
Yes, you should. I'm currently working on a project that until recently wasn't covered by unit tests, but we decided we should start testing our code, so we have started writing them now. Unfortunately, I'm the only developer who practices TDD; the others just write tests after writing their code.
Still, I found that practicing TDD helps me write better code, and I write it faster than before. Now that I learned how to do TDD, I just don't want to go back to writing code the way I used to.
