Related
In this article, the author explains rebasing with this diagram:
Rebase: If you have not yet published your
branch, or have clearly communicated
that others should not base their work
on it, you have an alternative. You
can rebase your branch, where instead
of merging, your commit is replaced by
another commit with a different
parent, and your branch is moved
there.
while a normal merge would have looked like this:
So, if you rebase, you are just losing a history state (which would be garbage collected sometime in the future). So, why would someone want to do a rebase at all? What am I missing here?
There are variety of situations in which you might want to rebase.
You develop a few parts of a feature on separate branches, then realize they're in reality a linear progression of ideas. Rebase them into that configuration.
You fork a topic from the wrong place. Maybe it's too early (you need something from later), maybe it's too late (it actually applies to previous versions as well). Move it to the right place. The "too late" case actually can't be fixed by a merge, so rebase is critical.
You want to test the interaction of a branch with another branch, but for some reason don't want to merge. For example, you might want to see what conflicts crop up commit-by-commit, instead of all at once.
The general theme here is that excessive merging clutters up the history, and rebasing is a way to avoid it if you didn't get your branch/merge plan right at first. Too many merges can make it hard for a human to follow the history, and also can make it harder to use tools like git-bisect.
There are also all the many cases which prompt an interactive rebase:
Multiple commits should've been one commit.
A commit (not the current one) should've been multiple commits.
A commit (not the current one) had a mistake in it or its message.
A commit (not the current one) should be removed.
Commits should be reordered (e.g. to flow more logically).
While it's true that you "lose history" doing these things, the reality is that you want to only publish clean work. If something is still unpublished, it's okay to rebase it in order to transform it to the way you should have committed it. This means that the final version in the public repository will be logical and easy to follow, not preserving any of the hiccups a developer had along the way.
Rebasing allows you to pick up merges in the proper order. The theory behind merging means you shouldn't have to worry about that. The reality of resolving complicated conflicts gets easier if you rebase, then merge new changes in order.
You might want to read up on Bunny Hopping
Do you intermingle refactoring changes with feature development/bug fixing changes, or do you keep them separate? Large scale refactorings or reformatting of code that can be performed with a tool like Resharper should be kept separate from feature work or bug fixes because it is difficult to do a diff between revisions and see the real changes to code in amongst the numerous refactoring changes. Is this a good idea?
When I remember, I like to check in after a refactoring in preparation for adding a feature. Generally it leaves the code in a better state but without a change in behaviour. If I decide to back out the feature, I can always keep the better structured code.
Keep it simple.
Every check in should be a distinct, single, incremental change to the codebase.
This makes it much easier to track changes and understand what happened to the code, especially when you discover that an obscure bug appeared somewhere on the 11th of last month. Trying to find a one-line change amidst a 300-file refactoring checkin really, really sucks.
Typically, I check in when I have done some unit of work, and the code is back to compiling/unit tests are green. That may include refactorings. I would say that the best practice would be to try to separate them out. I find that to be difficult to do with my workflow.
I agree with the earlier responses. When in doubt, split your changes up into multiple commits. If you don't want to clutter the change history with lots of little changes (and have your revisions appear as one atomic change), perform these changes in a side branch where you can split them up. It's a lot easier to read the diffs later (and be reassured that nothing was inadvertently broken) if each change is clear and understandable.
Don't change functionality at the same time you are fixing the formatting. If you change the meaning of a conditional so that a whole bunch of code can be outdented, change the logic in one change and perform the outdent in a subsequent change. And be explicit with your commit messages.
If the source code control system allows it..
(this does not work in my current job due to the source code control system not liking a single user checking out a single file to more than one location.)
I have two working folders
Both folders are checkout from the same branch
I use one folder to implement the new feature development/bug fixing changes
In the other folder I do the refactoring,
After each refactoring I check in the refactoring folder
Then update the new feature development folder that merges in my refactorings
Hence each refactoring is in own checkin and other developers get the refactoring quickly, so there are less merge problems.
What do you do when you're assigned to work on code that's
atrocious and antiquated to the point where it's almost incomprehensible?
For example: hardware interface code, mixed with logic, AND user interface code, ALL in the same functions?
We see bad code all the time, but what do you actually do about it?
Do you try to refactor it?
Try to make it OO if it's not?
Or do you try to make some sense of it, make the necessary changes and move on?
Depends on a few factors for me:
Will I be maintaining this code in the future, or is it a one-off fix?
How long until this system is replaced entirely?
How busy am I at the moment?
Ideally, I'd refactor all bad code I had to maintain, but the reality is there are only so many hours in the day.
As is frequently the case, "It Depends".
I tend to ask myself some of the following questions:
Are there unit tests for the existing code?
Is refactoring the code an acceptable risk for my project?
Is the author still available to clarify any questions I might have about the code?
Will my employer consider the time spent on changing existing, functioning code to be an acceptable use of my time?
And so on...
But assuming that I have the capacity to do so, refactoring is preferential as the up front cost of fixing the code now will likely save me a lot of time and effort later in maintenance and development time.
There are other benefits as well, including the fact that the more clean and well maintained you keep your code base, the more likely other developers are to keep it that way. The Pragmatic Programmer calls this the Broken Window Theory.
Developers have an instinct to assume that code is always ugly because of other, inferior developers. Sometimes, code is ugly because the problem space is ugly. All that ugliness isn't just ugliness - it is sometimes institutional memory. Each line of ugly in your code probably represents a bug fix. So think very carefully before you rip it all out.
Basically, I would say that you shouldn't touch code like this unless you actually have to. If there's a real bug that you can solve, refactoring is reasonable, if you can be sure you're maintaining the same amount of functionality. But refactoring for the sake of refactoring (eg, "make the code OO") is what I would generally classify as a classic newbie mistake.
The book Working Effectively with Legacy Code discusses the options you can do. In general the rule is not to change code until you have to (to fix a bug or add a feature). The book describes how to make changes when you can't add testing and how to add testing to complex code (which allows more substantial changes).
You try to refactor it, in the strict sense on the word, where you're not changing the behaviour.
The first target is usually to break up giant methods.
Given the strength of some of the adjectives you use, i.e. atrocious, antiquated and incomprehensible, I'd bin it!
If it is in such a state, like the example you give, it's probably not got any test code for it either. Refactoring is mentioned in many of the other answers but, sometimes, it is not appropriate. I always find that, when refactoring, you generally need a clear path through which the old code can be gradually morphed into the new in a number of well defined steps.
When the old code is so far removed from how you want it to look, such as the extreme cases you seem to be suggesting, you could probably redesign, rewrite and test the new code in a shorter time than it would to take to refactor it.
Scrap it and start over, using the compiled legacy application as a business requirements document.
And spending time in analysis with the users to see what they want changed.
Post it to www.worsethanfailure.com!!!
If no modifications are needed, I don't touch it.
If at all possible, I write automated unit tests first, especially focused on the areas that need modification.
If automated unit tests are not possible, I do what I can to document manual unit tests.
I am just using the tests to document "current" behavior at this point.
If possible, I always keep a version of the code and executable environment that runs things the "original" way (before I touched it) so I can always add new "behavior documentation" tests and better detect regressions I may have caused later.
Once I start changing things, I want to be very careful not to introduce regressions. I do this by continually rerunning (and or adding new tests) to the tests I wrote before I started writing code.
When possible, I leave bugs as-is if there is no business need for them to be fixed. Those bugs may be "features" to some users and may have unclear side effects that wouldn't be clear until the code was redeployed to production.
As far as refactoring, I do that as aggressively as possible, but only in the code that I need to change otherwise anyway. I may refactor more aggressively in my own personal copy of the code that will never be checked in, just to improve the readability of the code for me personally. It's often times difficult to properly test changes that are only made for readability reasons, so for safety reasons, I generally don't check those changes in / deploy them unless I can confidently test that the code changes are completely safe (it's really bad to introduce bugs when you are making changes that are unnecessary for anything but readability).
Really, it's a risk management problem. Proceed with caution. The users do not care if the code is atrocious, they just care that it gets better without getting worse. Your need for beautiful code is not important in this scenario, get past it.
Just like any other code, you leave it slightly better when you leave it than it was when you entered it. You do not ever, ever rewrite the whole code. If that is the work it takes for some reason, then you start a project (small or large) for it.
I am assuming we are talking about a substantial amount of code here.
Not every day is a great day at work you know :)
The first question to ask is: does it work?
If the answer is yes, that would be a huge disincentive to simply ditch it and start over. There may be thousands of man-hours in that code which address edge cases and nasty bugs. Worse yet, there may be other modules in the system that depend on the current incorrect (but known and possibly documented) behavior. Don't mess with it if it isn't broken.
If you are keen on cleaning it up, start by writing test cases for the current behavior. When you run across an instance where the behavior differs from the specification, you must decide whether to accept the behavior as "correct" or go with what the spec say it ought to do.
Only once you have written test cases that all pass should you begin to refactor. The tests will tell you whether your efforts are breaking anything.
I'd talk to my manager and describe the code. Most managers would not want a program held together by banding wire and duct tape per se. If the code is really that bad there are sure to be some business logic errors, hardcoding etc. stuffed in there that will eventually just destroy productivity.
I've come across some pretty bad code before (single letter variable names, no comments, everything crammed onto one line, etc.) and once I mentioned/showed it to my manager they almost always said "go ahead and re-write it", because not only are you taking the hit for reading and changing the code but future co-workers will have to go through the same pain. Better that you take a longer period of time just once to rewrite it rather than having each person who touches the code in the future have to go through and comprehend and decipher it first.
There is an old saying. If is isn't broke, do not fix it. If you have to maintain it then reverse engineer it and document it so the next time you come across it you will know what it does.
You do not know the situation the developer was in when he or she wrote the code. He or she may have been under a time crunch when it was written, (management was all over the developer, etc)
There are also situations where he or she wrote the code per the spec, The spec then changed several times, the developer had to patch the code, as rewrite is out of the question due to time constraints. This happens all of the time.
If the code impacts the performance of robustness of the application and is modular then you can re factor or re-write. Document the situation to assist future programmers in understanding.
Also many programmers consider reverse engineering other developers code as beneath them.
they would rather rewrite without considering the ramifications of doing so.
If you have never done so, try it sometime, it will make you a better developer.
Thanks
Joe
Kill it with fire.
Depends on your time frame and how important that code is to you. If you have to "just make it work" then do that and rewrite the module when time allows.
If its an important or integral part of what you do then refactor refactor refactor.
Then find the guy/girl who wrote it and send them a rude postcard!
The worst offender (in my experience) of really AWFUL code is the ease with which people can do cut & paste these days. Cut & paste should be used rarely. If you think that's the right solution, it's generally better to step back and generalize the problem a little.
Anytime you see code that is "nearly incomprehensible", PROCEED WITH CAUTION. You need to assume that any major re-factoring will result in new bugs being introduced that you'll need to find and correct.
Additionally, I've seen this scenario many times (even fell victim to it myself once or twice): Programmer inherits legacy code, decides code is ancient & unmaintainable and decides to refactor it, ends up deleting key "fixes" or "business rules" subtly patched in over the years, ends up spending a lot of time tracking down and re-introducing similar code when users complain about "a problem fixed years ago is happening again".
Re-factoring (and debugging) almost always takes longer than expected and should never be considered as a "freebie" that comes along with whatever task you're supposed to be doing.
"If it ain't broke, don't 'fix' it" still has a lot of truth.
Im my company we always Refactor Mercilessly. so we still come across atrocious code but LESS and Less and less ...
We write a lot of in-house code and the company is run for about 100 years by the same family. Management usually tells us we have to maintain the code base (evolve) for another 50 years or so. In this setting having code you don't dare to touch is considered a bigger risk to the long term survival of the company then the prospect of downtime because some under-tested code broke because of refactoring.
I run copy-paste detector and findbugs on all legacy code that comes my way.
I then plan my initial refactoring:
remove unused code, unused variable and unused methods
refactor duplicated code
set up a single step build
build a basic functional test
By that point the code meets the basic minimum for maintainability. It can be easily built and basic errors can be found via an automated test.
I often add code like this:
log.debug("is foo null? " + (foo == null));
log.debug("is discount < raw price ? " + (foo.getDiscount() < foo.getRawPrice()));
Some of that code will be recovered for unit tests when I can refactor to it.
I've worked places where we ship that kind of code.
I try to make sense of it, make the necessary changes, and move on.
Of course, making sense of it usually involves some changes; at the very least, I move around the whitespace and line up corresponding braces in the same column like so:
if(condition){
doSomething(); }
// becomes...
if(condition)
{
doSomething();
}
I'll also often change variable names.
And very often, "the necessary changes" require refactoring. :)
Get the idea of what they're doing and the deadline to finish. A larger deadline, typically rebuild much of the code from the ground up, as I find it a very worthwhile experience to not only decipher terrible code and make it legible and document, but somewhere in your brain those neurons are pressed to avoid similar mistakes in the future.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
Do you do it when you’re in the code doing something else?
When your manager approves it? (Seems this never happens)
I guess some of this depends on the impact of the changes. If I change the code and it affects nothing outside of the class, to me that is low impact.
What does it become a design change? When it effect X object or X projects?
I’m just curious how others teams tackle this...
As part of original development (red/green/refactor)
When suggested by a code reviewer
When we've noticed a design pain-point
When making another change, if the refactoring is low impact, i.e. typically not affecting any other files.
If it affects the public API, I generally like to make the refactoring a single source code commit which doesn't change behaviour (and then build new behaviour into another commit). If it affects other projects too, there needs to be consensus over it and I would want to get permission to change their code to go in the same refactoring commit.
I find I refactor when revisiting code (presumably to add/extend functionality) more than 3 months after it was written.
If it takes me more than 2 minutes to discern what a chunk of code is doing, I'll break it apart to make it more immediately understandable (or just add some more comments.)
as soon as all of the tests run.
I work in a large system, so I only change things I have to. It is easy to have bad side effects to changes.
I will refactor sections of code that are performing poorly, not working properly, or needs new functionality.
I never just decide to fix things, I would never be done. if it works, and no one is asking for changes or complaining about problems, move on. life is too short to fix everything.
I often refactor my code when there is a user requirement change or bug fixes. Then there will be a chance for people to review your changes.
Otherwise, I normally don't touch the workable code even it smells.
We found small refactorings are best done while we were working on a bit of code - do what's required, preferably paired.
For bigger things, we had a Technical Debt section on the wall - if you spotted something and didn't have the time to address it, or it was going to take some discussion to solve, you'd add it to the wall and they would be scheduled for future iterations (or when free time cropped up).
Refactoring while you're already in the code is sometimes easiest, especially if your manager does not support the initiative, but if you only change a small part it will break consistency with surrounding parts. In these cases it's better to be selective and, as you suggested, do things that are low-impact. It may also be helpful to refactor long select/switch statements into functions and delay on refactoring the inner code until sometime later.
At a previous job, I was the manager, so I refactored whenever I wanted. At my current job, I'm an analyst so most of the code is not directly my responsibility. When I do write code, I avoid impacting anything that I'm not writing. I have one project which is entirely under my own control and I refactor any time I learn a better way to do something.
We refactor as often as we can. Having unit tests to ensure that everything works pre- and post- refactoring really helps.
Code review processes often help with this. If I touch some code, it gets reviewed, reviewer asks, "why did you do it this way?", I say, "I had to because of (insert ugliness here)". This is a sign that the code should be refactored right after the review is done.
To look at our company, we have decided that our upcoming application release is mostly dedicated to performance optimizations rather than new functionality. This was something we felt was needed and also was requested by some clients.
Therefore we have spent a lot of time identifying performance bottlenecks in our app and reviewing code and refactoring it to make things run faster.
So in our case we did it because management approved us doing it for this new release, because we showed to them how much performance improvement could be gained.
Refactor when needed:
when you need a better understanding of the code you are working on (pairing often helps here), examples are: renaming, method extraction etc.
when the current design doesn't allow for a 'clean' change: at this point you can actually argue with your manager on a value basis (e.g. what is this new feature worth to the project)
I am always making small refactorings in my code. I know as long as I have my unit tests to verify that everything is still functioning properly afterward, I see no harm in doing it as I go. That way you don't get that vague "needs refactoring" feeling every time you work on it.
Now if it requires a large refactoring, it's best to plan for that and set aside some time.
Seems most other posters are resistant to refacotring mercilessly. Of course this isn't possible if the system you're working on doesn't support this through extensive unit tests. But in general, If I can see an opportunity to make the code tighter without spending more than a few minutes or hours at most, I go for it. If I'm not sure what I should be working on, I look for something to refactor.
I refactor when I'm fixing a bug or adding a feature and the process of refactoring makes the code easier to read and easier to maintain.
Following DRY principles vehemently will often be a trigger for me to refactor.
Insufficiently often, thus building up technical debt.
Sad, but so.
Do as I say, not as the team I work on does.
When inheriting applications at a new job do you tend to stick to the original developers coding practices or do you start applying your own?
I work in a small shop with no guidelines and always wondered what the rule was here. Some applications are written very well but do not follow the standards I use (variable names etc...) and I do not want to "dirty" them up. I find my self taking a little extra time being consistent.
Others are written very poorly and it looks like the developer was changing his mind every keystroke...
ADDITIONAL THOUGHT
What about when I start my own projects? So now I have introduced a new coding standard to the mix:
The good code - but not my style
The bad code with bad practices and lack of standards
My own standards
If there are standards evident in the code, you should stick to them. If there aren't, start introducing your own.
If there are multiple developers who work on the same module, don't change the style.
If you will hand it off to another developer in the near future (this role is temporary), don't change the style.
If you are taking complete, exclusive, permanent ownership of the module, change it, but follow these rules:
One change at a time.
Fix all indentation to your liking at once, and commit that change.
Fix all brace placement to your liking at once, and commit that change.
Fix all other formatting to your liking at once, and commit that change.
Fix all naming to your liking at once, and commit that change.
Don't spend a lot of time on it.
If it takes more than an hour or two, then cut back.
Make the commit description clear.
So you can quickly ignore these changes when analyzing change history.
Use automated tools
to make sure the result is consistent and complete, so you don't have to mess with it again.
Run your tests
Just because your changes shouldn't affect behavior doesn't mean they won't. (Triple negative, ouch!)
Make sure everyone knows what you're doing
Someone might have a change hanging around that they want to commit now, and it'll be painful to merge with your changes. Also, you don't want anyone to get surprised and go tell your boss before you do.
Don't do it again
This is a one-time thing.
Publish a style guide that follows best practices, and build consensus around following it. Refactor old code as you need to maintain it.
I'm in the same boat as you. Lone developer who inherited some apps from the last guy. I
I've been sticking to what appear to be his standards for existing projects for consistency, and using my own preferences for new stuff.
I've noticed that most people think whoever came before them had no idea how to write code. Then whoever comes after them thinks the same thing. Some things are common sense, but most things are just personal preference.
For major problems, i.e. using comments v.s. not using comments, updating the code will probably make it easier to work with, and easier for anyone else to work with. Even then, your time is probably best-spent updating the code as you come across it, instead of embarking on a huge project to refactor everything (introducing new problems in the process).
For things like indentation, line spacing, variable names, one-line ifs v.s. multi-line ifs, the reality is that your coding style is likely just as bad as you think theirs is.
I think it depends on what you mean by "coding practices". If you mean things like code formatting and naming conventions and things that I would personally consider "cosmetic", then stick with whats already there. If you mean things like coding best prcatices and writing code correctly in the first place, then go back and fix the problems if possible, but at the very least make your new code follow best practices.
Given that most of the applications I've inherited have been hacked together by "cowboy coders" who didn't apply even the most basic of coding practices, my opinion is a little biased.
I say introduce coding standards if there are none or the ones that exist are blatantly wrong and/or stupid (e.g. "All variables must be no more than 4 characters in length", "Every database column is varchar(255) null", etc.). Obviously if you have a team then you'll need to come to an agreement as to what practices to implement, but if you're a solo dev then you have free reign and IMO you should introduce order to the chaos.
If the code works, and seems to have had a clean format. Don't waste time changing the style.
If the code is badly written. By all means change it when you have some down time, or the next time you work on the project.
For new projects do them your way, since there is no standard. As with the other well written programs yours should be easy enough to maintain.
composition is often preferred over inheritance
:-P
If it's just you, go for it. If it's a team, especially if any of the original developers are still around (or likely to be called in for consulting), keep with the existing style and practices as much as you can. Don't follow them down a rat hole - if you think they're doing something stupid, change it, but if it's just a stylistic thing, keep to their style as much as you can.
On several jobs I've been on, we had no rule on coding style other than "if you're making changes to an existing file/class, use the existing style, even for new code."
I follow company standards if there are any.
If there aren't any and the changes are small, I adopt to the used style of coding.
If there are larger changes to be made and I don't like the coder's style, I will use my own.
And if the existing code is bad I will change that too.
Will you ever have a better opportunity to update existing code with a standard style? Probably not. When you are new to the code you are going to have the best chance of taking some extra time to make non-new-feature and non-bug-fix changes. The lack of standards may be discouraging but you are unlikely to have a better chance to standardize than when you first inherit the code.
It sounds like we're talking about a situation with no official style guides / best practices. In that case, as Sean said, I'd take the lead on establishing some. But... if at all possible, pick an existing, widely-used standard. It's more likely to be accepted, all the arguments are done with, and the odds of out-of-the-box tool support (editors, code review tools, etc.) greatly increase.
Getting others to adopt it will often work best from the bottom up -- write new code to the new standards, mention to others that you've done so, ask for feedback. Much easier than trying to get approval and buy-in in advance.
Within the existing, ugly project, avoid wholesale changes to existing modules. For one thing, diffs and version control will get quite confusing if a file is suddenly reindented.
If the chunk you're working on is so bad as to be unreadable, I'd do an initial checkin just to reformat it; follow that up with actual code changes.
I would apply the same refactoring standards to the code as I would if it DID match my style standards. That is, I'd ignore the style and just go on about my business.
If it's not terribly difficult to follow the style that is in the code - with regards to naming conventions, I'd go ahead and use those for new code.
However, I wouldn't bother trying to follow stuff like 'tabs should not be used', 'every line should be indented 2 spaces', etc. There are plenty of editors out there where you can 'pretty' the code whenever you need it these days.
G-Man
I think it depends highly on the specific case.
If you are a consultant on a project for a short time you should stick to the way thing are.
If you are on for a long time. Try to refactor bad code into your own scheme.
If you are on for a short time but you are working on an isolated module, then use your own scheme.
Short answer is, "It depends." Here are a few factors that I'd consider important in determining whether to keep the old style or not:
1) Scope of changes. If it is close to a total re-write of the application, then it may make more sense to put in a new standard if you have one that you feel works well for you.
2) Likelihood of future changes. Will this be changed over and over again? If so, then taking some time early on may well be worth it in the end. This does require a bit of judgement and predicting the future, but it may be easy in some cases to see that there will be changes over and over again for some systems that are fairly complex.
3) How much of the code is a customization on a 3rd party codebase, e.g. a company's specific customizations of Oracle products for their business processes, compared to a completely home grown application. The impact here is that when new versions are relased and an upgrade is requested, how much pain may there be on what breaks since it was customized so much.
When starting your own projects, put in the best standard that you know.
If I inherit code that has obviously never been refactored, I would take that as an opportunity to impose some of my own structure.
If people expect me to make time and cost estimates for adding functionality to the code, I'll need to be intimately familiar it, and make sure it lives up to my standards.
If the code is already well-written, that would be a blessing that I would not mess with. But in my experience, this hasn't happened very often.