General Debug Efficiency - debugging

I wanted to know which one of these would be more time-efficient when implementing new features/whatever in a library:
Write it little by little, perfecting each stage
Write the entire feature, and debug the whole thing at once.
I just wanted some thoughts on this. Thanks :)

It is generally advised to do incremental coding ( your first option ) so that your are sure that at least some part of your feature is working before you move on.
Debugging everything at once will be a big headache as you will not know which part of the code the error is in.
In the long run , the first way will definitely be faster than the second one.

Related

Advice on working with legacy code

I need some advice on how to work with legacy code.
A while ago, I was given the task to add a few reports to a reporting app. written in Struts 1, back in 2005. No big deal, but the code is quite messy. No usage of Action forms, and basically the code is one huge action, and a lot of if-else statements inside. Also, no one here has functional knowledge on this. We just happened to have it in our contract.
I'm quite unhappy about this, and not sure how to proceed. This application is invisible: Few people (but all very important) use it, so they don't care whether my eyes bleed while reading the code, standards, etc.
However, I feel that a technical debt is to be paid. How should I proceed on this? Continue down the if-else road, or try to do this requirement the right way, ignoring the rest of the project? Starting a huge refactor, risking my deadline?
Legacy code is a big issue, and I'm sure people will not agree!
I would say that starting a big re-factor could be a mistake.
A big re-factor means doing a lot of work to make it function exactly the way that it does now. If you choose to take this on on your own, there won't be a lot of visibility of what you are doing. If it works, no one will know the hours of work you put it. If it does NOT work, and you end up with tidy code, but add some bugs (and who has ever written code without adding some bugs) then you will get 'why did this change' type questions.
I have currently nearly completed a project working on a 10 year old code base. We have done quite a few bits of re-factoring along the way. But for each re-factor we have made we can justify 'this specific change will make the actual task we are doing now easier'. Rather than 'this is now cleaner for future work'. We have found that as we worked on the code, fixing the issues that we actually come up against one at a time, we have cleaned up a lot of it, without breaking it (much).
And I would say before you can re-factor much, you will need automated tests, so you can be fairly happy that you have put it back together right!
Most re-factoring is done to 'make maintenance and future development easier'. Your project sounds like there is not a lot of future development coming. That limits the advantage a re-factor will give the company.
Rule #1: If it ain't broke, don't fix it.
Rule #2: When in doubt, reread rule #1.
Unfortunately, legacy code can very rarely be described as "it ain't broke." Therefor we must tweak the existing code to correct a newly found bug, tweak the existing code to modify behavior that was previously acceptable, or tweak the existing code to add new functionality.
My experience has taught me that any refactoring must be done in 'infinitesimally' small increments. If you must break rule #2, I suggest that you start your search with the inner-most nested loop or IF structure and expand outward until you find a clean, logical separation point and create a new function/method/subroutine that contains only the guts of that loop or structure. This won't make anything more efficient but it should give you a clearer view of the underlying logic and structure. Once you have several new, smaller functions/methods/subroutines you can refactor and consolidate those into something more manageable.
Rule #3: Ignore my previous paragraph and reread the first two rules.
I agree with other comments. If you don't have to, then don't do it. It usually cost far more then it's worth if the code base is more or less dead any way.
On the other hand, if you feel that you cannot get your head around the code then a refactor is probably unavoidable. If this is the case then, since it's a web application, can you create a solid suite of functional tests using selenium? If so this is the fastest and most rewarding test approach for such code and will catch most bugs for the buck.
Second, start with the extract method refactoring to create compose methods of the big difficult methods. Every time you think to your self "This should have a comment to explain what it does" you should extract it to a method with a name that replaces the comment.
Once this has been done, if you still can't add the functionality required, you can go for more advanced refactorings, and perhaps even adding some unit tests. But I usually find that I can add what is required/fix the bug in legacy code by just creating self documenting code.
In a few words: before make any modifications to legacy code its good idea to start from automated unit tests.
This will give developer understanding about key things: dependencies this piece of code has, input data, output results, boundary conditions so on.
When it’s done most likely you will better understand what this code does and how it works.
After that its make sense (but not must) clean code a bit giving more accurate names to local variables, moving some functionality (repetitive code, if any) in to functions with clear human friendly names.
A simple clean up could make code more readable and at the same time save developer from regression issues with unit tests help.
Refactoring – make small changes, step by step, when you have a time and understanding of the requirements and functionality, regularly unit testing the code.
But do not start from refactoring

how to tackle a new project

I have a question about best practice on how to tackle a new project, any project. When starting a new project how do you go about tackling the project, do you split it into sections, start writing code, draw up flow diagrams.
I'm asking this question because I'm looking for advice on how I can start new projects so I can get going on them quicker. I can have it planned, designed and starting coding with everything worked out.
Any advice?
Thanks
Stephen
It all really depends. Is the project for controlling a space shuttle with 200+ people working on it, or is it a hobby project with 1 person.
I'm guessing this is a small project. In that case, do whatever works for you. Write a list of things you think are required. If there are parts you know you need to learn more about or research, get reading the web, try some stuff out with prototype code to see whether it works or not. Don't turn prototype code into real code though, start again with production code and make sure you get all the appropriate error handling etc in.
When you think you've got a good feel for what's needed, get coding. If you hit a point where you think it's not working, go back to the design and rethink it and sketch some more diagrams, and then go back to the code again.
It is extremely doubtful that you can work everything out in your plan and that's how things will actually work out. So, there's little point in trying to plan too far ahead because you'll be wasting time. Just plan out far enough ahead to keep yourself focused on working on the right things and so that you've given yourself a reasonable chance that the code you're working on will fit the big picture and solve the problem you're trying to solve.
Start by writing a simple functional spec, a few paragraphs from the user's perspective: what they see, how they perform actions, what they expect to happen if they click widget X. This will glue the logic together in your head, and on paper.
From there you can work on the technical spec, which details the gritty things like database structure, special controls and components you need, SDK's if any, and all other developer-type details that you need to implement.

Removing tightly coupled code

Forgive me if this is a dupe but I couldn't find anything that hit this exact question.
I'm working with a legacy application that is very tightly coupled. We are pulling out some of the major functionality because we will be getting that functionality from an external service.
What is the best way to begin removing the now unused code? Should I just start at the extreme base, remove, and refactor my way up the stack? During lunch I'm going to go take a look at Working Effectively with Legacy Code.
If you can, and it makes sense in your problem domain, I would try to, during development, try and keep the legacy code functioning in parallel with the new API. And use the results from the legacy API to cross check that the new API is working as expected.
I think the most important thing you can do is to refactor/remove/test in very small chunks. It's tedious and time consuming but it will help limit risks and errors later on.
I would also start with code that is "low risk" to change.
My advice is to use findbugs and PMD/CPD (copy-paste-detector) to remove dead code (code that can not or will not be called) unused variables and duplicated code. Getting rid of this junk will make re-factoring easier.
Then learn the key mappings for the common re-factoring in your IDE. Extract method and introduce variable should be committed to muscle memory after an hour.
Use the primary disadvantage of tightly coupled code to... your advantage!
Step 1: Identify the area which provides the redundant functionality which you want to replace. Break it...do a quick smoke test of some of the critical parts of the application. Get the feel.
Step 2: Depending on what language it is find the relevant static-code analysis tools and get the needed refactoring info.
Step 3: Repeat Step 1 in incremental levels of narrowing down to the exact pattern.
All this of course, in a sandbox environment. This may seem a bit haphazard but if you limit yourself to critical functionality testing ... you may get many leads in the process. You will definitely identify the pattern of the legacy code, if nothing else.
You absolutely cannot do with with a live development version [new features being added]. You must start with a feature freeze.
I tend to look at all of the components of the system in an overview and see the biggest places of reuse. From there I would implement the appropriate design pattern to solve it, and make the new component reusable. Write test cases to ensure the new code works as expected, then refactor your code around the new change. Then repeat [overview, etc] till you are satisfied.
I would suggest this for many reasons:
Everyone working with you on refactoring will learn something
People learn how to avoid design mistakes down the road
Everyone working on it will get a better understanding of the code base

What do you do with atrocious code?

What do you do when you're assigned to work on code that's
atrocious and antiquated to the point where it's almost incomprehensible?
For example: hardware interface code, mixed with logic, AND user interface code, ALL in the same functions?
We see bad code all the time, but what do you actually do about it?
Do you try to refactor it?
Try to make it OO if it's not?
Or do you try to make some sense of it, make the necessary changes and move on?
Depends on a few factors for me:
Will I be maintaining this code in the future, or is it a one-off fix?
How long until this system is replaced entirely?
How busy am I at the moment?
Ideally, I'd refactor all bad code I had to maintain, but the reality is there are only so many hours in the day.
As is frequently the case, "It Depends".
I tend to ask myself some of the following questions:
Are there unit tests for the existing code?
Is refactoring the code an acceptable risk for my project?
Is the author still available to clarify any questions I might have about the code?
Will my employer consider the time spent on changing existing, functioning code to be an acceptable use of my time?
And so on...
But assuming that I have the capacity to do so, refactoring is preferential as the up front cost of fixing the code now will likely save me a lot of time and effort later in maintenance and development time.
There are other benefits as well, including the fact that the more clean and well maintained you keep your code base, the more likely other developers are to keep it that way. The Pragmatic Programmer calls this the Broken Window Theory.
Developers have an instinct to assume that code is always ugly because of other, inferior developers. Sometimes, code is ugly because the problem space is ugly. All that ugliness isn't just ugliness - it is sometimes institutional memory. Each line of ugly in your code probably represents a bug fix. So think very carefully before you rip it all out.
Basically, I would say that you shouldn't touch code like this unless you actually have to. If there's a real bug that you can solve, refactoring is reasonable, if you can be sure you're maintaining the same amount of functionality. But refactoring for the sake of refactoring (eg, "make the code OO") is what I would generally classify as a classic newbie mistake.
The book Working Effectively with Legacy Code discusses the options you can do. In general the rule is not to change code until you have to (to fix a bug or add a feature). The book describes how to make changes when you can't add testing and how to add testing to complex code (which allows more substantial changes).
You try to refactor it, in the strict sense on the word, where you're not changing the behaviour.
The first target is usually to break up giant methods.
Given the strength of some of the adjectives you use, i.e. atrocious, antiquated and incomprehensible, I'd bin it!
If it is in such a state, like the example you give, it's probably not got any test code for it either. Refactoring is mentioned in many of the other answers but, sometimes, it is not appropriate. I always find that, when refactoring, you generally need a clear path through which the old code can be gradually morphed into the new in a number of well defined steps.
When the old code is so far removed from how you want it to look, such as the extreme cases you seem to be suggesting, you could probably redesign, rewrite and test the new code in a shorter time than it would to take to refactor it.
Scrap it and start over, using the compiled legacy application as a business requirements document.
And spending time in analysis with the users to see what they want changed.
Post it to www.worsethanfailure.com!!!
If no modifications are needed, I don't touch it.
If at all possible, I write automated unit tests first, especially focused on the areas that need modification.
If automated unit tests are not possible, I do what I can to document manual unit tests.
I am just using the tests to document "current" behavior at this point.
If possible, I always keep a version of the code and executable environment that runs things the "original" way (before I touched it) so I can always add new "behavior documentation" tests and better detect regressions I may have caused later.
Once I start changing things, I want to be very careful not to introduce regressions. I do this by continually rerunning (and or adding new tests) to the tests I wrote before I started writing code.
When possible, I leave bugs as-is if there is no business need for them to be fixed. Those bugs may be "features" to some users and may have unclear side effects that wouldn't be clear until the code was redeployed to production.
As far as refactoring, I do that as aggressively as possible, but only in the code that I need to change otherwise anyway. I may refactor more aggressively in my own personal copy of the code that will never be checked in, just to improve the readability of the code for me personally. It's often times difficult to properly test changes that are only made for readability reasons, so for safety reasons, I generally don't check those changes in / deploy them unless I can confidently test that the code changes are completely safe (it's really bad to introduce bugs when you are making changes that are unnecessary for anything but readability).
Really, it's a risk management problem. Proceed with caution. The users do not care if the code is atrocious, they just care that it gets better without getting worse. Your need for beautiful code is not important in this scenario, get past it.
Just like any other code, you leave it slightly better when you leave it than it was when you entered it. You do not ever, ever rewrite the whole code. If that is the work it takes for some reason, then you start a project (small or large) for it.
I am assuming we are talking about a substantial amount of code here.
Not every day is a great day at work you know :)
The first question to ask is: does it work?
If the answer is yes, that would be a huge disincentive to simply ditch it and start over. There may be thousands of man-hours in that code which address edge cases and nasty bugs. Worse yet, there may be other modules in the system that depend on the current incorrect (but known and possibly documented) behavior. Don't mess with it if it isn't broken.
If you are keen on cleaning it up, start by writing test cases for the current behavior. When you run across an instance where the behavior differs from the specification, you must decide whether to accept the behavior as "correct" or go with what the spec say it ought to do.
Only once you have written test cases that all pass should you begin to refactor. The tests will tell you whether your efforts are breaking anything.
I'd talk to my manager and describe the code. Most managers would not want a program held together by banding wire and duct tape per se. If the code is really that bad there are sure to be some business logic errors, hardcoding etc. stuffed in there that will eventually just destroy productivity.
I've come across some pretty bad code before (single letter variable names, no comments, everything crammed onto one line, etc.) and once I mentioned/showed it to my manager they almost always said "go ahead and re-write it", because not only are you taking the hit for reading and changing the code but future co-workers will have to go through the same pain. Better that you take a longer period of time just once to rewrite it rather than having each person who touches the code in the future have to go through and comprehend and decipher it first.
There is an old saying. If is isn't broke, do not fix it. If you have to maintain it then reverse engineer it and document it so the next time you come across it you will know what it does.
You do not know the situation the developer was in when he or she wrote the code. He or she may have been under a time crunch when it was written, (management was all over the developer, etc)
There are also situations where he or she wrote the code per the spec, The spec then changed several times, the developer had to patch the code, as rewrite is out of the question due to time constraints. This happens all of the time.
If the code impacts the performance of robustness of the application and is modular then you can re factor or re-write. Document the situation to assist future programmers in understanding.
Also many programmers consider reverse engineering other developers code as beneath them.
they would rather rewrite without considering the ramifications of doing so.
If you have never done so, try it sometime, it will make you a better developer.
Thanks
Joe
Kill it with fire.
Depends on your time frame and how important that code is to you. If you have to "just make it work" then do that and rewrite the module when time allows.
If its an important or integral part of what you do then refactor refactor refactor.
Then find the guy/girl who wrote it and send them a rude postcard!
The worst offender (in my experience) of really AWFUL code is the ease with which people can do cut & paste these days. Cut & paste should be used rarely. If you think that's the right solution, it's generally better to step back and generalize the problem a little.
Anytime you see code that is "nearly incomprehensible", PROCEED WITH CAUTION. You need to assume that any major re-factoring will result in new bugs being introduced that you'll need to find and correct.
Additionally, I've seen this scenario many times (even fell victim to it myself once or twice): Programmer inherits legacy code, decides code is ancient & unmaintainable and decides to refactor it, ends up deleting key "fixes" or "business rules" subtly patched in over the years, ends up spending a lot of time tracking down and re-introducing similar code when users complain about "a problem fixed years ago is happening again".
Re-factoring (and debugging) almost always takes longer than expected and should never be considered as a "freebie" that comes along with whatever task you're supposed to be doing.
"If it ain't broke, don't 'fix' it" still has a lot of truth.
Im my company we always Refactor Mercilessly. so we still come across atrocious code but LESS and Less and less ...
We write a lot of in-house code and the company is run for about 100 years by the same family. Management usually tells us we have to maintain the code base (evolve) for another 50 years or so. In this setting having code you don't dare to touch is considered a bigger risk to the long term survival of the company then the prospect of downtime because some under-tested code broke because of refactoring.
I run copy-paste detector and findbugs on all legacy code that comes my way.
I then plan my initial refactoring:
remove unused code, unused variable and unused methods
refactor duplicated code
set up a single step build
build a basic functional test
By that point the code meets the basic minimum for maintainability. It can be easily built and basic errors can be found via an automated test.
I often add code like this:
log.debug("is foo null? " + (foo == null));
log.debug("is discount < raw price ? " + (foo.getDiscount() < foo.getRawPrice()));
Some of that code will be recovered for unit tests when I can refactor to it.
I've worked places where we ship that kind of code.
I try to make sense of it, make the necessary changes, and move on.
Of course, making sense of it usually involves some changes; at the very least, I move around the whitespace and line up corresponding braces in the same column like so:
if(condition){
doSomething(); }
// becomes...
if(condition)
{
doSomething();
}
I'll also often change variable names.
And very often, "the necessary changes" require refactoring. :)
Get the idea of what they're doing and the deadline to finish. A larger deadline, typically rebuild much of the code from the ground up, as I find it a very worthwhile experience to not only decipher terrible code and make it legible and document, but somewhere in your brain those neurons are pressed to avoid similar mistakes in the future.

What helps to you improve your ability to find a bug?

I want to know if there are method to quickly find bugs in the program.
It seems that the more you master the architecture of your software, the more quickly
you can locate the bugs.
How the programmers improve their ability to find a bug?
Logging, and unit tests. The more information you have about what happened, the easier it is to reproduce it. The more modular you can make your code, the easier it is to check that it really is misbehaving where you think it is, and then check that your fix solves the problem.
Divide and conquer. Whenever you are debugging, you should be thinking about cutting down the possible locations of the problem. Every time you run the app, you should be trying to eliminate a possible source and zero in on the actual location. This can be done with logging, with a debugger, assertions, etc.
Here's a prophylactic method after you have found a bug: I find it really helpful to take a minute and think about the bug.
What was the bug exactly in essence.
Why did it occur.
Could you have found it earlier, easier.
Anything else you learned from the bug.
I find taking a minute to think about these things will make it far less likely that you will produce the same bug in the future.
I will assume you mean logic bugs. The best way I have found to capture logic bugs is to implement some sort of testing scheme. Check out jUnit as the standard. Pretty much you define a set of accepted outputs of your methods. Every time you compile your system it checks all of your test cases. If you have introduced new logic that breaks your tests, you will know about it instantly and know exactly what you have to fix.
Test driven design is a pretty big movement in programming right now. You will be hard pressed to find a language that doesn't support some kind of testing. Even JavaScript has a multitude of test suites.
Experience makes you a better debugger. Pay close attention to the bugs that you AND others commonly make. Try to figure out if/how these bugs apply to ALL code that affects you, not the single instance of where the bug was seen.
Raymond Chen is famous for his powers of psychic debugging.
Most of what looks like psychic
debugging is really just knowing what
people tend to get wrong.
That means that you don't necessarily have to be intimately familiar with the architecture / system. You just need enough knowledge to understand the types of bugs that apply and are easy to make.
I personally take the approach of thinking about where the bug may be in the code before actually opening up the code and taking a look. When you first start with this approach, it may not actually work very well, especially if you are pretty unfamiliar with the code base. However, over time someone will be able to tell you the behavior they are experiencing and you'll have a good idea where the problem is located or you may even know what to fix in the code to remedy the problem before even looking at the code.
I was on a project for several years that maintained by a vendor. They were not very good debuggers and most of the time it was up to us to point them to an area of the code that had the problem. What made our problem worse was that we didn't have a nice way to view the source code, so a lot of our "debugging" was just feeling.
Error checking and reporting. The #1 newbie coder debugging mistake is to turn off error reporting, avoid checking for whether what's going on makes sense, etc etc. In general, people feel like if they can't see anything going wrong then nothing is going wrong. Which of course could not be further from the case.
Instead, your code should be chock full of error conditions that will make lots of noise, with detailed reporting, someplace you will see it. (This doesn't mean inside a production web page.) Then, instead of having to trace an error all over the place because it got passed through sixteen layers of execution before it finally got someplace that broke, your errors start happening proximately to the actual issue.
It seems that the more you master the
architecture of your software ,the
more quickly you can locate the bugs.
After understanding the architecture, one's ability to find bugs in the application increases with their ability to identify and write extensive tests.
Know your tools.
Make sure that you know how to use conditional breakpoints and watches in your debugger.
Use static analysis tools as well - they can point out the more obvious issues.
Sleep and rest.
Use programming methods that produce fewer bugs in the first place.
If to implement a single stand-alone functional requirement it takes N separate point-edits to source code, the number of bugs put into the code is roughly proportional to N, so find programming methods that minimize N. Ways to do this: DRY (don't repeat yourself), code generation, and DSL (domain-specific-language).
Where bugs are likely, have unit tests.
Obviously.IMHO, the best unit tests are monte-carlo.
Make intermediate results visible.
For example, compilers have intermediate representations, in the form of 4-tuples. If there is a bug, the intermediate code can be examined. That tells if the bug is in the first or second half of the compiler.
P.S. Most programmers are not aware that they have a choice of how much data structure to use. The less data structure you use, the less are the chances for bugs (and performance issues) caused by it.
I find tracepoints to be an invaluable debugging tool. They are a bit like logging, except you create them during a debugging session to solve a particular issue, like breakpoints.
Printing the stacktrace in a tracepoint can be especially useful. For example, you can print the hash code and stacktrace in the constructor of an object, and then later on when the object is used again you can search for its hashcode to see which client code created it. Same for seeing who disposed it or called a certain method etc.
They are also great for debugging issues related to window focus changes etc, where the debugger would interfere if you drop in break mode.
Static code tools like FindBugs
Assertions, assertions, and assertions.
Some areas of our code has 4 or 5 assertions for each line of real code. When we get a bug report the first thing that happens is that the customer data is processed in our debug build 99 times out a hundred an assert will fire near the cause of the bug.
Additionally our debug build perform redundant calculations to ensure that an optimized algorithm is returning the correct result, and also debug functions are used to examine the sanity of data structures.
The hardest thing new developers have to contend with is getting their code to survive the assertions of the code gthey are calling.
Additionally we do not allow any code to be putback to toplevel that causes any integration or unit test to fail.
Stepping through the code, examining flow/state where unexpected behavior is occurring. (Then develop a test for it, of course).
Writing Debug.Write(message) in your code and using DebugView is another option. And then run your application find out what is going on.
"Architecture" in software means something like:
Several components
The components interact across clearly-defined interfaces
Each component has a well-defined responsibility
The responsibility of one component is unlike the responsibilities of other components
So, as you said, the better the architecture the easier it is to find bugs.
First: knowing the bug, you can decide which functionality is broken, and therefore know which component implements that functionality. For example, if the bug is that something isn't being logged properly, therefore this bug should be in one of 3 places:
In the component that's responsible for logging (your logging library)
Or, above that in the application code which is using this library
Or, below that in the system code which this library is using
Second: examine the data transfered across the interfaces between components. To continue the previous example above:
Set a debugger breakpoint on the application code which invokes the logger API, to verify whether the logger API is being used correctly (e.g. whether it's being invoked at all, whether parameters are as-expected, etc.).
Doing this tells you whether the bug is in the component above this interface, or in the component that's below this interface.
Repeat (perhaps using binary search if the call stack is very deep) until you've found which component is at fault.
When you come to the point that you think there must be a bug in the OS, check your assertions -- and put them into the code with "assert" statements.
Conversely, as you are writing the code, think of the range of valid inputs for your algorithms and put in assertions to make sure you have what you think you have. Same goes for output: Check that you produced what you think you produced.
E.g. if you expect a non-empty list:
l = getList(input)
assert l, "List was empty for input: %s" % str(input)
I'm part of the QA team # work, and knowing anything about the product and how it is developed, helps a lot in finding bugs, also when I make new QA tools I pass it to our dev team to test it, finding bugs in your own code is just plain hard!
Some people say programmers are tainted, so we cannot see bugs in their own product; we are not talking about code here, we are beyond that, usability and functionality itself.
Meanwhile unit testing seams to be a nice solution to find bugs in your own code, its totally pointless if you're wrong even before writing the unit test, how are you going to find the bugs then? you don't!, let your co-worker find them, hire a QA guy.
Scientific debugging is what I always used, and it greatly helps.
Basically, if you can replicate a bug, you can track its origin. You should then experiment some tests, observe the results, and infer hypotheses on why the bug happens.
Writing about all your hypotheses, attempts, expected results and observed results can help you track down the bugs, particularly if they're nasty.
There are automated tools that can help you with that process, particularly git-bisect (and similar bisection tools on other revision systems) to quickly find which change introduced the bug, unit testing to reproduce a bug and prevent regressions in your code (can be used in combination with bisect), and delta debugging to find the culprit in your code (similar to git-bisect but whereas git-bisect works on the code history, delta debugging works on the code directly).
But whatever the tools you are using, the most important benefit is in the scientific methodology, as this is the formalization of what most experienced debuggers do.

Resources