Writing fool-proof applications vs. writing for performance

I'm biased towards writing fool-proof applications. For example, with a PHP site, I validate all the inputs on the client side using JS, then validate again on the server side. On both sides I check for emptiness and for patterns (email, phone, URL, number, etc.). Then, server-side, I strip malicious tags or characters and trim the values. Later I convert the input into the desired formats/data types (string, int, float, etc.). If the library is meant for server-side use only, I even give developers a chance at graceful degradation: I tolerate the worst inputs and normalize them to acceptable ones (I have a predefined set of acceptable values).
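Roughly, the server-side half of that pipeline looks like this (a minimal Python sketch of the idea only -- my real code is PHP, and the names and rules here are made up):

    import html
    import re

    # Hypothetical validate -> sanitize -> normalize pipeline for one field.
    EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def clean_email(raw):
        """Validate, sanitize, and normalize one input field server-side."""
        if raw is None or not raw.strip():
            raise ValueError("email is required")          # emptiness check
        value = html.escape(raw.strip())                   # neutralize markup, trim whitespace
        value = value.lower()                              # normalize to canonical form
        if not EMAIL_RE.match(value):
            raise ValueError("not a valid email address")  # pattern check
        return value                                       # known-good str from here on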
Now I'm rereading a library that I wrote a year and a half ago, and I'm wondering: are developers really so careless (or malicious) that I need this much graceful degradation, hunting for every possible way to make their input work even when it's garbage, at a serious cost to performance? Or should I do minimal checking and expect developers to be able and willing to provide proper input? I have no hope for end users, but should I trust developers more and give them an application/library with better performance?

Common policy is to validate on the server anything sent from the client because you can't be totally sure it really was your client that sent it. You don't want to "trust developers more" and in the process find that you've "trusted hackers of your site more".
Fixing invalid input automatically can be as much a curse as a blessing -- you've essentially committed to accepting the invalid input as a valid part of your protocol (i.e., if a future version makes a change that breaks the invalid input you were silently correcting, it is no longer backwards compatible with the client code that has been written). In extremis, you might paint yourself into a corner that way. Also, invalid calls tend to propagate into new code -- people often copy-and-paste example code and then modify it to meet their needs. If they've copied bad code that you've been correcting at the server, you might find you start getting proportionally more and more bad data coming in, as well as confusing new programmers who think "that just doesn't look like it should be right, but it's the example everyone is using -- maybe I don't understand this after all".

Never expect diligence from developers. Always validate, if you can, any input that comes into your code, especially if it comes across a network.

End users (whether they're programmers using your tool, or non-programmers using your application) don't have to be stupid or evil to type the wrong thing in. As programmers we all too often make wrong assumptions about what's obvious for them.
That's the first thing, which justifies comprehensive validation all on its own. But validation isn't the same as guessing what they meant from what they typed, and inferring correct input from incorrect - unless the inference rules are also well known to the users (like Word's auto-correct, for instance).
But what is this performance you seek? There's no bit of client-side (or server-side, for that matter) validation that takes longer to run than the second or so that is an acceptable response time.
Validate, and ensure it doesn't break as the first priority. Then worry about making it clever enough to know (reliably) what they meant. After that, worry about how fast it is. In the real world, syntax validation doesn't make a measurable difference to anything where user input takes most of the total time.

Microsoft made the mistake of trusting programmers to do the right thing back in the days of Windows 3.1 and to a lesser extent Windows 95. You need only read a few posts from Raymond Chen to see where that road ultimately leads.
(P.S. This is not a dig against Microsoft -- it's a statement of fact about how programmers abused the more liberal Win16, either deliberately or through ignorance.)

I think you are right in being biased toward fool-proof applications. I would not assume that that degrades performance enough to be of much concern. Rather I would address performance concerns separately, starting by profiling or my favorite method, stackshots. There must be a way to get those in PHP.

How can I avoid people using my code for evil? [closed]

I'm not sure if this is quite the right place, but it seems like a decent place to ask.
My current job involves manual analysis of large data sets (at several levels, each more refined and done by increasingly experienced analysts). About a year ago, I started developing some utilities to track analyst performance by comparing results at earlier levels to final levels. At first, this worked quite well - we used it in-shop as a simple indicator to help focus training efforts and do a better job overall.
Recently though, the results have been taken out of context and used in a way I never intended. It seems management (one person in particular) has started using the results of these tools to directly affect EPRs (enlisted performance reports -- it's an Air Force thing, but I assume something similar exists in other areas) and similar paperwork. The problem isn't who is using these results, but how. I've made it clear to everyone that the results are, quite simply, error-prone.
There are numerous unavoidable obstacles to generating this data, which I have worked to minimize with some nifty heuristics and such. Taken in the proper context, they're a useful tool. Out of context however, as they are now being used, they do more harm than good.
The manager(s) in question are taking the results as literal indicators of whether an analyst is performing well or poorly. The results are being averaged and individual scores are being ranked as above (good) or below (bad) average. This is being done with no regard for inherent margins of error and sample bias, with no regard for any sort of proper interpretation. I know of at least one person whose performance rating was marked down for an 'accuracy percentage' less than one percentage point below average (when the typical margin of error from the calculation method alone is around two to three percent).
I'm in the process of writing a formal report on the errors present in the system ("Beginner's Guide to Meaningful Statistical Analysis" included), but all signs point to this having no effect.
Short of deliberately breaking the tools (a route I'd prefer avoiding but am strongly considering under the circumstances), I'm wondering if anyone here has effectively dealt with similar situations before? Any insight into how to approach this would be greatly appreciated.
Update:
Thanks for the responses - plenty of good ideas all around.
If anyone is curious, I'm moving in the direction of 'refine, educate, and take control of interpretation'. I've started rebuilding my tools to negate or track error better and to automatically generate any numbers and graphs they could want, with documentation included throughout (while relegating to obscure references the raw data they currently seem so eager to import into their 'magical' Excel sheets).
In particular, I'm hopeful that visual representations of error and properly created ranking systems (taking into account error, standard deviations, etc.) will help the situation.
Either modify the output to include error information (so if the error is +/- 5%, don't output 22%, output 17% - 27%), or educate those against whom it is being used about the error, so that they can defend themselves when it is used against them.
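That interval output can be as simple as this (a Python sketch; the +/- 5% margin is just the assumed figure from above):

    def as_interval(value_pct, margin_pct=5.0):
        """Report a score as a range instead of a false-precision point value."""
        return f"{value_pct - margin_pct:.0f}% - {value_pct + margin_pct:.0f}%"

    print(as_interval(22))  # prints "17% - 27%" rather than a misleading "22%"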
Well, you seem to have run afoul of the Law of Unintended Consequences in the context of human behavior.
Unfortunately, once the cat is out of the bag, it's pretty hard to put back in. You have a few options (which are not mutually exclusive, by the way) to consider, including:
Alter the reports so that their data can no longer be abused in the way you describe.
Work with management to help them understand why their use of your data is improper or misleading.
Work with those whose performance is being measured to pressure management to rethink their policy on the matter.
Work with management/analysts to come up with a viable means to measure performance in a way that is fair to everyone.
Break the reports in a manner that makes them unusable for any purpose.
Clearly there is a desire on the part of management to get analytics on performance of analysts. Likely there is a real need for this ... and your reports happened to fill a void in the available information. The best option for everyone would be to find a way to effectively and fairly fill this need. There are many possible ways to achieve this - from dropping dense rankings in favor of performance tiers to using time-over-time variance to refine performance measurements.
Now, it's entirely possible that the existing reports you've provided simply cannot be applied in a fair and accurate manner to address this problem. In which case, you should work with your management team to make sure they understand why this is the case - and either redefine the way performance is measured or take the time to develop an appropriate and fair methodology.
One of the strongest means to convince management that their (ab)use of the data in your report is unwise is to remind them of the concept of perverse incentives. It's entirely possible that over time, analysts will modify their behavior in a way that results in higher rankings in performance reports at the cost of real performance or quality of results that are not otherwise captured or expressed. You seem to have a good understanding of your domain - so I would hope that you could provide specific and dramatic examples of such consequences to help make your case.
All you can do is to try and educate the managers as to why what they're doing is incorrect.
Beyond that, you can't stop idiots from being idiotic, and you'll just go mad trying.
I definitely wouldn't "break" code that people are relying on, even if it's not a specific deliverable. That will only cause them to complain about you, a move which may affect your own EPR :-)
I really think the key here is good communication with your managers.
Besides, I like PatrickV's idea. You could also try some other ways to engineer your tool around the problem so that it seems silly, or is hard, to use it as a performance measurement: change the names of the statistics to mean something other than "how good programmer X is", make it hard to get per-person data, show error statistics.
You can also try to display the data in another way (this may actually make your managers think you are trying to help them). Show a graph -- a difference of several pixels in position is harder to fixate on than a numeric result (my guess: your managers are using Excel and coloring everything below average red). Draw the error margin so it doesn't make sense to obsess over fractions of percentages.
Give the result as a range -- a low and a high bound that take your error information into account; ranges are harder to compare obsessively.
Edit: Oh yeah, and read about "social interfaces". You can start with Spolsky's Not Just Usability and Building Communities with Software.
I would echo #paxdiablo's advice, as a first step:
Work on the report on the inherent errors. In fact, make it the introduction to every copy generated.
When you refer to the measurement errors, indicate they are the lower limit of the errors (unless there actually aren't any).
Try to educate the manager(s) in the error of his/her ways.
If possible, discuss the issue with your manager, and perhaps with the offending managers' management; depending on how familiar you are with them, you'd probably limit it to expressing some concerns and giving a heads-up.
Consult your HR department, or whoever is in charge of fairness in performance reviews.
Good luck.
The problem is that the code is not yours, it belongs to your company. They really can do whatever they want with it.
I hate to say this, but if you have an issue with the ethics of your company you will have to leave that company.
One thing you could do is implement the comparison yourself. If he really wants to check whether somebody is performing significantly worse than the rest, it should be tested formally as well.
Now, choosing the right test is a bit tricky without knowing the data and their structure, so I can't really advise you on that one. Just take into account that if you do pairwise comparisons, or compare multiple scores against an average, you run into the multiple-testing problem. A classic way of correcting for it is the Bonferroni correction. If you implement that one, you can be sure that at a certain point no one will jump out any more: Bonferroni is very conservative. Another option is the Dunn-Sidak correction, which is supposed to be less conservative.
The correct implementation would be an ANOVA (if the assumptions are met and the data are suitable, of course) with a post-hoc comparison like Tukey's Honest Significant Difference test. That way at least the uncertainty in the results is taken into account.
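A rough sketch of that approach in Python (the per-analyst score lists are hypothetical; scipy supplies the omnibus test, and the two corrections are simple formulas):

    from itertools import combinations
    from scipy import stats

    # Hypothetical per-analyst accuracy scores.
    scores = {"a1": [91, 88, 93], "a2": [90, 87, 92], "a3": [84, 82, 85]}

    # Omnibus test first: is there any difference between analysts at all?
    f_stat, p_value = stats.f_oneway(*scores.values())

    # For pairwise comparisons, correct the significance level for m tests.
    alpha = 0.05
    m = len(list(combinations(scores, 2)))
    bonferroni_alpha = alpha / m              # very conservative
    sidak_alpha = 1 - (1 - alpha) ** (1 / m)  # slightly less conservative

    print(f"ANOVA p={p_value:.3f}; per-test alpha: "
          f"Bonferroni {bonferroni_alpha:.4f}, Dunn-Sidak {sidak_alpha:.4f}")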
If you don't have a clue on which test to use, describe your data in detail on stats.stackexchange.com and ask for help on which test to use.
Cheers
I just wanted to elaborate on LBushkin's Perverse Incentives answer. I can easily see your problem extending to where analysts avoid difficult topics for fear of reducing their score. Or maybe they will give the same answer as earlier stages to avoid hurting a friend's score, even if it is not correct. An interesting question is what happens if the later answer is incorrect -- you have no ground truth, just successive analytic opinions -- in which case I assume the first answer is marked as "incorrect", right?
Maybe presenting some of these extensions to the manager will help.

Fail Fast vs. Robustness

Our product is a distributed system. The modules I work on are fairly new, quite rigorous, well tested. They were developed with recent best practices in mind. Other modules can be considered as legacy software.
While I'm vigilant about everything that happens within the modules I'm responsible for, I'm under constant pressure to work with bad data sent to me from the other modules. At heart, I'm a "Fail Fast" principle developer and as a result, when problems arise I am usually able to eliminate the possibility of error in my modules. It's not so much about blame, just saving wasted effort in chasing bugs in the wrong places.
But the argument I keep coming up against is: "We can't let this stuff fail in production, the customer expects this to work, why don't you work around this problem". And this would be an argument for robustness: be liberal in what you accept, conservative in what you send.
I should also note that these are mostly intermittent problems. We see them in integration tests but they are hard to reproduce. Timing and concurrency are involved.
I'm having a hard time balancing between the two principles. Part of it is my worry that if I start allowing and propagating exceptional data, I'm inviting trouble and I won't have as much confidence in my system. But I can't argue against keeping the system working even if other modules are sending me wrong data. The reason other modules aren't getting fixed is that they are too complex and fragile, while mine still appear clear and safe. But if I don't resist the pressure, my modules will slowly be saddled with the same problems I've been rejecting until now.
I should say that the system is not "crashing" in production, but my module may simply display an error to the operator and ask them to contact support. A crash would be a big problem, but if I'm reporting the error clearly, then isn't this the right thing to do? I suspect that my peers just don't want the customer to see any problems, period. But my module is rejecting data from other modules within our product, not customer input. So it seems to me that we are just not tackling problems.
So, do I need to be more pragmatic or hold my ground?
I share the "fail fast" preference/principle. Don't think of this as a conflict of principles, though; it's more a conflict of understanding. Your counterpart has some unspoken requirement ("don't show the user a bad time") that implies a missed requirement. You did not have a chance to think about/implement this requirement beforehand, so it has left a bad taste in your mouth. Forget this viewpoint; re-approach it as a new project with a fixed requirement you can work against.
Maybe the best result is to give an error message like you displayed. But it sounds like you implemented it before having buy-in from your counterpart, when they had a choice to accept it. Earlier communication about what you were doing could have addressed something like that.
Be careful in how you present the ideas. Constantly referring to the other systems as "too complex and fragile" might be rubbing people the wrong way. Simply explain that the systems are new to you and take longer to understand. Do put the time into understanding them, so you do not lower people's expectations of your capability.
I'd say that it depends on what happens if you don't halt. Does someone's paycheck get processed wrong? Does the wrong order get sent out? That would be worth stopping for.
If possible, have your cake and eat it too - don't report the error to the user, get the customer to agree to send diagnostic reports and report every failure back. Bug the developer(s) who own the faulting module(s) to fix them. And by bug I mean file a bug against them. Or, if management doesn't think it's worth the cost of fixing, don't.
I'd also write up unit tests against those modules that fail, especially if you can tell what the original input was that caused them to generate the wrong output.
What it really comes down to though is what the person who reviews your performance wants from you, especially after you explain the problem to them, via email.
Simply put, this sounds like a case of "don't check for something you can't handle". The fact that you're catching the error and able to report it means you're not propagating it. But it also means that, since you can report it, you have some mechanism to trap the error -- and therefore could potentially handle it yourself and correct it rather than report it.
Mind, I'm assuming that your error report is more interesting than a random exception you caught some place deep in the system. But even then, if it's an exception you're testing for and you're creating (i.e. you check if the denominator is zero and send an error rather than simply inadvertently dividing by zero and catching the exception higher up), then that suggests you may well have a way of correcting the problem.
Bottom line, you need both. You need to try to make the data as error free as practical, but also report the unexpected.
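As a sketch of that balance (Python, hypothetical names; the zero-denominator case is the one mentioned above):

    def process_rate(numerator, denominator):
        """Correct what is safely correctable; fail fast, clearly, on the rest."""
        if denominator == 0:
            # Detectable but not correctable: report it at the boundary instead
            # of letting a bare ZeroDivisionError surface somewhere deeper.
            raise ValueError("denominator of 0 received from upstream module")
        if numerator < 0:
            numerator = 0  # a known-safe normalization we have agreed to apply
        return numerator / denominator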
I don't think that you can lock the door and cross your arms saying "it's not my problem". The fact that the data is coming from "old, fragile systems" is meaningless. YOUR code is not old and fragile, and is clearly the efficient place, in terms of the entire integrated system, to "fix" the data once you've detected the problem. Yes, the old modules will continue to feed garbage-in, garbage-out to other, lesser systems, but those legacy modules combined with your new module are a cohesive whole and thus make up "the system".
The typical real problem here is simply the time/value equation of writing all this fix up code vs new features. That's a different debate. But if you have time, and you know things that you can do to clean up incoming data, "be liberal in what you accept" is sound policy.
I won't get into the reasons, but you are right.
In my experience, PHBs are missing the part of the brain required to understand why fail fast has merit and why "robustness", as defined by do-whatever-it-takes-eat-errors-if-necessary, is a bad idea. It is hopeless. They just don't have the hardware to grok it. They tend to say things like "OK, you make a good point, but what about the user?" -- it's just their version of "think of the children", and it signals the end of a conversation with me any time it's brought up.
My advice is to stand your ground. Eternally.
Thanks everyone. The case that prompted this question ended well, and partly thanks to insights I got from the answers above.
My initial reaction was to stick to fail fast, but I thought about this some more and reached the conclusion that one of the roles of my module is to provide a stabilizing anchor for the rest of the system. That does not necessarily mean accepting bad data, but surfacing problems, isolating them, and handling them in a transparent manner until we find a solution.
I planned to add a new handler and code path for this case, which would execute properly as if it were a special use case that had previously been undocumented.
We had a discussion where I reiterated the need to deal with the problem at the boundary, but I was also willing to help. I outlined my plan to the other side because I suspected that my position was viewed as overly pedantic, and that the solution was perceived as me merely having to turn off spurious validation of harmless, if incorrect, data. In reality, the way I work is largely data-driven, so I explained why the data has to be correct, how behavior is driven by it, and how accommodating this data would mean implementing a special code path.
I think this gave weight to my position and led to a more thorough discussion of the other side's aversion to fixing the data. It turned out that it was more a weariness of dealing with an error-prone legacy system than an actual obstacle. There was a relatively simple solution; it was just scary to make a change, a mindset that's fairly entrenched.
But having aired all challenges and possible solutions, we eventually agreed to fix the data, and so far it seems to have solved our problem. Our integration tests are now passing consistently, but we have also added logging and will continue to monitor it.
In summary, I think that for me, the synthesis of both principles is that fail fast is essential for surfacing problems. But once they do surface, robustness means providing a transparent path to continue operation in a way that does not compromise the system. I was able to offer that, and by doing so, won some goodwill from the other side and got the data fixed in the end.
Again, thanks to everyone that responded. I'm too new to rate comments, but I do appreciate all the perspectives presented.
That's a tricky one. If your module receives bad data and it's "ok" for you to just do nothing with it and return, then I would suggest writing to an error log instead of showing an error to the user.
It kind of depends on the class of error you are getting. If the way the system is breaking means you can keep going without feeding bad data to any other parts of the system, you should do everything in your power to work with whatever input is given.
To my mind, though, data purity trumps keeping the system running: you cannot allow bad data to propagate elsewhere and corrupt other systems. To the extent that you can massage the data to be correct and then keep going, you should do so, on the theory that once the data is safe you should keep the system running...
I like to think of things in terms of data streams. Passing bad data along pollutes the whole stream, and that is bad because, just like real pollution, a drop can spoil a whole river of data (if one element is bad, what else can you trust?). But equally bad is blocking the flow, letting nothing pass because you spotted something you could easily remove. Filter it out, and if everyone at every stage is also filtering, you get clear, clean data out the other end even if a few impurities started out in the middle.
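A minimal sketch of one such filtering stage (Python; is_valid and normalize stand in for whatever your module's actual rules are):

    def filtered(records, is_valid, normalize):
        """Drop (and surface) what cannot be fixed, instead of blocking the
        flow or passing pollution downstream."""
        for record in records:
            record = normalize(record)  # massage the correctable impurities
            if is_valid(record):
                yield record            # clean data keeps flowing
            else:
                print(f"filtered out bad record: {record!r}")  # surface, don't hide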
The question from your peers is: "why don't you work around this problem"
You say that it's possible for you to detect the bad data and report an error to the user. This is the normal approach -- once you know the data coming into your functions is bad, you should fail fast (and this is the recommendation in the other answers I have read here).
However, your question doesn't specify the domain in which your software is operating. If you know the data coming in is erroneous, is it possible for you to request that data again? Is it actually possible to recover from the situation?
I mentioned that the "domain" here is important. So if you have an app which displays streamed video data for example, and maybe your wireless signal is weak so the stream is corrupt, should the system "fail fast" and display an error message? Or should a poorer image be displayed, and an attempt to reconnect made if needed, depending on the magnitude of the problem?
Depending on your domain, it may be possible for you to detect bad data and make a second request for the data without inconveniencing the user. (This is clearly only relevant in cases where you'd expect the data to be better the second time, but you do say the issues you are experiencing are intermittent and possibly concurrency-related)...
So, fail-fast is good, and is definitely something you should do if you can't recover. And you should definitely not propagate bad data. But if you can recover, which in some domains you can, then failing straight away is not necessarily the best thing to do.
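For the intermittent case, a retry wrapper is one way to attempt recovery before failing fast -- a rough Python sketch (fetch and validate are hypothetical stand-ins):

    import time

    def fetch_with_retry(fetch, validate, attempts=3, delay=0.5):
        """Re-request intermittently corrupt data a few times before giving up
        and failing fast with a clear error."""
        for _ in range(attempts):
            data = fetch()
            if validate(data):
                return data
            time.sleep(delay)  # transient problem: give the other module a moment
        raise RuntimeError(f"still receiving invalid data after {attempts} attempts")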

How to react when the client's response is negative on delivery? [closed]

I am a junior programmer. My supervisor told me to sit in with the client, so I joined. I saw the client's unsatisfied face despite what was, from my programmer's perspective, a successful delivery of the project!
Client: You could have included this!
Us: Was not in the specification!
Client: Common Sense!
As a programmer, how do you respond in this situation?
What you should do to avoid this situation:
Explicitly spec out what will be included and what will not be included.
The problem probably comes down to the unspecified parts of the spec:
The client thinks that unspecified stuff should be in, i.e. it was implied.
The developer thinks that unspecified stuff should not be in.
For future specs, you should have a catch-all statement that explicitly states that anything not specified in the document can be done after the original specification is delivered, at an additional cost.
What you should do in the current situation:
Other than learning from your experiences, you should come to some compromise with the client.
Example: I will do this feature that you feel is common sense, but for all future additions/changes it will have to be spec'ed out explicitly.
I.e., you will have to do a little more work, but it is worth it in return for the catch-all, explicitly spec'ed agreement your client will enter into.
Bad spec?
Was it necessarily a bad spec? No.
It is impossible to mention everything your clients may expect, so it is critical to have the catch-all statement mentioned above stated clearly and explicitly in your spec/contract.
Other ways to reduce the problem:
Involve the client early, show them early prototypes. Even if they don't demand it.
Try not to sell the client an end product, but more of a service for working on his product.
Consider an agile development model or something similar so that tasks are well defined, small, paid for, and indisputable.
This would be one of many reasons why I switched to an Agile development philosophy. The only way, in my opinion, to successfully avoid this scenario is to either be omniscient or involve the customer heavily and release early/release often to get feedback as soon as possible. That way you can develop the software the customer really wants, not the software the customer tells you they want.
Client: You could have included this!
Us: Was not in the specification!
Client: Common Sense!!
Us: We do not attempt to go beyond what the client has specified - we follow the specification. It's as important to NOT implement features not specified as it is to implement features specified. We will never second guess our customers, who value the fact that they can completely depend on us to correctly and completely implement the specification on time and under budget.
As others very rightly point out, the situation is almost always more complex than the simple exchange I've described above.
However, the above is valid if the implementer has a specification with the customer's signature on it which essentially implements an agreement that says "once the software provably implements all the features in the spec then it is considered complete", and anything additional is outside the specification and therefore outside the contract.
The contract itself may have some input here as well - if you don't have a signed contract than it doesn't matter what's in the spec - everything so far has been done on a handshake, and the entire deal (including payment) can go down the toilet based on any dissatisfaction on either side.
But if you have a contract and a specification, and the customer has seen and signed both, then they have no wriggle room to ask you to go further.
Now, as to the question of whether you should implement it:
AWESOME! You delivered a product and they only had one complaint. Implement the feature, call it a 'freebie' (make sure they understand you're working outside the spec and contract, and explicitly send them a bill for the work with the discount shown in dollars) and have them sign off on the project as a whole.
It will explicitly demonstrate that the project is ended, that you went above and beyond the call of duty, and that any further 'surprises' are outside the contract/spec, which gives you a nice layer of protection beyond what you already (ostensibly) have.
If it's a UI issue, then you're in murkier water.
Does the spec adequately describe the UI? Does it have mockups? I wouldn't fault a customer for this complaint about the UI if the spec did not very closely describe the layout, usage, and include mockups.
Either way, I think you can understand the customer's position -- if they haven't played with UI mockups, then they're going to be disappointed with the result regardless -- there's no way, psychologically speaking, that you and your customer could have possibly had the same idea in mind (never mind the fact that common sense isn't!).
Quite frankly if this is the first time the customer has thought about checking out the UI before the work is finished, then it's at least partially your fault for not explaining good UI design processes to them. This is a key feature for their app, and it's very tightly coupled to what they've imagined - no one can be satisfied in such a situation unless they've 'grown' their internal representation over time to match what the reality is.
This disconnect is solved only through frequent user and customer testing, which is obviously missing. This is a problem regarding client education and communication, not whether the specification was met or not.
-Adam
Expect last minute changes of scope - they always happen, so be ready.
Review progress frequently with client - to minimize surprises.
Contract: Functional Spec, plus Time & Materials with initial cap (so client feels control).
Then when changes come along, re-negotiate the cap if necessary.
Never say they can't have what they want. They can get that answer for free!
Always give them a little more than they asked for, so they know you've got a positive attitude.
Relate to the client as being on the same team with them. Don't accept being legalistically painted as an adversary.
They may think of contractors as not loyal, compared to employees. Show them you're as dedicated to their success as their employees are, and you'll go the extra mile.
Classic case...
There's no definite answer to this one, but it all turns on communication. There should have been preventive measures put in place (like weekly reviews or something like that).
For sure, you can't redo the whole thing for free.
Two ways: either you tell them to ** off, or you deal with it.
If you choose to deal:
First, empathize, respect the client.
Have a look at what can easily be changed.
Have a look at the contracts.
Maybe create a new agreement.
Don't do too much.
Make them see the progress and the work it takes.
Find workarounds for the missing features (maybe using other great features, or available tools.)
Use your common sense; it is so common, it's not even funny.
This is one of the many drawbacks of a fixed bid arrangement. Any time business needs or priorities change, or there is even a simple misunderstanding, it results in anything from an awkward situation like this to calling lawyers in. If you have an arrangement where you get paid for development time, you can always react to any change and get paid for whatever time it takes to make that change. Also, having a by-the-hour arrangement does not preclude having a plan or making an estimate.
Once you are in a fixed bid pickle, though, your options are:
1) Do it at an additional cost.
2) Do it free.
3) Don't do it.
Option 3 is the worst, and Option 1 is the best. If you have a good, trusting relationship and decent communication with the client, it's usually easy to arrive at Option 1. If the relationship is bad, then you've got bigger problems. At that point, just try to avoid lawyers.
A final point - any project that has something known as "The Delivery Date" inevitably runs into the problem described. Projects with said date usually involve retreating to a cave for several months to develop in hiding followed by an unleashing of the product all at once in front of the stakeholders. This is abrupt and leaves plenty of time for client expectations and the actual product to drift apart. If, instead, you show intermediate versions of the product and gather feedback every few weeks, two things happen. First, you get better feedback, minimize misunderstandings, and make a better product. Second, there is no single point in time on which a massive amount of expectation is laid. The potential difference between what the client is imagining and what actually exists is much smaller. No surprises.
Good luck.
"how do you react?"
Question 1 - do you want to continue this relationship with this customer? Seriously. If they are going to claim that unspecified features are "common sense," this may not be a good relationship to maintain or enhance.
If you want to disengage, then that's easy. Ask for them to highlight each part of the specification that you failed to comply with and play that game. Get specific test criteria for each missing feature. Pull Teeth. Be confrontational in determining what's missing. Don't ask why. Just ask for all the details up front. It's slow and unpleasant. But you don't want them anyway.
If you want to engage, well, you're going to have to change the relationship. Currently, you have a Passive Aggressive Customer. They won't say what they want, but they will say what they don't want.
This may be a habit with them; this may be how they win concessions. Or this just may be sloppy specification on their part.
If you want the relationship, your reaction has two parts.
Short-term. Get something they're happy with. They have to identify specific changes. You have to score each change with a "cost to do" and "fit with specification".
Some things are cheap and a good fit. Do those.
Some things are cheap to do, but a bad fit with the specification. Think twice about enabling a bad specification to lead to rework. In a sense, you purchased the specification from them; you may need to raise your standards, also.
The expensive things which (sadly) fit the specification are a problem. You're in trouble with these, and pretty much have to do them.
The expensive things which don't fit well with the specification are lessons learned for everyone. Detail a plan for these, including specification rewrites and approvals.
Long-term. Make sure you're not PA'd again. Review early and often, use Agile techniques. Communicate more, prototype more, release more.
Well, it was not successfully delivered. Somewhere along the line there was miscommunication. Without knowing the specifics, I would suggest this is not a developer-injected problem, and it is probably not to be blamed on the customer: the requirements-gathering task was insufficient. This is a classic example of what happens when the software side does not have domain experts or the requirements discovery process doesn't do all that it could...
If it was me I would correct the problem and figure out how to avoid similar issue in the future.
How you handle this can very well determine the future of this contract/business with the client. Taking responsibility and correcting the issue is a huge opportunity for your company.
EDIT:
This is a good time to evaluate how this happened to help correct it. Some companies choose to totally revamp everything they do which is a mistake I think. So is ignoring it. Blaming people for the problem is also a mistake.
It is a good time to walk through how this happened, what the process is, and maybe how it could have been caught. I would not make huge rule changes or process changes - but coming up with guidelines for future work is a great thing. Your company had a clear lesson about a shortcoming. Losing the opportunity to correct this problem and to correct your process would be a waste of a good chance.
ZiG, I've had to deal with this problem on several occasions at my current place of employment. My group (3 developers) tries to approach things in an Agile manner. We're used to getting mid-stream and even last-second requests (which we then treat on a case-by-case basis).
However, we make it clear that resources (particularly time) are limited and if it's not in the spec we can't make promises. If it's judged important and it can't fit into the current release, we generally plan a followup release. If it isn't important, it goes on a list.
One thing I've found is that you can get users to agree to Spec S at Time T. However at Time T + N, getting them to remember they agreed to Spec S, or getting them to acknowledge that they did so (with the documentation you've been keeping, I hope!) can be trickier than it should be.
Speaking to the OP's subject and question:
If you are an employed programmer, then I would hope that other resources are in the meeting with you. Possibly "higher ups" in the organization.
If this is the case, then your job is to answer DIRECT questions, and to keep your emotions in check. Yes, you may feel injured because they don't love your code, but showing any emotion with bosses present is not a good thing. Rather, try and look neutral and let the others handle the session.
Now, if they "hang you out to dry", then I would recommend the following questions:
a) "OK. I see. Why exactly to you feel this is common sense to include this feature? I'd like to discover why we didn't include it." (force them to explain their thought process. Common sense to one person is rarely common sense to anyone else.)
b) "Well, I'm sure we could include that in the next release. I'll leave it up to XXX (the bosses) to come to a mutually agreeable approach" (i.e. don't talk cost or freebies with bosses present. EVER.)
Again, this assumes you are a programmer WORKING for a company that delivered the product. Now, if are more than that - i.e. you ARE one of the higher ups, then many of the suggestions here are excellent.
However, if you are the higher-up or are a consultant programmer, then first and foremost:
a) Apologize for the process that did not catch this requirement. Promise to work with the client to prevent this from recurring.
Then on to the other strategies. It really doesn't matter if you charge for the fix or not - the apology is the most important action to the client. Again, it bears repeating - you are not apologizing for the missed feature. You are apologizing for the faulty design process that let it slip. Clients are usually pretty accommodating when you start this way and then seek a solution.
Cheers,
-Richard
Use SCRUM-like approaches to avoid this deathtrap: involve the client in the dev process early, frequently, and in informal, restricted committees -> risk reduction and improved agility.
In terms of your literal question, how to react, the best way is to ignore your ego ("what?! After I worked so hard on this and met the spec?!") and instead focus on some active listening and working to consensus.
Client: You could have included this!
Us: Was not in the specification!
Client: Common Sense!!
Us: I understand that you're not happy that we didn't go beyond the bounds of the specification. Seeing how you feel about this, how can we make you happy? Let's see if there's a process we can create together that will help everyone.
Essentially, you don't want to turn this into a "you said/I said" death match. The only way to resolve those involves lawyers, and then nobody wins. If you can agree that the spec or the process was at fault, work together to fix those.
This approach actually just worked for me: wait for the guy who doesn't like your software to leave and be replaced by the guy who does like it.
Obviously you can't really rely on this, but if you're sure that you did a good job and that your software really will satisfy the business needs of the people who hired you, it does pay to wait it out. Sometimes the client's initial reaction will not be their final one, especially if you can quickly incorporate their concerns.
Don't try make the client feel like it is their fault. It might be their fault, but making them feel that way will not produce constructive results, and could just annoy them.
Instead, you should realize that clients only complain about software they use, in most cases because they like it. Nobody complains about software nobody uses. It is inevitable that a client will complain about the software you deliver, even if you deliver exactly what they ask for. So don't sweat it. Software is never done.
Total failure on the part of the person in charge of requirements collection, no doubt about it. Additional failure of the project management to not iterate the deliverable and have check-in meetings with the client.
However, you have a signed-off spec, and what you've delivered matches the spec. So, your company has two choices: write off the cost in the name of business development and make the change for free, or charge them for the change request.
If it ain't in the spec, it ain't in the spec. As a developer with no specific domain knowledge, 'common sense' is an irrelevant concept. Different industries work in different ways and one approach might be quite appropriate for a particular domain but completely unacceptable in the other.
Writing good specs is an art-form. IMO, you can either take an agile 'analyst/programmer' approach where you make small iterations or write and maintain a detailed, unambiguous specification. Both are highly skilled tasks, and are still iterative. You still have to evolve the specification.
Either way is not as easy as it sounds and both require the ability to establish a good working relationship with the client.
You cannot know what your customer is thinking. This situation occurs often with clients who haven't got any experience with programming projects. What I suggest is simply showing him that "common sense" isn't a very accurate answer in engineering (or programming, if you prefer).
Show him other examples from life that demonstrate you cannot build something that isn't written down. Example: when building a new house, the builder needs a plan with all the details... he won't add extra electrical outlets in the living room just because it's more "common sense" to have some spares...
I had this once. And luckily it wasn't me that created the design because that proved to be the problem.
It is of vital importance that the communication between your company and the client is as good as possible. Be sure you understand each other. Ask questions and let them ask questions. Do not leave anything open in the design; open points become the problem points at delivery. And have regular meetings during the project (preferably with a prerelease).
Unfortunately a lot of developers are bad at communication, and a lot of clients are not aware of their own needs. But if you can minimize the gap, you have found yourself a happy (and returning) customer.
This is why I/the teams I worked with always used a prototype-style approach, that means:
1) After collecting the requirements, you show the client an early and basic release of the software.
2) The client says "you could have included this" / "it's common sense".
3) You change your design to reflect the client's desiderata.
4) Iterate from point 1 until the official release.
You have to start it early on; tell the customer, early and often, that the spec/use-cases/user-stories are a contract which defines what will be delivered. In an agile environment there are plenty of chances for the customer to observe some "common sense" feature they want and ask for it, which is one of the advantages of an agile approach, but if you start accepting "common sense" additions at the end, you are setting yourself up for infinite extensions, probably at your expense.
Some customers expect this; the more and better you tell them they can't, the easier the eventual arguments will be.
As a junior guy, I realize you can't do this -- yet -- but one of the hard-but-necessary lessons is that sometimes you have to fire a customer.
You learn - everything is learning and nothing is personal.
We are experts in our area; we know better than the customer what he needs. And next time, for the next customer, we will suggest all the useful features in advance, make him happy, and make him pay more money, because we are the experts and we know better.

When is it good (if ever) to scrap production code and start over? [closed]

I was asked to do a code review and report on the feasibility of adding a new feature to one of our new products, one that I haven't personally worked on until now. I know it's easy to nitpick someone else's code, but I'd say it's in bad shape (while trying to be as objective as possible). Some highlights from my code review:
Abuse of threads: QueueUserWorkItem and threads in general are used a lot, and Thread-pool delegates have uninformative names such as PoolStart and PoolStart2. There is also a lack of proper synchronization between threads, in particular accessing UI objects on threads other than the UI thread.
Magic numbers and magic strings: Some Const's and Enum's are defined in the code, but much of the code relies on literal values.
Global variables: Many variables are declared global and may or may not be initialized depending on what code paths get followed and what order things occur in. This gets very confusing when the code is also jumping around between threads.
Compiler warnings: The main solution file contains 500+ warnings, and the total number is unknown to me. I got a warning from Visual Studio that it couldn't display any more warnings.
Half-finished classes: The code was worked on and added to here and there, and I think this led to people forgetting what they had done before, so there are a few seemingly half-finished classes and empty stubs.
Not Invented Here: The product duplicates functionality that already exists in common libraries used by other products, such as data access helpers, error logging helpers, and user interface helpers.
Separation of concerns: I think someone was holding the book upside down when they read about the typical "UI -> business layer -> data access layer" 3-tier architecture. In this codebase, the UI layer directly accesses the database, because the business layer is partially implemented but mostly ignored due to not being fleshed out fully enough, and the data access layer controls the UI layer. Most of the low-level database and network methods operate on a global reference to the main form, and directly show, hide, and modify the form. Where the rather thin business layer is actually used, it also tends to control the UI directly. Most of this lower-level code also uses MessageBox.Show to display error messages when an exception occurs, and most swallow the original exception. This of course makes it a bit more complicated to start writing unit tests to verify the functionality of the program before attempting to refactor it.
I'm just scratching the surface here, but my question is simple enough: Would it make more sense to take the time to refactor the existing codebase, focusing on one issue at a time, or would you consider rewriting the entire thing from scratch?
EDIT: To clarify a bit, we do have the original requirements for the project, which is why starting over could be an option. Another way to phrase my question is: Can code ever reach a point where the cost of maintaining it would become greater than the cost of dumping it and starting over?
Without any offense intended, the decision to rewrite a codebase from scratch is a common, and serious, management mistake that newbie software developers push for.
There are many disadvantages to be wary of.
Rewrites stop new feature development cold for months or years. Few companies, if any, can afford to stand still for that long.
Most development schedules are difficult to nail down. This rewrite will be no exception; amplify the previous point by that inevitable schedule slip.
Bugs that were fixed in the existing codebase through painful experience will be re-introduced. Joel Spolsky has more examples in this article.
Danger of falling victim to the Second-system effect -- in summary, "People who have designed something only once before try to do all the things they 'didn't get to do last time', loading the project up with all the things they put off while making version one, even if most of them should be put off in version two as well."
Once this expensive, burdensome rewrite is completed, the very next team to inherit the new codebase is likely to use the same excuses for doing another rewrite. Programmers hate learning someone else's code. No one writes perfect code because perfection is so subjective. Find me any real-world application and I can give you a damning indictment and rationale for doing a from-scratch rewrite.
Whether you ultimately rewrite from scratch or not, beginning a refactoring phase now is a good way to both really sit down and understand the problem so that the rewrite will go more smoothly if truly called for, as well as giving the existing codebase an honest look to really see if a rewrite's needed.
To actually scrap and start over?
When the current code doesn't do what you would like it to do, and would be cost prohibitive to change.
I'm sure someone will now link Joel's article about Netscape throwing their code away and how it's oh-so-terrible and a huge mistake. I don't want to talk about it in detail, but if you do link that article, before you do so, consider this: the IE engine, the engine that allowed MS to release IE 4, 5, 5.5, and 6 in quick succession, the IE engine that totally destroyed Netscape... it was new. Trident was a new engine after they threw away the IE 3 engine because it didn't provide a suitable basis for their future development work. MS did that which Joel says you must never do, and it is because MS did so that they had a browser that allowed them to completely eclipse Netscape. So please... just meditate on that thought for a moment before you link Joel and say "oh you should never do it, it's a terrible idea".
A rule of thumb I've found useful: given a code base, if I have to re-write more than 25% of the code to make it work or to modify it based upon new requirements, you may as well re-write it from scratch.
The reasoning is that you can only patch a body of code so far; beyond a certain point, it's quicker to do over.
There's an underlying assumption that you have a mechanism (such as thorough unit and/or system tests) that will tell you whether your re-written version is functionally equivalent (where it needs to be) as the original.
If it requires more time to read and understand the code (if that is even possible) than it would to rewrite the entire application, I say scrap it and start over.
Be very careful with this:
Are you sure you aren't just being lazy and not bothering to read the code?
Are you being arrogant about the great code you will write compared to the rubbish anyone else produced?
Remember, tested, working code is worth a lot more than imaginary yet-to-be-written code.
In the words of our esteemed host and overlord Joel, in Things You Should Never Do: it's not always wrong to abandon working code -- but you have to be sure about the reason.
I saw an application re-architected within 2 years of its introduction into production, and others rewritten in different technologies (one went from C++ to Java). Both efforts were, to my mind, unsuccessful.
I prefer a more evolutionary approach to bad software. If you can "componentize" your old app such that you can introduce your new requirements and interface with the old code, you can ease yourself into the new environment without having to "sell" the zero-value (from a biz perspective) investment in rewriting.
Suggested approach - write unit tests for the functionality with which you wish to interface to 1) ensure the code behaves as you expect and 2) provide a safety net for any refactoring that you may wish to do on the old base.
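Such a characterization test can be as simple as the following (sketched in Python for brevity, though the codebase in question is .NET; legacy_totals is a hypothetical function):

    import unittest

    # Pin down what the legacy code *currently* does, right or wrong,
    # so refactoring has a safety net.
    from legacy import legacy_totals  # hypothetical module under test

    class LegacyTotalsCharacterization(unittest.TestCase):
        def test_known_input_keeps_known_output(self):
            # Expected value recorded from current behavior, not from a spec.
            self.assertEqual(legacy_totals([1, 2, 3]), 6)

    if __name__ == "__main__":
        unittest.main()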
Bad code is the norm. I think IT gets a bad rap from business for favoring rewrites/rearchitecting/etc. They pay the money and "trust" us (as an industry) to deliver solid, extensible code. Sadly, business pressures frequently result in shortcuts that make the code unmaintainable. Sometimes it's bad programmers... sometimes bad situations.
To answer your rephrased question... can code maintenance costs ever exceed rewriting costs... the answer is clearly yes. I don't see anything in your examples, however, that lead me to believe this is your case. I think those issues can be addressed with tests and refactoring.
In terms of business value, I would think it's extremely rare that a real case can be made for a rewrite due solely to the internal state of the code. If the product's customer-facing and is currently live and bringing in money (i.e. is not a mothballed or unreleased product), then consider that:
You already have customers using it. They're familiar with it, and might have built some of their own assets around it. (Other systems that interface to it; products based on it; processes they'd have to change; staff they'd maybe have to retrain). All of this costs the customer money.
Re-writing it might cost less in the long term than making difficult changes and fixes. But you can't quantify that yet, unless your app is no more complex than Hello World. And a re-write means a re-test and a redeploy, and probably an upgrade path for your customers.
Who says the re-write will be any better? Can you honestly say your firm is writing sparkly code now? Have the practices that turned the original code to spaghetti been corrected? (Even if the main culprit was a single developer, where were his peers and management, ensuring quality through reviews, testing, etc.?)
In terms of technical reasons, I'd suggest it could be time for a major rewrite if the original has some technical dependencies that have become problematic. e.g. a third party dependency that's now out of support, etc.
In general though, I think the most sensible move is to refactor piece by piece (very small pieces if it's really that bad), and improve the internal architecture incrementally rather than in one big drop.
Two threads of thought on this one: Do you have the original requirements? Do you have confidence that the original requirements are accurate? What about test plans or unit tests? If you have those things in place it might be easier.
Putting on my customer hat: does the system work, or is it unstable? If you've got something that's unstable you've got an argument to change; otherwise you're best off refactoring it bit by bit.
I think the line in the sand is when basic maintenance is taking 25%-50% longer than it should. There comes a time when maintaining legacy code becomes too costly. A number of factors contribute to the final decision; time and cost are the most important, I think.
If there are clean interfaces and you can cleanly delineate module boundaries, then it might be worth refactoring it module by module or layer by layer, in order to migrate existing customers forward into cleaner, more stable codebases; over time, after you've refactored every module, you will have rewritten everything.
But based on the code review, it doesn't sound like there are any clean boundaries.
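When such boundaries do exist, or can be carved out, the module-by-module migration is roughly this - a hedged Java sketch with invented names, not a prescription:

    // Carve an interface out of the legacy module, hide the old code behind
    // it, then migrate callers until the legacy class has none left.
    class ReportData {
        final String title;
        ReportData(String title) { this.title = title; }
    }

    interface ReportGenerator {
        String render(ReportData data);
    }

    // Thin adapter over the existing code; behaviour deliberately unchanged.
    class LegacyReportAdapter implements ReportGenerator {
        @Override public String render(ReportData data) {
            return "LEGACY:" + data.title; // stands in for the old entry point
        }
    }

    // The rewritten module, introduced behind the same interface.
    class CleanReportGenerator implements ReportGenerator {
        @Override public String render(ReportData data) {
            return "Report: " + data.title; // new, tested implementation
        }
    }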
I wonder if the people who vote for scrapping and starting over have ever successfully refactored a large project, or at least seen a large project in poor condition that they think could use a refactoring?
If anything, I err on the opposite side: I've seen 4 large projects that were a mess, that I advocated refactoring as opposed to rewriting. On a couple, there was barely a single line of original code that remained, and major interfaces changed in significant ways, but the process never involved the entire project failing to function as well as it originally did, for any more than a week. (And top-of-trunk was never broken).
Perhaps a project exists that is so severely broken that to attempt to refactor it would be doomed to failure, or perhaps one of the previous projects I refactored would have been better served by a "clean re-write", but I'm not sure I'd know how to recognize it.
I agree with Martin. You really need to weigh the effort that will be involved in writing the app from scratch against the current state of the app and how many people use it, do they like it, etc. Often we may want to completely start from scratch, but the cost far outweighs the benefit. I come across bits of ugly looking code all the time, but I soon realize that some of these 'ugly' areas are really bug fixes and make the program work correctly.
I would try to consider the architecture of the system and see whether it is possible to scrap and rewrite specific well defined components without starting everything from scratch.
What would usually happen is that you can either do that (and then sell it to the customer/management), or you find out that the code is such a horrible and tangled mess that you become even more convinced that you need a rewrite, and you have more convincing arguments for it (including: "if we engineer it right, we would never need to scrap the whole thing and do a third rewrite").
Slow maintenance would eventually cause that architectural drift that would make a rewrite more expensive later.
Scrap old code early and often. When in doubt, throw it out. The hard part is convincing non-technical folks of the cost-to-maintain.
So long as the value derived appears to be greater than the cost to operate and maintain, there's still positive value flowing from the software. The question surrounding a rewrite is this: "will we get even more value from a rewrite?" Or alternatively, "how much more value will we get from a rewrite?" How many person-hours of maintenance will you save?
Remember, the rewrite investment is once only. The return on the rewrite investment lasts forever. Forever.
Focus the value question down to specific issues. You listed a bunch of them above. Stick with that.
"Will we get more value by reducing cost through
dropping the junk that we don't use
but still have to wade through?"
"Will we get more value from dropping the junk that's unreliable and breaks?"
"Will we get more value if we understand it -- not by documenting, but by replacing with something we built as a team?"
Do your homework. You'll have to confront the following show-stoppers.
These will originate somewhere in your executive food chain, from someone who'll respond as follows:
"Is it broken?" And when you say "It's not crashed as such," they'll say "It's not broke - don't fix it."
"You've done the code analysis, you understand it, you no longer need to fix it."
What's your answer to them?
That's only the first hurdle. Here's the worst possible situation. This doesn't always happen, but it does happen with alarming frequency.
Someone in your executive foodchain will have this thought:
"A rewrite doesn't create enough value. Rather than simply rewrite, let's expand it." The justification is that by creating enough value, users are more likely to buy in to the rewrite.
A project where scope is expanded -- artificially -- to add value is usually doomed.
Instead, do the smallest rewrite you can to replace the darn thing. Then expand to fit real needs and add value.
You can only give a definite yes to rewriting if you know completely how your application works (and by completely I mean it, not just having a general idea of how it should work) and you know more or less exactly how to make it better. In any other case it's a shot in the dark; it depends on too many things. Perhaps gradual refactoring would be safer, if it is possible.
If possible, I typically would prefer to rewrite smaller portions of the code over time when I need to refactor a baseline. There are typically many smaller issues, such as magic numbers, poor commenting, etc., that tend to make the code look worse than it actually is. So, unless the baseline is just awful, keep the code and just make improvements at the same time you are maintaining it.
If refactoring requires a lot of work, I recommend laying out a small re-design plan/todo list that gives you a list of things to work on in order so that you can bring the baseline to a better state. Starting from scratch is always a risky move and you are not guaranteed that the code will be better when you are finished. Using this technique, you will always have a working system that improves over time.
Code with excessively high cyclomatic complexity (like over 100 in a large number of modules) is a good clue. Also: how many bugs does it have per KLOC? How critical are the bugs? How often are new bugs introduced when bug fixes are made? If your answer is "a lot" (I can't remember the norms right now), then a rewrite is warranted.
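For a feel of what that metric measures, here is a crude Java sketch that counts decision points, which is the core of the cyclomatic complexity number. Real tools (PMD or Checkstyle, for example) work on the parse tree; this regex version is only an approximation for illustration:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Crude cyclomatic-complexity proxy: CC ~ 1 + number of decision points.
    class ComplexityGuess {
        private static final Pattern DECISIONS =
                Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\|");

        static int estimate(String source) {
            Matcher m = DECISIONS.matcher(source);
            int points = 0;
            while (m.find()) points++;
            return 1 + points;
        }

        public static void main(String[] args) {
            String method = "if (a && b) { for (;;) { if (c) break; } }";
            System.out.println(estimate(method)); // 4 decision points -> prints 5
        }
    }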
As early as possible. Whenever you get a premonition that your code is slowly turning into an ugly beast that is very likely to consume your soul and give you headaches, and you know the problem is in the underlying structure of the code (so any fix would be a hack, e.g. introduce a global variable), then it's time to start over.
For some reason people don't like throwing away precious code, but if you feel you're better off starting over, you are probably right. Trust your instinct and remember that it wasn't a waste of time: it taught you one more way of NOT approaching the problem. You could (should) always use a version control system, so your baby is never really lost.
I do not have any experience with using metrics for this myself, but the article "Software Maintainability Metrics Models in Practice" discusses more or less the same question asked here, for two case studies they did.
It starts with the following editor's note:
In the past, when a maintainer received new code to maintain, the rule-of-thumb was "If you have to change more than 40 percent of someone else's code, you throw it out and start over." The Maintainability Index [MI] addressed here gives a much more quantifiable method to determine when to "throw it out and start over." (This work was sponsored by the U.S. Air Force Information Warfare Center and the U.S. Department of Energy [DOE], Idaho Field Office, DOE Contract No. DE-AC07-94ID13223.)
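For reference, the Maintainability Index from that line of work is usually quoted in this form (coefficients vary a little between sources, so treat this as the commonly cited version rather than gospel):

    \mathrm{MI} = 171 - 5.2\,\ln(\mathrm{aveV}) - 0.23\,\mathrm{aveG} - 16.2\,\ln(\mathrm{aveLOC}) + 50\,\sin\!\left(\sqrt{2.4\,\mathrm{perCM}}\right)

where aveV is the average Halstead Volume per module, aveG the average cyclomatic complexity, aveLOC the average lines of code, and perCM the average percentage of comment lines (that last term is an optional comment bonus). The rule of thumb usually attached to it: above 85 is maintainable, 65-85 is moderately maintainable, and below 65 is where "throw it out and start over" enters the conversation.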
I think the rule was...
The first version is always a throw away
So, if you learned your lesson(s), or his/her lessons, then you can go ahead and write it fresh now that you understand your problem domain better.
Not that there aren't parts that can/should be kept. Tested code is the most valuable code, so if it isn't deficient in any real way other than style, no reason to toss it all out.
When is it good (if ever) to scrap production code and start over?
Never had to do this, but logic would dictate (to me, anyway) that once you pass the inflection point where you're spending more time reworking and fixing bugs in the existing code base than you are adding new functionality, it's time to trash the old stuff and get a fresh start.
If it requires more time to read and understand the code (if that is even possible) than it would to rewrite the entire application, I say scrap it and start over.
I have never completely thrown out code, even when going from a FoxPro system to a C# system.
If the old system worked, then why just throw it out?
I have come across a few really bad systems: threads being used where not needed, horrible inheritance, and abuse of interfaces.
It is best to understand what the old code is doing and why it is doing it. Then change it so that it is not confusing.
Of course, if the old code doesn't work - I mean, can't even compile - then you might be justified in just starting over. But how often does that actually happen?
Yes, it totally can happen. I've seen money saved by doing it.
This is not a tech decision, it's a business decision. Code rewrites are long-term gains, while "if it ain't totally broke..." is a short-term gain. If you are in a first-year startup that is focused on getting a product out the door, the answer is usually to just live with it. If you're in an established company, or the errors with the current system are causing more workload and therefore costing the company more money, then they might go for it.
Present the problem as best you can to your GM, and use dollar values where you can. "I don't like dealing with it" means nothing; "It'll take twice the time to do everything until this is fixed" means a lot.
I think there are a number of issues here that depend largely on where you are at.
Is the software working well from a customer perspective? (If yes, be very careful about changes.) If the system is working, I would think there would be little point rewriting unless you were expanding the feature set. And are you planning to expand the features and customer base of the software? If so, then you have much more reason to change.
As much as anything, just trying to understand someone else's code, even if well written, can be difficult; when it's badly written, I would imagine it's almost impossible. What you describe sounds like something that would be very difficult to expand.
I would take into consideration whether the application does what it is intended to do, whether you will ever be required to make modifications, and whether you are confident that the app has been thoroughly tested in all scenarios that it will be used in.
Do not invest the time if the app does not need alterations. However, if it doesn't function as you need, and you need to control the hours and time invested to make corrections, scrap it and rewrite to the standards that your team can support. There's nothing worse than terrible code that you have to support/decipher but still have to live with. Remember, Murphy's Law says it will be 10 at night when you'll have to make things work, and that is never productive.
Production code always has some value. The only case where I would truly throw it all out and start again is if we determine the intellectual property is irrevocably contaminated. For example if someone brought large amounts of code from a previous employer, or a large percentage of the code was ripped from a GPLd codebase.
I'm going to post this book every time I see a discussion on Refactoring. Everyone should read "Working Effectively with Legacy Code" by Michael Feathers. I found it to be an excellent book - if nothing else, it's a fun read, and motivational.
When the code has reached a point where it is not maintainable or extensible anymore; is full of short-term hacky fixes; has lots of coupling; has long (100+ line) methods; has database access in the UI; and generates a lot of random, impossible-to-debug errors.
Bottom line: When maintaining it is more expensive (i.e. takes longer) than rewriting it.
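As a hedged illustration of one of those smells - database access in the UI - and the direction of the fix, all names invented:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // The smell: persistence wired straight into a UI event handler.
    class OrderScreen {
        void onSaveClicked(long orderId) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:h2:mem:shop");
                 PreparedStatement ps = c.prepareStatement(
                         "UPDATE orders SET status = 'PAID' WHERE id = ?")) {
                ps.setLong(1, orderId);
                ps.executeUpdate();
            }
        }
    }

    // The fix, in outline: the screen talks to an interface, and the SQL
    // moves behind it where it can be tested and replaced independently.
    interface OrderRepository {
        void markPaid(long orderId);
    }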
I used to believe in just rewriting from scratch, but that is wrong.
http://www.joelonsoftware.com/articles/fog0000000069.html
Changed my mind.
What I would suggest is figuring out a way to properly refactor the code. Keep all existing functionality and test as you go. We have all seen horrible code bases, but it is important to keep the knowledge your application has accumulated over time.

Debugging is a bad smell - how to persuade them?

I've been working on a project that can't be described as 'small' anymore (40+ months), with a team that can't be defined as 'small' anymore (~30 people). We've been using Agile/Scrum (1) practices all along, and a healthy dose of TDD.
I'm not sure if I picked this up from Agile or TDD, more likely a combination of the two, but I'm now clearly in the camp of people that looks at debugging as a bad smell. By 'debugging' I'm not referring to the more abstract concept of figuring out what might be wrong with the system, but the specific activity of running the system in Debug mode, stepping through the code to figure out details that are otherwise inscrutable.
Since I'm fairly convinced, this question is not about whether debugging is a bad smell or not. Rather, I'd like to know how I can persuade my team-mates about this.
People who believe debugging mode is the 'standard' mode tend to write code that can be understood only by debugging through it, which leads to a lot of wasted time, since every time you work an item on top of code developed by someone else, you first get to spend a considerable amount of time debugging it (and, since there's no bug involved, the term is becoming increasingly ridiculous) - and then silos happen. So I'd love to convince a few of my team-mates that avoiding debug mode is a Good Thing (2). Since they are used to living in Debug mode, however, they don't seem to see the problem; to them, spending hours debugging someone else's code before they even start doing anything related to their new item is the norm; they don't see anything wrong with it. Plus, as they spend time 'figuring it out', they know the developer who worked that area will eventually become available and the item will be passed on to them (leading to yet another silo).
Help me come up with a plan to turn them from the Dark Side!
Thanks in advance.
(1) Also referred to as SCRUM (all caps). Capitalization arguments aside, I think an asterisk after the term must be used since - unsurprisingly - our organization 'tweaked' the Agile and Scrum process to fit the perceived needs of all stakeholders involved. So, in all honesty, I won't pretend this has been 100% according to theory, but that's beside the point of my question.
(2) Yes, there will always be times when we'll have to get in debug mode, I'm not trying to absolutely avoid it, just.. trying to minimize the number of times we have to dive into it.
If you want to persuade your coworkers that your programming practices are better, first demonstrate by your productiveness that you are more effective than they are, at least for some tasks. Then they'll believe you when you explain how you get so much done.
It's also sometimes easier to focus on something concrete. Do your coworkers even talk in terms of "code smell"? Perhaps you could focus on specifics like "When the ABC module fails, it takes forever to debug it; it's much faster to use technique XYZ. Here, let me demonstrate." Then afterwards you can mention your basic principle, which is: yes, the debugger is a useful tool, but there are usually other, more useful ones.
This is a cross-post, because the first time around it was more of an aside on someone else's answer to a different question. To this question it's a direct answer.
Debugging degrades the quality of the code we produce because it allows us to get away with a lower level of preparation and less mental discipline. I learnt this from an accidental controlled experiment in early 2000, which I now relate:
I took on a contract as a Delphi coder, and the first task assigned was to write a template engine conceptually similar to a reporting engine - using Java, a language with which I was unfamiliar.
Bizarrely, the employer was quite happy to pay me contract rates to spend months becoming proficient with a new language, but wouldn't pay for books or debuggers. I was told to download the compiler and learn using online resources (Java Trails were pretty good).
The golden rule of arts and sciences is that whoever has the gold makes the rules, so I proceeded as instructed. I got my editor macros rigged up so I could launch the Java compiler on the current edit buffer with a single keystroke, I found syntax-colouring definitions for my editor, and I used regexes to parse the compiler output and put my cursor on the reported location of compile errors. When the dust settled, I had a little IDE with everything but a debugger.
To trace my code I used the good old-fashioned technique of inserting writes to the console that logged position in the code and the state of any variables I cared to inspect. It was crude, it was time-consuming, it had to be pulled out once the code worked, and it sometimes had confusing side-effects (eg forcing initialisation earlier than it might otherwise have occurred, resulting in code that only works while the trace is present).
Under these conditions my class methods got shorter and more and more sharply defined, until typically they did exactly one very well defined operation. They also tended to be specifically designed for easy testing, with simple and completely deterministic output so I could test them independently.
The long and the short of it is that when debugging is more painful than designing, the path of least resistance is better design.
What turned this from an observation to a certainty was the success of the project. Suddenly there was budget and I had a "proper" IDE with an integrated debugger. Over the course of the next two weeks I noticed a reversion to prior habits, with "sketch" code made to work by iterative refinement in the debugger. Having noticed this I recreated some earlier work using a debugger in place of thoughtful design. Interestingly, taking away the debugger slowed development only slightly, and the finished code was of vastly better quality, particularly from a maintenance perspective.
Don't get me wrong: there is a place for debuggers. Personally, I think that place is in the hands of the team leader, to be brought out in times of dire need to figure out a mystery, and then taken away again before people lose their discipline.
People won't want to ask for it because that would be an admission of weakness in front of their peers, and the act of explaining the need and the surrounding context may well induce peer insights that solve the problem - or even better designs free from the problem.
So, FOR, I not only agree with your position, I have real data from a controlled experiment to support it. It is, however, a rather small sample. More elaborate tests are required before my conclusions are supportable.
Why don't you take what I've said to your team and suggest trials? You have more data than they do (I just gave it to you), and in order to have a credible basis for disagreeing with you they basically have to test the idea - and the only way to do that is to give your idea a go.
You should be ready for it to all fall apart, though, because the whole thing is predicated on the assumption that the developers have the talent and experience to rise to the challenge of stronger design in the absence of step-through debugging.
Step-through debugging was created to make debugging easier. The direct effect of lowering the bar is that people with less talent can participate - if you build a tool that even jackasses can use, you will get jackasses using it -- a lot of them, if the newly accessible activity is well-remunerated.
This causes an exodus of people with talent because they generally use that talent to do rare and precious things in order to be well paid without working too hard, and the market doesn't want to pay for excellence because it cannot distinguish talent well enough to know when paying for it is justified.
Another thought: more recent work with problems on production servers, where it was impossible to install a debugger, has shown the importance of having a codebase for which maintenance doesn't depend on the availability of a debugger. Code that's grown in the absence of debuggers is much less hassle. Choose not to use them when you can change your mind, and then when you can't change your mind it won't be so awful.
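For concreteness, the trace-by-console-write technique from that story looks roughly like this - a minimal Java sketch with invented names; the point is the discipline, not the API:

    import java.util.Map;

    // Old-fashioned print tracing: log position in the code and the state you
    // care about, then strip the TRACE lines once the code works.
    class TemplateEngine {
        String expand(String template, Map<String, String> vars) {
            System.err.println("expand: template=" + template + " vars=" + vars); // TRACE
            StringBuilder out = new StringBuilder();
            for (String token : template.split(" ")) {
                String value = vars.getOrDefault(token, token);
                System.err.println("  token=" + token + " -> " + value); // TRACE
                out.append(value).append(' ');
            }
            return out.toString().trim();
        }
    }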
Since I'm fairly convinced, this question is not about whether debugging is a bad smell or not.
Well, your local church might be a more appropriate place for your question, then.
That aside, convince them by arguments. You might want to reconsider your fundamentalist stance, however, because this is the very opposite of persuasive. One thing you might want to do is drop the term "debugging" in your whole discussion and replace it with "stepping through the code" or the like, emphasizing that you oppose the uninformed guesswork/patchwork practice of probing, rather than an informed reflection about the code.
(I would still disagree with you, but that's beside the point, since you didn't want a discussion.)
I think the real problem here is:
"People that believe debugging mode is the 'standard' mode tend to write code that can be understood only by stepping through it."
This, if true, should be self-evidently wrong, and there should be no need to discuss it. If it's not evident, it's because they don't see how the badly written code could be improved. Show them: do code reviews where you demonstrate how that code could be refactored in a way that is clear without stepping through it.
Code stepping will automatically diminish once better code is written; it just doesn't work the other way around. People will still write bad code, and if they avoid stepping through it, that will only lead to more wasted time ("damn, I wish I could step through this spaghetti mess"), not to better code.
There is something wrong here, but it's hard to put my finger on it. Perhaps the real issue is that the code has other smells that make it difficult to readily understand. I agree that with TDD one ought to use the debugger less rather than more, since you'll be developing the code in small increments. But, if you can't look at the code and understand it, perhaps it's because the design is too coupled -- there are too many interrelated classes required to make things work.
If the code really needs to be so complex that observation won't suffice, then maybe you need to invest in some good commenting, explaining what is happening -- though I would prefer to see things refactored to the point where comments are not needed. My suspicion is that the debugger may be a symptom rather than the problem.
I know that for me, switching from traditional, code-first development to test-first development has resulted in less time spent debugging... and it's not something I miss. Typically I'll only involve the debugger when it's not obvious why the code I just wrote to pass a test didn't.
This is going to sound like the argument you said you don't want to have, but I think if you want to convince your teammates, you're going to have to make a stronger case. I don't understand your objection. I frequently step through code I'm trying to understand with the debugger. It's a great way to see what's going on. You have not established your claim that people who use the debugger in this way tend to write code which is otherwise difficult to understand. The only convincing way to do so would be through some kind of case/control study which tried to measure and compare the readability of code written by people with varying approaches to the debugger. And you have not even told a plausible story explaining why you think using a tool to understand code execution tends to lead to sloppier code construction. For me it's a complete non sequitur.
A "plan" to convince them of the advantage of another approach is by establishing metrics linked to the number of time you debug the same function for different bugs.
By analysis the trend of that metric, you may convince them that non-regression tests are more useful to spend time writing, and will help them to debug more efficiently.
That way, you do not write completely off the "debug" habit, but you convince them of establishing a solid set of test, allowing them to focus on really useful debug session, if needed.
Should you consider this course of action (metrics), you should know its implementation involves the all hierarchy (stakeholder, project manager, architect, developers). They all need to be implicated in those metrics in order to act on them.
Regarding developers, you could try to suggest:
some new ways of closing a bug case (close it only with the test scenario played to reproduce that bug, meaning they need an independent test in order to, if needed, launch their debug session - see the sketch after this list)
a clear relationship between those metrics and their evaluation by management (it would be a bad practice to debug the same function over and over)
a larger involvement in architectural decisions: sometimes, knowing some functional or applicative features rather than just classes and code can incite a developer to think more in terms of black-box tests rather than white-box ones (which can more easily lead to debug sessions)
participation in the "operational architecture" process (where you need to deploy your app and do full front-to-back integration tests). Again, a larger picture of the whole system can help a developer get more interested in features rather than 'lines of code'
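The first suggestion - closing a bug only with a test that replays it - might look like this minimal JUnit sketch (bug number, class and values invented):

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    // The scenario that reproduced the bug becomes a permanent regression
    // test; the debugger session that found it is no longer load-bearing.
    class Bug4711RegressionTest {

        @Test
        void emptyCartTotalsToZeroInsteadOfThrowing() {
            Cart cart = new Cart(); // reported crash: exception on empty cart
            assertEquals(0.0, cart.total(), 0.001);
        }
    }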
I think a better phrasing of this question would be "Is non-TDD a code smell?" TDD seems to lead to less time spent in the debugger due to more time spent writing/failing/passing tests. Without TDD, you are more likely to spend time in the debugger to diagnose errors.
At least within Visual Studio, using the debugger is not that painful, so the challenge for you would be to explain to your teammates how TDD would make their development more enjoyable, productive and successful. Just avoiding the debugger is probably not reason enough for a team to switch their development methodology.
Right on, roadwarrior.
Debugging isn't the problem; it's poorly commented and/or documented code and bad architecture. I work on a smaller team, but when a bug does surface, I do step through the code. Frequently it's a very small job, because the app is well planned out and the docs on the code are clear.
That said, let's get to my point. Want the team to not debug? Comment, comment, comment. Nothing beats down the urge to debug faster. Sure, they'll still do it, but they'll be more likely to step over well-documented code.
Oh, and though it should go without saying, I'll say it anyway: don't have bugs in your code. :)
I agree with those above who expressed the relative irrelevance of this "debugger issue."
IMO, the 2 most important goals of a developer are:
1) Make the software do what it's supposed to do.
2) Write the code so that a maintenance developer 2 years down the road enjoys the experience of changing existing or adding new features.
Before you make a plan, you should decide how important this change is to you. Although I agree that debugging is a smell, it is also a very well accepted and ingrained practice for developers, so convincing them that they should stop doing it won't be easy or quick - and for good reasons. How much energy do you want to put into this topic?
Second, why do you want to persuade them in the first place? If your motivation is to help them, is it really their top priority problem? When you help people in ways they want to be helped, change becomes easy.
Once you have decided that you want to go on with your change initiative, you need to take into account that different people are convinced by different things. Some people will already be convinced by trying something new and exciting. Some will be convinced by numbers (metrics). Some by getting told about it while eating their favorite type of cookie (seriously!), some by hearing about it from their favorite guru. Some by reading about it in a magazine. Some by seeing that "everyone else is doing it, too". And so on.
There is an insightful interview with Linda Rising on this topic at InfoQ: http://www.infoq.com/interviews/Linda-Rising-Fearless-Change. She can say it much better than me. The book is quite good, too.
Whatever you do, don't press too much, but also don't give up. Change can happen - especially if you take resistance as a resource - and sometimes it happens at unexpected times, so always keep a sense of wonder.
@FOR: You have a second problem too; here it is:
sadly it doesn't seem the devs are interested in being more productive (they get paid the same anyway)
How do you intend to make them want to be more productive when there is nothing (visible) for them to gain?
Designing software by debugging is a good practice.
The number of environments supporting this way of developing is very small: the best known is Smalltalk. In Smalltalk, you can write a test describing your objects' protocol before the methods are implemented. Running the test will then trigger the debugger, where you can add the method to the right class, and you can continue stepping through the code until all the functionality is implemented and the test is green.
This needs a compiler that is available at run-time, and first-class invocations. It offers a very short feedback cycle, and is one of the primary reasons for Smalltalk's productivity.

Resources