Tracking and prediciting quality level - metrics

What techniques do people recommend to track the quality level of a new program? Are their ways to take a poorly defined term like "quality level", quanitify it and then make predictions? Currently I use bug rates and S curves but I am looking for other ways to evaluate, estimate and predict quality levels.

Are you looking for this measure?
More seriously, for me code quality is about maintainability:
how easy it is to fix a bug, to add/remove/modify a feature,
how easy it is to refactor: is a regression test suite available?
how much technical debt?
Remember that you write code once, but you read it several times.

I am not sure what counting your bugs is going to do with out something to compare it with. What if the software you made was very hard and had lots of edge cases? You need some kind some kind of comparison...
Even though one of the other answers was clearly a joke code reviews are also probably a good idea. If you have too many bugs hire better engineers or have them write less code.
Edit: Added after considering comments...
Every bug is a like a unique sucky snow flake (a suckflake?). They have different levels of impact on your customers and developers. I would at least take this into account. Maybe adding severity (a measure of customer push for fix) and engineering hours spent fixing it might help improve accuracy. My concern I guess is that this is still over simplification of "quality" when it comes to developing software.
Sadly software quality != product quality. A recently released game called Fallout 3 won tons of awards and made lots of money (at least I assume) but was also a buggy hunk of junk on PC at least.
Just make sure you are tracking and optimizing the correct thing. Tracking # bugs vs time is just tracking # of bugs vs time. Reading any more into it requires some level of assumption however correct or incorrect.
What are your goals? Bugs are but one part of software quality. If you want to continue you to support your software maintainability is key. A lot of times bug fixes can make things less maintainable if your coder did things in a rush. This in turn makes future fixes and features harder and fixes can add new bugs.

This depends a lot on what kind of comparisons you're trying to make. If you're looking at a single project over time and the team doesn't change, then bug rates might be meaningful. However, if you're comparing different projects, with different teams, there's really no way to compare things like bug rates, because you are really comparing the rate of known bugs. One team may be much better at identifying bugs than the other, making their bug rate look higher, but they're really the ones with better software quality.

Do you do unit testing?
Code coverage on unit tests can be a decent measure of quality.

Related

How to estimate the contribution of an individual to a software project? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I work on a software project and would like to estimate the percentage out of the total contribution that I have put in the development of the software. Is there some tool doing this? Such a tool can be useful for appraisals or negotiations, for example. After all, we work for money (yes, not only money, put the point remains). I think there is enough hand-waving for the most important things.
The estimation is very subjective (at least to me now) but I do not know of any tool that provides even a subjective estimate. I know of Sloccount that spells out the total effort using the lines of code but not on per-developer basis.
My idea of an ideal tool for this purpose would:
measure the complexity of the code (more complex is more effort, but more effort is not necessarily more contribution)
measure the decomposibility/flexibility of the software (more decomposable is better)
how much library code is used -- using library code speeds up the development process, increases the associated risk and requires the developer to know from before or learn about the library.
be intelligent enough to differentiate between "who wrote the code", "who copied the code" and "who indented the code".
It is difficult to differentiate between the complexity in the implementation and the intrinsic complexity of the problem. Perhaps a comparison can be made with an equivalent open source counterpart if there is, or for each submodule separately.
If there is no such tool, is there no merit in having such a tool? Or do you believe in "I do work, I do not measure"? It takes time after all. Perhaps the project manager should do this estimation continuously, say, weekly. Are there any standards? Yes, standardization is difficult because every project has different goals, but perhaps that should mean there should be multiple standards, not no standards at all. This looks similar to the how a company is valued in the market.
Update: after seeing a few initial answers: It does not make sense to imagine a tool that just outputs the percentages. Are there tools that can help humans (particularly managers) in making better decisions? Or what is the sufficient statistic for making better decisions? Are these statistics available?
I really doubt there is any reliable trustworthy way of measuring individual's contribution to the solution. Sometimes rewriting some complicated legacy code that results in less lines of code, less complicated solution (smaller cyclomatic complexity etc.) can be seen as a quite significant contribution, while in other cases deleting valuable code covering edge cases that results in the same statistics (less lines of code, smaller CC etc.) is definitely something bad. It all comes down to people, trust and cooperation, individualism in the team is almost always wrong and I would rather avoid it and especially not use it as a motivation factor.
This is a research topic on its own. There are several tools that have tried to define metrics like code ownership. There are other approaches which tackle other aspect of collaborative development, for instance the trustability we can have in the code.
There has been also several studies that tried to use the information from bug trackers. For instance, to identify the developer that is the more likely to introduce bugs. But it's hard to be objective (A brilliant developer that is assigned the most critical part of the system, will still be more likely to introduce critical bugs).
It's actually hard to monetize the development tasks. What is the cost of a bug? What is the gain of refactoring? That would be however one way to estimate the contribution of a developer.
The last cool tool I saw of this kind was the Game Plugin for Hudson continuous integration system. A score is assigned to each developer according their actions
-10 if they break the build
-1 for breaking a test
+1 for fixing a test
etc.
That's again a way to somehow assess the contribution of the developer.
All in all, I do feel like what you are asking for exist, but is still very immature.
I don't think you can get a tool to evaluate your share of the project. Measuring lines of source is all very well, but what of the quality of that source? You wouldn't want someone taking the credit for 200 lines of source if the job could have been easiy done in 20...
Also, thinking of my employer for a moment, a lot of people contribute to the project in ways other than code. Immediate examples I can think of would be Project Managers and Testers - both of whom are essential, both of whom rightly deserve some credit.
Martin
The only thing that I could imagine would be a voting system. I have absolutely no idea, if that would work in your team or anywhere - but I'm sure, that you will need humans for any realistic estimation of code quality.
In Stroustrup's Book on C++ I've read once "Don't try to solve social problems with technical means".
Thinking progmatically, the attitude and the ability of a programmer could be very quickly estimated by making a code-review together and having a talk on relevant topics.
Thinking as an IT-enthusiast and as a control-freak, this shouldn't be very hard, to implement a teachable machine-learning software, which uses version-cotrol, bug-database, etc and greates real-time performanced data for each contributor. E.g. R, KNIME or WEKA could be used for this.

Separating rapid development from refactoring/optimization

I'm working in a team of 2 front-end developers on a web-based late-stage startup project.
The site works quite well, but there's a lot of room for improvement code-wise, as the code is quite messy and disorganized.
I would like to clean things up gradually by writing tests and carefully refactoring to avoid breaking anything. (Using principles from the book 'Working Effectively with Legacy Code')
However, the developer I'm working with is being given a lot of high-priority feature work, and I don't want to burden him with maintenance tasks. A lot of the time he has to write messy code simply because of the time-constraints.
As the team grows I'm concerned about how to manage the different concerns.
I'm thinking of dividing the team into 2 groups:
Does rapid development on new features, with less care on code quality.
Writes unit tests, refactors code, generally optimizes things.
The result I'm aiming for is to bring as much of the code under test as possible, while still keeping up the pace of new-feature development.
Has this been tried before? Any thoughts?
There is a problem with dividing the team.
The group doing the rapid development of new features will be very popular with the sales force, and the management. They will be perceived as more productive, more solution-oriented and they will be included in more of the important discussions.
The group worrying about the quality will be perceived as costly, negative, unproductive, problem-oriented whiners.
The bean-counter reality of it is that the "easy wins" of the rapid group accumulates hidden costs of poor quality. If this goes on too long the code will become more and more messy as new rapid code builds on old rapid code and it will become more and more time-consuming and risky to add more features. Those hidden costs becomes visible when no one can add any new features anymore.
Think two years down the road. Your rapid coders will probably have moved on, cashing in on their many "successes". The sales force will be talking about them, and how easy it was when they were running the shop.
It's like weed in a garden. It's easy to solve the problem early, much harder when it's all weeds.
The engineering reality of it is that rapid coding shouldn't be messy. You don't save time by doing things messy. If it looks that way, there is either a skill problem or a framework problem.
So if you divide the group, rotate after every iteration!
And how do I know this, scenario so well? I came in after the "successful" coders had left...
Hi I think You Can not dive team into two groups.
#Guge
Have covered financial and political impact on members of those groups.
And there are more:
1) Speed group will not care about quality of code.(They wont be paid for it!!) So they will produce ugly code.
2) Quality group will rewrite some code to be better. Speed group will add to this code some new functionality and again code will be ugly.
3) Speed group is about to add new func. to code that was not refactored. They do no refactors (they are not paid for it so management may be anger if they do, or if they work slow down) so they will create ugly^2 code at lower speed than if they refactor first.
4) No new func. mean no $$$ mean smaller wages mean worse developers assigned mean lower speed and quality.
5) Better devs will find a way to change goup to speed or a way to change work :)
5) If quality group will improve any subsytem speed group may deny implementing any patches to existing code to not slow down.
Ok So what to do to prevent such situation?
Talk with your managment, You need to maintaing balance betwen new func. and quality of code. Only good code allow to introduce fast changes or implementing new func. at high speed.
Good argument is that badly written code or architecture that evolved into something ugly is a debt. When you wont pay it monthly interest grow rapidly. You can afford to not paid one month probably but not longer.
You can get you credit card with you to emphasize.
And maybe if your team get too much work to do it in deadline it is always worth to speak it loudly. But carefully ;)
And all programmers should refactor and write testes (if relevant). No exception.
Just refactor a bit today a bit tommorow etc.
Trick with credit card is borroved from Fowler "Refactoring" <- great book.
PS there are projects like "death march" where only pushing hard dev. make economic sense (because project will make money only if success in deadline with limited expenses; symptoms are constants work over hours with less dev than required) and my suggestions don't apply to it
I think that in the short term its not a problem for you to devote some people to new development, and some people to cleaning up the new code. However to keep this as an ongoing pattern is flawed. Your aim should be to get code delivered on time and to a high standard and your entire team should be working toward this goal, If you split the team then you have one half of the team writing exciting new features with no concern for code quality, and the other half cleaning up after them. Over time you may find that the cleanup team needs to be significantly bigger than the new features team.
Your first step (which you seem to have made some progress with) is to decide on what sort of quality your code is and how it is going to be measured (test coverage, cyclomatic complexity etc). You can then apply these metrics to your current code and understand how much of it doesn't come up to standard. This code is the code that has accrued 'Technical Debt' which needs repaying.
The next thing to do is to repay this technical debt. This involves bringing this code in line with the standards that you have put in place. This is a tedious task if it has to be done often, however it needs to be done, and to be done by anyone who is free to do it.
Finally make it your main goal to produce code that is in line with your quality measures. If a deadline shows up then you may have to ignore them for a while but once the deadline is met your priority is to get your code back up to the quality you need.

What is Best for Defect Rate Tracking? Defects per KLOC?

I'm trying to create some internal metrics to demonstrate (determine?) how well TDD improves defect rates in code.
Is there a better way than defects/KLOC? What about a language's 'functional density'?
Any comments or suggestions would be helpful.
Thanks - Jonathan
You may also consider mapping defect discovery rate and defect resolution rates... how long does it take to find bugs, and once they're found, how long do they take to fix? To my knowledge, TDD is supposed to improve on fix times because it makes defects known earlier... right?
Any measure is an arbitrary comparison of defects to code size; so long as the comparison is similar, it should work. E.g., defects/kloc in C to defects/kloc in C. If you changed languages, it would affect the metric in any case, since the same program in another language might be less defect-prone.
Measuring defects isn't an easy thing. One would like to account for the complexity of the code, but that is incredibly messy and unpleasant. When measuring code quality I recommend:
Measure the current state (what is your defect rate now)
Make a change (peer reviews, training, code guidelines, etc)
Measure the new defect rate (Have things improved?)
Goto 2
If you are going to compare coders make sure you compare coders doing similar work in the same language. Don't compare the coder who works in the deep internals of your most complex calculation engine to the coder who writes the code that stores stuff in the database.
I try to make sure that coders know that the process is being measured not the coders. This helps to improve the quality of the metrics.
I suggest to use the ratio between the times :
the time spend fixing bugs
the time spend writing other codes
This seem valid across languages...
It also works if you only have a rough estimation of some big code base. You can still compare it to the new code you are writing, to impress you management ;-)
I'm skeptical of all LOC-related measurements, not just because of different relative expressiveness of languages, but because individual programmers will vary enough in the expressiveness of their code as to make this metric "fuzzy" at best.
The things I would measure in the interests of project management are:
Number of open defects on the project. There's no single scalar that can tell you where the project is and how close it is to a releasable state, but this is still a handy number to have on hand and watch over time.
Defect detection rate. This is not the rate of introduction of new defects into the system, but it's probably the closest proxy you'll find.
Defect resolution rate. If this is less than the detection rate, you're falling behind - if it's greater, you're getting ahead.
All of these numbers are more useful if you combine them with severity information. A product with 20 minor bugs may well closer to release than one with 2 crashing bugs. If you're clearing the minor bugs but not the severe ones, you have to get the developers to refocus their attention.
I would track these numbers per project and per developer. The reason for doing them per project should be clear. The per-developer numbers are certainly not the whole picture of an individual contributor's skill or productivity, but can point you to people who might need training or remediation.
You may also wish to tag all the tickets in your defect tracking system by project module as well (especially for larger projects), so that you can tell when critical modules are in a fragile state.
Why dont you consider defects per use case ? or defects per requirement. We have faced practical issues in arriving at the KLOC.

IT evaluating quality of coding - how do we know what's good? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
Coming from an IT background, I've been involved with software projects but I'm not a programmer. One of my biggest challenges is that having a lot of experience in IT, people often turn to me to manage projects that include software development. The projects are usually outsourced and there isnt a budget for a full time architect or PM, which leaves me in a position to evaluate the work being performed.
Where I've managed to get through this in the past, I'm (with good reason) uneasy about accepting these responsibilities.
My question is, from a perspective of being technically experienced but not in programming, how can I evaluate whether coding is written well besides just determining if it works or not? Are there methodologies, tips, tricks of the trade, flags, signs, anything that would say - hey this is junk or hey this is pretty damn good?
Great question. Should get some good responses.
Code cleanliness (indented well, file organization, folder structure)
Well commented (not just inline comments, but variables that say what they are, functions that say what they do, etc.)
Small understandable functions/methods (no crazy 300 line methods that do all sorts of things with nested if logic all over the place)
Follows SOLID principles
Is the amount of unit test code similar in size and quality as the code base of the project
Is the interface code separate from the business logic code which in turn should be separate from the infrastructure access code (email, database, web services, file system, etc.)
What does a performance analysis tool think of the code (NDepend, NDoc, NCover, etc.)
There is a lot more to this...but this gets your started.
Code has 2 primary audiences:
The people who use it
The people who develop it
So you neeed 2 simple tests:
Run the code. Can you get it to do the job it is supposed to do?
Read the code. Can you understand the general intentions of the developer?
If you can answer yes to both of these, it is great code.
When reading the code, don't worry that you are not a programmer. If code is well written / documented, even a non-programmer should be able to see guess much of what it is intended to achieve.
BTW: Great question! I wish more non-programmers cared about code quality.
First, set ground rules (that all programmers sign up to) that say what's 'good' and what isn't. Automate tests for those that you can measure (e.g. functions less than a number of lines, McCabe complexity, idioms that your coders find confusing). Then accept that 'good coding' is something you know when you see rather than something you can actually pin down with a set of rules, and allow people to deviate from the standard provided they get agreement from someone with more experience. Similarly, such standards have to be living documents, adapted in the face of feedback.
Code reviews also work well, since not all such 'good style' rules can be automatically determined. Experienced programmers can say what they don't like about inexperienced programmers' code - and you have to get the original authors to change it so that they learn from their mistakes - and inexperienced programmers can say what they find hard to understand about other people's code - and, by being forced to read other people's code, they'll also learn new tricks. Again, this will give you feedback on your standard.
On some of your specific points, complexity and function size work well, as does code coverage during repeatable (unit) testing, but that last point comes with a caveat: unless you're working on something where high quality standards are a necessity (embedded code, as an example, or safety-critical code) 100% code coverage means you're testing the 10% of code paths that are worthwhile to test and the 90% that almost never get coded wrong in the first place. Worthwhile tests are the ones that find bugs and improve maintainability.
I think it's great you're trying to evaluate something that typically isn't evaluated. There have been some good answers above already. You've already shown yourself to be more mature in dealing with software by accepting that since you don't practice development personally, you can't assume that writing software is easy.
Do you know a developer whose work you trust? Perhaps have that person be a part of the evaluation process.
how can I evaluate whether coding is written well
There are various ways/metrics to define 'well'or 'good', for example:
Delivered on time
Delivered quickly
No bugs after delivery
Easy to install
Well documented
Runs quickly
Uses cheap hardware
Uses cheap software
Didn't cost much to write
Easy to administer
Easy to use
Easy to alter (i.e. add new features)
Easy to port to new hardware
...etc...
Of these, programmers tend to value "easy to alter": because, their job is to alter existing software.
Its a difficult one and could be where your non-functional requirements will help you
specify your performance requirements: transactions per second, response time, expected DB records over time,
require the delivery to include outcome from a performance analysis tool
specify the machine the application will be running on, you should not have to upgrade your hardware to run the app
For eyeballing the code and working out whether or not its well written its tougher, the answers from #Andrew & #Chris cover it pretty much... you want code that looks good, is easy to maintain and is performant.
Summary
Use Joel Test.
Why?
Thanks for tough question. I was about to write a long answer on merits of direct and indirect code evaluation, understanding your organisational context, perspective, figuring out a process and setting a criteria for code to be good enough, and then the difference between the code being perfect and just good enough which still might mean “very impressive”. I was about to refer to Steve McConnell’s Code Complete and even suggest delegating code audit to someone impartial you can trust, who is savvy enough business and programming-wise to get a grasp of the context, perspective, apply the criteria sensibly and report results neatly back to you. I was going to recommend looking at parts of UI that are normally out of end-user reach in the same way as one would be judging quality of cleaning by checking for dirt in hard-to-reach places.
Well, and then it struck me: what is the end goal? In most, but very few edge cowboy-coding scenarios, as a result of the audit you’re likely to discover that the code is better than junk, but certainly not damn good, maybe just slightly below the good enough mark. And then what is next? There are probably going to be a few choices:
Changing the supplier.
Insisting on the code being re-factored.
Leaving things as they are and from that point on demanding better code.
Unfortunately, none of the options is ideal or very good either. Having made an investment changing supplier is costly and quite risky: part of the software conceptual integrity will be lost, your company will have to, albeit indirectly, swallow the inevitable cost of the new supplier taking over the development and going through the learning curve (exactly opposite to that most suppliers are going to tell you to try and get their foot in the door). And there is going to be a big risk of missing the original deadlines.
The option of insisting on code re-factoring isn’t perfect either. There is going to be a question of cost and it’s very likely that for various contractual and historical reasons you won’t find yourself in a good negotiation position. In any case re-writing software is likely to affect deadlines and the organisation what couldn’t do the job right the first time is very unlikely to produce much better code on the second attempt. The latter is pertinent to the third option I would be dubious of any company producing a better code without some, often significant, organisational change. Leaving things as they are not good either: a piece of rotten code unless totally isolated is going to eventually poison the rest of the source.
This brings me to the actual conclusion, or in fact two:
Concentrate on picking the right software company in a first place, since going forward options are going to be somewhat constrained.
Make use of IT and management knowledge to pick a company that is focused on attracting and retaining good developers, that creates a working environment and culture fit for production of good quality code instead of relying on the post factum analysis.
It’s needless to expand on the importance of choosing the right company in the first place as opposed to summative evaluation of delivered project; hopefully the point is already made.
Well, how do we know the software company is right? Here I fully subscribe to the philosophy evangelised by Joel Spolsky: quality of software directly depends on quality of people involved which as it has been indicated by several studies can vary by an order of magnitude. And through the workings of free markets developers end up clustered in companies based on how much a particular company cares about attracting and retaining them.
As a general rule of life, best programmers end up working with the best, good with good, average with average and cowboy coders with other cowboy coders. However, there is a caveat. Most companies would have at least one or two very good developers they care about and try their hardest to retain. These devs are always put on a frontline: to fire fight, to lure a customer, to prove the organisation potential and competence. Working amongst more than average colleagues, overstretched between multiple projects, and being treated as royalty, sadly, these star programmers very often loose touch with the reality and become prima donnas who won’t “dirty” their hands with any actual programming work.
Unfortunately, programming talent doesn’t scale and it’s unlikely that the prima donna is going to work on your project past the initial phase designed to lure and lock in you as a customer. At the end the code is going to be produced by a less talented colleague and as a result you’ll get what you’ll get.
The solution is to look for a company there developer talents are more consistent and everyone is at least good enough to produce the right quality of code. And when it comes to choosing such an organisation that’s where Joel Test comes mighty handy. I believe it’s especially suitable for application by someone who has no programming experience but good understanding of IT and management.
The more points company scores on the Joel Test the more it’s likely to attract and retain good developers and most importantly provide them with the conditions to produce quality code. And since most great devs are actually in love with programming all the need is to be teamed up, given good and supportive work environment, a credible goal (or even better incredible) and they’ll start chucking out high quality code. It’s that simple.
Well, the only thing is that company that scores full twelve points on the Joel’s Test is likely to charge more than a sweatshop that scores a mere 3 or 5 (a self-estimated industry average). However, the benefits of having the synergy of efficient operations and bespoke trouble-free software that leverage strategic organisational goals will undoubtedly produce exceptional return on investment and overcome any hurdle rates by far outweighing any project costs. I mean, at the end of the day the company's work will likely be worth the money, every penny of it.
Also hope that someone will find this longish answer worthwhile.

Evaluating software estimates: sure signs of unrealistic figures?

Whilst answering “Dealing with awful estimates” posted by Ash I shared a few tips that I learned and personally use to spot weak estimates. But I am certain there must be many more!
What heuristics to use in the scenario when one needs to make a quick evaluation of software project estimate that has been compiled by a third-party (a colleague, a business partner or an external company)?
What are the obvious and not so obvious signs of weak software estimates that can be spotted without much detailed knowledge of task at hand?
A single person having done the estimates, rather than having used consensus based estimation (to fully understand the implied scope of requirements) such as Wideband Delphi.
Especially true if the person doing the estimation is not the person doing the implementation!! - I once worked on a project estimated by someone else as 60 days before any requirements had even been given. Lets just say I was not a happy bunny
No time for documentation.
No time for ramp-up (in terms of learning, and team size).
No list of risks, and their impact to the timescale.
No buffer for the unexpected - in terms of late breaking requirements, and risks.
No one has said it, so I will. The obvious answer is that if you have software schedule estimates then that is a sure sign of unrealistic figures. Yes, there are many methods for estimating software but none of them are accurate in any way, shape or form. What usually happens is that deadlines are set. If the task is over-estimated then extra time is spent making the result better. If the task is under-estimated then something is sacrificed to meet the delivery (like testing and features).
I know this answer isn’t what people want to believe, but estimating is always a guess. More often than not, a developer can’t even predict how much they will accomplish by the end of the day. You are expecting them to guess things months/years down the road on something that they aren’t even sure what is really involved yet.
The only practical answer to your question that isn’t prone to giving unrealistic results would be using a worksheet that comes up with guesses based on previous history at your company. Unfortunately, that will not account for tasks the estimator missed. At least this may give ballpark numbers.
Unless you develop knock offs of the same exact system over and over again, then anyone who thinks they have figured this out is fooling themselves. There are way too many variables involved.
There are two types of estimates: task estimates and project estimates. You can view these as the big and small pictures.
Project estimates are necessarily high level (granularity no smaller than days typically) and must include things like:
High level architecture;
Time for testing;
Ramp up times;
Defect processes;
Time for documentation;
Relevant training;
Assumptions;
Dependencies (eg team A can't do what they need to until team B delivers phase 1);
Critical path (which pieces are likely to determine if the project slips and by how much); and
Risks.
The more of those things that are missing, the more unrealistic (or risky) the estimates.
The second kind of a task estimate, which is typically much lower level. For this kind of estimate it should be simply a task breakdown (with no task being larger than say 5 days).
These don't tend to address the above items but some of them might be relevant, such as assumptions regarding decisions not made yet (eg production hardware). It may also be worth identifying who can and can't do the tasks due to relevant experience, background knowledge or skills (as that person or those persons may end up overcommitted).
Other posts have mentioned the testing time should equal or exceed dev time. I strongly disagree with this. I've seen 8 hour dev tasks result in 100+ hours test time and 80 hour dev tasks result in less than 2 hours of testing. In both cases the testing time was entirely reasonable. The is no absolute correlation between the two. At best, there is a loose connection.
Is the estimate what the management
wanted to be told?
Does the
estimate nicely fit in with the
planned shipment date for the next
release?
Does the management reward
people that give good news more then
people the give bad news?
Was the
estimate prepared before knowing who
would be working on the project?
Did
someone that wanted that bit of
functionally implemented prepare the
estimate?
Is there a history of
software being late?
Is it normal for
developers to be moved onto other
tasks partway though a project?
Have
some or all developers given up on
commenting on bad estimates as a
waste of time?
Count up the number of questions you get “yes” or “maybe” answers.…
If you get mostly “no” answers to the above questions, then it may be worth looking at the estimate in detail to see if it includes the tasks that other people of listed in this thread.
Wow... I really like toolkit's answer.
And I agree that any estimate at all is flawed, because it assumes that the estimator has way more of a clue for how to solve the problem than any estimator actually does when a project gets estimated. However, I think you still need to at least estimate the size of the mountain before you start. Some thought as to whether it's worth trying to do it should precede any endeavor and that's what the essence of an estimate should be.
I did come up with a few more indicators of a dangerous estimate:
No cross-reference - If the estimate can't be validated at least two different ways, it's likely to be unreliable. For example, the last estimates I've done I've been able to break down the work into small feature chunks, and consider how long it's taken our team to do similarly scoped features. Then I was able to look at the sum of these costs and see how big the scope was relative to other projects I've worked on. That's two ways to validate.
The background of the estimator - if this is a software estimate done by a hardware guy who's never written code - be very afraid. More subtle - the closer the estimator's background is to the technology and problem domain of the project, the better.
Detail - as said a few different ways in a few different posts - I like to see detail for both individual features, as well as the tasks needed to complete each feature. Most estimates I see don't show the detail in the formal presentation, but if you ask the person who did the estimate, they should have a file somewhere. Hopefully it's not the back of a paper napkin stained with beer and ketchup. :)
Documented Assumptions - any estimator will have had to make some set of assumptions about the task. These should be documented somewhere, perferably in the formal paperwork. I always get a little worried when I see a short proposal with not many documented assumptions. Either they were thought through and not communicated to the customer, or they were not thought through. I'm not honestly sure which one is worse. It goes without saying that the assumptions should also be reasonable.
Balanced Lifecycle - However the task is broken down, what's the ratio of design, code and test? How about documentation, integration with external systems and post release support? How about those extra things that are so vital (system admins, CM gurus, management effort)?
Slack - I'm sure the corporate daemons of cheapness will come and flay me, but a schedule and a cost should have some slack. If the estimate looks ambitious and agressive to an experienced eye, it is likely to be too low. Estimates are almost never too high.
One good heuristic is to see if test time is roughly the same a development time. That's a good sign for the estimate.
If they can't give you a breakdown of the estimate then that's a bad thing. Usually a sign of lots of little things that may have been forgotten. They don't need to provide the complete original breakdown, just a breakdown like:
requirements
development
testing
packaging and deployment
etc.
They should be using a standard template to calculate their estimate. They don't need a number in every column, but they do the template to list all possible tasks. That way the template can be used to jog peoples's minds when working on the estimate.
If the estimate is overly precise, e.g. 0.25 hour increments, then that, for me, is a bad smell.
If there are things missing like requirements capture, testing, deployment, and handover to any Ops group? If any of those are missing, that's the sort of thing that will come back and bite you.
Edit: One other thing to watch for is the old "perpetually 90% complete" tasks. You get progress update after progress update listing a task as "90% complete". That's not good!
HTH
cheers
Is the compiler of the
estimate available and willing to
discuss it with other senior project
members? If not, that is often a
concern.
Was the estimate sent to the
customer before the experience and
skills of the development staff are
known. Two point estimates may help
but only to some extent.
Before even getting a chance to look at the estimate, you are told that you are committed to delivering all of the functionality described by a specific date.
(Thanks for responding to my question, by the way.)
If you see one or more of these, you may have a bad estimate:
Single point estimates: an estimate should be associated with a range of possible dates or a confidence value
Insufficient granularity of tasks: a large task bucket usually indicates that the functionality is not well understood (which is especially a problem since poorly understood problems are usually under-estimated)
No expression of assumptions and/or risks
Inadequate effort allocated for commonly skipped or underestimated items (e.g. build scripts, documentation, deployment, etc.)
I agree with sateesh, I really like Software Estimation: Demystifying the Black Art by Steve McConnell. He has several checklists which are useful when reviewing and/or preparing estimates.
I totally agree with Dunk, the first sign of bad estimates is the mere presence of a large detailed upfront schedule. Estimates are exactly that, an approximation, otherwise we would call them exactimates. So they should never be used alone in the management of a project.
The most important point to consider is not the accuracy of estimates but the consistency. If a third party were doing estimates for you, then ask to see a history of their successes or failures, speak with their past clients and determine their reliability.
That all being said, from an Agile standpoint, some of the ways we attempt to gain more consistent estimates during a project are;
Use a relative sizing standard (S,M,L,XL) rather than time based (ideal days).
focus on complexity not time
Always use group estimates not single person estimates
Gather estimates frequently throughout the project, generally at the start of each sprint
use feedback from previous sprints in determination of story complexity
track velocity to give meaning to the relative sizing
frequent and early story retrospection to examine/understand any thrashing
If you are dealing with companies that use these estimation methods then, chances are you are going to receive consistent and therefore better results.
Estimates of the form 3, 6, or 12 months (basically any round numbers) reek of guessing. Usually when you guess you pick some round number bigger than you think it will take -- quarters, half a year, etc. -- are the usual suspects. I much prefer estimates in terms of actual development iterations (whatever their size).
What are the obvious and not so
obvious signs of weak software
estimates that can be spotted without
much detailed knowledge of task at
hand?
Estimates which are given without much detailed knowledge of the task at hand are generally not good.
Perhaps a general approach you could take is to check that items in the requirements correspond to those in the estimate. If you want to be very quick check the relative sizes, if there is a 100 word estimate given to a 100,000 word brief it stands no chance of being right.
Also (as others have said) check that analysis, coding, debugging, testing, integration, contingency etc are mentioned. It shows some thought has gone into it.
Having success and sign off criteria at various stages is a great sign. If they have a defined point which is 10% done at least if the estimate is wrong you know early and have a chance to adapt. If there are no checkpoints until “finish” you may not know that you are behind until that date is hit.
How familar is the person giving the estimate with the people doing the work?
I have often seen estimates where there is a generic person doing the work, even though the team is made up of individuals with very different backgrounds. Most likely the tasks and the specialities don't line up perfectly and you get a c++ serverside programmer who ends up doing either your gui or your database... Sometimes the manager of the team doesn't really appreciate the team member's strengths, so if he has been asked to come up with the estimate on his own because his team is busy on the previous project you will find that the work in question is really only suitable for part of the team (not motivating, lack of skills etc)
One other helpful way to evaluate the estimates is to compare it with the actual effort that was spent on previous projects of similar kind. The best data for the comparison would be the effort data of the previous projects that the organization has done. If there is no organizational historical data you can try to benchmark the estimate against industry wide benchmarks.
I would also say if the estimate is presented as single absolute number (say 180 days) then it is not a good sign. A single absolute number would indicate that the estimate is that the task will be finished with 100% probability on the given data. The estimates presented in a range (say 130 to 180 days) would indicate that the range in which task could be completed.
Much of what I have written above I attribute it to the book :
Software Estimation: Demystifying the Black Art
by Steve McConnell
I check the estimates against the man-power. Although not a very accurate heuristic, it's clear if some massive work has just one or two devs assigned to it, that the task was not estimated correctly
A good estimate will have a good breakdown, involving all phases of the project.
It will almost certainly not finish at a convenient date for the business.
It will include risks of various sorts.
It will be presented in terms of confidence intervals, either implicitly (10-12 months) or by using large units (about four quarters).
It will be made by somebody with responsibility for getting the project done, preferably more than one such person.
If there are delays at the start, there will be delays at the end (expressed as 10-12 months from start, or about 1Q2010 if we start now, not something like January 2010 when the project hasn't started yet).
Assumptions and dependencies will be clearly and prominently listed.
Edit: Part of this depends on the stage the project is in. An early but precise estimate is a warning sign, particularly if there is no confidence interval assigned. That reeks of a Procrustean estimate.
Also, watch for other development methodologies. A timeboxed project can have a rigid and arbitrary schedule, but the feature set will be flexible.
Any of the following:
It is one big project and there isn't a short high level strategy described
There isn't a clear, short and concise vision of what wants to be achieve with the project
The project isn't structured around business value being delivered gradually
The team is trying to give "accurate" estimates for a big project, going into (or was done with) a long analysis phase? (changes will come, and will usually affect those estimates in way that can't be easily quantified without yet more big efforts)
There are "detailed" estimates for the whole project (related to previous)
There aren't detailed estimates for the first phase, or there is something wrong in those.
"Four to six weeks", when not accompanied with a breakdown into shorter tasks...

Resources